Communications In Mathematical Physics - Volume 296

Commun. Math. Phys. 296, 1–33 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1009-8 Communications in Mathe...

Author: M. Aizenman (Chief Editor)



Phase Transition in a Vlasov-Boltzmann Binary Mixture

R. Esposito¹, Y. Guo², R. Marra³

¹ Dipartimento di Matematica pura ed applicata, Università dell’Aquila, Coppito, 67100 L’Aquila, Italy. E-mail: [email protected]
² Division of Applied Mathematics, Brown University, Providence, RI 02812, U.S.A. E-mail: [email protected]
³ Dipartimento di Fisica and Unità INFN, Università di Roma Tor Vergata, 00133 Roma, Italy. E-mail: [email protected]

Received: 5 April 2009 / Accepted: 24 November 2009
Published online: 19 February 2010 – © Springer-Verlag 2010

Abstract: There are not many kinetic models where it is possible to prove bifurcation phenomena for any value of the Knudsen number. Here we consider a binary mixture over a line with collisions and long range repulsive interaction between different species. It undergoes a segregation phase transition at sufficiently low temperature. The spatially homogeneous Maxwellian equilibrium corresponding to the mixed phase, minimizing the free energy at high temperature, changes into a maximizer when the temperature goes below a critical value, while non homogeneous minimizers, corresponding to coexisting segregated phases, arise. We prove that they are dynamically stable with respect to the Vlasov-Boltzmann evolution, while the homogeneous equilibrium becomes dynamically unstable.

1. Introduction and Main Results

The phenomenon of phase transition in a thermodynamic system is usually described by the arising of multiple minimizers of the free energy. Namely, when the temperature is lowered below a certain critical value, the unique equilibrium minimizer becomes a local maximizer and new minimizers appear. This is interpreted as a loss of stability of the old minimizer and the birth of new stable states. Here the meaning of stability is merely related to the free energy minimizing properties of the state. Of course it would be desirable to have a more detailed, dynamical analysis of the stability properties involved in such a phenomenon of phase transition. This issue is, generally speaking, very difficult, because the natural dynamics is a many-component one, whose detailed understanding is very far from being attainable. A simpler but, in our opinion, still interesting problem is to study this kind of behavior at the kinetic level, where, instead of a system of a huge number of particles, one has to study a partial differential equation for the probability distribution of particles on the one particle phase space. This is the problem we want to address in this paper.


The standard kinetic model of a rarefied gas undergoing collisions (Boltzmann equation), however, describes essentially the ideal gas, which does not exhibit phase transitions. Following van der Waals, it is convenient to include a long range attractive interaction between particles to see a vapor-liquid transition. With the introduction of such an interaction, new problems arise, due to the fact that nothing prevents the system from collapsing (Statistical Mechanics instability). The van der Waals approach of adding a hard core interaction is not easy to handle and more complicated many body long range interactions have been introduced (see [13,15]) to handle this in the framework of the equilibrium Statistical Mechanics. While one could in principle try to follow the approach in [13], we find it easier to consider a different kinetic model introduced a few years ago in [2], where there is no attractive interaction and the Statistical Mechanics instability does not arise.

The model consists of two species of particles which, for simplicity, have the same mass. To fix the ideas, think of them as distinguished just by their color, say red and blue. Their short range interaction is modeled by Boltzmann-like collisions which are color blind, while the long range repulsive interaction, arising only between particles of different color, is modeled by a Vlasov force with a smooth, bounded, finite range potential. We refer to [1–4] for more information on this model.

For any $t \in \mathbb{R}^+$, let the non negative functions $f_i(t,x,\xi)$, $i = 1, 2$, denote the probability densities of finding a particle of species 1 (red) or 2 (blue) in a cell of the phase space around the point $(x,\xi)$ at time $t$. In this paper we will only consider the case $x \in \mathbb{R}$, the real line, while the velocity $\xi = (v,\zeta) \in \mathbb{R}^3$ with $\zeta \in \mathbb{R}^2$. The time evolution is governed by

\[
\begin{aligned}
\partial_t f_1 + v\,\partial_x f_1 + F(f_2)\,\partial_v f_1 &= Q(f_1, f_1 + f_2),\\
\partial_t f_2 + v\,\partial_x f_2 + F(f_1)\,\partial_v f_2 &= Q(f_2, f_1 + f_2).
\end{aligned}
\tag{1.1}
\]

Here the Vlasov force $F(h)$ due to the mass distribution $h$ is defined as

\[
F(h)(t,x) = -\partial_x \int_{\mathbb{R}} dy\, U(|x-y|) \int_{\mathbb{R}^3} d\xi\, h(t,y,\xi),
\tag{1.2}
\]

where $U(r)$, the interaction potential, is non negative, smooth, bounded, with $U(r) = 0$ for $r \ge 1$ and $\int_{\mathbb{R}} dx\, U(|x|) = 1$.

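The collision integral is defined as

\[
Q(h_1,h_2)(\xi) = \int_{\mathbb{R}^3 \times S^2} |(\xi-\xi')\cdot\omega|\,\{h_1(\xi_*)h_2(\xi'_*) - h_1(\xi)h_2(\xi')\}\, d\xi'\, d\omega
:= Q_{\mathrm{gain}}(h_1,h_2) - Q_{\mathrm{loss}}(h_1,h_2),
\tag{1.3}
\]

with $S^2 = \{\omega \in \mathbb{R}^3 \mid |\omega| = 1\}$ and $\xi_*$, $\xi'_*$ related to $\xi$, $\xi'$ by the usual elastic collision relations

\[
\xi_* = \xi - \omega[\omega\cdot(\xi-\xi')], \qquad \xi'_* = \xi' + \omega[\omega\cdot(\xi-\xi')].
\tag{1.4}
\]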
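The elastic collision relations (1.4) can be checked numerically. The following is a minimal sketch, not part of the original paper: the names `collide`, `xi`, `xi_p` (the second incoming velocity) and `omega` are illustrative choices, and the random inputs are arbitrary. It verifies that the map conserves the total momentum and the total kinetic energy of the colliding pair, which is the property behind the conservation laws used later.

```python
import math
import random


def collide(xi, xi_p, omega):
    """Apply the elastic collision relations (1.4) to the pair (xi, xi_p).

    omega must be a unit vector in R^3; all arguments are 3-tuples.
    """
    # s = omega . (xi - xi_p): relative velocity component along omega
    s = sum(w * (a - b) for w, a, b in zip(omega, xi, xi_p))
    xi_star = tuple(a - w * s for a, w in zip(xi, omega))
    xi_p_star = tuple(b + w * s for b, w in zip(xi_p, omega))
    return xi_star, xi_p_star


def total_momentum(a, b):
    return tuple(x + y for x, y in zip(a, b))


def total_energy(a, b):
    return 0.5 * (sum(x * x for x in a) + sum(y * y for y in b))


def random_unit_vector(rng):
    """Draw a direction on S^2 by normalising a Gaussian vector."""
    while True:
        w = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm > 1e-12:
            return tuple(c / norm for c in w)


rng = random.Random(0)  # fixed seed; the sample values are illustrative only
xi = tuple(rng.uniform(-2.0, 2.0) for _ in range(3))
xi_p = tuple(rng.uniform(-2.0, 2.0) for _ in range(3))
omega = random_unit_vector(rng)
xi_star, xi_p_star = collide(xi, xi_p, omega)
```

Since the component of the relative velocity along `omega` only changes sign under this map, the collision kernel $|(\xi-\xi')\cdot\omega|$ takes the same value before and after the collision.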

For any $\beta > 0$, the couple $(a_1\mu_\beta, a_2\mu_\beta)$, with $\mu_\beta$ the spatially homogeneous Maxwellian at temperature $\beta^{-1}$,

\[
\mu_\beta(\xi) = \left(\frac{\beta}{2\pi}\right)^{3/2} e^{-\beta\xi^2/2},
\tag{1.5}
\]

and $a_i > 0$, is an equilibrium solution and the most general homogeneous equilibrium differs from it just for rescaling and centering. However, due to the presence of the Vlasov force, non homogeneous Maxwellian equilibria are possible and they are of the form

\[
f_i(x,\xi) = \rho_i(x)\mu_\beta(\xi),
\tag{1.6}
\]

provided that the densities $\rho_i(x) > 0$ satisfy the conditions

\[
\ln\rho_i + \beta\, U * \rho_{i+1} = C_i, \qquad i = 1, 2.
\tag{1.7}
\]

Here and in the rest of the paper the label $i+1$ means 2 if $i = 1$ and 1 if $i = 2$, and the convolution product $*$ is defined by $a * b(x) = \int_{\mathbb{R}} dy\, a(|x-y|)b(y)$. The constants $C_i$ have the physical meaning of chemical potentials. The phase transition phenomenon we want to discuss occurs when $C_1 = C_2$, a situation where the equilibrium solutions are symmetric under the exchange $1 \leftrightarrow 2$. This symmetry is spontaneously broken below the critical temperature. Therefore we assume $C_1 = C_2$ in the rest of the paper. If $\beta$ is suitably small, conditions (1.7) are satisfied only by constant $\rho_i$'s. On the other hand, if $\beta$ is sufficiently large, non constant $\rho_i$ solving (1.7) can be constructed. To understand the arising of multiple equilibria, let us start by defining the local free energy $\varphi(\rho_1,\rho_2)$ on $\mathbb{R}^+ \times \mathbb{R}^+$ (i.e. the free energy density when $U$ is replaced by a $\delta$-function) as

\[
\varphi(\rho_1,\rho_2) = \rho_1\ln\rho_1 + \rho_2\ln\rho_2 + \beta\rho_1\rho_2.
\tag{1.8}
\]

Let $\rho = \rho_1 + \rho_2$. It is shown in [4] (in a slightly more general context) that, if $\beta\rho < 2$, then the only stationary points for $\varphi$ are characterized by $\rho_1 = \rho_2 = \rho/2$ and they are minimizers for $\varphi$. On the other hand, if $\beta\rho > 2$, then there are $\rho^+ > \rho^- > 0$ such that

\[
(\rho_1,\rho_2) = (\rho^+,\rho^-)
\tag{1.9}
\]

(and $(\rho_1,\rho_2) = (\rho^-,\rho^+)$ by symmetry) is an absolute minimizer for $\varphi$, while $(\rho_1,\rho_2) = (\rho/2,\rho/2)$ is a local maximizer for $\varphi$. When $(a_1,a_2)$ is chosen equal to $(\rho^+,\rho^-)$, the equilibrium state is interpreted as a pure phase rich in red particles, while the state $(a_1,a_2) = (\rho^-,\rho^+)$ corresponds to a phase rich in blue particles.

Non-homogeneous solutions become relevant when one has to describe a situation of phase coexistence. The idea to construct non-homogeneous solutions is to observe that at low temperature, in order to minimize the local free energy, the system has to be in a pure phase at infinity. A situation with phase coexistence should arise when the system is in the blue-rich phase at $-\infty$ and in the red-rich phase at $+\infty$ or vice versa. Therefore one looks at the minimizers of a suitable free energy functional with the above constraints. Indeed, the conditions (1.7) are the Euler-Lagrange equations for the free energy functional

\[
\mathcal{F}(\rho_1,\rho_2) = \int_I dx\,(\rho_1\ln\rho_1 + \rho_2\ln\rho_2) + \beta\int_I dx\int_I dy\, U(|x-y|)\rho_1(x)\rho_2(y)
\tag{1.10}
\]

on a finite interval $I$ with periodic boundary conditions. Moreover, it can be shown (see [3]) that when $\beta$ is large, with the masses of the two species fixed, there are non constant couples $(\rho_1,\rho_2)$ giving lower values to the free energy than the constants. Over


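the real line $-\infty < x < \infty$, the above free energy does not make sense and a more careful definition is required. Let $(-\ell,\ell)$ be a bounded interval and set

\[
\mathcal{F}_{(-\ell,\ell)}(\rho_1,\rho_2) = \int_{(-\ell,\ell)} dx\,\varphi(\rho_1,\rho_2) + \frac{\beta}{2}\int_{(-\ell,\ell)^2} dx\,dy\, U(|x-y|)[\rho_1(x)-\rho_1(y)][\rho_2(y)-\rho_2(x)].
\]

We define the excess free energy as

\[
\hat{\mathcal{F}}(\rho_1,\rho_2) := \lim_{\ell\to\infty}\left[\mathcal{F}_{(-\ell,\ell)}(\rho_1,\rho_2) - 2\ell\,\varphi(\rho^+,\rho^-)\right].
\tag{1.11}
\]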

Note that, since $\varphi(\rho^+,\rho^-) = \varphi(\rho^-,\rho^+)$, the functional $\hat{\mathcal{F}}$ is $+\infty$ when $\rho = (\rho_1,\rho_2)$ does not go to pure phases at infinity. Since we are interested in the coexistence of phases, we assume that $(\rho_1,\rho_2)$ satisfy the conditions $\lim_{x\to\pm\infty}\rho_1(x) = \rho^\pm$ and $\lim_{x\to\pm\infty}\rho_2(x) = \rho^\mp$. In [4] several results are proved about the minimizers for the functional $\hat{\mathcal{F}}$, which are summarized in the following theorem (see also [5]):

Theorem 1.1. Let $\beta\rho > 2$. Then there exists a unique (up to translations) positive minimizer (front) for the one-dimensional excess free energy $\hat{\mathcal{F}}$, defined by (1.11), in the class of continuous functions $\rho = (\rho_1(x),\rho_2(x))$ such that $\lim_{x\to\pm\infty}\rho_1 = \rho^\pm$, $\lim_{x\to\pm\infty}\rho_2 = \rho^\mp$, where $\rho^\pm$ are defined in (1.9). We denote by $\bar\rho = (\bar\rho_1(x),\bar\rho_2(x))$ the unique minimizer such that $\bar\rho_1(x) = \bar\rho_2(-x)$. $\bar\rho_1$ is monotone increasing and $\bar\rho_2$ is monotone decreasing and $\rho^- < \bar\rho_i(x) < \rho^+$ for any $x \in \mathbb{R}$. Moreover, the front $\bar\rho$ is $C^\infty(\mathbb{R})$-smooth and satisfies the Euler-Lagrange equations (1.7); its derivative $\bar\rho'$ satisfies the equations

\[
\bar\rho_i' + \beta\bar\rho_i\,(U * \bar\rho_{i+1}') = 0, \qquad i = 1, 2.
\tag{1.12}
\]

The front $\bar\rho$ converges to its asymptotic values exponentially fast, in the sense that there is $\alpha > 0$ such that

\[
|\bar\rho_1(x) - \rho^\mp|\,e^{\alpha|x|} \to 0 \ \text{as } x \to \mp\infty, \qquad |\bar\rho_2(x) - \rho^\pm|\,e^{\alpha|x|} \to 0 \ \text{as } x \to \mp\infty.
\]

Finally, the derivatives of $\bar\rho_i$ of any order vanish at infinity exponentially fast and $\bar\rho'$ is odd in the sense that

\[
\bar\rho_1'(x) = -\bar\rho_2'(-x).
\tag{1.13}
\]

The aim of this paper is to show that for $\beta$ larger than a certain critical value, the non-homogeneous equilibria, the front solutions $(\bar\rho_1,\bar\rho_2)$, are dynamically stable with respect to the evolution (1.1), while the homogeneous mixed phase is unstable. This will show that a complete bifurcation scenario arises in this model for any value of the Knudsen number, i.e. the ratio between the mean free path and the range of the interaction potential.

In the rest of the paper, without loss of generality we fix the asymptotic total density $\rho = \rho^+ + \rho^- = 2$. If $\beta < 1$, the only minimizer is the homogeneous equilibrium (mixed phase) $M_{\mathrm{hom}}(\xi) = (\mu_\beta(\xi),\mu_\beta(\xi))$. On the other hand, if $\beta > 1$, the pure phases $M_{\mathrm{red}} = (\rho^+\mu_\beta(\xi),\rho^-\mu_\beta(\xi))$ and $M_{\mathrm{blue}} = (\rho^-\mu_\beta(\xi),\rho^+\mu_\beta(\xi))$ are constant minimizers and $M_{\mathrm{hom}}$ is a maximizer. Moreover, we have the non homogeneous equilibrium

\[
M_{\bar\rho} = (\bar\rho_1(x)\mu_\beta(\xi),\bar\rho_2(x)\mu_\beta(\xi)), \quad \text{with } \bar\rho_1(x) + \bar\rho_2(x) \to 2 \text{ as } x \to \pm\infty.
\tag{1.14}
\]
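The threshold $\beta = 1$ at total density $\rho = 2$ can be made concrete with a short computation. The sketch below is an illustration and not the construction used in [4]: it assumes that the pure-phase densities are the stationary points of the local free energy (1.8) at fixed $\rho_1 + \rho_2 = \rho$, writes $\rho_{1,2} = (\rho/2)(1 \pm m)$, and uses the resulting scalar equation $m = \tanh(\beta\rho m/2)$, which has a nonzero root exactly when $\beta\rho > 2$. The function names and the bisection solver are hypothetical choices.

```python
import math


def pure_phase_densities(beta, rho=2.0):
    """Return (rho_plus, rho_minus) for the local free energy (1.8).

    Uses rho_{1,2} = (rho/2)(1 +/- m) with m = tanh(a*m), a = beta*rho/2.
    For a <= 1 the only root is m = 0 (mixed phase).
    """
    a = 0.5 * beta * rho
    if a <= 1.0:
        return 0.5 * rho, 0.5 * rho
    # g(m) = tanh(a*m) - m is positive at lo (since tanh(x) >= x - x^3/3)
    # and negative at hi = 1, with a single root in between.
    lo = 0.5 * min(1.0, math.sqrt(3.0 * (a - 1.0) / a**3))
    hi = 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if math.tanh(a * mid) - mid > 0.0:
            lo = mid
        else:
            hi = mid
    m = 0.5 * (lo + hi)
    return 0.5 * rho * (1.0 + m), 0.5 * rho * (1.0 - m)


def phi(r1, r2, beta):
    """Local free energy (1.8)."""
    return r1 * math.log(r1) + r2 * math.log(r2) + beta * r1 * r2


rho_plus, rho_minus = pure_phase_densities(1.5)
mixed = pure_phase_densities(0.5)
```

For a sample value such as $\beta = 1.5$ the two densities differ and give a lower value of $\varphi$ than the mixed state $(1,1)$, while for $\beta = 0.5$ the routine returns the mixed state.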


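We assume that the initial datum for the dynamics (1.1) is a small perturbation of one of the above equilibria, denoted generically by $M$:

\[
f_i(0) = M_i + \sqrt{M_i}\,g_i(0),
\tag{1.15}
\]

with $g_i$ sufficiently small in a sense that will be specified later. Let $f(t,x,\xi)$ be the solution to (1.1) and define $g(t,x,\xi)$ by setting

\[
f_i(t,x,\xi) = M_i(x,\xi) + \sqrt{M_i(x,\xi)}\,g_i(t,x,\xi).
\tag{1.16}
\]

The equation for the perturbation $g$ is

\[
\begin{aligned}
\Big\{\partial_t + v\,\partial_x + F\big(M_{i+1} + \sqrt{M_{i+1}}\,g_{i+1}\big)\partial_v + \nu(x,\xi)\Big\} g_i
&= \beta F\big(\sqrt{M_{i+1}}\,g_{i+1}\big)\sqrt{M_i}\,v + \sum_{j=1}^{2} K_{i,j}\,g_j\\
&\quad + F\big(\sqrt{M_{i+1}}\,g_{i+1}\big)\,g_i\,v + \Gamma(g_i,g_i) + \Gamma(g_i,g_{i+1}),
\end{aligned}
\tag{1.17}
\]

where we have used the notation:

\[
\begin{aligned}
\nu(x,\xi) &= \int_{\mathbb{R}^3\times S^2} |(\xi-\xi')\cdot\omega|\,\big[M_1(\xi') + M_2(\xi')\big]\,d\xi'\,d\omega,\\
K_{i,i}\,g_i &= \frac{1}{\sqrt{M_i}}\,Q_{\mathrm{gain}}\big(\sqrt{M_i}\,g_i,\,M_1+M_2\big) + \frac{1}{\sqrt{M_i}}\,Q\big(M_i,\sqrt{M_i}\,g_i\big),\\
K_{i,i+1}\,g_{i+1} &= \frac{1}{\sqrt{M_i}}\,Q\big(M_i,\sqrt{M_{i+1}}\,g_{i+1}\big),\\
\Gamma(g_i,g_j) &= \frac{1}{\sqrt{M_i}}\,Q\big(\sqrt{M_i}\,g_i,\sqrt{M_j}\,g_j\big).
\end{aligned}
\tag{1.18}
\]

We will need the following symmetry condition on the initial data:

\[
g_1(0,x,v,\zeta) = g_2(0,-x,-v,\zeta).
\tag{1.19}
\]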

We note that such a property is preserved by the time evolution. It plays a crucial role in removing the obvious orbital instability of our problem related to the translation invariance: indeed such an invariance is explicitly broken when condition (1.19) is assumed.

The momentum, kinetic energy and particle number of each species are conserved during collisions, while the sum of potential and kinetic energy is conserved along the trajectories. Therefore the following quantities are conserved under the evolution (1.1):

– The masses of the perturbation $g$,

\[
\mathcal{M}_i(g) = \int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,\big(f_i(x,\xi) - M_i(x,\xi)\big), \qquad i = 1, 2;
\]

– The energy of the perturbation $g$,

\[
E(g) = \int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,\frac{\xi^2}{2}\big[(f_1+f_2) - (M_1+M_2)\big] + \int_{\mathbb{R}} dx\int_{\mathbb{R}} dy\, U(|x-y|)\big[\rho_{f_1}(x)\rho_{f_2}(y) - \rho_{M_1}(x)\rho_{M_2}(y)\big],
\tag{1.20}
\]

with $f_i = M_i + \sqrt{M_i}\,g_i$ and

\[
\rho_f(x) = \int_{\mathbb{R}^3} d\xi\, f(x,\xi),
\tag{1.21}
\]

the spatial density of the distribution $f$.

Moreover, by standard arguments it follows that the $H$-function of the perturbation $g$,

\[
H(g) = \int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,\big[(f_1\ln f_1 + f_2\ln f_2) - (M_1\ln M_1 + M_2\ln M_2)\big],
\tag{1.22}
\]

does not increase in the evolution (1.1). We notice that, since the spatial domain is $\mathbb{R}$, the total masses, energy and $H$-function of the distribution $f$ are not well defined, while the above differences are finite. Since $H(g)$ does not increase during the evolution and $E(g)$ and the total masses $\mathcal{M}_i(g)$ are constant, any linear combination of them, with a positive coefficient for the entropy, does not increase. In particular, the following non-increasing entropy-energy functional is crucial to study the stability of the equilibria:

\[
\mathcal{H}(g) = H(g) + \beta E(g) - \sum_{i=1}^{2}\mathcal{M}_i(g)\left[C_i + 1 + \ln\left(\frac{\beta}{2\pi}\right)^{3/2}\right].
\tag{1.23}
\]

The factors multiplying the masses are suitably chosen to cancel some linear terms, as will be shown in the next section. The factor $\beta$ in front of the energy is dictated by the free energy minimizing properties of the equilibria. In the next sections we shall use the following weighted norm for $p \in [1,+\infty]$:

\[
\|f\|_{L^p_w} = \left(\int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,|w(\xi)f(x,\xi)|^p\right)^{1/p},
\]

for some positive weight function $w$. Moreover $L^p_w$ denotes the space of the measurable functions on $\mathbb{R}\times\mathbb{R}^3$ with $\|\cdot\|_{L^p_w}$ finite. When the notation is unambiguous, we will also omit the index $w$. Finally $\nabla_{x,v}$ denotes the couple $(\partial_x,\partial_v)$.

Theorem 1.2 (Stability). Assume $\beta > 1$ and $M = M_{\bar\rho}$. Let $w(\xi) = (\Sigma + |\xi|^2)^\gamma$, for $\Sigma > 0$ and $\gamma > 3/2$. There are $\delta > 0$ and $\Sigma > 0$, $C > 0$ such that, if the initial datum $f_i(0) = M_i + \sqrt{M_i}\,g_i(0)$ satisfies the symmetry condition (1.19), the bound

\[
\|wg(0)\|_{L^\infty} + \sqrt{\mathcal{H}(g(0))} < \delta
\]

and $\|\nabla_{x,v}\,g(0)\|_{L^2} < +\infty$, then the initial value problem for (1.17) has a unique global in time solution with

\[
\sup_{0\le t\le\infty}\|wg(t)\|_{L^\infty} \le C\Big\{\|wg(0)\|_{L^\infty} + \sqrt{\mathcal{H}(g(0))}\Big\},
\tag{1.24}
\]

\[
\|\nabla_{x,v}\,g(t)\|_{L^2} \le e^{Ct}\,\|\nabla_{x,v}\,g(0)\|_{L^2}.
\tag{1.25}
\]

Remark 1.1. Theorem 1.2 implies the stability of the non homogeneous equilibrium solution $M_{\bar\rho}$ in the Vlasov-Boltzmann evolution (1.1), with respect to initial perturbations satisfying (1.19). It turns out that $\mathcal{H}(g(0)) > 0$ for $\|wg(0)\|_{L^\infty}$ small (see Lemma 2.2).


Dynamical stability and rate of convergence for the same front solutions have been established in [5] for the Vlasov-Fokker-Planck dynamics: the Vlasov force is the same introduced here, but the collisions are replaced by a Fokker-Planck operator modeling the contact with a reservoir at inverse temperature $\beta$. A careful analysis of the macroscopic equation plays an important role. To prove nonlinear stability, the first major mathematical difficulty we encounter is the presence of a large amplitude potential $U$. To our knowledge, so far there has been no published work on the stability result in the presence of a large external field in the Boltzmann theory. The main problem is the collapse of Sobolev estimates in higher order energy norms. Indeed, even upon taking one $x$-derivative, the $H^1$-norm might actually grow in time due to the presence of the term $\partial_x F(M_{i+1})\,\partial_v g_i$ in (4.14). We are thus forced to design a strategy of proof based on a weighted $L^\infty$ formulation without any derivatives and get control of derivatives only afterwards. Furthermore, unlike the previous linear Fokker-Planck interaction, the nonlinear Boltzmann collisions make it difficult to analyze the equation for the projection on the hydrodynamical modes in this case. In fact, even an $L^2$ stability of $g$ is difficult to obtain directly from the analysis of Eq. (1.17). Instead, we avoid a direct study of Eq. (1.17) and make crucial use of the fundamental entropy-energy $\mathcal{H}(g)$ estimate to obtain a mixed $L^1$–$L^2$ type of stability estimate, based on the spectral gap of the linearized free energy operator $A$ (Lemma 2.2). We note that the spectral gap estimate relies in an essential way on the minimizing properties of the front solution and the symmetry condition is used to control the part of the solution in the null space of the operator $A$.
We then bootstrap such an $L^2$ stability to an $L^\infty$ estimate to obtain a pointwise stability estimate, by following the curved trajectory induced by the force field $F(M_{i+1} + \sqrt{M_{i+1}}\,g_{i+1})$. The success of such a strategy is somewhat surprising: the perturbation of the field does not need to decay in time, and all the analysis over the one dimensional trajectory is carried out in a finite time interval $[0,T_0]$ (Lemma 4.1). Yet this is still sufficient for the global in time estimate due to the strong exponential time decay from the collision frequency.

Remark 1.2. The same result also holds when perturbing initially the equilibria $M_{\mathrm{red}}$ and $M_{\mathrm{blue}}$. Indeed the proof is even simpler in this case, because the analog of the operator $A$ has again a spectral gap property, but its null space is trivial. Due to this, the symmetry assumption (1.19) is no longer needed.

We next discuss the homogeneous equilibrium $M_{\mathrm{hom}} = (\mu_\beta,\mu_\beta)$.

Theorem 1.3. Assume $\beta < 1$ and $M = M_{\mathrm{hom}}$. There are $\delta > 0$ and $\Sigma > 0$, $C > 0$ such that, if the initial datum $f_i(0) = M_i + \sqrt{M_i}\,g_i(0)$ satisfies the bound

\[
\|wg(0)\|_{L^\infty} + \sqrt{\mathcal{H}(g(0))} < \delta
\]

and $\|\nabla_{x,v}\,g(0)\|_{L^2} < +\infty$, then the initial value problem for (1.17) has a unique global in time solution satisfying the estimates (1.24) and (1.25).

Remark 1.3. This implies the stability of the mixed phase for $\beta < 1$. The proof is again a simpler version of the one for Theorem 1.2 and will be omitted. In this case the analog of the operator $A$ has a spectral gap property (because $\beta < 1$) and a trivial null space, so that we do not need to require the symmetry property (1.19).


We have already noticed that, when $\beta > 1$, the couple $(1,1)$ is a local maximizer for the free energy. In the next theorem we establish the dynamical instability of $M_{\mathrm{hom}}$.

Theorem 1.4 (Instability). Assume $\beta > 1$. There exist constants $k_0 > 0$, $\theta > 0$, $C > 0$, $c > 0$ and a family of initial $2\pi/k_0$-periodic data $f_i^\delta(0) = \mu_\beta + \sqrt{\mu_\beta}\,g_i^\delta(0) \ge 0$, with $g^\delta(0)$ satisfying (1.19) and

\[
\|\nabla_{x,v}\,g^\delta(0)\|_{L^2} + \|wg^\delta(0)\|_{L^\infty} \le C\delta,
\]

for $\delta$ sufficiently small, but the solution $g^\delta(t)$ to (1.1) satisfies

\[
\sup_{0\le t\le T^\delta}\|wg^\delta(t)\|_{L^\infty} \ge c\sup_{0\le t\le T^\delta}\|g^\delta(t)\|_{L^2} \ge c\,\theta > 0.
\]

Here the escape time is

\[
T^\delta = \frac{1}{\mathrm{Re}\,\lambda}\ln\frac{\theta}{\delta},
\tag{1.26}
\]

λ is the eigenvalue with the largest real part for the linearized Vlasov-Boltzmann system constructed in Theorem 5.1 with Reλ > 0. Remark 1.4. Note that T δ → ∞ as δ → 0. We also observe that the critical value β = 1 escapes our analysis, since it is based on some strict inequalities which collapse for the critical β. Furthermore, the growing mode which we construct satisfies the symmetry condition (1.19). Hence the instability is not a consequence of the absence of such a symmetry. To prove instability of the homogeneous state, we encounter a second major difficulty. Even though such a homogeneous equilibrium is not a local minimizer for the free-energy (1.11), it is not clear at all if such a property leads to a dynamical instability along the time evolution. As a matter of fact, the presence of (possibly strong) stabilizing collision in the velocity space might damp out the instability. It is thus fundamental to understand such a stabilizing collision effect. Unfortunately, a direct linear stability analysis in the presence of the collision effect is too complicated to draw any conclusion. We therefore developed a new perturbation argument to establish the linear instability around such a homogeneous steady state. We first observe that in the absence of the collision effect, the homogeneous state is indeed dynamically unstable, by an explicit analysis similar to the Penrose criterion in plasma physics [14]. It then follows, via a perturbation argument, that for ‘weak’ collision effects, such an instability should persist. The key is to show that this is even true for arbitrarily ‘strong’ collision effect. We use an argument of contradiction and a method of continuation. In fact, if instability should fail at some level of the collision effect, then a neutral mode must occur. 
The interaction between the Vlasov force and the collision effect forces any neutral mode to behave like a multiple of $M_{\mathrm{hom}}$ with a particular dispersion relation never satisfied for our model (Theorem 5.1). To bootstrap such a linear instability into a nonlinear one is delicate due to severe nonlinearity, and we follow the program developed by Strauss and the second author over the years [10,11]. We remark that the instability of the homogeneous equilibrium for the VFP model is still open because the techniques used in the present paper cannot apply due to the unboundedness of the Fokker-Planck operator. On the other hand, in this paper we cannot prove the convergence to the stable equilibrium, while it was possible in the VFP model even to compute the rate.


The paper is organized as follows: In Sect. 2 we use the energy-entropy to derive a mixed $L^1$–$L^2$ estimate. In Sect. 3, we establish some lemmas on the characteristic curves for Eqs. (1.1). In Sect. 4 we establish the nonlinear stability in $L^\infty$ norm. In Sect. 5 we construct a growing mode for the linearized problem around the homogeneous equilibrium and finally, in Sect. 6 we show that such a linear growing mode leads to nonlinear instability.

2. Entropy-Energy Estimate

In this section we use the conservation of energy and masses, and the entropy inequality to obtain a priori estimates on the deviation of the solution from equilibrium. A crucial role is played by the quadratic approximation of the excess free energy functional $\hat{\mathcal{F}}$. We need a few definitions and some notation. Let $u = (u_1,u_2)$ be a couple of functions in $L^2(\mathbb{R})$ and denote $\langle u,v\rangle = \sum_{i=1}^{2}(u_i,v_i)$. Given a couple of densities $\rho = (\rho_1(x),\rho_2(x))$, we define the operator $A$ by setting

\[
\langle u,Au\rangle := \sum_{i=1}^{2}\int_{\mathbb{R}} dx\, u_i(x)(Au)_i(x) = \frac{1}{2}\frac{d^2}{ds^2}\hat{\mathcal{F}}(\rho + su)\Big|_{s=0}.
\]

In particular, when $\rho = (\bar\rho_1(x),\bar\rho_2(x))$, the action of the operator $A$ on $u$ is

\[
(Au)_1 = \frac{u_1}{\bar\rho_1} + \beta\, U * u_2, \qquad (Au)_2 = \frac{u_2}{\bar\rho_2} + \beta\, U * u_1.
\tag{2.1}
\]

Hence

\[
\langle u,Au\rangle = \frac{1}{2}\sum_{i=1}^{2}\int_{\mathbb{R}} dx\,\frac{u_i^2(x)}{\bar\rho_i(x)} + \beta\int_{\mathbb{R}} dx\int_{\mathbb{R}} dy\, U(|x-y|)\,u_1(x)u_2(y).
\]

Due to the minimizing properties of $\bar\rho$, this quadratic form is non negative. Moreover, by (1.12) and (2.1), we see that $A\bar\rho' = 0$, which shows that $\bar\rho'$ is in the null space of $A$. Indeed, one can show (see [5] and references quoted therein) the following

Lemma 2.1. Suppose $\beta > 1$ and $\rho = (\bar\rho_1,\bar\rho_2)$. Then there exists $\delta_0 > 0$ such that

\[
\langle u,Au\rangle \ge \delta_0\,\langle (I-P)u,(I-P)u\rangle,
\]

where $P$ is the projector on $\mathrm{Null}\,A$:

\[
\mathrm{Null}\,A = \{u \in L^2(\mathbb{R})\times L^2(\mathbb{R}) \mid u = c\bar\rho',\ c \in \mathbb{R}\}.
\]

If either $\rho = (\rho^\pm,\rho^\mp)$, or $\rho = (1,1)$ with $\beta < 1$, we have

\[
\langle u,Au\rangle \ge \delta_0\,\langle u,u\rangle,
\]

and the null space of $A$ reduces to $\{0\}$.

We prove the following lemma, which plays a crucial role in the proof of the stability of $M_{\bar\rho}$. Recall that we have adopted the notation $\xi = (v,\zeta)$, with $\zeta \in \mathbb{R}^2$ and $v \in \mathbb{R}$, for the velocity.


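Lemma 2.2. Let $M = M_{\bar\rho}$. Assume that $f_1(t,x,v,\zeta) = f_2(t,-x,-v,\zeta)$ and that $\|g\|_{L^\infty} \le \delta$ for some small $\delta$. Then there exist $C > 0$ and $\kappa > 0$ such that

\[
C\sum_{i=1,2}\int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\left\{\frac{(f_i(t)-M_i)^2}{M_i}\,\mathbf{1}_{|f_i(t)-M_i|\le\kappa M_i} + |f_i(t)-M_i|\,\mathbf{1}_{|f_i(t)-M_i|\ge\kappa M_i}\right\} \le \mathcal{H}(g(0)).
\]

Remark 2.1. Note that, when dealing with $M_{\mathrm{red}}$, $M_{\mathrm{blue}}$ and $M_{\mathrm{hom}}$, the functions $\bar\rho_1$ and $\bar\rho_2$ have to be replaced by the constant values $(\rho^+,\rho^-)$, $(\rho^-,\rho^+)$ and $(1,1)$ respectively. With this modification the lemma still holds.

Proof. Remember the notation $\rho_f(t,x) = \int_{\mathbb{R}^3} d\xi\, f(t,x,\xi)$. We may construct solutions (see [8]) such that $\mathcal{H}(g) \le \mathcal{H}(g(0))$. We expand $\mathcal{H}(g)$ and use (1.7) to cancel the linear part of the expansion, which takes the form

\[
-\int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\sum_{i=1}^{2}\left\{C_i + 1 + \ln\left(\frac{\beta}{2\pi}\right)^{3/2}\right\}\{f_i - M_i\}
+ \sum_{i=1}^{2}\int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\left[\frac{\beta|\xi|^2}{2}(f_i - M_i) + (\ln M_i + 1)(f_i - M_i) + \beta(f_i - M_i)\,U * \bar\rho_{i+1}\right].
\]

Indeed, since $\ln M_i = \ln\left(\frac{\beta}{2\pi}\right)^{3/2} - \frac{\beta|\xi|^2}{2} + \ln\bar\rho_i$, by (1.7), the above quantity is zero by construction. Therefore, we turn to the second order expansion of $\mathcal{H}(g)$. For some $\tilde f_i$ between $M_i$ and $f_i$,

\[
\mathcal{H}(g) = \sum_{i=1}^{2}\int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,\frac{(f_i(t)-M_i)^2}{2\tilde f_i}
+ \beta\int_{\mathbb{R}} dx\int_{\mathbb{R}} dy\,\big(\rho_{f_1}(t,x) - \bar\rho_1(x)\big)\,U(|x-y|)\,\big(\rho_{f_2}(t,y) - \bar\rho_2(y)\big).
\tag{2.2}
\]

For some small number $\kappa$ to be determined, we introduce the indicator functions $\chi_i^< = \mathbf{1}_{|f_i(t)-M_i|\le\kappa M_i}$ and $\chi_i^> = \mathbf{1}_{|f_i(t)-M_i|>\kappa M_i}$ and split the first term into

\[
\int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,\frac{(f_i(t)-M_i)^2}{2\tilde f_i}\,\chi_i^> + \int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,\frac{(f_i(t)-M_i)^2}{2\tilde f_i}\,\chi_i^<.
\]

We first estimate $\frac{(f_i(t)-M_i)^2}{2\tilde f_i}$ in the case of $|f_i(t)-M_i| > \kappa M_i$. Notice that either $f_i \ge (1+\kappa)M_i$, or $f_i \le (1-\kappa)M_i$. If $f_i \ge (1+\kappa)M_i$, then $\tilde f_i(t) \le f_i$, and we have

\[
\frac{|f_i(t)-M_i|}{\tilde f_i(t)} \ge \frac{|f_i(t)-M_i|}{f_i} = 1 - \frac{M_i}{f_i} \ge 1 - \frac{1}{1+\kappa} = \frac{\kappa}{1+\kappa}.
\]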


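In the second case $f_i \le (1-\kappa)M_i$, we have $\tilde f_i(t) \le M_i$ and

\[
\frac{|f_i(t)-M_i|}{\tilde f_i(t)} \ge \frac{|f_i(t)-M_i|}{M_i} = 1 - \frac{f_i}{M_i} \ge 1 - \{1-\kappa\} = \kappa > \frac{\kappa}{1+\kappa}.
\]

Combining these two cases and noticing $\tilde f_i \le (1+\kappa)M_i$ for $|f_i(t)-M_i| \le \kappa M_i$, we conclude

\[
\begin{aligned}
&\int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,\frac{(f_i(t)-M_i)^2}{2\tilde f_i}\,\chi_i^< + \int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,\frac{(f_i(t)-M_i)^2}{2\tilde f_i}\,\chi_i^>\\
&\quad\ge \int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,\frac{(f_i(t)-M_i)^2}{2(1+\kappa)M_i}\,\chi_i^< + \frac{\kappa}{2(1+\kappa)}\int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,|f_i(t)-M_i|\,\chi_i^>\\
&\quad= \frac{1}{2(1+\kappa)}\int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\, g_i^2\,\chi_i^< + \frac{\kappa}{2(1+\kappa)}\int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,|f_i(t)-M_i|\,\chi_i^>\\
&\quad\ge \frac{1}{2(1+\kappa)}\int_{\mathbb{R}} dx\, n_i^2 + \frac{\kappa}{2(1+\kappa)}\int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,|f_i(t)-M_i|\,\chi_i^>,
\end{aligned}
\tag{2.3}
\]

where we have set $n_i(t,x)\sqrt{\mu_\beta} = P\Big[\frac{f_i(t)-M_i}{\sqrt{M_i}}\,\chi_i^<\Big]$, and $P$ denotes the $L^2_\xi$-projection on $\sqrt{M_i}$:

\[
Pf := \left(\int_{\mathbb{R}^3} d\xi\, f(\xi)\sqrt{\mu_\beta(\xi)}\right)\sqrt{\mu_\beta(\xi)}.
\]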

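We now split the potential contribution in (2.2) to get

\[
\begin{aligned}
&\beta\int_{\mathbb{R}} dx\int_{\mathbb{R}} dy\int_{\mathbb{R}^3} d\xi\int_{\mathbb{R}^3} d\eta\,\sqrt{M_1}\,g_1\chi_1^<\,U(|x-y|)\,\sqrt{M_2}\,g_2\chi_2^<\\
&+\beta\int_{\mathbb{R}} dx\int_{\mathbb{R}} dy\int_{\mathbb{R}^3} d\xi\int_{\mathbb{R}^3} d\eta\,\sqrt{M_1}\,g_1\chi_1^<\,U(|x-y|)\,\sqrt{M_2}\,g_2\chi_2^>\\
&+\beta\int_{\mathbb{R}} dx\int_{\mathbb{R}} dy\int_{\mathbb{R}^3} d\xi\int_{\mathbb{R}^3} d\eta\,\sqrt{M_1}\,g_1\chi_1^>\,U(|x-y|)\,\sqrt{M_2}\,g_2\chi_2^>\\
&+\beta\int_{\mathbb{R}} dx\int_{\mathbb{R}} dy\int_{\mathbb{R}^3} d\xi\int_{\mathbb{R}^3} d\eta\,\sqrt{M_1}\,g_1\chi_1^>\,U(|x-y|)\,\sqrt{M_2}\,g_2\chi_2^<,
\end{aligned}
\]

where the factors with index 1 are evaluated at $(x,\xi)$ and those with index 2 at $(y,\eta)$. From our assumption $\|g\|_{L^\infty} \le \delta$, the last three terms are controlled by

\[
C_\beta\,(\|g_1\|_{L^\infty} + \|g_2\|_{L^\infty})\sum_{i=1}^{2}\int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,|f_i(t)-M_i|\,\chi_i^>
\le C_\beta\,\delta\sum_{i=1}^{2}\int_{\mathbb{R}} dx\int_{\mathbb{R}^3} d\xi\,|f_i(t)-M_i|\,\mathbf{1}_{|f_i(t)-M_i|\ge\kappa M_i},
\]

which is bounded by the second term in (2.3) for $\delta \ll \kappa$. Since $\int d\xi\,\sqrt{M_i}\,g_i\,\chi_i^< = \sqrt{\bar\rho_i}\,n_i$, the first term can be written as

\[
\beta\int_{\mathbb{R}} dx\int_{\mathbb{R}} dy\; n_1(t,x)\sqrt{\bar\rho_1(x)}\,U(|x-y|)\,n_2(t,y)\sqrt{\bar\rho_2(y)}.
\]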


We now combine it with the first term in (2.3) to get

\[
\frac{1}{2(1+\kappa)}\int_{\mathbb{R}} dx\,(n_1^2+n_2^2) + \beta\int_{\mathbb{R}} dx\int_{\mathbb{R}} dy\; n_1(x)\sqrt{\bar\rho_1(x)}\,U(|x-y|)\,n_2(y)\sqrt{\bar\rho_2(y)}
= \langle n\sqrt{\bar\rho},\,A\,n\sqrt{\bar\rho}\rangle - \frac{\kappa}{2(1+\kappa)}\int_{\mathbb{R}} dx\,(n_1^2+n_2^2).
\]

Since $n_1(x) = n_2(-x)$ by our symmetry assumption and from $\bar\rho_1'(x) = -\bar\rho_2'(-x)$, it follows that $n\sqrt{\bar\rho} = (n_1\sqrt{\bar\rho_1},\,n_2\sqrt{\bar\rho_2})$ is orthogonal to the null space of $A$. From the spectral inequality for the operator $A$, since $\bar\rho_i(x) \ge \rho^-$, the above is bounded from below by

\[
\left(\rho^-\delta_0 - \frac{\kappa}{2(1+\kappa)}\right)\int_{\mathbb{R}} dx\,(n_1^2+n_2^2) \ge c\sum_{i=1}^{2}\left\|P\Big[\frac{f_i(t)-M_i}{\sqrt{M_i}}\,\mathbf{1}_{|f_i(t)-M_i|\le\kappa M_i}\Big]\right\|_{L^2}^2,
\]

provided that we choose $\kappa$ sufficiently small. Then the lemma follows by collecting the terms with $\delta \ll \kappa$.

3. 1-Dimensional Characteristics

We define the characteristic curves $[X_i(s;t,x,v),V_i(s;t,x,v)]$ for (1.17) passing through $(t,x,v)$ at $s = t$, such that

\[
\begin{aligned}
\frac{dX_i(s;t,x,v)}{ds} &= V_i(s;t,x,v),\\
\frac{dV_i(s;t,x,v)}{ds} &= -\partial_x U * \int_{\mathbb{R}^3} d\xi\,\big(M_{i+1} + \sqrt{M_{i+1}}\,g_{i+1}\big) \equiv -\partial_x\phi(X_i(s;t,x,v)).
\end{aligned}
\tag{3.1}
\]

We also define the unperturbed characteristics $[X_i^0(s;t,x,v),V_i^0(s;t,x,v)]$ passing through $(t,x,v)$ at $s = t$, such that

\[
\begin{aligned}
\frac{dX_i^0(s;t,x,v)}{ds} &= V_i^0(s;t,x,v),\\
\frac{dV_i^0(s;t,x,v)}{ds} &= -\partial_x U * \int_{\mathbb{R}^3} M_{i+1}\,d\xi \equiv -\partial_x\phi_0(X_i^0(s;t,x,v)).
\end{aligned}
\tag{3.2}
\]

Our main goal is to study the zero set of $\dfrac{\partial X_i(s;t,x,v)}{\partial v}$.

Lemma 3.1. For any $(t,x,v)$ with $v \ne 0$, the set $\Big\{s \in \mathbb{R} : \dfrac{\partial X_i^0(s;t,x,v)}{\partial v} = 0\Big\}$ is countable.


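Proof. From the particle energy conservation for (3.2):

\[
\frac{1}{2}|V_i^0(s;t,x,v)|^2 + \phi_0(X_i^0(s;t,x,v)) = \frac{1}{2}|v|^2 + \phi_0(x).
\tag{3.3}
\]

Taking the derivative with respect to $v$ yields

\[
V_i^0(s;t,x,v)\,\frac{\partial V_i^0(s;t,x,v)}{\partial v} + \phi_0'\big(X_i^0(s;t,x,v)\big)\,\frac{\partial X_i^0(s;t,x,v)}{\partial v} = v.
\]

Assume $v \ne 0$. If there is $s_0$ such that $\dfrac{\partial X_i^0(s_0;t,x,v)}{\partial v} = 0$, then necessarily

\[
V_i^0(s_0;t,x,v)\,\frac{\partial V_i^0(s_0;t,x,v)}{\partial v} \ne 0,
\]

because, for such an $s_0$, it reduces to $v$ by the previous equation and we assumed $v \ne 0$. Hence

\[
\frac{\partial V_i^0(s_0;t,x,v)}{\partial v} = \frac{d}{ds}\frac{\partial X_i^0(s;t,x,v)}{\partial v}\Big|_{s=s_0} \ne 0,
\]

and therefore $\dfrac{\partial X_i^0(s;t,x,v)}{\partial v} \ne 0$ for $s \ne s_0$ in a neighborhood of $s = s_0$. The zeros are thus isolated, which implies that for $v \ne 0$, the set $\Big\{s : \dfrac{\partial X_i^0(s;t,x,v)}{\partial v} = 0\Big\}$ is countable.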

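Lemma 3.2. Fix $T_0 > 0$ and $N > 0$. Let $|v| \le N$.

1. For any $\varepsilon > 0$, there exists $L_\varepsilon$ sufficiently large so that, if $|x| \ge L_\varepsilon$, then for $0 \le s \le t - \varepsilon$,

\[
\frac{\partial X_i^0(s;t,x,v)}{\partial v} < -\frac{\varepsilon}{2} < 0.
\]

2. For any $\eta > 0$, there exist $P$ finite points $|x_k| \le L_\varepsilon$ ($1 \le k \le P$) and corresponding open sets

\[
O_{x_k} = \bigcup_{(n,o)\in I_k}\{a_n < s < b_n\}\times\{c_o < v < d_o\}
\]

with the property $\big|[0,T_0]\times\{|v|\le N\}\cap O_{x_k}^c\big| < \eta$, so that there exists $m > 0$ and, for any $|x| \le L_\varepsilon$, there exists $l \in \{1,\dots,P\}$ with

\[
\left|\frac{\partial X_i^0(s;t,x,v)}{\partial v}\right| > m > 0 \quad \text{for } (s,v) \in O_{x_l}.
\]

Proof. For any $\varepsilon > 0$, from (3.2) and (3.3),

\[
|V_i^0(s;t,x,v)| \le |v| + 2\sqrt{\|\phi_0\|_{L^\infty}} \le N + C, \qquad |X_i^0(s;t,x,v) - x| \le T_0\{N + C\}.
\]

By choosing $L_\varepsilon$ (depending on $T_0$ and $N$) large enough, for $|x| \ge L_\varepsilon$, $|v| \le N$, and $0 \le s \le T_0$,

\[
|X_i^0(s;t,x,v)| \ge \frac{L_\varepsilon}{2}.
\tag{3.4}
\]

From (3.2), we have

\[
\frac{d^2}{ds^2}\frac{\partial X_i^0(s;t,x,v)}{\partial v} = -\partial_{xx}\phi_0\big(X_i^0(s;t,x,v)\big)\,\frac{\partial X_i^0(s;t,x,v)}{\partial v},
\tag{3.5}
\]

and we deduce that for $|s| \le T_0$,

\[
\left|\frac{\partial X_i^0(s;t,x,v)}{\partial v}\right| \le C_{T_0}.
\tag{3.6}
\]

By the Taylor expansion in $s$, we get

\[
\begin{aligned}
\frac{\partial X_i^0(s;t,x,v)}{\partial v}
&= \frac{\partial X_i^0(s;t,x,v)}{\partial v}\Big|_{s=t} + (s-t)\,\frac{d}{ds}\frac{\partial X_i^0(s;t,x,v)}{\partial v}\Big|_{s=t} + \frac{(s-t)^2}{2}\,\frac{d^2}{ds^2}\frac{\partial X_i^0(\bar s;t,x,v)}{\partial v}\\
&= (s-t) + \frac{(s-t)^2}{2}\,\frac{d^2}{ds^2}\frac{\partial X_i^0(\bar s;t,x,v)}{\partial v}
\end{aligned}
\]

for some $t - T_0 \le \bar s \le t$. Since the densities $\bar\rho_i$ tend to their asymptotic values at infinity, $\lim_{y\to\infty}\partial_{xx}\phi_0(y) = 0$. Therefore, using again (3.5) with $s = \bar s$ and (3.6), we have

\[
\frac{\partial X_i^0(s;t,x,v)}{\partial v} \le (s-t)\left(1 - (t-s)\,C_{T_0}\sup_{|y|\ge L_\varepsilon/2}|\partial_{xx}\phi_0(y)|\right) \le \frac{s-t}{2} < 0,
\]

by choosing $L_\varepsilon$ sufficiently large. Part (1) thus follows.

To prove part (2), for $|x| \le L_\varepsilon$, introduce the zero set of $\dfrac{\partial X_i^0(s;t,x,v)}{\partial v}$ as

\[
Z_x = \Big\{t - T_0 \le s \le t,\ |v| \le N : \frac{\partial X_i^0(s;t,x,v)}{\partial v} = 0\Big\}.
\]

Then from the Fubini Theorem and Lemma 3.1,

\[
|Z_x| = \int_{-N}^{N}\int_{t-T_0}^{t}\mathbf{1}_{\{s,v:\ \partial X_i^0(s;t,x,v)/\partial v = 0\}}\,ds\,dv = 0.
\]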


η . Clearly 2 0 ∂ X i (s;t,x,v) = 0 over the compact set [t − T0 , t] × {|v| ≤ N } ∩ Ωxc . By the continuity in ∂v s and v, there exists m x > 0 such that over [t − T0 , t] × {|v| ≤ N } ∩ Ωxc , ∂ X 0 (s; t, x, v) i > 4m x > 0. ∂v

Therefore, there exists an open set Ωx such that Z x ⊂ Ωx with |Ωx | <

Furthermore, from the continuity in x, we have an open set of (x − ∆x , x + ∆x ) such that ∂ X 0 (s; t, x , v) i > 2m x > 0 ∂v for all x ∈ (x − ∆x , x + ∆x ), (s, v) ∈ [t − T0 , t] × {|v| ≤ N } ∩ Ωxc . Such (x − ∆x , x + ∆x ) forms an open covering for |x| ≤ L ε , hence there is a finite subcovering {(xk − ∆k , xk + ∆k ), k = 1, . . . , P} for |x| ≤ L ε . For any |x| ≤ L ε , there exists Ωxl such that x ∈ (xl − ∆l , xl + ∆l ) and for [t − T0 , t] × {|v| ≤ N } ∩ Ωxcl , ∂ X 0 (s; t, x, v) i > 2m = min 2m k > 0. 1≤k≤P ∂v c We finally choose an (finite) open covering of [t − T0 , t] × {|v| ≤ N } ∩ Ωxk of the form Oxk = n,o {an < s < bn } × {co < v < do } with |an − bn | + |co − do | sufficiently small so that over Oxk , ∂ X 0 (s; t, x, v) i > m > 0. ∂v
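Part (1) of Lemma 3.2 can be illustrated numerically. The following is a minimal sketch, not the authors' construction: it assumes a hypothetical potential $\phi_0(x) = \tanh(x)$ (chosen only because its second derivative vanishes at infinity), integrates the characteristics $dX/ds = V$, $dV/ds = -\partial_x\phi_0(X)$ backward from time $t$ together with the linearized system behind (3.5), and returns $\partial X/\partial v$ at time $s$. The function names and the RK4 integrator are illustrative choices.

```python
import math

def phi0_x(x):
    # Hypothetical potential phi0(x) = tanh(x): first derivative.
    return 1.0 / math.cosh(x) ** 2

def phi0_xx(x):
    # Second derivative of tanh(x); it vanishes as |x| -> infinity.
    return -2.0 * math.tanh(x) / math.cosh(x) ** 2

def rhs(state):
    # state = (X, V, dX/dv, dV/dv); the last two solve the linearized flow,
    # so that d^2/ds^2 (dX/dv) = -phi0''(X) dX/dv as in (3.5).
    X, V, dX, dV = state
    return (V, -phi0_x(X), dV, -phi0_xx(X) * dX)

def rk4_step(state, h):
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, h / 2))
    k3 = rhs(shifted(k2, h / 2))
    k4 = rhs(shifted(k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def dX_dv(s, t, x, v, steps=2000):
    # Integrate from time t (where X = x, V = v, dX/dv = 0, dV/dv = 1)
    # to an earlier time s <= t and return dX/dv at time s.
    state = (x, v, 0.0, 1.0)
    h = (s - t) / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state[2]

if __name__ == "__main__":
    t, s = 2.0, 1.0
    print(dX_dv(s, t, 30.0, 1.0))  # far from the interface: close to s - t
    print(dX_dv(s, t, 0.0, 0.5))   # near the interface: no sign guarantee
```

Far from the origin the potential is essentially flat, so the computed derivative stays close to $s - t$ and in particular below $(s-t)/2$, which is the mechanism used in the proof.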

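Lemma 3.3. Fix $T_0 > 0$ and $N > 0$. Let $|v| \le N$. For any $\varepsilon > 0$, recall $L_\varepsilon$, $O_{x_k}$, $1 \le k \le P$, constructed in Lemma 3.2. There exists $\delta > 0$ such that if $\|g\|_{L^2} < \delta$, then:

1. If $|x| \ge L_\varepsilon$ then for $0 \le s \le t - \varepsilon$,

$$\frac{\partial X_i(s;t,x,v)}{\partial v} < -\frac{\varepsilon}{2}.$$

2. For any $|x| \le L_\varepsilon$, there exists $l$ such that $x \in (x_l-\Delta_l, x_l+\Delta_l)$ and for all $(s,v) \in O_{x_l}$,

$$\Big|\frac{\partial X_i(s;t,x,v)}{\partial v}\Big| > \frac{m}{2} > 0.$$

Proof. Denote the solution operator of (3.2) and (3.1) by $G^0$ and $G$. By the Duhamel principle, we have

$$\begin{pmatrix} X_i(s;t,x,v) \\ V_i(s;t,x,v)\end{pmatrix} = e^{G^0(s-t)}\begin{pmatrix} x \\ v\end{pmatrix} + \int_t^s e^{G^0(s-\tau)}\begin{pmatrix} 0 \\ -\partial_x U * \{\int\sqrt{M_{i+1}}\,g_{i+1}\,d\xi\}(\tau)\end{pmatrix} d\tau. \qquad (3.7)$$

Notice that

$$\Big|\frac{\partial}{\partial v}\,\partial_x U * \Big\{\int\sqrt{M_{i+1}}\,g_{i+1}\,d\xi\Big\}(\tau)\Big| = \Big|\partial_{xx} U * \Big\{\int\sqrt{M_{i+1}}\,g_{i+1}\,d\xi\Big\}\,\frac{\partial X_i(\tau;t,x,v)}{\partial v}\Big| \le C\|g\|_{L^2}\Big|\frac{\partial X_i(\tau;t,x,v)}{\partial v}\Big|.$$

It thus follows from taking the $v$ derivative in (3.7) and from Gronwall's lemma that, for $0 \le s \le T_0$,

$$\Big|\frac{\partial X_i(s;t,x,v)}{\partial v}\Big| \le e^{CT_0}.$$

We now use (3.7) again to get

$$\Big|\frac{\partial X_i(s;t,x,v)}{\partial v} - \frac{\partial X_i^0(s;t,x,v)}{\partial v}\Big| \le C_{T_0}\|g\|_{L^2}.$$

Hence, we deduce our lemma by choosing $\|g\|_{L^2}$ sufficiently small.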

4. Weighted $L^\infty$ Stability

In this section we use the entropy-energy bound and the estimates on the characteristics to show that the perturbation $g$ of the non homogeneous equilibrium $M_{\bar\rho}$ is arbitrarily small at any positive time in a suitable weighted $L^\infty$ norm, provided that it is initially sufficiently small, thus showing the stability of the non homogeneous equilibrium. We use the weight function $w(\xi) = (\Sigma + |\xi|^2)^\gamma$, with $\Sigma$ a positive constant to be chosen later and $\gamma > \frac{3}{2}$.

Lemma 4.1. Let $h = wg$. There exist $T_0 > 0$ and $\delta > 0$ such that, if $\|h\|_{L^\infty} < \delta$, then

$$\|h(T_0)\|_{L^\infty} \le \frac{1}{2}\|h(0)\|_{L^\infty} + C_{T_0}H(g(0)).$$

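Proof. We first write the equation for $h = wg$ from (1.17):

$$\Big\{\partial_t + v\partial_x + F\Big(M_{i+1} + \sqrt{M_{i+1}}\frac{h_{i+1}}{w}\Big)\partial_v + \nu(x,\xi)\Big\}h_i = F\Big(M_{i+1} + \sqrt{M_{i+1}}\frac{h_{i+1}}{w}\Big)\frac{w'}{w}h_i + wF(\sqrt{M_{i+1}}\,g_{i+1})\,v\sqrt{M_i} + \sum_{j=1,2}K_w^{ij}h_j + F\Big(\sqrt{M_{i+1}}\frac{h_{i+1}}{w}\Big)v h_i + w\Gamma\Big(\frac{h_i}{w},\frac{h_i}{w}\Big) + w\Gamma\Big(\frac{h_i}{w},\frac{h_{i+1}}{w}\Big), \qquad (4.1)$$

where $K_w^{ij}(\cdot) = wK^{ij}\big(\frac{\cdot}{w}\big)$. We note that for $\Sigma \ge 1$,

$$\frac{w(\xi)}{w(\xi')} = \frac{[\Sigma+|\xi|^2]^\gamma}{[\Sigma+|\xi'|^2]^\gamma} \le C_\gamma\frac{[\Sigma+|\xi'|^2]^\gamma + |\xi-\xi'|^{2\gamma}}{[\Sigma+|\xi'|^2]^\gamma} \le C_\gamma[1+|\xi-\xi'|^2]^\gamma. \qquad (4.2)$$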
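The weight comparison (4.2) is easy to spot-check numerically. The sketch below is only an illustration: it uses the explicit constant $C_\gamma = 2^\gamma$, which is an assumption of this example (it follows from $|\xi|^2 \le 2|\xi'|^2 + 2|\xi-\xi'|^2$ together with $\Sigma + |\xi'|^2 \ge 1$) and is not a constant stated in the text. The function names and sampling ranges are hypothetical.

```python
import random

def weight(xi, sigma, gamma):
    # w(xi) = (Sigma + |xi|^2)^gamma
    return (sigma + sum(c * c for c in xi)) ** gamma

def ratio_within_bound(xi, xi_prime, sigma, gamma):
    # Check w(xi) / w(xi') <= 2^gamma * (1 + |xi - xi'|^2)^gamma,
    # with a tiny relative slack for floating-point rounding.
    dist2 = sum((a - b) ** 2 for a, b in zip(xi, xi_prime))
    ratio = weight(xi, sigma, gamma) / weight(xi_prime, sigma, gamma)
    bound = 2.0 ** gamma * (1.0 + dist2) ** gamma
    return ratio <= bound * (1.0 + 1e-12)

def check_samples(samples=2000, sigma=1.0, gamma=2.0, seed=0):
    # Random velocities in a box; requires sigma >= 1 as in the text.
    rng = random.Random(seed)
    for _ in range(samples):
        xi = [rng.uniform(-20, 20) for _ in range(3)]
        xi_prime = [rng.uniform(-20, 20) for _ in range(3)]
        if not ratio_within_bound(xi, xi_prime, sigma, gamma):
            return False
    return True
```

The bound does not depend on $\Sigma$ once $\Sigma \ge 1$, which is consistent with the later remark that the kernel estimate holds uniformly in $\Sigma$.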

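For any $(t,x,\xi)$, integrating along its backward trajectory (3.1), $[X_i(s),V_i(s)] = [X_i(s;t,x,v),V_i(s;t,x,v)]$, we can express $h_i(t,x,\xi)$ as

$$\begin{aligned}
& h_i(0,X_i(0;t,x,v),V_i(0;t,x,v),\zeta) \\
&+ \int_0^t e^{\int_t^s\nu_i(\tau)d\tau}\Big\{F(M_{i+1}+\sqrt{M_{i+1}}\,g_{i+1})\frac{w'}{w}h_i\Big\}(s,X_i(s),V_i(s),\zeta)\,ds \\
&+ \int_0^t e^{\int_t^s\nu_i(\tau)d\tau}\big\{F(\sqrt{M_{i+1}}\,g_{i+1})V_i(s)\sqrt{M_i}\big\}(s,X_i(s),V_i(s),\zeta)\,ds \\
&+ \int_0^t e^{\int_t^s\nu_i(\tau)d\tau}\Big(\sum_{j=1}^2 K_w^{i,j}h_j\Big)(s,X_i(s),V_i(s),\zeta)\,ds \\
&+ \int_0^t e^{\int_t^s\nu_i(\tau)d\tau}\big\{F(\sqrt{M_{i+1}}\,g_{i+1})v h_i\big\}(s,X_i(s),V_i(s),\zeta)\,ds \\
&+ \int_0^t e^{\int_t^s\nu_i(\tau)d\tau}\,w\Big\{\Gamma\Big(\frac{h_i}{w},\frac{h_i}{w}\Big) + \Gamma\Big(\frac{h_i}{w},\frac{h_{i+1}}{w}\Big)\Big\}(s,X_i(s),V_i(s),\zeta)\,ds. \qquad (4.3)
\end{aligned}$$

We have set $\nu_i(\tau) \equiv \nu(V_i(\tau),\zeta) \ge \nu_0 > 0$. Fix a small constant $\varepsilon > 0$. We can choose $\Sigma$ large so that $|\frac{w'}{w}| \le \varepsilon$. Since $\|F(M_{i+1}+\sqrt{M_{i+1}}\,g_{i+1})\|_{L^\infty} \le C$ if $\|h\|_{L^\infty}$ is small, the second term in (4.4) is bounded by

$$C\varepsilon e^{-\frac{\nu_0}{2}t}\sup_{0\le s\le T_0}\{e^{\frac{\nu_0}{2}s}\|h(s)\|_{L^\infty}\}.$$

For the third term in (4.4), we split $g_{i+1} = g_{i+1}\mathbf{1}_{|f_i(t)-M_i|\le\kappa M_i} + g_{i+1}\mathbf{1}_{|f_i(t)-M_i|\ge\kappa M_i}$. Since $U$ is smooth,

$$\|F(\sqrt{M_{i+1}}\,g_{i+1})\|_{L^\infty} \le C\{\|\sqrt{M_{i+1}}\,g_{i+1}\mathbf{1}_{|f_i(t)-M_i|\le\kappa M_i}\|_{L^2} + \|\sqrt{M_{i+1}}\,g_{i+1}\mathbf{1}_{|f_i(t)-M_i|\ge\kappa M_i}\|_{L^1}\},$$

and by Lemma 2.2,

$$\|F(\sqrt{M_{i+1}}\,g_{i+1})\sqrt{M_i}(s,X_i(s),V_i(s),\zeta)V_i(s)\|_{L^\infty} \le C\big\{\sqrt{H(g(0))} + H(g(0))\big\}. \qquad (4.4)$$

For the fifth term, we note that, since for hard spheres $\nu_i(s) \ge \nu_0(1+|\zeta|+|V_i(s)|)$, it follows that $\frac{|V_i(s)|}{\nu_i(s)} < \frac{1}{\nu_0}$. Moreover, $\int_0^t\frac{\nu_i(s)}{2}e^{\int_t^s(\nu_i(\tau)/2)d\tau}ds \le 1$. Therefore,

$$\Big|\int_0^t e^{\int_t^s\nu_i(\tau)d\tau}F(\sqrt{M_{i+1}}\,g_{i+1})V_i(s)h_i\,ds\Big| \le Ce^{-\nu_0 t}\sup_{0\le s\le T_0}\{e^{\frac{\nu_0}{2}s}\|h(s)\|_{L^\infty}\}^2.$$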

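For the last term in (4.4), by Lemma 10 of [7], it follows that

$$\Big|w\Gamma\Big(\frac{h_i}{w},\frac{h_i}{w}\Big)(\xi)\Big| + \Big|w\Gamma\Big(\frac{h_i}{w},\frac{h_{i+1}}{w}\Big)(\xi)\Big| \le C\nu(\xi)\|h\|^2_{L^\infty}.$$

We therefore get the bound for the last term by

$$\int_0^t e^{\int_t^s\nu(\tau)d\tau}\nu_i(s)\|h(s)\|^2_{L^\infty}ds \le C\Big\{\sup_{0\le s\le T_0}e^{\frac{\nu_0}{2}s}\|h(s)\|_{L^\infty}\Big\}^2\int_0^t e^{\int_t^s\nu_i(\tau)d\tau}\nu_i(s)e^{-\nu_0 s}ds.$$

Note that $\frac{d}{ds}[e^{\int_t^s\nu_i(\tau)d\tau}] = e^{\int_t^s\nu_i(\tau)d\tau}\nu_i(s)$. Integrating by parts yields

$$\int_0^t e^{\int_t^s\nu_i(\tau)d\tau}\nu_i(s)e^{-\nu_0 s}ds = \Big[e^{\int_t^s\nu_i(\tau)d\tau}e^{-\nu_0 s}\Big]_{s=0}^{s=t} + \nu_0\int_0^t e^{\int_t^s\nu_i(\tau)d\tau}e^{-\nu_0 s}ds \le C(1+t)e^{-\nu_0 t}.$$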

We shall mainly concentrate on the fourth term in (4.4). Let $k_{i,j}(\xi,\xi')$ be the corresponding kernel associated with $K_w^{ij}$ in (4.1). We now use (4.4) for $h_j(s,X_i(s),\xi')$ again to evaluate

$$\{K_w^{i,j}h_j\}(s,X_i(s),V_i(s),\zeta) = \int k_w^{i,j}(V_i(s),\zeta,\xi')h_j(s,X_i(s),\xi')\,d\xi'.$$

Denote $[X_j(s_1),V_j(s_1)] \equiv [X_j(s_1;X_i(s;t,x,v),v'),V_j(s_1;X_i(s;t,x,v),v')]$. We can bound the fourth term in (4.4) by the sum on $j$ of

$$\begin{aligned}
&\int_0^t e^{\int_t^s\nu_i(\tau)d\tau+\int_s^0\nu_j(\tau)d\tau}\int_{\mathbb{R}^3}|k_w^{i,j}(V_i(s),\zeta,\xi')\,h_j(0,X_j(0),V_j(0),\zeta')|\,d\xi'\,ds \\
&+ \int_0^t\int_0^s e^{\int_t^s\nu_i(\tau)d\tau+\int_s^{s_1}\nu_j(\tau)d\tau}\int_{\mathbb{R}^3}|k_w^{i,j}(V_i(s),\zeta,\xi')| \times \Big|\Big\{F(M_{j+1}+\sqrt{M_{j+1}}\,g_{j+1})\frac{w'}{w}h_j\Big\}(s_1,X_j(s_1),V_j(s_1),\zeta')\Big|\,d\xi'\,ds\,ds_1 \\
&+ \int_0^t\int_0^s e^{\int_t^s\nu_i(\tau)d\tau+\int_s^{s_1}\nu_j(\tau)d\tau}\int_{\mathbb{R}^3}|k_w^{i,j}(V_i(s),\zeta,\xi')| \times \big|\{F(\sqrt{M_{j+1}}\,g_{j+1})v'\sqrt{M_j}\}(s_1,X_j(s_1),V_j(s_1),\zeta')\big|\,d\xi'\,ds\,ds_1 \\
&+ \int_0^t\int_0^s e^{\int_t^s\nu_i(\tau)d\tau+\int_s^{s_1}\nu_j(\tau)d\tau}\int_{\mathbb{R}^3\times\mathbb{R}^3}\sum_k\big|k_w^{i,j}(V_i(s),\zeta,\xi') \times k_w^{j,k}(V_j(s_1),\zeta',\xi'')\,h_k(s_1,X_j(s_1),\xi'')\big|\,d\xi'\,d\xi''\,ds\,ds_1 \\
&+ \int_0^t\int_0^s e^{\int_t^s\nu_i(\tau)d\tau+\int_s^{s_1}\nu_j(\tau)d\tau}\int_{\mathbb{R}^3}|k_w^{i,j}(V_i(s),\zeta,\xi')| \times \big|\{F(\sqrt{M_{j+1}}\,g_{j+1})v'h_j\}(s_1,X_j(s_1),V_j(s_1),\zeta')\big|\,d\xi'\,ds\,ds_1 \\
&+ \int_0^t\int_0^s e^{\int_t^s\nu_i(\tau)d\tau+\int_s^{s_1}\nu_j(\tau)d\tau}\int_{\mathbb{R}^3}|k_w^{i,j}(V_i(s),\zeta,\xi')| \times \Big|w\Big\{\Gamma\Big(\frac{h_j}{w},\frac{h_j}{w}\Big)+\Gamma\Big(\frac{h_j}{w},\frac{h_{j+1}}{w}\Big)\Big\}(s_1,X_j(s_1),V_j(s_1))\Big|\,d\xi'\,ds\,ds_1. \qquad (4.5)
\end{aligned}$$

We will make an extended use of Lemma 7 of [7], which we report here for the reader's convenience. For hard spheres, the usual Grad estimates imply:

$$|k_{i,j}(\xi,\xi')| \le C\{|\xi-\xi'| + |\xi-\xi'|^{-1}\}\,e^{-\frac{1}{8}|\xi-\xi'|^2 - \frac{1}{8}\frac{||\xi|^2-|\xi'|^2|^2}{|\xi-\xi'|^2}}. \qquad (4.6)$$

Lemma 4.2 (Lemma 7 of [7]). There are $\varepsilon > 0$ and $C > 0$ such that

$$\int_{\mathbb{R}^3}\{|\xi-\xi'| + |\xi-\xi'|^{-1}\}\,e^{-\frac{1-\varepsilon}{8}|\xi-\xi'|^2 - \frac{1-\varepsilon}{8}\frac{||\xi|^2-|\xi'|^2|^2}{|\xi-\xi'|^2}}\,\frac{w(\xi)}{w(\xi')}\,d\xi' \le \frac{C}{1+|\xi|}. \qquad (4.7)$$

By Lemma 4.2, we obtain the crucial estimate

$$\int_{\mathbb{R}^3}|k_w^{i,j}(\xi,\xi')|\,d\xi' < \frac{C}{1+|\xi|} \qquad (4.8)$$

uniformly in $\Sigma$. Since $\nu_i \ge \nu_0$, by taking the $L^\infty$ norm for $h$ and (4.8), we bound the first term in (4.6) by $Cte^{-\nu_0 t}\|h_0\|_{L^\infty}$, and the second term by

$$\varepsilon Ce^{-\frac{\nu_0}{2}t}\sup_{0\le s\le T_0}\{e^{\frac{\nu_0}{2}s}\|h(s)\|_{L^\infty}\}.$$

By (4.8), the third term is bounded by $CH(g(0))$ as in (4.4), and the last two nonlinear terms are bounded by

$$C\{1+t\}e^{-\nu_0 t}\Big\{\sup_{0\le s\le T_0}e^{\frac{\nu_0}{2}s}\|h(s)\|_{L^\infty}\Big\}^2.$$

We now concentrate on the fourth term in (4.6), which will be estimated along the same lines as the proof of Theorem 20 in [7].

Case 1. For $|\xi| \ge N \gg T_0$, we know from (3.1) that $|V_i(s;t,x,v) - v| \le |s-t|C \le CT_0$. By Lemma 4.2 and (4.2), for $N \gg T_0$ large,

$$\int\!\!\int|k_w^{i,j}(V_i(s),\zeta,\xi')\,k_w^{j,k}(V_j(s_1),\zeta',\xi'')|\,d\xi'\,d\xi'' \le \frac{C}{1+|V_i(s)|+|\zeta|} \le \frac{C}{N-CT_0},$$

and we therefore can find an upper bound for the fourth term in (4.6) by ($N \gg T_0$)

$$\frac{C}{N}\int_0^t e^{-\nu_0(t-s)}\int_0^s e^{-\nu_0(s-s_1)}\|h(s_1)\|_{L^\infty}\,ds_1\,ds \le \frac{Ce^{-\frac{\nu_0}{2}t}}{N}\sup_{0\le s\le T_0}\{e^{\frac{\nu_0}{2}s}\|h(s)\|_{L^\infty}\}.$$

Case 2. For $|\xi| \le N$, $|\xi'| \ge 2N$, or $|\xi'| \le 2N$, $|\xi''| \ge 3N$. Notice that we have either $|\xi-\xi'| \ge N$ or $|\xi'-\xi''| \ge N$. This implies that

$$|v' - V_i(s;t,x,v)| \ge |v'-v| - |v - V_i(s;t,x,v)| \ge |v'-v| - CT_0,$$

$$|v'' - V_j(s_1;X_i(s;t,x,v),v')| \ge |v''-v'| - |v' - V_j(s_1;X_i(s;t,x,v),v')| \ge |v''-v'| - CT_0.$$

Therefore, one of the following is valid correspondingly for some $\sigma > 0$:

$$|k_w^{i,j}(V_i(s),\zeta,\xi')| \le C_{T_0}e^{-\frac{\sigma}{8}N^2}\big|k_w^{i,j}(V_i(s),\zeta,\xi')\,e^{\frac{\sigma}{8}\{|V_i(s)-v'|^2+|\zeta-\zeta'|^2\}}\big|,$$

$$|k_w^{j,k}(V_j(s_1),\zeta',\xi'')| \le C_{T_0}e^{-\frac{\sigma}{8}N^2}\big|k_w^{j,k}(V_j(s_1),\zeta',\xi'')\,e^{\frac{\sigma}{8}\{|V_j(s_1)-v''|^2+|\zeta'-\zeta''|^2\}}\big|.$$

From Lemma 4.2,

$$\int\big|k_w^{i,j}(V_i(s),\zeta,\xi')\,e^{\frac{\sigma}{8}\{|V_i(s)-v'|^2+|\zeta-\zeta'|^2\}}\big|\,d\xi' + \int\big|k_w^{j,k}(V_j(s_1),\zeta',\xi'')\,e^{\frac{\sigma}{8}\{|V_j(s_1)-v''|^2+|\zeta'-\zeta''|^2\}}\big|\,d\xi'' < +\infty. \qquad (4.9)$$

We use this bound to combine the cases of $|\xi-\xi'| \ge N$ or $|\xi'-\xi''| \ge N$ as:

$$\int_0^t\int_0^{s_1}\Big\{\int_{|\xi|\le N,\,|\xi'|\ge 2N} + \int_{|\xi'|\le 2N,\,|\xi''|\ge 3N}\Big\}.$$

We first integrate $\xi'$ for the first integral and apply (4.8) to integrate $k_w^{j,k}$ over $\xi''$. We then integrate $\xi''$ for the second integral and apply (4.8) to integrate $k_w^{i,j}$ over $\xi'$. We thus find an upper bound

$$\begin{aligned}
& C\int_0^t\int_0^{s_1}\Big\{\sup_{\xi}\int_{|\xi|\le N,\,|\xi'|\ge 2N}|k_w^{ij}(V_i(s),\zeta,\xi')|\,d\xi' + \sup_{\xi'}\int_{|\xi'|\le 2N,\,|\xi''|\ge 3N}|k_w^{jk}(V_j(s_1),\zeta',\xi'')|\,d\xi''\Big\}\,e^{-\nu_0(t-s_1)}\|h(s_1)\|_{L^\infty}\,ds_1\,ds \qquad (4.10)\\
&\le C_\eta e^{-\frac{\eta}{8}N^2}\int_0^t\int_0^{s_1}e^{-\nu_0(t-s_1)}\|h(s_1)\|_{L^\infty}\,ds_1\,ds \\
&\le C_\eta e^{-\frac{\eta}{8}N^2}e^{-\frac{\nu_0}{2}t}\sup_{0\le s\le t}\{e^{\frac{\nu_0}{2}s}\|h(s)\|_{L^\infty}\}.
\end{aligned}$$

Case 3. $|\xi| \le N$, $|\xi'| \le 2N$, $|\xi''| \le 3N$. This is the last remaining case because if $|\xi'| > 2N$, it is included in Case 2; while if $|\xi''| > 3N$, either $|\xi'| \le 2N$ or $|\xi'| \ge 2N$ are also included in Case 2. We now can bound the second term in (4.6) by

$$C\int_0^t\int_B\int_0^s e^{-\nu_0(t-s_1)}\big|k_w^{i,j}(V_i(s),\zeta,\xi')\,k_w^{j,k}(V_j(s_1),\zeta',\xi'')\,h_k(s_1,X_j(s_1),\xi'')\big|,$$

where $B = \{|\xi'| \le 2N,\ |\xi''| \le 3N\}$. We notice that $k_w^{i,j}(\xi,\xi')$ has a possible integrable singularity of the type $\frac{1}{|\xi-\xi'|}$. We can choose $k_N^{i,j}(\xi,\xi')$ smooth with compact support such that

$$\sup_{|p|\le 3N}\int_{|\xi'|\le 3N}|k_N^{i,j}(p,\xi') - k_w^{i,j}(p,\xi')|\,d\xi' \le \frac{1}{N}. \qquad (4.11)$$

Split $k_w^{i,j}(V_i(s),\zeta,\xi')\,k_w^{j,k}(V_j(s_1),\zeta',\xi'')$ into

$$\begin{aligned}
&\{k_w^{i,j}(V_i(s),\zeta,\xi') - k_N^{i,j}(V_i(s),\zeta,\xi')\}\,k_w^{j,k}(V_j(s_1),\zeta',\xi'') \\
&+ \{k_w^{j,k}(V_j(s_1),\zeta',\xi'') - k_N^{j,k}(V_j(s_1),\zeta',\xi'')\}\,k_N^{i,j}(V_i(s),\zeta,\xi') \\
&+ k_N^{i,j}(V_i(s),\zeta,\xi')\,k_N^{j,k}(V_j(s_1),\zeta',\xi'').
\end{aligned}$$

We then integrate the first term above in $\xi'$ and the second term above in $\xi''$. By (4.8), we can use such an approximation (4.11) to bound the $s_1$, $s$ integration by

$$\begin{aligned}
&\frac{Ce^{-\frac{\nu_0}{2}t}}{N}\sup_{0\le s\le t}\{e^{\frac{\nu_0}{2}s}\|h(s)\|_{L^\infty}\} \times \Big\{\sup_{|\xi'|\le 2N}\int|k_w^{j,k}(V_j(s_1),\zeta',\xi'')|\,d\xi'' + \sup_{|\xi|\le 2N}\int|k_w^{i,j}(V_i(s),\zeta,\xi')|\,d\xi'\Big\} \qquad (4.12)\\
&+ C\int_0^t\int_B\int_0^{s}e^{-\nu_0(t-s_1)}\big|k_N^{i,j}(V_i(s),\zeta,\xi')\,k_N^{j,k}(V_j(s_1),\zeta',\xi'')\,h_k(s_1,X_j(s_1),\xi'')\big|.
\end{aligned}$$

The first term above is further bounded by

$$\frac{Ce^{-\frac{\nu_0}{2}t}}{N}\sup_{0\le s\le t}\{e^{\frac{\nu_0}{2}s}\|h(s)\|_{L^\infty}\}.$$

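Fix $\varepsilon > 0$. We now use Lemma 3.3 for the last main contribution in (4.13), for which we separate two cases, $|X_j(s;t,x,v)| \ge L_\varepsilon$ and $|X_j(s;t,x,v)| \le L_\varepsilon$, where $L_\varepsilon$ is given in Lemma 3.3. In the case $|X_j(s;t,x,v)| \ge L_\varepsilon$, we bound it by

$$C_N\int_0^t\int_B\int_0^s e^{-\nu_0(t-s_1)}|h_k(s,X_j(s_1),\xi'')|\,\mathbf{1}_{|X_j(s_1)|\ge L_\varepsilon}\,ds\,d\xi'\,d\xi''\,ds_1 \le C_N\Big\{\int_0^{t-\varepsilon} + \int_{t-\varepsilon}^t\Big\}.$$

The second integral is bounded by $C\varepsilon e^{-\frac{\nu_0}{2}t}\sup_{0\le s\le t}\{e^{\frac{\nu_0}{2}s}\|h(s)\|_{L^\infty}\}$. In the first integral, since $s \le t-\varepsilon$, by Lemma 3.3 we can make a change of variable

$$y = X_j(s_1) = X_j(s_1;X_i(s;t,x,v),v') \qquad (4.13)$$

because $|\frac{dy}{dv'}| \ge \frac{\varepsilon}{2}$. We observe, since $\|\partial_x\phi\|_{L^\infty} \le C$, that from (3.1),

$$|v' - V_j(\tau)| \le \int_\tau^s\|\partial_x\phi\|_{L^\infty}d\tau \le T_0\|\partial_x\phi\|_{L^\infty},$$

$$|y - X_i(s)| \le \int_{s_1}^s|V_j(\tau)|\,d\tau \le T_0(|v'| + T_0\|\partial_x\phi\|_{L^\infty}) \le C_{T_0,N}$$

for $|v'| \le 2N$. By first integrating over $\zeta'$ and using the change of variable (4.13),

$$\begin{aligned}
&\int_0^{t-\varepsilon}\int_B\int_0^s e^{-\nu_0(t-s_1)}|h_k(s_1,X_j(s_1),\xi'')|\,\mathbf{1}_{|X_j(s_1)|\ge L_\varepsilon}\,ds\,d\xi'\,d\xi''\,ds_1 \\
&\le \frac{C}{\varepsilon}\int_0^{t-\varepsilon}\int_{|y-X_i(s)|\le C_{T_0,N}}\int_{|\xi''|\le 3N}\int_0^s e^{-\nu_0(t-s_1)}|h_k(s_1,y,\xi'')|\,ds\,d\xi''\,dy\,ds_1 \\
&\le \frac{C_N}{\varepsilon}\sup_{0\le s_1\le T_0}\int_{|y-X_i(s)|\le C_{T_0,N}}\int_{|\xi''|\le 3N}|h_k(s_1,y,\xi'')|\,d\xi''\,dy \\
&= \int_{|f_k(t)-M_k|\ge\kappa M_j} + \int_{|f_k(t)-M_k|\le\kappa M_j} \\
&\le C_{T_0,N,\varepsilon}\big\{H(g(0)) + \sqrt{H(g(0))}\big\}.
\end{aligned}$$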

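We have used the fact that $h_k = \frac{w(f_k - M_k)}{\sqrt{M_k}}$ (which is bounded by $f_k - M_k$ for $|\xi''| \le 3N$), and applied Lemma 2.2.

For $|X_i(s;t,x,v)| \le L_\varepsilon$, for any $\eta > 0$, we again employ Lemma 3.3 to find $O_{x_l}$ such that

$$\begin{aligned}
&\int_0^{T_0}\int_B\int_0^{T_0}e^{-\nu_0(t-s_1)}|h_k(s_1,X_j(s_1),\xi'')|\,\mathbf{1}_{|X_j(s)|\le L_\varepsilon}\,ds\,d\xi'\,d\xi''\,ds_1 \\
&= \int_0^{T_0}\int_B\int_0^{T_0}\mathbf{1}_{O_{x_l}^c}\,e^{-\nu_0(t-s_1)}|h_k(s_1,X_j(s_1),\xi'')|\,\mathbf{1}_{|X_j(s)|\le L_\varepsilon}\,ds\,d\xi'\,d\xi''\,ds_1 \\
&\quad + \int_0^{T_0}\int_B\int_0^{T_0}\mathbf{1}_{O_{x_l}}\,e^{-\nu_0(t-s_1)}|h_k(s_1,X_j(s_1),\xi'')|\,\mathbf{1}_{|X_j(s)|\le L_\varepsilon}\,ds\,d\xi'\,d\xi''\,ds_1.
\end{aligned}$$

Since $|[0,T_0]\times[-N,N]\cap O_{x_l}^c| < \eta$, the first part is bounded by

$$C_{T_0,N,\varepsilon}\,\eta\,e^{-\frac{\nu_0}{2}t}\sup_{0\le s\le t}\{e^{\frac{\nu_0}{2}s}\|h(s)\|_{L^\infty}\}.$$

The second part is bounded by

$$C_{T_0,N,\varepsilon}\int_0^{T_0}\int_0^{T_0}\int_B\mathbf{1}_{O_{x_l}}|h_k(s_1,X_j(s_1),\xi'')|\,ds\,ds_1\,d\xi'\,d\xi''.$$

Since $|\frac{\partial X_j(s_1;X_i(s;t,x,v),v')}{\partial v'}| > m_\eta/2$ on $O_{x_l}$ from Lemma 3.3, we can make a (local) change of variable $y = X_j(s_1) = X_j(s_1;X_i(s;t,x,v),v')$ to get

$$\begin{aligned}
&C_{T_0,N,\varepsilon}\int_0^{T_0}\int_0^{T_0}\int_B\mathbf{1}_{O_{x_l}}|h_k(s_1,X_j(s_1),\xi'')|\,ds\,ds_1\,d\xi'\,d\xi'' \\
&= C_{T_0,N,\varepsilon}\int_{|\xi''|\le 3N}\int_0^{T_0}\int_0^{T_0}\int_{|y-X_i(s_1)|\le C_{T_0,N}}|h_k(s_1,y,\xi'')|\,ds\,ds_1\,dy\,d\xi'' \\
&= \int_{|f_k(t)-M_k|\le\kappa M_k} + \int_{|f_k(t)-M_k|\ge\kappa M_k} \\
&\le C_{T_0,N,\varepsilon,\eta}\big\{H(g(0)) + \sqrt{H(g(0))}\big\}.
\end{aligned}$$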

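Collecting terms, we conclude

$$\begin{aligned}
\sup_{0\le s\le T_0}e^{\frac{\nu_0}{2}s}\|h(s)\|_{L^\infty} \le{}& C(1+T_0)\|h(0)\|_{L^\infty} + \Big\{\frac{C_{T_0}}{N} + C_{N,T_0}\varepsilon + C_{N,T_0,\varepsilon}\eta\Big\}\sup_{0\le s\le T_0}\{e^{\frac{\nu_0}{2}s}\|h(s)\|_{L^\infty}\} \\
&+ C\Big\{\sup_{0\le s\le T_0}e^{\frac{\nu_0}{2}s}\|h(s)\|_{L^\infty}\Big\}^2 + C_{T_0,N,\varepsilon,\eta}H(g(0)).
\end{aligned}$$

Assume $\sup_{0\le s\le T_0}\|h(s)\|_{L^\infty}$ is sufficiently small. We first choose $T_0$ sufficiently large so that

$$2C(1+T_0)e^{-\frac{\nu_0}{2}T_0} \le \frac{1}{2},$$

then $N$ sufficiently large, then $\varepsilon$ sufficiently small, and finally $\eta$ small to conclude our lemma.

Proof of Theorem 1.2. Assume $\sup_{0\le t\le\infty}\|h(t)\|_{L^\infty}$ is small. We first establish (1.24). Choose any $n = 0, 1, 2, 3, \dots$ and apply Lemma 4.1 repeatedly to get

$$\begin{aligned}
\|h(nT_0)\|_{L^\infty} &\le \frac{1}{2}\|h(\{n-1\}T_0)\|_{L^\infty} + C_{T_0}H(g(0)) \\
&\le \frac{1}{4}\|h(\{n-2\}T_0)\|_{L^\infty} + \frac{1}{2}C_{T_0}H(g(0)) + C_{T_0}H(g(0)) \\
&\le \dots \\
&\le \frac{1}{2^n}\|h_0\|_{L^\infty} + C_{T_0}H(g(0))\Big\{1 + \frac{1}{2} + \frac{1}{4} + \dots\Big\} \\
&\le \frac{1}{2^n}\|h_0\|_{L^\infty} + 2C_{T_0}H(g(0)).
\end{aligned}$$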
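The chain of inequalities above is the standard iteration of a one-step contraction with an additive error. As a small sketch under stated assumptions, the code below iterates the worst case $a_n = \frac{1}{2}a_{n-1} + c$ and compares it with the closed-form bound $a_0/2^n + 2c$; here `h0` stands for $\|h_0\|_{L^\infty}$ and `c` for the constant term $C_{T_0}H(g(0))$, both treated as plain hypothetical numbers.

```python
def iterate_bound(h0, c, n):
    # Apply the one-step estimate a -> a / 2 + c exactly n times.
    a = h0
    for _ in range(n):
        a = 0.5 * a + c
    return a

def closed_form_bound(h0, c, n):
    # Geometric series: c * (1 + 1/2 + 1/4 + ...) <= 2 * c.
    return h0 / 2 ** n + 2.0 * c

if __name__ == "__main__":
    for n in (0, 1, 5, 20):
        print(n, iterate_bound(1.0, 0.01, n), closed_form_bound(1.0, 0.01, n))
```

The initial datum is forgotten at a geometric rate while the error term stays bounded by twice the one-step constant, which is why the bound is uniform in $n$.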

For any $t$, we can find $n$ such that $nT_0 \le t \le \{n+1\}T_0$, and from the $L^\infty$ estimate on $[0,T_0]$, we conclude (1.24) by

$$\|h(t)\|_{L^\infty} \le C_{T_0}\|h(nT_0)\| \le C\{\|h_0\|_{L^\infty} + H(g(0))\}.$$

To prove (1.25), we take $x$ and $v$ derivatives to get

$$\begin{aligned}
&\{\partial_t + v\partial_x + F(M_{i+1}+\sqrt{M_{i+1}}\,g_{i+1})\partial_v + \nu(\xi)\}\partial_x g_i - K^{i,j}\partial_x g_j \\
&\quad= -\partial_x F(M_{i+1}+\sqrt{M_{i+1}}\,g_{i+1})\partial_v g_i + \beta\partial_x F(\sqrt{M_{i+1}}\,g_{i+1})v\,\partial_x\sqrt{M_i} + \partial_x\{F(\sqrt{M_j}\,g_j)vg_i\} + \partial_x\{\Gamma(g_i,g_i)+\Gamma(g_i,g_j)\}; \qquad (4.14)\\
&\{\partial_t + v\partial_x + F(M_j+\sqrt{M_j}\,g_j)\partial_v + \nu(\xi)\}\partial_v g_i - \partial_v\{K^{i,j}g_j\} + \{\partial_v\nu(\xi)\}g_i \\
&\quad= -\partial_x g_i + \beta\partial_x F(\sqrt{M_j}\,g_j)v\sqrt{M_i} + \beta\partial_x F(\sqrt{M_j}\,g_j)v\,\partial_x\sqrt{M_i} + F(\sqrt{M_j}\,g_j)\partial_v\{vg_i\} + \partial_v\{\Gamma(g_i,g_i)+\Gamma(g_i,g_j)\}, \qquad (4.15)
\end{aligned}$$

where $K^{i,j}$ has a similar property as $K_i$ in [6] (see Lemma 2.2 in [6], p. 1109). In particular, $\|\partial_v\{K^{i,j}g_j\}\partial_v g_i\|_{L^1} \le \frac{1}{2}\|\partial_v g\|_\nu^2 + C\|g\|_{L^2}^2$, so that a positive dissipation for $\partial_v g_i$ occurs for small $\|h\|_{L^\infty}$ in (4.15). Notice that $L = \nu - K \ge 0$. We take the inner product with $\partial_x g_i$ and $\partial_v g_i$ respectively, following the procedures in [6], to get:

$$\frac{1}{2}\frac{d}{dt}\|\partial_x g\|_{L^2}^2 \le C\{\|\partial_x F(M_{i+1})\|_{L^\infty} + \|h\|_{L^\infty}\}\|\nabla_{x,v}g\|_{L^2}^2 + C\|g\|_{L^2}^2,$$

$$\frac{1}{2}\frac{d}{dt}\|\partial_v g\|_{L^2}^2 + \frac{1}{4}\|\partial_v g\|_\nu^2 \le C\|\partial_x g\|_{L^2}^2 + C\|g\|_{L^2}^2. \qquad (4.16)$$

Hence (1.25) follows from the Gronwall Lemma since $\sup_{0\le t\le\infty}\|h(t)\|_{L^\infty}$ is bounded by (1.24). With such an estimate, we obtain the uniqueness by taking the $L^2$ estimate for the difference for (1.17), because the most difficult term $F(M_{i+1}+\sqrt{M_{i+1}}\,g_{i+1})\partial_v g_i$ can be handled.

5. Linear Instability: Growing Mode

In this section we study the linearization of Eq. (1.1) around the homogeneous equilibrium $M_{hom} = (\mu_\beta,\mu_\beta)$. In the sequel we omit the index $\beta$ for the sake of brevity: $\mu = \mu_\beta = (\frac{\beta}{2\pi})^{3/2}e^{-\beta\frac{|\xi|^2}{2}}$. When $M$ is replaced by $M_{hom} = (\mu,\mu)$ in (1.17), we get the following linearized Vlasov-Boltzmann system:

$$\partial_t g + \mathcal{L}g = 0, \qquad (5.1)$$

where $g = (g_1,g_2)$,

$$(\mathcal{L}g)_i = v\partial_x g_i - \beta F(\sqrt{\mu}\,g_{i+1})v\sqrt{\mu} - L_i g$$

and

$$L_i g = \frac{1}{\sqrt{\mu}}\big\{Q(\sqrt{\mu}\,g_i, 2\mu) + Q(\mu,\sqrt{\mu}(g_1+g_2))\big\}.$$

We seek an exponential growing mode for such a system when $\beta > 1$. To this end, we consider a family of systems

$$\partial_t g + \mathcal{L}_\alpha g = 0, \qquad (\mathcal{L}_\alpha g)_i = v\partial_x g_i - \beta F(\sqrt{\mu}\,g_{i+1})v\sqrt{\mu} - \alpha L_i g, \qquad (5.2)$$

and show that there is a growing mode for all $\alpha > 0$. We seek a growing mode periodic in $x$, so we assume periodic dependence on space and exponential dependence on time: $g_1(t,x,v,\zeta) = e^{\lambda t}e^{ikx}q(v,\zeta)$, $g_2(t,x,v,\zeta) = e^{\lambda t}e^{-ikx}q(-v,\zeta)$, so that the system (5.2), using the definition of $F$ (see (1.2)), reduces to the single equation

$$\{\lambda + ivk\}q - \beta ki\hat U(k)\Big\{\int q\sqrt{\mu}\,d\xi\Big\}v\sqrt{\mu} = \alpha Lq, \qquad (5.3)$$

with $Lg = \frac{2}{\sqrt{\mu}}\{Q(\sqrt{\mu}\,g,\mu) + Q(\mu,\sqrt{\mu}\,g)\}$. Equivalently, $q(\xi)$ is an eigenfunction for the operator $T^\alpha$,

$$(T^\alpha q)(\xi) = ivkq(\xi) - \beta ki\hat U(k)\Big\{\int q(\xi)\sqrt{\mu}\,d\xi\Big\}v\sqrt{\mu} - \alpha Lq(\xi) \qquad (5.4)$$

with eigenvalue $-\lambda$.

Lemma 5.1. Let $\beta > 1$. There exists sufficiently small $\alpha > 0$ such that there is an eigenfunction $q(\xi)$ to $T^\alpha$ with $\mathrm{Re}\,\lambda > 0$.

Proof. We first study the eigenvalue problem for the unperturbed operator $T^0$ for $\alpha = 0$:

$$(\lambda + ivk)q - \beta ki\hat U(k)\Big\{\int_{\mathbb{R}^3}q\sqrt{\mu}\,d\xi\Big\}v\sqrt{\mu} = 0. \qquad (5.5)$$

This is similar to the Penrose dispersion relation for the unperturbed Vlasov-Poisson system ([14]) for a collisionless plasma. From (5.5), we obtain:

$$q = \frac{\beta k\hat U(k)\,i\,\{\int_{\mathbb{R}^3}q\sqrt{\mu}\,d\xi\}\,v\sqrt{\mu}}{\lambda + ivk}. \qquad (5.6)$$

Normalizing $\int_{\mathbb{R}^3}q\sqrt{\mu}\,d\xi = 1$, we deduce that

$$\beta i\int_{\mathbb{R}^3}\frac{\hat U(k)kv\mu(\xi)}{\lambda + ivk}\,d\xi = 1.$$

Multiply and divide (5.6) by $(\lambda - ivk)\sqrt{\mu}$, take the imaginary part and then integrate on $\xi$. By consistency we must have:

$$\beta\int_{\mathbb{R}^3}\frac{v^2\hat U(k)k^2\mu(\xi)}{\lambda^2 + k^2v^2}\,d\xi = 1.$$

Define $F(\lambda,k) \equiv \beta\int_{\mathbb{R}^3}\frac{v^2\hat U(k)k^2\mu(\xi)}{\lambda^2 + k^2v^2}\,d\xi$. Clearly $F(0,k) = \beta\hat U(k)$. Since $\hat U(0) = 1$ and $\beta > 1$, there is $k_0$ sufficiently small so that

$$F(0,k_0) = \beta\hat U(k_0) > 1. \qquad (5.7)$$

Moreover, $\lim_{\lambda\to\infty}F(\lambda,k_0) = 0$ for any $k_0 \neq 0$. Hence there exists a real number $\lambda > 0$ such that $F(\lambda,k_0) = 1$ and $q = \frac{\beta k_0\hat U(k_0)\,iv\sqrt{\mu}}{\lambda + ivk_0}$ is the eigenfunction.

We now fix $k = k_0$ and return to (5.3). It can be proved (see for example [9]) that $\|Lq\|_{L^2} \le C\|\nu q\|_{L^2}$. Moreover, for hard spheres, there are constants $C_1$ and $C_2 > 0$ such that

$$\|T^0 q\|_{L^2} = \Big\|ik_0vq - \beta k_0i\hat U(k_0)\Big\{\int_{\mathbb{R}^3}q\sqrt{\mu}\,d\xi\Big\}v\sqrt{\mu}\Big\|_{L^2} \ge C_1\|\nu q\|_{L^2} - C_2\|q\|_{L^2}. \qquad (5.8)$$

This implies $\|Lq\|_{L^2} \le C\{\|T^0q\|_{L^2} + \|q\|_{L^2}\}$, so the perturbation $L$ is $T^0$-bounded. Since $T^\alpha = T^0 - \alpha L$, we deduce from Kato's book (p. 206, [12]) that, for $\alpha$ small, there is an eigenvalue $-\lambda$ and an eigenfunction $q(v,\zeta)$ for (5.3) with positive real part of $\lambda$.
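The existence of a root $\lambda > 0$ of $F(\lambda,k_0) = 1$ can be explored numerically. The sketch below is illustrative only and rests on assumptions not made in the text: it takes a hypothetical Gaussian interaction with $\hat U(k) = e^{-k^2/2}$ (so that $\hat U(0) = 1$), reduces the $\xi$-integral to the one-dimensional Maxwellian in $v$ (the integrand depends on $\xi$ only through $v$), evaluates it with a trapezoid rule, and finds the root by bisection using that $F$ is decreasing in $\lambda$. All function names are invented for this example.

```python
import math

def u_hat(k):
    # Hypothetical choice: Fourier transform of a unit-mass Gaussian potential.
    return math.exp(-0.5 * k * k)

def dispersion_F(lam, k, beta, n=4000):
    # F(lam, k) = beta * U_hat(k) * int k^2 v^2 mu(v) / (lam^2 + k^2 v^2) dv,
    # with mu the 1D Maxwellian of inverse temperature beta (trapezoid rule).
    vmax = 12.0 / math.sqrt(beta)
    h = 2.0 * vmax / n
    total = 0.0
    for j in range(n + 1):
        v = -vmax + j * h
        mu = math.sqrt(beta / (2.0 * math.pi)) * math.exp(-0.5 * beta * v * v)
        kv2 = (k * v) ** 2
        # At lam = 0 the ratio is identically 1, so F(0, k) = beta * U_hat(k).
        frac = 1.0 if lam == 0.0 else kv2 / (lam * lam + kv2)
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * frac * mu * h
    return beta * u_hat(k) * total

def growth_rate(k, beta, tol=1e-10):
    # Return lam > 0 with F(lam, k) = 1, or None when F(0, k) <= 1.
    if dispersion_F(0.0, k, beta) <= 1.0:
        return None
    lo, hi = 0.0, 1.0
    while dispersion_F(hi, k, beta) > 1.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dispersion_F(mid, k, beta) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(growth_rate(0.3, 2.0))   # beta > 1: a positive growth rate
    print(growth_rate(0.3, 0.5))   # beta < 1: None, no growing mode found
```

With this particular $\hat U$, a positive root exists exactly when $\beta\hat U(k) > 1$, mirroring condition (5.7); for $\beta < 1$ the function never reaches 1.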

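Theorem 5.1. Let $\beta > 1$. Then there is a $\frac{2\pi}{k_0}$-periodic eigenvector $(\tilde g_1,\tilde g_2)$ for $-\mathcal{L}$ such that $\mathrm{Re}\,\lambda > 0$ and $\tilde g_2(x,v,\zeta) = \tilde g_1(-x,-v,\zeta)$.

Proof. We fix $k_0$ as in Lemma 5.1. We define, for the family of Eqs. (5.2):

$$\alpha_0 = \sup_\alpha\Big\{\alpha : \text{there is an eigenvalue with positive real part for } T^\alpha \text{ with a } \tfrac{2\pi}{k_0}\text{-periodic eigenvector}\Big\}.$$

By Lemma 5.1, for $\alpha$ sufficiently small this set is not empty. We want to show that $\alpha_0 = +\infty$. We prove it by contradiction. Suppose $\alpha_0 < +\infty$. We claim that, if there is such a finite $\alpha_0 > 0$, then there is an eigenvalue $\lambda_0$ with an eigenfunction $q_0$ with $\|q_0\|_\nu = \int\nu|q_0|^2d\xi = 1$, such that $\mathrm{Re}\,\lambda_0 = 0$ and

$$(\lambda_0 + ivk_0)q_0 - \beta k_0i\hat U(k_0)\Big\{\int_{\mathbb{R}^3}q_0\sqrt{\mu}\,d\xi\Big\}v\sqrt{\mu} = \alpha_0Lq_0. \qquad (5.9)$$

Proof of the claim: In fact, by (5.8), choose a family of eigenfunctions $q_\alpha \in L^2$ such that $\|q_\alpha\|_\nu = 1$, $\mathrm{Re}\,\lambda_\alpha > 0$, as $\alpha \to \alpha_0$ and

$$(\lambda_\alpha + ivk_0)q_\alpha - \beta k_0i\hat U(k_0)\Big\{\int_{\mathbb{R}^3}q_\alpha\sqrt{\mu}\,d\xi\Big\}v\sqrt{\mu} = \alpha Lq_\alpha. \qquad (5.10)$$

Let $\bar q_\alpha$ denote the complex conjugate of $q_\alpha$. Notice that both $(Lq_\alpha,\bar q_\alpha)$ and $(ivk_0q_\alpha,\bar q_\alpha)$ are bounded by $C\|q_\alpha\|_\nu^2$. As $\alpha \to \alpha_0$, taking the $L^2$ inner product with $\bar q_\alpha$ for (5.10), we deduce that $|\lambda_\alpha|$ is bounded for $\alpha \to \alpha_0$. Hence $\lim_{\alpha\to\alpha_0}\lambda_\alpha = \lambda_0$ (up to subsequences) with $\mathrm{Re}\,\lambda_0 \ge 0$. We now prove that $\lambda_0$ is an eigenvalue, so that $\mathrm{Re}\,\lambda_0 = 0$ by the definition of $\alpha_0$ and the claim is proven. Clearly, we may assume that $\lim q_\alpha = q_0$ weakly in $L^2$ and (5.9) is valid as $\alpha \to \alpha_0$. We only need to show that $\lim q_\alpha = q_0$ strongly so that $\|q_0\|_\nu = 1$, and $q_0$ is an eigenfunction. Denote $Ph = \{1, v, |v|^2\}\sqrt{\mu}$. Clearly $Pq_\alpha \to Pq_0$ strongly in $L^2$. It thus is left to show that $(I-P)q_\alpha \to (I-P)q_0$ strongly in $L^2$. We subtract (5.10) from (5.9) to get

$$(\lambda_\alpha - \lambda_0)q_\alpha + (\alpha-\alpha_0)Lq_\alpha + (\lambda_0 + ik_0v)(q_\alpha - q_0) - \beta k_0i\hat U(k_0)\Big\{\int_{\mathbb{R}^3}\{q_\alpha - q_0\}\sqrt{\mu}\,d\xi\Big\}v\sqrt{\mu} + \alpha_0L(q_\alpha - q_0) = 0.$$

We take the $L^2$ inner product with $\bar q_\alpha - \bar q_0$ and then take the real part. Since $\mathrm{Re}\,\lambda_0\|q_\alpha - q_0\|^2 \ge 0$, and $\int ik_0v|q_\alpha - q_0|^2d\xi$ is purely imaginary, we obtain:

$$\alpha_0\langle L(g^\alpha - g),(\bar g^\alpha - \bar g)\rangle \le (|\lambda_\alpha - \lambda_0| + |\alpha - \alpha_0|)\|q_\alpha\|_\nu\cdot\|q_\alpha - q_0\|_\nu + C\Big|\int_{\mathbb{R}^3}\{q_\alpha - q_0\}\sqrt{\mu}\,d\xi\Big|\cdot\|(g^\alpha - g)\|.$$

Therefore, $(I-P)\{g^\alpha - g\} \to 0$ in $L^2_\nu$ and $\|g\|_\nu = 1$, and our claim follows.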

Hence, $\lambda_0$ is purely imaginary. Actually we show that it is 0. To do this we take the inner product with $\bar q_0$ in (5.9) to get

$$\lambda_0\|q_0\|^2 - \beta k_0i\hat U(k_0)\int_{\mathbb{R}^3}q_0\sqrt{\mu}\,d\xi\int_{\mathbb{R}^3}\bar q_0v\sqrt{\mu}\,d\xi + \alpha_0\langle Lg,\bar g\rangle = 0.$$

But by integrating $\sqrt{\mu}\times$(5.9) over $\xi$, we obtain the continuity equation

$$\lambda_0\int_{\mathbb{R}^3}q_0\sqrt{\mu}\,d\xi + k_0i\int_{\mathbb{R}^3}q_0v\sqrt{\mu}\,d\xi = 0.$$

Therefore, $k_0i\int_{\mathbb{R}^3}\bar q_0v\sqrt{\mu}\,d\xi = \bar\lambda_0\int_{\mathbb{R}^3}\bar q_0\sqrt{\mu}\,d\xi$ and

$$\lambda_0\|g\|^2 - \beta\bar\lambda_0\hat U(k_0)\Big|\int_{\mathbb{R}^3}q_0\sqrt{\mu}\,d\xi\Big|^2 + \alpha_0\langle Lg,\bar g\rangle = 0. \qquad (5.11)$$

Since $\hat U(k_0)$ is real and $\lambda_0$ is purely imaginary, taking the real part of (5.11) we conclude that $\alpha_0\langle Lq_0,\bar q_0\rangle = 0$. Therefore, $q_0$ is a linear combination of the collision invariants $\sqrt{\mu}$, $\xi\sqrt{\mu}$ and $|\xi|^2\sqrt{\mu}$, and $\alpha_0Lq_0$ vanishes in (5.9). Now (5.9) reduces to a pure Vlasov equation, and we deduce that

$$q_0(\xi) = \frac{\beta ik_0\hat U(k_0)\{\int q_0(\xi)\sqrt{\mu}\,d\xi\}\,v\sqrt{\mu}}{\lambda_0 + ivk_0}. \qquad (5.12)$$

Since $\int q_0(\xi)\sqrt{\mu}\,d\xi \neq 0$, this is compatible with the condition that $q_0$ is a combination of collision invariants if and only if $\lambda_0 = 0$. Thus $q_0(\xi) = \beta\hat U(k_0)\sqrt{\mu}$, and hence $\beta\hat U(k_0) = 1$, which is a contradiction to (5.7). Theorem 5.1 thus follows.

6. Nonlinear Instability

In order to establish the non-linear instability, we need several lemmas on the properties of the fastest linear growing mode. First of all we need to establish the smoothness and long time behavior of the growing mode. Recall the definition of the operator $\mathcal{L}$ in (5.1). We define

$$\mathcal{M} = \{g = [g_1,g_2] \in L^2 \mid g_1(x,v,\zeta) = g_2(-x,-v,\zeta)\}$$

and $\|\cdot\|_{L^2(\mathcal{M})}$ will denote the $L^2$ norm on this set. We have the following lemmas:

Lemma 6.1. Let $\beta > 1$. Then for $k_0$ sufficiently small, for all $\delta > 0$, the spectrum of $-\mathcal{L}$ in $\{\mathrm{Re}\,\lambda > \delta\}$ consists of a finite number of eigenvalues of finite multiplicity. If $\lambda_1$ denotes an eigenvalue with maximal real part, and $\Lambda > \max\{0,\mathrm{Re}\,\lambda_1\}$, then there exists $C_\Lambda > 0$ such that, for any $g_0 \in \mathcal{M}$,

$$\|e^{-t\mathcal{L}}g_0\|_{L^2(\mathcal{M})} \le C_\Lambda e^{\Lambda t}\|g_0\|_{L^2(\mathcal{M})}.$$

Proof. This follows easily from Vidav's Lemma [16]. Notice that we can split

$$\mathcal{L}g = \{v\partial_x g + Lg\} + \{\beta F(\sqrt{\mu}\,g)v\sqrt{\mu}\} \equiv Ag + Kg,$$

where $K$ is a compact operator from $L^2$ to $L^2$, while $\|e^{-tA}\|_{L^2(\mathcal{M})\to L^2(\mathcal{M})} \le 1$.

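Lemma 6.2. Let $R = (R_1,R_2) \in L^2(\mathcal{M})$ with $\|R\|_{L^2(\mathcal{M})} = 1$ be an eigenvector of $-\mathcal{L}$ with $\mathrm{Re}\,\lambda > 0$. Then there exists a constant $C$ depending only on $\lambda$ such that

$$\|\nabla_{x,v}R\|_{L^2(\mathcal{M})} \le C, \qquad (6.1)$$

$$\sup_{x,v}w(v)|R(x,v)| \le C, \qquad (6.2)$$

where $w$ is a polynomial weight as in the previous section.

Proof. We begin with $R \in L^2$. We first claim

$$R = -\int_0^\infty e^{-\lambda t}e^{-tA}KR\,dt. \qquad (6.3)$$

Notice that the corresponding growing mode $g(t) = e^{\lambda t}R$ satisfies $\partial_t g + Ag = -Kg$ so that $e^{\lambda t}R = e^{-(t-s)A}e^{\lambda s}R - \int_s^t e^{-(t-\tau)A}KR\,e^{\lambda\tau}d\tau$. Letting $s \to -\infty$, since $\|e^{-tA}\|_{L^2\to L^2} \le 1$ for any $t > 0$ and $\mathrm{Re}\,\lambda > 0$, we get

$$e^{\lambda t}R = -\int_{-\infty}^t e^{-(t-\tau)A}KR\,e^{\lambda\tau}d\tau = -\int_0^\infty e^{-\tau A}KR\,e^{\lambda(t-\tau)}d\tau.$$

Dividing by $e^{\lambda t}$ we prove our claim because $\mathrm{Re}\,\lambda > 0$ and the integral converges in $L^2$. From the property of the linear Boltzmann equation, clearly

$$\|\partial_x\{e^{-tA}g\}\|_{L^2(\mathcal{M})} \le \|\partial_x g\|_{L^2(\mathcal{M})}.$$

Taking the $v$ derivative of $(\partial_t + v\partial_x + L)g = 0$ yields: $\{\partial_t + v\partial_x\}\{\partial_v g\} + \partial_v\{-Lg\} = -\partial_x g$. From [6], $\partial_v\{-Lg\}\partial_v g \ge \frac{\nu}{2}|\partial_v g|^2 - C_\nu\|g\|_{L^2}^2$. We thus obtain by taking the $L^2$ inner product with $\partial_v g$,

$$\|\partial_v g\|_{L^2} = \|\partial_v\{e^{-tA}g_0\}\|_{L^2} \le C(t+1)\{\|g_0\|_{L^2} + \|\nabla_{x,v}g_0\|_{L^2}\}. \qquad (6.4)$$

Since $KR \in C^\infty$ and $\|\partial_x\{KR\}\|_{L^2} + \|\partial_v\{KR\}\|_{L^2} \le C\|R\|_{L^2}$, we can take $\partial_x$ and $\partial_v$ derivatives in (6.3) to get

$$\begin{aligned}
\|\partial_x R\|_{L^2} + \|\partial_v R\|_{L^2} &\le \int_0^\infty e^{-\mathrm{Re}\,\lambda t}\{\|\partial_x\{e^{-tA}KR\}\|_{L^2} + \|\partial_v\{e^{-tA}KR\}\|_{L^2}\}\,dt \\
&\le C\int_0^\infty e^{-\mathrm{Re}\,\lambda t}(t+1)\{\|\partial_x\{KR\}\|_{L^2} + \|\partial_v\{KR\}\|_{L^2}\}\,dt \\
&\le C\|R\|_{L^2}\int_0^\infty e^{-\mathrm{Re}\,\lambda t}(t+1)\,dt \le C\|R\|_{L^2}.
\end{aligned}$$

We therefore deduce (6.1). To show (6.2), we denote $S = wR$. We then have

$$\lambda S = \Big\{v\partial_x S + wL\Big(\frac{S}{w}\Big)\Big\} + \Big\{\beta F\Big(\frac{\sqrt{\mu}\,S}{w}\Big)vw\sqrt{\mu}\Big\} \equiv A_wS + K_wS.$$

Applying the same proof as in Sect. 3 for the stability of the pure linear Boltzmann operator, we can establish:

$$\|e^{-tA_w}g_0\|_{L^\infty} \le C\{\|g_0\|_{L^2} + \|g_0\|_{L^\infty}\}.$$

We can similarly obtain $S = -\int_0^\infty e^{-\lambda t}e^{-tA_w}K_wS\,dt$, so that from $\mathrm{Re}\,\lambda > 0$,

$$\|S\|_{L^\infty} \le \int_0^\infty e^{-\mathrm{Re}\,\lambda t}\|e^{-tA_w}K_wS\|_{L^\infty}dt \le C\int_0^\infty e^{-\mathrm{Re}\,\lambda t}\{\|K_wS\|_{L^\infty} + \|K_wS\|_{L^2}\}\,dt \le C.$$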

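Lemma 6.3. Let $R$ be an eigenvector of $-\mathcal{L}$ with eigenvalue $\lambda$, $\mathrm{Re}\,\lambda > 0$. If $\lambda$ is not real, then there is a constant $\zeta > 0$ such that for all $t > 0$,

$$\|e^{-\mathcal{L}t}\,\mathrm{Im}\,R\|_{L^2} \ge \zeta e^{\mathrm{Re}\,\lambda t}\|\mathrm{Im}\,R\|_{L^2} > 0.$$

Proof. We prove by contradiction. Notice that, since $\mathrm{Im}\,F(\sqrt{\mu}R) = F(\sqrt{\mu}\,\mathrm{Im}\,R)$, one can immediately check that

$$e^{-\mathcal{L}t}\,\mathrm{Im}\,R = \mathrm{Im}\{e^{-\mathcal{L}t}R\} = e^{\mathrm{Re}\,\lambda t}\big(\sin[\mathrm{Im}\,\lambda t]\,\mathrm{Re}\,R + \cos[\mathrm{Im}\,\lambda t]\,\mathrm{Im}\,R\big).$$

If the lemma were false, by passing through a convergent subsequence of $\sin[\mathrm{Im}\,\lambda t_n]$ and $\cos[\mathrm{Im}\,\lambda t_n]$, with $n \to \infty$ we would have $a\,\mathrm{Im}\,R + b\,\mathrm{Re}\,R = 0$, with $a^2 + b^2 = 1$. Therefore either $\mathrm{Im}\,R$ or $\mathrm{Re}\,R$ would be a real eigenvector and $\lambda$ would be real, a contradiction.

Lemma 6.4. Let $R$ be as in the preceding lemma. There exists $\delta_0 > 0$, such that for $0 < \delta < \delta_0$, there exists a (compactly supported) approximate eigenfunction $R_\delta$ such that

$$\delta|R_\delta(x,v)|\sqrt{\mu} \le \mu,$$

$$\|R - R_\delta\|_{L^2} \le \sqrt{\delta},$$

$$\|\partial_x R_\delta\|_{L^2} + \|\partial_v R_\delta\|_{L^2} \le C\{\|\partial_x R\|_{L^2} + \|\partial_v R\|_{L^2}\}.$$

Proof. In fact, we choose $\chi(v)$ to be a smooth cutoff function with $\chi(v) = 1$ for $|v| \le N$ and $\chi(v) \equiv 0$ for $|v| \ge N+1$. By Lemma 6.2, we have

$$|\chi(v)R(x,v)|\sqrt{\mu} \le |R(x,v)|\sqrt{\mu} = |wS|\sqrt{\mu} \le \{Cw\mu^{-1/2}\}\mu. \qquad (6.5)$$

Define $N$ by the equation $\delta = \frac{\mu^{1/2}(N+1)}{Cw(N+1)}$ and define

$$R_\delta = \chi(v)R(x,v).$$

Clearly the third estimate in the lemma is valid. From (6.5) and the definition of $N$ and $\delta$, the first inequality in the lemma is also valid. Since $w$ is a polynomial, we have $\sqrt{\delta} = \sqrt{\frac{\mu^{1/2}(N+1)}{Cw(N+1)}} \ge \sqrt{\mu}(N)$ when $N$ is large. We then conclude the lemma by

$$\|R - R_\delta\|_{L^2} = \|R\,\mathbf{1}_{|v|\ge N}\|_{L^2} \le C\int_{|v|\ge N}\frac{\mu^{\frac{1}{2}}(v)}{w(v)}\,dv = C\mu^{\frac{1}{2}}(N) \le \sqrt{\delta}.$$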

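We now establish the crucial bootstrap lemma, which shows that $L^2$ growth leads to the same growth rate for $L^\infty$.

Lemma 6.5. Let $g = (g_1,g_2)$ be a solution to the nonlinear problem around $M_{hom}$:

$$(\partial_t + v\partial_x)g_i + \beta F(\sqrt{\mu}\,g_{i+1})\sqrt{\mu}\,v - L_ig = -F(\sqrt{\mu}\,g_{i+1})\partial_v g_i + F(\sqrt{\mu}\,g_{i+1})g_iv + \Gamma(g_i,g_i) + \Gamma(g_i,g_{i+1}). \qquad (6.6)$$

Assume that $\mathrm{Re}\,\lambda > 0$ and

$$\|g(t)\|_{L^2} \le Ce^{\mathrm{Re}\,\lambda t}\|g(0)\|_{L^2} \quad\text{for } t \in [0,T].$$

There exists $\varepsilon_0 > 0$ such that if $\sup_{0\le t\le T}\{\|wg(t)\|_{L^\infty} + \|g(t)\|_{L^2}\} \le \varepsilon_0$, then there is a constant $C$ such that

$$\|\partial_x g(t)\|_{L^2} + \|\partial_v g(t)\|_{L^2} + \|wg(t)\|_{L^\infty} \le Ce^{\mathrm{Re}\,\lambda t}\{\|\partial_x g(0)\|_{L^2} + \|\partial_v g(0)\|_{L^2} + \|h(0)\|_{L^\infty}\}. \qquad (6.7)$$

Proof. We take $x$ and $v$ derivatives of (6.6). Since $F(M_i) = 0$, from (4.16), we have ($h = wg$)

$$\frac{d}{dt}\|\partial_x g\|_{L^2}^2 \le C\|h\|_{L^\infty}\{\|\partial_x g\|_{L^2}^2 + \|\partial_v g\|_{L^2}^2\} + C\|g\|_{L^2}^2, \qquad \frac{d}{dt}\|\partial_v g\|_{L^2}^2 \le C\|\partial_x g\|_{L^2}^2 + C\|g\|_{L^2}^2. \qquad (6.8)$$

Applying Gronwall's inequality to (6.8), by $\|h\|_{L^\infty} \le \varepsilon_0 < \mathrm{Re}\,\lambda$, we obtain

$$\|\partial_x g(t)\|_{L^2}^2 \le \|\partial_x g(0)\|_{L^2}^2 + C\varepsilon_0\int_0^t e^{C\varepsilon_0(t-s)}\|\partial_v g(s)\|_{L^2}^2\,ds + Ce^{\mathrm{Re}\,\lambda t}\|g(0)\|_{L^2}^2, \qquad (6.9)$$

$$\frac{d}{dt}\|\partial_v g(t)\|_{L^2}^2 \le C\varepsilon_0\int_0^t e^{C\varepsilon_0(t-s)}\|\partial_v g(s)\|_{L^2}^2\,ds + Ce^{\mathrm{Re}\,\lambda t}\{\|g(0)\|_{L^2}^2 + \|\partial_x g(0)\|_{L^2}^2\}. \qquad (6.10)$$

We further integrate Eq. (6.10) over $t$ to get

$$\|\partial_v g(t)\|_{L^2}^2 \le \|\partial_v g(0)\|_{L^2}^2 + C\varepsilon_0\int_0^t\int_0^\tau e^{C\varepsilon_0(\tau-s)}\|\partial_v g(s)\|_{L^2}^2\,ds\,d\tau + Ce^{\mathrm{Re}\,\lambda t}\{\|\partial_x g(0)\|_{L^2}^2 + \|g(0)\|_{L^2}^2\}. \qquad (6.11)$$

We therefore have, by writing $-C\varepsilon_0 s = -\mathrm{Re}\,\lambda s + \{\mathrm{Re}\,\lambda - C\varepsilon_0\}s$,

$$\begin{aligned}
e^{-\mathrm{Re}\,\lambda t}\|\partial_v g(t)\|_{L^2}^2 &\le e^{-\mathrm{Re}\,\lambda t}\|\partial_v g(0)\|_{L^2}^2 + C\varepsilon_0e^{-\mathrm{Re}\,\lambda t}\int_0^t\int_0^\tau e^{C\varepsilon_0(\tau-s)}\|\partial_v g(s)\|_{L^2}^2\,ds\,d\tau + C\{\|\partial_x g(0)\|_{L^2}^2 + \|g(0)\|_{L^2}^2\} \\
&\le C\varepsilon_0e^{-\mathrm{Re}\,\lambda t}\int_0^t\int_0^\tau e^{C\varepsilon_0\tau}e^{(\mathrm{Re}\,\lambda - C\varepsilon_0)s}\{e^{-\mathrm{Re}\,\lambda s}\|\partial_v g(s)\|_{L^2}^2\}\,ds\,d\tau + C\{\|\partial_{v,x}g(0)\|_{L^2}^2 + \|g(0)\|_{L^2}^2\}. \qquad (6.12)
\end{aligned}$$

Since $\int_0^\tau e^{(\mathrm{Re}\,\lambda - C\varepsilon_0)s}ds \le C_\lambda e^{(\mathrm{Re}\,\lambda - C\varepsilon_0)\tau}$ for $\mathrm{Re}\,\lambda - C\varepsilon_0 > \frac{1}{2}\mathrm{Re}\,\lambda > 0$, we further bound the double integration as

$$\begin{aligned}
&C\varepsilon_0\sup_{0\le s\le t}\{e^{-\mathrm{Re}\,\lambda s}\|\partial_v g(s)\|_{L^2}^2\}\times e^{-\mathrm{Re}\,\lambda t}\int_0^t e^{C\varepsilon_0\tau}\int_0^\tau e^{(\mathrm{Re}\,\lambda - C\varepsilon_0)s}\,ds\,d\tau \\
&\le C\varepsilon_0\sup_{0\le s\le t}\{e^{-\mathrm{Re}\,\lambda s}\|\partial_v g(s)\|_{L^2}^2\}\times C_\lambda e^{-\mathrm{Re}\,\lambda t}\int_0^t e^{C\varepsilon_0\tau}e^{(\mathrm{Re}\,\lambda - C\varepsilon_0)\tau}\,d\tau \\
&\le C_\lambda\varepsilon_0\sup_{0\le s\le t}\{e^{-\mathrm{Re}\,\lambda s}\|\partial_v g(s)\|_{L^2}^2\}.
\end{aligned}$$

We therefore deduce from (6.12), for $C_\lambda\varepsilon_0 < \frac{1}{2}$, that

$$\sup_{0\le s\le t}\{e^{-\mathrm{Re}\,\lambda s}\|\partial_v g(s)\|_{L^2}^2\} \le C\{\|\partial_{v,x}g(0)\|_{L^2}^2 + \|g(0)\|_{L^2}^2\},$$

and conclude our lemma.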

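Proof of Theorem 1.4. We choose $R$ to be the eigenfunction whose eigenvalue has the largest positive real part. If $\lambda$ is not real, then $\|\mathrm{Im}\,R\|_{L^2} = r > 0$. We choose the approximate eigenfunction $\mathrm{Im}\,R^\delta$ to the imaginary part of $R$ by Lemma 6.4. In case $\lambda$ is real, we simply do not take the imaginary parts. We choose a family of solutions $f^\delta(0,x,\xi) = \mu + \delta\,\mathrm{Im}\,R^\delta\sqrt{\mu} \ge 0$, or $g^\delta(0,x,\xi) = \delta\,\mathrm{Im}\,R^\delta$. Note that the positivity follows from the first statement in Lemma 6.4, for $\delta$ sufficiently small. Clearly, from Lemma 6.4,

$$\|g^\delta(0) - \delta\,\mathrm{Im}\,R\|_{L^2} = \delta\|R - \mathrm{Im}\,R^\delta\|_{L^2} \le \delta^{1+\frac{1}{2}} \le \frac{\delta r}{2},$$

for $\delta$ sufficiently small. Hence, from Lemma 6.4,

$$\|g^\delta(0)\|_{H^1} + \|h^\delta(0)\|_{L^\infty} = \delta\|\mathrm{Im}\,R^\delta\|_{H^1} + \delta\|\mathrm{Im}\,R^\delta\|_{L^\infty} \le C\delta r.$$

Now from the nonlinear Vlasov-Boltzmann system (6.6), we have

$$g^\delta(t) = \delta e^{-\mathcal{L}t}\,\mathrm{Im}\,R^\delta + \int_0^t e^{-\mathcal{L}(t-\tau)}\begin{pmatrix} -F(\sqrt{\mu}\,g_2)\partial_v g_1 + F(\sqrt{\mu}\,g_2)vg_1 + \Gamma(g_1,g_1) + \Gamma(g_1,g_2) \\ -F(\sqrt{\mu}\,g_1)\partial_v g_2 + F(\sqrt{\mu}\,g_1)vg_2 + \Gamma(g_2,g_2) + \Gamma(g_2,g_1)\end{pmatrix}d\tau. \qquad (6.13)$$

We choose $\Lambda$ such that

$$\mathrm{Re}\,\lambda < \Lambda < \Big(1 + \frac{1}{2}\Big)\mathrm{Re}\,\lambda. \qquad (6.14)$$

Let

$$T^{\delta\delta} = \frac{1}{\Lambda - \mathrm{Re}\,\lambda}\Big|\ln\frac{\zeta r}{2C_\Lambda\sqrt{\delta}}\Big|.$$

By (6.14) and $T^\delta = \frac{1}{\mathrm{Re}\,\lambda}\ln\frac{\theta}{\delta}$, since $2(\Lambda - \mathrm{Re}\,\lambda) < \mathrm{Re}\,\lambda$, for small $\delta$, we have $T^\delta \le T^{\delta\delta}$. Clearly, $e^{(\Lambda - \mathrm{Re}\,\lambda)t} \le e^{(\Lambda - \mathrm{Re}\,\lambda)T^{\delta\delta}} = \frac{\zeta r}{2C_\Lambda\sqrt{\delta}}$, and for $0 \le t \le T^{\delta\delta}$:

$$C_\Lambda\sqrt{\delta}\,e^{\Lambda t} \le \frac{\zeta}{2}e^{\mathrm{Re}\,\lambda t}r. \qquad (6.15)$$

Let

$$T^* = \sup_s\{s : \|\nabla_x g^\delta(t)\|_{L^2} + \|\nabla_v g^\delta(t)\|_{L^2} + \|h^\delta(t)\|_{L^\infty} \le \varepsilon_0\}, \qquad (6.16)$$

$$T^{**} = \sup_s\Big\{s : \|g^\delta(t) - \delta e^{-\mathcal{L}t}R^\delta\|_{L^2} \le \frac{\zeta}{4}\delta e^{\mathrm{Re}\,\lambda t}r, \text{ for all } 0 \le t \le s\Big\}. \qquad (6.17)$$

For $0 \le t \le \min\{T^\delta,T^{**}\}$, we have from (6.15),

$$\begin{aligned}
\|g^\delta(t)\|_{L^2} &\le \delta\|e^{-\mathcal{L}t}\,\mathrm{Im}\,R^\delta\|_{L^2} + \frac{\zeta}{4}\delta e^{\mathrm{Re}\,\lambda t}r \\
&= \delta\|e^{-\mathcal{L}t}\,\mathrm{Im}\,R\|_{L^2} + \delta\|e^{-\mathcal{L}t}\{R - R^\delta\}\|_{L^2} + \frac{\zeta}{4}\delta e^{\mathrm{Re}\,\lambda t}r \\
&\le \delta e^{\mathrm{Re}\,\lambda t} + C_\Lambda\delta^{\{1+\frac{1}{2}\}}e^{\Lambda t} + \frac{\zeta}{4}\delta e^{\mathrm{Re}\,\lambda t}r \\
&\le \Big(1 + \frac{3\zeta}{4}\Big)e^{\mathrm{Re}\,\lambda t}\|g^\delta(0)\|_{L^2}. \qquad (6.18)
\end{aligned}$$

We now claim that $T^\delta \le \min\{T^*,T^{**}\}$. In fact, if $T^* < \min\{T^\delta,T^{**}\}$, then by (1.26), (6.16), (6.18), and Lemma 6.5 (the bootstrap lemma), we obtain for $0 \le t \le T^*$:

$$\|\nabla_{x,v}g^\delta(t)\|_{L^2} + \|h^\delta(t)\|_{L^\infty} \le Ce^{\mathrm{Re}\,\lambda t}\{\|\nabla_{x,v}g^\delta(0)\|_{L^2} + \|h^\delta(0)\|_{L^\infty}\}.$$

In particular, by (1.26),

$$\|\nabla_{x,v}g^\delta(T^*)\|_{L^2} + \|h^\delta(T^*)\|_{L^\infty} \le Ce^{\mathrm{Re}\,\lambda T^*}\{\|\nabla_{x,v}g^\delta(0)\|_{L^2} + \|h^\delta(0)\|_{L^\infty}\} < Ce^{\mathrm{Re}\,\lambda T^\delta}\delta = C\theta.$$

This is a contradiction to the definition of $T^*$ if $\theta$ is chosen $\ll \varepsilon_0$. On the other hand, if $T^{**} < \min\{T^\delta,T^*\}$, then by (6.14), (6.18), and Lemma 6.5,

$$\begin{aligned}
\|g^\delta(T^{**}) - \delta e^{-\mathcal{L}T^{**}}\,\mathrm{Im}\,R^\delta\|_{L^2} &\le C\int_0^{T^{**}}\{\|\nabla_{x,v}g^\delta(t)\|_{L^2} + \|h^\delta(t)\|_{L^\infty}\}^2\,dt \\
&\le Ce^{2\mathrm{Re}\,\lambda T^{**}}\{\|\nabla_{x,v}g^\delta(0)\|_{L^2} + \|h^\delta(0)\|_{L^\infty}\}^2 \\
&< C\{e^{\mathrm{Re}\,\lambda T^\delta}\delta\}\,C\{e^{\mathrm{Re}\,\lambda T^{**}}\delta\} \le C\theta e^{\mathrm{Re}\,\lambda T^{**}}\delta.
\end{aligned}$$

This is a contradiction to $T^{**}$ in (6.17) when $\theta$ is small.

Now that $T^\delta \le \min\{T^*,T^{**}\}$, we can evaluate $t = T^\delta$ in (6.14) to get

$$\|g^\delta(T^\delta) - \delta e^{-\mathcal{L}T^\delta}\,\mathrm{Im}\,R^\delta\|_{L^2} \le C\theta^2.$$

But

$$\begin{aligned}
\|\delta e^{-\mathcal{L}T^\delta}\,\mathrm{Im}\,R^\delta\|_{L^2} &\ge \|\delta e^{-\mathcal{L}T^\delta}\,\mathrm{Im}\,R\|_{L^2} - \|\delta e^{-\mathcal{L}T^\delta}\,\mathrm{Im}\{R - R_\delta\}\|_{L^2} \\
&\ge \zeta e^{\mathrm{Re}\,\lambda T^\delta}\delta\|\mathrm{Im}\,R\|_{L^2} - C_\Lambda\sqrt{\delta}\,e^{\Lambda T^\delta}\delta \\
&\ge \zeta e^{\mathrm{Re}\,\lambda T^\delta}\delta r - \frac{\zeta}{2}e^{\mathrm{Re}\,\lambda T^\delta}\delta r \ge \frac{\zeta}{4}r\theta.
\end{aligned}$$

Therefore, for $\theta$ sufficiently small,

$$\|g^\delta(T^\delta)\|_{L^2} \ge \frac{\zeta}{4}r\theta - C\theta^2 \ge \frac{\zeta}{8}r\theta.$$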
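The role of the escape time $T^\delta = \frac{1}{\mathrm{Re}\,\lambda}\ln\frac{\theta}{\delta}$ is that the linear amplitude $\delta e^{\mathrm{Re}\,\lambda t}$ reaches the fixed level $\theta$ at $t = T^\delta$, however small the initial size $\delta$ is. A minimal numerical sketch, with purely illustrative values of $\theta$ and $\mathrm{Re}\,\lambda$ that do not come from the paper:

```python
import math

def escape_time(delta, theta, re_lambda):
    # T^delta = ln(theta / delta) / Re(lambda)
    return math.log(theta / delta) / re_lambda

def linear_amplitude(delta, t, re_lambda):
    # Size of the linear growing mode at time t: delta * exp(Re(lambda) * t)
    return delta * math.exp(re_lambda * t)

if __name__ == "__main__":
    theta, re_lambda = 0.1, 0.7  # hypothetical values
    for delta in (1e-2, 1e-4, 1e-8):
        T = escape_time(delta, theta, re_lambda)
        print(delta, T, linear_amplitude(delta, T, re_lambda))
```

Smaller perturbations only delay the escape logarithmically; the amplitude attained at $T^\delta$ is always $\theta$, which is why the lower bound obtained above does not depend on $\delta$.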

Acknowledgements. R. E. and R. M. thank Brown University for the kind and warm hospitality. The work of R. E. and R. M. has been supported by MURST and INDAM-GNFM. Y. G. thanks both Università di L’Aquila and Università di Roma Tor Vergata for their hospitality when this work was initiated. Y. G. is supported in part by the NSF grant 0905255.

References

1. Bastea, S., Esposito, R., Lebowitz, J.L., Marra, R.: Binary fluids with long range segregating interaction I: derivation of kinetic and hydrodynamic equation. J. Stat. Phys. 101, 1087 (2000)
2. Bastea, S., Lebowitz, J.L.: Spinodal decomposition in binary gases. Phys. Rev. Lett. 78, 3499 (1997)
3. Carlen, E.A., Carvalho, M., Esposito, R., Lebowitz, J.L., Marra, R.: Free energy minimizers for a two-species model with segregation and liquid-vapor transition. Nonlinearity 16, 1075–1105 (2003)
4. Carlen, E.A., Carvalho, M.C., Esposito, R., Lebowitz, J.L., Marra, R.: Displacement convexity and minimal fronts at phase boundaries. Arch. Rat. Mech. Anal. 194(3), 823–847 (2009)
5. Esposito, R., Guo, Y., Marra, R.: Stability of the front under a Vlasov-Fokker-Planck dynamics. Arch. Rat. Mech. Anal. 195(1), 75–116 (2010)
6. Guo, Y.: The Vlasov-Poisson-Boltzmann system near Maxwellians. Comm. Pure Appl. Math. LV, 1104–1135 (2002)
7. Guo, Y.: Decay and continuity of Boltzmann equation in bounded domains. Arch. Rat. Mech. Anal. (2010). doi:10.1007/s00205-009-0285-y
8. Guo, Y.: Bounded solutions to the Boltzmann equation. Quart. Applied. Math. to appear, 2009, doi:s0033-569x(09)01180-4, electronically published Oct. 28, 2008
9. Guo, Y., Strauss, W.A.: Unstable oscillatory-tail waves in collisionless plasmas. SIAM J. Math. Anal. 30 no. 5, 1076–1114 (1999) (electronic)
10. Guo, Y., Strauss, W.A.: Instability of periodic BGK equilibria. Comm. Pure Appl. Math. 48(8), 861–894 (1995)
11. Guo, Y., Strauss, W.A.: Nonlinear instability of double-humped equilibria. Ann. Inst. H. Poincaré Anal. Non Linéaire 12(3), 339–352 (1995)
12. Kato, T.: Perturbation Theory for Linear Operators. Berlin: Springer, 1980, 1995
13. Lebowitz, J.L., Mazel, A.E., Presutti, E.: Liquid-vapor phase transitions for systems with finite range interactions. J. Stat. Phys. 80, 4701–4704 (1998)
14. Penrose, O.: Electrostatic instability of a non-Maxwellian plasma. Phys. Fluids. 3, 258–265 (1960)
15. Presutti, E.: Scaling Limits in Statistical Mechanics and Microstructures in Continuum Mechanics. Berlin-Heidelberg-New York: Springer Verlag, 2008
16. Vidav, I.: Spectra of perturbed semigroups with applications to transport theory. J. Math. Anal. Appl. 30, 264–279 (1970)

Communicated by H. Spohn

Commun. Math. Phys. 296, 35–68 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-0986-y


Large Deviations in Quantum Spin Chains

Yoshiko Ogata

Graduate School of Mathematics, Kyushu University, 1-10-6 Hakozaki, Fukuoka 812-8581, Japan. E-mail: [email protected]

Received: 15 April 2009 / Accepted: 13 November 2009
Published online: 3 February 2010 – © Springer-Verlag 2010

Abstract: We show the full large deviation principle for KMS-states and $C^*$-finitely correlated states on a quantum spin chain. We cover general local observables. Our main tool is Ruelle's transfer operator method.

1. Introduction

While the large deviation for classical lattice spin systems constitutes a rather complete theory, our knowledge on large deviations in quantum spin systems is still restricted. Large deviation results for observables that depend only on one site were established in high temperature KMS-states, in [NR], using cluster expansion techniques. In [LR], large deviation upper bounds were proven for general observables, for KMS-states in the high temperature regime and in dimension one. Furthermore, it was shown that a state in one dimension which satisfies a certain factorization property satisfies a large deviation upper bound [HMO]. This factorization property is satisfied by KMS-states as well as $C^*$-finitely correlated states. It was also shown in [HMO] that the distributions of the ergodic averages of a one-site observable with respect to an ergodic $C^*$-finitely correlated state satisfy the full large deviation principle. (See [BLP,GLM], and [LLS] for large deviation results in other quantum mechanical models. Other types of large deviation results can be found in [HMOP,LLS,PRV,P], and [RW].)

In spite of the above progress, the theory in quantum spin systems is not complete: we do not know if the large deviation lower bound holds for general observables, nor if the large deviation upper bound holds in the intermediate temperature KMS-states, for more than one dimensional spin systems. In this paper, we solve a part of the problem: we prove the full large deviation principle in dimension one.

The infinite spin chain with one site algebra $M_d(\mathbb{C})$ is given by the UHF $C^*$-algebra

$$\mathcal{A}_{\mathbb{Z}} := \overline{\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})}^{\,C^*},$$

which is the $C^*$-inductive limit of the local algebras

$$\Big\{ \mathcal{A}_\Lambda := \bigotimes_{\Lambda} M_d(\mathbb{C}) \ \Big|\ \Lambda \subset \mathbb{Z},\ |\Lambda| < \infty \Big\}.$$

For any subset $S$ of $\mathbb{Z}$, we identify $\mathcal{A}_S := \overline{\bigotimes_S M_d(\mathbb{C})}^{\,C^*}$ with a subalgebra of $\mathcal{A}_{\mathbb{Z}}$ under the natural inclusion. The algebra of local observables is defined by $\mathcal{A}_{loc} := \bigcup_{|\Lambda|<\infty} \mathcal{A}_\Lambda$. Let $\gamma_j$, $j \in \mathbb{Z}$, be the $j$-lattice translation. A state $\omega$ is called translation-invariant if $\omega \circ \gamma_j = \omega$ for all $j \in \mathbb{Z}$.

An interaction $\Phi$ is a map from the finite subsets of $\mathbb{Z}$ into $\mathcal{A}_{\mathbb{Z}}$ such that $\Phi(X) \in \mathcal{A}_X$ and $\Phi(X) = \Phi(X)^*$ for any finite $X \subset \mathbb{Z}$. In this paper, we will always assume that $\Phi$ is a finite range translation-invariant interaction, i.e., there exists $r \in \mathbb{N}$ such that $\Phi(X) = 0$ if $\mathrm{diam}(X) > r$, and $\Phi$ is invariant under $\gamma$:

$$\Phi(X + j) = \gamma_j(\Phi(X)), \quad \forall j \in \mathbb{Z},\ \forall X \subset \mathbb{Z}.$$

A norm of an interaction is defined by $\|\Phi\| \equiv \sum_{X \ni 0} |X|^{-1} \|\Phi(X)\|$. For finite $\Lambda \subset \mathbb{Z}$, we set

$$H_\Phi(\Lambda) := \sum_{I \subset \Lambda} \Phi(I).$$

The distribution of $\frac{1}{n} H_\Phi([1,n])$ with respect to a state $\omega$ is the probability measure

$$\mu_n(B) := \omega\Big(1_B\Big(\frac{1}{n} H_\Phi([1,n])\Big)\Big), \quad B \in \mathcal{B}(\mathbb{R}),$$

where $\mathcal{B}(\mathbb{R})$ denotes the Borel sets of $\mathbb{R}$ and $1_B(\frac{1}{n} H_\Phi([1,n])) \in \mathcal{A}_{[1,n]}$ is the spectral projection of $\frac{1}{n} H_\Phi([1,n])$ corresponding to the set $B$. Let $I : \mathbb{R} \to [0,\infty]$ be a lower semicontinuous mapping. We say that we have a large deviation upper bound for a closed set $C$ if

$$\limsup_{n\to\infty} \frac{1}{n} \log \omega\Big(1_C\Big(\frac{1}{n} H_\Phi([1,n])\Big)\Big) \le -\inf_{x \in C} I(x).$$

Similarly, we have a large deviation lower bound for an open set $O$ if

$$\liminf_{n\to\infty} \frac{1}{n} \log \omega\Big(1_O\Big(\frac{1}{n} H_\Phi([1,n])\Big)\Big) \ge -\inf_{x \in O} I(x).$$

We say that $\{\mu_n\}$ satisfies the (full) large deviation principle if we have an upper and lower bound for all closed and open sets, respectively. Furthermore, $I$ is said to be a good rate function if all the level sets $\{x : I(x) \le \alpha\}$, $\alpha \in [0,\infty)$, are compact subsets of $\mathbb{R}$ (see [DZ]). In this paper, we show the full large deviation principle for any kind of local observable, in KMS-states and $C^*$-finitely correlated states on the quantum spin chain.
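The definition of the upper bound can be checked numerically in the simplest commuting situation. The sketch below is not taken from the paper: it assumes a product state and a one-site observable with spectrum {0, 1}, so that the distribution of the ergodic average is binomial, and it compares the exact tail probability of a closed half-line with the classical relative-entropy rate function. The values `p` and `a` are hypothetical.

```python
import math

def log_binom_pmf(n, k, p):
    """Log-probability that a Binomial(n, p) variable equals k."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def log_tail(n, a, p):
    """log mu_n([a, 1]), where mu_n is the law of S_n / n with S_n ~ Binomial(n, p)."""
    k_min = math.ceil(a * n - 1e-9)
    logs = [log_binom_pmf(n, k, p) for k in range(k_min, n + 1)]
    top = max(logs)
    # log-sum-exp keeps the tiny tail probabilities from underflowing
    return top + math.log(sum(math.exp(v - top) for v in logs))

def rate(x, p):
    """Relative-entropy rate function of the Bernoulli(p) empirical mean, 0 < x < 1."""
    return x * math.log(x / p) + (1 - x) * math.log((1 - x) / (1 - p))

p, a = 0.3, 0.6  # hypothetical one-site probability and threshold with a > p
for n in (10, 100, 1000, 10000):
    # -(1/n) log mu_n([a, 1]) should approach inf_{x >= a} I(x) = I(a)
    print(n, -log_tail(n, a, p) / n, rate(a, p))
```

For the closed set C = [a, 1] with a > p the infimum of the rate function is attained at x = a, so the two printed columns should approach each other as n grows.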


KMS-states. Let $\Phi$ be a translation-invariant finite range interaction, and define the finite volume Hamiltonian associated with a finite subset $\Lambda \subset \mathbb{Z}$ by

$$H_\Phi(\Lambda) := \sum_{I \subset \Lambda} \Phi(I).$$

It is known that there exists a strongly continuous one parameter group of $*$-automorphisms $\tau_\Phi$ on $\mathcal{A}_{\mathbb{Z}}$, such that

$$\lim_{\Lambda \to \mathbb{Z}} \big\| \tau_\Phi^{t}(A) - e^{itH_\Phi(\Lambda)} A e^{-itH_\Phi(\Lambda)} \big\| = 0, \quad \forall t \in \mathbb{R},\ \forall A \in \mathcal{A}_{\mathbb{Z}}.$$

The equilibrium state corresponding to the interaction $\Phi$ is characterized by the KMS condition. A state $\omega$ over $\mathcal{A}_{\mathbb{Z}}$ is called a $(\tau_\Phi, \beta)$-KMS state, if

$$\omega\big(A\,\tau_\Phi^{i\beta}(B)\big) = \omega(BA)$$

holds for any pair $(A, B)$ of entire analytic elements for $\tau_\Phi$. It is known that the one dimensional quantum spin system has a unique $(\tau_\Phi, \beta)$-KMS state for all $\beta \in \mathbb{R}$ [A1]. In this paper, we prove the large deviation principle for the $(\tau_\Phi, \beta)$-KMS state:

Theorem 1.1. Let $\Phi$ be a translation-invariant finite range interaction and $\omega$ a $(\tau_\Phi, \beta)$-KMS state. Furthermore, let $\Psi$ be another translation-invariant finite range interaction and $\mu_{n,\Psi}$ the distribution of $\frac{1}{n} H_\Psi([1,n])$ with respect to $\omega$. Then the sequence $\{\mu_{n,\Psi}\}_{n\in\mathbb{N}}$ satisfies the large deviation principle with a good rate function.

Finitely correlated states. The following recursive procedure to construct states on $\mathcal{A}_{\mathbb{Z}}$ was introduced in [FNW], where the states obtained were called $C^*$-finitely correlated states. For the construction one needs a triple $(\mathcal{B}, E, \rho)$, where $\mathcal{B}$ is a finite dimensional $C^*$-algebra, $E : M_d(\mathbb{C}) \otimes \mathcal{B} \to \mathcal{B}$ a unital completely positive map and $\rho$ a faithful state on $\mathcal{B}$ with density operator $\hat\rho$. Further, one has to assume that $E$ and $\rho$ are related so that $\mathrm{Tr}_{M_d(\mathbb{C})} E^*(\hat\rho) = \hat\rho$ holds. Then

$$\hat\varphi_1 := E^*(\hat\rho); \qquad \hat\varphi_n := \big(\mathrm{id}_{M_d(\mathbb{C})}^{\otimes(n-1)} \otimes E^*\big) \circ \cdots \circ \big(\mathrm{id}_{M_d(\mathbb{C})} \otimes E^*\big) \circ E^*(\hat\rho), \quad n = 2, 3, \dots$$

defines a state on $M_d(\mathbb{C})^{\otimes n} \otimes \mathcal{B}$ for each $n \in \mathbb{N}$, and $\hat\omega_n := \mathrm{Tr}_{\mathcal{B}}\, \hat\varphi_n$ gives a state $\omega_n$ on $M_d(\mathbb{C})^{\otimes n}$. There exists a unique translation-invariant state $\omega$ with local restrictions $\omega|_{\mathcal{A}_{[1,n]}} = \omega_n$. This is the $C^*$-finitely correlated state generated by $(\mathcal{B}, E, \rho)$. In this paper, we prove the large deviation principle for $C^*$-finitely correlated states:

Theorem 1.2. Let $\omega$ be a $C^*$-finitely correlated state and $\Psi$ a translation-invariant finite range interaction. Let $\mu_{n,\Psi}$ be the distribution of $\frac{1}{n} H_\Psi([1,n])$ with respect to $\omega$. Then the sequence $\{\mu_{n,\Psi}\}_{n\in\mathbb{N}}$ satisfies the large deviation principle with a good rate function.


In order to study the large deviations, we consider the corresponding logarithmic moment generating function, defined by

$$f(\alpha) = \lim_{n\to\infty} \frac{1}{n} \log \omega\big(e^{\alpha H_\Psi([1,n])}\big). \tag{1}$$

Theorem 1.3 (Gärtner-Ellis). Let $\{\mu_n\}_{n\in\mathbb{N}}$ be a sequence of probability measures on the Borel sets of $\mathbb{R}$. Assume that the limit

$$f(\alpha) := \lim_{n\to\infty} \frac{1}{n} \log \int e^{n\alpha x}\, d\mu_n(x)$$

exists and is differentiable for all $\alpha \in \mathbb{R}$. Let

$$I(x) := \sup_{\alpha \in \mathbb{R}} \{\alpha x - f(\alpha)\}.$$

Then $\{\mu_n\}$ satisfies the large deviation principle, i.e., we have

$$\limsup_{n\to\infty} \frac{1}{n} \log \mu_n(C) \le -\inf_{x\in C} I(x),$$

and

$$\liminf_{n\to\infty} \frac{1}{n} \log \mu_n(O) \ge -\inf_{x\in O} I(x),$$

for any closed set $C$ and any open set $O$, respectively. Furthermore, $I$ is a good rate function.

In this paper, we use this theorem to prove the large deviation principle, i.e., we prove the existence and differentiability of the logarithmic moment generating function $f(\alpha)$ of (1). Our main tool is the transfer operator technique introduced by D. Ruelle for classical spin systems [R]. H. Araki applied this method to quantum spin systems and showed the real analyticity of the mean free energy [A1]. This paper is basically an extension of this result. The non-commutative Ruelle transfer operator was further generalized in [GN] and [M]. We take advantage of these extensions.

The structure of this paper is as follows. In Sect. 2, we present a brief introduction to the non-commutative Ruelle transfer operator technique. In Sect. 3 and Sect. 4, we prove the large deviation principle, for KMS-states and $C^*$-finitely correlated states, respectively. As a corollary of the result, we show the equivalence of ensembles, in Sect. 5.

2. Non-commutative Ruelle Transfer Operator

In this section, we give a brief introduction to the non-commutative Ruelle transfer operators studied in [A1], [GN] and [M]. We follow the notation in [M] and consider a one-sided infinite system $\mathcal{A}_{[1,\infty)}$. We also introduce a finite dimensional $C^*$-algebra $\mathcal{B}$. By $Q^{(j)}$, $j \in \mathbb{N}$, we denote the element of $1_{\mathcal{B}} \otimes \mathcal{A}_{[1,\infty)}$ with $Q$ in the $j$th component of the tensor product of $\mathcal{A}_{[1,\infty)}$ and the unit in any other component. Similarly, by $Q^{(0)}$ we denote an element in $\mathcal{B} \otimes 1_{\mathcal{A}_{[1,\infty)}}$. We introduce a $C^*$-algebra

$$\mathcal{O} := \big(\mathcal{B} \otimes \mathcal{A}_{[1,\infty)}\big) \otimes \big(\mathcal{B} \otimes \mathcal{A}_{[1,\infty)}\big)$$


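and consider automorphisms $\{\Theta_j\}_{j\in\mathbb{N}}$ of $\mathcal{O}$ determined by

$$\Theta_j\big(Q^{(k)} \otimes 1\big) = \begin{cases} 1 \otimes Q^{(k)}, & \text{for } k \ge j, \\ Q^{(k)} \otimes 1, & \text{for } k < j, \end{cases} \qquad \Theta_j\big(1 \otimes Q^{(k)}\big) = \begin{cases} Q^{(k)} \otimes 1, & \text{for } k \ge j, \\ 1 \otimes Q^{(k)}, & \text{for } k < j. \end{cases}$$

For any element $Q$ in $\mathcal{B} \otimes \mathcal{A}_{[1,\infty)}$, we set

$$\mathrm{var}_j(Q) := \big\| \Theta_j(Q \otimes 1) - Q \otimes 1 \big\|, \quad j \in \mathbb{N}.$$

For any $\theta$ satisfying $0 < \theta < 1$ and $Q \in \mathcal{B} \otimes \mathcal{A}_{[1,\infty)}$, we set

$$\|Q\|_\theta := \sup\Big\{ \frac{\mathrm{var}_j Q}{\theta^j},\ j \in \mathbb{N} \Big\}.$$

By $F_\theta$ we denote the dense subalgebra of $\mathcal{B} \otimes \mathcal{A}_{[1,\infty)}$ consisting of elements $Q$ with finite $\|Q\|_\theta$, and introduce the norm $|Q|_\theta$ of $F_\theta$ via the following equation:

$$|Q|_\theta = \max\{\|Q\|, \|Q\|_\theta\}.$$

$F_\theta$ is complete in this norm. Some properties of $F_\theta$ are described in Appendix C. We need the $*$-isomorphism $\tau_{c+}$ (resp. $\tau_{c-}$) of $\mathcal{B} \otimes \mathcal{A}_{[2,\infty)} \simeq \mathcal{B} \otimes 1_{\mathcal{A}_{\{1\}}} \otimes \mathcal{A}_{[2,\infty)}$ onto $\mathcal{B} \otimes \mathcal{A}_{[1,\infty)}$ (resp. $\mathcal{A}_{(-\infty,-2]} \otimes 1_{\mathcal{A}_{\{-1\}}} \otimes \mathcal{B} \simeq \mathcal{A}_{(-\infty,-2]} \otimes \mathcal{B}$ onto $\mathcal{A}_{(-\infty,-1]} \otimes \mathcal{B}$) determined by

$$\tau_{c+}\big(x \otimes y^{(k+1)}\big) = x \otimes y^{(k)}, \quad \text{for all } k \in \mathbb{N},\ x \in \mathcal{B}, \text{ and } y \in M_d(\mathbb{C}),$$

(resp.

$$\tau_{c-}\big(y^{(-k-1)} \otimes x\big) = y^{(-k)} \otimes x, \quad \text{for all } k \in \mathbb{N},\ x \in \mathcal{B} \text{ and } y \in M_d(\mathbb{C}).)$$

For a selfadjoint element $Q$ of $\mathcal{A}_{\mathbb{Z}}$, we denote the infimum of the spectrum of $Q$ by $\inf Q$. A selfadjoint element $Q$ is strictly positive if $\inf Q > 0$. We now introduce a Ruelle transfer operator $L$:

Assumption 2.1. Let $a$ be an element in $\mathcal{B} \otimes \mathcal{A}_{[1,\infty)}$, and $E : \mathcal{B} \otimes M_d(\mathbb{C}) \to \mathcal{B}$ a completely positive unital map. Define a Ruelle transfer operator $L$ on $\mathcal{B} \otimes \mathcal{A}_{[1,\infty)}$ by

$$L(Q) := \tau_{c+} \circ \big(E \otimes \mathrm{id}_{[2,\infty)}\big)(a^* Q a), \quad Q \in \mathcal{B} \otimes \mathcal{A}_{[1,\infty)}. \tag{2}$$

Assume that

(i) The element $a$ is in $F_\theta$ and invertible in $F_\theta$.
(ii) There exists an invariant state $\varphi$ of $L$.
(iii) There exists a positive constant $K$ such that the following bound is valid: Let $Q$ be any strictly positive element in $\mathcal{B} \otimes (\mathcal{A}_{loc} \cap \mathcal{A}_{[1,\infty)})$. There exists a positive integer $N = N(Q)$ satisfying

$$\|L^n(Q)\| \le K \inf L^n(Q), \quad \forall n \ge N.$$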


If Assumption 2.1 is valid, the restriction of $L$ to the Banach space $F_\theta$ gives a bounded operator on $F_\theta$ (see Lemma C.3). Assumption 2.1 guarantees the following properties of $L$.

Theorem 2.1. Let $L$ be a Ruelle transfer operator satisfying Assumption 2.1. Then

(i) There exist an element $h$ in $F_\theta$ and a positive constant $m > 0$ such that

$$L(h) = h, \quad m \le h, \quad \varphi(h) = 1.$$

(ii) There exist strictly positive constants $C_1$ and $\delta_1$ such that

$$\big| L^n(Q) - \varphi(Q) h \big|_\theta \le C_1 e^{-\delta_1 n} |Q|_\theta, \quad \forall Q \in F_\theta,\ \forall n \in \mathbb{N}.$$

Proof. See Appendix B.
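A finite-dimensional commutative analogue may help in reading Theorem 2.1. The sketch below is an illustration only, not the operator of the paper: `L` is a hypothetical strictly positive row-stochastic 3x3 matrix acting on functions on three points, the invariant "state" `phi` is its stationary probability vector, and `h` is the constant function 1. The code records the sup-norm distance between L^n(Q) and phi(Q) h, which decays geometrically as in part (ii).

```python
def apply(L, q):
    """Action of the matrix L on an observable q (a function on 3 points)."""
    return [sum(L[i][j] * q[j] for j in range(len(q))) for i in range(len(L))]

def dual(phi, L):
    """Action of the dual map on a state phi (a probability vector)."""
    return [sum(phi[i] * L[i][j] for i in range(len(L))) for j in range(len(L[0]))]

# Hypothetical positive row-stochastic matrix, so that L(1) = 1.
L = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.1, 0.4, 0.5]]

# Invariant state: iterate the dual map starting from the uniform vector.
phi = [1.0 / 3.0] * 3
for _ in range(200):
    phi = dual(phi, L)

h = [1.0, 1.0, 1.0]            # fixed point of L with phi(h) = 1
Q = [2.0, -1.0, 0.5]           # arbitrary test observable
phi_Q = sum(p * q for p, q in zip(phi, Q))

errors = []
v = Q[:]
for n in range(1, 21):
    v = apply(L, v)
    errors.append(max(abs(v[i] - phi_Q * h[i]) for i in range(3)))

print(errors[0], errors[9], errors[19])
```

In this toy case the decay rate is governed by the second largest eigenvalue modulus of the matrix; in the theorem the role of that spectral gap is played by the constant delta_1.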

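Now we consider a family of Ruelle transfer operators $\{L_\alpha\}_{\alpha\in\mathbb{C}}$.

Theorem 2.2. Let $\mathbb{C} \ni \alpha \mapsto a(\alpha) \in F_\theta \cap \mathcal{A}_{[1,\infty)}$ be an $F_\theta$-valued entire analytic function such that each $a(\alpha)$, $\alpha \in \mathbb{C}$, has an inverse in $F_\theta$. Let $E : \mathcal{B} \otimes M_d(\mathbb{C}) \to \mathcal{B}$ be a completely positive unital map. For each $\alpha \in \mathbb{C}$, define a map $L_\alpha : \mathcal{B} \otimes \mathcal{A}_{[1,\infty)} \to \mathcal{B} \otimes \mathcal{A}_{[1,\infty)}$ by

$$L_\alpha(Q) := \tau_{c+} \circ \big(E \otimes \mathrm{id}_{[2,\infty)}\big)\big(a(\bar\alpha)^* Q a(\alpha)\big), \quad Q \in \mathcal{B} \otimes \mathcal{A}_{[1,\infty)}. \tag{3}$$

Assume that for real $\alpha$, $L_\alpha$ satisfies (iii) of Assumption 2.1. Then, for real $\alpha$,

(i) There exist a strictly positive number $\lambda(\alpha)$ and a strictly positive element $h(\alpha)$ in $F_\theta$ such that $L_\alpha(h(\alpha)) = \lambda(\alpha) h(\alpha)$, and

$$\lim_{n\to\infty} \big\| \lambda(\alpha)^{-n} L_\alpha^n(1) - h(\alpha) \big\| = 0.$$

(ii) The function $\mathbb{R} \ni \alpha \mapsto \lambda(\alpha)$ is differentiable.

Remark 2.1. An analogous result for the left-side chain $\mathcal{A}_{(-\infty,-1]}$ holds.

Proof. Each $L_\alpha$, $\alpha \in \mathbb{C}$, gives a bounded operator on $F_\theta$ into itself. (See Lemma C.3.) To prove (i), we claim that for real $\alpha$, there exists a strictly positive number $\lambda(\alpha)$ such that $\lambda(\alpha)^{-1} L_\alpha$ satisfies Assumption 2.1. Note that if $\alpha$ is real, $L_\alpha$ is a completely positive map satisfying (i) and (iii) of Assumption 2.1. For each $\alpha \in \mathbb{R}$, there are a state $\varphi_\alpha$ and a strictly positive scalar $\lambda(\alpha)$ such that $L_\alpha^* \varphi_\alpha = \lambda(\alpha) \varphi_\alpha$. In fact, as $a(\alpha)$ is invertible and $E$ is completely positive and unital, we have

$$L_\alpha(1) \ge \big\| a(\alpha)^{-1} \big\|^{-2} > 0.$$

Accordingly, if $\nu$ is a state of $\mathcal{B} \otimes \mathcal{A}_{[1,\infty)}$, a state

$$G(\nu)(Q) := \frac{\nu(L_\alpha(Q))}{\nu(L_\alpha(1))}, \quad Q \in \mathcal{B} \otimes \mathcal{A}_{[1,\infty)}$$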


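is well defined. This defines a weak$^*$-continuous map $G$ from the state space into itself. As the state space is weak$^*$-compact and convex, by the Schauder-Tychonov theorem, there exists a fixed point $\varphi_\alpha$ of $G$. This state $\varphi_\alpha$ and a strictly positive scalar $\lambda(\alpha) = \varphi_\alpha(L_\alpha(1))$ satisfy $L_\alpha^* \varphi_\alpha = \lambda(\alpha) \varphi_\alpha$. (See [A1].) Clearly, the operator $\lambda(\alpha)^{-1} L_\alpha$ satisfies Assumption 2.1. Applying Theorem 2.1 to $\lambda(\alpha)^{-1} L_\alpha$, $\alpha \in \mathbb{R}$, we obtain a strictly positive element $h(\alpha)$ in $F_\theta$ such that

$$L_\alpha(h(\alpha)) = \lambda(\alpha) h(\alpha), \quad \varphi_\alpha(h(\alpha)) = 1.$$

Furthermore, for some strictly positive constants $C_\alpha$ and $\delta_\alpha$, we have

$$\big| \lambda(\alpha)^{-n} L_\alpha^n(Q) - \varphi_\alpha(Q) h(\alpha) \big|_\theta \le C_\alpha e^{-\delta_\alpha n} |Q|_\theta, \quad \forall Q \in F_\theta,\ \forall n \in \mathbb{N}. \tag{4}$$

Hence (i) is proven. To prove (ii), we use Lemma 9.2 of [A1]:

Lemma 2.1 (Lemma 9.2 of [A1]). Let $X$ be a Banach space. Let $L_\alpha$ be a bounded linear operator on $X$, analytic in $\alpha$ in a neighborhood $D$ of a real point $\alpha_0 \in D$, satisfying the following conditions for all $\alpha \in \mathbb{R} \cap D$:

(a) There exist $\lambda(\alpha) > 0$, $h(\alpha) \in X$, and $\varphi_\alpha \in X^*$ such that

$$L_\alpha(h(\alpha)) = \lambda(\alpha) h(\alpha), \quad L_\alpha^* \varphi_\alpha = \lambda(\alpha) \varphi_\alpha, \quad \varphi_\alpha(h(\alpha)) = 1.$$

(b) Define a projection $E_\alpha : X \to X$ by $E_\alpha(Q) = \varphi_\alpha(Q) h(\alpha)$. There exists $0 < \mu_\alpha < \lambda(\alpha)$ such that

$$\lim_{N\to\infty} \big\| \mu_\alpha^{-N} L_\alpha^N (1 - E_\alpha) \big\|_{B(X)} = 0.$$

Then, there exists a neighborhood $D'$ of $\alpha_0$ such that $D' \cap \mathbb{R} \ni \alpha \mapsto \lambda(\alpha)$ has an analytic extension to $D'$. In particular, $D' \cap \mathbb{R} \ni \alpha \mapsto \lambda(\alpha)$ is differentiable.

As $a(\alpha)$ is an $F_\theta$-valued entire analytic function, $L_\alpha$ is a $B(F_\theta)$-valued entire analytic function. (See Lemma C.4.) Furthermore, by the above argument, $L_\alpha$ satisfies (a) of Lemma 2.1 with $\lambda(\alpha)$, $h(\alpha)$, and $\varphi_\alpha$. From (4), we obtain

$$\big\| \lambda(\alpha)^{-n} L_\alpha^n (1 - E_\alpha) \big\|_{B(F_\theta)} \le C_\alpha e^{-\delta_\alpha n}.$$

Hence for $0 < \mu_\alpha := \lambda(\alpha) e^{-\frac{1}{2}\delta_\alpha} < \lambda(\alpha)$, we have

$$\big\| \mu_\alpha^{-n} L_\alpha^n (1 - E_\alpha) \big\|_{B(F_\theta)} \le C_\alpha e^{-\frac{1}{2}\delta_\alpha n},$$

and $L_\alpha$ satisfies (b). Applying Lemma 2.1, $\mathbb{R} \ni \alpha \mapsto \lambda(\alpha)$ is differentiable.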

We will construct Ruelle operators L α so that the eigenvalue λ(α) in Theorem 2.2 corresponds to the logarithmic moment generating function f (α) in (1).
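The correspondence between the leading eigenvalue and the logarithmic moment generating function is familiar from classical transfer matrices, and a small numerical sketch may make it concrete. The example below is a classical stand-in, not the construction of this paper: a nearest-neighbour Ising chain with coupling `K` (playing the role of the inverse temperature times the interaction) and the magnetization as observable. All function names and parameter values are hypothetical. It checks that (1/n) log p_n(alpha) approaches log lambda(alpha), and evaluates the Legendre transform of f(alpha) = log lambda(alpha) - log lambda(0) on a grid.

```python
import math
from itertools import product

def largest_eigenvalue(K, alpha):
    """Closed-form Perron eigenvalue of T = [[e^(K+a), e^-K], [e^-K, e^(K-a)]]."""
    return (math.exp(K) * math.cosh(alpha)
            + math.sqrt(math.exp(2 * K) * math.sinh(alpha) ** 2 + math.exp(-2 * K)))

def log_p(n, K, alpha):
    """log of sum_s exp(K sum_i s_i s_{i+1} + alpha sum_i s_i), free boundary, n sites."""
    spins = (1, -1)
    T = [[math.exp(K * s * t + alpha * (s + t) / 2) for t in spins] for s in spins]
    v = [math.exp(alpha * s / 2) for s in spins]
    w, log_scale = v[:], 0.0
    for _ in range(n - 1):
        w = [T[0][0] * w[0] + T[0][1] * w[1], T[1][0] * w[0] + T[1][1] * w[1]]
        m = max(w)                     # renormalise to avoid overflow
        w = [x / m for x in w]
        log_scale += math.log(m)
    return log_scale + math.log(v[0] * w[0] + v[1] * w[1])

def log_p_brute(n, K, alpha):
    """Same quantity by direct enumeration of all 2^n configurations."""
    total = 0.0
    for s in product((1, -1), repeat=n):
        bonds = sum(s[i] * s[i + 1] for i in range(n - 1))
        total += math.exp(K * bonds + alpha * sum(s))
    return math.log(total)

def f(alpha, K):
    """Log moment generating function of the magnetization in the alpha = 0 Gibbs state."""
    return math.log(largest_eigenvalue(K, alpha)) - math.log(largest_eigenvalue(K, 0.0))

def rate(x, K, grid=2001, alpha_max=10.0):
    """Legendre transform sup_alpha {alpha x - f(alpha)}, approximated on a grid."""
    alphas = [-alpha_max + 2 * alpha_max * i / (grid - 1) for i in range(grid)]
    return max(a * x - f(a, K) for a in alphas)

K, alpha = 0.5, 0.3  # hypothetical coupling and tilt
print(log_p(4000, K, alpha) / 4000, math.log(largest_eigenvalue(K, alpha)))
print(rate(0.0, K), rate(0.5, K))
```

Here the eigenvalue is available in closed form, so differentiability in alpha is immediate; in the quantum setting this is what Theorem 2.2 (ii) supplies.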


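3. Large Deviation Principle for KMS-States

Let $\Phi$ be a finite range interaction and $\omega$ a unique $(\tau_\Phi, \beta)$-KMS state. Let $\Psi$ be another finite range interaction. In this section, we prove the large deviation principle of the distribution of $\frac{1}{n} H_\Psi([1,n])$ in $\omega$, Theorem 1.1. By the Gärtner-Ellis Theorem, it suffices to show the existence and differentiability of the logarithmic moment generating function,

$$f(\alpha) = \lim_{n\to\infty} \frac{1}{n} \log \omega\big(e^{\alpha H_\Psi([1,n])}\big), \quad \forall \alpha \in \mathbb{R}. \tag{5}$$

Lemma 3.1. Let $p_n(\alpha)$ be

$$p_n(\alpha) := \mathrm{Tr}_{[1,n]}\Big( e^{-\frac{\beta}{2} H_\Phi[1,n]}\, e^{\alpha H_\Psi[1,n]}\, e^{-\frac{\beta}{2} H_\Phi[1,n]} \Big), \quad \alpha \in \mathbb{R}.$$

It suffices to prove the existence and differentiability of the limit

$$\lim_{n\to\infty} \frac{1}{n} \log p_n(\alpha), \quad \forall \alpha \in \mathbb{R}. \tag{6}$$

Proof. In [LR], it was shown that there exists a positive constant $C_1$ such that

$$C_1^{-1} \omega_n \le \omega|_{\mathcal{A}_{[1,n]}} \le C_1 \omega_n, \tag{7}$$

where $\omega_n$ is a state on $\mathcal{A}_{[1,n]}$ given by

$$\omega_n(A) = \frac{\mathrm{Tr}_{[1,n]}\big(e^{-\beta H_\Phi[1,n]} A\big)}{\mathrm{Tr}_{[1,n]}\big(e^{-\beta H_\Phi[1,n]}\big)}.$$

From this inequality, we have

$$\lim_{n\to\infty} \frac{1}{n}\Big( \log p_n(\alpha) - \log \omega\big(e^{\alpha H_\Psi[1,n]}\big) - \log \mathrm{Tr}_{[1,n]}\, e^{-\beta H_\Phi[1,n]} \Big) = \lim_{n\to\infty} \frac{1}{n}\Big( \log \omega_n\big(e^{\alpha H_\Psi[1,n]}\big) - \log \omega\big(e^{\alpha H_\Psi[1,n]}\big) \Big) = 0.$$

As the existence of the limit

$$\lim_{n\to\infty} \frac{1}{n} \log \mathrm{Tr}_{[1,n]}\, e^{-\beta H_\Phi[1,n]}$$

is known, it suffices to prove the existence and differentiability of the limit (6).

By Lemma 3.1, we shall confine our attention to the analysis of $p_n(\alpha)$. We now define a family of Ruelle transfer operators $\{L_\alpha\}_{\alpha\in\mathbb{C}}$ in the form (3). We set $\mathcal{B} = M_d(\mathbb{C})$, and define a completely positive unital map $E : M_d(\mathbb{C}) \otimes M_d(\mathbb{C}) \to M_d(\mathbb{C})$ through the formula

$$E(a \otimes b) := d^{-1}\, \mathrm{Tr}_{M_d(\mathbb{C})}(a)\, b.$$

Next we introduce an $F_\theta$-valued entire analytic function $a(\alpha)$. We denote by $\mathcal{A}_1$ the subalgebra given by

$$\big\{ Q \in F_\theta \cap \mathcal{A}_{[1,\infty)} : 0 < \forall \theta < 1 \big\}.$$

Note that $\mathcal{A}_{loc} \cap \mathcal{A}_{[1,\infty)}$ is included in $\mathcal{A}_1$. Let $\Xi$ be any translation invariant finite range interaction. For a subset $I$ of $[1,\infty)$, we denote by $e^{it\delta(H_\Xi(I))}$ the strongly continuous one parameter group of automorphisms generated by $i \sum_{X \subset I} [\Xi(X), \cdot\,]$. (See Appendix A.) Any element in $\mathcal{A}_1$ is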


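entire analytic for this automorphism group (Lemma A.2). We denote the cocycle associated to the perturbed dynamics $e^{it(\delta(H_\Xi(I)) + i[P, \cdot\,])}$ of $e^{it\delta(H_\Xi(I))}$ with $P = P^* \in \mathcal{A}$, by $F_1\big(e^{(\cdot)\delta(H_\Xi(I))}(P); it\big)$, $t \in \mathbb{R}$. (See (47).) If $Q$ and $P = P^*$ are in $\mathcal{A}_1$, then $e^{it\delta(H_\Xi(I))}(Q)$ and $F_1\big(e^{(\cdot)\delta(H_\Xi(I))}(P); it\big)$ are in $F_\theta$, for all $0 < \theta < 1$. Furthermore, the maps

$$i\mathbb{R} \ni it \mapsto e^{it\delta(H_\Xi(I))}(Q),\ F_1\big(e^{(\cdot)\delta(H_\Xi(I))}(P); it\big) \in F_\theta$$

have entire analytic extensions

$$\mathbb{C} \ni z \mapsto e^{z\delta(H_\Xi(I))}(Q),\ F_1\big(e^{(\cdot)\delta(H_\Xi(I))}(P); z\big) \in F_\theta.$$

Both $e^{z\delta(H_\Xi(I))}(Q)$ and $F_1\big(e^{(\cdot)\delta(H_\Xi(I))}(P); z\big)$ are in $\mathcal{A}_1$. (See Lemma A.3 and Lemma A.4.)

For a translation invariant finite range interaction $\Xi$, we define

$$\hat H^r_\Xi(n) := \sum_{I \subset [1,\infty),\ I \cap [1,n] \neq \emptyset} \Xi(I) \in \mathcal{A}_{[1,\infty)} \cap \mathcal{A}_{loc},$$

$$W^r_\Xi(n) := \sum_{I \subset [1,\infty),\ I \not\subset [1,n-1],\ I \not\subset [n+1,\infty)} \Xi(I) \in \mathcal{A}_{[1,\infty)} \cap \mathcal{A}_{loc}.$$

We define $a(\alpha)$ by

$$a(\alpha) := e^{\frac{\alpha}{2}\delta(H_\Psi[1,\infty))}\Big( F_1\big(e^{(\cdot)\delta(H_\Phi[2,\infty))}\big(\hat H^r_\Phi(1)\big); -\tfrac{\beta}{2}\big) \Big)\, F_1\big(e^{(\cdot)\delta(H_\Psi[2,\infty))}\big(\hat H^r_\Psi(1)\big); \tfrac{\alpha}{2}\big), \quad \alpha \in \mathbb{C}.$$

As $F_1\big(e^{(\cdot)\delta(H_\Phi[2,\infty))}\big(\hat H^r_\Phi(1)\big); -\tfrac{\beta}{2}\big)$ is in $\mathcal{A}_1$, $a(\alpha)$ is a well-defined element in $F_\theta$ and the map $\mathbb{C} \ni \alpha \mapsto a(\alpha) \in F_\theta$ is entire analytic. Each $a(\alpha)$ has an inverse in $F_\theta \cap \mathcal{A}_{[1,\infty)}$ (Lemma A.6). The Ruelle transfer operator on $\mathcal{B} \otimes \mathcal{A}_{[1,\infty)} = \mathcal{A}_{[0,\infty)}$ is given by

$$L_\alpha(Q) := \gamma_{-1} \circ \big(d^{-1}\, \mathrm{Tr}_{\{0\}} \otimes \mathrm{id}_{[1,\infty)}\big)\big(a(\bar\alpha)^* Q a(\alpha)\big), \quad \alpha \in \mathbb{C},\ Q \in \mathcal{A}_{[0,\infty)}. \tag{8}$$

Now we prove that $L_\alpha$ with real $\alpha$ satisfies (iii) of Assumption 2.1. The proof goes parallel to the argument in [M]. We shall first write $L_\alpha^n$ in a more tractable form. By an inductive calculation, we obtain

$$L_\alpha^n(Q) = d^{-n}\, \gamma_{-n} \circ \big(\mathrm{Tr}_{[0,n-1]} \otimes \mathrm{id}_{[n,\infty)}\big)\big(\tilde a_n(\alpha)^* Q\, \tilde a_n(\alpha)\big),$$

where we denoted $a(\alpha)\gamma_1(a(\alpha))\gamma_2(a(\alpha)) \cdots \gamma_{n-1}(a(\alpha))$ by $\tilde a_n(\alpha)$. It can be shown that

$$\tilde a_n(\alpha) = e^{\frac{\alpha}{2}\delta(H_\Psi[1,\infty))}\Big( F_1\big(e^{(\cdot)\delta(H_\Phi[n+1,\infty))}\big(\hat H^r_\Phi(n)\big); -\tfrac{\beta}{2}\big) \Big)\, F_1\big(e^{(\cdot)\delta(H_\Psi[n+1,\infty))}\big(\hat H^r_\Psi(n)\big); \tfrac{\alpha}{2}\big)$$

(see Appendix A.7). Let $a_n(\alpha)$, $n \ge 2$, be

$$a_n(\alpha) := \tilde a_n(\alpha)\, e^{\frac{\beta}{2} H_\Phi[1,n-1]}\, e^{-\frac{\alpha}{2} H_\Psi[1,n-1]}.$$

We have (Lemma A.7)

$$a_n(\alpha) = e^{\frac{\alpha}{2}\delta(H_\Psi[1,\infty))}\Big( F_1\big(e^{(\cdot)\delta(H_\Phi[1,\infty) - W^r_\Phi(n))}\big(W^r_\Phi(n)\big); -\tfrac{\beta}{2}\big) \Big)\, F_1\big(e^{(\cdot)\delta(H_\Psi[1,\infty) - W^r_\Psi(n))}\big(W^r_\Psi(n)\big); \tfrac{\alpha}{2}\big).$$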


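We define a completely positive unital map $\varphi_n : \mathcal{A}_{[0,\infty)} \to \mathcal{A}_{[0,\infty)}$, $n \ge 2$, by

$$\varphi_n(Q) := p_{n-1}(\alpha)^{-1} d^{-1}\, \gamma_{-n} \circ \big(\mathrm{Tr}_{[0,n-1]} \otimes \mathrm{id}_{[n,\infty)}\big)\Big( e^{-\frac{\beta}{2} H_\Phi[1,n-1]}\, e^{\frac{\alpha}{2} H_\Psi[1,n-1]}\, Q\, e^{\frac{\alpha}{2} H_\Psi[1,n-1]}\, e^{-\frac{\beta}{2} H_\Phi[1,n-1]} \Big).$$

Using these notations, we can rewrite $L_\alpha^n$ as

$$L_\alpha^n(Q) = d^{-(n-1)} p_{n-1}(\alpha)\, \varphi_n\big(a_n(\alpha)^* Q a_n(\alpha)\big), \quad n \ge 2. \tag{9}$$

Next we evaluate (9), using the properties of $a_n(\alpha)$ given in Lemma A.7: that is,

$$\lim_{n\to\infty} \big\| [Q, a_n(\alpha)] \big\| = 0, \quad \forall Q \in \mathcal{A}_{loc}, \tag{10}$$

and that there exists a positive constant $C$ such that

$$\sup_{n\in\mathbb{N}} \|a_n(\alpha)\|,\ \sup_{n\in\mathbb{N}} \big\|(a_n(\alpha))^{-1}\big\| < C. \tag{11}$$

Let $Q$ be any strictly positive element in $\mathcal{A}_{[0,n_0]}$. By (10), we can choose $\varepsilon > 0$ and $N(Q) \in \mathbb{N}$ so that

$$4C^3 \big\|Q^{\frac{1}{2}}\big\| \varepsilon \le \inf Q,$$

and

$$N(Q) \ge n_0 + 1, \quad \big\| [Q^{\frac{1}{2}}, a_n(\alpha)] \big\| < \varepsilon, \quad \forall n \ge N(Q).$$

As $\varphi_n$ is a completely positive unital map, we have $\|\varphi_n\| = \|\varphi_n(1)\| = 1$. Note that $\varphi_n(Q)$ is a scalar if $n - 1 \ge n_0$. Thus we get

$$\|L_\alpha^n(Q)\| \le d^{-(n-1)} p_{n-1}(\alpha) \Big( C^2 \varphi_n(Q) + 2C \big\|Q^{\frac{1}{2}}\big\| \big\| [Q^{\frac{1}{2}}, a_n(\alpha)] \big\| \Big) \le d^{-(n-1)} p_{n-1}(\alpha) \Big( \frac{1}{2C^2} + C^2 \Big) \varphi_n(Q),$$

and

$$L_\alpha^n(Q) \ge d^{-(n-1)} p_{n-1}(\alpha) \Big( -2C \big\|Q^{\frac{1}{2}}\big\| \big\| [Q^{\frac{1}{2}}, a_n(\alpha)] \big\| + \frac{1}{C^2} \varphi_n(Q) \Big) \ge d^{-(n-1)} p_{n-1}(\alpha)\, \frac{1}{2C^2}\, \varphi_n(Q),$$

for all $n \ge N(Q)$. Hence we obtain (iii) of Assumption 2.1:

$$\|L_\alpha^n(Q)\| \le (1 + 2C^4) \inf L_\alpha^n(Q), \quad \text{for all } n \ge N(Q).$$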
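The constant $1 + 2C^4$ in the last bound comes from dividing the upper estimate by the lower one. The short derivation below spells this out; the shorthand $c_n$ is introduced here for readability and is not notation from the paper.

```latex
% Shorthand (an assumption of this note): c_n := d^{-(n-1)} p_{n-1}(\alpha) > 0.
% The choice 4 C^3 \|Q^{1/2}\| \varepsilon \le \inf Q, together with \varphi_n(Q) \ge \inf Q, gives
%   2C \|Q^{1/2}\| \varepsilon \le \frac{1}{2C^2} \inf Q \le \frac{1}{2C^2} \varphi_n(Q).
\begin{align*}
\|L_\alpha^n(Q)\| &\le c_n \Big( \frac{1}{2C^2} + C^2 \Big) \varphi_n(Q), \\
\inf L_\alpha^n(Q) &\ge c_n \, \frac{1}{2C^2} \, \varphi_n(Q), \\
\frac{\|L_\alpha^n(Q)\|}{\inf L_\alpha^n(Q)}
  &\le \frac{\frac{1}{2C^2} + C^2}{\frac{1}{2C^2}} = 1 + 2C^4 .
\end{align*}
```

Since the factor $c_n \varphi_n(Q)$ cancels, the resulting constant $K = 1 + 2C^4$ depends neither on $n$ nor on $Q$, as (iii) of Assumption 2.1 requires.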


Proof of Theorem 1.1. We have seen that $\{L_\alpha\}_{\alpha\in\mathbb{C}}$ satisfies all the assumptions in Theorem 2.2. Therefore, we can apply the theorem to $\{L_\alpha\}_{\alpha\in\mathbb{C}}$. Accordingly, for real $\alpha$, there exist a strictly positive number $\lambda(\alpha)$ and a strictly positive element $h(\alpha)$ in $F_\theta$ such that $L_\alpha(h(\alpha)) = \lambda(\alpha) h(\alpha)$, and

$$\lim_{n\to\infty} \big\| \lambda(\alpha)^{-n} L_\alpha^n(1) - h(\alpha) \big\| = 0.$$

Furthermore, $\mathbb{R} \ni \alpha \mapsto \lambda(\alpha)$ is differentiable. By (9) and (11), for $\alpha \in \mathbb{R}$ we have

$$d^{-(n-1)} p_{n-1}(\alpha)\, C^{-2} \le L_\alpha^n(1) = d^{-(n-1)} p_{n-1}(\alpha)\, \varphi_n\big(a_n(\alpha)^* 1\, a_n(\alpha)\big) \le d^{-(n-1)} p_{n-1}(\alpha)\, C^2. \tag{12}$$

Hence for any state $\nu$ on $\mathcal{A}_{[0,\infty)}$, we have

$$\lim_{n\to\infty} \frac{1}{n-1}\Big( \log p_{n-1}(\alpha) - \log \nu\big(\lambda(\alpha)^{-n} L_\alpha^n(1)\big) - n \log \lambda(\alpha) - (n-1) \log d \Big) = \lim_{n\to\infty} \frac{1}{n} \log p_n(\alpha) - \log \lambda(\alpha) - \log d = 0.$$

Therefore, the limit

$$\lim_{n\to\infty} \frac{1}{n} \log p_n(\alpha) = \log \lambda(\alpha) + \log d, \quad \forall \alpha \in \mathbb{R} \tag{13}$$

exists and is differentiable. Applying Lemma 3.1, we have thus proved the theorem.

4. Large Deviation Principle for $C^*$-Finitely Correlated States

In this section, we prove the large deviation principle for finitely correlated states, Theorem 1.2. First, we note the following fact:

Lemma 4.1. For $i = 1, \dots, l$, $l \in \mathbb{N}$, let $\{\mu_{i,n}\}_{n\in\mathbb{N}}$ be a sequence of distributions over $\mathbb{R}$. Suppose that each $\{\mu_{i,n}\}_{n\in\mathbb{N}}$ satisfies the large deviation principle with a good rate function $I_i$. Let $\lambda_i$, $i = 1, \dots, l$, be positive numbers such that

$$\lambda_i > 0, \quad \sum_{i=1}^{l} \lambda_i = 1.$$

For each $n \in \mathbb{N}$, define $\mu_n$ by

$$\mu_n := \sum_{i=1}^{l} \lambda_i \mu_{i,n}.$$

Then $\{\mu_n\}_{n\in\mathbb{N}}$ satisfies the large deviation principle with a good rate function $I(x) := \min_{1\le i\le l} I_i(x)$.
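A numerical sketch of Lemma 4.1 in a classical toy case follows. It is an illustration under stated assumptions, not part of the paper: the two component sequences are laws of binomial empirical means with hypothetical parameters `ps`, mixed with hypothetical weights `weights`, and the set is the interval [lo, hi]. The component with the smaller rate on the interval dominates the exponential decay even when its weight is smaller.

```python
import math

def log_binom_pmf(n, k, p):
    """Log-probability that a Binomial(n, p) variable equals k."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def log_prob_interval(n, lo, hi, p):
    """log mu_{i,n}([lo, hi]) for the law of S_n / n with S_n ~ Binomial(n, p)."""
    k_lo = math.ceil(lo * n - 1e-9)
    k_hi = math.floor(hi * n + 1e-9)
    return log_sum_exp([log_binom_pmf(n, k, p) for k in range(k_lo, k_hi + 1)])

def rate(x, p):
    """Relative-entropy rate function of the Bernoulli(p) empirical mean, 0 < x < 1."""
    return x * math.log(x / p) + (1 - x) * math.log((1 - x) / (1 - p))

def inf_rate_on_interval(lo, hi, p):
    """The rate is convex with minimum at p, so the infimum sits at the point nearest p."""
    return rate(min(max(p, lo), hi), p)

weights = (0.7, 0.3)   # hypothetical mixture weights lambda_i
ps = (0.2, 0.7)        # hypothetical Bernoulli parameters of the two components
lo, hi = 0.4, 0.5      # the set Gamma = [lo, hi]

def log_mixture(n):
    """log mu_n(Gamma) for mu_n = sum_i lambda_i mu_{i,n}."""
    return log_sum_exp([math.log(w) + log_prob_interval(n, lo, hi, p)
                        for w, p in zip(weights, ps)])

predicted = min(inf_rate_on_interval(lo, hi, p) for p in ps)
for n in (100, 1000, 5000):
    print(n, -log_mixture(n) / n, predicted)
```

With these values the second component has the smaller infimum on the interval, so the printed decay rate should approach its rate rather than that of the more heavily weighted first component.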


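Proof. For any Borel set $\Gamma$ of $\mathbb{R}$, we have $\inf_{x\in\Gamma} I(x) = \min_{1\le i\le l} \inf_{x\in\Gamma} I_i(x)$. As $\{\mu_{i,n}\}_{n\in\mathbb{N}}$ satisfies the large deviation principle with a rate function $I_i$, we have

$$-\inf_{x\in\Gamma^0} I_i(x) \le \liminf_{n\to\infty} \frac{1}{n} \log \mu_{i,n}(\Gamma) \le \limsup_{n\to\infty} \frac{1}{n} \log \mu_{i,n}(\Gamma) \le -\inf_{x\in\bar\Gamma} I_i(x) \le -\inf_{x\in\bar\Gamma} I(x). \tag{14}$$

First we prove the upper bound. For any Borel set $\Gamma$, we have

$$\sup_{m\ge n} \frac{1}{m} \log \mu_m(\Gamma) = \sup_{m\ge n} \frac{1}{m} \log \Big( \sum_{i=1}^{l} \lambda_i \mu_{i,m}(\Gamma) \Big) \le \max_{1\le i\le l} \sup_{m\ge n} \frac{1}{m} \log \mu_{i,m}(\Gamma). \tag{15}$$

If $\inf_{x\in\bar\Gamma} I(x) = +\infty$, then for any $R > 0$, we have from (15) and (14),

$$\sup_{m\ge n} \frac{1}{m} \log \mu_m(\Gamma) \le \max_{1\le i\le l} \sup_{m\ge n} \frac{1}{m} \log \mu_{i,m}(\Gamma) < -R,$$

for any $n$ large enough. Hence we have

$$\limsup_{n\to\infty} \frac{1}{n} \log \mu_n(\Gamma) = -\infty \le -\inf_{x\in\bar\Gamma} I(x) = -\infty.$$

If $\inf_{x\in\bar\Gamma} I(x) < +\infty$, then for any $\varepsilon > 0$, we have from (15) and (14),

$$\sup_{m\ge n} \frac{1}{m} \log \mu_m(\Gamma) \le \max_{1\le i\le l} \sup_{m\ge n} \frac{1}{m} \log \mu_{i,m}(\Gamma) \le -\inf_{x\in\bar\Gamma} I(x) + \varepsilon,$$

for any $n$ large enough. Hence we have

$$\limsup_{n\to\infty} \frac{1}{n} \log \mu_n(\Gamma) \le -\inf_{x\in\bar\Gamma} I(x).$$

We thus proved the upper bound. The lower bound is trivial if $\inf_{x\in\Gamma^0} I(x) = +\infty$. If $\inf_{x\in\Gamma^0} I(x) < +\infty$, then there exists $i_0$ such that $\inf_{x\in\Gamma^0} I_{i_0}(x) = \inf_{x\in\Gamma^0} I(x) < +\infty$. For any $\varepsilon > 0$, we have

$$\inf_{m\ge n} \frac{1}{m} \log \mu_{i_0,m}(\Gamma) \ge -\inf_{x\in\Gamma^0} I_{i_0}(x) - \varepsilon,$$

for $n$ large enough. We thus obtain

$$\inf_{m\ge n} \frac{1}{m} \log \mu_m(\Gamma) = \inf_{m\ge n} \frac{1}{m} \log \Big( \sum_{i=1}^{l} \lambda_i \mu_{i,m}(\Gamma) \Big) \ge \inf_{m\ge n} \frac{1}{m} \log \big( \lambda_{i_0} \mu_{i_0,m}(\Gamma) \big)$$

$$\ge \frac{1}{n} \log \lambda_{i_0} + \inf_{m\ge n} \frac{1}{m} \log \mu_{i_0,m}(\Gamma) \ge \frac{1}{n} \log \lambda_{i_0} - \inf_{x\in\Gamma^0} I_{i_0}(x) - \varepsilon,$$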


for $n$ large enough. Therefore, we have the lower bound

$$\liminf_{n\to\infty} \frac{1}{n} \log \mu_n(\Gamma) \ge -\inf_{x\in\Gamma^0} I_{i_0}(x) = -\inf_{x\in\Gamma^0} I(x).$$

Note that $\{x \in \mathbb{R} : I(x) \le \alpha\} = \bigcup_{i=1}^{l} \{x \in \mathbb{R} : I_i(x) \le \alpha\}$ for any $\alpha \in [0,\infty)$. As each $\{x \in \mathbb{R} : I_i(x) \le \alpha\}$ is compact, so is $\{x \in \mathbb{R} : I(x) \le \alpha\}$. Hence $I$ is a good rate function.

Let $\omega$ be a $C^*$-finitely correlated state generated by a finite dimensional $C^*$-algebra $\mathcal{B}$, a completely positive unital map $E : M_d(\mathbb{C}) \otimes \mathcal{B} \to \mathcal{B}$ and a faithful state $\rho$. We define a completely positive unital map $\hat E_1 : \mathcal{B} \to \mathcal{B}$ through the formula $\hat E_1(b) := E(1 \otimes b)$, $b \in \mathcal{B}$. For $l \in \mathbb{N}$, we denote the $l$th iterate of $E$, $E \circ (\mathrm{id}_{\{1\}} \otimes E) \circ \cdots \circ (\mathrm{id}_{[-l+1,-1]} \otimes E)$, by $E^{(l)}$. It is known that every $C^*$-finitely correlated state has a decomposition as a finite convex combination of extremal periodic states, which are again $C^*$-finitely correlated [FNW]. That is, we can write $\omega$ as a finite sum $\omega = \sum_{i=1}^{n} \lambda_i \omega_i$, $0 < \lambda_i$, $\sum_{i=1}^{n} \lambda_i = 1$, where each $\omega_i$ is an extremal $p_i$-periodic state. Furthermore, $\omega_i$ is a $C^*$-finitely correlated state on $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})^{\otimes p_i}$, generated by a triple $(\mathcal{B}_i, E_i, \rho_i)$, such that 1 is a nondegenerate eigenvalue of $(\hat E_i)_1$, and the rest of the spectrum has modulus strictly less than 1. Therefore, by Lemma 4.1, it suffices to prove the large deviation principle for $\omega$ generated by a completely positive map $E$ such that $\hat E_1$ has a nondegenerate eigenvalue 1 and the rest of the spectrum has modulus strictly less than 1. We shall confine our attention to this case.

Lemma 4.2. Let $\omega$ be a $C^*$-finitely correlated state on $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})$ generated by $(\mathcal{B}, E, \rho)$. Assume $\hat E_1$ has a nondegenerate eigenvalue 1 and the rest of the spectrum has modulus strictly less than 1. Then there exist a positive constant $s > 0$ and $l \in \mathbb{N}$ such that $\omega$ is a $C^*$-finitely correlated state on $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})^{\otimes l}$ generated by $(\mathcal{B}, E^{(l)}, \rho)$ satisfying

$$s^{-1} \rho(b) \le \big(\widehat{E^{(l)}}\big)_1(b) \le s \rho(b), \quad 0 \le \forall b,\ b \in \mathcal{B}. \tag{16}$$

Proof. We claim that there exist an integer $l$ and a positive constant $s > 0$ such that

$$s^{-1} \rho(b) \le \hat E_1^{\,l}(b) \le s \rho(b), \quad 0 \le b,\ b \in \mathcal{B}. \tag{17}$$

To see this, let $P$ be a spectral projection of $\hat E_1$ corresponding to the eigenvalue 1, and set $\bar P = 1 - P$. By assumption, the range of $P$ is $\mathbb{C}1$. As $\rho$ is a faithful state on a finite dimensional $C^*$-algebra, there exists $c > 0$ such that $\rho(\cdot) \ge c\, \mathrm{Tr}_{\mathcal{B}}(\cdot)$. Accordingly, we have

$$c \|b\| \le \rho(b), \quad \forall b \ge 0,\ b \in \mathcal{B}.$$

By the assumption, if we take $l$ large enough, we have

$$\big\| (\hat E_1)^l \bar P(b) \big\| \le \frac{c}{2} \|b\|, \quad \forall b \in \mathcal{B}.$$

Furthermore, we have

$$\rho(b) = \lim_{n\to\infty} \rho\big(\hat E_1^{\,n}(b)\big) = \rho(P(b)).$$


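Hence we have $P(b) = \rho(b) 1$. We thus obtain the claim: there exists $l$ such that

$$\frac{1}{2} \rho(b) \le \rho(b) - \frac{c}{2} \|b\| \le \hat E_1^{\,l}(b) = \hat E_1^{\,l}(Pb) + \hat E_1^{\,l}(\bar P b) = \rho(b) + \hat E_1^{\,l} \bar P(b) \le \rho(b) + \frac{c}{2} \|b\| \le \frac{3}{2} \rho(b),$$

for $0 \le b$, $b \in \mathcal{B}$. Note that $\omega$ is a $C^*$-finitely correlated state on $\bigotimes_{\mathbb{Z}} (M_d(\mathbb{C}))^{\otimes l}$, generated by $(\mathcal{B}, E^{(l)}, \rho)$. Furthermore, we have $\big(\widehat{E^{(l)}}\big)_1 = (\hat E_1)^l$, and obtain (16).

Now we apply the Ruelle transfer operator method to prove existence and differentiability of the logarithmic moment generating function for a $C^*$-finitely correlated state $\omega$ satisfying (16).

Lemma 4.3. Let $\omega$ be a $C^*$-finitely correlated state on $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})$ generated by $(\mathcal{B}, E, \rho)$. Suppose that there exists a positive constant $s$ such that

$$s^{-1} \rho(b) \le \hat E_1(b) \le s \rho(b), \quad 0 \le \forall b,\ b \in \mathcal{B}. \tag{18}$$

Then for any $\alpha \in \mathbb{R}$, the limit

$$\lim_{n\to\infty} \frac{1}{n} \log \omega\big(e^{\alpha H_\Psi[-n,-1]}\big)$$

exists and is differentiable with respect to $\alpha$.

Proof. As in Sect. 3, we use the Ruelle transfer operator method. We use the analogous notation and arguments of Sect. 2 for the left side chain. The algebra $F_\theta$ is defined analogously for the left side chain and an analogous result of Theorem 2.2 holds. For a translation invariant finite range interaction $\Psi$, we define

$$\hat H^l_\Psi(n) := \sum_{I \subset (-\infty,-1],\ I \cap [-n,-1] \neq \emptyset} \Psi(I) \in \mathcal{A}_{(-\infty,-1]} \cap \mathcal{A}_{loc},$$

$$W^l_\Psi(n) := \sum_{I \subset (-\infty,-1],\ I \not\subset [-n+1,-1],\ I \not\subset (-\infty,-n-1]} \Psi(I) \in \mathcal{A}_{(-\infty,-1]} \cap \mathcal{A}_{loc}. \tag{19}$$

As a transfer operator, we consider a map from $\mathcal{A}_{(-\infty,-1]} \otimes \mathcal{B}$ to $\mathcal{A}_{(-\infty,-1]} \otimes \mathcal{B}$. For each $\alpha \in \mathbb{C}$, we define $L_\alpha$ by

$$L_\alpha(Q) := \tau_{c-} \circ \big(\mathrm{id}_{(-\infty,-2]} \otimes E\big)\big(a(\bar\alpha)^* Q a(\alpha)\big), \quad Q \in \mathcal{A}_{(-\infty,-1]} \otimes \mathcal{B}.$$

We set $a(\alpha)$ to be

$$a(\alpha) := F_1\big(e^{(\cdot)\delta(H_\Psi(-\infty,-2])}\big(\hat H^l_\Psi(1)\big); \tfrac{\alpha}{2}\big), \quad \alpha \in \mathbb{C}.$$

As in Lemma A.6, $\mathbb{C} \ni \alpha \mapsto a(\alpha)$ is an $F_\theta$-valued entire analytic function and each $a(\alpha)$ has an inverse in $F_\theta$. Now we prove that each $L_\alpha$, $\alpha \in \mathbb{R}$, satisfies (iii) of Assumption 2.1. We shall first write $L_\alpha^n$ in a more tractable form. By an inductive calculation, we obtain

$$L_\alpha^n(Q) = \big(\tau_{c-} \circ (\mathrm{id}_{(-\infty,-2]} \otimes E)\big)^n \big(\tilde a_n(\alpha)^* Q\, \tilde a_n(\alpha)\big),$$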


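where $\tilde a_n(\alpha) := a(\alpha)\gamma_{-1}(a(\alpha)) \cdots \gamma_{-(n-1)}(a(\alpha))$. Let $a_n(\alpha)$, $n \ge 2$, be

$$a_n(\alpha) := \tilde a_n(\alpha)\, e^{-\frac{\alpha}{2} H_\Psi[-n+1,-1]}.$$

For each $n \ge 2$, we define a positive constant $p_n(\alpha)$ and a completely positive map $\Phi_n$ by

$$p_n(\alpha) := \omega\big(e^{\alpha H_\Psi[-n+1,-1]}\big),$$

$$\Phi_n(Q) := p_n(\alpha)^{-1} \big(\tau_{c-} \circ (\mathrm{id}_{(-\infty,-2]} \otimes E)\big)^n \Big( e^{\frac{\alpha}{2} H_\Psi[-n+1,-1]}\, Q\, e^{\frac{\alpha}{2} H_\Psi[-n+1,-1]} \Big). \tag{20}$$

Using these notations, we can write $L_\alpha^n$ as

$$L_\alpha^n(Q) = p_n(\alpha)\, \Phi_n\big(a_n(\alpha)^* Q a_n(\alpha)\big), \quad Q \in \mathcal{A}_{(-\infty,-1]} \otimes \mathcal{B},\ n \ge 2. \tag{21}$$

Next, note that for $R \in \mathcal{A}_{[-n+1,-1]} \otimes \mathcal{B}$, $n \ge 2$, an element

$$\big(\tau_{c-} \circ (\mathrm{id}_{(-\infty,-2]} \otimes E)\big)^{n-1} \Big( e^{\frac{\alpha}{2} H_\Psi[-n+1,-1]}\, R\, e^{\frac{\alpha}{2} H_\Psi[-n+1,-1]} \Big) \tag{22}$$

belongs to $1_{\mathcal{A}_{(-\infty,-1]}} \otimes \mathcal{B}$, and (identifying $1_{\mathcal{A}_{(-\infty,-1]}} \otimes \mathcal{B}$ with $\mathcal{B}$),

$$\rho\Big( \big(\tau_{c-} \circ (\mathrm{id}_{(-\infty,-2]} \otimes E)\big)^{n-1} \big( e^{\alpha H_\Psi[-n+1,-1]} \big) \Big) = \omega\big(e^{\alpha H_\Psi[-n+1,-1]}\big) = p_n(\alpha).$$

Accordingly,

$$\varphi_n(R) := p_n(\alpha)^{-1} \rho\Big( \big(\tau_{c-} \circ (\mathrm{id}_{(-\infty,-2]} \otimes E)\big)^{n-1} \Big( e^{\frac{\alpha}{2} H_\Psi[-n+1,-1]}\, R\, e^{\frac{\alpha}{2} H_\Psi[-n+1,-1]} \Big) \Big)$$

defines a state on $\mathcal{A}_{[-n+1,-1]} \otimes \mathcal{B}$. We claim

$$s^{-1} \varphi_n(R) \le \Phi_n(R) \le s \varphi_n(R), \quad \forall R \ge 0,\ R \in \mathcal{A}_{[-n+1,-1]} \otimes \mathcal{B}. \tag{23}$$

To see this, we denote (22) by $1_{\mathcal{A}_{(-\infty,-1]}} \otimes b_R$. We have

$$\Phi_n(R) = p_n(\alpha)^{-1}\, \tau_{c-} \circ (\mathrm{id}_{(-\infty,-2]} \otimes E) \Big( \big(\tau_{c-} \circ (\mathrm{id}_{(-\infty,-2]} \otimes E)\big)^{n-1} \Big( e^{\frac{\alpha}{2} H_\Psi[-n+1,-1]}\, R\, e^{\frac{\alpha}{2} H_\Psi[-n+1,-1]} \Big) \Big) = p_n(\alpha)^{-1} \big( 1_{\mathcal{A}_{(-\infty,-1]}} \otimes \hat E_1(b_R) \big).$$

Therefore, from the bound (18), we obtain the claim:

$$s^{-1} \varphi_n(R) = s^{-1} p_n(\alpha)^{-1} \rho(b_R) \le \Phi_n(R) = p_n(\alpha)^{-1} \big( 1_{\mathcal{A}_{(-\infty,-1]}} \otimes \hat E_1(b_R) \big) \le s\, p_n(\alpha)^{-1} \rho(b_R) = s \varphi_n(R). \tag{24}$$

From (23), we have $0 \le \Phi_n(1) \le s$. As $\Phi_n$ is completely positive, we obtain

$$\|\Phi_n\| = \|\Phi_n(1)\| \le s.$$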


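We now check the condition (iii). As in Sect. 3, there exists a positive constant $C > 0$ such that

$$\sup_{n\in\mathbb{N}} \|a_n(\alpha)\|,\ \sup_{n\in\mathbb{N}} \big\|a_n(\alpha)^{-1}\big\| < C. \tag{25}$$

Furthermore, we have

$$\lim_{n\to\infty} \big\| [Q, a_n(\alpha)] \big\| = 0, \quad \forall Q \in \mathcal{A}_{loc}.$$

(See Lemma A.7.) For a strictly positive element $Q$ in $\mathcal{A}_{[-n_0,-1]} \otimes \mathcal{B}$, we can choose $\varepsilon > 0$ and $N(Q) \in \mathbb{N}$ so that

$$2\varepsilon \big\|Q^{\frac{1}{2}}\big\| C \le \frac{1}{2C^2} s^{-2} \inf Q,$$

and

$$n_0 + 1 \le N(Q), \quad \big\| [Q^{\frac{1}{2}}, a_n(\alpha)] \big\| < \varepsilon, \quad \forall n \ge N(Q).$$

Thus, due to the inequality (23), for $n \ge N(Q)$, we have

$$\|L_\alpha^n(Q)\| = p_n(\alpha) \big\| \Phi_n\big(a_n(\alpha)^* Q a_n(\alpha)\big) \big\| \le 2C \|\Phi_n\| \big\| [Q^{\frac{1}{2}}, a_n(\alpha)] \big\| \big\|Q^{\frac{1}{2}}\big\| p_n(\alpha) + C^2 s\, p_n(\alpha) \varphi_n(Q) \le p_n(\alpha) \Big( \frac{1}{2C^2} s^{-1} + C^2 s \Big) \varphi_n(Q),$$

$$L_\alpha^n(Q) \ge \Big( -2 \|\Phi_n\| C \big\| [Q^{\frac{1}{2}}, a_n(\alpha)] \big\| \big\|Q^{\frac{1}{2}}\big\| + \frac{1}{C^2} s^{-1} \varphi_n(Q) \Big) p_n(\alpha) \ge p_n(\alpha) \varphi_n(Q)\, \frac{1}{2C^2} s^{-1}.$$

Hence for $n \ge N(Q)$, we obtain

$$\|L_\alpha^n(Q)\| \le 2C^2 s \Big( \frac{1}{2C^2} s^{-1} + C^2 s \Big) \inf L_\alpha^n(Q).$$

We thus showed (iii). Hence $\{L_\alpha\}_{\alpha\in\mathbb{C}}$ satisfies all the assumptions of Theorem 2.2. We thus can apply the left-side version of Theorem 2.2 to $\{L_\alpha\}_{\alpha\in\mathbb{C}}$, and for $\alpha \in \mathbb{R}$, we obtain

$$\lim_{n\to\infty} \big\| \lambda(\alpha)^{-n} L_\alpha^n(1) - h(\alpha) \big\| = 0,$$

for some strictly positive element $h(\alpha)$ in $\mathcal{A}_{(-\infty,-1]} \otimes \mathcal{B}$ and a strictly positive number $\lambda(\alpha)$. Furthermore, $\lambda(\alpha)$ is differentiable with respect to $\alpha$. By (21), (23) and (25), we have

$$\frac{1}{sC^2} p_n(\alpha) \le C^{-2} p_n(\alpha) \Phi_n(1) \le L_\alpha^n(1) = p_n(\alpha) \Phi_n\big(a_n(\alpha)^* a_n(\alpha)\big) \le C^2 p_n(\alpha) \Phi_n(1) \le C^2 s\, p_n(\alpha). \tag{26}$$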


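For any state $\nu$ on $\mathcal{A}_{(-\infty,-1]} \otimes \mathcal{B}$, we obtain

$$\lim_{n\to\infty} \frac{1}{n} \log \omega\big(e^{\alpha H_\Psi[-n,-1]}\big) = \lim_{n\to\infty} \frac{1}{n} \log p_n(\alpha) = \lim_{n\to\infty} \frac{1}{n} \log \nu\big(L_\alpha^n(1)\big) = \log \lambda(\alpha). \tag{27}$$

As $\log \lambda(\alpha)$ is differentiable, we have proved the lemma.

Proof of Theorem 1.2. For a translation invariant finite range interaction $\Psi$ over $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})$ and $k \in \mathbb{N}$, we define an interaction $\hat\Psi$ over $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})^{\otimes k}$ by

$$\hat\Psi(I) := \sum_{X : I = \tilde I(X)} \Psi(X), \qquad \tilde I(X) := \{ l \in \mathbb{Z} : (kl + [0, k-1]) \cap X \neq \emptyset \}.$$

It is easy to see that $\hat\Psi$ is a translation invariant finite range interaction over $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})^{\otimes k}$ such that

$$H_{\hat\Psi}(J) = H_\Psi(\hat J), \qquad \hat J = k \cdot J + [0, k-1], \tag{28}$$

for all finite subsets $J$ of $\mathbb{Z}$. Furthermore, by the same argument as in Appendix A (use Lemma A.2 and A.4), we have

$$C := \sup_n \Big\| e^{\frac{\alpha}{2} H_\Psi([-n,-1])} e^{-\frac{\alpha}{2} H_\Psi([-k[\frac{n}{k}],-1])} \Big\| + \sup_n \Big\| e^{\frac{\alpha}{2} H_\Psi([-k[\frac{n}{k}],-1])} e^{-\frac{\alpha}{2} H_\Psi([-n,-1])} \Big\|$$

$$= \sup_n \Big\| F_1\Big( e^{(\cdot)\delta(H_\Psi([-k[\frac{n}{k}],-1]))}\big( H_\Psi([-n,-1]) - H_\Psi([-k[\tfrac{n}{k}],-1]) \big); \tfrac{\alpha}{2} \Big) \Big\| + \sup_n \Big\| F_1\Big( e^{(\cdot)\delta(H_\Psi([-n,-1]))}\big( H_\Psi([-k[\tfrac{n}{k}],-1]) - H_\Psi([-n,-1]) \big); \tfrac{\alpha}{2} \Big) \Big\| < \infty.$$

Hence we obtain

$$C^{-2} e^{\alpha H_{\hat\Psi}[-[\frac{n}{k}],-1]} \le e^{\alpha H_\Psi[-n,-1]} \le C^2 e^{\alpha H_\Psi[-k[\frac{n}{k}],-1]} = C^2 e^{\alpha H_{\hat\Psi}[-[\frac{n}{k}],-1]}, \quad \forall n \in \mathbb{N},\ \forall \alpha \in \mathbb{R}. \tag{29}$$

Any $C^*$-finitely correlated state $\omega$ has a decomposition as a finite convex combination

$$\omega = \sum_{i=1}^{m} \lambda_i \omega_i, \quad 0 < \lambda_i, \quad \sum_{i=1}^{m} \lambda_i = 1,$$

where each $\omega_i$ is a $C^*$-finitely correlated state on $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})^{\otimes p_i}$, generated by a triple $(\mathcal{B}_i, E_i, \rho_i)$, such that 1 is a nondegenerate eigenvalue of $(\hat E_i)_1$, and the rest of the spectrum has modulus strictly less than 1. By Lemma 4.2, $\omega_i$ is a $C^*$-finitely correlated state on $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})^{\otimes p_i l_i}$, generated by $(\mathcal{B}_i, E_i^{(l_i)}, \rho_i)$ satisfying

$$s_i^{-1} \rho_i(b) \le \big(\widehat{E_i^{(l_i)}}\big)_1(b) \le s_i \rho_i(b), \quad 0 \le \forall b,\ b \in \mathcal{B}_i, \tag{30}$$

for some $l_i \in \mathbb{N}$ and $s_i > 0$.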


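As stated above, there exists a translation invariant finite range interaction $\hat\Phi_i$ over $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})^{\otimes p_i l_i}$ such that

$$C_i^{-2}\, e^{\alpha H_{\hat\Phi_i}[-[\frac{n}{p_i l_i}],-1]} \le e^{\alpha H_\Phi[-n,-1]} \le C_i^{2}\, e^{\alpha H_{\hat\Phi_i}[-[\frac{n}{p_i l_i}],-1]}, \qquad \forall n \in \mathbb{N},\ \forall \alpha \in \mathbb{R},$$

for some $C_i > 0$. By this, we have

$$\lim_{n\to\infty} \left| \frac{1}{n} \log \omega_i\!\left( e^{\alpha H_{\hat\Phi_i}[-[\frac{n}{p_i l_i}],-1]} \right) - \frac{1}{n} \log \omega_i\!\left( e^{\alpha H_\Phi[-n,-1]} \right) \right| = 0 .$$

As $\omega_i$ is a $C^*$-finitely correlated state on $\bigotimes_{\mathbb{Z}} \bigl(M_d(\mathbb{C})^{\otimes p_i l_i}\bigr)$ generated by $(B_i, E_i^{(l_i)}, \rho_i)$ satisfying (30), Lemma 4.3 implies the existence and differentiability of

$$\lim_{n\to\infty} \frac{1}{n} \log \omega_i\!\left( e^{\alpha H_\Phi[-n,-1]} \right) = \lim_{n\to\infty} \frac{1}{n} \log \omega_i\!\left( e^{\alpha H_{\hat\Phi_i}[-[\frac{n}{p_i l_i}],-1]} \right).$$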

By the Gärtner-Ellis Theorem this proves the large deviation principle with good rate function $I_i$ for each $\omega_i$. Applying Lemma 4.1, we conclude the large deviation principle for $\omega$ with a good rate function $I(x) = \min_{1 \le i \le m} I_i(x)$.

5. Equivalence of Ensembles

An immediate consequence of Theorem 1.1 is the equivalence of ensembles considered in [DMN1]. Let $\Phi_1, \ldots, \Phi_K$ be the translation invariant finite range interactions and $X_{1,N}, \ldots, X_{K,N}$ the corresponding macroscopic observables: $X_{k,N} := \frac{1}{N} H_{\Phi_k}[1,N]$. Several notions of concentration of macroscopic observables were introduced in [DMN1]: A sequence of projections $\{P_N\}_N$, $P_N \in \mathcal{A}_{[1,N]}$, is said to be concentrating at $x \in \mathbb{R}^K$ whenever

$$\lim_{N\to\infty} \frac{\mathrm{Tr}_{[1,N]}\bigl( F(X_{k,N}) P_N \bigr)}{\mathrm{Tr}_{[1,N]}(P_N)} = F(x_k),$$

for all $F \in C(\mathbb{R})$ and $k = 1, \ldots, K$, and written $P_N \xrightarrow{mc} x$. In order to define concentration of states, we need a set $\mathcal{F}$ of maps $G$ from a set of all finite sequences of $\{1, \ldots, K\}$, $I$, to $\mathbb{C}$, such that

|G(k1 , . . . , km )|

m≥0 (k1 ,...,km )∈I

m k < ∞. i i=1

We define $G(X_N)$ by

$$G(X_N) := \sum_{m \ge 0} \sum_{(k_1,\ldots,k_m) \in I} G(k_1, \ldots, k_m)\, X_{k_1,N} \cdots X_{k_m,N} .$$

A sequence of states $\omega_N$ on $\mathcal{A}_{[1,N]}$ is concentrating at $x \in \mathbb{R}^K$ if

$$\lim_{N\to\infty} \omega_N(G(X_N)) = G(x),$$

for all $G \in \mathcal{F}$, and written $\omega_N \to x$. It was shown in [DMN1] that if $P_N \xrightarrow{mc} x$, then the states $\frac{\mathrm{Tr}_{[1,N]}(\,\cdot\, P_N)}{\mathrm{Tr}_{[1,N]}(P_N)} \to x$. Furthermore, we write $\omega_N \xrightarrow{1} x$ whenever $\lim_{N\to\infty} \omega_N(X_{k,N}) = x_k$. Three H-functions $H^{mc}$, $H^{can}$, $H_1^{can}$ were introduced in [DMN1]:

$$
\begin{aligned}
H^{mc}(x) &:= \sup_{P_N \xrightarrow{mc} x}\ \limsup_{N\to+\infty} \frac{1}{N} \log \mathrm{Tr}_{[1,N]}(P_N),\\
H^{can}(x) &:= \sup_{\omega_N \to x}\ \limsup_{N\to+\infty} \frac{1}{N} H(\omega_N),\\
H_1^{can}(x) &:= \sup_{\omega_N \xrightarrow{1} x}\ \limsup_{N\to+\infty} \frac{1}{N} H(\omega_N),
\end{aligned}
$$

where $H(\omega_N)$ is the von Neumann entropy of $\omega_N$. By definition, we have $H^{mc}(x) \le H^{can}(x) \le H_1^{can}(x)$. The following theorem was proven in [DMN1].

Theorem 5.1. Assume that there exists a sequence of states $\omega_N$ on $\mathcal{A}_{[1,N]}$ with density matrices $\sigma_N$, satisfying the following conditions:

(i) For all $\delta > 0$ and $k$, there exist $C_k(\delta) > 0$ and $N_k(\delta) \in \mathbb{N}$ such that

$$\int_{x_k-\delta}^{x_k+\delta} \omega_N\bigl(Q_N^{k}(d\lambda)\bigr) \ge 1 - e^{-C_k(\delta) N}, \qquad \forall N \ge N_k(\delta),$$

where $Q_N^{k}$ is the spectral projection of $X_{k,N}$.

(ii) For all $\delta > 0$,

$$\lim_{N\to\infty} \frac{1}{N} \log \int_{-\delta}^{\delta} \omega_N\bigl(\tilde Q_N(d\lambda)\bigr) = 0,$$

where $\tilde Q_N$ is the spectral projection of $\frac{1}{N}\bigl(\log \sigma_N - \mathrm{Tr}_{[1,N]}\, \sigma_N \log \sigma_N\bigr)$.

(iii) $H_1^{can}(x) = \lim_{N\to\infty} \frac{1}{N} H(\omega_N)$.

Then we have

$$H^{mc}(x) = H^{can}(x) = H_1^{can}(x).$$

This means the equivalence of microcanonical ensemble and canonical ensemble. Let us consider a sequence of states of the form

$$\omega_N(A) = \frac{\mathrm{Tr}_{[1,N]}\bigl( e^{\sum_k \lambda_k H_{\Phi_k}[1,N]} A \bigr)}{\mathrm{Tr}_{[1,N]}\bigl( e^{\sum_k \lambda_k H_{\Phi_k}[1,N]} \bigr)}, \qquad \lambda_k \in \mathbb{R}. \tag{31}$$

Theorem 1.1 and a bound similar to (7) guarantee that $\omega_N$ concentrates at $x$ for some $x \in \mathbb{R}^K$ and satisfies conditions (i), (ii) of Theorem 5.1. Furthermore, it can be shown that a state of this form satisfies (iii) [DMN1]. Therefore, applying Theorem 5.1, we obtain the equivalence of ensembles in the one dimensional quantum spin system:

Corollary 5.1. If there exists a sequence of states $\omega_N$ of the form (31) such that $\omega_N \xrightarrow{1} x \in \mathbb{R}^K$, then $H^{mc}(x) = H^{can}(x) = H_1^{can}(x)$.

Acknowledgement. The author thanks Professor L. Rey-Bellet and Dr. W. De Roeck for interesting discussions. The present research is supported by JSPS Grant-in-Aid for Young Scientists (B) and Hayashi Memorial Foundation for Female Natural Scientists.


A. Analyticity of Local Elements

All the results in this section are straightforward application of arguments in [A1]. For the readers' convenience, we sketch the proof here. Throughout this section, we fix $3 \le r \in \mathbb{N}$. For $R > 0$, we denote by $B_R$ an open ball in $\mathbb{C}$ centered at the origin with radius $R$. Let $I$ be a subset of $\mathbb{Z}$, and $\Phi$ a translation invariant finite range interaction with range diameter less than $r$. The derivation

$$i\delta(H_\Phi(I))(Q) := i \sum_{X \subset I} [\Phi(X), Q], \qquad Q \in \mathcal{A}_{loc},$$

defined on $\mathcal{A}_{loc}$ is closable and generates a strongly continuous one parameter group of automorphisms on $\mathcal{A}$. In order to show the dependence of $I$ explicitly, we use the notation of H. Araki and denote this one parameter group by $\exp(it\delta(H_\Phi(I)))$. In one dimension, local elements are entire analytic for this dynamics:

Theorem A.1. Let $n_1, n_2, N_1, N_2$ be integers such that $n := n_1 + n_2 + 1 \ge r$, $[-n_1, n_2] \subset [-N_1, N_2]$, and $-N_1 \le N_2$. Then $Q \in \mathcal{A}_{[-n_1,n_2]}$ is an entire analytic element for $\exp(it\delta(H_\Phi(I)))$ for any $I \subset \mathbb{Z}$. Furthermore, we have

$$\| \exp(\beta\delta(H_\Phi(I)))(Q) \| \le F_n(2|\beta| \|\Phi\|)\, \|Q\|, \tag{32}$$

and

$$\| \exp(\beta\delta(H_\Phi(I)))(Q) - \exp(\beta\delta(H_\Phi(I \cap [-N_1, N_2])))(Q) \| \le F_n^{\min\{N_2+n_1,\, N_1+n_2\}+1}(2|\beta| \|\Phi\|)\, \|Q\|, \tag{33}$$

for all $\beta \in \mathbb{C}$. Here, $F_n$ is a function on $\mathbb{R}$ given by

$$F_n(x) := \exp\left[ (n - r + 1)x + g(x) \right], \qquad g(x) := 2 \sum_{k=1}^{r} k^{-1} \{ \exp(kx) - 1 \}, \tag{34}$$

and $F_n^{L}$ is a function such that

FnL+1 (x) ≤ FnL (x), FnLr +n (x) ≤ ((L + 1)!)−1 [g(x)] L+1 Fn (x), ∀x > 0, 0 ≤ L ∈ N. (35) Proof. Basically, this is proven in [A1] (Theorem 4.2). A slight difference of the setting (for example, N1 , N2 are replaced by −N , n + N , N ∈ N in [A1]) causes no difference of the proof because of the translation invariance of . For an element Q in A I , a0 ∈ Z, and 0 < θ < 1, we say that Q allows a decomposition into local elements centered at a0 with rate θ , if there exists a sequence of local 0 elements := (Q θ,a k )k such that Q =

∞

0 Q θ,a k ,

0 Q θ,a ∈ A[a0 −k,a0 +k]∩I , k

k=r

0 −k < ∞. Cθ,a0 , (Q) := sup Q θ,a k θ

(36)

k

0 Furthermore, we say that a decomposition is self-adjoint if all Q θ,a are self-adjoint. k We denote by A1 the subalgebra given by Q ∈ Fθ ∩ A[1,∞) : 0 < ∀θ < 1 .


Lemma A.1. Let Q be an element of A[1,∞) , a0 ∈ Z, and 0 < θ < 1. Suppose that there exist a positive number C and r ≤ l ∈ N satisfying the following condition: for all l ≤ N ∈ N, there exists Q N ∈ A[a0 −N ,a0 +N ]∩[1,∞) such that Q − Q N ≤ Cθ N .

(37)

Then Q has a decomposition into local elements centered at a0 with rate θ , such that QN =

N

−l −1 0 Q θ,a k , l ≤ N , C θ,a0 , (Q) ≤ C + max{Q θ , Cθ } < ∞. (38)

k=r

If each $Q_N$ is self-adjoint, the decomposition can be taken to be self-adjoint. In particular, all $Q \in \mathcal{A}_1$ has a decomposition into local elements centered at $a_0 = 0$ for any $0 < \theta < 1$, with $C \equiv |Q|_\theta$ in (38).

Proof. This follows by taking

$$Q_k^{\theta,a_0} = \begin{cases} 0, & r \le k < l, \\ Q_l, & k = l, \\ Q_k - Q_{k-1}, & l < k. \end{cases}$$

If $Q \in \mathcal{A}_1$, we take $Q_N = Q^{(N)}$ of Lemma C.1.
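The constant in (38) can be checked directly from (37). The following is a sketch, assuming the decomposition is the one chosen in the proof ($Q_k^{\theta,a_0} = 0$ for $r \le k < l$, $Q_l^{\theta,a_0} = Q_l$, and $Q_k^{\theta,a_0} = Q_k - Q_{k-1}$ for $k > l$) and that $0 < \theta < 1$:

```latex
\begin{align*}
k = l:&\quad \|Q_l\|\,\theta^{-l}
   \le \bigl(\|Q\| + \|Q - Q_l\|\bigr)\theta^{-l}
   \le \|Q\|\,\theta^{-l} + C,\\
k > l:&\quad \|Q_k - Q_{k-1}\|\,\theta^{-k}
   \le \bigl(\|Q - Q_k\| + \|Q - Q_{k-1}\|\bigr)\theta^{-k}
   \le C + C\,\theta^{-1}.
\end{align*}
```

The terms with $r \le k < l$ vanish, so taking the supremum over $k$ gives the bound $C + \max\{\|Q\|\theta^{-l}, C\theta^{-1}\}$ stated in (38).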
Note that if Q ∈ A[1,∞) has a decomposition (36) for some a0 ∈ Z+ , then Q is in Fθ . Lemma A.2. For a0 ∈ Z, R > 0 and 0 < θ < e−4R|| , let Q be an element in A with a decomposition into local elements centered at a0 with rate θ . Then for any I ⊂ Z, exp(itδ (H (I )))(Q) has an analytic extension to B R such that exp(βδ (H (I )))(Q) ≤ Cθ,a0 , (Q)C1,R,θ,|| , |β| ≤ R.

(39)

Furthermore, for N ∈ N, N θ,a0 Q k ) exp(βδ (H (I )))(Q) − exp(βδ (H (I ∩ [a0 − 2N , a0 + 2N ])))( k=r

N ≤ Cθ,a0 , (Q)C2,R,θ,|| θ e4R|| , |β| ≤ R. (40) Here, positive constants C1,R,θ,|| and C2,R,θ,|| depend only on R, θ , and ||. In particular, any Q ∈ A1 is entire analytic for eitδ (I ) . Proof. The first part is an immediate consequence of Theorem A.1. Let Q be an element of A1 . For any R > 0, take 0 < θ < e−4R|| . Then by Lemma A.1, Q admits a decomposition into local elements centered at a0 = 0 with rate θ . Hence exp(itδ (H (I )))(Q) has an analytic extension in A to B R . As this holds for all R > 0, exp(itδ (H (I )))(Q) has an entire analytic extension in A.


Lemma A.3. For any element Q in A1 , I ⊂ [1, ∞), and β ∈ C, the element exp(βδ (H (I )))(Q) belongs to Fθ for all 0 < θ < 1. Furthermore, C β → exp(βδ (H (I )))(Q) ∈ Fθ is Fθ -entire analytic. For any R > 0 and 0 < θ < 1, fix some 0 < θ < e−4R|| (θ )3 . Let be a decomposition of Q into local elements centered at a0 = 0 with rate θ . Then we have |exp(βδ (H (I )))(Q)|θ ≤ C3,R,θ,θ ,|| Cθ,0, (Q), |β| < R, and

N θ,0 Q k ) exp(βδ (H (I )))(Q) − exp(βδ (H (I ∩ [0, 2N ])))( k=r

N

≤ C4,R,θ,θ ,|| Cθ,0, (Q)(θ )

|β| < R.

(41)

θ

(42)

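Here, positive constants $C_{3,R,\theta,\theta',\|\Phi\|}$ and $C_{4,R,\theta,\theta',\|\Phi\|}$ depend only on $R$, $\theta$, $\theta'$ and $\|\Phi\|$.

Proof. Note that for $A \in \mathcal{A}_{[1,\infty)}$ and $A_j \in \mathcal{A}_{[1,j-1]}$, $j \in \mathbb{N}$, we have

$$\mathrm{var}_j(A) \le 2\|A\|, \tag{43}$$

$$\mathrm{var}_j(A) = \mathrm{var}_j(A - A_j) \le 2\|A - A_j\|. \tag{44}$$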

Fix any R > 0 and 0 < θ < 1, and take 0 < θ < e−4R|| (θ )3 . Let be a decomposition of Q into local elements centered at a0 = 0 with rate 0 < θ < 1. Lemma A.2 implies l (θ )−2l exp(βδ (H (I )))(Q) − exp(βδ (H (I ∩ [0, 2l])))( Q θ,0 ) k k=r

l

≤ (θ ) Cθ,0, (Q)C2,R,θ,|| , |β| < R, l exp(βδ (H (I ∩ [0, 2l])))( Q θ,0 k ) ∈ A[1,2l] , r ≤ ∀l ∈ N.

(45)

k=r

Applying (45) for l =

j−1 2

and using (44), we obtain

(θ )− j var j (exp(βδ (H (I )))(Q)) ≤ C1,R,θ,θ ,|| · (θ )

j−1 2

Cθ,0, (Q), |β| < R, (46)

for a positive constant C1,R,θ,θ ,|| which depends only on R, θ, θ , ||. Hence we obtain (41) and exp(βδ (H (I )))(Q) is an element in Fθ for |β| < R. N Q θ,0 Next, note that if 2N < j, exp(βδ (H (I ∩ [0, 2N ])))( k=r k ) is in A[1, j−1] . Hence we have N θ,0 −j (θ ) var j exp(βδ (H (I )))(Q) − exp(βδ (H (I ∩ [0, 2N ])))( Qk ) −j

= (θ )

k=r −1 N var j (exp(βδ (H (I )))(Q)) ≤ C1,R,θ,θ ,|| θ (θ ) Cθ,0, (Q),

|β| < R.


On the other hand, if 2N ≥ j, applying (45) for l = N and using (43), −j

(θ )

var j exp(βδ (H (I )))(Q) − exp(βδ (H (I ∩ [0, 2N ])))(

N

Q θ,0 k )

k=r

≤ 2(θ ) N Cθ,0, (Q)C2,R,θ,|| , |β| < R. Combining these estimates, we obtain (42) with C4,R,θ,θ ,|| := max{2C2,R,θ,|| , (θ )−1 C1,R,θ,θ ,|| }. Now we show that exp(βδ (H (I )))(Q) is Fθ -analytic on B R . For each N ∈ N, N exp(βδ (H (I ∩ [0, 2N ])))( k=r Q θ,0 ) is in A[0,2N ] ⊂ Fθ and the map B R β N k θ,0 → exp(βδ (H (I ∩ [0, 2N ])))( k=r Q k ) ∈ Fθ is analytic. Furthermore, the N Q θ,0 bounded on B R i.e., sequence exp(βδ k ) is uniformly (H (I ∩ [0, 2N ])))( k=r θ,0 N sup|β| N . We have Cθ,0, N (Q N ) ≤ Cθ,0, (Q). Applying (41) to this decomposition, we obtain a uniform bound.) Hence {exp(βδ (H (I ∩ [0, 2N ])))

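$\bigl(\sum_{k=r}^{N} Q_k^{\theta,0}\bigr)\}_N$ is a uniformly bounded sequence of $\mathcal{F}_\theta$-valued analytic functions on $B_R$. From (42), the $\mathcal{F}_\theta$-valued function $\exp(\beta\delta(H_\Phi(I)))(Q)$ is approximated by this sequence in $\mathcal{F}_\theta$-norm. Therefore, $\exp(\beta\delta(H_\Phi(I)))(Q)$ is also $\mathcal{F}_\theta$-analytic on $B_R$. As $R > 0$ is arbitrary, $\exp(\beta\delta(H_\Phi(I)))(Q)$ is $\mathcal{F}_\theta$-entire analytic.

Next for an $\mathcal{F}_\theta$-valued entire analytic function $A(z)$, we define

$$
\begin{aligned}
F_1(A, z) &:= \sum_{n=0}^{\infty} z^{n} \int_0^1 du_1 \int_0^{u_1} du_2 \cdots \int_0^{u_{n-1}} du_n\, A(u_n z) \cdots A(u_1 z),\\
F_2(A, z) &:= \sum_{n=0}^{\infty} (-z)^{n} \int_0^1 du_1 \int_0^{u_1} du_2 \cdots \int_0^{u_{n-1}} du_n\, A(u_1 z) \cdots A(u_n z).
\end{aligned}
\tag{47}
$$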

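In particular, for $P = P^* \in \mathcal{A}$, $F_1(e^{(\cdot)\delta(H_\Phi(I))}(P), it)$ is a co-cycle related to the perturbed automorphism group $e^{it\delta(H_\Phi(I)+P)}$ of $e^{it\delta(H_\Phi(I))}$. By a routine argument, we obtain the following:

Lemma A.4. For $\mathcal{F}_\theta$-valued entire analytic functions $A(z)$, $B(z)$, each $F_i(A, z)$, $i = 1, 2$ is in $\mathcal{F}_\theta$ and the map $\mathbb{C} \ni z \mapsto F_i(A, z) \in \mathcal{F}_\theta$ is entire analytic. Furthermore, we have

$$
\begin{aligned}
\| F_i(A, z) \| &\le \exp\Bigl( |z| \sup_{|w| \le |z|} \|A(w)\| \Bigr),\\
| F_i(A, z) |_\theta &\le \exp\Bigl( 2|z| \sup_{|w| \le |z|} |A(w)|_\theta \Bigr),\\
\| F_i(A, z) - F_i(B, z) \| &\le |z| \exp\Bigl( |z| \sup_{|w| \le |z|} \bigl( \|A(w)\| + \|B(w)\| \bigr) \Bigr) \sup_{|w| \le |z|} \|A(w) - B(w)\|,\\
| F_i(A, z) - F_i(B, z) |_\theta &\le |z| \exp\Bigl( 2|z| \sup_{|w| \le |z|} \bigl( |A(w)|_\theta + |B(w)|_\theta \bigr) \Bigr) \sup_{|w| \le |z|} |A(w) - B(w)|_\theta .
\end{aligned}
\tag{48}
$$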
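The first bound in (48) can be read off term by term from the defining series. A sketch for $i = 1$, assuming (47) is the time-ordered series $F_1(A,z) = \sum_{n \ge 0} z^n \int_0^1 du_1 \int_0^{u_1} du_2 \cdots \int_0^{u_{n-1}} du_n\, A(u_n z) \cdots A(u_1 z)$, and writing $M := \sup_{|w| \le |z|} \|A(w)\|$ (a shorthand introduced here, not notation from the paper):

```latex
% M is a hypothetical shorthand for the supremum of ||A(w)|| over |w| <= |z|.
\begin{align*}
\|F_1(A,z)\|
 &\le \sum_{n=0}^{\infty} |z|^{n} \int_0^1 du_1 \int_0^{u_1} du_2 \cdots \int_0^{u_{n-1}} du_n\,
      \|A(u_n z)\| \cdots \|A(u_1 z)\| \\
 &\le \sum_{n=0}^{\infty} |z|^{n}\, M^{n} \cdot \frac{1}{n!}
  \;=\; \exp\bigl(|z|\, M\bigr).
\end{align*}
```

Here $|u_j z| \le |z|$ on the domain of integration, and the ordered simplex $0 \le u_n \le \cdots \le u_1 \le 1$ has volume $1/n!$. The case $i = 2$ is identical, since only the order of the factors and the sign of $z$ change.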

Lemma A.5. Let a0 be a positive integer and I ⊂ [1, ∞). Let Q, P = P ∗ be elements in A[1,∞) , which allow for any rate 0 < θ < 1, decompositions Q,θ , P,θ into local elements centered at a0 . Assume that P,θ can be taken to be self-adjoint. Let eitδ(H (I )+P) be a strongly continuous one parameter group of automorphisms over A[1,∞) generated by i (δ (H (I )) + [P, ·]). Then for all 0 < θ < 1, eitδ(H (I )+P) (Q) is in Fθ and the map R t → eitδ(H (I )+P) (Q) ∈ Fθ has an Fθ -entire analytic extension e zδ(H (I )+P) (Q). Furthermore, for any R > 0 and 0 < θ < e−4R|| , we have

zδ(H (I )+P) (Q) ≤ C1,R,θ,|| exp C5,R,θ,|| Cθ,a0 , P,θ (P) Cθ,a0 , Q,θ (Q), (49) e

N N θ,a zδ H (I ∩[a0 −2N ,a0 +2N ])+ k=r Pk 0 θ,a0 zδ(H (I )+P) (Q) − e ( Q k ) e k=r

≤ C6,R,θ,|| exp C7,R,θ,|| Cθ,a0 , P,θ (P) Cθ,a0 , P,θ (P) + 1 N

Cθ,a0 , Q,θ (Q) θ e4R|| , (50) |z| ≤ R. Here, positive constants C5,R,θ,|| , C6,R,θ,|| , and C7,R,θ,|| depend only on R, θ , and ||. Proof. This follows from identity

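$$e^{it\delta(H_\Phi(I)+P)}(Q) = F_1\bigl( e^{(\cdot)\delta(H_\Phi(I))}(P), it \bigr)\; e^{it\delta(H_\Phi(I))}(Q)\; F_2\bigl( e^{(\cdot)\delta(H_\Phi(I))}(P), it \bigr),$$

and the estimation of the analytic continuation of the right hand side using Lemma A.3 and Lemma A.4.

If $I$ is finite, then

$$
\begin{aligned}
\exp(\beta\delta(H_\Phi(I)))(Q) &= e^{\beta H_\Phi(I)}\, Q\, e^{-\beta H_\Phi(I)},\\
F_1(\exp((\cdot)\delta(H_\Phi(I)))(Q), \beta) &= e^{\beta(H_\Phi(I)+Q)}\, e^{-\beta H_\Phi(I)},\\
F_2(\exp((\cdot)\delta(H_\Phi(I)))(Q), \beta) &= e^{\beta H_\Phi(I)}\, e^{-\beta(H_\Phi(I)+Q)},
\end{aligned}
$$

and satisfies the relations:

$$
\begin{aligned}
&F_1(\exp((\cdot)\delta(H_\Phi(I)+Q))(Q_1+Q_2), \beta) \\
&\qquad = F_1(\exp((\cdot)\delta(H_\Phi(I)+Q+Q_1))(Q_2), \beta)\; F_1(\exp((\cdot)\delta(H_\Phi(I)+Q))(Q_1), \beta),\\
&F_1(\exp((\cdot)\delta(H_\Phi(I)+Q))(Q_1), \beta)\; \exp(\beta\delta(H_\Phi(I)+Q))(Q_2) \\
&\qquad = \exp(\beta\delta(H_\Phi(I)+Q+Q_1))(Q_2)\; F_1(\exp((\cdot)\delta(H_\Phi(I)+Q))(Q_1), \beta),\\
&F_1(\exp((\cdot)\delta(H_\Phi(I)+Q))(Q_1), \beta)\; F_2(\exp((\cdot)\delta(H_\Phi(I)+Q))(Q_1), \beta) \\
&\qquad = F_2(\exp((\cdot)\delta(H_\Phi(I)+Q))(Q_1), \beta)\; F_1(\exp((\cdot)\delta(H_\Phi(I)+Q))(Q_1), \beta) = 1,
\end{aligned}
\tag{51}
$$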


for all self-adjoint elements Q, Q 1 in A1 and all element Q 2 in A1 . As Fi (exp((·)δ (H (I )))(Q), β) and exp(βδ (H (I )))(Q) are approximated by local elements by Lemma A.4 and Lemma A.5, the relations (51) hold for general I . We use the following notation: Hˆ r (n) := (I ) ∈ A[1,∞) ∩ Aloc , I ⊂[1,∞),I ∩[1,n]=φ

Hˆ l (n)

:=

(I )

∈ A(−∞,−1] ∩ Aloc ,

I ⊂(−∞,−1],I ∩[−n,−1]=φ

Wr (n) := Wl (n)

:=

(I )

∈ A[1,∞) ∩ Aloc ,

I ⊂[1,∞),I ⊂[1,n−1],I ⊂[n+1,∞)

(I )

∈ A(−∞,−1] ∩ Aloc . (52)

I ⊂(−∞,−1],I ⊂[−n+1,−1],I ⊂(−∞,−n−1]

Let and be translation invariant finite range interactions with range diameter less than r . Lemma A.6. For fixed β ∈ R, define the A-valued function a(α) on C by

α β a(α) := e 2 δ(H [1,∞)) F1 e(·)δ(H [2,∞)) Hˆ r (1) , − 2

α (·)δ(H [2,∞)) r , α ∈ C. Hˆ (1) , F1 e 2

(53)

Then for any 0 < θ < 1 and α ∈ C, a(α) belongs to Fθ and C α → a(α) ∈ Fθ is Fθ -entire analytic. For each α ∈ C, a(α) is invertible in Fθ and

α a(α)−1 = F2 e(·)δ(H [2,∞)) Hˆ r (1) , 2

α β δ(H [1,∞)) (·)δ(H [2,∞)) F2 e . e2 Hˆ r (1) , − 2

Proof. By Lemma A.3 and Lemma A.4, F1 e(·)δ(H [2,∞)) Hˆ r (1) , − β2 is an element

α in A1 . Therefore, by Lemma A.3, e 2 δ(H [1,∞)) F1 e(·)δ(H [2,∞)) Hˆ r (1) , − β2

α is a well-defined element in Fθ and C α → e 2 δ(H [1,∞)) F1 e(·)δ(H [2,∞)) Hˆ r (1) , ∈ Fθ is entire analytic for all 0 < θ < 1. On the other hand, by Lemma A.3 and − β2

Lemma A.4, C α → F1 e(·)δ(H [2,∞)) Hˆ r (1) , α2 ∈ Fθ is entire analytic. Hence a(α) is an Fθ -valued entire analytic function for all 0 < θ < 1. The existence of an inverse follows from (51). Lemma A.7. For n ∈ N and α ∈ C, define a˜ n (α) and an (α) by a˜ n (α) := a(α)γ1 (a(α))γ2 (a(α)) · · · γ(n−1) (a(α)), β

α

an (α) := a˜ n (α)e 2 H [1,n−1] e− 2 H [1,n−1] .


Then

α β a˜ n (α) = e 2 δ(H [1,∞)) F1 e(·)δ(H [n+1,∞)) Hˆ r (n) , − 2

α (·)δ(H [n+1,∞)) r , Hˆ (n) , F1 e 2

α β r (n) δ(H [1,∞)) (·)δ H [1,∞)−W r ( ) F1 e an (α) = e 2 W (n) , − 2

α r r (n) (·)δ (( H [1,∞)−W )) . W (n) , F1 e 2

For any K > 0, there exists a positive constant C K such that sup sup an (α) , sup sup (an (α))−1 < C K . |α|
|α|
Furthermore, for Q ∈ Aloc , we have lim [an (α), Q] = 0,

n→∞

lim an (α)−1 , Q = 0.

n→∞

(54)

(55)

(56)

(57)

Proof. Equation (54) follows from (51) by induction, using relations δ(H [n, ∞) + Hˆ r (n − 1)) = δ(H [1, ∞)), Hˆ r (n) + γ(n) ( Hˆ r (1)) = Hˆ r (n + 1), and δ(H [n + 2, ∞) + γ(n) ( Hˆ r (1)) = δ(H [n + 1, ∞)). Equation (55) follows from (51), Hˆ r (n) − H [1, n − 1] = Wr (n), e z(H [1,n−1]) = F1 (e(·)δ(H [n+1,∞)+H [1,n−1]) )(H [1, n − 1]); z), and δ(H [n + 1, ∞) + H [1, n − 1]) = δ(H [1, ∞) − Wr (n)). To prove (56), we first note that Wr (n) has a self-adjoint decomposition W,θ = r

θ,n ( W (n) k ) into local elements (36) centered at a0 = n with rate 0 < θ < 1: r

θ,n r W (n), k = r , W (n) k =: 0, k >r Cθ,n,W,θ (Wr (n)) ≤ (2r + 1) || θ −r ,

(58)

for all 0 < θ < 1. From this decomposition, applying Lemma A.5, we observe that

r e zδ ( H (I )−W (n)) Wr (n) is entire analytic in Fθ for all 0 < θ < 1. Furthermore, for all R > 0, 0 < θ < e−4R|| and I ⊂ [1, ∞), we have

zδ ( H (I )−Wr (n)) r W (n) ≤ C2,R,θ,|| , (59) e

N N r zδ H [1,∞)−W r (n) r ) W (n) − e zδ H [1,∞)∩[n−2 2 ,n+2 2 ] −W (n) W r (n) e (

≤

C3,R,θ,||

θ e4R||

N 2

,

N ≥ 2r, |z| < R.

(60)

Here, C2,R,θ,|| and C3,R,θ,|| are positive constants which depend only on R, θ , and ||. Then from Lemma A.4, for R > 0 and 0 < θ < e−4R|| , we have

r , (61) F1 e(·)δ ( H [1,∞)−W (n)) Wr (n) , z ≤ C4,R,θ,||


r F1 e(·)δ ( H [1,∞)−W (n)) Wr (n) , z

(·)δ H [1,∞)∩[n−2 N2 ,n+2 N2 ] −Wr (n) r W (n) ; z −F1 e ≤

C5,R,θ,||

θe

4R||

N 2

,

N ≥ 2r, |z| < R,

(62)

and C5,R,θ,|| which depend only on R, θ and with positive constants C4,R,θ,|| ||.

r Now, we claim that for any 0 < θ < 1, F1 e(·)δ ( H [1,∞)−W (n)) Wr (n) , − β2 has

a decomposition n,θ into local elements centered at n with rate θ which is uniformly

r bounded in n, i.e., supn∈N Cθ ,n,n,θ (F1 e(·)δ ( H [1,∞)−W (n)) Wr (n) , − β2 ) <

< θ < e−4|β||| θ 3 . Applying (62) +∞. For any 0 < θ < 1, take 0

to = ,

r (n) β (·)δ H [1,∞)−W r ( ) W (n) , − 2 satisfies the θ and R = |β|, we observe that F1 e condition of Lemma A.1 (37) for θ and a0 = n, with C = C5,|β|,θ,|| , l = 2r , and

β (·)δ H [1,∞)∩[n−2 N2 ,n+2 N2 ] −Wr (n) r W (n) , − ; Q N := F1 e 2 ∈ A[n−N ,n+N ]∩[1,∞) .

Then by the last lemma, it has a decomposition n,θ into local elements centered at n with rate θ such that:

β r (n) (·)δ H [1,∞)−W r ( ) Cθ ,n,n,θ F1 e W (n) , − 2 ≤ C5,|β|,θ,|| + max{C4,|β|,θ,|| θ

−2r

, C5,|β|,θ,|| θ

−1

} < ∞.

Hence, applying Lemma A.2, we have α δ(H (I ))

r β (·)δ ( H [1,∞)−Wr (n)) e 2 ≤ C F1 e W (n) , − 6,K ,|β|,θ,θ ,||,|| , 2

(63)

α δ(H (I ))

r β (·)δ ( H [1,∞)−Wr (n)) e 2 F e (n) , − W 1 2

N N r

β r α [1,∞)∩[n−2 (·)δ H 2 ,n+2 2 ] −W (n) −e 2 δ(H (I )∩[n−2N ,n+2N ]) F1 e W (n) , − ; 2

N 4K || , |α| < K , (64) ≤ C7,K ,|β|,θ,θ ,||,|| θ e

for all K > 0, 0 < θ < e−4K || , 0 < θ < e−4|β||| θ 3 , and I ⊂ [1, ∞). Here, C6,K ,|β|,θ,θ ,||,|| , C 7,K ,|β|,θ,θ ,||,|| are constants which depend only on K , |β| , θ, θ , || and ||. Combining (63) with (61), we obtain from (55) the bound sup sup an (α) ≤ C8,K ,|β|,θ,θ ,||,|| C 4,K ,θ ,|| ,

|α|

for all K > 0 with 0 < θ < e−4K || and 0 < θ < e−4|β||| θ 3 . The bound for an (α)−1 follows by the same argument. To prove (57), define anN (α) :=

β (·)δ H [1,∞)∩[n−2 N2 ,n+2 N2 ]−Wr (n) r F1 e W (n) , − e 2

r (n)

α (·)δ H [1,∞)∩[n−2 N2 ,n+2 N2 ]−W F1 e Wr (n) , 2 ∈ A[n−2N ,n+2N ]∩[1,∞) , N ≥ 2r. α 2 δ(H [1,∞)∩[n−2N ,n+2N ])

Using (55), (61), (62), (63), and (64), we have an (α) − anN (α)

N 4K || ≤ C7,K θ e C4,K ,|β|,θ,θ ,||,|| ,θ ,||

4K || +C6,K ,|β|,θ,θ ,||,|| C 5,K ,θ ,|| θ e

N 2

→ 0, N → ∞, |α| < K , for all K > 0 with 0 < θ < e−4K || , and 0 < θ < e−4|β||| θ 3 . From this, we have n an (α) − an 3 (α) → 0, n → ∞. Hence, for a local element A, we have lim [an (α), A] = lim

n→∞

n→∞

! n an 3 (α), A = 0.

B. Proof of Theorem 2.1 In this section, we sketch the proof of Theorem 2.1. Although we are considering a generalized form of L (B = Md (C), and E(a ⊗ b) := d −1 T r Md (C) (a)b in [M]), most parts of the proof can be carried out parallel to [M]. Instead of repeating the details, we indicate the corresponding part of [M]. Lemma B.1. Let L : B ⊗ A[1,∞) → B ⊗ A[1,∞) be a linear operator satisfying the following: (a) L is completely positive. (b) L has an invariant state ϕ. (c) There are positive constants a1 , a2 such that n L (Q) ≤ a1 Q + a2 θ n Qθ , ∀Q ∈ Fθ , ∀n ∈ N. θ


(d) There exists a positive constant K such that the following bound is valid: Let Q be any strictly positive element in Fθ . There exists a positive integer N = N (Q) satisfying L n (Q) ≤ K inf L n (Q), ∀n ≥ N . (e) L is unital. Then there are positive constants C1 and δ1 such that n L (Q) − ϕ(Q)1 ≤ C1 e−δ1 n |Q|θ , ∀Q ∈ Fθ , ∀n ∈ N. θ

(65)

Proof. For any strictly positive element Q in Fθ , we have lim L n (Q) − ϕ(Q)1 = 0

(66)

n→∞

([M] Lemma 2.8). To see this, note that a set {L n (Q)}n∈N is a subset of the C ∗ -norm compact set {A ∈ Fθ : |A|θ ≤ a1 Q + a2 Qθ } by (c) (see Lemma C.2). Therefore, it has a convergent subsequence with a limit in Fθ . Let {L n k (Q)} be any convergent n k ¯ subsequence, L (Q) − Q → 0, with 0 ≤ Q¯ ∈ Fθ . We show Q¯ = ϕ(Q) · 1. Tak¯ − Q¯ → ing suitable 0 < q(l) = n k(l+1) − n k(l) ∈ N, l ∈ N, we have L q(l) ( Q) 0, q(l) → ∞, l → ∞. (See the explanation before (2.23) of [M].) On the other hand, ¯ n is an increasing sequence and { L n ( Q) ¯ } a decreasing from (a) and (e), {inf L n ( Q)} sequence. (See (2.20) of [M].) Therefore, we have ¯ ≤ lim L q(l) ( Q) ¯ = Q¯ ≤ L n ( Q) ¯ ≤ Q¯ inf Q¯ ≤ inf L n ( Q) l→∞

¯ and L n ( Q) ¯ = Q¯ for all n ∈ N. for all n ∈ N. Hence we obtain inf Q¯ = inf L n ( Q) Then from (d), we have 0 ≤ Q¯ − λ = L n ( Q¯ − λ) ≤ K inf L n ( Q¯ − λ) = K (inf Q¯ − λ), ¯ This ¯ for all λ < inf Q¯ and n ≥ N ( Q−λ). Taking λ ↑ inf Q¯ limit, we get Q¯ = inf( Q). n ¯ = limk→∞ ϕ(L k (Q)) = ϕ(Q), means Q¯ is proportional to 1. From (b), we obtain ϕ( Q) and Q¯ = ϕ(Q)1. As this argument applies to any subsequence of L n (Q), we obtain (66) for strictly positive Q ∈ Fθ . As L(1) = 1, we get (66) for all selfadjoint Q in Fθ . Fix any ε > 0. From (66), a compact set {Q = Q ∗ ∈ Fθ : |Q|θ ≤ 1} can be covered by a finite number of open sets Un := {Q ∈ B ⊗ A[1,∞) : L n (Q) − ϕ(Q) < ε }. Because L is a completely positive unital map, L is contractive and we have U1 ⊂ U2 ⊂ U3 · · ·. Therefore, there exists a positive integer N1 ∈ N such that {Q = Q ∗ ∈ Fθ : |Q|θ ≤ 1} ⊂ U N1 , i.e., n L (Q) − ϕ(Q)1 ≤ ε |Q|θ , ∀Q = Q ∗ ∈ Fθ , ∀n ≥ N1 . (67) (See [M], Lemma 2.9.) Hence for any 0 < ε < 1 we have from (c) and (67), 2N0 L (Q) − ϕ(Q)1 = L N0 (L N0 (Q) − ϕ(Q)1) θ θ N0 N0 N0 ≤ a1 L (Q) − ϕ(Q)1 + a2 θ L (Q) ≤ ε |Q|θ , Q = Q ∗ ∈ Fθ θ


for N0 large enough. For n = 2m N0 + r , we thus obtain n L (Q) − ϕ(Q) ≤ Cεm |Q|θ , θ

Q = Q ∗ ∈ Fθ ,

with some constant C > 0. For Q = Q 1 + i Q 2 ∈ Fθ with Q 1 , Q 2 self-adjoint, we have |Q 1 |θ , |Q 2 |θ ≤ |Q|θ . Therefore, we obtain (65) for all Q ∈ Fθ . Our transfer operator L does not satisfy the condition (e). However, it turns out to be similar to an operator satisfying (a)-(e): Lemma B.2. Let L : B ⊗ A[1,∞) → B ⊗ A[1,∞) be a linear operator satisfying (a)-(d) of Lemma B.1 and (f) There exists m, M > 0 such that m ≤ L n (1) ≤ M, ∀n ∈ N. Then there exists an element h in Fθ such that L(h) = h, m ≤ h ≤ M, ϕ(h) = 1.

(68)

Furthermore, an operator L h defined by 1

1 1 1 L h (Q) := h − 2 L h 2 Qh 2 h − 2 , Q ∈ B ⊗ A[1,∞) , satisfies (a)-(e) of Lemma B.1 with an invariant state

1 1 ϕh (Q) := ϕ h 2 Qh 2 . Therefore, from (72) and Lemma B.1, we have n L (Q) − ϕ(Q)h ≤ C1 e−δ1 n |Q|θ , ∀Q ∈ Fθ , ∀n ∈ N, θ for some constants C1 , δ1 > 0. Proof. By (c), C ∗ -norm closure S of the convex hull of {L n (1) : n ∈ N} is a compact convex set. As (a) and (f) imply L = L(1) ≤ M, the operator L maps S continuously (in C ∗ -norm) into S. Therefore, by the Schauder-Tychonov fixed point theorem, there exists h ∈ S such that L(h) = h. This h satisfies (68) (Lemma 2.5 of [M]). As 1 1 h > 0 is an element in Fθ , h 2 , h − 2 are also in Fθ .(See [M], p. 1191.) The properties (a), (b), (e) of L h are trivial, and (c) follows from that of L, using (72). Property (d) of 1 1 L h follows from (d) of L and the fact that inf R ≤ h inf(h − 2 Rh − 2 ) for all R ≥ 0 ([M], Lemma 2.7). Proof of Theorem 2.1. The operator given in Theorem 2.1 satisfies the conditions in the last lemma: Lemma B.3. Let L be an operator satisfying Assumption 2.1. Then L satisfies (a)-(d) of Lemma B.1 and (f) of Lemma B.2.


Proof. (a), (b) is trivial. First we prove (f). As L satisfies (a), (b) and (iii) of Assumption 2.1, we have 1 = ϕ(L n (1)) ≤ L n (1) ≤ K inf L n (1) ≤ K ϕ(L n (1)) = K , −2 for n ≥ N = N (1). On the other hand, because a is invertible, we have a ∗ a ≥ a −1 , −1 −2 and obtain L(1) ≥ a > 0, hence L k (1) > 0, for all k ∈ N. For m = N −1 min{inf L(1), . . . , inf L (1), K −1 } and M = max{L(1) , . . . , L N −1 (1) , K }, L satisfies (f) ([M], Lemma 2.3). In order to prove (d), we approximate Q by a local element. For 0 < K in (iii), fix any positive constant K such that K < K . We show that (d) holds for K . For any strictly positive Q ∈ Fθ , choose ε > 0 and l ∈ N such that (K + 1)Mε ≤ m(K − K ) inf Q and |Q|θ θ l ≤ ε. For this l, by Lemma C.1, there exists Q (l) ∈ B ⊗ A[1,l−1] such that 0 < inf Q ≤ Q (l) ≤ Q and Q − Q (l) ≤ |Q|θ θ l . Using L n = L n (1) ≤ M, we obtain − M |Q|θ θ l ≤ L n (Q) − L n (Q (l) ) ≤ M |Q|θ θ l , n ∈ N.

(69)

Applying (iii) to Q (l) , we have L n (Q (l) ) ≤ K inf L n (Q (l) ), ∀n ≥ N (Q (l) ).

(70)

Using (69), (70), and m inf Q ≤ inf L n (Q), n ∈ N, we obtain (d): L n (Q) ≤ M |Q|θ θ l + L n (Q (l) ) ≤ M |Q|θ θ l + K inf L n (Q (l) )

≤ M |Q|θ θ l + K inf L n (Q) + M |Q|θ θ l ≤ M(K + 1)ε + K inf L n (Q) ≤ m(K − K ) inf Q + K inf L n (Q) ≤ K inf L n (Q), n ≥ N (Q (l) ). To prove (c), we define a map L˜ : O → O by

˜ L(Q) := τc,+ ⊗ τc,+ ◦ (ε ⊗ id[2,∞ ) ⊗ (ε ⊗ id[2,∞ ) (a ∗ ⊗ 1)Q(a ⊗ 1) . ˜ Note that L(Q ⊗ 1) = L(Q) ⊗ 1, Q ∈ B ⊗ A[1,∞) . Define ( L˜ k ) j := j ◦ L˜ k ◦ j+k , k ∈ N ∪ {0}, j ∈ N. For any n, j ∈ N, Q ∈ B ⊗ A[1,∞) , we have

var j L n (Q) ≤ ( L˜ n ) j (Q ⊗ 1) − L˜ n (Q ⊗ 1)

+ j ◦ L˜ n j+n (Q ⊗ 1) − Q ⊗ 1 . (71)

Qθ θ j+n . By j ◦ τc,+ ⊗ τc,+ ◦ (ε ⊗ id[2,∞ ) By (f), the second

by M

term is bounded

⊗(ε ⊗ id[2,∞ ) = τc,+ ⊗ τc,+ ◦ (ε ⊗ id[2,∞ ) ⊗ (ε ⊗ id[2,∞ ) ◦ j+1 , j ∈ N, we have

˜ ( L˜ k−1 ) j ◦ L(R) = ( L˜ k ) j j+k (a ∗−1 ⊗ 1)(a ∗ ⊗ 1)R(a ⊗ 1) j+k (a −1 ⊗ 1) , ∀R ∈ O, j, k ∈ N.


From this, we obtain ˜ k−1 ˜ − ( L˜ k ) j (R) ( L ) j ◦ L(R)

= ( L˜ k ) j j+k (a ∗−1 ⊗ 1)(a ∗ ⊗ 1)R(a ⊗ 1) j+k (a −1 ⊗ 1) − R

≤ M a −1 a + 1 a a −1 θ j+k R , R ∈ O, j, k ∈ N. θ

Hence we get ˜n ( L ) j (Q ⊗ 1) − L˜ n (Q ⊗ 1) n

k L˜ ≤ ◦ L˜ n−k (Q ⊗ 1) − L˜ k−1 j

k=1

j

≤ θ j M(M + 1)( a −1 a + 1) a a −1

θ

◦ L˜ n−k+1 (Q ⊗ 1) θ Q =: θ j K Q . 1−θ

Substituting these for (71), we obtain (c): n L (Q) ≤ K Q + M Qθ θ n , θ ([M], Lemma 2.4)

Theorem 2.1 is an immediate consequence of Lemma B.2 and Lemma B.1.

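C. Banach Space $\mathcal{F}_\theta$

It is straightforward to check

$$|AB|_\theta \le 2 |A|_\theta |B|_\theta, \qquad |A + B|_\theta \le |A|_\theta + |B|_\theta, \qquad |A|_\theta = |A^*|_\theta, \qquad \forall A, B \in \mathcal{F}_\theta. \tag{72}$$

Lemma C.1 ([M], Lemma 2.1). If $Q$ is an element of $\mathcal{F}_\theta$ and $k$ is a positive integer, there exists $Q^{(k)}$ in $B \otimes \mathcal{A}_{[1,k-1]}$ such that $\|Q - Q^{(k)}\| \le |Q|_\theta\, \theta^{k}$. If $Q$ is positive, $Q^{(k)}$ can be taken to satisfy $\inf Q \le Q^{(k)} \le \|Q\|$. If $Q$ is in $\mathcal{F}_\theta \cap \mathcal{A}_{[1,\infty)}$, then $Q^{(k)}$ is in $\mathcal{A}_{[1,k-1]}$.

A closed ball in $\mathcal{F}_\theta$ is compact with respect to the $C^*$-norm ([M], Lemma 2.2):

Lemma C.2. For any $C > 0$, a set $\{R \in \mathcal{F}_\theta : |R|_\theta \le C\}$ is compact with respect to the $C^*$-norm.

Next we consider operators in the form of (2).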


Lemma C.3. For a, b ∈ Fθ and E : B ⊗ Md (C) → B a completely positive unital map, define an operator L on Fθ by

L(Q) := τc,+ E ⊗ id[2,∞) (bQa), Q ∈ Fθ . Then L is a bounded linear operator on Fθ such that |L(Q)|θ ≤ 8 |a|θ |b|θ |Q|θ , Q ∈ Fθ . Proof. From (72), the map Fθ Q → bQa ∈ Fθ is a bounded linear operator on Fθ such that |bQa|θ ≤ 4 |a|θ |b|θ |Q|θ for all Q ∈ Fθ . For Q ∈ Fθ and j ∈

N, we take Q ( j) ∈ B ⊗ A[1, j−1] given in Lemma C.1. As var j τc,+ E ⊗ id[2,∞) (Q ( j) ) = 0, we have

var j τc,+ E ⊗ id[2,∞) (Q) = var j τc,+ E ⊗ id[2,∞) (Q − Q ( j) )

Q − Q ( j) ≤ 2 |Q|θ θ j . ≤ 2 τc,+ E ⊗ id[2,∞) B(B⊗A[1,∞) )

Here, we used the fact that a completely positive unital map is a contraction. Hence we obtain

τc,+ E ⊗ id[2,∞) (Q) ≤ 2 |Q|θ , ∀Q ∈ Fθ . (73) θ Combining these estimates, we obtain the claim.
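The constant 8 in Lemma C.3 can be traced explicitly. A sketch, assuming (73) is read as a bound in the $|\cdot|_\theta$ norm and using the product estimate from (72) twice:

```latex
\begin{align*}
|bQa|_\theta &\le 2\,|b|_\theta\,|Qa|_\theta \le 4\,|b|_\theta\,|Q|_\theta\,|a|_\theta,\\
|L(Q)|_\theta &\le 2\,|bQa|_\theta \le 8\,|a|_\theta\,|b|_\theta\,|Q|_\theta .
\end{align*}
```

The first line is the factor 4 quoted in the proof; the second applies (73) to the element $bQa \in \mathcal{F}_\theta$.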

Lemma C.4. Let C α → a(α) ∈ Fθ be an Fθ -valued entire analytic function. Let E : B ⊗ Md (C) → B be a completely positive unital map. Define a family of operators (L α )α∈C on Fθ by

$$L_\alpha(Q) := \tau_{c,+}\, E \otimes \mathrm{id}_{[2,\infty)} \bigl( a(\bar\alpha)^{*}\, Q\, a(\alpha) \bigr), \qquad Q \in \mathcal{F}_\theta,\ \alpha \in \mathbb{C}.$$

Then the $B(\mathcal{F}_\theta)$-valued function $\mathbb{C} \ni \alpha \mapsto L_\alpha \in B(\mathcal{F}_\theta)$ is $\|\cdot\|_{B(\mathcal{F}_\theta)}$-entire analytic.

Proof. It is straightforward to see from (72) and (73) that the analyticity of $a(\alpha)$ implies that of $L_\alpha$.

References

[A1] Araki, H.: Gibbs states of a one dimensional quantum lattice. Commun. Math. Phys. 14, 120–157 (1969)
[A2] Araki, H.: Relative hamiltonian for faithful normal states of a von neumann algebra. Pub. R.I.M.S., Kyoto Univ. 9, 165–209 (1973)
[BLP] van den Berg, M., Lewis, J.T., Pule, J.V.: The large deviation principle and some models of an interacting boson gas. Commun. Math. Phys. 118, 61–85 (1988)
[BR1] Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics 1. Berlin-Heidelberg-New York: Springer-Verlag, 1986
[BR2] Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics 2. Berlin-Heidelberg-New York: Springer-Verlag, 1996
[DZ] Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Second edition, Berlin-Heidelberg-New York: Springer-Verlag, 1998
[DMN1] De Roeck, W., Maes, C., Netočný, K.: Quantum macrostates, equivalence of ensembles and an h-theorem. J. Math. Phys. 47, 073303 (2006)
[DMN2] De Roeck, W., Maes, C., Netočný, K.: The Gibbs property for classical restrictions of quantum equilibrium states. In preparation
[FNW] Fannes, M., Nachtergaele, B., Werner, R.F.: Finitely correlated states on quantum spin chains. Commun. Math. Phys. 144, 443–490 (1992)
[GLM] Gallavotti, G., Lebowitz, J.L., Mastropietro, V.: Large deviations in rarefied quantum gases. J. Stat. Phys. 108, 831–861 (2002)
[GN] Golodets, V.Y., Neshveyev, S.: Gibbs states for af algebras. J. Math. Phys. 39, 6329–6344 (1998)
[HMO] Hiai, F., Mosonyi, M., Ogawa, T.: Large deviations and chernoff bound for certain correlated states on a spin chain. J. Math. Phys. 48, 123301 (2007)
[HMOP] Hiai, F., Mosonyi, M., Ohno, H., Petz, D.: Free energy density for mean field perturbation of states of a one-dimensional spin chain. Rev. Math. Phys. 20, 335–365 (2008)
[LLS] Lebowitz, J.L., Lenci, M., Spohn, H.: Large deviations for ideal quantum systems. J. Math. Phys. 41, 1224–1243 (2000)
[LR] Lenci, M., Rey-Bellet, L.: Large deviations in quantum lattice systems: one-phase region. J. Stat. Phys. 119, 715–746 (2005)
[M] Matsui, T.: On non-commutative ruelle transfer operator. Rev. Math. Phys. 13, 1183–1201 (2001)
[NR] Netočný, K., Redig, F.: Large deviations for quantum spin systems. J. Stat. Phys. 117, 521–547 (2004)
[PRV] Petz, D., Raggio, G.P., Verbeure, A.: Asymptotics of varadhan-type and the gibbs variational principle. Commun. Math. Phys. 121, 271–282 (1989)
[P] Petz, D.: First steps towards a Donsker and Varadhan theory in operator algebras. In: Quantum Probability and Applications IV, Lecture Notes in Math, 1442, Berlin-Heidelberg-New York: Springer, 1990, pp. 311–319
[R] Ruelle, D.: Statistical mechanics of a one dimensional lattice gas. Commun. Math. Phys. 9, 267–278 (1968)
[RW] Raggio, G.A., Werner, R.F.: Quantum statistical mechanics of general mean field systems. Helv. Phys. Acta 62, 980–1003 (1989)

Communicated by M. Aizenman

Commun. Math. Phys. 296, 69–88 (2010) Digital Object Identifier (DOI) 10.1007/s00220-009-0964-4


A Characterization of Vertex Operator Algebra L( 21 , 0) ⊗ L( 21 , 0) Chongying Dong1,2, , Cuipo Jiang3, 1 Department of Mathematics, University of California, Santa Cruz, CA

95064, USA. E-mail: [email protected]

2 School of Mathematics, Sichuan University, Chengdu 610065, China 3 Department of Mathematics, Shanghai Jiaotong University, Shanghai 200240, China

Received: 7 May 2009 / Accepted: 8 September 2009 Published online: 3 December 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com

Abstract: We study a simple, rational and C2 -cofinite vertex operator algebra whose weight 1 subspace is zero, the dimension of weight 2 subspace is greater than or equal to 2 and with c = c˜ = 1. Under some additional conditions it is shown that such a vertex operator algebra is isomorphic to L( 21 , 0) ⊗ L( 21 , 0).

1. Introduction

The vertex operator algebra $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$ is characterized in [ZD] as the unique simple, rational, $C_2$-cofinite vertex operator algebra with $c = \tilde{c} = 1$, weight one subspace being zero and weight two subspace being 2-dimensional. In this paper we strengthen this result by allowing the dimension of the weight two subspace to be greater than or equal to 2. This proves the conjecture given in [ZD].

The importance of $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$ was first noticed in [DMZ] (also see [M2,DGH]) for the study of the moonshine vertex operator algebra $V^{\natural}$ [FLM]. In fact, it was essentially proved in [DMZ] that the fixed point vertex operator subalgebra $V_L^{+}$ under the involution induced from the $-1$ isometry of $L$ is isomorphic to $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$ if $L$ is a rank one lattice generated by a vector whose squared length is 4, and that $V^{\natural}$ contains $L(\frac{1}{2}, 0)^{\otimes 48}$. This led to the theory of code vertex operator algebras [M1,M2,M3] and framed vertex operator algebras [DGH]. A new construction of the moonshine vertex operator algebra $V^{\natural}$ is given in [M4] using the theory of code and framed vertex operator

Supported by NSF grants and a Faculty research grant from the University of California at Santa Cruz; part of this work was done when C. Dong was a Changjiang Visiting Chair Professor in Sichuan University. Supported in part by China NSF grants 10871125, 10811120445, and a grant of Science and Technology Commission of Shanghai Municipality (No. 09XD1402500).


algebras. Furthermore, the recent progress in [DGL,LY] on proving the uniqueness of $V^{\natural}$ depends largely on the theory of framed vertex operator algebras and code vertex operator algebras. Also see [KL] for the study of conformal nets arising from framed vertex operator algebras.

The characterization of $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$ given in this paper is a necessary step in the classification of rational vertex operator algebras with $c = 1$. It is a well known conjecture (cf. [K,ZD]) that any simple rational vertex operator algebra with $c = 1$ is either $V_L$, $V_L^{+}$ or $V_{L_{A_1}}^{G}$, where $L$ is a rank one positive definite even lattice, $L_{A_1}$ is the root lattice of type $A_1$ and $G$ is a subgroup of $SO(3)$ isomorphic to $A_4$, $S_4$ or $A_5$. As pointed out in [ZD], the correct conjecture should also assume that $c$ is equal to the effective central charge $\tilde{c}$. A characterization of $V_L$ for an arbitrary positive definite even lattice is obtained in [DM1]. Although there was some progress at the q-character level on the classification of rational vertex operator algebras with $c = 1$ in the physics literature [K], the conjecture is still far from being proved completely, owing to the lack of a characterization of $V_L^{+}$. It is hoped that the characterization of $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$ may help to understand $V_L^{+}$ in general.

If the weight one subspace of a vertex operator algebra is 0, then its weight two subspace is a commutative (non-associative) algebra (cf. [FLM,DGL]). Since the weight two subspace $V_2$ in [ZD] is assumed to be 2-dimensional, it is necessarily a commutative associative algebra. The main result in [ZD] was based on the study of the vertex operator algebra $W(2, 2)$ and the growth of the graded dimensions of vertex operator algebras. But in this paper we only assume $\dim V_2 \geq 2$, so $V_2$ need not be an associative algebra and the situation is much more complicated. By a result from [R], $V_2$ either has two nontrivial idempotent elements or has a nontrivial nilpotent element.
The former case basically follows from the argument in [ZD]. The key point in this paper is to use the fusion rules for the Virasoro algebra with $c = 1$ to deal with the latter case. This explains why we need the assumption in the main theorem that the vertex operator algebra is a sum of highest weight modules for the Virasoro algebra. This assumption is expected to be established for all rational vertex operator algebras with $c = 1$.

This leads us to the study of fusion rules for the Virasoro algebra with $c = 1$, which have been investigated from different points of view [RT,X]. The fusion rules among irreducible modules $L(1, m^2/4)$ with $m \in \mathbb{Z}$ for the Virasoro algebra have been given in [M], based on the $A(V)$-theory developed in [Z,FZ,L2]. We extend these results to include irreducible modules $L(1, n)$ for $n \in \mathbb{Z}$. We believe that the fusion rules computed in this paper will play important roles in the future classification of rational vertex operator algebras with $c = 1$.

The paper is organized as follows. In Sect. 2 we review the various notions of modules and define rational vertex operator algebras. Section 3 is about the Virasoro vertex operator algebras and some results on the structure of highest weight modules for the Virasoro algebra with $c = 1$. We also prove that any simple vertex operator algebra with $c > 1$ is a completely reducible module for the Virasoro algebra. In Sect. 4 we first review the $A(V)$-theory, including how to use the bimodules to compute the fusion rules. The new results in this section are the fusion rules for the Virasoro algebra with $c = 1$. The most difficult case is the fusion rules for the irreducible modules $L(1, m^2)$ for integers $m$, as they are not Verma modules. These fusion rules are fundamental later in the proof of the main theorem. Section 5 is devoted to the proof of the main theorem.
In the case that V2 has a nontrivial nilpotent element we need to construct some highest weight vectors with certain properties. Then we use the fusion rules to prove this is impossible. This forces the dimension of V2 to be 2 and the result in [ZD] applies.


2. Preliminaries

Let $V = (V, Y, \mathbf{1}, \omega)$ be a vertex operator algebra [B,FLM]. We review various notions of $V$-modules (cf. [FLM,Z,DLM1]) and the definition of rational vertex operator algebras. We also discuss some consequences following [DLM1].

Definition 2.1. A weak $V$-module is a vector space $M$ equipped with a linear map
$$Y_M : V \to \mathrm{End}(M)[[z, z^{-1}]], \qquad v \mapsto Y_M(v, z) = \sum_{n \in \mathbb{Z}} v_n z^{-n-1}, \quad v_n \in \mathrm{End}(M),$$
satisfying the following:
1) $v_n w = 0$ for $n \gg 0$, where $v \in V$ and $w \in M$,
2) $Y_M(\mathbf{1}, z) = \mathrm{Id}_M$,
3) The Jacobi identity holds:
$$z_0^{-1}\delta\!\left(\frac{z_1 - z_2}{z_0}\right) Y_M(u, z_1) Y_M(v, z_2) - z_0^{-1}\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) Y_M(v, z_2) Y_M(u, z_1) = z_2^{-1}\delta\!\left(\frac{z_1 - z_0}{z_2}\right) Y_M(Y(u, z_0)v, z_2). \tag{2.1}$$

Definition 2.2. An admissible $V$-module is a weak $V$-module which carries a $\mathbb{Z}_+$-grading $M = \bigoplus_{n \in \mathbb{Z}_+} M(n)$, such that if $v \in V_r$ then $v_m M(n) \subseteq M(n + r - m - 1)$.

Definition 2.3. An ordinary $V$-module is a weak $V$-module which carries a $\mathbb{C}$-grading $M = \bigoplus_{\lambda \in \mathbb{C}} M_\lambda$, such that:
1) $\dim(M_\lambda) < \infty$,
2) $M_{\lambda + n} = 0$ for fixed $\lambda$ and $n \ll 0$,
3) $L(0)w = \lambda w = \mathrm{wt}(w)\, w$ for $w \in M_\lambda$, where $L(0)$ is the component operator of $Y_M(\omega, z) = \sum_{n \in \mathbb{Z}} L(n) z^{-n-2}$.

Remark 2.4. It is easy to see that an ordinary $V$-module is an admissible one. If $W$ is an ordinary $V$-module, we simply call $W$ a $V$-module.

We call a vertex operator algebra rational if the admissible module category is semisimple. We have the following result from [DLM2] (also see [Z]).

Theorem 2.5. If $V$ is a rational vertex operator algebra, then $V$ has finitely many irreducible admissible modules up to isomorphism and every irreducible admissible $V$-module is ordinary.

Suppose that $V$ is a rational vertex operator algebra and let $M^1, \ldots, M^k$ be the irreducible modules such that
$$M^i = \bigoplus_{n \geq 0} M^i_{\lambda_i + n},$$
where $\lambda_i \in \mathbb{Q}$ [DLM3], $M^i_{\lambda_i} \neq 0$ and each $M^i_{\lambda_i + n}$ is finite dimensional. Let $\lambda_{\min}$ be the minimum of the $\lambda_i$'s. The effective central charge $\tilde{c}$ is defined as $c - 24\lambda_{\min}$. For each $M^i$ we define the q-character of $M^i$ by
$$\mathrm{ch}_q M^i = q^{-c/24} \sum_{n \geq 0} (\dim M^i_{\lambda_i + n})\, q^{n + \lambda_i}.$$


A vertex operator algebra is called $C_2$-cofinite if $C_2(V)$ has finite codimension, where $C_2(V) = \langle u_{-2}v \mid u, v \in V \rangle$.

Take a formal power series in $q$ or a complex function $f(z) = q^{\lambda} \sum_{n \geq 0} a_n q^n$. We say that the coefficients of $f(q)$ satisfy the polynomial growth condition if there exist positive numbers $A$ and $\alpha$ such that $|a_n| \leq A n^{\alpha}$ for all $n$.

If $V$ is rational and $C_2$-cofinite, then $\mathrm{ch}_q M^i$ converges to a holomorphic function on the upper half plane [Z]. Using the modular invariance result from [Z] and results on vector valued modular forms from [KM] we have (see [DM1])

Lemma 2.6. Let $V$ be rational and $C_2$-cofinite. For each $i$, the coefficients of $\eta(q)^{\tilde{c}}\, \mathrm{ch}_q M^i$ satisfy the polynomial growth condition, where
$$\eta(q) = q^{1/24} \prod_{n \geq 1} (1 - q^n).$$
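As a rough illustration of what the polynomial growth condition excludes, the following Python sketch expands $\prod_{n \geq 1}(1 - q^n)^{-1}$, whose coefficients are the partition numbers, a standard example of coefficients that outgrow every fixed power of $n$. The function name `inverse_euler_product`, the truncation order 200 and the exponent 3 are choices made for this sketch only; a finite computation can suggest, but not prove, the failure of the condition.

```python
def inverse_euler_product(order):
    """Coefficients a_0..a_order of prod_{n>=1} (1 - q^n)^(-1), truncated at q^order."""
    coeffs = [0] * (order + 1)
    coeffs[0] = 1
    for n in range(1, order + 1):
        # Multiply the truncated series by 1/(1 - q^n) = 1 + q^n + q^(2n) + ...
        for k in range(n, order + 1):
            coeffs[k] += coeffs[k - n]
    return coeffs


a = inverse_euler_product(200)

# For a fixed exponent (here alpha = 3) the ratios a_n / n^alpha keep increasing
# along this sample, which is the behaviour the growth condition rules out.
ratios = [a[n] / n**3 for n in (50, 100, 200)]
```

The same routine can be reused for other exponents by changing the power in `ratios`.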

3. Virasoro Vertex Operator Algebras

We will review vertex operator algebras associated to the highest weight representations for the Virasoro algebra and study a general vertex operator algebra viewed as a module for the Virasoro vertex operator algebra.

We first recall some basic facts about the highest weight modules for the Virasoro algebra $Vir$. Let $c, h \in \mathbb{C}$ and let $V(c, h)$ be the corresponding Verma module for the Virasoro algebra $Vir$ with central charge $c$ and highest weight $h$. We set $\bar{V}(c, 0) = V(c, 0)/U(Vir)L(-1)v$, where $v$ is a highest weight vector with highest weight 0, and denote the irreducible quotient of $V(c, h)$ by $L(c, h)$. We have (see [KR,FZ]):

Proposition 3.1. Let $c$ be a complex number.
(1) $\bar{V}(c, 0)$ is a vertex operator algebra and $L(c, 0)$ is a simple vertex operator algebra.
(2) For any $h \in \mathbb{C}$, $V(c, h)$ is a module for $\bar{V}(c, 0)$.
(3) $V(c, h) = L(c, h)$ and $\bar{V}(c, 0) = L(c, 0)$ for $c > 1$ and $h > 0$.
(4) $V(1, h) = L(1, h)$ if and only if $h \neq \frac{m^2}{4}$ for $m \in \mathbb{Z}$. In case $h = m^2$ for a nonnegative integer $m$, the unique maximal submodule of $V(1, m^2)$ is generated by a highest weight vector with highest weight $(m + 1)^2$ and is isomorphic to $V(1, (m + 1)^2)$.
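The criterion in Proposition 3.1 (4) is easy to mechanize for rational weights. The sketch below is only an illustration under that restriction; the function names `verma_c1_is_irreducible` and `maximal_submodule_weight` are invented here and do not come from the paper.

```python
from fractions import Fraction
from math import isqrt


def verma_c1_is_irreducible(h):
    """True when V(1, h) = L(1, h), i.e. when h is NOT of the form m^2/4 with m an integer.

    Only rational h are handled (an assumption of this sketch).
    """
    h = Fraction(h)
    if h < 0:
        return True  # m^2/4 is never negative
    four_h = 4 * h
    if four_h.denominator != 1:
        return True  # 4h is not an integer, so h cannot equal m^2/4
    root = isqrt(four_h.numerator)
    return root * root != four_h.numerator


def maximal_submodule_weight(m):
    """Highest weight (m + 1)^2 of the vector generating the maximal submodule of V(1, m^2)."""
    return (m + 1) ** 2
```

For example, `verma_c1_is_irreducible(Fraction(9, 4))` is false because $9/4 = 3^2/4$, while `verma_c1_is_irreducible(2)` is true.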

We next study a general simple vertex operator algebra as a module for the Virasoro algebra.

Lemma 3.2. Let $V$ be a simple vertex operator algebra such that $V_0 = \mathbb{C}\mathbf{1}$ and $L(1)V_1 = 0$. Let $h > 0$ be such that the Verma module $V(c, h)$ for the Virasoro algebra is irreducible. Let $U$ be the sum of irreducible submodules of $V$ isomorphic to $V(c, h)$. Then $V = U \oplus U^{\perp}$, where $U^{\perp} = \{v \in V \mid (v, U) = 0\}$ and $(\,,)$ is the canonical non-degenerate symmetric invariant bilinear form on $V$ such that $(\mathbf{1}, \mathbf{1}) = 1$ [FHL], [L1].

Proof. It is enough to prove that $U \cap U^{\perp} = 0$. First note that $U$ is a completely reducible module for the Virasoro algebra. Also, $U^{\perp}$ is a module for the Virasoro algebra. Suppose that $U \cap U^{\perp} \neq 0$. Let $W$ be an irreducible submodule of $U \cap U^{\perp}$. Then $X = V/W^{\perp}$ is an irreducible module for the Virasoro algebra isomorphic to $V(c, h)$ and can be identified with the graded dual $W'$ of $W$. Let $v \in V_h$ be such that $v + W^{\perp}$ is the highest weight


vector of $V/W^{\perp}$. Let $M$ be the module for the Virasoro algebra generated by $v$. Then $M \cap W^{\perp}$ is a submodule of $M$, $M/(M \cap W^{\perp})$ is isomorphic to $X$, and $M \cap V_h = \mathbb{C}v \oplus (M \cap W^{\perp} \cap V_h)$ (direct sum of subspaces). Note that there are only finitely many composition factors in $M \cap W^{\perp}$. We have the following exact sequences for modules of the Virasoro algebra:
$$0 \to M \cap W^{\perp} \to M \to L(c, h) \to 0$$
and
$$0 \to L(c, h) \to M' \to (M \cap W^{\perp})' \to 0.$$
Since $(W, v) \neq 0$, it follows that $M$ cannot be a direct sum of submodules $L(c, h)$ and $M \cap W^{\perp}$ for the Virasoro vertex operator algebra. So $M'$ cannot be a direct sum of submodules $L(c, h)$ and $(M \cap W^{\perp})'$. Therefore there exists a highest weight submodule $Z$ of $M'$ such that $L(c, h)$ is a submodule of $Z$. But from the module structure theory in [KR], $L(c, h)$ can never be a submodule of any highest weight module if $V(c, h) = L(c, h)$. This is a contradiction. The proof is complete.

Proposition 3.3. Let $V$ be a simple vertex operator algebra such that $V_0 = \mathbb{C}\mathbf{1}$, $L(1)V_1 = 0$ and $c > 1$. Then $V$ is a completely reducible module for the Virasoro algebra.

Proof. Recall from [KR] or Proposition 3.1 that $V(c, h) = L(c, h)$ if $h > 0$ and $L(c, 0) = \bar{V}(c, 0)$. It is clear that the vertex operator subalgebra of $V$ generated by $\omega$ is isomorphic to $L(c, 0)$. So we can regard $L(c, 0)$ as a subalgebra of $V$. Then we have the decomposition $V = L(c, 0) \oplus L(c, 0)^{\perp}$, as $(\mathbf{1}, \mathbf{1}) = 1$ and $L(c, 0) \cap L(c, 0)^{\perp} = 0$. Let $U^n$ be the $L(c, 0)$-submodule of $V$ generated by the highest weight vectors with highest weight $n$. Then $U^n$ is a completely reducible module for the Virasoro algebra and $V = \bigoplus_{n \geq 0} U^n$ by Lemma 3.2.

We remark that in the case $c = 1$ we cannot establish the result in Proposition 3.3, although we strongly believe it is true if we also assume that $V$ is rational and $C_2$-cofinite. We need this assumption for $c = 1$ later to characterize the vertex operator algebra $L(1/2, 0) \otimes L(1/2, 0)$.
This is also the original motivation for us to study the complete reducibility of vertex operator algebras as modules for the Virasoro algebra. How to decompose an arbitrary vertex operator algebra and its modules as a sum of indecomposable modules for $sl(2, \mathbb{C}) = \mathbb{C}L(1) + \mathbb{C}L(-1) + \mathbb{C}L(0)$ has been studied extensively in [DLiM]. It seems that decomposing an arbitrary vertex operator algebra into a sum of indecomposable modules for the Virasoro algebra is much more difficult. But such a decomposition is definitely important in the study of vertex operator algebras and their representations.

4. A(V)-Theory and Fusion Rules

Let $V$ be a vertex operator algebra. An associative algebra $A(V)$ has been introduced and studied in [Z]. It turns out that $A(V)$ is very powerful and useful in representation theory for vertex operator algebras. One can use $A(V)$ not only to classify the irreducible admissible modules [Z], but also to compute the fusion rules using $A(V)$-bimodules [FZ]. We will first review the definition of $A(V)$ and some important results about $A(V)$


from [Z,FZ,L2]. We then apply the $A(V)$-theory to the vertex operator algebra $L(1, 0)$ to compute the fusion rules for $L(1, 0)$. The central task is to determine the $A(L(1, 0))$-bimodule $A(L(1, m^2))$ for any integer $m$.

As a vector space, $A(V)$ is the quotient space of $V$ by $O(V)$, where $O(V)$ denotes the linear span of the elements
$$u \circ v = \mathrm{Res}_z\!\left(Y(u, z)\frac{(z + 1)^{\mathrm{wt}\,u}}{z^2}\, v\right) = \sum_{i \geq 0} \binom{\mathrm{wt}\,u}{i} u_{i-2} v \tag{4.1}$$
for $u, v \in V$ with $u$ being homogeneous. The product in $A(V)$ is induced from the multiplication
$$u * v = \mathrm{Res}_z\!\left(Y(u, z)\frac{(z + 1)^{\mathrm{wt}\,u}}{z}\, v\right) = \sum_{i \geq 0} \binom{\mathrm{wt}\,u}{i} u_{i-1} v \tag{4.2}$$
for $u, v \in V$ with $u$ being homogeneous. $A(V) = V/O(V)$ is an associative algebra with identity $\mathbf{1} + O(V)$ and with $\omega + O(V)$ being in the center of $A(V)$. The most important result about $A(V)$ is that for any admissible $V$-module $M = \bigoplus_{n \geq 0} M(n)$ with $M(0) \neq 0$, $M(0)$ is an $A(V)$-module such that $v + O(V)$ acts as $o(v)$, where $o(v) = v_{\mathrm{wt}\,v - 1}$ for homogeneous $v$.

For an admissible $V$-module $W$, we also define $O(W) \subset W$ to be the linear span of elements of type
$$\mathrm{Res}_z\!\left(Y(v, z)\frac{(z + 1)^{\mathrm{wt}\,v}}{z^2}\, w\right) = \sum_{i \geq 0} \binom{\mathrm{wt}\,v}{i} v_{i-2} w \tag{4.3}$$
for homogeneous $v \in V$ and $w \in W$. Let $A(W) = W/O(W)$. Then $A(W)$ has an $A(V)$-bimodule structure [FZ] induced by the following bilinear operations $V \times W \to W$ and $W \times V \to W$: for $w \in W$ and homogeneous $v \in V$,
$$v * w = \mathrm{Res}_z\!\left(Y(v, z)\frac{(z + 1)^{\mathrm{wt}\,v}}{z}\, w\right) = \sum_{i \geq 0} \binom{\mathrm{wt}\,v}{i} v_{i-1} w, \tag{4.4}$$
$$w * v = \mathrm{Res}_z\!\left(Y(v, z)\frac{(z + 1)^{\mathrm{wt}\,v - 1}}{z}\, w\right) = \sum_{i \geq 0} \binom{\mathrm{wt}\,v - 1}{i} v_{i-1} w. \tag{4.5}$$

We quote the following proposition from [FZ]:

Proposition 4.1. If $W$ is an admissible module for a vertex operator algebra $V$ and $M$ is a submodule of $W$, then the image $\bar{M}$ of $M$ in $A(W)$ is a sub-$A(V)$-bimodule of $A(W)$, and the quotient $A(W)/\bar{M}$ is isomorphic to the $A(V)$-bimodule $A(W/M)$ associated to the quotient $V$-module $W/M$.

Let $W^i$ ($i = 1, 2, 3$) be ordinary $V$-modules. We denote by $I_V\binom{W^3}{W^1\ W^2}$ the vector space of all intertwining operators of type $\binom{W^3}{W^1\ W^2}$. For a $V$-module $W$, let $W'$ denote the graded dual of $W$. Then $W'$ is also a $V$-module [FHL]. It is well known that fusion rules have the following symmetry (see [FHL]).


Proposition 4.2. Let $W^i$ ($i = 1, 2, 3$) be $V$-modules. Then
$$\dim I_V\binom{W^3}{W^1\ W^2} = \dim I_V\binom{W^3}{W^2\ W^1}, \qquad \dim I_V\binom{W^3}{W^1\ W^2} = \dim I_V\binom{(W^2)'}{W^1\ (W^3)'}.$$

Let $W^i = \bigoplus_{n \geq 0} W^i(n)$ ($i = 1, 2, 3$) be $V$-modules such that $L(0)|_{W^i(0)} = \lambda_i$. Let $\mathcal{Y}(\cdot, z)$ be an intertwining operator of type $\binom{W^3}{W^1\ W^2}$. Define the following bilinear map:
$$f_{\mathcal{Y}} : A(W^1) \otimes_{A(V)} W^2(0) \to W^3(0), \qquad u^1 \otimes u^2 \mapsto o(u^1)u^2, \quad u^1 \in A(W^1),\ u^2 \in W^2(0),$$
where $o(u^1)$ is the component operator of $\mathcal{Y}(u^1, z)$ such that $o(u^1)$ maps $W^2(0)$ to $W^3(0)$. Then $f_{\mathcal{Y}}$ is an $A(V)$-module homomorphism [FZ]. To state the next result we need to define the Verma type admissible module $M(U)$ associated to an $A(V)$-module $U$:

Definition 4.3. Let $V$ be a vertex operator algebra and $U$ an $A(V)$-module. An admissible $V$-module $M = \bigoplus_{n=0}^{\infty} M(n)$ is called the Verma type module generated by $U$ if $M(0) = U$ as $A(V)$-module and, for any admissible $V$-module $W = \bigoplus_{n=0}^{\infty} W(n)$ with $W(0) = U$ as $A(V)$-module, the identity map from $M(0)$ to $W(0)$ lifts to a $V$-module homomorphism from $M$ to $W$.

The existence of a Verma type admissible module was given in [Z] (also see [DLM2]). The following result comes from [L2]:

Lemma 4.4. Let $W^i$ be $V$-modules for $i = 1, 2, 3$. If $W^3$ is an irreducible $V$-module, then the linear map $\mathcal{Y} \mapsto f_{\mathcal{Y}}$ is an injective map from the space of intertwining operators of type $\binom{W^3}{W^1\ W^2}$ to $\mathrm{Hom}_{A(V)}(A(W^1) \otimes_{A(V)} W^2(0), W^3(0))$. Furthermore, $\mathcal{Y} \mapsto f_{\mathcal{Y}}$ is an isomorphism if both $W^2$ and $(W^3)'$ are Verma type modules for $V$.

We quote a result about the vertex operator algebra $\bar{V}(c, 0)$ from [FZ].

Proposition 4.5.
(1) The associative algebra $A(\bar{V}(c, 0))$ is isomorphic to the polynomial algebra $\mathbb{C}[x]$, with the isomorphism being given by
$$x^n \in \mathbb{C}[x] \mapsto [(L(-2) + L(-1))^n \mathbf{1}],$$
where $[a] = a + O(\bar{V}(c, 0))$ for $a \in \bar{V}(c, 0)$.
(2) For the Verma module $V(c, h)$, the $A(\bar{V}(c, 0))$-bimodule $A(V(c, h))$ is $\mathbb{C}[x, y]$ with $x$ and $y$ acting on the left and right as multiplications by $x$ and $y$ respectively. The isomorphism from $\mathbb{C}[x, y]$ to $A(V(c, h))$ is given by
$$x^m y^n \mapsto [(L(-2) + 2L(-1) + L(0))^m (L(-2) + L(-1))^n \mathbf{1}_h],$$
where $\mathbf{1}_h$ is a fixed nonzero highest weight vector of $V(c, h)$.

We now discuss the relation between the Verma module for the Virasoro algebra and the Verma type admissible module for the vertex operator algebra $\bar{V}(c, 0)$. By Proposition 4.5, $A(\bar{V}(c, 0)) = \mathbb{C}[x]$. So any irreducible $A(\bar{V}(c, 0))$-module is one dimensional, with $[\omega]$ acting as a constant $h$. Denote this module by $U$. It is clear that the Verma type admissible $\bar{V}(c, 0)$-module generated by $U$ is exactly the Verma module $V(c, h)$.

We next turn our attention to the fusion rules for the vertex operator algebra $L(1, 0)$. The following theorem is the foundation of our computation of the fusion rules.


Theorem 4.6. Let $r$ be a positive integer. Then $A(L(1, r^2)) = \mathbb{C}[x, y]/\bar{I}$, where
$$\bar{I} = \Big\langle (x - y) \prod_{i=1}^{r} \big[(x - y)^2 - 2i^2(x + y) + i^4\big] \Big\rangle$$
is the two-sided ideal of $\mathbb{C}[x, y]$ generated by $(x - y) \prod_{i=1}^{r} [(x - y)^2 - 2i^2(x + y) + i^4]$.

Proof. Since $\bar{V}(1, 0) = L(1, 0)$, by Proposition 4.5 the associative algebra $A(L(1, 0))$ is $\mathbb{C}[x]$ and the $A(L(1, 0))$-bimodule $A(V(1, r^2))$ is isomorphic to $\mathbb{C}[x, y]$ with $x$ and $y$ acting on the left and right as multiplications by $x$ and $y$ respectively. By Proposition 4.1, as an $A(L(1, 0))$-bimodule,
$$A(L(1, r^2)) \cong \mathbb{C}[x, y]/\bar{I},$$
where $\bar{I}$ is the image in $A(V(1, r^2))$ of the maximal proper submodule $I$ of $V(1, r^2)$. Since $I$ is generated by a non-zero element $v^{(r+1)}$ in $V(1, r^2)$ such that
$$L(0)v^{(r+1)} = (r + 1)^2 v^{(r+1)}, \qquad L(k)v^{(r+1)} = 0, \quad 0 < k \in \mathbb{Z}_+,$$
it follows that $\bar{I}$ is generated by a polynomial $f(x, y)$ in $\mathbb{C}[x, y]$ with degree $s \leq 2r + 1$. Assume that
$$f(x, y) = \sum_{i=0}^{s} a_i(x) y^i,$$
where $a_i(x)$, $i = 0, 1, \ldots, s$, are polynomials in $x$ of degrees at most $2r + 1 - i$.

We need to use the vertex operator algebra $V_L$ associated to the rank one even positive definite lattice $L = \mathbb{Z}\alpha$ with $(\alpha, \alpha) = 2$ [FLM]. Let $\mathfrak{h} = L \otimes_{\mathbb{Z}} \mathbb{C}$, and let $\hat{\mathfrak{h}}_{\mathbb{Z}}$ be the corresponding Heisenberg algebra. Denote by $M(1) = \mathbb{C}[\alpha(-n) \mid n > 0]$ the associated irreducible induced module for $\hat{\mathfrak{h}}_{\mathbb{Z}}$ such that the canonical central element of $\hat{\mathfrak{h}}_{\mathbb{Z}}$ acts as 1. Let $\mathbb{C}[L]$ be the group algebra of $L$ with a basis $e^{\gamma}$ for $\gamma \in L$. Let $\beta \in \mathfrak{h}$ be such that $(\beta, \beta) = 1$. It is known that $V_L = M(1) \otimes \mathbb{C}[L]$ is a simple rational vertex operator algebra with $\mathbf{1} = 1 \otimes e^0$ and $\omega = \frac{1}{2}\beta(-1)^2 \mathbf{1}$ [B,FLM,D,DLM1]. The subalgebra of $V_L$ generated by $\omega$ is isomorphic to $L(1, 0)$ and
$$M(1) = \bigoplus_{p \geq 0} L(1, p^2), \qquad V_L = \bigoplus_{m \geq 0} (2m + 1) L(1, m^2), \tag{4.6}$$
as modules for the Virasoro algebra (cf. [DG]). It is well known that $V_L$ is isomorphic to the fundamental representation $L(\Lambda_0)$ for the affine Kac-Moody algebra $A_1^{(1)}$ [FK]. Note that the weight one subspace $(V_L)_1$ of $V_L$ forms a Lie algebra $\mathfrak{g}$ isomorphic to $sl(2, \mathbb{C})$, where the Lie bracket in $(V_L)_1$ is defined as $[u, v] = u_0 v$ and $u_0$ is the component operator of $Y(u, z) = \sum_{n \in \mathbb{Z}} u_n z^{-n-1}$. $\mathfrak{g}$ acts on $V_L$ via $v_0$ for $v \in (V_L)_1$. The $\mathfrak{g}$-invariant elements $V_L^{\mathfrak{g}} = \{v \in V_L \mid \mathfrak{g} \cdot v = 0\}$ form a simple vertex operator algebra which is isomorphic to $L(1, 0)$ (see [DG]).
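The decomposition of $V_L$ in (4.6) can be sanity-checked at the level of graded dimensions. The sketch below relies on two facts that are assumptions of the sketch rather than statements quoted from the text: the character of $V_L$ is $\sum_{k \in \mathbb{Z}} q^{k^2}$ divided by $\prod_{n \geq 1}(1 - q^n)$ (since $e^{k\alpha}$ has weight $k^2$ when $(\alpha, \alpha) = 2$), and, by Proposition 3.1 (4), the character of $L(1, m^2)$ is $q^{m^2} - q^{(m+1)^2}$ divided by the same product. The common factor cancels, so only numerators are compared; the function names are hypothetical.

```python
def lattice_numerator(order):
    """Coefficients of sum_{k in Z} q^(k^2), truncated at q^order."""
    coeffs = [0] * (order + 1)
    k = 0
    while k * k <= order:
        coeffs[k * k] += 1 if k == 0 else 2  # k and -k contribute equally
        k += 1
    return coeffs


def decomposition_numerator(order):
    """Coefficients of sum_{m>=0} (2m+1) (q^(m^2) - q^((m+1)^2)), truncated at q^order."""
    coeffs = [0] * (order + 1)
    m = 0
    while m * m <= order:
        coeffs[m * m] += 2 * m + 1
        if (m + 1) ** 2 <= order:
            coeffs[(m + 1) ** 2] -= 2 * m + 1
        m += 1
    return coeffs
```

Both functions return the same list for any truncation order, which reflects the telescoping identity $(2m + 1) - (2m - 1) = 2$ for $m \geq 1$.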


Let $W_m$ be the unique $(m + 1)$-dimensional highest weight module for $\mathfrak{g}$ with highest weight $m \in \mathbb{Z}_{\geq 0}$. Let $V_L^{W_m}$ be the sum of irreducible $\mathfrak{g}$-submodules of $V_L$ isomorphic to $W_m$, and $(V_L)_{W_m}$ the space of highest weight vectors in $V_L^{W_m}$. Then by [DG], as a $(V_L^{\mathfrak{g}}, \mathfrak{g})$-module, $V_L$ has the decomposition
$$V_L = \bigoplus_{m \geq 0} V_L^{W_{2m}} = \bigoplus_{m \geq 0} (V_L)_{W_{2m}} \otimes W_{2m} \tag{4.7}$$
and $(V_L)_{W_{2m}}$ is an irreducible module for $V_L^{\mathfrak{g}}$. Moreover, $(V_L)_{W_{2k}}$ and $(V_L)_{W_{2m}}$ are isomorphic if and only if $k = m$. By [DG], $(V_L)_{W_{2m}}$ is isomorphic to $L(1, m^2)$ as an $L(1, 0)$-module.

For $m, n \in \mathbb{Z}_+$, $m \geq n$, let
$$W_{2m,2n} = \mathrm{span}\{u_j v \mid u \in W_{2m},\ v \in W_{2n},\ j \in \mathbb{Z}\}.$$
Then $W_{2m,2n}$ is a $\mathfrak{g}$-module. Let $u \in W_{2m}$ and $v \in W_{2n}$ be such that
$$\alpha(0)u = (2m - 2i)u, \qquad \alpha(0)v = (2n - 2j)v,$$
for some $0 \leq i \leq 2m$, $0 \leq j \leq 2n$, where $\alpha(0) = (\alpha(-1)\mathbf{1})_0$ is the component operator of $\alpha(z) = Y(\alpha(-1)\mathbf{1}, z) = \sum_{k \in \mathbb{Z}} \alpha(k) z^{-k-1}$. Then
$$\alpha(0)u_p v = (\alpha(0)u)_p v + u_p \alpha(0)v = (2m + 2n - 2i - 2j)u_p v,$$
for all $p \in \mathbb{Z}$. This means that $W_{2m,2n}$ is a sum of irreducible $\mathfrak{g}$-modules in $\{W_{2k} \mid 0 \leq k \leq m + n\}$. On the other hand, we have the following well-known tensor product decomposition:
$$W_{2m} \otimes W_{2n} = W_{2(m-n)} \oplus W_{2(m-n)+2} \oplus \cdots \oplus W_{2(m+n)-2} \oplus W_{2(m+n)}. \tag{4.8}$$
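A quick dimension count confirms that the two sides of (4.8) agree. The following Python sketch is only a consistency check of dimensions, not a proof of the decomposition; the helper names `clebsch_gordan_weights` and `dim_W` are introduced here for illustration.

```python
def clebsch_gordan_weights(m, n):
    """Highest weights 2k of the summands W_{2k} appearing in W_{2m} (x) W_{2n}."""
    if m < n:
        m, n = n, m  # the decomposition is symmetric in the two factors
    return [2 * k for k in range(m - n, m + n + 1)]


def dim_W(weight):
    """Dimension of the irreducible sl(2, C)-module W_weight, namely weight + 1."""
    return weight + 1


def dimensions_match(m, n):
    """Compare dim(W_{2m}) * dim(W_{2n}) with the total dimension of the summands."""
    left = dim_W(2 * m) * dim_W(2 * n)
    right = sum(dim_W(w) for w in clebsch_gordan_weights(m, n))
    return left == right
```

For instance, `clebsch_gordan_weights(2, 1)` lists the weights 2, 4, 6, and $5 \cdot 3 = 3 + 5 + 7$.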

By Lemma 2.2 of [DM2], for a small enough integer $p$, the map $\psi_p : W_{2m} \otimes W_{2n} \to W_{2m,2n}$ defined by
$$\psi_p : u \otimes v \mapsto \sum_{i=p}^{\infty} u_i v, \qquad u \in W_{2m},\ v \in W_{2n},$$
is injective. Therefore in the decomposition of $W_{2m,2n}$ into irreducible $\mathfrak{g}$-modules, each $W_{2k}$ appears for $m - n \leq k \leq m + n$. Denote by $U_{m,n}$ the $L(1, 0)$-submodule of $V_L$ generated by $W_{2m,2n}$. Then by (4.7), we have
$$U_{m,n} \supseteq \bigoplus_{m-n \leq k \leq m+n} (V_L)_{W_{2k}} \otimes W_{2k}.$$
This proves that
$$I_{L(1,0)}\binom{L(1, k^2)}{L(1, m^2)\ L(1, n^2)} \neq 0,$$
for all $m, n, k \in \mathbb{Z}_+$ such that $|m - n| \leq k \leq n + m$. Let $m = r$; then we have
$$f(n^2, k^2) = 0,$$
for all $n, k \in \mathbb{Z}_+$ satisfying $|r - n| \leq k \leq n + r$. Thus for $n \in \mathbb{Z}_+$ with $n - r \geq 0$, we have
$$\begin{bmatrix}
1 & (n-r)^2 & (n-r)^4 & (n-r)^6 & \cdots & (n-r)^{2s} \\
1 & (n-r+1)^2 & (n-r+1)^4 & (n-r+1)^6 & \cdots & (n-r+1)^{2s} \\
1 & (n-r+2)^2 & (n-r+2)^4 & (n-r+2)^6 & \cdots & (n-r+2)^{2s} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
1 & (n+r)^2 & (n+r)^4 & (n+r)^6 & \cdots & (n+r)^{2s}
\end{bmatrix}
\begin{bmatrix} a_0(n^2) \\ a_1(n^2) \\ a_2(n^2) \\ \vdots \\ a_s(n^2) \end{bmatrix} = 0. \tag{4.9}$$


If $s \leq 2r$, then for each $n \in \mathbb{Z}_+$ such that $n \geq r$, the coefficient matrix of (4.9) contains an $(s + 1) \times (s + 1)$ minor which is a non-singular Vandermonde determinant, so (4.9) has only the zero solution. This implies that $a_i(x) = 0$ for all $i$, a contradiction. So we have $s = 2r + 1$. We may assume that $a_{2r+1}(x) = 1$. Then we have
$$A(n) \begin{bmatrix} a_0(n^2) \\ a_1(n^2) \\ a_2(n^2) \\ \vdots \\ a_{2r}(n^2) \end{bmatrix} = \begin{bmatrix} -(n-r)^{2(2r+1)} \\ -(n-r+1)^{2(2r+1)} \\ -(n-r+2)^{2(2r+1)} \\ \vdots \\ -(n+r)^{2(2r+1)} \end{bmatrix}, \tag{4.10}$$
where
$$A(n) = \begin{bmatrix}
1 & (n-r)^2 & (n-r)^4 & (n-r)^6 & \cdots & (n-r)^{4r} \\
1 & (n-r+1)^2 & (n-r+1)^4 & (n-r+1)^6 & \cdots & (n-r+1)^{4r} \\
1 & (n-r+2)^2 & (n-r+2)^4 & (n-r+2)^6 & \cdots & (n-r+2)^{4r} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
1 & (n+r)^2 & (n+r)^4 & (n+r)^6 & \cdots & (n+r)^{4r}
\end{bmatrix}.$$
This shows that (4.10) has a unique solution for each $n \in \mathbb{Z}_+$ such that $n \geq r$. Since $a_i(x)$, $i = 0, 1, \ldots, 2r + 1$, are polynomials in $x$ with degrees at most $2r + 1$, it follows that $f(x, y)$ is uniquely determined (up to a non-zero scalar) by the condition that $f(n^2, k^2) = 0$ for all $n, k \in \mathbb{Z}_+$ such that $|n - r| \leq k \leq n + r$. Let
$$f_i(x, y) = (x - y)^2 - 2i^2(x + y) + i^4, \qquad i = 1, 2, \ldots, r.$$
Then we have
$$f_i(n^2, (n \pm i)^2) = 0.$$
This proves that the polynomial
$$(x - y) \prod_{i=1}^{r} \big[(x - y)^2 - 2i^2(x + y) + i^4\big]$$
satisfies the above condition. So we have
$$f(x, y) = (x - y) \prod_{i=1}^{r} \big[(x - y)^2 - 2i^2(x + y) + i^4\big],$$
as expected.

We are now in a position to give the fusion rules for the vertex operator algebra $L(1, 0)$.
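Before moving on, the vanishing property used in the proof of Theorem 4.6 can be checked numerically for small parameters. The Python sketch below evaluates the generator of $\bar{I}$ with exact integer arithmetic; the function names and the bound `n_max` are choices of this sketch, and a finite check of this kind is only a sanity test, not a substitute for the argument above.

```python
def f_factor(i, x, y):
    """f_i(x, y) = (x - y)^2 - 2 i^2 (x + y) + i^4."""
    return (x - y) ** 2 - 2 * i * i * (x + y) + i ** 4


def f_poly(r, x, y):
    """Generator (x - y) * prod_{i=1}^{r} f_i(x, y) of the ideal in Theorem 4.6."""
    value = x - y
    for i in range(1, r + 1):
        value *= f_factor(i, x, y)
    return value


def vanishes_on_fusion_range(r, n_max):
    """Check f(n^2, k^2) = 0 for all 0 <= n <= n_max and |n - r| <= k <= n + r."""
    return all(
        f_poly(r, n * n, k * k) == 0
        for n in range(n_max + 1)
        for k in range(abs(n - r), n + r + 1)
    )
```

Calling `vanishes_on_fusion_range(r, 12)` for $r = 1, \ldots, 5$ returns `True` in each case, in line with $f_i(n^2, (n \pm i)^2) = 0$.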


Theorem 4.7. We have
$$\dim I_{L(1,0)}\binom{L(1, k^2)}{L(1, m^2)\ L(1, n^2)} = 1, \qquad k \in \mathbb{Z}_+,\ |n - m| \leq k \leq n + m, \tag{4.11}$$
$$\dim I_{L(1,0)}\binom{L(1, k^2)}{L(1, m^2)\ L(1, n^2)} = 0, \qquad k \in \mathbb{Z}_+,\ k < |n - m| \text{ or } k > n + m, \tag{4.12}$$
where $n, m \in \mathbb{Z}_+$. For $n \in \mathbb{Z}_+$ such that $n \neq p^2$ for all $p \in \mathbb{Z}_+$, we have
$$\dim I_{L(1,0)}\binom{L(1, n)}{L(1, m^2)\ L(1, n)} = 1, \tag{4.13}$$
$$\dim I_{L(1,0)}\binom{L(1, k)}{L(1, m^2)\ L(1, n)} = 0, \tag{4.14}$$
for $k \in \mathbb{Z}_+$ such that $k \neq n$.

Proof. By Lemma 4.4, for $k_1, k_2, k_3 \in \mathbb{Z}_+$, $\dim I_{L(1,0)}\binom{L(1, k_3)}{L(1, k_1)\ L(1, k_2)}$ is less than or equal to
$$\dim \mathrm{Hom}_{A(L(1,0))}\big(A(L(1, k_1)) \otimes_{A(L(1,0))} L(1, k_2)(0),\ L(1, k_3)(0)\big),$$
where $L(1, h)(0) = \mathbb{C}\mathbf{1}_h$ is the one-dimensional lowest weight space of the irreducible $L(1, 0)$-module $L(1, h)$ such that $L(0)\mathbf{1}_h = h\mathbf{1}_h$, $L(n)\mathbf{1}_h = 0$, $1 \leq n \in \mathbb{Z}_+$. That is, $x$ in $\mathbb{C}[x] = A(L(1, 0))$ acts on $L(1, h)(0)$ as $h$.

Let $m, n, k \in \mathbb{Z}_+$ be such that $|m - n| \leq k \leq m + n$. It is easy to see that
$$A(L(1, m^2)) \otimes_{A(L(1,0))} L(1, n^2)(0) \cong \mathbb{C}[x] \Big/ \Big\langle (x - n^2) \prod_{i=1}^{m} \big[(x - n^2)^2 - 2i^2(x + n^2) + i^4\big] \Big\rangle.$$
Denote the ideal $\langle (x - n^2) \prod_{i=1}^{m} [(x - n^2)^2 - 2i^2(x + n^2) + i^4] \rangle$ by $\bar{I}_n$. For
$$0 \neq \phi \in \mathrm{Hom}_{A(L(1,0))}\big(A(L(1, m^2)) \otimes_{A(L(1,0))} L(1, n^2)(0),\ L(1, k^2)(0)\big),$$
we have
$$x \cdot \phi(1 + \bar{I}_n)\mathbf{1}_{k^2} = k^2 \mathbf{1}_{k^2} = \phi(x + \bar{I}_n)\mathbf{1}_{k^2},$$
since $x \cdot \mathbf{1}_{k^2} = k^2 \mathbf{1}_{k^2}$. So
$$\phi(p(x) + \bar{I}_n)\mathbf{1}_{k^2} = p(k^2)\mathbf{1}_{k^2},$$
for $p(x) \in \mathbb{C}[x]$. This means that
$$\dim \mathrm{Hom}_{A(L(1,0))}\big(A(L(1, m^2)) \otimes_{A(L(1,0))} L(1, n^2)(0),\ L(1, k^2)(0)\big) = 1.$$
On the other hand, by the proof of Theorem 4.6, we have
$$I_{L(1,0)}\binom{L(1, k^2)}{L(1, m^2)\ L(1, n^2)} \neq 0.$$
So (4.11) holds.
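The rule stated in (4.11) and (4.12) has the same shape as the Clebsch-Gordan rule for $sl(2, \mathbb{C})$, and the two polynomial factorizations used in the remainder of the proof are elementary identities. The Python sketch below encodes the rule as a small lookup and checks both identities on integer samples; the function names are hypothetical and the code merely restates the formulas, it does not compute intertwining operators.

```python
def fusion_dim_squares(m, n, k):
    """Dimension of the fusion space of type (L(1,k^2); L(1,m^2), L(1,n^2)) per (4.11)-(4.12)."""
    return 1 if abs(n - m) <= k <= n + m else 0


def square_factorization_holds(i, k, n):
    """(k^2 - n^2)^2 - 2 i^2 (k^2 + n^2) + i^4 == [k^2 - (n - i)^2] [k^2 - (n + i)^2]."""
    lhs = (k * k - n * n) ** 2 - 2 * i * i * (k * k + n * n) + i ** 4
    rhs = (k * k - (n - i) ** 2) * (k * k - (n + i) ** 2)
    return lhs == rhs


def generic_factorization_holds(i, k, n):
    """(k - n)^2 - 2 i^2 (k + n) + i^4 == (k - n - i^2)^2 - 4 i^2 n."""
    lhs = (k - n) ** 2 - 2 * i * i * (k + n) + i ** 4
    rhs = (k - n - i * i) ** 2 - 4 * i * i * n
    return lhs == rhs
```

The second identity shows why the case of non-square $n$ is different: $(k - n - i^2)^2 = 4i^2 n$ would force $n$ to be a perfect square when $k$, $n$ and $i \geq 1$ are integers.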


For $n, k \in \mathbb{Z}_+$ such that $k < |n - m|$ or $k > n + m$, let $x = k^2$, $y = n^2$. Then we have
$$f(k^2, n^2) = (k^2 - n^2) \prod_{i=1}^{m} \big[(k^2 - n^2)^2 - 2i^2(k^2 + n^2) + i^4\big] = (k^2 - n^2) \prod_{i=1}^{m} \big[k^2 - (n - i)^2\big]\big[k^2 - (n + i)^2\big] \neq 0.$$
This proves that
$$\dim \mathrm{Hom}_{A(L(1,0))}\big(A(L(1, m^2)) \otimes_{A(L(1,0))} L(1, n^2)(0),\ L(1, k^2)(0)\big) = 0.$$
So (4.12) is true. For (4.14), we have
$$f(k, n) = (k - n) \prod_{i=1}^{m} \big[(k - n)^2 - 2i^2(k + n) + i^4\big] = (k - n) \prod_{i=1}^{m} \big[(k - n - i^2)^2 - 4i^2 n\big] \neq 0,$$
since $n \neq k$ and $n \neq p^2$ for all $p \in \mathbb{Z}_+$. Therefore (4.14) holds. By Theorem 4.6, we have
$$\dim \mathrm{Hom}_{A(L(1,0))}\big(A(L(1, m^2)) \otimes_{A(L(1,0))} L(1, n)(0),\ L(1, n)(0)\big) = 1.$$
Since for $n \in \mathbb{Z}_+$ such that $n \neq p^2$ for all $p \in \mathbb{Z}_+$ we have $L(1, n) = V(1, n) \cong L(1, n)'$, (4.13) then follows from Lemma 4.4.

The following corollary is not used in this paper, but it is an interesting result.

Corollary 4.8. Let $U$ be a highest weight module for the Virasoro algebra generated by the highest weight vector $u^{(r)}$ such that
$$L(0)u^{(r)} = r^2 u^{(r)}, \qquad L(k)u^{(r)} = 0 \text{ for } k > 0, \qquad r \in \mathbb{Z}_+ \setminus \{0\}.$$
Let $m, n \in \mathbb{Z}_+ \setminus \{0\}$ be such that $m \neq n$ and $m, n$ are not perfect squares. Then
$$I_{L(1,0)}\binom{U}{L(1, m)\ L(1, n)} = 0.$$

Proof. If $U$ is irreducible, the corollary immediately follows from Proposition 4.2 and Theorem 4.7. Otherwise, let $U'$ be the graded dual of $U$. Then $U'$ contains an irreducible submodule $W^{(r)}$ which is isomorphic to $L(1, r^2)$. By Theorem 4.7,
$$I_{L(1,0)}\binom{L(1, n)}{W^{(r)}\ L(1, m)} = 0.$$
$U'$ contains a submodule $W^{(r+1)}$ such that $\bar{W}^{(r+1)} = W^{(r+1)}/W^{(r)}$ is an irreducible $L(1, 0)$-module isomorphic to $L(1, (r + 1)^2)$. Again by Theorem 4.7, we have
$$I_{L(1,0)}\binom{L(1, n)}{\bar{W}^{(r+1)}\ L(1, m)} = 0.$$


This implies
$$I_{L(1,0)}\binom{L(1, n)}{W^{(r+1)}\ L(1, m)} = 0.$$
Continuing the above steps, we deduce that
$$I_{L(1,0)}\binom{L(1, n)}{W\ L(1, m)} = 0$$
for any proper submodule $W$ of $U'$. We now claim that
$$I_{L(1,0)}\binom{L(1, n)}{U'\ L(1, m)} = 0.$$
Let $\mathcal{Y} \in I_{L(1,0)}\binom{L(1, n)}{U'\ L(1, m)}$ be a nonzero intertwining operator. Then $\mathcal{Y}(u, z) \neq 0$ for some $u \in U'$. Since $U$ is a highest weight module for the Virasoro algebra, there exists a proper submodule $W$ of $U'$ such that $u \in W$. This shows that
$$I_{L(1,0)}\binom{L(1, n)}{W\ L(1, m)} \neq 0,$$
a contradiction. Using Proposition 4.2 we conclude that
$$\dim I_{L(1,0)}\binom{U}{L(1, m)\ L(1, n)} = \dim I_{L(1,0)}\binom{L(1, n)}{U'\ L(1, m)} = 0,$$
as desired.

5. Uniqueness of L(1/2, 0) ⊗ L(1/2, 0)

In this section we prove the main theorem of this paper:

Theorem 5.1. If $V$ is a simple, rational and $C_2$-cofinite vertex operator algebra such that $V_1 = 0$, $c = \tilde{c} = 1$, $V$ is a sum of highest weight modules for the Virasoro algebra and $\dim V_2 \geq 2$, then $\dim V_2 = 2$ and $V$ is isomorphic to $L(1/2, 0) \otimes L(1/2, 0)$.

From now on we assume that $V$ satisfies all the assumptions given in Theorem 5.1. First we notice that $V_n = 0$ if $n < 0$ and $V_0 = \mathbb{C}\mathbf{1}$ (see [DGL]). Also there is a unique symmetric, non-degenerate invariant bilinear form $(\,,)$ on $V$ such that $(\mathbf{1}, \mathbf{1}) = 1$ (see [L1]). Then for any $u, v, w \in V$,
$$(u, v)\mathbf{1} = \mathrm{Res}_z\, z^{-1} Y\big(e^{L(1)z}(-z^{-2})^{L(0)} u,\ z^{-1}\big)v.$$
In particular, the restriction of the form to each homogeneous subspace $V_n$ is non-degenerate and
$$(u_{n+1}v, w) = (v, u_{-n+1}w)$$


for all $u, v \in V_2$ and $w \in V$. $V_2$ is a commutative non-associative algebra with the product $ab = a_1 b$ for $a, b \in V_2$ and the identity $\frac{\omega}{2}$ (cf. [FLM]). For $a, b \in V_2$ we have $(a, b)\mathbf{1} = a_3 b$. Moreover, the form on $V_2$ is associative. That is, $(ab, c) = (a, bc)$ for $a, b, c \in V_2$. By [R], either there is a nontrivial nilpotent element $x \in V_2$ or $V_2$ is spanned by idempotent elements.

Lemma 5.2. If $V_2$ is spanned by the idempotent elements, then $V$ is isomorphic to $L(1/2, 0) \otimes L(1/2, 0)$.

Proof. Let $x \in V_2$ be a nontrivial idempotent element. Set $\omega^1 = 2x$ and $\omega^2 = \omega - 2x$. Then the $\omega^i$ are Virasoro elements [M1]. It follows from the proof of Theorem 3.1 of [ZD] that $V$ contains $L(c_1, 0) \otimes L(c_2, 0)$ as a subalgebra for some complex numbers $c_1, c_2$ such that $c_1 + c_2 = 1$. In fact, $L(c_i, 0)$ is isomorphic to the subalgebra generated by $\omega^i$. It then follows from the proof of Lemmas 4.5 and 4.6 of [ZD] that both $c_1$ and $c_2$ are $1/2$. That is, $V$ contains the rational vertex operator algebra $L(1/2, 0) \otimes L(1/2, 0)$ (see [DMZ] and [W]) as a subalgebra and $V$ is a completely reducible $L(1/2, 0) \otimes L(1/2, 0)$-module. Since the irreducible modules of $L(1/2, 0) \otimes L(1/2, 0)$ are $L(1/2, h_1) \otimes L(1/2, h_2)$ for $h_i \in \{0, \frac{1}{2}, \frac{1}{16}\}$, and $\dim V_0 = 1$, $\dim V_1 = 0$, we immediately see that $V = L(1/2, 0) \otimes L(1/2, 0)$. In particular, $\dim V_2 = 2$.

We now deal with the case that there exists $0 \neq x \in V_2$ such that $x^2 = 0$. There are two cases: (1) $(\omega, x) \neq 0$; (2) $(\omega, x) = 0$.

Lemma 5.3. We must have $(\omega, x) = 0$.

Proof. If $(\omega, x) \neq 0$, we can assume that $(\omega, x) = 1$. Then the component operators $W(n)$ of $Y(x, z) = \sum_{n \in \mathbb{Z}} W(n) z^{-n-2}$ and the component operators $L(n)$ of $Y(\omega, z)$ generate a copy of the W-algebra $W(2, 2)$ with central charge 1, where $W(2, 2)$ is an infinite dimensional Lie algebra with basis $L_m, W_m, C$ for $m \in \mathbb{Z}$ and Lie brackets
$$[L_m, L_n] = (m - n)L_{m+n} + \frac{m^3 - m}{12}\delta_{m+n,0}\, C,$$
$$[L_m, W_n] = (m - n)W_{m+n} + \frac{m^3 - m}{12}\delta_{m+n,0}\, C,$$
$$[W_m, W_n] = 0$$
for $m, n \in \mathbb{Z}$, where $C$ is a central element (see [ZD]). Let $c, h_1, h_2 \in \mathbb{C}$ and denote by $V(c, h_1, h_2)$ the Verma module for $W(2, 2)$ with central charge $c$ and highest weight $(h_1, h_2)$. Then $V(c, h_1, h_2) = U(W(2, 2))/I_{c,h_1,h_2}$, where $I_{c,h_1,h_2}$ is the left ideal of the universal enveloping algebra $U(W(2, 2))$ generated by $L_m$, $W_m$, $C - c$, $L_0 - h_1$ and $W_0 - h_2$ for positive $m$. By the PBW theorem, $V(c, h_1, h_2)$ has basis
$$\{W_{-m_1} \cdots W_{-m_s} L_{-n_1} \cdots L_{-n_t} \mathbf{1}_{(h_1,h_2)} \mid m_1 \geq \cdots \geq m_s \geq 1,\ n_1 \geq \cdots \geq n_t \geq 1\},$$
where $\mathbf{1}_{(h_1,h_2)} = 1 + I_{c,h_1,h_2}$. It is standard that $V(c, h_1, h_2)$ has a unique maximal submodule $J(c, h_1, h_2)$, so that $L(c, h_1, h_2) = V(c, h_1, h_2)/J(c, h_1, h_2)$ is an irreducible highest weight module of $W(2, 2)$. By Theorem 2.1 of [ZD], if $c \neq 0$ then $J(c, 0, 0) = U(W(2, 2))L_{-1}\mathbf{1}_{(0,0)} + U(W(2, 2))W_{-1}\mathbf{1}_{(0,0)}$ and $L(c, 0, 0)$ has a basis
$$\{W_{-m_1} \cdots W_{-m_s} L_{-n_1} \cdots L_{-n_t} \mathbf{1}_0 \mid m_1 \geq \cdots \geq m_s > 1,\ n_1 \geq \cdots \geq n_t > 1\}, \tag{5.1}$$
where $\mathbf{1}_0$ is the canonical highest weight vector of $L(c, 0, 0)$.

A Characterization of Vertex Operator Algebra L(1/2, 0) ⊗ L(1/2, 0)


Let $U$ be the vertex operator subalgebra generated by $\omega, x$. Then $U$ is a highest weight $W(2, 2)$-module with highest weight vector $\mathbf{1}$ such that $W_n$ acts as $W(n)$ and $L_n$ acts as $L(n)$ for all $n \in \mathbb{Z}$. Since $L(-1)\mathbf{1} = W(-1)\mathbf{1} = 0$, we see that $U$ is isomorphic to $L(1, 0, 0)$. By (5.1), $U$ has $q$-character
\[ \mathrm{ch}_q U = \frac{q^{-1/24}}{\prod_{n>1}(1 - q^n)^2}. \]
By Proposition 4.2 of [ZD], the coefficients of
\[ \eta(q)\,\mathrm{ch}_q U = \frac{1 - q}{\prod_{n>1}(1 - q^n)} \]
grow faster than any polynomial in $n$. But this is a contradiction as the coefficients of $\eta(q)\,\mathrm{ch}_q V$ satisfy the polynomial growth condition by Lemma 2.6.

So we can now assume that $(\omega, x) = 0$. Since $L(1)x \in V_1 = 0$ and $(\omega, x) = (L(2)x, \mathbf{1})$ we see that $x$ is a highest weight vector for the Virasoro algebra. By the fact that the bilinear form $(\cdot, \cdot)$ on $V$ is non-degenerate and $(\omega, \omega) = \tfrac{1}{2}$, there exists $y \in V_2$ such that $(x, y) = 1$, $(y, \omega) = 0$. So $(L(2)y, \mathbf{1}) = 0$. This means that $L(2)y = 0$. Since $L(1)y \in V_1 = 0$, we deduce that $y$ is a highest weight vector for the Virasoro algebra. Assume that $xy = a\omega + \alpha x + \beta y + u$, where $a, \alpha, \beta \in \mathbb{C}$, and $u \in V_2$ such that $(u, x) = (u, y) = (u, \omega) = 0$. Note that
\[ (x, y) = \tfrac{1}{2}(x, y\omega) = \tfrac{1}{2}(xy, \omega) \]
and $(\omega, \omega) = \tfrac{1}{2}$. We have $a = 4$. Since $(y, xx) = (xy, x) = \beta(x, y) = 0$, it follows that $\beta = 0$. Therefore $xy = 4\omega + \alpha x + u$. It is obvious that $u$ is a highest weight vector for the Virasoro algebra. The following lemma is an immediate consequence of the commutator formula in vertex operator algebras.

Lemma 5.4. Let $v$ be a highest weight vector for the Virasoro algebra with highest weight 2. Then $[L(m), v_n] = (m - n + 1)v_{n+m}$ for all $m, n \in \mathbb{Z}$.

Lemma 5.5. Assume that $x_{-1}x = 0$. Then we have (1) $u_1 x = -10x$, (2) $u_0 x = -5x_{-2}\mathbf{1}$.

Proof. Since $V_n = 0$ for $n < 0$, we have $x_n x = 0$ for $n \geq 4$. By the fact that $x_1 x = x^2 = 0$, we have $(x, x) = (x_3 x, \mathbf{1}) = (\omega/2, x^2) = 0$. So $x_3 x = 0$. Using the skew symmetry $Y(x, z)x = e^{L(-1)z}Y(x, -z)x$ we see that
\[ x_0 x = -x_0 x + L(-1)x_1 x = -x_0 x + L(-1)x^2 = -x_0 x. \]


C. Dong, C. Jiang

This proves that $x_0 x = 0$. Note that $x_2 x = 0$, since $V_1 = 0$. So we have $x_n x = 0$ for $n \geq 0$. Thus $Y(x, z_1)Y(x, z_2) = Y(x, z_2)Y(x, z_1)$ and $Y(x_{-1}x, z) = Y(x, z)Y(x, z) = 0$. In particular,
\[ x_1 x_1 + 2\sum_{i \geq 1} x_{1-i}x_{1+i} = 0 \]
and
\[ \Bigl(x_1 x_1 + 2\sum_{i \geq 1} x_{1-i}x_{1+i}\Bigr)y = x_1 x_1 y + 2x = 10x + x_1 u = 0. \]
This proves (1). For (2), we apply the zero operator $\sum_{i \geq 0} x_{-i}x_{i+1}$ to $y$ to obtain
\[ 0 = x_0 x_1 y + x_{-2}x_3 y = x_0(4\omega + \alpha x + u) + x_{-2}\mathbf{1} = 5x_{-2}\mathbf{1} + x_0 u, \]
where we have used Lemma 5.4. Thus, $x_0 u = -5x_{-2}\mathbf{1}$. Using the skew symmetry we see that
\[ u_0 x = -x_0 u + L(-1)x_1 u = 5x_{-2}\mathbf{1} - 10x_{-2}\mathbf{1} = -5x_{-2}\mathbf{1}, \]
as desired.

From now on we redefine $y$ as $y + \frac{\alpha}{10}u$. It follows from Lemma 5.5 that $x_1 y = y_1 x = 4\omega + u$. Although this new $y$ is again a highest weight vector for the Virasoro algebra, we cannot assume $(y, u) = 0$ any more.

Corollary 5.6. (1) $[u_m, x_n] = 5(n - m)x_{m+n-1}$ for $m, n \in \mathbb{Z}$. (2) $(u, u) = -10$.

Proof. (1) follows from Lemma 5.5 and the commutator formula
\[ [u_m, x_n] = \sum_{i \geq 0} \binom{m}{i}(u_i x)_{m+n-i}. \]
For (2) we compute $(x_1 y, x_1 y) = (4\omega + u, 4\omega + u) = 8 + (u, u)$. On the other hand, $(x_1 y, x_1 y) = (y, x_1(4\omega + u)) = (y, 8x - 10x) = -2$. That is, $(u, u) = -10$.

Lemma 5.7. Assume that $x_{-1}x = 0$. Then there exist $a, b \in \mathbb{C}$ such that $v = u_{-1}x + ax_{-3}\mathbf{1} + bL(-2)x$ is a nonzero highest weight vector of weight 4 for the Virasoro algebra.


Proof. We first use the conditions $L(1)v = L(2)v = 0$ to determine $a, b$. Using Lemmas 5.4 and 5.5 we have
\[ L(1)v = L(1)u_{-1}x + aL(1)x_{-3}\mathbf{1} + bL(1)L(-2)x = 3u_0 x + 5ax_{-2}\mathbf{1} + 3bx_{-2}\mathbf{1} = (-15 + 5a + 3b)x_{-2}\mathbf{1} \]
and
\[ L(2)v = L(2)u_{-1}x + aL(2)x_{-3}\mathbf{1} + bL(2)L(-2)x = 4u_1 x + 6ax + b\bigl(4L(0) + \tfrac{1}{2}\bigr)x = \bigl(-40 + 6a + b(8 + \tfrac{1}{2})\bigr)x. \]
So $a = \frac{15}{49}$, $b = \frac{220}{49}$ are uniquely determined by the linear system
\[ 5a + 3b = 15, \qquad 12a + 17b = 80. \]
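The values of $a$ and $b$ can be checked with exact rational arithmetic. The following is only a sanity-check sketch, not part of the original argument; the helper name `solve_2x2` is hypothetical. It also evaluates the constant $200 - 10(3a + 4b + 1)$, which appears later in the proof when $(y_3 v, u)$ is computed.

```python
from fractions import Fraction

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 exactly (Cramer's rule).

    Hypothetical helper for illustration only.
    """
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("singular system")
    x = Fraction(b1 * a22 - a12 * b2) / det
    y = Fraction(a11 * b2 - b1 * a21) / det
    return x, y

# L(1)v = 0 gives 5a + 3b = 15; L(2)v = 0 gives 12a + 17b = 80.
a, b = solve_2x2(5, 3, 15, 12, 17, 80)

# Constant that the proof later obtains for the pairing (y_3 v, u).
pairing = 200 - 10 * (3 * a + 4 * b + 1)

print(a, b, pairing)
```

The determinant of the system is $5 \cdot 17 - 3 \cdot 12 = 49$, which is why both solutions have denominator 49 and why the system determines $a$ and $b$ uniquely.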

It is clear that $L(n)v = 0$ for $n > 2$. We now prove that $v$ is nonzero. It is enough to prove that $y_3 v \neq 0$. We have the following computation:
\[ y_3 v = \sum_{i=0}^{3}\binom{3}{i}(y_i u)_{2-i}x + u + a\sum_{i=0}^{3}\binom{3}{i}(y_i x)_{-i}\mathbf{1} + b(4y_1 + L(-2)y_3)x \]
\[ = (y_0 u)_2 x + 3(y_1 u)_1 x + (y, u)x + u + 3ay_1 x + 4by_1 x + b\omega \]
\[ = (-u_0 y + L(-1)u_1 y)_2 x + 3(y_1 u)_1 x + (y, u)x + u + (3a + 4b)(4\omega + u) + b\omega \]
\[ = -u_0 y_2 x + y_2 u_0 x - 2(u_1 y)_1 x + 3(y_1 u)_1 x + (y, u)x + (12a + 17b)\omega + (3a + 4b + 1)u \]
\[ = -5y_2 x_{-2}\mathbf{1} + (u_1 y)_1 x + (y, u)x + (12a + 17b)\omega + (3a + 4b + 1)u. \]
Thus we have
\[ (y_3 v, u) = \bigl(-5y_2 x_{-2}\mathbf{1} + (u_1 y)_1 x + (y, u)x + (12a + 17b)\omega + (3a + 4b + 1)u,\ u\bigr) \]
\[ = -5(x_{-2}\mathbf{1}, y_0 u) + (u_1 y, x_1 u) + (3a + 4b + 1)(u, u) \]
\[ = -5(x_{-2}\mathbf{1}, -u_0 y + L(-1)u_1 y) - 10(u_1 y, x) - 10(3a + 4b + 1) \]
\[ = 5(u_2 x_{-2}\mathbf{1}, y) - 5(L(1)x_{-2}\mathbf{1}, u_1 y) + 100 - 10(3a + 4b + 1) \]
\[ = -100(x, y) - 20(x, u_1 y) + 100 - 10(3a + 4b + 1) \]
\[ = 200 - 10(3a + 4b + 1) = \frac{60}{49} \neq 0. \]
The proof is complete.

Lemma 5.8. Assume that $x_{-1}x = 0$. Let $v = u_{-1}x + ax_{-3}\mathbf{1} + bL(-2)x$ be the nonzero highest weight vector given in Lemma 5.7. Then $x_i v = 0$ for all $i \geq 0$.


Proof. Since $x_{-1}x = 0$, it follows that $x_{-2}x = \tfrac{1}{2}L(-1)x_{-1}x = 0$. So for $i \geq 0$, we have
\[ x_i v = x_i u_{-1}x + ax_i x_{-3}\mathbf{1} + bx_i L(-2)x = 5(-1 - i)x_{i-2}x + u_{-1}x_i x + b(i + 1)x_{i-2}x + bL(-2)x_i x = 0, \]
as desired.

Lemma 5.9. $V$ is a completely reducible module for the Virasoro algebra.

Proof. By the assumption, $V$ is a sum of highest weight modules for the Virasoro algebra. We claim that any highest weight module for the Virasoro algebra generated by a highest weight vector $w \in V$ with highest weight $n$ is isomorphic to $L(1, n)$. If not, let $U$ be the highest weight module generated by $w$ for the Virasoro algebra. Then $U$ has a unique maximal submodule $M$ generated by a highest weight vector $f$. Then we can write $f$ as a linear combination of $L(-n_1) \cdots L(-n_k)w$ for $n_1 \geq \cdots \geq n_k \geq 1$. Let $X$ be a highest weight module in $V$ for the Virasoro algebra generated by a highest weight vector $g$. It is clear that
\[ (L(-n_1) \cdots L(-n_k)w, g) = (w, L(n_k) \cdots L(n_1)g) = 0, \]
and so $(f, g) = 0$. Let $L(-m_1) \cdots L(-m_p)g \in X$ such that $m_i > 0$ and $p \geq 1$. Then
\[ (f, L(-m_1) \cdots L(-m_p)g) = (L(m_p) \cdots L(m_1)f, g) = 0. \]
This shows that $(f, V) = 0$. Since the form is non-degenerate, this is impossible. As a result, $V$ is a completely reducible module for the Virasoro algebra.

We can now complete the proof of Theorem 5.1. Let $v$ be the vector given in Lemma 5.7 if $x_{-1}x = 0$, otherwise let $v = x_{-1}x$. Then $v$ is a nonzero highest weight vector for the Virasoro algebra with highest weight 4 such that $x_i v = 0$ for all $i \geq 0$. It follows from Lemma 5.9 that the highest weight modules generated by $x$ and $v$ are isomorphic to $L(1, 2)$ and $L(1, 4)$ respectively. By Proposition 11.9 of [DL], $Y(x, z)v \neq 0$ as $V$ is simple. Thus there exists $n > 0$ such that $x_{-n}v \neq 0$ and $x_{-m}v = 0$ for all $m < n$. Then $x_{-n}v$ is a highest weight vector for the Virasoro algebra with highest weight $n + 5$ and generates an irreducible highest weight module isomorphic to $L(1, n + 5)$. As a result we have a nonzero intertwining operator of type
\[ \binom{L(1, n + 5)}{L(1, 4),\ L(1, 2)}. \]
This is a contradiction by Theorem 4.7. Hence there is no nontrivial nilpotent element in $V_2$ and Theorem 5.1 holds by Lemma 5.2.

Remark 5.10. As we pointed out in [ZD], the assumption $c = \tilde{c}$ in Theorem 5.1 is necessary. We believe that the assumption that $V$ is a sum of highest weight modules for the Virasoro algebra is unnecessary, but we do not know how to prove the main result of this paper without it.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.


References

[B] Borcherds, R.: Vertex algebras, Kac-Moody algebras, and the Monster. Proc. Natl. Acad. Sci. USA 83, 3068–3071 (1986)
[D] Dong, C.: Vertex algebras associated with even lattices. J. Algebra 160, 245–265 (1993)
[DG] Dong, C., Griess, R. Jr.: Rank one lattice type vertex operator algebras and their automorphism groups. J. Algebra 208, 262–275 (1998)
[DGH] Dong, C., Griess, R. Jr., Hoehn, G.: Framed vertex operator algebras, codes and the moonshine module. Commun. Math. Phys. 193, 407–448 (1998)
[DGL] Dong, C., Griess, R. Jr., Lam, C.: Uniqueness results of the moonshine vertex operator algebra. Amer. J. Math. 129, 583–609 (2007)
[DL] Dong, C., Lepowsky, J.: Generalized Vertex Algebras and Relative Vertex Operators. Progress in Math. Vol. 112, Boston: Birkhäuser, 1993
[DLM1] Dong, C., Li, H., Mason, G.: Regularity of rational vertex operator algebras. Adv. in Math. 132, 148–166 (1997)
[DLM2] Dong, C., Li, H., Mason, G.: Twisted representations of vertex operator algebras. Math. Ann. 310, 571–600 (1998)
[DLM3] Dong, C., Li, H., Mason, G.: Modular invariance of trace functions in orbifold theory and generalized moonshine. Commun. Math. Phys. 214, 1–56 (2000)
[DLiM] Dong, C., Lin, Z., Mason, G.: On vertex operator algebras as sl2-modules. In: Groups, Difference Sets, and the Monster, Proc. of a Special Research Quarter at The Ohio State University, Spring 1993, ed. by Arasu, K.T., Dillon, J.F., Harada, K., Sehgal, S., Solomon, R., Berlin-New York: Walter de Gruyter, 1996, pp. 349–362
[DM1] Dong, C., Mason, G.: Rational vertex operator algebras and the effective central charge. International Math. Research Notices 56, 2989–3008 (2004)
[DM2] Dong, C., Mason, G.: Quantum Galois theory for compact Lie groups. J. Algebra 214, 92–102 (1999)
[DMZ] Dong, C., Mason, G., Zhu, Y.: Discrete series of the Virasoro algebra and the moonshine module. Proc. Symp. Pure. Math. American Math. Soc. 56(II), 295–316 (1994)
[FHL] Frenkel, I.B., Huang, Y., Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Memoirs American Math. Soc. 104, 1993
[FK] Frenkel, I., Kac, V.: Basic representations of affine Lie algebras and dual resonance models. Invent. Math. 62, 23–66 (1980)
[FLM] Frenkel, I.B., Lepowsky, J., Meurman, A.: Vertex Operator Algebras and the Monster. Pure and Applied Math. Vol. 134, New York-London: Academic Press, 1988
[FZ] Frenkel, I., Zhu, Y.: Vertex operator algebras associated to representations of affine and Virasoro algebras. Duke Math. J. 66, 123–168 (1992)
[KR] Kac, V.G., Raina, A.: Highest Weight Representations of Infinite Dimensional Lie Algebras. Adv. Ser. In Math. Phys., Singapore: World Scientific, 1987
[KL] Kawahigashi, Y., Longo, R.: Local conformal nets arising from framed vertex operator algebras. Adv. Math. 206, 729–751 (2006)
[K] Kiritsis, E.: Proof of the completeness of the classification of rational conformal field theories with c = 1. Phys. Lett. B 217, 427–430 (1989)
[KM] Knopp, M., Mason, G.: On vector-valued modular forms and their Fourier coefficients. Acta Arith. 110, 117–124 (2003)
[LY] Lam, C., Yamauchi, H.: A characterization of the moonshine vertex operator algebra by means of Virasoro frames. Int. Math. Res. Not. 2007 (2007), ID rnm003, 10 pp
[L1] Li, H.: Symmetric invariant bilinear forms on vertex operator algebras. J. Pure Appl. Algebra 96, 279–297 (1994)
[L2] Li, H.: Determining fusion rules by A(V)-modules and bimodules. J. Algebra 212, 515–556 (1999)
[M] Milas, A.: Fusion rings for degenerate minimal models. J. Algebra 254, 300–335 (2002)
[M1] Miyamoto, M.: Griess algebras and conformal vectors in vertex operator algebras. J. Algebra 179, 523–548 (1996)
[M2] Miyamoto, M.: Binary codes and vertex operator superalgebras. J. Algebra 181, 207–222 (1996)
[M3] Miyamoto, M.: Representation theory of code vertex operator algebra. J. Algebra 201, 115–150 (1998)
[M4] Miyamoto, M.: A new construction of the moonshine vertex operator algebra over the real number field. Ann. of Math. 159, 535–596 (2004)
[R] Röhrl, H.: Finite-dimensional algebras without nilpotents over algebraically closed fields. Arch. Math. 32, 10–12 (1979)
[RT] Rehren, K., Tuneke, H.: Fusion rules for the continuum sectors of the Virasoro algebra of c = 1. Lett. Math. Phys. 53, 305–312 (2000)
[W] Wang, W.: Rationality of Virasoro vertex operator algebras. Internat. Math. Res. Notices 7, 197–211 (1993)
[X] Xu, F.: Strong additivity and conformal nets. Pacific J. Math. 221, 167–199 (2005)
[ZD] Zhang, W., Dong, C.: W-algebra W(2,2) and the vertex operator algebra L(1/2, 0) ⊗ L(1/2, 0). Commun. Math. Phys. 285, 991–1004 (2009)
[Z] Zhu, Y.: Modular invariance of characters of vertex operator algebras. J. Amer. Math. Soc. 9, 237–302 (1996)

Communicated by Y. Kawahigashi

Commun. Math. Phys. 296, 89–109 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-0993-z

Communications in

Mathematical Physics

Langlands Duality for Representations and Quantum Groups at a Root of Unity

Kevin McGerty

Department of Mathematics, Imperial College London, London SW7 2AZ, United Kingdom. E-mail: [email protected]

Received: 19 May 2009 / Accepted: 28 October 2009
Published online: 4 February 2010 – © Springer-Verlag 2010

Abstract: We give a representation-theoretic interpretation of the Langlands character duality of [FH], and show that the “Langlands branching multiplicities” for symmetrizable Kac-Moody Lie algebras are equal to certain tensor product multiplicities. For finite type quantum groups, the connection with tensor products can be explained in terms of tilting modules.

Supported by a Royal Society University Research Fellowship.

1. Introduction

Let $\mathfrak{g}$ be a simple Lie algebra and ${}^L\mathfrak{g}$ its Langlands dual Lie algebra. In [FH] a duality between the irreducible characters of $\mathfrak{g}$ and ${}^L\mathfrak{g}$ was established and a number of conjectures were made about its properties, both on the level of representations and, more combinatorially, at the level of crystals. In this paper we generalize the character duality to the category $O_{\mathrm{int}}$ of integrable representations in category $O$ for a symmetrizable Kac-Moody Lie algebra, and establish, in this more general context, a number of the conjectures of [FH]. With a mild restriction on the generalized Cartan matrix we also give a representation-theoretic interpretation of the character duality using Lusztig’s modified quantum groups at a root of unity.

It will be convenient to use Lusztig’s notion of a root datum and Cartan datum, the latter being essentially a symmetrizable generalized Cartan matrix with an integral choice of symmetrization. Indeed for any symmetrizable generalized Cartan matrix $C = (a_{ij})_{i,j \in I}$ on an indexing set $I$, we may choose a root datum $(X, Y, I)$ consisting of a weight lattice $X$, a coweight lattice $Y$, a perfect pairing $\langle \cdot, \cdot \rangle : Y \times X \to \mathbb{Z}$, along with a set of simple roots $\{\alpha_i\}_{i \in I} \subset X$ and a set of simple coroots $\{\check{\alpha}_i\}_{i \in I} \subset Y$, which satisfy $\langle \check{\alpha}_i, \alpha_j \rangle = a_{ij}$. We will assume that both the simple roots and simple coroots are linearly independent, so that we have the standard partial ordering on $X$, and a dominant cone
\[ X^+ = \{\lambda \in X : \langle \check{\alpha}_i, \lambda \rangle \geq 0,\ \forall i \in I\}. \]
Let $(d_i)_{i \in I}$ be an integral vector such that $DC$ is symmetric, where $D$ is the diagonal matrix with $D_{ii} = d_i$. Set $d$ to be the least common multiple of the $d_i$, and let $l_i = d/d_i$. Let ${}^L\mathfrak{g}$ be the Langlands dual Kac-Moody algebra with Cartan matrix $C^t$. Then we may embed the weight lattice of ${}^L\mathfrak{g}$ into $X$ in such a way that $\varpi_i^L \mapsto \varpi_i^* = l_i \varpi_i$ (where $\varpi_i$, $\varpi_i^L$ are the fundamental weights). Let $X^*$ denote the image of this map${}^1$. Given a simple highest weight representation $\nabla(\lambda)$ of $\mathfrak{g}$, such that $\lambda \in X^* \cap X^+$, let $c^*(\nabla(\lambda))$ denote the direct sum of those weight spaces of $\nabla(\lambda)$ whose weights lie in $X^*$. In [FH] it was shown that the character of $c^*(\nabla(\lambda))$ is the virtual character of a representation of ${}^L\mathfrak{g}$: more precisely it was shown that
\[ \chi(c^*(\nabla(\lambda))) = \chi^L(\lambda) + \sum_{\mu \in X^*,\ \mu < \lambda} m_{\lambda\mu}\,\chi^L(\mu), \qquad (m_{\lambda\mu} \in \mathbb{Z}), \tag{1.1} \]

where $\chi^L(\mu)$ is the character of the simple highest weight module $\nabla^L(\mu)$ for ${}^L\mathfrak{g}$ of highest weight $\mu$ (in the notation of that paper, the left-hand side is written $(\chi(\lambda))$). Moreover, it was also shown there that the crystal graph for $\nabla(\lambda)$ naturally contains the crystal graph of $\nabla^L(\lambda)$ (this result is also implied by a previous result of Kashiwara [K96] in a somewhat different context). Thus the character $\chi^L(\lambda)$ can be viewed as a “subcharacter” of the character $\chi(\lambda) = \chi(\nabla(\lambda))$ (that is, the dimension of a weight space in $\nabla^L(\lambda)$ is at most the dimension of the corresponding weight space of $\nabla(\lambda)$).

In [FH] the authors proposed an interpretation of the character duality $c^*$ in terms of representations of a two-parameter deformation of $\mathfrak{g}$. While we cannot establish the duality in this context, we have obtained an interpretation in terms of the representation theory of (one-parameter) quantum groups at a root of unity. Since the strategy using two-parameter deformations also involved specializing one parameter to a root of unity, our approach can be viewed as evidence for the conjectures in [FH] on two-parameter deformations.

Let $\dot{U}_\ell$ be a modified quantum group at an $\ell$th root of unity. Recall that Lusztig has defined for such algebras a quantum Frobenius map $\mathrm{Fr} : \dot{U}_\ell \to \dot{U}_\ell^*$ whose target is a modified quantum group $\dot{U}_\ell^*$, where the parameters $v_i$ for $\dot{U}_\ell^*$ are $\pm 1$ (see Sect. 2 for details). Under mild restrictions (which are satisfied for example by all finite and affine types except the affine type $A_{2n}^{(1)}$, which is in any case simply-laced, and so not interesting from the point of view of our duality), if $\dot{U}_\ell$ is the quantum group attached to $(X, Y, I)$ at $\ell = r$, the algebra $\dot{U}_\ell^*$ is very close to the enveloping algebra of ${}^L\mathfrak{g}$. In particular, there is a natural category of representations for $\dot{U}_\ell^*$ which can be identified with the category $O_{\mathrm{int}}$ of integrable representations in category $O$ for ${}^L\mathfrak{g}$, and moreover the natural notions of characters for representations of each algebra coincide under this identification.

In [M], it was observed that the map $\mathrm{Fr}$ has a natural splitting $c : \dot{U}_\ell^* \to \dot{U}_\ell$ (this is readily deduced from the existence of a slightly weaker kind of splitting map which had already been constructed by Lusztig). Now $\dot{U}_\ell$ has a standard representation $\nabla(\lambda)$ with character $\chi(\lambda)$, and via the map $c$, the subspace $c^*(\nabla(\lambda))$ becomes a representation of $\dot{U}_\ell^*$ which when viewed as a ${}^L\mathfrak{g}$-module lies in $O_{\mathrm{int}}$. It is then easy to check that this gives a representation-theoretic lift of Eq. (1.1), and hence positivity for the integers $m_{\lambda\mu}$. Some examples of this positivity were already established in [FH] where the case of $B_2$ is studied in detail.

1 For Kac-Moody algebras of finite type, this corresponds to the embedding used by Frenkel and Hernandez, who denote its image by $P'$.


Another natural question posed by [FH] was to compute the “Langlands branching multiplicities” m λµ . Somewhat surprisingly, we are able to give an expression for the m λµ in terms of tensor product multiplicities for g in the case of an arbitrary symmetrizable Kac-Moody Lie algebra. Since tensor product multiplicities can be calculated via various combinatorial techniques, such as Littelmann paths [Li95], Kashiwara’s crystal graphs (see e.g. [K02]), and (in the finite-type case) Berenstein-Zelevinsky’s polyhedral expressions [BZ], this gives a computable expression for the Langlands branching multiplicities. Moreover, since tensor product multiplicities are manifestly positive, we get a purely combinatorial proof of the positivity of branching multiplicities, and hence obtain a proof that c∗ (∇(λ)) has the structure of a L g-representation in the general case (without explicitly constructing that action). Using the theory of tilting modules and good filtrations of modules for quantum groups at a root of unity, we can also give representation-theoretic meaning to the combinatorial calculation of branching multiplicities. Since the theory needed for this is only available in the finite-type case, this interpretation is limited to that context. It is perhaps interesting to note that our results give a new application of the theory of quantum groups to a question which involves only the classical representation theory of Kac-Moody algebras. We now briefly outline the organization of the paper: In Sect. 2 we review Lusztig’s modified quantum groups. In Sect. 3 we recall the contraction map of [M], and in Sect. 4 use it to construct a Langlands duality for representations in category Oint for any symmetrizable Kac-Moody algebra satisfying a mild technical condition. In Sect. 5 we establish the interpretation of Langlands duality branching multiplicities as tensor product multiplicities in the general case. In Sect. 
6 we interpret this combinatorics via tilting modules and good filtrations for quantum groups of finite type.

2. Quantum Groups and Modified Quantum Groups

In this section we review Lusztig’s modified quantum groups. A detailed account of these algebras is contained in [L93, Part IV]. Let $\mathbb{Q}(v)$ be the field of rational functions in an indeterminate $v$ with $\mathbb{Q}$-coefficients, and let $\mathcal{A} = \mathbb{Z}[v, v^{-1}]$, a subring of $\mathbb{Q}(v)$. Suppose that $C = (a_{ij})_{i,j \in I}$ is a symmetrizable generalized Cartan matrix, and let $(d_i)_{i \in I} \in \mathbb{N}^I$ be a vector of positive integers such that
\[ d_i a_{ij} = d_j a_{ji}, \qquad \forall i, j \in I. \tag{2.1} \]

We may define a symmetric bilinear pairing $x, y \mapsto x \cdot y$ on $\mathbb{Z}[I]$ by setting $i \cdot j = d_i a_{ij}$, $i, j \in I$, and extending linearly. The pair $(\mathbb{Z}[I], \cdot)$ is then a Cartan datum in the sense of Lusztig [L93].

Remark 2.1. When the matrix $C$ is indecomposable (in the sense of Kac) all symmetrizing vectors $(d_i)_{i \in I}$ are multiples of the minimal such vector $(r_i)_{i \in I}$. Set $r = \mathrm{l.c.m.}\{r_i : i \in I\}$ to be the least common multiple of the $r_i$, and similarly $d = \mathrm{l.c.m.}\{d_i : i \in I\}$. All our results, however, make sense in the context of a general Cartan datum.

The Weyl group attached to a generalized Cartan matrix is the group $W$ generated by involutions $\{s_i : i \in I\}$ satisfying braid relations of length $m_{ij} = 2, 3, 4, 6$ according as $a_{ij}a_{ji}$ is $0, 1, 2, 3$ (when $a_{ij}a_{ji}$ is at least 4 no relation is imposed).
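As an illustration of condition (2.1) and of the Cartan datum just described, the following sketch computes the minimal symmetrizing vector $(r_i)$ of an indecomposable symmetrizable generalized Cartan matrix, the pairing $i \cdot j = d_i a_{ij}$, and the braid lengths $m_{ij}$. The function names are hypothetical and the code assumes the input really is an indecomposable generalized Cartan matrix (in particular $a_{ij} = 0$ if and only if $a_{ji} = 0$); it is a sketch for experimentation, not part of the paper.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def minimal_symmetrizer(C):
    """Minimal positive integers r_i with r_i*C[i][j] == r_j*C[j][i].

    Assumes C is an indecomposable generalized Cartan matrix.
    """
    n = len(C)
    d = [None] * n
    d[0] = Fraction(1)
    stack = [0]
    while stack:  # walk the Dynkin graph, propagating d_j = d_i * a_ij / a_ji
        i = stack.pop()
        for j in range(n):
            if j != i and C[i][j] != 0 and d[j] is None:
                d[j] = d[i] * Fraction(C[i][j], C[j][i])
                stack.append(j)
    if any(x is None for x in d):
        raise ValueError("matrix is decomposable")
    scale = reduce(lcm, (x.denominator for x in d))
    ints = [int(x * scale) for x in d]
    g = reduce(gcd, ints)
    ints = [x // g for x in ints]
    if any(ints[i] * C[i][j] != ints[j] * C[j][i]
           for i in range(n) for j in range(n)):
        raise ValueError("matrix is not symmetrizable")
    return ints

def cartan_datum(C, d):
    """The symmetric pairing i.j = d_i * a_ij."""
    n = len(C)
    return [[d[i] * C[i][j] for j in range(n)] for i in range(n)]

BRAID_LENGTH = {0: 2, 1: 3, 2: 4, 3: 6}

def braid_length(C, i, j):
    """m_ij for i != j; None means that no braid relation is imposed."""
    return BRAID_LENGTH.get(C[i][j] * C[j][i])

B2 = [[2, -2], [-1, 2]]
G2 = [[2, -1], [-3, 2]]
for name, C in (("B2", B2), ("G2", G2)):
    r = minimal_symmetrizer(C)
    print(name, r, cartan_datum(C, r), braid_length(C, 0, 1))
```

With the convention $a_{ij} = \langle \check{\alpha}_i, \alpha_j \rangle$ and the matrices chosen above, the sketch returns $(1, 2)$ for $B_2$ and $(3, 1)$ for $G_2$, so $r = 2$ and $r = 3$ respectively.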


For each $i \in I$ let $v_i = v^{d_i}$ and set $[n]_i = (v_i^n - v_i^{-n})/(v_i - v_i^{-1}) \in \mathcal{A}$. We then define
\[ [n]_i! = [n]_i [n-1]_i \cdots [1]_i; \qquad \begin{bmatrix} n \\ k \end{bmatrix}_i = \frac{[n]_i!}{[k]_i!\,[n-k]_i!}. \]
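The quantum integers and binomial coefficients above can be computed explicitly as Laurent polynomials in $v$. The sketch below (hypothetical helper names; Laurent polynomials are stored as dictionaries mapping exponents to integer coefficients) builds the binomial coefficients from the standard Pascal-type recursion $\begin{bmatrix} n \\ k \end{bmatrix} = v^{-k}\begin{bmatrix} n-1 \\ k \end{bmatrix} + v^{n-k}\begin{bmatrix} n-1 \\ k-1 \end{bmatrix}$, which is not stated in the text but makes integrality of the coefficients visible, and then checks the result against the defining formula by multiplying out $[k]_i!\,[n-k]_i!$.

```python
from collections import defaultdict

def mul(p, q):
    """Product of two Laurent polynomials given as {exponent: coefficient}."""
    out = defaultdict(int)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] += c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def add(p, q):
    """Sum of two Laurent polynomials."""
    out = defaultdict(int, p)
    for e, c in q.items():
        out[e] += c
    return {e: c for e, c in out.items() if c != 0}

def qint(n, d=1):
    """[n]_i with v_i = v^d, i.e. v_i^(n-1) + v_i^(n-3) + ... + v_i^(-(n-1))."""
    return {d * (n - 1 - 2 * k): 1 for k in range(n)}

def qfactorial(n, d=1):
    out = {0: 1}
    for m in range(1, n + 1):
        out = mul(out, qint(m, d))
    return out

def qbinomial(n, k, d=1):
    """Quantum binomial coefficient via the Pascal-type recursion."""
    if k < 0 or k > n:
        return {}
    if k == 0 or k == n:
        return {0: 1}
    left = mul({-d * k: 1}, qbinomial(n - 1, k, d))
    right = mul({d * (n - k): 1}, qbinomial(n - 1, k - 1, d))
    return add(left, right)

def matches_definition(n, k, d=1):
    """Check that [k]! [n-k]! * binomial equals [n]!, as in the definition."""
    lhs = mul(mul(qfactorial(k, d), qfactorial(n - k, d)), qbinomial(n, k, d))
    return lhs == qfactorial(n, d)

print(qbinomial(4, 2))
print(all(matches_definition(n, k, d)
          for d in (1, 2, 3) for n in range(7) for k in range(n + 1)))
```

Because the recursion only adds and shifts polynomials with integer coefficients, every value it produces lies in $\mathbb{Z}[v, v^{-1}]$.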

It is easy to check that these quantum binomial coefficients all lie in $\mathcal{A}$. Given a Cartan datum, we define $\mathbf{f}$ to be the $\mathbb{Q}(v)$-algebra generated by symbols $\{\theta_i : i \in I\}$ subject to relations:
\[ \sum_{r+s=1-a_{ij}} (-1)^r\, \theta_i^{(r)} \theta_j \theta_i^{(s)} = 0, \]
where $\theta_i^{(n)} = \theta_i^n/[n]_i!$ is a quantum divided power. We also need an integral form of $\mathbf{f}$: let ${}_{\mathcal{A}}\mathbf{f}$ be the $\mathcal{A}$-subalgebra of $\mathbf{f}$ generated by $\{\theta_i^{(n)} : n \geq 0, i \in I\}$. Lusztig [L93] has shown that ${}_{\mathcal{A}}\mathbf{f}$ is a free $\mathcal{A}$-module with a canonical basis $\mathbf{B}$. Both $\mathbf{f}$ and ${}_{\mathcal{A}}\mathbf{f}$ are clearly $\mathbb{Z}[I]$-graded.

To define the full quantum group we need slightly more data. Fix a Cartan datum $(I, \cdot)$.

Definition 2.2. Let $C = (a_{ij})_{i,j \in I}$ be a generalized Cartan matrix. A root datum associated to $C$ consists of a pair $(X, Y)$ of finitely generated free abelian groups, a perfect pairing $\langle \cdot, \cdot \rangle : Y \times X \to \mathbb{Z}$, and finite sets $\{\alpha_i\}_{i \in I} \subset X$ and $\{\check{\alpha}_i\}_{i \in I} \subset Y$ consisting of simple roots and coroots respectively. These must satisfy
\[ \langle \check{\alpha}_i, \alpha_j \rangle = a_{ij}, \qquad \forall i, j \in I. \]
Where there is no possibility for confusion, we normally abuse notation slightly and write $(X, Y, I)$ to denote a root datum.

The Weyl group $W$ attached to $C$ acts on $X$ via
\[ s_i \mapsto (\lambda \mapsto \lambda - \langle \check{\alpha}_i, \lambda \rangle \alpha_i) \in \mathrm{Aut}(X), \]
and by duality also on $Y$ so that $\langle w(\mu), \lambda \rangle = \langle \mu, w^{-1}(\lambda) \rangle$ for all $w \in W$, $\mu \in Y$, $\lambda \in X$.

If $(I, \cdot)$ is a Cartan datum, we say that $(X, Y, I)$ is a root datum of type${}^2$ $(I, \cdot)$ if $(X, Y, I)$ and $(I, \cdot)$ are associated to the same generalized Cartan matrix $C$. The simple roots and simple coroots give natural maps from $\mathbb{Z}[I]$ to $X$ and $Y$ respectively (for which we suppress any notation). We say that a root datum is $X$-regular if the simple roots are linearly independent, and $Y$-regular if the simple coroots are. For a finite-type generalized Cartan matrix, $X$ and $Y$ regularity are automatic by nondegeneracy, but in general it is an additional assumption. For a $Y$-regular root datum we set
\[ X^+ = \{\lambda \in X : \langle \check{\alpha}_i, \lambda \rangle \geq 0\}. \]
For an $X$-regular root datum, we may define a partial ordering on the weight lattice $X$ by setting $\lambda \leq \mu$ if $\mu - \lambda \in \sum_{i \in I} \mathbb{N}\alpha_i$. Unless otherwise stated, we always assume that our datum is $X$ and $Y$ regular, so that $X$ has both the dominant cone $X^+$ and the partial order.
2 In [L93] a root datum is defined in terms of a Cartan datum, and the generalized Cartan matrix is not emphasized.


Definition 2.3. Given a root datum $(X, Y)$ of type $(I, \cdot)$ we may define an associated quantum group $U$, which is a $\mathbb{Q}(v)$-algebra generated by symbols $E_i, F_i, K_\mu$, $i \in I$, $\mu \in Y$, subject to the following relations:
(1) $K_0 = 1$, $K_{\mu_1}K_{\mu_2} = K_{\mu_1+\mu_2}$ for $\mu_1, \mu_2 \in Y$;
(2) $K_\mu E_i K_\mu^{-1} = v^{\langle \mu, \alpha_i \rangle}E_i$, $K_\mu F_i K_\mu^{-1} = v^{-\langle \mu, \alpha_i \rangle}F_i$ for all $i \in I$, $\mu \in Y$;
(3) $E_i F_j - F_j E_i = \delta_{i,j}\,\dfrac{\tilde{K}_i - \tilde{K}_i^{-1}}{v_i - v_i^{-1}}$;
(4) The maps $+ : \{\theta_i : i \in I\} \to U$ given by $\theta_i \mapsto E_i$ and $- : \{\theta_i : i \in I\} \to U$ given by $\theta_i \mapsto F_i$ extend to homomorphisms $\pm : \mathbf{f} \to U$.
Here $\tilde{K}_i$ denotes $K_{(i \cdot i/2)\check{\alpha}_i}$. The images of $\mathbf{f}$ under the maps $\pm$ are denoted $U^\pm$.

Remark 2.4. Suppose the generalized Cartan matrix is indecomposable, and $(r_i)_{i \in I}$ is the minimal symmetrizing vector. If $(d_i) = d(r_i)$ is another choice of symmetrizing vector, then the quantum group $U$ obtained from the datum with $i \cdot j = d_i a_{ij}$ is obtained from the one corresponding to the minimal symmetrization by adjoining a $d$th root of $v$. Frenkel and Hernandez study two-parameter quantum groups of, for example, types “$B_1$” and “$C_1$” where the two parameters seem to correspond to different choices of symmetrizations, but the author does not know if this is a useful perspective on their deformations.

We now recall various categories of modules for $U$. Given a left $U$-module $V$, and $\lambda \in X$, the $\lambda$-weight space of $V$ is
\[ V_\lambda = \{u \in V : K_\mu . u = v^{\langle \mu, \lambda \rangle}.u,\ \forall \mu \in Y\}. \]
A module $V$ is said to be a weight module if it is the direct sum of its weight spaces. The full subcategory of the category of $U$-modules whose objects are weight modules will be denoted $\mathrm{Mod}_X$, and the full subcategory whose objects are weight modules with finite dimensional weight spaces will be denoted $\mathrm{Mod}_X^f$. In fact we will focus on a smaller full subcategory in $\mathrm{Mod}_X^f$ consisting of integrable modules with certain bounds on weight spaces. A module $V$ in $\mathrm{Mod}_X^f$ is integrable if the actions of $E_i^{(n)}$ and $F_i^{(n)}$ are locally nilpotent for every $i \in I$, $n \in \mathbb{N}$. For $\mu \in X$ let $D(\mu) = \{\lambda \in X : \lambda \leq \mu\}$. The module $V$ lies in $O_{\mathrm{int}}$ if it is integrable, and there is a finite set of weights $E \subset X$ such that if $V_\lambda \neq \{0\}$ then there is some $\mu \in E$ with $\lambda \leq \mu$.

For modules in $\mathrm{Mod}_X^f$ it is possible to define a notion of character. Let $\widehat{\mathcal{E}}$ be the abelian group of formal sums $\sum_{\lambda \in X} a_\lambda e^\lambda$, where $a_\lambda \in \mathbb{Z}$. For $\xi \in \widehat{\mathcal{E}}$ let $\mathrm{supp}(\xi) = \{\lambda \in X : a_\lambda \neq 0\}$, and let $\mathcal{E}$ be the subgroup of $\widehat{\mathcal{E}}$ consisting of those $\xi$ for which $\mathrm{supp}(\xi)$ lies in a finite union of $D(\mu)$s. Given a module $V$ in $\mathrm{Mod}_X^f$, we set
\[ \chi(V) = \sum_{\lambda \in X} \dim(V_\lambda)\,e^\lambda \in \widehat{\mathcal{E}}. \]

Clearly if $V$ lies in $O_{\mathrm{int}}$ then $\chi(V) \in \mathcal{E}$, and this yields an embedding of $K_0(O_{\mathrm{int}})$ into $\mathcal{E}$. It can be shown that the character of an integrable module is invariant under the action of $W$, so the image of $\chi$ in fact lies in $\mathcal{E}^W$. Notice that although multiplication for elements of $\widehat{\mathcal{E}}$ does not necessarily make sense, it is easy to see that elements of $\mathcal{E}$ can be multiplied in the obvious way, so that $\mathcal{E}$ forms an algebra.

We now recall the modified version of the quantum group due to Lusztig which is better suited to our purposes. Let $\widehat{U}$ be the endomorphism ring of the forgetful functor from $\mathrm{Mod}_X$ to the category of vector spaces. Thus by definition an element $a$ of $\widehat{U}$ associates to each object $V$ of $\mathrm{Mod}_X$ a linear map $a_V$, such that $a_W \circ f = f \circ a_V$ for any morphism $f : V \to W$. Any element of $U$ clearly determines an element of $\widehat{U}$, giving a natural inclusion $U \to \widehat{U}$, so one may think of $\widehat{U}$ as a sort of completion of $U$. For each $\lambda \in X$, let $1_\lambda \in \widehat{U}$ be the projection to the $\lambda$ weight space. Then $\widehat{U}$ is isomorphic to the direct product $\prod_{\lambda \in X} U1_\lambda$, and we set $\dot{U}$ to be the subring (in fact, clearly, $\mathbb{Q}(v)$-subalgebra)
\[ \dot{U} = \bigoplus_{\lambda \in X} U1_\lambda. \]
Note that $\dot{U}$ does not have a multiplicative identity, but instead a collection $\{1_\lambda : \lambda \in X\}$ of orthogonal idempotents.

It is clear that the category $\mathrm{Mod}_X$ is equivalent to a category of modules for $\dot{U}$, the category of unital modules. A module $V$ for $\dot{U}$ is said to be unital if for every $v \in V$, there is a finite set $K \subset X$ such that $v = \sum_{\lambda \in K} 1_\lambda(v)$. Let $\mathrm{Mod}_1$ be the category of unital modules for $\dot{U}$. It is easy to see that $\mathrm{Mod}_X$ is equivalent to $\mathrm{Mod}_1$ (see [L93, 23.1.4]). We denote the full subcategories of $\mathrm{Mod}_1$ corresponding to $\mathrm{Mod}_X^f$, $O_{\mathrm{int}}$ by $\mathrm{Mod}_1^f$ and $\dot{O}_{\mathrm{int}}$ respectively. The weight spaces of a module in $\mathrm{Mod}_X$ correspond to the images of the operators $1_\lambda$ under this equivalence, hence viewing a module $V$ in $\mathrm{Mod}_X$ as a module for $\dot{U}$ we may define the character by
\[ \chi(V) = \sum_{\lambda \in X} \dim(\mathrm{im}(1_\lambda))\,e^\lambda. \]

Definition 2.5. Let ${}_{\mathcal{A}}\dot{U}$ be the $\mathcal{A}$-subalgebra of $\dot{U}$ generated by elements $E_i^{(n)}1_\lambda$ and $F_i^{(n)}1_\lambda$ for $\lambda \in X$, $i \in I$, and $n \in \mathbb{Z}_{\geq 0}$. Lusztig [L93, Chap. 25] has shown that ${}_{\mathcal{A}}\dot{U}$ is a free $\mathcal{A}$-module equipped with a canonical basis $\dot{\mathbf{B}}$.

Remark 2.6. The canonical basis $\dot{\mathbf{B}}$ of $\dot{U}$ was constructed by Lusztig. When the root datum is $X$-regular, in the same way that the canonical basis $\mathbf{B}$ of $\mathbf{f}$ yields natural bases for irreducible integrable highest weight modules, $\dot{\mathbf{B}}$ yields natural bases for the tensor product of an irreducible integrable highest weight module and an irreducible integrable lowest weight module. The basis however can be constructed for an arbitrary root datum.

Remark 2.7. It is straightforward to give a presentation for ${}_{\mathcal{A}}\dot{U}$ via the $U^\pm$-bimodule structure, so it can be described just as explicitly as $U$. Here we have defined it in terms of the category $\mathrm{Mod}_X$ in order to describe the relation between the representations of the two algebras.

Definition 2.8. Let $(X, Y, I)$ and $(X', Y', J)$ be root data. Following Steinberg${}^3$ [St], an isogeny of root data is a map $\phi : X' \to X$ and a bijection $i \leftrightarrow i'$ between $I$ and $J$ such that
• Both $\phi$ and its transpose $\phi^t : Y \to Y'$ are injective.
• $\phi(\alpha'_{i'}) = \ell_i \alpha_i$ for some $\ell_i \in \mathbb{Z}$, and similarly $\phi^t(\check{\alpha}_i) = \ell_i \check{\alpha}'_{i'}$.
Given a positive integer $\ell$ and a root datum $(X, Y, I)$, Lusztig has defined an $\ell$-modified root datum${}^4$ [L93, 2.2.4]:

3 and possibly many others. In [L93] Lusztig defines morphisms of root data, but these are stricter than the notion of an isogeny.
4 This is the author’s term, but there does not seem to be any established terminology for it.


Definition 2.9. Let $(I, \cdot)$ be a Cartan datum, and let $\ell$ be a positive integer. The $\ell$-modified Cartan datum $(I, \circ)$ is given by $i \circ j = l_i l_j (i \cdot j)$, where $l_i$ is the smallest positive integer such that $l_i (i \cdot i/2) \in \ell\mathbb{Z}$. It is easy to check that $(I, \circ)$ is indeed a Cartan datum. Given a root datum $(X, Y, I)$ of type $(I, \cdot)$ we can also define a new root datum of type $(I, \circ)$ by setting $X^* = \{\lambda \in X : \langle \check{\alpha}_i, \lambda \rangle \in l_i\mathbb{Z} \text{ for all } i \in I\}$ and $Y^* = \mathrm{Hom}(X^*, \mathbb{Z})$, with the obvious pairing between $X^*$ and $Y^*$. The simple roots of $(X^*, Y^*)$ are $\alpha_i^* = l_i \alpha_i$ and the simple coroots are $\check{\alpha}_i^*$, where $\check{\alpha}_i^*(\lambda) = l_i^{-1} \langle \check{\alpha}_i, \lambda \rangle$. Note that the inclusion $X^* \to X$ and its transpose, the induced restriction map $Y \to Y^*$, yield an isogeny from $(X, Y, I)$ to $(X^*, Y^*, I)$. It is easy to see that if the simple roots and coroots are linearly independent in $(X, Y, I)$ then the same is true for $(X^*, Y^*, I)$.

Remark 2.10. It is immediate from the definitions that the Weyl group $W^*$ of $(I, \circ)$ is canonically isomorphic to $W$, the Weyl group of $(I, \cdot)$. Moreover, by checking on the generators $s_i$, it is easy to see that the action of $W^*$ on $X^*$ coincides with the restriction of the action of $W$ on $X$ via this isomorphism. We may thus identify the two groups and write $W$ for the Weyl group in either case$^5$. In particular this shows that $X^*$ is invariant under the action of $W$.

Remark 2.11. If $C = (a_{ij})$ is the generalized Cartan matrix of $(X, Y, I)$ then the generalized Cartan matrix of the $\ell$-modified datum is $(l_i^{-1} l_j a_{ij})_{i,j \in I}$, the entrywise product of $C$ with $L = (l_i^{-1} l_j)$. In the case where $d$ divides $\ell$, so that $l_i = \ell/d_i$, it follows that this matrix is $C^t$; that is, the $\ell$-modified root datum is attached to the transpose of $C$, and hence the quantum group associated to it is Langlands dual to that attached to the original datum. Notice also that, given that $d$ divides $\ell$, the lattice $X^* \subset X$ depends only on the ratio $\ell/d$. More precisely, if the symmetrization vector $(d_i)_{i \in I}$ determining the Cartan datum is a multiple of another symmetrization vector $(c_i)_{i \in I}$, say $d_i = m c_i$, and we write $X^*_{\ell,d}$ for the sublattice given by $\ell$ and $(d_i)_{i \in I}$, then $X^*_{\ell,d} = (\ell/d) X^*_{c,c}$, where $c$ is the least common multiple of the $c_i$.

Let $Q = \bigoplus_{i \in I} \mathbb{Z}\alpha_i$ be the root lattice of $(X, Y, I)$, and $Q^*$ the root lattice of $(X^*, Y^*, I)$. Let $\Delta$ denote the roots of $(X, Y, I)$ and $\Delta^*$ denote the roots of $(X^*, Y^*, I)$.
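As a quick sanity check of Definition 2.9 and Remark 2.11, the following sketch computes the integers $l_i$ and the Cartan matrix of the $\ell$-modified datum in a few small cases. The function names (`l_values`, `modified_cartan`) and the encoding of a Cartan datum by a matrix `C` together with a symmetrizing vector `d` (so that $i \cdot j = d_i a_{ij}$) are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction
from math import gcd


def l_values(d, ell):
    """Smallest positive l_i with l_i * d_i divisible by ell, where d_i = (i . i) / 2."""
    return [ell // gcd(ell, di) for di in d]


def modified_cartan(C, d, ell):
    """Cartan matrix of the ell-modified datum (I, o), where i o j = l_i l_j (i . j).

    Assumes the convention a_ij = 2 (i . j) / (i . i), so that i . j = d_i * a_ij.
    """
    l = l_values(d, ell)
    n = len(C)
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            i_circ_j = l[i] * l[j] * d[i] * C[i][j]
            i_circ_i = l[i] * l[i] * 2 * d[i]
            entry = Fraction(2 * i_circ_j, i_circ_i)
            assert entry.denominator == 1  # Cartan matrix entries are integers
            row.append(int(entry))
        out.append(row)
    return out


def transpose(M):
    return [list(col) for col in zip(*M)]


# Type B_2 with alpha_1 long and alpha_2 short; type G_2 with alpha_1 long.
B2, d_B2 = [[2, -1], [-2, 2]], [2, 1]
G2, d_G2 = [[2, -1], [-3, 2]], [3, 1]

print(l_values(d_B2, 2))             # the integers l_i for B_2 at ell = 2
print(modified_cartan(B2, d_B2, 2))  # compare with transpose(B2)
print(modified_cartan(B2, d_B2, 3))  # ell coprime to the d_i: compare with B2
```

When $d$ divides $\ell$ the output is the transposed matrix, as Remark 2.11 predicts, while for $\ell$ coprime to all $d_i$ the matrix is unchanged.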

Lemma 2.12. Suppose that $(X, Y, I)$ is of finite or affine type. Then the inclusion $\iota : X^* \to X$ induces a bijection $\alpha \leftrightarrow \alpha^*$ between $\Delta$ and $\Delta^*$ such that $\alpha^* = l_\alpha \alpha$, for some $l_\alpha \in \mathbb{Z}$.

Proof. For a simple root $\alpha_i$ the result follows from the definition. Next, as noted above, the Weyl groups for the two root systems are identical, and hence if $\alpha \in \Delta$ is a real root, we may write it as $w(\alpha_i)$ for some $w \in W$ and $i \in I$, and thus $\alpha^* = w(\alpha_i^*) = l_i w(\alpha_i) = l_i \alpha$ (and hence $l_\alpha = l_i$). It remains to consider the imaginary roots, in the affine case. Let $(\cdot, \cdot)$ denote an invariant symmetric bilinear form on $Q \otimes_{\mathbb{Z}} \mathbb{Q}$. Then the imaginary roots are exactly the elements $\alpha$ of $Q$ with $(\alpha, \alpha) = 0$, and similarly the imaginary roots of $\Delta^*$ are exactly the elements of $Q^*$ of norm zero. Since $Q^* \subset Q$ and the space of norm zero elements is one dimensional, the result follows immediately.

$^5$ In [FH] the fact that the Weyl group actions coincide is Lemma 2.2. The root datum formalism however makes this check self-evident.


K. McGerty

Remark 2.13. In fact, using [Kac, Remark 6.1] it is easy to see that $\delta^* = \delta$, where $\delta$ and $\delta^*$ are the smallest imaginary roots of $\Delta$ and $\Delta^*$ respectively. It seems reasonable to conjecture that this lemma holds for all symmetrizable root data.

Remark 2.14. When $C$ is an indecomposable Cartan matrix (i.e. of finite type), the numbers $r_i = \frac{1}{2}(i \cdot i)$ are either $1$ or $r$ in the notation of [FH]. Thus taking any symmetrizing vector $(d_i)_{i \in I}$ and $\ell$ divisible by $d$ we have $l_i = d/d_i = r/r_i = r - r_i + 1$. Finally, the lattice here denoted $X$ is the weight lattice denoted $P$ in [FH, 2.1] and the sublattice $X^*$ is there denoted $P'$ (thus we will define the weight lattice of our Langlands dual algebra to be $X^*$, rather than identifying it with $X^*$ as in [FH]). It is interesting to note also that in the affine case one again has $r_i \in \{1, r\}$ except in the case of $A_2^{(2)}$, where $r = 4$ and the $r_i$ take the values 1, 2 and 4. Choosing $\ell = 2$ in this case gives an isogeny to a root datum which is not of finite or affine type. While $A_2^{(2)}$ is self-dual, and so excluded from the considerations of [FH2], our constructions should still give some interesting relations between representations of this algebra.

2.1. Roots of unity. Let $\ell$ be any positive integer. Following [L93, 35.1.3], for $\ell$ even we set $l = 2\ell$, while if $\ell$ is odd we set $l = \ell$ or $2\ell$. We set $\mathcal{A}_\ell$ to be the quotient ring $\mathcal{A}/(\Phi_l(v))$, where $\Phi_l$ is the $l$-th cyclotomic polynomial, and then let $\dot{U}_\ell$ be the corresponding specialization $\mathcal{A}_\ell \otimes_{\mathcal{A}} ({}_{\mathcal{A}}\dot{U})$ of the modified form $\dot{U}$, and similarly let $\mathbf{f}_\ell$, $U_\ell^{\pm}$ be the specializations $\mathcal{A}_\ell \otimes ({}_{\mathcal{A}}\mathbf{f})$, $\mathcal{A}_\ell \otimes ({}_{\mathcal{A}}U^{\pm})$. More generally, for any $\mathcal{A}$-algebra $R$ we will write ${}_R\dot{U}$ for the corresponding specialization of ${}_{\mathcal{A}}\dot{U}$, and ${}_R\mathbf{f}$ for the specialization of ${}_{\mathcal{A}}\mathbf{f}$.

Let $\dot{U}_\ell^*$ be the modified quantum group attached to the $\ell$-modified root datum $(X^*, Y^*, I)$. To distinguish it from $\dot{U}_\ell$, we will write its generators as

\[ e_i^{(n)} 1_\lambda, \quad f_i^{(n)} 1_\lambda, \qquad (n \geq 0,\ i \in I,\ \lambda \in X^*), \]

and write the generators of $\mathbf{f}_\ell^*$ as $\{\vartheta_i : i \in I\}$. Since in $\mathcal{A}_\ell$ we have $(v_i^*)^2 = (v_i^{l_i^2})^2 = 1$, so that $v_i^* = \pm 1$, the algebra $\dot{U}_\ell^*$ is close to the classical enveloping algebra. In this note, we are interested in the most degenerate case, when $\ell = r$.

3. The Contracting Homomorphism

In this section we recall the contracting homomorphism of [M], which gives an embedding of the algebra $\dot{U}_\ell^*$ into $\dot{U}_\ell$. This relies on the work of Lusztig on the quantum Frobenius homomorphism. Recall from [L93, Chap. 35] that (under mild restrictions on $\ell$; see the remark below) there are two $\mathcal{A}_\ell$-homomorphisms $\mathrm{Fr} : \mathbf{f}_\ell \to \mathbf{f}_\ell^*$ and $\mathrm{Fr}' : \mathbf{f}_\ell^* \to \mathbf{f}_\ell$ which are given on generators by $\mathrm{Fr}'(\vartheta_i^{(n)}) = \theta_i^{(n l_i)}$, and

\[ \mathrm{Fr}(\theta_i^{(n)}) = \begin{cases} \vartheta_i^{(n/l_i)}, & \text{if } l_i \mid n, \\ 0, & \text{otherwise.} \end{cases} \]

Lusztig also shows that $\mathrm{Fr}$ "extends" to a map $\mathrm{Fr} : \dot{U}_\ell \to \dot{U}_\ell^*$ (in the sense that $\dot{U}_\ell$ is naturally a bimodule over $U_\ell^{\pm}$, and $\mathrm{Fr}$ on $\dot{U}_\ell$ is compatible with these bimodule structures). It is characterized by the conditions:

\[ \mathrm{Fr}(E_i^{(n)} 1_\lambda) = \begin{cases} e_i^{(n/l_i)} 1_\lambda, & \text{if } l_i \mid n,\ \lambda \in X^*, \\ 0, & \text{otherwise,} \end{cases} \]

and similarly $\mathrm{Fr}(F_i^{(n)} 1_\lambda) = f_i^{(n/l_i)} 1_\lambda$ if $l_i$ divides $n$ and $\lambda \in X^*$, and zero otherwise. Note that $\mathrm{Fr}$ is obviously surjective. A simple observation of [M] is that the map $\mathrm{Fr}'$ also has an extension to $\dot{U}_\ell^*$. Here the use of the modified form is essential: although $\mathrm{Fr}$ on $U_\ell^{\pm}$ does extend to the ordinary quantum group $U_\ell$, the map $\mathrm{Fr}'$ does not extend to $U_\ell^*$. The existence of the contraction map and the quantum Frobenius are (currently) conditional on some mild technical hypotheses.

Definition 3.1. Let $C = (a_{ij})_{i,j \in I}$ be a generalized Cartan matrix. An odd cycle in $C$ is a sequence $i_1, i_2, \ldots, i_{p+1} = i_1$ in $I$ such that $p \geq 3$ is odd, and $a_{i_s i_{s+1}} < 0$ for each $s = 1, 2, \ldots, p$, that is, a cycle of odd length in the associated Coxeter graph. A generalized Cartan matrix $C$ has no odd cycles if and only if there is a function $i \mapsto a_i \in \{0, 1\}$ such that $a_i + a_j = 1$ whenever $a_{ij} < 0$ (since a graph with no odd cycles is bipartite). A Cartan datum or root datum is said to have no odd cycles if its associated generalized Cartan matrix has no odd cycles. Note that this condition is satisfied by all finite-type and affine Cartan data except $A_{2n}^{(1)}$, and in that case the datum is self-dual, so it will not be of interest to us.

Proposition 3.2. [M] Suppose that $(X, Y, I)$ is a root datum, and let $\phi : \mathcal{A} \to R$ be a homomorphism which factors through the natural map $\mathcal{A} \to \mathcal{A}_\ell$ for some positive integer $\ell$. If $\ell$ is even assume that $(X, Y, I)$ has no odd cycles. Then there is a homomorphism $c : {}_R\dot{U}^* \to {}_R\dot{U}$ given on generators by $e_i^{(n)} 1_\lambda \mapsto E_i^{(n l_i)} 1_\lambda$ and $f_i^{(n)} 1_\lambda \mapsto F_i^{(n l_i)} 1_\lambda$, where $\lambda \in X^* \subset X$.

Remark 3.3. The proof of the existence of $\mathrm{Fr}$ in [L93, Chap. 35] holds with mild restrictions on $\ell$ in addition to the condition of no odd cycles, which in fact are not valid when $\ell = d$. For finite-type quantum groups Kaneda [Ka] has verified that these restrictions can be removed, and indeed this fact was already stated in [L93]. The existence of the map $c$, however, depends only on the existence of the map $\mathrm{Fr}'$, not on $\mathrm{Fr}$, and thus it is known to exist whenever the root datum has no odd cycles for arbitrary $\ell$, and for an arbitrary root datum if $\ell$ is odd. Indeed minor modifications of the arguments in [L93] allow one to verify the existence of $\mathrm{Fr}'$ without the restriction of no odd cycles when $\ell$ is odd, as is briefly sketched in [L93, 35.5.2].

Finally, it is worth noting that, as Lusztig already points out in [L93], the quantum Frobenius at $\ell = r$ provides a quantum analogue of Chevalley's exceptional isogenies between algebraic groups in positive characteristic. For example, in characteristic 2, there is a natural map $SO_{2n+1} \to Sp_{2n}$ induced by the quotient map $k^{2n+1} \to k^{2n+1}/L$, where $L$ is the line fixed by $SO_{2n+1}$ (e.g. the line spanned by $e_0$ if the quadratic form is $x_0^2 + \sum_{i=1}^{n} x_i x_{i+n}$). The quantum Frobenius for $B_n$ at $\ell = 2$ is a lifting of this isogeny to characteristic zero. An elegant construction of all possible isogenies between reductive algebraic groups in any characteristic is given in [St].
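The no-odd-cycles condition of Definition 3.1 is just bipartiteness of the Coxeter graph, so it can be tested by a breadth-first 2-colouring. The sketch below is one way to do this; the function names and the helper `affine_A` (which builds the cyclic Cartan matrix of type $A_m^{(1)}$ for $m \geq 2$) are hypothetical and serve only as an illustration.

```python
from collections import deque


def two_colouring(C):
    """Return a list a with a[i] in {0, 1} and a[i] + a[j] == 1 whenever C[i][j] < 0,
    or None if the Coxeter graph of C contains an odd cycle."""
    n = len(C)
    colour = [None] * n
    for start in range(n):
        if colour[start] is not None:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(n):
                if i == j or C[i][j] >= 0:
                    continue  # i and j are not joined in the Coxeter graph
                if colour[j] is None:
                    colour[j] = 1 - colour[i]
                    queue.append(j)
                elif colour[j] == colour[i]:
                    return None  # two adjacent nodes forced to the same colour
    return colour


def has_no_odd_cycles(C):
    return two_colouring(C) is not None


def affine_A(m):
    """Cartan matrix of affine type A_m^{(1)} for m >= 2: a cycle on m + 1 nodes."""
    n = m + 1
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        C[i][i] = 2
        C[i][(i + 1) % n] = -1
        C[i][(i - 1) % n] = -1
    return C


B2 = [[2, -1], [-2, 2]]
print(two_colouring(B2))
print([has_no_odd_cycles(affine_A(m)) for m in (2, 3, 4)])
```

For the cyclic diagrams the check fails exactly when the number of nodes $m + 1$ is odd, which matches the exception $A_{2n}^{(1)}$ noted in Definition 3.1.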


4. Duality for Representations

Let $\mathfrak{g}$ be a symmetrizable Kac-Moody Lie algebra with indecomposable generalized Cartan matrix $C$, and let $\mathcal{O}_{\mathrm{int}}(\mathfrak{g})$ be the full subcategory of the category of $\mathfrak{g}$-representations consisting of those representations $V$ which satisfy
(1) $V$ is a direct sum of its weight spaces $V_\lambda$, where $\lambda \in X$, the weight lattice.
(2) The operators $e_i$, $f_i$ act locally nilpotently.
(3) There is a finite set of weights $K \subset X$ such that whenever the weight space $V_\mu \neq 0$ there is a $\lambda \in K$ with $\mu \leq \lambda$.

By results of Gabber and Kac [Kac], the modules in category $\mathcal{O}_{\mathrm{int}}(\mathfrak{g})$ are completely reducible, and the simple modules are the standard modules $\{\nabla(\lambda) : \lambda \in X^+\}$, whose characters are given by the Weyl-Kac character formula. Let ${}^L\mathfrak{g}$ be the Langlands dual Lie algebra, with generalized Cartan matrix $C^t$. In this section we will establish a duality between representations in the categories $\mathcal{O}_{\mathrm{int}}(\mathfrak{g})$ and $\mathcal{O}_{\mathrm{int}}({}^L\mathfrak{g})$, provided $C$ has no odd cycles (see Definition 3.1). Picking a symmetrizing vector $(d_i)_{i \in I}$ for $C$ we obtain a Cartan datum, and we let $(X, Y, I)$ be an associated root datum which is $X$-regular and $Y$-regular (cf. the paragraph after Definition 2.2). To establish the duality we will use the contraction map of the previous section. For the rest of this section, unless otherwise stated, we assume that $d$ divides $\ell$ so that the generalized Cartan matrix of the $\ell$-modified datum is the transpose of $C$ (see Remark 2.11).

Let $\dot{U}$ be the modified quantum group associated to the root datum $(X, Y, I)$. The category $\dot{\mathcal{O}}_{\mathrm{int}}$ for $\dot{U}$ which was defined in Sect. 2 gives a natural deformation of $\mathcal{O}_{\mathrm{int}}(\mathfrak{g})$. Indeed over $\mathbb{Q}(v)$ (which we shall refer to as the "generic" case) it is a semisimple category (this follows from [L93, Chap. 6]). Moreover, for each $\lambda \in X^+$ one can define highest weight modules $\nabla(\lambda)$ over $\mathcal{A}$ which are integral forms of the simple modules in $\dot{\mathcal{O}}_{\mathrm{int}}$. Let $\dot{U}_1$ denote the algebra ${}_{\mathcal{A}}\dot{U}$ specialized at $v = 1$. It is shown in [L93, 33.1.2] that the structure of an integrable highest weight module for $\mathbb{C} \otimes_{\mathbb{Z}} \dot{U}_1$ is equivalent to the structure of a $\mathfrak{g}$-integrable highest weight module, in a fashion which preserves weights and hence characters. It follows that the characters of the modules $\nabla(\lambda)$ are given by the Weyl-Kac character formula, so that $\dot{\mathcal{O}}_{\mathrm{int}}$ gives a deformation of category $\mathcal{O}_{\mathrm{int}}(\mathfrak{g})$ in a fashion which preserves characters. For each $\lambda \in X^+$ we may specialize the module $\nabla(\lambda)$ to $\mathcal{A}_\ell$ and obtain a so-called standard module, which we will also write as $\nabla(\lambda)$ since the context should prevent any possibility of confusion. Clearly the standard modules have characters given by the Weyl-Kac formula (for the root datum $(X, Y, I)$).

Recall that in $\dot{U}_\ell^*$ the parameters $v_i^* = \pm 1$. In [L93, 33.2] specializations with this property are called quasiclassical. Under the assumption that the root datum has no odd cycles, these specializations are in fact isomorphic to $\dot{U}_1^*$. More precisely, let $\phi : \mathcal{A} \to R$ be an $\mathcal{A}$-algebra such that $\phi(v_i) \in \{\pm 1\}$ for each $i$. Then let $R_0$ be the same ring $R$ with the $\mathcal{A}$-algebra structure given by mapping $v \mapsto 1$.

Proposition 4.1. Let $(X, Y, I)$ be a root datum with no odd cycles. Then there is an isomorphism of the algebras ${}_R\dot{U}$ and ${}_{R_0}\dot{U}$. Moreover, the isomorphism maps $1_\zeta \in {}_R\dot{U}$ to $1_\zeta \in {}_{R_0}\dot{U}$ ($\zeta \in X$), so that pulling back representations via this isomorphism preserves characters.

Proof. The isomorphism was constructed by Lusztig in [L93, 33.2.3]. Looking at the formulas there immediately establishes the assertion about characters.

Thus $\dot{U}_\ell^*$ is isomorphic to the algebra $\mathcal{A}_\ell \otimes_{\mathbb{Z}} \dot{U}_1^*$. Extending scalars to $\mathbb{C}$ by picking a primitive root of unity, the discussion above (applied to $\dot{U}^*$ instead of $\dot{U}$) shows that the category $\dot{\mathcal{O}}_{\mathrm{int}}$ for $\dot{U}_\ell^*$ is equivalent to category $\mathcal{O}_{\mathrm{int}}$ for ${}^L\mathfrak{g}$, in a manner which preserves characters.

We can now describe our duality. It is in the same spirit as, but thanks to the map $c^*$ simpler than, the representation-theoretic duality for two-parameter quantum groups proposed in [FH]. Suppose that $W$ is a representation of $\mathfrak{g}$ in category $\mathcal{O}_{\mathrm{int}}$. Then we may equivalently think of it as a representation of $\dot{U}_1$, and as such it is the specialization of a representation $W_v$ of ${}_{\mathcal{A}}\dot{U}$. We may then specialize this representation instead to $\dot{U}_\ell$, to obtain a representation $W_\ell$. Using the contraction map $c$ we then obtain a representation of $\dot{U}_\ell^*$. Since any weight space of $W_\ell$ with weight $\mu \notin X^*$ is annihilated by $\dot{U}_\ell^*$, we define $c^*(W_\ell)$ to be the subspace of $W_\ell$ which is the direct sum of those weight spaces whose weights lie in $X^*$, taken as a $\dot{U}_\ell^*$-representation. Then by the above discussion we may view $c^*(W_\ell)$ as a representation of ${}^L\mathfrak{g}$ which is easily seen to lie in $\mathcal{O}_{\mathrm{int}}({}^L\mathfrak{g})$. By abuse of notation, we will write $c^*(W)$ instead of $c^*(W_\ell)$ if there is no danger of confusion.

Remark 4.2. Notice that although our construction works on the level of representations, it is not given functorially, since we really only deform simple representations in $\mathcal{O}_{\mathrm{int}}(\mathfrak{g})$. This is perhaps due to the author's ignorance, and it would be interesting to know to what extent this can be refined. Note also that if $\ell$ is odd, then the above duality can be constructed for an arbitrary root datum: the contraction homomorphism exists in this case, as was noted before in Remark 3.3, and moreover the same modification to Lusztig's proof of the existence of $\mathrm{Fr}'$ (that is, using [L93, 33.1] rather than [L93, 33.2]) establishes the necessary relation between $\dot{U}_\ell^*$-representations and ${}^L\mathfrak{g}$-representations.

We now wish to study this duality at the level of characters and relate it to the duality of [FH]. Recall the ring $E$ introduced in Sect. 2. If $V$ is in category $\mathcal{O}_{\mathrm{int}}$ for $\mathfrak{g}$ then we set, just as for representations of $\dot{U}$,

\[ \chi(V) = \sum_{\lambda \in X} \dim(V_\lambda) e^{\lambda} \in E, \]

and this embeds $K_0(\mathcal{O}_{\mathrm{int}})$ into $E^W$. Let $\chi^L$ be the corresponding map for $\dot{U}_\ell^*$. It is immediate from the definitions that $\chi^L(c^*(V)) = \Pi_\ell(\chi(V))$, where $\Pi_\ell(e^\lambda) = e^\lambda$ if $\lambda$ lies in $X^*$ and zero otherwise. For $\lambda \in X^+$ let $\chi(\lambda)$ be the Weyl character attached to $\lambda$, and similarly for $\mu \in X^+ \cap X^*$ let $\chi^L(\mu)$ be the Weyl character (for the datum $(X^*, Y^*, I)$) attached to $\mu$. Note that since $X^*$ is $W$-invariant (see Remark 2.10), the map $\Pi_\ell$ is $W$-equivariant for the obvious actions of $W$ on $\mathbb{Z}[X]$ and $\mathbb{Z}[X^*]$ respectively.

Remark 4.3. To compare with [FH], let $C$ be an indecomposable Cartan matrix, and take $\ell = d$, the least common multiple of the $d_i$. Recall that by Remark 2.14 in this case we have $l_i = r/r_i = 1 + r - r_i$. In [FH], the authors use an auxiliary map $P' \to P^L$ (in the notation of that paper), given by

\[ \lambda \mapsto \sum_{i \in I} \lambda(\check{\alpha}_i)(1 + r - r_i)^{-1} \check{\varpi}_i, \qquad \lambda \in P', \]

and they define $\Pi : P \to P^L$ by extending this map by zero outside $P'$. In the notation of this paper, we have identified $P'$ with $P^L$ as $X^*$ (this is implicit in attaching a full root datum to $X^*$). Via these identifications, the map $\Pi_d$ here coincides with the map $\Pi$ of [FH].

Proposition 4.4. Suppose that $(X, Y, I)$ is a root datum with no odd cycles, or that $\ell$ is odd, and let $\lambda \in X^+ \cap X^*$. Then

\[ \Pi_\ell(\chi(\lambda)) = \chi^L(\lambda) + \sum_{\mu < \lambda,\ \mu \in X^*} m^\lambda_\mu \chi^L(\mu), \]

where $m^\lambda_\mu$ is a nonnegative integer.

Proof. We simply apply our representation-theoretic duality to the simple highest weight module $V$ of highest weight $\lambda$. Then $V_\ell$ is the standard module $\nabla(\lambda)$ for $\dot{U}_\ell$ of highest weight $\lambda$. Taking $c^*(V)$ we obtain a representation for $\dot{U}_\ell^*$ which has character $\Pi_\ell(\chi(\lambda))$ by the remark preceding the proposition. But as discussed above, the representations of $\dot{U}_\ell^*$ in category $\dot{\mathcal{O}}_{\mathrm{int}}$ are semisimple and the characters of simples are given by Weyl's formula, thus $\chi^L(c^*(V))$ is a positive sum of Weyl characters $\chi^L(\mu)$. Since all the weights in $\nabla(\lambda)$ are less than or equal to $\lambda$, we must also have $\mu \leq \lambda$, and as the highest weight occurs with multiplicity 1, we see the coefficient of $\chi^L(\lambda)$ must be 1 as claimed.

Remark 4.5. Essentially the same map on representations is considered by Littelmann in [Li] in his construction of standard monomials via quantum groups. One could consider the positive characteristic analogue of $c$ (the Frobenius splitting map), but then the categories of representations one has to consider are more complicated.

Remark 4.6. In [FH] the authors connect the character duality map to two-parameter deformations of quantum groups, where one of the parameters is specialized to a root of unity. It would be very interesting to understand what those deformations have to do with quantum isogenies. In another direction, the map $c$ acts on representations which are not necessarily in category $\dot{\mathcal{O}}_{\mathrm{int}}$. For example, one could consider integrable representations of affine quantum groups at level zero. It was shown in [M] that $c$ is compatible with extremal weight modules in level zero, thus it seems likely that it would act sensibly on $q$-characters (which however are usually defined only on finite dimensional representations). Very recent work of Frenkel and Hernandez [FH2] investigates a duality for $q$-characters, and earlier work of Frenkel and Mukhin [FM] has already studied a notion of $q$-characters at roots of unity. One can hope the techniques of this paper can be connected to these theories.

We end this section with an application of the above results. In [M] it is asserted that $c$ is an embedding. Since this was proved using the Frobenius map $\mathrm{Fr}$, which is not known to exist in the same generality as $c$, we give an alternative proof of this fact, so that the map $c^*$ is always a restriction of representations to a subalgebra. Note that here we need not assume that $\ell$ is divisible by $d$, as this was done earlier only to ensure that $\dot{U}_\ell^*$-representations correspond to ${}^L\mathfrak{g}$-representations. The semisimplicity of representations of $\dot{U}_\ell^*$ in $\dot{\mathcal{O}}_{\mathrm{int}}$ is all that is needed in the following.

Lemma 4.7. The map $c : \dot{U}_\ell^* \to \dot{U}_\ell$ is injective.

Proof. Note that $\dot{U}_\ell^*$ is free as an $\mathcal{A}_\ell$-module, thus it suffices to check the map is injective after we extend scalars to $\mathbb{C}$. Let $\omega : \dot{U}_\ell^* \to \dot{U}_\ell^*$ be the involutive automorphism given by $\omega(e_i^{(n)} 1_\lambda) = f_i^{(n)} 1_{-\lambda}$. Twisting by $\omega$ interchanges highest and lowest weight modules. For $\lambda, \mu \in X^+$ let $\nabla^*(\lambda)^\omega$ and $\nabla^*(\mu)$ be the standard modules of lowest weight $-\lambda$ and highest weight $\mu$ respectively. By [L93, 23.3.8], if $v_\lambda \in \nabla^*(\lambda)^\omega$ and $v_\mu \in \nabla^*(\mu)$ are of weight $-\lambda$ and $\mu$ respectively, and $\zeta = \mu - \lambda$, then the map $\dot{U}_\ell^* 1_\zeta \to \nabla^*(\lambda)^\omega \otimes \nabla^*(\mu)$ given by $u 1_\zeta \mapsto u 1_\zeta (v_\lambda \otimes v_\mu)$ is surjective. Let $P(\zeta, \lambda, \mu)$ be its kernel. It follows readily from the results of [L93, 23.3] that the intersection of these ideals over all $\lambda, \mu$ with $\mu - \lambda = \zeta$ is zero. Note moreover that these results all hold over $\mathcal{A}$, and hence over any ring (see [L93, 31.2] for details).

Next, we have already seen that for $\lambda \in X^* \cap X^+$ the module $c^*(\nabla(\lambda))$ contains $\nabla^*(\lambda)$ (here we use the fact that we are in the quasiclassical case, so that $c^*(\nabla(\lambda))$ is semisimple and $\nabla^*(\mu)$ is simple for all $\mu \in X^*$), and the highest weight spaces correspond. But now one can check directly that $c$ is compatible with the "coproduct" on $\dot{U}_\ell$ and $\dot{U}_\ell^*$ (see [L93, 23.1.5]), so that it respects tensor products of modules. Thus if $u 1_\zeta \in \dot{U}_\ell^*$ lies in the kernel of $c$, then $u 1_\zeta$ annihilates $c^*(\nabla(\lambda)^\omega \otimes \nabla(\mu))$ for all $\lambda, \mu \in X^*$ with $\mu - \lambda = \zeta$, and hence all the modules $\nabla^*(\lambda)^\omega \otimes \nabla^*(\mu)$, so by the above it must be zero as required.

Remark 4.8. The existence of special isogenies (the positive characteristic analogues of the $\ell = r$ quantum Frobenius for finite type quantum groups) has been used by Kumar and Stembridge [KS] to establish certain inequalities on tensor product multiplicities for a group and its dual group. They use characteristic $p$ methods rather than quantum groups, but one can also use quantum isogenies to establish their results. It would be interesting to know if the splitting map is useful in their context.

5. On Langlands Duality Branching Rules: Combinatorics

In this section we show that the multiplicities $m^\lambda_\mu$ which occur in the character duality can be interpreted as tensor product multiplicities. We shall work purely combinatorially, and so $C$ can be an arbitrary symmetrizable generalized Cartan matrix (but we assume still that our root datum is $X$-regular and $Y$-regular, so that we have a dominant cone $X^+$ and a partial order $\leq$ on $X$). Since tensor product multiplicities are manifestly positive, we also obtain a proof that $\Pi_\ell(\chi(\lambda))$ is the character of a representation of the dual Lie algebra for any symmetrizable Kac-Moody Lie algebra.

Let $\Delta$ be the set of roots of the Kac-Moody Lie algebra attached to $(X, Y, I)$ and $\Delta^* \subset X^*$ the set of roots for the dual Lie algebra. Pick a Weyl vector $\rho \in X$, so that $\langle \check{\alpha}_i, \rho \rangle = 1$ for all $i \in I$. If $\{\varpi_i : i \in I\}$ denotes a choice of fundamental weights (which we may assume lie in $X$) then we may take $\rho = \sum_{i \in I} \varpi_i$. We write $w \cdot \lambda$ for the $\rho$-shifted action of $W$ on $X$, that is

\[ w \cdot \lambda = w(\lambda + \rho) - \rho. \]

If $\nu \in X^+$, then for all $w \in W$ we have $\nu - w(\nu) \in \sum_{i \in I} \mathbb{N}\alpha_i$. It follows that $A_\nu = \sum_{w \in W} \varepsilon(w) e^{w \cdot \nu}$ lies in $E$ for any $\nu \in X^+$. Since $A_{w \cdot \nu} = \varepsilon(w) A_\nu$ we see $A_\nu \in E$ for any $\nu \in W \cdot X^+$. Set $D = \prod_{\alpha \in \Delta, \alpha > 0} (1 - e^{-\alpha})^{n_\alpha} \in E$, where $n_\alpha$ is the dimension of the $\alpha$ root space in the Kac-Moody algebra. For $\lambda \in X^+$, the Weyl-Kac character formula for the irreducible highest weight representation $\nabla(\lambda)$ of highest weight $\lambda$ states that

\[ \chi(\lambda) = \chi(\nabla(\lambda)) = D^{-1}.A_\lambda, \]

(note that $D^{-1}$ and $A_\lambda$ lie in $E$, so the product is well-defined). Note that, as for the $A_\nu$, the above expression makes sense for any $\lambda \in W \cdot X^+$, not just $\lambda \in X^+$, but clearly if $v \in W$ then $\chi(v \cdot \lambda) = (-1)^{\ell(v)} \chi(\lambda)$, with both sides zero when $\lambda + \rho$ is not regular.

Let $\rho^L = \sum_{i \in I} l_i \varpi_i \in X^*$, a Weyl vector for the datum $(X^*, Y^*, I)$. Then we have a similar expression for the characters of the Langlands dual Kac-Moody algebra in terms of its positive roots and the $\rho^L$-shifted action of $W$ (recall by Remark 2.10 that the Weyl groups of the two root data are canonically isomorphic and the natural action of $W$ on $X^*$ is the restriction of that on $X$).

We will need some combinatorial lemmas. The key formula in the next lemma goes back to Brauer.

Lemma 5.1. Let $\xi \in E^W$, that is, $\xi = \sum_{\lambda \in X} a_\lambda e^\lambda \in E$ and $a_\lambda = a_{w(\lambda)}$ for all $w \in W$, $\lambda \in X$. Then

\[ \xi.\chi(\nu) = \sum_{\lambda \in X} a_\lambda \chi(\lambda + \nu). \]

Moreover it follows that if $\mathrm{supp}(\xi) \subset X \backslash X^*$, that is, $a_\lambda = 0$ if $\lambda \in X^*$, then $\xi.\chi(\rho^L - \rho)$ lies in the span of Weyl characters of the form $\chi(\mu)$ with $\mu \in X^+$ and $\mu \notin (\rho^L - \rho) + X^*$.

Proof. Clearly for the first part of the lemma it is enough to establish the identity:

\[ \xi.A_\nu = \sum_{\lambda \in X} a_\lambda A_{\lambda + \nu}. \]

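Now we have

\[
\begin{aligned}
\xi.A_\nu &= \sum_{\lambda \in X} \sum_{w \in W} a_\lambda e^{\lambda} \varepsilon(w) e^{w \cdot \nu} \\
&= \sum_{\lambda \in X} \sum_{w \in W} a_\lambda \varepsilon(w) e^{w \cdot (\nu + w^{-1}(\lambda))}.
\end{aligned}
\]

Interchanging the order of summation we get:

\[
\begin{aligned}
\sum_{w \in W} \sum_{\lambda \in X} a_\lambda \varepsilon(w) e^{w \cdot (\nu + w^{-1}(\lambda))}
&= \sum_{w \in W} \sum_{\eta \in X} a_\eta \varepsilon(w) e^{w \cdot (\nu + \eta)}, \qquad (\eta = w^{-1}(\lambda)), \\
&= \sum_{\eta \in X} a_\eta A_{\nu + \eta},
\end{aligned}
\]

where in the second line we used the $W$-invariance of the $a_\lambda$, and we reversed the order of summation again in the last line. Note that since $\nu + \eta$ may not be dominant, this is not necessarily a positive sum of the $A_\lambda$ for $\lambda \in X^+$, even if the $a_\mu$ are all positive integers.

Applying the formula in the first part of the lemma to $\xi \in \mathbb{Z}[X \backslash X^*]^W$ and $\nu = \rho^L - \rho$, and using the fact that $\chi(w \cdot \lambda) = (-1)^{\ell(w)} \chi(\lambda)$, we see that all the Weyl characters which can occur in the product $\xi.\chi(\rho^L - \rho)$ have highest weights of the form

\[ w(\rho^L + \eta) - \rho = w(\eta) + (w(\rho^L) - \rho^L) + (\rho^L - \rho), \]

where $\eta \in X^+$ and $a_\eta \neq 0$. But then by assumption $\eta \notin X^*$, and so $w(\eta) \notin X^*$. On the other hand clearly $w(\rho^L) - \rho^L \in X^*$, hence this weight cannot lie in $\rho^L - \rho + X^*$, and we are done.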

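Lemma 5.2. Let $\lambda \in X^*$. Then we have $\chi(\lambda + \rho^L - \rho) = \chi(\rho^L - \rho).\chi^L(\lambda)$.

Proof. Let $\Delta$ denote the set of roots for $(X, Y, I)$, and similarly let $\Delta^*$ denote the set of roots for $(X^*, Y^*, I)$. We have

\[ \chi(\lambda + \rho^L - \rho) = \prod_{\alpha \in \Delta, \alpha > 0} (1 - e^{-\alpha})^{-n_\alpha} \sum_{w \in W} \varepsilon(w) e^{w(\lambda + \rho^L) - \rho}, \]

while

\[ \chi^L(\lambda) = \prod_{\alpha^* \in \Delta^*, \alpha^* > 0} (1 - e^{-\alpha^*})^{-n^*_{\alpha^*}} \sum_{w \in W} \varepsilon(w) e^{w(\lambda + \rho^L) - \rho^L}, \]

where $n_\alpha$ and $n^*_{\alpha^*}$ denote the dimensions of the root spaces in $\mathfrak{g}$ and ${}^L\mathfrak{g}$ respectively. But now we have

\[
\begin{aligned}
\chi(\rho^L - \rho) &= \Big( \prod_{\alpha > 0} (1 - e^{-\alpha})^{-n_\alpha} \Big) \sum_{w \in W} \varepsilon(w) e^{w(\rho^L) - \rho} \\
&= \Big( e^{-\rho + \rho^L} \prod_{\alpha > 0} (1 - e^{-\alpha})^{-n_\alpha} \Big) \sum_{w \in W} \varepsilon(w) e^{w(\rho^L) - \rho^L}.
\end{aligned}
\]

Applying Weyl's denominator formula for $(X^*, Y^*, I)$ to the sum in the last expression, we find that

\[ \chi(\rho^L - \rho) = \Big( e^{\rho^L - \rho} \prod_{\alpha > 0} (1 - e^{-\alpha})^{-n_\alpha} \Big) \prod_{\alpha^* > 0} (1 - e^{-\alpha^*})^{n^*_{\alpha^*}}. \]

The statement of the lemma follows immediately.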

Remark 5.3. In the finite or affine case, Lemma 2.12 shows that $\Delta$ and $\Delta^*$ are in bijection $\alpha \leftrightarrow \alpha^*$, so that $\alpha^* = l_\alpha \alpha$ for some positive $l_\alpha \in \mathbb{Z}$. Since all root spaces are one-dimensional in the finite case, we get a simple expression for the Weyl character of $\rho^L - \rho$:

\[ \chi(\rho^L - \rho) = e^{\rho^L - \rho} \prod_{\alpha \in \Delta, \alpha > 0} \big(1 + e^{-\alpha} + \ldots + e^{-(l_\alpha - 1)\alpha}\big). \]

In the affine case suppose the generalized Cartan matrix is of type $X_m^{(s)}$ in the classification given in [Kac, Chap. 4]. Then although the root spaces corresponding to real roots are again all one-dimensional, the root space of weight $j\delta$ has dimension $|I| - 1$ if $s$ divides $j$ and dimension $(m - |I| + 1)/(s - 1)$ otherwise. The explicit formula for $\chi(\rho^L - \rho)$ is thus similar but contains a more elaborate product contribution coming from imaginary roots.

Example 5.4. Let $\ell = 2$, and let $\dot{U}$ be the quantum group of type $B_2$, so that $\dot{U}^*$ is of type $C_2$. We take $\alpha_1$ to be the long root and $\alpha_2$ to be the short root, so that $\rho^L - \rho = \varpi_2$. Setting $\mu = \varpi_1 \in X^*$, for example, and writing $y_i = e^{\varpi_i}$, we have

\[ \chi(\rho^L - \rho + \mu) = \chi(\rho) = y_1 y_2 (1 + y_1^{-2} y_2^{2})(1 + y_1 y_2^{-2})(1 + y_1^{-1})(1 + y_2^{-2}) \]

(since we always have $\chi(\rho) = e^{\rho} \prod_{\alpha > 0} (1 + e^{-\alpha})$). Now the representation of highest weight $\varpi_2$ is 4-dimensional with character $\chi(\varpi_2) = y_2 (1 + y_1 y_2^{-2})(1 + y_1^{-1})$, and so dually we have $\chi^L(\varpi_1) = y_1 (1 + y_1^{-2} y_2^{2})(1 + y_2^{-2})$. The product formula of the previous lemma now follows immediately.
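The identity of Example 5.4 can be checked mechanically by multiplying Laurent polynomials in $y_1, y_2$. The sketch below stores a polynomial as a dictionary from exponent pairs to coefficients; the helper names (`monomial`, `one_plus`, `mul`) are hypothetical, and the exponents are obtained by writing the positive roots of $B_2$ in terms of fundamental weights, $\alpha_1 = 2\varpi_1 - 2\varpi_2$ and $\alpha_2 = -\varpi_1 + 2\varpi_2$.

```python
from collections import Counter


def monomial(a, b):
    """The monomial y1^a * y2^b, stored as {exponent pair: coefficient}."""
    return Counter({(a, b): 1})


def one_plus(a, b):
    """The binomial 1 + y1^a * y2^b."""
    return Counter({(0, 0): 1, (a, b): 1})


def mul(*factors):
    """Product of Laurent polynomials in y1, y2."""
    result = Counter({(0, 0): 1})
    for f in factors:
        nxt = Counter()
        for (a, b), c in result.items():
            for (a2, b2), c2 in f.items():
                nxt[(a + a2, b + b2)] += c * c2
        result = nxt
    return result


# Factors 1 + e^{-alpha} for alpha = alpha_1, alpha_2, alpha_1 + alpha_2, alpha_1 + 2 alpha_2.
chi_rho = mul(monomial(1, 1), one_plus(-2, 2), one_plus(1, -2),
              one_plus(-1, 0), one_plus(0, -2))
chi_w2 = mul(monomial(0, 1), one_plus(1, -2), one_plus(-1, 0))    # chi(rho^L - rho)
chi_L_w1 = mul(monomial(1, 0), one_plus(-2, 2), one_plus(0, -2))  # chi^L(varpi_1)

print(mul(chi_w2, chi_L_w1) == chi_rho)
print(sum(chi_rho.values()), sum(chi_w2.values()), sum(chi_L_w1.values()))
```

Summing the coefficients evaluates each character at $y_1 = y_2 = 1$, which gives the dimensions of the three representations involved.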


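For $\nu_1, \nu_2, \nu_3 \in X^+$ let $\{c^{\nu_3}_{\nu_1,\nu_2}\}$ be the structure constants for the multiplication on the Grothendieck group of the category $\mathcal{O}_{\mathrm{int}}(\mathfrak{g})$ given by tensor product. Since $K_0(\mathcal{O}_{\mathrm{int}}(\mathfrak{g}))$ injects into $E$ via the character map we have

\[ \chi(\nu_1)\chi(\nu_2) = \sum_{\nu_3 \in X^+} c^{\nu_3}_{\nu_1,\nu_2} \chi(\nu_3). \]

Theorem 5.5. The Langlands duality branching rules $m^\lambda_\mu$ are positive. More precisely, we have

\[ m^\lambda_\mu = c^{\mu + \rho^L - \rho}_{\rho^L - \rho, \lambda}. \]

Proof. Suppose we have $\Pi_\ell(\chi(\lambda)) = \sum_{\mu \in X^+ \cap X^*, \mu \leq \lambda} m^\lambda_\mu \chi^L(\mu)$. Then it follows from Lemma 5.2 that

\[ \chi(\rho^L - \rho)\, \Pi_\ell(\chi(\lambda)) = \sum_{\mu \in X^+ \cap X^*, \mu \leq \lambda} m^\lambda_\mu \chi(\mu + \rho^L - \rho). \tag{5.1} \]

On the other hand, since $\chi(\lambda)$ is $W$-invariant, and $\Pi_\ell$ is $W$-equivariant, we may apply Lemma 5.1 with $\xi = \chi(\lambda) - \Pi_\ell(\chi(\lambda))$ to obtain

\[ \chi(\rho^L - \rho).\big(\chi(\lambda) - \Pi_\ell(\chi(\lambda))\big) = \sum_{\substack{\nu \in X^+ \\ \nu \notin \rho^L - \rho + X^*}} n^\lambda_\nu \chi(\nu) \tag{5.2} \]

for some integers $n^\lambda_\nu \in \mathbb{Z}$. Finally, since we have

\[ \chi(\rho^L - \rho).\chi(\lambda) = \sum_{\eta \in X^+} c^{\eta}_{\rho^L - \rho, \lambda} \chi(\eta), \]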

and the Weyl characters which can occur on the right-hand sides of Eqs. (5.1) and (5.2) have highest weights which lie in different $X^*$-cosets, the claim immediately follows.

Remark 5.6. Note that the above argument shows that $\chi(\rho^L - \rho).\big(\chi(\lambda) - \Pi_\ell(\chi(\lambda))\big)$ is indeed a positive sum of Weyl characters, in contrast to the general situation of Lemma 5.1. We will see in the next section that, at least in the finite-type case, it is the character of a direct summand of the $\dot{U}_\ell$-module $\nabla(\rho^L - \rho) \otimes \nabla(\lambda)$.

Tensor product multiplicities have been computed combinatorially by various people: for finite type, building on a combinatorial description due to Lusztig, Berenstein and Zelevinsky have given "polyhedral" multiplicity formulas in [BZ]; for a general Kac-Moody algebra, there is a Littlewood-Richardson rule in terms of Littelmann paths [Li95].

Example 5.7. Consider again the case of $B_2$. We need to calculate the multiplicities in the tensor products $\nabla(\lambda) \otimes \nabla(\varpi_2)$ for $\lambda \in X^* \cap X^+$ (where $\alpha_2$ is the short root and $\alpha_1$ the long root). As we have seen above, the set of weights of $\nabla(\varpi_2)$ is the Weyl group orbit of $\varpi_2$ and each weight has multiplicity one. Let $W_2$ be the stabilizer of $\varpi_2$ in $W$. Using the formula in the statement of Lemma 5.1, it follows that for $w \in W/W_2$, the simple highest weight representation $\nabla(\lambda + w(\varpi_2))$ occurs exactly once in the tensor product, provided $\lambda + w(\varpi_2)$ is dominant (since it is easy to check that if $\lambda + w(\varpi_2)$ is not dominant, then $\lambda + w(\varpi_2) + \rho$ is not regular), and these are all the constituents. Hence we have

\[ \Pi_2(\chi(\lambda)) = \sum_{\substack{w \in W/W_2 \\ \lambda + w(\varpi_2) \in X^+}} \chi^L(\lambda + w(\varpi_2) - \varpi_2). \]
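The formula just displayed can be tested numerically. The sketch below lists the constituents $\mu = \lambda + w(\varpi_2) - \varpi_2$ for $\lambda = a\varpi_1 + b\varpi_2$ with $b$ even, and compares dimensions using the Weyl dimension formula on both sides. It relies on an observation that is not stated in the text and should be read as an assumption of this sketch: for $B_2$ at $\ell = 2$ the root lattice is contained in $X^*$, so every weight of $\nabla(\lambda)$ with $\lambda \in X^*$ lies in $X^*$ and $\Pi_2(\chi(\lambda))$ has the same dimension as $\nabla(\lambda)$. All function names are hypothetical.

```python
# Weyl group orbit of varpi_2, in (varpi_1, varpi_2) coordinates.
SPIN_WEIGHTS = [(0, 1), (1, -1), (-1, 1), (0, -1)]


def dual_constituents(a, b):
    """Weights mu with chi^L(mu) occurring in Pi_2(chi(lam)), for lam = a*w1 + b*w2 in X^*."""
    if a < 0 or b < 0 or b % 2 != 0:
        raise ValueError("need a dominant weight in X^*: a, b >= 0 and b even")
    out = []
    for (x, y) in SPIN_WEIGHTS:
        if a + x >= 0 and b + y >= 0:       # lam + w(varpi_2) is dominant
            out.append((a + x, b + y - 1))  # mu = lam + w(varpi_2) - varpi_2
    return out


def dim_B2(a, b):
    """Weyl dimension formula for B_2 (alpha_1 long, alpha_2 short)."""
    return (a + 1) * (b + 1) * (2 * a + b + 3) * (a + b + 2) // 6


def dim_C2(a, b):
    """Weyl dimension formula for the dual datum; input in (varpi_1, varpi_2) coordinates."""
    p, q = a, b // 2  # coordinates in the dual fundamental weights varpi_1 and 2*varpi_2
    return (p + 1) * (q + 1) * (p + q + 2) * (p + 2 * q + 3) // 6


for lam in [(1, 0), (0, 2), (1, 2)]:
    mus = dual_constituents(*lam)
    print(lam, mus, dim_B2(*lam), sum(dim_C2(*mu) for mu in mus))
```

For $\lambda = \varpi_1$, for instance, this returns the two constituents $\varpi_1$ and $0$, so the 5-dimensional representation of $B_2$ contracts to the sum of the 4-dimensional and trivial representations of the dual algebra.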

Note that since $\lambda \in X^* \cap X^+$, the weight $\lambda + w(\varpi_2)$ is dominant if and only if $\lambda + w(\varpi_2) - \varpi_2$ is dominant. This recovers the calculations of [FH, Remark 6.9].

6. On Langlands Duality Branching Rules in the Finite-type Case and Tilting Modules

In this section we study the branching multiplicities $m^\lambda_\mu$ from a representation-theoretic point of view, and give an interpretation of the results of the last section using tilting modules. We restrict ourselves to the case of finite type algebras, as we will use the machinery of induction etc. for quantum algebras at roots of unity provided by [APW,Ka], and the infinitesimal quantum groups defined by Lusztig.

We begin by recalling the results on quantum groups at a root of unity that we will need. Since we work here with only finite type quantum groups we may work with the category $\mathcal{F}$ of finite dimensional representations (of type 1). In [L90], Lusztig defined root vectors $\theta_\alpha$ for each positive root $\alpha$, via a braid group action. For a positive root $\alpha$, let $l_\alpha = l_i$, where $\alpha$ is conjugate to the simple root $\alpha_i$ under the action of the Weyl group $W$.

Proposition 6.1. Let $\mathfrak{f}$ be the subalgebra of $\mathbf{f}_\ell = \mathcal{A}_\ell \otimes_{\mathcal{A}} {}_{\mathcal{A}}\mathbf{f}$ generated by $\{\theta_\alpha : l_\alpha \geq 2\}$. Then $\mathfrak{f}$ is a finite dimensional algebra and we have an isomorphism $\mathbf{f}_\ell^* \otimes \mathfrak{f} \to \mathbf{f}_\ell$, given by $(x, y) \mapsto \mathrm{Fr}'(x).y$.

Proof. This is established in most cases in [L93, 35], except when $\ell$ is small. The excluded cases (which are already stated in [L93] but without detailed proof) have been checked in [Ka, 2.7].

Definition 6.2. We need various subalgebras of $\dot{U}_\ell$ and $\dot{U}_\ell^*$. Define algebras $\dot{u}_\ell$ and $\hat{u}_\ell$ by setting $\dot{u}_\ell = \{x^+ 1_\lambda y^- : x, y \in \mathfrak{f}, \lambda \in X\}$ and $\hat{u}_\ell = \{x 1_\lambda : x \in \dot{u}_\ell \text{ or } x \in U_\ell^-\}$ (note that these are indeed subalgebras). Finally, we need the subalgebra $\dot{U}_\ell^{\leq 0} = \{x^- 1_\lambda : x \in \mathbf{f}_\ell\}$.

In [APW92] the authors define (derived) induction functors on integrable modules for the algebras $\dot{u}_\ell$, $\hat{u}_\ell$, $\dot{U}_\ell^{\leq 0}$ and $\dot{U}_\ell$, denoted $H^i(U_1/U_2, -)$, where $U_1 \supset U_2$ are two of the algebras above$^6$.

$^6$ In fact they work with "unmodified" algebras, but the categories of modules obviously correspond to categories of modules for the modified algebras; see [Ka] for more details.

Given $\lambda \in X$, there is a natural one-dimensional module for $\dot{U}_\ell^{\leq 0}$ which we denote simply by $\lambda$. When $\lambda$ is dominant, $H^0(\dot{U}_\ell/\dot{U}_\ell^{\leq 0}, \lambda)$, just as in the classical theory of induction from a Borel subgroup, is an indecomposable module with character given by Weyl's formula, which we denote by $\nabla(\lambda)$. It has a unique simple submodule $L(\lambda)$. The dual of $\nabla(\lambda)$, denoted $\Delta(\lambda)$, is a costandard, or Weyl, module. The same theory exists for the algebra $\dot{U}_\ell^*$, though here of course the theory is easier

because the category of $\dot{U}_\ell^*$-modules is semisimple. We will write $\nabla^*(\lambda)$, $\Delta^*(\lambda)$ for the corresponding modules for $\dot{U}_\ell^*$.

We define $\sigma = \sum_{i \in I} (l_i - 1)\varpi_i$, and let $\mathrm{St}_\ell$, the Steinberg representation, be the module $\nabla(\sigma)$. The functor $H^0(\hat{u}_\ell/\dot{U}_\ell^{\leq 0}, -)$ is exact (see [Ka, 2.9]), and we denote it by $\hat{Z}_\ell$. It is known that $\mathrm{St}_\ell \cong \hat{Z}_\ell(\sigma)$ as $\hat{u}_\ell$-modules, and in fact $\mathrm{St}_\ell$ is simple as both a $\dot{U}_\ell$-module and a $\hat{u}_\ell$-module; see for example [APW92, §0.9] for more details.

We will also need the class of modules known as tilting modules, whose definition we now recall.

Definition 6.3. A $\dot{U}_\ell$-module is said to be tilting if it has a filtration both by standard and costandard modules.

We now review some of the basic results on tilting modules. Although all the results are standard, we sketch the proof to point out that they all hold even for small values of $\ell$.

Theorem 6.4. (1) If $M_1$, $M_2$ are tilting modules, then so is $M_1 \otimes M_2$. (2) If $M$ and $N$ are tilting modules, then $M \cong N$ if and only if $M$ and $N$ have the same character.

Proof. The key to (1) is to show that the tensor product of two standard modules has a filtration by standard modules. This follows even integrally from Lusztig's theory of based modules: see for example [Ka98]. The general construction of tilting modules shows that for each $\lambda \in X^+$ there is a unique indecomposable tilting module $T(\lambda)$, where $\lambda$ occurs as a weight of $T(\lambda)$ with multiplicity one and all other weights of $T(\lambda)$ are strictly less than $\lambda$. Moreover every indecomposable tilting module has this form. See [A92, §2] for more details. This readily implies (2).

We also need to understand some relations between pulling back via the Frobenius, and induction. The main result of [Ka] asserts that for $M$ a $\dot{U}_\ell^{*\leq 0}$-module there is an isomorphism

\[ H^i(\dot{U}_\ell/\hat{u}_\ell, M^{\mathrm{Fr}}) \cong H^i(\dot{U}_\ell^*/\dot{U}_\ell^{*\leq 0}, M)^{\mathrm{Fr}}, \qquad (i \geq 0), \tag{6.1} \]

where $M^{\mathrm{Fr}}$ is the $\hat{u}_\ell$-module obtained via composition with $\mathrm{Fr}$, and similarly for the right-hand side. (This result is already established in [APW92] with some restrictions.)

Theorem 6.5. Let $\lambda \in X^+ \cap X^*$, then we have

\[ H^i(\dot{U}_\ell/\dot{U}_\ell^{\leq 0}, \lambda + \sigma) \cong \mathrm{St}_\ell \otimes H^i(\dot{U}_\ell^*/\dot{U}_\ell^{*\leq 0}, \lambda)^{\mathrm{Fr}}. \tag{6.2} \]

Proof. With the ingredients provided by [Ka], the proof is standard. By (6.1) we have $H^i(\dot{U}_\ell^*/\dot{U}_\ell^{*\leq 0}, \lambda)^{\mathrm{Fr}} \cong H^i(\dot{U}_\ell/\hat{u}_\ell, \lambda^{\mathrm{Fr}})$. Thus tensoring both sides with $\mathrm{St}_\ell$ we find the right-hand side of (6.2) is isomorphic to

\[
\begin{aligned}
H^i(\dot{U}_\ell/\hat{u}_\ell, \lambda^{\mathrm{Fr}}) \otimes \mathrm{St}_\ell &\cong H^i(\dot{U}_\ell/\hat{u}_\ell, \lambda^{\mathrm{Fr}} \otimes \mathrm{St}_\ell) \\
&\cong H^i(\dot{U}_\ell/\hat{u}_\ell, \lambda^{\mathrm{Fr}} \otimes \hat{Z}_\ell(\sigma)) \\
&\cong H^i(\dot{U}_\ell/\hat{u}_\ell, \hat{Z}_\ell(\lambda + \sigma)) \\
&\cong H^i(\dot{U}_\ell/\dot{U}_\ell^{\leq 0}, \lambda + \sigma),
\end{aligned}
\]


where in the first line we use the tensor identity for $\dot{U}$-modules, in the second the isomorphism $St \cong \hat{Z}(\sigma)$, in the third the tensor identity for $\hat{u}$-modules, and in the final line, the fact that $\hat{Z}$ is exact, so the spectral sequence for the composition of induction functors degenerates to give an isomorphism $H^i(\dot{U}/\hat{u}, \hat{Z}(M)) \cong H^i(\dot{U}/\dot{U}^{\leq 0}, M)$.

Remark 6.6. The characteristic $p$ version of this theorem, due to Andersen, gives a short proof of Kempf's vanishing theorem. Moreover, taking characters of $H^0$ when $\ell = d$ we recover Lemma 5.2, and thus it can be seen as the representation-theoretic version of that calculation.

Proposition 6.7 ([APW,A92]). We have the following properties of the Steinberg module $St$:
(1) $St$ is injective and projective in $\mathcal{F}$.
(2) If $M$ is a finite dimensional representation, then $St \otimes M$ is tilting, projective and injective in $\mathcal{F}$.

Proof. We outline a proof of this theorem here to emphasize that it holds for all $\ell$ (at least over $\mathbb{C}$, which is the only field we need here). We must show that $\mathrm{Ext}^1(St, L(\lambda)) = 0$ for all $\lambda \in X^+$. The linkage principle already implies that this Ext vanishes unless $\lambda = \sigma + \mu$, where $\mu \in X^*$. Now the previous theorem shows that these modules are in fact standard modules $\nabla(\sigma + \mu) = St \otimes Fr^*(\nabla^*(\mu))$. Hence it is enough to show that $\mathrm{Ext}^1(St, \nabla(\sigma + \mu)) = 0$. Since $St$ is self-dual, this follows if we can show $\mathrm{Ext}^1(\Delta(\sigma), \nabla(\sigma + \mu)) = 0$, but it is known (and crucial in the construction of tilting modules) that $\mathrm{Ext}^1(\Delta(\lambda), \nabla(\mu)) = 0$ for all $\lambda, \mu \in X^+$, and so we are done. Self-duality also immediately implies that $St$ is injective. Moreover, using standard properties of Hom and the tensor product, it follows readily that $St \otimes E$ is injective and projective for any finite-dimensional module $E$. To see that it is tilting, one can show that any module can be embedded in a module of the form $St \otimes T$, where $T$ is tilting. Since $St$ is also tilting, and tilting modules are closed under direct summands, it follows that indecomposable projectives and injective modules are tilting. See [A92] for more details.

Corollary 6.8. Let $\mu \in X^* \cap X^+$. Then $\nabla(\mu + \rho^L - \rho)$ is simple, tilting, projective and injective.

Proof. From Theorem 6.2 and Lusztig's quantum version of Steinberg's tensor product theorem, it follows that the modules $\nabla(\mu + \rho^L - \rho) = \nabla(\rho^L - \rho) \otimes Fr^*(\nabla^*(\mu))$ are simple. By the previous proposition, they are also tilting and injective. By duality, they are also projective.

We now examine the Langlands branching multiplicities. We would like a representation-theoretic interpretation of the calculation of these multiplicities in terms of tensor-product multiplicities in Theorem 5.5. The key, unsurprisingly, is Theorem 6.2 and the theory of tilting modules outlined above. Notice first that $c^*(\nabla(\lambda))$ is a representation of $\dot{U}^*$, so it does not make sense to compare it to $\nabla(\lambda)$; however, we may pull it back via $Fr$ without changing its character. Unfortunately, $Fr^* c^*(\nabla)$ still has no obvious (at least to the authors) relation to $\nabla(\lambda)$. Nevertheless once we tensor with the


Steinberg representation, a natural relation appears. Recall from [A03] that the linkage principle for $\dot{U}$ shows that the orbits of the $\rho$-shifted action of the affine Weyl group $\hat{W}$ are unions of blocks for $\dot{U}$. The simple modules $\nabla(\mu + \rho^L - \rho)$ for $\mu \in X^*$ thus all lie in the union of blocks given by the orbits of $\hat{W}$ on $\rho^L - \rho + X^*$.

Proposition 6.9. Let $\lambda \in X^*$. The module $St \otimes Fr^* c^*(\nabla(\lambda))$ is a direct summand of the module $St \otimes \nabla(\lambda)$, and moreover is precisely the summand which lies in the union of the blocks of $\dot{U}$ contained in the $\hat{W}$-orbits of $\rho^L - \rho + X^*$.

Proof. By part (2) of Proposition 6.7 we see that $St \otimes \nabla(\lambda)$ is a tilting module. Thus if $T(\gamma)$ denotes the indecomposable tilting module with highest weight $\gamma$, we may write

$$ St \otimes \nabla(\lambda) = \bigoplus_{\nu \in X^+} T(\nu + \rho^L - \rho), $$

a sum of indecomposable tilting modules (the tilting modules which occur must have highest weight of the form $\nu + \rho^L - \rho$, by [A92, 5.12]). For any $\mu \in X^*$, Theorem 6.2 combined with Proposition 6.7 shows that $\nabla(\mu + \rho^L - \rho)$ is projective and injective and tilting, thus it cannot occur as a composition factor of a standard filtration of $T(\nu + \rho^L - \rho)$ for $\nu \notin X^*$. Therefore

$$ St \otimes \nabla(\lambda) = T \oplus \Big( \bigoplus_{\mu \in X^*} \nabla(\mu + \rho^L - \rho)^{\oplus c^{\mu + \rho^L - \rho}_{\rho^L - \rho, \mu}} \Big), $$

where $T$ is a tilting module whose character lies entirely in the (positive) span of the Weyl characters $\chi(\nu + \rho^L - \rho)$ for $\nu \notin X^*$. On the other hand, we have

$$ St \otimes Fr^* c^*(\nabla(\lambda)) = \bigoplus_{\mu \in X^*} \nabla(\mu + \rho^L - \rho)^{\oplus m^\lambda_\mu}. $$

Hence using Theorem 5.5 the result follows.

Remark 6.10. This also shows that the expression χ(ρ^L − ρ).(χ(λ) − (χ(λ)) is the character of $T$ in the above proof, which is also a direct summand of $St \otimes \nabla(\lambda)$. Note also that the above proof shows that $m^\lambda_\mu \leq c^{\mu + \rho^L - \rho}_{\mu, \rho^L - \rho}$, independently of Sect. 5, since tilting modules are determined by their character. It would be interesting to know if there is a natural $\dot{U}$-module map between $St \otimes \nabla(\lambda)$ and $St \otimes Fr^* c^*(\nabla(\lambda))$.

References

[A03] Andersen, H.H.: The strong linkage principle for quantum groups at roots of 1. J. Alg. 260, 2–15 (2003)
[A92] Andersen, H.H.: Tensor products of quantized tilting modules. Commun. Math. Phys. 159, 149–159 (1992)
[AP] Andersen, H.H., Paradowski, J.: Fusion categories arising from semisimple Lie algebras. Commun. Math. Phys. 169, 563–588 (1995)
[APW] Andersen, H., Polo, P., Wen, K.: Representations of quantum algebras. Invent. Math. 104, 1–53 (1991)
[APW92] Andersen, H., Polo, P., Wen, K.: Injective modules for quantum groups. Amer. J. Math. 114, 571–604 (1992)
[BZ] Berenstein, A., Zelevinsky, A.: Tensor product multiplicities, canonical bases and totally positive varieties. Invent. Math. 143, 77–128 (2001)
[FH] Frenkel, E., Hernandez, D.: Langlands duality for representations of quantum groups. http://arXiv.org/abs/0809.4453v3[math.QA], 2008
[FH2] Frenkel, E., Hernandez, D.: Langlands duality for finite-dimensional representations of quantum affine algebras. http://arXiv.org/abs/0902.0447v2[math.QA], 2009
[FM] Frenkel, E., Mukhin, E.: The q-characters at a root of unity. Adv. Math. 171(1), 139–167 (2002)
[Kac] Kac, V.: Infinite dimensional Lie algebras. 3rd ed., Cambridge: Cambridge University Press, 1990
[Ka98] Kaneda, M.: Based modules and good filtrations in algebraic groups. Hiroshima Math. J. 28, 337–344 (1998)
[Ka] Kaneda, M.: Cohomology of infinitesimal quantum algebras. J. Alg. 226, 250–282 (2000)
[K96] Kashiwara, M.: Similarity of crystal bases. Cont. Math. 194, 177–186 (1996)
[K02] Kashiwara, M.: Bases cristallines des groupes quantiques. Edited by Charles Cochet. Cours Spécialisés, 9. Paris: Société Mathématique de France, 2002
[KL02] Kumar, S., Littelmann, P.: Algebraization of Frobenius splitting via quantum groups. Ann. of Math. (2) 155(2), 491–551 (2002)
[KS] Kumar, S., Stembridge, J.: Special isogenies and tensor product multiplicities. Int. Math. Res. Not. 2007, Article ID rnm081, 13 pp.
[Li95] Littelmann, P.: Path and root operators in representation theory. Ann. of Math. (2) 142, 499–525 (1995)
[Li] Littelmann, P.: Contracting modules and standard monomial theory for symmetrizable Kac-Moody algebras. J. Amer. Math. Soc. 11(3), 551–567 (1998)
[L90] Lusztig, G.: Quantum groups at roots of 1. Geom. Dedicata 35, 89–114 (1990)
[L93] Lusztig, G.: Introduction to Quantum Groups. Birkhäuser, Boston, 1993
[M] McGerty, K.: Generalized q-Schur algebras and quantum Frobenius. Adv. Math. 214(1), 116–131 (2007)
[St] Steinberg, R.: The isomorphism and isogeny theorems for reductive algebraic groups. J. Alg. 216, 366–383 (1999)

Communicated by Y. Kawahigashi

Commun. Math. Phys. 296, 111–143 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-0996-9


Comments on Hastings' Additivity Counterexamples

Motohisa Fukuda1, Christopher King2, David K. Moser3

1 Department of Mathematics, University of California, Davis, CA, USA
2 Department of Mathematics, Northeastern University, Boston, MA 02115, USA

E-mail: [email protected]

3 Department of Physics, Northeastern University, Boston, MA 02115, USA

Received: 22 May 2009 / Accepted: 11 November 2009 Published online: 4 February 2010 – © Springer-Verlag 2010

Abstract: Hastings [12] recently provided a proof of the existence of channels which violate the additivity conjecture for minimal output entropy. In this paper we present an expanded version of Hastings' proof. In addition to a careful elucidation of the details of the proof, we also present bounds for the minimal dimensions needed to obtain a counterexample.

Contents

1. The Additivity Conjectures
2. Notation and Statement of Results
   2.1 Notation
   2.2 The main result
3. Background on Random States and Channels
   3.1 Probability distributions for states
   3.2 Probability distributions on the simplex $\Delta_d$
   3.3 Estimates for $\mu_{d,n}$
   3.4 Probability distribution for random unitary channels
4. Proof of Theorem 2
   4.1 Definition of the typical channel
   4.2 Definition of the low-entropy events E
   4.3 The upper bound for Prob(E)
   4.4 The lower bound for Prob(E)
   4.5 Combining the bounds for Prob(E) and finishing the proof
   4.6 Optimizing the bounds for Prob(E) and the proof of Proposition 3
5. Proofs of Lemmas
   5.1 Proof of Lemma 1
   5.2 Proof of Lemma 4
   5.3 Proof of Lemma 7
   5.4 Proof of Lemma 9
   5.5 Proof of Lemma 11
   5.6 Proof of Lemma 12
   5.7 Proof of Lemma 13
6. Discussion
A. Derivation of Bound for $Z(n, d)$
B. Proof of Proposition 14
References

1. The Additivity Conjectures

The classical capacity of a quantum channel is the maximum rate at which classical information can be reliably transmitted through the channel. This maximum rate is approached asymptotically with multiple channel uses by encoding the classical information in quantum states which can be reliably distinguished by measurements at the output. In general, in order to achieve optimal performance, it is necessary to use measurements which are entangled across the multiple channel outputs. However, it was conjectured that product input states are sufficient to achieve the maximal rate of transmission, in other words that there is no benefit in using entangled states to encode the classical information. This conjecture is closely related to other additivity conjectures of quantum information theory, as will be explained below. Recently Hastings [12] disproved all of these additivity conjectures by proving the existence of channels which violate the additivity of minimal output entropy. The purpose of this paper is to present in detail the findings of Hastings' paper, and also to find bounds for the minimal dimensions needed for this type of counterexample.

We begin by formulating the various additivity conjectures. The Holevo capacity of a quantum channel $\Phi$ is defined by

$$ \chi^*(\Phi) = \sup_{\{p_i, \rho_i\}} \Big[ S\Big(\Phi\Big(\sum_i p_i \rho_i\Big)\Big) - \sum_i p_i S(\Phi(\rho_i)) \Big], \tag{1.1} $$

where the supremum runs over ensembles of input states, and where $S(\rho)$ denotes the von Neumann entropy of the state $\rho$ (here and throughout the paper log denotes the natural logarithm):

$$ S(\rho) = -\mathrm{Tr}\, \rho \log \rho. \tag{1.2} $$
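The expression inside the supremum in (1.1) can be evaluated directly in the simplest special case. The sketch below is only an illustration under a strong assumption: all output states are diagonal in a common basis, so each state is represented by its list of eigenvalues and the entropy (1.2) reduces to the Shannon entropy with natural logarithm. The function names are hypothetical and do not come from the paper.

```python
import math


def entropy(spectrum):
    """S = -sum(x log x) over the eigenvalues (natural log, 0 log 0 = 0)."""
    return -sum(x * math.log(x) for x in spectrum if x > 0.0)


def holevo_quantity(probs, output_spectra):
    """S(sum_i p_i rho_i) - sum_i p_i S(rho_i) for commuting output states.

    Assumption: every output state is diagonal in the same basis, so it is
    given by a probability vector, and the average state is the weighted
    average of those vectors.
    """
    dim = len(output_spectra[0])
    average = [
        sum(p * s[k] for p, s in zip(probs, output_spectra)) for k in range(dim)
    ]
    mean_entropy = sum(p * entropy(s) for p, s in zip(probs, output_spectra))
    return entropy(average) - mean_entropy


# Two perfectly distinguishable outputs versus two identical outputs.
distinguishable = holevo_quantity([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])
identical = holevo_quantity([0.5, 0.5], [[0.7, 0.3], [0.7, 0.3]])
```

In this toy setting the first ensemble carries log 2 nats and the second carries none; the Holevo capacity would additionally require the supremum over all input ensembles, which the sketch does not attempt.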

The classical information capacity $C(\Phi)$ is known [15,25] to equal the following limit:

$$ C(\Phi) = \lim_{n \to \infty} \frac{1}{n} \chi^*(\Phi^{\otimes n}). \tag{1.3} $$

It has been a longstanding conjecture that the classical information capacity is in fact equal to the Holevo capacity:

Conjecture 1

$$ C(\Phi) = \chi^*(\Phi). \tag{1.4} $$

Conjecture 1 would be implied by additivity of $\chi^*$ over tensor products. This led to the following conjecture: for all channels $\Phi$ and $\Omega$,

Conjecture 2

$$ \chi^*(\Phi \otimes \Omega) = \chi^*(\Phi) + \chi^*(\Omega). \tag{1.5} $$


Subsequently a third conjecture appeared, namely the additivity of minimum output entropy:

Conjecture 3

$$ S_{\min}(\Phi \otimes \Omega) = S_{\min}(\Phi) + S_{\min}(\Omega), \tag{1.6} $$

where $S_{\min}$ is defined by

$$ S_{\min}(\Phi) = \inf_{\rho} S(\Phi(\rho)). \tag{1.7} $$

Finally Amosov, Holevo and Werner [3] proposed a generalization of Conjecture 3 with von Neumann entropy replaced by the Renyi entropy: for all $p \geq 1$,

Conjecture 4

$$ S_{p,\min}(\Phi \otimes \Omega) = S_{p,\min}(\Phi) + S_{p,\min}(\Omega), \tag{1.8} $$

where $S_{p,\min}$ is the minimal Renyi entropy defined for $p \neq 1$ by

$$ S_{p,\min}(\Phi) = \inf_{\rho} S_p(\Phi(\rho)), \qquad S_p(\tau) = \frac{1}{1 - p} \log \mathrm{Tr}\, \tau^p. \tag{1.9} $$
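The relation between the Renyi entropy in (1.9) and the von Neumann entropy can be checked numerically on a spectrum. The following is a minimal sketch, assuming the state is given by its eigenvalues; the function names and the sample spectrum are illustrative only.

```python
import math


def von_neumann_entropy(spectrum):
    """-sum(x log x) over the eigenvalues, natural logarithm."""
    return -sum(x * math.log(x) for x in spectrum if x > 0.0)


def renyi_entropy(spectrum, p):
    """(1 / (1 - p)) * log(sum(x^p)), for p != 1."""
    if p == 1:
        raise ValueError("the formula is singular at p = 1")
    return math.log(sum(x ** p for x in spectrum if x > 0.0)) / (1.0 - p)


spectrum = [0.5, 0.25, 0.25]
s1 = von_neumann_entropy(spectrum)
s_near_1 = renyi_entropy(spectrum, 1.0001)  # close to the von Neumann value
s2 = renyi_entropy(spectrum, 2.0)           # no larger than the von Neumann value
```

As p approaches 1 the Renyi value approaches the von Neumann entropy, which is why Conjecture 4 is read as a generalization of the von Neumann case.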

In 2004 Shor [27] proved the equivalence of several additivity conjectures, including Conjectures 2 and 3 above. In subsequent work [11] it was shown that Conjectures 1 and 2 are equivalent. The conjectures have been proved in several special cases [1,2,6,9,17–20], but recently most progress has been made in the search for counterexamples. This started with the Holevo-Werner channel [28], which provided a counterexample to Conjecture 4 with $p > 4.79$. More recently Winter and Hayden found counterexamples to Conjecture 4 for all $p > 1$ [14], and violations have since been found also for $p = 0$ and $p$ close to zero [8]. Finally in 2008, Hastings [12] produced a family of channels which violate Conjecture 3, namely additivity of minimal output von Neumann entropy, thereby also proving (via [27] and [11]) that Conjectures 1 and 2 are false.

Following Winter's idea, the product channels used by Hastings have the form $\Phi \otimes \bar{\Phi}$, where $\Phi$ is a special channel which we call a random unitary channel. This means that there are positive numbers $w_1, \dots, w_d$ with $\sum_i w_i = 1$ and unitary $n \times n$ matrices $U_1, \dots, U_d$ such that

$$ \Phi(\rho) = \sum_{i=1}^{d} w_i U_i \rho U_i^*, \qquad \bar{\Phi}(\rho) = \sum_{i=1}^{d} w_i \bar{U}_i \rho \bar{U}_i^*. \tag{1.10} $$

These channels are chosen randomly using a distribution that depends on the two integers $n$ and $d$, where $n$ is the dimension of the input space and $d$ is the dimension of the environment. Hastings' main result is that for $n$ and $d$ sufficiently large there are random unitary channels which violate Conjecture 3, that is

$$ S_{\min}(\Phi \otimes \bar{\Phi}) < S_{\min}(\Phi) + S_{\min}(\bar{\Phi}). \tag{1.11} $$

This result also allows a direct construction of channels which violate Conjectures 1 and 2, as we now show. Using results from the paper [11], the inequality (1.11) implies that the additivity of minimal output entropy does not hold for the product $\Phi' \otimes \Phi'$, where $\Phi' = \Phi \oplus \bar{\Phi}$. In addition, as shown in the paper [10], there is a unital extension of $\Phi'$, denoted $\Phi''$, such that the additivity of minimal output entropy does not hold for $\Phi'' \otimes \Phi''$, and such that

$$ S_{\min}(\Phi'' \otimes \Phi'') = 2 \log D - \chi^*(\Phi'' \otimes \Phi''), \tag{1.12} $$

where $D$ is the dimension of the output space for $\Phi''$. Thus $\Phi''$ provides a counterexample for Conjecture 2, and

$$ \lim_{k \to \infty} \frac{1}{2k} \chi^*\big((\Phi'')^{\otimes 2k}\big) > \chi^*(\Phi''). \tag{1.13} $$

Therefore, the classical capacity of $\Phi''$ does not equal its Holevo capacity, and this provides a counterexample for Conjecture 1.

One key ingredient in the proof is the relative sizes of dimensions, namely $n \gg d \gg 1$, where $n$ is the dimension of the input space, and $d$ is the dimension of the environment. Recall that in the Stinespring representation a channel is viewed as a partial isometry from the input space $\mathcal{H}_{in}$ to the product of output and environment spaces $\mathcal{H}_{out} \otimes \mathcal{H}_{env}$, followed by a partial trace over the environment. The image of $\mathcal{H}_{in}$ under the partial isometry is a subspace of dimension $n$ sitting in the product $\mathcal{H}_{out} \otimes \mathcal{H}_{env}$. Making the environment dimension $d$ much smaller than the input dimension $n$ should guarantee that with high probability this subspace will consist of almost maximally entangled states. For such states the output entropy will be close to the maximal possible value $\log d$, and therefore the minimal entropy of the channel should also (hopefully) be close to $\log d$. At the same time the product channel $\Phi \otimes \bar{\Phi}$ sends the maximally entangled state into an output with one relatively large eigenvalue, and thus one might hope to find a gap between $S_{\min}(\Phi \otimes \bar{\Phi})$ and $S_{\min}(\Phi) + S_{\min}(\bar{\Phi})$. Turning this vague notion into a proof requires considerable insight and ingenuity.

In this paper we focus on the technical aspects of Hastings' proof. Some of the estimates and inequalities derived in this paper are new, but all the main ideas and methods are taken from [12].

The paper is organized as follows. In Sect. 2 we define notation and make a precise statement of Hastings' results. In Sect. 3 we present some background material on probability distributions for states and channels. In Sect. 4 we 'walk through' the proof of Hastings' Theorem, stating results where needed and delineating the logic of the argument. In Sect. 5 we give the proofs of various results needed in Sect. 4 and elsewhere. Sect. 6 discusses different aspects of the proof and possible directions for further research. The Appendix contains the derivation of some estimates needed for the proof.

2. Notation and Statement of Results

We will mostly avoid Dirac bra and ket notation, although it will be used in Sects. 5.1 and 5.5.

2.1. Notation.

Let $\mathcal{M}_n$ denote the algebra of complex $n \times n$ matrices. The identity matrix will be denoted $I_n$, or just $I$. The set of states in $\mathcal{M}_n$ is defined as

$$ \mathcal{S}_n = \{ \rho \in \mathcal{M}_n : \rho = \rho^* \geq 0, \ \mathrm{Tr}\, \rho = 1 \}. \tag{2.1} $$

The set of unit vectors in $\mathbb{C}^n$ will be denoted

$$ \mathcal{V}_n = \Big\{ z = (z_1, \dots, z_n)^T \in \mathbb{C}^n : z^* z = \sum_{i=1}^{n} |z_i|^2 = 1 \Big\}. \tag{2.2} $$

Every unit vector $z \in \mathcal{V}_n$ defines a pure state $\rho = zz^*$ satisfying $\rho^2 = \rho$. The set of unit vectors $\mathcal{V}_n$ is identified with the real $(2n - 1)$-dimensional sphere $S^{2n-1}$, and hence carries a unique uniform probability measure which we denote $\sigma_n$. The set of unitary matrices in $\mathcal{M}_n$ is denoted

$$ \mathcal{U}(n) = \{ U \in \mathcal{M}_n : UU^* = I \}. \tag{2.3} $$

We will write $\mathcal{H}_n$ for the normalized Haar measure on $\mathcal{U}(n)$. A channel is a completely positive trace-preserving map $\Phi : \mathcal{M}_n \to \mathcal{M}_m$. Recall the definition of random unitary channel (1.10):

$$ \Phi(\rho) = \sum_{i=1}^{d} w_i U_i \rho U_i^*. \tag{2.4} $$

The set of all random unitary channels on $\mathcal{M}_n$ with $d$ summands will be denoted $\mathcal{R}_d(n)$. Given a channel $\Phi \in \mathcal{R}_d(n)$ the complementary or conjugate channel $\Phi^C : \mathcal{M}_n \to \mathcal{M}_d$ is defined by [16,22]

$$ \Phi^C(\rho) = \sum_{i,j=1}^{d} \sqrt{w_i w_j}\, \mathrm{Tr}(\rho\, U_j^* U_i)\, |i\rangle\langle j|. \tag{2.5} $$

As is well-known, for any input state $\rho$ the output states $\Phi(\rho)$ and $\Phi^C(\rho)$ are related by

$$ \Phi(\rho) = \mathrm{Tr}_2\, W \rho W^*, \qquad \Phi^C(\rho) = \mathrm{Tr}_1\, W \rho W^*. \tag{2.6} $$

Here, $W : \mathbb{C}^n \to \mathbb{C}^{nd}$ is a partial isometry. Also $\mathrm{Tr}_2$ denotes the partial trace over the state space of the environment, and $\mathrm{Tr}_1$ denotes the partial trace over the state space of the system. When $\rho = zz^*$ is a pure state, the matrices $\Phi(zz^*)$ and $\Phi^C(zz^*)$ are partial traces of the same pure state, and thus have the same non-zero spectrum and the same entropy. Therefore $S_{\min}(\Phi) = S_{\min}(\Phi^C)$. For the purposes of constructing the counterexample it is convenient to work with both $\Phi$ and $\Phi^C$. In particular, we are interested in the cases where $W$ consists of rescaled unitary block matrices:

$$ W = \begin{pmatrix} \sqrt{w_1}\, U_1 \\ \vdots \\ \sqrt{w_d}\, U_d \end{pmatrix}. \tag{2.7} $$

Note that $\sum_i w_i = 1$ as $W$ is a partial isometry. We define a measure on this subset of partial isometries, in Sect. 3.4, as the product of Haar measures and a particular measure on the simplex. The complex conjugate channel is defined by

$$ \bar{\Phi}(\rho) = \sum_{i=1}^{d} w_i \bar{U}_i \rho \bar{U}_i^* = \sum_{i=1}^{d} w_i \bar{U}_i \rho U_i^T. \tag{2.8} $$

Again note that $\Phi$ and $\bar{\Phi}$ have identical minimum output entropies.
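The definitions (2.4) and (2.5) can be exercised on a very small example. The sketch below is an illustration only, under assumptions that are not in the paper: the unitaries are fixed to the four Pauli matrices (so n = 2, d = 4) rather than drawn from Haar measure, and the weights are arbitrary. It compares the purity Tr of the square of the output for the channel and for its complementary channel on a pure input, which must agree because the two outputs share the same non-zero spectrum.

```python
import cmath
import math

# Pauli matrices as 2x2 nested lists; hypothetical stand-ins for U_1, ..., U_4.
PAULIS = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def dagger(a):
    return [
        [complex(a[j][i]).conjugate() for j in range(len(a))]
        for i in range(len(a[0]))
    ]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def channel(weights, unitaries, rho):
    """Phi(rho) = sum_i w_i U_i rho U_i^*, as in (2.4)."""
    n = len(rho)
    out = [[0j] * n for _ in range(n)]
    for w, u in zip(weights, unitaries):
        term = matmul(matmul(u, rho), dagger(u))
        for r in range(n):
            for c in range(n):
                out[r][c] += w * term[r][c]
    return out


def complementary(weights, unitaries, rho):
    """Entry (i, j) is sqrt(w_i w_j) Tr(rho U_j^* U_i), as in (2.5)."""
    d = len(weights)
    return [
        [
            math.sqrt(weights[i] * weights[j])
            * trace(matmul(rho, matmul(dagger(unitaries[j]), unitaries[i])))
            for j in range(d)
        ]
        for i in range(d)
    ]


def purity(a):
    return complex(trace(matmul(a, a))).real


# A pure input state rho = z z^* on C^2 and arbitrary positive weights.
z = [complex(math.cos(0.3)), cmath.exp(0.7j) * math.sin(0.3)]
rho = [[z[r] * z[c].conjugate() for c in range(2)] for r in range(2)]
weights = [0.4, 0.3, 0.2, 0.1]

out = channel(weights, PAULIS, rho)          # 2 x 2 output state
out_c = complementary(weights, PAULIS, rho)  # 4 x 4 output state
```

Equal purities are a weaker statement than equal spectra, but they are cheap to verify without an eigenvalue routine and already reflect the relation (2.6).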


2.2. The main result.

Following the work of Winter and Hayden [14], the counterexample is taken to be a product channel of the form $\Phi \otimes \bar{\Phi}$, where $\Phi$ is a random unitary channel. Hastings first proves the following universal upper bound for the minimum output entropy of such a product.

Lemma 1. For any $\Phi \in \mathcal{R}_d(n)$,

$$ S_{\min}(\Phi \otimes \bar{\Phi}) \leq 2 \log d - \frac{\log d}{d}. \tag{2.9} $$

Lemma 1 will be proved in Sect. 5.1. The counterexample is found by proving the existence of a random unitary channel whose minimum output entropy is greater than one half of this upper bound, that is greater than $\log d - \frac{\log d}{2d}$. For such a channel it will follow that

$$ S_{\min}(\Phi \otimes \bar{\Phi}) \leq 2 \log d - \frac{\log d}{d} \tag{2.10} $$
$$ < 2 S_{\min}(\Phi) \tag{2.11} $$
$$ = S_{\min}(\Phi) + S_{\min}(\bar{\Phi}) \tag{2.12} $$

and this will provide the counterexample to Conjecture 3. Hastings [12] proved the existence of such channels using a combination of probabilistic arguments and estimates involving the distribution of the reduced density matrix of a random pure state. The next theorem is a precise statement of Hastings' result.

Theorem 2. There is $h_{\min} < \infty$, such that for all $h > h_{\min}$, all $d$ satisfying $d \log d \geq h$, and all $n$ sufficiently large, there is $\Phi \in \mathcal{R}_d(n)$ satisfying

$$ S_{\min}(\Phi) > \log d - \frac{h}{d}. \tag{2.13} $$
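Theorem 2 is useful for the counterexample only when its lower bound exceeds one half of the upper bound in Lemma 1. A minimal sketch of that comparison follows; the function names are hypothetical and the values of h are chosen only to exercise both outcomes, since the text does not give a value for the constant in Theorem 2 at this point.

```python
import math


def theorem_2_lower_bound(d, h):
    """Right side of (2.13): log d - h / d."""
    return math.log(d) - h / d


def additivity_threshold(d):
    """One half of the upper bound (2.9): log d - log d / (2 d)."""
    return math.log(d) - math.log(d) / (2 * d)


def bound_implies_violation(d, h):
    """True when the bound (2.13) is at least as strong as the threshold."""
    return theorem_2_lower_bound(d, h) > additivity_threshold(d)


# Illustrative values only.
works = bound_implies_violation(d=100, h=2.0)  # 2h = 4 is below log 100
fails = bound_implies_violation(d=50, h=2.0)   # 2h = 4 is above log 50
```

The comparison reduces to the single inequality 2h < log d, so for any admissible h the bound becomes useful once d is large enough.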

By taking $d$ large enough so that $2 h_{\min} < \log d$, we deduce from Theorem 2 that there is a channel $\Phi$ satisfying

$$ S_{\min}(\Phi) > \log d - \frac{\log d}{2d}, \tag{2.14} $$

and this establishes the existence of counterexamples for Conjecture 3. In fact the proof will show that as $d, n \to \infty$, the probability that a randomly chosen channel in $\mathcal{R}_d(n)$ will satisfy the bound (2.13) approaches one.

It would be interesting to determine the set of integers $(n, d)$ for which there are random unitary channels in $\mathcal{R}_d(n)$ violating additivity, and in particular to find the smallest dimensions which allow violations, as well as the size of the largest possible violation. Following this line of reasoning we define

$$ d_{\min} = \inf \Big\{ d : \exists\, n, \ \exists\, \Phi \in \mathcal{R}_d(n) \ \text{s.t.} \ S_{\min}(\Phi) > \log d - \frac{\log d}{2d} \Big\}, $$
$$ n_{\min} = \inf \Big\{ n : \exists\, d, \ \exists\, \Phi \in \mathcal{R}_d(n) \ \text{s.t.} \ S_{\min}(\Phi) > \log d - \frac{\log d}{2d} \Big\}, \tag{2.15} $$
$$ \Delta S_{\max} = \sup_{n,d} \ \sup_{\Phi \in \mathcal{R}_d(n)} \Big[ S_{\min}(\Phi) + S_{\min}(\bar{\Phi}) - S_{\min}(\Phi \otimes \bar{\Phi}) \Big]. $$

The next result gives some bounds on these quantities.


Proposition 3. $d_{\min} < 3.9 \times 10^{4}$, $n_{\min} < 7.8 \times 10^{32}$, $\Delta S_{\max} > 9.5 \times 10^{-6}$.

Proposition 3 will be proved in Sect. 4.6. The bounds in Proposition 3 are surely not optimal; however, they may indicate the delicacy of the non-additivity effect for this class of channels. It would certainly be interesting to tune the estimates in this paper in order to improve the bounds in Proposition 3, or even better to find a different class of channels where the effect is larger.

3. Background on Random States and Channels

As mentioned above, the proof of Theorem 2 relies on probabilistic arguments, involving distributions of pure states and random unitary channels. The next sections explain the distributions which play a role in the proof.

3.1. Probability distributions for states.

Recall that $\mathcal{V}_n$ is the set of unit vectors in $\mathbb{C}^n$. This set carries a natural uniform measure $\sigma_n$, namely the uniform measure on the (real) $(2n - 1)$-dimensional sphere. If $\mathbb{C}^{dn} = \mathbb{C}^d \otimes \mathbb{C}^n$ is a product space, then a unit vector $z \in \mathcal{V}_{dn}$ can be written as an $n \times d$ matrix $M$, with entries

$$ M_{ij}(z) = z_{(i-1)d + j}, \qquad i = 1, \dots, n, \quad j = 1, \dots, d, \tag{3.1} $$

satisfying $\mathrm{Tr}\, M^* M = \sum_{ij} |z_{ij}|^2 = 1$. Define the map $G : \mathcal{V}_{dn} \to \mathcal{M}_d$ by

$$ G(z) = M(z)^* M(z). \tag{3.2} $$

It follows that $G(z) \geq 0$ and $\mathrm{Tr}\, G(z) = 1$, and hence the image of $G$ lies in $\mathcal{S}_d$ (the set of $d$-dimensional states). Since $z$ is a random vector (with distribution $\sigma_{dn}$) it follows that $G(z)$ is an $\mathcal{S}_d$-valued random variable, or more simply a random state. Its distribution has been studied in many other contexts (see for example [13]) and it plays a key role in the proof here.

3.2. Probability distributions on the simplex $\Delta_d$.

Let $\Delta_d$ denote the simplex of $d$-dimensional probability distributions:

$$ \Delta_d = \Big\{ (x_1, \dots, x_d) \in \mathbb{R}^d : x_i \geq 0, \ \sum_{i=1}^{d} x_i = 1 \Big\}. \tag{3.3} $$

We define below three different probability distributions on $\Delta_d$. One is the uniform measure inherited from $\mathbb{R}^d$, and the others are defined by the diagonal entries and the eigenvalues of $G(z)$, where $z$ is a random unit vector in $\mathcal{V}_{dn}$.

Uniform distribution. The simplex $\Delta_d$ carries a natural measure inherited from Lebesgue measure on $\mathbb{R}^d$: this is conveniently written as

$$ \delta\Big( \sum_{i=1}^{d} w_i - 1 \Big)\, dw_1 \dots dw_d = \delta\Big( \sum_{i=1}^{d} w_i - 1 \Big) [dw], \tag{3.4} $$

where $\delta(\cdot)$ is the Dirac $\delta$-function. Integrals with respect to this measure can be evaluated by introducing local coordinates on $\mathbb{R}^d$ in a neighborhood of $\Delta_d$. In particular the volume of $\Delta_d$ with respect to the measure (3.4) can be computed:

$$ \int_{\Delta_d} \delta\Big( \sum_{i=1}^{d} w_i - 1 \Big) [dw] = \frac{1}{(d - 1)!}. \tag{3.5} $$

Diagonal distribution $\nu_{d,n}$. Let $z \in \mathcal{V}_{dn}$ be a random unit vector in $\mathbb{C}^n \otimes \mathbb{C}^d$. The joint distribution of the diagonal entries $(G_{11}(z), \dots, G_{dd}(z))$ will be denoted $\nu_{d,n}$. It is possible to find an explicit formula for the density of $\nu_{d,n}$; however, we will not need it in this paper. It is sufficient to note that a collection of $d$ random variables $Y_1, \dots, Y_d$ have the joint distribution $\nu_{d,n}$ if and only if they can be written as

$$ Y_j = \sum_{i=1}^{n} |z_{ij}|^2, \qquad j = 1, \dots, d, \tag{3.6} $$

where $\{z_{ij}\}$ are the components of a uniform random vector on the unit sphere in $\mathbb{C}^n \otimes \mathbb{C}^d$. We come back to this problem in Sect. 5.3.

Eigenvalue distribution $\mu_{d,n}$. As noted above the eigenvalues of $G(z)$ are non-negative and sum to one (see footnote 1). However the eigenvalues are not ordered and so define a map not into $\Delta_d$ but rather into the quotient $\Delta_d / \Sigma_d$, where $\Sigma_d$ is the symmetric group. Thus when $z \in \mathcal{V}_{dn}$ is a random vector the eigenvalues of $G(z)$ are $\Delta_d / \Sigma_d$-valued random variables. However it is convenient to use a joint density for the eigenvalue distribution on $\Delta_d$, with the understanding that it should be evaluated only on events which are invariant under $\Sigma_d$. This density is known explicitly [21,29]: for any event $A \subset \Delta_d$,

$$ \mu_{d,n}(A) = Z(n, d)^{-1} \int_A \prod_{1 \leq i < j \leq d} (w_i - w_j)^2 \ \prod_{i=1}^{d} w_i^{\,n-d} \ \delta\Big( \sum_{i=1}^{d} w_i - 1 \Big) [dw], \tag{3.7} $$

where $Z(n, d)$ is a normalization factor. The distribution $\mu_{d,n}$ plays an essential role in the proof of Theorem 2. Explicit expressions for $Z(n, d)$ are known [29]. In Appendix A we derive the following bound: for $n$ sufficiently large,

$$ Z(n, d)^{-1} \leq n^{d^2} d^{\,d(n-d)}. \tag{3.8} $$
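The characterization (3.6) of the diagonal distribution gives a direct way to sample from it. The sketch below assumes the standard fact that a vector of independent complex Gaussian entries, once normalized, is uniformly distributed on the unit sphere; the function name and the chosen sizes are illustrative and not from the paper.

```python
import random


def sample_diagonal_weights(d, n, rng):
    """Draw (Y_1, ..., Y_d) as in (3.6) from a uniform unit vector in C^n (x) C^d.

    Each |z_ij|^2 is built from two independent real Gaussians, and the
    overall normalization places z on the unit sphere.
    """
    squared = [
        [rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2 for _ in range(d)]
        for _ in range(n)
    ]
    total = sum(sum(row) for row in squared)
    return [sum(squared[i][j] for i in range(n)) / total for j in range(d)]


rng = random.Random(0)
weights = sample_diagonal_weights(d=4, n=50, rng=rng)
```

The sampled weights are positive, sum to one, and concentrate near 1/d as n grows, which is the regime n much larger than d used throughout the proof.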

3.3. Estimates for $\mu_{d,n}$.

Define the function

$$ F(x) = -\log x + x - 1. \tag{3.9} $$
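The estimates in this section rest on a few elementary properties of the function in (3.9): it vanishes at x = 1, it is non-negative, and near x = 1 it is governed by a quadratic term with a cubic correction. A small numerical sketch of these properties, with hypothetical helper names and an arbitrary grid, is:

```python
import math


def F(x):
    """F(x) = -log x + x - 1, as in (3.9)."""
    return -math.log(x) + x - 1.0


def cubic_remainder(u):
    """F(1 + u) minus its quadratic Taylor term u^2 / 2."""
    return F(1.0 + u) - 0.5 * u * u


grid = [k / 100.0 for k in range(1, 501)]  # sample points in (0, 5]
nonnegative = all(F(x) >= 0.0 for x in grid)
remainders = {u: cubic_remainder(u) for u in (-0.5, -0.1, 0.1, 0.5, 2.0)}
```

On these sample points the remainder is positive for negative u and lies between -u^3/3 and 0 for positive u, which is the sign pattern used when the quadratic approximation is turned into a tail bound.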

Lemma 4. For all $d$, for $n$ sufficiently large, and for any event $A \subset \Delta_d$,

$$ \mu_{d,n}(A) \leq \exp\big( d^2 \log n \big) \int_A \exp\Big[ -(n - d) \sum_{i=1}^{d} F(d w_i) \Big] \ \delta\Big( \sum_{i=1}^{d} w_i - 1 \Big) [dw]. \tag{3.10} $$

1 $G(z)$ gives the complex Wishart matrix when $z \in \mathbb{C}^n \otimes \mathbb{C}^d$ with each entry $z_{ij}$ having IID complex normal distribution. The eigenvalue distribution was shown to be proportional to $\prod_{1 \leq i < j \leq d} (w_i - w_j)^2 \prod_{i=1}^{d} w_i^{\,n-d} [dw]$, for example, in [5].

This lemma will be proved in Sect. 5.2. Using (3.5) we immediately get the following bound.

Corollary 5. For all $d$, for $n$ sufficiently large, and for any event $A \subset \Delta_d$,

$$ \mu_{d,n}(A) \leq \exp\Big[ d^2 \log n - \log (d - 1)! - (n - d) \inf_{w \in A} \sum_{i=1}^{d} F(d w_i) \Big]. \tag{3.11} $$

Note that $F(x)$ is convex, and also $F(1) = F'(1) = 0$. The Taylor expansion around 1 gives

$$ F(1 + d\,\delta w) = \frac{1}{2} d^2 (\delta w)^2 + R, \tag{3.12} $$

where the remainder is

$$ R = -\frac{1}{3} (1 + d\delta)^{-3} (d\,\delta w)^3, \tag{3.13} $$

and $\delta$ is some value between 0 and $\delta w$. Note that $-1/d < \delta w < (d - 1)/d$ as $0 \leq w \leq 1$. Also, $R > 0$ if $\delta w < 0$. When $\delta w > 0$,

$$ 0 > R > -\frac{1}{3} d^3 (\delta w)^3. \tag{3.14} $$

Recall that $F(x) \geq 0$, so we have the bound $F(d w_i) \geq 0$ for all $i$. Thus feeding (3.12) into Corollary 5 gives the following estimate, which will be used in Sect. 5.4.

Corollary 6. For all $d$, for $n$ sufficiently large, and for any $i = 1, \dots, d$,

$$ \mu_{d,n}\Big( \Big\{ w : \Big| w_i - \frac{1}{d} \Big| \geq t \Big\} \Big) \leq \exp\Big[ d^2 \log n - \log (d - 1)! - \frac{n - d}{2} d^2 t^2 + \frac{n - d}{3} d^3 t^3 \Big]. \tag{3.15} $$

3.4. Probability distribution for random unitary channels.

A random unitary channel (1.10) is determined by the coefficients $w_i$ and the unitary matrices $U_i$. Thus the set of random unitary channels $\mathcal{R}_d(n)$ is naturally identified with $\Delta_d \times \mathcal{U}(n)^d$. Recall the distribution $\nu_{d,n}$ defined in Sect. 3.2 for the diagonal entries of $G(z)$, and the Haar measure $\mathcal{H}_n$ defined on $\mathcal{U}(n)$. We define the following product probability measure on $\mathcal{R}_d(n)$:

$$ P_{d,n} = \nu_{d,n} \times \mathcal{H}_n \times \cdots \times \mathcal{H}_n, \tag{3.16} $$

where $\mathcal{H}_n \times \cdots \times \mathcal{H}_n$ is the $d$-fold product Haar measure on $\mathcal{U}(n)^d$. Using the measure $P_{d,n}$ on $\mathcal{R}_d(n)$ means that the unitaries $U_i$ are selected randomly and independently, while the coefficients $w_j$ have the joint distribution $\nu_{d,n}$, and thus can be written in the form (3.6), where $\{z_{ij}\}$ ($i = 1, \dots, n$; $j = 1, \dots, d$) are the components of a random unit vector in $\mathcal{V}_{nd}$.

Recall the definition (2.5) of the conjugate channel. Define the map

$$ H : \mathcal{R}_d(n) \times \mathcal{V}_n \to \mathcal{M}_d, \qquad (\Phi, z) \mapsto \Phi^C(zz^*). \tag{3.17} $$

Recall the definition (3.2) of the map $G : \mathcal{V}_{dn} \to \mathcal{M}_d$. The following relation between the distributions $P_{d,n}$, $\sigma_n$ and $\sigma_{dn}$ is crucial to the proof. Given a measurable map $f : X \to Y$ between measure spaces $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ (where $\mathcal{A}$ and $\mathcal{B}$ are $\sigma$-algebras on $X$, $Y$ respectively), and given a measure $\mu$ on $(X, \mathcal{A})$, we define the push-forward measure $f_*(\mu)$ on $(Y, \mathcal{B})$ by $f_*(\mu)(B) = \mu(f^{-1}(B))$ for all $B \in \mathcal{B}$.


Lemma 7.

$$ H_*(P_{d,n} \times \sigma_n) = G_*(\sigma_{dn}). \tag{3.18} $$

Lemma 7 will be proved in Sect. 5.3. It implies that if $\Phi$ is chosen randomly according to the measure $P_{d,n}$ and $z$ is chosen randomly and uniformly in $\mathcal{V}_n$, then the eigenvalues of the matrix $\Phi^C(zz^*)$ will have the distribution $\mu_{d,n}$.

4. Proof of Theorem 2

The main idea of the proof is to isolate some properties of random unitary channels which are typical for large values of $n$ and $d$. These properties will then be used to prove that large minimum output entropy is also typical for random unitary channels when $n$, $d$ are large. Recall that the environment dimension $d$ will be chosen to be much smaller than the input dimension $n$. As the identity (2.6) shows, selecting a channel in $\mathcal{R}_d(n)$ corresponds to selecting a subspace of dimension $n$ in the product space $\mathbb{C}^n \otimes \mathbb{C}^d$. The structure of random bipartite subspaces was analyzed in the paper [13], and it was shown that in some circumstances most states in a randomly selected subspace will be close to maximally entangled. In such a situation the reduced density matrix of a randomly selected output state $\Phi^C(zz^*)$ will be close to the maximally mixed state $I/d$, and hence its entropy will be close to $\log d$. Although this observation plays an essential role in Hastings' proof, the methods used in [13] do not directly yield the bounds needed.

4.1. Definition of the typical channel.

A channel $\Phi$ will be called typical if $\Phi^C$ maps at least one half of input states into a small ball centered at the maximally mixed output state $I/d$. The size of the small ball in question involves a numerical parameter $b$ and is defined as follows:

$$ B_d(n) = \Big\{ \rho \in \mathcal{S}_d : \Big\| \rho - \frac{1}{d} I \Big\|_\infty \leq b \sqrt{\frac{\log n}{n}} \Big\}. \tag{4.1} $$

Definition 8. A random unitary channel $\Phi$ is called typical if with probability at least 1/2 a randomly chosen input state is mapped by $\Phi^C$ into the set $B_d(n)$. The set of typical channels is denoted $T$:

$$ T = \Big\{ \Phi : \sigma_n\big( \{ z : \Phi^C(zz^*) \in B_d(n) \} \big) \geq 1/2 \Big\}. \tag{4.2} $$

As the next result shows, for large $n$ most channels are typical.

Lemma 9. For all $b > \sqrt{3}$, $d \geq 2$ and $0 < \alpha < b^2/3 - 1$, taking $n$ sufficiently large,

$$ P_{d,n}(T^c) \leq \frac{2d}{(d - 1)!} \exp\big[ -\alpha\, d^2 \log n \big]. \tag{4.3} $$

Thus if $b > \sqrt{3}$, then as $n \to \infty$ with high probability a randomly chosen channel will lie in the set $T$. In particular $P_{d,n}(T^c) < 1$ for $n$ sufficiently large. The number $\alpha$ can be chosen to satisfy

$$ \alpha = \frac{b^2 (n - d)}{3n} - 1. \tag{4.4} $$


The dimension $n$ must be large enough so that the right side of (4.4) is positive, and also so that $n \ge 4 b^2 d^2 \log n$ (this is a technical condition needed in the proof, see Sect. 5.4).

The second property of a typical channel is the existence of a ‘tube’ of output states surrounding $\Phi^C(zz^*)$ for every input state $z \in V_n$. This property is used to eliminate the possibility of isolated output states with low entropy: if for some $z$ the output entropy $S(\Phi^C(zz^*))$ is small, then there is a nonzero fraction of input states whose outputs also have low entropy. In order to define the tube we first construct a line segment $Y(\rho)$ pointing from a general state $\rho$ toward the maximally mixed state $I/d$. The length of the segment depends on a parameter $\gamma$, which satisfies $0 < \gamma < 1$:
$$Y(\rho) = \left\{ r\rho + (1-r)\frac{1}{d} I : \gamma \le r \le 1 \right\}. \tag{4.5}$$
The tube at $\rho$ is defined to be the set of states which lie within a small distance of the set $Y(\rho)$, and thus form a thickened line segment pointing from $\rho$ toward the maximally mixed state. The definition of ‘small’ here depends on the size of the ball $B_d(n)$, and also on another numerical parameter $t$.

Definition 10. Let $\rho \in S_d$, then the Tube at $\rho$ is defined as
$$\mathrm{Tube}(\rho) = \left\{ \theta \in S_d : \mathrm{dist}(\theta, Y(\rho)) \le t \sqrt{\frac{d \log n}{n}} \right\}, \tag{4.6}$$
where
$$\mathrm{dist}(\theta, Y(\rho)) = \inf_{\tau \in Y(\rho)} \|\theta - \tau\|_\infty.$$

The next result shows that for a channel in the typical set $T$, and for any state $\rho = \Phi^C(zz^*)$ in the image of $\Phi^C$, there is a uniform lower bound for the probability that a randomly chosen state belongs to the tube at $\rho$. As explained before, this means that an output state $\Phi^C(zz^*)$ cannot be too isolated from the other output states.

Lemma 11. For all $d \ge 3$ there is $\beta > 0$ such that for $n$ sufficiently large, for all $t \ge b + 4$, and for all $\Phi \in T$ and $\rho \in \mathrm{Im}(\Phi^C)$,
$$\sigma_n\big( z : \Phi^C(zz^*) \in \mathrm{Tube}(\rho) \big) \ge \beta\, (1-\gamma)^{n-1}. \tag{4.7}$$

Lemma 11 will be proved in Sect. 5.5. The number $\beta$ is given by the following expression:
$$\beta = \frac{1}{2} - (d^2 + 2) \left( 1 - \frac{d \log d}{n} \right)^{n-1}. \tag{4.8}$$
It can be easily seen that for all $d \ge 3$ the right side of (4.8) is positive for $n$ sufficiently large.

4.2. Definition of the low-entropy events E. Define the set of channels whose minimum output entropy does not satisfy our requirements for a violation:
$$C_{d,n} = \left\{ \Phi \in R_d(n) : S_{\min}(\Phi) \le \log d - \frac{h}{d} \right\}. \tag{4.9}$$
The goal is to show that for $d$, $n$ and $h$ sufficiently large we have $P_{d,n}(C_{d,n}) < 1$, implying that $P_{d,n}(C_{d,n}^c) > 0$, and thus that there exist random unitary channels $\Phi$ with $S_{\min}(\Phi) > \log d - h/d$. The proof will hold for all $h$, $d$ sufficiently large, and thus by taking $\log d \ge 2h$ this will provide a counterexample to additivity.

The method is to find useful upper and lower bounds for the probability of a particular event $E$ in $R_d(n) \times V_n$. The event $E$ is chosen to contain all the pairs $(\Phi, z)$, where $\Phi^C(zz^*)$ lies in a tube connected to a state of low entropy. This set of tubes is defined by
$$J = \bigcup_{\rho} \left\{ \mathrm{Tube}(\rho) : S(\rho) \le \log d - \frac{h}{d} \right\}. \tag{4.10}$$
Then the main event of interest for us is the following subset of $R_d(n) \times V_n$:
$$E = \{ (\Phi, z) : \Phi^C(zz^*) \in J \} = H^{-1}(J), \tag{4.11}$$
where $H$ is the map defined in (3.17). The proof will proceed by proving upper and lower bounds for the probability of $E$, that is $(P_{d,n} \times \sigma_n)(E)$. These bounds will hold for any $0 < \gamma < 1$; the parameter $\gamma$ will be ‘tuned’ at the end in order to derive an estimate for the minimal size $h_{\min}$ needed for the counterexample.

As noted the construction works for any values of the parameters $b$, $t$ satisfying $b > \sqrt{3}$ and $t \ge b + 4$. The sizes of $b$ and $t$ do not play a crucial role, and they can be set to the values $b = 2$ and $t = 6$ without changing anything in the proof.

4.3. The upper bound for Prob(E). Note that by Lemma 7,
$$(P_{d,n} \times \sigma_n)(E) = (P_{d,n} \times \sigma_n)(H^{-1}(J)) = H^{*}(P_{d,n} \times \sigma_n)(J) = G^{*}(\sigma_{dn})(J). \tag{4.12}$$

Let $\rho$ be a fixed state in the set of tubes $J$. Then by definition there is a state $\tau \in S_d$ with low entropy such that $\rho$ lies in the tube at $\tau$. Thus for some $r$ satisfying $\gamma \le r \le 1$,
$$\left\| \rho - \left( r\tau + (1-r)\frac{1}{d} I \right) \right\|_\infty \le t \sqrt{\frac{d \log n}{n}}, \qquad S(\tau) \le \log d - \frac{h}{d}. \tag{4.13}$$
Letting $q_i$, $p_i$ denote the eigenvalues of $\rho$, $\tau$ respectively, it follows that
$$q_i = r p_i + (1-r)\frac{1}{d} + \epsilon_i, \qquad i = 1, \dots, d, \tag{4.14}$$
where $p_i$, $\epsilon_i$ satisfy
$$-\sum_{i} p_i \log p_i \le \log d - \frac{h}{d}, \qquad \sum_{i=1}^{d} \epsilon_i = 0. \tag{4.15}$$
Weyl’s perturbation theorem [4, Cor. III.2.6] and (4.13) imply that
$$|\epsilon_i| \le t \sqrt{\frac{d \log n}{n}}. \tag{4.16}$$
The entropy condition (4.15) can be written as
$$\sum_{i=1}^{d} p_i d \log(p_i d) = \sum_{i=1}^{d} \big( p_i d \log(p_i d) - p_i d + 1 \big) \ge h. \tag{4.17}$$
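The step from (4.15) to (4.17) uses only $\sum_i p_i = 1$: both sums in (4.17) equal $d(\log d - S(p))$. The following is a minimal numerical sketch of that identity, assuming natural logarithms; the function names are illustrative and do not come from the paper.

```python
import math

def f(x):
    """f(x) = x log x - x + 1, with f(0) = 1 by continuity."""
    return 1.0 if x == 0 else x * math.log(x) - x + 1.0

def entropy(p):
    """Shannon entropy -sum_i p_i log p_i (natural logarithm)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def entropy_gap(p):
    """d * (log d - S(p)); condition (4.15) says this is at least h."""
    d = len(p)
    return d * (math.log(d) - entropy(p))

def sum_f(p):
    """sum_i f(p_i d), the middle expression in (4.17)."""
    d = len(p)
    return sum(f(x * d) for x in p)

p = [0.5, 0.25, 0.125, 0.125]
print(entropy_gap(p), sum_f(p))  # the two numbers agree up to rounding
```

For the uniform distribution both quantities vanish, which matches the fact that the maximally mixed state has entropy exactly $\log d$.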


Define the function
$$f(x) = x \log x - x + 1. \tag{4.18}$$

Lemma 12.
$$\sup_{x \ge 0,\ \gamma \le r \le 1} \frac{f(x)}{f(rx + 1 - r)} = \frac{f(0)}{f(1-\gamma)} = \frac{1}{f(1-\gamma)}. \tag{4.19}$$
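The statement of Lemma 12 can be probed numerically before reading its proof. The sketch below evaluates the ratio on a finite grid and compares the largest value with $1/f(1-\gamma)$. It is only a sanity check under stated assumptions (natural logarithms, a grid capped at a hypothetical `x_max`), not a proof, and the names are illustrative.

```python
import math

def f(x):
    """f(x) = x log x - x + 1 from (4.18), with f(0) = 1 by continuity."""
    return 1.0 if x == 0 else x * math.log(x) - x + 1.0

def ratio(x, r):
    """f(x) / f(r x + 1 - r); at x = 1 both vanish and the limit is 1 / r^2."""
    if abs(x - 1.0) < 1e-9:
        return 1.0 / (r * r)
    return f(x) / f(r * x + 1.0 - r)

def grid_sup(gamma, x_max=50.0, nx=2001, nr=51):
    """Largest ratio over a grid of 0 <= x <= x_max and gamma <= r <= 1."""
    best = 0.0
    for j in range(nr):
        r = gamma + (1.0 - gamma) * j / (nr - 1)
        for i in range(nx):
            x = x_max * i / (nx - 1)
            best = max(best, ratio(x, r))
    return best

for gamma in (0.3, 0.5, 0.762):
    print(gamma, grid_sup(gamma), 1.0 / f(1.0 - gamma))
```

On this grid the maximum is attained at the corner $x = 0$, $r = \gamma$, in agreement with (4.19).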

Lemma 12 will be proved in Sect. 5.6. Recall (4.14) and define
$$z_i = q_i - \epsilon_i = r p_i + (1-r)\frac{1}{d}. \tag{4.20}$$
Then Lemma 12 implies that for each $i = 1, \dots, d$,
$$p_i d \log(p_i d) - p_i d + 1 = f(p_i d) \le \frac{1}{f(1-\gamma)}\, f(z_i d). \tag{4.21}$$
Therefore from (4.17) it follows that
$$\sum_{i=1}^{d} \big( z_i d \log(z_i d) - z_i d + 1 \big) = \sum_{i=1}^{d} f(z_i d) \ge h f(1-\gamma). \tag{4.22}$$

We will use a standard bound for the difference between the entropies of $z_i$ and $q_i$ in terms of the $l_1$-norm of their difference [7, Th. 16.3.2]:
$$\left| -\sum_{i} z_i \log z_i + \sum_{i} q_i \log q_i \right| \le m \left( \log d + \log \frac{1}{m} \right), \tag{4.23}$$
where
$$m = \sum_{i=1}^{d} |z_i - q_i| = \sum_{i=1}^{d} |\epsilon_i| \le t\, d \sqrt{\frac{d \log n}{n}}. \tag{4.24}$$
Define
$$\eta = d\, m \left( \log d + \log \frac{1}{m} \right). \tag{4.25}$$
Note that for all $d$ and $t$, $m \to 0$ as $n \to \infty$, and hence also $\eta \to 0$ as $n \to \infty$. From (4.23) and (4.22) we deduce
$$\sum_{i=1}^{d} f(q_i d) \ge h f(1-\gamma) - \eta. \tag{4.26}$$
To summarize what we have shown so far: if $\rho \in J$ has eigenvalues $(q_1, \dots, q_d)$ then (4.26) holds. Thus we may upper bound the probability (4.12) by the probability of the state $\rho$ satisfying the inequality (4.26). Since this event depends only on the eigenvalues of $\rho$, we obtain
$$G^{*}(\sigma_{dn})(J) \le \mu_{d,n}\left( q : \sum_{i=1}^{d} f(q_i d) > h f(1-\gamma) - \eta \right). \tag{4.27}$$


This probability is estimated using the bound (3.11): given a positive number $x \le d \log d$, define
$$M_d(x) = \inf_{q \in \Delta_d} \left\{ \sum_{i=1}^{d} F(q_i d) : \sum_{i=1}^{d} f(q_i d) \ge x \right\}, \tag{4.28}$$
where $F(x) = -\log x + x - 1$ as defined in (3.9). Then from (4.27) and (3.11) we deduce
$$(P_{d,n} \times \sigma_n)(E) = G^{*}(\sigma_{dn})(J) \le \exp\big[ d^2 \log n - \log (d-1)! - (n-d)\, M_d\big( h f(1-\gamma) - \eta \big) \big]. \tag{4.29}$$
The next lemma gives a lower bound for $M_d(x)$ which is not optimal but is sufficient for our purposes.

Lemma 13. The function $M_d(x)$ is increasing. Suppose that $2e^2 \le x \le d \log d$. Then
$$M_d(x) \ge \log(x - 1) - \log(2e^2 - 1). \tag{4.30}$$

Lemma 13 will be proved in Sect. 5.7. Applying (4.30) to (4.29) gives
$$(P_{d,n} \times \sigma_n)(E) \le \exp\left[ d^2 \log n - \log (d-1)! - (n-d) \log \frac{h f(1-\gamma) - \eta - 1}{2e^2 - 1} \right], \tag{4.31}$$
where $h$ is assumed to satisfy the bounds
$$2e^2 \le h f(1-\gamma) - \eta \le d \log d. \tag{4.32}$$

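4.4. The lower bound for Prob(E). First we write
$$(P_{d,n} \times \sigma_n)(E) = \mathbb{E}_{\Phi}\big[ \sigma_n(z : \Phi^C(zz^*) \in J) \big] \ge \mathbb{E}_{\Phi}\big[ 1_{C_{d,n} \cap T}\, \sigma_n(z : \Phi^C(zz^*) \in J) \big],$$
where $\mathbb{E}_{\Phi}$ denotes expectation over $R_d(n)$ with respect to the measure $P_{d,n}$, and $1_{C_{d,n} \cap T}$ is the characteristic function of the event $C_{d,n} \cap T$. Given that $\Phi \in C_{d,n}$, there is a state $v \in \mathbb{C}^n$ such that
$$S(\Phi^C(vv^*)) \le \log d - \frac{h}{d}. \tag{4.33}$$
Since $\mathrm{Tube}(\Phi^C(vv^*)) \subset J$ it follows that
$$(P_{d,n} \times \sigma_n)(E) \ge \mathbb{E}_{\Phi}\big[ 1_{C_{d,n} \cap T}\, \sigma_n\big(z : \Phi^C(zz^*) \in \mathrm{Tube}(\Phi^C(vv^*))\big) \big]. \tag{4.34}$$
Applying Lemma 11 to (4.34) gives
\begin{align}
(P_{d,n} \times \sigma_n)(E) &\ge \beta\, (1-\gamma)^{n-1}\, \mathbb{E}_{\Phi}[ 1_{C_{d,n} \cap T} ] \tag{4.35} \\
&= \beta\, (1-\gamma)^{n-1}\, P_{d,n}(C_{d,n} \cap T) \tag{4.36} \\
&\ge \beta\, (1-\gamma)^{n-1} \big( P_{d,n}(C_{d,n}) - P_{d,n}(T^c) \big). \tag{4.37}
\end{align}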


4.5. Combining the bounds for Prob(E) and finishing the proof. Putting together the upper and lower bounds for $(P_{d,n} \times \sigma_n)(E)$ and using Lemma 9 produces the following bound: for all $d \ge 3$, for all $b > \sqrt{3}$ and $t \ge b + 4$, for all $0 < \gamma < 1$, for $h$, $d$ satisfying (4.32), and for $n$ sufficiently large
\begin{align}
P_{d,n}(C_{d,n}) &\le P_{d,n}(T^c) + \frac{1}{\beta} \left( \frac{1}{1-\gamma} \right)^{n-1} (P_{d,n} \times \sigma_n)(E) \notag \\
&\le \frac{2d}{(d-1)!} \exp[-\alpha d^2 \log n] + \frac{1}{\beta} \left( \frac{1}{1-\gamma} \right)^{n-1} \exp\big[ d^2 \log n - \log (d-1)! - (n-d) \log \tilde{h} \big] \notag \\
&= \frac{2d}{(d-1)!} \exp[-\alpha d^2 \log n] + \frac{1-\gamma}{\beta\, (d-1)!} \exp\big[ d^2 \log n + d \log \tilde{h} - n \log\big( (1-\gamma) \tilde{h} \big) \big], \tag{4.38}
\end{align}
where $\tilde{h} = (h f(1-\gamma) - \eta - 1)/(2e^2 - 1)$. Define
$$h_{\min} = \frac{2e^2 - \gamma}{(1-\gamma)\, f(1-\gamma)} \tag{4.39}$$
(note that $h_{\min}$ satisfies the lower bound in (4.32)). As $n \to \infty$ the parameter $\eta$ approaches zero, and therefore for $h > h_{\min}$ the second term on the right side of (4.38) is controlled by the factor
$$\exp\left[ -n \log \frac{(1-\gamma)(h f(1-\gamma) - 1)}{2e^2 - 1} \right] = \left( \frac{h f(1-\gamma) - 1}{h_{\min} f(1-\gamma) - 1} \right)^{-n}. \tag{4.40}$$
The first term on the right side of (4.38) approaches zero as $n \to \infty$, therefore (4.40) implies that for $h > h_{\min}$,
$$P_{d,n}(C_{d,n}) \to 0 \quad \text{as } n \to \infty. \tag{4.41}$$

Summary and conclusion. We have shown that for any $0 < \gamma < 1$, for $h > h_{\min}$ as defined in (4.39), for any $b > \sqrt{3}$ and $t \ge b + 4$, for any $d \ge 3$ satisfying $d \log d > h f(1-\gamma)$ (this comes from the second inequality in (4.32)), there is $N < \infty$ such that for all $n \ge N$ we have $P_{d,n}(C_{d,n}) < 1$. In this case we also have $P_{d,n}(C_{d,n}^c) > 0$, and thus a guarantee that the set $C_{d,n}^c$ is non-empty. Referring to (4.9), this means that there exists a random unitary channel $\Phi$ such that
$$S_{\min}(\Phi) > \log d - \frac{h}{d}. \tag{4.42}$$


4.6. Optimizing the bounds for Prob(E) and the proof of Proposition 3. First consider the value $h_{\min}$ defined in (4.39). Varying $\gamma$ shows that the right side achieves its minimum value at $\gamma = 0.72$. In order to achieve a counterexample we need $\log d \ge 2h$, so this implies the existence of counterexamples for all $d \ge d_0$ with
$$d_0 = \exp[2h_{\min} + 1] \approx \exp[276]. \tag{4.43}$$
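The minimization of (4.39) over $\gamma$ is easy to reproduce. The sketch below uses a plain grid search with natural logarithms; the grid resolution and function names are assumptions made for illustration, and the paper does not say which method the authors used.

```python
import math

def f(x):
    """f(x) = x log x - x + 1 from (4.18), for x > 0."""
    return x * math.log(x) - x + 1.0

def h_min(gamma):
    """Right side of (4.39) for 0 < gamma < 1."""
    return (2.0 * math.e ** 2 - gamma) / ((1.0 - gamma) * f(1.0 - gamma))

def minimize_h_min(steps=1000):
    """Brute-force search over gamma = 1/steps, ..., (steps - 1)/steps."""
    best_gamma, best_value = None, float("inf")
    for i in range(1, steps):
        gamma = i / steps
        value = h_min(gamma)
        if value < best_value:
            best_gamma, best_value = gamma, value
    return best_gamma, best_value

gamma_star, h_star = minimize_h_min()
print(gamma_star, h_star)
```

The search returns a minimizer that rounds to $\gamma = 0.72$, consistent with the value quoted in the text.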

In order to get a better estimate of $d_{\min}$, we return to the bound (4.29) and look for the smallest value of $d$ satisfying
$$M_d\left( \frac{f(1-\gamma)}{2} \log d \right) + \log(1-\gamma) > 0. \tag{4.44}$$
For $n$ sufficiently large this will yield a counterexample. This is a straightforward numerical problem: for each $\gamma$ we find the smallest $d$ so that
$$-\log(1-\gamma) < \inf_{z > 1} \left\{ -\log z - (d-1) \log \frac{d-z}{d-1} : z \log z + (d-z) \log \frac{d-z}{d-1} = \frac{f(1-\gamma)}{2} \log d \right\} \tag{4.45}$$
and then minimize over $\gamma$. The solution occurs at $\gamma = 0.762$ and yields $d_0 = 38578$. This also proves the first statement in Proposition 3.

For the second statement we estimate the smallest value of $n$ which yields $P_{d,n}(C_{d,n}) < 1$. Using the values $b = 2$, $t = 6$, $\gamma = 0.762$, and with $d = 50{,}000$, crude numerical estimates show that we can achieve this with $n = d^7$. This proves the second statement in Proposition 3.

For the third statement, we note from Lemma 1 that for any random unitary channel $\Phi$,
$$S_{\max} \ge 2 S_{\min}(\Phi) - 2 \log d + \frac{\log d}{d}. \tag{4.46}$$
Thus for every $\Phi \in C_{d,n}^c$ we have
$$S_{\max} \ge \frac{\log d - 2h}{d}. \tag{4.47}$$

For a fixed value $h$, the right side of (4.47) achieves its maximum value when $d = [\exp(2h + 1)]$, and this maximum value is $1/d$. Numerical calculation shows that we can achieve $M_d(f(1-\gamma) h) + \log(1-\gamma) > 0$ using the values $\gamma = 0.762$, $h = \log(38590)/2$ and $d = [\exp(2h + 1)]$, and then $1/d$ yields the lower bound for $S_{\max}$ stated in Proposition 3.

5. Proofs of Lemmas

5.1. Proof of Lemma 1. First, note that for any unit vectors $\{|\psi_k\rangle\}$ and probability distribution $\{p_k\}$,
$$S\left( \sum_k p_k |\psi_k\rangle\langle\psi_k| \right) \le -\sum_k p_k \log p_k. \tag{5.1}$$


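Let $|\hat\psi\rangle$ be the maximally entangled state. Then
\begin{align}
(\Phi \otimes \overline{\Phi})(|\hat\psi\rangle\langle\hat\psi|) &= \sum_{i,j=1}^{d} w_i w_j\, (U_i \otimes \overline{U}_j) |\hat\psi\rangle\langle\hat\psi| (U_i^* \otimes U_j^T) \tag{5.2} \\
&= \left( \sum_{i=1}^{d} w_i^2 \right) |\hat\psi\rangle\langle\hat\psi| + \sum_{i \ne j} w_i w_j\, (U_i \otimes \overline{U}_j) |\hat\psi\rangle\langle\hat\psi| (U_i^* \otimes U_j^T), \tag{5.3}
\end{align}
where we used the identity $(U_i \otimes \overline{U}_i) |\hat\psi\rangle\langle\hat\psi| (U_i^* \otimes U_i^T) = |\hat\psi\rangle\langle\hat\psi|$ for all $i$. Hence,
$$S\big( (\Phi \otimes \overline{\Phi})(|\hat\psi\rangle\langle\hat\psi|) \big) \le -\left( \sum_{i=1}^{d} w_i^2 \right) \log \left( \sum_{i=1}^{d} w_i^2 \right) - \sum_{i \ne j} w_i w_j \log(w_i w_j). \tag{5.4}$$
Write $p = \sum_{i=1}^{d} w_i^2$ and then $\sum_{i \ne j} w_i w_j = 1 - p$. Hence
$$S\big( (\Phi \otimes \overline{\Phi})(|\hat\psi\rangle\langle\hat\psi|) \big) \le -p \log p + \sup \left\{ -\sum_{k=1}^{d^2 - d} v_k \log v_k : v_k \ge 0,\ \sum_{k=1}^{d^2 - d} v_k = 1 - p \right\}. \tag{5.5}$$
The supremum on the right side of (5.5) is achieved with $v_k = (1-p)/(d^2 - d)$ for all $k$, hence
$$S\big( (\Phi \otimes \overline{\Phi})(|\hat\psi\rangle\langle\hat\psi|) \big) \le h(p) = -p \log p - (1-p) \log \frac{1-p}{d^2 - d}, \tag{5.6}$$
where $1/d \le p \le 1$. However,
$$h'(p) = -\log p + \log(1-p) - \log(d^2 - d) = \log \left[ \left( \frac{1}{p} - 1 \right) \frac{1}{d(d-1)} \right] \le \log \frac{1}{d} < 0 \tag{5.7}$$
for all $d$. This implies that the above upper bound $h(p)$ is maximized when $p = 1/d$ and the maximum is
$$\frac{1}{d} \log d + \left( 1 - \frac{1}{d} \right) \log(d^2) = 2 \log d - \frac{1}{d} \log d. \tag{5.8}$$

5.2. Proof of Lemma 4. By dropping the terms $(w_i - w_j)^2$ in (3.7) we get
$$\mu_{d,n}(A) \le Z(n,d)^{-1} \int_A \prod_{i=1}^{d} w_i^{\,n-d}\; \delta\left( \sum_{i=1}^{d} w_i - 1 \right) [dw]. \tag{5.9}$$
Applying (3.8) to (5.9) leads to
$$\mu_{d,n}(A) \le \int_A \exp\left[ d^2 \log n + (n-d) \sum_{i=1}^{d} \log(d w_i) \right] \delta\left( \sum_{i=1}^{d} w_i - 1 \right) [dw]. \tag{5.10}$$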


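Noting that
$$\sum_{i=1}^{d} F(d w_i) = -\sum_{i=1}^{d} \log(d w_i) \tag{5.11}$$
the result follows.

5.3. Proof of Lemma 7. For any event $A \subset M_d$,
$$H^{*}(P_{d,n} \times \sigma_n)(A) = (P_{d,n} \times \sigma_n)(H^{-1}(A)) = \int d\sigma_n(z)\, P_{d,n}\big( \Phi : \Phi^C(zz^*) \in A \big). \tag{5.12}$$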

A random unitary channel $\Phi$ is determined by the coefficients $\{w_1, \dots, w_d\}$ and the unitary matrices $\{U_1, \dots, U_d\}$. Given a unitary matrix $V$ define the transformation $T_V : R_d(n) \to R_d(n)$ by
$$T_V : \{w_1, \dots, w_d;\ U_1, \dots, U_d\} \mapsto \{w_1, \dots, w_d;\ U_1 V, \dots, U_d V\}. \tag{5.13}$$
The measure $P_{d,n} = \nu_{d,n} \times H_n \times \cdots \times H_n$ contains the product of $d$ independent copies of Haar measure $H_n$ on the group $U_n$. Since Haar measure is invariant under group multiplication, for any event $C \subset R_d(n)$ we have
\begin{align}
P_{d,n}(C) &= \int_C d\nu_{d,n}(w_1, \dots, w_d)\, dH_n(U_1) \cdots dH_n(U_d) \notag \\
&= \int_{T_V(C)} d\nu_{d,n}(w_1, \dots, w_d)\, dH_n(U_1) \cdots dH_n(U_d) \notag \\
&= P_{d,n}(T_V(C)). \tag{5.14}
\end{align}
Thus in particular for any $z \in V_n$,
$$P_{d,n}\big( \Phi : \Phi^C(zz^*) \in A \big) = P_{d,n}\big( \Phi : \Phi^C(Vz(Vz)^*) \in A \big). \tag{5.15}$$
Since $U_n$ acts transitively on $V_n$, (5.15) shows that the probability is independent of $z$. Hence from (5.12) we obtain that for any fixed $z_0 \in V_n$,
$$H^{*}(P_{d,n} \times \sigma_n)(A) = P_{d,n}\big( \Phi : \Phi^C(z_0 z_0^*) \in A \big). \tag{5.16}$$
For a given channel $\Phi$ the $d \times d$ matrix $\Phi^C(z_0 z_0^*)$ can be written in terms of an $n \times d$ matrix $K(\Phi)$ as follows:
$$\Phi^C(z_0 z_0^*) = K(\Phi)^* K(\Phi), \qquad K(\Phi) = \begin{pmatrix} \sqrt{w_1}\, v_1 & \cdots & \sqrt{w_d}\, v_d \end{pmatrix}, \tag{5.17}$$
where for $i = 1, \dots, d$,
$$v_i = U_i z_0. \tag{5.18}$$
Thus (5.16) can be written as
$$H^{*}(P_{d,n} \times \sigma_n)(A) = P_{d,n}\big( \Phi : K(\Phi)^* K(\Phi) \in A \big). \tag{5.19}$$


Recall from (3.2) that $G(z) = M(z)^* M(z)$, where $z \in V_{nd}$ and where the $n \times d$ matrix $M(z)$ has entries
$$M_{ij}(z) = z_{(i-1)d + j}, \qquad i = 1, \dots, n,\ j = 1, \dots, d. \tag{5.20}$$
It follows that for any event $A \subset M_d$,
$$G^{*}(\sigma_{dn})(A) = \sigma_{dn}\big( z : M(z)^* M(z) \in A \big). \tag{5.21}$$
We wish to prove that $H^{*}(P_{d,n} \times \sigma_n)(A) = G^{*}(\sigma_{dn})(A)$ for every event $A \subset M_d$. Comparing (5.21) and (5.19), it is sufficient to show that the $n \times d$ matrices $K(\Phi)$ and $M(z)$ have the same distribution. We will do this by showing that the columns of $K(\Phi)$ and $M(z)$ have the same joint distribution.

Before showing the result, we need the following observation on a normally distributed vector. Let $Z_1, \dots, Z_m$ be IID complex valued normal random variables with mean zero and variance two. Let $R = \sqrt{\sum_{i=1}^{m} |Z_i|^2}$ and define the vector
$$\xi = \frac{1}{R} \begin{pmatrix} Z_1 \\ \vdots \\ Z_m \end{pmatrix}. \tag{5.22}$$
Then $R$, $\xi$ are independent, $\xi$ is a random pure state in $V_m$, and $R$ has density
$$P(r) \propto e^{-r^2/2}\, r^{2m-1}. \tag{5.23}$$
This result may be easily seen by transforming the joint density for $Z_1, \dots, Z_m$ to polar coordinates:
$$(2\pi)^{-m} \prod_{i=1}^{m} e^{-|z_i|^2/2}\, d^2 z_i = \pi^{-m} e^{-r^2/2}\, r^{2m-1}\, dr\, d\Omega, \tag{5.24}$$
where $d\Omega$ is the uniform measure on $S^{2m-1}$.

We look at $M(z)$ first. Let $\{Z_{ij}\}$ ($i = 1, \dots, n$, $j = 1, \dots, d$) be a collection of IID complex valued normal random variables with mean zero and variance two, arranged into an $n \times d$ matrix $Z$ as in (5.20). Applying the previous observation to $Z$ and also to each column of $Z$ yields
$$Z = R\, M = \begin{pmatrix} R_1 \xi_1 & \cdots & R_d \xi_d \end{pmatrix}. \tag{5.25}$$
Here $\{\xi_1, \dots, \xi_d\}$ are IID random unit vectors in $V_n$, and $M$ is a random unit vector in $V_{nd}$. The vectors $\{\xi_1, \dots, \xi_d\}$ are independent of the numbers $R_1, \dots, R_d$. Also $R^2 = R_1^2 + \cdots + R_d^2$, hence $\{\xi_1, \dots, \xi_d\}$ are also independent of $R$. Dividing by $R$, $M(z)$, a random unit vector in $V_{nd}$, can be reconstructed as
$$M(z) = \begin{pmatrix} \sqrt{Y_1}\, \xi_1 & \cdots & \sqrt{Y_d}\, \xi_d \end{pmatrix}, \qquad Y_i = \frac{R_i^2}{R^2}. \tag{5.26}$$
Note that $Y_1, \dots, Y_d$ have the joint distribution $\nu_{d,n}$, and are independent of $\{\xi_1, \dots, \xi_d\}$. Next, turning to $K(\Phi)$, recall that
$$K(\Phi) = \begin{pmatrix} \sqrt{w_1}\, v_1 & \cdots & \sqrt{w_d}\, v_d \end{pmatrix}, \tag{5.27}$$
where $v_i = U_i z_0$. Since the unitaries $U_i$ are independently and uniformly selected (this is part of the definition of the measure $P_{d,n}$), it follows that the vectors $\{v_i\}$ are IID random unit vectors in $V_n$. Furthermore the coefficients $\{w_i\}$ have the joint distribution $\nu_{d,n}$. This verifies our claim.


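5.4. Proof of Lemma 9. Define the following subset of $R_d(n) \times V_n$:
$$K = \{ (\Phi, z) : \Phi^C(zz^*) \notin B_d(n) \} = H^{-1}(B_d(n)^c), \tag{5.28}$$
where the map $H$ was defined in (3.17). Then
$$(P_{d,n} \times \sigma_n)(K) = \mathbb{E}_{\Phi}\big[ \sigma_n(z : \Phi^C(zz^*) \notin B_d(n)) \big] \ge \mathbb{E}_{\Phi}\big[ 1_{T^c}\, \sigma_n(z : \Phi^C(zz^*) \notin B_d(n)) \big], \tag{5.29}$$
where $\mathbb{E}_{\Phi}$ denotes expectation over $R_d(n)$ with respect to the measure $P_{d,n}$, and $1_{T^c}$ is the characteristic function of the event $T^c$. Note that if $\Phi \in T^c$ then $\sigma_n(z : \Phi^C(zz^*) \notin B_d(n)) \ge 1/2$, hence
$$(P_{d,n} \times \sigma_n)(K) \ge \frac{1}{2}\, \mathbb{E}_{\Phi}[1_{T^c}] = \frac{1}{2}\, P_{d,n}(T^c). \tag{5.30}$$
Furthermore from Lemma 7 it follows that
$$(P_{d,n} \times \sigma_n)(K) = (P_{d,n} \times \sigma_n)(H^{-1}(B_d(n)^c)) = H^{*}(P_{d,n} \times \sigma_n)(B_d(n)^c) = G^{*}(\sigma_{dn})(B_d(n)^c). \tag{5.31}$$
Combining these bounds shows that
\begin{align}
P_{d,n}(T^c) &\le 2\, G^{*}(\sigma_{dn})(B_d(n)^c) \notag \\
&= 2\, \mu_{d,n}\left( (q_1, \dots, q_d) : \left| q_i - \frac{1}{d} \right| > b \sqrt{\frac{\log n}{n}} \ \text{for some } i = 1, \dots, d \right) \notag \\
&= 2\, \mu_{d,n}\left( \bigcup_{i=1}^{d} L_i \right), \tag{5.32}
\end{align}
where the events $L_i$ are defined by $L_i = \{ (q_1, \dots, q_d) : |q_i - 1/d| > b \sqrt{\log n / n} \}$. Thus we have
$$P_{d,n}(T^c) \le 2 \sum_{i=1}^{d} \mu_{d,n}(L_i) = 2 d\, \mu_{d,n}(L_i). \tag{5.33}$$
We use the bound (3.15) of Corollary 6 with $t = b \sqrt{\log n / n}$ to estimate $\mu_{d,n}(L_i)$. In addition we assume that $n$ is large enough so that
$$d t = d b \sqrt{\frac{\log n}{n}} \le \frac{1}{2}, \tag{5.34}$$
and hence
$$\frac{n-d}{2}\, d^2 t^2 - \frac{n-d}{3}\, d^3 t^3 \ge \frac{n-d}{3}\, d^2 t^2. \tag{5.35}$$
Thus (5.32) gives
\begin{align}
P_{d,n}(T^c) &\le 2 d \exp\left[ d^2 \log n - \log (d-1)! - \frac{n-d}{3}\, d^2 b^2\, \frac{\log n}{n} \right] \notag \\
&= \frac{2d}{(d-1)!} \exp\left[ -d^2 \log n \left( \frac{b^2 (n-d)}{3n} - 1 \right) \right]. \tag{5.36}
\end{align}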
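To see how the constants in Lemma 9 behave for concrete parameters, the quantities in (4.3), (4.4) and (5.34) can be evaluated directly. The following is a small sketch with illustrative function names, assuming natural logarithms; it works with the logarithm of the bound to avoid underflow.

```python
import math

def alpha(b, d, n):
    """alpha from (4.4): b^2 (n - d) / (3 n) - 1."""
    return b * b * (n - d) / (3.0 * n) - 1.0

def technical_condition(b, d, n):
    """n >= 4 b^2 d^2 log n, which is equivalent to (5.34)."""
    return n >= 4.0 * b * b * d * d * math.log(n)

def log_atypical_bound(b, d, n):
    """Logarithm of the right side of (4.3); lgamma(d) equals log (d-1)!."""
    return (math.log(2.0 * d) - math.lgamma(d)
            - alpha(b, d, n) * d * d * math.log(n))

b, d, n = 2.0, 3, 10 ** 6
print(alpha(b, d, n), technical_condition(b, d, n), log_atypical_bound(b, d, n))
```

With $b = 2$, the value used later in the paper, $\alpha$ approaches $1/3$ as $n$ grows, so the bound decays like a negative power of $n$.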


5.5. Proof of Lemma 11. This result relies on several properties of random states. We will switch to Dirac bra and ket notation throughout this section, as it lends itself well to the arguments used in the proof. To set up the notation, let $|\psi\rangle$ be a fixed state in $V_n$, and let $|\theta\rangle$ be a random pure state in $V_n$, with probability distribution $\sigma_n$. Without loss of generality we assume that a basis is chosen so that $|\psi\rangle = (1, 0, \dots, 0)^T$. We write $x = \langle\psi|\theta\rangle$, and let $|\phi\rangle$ be the state orthogonal to $|\psi\rangle$ such that
$$|\theta\rangle = x\, |\psi\rangle + \sqrt{1 - |x|^2}\; |\phi\rangle. \tag{5.37}$$
Thus $|\phi\rangle$ is also a random state, defined by its relation to the uniformly random state $|\theta\rangle$ in (5.37). The following results are proved in Appendix B.

Proposition 14. $x$ and $|\phi\rangle$ are independent. $|\phi\rangle$ is a random vector in $V_{n-1}$ with distribution $\sigma_{n-1}$. For all $0 \le t \le 1$,
$$\sigma_n\{ |\theta\rangle : |\langle\psi|\theta\rangle| = |x| > t \} = (1 - t^2)^{n-1}. \tag{5.38}$$
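The tail formula (5.38) can be checked by simulation. The sketch below draws uniformly random states by normalizing complex Gaussian vectors, the same construction used in Sect. 5.3, and estimates the probability that the first coordinate exceeds $t$ in modulus. The sample size, seed and function names are arbitrary choices made for illustration.

```python
import math
import random

def random_state(n, rng):
    """Uniform random unit vector in C^n from normalized complex Gaussians."""
    z = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in z))
    return [c / norm for c in z]

def overlap_tail(n, t, samples=20000, seed=0):
    """Monte Carlo estimate of sigma_n(|<psi|theta>| > t) for psi = (1, 0, ..., 0)^T."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        theta = random_state(n, rng)
        if abs(theta[0]) > t:  # <psi|theta> is the first coordinate of theta
            hits += 1
    return hits / samples

n, t = 5, 0.5
print(overlap_tail(n, t), (1.0 - t * t) ** (n - 1))
```

The estimate and the closed form agree to within the expected sampling error of a few parts in a thousand.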

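Proposition 14 implies that as $n \to \infty$ the overlap $x = \langle\psi|\theta\rangle$ becomes concentrated around zero. In other words, with high probability a randomly chosen state will be almost orthogonal to any fixed state. As a consequence, from (5.37) it follows that $|\phi\rangle$ will be almost equal to $|\theta\rangle$. This statement is made precise by noting that
$$\big\| |\theta\rangle - |\phi\rangle \big\|_2 \le \sqrt{2}\, |\langle\psi|\theta\rangle|. \tag{5.39}$$
Then (5.38) immediately implies that
$$\sigma_n\big( |\theta\rangle : \big\| |\theta\rangle - |\phi\rangle \big\|_2 > t \big) \le \left( 1 - \frac{t^2}{2} \right)^{n-1}. \tag{5.40}$$
The second property relies on the particular form of the random unitary channel, or more precisely on the form of the complementary channel $\Phi^C$. Roughly, this property says that for any fixed random unitary channel $\Phi$ and random state $|\theta\rangle$, with high probability the norm of the matrix $\Phi^C(|\theta\rangle\langle\psi|)$ is small, and approaches zero as $n \to \infty$. We will prove the following bound: for any $\Phi \in R_d(n)$, and for all $0 \le t \le 1$,
$$\sigma_n\big( |\theta\rangle : \| \Phi^C(|\theta\rangle\langle\psi|) \|_2 > t \big) \le d^2 (1 - t^2)^{n-1}. \tag{5.41}$$
As a first step toward deriving (5.41), note that for any states $|u\rangle$ and $|v\rangle$,
$$\| \Phi^C(|u\rangle\langle v|) \|_2 = \left( \sum_{k,l=1}^{d} w_k w_l\, |\langle v| U_l^* U_k |u\rangle|^2 \right)^{1/2} \le \max_{k,l} |\langle v| U_l^* U_k |u\rangle|. \tag{5.42}$$
In particular this implies that
$$\| \Phi^C(|u\rangle\langle v|) \|_2 \le \big\| |u\rangle \big\|_2\, \big\| |v\rangle \big\|_2. \tag{5.43}$$
To derive (5.41) we apply (5.42) with $u = \theta$ and $v = \psi$ and deduce that
\begin{align}
\sigma_n\big( |\theta\rangle : \| \Phi^C(|\theta\rangle\langle\psi|) \|_2 > t \big) &\le \sigma_n\big( |\theta\rangle : \max_{k,l} |\langle\psi| U_l^* U_k |\theta\rangle| > t \big) \notag \\
&\le d^2\, \sigma_n\big( |\theta\rangle : |\langle\psi| U_l^* U_k |\theta\rangle| > t \big) \notag \\
&= d^2 (1 - t^2)^{n-1}, \tag{5.44}
\end{align}
where the last equality follows from (5.38).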


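With these ingredients in place the proof of Lemma 11 can proceed. By assumption $\Phi$ is a random unitary channel belonging to the typical set $T$, and $\rho = \Phi^C(|\psi\rangle\langle\psi|)$ is some state in $\mathrm{Im}(\Phi^C)$. Let $|\theta\rangle$ be a random input state, then as in (5.37) we write $|\theta\rangle = x\, |\psi\rangle + \sqrt{1 - |x|^2}\; |\phi\rangle$. It follows that
$$|\theta\rangle\langle\theta| = |x|^2\, |\psi\rangle\langle\psi| + (1 - |x|^2)\, |\phi\rangle\langle\phi| + \sqrt{1 - |x|^2}\, \big( x\, |\psi\rangle\langle\phi| + \overline{x}\, |\phi\rangle\langle\psi| \big). \tag{5.45}$$
Write $r = |x|^2$, then (5.45) yields
\begin{align}
\Phi^C(|\theta\rangle\langle\theta|) - \left( r\, \Phi^C(|\psi\rangle\langle\psi|) + (1-r) \frac{1}{d} I \right) &= (1-r) \left( \Phi^C(|\phi\rangle\langle\phi|) - \frac{1}{d} I \right) \notag \\
&\quad + \sqrt{r(1-r)}\; \Phi^C\big( e^{i\xi} |\psi\rangle\langle\phi| + e^{-i\xi} |\phi\rangle\langle\psi| \big), \tag{5.46}
\end{align}
where $\xi$ is the phase of $x$. Since $r \le 1$ this implies
$$\left\| \Phi^C(|\theta\rangle\langle\theta|) - \left( r\, \Phi^C(|\psi\rangle\langle\psi|) + (1-r) \frac{1}{d} I \right) \right\|_\infty \le \left\| \Phi^C(|\phi\rangle\langle\phi|) - \frac{1}{d} I \right\|_\infty + \big\| \Phi^C(|\psi\rangle\langle\phi|) \big\|_\infty. \tag{5.47}$$
Referring to the definition (4.6) of $\mathrm{Tube}(\rho)$, recall that $\Phi^C(|\theta\rangle\langle\theta|)$ belongs to $\mathrm{Tube}(\rho)$ if and only if for some $r$ satisfying $\gamma \le r \le 1$,
$$\left\| \Phi^C(|\theta\rangle\langle\theta|) - \left( r\, \Phi^C(|\psi\rangle\langle\psi|) + (1-r) \frac{1}{d} I \right) \right\|_\infty \le t \sqrt{\frac{d \log n}{n}} \tag{5.48}$$
(the set $Y(\rho)$ defined in (4.5) is closed so the infimum in (4.6) is achieved). Define the following three events in $V_n$:
\begin{align}
A_1 &= \{ |\theta\rangle : r = |\langle\psi|\theta\rangle|^2 \ge \gamma \}, \tag{5.49} \\
A_2 &= \left\{ |\theta\rangle : \left\| \Phi^C(|\phi\rangle\langle\phi|) - \frac{1}{d} I \right\|_\infty \le 2\sqrt{2} \sqrt{\frac{d \log d}{n}} + b \sqrt{\frac{d \log n}{n}} \right\}, \tag{5.50} \\
A_3 &= \left\{ |\theta\rangle : \big\| \Phi^C(|\psi\rangle\langle\phi|) \big\|_\infty \le (1 + \sqrt{2}) \sqrt{\frac{d \log d}{n}} \right\}. \tag{5.51}
\end{align}
Assume that $d^2 \le n$ and then since $t \ge b + 4$ it follows from (5.47) and (5.48) that
$$A_1 \cap A_2 \cap A_3 \subset \{ |\theta\rangle : \Phi^C(|\theta\rangle\langle\theta|) \in \mathrm{Tube}(\rho) \}. \tag{5.52}$$
Note that $n \ge d^2$. Furthermore by Proposition 14, $A_1$ is independent of $A_2$ and $A_3$, hence
$$\sigma_n\big( |\theta\rangle : \Phi^C(|\theta\rangle\langle\theta|) \in \mathrm{Tube}(\rho) \big) \ge \sigma_n(A_1 \cap A_2 \cap A_3) = \sigma_n(A_1)\, \sigma_n(A_2 \cap A_3). \tag{5.53}$$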


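Proposition 14 immediately yields
$$\sigma_n(A_1) = (1-\gamma)^{n-1}. \tag{5.54}$$
From (5.53) this gives
$$\sigma_n\big( |\theta\rangle : \Phi^C(|\theta\rangle\langle\theta|) \in \mathrm{Tube}(\rho) \big) \ge (1-\gamma)^{n-1} \big( 1 - \sigma_n(A_2^c) - \sigma_n(A_3^c) \big). \tag{5.55}$$
In order to bound $\sigma_n(A_3^c)$ we first use (5.43) to deduce
$$\big\| \Phi^C(|\psi\rangle\langle\phi|) \big\|_\infty \le \big\| \Phi^C(|\psi\rangle\langle\phi|) \big\|_2 \le \big\| \Phi^C(|\psi\rangle\langle\theta|) \big\|_2 + \big\| |\theta\rangle - |\phi\rangle \big\|_2. \tag{5.56}$$
Thus
\begin{align}
\sigma_n(A_3^c) &= \sigma_n\left( |\theta\rangle : \big\| \Phi^C(|\psi\rangle\langle\phi|) \big\|_\infty > (1 + \sqrt{2}) \sqrt{\frac{d \log d}{n}} \right) \notag \\
&\le \sigma_n\left( |\theta\rangle : \big\| \Phi^C(|\psi\rangle\langle\theta|) \big\|_2 + \big\| |\theta\rangle - |\phi\rangle \big\|_2 > (1 + \sqrt{2}) \sqrt{\frac{d \log d}{n}} \right) \notag \\
&\le \sigma_n\left( |\theta\rangle : \big\| \Phi^C(|\psi\rangle\langle\theta|) \big\|_2 > \sqrt{\frac{d \log d}{n}} \right) + \sigma_n\left( |\theta\rangle : \big\| |\theta\rangle - |\phi\rangle \big\|_2 > \sqrt{2} \sqrt{\frac{d \log d}{n}} \right) \notag \\
&\le (d^2 + 1) \left( 1 - \frac{d \log d}{n} \right)^{n-1}, \tag{5.57}
\end{align}
where the last inequality follows from (5.44) and (5.40). Turning now to $\sigma_n(A_2^c)$, note first that
\begin{align}
\left\| \Phi^C(|\phi\rangle\langle\phi|) - \frac{1}{d} I \right\|_\infty &\le \big\| \Phi^C(|\phi\rangle\langle\phi|) - \Phi^C(|\theta\rangle\langle\theta|) \big\|_\infty + \left\| \Phi^C(|\theta\rangle\langle\theta|) - \frac{1}{d} I \right\|_\infty \notag \\
&\le \big\| \Phi^C(|\phi\rangle\langle\phi|) - \Phi^C(|\theta\rangle\langle\theta|) \big\|_2 + \left\| \Phi^C(|\theta\rangle\langle\theta|) - \frac{1}{d} I \right\|_\infty \notag \\
&\le 2\, \big\| |\theta\rangle - |\phi\rangle \big\|_2 + \left\| \Phi^C(|\theta\rangle\langle\theta|) - \frac{1}{d} I \right\|_\infty, \tag{5.58}
\end{align}
where we used (5.43) for the last inequality. As in (5.57) this gives
\begin{align}
\sigma_n(A_2^c) &= \sigma_n\left( |\theta\rangle : \left\| \Phi^C(|\phi\rangle\langle\phi|) - \frac{1}{d} I \right\|_\infty > 2\sqrt{2} \sqrt{\frac{d \log d}{n}} + b \sqrt{\frac{d \log n}{n}} \right) \notag \\
&\le \sigma_n\left( |\theta\rangle : 2\, \big\| |\theta\rangle - |\phi\rangle \big\|_2 > 2\sqrt{2} \sqrt{\frac{d \log d}{n}} \right) + \sigma_n\left( |\theta\rangle : \left\| \Phi^C(|\theta\rangle\langle\theta|) - \frac{1}{d} I \right\|_\infty > b \sqrt{\frac{d \log n}{n}} \right) \notag \\
&\le \left( 1 - \frac{d \log d}{n} \right)^{n-1} + \sigma_n\left( |\theta\rangle : \left\| \Phi^C(|\theta\rangle\langle\theta|) - \frac{1}{d} I \right\|_\infty > b \sqrt{\frac{d \log n}{n}} \right), \tag{5.59}
\end{align}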


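where we used (5.40) for the last inequality. By assumption $\Phi \in T$, and therefore there is a set of input states $L$ with $\sigma_n(L) \ge 1/2$ such that
$$|\theta\rangle \in L \ \Rightarrow\ \left\| \Phi^C(|\theta\rangle\langle\theta|) - \frac{1}{d} I \right\|_\infty \le b \sqrt{\frac{d \log n}{n}}. \tag{5.60}$$
Thus
$$\sigma_n\left( |\theta\rangle : \left\| \Phi^C(|\theta\rangle\langle\theta|) - \frac{1}{d} I \right\|_\infty > b \sqrt{\frac{d \log n}{n}} \right) \le \sigma_n(L^c) \le \frac{1}{2}. \tag{5.61}$$
Putting together the bounds (5.55), (5.57), (5.59) and (5.61) we get
\begin{align}
\sigma_n\big( \Phi^C(|\theta\rangle\langle\theta|) \in \mathrm{Tube}(\rho) \big) &\ge (1-\gamma)^{n-1} \big( 1 - \sigma_n(A_2^c) - \sigma_n(A_3^c) \big) \notag \\
&\ge (1-\gamma)^{n-1} \left( 1 - \left( 1 - \frac{d \log d}{n} \right)^{n-1} - \frac{1}{2} - (d^2 + 1) \left( 1 - \frac{d \log d}{n} \right)^{n-1} \right) \notag \\
&= (1-\gamma)^{n-1} \left( \frac{1}{2} - (d^2 + 2) \left( 1 - \frac{d \log d}{n} \right)^{n-1} \right). \tag{5.62}
\end{align}
This completes the proof, with
$$\beta = \frac{1}{2} - (d^2 + 2) \left( 1 - \frac{d \log d}{n} \right)^{n-1}. \tag{5.63}$$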
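The claim that $\beta$ in (5.63) is positive for $d \ge 3$ and large $n$ follows from $(1 - d\log d/n)^{n-1} \to d^{-d}$, which gives the limiting value $\tfrac12 - (d^2+2)\,d^{-d}$. A short sketch with illustrative names, assuming natural logarithms, makes the role of the restriction $d \ge 3$ visible.

```python
import math

def beta(d, n):
    """beta from (5.63), equivalently (4.8)."""
    return 0.5 - (d * d + 2) * (1.0 - d * math.log(d) / n) ** (n - 1)

def beta_limit(d):
    """Limit of beta as n -> infinity, using (1 - d log d / n)^(n-1) -> d^(-d)."""
    return 0.5 - (d * d + 2) * float(d) ** (-d)

for d in (2, 3, 4, 10):
    print(d, beta(d, 10 ** 6), beta_limit(d))
```

For $d = 3$ the limit is $1/2 - 11/27 > 0$, while the same formula evaluated at $d = 2$ is negative, so the expression gives no useful lower bound in that case.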

5.6. Proof of Lemma 12. It is clear that $f(rx + 1 - r)$ is monotone increasing in $r$, and therefore
$$\sup_{x \ge 0}\ \sup_{\gamma \le r \le 1} \frac{f(x)}{f(rx + 1 - r)} = \sup_{x \ge 0} \frac{f(x)}{f(\gamma x + 1 - \gamma)}. \tag{5.64}$$
The function $f(x)\, f(\gamma x + 1 - \gamma)^{-1}$ is analytic and decreasing at $x = 1$ for $\gamma < 1$. Thus either the supremum in (5.64) is achieved at $x = 0$ or else there is a critical point of the function $f(x)\, f(\gamma x + 1 - \gamma)^{-1}$ in the interval $(0, \infty)$. In order to rule out the second possibility, we introduce a Lagrange multiplier and define the function
$$h(x, y, \beta) = \log f(x) - \log f(y) - \beta (\gamma x + 1 - \gamma - y). \tag{5.65}$$
To find the critical points of $h$ we solve
$$\frac{\partial h}{\partial x} = \frac{\partial h}{\partial y} = \frac{\partial h}{\partial \beta} = 0. \tag{5.66}$$
Solving for $\beta$ leads to
$$\frac{f'(x)}{f(x)} = \gamma\, \frac{f'(y)}{f(y)}. \tag{5.67}$$


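Since $y - 1 = \gamma (x - 1)$ this is equivalent to
$$\frac{(x-1) \log x}{x \log x - x + 1} = \frac{(y-1) \log y}{y \log y - y + 1}. \tag{5.68}$$
Direct computation shows that
\begin{align}
\frac{d}{dx}\, \frac{(x-1) \log x}{x \log x - x + 1} &= f(x)^{-2} \left( (\log x)^2 - \frac{(x-1)^2}{x} \right) \notag \\
&= f(x)^{-2} (\log x)^2 \left( 1 - \left( \frac{x^{1/2} - x^{-1/2}}{\log x} \right)^2 \right). \tag{5.69}
\end{align}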

Furthermore, the function $x^{1/2} - x^{-1/2} - \log x$ is monotone increasing for all $x > 0$, and thus $x^{1/2} - x^{-1/2} > \log x$ for $x > 1$. Thus for $x \ge 1$ the derivative (5.69) is negative, and therefore (5.68) has no solution with $x > 1$. Similarly $x^{1/2} - x^{-1/2} < \log x$ for $0 < x < 1$, and hence again (5.69) is negative for $0 < x < 1$. So there are no solutions of (5.68) except $x = y = 1$. Therefore (5.65) has no critical points except $x = y = 1$, and thus the function $f(x)\, f(\gamma x + 1 - \gamma)^{-1}$ achieves its supremum at $x = 0$.

5.7. Proof of Lemma 13. Suppose first that $0 < h < d \log d$. Recall the definition
$$M_d(h) = \inf_{q \in \Delta_d} \left\{ \sum_{i=1}^{d} F(q_i d) : \sum_{i=1}^{d} f(q_i d) \ge h \right\}, \tag{5.70}$$
where $F(x) = -\log x + x - 1$ and $f(x) = x \log x - x + 1$. Letting $x_i = q_i d$ we have
$$M_d(h) = \inf_{x_i \ge 0} \left\{ \sum_{i=1}^{d} F(x_i) : \sum_{i=1}^{d} f(x_i) \ge h,\ \sum_{i=1}^{d} x_i = d \right\}. \tag{5.71}$$
The gradient of the function $\sum_{i=1}^{d} F(x_i)$ is zero only at $x_1 = \cdots = x_d = 1$, hence since $h > 0$ there are no critical points of $\sum_{i=1}^{d} F(x_i)$ in the region $\sum_{i=1}^{d} f(x_i) \ge h$. Thus the infimum in (5.71) is achieved at the boundary where $\sum_{i=1}^{d} f(x_i) = h$, and so
$$M_d(h) = \inf_{x_i \ge 0} \left\{ \sum_{i=1}^{d} F(x_i) : \sum_{i=1}^{d} f(x_i) = h,\ \sum_{i=1}^{d} x_i = d \right\}. \tag{5.72}$$
We introduce Lagrange multipliers and define
$$H(x_i, \alpha, \beta) = \sum_{i=1}^{d} F(x_i) - \alpha \left( \sum_{i=1}^{d} f(x_i) - h \right) - \beta \left( \sum_{i=1}^{d} x_i - d \right). \tag{5.73}$$
The critical equations for $H$ are
$$\frac{\partial H}{\partial x_i} = 1 - \frac{1}{x_i} - \alpha \log x_i - \beta = 0. \tag{5.74}$$


The constraints can be used to eliminate $\beta$ and obtain
$$\left( 1 + \alpha \frac{h}{d} \right) x_i - 1 = \alpha\, x_i \log x_i, \qquad i = 1, \dots, d. \tag{5.75}$$
If $\alpha \le 0$, Eqs. (5.75) have the unique solution $x_i = 1$ for all $i = 1, \dots, d$. However this does not satisfy the constraint $\sum_{i=1}^{d} f(x_i) = h$ for $h > 0$. Thus $\alpha > 0$, in which case there are positive numbers $w$ and $z$ satisfying
$$0 < w < 1 < z \tag{5.76}$$
such that the solutions of (5.75) are
$$x_1 = \cdots = x_k = w, \qquad x_{k+1} = \cdots = x_d = z \tag{5.77}$$
for some $1 \le k \le d - 1$. The constraint conditions imply that
$$k w + (d-k) z = d, \qquad k w \log w + (d-k) z \log z = h. \tag{5.78}$$
Thus (5.72) can be reformulated as
$$M_d(h) = \inf_{0 < w < 1 < z,\ 1 \le k \le d-1} \big\{ -k \log w - (d-k) \log z : k w \log w + (d-k) z \log z = h,\ k w + (d-k) z = d \big\}. \tag{5.79}$$
We claim that $-k \log w - (d-k) \log z$, subject to the constraints $k w \log w + (d-k) z \log z = h$ and $k w + (d-k) z = d$, is a decreasing function of $k$. In order to show this, we divide (5.79) by $d$ and write $k = td$, and consider the function
$$Q(w, z, t) = -t \log w - (1-t) \log z \tag{5.80}$$
along with the constraints
$$t w \log w + (1-t) z \log z = \frac{h}{d}, \qquad t w + (1-t) z = 1. \tag{5.81}$$
The constraints (5.81) allow $w$, $z$ to be defined locally as functions of $t$. This follows from the implicit function theorem since the Jacobian is $t(1-t) \log(w/z)$; we must have $w < z$ and hence the Jacobian is nonzero. Solving these constraint equations for the derivatives gives
\begin{align}
\frac{dw}{dt} &= \frac{-w + z - w \log z + w \log w}{t \log(z/w)}, \tag{5.82} \\
\frac{dz}{dt} &= \frac{w - z - z \log w + z \log z}{(1-t) \log(z/w)}. \tag{5.83}
\end{align}
Returning to (5.80) we can now compute its derivative with respect to $t$:
\begin{align}
\frac{dQ}{dt} &= -\log w + \log z - \frac{t}{w} \frac{dw}{dt} - \frac{1-t}{z} \frac{dz}{dt} \notag \\
&= -\frac{1}{\log(z/w)} \left( \frac{(z-w)^2}{zw} - (\log(z/w))^2 \right). \tag{5.84}
\end{align}


Note that $2 \log u \le u - 1/u$ for all $u \ge 1$, hence
$$\log \frac{z}{w} \le \sqrt{\frac{z}{w}} - \sqrt{\frac{w}{z}}, \tag{5.85}$$
and therefore the right side of (5.84) is negative. Thus $Q$ is a decreasing function of $t$, and hence the infimum in (5.79) is achieved at the largest possible value of $k$, namely $k = d - 1$. This leads to
$$M_d(h) = \inf_{z > 1} \left\{ -\log z - (d-1) \log \frac{d-z}{d-1} : z \log z + (d-z) \log \frac{d-z}{d-1} = h \right\}. \tag{5.86}$$
The function $z \log z + (d-z) \log \frac{d-z}{d-1}$ is monotone increasing, reaching its maximum value $d \log d$ at $z = d$. Thus for any $0 < h < d \log d$ there is a unique value $z(d, h)$ satisfying the constraint condition in (5.86). Its derivative is
$$\frac{\partial z}{\partial h} = \frac{d - z}{d \log z - h} \ge 0. \tag{5.87}$$
Furthermore the function $g(z) = -\log z - (d-1) \log \frac{d-z}{d-1}$ is also monotone increasing for $1 < z < d$, with derivative $g'(z) = d(z-1)/z(d-z)$. Thus
\begin{align}
M_d(h) - M_d(0) &= g(z(d,h)) - g(z(d,0)) \notag \\
&= \int_0^h g'(z(d,h'))\, \frac{\partial z}{\partial h'}\, dh' \notag \\
&= \int_0^h \frac{d(1 - z^{-1})}{d \log z - h'}\, dh' \notag \\
&= \int_0^h \frac{1}{z}\, \frac{d(z-1)}{d \log z - h'}\, dh' \notag \\
&\ge \int_0^h \frac{1}{z}\, dh'. \tag{5.88}
\end{align}
The constraint condition in (5.86) implies that
$$h \le z \log z \le h + (d-z) \log \frac{d-1}{d-z} \le h + z - 1. \tag{5.89}$$
Thus
$$z \le \frac{h - 1}{\log z - 1}. \tag{5.90}$$
If $h \ge 2e^2$ then the first inequality in (5.89) implies that $z(d,h) \ge e^2$, and therefore $\log z \ge 2$. From (5.90) it follows that
$$h \ge 2e^2 \ \Rightarrow\ z(d,h) \le h - 1. \tag{5.91}$$
Thus from (5.88) we deduce that for $h \ge 2e^2$,
$$M_d(h) - M_d(0) \ge \int_{2e^2}^{h} \frac{dh'}{z} \ge \int_{2e^2}^{h} \frac{dh'}{h' - 1} = \log(h - 1) - \log(2e^2 - 1). \tag{5.92}$$
Since $z(d, 0) = 1$ and $g(1) = 0$, it follows that $M_d(0) = 0$, and hence (4.30) holds.

Finally, to show that $M_d(x)$ is increasing, note that
$$\frac{d M_d}{dx} = \left( -\frac{1}{z} + \frac{d-1}{d-z} \right) \frac{\partial z}{\partial x}, \tag{5.93}$$
where $z$ solves the constraint equation
$$z \log z + (d-z) \log \frac{d-z}{d-1} = x. \tag{5.94}$$
Differentiating (5.94) gives
$$\frac{\partial z}{\partial x} = \left( \log \frac{z(d-1)}{d-z} \right)^{-1} > 0, \tag{5.95}$$
since $z > 1$. Also $-\frac{1}{z} + \frac{d-1}{d-z} > 0$, hence (5.93) shows that $M_d$ is increasing.

6. Discussion

Hastings’ Theorem finally settles the question of additivity of Holevo capacity for quantum channels, as well as additivity of minimal output entropy and entanglement of formation. In this paper we have explored in detail the proof of Hastings’ result, and we have provided some estimates for the minimal dimensions necessary in order to find a violation of additivity. The violation of additivity seems to be a small effect for this class of models, requiring delicate and explicit estimates for the proof. It is an open question whether there are random unitary channels with large violations of additivity. Hastings’ Theorem is non-constructive, and it would be extremely interesting to find explicit channels which demonstrate the effect. Presumably non-additivity of Holevo capacity is generic, and there may be other classes of channels where the effect is larger. Having established non-additivity of Holevo capacity, one is led to the question of finding useful bounds for the channel capacity $C(\Phi)$. One may even hope to find a compact ‘single-letter’ formula for $C(\Phi)$, though that possibility seems remote. It is likely that the methods introduced by Hastings will prove to be useful in addressing these questions.

Acknowledgments. M. F. thanks B. Nachtergaele, A. Pizzo and A. Soshnikov for numerous discussions, M. Hastings for answering questions, A. Holevo for useful comments on an early draft of this paper and R. Siegmund-Schultze for sending his related slides. C. K. thanks P. Gacs, A. Harrow, T. Kemp, M. B. Ruskai, P. Shor and B. Zeng for useful conversations. This collaboration began at the March 2009 workshop “Entropy and the Quantum” at the University of Arizona, and the authors are grateful to the organizers of the workshop.

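A. Derivation of Bound for Z(n, d)

We consider the following integral:
\begin{align}
Z &= \int \delta\left( 1 - \sum_{i=1}^{d} p_i \right) \prod_{1 \le i < j \le d} (p_i - p_j)^2 \prod_{k=1}^{d} p_k^{\,n-d}\, dp_k \tag{A.1} \\
&= \frac{1}{(dn-1)!} \int e^{-r} r^{dn-1}\, dr \int \delta\left( 1 - \sum_{i=1}^{d} p_i \right) \prod_{1 \le i < j \le d} (p_i - p_j)^2 \prod_{k=1}^{d} p_k^{\,n-d}\, dp_k. \tag{A.2}
\end{align}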


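Consider the following change of variables:
\[
\begin{aligned}
q_1 &= r p_1 \\
&\;\;\vdots \\
q_{d-1} &= r p_{d-1} \\
q_d &= r (1 - p_1 - \cdots - p_{d-1}).
\end{aligned}
\tag{A.3–A.6}
\]
The Jacobian is
\[
\left| \frac{\partial(q_1, \ldots, q_d)}{\partial(p_1, \ldots, p_{d-1}, r)} \right|
= \begin{vmatrix}
r & \cdots & 0 & p_1 \\
\vdots & \ddots & \vdots & \vdots \\
0 & \cdots & r & p_{d-1} \\
-r & \cdots & -r & 1 - p_1 - \cdots - p_{d-1}
\end{vmatrix}
= \begin{vmatrix}
r & \cdots & 0 & p_1 \\
\vdots & \ddots & \vdots & \vdots \\
0 & \cdots & r & p_{d-1} \\
0 & \cdots & 0 & 1
\end{vmatrix}
= r^{d-1}. \tag{A.7}
\]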

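After the change of variables we have
\[
Z = \frac{1}{(dn-1)!} \int \prod_{1 \le i < j \le d} (q_i - q_j)^2 \prod_{k=1}^{d} e^{-q_k} q_k^{\,n-d}\, dq_k. \tag{A.8}
\]
However,
\[
\prod_{1 \le i < j \le d} (q_i - q_j)^2
= \begin{vmatrix}
1 & \cdots & 1 \\
q_1 & \cdots & q_d \\
\vdots & & \vdots \\
q_1^{d-1} & \cdots & q_d^{d-1}
\end{vmatrix}^2
= \begin{vmatrix}
1 & \cdots & 1 \\
p_1(q_1) & \cdots & p_1(q_d) \\
\vdots & & \vdots \\
p_{d-1}(q_1) & \cdots & p_{d-1}(q_d)
\end{vmatrix}^2. \tag{A.9}
\]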

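Here, $p_k$ is any monic polynomial (see footnote 2) of degree $k$. So, set $p_k = (-1)^k k!\, L_k^{n-d}$, where $L_k^{n-d}$ is the Laguerre polynomial. Then, we have
\[
\int p_k(x)\, p_l(x)\, e^{-x} x^{n-d}\, dx = \delta_{kl}\, \Gamma(n-d+k+1)\, \Gamma(k+1). \tag{A.10}
\]
Hence
\[
Z = \frac{1}{(dn-1)!} \int \Big( \sum_{\sigma} \operatorname{sign}(\sigma) \prod_{i=1}^{d} p_{\sigma(i)}(q_i) \Big)^2 \prod_{k=1}^{d} e^{-q_k} q_k^{\,n-d}\, dq_k \tag{A.11}
\]
\[
= \frac{1}{(dn-1)!} \sum_{\sigma} \int \prod_{k=1}^{d} \big( p_{\sigma(k)}(q_k) \big)^2 e^{-q_k} q_k^{\,n-d}\, dq_k \tag{A.12}
\]
\[
= \frac{\Gamma(d+1)}{\Gamma(dn)} \prod_{k=0}^{d-1} \Gamma(n-d+k+1)\, \Gamma(k+1) \tag{A.13}
\]
\[
= \frac{1}{\Gamma(dn)} \prod_{k=1}^{d} \Gamma(n-d+k)\, \Gamma(k+1). \tag{A.14}
\]
Here, $\sigma$ are “permutations”; $\sigma : \{1, \ldots, d\} \to \{0, \ldots, d-1\}$.

2 A monic polynomial is a polynomial whose highest degree term has coefficient 1. These polynomials are inserted in (A.9) by using properties of the determinant.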
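The closed forms (A.13) and (A.14) can be compared numerically. The following is only a sketch, not part of the original derivation: it assumes Python's standard `math.lgamma`, and the helper names (`log_Z_A13`, `log_Z_A14`, `log_bound_A29`) are hypothetical. It also evaluates the right-hand side of the bound (A.29) obtained at the end of this appendix, for a few sample pairs $(n, d)$ with $\log n - \log d \ge 4$.

```python
from math import lgamma, log

def log_Z_A13(n, d):
    """log of (A.13): Gamma(d+1)/Gamma(dn) * prod_{k=0}^{d-1} Gamma(n-d+k+1) Gamma(k+1)."""
    total = lgamma(d + 1) - lgamma(d * n)
    for k in range(d):
        total += lgamma(n - d + k + 1) + lgamma(k + 1)
    return total

def log_Z_A14(n, d):
    """log of (A.14): 1/Gamma(dn) * prod_{k=1}^{d} Gamma(n-d+k) Gamma(k+1)."""
    total = -lgamma(d * n)
    for k in range(1, d + 1):
        total += lgamma(n - d + k) + lgamma(k + 1)
    return total

def log_bound_A29(n, d):
    """log of the right-hand side of (A.29): n^(d^2) * d^(d(n-d))."""
    return d * d * log(n) + d * (n - d) * log(d)

# Sample values chosen (as an assumption) so that log n - log d >= 4.
for n, d in [(110, 2), (200, 3), (600, 10)]:
    print(n, d, -log_Z_A14(n, d), log_bound_A29(n, d))
```

Working with logarithms avoids overflow, since $\Gamma(dn)$ is far outside floating-point range for these values.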


To evaluate this quantity we use the following fact: $\Gamma(s)$ is approximated by
\[
\exp\Big\{ s \log s - s - \frac{1}{2} \log s + \frac{1}{2} \log 2\pi + \log\Big(1 + O\Big(\frac{1}{|s|}\Big)\Big) \Big\} \tag{A.15}
\]
as $s \to +\infty$. Then, we have
\[
\exp\{(s-1) \log s - s\} < \text{(A.15)} < \exp\{s \log s - s\}. \tag{A.16}
\]
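The two-sided estimate (A.16) is easy to probe numerically. The sketch below is an illustration only: it uses Python's `math.lgamma` in place of the asymptotic expression (A.15), and the function names are hypothetical. On integer arguments, it suggests that the lower bound holds throughout while the upper bound only sets in for moderately large $s$.

```python
from math import lgamma, log

def stirling_bounds(s):
    """Return (lower, log Gamma(s), upper) for the logarithm of (A.16)."""
    lower = (s - 1) * log(s) - s
    upper = s * log(s) - s
    return lower, lgamma(s), upper

def bounds_hold(s):
    """True when both inequalities of (A.16) hold at s."""
    lower, value, upper = stirling_bounds(s)
    return lower < value < upper

# The upper bound fails for small s and holds from s = 7 on (integer s).
print([s for s in range(2, 12) if not bounds_hold(s)])
```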

Note that the above upper bound is true only for large enough $s$ but in our case it is not a problem. By using these bounds we get a lower bound for (A.14). First, $\log(1/\Gamma(dn))$ is lower bounded by
\[
-(dn) \log(dn) + dn = -dn \log d - dn \log n + dn. \tag{A.17}
\]
Secondly, $\log\big( \prod_{k=1}^{d} \Gamma(n-d+k) \big)$ is lower bounded by
\[
\sum_{k=1}^{d} (n-d+k-1) \log(n-d+k) - \sum_{k=1}^{d} (n-d+k) \tag{A.18}
\]
\[
= n^2 \sum_{k=1}^{d} \frac{1}{n} \cdot \frac{n-d+k-1}{n} \log \frac{n-d+k}{n} + \sum_{k=1}^{d} (n-d+k-1) \log n - \sum_{k=1}^{d} (n-d+k). \tag{A.19}
\]

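The first sum in (A.19) is approximately lower bounded by
\[
n^2 \times \frac{d}{n} \times \frac{n-d}{n} \log \frac{n-d}{n} = d(n-d)(\log(n-d) - \log n) \approx -d^2, \tag{A.20}
\]
as $n \to \infty$. Also, the remaining part in (A.19) is
\[
\Big( dn - \frac{1}{2} d^2 - \frac{1}{2} d \Big) \log n - \Big( dn - \frac{1}{2} d^2 + \frac{1}{2} d \Big). \tag{A.21}
\]
Thirdly, $\log\big( \prod_{k=1}^{d} \Gamma(k+1) \big)$ is lower bounded by
\[
\sum_{k=1}^{d} k \log(k+1) - \sum_{k=1}^{d} (k+1) \tag{A.22}
\]
\[
= d^2 \sum_{k=1}^{d} \frac{1}{d} \cdot \frac{k}{d} \log \frac{k+1}{d} + \sum_{k=1}^{d} k \log d - \sum_{k=1}^{d} (k+1). \tag{A.23}
\]
Again, the first term in (A.23) is approximately lower bounded by $-d^2$. Also, the remaining part in (A.23) is
\[
\Big( \frac{1}{2} d^2 + \frac{1}{2} d \Big) \log d - \Big( \frac{1}{2} d^2 + \frac{3}{2} d \Big). \tag{A.24}
\]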


As a whole, we know that the inside of exp in (A.14) is lower bounded by
\[
\text{(A.17)} + \text{(A.21)} + \text{(A.24)} - 2d^2 \tag{A.25}
\]
\[
= \Big( -\frac{1}{2} d^2 - \frac{1}{2} d \Big) \log n + \Big( -dn + \frac{1}{2} d^2 + \frac{1}{2} d \Big) \log d - 2d^2 - 2d \tag{A.26}
\]
\[
= -d^2 \log n + (d^2 - dn) \log d + \frac{1}{2} d(d-1)(\log n - \log d - 4) \tag{A.27}
\]
\[
\ge -d^2 \log n + (d^2 - dn) \log d \tag{A.28}
\]
if $(\log n - \log d) \ge 4$. Therefore, we get an upper bound for the normalization constant:
\[
Z(n, d)^{-1} \le n^{d^2} d^{\,d(n-d)} \tag{A.29}
\]
in this case.

B. Proof of Proposition 14

Let $Z_1, \ldots, Z_n$ be IID complex Gaussian random vectors with mean zero and variance two. Apply the result (5.22) with $m = n - 1$ to deduce that
\[
(Z_2, \ldots, Z_n)^T = \rho\, |\phi\rangle, \tag{B.1}
\]

where $|\phi\rangle$ is a random unit vector in $\mathbb{C}^{n-1} \hookrightarrow \mathbb{C}^n$, independent of $\rho = (|Z_2|^2 + \cdots + |Z_n|^2)^{1/2}$. Then apply (5.22) with $m = n$ to deduce
\[
(Z_1, \ldots, Z_n)^T = R\, |\theta\rangle, \tag{B.2}
\]
where $|\theta\rangle$ is a random unit vector in $\mathbb{C}^n$, independent of $R = (|Z_1|^2 + \cdots + |Z_n|^2)^{1/2}$. Let
\[
x = \frac{Z_1}{R} = \frac{Z_1}{\sqrt{|Z_1|^2 + \rho^2}} \tag{B.3}
\]
and recall that $|\psi\rangle = (1, 0, \ldots, 0)^T$. Then
\[
|\theta\rangle = \frac{1}{R} (Z_1, \ldots, Z_n)^T = x |\psi\rangle + \frac{\rho}{R} |\phi\rangle = x |\psi\rangle + \sqrt{1 - |x|^2}\, |\phi\rangle. \tag{B.4}
\]
This proves the first part of Proposition 14 since $|\phi\rangle$ is independent of $Z_1$ and $\rho$, and hence is independent of $x$.

To prove the second part of Proposition 14 we use the representation (B.3) to derive the distribution of $|x|^2$. We write $Z_j = X_{2j-1} + i X_{2j}$, where $\{X_j\}$ are IID real normal random variables with mean zero and variance one, so from (B.3) it follows that
\[
|x|^2 = \frac{X_1^2 + X_2^2}{X_1^2 + \cdots + X_{2n}^2}. \tag{B.5}
\]
Note that $X_1^2 + \cdots + X_k^2$ has the following Chi-square probability distribution:
\[
f_k(x) = \frac{1}{2^{k/2}\, \Gamma(k/2)}\, x^{k/2 - 1} e^{-x/2}. \tag{B.6}
\]


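Hence, set
\[
X = X_1^2 + X_2^2, \tag{B.7}
\]
\[
Y = X_3^2 + \cdots + X_{2n}^2, \tag{B.8}
\]
and then $X$ and $Y$ are independent and have the following probability distributions:
\[
f_X(x) = \frac{1}{2} e^{-x/2}, \tag{B.9}
\]
\[
f_Y(y) = \frac{1}{2^{n-1}\, \Gamma(n-1)}\, y^{n-2} e^{-y/2}. \tag{B.10}
\]
However,
\[
\frac{X}{X+Y} \le t \iff X(1-t) \le tY \tag{B.11}
\]
implies that the cumulative function of $\frac{X}{X+Y}$ is
\[
\int_0^\infty \Big( \int_0^{\frac{ty}{1-t}} f_X(x)\, dx \Big) f_Y(y)\, dy = \int_0^\infty \Big( 1 - e^{-\frac{ty}{2(1-t)}} \Big) f_Y(y)\, dy \tag{B.12}
\]
\[
= 1 - \int_0^\infty e^{-\frac{ty}{2(1-t)}} f_Y(y)\, dy \tag{B.13}
\]
\[
= 1 - \frac{1}{2^{n-1} (n-2)!} \int_0^\infty y^{n-2} e^{-\frac{y}{2(1-t)}}\, dy. \tag{B.14}
\]
Here,
\[
\int_0^\infty y^{n-2} e^{-\frac{y}{2(1-t)}}\, dy = (2(1-t))^{n-2} (n-2)! \int_0^\infty e^{-\frac{y}{2(1-t)}}\, dy \tag{B.15}
\]
\[
= 2^{n-1} (1-t)^{n-1} (n-2)!. \tag{B.16}
\]
Therefore
\[
F_{\frac{X}{X+Y}}(t) = 1 - (1-t)^{n-1}. \tag{B.17}
\]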
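As a sanity check on (B.17), one can sample $|x|^2$ directly from the representation (B.5) and compare the empirical distribution with $1 - (1-t)^{n-1}$. This is a rough Monte Carlo sketch under stated assumptions, not part of the proof: the sample size, the seed, and the function names are arbitrary choices made for illustration.

```python
import random

def sample_abs_x_squared(n, rng):
    """Draw |x|^2 = |Z_1|^2 / (|Z_1|^2 + ... + |Z_n|^2) as in (B.5).

    Each |Z_j|^2 is the sum of two squared standard normals,
    since Z_j = X_{2j-1} + i X_{2j}.
    """
    norms = [rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2
             for _ in range(n)]
    return norms[0] / sum(norms)

def empirical_cdf(n, t, samples=20000, seed=0):
    """Fraction of samples with |x|^2 <= t (seeded for reproducibility)."""
    rng = random.Random(seed)
    hits = sum(sample_abs_x_squared(n, rng) <= t for _ in range(samples))
    return hits / samples

def exact_cdf(n, t):
    """Closed form (B.17)."""
    return 1.0 - (1.0 - t) ** (n - 1)

for t in (0.1, 0.3, 0.6):
    print(t, empirical_cdf(4, t), exact_cdf(4, t))
```

With 20000 samples the statistical error is of order $10^{-2}$ or smaller, so agreement to about two decimal places is all that should be expected.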

Communicated by M.B. Ruskai

Commun. Math. Phys. 296, 145–174 (2010) Digital Object Identifier (DOI) 10.1007/s00220-009-0929-7


FJRW-Rings and Mirror Symmetry

Marc Krawitz1, Nathan Priddis2, Pedro Acosta2, Natalie Bergin2, Himal Rathnakumara2

1 Department of Mathematics, University of Michigan, Ann Arbor, MI, USA. E-mail: [email protected]

2 Department of Mathematics, Brigham Young University, Provo, UT 84602, USA. E-mail: [email protected]; [email protected]; [email protected]; [email protected]

Received: 28 May 2009 / Accepted: 21 July 2009 Published online: 8 October 2009 – © Springer-Verlag 2009

Abstract: The Landau-Ginzburg Mirror Symmetry Conjecture states that for an invertible quasi-homogeneous singularity $W$ and its maximal group $G$ of diagonal symmetries, there is a dual singularity $W^T$ such that the orbifold A-model of $W/G$ is isomorphic to the B-model of $W^T$. The Landau-Ginzburg A-model is the Frobenius algebra $H_{W,G}$ constructed by Fan, Jarvis, and Ruan, and the B-model is the orbifold Milnor ring of $W^T$. We verify the Landau-Ginzburg Mirror Symmetry Conjecture for Arnol’d’s list of unimodal and bimodal quasi-homogeneous singularities with $G$ the maximal diagonal symmetry group, and include a discussion of eight axioms which facilitate the computation of FJRW-rings.

1. Introduction

In this paper we verify the Landau-Ginzburg Mirror Symmetry Conjecture for Arnol’d’s list of unimodal and bimodal singularities [1, p. 25]. Briefly, the conjecture states that for non-degenerate quasi-homogeneous singularities there is a mirror dual singularity such that the ring constructed by Fan-Jarvis-Ruan [3] for one is isomorphic to the Landau-Ginzburg B-model (Milnor ring) of the other. The conjecture has already been proven for the simple and parabolic singularities in [3].

The Landau-Ginzburg B-model is an orbifolded Milnor ring. When the orbifold group is trivial, this is just the classical Milnor ring (local algebra) of the singularity [1]. In [3], Fan, Jarvis, and Ruan construct a cohomological field theory which gives the A-model Frobenius algebra when restricted to genus zero with three marked points. Since the original motivation for the theory was to study generalizations of the Witten equation, we call the A-model Frobenius algebra the Fan-Jarvis-Ruan-Witten ring, or FJRW-ring, for short.

M. K. is partially supported by the National Research Foundation of South Africa.


The singularities for this particular theory are required to be non-degenerate, quasi-homogeneous (i.e. weighted homogeneous) polynomials, having an isolated singularity at the origin. Not all of the singularities in the list in [1] are quasi-homogeneous, so we have only used those which are. Several of the families in [1] depend on certain parameters, but are only non-degenerate and quasi-homogeneous for particular parameter values. In these cases, we fix the appropriate parameter values without further comment.

1.1. Outline of Paper.
• Section 1: Introduction
  – 1.2 Review of construction of FJRW-rings
  – 1.3 Additional notation
  – 1.4 An example
  – 1.5 Format of results
• Section 2: Computations
  – 2.1 Exceptional families of unimodal singularities
  – 2.2 Exceptional families of bimodal singularities
  – 2.3 Bi-modal singularities of corank 3
  – 2.4 Bi-modal singularities of corank 2

1.2. Review of construction. Let $W$ be a non-degenerate quasi-homogeneous polynomial in the variables $x_1, x_2, \ldots, x_N$ with weights $q_1, q_2, \ldots, q_N$ respectively. Non-degeneracy requires that these weights are uniquely determined by the condition that each monomial in $W$ has total weight 1, and that $W$ has an isolated singularity at the origin.

The following construction of $W^T$ was proposed by Berglund-Hübsch in [2]. To each quasi-homogeneous polynomial $W$ we can associate a matrix, $B_W$, such that the columns correspond to the variables, and the rows correspond to the terms of the polynomial. In other words, the entry $(B_W)_{i,j}$ is the power of $x_j$ in the $i$th monomial of $W$. If the number of monomials coincides with the number of variables, the matrix $B_W$ is square, and the non-degeneracy condition implies that $B_W$ is invertible. In such cases, we call the polynomial $W$ an invertible potential. Note that the variables of an invertible potential can always be rescaled so the coefficients of the monomials are all equal to one.

To illustrate the correspondence $W \leftrightarrow B_W$, consider the polynomial $W = x^3 + x y^4 + y z^2$. The corresponding matrix is
\[
\begin{pmatrix} 3 & 0 & 0 \\ 1 & 4 & 0 \\ 0 & 1 & 2 \end{pmatrix}.
\]
If the matrix $B_W$ is square, its transpose $B_W^T$ will also correspond to a quasi-homogeneous polynomial, which we denote by $W^T$. In many cases the $W^T$ polynomial also has an isolated singularity at the origin, thus satisfying the non-degeneracy condition.

The central charge of $W$ is defined to be
\[
\hat{c} := \sum_{j=1}^{N} (1 - 2 q_j).
\]


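The Jacobian ideal $\mathcal{J}$ is defined by
\[
\mathcal{J} = \Big( \frac{\partial W}{\partial x_1}, \frac{\partial W}{\partial x_2}, \ldots, \frac{\partial W}{\partial x_N} \Big).
\]
The Milnor ring $Q_W$ is given by
\[
Q_W := \mathbb{C}[x_1, x_2, \ldots, x_N] / \mathcal{J},
\]
together with the residue pairing. $Q_W$ is a finite dimensional vector space over $\mathbb{C}$, with dimension
\[
\mu = \prod_{j=1}^{N} \Big( \frac{1}{q_j} - 1 \Big).
\]
It is graded by weighted degree, and the elements of top degree form a one-dimensional subspace generated by $\operatorname{hess}(W) = \det\big( \frac{\partial^2 W}{\partial x_i \partial x_j} \big)$. One can check directly that the top degree is equal to $\hat{c}$. For $f, g \in Q_W$, the residue pairing $\langle f, g \rangle$ may be defined by
\[
f g = \frac{\langle f, g \rangle}{\mu} \operatorname{hess}(W) + \text{lower order terms}.
\]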

This pairing is non-degenerate, and endows the Milnor ring with the structure of a Frobenius algebra.

To define the FJRW ring, we require in addition to $W$ a choice of a group of diagonal symmetries of $W$. The choice of group heavily affects the resulting structure of the FJRW ring. The maximal group of diagonal symmetries is defined as
\[
G_W = \big\{ (\alpha_1, \alpha_2, \ldots, \alpha_N) \in (\mathbb{C}^*)^N \;\big|\; W(\alpha_1 x_1, \alpha_2 x_2, \ldots, \alpha_N x_N) = W(x_1, x_2, \ldots, x_N) \big\}.
\]
Note that $G_W$ always contains the exponential grading element $J = (e^{2\pi i q_1}, e^{2\pi i q_2}, \ldots, e^{2\pi i q_N})$. In general, the theory requires that the symmetry group be admissible (see [3] Sect. 2.3). In our computations, we will use either the maximal diagonal symmetry group $G_W$ or the subgroup $\langle J \rangle$ generated by $J$; both of these groups are known to be admissible.

The Landau-Ginzburg Mirror Symmetry Conjecture states the following:

Conjecture. For a non-degenerate, quasi-homogeneous, invertible singularity $W$ and (maximal) diagonal symmetry group $G$, there is a dual singularity $W^T$ so that the FJRW-ring of $W/G$ is isomorphic to the (unorbifolded) Milnor ring of $W^T$.

Remark. We use the notation $W^T$ suggestively for the dual singularity, as we identify in this paper a class of examples for which the Berglund-Hübsch transposed singularity is the appropriate dual in the context of the Landau-Ginzburg Mirror Symmetry Conjecture.

We now outline the definition of $H_{W,G}$ as a $\mathbb{C}$-vector space, after which we will define the pairing, grading, and multiplication that make $H_{W,G}$ a Frobenius algebra. In [3], the state space $H_{W,G}$ is defined in terms of Lefschetz thimbles. For computational convenience, we give a presentation in terms of Milnor rings, but we should point out that the isomorphism between the two presentations is not canonical.
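The weights, central charge, and Milnor number defined above, together with the Berglund-Hübsch transpose, can be computed mechanically from the exponent matrix $B_W$. The following is a minimal sketch, assuming that $B_W$ is square and invertible; the function names are hypothetical and exact rational arithmetic is used via Python's `fractions` module. It solves $B_W q = (1, \ldots, 1)^T$, which encodes the condition that every monomial has total weight 1.

```python
from fractions import Fraction

def solve_weights(B):
    """Solve B q = (1, ..., 1) exactly by Gauss-Jordan elimination.

    Assumes B is a square invertible integer matrix (an invertible potential).
    """
    n = len(B)
    # Augmented matrix [B | 1] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(1)] for row in B]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def transpose(B):
    """Exponent matrix of the Berglund-Huebsch transpose W^T."""
    return [list(col) for col in zip(*B)]

def central_charge(q):
    """c-hat = sum_j (1 - 2 q_j)."""
    return sum(1 - 2 * qj for qj in q)

def milnor_number(q):
    """mu = prod_j (1/q_j - 1)."""
    mu = Fraction(1)
    for qj in q:
        mu *= 1 / qj - 1
    return mu

# The example from the text: W = x^3 + x y^4 + y z^2.
B = [[3, 0, 0], [1, 4, 0], [0, 1, 2]]
q = solve_weights(B)
print(q, central_charge(q), milnor_number(q))
print(solve_weights(transpose(B)))
```

For the example matrix this gives the weights $(1/3, 1/6, 5/12)$; the same helpers apply unchanged to the transposed matrix.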


Let $G$ be an admissible group. For $h \in G$, let $\mathrm{Fix}\, h \subset \mathbb{C}^N$ be the fixed locus of $h$, and let $N_h$ be its dimension. Define
\[
H_h := \Omega^{N_h}(\mathbb{C}^{N_h}) \big/ \big( dW|_{\mathrm{Fix}\, h} \wedge \Omega^{N_h - 1} \big) \cong Q_{W|_{\mathrm{Fix}\, h}} \cdot \omega,
\]
where $\omega = dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_{N_h}}$ is the natural choice of volume form. $G$ acts on $H_h$ via its action on the coordinates, and the state space of the FJRW-ring is the vector space of invariants under this action, i.e.
\[
H_{W,G} := \Big( \bigoplus_{h \in G} H_h \Big)^G.
\]
$H_{W,G}$ is $\mathbb{Q}$-graded by the so-called $W$-degree, which depends only on the $G$-grading. To define this grading, first note that each element $h \in G$ can be uniquely expressed as
\[
h = (e^{2\pi i \Theta_1^h}, e^{2\pi i \Theta_2^h}, \ldots, e^{2\pi i \Theta_N^h})
\]
with $0 \le \Theta_i^h < 1$. For $\alpha_h \in (H_h)^G$, the $W$-degree of $\alpha_h$ is defined by
\[
\deg_W(\alpha_h) := N_h + 2 \sum_{j=1}^{N} (\Theta_j^h - q_j). \tag{1}
\]
Since $\mathrm{Fix}\, h = \mathrm{Fix}\, h^{-1}$, we have $H_h \cong H_{h^{-1}}$, and the pairing on $Q_{W|_{\mathrm{Fix}\, h}}$ induces a pairing
\[
(H_h)^G \otimes (H_{h^{-1}})^G \to \mathbb{C}.
\]
The pairing on $H_{W,G}$ is the direct sum of these pairings. Fixing a basis for $H_{W,G}$, we denote the pairing by a matrix $\eta_{\alpha,\beta} = \langle \alpha, \beta \rangle$, with inverse $\eta^{\alpha,\beta}$.

For each pair of non-negative integers $g$ and $n$, with $2g - 2 + n > 0$, the FJRW cohomological field theory produces classes $\Lambda^W_{g,n}(\alpha_1, \alpha_2, \ldots, \alpha_n) \in H^*(\overline{\mathcal{M}}_{g,n})$ of complex codimension $D$ for each $n$-tuple $(\alpha_1, \alpha_2, \ldots, \alpha_n) \in (H_{W,G})^n$. The codimension $D$ is given by
\[
D := \hat{c}_W (g - 1) + \frac{1}{2} \sum_{i=1}^{n} \deg_W(\alpha_i),
\]
and the $n$-point correlators are defined to be
\[
\langle \alpha_1, \ldots, \alpha_n \rangle_{g,n} := \int_{\overline{\mathcal{M}}_{g,n}} \Lambda^W_{g,n}(\alpha_1, \ldots, \alpha_n).
\]
The correlator $\langle \alpha_1, \ldots, \alpha_n \rangle_{g,n}$ vanishes unless the codimension of $\Lambda^W_{g,n}(\alpha_1, \ldots, \alpha_n)$ is zero.

The ring structure on $H_{W,G}$ is determined by the genus-zero three-point correlators. In other words, if $r, s \in H_{W,G}$, then
\[
r \star s := \sum_{\alpha, \beta} \langle r, s, \alpha \rangle_{0,3}\, \eta^{\alpha,\beta}\, \beta, \tag{2}
\]
where the sum is taken over all choices of $\alpha$ and $\beta$ in a fixed basis of $H_{W,G}$.

The classes $\Lambda^W_{g,n}(\alpha_1, \ldots, \alpha_n)$ satisfy the following axioms that allow us to compute most of the three-point correlators $\langle \alpha_1, \alpha_2, \alpha_3 \rangle$ explicitly.


Axiom 1. Dimension: If $D \notin \frac{1}{2}\mathbb{Z}$, then $\Lambda^W_{g,n}(\alpha_1, \alpha_2, \ldots, \alpha_n) = 0$. Otherwise, $D$ is the complex codimension of the class $\Lambda^W_{g,n}(\alpha_1, \alpha_2, \ldots, \alpha_n)$. In particular, if $g = 0$ and $n = 3$, then $\langle \alpha_1, \alpha_2, \alpha_3 \rangle = 0$ unless $D = 0$.

Notice that in the case where $g = 0$ and $n = 3$, $D = 0$ if and only if $\sum_{i=1}^{3} \deg_W \alpha_i = 2\hat{c}$.

Axiom 2. Symmetry: Let $\sigma \in S_n$. Then
\[
\langle \alpha_1, \ldots, \alpha_n \rangle_{g,n} = \langle \alpha_{\sigma(1)}, \ldots, \alpha_{\sigma(n)} \rangle_{g,n}.
\]

The next few axioms rely on the degrees of line bundles $\mathcal{L}_1, \ldots, \mathcal{L}_N$ endowing an orbicurve with a so-called $W$-structure; however, this can be reduced to a simple numerical criterion. Consider the class $\Lambda^W_{g,k}(\alpha_1, \alpha_2, \ldots, \alpha_k)$, with $\alpha_j \in (H_{h_j})^G$ for each $j \in \{1, \ldots, N\}$. For each variable $x_j$, define $l_j$ by
\[
l_j = q_j (2g - 2 + k) - \sum_{i=1}^{k} \Theta_j^{h_i}.
\]

Axiom 3. Integer degrees: If $l_j \notin \mathbb{Z}$ for some $j \in \{1, \ldots, N\}$, then $\Lambda^W_{g,k}(\alpha_1, \alpha_2, \ldots, \alpha_k) = 0$.

Axiom 4. Concavity: If $l_j < 0$ for all $j \in \{1, 2, 3\}$, then $\langle \alpha_1, \alpha_2, \alpha_3 \rangle = 1$.

The next axiom is related to the Witten map:
\[
\mathcal{W} : \bigoplus_{j=1}^{N} \mathbb{C}^{h_j^0} \to \bigoplus_{j=1}^{N} \mathbb{C}^{h_j^1}, \qquad
\mathcal{W} = \Big( \overline{\frac{\partial W}{\partial x_1}}, \overline{\frac{\partial W}{\partial x_2}}, \ldots, \overline{\frac{\partial W}{\partial x_N}} \Big),
\]
where $h_j^0$ and $h_j^1$ are defined by
\[
h_j^0 := \begin{cases} 0 & \text{if } l_j < 0 \\ l_j + 1 & \text{if } l_j \ge 0 \end{cases},
\qquad
h_j^1 := \begin{cases} -l_j - 1 & \text{if } l_j < 0 \\ 0 & \text{if } l_j \ge 0 \end{cases},
\]
so that both are non-negative integers satisfying $h_j^0 - h_j^1 = l_j + 1$ (see footnote 1). The fact that the Witten map is well-defined is a consequence of the geometric conditions on the $\mathcal{L}_j$ considered in [3]. For further details, we refer readers to the original paper.

If $\Lambda^W_{g,n}(\alpha_1, \ldots, \alpha_n)$ is a class of codimension zero, we obtain a complex number by integrating over $\overline{\mathcal{M}}_{g,n}$. Abusing notation, we will refer to the class $\Lambda^W_{g,n}(\alpha_1, \ldots, \alpha_n)$ and its integral over $\overline{\mathcal{M}}_{g,n}$ interchangeably.

1 The reader may note that $h_j^i$ is just the dimension of the $i$th cohomology of the $j$th line bundle in the $W$-structure, and this relation is just Riemann-Roch for a line bundle $\mathcal{L}_j$ of degree $l_j$ on $\mathbb{CP}^1$. The preceding definitions simply serve to axiomatize the entire construction.


Axiom 5. Index Zero: Consider the class $\Lambda^W_{g,n}(\alpha_1, \alpha_2, \ldots, \alpha_n)$, with $\alpha_i \in H_{\gamma_i, G}$. If $\mathrm{Fix}\, \gamma_i = \{0\}$ for each $i \in \{1, 2, \ldots, n\}$ and
\[
\sum_{j=1}^{N} (h_j^0 - h_j^1) = 0,
\]
then $\Lambda^W_{g,n}(\alpha_1, \alpha_2, \ldots, \alpha_n)$ is of codimension zero and $\Lambda^W_{g,n}(\alpha_1, \alpha_2, \ldots, \alpha_n)$ is equal to the degree of the Witten map.

Axiom 6. Composition: There exists a morphism $\rho : \overline{\mathcal{M}}_{0,3} \times \overline{\mathcal{M}}_{0,3} \to \overline{\mathcal{M}}_{0,4}$ obtained by gluing a marked point of each 3-pointed curve. If the four-point class $\Lambda^W_{0,4}(\alpha_1, \alpha_2, \alpha_3, \alpha_4)$ is of codimension zero, then the integral of its pullback along $\rho$ decomposes in terms of three-point correlators in the following way:
\[
\int_{\overline{\mathcal{M}}_{0,3} \times \overline{\mathcal{M}}_{0,3}} \rho^* \Lambda^W_{0,4}(\alpha_1, \alpha_2, \alpha_3, \alpha_4) = \sum_{\beta, \delta} \langle \alpha_1, \alpha_2, \beta \rangle\, \eta^{\beta,\delta}\, \langle \delta, \alpha_3, \alpha_4 \rangle.
\]

Note that $\mathrm{Fix}\, J = \{0\}$ so $H_J \cong \mathbb{C}$. The identity element in the FJRW-ring is an element of $H_J$, and we denote this element by $1$.

Axiom 7. Pairing: For $\alpha_1, \alpha_2 \in H_{W,G}$, $\langle \alpha_1, \alpha_2, 1 \rangle = \eta(\alpha_1, \alpha_2)$, where $\eta$ is the pairing in $H_{W,G}$.

Axiom 8. Sums of singularities: If $W_1 \in \mathbb{C}[x_1, \ldots, x_r]$ and $W_2 \in \mathbb{C}[y_1, \ldots, y_s]$ are two non-degenerate, quasi-homogeneous polynomials with maximal symmetry groups $G_1$ and $G_2$, then the maximal symmetry group of $W = W_1 + W_2$ is $G = G_1 \times G_2$, and there is an isomorphism of Frobenius algebras
\[
H_{W,G} \cong H_{W_1, G_{W_1}} \otimes H_{W_2, G_{W_2}}.
\]

Remark. We note an important consequence of Axiom 8. Under the same hypotheses as in the statement of the axiom, we have a Frobenius algebra isomorphism $Q_W \cong Q_{W_1} \otimes Q_{W_2}$, and similarly $Q_{W^T} \cong Q_{W_1^T} \otimes Q_{W_2^T}$. Consequently, in order to prove the Mirror Symmetry Conjecture for $W = W_1 + W_2$, a sum of decoupled polynomials (with maximal A-model orbifold group), it suffices to prove it for $W_1$ and $W_2$ individually.

These axioms allow us to compute most of the three-point correlators of the FJRW-rings. In some cases, the axioms are not enough to compute all of the correlators; however, in most of these cases, one can still verify the mirror symmetry conjecture.


1.3. Additional notation. A singularity is said to be invertible if the number of monomials equals the number of variables. We have found that the Landau-Ginzburg mirror symmetry conjecture holds for the invertible unimodal and bimodal singularities orbifolded by the maximal group of diagonal symmetries, $G_W$. The conjecture was verified in [3] for the simple singularities.

In most cases considered, the maximal symmetry group, $G_W$, is cyclic. In these cases we have adopted the following notation. Let $g$ be a generator for $G_W$. If $J$ generates $G_W$, take $g = J$. If $\mathrm{Fix}\, g^k = \{0\}$, define $e_k = 1 \in H_{g^k} \cong \mathbb{C}$, otherwise if $\mathrm{Fix}\, g^k = \mathbb{C} x_{i_1} \oplus \cdots \oplus \mathbb{C} x_{i_{N_g}}$, define $e_k = dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_{N_g}} \in H_{g^k}$.

We denote coordinate subspaces of $\mathbb{C}^N$ with subscripts indicating the non-zero variables in these subspaces. So, for example, the $xy$-plane in $\mathbb{C}^3$ will be denoted by $\mathbb{C}^2_{xy}$.

Our computations are often made easier by judicious use of associativity. This will be reflected typographically in the grouping of terms. So, for example, $\alpha \star \beta\gamma$ indicates that $\beta\gamma$ should be computed first, and then multiplied by $\alpha$.

1.4. An example. We will now give an example demonstrating the construction and our methods of computation more fully.

1.4.1. $E_{19}$ with maximal symmetry group. The polynomial for $E_{19}$ is $x^3 + x y^7$. The corresponding weights for each variable are $q_x = \frac{1}{3}$, $q_y = \frac{2}{21}$ and the central charge is $\hat{c} = \frac{24}{21}$. The Jacobian ideal is $\mathcal{J} = (3x^2 + y^7, 7xy^6)$. The maximal group of diagonal symmetries is given by
\[
G = \{ (\alpha, \beta) \mid \alpha^3 = \alpha \beta^7 = 1 \}.
\]
From the relations, we can see that $\alpha = \beta^{-7}$, so that $|G| = 21$ and $G$ is cyclic. The exponential grading element is $J = (e^{2\pi i \frac{1}{3}}, e^{2\pi i \frac{2}{21}})$, which has order 21, so the maximal group of diagonal symmetries is generated by $J$. The fixed point locus of $J^k$ is given by
\[
\mathrm{Fix}\, J^k = \begin{cases} \mathbb{C}^2 & \text{if } k = 0 \\ \mathbb{C}_x & \text{if } 3 \mid k,\ k \ne 0 \\ 0 & \text{otherwise.} \end{cases}
\]
The Milnor ring is $\mathbb{C}[x, y]/\mathcal{J} \cong \langle 1, x, x^2, y, \ldots, y^6, xy, xy^2, \ldots, xy^5, x^2 y, x^2 y^2, \ldots, x^2 y^5 \rangle$ for $J^0 = I$. If $3 \mid k$, then $W|_{\mathrm{Fix}\, J^k} = x^3$. And in the case that $3 \nmid k$, the Milnor ring is trivial. So the Milnor rings are given by
\[
Q|_{\mathrm{Fix}\, J^k} = \begin{cases}
\langle 1, x, x^2, y, \ldots, y^6, xy, xy^2, \ldots, xy^5, x^2 y, x^2 y^2, \ldots, x^2 y^5 \rangle,\ \mu = 19 & \text{if } k = 0 \\
\langle 1, x \rangle,\ \mu = 2 & \text{if } 3 \mid k,\ k \ne 0 \\
\langle 1 \rangle & \text{otherwise.}
\end{cases}
\]


Now for each choice of $k$ with $3 \nmid k$, $\mathrm{Fix}\, J^k = \{0\}$ so $H_{J^k, G} \cong \mathbb{C}$ with trivial $G$-action. So $e_k$ is a generator for the state space. To take $G$-invariants, we need only consider the action of $J$, since $G = \langle J \rangle$. If $k \ne 0$ and $3 \mid k$, then we must consider the action of $J$ on $x^j dx$ for $j \in \{0, 1\}$:
\[
J \cdot x^j dx = \exp\Big( 2\pi i\, \frac{j+1}{3} \Big)\, x^j dx.
\]
This action clearly has no invariants. For $k = 0$, we must consider the action of $J$ on the Milnor ring associated to $J^0$:
\[
J \cdot x^l y^m\, dx \wedge dy = \exp\Big( 2\pi i\, \frac{7l + 7 + 2m + 2}{21} \Big)\, x^l y^m\, dx \wedge dy.
\]
This action has a single invariant element, namely $y^6\, dx \wedge dy = y^6 e_0$. So a basis for the state space is
\[
H_{E_{19}, \langle J \rangle} = \langle y^6 e_0, e_1, e_2, e_4, e_5, e_7, e_8, e_{10}, e_{11}, e_{13}, e_{14}, e_{16}, e_{17}, e_{19}, e_{20} \rangle.
\]
The $W$-degrees are straightforward to compute from formula (1). Again, notice that the $W$-degree only depends on the power of the generator $J$. We give the invariants of each sector in the following table as well:

k            0       1       2       4       5       7       8       10      11      13      14      16      17      19      20
|G|·deg_W    24      0       18      12      30      24      42      36      12      6       24      18      36      30      48
invariants   y^6e_0  1       e_2     e_4     e_5     e_7     e_8     e_10    e_11    e_13    e_14    e_16    e_17    e_19    e_20
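The table above can be reproduced directly from formula (1). The sketch below is an illustration under stated assumptions: it hard-codes the weights of $E_{19}$ and the Milnor ring basis of the untwisted sector exactly as listed in the text, and the function names are hypothetical. It computes $|G| \cdot \deg_W$ for every sector $J^k$ that carries an invariant, and finds the $J$-invariant monomials $x^l y^m\, dx \wedge dy$ in the untwisted sector.

```python
from fractions import Fraction

# Weights (q_x, q_y) of E_19 = x^3 + x y^7, as stated in the text.
Q = (Fraction(1, 3), Fraction(2, 21))

def phases(k):
    """Phases Theta^h of h = J^k, each reduced to the interval [0, 1)."""
    return tuple((k * q) % 1 for q in Q)

def w_degree(k):
    """W-degree of the sector of J^k, following formula (1)."""
    theta = phases(k)
    n_h = sum(1 for t in theta if t == 0)  # dimension of Fix J^k
    return n_h + 2 * sum(t - q for t, q in zip(theta, Q))

def untwisted_invariants():
    """Exponents (l, m) such that x^l y^m dx^dy is fixed by J.

    The candidates run over the Milnor ring basis listed in the text:
    y^m for m = 0..6, and x^l y^m for l = 1, 2 and m = 0..5.
    """
    basis = [(0, m) for m in range(7)]
    basis += [(l, m) for l in (1, 2) for m in range(6)]
    return [(l, m) for (l, m) in basis
            if (Q[0] * (l + 1) + Q[1] * (m + 1)) % 1 == 0]

# Sectors with 3 | k and k != 0 carry no invariants, so they are skipped.
sectors = [k for k in range(21) if k == 0 or k % 3 != 0]
table = {k: 21 * w_degree(k) for k in sectors}
print(table)
print(untwisted_invariants())
```

Using exact fractions rather than floats keeps the test `t == 0` for fixed coordinates reliable.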

Several three-point correlators can be computed using the pairing axiom. Note that all twisted sectors correspond to trivial fixed loci, so the residue pairing between twisted sectors is given by
\[
\langle \ , \ \rangle : H_{J^k} \otimes H_{J^{21-k}} \to \mathbb{C}, \qquad \langle e_k, e_{21-k} \rangle = 1.
\]
So, by the pairing axiom, $\langle 1, 1, e_{20} \rangle$, $\langle 1, e_2, e_{19} \rangle$, $\langle 1, e_4, e_{17} \rangle$, $\langle 1, e_5, e_{16} \rangle$, $\langle 1, e_7, e_{14} \rangle$, $\langle 1, e_8, e_{13} \rangle$, and $\langle 1, e_{10}, e_{11} \rangle$ are all equal to 1. The pairing in the untwisted sector gives us
\[
\langle y^6 e_0, y^6 e_0, 1 \rangle = \eta_{y^6 e_0, y^6 e_0} = \frac{19 y^{12}}{\mathrm{Hess}} = \frac{19 y^{12}}{-133 y^{12}} = -1/7.
\]

The following three-point correlators have $l_x$ and $l_y$ both less than zero: $\langle e_2, e_4, e_{16} \rangle$, $\langle e_2, e_7, e_{13} \rangle$, $\langle e_4, e_4, e_{14} \rangle$, $\langle e_4, e_5, e_{13} \rangle$, $\langle e_4, e_7, e_{11} \rangle$, $\langle e_{11}, e_{13}, e_{19} \rangle$, $\langle e_{11}, e_{16}, e_{16} \rangle$, $\langle e_{13}, e_{13}, e_{17} \rangle$, $\langle e_{13}, e_{14}, e_{16} \rangle$. Therefore, by the concavity axiom each is equal to 1.

For the three-point correlator $\langle y^6 e_0, e_{11}, e_{11} \rangle$, we apply the composition axiom to the class $\Lambda^W_{0,4}(e_{11}, e_{11}, e_{11}, e_{11})$. The values for the line bundle degrees are $l_x = -2$ and $l_y = 0$. So we have $H^0 = 0 \oplus \mathbb{C}_y$ and $H^1 = \mathbb{C}_x \oplus 0$. The Witten map $H^0 \to H^1$ is given by the complex conjugate of the gradient of $W$. In other words it maps $(0, y) \mapsto (\bar{y}^7, 0)$.
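The line bundle degrees quoted above follow from the definition of $l_j$ preceding Axiom 3. The following sketch, with hypothetical function names and the $E_{19}$ weights taken from the text, evaluates $l_x$ and $l_y$ for a list of sectors $J^{k_1}, \ldots, J^{k_n}$; it is meant only as a way to re-check the concavity and composition inputs, not as a general implementation.

```python
from fractions import Fraction

# Weights (q_x, q_y) of E_19 = x^3 + x y^7, as stated in the text.
Q = (Fraction(1, 3), Fraction(2, 21))

def phases(k):
    """Phases Theta^h of h = J^k, each reduced to the interval [0, 1)."""
    return tuple((k * q) % 1 for q in Q)

def line_bundle_degrees(sectors, genus=0):
    """l_j = q_j (2g - 2 + n) - sum_i Theta_j^{h_i} for insertions in J^{k_i}."""
    n = len(sectors)
    return tuple(
        q * (2 * genus - 2 + n) - sum(phases(k)[j] for k in sectors)
        for j, q in enumerate(Q)
    )

# Correlators listed in the text as having l_x < 0 and l_y < 0.
concave = [(2, 4, 16), (2, 7, 13), (4, 4, 14), (4, 5, 13), (4, 7, 11),
           (11, 13, 19), (11, 16, 16), (13, 13, 17), (13, 14, 16)]
for ks in concave:
    print(ks, line_bundle_degrees(ks))

# Four-point class used for the composition axiom.
print(line_bundle_degrees((11, 11, 11, 11)))
```

Each of the nine listed correlators gives $(l_x, l_y) = (-1, -1)$, and the four-point class gives $(-2, 0)$, in agreement with the values stated above.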


The degree of this map is $-7$, so the composition axiom tells us that
\[
\Lambda^W_{0,4}(e_{11}, e_{11}, e_{11}, e_{11}) = -7 = \sum_{\alpha, \beta} \langle e_{11}, e_{11}, \alpha \rangle\, \eta^{\alpha,\beta}\, \langle \beta, e_{11}, e_{11} \rangle,
\]
where $\alpha$ and $\beta$ range over a fixed basis for the FJRW-ring. By degree considerations, the only non-zero contribution to this sum occurs when $\alpha = y^6 e_0 = \beta$. Setting $a := \langle y^6 e_0, e_{11}, e_{11} \rangle$, we get the following equation:
\[
-7 = \langle e_{11}, e_{11}, y^6 e_0 \rangle\, \eta^{y^6 e_0, y^6 e_0}\, \langle y^6 e_0, e_{11}, e_{11} \rangle = -7a^2, \tag{3}
\]
where the $-7$ on the right is the contribution to the inverse of the pairing corresponding to $\langle y^6 e_0, y^6 e_0, 1 \rangle = -1/7$ computed above. From this we can see $\langle y^6 e_0, e_{11}, e_{11} \rangle = \pm 1$. We will see below that either choice gives a Frobenius algebra that is isomorphic to $Q_{E_{19}^T}$, where $E_{19}^T = x^3 y + y^7$.

This completes the computation of the three-point correlators. All others are required to vanish by Axioms 1 and 2.

Recall that multiplication in $H_{E_{19}, G}$ is given by Eq. (2). Examining degrees we see that $e_{13}$ is a generator for $H_{E_{19}, G}$. We compute
\[
e_{13}^2 = \sum_{\alpha, \beta} \langle e_{13}, e_{13}, \alpha \rangle\, \eta^{\alpha,\beta}\, \beta = \langle e_{13}, e_{13}, e_{17} \rangle\, \eta^{e_{17}, e_4}\, e_4 = e_4,
\]
where the first summation is taken over all $\alpha$ and $\beta$ among the generators we have listed for the state space. The second equality follows because the only non-zero three-point correlator with $e_{13}$ occurring twice is $\langle e_{13}, e_{13}, e_{17} \rangle = 1$. Again examining degrees, we see that $e_{11}$ is also a generator of $H_{E_{19}, G}$.

One can check directly that $e_{13}$ and $e_{11}$ generate $H_{E_{19}, G}$ as a Frobenius algebra, so we may define a surjective map $\varphi : \mathbb{C}[X, Y] \to H_{E_{19}, G}$ by $X \mapsto e_{11}$ and $Y \mapsto e_{13}$, and extend to a $\mathbb{C}$-algebra homomorphism. We see that $e_{11}^2 \star e_{13} = 0$, $e_{11}^3 = -7 e_{10}$, and $e_{13}^6 = e_{10}$. Hence $(3X^2 Y, X^3 + 7Y^6) \subset \ker \varphi$. Since $\mathbb{C}[X, Y]/(3X^2 Y, X^3 + 7Y^6) = Q_{E_{19}^T}$ has dimension 15, the same as $H_{E_{19}, G}$, we deduce that the inclusion $(3X^2 Y, X^3 + 7Y^6) \subset \ker \varphi$ is an equality, and therefore the map induces a degree-preserving isomorphism $Q_{E_{19}^T} \cong H_{E_{19}, G}$.

1.5. Format of results. For each singularity, we will display the information in the following pattern:
• Name of singularity, the defining polynomial, the Jacobian ideal, the weights associated to each variable, and the central charge. We will also give the symmetry group used in the construction, typically $G_W$, which we will henceforth denote by $G$.
• Fixed locus for each group element.
• Basis for the Milnor ring of $W$ restricted to each fixed locus.
• Table of sectors with non-trivial $G$-invariants, including the invariant elements and their $W$-degrees. For clarity of exposition, we will multiply $W$-degrees by a factor of $|G|$.
• We will give the values of the three-point correlators that are not required to vanish by Axioms 1 and 2. These will be grouped in the following order: those computed by the Pairing axiom, those by the Concavity axiom, those by the Index Zero axiom, and those by the Composition axiom. Any correlators for which the axioms do not suffice will be listed last, including any relations among them.
• Finally, we will describe, where possible, an isomorphism between the FJRW-ring of $W$ and the Milnor ring of $W^T$.

2. Computations

We take our examples from the unimodal and bimodal singularities listed by Arnol’d [1]. Many of these singularities are quasi-homogeneous only after fixing specific parameter values, which we do without further comment.

2.1. Unimodal singularities.

$Q_{10} = x^2 z + y^3 + z^4$. Axiom 8 applies here. By the subsequent comment, it suffices to prove the Mirror Symmetry Conjecture for $D_5 = x^2 z + z^4$ and $A_2 = y^3$. The conjecture was proved for the simple (ADE) singularities in [3], so it holds in this case also.

$Q_{11} = x^2 z + y^3 + y z^3$.
\[
\mathcal{J} = (2xz,\ 3y^2 + z^3,\ x^2 + 3yz^2), \quad q_x = \tfrac{7}{18}, \quad q_y = \tfrac{6}{18}, \quad q_z = \tfrac{4}{18}, \quad \hat{c} = \tfrac{20}{18}, \quad G = \langle J \rangle = \mathbb{Z}/18\mathbb{Z},
\]
\[
\mathrm{Fix}\, J^k = \begin{cases}
\mathbb{C}^3 & \text{if } k = 0 \\
\mathbb{C}_y & \text{if } k = 3, 6, 12, 15 \\
\mathbb{C}^2_{yz} & \text{if } k = 9 \\
0 & \text{otherwise}
\end{cases},
\qquad
Q|_{\mathrm{Fix}\, J^k} = \begin{cases}
\langle 1, z, y, x, z^2, yz, z^3, xy, yz^2, z^4, z^5 \rangle, & \mu = 11 \\
\langle 1, y, y^2 \rangle, & \mu = 3 \\
\langle 1, z, y, z^2, yz, z^3, z^4 \rangle, & \mu = 7 \\
\langle 1 \rangle.
\end{cases}
\]

k            1       2       4       5       7       8       9       10      11      13      14      16      17
|G|·deg_W    0       34      30      28      24      22      20      18      16      12      10      6       40
invariants   1       e_2     e_4     e_5     e_7     e_8     z^2e_9  e_10    e_11    e_13    e_14    e_16    e_17

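Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{17} \rangle$, $\langle 1, e_2, e_{16} \rangle$, $\langle 1, e_4, e_{14} \rangle$, $\langle 1, e_5, e_{13} \rangle$, $\langle 1, e_7, e_{11} \rangle$, and $\langle 1, e_8, e_{10} \rangle$ are all equal to 1, and $\langle 1, z^2 e_9, z^2 e_9 \rangle = -\frac{1}{3}$. By the Concavity axiom, $\langle e_5, e_{16}, e_{16} \rangle$, $\langle e_7, e_{14}, e_{16} \rangle$, $\langle e_{10}, e_{11}, e_{16} \rangle$, and $\langle e_{10}, e_{13}, e_{14} \rangle$ are all equal to 1. By the Index-Zero axiom, $\langle e_{11}, e_{13}, e_{13} \rangle$ and $\langle e_8, e_{13}, e_{16} \rangle$ are $-2$.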


By the Composition axiom,
\[
-3 = \langle z^2 e_9, e_{14}, e_{14} \rangle\, \eta^{z^2 e_9, z^2 e_9}\, \langle z^2 e_9, e_{14}, e_{14} \rangle, \tag{†}
\]
where $\eta^{z^2 e_9, z^2 e_9} = -3$, so $\langle z^2 e_9, e_{14}, e_{14} \rangle = \pm 1$.

Consider the map $\varphi : \mathbb{C}[X, Y, Z] \to H_{Q_{11}, G}$ defined by $X \mapsto e_{10}$, $Y \mapsto e_{14}$, and $Z \mapsto e_{16}$ extending by $\mathbb{C}$-linearity and multiplicativity. One can check directly that this map is surjective. Our correlators tell us that $e_{16}^3 = -2 e_{10}$, $e_{14}^3 = -3 e_4 = -3 e_{10} \star e_{16}^2$, and $e_{14}^2 \star e_{16} = 0$. Hence $(2X + Z^3, 3Y^2 Z, Y^3 + 3XZ^2) \subset \ker \varphi$. Since $\mathbb{C}[X, Y, Z]/(2X + Z^3, 3Y^2 Z, Y^3 + 3XZ^2) = Q_{Q_{11}^T}$ has dimension 13, we deduce the inclusion is in fact equality, and $H_{Q_{11}, G} \cong Q_{Q_{11}^T}$.

$Q_{12} = x^2 z + y^3 + z^5$

Axiom 8 is applicable, and it suffices to prove the Mirror Symmetry Conjecture for $D_6 = x^2 z + z^5$ and $A_2 = y^3$. This follows from the study of simple singularities in [3].

$S_{11} = x^2 z + y z^2 + y^4$

$J = (2xz,\ z^2 + 4y^3,\ x^2 + 2yz)$, $G = \langle J\rangle = \mathbb{Z}/16\mathbb{Z}$,

$q_x = \tfrac{5}{16}$, $q_y = \tfrac{4}{16}$, $q_z = \tfrac{6}{16}$, $\hat{c} = \tfrac{18}{16}$,

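$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^3_{xyz} & \text{if } k = 0 \\ \mathbb{C}_y & \text{if } k = 4, 12 \\ \mathbb{C}^2_{yz} & \text{if } k = 8 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, y, x, z, y^2, xy, x^2, z^2, xy^2, y^2 z, z^3 \rangle, & \mu = 11 \\ \langle 1, z, z^2 \rangle, & \mu = 3 \\ \langle 1, z, y, y^2, y^3 \rangle, & \mu = 5 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{13}{c}}
k & 1 & 2 & 3 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 13 & 14 & 15 \\
|G| \cdot \deg_W & 0 & 30 & 28 & 24 & 22 & 20 & 18 & 16 & 14 & 12 & 8 & 6 & 36 \\
\text{invariants} & 1 & e_2 & e_3 & e_5 & e_6 & e_7 & z e_8 & e_9 & e_{10} & e_{11} & e_{13} & e_{14} & e_{15}
\end{array}$$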

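Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{15}\rangle$, $\langle 1, e_2, e_{14}\rangle$, $\langle 1, e_3, e_{13}\rangle$, $\langle 1, e_5, e_{11}\rangle$, $\langle 1, e_6, e_{10}\rangle$, and $\langle 1, e_7, e_9\rangle$ are all equal to 1, and $\langle 1, z e_8, z e_8\rangle = -\tfrac{1}{2}$. By the Concavity axiom, $\langle e_5, e_{14}, e_{14}\rangle$, $\langle e_6, e_{13}, e_{14}\rangle$, $\langle e_9, e_{10}, e_{14}\rangle$, and $\langle e_9, e_{11}, e_{13}\rangle$ are all equal to 1. By the Index-Zero axiom, $\langle e_7, e_{13}, e_{13}\rangle$, $\langle e_{10}, e_{10}, e_{13}\rangle$, and $\langle e_{11}, e_{11}, e_{11}\rangle$ are all equal to $-2$. The correlator $\langle z e_8, e_{11}, e_{14}\rangle$ may be non-zero, and the Composition axiom can be used to compute it. However, using associativity of the product, we can compute the ring structure on $\mathcal{H}_{S_{11},G}$ without it.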


M. Krawitz, N. Priddis, P. Acosta, N. Bergin, H. Rathnakumara

Consider the map $\varphi : \mathbb{C}[X, Y, Z] \to \mathcal{H}_{S_{11},G}$, defined by $X \mapsto e_9$, $Y \mapsto e_{13}$, and $Z \mapsto e_{14}$, extending by $\mathbb{C}$-linearity and multiplicativity. One can check directly that this map is surjective.

Our correlators tell us that $e_{13}^2 = -2e_9$, $e_{14}^4 = -2e_{13} e_9$, and $e_{13} e_{14} * e_{14}^2 = 0$. Hence $(2X + Y^2,\ 2XY + Z^4,\ YZ^3) \subset \ker\varphi$. Since $\mathbb{C}[X, Y, Z]/(2X + Y^2,\ 2XY + Z^4,\ YZ^3) = \mathcal{Q}_{S_{11}^T}$ has dimension 13, we deduce the inclusion is in fact equality, and $\mathcal{H}_{S_{11},G} \cong \mathcal{Q}_{S_{11}^T}$.
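The $|G| \cdot \deg_W$ row of a sector table can be recomputed for the sectors whose fixed locus is trivial. The sketch below is an illustration under an assumption not restated in this section: it takes the degree of $e_k$ to be $2\sum_i(\Theta_i - q_i)$, where $\Theta_i$ is the fractional part of $k q_i$. The function name `sector_degrees` is hypothetical, and sectors with a non-trivial fixed locus (such as $z e_8$ for $S_{11}$) are deliberately skipped.

```python
def sector_degrees(weights_num, order):
    """Return {k: |G| * deg_W(e_k)} for sectors J^k with trivial fixed locus.

    weights_num holds the numerators |G| * q_i; order is |G|.
    """
    table = {}
    for k in range(1, order):
        theta = [(k * w) % order for w in weights_num]
        if 0 in theta:
            # Some coordinate is fixed by J^k: not handled by this sketch.
            continue
        table[k] = 2 * sum(t - w for t, w in zip(theta, weights_num))
    return table


# S11: q = (5/16, 4/16, 6/16), G = <J> of order 16.
s11_table = sector_degrees([5, 4, 6], 16)
print(s11_table)
```

Under that assumption the output agrees with the twelve trivial-fixed-locus columns of the $S_{11}$ table.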

$S_{12} = x^2 z + y z^2 + x y^3$

$J = (2xz + y^3,\ z^2 + 3xy^2,\ x^2 + 2yz)$, $G = \langle J\rangle = \mathbb{Z}/13\mathbb{Z}$,

$q_x = \tfrac{4}{13}$, $q_y = \tfrac{3}{13}$, $q_z = \tfrac{5}{13}$, $\hat{c} = \tfrac{15}{13}$,

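$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^3_{xyz} & \text{if } k = 0 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, y, x, z, y^2, xy, yz, xz, xy^2, y^2 z, xyz, xy^2 z \rangle, & \mu = 12 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{12}{c}}
k & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\
|G| \cdot \deg_W & 0 & 24 & 22 & 20 & 18 & 16 & 14 & 12 & 10 & 8 & 6 & 30 \\
\text{invariants} & 1 & e_2 & e_3 & e_4 & e_5 & e_6 & e_7 & e_8 & e_9 & e_{10} & e_{11} & e_{12}
\end{array}$$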

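Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{12}\rangle$, $\langle 1, e_2, e_{11}\rangle$, $\langle 1, e_3, e_{10}\rangle$, $\langle 1, e_4, e_9\rangle$, $\langle 1, e_5, e_8\rangle$, and $\langle 1, e_6, e_7\rangle$ are all equal to 1. By the Concavity axiom, $\langle e_5, e_{11}, e_{11}\rangle$, $\langle e_6, e_{10}, e_{11}\rangle$, $\langle e_7, e_9, e_{11}\rangle$, and $\langle e_8, e_9, e_{10}\rangle$ are all equal to 1. By the Index-Zero axiom, $\langle e_7, e_{10}, e_{10}\rangle = -2$, $\langle e_8, e_8, e_{11}\rangle = -2$, and $\langle e_9, e_9, e_9\rangle = -3$.

Consider the map $\varphi : \mathbb{C}[X, Y, Z] \to \mathcal{H}_{S_{12},G}$, defined by $X \mapsto e_9$, $Y \mapsto e_{10}$, and $Z \mapsto e_{11}$, extending by $\mathbb{C}$-linearity and multiplicativity. One can check directly that this map is surjective.

Our correlators tell us that $e_9^2 = -3e_4 = -3e_{10} e_{11}^2$, $e_{10}^2 = -2e_6 = -2e_9 e_{11}$, and $e_{11}^3 = -2e_5 = -2e_9 e_{10}$. Hence $(Y^2 + 2XZ,\ Z^3 + 2XY,\ X^2 + 3YZ^2) \subset \ker\varphi$. Since $\mathbb{C}[X, Y, Z]/(Y^2 + 2XZ,\ Z^3 + 2XY,\ X^2 + 3YZ^2) = \mathcal{Q}_{S_{12}^T}$ has dimension 12, matching the twelve sectors in the table above, we deduce the inclusion is in fact equality, and $\mathcal{H}_{S_{12},G} \cong \mathcal{Q}_{S_{12}^T}$.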
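The dimension of the Milnor ring of a transposed polynomial can be cross-checked numerically. The sketch below is not the paper's method: it assumes the standard formula $\mu = \prod_i (1/q_i - 1)$ for a quasi-homogeneous isolated singularity with weights $q_i$, and obtains $W^T$ by transposing the exponent matrix. The helper names (`det`, `weights`, `milnor_number`, `transpose`) are hypothetical.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))


def weights(exponents):
    """Cramer's rule for A q = (1, ..., 1)^T with exact fractions."""
    d = det(exponents)
    out = []
    for i in range(len(exponents)):
        replaced = [row[:i] + [1] + row[i + 1:] for row in exponents]
        out.append(Fraction(det(replaced), d))
    return out


def milnor_number(exponents):
    """mu = prod(1/q_i - 1), assumed valid for these singularities."""
    mu = Fraction(1)
    for q in weights(exponents):
        mu *= 1 / q - 1
    return mu


def transpose(exponents):
    return [list(col) for col in zip(*exponents)]


# S12 = x^2 z + y z^2 + x y^3; rows are monomials, columns are (x, y, z).
S12 = [[2, 0, 1], [0, 1, 2], [1, 3, 0]]
mu_s12 = milnor_number(S12)
mu_s12_t = milnor_number(transpose(S12))
print(mu_s12, mu_s12_t)
```

Both values come out equal to the number of sectors listed in the $S_{12}$ table.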

$U_{12} = x^3 + y^3 + z^4$

By Axiom 8, the Mirror Symmetry Conjecture holds here because it holds for the simple singularities $A_2 = x^3$ and $A_3 = z^4$.


$Z_{11} = x^3 y + y^5$

$J = (3x^2 y,\ x^3 + 5y^4)$, $q_x = \tfrac{4}{15}$, $q_y = \tfrac{3}{15}$, $\hat{c} = \tfrac{16}{15}$, $G = \langle J\rangle = \mathbb{Z}/15\mathbb{Z}$,

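$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^2_{xy} & \text{if } k = 0 \\ \mathbb{C}_y & \text{if } k = 5, 10 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, y, x, y^2, xy, x^2, y^3, xy^2, x^3, xy^3, x^4 \rangle, & \mu = 11 \\ \langle 1, y, y^2, y^3 \rangle, & \mu = 4 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{13}{c}}
k & 0 & 1 & 2 & 3 & 4 & 6 & 7 & 8 & 9 & 11 & 12 & 13 & 14 \\
|G| \cdot \deg_W & 16 & 0 & 14 & 28 & 12 & 10 & 24 & 8 & 22 & 20 & 4 & 18 & 32 \\
\text{invariants} & x^2 e_0 & 1 & e_2 & e_3 & e_4 & e_6 & e_7 & e_8 & e_9 & e_{11} & e_{12} & e_{13} & e_{14}
\end{array}$$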

Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{14}\rangle$, $\langle 1, e_2, e_{13}\rangle$, $\langle 1, e_3, e_{12}\rangle$, $\langle 1, e_4, e_{11}\rangle$, $\langle 1, e_6, e_9\rangle$, and $\langle 1, e_7, e_8\rangle$ are all equal to 1, and $\langle 1, x^2 e_0, x^2 e_0\rangle = -\tfrac{1}{3}$. By the Concavity axiom, $\langle e_2, e_2, e_{12}\rangle$, $\langle e_2, e_6, e_8\rangle$, $\langle e_4, e_6, e_6\rangle$, $\langle e_6, e_{12}, e_{13}\rangle$, $\langle e_7, e_{12}, e_{12}\rangle$, and $\langle e_8, e_{11}, e_{12}\rangle$ are all equal to 1. By the Index-Zero axiom, $\langle e_4, e_4, e_8\rangle = -3$. The Composition axiom can be used to compute $\langle x^2 e_0, e_4, e_{12}\rangle$ and $\langle x^2 e_0, e_8, e_8\rangle$, but we do not need these to establish the desired isomorphism.

Consider the map $\varphi : \mathbb{C}[X, Y] \to \mathcal{H}_{Z_{11},G}$, defined by $X \mapsto e_6$ and $Y \mapsto e_{12}$, extending by $\mathbb{C}$-linearity and multiplicativity. One can check directly that this map is surjective.

Our correlators tell us that $e_{12}^3 * e_{12}^2 = -3e_{11} = -3e_6^2$ and $e_6 e_{12}^4 = 0$. Hence $(3X^2 + Y^5,\ 5XY^4) \subset \ker\varphi$. Since $\mathbb{C}[X, Y]/(3X^2 + Y^5,\ 5XY^4) = \mathcal{Q}_{Z_{11}^T}$ has dimension 13, we deduce the inclusion is in fact equality, and $\mathcal{H}_{Z_{11},G} \cong \mathcal{Q}_{Z_{11}^T}$.
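The correlator lists can be sanity-checked against the sector table. The sketch below is a consistency check only, under two assumptions that are our reading rather than a quotation of Axioms 1 and 2: for a genus-zero three-point correlator with $G = \langle J \rangle$, the sector indices sum to $1$ modulo $|G|$, and the $|G|$-scaled W-degrees sum to $2\hat{c}\,|G|$. The function name `passes_selection_rules` is hypothetical; the degree table and the index triples are copied from the $Z_{11}$ data above.

```python
def passes_selection_rules(ks, degrees, order, c_hat_num):
    """Check the assumed index and degree conditions for <e_k1, e_k2, e_k3>.

    ks: sector indices; degrees: {k: |G| * deg_W}; order: |G|;
    c_hat_num: numerator of c_hat over |G|.
    """
    index_ok = sum(ks) % order == 1
    degree_ok = sum(degrees[k] for k in ks) == 2 * c_hat_num
    return index_ok and degree_ok


# Z11 sector table (k -> |G| * deg_W), with |G| = 15 and c_hat = 16/15.
Z11_DEGREES = {0: 16, 1: 0, 2: 14, 3: 28, 4: 12, 6: 10, 7: 24,
               8: 8, 9: 22, 11: 20, 12: 4, 13: 18, 14: 32}

# Index triples of some of the Z11 correlators listed above.
listed = [(1, 1, 14), (2, 2, 12), (2, 6, 8), (4, 6, 6), (6, 12, 13),
          (7, 12, 12), (8, 11, 12), (4, 4, 8), (0, 4, 12), (0, 8, 8)]
results = {ks: passes_selection_rules(ks, Z11_DEGREES, 15, 16) for ks in listed}
print(results)
```

A triple such as `(2, 2, 2)` fails both conditions, which is consistent with it not appearing in the list.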

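$Z_{12} = x^3 y + x y^4$

$J = (3x^2 y + y^4,\ x^3 + 4xy^3)$, $q_x = \tfrac{3}{11}$, $q_y = \tfrac{2}{11}$, $\hat{c} = \tfrac{12}{11}$, $G = \langle J\rangle = \mathbb{Z}/11\mathbb{Z}$,

$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^2_{xy} & \text{if } k = 0 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, y, x, y^2, xy, y^3, x^2, xy^2, x^2 y, xy^3, x^2 y^2, x^2 y^3 \rangle, & \mu = 12 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{11}{c}}
k & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\
|G| \cdot \deg_W & 12 & 0 & 10 & 20 & 8 & 18 & 6 & 16 & 4 & 14 & 24 \\
\text{invariants} & x^2 e_0,\ y^3 e_0 & 1 & e_2 & e_3 & e_4 & e_5 & e_6 & e_7 & e_8 & e_9 & e_{10}
\end{array}$$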


Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{10}\rangle$, $\langle 1, e_2, e_9\rangle$, $\langle 1, e_3, e_8\rangle$, $\langle 1, e_4, e_7\rangle$, and $\langle 1, e_5, e_6\rangle$ are all equal to 1. Also, $\langle 1, x^2 e_0, x^2 e_0\rangle = -\tfrac{4}{11}$, $\langle 1, x^2 e_0, y^3 e_0\rangle = \tfrac{1}{11}$, and $\langle 1, y^3 e_0, y^3 e_0\rangle = -\tfrac{3}{11}$. By the Concavity axiom, $\langle e_2, e_2, e_8\rangle$, $\langle e_2, e_4, e_6\rangle$, $\langle e_6, e_8, e_9\rangle$, and $\langle e_7, e_8, e_8\rangle$ are all equal to 1. By the Index-Zero axiom, $\langle e_4, e_4, e_4\rangle = -3$. The correlators $\langle x^2 e_0, e_4, e_8\rangle$, $\langle y^3 e_0, e_4, e_8\rangle$, $\langle x^2 e_0, e_6, e_6\rangle$, and $\langle y^3 e_0, e_6, e_6\rangle$ are also non-zero, and we use the Composition axiom to extract the necessary information (although we avoid computing individual correlators directly). Note

$$e_6^2 = \sum_{\mu, \nu \in \{x^2 e_0,\, y^3 e_0\}} \langle e_6, e_6, \mu\rangle\, \eta^{\mu,\nu}\, \nu,$$

$$e_6^3 = \sum_{\mu, \nu \in \{x^2 e_0,\, y^3 e_0\}} \langle e_6, e_6, \mu\rangle\, \eta^{\mu,\nu}\, \langle \nu, e_6, e_6\rangle\, e_5 .$$

By the Composition axiom, the coefficient of $e_5$ here is just the value of the four-pointed class $\Lambda^{Z_{12}}_{0,4}(e_6, e_6, e_6, e_6)$. This four-pointed class has codimension zero, and $l_x = -2$ and $l_y = 0$, so its value is the $y$-degree of the Witten map, which is $-4$.

Consider the map $\varphi : \mathbb{C}[X, Y] \to \mathcal{H}_{Z_{12},G}$, defined by $X \mapsto e_6$ and $Y \mapsto e_8$, extending by $\mathbb{C}$-linearity and multiplicativity. One can check directly that this maps onto each twisted sector. To prove surjectivity, we need to check that it also maps onto the untwisted sector. This is equivalent to the linear independence of $e_6^2$ and $e_8^3$.

We note the following identities, which are direct consequences of the three-point correlator values cited above:

$$e_6^2 * e_6 = -4e_5, \qquad e_8^3 * e_6 = e_8^2 * (e_8 e_6) = e_4 * e_2 = e_5,$$
$$e_6^2 * e_8 = e_6 * (e_6 e_8) = e_6 * e_2 = e_7, \qquad e_8^3 * e_8 = e_8^2 * e_8^2 = e_4 * e_4 = -3e_7 .$$

Putting $\mu = e_6^2 + 4e_8^3$ and $\nu = 3e_6^2 + e_8^3$, we see that

$$\mu * e_6 = 0, \quad \mu * e_8 = -11e_7, \quad \nu * e_6 = -11e_5, \quad \nu * e_8 = 0,$$

and conclude that $\mu$ and $\nu$ are linearly independent combinations of $e_6^2$ and $e_8^3$, yielding surjectivity of the map $\varphi$.

The above identities also show $(3X^2 Y + Y^4,\ X^3 + 4XY^3) \subset \ker\varphi$. Since $\mathbb{C}[X, Y]/(3X^2 Y + Y^4,\ X^3 + 4XY^3) = \mathcal{Q}_{Z_{12}^T}$ has dimension 12, we deduce the inclusion is in fact equality, and $\mathcal{H}_{Z_{12},G} \cong \mathcal{Q}_{Z_{12}^T}$.
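The linear-independence step for $Z_{12}$ is plain arithmetic on the four product identities. As a small illustrative sketch (the encoding is ours, not the paper's), represent each of $e_6^2$ and $e_8^3$ by the pair (coefficient of $e_5$ in its product with $e_6$, coefficient of $e_7$ in its product with $e_8$), and form the combinations $\mu$ and $\nu$.

```python
# (coeff of e5 in v * e6, coeff of e7 in v * e8) for v = e6^2 and v = e8^3,
# read off from the identities e6^2*e6 = -4 e5, e6^2*e8 = e7,
# e8^3*e6 = e5, e8^3*e8 = -3 e7.
E6_SQUARED = (-4, 1)
E8_CUBED = (1, -3)


def combine(s, t):
    """Image of s * e6^2 + t * e8^3 under the two multiplication maps."""
    return tuple(s * a + t * b for a, b in zip(E6_SQUARED, E8_CUBED))


mu = combine(1, 4)   # mu = e6^2 + 4 e8^3
nu = combine(3, 1)   # nu = 3 e6^2 + e8^3

# A non-zero determinant means e6^2 and e8^3 are linearly independent.
determinant = E6_SQUARED[0] * E8_CUBED[1] - E6_SQUARED[1] * E8_CUBED[0]
print(mu, nu, determinant)
```

The two images are supported on different basis elements, so neither combination is a multiple of the other.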


$Z_{13} = x^3 y + y^6$

$J = (3x^2 y,\ x^3 + 6y^5)$, $q_x = \tfrac{5}{18}$, $q_y = \tfrac{3}{18}$, $\hat{c} = \tfrac{20}{18}$, $G = \langle J\rangle = \mathbb{Z}/18\mathbb{Z}$,

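$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^2_{xy} & \text{if } k = 0 \\ \mathbb{C}_y & \text{if } k = 6, 12 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, y, x, y^2, xy, y^3, x^2, xy^2, y^4, xy^3, y^5, xy^4, xy^5 \rangle, & \mu = 13 \\ \langle 1, y, y^2, y^3, y^4 \rangle, & \mu = 5 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{16}{c}}
k & 0 & 1 & 2 & 3 & 4 & 5 & 7 & 8 & 9 & 10 & 11 & 13 & 14 & 15 & 16 & 17 \\
|G| \cdot \deg_W & 20 & 0 & 16 & 32 & 12 & 28 & 24 & 4 & 20 & 36 & 16 & 12 & 28 & 8 & 24 & 40 \\
\text{invariants} & x^2 e_0 & 1 & e_2 & e_3 & e_4 & e_5 & e_7 & e_8 & e_9 & e_{10} & e_{11} & e_{13} & e_{14} & e_{15} & e_{16} & e_{17}
\end{array}$$

Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{17}\rangle$, $\langle 1, e_2, e_{16}\rangle$, $\langle 1, e_3, e_{15}\rangle$, $\langle 1, e_4, e_{14}\rangle$, $\langle 1, e_5, e_{13}\rangle$, $\langle 1, e_7, e_{11}\rangle$, $\langle 1, e_8, e_{10}\rangle$, and $\langle 1, e_9, e_9\rangle$ are all equal to 1, and $\langle 1, x^2 e_0, x^2 e_0\rangle = -\tfrac{1}{3}$. By the Concavity axiom, $\langle e_2, e_2, e_{15}\rangle$, $\langle e_2, e_4, e_{13}\rangle$, $\langle e_2, e_8, e_9\rangle$, $\langle e_3, e_8, e_8\rangle$, $\langle e_4, e_7, e_8\rangle$, $\langle e_7, e_{15}, e_{15}\rangle$, $\langle e_8, e_{13}, e_{16}\rangle$, $\langle e_8, e_{14}, e_{15}\rangle$, $\langle e_9, e_{13}, e_{15}\rangle$, and $\langle e_{11}, e_{13}, e_{13}\rangle$ are all equal to 1. By the Index-Zero axiom, $\langle e_4, e_4, e_{11}\rangle$ and $\langle e_{11}, e_{11}, e_{15}\rangle$ are both equal to $-3$. The Composition axiom can be used to compute $\langle x^2 e_0, e_4, e_{15}\rangle$ and $\langle x^2 e_0, e_8, e_{11}\rangle$, but we do not need these to establish the desired isomorphism.

Consider the map $\varphi : \mathbb{C}[X, Y] \to \mathcal{H}_{Z_{13},G}$, defined by $X \mapsto e_{13}$ and $Y \mapsto e_8$, extending by $\mathbb{C}$-linearity and multiplicativity. One can check directly that this map is surjective.

Our correlators tell us that $e_8^3 * e_8^3 = -3e_7 = -3e_{13}^2$ and $e_{13} e_8^5 = 0$. Hence $(3X^2 + Y^6,\ 6XY^5) \subset \ker\varphi$. Since $\mathbb{C}[X, Y]/(3X^2 + Y^6,\ 6XY^5) = \mathcal{Q}_{Z_{13}^T}$ has dimension 16, we deduce the inclusion is in fact equality, and $\mathcal{H}_{Z_{13},G} \cong \mathcal{Q}_{Z_{13}^T}$.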

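$W_{12} = x^4 + y^5$

By Axiom 8, the conjecture holds for $W_{12}$ because it holds for the simple singularities $A_3 = x^4$ and $A_4 = y^5$.

$W_{13} = x^4 + x y^4$

$J = (4x^3 + y^4,\ 4xy^3)$, $q_x = \tfrac{4}{16}$, $q_y = \tfrac{3}{16}$, $\hat{c} = \tfrac{18}{16}$, $G = \langle J\rangle = \mathbb{Z}/16\mathbb{Z}$,

$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^2_{xy} & \text{if } k = 0 \\ \mathbb{C}_x & \text{if } k = 4, 8, 12 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, y, x, y^2, xy, x^2, y^3, xy^2, x^2 y, y^4, y^5, y^6 \rangle, & \mu = 13 \\ \langle 1, x, x^2 \rangle, & \mu = 3 \\ \langle 1 \rangle \end{cases}.$$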


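$$\begin{array}{l|*{13}{c}}
k & 0 & 1 & 2 & 3 & 5 & 6 & 7 & 9 & 10 & 11 & 13 & 14 & 15 \\
|G| \cdot \deg_W & 18 & 0 & 14 & 28 & 24 & 6 & 20 & 16 & 30 & 12 & 8 & 22 & 36 \\
\text{invariants} & y^3 e_0 & 1 & e_2 & e_3 & e_5 & e_6 & e_7 & e_9 & e_{10} & e_{11} & e_{13} & e_{14} & e_{15}
\end{array}$$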

Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{15}\rangle$, $\langle 1, e_2, e_{14}\rangle$, $\langle 1, e_3, e_{13}\rangle$, $\langle 1, e_5, e_{11}\rangle$, $\langle 1, e_6, e_{10}\rangle$, and $\langle 1, e_7, e_9\rangle$ are all equal to 1, and $\langle 1, y^3 e_0, y^3 e_0\rangle = -\tfrac{1}{4}$. By the Concavity axiom, $\langle e_2, e_2, e_{13}\rangle$, $\langle e_2, e_6, e_9\rangle$, $\langle e_5, e_6, e_6\rangle$, $\langle e_6, e_{13}, e_{14}\rangle$, $\langle e_7, e_{13}, e_{13}\rangle$, and $\langle e_9, e_{11}, e_{13}\rangle$ are all equal to 1. By the Index-Zero axiom, $\langle e_{11}, e_{11}, e_{11}\rangle = -4$. The correlator $\langle y^3 e_0, e_6, e_{11}\rangle$ may be non-zero, and the Composition axiom can be used to compute it. However, using associativity of the product, we can compute the ring structure on $\mathcal{H}_{W_{13},G}$ without it.

Consider the map $\varphi : \mathbb{C}[X, Y] \to \mathcal{H}_{W_{13},G}$, defined by $X \mapsto e_6$ and $Y \mapsto e_{13}$, extending by $\mathbb{C}$-linearity and multiplicativity. One can check directly that this map is surjective.

Our correlators tell us that $e_6^2 * e_6^2 = -4e_5 = -4e_{13}^3$, and $e_6^2 * (e_6 e_{13}) = 0$. Hence $(4X^3 Y,\ X^4 + 4Y^3) \subset \ker\varphi$. Since $\mathbb{C}[X, Y]/(4X^3 Y,\ X^4 + 4Y^3) = \mathcal{Q}_{W_{13}^T}$ has dimension 13, we deduce the inclusion is in fact equality, and $\mathcal{H}_{W_{13},G} \cong \mathcal{Q}_{W_{13}^T}$.
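As a cross-check of the Pairing value for $W_{13}$, one can evaluate a residue pairing on the Milnor ring directly. The derivation below is a sketch under an assumption not restated in this passage: that the pairing on the untwisted sector is the residue pairing of the Milnor ring, normalized so that the Hessian determinant has residue $\mu$.

```latex
\begin{aligned}
W &= x^4 + x y^4, \qquad \partial_x W = 4x^3 + y^4, \qquad \partial_y W = 4 x y^3, \\
\operatorname{hess}(W) &= (12x^2)(12 x y^2) - (4y^3)^2 = 144\, x^3 y^2 - 16\, y^6, \\
y^6 &= y^2 \cdot y^4 \equiv -4\, x^3 y^2 \pmod{(\partial_x W,\ \partial_y W)}, \\
\operatorname{hess}(W) &\equiv 208\, x^3 y^2 = 13 \cdot 16\, x^3 y^2, \\
\operatorname{Res}(x^3 y^2) &= \tfrac{1}{16} \quad \text{since } \operatorname{Res}(\operatorname{hess} W) = \mu = 13, \\
\langle y^3, y^3 \rangle &= \operatorname{Res}(y^6) = -4 \cdot \tfrac{1}{16} = -\tfrac{1}{4}.
\end{aligned}
```

Under that normalization the result agrees with the value $-\tfrac{1}{4}$ listed for $\langle 1, y^3 e_0, y^3 e_0\rangle$.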

$E_{12} = x^3 + y^7$

By Axiom 8, the Mirror Symmetry Conjecture holds for $E_{12}$ because it holds for the simple singularities $A_2 = x^3$ and $A_6 = y^7$.

$E_{13} = x^3 + x y^5$

$J = (3x^2 + y^5,\ 5xy^4)$, $q_x = \tfrac{5}{15}$, $q_y = \tfrac{2}{15}$, $\hat{c} = \tfrac{16}{15}$, $G = \langle J\rangle = \mathbb{Z}/15\mathbb{Z}$,

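$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^2_{xy} & \text{if } k = 0 \\ \mathbb{C}_x & \text{if } k = 3, 6, 9, 12 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, y, y^2, x, y^3, xy, y^4, xy^2, y^5, xy^3, y^6, y^7, y^8 \rangle, & \mu = 13 \\ \langle 1, x \rangle, & \mu = 2 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{11}{c}}
k & 0 & 1 & 2 & 4 & 5 & 7 & 8 & 10 & 11 & 13 & 14 \\
|G| \cdot \deg_W & 16 & 0 & 14 & 12 & 26 & 24 & 8 & 6 & 20 & 18 & 32 \\
\text{invariants} & y^4 e_0 & 1 & e_2 & e_4 & e_5 & e_7 & e_8 & e_{10} & e_{11} & e_{13} & e_{14}
\end{array}$$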

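Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{14}\rangle$, $\langle 1, e_2, e_{13}\rangle$, $\langle 1, e_4, e_{11}\rangle$, $\langle 1, e_5, e_{10}\rangle$, and $\langle 1, e_7, e_8\rangle$ are all equal to 1, and $\langle 1, y^4 e_0, y^4 e_0\rangle = -\tfrac{1}{5}$. By the Concavity axiom, $\langle e_2, e_4, e_{10}\rangle$, $\langle e_4, e_4, e_8\rangle$, $\langle e_8, e_{10}, e_{13}\rangle$, and $\langle e_{10}, e_{10}, e_{11}\rangle$ are all equal to 1. By the Composition axiom, we have

$$-5 = \langle e_8, e_8, y^4 e_0\rangle\, \eta^{y^4 e_0,\, y^4 e_0}\, \langle y^4 e_0, e_8, e_8\rangle,$$

so $\langle e_8, e_8, y^4 e_0\rangle = \pm 1$.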


Consider the map $\varphi : \mathbb{C}[X, Y] \to \mathcal{H}_{E_{13},G}$, defined by $X \mapsto e_8$ and $Y \mapsto e_{10}$, extending by $\mathbb{C}$-linearity and multiplicativity. One can check directly that this map is surjective.

Our correlators tell us that $e_8 * e_8 e_{10} = 0$ and $e_8^2 * e_8 = -5e_7 = -5e_{10}^4$. Hence $(3X^2 Y,\ X^3 + 5Y^4) \subset \ker\varphi$. Since $\mathbb{C}[X, Y]/(3X^2 Y,\ X^3 + 5Y^4) = \mathcal{Q}_{E_{13}^T}$ has dimension 11, we deduce the inclusion is in fact equality, and $\mathcal{H}_{E_{13},G} \cong \mathcal{Q}_{E_{13}^T}$.

$E_{14} = x^3 + y^8$

By Axiom 8, the Mirror Symmetry Conjecture holds for $E_{14}$ because it holds for the simple singularities $A_2 = x^3$ and $A_7 = y^8$.

2.2. Bimodal singularities.

We now turn to the fourteen exceptional bimodal families listed in Arnol’d [1]. We note that the conjecture has already been shown for $E_{18}$, $E_{20}$, $U_{16}$, $W_{18}$, $Q_{16}$, and $Q_{18}$ in [3], as these are sums of simple singularities. Also, $E_{19}$ was constructed in the Introduction, so in this section we will consider the eight remaining families, together with their mirror partners. In [3], the construction of the FJRW-ring does not include singularities with a weight greater than or equal to 1/2. In particular, it has not been verified that the required compact moduli space exists in these cases, and that the three-point correlators satisfy the proper axioms. Because of this, we will not include the calculations for the transposes of four particular bimodal singularities, namely $Q_{17}^T$, $S_{17}^T$, $Q_{2,0}^T$ and $S_{1,0}^T$.

$Z_{17} = x^3 y + y^8$

$J = (3x^2 y,\ x^3 + 8y^7)$, $q_x = \tfrac{7}{24}$, $q_y = \tfrac{3}{24}$, $\hat{c} = \tfrac{28}{24}$, $G = \langle J\rangle \cong \mathbb{Z}/24\mathbb{Z}$,

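$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^2 & \text{if } k = 0 \\ \mathbb{C}_y & \text{if } 8 \mid k,\ k \neq 0 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, x, x^2, y, \ldots, y^7, xy, \ldots, xy^7 \rangle, & \mu = 17 \\ \langle 1, y, \ldots, y^6 \rangle, & \mu = 7 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{11}{c}}
k & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 9 & 10 & 11 \\
|G| \cdot \deg_W & 28 & 0 & 20 & 40 & 12 & 32 & 52 & 24 & 16 & 36 & 8 \\
\text{invariants} & x^2 e_0 & 1 & e_2 & e_3 & e_4 & e_5 & e_6 & e_7 & e_9 & e_{10} & e_{11} \\
\hline
k & 12 & 13 & 14 & 15 & 17 & 18 & 19 & 20 & 21 & 22 & 23 \\
|G| \cdot \deg_W & 28 & 48 & 20 & 12 & 32 & 4 & 24 & 44 & 16 & 36 & 56 \\
\text{invariants} & e_{12} & e_{13} & e_{14} & e_{15} & e_{17} & e_{18} & e_{19} & e_{20} & e_{21} & e_{22} & e_{23}
\end{array}$$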

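Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{23}\rangle$, $\langle 1, e_2, e_{22}\rangle$, $\langle 1, e_3, e_{21}\rangle$, $\langle 1, e_4, e_{20}\rangle$, $\langle 1, e_5, e_{19}\rangle$, $\langle 1, e_6, e_{18}\rangle$, $\langle 1, e_7, e_{17}\rangle$, $\langle 1, e_9, e_{15}\rangle$, $\langle 1, e_{10}, e_{14}\rangle$, $\langle 1, e_{11}, e_{13}\rangle$, and $\langle 1, e_{12}, e_{12}\rangle$ are equal to 1, and $\langle x^2 e_0, x^2 e_0, 1\rangle = -1/3$.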


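By the Concavity axiom, $\langle e_2, e_2, e_{21}\rangle$, $\langle e_2, e_4, e_{19}\rangle$, $\langle e_2, e_5, e_{18}\rangle$, $\langle e_2, e_9, e_{14}\rangle$, $\langle e_2, e_{11}, e_{12}\rangle$, $\langle e_3, e_4, e_{18}\rangle$, $\langle e_3, e_{11}, e_{11}\rangle$, $\langle e_4, e_4, e_{17}\rangle$, $\langle e_4, e_9, e_{12}\rangle$, $\langle e_4, e_{10}, e_{11}\rangle$, $\langle e_5, e_9, e_{11}\rangle$, $\langle e_7, e_9, e_9\rangle$, $\langle e_9, e_{18}, e_{22}\rangle$, $\langle e_9, e_{19}, e_{21}\rangle$, $\langle e_{10}, e_{18}, e_{21}\rangle$, $\langle e_{11}, e_{17}, e_{21}\rangle$, $\langle e_{11}, e_{18}, e_{20}\rangle$, $\langle e_{11}, e_{19}, e_{19}\rangle$, $\langle e_{12}, e_{18}, e_{19}\rangle$, $\langle e_{13}, e_{18}, e_{18}\rangle$, and $\langle e_{14}, e_{17}, e_{18}\rangle$ are all equal to 1. By the Index Zero axiom, $\langle e_4, e_7, e_{14}\rangle = \langle e_7, e_7, e_{11}\rangle = \langle e_7, e_{21}, e_{21}\rangle = \langle e_{14}, e_{14}, e_{21}\rangle = -3$. By the Composition axiom, $\langle x^2 e_0, e_4, e_{21}\rangle = \langle x^2 e_0, e_7, e_{18}\rangle = \langle x^2 e_0, e_{11}, e_{14}\rangle = \pm 1$.

Consider the map $\varphi : \mathbb{C}[X, Y] \to \mathcal{H}_{Z_{17},G}$, defined by $X \mapsto e_9$ and $Y \mapsto e_{18}$ and extending to a $\mathbb{C}$-algebra homomorphism. One can check directly that this map is surjective.

Our correlators tell us that $e_9^2 = e_{17}$, $e_{18}^8 = -3e_{17}$, and $e_9 e_{18}^7 = 0$. Hence $(3X^2 + Y^8,\ 8XY^7) \subset \ker\varphi$. Since $\mathbb{C}[X, Y]/(3X^2 + Y^8,\ 8XY^7) = \mathcal{Q}_{Z_{17}^T}$ has dimension 22, we deduce that the inclusion is equality, and so we have the isomorphism $\mathcal{H}_{Z_{17},G} \cong \mathcal{Q}_{Z_{17}^T}$.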

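$Z_{17}^T = x^3 + x y^8$

$J = (3x^2 + y^8,\ 8xy^7)$, $q_x = \tfrac{8}{24}$, $q_y = \tfrac{2}{24}$, $\hat{c} = \tfrac{28}{24}$, $\langle J\rangle \cong \mathbb{Z}/12\mathbb{Z}$, $G \cong \mathbb{Z}/24\mathbb{Z}$.

Since $J$ does not generate the maximal group of symmetries, we will use the generator $g = (\zeta^{16}, \zeta)$, where $\zeta^{24} = 1$. We will index our graded FJRW-ring by powers of $g$:

$$\operatorname{Fix} g^k = \begin{cases} \mathbb{C}^2 & \text{if } k = 0 \\ \mathbb{C}_x & \text{if } 3 \mid k,\ k \neq 0 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} g^k} = \begin{cases} \langle 1, x, x^2, y, \ldots, y^7, xy, \ldots, xy^6, x^2 y, \ldots, x^2 y^6 \rangle, & \mu = 22 \\ \langle 1, x \rangle, & \mu = 2 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{17}{c}}
k & 0 & 1 & 2 & 4 & 5 & 7 & 8 & 10 & 11 & 13 & 14 & 16 & 17 & 19 & 20 & 22 & 23 \\
|G| \cdot \deg_W & 28 & 14 & 0 & 20 & 6 & 26 & 12 & 32 & 18 & 38 & 24 & 44 & 30 & 50 & 36 & 56 & 42 \\
\text{invariants} & y^7 e_0 & e_1 & 1 & e_4 & e_5 & e_7 & e_8 & e_{10} & e_{11} & e_{13} & e_{14} & e_{16} & e_{17} & e_{19} & e_{20} & e_{22} & e_{23}
\end{array}$$

Potential non-zero correlators: By the Pairing axiom, $\langle 1, e_1, e_{23}\rangle$, $\langle 1, 1, e_{22}\rangle$, $\langle 1, e_4, e_{20}\rangle$, $\langle 1, e_5, e_{19}\rangle$, $\langle 1, e_7, e_{17}\rangle$, $\langle 1, e_8, e_{16}\rangle$, $\langle 1, e_{10}, e_{14}\rangle$, and $\langle 1, e_{11}, e_{13}\rangle$ are equal to 1, and $\langle y^7 e_0, y^7 e_0, 1\rangle = -1/8$. By the Concavity axiom, $\langle e_1, e_5, e_{20}\rangle$, $\langle e_1, e_8, e_{17}\rangle$, $\langle e_1, e_{11}, e_{14}\rangle$, $\langle e_4, e_5, e_{17}\rangle$, $\langle e_4, e_8, e_{14}\rangle$, $\langle e_4, e_{11}, e_{11}\rangle$, $\langle e_5, e_5, e_{16}\rangle$, $\langle e_5, e_7, e_{14}\rangle$, $\langle e_5, e_8, e_{13}\rangle$, $\langle e_5, e_{10}, e_{11}\rangle$, $\langle e_7, e_8, e_{11}\rangle$, and $\langle e_8, e_8, e_{10}\rangle$ are all equal to 1. By the Composition axiom, $\langle e_1, e_1, y^7 e_0\rangle = \pm 1$.

Consider the map $\varphi : \mathbb{C}[X, Y] \to \mathcal{H}_{Z_{17}^T,G}$, defined by $X \mapsto e_1$ and $Y \mapsto e_5$ and extending to a $\mathbb{C}$-algebra homomorphism. One can check directly that this map is surjective.

Our correlators tell us that $e_1^3 = -8e_{23}$, $e_5^7 = e_{23}$, and $e_1^2 e_5 = 0$. Hence $(3X^2 Y,\ X^3 + 8Y^7) \subset \ker\varphi$. Since $\mathbb{C}[X, Y]/(3X^2 Y,\ X^3 + 8Y^7) = \mathcal{Q}_{Z_{17}}$ has dimension 17, we deduce that the inclusion is equality, and so we have the isomorphism $\mathcal{H}_{Z_{17}^T,G} \cong \mathcal{Q}_{Z_{17}}$.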
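The claims about the generator $g$ for $Z_{17}^T$ can be verified mechanically. The following Python sketch (helper names `preserves` and `element_order` are hypothetical) encodes a diagonal symmetry $(\zeta^{a_1}, \ldots, \zeta^{a_n})$, with $\zeta$ a primitive $N$-th root of unity, by its exponent vector modulo $N$. It checks that $g$ fixes every monomial of $W$ and compares the orders of $g$ and $J$.

```python
from math import gcd


def preserves(gen, monomials, order):
    """True if diag(zeta^gen_i) fixes every monomial (given by its exponents)."""
    return all(sum(g * a for g, a in zip(gen, mono)) % order == 0
               for mono in monomials)


def element_order(gen, order):
    """Order of diag(zeta^gen_i), zeta a primitive order-th root of unity."""
    common = order
    for g in gen:
        common = gcd(common, g % order)
    return order // common


# Z17^T = x^3 + x y^8, with zeta^24 = 1.
monomials = [(3, 0), (1, 8)]
g = (16, 1)                 # g = (zeta^16, zeta)
J = (8, 2)                  # J = (e^{2 pi i q_x}, e^{2 pi i q_y}), q = (8/24, 2/24)
g_squared = tuple((2 * a) % 24 for a in g)

print(preserves(g, monomials, 24), element_order(g, 24), element_order(J, 24))
print(g_squared == J)
```

In this encoding $J = g^2$, which is why the identity element $1$ sits in the sector $k = 2$ of the table.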


$Z_{18} = x^3 y + x y^6$

$J = (3x^2 y + y^6,\ x^3 + 6xy^5)$, $q_x = \tfrac{5}{17}$, $q_y = \tfrac{2}{17}$, $\hat{c} = \tfrac{20}{17}$, $G = \langle J\rangle \cong \mathbb{Z}/17\mathbb{Z}$,

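$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^2 & \text{if } k = 0 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, x, x^2, y, \ldots, y^7, xy, \ldots, xy^5, x^2 y, \ldots, x^2 y^6 \rangle, & \mu = 18 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{17}{c}}
k & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 & 16 \\
|G| \cdot \deg_W & 20 & 0 & 14 & 28 & 8 & 22 & 36 & 16 & 30 & 10 & 24 & 4 & 18 & 32 & 12 & 26 & 40 \\
\text{invariants} & x^2 e_0,\ y^5 e_0 & 1 & e_2 & e_3 & e_4 & e_5 & e_6 & e_7 & e_8 & e_9 & e_{10} & e_{11} & e_{12} & e_{13} & e_{14} & e_{15} & e_{16}
\end{array}$$

Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{16}\rangle$, $\langle 1, e_2, e_{15}\rangle$, $\langle 1, e_3, e_{14}\rangle$, $\langle 1, e_4, e_{13}\rangle$, $\langle 1, e_5, e_{12}\rangle$, $\langle 1, e_6, e_{11}\rangle$, $\langle 1, e_7, e_{10}\rangle$, and $\langle 1, e_8, e_9\rangle$ are equal to 1, and $\langle x^2 e_0, x^2 e_0, 1\rangle = -6/17$, $\langle x^2 e_0, y^5 e_0, 1\rangle = 1/17$, $\langle y^5 e_0, y^5 e_0, 1\rangle = -3/17$. By the Concavity axiom, $\langle e_2, e_2, e_{14}\rangle$, $\langle e_2, e_4, e_{12}\rangle$, $\langle e_2, e_5, e_{11}\rangle$, $\langle e_2, e_7, e_9\rangle$, $\langle e_3, e_4, e_{11}\rangle$, $\langle e_4, e_4, e_{10}\rangle$, $\langle e_4, e_5, e_9\rangle$, $\langle e_9, e_{11}, e_{15}\rangle$, $\langle e_9, e_{12}, e_{14}\rangle$, $\langle e_{10}, e_{11}, e_{14}\rangle$, $\langle e_{11}, e_{11}, e_{13}\rangle$, and $\langle e_{11}, e_{12}, e_{12}\rangle$ are all equal to 1. By the Index Zero axiom, $\langle e_4, e_7, e_7\rangle = \langle e_7, e_{14}, e_{14}\rangle = -3$.

The remaining potentially non-zero three-point correlators cannot be determined from the axioms alone. These correlators are $\langle x^2 e_0, e_4, e_{14}\rangle$, $\langle y^5 e_0, e_4, e_{14}\rangle$, $\langle x^2 e_0, e_7, e_{11}\rangle$, $\langle y^5 e_0, e_7, e_{11}\rangle$, $\langle x^2 e_0, e_9, e_9\rangle$, and $\langle y^5 e_0, e_9, e_9\rangle$.

To establish the isomorphism nonetheless, we consider the homomorphism $\varphi : \mathbb{C}[X, Y] \to \mathcal{H}_{Z_{18},G}$, defined by $X \mapsto e_9$ and $Y \mapsto e_{11}$. It takes some work to show that $\varphi$ is surjective in this case. One can check that $\varphi$ maps onto the one-dimensional sectors. However, the sector corresponding to the identity in $G$ is two-dimensional. To prove surjectivity, it suffices to show that $e_{11}^5$ and $e_9^2$ are linearly independent.

We note the following identities, which are direct consequences of the three-point correlator values cited above:

$$e_9^2 * e_9 = -6e_8, \qquad e_9^2 * e_{11} = e_{10}, \qquad e_{11}^5 * e_9 = e_8, \qquad e_{11}^5 * e_{11} = -3e_{10}.$$

Putting $\mu = e_9^2 + 6e_{11}^5$ and $\nu = 3e_9^2 + e_{11}^5$, we see that

$$\mu * e_9 = 0, \quad \mu * e_{11} = -17e_{10}, \quad \nu * e_9 = -17e_8, \quad \nu * e_{11} = 0,$$

and conclude that $\mu$ and $\nu$ are linearly independent combinations of $e_9^2$ and $e_{11}^5$, yielding surjectivity of the map $\varphi$.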


The above identities also tell us that $(3X^2 Y + Y^6,\ X^3 + 6XY^5) \subset \ker\varphi$. Since $\mathbb{C}[X, Y]/(3X^2 Y + Y^6,\ X^3 + 6XY^5) = \mathcal{Q}_{Z_{18}}$ has dimension 18, we deduce that we have an isomorphism $\mathcal{Q}_{Z_{18}} \cong \mathcal{H}_{Z_{18},G}$.

$Z_{19} = x^3 y + y^9$

$J = (3x^2 y,\ x^3 + 9y^8)$, $q_x = \tfrac{8}{27}$, $q_y = \tfrac{3}{27}$, $\hat{c} = \tfrac{32}{27}$, $G = \langle J\rangle \cong \mathbb{Z}/27\mathbb{Z}$,

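$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^2_{xy} & \text{if } k = 0 \\ \mathbb{C}_y & \text{if } 9 \mid k,\ k \neq 0 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, x, x^2, y, \ldots, y^8, xy, xy^2, \ldots, xy^8 \rangle, & \mu = 19 \\ \langle 1, y, y^2, y^3, y^4, y^5, y^6, y^7 \rangle, & \mu = 8 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{17}{c}}
k & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 10 & 11 & 12 & 13 & 14 & 15 & 16 & 17 \\
|G| \cdot \deg_W & 32 & 0 & 22 & 44 & 12 & 34 & 56 & 24 & 46 & 36 & 4 & 26 & 48 & 16 & 38 & 60 & 28 \\
\text{invariants} & x^2 e_0 & 1 & e_2 & e_3 & e_4 & e_5 & e_6 & e_7 & e_8 & e_{10} & e_{11} & e_{12} & e_{13} & e_{14} & e_{15} & e_{16} & e_{17}
\end{array}$$

$$\begin{array}{l|*{8}{c}}
k & 19 & 20 & 21 & 22 & 23 & 24 & 25 & 26 \\
|G| \cdot \deg_W & 18 & 40 & 8 & 30 & 52 & 20 & 42 & 64 \\
\text{invariants} & e_{19} & e_{20} & e_{21} & e_{22} & e_{23} & e_{24} & e_{25} & e_{26}
\end{array}$$

Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{26}\rangle$, $\langle 1, e_2, e_{25}\rangle$, $\langle 1, e_3, e_{24}\rangle$, $\langle 1, e_4, e_{23}\rangle$, $\langle 1, e_5, e_{22}\rangle$, $\langle 1, e_6, e_{21}\rangle$, $\langle 1, e_7, e_{20}\rangle$, $\langle 1, e_8, e_{19}\rangle$, $\langle 1, e_{10}, e_{17}\rangle$, $\langle 1, e_{11}, e_{16}\rangle$, $\langle 1, e_{12}, e_{15}\rangle$, and $\langle 1, e_{13}, e_{14}\rangle$ are equal to 1, and $\langle x^2 e_0, x^2 e_0, 1\rangle = -1/3$. By the Concavity axiom, $\langle e_2, e_2, e_{24}\rangle$, $\langle e_2, e_4, e_{22}\rangle$, $\langle e_2, e_5, e_{21}\rangle$, $\langle e_2, e_7, e_{19}\rangle$, $\langle e_2, e_{11}, e_{15}\rangle$, $\langle e_2, e_{12}, e_{14}\rangle$, $\langle e_3, e_4, e_{21}\rangle$, $\langle e_3, e_{11}, e_{14}\rangle$, $\langle e_4, e_4, e_{20}\rangle$, $\langle e_4, e_5, e_{19}\rangle$, $\langle e_4, e_{10}, e_{14}\rangle$, $\langle e_4, e_{11}, e_{13}\rangle$, $\langle e_4, e_{12}, e_{12}\rangle$, $\langle e_5, e_{11}, e_{12}\rangle$, $\langle e_6, e_{11}, e_{11}\rangle$, $\langle e_7, e_{10}, e_{11}\rangle$, $\langle e_{10}, e_{21}, e_{24}\rangle$, $\langle e_{11}, e_{19}, e_{25}\rangle$, $\langle e_{11}, e_{20}, e_{24}\rangle$, $\langle e_{11}, e_{21}, e_{23}\rangle$, $\langle e_{11}, e_{22}, e_{22}\rangle$, $\langle e_{12}, e_{19}, e_{24}\rangle$, $\langle e_{12}, e_{21}, e_{22}\rangle$, $\langle e_{13}, e_{21}, e_{21}\rangle$, $\langle e_{14}, e_{17}, e_{24}\rangle$, $\langle e_{14}, e_{19}, e_{22}\rangle$, $\langle e_{14}, e_{20}, e_{21}\rangle$, $\langle e_{15}, e_{19}, e_{21}\rangle$, and $\langle e_{17}, e_{19}, e_{19}\rangle$ are all equal to 1. By the Index Zero axiom, $\langle e_4, e_7, e_{17}\rangle = \langle e_7, e_7, e_{14}\rangle = \langle e_7, e_{24}, e_{24}\rangle = \langle e_{17}, e_{17}, e_{21}\rangle = -3$. By the Composition axiom, $\langle x^2 e_0, e_4, e_{24}\rangle = \langle x^2 e_0, e_7, e_{21}\rangle = \langle x^2 e_0, e_{11}, e_{17}\rangle = \langle x^2 e_0, e_{14}, e_{14}\rangle = \pm 1$.

Consider the map $\varphi : \mathbb{C}[X, Y] \to \mathcal{H}_{Z_{19},G}$, defined by $X \mapsto e_{19}$ and $Y \mapsto e_{11}$ and extending to a $\mathbb{C}$-algebra homomorphism. One can check directly that this map is surjective.

Our correlators tell us that $e_{19}^2 = e_{10}$, $e_{11}^9 = -3e_{10}$, and $e_{19} e_{11}^8 = 0$. Hence $(3X^2 + Y^9,\ 9XY^8) \subset \ker\varphi$. Since $\mathbb{C}[X, Y]/(3X^2 + Y^9,\ 9XY^8) = \mathcal{Q}_{Z_{19}^T}$ has dimension 25, we deduce that the inclusion is equality, and so we have the isomorphism $\mathcal{H}_{Z_{19},G} \cong \mathcal{Q}_{Z_{19}^T}$.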

$Z_{19}^T = x^3 + x y^9$

$J = (3x^2 + y^9,\ 9xy^8)$, $q_x = \tfrac{9}{27}$, $q_y = \tfrac{2}{27}$, $\hat{c} = \tfrac{32}{27}$, $G = \langle J\rangle \cong \mathbb{Z}/27\mathbb{Z}$,


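$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^2 & \text{if } k = 0 \\ \mathbb{C}_x & \text{if } 3 \mid k \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, x, y, \ldots, y^8, xy, \ldots, xy^7, x^2 y, \ldots, x^2 y^7 \rangle, & \mu = 25 \\ \langle 1, x \rangle, & \mu = 2 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{19}{c}}
k & 0 & 1 & 2 & 4 & 5 & 7 & 8 & 10 & 11 & 13 & 14 & 16 & 17 & 19 & 20 & 22 & 23 & 25 & 26 \\
|G| \cdot \deg_W & 32 & 0 & 22 & 12 & 34 & 24 & 46 & 36 & 58 & 48 & 16 & 6 & 28 & 18 & 40 & 30 & 52 & 42 & 64 \\
\text{invariants} & y^8 e_0 & 1 & e_2 & e_4 & e_5 & e_7 & e_8 & e_{10} & e_{11} & e_{13} & e_{14} & e_{16} & e_{17} & e_{19} & e_{20} & e_{22} & e_{23} & e_{25} & e_{26}
\end{array}$$

Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{26}\rangle$, $\langle 1, e_2, e_{25}\rangle$, $\langle 1, e_4, e_{23}\rangle$, $\langle 1, e_5, e_{22}\rangle$, $\langle 1, e_7, e_{20}\rangle$, $\langle 1, e_8, e_{19}\rangle$, $\langle 1, e_{10}, e_{17}\rangle$, $\langle 1, e_{11}, e_{16}\rangle$, and $\langle 1, e_{13}, e_{14}\rangle$ are equal to 1, and $\langle y^8 e_0, y^8 e_0, 1\rangle = -1/9$. By the Concavity axiom, $\langle e_2, e_4, e_{22}\rangle$, $\langle e_2, e_7, e_{19}\rangle$, $\langle e_2, e_{10}, e_{16}\rangle$, $\langle e_4, e_4, e_{20}\rangle$, $\langle e_4, e_5, e_{19}\rangle$, $\langle e_4, e_7, e_{17}\rangle$, $\langle e_4, e_8, e_{16}\rangle$, $\langle e_4, e_{10}, e_{14}\rangle$, $\langle e_5, e_7, e_{16}\rangle$, $\langle e_7, e_7, e_{14}\rangle$, $\langle e_{14}, e_{16}, e_{25}\rangle$, $\langle e_{14}, e_{19}, e_{22}\rangle$, $\langle e_{16}, e_{16}, e_{23}\rangle$, $\langle e_{16}, e_{17}, e_{22}\rangle$, $\langle e_{16}, e_{19}, e_{20}\rangle$, and $\langle e_{17}, e_{19}, e_{19}\rangle$ are all equal to 1. By the Composition axiom, $\langle y^8 e_0, e_{14}, e_{14}\rangle = \pm 1$.

Consider the map $\varphi : \mathbb{C}[X, Y] \to \mathcal{H}_{Z_{19}^T,G}$, defined by $X \mapsto e_{14}$ and $Y \mapsto e_{16}$ and extending to a $\mathbb{C}$-algebra homomorphism. One can check directly that this map is surjective.

Our correlators tell us that $e_{14}^3 = -9e_{13}$, $e_{16}^8 = e_{13}$, and $e_{14}^2 e_{16} = 0$. Hence $(3X^2 Y,\ X^3 + 9Y^8) \subset \ker\varphi$. Since $\mathbb{C}[X, Y]/(3X^2 Y,\ X^3 + 9Y^8) = \mathcal{Q}_{Z_{19}}$ has dimension 19, we deduce that the inclusion is equality, and so we have the isomorphism $\mathcal{H}_{Z_{19}^T,G} \cong \mathcal{Q}_{Z_{19}}$.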

$W_{17} = x^4 + x y^5$

$J = (4x^3 + y^5,\ 5xy^4)$, $q_x = \tfrac{5}{20}$, $q_y = \tfrac{3}{20}$, $\hat{c} = \tfrac{24}{20}$, $G = \langle J\rangle \cong \mathbb{Z}/20\mathbb{Z}$,

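$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^2 & \text{if } k = 0 \\ \mathbb{C}_x & \text{if } 4 \mid k \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, x, x^2, y, \ldots, y^8, xy, xy^2, xy^3, x^2 y, x^2 y^2, x^2 y^3 \rangle, & \mu = 17 \\ \langle 1, x, x^2 \rangle, & \mu = 3 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{16}{c}}
k & 0 & 1 & 2 & 3 & 5 & 6 & 7 & 9 & 10 & 11 & 13 & 14 & 15 & 17 & 18 & 19 \\
|G| \cdot \deg_W & 24 & 0 & 16 & 32 & 24 & 40 & 16 & 8 & 24 & 40 & 32 & 8 & 24 & 16 & 32 & 48 \\
\text{invariants} & y^4 e_0 & 1 & e_2 & e_3 & e_5 & e_6 & e_7 & e_9 & e_{10} & e_{11} & e_{13} & e_{14} & e_{15} & e_{17} & e_{18} & e_{19}
\end{array}$$

Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{19}\rangle$, $\langle 1, e_2, e_{18}\rangle$, $\langle 1, e_3, e_{17}\rangle$, $\langle 1, e_5, e_{15}\rangle$, $\langle 1, e_6, e_{14}\rangle$, $\langle 1, e_7, e_{13}\rangle$, $\langle 1, e_9, e_{11}\rangle$, and $\langle 1, e_{10}, e_{10}\rangle$ are all equal to 1, and $\langle y^4 e_0, y^4 e_0, 1\rangle = -1/5$.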


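By the Concavity axiom, $\langle e_2, e_2, e_{17}\rangle$, $\langle e_2, e_5, e_{14}\rangle$, $\langle e_2, e_9, e_{10}\rangle$, $\langle e_3, e_9, e_9\rangle$, $\langle e_5, e_7, e_9\rangle$, $\langle e_7, e_{17}, e_{17}\rangle$, $\langle e_9, e_{14}, e_{18}\rangle$, $\langle e_9, e_{15}, e_{17}\rangle$, $\langle e_{10}, e_{14}, e_{17}\rangle$, and $\langle e_{13}, e_{14}, e_{14}\rangle$ are all equal to 1. By the Index Zero axiom, $\langle e_7, e_7, e_7\rangle = -5$. By the Composition axiom, $\langle y^4 e_0, e_7, e_{14}\rangle = \pm 1$.

Consider the map $\varphi : \mathbb{C}[X, Y] \to \mathcal{H}_{W_{17},G}$, defined by $X \mapsto e_{14}$ and $Y \mapsto e_9$ and extending to a $\mathbb{C}$-algebra homomorphism. One can check directly that this map is surjective.

Our correlators tell us that $e_{14}^3 e_9 = 0$, $e_{14}^4 = -5e_{13}$, and $e_9^4 = e_{13}$. Hence $(4X^3 Y,\ X^4 + 5Y^4) \subset \ker\varphi$. Since $\mathbb{C}[X, Y]/(4X^3 Y,\ X^4 + 5Y^4) = \mathcal{Q}_{W_{17}^T}$ has dimension 16, we deduce that the inclusion is equality, and so we have the isomorphism $\mathcal{H}_{W_{17},G} \cong \mathcal{Q}_{W_{17}^T}$.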

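$W_{17}^T = x^4 y + y^5$

$J = (4x^3 y,\ x^4 + 5y^4)$, $q_x = \tfrac{4}{20}$, $q_y = \tfrac{4}{20}$, $\hat{c} = \tfrac{24}{20}$, $\langle J\rangle \cong \mathbb{Z}/5\mathbb{Z}$, $G \cong \mathbb{Z}/20\mathbb{Z}$.

Since $J$ does not generate $G$, we will use the generator $g = (\zeta, \zeta^{-4})$, where $\zeta$ is a primitive 20th root of unity.

$$\operatorname{Fix} g^k = \begin{cases} \mathbb{C}^2 & \text{if } k = 0 \\ \mathbb{C}_y & \text{if } 5 \mid k \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} g^k} = \begin{cases} \langle 1, x, x^2, x^3, y, \ldots, y^4, xy, \ldots, xy^4, x^2 y, \ldots, x^2 y^4 \rangle, & \mu = 16 \\ \langle 1, y, y^2, y^3 \rangle, & \mu = 4 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{17}{c}}
k & 0 & 1 & 2 & 3 & 4 & 6 & 7 & 8 & 9 & 11 & 12 & 13 & 14 & 16 & 17 & 18 & 19 \\
|G| \cdot \deg_W & 24 & 18 & 12 & 6 & 0 & 28 & 22 & 16 & 10 & 38 & 32 & 26 & 20 & 48 & 42 & 36 & 30 \\
\text{invariants} & x^3 e_0 & e_1 & e_2 & e_3 & 1 & e_6 & e_7 & e_8 & e_9 & e_{11} & e_{12} & e_{13} & e_{14} & e_{16} & e_{17} & e_{18} & e_{19}
\end{array}$$

Potential non-zero correlators: By the Pairing axiom, $\langle 1, e_1, e_{19}\rangle$, $\langle 1, e_2, e_{18}\rangle$, $\langle 1, e_3, e_{17}\rangle$, $\langle 1, 1, e_{16}\rangle$, $\langle 1, e_6, e_{14}\rangle$, $\langle 1, e_7, e_{13}\rangle$, $\langle 1, e_8, e_{12}\rangle$, and $\langle 1, e_9, e_{11}\rangle$ are all equal to 1, and $\langle x^3 e_0, x^3 e_0, 1\rangle = -1/4$. By the Concavity axiom, $\langle e_1, e_9, e_{14}\rangle$, $\langle e_2, e_3, e_{19}\rangle$, $\langle e_2, e_8, e_{14}\rangle$, $\langle e_2, e_9, e_{13}\rangle$, $\langle e_3, e_3, e_{18}\rangle$, $\langle e_3, e_7, e_{14}\rangle$, $\langle e_3, e_8, e_{13}\rangle$, $\langle e_3, e_9, e_{12}\rangle$, $\langle e_6, e_9, e_9\rangle$, $\langle e_7, e_8, e_9\rangle$, and $\langle e_8, e_8, e_8\rangle$ are all equal to 1. By the Index Zero axiom, $\langle e_1, e_1, e_2\rangle = -4$. By the Composition axiom, $\langle x^3 e_0, e_1, e_3\rangle = \langle x^3 e_0, e_2, e_2\rangle = \pm 1$.

Consider the map $\varphi : \mathbb{C}[X, Y] \to \mathcal{H}_{W_{17}^T,G}$, defined by $X \mapsto e_9$ and $Y \mapsto e_3$ and extending to a $\mathbb{C}$-algebra homomorphism. One can check directly that this map is surjective.

Our correlators tell us that $e_9 e_3^4 = 0$, $e_3^5 = -4e_{19}$, and $e_9^3 = e_{19}$. Hence $(4X^3 + Y^5,\ 5XY^4) \subset \ker\varphi$. Since $\mathbb{C}[X, Y]/(4X^3 + Y^5,\ 5XY^4) = \mathcal{Q}_{W_{17}}$ has dimension 17, we deduce that the inclusion is equality, and so we have the isomorphism $\mathcal{H}_{W_{17}^T,G} \cong \mathcal{Q}_{W_{17}}$.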


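$Q_{17} = x^3 + x y^5 + y z^2$

$J = (3x^2 + y^5,\ 5xy^4 + z^2,\ 2yz)$, $q_x = \tfrac{10}{30}$, $q_y = \tfrac{4}{30}$, $q_z = \tfrac{13}{30}$, $\hat{c} = \tfrac{36}{30}$, $G = \langle J\rangle \cong \mathbb{Z}/30\mathbb{Z}$,

$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^3 & \text{if } k = 0 \\ \mathbb{C}^2_{xy} & \text{if } k = 15 \\ \mathbb{C}_x & \text{if } 3 \mid k,\ k \neq 15 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, x, z, z^2, y, y^2, \ldots, y^9, xy, xy^2, xy^3, xz \rangle, & \mu = 17 \\ \langle 1, x, y, y^2, \ldots, y^8, xy, xy^2, xy^3 \rangle, & \mu = 13 \\ \langle 1, x \rangle, & \mu = 2 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{11}{c}}
k & 1 & 2 & 4 & 5 & 7 & 8 & 10 & 11 & 13 & 14 & 15 \\
|G| \cdot \deg_W & 0 & 54 & 42 & 36 & 24 & 18 & 6 & 60 & 48 & 42 & 36 \\
\text{invariants} & 1 & e_2 & e_4 & e_5 & e_7 & e_8 & e_{10} & e_{11} & e_{13} & e_{14} & y^4 e_{15}
\end{array}$$

$$\begin{array}{l|*{10}{c}}
k & 16 & 17 & 19 & 20 & 22 & 23 & 25 & 26 & 28 & 29 \\
|G| \cdot \deg_W & 30 & 24 & 12 & 66 & 54 & 48 & 36 & 30 & 18 & 72 \\
\text{invariants} & e_{16} & e_{17} & e_{19} & e_{20} & e_{22} & e_{23} & e_{25} & e_{26} & e_{28} & e_{29}
\end{array}$$

Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{29}\rangle$, $\langle 1, e_2, e_{28}\rangle$, $\langle 1, e_4, e_{26}\rangle$, $\langle 1, e_5, e_{25}\rangle$, $\langle 1, e_7, e_{23}\rangle$, $\langle 1, e_8, e_{22}\rangle$, $\langle 1, e_{10}, e_{20}\rangle$, $\langle 1, e_{11}, e_{19}\rangle$, $\langle 1, e_{13}, e_{17}\rangle$, and $\langle 1, e_{14}, e_{16}\rangle$ are all equal to 1, and $\langle y^4 e_{15}, y^4 e_{15}, 1\rangle = -1/5$. By the Concavity axiom, $\langle e_2, e_{10}, e_{19}\rangle$, $\langle e_4, e_8, e_{19}\rangle$, $\langle e_4, e_{10}, e_{17}\rangle$, $\langle e_5, e_{10}, e_{16}\rangle$, $\langle e_7, e_8, e_{16}\rangle$, $\langle e_8, e_{10}, e_{13}\rangle$, $\langle e_8, e_{25}, e_{28}\rangle$, $\langle e_{10}, e_{10}, e_{11}\rangle$, $\langle e_{10}, e_{23}, e_{28}\rangle$, $\langle e_{10}, e_{25}, e_{26}\rangle$, $\langle e_{16}, e_{17}, e_{28}\rangle$, $\langle e_{16}, e_{19}, e_{26}\rangle$, $\langle e_{17}, e_{19}, e_{25}\rangle$, and $\langle e_{19}, e_{19}, e_{23}\rangle$ are all equal to 1. By the Index Zero axiom, $\langle e_5, e_7, e_{19}\rangle$, $\langle e_5, e_{28}, e_{28}\rangle$, $\langle e_7, e_7, e_{17}\rangle$, $\langle e_7, e_{10}, e_{14}\rangle$, $\langle e_7, e_{26}, e_{28}\rangle$, and $\langle e_{14}, e_{19}, e_{28}\rangle$ are all equal to $-2$. By the Composition axiom, $\langle e_8, e_8, y^4 e_{15}\rangle = \pm 1$.

Consider the map $\varphi : \mathbb{C}[X, Y, Z] \to \mathcal{H}_{Q_{17},G}$ defined by $X \mapsto e_8$, $Y \mapsto e_{10}$, and $Z \mapsto e_{16}$ and extending to a $\mathbb{C}$-algebra homomorphism. One can check directly that this map is surjective.

Our correlators tell us that $e_8^2 e_{10} = 0$, $e_{10}^5 = -2e_{16}$, $e_8^3 = -5e_{22}$, and $e_{10}^4 e_{16} = e_{22}$. Hence $(3X^2 Y,\ X^3 + 5Y^4 Z,\ Y^5 + 2Z) \subset \ker\varphi$. Since $\mathbb{C}[X, Y, Z]/(3X^2 Y,\ X^3 + 5Y^4 Z,\ Y^5 + 2Z) = \mathcal{Q}_{Q_{17}^T}$ has dimension 21, we deduce that the inclusion is equality, and so we have the isomorphism $\mathcal{H}_{Q_{17},G} \cong \mathcal{Q}_{Q_{17}^T}$.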

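$S_{16} = x^2 z + y z^2 + x y^4$

$J = (2xz + y^4,\ z^2 + 4xy^3,\ x^2 + 2yz)$, $q_x = \tfrac{5}{17}$, $q_y = \tfrac{3}{17}$, $q_z = \tfrac{7}{17}$, $\hat{c} = \tfrac{21}{17}$, $G = \langle J\rangle \cong \mathbb{Z}/17\mathbb{Z}$,

$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^3 & \text{if } k = 0 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, x, y, y^2, \ldots, y^6, z, z^2, z^3, xy, xy^2, yz, y^2 z, y^3 z \rangle, & \mu = 16 \\ \langle 1 \rangle \end{cases}.$$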


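$$\begin{array}{l|*{16}{c}}
k & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 & 16 \\
|G| \cdot \deg_W & 0 & 30 & 26 & 22 & 18 & 14 & 10 & 6 & 36 & 32 & 28 & 24 & 20 & 16 & 12 & 42 \\
\text{invariants} & 1 & e_2 & e_3 & e_4 & e_5 & e_6 & e_7 & e_8 & e_9 & e_{10} & e_{11} & e_{12} & e_{13} & e_{14} & e_{15} & e_{16}
\end{array}$$

Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{16}\rangle$, $\langle 1, e_2, e_{15}\rangle$, $\langle 1, e_3, e_{14}\rangle$, $\langle 1, e_4, e_{13}\rangle$, $\langle 1, e_5, e_{12}\rangle$, $\langle 1, e_6, e_{11}\rangle$, $\langle 1, e_7, e_{10}\rangle$, and $\langle 1, e_8, e_9\rangle$ are equal to 1. By the Concavity axiom, $\langle e_2, e_8, e_8\rangle$, $\langle e_3, e_7, e_8\rangle$, $\langle e_4, e_6, e_8\rangle$, $\langle e_5, e_6, e_7\rangle$, $\langle e_6, e_{14}, e_{15}\rangle$, $\langle e_7, e_{13}, e_{15}\rangle$, $\langle e_8, e_{12}, e_{15}\rangle$, and $\langle e_8, e_{13}, e_{14}\rangle$ are all equal to 1. By the Index Zero axiom, $\langle e_4, e_7, e_7\rangle$, $\langle e_7, e_{14}, e_{14}\rangle$, $\langle e_5, e_5, e_8\rangle$, and $\langle e_5, e_{15}, e_{15}\rangle$ are all equal to $-2$, and $\langle e_6, e_6, e_6\rangle = -4$.

Consider the map $\varphi : \mathbb{C}[X, Y, Z] \to \mathcal{H}_{S_{16},G}$ defined by $X \mapsto e_7$, $Y \mapsto e_8$, and $Z \mapsto e_6$ and extending to a $\mathbb{C}$-algebra homomorphism. One can check directly that this map is surjective.

Our correlators tell us that $e_7 e_6 = e_{12}$, $e_8^4 = -2e_{12}$, $e_6^2 = -4e_{11}$, $e_7 e_8^3 = e_{11}$, $e_7^2 = -2e_{13}$, and $e_8 e_6 = e_{13}$. Hence $(2XZ + Y^4,\ Z^2 + 4XY^3,\ X^2 + 2YZ) \subset \ker\varphi$. Since $\mathbb{C}[X, Y, Z]/(2XZ + Y^4,\ Z^2 + 4XY^3,\ X^2 + 2YZ) = \mathcal{Q}_{S_{16}}$ has dimension 16, we deduce that the inclusion is equality, and so we have the isomorphism $\mathcal{H}_{S_{16},G} \cong \mathcal{Q}_{S_{16}}$.

$S_{17} = x^2 z + y z^2 + y^6$

$J = (2xz,\ z^2 + 6y^5,\ x^2 + 2yz)$, $q_x = \tfrac{7}{24}$, $q_y = \tfrac{4}{24}$, $q_z = \tfrac{10}{24}$, $\hat{c} = \tfrac{30}{24}$, $G = \langle J\rangle \cong \mathbb{Z}/24\mathbb{Z}$,

$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^3 & \text{if } k = 0 \\ \mathbb{C}^2_{yz} & \text{if } k = 12 \\ \mathbb{C}_y & \text{if } 6 \mid k,\ k \neq 12 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, x, y, \ldots, y^4, z, z^2, z^3, xy, \ldots, xy^4, yz, \ldots, y^4 z \rangle, & \mu = 17 \\ \langle 1, y, y^2, y^3, y^4, z, z^2 \rangle, & \mu = 7 \\ \langle 1, y, y^2, y^3, y^4 \rangle, & \mu = 5 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{10}{c}}
k & 1 & 2 & 3 & 4 & 5 & 7 & 8 & 9 & 10 & 11 \\
|G| \cdot \deg_W & 0 & 42 & 36 & 30 & 24 & 12 & 6 & 48 & 42 & 36 \\
\text{invariants} & 1 & e_2 & e_3 & e_4 & e_5 & e_7 & e_8 & e_9 & e_{10} & e_{11}
\end{array}$$

$$\begin{array}{l|*{11}{c}}
k & 12 & 13 & 14 & 15 & 16 & 17 & 19 & 20 & 21 & 22 & 23 \\
|G| \cdot \deg_W & 30 & 24 & 18 & 12 & 54 & 48 & 36 & 30 & 24 & 18 & 60 \\
\text{invariants} & z e_{12} & e_{13} & e_{14} & e_{15} & e_{16} & e_{17} & e_{19} & e_{20} & e_{21} & e_{22} & e_{23}
\end{array}$$

Potential non-zero correlators: By the Pairing axiom, $\langle 1, 1, e_{23}\rangle$, $\langle 1, e_2, e_{22}\rangle$, $\langle 1, e_3, e_{21}\rangle$, $\langle 1, e_4, e_{20}\rangle$, $\langle 1, e_5, e_{19}\rangle$, $\langle 1, e_7, e_{17}\rangle$, $\langle 1, e_8, e_{16}\rangle$, $\langle 1, e_9, e_{15}\rangle$, $\langle 1, e_{10}, e_{14}\rangle$, and $\langle 1, e_{11}, e_{13}\rangle$ are equal to 1, and $\langle z e_{12}, z e_{12}, 1\rangle = -1/2$. By the Concavity axiom, $\langle e_2, e_8, e_{15}\rangle$, $\langle e_3, e_7, e_{15}\rangle$, $\langle e_3, e_8, e_{14}\rangle$, $\langle e_4, e_8, e_{13}\rangle$, $\langle e_5, e_7, e_{13}\rangle$, $\langle e_7, e_8, e_{10}\rangle$, $\langle e_7, e_{20}, e_{22}\rangle$, $\langle e_8, e_8, e_9\rangle$, $\langle e_8, e_{19}, e_{22}\rangle$, $\langle e_8, e_{20}, e_{21}\rangle$, $\langle e_{13}, e_{14}, e_{22}\rangle$, $\langle e_{13}, e_{15}, e_{21}\rangle$, $\langle e_{14}, e_{15}, e_{20}\rangle$, and $\langle e_{15}, e_{15}, e_{19}\rangle$ are all equal to 1. By the Index Zero axiom, $\langle e_4, e_7, e_{14}\rangle$, $\langle e_5, e_5, e_{15}\rangle$, $\langle e_5, e_{22}, e_{22}\rangle$, $\langle e_7, e_7, e_{11}\rangle$, $\langle e_7, e_{21}, e_{21}\rangle$, and $\langle e_{14}, e_{14}, e_{21}\rangle$ are all equal to $-2$.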


By the Composition axiom, $\langle e_5, e_8, z e_{12}\rangle = \langle z e_{12}, e_{15}, e_{22}\rangle = \pm 1$.

Consider the map $\varphi : \mathbb{C}[X, Y, Z] \to \mathcal{H}_{S_{17},G}$, defined by $X \mapsto e_8$, $Y \mapsto e_{13}$, and $Z \mapsto e_7$ and extending to a $\mathbb{C}$-algebra homomorphism. One can check directly that this map is surjective.

Our correlators tell us that $e_{13} * e_7 = e_{19}$ and $e_8^6 = -2e_{19}$, $e_7^2 = -2e_{13}$, and $e_8^5 * e_7 = 0$. Hence $(6X^5 Z,\ Z^2 + 2Y,\ X^6 + 2YZ) \subset \ker\varphi$. Since $\mathbb{C}[X, Y, Z]/(6X^5 Z,\ Z^2 + 2Y,\ X^6 + 2YZ) = \mathcal{Q}_{S_{17}^T}$ has dimension 21, we deduce that the inclusion is equality, and so we have the isomorphism $\mathcal{H}_{S_{17},G} \cong \mathcal{Q}_{S_{17}^T}$.

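2.3. Singularities of corank 3.

In this section, we consider two singularities, $Q_{2,0}$ and $S_{1,0}$. We will not complete the calculations for the transposed singularities, as mentioned earlier. The other two singularities in Arnol’d’s list are not quasi-homogeneous for any choice of constants, so we do not include them either.

$Q_{2,0} = x^3 + y z^2 + x y^4$

$J = (3x^2 + y^4,\ z^2 + 4xy^3,\ 2yz)$, $q_x = \tfrac{8}{24}$, $q_y = \tfrac{4}{24}$, $q_z = \tfrac{10}{24}$, $\hat{c} = \tfrac{28}{24}$, $\langle J\rangle \cong \mathbb{Z}/12\mathbb{Z}$, $G \cong \mathbb{Z}/24\mathbb{Z}$.

Since $J$ does not generate the full group of diagonal symmetries, we will use $g = (\zeta^8, \zeta^{-2}, \zeta)$, where $\zeta^{24} = 1$.

$$\operatorname{Fix} g^k = \begin{cases} \mathbb{C}^3 & \text{if } k = 0 \\ \mathbb{C}^2_{xy} & \text{if } k = 12 \\ \mathbb{C}_x & \text{if } 3 \mid k,\ k \neq 12 \\ 0 & \text{otherwise} \end{cases},$$

$$\mathcal{Q}|_{\operatorname{Fix} g^k} = \begin{cases} \langle 1, x, y, y^2, \ldots, y^7, z, z^2, xy, xz \rangle, & \mu = 14 \\ \langle 1, x, x^2, y, y^2, y^3, xy, xy^2, x^2 y, x^2 y^2 \rangle, & \mu = 10 \\ \langle 1, x \rangle, & \mu = 2 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{17}{c}}
k & 1 & 2 & 4 & 5 & 7 & 8 & 10 & 11 & 12 & 13 & 14 & 16 & 17 & 19 & 20 & 22 & 23 \\
|G| \cdot \deg_W & 18 & 32 & 12 & 26 & 6 & 20 & 0 & 14 & 28 & 42 & 56 & 36 & 50 & 30 & 44 & 24 & 38 \\
\text{invariants} & e_1 & e_2 & e_4 & e_5 & e_7 & e_8 & 1 & e_{11} & y^3 e_{12} & e_{13} & e_{14} & e_{16} & e_{17} & e_{19} & e_{20} & e_{22} & e_{23}
\end{array}$$

Potential non-zero correlators: By the Pairing axiom, $\langle 1, e_1, e_{23}\rangle$, $\langle 1, e_2, e_{22}\rangle$, $\langle 1, e_4, e_{20}\rangle$, $\langle 1, e_5, e_{19}\rangle$, $\langle 1, e_7, e_{17}\rangle$, $\langle 1, e_8, e_{16}\rangle$, $\langle 1, e_{11}, e_{13}\rangle$, and $\langle 1, 1, e_{14}\rangle$ are all equal to 1, and $\langle 1, y^3 e_{12}, y^3 e_{12}\rangle = -1/4$. By the Concavity axiom, $\langle e_1, e_{11}, e_{22}\rangle$, $\langle e_4, e_7, e_{23}\rangle$, $\langle e_4, e_8, e_{22}\rangle$, $\langle e_4, e_{11}, e_{19}\rangle$, $\langle e_5, e_7, e_{22}\rangle$, $\langle e_7, e_7, e_{20}\rangle$, $\langle e_7, e_8, e_{19}\rangle$, and $\langle e_7, e_{11}, e_{16}\rangle$ are all equal to 1. By the Index Zero axiom, $\langle e_1, e_2, e_7\rangle = -2$. By the Composition axiom, $\langle e_{11}, e_{11}, y^3 e_{12}\rangle = \pm 1$.

Consider the map $\varphi : \mathbb{C}[X, Y, Z] \to \mathcal{H}_{Q_{2,0},G}$, defined by $X \mapsto e_{11}$, $Y \mapsto e_7$, and $Z \mapsto e_{22}$ and extending to a $\mathbb{C}$-algebra homomorphism. One can check directly that this map is surjective.

Our correlators tell us that $e_{11}^2 e_7 = 0$, $e_{11}^3 = -4e_{13}$, $e_7^3 e_{22} = e_{13}$, and that $e_7^4 = -2e_{22}$.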


M. Krawitz, N. Priddis, P. Acosta, N. Bergin, H. Rathnakumara

Hence $(3X^2Y,\ X^3 + 4Y^3Z,\ 2Z + Y^4) \subset \ker\varphi$. Since $\mathbb{C}[X,Y,Z]/(3X^2Y,\ X^3 + 4Y^3Z,\ 2Z + Y^4) = \mathscr{Q}_{Q_{2,0}^T}$ has dimension 17, we deduce that the inclusion is an equality, and so we have the isomorphism $\mathscr{H}_{Q_{2,0},G} \cong \mathscr{Q}_{Q_{2,0}^T}$.
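The case analysis of $\operatorname{Fix} g^k$ only depends on which exponents of $\zeta$ become trivial after multiplying by $k$. A small sketch of that bookkeeping follows, with the hypothetical helper names `fixed_coordinates` and `sectors` (not from the paper); the generators are the ones chosen in the text for $Q_{2,0}$ and $S_{1,0}$.

```python
def fixed_coordinates(exponents, order, k):
    """Coordinates fixed by g^k, where g scales coordinate i by zeta**exponents[i]
    and zeta is a primitive root of unity with zeta**order == 1."""
    names = "xyz"
    return "".join(n for n, a in zip(names, exponents) if (a * k) % order == 0)


def sectors(exponents, order):
    """Map each k in 0..order-1 to the coordinates spanning Fix g^k."""
    return {k: fixed_coordinates(exponents, order, k) for k in range(order)}


# g = (zeta^8, zeta^-2, zeta) with zeta^24 = 1, as used for Q_{2,0}.
q20_sectors = sectors((8, -2, 1), 24)
# g = (zeta, zeta^4, zeta^-2) with zeta^20 = 1, as used for S_{1,0}.
s10_sectors = sectors((1, 4, -2), 20)

# Values of k whose fixed locus is {0}.
q20_narrow = [k for k, fixed in q20_sectors.items() if fixed == ""]
```

For $Q_{2,0}$ the sixteen values in `q20_narrow`, together with $k = 12$, are exactly the seventeen values of $k$ appearing in the table of invariants.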

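$$S_{1,0} = x^2z + yz^2 + y^5, \qquad \mathscr{J} = (2xz,\ z^2 + 5y^4,\ x^2 + 2yz),$$

$$q_x = \tfrac{6}{20}, \quad q_y = \tfrac{4}{20}, \quad q_z = \tfrac{8}{20}, \quad \hat{c} = \tfrac{24}{20}, \qquad G \cong \mathbb{Z}/20\mathbb{Z}, \quad \langle J \rangle \cong \mathbb{Z}/10\mathbb{Z}.$$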

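Since $J$ does not generate the full group of diagonal symmetries we will use $g = (\zeta, \zeta^4, \zeta^{-2})$, where $\zeta^{20} = 1$.

$$\operatorname{Fix} g^k = \begin{cases} \mathbb{C}^3 & \text{if } k = 0 \\ \mathbb{C}^2_{yz} & \text{if } k = 10 \\ \mathbb{C}_y & \text{if } 5 \mid k,\ k \neq 10 \\ 0 & \text{otherwise} \end{cases}, \qquad \mathscr{Q}|_{\operatorname{Fix} g^k} = \begin{cases} \langle 1, x, y, y^2, y^3, z, z^2, z^3, xy, xy^2, y^3, yz, y^2z, y^3z \rangle, & \mu = 14 \\ \langle 1, y, y^2, y^3, y^4, z \rangle, & \mu = 6 \\ \langle 1, y, y^2, y^3 \rangle, & \mu = 4 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{17}{c}} k & 1 & 2 & 3 & 4 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 16 & 17 & 18 & 19 \\ |G| \cdot \deg_W & 10 & 16 & 22 & 28 & 0 & 6 & 12 & 18 & 24 & 30 & 36 & 42 & 48 & 20 & 26 & 32 & 38 \\ \text{invariants} & e_1 & e_2 & e_3 & e_4 & 1 & e_7 & e_8 & e_9 & ze_{10} & e_{11} & e_{12} & e_{13} & e_{14} & e_{16} & e_{17} & e_{18} & e_{19} \end{array}$$

Potential non-zero correlators: By the Pairing Axiom $\langle 1, e_1, e_{19}\rangle$, $\langle 1, e_2, e_{18}\rangle$, $\langle 1, e_3, e_{17}\rangle$, $\langle 1, e_4, e_{16}\rangle$, $\langle 1, 1, e_{14}\rangle$, $\langle 1, e_7, e_{13}\rangle$, $\langle 1, e_8, e_{12}\rangle$, and $\langle 1, e_9, e_{11}\rangle$ are equal to 1, and $\langle 1, ze_{10}, ze_{10}\rangle = -1/2$.

By the Concavity Axiom $\langle e_1, e_7, e_{18}\rangle$, $\langle e_1, e_8, e_{17}\rangle$, $\langle e_1, e_9, e_{16}\rangle$, $\langle e_2, e_7, e_{17}\rangle$, $\langle e_2, e_8, e_{16}\rangle$, $\langle e_3, e_7, e_{16}\rangle$, $\langle e_7, e_7, e_{12}\rangle$, and $\langle e_7, e_8, e_{11}\rangle$ are all equal to 1.

By the Index Zero Axiom $\langle e_1, e_1, e_4\rangle$, $\langle e_1, e_2, e_3\rangle$, $\langle e_2, e_2, e_2\rangle$, and $\langle e_8, e_9, e_9\rangle$ are equal to $-2$.

By the Composition Axiom $\langle e_7, e_9, ze_{10}\rangle = \langle e_8, e_8, ze_{10}\rangle = \pm 1$.

Consider the map $\varphi : \mathbb{C}[X,Y,Z] \to \mathscr{H}_{S_{1,0},G}$, defined by $X \mapsto e_{16}$, $Y \mapsto e_1$ and $Z \mapsto e_7$ and extending to a $\mathbb{C}$-algebra homomorphism. One can check directly that this map is surjective. Our correlators tell us that $e_1 e_7^4 = 0$, $e_1^2 = -2e_{16}$, $e_{16} e_1 = e_{11}$, and that $e_7^5 = -2e_{11}$. Hence $(2X + Y^2,\ Z^5 + 2XY,\ 5YZ^4) \subset \ker\varphi$. Since $\mathbb{C}[X,Y,Z]/(2X + Y^2,\ Z^5 + 2XY,\ 5YZ^4) = \mathscr{Q}_{S_{1,0}^T}$ has dimension 17, we deduce that the inclusion is an equality, and so we have the isomorphism $\mathscr{H}_{S_{1,0},G} \cong \mathscr{Q}_{S_{1,0}^T}$.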

2.4. Singularities of corank 2. Here we consider only the singularity $Z_{1,0}$ out of the four singularities listed by Arnol’d. The singularity $W^{\#}_{1,2q}$ is not quasi-homogeneous for any choice of constants. We will consider the singularities $J_{3,0}$ and $W_{1,0}$ in the following section:

FJRW-Rings and Mirror Symmetry


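$$Z_{1,0} = x^3y + y^7, \qquad \mathscr{J} = (3x^2y,\ x^3 + 7y^6), \qquad q_x = \tfrac{2}{7}, \quad q_y = \tfrac{1}{7}, \quad \hat{c} = \tfrac{8}{7}, \qquad G \cong \mathbb{Z}/21\mathbb{Z}, \quad \langle J \rangle \cong \mathbb{Z}/7\mathbb{Z},$$

$$\operatorname{Fix} g^k = \begin{cases} \mathbb{C}^2 & \text{if } k = 0 \\ \mathbb{C}_y & \text{if } 7 \mid k,\ k \neq 0 \\ 0 & \text{otherwise} \end{cases}, \qquad \mathscr{Q}|_{\operatorname{Fix} g^k} = \begin{cases} \langle 1, x, x^2, y, \dots, y^6, xy, xy^2, \dots, xy^6 \rangle, & \mu = 21 \\ \langle 1, y, y^2, y^3, y^4, y^5 \rangle, & \mu = 6 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{19}{c}} k & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 8 & 9 & 10 & 11 & 12 & 13 & 15 & 16 & 17 & 18 & 19 & 20 \\ |G| \cdot \deg_W & 24 & 20 & 16 & 12 & 8 & 4 & 0 & 34 & 30 & 26 & 22 & 18 & 14 & 48 & 44 & 40 & 36 & 32 & 28 \\ \text{invariants} & x^2e_0 & e_1 & e_2 & e_3 & e_4 & e_5 & 1 & e_8 & e_9 & e_{10} & e_{11} & e_{12} & e_{13} & e_{15} & e_{16} & e_{17} & e_{18} & e_{19} & e_{20} \end{array}$$

Potential non-zero correlators: By the Pairing axiom $\langle 1, e_1, e_{20}\rangle$, $\langle 1, e_2, e_{19}\rangle$, $\langle 1, e_3, e_{18}\rangle$, $\langle 1, e_4, e_{17}\rangle$, $\langle 1, e_5, e_{16}\rangle$, $\langle 1, 1, e_{15}\rangle$, $\langle 1, e_8, e_{13}\rangle$, $\langle 1, e_9, e_{12}\rangle$, and $\langle 1, e_{10}, e_{11}\rangle$ are all equal to 1, and $\langle x^2e_0, x^2e_0, 1\rangle = -1/3$.

By the Concavity axiom the following 3-point correlators are equal to 1: $\langle e_1, e_{13}, e_{13}\rangle$, $\langle e_2, e_5, e_{20}\rangle$, $\langle e_2, e_{12}, e_{13}\rangle$, $\langle e_3, e_5, e_{19}\rangle$, $\langle e_3, e_4, e_{20}\rangle$, $\langle e_3, e_{11}, e_{13}\rangle$, $\langle e_3, e_{12}, e_{12}\rangle$, $\langle e_4, e_4, e_{19}\rangle$, $\langle e_4, e_5, e_{18}\rangle$, $\langle e_4, e_{10}, e_{13}\rangle$, $\langle e_4, e_{11}, e_{12}\rangle$, $\langle e_5, e_5, e_{17}\rangle$, $\langle e_5, e_9, e_{13}\rangle$, $\langle e_5, e_{10}, e_{12}\rangle$, and $\langle e_5, e_{11}, e_{11}\rangle$.

By the Index Zero axiom $\langle e_1, e_1, e_4\rangle$, $\langle e_1, e_2, e_3\rangle$, and $\langle e_2, e_2, e_2\rangle$ are all equal to $-3$.

By the Composition axiom, using an argument similar to that for Eq. (3), we have $\langle x^2e_0, e_1, e_5\rangle = \langle x^2e_0, e_2, e_4\rangle = \langle x^2e_0, e_3, e_3\rangle = \pm 1$.

Again, by examining degrees, we see that $e_5$ and $e_{13}$ are generators for $\mathscr{H}_{J_{1,0},G}$. Consider the map $\varphi : \mathbb{C}[X,Y] \to \mathscr{H}_{J_{1,0},G}$, defined by $X \mapsto e_{13}$ and $Y \mapsto e_5$ and extending as a $\mathbb{C}$-algebra homomorphism. One can check directly that this map is surjective. Straightforward computations show that $e_5^7 = -3e_{20}$, $e_{13}^2 = e_{20}$, and $e_{13} e_5^6 = 0$. Hence $(Y^7 + 3X^2,\ XY^6) \subset \ker\varphi$. Since $\mathbb{C}[X,Y]/(Y^7 + 3X^2,\ XY^6) = \mathscr{Q}_{Z_{1,0}^T}$ has dimension 19, we deduce that the inclusion is an equality. The reader will note that $Z_{1,0}$ is the transposed singularity for $E_{19}$, which we computed as an example in the Introduction, so we have the isomorphism $\mathscr{Q}_{E_{19}} \cong \mathscr{H}_{J_{1,0},G}$.

2.5. Counterexample to the conjecture for non-invertible singularities. Here we will consider the singularities $J_{3,0}$ and $W_{1,0}$ as non-invertible singularities. The computations to follow show the necessity of the requirement that the singularities be invertible in the conjecture.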
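$$J_{3,0} = x^3 + bx^2y^3 + y^9, \quad 4b^3 + 27 \neq 0, \qquad \mathscr{J} = (3x^2 + 2bxy^3,\ 3bx^2y^2 + 9y^8),$$

$$q_x = \tfrac{3}{9}, \quad q_y = \tfrac{1}{9}, \quad \hat{c} = \tfrac{10}{9}, \qquad G = \langle J \rangle \cong \mathbb{Z}/9\mathbb{Z}.$$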


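Notice that if $b = 0$, then the maximal group of diagonal symmetries is not cyclic. But in that case $\mathscr{H}_{Z_{3,0},G}|_{b=0}$ is isomorphic to a tensor product of simple singularities. So in this paper we only consider the case when the admissible group is generated by $J$:

$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^2 & \text{if } k = 0 \\ \mathbb{C}_x & \text{if } 3 \mid k \\ 0 & \text{otherwise} \end{cases}, \qquad \mathscr{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, x, y, y^2, \dots, y^7, xy, xy^2, \dots, xy^7 \rangle, & \mu = 16 \\ \langle 1, x \rangle, & \mu = 2 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{7}{c}} k & 0 & 1 & 2 & 4 & 5 & 7 & 8 \\ |G| \cdot \deg_W & 10 & 0 & 8 & 6 & 14 & 12 & 20 \\ \text{invariants} & y^5e_0,\ xy^2e_0 & 1 & e_2 & e_4 & e_5 & e_7 & e_8 \end{array}$$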

Potential non-zero correlators: By the Pairing axiom $\langle 1, 1, e_8\rangle = \langle 1, e_2, e_7\rangle = \langle 1, e_4, e_5\rangle = 1$, and

$$\langle y^5e_0, y^5e_0, 1\rangle = \frac{2b^2}{9(4b^3 + 27)}, \qquad \langle y^5e_0, xy^2e_0, 1\rangle = \frac{1}{4b^3 + 27}, \qquad \langle xy^2e_0, xy^2e_0, 1\rangle = \frac{-2b}{3(4b^3 + 27)}.$$

By the Concavity axiom $\langle e_2, e_4, e_4\rangle = 1$.

Notice that in this case, we have the interesting situation that $b = 0$ gives a different FJRW ring for this singularity than any other value of $b$. Another interesting observation is that in this case, there is no Milnor ring that is isomorphic to the FJRW ring. To see this, suppose there is a quasi-homogeneous polynomial $f(x_1, x_2, x_3, x_4)$ with $\mathbb{C}[x_1, x_2, x_3, x_4]/\mathscr{J}_f \cong \mathscr{H}_{J_{3,0},J}$, and let $q_1, q_2, q_3, q_4$ be the corresponding charges. The isomorphism must send generators to generators, so we may assume $x_1 \mapsto \mu$, $x_2 \mapsto \nu$, $x_3 \mapsto e_2$, and $x_4 \mapsto e_4$, where $\mu$ and $\nu$ are in the untwisted sector. Since we have an isomorphism of graded rings,

$$q_1 = 10\alpha, \quad q_2 = 10\alpha, \quad q_3 = 8\alpha, \quad q_4 = 6\alpha, \quad \hat{c}_f = 20\alpha.$$

(We admit the possibility that the grading may differ by a uniform scaling, although we see below that this possibility does not occur.) From the definition of $\hat{c}_f$, we have

$$\hat{c}_f = \sum_i (1 - 2q_i) = 4 - 68\alpha,$$

which we can solve to find $\alpha = \tfrac{1}{22}$, so the weights are

$$q_1 = \tfrac{5}{11}, \quad q_2 = \tfrac{5}{11}, \quad q_3 = \tfrac{4}{11}, \quad q_4 = \tfrac{3}{11}.$$


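Now this Milnor ring must have dimension $\mu = 8$, but the dimension is given in terms of the charges by $\mu = \prod_i \left(\frac{1}{q_i} - 1\right) = \frac{168}{25}$, a contradiction.

$$W_{1,0} = x^4 + ax^2y^3 + y^6, \quad a^2 \neq 4, \qquad \mathscr{J} = (4x^3 + 2axy^3,\ 3ax^2y^2 + 6y^5),$$

$$q_x = \tfrac{3}{12}, \quad q_y = \tfrac{2}{12}, \quad \hat{c} = \tfrac{14}{12}, \qquad G = \langle J \rangle \cong \mathbb{Z}/12\mathbb{Z}.$$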

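Just as with $J_{3,0}$, if $a = 0$, then the maximal group of diagonal symmetries is not cyclic. But in that case $\mathscr{H}_{Z_{3,0},G}|_{b=0}$ is isomorphic to a tensor product of simple singularities. So in this paper we only consider the case when the admissible group is generated by $J$:

$$\operatorname{Fix} J^k = \begin{cases} \mathbb{C}^2 & \text{if } k = 0 \\ \mathbb{C}_x & \text{if } 4 \mid k \\ \mathbb{C}_y & \text{if } 3 \mid k \\ 0 & \text{otherwise} \end{cases}, \qquad \mathscr{Q}|_{\operatorname{Fix} J^k} = \begin{cases} \langle 1, x, x^2, y, \dots, y^4, xy, \dots, xy^4, x^2y, \dots, x^2y^4 \rangle, & \mu = 15 \\ \langle 1, x, x^2 \rangle, & \mu = 3 \\ \langle 1, y, y^2, y^3, y^4 \rangle, & \mu = 5 \\ \langle 1 \rangle \end{cases}.$$

$$\begin{array}{l|*{7}{c}} k & 0 & 1 & 2 & 5 & 7 & 10 & 11 \\ |G| \cdot \deg_W & 14 & 0 & 10 & 16 & 12 & 18 & 28 \\ \text{invariants} & xy^2e_0 & 1 & e_2 & e_5 & e_7 & e_{10} & e_{11} \end{array}$$

Potential non-zero correlators: By the Pairing axiom $\langle 1, 1, e_{11}\rangle$, $\langle 1, e_2, e_{10}\rangle$, $\langle 1, e_3, e_9\rangle$, $\langle 1, e_4, e_8\rangle$, $\langle 1, e_5, e_7\rangle$, and $\langle 1, e_6, e_6\rangle$ are equal to 1 and $\langle xy^2e_0, xy^2e_0, 1\rangle = \frac{1}{24 - 6a^2}$.

By the Concavity axiom $\langle e_2, e_2, e_9\rangle$, $\langle e_7, e_9, e_9\rangle$, and $\langle e_8, e_8, e_9\rangle$ are equal to 1.

Just as with $J_{3,0}$, there is no Milnor ring isomorphic to this FJRW-ring. To see this, suppose there exists a quasi-homogeneous polynomial $f(x_1, x_2, \dots, x_5)$ with $\mathbb{C}[x_1, x_2, \dots, x_5]/\mathscr{J}_f \cong \mathscr{H}_{W_{1,0},J}$, and let $q_1, q_2, \dots, q_5$ be the charges for $x_1, x_2, \dots, x_5$, respectively. The isomorphism must send generators to generators, so we may assume $x_1 \mapsto xy^2e_0$, $x_2 \mapsto e_2$, $x_3 \mapsto e_5$, $x_4 \mapsto e_7$, and $x_5 \mapsto e_{10}$. Since we have an isomorphism of graded rings (again allowing for a uniform rescaling of degrees),

$$q_1 = 14\alpha, \quad q_2 = 10\alpha, \quad q_3 = 16\alpha, \quad q_4 = 12\alpha, \quad q_5 = 18\alpha, \quad \hat{c}_f = 28\alpha.$$

From the definition of $\hat{c}_f$, we have

$$\hat{c}_f = \sum_i (1 - 2q_i) = 5 - 140\alpha,$$

which we can solve to find $\alpha = \tfrac{5}{168}$, so the weights are

$$q_1 = \tfrac{5}{12}, \quad q_2 = \tfrac{25}{84}, \quad q_3 = \tfrac{10}{21}, \quad q_4 = \tfrac{5}{14}, \quad q_5 = \tfrac{15}{28}.$$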
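Both non-existence arguments in this section rest on the same arithmetic: the scaling factor $\alpha$ is fixed by $\hat{c}_f = \sum_i (1 - 2q_i)$, and the resulting charges are fed into $\prod_i (1/q_i - 1)$. A minimal sketch of that check in exact rational arithmetic is given below; `scaling_factor` and `milnor_number` are illustrative names, and the degree lists are read off from the two tables of $|G| \cdot \deg_W$ above.

```python
from fractions import Fraction
from math import prod


def scaling_factor(degrees, top_degree):
    """Solve n - 2*alpha*sum(degrees) = alpha*top_degree for alpha,
    i.e. c-hat_f = sum_i (1 - 2 q_i) with q_i = alpha * degrees[i]."""
    return Fraction(len(degrees), top_degree + 2 * sum(degrees))


def milnor_number(charges):
    """Dimension formula mu = prod_i (1/q_i - 1)."""
    return prod(1 / q - 1 for q in charges)


# J_{3,0}: generators of degree 10, 10, 8, 6 and c-hat_f = 20 * alpha.
alpha_j = scaling_factor([10, 10, 8, 6], 20)
charges_j = [alpha_j * d for d in (10, 10, 8, 6)]
mu_j = milnor_number(charges_j)

# W_{1,0}: generators of degree 14, 10, 16, 12, 18 and c-hat_f = 28 * alpha.
alpha_w = scaling_factor([14, 10, 16, 12, 18], 28)
charges_w = [alpha_w * d for d in (14, 10, 16, 12, 18)]
mu_w = milnor_number(charges_w)
```

In both cases the product is a fraction with denominator different from 1, so it cannot be the dimension of a Milnor ring.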


The dimension of the Milnor ring is given in terms of the charges by $\mu = \prod_i \left(\frac{1}{q_i} - 1\right)$, which is not even an integer. So the FJRW A-model in this case is not isomorphic to the Milnor ring of a quasi-homogeneous polynomial.

References
1. Arnol’d, V., Gusein-Zade, S., Varchenko, A.: Singularities of Differentiable Maps, Vols. I, II. Basel-Boston: Birkhauser, 1985
2. Berglund, P., Hubsch, T.: A generalized construction of mirror manifolds. Nucl. Phys. B 393, 377 (1993)
3. Fan, H., Jarvis, T.J., Ruan, Y.: The Witten equation, mirror symmetry and quantum singularity theory. http://arxiv.org/abs/:0712.4021v3[math.AG], 2009

Communicated by A. Kapustin

Commun. Math. Phys. 296, 175–213 (2010) Digital Object Identifier (DOI) 10.1007/s00220-009-0963-5

Communications in

Mathematical Physics

On the Mixing Time of the 2D Stochastic Ising Model with “Plus” Boundary Conditions at Low Temperature Fabio Martinelli1 , Fabio Lucio Toninelli2 1 Dipartimento di Matematica, Università Roma Tre, Largo S. Murialdo 1,

00146 Roma, Italia. E-mail: [email protected]

2 CNRS and ENS Lyon, Laboratoire de Physique, 46 Allée d’Italie, 69364 Lyon,

France. E-mail: [email protected]; [email protected] Received: 28 May 2009 / Accepted: 14 September 2009 Published online: 1 December 2009 – © Springer-Verlag 2009

Abstract: We consider the Glauber dynamics for the 2D Ising model in a box of side $L$, at inverse temperature $\beta$ and random boundary conditions $\tau$ whose distribution $\mathbf{P}$ either stochastically dominates the extremal plus phase (hence the quotation marks in the title) or is stochastically dominated by the extremal minus phase. A particular case is when $\mathbf{P}$ is concentrated on the homogeneous configuration identically equal to $+$ (equal to $-$). For $\beta$ large enough we show that for any $\varepsilon > 0$ there exists $c = c(\beta, \varepsilon)$ such that the corresponding mixing time $T_{\rm mix}$ satisfies $\lim_{L\to\infty} \mathbf{P}\left(T_{\rm mix} \ge \exp(cL^{\varepsilon})\right) = 0$. In the non-random case $\tau \equiv +$ (or $\tau \equiv -$), this implies that $T_{\rm mix} \le \exp(cL^{\varepsilon})$. The same bound holds when the boundary conditions are all $+$ on three sides and all $-$ on the remaining one. The result, although still very far from the expected Lifshitz behavior $T_{\rm mix} = O(L^2)$, considerably improves upon the previous known estimates of the form $T_{\rm mix} \le \exp(cL^{\frac{1}{2}+\varepsilon})$. The techniques are based on induction over length scales, combined with a judicious use of the so-called “censoring inequality” of Y. Peres and P. Winkler, which in a sense allows us to guide the dynamics to its equilibrium measure.

1. Introduction, Model and Main Results

Glauber dynamics for classical spin systems has been extensively studied in the last fifteen years from various perspectives and across different areas like mathematical physics, probability theory and theoretical computer science. A variety of techniques have been introduced in order to analyze, on an increasing level of sophistication, the typical time scales of the relaxation process to the reversible Gibbs measure (see e.g. [14,17] and the recent work on the cutoff phenomenon for the mean field Ising model [15]). These techniques have in general proved to be quite successful in the so-called one-phase region, corresponding to the case where the system has a unique Gibbs state.
This work was supported by the European Research Council through the “Advanced Grant” PTRELSS 228032, and by ANR through the grants POLINTBIO and LHMSHE.

When instead the
thermodynamic parameters of the system correspond to a point in the phase coexistence region, a whole class of new dynamical phenomena appear (coarsening, phase nucleation, motion of interfaces between different phases, ...) whose mathematical analysis at a microscopic level is still quite far from being completed. A good instance of the latter situation is represented by the Glauber dynamics for the usual $\pm 1$ Ising model at low temperature in the absence of an external magnetic field (see Sect. 1.2). When the system is analyzed in a finite box of side $L$ of the $d$-dimensional lattice $\mathbb{Z}^d$ with free boundary conditions, the relaxation to the Gibbs reversible measure occurs on a time scale exponentially large in the surface $L^{d-1}$ [26,27] because of the energy barrier between the two stable phases of the system (see Sect. 1.3 for a more quantitative statement). When instead one of the two phases is selected by homogeneous boundary conditions, e.g. all pluses, then equilibration is believed to be much faster and it should occur on a polynomial (in $L$) time scale because of the shrinking of the big droplets of the opposite phase via motion by mean curvature under the influence of the boundary conditions. Unfortunately, establishing the above polynomial law in $\mathbb{Z}^d$ remains a kind of holy grail for the subject and the existing bounds of the form $\exp(c\sqrt{L}\log(L))$ in $d = 2$ [12,16] and $\exp(cL^{d-2}\log(L)^2)$ in $d \ge 3$ [25] are very far from it. It is worth mentioning that, always for the low-temperature Ising model but with the underlying graph $G$ different from $\mathbb{Z}^d$, it has been possible to carry out a quite detailed mathematical analysis. The first example is represented by the regular $d$-ary tree [18] and the second one by certain hyperbolic graphs [5].
In both cases one can show for example that the relaxation time or inverse spectral gap of the Glauber dynamics in a finite ball with all plus boundary conditions is uniformly bounded from above in the radius of the ball, a phenomenon that is believed to occur also in Zd in large enough (≥ 4?) dimension d. Moreover polynomial bounds on the mixing time, sometimes with optimal results, have been proved for some simplified models of the random evolution of the phase separation line between the plus and minus phase for the two-dimensional Ising model (see for instance [7 and 19]). The latter contribution, in particular, partly triggered the present work. There, in fact, the opportunities offered by the so-called Peres-Winkler censoring inequality [22] have been detailed in the very concrete and non-trivial case of the so-called Solid-on-Solid model. Roughly speaking the censoring inequality (see Sect. 2.4) says that, when considering the Glauber dynamics for a monotone system like the Ising model on a finite graph and under certain conditions on the initial distribution, switching off (i.e., censoring) the spin flips in some part of the graph and for a certain amount of time can only increase the variation distance between the distribution of the chain at the final time T and the equilibrium Gibbs measure. Therefore, if the censored dynamics is close to equilibrium at a certain time T , the same holds for the true (i.e. uncensored) one. The fact that the choice of where and when to implement the censoring is completely arbitrary (provided that it is independent of the actual evolution of the chain) offers the possibility of (sort of) guiding the dynamics towards the stationary distribution through a sequence of local equilibrations in suitably chosen subsets of the graph. 
Of course the local equilibrium in each of the sub-graphs is conditioned to the random configuration reached by the dynamics outside it and therefore one is naturally led to consider the Ising model with random boundary conditions, a quite delicate topic because of the extreme sensitivity of the relaxation or mixing time to boundary conditions (see [1–4] for several results in this direction, some of them quite surprising at first sight). Moreover


it should also be clear that, in order for the guidance process to be successful, the distribution of the random boundary conditions at each stage of the censoring should be close to that provided by the stationary Gibbs distribution, a requirement that puts quite severe restrictions on the choice of the censoring scheduling. The main contribution of this paper is a detailed implementation of this program for the two-dimensional, low-temperature, Ising model in a finite box with either homogeneous, i.e. all plus (all minus), boundary conditions or, more generally, random boundary conditions that are stochastically larger (stochastically smaller) than those distributed according to the plus (minus) phase. In order to state precisely our results we need to define the model, fix some useful notation and recall some basic facts about the Ising model below the critical temperature.

1.1. The standard Ising model. Let $\Lambda$ be a generic finite subset of $\mathbb{Z}^2$. Each site $x$ in $\Lambda$ indexes a spin $\sigma_x$ which takes values $\pm 1$. The spin configurations $\{\sigma_x\}_{x\in\Lambda}$ have a statistical weight determined by the Hamiltonian

$$H_\Lambda^\tau(\sigma) = -\frac{1}{2} \sum_{\substack{x,y\in\Lambda \\ |x-y|=1}} \sigma_x\sigma_y - \sum_{\substack{x\in\Lambda,\ y\in\Lambda^c \\ |x-y|=1}} \sigma_x\tau_y,$$

where $\tau = \{\tau_y\}_{y\in\Lambda^c}$ are boundary conditions outside $\Lambda$. The Gibbs measure associated to the spin system with boundary conditions $\tau$ is

$$\forall \sigma \in \Omega_\Lambda := \{-1,+1\}^\Lambda, \qquad \pi_\Lambda^\tau(\sigma) = \frac{1}{Z_{\beta,\Lambda}^\tau} \exp\left(-\beta H_\Lambda^\tau(\sigma)\right),$$

where $\beta$ is the inverse of the temperature ($\beta = \frac{1}{T}$) and $Z_{\beta,\Lambda}^\tau$ is the partition function. If the boundary conditions are uniformly equal to $+1$ (resp. $-1$), then the Gibbs measure will be denoted by $\pi_\Lambda^+$ (resp. $\pi_\Lambda^-$). If instead the boundary conditions are free (i.e. $\tau_y = 0\ \forall y$) then the Gibbs measure will be denoted by $\pi_\Lambda^f$.
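For a very small box the Hamiltonian and the Gibbs measure can be tabulated by brute force, which makes the role of the boundary conditions concrete. The following is a sketch under stated assumptions: the names `hamiltonian`, `gibbs_measure` and `mean_spin` are illustrative, the box is $2 \times 2$, and $\beta = 0.5$ is an arbitrary choice with no relation to the low-temperature regime studied in the paper.

```python
from itertools import product
from math import exp


def neighbors(site):
    i, j = site
    return [(i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)]


def hamiltonian(sigma, tau):
    """H(sigma) for spins `sigma` (dict site -> +-1) inside the box and a
    boundary condition `tau` (function site -> value) outside it."""
    energy = 0.0
    for x, sx in sigma.items():
        for y in neighbors(x):
            if y in sigma:
                energy -= 0.5 * sx * sigma[y]  # each inner bond is visited twice
            else:
                energy -= sx * tau(y)
    return energy


def gibbs_measure(sites, tau, beta):
    """Exact Gibbs measure on {-1,+1}^sites by enumeration (tiny boxes only)."""
    weights = {}
    for values in product((-1, 1), repeat=len(sites)):
        sigma = dict(zip(sites, values))
        weights[values] = exp(-beta * hamiltonian(sigma, tau))
    z = sum(weights.values())  # partition function
    return {s: w / z for s, w in weights.items()}


def mean_spin(measure, index):
    """Expectation of the spin stored at position `index` of each configuration."""
    return sum(p * s[index] for s, p in measure.items())


box = [(0, 0), (0, 1), (1, 0), (1, 1)]
pi_plus = gibbs_measure(box, lambda y: 1, beta=0.5)
pi_minus = gibbs_measure(box, lambda y: -1, beta=0.5)
pi_free = gibbs_measure(box, lambda y: 0, beta=0.5)
```

By the spin-flip symmetry the mean spin under `pi_free` vanishes, while plus and minus boundary conditions give opposite magnetizations.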

Remark 1.1. Sometimes we will drop the superscript $\tau$ and the subscript $\Lambda$ from the notation of the Gibbs measure.

It is useful to recall a monotonicity property of the Gibbs measure that will play a key role in our analysis. One introduces a partial order on $\Omega_\Lambda$ by saying that $\sigma \le \eta$ if $\sigma_x \le \eta_x$ for all $x \in \Lambda$. A function $f : \Omega_\Lambda \to \mathbb{R}$ is called monotone increasing (decreasing) if $\sigma \le \eta$ implies $f(\sigma) \le f(\eta)$ ($f(\sigma) \ge f(\eta)$). An event is called increasing (decreasing) if its characteristic function is increasing (decreasing). Given two probability measures $\mu, \nu$ on $\Omega_\Lambda$ we write $\mu \preceq \nu$ if $\mu(f) \le \nu(f)$ for all increasing functions $f$ (with $\mu(f)$ we denote the expectation of $f$ with respect to $\mu$). In the following we will take advantage of the FKG inequalities [11] which state that

• if $\tau \le \tau'$, then $\pi_\Lambda^\tau \preceq \pi_\Lambda^{\tau'}$;
• if $f$ and $g$ are increasing then $\pi_\Lambda^\tau(fg) \ge \pi_\Lambda^\tau(f)\,\pi_\Lambda^\tau(g)$.

The phase transition regime occurs at low temperature and it is characterized by spontaneous magnetization in the thermodynamic limit. There is a critical value $\beta_c$ such that

$$\forall \beta > \beta_c, \qquad \lim_{\Lambda\to\mathbb{Z}^2} \pi_\Lambda^+(\sigma_0) = -\lim_{\Lambda\to\mathbb{Z}^2} \pi_\Lambda^-(\sigma_0) = m_\beta > 0. \tag{1.1}$$


Furthermore, in the thermodynamic limit the measures $\pi_\Lambda^+$ and $\pi_\Lambda^-$ converge (weakly) to two distinct Gibbs measures $\pi_\infty^+$ and $\pi_\infty^-$ which are measures on the space $\Omega_{\mathbb{Z}^2} = \{-1,+1\}^{\mathbb{Z}^2}$. Each of these measures represents a pure state. The next step is to quantify the coexistence of the two pure states defined above. Let $\Lambda_L = \{-\lfloor L/2 \rfloor, \dots, \lfloor L/2 \rfloor\}^2$, let $\vec{n}$ be a vector in the unit circle $\mathbb{S}$ and $\phi_n$ the angle it forms with $\vec{e}_1 = (1, 0)$ and finally let $\tau$ be the following mixed boundary conditions:

$$\forall y \in \Lambda_L^c, \qquad \tau_y = \begin{cases} +1, & \text{if } \vec{n} \cdot y \ge 0, \\ -1, & \text{if } \vec{n} \cdot y < 0. \end{cases}$$

The partition function with mixed boundary conditions is denoted by $Z_{\beta,L}^{\pm}(\vec{n})$ and the one with boundary conditions uniformly equal to $+1$ by $Z_{\beta,L}^{+}$.

Definition 1.2. The surface tension in the direction orthogonal to $\vec{n} \in \mathbb{S}$ is an even and periodic function of $\phi_n$ of period $\pi/2$, and for $-\pi/4 \le \phi_n \le \pi/4$ it is defined by

$$\tau_\beta(\vec{n}) = \lim_{L\to\infty} -\frac{\cos(\phi_n)}{\beta L} \log \frac{Z_{\beta,L}^{\pm}(\vec{n})}{Z_{\beta,L}^{+}}. \tag{1.2}$$

We refer to [21] for a general derivation of the thermodynamic limit (1.2). With this definition, one result (among many others) concerning the coexistence of the two phases can be formulated as follows [23]. Let $m_{\Lambda_L}(\sigma) = \sum_{x\in\Lambda_L} \sigma_x$ be the total magnetization in the box $\Lambda_L$. Then

$$\lim_{L\to\infty} -\frac{1}{L} \log \pi_{\Lambda_L}^f\left(\lfloor m_{\Lambda_L}/2 \rfloor = 0\right) = \tau_\beta, \tag{1.3}$$

where $\tau_\beta$ is the surface tension in the horizontal direction $\vec{e}_1$.

1.2. The Glauber dynamics. The stochastic dynamics we want to study, sometimes referred to as the heat-bath dynamics, is a continuous time Markov chain on $\Omega_\Lambda$, reversible w.r.t. the measure $\pi_\Lambda^\tau$, that can be described as follows. With rate one and for each vertex $x$, the spin $\sigma_x$ is refreshed by sampling a new value from the set $\{-1,+1\}$ according to the conditional Gibbs measure $\pi_x := \pi_\Lambda^\tau(\cdot \mid \sigma_y,\ y \neq x)$. It is easy to check that the heat-bath chain is characterized by the generator

$$(\mathcal{L}_\Lambda^\tau f)(\sigma) = \sum_{x\in\Lambda} \left[\pi_x(f) - f(\sigma)\right], \tag{1.4}$$

where $\pi_x(f)$ denotes the average of $f$ with respect to the conditional Gibbs measure $\pi_x$, which acts only on the variable $\sigma_x$. The Dirichlet form associated to $\mathcal{L}_\Lambda^\tau$ takes the form

$$\mathcal{E}_\Lambda^\tau(f, f) = \sum_{x\in\Lambda} \pi_\Lambda^\tau\left(\operatorname{Var}_x(f)\right),$$

where $\operatorname{Var}_x(f)$ denotes the variance with respect to $\pi_x$. We will always denote by $\mu_t^\sigma$ the distribution of the chain at time $t$ when the starting point is $\sigma$. If $\sigma$ is either identically equal to $+1$ or $-1$ then we simply write $\mu_t^+$ or $\mu_t^-$.
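For the nearest-neighbor Hamiltonian above, the conditional law $\pi_x$ depends only on the sum $h_x$ of the four neighboring spins (boundary values included), and $\pi_x(\sigma_x = +1) = 1/(1 + e^{-2\beta h_x})$. The sketch below simulates the heat-bath chain with rate-one clocks under several assumptions of its own: a constant boundary value `tau`, illustrative names `plus_probability` and `heat_bath_run`, and a seeded random generator so that two runs with the same seed share clock rings and uniform variables.

```python
import random
from math import exp


def plus_probability(local_field, beta):
    """Heat-bath probability of sigma_x = +1 given the sum of neighboring spins."""
    return 1.0 / (1.0 + exp(-2.0 * beta * local_field))


def heat_bath_run(sigma, tau, beta, t_max, seed=0):
    """Run the continuous-time heat-bath dynamics up to time t_max.

    sigma: dict site -> +-1 (initial configuration inside the box)
    tau:   constant boundary spin used for every site outside the box
    """
    rng = random.Random(seed)
    sigma = dict(sigma)
    sites = sorted(sigma)
    # Superposing one rate-1 clock per site gives rings at total rate |box|,
    # each ring attached to a uniformly chosen site.
    t = rng.expovariate(len(sites))
    while t < t_max:
        i, j = rng.choice(sites)
        field = sum(
            sigma.get(y, tau)
            for y in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1))
        )
        sigma[(i, j)] = 1 if rng.random() < plus_probability(field, beta) else -1
        t += rng.expovariate(len(sites))
    return sigma


box = [(i, j) for i in range(3) for j in range(3)]
from_minus = heat_bath_run({x: -1 for x in box}, tau=1, beta=0.8, t_max=5.0, seed=7)
from_plus = heat_bath_run({x: +1 for x in box}, tau=1, beta=0.8, t_max=5.0, seed=7)
```

Because `plus_probability` is increasing in the local field, sharing the randomness keeps the run started from all minus below the run started from all plus at every site.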


The boundary conditions $\tau$ are usually not explicitly spelled out for lightness of notation. Sometimes we write $\mu_{\Lambda,t}^{\sigma}$ when we wish to emphasize that we are looking at the evolution for a system enclosed in the domain $\Lambda$. The Glauber dynamics with the heat-bath updating rule satisfies a particularly useful monotonicity property. It is possible to construct on the same probability space (the one built from the independent Poisson clocks attached to each vertex and from the independent coin tosses associated to each ring) a Markov chain $\{\eta_t^{\sigma,\tau}\}_{t\ge 0}$, $(\sigma, \tau) \in \Omega_\Lambda \times \Omega_{\Lambda^c}$, such that

• for each $\tau \in \Omega_{\Lambda^c}$ and $\sigma \in \Omega_\Lambda$ the coordinate process $(\eta_t^{\sigma,\tau})_{t\ge 0}$ is a version of the Glauber chain started from $\sigma$ with boundary conditions $\tau$;
• for any $t \ge 0$, $\eta_t^{\sigma,\tau} \le \eta_t^{\sigma',\tau'}$ whenever $\sigma \le \sigma'$ and $\tau \le \tau'$.

It is possible to extend the above definition of the generator $\mathcal{L}_\Lambda^\tau$ directly to the whole lattice $\mathbb{Z}^2$ and get a well defined Markov process on $\Omega_{\mathbb{Z}^2}$ (see e.g. [13]). The latter will be referred to as the infinite volume Glauber dynamics, with generator denoted by $\mathcal{L}$. Two key quantities measure the speed of relaxation to equilibrium of the Glauber dynamics. The first one is the relaxation time $T_{\rm relax}$.

Definition 1.3. $T_{\rm relax}$ is the best constant $C$ in the Poincaré inequality

$$\operatorname{Var}_\Lambda^\tau(f) := \operatorname{Var}_{\pi_\Lambda^\tau}(f) \le C\, \mathcal{E}_\Lambda^\tau(f, f), \qquad \forall f : \Omega_\Lambda \to \mathbb{R}. \tag{1.5}$$

In particular, for any $f : \Omega_\Lambda \to \mathbb{R}$, it follows that

$$\operatorname{Var}_\Lambda^\tau\left(e^{t\mathcal{L}_\Lambda^\tau} f\right)^{1/2} \le e^{-t/T_{\rm relax}} \operatorname{Var}_\Lambda^\tau(f)^{1/2}. \tag{1.6}$$

We will write $\mathrm{gap} := \mathrm{gap}_\Lambda^\tau$ for the inverse of $T_{\rm relax}$. Another relevant quantity is the mixing time, which is defined as follows. Recall that the total variation distance between two measures $\mu, \nu$ on a finite probability space $\Omega$ is defined as

$$\|\mu - \nu\| := \frac{1}{2} \sum_{\sigma\in\Omega} |\mu(\sigma) - \nu(\sigma)|. \tag{1.7}$$

Definition 1.4. For any $\epsilon \in (0, 1)$, we define

$$T_{\rm mix}(\epsilon) := \inf\{t > 0 : \sup_\sigma \|\mu_t^\sigma - \pi_\Lambda^\tau\| \le \epsilon\}. \tag{1.8}$$

When $\epsilon = 1/(2e)$ we will simply write $T_{\rm mix}$. With this definition it follows in particular that (see e.g. [14])

$$\sup_\sigma \|\mu_t^\sigma - \pi_\Lambda^\tau\| \le (2\epsilon)^{\lfloor t/T_{\rm mix}(\epsilon) \rfloor} \qquad \forall t \ge 0. \tag{1.9}$$
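The total variation distance (1.7) and the mixing time (1.8) are easy to evaluate on a toy example. The sketch below is only a discrete-time analogue: it replaces the continuous-time dynamics by a two-state Markov chain with an explicitly given kernel, which is an assumption made for illustration, and the names `total_variation`, `step` and `mixing_time` are not from the paper.

```python
from math import e


def total_variation(mu, nu):
    """Half the l1 distance between two laws given as dicts on the same states."""
    return 0.5 * sum(abs(mu[s] - nu[s]) for s in mu)


def step(mu, kernel):
    """One step of the chain: (mu P)(s2) = sum_s mu(s) P(s, s2)."""
    out = {s: 0.0 for s in mu}
    for s, p in mu.items():
        for s2, q in kernel[s].items():
            out[s2] += p * q
    return out


def mixing_time(kernel, pi, eps):
    """Smallest n such that every starting state is within eps of pi after n steps."""
    laws = {s: {r: float(r == s) for r in pi} for s in pi}
    n = 0
    while max(total_variation(mu, pi) for mu in laws.values()) > eps:
        laws = {s: step(mu, kernel) for s, mu in laws.items()}
        n += 1
    return n


# Toy two-state chain; its stationary law is (2/3, 1/3).
kernel = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
pi = {0: 2 / 3, 1: 1 / 3}
t_mix = mixing_time(kernel, pi, eps=1 / (2 * e))
```

For this chain the worst-case distance after $n$ steps is $\tfrac{2}{3}(0.7)^n$, so the threshold $1/(2e)$ is first met at $n = 4$.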

As is well known (see e.g. [14]), the following bounds between $T_{\rm relax}$ and $T_{\rm mix}$ hold:

$$T_{\rm relax} \le T_{\rm mix} \le \log\left(\frac{2e}{\pi_*}\right) T_{\rm relax}, \tag{1.10}$$

where $\pi_* = \min_\sigma \pi_\Lambda^\tau(\sigma)$. Notice that $\pi_* \ge e^{-c|\Lambda|}$ for some constant $c = c(\beta)$ and therefore the two quantities differ at most by const $\times$ volume. Another definition we will often need is the following:

Definition 1.5. Let $\mu, \nu$ be measures on $\Omega_\Lambda$, let $\sigma \in \Omega_\Lambda$ and $V \subset \Lambda$. Then, $\|\mu - \nu\|_V$ denotes the variation distance between the marginals of $\mu$ and $\nu$ on $V$, and $\sigma_V$ the restriction of $\sigma$ to $V$.


1.3. Main results. Our main result considerably improves upon the existing upper bound on the mixing time (and therefore also on the relaxation time) when $\Lambda$ is a square box and the boundary conditions $\tau$ are homogeneous, i.e. either all plus or all minus. As a by-product we also get a new bound on the time auto-correlation function of, e.g., the spin at the origin for the infinite volume Glauber dynamics started from the plus phase $\pi_\infty^+$. Before stating the results we quickly review what was known so far. In what follows $\Lambda_L$ will always be an $L \times L$ box. When the boundary conditions are free, a simple bottleneck argument proves that

$$T_{\rm relax} \ge \frac{1}{L^2} \left(\pi_{\Lambda_L}^f\left(\lfloor m_{\Lambda_L}/2 \rfloor = 0\right)\right)^{-1}$$

so that (recall (1.3))

$$\lim_{L\to\infty} \frac{1}{L} \log(T_{\rm relax}) \ge \tau_\beta.$$

In [16] such a result was improved to an equality for large enough values of $\beta$ and in [8] for any $\beta > \beta_c$. Quite different is the situation for homogeneous boundary conditions, e.g. all plus, for which the bottleneck between the two phases is removed by the boundary conditions and the relaxation process should occur on a much shorter time scale. In this case one expects a polynomial growth of both $T_{\rm relax}$ and $T_{\rm mix}$ of the form

$$T_{\rm relax} \approx L, \qquad T_{\rm mix} \approx L^2.$$

The reason behind the difference in the power of $L$ of the two growths seems to be quite subtle and largely not yet understood at the mathematical level. The only rigorous results in this direction are those obtained in [6] where, apart from logarithmic corrections, the appropriate lower bounds on $T_{\rm relax}$ and $T_{\rm mix}$ have been established by means of quite subtle test functions combined with the whole machinery of the Wulff construction. As far as upper bounds are concerned, they proved to be quite hard to obtain and the available results are still quite poor. In the case of homogeneous boundary conditions it was first shown in [16] that, for $\beta$ large enough and any $\varepsilon > 0$,

$$T_{\rm relax} \le \exp\left(cL^{1/2+\varepsilon}\right)$$

for a suitable constant $c$ depending on $\varepsilon$ and $\beta$. Later such a bound was improved to $\exp(c\sqrt{L}\log L)$ in [12]. When the inverse temperature $\beta$ is just above the critical value, the only available result is much weaker (see [8]) and of the form

$$\lim_{L\to\infty} \frac{1}{L} \log(T_{\rm relax}) = 0.$$

Finally, when $f(\sigma) = \sigma_0$ the above bounds combined with some simple monotonicity arguments prove that, for any $\alpha > 0$,

$$\operatorname{Var}_\infty^+\left(e^{t\mathcal{L}} f\right) \le c/t^\alpha$$

(where $\operatorname{Var}_\infty^+$ denotes the variance w.r.t. the plus phase $\pi_\infty^+$) while the expected behavior is $O(e^{-\sqrt{t}})$, see [10]. We are now in a position to state our main results.


Theorem 1.6. Let $\beta$ be large enough and let $L$ belong to the sequence $\{2^n - 1\}_{n\in\mathbb{N}}$.

(1) If the boundary conditions (b.c.) $\tau$ are sampled from a law $\mathbf{P}$ which either stochastically dominates the pure phase $\pi_\infty^+$ or is stochastically dominated by $\pi_\infty^-$ (see Sect. 2.2), there exists $c = c(\beta, \varepsilon)$ (independent of $\mathbf{P}$) such that

$$\mathbf{E}\,\|\mu_{t_L}^{\pm} - \pi_{\Lambda_L}^\tau\| \le \exp\left(-cL^{\varepsilon^2/16}\right), \tag{1.11}$$

where $t_L = \exp(cL^{\varepsilon})$. In particular,

$$\mathbf{P}\left(T_{\rm mix} \ge t_L\right) \le \exp\left(-cL^{\varepsilon^2/16}\right). \tag{1.12}$$

(2) The estimates (1.11)–(1.12) hold also if $\mathbf{P}$ is stochastically dominated by $\pi_\infty^-$ on one side of $\Lambda_L$, and stochastically dominates $\pi_\infty^+$ on the union of the other three sides. Similarly if the role of $+$ and $-$ is reversed.

The most natural consequence of the above result is

Corollary 1.7. Let $\beta$ be large enough and let $L$ belong to the sequence $\{2^n - 1\}_{n\in\mathbb{N}}$. Consider the square $\Lambda_L$ with b.c. $\tau \equiv +$. For every $\varepsilon > 0$ there exists $c = c(\beta, \varepsilon) < \infty$ such that

$$T_{\rm mix} \le e^{cL^{\varepsilon}}. \tag{1.13}$$

The same bound holds if the boundary conditions are $+$ on three sides and $-$ on the remaining one. Similarly if $+$ is replaced by $-$.

Remark 1.8. (i) In the proof of Theorem 1.6 and of Corollary 1.10 below, we need at some point some key equilibrium estimates which are proved in the appendix via standard cluster expansion techniques for values of $\beta$ large enough. However, we expect those bounds to hold for every $\beta > \beta_c$. Since this is the only part of the proof where the value of $\beta$ comes into play, we expect Theorem 1.6 and Corollary 1.10 to hold for any $\beta > \beta_c$. Let us also point out that, while we restrict for simplicity to the nearest-neighbor Ising model, we believe that our techniques can be generalized without conceptual difficulties to ferromagnetic Ising models with finite-range interactions. In particular, cluster expansion results for large $\beta$ are known to hold also in this more general situation.

(ii) The restriction that $L$ belongs to the sequence $\{2^n - 1\}_{n\in\mathbb{N}}$ is purely technical and it is a consequence of the iterative procedure we use. It would not be difficult to eliminate this restriction by somewhat modifying our iteration below (see Remark 3.12 at the end of the proof of Theorem 3.2), but we have decided not to do this, in order to keep the presentation as simple as possible.

(iii) The above results have been stated for the heat-bath dynamics but they actually apply to any other single site Glauber dynamics (e.g. the Metropolis chain) with jump rates uniformly positive (e.g. greater than $\delta > 0$) as can be seen via standard comparison techniques [17]. More precisely, if $\hat{T}_{\rm mix}$ and $\hat{T}_{\rm relax}$ denote the mixing and relaxation times of the new chain, then there exist constants $c, c'$ depending on $\delta, \beta$ such that $\hat{T}_{\rm mix} \le c|\Lambda|\hat{T}_{\rm relax} \le c'|\Lambda| T_{\rm relax} \le c'|\Lambda| T_{\rm mix}$; the results we are after then follow since $|\Lambda|$ represents a polynomial correction which is irrelevant in our case.


Fig. 1. The rectangle $\Lambda$ and its enlargement $E_L(\Lambda)$

(iv) Notice that in some sense our result (1.12) is not so far from optimality. Indeed, consider the distribution $\mathbf{P}$ such that $\tau = +$ except for the boundary sites which are at distance at most $L^{\varepsilon}$ from one of the corners of the box, where $\tau$ is sampled from $\pi_\infty^+$. Clearly $\mathbf{P}$ stochastically dominates $\pi_\infty^+$. Then, with $\mathbf{P}$-probability $\exp(-cL^{\varepsilon})$, $\tau = -$ around the corners and, thanks to the results of [1], $T_{\rm mix} \ge \exp(cL^{\varepsilon})$.

1.4. Applications. It is intuitive that if the b.c. are all $+$ (all $-$) and we start from the all $+$ (all $-$) configuration, equilibration will be much quicker. Indeed, we have the following

Corollary 1.9. Let $\beta$ be large enough and $\tau \equiv +$. For every $\varepsilon > 0$ there exists $c = c(\beta, \varepsilon) > 0$ such that

$$\lim_{L\to\infty} \|\mu_{t_1}^+ - \pi_{\Lambda_L}^\tau\| = 0, \tag{1.14}$$

where $t_1 := \exp(c(\log L)^{\varepsilon})$. By a global spin flip the same results hold if $+$ is replaced by $-$.

Finally, here is the result about the decay of time auto-correlations for the infinite-volume dynamics in a pure phase:

Corollary 1.10. Let $\beta$ be large, let $f(\sigma) = \sigma_0$ and let $\rho(t) \equiv \operatorname{Var}_\infty^+\left(e^{t\mathcal{L}} f\right)$ be the time auto-correlation of the spin at the origin in the plus phase $\pi_\infty^+$. Then for any $\varepsilon > 0$ there exists a constant $c = c(\beta, \varepsilon)$ such that

$$\rho(t) \le c\, e^{-(1/c)(\log t)^{1/\varepsilon}}. \tag{1.15}$$

2. Auxiliary Definitions and Results

In this section we collect some more detailed notation that will be needed during the proof of the main results, together with certain additional auxiliary results that will play a key role in our analysis.

2.1. Geometrical definitions. The boundary of a finite subset $\Lambda \subset \mathbb{Z}^2$, in the sequel denoted by $\partial\Lambda$, consists of those sites in $\mathbb{Z}^2 \setminus \Lambda$ at unit distance from $\Lambda$. Given a rectangle $\Lambda \subset \mathbb{Z}^2$ and $L \in \mathbb{N}$, we denote by $E_L(\Lambda)$ the enlarged rectangle obtained from $\Lambda$ by shifting by $L$ units the Northern boundary upwards, the Eastern boundary eastward and the Western boundary westward (see Fig. 1).


Given $\varepsilon > 0$ (to be thought of as very small) and $L \in \mathbb{N}$ we let

$$R_L^{\varepsilon} = \{(i, j) \in \mathbb{Z}^2 : 1 \le i \le L,\ 1 \le j \le L^{\frac{1}{2}+\varepsilon}\}.$$

Similarly we define the rectangle $Q_L^{\varepsilon}$, the only difference being that the vertical sides contain now $(2L + 1)^{\frac{1}{2}+\varepsilon}$ sites.

Notation warning. In the sequel we will often remove the superscript $\varepsilon$ from our notation of the various rectangles involved since it is a (small) parameter that we imagine given once and for all.

2.2. Boundary conditions. A boundary condition $\tau$ for a given domain (typically, a rectangle) is an assignment of values $\pm 1$ to each spin on the boundary of the domain under consideration.

Definition 2.1. A distribution $\mathbf{P}$ of b.c. for a rectangle $R$ (which will be $R_L$, $Q_L$ or a rectangle obtained by translating one of them by a vector $v \in \mathbb{Z}^2$) is said to belong to $\mathcal{D}(R)$ if its marginal on the union of North, East and West borders of $R$ is stochastically dominated by (the marginal of) the minus phase $\pi_\infty^-$ of the infinite system, while the marginal on the South border of $R$ dominates the (marginal of the) infinite plus phase $\pi_\infty^+$.

The most natural example is to take $\mathbf{P}$ concentrated on the boundary conditions $\tau$ given by $\tau \equiv -$ on the North, East and West borders, and $\tau \equiv +$ on the South border. In that case we will sometimes write $\pi_R^{-,-,+,-}$ for the equilibrium measure in $R$, where we agree to order the sides of the border clockwise starting from the Northern one.

2.3. The inductive statements. Here we define two inductive statements that will be proved later by a “halving the scale” technique.

Definition 2.2. For any given $L \in \mathbb{N}$, $\delta > 0$, $t > 0$ consider the system in $R_L$, with boundary condition $\tau$ chosen from some distribution $P$. We say that $A(L, t, \delta)$ holds if

$$\mathbb{E}\,\|\mu^{\pm}_t - \pi^\tau\| \le \delta \tag{2.1}$$

for every $P \in D(R_L)$. The statement $B(L, t, \delta)$ is defined similarly, the only difference being that the rectangle $R_L$ is replaced by $Q_L$ (in particular, $P$ is required to belong to $D(Q_L)$).

2.4. Censoring inequalities. In this section, we consider the Glauber dynamics in a generic finite domain $\Lambda \subset \mathbb{Z}^2$, not necessarily a rectangle. The boundary conditions $\tau$ are not specified, because the results do not depend on them. A fundamental role in our work is played by the censoring inequality proved recently by Y. Peres and P. Winkler: this says, roughly speaking, that removing (deterministically) some updates from the dynamics can only slow down equilibration, if the initial configuration is the maximal (or minimal) one. First of all we need a simple but useful lemma:


F. Martinelli, F. L. Toninelli

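Lemma 2.3. [22, Lemma 16.7] Let $\pi, \mu, \nu$ be laws on a finite, partially ordered probability space. If $\nu \preceq \mu$ and $\nu/\pi$ is increasing, i.e.

$$\frac{\nu(\sigma)}{\pi(\sigma)} \ge \frac{\nu(\eta)}{\pi(\eta)} \tag{2.2}$$

whenever $\sigma \ge \eta$, then

$$\|\nu - \pi\| \le \|\mu - \pi\|. \tag{2.3}$$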

The result of Peres-Winkler can be stated as follows:

Theorem 2.4. [22, Theorem 16.5] Let $m \in \mathbb{N}$, $v := (v_1, \ldots, v_m)$ a sequence of sites in $\Lambda$, and let $v'$ be a sub-sequence of $v$. Let $\mu_0$ be a law on $\Omega_\Lambda$ such that $\mu_0/\pi$ is increasing. Denote by $\mu^{v}$ the law obtained starting from $\mu_0$ and performing heat-bath updates at the ordered sequence of sites $v$. Similarly for $\mu^{v'}$. Then,

$$\|\mu^{v} - \pi\| \le \|\mu^{v'} - \pi\| \tag{2.4}$$

and $\mu^{v} \preceq \mu^{v'}$. Moreover, $\mu^{v}/\pi$ and $\mu^{v'}/\pi$ are increasing.

It is easy to see that, if $\mu_0/\pi$ is instead decreasing, (2.4) still holds, while the other statements become $\mu^{v} \succeq \mu^{v'}$ and $\mu^{v}/\pi$, $\mu^{v'}/\pi$ decreasing. Here, “performing a heat-bath update at a given site $v \in \Lambda$” simply means freezing the configuration outside $v$ and extracting $\sigma_v$ from the equilibrium distribution conditioned on the configuration outside $v$. Theorem 2.4 is proved in [22] in the particular case where $\mu_0$ is the measure concentrated at the all $+$ configuration, but the proof of the above generalized statement is essentially identical. Let us emphasize that such a result is not specific to the Ising model but requires in an essential way monotonicity of the dynamics. From Lemma 2.3 and Theorem 2.4 we easily extract the continuous-time censoring inequality we need:

Theorem 2.5. Let $n \in \mathbb{N}$, $0 \equiv t_0 < t_1 < \ldots < t_n \equiv T$ and $\Lambda_i \subset \Lambda$, $i = 1, \ldots, n$. Let $\mu_0$ be a law on $\Omega_\Lambda$ such that $\mu_0/\pi$ is increasing. Let $\mu_T$ be the law at time $T$ of the continuous-time, heat-bath dynamics in $\Lambda$, started from $\mu_0$ at time zero. Also, let $\mu'_T$ be the law at time $T$ of the modified dynamics which again starts from $\mu_0$ at time zero, and which is obtained from the above continuous-time, heat-bath dynamics by keeping only the updates in $\Lambda_i$ in the time interval $[t_{i-1}, t_i)$ for $i = 1, \ldots, n$. Then,

$$\|\mu_T - \pi\| \le \|\mu'_T - \pi\|, \tag{2.5}$$

and $\mu_T \preceq \mu'_T$; moreover, $\mu_T/\pi$ and $\mu'_T/\pi$ are both increasing.
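As a concrete check of the censoring inequality, the following Python sketch computes exact laws for a toy system. It is only an illustration under stated assumptions: the system is a one-dimensional Ising chain with free boundary conditions (chosen to keep the state space small, whereas the paper works in two dimensions), and the helper names `gibbs`, `heat_bath`, `evolve`, `tv`, the parameters `n = 4`, `beta = 0.8` and the two update sequences are all hypothetical choices. Starting from the all-plus configuration, it compares the variation distance from equilibrium after a sequence of heat-bath updates with the distance after a censored sub-sequence.

```python
import itertools
import math


def gibbs(n, beta):
    """Gibbs measure of a 1D Ising chain of n spins, free boundary conditions."""
    configs = list(itertools.product([-1, 1], repeat=n))
    weights = [
        math.exp(beta * sum(s[i] * s[i + 1] for i in range(n - 1)))
        for s in configs
    ]
    z = sum(weights)
    return {s: w / z for s, w in zip(configs, weights)}


def heat_bath(mu, pi, site):
    """Freeze all spins except `site` and resample it from pi given the rest."""
    new = {s: 0.0 for s in pi}
    for s, p in mu.items():
        if p == 0.0:
            continue
        up = s[:site] + (1,) + s[site + 1:]
        down = s[:site] + (-1,) + s[site + 1:]
        tot = pi[up] + pi[down]
        new[up] += p * pi[up] / tot
        new[down] += p * pi[down] / tot
    return new


def evolve(mu0, pi, sites):
    """Apply heat-bath updates at the ordered sequence `sites`."""
    mu = dict(mu0)
    for v in sites:
        mu = heat_bath(mu, pi, v)
    return mu


def tv(mu, pi):
    """Total variation distance between two laws on the same finite space."""
    return 0.5 * sum(abs(mu[s] - pi[s]) for s in pi)


n, beta = 4, 0.8                 # illustrative values only
pi = gibbs(n, beta)
mu0 = {s: 0.0 for s in pi}
mu0[(1,) * n] = 1.0              # start from the maximal (all plus) configuration
v = [0, 1, 2, 3, 1, 2, 0, 3]     # full update sequence
v_sub = [0, 2, 1, 0]             # a sub-sequence of v (censored dynamics)
d_full = tv(evolve(mu0, pi, v), pi)
d_censored = tv(evolve(mu0, pi, v_sub), pi)
print(d_full, d_censored)
```

Since the chain is ferromagnetic and the initial law is concentrated on the maximal configuration, the first printed distance should not exceed the second, in line with (2.4).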

Needless to say, if instead $\mu_0/\pi$ is decreasing then all inequalities except (2.5) are reversed.

Proof. Let $m$ be the (random) number of Poisson clocks which ring during the time interval $[0, T)$, and denote by $s_i$ and $v_i \in \Lambda$, $i \le m$, the times and sites where they ring. We order the times as $s_i < s_{i+1}$ and of course the $v_i$ are IID and chosen uniformly in $\Lambda$. Define then $w := ((v_1, s_1), \ldots, (v_m, s_m))$ and let $\mu^{w}$ be obtained from $\mu_0$ performing single-site heat-bath updates at sites $v_1, v_2, \ldots, v_m$ (in this order). Analogously, let $w'$ be obtained from $w$ by removing all pairs $(v_j, s_j)$ such that $v_j \notin \Lambda_k$, where $k$ is such that $s_j \in [t_{k-1}, t_k)$, and $\mu^{w'}$ is defined in the obvious way. For any realization of $w$ one has from Theorem 2.4 that $\mu^{w} \preceq \mu^{w'}$ and that both $\mu^{w}/\pi$ and $\mu^{w'}/\pi$ are increasing. Since $\mu_T$ (respectively $\mu'_T$) is just the average over $w$ of $\mu^{w}$ (resp. of $\mu^{w'}$), one obtains all the claims of the theorem (except (2.5)) by linearity. Inequality (2.5) comes simply from $\mu_T \preceq \mu'_T$, plus Lemma 2.3 and the fact that $\mu_T/\pi$ is increasing.

We will need at various instances the following easy consequences of the above facts.

Corollary 2.6. Let $t > 0$ and assume that $\mu_0/\pi$ is increasing. Denote by $\mu_t$ the evolution started from $\mu_{t=0} = \mu_0$, and by $\mu^+_t$ the one started from the maximal configuration $+$. Then

$$\|\mu_t - \pi\| \le \|\mu^+_t - \pi\|. \tag{2.6}$$

Proof. We know from Theorem 2.5 that $\mu_t/\pi$ is increasing. Moreover, by monotonicity of the dynamics $\mu_t \preceq \mu^+_t$. The claim then follows from Lemma 2.3.

Corollary 2.7. Let $\gamma(t) = \max\left(\|\mu^+_t - \pi\|, \|\mu^-_t - \pi\|\right)$. Then

$$\gamma(t + s) \le 4\gamma(t)\gamma(s) \qquad \forall t, s \ge 0.$$

Proof. Notice that $\|\mu^+_{t+s} - \pi\| = \mu^+_{t+s}(A) - \pi(A)$, where $A = \{\sigma : \mu^+_{t+s}(\sigma) \ge \pi(\sigma)\}$. Because of Theorem 2.5 the event $A$ is increasing so that $f := \mathbb{1}_A - \pi(A)$ is an increasing function (and of course $\pi(f) = 0$). Thus

$$\begin{aligned}
\|\mu^+_{t+s} - \pi\| &= \mu^+_{t+s}(A) - \pi(A) = \mu^+_t\big(\mu^\sigma_s(f)\big) = \mu^+_t\big(\mu^\sigma_s(f)\big) - \pi\big(\mu^\sigma_s(f)\big) \le 2\gamma(t) \sup_\sigma |\mu^\sigma_s(f)| \\
&\le 2\gamma(t) \max\{|\mu^+_s(f)|, |\mu^-_s(f)|\} \le 4\gamma(t)\gamma(s).
\end{aligned}$$

Similarly for $\mu^-$.
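The submultiplicativity bound of Corollary 2.7 can be checked numerically on a toy example. The sketch below makes two simplifying assumptions that are not in the paper: it uses a three-spin Ising chain with free boundary conditions, and it replaces the continuous-time dynamics by the discrete-time random-scan heat-bath chain (pick a uniform site, resample it), for which the same argument applies. The names `N`, `BETA`, `step` and `gamma` are hypothetical.

```python
import itertools
import math

N, BETA = 3, 0.5  # hypothetical toy parameters
CONFIGS = list(itertools.product([-1, 1], repeat=N))


def weight(s):
    """Boltzmann weight of a configuration of the free-boundary chain."""
    return math.exp(BETA * sum(s[i] * s[i + 1] for i in range(N - 1)))


Z = sum(weight(s) for s in CONFIGS)
PI = {s: weight(s) / Z for s in CONFIGS}


def step(mu):
    """One step of random-scan heat-bath: uniform site, resampled from PI."""
    new = dict.fromkeys(CONFIGS, 0.0)
    for s, p in mu.items():
        for site in range(N):
            up = s[:site] + (1,) + s[site + 1:]
            down = s[:site] + (-1,) + s[site + 1:]
            tot = PI[up] + PI[down]
            new[up] += p * PI[up] / (tot * N)
            new[down] += p * PI[down] / (tot * N)
    return new


def gamma(t):
    """Largest distance from PI after t steps, over the all-minus and all-plus starts."""
    worst = 0.0
    for start in (CONFIGS[0], CONFIGS[-1]):  # all minus, all plus
        mu = dict.fromkeys(CONFIGS, 0.0)
        mu[start] = 1.0
        for _ in range(t):
            mu = step(mu)
        worst = max(worst, 0.5 * sum(abs(mu[s] - PI[s]) for s in CONFIGS))
    return worst


# Each entry is (t, s, gamma(t + s), 4 * gamma(t) * gamma(s)).
checks = [
    (t, s, gamma(t + s), 4 * gamma(t) * gamma(s))
    for t in range(5)
    for s in range(5)
]
```

Every entry of `checks` should have its third component bounded by its fourth; for small `t` and `s` the bound is trivial because `gamma` is still close to 1, and it becomes informative once `gamma` drops below 1/4.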

2.5. Perturbation of the boundary conditions and mixing time. Consider a finite set $\Lambda$ and two boundary conditions $\tau, \hat\tau$. Let $T_{\rm mix}$ and $\hat T_{\rm mix}$ be the associated mixing times for the Glauber chain in $\Lambda$ with b.c. $\tau$ and $\hat\tau$, respectively. Let $M = \max\left\{ \left\|\frac{\pi^\tau}{\pi^{\hat\tau}}\right\|_\infty , \left\|\frac{\pi^{\hat\tau}}{\pi^{\tau}}\right\|_\infty \right\}$.

Lemma 2.8. There exists a constant $c$ independent of $\Lambda, \tau, \hat\tau$ such that

$$T_{\rm mix} \le c M^3 |\Lambda|\, \hat T_{\rm mix}. \tag{2.7}$$

Proof. Thanks to (1.10) and to the variational characterization of the relaxation time we get

$$T_{\rm mix} \le c|\Lambda|\, T_{\rm relax} \le c|\Lambda| M^3\, \hat T_{\rm relax} \le c|\Lambda| M^3\, \hat T_{\rm mix},$$

where the third power of $M$ comes from expressing the Dirichlet form, the variance and the local variances w.r.t. $\pi^\tau$ in terms of those w.r.t. $\pi^{\hat\tau}$.


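Let now $\Delta \subset \partial\Lambda$, let $\tau_\Delta$ be some configuration in $\Omega_\Delta$, let $P$ be some distribution over the boundary conditions on $\partial\Lambda$ and let $P^\Delta$ be the distribution which assigns probability zero to b.c. $\tau$ not identically equal to $\tau_\Delta$ on $\Delta$ and whose marginal on $\partial\Lambda \setminus \Delta$ coincides with the same marginal of $P$. Notice that we can sample from $P^\Delta$ by first sampling from $P$ and then changing (if necessary) to $\tau_\Delta$ the spins of $\tau$ in $\Delta$. If the pair so obtained is denoted by $(\tau, \hat\tau)$ then the corresponding constant $M$ satisfies $M \le M_\Delta := e^{8\beta|\Delta|}$. Let $d^\pm(t) = \|\mu^\pm_t - \pi^\tau\|$ so that $\gamma(t) = \max\{d^+(t), d^-(t)\}$. Similarly for $\hat d^\pm(t)$, $\hat\gamma(t)$.

Lemma 2.9. With the above notation

$$\mathbb{E}\left(\gamma(t)\right) \le e^{-M_\Delta} + 8\,\mathbb{E}^\Delta\left(\hat\gamma(\hat t)\right),$$

where $\hat t = t/(c|\Lambda|^2 M_\Delta^4)$.

Proof. Thanks to (2.7) and (1.9),

$$\mathbb{E}\left(\gamma(t)\right) \le e^{-M_\Delta} + \mathbb{P}\left(T_{\rm mix} \ge t/M_\Delta\right) \le e^{-M_\Delta} + \mathbb{P}^\Delta\left(\hat T_{\rm mix} \ge t/(c|\Lambda| M_\Delta^4)\right) = e^{-M_\Delta} + \mathbb{P}^\Delta\left(\hat T_{\rm mix} \ge |\Lambda|\hat t\right).$$

Notice that, for any $s \ge 0$, $\hat T_{\rm mix} \ge s$ implies that there exists some starting configuration $\sigma$ for which the variation distance of its distribution at time $s$ from the equilibrium measure $\pi^{\hat\tau}$, call it $\hat d^\sigma(s)$, is at least $1/(2e)$. However, using the global monotone coupling of the Glauber chain,

$$\begin{aligned}
\hat d^\sigma(s) &\le \mathbb{P}\left(\eta_s^{+,\hat\tau} \ne \eta_s^{-,\hat\tau}\right) \le \sum_{x\in\Lambda} \left[\mathbb{P}\big(\eta_s^{+,\hat\tau}(x) = +\big) - \mathbb{P}\big(\eta_s^{-,\hat\tau}(x) = +\big)\right] \\
&\le |\Lambda| \left(\hat d^+(s) + \hat d^-(s)\right) \le 2|\Lambda|\hat\gamma(s),
\end{aligned} \tag{2.8}$$

and therefore

$$\mathbb{P}^\Delta\left(\hat T_{\rm mix} \ge |\Lambda|\hat t\right) \le \mathbb{P}^\Delta\left(\hat\gamma(|\Lambda|\hat t) \ge \frac{1}{4e|\Lambda|}\right). \tag{2.9}$$

Thanks to Corollary 2.7, $\hat\gamma(t) \le \left(4\hat\gamma(t_0)\right)^{\lfloor t/t_0 \rfloor}$ so that

$$\mathbb{P}^\Delta\left(\hat\gamma(|\Lambda|\hat t) \ge \frac{1}{4e|\Lambda|}\right) \le \mathbb{P}^\Delta\left(\hat\gamma(\hat t) \ge \frac{1}{8}\right) \le 8\,\mathbb{E}^\Delta\left(\hat\gamma(\hat t)\right).$$

Let us remark for later convenience that, exactly like in (2.8), one proves that

$$\sup_\sigma \|\mu^\sigma_t - \pi^\tau\| \le 2|\Lambda|\gamma(t). \tag{2.10}$$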

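With the same notation the following will turn out to be quite useful:

Corollary 2.10. Let $R_L \equiv R^\varepsilon_L$ and let $P \in D(R_L)$. Let also $\Delta \subset \partial R_L$ be such that $L^{3\varepsilon} \le |\Delta| \le 2L^{3\varepsilon}$. Assume that $\mathbb{E}^\Delta \|\mu^\pm_t - \pi^\tau\| \le \delta$ for every $P \in D(R_L)$. Then the statement $A(L, t', \delta')$ holds true with $\delta' = 8\delta + e^{-e^{8\beta L^{3\varepsilon}}}$ and $t' = t\,e^{cL^{3\varepsilon}}$ for some constant $c > 0$ independent of $\Delta$ and $\tau_\Delta$. Analogously $A(L, t, \delta)$ implies $\mathbb{E}^\Delta \|\mu^\pm_{t'} - \pi^\tau\| \le \delta'$.

Similar statements hold if we replace $R_L$ by $Q_L$ and $A(L, t', \delta')$ by $B(L, t', \delta')$.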


3. Recursion on Scales: The Heart of the Proof

This section represents the key of our results. We will inductively prove over the sequence of length scales $L_n = 2^{n+1} - 1$ that the statement $A(L_n, t_n, \delta_n)$ and its analog $B(L_n, t_n, \delta_n)$ hold true for suitable $t_n, \delta_n$ (see Theorem 3.2 below). In all this section $\varepsilon > 0$ is fixed very small once and for all. Accordingly, for any $L \in \mathbb{N}$, $R_L \equiv R^\varepsilon_L$ and similarly for $Q_L$. Finally $c, c'$ will denote positive numerical constants whose value may change from line to line. First we give a rough estimate which provides the starting point of the recursion:

Proposition 3.1. For every $\beta$ there exists $c = c(\beta)$ such that for every $L \in \mathbb{N}$ the statements $A(L, t, e^{-t e^{-cL}})$ and $B(L, t, e^{-t e^{-cL}})$ hold.

Proof. From rough estimates on the spectral gap [16, Cor. 2.1] and (1.10), one has that

$$T_{\rm mix} \le e^{cL} \tag{3.1}$$

uniformly in the boundary conditions $\tau$ and in $L \in \mathbb{N}$, both for $R_L$ and for $Q_L$. Applying (1.9) with the value $1/(2e)$, the claim is proved.

Theorem 3.2. For every $\beta$ there exist constants $c, c'$ such that:

(1) if $A(L, t, \delta)$ holds, then also $B(L, 2t, \delta_1)$ does, with

$$\delta_1 = \delta_1(L, \delta, t) = c\left(\delta + e^{-c' L^{2\varepsilon}} + L^2 e^{-c' \log t}\right).$$

(2) If $B(L, t, \delta)$ holds, then also $A(2L + 1, t_2, \delta_2)$ holds, with

$$t_2 = t_2(L, t) = e^{cL^{3\varepsilon}}\, t \tag{3.2}$$

and

$$\delta_2 = \delta_2(L, \delta) = c\left(\delta + e^{-c' L^{3\varepsilon}}\right). \tag{3.3}$$

Assuming the theorem we deduce

Corollary 3.3. There exist $c, c' > 0$ such that the following holds. For every $L \in \{2^n - 1\}_{n\in\mathbb{N}}$ there exists

$$\Delta(L) \le \exp\left(-c L^{\varepsilon^2}\right) \tag{3.4}$$

such that $A(L, t, \Delta(L))$ holds for every $t \ge e^{c' L^{3\varepsilon}}$.

Proof. Note that if one iterates $j$ times the map $x \mapsto 2x + 1$ starting from $x = 1$ one obtains $2^{j+1} - 1 =: L_j$. Assume now that $L = L_n$ for some large $n$ and set $n_0 := \lfloor \varepsilon n \rfloor$, so that $(1/c)L^\varepsilon \le L_{n_0} \le cL^\varepsilon$. From Theorem 3.2 one sees that it is possible to choose $c, c' > 0$ such that

$$A(L_j, t_j, \delta_j) \Rightarrow A(L_{j+1}, t_{j+1}, \delta_{j+1}) \tag{3.5}$$

with

$$t_{j+1} = 2\, t_j\, e^{cL_j^{3\varepsilon}} \tag{3.6}$$


Fig. 2. $Q_L$ and its covering with the rectangles $A$, $B$

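and

$$\delta_{j+1} = c\left(\delta_j + e^{-c' L_j^{2\varepsilon}} + L_j^2\, e^{-c' \log t_j}\right). \tag{3.7}$$

Let $t_{n_0} \equiv e^{cL^{3\varepsilon}}$ so that, thanks to Proposition 3.1, $A(L_{n_0}, t_{n_0}, \delta_{n_0})$ holds with

$$\delta_{n_0} = \exp\left(-e^{cL^{3\varepsilon}}\right). \tag{3.8}$$

Then, applying (3.5) $n - n_0$ times, one obtains the claim $A(L, T(L), \Delta(L))$ with

$$T(L) := 2^{n-n_0}\, e^{c \sum_{j=n_0}^{n} L_j^{3\varepsilon}} \le e^{c' L^{3\varepsilon}} \tag{3.9}$$

and

$$\Delta(L) \le L^{c}\left(\delta_{n_0} + e^{-c' L_{n_0}^{2\varepsilon}} + e^{-c' \log(t_{n_0})}\right) \le e^{-cL^{\varepsilon^2}}, \tag{3.10}$$

for a suitable constant $c$, where we used the rough bound (cf. (3.7))

$$\delta_{j+1} \le c\left(\delta_j + e^{-c' L_{n_0}^{2\varepsilon}} + L^2 e^{-c' \log(t_{n_0})}\right). \tag{3.11}$$

The statement for every $t \ge T(L)$ then follows from Corollary 2.7.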
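The bookkeeping behind the time bound (3.9) can be made concrete with a short Python sketch. It rests on assumptions made only for illustration: the helper names `scales` and `log_time` are hypothetical, and the values `c = 1`, `eps = 0.05`, `n = 20` are arbitrary rather than taken from the paper. The sketch iterates the map x -> 2x + 1 to produce the scales L_j, accumulates the logarithm of T(L), and compares the sum of the L_j^{3 eps} with the geometric-series bound that follows from L_j <= L_n 2^{-(n-j)}.

```python
import math


def scales(n):
    """Return [L_0, ..., L_n], where L_j is the j-th iterate of x -> 2x + 1 from x = 1."""
    out, x = [], 1
    for _ in range(n + 1):
        out.append(x)
        x = 2 * x + 1
    return out


def log_time(n, n0, eps, c=1.0):
    """log of 2^(n - n0) * exp(c * sum_{j = n0..n} L_j^(3 eps)), the quantity in (3.9)."""
    L = scales(n)
    total = sum(L[j] ** (3 * eps) for j in range(n0, n + 1))
    return (n - n0) * math.log(2) + c * total


n, eps = 20, 0.05           # illustrative values only
n0 = int(eps * n)           # integer part of eps * n
L_n = scales(n)[-1]
geometric_bound = L_n ** (3 * eps) / (1 - 2 ** (-3 * eps))
log_T = log_time(n, n0, eps)
print(L_n, log_T, (n - n0) * math.log(2) + geometric_bound)
```

Because consecutive scales differ by a factor larger than 2, the sum in the exponent is dominated by its last term up to a constant depending on eps, which is why T(L) can be bounded by a single exponential in L^{3 eps}.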

3.1. Proof of Theorem 3.2: part (1).

i) We begin by proving that for every distribution $P \in D(Q_L)$ one has

$$\mathbb{E}\,\|\mu^+_{2t} - \pi^\tau\| \le \delta_1. \tag{3.12}$$

Observe that $Q_L$ can be seen as the union of two overlapping rectangles $A$ and $B$, where $B$ is just the basic rectangle $R_L$ and $A$ is obtained by shifting $B$ to the North by $(2L + 1)^{1/2+\varepsilon} - L^{1/2+\varepsilon}$ (see Fig. 2). Let now $\tilde\mu^+_{2t}$ denote the distribution at time $2t$ of the dynamics started from the all $+$ configuration and subject to the following “massage”: in the time interval $[0, t)$ we keep only the updates in $A$, at time $t$ we increase all the spins in $B$ to $+1$ and in the interval $(t, 2t]$ we keep only the updates in $B$.


Lemma 3.4. $\|\mu^+_{2t} - \pi^\tau\| \le \|\tilde\mu^+_{2t} - \pi^\tau\|$.

Proof. Let $\hat\mu^+_{2t}$ denote the distribution at time $2t$ of the dynamics started from the all $+$ configuration and subject to the following “censoring”: in the time interval $[0, t)$ we keep only the updates in $A$ and in the interval $[t, 2t]$ only the updates in $B$. By Theorem 2.5, $\hat\mu^+_{2t}/\pi^\tau$ is increasing. Moreover $\hat\mu^+_{2t} \preceq \tilde\mu^+_{2t}$ which combined with Lemma 2.3 proves the result.

In order to better organize the notation we need the following:

Definition 3.5. We let
(a) $\nu_1$ be the distribution obtained at time $t$ after the first half of the “massage”. Clearly $\nu_1$ assigns zero probability to configurations that are not identical to $+$ in $A^c$;
(b) $\nu_2^\sigma$ be the distribution obtained from the second half of the censoring starting (at time $t$) from a configuration equal to $+$ in $B$ and to $\sigma$ in $B^c$. Clearly $\nu_2^\sigma$ assigns zero probability to configurations that are not identical to $\sigma$ in $B^c$;
(c) $\pi_A^{\tau,+} := \pi^\tau(\cdot \mid \sigma_{A^c} = +)$;
(d) $\pi_B^{\tau,\eta} := \pi^\tau(\cdot \mid \sigma_{B^c} = \eta)$;
(e) $\pi^{\tau,-}$ (resp. $\pi^{\tau,+}$) be the Gibbs measure in $Q_L$ with minus (resp. plus) b.c. on its South boundary and $\tau$ on the North, East and West borders.

With these notations the distribution $\tilde\mu^+_{2t}$ is written as

$$\tilde\mu^+_{2t}(\eta) = \nu_1(\eta_{B^c})\, \nu_2^{\eta_{B^c}}(\eta).$$

Notice that also the Gibbs measure $\pi^\tau$ has a similar expression, namely,

$$\pi^\tau(\eta) = \pi^\tau(\eta_{B^c})\, \pi_B^{\tau,\eta_{B^c}}(\eta).$$

Therefore

$$\begin{aligned}
\frac12 \sum_\eta |\tilde\mu^+_{2t}(\eta) - \pi^\tau(\eta)|
&\le \frac12 \sum_\eta \left|\nu_1(\eta_{B^c}) - \pi_A^{\tau,+}(\eta_{B^c})\right| \nu_2^{\eta_{B^c}}(\eta) + \frac12 \sum_\eta \left|\pi_A^{\tau,+}(\eta_{B^c})\, \nu_2^{\eta_{B^c}}(\eta) - \pi^\tau(\eta)\right| \\
&= \|\nu_1 - \pi_A^{\tau,+}\|_{B^c} + \|\gamma - \pi^\tau\|,
\end{aligned} \tag{3.13}$$

where

$$\gamma(\eta) := \pi_A^{\tau,+}(\eta_{B^c})\, \nu_2^{\eta_{B^c}}(\eta).$$

Clearly

$$\|\gamma - \pi^\tau\| \le \pi^{\tau,-}\left(\|\nu_2^{\eta_{B^c}} - \pi_B^{\tau,\eta_{B^c}}\|\right) + \|\pi_A^{\tau,+} - \pi^\tau\|_{B^c} + \|\pi^\tau - \pi^{\tau,-}\|_{B^c}.$$

In conclusion

$$\mathbb{E}\,\|\mu^+_{2t} - \pi^\tau\| \le \mathbb{E}\,\|\nu_1 - \pi_A^{\tau,+}\|_{B^c} + \mathbb{E}\,\pi^{\tau,-}\left(\|\nu_2^{\eta_{B^c}} - \pi_B^{\tau,\eta_{B^c}}\|\right) + \mathbb{E}\,\|\pi_A^{\tau,+} - \pi^\tau\|_{B^c} + \mathbb{E}\,\|\pi^\tau - \pi^{\tau,-}\|_{B^c}. \tag{3.14}$$


By assumption the first term in the r.h.s. of (3.14) is smaller than $\delta$. Next we analyze the second term. In this case, if we denote the four boundary conditions around $B$, ordered clockwise starting from the North one, by $\tau_1, \tau_2, \tau_3, \tau_4$, then their distribution $P^-$ is given by

$$P^-(\tau_1, \tau_2, \tau_3, \tau_4) = P(\tau_2, \tau_3, \tau_4)\, \mathbb{E}\left(\pi^{\tau,-}(\tau_1) \mid \tau_2, \tau_4\right).$$

Notice that the marginal of $P^-$ on $\tau_3$ coincides with that of $P$ and therefore stochastically dominates the corresponding marginal of $\pi^+_\infty$. It remains to examine the marginal on $(\tau_1, \tau_2, \tau_4)$. Let $f$ be a decreasing function of these variables and observe that, as a function of the boundary conditions on the North, East and West sides of $Q_L$, the average $\pi^{\tau,-}(f)$ is also decreasing. Therefore, since $P \in D(Q_L)$,

$$\mathbb{E}^-(f) = \mathbb{E}\left(\pi^{\tau,-}(f)\right) \ge \pi^-_\infty\left(\pi^{\tau,-}(f)\right) \ge \pi^-_\infty(f), \tag{3.15}$$

i.e. $P^- \in D(B)$. Therefore

$$\mathbb{E}\,\pi^{\tau,-}\left(\|\nu_2^{\eta_{B^c}} - \pi_B^{\tau,\eta_{B^c}}\|\right) = \mathbb{E}^-\left(\|\nu_2^{\eta_{B^c}} - \pi_B^{\tau,\eta_{B^c}}\|\right) \le \delta.$$

The third and the fourth term in (3.14) can be bounded from above by essentially the same argument, which we now present only for the fourth term. Clearly, for any choice of the boundary conditions $\tau$, $\pi^{\tau,-} \preceq \pi^\tau$. Therefore

$$\mathbb{E}\,\|\pi^\tau - \pi^{\tau,-}\|_{B^c} \le \sum_{x\in B^c} \mathbb{E}\left(\pi^\tau(\sigma_x = +) - \pi^{\tau,-}(\sigma_x = +)\right).$$

Claim 3.6. There exists $c = c(\beta, \varepsilon) > 0$ such that

$$\mathbb{E}\left(\pi^\tau(\sigma_x = +) - \pi^{\tau,-}(\sigma_x = +)\right) \le e^{-cL^{2\varepsilon}} \tag{3.16}$$

for every $x \in B^c$.

Proof. Let $\Gamma$ denote the event that in $B$ there is a $*$-connected chain (i.e. either the Euclidean distance between two consecutive vertices $v, v'$ of the chain equals 1, or it equals $\sqrt{2}$ and in that case the segment $vv'$ forms an angle $\pi/4$ with the horizontal axis) of $-$ spins which connects the East and West sides of $B$. By monotonicity,

$$\pi^\tau(\sigma_x = + \mid \Gamma) \le \pi^{\tau,-}(\sigma_x = +), \tag{3.17}$$

and therefore

$$\pi^\tau(\sigma_x = +) - \pi^{\tau,-}(\sigma_x = +) \le \pi^\tau(\Gamma^c).$$

By monotonicity

$$\mathbb{E}\left(\pi^\tau(\sigma_x = +) - \pi^{\tau,-}(\sigma_x = +)\right) \le \mathbb{E}\,\pi^\tau(\Gamma^c) \le \pi^-_\infty\left(\pi^{\tau,+}(\Gamma^c)\right),$$

where we recall that the superscript $+$ means that on the South border of $Q_L$ the b.c. are all plus. Let $\pi^{(-,-)}_\infty$ be the minus phase measure $\pi^-_\infty$ conditioned to have all minuses on the North, East and West borders of the enlarged rectangle $E_L(Q_L)$ (see Fig. 3). Standard bounds on the exponential decay of correlations in the minus phase (see for instance [20] or [24, Chap. V.8]) prove that

$$\pi^-_\infty\left(\pi^{\tau,+}(\Gamma^c)\right) \le \pi^{(-,-)}_\infty\left(\pi^{\tau,+}(\Gamma^c)\right) + e^{-cL} \tag{3.18}$$

Fig. 3. The rectangle $Q_L$ (thick line) and its enlargement $E_L(Q_L)$ (narrow line), with the b.c. of $\pi^{(-,-,+)}_\infty$

for some constant $c > 0$. If we now add extra plus b.c. on the whole horizontal line containing the South boundary of $Q_L$ and denote by $\pi^{(-,-,+)}_\infty$ the corresponding Gibbs measure then, by monotonicity and DLR equations, we obtain

$$\pi^{(-,-)}_\infty\left(\pi^{\tau,+}(\Gamma^c)\right) \le \pi^{(-,-,+)}_\infty\left(\pi^{\tau,+}(\Gamma^c)\right) = \pi^{(-,-,+)}_\infty(\Gamma^c). \tag{3.19}$$

Notice that $\pi^{(-,-,+)}_\infty$ is nothing but the Gibbs measure $\pi^{-,-,+,-}_{E_L(Q_L)}$ in the rectangle $E_L(Q_L)$ of Fig. 3, with $+$ b.c. on the South border and $-$ b.c. on the rest of the boundary. Next, note that the event $\Gamma^c$ implies that the unique open Peierls contour $\gamma$ (see the definition in Appendix A) crosses the horizontal line containing the South border of $A$, and we will prove in Appendix A.2 that

$$\pi^{-,-,+,-}_{E_L(Q_L)}\left(\gamma \text{ reaches the height of the South border of } A\right) \le e^{-cL^{2\varepsilon}}. \tag{3.20}$$

The intuition for (3.20) is that the open contour $\gamma$ behaves like a one-dimensional simple random walk starting at the origin and conditioned to stay positive and to return at time $L$ to the origin: the probability that before this time it goes at distance of order $L^{1/2+\varepsilon}$ from the origin is smaller than $\exp(-cL^{2\varepsilon})$. Altogether we have obtained

$$\mathbb{E}\,\|\mu^+_{2t} - \pi^\tau\| \le 2\delta + e^{-cL^{2\varepsilon}}.$$

ii) Now we consider the dynamics started from the all $-$ configuration and we prove

$$\mathbb{E}\,\|\mu^-_{2t} - \pi^\tau\| \le \delta_1. \tag{3.21}$$

By Theorem 2.5, $\|\mu^-_{2t} - \pi^\tau\| \le \|\tilde\mu^-_{2t} - \pi^\tau\|$, where this time $\tilde\mu^-_{2t}$ denotes the distribution at time $2t$ obtained by starting the Glauber dynamics from the minus initial condition and performing the following “massage” (the reverse of the previous one): in the time interval $[0, t)$ we keep only the updates in $B$, at time $t$ we reset to $-$ all the spins in $A$ and in the time interval $[t, 2t]$ we keep only the updates in $A$. In order to keep the notation as close as possible to that of the previous case where the starting configuration was all pluses we redefine

Definition 3.7.
(a) $\pi_B^{\tau,-} = \pi^\tau(\cdot \mid \sigma_{B^c} = -)$;
(b) $\pi_A^{\tau,\eta} = \pi^\tau(\cdot \mid \sigma_{A^c} = \eta)$;


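(c) $\nu_1$ is the distribution obtained after time $t$ and $\nu_2^\sigma$ is that obtained in the second time lag $t$ starting from the configuration equal to $-$ in $A$ and to $\sigma$ in $A^c$.

With these notations the same computation leading to (3.14) gives

$$\mathbb{E}\,\|\tilde\mu^-_{2t} - \pi^\tau\| \le \mathbb{E}\,\|\nu_1 - \pi_B^{\tau,-}\|_{A^c} + \mathbb{E}\,\pi^\tau\left(\|\nu_2^{\eta_{A^c}} - \pi_A^{\tau,\eta_{A^c}}\|\right) + \mathbb{E}\,\|\pi_B^{\tau,-} - \pi^\tau\|_{A^c}. \tag{3.22}$$

The first and third term in the r.h.s. of (3.22) are smaller than $\delta$ and $e^{-cL^{2\varepsilon}}$ respectively by essentially the same arguments as before. It remains to analyze the second term. Notice that

$$\begin{aligned}
\pi^\tau\left(\|\nu_2^{\eta_{A^c}} - \pi_A^{\tau,\eta_{A^c}}\|\right)
&\le \sum_{x\in A} \left[\pi^\tau\left(\nu_2^{\eta_{A^c}}(\sigma_x = -)\right) - \pi^\tau\left(\pi_A^{\tau,\eta_{A^c}}(\sigma_x = -)\right)\right] \\
&= \sum_{x\in A} \left[\pi^\tau\left(\nu_2^{\eta_{A^c}}(\sigma_x = -)\right) - \pi^\tau(\sigma_x = -)\right].
\end{aligned}$$

Given $x \in A$ and $\ell \in \mathbb{N}$, let $K_\ell$ be the intersection of $A$ with a square of side $2\ell + 1$, centered at $x$. Monotonicity implies that

$$\nu_2^{\eta_{A^c}}(\sigma_x = -) \le \nu_{2,\ell}^{\eta_{A^c}}(\sigma_x = -), \tag{3.23}$$

where $\nu_{2,\ell}^{\eta_{A^c}}$ denotes the distribution at time $t$ obtained by the dynamics in $K_\ell$, started from all $-$, and with b.c. which are all $-$ except on $\partial K_\ell \cap \partial A$, where the b.c. remain either $\tau$ (on the North, East and West border of $A$) or $\eta_{A^c}$ (on the South border of $A$). Let $\pi_{A,\ell}^{\tau,\eta_{A^c}}$ be the equilibrium measure of this restricted dynamics. Then,

$$\begin{aligned}
\nu_2^{\eta_{A^c}}(\sigma_x = -) - \pi^\tau(\sigma_x = -)
&\le \left[\nu_{2,\ell}^{\eta_{A^c}}(\sigma_x = -) - \pi_{A,\ell}^{\tau,\eta_{A^c}}(\sigma_x = -)\right] + \left[\pi_{A,\ell}^{\tau,\eta_{A^c}}(\sigma_x = -) - \pi^\tau(\sigma_x = -)\right] \\
&\le e^{-t e^{-c\ell}} + \left[\pi_{A,\ell}^{\tau,\eta_{A^c}}(\sigma_x = -) - \pi^\tau(\sigma_x = -)\right],
\end{aligned}$$

where in the last inequality we used (3.1). If we now average first with respect to $\pi^\tau$ and then with respect to $P$ we claim that

Claim 3.8. One has for some $c > 0$,

$$\begin{aligned}
&\mathbb{E}\,\pi^\tau\left(\pi_{A,\ell}^{\tau,\eta_{A^c}}(\sigma_x = -) - \pi^\tau(\sigma_x = -)\right) && (3.24)\\
&\quad = \mathbb{E}\,\pi^\tau\left(\pi_{A,\ell}^{\tau,\eta_{A^c}}(\sigma_x = -) - \pi_A^{\tau,\eta_{A^c}}(\sigma_x = -)\right) \le e^{-c\ell}. && (3.25)
\end{aligned}$$

(It is clear that if $\ell$ is so large that $K_\ell = A$, then $\pi_{A,\ell}^{\tau,\eta_{A^c}} = \pi_A^{\tau,\eta_{A^c}}$ and the left-hand side of (3.24) equals 0.) Assuming the claim it is now sufficient to choose $\ell = (1/c)(\log t - \log\log t)$ to conclude that

$$\mathbb{E}\,\pi^\tau\left(\|\nu_2^{\eta_{A^c}} - \pi_A^{\tau,\eta_{A^c}}\|\right) \le L^2 e^{-c' \log t} \tag{3.26}$$

for some $c' > 0$.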


Fig. 4. $R_{2L+1}$ and its covering with $A$, $B$, $C$, and in bold the set $\Delta$

Proof of Claim 3.8. Let $\Gamma$ be the event that $x$ is separated from $\partial K_\ell \cap A$ by a $*$-connected chain of minus spins. By monotonicity, for any $\eta_{A^c}$,

$$\pi_A^{\tau,\eta_{A^c}}(\sigma_x = - \mid \Gamma) \ge \pi_{A,\ell}^{\tau,\eta_{A^c}}(\sigma_x = -),$$

and therefore it is enough to show that

$$\mathbb{E}\,\pi^\tau\left(\pi_A^{\tau,\eta_{A^c}}(\Gamma^c)\right) = \mathbb{E}\,\pi^\tau(\Gamma^c) \le e^{-c\ell}.$$

The rest of the proof is now very similar to that of Claim 3.6. Apart from an error $e^{-cL}$ we can replace $\mathbb{E}\left(\pi^\tau(\Gamma^c)\right)$ by $\pi^{-,-,+,-}_{E_L(Q_L)}(\Gamma^c)$, where $\pi^{-,-,+,-}_{E_L(Q_L)}$ is the Gibbs measure on the enlargement $E_L(Q_L)$ (see again Fig. 3 above) with plus b.c. on the South border and minus b.c. elsewhere. In turn, thanks to the fact that the event $\Gamma^c$ depends only on the spins in $A$, we can replace $\pi^{-,-,+,-}_{E_L(Q_L)}$ by the Gibbs measure $\pi^-_{E_L(Q_L)}$ on the same region but with homogeneous minus b.c. by paying an error smaller than $e^{-cL^{2\varepsilon}}$. Finally, again by monotonicity and standard correlations decay bounds in the pure phase,

$$\pi^-_{E_L(Q_L)}(\Gamma^c) \le \pi^-_\infty(\Gamma^c) \le e^{-c\ell}$$

for some $c > 0$.

3.2. Proof of Theorem 3.2 part (2). Thanks to Corollary 2.10 and apart from the harmless rescaling $t \to t' = e^{cL^{3\varepsilon}} t$ and $\delta \to \delta' = c'\left(\delta + e^{-cL^{3\varepsilon}}\right)$ for some constants $c, c' > 0$, we can safely replace the distribution $P$ over the boundary conditions outside $R_{2L+1}$ with the modified distribution $P^\Delta$ (defined in Sect. 2.5), where $\Delta = \{(i, 0) \in \partial R_{2L+1};\ |i - L| \le L^{3\varepsilon}\}$ and the pinned configuration $\tau_\Delta$ is identically equal to $-1$. In other words it is enough to prove that $\mathbb{E}^\Delta \|\mu^\pm_{2t'} - \pi^\tau\| \le c\delta'$.

i) As before we begin with the case where the dynamics in $R_{2L+1}$ is started from all pluses. Let now (see Fig. 4)

$$A = Q_L + (\lfloor L/2 \rfloor, 0), \qquad B = \{Q_L\} \cup \{Q_L + (L + 1, 0)\}, \qquad C = \{(i, j) \in R_{2L+1};\ i = L + 1\},$$

so that $R_{2L+1} = B \cup C$ and $B \cap C = \emptyset$.


By Theorem 2.5, $\|\mu^+_{2t'} - \pi^\tau\| \le \|\tilde\mu^+_{2t'} - \pi^\tau\|$ where, as before, the tilde indicates that the following “massage” has been applied: in the time interval $[0, t')$ we keep only the updates in $A$, at time $t'$ we increase to $+1$ all the spins in $B$ and in the interval $(t', 2t']$ we keep only the updates in $B$. Notice that the dynamics in $B$ in the time lag $(t', 2t']$ is just a product dynamics in the two copies of $Q_L$, in the sequel denoted by $B_1$ and $B_2$, whose union is $B$, with boundary conditions $\tau$ on $\partial B \cap \partial R_{2L+1}$ and some boundary conditions on $C$ generated by the dynamics in $A$ in the first time lag $[0, t']$.

Definition 3.9. We define
(a) $\nu_1$ as the distribution obtained at time $t'$ after the first half of the censoring;
(b) $\nu_2^\sigma$ as the distribution obtained from the second half of the censoring starting (at time $t'$) from a configuration equal to $\sigma$ in $C$ and to $+$ in $B$. Clearly $\nu_2^\sigma$ assigns zero probability to configurations that are not identical to $\sigma$ in $C$;
(c) $\pi_A^{\tau,+} := \pi^\tau(\cdot \mid \sigma_{A^c} = +)$ and similarly with $+$ replaced by $-$;
(d) $\pi_B^{\tau,\eta_C} := \pi^\tau(\cdot \mid \sigma_C = \eta_C)$;
(e) $\pi^{\tau,-}$ (resp. $\pi^{\tau,+}$) as the Gibbs measure in $R_{2L+1}$ with minus (resp. plus) b.c. on its South boundary and $\tau$ on the North, East and West borders.

By proceeding exactly as in the proof of statement (1) we get

$$\begin{aligned}
\|\mu^+_{2t'} - \pi^\tau\| \le \|\tilde\mu^+_{2t'} - \pi^\tau\| \le{}& \|\nu_1 - \pi_A^{\tau,+}\|_C + \pi^{\tau,-}\left(\|\nu_2^{\eta_C} - \pi_B^{\tau,\eta_C}\|\right) + \|\pi_A^{\tau,+} - \pi^\tau\|_C && (3.27)\\
&+ \|\pi^{\tau,-} - \pi^\tau\|_C && (3.28)
\end{aligned}$$

and

$$\mathbb{E}^\Delta \|\mu^+_{2t'} - \pi^\tau\| \le \mathbb{E}^\Delta \|\nu_1 - \pi_A^{\tau,+}\|_C + \mathbb{E}^\Delta \pi^{\tau,-}\left(\|\nu_2^{\eta_C} - \pi_B^{\tau,\eta_C}\|\right) + \mathbb{E}^\Delta \|\pi_A^{\tau,+} - \pi^\tau\|_C + \mathbb{E}^\Delta \|\pi^\tau - \pi^{\tau,-}\|_C. \tag{3.29}$$

By assumption and thanks to Corollary 2.10, if we perform a global spin flip we see that the first term in the r.h.s. of (3.29) is smaller than $\delta'$. As far as the second term is concerned we observe that the distribution $P^{\Delta,-}$ of the boundary conditions $(\tau, \eta_C)$ given by $P^{\Delta,-}(\tau, \eta_C) = P^\Delta(\tau)\,\pi^{\tau,-}(\eta_C)$ coincides with the $\Delta$-modification $(P^-)^\Delta$ of $P^-(\tau, \eta_C) = P(\tau)\,\pi^{\tau,-}(\eta_C)$. The same argument as in (3.15) shows that the latter belongs to $D(B_i)$, $i = 1, 2$, so that (via Corollary 2.10 and the immediate inequality $\|\mu \otimes \nu - \mu' \otimes \nu'\| \le \|\mu - \mu'\| + \|\nu - \nu'\|$) the second term is smaller than $2\delta'$.

We now turn to the more delicate third and fourth term in the r.h.s. of (3.29). Since they can be treated essentially in the same way we discuss only the third one. As usual we write

$$\mathbb{E}^\Delta \|\pi_A^{\tau,+} - \pi^\tau\|_C \le \sum_{x\in C} \mathbb{E}^\Delta\left(\pi_A^{\tau,+}(\sigma_x = +) - \pi^\tau(\sigma_x = +)\right). \tag{3.30}$$

Let $\Gamma$ be the event that in $A$ there exist two $*$-connected chains of minus spins, one to the left and the other to the right of $C$, connecting the South side of $A$ to its North side. By monotonicity

$$\pi_A^{\tau,+}(\sigma_x = + \mid \Gamma) - \pi^\tau(\sigma_x = +) \le 0$$


so that

$$\pi_A^{\tau,+}(\sigma_x = +) - \pi^\tau(\sigma_x = +) \le \pi_A^{\tau,+}(\Gamma^c). \tag{3.31}$$

Let now $\bar A = \{(i, j);\ 1 \le i \le L,\ 1 \le j \le 2(2L + 1)^{1/2+\varepsilon}\}$ so that $\bar A$ consists of just two copies of $A$ stacked one on top of the other. Then, using monotonicity together with the standard exponential decay of correlations in the minus phase $\pi^-_\infty$ (see e.g. (3.18)) we get

$$\mathbb{E}^\Delta\left(\pi_A^{\tau,+}(\Gamma^c)\right) \le e^{-cL^{1/2+\varepsilon}} + \pi_{\bar A}^{(-,+,\Delta)}(\Gamma^c), \tag{3.32}$$

where the superscript $(-, +, \Delta)$ indicates the b.c. which is $-$ on the union of the North boundary and $\Delta$, and $+$ on the rest of $\partial\bar A$. The key equilibrium bound we need at this stage is the following:

Claim 3.10. There exists $c > 0$ such that $\pi_{\bar A}^{(-,+,\Delta)}(\Gamma^c) \le e^{-cL^{3\varepsilon}}$.

Putting together the bounds we got on the various terms in (3.29), we have proved $\mathbb{E}^\Delta \|\mu^+_{2t'} - \pi^\tau\| \le c\delta'$ as wished.

The proof of the claim is deferred to the Appendix but intuitively the argument goes as follows. Under the boundary conditions $(-, +, \Delta)$, for any configuration in $\bar A$ there exist exactly two open Peierls contours $\gamma_1, \gamma_2$ with two possible scenarios:
(a) $\gamma_1$ joins the two upper corners of $\bar A$ and $\gamma_2$ the two ends of the interval $\Delta$;
(b) $\gamma_1$ joins the left upper corner of $\bar A$ with the left boundary of $\Delta$ and similarly for $\gamma_2$.

If we recall the definition of the surface tension (1.2), the ratio between the probabilities of the two cases is roughly of the form:

$$e^{-\beta\tau_\beta(\vec e_1)(L + 2L^{3\varepsilon}) + 2\beta\tau_\beta(\theta) D},$$

where $D$ is the Euclidean distance between the left upper corner of $\bar A$ and the left boundary of $\Delta$ and $\theta$ is the angle formed by the straight line going through these two points with the horizontal axis. Clearly $\theta \approx O(L^{-\frac12+\varepsilon})$ and $D \approx L/2 - L^{3\varepsilon} + O(L^{2\varepsilon})$. Therefore case (b) is much more likely than case (a).

Remark 3.11. Notice that it is exactly the presence of the positive correction $O(L^{2\varepsilon})$ in $D$ that forced us to take the length of $\Delta$ to be $L^{3\varepsilon}$.

Once we are in scenario (b) the most likely situation is that neither $\gamma_1$ nor $\gamma_2$ touch $C$ (otherwise they would have an excess length of order $L^{3\varepsilon}$) and the desired bound follows by standard properties of the Ising model with homogeneous boundary conditions.

ii) The proof of $\mathbb{E}^\Delta \|\mu^-_{2t'} - \pi^\tau\| \le c\delta'$ is identical, modulo the obvious changes, provided that we redefine the “massage” of $\mu^-_{2t'}$ as the censoring in $A$, $B$ plus the resetting at time $t'$ of the spins inside $B$ to the value $-1$. A minor observation is that in this case, for the smallness of the term $\mathbb{E}^\Delta \|\nu_1 - \pi_A^{\tau,-}\|_C$, we no longer need the global spin flip that was necessary for the dynamics started from all pluses.

Remark 3.12. As we said at the beginning, in order to keep the focus on the main ideas of the method, Theorem 3.2 has been given in the restricted setting in which the length scales are of the form $L_n = 2^n - 1$. However it should be clear by now that the case of arbitrary


length scales can be dealt with in a very similar way. A possible solution requires a slight modification of the definition of the two inductive statements $A(L, t, \delta)$, $B(L, t, \delta)$. Let $F_L$ (respectively $G_L$) be the class of rectangles which, modulo translations, have horizontal base $L$ and height $H \in [L^{\frac12+\varepsilon}, (2L)^{\frac12+\varepsilon}]$ (resp. horizontal base $L$ and height $H \in [(2L)^{\frac12+\varepsilon}, (4L)^{\frac12+\varepsilon}]$). Notice that any rectangle in $G_L$ can be written as the union of two overlapping rectangles in $F_L$ such that the width of their intersection is still $O(L^{1/2+\varepsilon})$ (as in Fig. 2). Moreover for any $n$ large enough and any $L \in [L_{n+1}, L_{n+2})$ there exists $L' \in [L_n, L_{n+1})$ such that any rectangle $\Lambda$ in $F_L$ can be written as the union of three sets $A, B, C$ (as in Fig. 4) where $A \in G_{L'}$, $B$ consists of two disjoint rectangles in $G_{L'}$ and $C \equiv \Lambda \setminus B$ satisfies ${\rm dist}(C, A^c) = O(L)$ and has horizontal width $O(1)$. We then say that $A'(L, t, \delta)$ ($B'(L, t, \delta)$) holds if (2.1) is valid for every rectangle in $F_L$ (in $G_L$). It is almost immediate to check that part (1) of Theorem 3.2 continues to hold with this new definition. Part (2) can be modified as follows. If $B'(L', t, \delta)$ holds for every $L' \in [L_n, L_{n+1})$ then $A'(L, t_2, \delta_2)$ holds for every $L \in [L_{n+1}, L_{n+2})$ with $t_2 = e^{c\,2^{3n\varepsilon}}\, t$ and $\delta_2 = c\left(\delta + e^{-c'\, 2^{3n\varepsilon}}\right)$. The proof of the new version is essentially the same as that given above.

4. Proof of the Main Results

In what follows we will prove Theorem 1.6 and Corollaries 1.9 and 1.10. Notice that, for any $\Lambda \subset \mathbb{Z}^2$, any boundary conditions $\tau$ and any starting configuration $\sigma$, $\|\mu^\sigma_t - \pi^\tau\|$ is invariant under the global spin flip $\tau \to -\tau$ and $\sigma \to -\sigma$. Therefore it will be enough to prove only “half of the statements”.

4.1. Proof of Theorem 1.6. Recall that $t_L := \exp(cL^\varepsilon)$ for some chosen $\varepsilon > 0$ small, and let $\varepsilon' := \varepsilon/4$. We assume throughout this section that $L \in \{2^n - 1\}_{n\in\mathbb{N}}$.

4.1.1. Mixing time with “approximately $(-, -, +, -)$” boundary conditions. First we prove (1.11)–(1.12) when the b.c. $\tau$ is sampled from a law $P$ which is dominated by $\pi^-_\infty$ on the union of three sides of $\Lambda_L$ and dominates $\pi^+_\infty$ on the remaining side (e.g. the South border). One sees from (2.10), the definition (1.8) of mixing time and the Markov inequality that (1.11) implies (1.12), so we are left with the task of proving (1.11). This is an almost straightforward generalization of the proof of point (1) of Theorem 3.2 and therefore some steps will be only sketched. For definiteness, we assume that the $L \times L$ square $\Lambda_L$ we are considering is $\{(x_1, x_2) \in \mathbb{Z}^2 : 1 \le x_1, x_2 \le L\}$. Consider first the evolution started from the $+$ configuration. For $i \ge 0$ let

$$h_i := L^{1/2+\varepsilon'} + i\left((2L + 1)^{1/2+\varepsilon'} - L^{1/2+\varepsilon'}\right). \tag{4.1}$$

To avoid inessential complications, assume that there exists $k \in \mathbb{N}$ such that $h_{k-1} = L$. Of course,

$$k \sim \frac{L^{1/2-\varepsilon'}}{2^{1/2+\varepsilon'} - 1}. \tag{4.2}$$
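The relation between the heights (4.1) and the asymptotic count (4.2) can be illustrated numerically. The following Python sketch is a rough check under stated assumptions: the function names are hypothetical, the parameter `eps` stands for the small exponent appearing in (4.1), `k` is treated as a real number solving h_{k-1} = L rather than an integer, and the values `L = 10**8`, `eps = 0.1` are chosen only to make the asymptotics visible.

```python
def strip_heights(L, eps):
    """Return (h_0, increment), so that h_i = h_0 + i * increment as in (4.1)."""
    base = L ** (0.5 + eps)
    increment = (2 * L + 1) ** (0.5 + eps) - base
    return base, increment


def num_strips(L, eps):
    """Real-valued k solving h_{k-1} = L."""
    base, increment = strip_heights(L, eps)
    return 1 + (L - base) / increment


def num_strips_asymptotic(L, eps):
    """Leading-order value of k, as in (4.2)."""
    return L ** (0.5 - eps) / (2 ** (0.5 + eps) - 1)


L, eps = 10 ** 8, 0.1       # illustrative values only
base, increment = strip_heights(L, eps)
k = num_strips(L, eps)
ratio = k / num_strips_asymptotic(L, eps)
print(k, ratio)
```

For large `L` the ratio approaches 1, since the increment in (4.1) is approximately (2^{1/2+eps} - 1) L^{1/2+eps} and the initial height h_0 is negligible compared with L.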


Let $\Lambda^i_L$ be the rectangle of height $h_i$ whose base coincides with that of $\Lambda_L$, so that in particular $\Lambda^{k-1}_L = \Lambda_L$. We will prove by induction at the end of the present section that

Lemma 4.1. The following holds for $i = 0, \ldots, k - 1$. Let the b.c. $\tau$ around the rectangle $\Lambda^i_L$ be sampled from a law $P$ which dominates $\pi^+_\infty$ on the South border and is dominated by $\pi^-_\infty$ on the union of West, East and North borders. Then,

$$\mathbb{E}\,\|\mu^{+,i}_{(i+1)t_L/k} - \pi^\tau_{\Lambda^i_L}\| \le (1 + i)\,e^{-cL^{(\varepsilon')^2}} = (1 + i)\,e^{-cL^{\varepsilon^2/16}}, \tag{4.3}$$

where $\mu^{+,i}_t$ is the evolution in $\Lambda^i_L$ started from $+$, $\pi^\tau_{\Lambda^i_L}$ is its invariant measure and $c$ depends only on $\beta$ and $\varepsilon$.

If the lemma holds, it is sufficient to apply it for $i = k - 1$ to see that $\mathbb{E}\,\|\mu^+_{t_L} - \pi^\tau\| \le \exp(-cL^{\varepsilon^2/16})$ as wished. It remains to show that

$$\mathbb{E}\,\|\mu^-_{t_L} - \pi^\tau\| \le e^{-cL^{\varepsilon^2/16}}. \tag{4.4}$$

By Theorem 2.5 and (the analog of) Lemma 3.4, $\|\mu^-_{t_L} - \pi^\tau\| \le \|\tilde\mu^-_{t_L} - \pi^\tau\|$, where this time $\tilde\mu^-_t$ is the dynamics in $\Lambda_L$ obtained via the following “massage”: in the time interval $[0, t_L/2)$ we keep updates only in $B := R^{\varepsilon'}_L = \{(x_1, x_2) \in \Lambda_L : x_2 \le L^{1/2+\varepsilon'}\}$, at time $t_L/2$ we set to $-$ all spins in $A := \{(x_1, x_2) \in \Lambda_L : x_2 > (1/2)L^{1/2+\varepsilon'}\}$ and in $(t_L/2, t_L]$ we keep updates only in $A$. In analogy with Definition 3.7, we introduce

Definition 4.2. We let
(a) $\pi_B^{\tau,-} := \pi^\tau(\cdot \mid \sigma_{B^c} = -)$;
(b) $\pi_A^{\tau,\eta} := \pi^\tau(\cdot \mid \sigma_{A^c} = \eta)$;
(c) $\nu_1$ be the distribution obtained at time $t_L/2$;
(d) $\nu_2^\sigma$ be the distribution obtained at time $t_L$, starting at time $t_L/2$ from $\sigma$ in $A^c$ and from $-$ in $A$.

Then, in analogy with (3.22) one finds

$$\mathbb{E}\,\|\tilde\mu^-_{t_L} - \pi^\tau\| \le \mathbb{E}\,\|\nu_1 - \pi_B^{\tau,-}\|_{A^c} + \mathbb{E}\,\pi^\tau\left(\|\nu_2^{\eta_{A^c}} - \pi_A^{\tau,\eta_{A^c}}\|\right) + \mathbb{E}\,\|\pi_B^{\tau,-} - \pi^\tau\|_{A^c}. \tag{4.5}$$

From Corollary 3.3 one sees that the first term is smaller than $\exp(-cL^{\varepsilon^2/16})$ (note that $t_L/2 \gg \exp(cL^{3\varepsilon'})$). The last term in (4.5) can be bounded by $\exp(-c' L^{2\varepsilon'})$ (the proof is essentially identical to the proof of the upper bound on the last term in (3.22)). Finally, proceeding like for the second term in (3.22), one sees that

$$\mathbb{E}\,\pi^\tau\left(\|\nu_2^{\eta_{A^c}} - \pi_A^{\tau,\eta_{A^c}}\|\right) \le L^2 e^{-c \log(t_L/2)} + e^{-c' L^{2\varepsilon'}} \ll e^{-c L^{\varepsilon^2/16}}. \tag{4.6}$$

Altogether, we proved (4.4) and the proof of (1.13) is complete.

[Fig. 5. The rectangle $\Lambda^i_L$ and its decomposition into $A$, $B$, $C$]

Proof of Lemma 4.1. Let for simplicity of notation $\pi^\tau := \pi^\tau_{\Lambda^i_L}$. For $i = 0$ the claim is just Corollary 3.3 (note that $\Lambda^0_L = R^\varepsilon_L$). Assume that the claim holds for $i-1$. We define the following three disjoint rectangles (see Fig. 5): $A := \Lambda^i_L \setminus \Lambda^{i-1}_L$, $C$ is the rectangle

whose South border coincides with that of $\Lambda_L$ and whose height is $(h_i - L^{1/2+\varepsilon})$, and $B := \Lambda^i_L \setminus (A \cup C)$. By Theorem 2.5 and (the analog of) Lemma 3.4, one has $\|\mu^+_{(i+1)t_L/k} - \pi^\tau\| \le \|\tilde\mu^+_{(i+1)t_L/k} - \pi^\tau\|$, where the “massage” in $\tilde\mu^+_t$ consists in keeping only the updates in $A \cup B$ in the time interval $[0, t_L/k)$ and in $B \cup C$ in the time interval $(t_L/k, (i+1)t_L/k]$, and setting to $+$ all spins in $B$ at time $t_L/k$. In analogy with Definition 3.5:

Definition 4.3. We let
(a) $\nu_1$ be the distribution obtained at time $t_L/k$, which assigns zero probability to configurations which are not all $+$ in $C \cup B$;
(b) $\nu_2^\sigma$ be the distribution at time $(i+1)t_L/k$, starting at time $t_L/k$ from $\sigma$ in $A$ and from $+$ in $B \cup C$;
(c) $\pi^{\tau,+}_{A\cup B} := \pi^\tau(\cdot\,|\,\sigma_C = +)$;
(d) $\pi^{\tau,\eta}_{B\cup C} := \pi^\tau(\cdot\,|\,\sigma_A = \eta)$;
(e) $\pi^{\tau,-}$ be the Gibbs measure in $\Lambda^i_L$ with $-$ b.c. on its South border and $\tau$ on the other borders.

One has then

$$\tilde\mu^+_{(i+1)t_L/k}(\eta) = \nu_1(\eta_A)\,\nu_2^{\eta_A}(\eta) \tag{4.7}$$

and

$$\pi^\tau(\eta) = \pi^\tau(\eta_A)\,\pi^{\tau,\eta_A}_{B\cup C}(\eta). \tag{4.8}$$

In analogy with (3.13),

$$\|\tilde\mu^+_{(i+1)t_L/k} - \pi^\tau\| \le \|\nu_1 - \pi^{\tau,+}_{A\cup B}\|_A + \|\gamma - \pi^\tau\|,$$

where

$$\gamma(\eta) := \pi^{\tau,+}_{A\cup B}(\eta_A)\,\nu_2^{\eta_A}(\eta).$$

As a consequence, using (4.8),

$$\|\mu^+_{(i+1)t_L/k} - \pi^\tau\| \le \|\nu_1 - \pi^{\tau,+}_{A\cup B}\|_A + \pi^{\tau,-}\Bigl(\bigl\|\nu_2^{\eta_A} - \pi^{\tau,\eta_A}_{B\cup C}\bigr\|\Bigr) + \|\pi^{\tau,+}_{A\cup B} - \pi^\tau\|_A + \|\pi^\tau - \pi^{\tau,-}\|_A. \tag{4.9}$$


Now we can take the expectation with respect to $P$. First of all, we have

$$\mathbb{E}\,\|\nu_1 - \pi^{\tau,+}_{A\cup B}\|_A \le e^{-cL^{\varepsilon^2/16}}, \tag{4.10}$$

thanks to Corollary 3.3, because $A \cup B$ is a translation of the rectangle $R^\varepsilon_L$ which appears in the definition of the claim $A(L,t,\delta)$. As for the $P$-expectation of the third and fourth terms, it is upper bounded by $\exp(-cL^{2\varepsilon})$ (the proof is essentially identical to that of the upper bound for the third and fourth term in (3.14)). Altogether, the average of the sum of the first, third and fourth terms is upper bounded by $\exp(-cL^{\varepsilon^2/16})$. Finally, in order to bound the $P$-expectation of the second term we need the inductive hypothesis. Indeed, we can say that

$$\mathbb{E}\,\pi^{\tau,-}\Bigl(\bigl\|\nu_2^{\eta_A} - \pi^{\tau,\eta_A}_{B\cup C}\bigr\|\Bigr) \le i\,e^{-cL^{\varepsilon^2/16}} \tag{4.11}$$

(which concludes the induction step) if we prove that the marginal on the union of North, East and West borders of $B \cup C$ of the measure $\mathbb{E}^- := \mathbb{E}\,\pi^{\tau,-}(\cdot)$ is stochastically dominated by $\pi^-_\infty$. Indeed, if $(\tau_1, \tau_2, \tau_4)$ is a generic spin configuration of the North, East and West borders of $B \cup C$ and $f$ is a decreasing function, using monotonicity a couple of times one gets

$$\mathbb{E}^-(f) = \mathbb{E}\,\pi^{\tau,-}(f) \ge \pi^-_\infty\bigl(\pi^{\tau,-}(f)\bigr) \ge \pi^-_\infty(f), \tag{4.12}$$

which proves the desired stochastic domination.
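The bookkeeping of the induction step can be written out in one line. The sketch below only adds up the bounds quoted above for the four terms on the right-hand side of (4.9); the labels $T_1,\dots,T_4$ for those terms are introduced here for convenience and are not part of the original notation.

```latex
% T_1,...,T_4: the four terms on the right-hand side of (4.9), in order
% (hypothetical labels, used only in this sketch).
\begin{align*}
\mathbb{E}\,\bigl\|\mu^{+}_{(i+1)t_L/k}-\pi^\tau\bigr\|
 &\le \mathbb{E}\,(T_1+T_3+T_4)+\mathbb{E}\,T_2 \\
 &\le e^{-cL^{\varepsilon^2/16}} + i\,e^{-cL^{\varepsilon^2/16}}
   && \text{bound on first, third, fourth terms, and (4.11)}\\
 &= (1+i)\,e^{-cL^{\varepsilon^2/16}},
\end{align*}
% which is (4.3) at step i, given the inductive hypothesis at step i-1.
```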

4.1.2. Mixing time with boundary conditions dominated by $\pi^-_\infty$. Here we prove (1.11) (and therefore, via Markov inequality and (2.10), we obtain (1.12)), when the law $P$ of $\tau$ is dominated by $\pi^-_\infty$ (or, by spin-flip symmetry, when it dominates $\pi^+_\infty$). We begin with the evolution starting from the $+$ configuration and we recall that $\Lambda_L = \{1,\dots,L\}^2$. One has by monotonicity $\pi^\tau \preceq \mu^+_t$, and therefore

$$\mathbb{E}\,\|\mu^+_{t_L} - \pi^\tau\| \le S_1 + S_2 := \sum_{x\in\Lambda^-_L} \mathbb{E}\bigl[\mu^+_{t_L}(\sigma_x = +) - \pi^\tau(\sigma_x = +)\bigr] + \sum_{x\in\Lambda^+_L} \mathbb{E}\bigl[\mu^+_{t_L}(\sigma_x = +) - \pi^\tau(\sigma_x = +)\bigr], \tag{4.13}$$

where $\Lambda^-_L := \{(i,j) \in \Lambda_L : j < L/2\}$ and $\Lambda^+_L := \Lambda_L \setminus \Lambda^-_L$. We will show that the sum $S_1$ is small, and $S_2$ can be dealt with similarly. Recall that $\Lambda^i_L$ and $k$ were defined in Sect. 4.1.1, and observe that $\Lambda^{(3/4)k}_L$ is a rectangle whose base coincides with that of $\Lambda_L$, and whose height is $h \sim (3/4)L$ (cf. (4.1)–(4.2)). Then, thanks to Theorem 2.5 (or actually by monotonicity), we know that $\mu^+_{t_L} \preceq \tilde\mu^+_{t_L}$, where $\tilde\mu^+_t$ is the censored dynamics in which only updates in $\Lambda^{(3/4)k}_L$ are retained. One has therefore

$$S_1 \le \sum_{x\in\Lambda^-_L} \mathbb{E}\bigl[\tilde\mu^+_{t_L}(\sigma_x = +) - \pi^\tau(\sigma_x = +)\bigr] \le L^2\Bigl(\mathbb{E}\,\|\tilde\mu^+_{t_L} - \pi^{\tau,+}\|_{\Lambda^-_L} + \mathbb{E}\,\|\pi^\tau - \pi^{\tau,+}\|_{\Lambda^-_L}\Bigr), \tag{4.14}$$

where $\pi^{\tau,+}$ is the invariant measure of $\tilde\mu^+_t$, i.e. $\pi^{\tau,+} := \pi^\tau\bigl(\cdot\,|\,\sigma_{\Lambda_L\setminus\Lambda^{(3/4)k}_L} = +\bigr)$.

Since the North border of $\Lambda^-_L$ is at distance approximately $L/4$ from the North border of $\Lambda^{(3/4)k}_L$, the last term in (4.14) is easily seen to be upper bounded by $\exp(-cL)$ (the proof of this fact is essentially identical to the proof of the upper bound for the last two terms in (3.14)). As for the first term, Lemma 4.1 (applied with $i = (3/4)k$) shows that it is upper bounded by $\exp(-cL^{\varepsilon^2/16})$. This is because the evolution $\tilde\mu^+_t$ sees b.c. $+$ on the North border of $\Lambda^{(3/4)k}_L$, and $\tau$ (sampled from $P$ which is stochastically dominated by $\pi^-_\infty$) on the remaining three borders. Altogether, we have shown that

$$\mathbb{E}\,\|\mu^+_{t_L} - \pi^\tau\| \le e^{-cL^{\varepsilon^2/16}}.$$

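Next, we look at the evolution started from all $-$. Given a site $x \in \Lambda_L$ and $\ell \in \mathbb{N}$, let $K_\ell$ be the intersection of $\Lambda_L$ with a square of side $2\ell+1$ centered at $x$. We let $\mu^{\tau,-}_{K_\ell,t}$ be the dynamics in $K_\ell$ with $-$ initial condition and with b.c. $-$ except on $\partial K_\ell \cap \partial\Lambda_L$, where the b.c. is $\tau$. The invariant measure of such dynamics is denoted by $\pi^{\tau,-}_{K_\ell}$. Since $\mu^-_t \preceq \pi^\tau$, we have

$$\mathbb{E}\,\|\mu^-_t - \pi^\tau\| \le \sum_{x\in\Lambda_L} \bigl[\mathbb{E}\,\mu^-_t(\sigma_x = -) - \mathbb{E}\,\pi^\tau(\sigma_x = -)\bigr] \tag{4.15}$$

$$\le \sum_{x\in\Lambda_L} \Bigl(\mathbb{E}\bigl[\mu^{\tau,-}_{K_\ell,t}(\sigma_x = -) - \pi^{\tau,-}_{K_\ell}(\sigma_x = -)\bigr] + e^{-c\ell}\Bigr) \tag{4.16}$$

$$\le \sum_{x\in\Lambda_L} \Bigl(\mathbb{E}\,\|\mu^{\tau,-}_{K_\ell,t} - \pi^{\tau,-}_{K_\ell}\| + e^{-c\ell}\Bigr). \tag{4.17}$$

The “error term” $\exp(-c\ell)$ comes from comparing $\mathbb{E}\,\pi^\tau(\sigma_x = -)$ and $\mathbb{E}\,\pi^{\tau,-}_{K_\ell}(\sigma_x = -)$ (see the proof of Claim 3.8 for very similar arguments). We know from [16, Cor. 2.1] that $T^{\tau,-}_{{\rm mix},K_\ell} \le e^{c\ell}$, uniformly in $\tau$. Therefore, from (1.9) and choosing $t = t_L$ and $\ell = \frac{1}{c}(\log t - \log\log t) \approx L^\varepsilon$, one gets

$$\mathbb{E}\,\|\mu^-_{t_L} - \pi^\tau\| \le e^{-cL^\varepsilon}. \tag{4.18}$$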
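To see how this choice of $\ell$ leads to (4.18), the arithmetic can be spelled out. The sketch below assumes that (1.9), which is not reproduced in this section, is the usual bound $\|\mu_t-\pi\|\le e^{-\lfloor t/T_{\rm mix}\rfloor}$. It also writes $c_1$ for the constant in the error term of (4.17) and $c$ for the one in the mixing-time bound; this labelling of constants is introduced here and is not the authors'. Integer rounding of $\ell$ is ignored.

```latex
% Assumption: (1.9) reads  \|\mu_t-\pi\| \le e^{-\lfloor t/T_{\rm mix}\rfloor}.
% c   : constant in  T_{\rm mix} \le e^{c\ell}  (from [16, Cor. 2.1])
% c_1 : constant in the error term  e^{-c_1\ell}  of (4.17)  (hypothetical label)
\begin{align*}
\ell=\tfrac1c(\log t-\log\log t)
 \;&\Longrightarrow\; e^{c\ell}=\frac{t}{\log t},
 \qquad \frac{t}{T_{\rm mix}}\ \ge\ t\,e^{-c\ell}=\log t,\\
\bigl\|\mu^{\tau,-}_{K_\ell,t}-\pi^{\tau,-}_{K_\ell}\bigr\|
 \;&\le\; e^{-\lfloor \log t\rfloor}\ \le\ \frac{e}{t},
 \qquad e^{-c_1\ell}=\Bigl(\frac{\log t}{t}\Bigr)^{c_1/c},\\
\mathbb{E}\,\|\mu^-_{t}-\pi^\tau\|
 \;&\le\; |\Lambda_L|\,\Bigl(\frac{e}{t}+\Bigl(\frac{\log t}{t}\Bigr)^{c_1/c}\Bigr).
\end{align*}
```

With $t = t_L$ and $\log t_L$ of order $L^\varepsilon$ (which is what $\ell \approx L^\varepsilon$ expresses), both terms in the bracket decay like $\exp(-c'L^\varepsilon)$ for some $c'>0$, and the prefactor $|\Lambda_L| = L^2$ is absorbed by decreasing $c'$ slightly.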

4.2. Proof of Corollary 1.9. We restart from (4.17), which in the case of $\tau \equiv -$ gives

$$\|\mu^-_t - \pi^-\| \le |\Lambda_L|\,e^{-c\ell} + \sum_{x\in\Lambda_L} \|\mu^-_{K_\ell,t} - \pi^-_{K_\ell}\|, \tag{4.19}$$

where $\pi^-$, $\mu^-_{K_\ell,t}$ and $\pi^-_{K_\ell}$ are just $\pi^\tau$, $\mu^{\tau,-}_{K_\ell,t}$ and $\pi^{\tau,-}_{K_\ell}$ respectively, in the specific case $\tau \equiv -$. Now we use the extra information that the mixing time $T^-_{{\rm mix},K_\ell}$ of the dynamics $\mu^-_{K_\ell,t}$ is at most $\exp(c\ell^\varepsilon)$, as follows from (1.13). We choose $\ell$ to be the smallest integer in the sequence $\{2^n - 1\}_{n\in\mathbb{N}}$ such that $c\ell > 3\log L$, so that the first term in the r.h.s. of (4.19) is smaller than $1/L$. Taking $t_1 := \exp(c'(\log L)^\varepsilon)$, one has from (1.9),

$$\|\mu^-_{K_\ell,t_1} - \pi^-_{K_\ell}\| \le e^{-t_1/T^-_{{\rm mix},K_\ell}} \le \exp\bigl[-\exp\bigl(c'(\log L)^\varepsilon - c\ell^\varepsilon\bigr)\bigr] \ll 1/|\Lambda_L|, \tag{4.20}$$

if one chooses $c'$ suitably larger than $c$ (recall that we chose $\ell = O(\log L)$) and the corollary is proved.

4.3. Proof of Corollary 1.10. This is rather standard, once (1.13) is known (cf. for instance Theorem 3.2 in [16] or Theorem 3.6 in [7]). Clearly, it is sufficient to prove the result with $f$ redefined as $f(\sigma) := (\sigma_0 + 1)$ which has the advantage of being non-negative, increasing and with support $\{0\}$. Consider a square $J_\ell \subset \mathbb{Z}^2$ with side $2\ell + 1 \in \{2^n - 1\}_{n\in\mathbb{N}}$ and centered at $0$. By the exponential decay of correlations in the pure phase $\pi^+_\infty$,

$$|\pi^+_\infty(f) - \pi^+_{J_\ell}(f)| \le c\,e^{-c\ell}. \tag{4.21}$$

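Moreover, by monotonicity, for every initial configuration $\sigma$ of the infinite system

$$0 \le (e^{t\mathcal{L}} f)(\sigma) \le \bigl(e^{t\mathcal{L}^+_{J_\ell}} f\bigr)(\sigma) \tag{4.22}$$

and the right-hand side is an increasing function of $\sigma$; in accord with the notations of Sect. 1.2, $\mathcal{L}^+_{J_\ell}$ denotes the generator of the dynamics in $J_\ell$ with $+$ boundary conditions on $\partial J_\ell$ (its invariant measure is of course $\pi^+_{J_\ell}$) and $\mathcal{L}$ is the generator of the infinite-volume dynamics. One has then (using once more monotonicity)

$$\pi^+_\infty\bigl((e^{t\mathcal{L}} f)^2\bigr) \le \pi^+_{J_\ell}\Bigl(\bigl(e^{t\mathcal{L}^+_{J_\ell}} f\bigr)^2\Bigr) \tag{4.23}$$

which, together with (4.21), gives

$$\rho(t) = \mathrm{Var}^+_\infty\bigl(e^{t\mathcal{L}} f\bigr) \le \mathrm{Var}_{\pi^+_{J_\ell}}\bigl(e^{t\mathcal{L}^+_{J_\ell}} f\bigr) + c\,e^{-c\ell}. \tag{4.24}$$

By (1.6), one has that

$$\mathrm{Var}_{\pi^+_{J_\ell}}\bigl(e^{t\mathcal{L}^+_{J_\ell}} f\bigr) \le \mathrm{Var}_{\pi^+_{J_\ell}}(f)\,e^{-2t\,\mathrm{gap}^+_{J_\ell}}, \tag{4.25}$$

with $\mathrm{gap}^+_{J_\ell}$ the spectral gap of $\mathcal{L}^+_{J_\ell}$. From the inequality

$$\mathrm{gap} \ge \frac{1}{T_{\rm mix}} \tag{4.26}$$

(cf. (1.10)) and (1.13), one deduces that for every $\varepsilon > 0$,

$$\mathrm{Var}^+_\infty\bigl(e^{t\mathcal{L}} f\bigr) \le c\,e^{-c\ell} + e^{-2t\,e^{-c\ell^\varepsilon}}. \tag{4.27}$$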
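The passage from (4.24)–(4.26) to (4.27) can be spelled out as follows. The sketch assumes that (1.13), which is not reproduced in this section, gives $T_{\rm mix} \le \exp(c\ell^\varepsilon)$ for the square $J_\ell$ with $+$ b.c. (this is how (1.13) is used in Sect. 4.2), and it uses that $f = \sigma_0 + 1$ takes only the values $0$ and $2$. The probability $p$ is a label introduced here.

```latex
% Assumption: (1.13) gives  T^{+}_{{\rm mix},J_\ell} \le \exp(c\,\ell^{\varepsilon}).
% p := \pi^+_{J_\ell}(\sigma_0=+)  (hypothetical label); f takes the value 2 with prob. p, else 0.
\begin{align*}
\mathrm{gap}^+_{J_\ell}\ &\ge\ \frac{1}{T^{+}_{{\rm mix},J_\ell}}\ \ge\ e^{-c\ell^{\varepsilon}},
\qquad
\mathrm{Var}_{\pi^+_{J_\ell}}(f)=4p(1-p)\le 1,\\
\rho(t)\ &\le\ \mathrm{Var}_{\pi^+_{J_\ell}}\!\bigl(e^{t\mathcal{L}^+_{J_\ell}}f\bigr)+c\,e^{-c\ell}
 && \text{by (4.24)}\\
 &\le\ \mathrm{Var}_{\pi^+_{J_\ell}}(f)\,e^{-2t\,\mathrm{gap}^+_{J_\ell}}+c\,e^{-c\ell}
 && \text{by (4.25)}\\
 &\le\ e^{-2t\,e^{-c\ell^{\varepsilon}}}+c\,e^{-c\ell}
 && \text{by (4.26) and (1.13).}
\end{align*}
```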

Now letting $\ell = \ell(t)$ be the smallest integer such that

$$c\,\ell^\varepsilon \ge \log t - \frac{1}{\varepsilon}\log\log t, \tag{4.28}$$

(with the condition that $2\ell + 1 \in \{2^n - 1\}_{n\in\mathbb{N}}$) one sees that (4.27) implies (1.15).


Appendix A. Some Equilibrium Estimates

A.1. A few basic facts on cluster expansion. In this section we rely on the results of [9], but we try to be reasonably self-contained. We let $\mathbb{Z}^{2*}$ be the dual lattice of $\mathbb{Z}^2$ and we call a bond any segment joining two neighboring sites in $\mathbb{Z}^{2*}$. Two sites $x, y$ in $\mathbb{Z}^2$ are said to be separated by a bond $e$ if their distance (in $\mathbb{R}^2$) from $e$ is $1/2$. A pair of orthogonal bonds which meet in a site $x^* \in \mathbb{Z}^{2*}$ is said to be a linked pair of bonds if both bonds are on the same side of the forty-five degrees line across $x^*$. A contour is a sequence $e_0, \dots, e_n$ of bonds such that:
(1) $e_i \ne e_j$ for every $i \ne j$, except possibly when $(i,j) = (0,n)$,
(2) for every $i$, $e_i$ and $e_{i+1}$ have a common vertex in $\mathbb{Z}^{2*}$,
(3) if four bonds $e_i, e_{i+1}$ and $e_j, e_{j+1}$, $i \ne j, j+1$ intersect at some $x^* \in \mathbb{Z}^{2*}$, then $e_i, e_{i+1}$ and $e_j, e_{j+1}$ are linked pairs of bonds.
If $e_0 = e_n$, the contour is said to be closed, otherwise it is said to be open. Given a contour $\gamma$, we let $\Delta_\gamma$ be the set of sites in $\mathbb{Z}^2$ such that either their distance (in $\mathbb{R}^2$) from $\gamma$ is $1/2$, or their distance from the set of vertices in $\mathbb{Z}^{2*}$ where two non-linked bonds of $\gamma$ meet equals $1/\sqrt{2}$. We need the following

Definition A.1. Given $V \subset \mathbb{Z}^2$, we let $\tilde V \subset \mathbb{R}^2$ be the union of all closed unit squares centered at each site in $V$, and $\bar V$ be the set of all bonds $e \in \mathbb{Z}^{2*}$ such that at least one of the two sites separated by $e$ belongs to $V$.

Given a rectangular domain $V \subset \mathbb{Z}^2$, a configuration $\sigma \in \Omega_V$ and a boundary condition $\tau$ on $\partial V$, let $\sigma^{(\tau,+)}$ be the spin configuration on $\mathbb{Z}^2$ which coincides with $\sigma$ in $V$, with $\tau$ on $\partial V$ and which is $+$ otherwise. One immediately sees that the (finite) collection of bonds of $\mathbb{Z}^{2*}$ which separate neighboring sites $x, y \in \mathbb{Z}^2$ such that $\sigma^{(\tau,+)}_x \ne \sigma^{(\tau,+)}_y$ splits in a unique way into a finite collection $\Gamma^\tau(\sigma)$ of closed contours.

It is easy to see that $\Gamma^\tau(\sigma) \cap \tilde V$ consists of a certain number of closed contours, plus $m$ open contours, where $m$ is such that going along $\partial V$ one meets $2m$ changes of sign in $\tau$. Note that the collection of the $2m$ endpoints of the open contours is fixed uniquely by $\tau$. We write $\Gamma^\tau_{\rm open}(\sigma)$ for the collection $\{\gamma_1, \dots, \gamma_m\}$ of open contours in $\Gamma^\tau(\sigma) \cap \tilde V$. Of course, the open contours $\gamma_i$ have to satisfy certain compatibility conditions: $\gamma_i$ and $\gamma_j$ have no bond in common if $i \ne j$, and if they meet at some $x^* \in \mathbb{Z}^{2*}$, each of the two linked pairs of bonds belongs to only one contour. Moreover, each $\gamma_i$ is contained in $\tilde V$ and the collection of the endpoints of the $\{\gamma_i\}_{i\le m}$ must coincide with that dictated by $\tau$. We will write $\{\gamma_1, \dots, \gamma_m\} \sim \tau$ to indicate that the collection of open contours is compatible with $\tau$. The following result can be easily deduced from [9, Sect. 3.9 and 4.3]. Writing as usual $\pi^\tau_V$ for the equilibrium measure in $V$ with b.c. $\tau$, one has

Theorem A.2. There exists $\beta_0$ such that for every $\beta > \beta_0$ the following holds. For every rectangle $V \subset \mathbb{Z}^2$, every b.c. $\tau$ on $\partial V$ and every collection $\{\gamma_1, \dots, \gamma_m\}$ of open contours compatible with $\tau$, one has

$$\pi^\tau_V\bigl(\sigma : \Gamma^\tau_{\rm open}(\sigma) = \{\gamma_1, \dots, \gamma_m\}\bigr) = \frac{\Psi(\{\gamma_1, \dots, \gamma_m\}; V)}{\Xi(V,\tau)}, \tag{A1}$$

where the Boltzmann weight $\Psi(\{\gamma_1, \dots, \gamma_m\}; V)$ is defined as

$$\Psi(\{\gamma_1, \dots, \gamma_m\}; V) := \exp\Bigl\{-2\beta\sum_{i=1}^m |\gamma_i| - \sum_{\Lambda\subset V:\ \Lambda\cap(\cup_i\Delta_{\gamma_i})\ne\emptyset}\Phi(\Lambda)\Bigr\}, \tag{A2}$$

$|\gamma_i|$ is the geometric length of $\gamma_i$ and

$$\Xi(V,\tau) := \sum_{\{\gamma_1,\dots,\gamma_m\}\sim\tau}\Psi(\{\gamma_1, \dots, \gamma_m\}; V). \tag{A3}$$

The potential $\Phi$ satisfies for every $\Lambda \subset V$, $|\Lambda| \ge 2$ and for every $x \in V$:

$$|\Phi(\Lambda)| \le \exp(-2(\beta-\beta_0)\,d(\Lambda)), \tag{A4}$$

$$|\Phi(\{x\})| \le \exp(-8(\beta-\beta_0)), \tag{A5}$$

where, for connected (in the sense of subgraphs of the graph $\mathbb{Z}^2$) $\Lambda$, $d(\Lambda)$ is the length of the smallest connected set of bonds from $\bar\Lambda$ (cf. Definition A.1) containing all the bonds which separate sites in $\Lambda$ from sites in $\Lambda^c$. If $\Lambda$ is not connected then $d(\Lambda) := +\infty$. The fast decay property of $\Phi$ (with respect to both $\beta$ and $d(\Lambda)$) has the following simple consequence:

Lemma A.3 [9, Lemma 3.10]. There exists $\beta_0'$ depending only on $\beta_0$ of Theorem A.2 such that for $\beta > \beta_0'$, for every bond $e \in \mathbb{Z}^{2*}$ and for every $d > 0$ one has

$$\sum_{\Lambda\subset\mathbb{Z}^2:\ e\in\bar\Lambda,\ d(\Lambda)\ge d} e^{-2(\beta-\beta_0)\,d(\Lambda)} \le e^{-2(\beta-\beta_0')\,d}. \tag{A6}$$

This allows one to essentially neglect the interaction between portions of a contour which are sufficiently far from each other. In order to apply directly results from [9] to obtain the estimates we need, we define the canonical ensemble of contours. Let $a, b$ be sites in $\mathbb{Z}^2$. Then, for any open contour $\gamma$ which has $a + (1/2,1/2)$, $b + (1/2,1/2) \in \mathbb{Z}^{2*}$ as endpoints, in formulas $a \stackrel{\gamma}{\leftrightarrow} b$ (with some abuse of language, we will sometimes say that $\gamma$ connects $a$ and $b$), we define the probability distribution

$$P_{a,b}(\gamma) := Z_{a,b}^{-1}\exp\Bigl\{-2\beta|\gamma| - \sum_{\Lambda\subset\mathbb{Z}^2:\ \Lambda\cap\Delta_\gamma\ne\emptyset}\Phi(\Lambda)\Bigr\} = Z_{a,b}^{-1}\,\Psi(\gamma;\mathbb{Z}^2) \tag{A7}$$

and of course

$$Z_{a,b} := \sum_{\gamma:\ a \stackrel{\gamma}{\leftrightarrow} b}\Psi(\gamma;\mathbb{Z}^2). \tag{A8}$$

Note that we do not require that $\gamma \subset \tilde V$ and the sum in $\Psi$ is now over all (connected) sets $\Lambda \subset \mathbb{Z}^2$. The expectation w.r.t. $P_{a,b}$ will be denoted by $E_{a,b}$.

A.1.1. Surface tension and basic properties. Let $\vec n$ be a vector in the unit circle $S$ such that $\vec n \cdot \vec e_1 > 0$ and call $\phi_{\vec n}$ the angle it forms with $\vec e_1$ (of course, $-\pi/2 < \phi_{\vec n} < \pi/2$). For $N \in \mathbb{N}$, let $b_{N,\vec n} = (N, y_{N,\vec n}) \in \mathbb{Z}^2$, where $y_{N,\vec n} = \max\{y \in \mathbb{Z} : y \le N\tan(\phi_{\vec n})\}$. Let also $0 := (0,0)$. Then, it is known [9, Prop. 4.12] that, for $\beta$ large enough, the surface tension introduced in (1.2) is given by

$$\tau_\beta(\vec n) := -\lim_{N\to\infty}\frac{1}{\beta\,d(0, b_{N,\vec n})}\log Z_{0,b_{N,\vec n}}, \tag{A9}$$

where, if $x, y \in \mathbb{R}^2$, $d(x,y)$ is their Euclidean distance. To be precise, one has to assume that $\phi_{\vec n}$ is bounded away from $\pm\pi/2$ uniformly in $N$, but this will be inessential for us since we will always have $\phi_{\vec n}$ small. One can extract from [9, Sect. 4.8, 4.9 and 4.12] that the surface tension is an analytic function of $\phi_{\vec n}$ (always assuming that $\beta$ is large enough), and by symmetry one sees that it is an even function of $\phi_{\vec n}$. In [9, Sect. 4.12], sharp estimates on the rate of convergence in (A9) (e.g. (A13) below) are given.

A.2. Proof of (3.20). The domain $E_L(Q_L)$ which appears in (3.20) is a rectangle with height shorter than its base, and the b.c. $\tau$ is $+$ on the South border and $-$ otherwise. Since the event that the unique open contour reaches the height of the South border of $A$ is increasing, in order to prove (3.20), by the FKG inequalities we can first of all move upwards the North border of $E_L(Q_L)$ until we obtain a square (of side $3L$, which however here we call just $L$); we let therefore $V := \{1, \dots, L\}^2$. Secondly (always by FKG) we can change the b.c. $\tau$ to $\tau' \ge \tau$ by first fixing a $\delta > 0$ and then establishing that $\tau'_x = +$ if $x = (x_1,x_2) \in \partial V$ with $x_2 \le \delta L^{1/2+\varepsilon}$, and $\tau'_x = -$ otherwise. Given a configuration $\sigma \in \Omega_V$, let $\gamma$ be the unique open contour in $\Gamma^{\tau'}_{\rm open}(\sigma)$: of course, $\gamma \subset \tilde V$ and $a_1 \stackrel{\gamma}{\leftrightarrow} a_2$, where $a_1 := (0, \delta L^{1/2+\varepsilon})$ and $a_2 := (L, \delta L^{1/2+\varepsilon})$. We let $h(\gamma) := \max\{x_2 : (x_1,x_2) \in \gamma\}$ be the maximal height reached by $\gamma$, while as usual $\varepsilon > 0$ is small and fixed. Looking at (A1) and (3.20), we see that what we have to prove is that for every fixed $\delta > 0$ one has for every $L \in \mathbb{N}$,

$$\frac{\mathcal{N}}{\Xi(V,\tau')} := \frac{\sum_{\gamma\sim\tau'}\Psi(\gamma;V)\,\mathbf{1}_{\{h(\gamma) > 2\delta L^{1/2+\varepsilon}\}}}{\Xi(V,\tau')} \le e^{-cL^{2\varepsilon}} \tag{A10}$$

for some $c(\beta,\delta,\varepsilon) > 0$. We will always assume that $\beta$ is large enough. First we upper bound the numerator in (A10): with the notations of Sect. A.1 (cf. in particular (A7)) and setting for a given contour $\gamma$ and a given $V \subset \mathbb{Z}^2$,

$$\Phi_V(\gamma) := \sum_{\Lambda\subset\mathbb{Z}^2:\ \Lambda\cap\Delta_\gamma\ne\emptyset,\ \Lambda\cap V^c\ne\emptyset}\Phi(\Lambda), \tag{A11}$$

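one has

$$\mathcal{N} \le Z_{a_1,a_2}\,E_{a_1,a_2}\bigl[\mathbf{1}_{\{h(\gamma) > 2\delta L^{1/2+\varepsilon}\}}\exp(\Phi_V(\gamma))\bigr] \le Z_{a_1,a_2}\sqrt{P_{a_1,a_2}\bigl(h(\gamma) > 2\delta L^{1/2+\varepsilon}\bigr)}\,\sqrt{E_{a_1,a_2}\exp(2\Phi_V(\gamma))}, \tag{A12}$$

where in the first step we simply removed the constraint that $\gamma \subset \tilde V$, which is implicit in the requirement $\gamma \sim \tau'$. It follows directly from [9, Prop. 4.15] that the first square root is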

[Fig. A1. The two topologically distinct possibilities: either γ1 connects 1 to 2 , or it connects w1 to 1 . The first case is very unlikely, see (A18)]

smaller than $\exp(-cL^{2\varepsilon})$ (note that we are requiring the contour to reach a height which exceeds by $\delta L^{1/2+\varepsilon}$ the height of its endpoints). On the other hand, from [9, Th. 4.16, in particular Eq. (4.16.6)] and the fast decay properties of $\Phi$ (in particular Lemma A.3) it is not difficult to deduce that the second one is upper bounded by $\exp(c(\log L)^c)$. Moreover, one has [9, Eq. (4.12.3)] that

$$Z_{a_1,a_2} \le c(\beta)\,\frac{e^{-\beta\tau_\beta(\vec e_1)L}}{\sqrt{L}}, \tag{A13}$$

where of course $\tau_\beta(\vec e_1)$ is the surface tension in the horizontal direction and we used the fact that $d(a_1,a_2) = L$. In conclusion, we have

$$\mathcal{N} \le \exp\bigl(-\beta\tau_\beta(\vec e_1)L - cL^{2\varepsilon}\bigr). \tag{A14}$$

Next we observe that, again from [9, Th. 4.16 and Eq. (4.16.7)],

$$\Xi(V,\tau') \ge \exp\bigl(-\beta\tau_\beta(\vec e_1)L - c(\log L)^c\bigr), \tag{A15}$$

which together with (A14) concludes the proof of (3.20).
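For the reader's convenience, here is how the pieces of this proof fit together. The first display is the Cauchy–Schwarz step behind (A12), the second divides (A14) by (A15). The names $\mathcal{N}$, $\Xi$ and $\Phi_V$ for the numerator, the partition function and the boundary interaction (A11) are assumptions of this sketch, and $c, c'$ are generic constants; it is a summary of the bookkeeping, not a replacement for the estimates of [9].

```latex
% Cauchy--Schwarz with respect to P_{a_1,a_2}, for an event A and a random variable X:
\[
E_{a_1,a_2}\bigl[\mathbf{1}_{A}\,e^{X}\bigr]
\le \sqrt{P_{a_1,a_2}(A)}\;\sqrt{E_{a_1,a_2}\bigl[e^{2X}\bigr]},
\qquad A=\{h(\gamma)>2\delta L^{1/2+\varepsilon}\},\quad X=\Phi_V(\gamma).
\]
% Ratio of the upper bound (A14) and the lower bound (A15):
% the leading surface-tension terms cancel exactly.
\[
\frac{\mathcal{N}}{\Xi(V,\tau')}
\le \frac{\exp\bigl(-\beta\tau_\beta(\vec e_1)L-cL^{2\varepsilon}\bigr)}
         {\exp\bigl(-\beta\tau_\beta(\vec e_1)L-c(\log L)^{c}\bigr)}
= \exp\bigl(-cL^{2\varepsilon}+c(\log L)^{c}\bigr)
\le e^{-c'L^{2\varepsilon}} .
\]
```

The last inequality holds for $L$ large because $(\log L)^c \ll L^{2\varepsilon}$; the finitely many remaining values of $L$ can be covered by decreasing $c'$, since the left-hand side is a probability strictly smaller than one.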

A.3. Proof of Claim 3.10. In this section, V is the rectangle {(i, j) ∈ Z2 : 1 ≤ i ≤ L , 1 ≤ j ≤ 4(2L + 1)1/2+ε } and the b.c. τ is defined by τx = − for x ∈ := {(i, 0) ∈ Z2 : |i − L/2 | ≤ L 3ε } and for x = (x1 , x2 ) ∈ ∂ V with x2 > 2(2L + 1)1/2+ε ; τx = + otherwise. Moreover, C is the infinite vertical column C = {(x1 , x2 ) ∈ R2 : x1 = L/2 }. Write 1 + (1, 0) (resp. 2 ) for the left-most (resp. right-most) point τ in . For every σ ∈ V there are two open contours in open (σ ): γ1 and γ2 , and we establish by convention that γ1 is the contour which contains 1 + (1/2, 1/2) as one of its endpoints. Two cases can occur (see Fig. A1): γ1

γ2

• either 1 ↔ 2 and w1 ↔ w2 , where w1 := (0, 2(2L + 1)1/2+ε ) and w2 := (L , 2(2L + 1)1/2+ε ), γ1 γ2 • or w1 ↔ 1 and 2 ↔ w2 . Let C1 (resp. C2 ) be the vertical column at distance L ε to the left (resp. to the right) of the column C. Then, one has


Lemma A.4. The probability that appears in Claim 3.10 can be upper bounded as (−,+,)

π A¯

( c ) ≤ πVτ (¯ c ),

(A16)

where γi ¯ := {wi ↔ i and γi ∩ Ci = ∅, i = 1, 2}.

(A17)

Therefore, from Theorem A.2 we see that to prove Claim 3.10 it is enough to show that γ1 {γ1 ,γ2 }∼τ ({γ1 , γ2 }; V )1{ ↔ N1 3ε 1 2} := ≤ e−cL (A18) (V, τ ) (V, τ ) and that N2 := (V, τ )

{γ1 ,γ2 }∼τ

({γ1 , γ2 }; V )1

γ1

{1 ↔w1 }

(V, τ )

1{γ1 ∩C1 =∅}

≤ e−cL , 3ε

(A19)

for some positive c = c(β, ε). Proof of Lemma A.4. Since the event c is increasing, we note first of all that thanks to FKG we can enlarge the system from A¯ to V and change the b.c. from (−, +, ) to τ . Secondly, we observe that the event ¯ implies . A.3.1. Lower bound on (V, τ ) We will prove that there exists a positive constant c

such that for β large, (V, τ ) ≥ exp −βτβ ( e1 )(L − c L 3ε ) . (A20) Since we want a lower bound, we are allowed to keep only the configurations {γ1 , γ2 } ∼ τ γi such that wi ↔ i and γi does not touch the column Ci , for i = 1, 2. Call Gi , i = 1, 2 the set of configurations of γi allowed by the above constraints. Using the decay properties of , one sees that ⎞2 ⎛ (γ1 ; V )⎠ . (A21) (V, τ ) ≥ c ⎝ γ1 ∈G1

The square is due to the fact that γ1 and γ2 essentially do not interact because their mutual distance is larger than L ε (the residual interaction can be bounded by a constant which is absorbed in c). It remains to prove that (γ1 ; V ) ≥ exp(−βτβ ( e1 )((L/2) − c L 3ε )) (A22) γ1 ∈G1

for some positive c . This is an immediate consequence of Lemma A.6 below (applied with κ = ε), together with the fact that d(w1 , 1 ) = L/2 − L 3ε + O(L 2ε ), of the fact that the angle φ formed by the segment w1 1 and e 1 is O(L −1/2+ε ), and finally of the analyticity of the surface tension and its symmetry around e 1 .
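The three facts invoked in the last sentence can be combined explicitly. In the sketch below, $p$ denotes the endpoint of $\gamma_1$ on the South border and $h$, $v$ the horizontal and vertical separations between $w_1$ and $p$; these labels, as well as the symbols $\Psi$, $\Xi$ and $\mathcal{G}_1$ for the Boltzmann weight, the partition function and the set of allowed paths, are assumptions made for this illustration. The constants satisfy $0 < c' < 1$ and $0 < c'' < 2c'$, and $L$ is taken large.

```latex
% Hypothetical notation: p = South-border endpoint of \gamma_1,
% h, v = horizontal and vertical separations between w_1 and p.
\begin{align*}
h &= \tfrac{L}{2}-L^{3\varepsilon}+O(1), \qquad v = O(L^{1/2+\varepsilon}),\\
d(w_1,p) &= \sqrt{h^2+v^2} = h+O\bigl(v^2/h\bigr)
          = \tfrac{L}{2}-L^{3\varepsilon}+O(L^{2\varepsilon}),\\
\phi &= O(v/h) = O(L^{-1/2+\varepsilon}), \qquad
\tau_\beta(\vec v_{w_1 p}) = \tau_\beta(\vec e_1)+O(\phi^2)
          \quad\text{(analyticity and evenness in } \phi\text{)}.
\end{align*}
% Lemma A.6 with \kappa=\varepsilon, then the squaring step (A21):
\begin{align*}
\sum_{\gamma_1\in\mathcal{G}_1}\Psi(\gamma_1;V)
 &\ge \exp\Bigl(-\beta\tau_\beta(\vec v_{w_1p})\,d(w_1,p)-c\,d(w_1,p)^{2\varepsilon}\Bigr)\\
 &\ge \exp\Bigl(-\beta\tau_\beta(\vec e_1)\bigl(\tfrac L2-L^{3\varepsilon}\bigr)-O(L^{2\varepsilon})\Bigr)
  \ \ge\ \exp\Bigl(-\beta\tau_\beta(\vec e_1)\bigl(\tfrac L2-c'L^{3\varepsilon}\bigr)\Bigr),\\
\Xi(V,\tau) &\ge c\,\Bigl(\sum_{\gamma_1\in\mathcal{G}_1}\Psi(\gamma_1;V)\Bigr)^{2}
 \ \ge\ \exp\Bigl(-\beta\tau_\beta(\vec e_1)\bigl(L-c''L^{3\varepsilon}\bigr)\Bigr).
\end{align*}
```

The point of the computation is that every correction is of order $L^{2\varepsilon}$, which is negligible with respect to the gain $L^{3\varepsilon}$ in the exponent.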


A.3.2. Upper bound on N1 Using rough upper bounds on the number of paths γ1 which connect 1 and 2 and the decay properties of (in particular Lemma A.3), one sees that for L large, N1 ≤ e−cL

3ε

(γ ; V )

(A23)

γ γ ⊂V˜ : w1 ↔w2

for some c = c(β, ε) > 0, where of course one uses the fact that d(1 , 2 ) = 2L 3ε . Moreover, Theorem 4.16 of [9] ensures that (γ ; V ) ≤ exp(−βτβ ( e1 )L + c(log L)c ), (A24) γ

γ ⊂V˜ : w1 ↔w2

which, together with (A20), concludes the proof of (A18). A.3.3. Proof of (A19) The estimate we wish to prove is very intuitive: if the path γ1 makes a deviation to the right to touch the column C1 , it has an excess length, and therefore an excess energy, of order L 3ε with respect to typical paths. The actual proof of (A19) is a straightforward (although a bit lengthy) application of results from [9] and of the FKG inequalities. We sketch only the main steps. First of all, letting d(γ1 , γ2 ) := min{d(x1 , x2 ), xi ∈ γi , i = 1, 2}, we show that the contribution of the configurations such that d(γ1 , γ2 ) < L ε is negligible. To this purpose, decompose first of all N2 as N2 = N2 + N2

, where N2 :=

({γ1 , γ2 }; V )1

γ1

{1 ↔w1 }

{γ1 ,γ2 }∼τ

1{γ1 ∩C1 =∅} 1{d(γ1 ,γ2 )
(A25)

Consider the paths γi as oriented from wi to i and, if d(γ1 , γ2 ) < L ε , call ∗ ∗ ∗ P := P(γ1 , γ2 ) := (x1 , x2 ) ∈ Z2 × Z2 , where x1 is the first point in γ1 ∩ Z2 ∗ ε 2 which is at distance less than L from γ2 , and x2 is the first point in γ2 ∩ Z at distance less than L ε from x1 . Of course, P can take at mostL 2 different values (this is a rough upper bound) and we can decompose N2 as N2 = p N2, p , where N2, p contains only the terms such that P(γ1 , γ2 ) = p. Given (γ1 , γ2 ) such that P(γ1 , γ2 ) = p, for i = 1, 2 one can write γi as the union of γi and γi

, where γi connects wi to xi , and γi

connects xi to i . Using the decay properties of one sees that, uniformly in p and in {γi }i=1,2 ,

({γ1 , γ2 }; V ) ≤ c(γ1 ; V )(γ2 ; V ),

(A26)

{γi

}i=1,2

where the sum runs over all the configurations of {γi

}i=1,2 compatible with {γi }i=1,2 . Let be the set of paths γ3 which connect x1 to x2 , and such that the concatenation of γ1 , γ3 and γ2 is an admissible open path, call it simply γ , connecting w1 to w2 and contained in V˜ . Of course, the set depends on {γi }i=1,2 . Then, one sees that {γi

}i=1,2

({γ1 , γ2 }; V ) ≤ ecL

ε

γ3 ∈

(γ ; V ).

(A27)


In conclusion, summing over the admissible configurations of $\{\gamma_i'\}_{i=1,2}$ and over the possible values of $p$, recalling (A24) and the lower bound (A20), we have shown that

$$\frac{\mathcal{N}_2'}{\Xi(V,\tau)} \le e^{-cL^{3\varepsilon}}. \tag{A28}$$

As for N2

, using the decay properties of the potential one sees immediately that, since d(γ1 , γ2 ) ≥ L ε , the mutual interaction between the two paths can be bounded by a constant, so that N2

≤ c (γ1 ; V )1{γ1 ∩C1 =∅} × (γ2 ; V ). (A29) γ1

γ2

γ1 ⊂V˜ : 1 ↔w1

γ2 ⊂V˜ : 2 ↔w2

Recalling (A21) one sees therefore that N2

Q ≤c , (τ, V ) (1 − Q)2 where

Q :=

γ

{γ ⊂V˜ : 1 ↔w1 }

(A30)

(γ ; V )1{γ ∩C1 =∅}

(A31)

γ (γ ; V ) {γ ⊂V˜ : 1 ↔w1 }

and we are left with the task of proving that Q ≤ exp(−cL 3ε ). Note that Q is nothing but the equilibrium probability πVτˆ (γ ∩ C1 = ∅), where γ is the unique open contour for a system enclosed in V and with boundary conditions τˆ given by τˆx = + for x = (i, 0) with i < L/2 − L 3ε and x = (0, i) with i ≤ 2(2L + 1)1/2+ε , and τˆx = − otherwise. Morally, one would like to apply [9, Th. 4.15] to say that Q ≤ exp(−cL 3ε ); such result however cannot be applied directly because of the entropic repulsion effect that γ feels due to the South border of V , and we need to take a small detour. Consider the L-shaped domain W obtained as the union of the rectangles V and V , where V = {(i, j) ∈ Z2 : −L 1/2+ε ≤ j ≤ 0, 1 ≤ i < L/2 − L 3ε − 1}, with boundary conditions τˆ given by τˆ = τˆ on ∂ W ∩ ∂ V and τˆ = + on ∂ W ∩ ∂ V , see Fig. A2. Below we will prove Lemma A.5. One has

τˆ Q = πVτˆ (γ ∩ C1 = ∅) ≤ πW (γ ∩ C1 = ∅| ) ≤

τˆ (γ ∩ C = ∅) πW 1

τˆ ( ) πW

,

(A32)

where = {σ ∈ W : ∃ inside V a ∗-connected path of + spins which connect the site 1 + (0, 1) to one of the sites (1, i) with 1 ≤ i ≤ 2(2L + 1)1/2+ε }, see Fig. A2. The numerator in the right-hand side of (A32) is smaller than exp(−cL 3ε ). Indeed, it suffices to remark that (cf. the notation (A7)) it is smaller than Ew1 ,1 1{γ ∩C1 =∅} exp (W (γ )) Pw1 ,1 (γ ∩ C1 = ∅)Ew1 ,1 (exp (2W (γ ))) ≤ , Ew1 ,1 exp (W (γ )) Ew1 ,1 exp (W (γ )) (A33)

[Fig. A2. The L-shaped domain W (for graphical convenience, proportions are not respected in the drawing) with its boundary conditions τˆ . For the construction of γ , one should imagine that the spins in the framed region are set to −. The sites marked by ∗ denote the ∗-connected set + (γ ). The drawn configuration of γ is entirely above the straight line going through w1 + (1/2, 1/2) and 1 + (1/2, 1/2), i.e. the spin configuration σ belongs to the set appearing in (A36)]

where W (γ ) was defined in (A11). Theorem 4.15 of [9] says directly that Pw1 ,1 (γ ∩ C1 = ∅) ≤ exp(−cL 3ε ), while the fast decay of , together with [9, Th. 4.16], implies that Ew1 ,1 exp (2W (γ )) ≤ exp(c(log L)c ), Ew1 ,1 exp (W (γ )) ≥ exp(−c(log L)c ).

(A34) (A35)

Roughly speaking, typical paths (under Pw1 ,1 ) have a small intersection with W c (again, the precise estimates follow from [9, Th. 4.15]). This is why we enlarged V to W : if W were replaced by V , the intersection would not be small any more and the expectations in (A34)–(A35) would not be under control. The denominator in (A32) is also not difficult to deal with: one observes (see Fig. A2) that the event is implied by the event

={γ does not go below the straight line which goes through 1 + (1/2, 1/2) and w1 + (1/2, 1/2)} (we will write symbolically γ ≥ (1 w1 )). Indeed, the subset of γ , where spins are + is ∗-connected and satisfies

the requirements of . Therefore, πVτˆ ( ) ≥ exp(−cL ε ). Indeed, γ ∼τˆ (γ ; W )1{γ ≥(1 w1 )} τˆ

τˆ

πV ( ) ≥ πW ( ) = : (A36) γ ∼τˆ (γ ; W )


the numerator is lower bounded by exp[−βτβ ( vw1 1 )d(w1 , 1 ) − c(d(w1 , 1 ))ε ] via Lemma A.6 (take κ = ε/2) and the denominator is upper bounded by exp[−βτβ ( vw1 1 )d(w1 , 1 ) + c(log d(w1 , 1 ))c ] via [9, Th. 4.16], where v w1 1 is the unit vector pointing from w1 to 1 . Summarizing, we have obtained Q ≤ exp(−cL 3ε ) and, via (A30) and (A28), we have proven (A19). Proof of Lemma A.5. Given a configuration σ ∈ W , imagine to replace all its spins in ∂ V ∩ V by −, cf. Fig. A2; then, associated to the restriction σV ∈ V , there are exactly two open contours in V˜ . The endpoints of these two contours are (1/2, 1/2), w1 + (1/2, 1/2), 1 + (1/2, 1, 2) and 1 + (−1/2, 1/2). Under the assumption that σ ∈ , one sees immediately that one of the two contours connects w1 to 1 (this is nothing else but the open contour which we have called γ so far, e.g. in (A32)); we will call γ the second open contour, see Fig. A2. Given a possible configuration for γ , V is divided into two components, call them V ± (γ ), where V − (γ ) is the one “in contact with” V . It is clear that the intersection + (γ ) := γ ∩ V + (γ ) is a ∗-connected set (i.e. any two of its points can be linked by a ∗-connected chain belonging to + (γ )) and all spins are + there. It is important to remark that if we take σ ∈ and flip any +

+

spin in Vγint

:= V (γ )\ (γ ), the configuration of γ does not change. Also, if (with abuse of notation) we let πγ denote the equilibrium measure in Vγint

with b.c. + on the portion of the boundary which coincides with + (γ ) and τˆ otherwise, one has πγ (γ ∩ C1 = ∅) ≥ πVτˆ (γ ∩ C1 = ∅),

(A37)

by FKG since the event γ ∩C1 = ∅ is increasing. One has then, with S the set of possible configurations of γ ,

τˆ (γ ∩ C = ∅) πW 1

τˆ ( ) πW

≥ =

τˆ (γ ∩ C = ∅; ) πW 1

τˆ ( ) πW

τˆ

πW (γ

∩ C1 = ∅| ; γ = ξ )

τˆ ( ; γ = ξ ) πW

ξ ∈S

=

τˆ ( ) πW

πξ (γ ∩ C1 = ∅)

τˆ ( ; γ = ξ ) πW

ξ ∈S

τˆ ( ) πW

≥ πVτˆ (γ ∩ C1 = ∅), (A38)

where we used (A37) in the second inequality.

A.3.4. A technical lemma. Let $a := (a_1,a_2) \in \mathbb{Z}^{2*}$ and $b = (b_1,b_2) \in \mathbb{Z}^{2*}$ with $b_1 > a_1$. Let $\vec v_{ab}$ be the unit vector pointing from $a$ to $b$ and $\phi_{ab}$ be the angle which $\vec v_{ab}$ forms with $\vec e_1$. Assume that $-\pi/4 \le \phi_{ab} \le \pi/4$. Let $A > 0$, $\kappa > 0$, let $U_{a,b} = U_{a,b}(A,\kappa) \subset \mathbb{R}^2$ be the cigar-shaped region which is delimited by the two curves

$$x \mapsto \xi^\pm_{a,b;A,\kappa}(x) := x\tan(\phi_{ab}) \pm A\left(\frac{(x-a_1)(b_1-x)}{b_1-a_1}\right)^{1/2+\kappa}, \quad x \in [a_1,b_1],$$

and $U^+_{a,b}$ be the upper half of $U_{a,b}$, obtained by slicing $U_{a,b}$ along the segment $ab$. Also, we will denote by $\Gamma_{a,b} = \Gamma_{a,b}(A,\kappa)$ the set of all open contours $\gamma$ having $a$ and $b$ as endpoints, and such that every bond in $\gamma$ has non-empty intersection with $U_{a,b}$; similarly we define $\Gamma^+_{a,b}$. Then,

Lemma A.6. Let $\beta$ be large enough, and consider a domain $V \subset \mathbb{Z}^2$ such that $\tilde V$ contains $U^+_{a,b}(A,\kappa)$ (cf. Definition A.1). There exists $c$ depending on $\beta, A, \kappa$ such that

$$\sum_{\gamma\in\Gamma^+_{a,b}}\Psi(\gamma;V) \ge \exp\bigl(-\beta\tau_\beta(\vec v_{ab})\,d(a,b) - c\,(d(a,b))^{2\kappa}\bigr). \tag{A39}$$

[Fig. A3. A typical path $\gamma$ which contributes to the lower bound (A39). For graphical convenience, we have assumed that $a$ and $b$ have the same vertical coordinate, and not all the cigar-shaped sets $U_{z_i,z_{i+1}}(A',\kappa)$ have been drawn]

This result can be obtained via a repeated use of Theorem 4.16 of [9]. The error term $\exp(-c\,(d(a,b))^{2\kappa})$ is very rough (but sufficient for our purposes) and can presumably be improved. We do not give full details because they are a bit lengthy, although standard, but we sketch the main steps. First of all, let for simplicity of notations $L := b_1 - a_1$ and $A' := A/10$. Then, one proceeds as follows (keep in mind Fig. A3):

• for every $-n \le i \le n$, with $n = \lfloor\log_2(L)\rfloor - 2$, let $z_i = (x_i,y_i)$ be a point in $\mathbb{Z}^{2*}$ at minimal distance from $(\tilde x_i, \xi^+_{a,b;A',\kappa}(\tilde x_i))$, where

$$\tilde x_i := a_1 + (b_1 - a_1)\Bigl(\frac{1}{2} + \frac{\mathrm{sign}(i)}{4}\sum_{j=0}^{|i|-1}2^{-j}\Bigr); \tag{A40}$$

• remark via elementary geometrical considerations that for every $-n \le i < n$, the cigar-shaped set $U_{z_i,z_{i+1}}(A',\kappa)$ is entirely contained in $U^+_{a,b}(A,\kappa)$;
• restrict the sum (A39) to the paths $\gamma$ which, when oriented from $a$ to $b$, go through the points $z_{-n}, z_{-n+1}, \dots, z_n$ (in this order), and such that the portion of the path between $z_i$ and $z_{i+1}$ belongs to $\Gamma_{z_i,z_{i+1}}(A',\kappa)$;
• remark that, via the decay properties of the potential $\Phi$, the interaction between two adjacent portions of $\gamma$ just defined can be bounded above by a constant;


• apply Theorem 4.16 of [9] to write that for every $-n \le i < n$ one has

$$\sum_{\gamma\in\Gamma_{z_i,z_{i+1}}(A',\kappa)}\Psi(\gamma;V) \ge \exp\bigl(-\beta\tau_\beta(\vec v_{z_i,z_{i+1}})\,d(z_i,z_{i+1}) - c\,(\log d(z_i,z_{i+1}))^c\bigr), \tag{A41}$$

for some constant $c$ depending on $A, \kappa, \beta$. As for the two portions of $\gamma$ from $a$ to $z_{-n}$ and from $z_n$ to $b$, they give a multiplicative contribution of order 1 to (A39) (this is because $d(a,z_{-n}) = O(1)$ and $d(b,z_n) = O(1)$, as is immediately seen from the definition of $n$);
• put together the estimates on the contributions coming from the $2n+3$ portions of $\gamma$ obtained in the previous point: using the convexity and smoothness properties of the surface tension $\tau_\beta(\cdot)$, one obtains the claim of the lemma.

Acknowledgements. We are extremely grateful to Senya Shlosman and to Yvan Velenik for valuable help on low-temperature equilibrium estimates. Part of this work was done during the authors’ stay at the Institut Henri Poincaré - Centre Emile Borel during the semester “Interacting particle systems, statistical mechanics and probability theory”. The authors thank this institution for hospitality and support.

References 1. Alexander, K.S.: The spectral gap of the 2-D stochastic ising model with nearly single-spin boundary conditions. J. Stat. Phys. 104, 59–87 (2001) 2. Alexander, K.S., Yoshida, N.: The spectral gap of the 2-D stochastic Ising model with mixed boundary conditions. J. Stat. Phys. 104, 89–109 (2001) 3. Higuchi, Y., Yoshida, N.: Slow relaxation of 2-D stochastic Ising models with random and non-random boundary conditions. In: New Trends in Stochastic Analysis, (Charingworth, England, Sept. 1994), Singapore: World Scientific, 1994, pp. 153–167 4. Schonmann, R.H., Yoshida, N.: Exponential relaxation of Glauber dynamics with some special boundary conditions. Commun. Math. Phys. 189(2), 299–309 (1997) 5. Bianchi, A.: Glauber dynamics on non-amenable graphs: boundary conditions and mixing time. Electron. J. Probab. 13, 1980–2012 (2008) 6. Bodineau, T., Martinelli, F.: Some new results on the kinetic Ising model in a pure phase. J. Stat. Phys. 109, 207–235 (2002) 7. Caputo, P., Martinelli, F., Toninelli, F.L.: On the approach to equilibrium for a polymer with adsorption and repulsion. Electron. J. Probab. 13, 213–258 (2008) 8. Cesi, F., Guadagni, G., Martinelli, F., Schonmann, R.H.: On the 2D stochastic Ising model in the phase coexistence region close to the critical point. J. Stat. Phys. 85, 55–102 (1996) 9. Dobrushin, R., Kotecký, R., Shlosman, S.: Wulff Construction. A global Shape from Local Interaction. Transl. Math. Monographs 104, Providence, RI: Amer. Math. Soc., 1992 10. Fisher, D.S., Huse, D.A.: Dynamics of droplet fluctuations in pure and random Ising systems. Phys. Rev. B 35, 6841–6846 (1987) 11. Fortuin, C.M., Kasteleyn, P.W., Ginibre, J.: Correlation inequalities on some partially ordered sets. Commun. Math. Phys. 22, 89–103 (1971) 12. Higuchi, Y., Wang, J.: Spectral gap of Ising model for Dobrushin’s boundary condition in two dimensions. Preprint, 1999 13. Liggett, T.M.: Interacting particle systems. New York: Springer Verlag, 1985 14. 
Levin, D.A., Peres, Y., Wilmer, E.L.: Markov Chains and Mixing Times. Providence, RI: Amer. Math. Soc., 2009 15. Levin, D., Luczak, M., Peres, Y.: Glauber dynamics for the Mean-field Ising Model: cut-off, critical power law, and metastability. Probab. Theory Related Fields 146(1,2), 223–265 (2010) 16. Martinelli, F.: On the two dimensional dynamical Ising model in the phase coexistence region. J. Stat. Phys. 76, 1179–1246 (1994) 17. Martinelli, F.: Lectures on Glauber dynamics for discrete spin models. Lecture Notes in Math. 1717, Berlin: Springer, 1999

Mixing Time of 2D Stochastic Ising Model at Low Temperature


18. Martinelli, F., Sinclair, A., Weitz, D.: Glauber dynamics on trees: Boundary conditions and mixing time. Commun. Math. Phys. 250(2), 301–334 (2004) 19. Martinelli, F., Sinclair, A.: Mixing time for the solid-on-solid model. In: Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC), New York: Assoc. for Comp. Mach., 2009, pp. 571–580 20. Martin-Löf, A.: Mixing properties, differentiability of the free energy and the central limit theorem for a pure phase in the Ising model at low temperature. Commun. Math. Phys. 32, 75–92 (1973) 21. Messager, A., Miracle-Solé, S., Ruiz, J.: Convexity properties of the surface tension and equilibrium crystals. J. Stat. Phys. 67, 449–470 (1992) 22. Peres, Y.: Mixing for Markov Chains and Spin Systems. Available at www.stat.berkeley.edu/~peres/ubc. pdf, August 2005 23. Shlosman, S.: The droplet in the tube: a case of phase transition in the canonical ensemble. Commun. Math. Phys. 125, 81–90 (1989) 24. Simon, B.: The Statistical Mechanics of Lattice Gases. Vol. I. Princeton Series in Physics. Princeton, NJ: Princeton University Press, 1993 25. Sugimine, N.: A lower bound on the spectral gap of the 3-dimensional stochastic Ising models. J. Math. Kyoto Univ. 42, 751–788 (2002) 26. Sugimine, N.: Extension of Thomas’ result and upper bound on the spectral gap of d(≥ 3)-dimensional stochastic Ising models. J. Math. Kyoto. Univ. 42(1), 141–160 (2002) 27. Thomas, L.E.: Bound on the mass gap for finite volume stochastic Ising models at low temperature. Commun. Math. Phys. 126, 1–11 (1989) Communicated by H. Spohn

Commun. Math. Phys. 296, 215–249 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-0994-y

Communications in

Mathematical Physics

From Limit Cycles to Strange Attractors

William Ott1, Mikko Stenlund2,3

1 Department of Mathematics, University of Houston, Houston, TX 77204-3008, USA. E-mail: [email protected]

2 Courant Institute of Mathematical Sciences, New York, NY 10012, USA. E-mail: [email protected]

3 Department of Mathematics and Statistics, University of Helsinki, P.O. Box 68, 00014 Helsinki, Finland

Received: 28 May 2009 / Accepted: 27 October 2009
Published online: 11 February 2010 – © Springer-Verlag 2010

Abstract: We define a quantitative notion of shear for limit cycles of flows. We prove that strange attractors and SRB measures emerge when systems exhibiting limit cycles with sufficient shear are subjected to periodic pulsatile drives. The strange attractors possess a number of precisely-defined dynamical properties that together imply chaos that is both sustained in time and physically observable.

1. Introduction

This paper is about a mechanism for producing chaos: shear. We are guided by the idea that in the presence of shear, a stable dynamical structure can be transformed into a strange attractor with strong stochastic properties by forcing the structure with a pulsatile drive. The forcing does not overwhelm the intrinsic dynamics. Instead, it acts as an amplifier, amplifying the effects of the intrinsic shear.

We focus on one particular dynamical structure of great importance: the limit cycle. Limit cycles are asymptotically stable periodic orbits of flows on Riemannian manifolds. The application of a periodic pulsatile drive to a flow exhibiting a limit cycle causes deformations to occur. If shear is present in a neighborhood of the limit cycle, if the limit cycle only weakly attracts nearby orbits, and if the time between pulses (the relaxation time) is sufficiently large, then stretch-and-fold geometry emerges in a neighborhood of the limit cycle. Stretch-and-fold geometry suggests that chaotic behavior that is both sustained in time and observable may exist.

We prove that such chaotic behavior does exist in a certain parameter regime for any (generic) forcing function if the shear is sufficiently strong. Moreover, we define a quantity called the shear integral that quantifies the amount of shear that is present in the intrinsic flow in a neighborhood of the limit cycle. We emphasize that the shear integral depends only on the intrinsic system and not on the external forcing. Our result is the first of its kind for general limit cycles.
Wang and Young [16,17] obtain results of a similar flavor for supercritical Hopf bifurcations and certain linear models.


The search for and analysis of stochastic behavior in deterministic dynamical systems have played a major role in guiding dynamical systems research. We discuss a few relevant developments.

The theory of uniformly hyperbolic systems is well-developed. Let $M$ be a compact Riemannian manifold and let $f : M \to M$ be a $C^2$ diffeomorphism of $M$. An attractor for $f$ is a compact set $\Omega$ satisfying $f(\Omega) = \Omega$ for which there exists an open set $U \subset M$ (the basin) such that $f(\bar{U}) \subset U$ and $\Omega = \bigcap_{i=0}^{\infty} f^i(\bar{U})$. An attractor $\Omega$ is said to be an Axiom A attractor if the tangent bundle over $\Omega$ splits into two $Df$-invariant subbundles $E^s$ and $E^u$ such that vectors in $E^s$ are contracted by $Df$ and vectors in $E^u$ are expanded by $Df$ (we assume $E^u$ is nontrivial). An Axiom A attractor supports a special invariant measure known as a Sinai-Ruelle-Bowen (SRB) measure that describes the asymptotic distribution of the orbit of almost every point in $U$ with respect to Riemannian volume and has strong stochastic properties. In this sense, the chaotic behavior associated with Axiom A systems is observable. It is also sustained in time because of the presence of positive Lyapunov exponent(s). One can, in principle, detect the presence of uniform hyperbolicity in a given system by finding invariant cone families with suitable properties. For example, Tucker uses this approach to prove that the Lorenz equations are chaotic for the classical parameter values studied by Lorenz [13].

Many systems of interest in the biological and physical sciences display some form of hyperbolicity but are not uniformly hyperbolic. A mature theory of nonuniform hyperbolicity has emerged over the last four decades. However, the following problem remains a challenge. Given a dynamical system (or a parametrized family of dynamical systems), how can nonuniform hyperbolicity be detected? Numerical techniques include the calculation of Lyapunov exponents and the 0-1 test [4,5]. This paper addresses the analytical component of the problem in the context of limit cycles. Our proofs are based on the recently-developed theory of rank one maps [15,18]. Rank one theory is based on the ideas of Jakobson [8], Benedicks and Carleson [1,2], and Young [19,20]. Rank one theory provides checkable conditions that imply the existence of SRB measures with strong stochastic properties in parametrized families of diffeomorphisms.

We conclude the introduction with a remark that the results obtained in this paper are in some sense dual to the phenomenon known as self-induced stochastic resonance (SISR) (see e.g. [3]). Our results demonstrate that certain intrinsic characteristics of a deterministic system (shear) can produce stochastic-type behavior when the system is forced in a deterministic way. SISR demonstrates that underlying phase space structures can produce deterministic (coherent) behavior in stochastically-forced systems when the noise level is taken to 0 along certain distinguished limits.

2. Statement of Results

We state the main results and discuss their relationship to the existing literature. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be a $C^5$ vector field and consider the differential equation

\[
\frac{dx}{dt} = f(x). \tag{2.1}
\]

We assume that (2.1) admits an asymptotically stable hyperbolic periodic solution $\eta$ of length $L$ and period $p_0$. Let $\gamma : \mathbb{R} \to \mathbb{R}^n$ be a function of the parameter $s$ that parametrizes $\eta$ by length. Define $\Gamma = \{\gamma(s) : s \in [0, L)\}$. Solutions to (2.1) that begin sufficiently close to $\Gamma$ will converge to $\Gamma$ at an exponential rate as $t \to \infty$. We are interested in the effects of adding periodic pulsatile forcing to the vector field defining (2.1).


For $0 < \rho < T$, define the periodic function $P_{\rho,T} : \mathbb{R} \to \mathbb{R}$ as follows. For $0 \le t < T$, set

\[
P_{\rho,T}(t) =
\begin{cases}
1, & \text{if } 0 \le t \le \rho, \\
0, & \text{if } \rho < t < T,
\end{cases}
\]

and then extend periodically to all $t \in \mathbb{R}$ by requiring $P_{\rho,T}(t + T) = P_{\rho,T}(t)$. We study the externally-forced system

\[
\frac{dx}{dt} = f(x) + \varepsilon P_{\rho,T}(t) F(x), \tag{2.2}
\]

where $F : \mathbb{R}^n \to \mathbb{R}^n$ is a $C^4$ vector field and the parameter $\varepsilon > 0$ controls the amplitude of the forcing. Notice that the right side of (2.2) is not continuous. In Sect. 3 we compute a normal form of Eq. (2.2) that is valid in a tubular neighborhood $\tilde{M} \approx \Gamma \times D$, where $D$ is a closed disk in $\mathbb{R}^{n-1}$ of sufficiently small radius. We are interested in the dynamics of (2.2) in the tubular neighborhood $M \approx \Gamma \times \frac{1}{2} D$.

Since the external forcing is periodic with period $T$, it is natural to study the time-$T$ map induced by (2.2). We write the time-$T$ map as the composition of a kick map $H_k : M \to \tilde{M}$ and a relaxation map $H_r : \tilde{M} \to \operatorname{int}(M)$. Let $H_k$ be the time-$\rho$ map induced by the flow associated with (2.2). Notice that the external forcing is active during the kick phase because $P_{\rho,T}(t) = 1$ for $0 \le t \le \rho$. For $\varepsilon$ sufficiently small, $H_k$ maps $M$ into $\tilde{M}$ diffeomorphically. Let $H_r$ be the time-$(T - \rho)$ map induced by (2.2) with $\varepsilon$ set to 0. There exists $T_0 = T_0(\varepsilon)$ such that if $T \ge T_0$, then $H_r$ maps $\tilde{M}$ into $\operatorname{int}(M)$. The composition $G_T := H_r \circ H_k$ is the time-$T$ map induced by (2.2).

The dynamical properties of $G_T : M \to \operatorname{int}(M)$ depend on a number of factors. One feature common to every map $G_T$ for $T \ge T_0$ is the existence of an attractor $\Omega$ defined by

\[
\Omega = \bigcap_{i=0}^{\infty} G_T^i(M).
\]

We call $U := \operatorname{int}(M)$ the basin of attraction of $\Omega$. For every $x \in U$, $G_T^i(x) \to \Omega$ as $i \to \infty$. Two characteristics of the intrinsic system (2.1) play a key role in determining the structure of $\Omega$ and the dynamical properties of $G_T$: shear and the strength of the limit cycle. We quantify these notions momentarily; for now, imagine that (2.1) exhibits strong shear in $M$ if for most points $x \in \Gamma$, the velocity vector $f(\hat{x})$ varies substantially as $\hat{x}$ moves away from $x$ in directions orthogonal to the limit cycle $\Gamma$. Think of the limit cycle as strongly stable if solutions to (2.1) that begin in $M$ converge quickly to $\Gamma$.

If the shear is weak and the limit cycle is strongly stable, then the attractor $\Omega$ associated with $G_T$ will be an invariant closed curve. We are interested in the opposite situation. Suppose that the shear is strong in $M$ and the limit cycle is weakly stable. The addition of the periodic pulsatile external force $\varepsilon P_{\rho,T}(t) F(x)$ will amplify the effect of the shear in the following way: disturbances that are created when $P_{\rho,T} = 1$ will be stretched during the relaxation period (when $P_{\rho,T} = 0$). The stretching effect increases in intensity as $T$ increases. If $T$ is large, then folds will be created in the phase space. If $G_T$ exhibits stretch-and-fold geometry, then $G_T$ potentially exhibits chaotic behavior that is sustained in time and observable. This paper aims to accomplish the following:


(1) We define a computable quantity called the shear integral that quantifies the shear associated with the intrinsic system (2.1) near the limit cycle $\Gamma$.

(2) We prove that if the magnitude of the shear integral is sufficiently large and if the contraction near the limit cycle is sufficiently weak, then the following holds for suitable values of $\varepsilon$. For a typical external vector field $F$, there exist $T_1 > 0$ and a set $\Delta \subset [T_1, \infty)$ of positive Lebesgue measure such that for $T \in \Delta$, the time-$T$ map $G_T$ associated with (2.2) admits a strange attractor $\Omega$ and exhibits chaos that is sustained in time and observable.

The quantity $T_1$ satisfies $T_1 \gg \rho$, ensuring sufficient relaxation time for the stretch-and-fold geometry to emerge. The term strange attractor refers to a number of precisely defined dynamical and structural properties that represent sustained, observable chaos. For $T \in \Delta$, $\Omega$ supports a unique ergodic SRB measure $\nu$. Here the term SRB measure refers to a measure $\nu$ with a positive Lyapunov exponent $\nu$ almost everywhere and whose conditional measures on unstable manifolds are absolutely continuous with respect to Riemannian volume on these manifolds. The SRB measure $\nu$ satisfies the central limit theorem and exhibits exponential decay of correlations for Hölder continuous observables. For Lebesgue almost every $x$ in the basin of attraction $U$, the orbit of $x$ has a positive Lyapunov exponent and is asymptotically distributed according to $\nu$ in the sense that for every continuous function $\varphi : U \to \mathbb{R}$, we have

\[
\lim_{m \to \infty} \frac{1}{m} \sum_{i=0}^{m-1} \varphi(G_T^i(x)) = \int \varphi \, d\nu. \tag{2.3}
\]

Notice that this statement is substantially stronger than the conclusion of the Birkhoff ergodic theorem. The Birkhoff ergodic theorem implies that (2.3) holds for $\nu$ almost every $x$. However, $\nu$ is singular with respect to Lebesgue measure (supported on a set of Lebesgue measure zero) because the dynamics are dissipative. We prove that (2.3) holds for Lebesgue almost every $x \in U$. See (SA1)–(SA4) in Sect. 4 for a more precise description of the dynamical properties of $G_T$ for $T \in \Delta$.

We now define the shear integral. In Sect. 3 we derive a normal form of (2.1) that is valid in $\tilde{M}$. The normal form, expressed in the natural $(s, z)$-coordinates introduced in Sect. 3.1, is given by

\[
\frac{dt}{ds} = \|f(\gamma(s))\|^{-1} + \langle \beta(s), z \rangle + \omega_1(s, z), \tag{2.4a}
\]
\[
\frac{dz}{ds} = Az + \omega_2(s, z). \tag{2.4b}
\]

Here $\langle \cdot, \cdot \rangle$ denotes the inner product on $\mathbb{R}^{n-1}$. Functions depending on $s$ in (2.4a)–(2.4b) are periodic in $s$ with period $2L$. The matrix $A$ is in Jordan canonical form. The functions $\omega_1$ and $\omega_2$ represent higher order corrections. The function $\beta$ gives the pointwise magnitude and direction of the shear. Define the shear integral $\Sigma$ by

\[
\Sigma = (\Sigma_1, \ldots, \Sigma_{n-1}) := \int_0^{2L} \beta(\tau) \, d\tau
\]

and define the shear factor $\sigma$ by $\sigma := \|\Sigma\|$. Having defined the shear integral, we describe the setting of the main theorem. We identify intrinsic parameters (parameters associated with $f$) and external parameters


(parameters associated with the external forcing). We fix the normalized shear vector $\Sigma / \sigma$ and view the shear factor $\sigma$ as the first intrinsic parameter. The second intrinsic parameter quantifies the strength of the contraction near the limit cycle and is derived from $A$. We assume for the sake of simplicity that $A$ is a diagonal matrix given by $A = \operatorname{diag}(\lambda_1, \ldots, \lambda_{n-1})$, where $0 > \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{n-1}$ are the eigenvalues of $A$. We fix the eigenvalue ratios $\mu_i = \lambda_1 / \lambda_i$ for $1 \le i \le n - 1$ and we view the weakest eigenvalue $\lambda_1$ as an intrinsic parameter. The only external parameter is $\varepsilon$, the factor that controls the amplitude of the external forcing. We fix $\rho > 0$. A key parameter derived from $\varepsilon$, $\sigma$, and $\lambda_1$ is the hyperbolicity factor $\varepsilon\sigma / |\lambda_1|$.

One additional ingredient is needed. Even if $\sigma$ is large and $|\lambda_1|$ is small, a strange attractor cannot emerge unless the forcing $F$ acts in direction(s) in which shear is present. We express this idea by introducing a certain function on the circle $S := \mathbb{R} / 2L\mathbb{Z}$. We identify $S$ with the interval $[0, 2L)$. In Sect. 3 we derive a normal form of the forced system (2.2) that is valid in $\tilde{M}$ when the forcing is active ($P_{\rho,T} = 1$):

\[
\frac{dt}{ds} = \|f(\gamma(s))\|^{-1} + \langle \beta(s), z \rangle + \omega_3(s, z), \tag{2.5a}
\]
\[
\frac{dz}{ds} = Az + \varepsilon \zeta(s) + \omega_4(s, z). \tag{2.5b}
\]

Functions depending on $s$ in (2.5a)–(2.5b) are periodic in $s$ of period $2L$. The functions $\omega_3$ and $\omega_4$ are higher order corrections. The function $\zeta$ is related to the projection of $F$ in directions orthogonal to $\Gamma$. For $s_0 \in S$, define $\tilde{s}$ implicitly by

\[
\rho = \int_{s_0}^{\tilde{s}} \|f(\gamma(\tau))\|^{-1} \, d\tau.
\]

Define the vector

\[
d := \left( \frac{\Sigma_i \mu_i}{\sigma} \right)_{i=1}^{n-1}
\]

and define $\Phi : S \to \mathbb{R}$ by

\[
\Phi(s_0) = \left\langle d, \int_{s_0}^{\tilde{s}} \zeta(\tau) \, d\tau \right\rangle. \tag{2.6}
\]

We say that $\Phi$ is a Morse function if the critical set $C(\Phi) = \{s \in S : \Phi'(s) = 0\}$ is finite and if for every $s \in C(\Phi)$, we have $\Phi''(s) \ne 0$. We are now in position to state the main theorem. In Theorem 1, we assume that the radius of $M$ is $\kappa_0 \varepsilon$ for some constant $\kappa_0 > 0$.

Theorem 1. Let $G_T$ denote the time-$T$ map associated with (2.2). Suppose that the function $\Phi$ defined by (2.6) is a Morse function. Then there exist a small constant $\kappa_1 > 0$ and a large constant $\kappa_2 > \kappa_1$ such that the following holds. If

(1) $|\lambda_1| < \kappa_1$,
(2) $\varepsilon / |\lambda_1| < \kappa_1$,
(3) $\varepsilon\sigma / |\lambda_1| > \kappa_2$,


then there exist $T_1 > 0$ and a set $\Delta \subset [T_1, \infty)$ of positive Lebesgue measure such that for $T \in \Delta$, $G_T$ admits a strange attractor $\Omega$ in $M$ and satisfies (SA1)–(SA4) from Sect. 4. For every interval $I \subset [T_1, \infty)$ of length 1, $\operatorname{Leb}(\Delta \cap I) > 0$, where $\operatorname{Leb}$ denotes the Lebesgue measure on $\mathbb{R}$.

Remark 2.1. The assumption that $\Phi$ is a Morse function is quite mild and should hold for a typical forcing vector field $F$. We do not formulate precise results of this type in this paper, but such results should hold in terms of both topological genericity and prevalence. Prevalence is a measure-theoretic notion of genericity that generalizes the concept of 'Lebesgue almost every' to infinite-dimensional spaces. It provides a powerful framework for describing generic phenomena in a probabilistic way (see e.g. [6,7,10]).

Remark 2.2. Theorem 1 concludes that $G_T$ exhibits sustained, observable chaos for a set of values of $T$ of positive Lebesgue measure rather than for all $T \in [T_1, \infty)$. This is not a consequence of the nature of the proof. Rather, it is a fundamental consequence of the fact that an alternate scenario competes with the SRB scenario in the space of $T$-values. For an open set $\mathcal{S}$ of $T$-values in $[T_1, \infty)$, the basin $U$ contains a $G_T$-invariant Cantor set on which $G_T$ is uniformly hyperbolic (a horseshoe) and a periodic sink. The trajectory of Lebesgue almost every $x \in U$ converges to the periodic sink. Thus for $T \in \mathcal{S}$, $G_T$ exhibits transient chaos: a typical trajectory in the basin will move erratically for some time due to the presence of the horseshoe before finally converging to the periodic sink.

Remark 2.3. The function $\Phi$ does not depend on the parameters $\lambda_1$, $\sigma$, and $\varepsilon$.

Theorem 1 is related to two results obtained by Wang and Young in [17]. Wang and Young consider limit cycles forced by periodic $\delta$-function kicks. First, they prove that any limit cycle, when suitably kicked, can be transformed into a strange attractor. This result is universal but not constructive. An artificially-strong kick is needed if geometric conditions are unfavorable for the creation of nonuniform hyperbolicity. Second, they prove that the Hopf limit cycle that emerges from a supercritical Hopf bifurcation can be transformed into a strange attractor. Here the so-called twist factor plays the role of the shear integral. Unlike the shear integral, the twist factor is local in the sense that it depends only on derivatives of the vector field at the bifurcation parameter.

Many of the quantities in Theorem 1 are required to be sufficiently large or sufficiently small. This is an unavoidable consequence of the perturbative nature of the analytic techniques used in the proof. However, numerical evidence suggests that shear-induced chaos emerges over parameter ranges that far exceed those to which the rigorous analysis applies. For example, Lin and Young [9] conduct numerical studies of a linear shear flow model previously studied by Zaslavsky [21]. The work of Lin and Young also provides numerical evidence that the temporal form of the kicks need not be periodic: temporally-sustained chaotic behavior is observed for random kicks at Poisson-distributed times and for continuous-time forcing by white noise.

3. Derivation of the Singular Limit

3.1. Derivation of the normal forms. We derive the normal forms (2.4a)–(2.4b) and (2.5a)–(2.5b) that are valid in a small neighborhood of $\Gamma$. For $s \in S$, let $\{e_i(s)\}_{i=1}^{n}$ be an orthonormal basis for $\mathbb{R}^n$ such that $e_n(s) = \gamma'(s)$ (where $\gamma'$ denotes the derivative of $\gamma$ with respect to $s$) and $e_i$ is a $C^5$ function of $s$ for all $1 \le i \le n$. One may choose the first $n - 1$ vectors in many ways. For example, if $\gamma$ is at least $C^{n+5}$ and the first $n$


derivatives of $\gamma$ are linearly independent, then one may construct the basis by applying the Gram-Schmidt procedure to the first $n$ derivatives of $\gamma$. For any $x \in \mathbb{R}^n$ sufficiently close to $\Gamma$, there exist unique $s \in S$ and $y = (y_1, \ldots, y_{n-1})$ such that

\[
x = \gamma(s) + \sum_{i=1}^{n-1} y_i e_i(s). \tag{3.1}
\]

We use $(s, y)$ as new phase variables. Define

\[
E(s) =
\begin{pmatrix}
(e_1(s))^T \\
(e_2(s))^T \\
\vdots \\
(e_n(s))^T
\end{pmatrix}.
\]

Differentiating $E(s)$ with respect to $s$, we have $E'(s) = K(s) E(s)$, where $K(s) = (k_{j,i}(s))$ is a skew-symmetric matrix of generalized curvatures defined by $k_{j,i}(s) = \langle e_j'(s), e_i(s) \rangle$. If the first $n$ derivatives of $\gamma$ are used to create $E$, then this differential equation is the classical Frenet-Serret equation from differential geometry. For $1 \le i \le n$, define the vector

\[
k_i(s) =
\begin{pmatrix}
k_{1,i}(s) \\
k_{2,i}(s) \\
\vdots \\
k_{n-1,i}(s)
\end{pmatrix}.
\]

Differentiating (3.1) with respect to $t$, we obtain

\[
\frac{dx}{dt} = \sum_{i=1}^{n-1} \frac{dy_i}{dt} e_i(s) + \frac{ds}{dt} \left( \gamma'(s) + \sum_{j=1}^{n-1} y_j e_j'(s) \right) = f(x) + \varepsilon P_{\rho,T}(t) F(x). \tag{3.2}
\]

Taking the inner product of (3.2) with respect to $e_i(s)$ for $1 \le i \le n - 1$ yields

\[
\frac{dy_i}{dt} = \langle f(x), e_i(s) \rangle + \varepsilon P_{\rho,T}(t) \langle F(x), e_i(s) \rangle - \frac{ds}{dt} \langle y, k_i(s) \rangle.
\]

Taking the inner product of (3.2) with respect to $e_n(s)$ yields

\[
\frac{ds}{dt} \left( \langle y, k_n(s) \rangle + 1 \right) = \langle f(x), e_n(s) \rangle + \varepsilon P_{\rho,T}(t) \langle F(x), e_n(s) \rangle.
\]

Notice that $\langle y, k_n(s) \rangle + 1 \ne 0$ if $y$ is sufficiently small. Consequently, the system

\[
\frac{ds}{dt} = \frac{1}{1 + \langle y, k_n(s) \rangle} \left( \langle f(x), e_n(s) \rangle + \varepsilon P_{\rho,T}(t) \langle F(x), e_n(s) \rangle \right), \tag{3.3a}
\]
\[
\frac{dy_i}{dt} = \langle f(x), e_i(s) \rangle + \varepsilon P_{\rho,T}(t) \langle F(x), e_i(s) \rangle - \frac{ds}{dt} \langle y, k_i(s) \rangle \tag{3.3b}
\]

is valid in a small neighborhood of $\Gamma$.
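As a sanity check on the change of variables (3.1) and the frame equations (3.3a)–(3.3b), the following sketch works out the planar case $n = 2$ for the unit circle, parametrized by arc length. The vector field `toy_field` and the parameter names `lam` and `shear` are illustrative assumptions and do not come from the paper. For this field the frame equations should reduce to $ds/dt = 1 + \text{shear} \cdot y$ and $dy/dt = -\text{lam} \cdot y$.

```python
import math

def to_frame(x1, x2):
    """Invert (3.1) for the unit circle: return (s, y) with x = gamma(s) + y*e_1(s)."""
    s = math.atan2(x2, x1) % (2.0 * math.pi)
    y = math.hypot(x1, x2) - 1.0
    return s, y

def from_frame(s, y):
    """Evaluate (3.1): x = gamma(s) + y*e_1(s), with e_1 the outward unit normal."""
    return (1.0 + y) * math.cos(s), (1.0 + y) * math.sin(s)

def toy_field(x1, x2, lam, shear):
    """Hypothetical planar field: r' = -lam*(r - 1), theta' = 1 + shear*(r - 1)."""
    r = math.hypot(x1, x2)
    r_dot = -lam * (r - 1.0)
    theta_dot = 1.0 + shear * (r - 1.0)
    # Cartesian components of r_dot * e_r + r * theta_dot * e_theta.
    return (r_dot * x1 / r - theta_dot * x2,
            r_dot * x2 / r + theta_dot * x1)

def frame_rhs(s, y, lam, shear):
    """Right sides of (3.3a)-(3.3b) with eps = 0 for the toy field."""
    f1, f2 = toy_field(*from_frame(s, y), lam, shear)
    e1 = (math.cos(s), math.sin(s))    # normal direction e_1(s)
    en = (-math.sin(s), math.cos(s))   # tangent direction e_n(s) = gamma'(s)
    k_n = 1.0  # k_{1,n} = <e_1', e_n> for the unit circle
    k_1 = 0.0  # k_{1,1} = 0 because K is skew-symmetric
    ds_dt = (f1 * en[0] + f2 * en[1]) / (1.0 + y * k_n)
    dy_dt = (f1 * e1[0] + f2 * e1[1]) - ds_dt * y * k_1
    return ds_dt, dy_dt
```

In this toy case the only nonzero curvature is $k_{1,n} = 1$, so the factor $1 + \langle y, k_n(s) \rangle = 1 + y$ is nonzero for $|y| < 1$, which is the smallness condition noted above.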


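We now extract the terms of leading order in (3.3a) and (3.3b). For $1 \le j \le n$, define $\psi_j(s, y) = \langle f(x), e_j(s) \rangle$. For $1 \le i \le n - 1$, we have

\[
\psi_i(s, y) = \langle \psi_i^{(1)}(s), y \rangle + O_{s,y}(\|y\|^2),
\]

where

\[
\psi_i^{(1)}(s) = \left. \frac{\partial \langle f(x), e_i(s) \rangle}{\partial y} \right|_{y=0}.
\]

Here $O_{s,y}(\|y\|^2)$ denotes a function of $s$ and $y$ for which there exists a constant $K > 0$ independent of $s$ and $y$ such that $|O_{s,y}(\|y\|^2)| \le K \|y\|^2$. Expanding $\psi_n(s, y)$, we have

\[
\psi_n(s, y) = \psi_n^{(0)}(s) + \langle \psi_n^{(1)}(s), y \rangle + O_{s,y}(\|y\|^2),
\]

where

\[
\psi_n^{(0)}(s) = \|f(\gamma(s))\|, \qquad
\psi_n^{(1)}(s) = \left. \frac{\partial \langle f(x), e_n(s) \rangle}{\partial y} \right|_{y=0}.
\]

Set $\phi_j(s, y) = \langle F(x), e_j(s) \rangle$ for $1 \le j \le n$. Writing (3.3a) and (3.3b) in terms of $\psi_j$ and $\phi_j$, when the forcing is active ($P_{\rho,T}(t) = 1$) we obtain

\[
\begin{cases}
\dfrac{dt}{ds} = \dfrac{1}{\psi_n^{(0)}(s) + \varepsilon \phi_n(s, y)} \left[ 1 + \left( k_n(s) - \dfrac{\psi_n^{(1)}(s)}{\psi_n^{(0)}(s) + \varepsilon \phi_n(s, y)} \right) \cdot y + O_{s,y}(\|y\|^2) \right], \\[3ex]
\dfrac{dy_i}{ds} = \dfrac{\varepsilon \phi_i(s, y)}{\psi_n^{(0)}(s) + \varepsilon \phi_n(s, y)} + \left( \dfrac{\psi_i^{(1)}(s)}{\psi_n^{(0)}(s) + \varepsilon \phi_n(s, y)} - k_i(s) \right) \cdot y \\[3ex]
\qquad\quad + \dfrac{\varepsilon \phi_i(s, y)}{\psi_n^{(0)}(s) + \varepsilon \phi_n(s, y)} \left( k_n(s) - \dfrac{\psi_n^{(1)}(s)}{\psi_n^{(0)}(s) + \varepsilon \phi_n(s, y)} \right) \cdot y + O_{s,y}(\|y\|^2).
\end{cases}
\tag{3.4}
\]

When the forcing is off ($P_{\rho,T}(t) = 0$), we have

\[
\begin{cases}
\dfrac{dt}{ds} = \dfrac{1}{\psi_n^{(0)}(s)} \left[ 1 + \left( k_n(s) - \dfrac{\psi_n^{(1)}(s)}{\psi_n^{(0)}(s)} \right) \cdot y + O_{s,y}(\|y\|^2) \right], \\[3ex]
\dfrac{dy_i}{ds} = \left( \dfrac{\psi_i^{(1)}(s)}{\psi_n^{(0)}(s)} - k_i(s) \right) \cdot y + O_{s,y}(\|y\|^2).
\end{cases}
\tag{3.5}
\]

Define

\[
b_0(s) := \frac{1}{\psi_n^{(0)}(s)}, \qquad
b_1(s) := \frac{1}{\psi_n^{(0)}(s)} \left( k_n(s) - \frac{\psi_n^{(1)}(s)}{\psi_n^{(0)}(s)} \right),
\]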


and let $\tilde{A}(s)$ denote the $(n - 1) \times (n - 1)$ matrix with $i$th row given by

\[
\left( \frac{\psi_i^{(1)}(s)}{\psi_n^{(0)}(s)} - k_i(s) \right)^T.
\]

In terms of $b_0$, $b_1$, and $\tilde{A}$, system (3.5) becomes

\[
\begin{cases}
\dfrac{dt}{ds} = b_0(s) + \langle b_1(s), y \rangle + O_{s,y}(\|y\|^2), \\[2ex]
\dfrac{dy}{ds} = \tilde{A}(s) y + O_{s,y}(\|y\|^2).
\end{cases}
\tag{3.6}
\]

Applying the Floquet theorem, there exists a real-valued, periodic $(n - 1) \times (n - 1)$ matrix $P(s)$ of period $2L$ such that setting $z = P^{-1}(s) y$, we transform (3.6) into

\[
\frac{dt}{ds} = b_0(s) + \left( (b_1(s))^T P(s) \right) z + h_2(s, z), \tag{3.7a}
\]
\[
\frac{dz}{ds} = Az + h_1(s, z). \tag{3.7b}
\]

This is the normal form of (2.2) on which we will base our analysis of the flow during the relaxation period (when $P_{\rho,T}(t) = 0$). We obtain the normal form of (2.2) during the forcing period (when $P_{\rho,T}(t) = 1$) by writing (3.4) in $(s, z)$-coordinates, giving

\[
\frac{dt}{ds} = b_0(s) + \left( (b_1(s))^T P(s) \right) z + O_{s,z}(\varepsilon) + O_{s,z}(\varepsilon \|z\|) + O_{s,z}(\|z\|^2), \tag{3.8a}
\]
\[
\frac{dz}{ds} = Az + \frac{\varepsilon P^{-1}(s) \phi(s, 0)}{\psi_n^{(0)}(s)} + O_{s,z}(\varepsilon \|z\|) + O_{s,z}(\varepsilon^2) + O_{s,z}(\|z\|^2), \tag{3.8b}
\]

where $\phi(s, 0) = (\phi_1(s, 0), \ldots, \phi_{n-1}(s, 0))^T$.

3.2. A general form of the singular limit. Let $\tilde{M} \approx \Gamma \times D$ be a tubular neighborhood of $\Gamma$ in $\mathbb{R}^n$, where $D$ is a disk of sufficiently small radius so that the normal form (3.8a)–(3.8b) is valid. Let $M \approx \Gamma \times \frac{1}{2} D$. We define flow-induced maps $H_k : M \to \tilde{M}$ and $H_r : \tilde{M} \to \tilde{M}$ as follows. Let $H_k$ be the time-$\rho$ map associated with the forced system (3.8a)–(3.8b). We call $H_k$ the 'kick'. Notice that for $\varepsilon$ sufficiently small, $H_k$ maps $M$ into $\tilde{M}$. Let $H_r$ be the time-$(T - \rho)$ map associated with the relaxation system (3.7a)–(3.7b). We call $H_r$ the relaxation map. There exists $T_0 = T_0(\varepsilon)$ such that if $T \ge T_0$, then $H_r$ maps $\tilde{M}$ into $\operatorname{int}(M)$. The composition $G_T := H_r \circ H_k$ is the time-$T$ map generated by the flow. Our goal is to show that the family $\{G_T : M \to \operatorname{int}(M),\ T \ge T_0\}$ of diffeomorphisms on $M$ has a well-defined singular limit in a certain sense as $T \to \infty$.

Let $(s_0, y_0) \in M$. We write $H_k(s_0, y_0) = (\hat{s}, \hat{z})$ and compute $H_r(\hat{s}, \hat{z})$. Integrating (3.7b), we have

\[
z(s) = e^{(s - \hat{s})A} \left( \hat{z} + \int_{\hat{s}}^{s} e^{-(\tau - \hat{s})A} h_1(\tau, z(\tau)) \, d\tau \right).
\]


Integrating (3.7a), we have

\[
T - \rho = \int_{\hat{s}}^{s(T)} b_0(\tau) \, d\tau + \hat{z} \cdot \int_{\hat{s}}^{s(T)} b_1(\tau)^T P(\tau) e^{(\tau - \hat{s})A} \, d\tau + \sum_{k=1}^{2} E_k(s(T)), \tag{3.9}
\]

where the error terms are given by

\[
E_1(s(T)) = \int_{\hat{s}}^{s(T)} b_1(\tau)^T P(\tau) e^{(\tau - \hat{s})A} \int_{\hat{s}}^{\tau} e^{-(\xi - \hat{s})A} h_1(\xi, z(\xi)) \, d\xi \, d\tau,
\]
\[
E_2(s(T)) = \int_{\hat{s}}^{s(T)} h_2(\tau, z(\tau)) \, d\tau.
\]

Letting $T \to \infty$ in (3.9) yields nothing meaningful. However, we use the fact that $s$ can be computed modulo $2L$ to introduce an auxiliary parameter $a \in S$ and thereby obtain the singular limit. Recall that $p_0$ is the period of $\eta$. As $a$ varies from 0 to $2L$, $\gamma$ traverses $\Gamma$ two times. Let $\hat{t} : [0, 2L) \to [0, 2p_0)$ be the strictly increasing function defined by $\eta(\hat{t}(a)) = \gamma(a)$. For $m \in \mathbb{Z}^+$ and $a \in S$, set $T = \rho + 2p_0 m + \hat{t}(a)$. Substituting into (3.9), writing

\[
s(\rho + 2p_0 m + \hat{t}(a)) = \hat{s} + 2Lm + \tilde{s}(\rho + 2p_0 m + \hat{t}(a)),
\]

and using the fact that

\[
\int_{v}^{v + 2Lm} b_0(\tau) \, d\tau = 2p_0 m
\]

for all $v \in \mathbb{R}$, we obtain

\[
\hat{t}(a) = \int_{\hat{s}}^{\hat{s} + \tilde{s}(\rho + 2p_0 m + \hat{t}(a))} b_0(\tau) \, d\tau + \hat{z} \cdot \int_{\hat{s}}^{s(\rho + 2p_0 m + \hat{t}(a))} b_1(\tau)^T P(\tau) e^{(\tau - \hat{s})A} \, d\tau + \sum_{k=1}^{2} E_k(s(\rho + 2p_0 m + \hat{t}(a))). \tag{3.10}
\]

Define $G_{a,m^{-1}} : M \to \operatorname{int}(M)$ by

\[
G_{a,m^{-1}}(s_0, y_0) = \left( s(\rho + 2p_0 m + \hat{t}(a)),\ y(\rho + 2p_0 m + \hat{t}(a)) \right).
\]

It follows from [17, Prop. 3.1] that there exists $s_\infty(s_0, y_0, a)$ such that

\[
\lim_{m \to \infty} \left( \hat{s} + \tilde{s}(\rho + 2p_0 m + \hat{t}(a)) \right) = s_\infty(s_0, y_0, a),
\]

and $s_\infty(s_0, y_0, a)$ is defined implicitly by taking the $m \to \infty$ limit in (3.10):

\[
\hat{t}(a) = \int_{\hat{s}}^{s_\infty(s_0, y_0, a)} b_0(\tau) \, d\tau + \left\langle \hat{z}, \int_{\hat{s}}^{\infty} b_1(\tau)^T P(\tau) e^{(\tau - \hat{s})A} \, d\tau \right\rangle + \sum_{k=1}^{2} E_k(\infty). \tag{3.11}
\]

The family of maps $\{G_{a,0} : M \to \Gamma \times \{0\}\}_{a \in S}$ defined by $G_{a,0}(s_0, y_0) = (s_\infty(s_0, y_0, a), 0)$ is the desired singular limit. It follows from [17, Prop. 3.1] that the maps $(s_0, y_0, a) \mapsto G_{a,m^{-1}}(s_0, y_0)$ converge to the map $(s_0, y_0, a) \mapsto G_{a,0}(s_0, y_0)$ in $C^3(M \times S)$ as $m \to \infty$.
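To make the construction of $G_T = H_r \circ H_k$ and the passage to the singular limit concrete, here is a minimal sketch for a toy linear shear model on the cylinder, $\theta' = 1 + \text{shear} \cdot y$, $y' = -\text{lam} \cdot y + \varepsilon P_{\rho,T}(t) \sin(2\pi\theta)$. The model, the sinusoidal forcing, and every function and parameter name below are assumptions chosen for illustration; they are not the general system analyzed in this section. The limit cycle is $y = 0$ with period 1, so $T = \rho + m + a$ plays the role of $T = \rho + 2p_0 m + \hat{t}(a)$.

```python
import math

def kick_map(theta, y, rho, eps, shear, lam, steps=400):
    """H_k: time-rho flow with the forcing switched on, integrated by RK4."""
    def rhs(th, yy):
        return 1.0 + shear * yy, -lam * yy + eps * math.sin(2.0 * math.pi * th)
    h = rho / steps
    for _ in range(steps):
        a1, b1 = rhs(theta, y)
        a2, b2 = rhs(theta + 0.5 * h * a1, y + 0.5 * h * b1)
        a3, b3 = rhs(theta + 0.5 * h * a2, y + 0.5 * h * b2)
        a4, b4 = rhs(theta + h * a3, y + h * b3)
        theta += h * (a1 + 2.0 * a2 + 2.0 * a3 + a4) / 6.0
        y += h * (b1 + 2.0 * b2 + 2.0 * b3 + b4) / 6.0
    return theta, y

def relax_map(theta, y, t, shear, lam):
    """H_r: exact time-t flow of theta' = 1 + shear*y, y' = -lam*y."""
    decay = math.exp(-lam * t)
    return theta + t + shear * y * (1.0 - decay) / lam, y * decay

def time_T_map(theta, y, T, rho, eps, shear, lam):
    """G_T = H_r o H_k on the cylinder (theta mod 1, y)."""
    th, yy = kick_map(theta, y, rho, eps, shear, lam)
    th, yy = relax_map(th, yy, T - rho, shear, lam)
    return th % 1.0, yy

def singular_limit(theta, y, a, rho, eps, shear, lam):
    """Formal limit of G_T along T = rho + m + a as the integer m grows."""
    th, yy = kick_map(theta, y, rho, eps, shear, lam)
    return (th + a + shear * yy / lam) % 1.0, 0.0

def circle_dist(u, v):
    """Distance between two points of the circle R/Z."""
    d = abs(u - v) % 1.0
    return min(d, 1.0 - d)
```

In this toy model the discrepancy between `time_T_map` and `singular_limit` is proportional to `exp(-lam * (m + a))`, and the post-kick height enters the limiting angle through the factor `shear / lam`, the analogue of the ratio $\sigma / |\lambda_1|$.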


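3.3. A computable form of the singular limit. From this point forward, we assume the setting of Theorem 1. We now extract the primary terms in the right side of (3.11). Recall that the shear integral is defined by

\[
\Sigma = (\Sigma_1, \ldots, \Sigma_{n-1}) = \int_0^{2L} b_1(\tau)^T P(\tau) \, d\tau,
\]

and that the shear factor is given by $\sigma = \|\Sigma\|$. We assume that the operator $A$ is diagonalizable and that the $z$-coordinate has been chosen such that $A = \operatorname{diag}(\lambda_1, \ldots, \lambda_{n-1})$, where $0 > \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{n-1}$ are the eigenvalues of $A$. Fix the normalized shear vector $\Sigma / \sigma$ and the eigenvalue ratios $\mu_i = \lambda_1 / \lambda_i$ for $1 \le i \le n - 1$. Set $\rho = 1$ for notational simplicity. We regard $\sigma$, $\varepsilon$, and $\lambda_1$ as the parameters associated with the singular limit. Expanding the second term on the right side of (3.11), we have

\[
\begin{aligned}
\int_{\hat{s}}^{\infty} b_1(\tau)^T P(\tau) e^{(\tau - \hat{s})A} \, d\tau
&= \int_{\hat{s}}^{\infty} \Sigma \, e^{(\tau - \hat{s})A} \, d\tau + \int_{\hat{s}}^{\infty} \left( b_1(\tau)^T P(\tau) - \Sigma \right) e^{(\tau - \hat{s})A} \, d\tau \\
&= \bar{d} + \int_{\hat{s}}^{\infty} \left( b_1(\tau)^T P(\tau) - \Sigma \right) e^{(\tau - \hat{s})A} \, d\tau,
\end{aligned}
\tag{3.12}
\]

where

\[
\bar{d} = \int_{\hat{s}}^{\infty} \Sigma \, e^{(\tau - \hat{s})A} \, d\tau = -\left( \frac{\Sigma_i}{\lambda_i} \right)_{i=1}^{n-1}.
\]

Let $\tilde{H}_k : M \to \tilde{M}$ be the time-1 map generated by the system

\[
\frac{dt}{ds} = b_0(s), \tag{3.13a}
\]
\[
\frac{dz}{ds} = \frac{\varepsilon P^{-1}(s) \phi(s, 0)}{\psi_n^{(0)}(s)}, \tag{3.13b}
\]

obtained from (3.8a)–(3.8b) by retaining only the terms of leading order. For $(s_0, y_0) \in M$, write $\tilde{H}_k(s_0, y_0) = (\tilde{s}, \tilde{z})$. Integrating (3.13a) and (3.13b) gives

\[
1 = \int_{s_0}^{\tilde{s}} b_0(\tau) \, d\tau, \qquad
\tilde{z} = z_0 + \varepsilon \int_{s_0}^{\tilde{s}} \frac{P^{-1}(\tau) \phi(\tau, 0)}{\psi_n^{(0)}(\tau)} \, d\tau. \tag{3.14}
\]

Proposition 3.1. There exists a system constant $K_0 > 0$ such that

\[
\hat{s} = \tilde{s} + \xi_1(s_0, y_0), \qquad \hat{z} = \tilde{z} + \xi_2(s_0, y_0), \tag{3.15}
\]

where

\[
\left\| \xi_1 |_{\{y_0 = 0\}} \right\|_{C^3(S)} \le K_0 \varepsilon, \qquad
\left\| \xi_2 |_{\{y_0 = 0\}} \right\|_{C^3(S)} \le K_0 \varepsilon |\lambda_1|.
\]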


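Setting $y_0 = 0$, define $g(s_0, a) = s_\infty(s_0, 0, a)$. Substituting (3.12), (3.14), and (3.15) into (3.11), the value $g(s_0, a)$ is defined implicitly by

\[
\begin{aligned}
\hat{t}(a) + 1 = {} & \int_{s_0}^{g(s_0, a)} b_0(\tau) \, d\tau - \int_{\tilde{s}}^{\hat{s}} b_0(\tau) \, d\tau + \left\langle \tilde{z} + \xi_2(s_0, 0), \bar{d} \right\rangle \\
& + \hat{z} \cdot \int_{\hat{s}}^{\infty} \left( b_1(\tau)^T P(\tau) - \Sigma \right) e^{(\tau - \hat{s})A} \, d\tau + \sum_{k=1}^{2} E_k(\infty).
\end{aligned}
\tag{3.16}
\]

Rescaling $\bar{d}$, we define

\[
d = \left( \frac{\Sigma_i \mu_i}{\sigma} \right)_{i=1}^{n-1}, \qquad
\Phi(s_0) = \left\langle d, \int_{s_0}^{\tilde{s}} \frac{P^{-1}(\tau) \phi(\tau, 0)}{\psi_n^{(0)}(\tau)} \, d\tau \right\rangle,
\]

giving

\[
\langle \tilde{z}, \bar{d} \rangle = \frac{\varepsilon \sigma}{|\lambda_1|} \Phi(s_0).
\]

The higher-order terms are given by

\[
\begin{aligned}
\mathcal{E}_1 &= E_1(\infty), \qquad \mathcal{E}_2 = E_2(\infty), \\
\mathcal{E}_3 &= \left\langle \hat{z}, \int_{\hat{s}}^{\infty} \left( b_1(\tau)^T P(\tau) - \Sigma \right) e^{(\tau - \hat{s})A} \, d\tau \right\rangle, \\
\mathcal{E}_4 &= -\int_{\tilde{s}}^{\hat{s}} b_0(\tau) \, d\tau, \qquad \mathcal{E}_5 = \left\langle \xi_2(s_0, 0), \bar{d} \right\rangle.
\end{aligned}
\]

Setting $\mathcal{E} = \sum_{k=1}^{5} \mathcal{E}_k$ and substituting into (3.16), we obtain the final form of the singular limit:

\[
\hat{t}(a) + 1 = \int_{s_0}^{g(s_0, a)} b_0(\tau) \, d\tau + \frac{\varepsilon \sigma}{|\lambda_1|} \Phi(s_0) + \mathcal{E}. \tag{3.17}
\]

Proposition 3.2. There exists a system constant $K_1 > 0$ such that the following hold:

\[
\begin{aligned}
\|\mathcal{E}_1\|_{C^3(S)} &\le K_1 \frac{\sigma \varepsilon}{|\lambda_1|} \cdot \frac{\varepsilon}{|\lambda_1|}, \\
\|\mathcal{E}_2\|_{C^3(S)} &\le K_1 \frac{\sigma \varepsilon}{|\lambda_1|} \cdot \frac{\varepsilon}{\sigma}, \\
\|\mathcal{E}_3\|_{C^3(S)} &\le K_1 \frac{\sigma \varepsilon}{|\lambda_1|} \, (|\lambda_1|), \\
\|\mathcal{E}_4\|_{C^3(S)} &\le K_1 \frac{\sigma \varepsilon}{|\lambda_1|} \cdot \frac{|\lambda_1|}{\sigma}, \\
\|\mathcal{E}_5\|_{C^3(S)} &\le K_1 \frac{\sigma \varepsilon}{|\lambda_1|} \, (|\lambda_1|).
\end{aligned}
\]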
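The role of the hyperbolicity factor in (3.17) can be explored numerically. The sketch below is a toy instance only: it assumes $b_0 \equiv 1$, drops the error term $\mathcal{E}$, rescales the circle to $\mathbb{R}/\mathbb{Z}$, and takes the hypothetical Morse function $\Phi(s) = \sin(2\pi s)$, so that (3.17) gives $g(s_0) = s_0 + c - \kappa \Phi(s_0) \pmod 1$ with $c = \hat{t}(a) + 1$ and $\kappa = \varepsilon\sigma / |\lambda_1|$. None of these choices come from the paper.

```python
import math

def g(s, c, kappa):
    """Toy singular-limit circle map on R/Z: s -> s + c - kappa * Phi(s)."""
    return (s + c - kappa * math.sin(2.0 * math.pi * s)) % 1.0

def g_prime(s, kappa):
    """Derivative of the lift of g with respect to s."""
    return 1.0 - 2.0 * math.pi * kappa * math.cos(2.0 * math.pi * s)

def count_critical_points(kappa, grid=10000):
    """Count sign changes of g' on a uniform grid of the circle."""
    signs = [g_prime(i / grid, kappa) > 0.0 for i in range(grid)]
    # Index i - 1 wraps around at i = 0, so the circle is closed up.
    return sum(signs[i] != signs[i - 1] for i in range(grid))

def lyapunov_estimate(s0, c, kappa, n=20000, burn_in=1000):
    """Average of log|g'| along one orbit (a finite-time estimate only)."""
    s = s0
    for _ in range(burn_in):
        s = g(s, c, kappa)
    total = 0.0
    for _ in range(n):
        # Guard against log(0) if the orbit lands on a critical point.
        total += math.log(max(abs(g_prime(s, kappa)), 1e-300))
        s = g(s, c, kappa)
    return total / n
```

For small `kappa` the toy map is a circle diffeomorphism with no critical points, while for `2 * pi * kappa > 1` it acquires two nondegenerate critical points and strong expansion away from them. Whether a given pair `(c, kappa)` then yields a positive estimate or an orbit captured by a periodic sink depends on `c`, in line with Remark 2.2, so the estimate should be read as exploratory rather than as evidence for any particular parameter value.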


4. Theory of Rank One Attractors

Let $D$ denote the closed unit disk in $\mathbb{R}^{n-1}$ and let $M = S^1 \times D$. We consider a family of maps $G_{a,b} : M \to M$, where $a = (a_1, \ldots, a_k) \in V$ is a vector of parameters and $b \in B_0$ is a scalar parameter. Here $V = V_1 \times \cdots \times V_k \subset \mathbb{R}^k$ is a product of intervals and $B_0 \subset \mathbb{R} \setminus \{0\}$ is a subset of $\mathbb{R}$ with an accumulation point at 0. Points in $M$ are denoted by $(x, y)$ with $x \in S^1$ and $y \in D$. Rank one theory postulates the following:

(H1) Regularity conditions.
(a) For each $b \in B_0$, the function $(x, y, a) \mapsto G_{a,b}(x, y)$ is $C^3$.
(b) Each map $G_{a,b}$ is an embedding of $M$ into itself.
(c) There exists $K_D > 0$ independent of $a$ and $b$ such that for all $a \in V$, $b \in B_0$, and $z, z' \in M$, we have
\[
\frac{|\det DG_{a,b}(z)|}{|\det DG_{a,b}(z')|} \le K_D.
\]

(H2) Existence of a singular limit. For $a \in V$, there exists a map $G_{a,0} : M \to S^1 \times \{0\}$ such that the following holds. For every $(x, y) \in M$ and $a \in V$, we have
\[
\lim_{b \to 0} G_{a,b}(x, y) = G_{a,0}(x, y).
\]
Identifying $S^1 \times \{0\}$ with $S^1$, we refer to $G_{a,0}$ and the restriction $f_a : S^1 \to S^1$ defined by $f_a(x) = G_{a,0}(x, 0)$ as the singular limit of $G_{a,b}$.

(H3) $C^3$ convergence to the singular limit. We select a special index $j \in \{1, \ldots, k\}$. Fix $a_i \in V_i$ for $i \ne j$. For every such choice of parameters $a_i$, the maps $(x, y, a_j) \mapsto G_{a,b}(x, y)$ converge in the $C^3$ topology to $(x, y, a_j) \mapsto G_{a,0}(x, y)$ on $M \times V_j$ as $b \to 0$.

(H4) Existence of a sufficiently expanding map within the singular limit. There exists $a^* = (a_1^*, \ldots, a_k^*) \in V$ such that $f_{a^*} \in \mathcal{M}$, where $\mathcal{M}$ is the set of Misiurewicz-type maps defined in Definition 4.1 below.

(H5) Parameter transversality. Let $C_{a^*}$ denote the critical set of $f_{a^*}$. For $a_j \in V_j$, define the vector $\tilde{a}_j \in V$ by $\tilde{a}_j = (a_1^*, \ldots, a_{j-1}^*, a_j, a_{j+1}^*, \ldots, a_k^*)$. We say that the family $\{f_a\}$ satisfies the parameter transversality condition with respect to parameter $a_j$ if the following holds. For each $x \in C_{a^*}$, let $p = f_{a^*}(x)$ and let $x(\tilde{a}_j)$ and $p(\tilde{a}_j)$ denote the continuations of $x$ and $p$, respectively, as the parameter $a_j$ varies around $a_j^*$. The point $p(\tilde{a}_j)$ is the unique point such that $p(\tilde{a}_j)$ and $p$ have identical symbolic itineraries under $f_{\tilde{a}_j}$ and $f_{a^*}$, respectively. We have
\[
\left. \frac{d}{da_j} f_{\tilde{a}_j}(x(\tilde{a}_j)) \right|_{a_j = a_j^*} \ne \left. \frac{d}{da_j} p(\tilde{a}_j) \right|_{a_j = a_j^*}.
\]

(H6) Nondegeneracy at 'turns'. For each $x \in C_{a^*}$, there exists $1 \le m \le n - 1$ such that
\[
\left. \frac{\partial}{\partial y_m} G_{a^*,0}(x, y) \right|_{y=0} \ne 0.
\]


(H7) Conditions for mixing.
(a) We have $e^{\frac{1}{3}\lambda_0} > 2$, where $\lambda_0$ is defined within Definition 4.1.
(b) Let $J_1, \dots, J_r$ be the intervals of monotonicity of $f_{a^*}$. Let $Q = (q_{im})$ be the matrix of 'allowed transitions' defined by
$$q_{im} = \begin{cases} 1, & \text{if } f_{a^*}(J_i) \supset J_m, \\ 0, & \text{otherwise.} \end{cases}$$
There exists $N > 0$ such that $Q^N > 0$.

We now define the family $\mathcal{M}$.

Definition 4.1. We say that $f \in C^2(S^1, \mathbb{R})$ is a Misiurewicz map and we write $f \in \mathcal{M}$ if the following hold for some neighborhood $U$ of the critical set $C = C(f) = \{x \in S^1 : f'(x) = 0\}$:
(A) Outside of $U$. There exist $\lambda_0 > 0$, $M_0 \in \mathbb{Z}^+$, and $0 < d_0 \le 1$ such that
(1) for all $m \ge M_0$, if $f^i(x) \notin U$ for $0 \le i \le m - 1$, then $|(f^m)'(x)| \ge e^{\lambda_0 m}$,
(2) for any $m \in \mathbb{Z}^+$, if $f^i(x) \notin U$ for $0 \le i \le m - 1$ and $f^m(x) \in U$, then $|(f^m)'(x)| \ge d_0 e^{\lambda_0 m}$.
(B) Critical orbits. For all $c \in C$ and $i > 0$, $f^i(c) \notin U$.
(C) Inside $U$.
(1) We have $f''(x) \neq 0$ for all $x \in U$, and
(2) for all $x \in U \setminus C$, there exists $p_0(x) > 0$ such that $f^i(x) \notin U$ for all $i < p_0(x)$ and $|(f^{p_0(x)})'(x)| \ge d_0^{-1} e^{\frac{1}{3}\lambda_0 p_0(x)}$.

Rank one theory states that given a family $\{G_{a,b}\}$ satisfying (H1)–(H6), a measure-theoretically significant subset of this family consists of maps admitting attractors with strong chaotic and stochastic properties. We formulate the precise results and we then describe the properties that the attractors possess.

Theorem 4.2 ([15,18]). Suppose the family $\{G_{a,b}\}$ satisfies (H1), (H2), (H4), and (H6). The following holds for all $1 \le j \le k$ such that the parameter $a_j$ satisfies (H3) and (H5). For all sufficiently small $b \in B_0$, there exists a subset $\Delta_j \subset V_j$ of positive Lebesgue measure such that for $a_j \in \Delta_j$, $G_{\tilde{a}_j, b}$ admits a strange attractor with properties (SA1), (SA2), and (SA3).

Theorem 4.3 ([15,16,18]). In the sense of Theorem 4.2, (H1)–(H7) $\Rightarrow$ (SA1)–(SA4).

Remark 4.4. The proof of Theorem 4.2 for the special case $n = 2$ appears in [15]. The additional component (H7) $\Rightarrow$ (SA4) in Theorem 4.3 is proved in [16]. For general $n$, Wang and Young [18] prove the existence of an SRB measure for $G_{\tilde{a}_j, b}$ if $a_j \in \Delta_j$. The complete proofs of (SA1)–(SA3) (and (SA4) assuming (H7)) for $G_{\tilde{a}_j, b}$ with $a_j \in \Delta_j$ will appear in [14] for general $n$.

We now describe (SA1)–(SA4) precisely. Write $G = G_{\tilde{a}_j, b}$.

(SA1) Positive Lyapunov exponent. Let $U$ denote the basin of attraction of the attractor $\Omega$. This means that $U$ is an open set satisfying $G(U) \subset U$ and
$$\Omega = \bigcap_{m=0}^{\infty} G^m(U).$$
For almost every $z \in U$ with respect to Lebesgue measure, the orbit of $z$ has a positive Lyapunov exponent. That is,
$$\lim_{m \to \infty} \frac{1}{m} \log \|DG^m(z)\| > 0.$$
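The limit in (SA1) can be estimated numerically by pushing a tangent vector along an orbit and averaging the logarithm of its growth. The following is a minimal sketch only: the Hénon map with the classical parameters $a = 1.4$, $b = 0.3$ is used as an illustrative stand-in (it is not one of the maps $G_{a,b}$ constructed in this paper), and the function names `henon` and `largest_lyapunov` are hypothetical.

```python
import math

def henon(x, y, a=1.4, b=0.3):
    """One step of the Henon map (illustrative stand-in, not a map from the text)."""
    return 1.0 - a * x * x + y, b * x

def largest_lyapunov(x, y, steps=20000, transient=1000, a=1.4, b=0.3):
    """Estimate lim (1/m) log ||DG^m(z) v|| for a generic tangent vector v."""
    # Discard a transient so that the orbit is close to the attractor.
    for _ in range(transient):
        x, y = henon(x, y, a, b)
    vx, vy = 1.0, 0.0
    log_growth = 0.0
    for _ in range(steps):
        # Tangent map at the current point: DG = [[-2 a x, 1], [b, 0]].
        vx, vy = -2.0 * a * x * vx + vy, b * vx
        norm = math.hypot(vx, vy)
        log_growth += math.log(norm)
        # Renormalize to avoid overflow; only the accumulated logs matter.
        vx, vy = vx / norm, vy / norm
        x, y = henon(x, y, a, b)
    return log_growth / steps

exponent = largest_lyapunov(0.0, 0.0)
print("estimated largest Lyapunov exponent:", exponent)
```

A positive value of `exponent` is numerical evidence of the kind of exponential separation that (SA1) asserts; it is of course not a proof, and the finite-time average depends mildly on `steps`.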

(SA2) Existence of SRB measures and basin property.
(a) The map $G$ admits at least one and at most finitely many ergodic SRB measures, each one of which has no zero Lyapunov exponents. Let $\nu_1, \dots, \nu_r$ denote these measures.
(b) For Lebesgue-a.e. $z \in U$, there exists $j(z) \in \{1, \dots, r\}$ such that for every continuous function $\varphi : U \to \mathbb{R}$,
$$\frac{1}{m} \sum_{i=0}^{m-1} \varphi(G^i(x, y)) \to \int \varphi \, d\nu_{j(z)}.$$

(SA3) Statistical properties of dynamical observations.
(a) For every ergodic SRB measure $\nu$ and every Hölder continuous function $\varphi : \Omega \to \mathbb{R}$, the sequence $\{\varphi \circ G^i : i \in \mathbb{Z}^+\}$ obeys a central limit theorem. That is, if $\int \varphi \, d\nu = 0$, then the sequence
$$\frac{1}{\sqrt{m}} \sum_{i=0}^{m-1} \varphi \circ G^i$$
converges in distribution (with respect to $\nu$) to the normal distribution. The variance of the limiting normal distribution is strictly positive unless $\varphi = \psi \circ G - \psi$ for some $\psi \in L^2(\nu)$.
(b) Suppose that for some $N \ge 1$, $G^N$ has an SRB measure $\nu$ that is mixing. Then given a Hölder exponent $\eta$, there exists $\tau = \tau(\eta) < 1$ such that for all Hölder $\varphi, \psi : \Omega \to \mathbb{R}$ with Hölder exponent $\eta$, there exists $K = K(\varphi, \psi)$ such that for all $m \in \mathbb{N}$,
$$\left| \int (\varphi \circ G^{mN}) \psi \, d\nu - \int \varphi \, d\nu \int \psi \, d\nu \right| \le K(\varphi, \psi) \tau^m.$$

(SA4) Uniqueness of SRB measures and ergodic properties.
(a) The map $G$ admits a unique (and therefore ergodic) SRB measure $\nu$, and
(b) the dynamical system $(G, \nu)$ is mixing, or, equivalently, isomorphic to a Bernoulli shift.
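Condition (H7)(b), the existence of $N$ with $Q^N > 0$, says that the 0–1 transition matrix $Q$ is primitive. For a concrete matrix this can be checked by taking Boolean powers. The sketch below is an illustration under stated assumptions: the helper names `bool_matmul` and `mixing_exponent` and the sample matrices are hypothetical, and the search is cut off at Wielandt's bound $(r-1)^2 + 1$, beyond which a non-positive power implies that no positive power exists.

```python
def bool_matmul(A, B):
    """Boolean product of two square 0-1 matrices (records which paths exist)."""
    n = len(A)
    return [[1 if any(A[i][k] and B[k][j] for k in range(n)) else 0
             for j in range(n)]
            for i in range(n)]

def mixing_exponent(Q):
    """Return the smallest N with Q^N > 0 entrywise, or None if Q is not primitive."""
    n = len(Q)
    bound = (n - 1) ** 2 + 1  # Wielandt's bound on the exponent of a primitive matrix
    P = [row[:] for row in Q]  # P holds the Boolean pattern of Q^N
    for N in range(1, bound + 1):
        if all(all(row) for row in P):
            return N
        P = bool_matmul(P, Q)
    return None

# Hypothetical examples: a golden-mean-type transition matrix and a pure swap.
print(mixing_exponent([[1, 1], [1, 0]]))  # primitive
print(mixing_exponent([[0, 1], [1, 0]]))  # periodic, never strictly positive
```

Working with Boolean patterns rather than integer powers avoids large entries; only the positions of the nonzero entries matter for the condition $Q^N > 0$.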

5. Verification of the Rank One Hypotheses

We view the singular limit $\{G_{a,0} : a \in S\}$ as a function of 3 parameters: $\varepsilon$, $\sigma$, and $\lambda_1$. We show that the family $\{G_{a, m^{-1}} : a \in S,\ m \in \mathbb{Z}^+\}$ satisfies (H1)–(H7) if the parameters $\varepsilon$, $\sigma$, and $\lambda_1$ satisfy certain scaling assumptions.


5.1. 1D analysis: Verification of (H4), (H5), and (H7). Recall that g(s, a) is defined implicitly by g(s,a) εσ b0 (τ ) dτ + tˆ(a) + 1 =

(s) + E. |λ 1| s Defining f a (s) = g(s, a), = becomes tˆ(a) + 1 =

εσ |λ1 | ,

and (s) = (s) + −1 E, the singular limit

f a (s)

b0 (τ ) dτ + (s).

(5.1)

s

For a map f : S → S and δ > 0, let C( f ) = {s : f (s) = 0} and let Cδ ( f ) = {s : |s − sˆ | < δ for some sˆ ∈ C( f )}. We assume the following about : there exist positive constants K 2 , d0 , d1 , and d2 , and a constant δ0 satisfying 0 < δ0 < 21 d1 , such that the following hold: (A1) (A2) (A3) (A4)

C 3 (S) < K 2 , | (s)| > d0 for s ∈ Cδ0 (), If (s1 ) = (s2 ) = 0 and s1 = s2 , then |s1 − s2 | > d1 , | (s)| > d2 for s ∈ S \ Cδ0 ().

Because is a Morse function, Proposition 3.2 implies that Assumptions (A1)–(A4) are satisfied if σ 1, |λ1 | is sufficiently small, and |λε1 | is sufficiently small. We now compare the map f a to the map . Let {v¯1 , . . . , v¯q0 } be the set of critical 3 points of . Set ξ = − 4 . Lemma 5.1. There exists 0 > 0 and positive constants K 3 , K 4 , and K 5 such that the following hold for fixed > 0 : (a) C( f a ) = {v1 , . . . , vq0 } with |vi − v¯i | < K 3 −1 for 1 i q0 , (b) | f a (s)| > K 4 for all s ∈ Cξ ( f a ), 1 (c) | f a (s)| > K 5 4 for all s ∈ S \ C 1 ξ ( f a ). 2

Proof of Lemma 5.1. Differentiating (5.1) with respect to s, we obtain b0 (s) − (s) = b0 ( f a (s)) f a (s).

(5.2)

Setting f a (s) = 0 gives b0 (s) = (s). Since b0 is bounded above and bounded away from 0, (A2)–(A4) imply (a). Solving for f a (s), we have f a (s) =

b0 (s) − (s) . b0 ( f a (s))

(5.3)

On S \ C 1 ξ ( f a ) we have | (s)| > K ξ using (a), (A2), and (A4). Estimate (c) now 2 follows from (5.3). Differentiating (5.2) with respect to s, we obtain b0 (s) − (s) − b0 ( f a (s))[ f a (s)]2 = b0 ( f a (s)) f a (s). | (s)|

(5.4) | f a (s)|

For all s ∈ Cξ ( f a ), we have < K ξ by (A1) and (a). This implies that < 1 K 4 on Cξ using (5.3). Therefore the second term on the left side of (5.4) dominates and (b) holds.


5.1.1. Critical curves. Assume > 0 and let ⊂ S be a parameter interval. For a ∈ , we have C( f a ) = {v1 (a), . . . , vq0 (a)} by Lemma 5.1. Write γ (i) (a) = vi (a) for 1 i q0 . For 1 k q0 and i ∈ N, define γi(k) (a) := f ai (γ (k) (a)). Differentiating γ1(k) (a) = f a (γ (k) (a)) = f (γ (k) (a), a) with respect to a, we have d (k) ∂ f (k) d (k) ∂ f (k) γ (a) = (γ (a), a) · γ (a) + (γ (a), a) da 1 ∂s da ∂a ∂ f (k) = (γ (a), a). ∂a Differentiating (5.1) with respect to a and using the fact that

d ˆ da t (a)

(5.5)

= b0 (a), we obtain

b0 (a) ∂ f (s, a) = . ∂a b0 ( f (s, a))

(5.6)

Thus d (k) mins∈S b0 (s) γ1 (a) > 0. da maxs∈S b0 (s) More generally, an estimate on

d (k) da γi+1 (a)

for i ∈ N follows from the recursive formula

d (k) ∂ f (k) d (k) ∂ f (k) γi+1 (a) = (γi (a), a) · γi (a) + (γ (a), a). da ∂s da ∂a i

(5.7)

Lemma 5.2 (Growth estimate for derivatives of critical curves). There exists 1 0 such that the following holds for all > 1 . For any k ∈ {1, . . . , q0 } and i ∈ N such that γ j(k) (a) ∈ S \ Cξ () for all 1 j i, then i d (k) γ (a) > K 5 14 > 5i . da i+1 2

(5.8)

Proof of Lemma 5.2. Estimate (5.8) follows from (5.7), estimate (c) from Lemma 5.1, and the fact that for all s ∈ S and a ∈ we have ∂ maxs∈S b0 (s) f (s, a) =: K 6 . ∂a mins∈S b0 (s) Lemma 5.3 (Distortion estimate for critical curves). There exists 2 1 and D1 > 0 such that the following holds for all > 2 . For any k ∈ {1, . . . , q0 } and any n 2, let be a parameter interval such that (k)

(a) γi () ⊂ S \ Cξ () for 1 i n − 1, and (k) (b) (γn−1 ()) < ξ ( denotes Lebesgue measure on S). Then for all a, aˆ ∈ , we have

d γ (k) (a) da n d (k) < D1 . γn (a) ˆ da

If n = 1, then (5.9) holds for all k ∈ {1, . . . , q0 } and for all a, aˆ ∈ S.

(5.9)


Proof of Lemma 5.3. For n = 1 and a, aˆ ∈ S, the estimate d γ (k) (a) da 1 d (k) < K 62 γ (a) ˆ da 1 (k)

(k)

follows from (5.5) and (5.6). For n 2 and a, aˆ ∈ , let si = γi (a) and sˆi = γi (a). ˆ We have ∂ d d s f (s ) · d s da i a i−1 da i−1 + ∂a f a (si−1 ) f a (si−1 ) · da si−1 − 15 (i−1) 1 + O( = ) . d = sˆi f (ˆsi−1 ) · d sˆi−1 + ∂ f aˆ (ˆsi−1 ) f (ˆsi−1 ) · d sˆi−1 aˆ

da

∂a

da

aˆ

da

This implies the estimate n−1 n−1 ds ds f (s ) i da 1 da n a i log d = log d + log log 1 + O(− 5 ) + sˆ1 sˆn f aˆ (ˆsi ) da da i=1

i=1

n−1 | f a (si ) − f aˆ (ˆsi )|

| f aˆ (ˆsi )|

i=1

+ O(1).

The equality

1 | f a (si ) − f aˆ (ˆsi )| = (b0 (si+1 ) − b0 (ˆsi+1 )) f aˆ (ˆsi ) b0 ( f a (si ))

+ ( (si ) − (ˆsi )) + (b0 (ˆsi ) − b0 (si ))

implies the estimate n−1 n−1 n−1 ds 3 1 da n log d K |si+1 − sˆi+1 | + K 4 |si − sˆi | + K − 4 |ˆsi − si | + O(1) sˆn da

i=1

i=1

= K |sn − sˆn | +

n−1

i=1

|si − sˆi |

i=2 3

+K 4

n−1

1

|si − sˆi | + K − 4

i=1

n−1

|ˆsi − si | + O(1)

i=1

K |sn − sˆn | + K |sn−1 − sˆn−1 |

n−3

1

− 5 i

i=0 3

1

+K |sn−1 − sˆn−1 |( 4 + − 4 )

n−2

1

− 5 i + O(1)

i=0

= O(1).


5.1.2. Verification of (H4): Definition 4.1(B) We prove the existence of a parameter a ∗ such that f a ∗ satisfies Definition 4.1(B). We will then show that if is sufficiently large, then for any parameter a, if f a satisfies Definition 4.1(B), then f a ∈ M. Proposition 5.4. There exists 3 2 such that if 3 and ⊂ S is a parameter interval satisfying () = 3D1 K 6 q0 ξ , then there exists a ∗ ∈ such that for all c ∈ C( f a ∗ ), f an∗ (c) ∈ S \ Cξ () for all n ∈ N. Proof of Proposition 5.4. We inductively construct a nested sequence of parameter inter∞ vals = 0 ⊃ 1 ⊃ 2 ⊃ · · · such that a ∗ ∈ i=0 i has the desired property. Definition 5.5. The (q0 + 1)-tuple (n ; i 1,n , . . . , i q0 ,n ) is called an admissible configuration if n is a subinterval of 0 and if for every k ∈ {1, . . . , q0 }, i k,n n and the following conditions are satisfied. (k)

(M1) γi (n ) ∩ Cξ () = ∅ for all i i k,n (M2) For all a, aˆ ∈ n , we have the distortion estimate d (k) da γik,n (a) d (k) < D1 . ˆ da γik,n (a) (k) (M3) γik,n +1 (n ) 3D1 q0 ξ We inductively construct admissible configurations for all n ∈ N such that i k,n → ∞ as n → ∞ for every k. We begin with n = 1. Let d˜ :=

min |s − t|.

s,t∈C() s =t

˜ Let i k,1 = 1 for all k. We choose 1 as follows. We We assume that 3D1 K 62 q0 ξ < 21 d. have d (k) b0 (a) γ1 (a) = , (k) da b0 (γ1 (a)) so (k)

3D1 q0 ξ (γ1 (0 )) 3D1 K 62 q0 ξ <

1˜ d. 2

(k)

Consequently, γ1 (0 ) meets at most one component of Cξ () and we have (k)

((γ1 |0 )−1 (Cξ ())) 2 . (0 ) 3q0 (k)

Even in the worst-case scenario in which the q0 intervals {(γ1 )−1 (Cξ ()) : 1 k q0 } are evenly spaced in 0 , there exists a subinterval 1 of 0 with (1 ) (k) D1 K 6 q 0 ξ such that γ1 (1 ) ∩ Cξ () = ∅ for all k. Property (M1) holds by design q0 +1 and (M2) follows from Lemma 5.3. Property (M3) holds if is such that (k)

1

(γ2 (1 )) 5 (1 ) 3D1 q0 ξ.


Now assume that for n ∈ N we are given an admissible configuration (n ; i 1,n , . . . , i q0 ,n ). We construct an admissible configuration at step n + 1 as follows. Partition the set {1, . . . , q0 } into 2 sets: A, the set of indices that are ‘ready to advance’, and {1, . . . , q0 } \ A, the set of indices that are not ready to advance. The index k is in A if (I1) and (I2) hold: (k)

(I1) (γik,n (n )) < ξ (distortion estimate holds for the next iterate), (k) (I2) (γik,n +1 (n )) < 21 d˜ (image of the next iterate meets at most one component of Cξ ()).

Suppose that A = ∅. In this case, set i k,n + 1, if k ∈ A; i k,n+1 = i k,n , if k ∈ {1, . . . , q0 } \ A. We now find n+1 so that (n+1 ; i 1,n+1 , . . . , i q0 ,n+1 ) is an admissible configuration. Let k ∈ A. Using (M3) and (I2), we have 3D1 q0 ξ (γi(k) (n )) < k,n +1

1˜ d. 2

(n ) in Cξ () is bounded above by This implies that the fraction of γi(k) k,n +1 2 3D1 q0 .

2ξ 3D1 q0 ξ

=

Using (I1) and Lemma 5.3, we have (k)

((γik,n +1 |n )−1 (Cξ ())) (n )

2 . 3q0

Arguing as in the n = 1 case, there exists a subinterval n+1 of n such that (n+1 ) (k) 1 3(q0 +1) (n ) and for all k ∈ A, γi k,n +1 (n+1 ) ∩ C ξ () = ∅. For k ∈ A, (M1) holds by design and (M2) follows from (I1). The inequality (γi(k) (n+1 )) k,n +1

D1−1 q0 (3D1 q0 ξ ) = ξ 3(q0 + 1) q0 + 1

implies that (M3) holds if is such that K5 1 q0 ξ (k) 4 (γik,n +2 (n+1 )) 3D1 q0 ξ. q0 + 1 2 Now let k ∈ {1, . . . , q0 } \ A. Properties (M1) and (M2) are inherited from step n. If (I1) fails for index k, then (M2) gives (n+1 )) (γi(k) k,n

ξ , 3D1 (q0 + 1)

so index k satisfies (M3) if is such that K5 1 ξ (k) (γik,n +1 (n+1 )) 4 3D1 q0 ξ. 3D1 (q0 + 1) 2


If (I1) holds but (I2) fails, then (M3) holds for index k if is such that 1˜ 1 (k) (γik,n +1 (n+1 )) d 3D1 q0 ξ. 3D1 (q0 + 1) 2 If A = ∅, then let n be the left half of n . We claim that (n ; i 1,n , . . . , i q0 ,n ) is an admissible configuration. For each index k, properties (M1) and (M2) trivially hold. Property (M3) is established (for sufficiently large) by arguing as above in the 2 cases: (1) (I1) does not hold, and (2) (I1) holds but (I2) fails. Repeat the halving process until A = ∅. 5.1.3. Verification of (H4): f a ∗ satisfies Definition 4.1(B) ⇒ f a ∗ ∈ M We show that if is sufficiently large and a ∗ ∈ S is as in Proposition 5.4, then f a ∗ ∈ M. This implication is a consequence of Lemma 5.1 and the following binding estimate. Proposition 5.6. There exists K 7 > 0 such that for sufficiently large and a ∗ as in Proposition 5.4, we have the following. For c ∈ C( f a ∗ ) and s ∈ S satisfying |s − c| 11 − 12 , let m(s) be the smallest value of m ∈ Z+ such that | f am∗ (s) − f am∗ (c)| > 21 ξ . Then m(s) > 1 and |( f am(s) ) (s)| (K 7 ) ∗

m(s) 16

.

Proof of Proposition 5.6. We begin with a spatial distortion lemma. Lemma 5.7 (Spatial distortion estimate). There exists D2 1 such that the following holds for all a ∈ S. For s, sˆ ∈ S, let m ∈ Z+ be such that πi , the segment between f ai (s) and f ai (ˆs ), satisfies (πi ) < 21 ξ and πi ∩ C 1 ξ ( f a ) = ∅ for all 0 i < m. Then 2 m ( f a ) (s) ( f m ) (ˆs ) D2 . a Proof of Lemma 5.7. Writing si = f ai (s) and sˆi = f ai (ˆs ) and using Lemma 5.1 and its proof, we have m m−1 ( f a ) (s) f a (si ) log m = log ( f a ) (ˆs ) f a (ˆsi ) i=0

m−1 i=0

| f a (si ) − f a (ˆsi )| | f a (ˆsi )|

1 K 5−1 − 4

1

K6 +

3

C 0 (S) minw∈S b0 (w)

(K − 4 + K 4 )|sm−1 − sˆm−1 |

m−1

|si − sˆi |

i=0 m−1

1

(K 5 4 )−i

i=0

= O(1).


Returning to the proof of Proposition 5.6, write f = f a ∗ . We first show that m(s) > 1. We have 1 | f (ζ )|(s − c)2 2

| f (s) − f (c)| = 11

for some ζ satisfying |ζ − c| − 12 . Arguing as in the proof of Lemma 5.1, | f (ζ )| K . Therefore 5

| f (s) − f (c)| K − 6

ξ 2

for sufficiently large. Now assume m(s) > 1. Using Lemma 5.7, we have ξ < | f m(s) (s) − f m(s) (c)| 2 = |( f m(s)−1 ) (ζ1 )| · | f (s) − f (c)| D2 |( f

(for some ζ1 between f (s) and f (c))

m(s)−1

) ( f (c))| · | f (s) − f (c)|,

and therefore ξ < D2 |( f m(s)−1 ) ( f (c))| · | f (ζ )| · (s − c)2 .

(5.10)

Reversing inequality (5.10) at time m(s) − 1, we have ξ D2−1 |( f m(s)−2 ) ( f (c))| · | f (ζ )| · (s − c)2 .

(5.11)

Estimating |( f m(s)−1 ) ( f (c))| from below using (5.10) gives |( f m(s) ) (s)| = | f (s) − f (c)| · |( f m(s)−1 ) ( f (s))| D2−1 |( f m(s)−1 ) ( f (c))|·| f (ζ4 )|·|s −c| (for some ζ4 between s and c) | f (ζ4 )| ξ . (5.12) 2 D2 |s − c| | f (ζ )|

Arguing as in the proof of Lemma 5.1, || ff (ζ(ζ4)|)| K > 0 since ζ4 and ζ are between s and c. Using this fact and estimating |s − c|−1 from below using (5.11), (5.12) implies |( f

Kξ ) (s)| 2 D2

m(s)

D2−1 |( f m(s)−2 ) ( f (c))| · | f (ζ )| ξ

(K )

m(s) 1 8 −8

(K )

m(s) 16

1 2

.


5.1.4. Verification of (H5) and (H7). The following lemma facilitates the verification of (H5). Lemma 5.8 ([11,12]). Let f = f a ∗ . Suppose that for all x ∈ C( f a ∗ ), we have ∞ k=0

1 < ∞. |( f k ) ( f (x))|

Then for each x ∈ C( f a ∗ ), ! " ∞ d [(∂a f a )( f k (x))]a=a ∗ d = f p(a) (x(a)) − . a ( f k ) ( f (x)) da da a=a ∗ k=0

Property (H5) follows from Lemma 5.8 for sufficiently large. To see this, suppose f a ∗ ∈ M and let c ∈ C( f a ∗ ). For k ∈ Z+ , we have 1 k |( f ak∗ ) ( f (c))| K 5 4 by Lemma 5.1(c). Since K 6−1 large, then

∂ ∂a

f (s, a) K 6 , we conclude that if is sufficiently

∞ [∂a f a ( f ak∗ (c))]a=a ∗ k=0

( f ak∗ ) ( f a ∗ (c))

K 6−1

−

∞ k=1

K6 1

K5 4

k > 0.

Property (H7) follows from Lemma 5.1 and Proposition 5.6 provided is sufficiently large. Acknowledgements Mikko Stenlund was partially supported by the Academy of Finland. William Ott has been partially supported by NSF grant DMS-0603509.

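Appendix A. Some Proofs

We assume throughout Sect. A that $L = 1$. Notice that if $V$ denotes a vector field, then
$$\frac{dz}{ds} = V \quad \Rightarrow \quad \frac{d\|z\|}{ds} = \frac{1}{2\|z\|} \frac{d\|z\|^2}{ds} = \frac{z}{\|z\|} \cdot \frac{dz}{ds} = \frac{z}{\|z\|} \cdot V. \tag{A.1}$$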

We will use this fact together with the following Grönwall-type inequality:

Lemma A.1. Assume that $\beta$ is a constant, the function $\varphi$ is continuous on the interval $[\hat{s}, \check{s}]$, and that the function $u$ is differentiable and satisfies $\frac{du}{ds} \le \beta u + \varphi$ on $(\hat{s}, \check{s})$. Then, for all $s \in (\hat{s}, \check{s})$,
$$u(s) \le u(\hat{s}) e^{\beta(s - \hat{s})} + \int_{\hat{s}}^{s} e^{\beta(s - \tau)} \varphi(\tau) \, d\tau.$$

Proof. Suppose $v(s) = u(\hat{s}) e^{\beta(s - \hat{s})} + \int_{\hat{s}}^{s} e^{\beta(s - \tau)} \varphi(\tau) \, d\tau$. Then $v$ satisfies the equation $\frac{dv}{ds}(s) = \beta v(s) + \varphi(s)$ with $v(\hat{s}) = u(\hat{s})$. Since $u - v$ is differentiable, $\frac{d}{ds}(u - v) \le \beta(u - v)$, and $(u - v)(\hat{s}) = 0$, a standard Grönwall argument shows that $u \le v$.

We get immediately

Corollary A.2. Suppose that in Lemma A.1 $\frac{du}{ds}(s) \le \frac{\lambda_1}{2} u + C_0 e^{\lambda_1 (s - \hat{s})}$. Then
$$u(s) \le \left( u(\hat{s}) + \frac{2 C_0}{|\lambda_1|} \right) e^{\frac{\lambda_1}{2}(s - \hat{s})}.$$
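The bound of Lemma A.1 can be sanity-checked numerically on a scalar example. The sketch below is illustrative only: the choices $\beta = -0.5$, $\varphi = \cos$, the nonnegative `slack` term that makes the differential inequality strict, and the function name `gronwall_check` are all assumptions made for the demonstration. The solution $u$ is integrated with a classical Runge–Kutta step, and the integral in the bound is updated with a trapezoid rule.

```python
import math

def gronwall_check(beta=-0.5, u0=1.0, s_end=10.0, n=10000):
    """Return max over the grid of u(s) - bound(s); Lemma A.1 predicts a value <= 0."""
    phi = math.cos
    # Nonnegative slack, so that du/ds = beta*u + phi - slack <= beta*u + phi.
    slack = lambda s: 0.3 * math.sin(3.0 * s) ** 2 + 0.05
    rhs = lambda s, u: beta * u + phi(s) - slack(s)

    h = s_end / n
    decay = math.exp(beta * h)
    s, u, integral = 0.0, u0, 0.0
    worst = -math.inf
    for _ in range(n):
        # Classical RK4 step for u.
        k1 = rhs(s, u)
        k2 = rhs(s + h / 2, u + h * k1 / 2)
        k3 = rhs(s + h / 2, u + h * k2 / 2)
        k4 = rhs(s + h, u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        # Trapezoid update of I(s) = int_0^s exp(beta*(s - tau)) * phi(tau) dtau.
        integral = decay * integral + 0.5 * h * (decay * phi(s) + phi(s + h))
        s += h
        bound = u0 * math.exp(beta * s) + integral
        worst = max(worst, u - bound)
    return worst

worst_gap = gronwall_check()
print("max of u - bound on the grid:", worst_gap)
```

A non-positive `worst_gap` (up to discretization error) is consistent with the lemma; the same loop can be reused with $\varphi(s) = C_0 e^{\lambda_1 s}$ and $\beta = \lambda_1 / 2$ to probe Corollary A.2.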

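Our first application of a Grönwall inequality is

Lemma A.3. Assume $z$ solves the forced equation (3.8b) with $z(s_0) = z_0$ and fix a constant $K > 0$. If $\varepsilon / |\lambda_1|$ is sufficiently small,
$$\|\partial_s^m \partial_{s_0}^l z(s)\| \le C\varepsilon \qquad (0 \le l + m \le 3) \tag{A.2}$$
as long as $s - s_0 \le K$. Moreover,
$$\|\partial_{z_0} z - 1\| \le C|\lambda_1|. \tag{A.3}$$

Proof. Equation (3.8b) reads
$$\frac{dz}{ds} = Az + h_3(s, z) \quad \text{with} \quad h_3(s, z) = O_s(\varepsilon) + O_{s,z}(\varepsilon \|z\|) + O_{s,z}(\varepsilon^2) + O_{s,z}(\|z\|^2). \tag{A.4}$$
Assuming $\|z\| / |\lambda_1|$ and $\varepsilon / |\lambda_1|$ are sufficiently small, (A.1) implies
$$\frac{d\|z\|}{ds} \le \frac{\lambda_1}{2} \|z\| + C_0 \varepsilon.$$
By Lemma A.1,
$$\|z(s)\| \le \|z_0\| e^{\frac{\lambda_1}{2}(s - s_0)} + \frac{2 C_0 \varepsilon}{|\lambda_1|} \left( 1 - e^{\frac{\lambda_1}{2}(s - s_0)} \right) \le \|z_0\| e^{\frac{\lambda_1}{2}(s - s_0)} + C_0 \varepsilon (s - s_0).$$
For $s - s_0 \le K$ we get $\|z(s)\| / |\lambda_1| \le \|z_0\| / |\lambda_1| + C_0 K \varepsilon / |\lambda_1|$, which proves the assumption legitimate. Differentiating (A.4) with respect to $s$ up to two times yields an expression for $\partial_s^m z(s)$. One immediately obtains $\|\partial_s^m z(s)\| \le C\varepsilon$ for $0 \le m \le 3$ and $s - s_0 \le K$. Equation (A.4) implies
$$z(s) = e^{(s - s_0)A} z_0 + \int_{s_0}^{s} e^{(s - \tau)A} h_3(\tau, z(\tau)) \, d\tau. \tag{A.5}$$
Differentiating this with respect to $s_0$ up to three times and evaluating at $s = s_0$ yields
$$\begin{aligned}
\partial_{s_0} z(s_0) &= -A z_0 - h_3(s_0, z_0), \\
\partial_{s_0}^2 z(s_0) &= A^2 z_0 + A h_3(s_0, z_0) - \partial_s h_3(s_0, z_0) - Dh_3(s_0, z_0) \partial_{s_0} z(s_0), \\
\partial_{s_0}^3 z(s_0) &= -A^3 z_0 - A^2 h_3(s_0, z_0) + 2A \partial_s h_3(s_0, z_0) - \partial_s^2 h_3(s_0, z_0) + A Dh_3(s_0, z_0) \partial_{s_0} z(s_0) \\
&\quad - D(\partial_s h_3)(s_0, z_0) \partial_{s_0} z(s_0) - Dh_3(s_0, z_0) \frac{d}{ds_0} \partial_{s_0} z(s_0) \\
&\quad - D^2 h_3(s_0, z_0)(\partial_{s_0} z(s_0), \partial_{s_0} z(s_0)) - Dh_3(s_0, z_0) \partial_{s_0}^2 z(s_0).
\end{aligned}$$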


Clearly, for $1 \le l \le 3$, $\|\partial_{s_0}^l z(s_0)\| \le C\varepsilon$. Such initial conditions are needed for analyzing the variational equations
$$\frac{d}{ds} \partial_{s_0} z = (A + Dh_3(s, z)) \, \partial_{s_0} z, \tag{A.6}$$
$$\frac{d}{ds} \partial_{s_0}^2 z = (A + Dh_3(s, z)) \, \partial_{s_0}^2 z + D^2 h_3(s, z)(\partial_{s_0} z, \partial_{s_0} z), \tag{A.7}$$
$$\frac{d}{ds} \partial_{s_0}^3 z = (A + Dh_3(s, z)) \, \partial_{s_0}^3 z + 3 D^2 h_3(s, z)(\partial_{s_0}^2 z, \partial_{s_0} z) + D^3 h_3(s, z)(\partial_{s_0} z, \partial_{s_0} z, \partial_{s_0} z).$$
One then checks recursively, using (A.1) and Corollary A.2, that
$$\|\partial_{s_0}^l z(s)\| \le C\varepsilon e^{\frac{\lambda_1}{2}(s - s_0)} \le C\varepsilon$$
hold for $1 \le l \le 3$ and $s - s_0 \le K$. Equations (A.6) and (A.7) provide us with an expression for $\partial_s \partial_{s_0}^l z(s)$ with $l = 1$ and $l = 2$. Moreover, (A.6) can be differentiated with respect to $s$ to yield an expression for $\partial_s^2 \partial_{s_0} z(s)$. The bounds in (A.2) are then readily obtained.

Finally, we will prove (A.3). To this end, notice that
$$\frac{d}{ds} \partial_{z_0} z = (A + Dh_3(s, z)) \, \partial_{z_0} z. \tag{A.8}$$
In particular, each row, $\partial_{z_{0,i}} z$, of $\partial_{z_0} z$ satisfies this equation. Hence, by principle (A.1), $\frac{d}{ds} \|\partial_{z_{0,i}} z\| \le \frac{\lambda_1}{2} \|\partial_{z_{0,i}} z\|$, so that the matrix $\partial_{z_0} z$ remains perpetually bounded. Integrating both sides of (A.8) from $s_0$ to $s$ and recalling $\partial_{z_0} z(s_0) = 1$ gives
$$\|\partial_{z_0} z(s) - 1\| \le |s - s_0| \left( |\lambda_{n-1}| + \sup_{s_0 \le s' \le s} \|Dh_3(s', z)\| \right) \sup_{s_0 \le s' \le s} \|\partial_{z_0} z(s')\|,$$
where $\|\cdot\|$ now denotes the matrix norm induced by the Euclidean norm. This estimate implies (A.3).

Proof of Proposition 3.1. Throughout the proof, $\|\cdot\|_{C^3}$ will stand for the $C^3$-norm with respect to $s_0$. By (3.8a) and (3.14), $\tilde{s}$ and $\hat{s}$ have to satisfy
$$\int_{s_0}^{\tilde{s}} b_0(\tau) \, d\tau = \rho = \int_{s_0}^{\hat{s}} b_0(\tau) + v(\tau) \, d\tau, \tag{A.9}$$
where $v(s) = b_1^T(s) P(s) z(s) + O_{s,z}(\varepsilon) + O_{s,z}(\varepsilon \|z(s)\|) + O_{s,z}(\|z(s)\|^2)$ and $z = z(s)$ solves (3.8b) with $z(s_0) = z_0$.

We use the implicit function theorem to find $\tilde{s}$. Clearly, $F : \mathbb{R} \times \mathbb{R} \to \mathbb{R} : (s_0, s) \mapsto \int_{s_0}^{s} b_0(\tau) \, d\tau - \rho$ is $C^3$. Observe that $F(s_0, s_0) = -\rho$ and $\lim_{s \to \infty} F(s_0, s) = \infty$ as $\min b_0 = m > 0$. By the intermediate value theorem, there exists a number $\tilde{s}$ such that $F(s_0, \tilde{s}) = 0$. Because $\frac{\partial}{\partial s} F(s_0, s) = b_0(s) \ge m$, the implicit function theorem implies that $\tilde{s}$ is a $C^3$-function of $s_0$. Notice that $F(s_0 + 1, s + 1) \equiv F(s_0, s)$, so that $\tilde{s}(s_0 + 1) = \tilde{s}(s_0) + 1$, which implies that $s_0 \mapsto \tilde{s}(s_0) - s_0$ is periodic.
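The construction of $\tilde{s}$ from $F(s_0, \tilde{s}) = 0$ can be mirrored numerically: since $F$ is strictly increasing in $s$, bisection on the bracket $[s_0, s_0 + \rho/m]$ finds the root. The sketch below is a toy under explicit assumptions: the 1-periodic function `b0`, its antiderivative `B`, the value `RHO`, and the name `s_tilde` are all hypothetical choices, not data from the paper.

```python
import math

RHO = 2.3     # hypothetical value of rho
B0_MIN = 0.5  # minimum m of the hypothetical b0 below

def b0(s):
    """Hypothetical positive 1-periodic function standing in for b_0."""
    return 1.0 + 0.5 * math.sin(2.0 * math.pi * s)

def B(s):
    """Antiderivative of b0, so that F(s0, s) = B(s) - B(s0) - RHO."""
    return s - 0.5 * math.cos(2.0 * math.pi * s) / (2.0 * math.pi)

def F(s0, s):
    return B(s) - B(s0) - RHO

def s_tilde(s0, tol=1e-12):
    """Solve F(s0, s) = 0 for s by bisection; F is increasing in s because b0 >= m > 0."""
    lo, hi = s0, s0 + RHO / B0_MIN  # F(s0, lo) = -RHO < 0 and F(s0, hi) >= 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(s0, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

root = s_tilde(0.3)
shifted = s_tilde(1.3)
print(root, shifted - root)
```

The second printed number illustrates the relation $\tilde{s}(s_0 + 1) = \tilde{s}(s_0) + 1$ noted in the proof, which here follows from the 1-periodicity of the chosen `b0`.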


Now that we have $\tilde{s}$, let us define the function
$$g(\xi) := -\rho + \int_{s_0}^{\tilde{s} + \xi} b_0(\tau) + v(\tau) \, d\tau.$$
Notice that, denoting $\xi_1 = \hat{s} - \tilde{s}$, the right side of (A.9) is equivalent to $g(\xi_1) = 0$. The Taylor expansion $g(\xi) = g(0) + g'(0)\xi + \delta_2 g(\xi)$ yields
$$G(\xi) := -\frac{1}{g'(0)} \left( g(0) + \delta_2 g(\xi) \right) = \xi,$$
which we regard, for all fixed $z_0$, as a fixed point equation on the space of $C^3$ functions $\xi = \xi(s_0)$. Assuming $G$ is a contraction in a closed, origin-centered ball $\bar{B}_r \subset C^3$ of radius $r$, there exists a unique solution, $\xi_1$, to $G(\xi) = \xi$ inside the ball. Next, we prove that for a suitably small value of $r$, $G$ is indeed a contraction.

First, notice that
$$\begin{aligned}
g(0) &= \int_{s_0}^{\tilde{s}} v(\tau) \, d\tau = (\tilde{s} - s_0) \int_0^1 v((1 - \tau) s_0 + \tau \tilde{s}) \, d\tau, \\
g'(0) &= b_0(\tilde{s}) + v(\tilde{s}), \\
\delta_2 g(\xi) &= \xi^2 \int_0^1 (1 - \tau) \, g''(\xi \tau) \, d\tau = \xi^2 \int_0^1 (1 - \tau) \, (b_0' + v')(\tilde{s} + \xi \tau) \, d\tau
\end{aligned}$$
are smooth functions of $s_0$. Because $\tilde{s}$ is $C^3$ in $s_0$ and $\inf_{s_0} g'(0) > 0$, the bounds (A.2) yield
$$\left\| \frac{1}{g'(0)} \right\|_{C^3} \le C \quad \text{and} \quad \left\| \frac{g(0)}{g'(0)} \right\|_{C^3} \le C\varepsilon.$$
Moreover,
$$\|\delta_2 g(\xi)\|_{C^3} \le C \|\xi\|_{C^3}^2 \sup_{\zeta \in \bar{B}_r} \|(b_0' + v')(\tilde{s} + \zeta)\|_{C^3}.$$
Hence, $\|G(\xi)\|_{C^3} \le C_0(\varepsilon + r^2)$ for some $C_0$. Choosing $r = 2 C_0 \varepsilon$, we have $G(\bar{B}_r) \subset \bar{B}_r$ for $\varepsilon$ small enough.

Second, let $\xi^1$ and $\xi^2$ be elements of $\bar{B}_r$. Since the map $\xi \mapsto G(\xi)$ is differentiable and the operator norm of the derivative obeys the bound $\sup_{\xi \in \bar{B}_r} \|DG(\xi)\|_{L(C^3)} \le C \sup_{\xi \in \bar{B}_r} \|D\delta_2 g(\xi)\|_{L(C^3)} \le Cr$, the mean value theorem yields $\|G(\xi^1) - G(\xi^2)\|_{C^3} \le Cr \|\xi^1 - \xi^2\|_{C^3}$. Hence, $G$ is a contraction on $\bar{B}_r$ if $\varepsilon$ is sufficiently small.

We will now prove that the fixed point, $\xi_1$, of $G$ is a periodic function of $s_0$. Let us denote by $z(s, s_0, z_0)$ the solution and by $v(s)|_{s_0}$ the function $v$ defined above, when the initial condition $z(s_0) = z_0$ is being used. Because $h_3(s + 2, z) = h_3(s, z)$ in (A.4), we have $z(s + 2, s_0 + 2, z_0) = z(s, s_0, z_0)$ and $v(s + 2)|_{s_0 + 2} = v(s)|_{s_0}$. Since $g(\xi_1) = 0$ for all


values of $s_0$ and $\tilde{s}(s_0 + 2) = \tilde{s}(s_0) + 2$, the computation
$$\begin{aligned}
g(\xi_1)(s_0 + 2) &= -\rho + \int_{s_0 + 2}^{\tilde{s}(s_0 + 2) + \xi_1(s_0 + 2)} b_0(\tau) + v(\tau)|_{s_0 + 2} \, d\tau \\
&= -\rho + \int_{s_0 + 2}^{\tilde{s}(s_0) + 2 + \xi_1(s_0 + 2)} b_0(\tau) + v(\tau)|_{s_0 + 2} \, d\tau \\
&= -\rho + \int_{s_0}^{\tilde{s}(s_0) + \xi_1(s_0 + 2)} b_0(\tau + 2) + v(\tau + 2)|_{s_0 + 2} \, d\tau \\
&= -\rho + \int_{s_0}^{\tilde{s}(s_0) + \xi_1(s_0 + 2)} b_0(\tau) + v(\tau)|_{s_0} \, d\tau \\
&= g(\xi_1)(s_0) + \int_{\tilde{s}(s_0) + \xi_1(s_0)}^{\tilde{s}(s_0) + \xi_1(s_0 + 2)} b_0(\tau) + v(\tau)|_{s_0} \, d\tau
\end{aligned}$$
implies that the last integral vanishes despite the fact that the integrand is positive, so we must have $\xi_1(s_0 + 2) = \xi_1(s_0)$.

As the last step, we will bound the difference $\hat{z} - \tilde{z}$. Let $z^{(1)}$ and $z^{(2)}$ solve (3.8b) and (3.13b), respectively, with the initial condition $z^{(1)}(s_0) = z^{(2)}(s_0) = z_0$. Both of these are $C^3$ functions of $(s_0, z_0)$ by the smoothness of the vector fields. By definition, $\hat{z} = z^{(1)}(\hat{s})$ and $\tilde{z} = z^{(2)}(\tilde{s})$. We need a bound on the $C^3$ norm of the difference $\xi^2(s_0) = \hat{z} - \tilde{z}$ for fixed $z_0$. Notice that $\xi^2(s_0) = (z^{(1)} - z^{(2)})(\hat{s}) + (z^{(2)}(\hat{s}) - z^{(2)}(\tilde{s}))$. Observe that the difference $\delta = z^{(1)} - z^{(2)}$ satisfies the differential equation
$$\frac{d\delta}{ds} = A z^{(1)} + O_{s, z^{(1)}}(\varepsilon \|z^{(1)}\|) + O_{s, z^{(1)}}(\varepsilon^2) + O_{s, z^{(1)}}(\|z^{(1)}\|^2) = w.$$
Here $z^{(1)}$, and hence $w$, is to be regarded as a predetermined function for which we already have good bounds. Indeed, let $S = \{(s_0, s) : 0 \le s_0 < 2,\ s_0 \le s \le K\}$ and let $\|\cdot\|_{C^3_S}$ stand for the $C^3$ norm on this set. According to (A.2),
$$\|w - A z^{(1)}\|_{C^3_S} \le C\varepsilon^2,$$
whereas, recalling that all eigenvalues of $A$ are proportional to $\lambda_1$, $\|A z^{(1)}\|_{C^3_S} \le C|\lambda_1|\varepsilon$. In other words, $\|w\|_{C^3_S} \le C|\lambda_1|\varepsilon$. As $\delta(s_0) = 0$, we have
$$(z^{(1)} - z^{(2)})(\hat{s}) = \delta(\hat{s}) = \int_{s_0}^{\hat{s}} w(\tau) \, d\tau.$$

Because $\hat{s}$ is $C^3$ in $s_0$, it follows that $\|(z^{(1)} - z^{(2)})(\hat{s})\|_{C^3} \le C|\lambda_1|\varepsilon$. By (3.13b), the remaining contribution reads $z^{(2)}(\hat{s}) - z^{(2)}(\tilde{s}) =$

s˜

sˆ

εP−1 (τ )φ(0, τ ) (0)

ψn (τ )

dτ.

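We have seen above that $\|\hat{s} - \tilde{s}\|_{C^3} \le C\varepsilon$, which implies $\|z^{(2)}(\hat{s}) - z^{(2)}(\tilde{s})\|_{C^3} \le C\varepsilon^2$ and finally $\|\xi^2(s_0)\|_{C^3} \le C|\lambda_1|\varepsilon$.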
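The fixed point equation $G(\xi) = \xi$ used in the proof of Proposition 3.1, with $G(\xi) = -(g(0) + \delta_2 g(\xi))/g'(0)$, is a chord iteration: it equals $\xi - g(\xi)/g'(0)$, a Newton step with the derivative frozen at $\xi = 0$. The scalar sketch below only illustrates that mechanism. The function `g`, the constants `EPS` and `SLOPE`, and the names `G` and `solve_fixed_point` are hypothetical; in the proof the unknown is a $C^3$ function of $s_0$ rather than a number.

```python
import math

EPS = 0.01   # hypothetical small parameter, playing the role of g(0) = O(eps)
SLOPE = 1.2  # hypothetical value of g'(0), bounded away from zero

def g(xi):
    """Toy function with g(0) = EPS, g'(0) = SLOPE and a quadratic-order remainder."""
    return EPS + SLOPE * xi + 0.8 * math.sin(xi) ** 2

def G(xi):
    """G(xi) = -(g(0) + delta_2 g(xi)) / g'(0), written as xi - g(xi) / g'(0)."""
    return xi - g(xi) / SLOPE

def solve_fixed_point(iterations=50):
    """Iterate xi <- G(xi) from 0; G contracts near 0 because G'(xi) = O(xi)."""
    xi = 0.0
    for _ in range(iterations):
        xi = G(xi)
    return xi

xi_star = solve_fixed_point()
print("fixed point:", xi_star, "residual g(xi*):", g(xi_star))
```

The computed fixed point has size of order `EPS`, in line with the choice $r = 2 C_0 \varepsilon$ for the radius of the ball in which the contraction argument is carried out.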


Remark A.4. It follows from the previous proof that, under the conditions of Proposition 3.1,
$$\|\partial_{z_0} \hat{s}\| \le C. \tag{A.10}$$
Indeed, $\partial_{z_0} \hat{s} = \partial_{z_0} \xi_1$, as $\partial_{z_0} \tilde{s} = 0$. From the fixed point equation $\xi_1 = G(\xi_1)$ we get $\partial_{z_0} \xi_1 = (1 - DG(\xi_1))^{-1} (\partial_{z_0} G)(\xi_1)$ and then the claimed bound. Moreover,
$$\|\partial_{z_0} \hat{z} - 1\| \le C|\lambda_1|. \tag{A.11}$$
Let $z(s) = z(s; s_0, z_0)$ be the solution to (3.8b) with $z(s_0; s_0, z_0) = z_0$ and recall that $\hat{s}$ depends on $(s_0, z_0)$. By definition, $\hat{z} = z(\hat{s}; s_0, z_0)$ so that $\partial_{z_0} \hat{z} = \partial_s z(\hat{s}) \, \partial_{z_0} \hat{s} + \partial_{z_0} z(\hat{s}) = 1 + O(\lambda_1)$ by the bounds in Lemma A.3.

Let us view the solution
$$z(s) = z(s, \hat{s}, \hat{z}), \qquad z(\hat{s}) \equiv \hat{z} \tag{A.12}$$

to Eq. (3.7b) as a function of three variables and abbreviate ∂s = ∂/∂s , ∂ˆi = ∂/∂ zˆ i , ∂ˆi1 ···ik = ∂ˆi1 · · · ∂ˆik , and ∂sˆ = ∂/∂ sˆ . Proposition A.5. Assuming ˆz/|λ1 | is small enough, we have, for 0 k + l + m 3 and s sˆ , the following bounds: # # λ1 # m l # #∂s ∂sˆ z(s)# Cˆze 2 (s−ˆs ) , # # λ1 C # m lˆ # e 2 (s−ˆs ) (k > 0). #∂s ∂sˆ ∂i1 ···ik z(s)# k−1 |λ1 | Proof. The initial conditions (∂z/∂ zˆ )(ˆs ) = 1, ∂ˆi j z(ˆs ) = 0, and ∂ˆi jk z(ˆs ) = 0 follow from (A.12), as the zˆ -derivatives can be computed after evaluating z at s = sˆ . Similarly, taking sˆ -derivatives of s z(s) = e(s−ˆs )A zˆ + e−(τ −ˆs )A h1 (τ, z(τ )) dτ sˆ

yields first, analogously to how the identities below (A.5) were obtained, ∂sˆ z(ˆs ) = −Aˆz − h1 (ˆs , zˆ ), ∂sˆ2 z(ˆs ) = A2 zˆ + Ah1 (ˆs , zˆ ) − ∂s h1 (ˆs , zˆ ) − Dh1 (ˆs , zˆ )∂sˆ z(ˆs ), ∂sˆ3 z(ˆs ) = −A3 zˆ − A2 h1 (ˆs , zˆ ) + 2A∂s h1 (ˆs , zˆ ) − ∂s2 h1 (ˆs , zˆ ) + ADh1 (ˆs , zˆ )∂sˆ z(ˆs ) −D(∂s h1 )(ˆs , zˆ )∂sˆ z(ˆs ) − Dh1 (ˆs , zˆ )

d ∂sˆ z(ˆs ) − D 2 h1 (ˆs , zˆ )(∂sˆ z(ˆs ), ∂sˆ z(ˆs )) d sˆ

−Dh1 (ˆs , zˆ )∂sˆ2 z(ˆs ). These formulas can then be differentiated with respect to zˆ in order to find higher-order initial conditions. As h1 (s, z) = O(z2 ), we obtain the following estimates: ∂sˆl z(ˆs ) Cˆz,

∂sˆl ∂ˆi z(ˆs )

C,

∂sˆ ∂ˆi j z(ˆs ) C.

l = 1, 2, 3, l = 1, 2,


Combining (3.7b) and (A.1), d λ1 z · Az + z · h1 (s, z) z = λ1 z + h1 (s, z) z ds z 2 if z/|λ1 | is small enough. Below, ˆz/|λ1 | will always be assumed small enough. Thus, for all s > sˆ , z(s) ˆze

λ1 s) 2 (s−ˆ

.

(A.13)

Differentiating (3.7b) with respect to various components of zˆ , we obtain the variational equations d ˆ ∂i z = (A + Dh1 (s, z)) ∂ˆi z, ds

(A.14)

d ˆ ∂i j z = (A + Dh1 (s, z)) ∂ˆi j z + D 2 h1 (s, z)(∂ˆi z, ∂ˆ j z), ds

(A.15)

d ˆ ∂i jk z = (A + Dh1 (s, z)) ∂ˆi jk z + D 2 h1 (s, z)(∂ˆi z, ∂ˆ jk z) ds

(A.16)

+ D 2 h1 (s, z)(∂ˆk z, ∂ˆi j z) + D 2 h1 (s, z)(∂ˆ j z, ∂ˆki z) + D 3 h1 (s, z)(∂ˆi z, ∂ˆk z, ∂ˆi j z). Combining (A.14), (A.1), and (A.13), we have ∂ˆi z(s) e

λ1 s) 2 (s−ˆ

(A.17)

in analogy with (A.13). Combining (A.15), (A.1), (A.13), and (A.17), d ˆ λ1 λ1 ∂i j z ∂ˆi j z + C∂ˆi z∂ˆ j z ∂ˆi j z + Ceλ1 (s−ˆs ) . ds 2 2 Applying Corollary A.2, ∂ˆi j z(s)

C λ1 (s−ˆs ) e2 . |λ1 |

(A.18)

Similarly, combining (A.16), (A.1), (A.13), (A.17), and (A.18), ∂ˆi jk z(s)

C λ1 (s−ˆs ) e2 . |λ1 |2

(A.19)

Differentiating Eqs. (3.7b), (A.14), and (A.15) with respect to sˆ produces equations for ∂sˆl z, ∂sˆl ∂ˆi z, and ∂sˆ ∂ˆi j z. For example, d ∂sˆ z = (A + Dh1 (s, z)) ∂sˆ z, ds d 2 ∂ z = (A + Dh1 (s, z)) ∂sˆ2 z + D 2 h1 (s, z)(∂sˆ z, ∂sˆ z). ds sˆ Such equations can be handled in a similar fashion and it is easy to verify that the additional sˆ -derivatives do not change the bounds by more than a constant prefactor. The bounds with m = 0 in the proposition are now clear. The bounds with m = 0 follow immediately from the appropriate differential equation; for instance, bounding d ˆ the right-hand side of (A.14) yields the bound on ds ∂i z.


Proof of Proposition 3.2. We first view the error terms Ek with 1 k 3 as smooth functions of (ˆs , zˆ ) and bound their derivatives with respect to these variables. The bounds on the C 3 -norms with respect to s0 follow by the chain rule. Bounding the C 3 -norms of E4 and E5 is trivial and is done at the end of the proof. Throughout the proof, · C 3 will stand for the C 3 -norm with respect to s0 . Terms E1 and E2 . It is convenient to express E1 and E2 in the form ∞ τ T E1 = b1 (ˆs + τ ) P(ˆs + τ ) e(τ −ξ )A h1 (ˆs + ξ, z(ˆs + ξ )) dξ dτ, 0 0 ∞ h 2 (ˆs + τ, z(ˆs + τ )) dτ. E2 = 0

First of all, that

(bT1 P)(ˆs

+ τ )C 3 CbT1 PC 3 for every τ , because ˆs − s0 C 3 C, so

E1 C 3

CbT1 PC 3

E2 C 3

∞ 0

∞ τ 0

0

eλ1 (τ −ξ ) h1 (ˆs + ξ, z(ˆs + ξ ))C 3 dξ dτ,

h 2 (ˆs + τ, z(ˆs + τ ))C 3 dτ.

Since h1 (s, z) and h 2 (s, z) are periodic in the variable s, their partial derivatives of any order (less than four) with respect to s are periodic functions of s and can be bounded exactly as h1 (s, z) and h 2 (s, z). To save a considerable amount of space, we write η = sˆ +ξ and ζ = (ˆs + ξ, z(ˆs + ξ )) below. Notice that the first three total sˆ -derivatives of z(η) are d z(η) = ∂s z(η) + ∂sˆ z(η), d sˆ d2 z(η) = ∂s2 z(η) + 2∂s ∂sˆ z(η) + ∂sˆ2 z(η), d sˆ 2 d3 z(η) = ∂s3 z(η) + 3∂s2 ∂sˆ z(η) + 3∂s ∂sˆ2 z(η) + ∂sˆ3 z(η). d sˆ 3 Taking zˆ -derivatives of the first two formulas above, d ˆ ∂i z(η) = ∂s ∂ˆi z(η) + ∂sˆ ∂ˆi z(η), d sˆ d ˆ ∂i j z(η) = ∂s ∂ˆi j z(η) + ∂sˆ ∂ˆi j z(η), d sˆ d2 ˆ ∂i z(η) = ∂s2 ∂ˆi z(η) + ∂s ∂sˆ ∂ˆi z(η) + ∂sˆ2 ∂ˆi z(η). d sˆ 2 Proposition A.5 then implies the bounds for 0 k + l 3: # l # #d # λ1 ξ # # # d sˆl z(η)# Cˆze 2 , # l # #d # λ1 C ξ # ˆ∂i1 ...ik z(η)# (k > 0). # d sˆl # |λ |k−1 e 2 1


These will be used to bound the C 3 -norm of h1 (ζ ). To this end, we compute d d h1 (ζ ) = ∂s h1 (ζ ) + Dh1 (ζ ) z(η) = O ˆz2 eλ1 ξ , d sˆ d sˆ d2 d h1 (ζ ) = ∂s2 h1 (ζ ) + 2D(∂s h1 )(ζ ) z(η) d sˆ 2 d sˆ d2 d d 2 + Dh1 (ζ ) 2 z(η) + D h1 (ζ ) z(η), z(η) d sˆ d sˆ d sˆ = O ˆz2 eλ1 ξ , d3 d h1 (ζ ) = ∂s3 h1 (ζ ) + D(∂s2 h1 )(ζ ) z(η) d sˆ 3 d sˆ d d2 z(η) + 3D(∂s h1 )(ζ ) 2 z(η) d sˆ d sˆ d d d3 + 3D 2 (∂s h1 )(ζ ) z(η), z(η) + Dh1 (ζ ) 3 z(η) d sˆ d sˆ d sˆ

+ 2D(∂s2 h1 )(ζ )

d d2 + 3D h1 (ζ ) z(η), 2 z(η) d sˆ d sˆ

2

d d d + D h1 (ζ ) z(η), z(η), z(η) . = O ˆz2 eλ1 ξ , d sˆ d sˆ d sˆ 3

d h1 (ζ ) = Dh1 (ζ )∂ˆi z(η) = O ˆzeλ1 ξ , d zˆ i d2 h1 (ζ ) = Dh1 (ζ )∂ˆi j z(η) + D 2 h1 (ζ )(∂ˆi z(η), ∂ˆ j z(η)) d zˆ i d zˆ j =O

ˆz + 1 eλ1 ξ = O eλ1 ξ , |λ1 |

d3 h1 (ζ ) = Dh1 (ζ )∂ˆi jk z(η) + D 2 h1 (ζ )(∂ˆi z(η), ∂ˆ jk z(η)) d zˆ i d zˆ j d zˆ k + D 2 h1 (ζ )(∂ˆk z(η), ∂ˆi j z(η)) + D 2 h1 (ζ )(∂ˆ j z(η), ∂ˆki z(η)) + D 3 h1 (ζ )(∂ˆi z(η), ∂ˆk z(η), ∂ˆi j z(η)). =O

1 λ1 ξ ˆz 1 λ1 ξ = O . e e + |λ1 |2 |λ1 | |λ1 |


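Taking $\hat z$-derivatives of $\frac{d}{d\hat s}h_1(\zeta)$, $\frac{d^2}{d\hat s^2}h_1(\zeta)$, and the resulting expression for $\frac{d^2}{d\hat z_i\, d\hat s}h_1(\zeta)$, we get
\[
\begin{aligned}
\frac{d^2}{d\hat z_i\, d\hat s} h_1(\zeta) &= D(\partial_s h_1)(\zeta)\hat\partial_i z(\eta) + Dh_1(\zeta)\frac{d}{d\hat s}\hat\partial_i z(\eta) + D^2h_1(\zeta)\Bigl(\frac{d}{d\hat s}z(\eta), \hat\partial_i z(\eta)\Bigr)\\
&= O\bigl(\|\hat z\| e^{\lambda_1\xi}\bigr),\\
\frac{d^3}{d\hat z_i\, d\hat s^2} h_1(\zeta) &= D(\partial_s^2 h_1)(\zeta)\hat\partial_i z(\eta) + 2D(\partial_s h_1)(\zeta)\frac{d}{d\hat s}\hat\partial_i z(\eta) + 2D^2(\partial_s h_1)(\zeta)\Bigl(\frac{d}{d\hat s}z(\eta), \hat\partial_i z(\eta)\Bigr)\\
&\quad + Dh_1(\zeta)\frac{d^2}{d\hat s^2}\hat\partial_i z(\eta) + D^2h_1(\zeta)\Bigl(\frac{d^2}{d\hat s^2}z(\eta), \hat\partial_i z(\eta)\Bigr) + 2D^2h_1(\zeta)\Bigl(\frac{d}{d\hat s}\hat\partial_i z(\eta), \frac{d}{d\hat s}z(\eta)\Bigr)\\
&\quad + D^3h_1(\zeta)\Bigl(\hat\partial_i z(\eta), \frac{d}{d\hat s}z(\eta), \frac{d}{d\hat s}z(\eta)\Bigr) = O\bigl(\|\hat z\| e^{\lambda_1\xi}\bigr),\\
\frac{d^3}{d\hat z_i\, d\hat z_j\, d\hat s} h_1(\zeta) &= D(\partial_s h_1)(\zeta)\hat\partial_{ij} z(\eta) + D^2(\partial_s h_1)(\zeta)\bigl(\hat\partial_i z(\eta), \hat\partial_j z(\eta)\bigr) + Dh_1(\zeta)\frac{d}{d\hat s}\hat\partial_{ij} z(\eta)\\
&\quad + D^2h_1(\zeta)\Bigl(\frac{d}{d\hat s}\hat\partial_i z(\eta), \hat\partial_j z(\eta)\Bigr) + D^2h_1(\zeta)\Bigl(\frac{d}{d\hat s}\hat\partial_j z(\eta), \hat\partial_i z(\eta)\Bigr)\\
&\quad + D^2h_1(\zeta)\Bigl(\frac{d}{d\hat s}z(\eta), \hat\partial_{ij} z(\eta)\Bigr) + D^3h_1(\zeta)\Bigl(\frac{d}{d\hat s}z(\eta), \hat\partial_i z(\eta), \hat\partial_j z(\eta)\Bigr)\\
&= O\Bigl(\Bigl(\frac{\|\hat z\|}{|\lambda_1|} + 1\Bigr) e^{\lambda_1\xi}\Bigr) = O\bigl(e^{\lambda_1\xi}\bigr).
\end{aligned}
\]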

We bound the derivatives of h 2 (ˆs + τ, z(ˆs + τ )) in exactly the same way. ∞ Term E3 . Setting v(τ ) = b1 (τ )T P(τ ) − , we have E3 = zˆ · sˆ v(τ )e(τ −ˆs )A dτ . Using the facts that v is 2-periodic, that its integral vanishes, and that A is negative definite, sˆ

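\[
\begin{aligned}
\int_{\hat s}^{\infty} v(\tau)e^{(\tau-\hat s)A}\,d\tau
&= \int_0^{\infty} v(\hat s+\tau)e^{\tau A}\,d\tau
= \sum_{k=0}^{\infty}\int_0^{2} v(\hat s+\tau)e^{\tau A}\,d\tau\, e^{2kA}\\
&= \int_0^{2} v(\hat s+\tau)e^{\tau A}\,d\tau\,\bigl(1-e^{2A}\bigr)^{-1}
= \int_0^{2} v(\hat s+\tau)\bigl(e^{\tau A}-1\bigr)\,d\tau\,\bigl(1-e^{2A}\bigr)^{-1}.
\end{aligned}
\]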
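The reduction of the integral over $[\hat s, \infty)$ to a single period can be checked numerically. The Python sketch below is only an illustration under stated assumptions: a scalar $\lambda < 0$ stands in for the diagonal matrix $A$, the period is taken to be 2 as printed, and $v(\tau) = \cos(\pi\tau)$ is a hypothetical test function that is 2-periodic with vanishing integral. The helper names are ours, not the authors'.

```python
import cmath
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def tail_integral_exact(s, lam):
    """Closed form of int_0^inf cos(pi (s + t)) exp(lam t) dt for lam < 0."""
    return (-cmath.exp(1j * math.pi * s) / (lam + 1j * math.pi)).real


def tail_integral_one_period(s, lam):
    """One period of v weighted by (e^{t lam} - 1), times (1 - e^{2 lam})^{-1}."""
    def integrand(t):
        return math.cos(math.pi * (s + t)) * (math.exp(t * lam) - 1.0)
    return simpson(integrand, 0.0, 2.0) / (1.0 - math.exp(2.0 * lam))


if __name__ == "__main__":
    for s, lam in [(0.0, -0.5), (0.3, -0.7), (1.2, -3.0)]:
        print(s, lam, tail_integral_exact(s, lam), tail_integral_one_period(s, lam))
```

Both functions should agree up to quadrature error. Subtracting 1 from $e^{\tau\lambda}$ changes nothing because $v$ has zero mean, but it makes the one-period integrand small when $\lambda$ is close to 0.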

Hence,
\[
\frac{d^k}{d\hat s^k}\int_{\hat s}^{\infty} v(\tau)e^{(\tau-\hat s)A}\,d\tau
= \int_0^{2} v^{(k)}(\hat s+\tau)\bigl(e^{\tau A}-1\bigr)\,d\tau\,\bigl(1-e^{2A}\bigr)^{-1}.
\]

Recalling that A is diagonal, we obtain for each value of k the upper bound k ∞ # 2 d # (k) # τλ (τ −ˆs )A = #v # e i − 1 dτ (1 − e2λi )−1 v(τ )e dτ # i d sˆ k i ∞ sˆ # # 0 # (k) # C #v # . (A.20) ∞
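The constant in (A.20) does not degenerate as the eigenvalues approach 0. As a hedged illustration (assuming a single scalar eigenvalue $\lambda < 0$ and the period 2 as printed), the scalar factor $\int_0^2 |e^{\tau\lambda} - 1|\,d\tau\,(1 - e^{2\lambda})^{-1}$ has the closed form $2/(1 - e^{2\lambda}) + 1/\lambda$ and stays between 1 and 2. The function name `ratio` is ours.

```python
import math


def ratio(lam):
    """(int_0^2 |e^{t lam} - 1| dt) / (1 - e^{2 lam}) for lam < 0, in closed form."""
    if lam >= 0:
        raise ValueError("lam must be negative")
    # For lam < 0 the integrand is 1 - e^{t lam}; its integral over [0, 2]
    # equals 2 + (1 - e^{2 lam}) / lam.
    one_minus = -math.expm1(2.0 * lam)  # 1 - e^{2 lam}, accurate for small |lam|
    return 2.0 / one_minus + 1.0 / lam


if __name__ == "__main__":
    for lam in (-1e-4, -1e-2, -1.0, -10.0, -100.0):
        print(lam, ratio(lam))
```

The factor tends to 1 as $\lambda \to 0^-$ and to 2 as $\lambda \to -\infty$, which is consistent with a bound $C\|v^{(k)}\|_\infty$ that is uniform in $\lambda_1$.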

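Incorporating $(s_0, z_0 = 0) \to (\hat s, \hat z)$. We set $z_0 = 0$ and denote $(\hat s, \hat z) = H_k(s_0, 0)$. As $\hat z = z(\hat s) = z(\hat s(s_0), s_0, 0)$, we have $\frac{d^k \hat z}{ds_0^k} = \frac{d^k}{ds_0^k} z(\hat s(s_0), s_0, 0)$. The bounds
\[
\Bigl\|\frac{d^k \hat z}{ds_0^k}\Bigr\| \le C\varepsilon \qquad (1 \le k \le 3)
\]
follow from the fact that $\hat s$ is a $C^3$-function of $s_0$ and the bounds in (A.2). Since $\hat s$ and $\hat z$ are functions of $s_0$, for any function $u = u(\hat s, \hat z)$,
\[
\begin{aligned}
\frac{d}{ds_0} u &= \frac{d\hat s}{ds_0}\partial_{\hat s} u + \frac{d\hat z_i}{ds_0}\partial_{\hat z_i} u,\\
\frac{d^2}{ds_0^2} u &= \frac{d^2\hat s}{ds_0^2}\partial_{\hat s} u + \frac{d^2\hat z_i}{ds_0^2}\partial_{\hat z_i} u + \Bigl(\frac{d\hat s}{ds_0}\Bigr)^2\partial_{\hat s\hat s} u + 2\frac{d\hat s}{ds_0}\frac{d\hat z_i}{ds_0}\partial_{\hat s\hat z_i} u + \frac{d\hat z_i}{ds_0}\frac{d\hat z_j}{ds_0}\partial_{\hat z_i\hat z_j} u,\\
\frac{d^3}{ds_0^3} u &= \frac{d^3\hat s}{ds_0^3}\partial_{\hat s} u + \frac{d^3\hat z_i}{ds_0^3}\partial_{\hat z_i} u + 2\frac{d\hat s}{ds_0}\frac{d^2\hat s}{ds_0^2}\partial_{\hat s\hat s} u
+ 2\Bigl(\frac{d^2\hat s}{ds_0^2}\frac{d\hat z_i}{ds_0} + \frac{d\hat s}{ds_0}\frac{d^2\hat z_i}{ds_0^2}\Bigr)\partial_{\hat s\hat z_i} u + 2\frac{d^2\hat z_i}{ds_0^2}\frac{d\hat z_j}{ds_0}\partial_{\hat z_i\hat z_j} u\\
&\quad + \frac{d^2\hat s}{ds_0^2}\frac{d}{ds_0}\partial_{\hat s} u + \frac{d^2\hat z_i}{ds_0^2}\frac{d}{ds_0}\partial_{\hat z_i} u + \Bigl(\frac{d\hat s}{ds_0}\Bigr)^2\frac{d}{ds_0}\partial_{\hat s\hat s} u
+ 2\frac{d\hat s}{ds_0}\frac{d\hat z_i}{ds_0}\frac{d}{ds_0}\partial_{\hat s\hat z_i} u + \frac{d\hat z_i}{ds_0}\frac{d\hat z_j}{ds_0}\frac{d}{ds_0}\partial_{\hat z_i\hat z_j} u.
\end{aligned}
\]
Here summation over repeated indices is understood and we leave it to the reader to expand the remaining $s_0$-derivatives on the last line. Using the bounds derived earlier, we then get
\[
\begin{aligned}
|h_1(\zeta)| &\le C\|\hat z\|^2 e^{\lambda_1\xi},\\
\Bigl|\frac{d}{ds_0} h_1(\zeta)\Bigr| &\le C\bigl(\|\hat z\|^2 + \varepsilon\|\hat z\|\bigr) e^{\lambda_1\xi},\\
\Bigl|\frac{d^2}{ds_0^2} h_1(\zeta)\Bigr| &\le C\bigl(\|\hat z\|^2 + \varepsilon\|\hat z\| + \varepsilon^2\bigr) e^{\lambda_1\xi},\\
\Bigl|\frac{d^3}{ds_0^3} h_1(\zeta)\Bigr| &\le C\Bigl(\|\hat z\|^2 + \varepsilon\|\hat z\| + \varepsilon^2 + \frac{\varepsilon^3}{|\lambda_1|}\Bigr) e^{\lambda_1\xi}.
\end{aligned}
\]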


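Similar bounds are obtained for $h_2(\zeta)$. We conclude that
\[
\|E_1\|_{C^3} \le C\|b_1^T P\|_{C^3}\frac{\varepsilon^2}{|\lambda_1|^2} \le C\sigma\frac{\varepsilon^2}{|\lambda_1|^2},
\qquad
\|E_2\|_{C^3} \le C\frac{\varepsilon^2}{|\lambda_1|},
\qquad
\|E_3\|_{C^3} \le C\varepsilon\sigma.
\]
The final inequality involving $E_1$ holds because $b_1^T P/\sigma$ is independent of $\sigma$.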

Terms $E_4$ and $E_5$. Writing $E_4$ in the form
\[
E_4 = (\tilde s - \hat s)\int_0^1 b_0\bigl((1-\tau)\tilde s + \tau\hat s\bigr)\,d\tau,
\]
and recalling that $\tilde s$ and $\hat s$ are both $C^3$ functions of $s_0$ allows us to estimate

E4 C 3 C˜s − sˆ C 3 Cε. Proposition 3.1 was used here. Finally, by the same proposition, ¯ C 3 Cε, E5 C 3 = ξ 2 (s0 , 0), d

which finishes the proof.

Lemma A.6. For all s0 and a, we have ∂z0 s∞ (s0 , 0, a) > 0. Proof. Differentiating both sides of (3.11) with respect to z0 , we get ∞ 0 = b0 (s∞ )∂z0 s∞ + eτ A dτ , ∂z0 zˆ + R,

(A.21)

0

where

R = −b0 (ˆs )∂z0 sˆ + + zˆ , 0

∞

∞ 0

(bT1 P − )(ˆs + τ )eτ A dτ , ∂z0 zˆ

2 (bT1 P) (ˆs + τ )eτ A dτ (∂z0 sˆ ) + ∂z0 Ek . k=1

Because (A.20) holds for any periodic, zero-integral function, the two integrals appearing in R are O(σ ) in the limit λ1 → 0. Terms ∂z0 sˆ and ∂z0 zˆ are bounded by (A.10) and (A.11), respectively. Estimating ∂z0 E1 and ∂z0 E2 , we conclude that σε . R = O(σ ) + O |λ1 |2 From (A.21) and (A.11), we have ! " 1 σε −1 A (1 + O(λ1 )) + O(σ ) + O ∂z0 s∞ = b0 (s∞ ) |λ1 |2

(A.22)

n−1 as λ1 → 0. Since A−1 = (− i λi−1 )i=1 , if |λε1 | is sufficiently small, then the first term on the right side of (A.22) dominates and thus ∂z0 s∞ > 0.


Communicated by G. Gallavotti

Commun. Math. Phys. 296, 251–270 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1015-x

Communications in

Mathematical Physics

Uniform Regularity Close to Cross Singularities in an Unstable Free Boundary Problem

John Andersson1, Henrik Shahgholian2, Georg S. Weiss3

1 Mathematics Institute, University of Warwick, Coventry CV4 7AL, UK. E-mail: [email protected]
2 Department of Mathematics, Royal Institute of Technology, 100 44 Stockholm, Sweden. E-mail: [email protected]
3 Graduate School of Mathematical Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo-to, 153-8914, Japan. E-mail: [email protected]

Received: 29 May 2009 / Accepted: 24 November 2009
Published online: 18 February 2010 – © Springer-Verlag 2010

Dedicated to Nina Nikolaevna Uraltseva on the occasion of her 75th birthday

Abstract: We introduce a new method for the analysis of singularities in the unstable problem $\Delta u = -\chi_{\{u>0\}}$, which arises in solid combustion as well as in the composite membrane problem. Our study is confined to points of “supercharacteristic” growth of the solution, i.e. points at which the solution grows faster than the characteristic/invariant scaling of the equation would suggest. At such points the classical theory is doomed to fail, due to incompatibility of the invariant scaling of the equation and the scaling of the solution. In the case of two dimensions our result shows that in a neighborhood of the set at which the second derivatives of $u$ are unbounded, the level set $\{u = 0\}$ consists of two $C^1$-curves meeting at right angles. It is important that our result is not confined to the minimal solution of the equation but holds for all solutions.

Contents
1. Introduction
2. Notation
3. Preliminaries
4. A Newtonian Potential and its Projection
5. Growth of the Solution at Singular Points
6. Controlling the Movement of $\Pi(u(x + r\cdot))$
7. Conclusion
References


1. Introduction

In the last decade, the theory of free boundary regularity of obstacle type has received renewed attention, owing to the seminal paper [4] of L.A. Caffarelli as well as [7]. Many interesting old and new problems, intractable by earlier techniques, have been solved thanks to the ideas in [4] and [7] (see for example [16]). All these problems share a common feature: the scaling of the solution at free boundary points coincides with the characteristic/invariant scaling of the equation. However, there are problems arising in applications for which this does not hold. An example is the unstable obstacle problem
\[
\Delta u = -\chi_{\{u>0\}} \quad \text{in } \Omega \subset \mathbb{R}^n, \tag{1.1}
\]

related to traveling wave solutions in solid combustion with ignition temperature (see the Introduction of [14] for more details), to the composite membrane problem (see [3,8–11,15]), as well as to the shape of self-gravitating rotating fluids (see [6, Eq. (1.26)]). Solutions of Equation (1.1) may exhibit “supercharacteristic” growth of order $r^2|\log r|$ not suggested by the invariant/characteristic scaling $u(rx)/r^2$ of the equation. In this paper we introduce a new method to analyze the fine structure of singular sets close to points of supercharacteristic growth of the solution.

Equation (1.1) has been investigated by R. Monneau and G.S. Weiss in [14]. They establish partial regularity for second order non-degenerate solutions of (1.1). More precisely, they show that the singular set has Hausdorff dimension less than or equal to $n - 2$, and that in two dimensions the free boundary consists, close to points where the second derivative is unbounded, of four Lipschitz graphs meeting at right angles. They also show that in the two-dimensional case energy-minimising solutions are of class $C^{1,1}$ and that their free boundaries are locally analytic. J. Andersson and G.S. Weiss have constructed a cross-shaped counter-example proving that the solution need not be of class $C^{1,1}$ (see [1]). In [14] it has been shown that the second variation of the energy at that particular solution takes the value $-\infty$. In this sense the cross-solution is completely unstable. Moreover, it cannot be obtained by naive numerical schemes.

In this paper we analyze the behavior of solutions at points at which the second derivatives are unbounded. Difficulties in the analysis are:
(i) At cross-like singular points the solution has the “wrong scaling”, i.e. $u(rx)$ scales like $r^2|\log(r)|$, which is different from the characteristic scaling $r^2$ of the equation. The lack of a suitable local Lyapunov functional/monotonicity formula implies that methods like the Łojasiewicz inequality (see for example [17,18]) would be hard to apply even at isolated singularities.
(ii) The cross-like singularities are unstable.
(iii) The comparison principle does not hold.
Instead we use knowledge about the Newtonian potential of the right-hand side to derive a quantitative estimate for the projection of the solution onto the homogeneous harmonic polynomials of degree 2.

Remark 1.1. Although our problem may superficially resemble nodal line problems where one is also interested in the singular set $\{w = 0\} \cap \{\nabla w = 0\}$, there is a fundamental difference: whereas in nodal lines the solution $w$ is sometimes expanded close to


a singular point into a harmonic polynomial and a (relatively) smooth remainder term, and this expansion thereafter quickly leads to regularity, the same is not true for our problem. Let us make this difference more precise for the example of the nodal line paper [5]: In [5, Theorem 4.3], it is shown that the solution of that paper, w, satisfies close to a (more general) singular point x 0 , w = H + , where H is a harmonic polynomial of precise degree N and the remainder satisfies |(x)| ≤ C|x − x 0 | N +δ for some δ > 0. The corresponding statement would in our case be that in a small neighborhood of the singularity x 0 , u = H + , where H is a harmonic polynomial of precise degree 2 and the remainder satisfies |(x)| ≤ C|x − x 0 |2+δ for some δ > 0. But that is by our Lemma 5.3 not true! What we do is a finer expansion of the type u = H +z+ (up to rotation), where z is the singular Newtonian potential of Lemma 4.4. Observe that the Newtonian potential z is a much more sophisticated object than the harmonic polynomial H . A careful analysis leads in the case of two dimensions to the growth estimate Theorem A (i) for the solution as well as an estimate of order

\[
\int_0^r \frac{\sqrt{|\log|\log s||}}{s|\log s|^{3/2}}\,ds \tag{1.2}
\]

for how much the projection of u(x + s·) and also the approximate tangent space of the singular set can turn as s moves from r to 0 (see Theorem A and Remark 1.2). Our main result Theorem A shows that close to a non-degenerate singular point, the level set {u = 0} consists of two C 1 -curves meeting at right angles. We provide estimates for the modulus of the normal of the free boundary close to singular points. Different from the (also two-dimensional) unique tangent cone result [14, Theorem 7.1], the result in the present paper is a quantitative result valid uniformly for a certain class of solutions. Moreover the result in the present paper is not confined to the minimal solution. In the paper [2] in preparation the authors extend these new methods to the case of higher dimensions. Our main result in the present paper is the following (cf. Corollary 5.6 and Corollary 7.1):


Theorem A. Let $u$ be a solution of (1.1) in $\Omega \subset \mathbb{R}^2$ satisfying $\sup_\Omega |u| \le M$. Moreover let $d > 0$. Then there exist an $r_0 = r_0(M,d) > 0$ and a $\delta_0 = \delta_0(M,d) > 0$ such that if $x^0 \in \Omega_d = \{x \in \Omega : \operatorname{dist}(x, \partial\Omega) > d\}$ and
\[
S^u(x^0, r) \equiv \Bigl(\frac{1}{r}\int_{\partial B_r(x^0)} u^2\,d\mathcal{H}^1\Bigr)^{1/2} \ge \frac{r^2}{\delta} \tag{1.3}
\]
for some $\delta \le \delta_0$, $r \le r_0$ and $u(x^0) = |\nabla u(x^0)| = 0$, then:
(i) $\bigl(\frac{1}{\delta} - C(M,d)\bigr)s^2 + c\log(r/s)\,s^2 \le S^u(x^0, s)$ for every $s \le r$.
(ii) There exists a second order homogeneous harmonic polynomial $p^{x^0,u} = p$ such that for each $\alpha \in (0, 1/2)$ and each $\beta \in (0,1)$,
\[
\Bigl\|\frac{u(x^0 + sx)}{\sup_{B_s(x^0)}|u|} - p\Bigr\|_{C^{1,\beta}(\overline{B}_1)} \le C(M,d,\alpha,\beta)\Bigl(\frac{\delta}{1 + \delta\log(r/s)}\Bigr)^{\alpha}. \tag{1.4}
\]
(iii) The set $\{u = 0\} \cap B_r(x^0)$ consists of two $C^1$-curves intersecting each other at right angles at $x^0$.

Remark 1.2. 1) By [14, Lemma 8.5] the estimate Theorem A (i) is sharp. The inequality (1.3) is always satisfied for some $r$ at singular points, that is, points at which the solution $u$ is not $C^{1,1}$. Theorem A thus states that $x^0$ is a singular point if and only if (1.3) is satisfied for some $r$. 2) The left hand side in (1.4) may be estimated by the somewhat sharper term in (1.2) (see the end of the proof of Theorem 6.3).

The proof of (i) in Theorem A is contained in Corollary 5.6, and (ii) and (iii) will be proved in Corollary 7.1.

2. Notation

Throughout this article $\mathbb{R}^n$ will be equipped with the Euclidean inner product $x \cdot y$ and the induced norm $|x|$. We define $e_i$ as the $i$-th unit vector in $\mathbb{R}^n$, and $B_r(x^0)$ will denote the open $n$-dimensional ball of center $x^0$, radius $r$ and volume $r^n\omega_n$. When not specified, $x^0$ is assumed to be 0. We shall often use abbreviations for inverse images like $\{u > 0\} := \{x \in \Omega : u(x) > 0\}$, $\{x_n > 0\} := \{x \in \mathbb{R}^n : x_n > 0\}$, etc., and occasionally we shall employ the decomposition $x = (x_1, \dots, x_n)$ of a vector $x \in \mathbb{R}^n$. Since we are concerned with local regularity we will use the set $\Omega_d := \{x \in \Omega : \operatorname{dist}(x, \partial\Omega) \ge d > 0\}$. We will use the $k$-dimensional Hausdorff measure $\mathcal{H}^k$. When considering a set $A$, $\chi_A$ shall stand for the characteristic function of $A$, while $\nu$ shall typically denote the outward normal to a given boundary.

3. Preliminaries

In this section we state some of the definitions and tools from [20,14] and mention some examples from [1]. First we need the monotonicity formula derived in [20] by G.S. Weiss for a class of semilinear free boundary problems. For the sake of completeness let us state the unstable case here:


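Theorem 3.1 (Monotonicity formula, [20]). Suppose that $u$ is a solution of (1.1) in $\Omega$ and that $B_\delta(x^0) \subset \Omega$. Then for all $0 < \rho < \sigma < \delta$ the function
\[
\Phi^u_{x^0}(r) := r^{-n-2}\int_{B_r(x^0)}\bigl(|\nabla u|^2 - 2\max(u,0)\bigr) - 2\,r^{-n-3}\int_{\partial B_r(x^0)} u^2\,d\mathcal{H}^{n-1},
\]
defined in $(0,\delta)$, satisfies the monotonicity formula
\[
\Phi^u_{x^0}(\sigma) - \Phi^u_{x^0}(\rho) = \int_\rho^\sigma r^{-n-2}\int_{\partial B_r(x^0)} 2\Bigl(\nabla u\cdot\nu - 2\frac{u}{r}\Bigr)^2 d\mathcal{H}^{n-1}\,dr \ge 0.
\]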

The following proposition has been proved in [14, Sect. 5].

Proposition 3.2 (Classification of blow-up limits with fixed center, Prop. 5.1 in [14]). Let $u$ be a solution of (1.1) in $\Omega$ and let us consider a point $x^0 \in \Omega \cap \{u = 0\} \cap \{\nabla u = 0\}$.
(i) In the case $\Phi^u_{x^0}(0+) = -\infty$, $\lim_{r\to 0} r^{-3-n}\int_{\partial B_r(x^0)} u^2\,d\mathcal{H}^{n-1} = +\infty$, and for $S^u(x^0, r) = \bigl(r^{1-n}\int_{\partial B_r(x^0)} u^2\,d\mathcal{H}^{n-1}\bigr)^{1/2}$, each limit of
\[
\frac{u(x^0 + rx)}{S^u(x^0, r)}
\]
as $r \to 0$ is a homogeneous harmonic polynomial of degree 2.
(ii) In the case $\Phi^u_{x^0}(0+) \in (-\infty, 0)$,
\[
u_r(x) := \frac{u(x^0 + rx)}{r^2}
\]
is bounded in $W^{1,2}(B_1(0))$, and each limit as $r \to 0$ is a homogeneous solution of degree 2.
(iii) Else $\Phi^u_{x^0}(0+) = 0$, and
\[
\frac{u(x^0 + rx)}{r^2} \to 0 \quad \text{in } W^{1,2}(B_1(0)) \text{ as } r \to 0.
\]

Remark 3.3. 1. As observed recently by one of the authors, case (ii) is possible even in two dimensions (cf. [2]). 2. Case (iii) is equivalent to $u$ being degenerate of second order at $x^0$.

In [1], the authors have obtained abstract existence of solutions in two dimensions that exhibit cross-like singularities, at which the second derivatives of the solution are unbounded (case (i) of Proposition 3.2), as well as degenerate singularities, at which the solution decays to zero faster than any quadratic polynomial (case (iii) of Proposition 3.2):


Theorem 3.4 (Cross-shaped singularity in two dimensions, Cor. 4.2 in [1]). There exists a solution $u$ of $\Delta u = -\chi_{\{u>0\}}$ in $B_1 \subset \mathbb{R}^2$ that is not of class $C^{1,1}$. Each limit of
\[
\frac{u(rx)}{S^u(0, r)}
\]
as $r \to 0$ coincides after rotation with the function $(x_1^2 - x_2^2)/\|x_1^2 - x_2^2\|_{L^2(\partial B_1(0))}$.

Theorem 3.5 (Existence of a degenerate point, Cor. 4.4 in [1]). There exists a nontrivial solution $u$ of $\Delta u = -\chi_{\{u>0\}}$ in $B_1 \subset \mathbb{R}^2$ that is degenerate of second order at the origin.

4. A Newtonian Potential and its Projection

In what follows we will need the space $P$ of second order homogeneous harmonic polynomials and the projection onto it, which we define now.

Definition 4.1. Let us first define in each dimension $n \ge 2$ the space $P$ of 2-homogeneous harmonic polynomials, i.e. homogeneous harmonic polynomials of degree 2.

Definition 4.2. (i) Let us define the projection $\Pi : W^{2,2}(B_1) \to P$ as follows: for $v \in W^{2,2}(B_1)$, let $\Pi(v)$ be the, by Lemma 4.3 unique, minimizer of
\[
p \mapsto \int_{B_1} |D^2 v - D^2 p|^2
\]
on $P$, where $|A| = \bigl(\sum_{i,j=1}^n a_{ij}^2\bigr)^{1/2}$ is the Frobenius norm of the matrix $A$.
(ii) Let us also define $\tau(v) \ge 0$ by
\[
\Pi(v) = \tau(v)\,p, \qquad p \in P, \quad \sup_{B_1}|p| = 1.
\]

Lemma 4.3. (i) For each $v \in W^{2,2}(B_1)$ the minimizer of Definition 4.2 exists and is unique. Thus $\Pi : W^{2,2}(B_1) \to P$ is well-defined.
(ii) $\Pi$ is a linear operator.
(iii) If $h \in W^{2,2}(B_1)$ is harmonic in $B_1$ then $\Pi(h(x)) = \Pi(h(rx)/r^2)$ for all $r \in (0,1)$.
(iv) For every $v, w \in W^{2,2}(B_1)$,
\[
\sup_{B_1}|\Pi(v + w)| \le \sup_{B_1}|\Pi(v)| + \sup_{B_1}|\Pi(w)|.
\]


Proof. The first and second statement follow from the projection theorem with respect to the $L^2(B_1; \mathbb{R}^{n^2})$-inner product and the linear subspace
\[
\{ f \in L^2(B_1; \mathbb{R}^{n^2}) : f(x) \text{ is symmetric, constant, and } \operatorname{trace}(f) = 0 \}.
\]
Writing $h$ as the sum of homogeneous harmonic polynomials $h_j$ that are orthogonal to each other with respect to
\[
(v, w) := \int_{B_1}\sum_{i,j=1}^n \partial_{ij}v\,\partial_{ij}w,
\]
we see that $\Pi(h_j) = 0$ for all $j$ such that the degree of $h_j$ is different from 2, implying the third statement. The last statement follows from the linearity of $\Pi$ and the triangle inequality in $L^2(B_1; \mathbb{R}^{n^2})$.

In [13] L. Karp and A.S. Margulis derive eigenfunction expansions for generalized Newtonian potentials with respect to a large class of right-hand sides. In the following lemma we calculate explicitly a normalized generalized Newtonian potential of $-\chi_{\{x_1x_2>0\}}$ as well as its projections. Properties (iv), (v) and (vi) in Lemma 4.4 are crucial for what follows.

Lemma 4.4. Define $v : (0,+\infty)\times[0,+\infty) \to \mathbb{R}$ by
\[
v(x_1, x_2) := -4x_1x_2\log(x_1^2 + x_2^2) + 2(x_1^2 - x_2^2)\Bigl(\frac{\pi}{2} - 2\arctan\frac{x_2}{x_1}\Bigr) - \pi(x_1^2 + x_2^2).
\]
Moreover let
\[
w(x_1, x_2) := \begin{cases} v(x_1, x_2), & x_1x_2 \ge 0,\ x_1 \ne 0,\\ -v(-x_1, x_2), & x_1 < 0,\ x_2 \ge 0,\\ -v(x_1, -x_2), & x_1 > 0,\ x_2 \le 0, \end{cases}
\]
and let
\[
z(x_1, x_2) := \frac{w(x_1, x_2) - \pi(x_1^2 + x_2^2) + 8x_1x_2}{8\pi}.
\]
Then $z$ is the unique solution to
(i) $\Delta z = -\chi_{\{x_1x_2>0\}}$ in $\mathbb{R}^2$,
(ii) $z(0) = |\nabla z(0)| = 0$,
(iii) $\lim_{x\to\infty} \frac{z(x)}{|x|^3} = 0$,
(iv) $\Pi(z) = 0$,
(v) $\Pi(z_{1/2}) = \log(2)\,x_1x_2/\pi$,
(vi) $\tau(z_{1/2}) = \log(2)/(2\pi)$.

Proof. A calculation shows that $w$ can be extended to a $C^1$-function and that $\Delta w = -4\pi\chi_{\{x_1x_2>0\}} + 4\pi\chi_{\{x_1x_2<0\}}$. We obtain that $z$ can be extended to a $C^1$-function solving $\Delta z = -\chi_{\{x_1x_2>0\}}$ in $\mathbb{R}^2$ and satisfying (ii) and (iii). Next we show that $h := \Pi(z) = 0$: setting
\[
D^2 h = \begin{pmatrix} a & b\\ b & -a \end{pmatrix},
\]


we obtain
\[
0 = \partial_b\int_{B_1}|D^2 z - D^2 h|^2 = 4\int_{B_1}\partial_{12}(h - z) = 4b - 4\int_{B_1}\partial_{12}z = 4b + 2\int_{B_1}\frac{1 + \log(x_1^2 + x_2^2)}{\pi} = 4b
\]
as well as
\[
0 = \partial_a\int_{B_1}|D^2 z - D^2 h|^2 = 4a,
\]
implying that $h \equiv 0$. Rescaling $z$ we see that
\[
\frac{z(rx_1, rx_2)}{r^2} = z(x_1, x_2) - \frac{x_1x_2\log r^2}{2\pi},
\]
which implies
\[
\Pi(z_{1/2}) = \Pi(z) - \Pi\Bigl(\frac{x_1x_2\log\bigl((1/2)^2\bigr)}{2\pi}\Bigr) = -\log(1/2)\,\Pi(x_1x_2)/\pi = -\log(1/2)\,x_1x_2/\pi.
\]
Thus (v) and (vi) are true. Last, we show uniqueness of $z$ satisfying (i)-(iv). Observe that (v) and (vi) are not needed to show uniqueness. If $z_1$ and $z_2$ are two solutions to (i)-(iv), then by (i), $z_1 - z_2$ is harmonic. Condition (iii) implies that $z_1 - z_2$ is a second order polynomial. Conditions (ii) and (iv) then imply that $z_1 - z_2 = 0$.
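The explicit formulas of Lemma 4.4 lend themselves to a quick numerical sanity check. The Python sketch below is only an illustration, not part of the proof. It implements $v$, $w$ and $z$ as read from the lemma, approximates the Laplacian by a five-point finite difference (our choice of method), and tests property (i) and the rescaling identity at a few sample points away from the coordinate axes. The function names and the step size `h` are assumptions of this sketch.

```python
import math


def v(x1, x2):
    """The function v of Lemma 4.4 (arctan(x2/x1) evaluated literally)."""
    r2 = x1 * x1 + x2 * x2
    return (-4.0 * x1 * x2 * math.log(r2)
            + 2.0 * (x1 * x1 - x2 * x2) * (math.pi / 2.0 - 2.0 * math.atan(x2 / x1))
            - math.pi * r2)


def w(x1, x2):
    """Piecewise definition of w; the line x1 = 0 is excluded here."""
    if x1 * x2 >= 0 and x1 != 0:
        return v(x1, x2)
    if x1 < 0 and x2 >= 0:
        return -v(-x1, x2)
    if x1 > 0 and x2 <= 0:
        return -v(x1, -x2)
    raise ValueError("on x1 = 0 the function is defined only by continuous extension")


def z(x1, x2):
    return (w(x1, x2) - math.pi * (x1 * x1 + x2 * x2) + 8.0 * x1 * x2) / (8.0 * math.pi)


def laplacian(f, x1, x2, h=1e-3):
    """Five-point finite-difference approximation of the Laplacian."""
    return (f(x1 + h, x2) + f(x1 - h, x2) + f(x1, x2 + h) + f(x1, x2 - h)
            - 4.0 * f(x1, x2)) / (h * h)


if __name__ == "__main__":
    for p in [(0.3, 0.4), (-0.3, -0.4), (-0.3, 0.4), (0.3, -0.4)]:
        print(p, laplacian(z, *p))
```

At points with $x_1x_2 > 0$ the approximate Laplacian should be close to $-1$, and close to $0$ where $x_1x_2 < 0$. The rescaling identity with $r = 1/2$ gives $4z(x/2) - z(x) = \log(2)\,x_1x_2/\pi$, which is the source of property (v).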

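5. Growth of the Solution at Singular Points

The next lemma is crucial for all that follows.

Lemma 5.1. Let $u$ solve (1.1) and suppose that $d > 0$, $\sup_\Omega |u| \le M < +\infty$, $x^0 \in \Omega_d$, $u(x^0) = |\nabla u(x^0)| = 0$ and $r \le d/2$. Then
\[
\Bigl(\int_{B_1}\Bigl|D^2\frac{u(x^0 + rx)}{r^2} - D^2\Pi\Bigl(\frac{u(x^0 + rx)}{r^2}\Bigr)\Bigr|^p\Bigr)^{1/p} \le C(n, M, d, p)
\]
and
\[
\Bigl\|\frac{u(x^0 + rx)}{r^2} - \Pi\Bigl(\frac{u(x^0 + rx)}{r^2}\Bigr)\Bigr\|_{C^{1,\beta}(\overline{B}_1)} \le C(n, M, d, \beta).
\]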


Proof. Let u r (x) = BMO, and that

u(x 0 +r x) . From [19, 4.1 Prop. 1] we infer that r2

D 2 u is locally of class

1/2 |D u r − 2

D2u

3r/2 |

≤ C1 ,

2

B3/2

where D2u

3r/2

1 = ωn (3/2)n

D2 ur , B3/2

and C1 is a constant depending only on n, M and d. It follows that 1/2 C1 ≥

|D 2 u r − D 2 u 3r/2 |2

≥

B3/2

2 1/2 2 D u r − D 2 u 3r/2 − 1 trace(D 2 u 3r/2 )I n B3/2 2 1/2 1 2 − , n trace(D u 3r/2 )I B3/2

where I is the identity matrix. Next it is easy to see that 2 1 trace(D 2 u 3r/2 )I ≤ 1, n B3/2 since trace

D2u

3r/2

1 = ωn (3/2)n

u r B3/2

and |u r | ≤ 1. In particular we have 2 1/2 2 1 2 2 C1 + 1 ≥ . D u η − D u 3r/2 − n trace(D u 3r/2 )I B3/2 Using the minimizing property of the projection we get 2 2 D u r − D 2 u 3r/2 − 1 trace(D 2 u 3r/2 )I (C1 + 1)2 ≥ n B3/2 ≥ |D 2 u r − D 2 (u 3r/2 )|2 . B3/2

Observe that if we set v := u r − (u 3r/2 ), then |D 2 v|2 ≤ (C1 + 1)2 , (v) L 2 (B1 ) ≤ C2 , B3/2

(v) L 2 (B3/2 ) ≤ C3 and v − (v) L 2 (B3/2 ) ≤ C4 .


It follows that $D^2(u_r - \Pi(u_r))$ is bounded in $L^2(B_{3/2})$. Moreover, since $\Pi(u_r)$ is harmonic, $\Delta(u_r - \Pi(u_r)) = -\chi_{\{u_r>0\}}$. Poincaré's inequality implies that
\[
\bigl\|u_r - \Pi(u_r) - \overline{\nabla u_r}\cdot x - \overline{u_r}\bigr\|_{W^{2,2}(B_{3/2})} \le \bigl\|D^2u_r - D^2\Pi(u_r)\bigr\|_{L^2(B_{3/2})} \le C_5,
\]
where $\overline{\nabla u_r}$ and $\overline{u_r}$ denote the averages. Thus $L^p$-theory (see for example [12, Th. 9.11]) implies that
\[
\bigl\|u_r - \Pi(u_r) - \overline{\nabla u_r}\cdot x - \overline{u_r}\bigr\|_{W^{2,p}(B_1)} \le C_6.
\]
The embedding into Hölder spaces therefore yields
\[
\bigl\|u_r - \Pi(u_r) - \overline{\nabla u_r}\cdot x - \overline{u_r}\bigr\|_{C^{1,\beta}(\overline{B}_1)} \le C_7.
\]
Using that $u(x^0) = |\nabla u(x^0)| = 0$ and the above estimates implies the statement of the lemma.

Remark 5.2. The above lemma implies in particular that when one of the quantities $\|u\|_{L^\infty(B_r(x^0))}$, $S^u(x^0, r)$ and $\tau(u(x^0 + r\cdot))$ is large in comparison to $r^2$, then all these quantities are comparable. Let us indicate how to prove this: assume that $\tau(u(x^0 + r\cdot)) > \bar C r^2$ for some large constant $\bar C = \bar C(n, M, d)$, then
\[
\begin{aligned}
S^u(x^0, r) &= \Bigl(\frac{1}{r^{n-1}}\int_{\partial B_r(x^0)} u^2\,d\mathcal{H}^{n-1}\Bigr)^{1/2}
\ge \Bigl(\frac{1}{r^{n-1}}\int_{\partial B_r(x^0)} \Pi(u)^2\,d\mathcal{H}^{n-1}\Bigr)^{1/2} - \Bigl(\frac{1}{r^{n-1}}\int_{\partial B_r(x^0)} (u - \Pi(u))^2\,d\mathcal{H}^{n-1}\Bigr)^{1/2}\\
&\ge c(n)\,\tau(u(x^0 + r\cdot)) - C(n, M, d)\,r^2.
\end{aligned}
\]
It follows that if $\bar C > 2C(n, M, d)/c(n)$ then $S^u(x^0, r) > c(n)\,\tau(u(x^0 + r\cdot))/2$. Similarly one may deduce that under the above assumptions $S^u(x^0, r) < C(n)\,\tau(u(x^0 + r\cdot))$ and that the corresponding relationships between the other quantities above hold.

In what follows, we denote by $z(x_1, \dots, x_n) := z(x_1, x_2)$ the solution of Lemma 4.4, extended to $\mathbb{R}^n$.

Lemma 5.3. For each $\epsilon > 0$, $n \in \mathbb{N}$, $d > 0$, $M < +\infty$, $\alpha \in [1, +\infty)$ and $\beta \in (0,1)$ there exist $r_0, \delta > 0$ with the following property: Suppose that $0 < r \le r_0$, $x \in \Omega_d$ and that $u$ is a solution of (1.1) in $\Omega$ satisfying $\sup_\Omega |u| \le M$, $u(x) = |\nabla u(x)| = 0$ and $\mathcal{L}^n\bigl((\{u(x + r\cdot) > 0\}\,\triangle\,\{x_1x_2 > 0\}) \cap B_1\bigr) \le \delta$. Then
\[
\Bigl\|\frac{u(x + r\cdot)}{r^2} - \Pi\Bigl(\frac{u(x + r\cdot)}{r^2}\Bigr) - z\Bigr\|_{C^{1,\beta}(\overline{B}_1)} \le \epsilon.
\]

Proof. Suppose that $r_j \to 0$, that $\mathcal{L}^n\bigl(\{u_j(x_j + r_j\cdot) > 0\}\,\triangle\,\{x_1x_2 > 0\}\bigr) \to 0$ as $j \to \infty$, and that
\[
\frac{u_j(x_j + r_j\cdot)}{r_j^2} - \Pi\Bigl(\frac{u_j(x_j + r_j\cdot)}{r_j^2}\Bigr) \to \tilde z \quad \text{in } C^{1,\beta}_{\mathrm{loc}}(\mathbb{R}^n) \text{ and weakly in } W^{2,\alpha}_{\mathrm{loc}}(\mathbb{R}^n)
\]


as j → ∞ (cf. Lemma 5.1). Now let N˜ be the Newtonian potential of χd u j , i.e. 1 2−n (χ u )(ξ ) dξ, n > 2, d j Rn |y − ξ | n(2−n)ω ˜ n N (y) := 1 n = 2. 2π R2 log |y − ξ |(χd u j )(ξ ) dξ, Next we let N (y) := N˜ (y) − N˜ (x j ) − ∇ N˜ (x j ) · (y − x j ), and consider the harmonic function h(y) := u j (y) − N (y). Since sup |u j | ≤ M, |h| ≤ C2 on ∂ Bd (x j ), and it follows that |D 3 h(y)| ≤ C3 in Bd/2 (x j ), where C3 depends on n, d and M. Consequently |u j (y) − N (y) − D 2 h(x j )(y − x j )(y − x j )| ≤ C4 |x j − y|3 in Bd/2 (x j ), where C4 depends only on n, d and M. For the scaled functions v j (y) := u j (x j + r j y)/r 2j , N j (y) := N (x j + r j y)/r 2j and p j (y) = D 2 h(x j )(y)(y) we obtain |v j (y) − N j (y) − p j (y)| ≤ C4 r j |y|3 in Bd/(2r j ) . Thus v j − (v j ) = N j − (N j ) + o(1) as j → ∞. Passing if necessary to another subsequence j → ∞, the functions N j converge locally to N0 , where N0 = −χ{x1 x2 >0} , N0 (0) = 0, ∇ N0 (0) = 0 and N0 − (N0 ) = z˜ . We need to establish that |N0 (y)| = o(|y|3 ) as |y| → ∞. Once this is established the uniqueness part of Lemma 4.4 implies that z˜ = N0 −(N0 ) = z and the lemma follows. First, D 2 N0 ∈ B M O(Rn ), so that 2 2 R4 D (N0 (Ry)) − D 2 (N0 (R·)) dy ≤ C5 2 sup B R |D N0 | sup B R |D 2 N0 |2 B1 for all R ∈ (0, +∞), where D 2 (N0 (R·)) denotes the mean value of D 2 (N0 (R·)) on B1 . Thus lim sup R→∞ sup B1 |D 2 N0 (R·)|/R 2 = +∞ implies that N0 (Rk ·)/ sup B R |D 2 N0 | converges for a sequence Rk → ∞ k to a 2-homogeneous harmonic polynomial.

(5.1)

Now suppose towards a contradiction that lim sup |y|→∞

|N0 (y)| > 0. |y|3

Then (N0 − z) = 0 in Rn and lim sup |y|→∞

|N0 (y) − z(y)| > 0. |y|3

Thus N0 − z must be a harmonic polynomial of degree m ≥ 3, contradicting (5.1).


Lemma 5.4. Let $n = 2$, $d > 0$ and $M < +\infty$. Then there are $r_0, \delta > 0$ with the following property: Suppose that $0 < r \le r_0$, $x^0 \in \Omega_d$ and that $u$ is a solution of (1.1) in $\Omega$ satisfying $\sup_\Omega |u| \le M$, $u(x^0) = |\nabla u(x^0)| = 0$ and
\[
S^u(x^0, r) \ge \frac{r^2}{\delta},
\]
for some $r \le r_0$. Then
\[
\mathcal{L}^n\bigl((\{u(x^0 + r\cdot) > 0\}\,\triangle\,\{\Pi(u(x^0 + r\cdot)) > 0\}) \cap B_1\bigr) \le C\,\frac{|\log(S^u(x^0, r)/r^2)|}{S^u(x^0, r)/r^2},
\]
where $C = C(d, M, r_0)$.

Proof. Let $u_r(y) := u(x^0 + ry)/r^2$. Then $u_r$ is a solution to (1.1) and $S^{u_r}(0,1) > 1/\delta$. Let $\tau(u_r)\,p_r = \Pi(u_r)$. By Lemma 5.1, $\sup_{B_1}|u_r - \tau(u_r)p_r| \le C$, and we obtain at each point $x \in \{u_r > 0\} \cap \{p_r \le 0\}$ that
\[
|p_r(x)| \le \frac{C}{\tau(u_r)} \le \frac{C_1}{S^{u_r}(0,1)},
\]
where we have used that $S^{u_r}(0,1)$ is comparable to $\tau(u_r)$ (see Remark 5.2). Next we calculate
\[
\begin{aligned}
\mathcal{L}^n\bigl(\{u_r > 0\} \cap \{p_r \le 0\} \cap B_1\bigr)
&\le \mathcal{L}^n\Bigl(\Bigl\{|p_r| \le \frac{C_1}{S^{u_r}(0,1)}\Bigr\} \cap B_1\Bigr)\\
&\le 4\,\mathcal{L}^n\Bigl(\Bigl\{(x_1, x_2) : 0 < x_1 < 1,\ 0 < x_2 < 1,\ x_1x_2 \le \frac{C_1}{S^{u_r}(0,1)}\Bigr\}\Bigr)\\
&= 4\int_0^{C_1/S^{u_r}(0,1)} dx_1 + 4\int_{C_1/S^{u_r}(0,1)}^{1}\frac{C_1}{x_1 S^{u_r}(0,1)}\,dx_1
\le \frac{C\,|\log(S^{u_r}(0,1))|}{S^{u_r}(0,1)}.
\end{aligned}
\]
The lemma follows by scaling back $S^u(x^0, r) = r^2 S^{u_r}(0,1)$.
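The measure computation in the proof rests on the elementary identity $\mathcal{L}^2(\{0 < x_1 < 1,\ 0 < x_2 < 1,\ x_1x_2 \le a\}) = a + \int_a^1 \frac{a}{x_1}\,dx_1 = a(1 + |\log a|)$ for $0 < a \le 1$, which produces the factor $|\log S|/S$ with $a = C_1/S^{u_r}(0,1)$. The following Python sketch compares this closed form with a midpoint-rule quadrature. It is a minimal illustration, and the function names and grid size are our own choices.

```python
import math


def area_closed_form(a):
    """Area of {0 < x1 < 1, 0 < x2 < 1, x1 * x2 <= a} for 0 < a <= 1."""
    return a * (1.0 + math.log(1.0 / a))


def area_midpoint(a, n=200000):
    """Midpoint rule for the slice length x1 -> min(1, a / x1) over (0, 1)."""
    h = 1.0 / n
    return sum(min(1.0, a / ((i + 0.5) * h)) for i in range(n)) * h


if __name__ == "__main__":
    for a in (0.5, 0.1, 0.01):
        print(a, area_closed_form(a), area_midpoint(a))
```

For small $a$ the logarithmic term dominates, so the area behaves like $a|\log a|$ rather than $a$. This is the origin of the logarithm in the estimate of Lemma 5.4.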

Lemma 5.5. Let $n = 2$. For each $\gamma \in (0, \log(2)/(2\pi))$, $d > 0$ and $M < +\infty$ there are $r_0, \delta > 0$, depending only on $\gamma$, $d$ and $M$, with the following property: Suppose that $0 < r \le r_0$, $x^0 \in \Omega_d$ and that $u$ is a solution of (1.1) in $\Omega$ satisfying $\sup_\Omega |u| \le M$, $u(x^0) = |\nabla u(x^0)| = 0$ and for some $r \le r_0$,
\[
S^u(x^0, r) \ge \frac{r^2}{\delta}.
\]
Then
\[
\tau\bigl(4u(x^0 + r\cdot/2)/r^2\bigr) \ge \tau\bigl(u(x^0 + r\cdot)/r^2\bigr) + \gamma.
\]

Proof. Suppose towards a contradiction that
\[
\tau\bigl(4u_j(x_j + r_j\cdot/2)/r_j^2\bigr) < \tau\bigl(u_j(x_j + r_j\cdot)/r_j^2\bigr) + \gamma
\]
for a sequence $u_j$ satisfying the assumptions with $\delta = \delta_j \to 0$ as $j \to \infty$. Let $v_j := u_j(x_j + r_j\cdot)/r_j^2$. A straightforward calculation shows that $v_j$ solves (1.1) and that
\[
S^{v_j}(0,1) \ge \frac{1}{\delta_j}.
\]


From Lemma 5.4 it follows that $\mathcal{L}^n\bigl((\{v_j > 0\}\,\triangle\,\{\Pi(v_j) > 0\}) \cap B_1\bigr) \to 0$. We may apply Lemma 5.3 and deduce that, after a rotation of the coordinate system, $v_j - \Pi(v_j) \to z$ weakly in $W^{2,\alpha}(B_1)$ and strongly in $C^{1,\beta}(\overline{B}_1)$ as $j \to \infty$, and that therefore (rotating each $v_j$ only slightly more) $\Pi(v_j) = M_j x_1x_2$ with $M_j \to +\infty$ as $j \to \infty$. Defining $f_{1/2}(y) := 4f(y/2)$, it follows from Lemma 4.4 (v) that $\Pi((v_j)_{1/2} - M_j x_1x_2) \to \Pi(z_{1/2}) = \log(2)\,x_1x_2/\pi$ as $j \to \infty$. On the other hand, $\tau((v_j)_{1/2}) < \tau(v_j) + \gamma$, so that
\[
(\log(2)/\pi + M_j)/2 = \tau\bigl((\log(2)/\pi + M_j)x_1x_2\bigr) = o(1) + \tau((v_j)_{1/2}) < o(1) + \tau(v_j) + \gamma = o(1) + M_j/2 + \gamma,
\]
a contradiction for large $j$.

The next corollary proves the first statement in Theorem A and is fundamental for the rest of the paper.

Corollary 5.6. Let $n = 2$. Fix a $\gamma \in (0, \log(2)/2\pi)$, and let $u$ satisfy the assumptions in Lemma 5.5 for some $r \le r_0$ (with possibly somewhat smaller $\delta$). Then
\[
\tau\bigl(2^{2j}u(x^0 + 2^{-j}r\cdot)/r^2\bigr) \ge \tau\bigl(u(x^0 + r\cdot)/r^2\bigr) + j\gamma
\]
for all $j \in \mathbb{N}$. Moreover, for each $s \le r$,
\[
\frac{S^u(x^0, s)}{s^2} \ge \frac{S^u(x^0, r)}{r^2} + c\gamma\,\frac{\log(r/s)}{\log(2)} - 2C,
\]
where $c = \|x_1x_2\|_{L^2(\partial B_1)}$, and $C = C(M, d, r_0)$.

Proof. Since by Lemma 5.1,
\[
\sup_{B_1}\Bigl|\frac{u(x^0 + rx)}{r^2} - \Pi\Bigl(\frac{u(x^0 + rx)}{r^2}\Bigr)\Bigr| \le C_0,
\]
it follows that for $s \le r$, $u_s(x) = u(x^0 + sx)/s^2$ and $c = \|x_1x_2\|_{L^2(\partial B_1)}$,
\[
S^{u_s}(0,1) - \sqrt{2\pi}\,C_0 \le \Bigl(\int_{\partial B_1}|\Pi(u_s)|^2\,d\mathcal{H}^1\Bigr)^{1/2} + \Bigl(\int_{\partial B_1}|u_s - \Pi(u_s)|^2\,d\mathcal{H}^1\Bigr)^{1/2} - \sqrt{2\pi}\,C_0 \le c\,\tau(u_s). \tag{5.2}
\]
Similarly it follows that
\[
\tau(u_s)\Bigl(\int_{\partial B_1}(x_1x_2)^2\,d\mathcal{H}^1\Bigr)^{1/2} \le S^{u_s}(0,1) + \sqrt{2\pi}\,C_0. \tag{5.3}
\]
From Lemma 5.5 we infer that if $S^u(x^0, r)/r^2 \ge 1/\sigma$ with $\sigma < \delta$ and $\delta$ is as in Lemma 5.5, then $\tau_{r/2} \ge \tau_r + \gamma$. Here we use the shorthand $\tau_r \equiv \tau(u(x + r\cdot)/r^2)$. From inequalities (5.2) and (5.3) we see that
\[
\frac{S^u(x^0, r/2)}{(r/2)^2} \ge (\tau_r + \gamma)c - \sqrt{2\pi}\,C_0 \ge \frac{S^u(x^0, r)}{r^2} + \gamma c - 2\sqrt{2\pi}\,C_0, \tag{5.4}
\]


J. Andersson, H. Shahgholian, G. S. Weiss

where c is the constant in the statement of the corollary. In particular, if σ has been chosen small enough, say 1/σ > 1/δ + 2C1 , then u satisfies the assumptions of Lemma 5.5 in Br/2 . We may thus apply Lemma 5.5 again and deduce that S u (x 0 , r/4) ≥ (τ + 2γ )c − 2 2C0 π . r (r/4)2 Applying Lemma 5.5 j times, we arrive at S u (x 0 , r/2 j ) S u (x 0 , r ) ≥ (τ + jγ )c − C ≥ + cγ j − 2C1 . r 1 (r/2 j )2 r2

√ Notice that since τ2− j r is increasing in j and thus S u (x 0 , 2− j r ) ≥ τr − 2 2C0 π for each j and the assumptions of Lemma 5.5 are therefore satisfied for each j. If we put s = 2− j r then j = log(r/s)/ log(2) and we obtain the statement in the corollary. For general s ≤ r we may consider a j such that 2−( j+1)r < s ≤ 2− j r . Using Lemma 5.1, u(x 0 + 2− j r x) u(x 0 + 2− j r x) − ≤ C2 , 1,β − j 2 − j 2 C (B 1 ) (2 r ) (2 r ) and it follows that S u (x 0 , s) S u (x 0 , 2− j r ) − ≤ C3 . s2 (2− j r )2 The corollary follows with a slightly larger constant C. 6. Controlling the Movement of (u(x + r·)) In this section we will exploit the estimate in Corollary 5.6 to obtain control of how much the projection of u(x + r ·) can turn when passing to a smaller radius r . Lemma 6.1. Let n = 2, d > 0 and M < ∞. Then there is r0 , δ > 0 with the following property: Suppose that 0 < r ≤ r0 , x 0 ∈ d and that u is a solution of (1.1) in satisfying sup |u| ≤ M, u(x) = |∇u(x)| = 0 and S u (x 0 , r ) 1 ≥ . 2 r δ Let g be the solution of g = χ{(u(x+r ·))>0} − χ{u(x+r ·)>0} in B1 , g = 0 on ∂ B1 . Then (i)

D g L 2 (B1 ) ≤ C 2

| log(S u (x 0 , r )/r 2 )| . S u (x 0 , r )/r 2


(ii) τ (g) ≤ C

| log(S u (x 0 , r )/r 2 )| , S u (x 0 , r )/r 2

where C = C(d, M, r0 ). Proof. (i) follows from Lemma 5.4 and L 2 -theory (see for example [12, Th. 8.8]). (ii) Rotating and setting p := (g) = a1 x12 + a2 x22 , we obtain | log(S u (x 0 , r )/r 2 )| D 2 p L 2 (B1 ) ≤ C1 D 2 g L 2 (B1 ) ≤ C2 S u (x 0 , r )/r 2 and

|a j | ≤ C3

| log(S u (x 0 , r )/r 2 )| S u (x 0 , r )/r 2

for j = 1, 2. The next proposition already contains the desired estimate for how much the projection may turn when passing from u(x 0 + r ·) to u(x 0 + r · /2). Proposition 6.2. Let n = 2, d > 0 and M < +∞. Then there are r0 , δ > 0 with the following property: Suppose that 0 < r ≤ r0 , x 0 ∈ d and that u is a solution of (1.1) in satisfying sup |u| ≤ M, u(x) = |∇u(x)| = 0 and S u (x 0 , r ) 1 ≥ . r2 δ Then (u(x + r ·)) (u(x + r · /2)) | log(|S u (x 0 , r )/r 2 |)| sup − ≤C 3/2 , sup B1 |(u(x + r · /2))| B1 sup B1 |(u(x + r ·))| S u (x 0 , r )/r 2 where C = C(n, M, d). Proof. Let us consider v = u r − z ◦ Q r − h r − τ (u r ) pr , where u r (y) = u(x + r y)/r 2 , z is the function defined in Lemma 4.4, (u r ) = τ (u r ) pr , the orthogonal matrix Q r has been chosen such that {(u r ) > 0} = {(x1 x2 ) ◦ Q r > 0} (we may assume that Q r = I , the identity matrix), h r = h(r y)/r 2 , and h is harmonic and satisfies h(x) ≤ C1 |x|3 . It ˜ where g is the solution of follows that (v) = 0. Moreover we may express v = g + h, ˜ ˜ Lemma 6.1 and h is harmonic. Lemma 6.1 (ii) implies now that for h˜ 1/2 (y) = 4h(y/2), g1/2 (y) = 4g(y/2) and v1/2 (y) = 4v(y/2), sup |(v1/2 )| = sup |(h˜ 1/2 + g1/2 )| ≤ sup |(g1/2 )| B1

B1

B1

+ sup |(h˜ 1/2 )| ≤ sup |(h˜ 1/2 )| + C2 B1

B1

| log(S u (x 0 , r )/r 2 )| . S u (x 0 , r )/r 2


u (x 0 ,r )/r 2 )| ˜ ≤ |(g)| ≤ C2 | log(S Since (v) = 0 we also know that |(h)| . On the S u (x 0 ,r )/r 2 ˜ ˜ ˜ other hand, using that h is harmonic and Lemma 4.3 (iii), (h) = (h 1/2 ) so that sup | u r/2 − z 1/2 − h r/2 − τ (u r ) pr | = sup |(v1/2 )| ≤ 2C2 B1

B1

| log(S u (x 0 , r )/r 2 )| . S u (x 0 , r )/r 2

From the linearity of , |h(x)| ≤ C3 |x|3 and Lemma 4.4 we infer that sup |(u r/2 ) − (τ (u r ) + log(2)/(2π )) pr | B1

≤ 2C2

| log(S u (x 0 , r )/r 2 )| + sup |(h r/2 )| ≤ C4 S u (x 0 , r )/r 2 B1

| log(S u (x 0 , r )/r 2 )| ; (6.1) S u (x 0 , r )/r 2

here we also used that sup B1 |(h r/2 )| ≤ C4 r , which can be absorbed in the last term since S u (x 0 , r )/r 2 is large by assumption. From (6.1) we conclude that (u r/2 ) (u r ) − sup sup |(u )| sup |(u )| r r/2 B1 B1 B1 (u r ) (τ (u r ) + log(2)/(2π )) pr | log(S u (x 0 , r )/r 2 )| ≤ sup − + C6 3/2 , sup B1 |(u r/2 )| B1 sup B1 |(u r )| S u (x 0 , r )/r 2 where we also used sup B1 |(u r/2 )| ≥ C7 S u (x 0 , r )/r 2 (cf. Remark 5.2). Next we make the following estimate, which together with the previous estimate yields the conclusion of the proposition: τ (u ) p (τ (u r ) + log(2)/(2π )) pr r r sup − sup B1 |(u r/2 )| B1 τ (u r ) τ (u r ) + log(2)/(2π ) τ (u r ) pr (τ (u ) + log(2)/(2π )) p r r + ≤ sup − − 1 sup |(u )| τ (u ) (τ (u ) + log(2)/(2π )) r r r/2 B1 B1 1 | log(S u (x 0 , r )/r 2 )| ≤ C8 u 0 , 2 S (x , r )/r S u (x 0 , r )/r 2 where we have used (6.1) to estimate

| log(S u (x 0 , r )/r 2 )| , S u (x 0 , r )/r 2 B1 τ (u ) + log(2)/(2π ) 1 | log(S u (x 0 , r )/r 2 )| r − 1 ≤ C9 u 0 . sup B1 |(u r/2 )| S (x , r )/r 2 S u (x 0 , r )/r 2

| sup |(u r/2 )| − (τ (u r ) + log(2)/(2π ))| ≤ C4


Theorem 6.3. Let n = 2, d > 0 and suppose that u solves (1.1) and that sup |u| ≤ M < +∞. Then there exists a δ = δ(M, d) > 0 and an r0 = r0 (M, d) > 0 such that if x 0 ∈ d and S u (x 0 , r ) 1 ≥ r2 δ for some r ≤ r0 then for each α ∈ (0, 1/2) and all s ≤ r , α r2 (u(x 0 + r x)) (u(x 0 + sx)) . sup − ≤ C(d, M, α) 0 sup B1 |(u(x 0 + sx))| S u (x 0 , r ) B1 sup B1 |(u(x + r x))| Proof. For simplicity we will only prove the theorem for s = 2− j r ; for general s we may use the estimate in Lemma 5.1 as indicated in the proof of Corollary 5.6. Let us choose δ small enough so that Corollary 5.6 holds for some fixed γ > 0, i.e. S u (x 0 , 2− j r ) S u (x 0 , r ) ≥ + cγ j − 2C. 2−2 j r 2 r2

(6.2)

Decreasing δ somewhat more if necessary, we see that (6.2) implies that the assumptions in Proposition 6.2 hold for every ball B2− j r (x 0 ). Using the triangle inequality we obtain that (u(x 0 + r x)) (u(x 0 + 2− j r x)) sup sup − 0 sup B1 |(u(x 0 + 2− j r x))| j B1 sup B1 |(u(x + r x))| ∞ (u(x 0 + 2− j r x)) (u(x 0 + 2− j−1r x)) ≤ − sup . 0 − j 0 − j−1 sup |(u(x + 2 r x))| sup |(u(x + 2 r x))| B1 B1 B1 j=0

This sum may be estimated, by Proposition 6.2, from above by
\[ \sum_{j=0}^{\infty} \frac{\log\big(S^u(x^0, 2^{-j}r)/(2^{-2j}r^2)\big)}{\big(S^u(x^0, 2^{-j}r)/(2^{-2j}r^2)\big)^{3/2}}. \tag{6.3} \]
Let us set $k$ to be the smallest integer satisfying
\[ k \ge \frac{1}{c\gamma}\Big(\frac{S^u(x^0, r)}{r^2} - 2C\Big). \]
For $S^u(x^0, r)/r^2$ large enough we see that
\[ k > c_1 \frac{S^u(x^0, r)}{r^2}. \]
Using (6.2) we may estimate (6.3) by
\[ C_2 \sum_{j=k}^{\infty} \frac{\log(c\gamma j)}{(c\gamma j)^{3/2}} \le C_3 \int_k^{\infty} \frac{\log(c\gamma t)}{(c\gamma t)^{3/2}}\,dt \le C_4 \frac{2 + \log k}{\sqrt{k}} \le C_5(\alpha)\,k^{-\alpha} \tag{6.4} \]
for each $\alpha \in (0, 1/2)$. Using (6.4) gives the Theorem.
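The decay used in (6.4) can be checked numerically. The sketch below is only an illustration under the simplifying assumption $c\gamma = 1$ (the constants $C_2$ to $C_5$ are dropped); the function names are hypothetical. It uses the closed form $\int_m^\infty \log(t)\,t^{-3/2}\,dt = (2\log m + 4)/\sqrt{m}$ and the fact that $\log(t)\,t^{-3/2}$ is decreasing for $t \ge 2$, so the tail sum from $k$ is bounded by the integral from $k-1$.

```python
import math

def tail_sum(k: int, terms: int = 20000) -> float:
    """Partial sum of log(j) / j**1.5 for j = k, ..., k + terms - 1."""
    return sum(math.log(j) / j ** 1.5 for j in range(k, k + terms))

def integral_bound(k: int) -> float:
    """Exact value of the integral of log(t) / t**1.5 over [k - 1, infinity)."""
    m = k - 1
    return (2.0 * math.log(m) + 4.0) / math.sqrt(m)

def scaled_bound(k: int, alpha: float) -> float:
    """integral_bound(k) * k**alpha; stays bounded in k when alpha < 1/2."""
    return integral_bound(k) * k ** alpha

if __name__ == "__main__":
    for k in (4, 50, 1000):
        print(k, tail_sum(k), integral_bound(k))
```

The exponent restriction $\alpha < 1/2$ is visible here: `scaled_bound(k, alpha)` behaves like $k^{\alpha - 1/2}\log k$, which tends to zero only for $\alpha < 1/2$.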


7. Conclusion Corollary 7.1. Under the assumptions in Theorem 6.3 the following holds: (i) there exists a homogeneous harmonic polynomial p x ,u = p of second order such that for each α ∈ (0, 1/2) and each β ∈ (0, 1/2), α u(x 0 + sx) δ − p 1,β ≤ C(d, M, α, β) . C (B 1 ) sup Bs (x 0 ) |u| 1 + δ log(r/s) 0

(ii) The set {u = 0} ∩ Br (x 0 ) consists of two C 1 -curves intersecting each other at right angles at x 0 . Proof. From Corollary 5.6 we know that for each s ≤ r , 1 S u (x 0 , s) ≥ c1 + log(r/s) . s2 δ

(7.1)

It follows from Theorem 6.3 that (u(x 0 + sx)) 0 = p x ,u ≡ p 0 s→0 sup B |(u(x + sx))| 1

(7.2)

lim

exists. Using Lemma 5.1 gives u(x 0 + sx) (u(x 0 + sx)) − C2 ≥ 1,β s2 s2 C u(x 0 + sx) sup Bs (x 0 ) |u| sup Bs (x 0 ) |u| (u(x 0 + sx)) ≥ − p − p− 1,β s2 s2 s2 s2 C 1,β C 0 sup Bs (x 0 ) |u| u(x + sx) − p = sup Bs (x 0 ) |u| 1,β s2 C 0 + sx))| sup |(u(x 0 (u(x 0 + sx)) B1 (x ) − p − . (7.3) sup Bs (x 0 ) |u| sup B1 |(u(x 0 + sx))| 1,β C

As a direct consequence of Lemma 5.1 we obtain sup |(u(x 0 + sx))| C3 s 2 B1 − 1 ≤ . sup Bs (x 0 ) |u| sup Bs (x 0 ) |u| This, together with Theorem 6.3, implies that sup B1 |(u(x 0 + sx))| (u(x 0 + sx)) p − sup Bs (x 0 ) |u| sup B1 |(u(x 0 + sx))|

C 1,β

≤ C4

s2 S u (x 0 , s)

α .

Rearranging terms in (7.3) we get α α u(x 0 + sx) s2 δ ≤ C(d, M, α, β) . − p 1,β ≤ C5 C sup Bs (x 0 ) |u| S u (x 0 , s) 1 + δ log(r/s) This proves (i).


Rotating the coordinate system we may assume that p x ,u = p = 2x1 x2 . The first part of the corollary implies that α δ 0 ≡ K s− , u(x + s·) < 0 in (x1 , x2 ) ∈ B1 : x1 x2 ≤ −C(d, M, α, β) 1 + δ log(r/s) 0

that u(x 0 + s·) > 0 in and that

(x1 , x2 ) ∈ B1 : x1 x2 ≥ C(d, M, α, β)

δ 1 + δ log(r/s)

α

≡ K s+ ,

u(x 0 + sx) ∂θ ≥ c6 |x| in B1 \ (K s− ∪ K s+ ). sup Bs (x 0 ) |u|

From the implicit function theorem it follows that, for each > 0, {u = 0} consists of four C 1 -curves in Bs (x 0 ) \ Bs/2 (x 0 ). To show that {u = 0} consists of two C 1 -curves we only need to show that these four curves are differentiable at x 0 and that their derivatives match. The normal ν of {u = 0} will point in the same (or opposite) direction as ∇u at any point of Bs (x 0 ) \ {x 0 } ∩ {u = 0}. Let us consider a point x 0 + sx of {u = 0} such that x2 = 1 and |x1 | ≤ 1: from (i) it follows that at the point x 0 + sx, 0 ∇ u(x + sx) ∇ u(x 0 + sx) = − 2∇(x1 x2 ) + 2∇(x1 x2 ) sup Bs (x 0 ) |u| sup Bs (x 0 ) |u| = 2e1 + terms of order

δ 1 + δ log(r/s)

α

.

By a similar argument for each of the four components of $\{u = 0\} \cap B_s(x^0) \setminus \{x^0\}$ it follows that each component is a $C^1$-curve with modulus of continuity $\sigma(s) = C_7(\log(r/s))^{-\alpha}$ and that each component approaches $x^0$ tangentially relative to the $x_1$- or $x_2$-axis. This proves (ii).

Acknowledgements. H. Shahgholian has been supported in part by the Swedish Research Council. G.S. Weiss has been partially supported by the Grant-in-Aid 18740086 of the Japanese Ministry of Education, Culture, Sports, Science and Technology. He also thanks the Knut och Alice Wallenberg foundation for a visiting appointment to KTH. Both J. Andersson and G.S. Weiss thank the Göran Gustafsson Foundation for visiting appointments to KTH. The present result is part of the ESF-program GLOBAL. It was completed while the first two authors were visiting the Petroleum Institute in Abu Dhabi.

References
1. Andersson, J., Weiss, G.S.: Cross-shaped and degenerate singularities in an unstable elliptic free boundary problem. J. Diff. Eqs. 228(2), 633–640 (2006)
2. Andersson, J., Shahgholian, H., Weiss, G.S.: In preparation
3. Blank, I.: Eliminating mixed asymptotics in obstacle type free boundary problems. Comm. Part. Diff. Eqs. 29(7–8), 1167–1186 (2004)
4. Caffarelli, L.A.: The obstacle problem revisited. J. Fourier Anal. Appl. 4(4–5), 383–402 (1998)
5. Caffarelli, L.A., Friedman, A.: The free boundary in the Thomas-Fermi atomic model. J. Diff. Eqs. 32(3), 335–356 (1979)
6. Caffarelli, L.A., Friedman, A.: The shape of axisymmetric rotating fluid. J. Funct. Anal. 35(1), 109–142 (1980)
7. Caffarelli, L.A., Karp, L., Shahgholian, H.: Regularity of a free boundary with application to the Pompeiu problem. Ann. of Math. (2) 151(1), 269–292 (2000)
8. Chanillo, S., Grieser, D., Imai, M., Kurata, K., Ohnishi, I.: Symmetry breaking and other phenomena in the optimization of eigenvalues for composite membranes. Commun. Math. Phys. 214(2), 315–337 (2000)
9. Chanillo, S., Grieser, D., Kurata, K.: The free boundary problem in the optimization of composite membranes. In: Differential Geometric Methods in the Control of Partial Differential Equations (Boulder, CO, 1999), Volume 268 of Contemp. Math., Providence, RI: Amer. Math. Soc., 2000, pp. 61–81
10. Chanillo, S., Kenig, C.E.: Weak uniqueness and partial regularity for the composite membrane problem. J. Eur. Math. Soc. 10, 705–737 (2007)
11. Chanillo, S., Kenig, C.E., To, T.: Regularity of the minimizers in the composite membrane problem in R2. J. Funct. Anal. 255(9), 2299–2320 (2008)
12. Gilbarg, D., Trudinger, N.S.: Elliptic partial differential equations of second order. Volume 224 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Berlin: Springer-Verlag, second edition, 1983
13. Karp, L., Margulis, A.S.: Newtonian potential theory for unbounded sources and applications to free boundary problems. J. Anal. Math. 70, 1–63 (1996)
14. Monneau, R., Weiss, G.S.: An unstable elliptic free boundary problem arising in solid combustion. Duke Math. J. 136(2), 321–341 (2007)
15. Shahgholian, H.: The singular set for the composite membrane problem. Commun. Math. Phys. 271(1), 93–101 (2007)
16. Shahgholian, H., Uraltseva, N., Weiss, G.S.: The two-phase membrane problem—regularity of the free boundaries in higher dimensions. Int. Math. Res. Not. IMRN, (8):Art. ID rnm026, 16 (2007)
17. Simon, L.: Asymptotics for a class of nonlinear evolution equations, with applications to geometric problems. Ann. of Math. (2) 118(3), 525–571 (1983)
18. Simon, L.: Theorems on Regularity and Singularity of Energy Minimizing Maps. Based on lecture notes by Norbert Hungerbühler, Lectures in Mathematics ETH Zürich. Basel: Birkhäuser Verlag, 1996
19. Stein, E.M.: Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory Integrals. With the assistance of Timothy S. Murphy, Monographs in Harmonic Analysis, III. Volume 43 of Princeton Mathematical Series. Princeton, NJ: Princeton University Press, 1993
20. Weiss, G.S.: Partial regularity for weak solutions of an elliptic free boundary problem. Comm. Part. Diff. Eqs. 23(3–4), 439–455 (1998)

Communicated by P. Constantin

Commun. Math. Phys. 296, 271–283 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-0990-2

Communications in

Mathematical Physics

Limit of Quasilocal Mass at Spatial Infinity

Mu-Tao Wang1, Shing-Tung Yau2

1 Department of Mathematics, Columbia University, New York, NY 10027, USA. E-mail: [email protected]
2 Department of Mathematics, Harvard University, Cambridge, MA 02138, USA

Received: 8 June 2009 / Accepted: 6 October 2009
Published online: 4 February 2010 – © Springer-Verlag 2010

Abstract: We study the limit of quasilocal mass defined in [4, 5] for a family of spacelike 2-surfaces in spacetime. In particular, we show the limit coincides with the ADM mass at spatial infinity. The limit for coordinate spheres of a boosted slice of the Schwarzschild solution is computed explicitly and shown to give the expected energy-momentum four-vector.

The first author is supported by NSF grant DMS-0605115. The second author is supported by NSF grant DMS-0628341.

1. Review of the Definition of Quasilocal Energy

In [4, 5], we define a notion of quasilocal mass for spacelike 2-surfaces in a spacetime. Given an isometric embedding of a 2-surface into $\mathbb{R}^{3,1}$ and a future timelike unit vector (observer) in $\mathbb{R}^{3,1}$, we associate a quasilocal energy with respect to a canonical gauge. Minimizing among the reference data gives the quasilocal mass and the quasilocal energy-momentum four-vector. We prove that the mass has the important positivity property when the 2-surface bounds a non-singular hypersurface that satisfies the dominant energy condition and it vanishes for surfaces in $\mathbb{R}^{3,1}$. The expression for the mass is nevertheless rather nonlinear and complicated. In this article, we show that for a family of surfaces going out to spatial infinity, the expression indeed gets "linearized" and gives a well-defined energy-momentum four-vector.

First of all, we recall the definition of quasilocal energy in [4]. Let $\Sigma$ be a spacelike 2-surface in a time-orientable spacetime $N$. Consider a reference isometric embedding of $\Sigma$ into $\mathbb{R}^{3,1} = \hat N$. Fix a future timelike unit vector $\hat t^\nu$ in $\mathbb{R}^{3,1}$. We decompose $\hat t^\nu$ along $\Sigma \subset \mathbb{R}^{3,1}$ into $\hat t^\nu = \hat N \hat u^\nu + \hat N^\nu$, in which $\hat N$ is the lapse function, $\hat N^\nu$ is the shift vector, and $\hat u^\nu$ is the future timelike unit normal vector field along $\Sigma \subset \mathbb{R}^{3,1}$ determined by this decomposition. We also take the spacelike outward pointing unit normal $\hat v^\nu$ that is orthogonal to $\hat u^\nu$ along $\Sigma \subset \mathbb{R}^{3,1}$. $(\hat u^\nu, \hat v^\nu)$ is the reference gauge for $\Sigma \subset \mathbb{R}^{3,1}$ with respect to $\hat t^\nu$. To compute the quasilocal energy, we also need the canonical gauge $(u^\nu, v^\nu)$ along $\Sigma \subset N$. $u^\nu$ is characterized as the unique future timelike unit normal vector field along $\Sigma \subset N$ such that
\[ h_\nu u^\nu = \hat h_\nu \hat u^\nu, \tag{1.1} \]

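where $h^\nu$ is the mean curvature vector of $\Sigma \subset N$ and $\hat h^\nu$ is the mean curvature vector of $\Sigma \subset \mathbb{R}^{3,1}$. $v^\nu$ is the spacelike unit normal vector that is orthogonal to $u^\nu$ and satisfies $v_\nu h^\nu < 0$. Take a spacelike hypersurface $\hat\Omega \subset \mathbb{R}^{3,1}$ spanned by $\Sigma \subset \mathbb{R}^{3,1}$ and $\hat v^\nu$, and a spacelike hypersurface $\Omega \subset N$ spanned by $\Sigma \subset N$ and $v^\nu$. Let $\hat k$ be the mean curvature of $\Sigma$ with respect to $\hat\Omega$ and $k$ be the mean curvature of $\Sigma$ with respect to $\Omega$. Also denote by $\hat K_{\mu\nu}$ and $K_{\mu\nu}$ the extrinsic curvatures of $\hat\Omega$ and $\Omega$, respectively. These data depend only on the gauges along $\Sigma$ but not on the hypersurfaces. Quasilocal energy in the canonical gauge (see Eq. (6) in [4]) is defined to be
\[ \frac{1}{8\pi}\int_\Sigma (\hat k - k)\hat N - (\hat v^\mu \hat K_{\mu\nu} - v^\mu K_{\mu\nu})\hat N^\nu. \tag{1.2} \]
We shall rewrite the quasilocal energy in terms of the mean curvature gauge. In order to do so, we adopt a different set of notations from [5]. Set $T_0 = \hat t^\nu$, $\hat H = \hat h^\nu$, $H = h^\nu$, $\hat e_3 = \hat v^\nu$, $\hat e_4 = \hat u^\nu$, $e_3 = v^\nu$, $e_4 = u^\nu$. Denote by $X : \Sigma \to \mathbb{R}^{3,1}$ the position vector of the isometric embedding and by $\tau = -\langle X, T_0\rangle$ the restriction of the time function associated with $T_0$. $T_0 = \sqrt{1 + |\nabla\tau|^2}\,\hat e_4 - \nabla\tau$ and thus $\hat N = \sqrt{1 + |\nabla\tau|^2}$ and $\hat N^\nu = -\nabla\tau$. The quasilocal energy becomes
\[ \frac{1}{8\pi}\int_\Sigma \Big[ \big(-\langle \hat H, \hat e_3\rangle + \langle H, e_3\rangle\big)\sqrt{1 + |\nabla\tau|^2} - \big(\langle \nabla^{\mathbb{R}^{3,1}}_{-\nabla\tau}\hat e_4, \hat e_3\rangle - \langle \nabla^{N}_{-\nabla\tau} e_4, e_3\rangle\big) \Big]\,d\Sigma. \tag{1.3} \]

Suppose the mean curvature vector $\hat H$ of $\Sigma$ in $\mathbb{R}^{3,1}$ is spacelike. Let $e_3^{\hat H} = -\hat H/|\hat H|$ be the unit vector in the direction of $-\hat H$ and $e_4^{\hat H}$ be the future-directed timelike unit normal vector with $\langle e_3^{\hat H}, e_4^{\hat H}\rangle = 0$. The relation between the two gauges along $\Sigma \subset \mathbb{R}^{3,1}$ is
\[ e_3^{\hat H} = \cosh\hat\theta\,\hat e_3 + \sinh\hat\theta\,\hat e_4, \quad e_4^{\hat H} = \sinh\hat\theta\,\hat e_3 + \cosh\hat\theta\,\hat e_4 \]
for some $\hat\theta \in \mathbb{R}$. Since $\Delta\tau = -\langle \hat H, T_0\rangle$, we derive
\[ \sinh\hat\theta = \frac{-\Delta\tau}{|\hat H|\sqrt{1 + |\nabla\tau|^2}}. \]
Therefore,
\[ \langle \nabla^{\mathbb{R}^{3,1}}_{\nabla\tau}\hat e_4, \hat e_3\rangle = -\nabla\hat\theta\cdot\nabla\tau + \langle \nabla^{\mathbb{R}^{3,1}}_{\nabla\tau} e_4^{\hat H}, e_3^{\hat H}\rangle. \]
The canonical gauge condition (1.1), $\langle \hat H, \hat e_4\rangle = \langle H, e_4\rangle$, implies that $e_3^{H} = -H/|H|$ is given by $e_3^{H} = \cosh\theta\,e_3 + \sinh\theta\,e_4$ with
\[ \sinh\theta = \frac{-\Delta\tau}{|H|\sqrt{1 + |\nabla\tau|^2}}. \tag{1.4} \]
Expression (1.3) can now be rewritten in terms of the mean curvature gauge.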


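To summarize, let $\Sigma \subset N$ be a spacelike 2-surface in a spacetime $N$ and let $X : \Sigma \to \mathbb{R}^{3,1}$ be a reference isometric embedding of $\Sigma$ into the Minkowski space. For any given future timelike constant unit vector $T_0 \in \mathbb{R}^{3,1}$, the time function on $\Sigma \subset \mathbb{R}^{3,1}$ is denoted by $\tau = -\langle X, T_0\rangle$. Let $H$ be the mean curvature vector of $\Sigma$ in $N$; we assume $H$ is spacelike. Let $J$ be the future timelike normal vector field along $\Sigma$ in $N$ which is dual to $H$ along the light cone in the normal bundle of $\Sigma$ in $N$, i.e. $J$ is the unique future timelike vector that is the reflection of $H$ along the light cone in the normal bundle. Denote by $\hat H$ and $\hat J$ the corresponding data on the isometric embedding in $\mathbb{R}^{3,1}$. Again, $\hat H$ is assumed to be spacelike in $\mathbb{R}^{3,1}$. The quasilocal energy of $\Sigma$ with respect to the pair $(X, T_0)$ is given by
\[
\begin{aligned}
E(\Sigma, X, T_0) = \frac{1}{8\pi}\int_\Sigma \Big\{ & \sqrt{|\hat H|^2(1 + |\nabla\tau|^2) + (\Delta\tau)^2} - \sqrt{|H|^2(1 + |\nabla\tau|^2) + (\Delta\tau)^2} \\
& - \Delta\tau\Big[\sinh^{-1}\Big(\frac{\Delta\tau}{\sqrt{1 + |\nabla\tau|^2}\,|\hat H|}\Big) - \sinh^{-1}\Big(\frac{\Delta\tau}{\sqrt{1 + |\nabla\tau|^2}\,|H|}\Big)\Big] \\
& - \Big\langle \nabla^{\mathbb{R}^{3,1}}_{\nabla\tau}\frac{\hat J}{|\hat H|}, \frac{\hat H}{|\hat H|}\Big\rangle + \Big\langle \nabla^{N}_{\nabla\tau}\frac{J}{|H|}, \frac{H}{|H|}\Big\rangle \Big\}\,d\Sigma,
\end{aligned}
\tag{1.5}
\]
where $\Delta\tau$ is the Laplacian of $\tau$ on $\Sigma$ (with respect to the induced metric), $\nabla^N$ and $\nabla^{\mathbb{R}^{3,1}}$ are the covariant derivatives on $N$ and $\mathbb{R}^{3,1}$, respectively, and $\nabla\tau$ is the gradient of $\tau$ on $\Sigma$ (with respect to the induced metric again), considered as a tangent vector field on $\Sigma$. In the expressions for the last two integrands, we push forward $\nabla\tau$ by the embeddings and identify it as vector fields along $\Sigma$ in $\mathbb{R}^{3,1}$ and $N$, respectively.

2. General Formula for the Limit of Quasilocal Energy

Fix $R_0 > 0$ and suppose $\Sigma_r$, $R_0 < r < \infty$, is a family of closed 2-surfaces in $N$, and $X_r$ is a family of isometric embeddings of $\Sigma_r$ into $\mathbb{R}^{3,1}$. In the following theorem, we derive an expression for the limit of $E(\Sigma_r, X_r, T_0)$.

Theorem 2.1. Suppose the mean curvature vectors of $\Sigma_r$ and of the image of $X_r$ in $\mathbb{R}^{3,1}$ are both spacelike for $r > R_0$ and $|\hat H|/|H| \to 1$ as $r \to \infty$. Then the limit of $E(\Sigma_r, X_r, T_0)$ as $r \to \infty$ is the same as the limit of
\[ \frac{1}{8\pi}\int_{\Sigma_r}\Big[ -\Big\langle T_0, \frac{\hat J}{|\hat H|}\Big\rangle(|\hat H| - |H|) - \Big\langle \nabla^{\mathbb{R}^{3,1}}_{\nabla\tau}\frac{\hat J}{|\hat H|}, \frac{\hat H}{|\hat H|}\Big\rangle + \Big\langle \nabla^{N}_{\nabla\tau}\frac{J}{|H|}, \frac{H}{|H|}\Big\rangle \Big]\,d\Sigma_r, \]
as long as the limits exist.

Proof. We compute
\[ \Delta\tau = -\langle \hat H, T_0\rangle = |\hat H|\,\langle e_3^{\hat H}, T_0\rangle \tag{2.1} \]
and
\[ |\nabla\tau|^2 = -1 + \langle e_4^{\hat H}, T_0\rangle^2 - \langle e_3^{\hat H}, T_0\rangle^2, \tag{2.2} \]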

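where $e_3^{\hat H} = -\hat H/|\hat H|$ and $e_4^{\hat H} = \hat J/|\hat H|$ is the future timelike unit normal dual to $e_3^{\hat H}$ along the image of $X_r$ in $\mathbb{R}^{3,1}$. Rationalize the expression
\[ \sqrt{|\hat H|^2(1 + |\nabla\tau|^2) + (\Delta\tau)^2} - \sqrt{|H|^2(1 + |\nabla\tau|^2) + (\Delta\tau)^2} \]
as
\[ (|\hat H| - |H|)\,\frac{(|\hat H| + |H|)(1 + |\nabla\tau|^2)}{\sqrt{|\hat H|^2(1 + |\nabla\tau|^2) + (\Delta\tau)^2} + \sqrt{|H|^2(1 + |\nabla\tau|^2) + (\Delta\tau)^2}}. \]
By the assumption $|H|/|\hat H| \to 1$ at infinity, together with (2.1) and (2.2), the limit as $r \to \infty$ is thus the same as the limit of
\[ \frac{1}{8\pi}\int_{\Sigma_r}\frac{\langle e_4^{\hat H}, T_0\rangle^2 - \langle e_3^{\hat H}, T_0\rangle^2}{-\langle e_4^{\hat H}, T_0\rangle}\,(|\hat H| - |H|)\,d\Sigma_r. \tag{2.3} \]
Next we study the term
\[ -\Delta\tau\Big[\sinh^{-1}\Big(\frac{\Delta\tau}{\sqrt{1 + |\nabla\tau|^2}\,|\hat H|}\Big) - \sinh^{-1}\Big(\frac{\Delta\tau}{\sqrt{1 + |\nabla\tau|^2}\,|H|}\Big)\Big] \]
by rewriting it as
\[ -\Delta\tau\Big[\sinh^{-1}\Big(\frac{\Delta\tau}{\sqrt{1 + |\nabla\tau|^2}\,|\hat H|}\Big) - \sinh^{-1}\Big(\frac{\Delta\tau}{\sqrt{1 + |\nabla\tau|^2}\,|\hat H|}\cdot\frac{|\hat H|}{|H|}\Big)\Big]. \]
Note that
\[ \frac{\sinh^{-1}A - \sinh^{-1}(A(1 + x))}{x} \to \frac{-A}{\sqrt{1 + A^2}} \]
as $x \to 0$. With $x = |\hat H|/|H| - 1 \to 0$, the limit of the second term is thus the same as the limit of
\[ \frac{1}{8\pi}\int_{\Sigma_r}\frac{\langle e_3^{\hat H}, T_0\rangle^2}{-\langle e_4^{\hat H}, T_0\rangle}\,(|\hat H| - |H|)\,d\Sigma_r. \tag{2.4} \]
The theorem is proved by combining (2.3) and (2.4).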
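The elementary limit behind (2.4) is easy to sanity-check numerically. The following is a small sketch, not part of the original argument; the function names are hypothetical. It compares the difference quotient $(\sinh^{-1}A - \sinh^{-1}(A(1+x)))/x$ at a small $x$ with the claimed limit $-A/\sqrt{1+A^2}$.

```python
import math

def asinh_difference_quotient(A: float, x: float) -> float:
    """(asinh(A) - asinh(A * (1 + x))) / x for a small nonzero x."""
    return (math.asinh(A) - math.asinh(A * (1.0 + x))) / x

def predicted_limit(A: float) -> float:
    """Claimed limit of the difference quotient as x -> 0."""
    return -A / math.sqrt(1.0 + A * A)

if __name__ == "__main__":
    for A in (-2.0, 0.3, 5.0):
        print(A, asinh_difference_quotient(A, 1e-6), predicted_limit(A))
```

The limit is just minus the derivative of $x \mapsto \sinh^{-1}(A(1+x))$ at $x = 0$, which is why the factor $A/\sqrt{1+A^2}$ appears.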

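Suppose the image of the isometric embedding $X_r$ lies in $\mathbb{R}^3 \subset \mathbb{R}^{3,1}$; then $e_4^{\hat H} = \hat J/|\hat H|$ is a constant vector and the $\big\langle \nabla^{\mathbb{R}^{3,1}}_{\nabla\tau}\frac{\hat J}{|\hat H|}, \frac{\hat H}{|\hat H|}\big\rangle$ term vanishes. In this case, $e_3^{\hat H}$ coincides with the outward unit normal of the embedding in $\mathbb{R}^3$.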


Corollary 2.1. Suppose the reference isometric embedding is in $\mathbb{R}^3 \subset \mathbb{R}^{3,1}$ and $|\hat H|/|H| \to 1$ as $r \to \infty$; then the limit of the quasilocal energy with respect to $T_0 = (\sqrt{1 + |a|^2}, a^1, a^2, a^3)$ with $|a|^2 = \sum_{i=1}^3 (a^i)^2$ is the same as the limit of
\[ \sqrt{1 + |a|^2}\,\frac{1}{8\pi}\int_{\Sigma_r}(|\hat H| - |H|)\,d\Sigma_r + \frac{1}{8\pi}\int_{\Sigma_r}\Big\langle \nabla^{N}_{\nabla\tau}\frac{J}{|H|}, \frac{H}{|H|}\Big\rangle\,d\Sigma_r \tag{2.5} \]
when $r \to \infty$, as long as the limits exist.

Suppose the isometric embedding for $\Sigma_r$ is given by $X_r = (X^1, X^2, X^3) : \Sigma_r \to \mathbb{R}^3$ and consider $X^i$, $i = 1, 2, 3$, as functions on $\Sigma_r$. Thus $\nabla\tau = -\sum_{i=1}^3 a^i \nabla X^i$ and we obtain a limiting quasilocal energy-momentum four-vector $(e, p_1, p_2, p_3)$ as the limit of
\[
\begin{aligned}
e &= \lim_{r\to\infty}\frac{1}{8\pi}\int_{\Sigma_r}(|\hat H| - |H|)\,d\Sigma_r, \\
p_i &= \lim_{r\to\infty}\frac{1}{8\pi}\int_{\Sigma_r}\Big\langle \nabla^{N}_{-\nabla X^i}\frac{J}{|H|}, \frac{H}{|H|}\Big\rangle\,d\Sigma_r, \quad i = 1, 2, 3.
\end{aligned}
\tag{2.6}
\]

3. Relating to ADM Energy-Momentum

Let $(M, g_{ij}, p_{ij})$ be an asymptotically flat hypersurface in a spacetime $N$. Thus there exists a compact set $K \subset M$ such that $M \setminus K$ is diffeomorphic to a union of complements of balls in $\mathbb{R}^3$ (ends) such that $g_{ij} = \delta_{ij} + a_{ij}$ with $a_{ij} = O(1/r)$, $\partial_k(a_{ij}) = O(1/r^2)$, $\partial_l\partial_k(a_{ij}) = O(1/r^3)$, and $p_{ij} = O(1/r^2)$, $\partial_k(p_{ij}) = O(1/r^3)$ on each end of $M \setminus K$. The ADM energy-momentum (Arnowitt-Deser-Misner) of an end of $M$ is the four-vector $(E, P_1, P_2, P_3)$, where
\[ E = \lim_{r\to\infty}\frac{1}{16\pi}\int_{S_r}(\partial_j g_{ij} - \partial_i g_{jj})\,\nu^i\,dS_r \]
is the total energy and
\[ P_k = \lim_{r\to\infty}\frac{1}{16\pi}\int_{S_r} 2(p_{ik} - \delta_{ik}p_{jj})\,\nu^i\,dS_r \]
is the total momentum. Here $S_r$ is a coordinate sphere of radius $r$ on the end and $\nu$ is the outward unit normal of $S_r$. The positive mass theorem (Schoen-Yau [3], Witten [6]) asserts that under the dominant energy condition, the four-vector $(E, P_1, P_2, P_3)$ is future timelike, i.e. $E \ge 0$ and $-E^2 + P_1^2 + P_2^2 + P_3^2 \le 0$. In the following, we prove that for coordinate spheres of radius $r$, the limit of the quasilocal energy-momentum (2.6) is the same as the ADM energy-momentum.
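To make the roles of $(E, P_1, P_2, P_3)$ and the observer $T_0 = (\sqrt{1+|a|^2}, a^1, a^2, a^3)$ concrete, here is a small illustrative sketch. The function names are hypothetical, and the sample input $E = \gamma M$, $P = (0, 0, \beta\gamma M)$ is an assumption chosen to resemble a boosted mass $M$; it is used only as test data.

```python
import math

def is_future_timelike(E: float, P: tuple) -> bool:
    """Check E >= 0 and -E^2 + |P|^2 <= 0 for a four-vector (E, P1, P2, P3)."""
    return E >= 0.0 and -E * E + sum(p * p for p in P) <= 0.0

def observed_energy(E: float, P: tuple, a: tuple) -> float:
    """sqrt(1 + |a|^2) * E + sum_i a^i P_i, the energy seen by the observer T0."""
    lapse = math.sqrt(1.0 + sum(ai * ai for ai in a))
    return lapse * E + sum(ai * pi for ai, pi in zip(a, P))

if __name__ == "__main__":
    beta, M = 0.6, 2.0
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    E, P = gamma * M, (0.0, 0.0, beta * gamma * M)
    print(is_future_timelike(E, P))
    # The choice a = (0, 0, -beta*gamma) minimises the observed energy
    # and returns sqrt(E^2 - |P|^2), which equals M for this sample input.
    print(observed_energy(E, P, (0.0, 0.0, -beta * gamma)))
```

For a future timelike four-vector the observed energy is nonnegative for every $a$ by the Cauchy-Schwarz inequality, and its minimum over $a$ is $\sqrt{E^2 - |P|^2}$.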


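Theorem 3.1. Suppose $S_r$ is the coordinate sphere of radius $r$ in an end of an asymptotically flat three-manifold $(M, g_{ij}, p_{ij})$ and $(E, P_1, P_2, P_3)$ is the ADM energy-momentum four-vector of this end; then
\[ \lim_{r\to\infty} E(S_r, X_r, T_0) = \sqrt{1 + |a|^2}\,E + \sum_{i=1}^3 a^i P_i, \]
where $X_r$ is the (unique) isometric embedding of $S_r$ into $\mathbb{R}^3 \subset \mathbb{R}^{3,1}$ and $T_0 = (\sqrt{1 + |a|^2}, a^1, a^2, a^3)$ is an arbitrary constant timelike unit vector.

Proof. Denote by $e_0$ the future timelike unit normal of the hypersurface $M$ and by $\nu$ the unit outward normal of the coordinate sphere $S_r$. Let $(y^1, y^2, y^3)$ be the asymptotically flat coordinates on the end. $S_r$ is given by $(y^1)^2 + (y^2)^2 + (y^3)^2 = r^2$ and we denote the embedding of $S_r$ into $M$ by $Y$. Since $p_{ij} = O(1/r^2)$, we have $\langle H, e_0\rangle = O(1/r^2)$. It is known that $\langle H, \nu\rangle = \frac{2}{r} + O(1/r^2)$ (see for example [1]). Since $H = \langle H, \nu\rangle\nu - \langle H, e_0\rangle e_0$, we estimate
\[ |H| - |\langle H, \nu\rangle| = O\Big(\frac{1}{r^3}\Big). \]
Therefore,
\[ \lim_{r\to\infty}\int_{S_r}(|\hat H| - |H|)\,dS_r = \lim_{r\to\infty}\int_{S_r}(|\hat H| - |\langle H, \nu\rangle|)\,dS_r, \]
i.e., the Brown-York energy and the Liu-Yau energy have the same limit at spatial infinity. It is known (see for example [1] and the references therein) that the Brown-York energy approaches the ADM energy $E$ at spatial infinity. Now it suffices to prove
\[ \sum_{i=1}^3 a^i P_i = \sum_{i=1}^3 a^i p_i = \lim_{r\to\infty}\frac{1}{8\pi}\int_{S_r}\Big\langle \nabla^{N}_{\nabla\tau}\frac{J}{|H|}, \frac{H}{|H|}\Big\rangle\,dS_r. \]
By definition, the ADM momentum is
\[ \sum_{i=1}^3 a^i P_i = \lim_{r\to\infty}\frac{1}{8\pi}\int_{S_r}\Big[ p\Big(a^i\frac{\partial}{\partial y^i}, \nu\Big) - (\mathrm{tr}_M\,p)\Big\langle a^i\frac{\partial}{\partial y^i}, \nu\Big\rangle \Big]\,dS_r, \]
where $\mathrm{tr}_M\,p$ is the trace $g^{ij}p_{ij}$ of $p$ on $M$. We decompose $a^i\frac{\partial}{\partial y^i} = \big(a^i\frac{\partial}{\partial y^i}\big)^{\top} + \big\langle a^i\frac{\partial}{\partial y^i}, \nu\big\rangle\nu$ into its parts tangential and normal to $S_r$, and the integrand becomes
\[ p\Big(\Big(a^i\frac{\partial}{\partial y^i}\Big)^{\top}, \nu\Big) + \Big\langle a^i\frac{\partial}{\partial y^i}, \nu\Big\rangle\big(p(\nu, \nu) - \mathrm{tr}_M\,p\big). \]
By the definition of the mean curvature vector $H$, we obtain $p(\nu, \nu) - \mathrm{tr}_M\,p = \langle H, e_0\rangle$. Therefore the ADM momentum term is
\[ \sum_{i=1}^3 a^i P_i = \lim_{r\to\infty}\frac{1}{8\pi}\int_{S_r}\Big[ \Big\langle \nabla^{N}_{(a^i\frac{\partial}{\partial y^i})^{\top}} e_0, \nu\Big\rangle + \langle H, e_0\rangle\Big\langle a^i\frac{\partial}{\partial y^i}, \nu\Big\rangle \Big]\,dS_r. \tag{3.1} \]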


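Now we turn to the limit of the quasilocal energy-momentum. We can express the normal vector fields $H$ and $J$ in terms of $\nu$ and $e_0$ as $H = \langle H, \nu\rangle\nu - \langle H, e_0\rangle e_0$ and $J = -\langle H, \nu\rangle e_0 + \langle H, e_0\rangle\nu$. We compute
\[ \Big\langle \nabla^{N}_{\nabla\tau}\frac{J}{|H|}, \frac{H}{|H|}\Big\rangle = -\nabla\tau\cdot\nabla\sinh^{-1}\Big(\frac{\langle H, e_0\rangle}{|H|}\Big) - \langle \nabla^{N}_{\nabla\tau}e_0, \nu\rangle. \]
Integrating by parts gives
\[ \frac{1}{8\pi}\int_{S_r}\Big\langle \nabla^{N}_{\nabla\tau}\frac{J}{|H|}, \frac{H}{|H|}\Big\rangle\,dS_r = \frac{1}{8\pi}\int_{S_r}\Big[ -\langle \nabla^{N}_{\nabla\tau}e_0, \nu\rangle + \Delta\tau\,\sinh^{-1}\Big(\frac{\langle H, e_0\rangle}{|H|}\Big) \Big]\,dS_r. \]
Plugging in (2.1), the second integrand on the right-hand side becomes
\[ \langle e_3^{\hat H}, T_0\rangle\,|\hat H|\,\sinh^{-1}\Big(\frac{\langle H, e_0\rangle}{|H|}\Big). \]
Recall the asymptotics
\[ \langle H, e_0\rangle = O\Big(\frac{1}{r^2}\Big), \quad |H| = \frac{2}{r} + O\Big(\frac{1}{r^2}\Big), \quad |\hat H| = \frac{2}{r} + O\Big(\frac{1}{r^2}\Big), \]
and $\sinh^{-1}x \sim x$ if $x \ll 1$. We see that the limit of $\frac{1}{8\pi}\int_{S_r}\big\langle \nabla^{N}_{\nabla\tau}\frac{J}{|H|}, \frac{H}{|H|}\big\rangle\,dS_r$ is the same as
\[ \lim_{r\to\infty}\frac{1}{8\pi}\int_{S_r}\Big[ -\langle \nabla^{N}_{\nabla\tau}e_0, \nu\rangle + \langle H, e_0\rangle\langle e_3^{\hat H}, T_0\rangle \Big]\,dS_r. \tag{3.2} \]
Now we can compare (3.1) and (3.2). Write out the tangential part of $a^i\frac{\partial}{\partial y^i}$:
\[ \Big(a^i\frac{\partial}{\partial y^i}\Big)^{\top} = a^i\Big\langle \frac{\partial}{\partial y^i}, \frac{\partial Y}{\partial u^a}\Big\rangle\sigma^{ab}\frac{\partial Y}{\partial u^b} = a^i g_{ij}\frac{\partial Y^j}{\partial u^a}\sigma^{ab}\frac{\partial Y}{\partial u^b}. \]
On the other hand, as $\tau = -a^i X^i$, the push-forward of $\nabla\tau$ becomes
\[ \nabla\tau = -a^i\frac{\partial X^i}{\partial u^a}\sigma^{ab}\frac{\partial Y}{\partial u^b}. \]
The isometric embeddings satisfy (see for example [1])
\[ |X^i - Y^i| = O(1) \quad\text{and}\quad |e_3^{\hat H} - \nu| = O\Big(\frac{1}{r}\Big). \]
From these, we deduce that (3.2) is the same as the limit of the right-hand side of (3.1) and the theorem is proved.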

4. Explicit Computation in a Boosted Slice of Schwarzschild's Solution

In this section, we compute the limit of quasilocal energy-momentum for coordinate spheres of a boosted slice of Schwarzschild's solution.


4.1. Asymptotics of the geometry of coordinate spheres. Let $(y^0, y^1, y^2, y^3)$ be the standard isotropic coordinates of Schwarzschild's solution in which the spacetime metric is of the form
\[ G_{\alpha\beta}\,dy^\alpha dy^\beta = -\frac{1}{F^2}(dy^0)^2 + \frac{1}{G^2}\sum_{i=1}^3 (dy^i)^2 \]
with
\[ F^2 = \frac{\big(1 + \frac{M}{2\rho}\big)^2}{\big(1 - \frac{M}{2\rho}\big)^2}, \quad G^2 = \frac{1}{\big(1 + \frac{M}{2\rho}\big)^4} \]
and $\rho^2 = \sum_{i=1}^3 (y^i)^2$. Given $\gamma > 0$ and $\beta$ with $\gamma = 1/\sqrt{1 - \beta^2}$, consider coordinates given by
\[ (y^0)' = \gamma y^0 - \beta\gamma y^3, \quad (y^3)' = \gamma y^3 - \beta\gamma y^0, \quad (y^1)' = y^1, \quad (y^2)' = y^2. \]
Now consider the family of 2-surfaces $\Sigma_{r_0}$ given by $(y^0)' = 0$ and $((y^1)')^2 + ((y^2)')^2 + ((y^3)')^2 = r_0^2$ as $r_0 \to \infty$. These are standard coordinate spheres of a boosted slice of Schwarzschild's solution. We calculate with the standard isotropic coordinates, in terms of which $\Sigma_{r_0}$ is defined by $\gamma y^0 - \beta\gamma y^3 = 0$ and $(y^1)^2 + (y^2)^2 + (\gamma y^3 - \beta\gamma y^0)^2 = r_0^2$. Denote the embedding of $\Sigma_{r_0}$ into Schwarzschild's solution by $Y = (y^0, y^1, y^2, y^3)$. We parametrize the 2-surfaces $\Sigma_{r_0}$ by
\[
\begin{aligned}
y^0 &= \beta\gamma r_0\cos\theta, \\
y^1 &= r_0\sin\theta\sin\phi, \\
y^2 &= r_0\sin\theta\cos\phi, \\
y^3 &= \gamma r_0\cos\theta.
\end{aligned}
\]
In terms of local coordinates $u^1 = \theta$ and $u^2 = \phi$ on the surface, the induced metric on $\Sigma_{r_0}$ is
\[ \sigma_{ab}\,du^a du^b = r_0^2\Big[1 + \frac{2M}{\rho}(1 + 2\beta^2\gamma^2\sin^2\theta)\Big]d\theta^2 + r_0^2\Big[1 + \frac{2M}{\rho}\Big]\sin^2\theta\,d\phi^2 + O(r_0), \tag{4.1} \]
and
\[ \sqrt{\det\sigma_{ab}} = r_0^2\,|\sin\theta|\Big[1 + \frac{2M}{\rho}(1 + \beta^2\gamma^2\sin^2\theta)\Big] + O(r_0). \]
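The parametrization of $\Sigma_{r_0}$ above can be checked directly against its defining equations. The sketch below is only an illustration with hypothetical function names: it maps a point of the parametrized surface through the boosted coordinates and confirms that $(y^0)' = 0$ and that the primed spatial coordinates lie on the sphere of radius $r_0$.

```python
import math

def boosted_sphere_point(r0: float, beta: float, theta: float, phi: float) -> tuple:
    """Point (y0, y1, y2, y3) of the parametrized surface in isotropic coordinates."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return (
        beta * gamma * r0 * math.cos(theta),
        r0 * math.sin(theta) * math.sin(phi),
        r0 * math.sin(theta) * math.cos(phi),
        gamma * r0 * math.cos(theta),
    )

def boosted_coordinates(y: tuple, beta: float) -> tuple:
    """Primed coordinates: a boost with velocity beta along the y3 axis."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    y0, y1, y2, y3 = y
    return (gamma * y0 - beta * gamma * y3, y1, y2, gamma * y3 - beta * gamma * y0)

if __name__ == "__main__":
    y = boosted_sphere_point(10.0, 0.6, 0.7, 1.3)
    print(boosted_coordinates(y, 0.6))
```

The check relies only on the identity $\gamma^2(1 - \beta^2) = 1$, which turns $(y^3)' = \gamma^2(1-\beta^2)\,r_0\cos\theta$ into $r_0\cos\theta$.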

The mean curvature vector H = H γ ∂ ∂y γ of r0 is by definition: 2 γ α β ∂ y γ ∂y ∂y γ γ (δβ − β ), + H γ = σ ab αβ ∂u a ∂u b ∂u a ∂u b

(4.2)


γ

γ

α

γ

where αβ are the Christoffel symbols of the metric G αβ and β = G βα σ ab ∂∂uy a ∂∂uy b is γ the projection operator onto the tangent space of r0 . The asymptotic expansion of αβ can be computed from the asymptotic expansion of G αβ . α Denote by y˜ α = yr0 and ρ˜ = rρ0 , which are both scaling invariant now. We shall use the following frames along r0 to express the mean curvature vector: ∂ N = y˜ α α , ∂y ∂ ∂ B=γ + β , and ∂ y0 ∂ y3 ∂ y˜ α ∂ . T = ∂θ ∂ y α We notice that T is a tangent vector fieldto r0 while N and Bareonly asymptot1 ically normal in the sense that N , T = O r0 and B, T = O r10 . We also have N , N = 1 + O r10 , B, B = −1 + O r10 , and T, T = 1 + O r10 . A straightforward calculation gives Lemma 4.1. −2 1 H= N + 2 (nN + tT + bB) + O r0 r0

1 r03

with M (6 + 6β 2 γ 2 + 2β 4 γ 4 sin2 θ cos2 θ ), ρ˜ 3 M t = 3 (−8ρ˜ 2 )(β 2 γ 2 sin θ cos θ ), and ρ˜ M b = 3 (2βγ 2 cos θ )(β 2 γ 2 sin2 θ − 1). ρ˜

n=

From here, we compute the norm of |H |: Proposition 4.1. 2 1 |H | = + r0 r02

2M 1 2 2 2 (1 + 2β γ cos θ ) − n + O . ρ˜ r03

Let J be the future-directed timelike normal vector that is dual to H along the light cone in the normal bundle. Lemma 4.2. J is given by 2 1 −2M(1 + 2β 2 γ 2 cos2 θ + γ 2 + β 2 γ 2 ) +n B B− 2 r0 ρ˜ r0 1 1 8M 8Mβγ 2 cos θ 2 (βγ sin θ )T − (b + )N + O + 2 . ρ˜ ρ˜ r0 r03


Proof. The coefficients of J are determined by the following equations: −J, J = H, H , J, H = 0, and J, T = 0.

With this explicit formula, we compute the coefficients of the connection form of the normal bundle in the mean curvature vector gauge: Proposition 4.2.

1 1 8M N J, H = 3 2b + , βγ 2 sin θ + O ∇ ∂Y ρ˜ ∂θ r0 r04 where b denotes the derivative of b with respect to θ and 1 N ∇ ∂Y J, H = O . ∂φ r04 It turns out the second term does not contribute to the limit of the quasilocal energy.

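4.2. Total mean curvature of isometric embedding. We consider the isometric embedding of a general axially symmetric metric into $\mathbb{R}^3$. The metric is of the form
\[ r_0^2 P^2(r_0, \theta)\,d\theta^2 + r_0^2 Q^2(r_0, \theta)\sin^2\theta\,d\phi^2 \]
with
\[ P(r_0, \theta) = 1 + O\Big(\frac{1}{r_0}\Big), \quad\text{and}\quad Q(r_0, \theta) = 1 + O\Big(\frac{1}{r_0}\Big). \]
Suppose the isometric embedding is given by $X = (u(r_0, \theta)\sin\phi, u(r_0, \theta)\cos\phi, v(r_0, \theta))$. Thus
\[ \frac{\partial X}{\partial\theta} = \Big(\frac{\partial u}{\partial\theta}\sin\phi, \frac{\partial u}{\partial\theta}\cos\phi, \frac{\partial v}{\partial\theta}\Big) \]
and
\[ \frac{\partial X}{\partial\phi} = (u\cos\phi, -u\sin\phi, 0). \]
It is not hard to see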


Lemma 4.3. u and v are given by 2 2 ∂u ∂v + = r02 P 2 ∂θ ∂θ and u 2 = r02 Q 2 sin2 θ. Proposition 4.3. The mean curvature of the isometric embedding of the metric r02 P 2 (r0 , θ )dθ 2 + r02 Q 2 (r0 , θ ) sin2 θ dφ 2 into R3 is given by 2 u ∂v 2 v ∂u 1 ∂v ∂ 1 ∂ + 2 + , Hˆ = − 3 3 2 2 ∂θ ∂θ ∂θ ∂θ ∂θ r0 P r0 P Q sin θ where u(r0 , θ ) = r0 Q(r0 , θ ) sin θ, and

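\[ \Big(\frac{\partial v}{\partial\theta}\Big)^2 = r_0^2 P^2 - \Big(\frac{\partial u}{\partial\theta}\Big)^2 = r_0^2\Big[P^2 - \Big(\frac{\partial Q}{\partial\theta}\sin\theta + Q\cos\theta\Big)^2\Big]. \]
Now suppose
\[ P = 1 + \frac{p}{r_0} + O\Big(\frac{1}{r_0^2}\Big), \quad p = p(\theta), \]
and
\[ Q = 1 + \frac{q}{r_0} + O\Big(\frac{1}{r_0^2}\Big), \quad q = q(\theta). \]
The asymptotic expansion of the mean curvature is found to be
\[ \hat H = \frac{2}{r_0} - \frac{1}{r_0^2}\Big(2p + \frac{\cos\theta}{\sin\theta}(2q' - p') + q''\Big). \tag{4.3} \]
Comparing with (4.1), we deduce that in our case
\[ p = \frac{M}{\tilde\rho}(1 + 2\beta^2\gamma^2\sin^2\theta) \quad\text{and}\quad q = \frac{M}{\tilde\rho}. \]
$u$ and $v$ can be solved explicitly:
\[ u = r_0\sin\theta + \frac{M}{\tilde\rho}\sin\theta \quad\text{and}\quad v = r_0\cos\theta + \frac{M}{\tilde\rho}\cos\theta + 2M\beta\gamma\sinh^{-1}(\beta\gamma\cos\theta). \tag{4.4} \]
Plugging the expressions of $p$ and $q$ into (4.3) and integrating by parts, we obtain

Proposition 4.4.
\[ \int_{\Sigma_{r_0}}\hat H\,d\Sigma_{r_0} = 8\pi r_0 + 2\pi M\int_0^{2\pi}\frac{1 + \beta^2\gamma^2\sin^2\theta}{\tilde\rho}\,|\sin\theta|\,d\theta + O\Big(\frac{1}{r_0}\Big). \]
This calculation is compatible with Lemma 2.4 in [1].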


4.3. Evaluating the quasilocal energy. We are ready to compute the limit of the Liu-Yau mass: Proposition 4.5. r0

( Hˆ − |H |)dr0 = 8π γ M + O

1 r0

.

Proof. Combine Proposition 4.1 and Proposition 4.4; we obtain r0

2+4β 2 γ 2 −6β 2 γ 2 cos2 θ − 4β 4 γ 4 cos4 θ | sin θ |dθ ρ˜ 0 1 . +O r0

( Hˆ − |H |)dr0 = π M

2π

The integral can be evaluated by the substitution $\beta\gamma\cos\theta = \sinh y$.

Now we turn to the momentum part. Suppose $T_0 = (\sqrt{1+|a|^2},\ a^1, a^2, a^3)$, $|a|^2 = \sum_{i=1}^{3}(a^i)^2$, is a future timelike unit vector and the isometric embedding into $\mathbb{R}^3 \subset \mathbb{R}^{3,1}$ is given by $X = (0,\ u\sin\phi,\ u\cos\phi,\ v)$. We know from (4.4) that $u = r\sin\theta + O(1)$ and $v = r\cos\theta + O(1)$. The gradient of $\tau$ is given by $\nabla\tau = \frac{\partial\tau}{\partial u^a}\,\sigma^{ab}\,\frac{\partial Y}{\partial u^b}$. We compute

r0

N ∇∇τ

H J , dr0 = −a 1 |H | |H |

r0

−a 2 −a

r0

3 r0

1 N (u sin θ ) σ θθ ∇ ∂Y J, H dr0 |H |2 ∂θ 1 N (u cos θ ) σ θθ ∇ ∂Y J, H dr0 |H |2 ∂θ 1 θθ N 1 . (4.5) v σ ∇ ∂Y J, H dr0 + O 2 |H | r0 ∂θ

These integrals can be evaluated and we obtain Proposition 4.6. r0

N ∇∇τ

H J , dr0 = a 3 8πβγ M + O |H | |H |

Proof. By Proposition 4.2,

1 r0

.

∇N

∂Y ∂θ

J, H is of the order

1 r03

while u and v are both of order

r0 . We have

1 θθ N ∇ ∂Y J, H dr0 (u sin θ ) σ 2 ∂θ r0 |H |

π 2π 2 r0 1 N = J, H r02 | sin θ |dθ dφ (r0 sin2 θ ) 2 ∇ ∂Y 4 ∂θ r0 0 0

2π π N (sin2 θ ) r03 ∇ ∂Y J, H | sin θ |dθ. = 4 0 ∂θ


Therefore the first integral on the right-hand side of (4.5) is π 2π 8M −a 1 βγ 2 sin θ | sin θ |dθ, (sin2 θ ) 2b + 4 0 ρ˜ where b=

M (2βγ 2 cos θ )(β 2 γ 2 sin2 θ − 1), ρ˜ 3

and the second one is 2π 8M 2π 2 −a βγ sin θ | sin θ |dθ. (sin θ cos θ ) 2b + 4 0 ρ˜ 2π Both integrate to zero as they are of the form 0 (cos θ )F(cos2 θ )| sin θ |dθ or 2π 2 2 0 (sin θ )F(cos θ )| sin θ |dθ for some smooth function F of cos θ . The last integral becomes 2π 8M 3π 2 a βγ sin θ | sin θ |dθ sin θ 2b + 4 0 ρ˜ which can be simplified by integration by parts as π sin θ a 3 4π Mβγ 2 dθ. ρ˜ 3 0 Using the same substitution βγ cos θ = sinh y, the integral is a 3 8πβγ M.

Therefore the limit of the quasilocal energy (1.5) is $\sqrt{1+|a|^2}\,\gamma M + a^3\beta\gamma M$.
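The minimization of this expression over $a = (a^1, a^2, a^3)$ can be checked by hand. The following is only a sketch: it assumes that the first component of $T_0$ is $\sqrt{1+|a|^2}$ (as required for a unit timelike vector), that $0 \le \beta < 1$ and $\gamma > 0$ with $\gamma^2 - \beta^2\gamma^2 = 1$, and it uses the name $E(a)$, which is introduced here for illustration and does not appear in the text. Here $a^3$ denotes the third component, not a power.

```latex
\begin{align*}
E(a) &= \gamma M\sqrt{1+|a|^2} + \beta\gamma M\, a^3, \\
\partial_{a^1}E = \partial_{a^2}E = 0 &\iff a^1 = a^2 = 0, \\
\partial_{a^3}E = 0 &\iff \frac{a^3}{\sqrt{1+(a^3)^2}} = -\beta
   \iff a^3 = -\frac{\beta}{\sqrt{1-\beta^2}} = -\beta\gamma, \\
E(0,0,-\beta\gamma) &= \gamma M\sqrt{1+\beta^2\gamma^2} - \beta^2\gamma^2 M
   = (\gamma^2 - \beta^2\gamma^2)\,M = M .
\end{align*}
```

Since $E$ is convex in $a$, the critical point is the global minimum, and $\sqrt{1+\beta^2\gamma^2} = \gamma$ gives the minimizing vector $(\gamma, 0, 0, -\beta\gamma)$.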

Recall that $\gamma^2 - \beta^2\gamma^2 = 1$. Minimizing this expression among all $T_0 = (\sqrt{1+|a|^2},\ a^1, a^2, a^3)$, we see the minimum is achieved at $(\gamma, 0, 0, -\beta\gamma)$ and the minimum value is $M$. The limit of the quasilocal energy-momentum is thus $M(\gamma, 0, 0, -\beta\gamma)$.

Acknowledgements. We would like to thank PoNing Chen for his help in checking the correctness of the calculations in §3.

References
1. Fan, X.-Q., Shi, Y., Tam, L.-F.: Large-sphere and small-sphere limits of the Brown-York mass. Comm. Anal. Geom. 17(1), 37–72 (2009)
2. Liu, C.-C.M., Yau, S.-T.: Positivity of quasilocal mass. Phys. Rev. Lett. 90(23), 231102 (2003)
3. Schoen, R., Yau, S.-T.: Positivity of the total mass of a general space-time. Phys. Rev. Lett. 43(20), 1457–1459 (1979)
4. Wang, M.-T., Yau, S.-T.: Quasilocal mass in general relativity. Phys. Rev. Lett. 102(2), 021101 (2009)
5. Wang, M.-T., Yau, S.-T.: Isometric embeddings into the Minkowski space and new quasi-local mass. Commun. Math. Phys. 288, 919–942 (2009)
6. Witten, E.: A new proof of the positive energy theorem. Commun. Math. Phys. 80(3), 381–402 (1981)

Communicated by P.T. Chruściel

Commun. Math. Phys. 296, 285–301 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-0991-1


Non-Uniform Dependence on Initial Data of Solutions to the Euler Equations of Hydrodynamics

A. Alexandrou Himonas, Gerard Misiołek

Department of Mathematics, University of Notre Dame, Notre Dame, IN 46556, USA. E-mail: [email protected]; [email protected]

Received: 13 June 2009 / Accepted: 24 October 2009
Published online: 29 January 2010 – © Springer-Verlag 2010

Abstract: We show that continuous dependence on initial data of solutions to the Euler equations of incompressible hydrodynamics is optimal. More precisely, we prove that the data-to-solution map is not uniformly continuous in the Sobolev $H^s(\Omega)$ topology for any $s \in \mathbb{R}$ if the domain $\Omega$ is the (flat) torus $\mathbb{T}^n = \mathbb{R}^n/2\pi\mathbb{Z}^n$ and for any $s > 0$ if the domain is the whole space $\mathbb{R}^n$.

1. Introduction

The classical notion of well-posedness of an abstract Cauchy problem due to Hadamard requires constructing a unique solution which depends continuously on initial conditions. This notion is quite strong and difficulties of a specific problem often force one to relax or drop the requirements of continuous dependence or uniqueness. On the other hand, there are many equations for which the solution operator can be shown in suitably chosen topologies to be uniformly continuous, Lipschitz or even differentiable.^1

The Cauchy problem for the Euler equations of ideal hydrodynamics has a long and distinguished history which we will not attempt to survey here, recommending instead the monograph of Majda and Bertozzi [MB] or a recent article by Constantin [C] for fundamental results and additional references. Of relevance to us however will be the property of continuous dependence of solutions (in the strong sense of Sobolev norms) which was first established by Ebin and Marsden [EM] for bounded domains (possibly with boundary) and by Kato [K1] for the whole space using semigroup techniques. The former approach is based on a result of V. Arnold according to which motions of an $n$-dimensional ideal fluid correspond to geodesics of its kinetic energy functional in the group of diffeomorphisms preserving the volume of the fluid domain, see Arnold and Khesin [AK] for a detailed exposition. Topologizing the space of diffeomorphisms by Sobolev $H^s$ norms (with $s > n/2 + 1$) the geodesic equation is then solved (locally in time) using Banach contractions. In particular, the geodesics depend smoothly on initial conditions, but since derivative loss occurs upon changing back from Lagrangian to Eulerian coordinates this approach ultimately yields only continuous dependence of the corresponding solutions to the Euler equations.^2 There arises therefore a natural question whether this dependence is optimal, see [EM], 15.2(ii) p. 151. We point out that for the closely related Navier-Stokes equations, the dependence of solutions on the data in sufficiently high Sobolev norms and sufficiently large viscosity is at least Lipschitz.^3 However, we have not been able to find an answer to the optimality question for the Euler equations anywhere in the literature.

In this paper we show that continuous dependence in Eulerian coordinates is indeed the best one can expect. More precisely, we will prove that the solution map $u_0 \to u$ for the Euler equations in $\Omega$ (in dimensions 2 and 3) is not uniformly continuous on bounded sets into $C([0,T], H^s)$ for any $s \in \mathbb{R}$ when $\Omega = \mathbb{T}^n$ and for any $s > 0$ when $\Omega = \mathbb{R}^n$. Our methods in the two cases are different.

One of the first results of this type was proved by Kato [K2], who showed that the solution operator for the (inviscid) Burgers equation is not Hölder continuous in the $H^s(\mathbb{T})$ norm ($s > 3/2$) regardless of the Hölder exponent. Since then other techniques have been developed and successfully applied in the study of various nonlinear dispersive and integrable equations in one space dimension, see for example Kenig, Ponce and Vega [KPV] and its references. Our approach is most closely related to that of Koch and Tzvetkov [KT] in their study of the Benjamin-Ono equation (see also [HK]).

In the next section we describe the basic set-up and present the statements of our results. Sections 3 and 4 contain the main constructions of the paper. A few well-known technical proofs are gathered in the Appendix in an attempt to make the paper reasonably self-contained.

^1 Various examples can be found in [H, B, Sh or KPV].
^2 The fact that fluids are “better behaved” in Lagrangian coordinates has led to detailed studies of the associated Riemannian exponential map on the diffeomorphism group as in [EMP], see also [AK].
^3 In fact, using semilinear parabolic techniques it can be shown to be analytic, see [H], p. 79–81.

2. Background and Statements of Main Results

The initial value problem for the Euler equations governing the motion of an incompressible fluid in an $n$-dimensional domain $\Omega$ can be formulated as
\[ \partial_t u + \nabla_u u + \nabla p = 0, \qquad \operatorname{div} u = 0, \tag{2.1} \]
\[ u(0,x) = u_0(x), \qquad x \in \Omega,\ t \in \mathbb{R}, \tag{2.2} \]
where $u : \mathbb{R}\times\Omega \to \mathbb{R}^n$ is the fluid velocity, $p : \mathbb{R}\times\Omega \to \mathbb{R}$ is the pressure function and $u_0 : \Omega \to \mathbb{R}^n$ is a divergence free initial condition. We shall be concerned only with the cases when $\Omega$ is either the flat $n$-torus $\mathbb{T}^n = \mathbb{R}^n/2\pi\mathbb{Z}^n$ or the whole space $\Omega = \mathbb{R}^n$ and where $n = 2$ or $3$. However, it should be clear that similar constructions can be done also for certain bounded domains (such as a disc or a finite cylinder for example) with appropriate boundary conditions.

As is well known, pressure can be eliminated from (2.1). In fact, applying the divergence operator to the first equation and then solving for $p$ gives
\[ \nabla p = -\nabla\Delta^{-1}\operatorname{div}\nabla_u u = -(1-P)\nabla_u u, \tag{2.3} \]


where $P$ is the $L^2$-orthogonal projection onto the divergence free part in the Hodge decomposition of vector fields on $\Omega$ into divergence free fields and gradients of functions in $\Omega$. Using (2.3) the first of the equations in (2.1) takes the form
\[ \partial_t u + \nabla_u u - \nabla\Delta^{-1}\operatorname{div}\nabla_u u = 0. \tag{2.4} \]
We emphasize that the nonlocal term in the above equation is more regular than it appears. In fact, since $u = (u_1, \dots, u_n)$ is divergence free it follows that the function
\[ \operatorname{div}\nabla_u u = \sum_{i,j=1}^{n} \partial_i u_j\,\partial_j u_i \tag{2.5} \]
involves only first order derivatives of $u$.

For any $s \in \mathbb{R}$ we shall equip the Sobolev space of vector-valued distributions $H^s(\Omega, \mathbb{R}^n)$ with the norm
\[ \|u\|_{H^s(\Omega,\mathbb{R}^n)} = \sum_{j=1}^{n} \|u_j\|_{H^s(\Omega)}, \tag{2.6} \]
where $\|f\|_{H^s(\Omega)}$ is the standard Sobolev norm for functions on $\Omega$ defined either as
\[ \Big(\sum_{n\in\mathbb{Z}} (1+n^2)^s\,|\hat f(n)|^2\Big)^{1/2} \quad\text{or}\quad \Big(\int_{\mathbb{R}^n} (1+|\xi|^2)^s\,|\hat f(\xi)|^2\,d\xi\Big)^{1/2}, \]
depending on the context. We shall also often use the symbols $\lesssim$ and $\simeq$ to denote estimates that hold up to some universal constant.

Local well-posedness of the Euler equations in $\Omega$ (in dimensions $n \ge 2$) has of course been established by many authors. We summarize the result in the form which is convenient for our purposes.

Local well-posedness. If $s > n/2 + 1$ and $u_0 \in H^s(\Omega, \mathbb{R}^n)$ is a divergence free vector field then there exists $T > 0$ and a unique solution $u \in C([0,T], H^s(\Omega,\mathbb{R}^n))$ of the Cauchy problem (2.1)–(2.2) which depends continuously on the initial data $u_0$. Furthermore, we have the estimate
\[ \|u(t)\|_{H^s} \le \frac{\|u_0\|_{H^s}}{1 - Ct\,\|u_0\|_{H^s}} \quad\text{for } 0 \le t \le T < C^{-1}\|u_0\|_{H^s}^{-1}, \tag{2.7} \]
where $C > 0$ is a constant depending on $s$.

The proof can be found for example in the references [MB or KP]. Our main results show that in general one cannot expect to improve the continuous dependence of Eulerian solutions on initial data.

Theorem 2.1. Let $n = 2$ or $3$ and let $u_0 \to u$ denote the solution map of the Euler equations defined by the Cauchy problem (2.1)–(2.2):
(P) Periodic case. For any $s \in \mathbb{R}$ the solution map is not uniformly continuous from the unit ball in $H^s(\mathbb{T}^n, \mathbb{R}^n)$ into $C([0,T], H^s(\mathbb{T}^n,\mathbb{R}^n))$.
(NP) Non-periodic case. If $s > 0$ then the solution map is not uniformly continuous from the unit ball in $H^s(\mathbb{R}^n, \mathbb{R}^n)$ into $C([0,T], H^s(\mathbb{R}^n,\mathbb{R}^n))$.


A few comments are in order. Although our proofs for the periodic and the non-periodic domains are different, in both cases the general strategy will be to construct, for each Sobolev index, two sequences of solutions which converge at time zero but remain far apart at any later time. The constructions are essentially two dimensional but the modifications needed for higher dimensions are trivial. Furthermore, it seems that the restriction on the values of the Sobolev index $s$ in the non-periodic case (NP) is merely a consequence of our methods and can be improved. On the other hand, one may reasonably argue that the range of $s$ is determined by those values which correspond to a well-posed evolution of the fluid. For classical solutions of the Euler equations this range is $s > n/2 + 1$ and thus is covered by our results.

The main value of our result consists in the fact that the instability we prove occurs on a finite time interval independent of the initial distance between the two sequences. This would certainly be known in the space $C([0,\infty], H^s)$ due to the existence of (plenty of) smooth exponentially unstable stationary solutions with smooth unstable eigenfunctions corresponding to unstable eigenvalues of the linearized equation. Indeed, if $u_0$ is a parallel shear flow with smooth profile possessing a smooth unstable eigenfunction $v_\lambda$, then the difference between $u_0$, as one solution, and the solution with initial data $u_0 + \varepsilon v_\lambda$ will be greater than an absolute constant $c_0$ on a time interval of order $-\log\varepsilon$. This is a well-known bootstrap argument (see Lin [L] and the references on the subject therein).

Remark 2.1. It is not difficult to see that the solution map $u_0 \to u$ is differentiable from $H^s$ into $C([0,T], H^{s-1})$ for $s > n/2 + 1$. This follows from the fact that the corresponding solution map in Lagrangian coordinates (or equivalently, the Riemannian exponential map of the $L^2$ metric on the group of volume-preserving diffeomorphisms)
\[ u_0 \to \eta(t) = \exp_e(t u_0), \quad\text{where } e(x) = x, \]
is differentiable as a map into $H^s$ diffeomorphisms for any $s > n/2 + 1$. As already mentioned, the change from Lagrangian to Eulerian coordinates introduces a loss of derivatives. However, letting $\iota : \eta \to \eta^{-1}$ denote the inversion of diffeomorphisms and writing $u = \dot\eta \circ \iota \circ \eta$ we easily conclude that the map $u_0 \to u$ retains the desired differentiability considered as a function from $H^s$ to $C([0,T], H^{s-1})$ because $\iota$ itself is of class $C^1$ when mapping diffeomorphisms of class $H^s$ to those of class $H^{s-1}$. We refer to [EM and EMP] for more details regarding diffeomorphism groups and the geometry of the $L^2$ exponential map.

3. Proof of Theorem 2.1: Part (P)

The proof of part (P) for the case when $\Omega = \mathbb{T}^2$ relies essentially on the construction of two sequences of explicit periodic solutions to the Euler equations with mixed high and low frequency terms that have suitably chosen phase shifts. Our motivation comes partly from the constructions in [M]. Also, sequences similar to those defined by formula (3.1) below have been used for equations like Burgers’, BO and CH which can be considered as approximations to the Euler equations (see [KT,HKM] and the references therein). Remarkably, in the case of the Euler equations these functions are indeed solutions.

Lemma 3.1. For any $\omega \in \mathbb{R}$ and $n \in \mathbb{Z}^+$ the divergence free vector field
\[ u^{\omega,n}(t,x_1,x_2) = \big(\omega n^{-1} + n^{-s}\cos(nx_2 - \omega t),\ \omega n^{-1} + n^{-s}\cos(nx_1 - \omega t)\big) \tag{3.1} \]
is a solution to the Euler equations on $\mathbb{T}^2$.

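Proof. We compute the first two terms
\[ \partial_t u^{\omega,n} = \big(\omega n^{-s}\sin(nx_2 - \omega t),\ \omega n^{-s}\sin(nx_1 - \omega t)\big) \tag{3.2} \]
and
\[ \nabla_{u^{\omega,n}} u^{\omega,n} = \Big(-\big(\omega n^{-1} + n^{-s}\cos(nx_1 - \omega t)\big)\,n^{-s+1}\sin(nx_2 - \omega t),\ -\big(\omega n^{-1} + n^{-s}\cos(nx_2 - \omega t)\big)\,n^{-s+1}\sin(nx_1 - \omega t)\Big) \tag{3.3} \]
so that
\[ \partial_t u^{\omega,n} + \nabla_{u^{\omega,n}} u^{\omega,n} = \big(-n^{-2s+1}\cos(nx_1 - \omega t)\sin(nx_2 - \omega t),\ -n^{-2s+1}\sin(nx_1 - \omega t)\cos(nx_2 - \omega t)\big). \tag{3.4} \]
Furthermore, from (2.5) we have
\[ \operatorname{div}\nabla_{u^{\omega,n}} u^{\omega,n} = 2n^{-2s+2}\sin(nx_1 - \omega t)\sin(nx_2 - \omega t), \tag{3.5} \]
and since a quick inspection shows that the product $\sin nx_1 \sin nx_2$ is an eigenfunction of the Laplacian with eigenvalue $-2n^2$, we have
\[ \Delta^{-1}\operatorname{div}\nabla_{u^{\omega,n}} u^{\omega,n} = -n^{-2s}\sin(nx_1 - \omega t)\sin(nx_2 - \omega t), \tag{3.6} \]
and hence
\[ \nabla\Delta^{-1}\operatorname{div}\nabla_{u^{\omega,n}} u^{\omega,n} = \big(-n^{-2s+1}\cos(nx_1 - \omega t)\sin(nx_2 - \omega t),\ -n^{-2s+1}\sin(nx_1 - \omega t)\cos(nx_2 - \omega t)\big). \tag{3.7} \]
Combining (3.4) and (3.7) gives
\[ \partial_t u^{\omega,n} + \nabla_{u^{\omega,n}} u^{\omega,n} - \nabla\Delta^{-1}\operatorname{div}\nabla_{u^{\omega,n}} u^{\omega,n} = 0, \tag{3.8} \]
which completes the proof.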
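The computation in this proof can also be checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates the field (3.1) together with the pressure $p = -\Delta^{-1}\operatorname{div}\nabla_u u = n^{-2s}\sin(nx_1-\omega t)\sin(nx_2-\omega t)$ read off from (2.3) and (3.6), and measures the residual of (2.1) with central differences. The function names `velocity`, `pressure` and `euler_residual` and the step size `h` are illustrative choices.

```python
import math

def velocity(t, x1, x2, omega, n, s):
    """Velocity field (3.1) on the 2-torus."""
    u1 = omega / n + n ** (-s) * math.cos(n * x2 - omega * t)
    u2 = omega / n + n ** (-s) * math.cos(n * x1 - omega * t)
    return u1, u2

def pressure(t, x1, x2, omega, n, s):
    """Pressure p = -Delta^{-1} div(nabla_u u), read off from (3.6)."""
    return n ** (-2 * s) * math.sin(n * x1 - omega * t) * math.sin(n * x2 - omega * t)

def euler_residual(t, x1, x2, omega, n, s, h=1e-5):
    """Largest absolute value among the two components of
    d_t u + (u . grad) u + grad p and div u, using central differences."""
    def d(f, i):
        # Central difference of f in argument i (0 = t, 1 = x1, 2 = x2).
        hi = [t, x1, x2]
        lo = [t, x1, x2]
        hi[i] += h
        lo[i] -= h
        fh = f(*hi, omega, n, s)
        fl = f(*lo, omega, n, s)
        if isinstance(fh, tuple):
            return tuple((a - b) / (2 * h) for a, b in zip(fh, fl))
        return (fh - fl) / (2 * h)

    u1, u2 = velocity(t, x1, x2, omega, n, s)
    ut, ux1, ux2 = d(velocity, 0), d(velocity, 1), d(velocity, 2)
    px1, px2 = d(pressure, 1), d(pressure, 2)
    r1 = ut[0] + u1 * ux1[0] + u2 * ux2[0] + px1
    r2 = ut[1] + u1 * ux1[1] + u2 * ux2[1] + px2
    div = ux1[0] + ux2[1]
    return max(abs(r1), abs(r2), abs(div))
```

For moderate values such as $n = 3$, $s = 1.5$ the residual at sample points is at the level of the finite-difference error, which is consistent with (3.8) but of course does not replace the proof.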

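The second ingredient we need is provided by the following simple estimate:

Lemma 3.2. For any $s \in \mathbb{R}$ and any $n \gg 1$ we have
\[ \|n^{-s}\cos(n\cdot - \omega t)\|_{H^s(\mathbb{T})} + \|n^{-s}\sin(n\cdot - \omega t)\|_{H^s(\mathbb{T})} \simeq 1. \tag{3.9} \]

Proof. A direct computation gives
\[ \big(\cos(n\cdot - \omega t)\big)^{\wedge}(k) = \int_0^{2\pi} e^{-ikx}\cos(nx - \omega t)\,dx = \pi e^{-i\omega t}\delta_{k,n} + \pi e^{i\omega t}\delta_{k,-n}, \]
and consequently $\|\cos(n\cdot - \omega t)\|_{H^s(\mathbb{T})} = \pi\sqrt{2}\,(1+n^2)^{s/2}$. An analogous computation for $\sin(n\cdot - \omega t)$ yields the estimate.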
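As a small numerical illustration of this estimate (a sketch only; the function name `hs_norm_scaled_cos` is a hypothetical helper), the norm of $n^{-s}\cos(n\cdot - \omega t)$ can be evaluated directly from the two nonzero Fourier coefficients, using the convention $\|f\|_{H^s(\mathbb{T})}^2 = \sum_k (1+k^2)^s |\hat f(k)|^2$ from Section 2. The value does not depend on $\omega$ or $t$ and tends to $\pi\sqrt{2}$ as $n \to \infty$.

```python
import math

def hs_norm_scaled_cos(n, s):
    """H^s(T) norm of n^{-s} cos(n x - omega t).

    Only the modes k = n and k = -n contribute, each with modulus pi * n^{-s},
    so the result equals pi * sqrt(2) * (1 + n^{-2})^{s/2}.
    """
    coeff = math.pi * n ** (-s)
    return math.sqrt(2 * (1 + n * n) ** s * coeff ** 2)
```

Evaluating it for increasing $n$ (and for negative as well as positive $s$) shows the norm staying bounded above and below, which is the content of (3.9).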


In order to show that the solution map is not uniformly continuous on bounded sets of initial data into $C([0,T], H^s(\mathbb{T}^2,\mathbb{R}^2))$ it will be sufficient to select two sequences of solutions converging in the $H^s$ norm at $t = 0$ but separated at some later time $t > 0$ and which remain confined to a bounded set in $H^s(\mathbb{T}^2)$ for all $0 \le t \le T$. To this end we pick the solutions described in Lemma 3.1, namely $u^{1,n}(t)$ and $u^{-1,n}(t)$ with $n = 1, 2, \dots$, corresponding to $\omega = 1$ and $\omega = -1$ respectively, and use Lemma 3.2 as our tool to verify that they meet our requirements.

First, observe that boundedness of the two sequences in any $H^s$ norm follows since for any $t$ and $\omega = \pm 1$, we have
\begin{align*}
\|u^{\omega,n}(t)\|_{H^s(\mathbb{T}^2,\mathbb{R}^2)} &= \|u_1^{\omega,n}(t)\|_{H^s(\mathbb{T}^2)} + \|u_2^{\omega,n}(t)\|_{H^s(\mathbb{T}^2)} \\
&= \|\omega n^{-1} + n^{-s}\cos(n\cdot - \omega t)\|_{H^s(dx_2)} + \|\omega n^{-1} + n^{-s}\cos(n\cdot - \omega t)\|_{H^s(dx_1)} \\
&\lesssim n^{-1} + 1 \lesssim 1.
\end{align*}
Next, estimating the difference of the two sequences at time $t = 0$, we find that
\[ \|u^{1,n}(0) - u^{-1,n}(0)\|_{H^s(\mathbb{T}^2,\mathbb{R}^2)} \lesssim 2n^{-1} \longrightarrow 0 \quad\text{as } n \to \infty. \]
On the other hand, by the triangle inequality, a little trigonometry and repeated application of Lemma 3.2, we have
\begin{align*}
\|u^{1,n}(t) - u^{-1,n}(t)\|_{H^s(\mathbb{T}^2,\mathbb{R}^2)}
&= \Big\|\frac{2}{n} + \frac{\cos(n\cdot - t) - \cos(n\cdot + t)}{n^s}\Big\|_{H^s_{x_2}} + \Big\|\frac{2}{n} + \frac{\cos(n\cdot - t) - \cos(n\cdot + t)}{n^s}\Big\|_{H^s_{x_1}} \\
&\gtrsim \Big\|\frac{\cos(n\cdot - t) - \cos(n\cdot + t)}{n^s}\Big\|_{H^s_{x_2}} + \Big\|\frac{\cos(n\cdot - t) - \cos(n\cdot + t)}{n^s}\Big\|_{H^s_{x_1}} - \frac{4}{n} \\
&= 2n^{-s}\,\|\sin t\,\sin n(\cdot)\|_{H^s_{x_2}} + 2n^{-s}\,\|\sin t\,\sin n(\cdot)\|_{H^s_{x_1}} - \frac{4}{n} \\
&\gtrsim |\sin t| - \frac{1}{n}
\end{align*}
for any $t \ge 0$. From the last inequality we now obtain
\[ \liminf_{n\to\infty} \|u^{1,n}(t) - u^{-1,n}(t)\|_{H^s(\mathbb{T}^2,\mathbb{R}^2)} \gtrsim |\sin t|, \]
which completes the proof in the case when $\Omega = \mathbb{T}^2$.

Remark 3.1. (The case $\Omega = \mathbb{T}^3$) It suffices to observe that for any constants $\omega$, $n$ and $s$ the vector-valued function
\[ u^{\omega,n}(t,x_1,x_2,x_3) = \big(\omega n^{-1} + n^{-s}\cos(nx_2 - \omega t),\ \omega n^{-1} + n^{-s}\cos(nx_1 - \omega t),\ 0\big) \tag{3.10} \]
is also a solution to the Euler equations on the three torus $\mathbb{T}^3$. The construction above applies without change.^4 This completes the proof of part (P) of Theorem 2.1.

^4 A similar argument works for any higher-dimensional flat torus.
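The mechanism of the periodic proof can be illustrated numerically. The sketch below is an assumption-laden illustration rather than part of the proof: it treats one component of $u^{1,n}(t) - u^{-1,n}(t)$, namely $2/n + 2n^{-s}\sin t\,\sin(nx)$, as a function of one variable and evaluates its $H^s(\mathbb{T})$ norm in closed form with the convention $\hat f(k) = \int_0^{2\pi} e^{-ikx} f(x)\,dx$. The name `component_gap` is hypothetical.

```python
import math

def component_gap(n, s, t):
    """H^s(T) norm of 2/n + 2 n^{-s} sin(t) sin(n x) for an integer n >= 1.

    The constant contributes the single mode k = 0, and sin(n x) contributes
    the modes k = n and k = -n, each with modulus pi.
    """
    zero_mode = 2 * math.pi * (2.0 / n)
    osc_mode = math.pi * 2 * n ** (-s) * abs(math.sin(t))
    return math.sqrt(zero_mode ** 2 + 2 * (1 + n * n) ** s * osc_mode ** 2)
```

At $t = 0$ the value is $4\pi/n$, which tends to zero, while for fixed $t > 0$ it approaches $2\sqrt{2}\,\pi\,|\sin t|$ as $n \to \infty$. This is the gap between the two sequences that rules out uniform continuity.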


4. Proof of Theorem 2.1: Part (NP)

As in the previous section we prove (NP) in the two-dimensional case and later indicate modifications needed in three dimensions. Our strategy in the non-periodic case will be to select two sequences of approximate solutions which are arbitrarily close at time zero but are separated at later times. They will consist of a high frequency part as in the previous section but localized in the spatial variable and a low frequency part which in fact will be a smooth solution with suitably chosen initial data. One of our tasks will be to control the error terms. This approach has been successfully applied to equations in one space dimension, see e.g. [KT or HK].

4.1. Approximate solutions. Our approximate solutions have the following form:
\[ u^{\omega,\lambda}(t,x) = u^l(t,x) + u^h(t,x), \qquad x = (x_1,x_2) \in \mathbb{R}^2,\ t \in \mathbb{R}, \tag{4.1} \]
where $u^h$ is the high frequency term
\[ u^h(t,x) = \operatorname{rot}\phi^h(t,x) = \big(\partial_2\phi^h(t,x),\ -\partial_1\phi^h(t,x)\big) \tag{4.2} \]
given by the stream function
\[ \phi^h(t,x) = \lambda^{-\delta-s-1}\,\phi\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\sin(\lambda x_2 - \omega t), \qquad \lambda \in \mathbb{Z}^+, \tag{4.3} \]
and where $\phi \in C_c^\infty(\mathbb{R})$ satisfies $\operatorname{supp}\phi \subset [-2,2]$ and $\phi(x) \equiv 1$ for $|x| < 1$. The values of the parameters $\delta > 0$ and $s \in \mathbb{R}$ will be specified later. The low frequency term $u^l$ is defined as the solution of the following initial value problem:
\[ \partial_t u^l + \nabla_{u^l} u^l - \nabla\Delta^{-1}\operatorname{div}\nabla_{u^l} u^l = 0, \qquad \operatorname{div} u^l = 0, \qquad u^l(0,x) = \operatorname{rot}\phi^l(x), \quad x \in \mathbb{R}^2, \tag{4.4} \]
with the corresponding stream function
\[ \phi^l(x) = -\omega\lambda^{-1+\delta}\,\psi_1\Big(\frac{x_1}{\lambda^\delta}\Big)\psi_2\Big(\frac{x_2}{\lambda^\delta}\Big), \qquad \omega = \pm 1,\ \lambda \in \mathbb{Z}^+, \tag{4.5} \]
and where the localizing functions $\psi_1, \psi_2 \in C_c^\infty(\mathbb{R})$ are chosen such that $\psi_1' = \psi_2 \equiv 1$ on the support of $\phi$.

We first aim to show that the functions $u^{\omega,\lambda}$ are indeed good approximations to solutions in that they satisfy the Euler equations in $\mathbb{R}^2$ up to a “small” error term. Since both $u^l$ and $u^h$ are divergence free, we have $\operatorname{div} u^{\omega,\lambda} = 0$. Furthermore, using the first of the equations in (4.4) we compute six error terms
\begin{align*}
\partial_t u^{\omega,\lambda} + \nabla_{u^{\omega,\lambda}} u^{\omega,\lambda} - \nabla\Delta^{-1}\operatorname{div}\nabla_{u^{\omega,\lambda}} u^{\omega,\lambda}
&= \partial_t u^h + \nabla_{u^l} u^h + \nabla_{u^h} u^l + \nabla_{u^h} u^h - 2\nabla\Delta^{-1}\operatorname{div}\nabla_{u^l} u^h - \nabla\Delta^{-1}\operatorname{div}\nabla_{u^h} u^h \\
&= E_1 + E_2 + E_3 + E_4 + E_5 + E_6. \tag{4.6}
\end{align*}


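4.2. $L^2$-estimates of error terms. Before we proceed to estimate the error terms $E_1, \dots, E_6$ we first need to derive appropriate bounds on $u^l$ and $u^h$. The next lemma, whose proof involves a little Fourier analysis, will be helpful in what follows.

Lemma 4.1. Let $\sigma \ge 0$ and $\delta \ge 0$. For any Schwartz function $\psi \in \mathcal{S}(\mathbb{R})$ we have
\[ \lambda^{\delta/2}\|\psi\|_{L^2(\mathbb{R})} \le \Big\|\psi\Big(\frac{\cdot}{\lambda^\delta}\Big)\Big\|_{H^\sigma(\mathbb{R})} \le \lambda^{\delta/2}\|\psi\|_{H^\sigma(\mathbb{R})}, \qquad \lambda \gg 1. \tag{4.7} \]
Furthermore, for any constant $a \in \mathbb{R}$ we have the estimate
\[ \Big\|\psi\Big(\frac{\cdot}{\lambda^\delta}\Big)\cos(\lambda\cdot - a)\Big\|_{H^\sigma(\mathbb{R})} \simeq \lambda^{\sigma+\delta/2}\|\psi\|_{L^2(\mathbb{R})}, \qquad \lambda \gg 1, \tag{4.8} \]
which also holds when $\cos(\lambda\cdot - a)$ is replaced by $\sin(\lambda\cdot - a)$.

Proof. See Appendix.

Since
\[ u^l(0,x) = \Big(-\omega\lambda^{-1}\,\psi_1\Big(\frac{x_1}{\lambda^\delta}\Big)\psi_2'\Big(\frac{x_2}{\lambda^\delta}\Big),\ \omega\lambda^{-1}\,\psi_1'\Big(\frac{x_1}{\lambda^\delta}\Big)\psi_2\Big(\frac{x_2}{\lambda^\delta}\Big)\Big), \]
from estimate (4.7) we obtain that
\[ \|u^l(0)\|_{H^\sigma(\mathbb{R}^2)} \lesssim \lambda^{-1+\delta}. \tag{4.9} \]
Next, using (4.9) and standard energy estimates we derive the following lemma whose proof is also relegated to the Appendix.

Lemma 4.2. For any $\delta > 0$ and $\lambda \gg 1$ the initial value problem (4.4) has a unique solution $u^l = u^l(t,x)$ such that for any $\sigma \ge 0$ we have
\[ \|u^l(t)\|_{H^\sigma(\mathbb{R}^2)} \lesssim \lambda^{-1+\delta} \tag{4.10} \]
uniformly for all $t \in [0,1]$.

Proof. See Appendix.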

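Using Lemma 4.1 we can estimate Sobolev $H^\sigma$ norms of the high frequency term $u^h(t)$. From (4.2) and (4.3) we have
\begin{align*}
u^h(t,x) = \Big(&\lambda^{-s-\delta}\,\phi\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\cos(\lambda x_2 - \omega t) + \lambda^{-s-1-2\delta}\,\phi\Big(\frac{x_1}{\lambda^\delta}\Big)\phi'\Big(\frac{x_2}{\lambda^\delta}\Big)\sin(\lambda x_2 - \omega t), \\
&-\lambda^{-s-1-2\delta}\,\phi'\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\sin(\lambda x_2 - \omega t)\Big). \tag{4.11}
\end{align*}
For any $\sigma \ge 0$ we estimate its $H^\sigma$ norm
\[ \|u^h(t)\|_{H^\sigma(\mathbb{R}^2)} = \|u_1^h(t)\|_{H^\sigma(\mathbb{R}^2)} + \|u_2^h(t)\|_{H^\sigma(\mathbb{R}^2)} \]
by the sum of three terms
\begin{align*}
&\lambda^{-s-\delta}\,\Big\|\phi\Big(\frac{\cdot}{\lambda^\delta}\Big)\Big\|_{H^\sigma_{x_1}}\Big\|\phi\Big(\frac{\cdot}{\lambda^\delta}\Big)\cos(\lambda\cdot - \omega t)\Big\|_{H^\sigma_{x_2}}
+ \lambda^{-s-1-2\delta}\,\Big\|\phi\Big(\frac{\cdot}{\lambda^\delta}\Big)\Big\|_{H^\sigma_{x_1}}\Big\|\phi'\Big(\frac{\cdot}{\lambda^\delta}\Big)\sin(\lambda\cdot - \omega t)\Big\|_{H^\sigma_{x_2}} \\
&\quad + \lambda^{-s-1-2\delta}\,\Big\|\phi'\Big(\frac{\cdot}{\lambda^\delta}\Big)\Big\|_{H^\sigma_{x_1}}\Big\|\phi\Big(\frac{\cdot}{\lambda^\delta}\Big)\sin(\lambda\cdot - \omega t)\Big\|_{H^\sigma_{x_2}} \tag{4.12}
\end{align*}
and apply (4.7) and (4.8) to obtain
\[ \|u^h(t)\|_{H^\sigma(\mathbb{R}^2)} \lesssim \lambda^{-s+\sigma} \tag{4.13} \]
which holds uniformly in $t \in [0,1]$. Similarly, we find another bound
\[ \|u^h(t)\|_\infty \lesssim \|u_1^h(t)\|_\infty + \|u_2^h(t)\|_\infty \lesssim \lambda^{-s-\delta} \tag{4.14} \]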

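also valid for any $t \in [0,1]$.

We proceed to derive $L^2$ estimates of the error terms. We expect the contributions involving derivatives of the high frequency part to have the slowest decay in $\lambda$. It is possible to improve this decay somewhat by combining $E_1$ and $E_2$ and using energy estimates for the low frequency part. We will therefore bound the sum $E_1 + E_2$. Observe that
\begin{align*}
\partial_t u^h(t,x) = \Big(&\omega\lambda^{-s-\delta}\,\phi\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\sin(\lambda x_2 - \omega t) - \omega\lambda^{-s-1-2\delta}\,\phi\Big(\frac{x_1}{\lambda^\delta}\Big)\phi'\Big(\frac{x_2}{\lambda^\delta}\Big)\cos(\lambda x_2 - \omega t), \\
&\omega\lambda^{-s-1-2\delta}\,\phi'\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\cos(\lambda x_2 - \omega t)\Big),
\end{align*}
and hence with our choices of the cut-offs $\psi_1$ and $\psi_2$ the first term in the first component of $\partial_t u^h(t,x)$ can be written as
\begin{align*}
\omega\lambda^{-s-\delta}\,\phi\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\sin(\lambda x_2 - \omega t)
&= \omega\lambda^{-1}\,\psi_1'\Big(\frac{x_1}{\lambda^\delta}\Big)\psi_2\Big(\frac{x_2}{\lambda^\delta}\Big)\,\lambda^{-s+1-\delta}\,\phi\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\sin(\lambda x_2 - \omega t) \\
&= \lambda^{-s+1-\delta}\,u_2^l(0,x)\,\phi\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\sin(\lambda x_2 - \omega t),
\end{align*}
using the formula for $u^l(0,x)$ given by (4.4) and (4.5). We can now write the first component of $E_1 + E_2$ explicitly in the form
\begin{align*}
\big(\partial_t u^h + \nabla_{u^l} u^h\big)_1(t,x)
&= \lambda^{-s+1-\delta}\big(u_2^l(0,x) - u_2^l(t,x)\big)\,\phi\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\sin(\lambda x_2 - \omega t) \\
&\quad - \omega\lambda^{-s-1-2\delta}\,\phi\Big(\frac{x_1}{\lambda^\delta}\Big)\phi'\Big(\frac{x_2}{\lambda^\delta}\Big)\cos(\lambda x_2 - \omega t) \\
&\quad + \lambda^{-s-2\delta}\,u_1^l(t,x)\,\phi'\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\cos(\lambda x_2 - \omega t) \\
&\quad + \lambda^{-s-1-3\delta}\,u_1^l(t,x)\,\phi'\Big(\frac{x_1}{\lambda^\delta}\Big)\phi'\Big(\frac{x_2}{\lambda^\delta}\Big)\sin(\lambda x_2 - \omega t) \\
&\quad + 2\lambda^{-s-2\delta}\,u_2^l(t,x)\,\phi\Big(\frac{x_1}{\lambda^\delta}\Big)\phi'\Big(\frac{x_2}{\lambda^\delta}\Big)\cos(\lambda x_2 - \omega t) \\
&\quad + \lambda^{-s-1-3\delta}\,u_2^l(t,x)\,\phi\Big(\frac{x_1}{\lambda^\delta}\Big)\phi''\Big(\frac{x_2}{\lambda^\delta}\Big)\sin(\lambda x_2 - \omega t),
\end{align*}
while its second component is
\begin{align*}
\big(\partial_t u^h + \nabla_{u^l} u^h\big)_2(t,x)
&= \omega\lambda^{-s-1-2\delta}\,\phi'\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\cos(\lambda x_2 - \omega t) \\
&\quad - \lambda^{-s-1-3\delta}\,u_1^l(t,x)\,\phi''\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\sin(\lambda x_2 - \omega t) \\
&\quad - \lambda^{-s-2\delta}\,u_2^l(t,x)\,\phi'\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\cos(\lambda x_2 - \omega t) \\
&\quad - \lambda^{-s-1-3\delta}\,u_2^l(t,x)\,\phi'\Big(\frac{x_1}{\lambda^\delta}\Big)\phi'\Big(\frac{x_2}{\lambda^\delta}\Big)\sin(\lambda x_2 - \omega t).
\end{align*}
Using the estimates in Lemma 4.2 and Lemma 4.1 we can now estimate the $L^2$ norm of the first two error terms,
\[ \|E_1 + E_2\|_{L^2} \lesssim \big\|\big(\partial_t u^h + \nabla_{u^l} u^h\big)_1(t)\big\|_{L^2(\mathbb{R}^2)} + \big\|\big(\partial_t u^h + \nabla_{u^l} u^h\big)_2(t)\big\|_{L^2(\mathbb{R}^2)}. \]
The norm of the first component is bounded by the sum
\begin{align*}
&\lambda^{-s+1-\delta}\,\|u_2^l(t) - u_2^l(0)\|_{L^2}\,\|\phi\|_\infty\,\Big\|\phi\Big(\frac{\cdot}{\lambda^\delta}\Big)\sin(\lambda\cdot - \omega t)\Big\|_\infty
+ \lambda^{-s-1-2\delta}\,\Big\|\phi\Big(\frac{\cdot}{\lambda^\delta}\Big)\Big\|_{L^2}\Big\|\phi'\Big(\frac{\cdot}{\lambda^\delta}\Big)\cos(\lambda\cdot - \omega t)\Big\|_{L^2(\mathbb{R})} \\
&\quad + \|u_1^l(t)\|_{L^2}\,\|\phi'\|_\infty\Big(\lambda^{-s-2\delta}\,\Big\|\phi\Big(\frac{\cdot}{\lambda^\delta}\Big)\cos(\lambda\cdot - \omega t)\Big\|_\infty + \lambda^{-s-1-3\delta}\,\Big\|\phi'\Big(\frac{\cdot}{\lambda^\delta}\Big)\sin(\lambda\cdot - \omega t)\Big\|_\infty\Big) \\
&\quad + \|u_2^l(t)\|_{L^2}\,\|\phi\|_\infty\Big(\lambda^{-s-2\delta}\,\Big\|\phi'\Big(\frac{\cdot}{\lambda^\delta}\Big)\cos(\lambda\cdot - \omega t)\Big\|_\infty + \lambda^{-s-1-3\delta}\,\Big\|\phi''\Big(\frac{\cdot}{\lambda^\delta}\Big)\sin(\lambda\cdot - \omega t)\Big\|_\infty\Big),
\end{align*}
and that of the second component is bounded by
\begin{align*}
&\lambda^{-s-1-2\delta}\,\Big\|\phi'\Big(\frac{\cdot}{\lambda^\delta}\Big)\Big\|_{L^2(\mathbb{R})}\Big\|\phi\Big(\frac{\cdot}{\lambda^\delta}\Big)\cos(\lambda\cdot - \omega t)\Big\|_{L^2(\mathbb{R})}
+ \lambda^{-s-1-3\delta}\,\|u_1^l(t)\|_{L^2(\mathbb{R}^2)}\,\|\phi''\|_\infty\,\Big\|\phi\Big(\frac{\cdot}{\lambda^\delta}\Big)\sin(\lambda\cdot - \omega t)\Big\|_\infty \\
&\quad + \|u_2^l(t)\|_{L^2}\,\|\phi'\|_\infty\Big(\lambda^{-s-2\delta}\,\Big\|\phi\Big(\frac{\cdot}{\lambda^\delta}\Big)\cos(\lambda\cdot - \omega t)\Big\|_\infty + \lambda^{-s-1-3\delta}\,\Big\|\phi'\Big(\frac{\cdot}{\lambda^\delta}\Big)\sin(\lambda\cdot - \omega t)\Big\|_\infty\Big).
\end{align*}
Combining these estimates and using (4.7), (4.8) and (4.10) we get
\begin{align*}
\|E_1 + E_2\|_{L^2} &\lesssim \lambda^{-s+1-\delta}\int_0^T \|\partial_t u^l(t)\|_{L^2(\mathbb{R}^2)}\,dt + \lambda^{-s-1-2\delta}\lambda^{\delta/2}\lambda^{\delta/2} + \lambda^{-s-2\delta}\lambda^{-1+\delta} + \lambda^{-s-1-3\delta}\lambda^{-1+\delta} \\
&\quad + \lambda^{-s-2\delta}\lambda^{-1+\delta} + \lambda^{-s-1-3\delta}\lambda^{-1+\delta} + \lambda^{-s-1-2\delta}\lambda^{\delta/2}\lambda^{\delta/2} + \lambda^{-s-1-3\delta}\lambda^{-1+\delta} + \lambda^{-s-2\delta}\lambda^{-1+\delta} + \lambda^{-s-1-3\delta}\lambda^{-1+\delta} \\
&\lesssim \lambda^{-s-1-\delta} + \lambda^{-s+1-\delta}\int_0^T \|\partial_t u^l(t)\|_{L^2(\mathbb{R}^2)}\,dt.
\end{align*}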


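Since $u^l(t,x)$ is defined in (4.4) as a solution of the Euler equations we can estimate the integral term above using Lemma 4.2 by
\[ \int_0^T \|\partial_t u^l(t)\|_{L^2(\mathbb{R}^2)}\,dt = \int_0^T \|P\nabla_{u^l} u^l(t)\|_{L^2(\mathbb{R}^2)}\,dt \lesssim \int_0^T \|u^l(t)\|_\infty\,\|u^l(t)\|_{H^1(\mathbb{R}^2)}\,dt \lesssim \lambda^{2(-1+\delta)}, \]
where $P$ is the $L^2$ orthogonal Hodge projection onto divergence free vector fields defined in (2.3). We have therefore obtained the estimate
\[ \|E_1 + E_2\|_{L^2(\mathbb{R}^2)} \lesssim \lambda^{-s-1+\delta}. \tag{4.15} \]
Using the bounds on the high frequency term in (4.13) and (4.14) and proceeding as above we find estimates of the remaining error terms
\begin{align*}
\|E_3 + E_5\|_{L^2(\mathbb{R}^2)} &= \|P\nabla_{u^h} u^l - (1-P)\nabla_{u^h} u^l\|_{L^2(\mathbb{R}^2)} \\
&\le 2\,\|u^h(t)\|_\infty\,\|u^l(t)\|_{H^1(\mathbb{R}^2)} \\
&\lesssim \lambda^{-s-\delta}\lambda^{-1+\delta} = \lambda^{-s-1}, \tag{4.16}
\end{align*}
and similarly
\begin{align*}
\|E_4 + E_6\|_{L^2(\mathbb{R}^2)} &= \|P\nabla_{u^h} u^h\|_{L^2(\mathbb{R}^2)} \\
&\le \|u^h(t)\|_\infty\,\|u^h(t)\|_{H^1(\mathbb{R}^2)} \\
&\lesssim \lambda^{-s-\delta}\lambda^{-s+1} = \lambda^{-2s+1-\delta}. \tag{4.17}
\end{align*}
Collecting the estimates above gives the following $L^2$ bound:
\[ \Big\|\sum_{j=1}^{6} E_j\Big\|_{L^2(\mathbb{R}^2)} \lesssim \lambda^{-r_{s,\delta}}, \tag{4.18} \]
where $r_{s,\delta} = \min(2s - 1 + \delta,\ s + 1 - \delta)$.

Note that in order to assure that the error terms are small for $\lambda \gg 1$ we need $r_{s,\delta} > 0$.

4.3. Construction of solutions. Our next task will be to show that the family of functions $u^{\omega,\lambda}$ constructed in the previous subsections is a sufficiently good approximation to solutions of the Euler equations. Let $u_{\omega,\lambda} = u_{\omega,\lambda}(t,x)$ be the unique solution of the Euler equations in $\mathbb{R}^2$ with initial data given by the values of $u^{\omega,\lambda}$ at time $t = 0$. Namely,
\begin{align*}
&\partial_t u_{\omega,\lambda} + \nabla_{u_{\omega,\lambda}} u_{\omega,\lambda} = -\nabla p_{\omega,\lambda}, \qquad \operatorname{div} u_{\omega,\lambda} = 0, \\
&u_{\omega,\lambda}(0,x) = u^{\omega,\lambda}(0,x) = u^l(0,x) + u^h(0,x), \qquad x \in \mathbb{R}^2,\ t \in \mathbb{R}. \tag{4.19}
\end{align*}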


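Observe that from formulas (4.5) and (4.11) and the estimates in (4.13) and Lemma 4.1 we have
\[ \|u_{\omega,\lambda}(0)\|_{H^s(\mathbb{R}^2)} \le \|u^l(0)\|_{H^s(\mathbb{R}^2)} + \|u^h(0)\|_{H^s(\mathbb{R}^2)} \lesssim \lambda^{-1+\delta} + 1 \lesssim 1, \]
provided that we pick $\delta > 0$ such that
\[ 0 < \delta < 1. \tag{4.20} \]
It follows that if $s > 2$ then by the local existence and uniqueness theorem for the Euler equations the solution $u_{\omega,\lambda}(t,x)$ is defined globally in time and with values in $H^s(\mathbb{R}^2)$. In fact, it also follows from the bound on the lifespan (2.7) in the local well-posedness theorem, see estimate (5.1).

Next, consider the difference $v = u^{\omega,\lambda} - u_{\omega,\lambda}$ between the approximate and the real solutions constructed above and observe that $v$ satisfies the Cauchy problem
\[ \partial_t v - P\nabla_v v + P\nabla_{u^{\omega,\lambda}} v + P\nabla_v u^{\omega,\lambda} = \sum_{j=1}^{6} E_j, \qquad v(0) = 0. \tag{4.21} \]
Standard energy estimates give
\begin{align*}
\frac{1}{2}\,\partial_t\|v\|_{L^2(\mathbb{R}^2)}^2 = \int_{\mathbb{R}^2}\langle\partial_t v, v\rangle
&= -\int_{\mathbb{R}^2}\langle\nabla_v u^{\omega,\lambda}, v\rangle + \int_{\mathbb{R}^2}\Big\langle\sum_j E_j, v\Big\rangle \\
&\le \|\nabla_v u^{\omega,\lambda}\|_{L^2(\mathbb{R}^2)}\|v\|_{L^2(\mathbb{R}^2)} + \Big\|\sum_j E_j\Big\|_{L^2(\mathbb{R}^2)}\|v\|_{L^2(\mathbb{R}^2)} \\
&\lesssim \|u^{\omega,\lambda}\|_{C^1(\mathbb{R}^2)}\|v\|_{L^2(\mathbb{R}^2)}^2 + \lambda^{-r_{s,\delta}}\|v\|_{L^2(\mathbb{R}^2)},
\end{align*}
where by the estimate of Lemma 4.2, the explicit formula for $u^h(t,x)$ in (4.11) and the Sobolev lemma we have
\[ \|u^{\omega,\lambda}(t)\|_{C^1(\mathbb{R}^2)} \le \|u^l(t)\|_{H^{2+\varepsilon}(\mathbb{R}^2)} + \|u^h(t)\|_{C^1(\mathbb{R}^2)} \lesssim \lambda^{-1+\delta} + \lambda^{-s+1-\delta}. \]
Therefore, we obtain
\[ \partial_t\|v\|_{L^2(\mathbb{R}^2)} \lesssim \max\{\lambda^{-1+\delta}, \lambda^{-s+1-\delta}\}\,\|v\|_{L^2(\mathbb{R}^2)} + \lambda^{-r_{s,\delta}}, \]
and using Gronwall’s inequality we find
\[ \|v(t)\|_{L^2(\mathbb{R}^2)} \lesssim \lambda^{-r_{s,\delta}}\,e^{t\max\{\lambda^{-1+\delta},\,\lambda^{-s+1-\delta}\}}, \tag{4.22} \]
which in particular holds uniformly for all $t \in [0,1]$.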
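For the reader's convenience, the Gronwall step can be spelled out as follows. This is a formal sketch under notation introduced only here: $y(t) = \|v(t)\|_{L^2(\mathbb{R}^2)}$, $A = C\max\{\lambda^{-1+\delta}, \lambda^{-s+1-\delta}\}$ and $B = C\lambda^{-r_{s,\delta}}$ for a suitable constant $C$. It ignores the (standard) technical point that $y$ need not be differentiable where $v$ vanishes.

```latex
\begin{align*}
y'(t) \le A\,y(t) + B,\quad y(0) = 0
&\;\Longrightarrow\; \frac{d}{dt}\big(e^{-At}y(t)\big) \le B\,e^{-At} \\
&\;\Longrightarrow\; y(t) \le B\int_0^t e^{A(t-\tau)}\,d\tau = \frac{B}{A}\big(e^{At}-1\big) \le B\,t\,e^{At}.
\end{align*}
```

For $t \in [0,1]$ the right-hand side is at most $B\,e^{At}$, which is the form of the bound (4.22).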


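4.4. Conclusion of the proof. Let $u_{+1,\lambda}(t)$ and $u_{-1,\lambda}(t)$ be two sequences of solutions of the Cauchy problem (4.19) corresponding to initial conditions $u_{+1,\lambda}(0) = u^{+1,\lambda}(0)$ and $u_{-1,\lambda}(0) = u^{-1,\lambda}(0)$. Since $0 < \delta < 1$, for any integer $k > 2$ we have
\[ \|u^{\pm 1,\lambda}(t)\|_{H^k(\mathbb{R}^2)} \le \|u^l(t)\|_{H^k(\mathbb{R}^2)} + \|u^h(t)\|_{H^k(\mathbb{R}^2)} \lesssim \lambda^{-1+\delta} + \lambda^{-s+k} \lesssim \lambda^{-s+k} \]
uniformly in $t$ by (4.13) and the estimate of Lemma 4.2. Using the corresponding energy estimate for the solutions $u_{\pm 1,\lambda}(t)$ (see the Appendix) we then also get
\[ \|u_{\pm 1,\lambda}(t)\|_{H^k(\mathbb{R}^2)} \lesssim \|u_{\pm 1,\lambda}(0)\|_{H^k(\mathbb{R}^2)} = \|u^{\pm 1,\lambda}(0)\|_{H^k(\mathbb{R}^2)} \lesssim \lambda^{-s+k}. \]
Put together these estimates give the following bound for the difference:
\[ \|u^{\pm 1,\lambda}(t) - u_{\pm 1,\lambda}(t)\|_{H^k(\mathbb{R}^2)} \lesssim \lambda^{-s+k}, \tag{4.23} \]
which holds uniformly in time for any integer $k > n/2 + 1$. On the other hand, since $0 < \delta < 1$, we see that if $s > 1 - \delta$, then $r_{s,\delta} > 0$ (see the definition in (4.18) above). Furthermore the exponential term in (4.22) is bounded for large $\lambda \gg 1$ and thus we have
\[ \|u^{\pm 1,\lambda}(t) - u_{\pm 1,\lambda}(t)\|_{L^2(\mathbb{R}^2)} \lesssim \lambda^{-r_{s,\delta}} \longrightarrow 0 \quad\text{as } \lambda \to \infty. \tag{4.24} \]
Now let $s > 0$. Choosing $\delta \in (0,1)$ such that $s > 1 - \delta$ we have $r_{s,\delta} > 0$. Interpolating between $s_1 = 0$ and $s_2 = k = [s] + 2$ gives
\begin{align*}
\|u^{\pm 1,\lambda}(t) - u_{\pm 1,\lambda}(t)\|_{H^s(\mathbb{R}^2)}
&\le \|u^{\pm 1,\lambda}(t) - u_{\pm 1,\lambda}(t)\|_{L^2(\mathbb{R}^2)}^{(k-s)/k}\ \|u^{\pm 1,\lambda}(t) - u_{\pm 1,\lambda}(t)\|_{H^k(\mathbb{R}^2)}^{s/k} \\
&\lesssim \lambda^{-r_{s,\delta}(k-s)/k}\,\lambda^{(-s+k)s/k} = \lambda^{-(r_{s,\delta}-s)(k-s)/k}. \tag{4.25}
\end{align*}
Observe that by definition of $r_{s,\delta}$ in (4.18) and our choice of $\delta$ in (4.20) we have
\[ r_{s,\delta} - s = \min(s - 1 + \delta,\ 1 - \delta) > 0, \tag{4.26} \]
and therefore (4.23) and (4.26) give the key estimate
\[ \|u^{\pm 1,\lambda}(t) - u_{\pm 1,\lambda}(t)\|_{H^s(\mathbb{R}^2)} \lesssim \lambda^{-(r_{s,\delta}-s)(k-s)/k} \longrightarrow 0 \quad\text{as } \lambda \to \infty. \tag{4.27} \]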
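The bookkeeping of exponents in (4.18) and (4.23)–(4.27) is easy to get wrong, so a small sketch may help. The helper below is hypothetical (the name `decay_exponents` and its return layout are not from the text); it simply evaluates $r_{s,\delta}$, the interpolation endpoint $k = [s] + 2$, the gap $r_{s,\delta} - s$ and the final decay exponent $(r_{s,\delta}-s)(k-s)/k$, and rejects parameter choices that violate $0 < \delta < 1$ or $s > 1 - \delta$.

```python
import math

def decay_exponents(s, delta):
    """Exponents appearing in (4.18) and (4.25)-(4.27) for given s and delta."""
    if not (0 < delta < 1 and s > 1 - delta):
        raise ValueError("need 0 < delta < 1 and s > 1 - delta")
    r = min(2 * s - 1 + delta, s + 1 - delta)  # L^2 error exponent r_{s,delta}
    k = math.floor(s) + 2                      # interpolation endpoint k = [s] + 2
    gap = r - s                                # equals min(s - 1 + delta, 1 - delta)
    final = gap * (k - s) / k                  # H^s decay exponent in (4.27)
    return r, k, gap, final
```

For example, $s = 0.5$ and $\delta = 0.75$ give $r_{s,\delta} = 0.75$, $k = 2$, gap $0.25$ and final exponent $0.1875$, so the $H^s$ distance between approximate and exact solutions decays like $\lambda^{-0.1875}$.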


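We can now complete the proof as follows. On the one hand, we have
\begin{align*}
\|u_{+1,\lambda}(0) - u_{-1,\lambda}(0)\|_{H^s(\mathbb{R}^2)}
&= 2\lambda^{-1}\,\Big\|\psi_1\Big(\frac{\cdot}{\lambda^\delta}\Big)\Big\|_{H^s_{x_1}(\mathbb{R})}\Big\|\psi_2'\Big(\frac{\cdot}{\lambda^\delta}\Big)\Big\|_{H^s_{x_2}(\mathbb{R})}
+ 2\lambda^{-1}\,\Big\|\psi_1'\Big(\frac{\cdot}{\lambda^\delta}\Big)\Big\|_{H^s_{x_1}(\mathbb{R})}\Big\|\psi_2\Big(\frac{\cdot}{\lambda^\delta}\Big)\Big\|_{H^s_{x_2}(\mathbb{R})} \\
&\lesssim \lambda^{-1+\delta} + \lambda^{-1+\delta} \longrightarrow 0, \tag{4.28}
\end{align*}
provided that $\lambda \to \infty$. On the other hand, for any $t > 0$ by the triangle inequality and (4.25) we have
\begin{align*}
\|u_{+1,\lambda}(t) - u_{-1,\lambda}(t)\|_{H^s(\mathbb{R}^2)}
&\ge \|u^{+1,\lambda}(t) - u^{-1,\lambda}(t)\|_{H^s(\mathbb{R}^2)} - \|u^{+1,\lambda}(t) - u_{+1,\lambda}(t)\|_{H^s(\mathbb{R}^2)} - \|u^{-1,\lambda}(t) - u_{-1,\lambda}(t)\|_{H^s(\mathbb{R}^2)} \\
&\gtrsim \|u^{+1,\lambda}(t) - u^{-1,\lambda}(t)\|_{H^s(\mathbb{R}^2)} - \lambda^{-(r_{s,\delta}-s)(k-s)/k}. \tag{4.29}
\end{align*}
The difference of the approximate solutions (with obvious notation for low and high frequency terms) can be written as
\[ u^{+1,\lambda}(t,x) - u^{-1,\lambda}(t,x) = u^{l,+1}(t,x) - u^{l,-1}(t,x) + u^{h,+1}(t,x) - u^{h,-1}(t,x), \]
so that by Lemma 4.2 we have
\begin{align*}
\|u^{+1,\lambda}(t) - u^{-1,\lambda}(t)\|_{H^s}
&\ge \|u^{h,+1}(t) - u^{h,-1}(t)\|_{H^s(\mathbb{R}^2)} - \|u^{l,+1}(t)\|_{H^s(\mathbb{R}^2)} - \|u^{l,-1}(t)\|_{H^s(\mathbb{R}^2)} \\
&\gtrsim \|u^{h,+1}(t) - u^{h,-1}(t)\|_{H^s(\mathbb{R}^2)} - \lambda^{-1+\delta}. \tag{4.30}
\end{align*}
Furthermore, the difference of the high frequency terms appearing on the right side can be expressed explicitly using (4.11) as
\begin{align*}
u^{h,+1}(t,x) - u^{h,-1}(t,x) = \Big(&\lambda^{-s-\delta}\,\phi\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\big(\cos(\lambda x_2 - t) - \cos(\lambda x_2 + t)\big) \\
&+ \lambda^{-s-1-2\delta}\,\phi\Big(\frac{x_1}{\lambda^\delta}\Big)\phi'\Big(\frac{x_2}{\lambda^\delta}\Big)\big(\sin(\lambda x_2 - t) - \sin(\lambda x_2 + t)\big), \\
&-\lambda^{-s-1-2\delta}\,\phi'\Big(\frac{x_1}{\lambda^\delta}\Big)\phi\Big(\frac{x_2}{\lambda^\delta}\Big)\big(\sin(\lambda x_2 - t) - \sin(\lambda x_2 + t)\big)\Big),
\end{align*}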


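so that using the triangle inequality and a little trigonometry as in the periodic case together with Lemma 4.1 we obtain
\begin{align*}
\|u^{h,+1}(t) - u^{h,-1}(t)\|_{H^s}
&\gtrsim |\sin t|\,\lambda^{-s-\delta}\,\Big\|\phi\Big(\frac{\cdot}{\lambda^\delta}\Big)\Big\|_{H^s_{x_1}}\Big\|\phi\Big(\frac{\cdot}{\lambda^\delta}\Big)\sin\lambda(\cdot)\Big\|_{H^s_{x_2}} \\
&\quad - \lambda^{-s-1-2\delta}\,\Big\|\phi\Big(\frac{\cdot}{\lambda^\delta}\Big)\Big\|_{H^s_{x_1}}\Big\|\phi'\Big(\frac{\cdot}{\lambda^\delta}\Big)\big(\sin(\lambda\cdot - t) - \sin(\lambda\cdot + t)\big)\Big\|_{H^s_{x_2}} \\
&\quad + \lambda^{-s-1-2\delta}\,\Big\|\phi'\Big(\frac{\cdot}{\lambda^\delta}\Big)\Big\|_{H^s_{x_1}}\Big\|\phi\Big(\frac{\cdot}{\lambda^\delta}\Big)\big(\sin(\lambda\cdot - t) - \sin(\lambda\cdot + t)\big)\Big\|_{H^s_{x_2}} \\
&\gtrsim |\sin t|\,\lambda^{-s-\delta}\lambda^{\delta/2}\lambda^{s+\delta/2} - \lambda^{-s-1-2\delta}\lambda^{\delta/2}\lambda^{s+\delta/2} \\
&\gtrsim |\sin t| - \lambda^{-1-\delta}.
\end{align*}
Combining this estimate with (4.29) and (4.30) we obtain
\[ \|u_{+1,\lambda}(t) - u_{-1,\lambda}(t)\|_{H^s(\mathbb{R}^2)} \gtrsim |\sin t| - \lambda^{-1-\delta} - \lambda^{-1+\delta} - \lambda^{-(r_{s,\delta}-s)(k-s)/k} \longrightarrow |\sin t| \]
as $\lambda \to \infty$ for any $0 < t < 1$. The proof of part (NP) of Theorem 2.1 is complete.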

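5. Appendix

In order to make the paper self-contained we provide here proofs of the estimates omitted from the main text.

5.1. Proof of Lemma 4.1. Let $\sigma \ge 0$ and $\delta \ge 0$. For any Schwartz function $\psi \in \mathcal{S}(\mathbb{R})$ and any $\lambda \gg 1$ a simple computation gives
\[
\begin{aligned}
\Big\|\psi\Big(\frac{\cdot}{\lambda^\delta}\Big)\Big\|_{H^\sigma}^2
&= \frac{1}{2\pi}\int_{\mathbb{R}} (1+\xi^2)^\sigma \Big|\widehat{\psi\big(\tfrac{\cdot}{\lambda^\delta}\big)}(\xi)\Big|^2 d\xi
 = \frac{1}{2\pi}\int_{\mathbb{R}} (1+\xi^2)^\sigma \big|\lambda^\delta \hat\psi(\lambda^\delta \xi)\big|^2 d\xi \\
&= \lambda^\delta \frac{1}{2\pi}\int_{\mathbb{R}} \Big(1+\frac{1}{\lambda^{2\delta}}(\lambda^\delta\xi)^2\Big)^\sigma \big|\hat\psi(\lambda^\delta \xi)\big|^2 d(\lambda^\delta \xi)
 = \lambda^\delta \frac{1}{2\pi}\int_{\mathbb{R}} \Big(1+\frac{\xi^2}{\lambda^{2\delta}}\Big)^\sigma \big|\hat\psi(\xi)\big|^2 d\xi,
\end{aligned}
\]
which proves estimate (4.7) of Lemma 4.1. Next, for any constant $a \in \mathbb{R}$ we have
\[
\begin{aligned}
\Big(\psi\Big(\frac{\cdot}{\lambda^\delta}\Big)\cos(\lambda\cdot - a)\Big)^{\wedge}(\xi)
&= \int_{\mathbb{R}} e^{-ix\xi}\,\psi(\lambda^{-\delta}x)\cos(\lambda x - a)\,dx \\
&= \frac{e^{-ia}}{2}\int_{\mathbb{R}} e^{-ix(\xi-\lambda)}\psi(\lambda^{-\delta}x)\,dx
 + \frac{e^{ia}}{2}\int_{\mathbb{R}} e^{-ix(\xi+\lambda)}\psi(\lambda^{-\delta}x)\,dx \\
&= \frac{e^{-ia}}{2}\lambda^\delta\hat\psi\big(\lambda^\delta(\xi-\lambda)\big)
 + \frac{e^{ia}}{2}\lambda^\delta\hat\psi\big(\lambda^\delta(\xi+\lambda)\big).
\end{aligned}
\]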


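The second estimate (4.8) follows now from
\[
\begin{aligned}
\lambda^{-2\sigma-\delta}\Big\|\psi\Big(\frac{\cdot}{\lambda^\delta}\Big)\cos(\lambda\cdot - a)\Big\|_{H^\sigma}^2
&= \lambda^{-2\sigma-\delta}\frac{1}{2\pi}\int_{\mathbb{R}}(1+\xi^2)^\sigma\Big|\big(\psi(\lambda^{-\delta}\cdot)\cos(\lambda\cdot - a)\big)^{\wedge}(\xi)\Big|^2 d\xi \\
&= \lambda^{-2\sigma}\frac{1}{8\pi}\bigg[\int_{\mathbb{R}}\big(1+(\lambda^{-\delta}\xi+\lambda)^2\big)^\sigma|\hat\psi(\xi)|^2\,d\xi \\
&\qquad + 2\int_{\mathbb{R}}\big(1+(\lambda^{-\delta}\xi+\lambda)^2\big)^\sigma\,\mathrm{Re}\Big(e^{-2ia}\hat\psi(\xi)\overline{\hat\psi(\xi+2\lambda^{1+\delta})}\Big)\,d\xi \\
&\qquad + \int_{\mathbb{R}}\big(1+(\lambda^{-\delta}\xi-\lambda)^2\big)^\sigma|\hat\psi(\xi)|^2\,d\xi\bigg],
\end{aligned}
\]
which is equal to
\[
\begin{aligned}
&\frac{1}{8\pi}\int_{\mathbb{R}}\big(\lambda^{-2}+(\lambda^{-1-\delta}\xi+1)^2\big)^\sigma|\hat\psi(\xi)|^2\,d\xi
+ \frac{1}{8\pi}\int_{\mathbb{R}}\big(\lambda^{-2}+(\lambda^{-1-\delta}\xi-1)^2\big)^\sigma|\hat\psi(\xi)|^2\,d\xi \\
&\quad + \frac{1}{4\pi}\int_{\mathbb{R}}\big(\lambda^{-2}+(\lambda^{-1-\delta}\xi+1)^2\big)^\sigma\,\mathrm{Re}\Big(e^{-2ia}\hat\psi(\xi)\overline{\hat\psi(\xi+2\lambda^{1+\delta})}\Big)\,d\xi.
\end{aligned}
\]
By the dominated convergence theorem the first two terms together converge to $\tfrac12\|\psi\|_{L^2}^2$ while the third term vanishes as $\lambda \to \infty$.

5.2. Proof of Lemma 4.2. The inequality in (4.10) follows from Lemma 4.1 and an energy estimate for (4.4) in $H^s(\mathbb{R}^n,\mathbb{R}^n)$, where $s > n/2+1$. For the energy estimate one usual trick is to use Friedrichs mollifiers $J_\varepsilon$ ($\varepsilon > 0$) combined with a limiting argument. First, we replace (4.4) with a regularized equation
\[
\partial_t \Lambda^s J_\varepsilon u_l + \Lambda^s J_\varepsilon P\nabla_{u_l}u_l = 0, \qquad P = 1 - \nabla\Delta^{-1}\mathrm{div},
\]
where $\Lambda^s = (1-\Delta)^{s/2}$. We can arrange so that the pseudodifferential operators $J_\varepsilon$, $\Lambda^s$ and $P$ commute and then proceed with standard estimates to get
\[
\frac12\,\partial_t\|J_\varepsilon u_l\|_{H^s}^2
= -\int_{\mathbb{R}^n}\big\langle \nabla_{u_l}\Lambda^s J_\varepsilon u_l,\, P\Lambda^s J_\varepsilon u_l\big\rangle
 - \int_{\mathbb{R}^n}\big\langle [\Lambda^s J_\varepsilon, \nabla_{u_l}]u_l,\, \Lambda^s J_\varepsilon u_l\big\rangle
\le C\|u_l\|_{C^1}\|u_l\|_{H^s}^2.
\]
Rewriting it as an integral inequality and passing to the limit with $\varepsilon \to 0$ we eliminate the dependence on $\varepsilon$ on the left-hand side. Integrating in time over $[0,t]$ we get
\[
\|u_l(t)\|_{H^s} \le \frac{\|u_l(0)\|_{H^s}}{1 - Ct\,\|u_l(0)\|_{H^s}},
\]
so that
\[
\|u_l(t)\|_{H^s} \le 2\|u_l(0)\|_{H^s}
\quad\text{for all } 0 \le t \le T < T_c = \frac{1}{2C\|u_l(0)\|_{H^s}}.
\]
Finally, observe that for any $\sigma \le s$ we have
\[
\|u_l(t)\|_{H^\sigma(\mathbb{R}^n)} \le \|u_l(t)\|_{H^s(\mathbb{R}^n)} \le 2\|u_l(0)\|_{H^s(\mathbb{R}^n)} \lesssim \lambda^{-1+\delta}
\]
by (4.5) and Lemma 4.1, which holds for all $0 \le t \le T$, where by the estimates above
\[
T \ge \frac12 C^{-1}\lambda^{1-\delta} \ge 1, \tag{5.1}
\]
provided that $\lambda \gg 1$ and $0 < \delta < 1$.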
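The passage from the differential inequality to the bound with denominator $1 - Ct\|u_l(0)\|_{H^s}$ can be spelled out as follows. This is only a sketch of the standard argument; it assumes the Sobolev embedding $\|u_l\|_{C^1} \lesssim \|u_l\|_{H^s}$ for $s > n/2+1$, with the embedding constant absorbed into $C$, and the shorthand $y(t)$ is introduced here for illustration.

```latex
% y(t) := \|u_l(t)\|_{H^s}  (shorthand, not from the original text)
\frac12\frac{d}{dt}\,y^2 \le C\,y^3
\;\Longrightarrow\;
y' \le C\,y^2
\;\Longrightarrow\;
-\frac{d}{dt}\Big(\frac{1}{y}\Big) \le C
\;\Longrightarrow\;
\frac{1}{y(t)} \ge \frac{1}{y(0)} - Ct
\;\Longrightarrow\;
y(t) \le \frac{y(0)}{1 - Ct\,y(0)} .
% For t \le 1/(2C\,y(0)) the denominator is at least 1/2, hence y(t) \le 2\,y(0).
```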

Acknowledgements. The authors would like to thank the referee for constructive suggestions.

References

[AK] Arnold, V., Khesin, B.: Topological Methods in Hydrodynamics. New York: Springer, 1998
[B] Bourgain, J.: Fourier transform restriction phenomena for certain lattice subsets and applications to nonlinear evolution equations. Part II: the KdV Equation. Geom. Funct. Anal. 3, 209–262 (1993)
[C] Constantin, P.: On the Euler equations of incompressible fluids. Bull. Amer. Math. Soc. 44, 603–621 (2007)
[EM] Ebin, D., Marsden, J.: Groups of diffeomorphisms and the motion of incompressible fluids. Ann. Math. 92, 341–363 (1970)
[EMP] Ebin, D., Misiołek, G., Preston, S.: Singularities of the exponential map on the volume-preserving diffeomorphism group. Geom. Funct. Anal. 16, 850–868 (2006)
[H] Henry, D.: Geometric Theory of Semilinear Parabolic Equations. Lecture Notes in Mathematics 840, New York: Springer, 1981
[HK] Himonas, A., Kenig, C.: Non-uniform dependence on initial data for the CH equation on the line. Diff. Int. Eqs. 22(3–4), 201–224 (2009)
[HKM] Himonas, A., Kenig, C., Misiołek, G.: Non-uniform dependence for the periodic Camassa-Holm equation. Comm. Part. Diff. Eqs., to appear
[K1] Kato, T.: Quasi-Linear Equations of Evolution with Applications to Partial Differential Equations. Lecture Notes in Mathematics 448, New York: Springer, 1975
[K2] Kato, T.: The Cauchy problem for quasi-linear symmetric hyperbolic systems. Arch. Rat. Mech. Anal. 58, 181–205 (1975)
[KP] Kato, T., Ponce, G.: Commutator estimates and the Euler and Navier-Stokes equations. Comm. Pure Appl. Math. 41, 891–907 (1988)
[KPV] Kenig, C., Ponce, G., Vega, L.: On the ill-posedness of some canonical dispersive equations. Duke Math J. 106, 617–633 (2001)
[KT] Koch, H., Tzvetkov, N.: Nonlinear wave interactions for the Benjamin-Ono equation. Int. Math. Res. Not. 30, 1833–1847 (2005)
[L] Lin, Z.: Nonlinear instability of ideal plane flows. Int. Math. Res. Not. 41, 2147–2178 (2004)
[MB] Majda, A., Bertozzi, A.: Vorticity and Incompressible Flow. Cambridge: Cambridge University Press, 2002
[M] Misiołek, G.: Stability of ideal fluids and the geometry of the group of diffeomorphisms. Indiana Univ. Math. J. 42, 215–235 (1993)
[Sh] Shnirelman, A.: On the nonuniqueness of weak solutions of the Euler equations. Comm. Pure Appl. Math. 50, 1260–1286 (1997)

Communicated by P. Constantin

Commun. Math. Phys. 296, 303–321 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1019-6


Interaction of Four Rarefaction Waves in the Bi-Symmetric Class of the Two-Dimensional Euler Equations

Jiequan Li1, Yuxi Zheng2

1 Department of Mathematics, Capital Normal University, Beijing 100037, People's Republic of China
2 Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected]

Received: 6 August 2008 / Accepted: 27 December 2009
Published online: 24 February 2010 – © Springer-Verlag 2010

Abstract: The global existence and structures of solutions to multi-dimensional unsteady compressible Euler equations are interesting and important open problems. In this paper, we construct global classical solutions to the interaction of four orthogonal planar rarefaction waves with two axes of symmetry for the Euler equations in two space dimensions, in the case where the initial rarefaction waves are large. The bi-symmetric initial data is a basic type of four-wave two-dimensional Riemann problems. The solutions in this case are continuous, bounded and self-similar, and we characterize how large the rarefaction waves must be. We use the methods of hodograph transformation, characteristic decomposition, and phase space analysis. We resolve binary interactions of simple waves in the process.

1. Introduction

Consider the two-dimensional isentropic compressible Euler system
\[
\begin{cases}
\rho_t + (\rho u)_x + (\rho v)_y = 0, \\
(\rho u)_t + (\rho u^2 + p)_x + (\rho u v)_y = 0, \\
(\rho v)_t + (\rho u v)_x + (\rho v^2 + p)_y = 0,
\end{cases}
\tag{1.1}
\]

where ρ is the density, (u, v) is the velocity and p is the pressure given by $p(\rho) = K\rho^\gamma$, where K > 0 will be scaled to be one and γ > 1 is the gas constant. Cauchy problems for (1.1) are open. Riemann problems for (1.1) are a current research topic, as they are reducible to involve fewer independent variables while retaining important features of general solutions. We refer the reader to [2,7,8,13,20,25] for some general solutions to one-dimensional and multidimensional Euler equations, and to [15,30] for the rich flow patterns displayed by solutions to Riemann problems. Shock reflection problems [5,19,24,31,34], in particular, are included in the Riemann problems.

Research partially supported by the Key Program from Beijing Educational Commission (KZ200910028002), 973 project (2006CB805902) and PHR(IHLB) and NSFC (10971142).
Research partially supported by NSF-DMS-0603859, 0908207.

Two-dimensional (2-D) Riemann problems are Cauchy problems with special initial data that are constant along each ray from the origin. The one-dimensional case is quite well-understood [4]. The two-dimensional case was formulated, and the solution configurations conjectured, in [28]. The solution configurations are complicated, as confirmed afterward by several numerical simulations [3,11,12,23]. There has been no rigorous proof of the numerical simulations due to lack of effective methods of analysis. In this paper, in the first success since the proposition of [28], we construct analytic solutions to a case of a configuration of the 2-D four-wave Riemann problems of (1.1), using methods that we have developed in recent years. The construction is based on the analysis of (1.1) in three planes: the self-similar variables (ξ, η) = (x/t, y/t), the inclination angles of characteristics (α, β), and the velocity (u, v)-plane of the hodograph transformation. These forms enable us to do analysis effectively.

The case that we solve in this paper has two axes of symmetry. The initial data of a 2-D Riemann problem may possess a certain symmetry, e.g., axial symmetry, or piecewise constant along the axial direction with one or two axes of symmetry of the plane. A global solution for the axially symmetric case has been constructed [29,33]. Furthermore, a global solution for the binary interaction of two planar rarefaction waves has also been constructed recently [17]. The construction for the interaction of four planar rarefaction waves with one axis of symmetry, a primary class of the two-dimensional four-wave Riemann problems, denoted as Configuration A in [15,28,30], has not been available, but the reason is clearly revealed in paper [10], in which shock formation is established numerically.
The shock formation near the sonic boundary makes the global construction difficult. The class of four planar rarefaction waves with two axes of symmetry, which we call bi-symmetric and which is denoted as Configuration B in [15,28], has also been shown numerically to have shock development in [10] as well as in earlier numerical experiments [3,12,15,23]. However, the extra symmetry makes the configuration accessible by our newly developed tools. We obtain continuous global solutions in this class when the rarefaction waves are large, see Theorem 7.1. We use the hodograph transformation and characteristic decomposition in constructing the global solution. The characteristic decomposition handles simple waves best while the hodograph transformation, valid for non-simple waves, reduces the system to a linearly degenerate one. Assuming that the flow is irrotational and self-similar, Pogodin, Suchkov and Ianenko ([21], 1958) introduced the hodograph transformation to represent the system of equations (1.1) in the velocity variables (u, v), resulting in a decoupled partial differential equation of second order for the speed of sound c. In 2001, Li ([14]) carried out an analysis of the second order equation in the space (c, u, v), through a pair of variables resembling the well-known Riemann invariants together with their invariant regions, and established the existence of a solution to the expansion of a wedge of gas into vacuum in the hodograph plane for wide ranges of the gas constant and the wedge angle. In 2006, paper [16] clarified the concept of simple waves for (1.1). Then, in paper [17], we showed that the hodograph transformation is non-degenerate (and globally one-to-one) precisely for non-simple waves, and that the solutions constructed in [14] in the hodograph plane can be transformed back to the self-similar plane.
Thus a complete procedure of construction of solutions is now available, which we use here for studying the interactions involved in Configuration B – the bi-symmetric class. In particular, the interaction of any two simple waves is completed in this paper, provided that the two waves are expanding toward vacuum, see Theorem 6.1.


Our main results are given in Theorems 6.1 and 7.1. A helpful characterization of simple waves, especially their boundaries, is given in Lemma 6.1, which will have broad applications in solving other 2-D Riemann problems. In the next section we list formulas and equations in various forms which form the basis for our construction in this paper. In Sect. 3, we select data to set up Configuration B, eliminating redundancy through scaling and translation or normalization. In Sect. 4, we recall previous work on the interaction of two symmetric planar rarefaction waves. In Sect. 5, we winnow the data to keep the problem hyperbolic. In Sect. 6, we characterize a complete patch of simple wave, as is needed in our solution, and construct the solution of interactions of two simple waves. In Sect. 7, we put the various pieces together to obtain the existence of global solutions, which is stated in Theorem 7.1 and reproduced here:

Theorem (Main theorem). Consider the Riemann problem for system (1.1) with initial data consisting of constant states $(c_i, u_i, v_i)$ in the $i$th quadrants ($i = 1, 2, 3, 4$) so that states 1 and 2 form a forward rarefaction wave $R_{12}^+$, states 2 and 3 form a backward rarefaction wave $R_{23}^-$, states 3 and 4 form a forward rarefaction wave $R_{34}^+$, and states 4 and 1 form a backward rarefaction wave $R_{41}^-$. (The rarefaction wave requirement on the data forces $c_2 = c_4$, $c_1 = c_3$, thus we call it a bi-symmetric problem.) Then, there exists a number $c_2^*(\gamma) \in (0, 1)$ for $\gamma > 1 + \sqrt{2}$, such that our bi-symmetric Riemann problem has global continuous solutions, provided $0 < c_2 < c_2^*(\gamma)\,c_1$.

Notations. Here is a list of our notations: Besides the primitive variables ρ, (u, v) and p, we have $c = \sqrt{\gamma p/\rho}$ as the speed of sound, $i = c^2/(\gamma-1)$ the enthalpy, ϕ the pseudo-velocity potential. In terms of the self-similar variables (ξ, η), we often use the pseudo-velocity $(U, V) = (u - \xi, v - \eta)$, and $\Lambda_\pm$, $\lambda_\pm$, where
\[
\Lambda_\pm = \frac{UV \pm c\sqrt{U^2 + V^2 - c^2}}{U^2 - c^2}, \qquad \Lambda_\pm \lambda_\mp = -1. \tag{1.2}
\]
The angles α, β and ω are defined as
\[
\tan\alpha := \Lambda_+, \quad \tan\beta := \Lambda_-, \quad \omega = (\alpha - \beta)/2, \quad \tau = (\alpha + \beta)/2. \tag{1.3}
\]
We also use the following vector fields:
\[
\begin{aligned}
&\partial^\pm = \partial_\xi + \Lambda_\pm\partial_\eta, \quad \partial_\pm = \partial_u + \lambda_\pm\partial_v, \quad \partial_0 = \partial_u, \\
&\bar\partial^+ = (\cos\alpha, \sin\alpha)\cdot(\partial_\xi, \partial_\eta), \quad \bar\partial^- = (\cos\beta, \sin\beta)\cdot(\partial_\xi, \partial_\eta), \quad \bar\partial^0 = \cos\tau\,\partial_\xi + \sin\tau\,\partial_\eta, \\
&\bar\partial_+ = (\sin\beta, -\cos\beta)\cdot(\partial_u, \partial_v), \quad \bar\partial_- = (\sin\alpha, -\cos\alpha)\cdot(\partial_u, \partial_v),
\end{aligned}
\tag{1.4}
\]
and some notations
\[
\kappa = \frac{\gamma-1}{2}, \quad m = \frac{3-\gamma}{\gamma+1}, \quad \Omega = m - \tan^2\omega, \quad \tan^2\theta_s = m, \quad \nu = \frac{\gamma+1}{2(\gamma-1)}. \tag{1.5}
\]

2. Systems of Self-Similar Flows

2.1. Irrotational flows. Our primary system is system (1.1) in the self-similar variables (ξ, η) = (x/t, y/t):
\[
\begin{cases}
U i_\xi + V i_\eta + 2\kappa\, i\,(u_\xi + v_\eta) = 0, \\
U u_\xi + V u_\eta + i_\xi = 0, \\
U v_\xi + V v_\eta + i_\eta = 0
\end{cases}
\tag{2.1}
\]


with the irrotationality condition
\[
u_\eta = v_\xi. \tag{2.2}
\]
System (2.1), (2.2) can also be reduced to the system
\[
\begin{cases}
(c^2 - U^2)u_\xi - UV(u_\eta + v_\xi) + (c^2 - V^2)v_\eta = 0, \\
u_\eta - v_\xi = 0,
\end{cases}
\tag{2.3}
\]
supplemented by Bernoulli's law
\[
i + \frac12(U^2 + V^2) = -\varphi, \qquad \varphi_\xi = U, \quad \varphi_\eta = V. \tag{2.4}
\]
The (pseudo-)characteristics are
\[
\frac{d\eta}{d\xi} = \frac{UV \pm c\sqrt{U^2 + V^2 - c^2}}{U^2 - c^2} \equiv \Lambda_\pm. \tag{2.5}
\]
Then system (2.3) can be written in characteristic form:
\[
\partial^\pm u + \Lambda_\mp\,\partial^\pm v = 0. \tag{2.6}
\]

2.2. Characteristic decomposition. In [16], it is shown that system (2.3) and (2.4) has a characteristic decomposition, in analogy with that for the classical wave operator. It is very useful in the discussion of simple waves and their interactions.

Proposition 2.1 (Commutator relation). For any quantity I(ξ, η), there holds
\[
\partial^-\partial^+ I - \partial^+\partial^- I = \frac{\partial^-\Lambda_+ - \partial^+\Lambda_-}{\Lambda_- - \Lambda_+}\,(\partial^- I - \partial^+ I). \tag{2.7}
\]

Proposition 2.2 (Characteristic decomposition). For (2.3) and (2.4), there hold
\[
\partial^+\partial^- I = m_1\,\partial^- I, \qquad \partial^-\partial^+ J = m_2\,\partial^+ J, \tag{2.8}
\]
where $m_1$ and $m_2$ can be expressed in the form $m_1 = m_1(u, v)(\partial_\xi u + \zeta_1(u, v)\partial_\eta u)$, $m_2 = m_2(u, v)(\partial_\xi u + \zeta_2(u, v)\partial_\eta u)$; and $I = u, v, c$ or $\Lambda_-$, $J = u, v, c$ or $\Lambda_+$.

2.3. System in inclination angles of characteristics. The inclination angles (α, β) of characteristics (see notation (1.3)) play an important role in our study. First we have from [17,32]
\[
u - \xi = -c\,\frac{\cos\tau}{\sin\omega}, \qquad v - \eta = -c\,\frac{\sin\tau}{\sin\omega}. \tag{2.9}
\]
System (2.3) can be written as the closed system of equations in terms of α, β, and c:
\[
\begin{cases}
c\,\bar\partial^-\beta = \Omega\cos^2\omega\,(2\sin^2\omega + c\,\bar\partial^-\alpha), \\
c\,\bar\partial^+\alpha = \Omega\cos^2\omega\,[-2\sin^2\omega + c\,\bar\partial^+\beta], \\
\bar\partial^0 c = \dfrac{\gamma-1}{2(\gamma+1)\sin\omega}\,[4\sin^2\omega + c\,\bar\partial^-\alpha - c\,\bar\partial^+\beta].
\end{cases}
\tag{2.10}
\]

Form (2.10) is given in Chen and Zheng [6]. Note that the sign for ∂¯ 0 here is the opposite of ∂¯0 of [6]. Here we offer a more direct derivation.
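As a quick sanity check on the notation, the parametrization (2.9) can be tested numerically against the definitions (1.2) and (1.3): the slopes $\Lambda_\pm$ computed from $(U, V, c)$ should reproduce $\tan\alpha$ and $\tan\beta$. The sketch below is illustrative only; the function names are hypothetical, and it assumes $0 < \omega < \pi/2$ so that the square root in (1.2) equals $c\cot\omega$.

```python
import math

def pseudo_velocity(alpha, beta, c):
    """(U, V) = (u - xi, v - eta) from (2.9), with tau, omega as in (1.3)."""
    tau = 0.5 * (alpha + beta)
    omega = 0.5 * (alpha - beta)
    return (-c * math.cos(tau) / math.sin(omega),
            -c * math.sin(tau) / math.sin(omega))

def characteristic_slopes(U, V, c):
    """(Lambda_plus, Lambda_minus) from (1.2); needs U^2 + V^2 >= c^2."""
    root = c * math.sqrt(U * U + V * V - c * c)
    denom = U * U - c * c
    return (U * V + root) / denom, (U * V - root) / denom

def max_slope_mismatch(alpha, beta, c=1.0):
    """Largest deviation of Lambda_+/- from tan(alpha), tan(beta)."""
    U, V = pseudo_velocity(alpha, beta, c)
    lam_plus, lam_minus = characteristic_slopes(U, V, c)
    return max(abs(lam_plus - math.tan(alpha)),
               abs(lam_minus - math.tan(beta)))

if __name__ == "__main__":
    for a, b in [(1.0, 0.2), (2.0, 1.0)]:
        print(a, b, max_slope_mismatch(a, b, c=1.3))
```

The mismatch should be at round-off level for any sample angles in the stated range.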


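Proof of (2.10). We start with the differential relations among the new variables:
\[
\begin{aligned}
du - d\xi &= -\frac{\cos\tau\,dc}{\sin\omega} + \frac{c\cos\beta\,d\alpha - c\cos\alpha\,d\beta}{2\sin^2\omega}, \\
dv - d\eta &= -\frac{\sin\tau\,dc}{\sin\omega} + \frac{c\sin\beta\,d\alpha - c\sin\alpha\,d\beta}{2\sin^2\omega}.
\end{aligned}
\tag{2.11}
\]
The differential form of the Bernoulli law (2.4) can be written as
\[
dc = \frac{\kappa}{\sin\omega}\,(\cos\tau\,du + \sin\tau\,dv). \tag{2.12}
\]
Using (2.6) and (2.11), we have
\[
\cot\omega\,\bar\partial^- c = \cos(2\omega) + \frac{c}{2\sin^2\omega}\,[\cos(2\omega)\,\bar\partial^-\alpha - \bar\partial^-\beta]. \tag{2.13}
\]
The Bernoulli law (2.12) gives
\[
\bar\partial^- c = \frac{\kappa\sin(2\omega)}{2(\kappa + \sin^2\omega)} + \frac{c\kappa\cot\omega}{2(\kappa + \sin^2\omega)}\,(\bar\partial^-\alpha - \bar\partial^-\beta). \tag{2.14}
\]
The above two equations together give
\[
c\,\bar\partial^-\beta = \Omega\cos^2\omega\,(2\sin^2\omega + c\,\bar\partial^-\alpha). \tag{2.15}
\]
Similarly, we have
\[
c\,\bar\partial^+\alpha = \Omega\cos^2\omega\,[-2\sin^2\omega + c\,\bar\partial^+\beta]. \tag{2.16}
\]
On the other hand, we can obtain the equation for c
\[
\bar\partial^- c = \frac{\kappa}{1+\kappa}\cot\omega\,(2\sin^2\omega + c\,\bar\partial^-\alpha), \qquad
\bar\partial^+ c = -\frac{\kappa}{1+\kappa}\cot\omega\,(-2\sin^2\omega + c\,\bar\partial^+\beta). \tag{2.17}
\]
Note that $\cos\omega\,\bar\partial^0 = \dfrac{\bar\partial^+ + \bar\partial^-}{2}$. Then we sum up:
\[
\bar\partial^0 c = \frac{\kappa}{2(1+\kappa)\sin\omega}\,[4\sin^2\omega + c\,\bar\partial^-\alpha - c\,\bar\partial^+\beta]. \tag{2.18}
\]
Thus we obtain the closed system of equations (2.10).

Remark 2.1. In fact, (2.10) can be reduced to a diagonal form,
\[
\begin{cases}
\bar\partial^+(-\beta + \psi(\omega)) = \dfrac{\sin^2\omega\,[\cos(2\omega) - \kappa]}{c(\kappa + \sin^2\omega)}, \\[2mm]
\bar\partial^-(\alpha + \psi(\omega)) = \dfrac{\sin^2\omega\,[\cos(2\omega) - \kappa]}{c(\kappa + \sin^2\omega)}, \\[2mm]
\bar\partial^0[c^2(1 + \kappa M^2)] = 2c\kappa M,
\end{cases}
\tag{2.19}
\]
where
\[
\psi(\omega) := \sqrt{\frac{\gamma+1}{\gamma-1}}\,\arctan\Big(\sqrt{\frac{\gamma-1}{\gamma+1}}\,\cot\omega\Big), \tag{2.20}
\]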


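and the pseudo-Mach number M is related to ω as
\[
\frac{1}{\sin\omega} = M := \sqrt{U^2 + V^2}/c. \tag{2.21}
\]
The Riemann variables ψ − β and ψ + α correspond to the classical Riemann invariants for homogeneous systems. However, it is convenient for us to use (2.10) in this paper. Other useful formulas are given below.

Proposition 2.3. First-order derivatives have the formulas
\[
\begin{cases}
\bar\partial^- u = \dfrac{\sin\alpha}{\kappa}\,\bar\partial^- c, \qquad \bar\partial^+ u = -\dfrac{\sin\beta}{\kappa}\,\bar\partial^+ c, \\[2mm]
c\,\bar\partial^-\beta = \Omega\,\nu\sin(2\omega)\,\bar\partial^- c, \qquad c\,\bar\partial^+\alpha = -\Omega\,\nu\sin(2\omega)\,\bar\partial^+ c, \\[2mm]
c\,\bar\partial^-\alpha = 2\nu\tan\omega\,\bar\partial^- c - 2\sin^2\omega, \qquad c\,\bar\partial^+\beta = -2\nu\tan\omega\,\bar\partial^+ c + 2\sin^2\omega, \\[2mm]
c\,\bar\partial^\pm\omega = \dfrac{\tan\omega}{\kappa}\,(\sin^2\omega + \kappa)\,\bar\partial^\pm c - \sin^2\omega.
\end{cases}
\tag{2.22}
\]

2.4. System in the hodograph plane. Pogodin, Suchkov and Ianenko [21] proposed the hodograph transformation
\[
T : (\xi, \eta) \to (u, v)
\]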

(2.23)

for (2.1), reversing the roles of (ξ, η) and (u, v) and regarding i as a function of (u, v). Then i as the function of u and v satisfies
\[
\xi - u = i_u, \qquad \eta - v = i_v, \tag{2.24}
\]
provided that the transformation (2.23) is non-degenerate. System (2.3) becomes a linearly degenerate system
\[
\begin{cases}
(2\kappa\, i(u, v) - i_u^2)\,\eta_v + i_u i_v\,(\xi_v + \eta_u) + (2\kappa\, i - i_v^2)\,\xi_u = 0, \\
\xi_v - \eta_u = 0
\end{cases}
\tag{2.25}
\]
for the unknowns (ξ, η). And i satisfies
\[
(2\kappa\, i - i_u^2)\,i_{vv} + 2 i_u i_v i_{uv} + (2\kappa\, i - i_v^2)\,i_{uu} = i_u^2 + i_v^2 - 4\kappa\, i. \tag{2.26}
\]
The linear degeneracy of (2.25) becomes more transparent when it is expressed in terms of α, β and c. In paper [17], we convert (2.26) to
\[
\begin{cases}
\bar\partial_+\alpha = \dfrac{1+\gamma}{4c}\cdot\sin(\alpha - \beta)\cdot\big[m - \tan^2\omega\big] =: G(\alpha, \beta, c), \\[2mm]
\bar\partial_-\beta = G(\alpha, \beta, c), \\[1mm]
\partial_0 c = \kappa\cos\dfrac{\alpha+\beta}{2}\Big/\sin\omega,
\end{cases}
\tag{2.27}
\]
with
\[
\bar\partial_+ c = -\kappa, \qquad \bar\partial_- c = \kappa. \tag{2.28}
\]
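The relations (2.28) and the third equation of (2.27) can be read off from the Bernoulli differential (2.12), which gives $c_u = \kappa\cos\tau/\sin\omega$ and $c_v = \kappa\sin\tau/\sin\omega$, combined with the vector fields $\bar\partial_\pm$ of (1.4). The short numerical sketch below checks this; the function names are hypothetical and the sample angles are arbitrary.

```python
import math

def grad_c(alpha, beta, kappa):
    """(c_u, c_v) read off from dc = kappa/sin(omega) * (cos(tau) du + sin(tau) dv)."""
    tau = 0.5 * (alpha + beta)
    omega = 0.5 * (alpha - beta)
    return (kappa * math.cos(tau) / math.sin(omega),
            kappa * math.sin(tau) / math.sin(omega))

def dbar_plus_c(alpha, beta, kappa):
    """(sin(beta), -cos(beta)) . (c_u, c_v), the field dbar_+ of (1.4) applied to c."""
    c_u, c_v = grad_c(alpha, beta, kappa)
    return math.sin(beta) * c_u - math.cos(beta) * c_v

def dbar_minus_c(alpha, beta, kappa):
    """(sin(alpha), -cos(alpha)) . (c_u, c_v), the field dbar_- of (1.4) applied to c."""
    c_u, c_v = grad_c(alpha, beta, kappa)
    return math.sin(alpha) * c_u - math.cos(alpha) * c_v

if __name__ == "__main__":
    kappa = 0.2  # gamma = 1.4, since kappa = (gamma - 1)/2
    print(dbar_plus_c(1.1, 0.3, kappa), dbar_minus_c(1.1, 0.3, kappa))
```

The two printed values should equal $-\kappa$ and $\kappa$ up to round-off, independently of the angles chosen.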

The definitions of $\bar\partial_\pm$ and $\partial_0$ (see (1.4)) imply that system (2.27) is linearly degenerate. In our construction of solutions, we need $C^0$, $C^1$ and $C^{1,1}$ estimates. The main difficulty lies in the non-homogeneity of (2.27). Thus we shall need the second-order derivatives, given in [17], which are obtained by direct calculation.


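Proposition 2.4. Assume that the solution (α, β) of (2.27) is in $C^2$. Then we have
\[
\bar\partial_+\bar\partial_-\alpha + W\bar\partial_-\alpha = Q(\omega, c), \qquad
-\bar\partial_-\bar\partial_+\beta + W\bar\partial_+\beta = Q(\omega, c), \tag{2.29}
\]
where W(ω, c) and Q(ω, c) are
\[
\begin{aligned}
W(\omega, c) &:= \frac{1+\gamma}{4c}\Big[(m - \tan^2\omega)(3\tan^2\omega - 1)\cos^2\omega + 2\tan^2\omega\Big], \\
Q(\omega, c) &:= \frac{(1+\gamma)^2}{16c^2}\,\sin(2\omega)\,(m - \tan^2\omega)(3\tan^2\omega - 1).
\end{aligned}
\tag{2.30}
\]

Proposition 2.5. Assume that the solution (α, β) of (2.27) is in $C^2$. Then we have
\[
\begin{aligned}
\bar\partial_+\bar\partial_-(\alpha+\beta) + W\bar\partial_-(\alpha+\beta) &= a(\omega, c)\,\bar\partial_+(\alpha+\beta), \\
-\bar\partial_-\bar\partial_+(\alpha+\beta) + W\bar\partial_+(\alpha+\beta) &= a(\omega, c)\,\bar\partial_-(\alpha+\beta),
\end{aligned}
\tag{2.31}
\]
where
\[
a(\omega, c) := \frac{\gamma+1}{4c}\cos^2\omega\,(\tan^2\omega + \alpha_2)(\tan^2\omega - \alpha_1), \tag{2.32}
\]
where
\[
\alpha_2 := \frac12\Big[3 + m + \sqrt{(3+m)^2 + 4m}\Big], \qquad \alpha_1 := \frac{2m}{3 + m + \sqrt{(3+m)^2 + 4m}}. \tag{2.33}
\]

Proposition 2.6. Assume that the solution (α, β) of (2.27) is in $C^2$. Then we have
\[
\begin{cases}
(\bar\partial_+ + W)(Z - \bar\partial_-\alpha) = \dfrac{\gamma+1}{4c}(\tan^2\omega + 1)(Z - \bar\partial_+\beta), \\[2mm]
(-\bar\partial_- + W)(Z - \bar\partial_+\beta) = \dfrac{\gamma+1}{4c}(\tan^2\omega + 1)(Z - \bar\partial_-\alpha),
\end{cases}
\tag{2.34}
\]
where
\[
Z := \frac{\gamma+1}{2c}\tan\omega. \tag{2.35}
\]
To invert the solution on the hodograph plane to the (ξ, η) plane, we notice that (2.24) defines a mapping from (u, v) to (ξ, η) as $\xi = u + i_u$, $\eta = v + i_v$. The Jacobian has the formula
\[
j(\xi, \eta; u, v) = \frac{\partial(\xi, \eta)}{\partial(u, v)} = \frac{c^2}{4\sin^4\omega}\,(\bar\partial_-\alpha - Z)(\bar\partial_+\beta - Z). \tag{2.36}
\]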

Fig. 3.1. Illustration of characteristics and planar rarefaction waves

3. Bi-symmetric Four Rarefaction Waves

The initial data for the 2-D Riemann problem are constant along each ray from the origin,
\[
(u, v, \rho)(t, x, y)|_{t=0} = (u_0, v_0, \rho_0)(\theta), \qquad \theta = \arctan(y/x). \tag{3.1}
\]
For theoretical and application reasons, $(u_0, v_0, \rho_0)(\theta)$ is usually piecewise constant. The four-constant Riemann problem is a prototype example and has special initial data that takes on constant values in each of the four initial quadrants; i.e.,
\[
(u_0, v_0, \rho_0)(\theta) = (u_i, v_i, \rho_i), \qquad (i-1)\pi/2 < \theta < i\pi/2, \tag{3.2}
\]
($i = 1, 2, 3, 4$). We use I, II, III and IV to designate the corresponding state $(u_i, v_i, \rho_i)$. The four-wave Riemann problem is restricted further so that each adjacent pair of data in the four-constant Riemann problem is connectible by a single planar wave. See [28].

The characteristics defined by (2.5) in the region of a constant state provide a basic reference point for understanding non-constant states. The characteristics are straight lines and the set $\{(\xi, \eta);\ (\xi - \bar u)^2 + (\eta - \bar v)^2 = \bar c^2\}$ is a sonic circle of the state $(\bar u, \bar v, \bar\rho)$. The plus characteristic lines, denoted by $\Lambda_+$ in Fig. 3.1(a), are tangent to the sonic circle, and go in the counterclockwise direction if regarded as starting at the tangent points in reference to the sonic circle, while $\Lambda_-$ go clockwise.

The Euler system (1.1) has two classes of planar rarefaction waves connecting a given pair of states $(u_f, v_f, \rho_f)$ and $(u_b, v_b, \rho_b)$. We denote the first class by $R_{fb}^+$, whose tell-tale feature is that the family of straight-line characteristics go counterclockwise in reference to the sonic circle of the state $(u_b, v_b, \rho_b)$. We denote the second class by $R_{fb}^-$, whose tell-tale feature is that the family of straight-line characteristics go clockwise in reference to the sonic circle of the state $(u_b, v_b, \rho_b)$. The two classes have examples represented by
\[
R_{fb}^\pm :\quad
\begin{cases}
\xi = u + c, \quad \dfrac{du}{d\rho} = \dfrac{c}{\rho}, \quad v = v_f = v_b, \quad \rho_b < \rho_f, \\[1mm]
\eta > v_b \ \text{ or } \ \eta < v_b.
\end{cases}
\tag{3.3}
\]

We designate that the front is the state with higher pressure, or equivalently, higher density, as shown in Fig. 3.1(b).

We require our initial data (3.2) to have $R_{12}^+$ connecting states I and II, $R_{14}^-$ connecting states I and IV, $R_{32}^-$ connecting states III and II, and $R_{34}^+$ connecting states III and IV. These requirements place strong restrictions on the four states and a number of compatibility conditions result. In the end, see [15], however, we only need
\[
\rho_1 = \rho_3, \quad \rho_2 = \rho_4, \quad u_1 - u_2 = v_1 - v_4 \quad (\rho_2 < \rho_1). \tag{3.4}
\]
And the set-up is symmetric with respect to
\[
\xi - \eta = u_1 - v_1 \quad\text{and}\quad \xi + \eta = u_2 + v_2.
\]
So our data enjoys two axes of symmetry and we call it bi-symmetric data. It is denoted traditionally Configuration B $R_{12}^+R_{14}^-R_{32}^-R_{34}^+$, see [12,15,22,23,28,30].

In a recent paper [17] we handled the interaction of an $R^+$ with an $R^-$ at any angle between 0 and π. It is a fundamental case of wave interactions. We use it in this paper to consider the bi-symmetric configuration, where the interaction of $R_{12}^+R_{14}^-$ is to interact with the interaction of $R_{32}^-R_{34}^+$, which is really the second level interaction of primary binary interactions, cf. Dinu [9].

Normalization. We use scaling and translations to get rid of unnecessary freedom in the data to prepare for our construction. We note that ξ and u can be shifted by an equal amount without changing the solution. The same is true for η and v. So we shall assume that
\[
u_1 - v_1 = 0, \qquad u_2 + v_2 = 0. \tag{3.5}
\]
In addition, the transformation $(u, v, c, \xi, \eta) \to \bar c\,(u, v, c, \xi, \eta)$ (where $\bar c$ is any positive constant) does not change system (2.1), so we shall assume that
\[
c_1 = 1. \tag{3.6}
\]
Thus we have only two free parameters: $c_2 \in (0, 1)$ and $\gamma > 1$. And there hold
\[
u_1 = \frac{c_1 - c_2}{\gamma - 1} > 0, \qquad v_1 = v_2 = u_1, \qquad u_2 = -u_1. \tag{3.7}
\]
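The value of $u_1$ in (3.7) can be cross-checked by integrating the rarefaction relation $du/d\rho = c/\rho$ from (3.3) between the two densities and using $u_2 = -u_1$. The sketch below does this with a composite Simpson rule; it assumes $K = 1$, so that $c = \sqrt{\gamma}\,\rho^{(\gamma-1)/2}$, and the function names and sample values are illustrative only.

```python
import math

def sound_speed(rho, gamma):
    """c = sqrt(p'(rho)) for p = rho^gamma (K scaled to one)."""
    return math.sqrt(gamma) * rho ** ((gamma - 1.0) / 2.0)

def density_from_sound_speed(c, gamma):
    """Inverse of sound_speed."""
    return (c * c / gamma) ** (1.0 / (gamma - 1.0))

def velocity_jump(rho_lo, rho_hi, gamma, n=2000):
    """Simpson approximation of the integral of c/rho over [rho_lo, rho_hi]; n must be even."""
    h = (rho_hi - rho_lo) / n
    f = lambda r: sound_speed(r, gamma) / r
    total = f(rho_lo) + f(rho_hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(rho_lo + k * h)
    return total * h / 3.0

def normalized_states(c2, gamma):
    """Velocities from (3.5)-(3.7) under the normalization c1 = 1."""
    u1 = (1.0 - c2) / (gamma - 1.0)
    return {"u1": u1, "v1": u1, "u2": -u1, "v2": u1}

if __name__ == "__main__":
    gamma, c2 = 1.4, 0.5
    rho1 = density_from_sound_speed(1.0, gamma)
    rho2 = density_from_sound_speed(c2, gamma)
    states = normalized_states(c2, gamma)
    print(velocity_jump(rho2, rho1, gamma), states["u1"] - states["u2"])
```

Both printed numbers should agree with $2(c_1 - c_2)/(\gamma - 1)$ to within the quadrature error.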

The rarefaction wave $R_{12}^+$ has the explicit expression
\[
\begin{aligned}
&\xi = u_1 + \int_{\rho_1}^{\rho}\rho^{-1}\sqrt{p'(\rho)}\,d\rho + \sqrt{p'(\rho)}, \qquad \rho_2 < \rho < \rho_1, \\
&v = v_1, \qquad u = u_1 + \int_{\rho_1}^{\rho}\rho^{-1}\sqrt{p'(\rho)}\,d\rho, \\
&\eta > v_1.
\end{aligned}
\tag{3.8}
\]
The characteristics of the plus family are straight lines; the characteristics of the minus family are given by
\[
\eta = v_1 + \rho^{\frac{\gamma+1}{4}}\sqrt{\bar c + \frac{\gamma(\gamma+1)}{3-\gamma}\,\rho^{\frac{\gamma-3}{2}}}, \tag{3.9}
\]
where $\bar c$ is a constant, for $\gamma \ne 3$. For $\gamma = 3$, the characteristics are
\[
\eta = v_1 + \rho\sqrt{\bar c - 6\ln\rho}. \tag{3.10}
\]
For the special minus characteristic curve that starts horizontally (i.e., the curve ab in Fig. 5.1) we have
\[
\bar c = -\frac{2\gamma(\gamma-1)}{3-\gamma}\,\rho_1^{\frac{\gamma-3}{2}} \tag{3.11}
\]
for $\gamma \ne 3$, and
\[
\bar c = 3 + 6\ln\rho_1 \tag{3.12}
\]
for $\gamma = 3$.

4. Binary Interaction of Planar Rarefaction Waves

Before constructing the global solution for the four bi-symmetric rarefaction waves proposed in the last section, we recall the binary interaction of planar rarefaction waves from [17]. The most typical case is the interaction of full rarefaction waves $R^+$ and $R^-$ that connect the vacuum to a constant state, as shown in Fig. 4.1(a). These two waves penetrate each other completely and fully expand into the vacuum.

Fig. 4.1. Interaction of planar rarefaction waves: In (a) we show the case of gas expansion into a vacuum; In (b) we show the critical case that the vacuum interface is a single point

Lemma 4.1 (Gas expansion [17]). There exists a solution $(u, v, \rho) \in C^1$ of (1.1) for the problem of gas expansion into a vacuum in the wave interaction region in the self-similar (ξ, η)-plane for all $\gamma \ge 1$ and all wedge half-angle $\theta \in (0, \pi/2]$. For
\[
\theta > \theta_s := \arctan\Big(\mathrm{Re}\sqrt{\frac{3-\gamma}{\gamma+1}}\Big),
\]
the vacuum boundary is representable as a single-valued concave function ξ = B(η), the minus family of characteristics are concave, the plus family of characteristics are convex, and the difference of their inclination angles at the boundary is $2\theta_s(\gamma)$.

As a corollary, we can study the interaction of two planar rarefaction waves $R_{01}^+$ and $R_{02}^-$ in Fig. 4.1(b), $R_{01}^+$ connecting states $(u_0, v_0, \rho_0)$ and $(u_1, v_1, \rho_1)$, $R_{02}^-$ connecting states $(u_0, v_0, \rho_0)$ and $(u_2, v_2, \rho_2)$, for two appropriate states $(u_1, v_1, \rho_1)$ and $(u_2, v_2, \rho_2)$. These two waves penetrate each other. Here we state a symmetric case: $u_1 = u_2$, $v_1 = -v_2$ and $\rho_1 = \rho_2$, and fix the state $(u_0, v_0, \rho_0)$. Then it is evident that there exists a state $(u_*, v_*, \rho_*)$ such that if $(u_1, v_1, \rho_1) = (u_*, v_*, \rho_*)$ and $(u_2, v_2, \rho_2) = (u_*, -v_*, \rho_*)$, the vacuum interface just shrinks into a single point. That is, the wave-tail characteristics from points b and b′ meet at a point d at which the density is zero. This case is referred to as the critical case. Once the states $(u_1, v_1, \rho_1)$ and $(u_2, v_2, \rho_2)$ are such that $\rho_1 = \rho_2 < \rho_*$, then the vacuum interface is no longer a single point. We refer to this case as the "large" rarefaction waves. We summarize the interaction of planar rarefaction waves in the following corollary.

Corollary 4.1. For the interaction of two symmetric planar rarefaction waves $R_{01}^+$ and $R_{02}^-$, there are three cases of solutions. For the large data case, they expand into vacuum and an interface separates the vacuum from the interaction region; for the small data case, they penetrate each other without the presence of vacuum. The third case is the middle case when the data yields a single point of vacuum. See Fig. 4.2.

Fig. 4.2. Interaction of two symmetric rarefaction waves: In (a) the data satisfies $0 \le \rho_1 = \rho_2 < \rho_* < \rho_0$; In (b) the data satisfies $\rho_* \le \rho_1 = \rho_2 < \rho_0$

We remark that we do not have any quantitative estimate on the location of the vacuum boundary ξ = B(η), e.g., the value of B(0).

5. Hyperbolicity and Non-overlapping of Domains of Determinacy

We start the construction of solutions. See Fig. 5.1. The interaction of $R_{12}^+R_{14}^-$ follows from Lemma 4.1 and Corollary 4.1. So does the interaction of $R_{32}^-R_{34}^+$. Under our normalization, the interaction point a has the coordinate
\[
a = (\xi_1, \eta_1) = \Big(\frac{1 - c_2}{\gamma - 1} + 1,\ \frac{1 - c_2}{\gamma - 1} + 1\Big),
\]
which follows from (3.6), (3.7) and the solution formula ξ = u + c for $R_{12}^+$. Similarly, point b has the horizontal coordinate
\[
\xi_b = u_2 + c_2 = -\frac{1 - c_2}{\gamma - 1} + c_2 = -\frac{1}{\gamma - 1} + \frac{\gamma c_2}{\gamma - 1}.
\]

Fig. 5.1. Interaction of four bi-symmetric rarefaction waves

Depending on the magnitude $c_2$ of state II, we may or may not have a portion of the sonic circle of state II in the solution. We require that state II has no sonic point. In other words, the characteristic lines bh and b′h intersect before contacting the sonic circle of state II. Thus we need the exiting slope of the minus characteristic curve (line bh), that starts horizontally from the first quadrant, to be greater than one, so that it will intersect its counterpart (line b′h) from the interaction of $R_{32}^-R_{34}^+$ before hitting the sonic circle of state II.

Lemma 5.1. For state II to be hyperbolic at point h it is necessary and sufficient to have $\gamma > 1 + \sqrt{2}$ with
\[
\rho_2/\rho_1 < \bigg(\frac{(2 - \sqrt{2})(\gamma - 1)}{2(\gamma - 1 - \sqrt{2})}\bigg)^{\frac{2}{\gamma - 3}} \tag{5.1}
\]
for $\gamma \ne 3$, and
\[
\rho_2/\rho_1 < \exp(-\sqrt{2} - 1) \tag{5.2}
\]
for $\gamma = 3$.
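The two cases of Lemma 5.1 can be evaluated side by side. The following sketch computes the density-ratio threshold from (5.1) and (5.2) and illustrates two consistency properties: the $\gamma \ne 3$ expression tends to $\exp(-\sqrt{2}-1)$ as $\gamma \to 3$, and for $\gamma = 3$ the slope $\ln(\rho_1/\rho)/\sqrt{1 + 2\ln(\rho_1/\rho)}$ of the characteristic ab (the formula derived in the proof of the lemma) equals one exactly at the threshold. Function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def density_ratio_threshold(gamma):
    """Upper bound on rho2/rho1: (5.1) for gamma != 3, (5.2) for gamma == 3."""
    if gamma <= 1.0 + SQRT2:
        raise ValueError("Lemma 5.1 requires gamma > 1 + sqrt(2)")
    if math.isclose(gamma, 3.0, abs_tol=1e-12):
        return math.exp(-SQRT2 - 1.0)
    base = (2.0 - SQRT2) * (gamma - 1.0) / (2.0 * (gamma - 1.0 - SQRT2))
    return base ** (2.0 / (gamma - 3.0))

def exit_slope_gamma3(ratio):
    """Slope d(eta)/d(xi) along ab for gamma = 3, with ratio = rho/rho1 in (0, 1]."""
    log_term = math.log(1.0 / ratio)
    return log_term / math.sqrt(1.0 + 2.0 * log_term)

if __name__ == "__main__":
    for g in (2.5, 2.999, 3.0, 3.001, 5.0):
        print(g, density_ratio_threshold(g))
    print(exit_slope_gamma3(density_ratio_threshold(3.0)))
```

Values of $\gamma$ at or below $1 + \sqrt{2}$ are rejected, matching the hypothesis of the lemma.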


Proof. For $\gamma = 3$, we have $c = \sqrt{p'} = \sqrt{3}\rho$, $u = c + u_1 - c_1$, $\xi = u + c = u_1 - c_1 + 2\sqrt{3}\rho$ in $R_{12}^+$. The characteristic curve ab is given by (3.10) with data (3.12) or
\[
\eta = v_1 + \rho\sqrt{3 + 6\ln(\rho_1/\rho)}. \tag{5.3}
\]
We then compute
\[
\frac{d\eta}{d\xi} = \frac{d\eta/d\rho}{d\xi/d\rho} = \frac{\ln(\rho_1/\rho)}{\sqrt{1 + 2\ln(\rho_1/\rho)}}. \tag{5.4}
\]
Requiring the slope to be greater than or equal to one, we find
\[
\rho_1/\rho \ge e^{\sqrt{2}+1}. \tag{5.5}
\]
Next for $\gamma \ne 3$, we use ξ from (3.8) and η from (3.9) with data (3.11) to compute
\[
\frac{d\xi}{d\rho} = \frac{\gamma+1}{2}\sqrt{\gamma}\,\rho^{\frac{\gamma-3}{2}}; \qquad
\frac{d\eta}{d\rho} = \frac{\gamma(\gamma+1)(\gamma-1)}{2(3-\gamma)(\eta - v_1)}\,\rho^{\frac{\gamma-1}{2}}\Big(\rho^{\frac{\gamma-3}{2}} - \rho_1^{\frac{\gamma-3}{2}}\Big). \tag{5.6}
\]
We require that the slope be greater than or equal to one, i.e., $d\eta/d\xi \ge 1$, which simplifies to
\[
\frac{\gamma-1}{\sqrt{\gamma+1}}\cdot\frac{1}{\sqrt{x}}\cdot\frac{x-1}{3-\gamma} \ge \sqrt{\frac{1}{\gamma+1} + \frac{x-1}{3-\gamma}} \tag{5.7}
\]
for $x := (\rho/\rho_1)^{\frac{\gamma-3}{2}}$. We then factorize (5.7),
\[
(\gamma^2 - 2\gamma - 1)\bigg(x - \frac{(2+\sqrt{2})(\gamma-1)}{2(\gamma + \sqrt{2} - 1)}\bigg)\bigg(x - \frac{(2-\sqrt{2})(\gamma-1)}{2(\gamma - \sqrt{2} - 1)}\bigg) \ge 0, \tag{5.8}
\]
which then reduces to our conclusion. This completes the proof.

Non-overlapping of domains of determinacy. In addition to a hyperbolic point h, we shall choose $c_2$ low enough so that the point d is a vacuum. We do not have a quantity of $c_2$ to tell when this happens; its value is most probably a numerical one, depending on γ only, under the current normalization (3.5), (3.6). The reason that we require d to be a vacuum is to minimize further interaction.

Further, we need point d to be above the line ξ + η = 0; i.e., $\xi_d > -\eta_d$, which may require $R_{12}^+$ to be possibly larger than before. We need it because we want to avoid overlapping of the domains of determinacy of the interactions $R_{12}^+R_{14}^-$ and $R_{32}^-R_{34}^+$. We explain that this is achievable. In fact, it is sufficient to require that
\[
\beta > \pi/4 \tag{5.9}
\]
along the plus characteristic curve db in Fig. 5.1. We use Fig. 5.2 for better illustration, in which the curve ab has a tangent line at point b with an inclination angle greater than π/4. Note that the lines to the left of curve $c_2bd$ are not present in the four-rarefaction wave interaction because they are shadowed by the state $c_2 > 0$, except for the segment bh. Since the curve $abb_0$ is concave, we see that the value of β at $b_0$ is also greater than π/4.

Fig. 5.2. Interaction of two rarefaction waves in rotated coordinates

Using the fact that the solution to the binary interaction is continuous, we see that β on the curve bd will be greater than π/4 once the point b is sufficiently close to point $b_0$. We call such a critical value of $c_2$ (so that β ≥ π/4 along curve bd) $c_2^0$. That is,

$$c_2^0 = \sup\{c_2 \in (0,1) \mid \beta > \pi/4 \text{ on plus characteristic curve } bd\}. \tag{5.10}$$

We explain now that condition (5.9) implies point d is above the line ξ + η = 0. Let us rotate the coordinate system of Fig. 5.2 counter-clockwise by π/4, so that we regard the line ξ − η = 0 as the new ξ-axis, called the $\tilde\xi$-axis. In the rotated Fig. 5.2, we note that the velocity component along the $\tilde\xi$-axis is $\tilde u = (u+v)/\sqrt{2}$. In particular we have $\tilde u = 0$ at point b due to our normalization $u_2 + v_2 = 0$. We observe that

$$\bar\partial^+ \tilde u = -\frac{\sin\tilde\beta}{\kappa}\,\bar\partial^+ c$$

holds along db, which we can integrate to find that $\tilde u > 0$ at point d, since $\tilde\beta > 0$ along db and c is increasing from d to b. We observe further that $\tilde\xi \ge \tilde u$ at the vacuum from $\tilde\xi - \tilde u = c\cos\tilde\tau/\sin\omega \ge 0$, thus $\xi + \eta = \sqrt{2}\,\tilde\xi > 0$ at point d. In sum,

Proposition 5.1. Suppose $c_2 \in (0, c_2^0)$. Then we have β > π/4 along curve bd, point h is hyperbolic and point d is above the line ξ + η = 0.


Fig. 6.1. A patch of simple wave

6. Simple Waves and Their Interaction

6.1. A complete patch of simple wave. In [16] we showed that adjacent to a constant state is a simple wave, by using the characteristic decomposition (2.2). Thus the region adjacent to state II and covered by the curvilinear boundaries bhd in Fig. 5.1 is a simple wave. We show that its vacuum boundary is the single point d, and the boundary hd is a characteristic curve of the plus family. See Fig. 6.1.

Lemma 6.1 (Simple wave). Let bd be a characteristic curve of the plus family, along which the density ρ decreases from point b to zero at point d. Let bk be a straight characteristic curve, where point k is sonic. Then a simple wave exists, forming a curvilinear triangle bkd, for which the boundary kd (the dotted curve in Fig. 6.1) is sonic, and each of the characteristics of the plus family extends from point d to a point on bk or kd.

Remark. We do not know if β is monotone along the curve bd.

Proof. It follows from [16] that the patch is a simple wave, in which the characteristics of the minus family are straight lines, along which the density is constant. Thus point k is a sonic point where $U^2 + V^2 - c^2 = 0$. Every minus characteristic ends at a sonic point instead of vacuum. The length of the minus characteristics inside the patch shrinks to zero since the density shrinks to zero. The minus characteristics do not form shocks inside the patch since it can be shown, following the idea and proofs of [1,18,26], that $\bar\partial^+ c$ is always finite. In fact, we claim that

$$-\bar\partial^-(\bar\partial^+ c) = \frac{1}{2c}\left[2\sin(2\omega) - \frac{1+\kappa}{\kappa\cos^2\omega}\,\bar\partial^+ c\right]\bar\partial^+ c. \tag{6.1}$$

We derive (6.1) as follows. We use I = c in the commutator relation (2.7) and $\partial^- c = 0$ to obtain

$$\partial^-\partial^+ c = \frac{\partial^-\tan\alpha - \partial^+\tan\beta}{\tan\beta - \tan\alpha}\,(-\partial^+ c).$$

We use $\partial^-\beta = 0$ in (2.10) to obtain

$$2\sin^2\omega + c\,\bar\partial^-\alpha = 0. \tag{6.2}$$

So we obtain

$$\partial^-\tan\alpha = \frac{1}{\cos^2\alpha}\,\partial^-\alpha = -\frac{2\sin^2\omega}{c\cos\beta\cos^2\alpha}.$$

We use (2.17) to obtain

$$\partial^+\tan\beta = \frac{1}{\cos^2\beta\cos\alpha}\,\bar\partial^+\beta = \frac{1}{c\cos^2\beta\cos\alpha}\left[2\sin^2\omega - \frac{1+\kappa}{\kappa}\tan\omega\,\bar\partial^+ c\right].$$

In addition, we have

$$\bar\partial^-\bar\partial^+ c = \cos\beta\,[\partial^-(\cos\alpha\,\partial^+ c)] = \cos\alpha\cos\beta\,\partial^-\partial^+ c - \tan\alpha\,\bar\partial^-\alpha\,\bar\partial^+ c.$$

Using (6.2) again, we obtain

$$\bar\partial^-\bar\partial^+ c = \cos\alpha\cos\beta\,\partial^-\partial^+ c + 2c^{-1}\sin^2\omega\,\tan\alpha\,\bar\partial^+ c.$$

Combining the above and using $\cos\alpha + \cos\beta = 2\cos\tau\cos\omega$ and $\sin\omega\sin\alpha - \cos\tau = -\cos\alpha\cos\omega$, we obtain (6.1).

In (6.1), the direction $\bar\partial^+$ is going from d to b, thus $\bar\partial^+ c > 0$ on the curve db. The direction of $-\bar\partial^-$ is going from b to h. Because the right-hand side has the factor $\bar\partial^+ c$, it does not get to zero. And because the coefficient of the quadratic term is negative, it does not grow to positive infinity. Thus $\bar\partial^+ c$ remains positive and finite in the whole patch of the simple wave. This proves the above claim.

We need to show that the plus characteristics do not start from an interior point of the boundary kd. From (6.2), we obtain that along a minus characteristic,

$$\sin^2\omega + c\,\bar\partial^-\omega = 0, \tag{6.3}$$

thus ω is monotone increasing in the direction parallel to that from b to k. Following the monotonicity, we can conclude that the plus characteristics cannot start from an interior point of the boundary kd. For if one did, then ω would be zero at the starting point, but our ω on the "initial" line bd is positive, contradicting the monotonicity along the minus characteristics connecting the starting point and the initial point on bd. This completes the proof of the lemma.

6.2. Interaction of simple waves. We consider the interaction of two simple waves. The two simple waves will be quite general, with quite general interaction angles, which will include the interaction of the waves bhd with b′he, as well as the interaction of $R_{12}^+$ with $R_{14}^-$, see Fig. 5.1.

Let us take a survey on what angles of interactions are involved in our bi-symmetric interaction. The interaction half-angle of $R_{12}^+$ with $R_{14}^-$ is π/4. For the interaction at point h, the maximum angle π/2 is achieved when bh is parallel to b′h, while the minimum angle π/4 is achieved when the point h becomes vacuum so that bh becomes bd and thus bh is perpendicular to b′h. Therefore we shall need interaction (half-)angles between (π/4, π/2).
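The two trigonometric identities used in the last step of the derivation of (6.1) can be spot-checked numerically. The Python sketch below is an illustration only; it assumes that τ = (α + β)/2 and ω = (α − β)/2, which is the convention consistent with the identity cos α + cos β = 2 cos τ cos ω but is not restated in this section.

```python
import math
import random

def identity_residuals(alpha, beta):
    """Residuals of the two identities used to obtain (6.1)."""
    tau = 0.5 * (alpha + beta)    # assumed definition of tau
    omega = 0.5 * (alpha - beta)  # assumed definition of omega
    r1 = (math.cos(alpha) + math.cos(beta)) - 2.0 * math.cos(tau) * math.cos(omega)
    r2 = (math.sin(omega) * math.sin(alpha) - math.cos(tau)) + math.cos(alpha) * math.cos(omega)
    return r1, r2

random.seed(0)
max_residual = 0.0
for _ in range(1000):
    a = random.uniform(-math.pi, math.pi)
    b = random.uniform(-math.pi, math.pi)
    r1, r2 = identity_residuals(a, b)
    max_residual = max(max_residual, abs(r1), abs(r2))

print("largest residual over random angles:", max_residual)
```

Both residuals vanish up to rounding error, as expected from the sum-to-product formula and from cos τ = cos(α − ω).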


Fig. 6.2. Interaction of two simple waves

We use the interaction of bhd with b′he as the primary problem and assume $\gamma > 1+\sqrt{2}$. Notice that the Suchkov angle $\theta_s(\gamma)$ is less than π/4 for γ > 1, so our interaction half-angles are always greater than the Suchkov angles, hence our interactions belong to the large angle case following the terminology of paper [17]. Thus we consider data as shown in Fig. 6.2, part (a). The data is symmetric with respect to the ξ-axis. Point h is on the ξ-axis. The lower curvilinear triangle bhd is a simple wave in which bh is a straight characteristic curve of the minus family, while d is vacuum. The density is monotone decreasing to zero from h to d along hd, and hd is a convex characteristic curve of the plus family. The length of hd is finite. We need to construct the interaction zone dhe, where the dotted curve de is vacuum. Part (b) of Fig. 6.2 represents the hodograph domain of interaction, while part (c) of Fig. 6.2 represents the phase space (α, β), where the three lower branches of parts (a), (b), and (c) represent the same boundary. Local existence of the solution at point P in part (b) of Fig. 6.2 follows from the standard argument for Goursat problems, see [27,32] for example. We need uniform estimates on (α, β) and their derivatives to extend the local solution up to the vacuum boundary.

Curve DZ in part (c) of Fig. 6.2 represents the relation between α and β on the boundary dh of part (a) of Fig. 6.2. It is a horizontal straight segment if the simple wave bhd is a planar wave. We note, once we require condition (5.10) that $c_2 < c_2^0$, that β > −π/2 along DZ, i.e., the slopes of the straight lines in curvilinear triangle bhd of part (a) of Fig. 6.2 are negative but not −∞. Similarly, we have α < π/2 along the boundary EZ in part (c) of Fig. 6.2. Thus we can use the lines

$$\beta = \min_{DZ}\beta, \qquad \alpha = \max_{EZ}\alpha \tag{6.4}$$

to form the bottom and right sides of an invariant triangle for (α, β), while the third side is the line $\alpha - \beta = 2\theta_s$ for $\gamma \in (1+\sqrt{2}, 3)$ or α − β = 0 for γ ≥ 3. The fact that the three straight lines are not penetrable has been established in paper [17]. Hence the (α, β) are bounded in this way:

$$\alpha \le \max_{EZ}\alpha, \qquad \beta \ge \min_{DZ}\beta, \qquad \alpha - \beta \ge 2\theta_s(\gamma), \tag{6.5}$$

where we use $\theta_s(\gamma) = 0$ for γ ≥ 3.


Now that the invariant region (6.5) for (α, β) is available, the derivatives of (α, β) can be shown to be bounded in terms of c > 0, see [17,32]. We omit the details. Thus, a global solution exists where c > 0 in region DPE. Using Proposition 2.6, we can invert the mapping to yield a solution in the ξ−η plane. By the invariance of the (α, β) of (6.5), we obtain that the characteristics in the ξ−η plane are either convex or concave. We summarize this subsection in a theorem.

Theorem 6.1 (Simple wave interactions). The interaction of two simple waves with an interaction half-angle in (π/4, π/2) and a density that vanishes along the interaction boundaries exists as a smooth solution, in which the plus family of characteristics is convex while the minus family is concave, provided that $c_2 \in (0, c_2^0)$.

7. Global Solution

We construct the global solution for the bi-symmetric four rarefaction wave interaction. The first interaction of $R_{12}^+$ with $R_{14}^-$ has been done in [17], and it also follows from the previous section, for any $c_2 \in [0, 1]$. We need the wave $R_{12}^+$ to be large, by choosing $c_2$ close to zero, so that the point d of Fig. 5.1 is a vacuum and point h is hyperbolic. Choosing $c_2 \in (0, c_2^0)$, we avoid point d running into point e and obtain the global existence of the interaction of the two simple waves dhe at the same time. We summarize our results in a theorem.

Theorem 7.1 (Global existence). Suppose $\gamma > 1 + \sqrt{2}$. Then there exists $c_2^0(\gamma) \in (0, 1)$ so that for any $c_2 \in (0, c_2^0)$ the associated bi-symmetric four rarefaction wave interactions have continuous global solutions, whose centers are vacuum.

We then take all of $c_2^0$ under which there exists a global continuous solution regardless of the sign of β on curve bd, and denote the supremum of such $c_2^0$ by $c_2^*$, to obtain the main theorem stated in the Introduction.

References

1. Bang, S.: Interaction of three and four rarefaction waves of the pressure-gradient system. J. Diff. Eqs. 246, 453–481 (2009)
2. Bressan, A.: Hyperbolic systems of conservation laws. In: The One Dimensional Cauchy Problem. Oxford: Oxford University Press, 2000
3. Chang, T., Chen, G.Q., Yang, S.L.: On the 2–D Riemann problem for the compressible Euler equations, I. Interaction of shock waves and rarefaction waves. Disc. Cont. Dyn. Syst. 1, 555–584 (1995)
4. Chang, T., Hsiao, L.: The Riemann Problem and Interaction of Waves in Gas Dynamics. Pitman Monographs and Surveys in Pure and Applied Mathematics, 41, Harlow: Longman Scientific & Technical, 1989
5. Chen, G.-Q., Feldman, M.: Global solutions of shock reflection by large-angle wedges for potential flow. Ann. Math (2), to appear, available at http://pjm.math.berkeley.edu/annals/ta/080510-Chen/080510Chen-v1.pdf
6. Chen, X., Zheng, Y.: The interaction of rarefaction waves of the two-dimensional Euler equations. Indiana Univ. Math. J. 58(2009), No. 6 (in press)
7. Courant, R., Friedrichs, K.O.: Supersonic Flow and Shock Waves, New York: Interscience Publishers, Inc., 1948
8. Dafermos, C.: Hyperbolic Conservation Laws in Continuum Physics. Grundlehren der mathematischen Wissenschaften, Berlin-Heidelberg-New York: Springer, 2000
9. Dinu, L.F.: Multidimensional Wave-Wave Regular Interactions and Genuine Nonlinearity: Some Remarks. Lecture presented in Loughborough University, UK, 2006-07
10. Glimm, G., Ji, X., Li, J., Li, X., Zhang, P., Zhang, T., Zheng, Y.: Transonic shock formation in a rarefaction Riemann problem for the 2-D compressible Euler equations. SIAM J. Appl. Math. 69, 720–742 (2008)
11. Kurganov, A., Tadmor, E.: Solution of two-dimensional Riemann problems for gas dynamics without Riemann problem solvers. Num. Meth. Part. Diff. Eqs. 18, 584–608 (2002)
12. Lax, P., Liu, X.: Solutions of two–dimensional Riemann problem of gas dynamics by positive schemes. SIAM J. Sci. Compt. 19, 319–340 (1998)
13. LeFloch, P.G.: Hyperbolic Systems of Conservation Laws, The Theory of Classical and Non-Classical Shock Waves. Basel: Birkhäuser Verlag, 2002
14. Li, J.: On the two-dimensional gas expansion for compressible Euler equations. SIAM J. Appl. Math. 62, 831–852 (2001)
15. Li, J., Zhang, T., Yang, S.: The Two-Dimensional Riemann Problem in Gas Dynamics. Pitman Monographs and Surveys in Pure and Applied Mathematics 98, Essex: Addison Wesley Longman limited, 1998
16. Li, J., Zhang, T., Zheng, Y.: Simple waves and a characteristic decomposition of the two dimensional compressible Euler equations. Commun. Math. Phys. 267, 1–12 (2006)
17. Li, J., Zheng, Y.: Interaction of rarefaction waves of the two-dimensional self-similar Euler equations. Arch. Rat. Mech. Anal. 193, 623–657 (2009)
18. Li, M., Zheng, Y.: Semi-hyperbolic patches of solutions of the two-dimensional Euler equations. Preprint, available on request
19. Elling, V., Liu, T.P.: Supersonic flow on a solid wedge. Comm. Pure Appl. Math. 61, 1331–1481 (2008)
20. Majda, A.: Compressible Fluid Flow and Systems of Conservation Laws in Several Space Variables. Applied Mathematical Sciences 53. New York: Springer-Verlag, 1984
21. Pogodin, I.A., Suchkov, V.A., Ianenko, N.N.: On the traveling waves of gas dynamic equations. J. Appl. Math. Mech. 22, 256–267 (1958)
22. Schulz–Rinne, C.W.: Classification of the Riemann problem for two-dimensional gas dynamics. SIAM J. Math. Anal. 24, 76–88 (1993)
23. Schulz–Rinne, C.W., Collins, J.P., Glaz, H.M.: Numerical solution of the Riemann problem for two–dimensional gas dynamics. SIAM J. Sci. Compt. 4, 1394–1414 (1993)
24. Serre, D.: Écoulements de fluides parfaits en deux variables indépendantes de type espace. Réflexion d’un choc plan par un dièdre compressif. Arch. Rat. Mech. Anal. 132, 15–36 (1995)
25. Smoller, J.: Shock Waves and Reaction-Diffusion Equations. Berlin-Heidelberg-New York: Springer, 1983
26. Song, K., Zheng, Y.: Semi-hyperbolic patches of solutions of the pressure gradient system. Disc. Cont. Dyn. Syst. Series A 24, 1365–1380 (2009)
27. Wang, R., Wu, Z.: On mixed initial boundary value problem for quasilinear hyperbolic system of partial differential equations in two independent variables (in Chinese), Acta Sci. Natur. Jinlin Univ., 2, 459–502, (1963)
28. Zhang, T., Zheng, Y.: Conjecture on the structure of solution of the Riemann problem for two-dimensional gas dynamics systems. SIAM J. Math. Anal. 21, 593–630 (1990)
29. Zhang, T., Zheng, Y.: Axisymmetric solutions of the Euler equations for polytropic gases. Arch. Rat. Mech. Anal. 142, 253–279 (1998)
30. Zheng, Y.: Systems of Conservation Laws: Two-Dimensional Riemann Problems. Vol. 38, PNLDE, Boston: Birkhäuser, 2001
31. Zheng, Y.: Two-dimensional regular shock reflection for the pressure gradient system of conservation laws. Acta Math. Appl. Sin. Engl. Ser. 22, 177–210 (2006)
32. Zheng, Y.: The compressible Euler system in two space dimensions. In: Series of Cont. Appl. Math. Vol. 13, (Shanghai Mathematics Summer School, 2007). G. Q. Chen, T.-T. Li, C. Liu (eds.) Singapore: World Scientific/Higher Ed. Press, 2008
33. Zheng, Y.: Absorption of characteristics by sonic curves of the two-dimensional Euler equations. Disc. Cont. Dyn. Syst. 23, 605–616 (2009)
34. Zheng, Y.: Shock reflection for the Euler system. In: Hyperbolic Problems Theory, Numerics and Applications (Proceedings of the Osaka meeting 2004), Vol. II. Eds. F. Asakura (Chief), H. Aiso, S. Kawashima, A. Matsumura, S. Nishibata, K. Nishihara; Yokohama: Yokohama Publishers, 2006, pp. 425–432

Communicated by P. Constantin

Commun. Math. Phys. 296, 323–351 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1021-z


Uniqueness of Topological Solutions and the Structure of Solutions for the Chern-Simons System with Two Higgs Particles

Jann-Long Chern1, Zhi-You Chen1, Chang-Shou Lin2

1 Department of Mathematics, National Central University, Chung-Li 32001, Taiwan. E-mail: [email protected]; [email protected]

2 Department of Mathematics, Taida Institute for Mathematical Sciences, National Taiwan University, Taipei 10617, Taiwan. E-mail: [email protected]

Received: 3 October 2008 / Accepted: 7 January 2010
Published online: 4 March 2010 – © Springer-Verlag 2010

Abstract: The existence of topological solutions for the Chern-Simons equation with two Higgs particles has been proved by Lin, Ponce and Yang [16]. However, both the uniqueness problem and the existence of non-topological solutions have been left open. In this paper, we consider the case of one vortex at origin. Among others, we prove the uniqueness of topological solutions and give a complete study of the radial solutions, in particular, the existence of some non-topological solutions.

Work partially supported by National Science Council of Taiwan.

1. Introduction and Main Results

In this paper, we will consider the nonlinear elliptic system

$$\begin{cases} \Delta u + \lambda e^{v}(1 - e^{u}) = 4\pi \displaystyle\sum_{s=1}^{N} \alpha_s \delta_{p_s} \\[2mm] \Delta v + \lambda e^{u}(1 - e^{v}) = 4\pi \displaystyle\sum_{s=1}^{N'} \alpha'_s \delta_{p'_s} \end{cases} \quad \text{in } \mathbb{R}^2, \tag{1.1}$$

where $\Delta = \sum_{i=1}^{2} \partial^2/\partial x_i^2$, λ is a positive constant, N and N′ are two positive constants which are called the vortex numbers, $\alpha_s > 0$ and $\alpha'_s > 0$ are constants, and $\delta_p$ is the Dirac measure at p. Equation (1.1) arises from a relativistic Abelian Chern-Simons model with two Higgs particles. For any solution (u, v) to Eq. (1.1), we let $z = x_1 + i x_2$ and define φ, χ, $A_r^{(1)}$ and $A_r^{(2)}$, r = 1, 2, in the following:

$$\begin{cases} \theta_1(z) = -\displaystyle\sum_{s=1}^{N} \arg(z - p_s), \quad \theta_2(z) = -\displaystyle\sum_{s=1}^{N'} \arg(z - p'_s), \\[2mm] \phi(z) = e^{\frac{1}{2}u(z) + i\theta_1(z)}, \quad \chi(z) = e^{\frac{1}{2}v(z) + i\theta_2(z)}, \\[1mm] A_1^{(1)}(z) = -\mathrm{Re}\{2i\partial \ln\phi(z)\}, \quad A_2^{(1)}(z) = -\mathrm{Im}\{2i\partial \ln\phi(z)\}, \\[1mm] A_1^{(2)}(z) = -\mathrm{Re}\{2i\partial \ln\chi(z)\}, \quad A_2^{(2)}(z) = -\mathrm{Im}\{2i\partial \ln\chi(z)\}; \end{cases} \tag{1.2}$$

here φ and χ are interpreted as two complex scalar fields in $\mathbb{R}^2$ representing two Higgs particles, and $A_r^{(1)}$ and $A_r^{(2)}$, r = 1, 2, are two gauge fields. Then $(\phi, \chi, A_r^{(I)})$, I = 1, 2, r = 1, 2, satisfy the self-dual equation for the Chern-Simons-Higgs model with two Higgs particles. For details of computations, we refer the readers to [8,15,16] and the references therein.

For the past twenty years, the equation of Chern-Simons with one Higgs particle has been intensively studied, e.g., see [2–4,6–8,10–15,17–20,22,23] and references therein. However, the study for the system (1.1) only recently began with the paper [16]. For Eq. (1.1), there are two natural boundary conditions for solutions at ∞, namely,

$$\text{(i)}\ \lim_{|x|\to\infty} u(x) = \lim_{|x|\to\infty} v(x) = 0, \quad \text{or} \quad \text{(ii)}\ \lim_{|x|\to\infty} u(x) = \lim_{|x|\to\infty} v(x) = -\infty. \tag{1.3}$$

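We note that if (u, v) is a solution with the boundary condition either (i) or (ii), then, by the maximum principle, we have u(x) < 0 and v(x) < 0 for all $x \in \mathbb{R}^2$. In the physics literature, a solution (u, v) satisfying boundary condition (i) is called a topological solution. Since the nonlinear term $e^{v}(1 - e^{u}) = -e^{v}u + O(|u|^2)$ for u small and u(x), v(x) → 0 as |x| → +∞, by the estimates of elliptic PDE, we know that if (u, v) is a topological solution of (1.1), then both |u| and |v| decay exponentially at ∞.

To solve (1.1), one may consider a regularized form:

$$\begin{cases} \Delta u + \lambda e^{v}(1 - e^{u}) = \displaystyle\sum_{s=1}^{N} \frac{4\alpha_s\varepsilon}{(\varepsilon + |x - p_s|^2)^2} \\[3mm] \Delta v + \lambda e^{u}(1 - e^{v}) = \displaystyle\sum_{s=1}^{N'} \frac{4\alpha'_s\varepsilon}{(\varepsilon + |x - p'_s|^2)^2}, \end{cases} \tag{1.4}$$

where ε is a small positive number, and introduce the background functions

$$u_0^{\varepsilon}(x) = \sum_{s=1}^{N} \alpha_s \ln\frac{\varepsilon + |x - p_s|^2}{1 + |x - p_s|^2}, \qquad v_0^{\varepsilon}(x) = \sum_{s=1}^{N'} \alpha'_s \ln\frac{\varepsilon + |x - p'_s|^2}{1 + |x - p'_s|^2}.$$

Then

$$\Delta u_0^{\varepsilon}(x) = -h_1(x) + \sum_{s=1}^{N} \frac{4\alpha_s\varepsilon}{(\varepsilon + |x - p_s|^2)^2}, \qquad \Delta v_0^{\varepsilon}(x) = -h_2(x) + \sum_{s=1}^{N'} \frac{4\alpha'_s\varepsilon}{(\varepsilon + |x - p'_s|^2)^2},$$

where $h_1, h_2 \in W^{1,2}$ do not depend on ε > 0. By letting $u = u_0^{\varepsilon} + f$, $v = v_0^{\varepsilon} + g$, the regularized form of (1.1) becomes

$$\begin{cases} \Delta f + \lambda e^{v_0^{\varepsilon} + g}(1 - e^{u_0^{\varepsilon} + f}) = h_1 \\[1mm] \Delta g + \lambda e^{u_0^{\varepsilon} + f}(1 - e^{v_0^{\varepsilon} + g}) = h_2. \end{cases} \tag{1.5}$$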
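The right-hand sides of (1.4) are meant to approximate the Dirac masses of (1.1). As a sanity check that is not part of the original text, the Python sketch below integrates the bump 4ε/(ε + |x|²)² over a disc and compares with the closed form 4πR²/(ε + R²), which tends to 4π as ε → 0. The function names are ours, and the quadrature is a plain midpoint rule.

```python
import math

def bump_mass(eps, radius, n=200_000):
    """Integrate 4*eps/(eps + |x|^2)^2 over the disc |x| < radius in R^2.

    In polar coordinates with t = r^2 the integral equals
    pi * int_0^{radius^2} 4*eps/(eps + t)^2 dt, evaluated by the midpoint rule.
    """
    t_max = radius * radius
    h = t_max / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += 4.0 * eps / (eps + t) ** 2
    return math.pi * h * total

def bump_mass_exact(eps, radius):
    """Closed form of the same integral: 4*pi*R^2/(eps + R^2)."""
    return 4.0 * math.pi * radius ** 2 / (eps + radius ** 2)

for eps in (1.0, 0.1, 0.01):
    print(eps, bump_mass(eps, 1.0), bump_mass_exact(eps, 1.0), 4.0 * math.pi)
```

As ε decreases, the mass inside any fixed disc approaches 4π, consistent with the factor 4π in front of each Dirac measure in (1.1).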

It is clear that (1.5) is the system of Euler-Lagrange equations of the nonlinear functional

$$I(f, g) = \int_{\mathbb{R}^2} \Bigl(\nabla f \cdot \nabla g + \lambda e^{u_0^{\varepsilon} + v_0^{\varepsilon} + f + g} - \lambda e^{u_0^{\varepsilon} + f} - \lambda e^{v_0^{\varepsilon} + g} + h_2 f + h_1 g\Bigr)\, dx. \tag{1.6}$$

We refer to [16] for the details of arguments. From (1.6), we see that Eq. (1.1) is the so-called skew gradient system in the literature, see [21]. Clearly, the indefinite form of I presents a lot of difficulties for solving Eq. (1.1). Hence, it is remarkable that Lin, Ponce and Yang [16] are able to show the existence of topological solutions for Eq. (1.1) for any given set of singularities.

Theorem A. [16] For any given sets $\{p_1, \ldots, p_N\}$ and $\{p'_1, \ldots, p'_{N'}\}$ and $\alpha_s, \alpha'_s > 0$, Eq. (1.1) possesses a topological solution (u, v).

After Theorem A, it is natural to ask about the uniqueness of topological solutions for Eq. (1.1). For the single Chern-Simons-Higgs model, the uniqueness result was proved in [6] with only one singularity, and in [3] and [19] for multi-singularity in $\mathbb{R}^2$ and large λ as well as in the periodic case. In this article, we consider the topological solution (u, v) for the case N = N′ = 1 and $p_1$ and $p'_1$ to be the origin O. Then (u, v) satisfies

$$\begin{cases} \Delta u + e^{v}(1 - e^{u}) = 4\pi N_1 \delta_0 \\ \Delta v + e^{u}(1 - e^{v}) = 4\pi N_2 \delta_0 \end{cases} \quad \text{in } \mathbb{R}^2, \tag{1.7}$$

with the boundary condition

$$u(x) \to 0, \quad v(x) \to 0 \quad \text{as } |x| \to \infty. \tag{1.8}$$

By noting u(x) < 0, v(x) < 0 for $x \in \mathbb{R}^2$, and by applying the standard method of moving planes, we can show that (u, v) is radially symmetric with respect to the origin O. The proof is standard, and will be omitted here. We refer to [1] for the details of the proof.

For a single nonlinear elliptic equation, the uniqueness problem has been extensively studied over the last decades. It is well-known that the uniqueness problem is closely related to non-degeneracy of its linearized equation. See [4,5] and references therein. In this paper, we also want to prove the uniqueness by studying the non-degeneracy of linearized equations. The linearized equation at (u, v) of (1.7) is called degenerate if there exists a nonzero bounded solution pair (A(r), B(r)) of

$$\begin{cases} \Delta A + e^{v}(1 - e^{u})B - e^{u+v}A = 0 \\ \Delta B + e^{u}(1 - e^{v})A - e^{u+v}B = 0 \end{cases} \quad \text{in } \mathbb{R}^2. \tag{1.9}$$

Compared with the case of a single equation, there are additional difficulties to be overcome for (1.9). In the proof of the uniqueness for a single equation, some standard techniques such as the Sturm-Liouville comparison theorem play important roles. See [4] and [5]. However, these standard tools are no longer available for the system (1.9). Hence, we have to develop new ideas to handle (1.9), which will be presented in Sect. 2. We believe that the method developed here should be helpful for a general class of nonlinear elliptic systems. After the non-degeneracy of (1.9) is established, we can prove the following uniqueness theorem.


Theorem 1.1. Let (u, v) be a topological solution of (1.7). Then the linearized equation (1.9) of (1.7) at (u, v) is non-degenerate. Moreover, Eq. (1.7) possesses one and only one topological solution.

Now we come back to discuss the case of the boundary condition (ii) of (1.3). In the Abelian Chern-Simons-Higgs model with one particle, a solution u(x) satisfies

$$\Delta u + e^{u}(1 - e^{u}) = 4\pi N \delta_0 \quad \text{in } \mathbb{R}^2. \tag{1.10}$$

Suppose u = u(|x|) is a non-topological solution of (1.10), i.e., u(r) → −∞ as r → +∞. Then it can be proved that u satisfies

$$\int_{\mathbb{R}^2} e^{u}(1 - e^{u})\, dx < +\infty. \tag{1.11}$$

But, for the system (1.1), (1.11) might not hold even for the radial solution (u(r), v(r)). Actually, in Sect. 5, we will show that there exists a solution pair (u, v) of (1.7) such that both u(r) and v(r) tend to −∞ as r → ∞ with $\int_{\mathbb{R}^2} e^{v}(1 - e^{u})\, dx < +\infty$ and $\int_{\mathbb{R}^2} e^{u}(1 - e^{v})\, dx = +\infty$. Thus, compared with (1.10), the structure of solutions for (1.7) could be more complicated. One of our purposes in this paper is to classify solutions according to their behaviors at infinity. In this paper, we call a solution non-topological if (u, v) satisfies the boundary condition (ii) in (1.3), and both $e^{u}(1 - e^{v})$ and $e^{v}(1 - e^{u})$ are in $L^1(\mathbb{R}^2)$. For an entire solution (u, v) of (1.1), we set

$$\beta_1 = \frac{1}{2\pi}\int_{\mathbb{R}^2} e^{v}(1 - e^{u})\, dx, \qquad \beta_2 = \frac{1}{2\pi}\int_{\mathbb{R}^2} e^{u}(1 - e^{v})\, dx. \tag{1.12}$$

In order to investigate the structure of all radial solutions of (1.7), we consider the following ODE system:

$$\begin{cases} u''(r) + \dfrac{1}{r}u'(r) + e^{v(r)}(1 - e^{u(r)}) = 0, \\[2mm] v''(r) + \dfrac{1}{r}v'(r) + e^{u(r)}(1 - e^{v(r)}) = 0, \end{cases} \quad r > 0, \tag{1.13}$$

with the initial value

$$u(r) = 2N_1 \log r + \alpha_1 + o(1), \quad v(r) = 2N_2 \log r + \alpha_2 + o(1) \quad \text{as } r \to 0^+. \tag{1.14}$$
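For readers who wish to experiment with (1.13)–(1.14), the following Python sketch integrates the system with the classical fourth-order Runge-Kutta method. It is a minimal illustration under stated assumptions, not the method of this paper: the function name `integrate_radial` is ours, the integration runs in the variable t = ln r to remove the 1/r singularity, and the start values at a small radius r0 keep only the leading terms of (1.14).

```python
import math

def integrate_radial(N1, N2, a1, a2, r0=1e-4, r_end=1.0, steps=4000):
    """Integrate (1.13) in t = ln r with classical RK4.

    State y = (u, p, v, q) with p = r u'(r), q = r v'(r), so that
        du/dt = p,  dp/dt = -r^2 e^v (1 - e^u),
        dv/dt = q,  dq/dt = -r^2 e^u (1 - e^v).
    Start values at r0 drop the o(1) remainder of (1.14) (an approximation).
    """
    def rhs(t, y):
        u, p, v, q = y
        r2 = math.exp(2.0 * t)
        return (p, -r2 * math.exp(v) * (1.0 - math.exp(u)),
                q, -r2 * math.exp(u) * (1.0 - math.exp(v)))

    t = math.log(r0)
    h = (math.log(r_end) - t) / steps
    y = (2 * N1 * t + a1, 2.0 * N1, 2 * N2 * t + a2, 2.0 * N2)
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k1)))
        k3 = rhs(t + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k2)))
        k4 = rhs(t + h, tuple(yi + h * ki for yi, ki in zip(y, k3)))
        y = tuple(yi + h / 6 * (a + 2 * b + 2 * c + d)
                  for yi, a, b, c, d in zip(y, k1, k2, k3, k4))
        t += h
    return y

# Symmetric data N1 = N2, alpha_1 = alpha_2: the two components must coincide.
u, p, v, q = integrate_radial(N1=1, N2=1, a1=-1.0, a2=-1.0)
print("u(1) =", u, " v(1) =", v, " r u'(1) =", p)
```

Scanning the parameters (α1, α2) with such a routine gives a rough picture of the different behaviors at infinity discussed next, although any numerical classification near the boundaries between regions should be treated with caution.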

According to the behaviors at ∞, all entire solutions of (1.13) can be classified into the following five types:

Type (I): $\lim_{r\to\infty} (u(r), v(r)) = (0, 0)$, i.e., (u, v) is the topological solution.

Type (II): $\lim_{r\to\infty} (u(r), v(r)) = (-\infty, -\infty)$ with $\beta_1 < \infty$ and $\beta_2 < \infty$, i.e., (u, v) is a non-topological solution.

Type (III): $\lim_{r\to\infty} u(r) = -\infty$, $\lim_{r\to\infty} v(r) = -\infty$, and either $2N_1 < \beta_1 \le 2N_1 + 2$, $\beta_2 = \infty$ or $\beta_1 = \infty$, $2N_2 < \beta_2 \le 2N_2 + 2$.

Type (IV): $\lim_{r\to\infty} (u(r), v(r)) = (-c_u, -\infty)$ or $\lim_{r\to\infty} (u(r), v(r)) = (-\infty, -c_v)$ for some constants $c_u > 0$ and $c_v > 0$.

Type (V): $\lim_{r\to\infty} (u(r), v(r)) = (+\infty, -\infty)$ or $\lim_{r\to\infty} (u(r), v(r)) = (-\infty, +\infty)$.

Our second result is the asymptotic behaviors of all entire solutions.

Theorem 1.2. Let (u, v) be a solution of (1.13)–(1.14). Then (u, v) must be one of the above five types. Conversely, solutions of all types do exist.

Let $\alpha = (\alpha_1, \alpha_2)$, and let (u(r, α), v(r, α)) denote the solution of (1.13)–(1.14). According to the behavior of (u, v), the set of initial data could be classified into the following regions:

$\Omega = \{\alpha \mid (u(r, \alpha), v(r, \alpha))$ is a solution with $\lim_{r\to\infty} (u(r, \alpha), v(r, \alpha)) = (-\infty, -\infty)\}$,
$T = \{\alpha \mid (u(r, \alpha), v(r, \alpha))$ is the unique topological solution$\}$,
$NT = \{\alpha \mid (u(r, \alpha), v(r, \alpha))$ is a non-topological solution$\}$,
$S_u = \{\alpha \mid (u(r, \alpha), v(r, \alpha))$ is a Type (IV) solution with $\lim_{r\to\infty} u(r) = -c_u\}$,
$S_v = \{\alpha \mid (u(r, \alpha), v(r, \alpha))$ is a Type (IV) solution with $\lim_{r\to\infty} v(r) = -c_v\}$,
$W_u = \{\alpha \mid (u(r, \alpha), v(r, \alpha))$ is a Type (V) solution with $\lim_{r\to\infty} u(r) = \infty\}$,
$W_v = \{\alpha \mid (u(r, \alpha), v(r, \alpha))$ is a Type (V) solution with $\lim_{r\to\infty} v(r) = \infty\}$.

Then the structure of solutions sets is described as follows:

Theorem 1.3. Both Ω and NT are non-empty, open and simply connected. Furthermore, all sets Ω \ NT, $S_u$, $S_v$, $W_u$ and $W_v$ are non-empty, and the following statements are valid.

(i) $\Omega = \Omega_v \cup NT \cup \Omega_u$ is a non-empty and simply connected set, where
$\Omega_u = \{\alpha \in \Omega \mid (u(r, \alpha), v(r, \alpha))$ is a Type (III) solution with $\beta_1 < \infty\}$,
$\Omega_v = \{\alpha \in \Omega \mid (u(r, \alpha), v(r, \alpha))$ is a Type (III) solution with $\beta_2 < \infty\}$.

(ii) $\partial\Omega = S_u \cup T \cup S_v$ and $\bar S_u \cap \bar S_v = \partial\Omega \cap \partial NT = T$.

(iii) For any α ∈ NT the corresponding $(\beta_1, \beta_2)$ satisfies

$$(\beta_1 - 2(N_1 + 1))(\beta_2 - 2(N_2 + 1)) > 4(N_1 + 1)(N_2 + 1). \tag{1.15}$$

(iv) $W_u$ is open. Furthermore, for each $(\theta, \eta) \in S_u$ there exists ε > 0 such that $(\alpha_1, \eta) \in W_u$ for all $\theta < \alpha_1 < \theta + \varepsilon$.

(v) $W_v$ is open. Furthermore, for each $(\mu, \nu) \in S_v$ there exists δ > 0 such that $(\mu, \alpha_2) \in W_v$ for all $\nu < \alpha_2 < \nu + \delta$.

We remark that the uniqueness of topological solutions implies the simple-connectedness of both Ω and NT. The simple-connectedness is important in itself, because it allows us to study the linearized equation of (1.7) at any non-topological solution through the argument of continuation. An important question about non-topological solutions arises: given any pair $(\beta_1, \beta_2)$ satisfying (1.15) of Theorem 1.3, is there a unique non-topological solution (u, v) which satisfies (1.12)? We will come back to this issue in a coming paper.

From Theorem 1.3, we note that there are drastic differences between the solutions of (1.10) and (1.7). For Eq. (1.10), if a solution is positive somewhere, then it will blow


up in finite |x|. But the situations do change for the system of equations. For example, the solution of Type (V) depicts that u might be positive somewhere, but both u and v do not blow up in finite |x|. Another consequence of Theorem 1.2 is that if both u and v are positive at some $|x_0|$, then u and v must blow up in finite |x|.

The paper is organized as follows. First we investigate the monotone and non-degenerate properties of the linearized equations on the negative solutions of (1.13) in Sect. 2. Based on the results of Sect. 2 and applying the Implicit Function Theorem, we prove the uniqueness of topological solution for (1.13) in Sect. 3. In Sect. 4, we will give the asymptotic behaviors of all entire solutions. Finally, we prove the existence and classification of solutions of all types, Theorems 1.2 and 1.3, in Sect. 5.

2. The Non-Degeneracy of Linearized Equations

In this section, we give the proof of the non-degeneracy of the linearized equation on the topological solution of (1.7). Before going to our proof, we need to state some properties concerning solutions. First, we have the Pohozaev identity as follows.

Lemma 2.1. (Pohozaev identity). Let (u(r), v(r)) be a solution of (1.13)–(1.14) in (0, R] for some R > 0. Then we have the following identity:

$$\bigl[r u'(r) \cdot r v'(r) + r^2(e^{u(r)} + e^{v(r)}) - r^2 e^{u(r)+v(r)}\bigr] - 2\int_0^r s\,(e^{u(s)} + e^{v(s)})\, ds + 2\int_0^r s\, e^{u(s)+v(s)}\, ds = 4N_1N_2 \quad \forall r \in (0, R]. \tag{2.1}$$

Proof. By multiplying $r v'$ and $r u'$ on both sides of the first and second equation of (1.13) respectively, we obtain

$$\begin{cases} r v'\,(r u')' + r v' \cdot r e^{v}(1 - e^{u}) = 0 \\ r u'\,(r v')' + r u' \cdot r e^{u}(1 - e^{v}) = 0 \end{cases} \quad \forall r \in (0, R]. \tag{2.2}$$

Then adding these two equations together and taking the integration from 0 to r, we get

$$\bigl[r u'(r) \cdot r v'(r) - \lim_{r\to 0^+}(r u'(r) \cdot r v'(r))\bigr] + \int_0^r s^2\, d(e^{u(s)}) + \int_0^r s^2\, d(e^{v(s)}) - \int_0^r s^2\, d(e^{u(s)+v(s)}) = 0 \quad \forall r \in (0, R].$$

By the above equality, and using the initial value (1.14) and the integration by parts, we can easily obtain (2.1).

Secondly, we have the following property for solutions with zero boundary value.

Lemma 2.2. Let (u(r), v(r)) be a solution of (1.13)–(1.14) satisfying $u(R_0) = v(R_0) = 0$ for some $R_0 > 0$ (or $R_0 = +\infty$). Then the following are valid:
(i) u < 0, v < 0, u′ > 0 and v′ > 0 on $(0, R_0)$. Furthermore, if $R_0 = \infty$, i.e., (u, v) is a topological solution of (1.7), then the corresponding $(\beta_1, \beta_2)$ satisfies $\beta_1 = 2N_1$ and $\beta_2 = 2N_2$, where $(\beta_1, \beta_2)$ is defined in (1.12).
(ii) If $N_1 < N_2$, then u > v on $(0, R_0)$.
(iii) If $N_1 > N_2$, then u < v on $(0, R_0)$.
(iv) If $N_1 = N_2$, then u ≡ v.
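The Pohozaev identity (2.1) of Lemma 2.1 lends itself to a numerical check. The Python sketch below is an illustration under our own assumptions, not part of the original argument: it integrates (1.13) in t = ln r by classical RK4, carries the two integrals of (2.1) as extra state components, starts from the leading terms of (1.14) at a small radius r0, and neglects the contribution of the integrals over (0, r0). The function name `pohozaev_defect` is hypothetical.

```python
import math

def pohozaev_defect(N1, N2, a1, a2, r0=1e-4, r_end=1.0, steps=4000):
    """Return (left-hand side of (2.1) at r_end) - 4*N1*N2.

    State y = (u, p, v, q, I1, I2) in t = ln r, where p = r u', q = r v',
    I1 = int_0^r s (e^u + e^v) ds and I2 = int_0^r s e^(u+v) ds.
    """
    def rhs(t, y):
        u, p, v, q, _, _ = y
        r2 = math.exp(2.0 * t)
        eu, ev = math.exp(u), math.exp(v)
        return (p, -r2 * ev * (1.0 - eu), q, -r2 * eu * (1.0 - ev),
                r2 * (eu + ev), r2 * eu * ev)

    t = math.log(r0)
    h = (math.log(r_end) - t) / steps
    # Leading-order start values from (1.14); integrals over (0, r0) neglected.
    y = (2 * N1 * t + a1, 2.0 * N1, 2 * N2 * t + a2, 2.0 * N2, 0.0, 0.0)
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k1)))
        k3 = rhs(t + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k2)))
        k4 = rhs(t + h, tuple(yi + h * ki for yi, ki in zip(y, k3)))
        y = tuple(yi + h / 6 * (a + 2 * b + 2 * c + d)
                  for yi, a, b, c, d in zip(y, k1, k2, k3, k4))
        t += h
    u, p, v, q, I1, I2 = y
    r2 = r_end ** 2
    lhs = (p * q + r2 * (math.exp(u) + math.exp(v)) - r2 * math.exp(u + v)
           - 2.0 * I1 + 2.0 * I2)
    return lhs - 4.0 * N1 * N2

defect = pohozaev_defect(N1=1, N2=2, a1=-1.0, a2=-0.5)
print("Pohozaev defect at r = 1:", defect)
```

The defect should be small compared with 4N1N2 and shrink as the step count grows, since the left-hand side of (2.1) is constant along exact solutions.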


Proof. We shall apply the maximum principle to prove (i). Suppose $u(r_0) = \max_{(0, R_0]} u > 0$. Then $\Delta u(r_0) \le 0$ and thus

$$0 = \Delta u(r_0) + e^{v(r_0)}(1 - e^{u(r_0)}) < 0,$$

which yields a contradiction. Hence, u(r) ≤ 0 on $(0, R_0)$. The strong maximum principle implies u(r) < 0 in $(0, R_0)$. Similarly, it holds for v. Since u(r) < 0 and v(r) < 0 in $(0, R_0)$, the maximum principle also implies that both u and v can not attain their local minima inside $(0, R_0)$. Since u′(r) > 0 and v′(r) > 0 for r near 0, we obtain u′(r) > 0, v′(r) > 0 on $(0, R_0)$. If $R_0 = \infty$ then, by u(r) < 0 on (0, ∞) and (1.7), we have

$$(r u'(r))' = r e^{v(r)}(e^{u(r)} - 1) < 0 \quad \forall r \in (0, \infty).$$

Thus, by u′(r) > 0 on (0, ∞), we get that $0 \le \lim_{r\to\infty} r u'(r) = 2N_1 - \int_0^\infty r e^{v}(1 - e^{u})\, dr$ exists (≡ $c_u$). If $c_u > 0$ then we easily have u(r) > 0 for large r. This contradiction proves $c_u = 0$. From this we get $\beta_1 = 2N_1$. The case of $\beta_2$ is similar. Hence (i) holds.

By (1.7), we have $\Delta(u - v) = 4\pi(N_1 - N_2)\delta_0 + (e^{u} - e^{v})$. If $(u - v)(r_0) < 0$ at some $r_0 \in (0, R_0)$, then we can let $r_0$ satisfy $(u - v)(r_0) = \min_{(0, R_0]} (u - v) < 0$, and we have

$$0 \le \Delta(u - v)(r_0) = e^{u(r_0)} - e^{v(r_0)} < 0,$$

a contradiction. Hence u(r) ≥ v(r). By the strong maximum principle, the strict inequality u(r) > v(r) holds for $r \in (0, R_0)$. This proves (ii). Obviously, (iii) and (iv) follow easily.

In the following, we investigate the monotone property of the negative solution of (1.13)–(1.14). Let, for i = 1, 2,

$$\phi_i(r) = \frac{\partial U}{\partial \alpha_i}, \qquad \psi_i(r) = \frac{\partial V}{\partial \alpha_i}, \tag{2.3}$$

where $U(r; \alpha_1, \alpha_2) = u(r; \alpha_1, \alpha_2) - 2N_1 \log r$ and $V(r; \alpha_1, \alpha_2) = v(r; \alpha_1, \alpha_2) - 2N_2 \log r$. Then $(\phi_i, \psi_i)$, i = 1, 2, satisfy the linearized equations

$$\begin{cases} \Delta\phi_i - e^{u+v}\phi_i + e^{v}(1 - e^{u})\psi_i = 0, & r \in (0, R_0), \\ \Delta\psi_i - e^{u+v}\psi_i + e^{u}(1 - e^{v})\phi_i = 0, & r \in (0, R_0), \\ \phi_1(0) = 1 = \psi_2(0), \quad \phi_2(0) = 0 = \psi_1(0), \quad \phi_i'(0) = 0 = \psi_i'(0). \end{cases} \tag{2.4}$$

The monotone property of $\phi_i$ and $\psi_i$ is as follows:

Lemma 2.3. Let (u(r), v(r)) be a solution of (1.13)–(1.14). If u(r) < 0 and v(r) < 0 for $r \in (0, R_0)$ for some $R_0 > 0$ (or $R_0 = \infty$), then the corresponding $(\phi_i, \psi_i)$ satisfy

$$\begin{cases} \phi_1(r) > 0, \ \phi_1'(r) > 0, \ \phi_2(r) < 0, \ \phi_2'(r) < 0, \\ \psi_1(r) < 0, \ \psi_1'(r) < 0, \ \psi_2(r) > 0, \ \psi_2'(r) > 0 \end{cases} \quad \forall r \in (0, R_0). \tag{2.5}$$
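The zeroth-order coefficients in the linearized system (2.4) are the partial derivatives of the nonlinear terms of (1.13). The Python sketch below, which is only an illustration with hypothetical names `F`, `G` and `central_diff`, compares those coefficients with a central finite difference at an arbitrary sample point.

```python
import math

def F(u, v):
    """Nonlinear term of the first equation in (1.13)."""
    return math.exp(v) * (1.0 - math.exp(u))

def G(u, v):
    """Nonlinear term of the second equation in (1.13)."""
    return math.exp(u) * (1.0 - math.exp(v))

def central_diff(fun, u, v, du, dv, h=1e-6):
    """Directional derivative of fun at (u, v) in the direction (du, dv)."""
    return (fun(u + h * du, v + h * dv) - fun(u - h * du, v - h * dv)) / (2.0 * h)

u, v = -0.7, -1.3       # arbitrary negative sample values
phi, psi = 0.4, -0.9    # arbitrary perturbation direction

# Zeroth-order terms of (2.4), moved to the same side as the Laplacian.
lin_F = -math.exp(u + v) * phi + math.exp(v) * (1.0 - math.exp(u)) * psi
lin_G = -math.exp(u + v) * psi + math.exp(u) * (1.0 - math.exp(v)) * phi

err_F = abs(central_diff(F, u, v, phi, psi) - lin_F)
err_G = abs(central_diff(G, u, v, phi, psi) - lin_G)
print("finite-difference mismatch:", err_F, err_G)
```

Both mismatches are at the level of finite-difference error, which supports reading (2.4) as the derivative of (1.13) with respect to the parameters of the initial data.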


J.-L. Chern, Z.-Y. Chen, C.-S. Lin

Proof. By (2.4) and (1.14), there exists $r_0 \in (0, R_0)$ such that
\[
\begin{aligned}
r\psi_1'(r) &= -\int_0^r s\,[e^{u(s)}(1 - e^{v(s)})\phi_1(s) - e^{u(s)+v(s)}\psi_1(s)]\,ds && \forall r > 0\\
&\le -\int_0^r s\,[C_1 s^{2N_1}(1 - C_2 s^{2N_2})\phi_1(s) - C_3 s^{2N_1+2N_2}\psi_1(s)]\,ds && \forall r \in (0, r_0)\\
&\le -C r^{2N_1+2} < 0 && \forall r \in (0, r_0).
\end{aligned}
\tag{2.6}
\]
By $\psi_1(0) = 0$, $\psi_1'(0) = 0$ and (2.6), we have $\psi_1(r) < 0$ and $\psi_1'(r) < 0$ for all $r \in (0, r_0)$. On the other hand, by (2.4), (1.14), and the above result, we get
\[
\begin{aligned}
r\phi_1'(r) &= \int_0^r s\,[e^{u(s)+v(s)}\phi_1(s) + e^{v(s)}(e^{u(s)} - 1)\psi_1(s)]\,ds && \forall r > 0\\
&\ge C_4 \int_0^r s \cdot s^{2N_1+2N_2}\phi_1(s)\,ds && \forall r \in (0, r_0)\\
&\ge C r^{2N_1+2N_2+2} > 0 && \forall r \in (0, r_0).
\end{aligned}
\tag{2.7}
\]
By $\phi_1(0) = 1$, $\phi_1'(0) = 0$ and (2.7), we have $\phi_1(r) > 0$ and $\phi_1'(r) > 0$ for all $r \in (0, r_0)$. These prove that the first inequality of (2.5) holds for $r \in (0, r_0)$. However, (2.6) and (2.7) hold as long as the first inequality of (2.5) is true. This shows that the first inequality of (2.5) holds. The proof for the second inequality of (2.5) is similar. The proof is complete.

Finally, we state and prove the non-degenerate property of the linearized equation at a topological solution in the following:

Lemma 2.4. Let $(u(r), v(r))$ be a solution of (1.13)–(1.14) satisfying $u(R_0) = v(R_0) = 0$ for some $R_0 > 0$ (or $R_0 = +\infty$). If $(\phi_i(r), \psi_i(r))$, $i = 1, 2$, is the respective solution pair of (2.4), then the following statements are valid.

(i) If $R_0 = \infty$, i.e., $(u, v)$ is a topological solution, then there exist constants $c_1 > 0$, $c_2 < 0$, $d_1 < 0$ and $d_2 > 0$ such that
\[ \lim_{r\to\infty} \frac{\phi_i(r)}{r^{-1/2}e^{r}} = c_i \quad\text{and}\quad \lim_{r\to\infty} \frac{\psi_i(r)}{r^{-1/2}e^{r}} = d_i, \quad i = 1, 2. \]

(ii) Let $M_A(r) = -\frac{\phi_1(r)}{\phi_2(r)}$ and $M_B(r) = -\frac{\psi_1(r)}{\psi_2(r)}$. Then $M_A(r) > M_B(r) > 0$ for all $r \in [0, R_0]$ (resp., $[0, \infty)$ if $R_0 = \infty$) and $M_A'(r) < 0$, $M_B'(r) > 0$ for all $r \in (0, R_0)$.

(iii) $\det\begin{pmatrix} \phi_1(r) & \phi_2(r)\\ \psi_1(r) & \psi_2(r) \end{pmatrix} \ne 0$ for all $r \in [0, R_0]$ (resp., $[0, \infty)$ if $R_0 = \infty$).

(iv) The corresponding linearized equation (1.9) is non-degenerate.

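Proof. (i) We prove the asymptotic behavior of $\phi_1$. The cases of $\psi_1$, $\phi_2$ and $\psi_2$ are similar. Let $w(r) = \phi_1(r) - \psi_1(r) - e^{r}$. Then by (2.4), $w$ satisfies
\[ \Delta w(r) = (e^{u}\phi_1 - e^{v}\psi_1) - \Bigl(1 + \frac{1}{r}\Bigr)e^{r}, \qquad w(0) = 0,\quad w'(0) = -1. \]
Since $u(r) < 0$, $v(r) < 0$, $\phi_1(r) > 0$ and $\psi_1(r) < 0$ for all $r > 0$, it follows that
\[ \Delta w \le (\phi_1(r) - \psi_1(r)) - \Bigl(1 + \frac{1}{r}\Bigr)e^{r} = w(r) - \frac{1}{r}e^{r} \quad \forall r \in (0, \infty). \]
Thus we obtain $w(r) < 0$ for all $r \in (0, \infty)$, i.e.,
\[ \phi_1(r) - \psi_1(r) < e^{r} \quad\text{on } (0, \infty). \tag{2.8} \]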
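The equation for $w$ can be checked by subtracting the two equations of (2.4) with $i = 1$. The following short verification is a sketch, assuming that $\Delta$ denotes the radial Laplacian $\frac{d^2}{dr^2} + \frac{1}{r}\frac{d}{dr}$:

```latex
\begin{align*}
\Delta(\phi_1 - \psi_1)
  &= \bigl[e^{u+v}\phi_1 - e^{v}(1-e^{u})\psi_1\bigr]
   - \bigl[e^{u+v}\psi_1 - e^{u}(1-e^{v})\phi_1\bigr] \\
  &= e^{u}\phi_1 - e^{v}\psi_1 , \\
\Delta e^{r} &= e^{r} + \frac{1}{r}\,e^{r} = \Bigl(1+\frac{1}{r}\Bigr)e^{r} .
\end{align*}
% With u, v < 0, \phi_1 > 0 and \psi_1 < 0 one has
% e^{u}\phi_1 \le \phi_1 and -e^{v}\psi_1 \le -\psi_1,
% which gives the differential inequality used for w.
```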

Uniqueness and Structure of Solutions for the Chern-Simons System


Let $z(r) = \phi_1(r)\,r^{1/2}$. Then $z$ satisfies
\[ z''(r) + [-1 + q(r)]\,z(r) = 0, \tag{2.9} \]
where
\[ q(r) = 1 - e^{u+v} + \frac{e^{v}(1 - e^{u})\psi_1}{\phi_1} + \frac{1}{4r^2}. \]
Since $\lim_{r\to\infty} \frac{e^{u} - 1}{u} = 1$ and $\psi_1 < 0$, we have
\[ (e^{u} - 1)\psi_1 \le C \cdot u(r)\psi_1(r) \quad\text{for large } r \text{ and some } C > 0. \tag{2.10} \]
By $|u(r)|, |v(r)| \le C r^{-1/2}e^{-r}$ for large $r$, (2.8) and (2.10), we easily obtain
\[ (e^{u} - 1)\psi_1 \le C r^{-1/2} \quad\text{for large } r. \]
From this and $\phi_1(r) > r$ for large $r$, we deduce $\frac{-e^{v}(1 - e^{u})\psi_1}{\phi_1} \in L^1(R, \infty)$ for $R > 0$ large. Moreover, since $\int_R^\infty (1 - e^{u+v})\,dr < \infty$, we get
\[ q(r) \in L^1[R, \infty). \tag{2.11} \]
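The form of (2.9) can be checked directly from the first equation of (2.4). The sketch below assumes that $\Delta$ acts on radial functions as $\phi'' + \phi'/r$ and uses $\phi_1 > 0$ (Lemma 2.3) to divide by $\phi_1$:

```latex
% First equation of (2.4) for i = 1, in radial form:
%   \phi_1'' + \phi_1'/r = e^{u+v}\phi_1 - e^{v}(1-e^{u})\psi_1 .
% With z = r^{1/2}\phi_1:
\begin{align*}
z'' &= r^{1/2}\Bigl(\phi_1'' + \frac{\phi_1'}{r} - \frac{\phi_1}{4r^{2}}\Bigr) \\
    &= \Bigl(e^{u+v} - \frac{e^{v}(1-e^{u})\psi_1}{\phi_1}
             - \frac{1}{4r^{2}}\Bigr)\, z ,
\end{align*}
% hence z'' + [-1 + q] z = 0 with
%   q = 1 - e^{u+v} + e^{v}(1-e^{u})\psi_1/\phi_1 + 1/(4r^{2}).
```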

By (2.11) and applying Corollary 9.2 of [9] to (2.9), we finally obtain
\[ \lim_{r\to\infty} \frac{z(r)}{e^{r}} = c_1 > 0, \quad\text{and hence}\quad \lim_{r\to\infty} \frac{\phi_1(r)}{r^{-1/2}e^{r}} = c_1. \]
This proves the case of $\phi_1$. Thus (i) holds.

(ii) By (2.4), we have $\lim_{r\to 0^+} M_A(r) = \infty$, $\lim_{r\to 0^+} M_B(r) = 0$, and thus $M_A(r) > M_B(r)$ for all $r \in (0, r_1)$ for some $r_1 \in (0, R_0]$. We divide the proof of (ii) into the following two steps.

Step 1. If $M_A(r) > M_B(r)$ for all $r \in (0, r_0)$ for some $r_0 \le R_0$, then $M_A'(r) < 0$ and $M_B'(r) > 0$ for all $r \in (0, r_0)$.

We prove Step 1 by contradiction. Suppose $M_A'(r) < 0$ for all $r \in (0, r_0)$ is not true. Then there exist $0 < r_1 < r_2 \le r_0$ such that $M_A'(r_1) < 0$, $M_A'(r_2) > 0$, $M_A(r_1) = M_A(r_2)\ (\equiv C_0)$, and
\[ 0 < M_B(r) < M_A(r) < C_0 \quad \forall r \in (r_1, r_2). \tag{2.12} \]
For any $c > 0$ and $r \in (0, R_0]$, we define
\[ A_c(r) = \phi_1(r) + c \cdot \phi_2(r) \quad\text{and}\quad B_c(r) = \psi_1(r) + c \cdot \psi_2(r). \tag{2.13} \]
Then $A_c$ and $B_c$ satisfy
\[
\begin{cases}
\Delta A_c - e^{u+v}A_c = e^{v}(e^{u} - 1)B_c & \forall r \in (0, R_0],\\
\Delta B_c - e^{u+v}B_c = e^{u}(e^{v} - 1)A_c & \forall r \in (0, R_0],\\
A_c(0) = 1,\quad B_c(0) = c > 0.
\end{cases}
\tag{2.14}
\]


From (2.12) and (2.13), we easily obtain
\[ A_{C_0}(r) < 0 < B_{C_0}(r) \quad \forall r \in (r_1, r_2) \quad\text{and}\quad A_{C_0}(r_1) = 0 = A_{C_0}(r_2), \tag{2.15} \]
which imply that $A_{C_0}$ has a local minimum at some $\bar r \in (r_1, r_2)$ and $\Delta A_{C_0}(\bar r) \ge 0$. But, from (2.14) and (2.15), we get
\[ \Delta A_{C_0}(\bar r) = e^{u(\bar r)+v(\bar r)}A_{C_0}(\bar r) + e^{v(\bar r)}(e^{u(\bar r)} - 1)B_{C_0}(\bar r) < 0. \tag{2.16} \]
This contradiction proves $M_A'(r) < 0$ for all $r \in (0, r_0)$.

Similarly, suppose $M_B'(r) > 0$ for all $r \in (0, r_0)$ is not true. Then there exist $0 < r_1 < r_2 \le r_0$ such that $M_B'(r_1) > 0$, $M_B'(r_2) < 0$, $M_B(r_1) = M_B(r_2)\ (\equiv C_0)$, and
\[ C_0 < M_B(r) < M_A(r) \quad \forall r \in (r_1, r_2). \tag{2.17} \]
By (2.17) and (2.13), we easily obtain
\[ B_{C_0}(r) < 0 < A_{C_0}(r) \quad \forall r \in (r_1, r_2) \quad\text{and}\quad B_{C_0}(r_1) = 0 = B_{C_0}(r_2), \tag{2.18} \]
and hence $B_{C_0}$ has a local minimum at some $\bar r \in (r_1, r_2)$ with $\Delta B_{C_0}(\bar r) \ge 0$. However, from (2.14) and (2.18) we get
\[ \Delta B_{C_0}(\bar r) = e^{u(\bar r)+v(\bar r)}B_{C_0}(\bar r) + e^{u(\bar r)}(e^{v(\bar r)} - 1)A_{C_0}(\bar r) < 0. \tag{2.19} \]
This contradiction proves Step 1.

Step 2. There does not exist $R \in (0, R_0)$ such that $M_A(R) = M_B(R)$.

Suppose Step 2 is not true. Then there exists a smallest $R \in (0, R_0]$ such that $M_A(R) = M_B(R)\ (\equiv C)$ and $M_A(r) > M_B(r) > 0$ for all $r \in (0, R)$. Let $A_c$ and $B_c$ be defined in (2.13). Then, in this case, by Step 1 we obtain
\[
\begin{cases}
A_C(r) > 0,\ B_C(r) > 0 \quad \forall r \in (0, R),\\
A_C(R) = B_C(R) = 0,\ A_C'(R) < 0,\ B_C'(R) < 0 \quad\text{if } R < \infty.
\end{cases}
\tag{2.20}
\]
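The positivity statements in (2.20) come from combining the monotonicity of Step 1 with the signs of $\phi_2$ and $\psi_2$ in Lemma 2.3. A short sketch of the bookkeeping (only the positivity of $A_C$, $B_C$ on $(0, R)$ is treated here):

```latex
% By Step 1 on (0,R): M_A is decreasing and M_B is increasing, and
% M_A(R) = M_B(R) = C, hence M_B(r) < C < M_A(r) for r in (0,R).
% Since \phi_2 < 0 and \psi_2 > 0 (Lemma 2.3):
\begin{align*}
M_A = -\frac{\phi_1}{\phi_2} > C
  &\iff \phi_1 + C\,\phi_2 = A_C > 0 , \\
M_B = -\frac{\psi_1}{\psi_2} < C
  &\iff \psi_1 + C\,\psi_2 = B_C > 0 .
\end{align*}
% At r = R both relations become equalities, so A_C(R) = B_C(R) = 0.
```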

Differentiating both sides of the Pohozaev identity (2.1) with respect to $\alpha_i$, $i = 1, 2$, we obtain, for any $c > 0$ and $r \in (0, R_0]$,
\[
\begin{aligned}
& r^2 A_c'(r)v'(r) + r^2 B_c'(r)u'(r) + r^2[e^{u(r)}A_c(r) + e^{v(r)}B_c(r)] - r^2 e^{u(r)+v(r)}(A_c(r) + B_c(r))\\
&\quad - 2\int_0^r s\,[e^{u}A_c + e^{v}B_c]\,ds + 2\int_0^r s\,e^{u+v}(A_c + B_c)\,ds = 0.
\end{aligned}
\tag{2.21}
\]
If $R < \infty$ then, by replacing $c$ and $r$ with $C$ and $R$ in (2.21) respectively, we easily have
\[
\begin{aligned}
0 = {}& R^2 A_C'(R)v'(R) + R^2 B_C'(R)u'(R) + R^2 B_C(R)e^{v(R)}(1 - e^{u(R)}) + R^2 A_C(R)e^{u(R)}(1 - e^{v(R)})\\
& + 2\left[\int_0^R r A_C e^{u}(e^{v} - 1)\,dr + \int_0^R r B_C e^{v}(e^{u} - 1)\,dr\right].
\end{aligned}
\tag{2.22}
\]


Then, combining (i) of Lemma 2.2, (2.20) and (2.22), we deduce
\[ 0 > R^2 A_C'(R)v'(R) + R^2 B_C'(R)u'(R) = 2\left[\int_0^R r A_C e^{u}(1 - e^{v})\,dr + \int_0^R r B_C e^{v}(1 - e^{u})\,dr\right] > 0, \]
which yields a contradiction.

If $R = \infty$ then we first claim that
\[ \text{one of } A_C \text{ and } B_C \text{ is unbounded.} \tag{$\dagger$} \]
Suppose (†) is not true. Then $A_C$ and $B_C$ are bounded. By (2.21) we have
\[
\begin{aligned}
0 = {}& \lim_{r\to\infty}\bigl[r A_C'(r) \cdot r v'(r) + r B_C'(r) \cdot r u'(r)\bigr]\\
& + \lim_{r\to\infty}\bigl[B_C(r) \cdot r^2 e^{v(r)}(1 - e^{u(r)}) + A_C(r) \cdot r^2 e^{u(r)}(1 - e^{v(r)})\bigr]\\
& + 2\left[\int_0^\infty r A_C e^{u}(e^{v} - 1)\,dr + \int_0^\infty r B_C e^{v}(e^{u} - 1)\,dr\right].
\end{aligned}
\tag{2.23}
\]
Moreover, (i) of Lemma 2.2 implies that
\[
\lim_{r\to\infty} r u'(r) = 0 = \lim_{r\to\infty} r v'(r), \qquad
\lim_{r\to\infty} [r^2 e^{u(r)}(1 - e^{v(r)})] = 0 = \lim_{r\to\infty} [r^2 e^{v(r)}(1 - e^{u(r)})].
\tag{2.24}
\]
Since $|u|$ and $|v|$ decay exponentially at $\infty$, and $(A_C, B_C)$ is bounded, by (2.14) and (i) of Lemma 2.2, the limits
\[
\begin{aligned}
\lim_{r\to\infty} r A_C'(r) &= \int_0^\infty [r e^{u+v}A_C]\,dr - \int_0^\infty [r e^{v}(1 - e^{u})B_C]\,dr,\\
\lim_{r\to\infty} r B_C'(r) &= \int_0^\infty [r e^{u+v}B_C]\,dr - \int_0^\infty [r e^{u}(1 - e^{v})A_C]\,dr
\end{aligned}
\tag{2.25}
\]
all exist. Hence, due to (2.20) and (2.23)–(2.25), we finally obtain
\[ 0 = 2\left[\int_0^\infty r A_C e^{u}(1 - e^{v})\,dr + \int_0^\infty r B_C e^{v}(1 - e^{u})\,dr\right] > 0. \]
This contradiction shows that $A_C$ or $B_C$ is unbounded, and the claim is proved.

Secondly, suppose $A_C$ is unbounded. From (2.14) we have
\[ \Delta(A_C - B_C) - e^{u}(A_C - B_C) = (e^{u} - e^{v})B_C, \]
and hence, by Lemma 2.2 and the strong maximum principle, we obtain that $A_C(r)$ intersects $B_C(r)$ at most at one point on $[0, \infty)$. Thus, w.l.o.g., we may assume that there exists $r_1 > 0$ such that
\[ A_C'(r_1) \ge 0, \qquad A_C(r) > B_C(r) > 0 \ \text{ on } [r_1, \infty). \tag{2.26} \]
Since $(u, v)$ is a topological solution of (1.7), there exists $r_2 > r_1$ such that
\[ e^{u(r)+v(r)} \ge \max\{e^{v(r)}(1 - e^{u(r)}),\ e^{u(r)}(1 - e^{v(r)})\} \quad \forall r \ge r_2. \tag{2.27} \]


Therefore, by (2.14) and (2.26)–(2.27), we get $A_C'(r) > 0$ on $(r_1, \infty)$ and thus $\lim_{r\to\infty} A_C(r) = \infty$. Now, by applying the same arguments as in the proof of (i), we can obtain
\[ \lim_{r\to\infty} \frac{A_C(r)}{r^{-1/2}e^{r}} = C_A = c_1 + C \cdot c_2 > 0, \]
where $c_1$ and $c_2$ are the constants in (i). Then there exists $\epsilon > 0$ such that $C_\epsilon = c_1 + (C + \epsilon) \cdot c_2 > 0$ and
\[ \lim_{r\to\infty} \frac{A_{C+\epsilon}(r)}{r^{-1/2}e^{r}} = C_\epsilon. \]
But, by Step 1 we have $A_{C+\epsilon}(r) < 0$ for large $r$. We get a contradiction. The case of unboundedness for $B_C$ is similar. This shows Step 2. According to Steps 1 and 2, we easily obtain (ii).

(iii) Suppose $\det\begin{pmatrix} \phi_1(R) & \phi_2(R)\\ \psi_1(R) & \psi_2(R) \end{pmatrix} = 0$ for some $R \in [0, R_0]$ (resp., $R \in [0, \infty)$ if $R_0 = \infty$). Then, w.l.o.g., there exists $C_0 > 0$ such that
\[ \begin{pmatrix} \phi_1(R)\\ \psi_1(R) \end{pmatrix} + C_0 \begin{pmatrix} \phi_2(R)\\ \psi_2(R) \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix}. \tag{2.28} \]
By (2.28) we obtain $M_A(R) = C_0 = M_B(R)$, which contradicts the result of (ii). Hence we prove (iii).

(iv) Let $(u, v)$ be a topological solution of (1.7). Then any solution pair $(A(r), B(r))$ of the linearized equations (1.9) can be written as $A(r) = c_1\phi_1(r) + c_2\phi_2(r)$ and $B(r) = c_1\psi_1(r) + c_2\psi_2(r)$ for some $c_1, c_2 \in \mathbb{R}$, so that $(A, B) = c_1(A_c, B_c)$ with $c = c_2/c_1$ when $c_1 \ne 0$. By the result of (†) in the proof of (ii), we easily obtain the non-degeneracy result if $c = c_2/c_1 > 0$. When $c \le 0$ or $c_1 = 0$, then by (i), we can also get that both $A_c(r)$ and $B_c(r)$ are unbounded. This proves (iv).

3. Uniqueness of Topological Solution

In this section, we will use a continuation argument and Lemma 2.4 to establish the uniqueness of topological solutions. As we have seen in Sect. 2, if $N_1 = N_2$, then $u \equiv v$, and the uniqueness follows from the case of the scalar equation (1.10). Concerning the uniqueness for the scalar equation, we refer readers to [6] or [8].

Proof of Theorem 1.1. Suppose that for some pair $(N_1^0, N_2^0)$, Eq. (1.15) possesses at least two topological solutions. Without loss of generality, we may assume $0 \le N_1^0 < N_2^0$. Let
\[ N_1^* = \inf\{0 \le N_1 \mid \text{(1.15) possesses a unique topological solution for all } (\hat N_1, N_2^0) \text{ where } N_1 \le \hat N_1 \le N_2^0\}. \]
Clearly, $N_1^* \ge N_1^0$. To yield a contradiction, we claim the following:

(∗) Suppose $(u_0, v_0)$ is a topological solution of (1.15) with respect to $(N_1, N_2)$. Let $U_0(r) = u_0(r) - 2N_1 \log r$ and $V_0(r) = v_0(r) - 2N_2 \log r$. Then there is a neighborhood $B$ of $(N_1, N_2)$ such that for any pair $(\tilde N_1, \tilde N_2)$ in $B$, there exists a corresponding $(U, V)$ with respect to $(\tilde N_1, \tilde N_2)$, which is close to $(U_0, V_0)$ in $C^2(B_R(0)) \times C^2(B_R(0))$ for any $R > 0$, where $(u(r), v(r)) = (U(r) + 2\tilde N_1 \log r,\ V(r) + 2\tilde N_2 \log r)$ is a topological solution of (1.15) with respect to $(\tilde N_1, \tilde N_2)$.


If the domain is bounded, then claim (∗) follows directly from the non-degeneracy of the linearized equation and the Implicit Function Theorem. Since our domain is $\mathbb{R}^2$, in order to apply the Implicit Function Theorem, we need to show that the linearized equation of (1.7) is an invertible operator from $W_r^{2,2}(\mathbb{R}^2) \times W_r^{2,2}(\mathbb{R}^2)$ to $L^2(\mathbb{R}^2)$, where $W_r^{2,2}(\mathbb{R}^2) = \{z(x) = z(r) \mid z, z', z'' \in L^2(\mathbb{R}^2)\}$. For the sake of completeness, we will present a proof of the claim (∗).

First, let us assume claim (∗) holds. The proof of claim (∗) will be given later. By the claim (∗), at $(N_1^*, N_2^0)$, Eq. (1.7) possesses a unique topological solution. Thus $N_1^* > N_1^0$. By the definition of $N_1^*$, there are two sequences of solutions $(u_k, v_k)$, $(u_k^*, v_k^*)$ of (1.7) with $(N_1^k, N_2^0)$ such that $N_1^k \downarrow N_1^*$. The following lemma shows the pre-compactness of $(U_k, V_k)$, where $U_k(r) = u_k(r) - 2N_1^k \log r$ and $V_k(r) = v_k(r) - 2N_2^0 \log r$.

Lemma 3.1. There exists a subsequence of $(U_k, V_k)$ such that it converges to $(U, V)$ in $C^2(B_R(0)) \times C^2(B_R(0))$ for any $R > 0$, where $(u(r), v(r)) = (U(r) + 2N_1^* \log r,\ V(r) + 2N_2^0 \log r)$ is a topological solution of (1.15).

Proof. Since $(u_k, v_k)$ is a topological solution of (1.15) with $(N_1^k, N_2^0)$, we have
\[ \int_0^\infty e^{v_k}(1 - e^{u_k})\,r\,dr = 2N_1^k \quad\text{and}\quad \int_0^\infty e^{u_k}(1 - e^{v_k})\,r\,dr = 2N_2^0. \tag{3.1} \]
Since
\[ 2\int_0^r s(1 - e^{u_k(s)})(1 - e^{v_k(s)})\,ds = r^2 - 2\int_0^r s(e^{u_k(s)} + e^{v_k(s)})\,ds + 2\int_0^r s\,e^{u_k(s)+v_k(s)}\,ds, \]
by the Pohozaev identity (2.1), we have
\[ \int_0^\infty r(1 - e^{u_k(r)})(1 - e^{v_k(r)})\,dr = 2N_1^k N_2^0. \]
Thus by (3.1) and the pointwise identity $(1 - e^{u}) + (1 - e^{v}) = 2(1 - e^{u})(1 - e^{v}) + e^{v}(1 - e^{u}) + e^{u}(1 - e^{v})$, we obtain
\[ \int_0^\infty [(1 - e^{u_k}) + (1 - e^{v_k})]\,r\,dr = 2(N_1^k + N_2^0) + 4N_1^k N_2^0 \le C_0 < +\infty \quad \forall k. \tag{3.2} \]
Then
\[ \Delta U_k + e^{v_k}(1 - e^{u_k}) = 0, \qquad \Delta V_k + e^{u_k}(1 - e^{v_k}) = 0. \]
By integrating the equation, one has
\[ -U_k'(r)\,r = \int_0^r e^{v_k}(1 - e^{u_k})\,s\,ds \le \int_0^\infty (1 - e^{u_k})\,s\,ds < C_0 \quad (\text{by (3.2)}). \]
Thus, $|U_k'(r)|$ is uniformly bounded in any bounded subinterval of $[0, \infty)$. We claim that $U_k(1)$ is bounded.


Otherwise, since $U_k(r) - U_k(1)$ is uniformly bounded on any bounded subinterval of $[0, \infty)$, we have $U_k(r) = U_k(1) + O(1)$ for $0 \le r \le R_0$, where $R_0$ is chosen so that
\[ \frac{1}{2}\int_0^{R_0} s\,ds > C_0. \]
Suppose $U_k(1) \to -\infty$. Then for large $k$, $1 - e^{u_k(r)} \ge \frac{1}{2}$ for $0 \le r \le R_0$. Therefore, we have
\[ C_0 \ge \int_0^\infty (1 - e^{u_k})\,r\,dr \ge \int_0^{R_0} (1 - e^{u_k})\,r\,dr \ge \frac{1}{2}\int_0^{R_0} r\,dr > C_0, \]
a contradiction. Therefore, $U_k(1)$ is bounded. Recall that $u_k$ and $v_k$ are increasing in $r$ and both are negative. Thus $|u_k(r)|$ and $|v_k(r)|$ are uniformly bounded in $[r_0, \infty)$ for any $r_0 > 0$. Without loss of generality, we may assume $U_k(r)$, $V_k(r)$ converge to $U(r)$, $V(r)$ in $C^2([0, R])$ for all $R > 0$, and $(u(r), v(r))$, which is defined in Lemma 3.1, satisfies (1.15) with $(N_1^*, N_2^0)$ and $u'(r), v'(r) > 0$. By (3.2) and Fatou's Lemma,
\[ \int_0^\infty [(1 - e^{u}) + (1 - e^{v})]\,r\,dr \le \liminf_{k\to\infty} \int_0^\infty [(1 - e^{u_k}) + (1 - e^{v_k})]\,r\,dr \le C_0, \]
which implies $\lim_{r\to+\infty} u(r) = \lim_{r\to+\infty} v(r) = 0$, that is, $(u, v)$ is a topological solution. This completes the proof.

Now we go back to the proof of uniqueness. By Lemma 3.1, $(u_k, v_k)$ and $(u_k^*, v_k^*)$ converge to $(u, v)$, due to the fact that (1.15) has only one topological solution at $(N_1^*, N_2^0)$. W.l.o.g., we can assume $|(u_k - u_k^*)(x_k)| = \|u_k - u_k^*\|_{L^\infty} \ge \|v_k - v_k^*\|_{L^\infty}$ for all $k$. Set
\[ A_k = \frac{u_k - u_k^*}{\|u_k - u_k^*\|_{L^\infty}} \quad\text{and}\quad B_k = \frac{v_k - v_k^*}{\|u_k - u_k^*\|_{L^\infty}}. \]
Then $A_k$, $B_k$ satisfy
\[
\begin{cases}
\Delta A_k + e^{\eta_k(x)}(1 - e^{u_k})B_k - e^{\xi_k(x)+v_k^*}A_k = 0,\\
\Delta B_k + e^{\xi_k(x)}(1 - e^{v_k})A_k - e^{\eta_k(x)+u_k^*}B_k = 0,
\end{cases}
\]
where $\xi_k(x)$ lies between $u_k(x)$ and $u_k^*(x)$, and $\eta_k(x)$ lies between $v_k(x)$ and $v_k^*(x)$. Since for any fixed $k$, $u_k(x) \to 0$, $v_k(x) \to 0$ as $x \to \infty$, we can apply the same argument as in (3.13) and Lemma 3.2 to obtain that the maximum points $x_k$ are bounded. Thus $A_k$ and $B_k$ converge to $A$ and $B$ in $C^2(\mathbb{R}^2)$ respectively, where $(A, B)$ satisfies
\[
\begin{cases}
\Delta A + e^{v}(1 - e^{u})B - e^{u+v}A = 0,\\
\Delta B + e^{u}(1 - e^{v})A - e^{u+v}B = 0.
\end{cases}
\]
Since $A$ and $B$ are bounded and not all zero in $\mathbb{R}^2$, by Lemma 2.4, we have $A \equiv 0$ and $B \equiv 0$, a contradiction. This completes the proof of Theorem 1.1.


Now we need to show claim (∗). To show it, we define a background function pair $(u_0, v_0)$ by
\[ u_0(x) = N_1 \ln\frac{|x|^2}{1 + |x|^2} \quad\text{and}\quad v_0(x) = N_2 \ln\frac{|x|^2}{1 + |x|^2}. \tag{3.3} \]
Let $(\hat u + u_0, \hat v + v_0)$ be a topological solution of (1.7). Then $(\hat u, \hat v)$ satisfies
\[
\begin{cases}
\Delta\hat u + e^{v_0+\hat v}(1 - e^{u_0+\hat u}) - \dfrac{4N_1}{(1 + |x|^2)^2} = 0 & \text{in } \mathbb{R}^2,\\[2mm]
\Delta\hat v + e^{u_0+\hat u}(1 - e^{v_0+\hat v}) - \dfrac{4N_2}{(1 + |x|^2)^2} = 0 & \text{in } \mathbb{R}^2,\\[2mm]
\hat u(x) \to 0 \text{ and } \hat v(x) \to 0 & \text{as } |x| \to \infty.
\end{cases}
\tag{3.4}
\]
To prove our claim (∗), we have to prove that the linearized equation is an invertible operator from $W_r^{2,2}(\mathbb{R}^2) \times W_r^{2,2}(\mathbb{R}^2)$ to $L^2(\mathbb{R}^2)$, i.e., Eq. (3.5) below is uniquely solvable in $W_r^{2,2}(\mathbb{R}^2) \times W_r^{2,2}(\mathbb{R}^2)$ for any pair $(f, g) \in L^2$:
\[
\begin{cases}
\Delta A + e^{v}(1 - e^{u})B - e^{u+v}A = f,\\
\Delta B + e^{u}(1 - e^{v})A - e^{u+v}B = g,
\end{cases}
\quad\text{in } \mathbb{R}^2. \tag{3.5}
\]
For any pair $(f, g)$, by Lemma 2.4 there is at most one solution $(A, B) \in W_r^{2,2}(\mathbb{R}^2) \times W_r^{2,2}(\mathbb{R}^2)$. Hence, it suffices for us to show the existence of solutions. Since $\mathbb{R}^2$ is an unbounded domain, the existence cannot follow directly from the uniqueness of the solution of (3.5), i.e., the Fredholm alternative theorem might not always hold. However, for any $R > 0$, the equation
\[
\begin{cases}
\Delta A_k + e^{v}(1 - e^{u})B_k - e^{u+v}A_k = f,\\
\Delta B_k + e^{u}(1 - e^{v})A_k - e^{u+v}B_k = g,
\end{cases}
\ \text{in } B_R(O), \qquad A_k = B_k = 0 \ \text{on } \partial B_R(O),
\tag{3.6}
\]
has a solution, i.e., the Fredholm alternative theorem is true for each $R > 0$. Then by letting $R = R_n \to +\infty$, we want to prove that $(A_n, B_n) = (A_{R_n}, B_{R_n})$ has a convergent subsequence in $W_r^{2,2}(\mathbb{R}^2) \times W_r^{2,2}(\mathbb{R}^2)$.

Lemma 3.2. $(A_n, B_n)$ has a convergent subsequence in $W_r^{2,2}(\mathbb{R}^2)$.

Proof. By Sobolev's embedding theorem, $(A_n, B_n)$ is a locally Hölder function. We want to show that
\[ \|A_n\|_{L^\infty(B_{R_n})} + \|B_n\|_{L^\infty(B_{R_n})} \le C\{\|f\|_{L^2(\mathbb{R}^2)} + \|g\|_{L^2(\mathbb{R}^2)}\} \tag{3.7} \]
for some constant $C$ independent of $n$. Suppose (3.7) does not hold. Without loss of generality, one may assume
\[ \|A_n\|_{L^\infty} = \max(\|A_n\|_{L^\infty}, \|B_n\|_{L^\infty}) \quad\text{and}\quad \|A_n\|_{L^\infty} \to +\infty \text{ as } n \to \infty. \tag{3.8} \]
Let $x_n \in \mathbb{R}^2$ be such that $|A_n(x_n)| = \|A_n\|_{L^\infty}$. First, we claim
\[ x_n \text{ is bounded.} \tag{3.9} \]


To prove our claim, Eq. (3.4) can be rewritten as
\[ \Delta A_n - A_n = f + e^{v}(e^{u} - 1)B_n + (e^{u+v} - 1)A_n, \tag{3.10} \]
and let $K_R(x, y)$ be the fundamental solution of $\Delta - I$ with zero boundary value on $\partial B_R(O)$. It is easy to see
\[ 0 < K_R(x, y) \le C\,\frac{e^{-|x-y|}}{\sqrt{|x - y|}} \quad\text{for } |x - y| \ge 1. \tag{3.11} \]
Then
\[ A_n(x_n) = \int_{B_R} K_R(x_n, y)\,[e^{v}(e^{u} - 1)B_n(y) + (e^{u+v} - 1)A_n(y) + f(y)]\,dy. \tag{3.12} \]
Since $v(y) \to 0$ and $u(y) \to 0$ as $y \to +\infty$, by (3.12) we have
\[ |A_n(x_n)| \le o(1)(\|A_n\|_{L^\infty} + \|B_n\|_{L^\infty}) + C \cdot \|f\|_{L^2}, \tag{3.13} \]
which yields a contradiction. Thus, $x_n$ is bounded.

By letting $\hat A_n = \frac{A_n}{\|A_n\|_{L^\infty}}$ and $\hat B_n = \frac{B_n}{\|A_n\|_{L^\infty}}$, a subsequence of $(\hat A_n, \hat B_n)$ will converge to $(\hat A, \hat B)$, and $(\hat A, \hat B)$ satisfies
\[
\begin{cases}
\Delta\hat A + e^{v}(1 - e^{u})\hat B - e^{u+v}\hat A = 0,\\
\Delta\hat B + e^{u}(1 - e^{v})\hat A - e^{u+v}\hat B = 0,
\end{cases}
\]
and $\hat A \not\equiv 0$. Since $\hat A$ and $\hat B$ are bounded, by Lemma 2.4, we have $\hat A \equiv 0$ and $\hat B \equiv 0$, which yields a contradiction. Thus (3.7) is established. By (3.7), a subsequence of $(A_n, B_n)$ will converge to $(A, B)$, and $(A, B)$ satisfies (3.5). Let Eq. (3.4) be rewritten as (3.6). Since $A$ and $B$ are bounded, by (3.5) and a standard argument, we can show that $A(r)$ and $B(r)$ are in $L^2(\mathbb{R}^2)$. This proves that the linearized equation is 1-1 and onto from $W_r^{2,2}(\mathbb{R}^2) \times W_r^{2,2}(\mathbb{R}^2)$ to $L^2(\mathbb{R}^2)$. Then by the open mapping theorem of (linear) functional analysis, we know that the inverse operator of the linearized equation is bounded from $L^2(\mathbb{R}^2)$ to $W_r^{2,2}(\mathbb{R}^2) \times W_r^{2,2}(\mathbb{R}^2)$. By applying the Implicit Function Theorem, our claim (∗) is proved, and thus the proof of Theorem 1.1 is finished.

4. Asymptotic Behaviors of All Entire Solutions

In this section, we give the asymptotic behaviors of all entire radial solutions for (1.7) as follows. Here solutions are not necessarily negative in $\mathbb{R}^2$.

Proposition 4.1. Let $(u, v)$ be an entire solution of (1.13)–(1.14). Then $(u, v)$ satisfies one of the following behaviors:
(A) $\lim_{r\to\infty} u(r) = 0$ and $\lim_{r\to\infty} v(r) = 0$;
(B) $\lim_{r\to\infty} u(r) = -\infty$ and $\lim_{r\to\infty} v(r) = -\infty$;
(C) $\lim_{r\to\infty} \bigl(u(r), \frac{v(r)}{r^2}\bigr) = \bigl(-c_u, -\frac{e^{-c_u}}{4}\bigr)$ for some $c_u > 0$;
(D) $\lim_{r\to\infty} \bigl(\frac{u(r)}{r^2}, v(r)\bigr) = \bigl(-\frac{e^{-c_v}}{4}, -c_v\bigr)$ for some $c_v > 0$;


(E) $\lim_{r\to\infty} u(r) = \infty$, $\lim_{r\to\infty} v(r) = -\infty$;
(F) $\lim_{r\to\infty} u(r) = -\infty$, $\lim_{r\to\infty} v(r) = \infty$.

In order to prove Proposition 4.1, we need the following lemmas.

Lemma 4.1. Let $(u, v)$ be an entire solution of (1.13)–(1.14). Then the following statements hold.
(i) If $u(r_1) \ge 0$ for some $r_1 > 0$, then $u'(r) > 0$ for all $r > 0$, $\lim_{r\to\infty} u(r) = \infty$ and $\lim_{r\to\infty} v(r) = -\infty$.
(ii) If $u'(r_1) \le 0$ for some $r_1 > 0$, then $u(r_1) < 0$ and $\lim_{r\to\infty} u(r) = -\infty$.

Proof. (i) We shall use the maximum principle to prove the first part of (i). Suppose $u'(r_2) \le 0$ for some $r_2 > 0$. Then, since $u(r) \to -\infty$ as $r \to 0^+$ and $u(r_1) \ge 0$, we obtain that $u$ either has a local minimum $u(r_3) < 0$ or a local maximum $u(r_4) > 0$ for some $r_3$ and $r_4$ depending on $r_1$ and $r_2$. From (1.13) we get
\[ 0 = \Delta u(r_3) + e^{v(r_3)}(1 - e^{u(r_3)}) > 0 \quad\text{or}\quad 0 = \Delta u(r_4) + e^{v(r_4)}(1 - e^{u(r_4)}) < 0. \]
This contradiction shows $u'(r) > 0$ for all $r \in (0, \infty)$. Furthermore, since $(r u'(r))' = r e^{v(r)}(e^{u(r)} - 1) > 0$ for all $r > r_1$, we have $r u'(r) > r_1 u'(r_1) > 0$ for all $r > r_1$ and thus $\lim_{r\to\infty} u(r) = \infty$. This proves the first part of (i).

If $\lim_{r\to\infty} v(r) = -\infty$ is not true, then there exist $r_2 > 0$ and a constant $C_1$ such that $v(r) > C_1$ for all $r > r_2$ and
\[ \Delta u(r) \ge e^{C_1}(e^{u(r)} - 1) \ge C e^{u(r)} \quad \forall r > r_3, \tag{4.1} \]
for some constants $C > 0$ and $r_3 > r_2$. From (4.1) we easily obtain that $u$ must blow up at a finite $r$. This contradiction shows that $\lim_{r\to\infty} v(r) = -\infty$ and (i) holds.

(ii) If $u'(r_1) \le 0$ for some $r_1 > 0$, then by (i) we have $u(r_1) < 0$. From this and (1.13), we obtain $r u'(r) < r_1 u'(r_1) \le 0$ for all $r > r_1$. Hence we get $\lim_{r\to\infty} u(r) = -\infty$. This completes the proof.

Lemma 4.2. Let $(u, v)$ be an entire solution of (1.13)–(1.14). If $\lim_{r\to\infty} u(r) = -c_u$ exists and $v'(r_1) \le 0$ for some $r_1 > 0$, then $\lim_{r\to\infty} r u'(r) = 0$, $\lim_{r\to\infty} \frac{v(r)}{r^2} = -\frac{e^{-c_u}}{4}$ and $c_u > 0$.

Proof. Since $v'(r_1) \le 0$, by (ii) of Lemma 4.1 in the case of $v$, we have $\lim_{r\to\infty} v(r) = -\infty$ and
\[
\begin{aligned}
r v'(r) &= r_1 v'(r_1) - \int_{r_1}^r s\,e^{u(s)}(1 - e^{v(s)})\,ds && \forall r > r_1\\
&< -C\int_{r_1}^r s\,ds && \forall r > r_2 > r_1 \text{ and for some constant } C > 0\\
&\to -\infty \quad\text{as } r \to \infty.
\end{aligned}
\]


Then, combining the above inequality and (1.13), we easily obtain
\[ \lim_{r\to\infty} \frac{v(r)}{r^2} = \lim_{r\to\infty} \frac{r v'(r)}{2r^2} = \lim_{r\to\infty} \frac{r e^{u}(e^{v} - 1)}{4r} = -\frac{e^{-c_u}}{4}. \]
This proves the second result of this lemma.

Since $\lim_{r\to\infty} u(r) = -c_u$ exists, by Lemma 4.1, we easily obtain that $u(r) < 0$, $u'(r) > 0$ for all $r \in (0, \infty)$ and thus $-c_u \le 0$. Then, by $(r u'(r))' = -r e^{v(r)}(1 - e^{u(r)}) < 0$ for all $r > 0$, it follows that $\lim_{r\to\infty} r u'(r) = 0$. Suppose $c_u = 0$. We claim the following two statements:

(a) $\lim_{r\to\infty} \dfrac{u(r)}{e^{v(r)}} = 0$;
(b) $\dfrac{u'(r)}{u(r)} \ge v'(r)$ for all $r \ge r_0$, for some sufficiently large $r_0$.

Since $\lim_{r\to\infty} u(r) = 0$ and $\lim_{r\to\infty} e^{v(r)} = 0$, by (1.13) we have
\[
\lim_{r\to\infty} \frac{u(r)}{e^{v(r)}} = \lim_{r\to\infty} \frac{r u'(r)}{r e^{v(r)} v'(r)} = \lim_{r\to\infty} \frac{r e^{v(r)}(e^{u(r)} - 1)}{r e^{v(r)}(v'(r))^2 + r e^{u(r)+v(r)}(e^{v(r)} - 1)} = \lim_{r\to\infty} \frac{e^{u} - 1}{(v')^2 + e^{u}(e^{v} - 1)} = 0.
\]
This proves claim (a). In order to prove claim (b), we first show

(i) $\lim_{r\to\infty} \dfrac{e^{u(r)} - 1}{u(r)} = 1$; (ii) $\lim_{r\to\infty} \dfrac{u(r)}{r u'(r)} = 0$; (iii) $\lim_{r\to\infty} \dfrac{u'(r)}{u(r)} = 0$.

Since $\lim_{r\to\infty} u(r) = 0$, we obtain $\lim_{r\to\infty} \frac{e^{u} - 1}{u} = \lim_{r\to\infty} \frac{e^{u(r)} u'(r)}{u'(r)} = \lim_{r\to\infty} e^{u(r)} = 1$, which proves (i). In addition, combining the facts $\lim_{r\to\infty} r u'(r) = 0$ and $\lim_{r\to\infty} r^2 e^{v} = 0$ with (1.13), we have
\[
\lim_{r\to\infty} \frac{u(r)}{r u'(r)} = \lim_{r\to\infty} \frac{r u'(r)}{r^2 e^{v}(e^{u} - 1)} = \lim_{r\to\infty} \frac{r e^{v}(e^{u} - 1)}{2r e^{v}(e^{u} - 1) + r^2 e^{v}(e^{u} - 1)v' + r^2 e^{u+v}u'} = \lim_{r\to\infty} \frac{1}{2 + r v' + \frac{r u' e^{u}}{e^{u} - 1}}. \tag{4.2}
\]
Since $\lim_{r\to\infty} r v'(r) = -\infty$ and $\frac{r u' e^{u}}{e^{u} - 1} < 0$ for all $r > 0$, by (4.2) we obtain (ii). Using the assertions (i) and (ii), we get
\[
\begin{aligned}
\lim_{r\to\infty} \frac{u'(r)}{u(r)} &= \lim_{r\to\infty} \frac{u''(r)}{u'(r)} = \lim_{r\to\infty} \frac{-u'(r) - r e^{v(r)}(1 - e^{u(r)})}{r u'(r)}\\
&= \lim_{r\to\infty} \left[\frac{r u(r) e^{v(r)}}{r u'(r)} \cdot \frac{e^{u(r)} - 1}{u(r)}\right] = \lim_{r\to\infty} \frac{r u(r) e^{v(r)}}{r u'(r)} \quad (\text{by (i)})\\
&= \lim_{r\to\infty} \left[\frac{u(r)}{r u'(r)} \cdot r e^{v(r)}\right] = 0 \quad (\text{by (ii) and } \lim_{r\to\infty} r e^{v(r)} = 0).
\end{aligned}
\]
This proves (iii).


Applying the results (i)–(iii) and (1.13), we obtain
\[
\begin{aligned}
\lim_{r\to\infty} \frac{r u'(r)/u(r)}{r v'(r)} &= \lim_{r\to\infty} \frac{u(u'(r) + r u''(r)) - r(u'(r))^2}{u^2(r)} \cdot \frac{1}{r e^{u(r)}(e^{v(r)} - 1)}\\
&= \lim_{r\to\infty} \frac{-r u e^{v}(1 - e^{u}) - r(u')^2}{r u^2 e^{u}(e^{v} - 1)} \quad (\text{by (1.13)})\\
&= \lim_{r\to\infty} \frac{e^{v}(1 - e^{u})}{u} + \lim_{r\to\infty} \Bigl(\frac{u'}{u}\Bigr)^2 = 0 \quad (\text{by (i) and (iii)}).
\end{aligned}
\tag{4.3}
\]
Since $\frac{r u'(r)}{u(r)} < 0$ and $r v'(r) < 0$ for all $r > 0$, by (4.3) we finally have $\frac{r u'(r)}{u(r)} \ge r v'(r)$ for sufficiently large $r$. Hence we prove claim (b). From (b), we easily obtain that
\[ [\ln(-u(r)) - v(r)]' \ge 0 \quad \forall r \ge r_0. \tag{4.4} \]
Integrating both sides of (4.4) from $r_0$ to $r$, we deduce $\frac{u(r)}{e^{v(r)}} \le -e^{C_0} < 0$ for all $r \ge r_0$, where $C_0 = \ln(-u(r_0)) - v(r_0)$. This contradicts (a). Therefore $c_u > 0$ and we complete the proof.

Now we are in a position to prove Proposition 4.1.

Proof of Proposition 4.1. We divide the proof into the following cases.

Case 1. $u(r_1) \ge 0$ (resp. $v(r_1) \ge 0$) for some $r_1 > 0$: Then, by (i) of Lemma 4.1, we obtain that $\lim_{r\to\infty} u(r) = \infty$ and $\lim_{r\to\infty} v(r) = -\infty$ (resp. $\lim_{r\to\infty} v(r) = \infty$ and $\lim_{r\to\infty} u(r) = -\infty$). Hence (E) (resp. (F)) happens in this case.

Case 2. $u(r) < 0$ and $v(r) < 0$ for all $r \in (0, \infty)$:

(i) If $u'(r_1) \le 0$ and $v'(r_2) \le 0$ for some $r_1 > 0$ and $r_2 > 0$, then by (ii) of Lemma 4.1, we have $\lim_{r\to\infty} u(r) = \lim_{r\to\infty} v(r) = -\infty$. This proves that (B) holds in this case.

(ii) If $u'(r) > 0$ and $v'(r) > 0$ for all $r \in (0, \infty)$, then $\lim_{r\to\infty} u(r) = -C_1 \le 0$ and $\lim_{r\to\infty} v(r) = -C_2 \le 0$ both exist. If $C_1 > 0$ then, by (1.13)–(1.14), we have
\[
\begin{aligned}
r u'(r) &= 2N_1 - \int_0^r s\,e^{v(s)}(1 - e^{u(s)})\,ds && \forall r > 0\\
&< 2N_1 - (1 - e^{-C_1})\int_0^r s^{1+2N_2}\,ds && \forall r > 0\\
&< -C < 0 && \text{for } r \text{ large},
\end{aligned}
\]
which implies $u'(r) < 0$ for large $r$. This contradiction shows $C_1 = 0$. Similarly, $C_2 = 0$ as well. Thus (A) occurs in this case.

(iii) If $u'(r) > 0$ (resp. $v'(r) > 0$) for all $r \in (0, \infty)$ and $v'(r_1) \le 0$ (resp. $u'(r_1) \le 0$) for some $r_1 > 0$, then $\lim_{r\to\infty} u(r) = -c_u$ (resp. $\lim_{r\to\infty} v(r) = -c_v$) exists. By Lemma 4.2, we obtain $\lim_{r\to\infty} \frac{v(r)}{r^2} = -\frac{e^{-c_u}}{4}$ (resp. $\lim_{r\to\infty} \frac{u(r)}{r^2} = -\frac{e^{-c_v}}{4}$) and $c_u > 0$ (resp. $c_v > 0$). This shows that (C) (resp. (D)) happens in this case.

According to Case 1 and Case 2, we complete the proof.


5. The Structure of All Entire Solutions In this section, we will study the structures of all radial entire solutions for (1.7). Applying this classification, we give the proof of Theorems 1.2 and 1.3. Let the respective set of initial data according to the behaviors of solutions be depicted beneath the statement of Theorem 1.2 in Sect. 1. First, we derive the structure property of in the following. Proposition 5.1. is an open subset of R2 and the following statements are valid. (A) If (α1 , α21 ), (α1 , α22 ) ∈ with α21 < α22 , then (α1 , α2 ) ∈ ∀α21 < α2 < α22 . Similarly, if (α11 , α2 ), (α12 , α2 ) ∈ with α11 < α12 , then (α1 , α2 ) ∈ ∀α11 < α1 < α12 . (B) There exists (α˜ 1 , α˜ 2 ) ∈ R2 such that (α1 , α2 ) ∈ ∀α1 ≥ α˜ 1 or ∀α2 ≥ α˜ 2 . (C) is a simply connected and unbounded region such that

= N T

v , u

∂ = Su T Sv and S u S v = T. In particular, both Su and Sv are nonempty. Proof. We divide the proof into the following steps. ¯ > 0 such that u (r0 , α) ¯ <0 Step 1. Let α¯ ∈ be any point. Then there exists r0 = r0 (α) and v (r0 , α) ¯ < 0. By the continuity of (u , v ) w.r.t. α, there exists δ > 0 such that ¯ u (r0 , α) < 0 and v (r0 , α) < 0 ∀α ∈ Bδ (α).

(5.1)

By (ii) of Lemma 4.1, we obtain u(r, α) → −∞ and v(r, α) → −∞ as r → ∞ ∀α ∈ ¯ This proves Bδ (α) ¯ ⊂ and thus is open. Bδ (α). Step 2. By the monotone property of (φi , ψi ), i = 1.2, Lemma 2.3, we easily obtain (A). Step 3. We prove (B) by scaling arguments and monotone property. Choose d > 0 such that N2 N2 + 1

uˆs (r ) = U (e vˆs (r ) = V (e

where

− 2(N(1+d)s +N +1) 1

2

− 2(N(1+d)s +N +1) 1

2

r, s, ds) − s, r, s, ds) − ds,

(5.2)

(5.3)

U (t, α1 , α2 ) = u(t, α1 , α2 ) − 2N1 ln t, V (t, α1 , α2 ) = v(t, α1 , α2 ) − 2N2 ln t.

Then (uˆs , vˆs ) satisfies ⎧ ⎨ uˆs (r ) + e−d1 s r 2N2 evˆs = r 2(N1 +N2 ) euˆs +vˆs , vˆ (r ) + e−d2 s r 2N1 euˆs = r 2(N1 +N2 ) euˆs +vˆs , ⎩ s uˆs (0) = 0, uˆs (0) = 0, vˆs (0) = 0, vˆs (0) = 0,

(5.4)

d(N1 +1)−N2 N1 where d1 = NN21+1−d +N2 +1 and d2 = N1 +N2 +1 . By (5.2), we easily have d1 > 0 and d2 > 0. Now suppose there exists a sequence (s j , ds j ) ∈ with (s j , ds j ) → (∞, ∞).


Set (uˆ j , vˆ j ) = (uˆ s j , vˆs j ). Then by euˆ j ≤ 1, evˆ j ≤ 1 and euˆ j +vˆ j ≤ 1, we have that for all R > 0, |uˆ j (r )|, |vˆ j (r )| ≤ M on [0, R] for some M = M(R) > 0. Then ¯ |uˆ j (r )|, |vˆ j (r )| ≤ M¯ on [0, R] for some M¯ = M(R) > 0. From elliptic estimates, we have (uˆ j , vˆ j ) → (u, ˆ v) ˆ (passing subsequence if necessary) in C 2 ([0, R]) for any R > 0 and (u, ˆ v) ˆ which satisfies ⎧ ˆ vˆ ˆ ) = r 2(N1 +N2 ) eu+ ⎨ u(r ˆ vˆ (5.5) v(r ˆ ) = r 2(N1 +N2 ) eu+ ⎩ u(0) ˆ = 0, uˆ (0) = 0, v(0) ˆ = 0, vˆ (0) = 0. Since U (t, s, ds) and V (t, s, ds) are both decreasing in t, we have uˆ and vˆ are nonincreasing in r . But, any solution pair of (5.5) must be increasing in r . Actually, both uˆ or vˆ must blow up at finite r . This contradiction shows that there exists s0 > 0 such that u(r, s, ds) or v(r, s, ds) blows up ∀s > s0 , and hence, by Lemma 2.3, so does u(r, α1 , α2 ) or v(r, α1 , α2 ) for any α1 ≥ s0 ≡ α˜ 1 or α2 ≥ ds0 ≡ α˜ 2 . This proves (B). Step 4. In this step, we prove the result of (C). For this purpose we claim the following statements. (a) Let α ∈ and (u(r ), v(r )) = (u(r, α), v(r, α)). Then the corresponding (β1 , β2 ) satisfies either < ∞ or β2 < ∞. β1 (b) = N T

v . u (c) ∂ ⊂ Su T Sv . (d) If α = (α1 , α2 ) ∈ ∂ and (α1 − , α2 ) ∈ for some > 0, then α ∈ Su . (e) If α = (α1 , α2 ) ∈ ∂ and (α1 , α2 − ) ∈ for some

> 0, then α ∈ Sv . (f) Let 1 = ∂ Su and 2 = ∂ Sv . Then 1 2 = T . (g) is a simply connected and unbounded region. (h) If α = (α1 , α2 ) ∈ Su , then (α1 + , α2 ) ∈ Su ∀ > 0. (i) If α = (α1 , α2 ) ∈ Sv , then (α1 , α2 + ) ∈ Sv ∀ > 0. (a) Suppose the result is not true. Then β1 = ∞, β2 = ∞ and lim r u (r ) = −∞. r →∞

Hence, for each M > 2 there exists r M > 0 such that r u (r ) < −M ∀r > r M . Furthermore we get eu(r ) < C · r −M for large r and

∞

∞ u v β2 = r e (1 − e )dr ≤ r eu dr < ∞. 0

0

This contradiction proves (a). (b) By (a) and the of u and v , we easily have (b). definitions (c) Let E = Su T Sv , α ∈ ∂ and (u, v) be the respective solution. Then, by the Hopf lemma we have u(r ) < 0, v(r ) < 0 ∀r > 0. By Proposition 4.1, we have ∂ ⊂ E. This proves (c). (d) If α = (α1 , α2 ) ∈ ∂ and α = (α1 − , α2 ) ∈ for some > 0, then, by Lemma 2.3, v(r, α ) → −∞ and v(r, α) < v(r, α ) ∀r > 0. This proves α ∈ Su . (e) The proof is similar to (d). (f) From (c)-(e), 1 = ∅ and 2 = ∅, then by the continuity w.r.t. initial data, we obtain that 1 ∩ Sv = ∅ and 2 ∩ Su = ∅. Therefore, (f) is proved.


(g) Suppose is not connected. Then there exist two disjoint open sets O1 and O2 satisfying O1 O2 = . From (d)–(f), O1 and O2 possess one initial data of topological solutions at least, respectively, and by (A)–(B), there exist two distinct initial data of topological solutions T1 and T2 such that T1 ∈ ∂ O1 and T2 ∈ ∂ O2 . This contradicts the uniqueness of the topological solution. Hence is connected. By using the similar arguments, we get is unbounded. From (A), we obtain that the set does not have a hole, that is, is a simply connected set. This shows (g). (h) First we show that if α = (α1 , α2 ) ∈ Su , then ∀ > 0 we have α = (α1 + , α2 ) and u(r0 , α ) ≥ 0 for some r0 = r0 () > 0, that is, α ∈ / Su ∀ > 0 by Lemma 4.1. Suppose there exists 0 > 0 such that u(r, α0 ) < 0 ∀r > 0. By Lemma 2.3, we have u(r, α) < u(r, α ) < u(r, α0 ) < 0 ∀r ∈ (0, ∞). v(r, α0 ) < v(r, α ) < v(r, α) < 0

(5.6)

From (5.6) and α ∈ Su , we obtain u(r, α0 ) is bounded below for large r . Then, by (ii) of Lemma 4.1, we have u (r, α0 ) > 0 ∀r > 0, and hence lim u(r, α0 ) = −cu ≤ 0. r →∞

Furthermore, by Lemma 4.2, we get lim r u (r, α) = lim r u (r, α0 ) = 0. Combining r →∞ r →∞ these facts, we attain β1 (α) = 2N1 = β1 (α0 ). But, from (5.6) we have β1 (α) > β1 (α0 ). This contradiction proves our assertion and thus (h) holds. (i) The proof is similar to (h). By above claims (b)–(i) and the existence of the topological solution, we finally obtain (C) and the proof is complete. In the following, we let u and v be defined in Sect. 1. Then the corresponding (β1 , β2 ) of each solution can be classified as follows. Proposition 5.2. Let (u, v) be an entire solution of (1.13)–(1.14) and the corresponding (β1 , β2 ) be defined in (1.20). Then the following statements are valid: (a) β1 is continuous w.r.t α1 and α2 for all (α1 , α2 ) ∈ N T ∪ u . Similarly, β2 is continuous w.r.t α1 and α2 for all (α1 , α2 ) ∈ N T ∪ v . (b) If (u, v) is a topological solution, then (β1 , β2 ) = (2N1 , 2N2 ). (c) If (u, v) is a non-topological solution, then the respective (β1 , β2 ) satisfies (β1 − 2(N1 + 1))(β2 − 2(N2 + 1)) > 4(N1 + 1)(N2 + 1). (d) For any α ∈ Su (resp. α ∈ Sv ), (u, v) is a Type (IV) solution with (β1 , β2 ) = (2N1 , ∞) (resp. (β1 , β2 ) = (∞, 2N2 )). (e) For any α ∈ u (resp. α ∈ v ), (u, v) is a Type (III) solution with 2N1 < β1 ≤ 2N1 + 2 and β2 = ∞ (resp. β1 = ∞ and 2N2 < β2 ≤ 2N2 + 2). Proof. (a) We prove the case of β1 . The case of β2 is similar. Let D = N T u , α0 = (α10 , α20 ) ∈ D. First, we want to show that D = N T u is open. Because of β1 (α0 ) < ∞, we obtain that lim r v (r ; α0 ) < −2 − ε

r →∞

for some ε > 0, by continuity and (1.13), there exist r0 , δ > 0 such that for all |α −α0 | < δ, ε r v (r ; α) < −2 − on [r0 , ∞), 2

Uniqueness and Structure of Solutions for the Chern-Simons System


which imply β1 (α) < ∞, that is, α ∈ D. Thus the set D is open. Now if δn > 0 and δn → 0 as n → ∞ such that α1n = α10 +δn , (α1n , α20 ) ∈ D and β1,n = β1 (α1n , α20 ) ∀n. We want to prove β1n → β1 (α0 ) as n → ∞. By monotone property, Lemma 2.3, and the continuity of (u, v) w.r.t. the initial data, we obtain that u(r, α1n , α20 ) u(r, α0 ) pointwise in r as n → ∞. (5.7) v(r, α1n , α20 ) v(r, α0 ) By (5.7), the definition of β1 and the monotone convergence theorem, we obtain β1n → β1 (α0 ) as n → ∞. The case of δn < 0 is similar. So β1 is continuous w.r.t. α1 . By using the same arguments, we get β1 is continuous w.r.t. α2 . This proves (a). (b) By (i) of Lemma 2.2 we obtain the result. (c) Since (u, v) is a non-topological solution of (1.13), the respective β1 < ∞ and β2 < ∞. Hence there exists a sequence {r j } such that r j → ∞ and r 2j eu(r j ) (1 − ev(r j ) ) → 0, r 2j ev(r j ) (1 − eu(r j ) ) → 0 as j → ∞. By the Pohozaev identity, Lemma 2.1, we easily obtain

\[
r u'(r)\cdot r v'(r)
 - 2\int_0^r e^{v(s)}\bigl(1-e^{u(s)}\bigr)s\,ds
 - 2\int_0^r e^{u(s)}\bigl(1-e^{v(s)}\bigr)s\,ds
 + r^2 e^{u(r)}\bigl(1-e^{v(r)}\bigr) + r^2 e^{v(r)}\bigl(1-e^{u(r)}\bigr)
 = 4N_1N_2 + 6\int_0^r s\,e^{u(s)+v(s)}\,ds \quad \forall r\in(0,\infty). \tag{5.8}
\]

Taking $r = r_j$ on both sides of (5.8) and then letting $j \to \infty$, we have

\[
(2N_1-\beta_1)(2N_2-\beta_2) - 2\beta_1 - 2\beta_2 = 4N_1N_2 + 6\int_0^\infty r\,e^{u(r)+v(r)}\,dr,
\]

which implies

\[
(\beta_1 - 2(N_1+1))(\beta_2 - 2(N_2+1)) = 4(N_1+1)(N_2+1) + 6\int_0^\infty r\,e^{u(r)+v(r)}\,dr.
\]
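The passage from the limit identity to the product form is elementary algebra. A short check, writing $I$ for $\int_0^\infty r\,e^{u+v}\,dr$ (a shorthand introduced only for this sketch):

```latex
% I := \int_0^\infty r e^{u(r)+v(r)}\,dr   (shorthand used only in this check)
\begin{align*}
(2N_1-\beta_1)(2N_2-\beta_2) - 2\beta_1 - 2\beta_2
 &= 4N_1N_2 + \beta_1\beta_2 - 2(N_2+1)\beta_1 - 2(N_1+1)\beta_2, \\
(\beta_1-2(N_1+1))(\beta_2-2(N_2+1))
 &= \beta_1\beta_2 - 2(N_2+1)\beta_1 - 2(N_1+1)\beta_2 + 4(N_1+1)(N_2+1).
\end{align*}
% The limit identity says the first line equals 4N_1N_2 + 6I, i.e.
%   \beta_1\beta_2 - 2(N_2+1)\beta_1 - 2(N_1+1)\beta_2 = 6I.
% Substituting into the second line gives 4(N_1+1)(N_2+1) + 6I.
```

Since the integrand $r\,e^{u+v}$ is positive, $I > 0$, which is what turns the equality into the strict inequality stated in (c).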

This proves (c). (d) We prove the case of Su . The proof for Sv is similar. Let α ∈ Su and (u(r ), v(r )) = (u(r, α), v(r, α)). By Lemma 4.2 we have lim r u (r ) = 0 = 2N1 − β1 (α). This r →∞ proves (d). (e) We prove the case of u . The proof for v is similar. Let α ∈ u and (u(r ), v(r )) = (u(r, α), v(r, α)). By (1.21), (1.22) and the definition of u , we have lim r u (r ; α) < 0 r →∞

and lim r u (r ) = 2N1 − β1 (α). Hence β1 (α) > 2N1 . Now we claim β1 (α) ≤ 2N1 + r →∞ 2 ∀α ∈ u . Suppose not, then there exists α ∈ u such that β1 (α) > 2N1 + 2 and ∞ β2 (α) = ∞. Then we obtain lim r u (r ) = 2N1 − β1 (α) < −2, and thus 0 r eu dr < r →∞ ∞. From this we deduce

∞

∞ u ∞> r e dr > r eu (1 − ev )dr = β2 (α) = ∞. 0

0

This contradiction proves (e) and the proof is complete.

The following results describe the existence and properties of Type (V) solution.


J.-L. Chern, Z.-Y. Chen, C.-S. Lin

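Proposition 5.3. $W_u$ and $W_v$ are open subsets of $\mathbb{R}^2$. Furthermore, the following statements are valid:

(i) For each $(\theta, \eta) \in S_u$, there exists $\varepsilon > 0$ such that $(\alpha_1, \eta) \in W_u$ $\forall\, \theta < \alpha_1 < \theta + \varepsilon$ and
\[
\begin{cases}
\lim_{r\to\infty} \bigl(u(r) - \lambda \log r\bigr) = c_u, \\[2pt]
\lim_{r\to\infty} \dfrac{v(r)}{r^{2+\lambda}} = -\dfrac{e^{c_u}}{(2+\lambda)^2}, \\[2pt]
\beta_1 = 2N_1 - \lambda, \quad \beta_2 = \infty,
\end{cases} \tag{5.9}
\]
where $c_u$ and $\lambda = \lambda(\alpha_1, \eta) > 0$ are constants.

(ii) For each $(\mu, \nu) \in S_v$, there exists $\delta > 0$ such that $(\mu, \alpha_2) \in W_v$ $\forall\, \nu < \alpha_2 < \nu + \delta$ and
\[
\begin{cases}
\lim_{r\to\infty} \dfrac{u(r)}{r^{2+\gamma}} = -\dfrac{e^{c_v}}{(2+\gamma)^2}, \\[2pt]
\lim_{r\to\infty} \bigl(v(r) - \gamma \log r\bigr) = c_v, \\[2pt]
\beta_1 = \infty, \quad \beta_2 = 2N_2 - \gamma,
\end{cases} \tag{5.10}
\]
where $c_v$ and $\gamma = \gamma(\mu, \alpha_2) > 0$ are constants.

(iii) $W_u$ and $W_v$ are both nonempty.

In order to prove Proposition 5.3, we need the following lemmas.

Lemma 5.1. Suppose $(u(r), v(r))$ is a radial solution satisfying $u(r_0) > 0$, $v(r_0) < 0$ and $v'(r_0) < 0$ (resp. $v(r_0) > 0$, $u(r_0) < 0$ and $u'(r_0) < 0$). Then $(u(r), v(r))$ is an entire solution and $\lim_{r\to\infty} u(r) = \infty$ and $\lim_{r\to\infty} v(r) = -\infty$ (resp. $\lim_{r\to\infty} v(r) = \infty$ and $\lim_{r\to\infty} u(r) = -\infty$).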

Proof. Suppose $(u, v)$ is not an entire solution. Then there exists $R_0 > 0$ such that $u(r) \to \infty$ as $r \to R_0^-$. Then we claim that

(a) $\lim_{r\to R_0^-} r v'(r) = \lim_{r\to R_0^-} v(r) = -\infty$.

(b) $(u + v)$ is bounded above on $[R_1, R_0]$ for some $0 < R_1 < R_0$.

Since $v(r_0) < 0$ and $v'(r_0) < 0$, we easily have $v'(r) < 0$ $\forall r \in [r_0, R_0)$. Then we obtain

\[
r v'(r) = r_0 v'(r_0) + \int_{r_0}^{r} s\,e^{u}(e^{v}-1)\,ds
 \le r_0 v'(r_0) + \bigl(e^{v(r_0)}-1\bigr)\int_{r_0}^{r} s\,e^{u}\,ds \to -\infty,
\]
\begin{align*}
v(r) &= v(r_0) + r_0 v'(r_0)\ln\frac{r}{r_0} + \int_{r_0}^{r} s\ln\frac{r}{s}\,e^{u}(e^{v}-1)\,ds \\
 &\le v(r_0) + r_0 v'(r_0)\ln\frac{r}{r_0} + \bigl(e^{v(r_0)}-1\bigr)\int_{r_0}^{r} s\ln\frac{r}{s}\,e^{v}(e^{u}-1)\,ds \\
 &= v(r_0) + r_0 v'(r_0)\ln\frac{r}{r_0} + \bigl(e^{v(r_0)}-1\bigr)\Bigl(u(r) - u(r_0) - r_0 u'(r_0)\ln\frac{r}{r_0}\Bigr) \to -\infty
\end{align*}
as $r \to R_0^-$, since $\lim_{r\to R_0^-} u(r) = \lim_{r\to R_0^-} r u'(r) = \infty$. This proves (a).
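The logarithmic kernel in the formula for $v(r)$ comes from integrating the radial equation twice. A minimal sketch, assuming the radial form $(r v')' = r\,e^{u}(e^{v}-1)$, which is read off from the first display above rather than quoted from (1.13):

```latex
% Assumed radial equation: (r v')' = r e^{u}(e^{v}-1).
% Integrate once from r_0 to t, divide by t, then integrate from r_0 to r
% and exchange the order of integration.
\begin{align*}
v(r) - v(r_0)
 &= \int_{r_0}^{r} \frac{1}{t}\Bigl[\, r_0 v'(r_0)
      + \int_{r_0}^{t} s\,e^{u}(e^{v}-1)\,ds \Bigr]\,dt \\
 &= r_0 v'(r_0)\ln\frac{r}{r_0}
      + \int_{r_0}^{r} s\,e^{u}(e^{v}-1)\Bigl(\int_{s}^{r}\frac{dt}{t}\Bigr)\,ds \\
 &= r_0 v'(r_0)\ln\frac{r}{r_0}
      + \int_{r_0}^{r} s\,\ln\frac{r}{s}\;e^{u}(e^{v}-1)\,ds .
\end{align*}
```

The same computation with $u$ and $v$ exchanged gives the expression for $u(r) - u(r_0) - r_0 u'(r_0)\ln(r/r_0)$ used in the last step of the estimate.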


By (a), there exists $0 < R_1 < R_0$ such that $2e^{v(r)} - 1 \le -\tfrac12$ $\forall r \in (R_1, R_0)$. By (1.13) we easily have

\[
r\bigl(u'(r) + v'(r)\bigr) = R_1\bigl(u'(R_1) + v'(R_1)\bigr) + \int_{R_1}^{r} s\,e^{u}\bigl(2e^{v} - 1 - e^{v-u}\bigr)\,ds
 \le C - \frac12\int_{R_1}^{r} s\,e^{u}\,ds < 0 \quad \text{for } r \text{ close to } R_0^-.
\]

This proves $(u + v)'(r) < 0$ for $r$ near $R_0^-$ and thus (b) follows. Now by (a)-(b) we deduce

\[
\infty = \lim_{r\to R_0^-} r u'(r) = R_1 u'(R_1) + \int_{R_1}^{R_0} s\,e^{v}(e^{u}-1)\,ds < \infty.
\]
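The finiteness of the last integral is the only place where (b) is used. A short justification, in which the constant $M$ is a name introduced for this sketch:

```latex
% By (b): u + v \le M on [R_1, R_0) for some constant M (M is our notation).
\[
  s\,e^{v(s)}\bigl(e^{u(s)}-1\bigr)
  \;=\; s\,e^{u(s)+v(s)} - s\,e^{v(s)}
  \;\le\; s\,e^{u(s)+v(s)}
  \;\le\; R_0\,e^{M}
  \qquad\text{for } s\in[R_1,R_0),
\]
\[
  \int_{R_1}^{R_0} s\,e^{v}(e^{u}-1)\,ds \;\le\; R_0\,e^{M}\,(R_0-R_1) \;<\; \infty .
\]
```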

This contradiction proves the first result. From Lemma 4.1, we obtain the second result and the proof is complete.

The following lemma describes the asymptotic behavior of Type (V) solutions at infinity.

Lemma 5.2. Let $(u, v)$ be an entire solution of (1.13)–(1.14) on $(0, \infty)$. If $u(r_0) \ge 0$ for some $r_0 > 0$, then

(a) $\lim_{r\to\infty} r u'(r) = \lambda$ and $\lim_{r\to\infty} \bigl(u(r) - \lambda \log r\bigr) = c_u$ for some constants $\lambda > 0$ and $c_u$.

(b) $r^{p} e^{u(r)+v(r)} \in L^1(0, \infty)$ for any $p \ge 0$.

(c) $\lim_{r\to\infty} \dfrac{r v'(r)}{r^{2+\lambda}} = -\dfrac{e^{c_u}}{2+\lambda}$ and $\lim_{r\to\infty} \dfrac{v(r)}{r^{2+\lambda}} = -\dfrac{e^{c_u}}{(2+\lambda)^2}$, where $\lambda$ and $c_u$ are the constants in (a).

Proof. (a) By Lemma 4.1, we see that
\[
u(r) \ge C \quad\text{and}\quad 1 - e^{v(r)} \ge \frac12 \quad\text{on } [r_0, \infty)
\]
for some $C, r_0 > 0$, then
\[
\lim_{r\to\infty} r v'(r) = -\infty \quad\text{and}\quad v(r) \le -C r^2 \quad \forall r \ge r_0.
\]
From (4.1) we obtain
\[
(r u'(r))' > 0 \ \text{on } [r_0, \infty) \quad\text{and}\quad \lim_{r\to\infty} r u'(r) = \lambda, \quad 0 < \lambda \le \infty. \tag{5.11}
\]

To complete this proof, we need the following fact.

Claim. $\lambda < \infty$.

Proof of Claim. Suppose $\lambda = \infty$, then, using (1.13)–(1.14), we obtain
\[
\int_0^\infty r\,e^{u(r)+v(r)}\,dr \ge \lim_{r\to\infty} r u'(r) - 2N_1 = \infty, \tag{5.12}
\]


and by $\lim_{r\to\infty} r v'(r) = -\infty$,
\[
\lim_{r\to\infty} \frac{r u'(r)}{r v'(r)} = \lim_{r\to\infty} \frac{r\,e^{v(r)}\bigl(e^{u(r)}-1\bigr)}{r\,e^{u(r)}\bigl(e^{v(r)}-1\bigr)} = \lim_{r\to\infty} \frac{1 - e^{-u(r)}}{1 - e^{-v(r)}} = 0. \tag{5.13}
\]
By (5.13) we easily have, for any $p > 0$,
\[
\bigl(r^{2+p} e^{u+v}\bigr)' = r^{1+p} e^{u+v}\bigl(2 + p + r u'(r) + r v'(r)\bigr) < 0 \quad\text{for large } r. \tag{5.14}
\]
From (5.14) we see that $r^{2+p} e^{u+v}$ is bounded from above by a positive constant and hence $r\,e^{u+v} \le C r^{-(1+p)}$ for all large $r$ and $\int_0^\infty r\,e^{u(r)+v(r)}\,dr < \infty$. This contradicts (5.12) and thus $\lambda < \infty$.

Next, we show the asymptotic behavior of $u$ at $r = \infty$. Let $y(r) = u(r) - \lambda \log r$. Then, by (5.11), we have $\lim_{r\to\infty} r y'(r) = 0$ and
\[
\lim_{r\to\infty} \frac{r y'(r)}{r^{-p}} = \lim_{r\to\infty} \frac{r\,e^{v}(e^{u}-1)}{-p\,r^{-p-1}} = 0 \quad\text{for any } p > 0. \tag{5.15}
\]
Since $\lim_{r\to\infty} r y'(r) = 0$ and $(r y')' = r\,e^{v}(e^{u}-1) > 0$ on $[r_0, \infty)$, we obtain that $y'(r) < 0$ on $[r_0, \infty)$. From (5.15) and $y'(r) < 0$ on $[r_0, \infty)$, we easily obtain $y'(r) > -C_1 r^{-2}$, $y(r) > C_2$ for large $r$ for some $C_2 \in \mathbb{R}$ and thus $\lim_{r\to\infty} y(r) = c_u$ for some $c_u \in \mathbb{R}$. This shows the results of (a).

(b) Now, by Lemma 4.1 we easily obtain $v(r) \le -C r^2$ $\forall r \ge R$ for some constants $C > 0$ and $R > 0$. From this inequality and the claim in the proof of (a), we have that $r^{p} e^{u(r)+v(r)} < C r^{-2}$ for all large $r > 0$ for any $p > 0$. Thus the result (b) is valid.

(c) By Lemma 4.1 and Eq. (1.13) we have $\lim_{r\to\infty} v(r) = -\infty$ and
\begin{align*}
\lim_{r\to\infty} \frac{v(r)}{r^{2+\lambda}}
 &= \lim_{r\to\infty} \frac{r v'(r)}{(2+\lambda)\,r^{2+\lambda}}
  = \lim_{r\to\infty} \frac{r\,e^{u}(e^{v}-1)}{(2+\lambda)^2\,r^{1+\lambda}} \\
 &= \frac{1}{(2+\lambda)^2}\cdot\lim_{r\to\infty}\frac{e^{u}}{r^{\lambda}}\cdot\lim_{r\to\infty}(e^{v}-1)
  = \frac{1}{(2+\lambda)^2}\cdot e^{c_u}\cdot(-1).
\end{align*}

Thus (c) is true and the proof is complete.
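The first limit in (c) is not written out in the proof; it follows from the same l'Hôpital step. A short sketch, using only part (a) and the radial equation $(r v')' = r\,e^{u}(e^{v}-1)$ that appears in the computation above:

```latex
\begin{align*}
\lim_{r\to\infty}\frac{r v'(r)}{r^{2+\lambda}}
 &= \lim_{r\to\infty}\frac{(r v'(r))'}{(2+\lambda)\,r^{1+\lambda}}
  = \lim_{r\to\infty}\frac{r\,e^{u}(e^{v}-1)}{(2+\lambda)\,r^{1+\lambda}} \\
 &= \frac{1}{2+\lambda}\,
    \lim_{r\to\infty} e^{\,u(r)-\lambda\log r}\;
    \lim_{r\to\infty}\bigl(e^{v(r)}-1\bigr)
  = -\frac{e^{c_u}}{2+\lambda}.
\end{align*}
% Here e^{u}/r^{\lambda} = e^{u-\lambda\log r} \to e^{c_u} by (a), and e^{v} \to 0.
% One further application of l'Hopital's rule to v(r)/r^{2+\lambda}
% divides by (2+\lambda) again and reproduces the second limit in (c).
```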

Remark 5.2. If we replace $u$ by $v$ in the condition of Lemma 5.2, we obtain the following corresponding results. The proof is similar.

Lemma 5.3. Let $(u, v)$ be an entire solution of (1.13)–(1.14) on $(0, \infty)$. If $v(r_0) \ge 0$ for some $r_0 > 0$, then

(a) $\lim_{r\to\infty} r v'(r) = \eta$ and $\lim_{r\to\infty} \bigl(v(r) - \eta \log r\bigr) = c_v$ for some constants $\eta > 0$ and $c_v$.

(b) $r^{p} e^{u(r)+v(r)} \in L^1(0, \infty)$ for any $p \ge 0$.

(c) $\lim_{r\to\infty} \dfrac{r u'(r)}{r^{2+\eta}} = -\dfrac{e^{c_v}}{2+\eta}$ and $\lim_{r\to\infty} \dfrac{u(r)}{r^{2+\eta}} = -\dfrac{e^{c_v}}{(2+\eta)^2}$, where $\eta$ and $c_v$ are the constants in (a) above.

Now we are in a position to prove Proposition 5.3.


Proof of Proposition 5.3. We prove the case of $(W_u, S_u)$. The case of $(W_v, S_v)$ is similar. We divide the proof into the following steps.

Step 1. $W_u$ is open. Let $\bar\alpha \in W_u$. Then there exists $r_0 > 0$ such that $u(r_0, \bar\alpha) > 0$, $v(r_0, \bar\alpha) < 0$ and $v'(r_0, \bar\alpha) < 0$. By the continuity of $(u, v)$, $(u', v')$ w.r.t. $\alpha$, there exists $\delta > 0$ such that
\[
u(r_0, \alpha) > 0, \quad v(r_0, \alpha) < 0 \quad\text{and}\quad v'(r_0, \alpha) < 0 \quad \forall \alpha \in B_\delta(\bar\alpha). \tag{5.16}
\]
By (5.16) and Lemma 5.1, we obtain $(u(r, \alpha), v(r, \alpha)) \to (\infty, -\infty)$ as $r \to \infty$ $\forall \alpha \in B_\delta(\bar\alpha)$. This proves $B_\delta(\bar\alpha) \subset W_u$ and hence $W_u$ is open.

Step 2. Let $\tilde\alpha = (\theta, \eta) \in S_u$. Then there exists $r_0 > 0$ such that $v(r_0, \tilde\alpha) < 0$ and $v'(r_0, \tilde\alpha) < 0$. By continuity, there exists $\varepsilon > 0$ such that $v(r_0, \alpha) < 0$ and $v'(r_0, \alpha) < 0$ $\forall \alpha = (\alpha_1, \eta)$ with $\theta < \alpha_1 < \theta + \varepsilon$. By (h) of Step 4 in the proof of Proposition 5.1 and (1.13), we have $u(r_1, \alpha) > 0$ and $v'(r_1, \alpha) < 0$ for some $r_1 > r_0$. From Lemma 5.1, we obtain that $(\alpha_1, \eta) \in W_u$ $\forall\, \theta < \alpha_1 < \theta + \varepsilon$. By Lemma 5.2, we also obtain the asymptotic behavior of $(u, v)$ at $\infty$ and the corresponding $(\beta_1, \beta_2)$, which satisfies
\[
\beta_1 = 2N_1 - \lim_{r\to\infty} r u'(r) = 2N_1 - \lambda \quad\text{and}\quad \beta_2 = 2N_2 - \lim_{r\to\infty} r v'(r) = \infty.
\]

These prove (i) and (ii). Step 3. Wu and Wv are all nonempty. Since, by (C) of Proposition 5.1, Su and Sv are all nonempty, we get Wu and Wv are also all nonempty from (i)-(ii). This completes the proof. Now, we give the proof of Theorems 1.3 and 1.2 in the following. Proof of Theorem 1.3. We divide the proof into the following steps. Step 1. By (C) of Proposition 5.1, (c) of Proposition 5.2 and (i)-(ii) of Proposition 5.3, we obtain (i), (iii) and (iv)-(v) respectively. Step 2. First we claim the following statements. (a) For each α ∈ (Sv v ) (resp. α ∈ (Su u )), there does not exist {αk } ⊂ u (resp. {αk } ⊂ v ) such that αk → α as k → ∞. (b) For each α = (α1 , α2 ) ∈ Sv (resp. α ∈ Su ), there does not exist k 0 with {αk = (α1 + k , α2 )} ⊂ N T (resp. {αk = (α1 − k , α2 )} ⊂ N T ) such that αk → α as k → ∞. (c) For each α = (α1 , α2 ) ∈ Sv (resp. α ∈ Su ), there does not exist k 0 with {αk = (α1 , α2 − k )} ⊂ N T (resp. {αk = (α1 , α2 + k )} ⊂ N T ) such that αk → α as i → ∞. (a) We prove the case of Sv v . The case of Su u is based on the same arguments. Suppose there exists {αk } ⊂ u such that αk → α as k → ∞ for some α ∈ (Sv v ). Then, by (d)-(e) of Proposition 5.2, we have β1 (α) = ∞ and β1 (αk ) ≤ 2N1 + 2 ∀k. Denote (u(r ), v(r )) = (u(r, α), v(r, α)) and (u k (r ), vk (r )) = (u(r, αk ), v(r, R αk )) ∀k. Then, by the definition of β1 , we obtain that there exists R0 > 0 such that 0 0 r ev (1 − eu )dr > 2N1 + 2, and

R0 2N1 + 2 ≥ lim r evk (1 − eu k )dr =

k→∞ 0

R0 v

r e (1 − eu )dr (by Bounded Convergence Theorem)

0

> 2N1 + 2.


Fig. 1. Structure of all entire solutions

This contradiction proves (a). (b) We prove the case of Sv . The case of Su is similar. Let α = (α1 , α2 ) ∈ Sv . Suppose there exists k 0 with {αk = (α1 + k , α2 )} ⊂ N T such that αk → α as k → ∞. Then, by (c)–(d) of 5.2, we have β2 (α) = 2N2 and β2 (αk ) > 2N2 + 2 ∀k. Then by the continuity of β2 w.r.t. α1 , i.e., (a) of Proposition 5.2, we obtain β2 (α) = lim β2 (αk ) ≥ 2N2 + 2. This contradiction shows (b). r →∞ (c) The proof is similar to (b). We omit the details. Since Su and Sv are all nonempty, by (a)-(c) above we obtain that ∀α = (α1 , α2 ) ∈ Sv and ∀α¯ = (α¯ 1 , α¯ 2 ) ∈ Su , there respectively exists δ1 = δ1 (α) > 0 and δ2 = δ2 (α) ¯ >0 such that (α1 + δ, α2 ), (α1 , α2 − δ) ∈ v ∀0 < δ < δ1 (α¯ 1 − δ, α¯ 2 ), (α¯ 1 , α¯ 2 + δ) ∈ u ∀0 < δ < δ2 .

(5.17)

By (5.17) we deduce both u and v are nonempty and connected. Now, by (C) of Proposition 5.1 and (a) above, we obtain N T = ∅. By Proposition 5.2, is simple and open connected. From Lemma 2.3, we also have u , N T and v are all simple. Suppose N T is not open. Then, by (5.17) and Lemma 2.3, w.l.o.g., there exists {αi = (αi1 , αi2 )} ⊂ v such that αi → α¯ as i → ∞ for some α¯ = (α¯ 1 , α¯ 2 ) ∈ N T . Let αi = (α¯ 1 , αi2 ). Then αi → α¯ and { αi } ⊂ v . By (a),(c) and (e) of Proposition 5.2, we finally obtain αi ) = β2 (α) ¯ > 2N2 + 2. 2N2 + 2 ≥ lim β2 ( i→∞

This contradiction proves that

N T is open. Now, suppose Z = ∂ N T ∂ = ∅, then there exist α ∈ u and a sequence {αi } ⊂ v such that αi → α as i → ∞. This contradicts (a). Hence Z = ∅ and Z = T . Furthermore, by (a) we also get N T is connected. The proof is complete. Proof of Theorem 1.2. Let (u, v) be a radial solution of (1.7). Then, by Propositions 4.1 and 5.2, we obtain that (u, v) must be one of the Types (I)–(V). Conversely, by Theorem 1.1, the Type (I) solution, i.e., topological solution, exists and is unique. Then,


by (C) of Proposition 5.1, we have that both ∂ and are nonempty. Thus Types (II)–(IV) solutions all exist due to Proposition 5.1 and Theorem 1.3. In particular, the non-topological solution exists. Furthermore, by Proposition 5.3, the Type (V) solution exists. We complete the proof. Remark 5.3. Combining the results of Theorems 1.1, 1.2 and 1.3, we can sketch the structure of entire solutions as in Fig. 1. Acknowledgement. The authors would like to express their gratitude to the referee for valuable comments and suggestions.

Communicated by M. Aizenman

Commun. Math. Phys. 296, 353–403 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1022-y

Communications in

Mathematical Physics

Linear Perturbations of Quaternionic Metrics

Sergei Alexandrov1, Boris Pioline2,3, Frank Saueressig4, Stefan Vandoren5

1 Laboratoire de Physique Théorique & Astroparticules, Université Montpellier II, 34095 Montpellier Cedex 05, France. E-mail: [email protected]
2 Laboratoire de Physique Théorique et Hautes Energies, Université Pierre et Marie Curie, 4 place Jussieu, 75252 Paris cedex 05, France. E-mail: [email protected]
3 Laboratoire de Physique Théorique de l’Ecole Normale Supérieure, 24 rue Lhomond, 75231 Paris cedex 05, France
4 Institut de Physique Théorique, CEA, F-91191 Gif-sur-Yvette, France. E-mail: [email protected]
5 Institute for Theoretical Physics and Spinoza Institute, Utrecht University, Leuvenlaan 4, 3508 TD Utrecht, The Netherlands. E-mail: [email protected]

Received: 25 November 2008 / Accepted: 18 December 2009 Published online: 25 February 2010 – © The Author(s) 2010. This article is published with open access at Springerlink.com

Abstract: We extend the twistor methods developed in our earlier work on linear deformations of hyperkähler manifolds [1] to the case of quaternionic-Kähler manifolds. Via Swann’s construction, deformations of a 4d-dimensional quaternionic-Kähler manifold M are in one-to-one correspondence with deformations of its 4d + 4-dimensional hyperkähler cone S. The latter can be encoded in variations of the complex symplectomorphisms which relate different locally flat patches of the twistor space ZS, with a suitable homogeneity condition that ensures that the hyperkähler cone property is preserved. Equivalently, we show that the deformations of M can be encoded in variations of the complex contact transformations which relate different locally flat patches of the twistor space ZM of M, by-passing the Swann bundle and its twistor space. We specialize these general results to the case of quaternionic-Kähler metrics with d + 1 commuting isometries, obtainable by the Legendre transform method, and linear deformations thereof. We illustrate our methods for the hypermultiplet moduli space in string theory compactifications at tree- and one-loop level.

Unité mixte de recherche du CNRS UMR 5207. Unité mixte de recherche du CNRS UMR 7589. Unité mixte de recherche du CNRS UMR 8549. Unité de recherche associée au CNRS URA 2306.

Contents

1. Introduction
2. Quaternionic-Kähler Geometry and Twistors
   2.1 Bottom-up: from QK to HKC
   2.2 Top down: from HKC to QK
   2.3 Patchwork construction of twistor spaces of HK manifolds - a summary
   2.4 Conditions for superconformal invariance
   2.5 Homogeneous symplectic vs. contact geometry
3. Quaternionic Geometry with Commuting Isometries
   3.1 Tri-holomorphic isometries and superconformal invariance
   3.2 Superconformal quotient
   3.3 Contact twistor lines
4. The Perturbative Hypermultiplet Moduli Space
   4.1 Tree-level geometry
   4.2 One-loop correction
   4.3 Superconformal quotient
5. Linear Deformations of O(2) Quaternionic-Kähler Spaces
   5.1 Linear deformations of O(2) hyperkähler cones
   5.2 Perturbed contact twistor lines
A. Infinitesimal SU(2) Transformations
B. An alternative Formulation for Hypermultiplet Moduli Spaces
C. Deformed Superconformal Quotient
References

1. Introduction

Quaternionic-Kähler (QK) manifolds play an important role in string and supergravity theories, primarily because the hypermultiplet moduli spaces appearing in string theory backgrounds with 8 supercharges fall into this class. In this work, we study general aspects of QK manifolds and of their twistor spaces, and provide a general formalism for describing linear perturbations of 4d-dimensional QK manifolds with d + 1 commuting isometries. For this purpose we build on our previous study of linear deformations of hyperkähler (HK) manifolds obtainable by the Legendre transform method [1].

A key fact for the present study is the (local) one-to-one correspondence between 4d-dimensional QK manifolds M and 4d + 4-dimensional “hyperkähler cones” (HKC) S, i.e. 4d + 4-dimensional HK manifolds with a homothetic Killing vector and an isometric SU(2) action rotating the three complex structures (see Fig. 1 for orientation). In particular, Swann’s construction produces S as a $\mathbb{C}^2/\mathbb{Z}_2$ bundle over M, twisted by the SU(2) part of the spin connection on M [2]. The converse relation goes under the name of “superconformal quotient” in the physics literature [3,4]. Moreover, any isometry of M can be lifted to a tri-holomorphic isometry of S, see e.g. [5,6].

Therefore, the formalism of [1] is directly applicable to the Swann bundle S, with a suitable restriction to ensure the hyperkähler cone (or “superconformal invariance”) property. For this purpose, one introduces the twistor space $Z_S = S \times \mathbb{C}P^1$ of the HK manifold S, an open covering $\hat U_i$ of $Z_S$ projecting to open disks $U_i$ on $\mathbb{C}P^1$, and a local Darboux coordinate system $(\nu^I_{[i]}, \mu^{[i]}_I)$ ($I = \flat, 0, \dots, d-1$) for the O(2)-twisted complex symplectic structure $\Omega^{[i]} = d\mu^{[i]}_I \wedge d\nu^I_{[i]}$ on $\hat U_i$. Since$^1$ $\Omega^{[i]} = f_{ij}^2\,\Omega^{[j]} \bmod d\zeta$ on the overlap $\hat U_i \cap \hat U_j$, the coordinate systems $(\nu^I_{[i]}, \mu^{[i]}_I)$ and $(\nu^I_{[j]}, \mu^{[j]}_I)$ must be related by a symplectomorphism on $\hat U_i \cap \hat U_j$; the latter can be parametrized in the usual way by a generating function $S^{[ij]}(\nu^I_{[i]}, \mu^{[j]}_I, \zeta)$. The set of all $S^{[ij]}$, subject to consistency relations, reality conditions and gauge equivalence, encodes the complex symplectic structure on $Z_S$, and therefore the HK metric on S.

1 Here $f_{ij}$ are the transition functions of the O(1) bundle on $\mathbb{C}P^1$ with coordinate $\zeta$.
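As a reminder of how a generating function parametrizes a symplectomorphism, here is a minimal sketch at fixed $\zeta$ with the O(2) twist suppressed (all $f_{ij}$ set to 1). It illustrates only the mechanism, not the full transition rules of the patchwork construction, and the mixed-variable convention below is the standard textbook one, assumed here rather than quoted from the paper.

```latex
% Sketch only: fixed \zeta, twist factors f_{ij} suppressed (assumption).
% S = S(\nu_{[i]}, \mu^{[j]}) depends on the positions in patch i
% and the momenta in patch j.
\begin{align*}
\mu^{[i]}_I &= \frac{\partial S}{\partial \nu^I_{[i]}}, &
\nu^I_{[j]} &= \frac{\partial S}{\partial \mu^{[j]}_I}.
\end{align*}
% Both symplectic forms reduce to the same mixed second derivative:
% the symmetric blocks S_{\nu\nu} and S_{\mu\mu} drop out of the wedge products.
\begin{align*}
d\mu^{[i]}_I \wedge d\nu^I_{[i]}
 = \frac{\partial^2 S}{\partial \nu^I_{[i]}\,\partial \mu^{[j]}_J}\;
   d\mu^{[j]}_J \wedge d\nu^I_{[i]}
 = d\mu^{[j]}_J \wedge d\nu^J_{[j]} .
\end{align*}
% Example: S = \nu^I_{[i]}\,\mu^{[j]}_I gives \mu^{[i]}_I = \mu^{[j]}_I and
% \nu^I_{[j]} = \nu^I_{[i]}, i.e. the identity map.
```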


Fig. 1. Summary of various coordinate systems on the QK space M, its twistor space ZM , its Swann bundle S and the twistor space of the Swann bundle ZS

As we show in Sect. 2.4 below, superconformal invariance restricts $S^{[ij]}$ to be a function of $f_{ij}^{-2n}\,\nu^I_{[i]}$ and $\mu^{[j]}_I$ only, with no further $\zeta$ dependence, and to be homogeneous of degree one when $(\nu^I_{[i]}, \mu^{[i]}_I)$ are rescaled with weight $(n, 1-n)$, respectively$^2$. The integer n characterizes the transformation rules of the local coordinates under both dilations and SU(2) rotations. For n = 1, the relevant case for QK manifolds with isometries, the O(0) sections $\mu^{[i]}_I$ may acquire anomalous scaling dimensions, and the homogeneity condition may be relaxed into a “quasi-homogeneity” property, as explained further in Sect. 2.4.

2 See [7–9] for other discussions of superconformal invariance in projective superspace and [10,11] for an analysis in components.

Deformations of S correspond to variations $H^{[ij]}_{(1)}(\nu^I_{[i]}, \mu^{[i]}_I, \zeta)$ of the generating functions $S^{[ij]}$, subject to the consistency, reality and quasi-homogeneity conditions and gauge equivalence. When S is obtainable by the Legendre transform method, which is the case when M admits d + 1 commuting isometries, the deformed twistor lines and hyperkähler potential are easily computed to first order in the perturbation. The deformed QK metric may in principle be obtained by the standard superconformal quotient procedure. In Appendix C, we construct a natural set of coordinates on the deformed QK manifold, but stop short of writing the deformed metric explicitly, as the expressions would be too cumbersome.

While the strategy outlined above is conceptually straightforward, it is rather impractical. As we explain in detail in Sect. 2.5, one may by-pass the twistor space of the Swann bundle $Z_S$, and work directly with the twistor space $Z_M$ of the QK manifold M, as


emphasized in particular by Salamon and Lebrun [12,14–16]. While $Z_S$ carries a complex O(2)-valued symplectic structure and a HK metric degenerate along the $\mathbb{C}P^1$ fiber, $Z_M$ carries a complex O(2)-valued contact structure and a non-degenerate Kähler-Einstein metric$^3$ [12]. Conversely, Fano contact manifolds with a Kähler-Einstein metric are twistor spaces of QK manifolds [13,14]$^4$. Similarly to $Z_S$, the complex contact structure $X$ on $Z_M$ admits local Darboux coordinates $\xi^\Lambda_{[i]}, \tilde\xi^{[i]}_\Lambda, \alpha^{[i]}$ ($\Lambda = 0, \dots, d-1$) such that locally, the contact one-form takes the canonical form
\[
X^{[i]} = d\alpha^{[i]} + \xi^\Lambda_{[i]}\, d\tilde\xi^{[i]}_\Lambda .
\]
These Darboux coordinates on $Z_M$ are essentially the projectivization$^5$ of the complex Darboux coordinates $\nu^I_{[i]}, \mu^{[i]}_I$ on $Z_S$. More precisely, we show that the projectivized Darboux coordinates depend on the coordinates $(\zeta, \pi^1, \pi^2)$ on the $\mathbb{C}P^1 \times (\mathbb{C}^2/\mathbb{Z}_2)$ fiber over a given point on M only through the ratio z defined in (2.57) below. Together with the projectivization, this provides the desired reduction from $Z_S$ to $Z_M$. The homogeneous generating functions $S^{[ij]}(\nu^I_{[i]}, \mu^{[j]}_I, \zeta)$ of complex symplectomorphisms on $Z_S$ yield generating functions $\hat S^{[ij]}(\xi^\Lambda_{[i]}, \tilde\xi^{[j]}_\Lambda, \alpha^{[j]})$ of complex contact transformations on $Z_M$. Their deformations can be encoded in functions $\hat H^{[ij]}_{(1)}$ of the same variables, subject to consistency relations, reality conditions, and gauge equivalence. This recovers Lebrun’s assertion that the QK deformations of M are classified by the Čech cohomology group $H^1(Z_M, O(2))$ [14]. The deformed QK metric on M can then be extracted in a systematic way from the knowledge of the “contact twistor lines” (referred to as the “twistor map” in [17]), i.e. the complex coordinates $\xi^\Lambda_{[i]}, \tilde\xi^{[i]}_\Lambda, \alpha^{[i]}$ on $Z_M$ expressed as functions of the coordinates z on $\mathbb{C}P^1$ and $x^\mu$ on the base M.

We end this introduction with an important remark. In string theory or supergravity, only QK manifolds with negative scalar curvature appear as a consequence of supersymmetry [18]. Such QK spaces are generically non-compact. The linear deformation theory set up in this paper is local and applies to both compact and non-compact manifolds. Possible obstructions to extend and integrate infinitesimal deformations into finite global deformations, however, depend strongly on the (non-)compactness. For instance, it is known that complete QK manifolds with positive scalar curvature admit no deformations, see e.g. [15,16].
In contrast, the hypermultiplet moduli spaces arising from string theory compactifications are in general deformed by quantum corrections, as explained e.g. in the introduction of [1] and to be discussed further in [19].

This paper is organized as follows.

• In Sect. 2, we review general aspects of QK manifolds, their twistor spaces, HKC and twistor spaces thereof, and study the consequences of superconformal invariance on the symplectomorphisms used in the patchwork construction of the complex symplectic structure. In particular, in Sect. 2.5, we explain in detail how the homogeneous complex symplectic structure on $Z_S$ reduces to a complex contact structure on $Z_M$, thus allowing one to by-pass the Swann bundle and its twistor space.
• In Sect. 3 we specialize to the case when the 4d-dimensional QK space has d + 1 commuting isometries, i.e. when its Swann bundle is obtainable by the Legendre transform construction. We find the corresponding restriction on the symplectomorphisms, perform the superconformal quotient and obtain the contact twistor lines.
• In Sect. 4, we illustrate these methods on the example of the hypermultiplet moduli space in type II string theory, both at tree and one-loop level, and in the process strengthen the case for the absence of perturbative corrections beyond one-loop.
• Section 5 studies deformations of QK manifolds with d + 1 commuting isometries. We determine the allowed linear perturbations which preserve superconformal invariance, and find the deformed twistor lines and contact twistor lines. These results will be applied to the hypermultiplet moduli space of type II string theories in [19].
• In Appendix A, we spell out the SU(2) action on the various multiplets at the infinitesimal level. In Appendix B, we briefly discuss an alternative description of the hypermultiplet moduli space using a different choice of contour, and show that it is related to the one in Sect. 4 by a local symplectomorphism. In Appendix C we generalize the superconformal quotient of Sect. 3.2 to the perturbed case, and provide an independent check on the results of Sect. 5.2.

3 Moreover, in contrast to $Z_S$, the projection from $Z_M$ to $\mathbb{C}P^1$ is not holomorphic.
4 In fact, our local analysis seems to support Lebrun’s conjecture [14] that every Fano contact manifold is a twistor space.
5 The equivalence between contact structures and homogeneous symplectic structures is a standard trick in contact geometry, see e.g. [14] and references therein.

2. Quaternionic-Kähler Geometry and Twistors

In this section, we review the relation between quaternionic-Kähler (QK) manifolds and hyperkähler cones (HKC). This relation is one-to-one up to coverings (Theorem (5.9) in [2]), and can be established “bottom up”, by constructing the Swann bundle S over the QK manifold M, or “top down”, by performing the superconformal quotient of S. These two constructions are summarized in Sects. 2.1 and 2.2 following [4,17]. In Sect. 2.3, we recall the patchwork construction of the twistor space $Z_S$ of the HK space S developed in our previous work [1]. In Sect. 2.4 we derive the restrictions on the transition functions imposed by superconformal invariance. In Sect. 2.5, we study the reduction of the homogeneous complex symplectic structure on $Z_S$ to a complex contact structure on $Z_M$. The reader will find it helpful to refer to Fig. 1 for the various coordinate systems involved in these constructions.

2.1. Bottom-up: from QK to HKC.
A quaternionic-Kähler manifold M is a 4d-dimensional manifold with Riemannian metric $g_M$ and Levi-Civita connection $\nabla$ whose holonomy group is contained in $USp(d) \times SU(2)$ [12]. M admits a triplet of almost complex Hermitian structures $\vec J$ (defined up to SU(2) rotations) satisfying the algebra of the unit imaginary quaternions. The quaternionic two-forms $\vec\omega_M(X, Y) = g_M(\vec J X, Y)$ are covariantly closed with respect to the SU(2) part $\vec p$ of the Levi-Civita connection, and are proportional with a fixed coefficient $\nu$ to the curvature of $\vec p$,
\[
d\vec\omega_M + \vec p \times \vec\omega_M = 0, \qquad
d\vec p + \frac12\,\vec p \times \vec p = \frac12\,\nu\,\vec\omega_M, \tag{2.1}
\]
where we use the notation $(\vec a \times \vec b)^i = \epsilon^{ijk}\, a^j \wedge b^k$. As a consequence, the metric on M is Einstein, with constant Ricci scalar curvature $R = 4d(d+2)\nu$. HK manifolds are degenerate limits of QK manifolds, where $\nu = 0$. We are mainly concerned in this work with the case of negative curvature, $\nu < 0$.

The Swann bundle S associated to M is the total space of a $\mathbb{C}^2$ bundle (more precisely $\mathbb{C}^2/\mathbb{Z}_2$ with the zero section deleted) over M. It is a hyperkähler manifold of dimension 4(d + 1) with an SU(2) isometric action which rotates the complex structures into each other, and a homothetic Killing vector. The homothetic Killing vector ensures that the hyperkähler manifold is actually a cone, and the SU(2) isometries guarantee that this is a cone over a three-Sasakian space with $S^3$ fibres over the quaternionic base M [20]. In physics terminology, these properties follow from N = 2 superconformal


invariance of the associated sigma model [4]. We denote by $\pi^A$ the complex coordinates on the $\mathbb{C}^2/\mathbb{Z}_2$ fiber, $\bar\pi_A \equiv (\pi^A)^*$ their complex conjugates, and use the antisymmetric tensor $\epsilon_{AB}$ to raise and lower the indices.$^6$ The HK metric on $\mathcal{S}$ is given by

$$ ds^2_{\mathcal{S}} = |D\pi^A|^2 + \frac{\nu}{4}\, r^2\, ds^2_{\mathcal{M}}, \tag{2.2} $$

where $ds^2_{\mathcal{M}}$ is the QK metric on $\mathcal{M}$, $r^2 \equiv |\pi^1|^2 + |\pi^2|^2 = \pi^A\bar\pi_A$ is the squared norm on the fiber, and

$$ D\pi^A \equiv d\pi^A + p^A{}_B\,\pi^B \tag{2.3} $$

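is the covariant differential of $\pi^A$. The isometric $SU(2)$ action on $\mathcal{S}$ is given by the infinitesimal transformations

$$ \delta\pi^A = \frac{i}{2}\,\epsilon_3\,\pi^A + \epsilon_+\,\bar\pi^A, \qquad \delta\bar\pi_A = -\frac{i}{2}\,\epsilon_3\,\bar\pi_A + \epsilon_-\,\pi_A. \tag{2.4} $$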

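In particular, the norm $r^2$ is $SU(2)$ invariant. The homothetic Killing vector $r\partial_r = \pi^A\partial_{\pi^A} + \bar\pi_A\partial_{\bar\pi_A}$ corresponds to dilations of the fiber. With respect to the complex structure $J^3$, where $D\pi^A$ are $(1,0)$ forms, the Kähler form is

$$ \omega^3_{\mathcal{S}} = i\left( D\pi^A\wedge D\bar\pi_A + \frac{\nu}{2}\,\pi_A\bar\pi_B\,\omega^{AB}_{\mathcal{M}} \right), \tag{2.5} $$

while the holomorphic symplectic form $\omega^+_{\mathcal{S}} = -\tfrac{1}{2}(\omega^1_{\mathcal{S}} - i\omega^2_{\mathcal{S}})$ is given by

$$ \omega^+_{\mathcal{S}} = D\pi^A\wedge D\pi_A + \frac{\nu}{2}\,\pi_A\pi_B\,\omega^{AB}_{\mathcal{M}} = d\left(\pi^A D\pi_A\right). \tag{2.6} $$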

This construction directly defines the HKC, or Swann bundle $\mathcal{S}$, given a QK manifold $\mathcal{M}$, see [17] for more details.

For many purposes, it is useful to decompose the construction above in two steps, by first introducing the twistor space $\mathcal{Z}_{\mathcal{M}}$ [12], a $\mathbb{CP}^1$ bundle over $\mathcal{M}$, and then obtaining the Swann bundle $\mathcal{S}$ as a $\mathbb{C}^\times$ bundle over $\mathcal{Z}_{\mathcal{M}}$. The twistor space $\mathcal{Z}_{\mathcal{M}}$ over $\mathcal{M}$ should not be confused with the twistor space $\mathcal{Z}_{\mathcal{S}}$ of $\mathcal{S}$ itself, to be introduced in Sect. 2.3 below. $\mathcal{Z}_{\mathcal{M}}$ is a complex manifold with a canonical Kähler-Einstein metric and a complex contact structure.$^7$ Introducing a complex coordinate $z$ on $\mathbb{CP}^1$, the line element is given by

$$ ds^2_{\mathcal{Z}_{\mathcal{M}}} = \frac{|dz + P|^2}{(1 + z\bar z)^2} + \frac{\nu}{4}\, ds^2_{\mathcal{M}}, \tag{2.7} $$

while the Kähler form on $\mathcal{Z}_{\mathcal{M}}$ is given by

$$ \omega_{\mathcal{Z}_{\mathcal{M}}} = i\,\frac{(dz + P)\wedge(d\bar z + \bar P)}{(1 + z\bar z)^2} + \frac{\nu}{2}\,\frac{(1 - z\bar z)\,\omega^3_{\mathcal{M}} - 2iz\,\omega^+_{\mathcal{M}} + 2i\bar z\,\omega^-_{\mathcal{M}}}{1 + z\bar z}, \tag{2.8} $$

6 We use conventions in which $\epsilon_{12} = 1 = -\epsilon_{21}$ and $\bar\pi_A = \bar\pi^B\epsilon_{BA}$.
7 Recall that a complex contact form on a complex manifold of complex dimension $2d+1$ is a holomorphic one-form $\hat{\mathcal{X}}$, defined globally, such that $\hat{\mathcal{X}}\wedge(d\hat{\mathcal{X}})^d$ is a nowhere vanishing holomorphic top form. A contact structure corresponds to the case where $\hat{\mathcal{X}}$ is a local one-form defined up to multiplication by a nowhere vanishing smooth function.


where $\omega^\pm_{\mathcal{M}} = -\tfrac{1}{2}(\omega^1_{\mathcal{M}} \mp i\omega^2_{\mathcal{M}})$, $\omega^+_{\mathcal{M}} = (\omega^-_{\mathcal{M}})^*$. In these expressions, $P$ stands for the “projectivized connection”, defined from the $SU(2)$ connection $p^A{}_B$ as

$$ P = p_+ - i\,p_3\, z + p_-\, z^2, \tag{2.9} $$

where $p_+ \equiv p^1{}_2$, $p_3 \equiv i(p^1{}_1 - p^2{}_2)$, $p_- \equiv -p^2{}_1$, with $p_3$ real and $(p_-)^* = p_+$. The complex contact structure on $\mathcal{Z}_{\mathcal{M}}$ is induced from the Liouville form $\mathcal{X}$ on $\mathcal{S}$,

$$ \mathcal{X} \equiv \pi^A D\pi_A = \frac{1}{2}\,(\pi^2)^2\,(dz + P), \tag{2.10} $$

and, as apparent from the overall factor of $(\pi^2)^2$, is a section$^8$ of the $O(2)$ line bundle on $\mathbb{CP}^1$. From the complex contact structure one may easily extract the $SU(2)$ connection $\vec p$, and therefore the triplet of quaternionic two-forms $\vec\omega_{\mathcal{M}}$ via Eq. (2.1). Thus, the knowledge of the complex structure and contact structure on $\mathcal{Z}_{\mathcal{M}}$ is sufficient to reconstruct the quaternionic-Kähler metric.

To construct the Swann bundle $\mathcal{S}$ we introduce two more real coordinates $r$ and $\psi$ parametrizing the fiber of a $\mathbb{C}^\times$ bundle over $\mathcal{Z}_{\mathcal{M}}$, with metric$^9$

$$ ds^2_{\mathcal{S}} = dr^2 + r^2\left[ (D\psi)^2 + ds^2_{\mathcal{Z}_{\mathcal{M}}} \right], \tag{2.11} $$

where

$$ D\psi = d\psi + \frac{i}{2(1 + z\bar z)}\left[ (z\,d\bar z - \bar z\,dz) - i(1 - z\bar z)\,p_3 + 2z\,p_- - 2\bar z\,p_+ \right]. \tag{2.12} $$

(2.12)

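The metrics (2.2) and (2.11) are identical, provided the coordinates $r, \psi, z, \bar z$ are related to $\pi^A, \bar\pi_A$ via

$$ r^2 = |\pi^1|^2 + |\pi^2|^2, \qquad e^{i\psi} = \sqrt{\pi^2/\bar\pi_2}, \qquad z = \frac{\pi^1}{\pi^2}, \qquad \bar z = \frac{\bar\pi_1}{\bar\pi_2}, \tag{2.13} $$

or conversely

$$ \begin{pmatrix} \pi^1 \\ \pi^2 \end{pmatrix} = \frac{r\,e^{i\psi}}{\sqrt{1 + z\bar z}} \begin{pmatrix} z \\ 1 \end{pmatrix}. \tag{2.14} $$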
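As a quick consistency check (a sketch only, assuming the convention $\bar\pi_A \equiv (\pi^A)^*$ introduced above), one can substitute the parametrization (2.14) back into the definitions (2.13) and recover $r$, $z$ and $\psi$:

```latex
\begin{align*}
|\pi^1|^2 + |\pi^2|^2 &= \frac{r^2\,(z\bar z + 1)}{1 + z\bar z} = r^2, \\
\frac{\pi^1}{\pi^2} &= z, \\
\frac{\pi^2}{\bar\pi_2}
  &= \frac{r\,e^{i\psi}/\sqrt{1+z\bar z}}{r\,e^{-i\psi}/\sqrt{1+z\bar z}}
   = e^{2i\psi}.
\end{align*}
```

The last line shows why $e^{i\psi}$ in (2.13) involves a square root, so $\psi$ is only determined modulo $\pi$ by the point $(\pi^1,\pi^2)$; this is in line with the fiber being $\mathbb{C}^2/\mathbb{Z}_2$ rather than $\mathbb{C}^2$.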

For more details, we again refer the reader to [17].

8 More precisely, $\mathcal{X}$ is defined on $\mathcal{S}$. In (2.23) we define a contact one-form $\hat{\mathcal{X}}$ proportional to $\mathcal{X}$, which does live on $\mathcal{Z}_{\mathcal{M}}$.
9 We follow the conventions of [17], but with a slightly different notation. E.g. the coordinate $\psi$ here is denoted by $\phi$ in [17]. Also, in [17], the $SU(2)$ index in $\pi^A$ was not lowered after complex conjugation.


2.2. Top down: from HKC to QK. The characterizing property of an HKC is that there exists a function $\chi$ on $\mathcal{S}$, known as the hyperkähler potential, such that the metric, in local (real) coordinates $\phi^M$, $M = 1, \dots, 4(d+1)$, satisfies [2,4]

$$ g_{MN} = D_M\partial_N\chi(\phi). \tag{2.15} $$

For any Hermitian complex structure, in adapted complex coordinates $z^m$, $m = 1, \dots, 2(d+1)$, (2.15) implies that

$$ g_{m\bar n} = \partial_m\partial_{\bar n}\chi(z,\bar z), \qquad D_m\partial_n\chi(z,\bar z) = 0. \tag{2.16} $$

In particular, $\chi$ provides a Kähler potential in any complex structure. The dilation and $SU(2)$ symmetries are generated by the vector fields

$$ \chi^M = g^{MN}\chi_N, \qquad \vec k^M = \vec J^M{}_N\,\chi^N, \tag{2.17} $$

where $\chi_M \equiv \partial_M\chi$, $g^{MN}$ is the inverse HKC metric, and $\vec J$ is a triplet of complex structures. The $SU(2)$ Killing vector fields are not tri-holomorphic but rotate the complex structures into each other. It follows from (2.15) that the four vector fields $\chi^M$ and $\vec k^N$ satisfy

$$ D_M\chi^N = \delta_M^N, \qquad D_M\vec k^N = \vec J^N{}_M. \tag{2.18} $$

In particular, $\chi^m\partial_{z^m}$ is holomorphic. One can also express the hyperkähler potential in terms of the metric and the homothetic Killing vector fields,

$$ \chi = \frac{1}{2}\,\chi^M g_{MN}\chi^N = \frac{1}{2}\,\chi^M\partial_M\chi, \tag{2.19} $$

consistent with (2.15). It is easy to check that this form of the hyperkähler potential is $SU(2)$ invariant. In the coordinates that appear in the construction of the Swann bundle, the homothetic Killing vector is generated by the vector field $\chi^M\partial_M = r\partial_r$, and so (2.19) yields

$$ \chi = r^2 = \pi^A\bar\pi_A. \tag{2.20} $$
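The step from (2.19) to (2.20) is an instance of Euler's relation for homogeneous functions. A short sketch, in which $C$ is a label introduced here for an $r$-independent factor:

```latex
\begin{align*}
\chi = \tfrac{1}{2}\,\chi^M\partial_M\chi = \tfrac{1}{2}\, r\,\partial_r\chi
\quad &\Longleftrightarrow \quad r\,\partial_r\chi = 2\chi
\quad \Longleftrightarrow \quad \chi = C\, r^2 , \qquad \partial_r C = 0 , \\
\text{check for } C = 1: \qquad
\tfrac{1}{2}\, r\,\partial_r\!\left(r^2\right) &= r^2 .
\end{align*}
```

In other words, (2.19) only fixes $\chi$ to be homogeneous of degree two under dilations of the fiber; the normalization $C = 1$ is the one stated in (2.20).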

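One can descend from the HKC $\mathcal{S}$ to the twistor space $\mathcal{Z}_{\mathcal{M}}$ by performing a $U(1)$ Kähler quotient. For any choice of complex structure $\vec n\cdot\vec J$ with $\vec n$ a unit vector, $\vec n\cdot\vec k$ is a holomorphic Killing vector. The Kähler quotient of $\mathcal{S}$ with respect to $\vec n\cdot\vec k$ provides a Kähler manifold of real dimension $4d+2$, independent of the choice of $\vec n$, which is just the twistor space $\mathcal{Z}_{\mathcal{M}}$. By Frobenius' theorem, one may choose a set of independent complex coordinates $\lambda, u^i$, $i = 1, \dots, 2d+1$ adapted to the action of the holomorphic vector field $\chi^M$,

$$ \chi^m(z)\,\partial_m = \partial_\lambda\big|_{u^i}. \tag{2.21} $$

The Kähler potential on $\mathcal{Z}_{\mathcal{M}}$ is then determined from the hyperkähler potential $\chi$ by means of

$$ \chi(\lambda,\bar\lambda,u,\bar u) = e^{\lambda + \bar\lambda + K_{\mathcal{Z}_{\mathcal{M}}}(u,\bar u)}. \tag{2.22} $$

Defining the $O(2)$-twisted holomorphic contact form on $\mathcal{Z}_{\mathcal{M}}$,

$$ \hat{\mathcal{X}} \equiv e^{-2\lambda}\,\mathcal{X} = e^{\bar\lambda - \lambda + 2i\psi + K_{\mathcal{Z}_{\mathcal{M}}}}\,\frac{dz + P}{2(1 + z\bar z)}, \tag{2.23} $$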


one may rewrite the metric on the fiber as the modulus square of the contact form [3],

$$ \frac{|dz + P|^2}{(1 + z\bar z)^2} = 4\, e^{-2K_{\mathcal{Z}_{\mathcal{M}}}}\, |\hat{\mathcal{X}}|^2. \tag{2.24} $$
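The relations (2.23) and (2.24) follow from formulas already given. The sketch below assumes the Liouville form (2.10) carries the overall factor $\tfrac{1}{2}$, uses the parametrization (2.14) for $\pi^2$, and writes $K$ for $K_{\mathcal{Z}_{\mathcal{M}}}$:

```latex
\begin{align*}
\mathcal{X} &= \tfrac{1}{2}\,(\pi^2)^2\,(dz + P)
  = \frac{r^2\, e^{2i\psi}}{2(1 + z\bar z)}\,(dz + P)
  && \text{by (2.10), (2.14)}, \\
r^2 &= \chi = e^{\lambda + \bar\lambda + K}
  && \text{by (2.20), (2.22)}, \\
\hat{\mathcal{X}} = e^{-2\lambda}\,\mathcal{X}
  &= e^{\bar\lambda - \lambda + 2i\psi + K}\,\frac{dz + P}{2(1 + z\bar z)}
  && \text{which is (2.23)}, \\
|\hat{\mathcal{X}}|^2 &= e^{2K}\,\frac{|dz + P|^2}{4\,(1 + z\bar z)^2}
  && \text{since } \big|e^{\bar\lambda - \lambda + 2i\psi}\big| = 1 .
\end{align*}
```

Solving the last line for the fiber metric gives (2.24). The phase drops out because $\bar\lambda - \lambda$ is purely imaginary and $\psi$ is real.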

Note that $\psi$ is not an independent coordinate, but rather will be determined in terms of $\lambda, z, x^\mu$ in Eq. (2.79) below, in such a way that $\bar\lambda - \lambda + 2i\psi$ is a function on $\mathcal{Z}_{\mathcal{M}}$ only.

The QK metric on $\mathcal{M}$ can be computed from the holomorphic contact form $\mathcal{X}$ as indicated below (2.10), or by decomposing the metric on the twistor space as in (2.7), see [4,5] for more details. To express the metric on $\mathcal{M}$ in closed form, one needs to express the complex coordinates $z^m$ on $\mathcal{S}$ (or $u^i$ on $\mathcal{Z}_{\mathcal{M}}$) in terms of $4d$ independent real coordinates, corresponding to $\mathbb{R}^+\times SU(2)$ invariant combinations of $\phi^M$, and coordinates on the $\mathbb{C}^2$ fiber $z, \bar z, \lambda, \bar\lambda$. As we shall see shortly, this problem is a QK analog of the problem of “parametrizing the twistor lines” in HK geometry.

2.3. Patchwork construction of twistor spaces of HK manifolds - a summary. As explained e.g. in [1,21–25], HK geometry is equivalent to complex symplectic geometry on the twistor space, compatible with the real structure. This, of course, also applies to the HKC metric on the Swann bundle $\mathcal{S}$, with suitable restrictions on the complex symplectic structure to ensure the HK cone property. In this subsection, we briefly review the twistorial description of general HK manifolds $\mathcal{S}$ following [1], before studying the implications of superconformal invariance in Sect. 2.4.

In contrast to the quaternionic-Kähler case described in Sect. 2.1, the twistor space $\mathcal{Z}_{\mathcal{S}}$ over a $(4d+4)$-dimensional HK manifold$^{10}$ $\mathcal{S}$ is a trivial product $\mathcal{Z}_{\mathcal{S}} = \mathcal{S}\times\mathbb{CP}^1$. Its structure was developed from a physics viewpoint in [22,24], and its relation to projective superspace was recently further analysed in [25]. We denote by $\zeta$ a complex coordinate on the projective line $\mathbb{CP}^1$ around the north pole $\zeta = 0$. $\mathcal{Z}_{\mathcal{S}}$ carries an integrable complex structure given by

$$ J(\zeta,\bar\zeta) = \frac{1 - \zeta\bar\zeta}{1 + \zeta\bar\zeta}\, J^3 + \frac{\zeta + \bar\zeta}{1 + \zeta\bar\zeta}\, J^2 + i\,\frac{\zeta - \bar\zeta}{1 + \zeta\bar\zeta}\, J^1 \tag{2.25} $$

on the base $\mathcal{S}$ (where $J^i$ are the three complex structures on $\mathcal{S}$) and the standard complex structure on $\mathbb{CP}^1$. Moreover, in this complex structure, $\mathcal{Z}_{\mathcal{S}}$ carries a holomorphic two-form $\Omega$ (more accurately, a section of $\Lambda^2 T_F^*(2)$, see [22]) and a Kähler form $\omega$ given locally by

$$ \Omega(\zeta) = \omega^+_{\mathcal{S}} - i\zeta\,\omega^3_{\mathcal{S}} + \zeta^2\,\omega^-_{\mathcal{S}}, \tag{2.26} $$

and

$$ \omega(\zeta,\bar\zeta) = \frac{1}{1 + \zeta\bar\zeta}\left[ (1 - \zeta\bar\zeta)\,\omega^3_{\mathcal{S}} - 2i\zeta\,\omega^+_{\mathcal{S}} + 2i\bar\zeta\,\omega^-_{\mathcal{S}} \right], \tag{2.27} $$

where $\omega^\pm_{\mathcal{S}} = -\tfrac{1}{2}(\omega^1_{\mathcal{S}} \mp i\omega^2_{\mathcal{S}})$. Note that, in contrast to the quaternionic-Kähler case, both of these forms are degenerate along the $\mathbb{CP}^1$ fiber direction $d\zeta$. The Kähler form $\omega$ coincides with $\omega^3_{\mathcal{S}}$ at the north pole $\zeta = 0$, and with $-\omega^3_{\mathcal{S}}$ at the south pole $\zeta = \infty$.

10 For obvious reasons, we deviate from the notations of [1] which considered $4d$ dimensional HK manifolds $\mathcal{M}$.
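The statements about the two poles, and the agreement between the Kähler form (2.27) and the complex structure (2.25), can be checked in a few lines. This is a sketch in which the subscript $\mathcal{S}$ is suppressed and $\omega^\pm = -\tfrac{1}{2}(\omega^1 \mp i\omega^2)$ is assumed as stated above:

```latex
\begin{align*}
-2i\zeta\,\omega^+ + 2i\bar\zeta\,\omega^-
  &= i(\zeta - \bar\zeta)\,\omega^1 + (\zeta + \bar\zeta)\,\omega^2 , \\
\omega(\zeta,\bar\zeta)
  &= \frac{(1 - \zeta\bar\zeta)\,\omega^3 + (\zeta + \bar\zeta)\,\omega^2
      + i(\zeta - \bar\zeta)\,\omega^1}{1 + \zeta\bar\zeta} , \\
\omega\big|_{\zeta = 0} &= \omega^3 , \qquad
\lim_{|\zeta|\to\infty}\omega = -\omega^3 , \\
(1 - \zeta\bar\zeta)^2 + (\zeta + \bar\zeta)^2 + \big(i(\zeta - \bar\zeta)\big)^2
  &= (1 + \zeta\bar\zeta)^2 .
\end{align*}
```

The second line has the same coefficients as (2.25), and the last line shows that these coefficients form a unit vector, so $\omega(\zeta,\bar\zeta)$ is the Kähler form of the complex structure $J(\zeta,\bar\zeta)$ for every point of the sphere.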


The holomorphic two-form $\Omega$ however, while coinciding with $\omega^+_{\mathcal{S}}$ at the north pole, diverges with a second order pole at $\zeta = \infty$. As explained in [1], it is useful to introduce a set of patches $\hat U_i$, $i = 1, \dots, N$ on $\mathcal{Z}_{\mathcal{S}}$, which project to open disks$^{11}$ $U_i$ on $\mathbb{CP}^1$, and a local section $\Omega^{[i]}$ which is regular on each patch. In order for the holomorphic section $\Omega$ to be well defined, one must require that, on the overlap $\hat U_i\cap\hat U_j$,

$$ \Omega^{[i]} = f_{ij}^2\,\Omega^{[j]} \quad \text{mod } d\zeta. \tag{2.28} $$

The factor $f_{ij}(\zeta)$ corresponds to the transition function of the $O(1)$ bundle on $\mathbb{CP}^1$, and was discussed in detail in [1]. In particular, we recall that

$$ f_{ij}\, f_{jk} = f_{ik}, \qquad f_{ii} = 1, \qquad \tau(f_{ij}^2) = f_{\bar\imath\bar\jmath}^2, \tag{2.29} $$

where $\tau$ is the antipodal map $[\tau(\nu)](\zeta) \equiv \overline{\nu(-1/\bar\zeta)}$, and $\bar\imath$ labels the patch $U_{\bar\imath}$ opposite to the patch $U_i$ under the involution $\tau$. Defining $\Omega^{[0]} = \Omega$ and using $f_{0\infty} = \zeta$ one finds that

$$ \Omega^{[\infty]} \equiv \zeta^{-2}\,\Omega^{[0]} = \omega^-_{\mathcal{S}} - i\,\omega^3_{\mathcal{S}}\,\zeta^{-1} + \omega^+_{\mathcal{S}}\,\zeta^{-2} \tag{2.30} $$

is regular at the south pole $\zeta = \infty$.

Now, we may choose the covering $\hat U_i$ such that, on each patch, the holomorphic section $\Omega^{[i]}$ takes the Darboux form

$$ \Omega^{[i]} = d\mu^{[i]}_I\wedge d\nu^I_{[i]}, \tag{2.31} $$

where $(\nu^I_{[i]}, \mu^{[i]}_I, \zeta)$ is a local complex coordinate system on $\mathcal{Z}_{\mathcal{S}}$, regular throughout the patch $\hat U_i$ (here $I$ runs over $d+1$ values, which we shall denote $\flat, 0, \dots, d-1$). Equation (2.28) implies that on the overlap of two patches, $(\nu^I_{[i]}, \mu^{[i]}_I)$ and $(\nu^I_{[j]}, \mu^{[j]}_I)$ must be related by a complex ($O(2)$-twisted) symplectomorphism. This is conveniently encoded by a generating function $S^{[ij]}$ of the initial “position” and final “momentum” coordinates, such that

$$ \nu^I_{[j]} = \partial_{\mu^{[j]}_I} S^{[ij]}(\nu_{[i]},\mu^{[j]},\zeta), \qquad \mu^{[i]}_I = f_{ij}^2\,\partial_{\nu^I_{[i]}} S^{[ij]}(\nu_{[i]},\mu^{[j]},\zeta). \tag{2.32} $$

To check (2.28), one may use the identity

$$ dS^{[ij]} = \nu^I_{[j]}\,d\mu^{[j]}_I + f_{ij}^{-2}\,\mu^{[i]}_I\,d\nu^I_{[i]} \quad \text{mod } d\zeta. \tag{2.33} $$

The transition functions $S^{[ij]}$ are restricted by consistency conditions which ensure that the symplectomorphisms compose properly (see [1] for more details). As a result, the holomorphic symplectic structure on the twistor space $\mathcal{Z}_{\mathcal{S}}$ is entirely specified by $N-1$ freely chosen functions $S^{[0i]}(\nu_{[0]},\mu^{[i]},\zeta)$. In order to ensure the reality of the resulting metric, it is also necessary to require that the sections $\nu^I_{[i]}, \mu^{[i]}_I$ transform under the real structure as

$$ \tau\big(\nu^I_{[i]}\big) = -\nu^I_{[\bar\imath]}, \qquad \tau\big(\mu^{[i]}_I\big) = -\mu^{[\bar\imath]}_I. \tag{2.34} $$

11 In principle, one should introduce a local coordinate $\zeta^{[i]}$ on each connected disk; to avoid cluttering we shall use a single coordinate $\zeta$ to parametrize all patches at once, with each connected disk $U_i$ being centered at $\zeta = \zeta_i$.


The condition (2.34) requires that the functions $S^{[i\bar\imath]}$ are related by complex conjugation to their Legendre transform [1].

For a suitably generic choice of such transition functions, it is a general property of HK manifolds that the space of solutions of (2.32) has dimension $4d+4$, i.e. all $\nu^I_{[i]}$, $\mu^{[i]}_I$ can be expressed as infinite Taylor series around $\zeta = \zeta_i$ whose coefficients are all functions of $4d+4$ parameters. The moduli space of solutions is isomorphic to the HK base $\mathcal{S}$, and the map $\zeta \to (\nu^I_{[i]}, \mu^{[i]}_I)$ defines the “twistor lines”, i.e. realizes the $\mathbb{CP}^1$ fiber over any point in $\mathcal{S}$ as a rational curve in $\mathcal{Z}_{\mathcal{S}}$.

Having found the twistor lines, the geometry of $\mathcal{S}$ can be computed by Taylor expanding the holomorphic section $\Omega$ around any point $\zeta\in\mathbb{CP}^1$. When $\mathcal{S}$ is a HKC, as we discuss further in the next section, all points of $\mathbb{CP}^1$ are equivalent, and we can therefore expand around $\zeta = 0$. Since $\Omega$ is a global section of $O(2)$, the Taylor expansion stops at quadratic order,

$$ \Omega^{[0]} = dw_I\wedge dv^I - i\,\omega^3_{\mathcal{S}}\,\zeta + d\bar w_I\wedge d\bar v^I\,\zeta^2, \tag{2.35} $$

where $v^I, w_I$ are the complex coordinates in the complex structure $J^3 = J(0,0)$,

$$ v^I = \nu^I_{[0]}(\zeta = 0), \qquad w_I = \mu^{[0]}_I(\zeta = 0). \tag{2.36} $$

Knowing the complex coordinates and the Kähler form ωS3 , it is straightforward to obtain the metric and a Kähler potential. When S is a HKC, as we will discuss in the next section, it is always possible to choose the Kähler potential such that it is invariant under SU (2), and therefore equal to the hyperkähler potential χ . 2.4. Conditions for superconformal invariance. We now discuss the implications of superconformal invariance for the general construction of the twistor space ZS of a HK manifold S. We recall from Sect. 2.2 that superconformal invariance requires the existence of a homothetic Killing vector and an SU (2) group of Killing vectors that rotates the complex structures and commutes with the dilations. As follows from the first equation in (2.18), the dilations rescale the hyperkähler cone metric and leave the complex structures invariant. We normalize the action of the dilations such that the metric has weight 2, J = J.

g = 2 g,

(2.37)

This implies that all the two-forms ω S on the Swann bundle S scale with weight two. The action of the dilations can be extended to the twistor space ZS by assigning a scaling weight zero to ζ . In this way, the holomorphic two-form from (2.26) transforms uniformly throughout the ζ plane,

[i]

The local Darboux coordinates ν[i] and obeyed, so we postulate12 I ν [i] = 2n ν[i] , I

= 2 [i] . µ[i]

(2.38)

must transform in such a way that (2.38) is [i]

µ I = (2−2n) µ[i] I ,

(2.39)

12 One may also consider giving a different scaling weight n for each conjugate pair (ν I, µ ). The generI I alization of the following discussion is immediate.


for some constant n. This is a symmetry of the gluing conditions (2.32) provided the generating functions are homogenous functions of degree one when ν and µ are scaled with degree n and 1 − n respectively, S [i j] (2n ν[i] , (2−2n) µ[ j] , ζ ) = 2 S [i j] (ν[i] , µ[ j] , ζ ).

(2.40)

We now turn to the $SU(2)$ action. In order for the complex structure $J(\zeta,\bar\zeta)$ given in (2.25) to be invariant, one should compensate the rotation of $\vec J$ by a rotation on the $\mathbb{CP}^1$ fiber. Thus the fiber coordinate $\zeta$ must transform as

$$ \zeta \;\mapsto\; \frac{\alpha\zeta + \beta}{-\bar\beta\zeta + \bar\alpha}, \qquad \alpha\bar\alpha + \beta\bar\beta = 1. \tag{2.41} $$
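Since the real structure $\tau$ is built on the antipodal map of the sphere, it is worth checking that the rotation (2.41) commutes with it. In the sketch below, $T$ and $a$ are labels introduced here for the map (2.41) and for the antipodal point map $\zeta \mapsto -1/\bar\zeta$:

```latex
\begin{align*}
T(\zeta) &= \frac{\alpha\zeta + \beta}{-\bar\beta\zeta + \bar\alpha} , \qquad
a(\zeta) = -\frac{1}{\bar\zeta} , \qquad \alpha\bar\alpha + \beta\bar\beta = 1 , \\
T\big(a(\zeta)\big)
  &= \frac{-\alpha/\bar\zeta + \beta}{\bar\beta/\bar\zeta + \bar\alpha}
   = \frac{\beta\bar\zeta - \alpha}{\bar\alpha\bar\zeta + \bar\beta} , \\
a\big(T(\zeta)\big)
  &= -\,\frac{-\beta\bar\zeta + \alpha}{\bar\alpha\bar\zeta + \bar\beta}
   = \frac{\beta\bar\zeta - \alpha}{\bar\alpha\bar\zeta + \bar\beta}
   = T\big(a(\zeta)\big) .
\end{align*}
```

Pairs of antipodal points are therefore mapped to pairs of antipodal points, as expected for a rotation of the sphere; a generic Möbius transformation would not have this property.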

Under this transformation, should transform as a O(2) section,

[0] ¯ + α¯ 2 [0] ζ . (ζ ) = −βζ

(2.42)

Here, we have written the action in the patch U0 around the north pole of CP 1 , parametrized by the local coordinate ζ = ζ [0] . The action in the patch Ui can be obtained by replacing ζ → ζ [i] , [0] → [i] . If we continue to use ζ as a coordinate in Ui , then from (2.28) the transformation of [i] becomes [i]

(ζ ) =

f i0 (ζ ) f i0 (ζ )

2

¯ + α¯ 2 [i] ζ . −βζ

(2.43)

In order to ensure that transforms as (2.43) in every patch, we postulate that the local I , µ[i] transform locally as O(2n) and O(2 − 2n) sections Darboux coordinates ν[i] I

2n I f i0 (ζ ) ¯ ν[i] (ζ ) = −βζ + α¯ ν[i] (ζ ), f i0 (ζ )

2−2n [i] f i0 (ζ ) ¯ + α¯ µ I[i] (ζ ) = − βζ µ I (ζ ). f i0 (ζ ) I

(2.44)

This is a symmetry of the gluing equations (2.32) provided S

[i j]

[ j]

ν[i] (ζ ), µ

f (ζ ) 2 j0 2 [i j] [ j] ¯ + α) ν . (ζ ), ζ = (− βζ ¯ S (ζ ), µ (ζ ), ζ [i] f j0 (ζ ) (2.45)

Using the homogeneity property (2.40), this translates into 2n f i j (ζ ) [i j] [ j] [i j] [ j] S ν ν . (ζ ), µ (ζ ), ζ = S (ζ ), µ (ζ ), ζ [i] [i] f i2n j (ζ ) This equation fixes the ζ dependence to be of the form [ j] . S [i j] (ν[i] , µ[ j] , ζ ) = Sˆ [i j] f i−2n ν , µ [i] j

(2.46)

(2.47)


In particular, note that the special case where $\nu^I_{[i]}$ and $\mu^{[i]}_I$ are global sections of $O(2n)$ and $O(2-2n)$,

$$ S^{[ij]}(\nu_{[i]},\mu^{[j]},\zeta) = f_{ij}^{-2n}\,\nu^I_{[i]}\,\mu^{[j]}_I, \tag{2.48} $$

solves the conditions of superconformal invariance. In addition, as in [1], one must impose the reality conditions

$$ \tau\Big( S^{[ij]}(\nu_{[i]},\mu^{[j]},\zeta^{[i]}) \Big) = S^{[\bar\imath\bar\jmath]}(\nu_{[\bar\imath]},\mu^{[\bar\jmath]},\zeta^{[\bar\imath]}). \tag{2.49} $$
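The special case (2.48) can be checked against the gluing conditions (2.32) and the homogeneity requirement (2.40). The sketch below reads (2.40) as saying that $S^{[ij]}$ scales with weight 2 when $\nu$ and $\mu$ scale with weights $2n$ and $2-2n$; the name $s$ for the dilation parameter is a label chosen here, not the notation of the text:

```latex
% s: hypothetical name for the dilation parameter
\begin{align*}
S^{[ij]}\big(s^{2n}\nu_{[i]},\, s^{2-2n}\mu^{[j]},\, \zeta\big)
  &= f_{ij}^{-2n}\,\big(s^{2n}\nu^I_{[i]}\big)\big(s^{2-2n}\mu^{[j]}_I\big)
   = s^{2}\, S^{[ij]}\big(\nu_{[i]},\mu^{[j]},\zeta\big) , \\
\nu^I_{[j]} &= \partial_{\mu^{[j]}_I} S^{[ij]} = f_{ij}^{-2n}\,\nu^I_{[i]} , \\
\mu^{[i]}_I &= f_{ij}^{2}\,\partial_{\nu^I_{[i]}} S^{[ij]} = f_{ij}^{2-2n}\,\mu^{[j]}_I .
\end{align*}
```

The last two lines are the transition rules of global $O(2n)$ and $O(2-2n)$ sections, given that $f_{ij}$ is the transition function of $O(1)$.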

Thus, we conclude that superconformal invariance is guaranteed provided the generating functions $S^{[ij]}(\nu_{[i]},\mu^{[j]},\zeta)$ are functions of $f_{ij}^{-2n}\nu_{[i]}$ and $\mu^{[j]}$, without explicit dependence on $\zeta$, homogeneous of degree 1 when their first and second arguments are scaled with weight $n$ and $1-n$, respectively, and satisfying the reality condition (2.49).

Anomalous $O(0)$ multiplets. In fact, the above conditions are sufficient but not, strictly speaking, necessary. Indeed, we have assumed that the Darboux coordinates are adapted to the superconformal action, in the sense that dilations and $SU(2)$ act canonically as in (2.39) and (2.44), respectively. Clearly, a local gauge transformation depending on $\zeta$ only would not affect the existence of an isometric $SU(2)$ action, but would just make it look more complicated. More importantly, when $n = 1$ (or equivalently $n = 0$, after exchanging $\mu_I$ with $\nu^I$), it is possible that $\mu_I$ transforms anomalously under dilations, namely

[i] 2 µ I[i] = µ[i] I − c I log ,

(2.50)

for some constants c[i] I , which we shall refer to as “anomalous dimensions”. This anomalous transformation may be generated from the standard transformation (2.39) with n = 0 by a local symplectomorphism generated by

[i] I I T [i] = µ˜ [i] I ν[i] − c I ν[i] log ν[i] ,

(2.51)

I . This however need not be a regular gauge transformation where ν[i] is any one of the ν[i]

in the patch Ui , and so the geometry will in general depend non-trivially on c[i] I . After this local symplectomorphism, the generating functions S [i j] are now of the form13 [ j] [ j] [i] I Sˆ [i j] ν[i] , µ I + c I log( f i−2 (2.52) S [i j] = f i−2 j j ν[i] ) − c I ν[i] log ν[i] , where Sˆ [i j] is a homogeneous function of degree one in its first argument. In particular, S [i j] satisfy a “quasi-homogeneity condition” [ j] [ j] [ j] I I S [i j] 2 ν[i] , µ I − c I log 2 , ζ = 2 S [i j] ν[i] , µI , ζ [i] I 2 . (2.53) − f i−2 c ν log j I [i] 13 The Sˆ [i j] appearing in this equation differs from the one in (2.47), the relation between the two being transcendental in general.


Such generating functions are consistent with SU (2) invariance and dilations provided [i] µ[i] I transforms in the same way as −c I log ν[i] , namely as (2.50) under dilations and

f i0 (ζ ) [i] ¯ + α¯ − βζ (2.54) (ζ ) − 2c log µ I[i] (ζ ) = µ[i] I I f i0 (ζ ) under rotations. Anomalous transformations play an important role, e.g., in describing the one-loop correction to the hypermultiplet metric in Sect. 4.2. Note that the constants c[i] I are not arbitrary. Firstly, they must satisfy the reality [i] ∗ [¯ı ] conditions (c I ) = −c I . Besides, they are also subject to additional consistency constraints, which follow from the requirement that the open contours around the logarithmic branch cuts in ζ plane (as discussed in [1]) combine consistently into closed contours. This requires in particular that the anomalous dimensions associated with the patches containing the zeros of ν[i] are real. In this paper we assume that ν[i] has always two first order zeros ζ± in the patches U± related by the antipodal map, and therefore demand [−] that c[+] I = −c I are real constants. For a similar reason, we impose the same condition [∞] on c[0] (see footnote 18). I = −c I For later reference, we give the action of the symplectomorphism generated by (2.52), ˆ [i j] , ν[Ij] = f i−2 j ∂µ[ j] S

I

µ[i] I

= ∂ν I

[i]

Sˆ [i j] − c[i] I log ν[i] + δ I

1

[ j] c ∂ Sˆ [i j] J µ[Jj] ν[i]

−

J ν[i]

c[i] J ν[i]

.

(2.55)

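From $\mathbb{CP}^1_\zeta\times\mathbb{C}^2_\pi$ to $\mathbb{CP}^1_{\mathbf{z}}$. We close this discussion of $SU(2)$ transformations with an important observation, which will be instrumental for understanding the relation between the twistor spaces $\mathcal{Z}_{\mathcal{S}}$ and $\mathcal{Z}_{\mathcal{M}}$. Notice that the isometric $SU(2)$ action on $\mathcal{S}$ corresponds to an $SU(2)$ action on the fiber coordinates $\pi^A, \bar\pi_A$ (2.4), at a fixed position on the QK base $\mathcal{M}$. Thus, any local $O(2n)$ section $\nu_{[i]}$, viewed as a function of $(\zeta, \pi^A, \bar\pi_A)$ and $x^\mu$, satisfies the differential equations

$$ \begin{aligned}
\left[ \partial_\zeta + \bar\pi_1\partial_{\pi^2} - \bar\pi_2\partial_{\pi^1} \right]\big(f_{i0}^{-2n}\nu_{[i]}\big) &= 0, \\
\left[ 2\zeta\partial_\zeta - 2n + \pi^1\partial_{\pi^1} + \pi^2\partial_{\pi^2} - \bar\pi_1\partial_{\bar\pi_1} - \bar\pi_2\partial_{\bar\pi_2} \right]\big(f_{i0}^{-2n}\nu_{[i]}\big) &= 0, \\
\left[ \zeta^2\partial_\zeta - 2n\zeta + \pi^1\partial_{\bar\pi_2} - \pi^2\partial_{\bar\pi_1} \right]\big(f_{i0}^{-2n}\nu_{[i]}\big) &= 0.
\end{aligned} \tag{2.56} $$

It follows that there exists a function $\tilde\nu_{[i]}(\mathbf{z}, x^\mu)$ of the coordinates $x^\mu$ on $\mathcal{M}$ and of the ratio

$$ \mathbf{z} \equiv \frac{\bar\pi_2\,\zeta + \pi^1}{-\bar\pi_1\,\zeta + \pi^2}, \tag{2.57} $$

such that

$$ \nu_{[i]}(\zeta,\pi^A,\bar\pi_A,x^\mu) = f_{i0}^{2n}\,(\pi^2 - \zeta\bar\pi_1)^{2n}\,\tilde\nu_{[i]}(\mathbf{z},x^\mu). \tag{2.58} $$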
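It may help to verify directly that (2.58) solves (2.56). The sketch below assumes $\bar\pi_A \equiv (\pi^A)^*$ with $\pi^A$ and $\bar\pi_A$ treated as independent variables, writes $\mathbf{z}$ for the ratio (2.57), and uses the shorthands $A$, $B$, $g$ and $D_{1,2,3}$, which are labels introduced here:

```latex
% Shorthands (ours): A = \pi^2 - \zeta\bar\pi_1 , B = \pi^1 + \zeta\bar\pi_2 , so \mathbf{z} = B/A ,
% and g = f_{i0}^{-2n}\,\nu_{[i]} = A^{2n}\,\tilde\nu_{[i]}(\mathbf{z},x^\mu) by (2.58).
\begin{align*}
D_1 &= \partial_\zeta + \bar\pi_1\partial_{\pi^2} - \bar\pi_2\partial_{\pi^1} , &
D_1 A &= 0 , & D_1 B &= 0 , \\
D_2 &= 2\zeta\partial_\zeta + \pi^A\partial_{\pi^A} - \bar\pi_A\partial_{\bar\pi_A} , &
D_2 A &= A , & D_2 B &= B , \\
D_3 &= \zeta^2\partial_\zeta + \pi^1\partial_{\bar\pi_2} - \pi^2\partial_{\bar\pi_1} , &
D_3 A &= \zeta A , & D_3 B &= \zeta B .
\end{align*}
% Each D_a is a derivation, so D_a \mathbf{z} = (A\,D_a B - B\,D_a A)/A^2 = 0 , and therefore
\begin{align*}
D_1\, g &= 0 , &
(D_2 - 2n)\, g &= (2n - 2n)\, g = 0 , &
(D_3 - 2n\zeta)\, g &= (2n\zeta - 2n\zeta)\, g = 0 .
\end{align*}
```

The three operators annihilate $\mathbf{z}$ and act on the prefactor $A^{2n}$ with eigenvalues $0$, $2n$ and $2n\zeta$, which is exactly what the inhomogeneous terms in (2.56) subtract.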

For anomalous $O(0)$ sections, the same argument guarantees the existence of a function $\tilde\mu^{[i]}_I(\mathbf{z},x^\mu)$ such that

$$ \mu^{[i]}_I(\zeta,\pi^A,\bar\pi_A,x^\mu) = \tilde\mu^{[i]}_I(\mathbf{z},x^\mu) - c^{[i]}_I\,\log\left[ f_{i0}^2\,(\pi^2 - \zeta\bar\pi_1)^2 \right]. \tag{2.59} $$


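Moreover, under the action of the antipodal map,

$$ \tau(\mathbf{z}) = -1/\mathbf{z}, \qquad \tau(\tilde\nu_{[i]}) = -\tilde\nu_{[\bar\imath]}/\mathbf{z}^{2n}, \qquad \tau(\tilde\mu^{[i]}) = -\tilde\mu^{[\bar\imath]} - 2c^{[\bar\imath]}\log\mathbf{z}. \tag{2.60} $$

The coordinate $\mathbf{z}$ can be viewed as a coordinate on the $\mathbb{CP}^1$ fiber of the twistor space $\mathcal{Z}_{\mathcal{M}}$. After an appropriate $SU(2)$ rotation on the $\mathbb{C}^2/\mathbb{Z}_2$ fiber, we can always assume that the zero and the pole of (2.57) occur at the zeros $\zeta_\pm$ of the singled-out section $\nu^\flat$,

$$ \mathbf{z} = -\frac{1}{\bar z}\,\frac{\zeta - \zeta_+}{\zeta - \zeta_-}, \qquad \zeta_+ = -\frac{\pi^1}{\bar\pi_2}, \qquad \zeta_- = \frac{\pi^2}{\bar\pi_1}, \qquad z = \frac{\pi^1}{\pi^2}. \tag{2.61} $$

In particular, the points $(0, \zeta_+, \zeta_-, \infty)$ in the $\zeta$ plane are mapped to $(z, 0, \infty, -1/\bar z)$ in the $\mathbf{z}$ plane, respectively. Since $\nu_{[i]}$ is assumed to be regular at $\zeta = \zeta_i$, $\tilde\nu_{[i]}(\mathbf{z},x^\mu)$ is regular at the point $\mathbf{z}_i \equiv \mathbf{z}(\zeta_i)$, except when $i = -$, where the factors $(\pi^2 - \zeta\bar\pi_1)$ introduce extra singularities at $\mathbf{z} = \infty$. In the next subsection, we elaborate on these observations and relate the symplectic and contact structures on the twistor spaces $\mathcal{Z}_{\mathcal{S}}$ and $\mathcal{Z}_{\mathcal{M}}$.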
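The correspondence between special points in the $\zeta$ plane and in the plane of the ratio (2.57) can be read off by factorizing the Möbius map. The sketch below writes $\mathbf{z}$ for the ratio (2.57) to keep it apart from $z = \pi^1/\pi^2$, and assumes $\bar\pi_A \equiv (\pi^A)^*$, so that $\bar z = \bar\pi_1/\bar\pi_2$:

```latex
\begin{align*}
\mathbf{z}(\zeta) &= \frac{\bar\pi_2\,\zeta + \pi^1}{-\bar\pi_1\,\zeta + \pi^2}
  = -\frac{\bar\pi_2}{\bar\pi_1}\,
    \frac{\zeta + \pi^1/\bar\pi_2}{\zeta - \pi^2/\bar\pi_1}
  = -\frac{1}{\bar z}\,\frac{\zeta - \zeta_+}{\zeta - \zeta_-} , \\
\mathbf{z}(0) &= \frac{\pi^1}{\pi^2} = z , \qquad
\mathbf{z}(\zeta_+) = 0 , \qquad
\mathbf{z}(\zeta_-) = \infty , \qquad
\mathbf{z}(\infty) = -\frac{\bar\pi_2}{\bar\pi_1} = -\frac{1}{\bar z} , \\
-\frac{1}{\bar\zeta_+} &= \frac{\pi^2}{\bar\pi_1} = \zeta_- .
\end{align*}
```

The last line confirms that $\zeta_+$ and $\zeta_-$ are antipodal, consistent with the assumption that the two zeros of the singled-out section lie in patches exchanged by $\tau$.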

2.5. Homogeneous symplectic vs. contact geometry. Having understood the constraints of superconformal invariance on the transition functions S [i j] , we now explain how the homogeneous symplectic structure on ZS descends to a contact structure on ZM , in effect rederiving the inverse construction of [13]. For definiteness, and since this is the case of most physical interest, we restrict to twistor spaces with n = 1 from this section onward. From homogeneous symplectic to contact. Let us return to (2.33): the term proportional to dζ , usually unspecified in HK geometry, can be computed explicitly in the case of HKC manifolds. Indeed, by differentiating the factors f i j appearing explicitly in (2.52), and integrating by parts, one obtains [ j] [i] I [i] −2 I I d S [i j] − f i−2 j µ I ν[i] = ν[ j] dµ I − f i j ν[i] dµ I [ j] [i] I −2 I + Sˆ [i j] +c I ∂µ[ j] Sˆ [i j] −c[i] ν log ν −µ ν I [i] I [i] d( f i j ). [i] I

(2.62)

Re-expressing $\mu^{[i]}_I$ using (2.55) and using the homogeneity property of $\hat S^{[ij]}$, one concludes that

$$ \mathcal{X}^{[i]} = f_{ij}^2\,\mathcal{X}^{[j]}, \tag{2.63} $$

where $\mathcal{X}^{[i]}$ is the $O(2)$-valued complex Liouville form

$$ \mathcal{X}^{[i]} = \nu^I_{[i]}\,d\mu^{[i]}_I + c^{[i]}_I\,d\nu^I_{[i]}, \tag{2.64} $$

satisfying the reality condition $\tau(\mathcal{X}^{[i]}) = \mathcal{X}^{[\bar\imath]}$.


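Let us now introduce the dilation-invariant $O(0)$ sections$^{14}$

$$ \xi^\Lambda_{[i]} \equiv \frac{\nu^\Lambda_{[i]}}{\nu^\flat_{[i]}}, \qquad \tilde\xi^{[i]}_I \equiv \mu^{[i]}_I + c^{[i]}_I\,\log\nu^\flat_{[i]}, \tag{2.65} $$

where we have singled out one coordinate$^{15}$ $\nu^\flat_{[i]}$, and denoted by $\nu^\Lambda_{[i]}$ the remaining $d$ coordinates. In this trivialization, the Liouville form $\mathcal{X}^{[i]}$ leads to a contact form $\hat{\mathcal{X}}^{[i]}$,

$$ \mathcal{X}^{[i]} = \nu^\flat_{[i]}\,\hat{\mathcal{X}}^{[i]}, \qquad \hat{\mathcal{X}}^{[i]} \equiv d\tilde\xi^{[i]}_\flat + \xi^\Lambda_{[i]}\,d\tilde\xi^{[i]}_\Lambda + c^{[i]}_\Lambda\,d\xi^\Lambda_{[i]}. \tag{2.66} $$

The term linear in $c^{[i]}_I$ may be reabsorbed by defining

$$ \alpha^{[i]} \equiv \tilde\xi^{[i]}_\flat + c^{[i]}_I\,\xi^I_{[i]}, \qquad \hat{\mathcal{X}}^{[i]} = d\alpha^{[i]} + \xi^\Lambda_{[i]}\,d\tilde\xi^{[i]}_\Lambda. \tag{2.67} $$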
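The passage from the Liouville form (2.64) to the contact form (2.66) can be checked term by term. The sketch below suppresses the patch label $[i]$ and writes $\flat$ for the singled-out index and $\Lambda$ for the remaining $d$ values of $I$; these index labels are a notational assumption made here for readability:

```latex
% Assumed notation: I = (\flat, \Lambda); patch label [i] suppressed.
% Inverting (2.65): \nu^\Lambda = \nu^\flat\,\xi^\Lambda , \mu_I = \tilde\xi_I - c_I \log\nu^\flat .
\begin{align*}
\nu^I d\mu_I
  &= \nu^\flat\, d\tilde\xi_\flat + \nu^\flat\,\xi^\Lambda\, d\tilde\xi_\Lambda
     - \left( c_\flat + c_\Lambda\,\xi^\Lambda \right) d\nu^\flat , \\
c_I\, d\nu^I
  &= \left( c_\flat + c_\Lambda\,\xi^\Lambda \right) d\nu^\flat
     + c_\Lambda\,\nu^\flat\, d\xi^\Lambda , \\
\mathcal{X} = \nu^I d\mu_I + c_I\, d\nu^I
  &= \nu^\flat \left( d\tilde\xi_\flat + \xi^\Lambda\, d\tilde\xi_\Lambda
     + c_\Lambda\, d\xi^\Lambda \right)
   = \nu^\flat\, \hat{\mathcal{X}} , \\
\hat{\mathcal{X}}
  &= d\!\left( \tilde\xi_\flat + c_\Lambda\,\xi^\Lambda \right) + \xi^\Lambda\, d\tilde\xi_\Lambda .
\end{align*}
```

The $d\nu^\flat$ terms cancel between the first two lines, which is the reason for the logarithmic shift in the definition of $\tilde\xi_I$. If $\alpha$ is defined with the full sum $c_I\xi^I$ and $\xi^\flat = 1$, it differs from the combination in the last line only by the constant $c_\flat$, which does not affect $d\alpha$.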

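The gluing condition (2.63) becomes

$$ \hat{\mathcal{X}}^{[i]} = \hat f_{ij}^2\,\hat{\mathcal{X}}^{[j]}, \qquad \hat f_{ij}^2 \equiv f_{ij}^2\,\nu^\flat_{[j]}/\nu^\flat_{[i]}, \tag{2.68} $$

while $\hat{\mathcal{X}}$ satisfies the reality condition $\tau(\hat{\mathcal{X}}^{[i]}) = -\hat{\mathcal{X}}^{[\bar\imath]}$. According to the remark at the end of the previous section, $\xi^\Lambda_{[i]}$, $\tilde\xi^{[i]}_I$ and $\hat f_{ij}^2$ are all functions of the $\mathbb{CP}^1$ coordinate $\mathbf{z}$ defined in (2.57) and of the coordinates $x^\mu$ on $\mathcal{M}$,

$$ \xi^\Lambda_{[i]} = \frac{\tilde\nu^\Lambda_{[i]}(x^\mu,\mathbf{z})}{\tilde\nu^\flat_{[i]}(x^\mu,\mathbf{z})}, \qquad \tilde\xi^{[i]}_I = \tilde\mu^{[i]}_I(x^\mu,\mathbf{z}) + c^{[i]}_I\,\log\tilde\nu^\flat_{[i]}(x^\mu,\mathbf{z}), \qquad \hat f_{ij}^2 = \frac{\tilde\nu^\flat_{[j]}(x^\mu,\mathbf{z})}{\tilde\nu^\flat_{[i]}(x^\mu,\mathbf{z})}. \tag{2.69} $$

Thus, the sections $(\xi^\Lambda_{[i]}, \tilde\xi^{[i]}_I)$ provide local complex Darboux coordinates for the complex contact structure $\hat{\mathcal{X}}$ on $\mathcal{Z}_{\mathcal{M}}$. They satisfy the reality conditions

$$ \tau\big(\xi^\Lambda_{[i]}\big) = \xi^\Lambda_{[\bar\imath]}, \qquad \tau\big(\tilde\xi^{[i]}_I\big) = -\tilde\xi^{[\bar\imath]}_I + i\pi\,c^{[\bar\imath]}_I. \tag{2.70} $$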

On the overlap of two patches, the Darboux coordinates are related by contact transformations following directly from (2.55), ˆ [i j] , ξ[j] = fˆi−2 j ∂ξ˜ [ j] S

ξ˜[i] = ∂ξ Sˆ [i j] , [i]

(2.71)

[ j] I ξ˜[i] = Sˆ [i j] − ξ[i] ∂ξ Sˆ [i j] + c I ∂ξ˜ [ j] Sˆ [i j] − c[i] I ξ[i] , [i]

I

and ξ˜ [ j] + c[ j] log fˆ−2 , related to the original where Sˆ [i j] is a general function of ξ[i] ij I I quasi-homogeneous generating function S [i j] via [ j] [ j] ˆ [i j] −2 I ˜ [ j] ˆ S [i j] (ν[i] ν ξ , µ I , ζ ) = f i−2 , ξ + c log( f ) S [i] I j ij [i] I [i] I (2.72) − c I ξ[i] ν[i] log ν[i] , 14 Our notations are related to the ones in [17] via ξ ,NPV = ξ , ξ˜ NPV = −2iξ˜ [0] , α NPV = 4iξ˜ [0] + [0] ξ˜ [0] , where the quantities on the r.h.s. are evaluated at ζ = 0, z = z. 2iξ[0] 15 Equation (2.65) is singular at the zeros of ν , and one should in principle single out a second coordinate [i] 0 and ξ˜ [i] , as ν[i] to cover these patches. Rather than doing so, we allow for poles and logarithmic cuts in ξ[i] I in (2.80) below, in effect trivializing the O(2) bundle over CPz1 .


and it is understood that $\xi^\flat_{[i]} = 1$. In particular, note that the transition functions $\hat f_{ij}^2$ are holomorphic functions on $\mathcal{Z}_{\mathcal{M}}$ given by

$$ \hat f_{ij}^2 = \partial_{\tilde\xi^{[j]}_\flat}\hat S^{[ij]}, \tag{2.73} $$

and are equal to one if and only if $\nu^\flat$ is a global $O(2)$ section.

Recovering the metric from the contact twistor lines. The functions $\xi^\Lambda_{[i]}(\mathbf{z},x^\mu)$ and $\tilde\xi^{[i]}_I(\mathbf{z},x^\mu)$ specify the twistor fiber over each point in $\mathcal{M}$, and are the analogs of the twistor lines $\nu^I_{[i]}(\zeta)$, $\mu^{[i]}_I(\zeta)$ on $\mathcal{S}$. The knowledge of these “contact twistor lines” allows one to reconstruct the Kähler-Einstein metric on $\mathcal{Z}_{\mathcal{M}}$ and the quaternionic-Kähler metric on $\mathcal{M}$, in the following manner. First, identifying $\mathcal{X}^{[0]} = \mathcal{X}$ in (2.10) and using (2.58), the holomorphic contact form in any patch $U_i$ may be written as

$$ \hat{\mathcal{X}}^{[i]} = \frac{\mathcal{X}^{[0]}}{f_{0i}^2\,\nu^\flat_{[i]}} = \frac{1}{2}\left( \frac{\pi^2}{\pi^2 - \zeta\bar\pi_1} \right)^2 \frac{dz + P}{\tilde\nu^\flat_{[i]}(x^\mu,\mathbf{z})}. \tag{2.74} $$

Since $\hat{\mathcal{X}}^{[i]}$ depends on the fiber coordinates $\pi^A, \bar\pi_A, \zeta$ only through the combination $\mathbf{z}$, we may set $\zeta = 0$, $\mathbf{z} = z$ in this expression, and obtain

$$ \frac{dz}{z} + \frac{p_+}{z} - i\,p_3 + p_-\,z = \frac{1}{2}\,e^{-\Phi_{[i]}}\,\hat{\mathcal{X}}^{[i]}, \tag{2.75} $$

where we define the “contact potential”,

$$ e^{-\Phi_{[i]}(x^\mu,z)} \equiv 4\,\tilde\nu^\flat_{[i]}(x^\mu,z)/z, \qquad \tau(\Phi_{[i]}) = \Phi_{[\bar\imath]}. \tag{2.76} $$

Applying (2.24), we conclude that the Kähler potential on $\mathcal{Z}_{\mathcal{M}}$ is given by

$$ K_{\mathcal{Z}_{\mathcal{M}}} = \log\frac{4(1 + z\bar z)}{|z|} + \mathrm{Re}\,\Phi_{[i]}(x^\mu,z) + \log|\hat f_{0i}|^2. \tag{2.77} $$

Since $\hat f_{0i}$ is a holomorphic function, the last term in (2.77) can be absorbed by a Kähler transformation, leading to a Kähler potential valid in the patch $U_i$. In order to derive the metric on $\mathcal{Z}_{\mathcal{M}}$, one could therefore express $z$, $\bar z$ and $\Phi_{[i]}$ in terms of the complex coordinates $(\xi^\Lambda_{[i]}, \tilde\xi^{[i]}_\Lambda, \alpha^{[i]})$ in $\hat U_i$. For the purpose of computing the QK metric on $\mathcal{M}$, this step is unnecessary and it suffices to study the contact twistor lines, as we show below. For later reference, we record the hyperkähler potential which follows from (2.22) using $v^\flat = e^{2\lambda}$,

$$ \chi = 4\,\big|v^\flat\,\hat f_{0i}^2\big|\,\frac{1 + z\bar z}{|z|}\; e^{\mathrm{Re}\,[\Phi_{[i]}(x^\mu,z)]}. \tag{2.78} $$

By comparing (2.23) with (2.75) for $i = 0$, we can also relate the coordinate $\psi$ in (2.14) to the coordinates $v^\flat, z, x^\mu$ on $\mathcal{S}$,

$$ e^{2i\psi} = \sqrt{\frac{v^\flat\,\bar z}{\bar v^\flat\, z}}\; e^{i\,\mathrm{Im}\,[\Phi_{[0]}(x^\mu,z)]}. \tag{2.79} $$


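We now restrict our attention to the patches $\hat U_+$ and $\hat U_-$ around $\mathbf{z} = 0$ and $\mathbf{z} = \infty$, respectively, corresponding to $\zeta = \zeta_+$ and $\zeta = \zeta_-$. Using (2.65), (2.58), (2.59) and $f_{0+} \sim \zeta - \zeta_-$, we find that the contact twistor lines behave near $\mathbf{z} = 0$ as

$$ \begin{aligned}
\xi^\Lambda_{[+]} &= \xi^{\Lambda,-1}_{[+]}\,\mathbf{z}^{-1} + \xi^{\Lambda,0}_{[+]} + \xi^{\Lambda,1}_{[+]}\,\mathbf{z} + O(\mathbf{z}^2), \\
\tilde\xi^{[+]}_\Lambda &= c^{[+]}_\Lambda\log\mathbf{z} + \tilde\xi^{[+]}_{\Lambda,0} + \tilde\xi^{[+]}_{\Lambda,1}\,\mathbf{z} + O(\mathbf{z}^2), \\
\alpha^{[+]} &= c^{[+]}_\flat\log\mathbf{z} + c^{[+]}_\Lambda\,\xi^{\Lambda,-1}_{[+]}\,\mathbf{z}^{-1} + \alpha^{[+]}_0 + \alpha^{[+]}_1\,\mathbf{z} + O(\mathbf{z}^2).
\end{aligned} \tag{2.80} $$

Similarly, near $\mathbf{z} = \infty$,

$$ \begin{aligned}
\xi^\Lambda_{[-]} &= \xi^{\Lambda,-1}_{[-]}\,\mathbf{z} + \xi^{\Lambda,0}_{[-]} + \xi^{\Lambda,1}_{[-]}\,\mathbf{z}^{-1} + O(\mathbf{z}^{-2}), \\
\tilde\xi^{[-]}_\Lambda &= -c^{[-]}_\Lambda\log\mathbf{z} + \tilde\xi^{[-]}_{\Lambda,0} + \tilde\xi^{[-]}_{\Lambda,1}\,\mathbf{z}^{-1} + O(\mathbf{z}^{-2}), \\
\alpha^{[-]} &= -c^{[-]}_\flat\log\mathbf{z} + c^{[-]}_\Lambda\,\xi^{\Lambda,-1}_{[-]}\,\mathbf{z} + \alpha^{[-]}_0 + \alpha^{[-]}_1\,\mathbf{z}^{-1} + O(\mathbf{z}^{-2}),
\end{aligned} \tag{2.81} $$

where the Laurent coefficients are related to those at $\mathbf{z} = 0$ by the reality conditions (2.70). It is also useful to specify the Laurent expansion of the contact potentials,

$$ \Phi_{[+]} = \phi^0_{[+]} + \phi^1_{[+]}\,\mathbf{z} + O(\mathbf{z}^2), \qquad \Phi_{[-]} = \phi^0_{[-]} + \phi^1_{[-]}\,\mathbf{z}^{-1} + O(\mathbf{z}^{-2}), \tag{2.82} $$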

related by the antipodal map [−] = τ ([+] ). For generic choices of contact transformations, we expect16 that similarly to the HK case [22], the moduli space of solutions to the gluing conditions (2.71) and reality conditions (2.70) is of real dimension 4d + 1, and can be parametrized by the lowest Laurent [−] ∗ ,−1 ,−1 ∗ ˜ [+] coefficients ξ[+] = −(ξ[−] ) , ξ,0 = −(ξ˜,0 ) and the real coefficient i(α0[+] +α0[−] ). This parameter space admits a U (1) action induced by phase rotations of z, which can be quotiented out to produce the QK manifold M itself. Expanding the contact form (2.67) for i = ± at z = 0, ∞ and identifying the coefficients of zn on either side of (2.75) allows to extract the SU (2) connection, 0 1 [±] ,−1 ,−1 ˜ [±] , p± = e−φ[±] ξ[±] dξ,0 + c dξ[±] 2 (2.83) 0 i −φ[+] [+] ,0 ˜ [+] ,−1 ˜ [+] 1 p3 = e dα0 + ξ[+] dξ,0 + ξ[+] dξ,1 − iφ[+] p+ , 2 and to express the Laurent coefficients of the contact potentials in terms of the Laurent coefficients of the contact twistor lines, 0 1 ,−1 [±] [±] ,0 ξ˜,1 + c ξ[±] + c[±] , eφ[±] = ± ξ[±] 2 (2.84) 0 1 [±] ,1 ,−1 ˜ [±] ,0 ˜ [±] 1 φ[±] = ± e−φ[±] α1[±] + 2ξ[±] ξ,2 + ξ[±] ξ,1 + c ξ[±] . 2 Via (2.1), one obtains the triplet of quaternionic forms ω M , in particular ωM,3 =

2 (d p3 + 2i p+ ∧ p− ) . ν

(2.85)

16 In the case where M admits d + 1 commuting isometries, or for perturbations thereof, this will be demonstrated in Sects. 3 and 5 below.


As anticipated above, the U (1) action induced by phase rotations of z shifts p3 by a total derivative and acts on p± in opposite ways, so lies in the kernel of ω3 . In order to obtain the metric from ωM,3 , it is still necessary to specify the almost complex structure J3 . This is achieved by expanding the holomorphic one-forms dξ[+] [+] and dξ˜ around z = 0, and projecting them along the base M: I

,−1 = ξ[+] p+ z−2 + V z−1 + O(z0 ) dξ[+] −1 + V˜ I + O(z1 ) dξ˜ I[+] = −c[+] I p+ z

mod Dz,

(2.86)

mod Dz,

where Dz = dz + p+ − i p3 z + p− z2 and ,−1 V ≡ (d − i p3 )ξ[+] ,

[+] [+] V˜ I ≡ dξ˜ I,0 − ξ˜ I,1 p+ + ic[+] I p3 .

(2.87)

(1, 0) forms with respect to the almost complex structure J3 on M can then be obtained and dξ˜ [+] which are regular at z = 0, and setting by forming linear combinations of dξ[+] I z = 0 in the corresponding expressions. Thus, singling the index 0 out of , a basis of (1, 0) forms on M is given by 0,−1 a a,−1 0 a = ξ[+] V − ξ[+] V ,

˜ I = ξ 0,−1 V˜ I + c[+] V 0 , [+] I

(2.88)

where a runs from 1 to d − 1. Note that the (1,0) form p+ is not linearly independent from those, as it satisfies 0 1 [+] 0,−1 ,−1 ˜ [+] a,−1 ˜ ξ[+] 1 − ξ[+] (2.89) . ξ,1 p+ = e−φ[+] ξ[+] a + c 2 Having determined J3 in this way, the QK metric then follows from ω3 (X, Y ) = gM (J3 X, Y ). Of course, the SU (2) connection and almost complex structure can equivalently be obtained by expanding near z = ∞. Before closing this section, let us note that the above discussion simplifies considerably in the special case where ν is a global O(2) section: in this case, the transition functions fˆi2j become equal to one, and the contact potentials i (x µ , z) become independent of z, defining a real function on M. 3. Quaternionic Geometry with Commuting Isometries In this section, we study aspects of the twistor space ZS of a HKC S (of real dimension 4d + 4) with d + 1 commuting tri-holomorphic isometries. As explained in the introduction, this situation arises when S is the Swann bundle of a QK manifold M with d + 1 commuting isometries. 3.1. Tri-holomorphic isometries and superconformal invariance. As explained in [1], the moment maps associated to the d + 1 commuting tri-holomorphic isometries provide d + 1 global O(2) sections, which can be taken as the “position” coordinates ν I (I = , 0, . . . , d − 1) for the holomorphic section . The fact that ν I are global O(2) sections restricts the form of the transition functions S [i j] to [ j]

\[
S^{[ij]}(\nu_{[i]}, \mu^{[j]}, \zeta) = f_{ij}^{-2}\, \nu^I_{[i]}\, \mu^{[j]}_I - \tilde H^{[ij]}(\nu_{[i]}, \zeta),
\tag{3.1}
\]


S. Alexandrov, B. Pioline, F. Saueressig, S. Vandoren

in such a way that, on the overlap of two patches,

\[
\nu^I_{[j]} = f_{ij}^{-2}\, \nu^I_{[i]}, \qquad
\mu^{[i]}_I = \mu^{[j]}_I - f_{ij}^{2}\, \partial_{\nu^I_{[i]}} \tilde H^{[ij]}(\nu_{[i]}, \zeta).
\tag{3.2}
\]

The condition of superconformal invariance (2.47) further restricts $\tilde H^{[ij]}(\nu_{[i]}, \zeta)$ to be of the form

\[
\tilde H^{[ij]}(\nu_{[i]}, \zeta) = f_{ij}^{-2}\, \hat H^{[ij]}(\nu_{[i]}),
\tag{3.3}
\]

where $\hat H^{[ij]}(\nu_{[i]})$ is a homogeneous function of degree one in $\nu_{[i]}$.17 Following [1], we express $\tilde H^{[ij]}$ in terms of the standard O(2) multiplet

\[
\eta^I(\zeta) \equiv \zeta^{-1}\, \nu^I_{[0]}(\zeta) = \frac{v^I}{\zeta} + x^I - \bar v^I \zeta .
\tag{3.4}
\]

Thus, we define (cf. Eq. (3.7) in [1])

\[
H^{[ij]}(\eta^I, \zeta) \equiv \zeta^{-1} f_{0j}^{2}\, \tilde H^{[ij]}(\zeta f_{0i}^{-2}\, \eta, \zeta).
\tag{3.5}
\]

Using (3.3), this reduces to

\[
H^{[ij]}(\eta^I, \zeta) = \hat H^{[ij]}(\eta^I) \equiv H^{[ij]}(\eta^I).
\tag{3.6}
\]

In terms of $H^{[ij]}(\eta)$, the gluing conditions (3.2) simply become

\[
\mu^{[i]}_I = \mu^{[j]}_I - \partial_{\eta^I} H^{[ij]}(\eta).
\tag{3.7}
\]

The consistency conditions on $H^{[ij]}(\eta, \zeta)$ were analyzed in [1], and just need to be restricted to the superconformal case. Thus, we require that

\[
H^{[ji]} = -H^{[ij]}, \qquad H^{[ik]} + H^{[kj]} = H^{[ij]},
\tag{3.8}
\]

subject to the equivalence relation

\[
H^{[ij]} \to H^{[ij]} + G^{[i]} - G^{[j]},
\tag{3.9}
\]

and reality conditions

\[
\tau(H^{[ij]}) = -H^{[\bar\imath \bar\jmath]},
\tag{3.10}
\]

where all quantities are ζ-independent functions of $\eta^I$, homogeneous of degree one. As in [1], we shall abuse notation and define $\hat H^{[ij]}$ away from the overlap $\hat U_i \cap \hat U_j$ (in particular when the two patches do not intersect) using analytic continuation and the second equation in (3.8) to interpolate from $\hat U_i$ to $\hat U_j$.

17 R. Ionas and A. Neitzke have independently shown that the condition that the generalized prepotential is a section of O(2) implies that S is HKC [27].


Anomalous O(0) multiplets. As discussed in the previous section, it is possible to relax the homogeneity condition (2.47) into the “quasi-homogeneity” condition (2.52). In this case, $\tilde H^{[ij]}$ is restricted to be of the form

\[
\tilde H^{[ij]}(\nu_{[i]}, \zeta) = f_{ij}^{-2} \left[ \hat H^{[ij]}(\nu_{[i]}) + c^{[i]}_I\, \nu^I_{[i]} \log \nu^\flat_{[i]} - c^{[j]}_I\, \nu^I_{[i]} \log\left( f_{ij}^{-2}\, \nu^\flat_{[i]} \right) \right],
\tag{3.11}
\]

where $\hat H^{[ij]}(\nu_{[i]})$ is again homogeneous of degree one in its argument. Defining $H^{[ij]}$ as in (3.5), we find

\[
H^{[ij]}(\eta, \zeta) = \hat H^{[ij]}(\eta) + c^{[i]}_I\, \eta^I \log\left( \zeta f_{0i}^{-2}\, \eta^\flat \right) - c^{[j]}_I\, \eta^I \log\left( \zeta f_{0j}^{-2}\, \eta^\flat \right).
\tag{3.12}
\]

The explicit dependence on ζ may be removed by a local symplectomorphism

\[
G^{[i]} = -c^{[i]}_I\, \eta^I \log\left( \zeta f_{0i}^{-2} \right) + c^{[0]}_I\, \eta^I \log \zeta,
\tag{3.13}
\]

where the second, i-independent term was added to ensure regularity in the patches i = 0 and i = ∞.18 After this gauge transformation, we find

\[
H^{[ij]}(\eta, \zeta) = \hat H^{[ij]}(\eta) + c^{[ij]}_I\, \eta^I \log \eta^\flat \equiv H^{[ij]}(\eta),
\tag{3.14}
\]

where $c^{[ij]}_I \equiv c^{[i]}_I - c^{[j]}_I$, while the momentum coordinate $\mu^{[i]}_I$ is replaced by

\[
\mu^{[i]}_{T;I} = \mu^{[i]}_I + c^{[i]}_I \log\left( \zeta f_{0i}^{-2} \right) - c^{[0]}_I \log \zeta .
\tag{3.15}
\]

Note that (3.14) is no longer a homogeneous function of η I , but rather satisfies the quasi-homogeneity condition [i j] η I ∂η I − 1 H [i j] = c I η I . (3.16) Complex contact structure. As in Sect. 2.5, using the (quasi)-homogeneity property of the transition functions H [i j] , one may reduce the complex symplectic structure on ZS to a complex contact structure on ZM . One should only be careful that due to the gauge transformation (3.13), the anomalous O(0) sections satisfy [i] [i] A µ µ 2 2 , x ) = µ µ[i] − c[0i] (ζ, π , π ¯ ˜ (z, x ) − c log (π − ζ π ¯ ) 1 A I I log ζ, (3.17) T ;I T ;I rather than (2.59), while ξ˜ I[i] (z, x µ ) defined in (2.65) becomes [i] [0] ξ˜ I[i] ≡ µ[i] T ;I (ζ ) + c I log η + c I log ζ.

(3.18)
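The quasi-homogeneity condition (3.16), $(\eta^I \partial_{\eta^I} - 1) H^{[ij]} = c^{[ij]}_I \eta^I$, can be checked numerically for a transition function of the form (3.14). The sketch below is only an illustration: the homogeneous function `H_hat`, the coefficients `c`, and the sample point `eta` are arbitrary choices made here, not data from the text, and the Euler operator is approximated by a central difference in an overall scaling parameter.

```python
import math

def euler_minus_one(H, eta, h=1e-6):
    """Approximate (eta^I d/d eta^I - 1) H at the point eta.

    Uses d/d lambda H(lambda * eta) at lambda = 1, which equals eta^I dH/d eta^I.
    """
    scaled = lambda lam: H([lam * e for e in eta])
    return (scaled(1 + h) - scaled(1 - h)) / (2 * h) - H(eta)

# Hypothetical sample data: eta = (eta^flat, eta^0, eta^1) and coefficients c_I.
c = [0.7, -0.3, 0.2]
eta = [1.3, 0.8, 2.1]

def H_hat(e):
    # An arbitrary function homogeneous of degree one in its arguments.
    return e[1] ** 2 / e[0] + math.sqrt(e[1] * e[2])

def H(e):
    # H = H_hat + c_I eta^I log(eta^flat), the quasi-homogeneous form (3.14).
    return H_hat(e) + sum(ci * ei for ci, ei in zip(c, e)) * math.log(e[0])

lhs = euler_minus_one(H, eta)                 # (eta.d_eta - 1) H
rhs = sum(ci * ei for ci, ei in zip(c, eta))  # c_I eta^I
homogeneous_part = euler_minus_one(H_hat, eta)
```

Here `homogeneous_part` vanishes up to discretization error, so the anomaly in `lhs` comes entirely from the logarithmic term.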

Using the fact that fˆi j = 1 when ν is a globally defined O(2) section, the transition function (3.1) with H [i j] as in (3.14) then leads to [ j] ˜ [ j] ˜ [ j] ξ − Hˆ [i j] (ξ[i] , ξ I ) = ξ˜ + ξ[i] ). Sˆ [i j] (ξ[i] 18 This is the place where the additional reality condition c[0] = −c[∞] becomes necessary. I I

(3.19)


The section ξ ≡ ξ[j] is globally well defined, and therefore takes the form ξ = z−1 Y+ + A − z Y− ,

(3.20)

where A is real and (Y+ )∗ = Y− . The vector (2 Re (Y+ ), 2 Im (Y+ ), A ) is in fact the generalized moment map for the translational isometry along A , as defined in [26]. The relation between A , Y+ and x I , v I will be discussed in Sect. 3.3 below. On the other hand, the sections ξ˜ I[i] are defined only in the patch Ui , and are related on the overlap of two patches by the complex contact transformation [ j] ξ˜[i] = ξ˜ − ∂ξ Hˆ [i j] (ξ ),

[ j] [i j] ξ˜[i] = ξ˜ − Hˆ [i j] (ξ ) + ξ ∂ξ Hˆ [i j] (ξ ) − c I ξ I .

(3.21)

[i j]

It should be noted that the term proportional to c in this expression disappears when ξ˜ [i] is traded for α [i] as in (2.67),

α [i] = α [ j] − Hˆ [i j] (ξ ) + ξ ∂ξ Hˆ [i j] (ξ ).

(3.22)

Lagrangian, hyperkähler potential and twistor lines. As explained in [1], the transition functions $H^{[ij]}(\eta)$ determine the holomorphic symplectic structure of $Z_S$, and therefore a HK metric on S. The latter can be computed from the “Lagrangian”, a function of the components $v^I, x^I, \bar v^I$ of $\eta^I$ defined by the contour integral

\[
\mathcal L = \sum_j \oint_{C_j} \frac{d\zeta}{2\pi i\, \zeta}\, H^{[0j]}(\eta(\zeta)),
\tag{3.23}
\]

where the contours $C_j$ encircle the centered disks $U_j$ in the complex ζ-plane. Note that due to the consistency conditions (3.8), the index 0 on the right-hand side of this expression may be substituted with any other value without changing the result. A Kähler potential for the HK metric on S is then obtained by Legendre transformation with respect to $x^I$ [22],

\[
\chi(v^I, w_I, \bar v^I, \bar w_I) = \mathcal L - x^I \partial_{x^I} \mathcal L, \qquad \partial_{x^I} \mathcal L = w_I + \bar w_I .
\tag{3.24}
\]

As shown in [1], the “momentum” coordinates µ[i] I (ζ ) are given by a single expression valid for all patches i, dζ ζ + ζ i [i] µT ;I (ζ ) = I + ∂η I H [0 j] (η(ζ )), (3.25) 2 C j 2π i ζ 2(ζ − ζ ) j

provided ζ lies in the open disk Ui . In particular, it is manifestly regular in Ui . The coordinates I ≡ −i(w I − w¯ I ) correspond to overall additive constants unconstrained by (3.7), and are adapted to the tri-holomorphic isometries ∂ I of S. We now discuss the homogeneity and SU (2) transformation properties of L and χ . Taking into account the quasi-homogeneous property (3.16), we readily find the scaling relation I 2 L 2 v I , 2 v¯ I , 2 x I = 2 L(v I , v¯ I , x I ) − 2c[0] . (3.26) x log I


On the other hand, the hyperkähler potential satisfies 2 2 2 I I ¯ I −c[0] ¯ I ). (3.27) χ 2 v I , 2 v¯ I , w I −c[0] I log , w I log = χ (v , v¯ , w I , w The SU (2) action on v I and w I can be obtained from (A.5) and (A.6), in Appendix A, leading to δv I = i3 v I + + x I , δ v¯ I = −i3 v¯ I + − x I , ¯ I = − Lv¯ I + i 3 c[0] δw I = + Lv I − i 3 c[0] I , δw I , while the real combinations x I and I transform as

δx I = −2 + v¯ I + − v I , δ I = −i + Lv I − − Lv¯ I − 23 c[0] I .

(3.28)

(3.29)

Note that in the quasi-homogeneous case, w I has anomalous transformations under dilations and U (1) transformations [28], compared to the transformations found in [4]. The anomalous terms can be removed by defining wˆ I ≡ w I + c[0] I log v . It is instructive to check explicitly that χ is SU (2) invariant. Keeping only the homogeneous term in (3.14), one may rewrite dζ ˆ [0 j] I H (η ) − x I ∂η I Hˆ [0 j] (η I ) χˆ = C j 2π i ζ j

ˆ [0 j] v dζ H − v¯ ζ = ζ η C j 2π i ζ j

∂η Hˆ [0 j] v x − x v + (x v¯ − v¯ x )ζ + . (3.30) ζ η Integrating the round bracket in the first term by parts, a short computation establishes dζ η ∂η Hˆ [0 j] dζ ∂η Hˆ [0 j] − ( r · r ) , (3.31) χˆ = (r )2 (η )2 η C j 2π iζ C j 2π iζ j

j

where

\[
\vec r^{\,I} \cdot \vec r^{\,J} = x^I x^J + 2 v^I \bar v^J + 2 \bar v^I v^J
\tag{3.32}
\]

is the inner product of the 3-vectors $\vec r^{\,I} = \left( 2\,{\rm Re}(v^I),\, 2\,{\rm Im}(v^I),\, x^I \right)$ associated to the O(2) multiplets $\eta^I$. Each term is the product of an SU(2) invariant quantity times a contour integral of an O(−2) section, and so (according to a general argument discussed at the end of Appendix A) is SU(2) invariant. In the quasi-homogeneous case, the same line of argument combined with the contour deformations discussed in Sect. 3.4 of [1] leads to

\[
\chi = \hat\chi + c^{[+-]}_I\, \frac{\vec r^{\,\flat} \cdot \vec r^{\,I}}{r^\flat},
\tag{3.33}
\]

where $c^{[+-]}_I$ denotes the quasi-homogeneity coefficient relating the patches around the two roots of $\eta^\flat(\zeta)$. Thus χ is SU(2) invariant, and therefore equal to the hyperkähler potential on S. This concludes the proof that transition functions of the form (3.14) indeed lead to a HKC metric on S.
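As a small sanity check of (3.32), the following sketch compares the combination $x^I x^J + 2 v^I \bar v^J + 2 \bar v^I v^J$ with the ordinary Euclidean dot product of the 3-vectors $(2\,{\rm Re}\, v, 2\,{\rm Im}\, v, x)$. The function names and sample values are illustrative assumptions introduced here.

```python
def triplet(v, x):
    """3-vector (2 Re v, 2 Im v, x) associated to an O(2) multiplet with components v, x."""
    return (2 * v.real, 2 * v.imag, x)

def dot_formula(v_i, x_i, v_j, x_j):
    """Right-hand side of (3.32); the imaginary parts cancel, so only the real part is kept."""
    value = x_i * x_j + 2 * v_i * v_j.conjugate() + 2 * v_i.conjugate() * v_j
    return value.real

def dot_euclid(v_i, x_i, v_j, x_j):
    """Euclidean dot product of the two 3-vectors."""
    return sum(p * q for p, q in zip(triplet(v_i, x_i), triplet(v_j, x_j)))

# Hypothetical sample multiplets.
v_a, x_a = 0.6 + 0.8j, 0.5
v_b, x_b = -0.3 + 1.1j, 0.9
```

Taking both multiplets equal reproduces $(r^I)^2 = (x^I)^2 + 4 v^I \bar v^I$, which is the squared length that enters (3.33) for $I = \flat$.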


3.2. Superconformal quotient. In this subsection and the following one, we perform the superconformal quotient explicitly for a general HKC S described by the formalism of the previous subsection. We start by constructing a convenient set of coordinates $x^\mu, \pi^A, \bar\pi_A$ on $M \times \mathbb C^2/\mathbb Z_2$, in terms of the complex coordinates $\nu^I(\zeta), \mu_I(\zeta)$ on the Swann bundle S in an arbitrary complex structure ζ. In the next subsection, we find the reciprocal change of variables, and determine the twistor lines.

The real coordinates $x^\mu$ on M are characterized by their invariance under the scaling and isometric SU(2) actions on $Z_S$. Instead, the coordinates $\pi^A, \bar\pi_A$ should transform as a pair of doublets under SU(2), and have a squared norm dictated by the hyperkähler potential, $\pi^A \bar\pi_A = \chi$. These constraints do not determine the coordinates $x^\mu, \pi^A, \bar\pi_A$ uniquely. In the case of QK spaces obtained by the classical and quantum corrected c-map, studied in [17,28], it was convenient to choose coordinates $x^\mu$ adapted to the action of a 2d + 1 dimensional Heisenberg group of isometries. In the general O(2) case, the only isometries are the d + 1 abelian shift symmetries, and there is no such “canonical” choice. Our construction below is tailored to reproduce the results of [17,28] for c-map spaces, as we illustrate later in Sect. 4. It also follows from considerations in contact geometry, as discussed in greater generality in Sect. 5.2.

We start by singling out two multiplets $\eta^\flat, \eta^0$ of the d + 1 O(2) multiplets $\eta^I$, and denote by $\eta^a$, a = 1, . . . , d − 1, the remaining ones. The zeros of $\nu^\flat_{[0]} = \zeta \eta^\flat$ are now

\[
\zeta_\pm = \frac{x^\flat \mp r^\flat}{2 \bar v^\flat}, \qquad
r^\flat = \sqrt{(x^\flat)^2 + 4 v^\flat \bar v^\flat} .
\tag{3.34}
\]
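The root formula (3.34) can be verified directly: with $\eta^\flat = v^\flat/\zeta + x^\flat - \bar v^\flat \zeta$ as in (3.4), the combination $\zeta \eta^\flat$ is the quadratic $v^\flat + x^\flat \zeta - \bar v^\flat \zeta^2$. The sketch below, with hypothetical sample values, checks that both $\zeta_\pm$ are zeros and that they are exchanged by the antipodal map $\zeta \to -1/\bar\zeta$.

```python
import math

def zeta_eta(v, x, zeta):
    """zeta * eta(zeta) = v + x*zeta - conj(v)*zeta**2 for an O(2) multiplet."""
    return v + x * zeta - v.conjugate() * zeta ** 2

def roots(v, x):
    """Return (zeta_plus, zeta_minus, r) as given in (3.34)."""
    r = math.sqrt(x * x + 4 * abs(v) ** 2)
    v_bar = v.conjugate()
    return (x - r) / (2 * v_bar), (x + r) / (2 * v_bar), r

# Hypothetical sample values for v^flat and x^flat.
v_flat, x_flat = 0.6 + 0.8j, 0.5
zeta_plus, zeta_minus, r_flat = roots(v_flat, x_flat)
```

The product of the two roots is $-v^\flat/\bar v^\flat$, a pure phase up to sign, which is why the two zeros are antipodal on the Riemann sphere.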

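As explained below (A.9), SU(2)-invariant quantities can be constructed by contour-integrating O(−2) sections on S. The simplest example is

\[
\oint_{C_+} \frac{d\zeta}{2\pi i\, \zeta\, \eta^\flat} = \frac{1}{r^\flat},
\tag{3.35}
\]

which recovers the SU(2) invariant $r^\flat$, homogeneous of degree one under dilations. Other convenient SU(2) and dilation invariants are given by

\[
\begin{aligned}
A^I &\equiv r^\flat \oint_{C_+} \frac{d\zeta}{2\pi i\, \zeta}\, \frac{\eta^I}{(\eta^\flat)^2} = \frac{\vec r^{\,\flat} \cdot \vec r^{\,I}}{(r^\flat)^2},\\
Z^\Lambda &\equiv r^\flat \oint_{C_+} \frac{d\zeta}{2\pi i\, \zeta\, \eta^\flat}\, \frac{\eta^\Lambda}{\eta^0} = \frac{\eta^\Lambda_+}{\eta^0_+}, \qquad
\bar Z^\Lambda \equiv -r^\flat \oint_{C_-} \frac{d\zeta}{2\pi i\, \zeta\, \eta^\flat}\, \frac{\eta^\Lambda}{\eta^0} = \frac{\eta^\Lambda_-}{\eta^0_-},
\end{aligned}
\tag{3.36}
\]

and, when $\mu^{[i]}_I$ is a non-anomalous O(0) local section,

\[
B_I \equiv -i\, r^\flat \oint_{C_+} \frac{d\zeta}{2\pi i\, \zeta}\, \frac{\mu^{[+]}_I}{\eta^\flat} + i\, r^\flat \oint_{C_-} \frac{d\zeta}{2\pi i\, \zeta}\, \frac{\mu^{[-]}_I}{\eta^\flat} = -i \left( \mu^+_I + \mu^-_I \right).
\tag{3.37}
\]

Here $C_\pm$ denote the contours containing $\zeta_\pm$, $\mu^{[\pm]}_I$ are the multiplets which are regular in the patch containing $\zeta_\pm$, $\eta^\Lambda_\pm = \eta^\Lambda(\zeta_\pm)$, and $\mu^\pm_I = \mu^{[\pm]}_I(\zeta_\pm)$. Moreover, an additional invariant can be constructed out of the hyperkähler potential itself,

\[
e^\phi \equiv \frac{\chi}{4 r^\flat} .
\tag{3.38}
\]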
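The contour integrals (3.35) and the first line of (3.36) are simple residue computations, and can be checked numerically. In the sketch below, which rests on sample values chosen here for illustration, the integrands are rewritten using $\zeta \eta^I = v^I + x^I \zeta - \bar v^I \zeta^2$, so that $d\zeta/(\zeta \eta^\flat) = d\zeta / P_\flat(\zeta)$ and $d\zeta\, \eta^I / (\zeta (\eta^\flat)^2) = d\zeta\, P_I / P_\flat^2$. The contour $C_+$ is taken to be a small circle around $\zeta_+$ that excludes $\zeta_-$.

```python
import cmath
import math

def poly(v, x, zeta):
    """P(zeta) = zeta * eta(zeta) = v + x*zeta - conj(v)*zeta**2."""
    return v + x * zeta - v.conjugate() * zeta ** 2

def contour_integral(f, center, radius, n=400):
    """Approximate (1/(2 pi i)) * integral of f over a counter-clockwise circle.

    Uses the trapezoidal rule in the angle: d zeta = i * radius * e^{i theta} d theta.
    """
    total = 0j
    for k in range(n):
        phase = cmath.exp(2j * math.pi * k / n)
        total += f(center + radius * phase) * radius * phase
    return total / n

# Hypothetical sample multiplets (flat one and a generic one labelled I).
v_flat, x_flat = 0.6 + 0.8j, 0.5
v_I, x_I = -0.3 + 1.1j, 0.9

r_flat = math.sqrt(x_flat ** 2 + 4 * abs(v_flat) ** 2)
zeta_plus = (x_flat - r_flat) / (2 * v_flat.conjugate())
radius = 0.25 * r_flat / abs(v_flat)  # a quarter of |zeta_plus - zeta_minus|

# Left-hand side of (3.35), to be compared with 1 / r_flat.
inv_r = contour_integral(lambda z: 1 / poly(v_flat, x_flat, z), zeta_plus, radius)

# Contour-integral form of A^I in (3.36), to be compared with (r_flat . r_I) / r_flat**2.
A_I = r_flat * contour_integral(
    lambda z: poly(v_I, x_I, z) / poly(v_flat, x_flat, z) ** 2, zeta_plus, radius
)
dot = x_flat * x_I + 4 * (v_flat * v_I.conjugate()).real  # inner product (3.32)
```

The same routine with the circle centered at $\zeta_-$ gives $-1/r^\flat$ for the first integral, which accounts for the relative sign between the $C_+$ and $C_-$ terms in (3.36) and (3.37).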


As we shall see, the 4d variables $x^\mu = \{\phi, Z^a, \bar Z^a, A^\Lambda, B_I\}$ provide a convenient coordinate system on M. In particular, the coordinates $B_I$ correspond to the directions along the d + 1 isometries.

In the quasi-homogeneous case, (3.37) is no longer SU(2)-invariant. One may imagine replacing $\mu^{[+]}_I$ by $\tilde\xi^{[+]}_I$, which is non-anomalous; however, this quantity is singular in the patch $U_+$, as apparent from (2.80). The logarithmic singularity can be cancelled without affecting the SU(2) transformation properties by adding $c^{[+]}_I \log(\eta^0/\eta^\flat)$, which leads us to define19

\[
B_I \equiv -i\, r^\flat \oint_{C_+} \frac{d\zeta}{2\pi i\, \zeta\, \eta^\flat} \left( \mu^{[+]}_{T;I} + c^{[+]}_I \log \eta^0 + c^{[0]}_I \log \zeta \right) - (+ \leftrightarrow -).
\tag{3.39}
\]

It is important to note that there exists another manifestly SU(2) and dilation invariant quantity,

\[
R \equiv \frac{|\vec r^{\,\flat} \times \vec r^{\,0}|}{2 (r^\flat)^2} = \frac{|v^\flat \eta^0_+|}{(r^\flat)^2},
\tag{3.40}
\]

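where × denotes the cross product of vectors in $\mathbb R^3$. As we shall see shortly, it is R rather than φ which appears most naturally in the general formulae (3.50) for the twistor lines. Note that R vanishes when the zeros of $\eta^\flat$ and $\eta^0$ collide. As far as the coordinates on the fiber $\pi^A, \bar\pi_A$ are concerned, the reasoning below (A.9) shows that SU(2) doublets can be constructed by contour-integrating O(−3) sections. Thus, it is natural to consider20

\[
\begin{aligned}
\pi^1 &= C \oint_{C_+} \frac{d\zeta}{2\pi i\, \eta^\flat \sqrt{\zeta \eta^0}} = \frac{C}{r^\flat} \sqrt{\frac{\zeta_+}{\eta^0_+}}, &
\pi^2 &= \bar C \oint_{C_-} \frac{d\zeta}{2\pi i\, \eta^\flat \sqrt{-\zeta \eta^0}} = -\frac{\bar C}{r^\flat} \sqrt{-\frac{\zeta_-}{\eta^0_-}},\\
\bar\pi_1 &= -\bar C \oint_{C_-} \frac{d\zeta}{2\pi i\, \zeta\, \eta^\flat \sqrt{-\zeta \eta^0}} = \frac{\bar C}{r^\flat} \frac{1}{\sqrt{-\zeta_- \eta^0_-}}, &
\bar\pi_2 &= -C \oint_{C_+} \frac{d\zeta}{2\pi i\, \zeta\, \eta^\flat \sqrt{\zeta \eta^0}} = -\frac{C}{r^\flat} \frac{1}{\sqrt{\zeta_+ \eta^0_+}},
\end{aligned}
\tag{3.41}
\]

where the proportionality constant C can be chosen to be real, and adjusted such that (2.20) is obeyed. This gives the SU(2) invariant

\[
C = 2\, r^\flat\, e^{\phi/2} \sqrt{|v^\flat \eta^0_+|} .
\tag{3.42}
\]

Equation (2.14) may now be rewritten as in [17],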

1 √ 2 v¯ η¯ +0 π z φ/2 = 2 e v , , z = ζ 1 + π2 v η+0 z− 2

e4iψ =

v z¯ . v¯ z

(3.43)

19 This definition arises naturally from the general procedure explained in Sect. 5.2. 20 The contour integrals given below suffer from ambiguities in the choice of square root branches. This is inherent to the fact that the fiber of the Swann bundle is C2 /Z2 . We choose the branch cuts in such a way that the reality condition π¯ A = (π A )∗ is obeyed, and (π A , A B π¯ B ) transform as SU (2) doublets.


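The following further relations are also often useful:

\[
\zeta_+ = -|z| \sqrt{\frac{v^\flat}{\bar v^\flat}}, \qquad
\zeta_- = \frac{1}{|z|} \sqrt{\frac{v^\flat}{\bar v^\flat}}, \qquad
r^\flat = \frac{|v^\flat|}{|z|} (1 + z \bar z), \qquad
x^\flat = \frac{|v^\flat|}{|z|} (1 - z \bar z).
\tag{3.44}
\]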
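The relations (3.44) can be cross-checked against (3.34). The sketch below reads them as $\zeta_+ = -|z| \sqrt{v^\flat/\bar v^\flat}$, $\zeta_- = |z|^{-1} \sqrt{v^\flat/\bar v^\flat}$, $r^\flat = |v^\flat| (1 + z\bar z)/|z|$ and $x^\flat = |v^\flat| (1 - z\bar z)/|z|$; this reading, the identification $|z| = |\zeta_+|$, and the branch $\sqrt{v^\flat/\bar v^\flat} = v^\flat/|v^\flat|$ are assumptions made for the purpose of the check, as are the sample values.

```python
import math

# Hypothetical sample values for v^flat and x^flat.
v_flat, x_flat = 0.9 - 0.4j, -0.7

# Roots and r^flat from (3.34).
r_flat = math.sqrt(x_flat ** 2 + 4 * abs(v_flat) ** 2)
zeta_plus = (x_flat - r_flat) / (2 * v_flat.conjugate())
zeta_minus = (x_flat + r_flat) / (2 * v_flat.conjugate())

# Assumed branch of sqrt(v/conj(v)): the unit phase v/|v|, whose square is v/conj(v).
phase = v_flat / abs(v_flat)
mod_z = abs(zeta_plus)  # |z|, read off from the first relation

# Right-hand sides of the four relations.
zeta_plus_rel = -mod_z * phase
zeta_minus_rel = phase / mod_z
r_rel = abs(v_flat) * (1 + mod_z ** 2) / mod_z
x_rel = abs(v_flat) * (1 - mod_z ** 2) / mod_z
```

Only the modulus of z enters these four relations; its phase is fixed separately through (3.43).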

Finally, we note that the reduced O(2) global section ν˜ [i] defined in (2.58) takes the simple form

ν˜ [i] =

1 −φ e z. 4

(3.45)

Comparing with (2.76), we see that the contact potential $\Phi_{[+]}$ is real, and coincides with the invariant φ defined in (3.38). This is also apparent from (2.78), using the third equation in (3.44) and the fact that $\hat f_{ij} = 1$ in O(2) geometries.

3.3. Contact twistor lines. We now consider the converse problem, of determining the complex coordinates $\nu^I, \mu_I$ on S in terms of the coordinates $x^\mu$ on the QK base M and of the coordinates $\pi^A$ on the $\mathbb C^2$ fiber. This is known in mathematics as “parametrizing the twistor lines”. In view of the discussion in Sect. 2.5, this is equivalent to expressing the “contact twistor lines” $(\xi^\Lambda, \tilde\xi^{[i]}_I)$ in terms of the coordinates $(z, x^\mu)$ on $Z_M$.

Let us consider first $\xi^\Lambda(z, x^\mu)$. As explained in (2.80), $\xi^\Lambda$ (equal to $\xi^\Lambda_{[i]}$ for any i) admits a single pole at z = 0 and z = ∞. The coefficients of the Laurent expansion of $\xi^\Lambda(z, x^\mu)$ around z = 0 can be extracted from the contour integral

\[
\xi^{\Lambda,k} = \oint_0 \frac{dz}{2\pi i\, z^{k+1}}\, \xi^\Lambda(z, x^\mu) = r^\flat \oint_{C_+} \frac{d\zeta}{2\pi i\, \zeta}\, \frac{\eta^\Lambda}{(\eta^\flat)^2}\, z^{-k},
\tag{3.46}
\]

where we used

\[
\frac{dz}{2\pi i\, z} = r^\flat\, \frac{d\zeta}{2\pi i\, \zeta\, \eta^\flat} .
\tag{3.47}
\]

Equation (3.46) vanishes for k ≤ −2 and k ≥ 2, as can be seen by deforming the contour around $\zeta_+$ to a contour around $\zeta_-$. For k = 0, one immediately recovers the first quantity in (3.36). For k = −1, one may decompose

\[
\xi^{\Lambda,-1} = r^\flat \oint_{C_+} \frac{d\zeta}{2\pi i\, \zeta\, \eta^\flat}\, \frac{\eta^\Lambda}{\eta^0} \left[ \frac{z\, \eta^0}{\eta^\flat} \right] = R\, Z^\Lambda,
\tag{3.48}
\]

where $Z^0 \equiv 1$, and we used the fact that the term in brackets is regular at $\zeta = \zeta_+$ and equal at that point to the invariant R defined in (3.40). For k = 1, a similar argument leads to $\xi^{\Lambda,1} = -R \bar Z^\Lambda$. As a side product, we obtain a contour integral representation for the quantity (3.40),

\[
R = r^\flat \oint_{C_+} \frac{d\zeta}{2\pi i\, \zeta}\, \frac{\eta^0}{(\eta^\flat)^2}\, z = r^\flat \oint_{C_-} \frac{d\zeta}{2\pi i\, \zeta}\, \frac{\eta^0}{(\eta^\flat)^2}\, z^{-1} .
\tag{3.49}
\]

Thus, we conclude that the contact twistor line is parametrized by

\[
\xi^\Lambda(z, x^\mu) = A^\Lambda + R \left( z^{-1} Z^\Lambda - z \bar Z^\Lambda \right),
\tag{3.50}
\]


so that Y+ = R Z in (3.20). Setting ζ = 0, z = z in this expression allows to express the complex coordinate ξ on ZM in terms of the coordinates on the base and on the CP 1 fiber. For ξ˜ I[i] defined in (2.65), and assuming c[i] I = 0 for simplicity, one may eliminate I in (3.25) in favor of B I , and use the identity dζ 2π iζ

ζ + ζ dz z + z 1 ζ+ + ζ 1 ζ− + ζ = − , − ζ −ζ 2 ζ − ζ+ 2 ζ − ζ− 2π i z z − z

(3.51)

where z is obtained from (3.47) by replacing ζ → ζ , z → z . This gives ξ˜ I[0] (z, x µ ) =

i 1 BI + 2 2 j

C˜ j

dz z + z [0 j] H (ξ(z )), 2π i z z − z I

(3.52)

[0 j]

where C˜ i is the image of the contour Ci in the z plane, and H I (ξ ) ≡ ∂η I H [0 j] (η I ). In the quasi-homogeneous case, similar manipulations (explained in greater generality in Sect. 5.2) lead to dz z + z i 1 1 [+−] B + ∂ξ Hˆ [0 j] (ξ(z )) + c log z, 2 2 2 C˜ j 2π i z z − z j [0 j] 1 dz z + z ˆ i 1 ˆ + cI ξ I H − ξ = B + ∂ + c[+−] log z. H ξ z − z 2 2 2π i z 2 ˜ Cj

ξ˜[0] = ξ˜[0]

j

(3.53) Again, by substituting ζ = 0, z = z in these expressions, one may obtain the complex coordinate ξ˜ I on ZM in terms of the fiber coordinate z and the coordinates on M. As in the case of (3.25), the r.h.s. of (3.53) gives the contact twistor line ξ˜ I[i] in any patch Ui , provided z is chosen to lie in the corresponding patch (moreover, as in (3.25), one may replace the superscript [0 j] with any [k j] without changing the result). Indeed, one may check that the discontinuity of the r.h.s. of (3.53) across the contours C˜ i precisely implements the contact transformations given in (3.21). Another important remark is that, due to the fact that the argument ξ of Hˆ [0 j] has a pole at z = 0, the integrals appearing in (3.53) need not be regular at z = 0: we shall see an example of this phenomenon in (4.23) below. For what concerns ξ˜ I[+] however, the integrals are regular and the first Laurent coefficients needed to find the SU (2) connection and the quaternionic-Kähler metric are readily extracted: [+] ξ˜,0 =

dz dz [+] ∂ξ Hˆ [0 j] , ξ˜,1 = ∂ Hˆ [0 j] , 2 ξ 2π i z 2π i z ˜ ˜ Cj Cj j j (3.54) dz i 1 [0 j] [0 j] ˆ ˆ H . = B + − ξ ∂ξ H 2 2 C˜ j 2π i z

i 1 B + 2 2 α0[+]

j


From these expressions it is easy to find the contact potentials using (2.84), dz 1 1 I Z ∂ξ Hˆ [0 j] (ξ (z)) + c[+] R I A , 2 2 2π iz 2 ˜ Cj j dz ¯ 1 1 I =− R Z ∂ξ Hˆ [0 j] (ξ (z)) − c[−] I A , 2 2π i 2 ˜ Cj

e[+] = e

[−]

(3.55)

j

In particular, e[±] are independent of the fiber coordinate z and are equal to each other, since their difference is the integral of a total derivative. On the other hand, changing variable from ζ to z in (3.31), we may rewrite the hyperkähler potential χ , Eq. (3.33), as a contour integral

χ =r R

j

C˜ j

dz −1 z Z − z Z¯ ∂ξ Hˆ [0 j] (ξ (z)) + r c[+−] AI , I 2π i z

(3.56)

where we used the notation A = 1. Equation (3.56) may be used to express R in terms of eφ or vice-versa. Moreover, comparing (3.56) and (3.55), one confirms that the invariant φ defined in (3.38) is indeed equal to the contact potential [±] . 4. The Perturbative Hypermultiplet Moduli Space In order to illustrate the results in the previous section, we now discuss the geometry of hypermultiplet moduli spaces in type II string theories compactified on a Calabi-Yau three-fold, which is the main motivation for this study. In Sect. 4.1 we focus on the tree-level geometry, deferring the inclusion of the one-loop correction to the next subsection. Non-perturbative contributions will be considered in [19], using the results on deformation in Sect. 5 below.

4.1. Tree-level geometry. At tree-level in the string perturbative expansion, the hypermultiplet moduli space M in type IIA (resp. IIB) string theory compactified on a Calabi-Yau three-fold Y (resp. X) is a QK space of quaternionic dimension $d = h^{2,1}(Y) + 1$ (resp. $d = h^{1,1}(X) + 1$). It is obtained by the c-map construction from the vector multiplet moduli space $M_V$ in type IIB (resp. IIA) theory compactified on the same Calabi-Yau manifold Y (resp. X) [29,30]. $M_V$ is a projective special Kähler manifold of dimension 2d − 2, representing the moduli space of complex deformations of Y (resp. complexified Kähler deformations of X), described in the standard way [31,32] by a holomorphic prepotential $F(X^\Lambda)$ (Λ = 0, . . . , d − 1), homogeneous of degree two. As shown in [33,34], the Lagrangian describing the Swann bundle of M is given by

\[
\mathcal L = {\rm Im} \oint_{C_+} \frac{d\zeta}{2\pi i\, \zeta}\, \frac{F(\eta^\Lambda)}{\eta^\flat},
\tag{4.1}
\]

where $\eta^I$ ($I = \flat, 0, \ldots, d - 1$) are O(2) multiplets parameterized as in (3.4), and the contour $C_+$ encloses the root $\zeta_+$ of $\zeta \eta^\flat(\zeta)$ (given in Eq. (3.34)) counter-clockwise. This


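can be cast in our general framework (3.23) by introducing four patches21 on $\mathbb{CP}^1$, centered at 0, ∞, $\zeta_+$, $\zeta_-$, with transition functions

\[
H^{[0+]}_{\rm tree} = -\frac{i}{2}\, \frac{F(\eta^\Lambda)}{\eta^\flat}, \qquad
H^{[0-]}_{\rm tree} = -\frac{i}{2}\, \frac{\bar F(\eta^\Lambda)}{\eta^\flat}, \qquad
H^{[0\infty]}_{\rm tree} = 0 .
\tag{4.2}
\]

The contour integral (4.1) was evaluated in [17] (generalizing a previous computation in [33,34] restricted to the locus $v^\flat = \bar v^\flat = 0$), resulting in

\[
\mathcal L(v, \bar v, x) = \frac{1}{2 i\, r^\flat} \left( F(\eta^\Lambda_+) - \bar F(\eta^\Lambda_-) \right).
\tag{4.3}
\]

The hyperkähler potential χ obtained by Legendre transform is given by [17]

\[
\chi = \frac{v^\flat \bar v^\flat}{(r^\flat)^3}\, K(\eta_+, \eta_-),
\tag{4.4}
\]

where

\[
K(Z, \bar Z) \equiv i \left( \bar Z^\Lambda F_\Lambda(Z) - Z^\Lambda \bar F_\Lambda(\bar Z) \right) \equiv e^{-\mathcal K(Z, \bar Z)} .
\tag{4.5}
\]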

The hyperkähler potential χ may be further expressed in terms of the complex coordinates v I , w I and their complex conjugate by means of the Hesse potential associated to the special Kähler manifold MV [17]. The momentum coordinates µ[0] I for this geometry can be evaluated using (3.25) (away from the locus where the zeros of η collide with other singularities of F(η )): i ζ +ζ− ¯ ζ +ζ+ 1 µ[0] = − F (η ) − (η ) F ζ −ζ− − , 4ir ζ −ζ+ + 2 i ¯ −) ζ +ζ− ¯ ζ+ F(η+ ) ζ− F(η ζ ζ +ζ+ x µ[0] = + 2i(r )2 (ζ −ζ+ )2 + (ζ −ζ− )2 + 4i(r )3 ζ −ζ+ F(η+ )− ζ −ζ− F(η− ) 2 ζ +ζ− ¯ +ζ+ v v + . (4.6) F (η ) + ζ v ¯ (η ) + ζ v ¯ F − 4i(r1 )2 ζζ−ζ + + − − ζ+ ζ −ζ− ζ− + Since H [0∞] = 0, the momentum coordinates around the south pole are given by µ[∞] = I [0] µ I , and one may check that the reality conditions (2.34) are indeed satisfied. [0] The multiplets µ[0] and µ have a first order and second order pole at ζ = ζ± , respectively, while being regular elsewhere. It is readily checked that the combinations [0] µ[+] = µ −

i F (η) i F(η) [0] , µ[+] , = µ + 2 η 2 (η )2

(4.7)

[0+] related to µ[0] I by the symplectomorphism generated by Htree , are regular at ζ = ζ+ , while being singular at ζ = ζ− and other possible singularities of F(η). Indeed, evaluating µ[+] I at ζ = ζ+ yields

i i x i v ¯ (4.8) F (η ) − F (η ) + F (η ) + v ¯ ζ µ+ = − + − + + 2 4(r )2 2r ζ+ 21 Since H [0∞] = 0, U and U are really one and the same patch. We further assume that all singularities ∞ 0 of F(η) belong to either U0 or U∞ , but not to U± .


and µ+ =

i (x )2 − 2v v¯ ¯ −) + i F(η+ ) − F(η 4 2 4(r )

v v i x ¯ F (η+ ) + + ζ+ v¯ + ζ− v¯ (4.9) − F (η− ) 4(r )3 ζ+ ζ−

i v¯ v − v v¯ v v i . − F (η ) + F + ζ v ¯ + ζ v ¯ + + + 2(r )3 4(r )2 ζ+ ζ+

Similarly, the combinations [0] µ[−] = µ −

¯ i F¯ (η) i F(η) [−] [0] , µ = µ + , 2 η 2 (η )2

(4.10)

[0−] , are regular at ζ = ζ , related to µ[0] − I by the symplectomorphism generated by H while being singular at ζ = ζ+ and other possible singularities of F(η). We note that the multiplets µ I may be obtained independently by making use of the special symmetry properties of the QK metrics in the image of the c-map, namely the existence of an extended Heisenberg group of tri-holomorphic isometries [29,30]. Upon lifting them to the Swann bundle, these isometries are generated by the holomorphic Killing vector fields [17]

i K = − ∂w , 4 M = w ∂w

i ∂ w , Q = w ∂w − v ∂v , 2 1 1 − v ∂v + w ∂w − v ∂v . 2 2 P =

(4.11)

The commuting isometries K , P are manifest in the O(2) projective superfield construction; their O(2)-valued moment maps are just the O(2) multiplets η , η . The moment maps, λ , λ associated to the remaining isometries Q and M provide d + 1 additional global O(2) sections

λ = v w /ζ + w ∂w χ − v ∂v χ + v¯ w¯ ζ,

1 1 1 −1 λ = v w + v w ζ + w ∂w χ − v ∂v χ + w ∂w χ − v ∂v χ 2 2 2

1 (4.12) + v¯ w¯ + v¯ w¯ ζ. 2 Matching the leading terms in the expansion around ζ = 0, one readily checks that the momentum coordinates around the north pole are given in terms of the global O(2) sections [0] µ =

λ , η

µ[0] =

λ λ η − . η 2(η )2

(4.13)


4.2. One-loop correction. In type II theories compactified on a Calabi-Yau Y, the metric on the hypermultiplet moduli space receives a one-loop correction, proportional to the Euler number of Y [35]. There is evidence that there are no perturbative corrections to the hypermultiplet metric beyond one-loop [36].22 As shown in [36], the one-loop correction can be described in the projective superspace formalism by adding a term

\[
\mathcal L_{\rm 1-loop} = 2c \oint_C \frac{d\zeta}{2\pi i\, \zeta}\, \eta^\flat \log \eta^\flat = -4c \left( r^\flat - x^\flat \log \frac{x^\flat + r^\flat}{2 |v^\flat|} \right)
\tag{4.14}
\]

to the Lagrangian (4.1), where c is a constant determined in [36], proportional to the Euler character of the Calabi-Yau threefold. Here the contour C is a figure-eight contour around $\zeta_+$ and $\zeta_-$, and the branch cuts in $\log \eta^\flat$ are chosen to extend from $\zeta_+$ to 0 and $\zeta_-$ to ∞ (see Sect. 3.4 in [1] for a more detailed discussion). Equivalently, the one-loop correction gives rise to additional contributions to the transition functions (4.2):

\[
H^{[0+]}_{\rm 1-loop} = 2c\, \eta^\flat \log \eta^\flat, \qquad
H^{[0-]}_{\rm 1-loop} = -2c\, \eta^\flat \log \eta^\flat, \qquad
H^{[0\infty]}_{\rm 1-loop} = 0 .
\tag{4.15}
\]

In particular, the transition functions are no longer homogeneous, but fall in the “quasi-homogeneous” class, with anomalous dimensions

\[
c^{[0]}_\flat = c^{[\infty]}_\flat = 0, \qquad
c^{[\pm]}_\flat = \mp 2c, \qquad
c^{[i]}_\Lambda = 0 .
\tag{4.16}
\]

The one-loop contribution to the hyperkähler potential is given by a simple correction to the formula (4.4) [28]

\[
\chi = \frac{v^\flat \bar v^\flat}{(r^\flat)^3}\, K(\eta_+, \eta_-) - 4 c\, r^\flat,
\tag{4.17}
\]
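The shift $-4 c\, r^\flat$ in (4.17) can be recovered from the closed form in (4.14) by the Legendre transform (3.24), $\chi = \mathcal L - x^I \partial_{x^I} \mathcal L$, since the one-loop term depends on $x^\flat$ only. The following sketch checks this numerically with a finite-difference derivative; the function names, the step size and the sample values of $|v^\flat|$, $x^\flat$ and c are illustrative assumptions.

```python
import math

def lagrangian_one_loop(v_abs, x, c):
    """Closed form of the one-loop Lagrangian (4.14) as a function of |v^flat| and x^flat."""
    r = math.sqrt(x * x + 4 * v_abs ** 2)
    return -4 * c * (r - x * math.log((x + r) / (2 * v_abs)))

def chi_one_loop(v_abs, x, c, h=1e-6):
    """Legendre transform L - x dL/dx at fixed v, with a central-difference derivative."""
    dL = (lagrangian_one_loop(v_abs, x + h, c) - lagrangian_one_loop(v_abs, x - h, c)) / (2 * h)
    return lagrangian_one_loop(v_abs, x, c) - x * dL

# Hypothetical sample point and one-loop coefficient.
v_abs, x_flat, c = 0.7, 0.4, 0.3
r_flat = math.sqrt(x_flat ** 2 + 4 * v_abs ** 2)
chi_value = chi_one_loop(v_abs, x_flat, c)
expected = -4 * c * r_flat
```

Analytically, $\partial_{x^\flat} \mathcal L_{\rm 1-loop} = 4c \log\left( (x^\flat + r^\flat)/(2|v^\flat|) \right)$, so the logarithms cancel in the Legendre transform and only $-4 c\, r^\flat$ survives.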

in agreement with the general result (3.33). Let us now determine the twistor lines for the one-loop corrected hypermultiplet moduli space. Starting from the general expression (3.25), the additional contribution (4.15) to the transition functions gives rise to extra terms in µ[i] ;T ,

ζ − ζ− [0]tree , µ[0] = µ + 2c log |ζ | + T; ζ − ζ+

|v |(ζ − ζ− )2 [+] [+]tree µT ; = µ + 2c, + 2c log − ζ ζ−

(4.18)

[i]tree while the other momentum coordinates remain unaltered, µ[i] . It is easy to = µ check that the multiplet (4.18) transforms under SU (2) transformations according to (A.6). 22 In the case of the universal hypermultiplet, this was established rigorously in [37]. See the end of Sect. 4.3 for a strengthening of the non-renormalization argument in [36].


4.3. Superconformal quotient. The superconformal quotient of the HKC defined by (4.1) was studied in [17,28].23 The dilation and SU (2) invariant coordinates used in these references were given by χ r · r η+a x a ˜ = −i(w − w¯ ) + , ζ = , z = , ζ Re [F (η+ )] , 4r (r )2 (r )2 η+0 (4.19) η0 ˜ v v¯ x σ = 2i(w − w¯ )+i v w − v¯ w¯ − (r )2 Re η+ ζ − F (η+ )ζ −4ic log η0+ .

e2U =

−

While the SU (2)-invariance of ζ˜ and σ directly is rather tedious to check, it can be made manifest by casting the resulting expressions in the form of a contour integral of an O(−2) section, e.g. when c = 0, [0] [0] µ η µ[0] dζ µ dζ ζ˜ = −2ir , σ = 4ir + . (4.20) η 2(η )2 C + 2π iζ η C + 2π iζ In the presence of the one-loop correction, these expressions may be generalized by performing the same replacement as in (3.39) and taking the real part. We now relate the result (4.19) to the general SU (2) and dilation invariant coordinates introduced in Sect. 3.2. Clearly, U = φ/2, z a = Z a , ζ = A . On the other hand, evaluating (3.39) with the help of (4.8), (4.9) and (4.18) leads to x Re F (η+ ) − A Re F (η+ ), (r )2 1 +x A η0 Re F (η+ )+ A A Re F (η+ )+2ic log η0+ . B = + (r 1 )2 Re F(η+ )− x 2(r )2 − 2 (4.21)

B = +

Thus the coordinates B I differ from σ, ζ˜ by SU (2) invariant terms, B = ζ˜ − A Re F (Z ),

1 1 B = − σ − A B . 2 2

(4.22)

The contact twistor lines can be found using the general formulae (3.50), (3.53), ξ = A + R z−1 Z − z Z¯ , i ξ˜[0] = B + A Re F (Z ) + R z−1 F (Z ) − z F¯ ( Z¯ ) , 2

i 1 [0] ξ˜ = B − A A Re F (Z ) + R2 Re Z¯ F (Z ) 2 2 ¯ Z¯ ) − 2c log z. −RA z−1 F (Z ) − z F¯ ( Z¯ ) − R2 z−2 F(Z ) + z2 F( (4.23) Finally, it remains to express R, defined in (3.40) above, in terms of the base coordinates (3.36)–(3.39). For this purpose, one may substitute the one-loop corrected hyperkähler 23 Ref. [28] used a different contour prescription related to the one used here by a local gauge transformation, as we explain in Appendix B. As a result, the expressions for ζ˜ and σ acquired some additional terms.


potential (4.17) into the definition of φ, Eq. (3.38), and use the homogeneity property of K (·, ·) to obtain 1 ¯ R = 2 e 2 K(Z , Z ) eφ + c. (4.24) Introducing W ≡ F (Z ) ζ − Z ζ˜

(4.25)

and using (4.22), one may obtain the contact twistor lines for the one-loop corrected hypermultiplet geometry in the form found in [17], ξ = ζ + R z−1 Z − z Z¯ , −2iξ˜[0] = ζ˜ + R z−1 F − z F¯ , (4.26) 4iξ˜[0] + 2iξ˜[0] ξ = σ + R z−1 W − z W¯ − 8i c log z. We proceed to extract the one-loop corrected hypermultiplet metric from the twistor data on ZM . Following the procedure outlined at the end of Sect. 2.5, we first compute the Laurent coefficients of ξ˜ I[+] entering the SU (2) connection (2.83), i i [+] [+] ξ˜,0 ζ˜ − F ζ , ξ˜,0 = = − (σ + ζ ζ˜ − F ζ ζ ) − eφ + c, 2 4 1 i [+] F ζ ζ , = RN Z¯ − (4.27) ξ˜,1 2 4R 1 i [+] ξ˜,1 F ζ ζ ζ . = − RN ζ Z¯ + 2 12R where N ≡ i(F − F¯ ), leading to the SU (2) connection i p+ = e−φ R Z dζ˜ − F dζ = ( p− )∗ , 4 (4.28) 1 −φ ˜ φ ˜ p3 = e dσ + ζ dζ − ζ dζ + 4(e + c)A K , 8

where A K ≡ i Ka dZ a − Ka¯ d Z¯ a¯ is the Kähler connection of the projective special Kähler base MV . A direct computation of the (1, 0) forms (2.88) then yields a = R2 dZ a , i ˜ = R dζ˜ − F dζ − F ζ dZ 2

i 3 i T ˜ T Z R Im (F ) Z¯ + , + F ζ ζ d ζ − F dζ T T 4r 4R2 r + 2c ˜ = −R dr + c dK r +c i + dσ + ζ˜ dζ + ζ dζ˜ − F ζ ζ dZ − 2F ζ dζ 4

i i Z T dζ˜T − FT dζ T , R2 Im (F )ζ Z¯ + F ζ ζ ζ + 4r 12 (4.29)


where r ≡ eφ . Taking linear combinations, a basis of (1,0) forms can be chosen as dZ a , f a dζ˜ − F dζ , (4.30) r + 2c i Z dζ˜ − F dζ , dσ + ζ˜ dζ − ζ dζ˜ + c dK, dr + r +c 4 where f a = eK/2 (∂a Z +∂a K Z ), generalizing the one-forms ea , E a , u, v introduced in [30] for c = 0. Finally, computing the Kähler form (2.85) in this basis and raising the indices, one obtains the one-loop corrected metric on the hypermultiplet moduli space [28,36],

\[
ds^2 = \frac{r+2c}{r^2(r+c)}\,dr^2
- \frac{1}{r}\left(N^{\Lambda\Sigma} - \frac{2(r+c)}{rK}\,Z^\Lambda\bar Z^\Sigma\right)
\left(d\tilde\zeta_\Lambda - F_{\Lambda\Lambda'}\,d\zeta^{\Lambda'}\right)\left(d\tilde\zeta_\Sigma - \bar F_{\Sigma\Sigma'}\,d\zeta^{\Sigma'}\right)
+ \frac{r+c}{16r^2(r+2c)}\left(d\sigma + \tilde\zeta_\Lambda d\zeta^\Lambda - \zeta^\Lambda d\tilde\zeta_\Lambda + 4c\,A_K\right)^2
+ \frac{4(r+c)}{r}\,K_{a\bar b}\,dZ^a d\bar Z^{\bar b}, \tag{4.31}
\]
which identifies φ as the four-dimensional dilaton. It is worth noting that the one-loop correction changes the topology of the fibration of the σ-circle over the torus coordinatized by $\zeta^\Lambda$, $\tilde\zeta_\Lambda$, by a term proportional to $A_K$ [38]. Any perturbative correction to the hypermultiplet metric beyond one-loop would presumably induce extra terms in the connection on the σ-circle bundle proportional to a positive power of r, and would therefore conflict with the quantization of its first Chern class. This observation reinforces the arguments given in [36] ruling out perturbative corrections to the hypermultiplet metric beyond one-loop.

5. Linear Deformations of O(2) Quaternionic-Kähler Spaces

In this section, we study the infinitesimal deformations of 4d-dimensional QK spaces M with d + 1 commuting isometries, which preserve the QK property but may break some or all of the isometries. Our strategy is to apply the general analysis of linear deformations of O(2) HK spaces developed in [1] to the Swann bundle S of M, restricting to deformations which preserve the superconformal invariance property. As explained in the introduction, it is possible to bypass the Swann bundle and work directly with the twistor space $Z_M$. This strategy will be realized in Sect. 5.2.

5.1. Linear deformations of O(2) hyperkähler cones. As explained in [1], deformations of HK spaces S are conveniently described by perturbing the transition functions $S^{[ij]}$ which encode the holomorphic symplectic structure on the twistor space $Z_S$,
\[
S^{[ij]}(\nu_{[i]}, \mu^{[j]}, \zeta) = f_{ij}^{-2}\,\nu^I_{[i]}\,\mu^{[j]}_I - \tilde H^{[ij]}(\nu_{[i]}, \zeta) - \tilde H^{[ij]}_{(1)}(\nu_{[i]}, \mu^{[j]}, \zeta), \tag{5.1}
\]

and working out the perturbations $\hat\nu^I_{[i]}$, $\hat\mu^{[i]}_I$ of the twistor lines,
\[
\nu^I_{[i]} = \zeta f_{0i}^{-2}\,\eta^I + \hat\nu^I_{[i]}, \qquad \mu^{[i]}_I = \breve\mu^{[i]}_I + \hat\mu^{[i]}_I \tag{5.2}
\]
to first order in the perturbations $\tilde H^{[ij]}_{(1)}$. Here and below, unperturbed quantities are denoted with ˘, perturbations with ˆ, and perturbed quantities with no extra symbol (with the exception of $\eta^I$, $v^I$, $x^I$ which will continue to denote unperturbed quantities). As shown in Sect. 3.1, superconformal invariance restricts the undeformed transition functions $\tilde H^{[ij]}(\nu_{[i]}, \zeta)$ to be homogeneous of degree one in $\nu_{[i]}$, and without any explicit dependence on ζ except for some factors of $f_{ij}$. The same reasoning shows that the perturbation $\tilde H^{[ij]}_{(1)}$ should satisfy the same conditions, namely
\[
\tilde H^{[ij]}_{(1)}(\nu_{[i]}, \mu^{[j]}, \zeta) = f_{ij}^{-2}\,\hat H^{[ij]}_{(1)}\!\left(\nu^I_{[i]},\ \mu^{[j]}_I + c^{[j]}_I \log\!\left(f_{ij}^{-2}\,\nu^\flat_{[i]}\right)\right), \tag{5.3}
\]
where $\hat H^{[ij]}_{(1)}$ is a homogeneous function of degree one in its first argument (see footnote 24). Following [1], we now trade $\tilde H^{[ij]}_{(1)}$ for
\[
H^{[ij]}_{(1)}(\eta, \breve\mu^{[j]}, \zeta) \equiv \zeta^{-1} f_{0j}^{2}\,\tilde H^{[ij]}_{(1)}(\zeta f_{0i}^{-2}\eta, \breve\mu^{[j]}, \zeta)
= \hat H^{[ij]}_{(1)}\!\left(\eta^I,\ \breve\mu^{[j]}_I + c^{[j]}_I \log\!\left(f_{0j}^{-2}\,\zeta\,\eta^\flat\right)\right). \tag{5.4}
\]
We then perform the gauge transformation (3.13) to obtain
\[
H^{[ij]}_{(1)}(\eta, \breve\mu^{[j]}_T, \zeta) = \hat H^{[ij]}_{(1)}\!\left(\eta^I,\ \breve\mu^{[j]}_{T;I} + c^{[j]}_I \log\eta^\flat + c^{[0]}_I \log\zeta\right). \tag{5.5}
\]
Finally, we trade the argument $\breve\mu^{[j]}_{T;I}$ for the real multiplet
\[
\rho_I(\zeta) \equiv -i\left(\breve\mu^{[0]}_{T;I} + \breve\mu^{[\infty]}_{T;I}\right) - i\,c^{[0\infty]}_I \log\zeta. \tag{5.6}
\]

This quantity has the advantage of having non-anomalous O(0) transformations; moreover, the reality conditions are also automatically satisfied provided $H^{[i\bar\imath]}_{(1)}$ is a real function of $\eta^I$ and $\rho_I$. After these redefinitions, $H^{[ij]}_{(1)}$ is now a function of $\eta^I$, $\rho_I$, homogeneous of degree one in $\eta^I$, and with no explicit dependence on ζ. In addition, it must satisfy the co-cycle condition (3.8) and is subject to the gauge equivalence (3.9), where $G^{[i]}$ is now a function of $\eta^I$, $\rho_I$ regular in the patch $\hat U_i$. We may now borrow the results from [1], Sect. 5. In particular, the first order variation of the HK twistor lines is given by
\[
\hat\nu^I_{[i]}(\zeta) = i f_{0i}^{-2} \sum_j \oint_{C_j} \frac{d\zeta'}{2\pi i\,\zeta'}\,\frac{\zeta^3 + \zeta'^3}{\zeta'(\zeta' - \zeta)}\,H^{[0j]I}_{(1)}(\zeta'), \tag{5.7}
\]
\[
\hat\mu^{[i]}_{T;I}(\zeta) = \sum_j \oint_{C_j} \frac{d\zeta'}{2\pi i\,\zeta'}\,\frac{\zeta + \zeta'}{2(\zeta' - \zeta)}\,G^{[0j]}_I(\zeta') \tag{5.8}
\]
with
\[
H_I \equiv \partial_{\eta^I} H, \qquad H_{IJ} \equiv \partial_{\eta^I}\partial_{\eta^J} H, \qquad H_{(1)I} \equiv \partial_{\eta^I} H_{(1)}, \qquad H^I_{(1)} \equiv \partial_{\rho_I} H_{(1)},
\]
\[
G^{[ij]}_I \equiv H^{[ij]}_{(1)I} + i\,H^{[ij]J}_{(1)}\left(H^{[j0]}_{IJ} + H^{[j\infty]}_{IJ}\right) + \zeta^{-1} f_{0i}^{2}\,\hat\nu^J_{[i]}\,H^{[ij]}_{IJ}. \tag{5.9}
\]

24 For simplicity, we do not consider deformations of the anomalous dimensions. However, it may be checked that all formulae below continue to hold provided $c^{[i]}_I$ denote the total perturbed anomalous dimensions.


The corresponding deformations of the Kähler potential can be conveniently described by introducing the deformed Lagrangian
\[
\mathcal{L}(v, \bar v, x, \varrho) = \sum_j \oint_{C_j} \frac{d\zeta}{2\pi i\,\zeta}\left[H^{[0j]}(\eta) + H^{[0j]}_{(1)}(\eta, \rho)\right]. \tag{5.10}
\]

So defined, it is a function of the complex variables $v^I$. However, after perturbation the $v^I$ are no longer Darboux coordinates. Instead, a system of complex Darboux coordinates of the deformed HKC, such that $\omega^+_S = dw_I \wedge du^I$, is given by
\[
u^I = v^I + i\,\partial_{\varrho_I} \sum_j \oint_{C_j} \frac{d\zeta}{2\pi i}\,H^{[0j]}_{(1)}, \qquad
w_I = \frac{i}{2}\,\varrho_I + \frac{1}{2}\,\partial_{x^I}\mathcal{L}(u, \bar u, x, \varrho), \tag{5.11}
\]

where in the second relation the arguments $v^I$ of the Lagrangian are replaced by the new complex variables $u^I$ and the derivative is evaluated keeping $u^I$, $\bar u^I$ and $\varrho_I$ fixed. Similarly, the Kähler potential for the deformed HK metric is given by the Legendre transform of the deformed Lagrangian (5.10), but written as a function of the new variables,
\[
\chi(u, \bar u, w, \bar w) = \left\langle \mathcal{L}(u, \bar u, x, \varrho) - x^I (w_I + \bar w_I) \right\rangle_{x^I}. \tag{5.12}
\]

In particular, the variation of the hyperkähler potential is given by a Penrose-type integral,
\[
\chi_{(1)}(u, \bar u, w, \bar w) = \sum_j \oint_{C_j} \frac{d\zeta}{2\pi i\,\zeta}\,H^{[0j]}_{(1)}(\eta, \rho). \tag{5.13}
\]

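We now verify that the perturbed HK manifold is indeed a HKC (as is of course guaranteed by construction). Using the quasi-homogeneity property of $H_{(1)}(\eta, \rho)$, and in particular the property
\[
\left(u^I \partial_{u^I} - \bar u^I \partial_{\bar u^I} + \zeta\,\partial_\zeta\right)\rho_J = 0, \tag{5.14}
\]
it is easily checked that $\mathcal{L}$ satisfies
\[
u^I \mathcal{L}_{u^I} - \bar u^I \mathcal{L}_{\bar u^I} = -2i\,c^{[0]}_I\,\mathcal{L}_{\varrho_I}, \qquad
x^I \mathcal{L}_{x^I} + u^I \mathcal{L}_{u^I} + \bar u^I \mathcal{L}_{\bar u^I} = \mathcal{L} - 2c^{[0]}_I\,x^I. \tag{5.15}
\]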

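Together with the identities
\[
\left(\partial_{x^I} + i\mathcal{L}_{x^I x^K}\partial_{\varrho_K}\right)\left(\partial_{x^J} - i\mathcal{L}_{x^J x^L}\partial_{\varrho_L}\right)\mathcal{L}
+ \left(\partial_{u^I} - i\mathcal{L}_{u^I x^K}\partial_{\varrho_K}\right)\left(\partial_{\bar u^J} + i\mathcal{L}_{\bar u^J x^L}\partial_{\varrho_L}\right)\mathcal{L} = 0, \tag{5.16}
\]
\[
\left(\partial_{x^I} + i\mathcal{L}_{x^I x^K}\partial_{\varrho_K}\right)\left(\partial_{u^J} - i\mathcal{L}_{u^J x^L}\partial_{\varrho_L}\right)\mathcal{L}
- \left(\partial_{x^J} + i\mathcal{L}_{x^J x^K}\partial_{\varrho_K}\right)\left(\partial_{u^I} - i\mathcal{L}_{u^I x^L}\partial_{\varrho_L}\right)\mathcal{L} = 0, \tag{5.17}
\]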


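these equations guarantee that $\mathcal{L}$ satisfies the constraints of superconformal invariance. Moreover, from (2.17), one may compute the homothetic Killing vector and the Killing vectors for the SU(2) isometric action,
\[
\chi^{u^I} = 2u^I, \qquad \chi^{w_I} = -2c^{[0]}_I, \tag{5.18}
\]
\[
\delta u^I = i\epsilon_3 u^I + \epsilon_+\left(x^I + i\mathcal{L}_{\varrho_I}\right), \qquad
\delta \bar u^I = -i\epsilon_3 \bar u^I + \epsilon_-\left(x^I - i\mathcal{L}_{\varrho_I}\right),
\]
\[
\delta w_I = \epsilon_+ \mathcal{L}_{u^I} - i\epsilon_3\,c^{[0]}_I, \qquad
\delta \bar w_I = \epsilon_- \mathcal{L}_{\bar u^I} + i\epsilon_3\,c^{[0]}_I. \tag{5.19}
\]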

In particular, the homothetic Killing vector is holomorphic and identical to the undeformed case. Moreover, one may check that the one-form obtained by lowering the index on k + using the deformed metric reproduces the Liouville form (2.64) on S in the patch i = 0 [4]. 5.2. Perturbed contact twistor lines. In order to extract the deformed quaternionicKähler metric on M, one possible strategy is to study the deformations of the superconformal quotient: this computationally intensive approach is outlined in Appendix C. However, it turns out to be more economic and elegant to work directly with the complex contact structure on the twistor space ZM , without reference to the Swann bundle and its twistor space. For this purpose, let us recast the deformed symplectomorphism (5.1) into the form of the contact transformations (2.71). Introducing the same coordinates as in (2.65), it is easy to check that the deformed contact transformations are generated by the following transition functions: [ j] ˜ [ j] ˜ [ j] [i j] ˜ [ j] ξ − Hˆ [i j] (ξ[i] Sˆ [i j] (ξ[i] , ξ I ) = ξ˜ + ξ[i] ) − Hˆ (1) (ξ[i] , ξ I ), [ j]

where ξ˜ I that

[ j]

should be replaced by ξ˜ I

(5.20)

[ j]

+ c I log( fˆi−2 j ). However, it follows from (2.73)

[i j] , fˆi2j ≈ 1 − ∂ξ˜ [ j] Hˆ (1)

(5.21)

so that its logarithm is of first order in the perturbation already. Therefore, to the first [ j] ˆ [i j] order it is consistent to neglect the term c I log( fˆi−2 j ) in the argument of H(1) , and take [ j] Hˆ [i j] to be an arbitrary function of the undeformed coordinates ξ˘ and ξ˘˜ . As a result, (1)

one finds the following deformed contact transformations: = ξ[j] − T[ij] , ξ[i]

[i]

I

[ j] [i j] ξ˜ I[i] = ξ˜ I − T˜I ,

(5.22)

where, in view of later applications, we abbreviated [i j] [i j] T[ij] ≡ −∂ξ˜ [ j] Hˆ (1) + ξ[i] ∂ξ˜ [ j] Hˆ (1) , [i j] [ j] [i j] [i j] − c ∂ξ˜ [ j] Hˆ (1) T˜ ≡ ∂ξ Hˆ [i j] + Hˆ (1) , (5.23) [i] [i j] [i j] I [ j] [i j] [i j] [i j] − ξ[i] + c I ξ[i] T˜ ≡ Hˆ [i j] + Hˆ (1) ∂ξ Hˆ [i j] + Hˆ (1) + c ∂ξ˜ [ j] Hˆ (1) . [i]

As usual, the functions $\hat H^{[ij]}_{(1)}$ must satisfy the co-cycle condition (3.8) and are defined up to the gauge equivalence (3.10). Thus, they define an element in the Čech cohomology group $H^1(Z_M, O(2))$ (see footnote 25), realizing LeBrun's assertion that this group classifies the QK deformations of M [14].

We now determine the deformed contact twistor lines. For definiteness we focus on the coordinate $\xi^\Lambda_{[+]}(z, x^\mu)$ around $\zeta = \zeta_+$, i.e. z = 0. The pole and the constant term in the Laurent expansion (2.80) are readily obtained by contour integrating around z = 0:
\[
\xi^\Lambda_{[+]}(z, x^\mu) = Y^\Lambda_+ z^{-1} + A^\Lambda_+ + O(z), \tag{5.24}
\]
where
\[
A^\Lambda_+ = \oint_0 \frac{dz}{2\pi i\,z}\,\xi^\Lambda_{[+]}, \qquad
Y^\Lambda_+ = \oint_0 \frac{dz}{2\pi i}\,\xi^\Lambda_{[+]}. \tag{5.25}
\]
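The projections in (5.25) can be illustrated numerically. The following is only a sketch under stated assumptions: the function `xi_plus` and the values `Y_PLUS`, `A_PLUS` are hypothetical stand-ins for a twistor line with a simple pole at z = 0, and the contour is a small circle discretized with the trapezoidal rule.

```python
import cmath

def contour_integral(f, power, radius=0.5, samples=256):
    """Approximate the integral of dz/(2*pi*i) * z**power * f(z) over |z| = radius.

    With z = radius * exp(i*theta) one has dz = i*z*dtheta, so the measure
    dz/(2*pi*i) becomes z*dtheta/(2*pi); the sum below is the trapezoidal rule.
    """
    total = 0j
    for k in range(samples):
        z = radius * cmath.exp(2j * cmath.pi * k / samples)
        total += z ** power * f(z) * z
    return total / samples

# Hypothetical Laurent data, chosen only for illustration.
Y_PLUS = 0.7 - 0.2j
A_PLUS = 1.3 + 0.4j

def xi_plus(z):
    """Toy twistor line: simple pole, constant term, and regular corrections."""
    return Y_PLUS / z + A_PLUS + 0.9 * z - 0.25 * z ** 2

A_est = contour_integral(xi_plus, power=-1)  # constant term, cf. (5.25)
Y_est = contour_integral(xi_plus, power=0)   # residue at z = 0, cf. (5.25)
```

In this toy setting the two integrals return the constant term and the residue, respectively, independently of the regular terms added to `xi_plus`.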

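On the other hand, the full Laurent series expansion of $\xi^\Lambda_{[+]}(z, x^\mu)$ at z = 0 is given by the series
\[
\xi^\Lambda_{[+]}(z) = Y^\Lambda_+ z^{-1} + \sum_{n=0}^{\infty} \oint_0 \frac{dz'}{2\pi i\,z'}\left(\frac{z}{z'}\right)^{n} \xi^\Lambda_{[+]}(z'). \tag{5.26}
\]
The contour around z = 0 may be deformed into a sum of contours $\tilde C_j$ around the other singularities in the z plane. Using the contact transformations (5.23) on each patch, we obtain
\[
\xi^\Lambda_{[+]}(z) = Y^\Lambda_+ z^{-1} - \sum_{j \neq +} \sum_{n=0}^{\infty} \oint_{\tilde C_j} \frac{dz'}{2\pi i\,z'}\left(\frac{z}{z'}\right)^{n} \left[\xi^\Lambda_{[j]}(z') - T^\Lambda_{[+j]}(z')\right]. \tag{5.27}
\]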

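The first term in the square bracket gives a non-vanishing contribution for j = − and n = 0, 1 only, while the second term contributes an infinite Laurent series. Therefore, we arrive at the following representation:
\[
\xi^\Lambda_{[+]}(z) = Y^\Lambda_+ z^{-1} + A^\Lambda_- - Y^\Lambda_- z + \sum_j \oint_{\tilde C_j} \frac{dz'}{2\pi i\,z'}\,\frac{1}{1 - z/z'}\,T^\Lambda_{[+j]}(z'), \tag{5.28}
\]
where now
\[
A^\Lambda_- = -\oint_\infty \frac{dz}{2\pi i\,z}\,\xi^\Lambda_{[-]}, \qquad
Y^\Lambda_- = \oint_\infty \frac{dz}{2\pi i\,z^2}\,\xi^\Lambda_{[-]}. \tag{5.29}
\]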

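From the reality conditions (2.70), we conclude that $A^\Lambda_- = (A^\Lambda_+)^*$, $Y^\Lambda_- = (Y^\Lambda_+)^*$. Comparing the $O(z^0)$ terms between (5.24) and (5.28) gives the difference
\[
A^\Lambda_+ - A^\Lambda_- = \sum_j \oint_{\tilde C_j} \frac{dz'}{2\pi i\,z'}\,T^\Lambda_{[+j]}(z'). \tag{5.30}
\]
Eliminating $A^\Lambda_-$ in (5.28) in favor of the real quantity $A^\Lambda = (A^\Lambda_+ + A^\Lambda_-)/2$ leads to
\[
\xi^\Lambda_{[i]}(z) = A^\Lambda + z^{-1} Y^\Lambda_+ - z\,Y^\Lambda_- + \frac{1}{2} \sum_j \oint_{\tilde C_j} \frac{dz'}{2\pi i\,z'}\,\frac{z' + z}{z' - z}\,T^\Lambda_{[+j]}(z') \tag{5.31}
\]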
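The passage from (5.28) to (5.31) rests on an elementary identity: resumming the geometric series in z/z' and subtracting the factor 1/2 that comes from trading $A_-$ for A via (5.30) produces the symmetric kernel (z' + z)/(2(z' − z)). The sketch below checks this at one sample point; the numerical values are arbitrary and only assume |z| < |z'|.

```python
def geometric_kernel(z, zp, terms=200):
    """Partial sum of sum_n (z/z')**n, which converges for |z| < |z'|."""
    return sum((z / zp) ** n for n in range(terms))

def symmetric_kernel(z, zp):
    """Kernel (z' + z) / (2 (z' - z)) appearing in (5.31)."""
    return (zp + z) / (2 * (zp - z))

# Arbitrary sample point with |z| < |z'|.
z, zp = 0.3 + 0.1j, 1.0 - 0.5j

# Resummed series minus the 1/2 absorbed into A = (A_+ + A_-)/2.
lhs = geometric_kernel(z, zp) - 0.5
rhs = symmetric_kernel(z, zp)
```

The two sides agree up to rounding, which is the only algebra needed beyond (5.30).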

25 The twisting by O(2) is not apparent in our formalism, as explained in footnote 14, but follows from the fact that $\hat H^{[ij]}_{(1)}$ originates from a homogeneous function of degree 1 on S.


for i = +. As observed below (3.53), these equations are in fact valid in any patch Ui , since they exhibit the correct discontinuities across the contours C˜ i . In an analogous way one may obtain the deformed conjugate coordinates ξ˜ I[i] . The [+] Laurent coefficient ξ˜ I,0 may be extracted by integrating dz [+] dz [+] 0 ˜ − c[+] log z = iB I+ − c[+] ξ˜ I,0 ξ = log z ξ [+] , (5.32) I I I 0 2π iz 0 2π iz where we defined

B I+

≡ −i 0

dz [+] [+] 0 µ + c log ν [+] . I I 2π iz

(5.33)

On the other hand, the Laurent series expansion around z = 0 may be obtained by deforming the contour around z = 0 into a sum of contours around the other singulari[ j] ties in the z plane, and use the symplectomorphism (5.22) to map ξ˜ I[+] to ξ˜ I : ξ˜ I[+] (z) = c[+] I log z −

∞ ˜ n=0 j=+ C j

dz z n [ j] ˜ [+ j] , (5.34) ξ˜ I − c[+] I log z − TI 2π iz z

[i j]

where T˜I are defined in (5.23). The first two terms in the bracket only contribute when j = −. The cancelation of the logarithmic singularity at z = ∞ is ensured by the con[−] dition c[+] I = −c I , corresponding to the figure-eight contour prescription discussed in [1]. Using (2.81), we obtain 0 ξ[−] dz dz [+ j] ˜ξ [+] (z) = c[+] log z + iB − + c[−] log + T˜I , I I I I z ∞ 2π iz C˜ j 2π i(z − z) j

(5.35) where B I−

≡i

∞

dz [−] [−] 0 µ + c log ν [−] I I 2π iz

(5.36)

is the complex conjugate of B I+ . Comparing the z-independent terms in (5.34) and (5.35) establishes the identity

dz dz z [−] 0 log z ξ − c log 0 i B I+ − B I− = c[+] [+] I I ξ[−] 0 2π iz ∞ 2π iz dz ˜ [+ j] + . (5.37) T I C˜ j 2π iz j

Eliminating B I− in (5.35) in favor of B I = B I+ + B I− leads to dz dz i 1 0 0 ˜ξ [+] (z) = B I − c[+] log z ξ[+] + log ξ[−] /z I 2 2 I 0 2π iz ∞ 2π iz dz z + z ˜ [+ j] 1 (5.38) + TI + c[+] I log z. 2 C˜ j 2π iz z − z j


0 ∼ Y 0 /z and ξ 0 ∼ Y 0 z at z = 0 and ∞, we finally obtain Using the fact that ξ[+] + − [−] ⎛ ⎞ Y−0 dz z + z ˜ [+ j] i 1 [+] ⎝z ⎠ ξ˜ I[i] (z) = B I + + c log (5.39) T I I 2 2 Y+0 C˜ j 2π iz z − z j

for i = +, and in fact also for any i. This relation generalizes (3.53) to the perturbed case. Taken together, (5.31) and (5.39) give the contact twistor lines of the deformed [i j] , which is considered as a function of the twistor space in terms of the perturbation H(1) [i] undeformed twistor lines ξ˘ , ξ˘˜ given in (3.20) and (3.53). In order to make contact [i]

I

with the construction in Sect. 3.3, one should recall that Y+ = RZ , where dz ξ[+] dz 0 ξ , Z = , R = 0 2π i z 2π i [+] ξ 0 0 [+]

(5.40)

in such a way that Z 0 = 1. To obtain the perturbed quaternionic-Kähler metric, we should also calculate the leading Laurent coefficients of the contact potentials [±] given in (2.84). It turns out that one can actually compute the full contact potentials, using the gluing conditions [i j] e−[i] − e−[ j] = e−φ ∂ξ˜ [ j] Hˆ (1) ,

(5.41)

where φ is defined by (3.38) in the unperturbed geometry. These conditions follow from the gluing conditions for ν˜ [i] in (2.69), using the results for the transition functions fˆi2j

(5.21) and the unperturbed ν˜ [i] (3.45). Repeating again the same steps as above, one easily arrives at ⎞ ⎛ z + z dz 1 [0 j] ⎠ e[i] = eφ ⎝1 + ∂ξ˜ [ j] Hˆ (1) (z ) , (5.42) 2 C˜ j 2π iz z − z j

where φ is defined in the perturbed case as φ ≡ Re [+] (z = 0) =

1 0 0 φ[+] + φ[−] . 2

(5.43)

Note that this definition coincides with (3.38) in the unperturbed case. Using (2.84), the leading Laurent coefficient of the contact potentials are given by ⎛ ⎞ 0 dz dz 1 1 1 1 [0 j] [+] ⎝ eφ[+] = Y+ T[0j] ⎠ + c[+] , A + T˜ + c 2 2 2 2 2 C˜ j 2π iz C˜ j 2π i z j j ⎛ ⎞ (5.44) 0 dz dz 1 1 1 1 [0 j] [−] ⎝ eφ[−] = − Y− T[0j] ⎠ − c[−] . A − T˜ − c 2 2 2 2 C˜ j 2π i C˜ j 2π i z j

j

Inserting in (5.43) we therefore obtain dz −1 1 1 [0 j] φ z Y+ − zY− T˜ + c[+−] AI , e = I 4 2π i z 4 C˜ j j

(5.45)


and the full contact potentials via (5.42). As a useful consistency check, note that the difference of (5.44) can be rewritten, after some considerable work, as dz 0 0 [0 j] − φ[−] = , (5.46) φ[+] ∂ξ˜ [ j] Hˆ (1) 2π iz ˜ Cj j

consistently with (5.42). Altogether these results allow us to extract the metric following the procedure outlined at the end of Sect. 2.5.

It is straightforward to relate this contact construction on $Z_M$ to the symplectic construction on $Z_S$. For this purpose, one needs to apply the change of variable (2.61), where $\zeta_\pm$ denote the locations of the zeros of the perturbed section $\nu^\flat$, to all contour integrals in the z plane. Under this change of variable, the integration measure becomes
\[
\frac{dz}{2\pi i\,z} = \frac{(\zeta_+ - \zeta_-)}{(\zeta - \zeta_+)(\zeta - \zeta_-)}\,\frac{d\zeta}{2\pi i}. \tag{5.47}
\]
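The measure (5.47) can be checked numerically. The sketch below assumes that the change of variable (2.61) has the Möbius form z = k (ζ − ζ_+)/(ζ − ζ_−), as suggested by (C.12); the constant k and all sample values are hypothetical and drop out of d log z.

```python
def z_of_zeta(zeta, zeta_p, zeta_m, k):
    """Assumed Moebius form of the change of variable; k is an arbitrary constant."""
    return k * (zeta - zeta_p) / (zeta - zeta_m)

def dlogz_numeric(zeta, zeta_p, zeta_m, k, h=1e-6):
    """Central finite difference for (dz/dzeta) / z."""
    dz = z_of_zeta(zeta + h, zeta_p, zeta_m, k) - z_of_zeta(zeta - h, zeta_p, zeta_m, k)
    return dz / (2 * h * z_of_zeta(zeta, zeta_p, zeta_m, k))

def dlogz_formula(zeta, zeta_p, zeta_m):
    """Right-hand side of (5.47) without the common factor dzeta/(2*pi*i)."""
    return (zeta_p - zeta_m) / ((zeta - zeta_p) * (zeta - zeta_m))

# Arbitrary sample values.
zeta, zeta_p, zeta_m, k = 0.4 + 0.3j, 1.5 - 0.2j, -0.8 + 0.6j, 2.0 - 1.0j
numeric = dlogz_numeric(zeta, zeta_p, zeta_m, k)
exact = dlogz_formula(zeta, zeta_p, zeta_m)
```

Since d log z = dζ/(ζ − ζ_+) − dζ/(ζ − ζ_−), the prefactor k plays no role, which is why only the positions of the two zeros enter (5.47).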

Its expression to the first order in deformation can be found in (C.9). However, in some cases it can be simplified. For example, integrated against a function which is regular at ζ± , this may be rewritten as

r± dζ dz = , 2 ν 2π iz 2π i f 0± [±]

1

≡

r±

C+

dζ

2 ν 2π i f 0± [±]

,

(5.48)

where the factor r± ensures that the residue at ζ = ζ± is equal to one. Integrating a function with a simple pole at ζ± , (5.48) must be generalized to r± s± 1 dz dζ = + + , (5.49) 2π iz 2π i f 2 ν ζ− − ζ+ r± 0± [±]

where $s_\pm$ is the second coefficient in the Taylor expansion
\[
f_{0\pm}^{2}\,\nu^\flat_{[\pm]} = r_\pm (\zeta - \zeta_\pm) + s_\pm (\zeta - \zeta_\pm)^2 + \cdots. \tag{5.50}
\]
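The role of the coefficients r and s in (5.50) can be made concrete in the patch around ζ_+. The sketch below assumes the Möbius form z ∝ (ζ − ζ_+)/(ζ − ζ_−) and a section truncated after the quadratic term of (5.50); it compares d log z/dζ with r/(f²ν) plus the constant s/r + 1/(ζ_− − ζ_+), which is the combination this assumption produces near ζ_+. The function names and sample values are illustrative only, and the signs in the patch around ζ_− are not addressed here.

```python
def dlogz(zeta, zeta_p, zeta_m):
    """(dz/dzeta)/z for z proportional to (zeta - zeta_p)/(zeta - zeta_m)."""
    return 1 / (zeta - zeta_p) - 1 / (zeta - zeta_m)

def local_form(zeta, zeta_p, zeta_m, r, s):
    """r/(f^2 nu) plus the constant correction, using the truncation of (5.50)."""
    d = zeta - zeta_p
    section = r * d + s * d ** 2
    return r / section + s / r + 1 / (zeta_m - zeta_p)

def correction(zeta_p, zeta_m, r, s):
    """Constant term that distinguishes dlogz from r/(f^2 nu) near zeta_p."""
    return s / r + 1 / (zeta_m - zeta_p)

zeta_p, zeta_m = 1.5 - 0.2j, -0.8 + 0.6j
r, s = 2.0 + 0j, 0.5 + 0j          # hypothetical Taylor coefficients
d = 1e-4 * (1 + 1j)                # small displacement from zeta_p
mismatch = dlogz(zeta_p + d, zeta_p, zeta_m) - local_form(zeta_p + d, zeta_p, zeta_m, r, s)

# A global O(2) section k (zeta - zeta_p)(zeta - zeta_m) has r = k (zeta_p - zeta_m), s = k.
k = 0.7 - 0.3j
global_correction = correction(zeta_p, zeta_m, k * (zeta_p - zeta_m), k)
```

Under these assumptions the mismatch vanishes linearly as ζ approaches ζ_+, and the constant correction is zero for a global O(2) section with zeros at ζ_±.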

Note that the correction term in round brackets in (5.49) vanishes in the case where ν remains a global O(2) section. In this way, we may rewrite the invariant coordinates introduced above as follows: ν[±] dζ Y± ≡ r± z±1 , 2 C± 2π i f 0± (ν[±] )2 ν[±] r± s± 1 dζ A± ≡ ± + + , (5.51) 2 ζ− − ζ+ r± C± 2π i f 0± ν[±] ν[±] dζ [±] [±] ± 0 µ . B I ≡ ∓ir± + c log ν [±] I I 2 ν C± 2π i f 0± [+] From the computation of the deformed hyperkähler potential (C.22) in Appendix C, one may check that the relation (3.38) continues to hold after perturbation, provided φ


is defined by (5.43) and r = (r+ + r− )/2. From the general equation (2.78), this implies that 2 1 + z z¯ Re [[+] (x r = |v fˆ0+ e | |z|

µ ,z)−φ 0 (x µ )] [+]

.

(5.52)

We have not attempted to check this relation directly. In [19], we shall apply this general framework to the hypermultiplet moduli space in compactifications of type II string theory on a Calabi-Yau three-fold.

Acknowledgements. We are grateful to A. Neitzke for discussions and former collaboration on related topics. The research of S.A. is supported by CNRS and by the contract ANR-05-BLAN-0029-01. The research of B.P. is supported in part by ANR(CNRS-USAR) contract no.05-BLAN-0079-01. F.S. acknowledges financial support from the ANR grant BLAN06-3-137168. S.V. thanks the Federation de Recherches “Interactions Fondamentales” and LPTHE at Jussieu for hospitality and financial support. Part of this work is also supported by the EU-RTN network MRTN-CT-2004-005104 “Constituents, Fundamental Forces and Symmetries of the Universe”.

Open Access. This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

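A. Infinitesimal SU(2) Transformations

In this appendix, we study the infinitesimal action of SU(2) on the local sections introduced in the main text. We parametrize the Lie algebra of SU(2) by $\epsilon_\pm = (\epsilon_\mp)^*$ and $\epsilon_3 = (\epsilon_3)^*$ such that
\[
\begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix}
= \begin{pmatrix} 1 - \frac{i}{2}\epsilon_3 & \epsilon_+ \\ -\epsilon_- & 1 + \frac{i}{2}\epsilon_3 \end{pmatrix} + O(\epsilon^2). \tag{A.1}
\]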

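The infinitesimal action of SU(2) on $\pi^A$, $\bar\pi^A$ is given in (2.4),
\[
\delta \begin{pmatrix} \pi^1 & \pi^2 \\ -\bar\pi^2 & \bar\pi^1 \end{pmatrix}
= \begin{pmatrix} \frac{i}{2}\epsilon_3 & -\epsilon_+ \\ \epsilon_- & -\frac{i}{2}\epsilon_3 \end{pmatrix} \cdot
\begin{pmatrix} \pi^1 & \pi^2 \\ -\bar\pi^2 & \bar\pi^1 \end{pmatrix}. \tag{A.2}
\]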

The finite action of SU(2) on O(2n) sections was discussed in Sect. 2.4. At the infinitesimal level, (2.41) reduces to
\[
\delta\zeta \equiv \zeta' - \zeta = \epsilon_+ - i\epsilon_3\,\zeta + \epsilon_-\,\zeta^2 + O(\epsilon^2). \tag{A.3}
\]
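As a cross-check of (A.3) against the parametrization (A.1), one can expand a fractional linear action to first order. The sketch below assumes that the finite action (2.41), which is not reproduced in this section, is the Möbius map ζ → (αζ + β)/(−β̄ζ + ᾱ); the function names and sample values are illustrative.

```python
def su2_moebius(zeta, eps_plus, eps_3):
    """Assumed finite action: zeta -> (alpha*zeta + beta)/(-conj(beta)*zeta + conj(alpha)),
    with alpha, beta taken from (A.1) and eps_minus = conj(eps_plus)."""
    alpha = 1 - 0.5j * eps_3
    beta = eps_plus
    return (alpha * zeta + beta) / (-beta.conjugate() * zeta + alpha.conjugate())

def delta_zeta(zeta, eps_plus, eps_3):
    """First-order variation as written in (A.3)."""
    eps_minus = eps_plus.conjugate()
    return eps_plus - 1j * eps_3 * zeta + eps_minus * zeta ** 2

# Small hypothetical parameters: eps_3 is real, eps_plus is complex.
zeta = 0.7 + 0.4j
eps_plus = 1e-4 * (0.6 - 0.8j)
eps_3 = 2e-4

exact_shift = su2_moebius(zeta, eps_plus, eps_3) - zeta
linear_shift = delta_zeta(zeta, eps_plus, eps_3)
```

With parameters of order 10⁻⁴ the two shifts differ only at second order, consistent with the O(ε²) remainder in (A.3).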

In the patch i = 0, the O(2n) transformation rule (2.44) then leads to

I I I δν[0] (ζ ) ≡ ν[0] (ζ ) − ν[0] (ζ )

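\[
= \left[\epsilon_+\,\partial_\zeta - i\epsilon_3\left(\zeta\,\partial_\zeta - n\right) + \epsilon_-\left(\zeta^2\,\partial_\zeta - 2n\zeta\right)\right] \nu^I_{[0]}(\zeta) + O(\epsilon^2). \tag{A.4}
\]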

Thus, the Taylor coefficients of $\nu^I_{[0]} = \sum_m \nu^I_m \zeta^m$ around ζ = 0 vary under an infinitesimal SU(2) action by
\[
\delta\nu^I_m = (m+1)\,\nu^I_{m+1}\,\epsilon_+ - i(m-n)\,\nu^I_m\,\epsilon_3 + (m-2n-1)\,\nu^I_{m-1}\,\epsilon_-. \tag{A.5}
\]

The variation of $\mu^{[0]}_I$ and its Laurent coefficients $\mu_{I,m}$ is obtained by replacing $\nu^I \to \mu_I$, $n \to 1 - n$ in these expressions.
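The step from the differential operator in (A.4) to the coefficient rule (A.5) can be verified mechanically. The following sketch represents a section by a dictionary of Laurent coefficients (a hypothetical encoding chosen for illustration), applies the three generators term by term, and compares the result with (A.5).

```python
def apply_generator(coeffs, n, eps_plus, eps_3, eps_minus):
    """Apply eps_+ d/dzeta - i eps_3 (zeta d/dzeta - n) + eps_- (zeta^2 d/dzeta - 2 n zeta)
    to sum_m coeffs[m] * zeta**m and return the coefficients of the result."""
    out = {}
    for m, c in coeffs.items():
        out[m - 1] = out.get(m - 1, 0) + eps_plus * m * c
        out[m] = out.get(m, 0) - 1j * eps_3 * (m - n) * c
        out[m + 1] = out.get(m + 1, 0) + eps_minus * (m - 2 * n) * c
    return out

def rule_a5(coeffs, n, eps_plus, eps_3, eps_minus, m):
    """Variation of the m-th coefficient as stated in (A.5)."""
    nu = lambda k: coeffs.get(k, 0)
    return ((m + 1) * nu(m + 1) * eps_plus
            - 1j * (m - n) * nu(m) * eps_3
            + (m - 2 * n - 1) * nu(m - 1) * eps_minus)

# Hypothetical O(2) section (n = 1) with arbitrary coefficients and parameters.
coeffs = {0: 1 + 2j, 1: 0.5, 2: -1j}
n, eps_plus, eps_3, eps_minus = 1, 0.3 - 0.1j, 0.2, 0.3 + 0.1j
varied = apply_generator(coeffs, n, eps_plus, eps_3, eps_minus)
```

For n = 1 the variation of a quadratic polynomial stays quadratic: the coefficients of ζ⁻¹ and ζ³ vanish, as expected for a global O(2) section.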


In an arbitrary patch Ui , the SU (2) action (2.44) is most easily expressed in terms I (ζ ), which formally transforms in the same way as ν I (ζ ). Similarly, the of f i0−2n (ζ ) ν[i] [0] SU (2) action (2.54) is most easily stated in terms of f i0−2 (ζ ) exp(−µ[i] /c[i] I ), which also I (ζ ). After the gauge transformation (3.13), formally transforms in the same way as ν[0]

the transformation rules of µ[i] I are changed to

+ [i] 2 ∂ δµ[i] (ζ ) = − i ζ + ζ µ (ζ ) + + ζ c[0] − i + 3 − ζ T ;I 3 − I T ;I ζ

+ − − ζ c[i] − I , ζ

(A.6)

consistently with the fact that (3.18) transforms like a non-anomalous O(0) section. The variation (A.5) applies to the Laurent coefficients of any local section of O(2n). In particular, one may consider a function $G(\nu_\alpha)$ of $O(2n_\alpha)$ multiplets $\nu_\alpha$ that is homogeneous of degree n when each $\nu_\alpha$ is scaled with weight $n_\alpha$:
\[
\sum_\alpha n_\alpha\,\nu_\alpha\,\partial_{\nu_\alpha} G = n\,G. \tag{A.7}
\]
Then, for an arbitrary contour (not necessarily surrounding the origin), the SU(2) variation of the integrals
\[
G_m \equiv \oint \frac{d\zeta}{2\pi i\,\zeta^{m+1}}\,G(\nu_\alpha) \tag{A.8}
\]
is given by
\[
\delta G_m = (m+1)\,G_{m+1}\,\epsilon_+ - i(m-n)\,G_m\,\epsilon_3 + (m-1-2n)\,G_{m-1}\,\epsilon_-, \tag{A.9}
\]

as one can check from explicit calculation. In particular, for n = m = −1, we recover the remark in [39], according to which a contour integral of a section of O(−2) is SU (2)invariant. For n = −3/2, the contour integral of a section of O(−3) with m = −1, −2 instead produces a SU (2) doublet. These observations are central to the superconformal quotient discussed in Sect. 3.2.
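Both special cases quoted above follow from reading off the three numerical coefficients in (A.9). The short sketch below tabulates them with exact rational arithmetic; the helper name is illustrative.

```python
from fractions import Fraction

def a9_coefficients(m, n):
    """Coefficients multiplying (G_{m+1} eps_+, -i G_m eps_3, G_{m-1} eps_-) in (A.9)."""
    return (m + 1, m - n, m - 1 - 2 * n)

# Section of O(-2): n = -1, m = -1. All three coefficients vanish.
invariant = a9_coefficients(-1, -1)

# Section of O(-3): n = -3/2, m = -1 and m = -2.
n_half = Fraction(-3, 2)
upper = a9_coefficients(-1, n_half)   # coefficient of G_0 vanishes
lower = a9_coefficients(-2, n_half)   # coefficient of G_{-3} vanishes
```

For n = m = −1 the variation vanishes identically, while for n = −3/2 the variations of $G_{-1}$ and $G_{-2}$ involve only $G_{-1}$ and $G_{-2}$, so the pair closes under SU(2).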

B. An alternative Formulation for Hypermultiplet Moduli Spaces In this appendix we explain the relation between the formulation of the hypermultiplet space used in Sect. 4, and the one introduced in [28] using a different contour prescription, and establish their equivalence up to a local symplectomorphism. Aiming for a Lagrangian L whose limit v → 0 is regular, [28] considered the contour integral L (v, v, ¯ x) = Im

C

dζ 2π iζ

F(η ) , + 4ic η log η η

(B.1)


where the contour C appearing in (4.1) encircles the poles ζ = 0, ζ+ in the counter-clockwise direction and the logarithmic branch cuts connect 0, ζ+ and ζ− , ∞, respectively. The resulting Lagrangian L =

1 F(v) F (v) Im F(η+ ) − x Im 2 + x Im r (v ) v

x +r , +4c x − r + x log 2

(B.2)

differs from (4.3), (4.14) by terms linear in x I only and therefore describes the same metric. In terms of our general discussion, the contour prescription (B.1) arises from the transition functions i F(η) H [0+] = 0, H [0i] = , 2 η (B.3) ¯ i F(η) − F(η) [0−] [0∞] =H = − 4cη log η , H 2 η where i labels the patches where F(η) is singular. These transition functions are related to the ones given in (4.2), (4.15) by the gauge transformation generated by G [0] =

i F(η) − 2c η log(η ζ ), 2 η

G [∞] =

¯ i F(η) + 2c η log(η /ζ ), 2 η (B.4)

G [+] = G [−] = G [i] = −2c η log(ζ ). The new non-vanishing quasi-homogeneity coefficients (3.16) are c[0] = c[+] = −2c,

c[∞] = c[−] = 2c.

(B.5)

Note that in contrast to the description in Sect. 4.1, the coefficient c[0] does not vanish, corresponding to the Lagrangian (B.2) being quasi-homogeneous. We now discuss the twistor lines arising from this new contour prescription. The global O(2) sections η(ζ ) are unchanged while the gauge transformation (B.4) induces i F (η) , 2 η i F(η) + 2c (log(η ζ ) + 1). µ[0] (ζ ) = µ[0] (ζ ) + 2 (η )2

µ[0] (ζ ) = µ[0] (ζ ) −

(B.6)

Consequently, the coordinates w I become w = w −

i F (v) , 2 v

w = w +

i F(v) + 2c (log(v ) + 1). 2 (v )2

(B.7)

The corresponding map between the coordinates I and I is readily obtained using I = −i(w I − w¯ I ). Furthermore, the base coordinates (3.36)–(3.39) are identical,

eφ = eφ ,

Z a = Z a ,

A = A ,

B I = B I ,

(B.8)

where the last equality follows from (3.39) upon a brief computation. Taking into account also that only c[+−] , which are equal in the two formulations, contribute to I the general expressions (3.53), this, in turn, implies that both contour prescriptions give rise to the same twistor lines (4.23).


C. Deformed Superconformal Quotient

In this appendix we generalize the superconformal quotient procedure of Sect. 3.2 to include deformations. While conceptually straightforward, this procedure is toilsome compared to the contact geometry approach of Sect. 5.2. Nevertheless, we include it here for completeness, as it provides useful consistency checks on our formalism.

Coordinates on the deformed base. As in the undeformed case described in Sect. 3.2, SU (2) invariant functions on the Swann bundle S can be obtained by contour-integrating O(−2) sections on ZS . In the presence of deformations, the global sections η I in (3.36) I , leading to the definitions26 must be replaced by the deformed local sections ν[+] 1 ≡ Re r Z ≡r a

C+

C+

dζ 1 , 2 2π i f 0+ ν[+]

ν[+] dζ , 2 0 2π i f 0+ ν[+] ν[+]

A ≡ r Re

C+

a

B I ≡ 2r Im C+

ν[+] dζ , 2 2π i f 0+ (ν[+] )2

(C.1) [+] 0 µ[+] dζ I + c I log ν[+] . 2 2π i f 0+ ν[+]

To first order in the deformation, this gives v¯ 1 νˆ ζ,+ − νˆ ζ,− + νˆ + + νˆ − , 2 r a a 0 ˘ νˆ − A νˆ + νˆ + − A˘ 0 νˆ + ˘ a − Z a = Z˘ a + + Z , ζ+ η+0 ζ+ η+0 (C.2)

1 2v¯ ˘ A = A + νˆ ζ,+ − νˆ ζ,− + νˆ + + νˆ − 2r r 1 − 2 Re ζ+ η+ νˆ ζ ζ,+ + 2r A˘ −x + 2v¯ ζ+ νˆ ζ,+ +2 2v¯ A˘ − v¯ νˆ + , (r ) r = r˘ +

where ˘ marks the unperturbed quantities defined in (3.36) and we introduced 2 I I 2 I 2 I νˆ ±I = f 0± νˆ [±] (ζ± ), νˆ ζ,± =∂ζ f 0± νˆ [±] (ζ± ), νˆ ζI ζ,±=∂ζ2 f 0± νˆ [±] (ζ± ). (C.3) We omitted the expansion of B I since it will not be needed. The SU (2) invariant R defined in (3.40) may also be extended to the deformed case as 0 − A 0− A ˘ 0 νˆ − ˘ 0 νˆ + νˆ − 1 ν ˆ + ˘ 1+ R=R + 0 2 ζ+ η+0 ζ− η −

1 2v¯ . (C.4) − νˆ ζ,+ − νˆ ζ,− + νˆ + + νˆ − 2r r Using the formulae given at the end of this appendix, one can check explicitly that the above expressions are indeed SU (2) invariant. 26 Note that dζ / f 2 = dζ [+] is the natural integration measure in the patch U . + 0+


Coordinates on the C2 /Z2 fiber. The coordinates π A on the fiber of S, (3.41) can be similarly generalized to the deformed case as follows: ζ dζ 1 π =C 1/2 2π i C+ 2 ν 2 ν0 f f 0+ 0+ [+] [+]

0 νˆ ζ,+ νˆ + 4v¯ x 0 + 2v 0 /ζ+ νˆ + 1 = π˘ 1 − − − + , r 2r r 2ζ+ η+0 ζ+ η+0 (C.5) ζ dζ 2 ¯ π = −C 1/2 C− 2π i f 2 ν 2 0 0− [−] − f 0− ν[−] 0 νˆ ζ,− νˆ − νˆ − 4v¯ x 0 + 2v 0 /ζ− 2 = π˘ 1 − + − − . 0 0 r 2r r 2ζ− η− ζ− η − The conjugate variables π¯ A can be obtained from (C.5) using ζ+ = −1/ζ− ,

η+ = η− ,

νˆ +I = −ˆν−I /ζ−2 ,

In particular, one has very simple relations π1 νˆ + = ζ˘+ 1 − , ζ+ ≡ − π¯ 2 r ζ+

I I = −ˆ νˆ ζ,+ νζ,− + 2νˆ −I /ζ− .

νˆ − π2 = ζ˘− 1 + ζ− ≡ , π¯ 1 r ζ−

whereas the variable z = π 1 /π 2 parametrizing the fiber of ZM is given by 0 νˆ ζ,+ + νˆ ζ,− νˆ − νˆ +0 + − z = z˘ 1 − 0 r 2ζ+ η+0 2ζ− η−

νˆ − 4v¯ νˆ + 4v¯ x 0 + 2v 0 /ζ+ x 0 + 2v 0 /ζ− + − + − . 0 2r r 2r r ζ+ η+0 ζ− η −

(C.6)

(C.7)

(C.8)

To first order in the perturbation, the two quantities in (C.7) provide the zeros ζ± of the deformed section ν , and (2.57),(2.61) continue to hold. Using the explicit expressions (C.5) for π A , one finds νˆ − r νˆ + dz 1 = − + dζ. (C.9) z ζ η r (ζ − ζ+ )2 (ζ − ζ− )2 Relation to the contact geometric approach. Using this relation, one may relate the coordinates defined here to those defined in Sect. 5.2: ⎞ ⎛ d˘ z ∂ρ H [0 j] (ξ˘ (˘z), ρ(˘z))⎠ , Y+ = R Z ⎝1 − 3i z (1) ˜ j 2π i˘ C j d˘z −1 [0 j] ˘ ˘ z˘ Z + z˘ Z¯ ∂ρ H(1) A = A + i R (ξ (˘z), ρ(˘z)), (C.10) z C˜ j 2π i˘ j

1 r = (r+ + r− ), 2

BI = BI ,

where

˘ z˘ −1 Z˘ − z˘ Z˘¯ , ξ˘ (˘z) = A˘ + R d˘z z˘ + z˘ [i j] ρ I (˘z) = B˘ I − i H (ξ˘ (˘z )) ˘ z˘ − z˘ I ˜ j 2π i z C j [i0] ˘ −i H I (ξ (˘z)) + H I[i∞] (ξ˘ (˘z)) − ic[+−] log z˘ I

(C.11)

parametrize the unperturbed twistor lines and z˘ is related to ζ through the (undeformed) relation (2.61), z˘ = −

1 ζ − ζ˘+ . z¯˘ ζ − ζ˘−

(C.12)

With these definitions and relations, it is tedious but straightforward to compute the deformed complex coordinate ξ = u /u on Z in terms of the coordinates on M × CP 1 , d˘z z˘ + z −1 [0 j] ˘ [0 j] ξ = A +z Y+ −zY− +i ∂ρ H(1) . − ξ (˘z)∂ρ H(1) z z˘ − z C˜ j 2π i˘ j

(C.13) Taking into account the relation

[i0] ˘ [i∞] ˘ ρ I = −2iξ˘˜ [i] − i H ( ξ ) + H ( ξ ) , I I I

(C.14)

one verifies that this expression coincides with the result (5.31) from contact geometry, at ζ = 0, z = z. A similar derivation of ξ˜ I[i] in (5.39) ought to be possible but we have not attempted to carry it through. Deformed hyperkähler potential. Having defined an appropriate set of coordinates on the Swann bundle, we now generalize the representation (3.56) for the hyperkähler potential to the deformed case, and relate it to the contact potential [+] of contact geometry. For this purpose, we note that the hyperkähler potential (5.12) can be written as dζ

1 − x I ∂ η I + ∂ x I ρ J ∂ρ J χ = 1 + νˆ 0I ∂v I + νˆ¯ 0I ∂v¯ I C 2π i ζ (C.15) × (H (η) + H(1) (η, ρ)) , where we omitted the summation over patches and used the relation (5.11) between the undeformed and deformed complex coordinates v I and u I ≡ v I + νˆ 0I . Following the same steps (3.30) which led to the representation (3.56), one finds d˘z −1 ˘ ˘ z˘ Z − z˘ Z˘¯ (H + H(1) ) ξ˘ (˘z), ρ(˘z) χ = 1 + νˆ 0I ∂v I + ν¯ˆ 0I ∂v¯ I r R z C˜ 2π i˘ (C.16) dζ

v ζ −1 + v¯ ζ [+−] ˘ I I x ∂x I − ζ ∂ζ ρ J ∂ρ J H(1) , +r c I A − η C˜ 2π iζ where ξ˘ (˘z) and ρ(˘z) are given in (C.11).


The unperturbed quantities A˘ , Z˘ , etc., are not SU (2) invariant. To arrive at the desired form of the hyperkähler potential, we replace them in the first term of (C.16) by their deformed SU(2) invariant counterparts and collect the remaining terms which are all of order O(H(1) ). The first term in (C.16) then reads

d˘z −1 z˘ Z − z˘ Z¯ (H + H(1) ) ξ(0) (˘z), ρ(˘z) + r c[+−] A I , (C.17) r R I 2π i˘ z ˜ C where ξ(0) (˘z) ≡ A + z˘ −1 Y+ − z˘ Y− .

(C.18)

The remaining terms are of three types: (i) the second term in (C.16), (ii) the terms coming from the derivatives with respect to v I and v¯ I in the first term in (C.16) and (iii) the terms coming from the difference between deformed and undeformed invariants in (C.17). Altogether they should combine in an invariant expression written as a contour integral of a O(−2) section. After a long calculation, one obtains d˘z −1 [0 j] [0 j] χ =r R z˘ Z − z˘ Z¯ (H + H(1) ) + r c[+−] AI I 2π i˘ z ˜ C d˘z −1 [ j0] [ j∞] z˘ Z − z˘ Z¯ H + H − ir R z C˜ 2π i˘ [0 j] [0 j] ∂ρ H(1) − ξ(0) ∂ρ H(1) d˘z d˘z z˘ + z˘ −1 ¯ H [i j] ˘ ˘ + ir R z Z − z Z z C˜ 2π i˘z z˘ − z˘ C˜ 2π i˘ [0i] [0i] ∂ρ H(1) , (C.19) − ξ(0) (˘z )∂ρ H(1) [i j]

[0i] are functions of ξ (˘ where H are functions of ξ(0) (˘z), whereas H(1) z) (or, (0) z) and ρ(˘ in the last term, functions of z˘ ). This expression can be further simplified. First, taking into account (C.14) and [i j] [i j] [i j] [i j] H (ξ ) = Hˆ [i j] (ξ ) − ξ Hˆ (ξ ) + c ξ + c ,

(C.20)

it is easy to check that the first and third terms in (C.19) combine into one with the [0 j] derivatives of the transition functions replaced by T˜ (ξ(0) , ξ˜ [ j] ) (5.23). Moreover, the last term in (C.19) can be rewritten as d˘z −1 [0 j] z˘ Z − z˘ Z¯ H r R 2π i˘ z ˜ Cj j d˘z z˘ + z˘ 1 T × −T[0 j] + , (C.21) 2 z z˘ − z˘ [0i] C˜ i 2π i˘ i

where for i = j the variable z˘ lies inside the contour for z˘ and the first term appears since in (C.19) the situation was opposite. Then the expression in the square brackets (˘ (˘ is just ξ[0] z) − ξ(0) z) for z˘ ∈ U j . Finally, Y and RZ differ by a phase factor (see


(C.10)) which can be absorbed into a redefinition of the integration variable z˘ . As a result, the hyperkähler potential can be written more compactly as χ =r

j

C˜ j

d˘z −1 [0 j] z˘ Y+ − z˘ Y− T˜ (ξ[0] , ξ˜ [ j] ) + r c[+−] AI . I 2π i˘z

(C.22)

Comparing with the contact potential (5.45), one finds that the relation (3.38) continues to hold in the perturbed case.

Deformed SU (2) transformations. The SU (2) invariance of the quantities defined in this appendix can be checked using the following transformation rules, which follow from the general discussion in Appendix A: 3 ¯I 3 − νˆ 3 , δ v¯ I = −i3 v¯ I + − x I − + νˆ 3I , 2 2

δ I = i − Lu¯ I − + Lu I + 3 c I , (C.23) δx I = −2(− v I + + v¯ I ), i i δw I = + Lu I + 3 c I , δ w¯ I = − Lu¯ I − 3 c I , 2 2 3 ¯I I I δ νˆ 0 = i3 νˆ 0 + i+ L I + − νˆ 3 , 2 3 2 I I I δ νˆ ± = (i3 − 2− ζ± )ˆν± − + ζ± νˆ 3 − − ν¯ˆ 3I , (C.24) 2 I I I δ νˆ ζ,± = −2− νˆ ± − 3+ ζ± νˆ 3 , δv I = i3 v I + + x I −

I δ νˆ ζI ζ,± = −(i3 − 2− ζ± )ˆνζI ζ,± − 2− νˆ ζ,± − 3+ νˆ 3I ,

where νˆ 0I

= C

dζ I H , 2π (1)

νˆ 3I

= C

dζ HI , π ζ 3 (1)

ν¯ˆ 3I = −

C

dζ HI π ζ −1 (1)

(C.25)

are Laurent coefficients of the deformation νˆ 0I given in (5.7) (as usual, we omitted the sum over contours). The following properties, valid in the absence of perturbations, are also useful: δζ± = −+ − − ζ±2 + i3 ζ± , = −(+ ζ¯∓ + − ζ± )η± , δη± 1 + z z¯ v 1 + z z¯ v¯ z + , δ z¯ = z¯ − , δz = |z| v |z| v¯ δρ I = + + − ζ 2 − i3 ζ ∂ζ ρ I .

(C.26) (C.27) (C.28) (C.29)




Communicated by N.A. Nekrasov

Commun. Math. Phys. 296, 405–428 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1024-9


The ADHM Construction and Non-local Symmetries of the Self-dual Yang–Mills Equations

James D. E. Grant

Fakultät für Mathematik, Universität Wien, Nordbergstrasse 15, 1090 Wien, Austria. E-mail: [email protected]

Received: 16 December 2008 / Accepted: 18 December 2009
Published online: 4 March 2010 – © Springer-Verlag 2010

To Nicola Ramsay

Abstract: We consider the action on instanton moduli spaces of the non-local symmetries of the self-dual Yang–Mills equations on $\mathbb{R}^4$ discovered by Chau and coauthors. Beginning with the ADHM construction, we show that a sub-algebra of the symmetry algebra generates the tangent space to the instanton moduli space at each point. We explicitly find the subgroup of the symmetry group that preserves the one-instanton moduli space. This action simply corresponds to a scaling of the moduli space.

1. Introduction

The self-dual Yang–Mills equations have been investigated from two rather distinct points of view in the last few decades. The first direction is in the study of the topology of four-manifolds, and the work of Donaldson (see, e.g., [13,14]). In this approach, a fundamental role is played by the analysis of moduli spaces of solutions of the self-dual Yang–Mills equations with $L^2$ curvature (so-called "instanton solutions") on given four-manifolds. The analysis of such moduli spaces then yields powerful information concerning differentiable structures on the underlying four-manifold. The second, seemingly unrelated, development is in the theory of integrable systems. In particular, it has been shown that many known integrable systems of differential equations may be derived as symmetry reductions of the self-dual Yang–Mills equations (see, e.g., [21]).

The purpose of the current paper, and its companion [15] which studies reducible connections, is to investigate whether properties of the self-dual Yang–Mills equations that follow from their integrable nature may be used to yield global information about instanton moduli spaces. In particular, it is known that the self-dual Yang–Mills equations on $\mathbb{R}^4$ admit an infinite-dimensional algebra of non-local symmetries [6–8]. In this paper, we investigate the action of these symmetries on the instanton moduli spaces on $\mathbb{R}^4$ and, in particular, investigate, on the one-instanton moduli space, the sub-algebra of symmetries that preserve the $L^2$ condition on the curvature of the connection. Thinking of


such symmetries as generating flows on the moduli space of all self-dual connections, $M$, and of the $k$-instanton moduli space, $M_k$, as a subspace of $M$, it is known that the non-local symmetries in general do not lie tangent to the subspaces $M_k$ and, therefore, do not preserve the $L^2$ nature of the curvature (see, e.g., [9, Chap. V], but also Remark 4.4 below).

Our results are rather double-edged. In Theorem 4.1, we show that the full tangent space to the moduli spaces $M_k$ is generated by the fundamental vector fields of the symmetry algebra acting on the moduli space of self-dual connections $M$. When we attempt to "exponentiate" these tangent vectors into a group action on $M_k$, however, our conclusion is that the family of transformations that preserves the $L^2$ nature of the connection is rather small. In particular, the symmetries have orbits of rather high codimension in the moduli spaces. More specifically, in Theorem 5.1, we deduce that the only symmetries of the self-dual Yang–Mills equations that act on the five-dimensional one-instanton moduli space $M_1$ correspond to a scaling of the instanton solutions. Such a collapse to orbits of large codimension is not unfamiliar from the theory of harmonic maps into Lie groups [1,17,20,28], where one has similar non-local symmetry algebras [10].

The paper is organised as follows. In Sect. 2, we summarise the relevant background material that we require from both the integrable systems approach to the self-dual Yang–Mills equations and the ADHM approach to the instanton problem. Since our considerations are aimed at making a connection between the local aspects of the self-dual Yang–Mills equations and the global aspects, and the literature in these fields generally uses completely different notation, it was deemed necessary to give an integrated, relatively detailed description of both approaches in a consistent notation. This accounts for the length of this section.$^1$ In Sect. 3, we show how the ADHM construction may be used to yield explicit patching matrices for holomorphic bundles over subsets of $\mathbb{CP}^3$, to which we may apply the results of [9] concerning the action of the symmetry algebra of the self-dual Yang–Mills equations. In Sect. 4, we show that one-parameter families of ADHM data yield transformations that fall into the category of transformations considered in [9], with the important proviso that these transformations are significantly restricted by the assumption that they are generated by flows on the full moduli space of self-dual connections. In Sect. 5, we show that our constructions can be carried out explicitly on the one-instanton moduli space, and that the only symmetries (consistent with a particular technical assumption) that have a well-defined action on the one-instanton moduli space are scalings. Finally, in an Appendix, we give a direct derivation of the action of the non-local symmetries of the self-dual Yang–Mills equations on the twistorial patching matrix from the action on the self-dual connection.

Note also that the symmetries that we investigate can be constructed, by the same methods, on hyper-complex manifolds. Since we wish to consider symmetries that generalise to manifolds other than $\mathbb{R}^4$, and there exist hyper-complex manifolds with no continuous (conformal) isometries, we will not consider symmetries (such as those discussed in [26]) that follow from the existence of a non-trivial conformal group on our manifold.

1 For more information on the local aspects of the self-dual Yang–Mills equations that are relevant to us, see [9, Chaps. II & III]. For more information concerning the ADHM formalism see either [4] or [2, Chaps. II-IV].


2. Preliminaries

2.1. Quaternions and twistor spaces. We will deal exclusively with the self-dual Yang–Mills equations on $\mathbb{R}^4$ and $S^4$, and make constant use of isomorphisms $\mathbb{R}^4 \cong \mathbb{C}^2 \cong \mathbb{H}$, which we first fix. As such, let $x := (x^1, x^2, x^3, x^4) \in \mathbb{R}^4$. We may then view $u := x^1 + i x^2$, $v := x^3 - i x^4$ as coordinates on $\mathbb{C}^2$, defining an isomorphism $\mathbb{R}^4 \to \mathbb{C}^2$; $x \mapsto (u, v)$. In terms of these coordinates, the flat metric on $\mathbb{R}^4$ takes the form

$$g = \frac{1}{2} \left( du \otimes d\bar{u} + d\bar{u} \otimes du + dv \otimes d\bar{v} + d\bar{v} \otimes dv \right),$$

and the corresponding volume form is

$$\epsilon = dx^1 \wedge dx^2 \wedge dx^3 \wedge dx^4 = \frac{1}{4}\, du \wedge d\bar{u} \wedge dv \wedge d\bar{v}.$$

Let $P \to \mathbb{R}^4$ be a principal $SU_2$ bundle$^2$ over $\mathbb{R}^4$, and $E \to \mathbb{R}^4$ the rank-two complex vector bundle associated to $P$ via the fundamental representation of $SU_2$. (We will switch between the principal bundle and vector bundle pictures without comment.) An $SU_2$ connection on $E$, $A \in \Omega^1(\mathbb{R}^4, \mathfrak{su}_2)$, is a solution of the self-dual Yang–Mills equations if its curvature satisfies $F = \ast F$. In terms of the complex coordinates introduced above, this equation is equivalent to

$$F_{uv} = 0, \qquad F_{u\bar{u}} + F_{v\bar{v}} = 0, \qquad F_{\bar{u}\bar{v}} = 0. \tag{2.1}$$

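We introduce complex vector fields $X(z), Y(z) \in C^\infty(\mathbb{R}^4, TM \otimes \mathbb{C})$ depending on a parameter $z \in \mathbb{C} \cup \{\infty\} \equiv \mathbb{CP}^1$:

$$X(z) := \partial_{\bar{u}} - z\,\partial_v, \qquad Y(z) := \partial_{\bar{v}} + z\,\partial_u. \tag{2.2}$$

Then Eqs. (2.1) are equivalent to the requirement that

$$F(X(z), Y(z)) = 0, \qquad \forall z \in \mathbb{CP}^1. \tag{2.3}$$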
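The equivalence of (2.1) and (2.3) can be checked by expanding in powers of $z$. The following is a sketch, assuming only that $F$ is bilinear and antisymmetric, that $F_{ab}$ denotes $F(\partial_a, \partial_b)$, and that bars denote the conjugate coordinates $\bar{u}$, $\bar{v}$.

```latex
% Assumed conventions: F_{ab} := F(\partial_a, \partial_b), F_{ab} = -F_{ba},
% X(z) = \partial_{\bar u} - z\,\partial_v,  Y(z) = \partial_{\bar v} + z\,\partial_u.
\begin{align*}
F\bigl(X(z), Y(z)\bigr)
 &= F_{\bar u \bar v} + z\,F_{\bar u u} - z\,F_{v \bar v} - z^2 F_{v u} \\
 &= F_{\bar u \bar v} - z\,\bigl(F_{u \bar u} + F_{v \bar v}\bigr) + z^2 F_{u v}.
\end{align*}
% A polynomial in z vanishes for all z if and only if each coefficient vanishes,
% which reproduces the three equations in (2.1).
```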

Since $\mathbb{R}^4 \subset \mathbb{R}^4 \cup \{\infty\} \cong S^4 \cong \mathbb{HP}^1$, a point $x \in \mathbb{R}^4 \cong \mathbb{C}^2$ determines a quaternionic line in $\mathbb{H}^2$. In particular, we define $x := u + jv \in \mathbb{H}$. In terms of homogeneous coordinates $(p, q) \in \mathbb{H}^2$ on $\mathbb{HP}^1$, the point $x$ determines the quaternionic line $l_x := \{ (q, p) \in \mathbb{H}^2 \mid q = x p \}$ in $\mathbb{H}^2$. Now, let $p = z_1 + j z_2$, $q = z_3 + j z_4$ with $z := (z_1, \ldots, z_4) \in \mathbb{C}^4$, and view $z$ as homogeneous coordinates on $\mathbb{CP}^3$. Right-multiplication by $j$ on $\mathbb{H}^2$ defines an anti-linear anti-involution

$$\sigma : \mathbb{C}^4 \to \mathbb{C}^4; \qquad (z_1, z_2, z_3, z_4) \mapsto (-\bar{z}_2, \bar{z}_1, -\bar{z}_4, \bar{z}_3), \tag{2.4}$$

2 We restrict to $SU_2$ for simplicity. All of our considerations are equally valid for any classical Lie group.


which descends to define an involution on the projective space:

$$\sigma : \mathbb{CP}^3 \to \mathbb{CP}^3; \qquad [z_1, z_2, z_3, z_4] \mapsto [-\bar{z}_2, \bar{z}_1, -\bar{z}_4, \bar{z}_3].$$

The image of the quaternionic line $l_x$ in $\mathbb{CP}^3$ corresponding to $x \in \mathbb{R}^4$ is given by the embedded projective line

$$L_x \equiv L_{(u,v)} = \left\{ [z_1, z_2, z_3, z_4] \in \mathbb{CP}^3 \mid z_3 = z_1 u - z_2 \bar{v},\; z_4 = z_1 v + z_2 \bar{u} \right\}. \tag{2.5}$$

Similarly, the embedded line corresponding to the point $\infty \in S^4$ is $L_\infty := \{ [0, 0, z_3, z_4] \mid (z_3, z_4) \in \mathbb{C}^2 \setminus \{(0, 0)\} \}$. The lines $L_p$, for $p \in S^4$, are invariant under the action of $\sigma$, and are referred to as real lines. We will make particular use of the projection

$$\pi : \mathbb{CP}^3 \setminus L_\infty \to \mathbb{R}^4; \qquad L_x \mapsto x.$$
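The description (2.5) of the real line, and its invariance under $\sigma$, follow from a short quaternionic computation. The sketch below assumes the standard rule $j z = \bar{z} j$ for $z \in \mathbb{C}$ (so $z j = j \bar{z}$ and $j^2 = -1$) together with $x = u + jv$, $p = z_1 + j z_2$, $q = z_3 + j z_4$.

```latex
% Assumed quaternion rules: j z = \bar z j for z \in \mathbb{C}, j^2 = -1.
\begin{align*}
x p = (u + j v)(z_1 + j z_2)
   &= u z_1 + j\,\bar u z_2 + j\,v z_1 + j^2\,\bar v z_2 \\
   &= (z_1 u - z_2 \bar v) + j\,(z_1 v + z_2 \bar u),
\end{align*}
% so q = x p is equivalent to z_3 = z_1 u - z_2 \bar v and z_4 = z_1 v + z_2 \bar u.
% Invariance under \sigma: with z' := \sigma(z) = (-\bar z_2, \bar z_1, -\bar z_4, \bar z_3),
\begin{align*}
z'_1 u - z'_2 \bar v &= -\bar z_2 u - \bar z_1 \bar v = -\overline{(z_1 v + z_2 \bar u)} = -\bar z_4 = z'_3, \\
z'_1 v + z'_2 \bar u &= -\bar z_2 v + \bar z_1 \bar u = \overline{(z_1 u - z_2 \bar v)} = \bar z_3 = z'_4 .
\end{align*}
```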

On a fixed real line, $L_x$, $x \in \mathbb{R}^4$, we introduce the affine coordinate $z = z_2/z_1 \in \mathbb{CP}^1$. Finally, on the subset $U_1 := \{ [z_1, z_2, z_3, z_4] \in \mathbb{CP}^3 \mid z_1 \neq 0 \}$, we may introduce coordinates $w_1 := z_3/z_1$, $w_2 := z_4/z_1$, $w_3 := z_2/z_1 \equiv z$. By definition, the coordinates $(w_1, w_2, w_3)$, viewed as functions on $U_1$, are holomorphic with respect to the complex structure that $U_1$ inherits as an open subset of $\mathbb{CP}^3$. From Eq. (2.5), the intersection $L_x \cap U_1$ consists of the set of points with $(w_1, w_2, w_3) = (u - z\bar{v}, v + z\bar{u}, z)$. We will therefore often view $(u, v, z)$ as coordinates on the set $U_1 \cong \mathbb{C}^2 \times \mathbb{C}$, with the functions $(u - z\bar{v}, v + z\bar{u}, z)$ being holomorphic with respect to the complex structure on $U_1$. One may then check that, in this coordinate system, a basis for anti-holomorphic vector fields on the set $U_1$ is given by the vector fields $\{X(z), Y(z), \partial_{\bar{z}}\}$, with $X(z)$, $Y(z)$ as in Eq. (2.2). A similar argument may be carried out on the set $U_2 := \{ [z_1, z_2, z_3, z_4] \in \mathbb{CP}^3 \mid z_2 \neq 0 \}$. In practice, the construction on $U_2$ means that we may use the formulae for $X(z)$, $Y(z)$ with $z$ taking values in $\mathbb{CP}^1$. As such, we will often, henceforth, identify the set $\mathbb{CP}^3 \setminus L_\infty$ with the set $\mathbb{C}^2 \times \mathbb{CP}^1$. When we speak of a function being, for example, "holomorphic" on $\mathbb{C}^2 \times \mathbb{C} \subset \mathbb{C}^2 \times \mathbb{CP}^1$, we will mean holomorphic on the set $U_1$ with the induced complex structure mentioned above.$^3$
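The claim that $X(z)$, $Y(z)$ and $\partial_{\bar{z}}$ annihilate the holomorphic coordinates can be verified directly. This is a sketch, treating $(u, \bar{u}, v, \bar{v}, z, \bar{z})$ as independent variables and writing $w_1 = u - z\bar{v}$, $w_2 = v + z\bar{u}$, $w_3 = z$.

```latex
% Assumed: X(z) = \partial_{\bar u} - z\,\partial_v,  Y(z) = \partial_{\bar v} + z\,\partial_u.
\begin{align*}
X(z)\,w_1 &= \partial_{\bar u}(u - z\bar v) - z\,\partial_v(u - z\bar v) = 0, \\
X(z)\,w_2 &= \partial_{\bar u}(v + z\bar u) - z\,\partial_v(v + z\bar u) = z - z = 0, \\
Y(z)\,w_1 &= \partial_{\bar v}(u - z\bar v) + z\,\partial_u(u - z\bar v) = -z + z = 0, \\
Y(z)\,w_2 &= \partial_{\bar v}(v + z\bar u) + z\,\partial_u(v + z\bar u) = 0.
\end{align*}
% Moreover X(z) w_3 = Y(z) w_3 = 0 and \partial_{\bar z} w_a = 0 for a = 1, 2, 3,
% so all three vector fields annihilate every holomorphic coordinate on U_1.
```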

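Notation. Given $\epsilon > 0$, we define the following open subsets of $\mathbb{CP}^1$:

$$V^0_\epsilon := \left\{ z \in \mathbb{CP}^1 \mid |z| < 1 + \epsilon \right\}, \qquad V^\infty_\epsilon := \left\{ z \in \mathbb{CP}^1 \,\Big|\, |z| > \frac{1}{1+\epsilon} \right\},$$

and their intersection

$$V_\epsilon := V^0_\epsilon \cap V^\infty_\epsilon = \left\{ z \in \mathbb{CP}^1 \,\Big|\, \frac{1}{1+\epsilon} < |z| < 1 + \epsilon \right\}.$$

We define the involution on the projective line $\sigma : \mathbb{CP}^1 \to \mathbb{CP}^1$; $z \mapsto -1/\bar{z}$ which, geometrically, is simply the anti-podal map. Note that $\sigma(V^0_\epsilon) = V^\infty_\epsilon$ and $\sigma(V_\epsilon) = V_\epsilon$.

3 From a $\mathbb{CP}^1$ point of view, we are viewing $U_1 \cup U_2 = \mathbb{CP}^3 \setminus L_\infty$ as the total space of the normal bundle $\mathcal{O}(1) \oplus \mathcal{O}(1)$ of the rational curve $L_0 \subset \mathbb{CP}^3 \setminus L_\infty$, where $0$ denotes the origin in $\mathbb{R}^4$. $X$ and $Y$ are then linearly independent sections of this normal bundle. Unfortunately, this picture is not particularly well-suited to the calculations that we wish to perform.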


Given any subset $V \subset \mathbb{CP}^1$ and a map $g : V \to SL_2(\mathbb{C})$, we define a corresponding map $g^* : \sigma(V) \to SL_2(\mathbb{C})$ by $g^*(z) := (g(\sigma(z)))^\dagger$. (Throughout, $\dagger : SL_2(\mathbb{C}) \to SL_2(\mathbb{C})$ will denote complex-conjugate transpose.) Similarly, given any map $f : U \times V \to SL_2(\mathbb{C})$, we define a corresponding map $f^* : U \times \sigma(V) \to SL_2(\mathbb{C})$ by $f^*(x, z) := (f(x, \sigma(z)))^\dagger$.

2.2. Holomorphic bundles. An important property of the self-dual Yang–Mills equations is that they are the compatibility condition for the following overdetermined system of equations [9, Theorem 1]:

$$(\partial_{\bar{u}} - z\,\partial_v)\, \Psi(x, z) = -(A_{\bar{u}} - z A_v)\, \Psi(x, z), \tag{2.6a}$$
$$(\partial_{\bar{v}} + z\,\partial_u)\, \Psi(x, z) = -(A_{\bar{v}} + z A_u)\, \Psi(x, z), \tag{2.6b}$$
$$\partial_{\bar{z}} \Psi(x, z) = 0, \tag{2.6c}$$

for a map $\Psi : \mathbb{R}^4 \times V \to SL_2(\mathbb{C})$, where $V$ is a subset of $\mathbb{CP}^1$. In particular, given $\epsilon > 0$, there exists a solution, $\Psi_0 : \mathbb{R}^4 \times V^0_\epsilon \to SL_2(\mathbb{C})$, that is analytic in $z$ for $z \in V^0_\epsilon$. This solution is unique up to right multiplication

$$\Psi_0(x, z) \mapsto \widetilde{\Psi}_0(x, z) := \Psi_0(x, z)\, R(u - z\bar{v}, v + z\bar{u}, z),$$

where $R : \mathbb{C}^2 \times V^0_\epsilon \to SL_2(\mathbb{C})$ is holomorphic with respect to the complex structure that $\mathbb{C}^2 \times V^0_\epsilon$ inherits as a subset of $U_1$. Given $\Psi_0(x, z)$, we define

$$\Psi_\infty(x, z) := \left( \Psi_0^*(x, z) \right)^{-1}.$$

It is straightforward to check that $\Psi_\infty(x, z)$ is also a solution of (2.6) that is analytic in $z$ on $\mathbb{R}^4 \times V^\infty_\epsilon$. Defining the fields

$$\psi_0(x) := \Psi_0(x, 0), \qquad \psi_\infty(x) := \Psi_\infty(x, \infty),$$

then Eqs. (2.6) imply that we may write the components of the connection in the form

$$A_u = -(\partial_u \psi_\infty(x))\, \psi_\infty(x)^{-1}, \qquad A_v = -(\partial_v \psi_\infty(x))\, \psi_\infty(x)^{-1}, \tag{2.7a}$$
$$A_{\bar{u}} = -(\partial_{\bar{u}} \psi_0(x))\, \psi_0(x)^{-1}, \qquad A_{\bar{v}} = -(\partial_{\bar{v}} \psi_0(x))\, \psi_0(x)^{-1}. \tag{2.7b}$$

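We then define the Yang $J$-function $J : \mathbb{R}^4 \to SL_2(\mathbb{C})$ by

$$J := \psi_\infty(x)^{-1} \cdot \psi_0(x). \tag{2.8}$$

Noting that, from the definition of $\Psi_\infty$, we have $\psi_\infty(x) = \left( \psi_0(x)^\dagger \right)^{-1}$, it then follows that $J^\dagger = J$. A short calculation shows that the remaining part of the anti-self-dual part of the curvature may be written in the form

$$F_{u\bar{u}} + F_{v\bar{v}} = -\psi_\infty \left[ \partial_u \left( J_{\bar{u}} J^{-1} \right) + \partial_v \left( J_{\bar{v}} J^{-1} \right) \right] \psi_\infty^{-1} = -\psi_0 \left[ \partial_{\bar{u}} \left( J^{-1} J_u \right) + \partial_{\bar{v}} \left( J^{-1} J_v \right) \right] \psi_0^{-1}.$$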
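The "short calculation" can be sketched as follows. The conventions here are assumptions rather than statements from the text: the curvature is taken as $F_{ab} = \partial_a A_b - \partial_b A_a + [A_a, A_b]$, and a gauge transformation by $h$ acts as $A \mapsto h^{-1} A h + h^{-1} dh$, $F \mapsto h^{-1} F h$. Subscripts on $J$ denote partial derivatives.

```latex
% Gauge-transform (2.7) by h = \psi_\infty, using \psi_0 = \psi_\infty J from (2.8).
\begin{align*}
A'_u &= A'_v = 0, \qquad
A'_{\bar u} = -\psi_\infty^{-1} (\partial_{\bar u} \psi_0)\, \psi_0^{-1} \psi_\infty
              + \psi_\infty^{-1} \partial_{\bar u} \psi_\infty
            = -J_{\bar u} J^{-1}, \qquad
A'_{\bar v} = -J_{\bar v} J^{-1}, \\
F_{u \bar u} + F_{v \bar v}
 &= \psi_\infty \bigl( \partial_u A'_{\bar u} + \partial_v A'_{\bar v} \bigr) \psi_\infty^{-1}
  = -\psi_\infty \bigl[ \partial_u ( J_{\bar u} J^{-1} ) + \partial_v ( J_{\bar v} J^{-1} ) \bigr] \psi_\infty^{-1}.
\end{align*}
% The second form follows in the same way from h = \psi_0, for which
% A''_{\bar u} = A''_{\bar v} = 0, A''_u = J^{-1} J_u, A''_v = J^{-1} J_v.
% Hermiticity of J: \psi_\infty = (\psi_0^\dagger)^{-1} gives J = \psi_0^\dagger \psi_0 = J^\dagger.
```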


If the connection, $A$, satisfies the self-dual Yang–Mills equations it therefore follows that the field $J$ satisfies the Yang–Pohlmeyer equation

$$\partial_u \left( J_{\bar{u}} J^{-1} \right) + \partial_v \left( J_{\bar{v}} J^{-1} \right) = 0. \tag{2.9}$$

Conversely, given $J : \mathbb{R}^4 \to SL_2(\mathbb{C})$ that satisfies the Yang–Pohlmeyer equation and admits a splitting of the form (2.8) for some $\psi_0$ and $\psi_\infty$ such that $\psi_\infty^\dagger = \psi_0^{-1}$, then the connection constructed as in Eqs. (2.7) will satisfy the self-dual Yang–Mills equations. Finally, the quantity

$$G(x, z) := \left( \Psi_\infty(x, z) \right)^{-1} \cdot \Psi_0(x, z), \tag{2.10}$$

will be referred to as the patching matrix. It defines a holomorphic map $\mathbb{C}^2 \times V_\epsilon \subset \mathbb{CP}^3 \setminus L_\infty \to SL_2(\mathbb{C})$, and hence the transition function of a holomorphic vector bundle over $\mathbb{CP}^3 \setminus L_\infty$. The splitting (2.10) implies that this bundle is trivial on restriction to real lines. The above is an explicit version of the Ward correspondence [30], which defines a one-to-one correspondence between self-dual Yang–Mills fields and holomorphic bundles over appropriate subsets of $\mathbb{CP}^3$ that are trivial on restriction to real lines.$^4$ Given such a bundle, the transition functions necessarily admit a splitting of the form (2.10), and the connection may then be reconstructed from the resulting fields $\Psi_0$, $\Psi_\infty$ via Eqs. (2.7).

2.3. Non-local symmetries of the self-dual Yang–Mills equations. If we consider a one-parameter family of solutions, $J(t)$, of (2.9), depending in a $C^1$ fashion on a parameter $t \in (-\varepsilon, \varepsilon)$, then we deduce that $\dot{J} := \frac{d}{dt} J(t)$ must satisfy the linearisation of (2.9):

$$\partial_u \left( J\, \partial_{\bar{u}} \left( J^{-1} \dot{J} \right) J^{-1} \right) + \partial_v \left( J\, \partial_{\bar{v}} \left( J^{-1} \dot{J} \right) J^{-1} \right) = 0. \tag{2.11}$$

Such a $\dot{J}$ defines a symmetry of the self-dual Yang–Mills equations. It is known that the only local symmetries$^5$ of the self-dual Yang–Mills equations on flat $\mathbb{R}^4$ are gauge transformations and those generated by the action of the conformal group (see, e.g., [25]). On the other hand, there exists a non-trivial family of non-local symmetries of the self-dual Yang–Mills equations [6–8]. To describe these, we define the auxiliary maps $\chi_0 : \mathbb{R}^4 \times V^0_\epsilon \to SL_2(\mathbb{C})$, $\chi_\infty : \mathbb{R}^4 \times V^\infty_\epsilon \to SL_2(\mathbb{C})$,

$$\chi_0(x, z) := \psi_0(x)^{-1} \cdot \Psi_0(x, z), \qquad \chi_\infty(x, z) := \psi_\infty(x)^{-1} \cdot \Psi_\infty(x, z),$$

which are analytic in $z$ for $z \in V^0_\epsilon$ and $z \in V^\infty_\epsilon$, respectively. These maps have the property that $\chi_0(x, 0) = \chi_\infty(x, \infty) = \mathrm{Id}$. The following result, based on the results of [6–8], may then be extracted from Sect. III.A of [9]:

4 See [9] for more details of the Ward correspondence from this point of view.
5 i.e. depending only on the connection and its derivatives pointwise

Proposition 2.1. Let $T : \mathbb{R}^4 \times S^1 \to SL_2(\mathbb{C})$ be a map that extends continuously to a map $T : \mathbb{R}^4 \times V_\epsilon \to SL_2(\mathbb{C})$ (for some $\epsilon > 0$) that is analytic in $z$ for $z \in V_\epsilon$ and satisfies

$$(\partial_{\bar{v}} + z\,\partial_u)\, T(x, z) = (\partial_{\bar{u}} - z\,\partial_v)\, T(x, z) = 0$$


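for $(x, z) \in \mathbb{R}^4 \times V_\epsilon$. Then, given any $\lambda \in V_\epsilon$, the quantity

$$\begin{aligned} \dot{J} &:= \chi_\infty(x, \lambda)\, T(x, \lambda)\, \chi_\infty(x, \lambda)^{-1} \cdot J + J \cdot \chi_0(x, \sigma(\lambda))\, T(x, \lambda)^\dagger\, \chi_0(x, \sigma(\lambda))^{-1} \\ &= \psi_\infty(x)^{-1} \left[ \Psi_\infty(x, \lambda)\, T(x, \lambda)\, \Psi_\infty(x, \lambda)^{-1} + \Psi_0(x, \sigma(\lambda))\, T(x, \lambda)^\dagger\, \Psi_0(x, \sigma(\lambda))^{-1} \right] \psi_0(x) \end{aligned} \tag{2.12}$$

is a solution of the linearisation (2.11).

In the case where the function $T$ is independent of $x$, it defines an element of the loop group $\Lambda SL_2(\mathbb{C})$ that admits a holomorphic extension to an open neighbourhood of $S^1$ in $\mathbb{C}^*$. The algebra of symmetries generated by such $T$ is then isomorphic to the Kac–Moody algebra of $\mathfrak{sl}_2(\mathbb{C})$. The action of such symmetries on the patching matrix is given by the following result:

Theorem 2.1. Let $T : \mathbb{R}^4 \times S^1 \to SL_2(\mathbb{C})$ be as in the previous proposition. The induced flow on the patching matrix of the corresponding bundle over $\mathbb{CP}^3 \setminus L_\infty$ is given by

$$\dot{G}(x, z) = -T(x, z)\, G(x, z) - G(x, z)\, T^*(x, z) + \rho_\infty(x, z)\, G(x, z) + G(x, z)\, \rho_0(x, z), \tag{2.13}$$

for $(x, z) \in \mathbb{R}^4 \times V_\epsilon$. In this equation, $\rho_0 : \mathbb{R}^4 \times V^0_\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ and $\rho_\infty : \mathbb{R}^4 \times V^\infty_\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ are analytic functions of $z$ on the respective regions and satisfy

$$(\partial_{\bar{v}} + z\,\partial_u)\, \rho_0(x, z) = (\partial_{\bar{u}} - z\,\partial_v)\, \rho_0(x, z) = 0,$$
$$(\partial_{\bar{v}} + z\,\partial_u)\, \rho_\infty(x, z) = (\partial_{\bar{u}} - z\,\partial_v)\, \rho_\infty(x, z) = 0.$$

Moreover, the functions $\rho_0$, $\rho_\infty$ may be absorbed into holomorphic changes of bases on the regions $V^0_\epsilon$ and $V^\infty_\epsilon$. When this absorption process is carried out, the transformation (2.13) takes the simpler form

$$\dot{G}(x, z) = -T(x, z)\, G(x, z) - G(x, z)\, T^*(x, z). \tag{2.14}$$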
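Two small computations underlie the statements around Eqs. (2.11) and (2.12). The sketch below assumes the Yang–Pohlmeyer equation in the form $\partial_u(J_{\bar{u}} J^{-1}) + \partial_v(J_{\bar{v}} J^{-1}) = 0$ and the definitions $J = \psi_\infty^{-1} \psi_0$, $\chi_0 = \psi_0^{-1} \Psi_0$, $\chi_\infty = \psi_\infty^{-1} \Psi_\infty$; arguments $(x, \lambda)$ and $(x, \sigma(\lambda))$ are suppressed.

```latex
% (i) Linearisation: for a = u or v, differentiating J_{\bar a} J^{-1} in t gives
\begin{align*}
\frac{d}{dt} \bigl( J_{\bar a} J^{-1} \bigr)
 = \dot J_{\bar a} J^{-1} - J_{\bar a} J^{-1} \dot J J^{-1}
 = J\, \partial_{\bar a} \bigl( J^{-1} \dot J \bigr) J^{-1},
\end{align*}
% and applying \partial_a and summing over a = u, v yields (2.11).
% (ii) Equality of the two expressions in (2.12):
\begin{align*}
\chi_\infty T \chi_\infty^{-1} J
 &= \psi_\infty^{-1} \Psi_\infty T \Psi_\infty^{-1} \psi_\infty \cdot \psi_\infty^{-1} \psi_0
  = \psi_\infty^{-1} \bigl( \Psi_\infty T \Psi_\infty^{-1} \bigr) \psi_0, \\
J \chi_0 T^\dagger \chi_0^{-1}
 &= \psi_\infty^{-1} \psi_0 \cdot \psi_0^{-1} \Psi_0 T^\dagger \Psi_0^{-1} \psi_0
  = \psi_\infty^{-1} \bigl( \Psi_0 T^\dagger \Psi_0^{-1} \bigr) \psi_0 .
\end{align*}
```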

Remark 2.1. These transformations have been investigated from the viewpoint of twistor theory and have a natural sheaf-theoretic interpretation [18,19,24,25]. In the literature, it is standard to assume (2.13) (and the group-theoretic version (2.15) below) as the transformation law of the patching matrix, and to work backwards to derive the transformation law of $J$ and the connection (see, e.g., [18,19,25]). Since a direct proof of (2.13), starting from (2.12), does not appear in the literature, we have included a proof in Appendix A.

Remark 2.2. The transformation (2.14) is independent of the parameter $\lambda$ that appears in Eq. (2.12). As such, the transformation depends only on the function $T$. In Eq. (2.13), the functions $\rho_0$, $\rho_\infty$ will generally depend on the parameter $\lambda$, but the corresponding dependence of $\dot{G}$ on $\lambda$ may be removed by a holomorphic change of frame. This situation is different from that in, for example, the theory of harmonic maps from a domain $X \subseteq \mathbb{R}^2$ to a Lie group $G$. In this case, one has a similar family of non-local symmetries [10] depending on a function $T(\lambda)$. There, however, the transformation properties of the extended harmonic map depend explicitly on the value of the parameter $\lambda$ (see, e.g., [28, §3]). Power-series expanding in $\lambda$ then gives a family of flows acting on the extended solution, and hence on the space of harmonic maps.


The exponentiated form of the transformation law (2.14) is given by the following:

Theorem 2.2 [9, Chap. IV.C]. Let $g : \mathbb{R}^4 \times S^1 \to SL_2(\mathbb{C})$ be a smooth map that admits a continuous extension to a holomorphic map $g : \mathbb{C}^2 \times V_\epsilon \subset \mathbb{CP}^3 \setminus L_\infty \to SL_2(\mathbb{C})$, for some $\epsilon > 0$. Then we define the action of $g$ on the patching matrix $G(x, z)$ by the law

$$G(x, z) \mapsto g(x, z) \cdot G(x, z) \cdot g^*(x, z). \tag{2.15}$$

If $g$ extends holomorphically to $\mathbb{R}^4 \times V^0_\epsilon$, then the corresponding transformation is a holomorphic change of basis on the bundle over $\mathbb{CP}^3 \setminus L_\infty$, which leaves the self-dual connection, $A$, unchanged.
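The relation between the group action (2.15) and the infinitesimal law (2.14) can be made explicit by differentiating along a one-parameter subgroup. This is a sketch assuming a real parameter $t$ and $g(x, z) = \exp(-t\, T(x, z))$, with $g^*(x, z) := g(x, \sigma(z))^\dagger$ as defined in Sect. 2.1.

```latex
% With t real, g^*(x,z) = \exp\bigl(-t\,T(x,\sigma(z))\bigr)^\dagger = \exp\bigl(-t\,T^*(x,z)\bigr).
\begin{align*}
\frac{d}{dt}\Big|_{t=0} \bigl( g\, G\, g^* \bigr)
 = \dot g \big|_{t=0}\, G + G\, \dot g^* \big|_{t=0}
 = -T\, G - G\, T^* ,
\end{align*}
% which is the right-hand side of (2.14).
```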

Remark 2.3. The infinitesimal form of (2.15), where $g(x, z) = \exp(-t\, T(x, z))$, is Eq. (2.14).

2.4. The ADHM construction. We wish to study the action of the symmetries mentioned above on the moduli spaces of instanton solutions of the self-dual Yang–Mills equations on $\mathbb{R}^4$ or, equivalently, $S^4$. As such, we are concerned with connections whose curvatures are $L^2$, in which case we have

$$\int_{\mathbb{R}^4} |F|^2\, d^4x = -8\pi^2 k,$$

where $k \in \mathbb{N}_0$ is the second Chern number, $c_2(E)$, of the bundle $E$ (also called the instanton number of the connection). A self-dual connection with $L^2$ curvature on $\mathbb{R}^4$ necessarily extends to a self-dual connection on $S^4$ [29]. We will refer to such a connection, defined on either $\mathbb{R}^4$ or $S^4$, as an instanton. The moduli space of instanton solutions, with instanton number $k$, modulo gauge transformations is a manifold of dimension $8k - 3$ (away from singularities due to reducible connections), which we denote by $M_k$. For later considerations, it will be important to think of the $M_k$ as finite-dimensional submanifolds of the (infinite-dimensional) space of all self-dual connections on $\mathbb{R}^4$, not necessarily having $L^2$ curvature, which we denote by $M$. The symmetries of the self-dual Yang–Mills equations that we consider may be viewed as defining flows on the space $M$, and our main question is when these flows preserve the submanifolds $M_k$.

Via the Ward correspondence [5,30], self-dual connections of instanton number $k$ correspond to holomorphic bundles over $\mathbb{CP}^3$ that are trivial on real lines. All such bundles may be constructed directly in terms of quaternionic linear algebra by the ADHM construction [4], which we now briefly recall. For each $z = (z_1, z_2, z_3, z_4) \in \mathbb{C}^4$, we define a linear map $A(z) : W \to V$, between complex vector spaces $W$, $V$ of dimension $k$, $2k + 2$ respectively, which is of the form

$$A(z) = \sum_{i=1}^{4} z_i A_i.$$

The space $W$ is assumed to admit an anti-linear involution $\sigma_W : W \to W$. The space $V$ is assumed to have a symplectic form $(\cdot, \cdot)$ and an anti-linear anti-involution $\sigma_V : V \to V$ that are compatible in the sense that

$$(\sigma_V u, \sigma_V v) = \overline{(u, v)}, \qquad \forall\, u, v \in V.$$


We require that the map $A(z)$ satisfies the compatibility condition

$$\sigma_V(A(z) w) = A(\sigma(z))\, \sigma_W(w), \qquad \forall\, w \in W, \quad \forall z \in \mathbb{C}^4, \tag{2.16}$$

where $\sigma : \mathbb{C}^4 \to \mathbb{C}^4$ is as in Eq. (2.4). Finally, we impose the following conditions:

– For all $z \in \mathbb{C}^4$, the space $U_z := A(z)(W) \subset V$ is of dimension $k$;
– For all $z \in \mathbb{C}^4$, $U_z$ is isotropic with respect to $(\cdot, \cdot)$, i.e. $U_z \subseteq U_z^\perp$, where $\perp$ denotes the complement with respect to the form $(\cdot, \cdot)$.

If we then define the quotient $E_z := U_z^\perp / U_z$, then the collection of $E_z$ defines a holomorphic, rank-2 complex vector bundle $E \to \mathbb{CP}^3$ with structure group $SL_2(\mathbb{C})$. The reality condition (2.16) then implies that the bundle is trivial on restriction to any real line and that the self-dual connection on $S^4$ determined by the Ward correspondence is an $SU_2$ connection.

3. Patching Matrix Description of ADHM Construction

In order to make contact between the action of the non-local symmetries of the self-dual Yang–Mills equations in the form of (2.13) and the ADHM construction, we first need to reformulate the ADHM construction in terms of patching matrices. We assume that we are given an instanton solution of the self-dual Yang–Mills equations on $S^4$, with corresponding holomorphic bundle $E \to \mathbb{CP}^3$. We then consider (without any loss of information [29]) the restriction of this solution to $\mathbb{R}^4 \subset S^4$ and the restriction of the bundle $E$ to $\pi^{-1}(\mathbb{R}^4) \equiv \mathbb{CP}^3 \setminus L_\infty$, which, for convenience, we denote by $E \to \mathbb{CP}^3 \setminus L_\infty$. We split the set $\mathbb{CP}^3 \setminus L_\infty$ as the union of two regions

$$S_0 := \left\{ ((u, v), z) \in \mathbb{C}^2 \times \mathbb{CP}^1 \mid |z| < 1 + \epsilon \right\} = \mathbb{C}^2 \times V^0_\epsilon,$$
$$S_\infty := \left\{ ((u, v), z) \in \mathbb{C}^2 \times \mathbb{CP}^1 \,\Big|\, |z| > \frac{1}{1+\epsilon} \right\} = \mathbb{C}^2 \times V^\infty_\epsilon.$$

Since $S_0, S_\infty \cong \mathbb{C}^3$, the bundle $E$ restricted to either of these regions is holomorphically trivial [9,23]. The bundle $E$ is therefore characterised by the transition function $G : S_0 \cap S_\infty \to SL_2(\mathbb{C})$, which is the patching matrix from Sect. 2.2.

The map $G$ may be constructed directly from the ADHM data, at the expense of fixing bases on the spaces $V$ and $W$. In particular, let $\{a_i\}_{i=1}^{k}$ be a basis of vectors in $W$ that are real with respect to $\sigma_W$, in the sense that

$$\sigma_W(a_i) = a_i, \qquad i = 1, \ldots, k.$$

(So, in practice, we are looking on $W$ as being the complexification of the fixed point set of $\sigma_W$.) The vectors

$$v_i(z) := A(z) a_i \in V, \qquad i = 1, \ldots, k$$

define a collection of $k$ vectors that span the space $U_z \subset V$. Due to the reality of the vectors $a_i$, these vectors obey the reality condition

$$\sigma_V(v_i(z)) = v_i(\sigma(z)), \qquad i = 1, \ldots, k, \quad \forall z \in \mathbb{C}^4. \tag{3.1}$$

Since $U_z$ is isotropic with respect to the symplectic form, we deduce that

$$\left( v_i(z), v_j(z) \right) = 0, \qquad i, j = 1, \ldots, k, \quad z \in \mathbb{C}^4. \tag{3.2}$$
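Two of the statements above follow in one line each. The sketch below uses only the compatibility condition (2.16), the reality of the basis $a_i$, the dimensions $\dim W = k$, $\dim V = 2k + 2$, $\dim U_z = k$, and the non-degeneracy of the symplectic form.

```latex
% (i) Reality condition (3.1) from (2.16) and \sigma_W(a_i) = a_i:
\begin{align*}
\sigma_V \bigl( v_i(z) \bigr) = \sigma_V \bigl( A(z) a_i \bigr)
 = A(\sigma(z))\, \sigma_W(a_i) = A(\sigma(z))\, a_i = v_i(\sigma(z)).
\end{align*}
% (ii) Rank of E_z = U_z^\perp / U_z: for a non-degenerate form,
% \dim U_z^\perp = \dim V - \dim U_z, so
\begin{align*}
\dim U_z^\perp = (2k + 2) - k = k + 2, \qquad
\dim E_z = (k + 2) - k = 2 .
\end{align*}
```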


We now view $z$ as homogeneous coordinates on $\mathbb{CP}^3$, and construct bases for $[z] \in S_0 \subset \mathbb{CP}^3 \setminus L_\infty$. Given $[z] \in S_0$, the annihilator $U_z^\perp$ is spanned by $\{v_i(z)\}$ along with two vectors $\{e_A(z) \mid A = 1, 2\}$ that span $U_z^\perp / U_z$. We therefore have

$$(v_i(z), e_A(z)) = 0, \qquad i = 1, \ldots, k, \quad A = 1, 2, \quad [z] \in S_0, \tag{3.3}$$

and, without loss of generality, may assume that

$$(e_1(z), e_2(z)) = -(e_2(z), e_1(z)) = 1. \tag{3.4}$$

Although not strictly necessary, it will sometimes be useful to extend the vectors $\{e_A(z), v_i(z)\}$ to a full basis for $V$ by adding a set of vectors $\{w^i(z) \mid i = 1, \ldots, k\}$ with the property that

$$\left( w^i(z), w^j(z) \right) = 0, \qquad \left( v_i(z), w^j(z) \right) = \delta_i^j, \qquad \left( w^i(z), e_A(z) \right) = 0. \tag{3.5}$$

We may also define a basis $\{f_A(z) \mid A = 1, 2\}$ for $U_z^\perp / U_z$ for $[z] \in S_\infty$ by the relations

$$f_1(z) := -\sigma_V(e_2(\sigma(z))), \qquad f_2(z) := \sigma_V(e_1(\sigma(z))).$$

This basis automatically has the property that

$$(f_1(z), f_2(z)) = 1, \qquad (v_i(z), f_A(z)) = 0. \tag{3.6}$$
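The properties (3.6) can be checked from the definitions. The sketch below rests on two assumptions about conventions: that compatibility of $\sigma_V$ with the symplectic form means $(\sigma_V a, \sigma_V b) = \overline{(a, b)}$, and that "anti-involution" means $\sigma_V^2 = -1$. It also uses (3.1), (3.3) and (3.4) at the point $\sigma(z)$, which lies in $S_0$ when $[z] \in S_\infty$.

```latex
% Assumed: (\sigma_V a, \sigma_V b) = \overline{(a,b)} and \sigma_V^2 = -1.
\begin{align*}
(f_1(z), f_2(z))
 &= -\bigl( \sigma_V e_2(\sigma(z)),\, \sigma_V e_1(\sigma(z)) \bigr)
  = -\overline{ \bigl( e_2(\sigma(z)), e_1(\sigma(z)) \bigr) } = 1, \\
\bigl( v_i(z),\, \sigma_V e_A(\sigma(z)) \bigr)
 &= -\bigl( \sigma_V v_i(\sigma(z)),\, \sigma_V e_A(\sigma(z)) \bigr)
  = -\overline{ \bigl( v_i(\sigma(z)), e_A(\sigma(z)) \bigr) } = 0 .
\end{align*}
% The second line uses v_i(z) = -\sigma_V \bigl( \sigma_V v_i(z) \bigr) = -\sigma_V \bigl( v_i(\sigma(z)) \bigr),
% which follows from \sigma_V^2 = -1 and (3.1). Since each f_A is a multiple of some
% \sigma_V e_B(\sigma(z)), this gives (v_i, f_A) = 0.
```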

Given that $\{e_A(z)\}$ and $\{f_A(z)\}$ are both bases for $U_z^\perp / U_z$ for $[z] \in S_0 \cap S_\infty$, there exist functions $G_A{}^B(z)$, $\lambda_A{}^i(z)$ defined on this region with the property that

$$f_A(z) = G_A{}^B(z)\, e_B(z) + \lambda_A{}^i(z)\, v_i(z). \tag{3.7}$$

(From now on, the summation convention will be assumed over repeated indices.) The matrix $G(z)$, defined for $[z] \in S_0 \cap S_\infty$, is then the transition function of our bundle $E$. Before deriving some properties of the patching matrix that we will require, we define the $SL_2(\mathbb{C})$-invariant tensor $\epsilon$ by $\epsilon_{AB} = -\epsilon_{BA}$ with $\epsilon_{12} = 1$ and the $SO_2(\mathbb{C})$-invariant tensor $\delta$ with components

$$\delta_{AB} = \begin{cases} 1 & A = B, \\ 0 & A \neq B. \end{cases}$$

Proposition 3.1. The patching matrix, $G$, as defined above obeys the conditions

$$\det G(z) = 1, \qquad G^*(z) = G(z),$$

for $[z] \in S_0 \cap S_\infty$, where $G^*(z) := G(\sigma(z))^\dagger$. The functions $\lambda_A{}^i$ obey the reality condition

$$\overline{\lambda_A{}^i(\sigma(z))} = \delta_{AB}\, G_C{}^B\, \epsilon^{CD}\, \lambda_D{}^i(z), \qquad \lambda_A{}^i(z) = -G_A{}^B\, \delta_{BC}\, \epsilon^{CD}\, \overline{\lambda_D{}^i(\sigma(z))},$$

for $[z] \in S_0 \cap S_\infty$.


Proof. Firstly, we have

$$\begin{aligned} 1 = (f_1(z), f_2(z)) &= \left( G_1{}^B(z)\, e_B(z) + \lambda_1{}^i(z)\, v_i(z),\; G_2{}^B(z)\, e_B(z) + \lambda_2{}^i(z)\, v_i(z) \right) \\ &= \left( G_1{}^1(z)\, G_2{}^2(z) - G_1{}^2(z)\, G_2{}^1(z) \right) (e_1(z), e_2(z)) = \det G(z), \end{aligned}$$

where the four equalities follow from Eqs. (3.6), (3.7), (3.3) and (3.4), respectively. Therefore, $\det G(z) = 1$, as required. The definition of the vectors $f_A(z)$ may be rewritten in the form

$$f_A(z) = -\delta_{AB}\, \epsilon^{BC}\, \sigma_V(e_C(\sigma(z))). \tag{3.8}$$

We now apply $\sigma_V$ to this equation, substitute Eqs. (3.7) and (3.1), and use the anti-linear, anti-involutive nature of $\sigma_V$. After some manipulation of $\delta$ and $\epsilon$ tensors, and using the fact that $\det G = 1$, we then find that

$$f_A(z) = G^*(z)_A{}^B \left( e_B(z) - \delta_{BC}\, \epsilon^{CD}\, \overline{\lambda_D{}^i(\sigma(z))}\, v_i(z) \right).$$

Comparing with (3.7) then gives the required equalities.

Remark 3.1. We will be primarily interested in Eq. (3.7) when it is restricted to a real line $L_x \subset \mathbb{CP}^3 \setminus L_\infty$. Since the patching matrix, $G$, defined above is holomorphic on $\mathbb{CP}^3 \setminus L_\infty$, when restricted to a neighbourhood of the line $L_x \equiv L_{(u,v)}$, $G$ will restrict to a function (which we denote by $G(x, z)$) that is holomorphic in $(u - z\bar{v}, v + z\bar{u}, z)$ for $z \in V_\epsilon$, for some $\epsilon > 0$.

4. One-Parameter Families of ADHM Data

We now consider a one-parameter family of ADHM data $A(z) := A(t : z)$, with $t \in I$ a parameter, $I$ a sub-interval of the real line containing the origin. We assume that $A(t : z)$ is a $C^1$ function of $t$. We wish to investigate how the elements of the above explicit construction depend on $A(t : z)$. The image $A(t : z)(W)$ is now spanned by the vectors $\{v_i(t : z)\}$, and $U_z^\perp / U_z$ is spanned by $\{e_A(t : z)\}$, which we assume normalised such that (3.4) is satisfied for each $t \in I$. Constructing the vectors $\{f_A(t : z)\}$, we then define the patching matrix $G_A{}^B(t : z)$ and the functions $\lambda_A{}^i(t : z)$ as in (3.7).

Proposition 4.1. Given a one-parameter family of ADHM data, $A(t : z)$, and patching matrices as defined in (3.7), there exists a matrix-valued function $d(t : z)$ with the property that

$$\dot{G}(t : z) = d(t : z)\, G(t : z) + G(t : z)\, d^*(t : z). \tag{4.1}$$

Proof. To investigate the $t$-dependence of these quantities, we consider their derivatives with respect to $t$. The derivatives of the relevant vectors are given as follows:$^6$

$$\dot{v}_i = A_i{}^j v_j + B_{ij} w^j + \epsilon^{AB} s_{Ai}\, e_B, \tag{4.2a}$$
$$\dot{w}^i = C^{ij} v_j - A_j{}^i w^j - \epsilon^{AB} r_A{}^i\, e_B, \tag{4.2b}$$
$$\dot{e}_A = c_A{}^B e_B + r_A{}^i v_i + s_{Ai} w^i, \tag{4.2c}$$

6 Everything depends on $(t : z)$, but we drop explicit mention of this dependence for the moment.


where $A_i{}^j, \ldots, s_{Ai}$ are functions of $(t, z)$ that satisfy the relationships $B_{ij} = B_{ji}$, $C^{ij} = C^{ji}$, $c_A{}^A = 0$. It is straightforward to check that these are the most general forms of $\dot{v}_i$, $\dot{w}^i$, $\dot{e}_A$ that preserve the relations (3.2), (3.3), (3.4) and (3.5). We also define functions that characterise the time-dependence of the vector fields $f_A$:

$$\dot{f}_A = d_A{}^B f_B + t_A{}^i v_i + u_{Ai} w^i. \tag{4.3}$$
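As a sample of the "straightforward check" mentioned above, the sketch below verifies that derivatives of the form (4.2) preserve three of the relations. It assumes the index placement $\dot{v}_i = A_i{}^j v_j + B_{ij} w^j + \epsilon^{AB} s_{Ai} e_B$, $\dot{w}^i = C^{ij} v_j - A_j{}^i w^j - \epsilon^{AB} r_A{}^i e_B$, $\dot{e}_A = c_A{}^B e_B + r_A{}^i v_i + s_{Ai} w^i$, and the conventions $\epsilon^{12} = \epsilon_{12} = 1$ (so that $\epsilon^{CB} \epsilon_{BA} = -\delta^C{}_A$) and $(e_B, e_A) = \epsilon_{BA}$.

```latex
% Only the pairings (v_i, w^j) = \delta_i^j and (e_A, e_B) = \epsilon_{AB} are non-zero
% among the terms that appear; all other pairings vanish by (3.2), (3.3), (3.5).
\begin{align*}
\frac{d}{dt} (v_i, w^j) &= A_i{}^k (v_k, w^j) - A_k{}^j (v_i, w^k) = A_i{}^j - A_i{}^j = 0, \\
\frac{d}{dt} (v_i, e_A) &= \epsilon^{CB} s_{Ci}\, (e_B, e_A) + s_{Aj}\, (v_i, w^j) = -s_{Ai} + s_{Ai} = 0, \\
\frac{d}{dt} (e_1, e_2) &= c_1{}^1 (e_1, e_2) + c_2{}^2 (e_1, e_2) = c_A{}^A = 0 .
\end{align*}
% The last line shows why the trace condition c_A{}^A = 0 is needed to keep (3.4).
```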

From this expression and Eq. (3.7), we deduce that G˙ A B = d A C G C B − G A C cC B + λ A i BC sCi ,

(4.4)

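along with the relations
\[ \dot\lambda_A{}^i = d_A{}^C \lambda_C{}^i + t_A{}^i - G_A{}^B r_B{}^i - \lambda_A{}^j A_j{}^i, \qquad u_{Ai} = G_A{}^B s_{Bi} + \lambda_A{}^j B_{ji}. \]
Also, equating $\dot f_1$ with $-\dot e_2$, and $\dot f_2$ with $\dot e_1$, we find that
\[ d_A{}^B = u_{Ai} \lambda^{Bi} + \epsilon_{AC}\, \delta^{CD} c_D{}^E \delta_{EF}\, \epsilon^{BF}, \qquad u_{Ai} = \epsilon_{AB}\, \delta^{BC} s_{Ci}. \]
These equations, along with (4.4), imply that the $t$-derivative of the patching matrix obeys the relation (4.1) with $d = u_i \otimes \lambda^i + \epsilon_{AC}\, \delta^{CD} c_D{}^E \delta_{EF}\, \epsilon^{BF}$, as required.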
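The coefficient comparison that leads to (4.4) and the two relations following it can be sketched as below. This is only a sketch under an assumption: Eq. (3.7) is not reproduced in this section, so we assume it expands $f_A$ as $f_A = G_A{}^B e_B + \lambda_A{}^i v_i$ (this form is consistent with the explicit one-instanton patching matrix of Sect. 5), and we take $\epsilon^{AB}$ to be antisymmetric. Index placements are our reading of the notation.

```latex
% Assumed form of (3.7):  f_A = G_A{}^B e_B + \lambda_A{}^i v_i .
% Differentiate it and insert (4.2a), (4.2c):
\begin{align*}
\dot f_A &= \dot G_A{}^B e_B + G_A{}^B \dot e_B
            + \dot\lambda_A{}^i v_i + \lambda_A{}^i \dot v_i \\
         &= \bigl(\dot G_A{}^B + G_A{}^C c_C{}^B
            + \lambda_A{}^i \epsilon^{CB} s_{Ci}\bigr) e_B \\
         &\quad + \bigl(\dot\lambda_A{}^i + G_A{}^B r_B{}^i
            + \lambda_A{}^j A_j{}^i\bigr) v_i
            + \bigl(G_A{}^B s_{Bi} + \lambda_A{}^j B_{ji}\bigr) w^i .
\end{align*}
% Expand (4.3) in the same basis:
\begin{align*}
\dot f_A &= d_A{}^C G_C{}^B e_B
            + \bigl(d_A{}^C \lambda_C{}^i + t_A{}^i\bigr) v_i + u_{Ai} w^i .
\end{align*}
% Compare coefficients of e_B, v_i, w^i and use \epsilon^{CB} = -\epsilon^{BC}:
\begin{align*}
\dot G_A{}^B &= d_A{}^C G_C{}^B - G_A{}^C c_C{}^B
                + \lambda_A{}^i \epsilon^{BC} s_{Ci}, \\
\dot\lambda_A{}^i &= d_A{}^C \lambda_C{}^i + t_A{}^i
                - G_A{}^B r_B{}^i - \lambda_A{}^j A_j{}^i, \\
u_{Ai} &= G_A{}^B s_{Bi} + \lambda_A{}^j B_{ji} .
\end{align*}
```

The first line of the final display is (4.4); the other two are the accompanying relations for $\dot\lambda_A{}^i$ and $u_{Ai}$.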

Remark 4.1. The quantities that occur in Eq. (4.1) may all be constructed directly from the vector fields $e_A$, $v_i$ since
\[ (v_i, \dot e_A) = s_{Ai}, \qquad (\dot e_A, e_B) = c_A{}^C \epsilon_{CB}. \]
Therefore the construction does not actually require the introduction of the basis vectors $\{w^i\}$.

Corollary 4.1. Given a one-parameter family of ADHM data and patching matrix defined as above, there exists a map $d : I \times \mathbb{C}^2 \times V_\epsilon \to \mathrm{SL}_2(\mathbb{C})$ that is holomorphic in $(u - zv, v + zu, z)$ for $z \in V_\epsilon$ such that the restriction of the patching matrix to real lines $L_x$ evolves according to
\[ \dot G(t : x, z) = d(t : x, z)\, G(t : x, z) + G(t : x, z)\, d^*(t : x, z), \tag{4.5} \]
for $(x, z) \in \mathbb{C}^2 \times V_\epsilon$.

Proof. Restrict (4.1) to $L_x$.


Remark 4.2. Let $\alpha(t : x, z)$ satisfy the first order ordinary differential equation
\[ \dot\alpha(t : x, z) = d(t : x, z)\, \alpha(t : x, z), \qquad \alpha(0 : x, z) = \mathrm{Id}. \]
Given an initial patching matrix $G(x, z)$, it follows that the one-parameter family of patching matrices
\[ G(t : x, z) := \alpha(t : x, z)\, G(x, z)\, \alpha^*(t : x, z) \tag{4.6} \]
satisfies Eq. (4.1) with initial conditions $G(0 : x, z) = G(x, z)$. Conversely, by uniqueness of solutions of (4.1), it follows that $G(t : x, z)$, as defined in Eq. (4.6), is the unique one-parameter family of patching matrices determined by the flow (4.1) with initial data $G(x, z)$.

Note that these transformations (4.5) and (4.6) are of the same form as those generated by the symmetries of the self-dual Yang–Mills equations given in Eq. (2.13) and Theorem 2.2, with the important proviso that the function $d(t : x, z)$ occurring in (4.5) depends explicitly on the parameter $t$. The symmetries (2.13) should be viewed as defining a flow on the space, $\mathcal{M}$, of self-dual connections defined by the map $T$. In solving (2.13), we are simply constructing the integral curves of this flow, with $t$ a parameter along the integral curve. As such, in (2.13), it is important that the function $T(x, z)$ is independent of the parameter $t$. Viewing the function $T$ as defining a flow on $\mathcal{M}$ and the instanton moduli spaces $\mathcal{M}_k$ as submanifolds of $\mathcal{M}$, we directly deduce:

Theorem 4.1. Let $A \in \mathcal{M}_k$ be a $k$-instanton self-dual connection (modulo gauge transformation) on $\mathbb{R}^4$, with $\mathcal{M}_k$ viewed as a submanifold of the space, $\mathcal{M}$, of all self-dual connections on $\mathbb{R}^4$. Then for each vector $v \in T_A \mathcal{M}_k$, there exists a function $T$ such that the fundamental vector field on $\mathcal{M}$ corresponding to $T$ via Eq. (2.13) coincides with $v$ at the point $A \in \mathcal{M}_k$.

Proof. Any element $v \in T_A \mathcal{M}_k$ is generated by a one-parameter family of ADHM data, $A(t : z)$, with $A(0 : z)$ corresponding to the connection $A$. This one-parameter family of ADHM data then gives rise to a one-parameter family of patching matrices $G(t : x, z)$ evolving according to (4.5), where $G(0 : x, z)$ is the patching matrix corresponding to $A$ and $\dot G(0 : x, z)$ corresponds to the tangent vector $v$. Taking $T(x, z) := -d(0 : x, z)$ gives a symmetry that, via (2.13) (with $\rho_0 = \rho_\infty = 0$), generates the tangent vector $v$.

Remark 4.3.
Theorem 2.1 states that, given a function $T$, there is a corresponding fundamental vector field on $\mathcal{M}$, the space of self-dual connections. We shall denote this fundamental vector field by $X_T$. Theorem 4.1 states that, given a connection $A \in \mathcal{M}_k$ and a tangent vector $v \in T_A \mathcal{M}_k$, there exists a function $T$ such that $X_T|_A = v$. It is important to note, however, that the integral curve of $X_T$ starting at $A \in \mathcal{M}_k$ will, generally, not remain within the submanifold $\mathcal{M}_k$ of $\mathcal{M}$. In order to determine which one-parameter groups of symmetries give flows that remain in the moduli space $\mathcal{M}_k$, we need to determine which transformations of the form (4.5) are generated by transformations of the form (2.13), with $T(x, z)$ independent of $t$. From the form of (4.5) and (2.13), it appears natural to identify $d(t : x, z)$ with $-T(x, z) + \rho_\infty(t, x, z)$. We impose that $T$ is independent of $t$. The map $\rho_\infty$ simply generates a change of holomorphic frame for $z \in V_\infty$. At this point, we should recall that we have partially fixed our holomorphic frames in deriving our patching matrix from the ADHM data. As such, if we wish to employ our approach with one-parameter families of ADHM data, we must allow for one-parameter families of changes of frame in order to compensate for this fixing of frames. We should therefore allow $\rho_\infty$ to be $t$-dependent (i.e. $\rho_\infty = \rho_\infty(t, x, z)$). Note that such a $t$-dependent change of frame does not affect the corresponding self-dual connection $A(t)$. We may therefore use $\rho_\infty$ to absorb any part of $d(t : x, z)$ that is holomorphic on $\mathbb{C}^2 \times V_\infty$, leaving an irreducible part of $d(t : x, z)$, denoted $d_0(t : x, z)$, that has singularities in the region $V_\infty$ that cannot be removed by absorption into $\rho_\infty$. In order to arise from a symmetry of the self-dual Yang–Mills equations, $d_0(t : x, z)$ must then be independent of $t$. Since $d(t : x, z)$ is determined by first $t$-derivatives of the ADHM data, $A(t : z)$, imposing that $d_0(t : x, z)$ is constant in $t$ will impose conditions on the first $t$-derivatives of the $A(t : z)$ data that must be satisfied in order for this one-parameter family of data (and corresponding self-dual connections) to arise from a symmetry of the self-dual Yang–Mills equations. Explicit calculations, in the next section, suggest that these conditions are quite restrictive.

Remark 4.4. The fact that the flow on the moduli space does not generally preserve the $L^2$ nature of the curvature is well-known (see, e.g., [6–8] where this effect is mentioned). In [9, Chap. V], an explicit example of a transformation acting on a one-instanton patching matrix is given to demonstrate this phenomenon. In the notation of (4.6), this transformation takes the form
\[ \alpha(t : x, z) = \frac{1}{\sqrt{1 - t^2}} \begin{pmatrix} 1 & t/z \\ tz & 1 \end{pmatrix}. \]
From this expression, we deduce that
\[ d(t : x, z) = \frac{1}{(1 - t^2)^{3/2}} \begin{pmatrix} t & 1/z \\ z & t \end{pmatrix}. \]
Following the programme of the previous remark, we then isolate the part of $d$ that has singularities in the region $z \in V_\infty$, namely
\[ d_0(t : x, z) = \frac{1}{(1 - t^2)^{3/2}} \begin{pmatrix} 0 & 0 \\ z & 0 \end{pmatrix}. \]
Since $d_0$ depends explicitly on $t$, we deduce that the counterexample provided in [9, Chap. V] falls outside of the class of transformations generated by transformations (2.13) with $T$ independent of $t$.

Remark 4.5. If one drops the reality condition that our connections are $\mathrm{SU}_2$ connections, rather than $\mathrm{SL}_2(\mathbb{C})$ connections, then Takasaki has argued [27] that the action of the non-local symmetry group generated by transformations of the form
\[ \dot J(x) = \chi_\infty(x, \lambda)\, T(x, \lambda)\, \chi_\infty(x, \lambda)^{-1} \cdot J \]
is transitive on the space of $\mathrm{SL}_2(\mathbb{C})$ solutions of the self-dual Yang–Mills equations. If, as here, we restrict to symmetries of the form (2.12) that explicitly preserve the $\mathrm{SU}_2$ nature of the connection, then the symmetry group need not act transitively on the moduli space of solutions, even though the symmetries have been shown to generate the tangent space at each point. Moreover, if we explicitly impose that we only consider symmetries that preserve the $L^2$ nature of the connection, then the explicit calculations carried out in the next section for the one-instanton moduli space suggest that the orbits of the symmetry group are actually of high codimension in the moduli space.


5. The One-Instanton Solution

In the case of a one-instanton solution, it is straightforward to carry out the ADHM construction and the construction of deformations explicitly. We find that the one-parameter subgroups of ADHM data with $d(t : x, z)$ of the form $-T(x, z) + \rho_\infty(t, x, z)$ are rather small. First, we fix some notation. In the case $k = 1$, we may write
\[ v(z) := A(z) = \begin{pmatrix} A_1(z) \\ A_2(z) \\ A_3(z) \\ A_4(z) \end{pmatrix} \in \mathbb{C}^4, \]
where $A_i(z) = \sum_{j=1}^{4} A_i{}^j z_j$, $i = 1, \dots, 4$. Letting
\[ \sigma_V \begin{pmatrix} \alpha \\ \beta \\ \gamma \\ \delta \end{pmatrix} := \begin{pmatrix} -\beta \\ \alpha \\ -\delta \\ \gamma \end{pmatrix}, \]
then (3.1) implies that the functions $A_i(z)$ must satisfy the reality conditions:
\[ A_1(\sigma(z)) = -A_2(z), \qquad A_2(\sigma(z)) = A_1(z), \qquad A_3(\sigma(z)) = -A_4(z), \qquad A_4(\sigma(z)) = A_3(z). \]
In particular, using the symmetry transformations inherent in the ADHM construction [2, Chap. II], we may fix
\[ A_1(z) = \lambda z_1, \qquad A_2(z) = \lambda z_2, \tag{5.1a} \]
\[ A_3(z) = \alpha z_1 - \beta z_2 - z_3, \qquad A_4(z) = \beta z_1 + \alpha z_2 - z_4, \tag{5.1b} \]
where $\lambda$ is a positive, real number and $\alpha$, $\beta$ are complex numbers. Finally, we may take the symplectic form on $V \cong \mathbb{C}^4$ to be
\[ (a, b) = a_1 b_2 - a_2 b_1 + a_3 b_4 - a_4 b_3, \qquad a, b \in \mathbb{C}^4. \]

Theorem 5.1. The only transformations of the ADHM data $(\lambda, \alpha, \beta)$ that arise from a non-local symmetry of the self-dual Yang–Mills equations (2.12) according to (4.5) with $d(t : x, z)$ of the form $-T(x, z) + \rho_\infty(t : x, z)$ are of the form
\[ \lambda \mapsto \lambda(t) := \frac{\lambda}{\sqrt{1 - k\lambda^2 t}}, \qquad \alpha, \beta \ \text{constant}, \tag{5.2} \]
where $k \in \mathbb{R}$ is a real constant.

Proof. On a region with $A_1(z) \neq 0$ (and hence $A_2(z) \neq 0$), we find that $U_z = v(z)^\perp / v$ is spanned by the vectors
\[ e_1(z) = \left( 0, \frac{A_4(z)}{A_1(z)}, 1, 0 \right), \qquad e_2(z) = \left( 0, -\frac{A_3(z)}{A_1(z)}, 0, 1 \right), \]


which have the property that $(e_1, e_2) = 1$. Such a basis, including the normalisation property, is unique up to a translation $e_A \to e_A + \lambda_A v$, and an $\mathrm{SL}_2(\mathbb{C})$ rotation of the vectors $e_A(z)$. Taking the conjugates of these vectors, we find that
\[ f_1(z) = -\bar e_2(z) = \left( -\frac{A_4(z)}{A_2(z)}, 0, 1, 0 \right), \qquad f_2(z) = \bar e_1(z) = \left( \frac{A_3(z)}{A_2(z)}, 0, 0, 1 \right). \]
These expressions imply that on the overlap of the two regions above, we have the patching matrix (see [9, Chap. V])
\[ G = \begin{pmatrix} 1 + \dfrac{A_3(z) A_4(z)}{A_1(z) A_2(z)} & \dfrac{A_4(z)^2}{A_1(z) A_2(z)} \\[2ex] -\dfrac{A_3(z)^2}{A_1(z) A_2(z)} & 1 - \dfrac{A_3(z) A_4(z)}{A_1(z) A_2(z)} \end{pmatrix} \]
and
\[ \lambda_1(z) = -\frac{A_4(z)}{A_1(z) A_2(z)}, \qquad \lambda_2(z) = \frac{A_3(z)}{A_1(z) A_2(z)}. \]
We may take the vector $w(z)$ to be
\[ w = \left( 0, \frac{1}{A_1(z)}, 0, 0 \right), \]
which is unique up to $w \to w + \phi v$. If we now let $v(z)$ depend smoothly on a parameter $t \in (-\epsilon, \epsilon)$, then we may calculate the parameters of the deformation $A, B, C, \dots$ as defined in (4.2) and (4.3). The parameter $d$ is the one that we primarily require and a straightforward calculation shows that
\[ d(t : z) = \frac{\partial}{\partial t} \begin{pmatrix} A_4(t : z)/A_2(t : z) \\ -A_3(t : z)/A_2(t : z) \end{pmatrix} \times \begin{pmatrix} A_3(t : z)/A_1(t : z) & A_4(t : z)/A_1(t : z) \end{pmatrix}. \]
Taking $A_i(t : z)$ as in (5.1), with $\lambda$ replaced by $\lambda(t)$, etc., then, restricted to the line $L_x$, the deformation parameter that we require takes the form
\[ d(x, z) = \frac{1}{z}\, \frac{\partial}{\partial t} \begin{pmatrix} \dfrac{(\beta - v) + z(\alpha - u)}{\lambda} \\[2ex] -\dfrac{(\alpha - u) - z(\beta - v)}{\lambda} \end{pmatrix} \times \begin{pmatrix} \dfrac{(\alpha - u) - z(\beta - v)}{\lambda} & \dfrac{(\beta - v) + z(\alpha - u)}{\lambda} \end{pmatrix}. \]

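This expression may be written in the form
\[ d(x, z) = \frac{1}{z} \left[ A (u - zv)^2 + B (u - zv)(v + zu) + C (v + zu)^2 \right] + \left( \frac{D}{z} + E \right)(u - zv) + \left( \frac{F}{z} + G \right)(v + zu) + \frac{H}{z} + I + J z, \]
where
\[ A = \frac{\dot\lambda}{\lambda^3} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad B = \frac{\dot\lambda}{\lambda^3} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad C = \frac{\dot\lambda}{\lambda^3} \begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix}, \]
\[ D = \frac{1}{\lambda^3} \begin{pmatrix} -\lambda\dot\beta + \beta\dot\lambda & 0 \\ \lambda\dot\alpha - 2\alpha\dot\lambda & -\beta\dot\lambda \end{pmatrix}, \qquad E = \frac{1}{\lambda^3} \begin{pmatrix} -\lambda\dot\alpha + \alpha\dot\lambda & 0 \\ -\lambda\dot\beta + 2\beta\dot\lambda & -\alpha\dot\lambda \end{pmatrix}, \]
\[ F = \frac{1}{\lambda^3} \begin{pmatrix} \alpha\dot\lambda & -\lambda\dot\beta + 2\beta\dot\lambda \\ 0 & \lambda\dot\alpha - \alpha\dot\lambda \end{pmatrix}, \qquad G = \frac{1}{\lambda^3} \begin{pmatrix} -\beta\dot\lambda & -\lambda\dot\alpha + 2\alpha\dot\lambda \\ 0 & -\lambda\dot\beta + \beta\dot\lambda \end{pmatrix}, \]
\[ H = \frac{1}{\lambda^3} \begin{pmatrix} \alpha(\lambda\dot\beta - \beta\dot\lambda) & \beta(\lambda\dot\beta - \beta\dot\lambda) \\ \alpha(-\lambda\dot\alpha + \alpha\dot\lambda) & \beta(-\lambda\dot\alpha + \alpha\dot\lambda) \end{pmatrix}, \]
\[ I = \frac{1}{\lambda^3} \begin{pmatrix} \alpha\lambda\dot\alpha - \beta\lambda\dot\beta - \alpha\alpha\dot\lambda + \beta\beta\dot\lambda & \beta\lambda\dot\alpha + \alpha\lambda\dot\beta - 2\alpha\beta\dot\lambda \\ \beta\lambda\dot\alpha + \alpha\lambda\dot\beta - 2\alpha\beta\dot\lambda & \alpha(-\lambda\dot\alpha + \alpha\dot\lambda) + \beta(\lambda\dot\beta - \beta\dot\lambda) \end{pmatrix}, \]
\[ J = \frac{1}{\lambda^3} \begin{pmatrix} \beta(-\lambda\dot\alpha + \alpha\dot\lambda) & \alpha(\lambda\dot\alpha - \alpha\dot\lambda) \\ \beta(-\lambda\dot\beta + \beta\dot\lambda) & \alpha(\lambda\dot\beta - \beta\dot\lambda) \end{pmatrix}, \]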

where $\dot{\ }$ denotes differentiation with respect to $t$. According to the philosophy of Remark 4.3, we note that the coefficients $D$, $F$, $H$ and $I$ correspond to terms that are analytic for $z \in V_\infty$, and therefore may be absorbed into the $\rho_\infty$ term. The remaining part of the parameter $d$ is then
\[ d_0(x, z) = \frac{1}{z} \left[ A (u - zv)^2 + B (u - zv)(v + zu) + C (v + zu)^2 \right] + E (u - zv) + G (v + zu) + J z. \]
All of the terms in $d_0$ have singularities at $z = \infty \in V_\infty$. In order for such transformations to arise from a $T$ that is independent of $t$ with $d = -T + \rho_\infty$, we therefore require that the remaining coefficients $A$, $B$, $C$, $E$, $G$ and $J$ must be independent of $t$ (i.e. constant). An analysis of the explicit form of these coefficients given above shows that this condition is only possible if
\[ \frac{\dot\lambda}{\lambda^3} = \frac{k}{2}, \qquad \dot\alpha = \dot\beta = 0, \]
where $k$ is a constant. Integrating these equations yields (5.2): since $\frac{d}{dt}\lambda^{-2} = -2\dot\lambda/\lambda^3 = -k$, we have $\lambda(t)^{-2} = \lambda^{-2} - kt$. Therefore the only transformation on the one-instanton moduli space that arises from a symmetry of the form (2.12) with $d(t : x, z) = -T(x, z) + \rho_\infty(t, x, z)$ is a scaling of the moduli space.

Remark 5.1. The group of transformations on the one-instanton moduli space is therefore only one-dimensional. Such a collapse to a finite-dimensional action is familiar from the theory of harmonic maps (see, e.g., [1,17,20,28]), where the orbits of the group action are also, generically, of high codimension.

6. Final Remarks

Our first main result is Theorem 4.1, which states that the tangent spaces to the instanton moduli spaces, $\mathcal{M}_k$, are generated by symmetries of the self-dual Yang–Mills equations. Nevertheless, our second main result, based on an analysis of the one-instanton moduli space, is that the subgroup of the symmetry group that preserves the $L^2$ nature of the connection, and hence has orbits that lie in a particular $\mathcal{M}_k$, is rather small. In particular, the orbits of this subgroup on the space $\mathcal{M}_k$ are of high codimension.

We have restricted ourselves to one-parameter families of ADHM data that arise from transformations of the form (2.13) and (4.5) with $d(t : x, z) = -T(x, z) + \rho_\infty(t, x, z)$. Note that this is a sufficient, but not necessary, condition for Eqs. (2.13) and (4.5) to be consistent. It is conceivable that there might be a larger group of transformations acting on the moduli spaces, $\mathcal{M}_k$, consistent with these equations, but we have not investigated this possibility.

It is hoped that there is a more elegant way of carrying out the calculations in the previous section. In particular (also regarding the remark in the previous paragraph) one would like to pull the infinitesimal action on the patching matrix (2.13) directly up to the space of ADHM data. An alternative approach to extending our analysis would be to investigate our approach from the point of view of Donaldson's reformulation of the ADHM construction [12], where one views instantons as defining holomorphic bundles over $\mathbb{CP}^2$. Restricting our constructions to the $\mathbb{CP}^2$ picture is straightforward, but it is again difficult to directly calculate the action of the symmetry transformations on the data. Work of Nakamura [22] concerning dynamical systems defined on the space of data of the Donaldson construction may be relevant in this regard. The approach where one might expect the symmetries to have the simplest form would be within Atiyah's reformulation [3] of the instanton moduli spaces in terms of holomorphic maps $\mathbb{CP}^1 \to \Omega G$. In this case, the connection with harmonic map theory is quite strong.
In the case of the self-dual Yang–Mills equations, however, one expects the symmetry group to act directly on the map in the Atiyah construction, whereas for harmonic maps the "dressing action" acts purely on the space $\Omega G$. It is also quite difficult to see directly how the action on the patching matrix or ADHM data transfers to the Atiyah picture, due to the non-holomorphic transformations required in passing from the ADHM construction to this approach.

More broadly, thinking of $(\lambda, \alpha, \beta)$ as coordinates on the five-dimensional ball (with $(\alpha, \beta)$ compactified to the four-sphere and $\lambda$ the radial coordinate), the flow in (5.2) is simply a radial scaling. In particular, for $k > 0$, the flow converges to the fixed point $\lambda = 0$ as $t \to -\infty$, and diverges to $+\infty$ as $t \to \left( \frac{1}{k\lambda^2} \right)^-$. Such flows are, in some respects, reminiscent of Morse flows, and it would be of interest to know whether our approach has a Morse-theoretic interpretation. In addition, it would be interesting to relate our work to other examples of systems where one has a symmetry algebra, but no corresponding group action, e.g. Teichmüller theory.7

As mentioned in the Introduction, the original motivation for this work was to determine whether the integrable systems approach to the self-dual Yang–Mills equations could give information about instanton moduli spaces as used in the more topological context of Donaldson theory. In this regard, the results of this paper should be viewed alongside the results of the companion paper [15]. In [15], reducible connections were studied on open subsets of $\mathbb{R}^4$, and were found to bear a strong resemblance to harmonic maps of finite type (see, e.g., [16, Chap. 24]). In particular, all reducible connections lie in the orbit, under flows (2.13), of the flat connection on $\mathbb{R}^4$. Therefore instanton solutions on $\mathbb{R}^4$ and reducible connections (which are necessarily not $L^2$ on $\mathbb{R}^4$) appear to have quite different behaviour under the symmetry group of the self-dual Yang–Mills equations. Since reducible and irreducible connections play a different role in Donaldson's work [11], corresponding to the smooth and singular parts of the moduli space respectively, it is striking that such connections also seem to have different behaviour from the point of view of integrable systems. In this respect, it would be of particular interest to investigate the one-instanton moduli space on $\mathbb{CP}^2$, where one has $L^2$ and reducible connections in the same moduli space.

7 The author is grateful to Prof. K. Ono for this suggestion.

Acknowledgements. This work was supported by START-project Y237–N13 of the Austrian Science Fund and a Visiting Professorship at the University of Vienna. The author is grateful to the anonymous referee, whose detailed comments led to the significant improvement of this paper.

A. Action of Symmetries on the Patching Matrix

It appears that the direct derivation of the infinitesimal flow, (2.13), from the flow of the $J$-function, (2.12), has not appeared in the literature. We therefore give a proof of this result here. The closest to our derivation that we have found is the corresponding construction for harmonic maps into Lie groups given in [28, §3-4]. For ease of notation, we define the quantities
\[ \alpha(x, \lambda) := \Psi_\infty(x, \lambda)\, T(x, \lambda)\, \Psi_\infty(x, \lambda)^{-1}, \tag{A.1a} \]
\[ \alpha(x, \lambda)^\dagger := \Psi_0(x, \sigma(\lambda))\, T(x, \lambda)^\dagger\, \Psi_0(x, \sigma(\lambda))^{-1}, \tag{A.1b} \]
and recall the solution of the linearisation equation, (2.12), in this notation:
\[ \dot J(x) = \psi_\infty(x)^{-1} \left[ \alpha(x, \lambda) + \alpha(x, \lambda)^\dagger \right] \psi_0(x). \tag{A.2} \]

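Proposition A.1. There exists a function $h_\infty(x, z) \equiv h_\infty(u - zv, v + zu, z)$ with the property that
\[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} - \dot\psi_\infty(x) \psi_\infty(x)^{-1} = \frac{\lambda}{\lambda - z} \left( \alpha(x, \lambda) - \alpha(x, z) \right) + \frac{1}{1 + z\bar\lambda} \left( \alpha(x, \lambda)^\dagger - \alpha(x, \sigma(z))^\dagger \right) - \Psi_\infty(x, z)\, h_\infty(z)\, \Psi_\infty(x, z)^{-1}, \tag{A.3} \]
for all $z \in \mathbb{CP}^1$ such that $z \neq 0, \lambda, -1/\bar\lambda$. Similarly, there exists a function $h_0(x, z) \equiv h_0(u - zv, v + zu, z)$ such that
\[ \dot\Psi_0(x, z) \Psi_0(x, z)^{-1} - \dot\psi_0(x) \psi_0(x)^{-1} = \frac{z}{\lambda - z} \left( \alpha(x, \lambda) - \alpha(x, z) \right) - \frac{z\bar\lambda}{1 + z\bar\lambda} \left( \alpha(x, \lambda)^\dagger - \alpha(x, \sigma(z))^\dagger \right) + \Psi_0(x, z)\, h_0(z)\, \Psi_0(x, z)^{-1}, \tag{A.4} \]
for all $z \in \mathbb{CP}^1$ such that $z \neq \infty, \lambda, -1/\bar\lambda$.

Proof. From (A.2), we deduce that
\[ \dot\psi_0(x) \psi_0(x)^{-1} - \dot\psi_\infty(x) \psi_\infty(x)^{-1} = \alpha(x, \lambda) + \alpha(x, \lambda)^\dagger. \tag{A.5} \]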


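From the defining relations for $\Psi_0(x, z)$, $\Psi_\infty(x, z)$ we deduce that the derivatives of the components of the connection are given by
\[ \dot A_u - z \dot A_v = -(D_u - z D_v) \left[ \dot\Psi_0(x, z) \Psi_0(x, z)^{-1} \right] = -(D_u - z D_v) \left[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} \right], \]
\[ \dot A_v + z \dot A_u = -(D_v + z D_u) \left[ \dot\Psi_0(x, z) \Psi_0(x, z)^{-1} \right] = -(D_v + z D_u) \left[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} \right]. \]
This expression implies that
\[ (D_u - z D_v) \left[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} \right] = D_u \left[ \dot\psi_0(x) \psi_0(x)^{-1} \right] - z D_v \left[ \dot\psi_\infty(x) \psi_\infty(x)^{-1} \right], \]
\[ (D_v + z D_u) \left[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} \right] = D_v \left[ \dot\psi_0(x) \psi_0(x)^{-1} \right] + z D_u \left[ \dot\psi_\infty(x) \psi_\infty(x)^{-1} \right]. \]
We need to solve these equations for $\Psi_\infty(x, z)$ with the boundary condition that $\dot\Psi_\infty(x, z) \to \dot\psi_\infty(x)$ as $z \to \infty$. These equations may be rewritten in the form
\[ (D_u - z D_v) \left[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} - \dot\psi_\infty(x) \psi_\infty(x)^{-1} \right] = D_u \left[ \alpha(x, \lambda) + \alpha(x, \lambda)^\dagger \right], \]
\[ (D_v + z D_u) \left[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} - \dot\psi_\infty(x) \psi_\infty(x)^{-1} \right] = D_v \left[ \alpha(x, \lambda) + \alpha(x, \lambda)^\dagger \right]. \]
We now note that
\[ (D_u - \lambda D_v)\, \alpha(x, \lambda) = (D_v + \lambda D_u)\, \alpha(x, \lambda) = 0. \]
Therefore, for all $z \neq \lambda$,
\[ D_u \alpha(x, \lambda) = \frac{\lambda}{\lambda - z} (D_u - z D_v)\, \alpha(x, \lambda), \qquad D_v \alpha(x, \lambda) = \frac{\lambda}{\lambda - z} (D_v + z D_u)\, \alpha(x, \lambda). \]
Similarly,
\[ (D_v + \bar\lambda D_u)\, \alpha(x, \lambda)^\dagger = (D_u - \bar\lambda D_v)\, \alpha(x, \lambda)^\dagger = 0, \]
from which we deduce that, for all $z \neq -1/\bar\lambda$,
\[ D_u \alpha(x, \lambda)^\dagger = \frac{1}{1 + z\bar\lambda} (D_u - z D_v)\, \alpha(x, \lambda)^\dagger, \qquad D_v \alpha(x, \lambda)^\dagger = \frac{1}{1 + z\bar\lambda} (D_v + z D_u)\, \alpha(x, \lambda)^\dagger. \]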


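Hence,
\[ (D_u - z D_v) \left[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} - \dot\psi_\infty(x) \psi_\infty(x)^{-1} - \frac{\lambda}{\lambda - z}\, \alpha(x, \lambda) - \frac{1}{1 + z\bar\lambda}\, \alpha(x, \lambda)^\dagger \right] = 0, \]
and, similarly, $(D_v + z D_u) [\dots] = 0$. It then follows that there exists a function $H_\infty(u - zv, v + zu, z)$ with the property that
\[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} - \dot\psi_\infty(x) \psi_\infty(x)^{-1} = \frac{\lambda}{\lambda - z}\, \alpha(x, \lambda) + \frac{1}{1 + z\bar\lambda}\, \alpha(x, \lambda)^\dagger - \Psi_\infty(x, z)\, H_\infty(z)\, \Psi_\infty(x, z)^{-1}. \tag{A.6} \]
Taking
\[ H_\infty(x, z) = h_\infty(x, z) + \Psi_\infty(x, z)^{-1} \left[ \frac{\lambda}{\lambda - z}\, \alpha(x, z) + \frac{1}{1 + z\bar\lambda}\, \alpha(x, \sigma(z))^\dagger \right] \Psi_\infty(x, z) \]
cancels the poles in the first two terms on the right-hand side of (A.6), and yields Eq. (A.3). A similar argument for $\Psi_0(x, z)$ yields Eq. (A.4).

Lemma A.1.
\[ \dot G(z) = T(z) G(z) + G(z) T^*(z) + h_\infty(z) G(x, z) + G(x, z) h_0(z). \]

Proof. Firstly,
\[ \dot G(z) = \frac{\partial}{\partial t} \left( \Psi_\infty(x, z)^{-1} \cdot \Psi_0(x, z) \right) = \Psi_\infty(x, z)^{-1} \left[ \dot\Psi_0(x, z) \cdot \Psi_0(x, z)^{-1} - \dot\Psi_\infty(x, z) \cdot \Psi_\infty(x, z)^{-1} \right] \Psi_0(x, z). \]
Now use Eqs. (A.1), (A.3), (A.4) and (A.5).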
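The two steps compressed into the proof of Lemma A.1 can be written out as follows. This is a sketch only: it assumes $G = \Psi_\infty^{-1} \Psi_0$ as in the proof, the readings of (A.3) and (A.4) in which $h_\infty$ enters with a minus sign and $h_0$ with a plus sign, and the notation $T^*(z) := T(x, \sigma(z))^\dagger$ (the last is our assumption about the meaning of $T^*$).

```latex
% Step 1: derivative of an inverse, from differentiating
% \Psi_\infty \Psi_\infty^{-1} = \mathrm{Id}, followed by the product rule.
\begin{align*}
\frac{\partial}{\partial t}\bigl(\Psi_\infty^{-1}\bigr)
  &= -\Psi_\infty^{-1}\,\dot\Psi_\infty\,\Psi_\infty^{-1}, \\
\dot G &= -\Psi_\infty^{-1}\dot\Psi_\infty\Psi_\infty^{-1}\Psi_0
          + \Psi_\infty^{-1}\dot\Psi_0
        = \Psi_\infty^{-1}\bigl[\dot\Psi_0\Psi_0^{-1}
          - \dot\Psi_\infty\Psi_\infty^{-1}\bigr]\Psi_0 .
\end{align*}
% Step 2: subtract (A.3) from (A.4) and use (A.5). The prefactors of the
% \alpha-differences combine as
%   z/(\lambda - z) - \lambda/(\lambda - z) = -1,
% and likewise the two prefactors of the dagger-differences sum to -1, so the
% \alpha(x,\lambda) and \alpha(x,\lambda)^\dagger terms cancel against (A.5):
\begin{align*}
\dot\Psi_0\Psi_0^{-1} - \dot\Psi_\infty\Psi_\infty^{-1}
  &= \alpha(x,z) + \alpha(x,\sigma(z))^\dagger
     + \Psi_0 h_0 \Psi_0^{-1} + \Psi_\infty h_\infty \Psi_\infty^{-1} .
\end{align*}
% Step 3: conjugate with \Psi_\infty^{-1}(\,\cdot\,)\Psi_0, using (A.1):
%   \Psi_\infty^{-1}\alpha(x,z)\Psi_0 = T(z)\,G(z),
%   \Psi_\infty^{-1}\alpha(x,\sigma(z))^\dagger\Psi_0 = G(z)\,T^*(z).
\begin{align*}
\dot G(z) &= T(z)G(z) + G(z)T^*(z) + h_\infty(z)G(z) + G(z)h_0(z) .
\end{align*}
```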

The left-hand side of (A.4) is analytic for $|z| < 1 + \epsilon$. Any singularities in this region that occur in the first two terms on the right-hand side must therefore be cancelled by corresponding singularities in the function $h_0$. It turns out that this consideration is enough to determine $h_0$ up to addition of a function of $(u - zv, v + zu, z)$ that is holomorphic on the region $|z| < 1 + \epsilon$. Similar remarks apply to $h_\infty$ and Eq. (A.3).

Proposition A.2. There exists a function $\rho_0(x, z) \equiv \rho_0(u - zv, v + zu, z)$, holomorphic for $|z| < 1 + \epsilon$, with the property that on the region $\frac{1}{1+\epsilon} < |z| < 1 + \epsilon$ we have
\[ \dot\Psi_0(x, z) \Psi_0(x, z)^{-1} - \dot\psi_0(x) \psi_0(x)^{-1} = \frac{1}{\lambda - z} \left( z\, \alpha(x, \lambda) - \lambda\, \alpha(x, z) \right) - \frac{1}{1 + z\bar\lambda} \left( z\bar\lambda\, \alpha(x, \lambda)^\dagger + \alpha(x, \sigma(z))^\dagger \right) + \Psi_0(x, z)\, \rho_0(z)\, \Psi_0(x, z)^{-1}. \tag{A.7} \]


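Proof. Rearranging Eq. (A.4) yields
\[ h_0(z) = \chi_0(z)^{-1} \dot\chi_0(z) - \frac{z}{\lambda - z}\, \Psi_0(z)^{-1} \left( \alpha(x, \lambda) - \alpha(x, z) \right) \Psi_0(z) + \frac{z\bar\lambda}{1 + z\bar\lambda}\, \Psi_0(z)^{-1} \left( \alpha(x, \lambda)^\dagger - \alpha(x, \sigma(z))^\dagger \right) \Psi_0(z). \]
Since $\Psi_0$ is analytic for $|z| < 1 + \epsilon$ and the poles at $z = \lambda, \sigma(\lambda)$ have been cancelled, it follows that $h_0$ is analytic for $\frac{1}{1+\epsilon} < |z| < 1 + \epsilon$. We may therefore split $h_0(z) = h_0^{(0)}(z) + h_0^{(\infty)}(z)$, where $h_0^{(0)}$ is analytic for $|z| < 1 + \epsilon$ and $h_0^{(\infty)}$ is analytic for $|z| > \frac{1}{1+\epsilon}$. For $|z| > \frac{1}{1+\epsilon}$, we have
\[ h_0^{(\infty)}(z) = -\frac{1}{2\pi i} \oint_{\gamma_-} \frac{h_0(w)}{w - z}\, dw, \]
where $\gamma_- = \{ w \in \mathbb{C} : |w| = \frac{1}{1+\epsilon'} \}$, where $\epsilon' < \epsilon$ is chosen such that $|z| > \frac{1}{1+\epsilon'}$. Using the fact that $\chi$ and $\Psi_0$ are analytic for $|z| < \frac{1}{1+\epsilon'}$, we find that
\[ h_0^{(\infty)}(z) = \frac{1}{2\pi i} \oint_{\gamma_-} \frac{1}{w - z} \left[ \frac{w\bar\lambda}{1 + w\bar\lambda}\, T^*(w) - \frac{w}{\lambda - w}\, G(w)^{-1} T(w) G(w) \right] dw \]
for $|z| > \frac{1}{1+\epsilon}$. Differentiating under the integral sign, we find that
\[ (\partial_u - z \partial_v)\, h_0^{(\infty)}(z) = \partial_u K(x), \qquad (\partial_v + z \partial_u)\, h_0^{(\infty)}(z) = \partial_v K(x), \]
where
\[ K(x) := \frac{1}{2\pi i} \oint_{\gamma_-} \left[ \frac{1}{w - \sigma(\lambda)}\, T^*(w) + \frac{1}{w - \lambda}\, G(w)^{-1} T(w) G(w) \right] dw. \]
Note that this expression is independent of $z$. In order to construct the function $h_0$, we must find a function $h_0^{(0)}$, holomorphic (in $z$) for $|z| < 1 + \epsilon$, with the property that
\[ (\partial_u - z \partial_v)\, h_0^{(0)}(z) = -\partial_u K(x), \qquad (\partial_v + z \partial_u)\, h_0^{(0)}(z) = -\partial_v K(x). \]
To construct such a function, we define the contour $\gamma_+ = \{ w \in \mathbb{C} : |w| = 1 + \epsilon' \}$ and deduce that
\[ K(x) = \frac{1}{2\pi i} \oint_{\gamma_+} \left[ \frac{1}{w - \sigma(\lambda)}\, T^*(w) + \frac{1}{w - \lambda}\, G(w)^{-1} T(w) G(w) \right] dw - T^*(\sigma(\lambda)) - G(\lambda)^{-1} T(\lambda) G(\lambda). \]
We then find that, for $|z| < 1 + \epsilon$,
\[ -\partial_u K(x) = -\frac{1}{2\pi i} \oint_{\gamma_+} \left[ \frac{1}{w - \sigma(\lambda)}\, \partial_u T^*(w) + \frac{1}{w - \lambda}\, \partial_u \left( G(w)^{-1} T(w) G(w) \right) \right] dw + \partial_u T^*(\sigma(\lambda)) + \partial_u \left( G(\lambda)^{-1} T(\lambda) G(\lambda) \right) = (\partial_u - z \partial_v)\, \Phi(x, \lambda, z), \]
where
\[ \Phi(x, \lambda, z) := -\frac{1}{2\pi i} \oint_{\gamma_+} \frac{w}{w - z} \left[ \frac{1}{w - \sigma(\lambda)}\, T^*(w) + \frac{1}{w - \lambda}\, G(w)^{-1} T(w) G(w) \right] dw + \frac{\sigma(\lambda)}{\sigma(\lambda) - z}\, T^*(\sigma(\lambda)) + \frac{\lambda}{\lambda - z}\, G(\lambda)^{-1} T(\lambda) G(\lambda), \]
with a similar expression for $-\partial_v K(x)$. Again cancelling the poles at $z = \lambda, \sigma(\lambda)$, we deduce that, for $|z| < 1 + \epsilon$, we may take
\[ h_0^{(0)}(z) = \rho_0(z) + \frac{\lambda}{\lambda - z} \left[ G(\lambda)^{-1} T(\lambda) G(\lambda) - G(z)^{-1} T(z) G(z) \right] - \frac{1}{2\pi i} \oint_{\gamma_+} \frac{w}{w - z} \left[ \frac{1}{w - \sigma(\lambda)}\, T^*(w) + \frac{1}{w - \lambda}\, G(w)^{-1} T(w) G(w) \right] dw + \frac{\sigma(\lambda)}{\sigma(\lambda) - z} \left[ T^*(\sigma(\lambda)) - T^*(z) \right], \]
where $\rho_0 = \rho_0(u - zv, v + zu, z)$ is analytic for $|z| < 1 + \epsilon$. Finally, we note that, in the region $\frac{1}{1+\epsilon} < |z| < 1 + \epsilon$ we have
\[ h_0(z) = h_0^{(0)}(z) + h_0^{(\infty)}(z) = \frac{z}{\lambda - z}\, G(z)^{-1} T(z) G(z) - \frac{z\bar\lambda}{1 + z\bar\lambda}\, T^*(z) + \rho_0(z). \tag{A.8} \]
Substituting this expression into (A.4) yields (A.7).

Theorem A.1. On the region $\frac{1}{1+\epsilon} < |z| < 1 + \epsilon$ we have
\[ \dot G(z) = -T(z) G(z) - G(z) T^*(z) + \rho_\infty(z) G(x, z) + G(x, z) \rho_0(z). \]

Proof. The reality conditions for $\Psi_0$ and $\Psi_\infty$ imply that $h_\infty(z) = h_0^*(z)$. The result then follows from Lemma A.1 and Eq. (A.8).

Remark A.1. Since the functions $\rho_0$, $\rho_\infty$ are holomorphic in $(u - zv, v + zu, z)$ and analytic for $|z| < 1 + \epsilon$, $|z| > \frac{1}{1+\epsilon}$, respectively, they simply generate holomorphic changes of basis on these regions. As such, modulo holomorphic changes of basis, the symmetry (2.12) generates the flow
\[ \dot G(z) = -T(z) G(z) - G(z) T^*(z) \]
for the patching matrix. Since $T$ is independent of $t$, the corresponding one-parameter group of transformations determined by $T$ with initial conditions the patching matrix $G_0(x, z)$ is of the form
\[ G(t; x, z) = \exp\left( -t\, T(x, z) \right) G_0(x, z) \exp\left( -t\, T^*(x, z) \right). \]
In particular, we recover the group action constructed on heuristic grounds by Crane [9]: Given a map $h : X \times S^1 \to \mathrm{SL}_2(\mathbb{C})$ that extends to a holomorphic map $\tilde h : X \times V_\epsilon \to \mathrm{SL}_2(\mathbb{C})$ (where holomorphic means with respect to the complex structure on $X \times V_\epsilon$ as a subset of $\mathbb{CP}^3$), the group action on the patching matrix is of the form
\[ G(x, z) \mapsto (h \cdot G)(x, z) := \tilde h(x, z)\, G(x, z)\, \tilde h^*(x, z). \]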
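As a quick check of the last display in Remark A.1, the following sketch verifies that the exponential formula solves the flow equation and composes as a one-parameter group. It uses only the fact, stated in the remark, that $T$ does not depend on $t$; the dependence on $(x, z)$ is suppressed.

```latex
% With T independent of t:  d/dt e^{-tT} = -T e^{-tT} = -e^{-tT} T.
\begin{align*}
G(t) &:= e^{-tT}\, G_0\, e^{-tT^*}, \qquad G(0) = G_0, \\
\dot G(t) &= -T\, e^{-tT} G_0\, e^{-tT^*} - e^{-tT} G_0\, e^{-tT^*}\, T^*
           = -T\, G(t) - G(t)\, T^* .
\end{align*}
% One-parameter group property: flowing for time t and then for time s
% is the same as flowing for time s + t.
\begin{align*}
e^{-sT}\, G(t)\, e^{-sT^*} &= e^{-(s+t)T}\, G_0\, e^{-(s+t)T^*} = G(s+t) .
\end{align*}
```

In the form of the group action quoted from Crane [9], this corresponds to the choice $\tilde h = \exp(-tT)$. If $T$ depended on $t$, as $d(t : x, z)$ does in (4.5), the exponential would have to be replaced by the solution $\alpha$ of the ordinary differential equation in Remark 4.2.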


References

1. Arsenault, G., Jacques, M., Saint-Aubin, Y.: Collapse and exponentiation of infinite symmetry algebras of Euclidean projective and Grassmannian σ models. J. Math. Phys. 29, 1465–1471 (1988)
2. Atiyah, M.F.: Geometry of Yang–Mills fields. Scuola Normale Superiore Pisa, Pisa, 1979
3. Atiyah, M.F.: Instantons in two and four dimensions. Commun. Math. Phys. 93, 437–451 (1984)
4. Atiyah, M.F., Hitchin, N.J., Drinfel'd, V.G., Manin, Y.I.: Construction of instantons. Phys. Lett. A 65, 185–187 (1978)
5. Atiyah, M.F., Hitchin, N.J., Singer, I.M.: Self-duality in four-dimensional Riemannian geometry. Proc. Roy. Soc. London Ser. A 362, 425–461 (1978)
6. Chau, L.L., Ge, M.L., Sinha, A., Wu, Y.S.: Hidden-symmetry algebra for the self-dual Yang–Mills equation. Phys. Lett. B 121, 391–396 (1983)
7. Chau, L.L., Ge, M.L., Wu, Y.S.: Kac–Moody algebra in the self-dual Yang–Mills equation. Phys. Rev. D (3) 25, 1086–1094 (1982)
8. Chau, L.-L., Wu, Y.S.: More about hidden-symmetry algebra for the self-dual Yang–Mills system. Phys. Rev. D (3) 26, 3581–3592 (1982)
9. Crane, L.: Action of the loop group on the self-dual Yang–Mills equation. Commun. Math. Phys. 110, 391–414 (1987)
10. Dolan, L.: Kac–Moody algebra is hidden symmetry of chiral models. Phys. Rev. Lett. 47, 1371–1374 (1981)
11. Donaldson, S.K.: An application of gauge theory to four-dimensional topology. J. Diff. Geom. 18, 279–315 (1983)
12. Donaldson, S.K.: Instantons and geometric invariant theory. Commun. Math. Phys. 93, 453–460 (1984)
13. Donaldson, S.K., Kronheimer, P.B.: The Geometry of Four-Manifolds. Oxford Mathematical Monographs, New York: The Clarendon Press/Oxford University Press, 1990
14. Freed, D.S., Uhlenbeck, K.K.: Instantons and Four-Manifolds, Vol. 1 of Mathematical Sciences Research Institute Publications, New York: Springer-Verlag, Second ed., 1991
15. Grant, J.D.E.: Reducible connections and non-local symmetries of the self-dual Yang–Mills equations. Commun. Math. Phys. doi:10.1007/s00220-010-1025-8
16. Guest, M.A.: Harmonic Maps, Loop Groups, and Integrable Systems, Vol. 38 of London Mathematical Society Student Texts, Cambridge: Cambridge University Press, 1997
17. Guest, M.A., Ohnita, Y.: Group actions and deformations for harmonic maps. J. Math. Soc. Japan 45, 671–704 (1993)
18. Ivanova, T.A.: On infinite-dimensional algebras of symmetries of the self-dual Yang–Mills equations. J. Math. Phys. 39, 79–87 (1998)
19. Ivanova, T.A.: On infinitesimal symmetries of the self-dual Yang–Mills equations. J. Nonlinear Math. Phys. 5, 396–404 (1998)
20. Jacques, M., Saint-Aubin, Y.: Infinite-dimensional Lie algebras acting on the solution space of various σ models. J. Math. Phys. 28, 2463–2479 (1987)
21. Mason, L.J., Woodhouse, N.M.J.: Integrability, Self-duality, and Twistor Theory. Vol. 15 of London Mathematical Society Monographs. New Series, New York: The Clarendon Press/Oxford University Press, 1996
22. Nakamura, Y.: Nonlinear integrable flow on the framed moduli space of instantons. Lett. Math. Phys. 20, 135–140 (1990)
23. Okonek, C., Schneider, M., Spindler, H.: Vector Bundles on Complex Projective Spaces, Vol. 3 of Progress in Mathematics, Boston, MA: Birkhäuser, 1980
24. Park, Q.-H.: 2D sigma model approach to 4D instantons. Int. J. Mod. Phys. A 7, 1415–1447 (1992)
25. Popov, A.D.: Self-dual Yang–Mills: symmetries and moduli space. Rev. Math. Phys. 11, 1091–1149 (1999)
26. Popov, A.D., Preitschopf, C.R.: Extended conformal symmetries of the self-dual Yang–Mills equations. Phys. Lett. B 374, 71–79 (1996)
27. Takasaki, K.: A new approach to the self-dual Yang–Mills equations. Commun. Math. Phys. 94, 35–59 (1984)
28. Uhlenbeck, K.: Harmonic maps into Lie groups: classical solutions of the chiral model. J. Diff. Geom. 30, 1–50 (1989)
29. Uhlenbeck, K.K.: Removable singularities in Yang–Mills fields. Commun. Math. Phys. 83, 11–29 (1982)
30. Ward, R.S.: On self-dual gauge fields. Phys. Lett. A 61, 81–82 (1977)

Communicated by N.A. Nekrasov

Commun. Math. Phys. 296, 429–446 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1025-8

Communications in

Mathematical Physics

Reducible Connections and Non-local Symmetries of the Self-dual Yang–Mills Equations

James D. E. Grant

Fakultät für Mathematik, Universität Wien, Nordbergstrasse 15, 1090 Wien, Austria. E-mail: [email protected]

Received: 16 December 2008 / Accepted: 18 December 2009
Published online: 4 March 2010 – © Springer-Verlag 2010

To David E. Williams Abstract: We construct the most general reducible connection that satisfies the selfdual Yang–Mills equations on a simply-connected, open subset of flat R4 . We show how all such connections lie in the orbit of the flat connection on R4 under the action of non-local symmetries of the self-dual Yang–Mills equations. Such connections fit naturally inside a larger class of solutions to the self-dual Yang–Mills equations that are analogous to harmonic maps of finite type. 1. Introduction Reducible connections play an important rôle in Donaldson’s study of four-manifold topology [11]. In particular, the singular ends of the moduli space of global solutions of the self-dual Yang–Mills equations on a four-manifold are due to the existence of connections on which the group of gauge transformations (modulo its centre) do not act freely. In the current paper, we study reducible connections from a different point of view, that of integrable systems theory. In this paper and its companion [13] we investigate the non-local symmetry algebra of the self-dual Yang–Mills equations on R4 discussed in [7–9] and corresponding group actions on spaces of solutions of the self-dual Yang–Mills equations on open subsets of R4 . In [13] we studied instanton moduli spaces, as explicitly described by the ADHM construction [1,2], and group actions that preserved the L 2 nature of the curvature of the connection. In the current work, we investigate reducible connections defined on a simply-connected, open subset of R4 . Given that reducible and irreducible connections play a different rôle in Donaldson’s work, our main motivation was to study whether such connections have different properties from an integrable systems point of view. This does, indeed, seem to be the case. In the case of instanton solutions on R4 , it was argued in [13] that the symmetry group that acts on the moduli space has orbits of high codimension in the moduli space. 
(In other words, the orbits are quite small.) Our conclusion for reducible connections, however, is quite different. After explicitly constructing the most general reducible self-dual


J. D. E. Grant

connection on a simply-connected, open subset of $\mathbb{R}^4$ (there are no reducible self-dual connections on $\mathbb{R}^4$ with $L^2$ curvature), we deduce that all reducible connections lie in the orbit of the flat connection on $\mathbb{R}^4$ under the action of the non-local symmetry group of the self-dual Yang–Mills equations. Also, the reducible connections lie within a larger class of solutions that arise quite naturally from the symmetries of the self-dual Yang–Mills equations. Solutions in this larger family are determined by a holomorphic function, $T$, defined on an open subset of $\mathbb{CP}^3$ that obeys the condition $[T, T^*] = 0$. Such formulae bear a strong resemblance to those arising in the theory of harmonic maps of finite type (see, e.g., [14, Chap. 24]). This result is distinct from the instanton case discussed in [13], which bears more of a resemblance to the theory of harmonic maps of finite uniton number [25].

The organisation of this paper is as follows. In the following section, we set up notation and recall the non-local symmetries of the self-dual Yang–Mills equations on $\mathbb{R}^4$ constructed in [7–9]. We also recall the main results of [10] concerning the twistorial interpretation of these symmetries in terms of their action on the patching matrix of holomorphic bundles over open subsets of $\mathbb{CP}^3$. In Sect. 3, we determine the most general reducible self-dual connection on a simply-connected, open subset of $\mathbb{R}^4$. We show that these may be constructed directly from harmonic functions. We then show that the patching matrix of such connections may be constructed directly. In Sect. 4, we deduce that all such patching matrices, and therefore all reducible self-dual connections, lie in the orbit of the flat connection on $\mathbb{R}^4$. In particular, we see there is a larger class of patching matrices that appear quite naturally from the group action of [10] that contains all reducible connections. Analogies between this larger class of solutions and harmonic maps of finite type are briefly investigated in Sect.
5. After some final remarks, in an Appendix we study some properties of a simplified version of the group action of [10].

As in the companion paper [13], we study only the non-local symmetries of the self-dual Yang–Mills equations constructed in [7–9], and not symmetries that require the existence of a non-trivial conformal group on our manifold. We also specialise to the case of $SU_2$ structure group, although the generalisation to any classical Lie group is straightforward.

2. Preliminaries

2.1. The self-dual Yang–Mills equations on $\mathbb{R}^4$. Let $U$ be a connected, simply connected open subset of $\mathbb{R}^4$ with its flat metric. From the Cartesian coordinates $(t, x, y, z)$ on $\mathbb{R}^4$, we define complex coordinates

\[ u := t + ix, \qquad v := y - iz \]

on $\mathbb{R}^4 \cong \mathbb{C}^2$. In terms of these coordinates, the metric on $\mathbb{R}^4$ is

\[ g = \frac{1}{2}\left( du \otimes d\bar{u} + d\bar{u} \otimes du + dv \otimes d\bar{v} + d\bar{v} \otimes dv \right), \]

and the standard volume form is

\[ dt \wedge dx \wedge dy \wedge dz = \frac{1}{4}\, du \wedge d\bar{u} \wedge dv \wedge d\bar{v}. \]

In terms of these coordinates, a local basis for the bundle of anti-self-dual two-forms (with respect to the above metric and volume form) on $U$ is given by $\{ du \wedge dv,\ \tfrac{1}{2}(du \wedge d\bar{u} + dv \wedge d\bar{v}),\ d\bar{u} \wedge d\bar{v} \}$.

Reducible Connections and Non-local Symmetries


Let $\pi : P \to U$ be a principal $SU_2$ bundle over $U$. Since we are, essentially, working locally, a connection on $P$ may be represented by an $\mathfrak{su}_2$-valued one-form $A \in \Omega^1(U, \mathfrak{su}_2)$, with curvature $F \in \Omega^2(U, \mathfrak{su}_2)$. In terms of the complex coordinates above, the connection satisfies the self-dual Yang–Mills equations on $U$ if and only if

\[ F_{uv} = 0, \tag{2.1a} \]
\[ F_{u\bar{u}} + F_{v\bar{v}} = 0, \tag{2.1b} \]
\[ F_{\bar{u}\bar{v}} = 0. \tag{2.1c} \]

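Since $U$ is simply-connected, (2.1a) and (2.1c) imply the existence of maps $\psi_0, \psi_\infty : U \to SL_2(\mathbb{C})$ with the property that

\[ A_u = -(\partial_u \psi_\infty)\, \psi_\infty^{-1}, \qquad A_v = -(\partial_v \psi_\infty)\, \psi_\infty^{-1}, \tag{2.2a} \]
\[ A_{\bar{u}} = -(\partial_{\bar{u}} \psi_0)\, \psi_0^{-1}, \qquad A_{\bar{v}} = -(\partial_{\bar{v}} \psi_0)\, \psi_0^{-1}. \tag{2.2b} \]

The fields $\psi_0, \psi_\infty$ are determined by Eqs. (2.2) up to transformations

\[ \psi_0(x) \to \tilde{\psi}_0(x) := \psi_0(x) R(u, v), \qquad \psi_\infty(x) \to \tilde{\psi}_\infty(x) := \psi_\infty(x) S(\bar{u}, \bar{v}), \]

where $R$, $S$ are arbitrary analytic functions of $(u, v)$, $(\bar{u}, \bar{v})$ respectively. We may use this freedom to set, without loss of generality, $\psi_\infty(x) = \left(\psi_0(x)^\dagger\right)^{-1}$, $\forall x \in U$. The remaining freedom in the choice of $\psi_0, \psi_\infty$ is then of the form

\[ \psi_0(x) \to \tilde{\psi}_0(x) := \psi_0(x) R(u, v), \qquad \psi_\infty(x) \to \tilde{\psi}_\infty(x) := \psi_\infty(x) \left( R(u, v)^\dagger \right)^{-1}. \tag{2.3} \]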

In terms of these fields we define the Yang $J$-function, $J : U \to SL_2(\mathbb{C})$, by

\[ J(x) := \psi_\infty(x)^{-1} \cdot \psi_0(x), \qquad x \in U. \]

It follows from the reality properties of $\psi_0, \psi_\infty$ that $J$ is Hermitian,

\[ J(x) = J(x)^\dagger, \qquad x \in U, \]

and that, under the transformation (2.3), $J$ transforms according to the rule

\[ J(x) \to \tilde{J}(x) := R(u, v)^\dagger J(x) R(u, v). \tag{2.4} \]

Substituting into Eq. (2.1b), we find that the connection, $A$, satisfies the self-dual Yang–Mills equations if and only if $J$ satisfies the two (equivalent) versions of the Yang–Pohlmeyer equation

\[ \partial_u \left( J_{\bar{u}} J^{-1} \right) + \partial_v \left( J_{\bar{v}} J^{-1} \right) = 0, \tag{2.5a} \]
\[ \partial_{\bar{u}} \left( J^{-1} J_u \right) + \partial_{\bar{v}} \left( J^{-1} J_v \right) = 0. \tag{2.5b} \]


2.2. Associated linear problem. Let $V$ be a non-empty, open subset of $\mathbb{CP}^1 := \mathbb{C} \cup \{\infty\}$, and consider the following overdetermined system of equations for a map $\Psi : U \times V \to SL_2(\mathbb{C})$:

\[ (\partial_{\bar{v}} + z \partial_u)\, \Psi(x, z) = -(A_{\bar{v}} + z A_u)\, \Psi(x, z), \tag{2.6a} \]
\[ (\partial_{\bar{u}} - z \partial_v)\, \Psi(x, z) = -(A_{\bar{u}} - z A_v)\, \Psi(x, z), \tag{2.6b} \]
\[ \partial_{\bar{z}} \Psi(x, z) = 0. \tag{2.6c} \]

An important property of the self-dual Yang–Mills equations is that they are the integrability condition for this system. In particular, if the connection $A$ satisfies the self-dual Yang–Mills equations on $U$, then there exists $\epsilon > 0$ and a solution $\Psi_0 : U \times V^0_\epsilon \to SL_2(\mathbb{C})$ of this system that is analytic in $z$ for $z \in V^0_\epsilon$, where $V^0_\epsilon := \{ z \in \mathbb{CP}^1 : |z| < 1 + \epsilon \}$.

Notation. We define the involution $\sigma : \mathbb{CP}^1 \to \mathbb{CP}^1$ by $\sigma(z) = -1/\bar{z}$. Given a subset $V \subset \mathbb{CP}^1$ and a map $g : V \to SL_2(\mathbb{C})$, we define a corresponding map $g^* : \sigma(V) \to SL_2(\mathbb{C})$ by $g^*(z) := (g(\sigma(z)))^\dagger$. Similarly, given any map $f : U \times V \to SL_2(\mathbb{C})$, we define a corresponding map $f^* : U \times \sigma(V) \to SL_2(\mathbb{C})$ by $f^*(x, z) := (f(x, \sigma(z)))^\dagger$.

Given the solution, $\Psi_0$, of (2.6), we may now construct another solution $\Psi_\infty : U \times V^\infty_\epsilon \to SL_2(\mathbb{C})$, where $V^\infty_\epsilon := \{ z \in \mathbb{CP}^1 : |z| > \tfrac{1}{1+\epsilon} \}$, by $\Psi_\infty(x, z) := \Psi_0^*(x, z)^{-1}$. The solution $\Psi_\infty$ is analytic in $z$ for $z \in V^\infty_\epsilon$.

Remark 2.1. Note that, for the construction of the connection in Eq. (2.2), we may take $\psi_0(x) := \Psi_0(x, 0)$ and $\psi_\infty(x) := \Psi_\infty(x, \infty)$. We will assume, from now on, that $\psi_0$ and $\psi_\infty$ are defined in this way.

Definition 2.1. The patching matrix (or clutching function, in Uhlenbeck's terminology [25]), $G : U \times V_\epsilon \to SL_2(\mathbb{C})$, is defined by

\[ G(x, z) = \Psi_\infty(x, z)^{-1} \cdot \Psi_0(x, z), \tag{2.7} \]

where $V_\epsilon := V^0_\epsilon \cap V^\infty_\epsilon = \{ z \in \mathbb{CP}^1 : \tfrac{1}{1+\epsilon} < |z| < 1 + \epsilon \}$.

Remark 2.2. Viewing $U \times V_\epsilon$ as a subset of $\pi^{-1}(U) \subseteq \mathbb{CP}^3$, the patching matrix is the transition function of the holomorphic vector bundle over $\mathbb{CP}^3$ corresponding to our self-dual connection $A$ [3,27]. Since $U \times V^0_\epsilon$ and $U \times V^\infty_\epsilon$ are open subsets of $\mathbb{C}^3$, any holomorphic bundle over them is trivial. As such, the bundle over $\pi^{-1}(U)$ is completely determined by the patching matrix $G$ (see, e.g., [10]). The fact that the patching matrix splits as in (2.7) implies that the bundle is trivial on restriction to a line $\pi^{-1}(x)$, for each $x \in U$ [27]. Since the patching matrix obeys the reality condition $G(x, z) = G^*(x, z)$, the bundle admits a Hermitian structure, and the corresponding self-dual connection is an $SU_2$ connection, rather than an $SL_2(\mathbb{C})$ connection.


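2.3. Non-local symmetries. In order to study symmetries of the self-dual Yang–Mills equations, we let $J(\cdot, s) : U_s \to SL_2(\mathbb{C})$ be a one-parameter family of solutions of the Yang–Pohlmeyer equations (2.5). Here, $s \in I$ with $I$ an open interval in $\mathbb{R}$ containing the origin, $J$ is assumed to depend in a $C^1$ fashion on the parameter $s$, and $U_s \subseteq \mathbb{R}^4$ is the open subset of $\mathbb{R}^4$ on which the solution is well defined (i.e. non-singular)$^1$. Taking the derivative with respect to $s$ of (2.5), we find that $J(\cdot, s)$ must satisfy the linearised equation

\[ \partial_u \left( J\, \partial_{\bar{u}}\!\left( J^{-1} \dot{J} \right) J^{-1} \right) + \partial_v \left( J\, \partial_{\bar{v}}\!\left( J^{-1} \dot{J} \right) J^{-1} \right) = 0, \tag{2.8} \]

where $\dot{J} := \partial J / \partial s$.

It is known that the only local symmetries of the self-dual Yang–Mills equations on flat $\mathbb{R}^4$ are gauge transformations and those generated by the action of the conformal group (see, e.g., [22]). However, there exists a non-trivial family of non-local symmetries of the self-dual Yang–Mills equations [7–9], defined as follows. We define maps $\chi_0 : U \times V^0_\epsilon \to SL_2(\mathbb{C})$, $\chi_\infty : U \times V^\infty_\epsilon \to SL_2(\mathbb{C})$ by the relations

\[ \chi_0(x, z) := \psi_0(x)^{-1} \cdot \Psi_0(x, z), \qquad (x, z) \in U \times V^0_\epsilon, \]
\[ \chi_\infty(x, z) := \psi_\infty(x)^{-1} \cdot \Psi_\infty(x, z), \qquad (x, z) \in U \times V^\infty_\epsilon. \]

$\chi_0$ is analytic in $z$ for all $z \in V^0_\epsilon$, with $\chi_0(x, 0) = \mathrm{Id}$, for all $x \in U$, and is a solution of the system

\[ \left( \partial_{\bar{v}} - J_{\bar{v}} J^{-1} + z \partial_u \right) \chi_0(x, z) = 0, \qquad \left( \partial_{\bar{u}} - J_{\bar{u}} J^{-1} - z \partial_v \right) \chi_0(x, z) = 0, \]

for all $(x, z) \in U \times V^0_\epsilon$. Similarly, $\chi_\infty$ is analytic in $z$ for all $z \in V^\infty_\epsilon$, with $\chi_\infty(x, \infty) = \mathrm{Id}$, for all $x \in U$. Note that we have $\chi_\infty(x, \lambda) = \left( \chi_0(x, \sigma(\lambda))^\dagger \right)^{-1}$, for all $(x, \lambda) \in U \times V^\infty_\epsilon$.

Based on the work of [7–9], we have the following result from [10]:

Proposition 2.1. Let $T : U \times V_\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ obey the relations $(\partial_{\bar{v}} + \lambda \partial_u)\, T(x, \lambda) = (\partial_{\bar{u}} - \lambda \partial_v)\, T(x, \lambda) = 0$, and be analytic in $\lambda$ on a neighbourhood, $V_\epsilon$, of the unit circle $S^1 \subset \mathbb{C}$. Then

\[ \dot{J}(x, s) = \chi_\infty(x, \lambda) T(x, \lambda) \chi_\infty(x, \lambda)^{-1} \cdot J + J \cdot \chi_0(x, \sigma(\lambda)) T(x, \lambda)^\dagger \chi_0(x, \sigma(\lambda))^{-1} \]
\[ \phantom{\dot{J}(x, s)} = \psi_\infty(x)^{-1} \left[ \Psi_\infty(x, \lambda) T(x, \lambda) \Psi_\infty(x, \lambda)^{-1} + \Psi_0(x, \sigma(\lambda)) T(x, \lambda)^\dagger \Psi_0(x, \sigma(\lambda))^{-1} \right] \psi_0(x) \tag{2.9} \]

is a solution of the linearisation equation (2.8), for all $x \in U$, and all $\lambda \in V_\epsilon$.

$^1$ We will often notationally suppress the dependence of the domain $U$ on $s$.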


Remark 2.3. In the case where the function $T$ is independent of $(u, v)$, it defines an element of the loop group $\Lambda SL_2(\mathbb{C})$ with a holomorphic extension to an open neighbourhood of $S^1$ in $\mathbb{C}^*$. The algebra of symmetries generated by such $T$ is then isomorphic to the Kac-Moody algebra of $\mathfrak{sl}_2(\mathbb{C})$ [7–9].

The natural question is how to exponentiate the above algebra into a group action on the space of solutions of the self-dual Yang–Mills equations. A solution to this problem is given by the following result.

Theorem 2.1. [10, Chap. IV.C]. Let $g : X \times S^1 \to SL_2(\mathbb{C})$ be a smooth map that admits a continuous extension to a holomorphic map $g : X \times V_\epsilon \subset \mathbb{CP}^3 \to SL_2(\mathbb{C})$, for some $\epsilon > 0$. Then the action of $g$ on the patching matrix, $G(x, z)$, is defined by

\[ G(x, z) \to (g \cdot G)(x, z) := g(x, z) \cdot G(x, z) \cdot g^*(x, z). \tag{2.10} \]

This equation defines an action of the set of such maps $g$ on the space of self-dual connections on $X$. If $g$ extends holomorphically to $z \in V^0_\epsilon$, then the corresponding transformation is a holomorphic change of basis on the bundle over $\pi^{-1}(X)$, which leaves the self-dual connection, $A$, unchanged.

Remark 2.4. The above group action on the solution space has been given a cohomological description by Park (see [20] and references therein), which has been further investigated by Popov and Ivanova (see [17,18,22] and references therein).

Remark 2.5. The group action (2.10) is slightly unusual since, in integrable systems theory, it is usually adjoint or coadjoint orbits of Lie groups that turn out to be relevant. If we consider the case where $G$ and $g$ are constant, and study the action of $SL_2(\mathbb{C})$ on itself by $(g, G) \to g \cdot G := g G g^\dagger$, then the generic orbits of this action are five-dimensional. As such, the orbits do not carry the invariant symplectic structures that one would associate with, for example, coadjoint orbits. A brief investigation of this orbit structure is given in Appendix A.

The connection between Theorem 2.1 and the transformation (2.9) is given by the following:

Theorem 2.2. Given $T : U \times V_\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ as in Proposition 2.1, the corresponding flow on the space of patching matrices is given by

\[ \dot{G}(x, z) = -T(x, z) G(x, z) - G(x, z) T^*(x, z) + \rho_\infty(x, z) G(x, z) + G(x, z) \rho_0(x, z) \tag{2.11} \]

for $(x, z) \in \mathbb{R}^4 \times V_\epsilon$. In this equation, $\rho_0 : \mathbb{R}^4 \times V^0_\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ and $\rho_\infty : \mathbb{R}^4 \times V^\infty_\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ are analytic functions of $z$ on the respective regions and satisfy

\[ (\partial_{\bar{v}} + z \partial_u)\, \rho_0(x, z) = (\partial_{\bar{u}} - z \partial_v)\, \rho_0(x, z) = 0, \]
\[ (\partial_{\bar{v}} + z \partial_u)\, \rho_\infty(x, z) = (\partial_{\bar{u}} - z \partial_v)\, \rho_\infty(x, z) = 0. \]

Remark 2.6. It follows from a similar argument to that given in the proof of Proposition 1 (b) of [10] that the terms $\rho_0$ and $\rho_\infty$ in the above formula may be absorbed into a change of holomorphic frame on the sets $z \in V^0_\epsilon$ and $V^\infty_\epsilon$, respectively.

Remark 2.7. The group action (2.10) was derived in [10], arguing by analogy with the action of dressing transformations in harmonic map theory. It has been investigated further in, for example, [17,18,22]. The first direct derivation of the infinitesimal result (2.11) from the generator (2.9), to my knowledge, appears in [13].


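3. Reducible Connections

Recall [12, Chap. 3] that a connection, $D$, on an $SU_2$ bundle $\pi : E \to U$ is reducible if the group of gauge transformations $\mathcal{G} := C^\infty(U, SU_2)$ modulo its centre does not act freely on the connection $D$. We now proceed to derive the most general form of a reducible connection on a simply-connected, open subset of $\mathbb{R}^4$. In doing so, we make extensive use of the classical Pauli matrices, which we define as follows:

\[ \boldsymbol{\tau} \equiv (\tau_1, \tau_2, \tau_3) = \left( \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \right). \]

Proposition 3.1. Let $U$ be a simply-connected, open subset of $\mathbb{R}^4$, and $a \in C^\infty(U, \mathbb{R})$ a harmonic function. We define the connection

\[ A = \frac{1}{2} \left( \partial a - \bar{\partial} a \right) \tau_3 \in \Omega^1(U, \mathfrak{su}_2). \tag{3.1} \]

Then the connection, $A$, is reducible and satisfies the self-dual Yang–Mills equations on $U$. Conversely, up to gauge transformation, all reducible self-dual connections on $U$ arise in this way.

Proof. A connection is reducible if and only if there exists a parallel section, $\eta$, of the adjoint bundle $\mathrm{ad}\, \mathfrak{su}_2$ [12, Theorem 3.1]. In terms of local coordinates $(u, v) \in \mathbb{C}^2 \cong \mathbb{R}^4$, and relative to a local trivialisation of the bundle $E$, this condition implies that

\[ \frac{\partial \eta}{\partial x^a} + [A_a, \eta] = 0. \]

It follows from Eqs. (2.2) that, for a parallel section $\eta$, the map $\mathcal{A} : \mathbb{C}^2 \to \mathfrak{sl}_2(\mathbb{C})$ defined by the equation $\eta = \psi_0(x) \mathcal{A}(u, v) \psi_0(x)^{-1}$ is holomorphic. In a similar fashion, using (2.2) and the relationship between $\psi_0$ and $\psi_\infty$, one may verify that $\eta = -\psi_\infty(x) \mathcal{A}(u, v)^\dagger \psi_\infty(x)^{-1}$. These relations imply that

\[ J \mathcal{A}(u, v) + \mathcal{A}(u, v)^\dagger J = 0. \tag{3.2} \]

Note that this equation and the fact that $\det J \neq 0$ imply that $\det \mathcal{A}(u, v) = \overline{\det \mathcal{A}(u, v)}$. Therefore $\det \mathcal{A}(u, v)$ is real. Since $\mathcal{A}$ is holomorphic in $u$ and $v$, it follows that $\det \mathcal{A}$ is a real constant.

Let $\mathcal{A} = (\mathbf{R} + i \mathbf{I}) \cdot \boldsymbol{\tau}$, with $\mathbf{R}, \mathbf{I} : U \to \mathbb{R}^3$. The fact that $\mathcal{A}$ depends holomorphically on $(u, v)$ implies that

\[ \partial_t \mathbf{R} = \partial_x \mathbf{I}, \qquad \partial_x \mathbf{R} = -\partial_t \mathbf{I}, \qquad \partial_y \mathbf{R} = -\partial_z \mathbf{I}, \qquad \partial_z \mathbf{R} = \partial_y \mathbf{I}. \tag{3.3} \]

Moreover, we find that $\det \mathcal{A} = |\mathbf{I}|^2 - |\mathbf{R}|^2 - 2i \langle \mathbf{R}, \mathbf{I} \rangle$. Since $\det \mathcal{A}$ is real, we deduce that $\langle \mathbf{R}, \mathbf{I} \rangle = 0$. We now let $J = \Lambda\, \mathrm{Id} + \boldsymbol{\lambda} \cdot \boldsymbol{\tau}$, with $\Lambda : U \to \mathbb{R}$ and $\boldsymbol{\lambda} : U \to \mathbb{R}^3$, and the condition that $\det J = 1$ implies that we require $\Lambda^2 = 1 + |\boldsymbol{\lambda}|^2$. Imposing (3.2), we find that we require

\[ \Lambda \mathbf{R} = \boldsymbol{\lambda} \times \mathbf{I}. \tag{3.4} \]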
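The Pauli-matrix algebra used in this step can be spot-checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the decomposition $J = \Lambda\,\mathrm{Id} + \boldsymbol{\lambda}\cdot\boldsymbol{\tau}$ with a real scalar here called `Lam` (the name is an assumption), and all helper names are hypothetical. It checks the determinant formula for $(\mathbf{R} + i\mathbf{I})\cdot\boldsymbol{\tau}$ and the identity $J\mathcal{A} + \mathcal{A}^\dagger J = 2(\Lambda \mathbf{R} - \boldsymbol{\lambda}\times\mathbf{I})\cdot\boldsymbol{\tau} + 2\langle\boldsymbol{\lambda},\mathbf{R}\rangle\,\mathrm{Id}$, from which (3.4) follows once (3.2) is imposed.

```python
import math
import random

def pauli_vec(w):
    """Return w . tau = w1*tau1 + w2*tau2 + w3*tau3 as a 2x2 nested tuple."""
    w1, w2, w3 = w
    return ((w3, w1 - 1j * w2), (w1 + 1j * w2, -w3))

def mat_add(X, Y):
    return tuple(tuple(X[i][j] + Y[i][j] for j in range(2)) for i in range(2))

def mat_mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def dagger(X):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(tuple(complex(X[j][i]).conjugate() for j in range(2))
                 for i in range(2))

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def check(R, I, lam):
    """Return the three gaps: det formula, J*A + A^dagger*J identity, det J = 1."""
    Lam = math.sqrt(1 + dot(lam, lam))       # enforces det J = Lam^2 - |lam|^2 = 1
    J = mat_add(((Lam, 0), (0, Lam)), pauli_vec(lam))
    A = pauli_vec(tuple(r + 1j * i for r, i in zip(R, I)))
    det_gap = abs(det(A) - (dot(I, I) - dot(R, R) - 2j * dot(R, I)))
    lhs = mat_add(mat_mul(J, A), mat_mul(dagger(A), J))
    vec = tuple(2 * (Lam * r - c) for r, c in zip(R, cross(lam, I)))
    s = 2 * dot(lam, R)
    rhs = mat_add(pauli_vec(vec), ((s, 0), (0, s)))
    identity_gap = max(abs(lhs[i][j] - rhs[i][j])
                       for i in range(2) for j in range(2))
    return det_gap, identity_gap, abs(det(J) - 1)

random.seed(0)
rand3 = lambda: tuple(random.uniform(-1, 1) for _ in range(3))
gaps = check(rand3(), rand3(), rand3())
```

All three entries of `gaps` should vanish up to rounding for arbitrary real vectors, since the identity holds before any constraint is imposed; (3.2) then forces both $\Lambda\mathbf{R} = \boldsymbol{\lambda}\times\mathbf{I}$ and $\langle\boldsymbol{\lambda},\mathbf{R}\rangle = 0$, the latter being automatic given the former.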


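This equation implies that

\[ \Lambda^2 |\mathbf{R}|^2 = |\boldsymbol{\lambda} \times \mathbf{I}|^2 = |\boldsymbol{\lambda}|^2 |\mathbf{I}|^2 - \langle \mathbf{I}, \boldsymbol{\lambda} \rangle^2. \]

Therefore

\[ \det \mathcal{A} = |\mathbf{I}|^2 - |\mathbf{R}|^2 = \frac{1}{|\boldsymbol{\lambda}|^2} \left( \Lambda^2 |\mathbf{R}|^2 + \langle \mathbf{I}, \boldsymbol{\lambda} \rangle^2 \right) - |\mathbf{R}|^2 = \frac{1}{|\boldsymbol{\lambda}|^2} \left( |\mathbf{R}|^2 + \langle \mathbf{I}, \boldsymbol{\lambda} \rangle^2 \right) \geq 0. \]

For $\boldsymbol{\lambda} \neq 0$, equality occurs in this inequality if and only if $\mathbf{R} = 0$ and $\langle \mathbf{I}, \boldsymbol{\lambda} \rangle = 0$. From (3.4), it follows that, in this case, $\mathbf{R} = \mathbf{I} = 0$, so $\mathcal{A} = 0$. Moreover, if $\boldsymbol{\lambda} = 0$, then (3.2) implies that $\mathbf{R} = 0$, so $\det \mathcal{A} = |\mathbf{I}|^2$, which is, again, strictly positive unless $\mathbf{I} = 0$ and, hence, $\mathcal{A} = 0$.

To summarise, the fact that $\det \mathcal{A}$ is real, along with (3.2), implies that $\det \mathcal{A}$ is a non-negative constant. Moreover, $\det \mathcal{A} = 0$ if and only if $\mathcal{A} = 0$. Since we are assuming $\mathcal{A} \neq 0$ and since any constant multiple of a parallel section is also parallel we may, without loss of generality, assume that $\det \mathcal{A} = 1$. In this case, it follows that the eigenvalues of $\mathcal{A}$ are $\pm i$.

Note that we still have the freedom to rotate the $\psi$'s, as given in (2.3). It follows that $\mathcal{A}$ transforms under the adjoint action of $R^{-1}$:

\[ \mathcal{A}(u, v) \to \tilde{\mathcal{A}}(u, v) := R(u, v)^{-1} \mathcal{A}(u, v) R(u, v) = \mathrm{Ad}_{R^{-1}} \mathcal{A}. \tag{3.5} \]

We now write $\mathcal{A}$ in the form

\[ \mathcal{A}(u, v) = \begin{pmatrix} a(u, v) & b(u, v) \\ c(u, v) & -a(u, v) \end{pmatrix}, \]

where $a, b, c$ are holomorphic functions of $(u, v)$. On a neighbourhood of any point $(u, v)$ at which $a(u, v) \neq -i$, the holomorphic change of frame

\[ R_+(u, v) = \begin{pmatrix} a + i & -b \\ c & a + i \end{pmatrix} \]

has the property that $R_+(u, v)^{-1} \mathcal{A}(u, v) R_+(u, v) = i \tau_3$. Similarly, for $a(u, v) \neq +i$,

\[ R_-(u, v) = \begin{pmatrix} -b & a - i \\ a - i & c \end{pmatrix} \]

gives a holomorphic change of frame with the property that $R_-(u, v)^{-1} \mathcal{A}(u, v) R_-(u, v) = i \tau_3$. As such, given any point $p \in U$, there exists a neighbourhood of $p$ and a holomorphic frame such that $\mathcal{A} = i \tau_3$ in that frame.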
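The diagonalising frames can be checked at a single point with a few lines of arithmetic. This is a sketch under stated assumptions rather than the author's computation: it takes the trace-free matrix with entries $a$, $b$, $c$ and unit determinant, and takes $R_+$ to have columns $(a+i, c)$ and $(-b, a+i)$ and $R_-$ to have columns $(-b, a-i)$ and $(a-i, c)$, which are eigenvectors for the eigenvalues $+i$ and $-i$ respectively. The function names are hypothetical.

```python
def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def inv(X):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    d = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return ((X[1][1] / d, -X[0][1] / d), (-X[1][0] / d, X[0][0] / d))

def frames(a, b):
    """Build A = [[a, b], [c, -a]] with det A = 1 (needs b != 0) and both frames."""
    c = -(1 + a * a) / b                 # enforces -a^2 - b*c = 1
    A = ((a, b), (c, -a))
    R_plus = ((a + 1j, -b), (c, a + 1j))     # invertible when a != -i
    R_minus = ((-b, a - 1j), (a - 1j, c))    # invertible when a != +i
    return A, R_plus, R_minus

def distance_to_i_tau3(A, R):
    """Largest entry of R^{-1} A R - i*tau3 in absolute value."""
    M = mul(mul(inv(R), A), R)
    target = ((1j, 0), (0, -1j))
    return max(abs(M[i][j] - target[i][j]) for i in range(2) for j in range(2))

A, R_plus, R_minus = frames(0.3 + 0.7j, 1.2 - 0.4j)
gap_plus = distance_to_i_tau3(A, R_plus)
gap_minus = distance_to_i_tau3(A, R_minus)
```

Both gaps should be zero up to rounding. The two frames are needed because $\det R_\pm = 2i(a \pm i)$, so each one degenerates exactly where the other is guaranteed to work.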


From (3.2), it follows that there exist real functions $\alpha, \beta$ such that $J = \alpha\, \mathrm{Id} + \beta \tau_3$. Since $\det J = 1$, we have $\alpha^2 = 1 + \beta^2$. Since $J$ is continuous, $\alpha$ will have constant sign, so we assume that $\alpha > 0$. Therefore, since $U$ is assumed simply-connected, we may consistently define a real-valued function $a$ with the property that $\alpha = \cosh a$, $\beta = \sinh a$. It then follows that

\[ J = \exp(a \tau_3). \tag{3.6} \]

We therefore have

\[ J^{-1} J_u = a_u \tau_3, \qquad J^{-1} J_v = a_v \tau_3. \]

Imposing the Yang–Pohlmeyer equation implies that $a$ is harmonic: $(\partial_u \partial_{\bar{u}} + \partial_v \partial_{\bar{v}})\, a = 0$. From (3.6), we see that, up to a gauge transformation, we may take

\[ \psi_0 = \exp\left( \frac{a}{2} \tau_3 \right), \qquad \psi_\infty = \exp\left( -\frac{a}{2} \tau_3 \right). \]

The form of the connection given in Eq. (3.1) then follows from Eq. (2.2). The parallel section, $\eta$, is equal to $i \tau_3$.

Example 1. The case

\[ J(x) = \exp\left( \left( |u|^2 - |v|^2 \right) \tau_3 \right) \]

corresponds to $a = |u|^2 - |v|^2$ and, therefore, defines a reducible connection. In this case, the connection is non-singular on $\mathbb{R}^4$. However, the curvature is not $L^2$, and therefore the connection cannot be extended to $S^4$ [26]. In this particular case the connection is algebraically special, in the sense that, in addition to satisfying the self-dual Yang–Mills equations (2.1), the curvature satisfies

\[ F_{u\bar{v}} = 0, \qquad F_{v\bar{u}} = 0. \]

It can be shown that all algebraically special connections arise in this way, and are thus reducible.

Recall [16] that there is a 1-1 correspondence between harmonic functions on $U \subseteq \mathbb{R}^4$ and sheaf cohomology classes in $H^1(\hat{U}, \mathcal{O}(-2))$, where $\hat{U} := \pi^{-1}(U) \subseteq \mathbb{CP}^3$. In the present case, such a cohomology class may be represented by a holomorphic function$^2$ $f : U \times \mathbb{C}^* \to \mathbb{C}$. In terms of homogeneous coordinates on $\mathbb{CP}^3$, we have $f(\lambda z) = \lambda^{-2} f(z)$. The corresponding harmonic function on $U \subseteq \mathbb{R}^4$ is then given by the contour integral

\[ a(x) = \frac{1}{2\pi i} \oint_\gamma f(u - w \bar{v},\ v + w \bar{u},\ w)\, dw, \tag{3.7} \]

where $\gamma := \{ w \in \mathbb{C} \subset \mathbb{CP}^1 : |w| = 1 \}$.

$^2$ By holomorphic, we mean with respect to the complex structure induced on $U \times \mathbb{C}^*$ as a subset of $\mathbb{CP}^3$.


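Proposition 3.2. Given the connection as in Proposition 3.1, the patching matrix for the holomorphic bundle on $\hat{U}$ may be taken as

\[ G(x, z) = \exp\left( \frac{1}{2} \left( F(x, z) + F^*(x, z) \right) \tau_3 \right), \]

where

\[ F(x, z) := \frac{1}{2\pi i} \oint_\gamma \frac{w + z}{w - z}\, f(u - w \bar{v},\ v + w \bar{u},\ w)\, dw, \tag{3.8} \]

and the holomorphic function $f$ is a representative of the cohomology class in $H^1(\hat{U}, \mathcal{O}(-2))$ corresponding to the harmonic function $a$.

Proof. We assume that there exists $\Psi_0(x, z)$ of the form $\exp\left( \tfrac{1}{2} F(x, z) \tau_3 \right)$, with $F(x, 0) = a(x)$. From (2.6) for $\Psi_0$ and the explicit form of the connection, we deduce that $F$ must be analytic in $z$ and satisfy the relations

\[ (\partial_{\bar{u}} - z \partial_v)\, F(x, z) = (\partial_{\bar{u}} + z \partial_v)\, a(x), \qquad (\partial_{\bar{v}} + z \partial_u)\, F(x, z) = (\partial_{\bar{v}} - z \partial_u)\, a(x). \]

From (3.7) we then calculate

\[ (\partial_{\bar{u}} + z \partial_v)\, a(x) = (\partial_{\bar{u}} + z \partial_v) \frac{1}{2\pi i} \oint_\gamma f(u - w \bar{v}, v + w \bar{u}, w)\, dw \]
\[ = \frac{1}{2\pi i} \oint_\gamma (w + z)\, (\partial_2 f)(u - w \bar{v}, v + w \bar{u}, w)\, dw \]
\[ = \frac{1}{2\pi i} \oint_\gamma \frac{w + z}{w - z}\, (w - z)\, (\partial_2 f)(u - w \bar{v}, v + w \bar{u}, w)\, dw \]
\[ = \frac{1}{2\pi i} \oint_\gamma \frac{w + z}{w - z}\, (\partial_{\bar{u}} - z \partial_v)\, f(u - w \bar{v}, v + w \bar{u}, w)\, dw \]
\[ = (\partial_{\bar{u}} - z \partial_v) \frac{1}{2\pi i} \oint_\gamma \frac{w + z}{w - z}\, f(u - w \bar{v}, v + w \bar{u}, w)\, dw. \]

Along with a similar calculation for $(\partial_{\bar{v}} - z \partial_u)\, a(x)$, we have the above candidate for $F$ given in (3.8). By its definition as a contour integral, it follows that $F$ is analytic in $z$, for $z$ in the interior of $\gamma$. Since the function $\frac{w + z}{w - z} f(u - w \bar{v}, v + w \bar{u}, w)$ is continuous for $w \in \gamma$, and $\gamma$ is compact, the Dominated Convergence Theorem implies that $\lim_{z \to 0} F(x, z) = a(x)$. As such, $F$ has all of the required properties.

Remark 3.1. Note that the patching matrix in Proposition 3.2 takes the form $G(x, z) = \exp(\varphi(x, z) \tau_3)$, where $\varphi$ is a holomorphic function of $(u - z \bar{v},\ v + z \bar{u},\ z)$ that satisfies $\varphi^*(x, z) = \varphi(x, z)$.

Example 2. In the case of our algebraically special connection, where $a = |u|^2 - |v|^2$, we may take $f(x, z) = \frac{1}{z^2} (u - z \bar{v})(v + z \bar{u})$. We then find that $F(x, z) = |u|^2 - |v|^2 - 2 z \bar{u} \bar{v}$ and $G(x, z) = \exp\left( \frac{1}{z} (u - z \bar{v})(v + z \bar{u})\, \tau_3 \right)$.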
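Example 2 lends itself to a direct numerical check. The sketch below is illustrative only and rests on assumptions: the representative is taken to be $f = w^{-2}(u - w\bar{v})(v + w\bar{u})$, the closed form is taken to be $F = |u|^2 - |v|^2 - 2z\bar{u}\bar{v}$, the contour integrals (3.7) and (3.8) are approximated by the trapezoidal rule on the unit circle, and all function names are hypothetical. It also confirms that $\tfrac{1}{2}(F + F^*)$ agrees with the factorised exponent $\tfrac{1}{z}(u - z\bar{v})(v + z\bar{u})$.

```python
import cmath
import math

def f(u, v, w):
    """Assumed representative: (u - w*conj(v)) * (v + w*conj(u)) / w**2."""
    return (u - w * v.conjugate()) * (v + w * u.conjugate()) / w ** 2

def contour_mean(g, n=128):
    """Approximate (1/(2*pi*i)) * integral of g(w) dw over |w| = 1."""
    total = 0j
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)
        total += g(w) * w        # dw = i*w*dtheta, so the prefactors reduce to 1/n
    return total / n

def a_numeric(u, v):
    """Harmonic function from the contour integral (3.7)."""
    return contour_mean(lambda w: f(u, v, w))

def F_numeric(u, v, z):
    """Contour integral (3.8); only meaningful for |z| < 1."""
    return contour_mean(lambda w: (w + z) / (w - z) * f(u, v, w))

def F_closed(u, v, z):
    return abs(u) ** 2 - abs(v) ** 2 - 2 * z * u.conjugate() * v.conjugate()

def sigma(z):
    return -1 / z.conjugate()

def exponent_from_F(u, v, z):
    """(F + F^*)/2, where F^*(x, z) = conj(F(x, sigma(z)))."""
    return 0.5 * (F_closed(u, v, z) + F_closed(u, v, sigma(z)).conjugate())

def exponent_factorised(u, v, z):
    return (u - z * v.conjugate()) * (v + z * u.conjugate()) / z

u, v, z = 0.7 + 0.2j, -0.3 + 0.5j, 0.4 - 0.1j
a_gap = abs(a_numeric(u, v) - (abs(u) ** 2 - abs(v) ** 2))
F_gap = abs(F_numeric(u, v, z) - F_closed(u, v, z))
G_gap = abs(exponent_from_F(u, v, z) - exponent_factorised(u, v, z))
```

The closed form is used for $F^*$ because $\sigma(z)$ lies outside the unit circle when $z$ lies inside it, where the contour integral no longer represents the analytic continuation of $F$.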


4. Orbit of the Flat Connection

Let $T : U \times S^1 \to \mathfrak{sl}_2(\mathbb{C})$ be analytic on a neighbourhood of $U \times S^1$ in $U \times \mathbb{C}^* \subset \mathbb{CP}^3$, with the property that

\[ [T, T^*] = 0. \tag{4.1} \]

Given a solution of the self-dual Yang–Mills equations, described by patching matrix $G(x, z)$, we may consider the one-parameter family of connections generated by $T$ via the flow (2.11). Since the functions $\rho_0, \rho_\infty$ can be removed by a holomorphic change of basis on the regions $U \times V^0_\epsilon$ and $U \times V^\infty_\epsilon$ we may, without loss of generality, fix the holomorphic bases by setting $\rho_0 = \rho_\infty = 0$. The unique solution of (2.11) with initial conditions $G(x, z)$ is then

\[ G_t(x, z) = \exp(-t\, T(x, z))\, G(x, z)\, \exp(-t\, T(x, z)^*). \]

In general, one would perform a Birkhoff splitting of $G_t$, which would yield connections $A_t$ generated from the connection, $A$, corresponding to the patching matrix $G$. (Generally, there will be jumping points at which $G$ does not admit a splitting of the form (2.7), so we will need to shrink the set $U$ accordingly. The set of such points will, generically, be of strictly positive codimension in $U$.)

A case of particular interest to us is when the initial connection is flat, in which case we may take $G(x, z) = 1$. By the commutator condition (4.1), we then have

\[ G_t(x, z) = \exp\left( -t \left( T(x, z) + T(x, z)^* \right) \right) \]

as the patching matrix of connections that lie in this orbit of the flat connection.

As a special case of this construction, let $T = -\tfrac{1}{2} \varphi(x, z) \tau_3$, where $\varphi : U \times V_\epsilon \to \mathbb{C}$ satisfies $(\partial_{\bar{u}} - z \partial_v)\, \varphi = (\partial_{\bar{v}} + z \partial_u)\, \varphi = 0$ and is analytic in $z \in V_\epsilon$ for some $\epsilon > 0$. Then, assuming that $\epsilon$ is chosen sufficiently small that $G(x, z)$ is analytic on $U \times V_\epsilon$, we deduce that

\[ G_t(x, z) = \exp(t \varphi(x, z) \tau_3), \qquad (x, z) \in U \times V_\epsilon. \]

In light of this construction, and the classification of patching matrices arising from reducible connections in the previous section, we deduce:

Theorem 4.1. Let $A$ be a reducible connection on an open subset $U \subset \mathbb{R}^4$. Then $A$ lies on the orbit of the flat connection on $U$ under the action of the non-local symmetry group of the self-dual Yang–Mills equations.

Remark 4.1. As mentioned earlier, the set $U$ on which the self-dual connection is defined will generally shrink under the action of the symmetry group. In the case of reducible connections, where one may start from the flat connection on $\mathbb{R}^4$, there do exist reducible connections defined on the whole of $\mathbb{R}^4$. (Our algebraically special connection is an example of such.) Since there are no non-trivial reducible connections on $S^4$, however, Uhlenbeck's theorem [26] implies that the curvature of such a connection cannot be $L^2$. (This property may also be checked directly from the explicit form of the connection.)


Remark 4.2. Takasaki [24] has argued that, if we drop the reality conditions on self-dual connections and consider $SL_2(\mathbb{C})$ connections rather than $SU_2$ ones, then the group action generated by transformations of the form

\[ \dot{J}(x, s) = \chi_\infty(x, \lambda) T(x, \lambda) \chi_\infty(x, \lambda)^{-1} \cdot J \tag{4.2} \]

is transitive on the space of local solutions of the $SL_2(\mathbb{C})$ self-dual Yang–Mills equations. In the current context, we are explicitly restricting ourselves (via the form of Eq. (2.9)) to transformations that preserve the $SU_2$ nature of the connection, in which case there is no reason to believe that the group action should be transitive. In an analogous situation in the theory of harmonic maps into Lie groups, one can show that transformations analogous to (4.2) map real extended harmonic maps to real harmonic maps if and only if the action is trivial [4, Prop. 3.4] (i.e. the transformation fixes the extended harmonic map). It is, similarly, expected that transformations of the form (4.2) will map $SU_2$ connections to $SU_2$ connections if and only if the connections coincide.

5. Harmonic Maps of Finite Type

It is clear from the discussion in the previous section that the reducible connections on a simply-connected, open subset $U \subseteq \mathbb{R}^4$ are a special case of a more general type of connection. In particular, given a map $T : U \times V_\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ satisfying the commutator condition (4.1), the patching matrix

\[ G(x, z) = \exp\left( T(x, z) + T(x, z)^* \right) \tag{5.1} \]

will generate a solution of the self-dual Yang–Mills equations on a subset of $U$.

The forms of condition (4.1) and the patching matrix (5.1) are reminiscent of formulae that appear when one considers harmonic maps of finite type into Lie groups (see, e.g., [14, Chap. 24] and [5,6] for harmonic maps into $k$-symmetric spaces). Recall that, in this context, we consider Lie groups $\mathcal{G}, \mathcal{G}_1, \mathcal{G}_2$ such that $\mathcal{G} = \mathcal{G}_1 \cdot \mathcal{G}_2$ (in the sense that, given $g \in \mathcal{G}$, there exist unique $g_1 \in \mathcal{G}_1$, $g_2 \in \mathcal{G}_2$ such that $g = g_1 g_2$). At the Lie algebra level, we have a direct sum decomposition $\mathfrak{g} = \mathfrak{g}_1 + \mathfrak{g}_2$, and we denote the projections onto the two summands by $\pi_1, \pi_2$. In the case where $\mathrm{Ad}_{\mathcal{G}_1} \mathfrak{g}_2 \subseteq \mathfrak{g}_2$, various Lax flows on $\mathfrak{g}$ can be solved explicitly. In particular, let $J_1, J_2$ be invariant vector fields on $\mathfrak{g}$$^3$ and consider the Lax equations

\[ \partial_s X(s, t) = [X(s, t), (\pi_1 \circ J_1)(X(s, t))], \tag{5.2a} \]
\[ \partial_t X(s, t) = [X(s, t), (\pi_1 \circ J_2)(X(s, t))], \tag{5.2b} \]

for a map $X : \mathbb{R}^2 \to \mathfrak{g}$ with initial conditions $X(0, 0) = V \in \mathfrak{g}$. These equations are compatible, and the solution to this problem may be written in the form $X(s, t) = \mathrm{Ad}_{F(s, t)^{-1}} V$, where $F : \mathbb{R}^2 \to \mathcal{G}_1$ takes the form

\[ F(s, t) = \exp\left( s\, (\pi_1 \circ J_1)(V) + t\, (\pi_1 \circ J_2)(V) \right), \qquad (s, t) \in \mathbb{R}^2. \]

$^3$ I.e. $J_1, J_2 : \mathfrak{g} \to \mathfrak{g}$ satisfy $J_1(\mathrm{Ad}_g v) = \mathrm{Ad}_g J_1(v)$ for all $g \in \mathcal{G}$ and $v \in \mathfrak{g}$, and similarly for $J_2$.


The connection with harmonic maps arises if we let $G$ be a compact Lie group and use the standard loop-group decompositions (see, e.g., [14, Chap. 12] and [23, Chap. 8]) to take $\mathcal{G} := \Lambda G^{\mathbb{C}}$, $\mathcal{G}_1 := \Omega G$, $\mathcal{G}_2 := \Lambda^+ G^{\mathbb{C}}$. If we now impose that the initial conditions for the Lax equation (5.2) correspond to an element of the loop group of $G$ (rather than $G^{\mathbb{C}}$) and that their Laurent expansion lies between degrees $-d$ and $d$:

\[ V = \left\{ \lambda \mapsto f(\lambda) \equiv \sum_{n=-d}^{d} \alpha_n \lambda^n \right\} \in \Lambda G^{\mathbb{C}}, \]

then it turns out that the map $X$ also has a Laurent expansion that lies between degrees $-d$ and $d$. Moreover, $F : \mathbb{R}^2 \to \Omega G$ is automatically the extended solution corresponding to a harmonic map $\varphi : \mathbb{R}^2 \to G$. (In particular, $\varphi(s, t) = F(s, t)|_{\lambda = -1}$.) Such harmonic maps are of finite type.

In the context of self-dual Yang–Mills connections, the analogue of harmonic maps of finite type would appear to be patching matrices of the form $G(x, z) = \exp \Phi(x, z)$, where $\Phi : U \times V_\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ satisfies

1) $(\partial_{\bar{u}} - z \partial_v)\, \Phi(x, z) = (\partial_{\bar{v}} + z \partial_u)\, \Phi(x, z) = 0$;
2) $\Phi^*(x, z) = \Phi(x, z)$;
3) $\Phi(x, z)$ is analytic in $z$ for $z \in V_\epsilon$ for some $\epsilon > 0$, and there exists $d \in \mathbb{N}_0$ such that $\Phi$ has a finite Laurent expansion of the form

\[ \Phi(x, z) = \sum_{n=-d}^{d} a_n(x) z^n, \qquad (x, z) \in U \times V_\epsilon, \]

for some $a_i : U \to \mathfrak{g}^{\mathbb{C}}$, $i = -d, \dots, d$ on this set.

In particular, fixing a point $p \in U$, the finite Laurent expansion at $p$ is analogous to the initial condition $V$ having finite Laurent expansion. Moreover, Condition 1) above is then the analogue of the Lax equations (5.2) satisfied by the map $X$ in the harmonic map case.

Definition 5.1. We will call a solution of the self-dual Yang–Mills equations for which there exists a patching matrix that satisfies the above criteria a self-dual connection of finite type$^4$.

Remark 5.1. The conditions above imply that the maps $a_i$ satisfy the conditions

\[ \partial_{\bar{u}} a_{-d} = 0, \qquad \partial_{\bar{v}} a_{-d} = 0, \tag{5.3a} \]
\[ \partial_{\bar{u}} a_{n+1} = \partial_v a_n, \qquad \partial_{\bar{v}} a_{n+1} = -\partial_u a_n, \qquad n = -d, \dots, d - 1, \tag{5.3b} \]
\[ \partial_v a_d = 0, \qquad \partial_u a_d = 0. \tag{5.3c} \]

For $d = 0$, we deduce that $G$ is constant, and therefore the corresponding self-dual connection $A$ is flat. For $d \geq 1$, the algebraic condition (4.1) imposes non-trivial restrictions on the coefficients $a_i$. Note that our algebraically special connection is a self-dual connection of type 1.

$^4$ Or, of type $d$, when we wish to be more specific.
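The recursion relations of Remark 5.1 can be tested on the type-1 example. The sketch below is an illustration under assumptions, not the author's calculation: it expands the scalar exponent of the algebraically special connection as $\tfrac{1}{z}(u - z\bar{v})(v + z\bar{u}) = uv\, z^{-1} + (|u|^2 - |v|^2) - \bar{u}\bar{v}\, z$, reads off the scalar coefficients $a_{-1}, a_0, a_1$ multiplying $\tau_3$, and evaluates the Wirtinger derivatives by central differences (exact for quadratic polynomials up to rounding). Setting $a_{-d-1} = a_{d+1} = 0$ lets one loop cover (5.3a), (5.3b) and (5.3c) together. All names are hypothetical.

```python
def wirtinger(g, point, index, h=1e-3):
    """Return (d/dw, d/dwbar) of g at `point`, where w = point[index]."""
    def shifted(delta):
        p = list(point)
        p[index] += delta
        return g(*p)
    d_re = (shifted(h) - shifted(-h)) / (2 * h)            # derivative along Re w
    d_im = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)  # derivative along Im w
    return 0.5 * (d_re - 1j * d_im), 0.5 * (d_re + 1j * d_im)

# Assumed Laurent coefficients of the type-1 example (scalar multiples of tau3).
coeffs = {
    -1: lambda u, v: u * v,
    0: lambda u, v: abs(u) ** 2 - abs(v) ** 2,
    1: lambda u, v: -(u.conjugate() * v.conjugate()),
}

def max_residual(coeffs, d, point):
    """Largest violation of the relations (5.3) at `point`, with a_n = 0 for |n| > d."""
    zero = lambda u, v: 0j
    get = lambda n: coeffs.get(n, zero)
    worst = 0.0
    for n in range(-d - 1, d + 1):
        du_n, _ = wirtinger(get(n), point, 0)
        dv_n, _ = wirtinger(get(n), point, 1)
        _, dubar_next = wirtinger(get(n + 1), point, 0)
        _, dvbar_next = wirtinger(get(n + 1), point, 1)
        worst = max(worst, abs(dubar_next - dv_n), abs(dvbar_next + du_n))
    return worst

point = (0.7 + 0.2j, -0.3 + 0.5j)
residual = max_residual(coeffs, 1, point)
```

The residual should be zero up to finite-difference rounding. Flipping the sign of $a_1$ breaks the relation $\partial_{\bar{u}} a_1 = \partial_v a_0$, which gives a simple sanity check that the test is not vacuous.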


Remark 5.2. Let $A$ be a reducible connection defined by a harmonic function $a$ as in (3.1). Letting $a_0(x) := a(x)$, then $A$ is a self-dual connection of finite type if and only if there exists $d > 0$ and functions $a_{-d}, \dots, a_d$ such that Eqs. (5.3) hold. In particular, this condition implies that

\[ \frac{\partial^d a}{\partial u^r\, \partial v^s} = 0, \qquad \text{for all } r, s \text{ such that } r + s = d. \]

Therefore, $a$ is necessarily a polynomial of degree less than or equal to $d$ in $(u, \bar{u}, v, \bar{v})$. As such, the space of self-dual connections of type $d$ is necessarily finite-dimensional.

Remark 5.3. The most restrictive case is when we impose that $G$ splits in the form (2.7) with

\[ \Psi_0(x, z) = \exp\left( \frac{a_0(x)}{2} + \sum_{n=1}^{d} a_n(x) z^n \right), \qquad \Psi_\infty(x, z) = \exp\left( -\frac{a_0(x)}{2} - \sum_{n=-d}^{-1} a_n(x) z^n \right). \]

Such a splitting only occurs if $[a_i(x), a_j(x)] = 0$, for all $i, j = -d, \dots, d$. Since $SL_2(\mathbb{C})$ is of rank one, it follows that there exists a constant element $\alpha \in \mathfrak{sl}_2(\mathbb{C})$ such that $a_i(x) = \varphi_i(x) \alpha$, for functions $\varphi_i : U \to \mathbb{C}$. A change of basis (rotating so that $\alpha \to \tau_3$) implies that such patching matrices give rise to reducible connections when $a_0$ is real.

Remark 5.4. One of the main differences between the integrable systems approach to harmonic maps and the self-dual Yang–Mills equations is the form of the symmetry group action on the solutions. In the case of harmonic map equations from a domain $X \subseteq \mathbb{R}^2$ to a Lie group $G$, one interprets the harmonic map equations as implying the existence of a holomorphic map $E : X \to \Omega G$ into the based loop group of $G$. The "dressing action" on the space of harmonic maps is then induced by the action of various groups on the group $\Omega G$ [15,25]. In particular, the symmetry group acts only on the space where the map $E$ takes its values, rather than on the map $E$ itself. In the case of the self-dual Yang–Mills equations, the object of study is the patching matrix $G : U \times V_\epsilon \to SL_2(\mathbb{C})$, and the group action (2.10) acts non-trivially on the map $G$. This difference is the main issue that makes the case of self-dual Yang–Mills equations more complicated. As remarked earlier, the particular form of the group action (2.10) implies that many of the techniques used to study orbits in the harmonic map case have no direct analogue in the self-dual Yang–Mills case.

6. Final Remarks

Our main result is that reducible connections that satisfy the self-dual Yang–Mills equations on simply-connected, open subsets of $\mathbb{R}^4$ lie in the orbit of the flat connection under the action of the non-local symmetry group of these equations found in [7–9]. In particular, such connections lie within a larger class of solutions, discussed in Sect. 4, defined by a holomorphic function $T(x, z)$ with the property that $[T(x, z), T^*(x, z)] = 0$. This condition defines a class of solutions of the self-dual Yang–Mills equations that seem quite natural from the integrable systems point of view,


and suggests a connection with the theory of harmonic maps of finite type. Whether the analogy with such harmonic maps may be extended, and techniques developed in, for example [5,6], may be adapted to the study of our class of self-dual connections, is under investigation. It is clear that the work here (and in the sister paper [13]) may be extended in several ways. The investigation of the symmetry group on the one-instanton moduli space on the four-manifold CP 2 would be of particular interest, since, in this case, the standard reducible connection is also L 2 , so we have reducible and irreducible connections in the same moduli space. Such an investigation would yield further information concerning the different behaviour of reducible connections studied here and the instanton connections studied in [13] under the symmetry group. In a different direction, given that the reducible connections and the class discussed in Sect. 4 seem quite a natural family of solutions to investigate from the point of view of integrable systems, it would be of interest to investigate whether there are similar families of self-dual Ricci-flat four-manifolds (for example, those with algebraically special self-dual Weyl tensor) that arise naturally from the symmetries of, for example, Plebanski’s equations [21]. We should also point out that we have exclusively considered the self-dual Yang–Mills equations on Riemannian manifolds, due to the original motivation of Donaldson theory. It is more usual to investigate the integrable systems aspects of the self-dual Yang–Mills equations on manifolds of signature (− − ++) (see, e.g., [19] for an extensive treatment of this topic). It would be of interest to investigate the action of symmetries on, for example, reducible connections in the case of signature (− − ++). 
Viewed in conjunction with the results of the companion paper [13], where instanton solutions of the self-dual Yang–Mills equations were investigated, it appears that the action of the non-local symmetry group on the space of solutions of the self-dual Yang–Mills equations is quite different in the two cases. In the case of instanton moduli spaces, evidence was found that the orbits of the symmetry group that preserve the $L^2$ nature of the curvature of the connection are rather small. In the present case, however, all reducible connections are contained in a single orbit. It appears that the distinction between instanton connections and reducible connections for the self-dual Yang–Mills equations is, in this sense, similar to the distinction between harmonic maps of finite uniton number [25] and harmonic maps of finite type. Since the original motivation for the current work (and [13]) was to investigate connections between integrable systems theory and Donaldson's use of the self-dual Yang–Mills equations in connection with four-dimensional topology [11], it is rather striking that the behaviour of reducible connections and irreducible connections should be so different from the integrable systems point of view. Whether these results point to a deeper relationship between integrable systems theory and topological field theory would certainly seem worthy of further investigation.

Acknowledgements. This work was supported by START-project Y237–N13 of the Austrian Science Fund and a Visiting Professorship at the University of Vienna. The author is grateful to Prof. M.A. Guest for discussions and, in particular, for drawing his attention to harmonic maps of finite type. He would also like to thank the anonymous referee for a detailed critique of the original version of this paper.

A. Constant Group Action

The group action $G(x, z) \mapsto h(x, z)\, G(x, z)\, h^*(x, z)$ is a little unusual. In order to gain some insight into this action, we consider some similar actions on simpler groups, analogous to the case where $G$ and $h$ are constant.


A.1. $\mathrm{SL}_2(\mathbb{R})$. Consider the action of $\mathrm{SL}_2(\mathbb{R})$ on itself given by

$$\mathrm{SL}_2(\mathbb{R}) \times \mathrm{SL}_2(\mathbb{R}) \to \mathrm{SL}_2(\mathbb{R}); \qquad (h, g) \mapsto h \cdot g := h g h^t,$$

where $t$ denotes transpose. The subgroup $\mathrm{PSL}_2(\mathbb{R}) \cong \mathrm{SO}_0(2,1)$ acts effectively. We decompose $g$ into symmetric and skew-symmetric parts

$$g = U + \alpha \epsilon,$$

where $U$ is symmetric, $\alpha \in \mathbb{R}$ and $\epsilon = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. It then follows that $\alpha$ is invariant under the action of $h$ and $U$ transforms according to $U \mapsto h \cdot U := h U h^t$. The fact that $g$ lies in $\mathrm{SL}_2(\mathbb{R})$ implies that $\det U = 1 - \alpha^2$. Writing

$$U = \begin{pmatrix} t + x & y \\ y & t - x \end{pmatrix},$$

then $\det U = -u^2$, where $u = (t, x, y) \in \mathbb{R}^{2,1}$. The orbits of the group action are then parametrised by $\alpha \in \mathbb{R}$ and consist of vectors $u \in \mathbb{R}^{2,1}$ with $u^2 = \alpha^2 - 1$. Since the restriction on $u$ is insensitive to the sign of $\alpha$, we consider the orbits for $\alpha \geq 0$:

$\alpha = 0$: Here there are two orbits consisting of symmetric elements of $\mathrm{SL}_2(\mathbb{R})$. We have $u^2 = -1$, so $u$ lies on the two-sheeted hyperboloid in $\mathbb{R}^{2,1}$, with each sheet constituting an orbit. In this case, giving the orbits the induced hyperbolic metric, the group $\mathrm{SL}_2(\mathbb{R})$ acts isometrically.

$0 < \alpha < 1$: In this case, there are two orbits, i.e. the two components of the hyperboloid $u^2 = -1 + \alpha^2 \in (-1, 0)$ in $\mathbb{R}^{2,1}$. Again, the group $\mathrm{SL}_2(\mathbb{R})$ acts isometrically with respect to the induced metric on the orbits.

$\alpha = 1$: In this case, $u^2 = 0$, so either $u = 0$ or $u$ is null. In the first case, the group orbit consists of the point $u = 0$. In the latter case, the future and past null-cones of the origin give two distinct group orbits.

$\alpha > 1$: In this case, there is one orbit, consisting of the one-sheeted hyperboloid $u^2 = -1 + \alpha^2 \in (0, \infty)$ in $\mathbb{R}^{2,1}$. In this case, $\mathrm{SL}_2(\mathbb{R})$ acts isometrically with respect to the induced (Lorentzian) metric on the orbit.

A.2. $\mathrm{SL}_2(\mathbb{C})$. In particular, we consider the action of $\mathrm{SL}_2(\mathbb{C})$ on itself given by

$$\mathrm{SL}_2(\mathbb{C}) \times \mathrm{SL}_2(\mathbb{C}) \to \mathrm{SL}_2(\mathbb{C}); \qquad (h, g) \mapsto h \cdot g := h g h^*, \tag{A.1}$$

where $*$ denotes complex-conjugate transpose. The subgroup $\mathrm{PSL}_2(\mathbb{C}) \cong \mathrm{SO}_{3,1}$ acts effectively. It is straightforward to check that

$$I[g] := \frac{1}{2} \operatorname{tr}\left( g^{\dagger} g^{-1} \right)$$

is invariant under the transformation $g \mapsto h \cdot g$. It is useful to split $g$ into Hermitian and skew-Hermitian parts:

$$g = U + V, \qquad U^* = U, \quad V^* = -V,$$

and to note that this decomposition is preserved under (A.1) (i.e. $(h \cdot g)_U = h \cdot g_U$, etc.). A straightforward calculation implies that

$$\det U = \frac{1}{2}(I + 1).$$

In particular, letting $U = t + x\tau_1 + y\tau_2 + z\tau_3$ and $u := (t, x, y, z) \in \mathbb{R}^{3,1}$, we deduce that

$$u^2 = -\frac{1}{2}(I + 1). \tag{A.2}$$

Similarly, letting $V = i\,(T + X\tau_1 + Y\tau_2 + Z\tau_3)$ and $v := (T, X, Y, Z) \in \mathbb{R}^{3,1}$, we find that $\det g = -u^2 - 2i\langle u, v\rangle + v^2$. Since $g \in \mathrm{SL}_2(\mathbb{C})$, we therefore deduce that

$$v^2 = -\frac{1}{2}(I - 1), \tag{A.3}$$

$$\langle u, v\rangle = 0. \tag{A.4}$$
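The invariance of $I[g]$ and the identity $\det U = \frac{1}{2}(I+1)$ can be checked numerically. The following is a minimal sketch in plain Python, not taken from the paper: the helper names (`random_sl2c`, `invariant`, `hermitian_part`) are hypothetical, and the random matrices are simply rescaled to unit determinant.

```python
import cmath
import random

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

def dagger(a):
    """Complex-conjugate transpose."""
    return tuple(tuple(a[c][r].conjugate() for c in range(2)) for r in range(2))

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inverse(a):
    d = det(a)
    return ((a[1][1] / d, -a[0][1] / d), (-a[1][0] / d, a[0][0] / d))

def random_sl2c(rng):
    """Random complex 2x2 matrix rescaled so that its determinant is 1."""
    while True:
        m = tuple(
            tuple(complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2))
            for _ in range(2)
        )
        d = det(m)
        if abs(d) > 0.5:  # avoid badly conditioned samples
            s = cmath.sqrt(d)
            return tuple(tuple(x / s for x in row) for row in m)

def invariant(g):
    """I[g] = (1/2) tr(g^dagger g^{-1})."""
    p = matmul(dagger(g), inverse(g))
    return 0.5 * (p[0][0] + p[1][1])

def hermitian_part(g):
    gd = dagger(g)
    return tuple(tuple(0.5 * (g[r][c] + gd[r][c]) for c in range(2)) for r in range(2))

rng = random.Random(0)
g = random_sl2c(rng)
h = random_sl2c(rng)
g_moved = matmul(matmul(h, g), dagger(h))   # h . g = h g h^*
I_before, I_after = invariant(g), invariant(g_moved)
det_U = det(hermitian_part(g))
```

Up to rounding, `I_before` and `I_after` agree, the value is real and at most 1, and `det_U` equals `(I_before + 1) / 2`.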

If $I > 1$ then $u$ and $v$ would be non-zero, orthogonal, time-like vectors in $\mathbb{R}^{3,1}$. Since this cannot occur, we deduce that $I \leq 1$. We investigate the distinct cases separately:

$I = 1$: In this case, $u^2 = -1$ and $v^2 = 0$. As such, $u$ lies on the two-sheeted hyperboloid in $\mathbb{R}^{3,1}$. The condition that $v^2 = 0$ and that $v$ is orthogonal to the non-zero, time-like vector $u$ then implies that $v = 0$. As such, we have two distinct orbits, corresponding to the two components of the two-sheeted hyperboloid. These orbits correspond to the Hermitian elements of $\mathrm{SL}_2(\mathbb{C})$. Giving the orbits the hyperbolic metric induced from $\mathbb{R}^{3,1}$, the group $\mathrm{SL}_2(\mathbb{C})$ acts isometrically.

Note that for $I < 1$, the vector $v$ is always space-like, and lies on the one-sheeted hyperboloid $\mathcal{H}_I := \{ w \in \mathbb{R}^{3,1} : w^2 = \frac{1}{2}(1 - I) \}$ in $\mathbb{R}^{3,1}$.

$-1 < I < 1$: We have $u^2 = -(I + 1)/2$ in $\mathbb{R}^{3,1}$. Since $u$ is orthogonal to $v$, we may view $u$ as a time-like vector of length $\sqrt{(I + 1)/2}$ lying in the two-sheeted hyperboloid in $T_v \mathcal{H}_I$. As such, we have two distinct orbits, consisting of the two components of the two-sheeted hyperboloid bundle in $T\mathcal{H}_I$. In this case, the group action on the orbit is the action induced by the isometric action of $\mathrm{SL}_2(\mathbb{C})$ on the Lorentzian metric induced on the one-sheeted hyperboloid. Alternatively, we may view $u$ as a time-like vector lying on the two-sheeted hyperboloid $u^2 = -(I + 1)/2$ in $\mathbb{R}^{3,1}$. We then view $v$ as a tangent vector to the hyperboloid of length $\sqrt{(1 - I)/2}$, as dictated by (A.3). Therefore the orbits in this case may be identified with the radius $\sqrt{(1 - I)/2}$ sphere sub-bundle of the tangent bundle of the hyperbolic space of radius $\sqrt{(I + 1)/2}$. Again, there are two orbits corresponding to the two components of the hyperboloid. In this case, the group action on the orbit is the action induced by the isometric action of $\mathrm{SL}_2(\mathbb{C})$ on the induced metric on the two-sheeted hyperboloid.

$I = -1$: In this case, $u^2 = 0$ and $v^2 = 1$. As such, we may view $u$ as a null vector in $T_v \mathcal{H}_{-1}$. There are then three distinct orbits. The first consists of $u = 0$, and is simply the hyperboloid $\mathcal{H}_{-1}$. This orbit consists of the skew-Hermitian elements of $\mathrm{SL}_2(\mathbb{C})$.


The other orbits consist of the sub-bundle of $T\mathcal{H}_{-1}$ consisting of the past and future null cone of the origin in each tangent space.

$I < -1$: In this case, $u^2 = -(I + 1)/2 = (|I| - 1)/2 > 0$ in $\mathbb{R}^{3,1}$. Therefore there is one orbit, consisting of the one-sheeted hyperboloid sub-bundle of $T\mathcal{H}_I$. The $\mathrm{SL}_2(\mathbb{C})$ action is that induced by the isometric action on the induced Lorentzian metric on $\mathcal{H}_I$.

References

1. Atiyah, M.F.: Geometry on Yang–Mills Fields. Scuola Normale Superiore Pisa, Pisa, 1979
2. Atiyah, M.F., Hitchin, N.J., Drinfel'd, V.G., Manin, Y.I.: Construction of instantons. Phys. Lett. A 65, 185–187 (1978)
3. Atiyah, M.F., Hitchin, N.J., Singer, I.M.: Self-duality in four-dimensional Riemannian geometry. Proc. Roy. Soc. London Ser. A 362, 425–461 (1978)
4. Bergvelt, M.J., Guest, M.A.: Actions of loop groups on harmonic maps. Trans. Amer. Math. Soc. 326, 861–886 (1991)
5. Burstall, F.E., Pedit, F.: Harmonic maps via Adler-Kostant-Symes theory. In: Harmonic Maps and Integrable Systems, Aspects Math., E23, Braunschweig: Vieweg, 1994, pp. 221–272
6. Burstall, F.E., Pedit, F.: Dressing orbits of harmonic maps. Duke Math. J. 80, 353–382 (1995)
7. Chau, L.L., Ge, M.L., Sinha, A., Wu, Y.S.: Hidden-symmetry algebra for the self-dual Yang–Mills equation. Phys. Lett. B 121, 391–396 (1983)
8. Chau, L.L., Ge, M.L., Wu, Y.S.: Kac–Moody algebra in the self-dual Yang–Mills equation. Phys. Rev. D (3) 25, 1086–1094 (1982)
9. Chau, L.-L., Wu, Y.S.: More about hidden-symmetry algebra for the self-dual Yang–Mills system. Phys. Rev. D (3) 26, 3581–3592 (1982)
10. Crane, L.: Action of the loop group on the self-dual Yang–Mills equation. Commun. Math. Phys. 110, 391–414 (1987)
11. Donaldson, S.K.: An application of gauge theory to four-dimensional topology. J. Diff. Geom. 18, 279–315 (1983)
12. Freed, D.S., Uhlenbeck, K.K.: Instantons and Four-Manifolds, Vol. 1 of Mathematical Sciences Research Institute Publications, New York: Springer-Verlag, Second ed., 1991
13. Grant, J.D.E.: The ADHM construction and non-local symmetries of the self-dual Yang–Mills equations. Commun. Math. Phys. doi:10.1007/s00220-010-1024-9
14. Guest, M.A.: Harmonic Maps, Loop Groups, and Integrable Systems, Vol. 38 of London Mathematical Society Student Texts, Cambridge: Cambridge University Press, 1997
15. Guest, M.A., Ohnita, Y.: Group actions and deformations for harmonic maps. J. Math. Soc. Japan 45, 671–704 (1993)
16. Hitchin, N.J.: Linear field equations on self-dual spaces. Proc. Roy. Soc. London Ser. A 370, 173–191 (1980)
17. Ivanova, T.A.: On infinite-dimensional algebras of symmetries of the self-dual Yang–Mills equations. J. Math. Phys. 39, 79–87 (1998)
18. Ivanova, T.A.: On infinitesimal symmetries of the self-dual Yang–Mills equations. J. Nonlinear Math. Phys. 5, 396–404 (1998)
19. Mason, L.J., Woodhouse, N.M.J.: Integrability, Self-Duality, and Twistor Theory, Vol. 15 of London Mathematical Society Monographs. New Series, New York: The Clarendon Press/Oxford University Press, 1996
20. Park, Q.-H.: 2D sigma model approach to 4D instantons. Int. J. Mod. Phys. A 7, 1415–1447 (1992)
21. Plebański, J.F.: Some solutions of complex Einstein equations. J. Math. Phys. 16, 2395–2402 (1975)
22. Popov, A.D.: Self-dual Yang–Mills: symmetries and moduli space. Rev. Math. Phys. 11, 1091–1149 (1999)
23. Pressley, A., Segal, G.: Loop Groups, Oxford Mathematical Monographs, New York: The Clarendon Press/Oxford University Press, 1986
24. Takasaki, K.: A new approach to the self-dual Yang–Mills equations. Commun. Math. Phys. 94, 35–59 (1984)
25. Uhlenbeck, K.: Harmonic maps into Lie groups: classical solutions of the chiral model. J. Diff. Geom. 30, 1–50 (1989)
26. Uhlenbeck, K.K.: Removable singularities in Yang–Mills fields. Commun. Math. Phys. 83, 11–29 (1982)
27. Ward, R.S.: On self-dual gauge fields. Phys. Lett. A 61, 81–82 (1977)

Communicated by N.A. Nekrasov

Commun. Math. Phys. 296, 447–474 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1018-7

Communications in

Mathematical Physics

Random Current Representation for Transverse Field Ising Model

Nicholas Crawford1, Dmitry Ioffe2

1 Department of Statistics, UC Berkeley, Berkeley, CA 94720, USA
2 Faculty of Industrial Engineering, Technion, Haifa 3200, Israel.

E-mail: [email protected] Received: 8 February 2009 / Accepted: 14 December 2009 Published online: 2 March 2010 – © Springer-Verlag 2010

Abstract: Random current representation (RCR) for transverse field Ising models (TFIM) was introduced in [14]. This representation is a space-time version of the classical RCR exploited by Aizenman et al. [1,3,4]. In this paper we formulate and prove corresponding space-time versions of the classical switching lemma and show how they generate various correlation inequalities. In particular we prove exponential decay of truncated two-point functions at positive magnetic fields in the z-direction and address the issue of the sharpness of the phase transition.

1. The Model and the Results

In what follows, we shall, for brevity, consider translation invariant models on $\mathbb{Z}^d$. Specifically, let $T_N$ be the $d$-dimensional lattice torus of linear size $N$ and let $J = \{J_{ij} = J_{i-j}\}$ be a finite range irreducible translation invariant interaction. Let $h \geq 0$, $\rho > 0$, $\lambda \geq 0$ and $0 \leq \beta \leq \infty$. The quantum Hamiltonian we are going to consider is of the form

$$-H_N = \frac{\rho}{2} \sum_{i,j} J_{ij}\, \hat\sigma^z_i \hat\sigma^z_j + h \sum_i \hat\sigma^z_i + \lambda \sum_i \hat S^x_i. \tag{1.1}$$

Above $\hat S^x_i = (I + \hat\sigma^x_i)/2$, and $\hat\sigma^z$ and $\hat\sigma^x$ are the usual Pauli matrices,

$$\hat\sigma^z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \quad\text{and}\quad \hat\sigma^x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Let us introduce the partition function

$$Z_{\beta,N}(h, \rho, \lambda) = e^{-N^d \beta (\rho \bar J + h + \lambda)}\, \operatorname{Tr} e^{-\beta H_N},$$

This research was supported by a grant from G.I.F., the German Israeli Foundation for Scientific Research and Development and by a grant from BSF, the United States—Israel Binational Science Foundation.


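where $\bar J = \sum_j J_{ij}$ and we remark that this choice of normalization is made so as to seamlessly introduce certain stochastic integral representations below. Mean values of various local observables are denoted as $\langle \cdot \rangle_{\beta,N}$. For instance,

$$\langle \hat\sigma^z_i \rangle_{\beta,N} = \frac{\operatorname{Tr}\left( \hat\sigma^z_i e^{-\beta H_N} \right)}{\operatorname{Tr} e^{-\beta H_N}}, \qquad \langle \hat S^x_i \rangle_{\beta,N} = \frac{\operatorname{Tr}\left( \hat S^x_i e^{-\beta H_N} \right)}{\operatorname{Tr} e^{-\beta H_N}},$$

or, for $i \neq j$,

$$\langle \hat\sigma^z_j \hat S^x_i \rangle_{\beta,N} = \frac{\operatorname{Tr}\left( \hat\sigma^z_j \hat S^x_i e^{-\beta H_N} \right)}{\operatorname{Tr} e^{-\beta H_N}}.$$

Most of the results which we shall derive in the sequel hold uniformly in $\beta < \infty$ and/or in $N$. Whenever this is the case we shall omit the corresponding sub-index. Note that in many cases uniformity in $\beta < \infty$ implies extensions of the corresponding properties to the ground state $\beta = \infty$. Important quantities to be considered here are the z-magnetization:

$$M_{\beta,N}(h, \rho, \lambda) = \langle \hat\sigma^z_i \rangle_{\beta,N},$$

and the truncated two-point functions,

$$\langle \hat\sigma^z_i ; \hat\sigma^z_j \rangle_{\beta,N} = \langle \hat\sigma^z_i \hat\sigma^z_j \rangle_{\beta,N} - \langle \hat\sigma^z_i \rangle_{\beta,N} \langle \hat\sigma^z_j \rangle_{\beta,N}, \qquad \langle \hat\sigma^z_i ; \hat S^x_j \rangle_{\beta,N} \quad\text{and}\quad \langle \hat S^x_i ; \hat S^x_j \rangle_{\beta,N}.$$

Our two main results are:

Theorem A. For every $h > 0$, $\lambda \geq 0$ and $\rho \geq 0$ there exist $c_1 = c_1(h, \lambda, \rho) > 0$ and $c_2 = c_2(h, \lambda, \rho) < \infty$, such that

$$0 \leq \langle \hat\sigma^z_i ; \hat\sigma^z_j \rangle \leq c_2 e^{-c_1 |j - i|}, \qquad 0 \leq \langle \hat S^x_i ; \hat S^x_j \rangle \leq c_2 e^{-c_1 |j - i|},$$
$$\text{and, for } i \neq j, \qquad -c_2 e^{-c_1 |j - i|} \leq \langle \hat\sigma^z_i ; \hat S^x_j \rangle \leq 0. \tag{1.2}$$

By our convention the above results are claimed to be uniform in the torus size $N$ and in $\beta < \infty$.

Theorem B. Uniformly in $h > 0$, $\rho > 0$ and $\lambda > 0$ the following differential inequalities hold:

$$M(h, \rho, \lambda) \leq h \frac{\partial M}{\partial h} + M^3 + M^2 \rho \frac{\partial M}{\partial \rho} - 2\lambda M^2 \frac{\partial M}{\partial \lambda}, \tag{1.3}$$

and,

$$-\frac{\partial M}{\partial \lambda} \leq \frac{M}{1 - M^2} \frac{\partial M}{\partial h} \quad\text{and}\quad \frac{\partial M}{\partial \rho} \leq \bar J M \frac{\partial M}{\partial h}. \tag{1.4}$$

Again, by convention, the above inequalities are claimed to hold uniformly in $N$ and in $\beta < \infty$.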
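As a sanity check on the shape of (1.3) and (1.4), one can look at the simplest possible case: a single spin with no coupling. This toy case is an assumption of the sketch below and is not treated in the paper. The Hamiltonian reduces to $-H = h\hat\sigma^z + \lambda(I + \hat\sigma^x)/2$, and an elementary diagonalisation gives $M = (h/\omega)\tanh(\beta\omega)$ with $\omega = \sqrt{h^2 + \lambda^2/4}$. Since $M$ does not depend on $\rho$ here, the $\rho$-term in (1.3) drops out. The function names are hypothetical, and derivatives are approximated by central differences.

```python
import math

def magnetization(h, lam, beta):
    """<sigma^z> for one spin with -H = h*sigma^z + lam*(I + sigma^x)/2."""
    omega = math.sqrt(h * h + lam * lam / 4.0)
    return (h / omega) * math.tanh(beta * omega)

def central_difference(f, x, eps=1e-6):
    """Numerical derivative of a one-variable function f at x."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

def check_inequalities(h, lam, beta):
    """Return (holds_13, holds_14) for the single-spin toy model."""
    m = magnetization(h, lam, beta)
    dm_dh = central_difference(lambda x: magnetization(x, lam, beta), h)
    dm_dlam = central_difference(lambda x: magnetization(h, x, beta), lam)
    # (1.3) without the rho-term, which vanishes in this toy case.
    rhs_13 = h * dm_dh + m ** 3 - 2.0 * lam * m ** 2 * dm_dlam
    # First inequality of (1.4).
    rhs_14 = m / (1.0 - m * m) * dm_dh
    return m <= rhs_13, -dm_dlam <= rhs_14

result_a = check_inequalities(h=0.5, lam=1.0, beta=1.0)
result_b = check_inequalities(h=1.0, lam=0.5, beta=2.0)
```

For both parameter choices the two inequalities hold with a comfortable margin, so the finite-difference error does not affect the outcome.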


In view of the fundamental techniques developed in [2,3], differential inequalities (1.3) and (1.4) imply a certain sharpness of phase transition as the transverse field $\lambda$ and/or the inverse temperature $\beta$ are varied. In particular, since $\hat\sigma^z$ and $\hat S^x$ do not commute, the uniformity of our estimates in $\beta$ implies that, taking $\beta \to \infty$, these inequalities still hold and can be used to derive a genuine quantum phase transition, even though we derive it using a somewhat classical re-interpretation of the model (see Sect. 5). In principle, since the model in question could be considered as the strong coupling limit of $(d + 1)$-dimensional classical Ising models [5,9], Theorem B could be attempted as a limiting conclusion from the result of [3]. The point of this paper, however, is to try to understand something new; that is, to develop a general and robust stochastic geometric description of quantum systems, hopefully also yielding simpler, or at least alternative, proofs even in the classical case of $\lambda = 0$. In particular, the conclusions of both theorems above will become rather transparent in the stochastic geometric context which we develop here.

The rest of this paper is organized as follows. Section 2 introduces a recasting of the transverse Ising model in a useful probabilistic language. Further, we set down various geometric notions for this recasting which form the basis of our proofs of Theorems A and B. Section 3 applies these notions to the truncated correlation functions appearing in Theorem A. The resulting expressions may be seen as generalizing the results of the classical Switching Lemma employed in [1,3,4]. Section 4 provides a derivation of Theorem B. Section 5 analyzes expressions for truncated correlations to obtain a proof of Theorem A. Finally, at the end of Sect. 5 we briefly address the implications for a quantum phase transition in the ground state $\beta = \infty$.

A Bibliographical Remark. Shortly after the first draft of this work was posted on the web, there appeared a preprint of [8]. The authors of [8] draw motivation from a parity calculus via strong coupling limits for classical RCR, and they develop what they call "random-parity representation" for TFIM. The paper [8] contains very similar formulations and proofs of the corresponding switching lemma and of the differential inequalities. The following bibliographical remark is due: (a) Although it might look ostensibly different, the random-parity representation of [8] can be readily derived (see Remark 1 below) from the RCR which was introduced in [14] and which we use here. [14] is a transcript of lectures given at Prague's Probability school in 2006. (b) A simple example of the application to TFIM of the classical switching lemma via limiting parity calculus appears in the Appendix of [10].

2. Stochastic Geometry of the Model

The stochastic geometric approach to quantum models via the Lie-Trotter product expansion in the imaginary time variable (additional dimension) and a subsequent classical re-interpretation was introduced in [11]. An important milestone along these lines is the seminal paper [6]. The approach expounded upon in that paper has many degrees of freedom in the sense that one can experiment with numerous decompositions of the Hamiltonian and with the basis in which the Lie-Trotter expansion is performed to achieve different representations. We shall skip the derivation of the representation of interest in the present context and proceed directly to its probabilistic description. We refer the interested reader to [14] where the quantum random current representation we are using here was introduced and where various other stochastic geometric descriptions of the transverse field Ising model are discussed at length.

To each site $i \in T_N$ one attaches a copy $S^i_\beta$ of the circle $S_\beta$ of circumference $\beta$. In the ground state case $\beta = \infty$, $S_\infty = \mathbb{R}$. The resulting $(d + 1)$-dimensional state space of the model is $S_N \cup g$, where

$$S_N = \cup_{i \in T_N} S^i_\beta,$$

and $g$ is an artificial "ghost site". The parameters $h$, $J$ and $\lambda$ enter the picture in the following fashion: Consider graphs $G_N = (V_N, E_N)$ with the vertex set $V_N = T_N \cup g$, and edge set $E_N = E^0_N \cup E^g_N$ which comprises either edges $e = (i, j) \in E^0_N$ with $i, j \in T_N$ and $J_{i-j} > 0$, or $e = (i, g) \in E^g_N$ with $i \in T_N$. As above, we omit the sub-index $N$ whenever it has no impact on the corresponding definition or claim. Let us define the following families of independent Poisson point processes on $S_\beta$:

Processes of flips. With each $e \in E_N$ we associate a Poisson process $\xi_e$ which has intensity $\rho J_{i-j}$ if $e = (i, j)$ and intensity $h$ if $e = (i, g)$.

Processes of marks. With each $i \in T_N$ we associate a Poisson process $m_i$ of intensity $\lambda$.

In the sequel we shall denote the corresponding product measure as $P(d\xi, dm)$. In particular, for notational convenience, whenever there is no confusion the dependence on $(\beta, J, h, \rho, \lambda)$ will be suppressed. To write down the random current representation we still need to introduce the notion of labels:

Labels. Labels $\nu$ are piece-wise constant maps $\nu : S_N \to \{r, l\}$. Here $r$ and $l$ are just two symbols, which, if one traces the original derivation of [14], are related to the one particle eigenfunctions in the transverse x-basis. Given a realization $(\xi, m)$ of the Poisson point processes and a finite subset $A \subset S$, let us say that a label $\nu$ is compatible (see Fig. 1), which will be denoted by $\nu \overset{A}{\sim} (\xi, m)$, if

(1) $\nu_i$ has a jump at $u$ for every $u \in A$.
(2) All other jumps of $\nu$ happen at arrival times of $\xi$: For $e = (i, g)$, an arrival of $\xi_e$ enforces a flip of $\nu_i$, and, similarly, an arrival of $\xi_{ij}$ enforces a simultaneous flip of $\nu_i$ and $\nu_j$.
(3) For each $i$, $\nu_i(t) = r$ at each arrival time $t$ of $m_i$.

To facilitate the notation we shall drop $A$ from $\nu \overset{A}{\sim} (\xi, m)$ whenever $A = \emptyset$.

Representation Formulas. The following formulas are established in [14]: For the partition function (and $\beta < \infty$),

$$Z_N = \int P(d\xi, dm) \sum_{\nu \sim (\xi, m)} 1. \tag{2.1}$$

Remark 1. Integrating out the process of marks m and calling r “even” and l “odd”, one recovers the “random parity” representation of [8].
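To make (2.1) concrete, here is a minimal Monte Carlo sketch for a single site, a toy case chosen for illustration and not discussed in the paper. The only flips come from the ghost edge (intensity $h$), the marks have intensity $\lambda$, and the labels live on one circle of circumference $\beta$. Under the normalization of the partition function above with $N = 1$ and no coupling, an elementary computation gives $Z = e^{-\beta(h + \lambda/2)}\, 2\cosh(\beta\omega)$ with $\omega = \sqrt{h^2 + \lambda^2/4}$; this closed form and all function names are assumptions of the sketch.

```python
import bisect
import math
import random

def poisson_arrivals(rate, length, rng):
    """Sorted arrival times of a Poisson process of intensity `rate` on [0, length)."""
    times = []
    t = rng.expovariate(rate)
    while t < length:
        times.append(t)
        t += rng.expovariate(rate)
    return times

def count_compatible_labels(flips, marks):
    """Number of labels on one circle compatible with the given flips and marks."""
    if len(flips) % 2 == 1:
        return 0  # an odd number of forced flips cannot close up around the circle
    # Arcs between successive flips alternate; record the parity of each arc with a mark.
    parities = {bisect.bisect_left(flips, s) % 2 for s in marks}
    # Candidate 1: even arcs carry r. Candidate 2: odd arcs carry r.
    # A candidate is compatible when every mark sits on an r arc.
    return int(1 not in parities) + int(0 not in parities)

def estimate_partition_function(h, lam, beta, samples, seed=0):
    """Monte Carlo average of the number of compatible labels, as in (2.1)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        flips = poisson_arrivals(h, beta, rng)
        marks = poisson_arrivals(lam, beta, rng)
        total += count_compatible_labels(flips, marks)
    return total / samples

def exact_partition_function(h, lam, beta):
    """Closed form for the single-site toy case (assumption of this sketch)."""
    omega = math.sqrt(h * h + lam * lam / 4.0)
    return math.exp(-beta * (h + lam / 2.0)) * 2.0 * math.cosh(beta * omega)

estimate = estimate_partition_function(h=0.7, lam=1.3, beta=1.0, samples=20000)
exact = exact_partition_function(h=0.7, lam=1.3, beta=1.0)
```

With a few tens of thousands of samples the estimate should agree with the closed form to about two decimal places. The two limiting cases $\lambda = 0$ and $h = 0$ can also be checked by hand against $1 + e^{-2\beta h}$ and $1 + e^{-\beta\lambda}$ respectively.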

Fig. 1. Poisson processes of arrivals and compatible labels on $S = \cup_{i=1}^{6} S^i_\beta$: (a) $\nu \sim (\xi, m)$; (b) $\nu \overset{(4,t)}{\sim} (\xi, m)$

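Given $u = (i, t)$ define $\hat\sigma^z_u = e^{-tH} \hat\sigma^z_i e^{tH}$ and, accordingly, $\hat S^x_u = e^{-tH} \hat S^x_i e^{tH}$. Note here that the signs match the imaginary time rotation of the quantum evolution. For one- and two-point functions in the z component of spin:

$$\langle \hat\sigma^z_u \rangle = \frac{1}{Z} \int P(d\xi, dm) \sum_{\nu \overset{u}{\sim} (\xi, m)} 1. \tag{2.2}$$

For the two-point function,

$$\langle \hat\sigma^z_u \hat\sigma^z_v \rangle = \frac{1}{Z} \int P(d\xi, dm) \sum_{\nu \overset{u,v}{\sim} (\xi, m)} 1. \tag{2.3}$$

In fact, it is straightforward to check that similar formulas hold for x-observables and mixed two-point functions (see [14] for details): Namely,

$$\langle \hat S^x_u \rangle = \frac{1}{Z} \int P(d\xi, dm) \sum_{\nu \sim (\xi, m)} \mathbb{1}_{\{\nu(u) = r\}}, \qquad \langle \hat S^x_u \hat S^x_v \rangle = \frac{1}{Z} \int P(d\xi, dm) \sum_{\nu \sim (\xi, m)} \mathbb{1}_{\{\nu(u) = r\}} \mathbb{1}_{\{\nu(v) = r\}}, \tag{2.4}$$

and, for $u \neq v$,

$$\langle \hat\sigma^z_u \hat S^x_v \rangle = \frac{1}{Z} \int P(d\xi, dm) \sum_{\nu \overset{u}{\sim} (\xi, m)} \mathbb{1}_{\{\nu(v) = r\}}. \tag{2.5}$$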


Note that once these formulas are available with $u \neq v$, they may be extended by continuity to the appropriate limiting correlation functions. We do not state them here as they will not appear in our derivations below.

Intervals, paths and replicas. Let $(\xi, m)$ be a realization of the Poisson processes introduced in the previous section, $A$ a finite subset of $S$ and let $\nu$ be a compatible label $\nu \overset{A}{\sim} (\xi, m)$. An interval of $\nu$ is a maximal connected component $I = (u, v)$ of some $S^i_\beta$ on which $\nu_i$ is constant. A path $P$ of $(\nu, \xi, m)$ is an ordered sequence $I_1, I_2, \ldots, I_n$, where $I_l$ is either an interval or a ghost site $g$ and,

(1) If $I_l = (u_l, v_l)$ and $I_{l+1} = (u_{l+1}, v_{l+1})$ then either $v_l = u_{l+1}$ or $v_l = (i, t)$, $u_{l+1} = (j, t)$ and $t$ is an arrival time of $\xi_{ij}$.
(2) If $I_l = (u_l, v_l)$, $v_l = (j, t)$ and $I_{l+1} = g$, then $t$ is an arrival time of $\xi_{j,g}$.
(3) If $I_l = g$, $I_{l+1} = (u_{l+1}, v_{l+1})$ and $u_{l+1} = (j, t)$, then $t$ is an arrival time of $\xi_{j,g}$.
(4) There cannot be two successive ghost sites $g$ in a path.

A path $P = \{I_1, \ldots, I_n\}$ is said to be ground if it does not contain $g$, except possibly at the last step $I_n$. Finally, a path $P$ is said to be left if all the ground intervals of $P$ bear $\nu$-label $l$. Let us define the set $\{u \longleftrightarrow v\}$ to be the collection of triples $(\xi, m, \nu)$ so that there exists a left path with endpoints at $u$ and $v$, and the set $\{u \overset{t}{\longleftrightarrow} v\}$ to be the collection of triples $(\xi, m, \nu)$ so that there exists a ground left path from $u$ to $v$. Note that ground left paths are self-avoiding and that there is a unique ground left path from $u$ to $g$ whenever $\nu \overset{u}{\sim} (\xi, m)$. We shall denote this path by $C_l(u, g)$ and we shall use $\check{C}_l(u, g)$ for the union of its ground intervals, that is for $C_l(u, g) \setminus g$.

Consider now two finite (and not necessarily disjoint) subsets $A, B \subset S$ and two copies $(\xi^1, m^1, \nu^1)$ and $(\xi^2, m^2, \nu^2)$ such that $\nu^1 \overset{A}{\sim} (\xi^1, m^1)$ and $\nu^2 \overset{B}{\sim} (\xi^2, m^2)$. We shall denote the combined processes of flips and marks as $(\eta, n) = (\xi^1 \cup \xi^2, m^1 \cup m^2)$, where the union is understood in the coordinate wise sense, e.g. $\eta_{ij} = \xi^1_{ij} \cup \xi^2_{ij}$. In all considerations below the processes $(\xi^1, m^1)$ and $(\xi^2, m^2)$ are independent. Consequently, $(\eta, n)$ is just a collection of independent Poisson processes of arrivals with double intensities. Furthermore, given a realization $(\eta, n)$, the conditional distribution of $(\xi^1, m^1) \subseteq (\eta, n)$ is uniform with point mass

$$\left( \frac{1}{2} \right)^{\#(\eta) + \#(n)} = \left( \frac{1}{2} \right)^{\sum_e \eta_e[S_\beta] + \sum_i n_i[S_\beta]}. \tag{2.6}$$

Note that given $\eta$ and the locations of the discontinuities of $(\nu^1, \nu^2)$, the arrivals of $(\xi^1, \xi^2)$ may be recovered. However it is not usually possible to reconstruct $(m^1, m^2)$ from $n$ even knowing the values of $(\nu^1, \nu^2)$.

Let us introduce geometric notions for pairs of configurations, extending our previous definitions. It will be convenient to make definitions relative to a fixed finite subset $G \subset S$. An interval $I$ of $(\nu^1, \nu^2)$ is a maximal connected component $I = (u, v)$ of some $S^i_\beta$, on which both labels $\nu^1$ and $\nu^2$ are constant and which does not contain points from $G$. A path $P$ of $(\nu^1, \nu^2, \eta, n)$ is an ordered sequence $I_1, I_2, \ldots, I_n$, where $I_l$ is either an interval of $(\nu^1, \nu^2)$ or a ghost site $g$ and,

(1) If $I_l = (u_l, v_l)$ and $I_{l+1} = (u_{l+1}, v_{l+1})$, then either $v_l = u_{l+1} = (i, t)$, and then either $(i, t) \in G$ or $t$ is an arrival time of $\eta_{i,g}$; or, otherwise, $v_l = (i, t)$, $u_{l+1} = (j, t)$ and $t$ is an arrival time of $\eta_{ij}$.
(2) If $I_l = (u_l, v_l)$, $v_l = (j, t)$ and $I_{l+1} = g$, then $t$ is an arrival time of $\eta_{j,g}$.
(3) If $I_l = g$, $I_{l+1} = (u_{l+1}, v_{l+1})$ and $u_{l+1} = (j, t)$, then $t$ is an arrival time of $\eta_{j,g}$.
(4) There cannot be two successive ghost sites $g$ in a path.
(5) All ground intervals $I_l \subset S$ are disjoint.

As before, a path $P = \{I_1, \ldots, I_n\}$ is said to be ground if it does not contain $g$, with a possible exception of the last step $I_n$. A path $P = \{I_1, I_2, \ldots, I_n\}$ is said to be a loop if either $I_1 = I_n = g$ or $v_n = u_1$. It is useful to keep in mind that the above notions do not depend on the values of compatible labels $(\nu^1, \nu^2)$ or arrivals of marks $n$. Rather, they only depend on the arrivals of flips $\eta$.

On the other hand, we also consider an important notion which very much depends on the pair of configurations: Let us say that the interval $I$ is blocked if (see Fig. 2) both $\nu^1$ and $\nu^2$ are equal to $r$ on $I$ and, in addition, $n(I) > 0$. A path $P = \{I_1, \ldots, I_n\}$ is said to be unblocked if it does not contain blocked intervals. We shall say that $u$ and $v$ are $*$-connected, $u \overset{*}{\longleftrightarrow} v$, if, for $G = \{u, v\}$, there exists an unblocked path with end-points at $u$ and $v$, and we shall write $u \overset{*t}{\longleftrightarrow} v$ whenever there exists a ground unblocked path from $u$ to $v$.

Fig. 2. Special set $G = \{(1, t), (4, s)\}$: Blocked intervals for two replicas $(\xi^1, m^1)$, $(\xi^2, m^2)$ and two compatible labels $\nu^1 \overset{(1,t)}{\sim} (\xi^1, m^1)$, $\nu^2 \overset{(4,s)}{\sim} (\xi^2, m^2)$

Basic Transformation. Let $P = (I_1, \ldots, I_n)$ be an unblocked path of $(\nu^1, \nu^2, \eta, n)$ from $u$ to $v$. Obviously the labels $\nu^1$ and $\nu^2$ unambiguously define the splitting $\eta = \xi^1 \cup \xi^2$. Moreover, since $P$ is unblocked, $\nu^1$ and $\nu^2$ unambiguously define the splitting of marks $n = m^1 \cup m^2$ along $P$. Make the following transformation of labels and marks on each of the ground intervals $I$ of $P$:


(1) If the $(\nu^1, \nu^2)$ label of $I$ is $(l, r)$, then flip it to $(r, l)$ and transfer all marks accordingly: set $\tilde m^1(I) = m^2(I)$ and set $\tilde m^2(I) = 0$. Perform the analogous procedure if the label is $(r, l)$.
(2) If the label is $(l, l)$ then flip it to $(r, r)$. Accordingly, if the label is $(r, r)$, then flip it to $(l, l)$. Note that in the latter case, since we are moving along an unblocked path, $n(I)$ has to be equal to zero, and no incompatibility arises.
(3) Adjust $\xi^1$ and $\xi^2$ accordingly; those are, of course, completely defined by the labels (flips of the labels, to be precise).

The above transformation, let us call it $\Phi_P$, defines a map

$$\left( (\nu^1, \xi^1, m^1), (\nu^2, \xi^2, m^2) \right) \mapsto \left( (\tilde\nu^1, \tilde\xi^1, \tilde m^1), (\tilde\nu^2, \tilde\xi^2, \tilde m^2) \right).$$

The map $\Phi_P$ enjoys the following set of properties:

(1) It is invertible: Indeed just apply $\Phi_P$ once more to recover the original data.
(2) It does not change $\nu^1$ and $\nu^2$ labels and $m^1$, $m^2$-marks on intervals which do not belong to $P$. In addition, the original and modified configurations have the same set of intervals (defined by $\eta$, $u$ and $v$), and $\Phi_P$ does not change the blocked/unblocked status of any of those.
(3) If $\nu^1 \overset{A}{\sim} (\xi^1, m^1)$ and $\nu^2 \overset{B}{\sim} (\xi^2, m^2)$, then

$$\tilde\nu^1 \overset{A \triangle \{u,v\}}{\sim} (\tilde\xi^1, \tilde m^1) \quad\text{and}\quad \tilde\nu^2 \overset{B \triangle \{u,v\}}{\sim} (\tilde\xi^2, \tilde m^2). \tag{2.7}$$

(4) It is measure preserving: In view of (2.6), $(\xi^1, m^1)$ and $(\tilde\xi^1, \tilde m^1)$ have the same conditional weights.
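The action of the basic transformation on a single ground interval can be sketched in a few lines. This is an illustrative model only: the `Interval` record, which stores the two labels and the number of marks each replica owns on the interval, and the function names are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    label1: str   # value of nu^1 on the interval, "r" or "l"
    label2: str   # value of nu^2 on the interval, "r" or "l"
    marks1: int   # number of marks of m^1 on the interval
    marks2: int   # number of marks of m^2 on the interval

def is_compatible(iv):
    """Marks of a replica may only sit where that replica's label is r."""
    return (iv.label1 == "r" or iv.marks1 == 0) and (iv.label2 == "r" or iv.marks2 == 0)

def is_blocked(iv):
    """Blocked: both labels equal r and the interval carries at least one mark."""
    return iv.label1 == "r" and iv.label2 == "r" and iv.marks1 + iv.marks2 > 0

def transform(iv):
    """Flip the labels on one ground interval of an unblocked path."""
    if not is_compatible(iv):
        raise ValueError("labels are not compatible with the marks")
    if is_blocked(iv):
        raise ValueError("the transformation is only defined on unblocked intervals")
    pair = (iv.label1, iv.label2)
    if pair == ("l", "r"):
        return Interval("r", "l", iv.marks2, 0)   # marks move from replica 2 to replica 1
    if pair == ("r", "l"):
        return Interval("l", "r", 0, iv.marks1)   # marks move from replica 1 to replica 2
    if pair == ("l", "l"):
        return Interval("r", "r", 0, 0)           # no marks can be present
    return Interval("l", "l", 0, 0)               # (r, r) and unblocked, hence no marks

examples = [
    Interval("l", "r", 0, 3),
    Interval("r", "l", 2, 0),
    Interval("l", "l", 0, 0),
    Interval("r", "r", 0, 0),
]
round_trips = [transform(transform(iv)) == iv for iv in examples]
```

Applying `transform` twice returns the original interval, and the total number of marks on the interval is unchanged, which mirrors properties (1) and (4) above at the level of a single interval.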

Minimal paths. Most of the transformations we are going to perform will be along minimal unblocked paths, often satisfying additional geometric constraints. Let us, therefore, define what we mean by minimal. First of all, given an unblocked path $P = (I_1, \ldots, I_n)$ define its length as $|P| = \sum_{l=1}^{n} |I_l|$, where $|I|$ is the Euclidean length if $I$ is a ground interval, and, by definition, $|g| = 0$. Consider now two replicas $(\xi^1, m^1)$, $(\xi^2, m^2)$ and a pair of compatible labels $\nu^1, \nu^2$. Let $u, v \in S \cup g$ and assume that there are unblocked paths from $u$ to $v$. Then the minimal path $C^*(u, v)$ satisfies,

$$|C^*(u, v)| \leq |P| \quad\text{for any unblocked path } P \text{ from } u \text{ to } v. \tag{2.8}$$

It is easy to see that in general (2.8) alone does not define $C^*(u, v)$ uniquely, and one needs to impose an additional rule in order to choose the minimal path from a set of paths with the same minimal length. For example the following rule will do: Write a coarse grained description of $P(u, v) = (R_1, \ldots, R_m)$, where $R_l$ is either a ghost site $g$ or a maximal collection of successive ground intervals of $P$ on some $S^i_\beta$. Then for two unblocked paths $P = (R_1, \ldots, R_m)$ and $P' = (R'_1, \ldots, R'_k)$ we shall say that $P \prec P'$ if either $|P| < |P'|$, or if the lengths are equal and there exists $l$ such that

$$|R_i| = |R'_i| \ \text{ for } i = 1, \ldots, l - 1, \ \text{ but } |R_l| > |R'_l|. \tag{2.9}$$

Then $C^*(u, v)$ is unambiguously defined as the unique unblocked path from $u$ to $v$ which is $\prec$-less than any other unblocked path from $u$ to $v$. In other words, the minimal path, as we define it, is the most conservative of all the paths of the same minimal length: it tries to stay as much as possible on each subsequent spatial circle $S_\beta$. The important feature of the path transformation which was introduced above is (see Fig. 3): If $C^*(u, v)$ is the minimal path, then it remains so after $\Phi_{C^*(u,v)}$ is performed. As a result, transformations along minimal paths are well defined and invertible.
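The tie-breaking rule (2.8) and (2.9) amounts to a simple comparison of coarse-grained paths. In the sketch below a path is represented, as an assumption made for illustration, only by the list of lengths $|R_1|, |R_2|, \ldots$ of its coarse-grained pieces, with a ghost site contributing length 0. Exact rationals are used so that the equality tests in the rule are meaningful; the function name `precedes` is hypothetical.

```python
from fractions import Fraction

def precedes(p, q):
    """Return True if path p is strictly smaller than path q in the order of (2.9).

    p and q are sequences of coarse-grained piece lengths (ghost sites have length 0).
    """
    total_p, total_q = sum(p), sum(q)
    if total_p != total_q:
        return total_p < total_q          # the shorter path always wins
    for a, b in zip(p, q):
        if a != b:
            return a > b                  # at the first difference, the longer piece wins
    return False                          # no difference found: neither path precedes

short = [Fraction(1, 2), 0, Fraction(1, 2)]
long_path = [Fraction(3, 4), 0, Fraction(1, 2)]
stay_long = [Fraction(3, 4), Fraction(1, 4)]
jump_early = [Fraction(1, 2), Fraction(1, 2)]

shorter_wins = precedes(short, long_path)
conservative_wins = precedes(stay_long, jump_early)
```

Here `shorter_wins` is true because of the total length, while `conservative_wins` is true because both paths have total length 1 and the first one stays longer on its initial circle.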

Fig. 3. Two replicas $(\xi^1, m^1)$, $(\xi^2, m^2)$ and two compatible labels $\nu^1 \overset{u}{\sim} (\xi^1, m^1)$, $\nu^2 \overset{v}{\sim} (\xi^2, m^2)$, where $u = (1, t)$ and $v = (4, s)$. (a) Minimal unblocked path $C^*(u, g)$ from $u$ to the ghost site $g$. (b) Basic transformation: New labels $\tilde\nu^1 \sim (\tilde\xi^1, \tilde m^1)$ and $\tilde\nu^2 \overset{u,v}{\sim} (\tilde\xi^2, \tilde m^2)$. Labels which are switched along the minimal path are shaded. Note that the flips and the marks are switched accordingly

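3. Switching Lemmas and Related Correlation Inequalities

Recall that $(\xi^1, m^1)$ and $(\xi^2, m^2)$ are independent copies of our Poisson processes of flips and marks, and that we use $\eta = \xi^1 \cup \xi^2$, $n = m^1 \cup m^2$ for the combined processes. Let $E$ denote the expectation with respect to two independent replicas of Poisson processes of flips and marks, $(\xi^1, m^1)$ and $(\xi^2, m^2)$. In this section, we give exact formulae for the truncated correlations appearing in (1.2) and discuss the term $\partial M / \partial \rho$ which appears in Theorem B.

Representation of $\langle \hat\sigma^z_u ; \hat\sigma^z_v \rangle$. In view of (2.6) we can record (2.2) in terms of two replicas as,

$$\langle \hat\sigma^z_u \rangle \langle \hat\sigma^z_v \rangle = \frac{1}{Z^2} \int P(d\eta, dn) \left( \frac{1}{2} \right)^{\#(\eta) + \#(n)} \sum_{\substack{\xi^1 \cup \xi^2 = \eta \\ m^1 \cup m^2 = n}} \ \sum_{\substack{\nu^1 \overset{u}{\sim} (\xi^1, m^1) \\ \nu^2 \overset{v}{\sim} (\xi^2, m^2)}} 1. \tag{3.1}$$

Similarly, we can record (2.1) and (2.3) as,

$$\langle \hat\sigma^z_u \hat\sigma^z_v \rangle = \frac{Z \langle \hat\sigma^z_u \hat\sigma^z_v \rangle}{Z} = \frac{1}{Z^2} \int P(d\eta, dn) \left( \frac{1}{2} \right)^{\#(\eta) + \#(n)} \sum_{\substack{\xi^1 \cup \xi^2 = \eta \\ m^1 \cup m^2 = n}} \ \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} 1. \tag{3.2}$$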

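Let us have a closer look at (3.1). The constraint $\nu^1 \overset{u}{\sim} (\xi^1, m^1)$ implies that there is a path $P$ from $u$ to $g$ such that $\nu^1 \equiv l$ on $P$. In particular this path $P$ must be unblocked. An analogous statement also applies with respect to $v$ in the second replica. Therefore, one can rewrite (3.1) as

$$\langle \hat\sigma^z_u \rangle \langle \hat\sigma^z_v \rangle = \frac{1}{Z^2} \int P(d\eta, dn) \left( \frac{1}{2} \right)^{\#(\eta) + \#(n)} \times \sum_{\substack{\xi^1 \cup \xi^2 = \eta \\ m^1 \cup m^2 = n}} \ \sum_{\substack{\nu^1 \overset{u}{\sim} (\xi^1, m^1) \\ \nu^2 \overset{v}{\sim} (\xi^2, m^2)}} \mathbb{1}_{\{u \overset{*}{\longleftrightarrow} g\}}\, \mathbb{1}_{\{v \overset{*}{\longleftrightarrow} g\}}. \tag{3.3}$$

Similarly, one can rewrite (3.2) as,

$$\langle \hat\sigma^z_u \hat\sigma^z_v \rangle = \frac{1}{Z^2} \int P(d\eta, dn) \left( \frac{1}{2} \right)^{\#(\eta) + \#(n)} \times \sum_{\substack{\xi^1 \cup \xi^2 = \eta \\ m^1 \cup m^2 = n}} \ \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} \mathbb{1}_{\{u \overset{*}{\longleftrightarrow} v\}}. \tag{3.4}$$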

Let us fix a realization of $(\eta, n)$. Define $A^g_{u,v} = A^g_{u,v}(\eta, n)$ to be the set of pairs of objects $(\nu^1, \xi^1, m^1), (\nu^2, \xi^2, m^2)$ which contribute to the double sum on the right-hand side of (3.3). Similarly, let $A_{u,v}$ be the set of pairs of objects (currents and labels) which contribute to the double sum on the right-hand side of (3.4). Each of the objects in $A^g_{u,v}$ contains an unblocked path, and hence the minimal unblocked path $C^*(u, g)$ from u to g. We claim that the Basic Transformation along $C^*(u,g)$ defines a map $A^g_{u,v} \to A_{u,v}$ which is a measure preserving injection. This follows immediately from the properties of basic transformations along minimal paths. However, this map is not onto: any couple of objects $(\nu^1, \xi^1, m^1), (\nu^2, \xi^2, m^2)$ in the image of $A^g_{u,v}$ necessarily contains an unblocked path from u to g. We have proved:

Theorem 3.1. Truncated z-correlation functions satisfy the following version of the Switching Lemma:
\[
\langle\hat\sigma^z_u ; \hat\sigma^z_v\rangle
= \frac{1}{Z^2}\int P(d\eta, dn)\,\Big(\frac12\Big)^{\#(\eta)+\#(n)}
\sum_{\substack{\xi^1\cup\xi^2=\eta\\ m^1\cup m^2=n}}\ 
\sum_{\substack{\nu^1\sim(\xi^1,m^1)\\ \nu^2\overset{u,v}{\sim}(\xi^2,m^2)}}
\mathbb{1}_{\{u\overset{*}{\nleftrightarrow} g\}}
= \frac{1}{Z^2}\,E\sum_{\substack{\nu^1\sim(\xi^1,m^1)\\ \nu^2\overset{u,v}{\sim}(\xi^2,m^2)}}
\mathbb{1}_{\{u\overset{*}{\nleftrightarrow} g\}}. \tag{3.5}
\]

ˆ vx . Consider two independent replicas (ξ 1 , m1 ), (ξ 2 , m2 ) and ˆ ux ; Representation of two labels ν 1 ∼ (ξ 1 , m1 ) and ν 2 ∼ (ξ 2 , m2 ). Let us say that a couple of labels (ν 1 , ν 2 ) ∈ 1 2 1 2 [(r, 1r )u2, (r, l)v ] if ν (u) = r = ν (u), whereas ν (v) =r and ν (v) = l. The events, (ν , ν ) ∈ [(r, l)u , (r, l)v ] , (ν 1 , ν 2 ) ∈ [(l, l)u , (r, l)v ] etc. (all together 16 events) are defined in a completely similar fashion. In terms of two replicas, the representation


formulas (2.4) read: 1 ˆ vx = ˆ ux E Z2


1I{(ν 1 ,ν 2 )∈[(r,r )u ,(r,r )v ]} + 1I{(ν 1 ,ν 2 )∈[(r,r )u ,(r,l)v ]}

ν 1 ∼(ξ 1 ,m 1 ) ν 2 ∼(ξ 2 ,m2 )

+ 1I{(ν 1 ,ν 2 )∈[(r,l)u ,(r,r )v ]} + 1I{(ν 1 ,ν 2 )∈[(r,l)u ,(r,l)v ]} . Similarly, 1 ˆ ux ˆ vx = E Z2 1 1

(3.6)

1I{(ν 1 ,ν 2 )∈[(r,r )u ,(r,r )v ]} + 1I{(ν 1 ,ν 2 )∈[(r,l)u ,(r,r )v ]}

ν ∼(ξ ,m 1 )

ν 2 ∼(ξ 2 ,m2 )

+ 1I{(ν 1 ,ν 2 )∈[(r,r )u ,(l,r )v ]} + 1I{(ν 1 ,ν 2 )∈[(r,l)u ,(l,r )v ]} .

(3.7)

Evidently, E

1I{(ν 1 ,ν 2 )∈[(r,r )u ,(r,l)v ]} = E

ν 1 ∼(ξ 1 ,m 1 ) ν 2 ∼(ξ 2 ,m2 )

1I{(ν 1 ,ν 2 )∈[(r,r )u ,(l,r )v ]} .

ν 1 ∼(ξ 1 ,m 1 ) ν 2 ∼(ξ 2 ,m2 )

Consequently, we arrive to the following representation for the truncated two point function: 1 ˆ ux ; ˆ vx = 1 I E I 1 ,ν 2 )∈[(r,l) ,(r,l) ]} − 1 1 ,ν 2 )∈[(r,l) ,(l,r ) ]} . (3.8) (ν (ν { { u v u v Z2 1 1 1 ν ∼(ξ ,m )

ν 2 ∼(ξ 2 ,m2 )

At this stage we proceed much along the lines of our proof of Theorem 3.1. Fix a realization of $(\eta, n)$ and let $B_+(\eta, n)$ be the set of pairs of objects $(\nu^1, \xi^1, m^1), (\nu^2, \xi^2, m^2)$ which contribute to the sum
\[
\sum_{\substack{\xi^1\cup\xi^2=\eta\\ m^1\cup m^2=n}}\ 
\sum_{\substack{\nu^1\sim(\xi^1,m^1)\\ \nu^2\sim(\xi^2,m^2)}}
\mathbb{1}_{\{(\nu^1,\nu^2)\in[(r,l)_u,(r,l)_v]\}}.
\]
Similarly, let $B_-(\eta, n)$ be the set of pairs of objects $(\nu^1, \xi^1, m^1), (\nu^2, \xi^2, m^2)$ which contribute to the sum
\[
\sum_{\substack{\xi^1\cup\xi^2=\eta\\ m^1\cup m^2=n}}\ 
\sum_{\substack{\nu^1\sim(\xi^1,m^1)\\ \nu^2\sim(\xi^2,m^2)}}
\mathbb{1}_{\{(\nu^1,\nu^2)\in[(r,l)_u,(l,r)_v]\}}.
\]
An injective map $B_-(\eta, n) \to B_+(\eta, n)$ is constructed as follows: Any pair $(\nu^1, \xi^1, m^1), (\nu^2, \xi^2, m^2) \in B_-(\eta, n)$ contains an unblocked loop L from v to v such that $u \in L$. Indeed, such a loop may be constructed with $\nu^1 \equiv l$. Now just choose the minimal such loop (in the sense discussed above) and perform on this minimal loop the very same surgery as in the Basic Transformation. Again, the property that the loop is minimal is not changed under the surgery and hence the map is invertible. On the other hand, the image of $B_-(\eta, n)$ under this map is contained in $B_+(\eta, n)$.


Geometrically, it is evident that B+\B− is characterized by the following condi tion: A pair (ν 1 , ξ 1 , m1 ), (ν 2 , ξ 2 , m2 ) from B+ belongs to B+ \B− if and only if any unblocked loop containing v also contains u. In this case, let us say that u is loop-pivotal for v. We conclude: Theorem 3.2. Truncated x-correlation functions satisfy the following version of the Switching Lemma: 1 ˆ ux ; ˆ vx = E 1I{(ν 1 ,ν 2 )∈[(r,l)u ,(r,l)v ]} 1I{u is loop pivotal for v }. (3.9) Z2 1 1 1 ν ∼(ξ ,m )

ν 2 ∼(ξ 2 ,m2 )

Representation of cross-correlations. As before, let E denote the expectation with respect to two independent replicas of Poisson processes of flips and marks; ξ 1 , m1 and ξ 2 , m2 . With this notation we have (from (2.2), the first of (2.4) and (2.5)), 1 ˆ vx = σˆ uz E 1I{ν 1 (v)=r } , (3.10) 2 Z u ν 1 ∼(ξ 1 ,m 1 )

ν 2 ∼(ξ 2 ,m2 )

and, accordingly,

x 1 ˆv = E σˆ uz Z2

1I{ν 2 (v)=r } .

(3.11)

u ν 1 ∼(ξ 1 ,m 1 ) ν 2 ∼(ξ 2 ,m2 )

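Fix a realization of $(\eta, n)$ and let $D_+(\eta, n)$ be the set of pairs of objects $(\nu^1, \xi^1, m^1), (\nu^2, \xi^2, m^2)$ which contribute to the sum
\[
\sum_{\substack{\xi^1\cup\xi^2=\eta\\ m^1\cup m^2=n}}\ 
\sum_{\substack{\nu^1\overset{u}{\sim}(\xi^1,m^1)\\ \nu^2\sim(\xi^2,m^2)}}
\mathbb{1}_{\{\nu^2(v)=r\}}.
\]
Similarly, let $D_-(\eta, n)$ be the set of pairs of objects $(\nu^1, \xi^1, m^1), (\nu^2, \xi^2, m^2)$ which contribute to the sum
\[
\sum_{\substack{\xi^1\cup\xi^2=\eta\\ m^1\cup m^2=n}}\ 
\sum_{\substack{\nu^1\overset{u}{\sim}(\xi^1,m^1)\\ \nu^2\sim(\xi^2,m^2)}}
\mathbb{1}_{\{\nu^1(v)=r\}}.
\]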

Note now that any pair of objects $(\nu^1, \xi^1, m^1), (\nu^2, \xi^2, m^2) \in D_-$ contains an unblocked path (e.g. with $\nu^1 \equiv l$) and hence the minimal unblocked path $C^{*,v}(u, g)$ from u to g which avoids v. An injective map $D_-(\eta, n) \to D_+(\eta, n)$ is then constructed as follows: (1) Perform the Basic Transformation along the minimal path $C^{*,v}(u, g)$. (2) Using the symmetry of replicas, rename the resulting objects, $(\nu^1, \xi^1, m^1) \leftrightarrow (\nu^2, \xi^2, m^2)$.

It is evident that $D_+\setminus D_-$ is characterized by the following condition: A pair of objects $(\nu^1, \xi^1, m^1), (\nu^2, \xi^2, m^2)$ from $D_+$ belongs to $D_+\setminus D_-$ if and only if any unblocked path from u to the ghost site g contains v. Let us say that v is pivotal for $\{u \overset{*}{\longleftrightarrow} g\}$ if the latter condition holds. We have proved:

Theorem 3.3. Truncated cross-correlation functions satisfy the following version of the Switching Lemma:

1 ˆ vx = − σˆ uz ; E Z2

1I{ν 2 (v)=r } 1I

u ν 1 ∼(ξ 1 ,m 1 ) 2 ν ∼(ξ 2 ,m2 )

∗

v is pivotal for u ←→ g

.

(3.12)

Note that the following (straightforward) generalization of Theorem 3.3 holds. Let G = {v1 , . . . , vl , vl+1 , . . . , vl+k } be a finite subset of S which is time-ordered in the following sense: The coordinates vq = (i q , tq ) satisfy tq < t p whenever q < p. Let u= (i, t) be ˆx such that tl < t < tl+1 . Then, the truncated cross-correlation σˆ uz ; l+k 1 vq is defined as l+k l l+k l+k z z x x z x x ˆv = ˆ v σˆ u ˆ v − σˆ u ˆv . σˆ u ; q q q q 1

1

1

l+1

We have, σˆ uz ;

l+k

ˆ vx = − q

1

1 E Z2

l+k

u ν 1 ∼(ξ 1 ,m 1 ) ν 2 ∼(ξ 2 ,m2 )

1I{ν 2 (vq )=r } 1I

1

∗

G is pivotal for u ←→ g

.

(3.13)

Further correlation inequalities. In the classical case (see e.g. [1,3,4,16]) random current representations of correlations generate a variety of correlation inequalities. In fact, the morphology in the quantum case is even richer and this issue will be systematically addressed elsewhere. Here we shall focus only on such inequalities which are needed for proving our main results. Partial derivatives with respect to the parameters (h, λ, ρ) of the magnetization M = σˆ 0z are related to truncated correlations in the following way: Fix the origin 0 of T N and let 0 ∈ S be the point with the space time coordinates 0 = (0, 0). In view of (space and time) translation invariance it is of course inessential how we fix 0. Then, β ∂M z = σˆ 0z ; σˆ (i,t) dt, ∂h 0 i∈T N

and

∂M = ∂λ

i∈T N

β 0

∂M = ∂ρ

ˆ x dt. σˆ 0z ; (i,t)

(i, j):Ji− j >0

Ji j 2

β 0

z σˆ 0z ; σˆ (i,t) σˆ (zj,t) dt, (3.14)

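Random current representations for the z and cross-correlations were already given above. Let us therefore turn to the $\langle\hat\sigma^z_0 ; \hat\sigma^z_{(i,t)}\hat\sigma^z_{(j,t)}\rangle$ terms. In order to facilitate the notation, set $w = (i, t)$ and $z = (j, t)$. The random current representation of
\[
\langle\hat\sigma^z_0\hat\sigma^z_w\hat\sigma^z_z\rangle
= \frac{Z\langle\hat\sigma^z_0\hat\sigma^z_w\hat\sigma^z_z\rangle\, Z}{Z^2}
= \frac{1}{Z^2}\,E\sum_{\substack{\nu^1\overset{\{0,w,z\}}{\sim}(\xi^1,m^1)\\ \nu^2\sim(\xi^2,m^2)}} 1
\]
is straightforward. Consider now
\[
\langle\hat\sigma^z_w\hat\sigma^z_z\rangle\langle\hat\sigma^z_0\rangle
= \frac{1}{Z^2}\,E\sum_{\substack{\nu^1\overset{\{w,z\}}{\sim}(\xi^1,m^1)\\ \nu^2\overset{0}{\sim}(\xi^2,m^2)}} 1.
\]
Each pair of triples $(\nu^1, \xi^1, m^1), (\nu^2, \xi^2, m^2)$ which contributes to the latter integral contains an unblocked path from 0 to g. Performing our Basic Transformation along the minimal such path, we infer
\[
\langle\hat\sigma^z_w\hat\sigma^z_z\rangle\langle\hat\sigma^z_0\rangle
= \frac{1}{Z^2}\,E\sum_{\substack{\nu^1\overset{\{0,w,z\}}{\sim}(\xi^1,m^1)\\ \nu^2\sim(\xi^2,m^2)}}
\mathbb{1}_{\{0\overset{*}{\longleftrightarrow} g\}}.
\]
Consequently,
\[
\langle\hat\sigma^z_0 ; \hat\sigma^z_w\hat\sigma^z_z\rangle
= \frac{1}{Z^2}\,E\sum_{\substack{\nu^1\overset{\{0,w,z\}}{\sim}(\xi^1,m^1)\\ \nu^2\sim(\xi^2,m^2)}}
\mathbb{1}_{\{0\overset{*}{\nleftrightarrow} g\}}. \tag{3.15}
\]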

In particular, $\partial M/\partial\rho \ge 0$. One can readily generalize the latter conclusion to a system with inhomogeneous flip rates in the following fashion: Let $\rho_e : S_\beta \to \mathbb{R}_+$; $e \in E^0$ be a collection of non-negative (and, say, piece-wise smooth) functions. Let us view the $\rho_e$'s as time-inhomogeneous rates of arrivals of (ground) flips corresponding to the endpoints of e. In this way, we may introduce an analog of (2.2), defining z-expectation values $M_u(\rho(\cdot)) = M_u(h, \rho(\cdot), \lambda)$; $u \in S$, via the right-hand side of (2.2) but using the inhomogeneous arrival rates $(\rho_e(t))_{e\in E^0}$. Then, for every $u \in S$, the functional $M_u(\cdot)$ is non-decreasing in $\rho$, that is
\[
\forall e\ \ \rho_e \le \rho'_e\ \ t\text{-a.e.}
\quad\Longrightarrow\quad
\forall u\ \ M_u(\rho) \le M_u(\rho'). \tag{3.16}
\]
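The two consequences just stated, non-negativity of the truncated z-correlation (Theorem 3.1) and monotonicity of the magnetization in $\rho$, can be checked numerically on a toy system. The following is only an illustrative sketch under stated assumptions: it takes two sites with the Hamiltonian $H = -\rho J\hat\sigma^z_1\hat\sigma^z_2 - h(\hat\sigma^z_1+\hat\sigma^z_2) - \lambda(\hat\sigma^x_1+\hat\sigma^x_2)$ (the normalization of the couplings is an assumption and may differ from the convention of the paper), and computes equal-time Gibbs expectations by a plain scaling-and-squaring matrix exponential. All function names are hypothetical.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(a, terms=30, squarings=10):
    """Matrix exponential via Taylor series with scaling and squaring."""
    n = len(a)
    scale = 2 ** squarings
    small = [[x / scale for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, small)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def spins(idx):
    # Basis state idx = 2*a + b encodes the spin pair (1 - 2a, 1 - 2b).
    return 1 - 2 * (idx >> 1), 1 - 2 * (idx & 1)

def gibbs_weights(beta, h, lam, rho, coupling=1.0):
    """Diagonal of exp(-beta*H) in the sigma^z basis (assumed two-site H)."""
    minus_beta_h = [[0.0] * 4 for _ in range(4)]
    for idx in range(4):
        s1, s2 = spins(idx)
        minus_beta_h[idx][idx] = beta * (rho * coupling * s1 * s2 + h * (s1 + s2))
        for flip in (1, 2):  # sigma^x flips exactly one of the two spins
            minus_beta_h[idx][idx ^ flip] = beta * lam
    e = expm(minus_beta_h)
    return [e[i][i] for i in range(4)]

def magnetization(beta, h, lam, rho):
    w = gibbs_weights(beta, h, lam, rho)
    return sum(spins(i)[0] * w[i] for i in range(4)) / sum(w)

def truncated_zz(beta, h, lam, rho):
    """<s1 s2> - <s1><s2> at equal times."""
    w = gibbs_weights(beta, h, lam, rho)
    z = sum(w)
    m1 = sum(spins(i)[0] * w[i] for i in range(4)) / z
    m2 = sum(spins(i)[1] * w[i] for i in range(4)) / z
    m12 = sum(spins(i)[0] * spins(i)[1] * w[i] for i in range(4)) / z
    return m12 - m1 * m2

if __name__ == "__main__":
    print(truncated_zz(2.0, 0.3, 0.7, 1.0))
    print(magnetization(2.0, 0.3, 0.7, 0.5), magnetization(2.0, 0.3, 0.7, 1.0))
```

Such a check does not replace the switching argument; it only confirms the signs on one small instance.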

It is worth noting that this may be seen as a special case of Griffiths' second inequality [12]. Obviously, we may use the random current representation to introduce time-inhomogeneous versions of all correlations we have already encountered in this paper. With that in mind, the following combination of (3.16) with (3.13) will be useful in the sequel: Let A be a disjoint union, $A = I_1 \cup \dots \cup I_n \cup S^{i_1}_\beta \cup \dots \cup S^{i_m}_\beta$,


where Il -s are ground segments of the form Il = (wl , zl ) with both wl and zl lying on some circle Siβ (and being time ordered to avoid notational ambiguities). Define Ac = S\A. Finally define the reduced arrival rates ρ A, ⎧ ⎪ ⎨ ρ, if the corresponding flip is either between two points in A ρeA(t) = (3.17) or between two points in Ac ⎪ ⎩ 0, otherwise. In other words we suppress arrivals of flips between A and Ac . Let u ∈ Ac and let v1 , . . . , vl , u, vl+1 , . . . , v2n be the time ordering of the set {w1 , . . . , zn , u}. Then, exactly as in (3.13), l 2n 2n 2n

ˆ vx σˆ uz ˆ vx (ρ A) ≤ σˆ uz (ρ A) ˆ vx (ρ A) ≤ M(ρ) ˆ vx (ρ A), q q q q 1

1

l+1

1

(3.18) where the expectations are understood in terms of the corresponding (generalized to time-inhomogeneous rates) random current representations, and the second inequality follows from (3.16). In view of how the rates ρ A were defined, fixing labels at the end-points of I1 , . . . , In completely decouples the two regions A and Ac . As a result, (3.18) implies the following inequality: Eρ A

2n

u

1

ν|Aˇ c ∼(ξ,m)

1I{ν(vq )=r } ≤ M(ρ)Eρ A

2n

ν|Aˇ c ∼(ξ,m) 1

1I{ν(vq )=r } ,

(3.19)

where the expectation above is with respect to $\rho^A$-arrival rates and the summation is over all reduced labels $\nu|_{A^c} : A^c \to \{r, l\}$.

4. Differential Inequalities

The following is an adaptation of the ideas of [2,3] to the quantum case. It is worth noting that the space-time techniques we develop here yield simplified proofs even in the classical case. A fruitful idea of [3] is to work with three replicas in order to control the above quantities. In our case these will be three independent replicas $(\xi^1, m^1), (\xi^2, m^2), (\xi^3, m^3)$ of Poisson processes of flips and marks and, respectively, three sets of compatible labels $\nu^1, \nu^2, \nu^3$. We shall always indicate in sub-indices which replicas we are talking about, e.g. we shall talk about left $l_1$ paths in the first replica or about unblocked $*_{23}$-paths in the replicas 2 and 3. In the sequel P is the product measure for all three independent replicas and E denotes the corresponding expectation. Let us go back to the representations (2.2) and (2.1),
\[
\langle\hat\sigma^z_0\rangle
= \langle\hat\sigma^z_0\rangle\,\frac{Z^2}{Z^2}
= \frac{1}{Z^3}\,E\sum_{\substack{\nu^1\overset{0}{\sim}(\xi^1,m^1)\\ \nu^2\sim(\xi^2,m^2)\\ \nu^3\sim(\xi^3,m^3)}} 1. \tag{4.1}
\]


Fig. 4. The ground left 1-path $\check C^l_1(0, g)$ contains three intervals. r and l are $\nu^1$-labels of the first replica. The unblocked 23-cluster $C^*_{23}(g)$ is depicted schematically. Case 1: $\check C^l_1(0, g)$ is disjoint from $C^*_{23}(g)$. Case 2: $0 \in C^*_{23}(g)$. Case 3: $0 \notin C^*_{23}(g)$, but $\check C^l_1(0, g) \cap C^*_{23}(g) \neq \emptyset$

Let $C^*_{23}(g)$ be the set of all points $v \in S$ which are $*_{23}$-connected to g and let us denote by $\check C^l_1(0, g)$ the set of ground (S) points on the unique ground left path from 0 to g. We shall distinguish three cases which exhaust all possible contributions to the right-hand side of (4.1) and lead to the various terms in (1.3):

(1) $\check C^l_1(0, g) \cap C^*_{23}(g) = \emptyset$.
(2) $0 \in C^*_{23}(g)$.
(3) $0 \notin C^*_{23}(g)$ but $\check C^l_1(0, g) \cap C^*_{23}(g) \neq \emptyset$.

Below we consider these cases in turn (see Fig. 4). During our exposition of Case 3, we also derive the pair of inequalities (1.4).

Case 2. If $0 \in C^*_{23}(g)$ then there exist $*_{23}$-paths from 0 to g. Hence the notion of the minimal path $P^* = C^*_{23}(0, g)$ from 0 to g is well defined. Applying the Basic Transformation along $P^*$ on 23-labels, we readily conclude
\[
\frac{1}{Z^3}\,E\sum_{\substack{\nu^1\overset{0}{\sim}(\xi^1,m^1)\\ \nu^2\sim(\xi^2,m^2)\\ \nu^3\sim(\xi^3,m^3)}}
\mathbb{1}_{\{0\in C^*_{23}(g)\}}
= \frac{1}{Z^3}\,E\sum_{\substack{\nu^1\overset{0}{\sim}(\xi^1,m^1)\\ \nu^2\overset{0}{\sim}(\xi^2,m^2)\\ \nu^3\overset{0}{\sim}(\xi^3,m^3)}} 1
= M^3. \tag{4.2}
\]

Case 1. By construction, $\check C^l_1 = (I_1, \dots, I_n)$. All the intervals in this sequence are ground, and the last interval $I_n = (w, v = (i, t))$ satisfies $t \in \xi^1_{i,g}$. Let $\tilde\xi^1$ be the modified realization of the 1-process of flips with the corresponding arrival removed, but the configuration $(\nu^1, \xi^1, m^1)$ otherwise kept intact. Obviously, the relative weight of removing this arrival contributes a factor $h\,dt$, and one can recover the original $\xi^1$ by adding a flip from v to the ghost site g. Formally, fixing realizations of the second and third replicas and fixing compatible values of $\nu^2$ and $\nu^3$, taking expectations only with respect to $(\xi^1, m^1)$ and summing only with respect to compatible $\nu^1$-labels we obtain
\[
E_1\Big[\sum_{\nu^1\overset{0}{\sim}(\xi^1,m^1)}
\mathbb{1}_{\{\check C^l_1(0,g)\cap C^*_{23}(g)=\emptyset\}}\Big]
= \sum_{i\in T_N}\int_0^\beta h\,dt\;
E_1\Big[\sum_{\nu^1\overset{0}{\sim}(\xi^1,m^1)}
\mathbb{1}_{\{\check C^l_1(0,g)\cap C^*_{23}(g)=\emptyset\}}\,
\mathbb{1}_{\{(i,t)\in\check C^l_1(0,g)\}}\ \Big|\ \xi^1_{i,g}(t)=1\Big]. \tag{4.3}
\]
Now,
\[
E_1\Big[\sum_{\nu^1\overset{0}{\sim}(\xi^1,m^1)}
\mathbb{1}_{\{\check C^l_1(0,g)\cap C^*_{23}(g)=\emptyset\}}\,
\mathbb{1}_{\{(i,t)\in\check C^l_1(0,g)\}}\ \Big|\ \xi^1_{i,g}(t)=1\Big]
= E_1\Big[\sum_{\nu^1\overset{0,v}{\sim}(\xi^1,m^1)}
\mathbb{1}_{\{C^l_1(0,v)\cap C^*_{23}(g)=\emptyset\}}\Big] \tag{4.4}
\]
with $v = (i, t)$ on the right-hand side. Taking into account replicas 2 and 3, let us determine the properties of the resulting triple of configurations from the joint integration on the right-hand side of (4.4). Since $C^l_1(0, v) \cap C^*_{23}(g) = \emptyset$, there exist $*_{12}$-paths from 0 to v which are disjoint from $C^*_{23}(g)$. Let $P^*$ be the minimal such path. Consider the Basic Transformation along $P^*$ on 12-labels. It produces a new collection $(\hat\nu^1, \hat\xi^1, \hat m^1), (\hat\nu^2, \hat\xi^2, \hat m^2)$, which satisfies the following set of conditions:

(1) $P^*$ is still the minimal $*_{12}$ path from 0 to v which avoids $C^*_{23}(g)$. In particular, the transformation is invertible.
(2) $\hat\nu^1 \sim (\hat\xi^1, \hat m^1)$ and $\hat\nu^2 \overset{0,v}{\sim} (\hat\xi^2, \hat m^2)$. In particular, $0 \overset{*_{23}}{\longleftrightarrow} v$ and $0 \overset{*_{23}}{\nleftrightarrow} g$.

Comparing with (3.5) (applied to 2 and 3 labels) and with the first of (3.14), we conclude
\[
\frac{1}{Z^3}\,E\sum_{\substack{\nu^1\overset{0}{\sim}(\xi^1,m^1)\\ \nu^2\sim(\xi^2,m^2)\\ \nu^3\sim(\xi^3,m^3)}}
\mathbb{1}_{\{\check C^l_1(0,g)\cap C^*_{23}(g)=\emptyset\}}
\le \sum_{i}\int_0^\beta h\,dt\,\langle\hat\sigma^z_0 ; \hat\sigma^z_{(i,t)}\rangle
= h\,\frac{\partial M}{\partial h}. \tag{4.5}
\]

Case 3. This is the most difficult case. In fact it contains two sub-cases, which we proceed to describe: The left ground path from 0 to g, denoted by $\check C^l_1(0, g)$, is a ground path which may be naturally written as an ordered collection of ground intervals, $\check C^l_1(0, g) = \cup_1^n I_l$: Each

interval $I_l = [z_l, w_l]$ is also naturally oriented with respect to the direction of the path towards g. Therefore, in the case under consideration we can speak of the first interval $I_{l^*} = [z_{l^*}, w_{l^*}]$ where $\check C^l_1(0, g)$ hits $C^*_{23}(g)$ and, furthermore, about the first hitting point $u^* \in I_{l^*}$.

Fig. 5. Double transformation in Case 3 (a): The 23-mark at $u^*$ is removed at the cost $2\lambda\,dt$. The point $u^*$ is pivotal for the $\{0 \overset{*_{23}}{\longleftrightarrow} g\}$-connection in the modified configuration. $A^*_{23}(0, u^*)$ is the set of all the points $u \in S$ which can be reached from 0 via unblocked 23-paths avoiding $u^*$. $A^*_{23}(g, u^*)$ is the set of all the points $u \in S$ which can be reached from g via unblocked 23-paths avoiding $u^*$

Case 3(a). Pivotal Marks: In this sub-case $z_{l^*} \overset{*_{23}}{\nleftrightarrow} g$ or, equivalently, $z_{l^*} \neq u^*$. Since $u^*$ is in the boundary of $C^*_{23}(g)$, there is necessarily a 23-mark at $u^*$. Also, both the 2 and 3 labels are necessarily r at $u^*$. By construction (if we understand the interval $(z_{l^*}, u^*) \subset C^l_1(0, u^*)$ as being topologically open),
\[
\check C^l_1(0, u^*) \cap C^*_{23}(g) = \emptyset. \tag{4.6}
\]
Hence there exist $*_{12}$-paths from 0 to $u^*$ which avoid $C^*_{23}(g)$. Let $P^*_{12}(0, u^*)$ be the minimal such path. Let also $P^*_{23}(u^*, g)$ be the minimal $*_{23}$-path from $u^*$ to g. These paths are disjoint. Let us make the following double transformation on all three collections of replicas and compatible labels:

(1) Remove the 23-mark at $u^*$. This yields the weight $2\lambda\,dt$.
(2) Perform the Basic Transformation along $P^*_{12}(0, u^*)$ on 12-labels.
(3) Perform the Basic Transformation along $P^*_{23}(u^*, g)$ on 23-labels.

Since the Basic Transformations are on disjoint paths the latter two operations are well defined, commute and moreover do not change the minimal character of $P^*_{12}(0, u^*)$ and $P^*_{23}(u^*, g)$. In other words, they are invertible. The resulting set of triples $(\hat\nu^1, \hat\xi^1, \hat m^1), (\hat\nu^2, \hat\xi^2, \hat m^2), (\hat\nu^3, \hat\xi^3, \hat m^3)$ satisfies the following conditions (see Fig. 5):

(1) $\hat\nu^1 \overset{u^*}{\sim} (\hat\xi^1, \hat m^1)$, $\hat\nu^2 \overset{0}{\sim} (\hat\xi^2, \hat m^2)$ and $\hat\nu^3 \overset{u^*}{\sim} (\hat\xi^3, \hat m^3)$.
(2) $\hat\nu^2(u^*) = l$.
(3) $u^*$ is pivotal for $\{0 \overset{*_{23}}{\longleftrightarrow} g\}$.

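Note that (2) is a consequence of (1) and (3) and, therefore, can be omitted. We claim that
\[
E\sum_{\substack{\nu^2\overset{0}{\sim}(\xi^2,m^2)\\ \nu^3\overset{u^*}{\sim}(\xi^3,m^3)}}
\mathbb{1}_{\{u^*\text{ is pivotal for }0\overset{*_{23}}{\longleftrightarrow} g\}}
\le M\,E\sum_{\substack{\nu^2\overset{0}{\sim}(\xi^2,m^2)\\ \nu^3\sim(\xi^3,m^3)}}
\mathbb{1}_{\{\nu^3(u^*)=r\}}\,
\mathbb{1}_{\{u^*\text{ is pivotal for }0\overset{*_{23}}{\longleftrightarrow} g\}}. \tag{4.7}
\]
Assuming (4.7) for the moment, a comparison with (3.12) and with the third of (3.14) reveals that the total contribution to M which comes from the Case 3(a) is bounded above by
\[
\frac{M^2}{Z^2}\sum_i\int_0^\beta 2\lambda\,dt\;
E\sum_{\substack{\nu^2\overset{0}{\sim}(\xi^2,m^2)\\ \nu^3\sim(\xi^3,m^3)}}
\mathbb{1}_{\{\nu^3(u^*)=r\}}\,
\mathbb{1}_{\{u^*=(i,t)\text{ is pivotal for }0\overset{*_{23}}{\longleftrightarrow} g\}}
= -2\lambda M^2\,\frac{\partial M}{\partial\lambda}. \tag{4.8}
\]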

To check (4.7) let $A^*_{23}(0, u^*)$ be the set of all the points $u \in S$ which can be reached from 0 via unblocked 23-paths avoiding $u^*$. Evidently, $A^*_{23}(0, u^*)$ can be written as a union $A^*_{23}(0, u^*) = \cup R_j$ of sets which satisfy the following properties:

(1) Each $R_j$ is either a full circle, or it is an interval $R_j = (p_j, q_j)$ (which, formally speaking, is a union of successive ground intervals on some $S^i_\beta$), and then it bears 23-marks at its endpoints $p_j$ and $q_j$, except, of course, for the interval which contains $u^*$ as one of its endpoints (recall that the 23-mark at $u^*$ was removed). Moreover, both labels $\nu^2$ and $\nu^3$ equal r at such end-points.
(2) Let $R_{j^*} = (p_{j^*}, u^*)$ be the remaining interval which contains $u^*$ as one of its endpoints. Then $\nu^3(u^*-) = \lim_{z\in R_{j^*},\, z\to u^*} \nu^3(z) = r$.
(3) There are no arrivals of 23-flips between points in $A^*_{23}(0, u^*)$ and points in $S\setminus A^*_{23}(0, u^*)$.

The inequality (4.7) is then proved as follows: Conditioning on $A^*_{23}(0, u^*)$ with realizations of all the processes and values of both 2 and 3 labels on it, we integrate with respect to marks on $S\setminus A^*_{23}(0, u^*)$, flips on $S\setminus A^*_{23}(0, u^*) \cup g$ and compatible 2 and 3 labels. The constrained integration clearly decouples the two configurations on $S\setminus A^*_{23}(0, u^*) \cup g$ and so we can integrate the restricted 2 and 3 quantities independently. We arrive at a situation where (3.19) applies (for the restriction of $\nu^3$). More precisely, what we use is actually a limiting case of (3.19), with the z component of spin in the expectation occurring at the point $u^*$ on the boundary of $S\setminus A^*_{23}(0, u^*) = A^*_{23}(0, u^*)^c$. Putting things together concludes Case 3(a).

Before proceeding to Case 3(b), let us prove the first of (1.4) by techniques similar to those of the previous paragraph. Consider the expression for $\partial M/\partial\lambda$ as it appears in (4.8). Given two labels $\nu^2 \overset{0}{\sim} (\xi^2, m^2)$ and $\nu^3 \sim (\xi^3, m^3)$, the condition that $u^*$ is pivotal for $\{0 \overset{*_{23}}{\longleftrightarrow} g\}$ is equivalent to $A^*_{23}(0, u^*) \cap A^*_{23}(g, u^*) = \emptyset$, where $A^*_{23}(g, u^*)$ is the set of points which can be reached from g by unblocked paths avoiding $u^*$. Consequently,

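by (3.19) (or, more precisely, by the limiting case of the latter, applied this time to the restriction of $\nu^2$ to $A^*_{23}(0, u^*)^c$ at the point $u^*$ on the boundary of $A^*_{23}(0, u^*)^c$),
\[
E\sum_{\substack{\nu^2\overset{0}{\sim}(\xi^2,m^2)\\ \nu^3\sim(\xi^3,m^3)}}
\mathbb{1}_{\{\nu^3(u^*)=r\}}\,
\mathbb{1}_{\{u^*\text{ is pivotal for }0\overset{*_{23}}{\longleftrightarrow} g\}}
\le M\,E\sum_{\substack{\nu^2\overset{0,u^*}{\sim}(\xi^2,m^2)\\ \nu^3\sim(\xi^3,m^3)}}
\mathbb{1}_{\{A^*_{23}(0,u^*)\cap A^*_{23}(g,u^*)=\emptyset\}}\,
\mathbb{1}_{\{\nu^3(u^*)=r\}}.
\]
In order to estimate the latter expression we shall separately consider whether $u^* \overset{*_{23}}{\longleftrightarrow} g$ or not. First of all,
\[
E\sum_{\substack{\nu^2\overset{0,u^*}{\sim}(\xi^2,m^2)\\ \nu^3\sim(\xi^3,m^3)}}
\mathbb{1}_{\{A^*_{23}(0,u^*)\cap A^*_{23}(g,u^*)=\emptyset\}}\,
\mathbb{1}_{\{\nu^3(u^*)=r\}}\,
\mathbb{1}_{\{u^*\overset{*_{23}}{\nleftrightarrow} g\}}
\le E\sum_{\substack{\nu^2\overset{0,u^*}{\sim}(\xi^2,m^2)\\ \nu^3\sim(\xi^3,m^3)}}
\mathbb{1}_{\{u^*\overset{*_{23}}{\nleftrightarrow} g\}},
\]
and the right-hand side is $Z^2\langle\hat\sigma^z_0 ; \hat\sigma^z_{u^*}\rangle$ (see (3.5)). On the other hand,
\[
E\sum_{\substack{\nu^2\overset{0,u^*}{\sim}(\xi^2,m^2)\\ \nu^3\sim(\xi^3,m^3)}}
\mathbb{1}_{\{A^*_{23}(0,u^*)\cap A^*_{23}(g,u^*)=\emptyset\}}\,
\mathbb{1}_{\{\nu^3(u^*)=r\}}\,
\mathbb{1}_{\{u^*\overset{*_{23}}{\longleftrightarrow} g\}}
= E\sum_{\substack{\nu^2\overset{0}{\sim}(\xi^2,m^2)\\ \nu^3\overset{u^*}{\sim}(\xi^3,m^3)}}
\mathbb{1}_{\{A^*_{23}(0,u^*)\cap A^*_{23}(g,u^*)=\emptyset\}}, \tag{4.9}
\]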

as can be seen by performing our Basic Transformation on the minimal ∗23 path from u∗ to g (which would necessarily lie in A∗23 (0, u∗ )c ). Since the constraints appearing on the ∗ right-hand side imply that u is pivotal, we may use (4.7) and bound the right-hand-side ˆ x∗ . The inequality (1.4) then follows easily. in (4.9) by −MZ 2 σˆ 0z ; u ∗23

Case 3(b). Pivotal Flips: Assume now that $z_{l^*} \overset{*_{23}}{\longleftrightarrow} g$ or, equivalently, that $z_{l^*} \in C^*_{23}(g)$. In order to simplify notation set $z^* = z_{l^*}$ and $w^* = w_{l^*-1}$. Under the above assumption $C^*_{23}(g)$ is disjoint from the left path $C^l_1(0, w^*)$. Hence there exist $*_{12}$-paths from 0 to $w^*$ which avoid $C^*_{23}(g)$. Let $P^*_{12}(0, w^*)$ be the minimal such path. Let also $P^*_{23}(z^*, g)$ be the minimal $*_{23}$-path from $z^*$ to g. These paths are disjoint. Let us make now the following transformation on all three replicas and labels:

(1) Remove the arrival of $\xi^1$ between $w^*$ and $z^*$, yielding the weight $\rho J_{i,j}\,dt$.
(2) Perform the Basic Transformation along $P^*_{12}(0, w^*)$ on 12-labels.
(3) Perform the Basic Transformation along $P^*_{23}(z^*, g)$ on 23-labels.

Again, since the Basic Transformations are on disjoint paths they are well defined and do not change the minimal character of $P^*_{12}(0, w^*)$ and $P^*_{23}(z^*, g)$. Thus, they are invertible and the resulting collection of configurations $(\hat\nu^1, \hat\xi^1, \hat m^1), (\hat\nu^2, \hat\xi^2, \hat m^2), (\hat\nu^3, \hat\xi^3, \hat m^3)$ satisfies the following set of conditions (see Fig. 6):

(1) $\hat\nu^1 \overset{z^*}{\sim} (\hat\xi^1, \hat m^1)$, $\hat\nu^2 \overset{\{0,w^*,z^*\}}{\sim} (\hat\xi^2, \hat m^2)$ and $\hat\nu^3 \overset{z^*}{\sim} (\hat\xi^3, \hat m^3)$.
(2) $C^*_{23}(0, w^*)$ and $C^*_{23}(g)$ are disjoint.

Fig. 6. Double transformation in Case 3 (b): The 1-flip between $w^*$ and $z^*$ is removed at the cost $\rho J_{ij}\,dt$. In the modified configuration $w^* \in C^*_{23}(0)$ and the clusters $C^*_{23}(0)$ and $C^*_{23}(g)$ are disjoint

Therefore, the contribution to M which comes from Case 3(b) is bounded by
\[
M\,\frac{1}{Z^2}\sum_{i,j}\int_0^\beta \rho J_{ij}\,dt\;
E\sum_{\substack{\nu^2\overset{\{0,(i,t),(j,t)\}}{\sim}(\xi^2,m^2)\\ \nu^3\overset{(j,t)}{\sim}(\xi^3,m^3)}}
\mathbb{1}_{\{0\overset{*_{23}}{\longleftrightarrow}(i,t)\}}\,
\mathbb{1}_{\{C^*_{23}(0,(i,t))\cap C^*_{23}(g)=\emptyset\}}. \tag{4.10}
\]
We claim that the latter expression is bounded above by
\[
M^2\,\frac{1}{Z^2}\sum_{i,j}\int_0^\beta \rho J_{ij}\,dt\;
E\sum_{\substack{\nu^2\overset{\{0,(i,t),(j,t)\}}{\sim}(\xi^2,m^2)\\ \nu^3\sim(\xi^3,m^3)}}
\mathbb{1}_{\{0\overset{*_{23}}{\longleftrightarrow}(i,t)\}}\,
\mathbb{1}_{\{C^*_{23}(0,(i,t))\cap C^*_{23}(g)=\emptyset\}}. \tag{4.11}
\]
The proof is the same as that of (4.7) and is omitted here. The expression in (4.11) is exactly $M^2\rho\,\partial M/\partial\rho$. Indeed, just compare it with (3.15): If we define $w = (i, t)$ and $z = (j, t)$, then $0 \overset{*_{23}}{\nleftrightarrow} g$ precisely means that either $0 \overset{*_{23}}{\longleftrightarrow} w$, $z \overset{*_{23}}{\longleftrightarrow} g$ and $C^*_{23}(0, w) \cap C^*_{23}(z, g) = \emptyset$ or, the other way around, $0 \overset{*_{23}}{\longleftrightarrow} z$, $w \overset{*_{23}}{\longleftrightarrow} g$ and $C^*_{23}(0, z) \cap C^*_{23}(w, g) = \emptyset$.

The second inequality of (1.4) is also an immediate consequence. From a (by now) standard application of the Basic Transformation,
\[
E\sum_{\substack{\nu^2\overset{\{0,w,z\}}{\sim}(\xi^2,m^2)\\ \nu^3\sim(\xi^3,m^3)}}
\mathbb{1}_{\{C^*_{23}(0,w)\cap C^*_{23}(z,g)=\emptyset\}}
= E\sum_{\substack{\nu^2\overset{\{0,w\}}{\sim}(\xi^2,m^2)\\ \nu^3\overset{z}{\sim}(\xi^3,m^3)}}
\mathbb{1}_{\{C^*_{23}(0,w)\cap C^*_{23}(z,g)=\emptyset\}}.
\]

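By (3.19) and in view of the representation (3.5),
\[
E\sum_{\substack{\nu^2\overset{\{0,w\}}{\sim}(\xi^2,m^2)\\ \nu^3\overset{z}{\sim}(\xi^3,m^3)}}
\mathbb{1}_{\{0\overset{*_{23}}{\nleftrightarrow} g\}}
\le M\,E\sum_{\substack{\nu^2\overset{\{0,w\}}{\sim}(\xi^2,m^2)\\ \nu^3\sim(\xi^3,m^3)}}
\mathbb{1}_{\{0\overset{*_{23}}{\nleftrightarrow} g\}}
= M Z^2\,\langle\hat\sigma^z_0 ; \hat\sigma^z_w\rangle.
\]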


The analogous statement holds if the roles of z and w are interchanged. The conclusion follows by collecting terms.

5. Proof of Theorem A: Exponential Decay

In the sequel we shall continue to use P and, respectively, E for the product probability for two independent replicas $(\xi^1, m^1)$ and $(\xi^2, m^2)$. As before $n = m^1 \cup m^2$ and $\eta = \xi^1 \cup \xi^2$. The proof is given in three subsections, corresponding to each of the three truncated correlations. The proof for z-correlations is given in some detail and, as the proofs of the other two inequalities require only small modifications of this argument, we shall be briefer in proving the last two statements.

Proof of Theorem A for z-correlations. Let $i, j \in T_N$, $s, t \in S_\beta$ be fixed and let $u = (i, t)$, $v = (j, s)$. We shall prove the following generalization of the first of (1.2):

Lemma 5.1. There exist $c_1 = c_1(h, \lambda, \rho) > 0$ and $c_2 = c_2(h, \lambda, \rho) < \infty$ such that
\[
\langle\hat\sigma^z_u ; \hat\sigma^z_v\rangle \le c_2\,e^{-c_1 d(u,v)}, \tag{5.1}
\]
where $d(u, v) = |j - i| + |t - s|$. The above inequality is uniform in N, β, u and v.

Proof. The starting point for our analysis is the formula (3.5) reproduced here:
\[
\langle\hat\sigma^z_u ; \hat\sigma^z_v\rangle
= \frac{1}{Z^2}\,E\Big(\sum_{\substack{\nu^1\sim(\xi^1,m^1)\\ \nu^2\overset{u,v}{\sim}(\xi^2,m^2)}}
\mathbb{1}_{\{u\overset{*t}{\longleftrightarrow} v\}}\,
\mathbb{1}_{\{u\overset{*}{\nleftrightarrow} g\}}\Big). \tag{5.2}
\]

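There is a simple reason to include a redundant constraint $\{u \overset{*t}{\longleftrightarrow} v\}$: Given a realization of $\xi^1$ and $\xi^2$, the function
\[
(m^1, m^2) \mapsto
\sum_{\substack{\nu^1\sim(\xi^1,m^1)\\ \nu^2\overset{u,v}{\sim}(\xi^2,m^2)}}
\mathbb{1}_{\{u\overset{*t}{\longleftrightarrow} v\}}
\]
is monotone non-increasing. Consequently, for any non-decreasing $F(m^1, m^2)$, the FKG property of the pair of Poisson processes $m^1$ and $m^2$ implies:
\[
E\Big(F(m^1, m^2)\sum_{\substack{\nu^1\sim(\xi^1,m^1)\\ \nu^2\overset{u,v}{\sim}(\xi^2,m^2)}}
\mathbb{1}_{\{u\overset{*t}{\longleftrightarrow} v\}}\Big)
\le E\big(F(m^1, m^2)\big)\;
E\Big(\sum_{\substack{\nu^1\sim(\xi^1,m^1)\\ \nu^2\overset{u,v}{\sim}(\xi^2,m^2)}}
\mathbb{1}_{\{u\overset{*t}{\longleftrightarrow} v\}}\Big). \tag{5.3}
\]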
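The mechanism behind (5.3) is the negative correlation of a non-decreasing and a non-increasing function under a product measure. As a hedged illustration only, the sketch below replaces the Poisson processes of marks by finitely many independent Bernoulli cells (a discretization assumed here purely for illustration) and checks $E(FG) \le E(F)\,E(G)$ by exact enumeration. The observables `increasing` and `decreasing` are hypothetical stand-ins for F and for the sum of indicators, respectively.

```python
from itertools import product

def expectation(func, n, p):
    """Exact expectation of func over n independent Bernoulli(p) cells."""
    total = 0.0
    for omega in product((0, 1), repeat=n):
        k = sum(omega)
        total += func(omega) * p**k * (1 - p) ** (n - k)
    return total

def increasing(omega):
    # Hypothetical non-decreasing observable: at least two marked cells.
    return 1.0 if sum(omega) >= 2 else 0.0

def decreasing(omega):
    # Hypothetical non-increasing observable: the first two cells are unmarked.
    return 1.0 if omega[0] == 0 and omega[1] == 0 else 0.0

def fkg_gap(n, p):
    """E(F) E(G) - E(F G); non-negative for F increasing and G decreasing."""
    joint = expectation(lambda w: increasing(w) * decreasing(w), n, p)
    return expectation(increasing, n, p) * expectation(decreasing, n, p) - joint

if __name__ == "__main__":
    print(fkg_gap(4, 0.3))
```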

For every δ > 0 fixed (for convenience we'll assume that δ divides β) let $\mathbb{Z}^\beta_\delta = \delta\mathbb{Z}/((\beta/\delta)\mathbb{Z})$ be the rescaled one-dimensional lattice torus which is just an equal δ-spacing embedding of β/δ sites into $S_\beta$. We construct non-decreasing functions $F_\delta(m^1, m^2) = F^{u,v}_\delta(m^1, m^2)$ as follows: First of all let us map S onto $\mathbb{Z}^\beta_\delta \times \mathbb{Z}^d$: A point $p = (\delta k, j) \in \mathbb{Z}^\beta_\delta \times \mathbb{Z}^d$ corresponds to the interval $[(k-1)\delta, k\delta)$ of $S^j_\beta$. Two points $p = (\delta k, j)$ and $q = (\delta l, m)$ are said to be connected if either $j = m$ and $|k - l| \le 1 \bmod (\beta/\delta)$ or $k = l$ and $(j, m) \in E$. Consider the following Bernoulli site percolation process $X_\delta$ on $\mathbb{Z}^\beta_\delta \times \mathbb{Z}^d$, which is generated by the combined process of marks n:
\[
X_\delta(p) =
\begin{cases}
0, & \text{if } n\big(j \times [(k-1)\delta, k\delta)\big) > 0,\\
\delta, & \text{otherwise.}
\end{cases}
\]
Clearly, $P(X_\delta = \delta)$ tends to one as δ tends to zero. For $p, q \in \mathbb{Z}^\beta_\delta \times \mathbb{Z}^d$ we can define the minimal passage time
\[
T_\delta(p, q) = \min_{\gamma_\delta : p \to q}\ \sum_{r \in \gamma} X_\delta(r).
\]
Then, there exist $c_1, c_2 > 0$ such that
\[
P\Big(T_\delta(p, q) < \frac{\delta}{2}\, d_\delta(p, q)\Big) \le c_2\,e^{-c_1 d_\delta(p,q)}, \tag{5.4}
\]
uniformly in $0 < \delta \le \delta_0$ small enough and in $p, q \in \mathbb{Z}^\beta_\delta \times \mathbb{Z}^d$. Moreover, our choice of $\delta_0$ may be made independent of β. Here, $d_\delta(p, q)$ is the minimal possible number of points in connected paths $\gamma_\delta : p \to q$. Note that if $p_u$ and $p_v$ label δ-intervals containing u and v, then $d_\delta(p_u, p_v) \ge c_3\, d(u, v)$ uniformly in δ small and, say, $d(u, v) \ge 1$. Fix such δ, $p_u$ and $p_v$, with δ > 0 chosen small enough for (5.4) to hold. If we define
\[
D^c_\delta = D^{u,v,c}_\delta = \Big\{T_\delta(p_u, p_v) < \frac{\delta}{2}\, d_\delta(p_u, p_v)\Big\},
\]
then since $F_\delta = \mathbb{1}_{D^c_\delta}$ is non-decreasing, the FKG inequality (5.3) along with (5.4) imply that for all δ small there exist $c_1 = c_1(\delta), c_2 > 0$, such that
\[
E\Big(\mathbb{1}_{D^c_\delta}\sum_{\substack{\nu^1\sim(\xi^1,m^1)\\ \nu^2\overset{u,v}{\sim}(\xi^2,m^2)}}
\mathbb{1}_{\{u\overset{*t}{\longleftrightarrow} v\}}\Big)
\le c_2\,e^{-c_1 d(u,v)}\;
E\Big(\sum_{\substack{\nu^1\sim(\xi^1,m^1)\\ \nu^2\overset{u,v}{\sim}(\xi^2,m^2)}}
\mathbb{1}_{\{u\overset{*t}{\longleftrightarrow} v\}}\Big). \tag{5.5}
\]
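The passage time $T_\delta$ used above is a shortest-path quantity with weights on sites rather than on edges. The sketch below is a minimal illustration under simplifying assumptions: the spatial lattice is taken to be a one-dimensional torus with nearest-neighbour edges, the array `mark_counts[k][j]` (a hypothetical input) holds the number of combined marks in the cell $j \times [(k-1)\delta, k\delta)$, and the minimum over paths is computed by Dijkstra's algorithm with both endpoints counted.

```python
import heapq

def site_weights(mark_counts, delta):
    """X_delta(p) = 0 if the cell carries at least one mark, delta otherwise."""
    return [[0.0 if c > 0 else delta for c in row] for row in mark_counts]

def passage_time(x, source, target):
    """Minimal sum of site weights x[k][j] over nearest-neighbour paths on a
    discrete torus (time index k, site index j), endpoints included."""
    n_time, n_space = len(x), len(x[0])
    best = {source: x[source[0]][source[1]]}
    heap = [(best[source], source)]
    while heap:
        cost, (k, j) = heapq.heappop(heap)
        if (k, j) == target:
            return cost
        if cost > best[(k, j)]:
            continue  # stale heap entry
        neighbours = (((k + 1) % n_time, j), ((k - 1) % n_time, j),
                      (k, (j + 1) % n_space), (k, (j - 1) % n_space))
        for nk, nj in neighbours:
            new_cost = cost + x[nk][nj]
            if new_cost < best.get((nk, nj), float("inf")):
                best[(nk, nj)] = new_cost
                heapq.heappush(heap, (new_cost, (nk, nj)))
    return float("inf")

if __name__ == "__main__":
    marks = [[0] * 4 for _ in range(4)]
    print(passage_time(site_weights(marks, 0.5), (0, 0), (2, 0)))
```

With no marks at all every cell costs δ, so the passage time equals δ times the number of cells on a shortest path; each marked cell on the path lowers it by δ, which is why many marks are needed for the event $D^c_\delta$.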


In view of (5.5) it suffices to check that, perhaps by adjusting further $c_1, c_2 > 0$,
\[
E\Big(\mathbb{1}_{D_\delta}\sum_{\substack{\nu^1\sim(\xi^1,m^1)\\ \nu^2\overset{u,v}{\sim}(\xi^2,m^2)}}
\mathbb{1}_{\{u\overset{*t}{\longleftrightarrow} v\}}\,
\mathbb{1}_{\{u\overset{*}{\nleftrightarrow} g\}}\Big)
\le c_2\,e^{-c_1 d(u,v)}\;
E\Big(\mathbb{1}_{D_\delta}\sum_{\substack{\nu^1\sim(\xi^1,m^1)\\ \nu^2\overset{u,v}{\sim}(\xi^2,m^2)}}
\mathbb{1}_{\{u\overset{*t}{\longleftrightarrow} v\}}\Big). \tag{5.6}
\]
Consider now the set $C^{*t}(u, v)$ of all the points $z \in S$ which are ∗t-connected to both u and v. The set $C^{*t}(u, v)$ is non-empty on the event $\{u \overset{*t}{\longleftrightarrow} v\}$, and it is represented as a union of intervals $C^{*t}(u, v) = \cup_l R_l$. Each interval $R_l \subset S^{i_l}_\beta$ is either a full circle, or $R_l = (z_l, w_l) \subset S^{i_l}_\beta$ with combined n marks placed at both end-points. Note that these endpoints must also have $\nu^1, \nu^2 = r$. Let us say that $p = (\delta k, l) \in G_\delta(R_l)$ if $p \in R_l$ and $X_\delta(p) = \delta$.

Note that $p \in G_\delta(R_l)$ implies in particular that $[(k-1)\delta, k\delta) \times l \subseteq R_l$. The crucial property is that on the event $D^{u,v}_\delta$ the following happens: The number of all δ-intervals associated with points $p \in \cup_l G_\delta(R_l)$ is bounded below as
\[
\sum_l \sum_{p \in G_\delta(R_l)} 1 \ \ge\ \frac{1}{\delta}\, T_\delta(p_u, p_v) \ >\ \frac12\, c_3\, d(u, v). \tag{5.7}
\]

Let us condition on realizations of $C^{*t}(u, v)$ which are compatible with $\{u \overset{*t}{\longleftrightarrow} v\}$ and $D^{u,v}_\delta$. As before, such a conditioning rules out simultaneous flips between points in $C^{*t}(u, v)$ and $S\setminus C^{*t}(u, v)$. Therefore, the corresponding conditional integration and summation over compatible flips, marks and labels inside and outside $C^{*t}(u, v)$ decouples over the two regions. In other words, to establish (5.6) it is enough to prove the following statement ((5.9) below): Let $A = \cup R_l$ be a collection of full circles and disjoint intervals, such that u and v are interior points of A. Further, suppose that A contains at least $\frac12 c_3\, d(u, v)$ disjoint sub-intervals each with length at least δ and let us say that $D^{u,v}_\delta(A)$ occurs for the realization of the combined process of marks n whenever (5.7) holds. Let $\rho^A$ denote the reduced time-inhomogeneous rates of arrivals of flips (associated to edges on the torus) as in (3.17),
\[
\rho^A_e(t) =
\begin{cases}
\rho, & \text{if the corresponding flip is either between two points in } A \text{ or between two points in } S\setminus A,\\
0, & \text{otherwise.}
\end{cases} \tag{5.8}
\]
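The reduced rates (5.8) simply switch off every flip that would join A to its complement. A minimal sketch, assuming A is encoded as a dictionary from sites to lists of half-open time intervals (the encoding, the half-open convention and both function names are assumptions made here for illustration; interval endpoints form a null set for the arrival processes):

```python
def in_region(region, site, t):
    """region maps a site to a list of time intervals [a, b) belonging to A."""
    return any(a <= t < b for a, b in region.get(site, []))

def reduced_rate(rho, region, site_i, site_j, t):
    """Rate of a flip joining (site_i, t) and (site_j, t) under rho^A:
    unchanged if both endpoints lie on the same side of A, zero otherwise."""
    same_side = in_region(region, site_i, t) == in_region(region, site_j, t)
    return rho if same_side else 0.0

if __name__ == "__main__":
    region = {0: [(0.0, 1.0)], 1: [(0.5, 1.0)]}
    print(reduced_rate(2.0, region, 0, 1, 0.75), reduced_rate(2.0, region, 0, 1, 0.25))
```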

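Then,
\[
E_{\rho^A}\Big(\sum_{\substack{\check\nu^1\sim(\xi^1,m^1)\\ \check\nu^2\overset{u,v}{\sim}(\xi^2,m^2)}}
\mathbb{1}_{\{C^{*t}(u,v)=A\}}\,
\mathbb{1}_{D^{u,v}_\delta(A)}\,
\mathbb{1}_{\{u\overset{*}{\nleftrightarrow} g\}}\Big)
\le c_2\,e^{-c_1 d(u,v)}\;
E_{\rho^A}\Big(\sum_{\substack{\check\nu^1\sim(\xi^1,m^1)\\ \check\nu^2\overset{u,v}{\sim}(\xi^2,m^2)}}
\mathbb{1}_{\{C^{*t}(u,v)=A\}}\,
\mathbb{1}_{D^{u,v}_\delta(A)}\Big), \tag{5.9}
\]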

where νˇ 1 , νˇ 2 are restrictions of the labels to A = ∪Rl which are compatible with the marks, and in particular with r -boundary conditions, at the end-points of Rl -s. The inequality (5.9) is established by the following embedding procedure: Let νˇ 1 , ξ 1 , 2 2 2 1 m , νˇ , ξ , m be a pair of configurations which contribute to the left-hand side of (5.9). All such configurations have no arrivals of g-induced flips on A. At this stage it is convenient to introduce the following separate notation for processes of flips: let g,k ξˇek ; k = 1, 2, to denote arrivals for e = (i, j) ∈ E 0 and ξe ; k = 1, 2, to denote arrivals g 1 2 g for e = (i, g) ∈ E . A similar notation ηˇ = ξˇ ∪ ξˇ and η = ξ g,1 ∪ξ g,2 is introduced for g combined processes of flips. Then, on the event conditions = ∅, the compatibility η (A) u,v 1 1 1 2 ˇ on the left-hand side of (5.9) read as νˇ ∼ ξ , m and, accordingly, νˇ ∼ ξˇ 2 , m2 , and the expression on the left-hand side of (5.9) equals to

1I{C∗t (u,v)=A} 1IDu,v (A) . (5.10) e−2h l |Rl | Eρ A δ νˇ 1 ∼(ξˇ 1 ,m 1 ) u,v

νˇ 2 ∼ (ξˇ 2 ,m2 )

Fix now a realization of (ξˇ 1 , m1 ), (ξˇ 2 , m2 ) and compatible labels νˇ 1 ∼ (ξˇ 1 , m1 ) and (ξˇ 2 , m2 ). Consider the following event: E(A) = ∩l ∩p∈Gδ (Rl ) ξ g,i (Ip ) is even for i = 1, 2 ∩ ηg(A\Aδ ) = 0 ,

u,v νˇ 2 ∼

β

where, for p = (kδ, l) ∈ δZδ × Zd , we set Ip = [(k − 1)δ, kδ) × l

and Aδ = ∪l ∪p∈Gδ (Rl ) Ip .

Evidently,
\[
P(E(A)) = e^{-2h \sum_l |R_l|} \prod_l \prod_{p \in G_\delta(R_l)} (\cosh(\delta h))^2 \ge e^{-2h \sum_l |R_l|} (\cosh(\delta h))^{c_3 d(u,v)},
\tag{5.11}
\]
where the second inequality follows from (5.7). Each $E(A)$-realization of $\xi^{g,1}, \xi^{g,2}$ gives rise to compatible labels $\check\nu^1[\xi^{g,1}] \sim (\check\xi^1, \xi^{g,1}, m^1)$ and $\check\nu^2[\xi^{g,2}] \stackrel{u,v}{\sim} (\check\xi^2, \xi^{g,2}, m^2)$ which are unambiguously constructed from the original $\check\nu^1 \sim (\check\xi^1, m^1)$ and $\check\nu^2 \stackrel{u,v}{\sim} (\check\xi^2, m^2)$ by the appropriate even number of flips on each of the intervals $I_p \subseteq A_\delta$ (see Fig. 7).
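The factor $(\cosh(\delta h))^2$ in (5.11) can be traced to an elementary parity computation. The following is only a sketch, under the assumption (suggested by the form of (5.11) but not restated in this passage) that $\xi^{g,1}$ and $\xi^{g,2}$ are independent Poisson processes of arrivals with rate $h$; the random variable $N$ and the mean $\lambda$ are introduced purely for illustration.

```latex
% Assumption (illustration only): N is a Poisson random variable with mean \lambda.
\[
P(N \text{ is even}) = \sum_{k \ge 0} e^{-\lambda} \frac{\lambda^{2k}}{(2k)!} = e^{-\lambda} \cosh \lambda .
\]
% On an interval I_p of length \delta the mean is \lambda = \delta h; for two independent copies i = 1, 2:
\[
P\bigl( \xi^{g,1}(I_p) \text{ and } \xi^{g,2}(I_p) \text{ are both even} \bigr) = e^{-2\delta h} (\cosh(\delta h))^2 .
\]
% The absence of arrivals of \eta^g = \xi^{g,1} \cup \xi^{g,2} on the remaining part of A contributes
\[
P\bigl( \eta^g(A \setminus A_\delta) = 0 \bigr) = e^{-2h |A \setminus A_\delta|} .
\]
```

Multiplying these factors over the disjoint intervals $I_p$ and over $A \setminus A_\delta$, the exponential prefactors combine to $e^{-2h|A|} = e^{-2h \sum_l |R_l|}$, which is consistent with the equality in (5.11).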


Fig. 7. Compatible labels which are constructed from $\check\nu^1$, $\check\nu^2$ and even number of arrivals of $\xi^{g,1}$ and $\xi^{g,2}$ on intervals from $G_\delta$: (a) Original configurations $(\check\nu^1, \check\xi^1, m^1)$ and $(\check\nu^2, \check\xi^2, m^2)$. (b) Example of admissible (in the sense of event $E$) even number of arrivals of $\xi^{g,1}, \xi^{g,2}$: the circled numbers indicate total number of arrivals on the corresponding intervals

As a result, the expectation on the right-hand side of (5.9) is bounded below by

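\[
(\cosh(\delta h))^{c_3 d(u,v)}\, e^{-2h \sum_l |R_l|}\, E_{\rho^A}\Biggl( \sum_{\substack{\check\nu^1 \sim (\check\xi^1, m^1) \\ \check\nu^2 \stackrel{u,v}{\sim} (\check\xi^2, m^2)}} \mathbb{1}_{\{C^{*t}(u,v) = A\}}\, \mathbb{1}_{D_\delta^{u,v}(A)} \Biggr),
\]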

and (5.9) follows.

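Proof of Theorem A for xz-correlations. Recall the expression (3.12),
\[
\langle \hat\sigma_u^z ; \hat\sigma_v^x \rangle = -\frac{1}{Z^2}\, E\Biggl( \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \sim (\xi^2, m^2)}} \mathbb{1}_{\{\nu^2(v) = r\}}\, \mathbb{1}_{\{v \text{ is pivotal for } u \stackrel{*}{\longleftrightarrow} g\}} \Biggr).
\tag{5.12}
\]
Observe that if $v$ is pivotal for $u \stackrel{*}{\longleftrightarrow} g$ then $A^*(u,v)$ (recall that the latter notation stands for the set of points which are $*$-connected to $u$ by paths avoiding $v$) does not contain $g$, which means that there are no arrivals of $\eta^g$ on $A^*(u,v)$. At this point we may proceed exactly as in the proof of Theorem A for z-correlations.

Proof of Theorem A for x-correlations. Recall the expression (3.9),
\[
\langle \hat\sigma_u^x ; \hat\sigma_v^x \rangle = \frac{1}{Z^2}\, E\Biggl( \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \sim (\xi^2, m^2)}} \mathbb{1}_{\{(\nu^1, \nu^2) \in [(r,l),(r,l)]\}}\, \mathbb{1}_{\{v \text{ is loop pivotal for } u\}} \Biggr).
\tag{5.13}
\]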


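Observe that under the constraints on the right-hand side, if $v$ is loop-pivotal for $u$ then the set $A^*(u,v) \setminus \{u, v\}$ contains at least two disjoint components. Hence at least one of these components should be disjoint from $g$. Again, at this point we may proceed exactly as in the proof of Theorem A for z-correlations.

Implications for the ground state $\beta = \infty$. As was proved above, exponential decay of truncated two-point functions is uniform in $\beta < \infty$. Consequently, for every $N < \infty$, the limit
\[
M_{\infty,N}(h, \rho, \lambda) = \lim_{\beta \to \infty} M_{\beta,N}(h, \rho, \lambda)
\]
also satisfies (1.3) and (1.4). On the other hand, by an obvious time scaling, $M_{\infty,N}(\alpha h, \alpha\rho, \alpha\lambda) = M_{\infty,N}(h, \rho, \lambda)$ for every $\alpha > 0$. Hence,
\[
\rho \frac{\partial M_{\infty,N}}{\partial \rho} = -\lambda \frac{\partial M_{\infty,N}}{\partial \lambda} - h \frac{\partial M_{\infty,N}}{\partial h} \le -\lambda \frac{\partial M_{\infty,N}}{\partial \lambda}.
\]
Therefore, (1.3) implies that
\[
M_{\infty,N} \le h \frac{\partial M_{\infty,N}}{\partial h} + M_{\infty,N}^3 - 3 M_{\infty,N}^2\, \lambda \frac{\partial M_{\infty,N}}{\partial \lambda}.
\tag{5.14}
\]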
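The scaling step used on the way to (5.14) is an instance of Euler's relation for functions that are homogeneous of degree zero. As a brief sketch, writing $M$ for $M_{\infty,N}$ and assuming $M$ is differentiable in $(h, \rho, \lambda)$ (an assumption made here only for the illustration):

```latex
% Differentiate the scaling identity M(\alpha h, \alpha \rho, \alpha \lambda) = M(h, \rho, \lambda)
% with respect to \alpha and evaluate at \alpha = 1:
\[
0 = \frac{d}{d\alpha} M(\alpha h, \alpha \rho, \alpha \lambda) \Big|_{\alpha = 1}
  = h \frac{\partial M}{\partial h} + \rho \frac{\partial M}{\partial \rho} + \lambda \frac{\partial M}{\partial \lambda},
\]
% which rearranges to
\[
\rho \frac{\partial M}{\partial \rho} = -\lambda \frac{\partial M}{\partial \lambda} - h \frac{\partial M}{\partial h}.
\]
```

The subsequent inequality then only requires $h\, \partial M / \partial h \ge 0$, which is presumably the monotonicity of the magnetization in the field $h \ge 0$; that reading is an assumption of this sketch rather than a statement quoted from the text.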

Together with the first of (1.4) (for $M_{\infty,N}$), the inequality (5.14) sets the stage for an analysis of sharpness of the $\hat\sigma^z$ phase transition literally along the lines of [2,3].

Acknowledgements. Our proof of exponential decay is based on an argument which was developed in the classical setting together with Roberto Fernandez and Yvan Velenik (see [14]). We are grateful to Anna Levit for useful remarks and a very careful reading of the first draft of this paper.

References

1. Aizenman, M.: Geometric analysis of φ⁴ fields and Ising models. Commun. Math. Phys. 86(1), 1–48 (1982)
2. Aizenman, M., Barsky, D.J.: Sharpness of the phase transition in percolation models. Commun. Math. Phys. 108(3), 489–529 (1987)
3. Aizenman, M., Barsky, D.J., Fernández, R.: The phase transition in a general class of Ising-type models is sharp. J. Stat. Phys. 47(3-4), 343–374 (1987)
4. Aizenman, M., Fernández, R.: On the critical behavior of the magnetization in high-dimensional Ising models. J. Stat. Phys. 44(3-4), 393–454 (1986)
5. Aizenman, M., Klein, A., Newman, C.: Percolation methods for disordered quantum Ising models. In: Kotecky, R., ed., Phase Transitions: Mathematics, Physics, Biology,.., Singapore: World Scientific, 1993, pp. 1–26
6. Aizenman, M., Nachtergaele, B.: Geometric aspects of quantum spin states. Commun. Math. Phys. 164, 17–63 (1994)
7. Biskup, M., Chayes, L., Crawford, N.: Mean-field driven first-order phase transitions in systems with long-range interactions. J. Stat. Phys. 119(6), 1139–1193 (2006)
8. Björnberg, J.E., Grimmett, G.: The phase transition of the quantum Ising model is sharp. J. Stat. Phys. 136(2), 231–273 (2009)
9. Campanino, M., Klein, A., Perez, J.F.: Localization in the ground state of the Ising model with a random transverse field. Commun. Math. Phys. 135, 499–515 (1991)
10. Chayes, L., Crawford, N., Ioffe, D., Levit, A.: The phase diagram of the quantum Curie-Weiss model. J. Stat. Phys. 133(1), 131–149 (2008)
11. Ginibre, J.: Existence of phase transitions for quantum lattice systems. Commun. Math. Phys. 14, 205–234 (1969)
12. Griffiths, R.: Correlations in Ising Ferromagnets. II. J. Math. Phys. 8, 484 (1967)
13. Griffiths, R., Hurst, C., Sherman, S.: Concavity of magnetization of an Ising ferromagnet in a positive external field. J. Math. Phys. 11, 790 (1970)
14. Ioffe, D.: Stochastic geometry of classical and quantum Ising models. Lecture Notes in Mathematics 1970, Berlin-Heidelberg: Springer, 2000
15. Ioffe, D., Levit, A.: Long range order and giant components of quantum random graphs. Markov. Proc. Rel. Fields 13(3), 469–492 (2007)
16. Shlosman, S.: Signs of Ursell's functions. Commun. Math. Phys. 102(4), 679–686 (1985)

Communicated by M. Aizenman

Commun. Math. Phys. 296, 475–523 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1026-7

Communications in

Mathematical Physics

ħ-adic Quantum Vertex Algebras and Their Modules

Haisheng Li∗

Department of Mathematical Sciences, Rutgers University, Camden, NJ 08102, USA. E-mail: [email protected]

Received: 9 March 2009 / Accepted: 22 December 2009
Published online: 7 March 2010 – © Springer-Verlag 2010

Abstract: This is a paper in a series to study vertex algebra-like structures arising from various algebras including quantum affine algebras and Yangians. In this paper, we study notions of ħ-adic nonlocal vertex algebra and ħ-adic (weak) quantum vertex algebra, slightly generalizing Etingof-Kazhdan's notion of quantum vertex operator algebra. For any topologically free C[[ħ]]-module W, we study ħ-adically compatible subsets and ħ-adically S-local subsets of (End W)[[x, x^{-1}]]. We prove that any ħ-adically compatible subset generates an ħ-adic nonlocal vertex algebra with W as a module and that any ħ-adically S-local subset generates an ħ-adic weak quantum vertex algebra with W as a module. A general construction theorem of ħ-adic nonlocal vertex algebras and ħ-adic quantum vertex algebras is obtained. As an application we associate the centrally extended double Yangian of sl_2 to ħ-adic quantum vertex algebras.

∗ Partially supported by NSF grant DMS-0600189.

1. Introduction

In [EK], one of an important series of papers, Etingof and Kazhdan introduced a fundamental notion of quantum vertex operator algebra and they constructed a family of quantum vertex operator algebras which are formal deformations of vertex operator algebras associated with affine Lie algebras ŝl_{n+1}. For a quantum vertex operator algebra in this sense, the underlying space is a topologically free C[[ħ]]-module V = V^0[[ħ]] with V^0 a vector space over C, and the vertex operator map Y is a C[[ħ]]-module map from V to Hom(V, V^0((x))[[ħ]]), where the key axioms are a quasi commutativity property, called S-locality, and an associativity property. Furthermore, the S-locality by assumption is governed by a unitary rational quantum Yang-Baxter operator S(x) on V. It follows from the definition that V/ħV is an ordinary vertex algebra (over C), so quantum vertex operator algebras in this sense are formal deformations of vertex


algebras. As it was mentioned therein, a generalization of this theory to the super case is straightforward.

Inspired by [EK], in a series of papers ([Li4,Li5,Li6,KL]) we have extensively studied a notion of (weak) quantum vertex algebra, as a generalization of the notions of vertex algebra and vertex superalgebra. For a (weak) quantum vertex algebra V in this sense, the underlying space is a vector space over C and the vertex operator map Y is a linear map from V to Hom(V, V((x))), which satisfies a certain braided Jacobi identity, or equivalently an S-locality and associativity. This theory of (weak) quantum vertex algebras has many of the features of the theory of ordinary vertex (super-)algebras. For example, as it was proved in [Li4], weak quantum vertex algebras and their modules can be constructed from what were called S-local sets of vertex operators on an arbitrarily given vector space, just as vertex (super)algebras and modules can be constructed from “mutually local” vertex operators (see [Li1]). Examples of quantum vertex algebras and modules were constructed in [Li5] from Zamolodchikov-Faddeev algebras of a certain type and in [Li6] from q-versions of double Yangian DY(sl_2) with q a nonzero complex number.

In this paper, we come back to Etingof-Kazhdan's notion of quantum vertex operator algebra with a slight generalization such that the classical limits are more general quantum vertex algebras. More specifically, we systematically study notions of ħ-adic nonlocal vertex algebra and ħ-adic (weak) quantum vertex algebra, and we establish general construction theorems, with the ultimate goal to associate such ħ-adic quantum vertex algebras to centrally extended double Yangians essentially in the same way that affine Lie algebras were associated with vertex operator algebras.
An ħ-adic nonlocal vertex algebra will be a topologically free C[[ħ]]-module V equipped with a C[[ħ]]-module map Y from V to (End V)[[x, x^{-1}]] and a vector 1 ∈ V such that for every positive integer n, V/ħ^n V is a nonlocal vertex algebra over C, while an ħ-adic weak quantum vertex algebra is an ħ-adic nonlocal vertex algebra V that satisfies S-locality in the sense of [EK] with S(x) only a C[[ħ]]-module map without any other assumption. Furthermore, an ħ-adic quantum vertex algebra is an ħ-adic weak quantum vertex algebra V such that the S-locality operator S(x) is a unitary rational quantum Yang-Baxter operator and satisfies the shift condition and hexagon identity as in [EK].

For each finite-dimensional simple Lie algebra g, Drinfeld (see [Dr1]) introduced a Hopf algebra Y (g), called Yangian, as a deformation of the universal enveloping algebra U(g[t]) of Lie algebra g[t]. Then the double DY(g) of Y (g) in the sense of Drinfeld was studied in [KT]. Furthermore, centrally extended double Yangian DY (g) was studied in [Kh] and [IK], as a deformation of the universal enveloping algebra U(ĝ) of the affine Lie algebra ĝ, where a vertex operator representation was also given. Our objective is to establish a canonical association of the centrally extended double Yangians (in which the parameter ħ is a formal variable, instead of a complex number) with vertex algebra-like structures. This is our main motivation to study ħ-adic (weak) quantum vertex algebras.

In this paper we build the foundation for this theory of ħ-adic (weak) quantum vertex algebras. As one of the main results, we establish a general construction of ħ-adic weak quantum vertex algebras and their modules. This is an ħ-adic version of the general construction in [Li5] of weak quantum vertex algebras and their modules, and we here extensively use the results therein. Let W be a general C[[ħ]]-module. Consider formal series
\[
a(x) = \sum_{m \in \mathbb{Z}} a_m x^{-m-1} \in (\mathrm{End}\, W)[[x, x^{-1}]]
\]


satisfying the condition that for every w ∈ W and for every positive integer n, there exists an integer k such that a_m w ∈ ħ^n W for m ≥ k, and let E_ħ(W) consist of all such a(x). In the case that W = W^0[[ħ]] is topologically free (with W^0 a vector space over C), we have E_ħ(W) = E(W^0)[[ħ]] which is also topologically free. (Recall that for a vector space U over C, E(U) = Hom(U, U((x))).) We then study what we call ħ-adically compatible subsets and ħ-adically S-local subsets of E_ħ(W). We prove that any ħ-adically compatible subset of E_ħ(W) generates an ħ-adic nonlocal vertex algebra with W as a canonical module, while an ħ-adically S-local subset of E_ħ(W) generates an ħ-adic weak quantum vertex algebra with W as a canonical module.

The fact is that the generating functions of a centrally extended double Yangian on a highest weight module W together with the identity operator 1_W form an ħ-adically S-local subset of E_ħ(W), and hence one has an ħ-adic weak quantum vertex algebra generated by those generating functions. It was known (see [Li1,LL]) that if W is a highest weight module for the affine Lie algebra ĝ of level ℓ ∈ C, then the canonical generating functions of ĝ generate a vertex algebra, which can be identified as a so-called vacuum ĝ-module of the same level ℓ. For centrally extended double Yangians, the situation is different; the generated ħ-adic weak quantum vertex algebra is not a module for DY (g), but it is a module for a certain cover of DY(g). This is mainly due to the fact that the field associated to a Cartan element is broken into two fields in the quantum case. In this paper, we pick up the simplest case with g = sl_2 and work out the details.
More specifically, we introduce a cover DY (sl2 ) of DY(sl2 ) and by using our general construction we show that on a universal vacuum DY (sl2 )-module of a generic level (which is defined suitably), there exists a canonical ħ-adic quantum vertex algebra structure with every highest weight DY (sl2 )-module of the same level as a module. In principle, a generalization to the centrally extended double Yangian of a general finite-dimensional simple Lie algebra can be done in a similar way, but one has to deal with the complicated Serre type relations. We plan to study this in a future publication.

In this paper, we also construct a family of ħ-adic quantum vertex algebras as deformations of certain quantum vertex algebras which were studied in [KL]. Those quantum vertex algebras were constructed by using certain generalized Weyl-Clifford algebras, or Zamolodchikov-Faddeev algebras. In one special case, we obtain a quantum βγ-system, and in another we obtain a formal deformation of the vertex operator superalgebra V_L associated with the lattice L = Zα with ⟨α, α⟩ = 1.

There is a very interesting paper [AB], in which Anguelova and Bergvelt studied a broad class of vertex algebra-like structures called H_D-quantum vertex algebras, using some ideas of Borcherds from [B2], and they constructed certain interesting examples by employing Borcherds' bicharacter construction. This notion of H_D-quantum vertex algebra generalizes the notion of braided vertex operator algebra in [EK] in several aspects. For example, the braiding operator S (describing quasi locality) is allowed to have two (independent) spectral parameters, instead of one. What we here call ħ-adic quantum vertex algebras can be considered as a subfamily of H_D-quantum vertex algebras.
A drawback of this generality is that general H_D-quantum vertex algebras, just as Etingof-Kazhdan's braided vertex operator algebras, fail to satisfy the usual associativity for vertex algebras, though they do satisfy a braided associativity. On the other hand, weak quantum vertex algebras in the sense of [Li4] and ħ-adic weak quantum vertex algebras all satisfy the usual associativity, which promises a transparent representation theory. Especially, examples of ħ-adic quantum vertex algebras (and their modules) are


constructed by using vertex operators on potential modules from a representation point of view.

This paper is organized as follows: In Sect. 2, we study ħ-adic nonlocal vertex algebras and ħ-adic (weak) quantum vertex algebras and present some basic results. In Sect. 3, we present some technical results. In particular we discuss ħ-adic nonlocal vertex subalgebras. In Sect. 4, we give a general construction of ħ-adic (weak) quantum vertex algebras and their modules using ħ-adic S-local sets of (formal) vertex operators. In Sect. 5, we construct some ħ-adic quantum vertex algebras which are deformations of certain quantum vertex algebras. In Sect. 6, as an application of the general construction, we associate the centrally extended double Yangian of sl_2 with ħ-adic quantum vertex algebras.

2. ħ-adic Nonlocal Vertex Algebras and ħ-adic Weak Quantum Vertex Algebras

In this section we study the notions of ħ-adic nonlocal vertex algebra and ħ-adic (weak) quantum vertex algebra, and we present basic properties of ħ-adic nonlocal vertex algebras. The notion of ħ-adic (weak) quantum vertex algebra slightly generalizes Etingof-Kazhdan's notion of quantum vertex operator algebra.

In this paper, we use the standard formal variable notations and conventions as established in [FLM] (cf. [LL]). The scalar field will be the field C of complex numbers, N and Z_+ denote the set of nonnegative integers and the set of positive integers, respectively.

We start by recalling the notion of nonlocal vertex algebra (cf. [BK,Li2]). A nonlocal vertex algebra is a vector space V, equipped with a linear map
\[
Y : V \to \mathrm{Hom}(V, V((x))) \subset (\mathrm{End}\, V)[[x, x^{-1}]], \qquad
v \mapsto Y(v, x) = \sum_{n \in \mathbb{Z}} v_n x^{-n-1} \quad (v_n \in \mathrm{End}\, V),
\]
and equipped with a distinguished vector 1 ∈ V, satisfying the conditions that
\[
Y(\mathbf{1}, x) = 1, \tag{2.1}
\]
\[
Y(v, x)\mathbf{1} \in V[[x]] \quad \text{and} \quad \lim_{x \to 0} Y(v, x)\mathbf{1} \ (= v_{-1}\mathbf{1}) = v \quad \text{for } v \in V, \tag{2.2}
\]
and that for u, v, w ∈ V, there exists l ∈ N such that
\[
(x_0 + x_2)^l\, Y(u, x_0 + x_2) Y(v, x_2) w = (x_0 + x_2)^l\, Y(Y(u, x_0)v, x_2) w \tag{2.3}
\]
in V[[x_0^{±1}, x_2^{±1}]] (the weak associativity). For a nonlocal vertex algebra V, define a linear operator D on V by
\[
D v = v_{-2}\mathbf{1} = \lim_{x \to 0} \frac{d}{dx} Y(v, x)\mathbf{1} \quad \text{for } v \in V. \tag{2.4}
\]
We have ([Li2], Prop. 2.6)
\[
[D, Y(v, x)] = Y(Dv, x) = \frac{d}{dx} Y(v, x) \quad \text{for } v \in V, \tag{2.5}
\]
\[
e^{xD}\, Y(v, x_1)\, e^{-xD} = Y(e^{xD} v, x_1) = Y(v, x_1 + x), \tag{2.6}
\]
and furthermore,
\[
Y(v, x)\mathbf{1} = e^{xD} v. \tag{2.7}
\]
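As a short consistency check, (2.7) can be recovered from (2.1), (2.2) and (2.6). The derivation below is only a sketch of one possible route and is not claimed to be the argument of [Li2]:

```latex
% Step 1: by (2.1), Y(1, x) = 1 has only an x^0 term, so the mode 1_{-2} vanishes and
\[
D\mathbf{1} = \mathbf{1}_{-2}\mathbf{1} = 0, \qquad \text{hence } e^{-xD}\mathbf{1} = \mathbf{1}.
\]
% Step 2: apply the first and last members of (2.6) to the vector 1:
\[
e^{xD}\, Y(v, x_1)\mathbf{1} = e^{xD}\, Y(v, x_1)\, e^{-xD}\mathbf{1} = Y(v, x_1 + x)\mathbf{1}.
\]
% Step 3: by (2.2) both sides involve only nonnegative powers of x_1, so x_1 may be set to 0:
\[
e^{xD} v = Y(v, x)\mathbf{1}.
\]
```

Comparing coefficients of $x^n$ in the last line also gives $v_{-n-1}\mathbf{1} = \frac{1}{n!} D^n v$ for $n \ge 0$.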


In [Li4], the following class of nonlocal vertex algebras was singled out:

Definition 2.1. A weak quantum vertex algebra is a nonlocal vertex algebra V, satisfying S-locality: For u, v ∈ V, there exist
\[
\sum_{i=1}^{r} u^{(i)} \otimes v^{(i)} \otimes f_i(x) \in V \otimes V \otimes \mathbb{C}((x))
\]
and a nonnegative integer k such that
\[
(x_1 - x_2)^k\, Y(u, x_1) Y(v, x_2) = (x_1 - x_2)^k \sum_{i=1}^{r} f_i(x_2 - x_1)\, Y(v^{(i)}, x_2) Y(u^{(i)}, x_1), \tag{2.8}
\]
where f_i(x_2 − x_1) is to be expanded in the nonnegative powers of x_1, i.e., in view of the formal Taylor theorem,
\[
f_i(x_2 - x_1) = e^{-x_1 \frac{\partial}{\partial x_2}} f_i(x_2) \in \mathbb{C}((x_2))[[x_1]].
\]
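To make the expansion convention concrete, here is a small worked instance; the choice $f(x) = x^{-1}$ is ours and serves only as an illustration:

```latex
% Illustrative choice (not from the text): f(x) = x^{-1}.
% Using \partial_{x_2}^j x_2^{-1} = (-1)^j j!\, x_2^{-j-1}:
\[
e^{-x_1 \frac{\partial}{\partial x_2}}\, x_2^{-1}
= \sum_{j \ge 0} \frac{(-x_1)^j}{j!} \left( \frac{\partial}{\partial x_2} \right)^{j} x_2^{-1}
= \sum_{j \ge 0} x_1^{j}\, x_2^{-j-1} \in \mathbb{C}((x_2))[[x_1]].
\]
```

This is the expansion of $(x_2 - x_1)^{-1}$ in nonnegative powers of $x_1$; the expansion in nonnegative powers of $x_2$ would be a different formal series.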

Remark 2.2. Let V be a weak quantum vertex algebra. Let u, v, w ∈ V and assume that (2.8) holds. Then weak associativity relation (2.3) and (2.8) imply
\[
x_0^{-1} \delta\!\left( \frac{x_1 - x_2}{x_0} \right) Y(u, x_1) Y(v, x_2) w
- x_0^{-1} \delta\!\left( \frac{x_2 - x_1}{-x_0} \right) \sum_{i=1}^{r} f_i(-x_0)\, Y(v^{(i)}, x_2) Y(u^{(i)}, x_1) w
= x_2^{-1} \delta\!\left( \frac{x_1 - x_0}{x_2} \right) Y(Y(u, x_0)v, x_2) w
\tag{2.9}
\]

(the S-Jacobi identity). In fact, the notion of weak quantum vertex algebra can be alternatively defined by using all the axioms that define the notion of nonlocal vertex algebra except that the weak associativity axiom is replaced by the S-Jacobi identity.

Definition 2.3. Let U be a vector space. A unitary rational quantum Yang-Baxter operator (with one parameter) on U is a linear map S(x) : U ⊗ U → U ⊗ U ⊗ C((x)) satisfying the condition that
\[
S^{21}(x) S(-x) = 1, \qquad
S^{12}(x_1) S^{13}(x_1 + x_2) S^{23}(x_2) = S^{23}(x_2) S^{13}(x_1 + x_2) S^{12}(x_1),
\]
where S^{21}(x) = P S(x) P with P the permutation operator on U ⊗ U.


The following notion is essentially due to Etingof and Kazhdan [EK]:

Definition 2.4. A quantum vertex algebra is a nonlocal vertex algebra V equipped with a unitary rational quantum Yang-Baxter operator S(x) on V, satisfying the conditions that
\[
[D \otimes 1, S(x)] = -\frac{dS(x)}{dx} \quad \text{(the shift condition)}, \tag{2.10}
\]
and that for any u, v ∈ V, there exists a nonnegative integer k such that
\[
(x_1 - x_2)^k\, Y(x_1)(1 \otimes Y(x_2))(S(x_1 - x_2)(u \otimes v) \otimes w) = (x_1 - x_2)^k\, Y(x_2)(1 \otimes Y(x_1))(v \otimes u \otimes w) \tag{2.11}
\]
for all w ∈ V, and that
\[
S(x)(Y(z) \otimes 1) = (Y(z) \otimes 1) S^{23}(x) S^{13}(x + z) \tag{2.12}
\]
(the hexagon identity), where Y(x) : V ⊗ V → V((x)) is the map associated to the vertex operator map Y(·, x) : V → Hom(V, V((x))).

The following notion is due to Etingof and Kazhdan [EK]:

Definition 2.5. Let V be a nonlocal vertex algebra. For each positive integer n, define a linear map
\[
Z_n : \mathbb{C}((x_1)) \cdots ((x_n)) \otimes V^{\otimes n} \to V((x_1)) \cdots ((x_n)) \tag{2.13}
\]
by
\[
Z_n( f \otimes v^{(1)} \otimes \cdots \otimes v^{(n)} ) = f(x_1, \ldots, x_n)\, Y(v^{(1)}, x_1) \cdots Y(v^{(n)}, x_n)\mathbf{1} \tag{2.14}
\]
for v^{(1)}, …, v^{(n)} ∈ V, f ∈ C((x_1)) ⋯ ((x_n)). If all the linear maps Z_n for n ≥ 1 are injective, V is said to be nondegenerate.

The following proposition ([Li4], Theorems 4.8 and 5.11) was lifted from [EK]:

Proposition 2.6. Let V be a weak quantum vertex algebra. Assume that V is nondegenerate. Then there exists a unique linear map S(x) : V ⊗ V → V ⊗ V ⊗ C((x)) satisfying the condition that for any u, v ∈ V, there exists a nonnegative integer k such that (2.11) holds with
\[
S(x)(v \otimes u) = \sum_{i=1}^{r} v^{(i)} \otimes u^{(i)} \otimes f_i(x).
\]
Furthermore, V equipped with S(x) is a quantum vertex algebra.

Definition 2.7. Let V be a nonlocal vertex algebra. A V-module is a vector space W equipped with a linear map
\[
Y_W : V \to \mathrm{Hom}(W, W((x))) \subset (\mathrm{End}\, W)[[x, x^{-1}]], \qquad
v \mapsto Y_W(v, x) = \sum_{n \in \mathbb{Z}} v_n x^{-n-1} \quad (v_n \in \mathrm{End}\, W)
\]


satisfying the conditions that
\[
Y_W(\mathbf{1}, x) = 1_W \quad (\text{where } 1_W \text{ denotes the identity operator on } W), \tag{2.15}
\]
and that for any u, v ∈ V, w ∈ W, there exists l ∈ N such that
\[
(x_0 + x_2)^l\, Y_W(u, x_0 + x_2) Y_W(v, x_2) w = (x_0 + x_2)^l\, Y_W(Y(u, x_0)v, x_2) w. \tag{2.16}
\]
A quasi V-module is defined by using all the above axioms except that the last weak associativity axiom is replaced by a weaker axiom: For any u, v ∈ V, w ∈ W, there exists 0 ≠ p(x_1, x_2) ∈ C[x_1, x_2] such that
\[
p(x_0 + x_2, x_2)\, Y_W(u, x_0 + x_2) Y_W(v, x_2) w = p(x_0 + x_2, x_2)\, Y_W(Y(u, x_0)v, x_2) w. \tag{2.17}
\]

Next, we study ħ-adic analogues. Let ħ be a formal variable throughout this paper. A C[[ħ]]-module W is said to be torsion-free if ħw ≠ 0 for every 0 ≠ w ∈ W, and is said to be separated if ∩_{n≥1} ħ^n W = 0. For a C[[ħ]]-module W, using subsets w + ħ^n W for w ∈ W, n ≥ 1 as the basis of open sets one obtains a topology on W, which is called the ħ-adic topology. A C[[ħ]]-module W is said to be ħ-adically complete if every Cauchy sequence in W with respect to this ħ-adic topology has a limit in W. A C[[ħ]]-module W is topologically free if W = W^0[[ħ]] for some vector space W^0 over C. It is a fact that a C[[ħ]]-module is topologically free if and only if it is torsion-free, separated, and ħ-adically complete (cf. [K]).

Definition 2.8. Let W be a C[[ħ]]-module. Define E_ħ(W) to consist of each formal series
\[
a(x) = \sum_{m \in \mathbb{Z}} a_m x^{-m-1} \in (\mathrm{End}\, W)[[x, x^{-1}]]
\]
such that for every w ∈ W, a_m w → 0 as m → ∞, that is, for every positive integer n,
\[
a_m w \in \hbar^n W \quad \text{for } m \text{ sufficiently large.}
\]
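A simple example may help to see how the condition of Definition 2.8 relaxes the usual truncation condition $a(x)w \in W((x))$. The module $W = \mathbb{C}[[\hbar]]$ and the series $a(x)$ below are hypothetical choices made only for this illustration:

```latex
% Illustration (assumed data): W = C[[\hbar]], a_m = multiplication by \hbar^m for m >= 0, a_m = 0 for m < 0.
\[
a(x) = \sum_{m \ge 0} \hbar^{m} x^{-m-1} \in (\mathrm{End}\, W)[[x, x^{-1}]].
\]
% For every w in W and every positive integer n:
\[
a_m w = \hbar^{m} w \in \hbar^{n} W \quad \text{for all } m \ge n,
\]
% so a_m w -> 0 in the \hbar-adic topology as m -> \infty, although for w = 1
\[
a(x) \cdot 1 = \sum_{m \ge 0} \hbar^{m} x^{-m-1}
\]
% has infinitely many nonzero coefficients of negative powers of x.
```

Thus this $a(x)$ satisfies the condition of Definition 2.8, while $a(x) \cdot 1 \notin W((x))$. Modulo $\hbar^n$ the series truncates to finitely many negative powers of $x$.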

For every C[[ħ]]-endomorphism F of W, F preserves the submodules ħ^n W for n ∈ N, so that F gives rise to an endomorphism of W/ħ^n W for each n ∈ N. In this way, we have natural C[[ħ]]-module homomorphisms π̃_n : End W → End(W/ħ^n W) for n ∈ N. We also use π̃_n for its canonical extensions, the C[[ħ]]-module homomorphisms from (End W)[[x_1^{±1}, …, x_r^{±1}]] to (End(W/ħ^n W))[[x_1^{±1}, …, x_r^{±1}]] for r ≥ 1. In terms of the maps π̃_n we have
\[
E_\hbar(W) = \bigl\{ a(x) \in (\mathrm{End}\, W)[[x, x^{-1}]] \;\big|\; \tilde\pi_n(a(x)) \in E(W/\hbar^n W) \text{ for all } n \in \mathbb{N} \bigr\}. \tag{2.18}
\]

Definition 2.9. An ħ-adic nonlocal vertex algebra is a topologically free C[[ħ]]-module V, equipped with a C[[ħ]]-module map
\[
Y : V \to E_\hbar(V) \subset (\mathrm{End}\, V)[[x, x^{-1}]], \qquad
v \mapsto Y(v, x) = \sum_{n \in \mathbb{Z}} v_n x^{-n-1},
\]


and equipped with a distinguished vector 1 ∈ V, satisfying the conditions that
\[
Y(\mathbf{1}, x) = 1, \qquad Y(v, x)\mathbf{1} \in V[[x]] \quad \text{and} \quad \lim_{x \to 0} Y(v, x)\mathbf{1} \ (= v_{-1}\mathbf{1}) = v \quad \text{for } v \in V,
\]
and that for u, v, w ∈ V and for n ∈ N, there exists l ∈ N such that
\[
(x_0 + x_2)^l\, Y(u, x_0 + x_2) Y(v, x_2) w \equiv (x_0 + x_2)^l\, Y(Y(u, x_0)v, x_2) w \tag{2.19}
\]
modulo ħ^n V[[x_0^{±1}, x_2^{±1}]] (the ħ-adic weak associativity).

Remark 2.10. Notice that for r, s ∈ Z, the coefficient of the monomial x_0^r x_2^s in the expression Y(u, x_0 + x_2)Y(v, x_2)w is
\[
\sum_{i \ge 0} \binom{r + i}{i}\, u_{-r-1-i}\, v_{-s-1+i}\, w,
\]

which is an infinite sum in general, even though it converges to an element of V. This is one of the few places where V needs to be ħ-adically complete.

The following is a characterization of an ħ-adic nonlocal vertex algebra in terms of nonlocal vertex algebras over C:

Proposition 2.11. Let V be a topologically free C[[ħ]]-module, equipped with a vector 1 ∈ V and a C[[ħ]]-module map Y from V to (End V)[[x, x^{-1}]]. Then (V, Y, 1) carries the structure of an ħ-adic nonlocal vertex algebra if and only if for every n ∈ N, (V/ħ^n V, Y^{(n)}, 1 + ħ^n V) is a nonlocal vertex algebra over C, where Y^{(n)} : V/ħ^n V → (End(V/ħ^n V))[[x, x^{-1}]] is the canonical map reduced from Y.

Proof. From the definition it is clear that if (V, Y, 1) is an ħ-adic nonlocal vertex algebra, V/ħ^n V is a nonlocal vertex algebra over C for every n ∈ N. Now, assume that for every n ∈ N, V/ħ^n V is a nonlocal vertex algebra over C. For v ∈ V, we have π̃_n(Y(v, x)) ∈ E(V/ħ^n V) for n ∈ N. Thus Y(v, x) ∈ E_ħ(V). On the other hand, for each n ∈ N, with 1 + ħ^n V being the vacuum vector of V/ħ^n V, we have Y(1, x)v − v ∈ ħ^n V[[x, x^{-1}]] for v ∈ V. Because V is separated, we must have Y(1, x)v − v = 0. Similarly, we can show Y(v, x)1 ∈ V[[x]] and lim_{x→0} Y(v, x)1 = v. The weak associativity of V/ħ^n V for n ∈ N exactly amounts to the ħ-adic weak associativity. Then (V, Y, 1) is an ħ-adic nonlocal vertex algebra.

Remark 2.12. Let V be an ħ-adic nonlocal vertex algebra. We have a projective inverse system of nonlocal vertex algebras over C (or over C[[ħ]]):
\[
0 \leftarrow V/\hbar V \leftarrow V/\hbar^2 V \leftarrow V/\hbar^3 V \leftarrow \cdots, \tag{2.20}
\]
and the ħ-adic nonlocal vertex algebra V can be considered as an inverse limit.

Using Proposition 2.11 (and the arguments in the proof) we immediately have:

Lemma 2.13. Let V be an ħ-adic nonlocal vertex algebra. Define D ∈ End V by
\[
D(v) = v_{-2}\mathbf{1} \quad \text{for } v \in V. \tag{2.21}
\]
Then
\[
[D, Y(v, x)] = Y(Dv, x) = \frac{d}{dx} Y(v, x) \quad \text{for } v \in V.
\]


Let V be an ħ-adic nonlocal vertex algebra. Following [EK], let
\[
Y(x) : V \hat\otimes V \to V[[x, x^{-1}]]
\]
be the C[[ħ]]-module map associated to the vertex operator map Y : V → (End V)[[x, x^{-1}]], where here and henceforth V ⊗̂ V and V ⊗̂ V ⊗̂ C((x))[[ħ]] stand for the ħ-adically completed tensor products. If V = V^0[[ħ]] with V^0 a C-vector space, we have
\[
V \hat\otimes V = (V^0 \otimes V^0)[[\hbar]] \quad \text{and} \quad V \hat\otimes V \hat\otimes \mathbb{C}((x))[[\hbar]] = (V^0 \otimes V^0 \otimes \mathbb{C}((x)))[[\hbar]].
\]

Definition 2.14. Let V be an ħ-adic nonlocal vertex algebra. Define a C[[ħ]]-module map
\[
Y(x_1, x_2) : V \hat\otimes V \to (\mathrm{End}\, V)[[x_1^{\pm 1}, x_2^{\pm 1}]]
\]
by
\[
Y(x_1, x_2)(u \otimes v)(w) = Y(x_1)(1 \otimes Y(x_2))(u \otimes v \otimes w) = Y(u, x_1) Y(v, x_2) w \tag{2.22}
\]
for u, v, w ∈ V.

Definition 2.15. An ħ-adic weak quantum vertex algebra is an ħ-adic nonlocal vertex algebra V which satisfies ħ-adic S-locality: For u, v ∈ V, there exists
\[
F(u, v, x) \in V \hat\otimes V \hat\otimes \mathbb{C}((x))[[\hbar]]
\]
satisfying the condition that for every positive integer n, there exists k ∈ N such that
\[
(x_1 - x_2)^k\, Y(u, x_1) Y(v, x_2) w \equiv (x_1 - x_2)^k\, Y(x_2)(1 \otimes Y(x_1))(F(u, v, x_2 - x_1) \otimes w) \tag{2.23}
\]
modulo ħ^n V[[x_1^{±1}, x_2^{±1}]] for all w ∈ V.

Remark 2.16. Let V be an ħ-adic weak quantum vertex algebra. We see that for every positive integer n, V/ħ^n V is a weak quantum vertex algebra over C. For u, v, w ∈ V, by Remark 2.2 we have
\[
x_0^{-1} \delta\!\left( \frac{x_1 - x_2}{x_0} \right) Y(u, x_1) Y(v, x_2) w
- x_0^{-1} \delta\!\left( \frac{x_2 - x_1}{-x_0} \right) Y(x_2)(1 \otimes Y(x_1))(F(u, v, -x_0) \otimes w)
\equiv x_2^{-1} \delta\!\left( \frac{x_1 - x_0}{x_2} \right) Y(Y(u, x_0)v, x_2) w
\tag{2.24}
\]
modulo ħ^n V[[x_0^{±1}, x_1^{±1}, x_2^{±1}]]. Since V is ħ-adically complete, the coefficient of each monomial x_0^r x_1^p x_2^q for r, p, q ∈ Z in each of the three main terms is an element of V. As ∩_{n≥1} ħ^n V = 0, we obtain
\[
x_0^{-1} \delta\!\left( \frac{x_1 - x_2}{x_0} \right) Y(u, x_1) Y(v, x_2) w
- x_0^{-1} \delta\!\left( \frac{x_2 - x_1}{-x_0} \right) Y(x_2)(1 \otimes Y(x_1))(F(u, v, -x_0) \otimes w)
= x_2^{-1} \delta\!\left( \frac{x_1 - x_0}{x_2} \right) Y(Y(u, x_0)v, x_2) w
\tag{2.25}
\]


(the S-Jacobi identity). Clearly, the S-Jacobi identity is equivalent to ħ-adic weak associativity and ħ-adic S-locality. In view of this, the notion of ħ-adic weak quantum vertex algebra can be defined alternatively by using the S-Jacobi identity.

Definition 2.17. Let U be a C[[ħ]]-module and let r be a positive integer. For A(x_1, …, x_r), B(x_1, …, x_r) ∈ U[[x_1^{±1}, …, x_r^{±1}]], we write A ∼_± B if for every positive integer n, there exists a (nonzero) polynomial p(x_1, …, x_r) ∈ ⟨x_i ± x_j | 1 ≤ i < j ≤ r⟩ ⊂ C[x_1, …, x_r] such that
\[
p(x_1, \ldots, x_r)\bigl( A(x_1, \ldots, x_r) - B(x_1, \ldots, x_r) \bigr) \in \hbar^n U[[x_1^{\pm 1}, \ldots, x_r^{\pm 1}]]. \tag{2.26}
\]
Clearly, relations ∼_± on U[[x_1^{±1}, …, x_r^{±1}]] are equivalence relations. It is also clear that the left multiplication by a Laurent polynomial and the formal partial differential operators ∂/∂x_1, …, ∂/∂x_r preserve the equivalence relations. For convenience, we shall also use the notation ∼ for ∼_−. If V is an ħ-adic nonlocal vertex algebra, for u, v, w ∈ V we have
\[
Y(u, x_0 + x_2) Y(v, x_2) w \sim_+ Y(Y(u, x_0)v, x_2) w
\]
in V[[x_0^{±1}, x_2^{±1}]].
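A standard example shows that the relation ∼ of Definition 2.17 is in general strictly weaker than equality. The sketch below takes $U = \mathbb{C}[[\hbar]]$ and $r = 2$ (our choices, for illustration only) and uses the formal delta function $\delta(z) = \sum_{n \in \mathbb{Z}} z^n$:

```latex
% Illustration (assumed data): U = C[[\hbar]], r = 2, \delta(z) = \sum_{n \in Z} z^n.
\[
A(x_1, x_2) = x_1^{-1} \delta\!\left( \frac{x_2}{x_1} \right) = \sum_{n \in \mathbb{Z}} x_1^{-n-1} x_2^{n},
\qquad B(x_1, x_2) = 0.
\]
% Multiplying by p(x_1, x_2) = x_1 - x_2 and shifting the summation index in the second sum:
\[
(x_1 - x_2)\, A(x_1, x_2) = \sum_{n \in \mathbb{Z}} x_1^{-n} x_2^{n} - \sum_{n \in \mathbb{Z}} x_1^{-n-1} x_2^{n+1} = 0 .
\]
```

So $A \sim 0$ (the same polynomial works for every $n$), although $A \neq 0$. This is possible because $U[[x_1^{\pm 1}, x_2^{\pm 1}]]$ is not a module over a field containing $x_1 - x_2$, in contrast with the iterated Laurent series spaces where ∼ reduces to equality.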

Furthermore, if V is an ħ-adic weak quantum vertex algebra, for any u, v ∈ V, there exists
\[
F(u, v, x) \in V \hat\otimes V \hat\otimes \mathbb{C}((x))[[\hbar]]
\]
such that Y(u, x_1)Y(v, x_2) ∼ Y(x_2, x_1)F(u, v, x_2 − x_1) in (End V)[[x_1^{±1}, x_2^{±1}]].

Remark 2.18. Note that the equivalence relations ∼_± when restricted to certain subspaces of U[[x_1^{±1}, …, x_r^{±1}]] amount to the equality relation. For example, let U = U^0[[ħ]] be a topologically free C[[ħ]]-module. For A(x_1, …, x_r), B(x_1, …, x_r) ∈ (U^0((x_1)) ⋯ ((x_r)))[[ħ]], if A ∼_± B, we must have A = B. This is simply because (U^0((x_1)) ⋯ ((x_r)))[[ħ]] is a vector space over the field C((x_1)) ⋯ ((x_r)) which contains x_i ± x_j for 1 ≤ i < j ≤ r.

Proposition 2.19. Let V be an ħ-adic nonlocal vertex algebra and let u, v ∈ V, A(x) ∈ V ⊗̂ V ⊗̂ C((x))[[ħ]]. Then
\[
Y(u, x_1) Y(v, x_2) \sim Y(x_2, x_1)(A(x_2 - x_1))
\]
if and only if
\[
Y(u, x)v = e^{xD}\, Y(-x)(A(-x)). \tag{2.27}
\]


Furthermore, V is an ħ-adic weak quantum vertex algebra if and only if there exists a C[[ħ]]-module map
\[
S(x) : V \hat\otimes V \to V \hat\otimes V \hat\otimes \mathbb{C}((x))[[\hbar]]
\]
such that
\[
Y(u, x)v = e^{xD}\, Y(-x) S(-x)(v \otimes u) \quad \text{for } u, v \in V. \tag{2.28}
\]

Proof. We only need to prove the first part. Note that for every positive integer n, V/ħ^n V is a nonlocal vertex algebra over C. By Corollary 5.3 of [Li4], there exists a nonnegative integer k such that
\[
(x_1 - x_2)^k\, Y(u, x_1) Y(v, x_2) w \equiv (x_1 - x_2)^k\, Y(x_2, x_1)(A(x_2 - x_1)) w \mod \hbar^n V
\]
for all w ∈ V if and only if
\[
Y(u, x)v \equiv e^{xD}\, Y(-x)(A(-x)) \mod \hbar^n V.
\]
Since V is ħ-adically complete, the coefficient of each power of x in e^{xD} Y(−x)(A(−x)) is an element of V. As ∩_{n≥1} ħ^n V = 0, the relations Y(u, x)v ≡ e^{xD} Y(−x)(A(−x)) mod ħ^n V for all n ≥ 1 amount to (2.27).

The following is a reformulation and a slight generalization of Etingof and Kazhdan's notion of quantum vertex operator algebra (see [EK]):

Definition 2.20. An ħ-adic quantum vertex algebra is an ħ-adic nonlocal vertex algebra V equipped with a C[[ħ]]-module map
\[
S(x) : V \hat\otimes V \to V \hat\otimes V \hat\otimes \mathbb{C}((x))[[\hbar]], \tag{2.29}
\]
which satisfies the shift condition:
\[
[D \otimes 1, S(x)] = -\frac{dS(x)}{dx}, \tag{2.30}
\]
the quantum Yang-Baxter equation:
\[
S^{12}(x_1) S^{13}(x_1 + x_2) S^{23}(x_2) = S^{23}(x_2) S^{13}(x_1 + x_2) S^{12}(x_1), \tag{2.31}
\]
and the unitarity condition:
\[
S^{21}(x) S(-x) = 1, \tag{2.32}
\]
subject to the following axioms:

(QA1) The ħ-adic S-locality: For any u, v ∈ V and for any positive integer n, there exists k ≥ 0 such that for any w ∈ V the series (x_1 − x_2)^k Y(x_1)(1 ⊗ Y(x_2))(S(x_1 − x_2)(u ⊗ v) ⊗ w) and (x_1 − x_2)^k Y(x_2)(1 ⊗ Y(x_1))(v ⊗ u ⊗ w) coincide modulo ħ^n V[[x_1^{±1}, x_2^{±1}]].

(QA4) The hexagon identity:
\[
S(x_1)(Y(x_2) \otimes 1) = (Y(x_2) \otimes 1) S^{23}(x_1) S^{13}(x_1 + x_2). \tag{2.33}
\]


Let $V$ be an $\hbar$-adic nonlocal vertex algebra. For a positive integer $n$, we define a $\mathbb{C}[[\hbar]]$-module map
$$Z_n^V : V^{\hat{\otimes} n} \hat{\otimes} \mathbb{C}((x_1)) \cdots ((x_n))[[\hbar]] \to V[[x_1^{\pm 1}, \ldots, x_n^{\pm 1}]]$$
as in Definition 2.5. Recall that $V/\hbar V$ is a nonlocal vertex algebra over $\mathbb{C}$.

Lemma 2.21. Let $V = V^0[[\hbar]]$ be an $\hbar$-adic nonlocal vertex algebra such that the nonlocal vertex algebra $V/\hbar V$ over $\mathbb{C}$ is nondegenerate. Then for every positive integer $n$, the $\mathbb{C}[[\hbar]]$-module map $Z_n^V$ is injective.

Proof. Let $n$ be any positive integer and let $0 \ne A \in V^{\hat{\otimes} n} \hat{\otimes} \mathbb{C}((x_1)) \cdots ((x_n))[[\hbar]]$. Then
$$A = \hbar^k (A_0 + \hbar A_1 + \hbar^2 A_2 + \cdots)$$
for some $k \in \mathbb{N}$, $A_i \in (V^0)^{\otimes n} \otimes \mathbb{C}((x_1)) \cdots ((x_n))$ with $A_0 \ne 0$. Writing $Y(x) = Y_0(x) + \hbar Y_1(x) + \hbar^2 Y_2(x) + \cdots$ with $Y_i \in (\mathrm{End}\, V^0)[[x, x^{-1}]]$ for $i \ge 0$, we have
$$Y(x_1)(1 \otimes Y(x_2)) \cdots (1^{\otimes (n-1)} \otimes Y(x_n)) A = \hbar^k\, Y_0(x_1)(1 \otimes Y_0(x_2)) \cdots (1^{\otimes (n-1)} \otimes Y_0(x_n))(A_0) + O(\hbar^{k+1}).$$
As $V/\hbar V$ is nondegenerate, we have
$$Y_0(x_1)(1 \otimes Y_0(x_2)) \cdots (1^{\otimes (n-1)} \otimes Y_0(x_n))(A_0) \ne 0,$$
so that $Z_n^V(A) \ne 0$. This proves that $Z_n^V$ is injective.

The following is a reformulation of Proposition 1.11 of [EK]:

Proposition 2.22. Let $V$ be an $\hbar$-adic weak quantum vertex algebra such that the nonlocal vertex algebra $V/\hbar V$ over $\mathbb{C}$ is nondegenerate. Then $S$-locality defines a unique $\mathbb{C}[[\hbar]]$-module map
$$S(x) : V \hat{\otimes} V \to V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]] \tag{2.34}$$
with $S(x)(u \otimes v) = F(v, u, x)$ for $u, v \in V$ as in Definition 2.15, and $(V, Y, 1, S)$ carries the structure of an $\hbar$-adic quantum vertex algebra.

Next, we study modules and quasi modules for $\hbar$-adic nonlocal vertex algebras.

Definition 2.23. Let $V$ be an $\hbar$-adic nonlocal vertex algebra. A $V$-module is a topologically free $\mathbb{C}[[\hbar]]$-module $W$, equipped with a $\mathbb{C}[[\hbar]]$-module map
$$Y_W : V \to \mathcal{E}_\hbar(W) \subset (\mathrm{End}\, W)[[x, x^{-1}]],$$
satisfying the conditions that $Y_W(1, x) = 1_W$ and that for $u, v \in V$, $w \in W$ and for every positive integer $n$, there exists $l \in \mathbb{N}$ such that
$$(x_0 + x_2)^l Y_W(u, x_0 + x_2) Y_W(v, x_2) w \equiv (x_0 + x_2)^l Y_W(Y(u, x_0)v, x_2) w \tag{2.35}$$
modulo $\hbar^n W[[x_0^{\pm 1}, x_2^{\pm 1}]]$.

We define a notion of quasi $V$-module by replacing the $\hbar$-adic weak associativity (2.35) with the following axiom: For $u, v \in V$, $w \in W$ and for every positive integer $n$, there exists $0 \ne p(x_1, x_2) \in \mathbb{C}[x_1, x_2]$ such that
$$p(x_0 + x_2, x_2) Y_W(u, x_0 + x_2) Y_W(v, x_2) w \equiv p(x_0 + x_2, x_2) Y_W(Y(u, x_0)v, x_2) w \tag{2.36}$$
modulo $\hbar^n W[[x_0^{\pm 1}, x_2^{\pm 1}]]$.


We have the following straightforward analogue of Proposition 2.11:

Proposition 2.24. Let $V$ be an $\hbar$-adic nonlocal vertex algebra, let $W$ be a topologically free $\mathbb{C}[[\hbar]]$-module, and let $Y_W$ be a $\mathbb{C}[[\hbar]]$-module map from $V$ to $(\mathrm{End}\, W)[[x, x^{-1}]]$. Then $(W, Y_W)$ is a (quasi) $V$-module if and only if for every positive integer $n$, $W/\hbar^n W$ is a (quasi) $V/\hbar^n V$-module.

The following is an $\hbar$-adic version of a result of [Li4]:

Proposition 2.25. Let $V$ be an $\hbar$-adic nonlocal vertex algebra, let
$$m \in \mathbb{Z}, \quad u, v, c^{(0)}, c^{(1)}, \ldots \in V, \quad A(x) \in V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]]$$
such that $\lim_{j \to \infty} c^{(j)} = 0$, and let $(W, Y_W)$ be a $V$-module. If
$$(x_1 - x_2)^m Y(u, x_1) Y(v, x_2) - (-x_2 + x_1)^m Y(x_2, x_1)(A(x)) = \sum_{j \ge 0} Y(c^{(j)}, x_2) \frac{1}{j!} \left(\frac{\partial}{\partial x_2}\right)^j x_2^{-1} \delta\left(\frac{x_1}{x_2}\right), \tag{2.37}$$
then
$$(x_1 - x_2)^m Y_W(u, x_1) Y_W(v, x_2) - (-x_2 + x_1)^m Y_W(x_2, x_1)(A(x)) = \sum_{j \ge 0} Y_W(c^{(j)}, x_2) \frac{1}{j!} \left(\frac{\partial}{\partial x_2}\right)^j x_2^{-1} \delta\left(\frac{x_1}{x_2}\right). \tag{2.38}$$
If $(W, Y_W)$ is a faithful $V$-module, the converse also holds.

Proof. For any positive integer $n$, $V/\hbar^n V$ is a nonlocal vertex algebra over $\mathbb{C}$ and $W/\hbar^n W$ is a $(V/\hbar^n V)$-module. It follows from [Li4] (Prop. 6.7) that after being applied to a vector in $W$, (2.38) holds modulo $\hbar^n W$. With $W$ topologically free, $W$ is $\hbar$-adically complete and separated. Then (2.38) must hold. For the converse, for any positive integer $n$, denote by $\rho_n$ the $\mathbb{C}[[\hbar]]$-module map from $V$ to $\mathcal{E}(W/\hbar^n W)$. Clearly, $\hbar^n V \subset \ker \rho_n$. Then $V/\ker \rho_n$ is a nonlocal vertex algebra over $\mathbb{C}$ and $W/\hbar^n W$ is a faithful $(V/\ker \rho_n)$-module. Again, from [Li4] (Prop. 6.7), (2.37) holds modulo $\ker \rho_n$. For $v \in \cap_{n \ge 1} \ker \rho_n$, with $W$ separated we have $Y_W(v, x) = 0$. As $W$ is faithful, we must have $\cap_{n \ge 1} \ker \rho_n = 0$. Since $V$ is $\hbar$-adically complete, (2.37) holds on $V$.

We shall also need the following variation:

Proposition 2.26. Let $V$ be an $\hbar$-adic nonlocal vertex algebra, let
$$m \in \mathbb{Z}, \quad u, v, c^{(0)}, c^{(1)}, \ldots \in V, \quad A(x) \in V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]]$$
such that $\lim_{j \to \infty} c^{(j)} = 0$, and let $(W, Y_W)$ be a $V$-module. If
$$(x_1 - x_2)^m Y(u, x_1) Y(v, x_2) - (-x_2 + x_1)^m Y(x_2, x_1)(A(x)) = \sum_{j \ge 0} Y(c^{(j)}, x_1) \frac{1}{j!} \left(\frac{\partial}{\partial x_2}\right)^j x_2^{-1} \delta\left(\frac{x_1}{x_2}\right) \tag{2.39}$$


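(note that the only change is on the variable $x$ for vertex operators $Y(c^{(j)}, x)$), then
$$(x_1 - x_2)^m Y_W(u, x_1) Y_W(v, x_2) - (-x_2 + x_1)^m Y_W(x_2, x_1)(A(x)) = \sum_{j \ge 0} Y_W(c^{(j)}, x_1) \frac{1}{j!} \left(\frac{\partial}{\partial x_2}\right)^j x_2^{-1} \delta\left(\frac{x_1}{x_2}\right). \tag{2.40}$$
If $(W, Y_W)$ is a faithful $V$-module, the converse also holds.

Proof. Let $(U, Y_U)$ be a $V$-module, e.g., $U = V$ or $U = W$. For any $v \in V$, $w \in U$ and for any $n \ge 1$, with $U/\hbar^n U$ a module for $V/\hbar^n V$ viewed as a nonlocal vertex algebra over $\mathbb{C}$, we have
$$Y_U(Dv, x) w \equiv \frac{d}{dx} Y_U(v, x) w \pmod{\hbar^n U}.$$
Since $U$ is separated, we have
$$Y_U(Dv, x) w = \frac{d}{dx} Y_U(v, x) w.$$
Using this we get
$$\begin{aligned}
\sum_{j \ge 0} Y_U(c^{(j)}, x_1) \frac{1}{j!} \left(\frac{\partial}{\partial x_2}\right)^j x_2^{-1} \delta\left(\frac{x_1}{x_2}\right)
&= \sum_{j \ge 0} \frac{1}{j!} \left(\frac{\partial}{\partial x_2}\right)^j \left[ Y_U(c^{(j)}, x_2)\, x_2^{-1} \delta\left(\frac{x_1}{x_2}\right) \right] \\
&= \sum_{j \ge 0} \sum_{i=0}^{j} \frac{1}{(j-i)!} \left[ \left(\frac{\partial}{\partial x_2}\right)^{j-i} Y_U(c^{(j)}, x_2) \right] \frac{1}{i!} \left(\frac{\partial}{\partial x_2}\right)^i x_2^{-1} \delta\left(\frac{x_1}{x_2}\right) \\
&= \sum_{j \ge 0} \sum_{i=0}^{j} \frac{1}{(j-i)!\, i!}\, Y_U(D^{j-i} c^{(j)}, x_2) \left(\frac{\partial}{\partial x_2}\right)^i x_2^{-1} \delta\left(\frac{x_1}{x_2}\right) \\
&= \sum_{i \ge 0} \sum_{r \ge 0} \frac{1}{r!\, i!}\, Y_U(D^r c^{(r+i)}, x_2) \left(\frac{\partial}{\partial x_2}\right)^i x_2^{-1} \delta\left(\frac{x_1}{x_2}\right).
\end{aligned}$$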
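To see the mechanism of the computation above in the simplest nontrivial case, one can isolate the $j = 1$ term. The sketch below uses only the substitution property of the formal delta function, $f(x_1)\, x_2^{-1}\delta(x_1/x_2) = f(x_2)\, x_2^{-1}\delta(x_1/x_2)$ (a standard fact, assumed here rather than quoted from the text), together with $Y_U(Dv, x) = \frac{d}{dx} Y_U(v, x)$:

```latex
\begin{align*}
Y_U(c^{(1)}, x_1)\,\frac{\partial}{\partial x_2}\Big(x_2^{-1}\delta\Big(\frac{x_1}{x_2}\Big)\Big)
 &= \frac{\partial}{\partial x_2}\Big( Y_U(c^{(1)}, x_1)\, x_2^{-1}\delta\Big(\frac{x_1}{x_2}\Big)\Big) \\
 &= \frac{\partial}{\partial x_2}\Big( Y_U(c^{(1)}, x_2)\, x_2^{-1}\delta\Big(\frac{x_1}{x_2}\Big)\Big) \\
 &= Y_U(Dc^{(1)}, x_2)\, x_2^{-1}\delta\Big(\frac{x_1}{x_2}\Big)
   + Y_U(c^{(1)}, x_2)\,\frac{\partial}{\partial x_2}\Big(x_2^{-1}\delta\Big(\frac{x_1}{x_2}\Big)\Big).
\end{align*}
```

The two terms on the last line are the $(i, r) = (0, 1)$ and $(i, r) = (1, 0)$ contributions of $c^{(1)}$ to the final double sum.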

Notice that for $i \ge 0$, $\sum_{r \ge 0} \frac{1}{r!} D^r c^{(r+i)} \in V$ (as $V$ is $\hbar$-adically complete). Then the assertion follows from Proposition 2.25.

3. Some Technical Results

In this section we present certain technical results which we need in Sect. 4. In particular, we study $\hbar$-adic nonlocal vertex subalgebras and subalgebras generated by subsets of an $\hbar$-adic nonlocal vertex algebra.

Definition 3.1. Let $V$ be an $\hbar$-adic nonlocal vertex algebra. An $\hbar$-adic nonlocal vertex subalgebra is a $\mathbb{C}[[\hbar]]$-submodule $U$ containing $1$ such that $(U, Y, 1)$ carries the structure of an $\hbar$-adic nonlocal vertex algebra. In particular, $U$ is a topologically free submodule.


We say that a $\mathbb{C}[[\hbar]]$-submodule $U$ of an $\hbar$-adic nonlocal vertex algebra $V$ is $Y$-closed if $u_m v \in U$ for all $u, v \in U$, $m \in \mathbb{Z}$. (This is to distinguish the algebraic closedness from the topological closedness.)

Remark 3.2. Let $U$ be a $\mathbb{C}[[\hbar]]$-submodule of a topologically free $\mathbb{C}[[\hbar]]$-module $V$. With $\hbar^n U \subset U \cap \hbar^n V$ for $n \in \mathbb{N}$, we see that the induced topology on $U$ from $V$ (with the $\hbar$-adic topology) coincides with the $\hbar$-adic topology of $U$ if and only if for any $n \in \mathbb{N}$, there exists $k \in \mathbb{N}$ such that $U \cap \hbar^k V \subset \hbar^n U$.

Proposition 3.3. Let $V$ be an $\hbar$-adic nonlocal vertex algebra and let $U$ be a $\mathbb{C}[[\hbar]]$-submodule satisfying the conditions that $1 \in U$, $U$ is $Y$-closed, and that the induced topology on $U$ from $V$ coincides with its own $\hbar$-adic topology. In addition we assume that $U$ is $\hbar$-adically complete. Then $U$ is an $\hbar$-adic nonlocal vertex subalgebra of $V$.

Proof. Notice that as a submodule of $V$, $U$ is torsion-free and separated. Since $U$ is also $\hbar$-adically complete, $U$ is topologically free. Let $u, v \in U$ and $n \in \mathbb{N}$. By assumption, there exists $k \in \mathbb{N}$ such that $U \cap \hbar^k V \subset \hbar^n U$. With $u, v \in U \subset V$ and $k \in \mathbb{N}$ being fixed, there exists $r \in \mathbb{N}$ such that $u_m v \in \hbar^k V$ for $m \ge r$. Then
$$u_m v \in U \cap \hbar^k V \subset \hbar^n U \quad \text{for } m \ge r.$$
That is, $Y(u + \hbar^n U, x)(v + \hbar^n U) \in (U/\hbar^n U)((x))$. Furthermore, let $w \in U$. By $\hbar$-adic weak associativity, there exists $l \in \mathbb{N}$ such that
$$(x_0 + x_2)^l Y(u, x_0 + x_2) Y(v, x_2) w \equiv (x_0 + x_2)^l Y(Y(u, x_0)v, x_2) w \pmod{\hbar^k V}.$$
Because $U$ is $Y$-closed and $\hbar$-adically complete, and because $U \cap \hbar^k V \subset \hbar^n U$, we have
$$(x_0 + x_2)^l Y(u, x_0 + x_2) Y(v, x_2) w \equiv (x_0 + x_2)^l Y(Y(u, x_0)v, x_2) w \pmod{\hbar^n U}.$$
Now, $(U, Y, 1)$ satisfies all the axioms for an $\hbar$-adic nonlocal vertex algebra.

Definition 3.4. Let $M$ be a $\mathbb{C}[[\hbar]]$-module. For any $\mathbb{C}[[\hbar]]$-submodule $K$, we define
$$[K] = \{ w \in M \mid \hbar^n w \in K \text{ for some } n \in \mathbb{N} \}. \tag{3.1}$$
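A minimal example (added for illustration; the module $M$ and submodule $K$ below are choices made here, not taken from the text) shows that $[K]$ can be strictly larger than $K$:

```latex
% Illustrative choice: M = C[[hbar]] as a module over itself, K = hbar M.
\begin{align*}
M &= \mathbb{C}[[\hbar]], \qquad K = \hbar\,\mathbb{C}[[\hbar]],\\
[K] &= \{\, w \in M \mid \hbar^n w \in K \text{ for some } n \in \mathbb{N} \,\} = M
   \quad (\text{take } n = 1),\\
K \cap \hbar^n M &= \hbar^n M \ \ne\ \hbar^{n+1} M = \hbar^n K \qquad (n \ge 1).
\end{align*}
```

Here $[K] \ne K$, and correspondingly $K \cap \hbar^n M$ and $\hbar^n K$ differ. By contrast, $[M] = M$ and $[0] = 0$ in this torsion-free module.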

The following two lemmas are straightforward:

Lemma 3.5. Let $M$ be a $\mathbb{C}[[\hbar]]$-module and let $K$ be a $\mathbb{C}[[\hbar]]$-submodule such that $[K] = K$. Then $K \cap \hbar^n M = \hbar^n K$ for all $n \in \mathbb{N}$. In particular, the induced topology on $K$ from $M$ (with the $\hbar$-adic topology) coincides with the $\hbar$-adic topology of $K$.

Lemma 3.6. Let $V$ be an $\hbar$-adic nonlocal vertex algebra and let $K$ be a $Y$-closed $\mathbb{C}[[\hbar]]$-submodule. Then both $[K]$ and the $\hbar$-adic completion $\overline{K}$ of $K$ in $V$ are $Y$-closed.

Furthermore, we have:

Proposition 3.7. Let $V$ be an $\hbar$-adic nonlocal vertex algebra and let $K$ be a $Y$-closed $\mathbb{C}[[\hbar]]$-submodule containing $1$. Then $[\overline{[K]}] = \overline{[K]}$ and $\overline{[K]}$ is an $\hbar$-adic nonlocal vertex subalgebra of $V$.


Proof. Since $[[K]] = [K]$, by Lemma 3.5 we have $[K] \cap \hbar^n V = \hbar^n [K]$ for $n \in \mathbb{N}$. Then $\overline{[K]}$ is $\hbar$-adically complete with respect to its own $\hbar$-adic topology, so $\overline{[K]}$ is topologically free. From Lemma 3.6, $\overline{[K]}$ is $Y$-closed. If we can prove $[\overline{[K]}] = \overline{[K]}$, then by Proposition 3.3, $\overline{[K]}$ is an $\hbar$-adic nonlocal vertex subalgebra of $V$.

Let $u \in [\overline{[K]}]$. By definition, there exists $k \in \mathbb{N}$ such that $\hbar^k u \in \overline{[K]}$. Furthermore, there exists a Cauchy sequence $\{a_n\}$ in $[K]$ with $\hbar^k u$ as the limit. Then there exists $r \ge 1$ such that $a_n - \hbar^k u \in \hbar^k V$ for $n \ge r$. As $V$ is torsion-free, for each $n \ge r$, there exists a unique $b_n \in V$ such that $a_n = \hbar^k b_n$. As $[[K]] = [K]$ and $\hbar^k b_n = a_n \in [K]$, we have $b_n \in [K]$ for $n \ge r$. Using the fact that $V$ is torsion-free, we see that $\{b_n\}_{n \ge r}$ is a Cauchy sequence in $[K]$, converging to $u$. Thus $u \in \overline{[K]}$. This proves $[\overline{[K]}] = \overline{[K]}$, concluding the proof.

Now, let $U$ be a subset of an $\hbar$-adic nonlocal vertex algebra $V$. Let $U^{(1)}$ be the $\mathbb{C}[[\hbar]]$-span of $U \cup \{1\}$ and then inductively define $U^{(n+1)}$ for $n \ge 1$ to be the $\mathbb{C}[[\hbar]]$-span of the vectors $a_m b$ for $a, b \in U^{(n)}$, $m \in \mathbb{Z}$. From the definition we have $1 \in U^{(n)} \subset U^{(n+1)}$ for $n \ge 1$. Set
$$U^o = [\cup_{n \ge 1} U^{(n)}] \subset V \tag{3.2}$$
and furthermore, set
$$\langle U \rangle = \overline{U^o} \quad \text{(the $\hbar$-adic completion)}. \tag{3.3}$$

We have:

Proposition 3.8. Let $V$ be an $\hbar$-adic nonlocal vertex algebra and let $U$ be a subset of $V$. Then $\langle U \rangle$ is an $\hbar$-adic nonlocal vertex subalgebra satisfying the condition that $U \subset \langle U \rangle$ and $[\langle U \rangle] = \langle U \rangle$. Furthermore, any $\hbar$-adic nonlocal vertex subalgebra $H$ which satisfies the condition that $U \subset H$ and $[H] = H$ contains $\langle U \rangle$.

Proof. It follows from the definition that $\cup_{n \ge 1} U^{(n)}$ is $Y$-closed and contains $\{1\} \cup U$. Then the first assertion follows from Proposition 3.7. Let $H$ be an $\hbar$-adic nonlocal vertex subalgebra such that $[H] = H$ and $U \subset H$. Then $\cup_{n \ge 1} U^{(n)} \subset H$. Furthermore, we have
$$U^o = [\cup_{n \ge 1} U^{(n)}] \subset [H] = H.$$
Since $[U^o] = U^o$, the induced topology on $U^o$ (from $H$ or $V$) coincides with its own $\hbar$-adic topology. Then $\langle U \rangle \subset H$ as $H$ is $\hbar$-adically complete.

We shall need the following result later:

Lemma 3.9. Let $V$ be an $\hbar$-adic nonlocal vertex algebra with a generating subset $U$ in the sense that $V = \langle U \rangle$ and let $(W, Y_W)$ be a quasi $V$-module equipped with a $\mathbb{C}[[\hbar]]$-linear operator $D$ such that
$$[D, Y_W(u, x)] = \frac{d}{dx} Y_W(u, x) \quad \text{for } u \in U. \tag{3.4}$$
Assume that $w$ is a vector of $W$ such that $Dw = 0$. Then $Y_W(v, x) w \in W[[x]]$ for all $v \in V$ and the linear map $\phi : V \to W$ defined by $\phi(v) = v_{-1} w$ for $v \in V$ is a $V$-module homomorphism.


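Proof. For every positive integer $n$, $V/\hbar^n V$ is a nonlocal vertex algebra over $\mathbb{C}$ and $W/\hbar^n W$ is a quasi $(V/\hbar^n V)$-module. Set
$$K_n = (\cup_{k \ge 1} U^{(k)} + \hbar^n V)/\hbar^n V \subset V/\hbar^n V.$$
It is clear that $K_n$ is a nonlocal vertex subalgebra of $V/\hbar^n V$. Let $\phi_n : K_n \to W/\hbar^n W$ be the map induced from $\phi$. From [Li4] (Prop. 6.2), we have
$$v_r w \in \hbar^n W \quad \text{for } v \in \cup_{k \ge 1} U^{(k)},\ r \ge 0,$$
and $\phi_n$ is a $K_n$-module homomorphism, which amounts to
$$\phi(u_m v) \equiv u_m \phi(v) \pmod{\hbar^n W} \quad \text{for } u, v \in \cup_{k \ge 1} U^{(k)},\ m \in \mathbb{Z}.$$
As $\cap_{n \ge 1} \hbar^n W = 0$, we have $v_r w = 0$ for $v \in \cup_{k \ge 1} U^{(k)}$, $r \ge 0$, and
$$\phi(u_m v) = u_m \phi(v) \quad \text{for } u, v \in \cup_{k \ge 1} U^{(k)},\ m \in \mathbb{Z}.$$
Now, let $u, v \in [\cup_{k \ge 1} U^{(k)}]$. There exists $t \in \mathbb{N}$ such that $\hbar^t u, \hbar^t v \in \cup_{k \ge 1} U^{(k)}$. Then $\hbar^t v_r w = 0$ for $r \ge 0$ and
$$\hbar^{2t} \phi(u_m v) = \hbar^{2t} u_m \phi(v) \quad \text{for } m \in \mathbb{Z}.$$
Since $W$ is torsion-free, we have $v_r w = 0$ for $r \ge 0$ and $\phi(u_m v) = u_m \phi(v)$ for $m \in \mathbb{Z}$.

Recall that $\langle U \rangle$ is the completion of $[\cup_{k \ge 1} U^{(k)}]$. Let $u^{(i)}, v^{(i)}$ $(i \ge 1)$ be sequences in $[\cup_{k \ge 1} U^{(k)}]$, converging to $u, v \in V$, respectively. We have
$$v^{(i)}_r w = 0 \ \text{ for } r \ge 0 \quad \text{and} \quad \phi(u^{(i)}_m v^{(j)}) = u^{(i)}_m \phi(v^{(j)}) \quad \text{for } i, j \ge 1,\ m \in \mathbb{Z}.$$
From this we see that for every positive integer $n$,
$$v_r w \in \hbar^n W \ \text{ for } r \ge 0 \quad \text{and} \quad \phi(u_m v) - u_m \phi(v) \in \hbar^n W \quad \text{for } m \in \mathbb{Z}.$$
Again as $\cap_{n \ge 1} \hbar^n W = 0$, we get $v_r w = 0$ for $r \ge 0$ and $\phi(u_m v) = u_m \phi(v)$ for $m \in \mathbb{Z}$. Now we have proved $Y_W(v, x) w \in W[[x]]$ and
$$\phi(u_m v) = u_m \phi(v) \quad \text{for all } u, v \in \langle U \rangle,\ m \in \mathbb{Z},$$
as needed.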

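Let $W$ be a topologically free $\mathbb{C}[[\hbar]]$-module and let $A$ be a $\mathbb{C}$-subspace of $W$. Notice that for any sequence $\{a_n\}_{n \ge 0}$ in $A$, $\sum_{n \ge 0} a_n \hbar^n \in W$. Set
$$A[[\hbar]] = \Big\{ \sum_{n \ge 0} a_n \hbar^n \ \Big|\ a_n \in A \Big\}, \tag{3.5}$$
which is a $\mathbb{C}[[\hbar]]$-submodule of $W$.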
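The point of this definition is that $A[[\hbar]]$ allows infinite $\hbar$-adic sums, so it is in general larger than the $\mathbb{C}[[\hbar]]$-span of $A$. A sketch with a hypothetical choice of $W$ and $A$ (not taken from the text):

```latex
% Illustrative choice: W^0 = C[t], W = W^0[[hbar]], A = W^0 viewed inside W.
\begin{align*}
\mathbb{C}[[\hbar]]\text{-span of } A
  &= \Big\{ \sum_{i=0}^{N} t^i f_i(\hbar) \ \Big|\ N \in \mathbb{N},\ f_i(\hbar) \in \mathbb{C}[[\hbar]] \Big\},\\
w &= \sum_{n \ge 0} t^n \hbar^n \in A[[\hbar]].
\end{align*}
```

Since the powers of $t$ occurring in $w$ are unbounded, $w$ does not lie in the $\mathbb{C}[[\hbar]]$-span of $A$, although it lies in $A[[\hbar]] = W$.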


Definition 3.10. Let $A$ and $B$ be subsets of an $\hbar$-adic nonlocal vertex algebra $V$. We say that the ordered pair $(A, B)$ is $\hbar$-adically $S$-local if for any $a \in A$, $b \in B$, there exists
$$P(a, b; x) \in ((\mathbb{C}B) \otimes (\mathbb{C}A) \otimes \mathbb{C}((x)))[[\hbar]] \subset V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]]$$
such that
$$Y(a, x_1) Y(b, x_2) \sim Y(x_2, x_1) P(a, b; x_2 - x_1). \tag{3.6}$$
We say that a subset $A$ of $V$ is $\hbar$-adically $S$-local if $(A, A)$ is $\hbar$-adically $S$-local.

We have the following technical result:

Lemma 3.11. Let $A$ and $B$ be $\mathbb{C}$-subspaces of $V$ such that $(A, B)$ is $\hbar$-adically $S$-local. Then $(A, B^{(2)})$ and $(A^{(2)}, B)$ are $\hbar$-adically $S$-local.

Proof. From the definition, there exists a $\mathbb{C}[[\hbar]]$-module map
$$S(x) : B[[\hbar]] \hat{\otimes} A[[\hbar]] \to B[[\hbar]] \hat{\otimes} A[[\hbar]] \hat{\otimes} \mathbb{C}((x))[[\hbar]]$$
such that for $a \in A$, $b \in B$,
$$Y(a, x_1) Y(b, x_2) \sim Y(x_2, x_1) S(x_2 - x_1)(b \otimes a). \tag{3.7}$$
We have the maps
$$\begin{aligned}
S^{32}(x) &: B[[\hbar]] \hat{\otimes} A[[\hbar]] \hat{\otimes} B[[\hbar]] \to B[[\hbar]] \hat{\otimes} A[[\hbar]] \hat{\otimes} B[[\hbar]] \hat{\otimes} \mathbb{C}((x))[[\hbar]], \\
S^{13}(x) &: B[[\hbar]] \hat{\otimes} B[[\hbar]] \hat{\otimes} A[[\hbar]] \to B[[\hbar]] \hat{\otimes} B[[\hbar]] \hat{\otimes} A[[\hbar]] \hat{\otimes} \mathbb{C}((x))[[\hbar]].
\end{aligned}$$
Note that by Proposition 2.19, (3.7) is equivalent to
$$Y(a, x) b = e^{xD} Y(-x) S(-x)(b \otimes a).$$
Let $a \in A$, $u, v \in B$. Using Lemma 2.13 we get
$$\begin{aligned}
Y(a, x) Y(u, z) v &\sim Y(z)(1 \otimes Y(x))(S(z - x)(u \otimes a) \otimes v) \\
&= Y(z)(1 \otimes e^{xD} Y(-x)) S^{32}(-x)(S(z - x)(u \otimes a) \otimes v) \\
&= e^{xD} Y(z - x)(1 \otimes Y(-x)) S^{32}(-x)(S(z - x)(u \otimes a) \otimes v) \\
&\sim e^{xD} Y(-x)(Y(z) \otimes 1) S^{32}(-x)(S(z - x)(u \otimes a) \otimes v) \\
&\sim e^{xD} Y(-x)(Y(z) \otimes 1) S^{32}(-x)(S(-x + z)(u \otimes a) \otimes v)
\end{aligned}$$
in $V[[x^{\pm 1}, z^{\pm 1}]]$. In view of Remark 2.18 we have
$$Y(a, x) Y(u, z) v = e^{xD} Y(-x)(Y(z) \otimes 1) S^{32}(-x)(S(-x + z)(u \otimes a) \otimes v).$$
It follows that $(A, B^{(2)})$ is $\hbar$-adically $S$-local. Similarly, for $a, b \in A$, $u \in B$, we have
$$\begin{aligned}
Y(Y(a, x_0) b, x) u &\sim_{+} Y(a, x_0 + x) Y(b, x) u \\
&= Y(a, x_0 + x)\, e^{xD} Y(-x) S(-x)(u \otimes b) \\
&= e^{xD} Y(a, x_0) Y(-x) S(-x)(u \otimes b) \\
&\sim_{+} e^{xD} Y(-x)(1 \otimes Y(x_0)) S^{13}(-x - x_0)(S(-x)(u \otimes b) \otimes a)
\end{aligned}$$
in $V[[x_0^{\pm 1}, x^{\pm 1}]]$, which by Remark 2.18 implies
$$Y(Y(a, x_0) b, x) u = e^{xD} Y(-x)(1 \otimes Y(x_0)) S^{13}(-x - x_0)(S(-x)(u \otimes b) \otimes a).$$
It follows that $(A^{(2)}, B)$ is $\hbar$-adically $S$-local.


Now, we have (cf. [Li5,LTW]):

Proposition 3.12. Let $V$ be an $\hbar$-adic nonlocal vertex algebra and let $U$ be an $\hbar$-adically $S$-local subset such that $V = (\cup_{n \ge 1} U^{(n)})[[\hbar]]$. Then $V$ is an $\hbar$-adic weak quantum vertex algebra.

Proof. We must prove that $V$ as a subset of $V$ is $\hbar$-adically $S$-local. Because $U$ is $\hbar$-adically $S$-local, it follows from Lemma 3.11 (and induction) that $\cup_{n \ge 1} U^{(n)}$ is $\hbar$-adically $S$-local. Let $u, v \in V$. By assumption, we have
$$u = \sum_{i \ge 0} a(i) \hbar^i, \quad v = \sum_{j \ge 0} b(j) \hbar^j \quad \text{with } a(i), b(j) \in \cup_{n \ge 1} U^{(n)}.$$
For any $i, j \in \mathbb{N}$, there exists $A_{i,j}(x) \in V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]]$ such that
$$Y(a(i), x_1) Y(b(j), x_2) \sim Y(x_2, x_1) A_{i,j}(x_2 - x_1).$$
Notice that
$$\sum_{i, j \in \mathbb{N}} A_{i,j}(x) \hbar^{i+j} \in V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]]$$
and
$$Y(u, x_1) Y(v, x_2) \sim Y(x_2, x_1) \Big( \sum_{i, j \in \mathbb{N}} A_{i,j}(x_2 - x_1) \hbar^{i+j} \Big).$$
This proves that $V$ is $\hbar$-adically $S$-local.

Using the proof of Proposition 2.8 in [Li5] and Proposition 3.12, we immediately have:

Proposition 3.13. Let $V$ be an $\hbar$-adic nonlocal vertex algebra, let $U$ be an $\hbar$-adically $S$-local subset, and let $W$ be a $V$-module with $e \in W$ such that $Y_W(u, x) e \in W[[x]]$ for $u \in U$. Set $K = (\cup_{n \ge 1} U^{(n)})[[\hbar]] \subset V$. Then $Y_W(v, x) e \in W[[x]]$ for all $v \in K$. Furthermore, the $\mathbb{C}[[\hbar]]$-module map $\theta : K \to W$, defined by $\theta(v) = v_{-1} e$ for $v \in K$, satisfies
$$\theta(Y(u, x) v) = Y_W(u, x) \theta(v) \quad \text{for } u, v \in K.$$


4. A General Construction of $\hbar$-adic Quantum Vertex Algebras

In this section we give a general construction of $\hbar$-adic nonlocal vertex algebras and $\hbar$-adic weak quantum vertex algebras by using what we call $\hbar$-adic quasi-compatible sets of vertex operators on topologically free $\mathbb{C}[[\hbar]]$-modules.

We start by recalling from [Li4] (cf. [Li2]) the general construction of nonlocal vertex algebras from quasi-compatible sets of formal vertex operators. Let $W^0$ be a vector space over $\mathbb{C}$. Set
$$\mathcal{E}(W^0) = \mathrm{Hom}(W^0, W^0((x))). \tag{4.1}$$
The identity operator on $W^0$, denoted by $1_{W^0}$, is a typical element of $\mathcal{E}(W^0)$, and the formal differential operator $\frac{d}{dx}$ is an endomorphism of $\mathcal{E}(W^0)$.

Definition 4.1. An (ordered) sequence $(\psi^{(1)}(x), \ldots, \psi^{(r)}(x))$ in $\mathcal{E}(W^0)$ is said to be quasi-compatible if there exists $0 \ne p(x_1, x_2) \in \mathbb{C}[x_1, x_2]$ such that
$$\Big( \prod_{1 \le i < j \le r} p(x_i, x_j) \Big) \psi^{(1)}(x_1) \cdots \psi^{(r)}(x_r) \in \mathrm{Hom}(W^0, W^0((x_1, \ldots, x_r))). \tag{4.2}$$
A subset $U$ of $\mathcal{E}(W^0)$ is said to be quasi-compatible if every (ordered) finite sequence in $U$ is quasi-compatible. We also define a notion of compatibility by assuming that $p(x_1, x_2)$ is of the form $(x_1 - x_2)^k$ with $k \in \mathbb{N}$.

Assume that $(a(x), b(x))$ is a quasi-compatible pair in $\mathcal{E}(W^0)$. By definition, there exists $0 \ne p(x_1, x_2) \in \mathbb{C}[x_1, x_2]$ such that
$$p(x_1, x_2) a(x_1) b(x_2) \in \mathrm{Hom}(W^0, W^0((x_1, x_2))). \tag{4.3}$$
Recall from [Li4] that
$$\iota_{x_1, x_2} : \mathbb{C}_*(x_1, x_2) \to \mathbb{C}((x_1))((x_2))$$
is the algebra-embedding that preserves each element of $\mathbb{C}[[x_1, x_2]]$, where $\mathbb{C}_*(x_1, x_2)$ denotes the algebra extension of $\mathbb{C}[[x_1, x_2]]$ by inverting every nonzero polynomial. We have
$$\iota_{x, x_0}(1/p(x_0 + x, x)) \big( p(x_1, x) a(x_1) b(x) \big) \big|_{x_1 = x + x_0} \in \big( \mathrm{Hom}(W^0, W^0((x))) \big)((x_0)).$$

Definition 4.2. Let $(a(x), b(x))$ be a quasi-compatible pair in $\mathcal{E}(W^0)$. Define $a(x)_n b(x)$ for $n \in \mathbb{Z}$, elements of $\mathcal{E}(W^0)$, in terms of the generating function
$$Y_{\mathcal{E}}(a(x), x_0) b(x) = \sum_{n \in \mathbb{Z}} a(x)_n b(x)\, x_0^{-n-1}$$
by
$$Y_{\mathcal{E}}(a(x), x_0) b(x) = \iota_{x, x_0}(1/p(x_0 + x, x)) \big( p(x_1, x) a(x_1) b(x) \big) \big|_{x_1 = x + x_0},$$
where $p(x_1, x_2)$ is any nonzero polynomial such that (4.3) holds.
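As an illustration of Definition 4.2 (a worked special case added here, with $f(x)$ a hypothetical choice), take $a(x) = f(x)\, 1_{W^0}$ with $f(x) \in \mathbb{C}((x))$ and let $b(x) \in \mathcal{E}(W^0)$ be arbitrary. Then $a(x_1) b(x_2)$ already lies in $\mathrm{Hom}(W^0, W^0((x_1, x_2)))$, so one may take $p = 1$ in (4.3):

```latex
% Assumption (illustrative only): a(x) is the scalar operator f(x) 1_{W^0}.
\begin{align*}
Y_{\mathcal{E}}(a(x), x_0) b(x)
  &= \big( f(x_1) b(x) \big)\big|_{x_1 = x + x_0}
   = f(x + x_0)\, b(x)
   = \sum_{j \ge 0} \frac{x_0^{\,j}}{j!}\, f^{(j)}(x)\, b(x),\\
a(x)_{-j-1} b(x) &= \frac{1}{j!} f^{(j)}(x)\, b(x) \quad (j \ge 0),
\qquad a(x)_n b(x) = 0 \quad (n \ge 0).
\end{align*}
```

Here $f(x + x_0)$ is expanded in nonnegative powers of $x_0$, as the substitution $x_1 = x + x_0$ requires. For $f = 1$ this gives $(1_{W^0})_{-1} b(x) = b(x)$ and $(1_{W^0})_n b(x) = 0$ for $n \ne -1$.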


A quasi-compatible subspace $U$ of $\mathcal{E}(W^0)$ is said to be $Y_{\mathcal{E}}$-closed if
$$a(x)_n b(x) \in U \quad \text{for } a(x), b(x) \in U,\ n \in \mathbb{Z}. \tag{4.4}$$
We have ([Li4], Theorem 2.19; cf. [Li2]):

Theorem 4.3. Let $W^0$ be a vector space over $\mathbb{C}$ and let $U$ be any quasi-compatible subset of $\mathcal{E}(W^0)$. There exists a (unique) smallest $Y_{\mathcal{E}}$-closed quasi-compatible subspace $\langle U \rangle$ of $\mathcal{E}(W^0)$, containing $U$ and $1_{W^0}$, and $(\langle U \rangle, Y_{\mathcal{E}}, 1_{W^0})$ carries the structure of a nonlocal vertex algebra with $U$ as a generating subset, and $W^0$ is a quasi-module for $\langle U \rangle$ with $Y_{W^0}(\alpha(x), x_0) = \alpha(x_0)$ for $\alpha(x) \in \langle U \rangle$. Furthermore, if $U$ is compatible, $W^0$ is a module for $\langle U \rangle$.

Definition 4.4. Let $W^0$ be a vector space over $\mathbb{C}$. A subset $U$ of $\mathcal{E}(W^0)$ is said to be $S$-local if for any $a(x), b(x) \in U$, there exist $a^{(i)}(x), b^{(i)}(x) \in U$, $f_i(x) \in \mathbb{C}((x))$ $(i = 1, \ldots, r)$ such that
$$(x_1 - x_2)^k a(x_1) b(x_2) = (x_1 - x_2)^k \sum_{i=1}^{r} f_i(x_2 - x_1)\, b^{(i)}(x_2)\, a^{(i)}(x_1) \tag{4.5}$$
for some $k \in \mathbb{N}$.

From [Li4] (Lemma 3.2), every $S$-local subset of $\mathcal{E}(W^0)$ is quasi-compatible. In fact, the same proof shows that every $S$-local subset is compatible. Furthermore, we have (see [Li4], Theorem 5.8):

Theorem 4.5. Let $W^0$ be a vector space over $\mathbb{C}$ and let $U$ be an $S$-local subset of $\mathcal{E}(W^0)$. Then the nonlocal vertex algebra $\langle U \rangle$ generated by $U$ is a weak quantum vertex algebra with $W^0$ as a faithful module.

Now, let $W$ be a $\mathbb{C}[[\hbar]]$-module. Recall from Sect. 2 that $\mathcal{E}_\hbar(W)$ is the $\mathbb{C}[[\hbar]]$-submodule of $(\mathrm{End}\, W)[[x, x^{-1}]]$ consisting of each $a(x) = \sum_{m \in \mathbb{Z}} a_m x^{-m-1}$ satisfying the condition that for any $w \in W$, $n \in \mathbb{N}$, there exists $k \in \mathbb{Z}$ such that
$$a_m w \in \hbar^n W \quad \text{for } m \ge k.$$
For the rest of this section we assume that $W = W^0[[\hbar]]$ is a fixed topologically free $\mathbb{C}[[\hbar]]$-module. Then $W$ is torsion-free, separated in the sense that $\cap_{n \ge 1} \hbar^n W = 0$, and $\hbar$-adically complete. We have the following projective inverse system:
$$0 \leftarrow W/\hbar W \leftarrow W/\hbar^2 W \leftarrow W/\hbar^3 W \leftarrow \cdots \tag{4.6}$$
(equipped with the canonical maps from $W/\hbar^{n+1} W$ to $W/\hbar^n W$ for $n \ge 0$) with $W$ as an inverse limit. Let $F$ be an endomorphism of $W$. For every nonnegative integer $n$, $F$ gives rise to an endomorphism $F_n$ of $W/\hbar^n W$. Then we have an endomorphism $\{F_n\}$ of the projective inverse system (4.6). Conversely, given any endomorphism, a sequence $\{f_n\}$, of the inverse system (4.6), we have an endomorphism $f$ of $W$. The $\mathbb{C}[[\hbar]]$-module $\mathrm{End}\, W$ can be naturally identified with $(\mathrm{End}\, W^0)[[\hbar]]$ and we have
$$(\mathrm{End}\, W)[[x, x^{-1}]] = (\mathrm{End}\, W^0)[[x, x^{-1}]][[\hbar]],$$
which is topologically free. Furthermore, we have
$$\mathcal{E}_\hbar(W) = \mathcal{E}(W^0)[[\hbar]], \tag{4.7}$$
which is also topologically free.


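Lemma 4.6. For $a(x) \in (\mathrm{End}\, W)[[x, x^{-1}]]$, if $\hbar^k a(x) \in \mathcal{E}_\hbar(W)$ for some $k \in \mathbb{N}$, then $a(x) \in \mathcal{E}_\hbar(W)$.

Proof. For any $n \in \mathbb{N}$, $w \in W$, with $\hbar^k a(x) \in \mathcal{E}_\hbar(W)$, there exists $q \in \mathbb{Z}$ such that $\hbar^k a_m w \in \hbar^{k+n} W$ for $m \ge q$, where $a(x) = \sum_{m \in \mathbb{Z}} a_m x^{-m-1}$. Since $W$ is torsion-free, we have $a_m w \in \hbar^n W$ for $m \ge q$. This proves $a(x) \in \mathcal{E}_\hbar(W)$.

Remark 4.7. For each $n \in \mathbb{N}$, we have a canonical $\mathbb{C}[[\hbar]]$-module map
$$\tilde{\pi}_n : (\mathrm{End}\, W)[[x, x^{-1}]] \to (\mathrm{End}(W/\hbar^n W))[[x, x^{-1}]]. \tag{4.8}$$
As $W$ is torsion-free, we have $\ker \tilde{\pi}_n = \hbar^n (\mathrm{End}\, W)[[x, x^{-1}]]$. Recall from Sect. 2 that an element $a(x)$ of $(\mathrm{End}\, W)[[x, x^{-1}]]$ lies in $\mathcal{E}_\hbar(W)$ if and only if $\tilde{\pi}_n(a(x)) \in \mathcal{E}(W/\hbar^n W)$ for $n \in \mathbb{N}$. Then we have canonical $\mathbb{C}[[\hbar]]$-module maps
$$\pi_n : \mathcal{E}_\hbar(W) \to \mathcal{E}(W/\hbar^n W) \tag{4.9}$$
for $n \in \mathbb{N}$, where by Lemma 4.6, $\ker \pi_n = \mathcal{E}_\hbar(W) \cap \hbar^n (\mathrm{End}\, W)[[x, x^{-1}]] = \hbar^n \mathcal{E}_\hbar(W)$. For every $n \in \mathbb{N}$, we have a canonical $\mathbb{C}[[\hbar]]$-module map
$$\theta_n : \mathcal{E}(W/\hbar^{n+1} W) \to \mathcal{E}(W/\hbar^n W). \tag{4.10}$$
We have the following projective inverse system
$$0 \leftarrow \mathcal{E}(W/\hbar W) \leftarrow \mathcal{E}(W/\hbar^2 W) \leftarrow \mathcal{E}(W/\hbar^3 W) \leftarrow \cdots \tag{4.11}$$
with $\mathcal{E}_\hbar(W)$ equipped with the $\mathbb{C}[[\hbar]]$-module maps $\pi_n$ as an inverse limit. Then for any sequence $\{\psi_n(x)\}$ with $\psi_n(x) \in \mathcal{E}(W/\hbar^n W)$ for $n \in \mathbb{N}$, satisfying the condition that $\theta_n(\psi_{n+1}(x)) = \psi_n(x)$, there exists a unique $\psi(x) \in \mathcal{E}_\hbar(W)$ such that $\pi_n(\psi(x)) = \psi_n(x)$ for $n \in \mathbb{N}$.

Definition 4.8. A finite sequence $a^1(x), \ldots, a^r(x)$ in $\mathcal{E}_\hbar(W)$ is said to be $\hbar$-adically quasi-compatible if for every positive integer $n$, the sequence $\pi_n(a^1(x)), \ldots, \pi_n(a^r(x))$ in $\mathcal{E}(W/\hbar^n W)$ is quasi-compatible. A subset $U$ of $\mathcal{E}_\hbar(W)$ is said to be $\hbar$-adically quasi-compatible if every finite sequence in $U$ is $\hbar$-adically quasi-compatible. Correspondingly, we define notions of $\hbar$-adically compatible sequence and $\hbar$-adically compatible subset.

Let $r$ be a positive integer. For each $n \in \mathbb{N}$, we have a canonical $\mathbb{C}[[\hbar]]$-module map
$$\tilde{\pi}_n^{(r)} : (\mathrm{End}\, W)[[x_1^{\pm 1}, \ldots, x_r^{\pm 1}]] \to (\mathrm{End}(W/\hbar^n W))[[x_1^{\pm 1}, \ldots, x_r^{\pm 1}]],$$
where $\tilde{\pi}_n^{(1)} = \tilde{\pi}_n$ as defined in (4.8). We also have a canonical $\mathbb{C}[[\hbar]]$-module map
$$\tilde{\theta}_n^{(r)} : (\mathrm{End}(W/\hbar^{n+1} W))[[x_1^{\pm 1}, \ldots, x_r^{\pm 1}]] \to (\mathrm{End}(W/\hbar^n W))[[x_1^{\pm 1}, \ldots, x_r^{\pm 1}]].$$
It is clear that for $n \in \mathbb{N}$,
$$\tilde{\pi}_n^{(r)} = \tilde{\theta}_n^{(r)} \circ \tilde{\pi}_{n+1}^{(r)}. \tag{4.12}$$
For any vector space $U$ over $\mathbb{C}$, we set
$$\mathcal{E}^{(r)}(U) = \mathrm{Hom}(U, U((x_1, \ldots, x_r))), \tag{4.13}$$
which is naturally a $\mathbb{C}((x_1, \ldots, x_r))$-module.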


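Definition 4.9. Let $r$ be a positive integer. For every $n \in \mathbb{N}$, define $\mathcal{E}_n^{(r)}(W)$ to be the $\mathbb{C}[[\hbar]]$-submodule of $(\mathrm{End}\, W)[[x_1^{\pm 1}, \ldots, x_r^{\pm 1}]]$ consisting of each formal series
$$\psi(x_1, \ldots, x_r) = \sum_{m_1, \ldots, m_r \in \mathbb{Z}} \psi(m_1, \ldots, m_r)\, x_1^{-m_1 - 1} \cdots x_r^{-m_r - 1}$$
satisfying the condition that $\tilde{\pi}_n^{(r)}(\psi(x_1, \ldots, x_r)) \in \mathcal{E}^{(r)}(W/\hbar^n W)$, or equivalently, for every $w \in W$, there exists $k \in \mathbb{Z}$ such that
$$\psi(m_1, \ldots, m_r) w \in \hbar^n W \quad \text{whenever } m_i \ge k \text{ for some } 1 \le i \le r.$$
We see that the $\mathcal{E}_n^{(r)}(W)$ are also $\mathbb{C}((x_1, \ldots, x_r))$-modules and we have
$$\mathcal{E}_0^{(r)}(W) \supset \mathcal{E}_1^{(r)}(W) \supset \cdots \supset \mathcal{E}_n^{(r)}(W) \supset \mathcal{E}_{n+1}^{(r)}(W) \supset \cdots.$$
In terms of $\mathcal{E}_n^{(r)}(W)$, a sequence $\psi^1(x), \ldots, \psi^r(x)$ in $\mathcal{E}_\hbar(W)$ is $\hbar$-adically quasi-compatible if and only if for every $n \in \mathbb{N}$, there exists $0 \ne p(x, y) \in \mathbb{C}[x, y]$ such that
$$\Big( \prod_{1 \le i < j \le r} p(x_i, x_j) \Big) \psi^1(x_1) \cdots \psi^r(x_r) \in \mathcal{E}_n^{(r)}(W).$$
Generalizing the maps $\pi_n$ and $\theta_n$, we have canonical $\mathbb{C}[[\hbar]]$-module maps for $n \in \mathbb{N}$:
$$\pi_n^{(r)} : \mathcal{E}_n^{(r)}(W) \to \mathcal{E}^{(r)}(W/\hbar^n W), \qquad \theta_n^{(r)} : \mathcal{E}^{(r)}(W/\hbar^{n+1} W) \to \mathcal{E}^{(r)}(W/\hbar^n W),$$
which satisfy
$$\pi_n^{(r)} = \theta_n^{(r)} \circ \pi_{n+1}^{(r)}.$$
Set
$$\mathcal{E}_\hbar^{(r)}(W) = \cap_{n \ge 1} \mathcal{E}_n^{(r)}(W) \subset (\mathrm{End}\, W)[[x_1^{\pm 1}, \ldots, x_r^{\pm 1}]]. \tag{4.14}$$
Note that if $(a(x), b(x))$ is an $\hbar$-adically quasi-compatible pair in $\mathcal{E}_\hbar(W)$, then for every $n \in \mathbb{N}$, $(\pi_n(a(x)), \pi_n(b(x)))$ is a quasi-compatible pair in $\mathcal{E}(W/\hbar^n W)$ and hence $\pi_n(a(x))_m \pi_n(b(x))$ are defined for all $m \in \mathbb{Z}$.

Lemma 4.10. Let $(a(x), b(x))$ be an $\hbar$-adically quasi-compatible pair in $\mathcal{E}_\hbar(W)$. We have
$$\theta_{n+1}\big( \pi_{n+1}(a(x))_m \pi_{n+1}(b(x)) \big) = \pi_n(a(x))_m \pi_n(b(x)) \quad \text{for } n \in \mathbb{N},\ m \in \mathbb{Z}. \tag{4.15}$$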


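Proof. For any fixed $n \in \mathbb{N}$, let $p(x, y) \in \mathbb{C}[x, y]$ be a nonzero polynomial such that
$$p(x_1, x_2) a(x_1) b(x_2) \in \mathcal{E}_{n+1}^{(2)}(W) \subset \mathcal{E}_n^{(2)}(W).$$
From Definition 4.2, we have
$$\begin{aligned}
p(x_0 + x, x)\, Y_{\mathcal{E}}(\pi_n(a(x)), x_0) \pi_n(b(x)) &= \big( p(x_1, x) \pi_n(a(x_1)) \pi_n(b(x)) \big) \big|_{x_1 = x + x_0} \\
&= \pi_n^{(2)}\big( p(x_1, x) a(x_1) b(x) \big) \big|_{x_1 = x + x_0}, \\
p(x_0 + x, x)\, Y_{\mathcal{E}}(\pi_{n+1}(a(x)), x_0) \pi_{n+1}(b(x)) &= \big( p(x_1, x) \pi_{n+1}(a(x_1)) \pi_{n+1}(b(x)) \big) \big|_{x_1 = x + x_0} \\
&= \pi_{n+1}^{(2)}\big( p(x_1, x) a(x_1) b(x) \big) \big|_{x_1 = x + x_0}.
\end{aligned}$$
With (4.12) it follows that
$$p(x_0 + x, x)\, Y_{\mathcal{E}}(\pi_n(a(x)), x_0) \pi_n(b(x)) = p(x_0 + x, x)\, \theta_{n+1}\big( Y_{\mathcal{E}}(\pi_{n+1}(a(x)), x_0) \pi_{n+1}(b(x)) \big),$$
which implies
$$Y_{\mathcal{E}}(\pi_n(a(x)), x_0) \pi_n(b(x)) = \theta_{n+1}\big( Y_{\mathcal{E}}(\pi_{n+1}(a(x)), x_0) \pi_{n+1}(b(x)) \big),$$
as desired.

Using Lemma 4.10 we define the following partial operations on $\mathcal{E}_\hbar(W)$:

Definition 4.11. Let $(a(x), b(x))$ be an $\hbar$-adically quasi-compatible (ordered) pair in $\mathcal{E}_\hbar(W)$. For $m \in \mathbb{Z}$, we define
$$a(x)_m b(x) = \varprojlim_{n} \pi_n(a(x))_m \pi_n(b(x)) \in \mathcal{E}_\hbar(W). \tag{4.16}$$
Form the generating function
$$Y_{\mathcal{E}}(a(x), x_0) b(x) = \sum_{m \in \mathbb{Z}} (a(x)_m b(x))\, x_0^{-m-1}. \tag{4.17}$$
From the definition, for every positive integer $n$ we have
$$\pi_n(a(x)_m b(x)) = \pi_n(a(x))_m \pi_n(b(x)) \tag{4.18}$$
for $m \in \mathbb{Z}$. Namely,
$$\pi_n(Y_{\mathcal{E}}(a(x), x_0) b(x)) = Y_{\mathcal{E}}(\pi_n(a(x)), x_0) \pi_n(b(x)). \tag{4.19}$$
Recall that $\mathcal{E}_\hbar^{(2)}(W)$ consists of each $\psi(x_1, x_2) \in (\mathrm{End}\, W)[[x_1^{\pm 1}, x_2^{\pm 1}]]$ such that for every $n \in \mathbb{N}$, $\tilde{\pi}_n^{(2)}(\psi) \in \mathcal{E}^{(2)}(W/\hbar^n W)$.

Proposition 4.12. Let $a(x), b(x) \in \mathcal{E}_\hbar(W)$. Assume that there exists $p(x_1, x_2, \hbar) \in \mathbb{C}[x_1, x_2, \hbar]$ with $p(x_1, x_2, 0) \ne 0$ such that
$$p(x_1, x_2, \hbar)\, a(x_1) b(x_2) \in \mathcal{E}_\hbar^{(2)}(W).$$
Then $(a(x), b(x))$ is $\hbar$-adically quasi-compatible and
$$Y_{\mathcal{E}}(a(x), x_0) b(x) = \iota_{x, x_0, \hbar}(1/p(x_0 + x, x, \hbar)) \big( p(x_1, x, \hbar) a(x_1) b(x) \big) \big|_{x_1 = x + x_0}.$$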


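Proof. Set
$$f(x_1, x_2) = p(x_1, x_2, 0) \in \mathbb{C}[x_1, x_2], \qquad A = p(x_1, x_2, \hbar)\, a(x_1) b(x_2) \in \mathcal{E}_\hbar^{(2)}(W).$$
Then $p(x_1, x_2, \hbar) = f(x_1, x_2) - \hbar\, g(x_1, x_2, \hbar)$ for some $g(x_1, x_2, \hbar) \in \mathbb{C}[x_1, x_2, \hbar]$. We have
$$a(x_1) b(x_2) = \iota_{x_1, x_2, \hbar}(1/p(x_1, x_2, \hbar)) A = \sum_{k \ge 0} \iota_{x_1, x_2}\big( f(x_1, x_2)^{-k-1} \big)\, g(x_1, x_2, \hbar)^k\, \hbar^k A.$$
For any positive integer $n$, we have
$$f(x_1, x_2)^n a(x_1) b(x_2) \equiv \sum_{k=0}^{n-1} f(x_1, x_2)^{n-k-1} g(x_1, x_2, \hbar)^k\, \hbar^k A \pmod{\hbar^n (\mathrm{End}\, W)[[x_1^{\pm 1}, x_2^{\pm 1}]]},$$
so that
$$\tilde{\pi}_n^{(2)}\big( f(x_1, x_2)^n a(x_1) b(x_2) \big) \in \mathcal{E}^{(2)}(W/\hbar^n W), \tag{4.20}$$
as $\tilde{\pi}_n^{(2)}(A) \in \mathcal{E}^{(2)}(W/\hbar^n W)$. This proves that $(a(x), b(x))$ is $\hbar$-adically quasi-compatible. Furthermore, for $n \in \mathbb{N}$ we have
$$\begin{aligned}
f(x_0 + x, x)^n\, \pi_n(Y_{\mathcal{E}}(a(x), x_0) b(x)) &= \big( f(x_1, x)^n \pi_n(a(x_1)) \pi_n(b(x)) \big) \big|_{x_1 = x + x_0} \\
&= \pi_n^{(2)}\big( f(x_1, x)^n a(x_1) b(x) \big) \big|_{x_1 = x + x_0}.
\end{aligned}$$
Then
$$\begin{aligned}
&\pi_n^{(2)}\Big( \iota_{x, x_0, \hbar}\big( p(x_0 + x, x, \hbar)^{-1} \big) \big( p(x_1, x, \hbar) a(x_1) b(x) \big) \big|_{x_1 = x + x_0} \Big) \\
&\quad = \pi_n^{(2)}\Big( \iota_{x, x_0, \hbar}\big( p(x_0 + x, x, \hbar)^{-1} f(x_0 + x, x)^{-n} \big) \big( p(x_1, x, \hbar) f(x_1, x)^n a(x_1) b(x) \big) \big|_{x_1 = x + x_0} \Big) \\
&\quad = \pi_n^{(2)}\Big( \iota_{x, x_0, \hbar}\big( p(x_0 + x, x, \hbar)^{-1} f(x_0 + x, x)^{-n} \big)\, p(x_0 + x, x, \hbar) \big( f(x_1, x)^n a(x_1) b(x) \big) \big|_{x_1 = x + x_0} \Big) \\
&\quad = \iota_{x, x_0, \hbar}\big( f(x_0 + x, x)^{-n} \big)\, \pi_n^{(2)}\big( f(x_1, x)^n a(x_1) b(x) \big) \big|_{x_1 = x + x_0} \\
&\quad = \pi_n\big( Y_{\mathcal{E}}(a(x), x_0) b(x) \big),
\end{aligned}$$
from which the second part follows.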
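The expansion used in this proof can be made explicit in the simplest case. As an illustration (the polynomial below is a choice made here, not taken from the text), take $p(x_1, x_2, \hbar) = x_1 - x_2 - \hbar$, so that $f = x_1 - x_2$ and $g = 1$:

```latex
% Illustrative choice: p = x_1 - x_2 - hbar, i.e. f = x_1 - x_2 and g = 1.
\begin{align*}
\iota_{x_1, x_2, \hbar}\Big(\frac{1}{x_1 - x_2 - \hbar}\Big)
  &= \sum_{k \ge 0} \iota_{x_1, x_2}\big((x_1 - x_2)^{-k-1}\big)\, \hbar^k,\\
(x_1 - x_2)^n \cdot \iota_{x_1, x_2, \hbar}\Big(\frac{1}{x_1 - x_2 - \hbar}\Big)
  &\equiv \sum_{k = 0}^{n-1} (x_1 - x_2)^{n-k-1}\, \hbar^k
  \pmod{\hbar^n}.
\end{align*}
```

Modulo $\hbar^n$ the factor $(x_1 - x_2)^n$ clears all denominators. The polynomial needed depends on $n$, which is exactly what the definition of $\hbar$-adic quasi-compatibility allows.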

An $\hbar$-adically quasi-compatible $\mathbb{C}[[\hbar]]$-submodule $K$ of $\mathcal{E}_\hbar(W)$ is said to be $Y_{\mathcal{E}}$-closed if
$$a(x)_m b(x) \in K \quad \text{for } a(x), b(x) \in K,\ m \in \mathbb{Z}.$$

Proposition 4.13. Let $V$ be a $Y_{\mathcal{E}}$-closed $\hbar$-adically quasi-compatible $\mathbb{C}[[\hbar]]$-submodule of $\mathcal{E}_\hbar(W)$, containing $1_W$. Suppose that $[V] = V$ and $V$ is $\hbar$-adically complete. Then $(V, Y_{\mathcal{E}}, 1_W)$ carries the structure of an $\hbar$-adic nonlocal vertex algebra and $W$ is a faithful quasi $V$-module with $Y_W(a(x), x_0) = a(x_0)$ for $a(x) \in V$. Furthermore, if $V$ is $\hbar$-adically compatible, $W$ is a module (instead of a quasi module).


Proof. Recall that $\mathcal{E}_\hbar(W)$ is topologically free. As a submodule of $\mathcal{E}_\hbar(W)$, $V$ is torsion-free and separated. Being assumed to be $\hbar$-adically complete, $V$ is topologically free. For any $n \in \mathbb{N}$, as
$$\pi_n(a(x))_m \pi_n(b(x)) = \pi_n(a(x)_m b(x)) \quad \text{for } a(x), b(x) \in V,\ m \in \mathbb{Z},$$
we see that $\pi_n(V)$ is a $Y_{\mathcal{E}}$-closed quasi-compatible $\mathbb{C}[[\hbar]]$-submodule of $\mathcal{E}(W/\hbar^n W)$, containing $1_W$. It follows from Theorem 4.3 that $\pi_n(V)$ is a nonlocal vertex algebra over $\mathbb{C}$ with $W/\hbar^n W$ as a quasi-module. Now we prove that the map $\pi_n$ from $V$ to $\pi_n(V)$ reduces to a $\mathbb{C}[[\hbar]]$-isomorphism from $V/\hbar^n V$ onto $\pi_n(V)$, so that $V/\hbar^n V$ is a nonlocal vertex algebra over $\mathbb{C}$. Let $a(x) \in V$ be such that $\pi_n(a(x)) = 0$ in $\mathcal{E}(W/\hbar^n W)$. Then $a(x) W \subset \hbar^n W[[x, x^{-1}]]$. So $a(x) = \hbar^n b(x)$ for some $b(x) \in (\mathrm{End}\, W)[[x, x^{-1}]]$. By Lemma 4.6, $b(x) \in \mathcal{E}_\hbar(W)$. Then we have $b(x) \in [V] = V$. Thus $a(x) = \hbar^n b(x) \in \hbar^n V$. This proves that $V \cap \ker \pi_n = \hbar^n V$, which implies $V/\hbar^n V \simeq \pi_n(V) \subset \mathcal{E}(W/\hbar^n W)$. Consequently, $V/\hbar^n V$ is a nonlocal vertex algebra over $\mathbb{C}$. By Propositions 2.11 and 2.24, $V$ is an $\hbar$-adic nonlocal vertex algebra with $W$ as a quasi $V$-module. The last part follows from Theorem 4.3 and Proposition 2.24.

For convenience, we call any $\hbar$-adic nonlocal vertex algebra $V$ as in Proposition 4.13 an $\hbar$-adic nonlocal vertex subalgebra of $\mathcal{E}_\hbar(W)$.

Lemma 4.14. Let $U$ be an $\hbar$-adically quasi-compatible $\mathbb{C}[[\hbar]]$-submodule of $\mathcal{E}_\hbar(W)$. Then $[U]$ is $\hbar$-adically quasi-compatible. If $U$ is $Y_{\mathcal{E}}$-closed, so is $[U]$.

Proof. Notice that $[U] \subset \mathcal{E}_\hbar(W)$ by Lemma 4.6. As $W$ is torsion-free, for any $w \in W$, $s, n \in \mathbb{N}$, the relation $\hbar^s w \in \hbar^{s+n} W$ implies $w \in \hbar^n W$. Furthermore, for $\psi \in (\mathrm{End}\, W)[[x_1^{\pm 1}, \ldots, x_r^{\pm 1}]]$, $s, n \in \mathbb{N}$, the relation $\hbar^s \psi \in \mathcal{E}_{n+s}^{(r)}(W)$ implies $\psi \in \mathcal{E}_n^{(r)}(W)$. Let $a^1(x), \ldots, a^r(x) \in [U]$. There exists $k \in \mathbb{N}$ such that $\hbar^k a^i(x) \in U$ for $i = 1, \ldots, r$. As the sequence $\hbar^k a^1(x), \ldots, \hbar^k a^r(x)$ in $U$ is $\hbar$-adically quasi-compatible, for every $n \in \mathbb{N}$, there exists $0 \ne p(x, y) \in \mathbb{C}[x, y]$ such that
$$\hbar^{rk} \Big( \prod_{1 \le i < j \le r} p(x_i, x_j) \Big) a^1(x_1) \cdots a^r(x_r) \in \mathcal{E}_{n+rk}^{(r)}(W),$$
which gives
$$\Big( \prod_{1 \le i < j \le r} p(x_i, x_j) \Big) a^1(x_1) \cdots a^r(x_r) \in \mathcal{E}_n^{(r)}(W).$$
This proves that the sequence $a^1(x), \ldots, a^r(x)$ is $\hbar$-adically quasi-compatible. Therefore, $[U]$ is $\hbar$-adically quasi-compatible. Assume that $U$ is $Y_{\mathcal{E}}$-closed. Let $a(x), b(x) \in [U]$, $m \in \mathbb{Z}$. By definition, there exists $k \in \mathbb{N}$ such that $\hbar^k a(x), \hbar^k b(x) \in U$. Then
$$\hbar^{2k} (a(x)_m b(x)) = (\hbar^k a(x))_m (\hbar^k b(x)) \in U.$$
Thus $a(x)_m b(x) \in [U]$. This proves that $[U]$ is $Y_{\mathcal{E}}$-closed.

Theorem 4.15. Let $K$ be a maximal $\hbar$-adically quasi-compatible $\mathbb{C}[[\hbar]]$-submodule of $\mathcal{E}_\hbar(W)$. Then $[K] = K$, $K$ is $\hbar$-adically topologically free and $Y_{\mathcal{E}}$-closed. Furthermore, $(K, Y_{\mathcal{E}}, 1_W)$ carries the structure of an $\hbar$-adic nonlocal vertex algebra with $W$ as a quasi-module with $Y_W(\alpha(x), x_0) = \alpha(x_0)$ for $\alpha(x) \in K$. If $K$ is $\hbar$-adically compatible, $W$ is a module (instead of a quasi-module).


Proof. By Lemma 4.14, $[K]$ is $\hbar$-adically quasi-compatible. As $K$ is maximal, we have $[K] \subset K$. Thus $[K] = K$. Let $a(x), b(x) \in K$, $m \in \mathbb{Z}$. For every $n \in \mathbb{N}$, $\pi_n(K)$ is quasi-compatible, so by Theorem 4.3, $\pi_n(K)$ generates a nonlocal vertex algebra $\langle \pi_n(K) \rangle$ over $\mathbb{C}$ and we have

$$\pi_n(K + \mathbb{C}a(x)_m b(x)) = \pi_n(K) + \mathbb{C}\pi_n(a(x))_m \pi_n(b(x)) \subset \langle \pi_n(K) \rangle,$$

a quasi-compatible $\mathbb{C}$-subspace of $E(W/\hbar^n W)$. This proves that $K + \mathbb{C}a(x)_m b(x)$ is $\hbar$-adically quasi-compatible in $E(W)$. Again, with $K$ maximal, we have $a(x)_m b(x) \in K$. Thus $K$ is $Y_E$-closed.

Now, let $\{\psi_m(x)\}$ be a sequence in $K$, satisfying the condition that for any $r \ge 0$, there exists $k \ge 0$ such that $\psi_m(x) - \psi_n(x) \in \hbar^r K$ whenever $m, n \ge k$. Since $E(W)$ is $\hbar$-adically complete, the sequence $\{\psi_m(x)\}$ has a limit, say $\psi(x)$, in $E(W)$. For any $n \in \mathbb{N}$, there exists $m \in \mathbb{N}$ such that $\psi_m(x) - \psi(x) \in \hbar^n E(W)$, which implies $\pi_n(\psi_m(x)) = \pi_n(\psi(x))$. Thus $\pi_n(\psi(x)) = \pi_n(\psi_m(x)) \in \pi_n(K)$. Consequently, $\pi_n(K + \mathbb{C}\psi(x)) \subset \pi_n(K)$, which is quasi-compatible. This proves that $K + \mathbb{C}\psi(x)$ is $\hbar$-adically quasi-compatible in $E(W)$ and then it follows that $\psi(x) \in K$. Thus $K$ is $\hbar$-adically complete, so that it is topologically free. Now, in view of Proposition 4.13, $K$ is an $\hbar$-adic nonlocal vertex algebra with $W$ as a quasi module. Furthermore, $W$ is a module if $K$ is $\hbar$-adically compatible.

Now, let $U$ be an $\hbar$-adically quasi-compatible subset of $E(W)$. In view of Zorn's lemma, there exists a maximal $\hbar$-adically quasi-compatible $\mathbb{C}[[\hbar]]$-submodule $K$ of $E(W)$, containing $U$ and $1_W$. Set $U^{(1)} = \mathbb{C}[[\hbar]]U + \mathbb{C}[[\hbar]]1_W$. Define $U^{(2)}$ to be the $\mathbb{C}[[\hbar]]$-span of the vectors $a(x)_m b(x)$ for $a(x), b(x) \in U^{(1)}$, $m \in \mathbb{Z}$. Since $K$ is $Y_E$-closed by Theorem 4.15, $U^{(2)} \subset K$. Then $U^{(2)}$ is $\hbar$-adically quasi-compatible. For $n \ge 1$, we inductively define $U^{(n+1)} = (U^{(n)})^{(2)}$. In this way, we obtain an increasing sequence of $\hbar$-adically quasi-compatible $\mathbb{C}[[\hbar]]$-submodules: $U^{(1)} \subset U^{(2)} \subset U^{(3)} \subset \cdots$. Set

$$U^o = \{a(x) \in E(W) \mid \hbar^k a(x) \in \cup_{n \ge 2} U^{(n)} \text{ for some } k \ge 1\}. \tag{4.21}$$

That is, $U^o = [\cup_{n \ge 2} U^{(n)}]$. In view of Lemma 3.5 we have

$$U^o \cap \hbar^n E(W) = \hbar^n U^o \quad \text{for } n \ge 1. \tag{4.22}$$

In particular, the induced topology of $U^o$ from $E(W)$ coincides with the $\hbar$-adic topology of $U^o$. Then we define $\langle U \rangle$ to be the $\hbar$-adic completion of $U^o$.

Theorem 4.16. Let $U$ be an $\hbar$-adically quasi-compatible subset of $E(W)$. Then $[\langle U \rangle] = \langle U \rangle$, $\langle U \rangle$ is topologically free, $\hbar$-adically quasi-compatible, and $Y_E$-closed, and $(\langle U \rangle, Y_E, 1_W)$ carries the structure of an $\hbar$-adic nonlocal vertex algebra and $W$ is a faithful quasi-$\langle U \rangle$-module with $Y_W(\alpha(x), x_0) = \alpha(x_0)$ for $\alpha(x) \in \langle U \rangle$. Furthermore, for any $\hbar$-adic nonlocal vertex subalgebra $V$ of $E(W)$, containing $U$, such that $[V] = V$, we have $\langle U \rangle \subset V$.

Proof. To prove $[\langle U \rangle] = \langle U \rangle$, let $a(x) \in [\langle U \rangle]$. By definition, $a(x) \in E(W)$ and there exists $k \ge 0$ such that $\hbar^k a(x) \in \langle U \rangle$. As $\langle U \rangle$ is the $\hbar$-adic completion of $U^o$, there exists a Cauchy sequence $\{\psi_m(x)\}$ in $U^o$ with $\hbar^k a(x)$ as a limit. There exists $r \ge 1$ such that

$$\psi_m(x) - \hbar^k a(x) \in \hbar^k E(W) \quad \text{for } m \ge r.$$


H. Li

Then $\psi_m(x) \in \hbar^k E(W)$ for $m \ge r$. Set $\psi_m(x) = \hbar^k \phi_m(x)$ for $m \ge r$ with $\phi_m(x) \in E(W)$. Noticing that $[U^o] = U^o$, we have $\phi_m(x) \in U^o$ for $m \ge r$. We see that $\{\phi_m(x)\}_{m \ge r}$ is a Cauchy sequence in $U^o$ with $a(x)$ as a limit. Thus $a(x) \in \langle U \rangle$. This proves $[\langle U \rangle] = \langle U \rangle$. As $\langle U \rangle$ is torsion-free, separated, and $\hbar$-adically complete by definition, $\langle U \rangle$ is topologically free.

It follows from definition that $\cup_{n \ge 2} U^{(n)}$ is $\hbar$-adically quasi-compatible and $Y_E$-closed. By Lemma 4.14, $U^o$ $(= [\cup_{n \ge 2} U^{(n)}])$ is $\hbar$-adically quasi-compatible and $Y_E$-closed. Let $\psi_1(x), \dots, \psi_r(x)$ be a sequence in $\langle U \rangle$ and let $n$ be any positive integer. For $1 \le i \le r$, there exists a sequence $\{\psi_{im}(x)\}$ in $U^o$, which converges to $\psi_i(x)$. Let $k$ be a positive integer such that

$$\psi_{im}(x) - \psi_i(x) \in \hbar^n E(W) \quad \text{for } 1 \le i \le r,\ m \ge k.$$

As U o is -adically quasi-compatible, there exists 0 = p(x, y) ∈ C[x, y] such that ⎛ ⎞ πn ⎝ p(xi , x j )⎠φ1k (x1 ) · · · φr k (xr ) ∈ Hom((W/n W ), (W/n W )((x1 , . . . , xr ))). 1≤i< j≤r

Then ⎛ πn ⎝

⎞ p(xi , x j )⎠φ1 (x1 ) · · · φr (xr ) ∈ Hom((W/n W ), (W/n W )((x1 , . . . , xr ))).

1≤i< j≤r

This proves that $\psi_1(x), \dots, \psi_r(x)$ is $\hbar$-adically quasi-compatible. It follows from Lemma 3.6 that $\langle U \rangle$ is $Y_E$-closed. Now, by Proposition 4.13, $(\langle U \rangle, Y_E, 1_W)$ carries the structure of an $\hbar$-adic nonlocal vertex algebra and $W$ is a faithful quasi-$\langle U \rangle$-module. Let $V$ be an $\hbar$-adic nonlocal vertex subalgebra of $E(W)$ satisfying the condition that $U \subset V$ and $[V] = V$. It is straightforward to see that $\langle U \rangle \subset V$.

For a topologically free $\mathbb{C}[[\hbar]]$-module $W$, we say a subset $T$ spans $W$ $\hbar$-adically if $W = (\mathbb{C}T)[[\hbar]]$. The following is an $\hbar$-adic version of ([Li4], Theorem 6.3) which is an analogue of a theorem of Frenkel-Kac-Radul-Wang [FKRW] and of Meurman-Primc [MP] (cf. [LL]):

Theorem 4.17. Let $V$ be a topologically free $\mathbb{C}[[\hbar]]$-module, $U$ a subset of $V$, $\mathbf{1}$ a vector in $V$, $D$ a $\mathbb{C}[[\hbar]]$-module endomorphism of $V$, and $Y^0$ a map

$$Y^0 : U \to E(V); \quad u \mapsto Y^0(u, x) = u(x) = \sum_{m \in \mathbb{Z}} u_m x^{-m-1}.$$

Assume that all the following conditions hold:

$$D\mathbf{1} = 0, \tag{4.23}$$

$$Y^0(u, x)\mathbf{1} \in V[[x]] \quad \text{and} \quad \lim_{x \to 0} Y^0(u, x)\mathbf{1} = u, \tag{4.24}$$

$$[D, Y^0(u, x)] = \frac{d}{dx} Y^0(u, x) \quad \text{for } u \in U, \tag{4.25}$$

$U(x) = \{u(x) \mid u \in U\}$ is $\hbar$-adically compatible, and $V$ is $\hbar$-adically spanned by vectors

$$u^{(1)}_{m_1} \cdots u^{(r)}_{m_r} \mathbf{1} \tag{4.26}$$


for $r \ge 0$, $u^{(i)} \in U$, $m_i \in \mathbb{Z}$. In addition we assume that there exists a $\mathbb{C}[[\hbar]]$-module morphism $\psi$ from $V$ to $\langle U(x) \rangle \subset E(V)$ such that $\psi(\mathbf{1}) = 1_V$ and

$$\psi(u_m v) = u(x)_m \psi(v) \quad \text{for } u \in U,\ v \in V,\ m \in \mathbb{Z}. \tag{4.27}$$

Then $Y^0$ extends uniquely to a $\mathbb{C}[[\hbar]]$-module map $Y$ from $V$ to $E(V)$ such that $(V, Y, \mathbf{1})$ carries the structure of an $\hbar$-adic nonlocal vertex algebra.

Proof. The uniqueness is obvious, so it remains to establish the existence. As $U(x)$ is an $\hbar$-adically compatible subset of $E(V)$, by Theorem 4.16 we have an $\hbar$-adic nonlocal vertex algebra $\langle U(x) \rangle$ with $V$ as a faithful $\langle U(x) \rangle$-module, where $Y_V(\alpha(x), x_0) = \alpha(x_0)$ for $\alpha(x) \in \langle U(x) \rangle$. For $u \in U$, we have

$$Y_V(u(x), x_0)\mathbf{1} = u(x_0)\mathbf{1} \in V[[x_0]],$$

$$[D, Y_V(u(x), x_0)] = [D, Y^0(u, x_0)] = \frac{d}{dx_0} Y^0(u, x_0) = \frac{d}{dx_0} Y_V(u(x), x_0).$$

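By Lemma 3.9, the map $\phi$ from $\langle U(x) \rangle$ to $V$, defined by $\phi(\alpha(x)) = \operatorname{Res}_x x^{-1}\alpha(x)\mathbf{1}$, is a $\langle U(x) \rangle$-module homomorphism. We see that $\phi(1_V) = \mathbf{1}$ and that for $u \in U$, $\alpha(x) \in \langle U(x) \rangle$,

$$\phi(Y_E(u(x), x_0)\alpha(x)) = Y_V(u(x), x_0)\phi(\alpha(x)) = u(x_0)\phi(\alpha(x)),$$

which amounts to

$$\phi(u(x)_m \alpha(x)) = u_m \phi(\alpha(x)) \quad \text{for } m \in \mathbb{Z}.$$

It follows that $\phi \circ \psi = 1_V$. Thus $\psi$ is a $\mathbb{C}[[\hbar]]$-module isomorphism from $V$ onto $\psi(V) \subset \langle U(x) \rangle$. For $u \in U$, we have

$$\psi(u) = \psi(u_{-1}\mathbf{1}) = u(x)_{-1} 1_V = u(x) \quad \text{and} \quad \phi(u(x)) = \phi(\psi(u)) = u.$$

Inside $\langle U(x) \rangle$, $\psi(V)$ is $\hbar$-adically spanned by vectors $u^{(1)}(x)_{m_1} \cdots u^{(r)}(x)_{m_r} 1_V$ for $r \ge 0$, $u^{(i)} \in U$, $m_i \in \mathbb{Z}$. For $a \in V$, we define $Y(a, x) \in (\operatorname{End} V)[[x, x^{-1}]]$ by

$$Y(a, x_0)b = \phi\left(Y_E(\psi(a)(x), x_0)\psi(b)(x)\right) \quad \text{for } b \in V.$$

As $\phi$ is a $\langle U(x) \rangle$-module homomorphism with $\phi \circ \psi = 1$, we have

$$Y(a, x_0)b = Y_V(\psi(a)(x), x_0)b = \psi(a)(x_0)b,$$

so that $Y(a, x) = \psi(a)(x) \in E(V)$. In particular, for $u \in U$,

$$Y(u, x_0) = Y_V(\psi(u), x_0) = Y_V(u(x), x_0) = u(x_0) = Y^0(u, x_0),$$

so the map $Y$ extends $Y^0$. For $v \in V$, we have

$$Y(\mathbf{1}, x_0)v = Y_V(1_V, x_0)v = 1_V(v) = v, \qquad Y(v, x_0)\mathbf{1} = Y_V(\psi(v)(x), x_0)\mathbf{1} \in V[[x_0]],$$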


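and

$$\lim_{x_0 \to 0} Y(v, x_0)\mathbf{1} = \lim_{x_0 \to 0} Y_V(\psi(v)(x), x_0)\mathbf{1} = \phi(\psi(v)) = v.$$

To prove that $(V, Y, \mathbf{1})$ is an $\hbar$-adic nonlocal vertex algebra, we show that for every positive integer $n$, $V/\hbar^n V$ with the reduced structures is a nonlocal vertex algebra over $\mathbb{C}$. For any $n \ge 1$, $\langle U(x) \rangle/\hbar^n \langle U(x) \rangle$ is a nonlocal vertex algebra over $\mathbb{C}$ and $\phi$ reduces to a homomorphism $\bar{\phi}$ from $\langle U(x) \rangle/\hbar^n \langle U(x) \rangle$ to $V/\hbar^n V$. On the other hand, the map $\psi$ reduces to a map $\bar{\psi}$ from $V/\hbar^n V$ to $\langle U(x) \rangle/\hbar^n \langle U(x) \rangle$ such that $\bar{\phi} \circ \bar{\psi} = 1$. We see that the image $\overline{\psi(V)}$ of $\psi(V)$ in $\langle U(x) \rangle/\hbar^n \langle U(x) \rangle$ is a nonlocal vertex subalgebra and $\overline{\psi(V)} \simeq V/\hbar^n V$ through the maps $\bar{\phi}$ and $\bar{\psi}$. It follows that $V/\hbar^n V$ is a nonlocal vertex algebra over $\mathbb{C}$. Therefore, $(V, Y, \mathbf{1})$ is an $\hbar$-adic nonlocal vertex algebra. This establishes the existence, concluding the proof.

Definition 4.18. Let $W$ be a topologically free $\mathbb{C}[[\hbar]]$-module as before. We define a $\mathbb{C}[[\hbar]]$-module map

$$Z(x_1, x_2) : E(W) \hat{\otimes} E(W) \hat{\otimes} \mathbb{C}((x))[[\hbar]] \to (\operatorname{End} W)[[x_1^{\pm 1}, x_2^{\pm 1}]]$$

by

$$Z(x_1, x_2)(a(x) \otimes b(x) \otimes f(x)) = f(x_1 - x_2)a(x_1)b(x_2). \tag{4.28}$$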

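Lemma 4.19. Let $a(x), b(x) \in E(W)$, $B(x) \in E(W) \hat{\otimes} E(W) \hat{\otimes} \mathbb{C}((x))[[\hbar]]$ such that $a(x_1)b(x_2) \sim Z(x_2, x_1)(B(x))$. Then $(a(x), b(x))$ is $\hbar$-adically compatible and

$$Y_E(a(x), x_0)b(x) = \operatorname{Res}_{x_1}\left( x_0^{-1}\delta\left(\frac{x_1 - x}{x_0}\right) a(x_1)b(x) - x_0^{-1}\delta\left(\frac{x - x_1}{-x_0}\right) Z(x, x_1)(B(x)) \right). \tag{4.29}$$

Proof. By definition, for any positive integer $n$, there exists $k \in \mathbb{N}$ such that

$$(x_1 - x_2)^k \pi_n(a(x_1))\pi_n(b(x_2)) = (x_1 - x_2)^k \pi_n^{(2)}(Z(x_2, x_1)(B(x))).$$

From [Li4] we have

$$Y_E(\pi_n(a(x)), x_0)\pi_n(b(x)) = \operatorname{Res}_{x_1}\left( x_0^{-1}\delta\left(\frac{x_1 - x}{x_0}\right) \pi_n(a(x_1))\pi_n(b(x)) - x_0^{-1}\delta\left(\frac{x - x_1}{-x_0}\right) \pi_n^{(2)} Z(x, x_1)(B(x)) \right).$$

Then it follows

Definition 4.20. Let $V$ be an $\hbar$-adic nonlocal vertex subalgebra of $E(W)$. Define a $\mathbb{C}[[\hbar]]$-module map $Y_E(x_2, x_1)$ from $V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]]$ to $(\operatorname{End} V)[[x_1^{\pm 1}, x_2^{\pm 1}]]$ by

$$Y_E(x_2, x_1)(a(x) \otimes b(x) \otimes f(x)) = f(x_2 - x_1)Y_E(a(x), x_2)Y_E(b(x), x_1) \tag{4.30}$$

for $a(x), b(x) \in V$, $f(x) \in \mathbb{C}((x))[[\hbar]]$.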


Lemma 4.21. Let $V$ be an $\hbar$-adic nonlocal vertex subalgebra of $E(W)$, and let $u(x), v(x) \in V$, $A(x) \in V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]]$. Suppose that

$$u(x_1)v(x_2) \sim Z(x_2, x_1)(A(x)) \quad \text{in } (\operatorname{End} W)[[x_1^{\pm 1}, x_2^{\pm 1}]]. \tag{4.31}$$

Then

$$Y_E(u(x), x_1)Y_E(v(x), x_2) \sim Y_E(x_2, x_1)(A(x)) \tag{4.32}$$

in $(\operatorname{End} V)[[x_1^{\pm 1}, x_2^{\pm 1}]]$ and

$$x_0^{-1}\delta\left(\frac{x_1 - x_2}{x_0}\right) Y_E(u(x), x_1)Y_E(v(x), x_2) - x_0^{-1}\delta\left(\frac{x_2 - x_1}{-x_0}\right) Y_E(x_2, x_1)(A(x)) = x_2^{-1}\delta\left(\frac{x_1 - x_0}{x_2}\right) Y_E(Y_E(u(x), x_0)v(x), x_2). \tag{4.33}$$

Proof. Let $n$ be a positive integer. We have a nonlocal vertex algebra $\pi_n(V) \subset E(W/\hbar^n W)$ over $\mathbb{C}$ with $\ker \pi_n = V \cap \hbar^n E(W)$ $(= \hbar^n V)$. From assumption, there exists a nonnegative integer $k$ such that

$$(x_1 - x_2)^k \pi_n(u(x_1))\pi_n(v(x_2)) = (x_1 - x_2)^k Z(x_2, x_1)(\pi_n(A(x))).$$

From [Li4] (Prop. 3.13), we have

$$(x_1 - x_2)^k Y_E(\pi_n(u(x)), x_1)Y_E(\pi_n(v(x)), x_2) = (x_1 - x_2)^k Y_E(x_2, x_1)\pi_n(A(x))$$

as desired.

Definition 4.22. A subset $U$ of $E(W)$ is said to be $\hbar$-adically $S$-local if for any $a(x), b(x) \in U$, there exists $A(x) \in (\mathbb{C}U \otimes \mathbb{C}U \otimes \mathbb{C}((x)))[[\hbar]]$ such that

$$a(x_1)b(x_2) \sim Z(x_2, x_1)(A(x)). \tag{4.34}$$

Lemma 4.23. Every $\hbar$-adically $S$-local subset of $E(W)$ is $\hbar$-adically compatible.

Proof. Let $U$ be an $\hbar$-adically $S$-local subset of $E(W)$. For every positive integer $n$, we see that $\pi_n(\mathbb{C}[[\hbar]]U)$ is an $S$-local subset of $E(W/\hbar^n W)$, so that $\pi_n(\mathbb{C}[[\hbar]]U)$ is compatible. By definition, $\mathbb{C}[[\hbar]]U$ is an $\hbar$-adically compatible subset of $E(W)$. Thus $U$ is an $\hbar$-adically compatible subset.

The following is a refinement of Theorem 4.17 (cf. [Li5], Theorem 2.9):

Theorem 4.24. Let $V$ be a topologically free $\mathbb{C}[[\hbar]]$-module, $U$ a subset of $V$, $\mathbf{1}$ a vector in $V$, and $Y^0$ a map

$$Y^0 : U \to E(V); \quad u \mapsto Y^0(u, x) = u(x) = \sum_{m \in \mathbb{Z}} u_m x^{-m-1}.$$

Assume that all the following conditions hold:

$$Y^0(u, x)\mathbf{1} \in V[[x]] \quad \text{and} \quad \lim_{x \to 0} Y^0(u, x)\mathbf{1} = u \quad \text{for } u \in U, \tag{4.35}$$


$U(x) = \{u(x) \mid u \in U\}$ is $\hbar$-adically $S$-local, and $V$ is $\hbar$-adically spanned by vectors

$$u^{(1)}_{m_1} \cdots u^{(r)}_{m_r} \mathbf{1} \tag{4.36}$$

for $r \ge 0$, $u^{(i)} \in U$, $m_i \in \mathbb{Z}$. In addition we assume that there exists a $\mathbb{C}[[\hbar]]$-module morphism $\psi$ from $V$ to $\langle U(x) \rangle \subset E(V)$ such that $\psi(\mathbf{1}) = 1_V$ and

$$\psi(u_m v) = u(x)_m \psi(v) \quad \text{for } u \in U,\ v \in V,\ m \in \mathbb{Z}. \tag{4.37}$$

Then the map $Y^0$ extends uniquely to a $\mathbb{C}[[\hbar]]$-module map $Y$ from $V$ to $E(V)$ such that $(V, Y, \mathbf{1})$ carries the structure of an $\hbar$-adic weak quantum vertex algebra.

Proof. We shall slightly modify the proof of Theorem 4.17. By Lemma 4.23, $U(x)$ is $\hbar$-adically compatible, so it generates an $\hbar$-adic nonlocal vertex algebra $\langle U(x) \rangle$. Set

$$K = (\cup_{n \ge 1} U(x)^{(n)})[[\hbar]] \subset \langle U(x) \rangle \subset E(V).$$

By Proposition 3.13, we have a $\mathbb{C}[[\hbar]]$-map $\phi : K \to V$ such that

$$\phi(1_V) = \mathbf{1}, \quad \phi(u(x)_n a(x)) = u_n \phi(a(x)) \quad \text{for } u \in U,\ n \in \mathbb{Z},\ a(x) \in K.$$

Then continue with the proof of Theorem 4.17 to see that $Y^0$ extends uniquely to a $\mathbb{C}[[\hbar]]$-module map $Y$ from $V$ to $E(V)$ such that $(V, Y, \mathbf{1})$ carries the structure of an $\hbar$-adic nonlocal vertex algebra. From Lemma 4.21, $U$ is an $\hbar$-adically $S$-local subset of $V$ and then by Proposition 3.12, $V$ is an $\hbar$-adic weak quantum vertex algebra.

Notice that compared with the corresponding theorem in [FKRW] and [MP], Theorems 4.17 and 4.24 have an extra assumption on the existence of the map $\psi$. By using classical linear algebra, it is not hard to see that in the general noncommutative situation, an assumption like this is indeed necessary. This assumption means that $V$ is a universal vacuum module for a certain algebra. The following results are companions in practical applications:

Lemma 4.25. Let $U$ be a subset of $E(W)$ satisfying the condition that for $a(x), b(x) \in U$, there exist $B(x) \in (\mathbb{C}U \otimes \mathbb{C}U \otimes \mathbb{C}((x)))[[\hbar]]$ and $p(x, \hbar) \in \mathbb{C}[x, \hbar]$ with $p(x, 0) \ne 0$ such that

$$p(x_1 - x_2, \hbar)a(x_1)b(x_2) = p(x_1 - x_2, \hbar)Z(x_2, x_1)(B(x)). \tag{4.38}$$

Then $U$ is $\hbar$-adically $S$-local.

Proof. Let $a(x), b(x) \in U$. By assumption there exist $B$ and $p(x, \hbar)$ with all the assumed properties. With $p(x, 0) \ne 0$, we have $p(x, \hbar) = f(x) - \hbar g(x, \hbar)$, where $0 \ne f(x) \in \mathbb{C}[x]$, $g(x, \hbar) \in \mathbb{C}[x, \hbar]$. Expand $p(x, \hbar)^{-1}$ in the nonnegative powers of $\hbar$ as follows:

$$p(x, \hbar)^{-1} = \sum_{i \ge 0} f(x)^{-1-i} g(x, \hbar)^i \hbar^i \in \mathbb{C}((x))[[\hbar]],$$

where $f(x)^{-i-1}$ is understood as an element of $\mathbb{C}((x))$. Let $n$ be a positive integer. Then

$$p(x, \hbar)^{-1} \equiv \sum_{i=0}^{n-1} f(x)^{-1-i} g(x, \hbar)^i \hbar^i \pmod{\hbar^n \mathbb{C}((x))[[\hbar]]}.$$


Let $k$ be a nonnegative integer such that $x^k f(x)^{-n} \in \mathbb{C}[[x]]$, so that

$$(x_1 - x_2)^k f(x_1 - x_2)^{-1-i} = (-x_2 + x_1)^k f(-x_2 + x_1)^{-1-i}$$

for all $i = 0, \dots, n - 1$. Set $A = p(x_1 - x_2, \hbar)a(x_1)b(x_2)$, the common quantity of both sides of (4.38). We have

$$\begin{aligned}
(x_1 - x_2)^k a(x_1)b(x_2) &= (x_1 - x_2)^k p(x_1 - x_2, \hbar)^{-1} A \\
&\equiv (x_1 - x_2)^k \sum_{i=0}^{n-1} f(x_1 - x_2)^{-1-i} g(x_1 - x_2, \hbar)^i \hbar^i A \pmod{\hbar^n} \\
&= (-x_2 + x_1)^k \sum_{i=0}^{n-1} f(-x_2 + x_1)^{-1-i} g(-x_2 + x_1, \hbar)^i \hbar^i A \\
&\equiv (-x_2 + x_1)^k \sum_{i \ge 0} f(-x_2 + x_1)^{-1-i} g(-x_2 + x_1, \hbar)^i \hbar^i A \pmod{\hbar^n} \\
&= (-x_2 + x_1)^k Z(x_2, x_1)B(x).
\end{aligned}$$

This proves $a(x_1)b(x_2) \sim Z(x_2, x_1)B(x)$. Thus $U$ is $\hbar$-adically $S$-local.
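The geometric-series expansion of $p(x, \hbar)^{-1}$ used in this proof can be checked on a small instance. The sketch below is only an illustration under stated assumptions: it takes the sample polynomial $p(x, \hbar) = x + \hbar$ (so $f(x) = x$ and $g = -1$, a choice made for this example and not taken from the text), stores Laurent polynomials in $x$ and $\hbar$ as dictionaries, and verifies that the truncated sum is an inverse of $p$ modulo $\hbar^n$. The helper names are hypothetical.

```python
from fractions import Fraction

def mul(a, b):
    """Multiply Laurent polynomials stored as {(i, j): coeff} for x**i * h**j."""
    out = {}
    for (i1, j1), c1 in a.items():
        for (i2, j2), c2 in b.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) + c1 * c2
    return {k: c for k, c in out.items() if c != 0}

def truncated_inverse(n):
    """sum_{i<n} f**(-1-i) * g**i * h**i for the sample f = x, g = -1 (p = x + h)."""
    return {(-1 - i, i): Fraction((-1) ** i) for i in range(n)}

def reduce_mod_h(a, n):
    """Drop every term divisible by h**n."""
    return {k: c for k, c in a.items() if k[1] < n}

# Assumed sample: p(x, h) = x + h, i.e. p = f - h*g with f = x and g = -1.
p = {(1, 0): Fraction(1), (0, 1): Fraction(1)}
n = 4
product = mul(p, truncated_inverse(n))
residue = reduce_mod_h(product, n)  # should represent the constant 1
```

Here `product` differs from 1 only by a single term of order $\hbar^n$, which is why the congruence holds modulo $\hbar^n$ but not as an exact identity for the finite sum.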

Proposition 4.26. Let $V$ be an $\hbar$-adic nonlocal vertex subalgebra of $E(W)$. Suppose that $a(x), b(x) \in V$, $B(x) \in V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]]$, and $p(x, \hbar) \in \mathbb{C}[x, \hbar]$ with $p(x, 0) \ne 0$ satisfy

$$p(x_1 - x_2, \hbar)a(x_1)b(x_2) = p(x_1 - x_2, \hbar)Z(x_2, x_1)(B(x)). \tag{4.39}$$

Then

$$p(x_1 - x_2, \hbar)Y_E(a(x), x_1)Y_E(b(x), x_2) = p(x_1 - x_2, \hbar)Y_E(x_2, x_1)(B(x)). \tag{4.40}$$

Proof. In view of Lemma 4.25, we have $a(x_1)b(x_2) \sim Z(x_2, x_1)(B(x))$. Furthermore, by Lemma 4.21 we have

$$Y_E(a(x), x_1)Y_E(b(x), x_2) \sim Y_E(x_2, x_1)(B(x)).$$

Then the following Jacobi identity holds:

$$x_0^{-1}\delta\left(\frac{x_1 - x_2}{x_0}\right) Y_E(a(x), x_1)Y_E(b(x), x_2) - x_0^{-1}\delta\left(\frac{x_2 - x_1}{-x_0}\right) Y_E(x_2, x_1)(B(x)) = x_1^{-1}\delta\left(\frac{x_2 + x_0}{x_1}\right) Y_E(Y_E(a(x), x_0)b(x), x_2).$$

By Proposition 4.12 we have

$$p(x_0, \hbar)Y_E(a(x), x_0)b(x) = \left( p(x_1 - x, \hbar)a(x_1)b(x) \right)|_{x_1 = x + x_0},$$

which involves only nonnegative integer powers of $x_0$. Multiplying both sides of the Jacobi identity by $p(x_0, \hbar)$, and then applying $\operatorname{Res}_{x_0}$ we obtain the desired identity.


We also have:

Proposition 4.27. Let $W$ be given as before and let $V$ be an $\hbar$-adic nonlocal vertex subalgebra of $E(W)$ such that $V$ is $\hbar$-adically compatible, and let

$$m \in \mathbb{Z}, \quad u(x), v(x), c^0(x), c^1(x), \dots \in V, \quad A(x) \in V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]],$$

satisfying the condition that for every positive integer $n$, there exists a nonnegative integer $r$ such that $c^j(x) \in \hbar^n V$ for $j \ge r$. Suppose that

$$(x_1 - x_2)^m u(x_1)v(x_2) - (-x_2 + x_1)^m Z(x_2, x_1)(A(x)) = \sum_{j \ge 0} c^j(x_2) \frac{1}{j!}\left(\frac{\partial}{\partial x_2}\right)^j x_2^{-1}\delta\left(\frac{x_1}{x_2}\right). \tag{4.41}$$

Then

$$(x_1 - x_2)^m Y_E(u(x), x_1)Y_E(v(x), x_2) - (-x_2 + x_1)^m Y_E(x_2, x_1)(A(x)) = \sum_{j \ge 0} Y_E(c^j(x), x_2) \frac{1}{j!}\left(\frac{\partial}{\partial x_2}\right)^j x_2^{-1}\delta\left(\frac{x_1}{x_2}\right). \tag{4.42}$$

Proof. Since $V$ is $\hbar$-adically compatible, $W$ is a faithful $V$-module with $Y_W(\alpha(x), x_0) = \alpha(x_0)$ for $\alpha(x) \in V$. Then it follows immediately from Proposition 2.25.

5. ħ-deformations of Quantum Vertex Algebras $V_Q$

In this section we construct a family of $\hbar$-adic quantum vertex algebras as $\hbar$-deformations of certain quantum vertex algebras which were studied in [KL]. One special case gives rise to an $\hbar$-deformed $\beta\gamma$-system, while another special case gives rise to an $\hbar$-deformation of the vertex operator superalgebra $V_L$ associated to the rank-one lattice $L = \mathbb{Z}\alpha$ with $\langle \alpha, \alpha \rangle = 1$. We essentially deal with the same algebras as in [KL] with a formal parameter $\hbar$, instead of a nonzero complex number $q$.

We first recall the quantum vertex algebras of Zamolodchikov-Faddeev type from [KL]. Let $l$ be a positive integer and let $Q = (q_{ij})_{i,j=1}^{l}$ be a complex matrix such that

$$q_{ij}q_{ji} = 1 \quad \text{for } 1 \le i, j \le l. \tag{5.1}$$
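Condition (5.1) is easy to test on explicit matrices. The following sketch is a hypothetical helper (the function name and the sample matrices are assumptions made for illustration); it checks $q_{ij}q_{ji} = 1$ entrywise with exact rational arithmetic. Taking $i = j$ shows that (5.1) forces every diagonal entry to be $\pm 1$, which is where the two one-dimensional cases $q_{11} = 1$ and $q_{11} = -1$ come from.

```python
from fractions import Fraction

def satisfies_5_1(Q):
    """Return True if the square matrix Q satisfies q_ij * q_ji == 1 for all i, j."""
    l = len(Q)
    if any(len(row) != l for row in Q):
        return False  # not a square matrix
    return all(Q[i][j] * Q[j][i] == 1 for i in range(l) for j in range(l))

# Sample matrices chosen only for illustration.
q_plus = [[Fraction(1)]]            # l = 1, q_11 = 1
q_minus = [[Fraction(-1)]]          # l = 1, q_11 = -1
q_mixed = [[Fraction(1), Fraction(2)],
           [Fraction(1, 2), Fraction(-1)]]
q_bad = [[Fraction(2)]]             # violates (5.1) since q_11 * q_11 = 4
```

Exact fractions are used so that products such as $2 \cdot \tfrac{1}{2}$ compare equal to 1 without floating-point error.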

Define $A_Q$ to be the associative algebra with identity (over $\mathbb{C}$) with generators

$$X_{i,n},\ Y_{i,n} \quad (i = 1, \dots, l,\ n \in \mathbb{Z}),$$

subject to relations

$$X_{i,m}X_{j,n} = q_{ij}X_{j,n}X_{i,m}, \quad Y_{i,m}Y_{j,n} = q_{ij}Y_{j,n}Y_{i,m}, \quad X_{i,m}Y_{j,n} - q_{ji}Y_{j,n}X_{i,m} = \delta_{i,j}\delta_{m+n+1,0} \tag{5.2}$$

for $1 \le i, j \le l$, $m, n \in \mathbb{Z}$. A vector $w$ in an $A_Q$-module is called a vacuum vector if

$$X_{i,n}w = Y_{i,n}w = 0 \quad \text{for } 1 \le i \le l,\ n \ge 0,$$

and an $A_Q$-module $W$ equipped with a vacuum vector that generates $W$ is called a vacuum $A_Q$-module.


Let $J_Q$ be the left ideal of $A_Q$, generated by $X_{i,n}, Y_{i,n}$ ($1 \le i \le l$, $n \ge 0$), that is,

$$J_Q = \sum_{i=1}^{l} \sum_{n \ge 0} (A_Q X_{i,n} + A_Q Y_{i,n}).$$

Set

$$V_Q = A_Q / J_Q, \tag{5.3}$$

a left $A_Q$-module, and set $\mathbf{1} = 1 + J_Q \in V_Q$. Then $\mathbf{1}$ is a vacuum vector and $V_Q$ equipped with $\mathbf{1}$ is a vacuum $A_Q$-module. For $1 \le i \le l$, set

$$u^{(i)} = X_{i,-1}\mathbf{1}, \quad v^{(i)} = Y_{i,-1}\mathbf{1} \in V_Q \tag{5.4}$$

and set

$$X_i(x) = \sum_{n \in \mathbb{Z}} X_{i,n}x^{-n-1}, \quad Y_i(x) = \sum_{n \in \mathbb{Z}} Y_{i,n}x^{-n-1} \in A_Q[[x, x^{-1}]]. \tag{5.5}$$

It was proved in [KL] that there exists a unique quantum vertex algebra structure on $V_Q$ with $\mathbf{1}$ as the vacuum vector and with

$$Y(u^{(i)}, x) = X_i(x), \quad Y(v^{(i)}, x) = Y_i(x) \quad \text{for } 1 \le i \le l.$$

It was also proved that VQ is nondegenerate. Next, we are going to construct a family of -adic quantum vertex algebras by deforming VQ . For this purpose we shall need the following notion (cf. [Li3]): Definition 5.1. Let V be a general nonlocal vertex algebra. An -adic pseudo-endomorphism of V is a linear map

(x) : V → (V ⊗ C((x)))[[]] satisfying the condition that (x)1 = 1 ⊗ 1,

(x1 )Y (v, x2 ) = Y ( (x1 − x2 )v, x2 ) (x1 )

for v ∈ V,

(5.6)

where the map Y is canonically extended. An -adic pseudo-endomorphism (x) is called a pseudo-automorphism if there exists an -adic pseudo-endomorphism (x) such that (x)(x)v = v = (x) (x)v for v ∈ V . We say that -adic pseudo-endomorphisms (x) and (x) commute if (x1 )(x2 ) = (x2 ) (x1 ). The following is an -adic version of Lemma 3.15 of [KL]: Lemma 5.2. Let Q = (qi j ) be an l × l matrix as before and let p1 (x, ), . . . , pl (x, ) be any sequence in C((x))[[]] with pi (x, 0) = 0 for 1 ≤ i ≤ l (so that pi (x, ) are invertible). Then there exists an -adic pseudo-automorphism (x) of VQ such that

(x)(u (i) ) = u (i) ⊗ pi (x, ),

(x)(v (i) ) = v (i) ⊗ pi (x, )−1

Furthermore, all such pseudo-automorphisms mutually commute.

for 1 ≤ i ≤ l.


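Proof. Note that $V_Q \otimes \mathbb{C}((x))[[\hbar]] \subset (V_Q \otimes \mathbb{C}((x)))[[\hbar]]$. As $\mathbb{C}((x))[[\hbar]]$ is a commutative associative algebra over $\mathbb{C}$ with $-\frac{d}{dx}$ as a derivation, $\mathbb{C}((x))[[\hbar]]$ becomes a vertex algebra with 1 as the vacuum vector and with

$$Y(f, z)g = \left(e^{-z\frac{d}{dx}} f\right) g \quad \text{for } f, g \in \mathbb{C}((x))[[\hbar]].$$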

We then equip VQ ⊗ C((x))[[]] with the tensor product vertex algebra structure with Yten denoting the vertex operator map. Then Yten (v ⊗ f (x), z) = Y (v, z) ⊗ f (x − z) for v ∈ V, f (x) ∈ C((x))[[]]. For A(x) ∈ V ⊗ C((x))[[]], we have Yten (A(x), z) = Y (A(x − z), z), noting that as in (5.6), Y is C((x))[[]]-bilinear. An -adic pseudo-endomorphism from VQ to VQ ⊗ C((x))[[]] exactly amounts to a vertex algebra homomorphism. It is straightforward to check that with X i (z) and Yi (z) (1 ≤ i ≤ l) acting on VQ ⊗ C((x))[[]] as Y (u (i) ⊗ pi (x, ), z) and Y (v (i) ⊗ pi (x, )−1 , z), respectively, VQ ⊗ C((x))[[]] becomes an AQ -module with 1 ⊗ 1 as a vacuum vector. Since VQ is a universal vacuum AQ -module, it follows that there exists an AQ -module homomorphism θ from VQ to VQ ⊗ C((x))[[]] such that θ (1) = 1 ⊗ 1. Because VQ as a quantum vertex algebra is generated by u (i) , v (i) (1 ≤ i ≤ l), it follows that θ is a vertex algebra homomorphism. We have θ (u (i) ) = u (i) ⊗ pi (x, ), θi (v (i) ) = v (i) ⊗ pi (x, )−1 for 1 ≤ i ≤ l. Denoting θ alternatively by (x), we see that (x) satisfies all the properties.
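The vertex operator map on $\mathbb{C}((x))[[\hbar]]$ used in this proof rests on the formal Taylor theorem, which gives that $e^{-z\,d/dx}$ applied to $f(x)$ is $f(x - z)$; this is what produces the shift $f(x - z)$ in the tensor product vertex operator. As a sanity check, the sketch below verifies this identity for polynomials in $x$ (a special case chosen for illustration; the function names and the coefficient-list representation are assumptions, not notation from the text).

```python
from fractions import Fraction
from math import comb, factorial

def derivative(f):
    """Derivative in x of a polynomial given as a coefficient list [c0, c1, ...]."""
    return [k * c for k, c in enumerate(f)][1:]

def exp_minus_z_ddx(f):
    """Return e^{-z d/dx} f as {(i, k): coeff} for x**i * z**k."""
    out = {}
    g = list(f)
    k = 0
    while g:  # the k-th derivative vanishes once k exceeds the degree
        for i, c in enumerate(g):
            if c != 0:
                out[(i, k)] = Fraction((-1) ** k * c, factorial(k))
        g = derivative(g)
        k += 1
    return out

def shift(f):
    """Expand f(x - z) directly with the binomial theorem, in the same format."""
    out = {}
    for n, c in enumerate(f):
        for k in range(n + 1):
            key = (n - k, k)
            out[key] = out.get(key, 0) + Fraction(c) * comb(n, k) * (-1) ** k
    return {key: c for key, c in out.items() if c != 0}

sample = [1, 0, -3, 2]  # f(x) = 1 - 3x^2 + 2x^3, an arbitrary test polynomial
```

Both functions return the same dictionary for `sample`, which is the polynomial instance of $e^{-z\,d/dx} f(x) = f(x - z)$.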

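Set

$$G(x, \hbar) = \left\{ \frac{p(x, \hbar)}{q(x, \hbar)} \ \Big|\ p(x, \hbar), q(x, \hbar) \in \mathbb{C}[x, \hbar] \text{ with } p(x, 0), q(x, 0) \ne 0 \right\}, \tag{5.7}$$

an abelian group. We also consider $G(x, \hbar)$ as a (multiplicative) subgroup of $\mathbb{C}((x))[[\hbar]]$.

Theorem 5.3. Let $Q = (q_{ij})_{i,j=1}^{l}$ be given as before and let

$$p_{ij}(x, \hbar) \in G(x, \hbar) \subset \mathbb{C}((x))[[\hbar]] \tag{5.8}$$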

such that pi j (x, 0) = 1 for 1 ≤ i, j ≤ l. For 1 ≤ i ≤ l, let i (x) be the pseudo-automorphism of VQ such that

i (x)(u ( j) ) = u ( j) ⊗ pi j (x, ),

i (x)(v ( j) ) = v ( j) ⊗ pi j (x, )−1

for 1 ≤ j ≤ l,

obtained in Lemma 5.2. Then there exists a unique -adic quantum vertex algebra structure on VQ [[]] with 1 as the vacuum vector and with Y(u (i) , x) = Y (u (i) , x) i (x),

Y(v (i) , x) = Y (v (i) , x) i (x)−1

for 1 ≤ i ≤ l.


Furthermore, VQ [[]] is non-degenerate and generated by u (i) , v (i) (1 ≤ i ≤ l), and the following relations hold for 1 ≤ i, j ≤ l: pi j (x1 − x2 , )−1 Y(u (i) , x1 )Y(u ( j) , x2 ) = q ji p ji (x2 − x1 , )−1 Y(u ( j) , x2 )Y(u (i) , x1 ), pi j (x1 − x2 , )−1 Y(v (i) , x1 )Y(v ( j) , x2 ) = qi j p ji (x2 − x1 , )−1 Y(v ( j) , x2 )Y(v (i) , x1 ), pi j (x1 − x2 , )Y(u (i) , x1 )Y(v ( j) , x2 ) − q ji p ji (x2 − x1 , )Y(v ( j) , x2 )Y(u (i) , x1 ) x1 = δi j x2−1 δ . x2 Proof. Note that from Lemma 5.2, pseudo-automorphisms i (x) (1 ≤ i ≤ l) are mutually commuting. For 1 ≤ i ≤ l, set a (i) (x) = Y (u (i) , x) i (x),

b(i) (x) = Y (v (i) , x) i−1 (x).

We have pi j (x1 − x2 , )−1 a (i) (x1 )a ( j) (x2 ) = pi j (x1 − x2 , )−1 Y (u (i) , x1 ) i (x1 )Y (u ( j) , x2 ) j (x2 ) = Y (u (i) , x1 )Y (u ( j) , x2 ) i (x1 ) j (x2 ) = qi j Y (u ( j) , x2 )Y (u (i) , x1 ) j (x2 ) i (x1 ) = qi j p ji (x2 − x1 , )−1 a ( j) (x2 )a (i) (x1 ), pi j (x1 − x2 , )−1 b(i) (x1 )b( j) (x2 ) = pi j (x1 − x2 , )−1 Y (v (i) , x1 ) i−1 (x1 )Y (v ( j) , x2 ) −1 j (x 2 ) = Y (v (i) , x1 )Y (v ( j) , x2 ) i−1 (x1 ) −1 j (x 2 ) −1 = qi j Y (v ( j) , x2 )Y (v (i) , x1 ) −1 j (x 2 ) i (x 1 )

= qi j p ji (x2 − x1 , )−1 b( j) (x2 )b(i) (x1 ), pi j (x1 − x2 , )a (i) (x1 )b( j) (x2 ) − q ji p ji (x2 − x1 , )b( j) (x2 )a (i) (x1 ) −1 ( j) (i) = Y (u (i) , x1 )Y (v ( j) , x2 ) i (x1 ) −1 j (x 2 )−q ji Y (v , x 2 )Y (u , x 1 ) j (x 2 ) i (x 1 ) x1

−1 = δi j x2−1 δ j (x 2 ) i (x 1 ) x2 x1

−1 = δi j x2−1 δ j (x 2 ) i (x 2 ) x2 x1 . = δi j x2−1 δ x2

Set T = {a (i) (x), b(i) (x) | 1 ≤ i ≤ l} ⊂ E(VQ [[]]). From Lemma 4.25, T is -adically S-local and hence -adically compatible by Lemma 4.23. By Theorem 4.16, T generates an -adic nonlocal vertex algebra T inside E(VQ [[]]) with VQ [[]] as a module.


We are going to apply Theorem 4.24 with V = VQ [[]], U = {u (i) , v (i) | 1 ≤ i ≤ l}, and Y0 (u (i) , x) = a (i) (x), Y0 (v (i) , x) = b(i) (x). We claim that VQ [[]] is generated from 1 by field operators a (i) (x), b(i) (x) (1 ≤ i ≤ l). Let W be the C[[]]-submodule of VQ [[]] generated from 1 by field operators a (i) (x), b(i) (x) (1 ≤ i ≤ l). We have i (x)1 = 1 ⊗ 1,

i (x)a ( j) (x1 ) = pi j (x − x1 , )a ( j) (x1 ) i (x),

i (x)b( j) (x1 ) = pi j (x − x1 , )−1 b( j) (x1 ) i (x) for 1 ≤ i, j ≤ l. It follows from induction that i (x)W ⊂ W [[x, x −1 ]] for 1 ≤ i ≤ l. Similarly, we have i (x)−1 W ⊂ W [[x, x −1 ]]. As Y (u (i) , x) = a (i) (x) i (x)−1 , Y (v (i) , x) = b(i) (x) i (x), it follows that W is closed under vertex operators Y (u (i) , x) and Y (v (i) , x) for 1 ≤ i ≤ l. Consequently, W = VQ [[]], as claimed. By Proposition 3.13, there exists a K -module homomorphism π from K to VQ [[]], sending 1VQ [[]] to 1. We are going to prove that π is in fact an isomorphism. First, we see that π is surjective and it gives rise to a surjective linear map π0 : K /K → VQ (= VQ [[]]/VQ [[]]). By Lemma 5.4, which follows next, we have pi j (x1 − x2 , )YE (a (i) (x), x1 )YE (a ( j) (x), x2 ) = qi j p ji (x2 − x1 , )YE (a ( j) (x), x2 )YE (a (i) (x), x1 ), pi j (x1 − x2 , )YE (b(i) (x), x1 )YE (b( j) (x), x2 ) = qi j p ji (x2 − x1 , )YE (b( j) (x), x2 )YE (b(i) (x), x1 ), pi j (x1 − x2 , )YE (a (i) (x), x1 )YE (b( j) (x), x2 ) −q ji p ji (x2 − x1 , )YE (b( j) (x), x2 )YE (a (i) (x), x1 ) x1 = δi j x2−1 δ x2 for 1 ≤ i, j ≤ l. From this, it follows that K /K is a vacuum AQ -module with X i (z) and Yi (z) (1 ≤ i ≤ l) acting as YE (a (i) (x), z) and YE (b(i) (x), z), respectively. Furthermore, we see that π0 is a surjective AQ -module homomorphism from K /K to VQ . As every nonzero vacuum AQ -module is irreducible from [KL], π0 must be an isomorphism. With K separated and with VQ [[]] torsion-free, it follows from a result of Enriquez ([En], Lemma 3.14) that π is an isomorphism. Now, by Theorem 4.24, there exists an -adic weak quantum vertex algebra structure on VQ [[]] with 1 as the vacuum vector and with Y(u (i) , x) = a (i) (x),

Y(v (i) , x) = b(i) (x)

for 1 ≤ i ≤ l.

Then the last assertion follows immediately. As VQ is nondegenerate, VQ [[]] is an -adic quantum vertex algebra. Now, the proof is complete.


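The following is the result we need in the proof of Theorem 5.3:

Lemma 5.4. Let $W$ be a topologically free $\mathbb{C}[[\hbar]]$-module and let $V$ be an $\hbar$-adic nonlocal vertex subalgebra of $E(W)$. Assume that

$$a(x), b(x) \in V, \quad p(x, \hbar), q(x, \hbar) \in \mathbb{C}[x, \hbar], \quad f(x, \hbar) \in \mathbb{C}((x))[[\hbar]]$$

with $p(x, 0), q(x, 0) \ne 0$ such that

$$\frac{p(x_1 - x_2, \hbar)}{q(x_1 - x_2, \hbar)} a(x_1)b(x_2) - f(x_2 - x_1, \hbar)b(x_2)a(x_1) = \lambda x_2^{-1}\delta\left(\frac{x_1}{x_2}\right), \tag{5.9}$$

where $\lambda$ is a complex number. Then

$$\frac{p(x_1 - x_2, \hbar)}{q(x_1 - x_2, \hbar)} Y_E(a(x), x_1)Y_E(b(x), x_2) - f(x_2 - x_1, \hbar)Y_E(b(x), x_2)Y_E(a(x), x_1) = \lambda x_2^{-1}\delta\left(\frac{x_1}{x_2}\right). \tag{5.10}$$

Proof. From (5.9), we have

$$\begin{aligned}
(x_1 - x_2)p(x_1 - x_2, \hbar)a(x_1)b(x_2) &= (x_1 - x_2)q(x_1 - x_2, \hbar)f(x_2 - x_1, \hbar)b(x_2)a(x_1) \\
&= (x_1 - x_2)p(x_1 - x_2, \hbar) \times \left( \frac{q(x_1 - x_2, \hbar)f(x_2 - x_1, \hbar)}{p(-x_2 + x_1, \hbar)} \right) b(x_2)a(x_1).
\end{aligned}$$

In view of Lemma 4.25, we have

$$a(x_1)b(x_2) \sim \left( \frac{q(x_1 - x_2, \hbar)f(x_2 - x_1, \hbar)}{p(-x_2 + x_1, \hbar)} \right) b(x_2)a(x_1).$$

By Lemma 4.19, we have

$$Y_E(a(x), x_0)b(x) = \operatorname{Res}_{x_1} x_0^{-1}\delta\left(\frac{x_1 - x}{x_0}\right) a(x_1)b(x) - \operatorname{Res}_{x_1} x_0^{-1}\delta\left(\frac{x - x_1}{-x_0}\right) \left( \frac{q(x_1 - x, \hbar)f(x - x_1, \hbar)}{p(-x + x_1, \hbar)} \right) b(x)a(x_1).$$

Multiplying both sides by $\frac{p(x_0, \hbar)}{q(x_0, \hbar)}$ (viewed as an element of $\mathbb{C}((x_0))[[\hbar]]$) we obtain

$$\frac{p(x_0, \hbar)}{q(x_0, \hbar)} Y_E(a(x), x_0)b(x) = \operatorname{Res}_{x_1} x_0^{-1}\delta\left(\frac{x_1 - x}{x_0}\right) \frac{p(x_1 - x, \hbar)}{q(x_1 - x, \hbar)} a(x_1)b(x) - \operatorname{Res}_{x_1} x_0^{-1}\delta\left(\frac{x - x_1}{-x_0}\right) f(x - x_1, \hbar)b(x)a(x_1).$$

Furthermore, for $n \ge 0$, applying $\operatorname{Res}_{x_0} x_0^n$ to both sides and then using (5.9) we get

$$\operatorname{Res}_{x_0} x_0^n \frac{p(x_0, \hbar)}{q(x_0, \hbar)} Y_E(a(x), x_0)b(x) = \lambda \operatorname{Res}_{x_1} (x_1 - x)^n x^{-1}\delta\left(\frac{x_1}{x}\right) = \delta_{n,0}\lambda.$$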


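On the other hand, by Lemma 4.21 we have the following Jacobi identity in $V$:

$$x_0^{-1}\delta\left(\frac{x_1 - x_2}{x_0}\right) Y_E(a(x), x_1)Y_E(b(x), x_2) - x_0^{-1}\delta\left(\frac{x_2 - x_1}{-x_0}\right) \left( \frac{q(x_1 - x_2, \hbar)f(x_2 - x_1, \hbar)}{p(-x_2 + x_1, \hbar)} \right) Y_E(b(x), x_2)Y_E(a(x), x_1) = x_1^{-1}\delta\left(\frac{x_2 + x_0}{x_1}\right) Y_E(Y_E(a(x), x_0)b(x), x_2).$$

Multiplying both sides by $\frac{p(x_0, \hbar)}{q(x_0, \hbar)}$ and then taking $\operatorname{Res}_{x_0}$ we obtain

$$\begin{aligned}
&\frac{p(x_1 - x_2, \hbar)}{q(x_1 - x_2, \hbar)} Y_E(a(x), x_1)Y_E(b(x), x_2) - f(x_2 - x_1, \hbar)Y_E(b(x), x_2)Y_E(a(x), x_1) \\
&\quad = \operatorname{Res}_{x_0} x_1^{-1}\delta\left(\frac{x_2 + x_0}{x_1}\right) \frac{p(x_0, \hbar)}{q(x_0, \hbar)} Y_E(Y_E(a(x), x_0)b(x), x_2) \\
&\quad = \lambda x_1^{-1}\delta\left(\frac{x_2}{x_1}\right),
\end{aligned}$$

proving (5.10).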

Example 5.5. Consider the special case with $l = 1$, $q_{11} = 1$, and $p_{11}(x, \hbar) = \frac{x + \hbar}{x}$. In this case (with $Q = q_{11} = 1$), $A_Q$ is a Weyl algebra and $V_Q$ is a vertex algebra. By Theorem 5.3, there exists an $\hbar$-adic quantum vertex algebra $V_Q[[\hbar]]$ with generators $u, v$ such that

$$\frac{x_1 - x_2}{x_1 - x_2 + \hbar} Y(u, x_1)Y(u, x_2) = \frac{x_2 - x_1}{x_2 - x_1 + \hbar} Y(u, x_2)Y(u, x_1),$$

$$\frac{x_1 - x_2}{x_1 - x_2 + \hbar} Y(v, x_1)Y(v, x_2) = \frac{x_2 - x_1}{x_2 - x_1 + \hbar} Y(v, x_2)Y(v, x_1),$$

$$\frac{x_1 - x_2 + \hbar}{x_1 - x_2} Y(u, x_1)Y(v, x_2) - \frac{x_2 - x_1 + \hbar}{x_2 - x_1} Y(v, x_2)Y(u, x_1) = x_2^{-1}\delta\left(\frac{x_1}{x_2}\right).$$

This gives an $\hbar$-deformed $\beta\gamma$-system (cf. [EFK]).

Example 5.6. Consider the case with $l = 1$, $q_{11} = -1$, and $p_{11}(x, \hbar) = \frac{x + \hbar}{x}$. In this case, $A_Q$ is a Clifford algebra and $V_Q$ is a vertex superalgebra (cf. [FFR]). Theorem 5.3 asserts that there exists an $\hbar$-adic quantum vertex algebra $V_Q[[\hbar]]$ with generators $u, v$ such that

$$\frac{x_1 - x_2}{x_1 - x_2 + \hbar} Y(u, x_1)Y(u, x_2) = -\frac{x_2 - x_1}{x_2 - x_1 + \hbar} Y(u, x_2)Y(u, x_1),$$

$$\frac{x_1 - x_2}{x_1 - x_2 + \hbar} Y(v, x_1)Y(v, x_2) = -\frac{x_2 - x_1}{x_2 - x_1 + \hbar} Y(v, x_2)Y(v, x_1),$$

$$\frac{x_1 - x_2 + \hbar}{x_1 - x_2} Y(u, x_1)Y(v, x_2) + \frac{x_2 - x_1 + \hbar}{x_2 - x_1} Y(v, x_2)Y(u, x_1) = x_2^{-1}\delta\left(\frac{x_1}{x_2}\right).$$

Let $L = \mathbb{Z}\alpha$ be a rank-one lattice with $\langle \alpha, \alpha \rangle = 1$. Associated to $L$, one has a vertex superalgebra $V_L$ (cf. [DL]). It is known that vertex superalgebras $V_Q$ and $V_L$ are isomorphic with $u = e^{\alpha}$ and $v = e^{-\alpha}$. In view of this, $V_Q[[\hbar]]$ is an $\hbar$-deformation of $V_L$.


6. ħ-adic Quantum Vertex Algebras Associated with Double Yangians

In this section, we shall associate the centrally extended double Yangian of $sl_2$ (see [Kh]) with $\hbar$-adic quantum vertex algebras. This can be viewed as an $\hbar$-adic version of the corresponding result of [Li6] for the centerless double Yangian of $sl_2$ with $\hbar$ evaluated as a nonzero complex number. The following is a variant of the centrally extended double Yangian $DY_\hbar(sl_2)$ in [Kh]:

Definition 6.1. Define $DY_\hbar(sl_2)$ to be the topological associative algebra over $\mathbb{C}[[\hbar]]$ with generators $e_m, f_m, h_m, c, d$ ($m \in \mathbb{Z}$), which are grouped together in terms of generating functions

$$e(x) = \sum_{m \in \mathbb{Z}} e_m x^{-m-1}, \quad f(x) = \sum_{m \in \mathbb{Z}} f_m x^{-m-1},$$

$$h^+(x) = 1 + \hbar \sum_{m \ge 0} h_m x^{-m-1}, \quad h^-(x) = 1 - \hbar \sum_{m < 0} h_m x^{-m-1},$$

with $c$ a central element, subject to relations

$$\begin{aligned}
[d, e(x)] &= \frac{d}{dx} e(x), \quad [d, f(x)] = \frac{d}{dx} f(x), \quad [d, h^{\pm}(x)] = \frac{d}{dx} h^{\pm}(x), \\
e(x)e(y) &= \frac{y - x - \hbar}{y - x + \hbar}\, e(y)e(x), \\
f(x)f(y) &= \frac{y - x + \hbar}{y - x - \hbar}\, f(y)f(x), \\
h^+(x)e(y) &= \frac{x - y + \hbar}{x - y - \hbar}\, e(y)h^+(x), \\
h^+(x)f(y) &= \frac{x - y - (1 + c)\hbar}{x - y + (1 - c)\hbar}\, f(y)h^+(x), \\
h^-(x)e(y) &= \frac{y - x - \hbar}{y - x + \hbar}\, e(y)h^-(x), \\
h^-(x)f(y) &= \frac{y - x + \hbar}{y - x - \hbar}\, f(y)h^-(x), \\
h^{\pm}(x)h^{\pm}(y) &= h^{\pm}(y)h^{\pm}(x), \\
h^+(x)h^-(y) &= \frac{x - y + \hbar}{x - y - \hbar} \cdot \frac{x - y - (1 + c)\hbar}{x - y + (1 - c)\hbar}\, h^-(y)h^+(x), \\
[e(x), f(y)] &= \frac{1}{\hbar}\left( x^{-1}\delta\left(\frac{y + c\hbar}{x}\right) h^+(x) - x^{-1}\delta\left(\frac{y}{x}\right) h^-(y) \right),
\end{aligned} \tag{6.1}$$

where by convention for $a \in \mathbb{C}[c]$,

$$\frac{1}{x - y - a\hbar} = \sum_{n \ge 0} a^n (x - y)^{-n-1} \hbar^n \in \mathbb{C}[c, (x - y)^{-1}][[\hbar]],$$

$$\frac{1}{y - x - a\hbar} = \sum_{n \ge 0} a^n (y - x)^{-n-1} \hbar^n \in \mathbb{C}[c, (y - x)^{-1}][[\hbar]].$$


Remark 6.2. Here we give some details for the definition. Let $T$ be the free associative algebra over $\mathbb{C}$ with generators $e_m, f_m, h_m$ ($m \in \mathbb{Z}$), $c, d$. Set

$$\deg c = \deg d = 0, \quad \deg e_m = \deg f_m = \deg h_m = m \quad \text{for } m \in \mathbb{Z},$$

to make $T$ a $\mathbb{Z}$-graded algebra $T = \bigoplus_{n \in \mathbb{Z}} T_n$. For $n \in \mathbb{Z}$, set

$$F_n(T) = \bigoplus_{m \ge n} T_m \subset T.$$

This defines a decreasing filtration $\{F_n(T)\}_{n \in \mathbb{Z}}$ of $T$ with $\cap_{n \in \mathbb{Z}} F_n(T) = 0$. Denote by $\overline{T}$ the formal completion of $T$ with respect to this filtration. Then $\overline{T}[[\hbar]]$ is an algebra over $\mathbb{C}[[\hbar]]$. The algebra $DY_\hbar(sl_2)$ is simply the quotient algebra of $\overline{T}[[\hbar]]$ modulo all the relations corresponding to (6.1). The standard double Yangian $DY(sl_2)$ (see [Kh]) was related to the increasing filtration $\{C_k\}_{k \in \mathbb{Z}}$, where $C_k = \bigoplus_{m \le k} T_m$ for $k \in \mathbb{Z}$.

Remark 6.3. Let $W$ be a $\mathbb{C}[[\hbar]]$-module and let $a(x), b(x) \in E(W)$. Assume that $a(x) \in (\operatorname{End} W)[[x^{-1}]]$. Note that the equalities

$$a(x_1)b(x_2) = \frac{x_1 - x_2 - \hbar}{x_1 - x_2 + \hbar}\, b(x_2)a(x_1),$$

$$a(x_1)b(x_2) = \frac{x_2 - x_1 + \hbar}{x_2 - x_1 - \hbar}\, b(x_2)a(x_1) = \frac{-x_2 + x_1 - \hbar}{-x_2 + x_1 + \hbar}\, b(x_2)a(x_1)$$

both make sense, but they are not equivalent in general. On the other hand, the following equalities both make sense and are equivalent:

$$a(x_1)b(x_2) = \frac{x_1 - x_2 - \hbar}{x_1 - x_2 + \hbar}\, b(x_2)a(x_1),$$

$$b(x_2)a(x_1) = \frac{x_1 - x_2 + \hbar}{x_1 - x_2 - \hbar}\, a(x_1)b(x_2).$$
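The distinction drawn in Remark 6.3 comes from the fact that a rational function such as $(x_1 - x_2)^{-1}$ has two different formal expansions. The sketch below illustrates the simplest case, assuming the usual convention that a binomial is expanded in nonnegative powers of its second variable and that $\delta(z) = \sum_{m \in \mathbb{Z}} z^m$; the function names and the truncation parameter `N` are hypothetical. It checks, up to truncation, that the two expansions of $(x_1 - x_2)^{-1}$ differ by $x_2^{-1}\delta(x_1/x_2)$.

```python
def expand_x1_minus_x2_inverse(N):
    """(x1 - x2)^(-1) in nonnegative powers of x2, truncated to N terms.

    Keys are (power of x1, power of x2).
    """
    return {(-n - 1, n): 1 for n in range(N)}

def expand_minus_x2_plus_x1_inverse(N):
    """(-x2 + x1)^(-1) in nonnegative powers of x1, truncated to N terms."""
    return {(n, -n - 1): -1 for n in range(N)}

def delta_term(N):
    """x2^(-1) * delta(x1/x2) = sum of x1^m * x2^(-m-1), kept for -N <= m < N."""
    return {(m, -m - 1): 1 for m in range(-N, N)}

def subtract(a, b):
    """Coefficientwise difference a - b of two truncated series."""
    out = dict(a)
    for key, c in b.items():
        out[key] = out.get(key, 0) - c
    return {key: c for key, c in out.items() if c != 0}

N = 5
difference = subtract(expand_x1_minus_x2_inverse(N),
                      expand_minus_x2_plus_x1_inverse(N))
```

The dictionary `difference` coincides with `delta_term(N)`, so replacing one expansion by the other changes an identity by delta-function terms; this is why the order of the variables in each denominator has to be tracked.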

By a $DY_\hbar(sl_2)$-module we mean a $\mathbb{C}[[\hbar]]$-module $W$ on which $e(x), f(x), h^{\pm}(x)$ and $c, d$ act such that

$$e(x), f(x), h^{\pm}(x) \in E(W)$$

and such that all the defining relations in (6.1) hold. A $DY_\hbar(sl_2)$-module $W$ is said to be of level $\ell \in \mathbb{C}$ if $c$ acts on $W$ as scalar $\ell$. Let $W$ be a $DY_\hbar(sl_2)$-module of level $\ell$. Set

$$U_W = \{1_W, e(x), f(x), h^+(x), h^-(x)\} \subset E(W). \tag{6.2}$$

Note that from the last defining relation we have

$$(x_1 - x_2)(x_1 - x_2 - \ell\hbar)[e(x_1), f(x_2)] = 0.$$

In view of Lemma 4.25, $U_W$ is $\hbar$-adically $S$-local. Then by Theorem 4.15, $U_W$ generates an $\hbar$-adic nonlocal vertex algebra $V_W$ inside $E(W)$. We are going to show that the space $V_W$ is naturally a module for a variant of $DY_\hbar(sl_2)$.


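Definition 6.4. Let $\widetilde{DY}_\hbar(sl_2)$ be the topological associative algebra over $\mathbb{C}[[\hbar]]$ with generators $\tilde{e}_m, \tilde{f}_m, \tilde{h}^{\pm}_m$ ($m \in \mathbb{Z}$), $c, d$, and with generating functions

$$\tilde{e}(x) = \sum_{m \in \mathbb{Z}} \tilde{e}_m x^{-m-1}, \quad \tilde{f}(x) = \sum_{m \in \mathbb{Z}} \tilde{f}_m x^{-m-1},$$

$$\tilde{h}^+(x) = 1 + \hbar \sum_{m \in \mathbb{Z}} \tilde{h}^+_m x^{-m-1}, \quad \tilde{h}^-(x) = 1 - \hbar \sum_{m \in \mathbb{Z}} \tilde{h}^-_m x^{-m-1}, \tag{6.3}$$

subject to relations

$$\begin{aligned}
[d, \tilde{e}(x)] &= \frac{d}{dx} \tilde{e}(x), \quad [d, \tilde{f}(x)] = \frac{d}{dx} \tilde{f}(x), \quad [d, \tilde{h}^{\pm}(x)] = \frac{d}{dx} \tilde{h}^{\pm}(x), \\
\tilde{e}(x)\tilde{e}(y) &= \frac{y - x - \hbar}{y - x + \hbar}\, \tilde{e}(y)\tilde{e}(x), \\
\tilde{f}(x)\tilde{f}(y) &= \frac{y - x + \hbar}{y - x - \hbar}\, \tilde{f}(y)\tilde{f}(x), \\
\tilde{e}(y)\tilde{h}^+(x) &= \frac{x - y - \hbar}{x - y + \hbar}\, \tilde{h}^+(x)\tilde{e}(y), \\
\tilde{f}(y)\tilde{h}^+(x) &= \frac{x - y + (1 - c)\hbar}{x - y - (1 + c)\hbar}\, \tilde{h}^+(x)\tilde{f}(y), \\
\tilde{h}^-(x)\tilde{e}(y) &= \frac{y - x - \hbar}{y - x + \hbar}\, \tilde{e}(y)\tilde{h}^-(x), \\
\tilde{h}^-(x)\tilde{f}(y) &= \frac{y - x + \hbar}{y - x - \hbar}\, \tilde{f}(y)\tilde{h}^-(x), \\
[\tilde{h}^{\pm}(x), \tilde{h}^{\pm}(y)] &= 0, \\
\tilde{h}^-(y)\tilde{h}^+(x) &= \frac{x - y - \hbar}{x - y + \hbar} \cdot \frac{x - y + (1 - c)\hbar}{x - y - (1 + c)\hbar}\, \tilde{h}^+(x)\tilde{h}^-(y), \\
[\tilde{e}(x), \tilde{f}(y)] &= \frac{1}{\hbar}\left( x^{-1}\delta\left(\frac{y + c\hbar}{x}\right) \tilde{h}^+(x) - x^{-1}\delta\left(\frac{y}{x}\right) \tilde{h}^-(y) \right).
\end{aligned} \tag{6.4}$$

Similarly, we define a $\widetilde{DY}_\hbar(sl_2)$-module to be a $\mathbb{C}[[\hbar]]$-module on which $\tilde{e}(x), \tilde{f}(x), \tilde{h}^{\pm}(x)$ and $c, d$ act with $\tilde{e}(x), \tilde{f}(x), \tilde{h}^{\pm}(x) \in E(W)$, satisfying all the defining relations. A vector $w$ in a $\widetilde{DY}_\hbar(sl_2)$-module is called a vacuum vector if

$$dw = 0, \quad \tilde{e}_m w = \tilde{f}_m w = \tilde{h}^{\pm}_m w = 0 \quad \text{for } m \ge 0.$$

A vacuum $\widetilde{DY}_\hbar(sl_2)$-module is a module equipped with a vacuum vector which generates the whole module.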


Remark 6.5. In view of Remark 6.3, the following relations hold in $DY_\hbar(sl_2)$:

$$e(y)h^+(x) = \frac{x - y - \hbar}{x - y + \hbar}\, h^+(x)e(y),$$

$$f(y)h^+(x) = \frac{x - y + (1 - c)\hbar}{x - y - (1 + c)\hbar}\, h^+(x)f(y),$$

$$h^-(y)h^+(x) = \frac{x - y - \hbar}{x - y + \hbar} \cdot \frac{x - y + (1 - c)\hbar}{x - y - (1 + c)\hbar}\, h^+(x)h^-(y).$$

On the other hand, these three relations also imply the corresponding original relations. Thus $DY_\hbar(sl_2)$ is isomorphic to the quotient algebra of $\widetilde{DY}_\hbar(sl_2)$ modulo the relations

$$\tilde{h}^+_m = 0 \quad \text{for } m < 0 \quad \text{and} \quad \tilde{h}^-_n = 0 \quad \text{for } n \ge 0. \tag{6.5}$$

Consequently, every DY (sl2 )-module of level is naturally a DY(sl2 )-module of level . Set h˜ + (x) =

k≥0

1 h˜ k x −k−1= (h˜ + (x)−1), h˜ − (x) =

k<0

1 h˜ k x −k−1 = (1− h˜ − (x)).

(6.6)

The following is straightforward: Lemma 6.6. In terms of h˜ ± (x) , those relations involving h˜ ± (x) in (6.4) become y−x − 2 e(y) ˜ h˜ − (x) + e(y), ˜ ˜ = h˜ − (x) e(y) y−x + y−x + −2 y − x + ˜ ˜− h˜ − (x) f˜(y) = f (y)h (x) + f˜(y), y−x − y−x − x − y − ˜+ −2 · h (x) e(y) e(y), ˜ ˜ + e(y) ˜ h˜ + (x) = x −y+ x −y+ x − y + (1 − c) ˜ + ˜ 2 f˜(y)h˜ + (x) = · h (x) f (y) + f˜(y), x − y − (1 + c) x − y − (1 + c) [h˜ ± (x) , h˜ ± (y) ] = 0, h˜ − (y) h˜ + (x) = F()h˜ + (x) h˜ − (y) −2c 1 + h˜ + (x) − h˜ − (y) , + (x − y + )(x − y − (1 + c)) where F() =

x − y − x − y + (1 − c) · , x − y + x − y − (1 + c)

and [e(x), ˜ f˜(y)] k ck k−1 y y ∂ −1 + − + ˜ ˜ ˜ =x δ h (x) + h (x) + (1 + h (x) ) . x −1 δ x k! ∂y x k≥1


Proposition 6.7. Let W be a topologically free DY (sl2 )-module of level . Set "W = {1W , e(x), ˜ f˜(x), h˜ + (x) , h˜ − (x) } ⊂ E(W ). U "W is -adically S-local and the -adic nonlocal vertex algebra V "W generThen U "W is a DY ˜ 0 ), f˜(x0 ) and h˜ ± (x0 ) acting ated by U (sl2 )-module of level with e(x ± ˜ ˜ as YE (e(x), ˜ x0 ), YE ( f (x), x0 ) and YE (h (x), x0 ), respectively, and with d acting as D = d/d x and c acting as scalar . Furthermore, 1W generates a vacuum DY (sl2 )module of level . "W Proof. With the commutation relations (6.4) and with Lemma 6.6, by Lemma 4.25, U "W is -adically S-local. Note that the D-operator of the -adic nonlocal vertex algebra V is exactly the formal differential operator d/d x and we have d d YE (e(x), ˜ x0 ), [D, YE ( f˜(x), x0 )] = YE ( f˜(x), x0 ), d x0 d x0 d [D, YE (h˜ ± (x), x0 )] = YE (h˜ ± (x), x0 ). d x0 With the commutation relations (6.4), by Proposition 4.26 we have [D, YE (e(x), ˜ x0 )] =

x2 − x1 − · YE (e(x), ˜ x2 )YE (e(x), ˜ x1 ), x2 − x1 + x2 − x1 + · YE ( f˜(x), x2 )YE ( f˜(x), x1 ), YE ( f˜(x), x1 )YE ( f˜(x), x2 ) = x2 − x1 − x1 − x2 − · YE (h˜ + (x), x1 )YE (e(x), YE (e(x), ˜ x2 )YE (h˜ + (x), x1 ) = ˜ x2 ), x1 − x2 + x1 − x2 + (1 − ) · YE (h˜ + (x), x1 )YE ( f˜(x), x2 ), YE ( f˜(x), x2 )YE (h˜ + (x), x1 ) = x1 − x2 − (1 + ) x2 − x1 − · YE (e(x), ˜ x2 ) = ˜ x2 )YE (h˜ − (x), x1 ), YE (h˜ − (x), x1 )YE (e(x), x2 − x1 + x2 − x1 + · YE ( f˜(x), x2 )YE (h˜ − (x), x1 ), YE (h˜ − (x), x1 )YE ( f˜(x), x2 ) = x2 − x1 − [YE (h˜ ± (x) , x1 ), YE (h˜ ± (x) , x2 )] = 0, YE (h˜ − (x), x2 )YE (h˜ + (x), x1 ) x1 − x2 − x1 − x2 + (1 − ) · · YE (h˜ + (x), x1 )YE (h˜ − (x), x2 ). = x1 − x2 + x1 − x2 − (1 + ) "W -module. Using Proposition 2.26 we get Recall that W is a faithful V x2 YE h˜ + (x) + h˜ − (x) , x1 [YE (e(x), ˜ x1 ), YE ( f˜(x), x2 )] = x1−1 δ x1 k k−1 ∂ k −1 x2 + ˜ YE (1 + h (x) , x1 ) + x1 δ k! ∂ x2 x1 k≥1 x2 + 1 YE (h˜ + (x), x1 ) x1−1 δ = x1 x2 YE (h˜ − (x), x2 ) . − x1−1 δ x1 ˜ x1 )YE (e(x), ˜ x2 ) = YE (e(x),


This shows that VW is a DY (sl2 )-module of level . Clearly, 1 is a vacuum vector of "W viewed as a DY(sl2 )-module, so it generates a vacuum DY V (sl2 )-module. Definition 6.8. Define K to be a Lie algebra over C with a basis {c, E m , Fm , Im , Jm | m ∈ Z} and with the Lie bracket relations [c, K ] = 0, [E(x), E(y)] = 0,

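$[F(x), F(y)] = 0$, $[I(x), I(y)] = 0$, $[J(x), J(y)] = 0$,

$$[I(x), J(y)] = \frac{2c}{(x-y)^2},$$

$$[I(x), E(y)] = \frac{2}{x-y}\,E(y), \qquad [I(x), F(y)] = \frac{-2}{x-y}\,F(y),$$

$$[J(x), E(y)] = \frac{2}{y-x}\,E(y), \qquad [J(x), F(y)] = \frac{-2}{y-x}\,F(y),$$

$$[E(x), F(y)] = x^{-1}\delta\!\left(\frac{y}{x}\right)\bigl(I(y) + J(y)\bigr) + c\,\frac{\partial}{\partial y}\,x^{-1}\delta\!\left(\frac{y}{x}\right),$$

where $a(x) = \sum_{m\in\mathbb{Z}} a_m x^{-m-1}$ for $a = E, F, I, J$.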

Remark 6.9. Consider the product Lie algebra $sl_2 \oplus \mathbb{C}z$. Extend the normalized Killing form on $sl_2$ to a symmetric (invariant) bilinear form on $sl_2 \oplus \mathbb{C}z$ by $\langle sl_2, z\rangle = 0$, $\langle z, z\rangle = 0$. Then we have an affine Lie algebra $\widehat{sl_2 \oplus \mathbb{C}z}$. It is readily seen that the Lie algebra $K$ is isomorphic to $\widehat{sl_2 \oplus \mathbb{C}z}$ with

$$E(x) = e(x), \quad F(x) = f(x), \quad I(x) = h(x)^+ + z(x), \quad J(x) = h(x)^- - z(x),$$

and $c = k$ (the central element of $\widehat{sl_2 \oplus \mathbb{C}z}$), where for $a \in sl_2 \oplus \mathbb{C}z$,

$$a(x) = \sum_{m\in\mathbb{Z}} a(m)x^{-m-1}$$

and

$$h(x)^+ = \sum_{m\geq 0} h(m)x^{-m-1}, \qquad h(x)^- = \sum_{m<0} h(m)x^{-m-1}.$$

A vector $w$ in a $K$-module is called a vacuum vector if $E_m w = F_m w = I_m w = J_m w = 0$ for $m \geq 0$. A vacuum $K$-module is a module equipped with a vacuum vector which generates the whole module. Denote by $K_{\geq 0}$ the linear span of $c, E_m, F_m, I_m, J_m$ for $m \geq 0$. It is clear that $K_{\geq 0}$ is a Lie subalgebra. Let $\ell$ be a complex number. Denote by $\mathbb{C}_\ell$ the 1-dimensional $K_{\geq 0}$-module with $c$ acting as $\ell$ and with all the other generators of $K_{\geq 0}$ acting trivially. Form the induced $K$-module

$$V_K(\ell, 0) = U(K) \otimes_{U(K_{\geq 0})} \mathbb{C}_\ell. \tag{6.7}$$


Set 1 = 1 ⊗ 1 ∈ VK (, 0). Then VK (, 0) is a vacuum K -module of level , which is universal in the obvious sense. In view of the connection of K with sl 2 ⊕ Cz, VK (, 0) is also a universal vacuum sl2 ⊕ Cz-module of level . If is generic, it is well known that the universal vacuum sl 2 ⊕ Cz-module of level is irreducible. It follows that VK (, 0) is an irreducible K -module if is generic. Set E = E(−1)1, F = F(−1)1, I = I (−1)1, J = J (−1)1 ∈ VK (, 0). Clearly, {1, E(x), F(x), I (x), J (x)} is S-local. It follows from [Li5] that there exists a (unique) non-degenerate quantum vertex algebra structure on VK (, 0) over C with 1 as the vacuum vector and with Y (E, x) = E(x), Y (F, x) = F(x), Y (I, x) = I (x), Y (J, x) = J (x). Proposition 6.10. Let W be any DY (sl2 )-module of level . Then W/W is a K -module of level with E m , Fm , Im , Jm for m ∈ Z acting as e˜m , f˜m , h˜ +m , h˜ − m , respectively. If W is a vacuum DY(sl2 )-module of level , then W/W is a vacuum K -module of level . Proof. The first assertion follows immediately from the defining relations in (6.4) and the relations in Lemma 6.6. If W is a vacuum DY (sl2 )-module of level , we see that W/W is a vacuum K -module of level . Theorem 6.11. Let be a complex number. Assume that there exists a vacuum DY (sl2 ) module V ( DY(sl2 ), ) of level which is universal in the obvious sense and which is topologically free. Then there exists a unique -adic weak quantum vertex algebra struc ture on V ( DY (sl2 ), ) with 1 as the vacuum vector and with ˜± Y (e˜−1 1, x) = e(x), ˜ Y ( f˜−1 1, x) = f˜(x), Y (h˜ ± −1 1, x) = h (x) .

(6.8)

If is generic, then V ( DY (sl2 ), ) is a non-degenerate -adic quantum vertex algebra. Furthermore, on any DY(sl2 )-module W of level , there exists a unique V ( DY (sl2 ), )-module structure such that ˜± YW (e˜−1 1, x) = e(x), ˜ YW ( f˜−1 1, x) = f˜(x), YW (h˜ ± −1 1, x) = h (x) . Proof. Let W be a DY (sl2 )-module of level . By Proposition 6.7, the -adic nonlocal " vertex algebra VW generated by U˜ W is a DY (sl2 )-module of level and the submodule (sl )-module of level with vacuum vector 1W . It generated by 1W is a vacuum DY 2 follows that there exists a DY(sl2 )-module homomorphism ψW from V ( DY (sl2 ), ) "W , sending 1 to 1W . Specializing W = V ( DY to V (sl ), ) and then applying Theo 2 ± rem 4.24 with V = V ( DY(sl2 ), ), U = {1, e˜−1 1, f˜−1 1, h˜ −1 1}, we obtain the first "W . The assertion. For a general W , from Theorem 4.16, W is a canonical module for V C[[]]-module map ψW satisfies ψW (1) = 1W , ψW (u m v) = u(x)m ψW (v)

for u ∈ U, v ∈ V, m ∈ Z.


As 1 generates V ( DY (sl2 ), ) as a DY(sl2 )-module, it follows that ψW is a homomorphism of -adic nonlocal vertex algebras. Then the last assertion follows. As for the second assertion, since is generic, VK (, 0) is an irreducible K -mod ule. Because V ( DY (sl2 ), )/V ( DY(sl2 ), ) is a vacuum K -module of level by Proposition 6.10, it follows that V ( DY (sl2 ), )/V ( DY(sl2 ), ) VK (, 0) as a K -module. It follows that this K -module isomorphism is also an isomorphism of nonlocal vertex algebras. As VK (, 0) is (irreducible) non-degenerate, V ( DY (sl2 ), ) is a non-degenerate -adic quantum vertex algebra. Acknowledgement. Part of this paper was finished during my visit at Shanghai Jiaotong University, China, in May 2008. I am very grateful to Professor Cuipo Jiang for her hospitality. I would like to thank the referees for valuable suggestions to put this paper in better shape.

References

[AB] Anguelova, I.I., Bergvelt, M.J.: $H_D$-Quantum vertex algebras and bicharacters. http://arXiv.org/abs/0706.1528[math.QA], 2007
[BK] Bakalov, B., Kac, V.: Field algebras. Internat. Math. Res. Notices 3, 123–159 (2003)
[B1] Borcherds, R.E.: Vertex algebras. In: “Topological Field Theory, Primitive Forms and Related Topics” (Kyoto, 1996), edited by Kashiwara, M., Matsuo, A., Saito, K., Satake, I. Progress in Math., Vol. 160, Boston: Birkhäuser, 1998, pp. 35–77
[B2] Borcherds, R.E.: Quantum vertex algebras. In: Taniguchi Conference on Mathematics Nara’98, Adv. Stud. Pure Math. 31, Tokyo: Math. Soc. Japan, 2001, pp. 51–74
[DL] Dong, C., Lepowsky, J.: Generalized Vertex Algebras and Relative Vertex Operators. Progress in Math. 112, Boston: Birkhäuser, 1993
[Dr1] Drinfeld, V.: Hopf algebras and quantum Yang-Baxter equation. Soviet Math. Dokl. 32, 1060–1064 (1985)
[Dr2] Drinfeld, V.: A new realization of Yangians and quantized affine algebras. Soviet Math. Dokl. 36, 212–216 (1988)
[En] Enriquez, B.: PBW and duality theorems for quantum groups and quantum current algebras. http://arXiv.org/abs/math/9904113v4[math.QA], 1999
[EK] Etingof, P., Kazhdan, D.: Quantization of Lie bialgebras, V. Selecta Math. (New Series) 6, 105–130 (2000)
[EFK] Etingof, P., Frenkel, I., Kirillov, A. Jr.: Lectures on Representation Theory and Knizhnik-Zamolodchikov Equations. Math. Surv. and Mono. 58, Providence, RI: Amer. Math. Soc., 1998
[FKRW] Frenkel, E., Kac, V., Radul, A., Wang, W.: $W_{1+\infty}$ and $W(gl_\infty)$ with central charge $n$. Commun. Math. Phys. 170, 337–357 (1995)
[FFR] Feingold, A., Frenkel, I.B., Ries, J.F.: Spinor construction of vertex operator algebras, triality, and $E_8^{(1)}$. Contemporary Math. 121, Providence, RI: Amer. Math. Soc., 1991
[FLM] Frenkel, I., Lepowsky, J., Meurman, A.: Vertex Operator Algebras and the Monster. Pure and Appl. Math. Vol. 134, Boston: Academic Press, 1988
[IK] Iohara, K., Konno, M.: A central extension of $DY_\hbar(gl_2)$ and its vertex representations. Lett. Math. Phys. 37, 319–328 (1996)
[KL] Karel, M., Li, H.-S.: Some quantum vertex algebras of Zamolodchikov-Faddeev type. Commun. Contemp. Math. 11, 829–863 (2009)
[K] Kassel, C.: Quantum Groups. GTM 155, Berlin-Heidelberg-New York: Springer-Verlag, 1995
[Kh] Khoroshkin, S.M.: Central extension of the Yangian double. http://arXiv.org/abs/q-alg/9602031v1, 1996
[KT] Khoroshkin, S., Tolstoy, V.: Yangian double. Lett. Math. Phys. 36, 373–402 (1996)
[LL] Lepowsky, J., Li, H.-S.: Introduction to Vertex Operator Algebras and Their Representations. Progress in Math. 227, Boston: Birkhäuser, 2004
[Li1] Li, H.-S.: Local systems of vertex operators, vertex superalgebras and modules. J. Pure Appl. Algebra 109, 143–195 (1996)
[Li2] Li, H.-S.: Axiomatic $G_1$-vertex algebras. Commun. Contemp. Math. 5, 281–327 (2003)
[Li3] Li, H.-S.: Pseudoderivations, pseudoautomorphisms and simple current modules for vertex algebras. In: Contemporary Math. 392, Providence, RI: Amer. Math. Soc., 2005, pp. 55–65
[Li4] Li, H.-S.: Nonlocal vertex algebras generated by formal vertex operators. Selecta Math. (New Series) 11, 349–397 (2005)
[Li5] Li, H.-S.: Constructing quantum vertex algebras. International J. Math. 17, 441–476 (2006)
[Li6] Li, H.-S.: Modules-at-infinity for quantum vertex algebras. Commun. Math. Phys. 282, 819–864 (2008)
[LTW] Li, H.-S., Tan, S., Wang, Q.: Twisted modules for quantum vertex algebras. J. Pure Appl. Alg. 214, 201–220 (2010)
[MP] Meurman, A., Primc, M.: Annihilating fields of standard modules of sl(2, C) and combinatorial identities. Memoirs Amer. Math. Soc. 137 (652) (1999)

Communicated by Y. Kawahigashi

Commun. Math. Phys. 296, 525–557 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1008-9

Communications in

Mathematical Physics

Controllability Issues for Continuous-Spectrum Systems and Ensemble Controllability of Bloch Equations

Karine Beauchard1, Jean-Michel Coron2, Pierre Rouchon3

1 CMLA, ENS Cachan, CNRS, Universud, 61 avenue du Président Wilson, F-94230 Cachan, France. E-mail: [email protected]

2 Institut Universitaire de France and Université Pierre et Marie Curie-Paris 6, UMR 7598 Laboratoire Jacques-Louis Lions, Paris, F-75005, France. E-mail: [email protected]

3 Mines ParisTech, Centre Automatique et Systèmes, Mathématiques et Systèmes, 60, boulevard Saint-Michel, 75272 Paris CEDEX, France. E-mail: [email protected]

Received: 16 March 2009 / Accepted: 10 November 2009
Published online: 21 February 2010 – © Springer-Verlag 2010

KB, JMC and PR were partially supported by the “Agence Nationale de la Recherche” (ANR), Projet Blanc C-QUID number BLAN-3-139579.

Abstract: We study the controllability of the Bloch equation, for an ensemble of non interacting half-spins, in a static magnetic field, with dispersion in the Larmor frequency. This system may be seen as a prototype for infinite dimensional bilinear systems with continuous spectrum, whose controllability is not well understood. We provide several mathematical answers, with discrimination between approximate and exact controllability, and between finite time or infinite time controllability: this system is not exactly controllable in finite time $T$ with bounded controls in $L^2(0,T)$, but it is approximately controllable in $L^\infty$ in finite time with unbounded controls in $L^\infty_{loc}([0,+\infty))$. Moreover, we propose explicit controls realizing the asymptotic exact controllability to a uniform state of spin +1/2 or −1/2.

1. Introduction

1.1. Studied system, bibliography. Most controllability results available for infinite dimensional bilinear systems are related to systems with discrete spectra (see for instance, [2–4] for exact controllability results and [5,14] for approximate controllability results). As far as we know, very few controllability studies consider systems admitting a continuous part in their spectra. In [13] an approximate controllability result is given for a system with mixed discrete/continuous spectrum: the Schrödinger partial differential equation of a quantum particle in an N-dimensional decaying potential is shown to be approximately controllable (in infinite time) to the ground bound state when the initial state is a linear superposition of bound states. In [10–12] a controllability notion, called ensemble controllability, is introduced and discussed for quantum systems described by a family of ordinary differential equations (Bloch equations) depending continuously on a finite number of scalar parameters and with a finite number of control inputs. Ensemble controllability means that it is possible

to find open-loop controls that compensate for the dispersion in these scalar parameters: the goal is to simultaneously steer a continuum of systems between states of interest with the same control input. The articles [10–12] highlight, for three common dispersions in NMR spectroscopy, the role of Lie algebras and non-commutativity in the design of a compensating control sequence and consequently in the characterization of ensemble controllability. Such a continuous family of ordinary differential systems sharing the same control inputs can be seen as the prototype of infinite dimensional systems with purely continuous spectra.

The goal of this paper is to show that the very interesting controllability analysis of [10–12] can be completed by functional analysis methods developed for infinite dimensional systems governed by partial differential equations (see, e.g., [8] for samples of these methods). We focus here on one of the three dispersion cases treated in [10–12]. We consider an ensemble of non interacting half-spins in a static field $(0, 0, B_0)^T$ in $\mathbb{R}^3$, subject to a transverse radio frequency field $(v(t), -u(t), 0)^T$ in $\mathbb{R}^3$ (the control input). The ensemble of half-spins is described by the magnetization vector $M \in \mathbb{R}^3$ depending on time $t$ but also on the Larmor frequency $\omega = -\gamma B_0$ ($\gamma$ is the gyromagnetic ratio). It obeys the Bloch equation:

$$\frac{\partial M}{\partial t}(t,\omega) = \begin{pmatrix} 0 & -\omega & v(t) \\ \omega & 0 & -u(t) \\ -v(t) & u(t) & 0 \end{pmatrix} M(t,\omega), \quad (t,\omega) \in [0,+\infty) \times (\omega_*, \omega^*), \tag{1}$$

where $-\infty \leq \omega_* < \omega^* \leq +\infty$ are given. With the notations

$$\Omega_x := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad \Omega_y := \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad \Omega_z := \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{2}$$

the system (1) can be written

$$\frac{\partial M}{\partial t}(t,\omega) = \bigl(\omega\Omega_z + u(t)\Omega_x + v(t)\Omega_y\bigr) M(t,\omega), \quad (t,\omega) \in [0,+\infty) \times (\omega_*, \omega^*). \tag{3}$$

It is a bilinear control system in which, at time $t$,
• the state is $(M(t,\omega))_{\omega \in (\omega_*,\omega^*)}$; for each $\omega$, $M(t,\omega) \in \mathbb{S}^2$, the unit sphere of $\mathbb{R}^3$,
• the two control inputs $u(t)$ and $v(t)$ are real.

In the sequel, we denote by $e_k$ the $\mathbb{R}^3$-vector of coordinates $(\delta_{k,i})_{i \in \{1,2,3\}}$. Thus, we study the simultaneous controllability of a continuum of ordinary differential equations, with respect to a parameter $\omega$ that belongs to an interval $(\omega_*, \omega^*)$ (let us mention [16] in which the simultaneous controllability of an ODE with respect to a discrete parameter is addressed). Notice that, when $v = u = 0$, the spectrum of this system is the union of the two segments $i(\omega_*, \omega^*)$ and $-i(\omega_*, \omega^*)$ of the imaginary axis.

The pioneering articles [10–12] provide convincing arguments indicating why the system (3) is ensemble controllable (i.e. approximately controllable in $L^2((\omega_*, \omega^*), \mathbb{S}^2)$) with unbounded and also bounded controls, when $\omega_*$ and $\omega^*$ are finite. Here, we provide several mathematical results that complete these ensemble controllability results, distinguishing between approximate and exact controllability and between finite time and infinite time (asymptotic) controllability.
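As a quick sanity check of the notations (2), the following minimal sketch (plain Python; the helper names `bloch_generator`, `matrix_in_eq1` and the sample values of ω, u, v are assumptions made only for illustration) rebuilds the combination of the three matrices appearing in (3), confirms that it coincides with the matrix written out in (1), and checks that it is skew-symmetric, which is the reason each M(t, ω) stays on the unit sphere.

```python
# The three matrices of (2); names are hypothetical, chosen for this sketch.
OMEGA_X = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
OMEGA_Y = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
OMEGA_Z = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

def bloch_generator(omega, u, v):
    """Return omega*OMEGA_Z + u*OMEGA_X + v*OMEGA_Y as a 3x3 nested list."""
    return [[omega * OMEGA_Z[i][j] + u * OMEGA_X[i][j] + v * OMEGA_Y[i][j]
             for j in range(3)] for i in range(3)]

def matrix_in_eq1(omega, u, v):
    """The matrix written out explicitly in (1)."""
    return [[0, -omega, v], [omega, 0, -u], [-v, u, 0]]

def is_skew_symmetric(a):
    """True when a[i][j] == -a[j][i] for all indices."""
    return all(a[i][j] == -a[j][i] for i in range(3) for j in range(3))

omega, u, v = 1.5, 0.25, -0.75   # arbitrary sample values
A = bloch_generator(omega, u, v)
same_as_eq1 = (A == matrix_in_eq1(omega, u, v))
skew = is_skew_symmetric(A)
print(same_as_eq1, skew)
```

Skew-symmetry gives d|M|²/dt = 2 M·(AM) = 0, so the flow of (3) is a rotation for every ω and every choice of controls.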


1.2. Controllability issues. Let us recall a famous non controllability result for infinite dimensional bilinear systems due to Ball, Marsden and Slemrod [1]. This result concerns general systems of the form

$$\frac{dw}{dt} = Aw + p(t)Bw, \tag{4}$$

where the state is $w$ and the control is $p : [0,T] \to \mathbb{R}$.

Theorem 1. Let $X$ be a Banach space with $\dim(X) = +\infty$, let $A$ generate a $C^0$-semigroup of bounded operators on $X$ and let $B : X \to X$ be a bounded operator. For $w_0 \in X$, $w(t; p, w_0)$ denotes the unique solution of (4) with $p \in L^1_{loc}([0,+\infty))$ and $w(0) = w_0$. The reachable set from $w_0$,

$$R(w_0) := \{w(t; p, w_0);\ t \geq 0,\ p \in L^r_{loc}([0,+\infty)),\ r > 1\},$$

is contained in a countable union of compact subsets of $X$ and, in particular, it has an empty interior in $X$. Thus (4) is not controllable in $X$ with controls in $\cup_{r>1} L^r_{loc}([0,+\infty))$.

We cannot apply this result directly here since the spaces $X = L^2((\omega_*, \omega^*), \mathbb{S}^2)$ or $C^0([\omega_*, \omega^*], \mathbb{S}^2)$, where the Cauchy problem is well-defined, are not vector spaces. In order to get an interesting result for the Bloch equation, one needs extensions of the above result to Banach manifolds. (This has been done in [15] when the manifold is the unit sphere of a Hilbert space.)

For (2), the situation is similar to the one described in Theorem 1. In Theorem 5, we show that for any analytic initial condition $M_0(\omega)$, the reachable set in finite time $T > 0$ from $M_0$ with controls in $L^2(0,T)$ only contains analytic functions of $\omega$. Thus, the reachable set (from an analytic initial data) has an empty interior in $L^2((\omega_*, \omega^*), \mathbb{S}^2)$, which is a natural space for the Cauchy problem.

However, for (2), the obstruction to exact controllability given by Theorem 5 has a much stronger consequence than the obstruction described by Theorem 1 which is, in fact, a rather weak non controllability result. Indeed, Theorem 1 does not prevent the reachable set from being dense in $X$ (approximate controllability in $X$). For example, this is the case for the 1D beam equation

$$u_{tt} + u_{xxxx} + p(t)u_{xx} = 0,\ x \in (0,1),\ t \in (0,+\infty), \qquad u = u_x = 0 \text{ at } x = 0, 1,$$

in which the state is $(u, u_t)$ and the control is $p$. Theorem 1 ensures that this system is not exactly controllable in $H_0^2 \times L^2(0,1)$ with controls in $L^r_{loc}([0,+\infty))$, $r > 1$. However, it is proved in [3] that this system is exactly controllable in $H^{5+\epsilon} \times H^{3+\epsilon}(0,1)$ with controls in $H_0^1(0,T)$, at least locally around a stationary trajectory. Similarly, Turinici’s generalization [15] of Theorem 1 applies to 1D Schrödinger equations of the form

$$i\frac{\partial \psi}{\partial t} = -\frac{\partial^2 \psi}{\partial x^2} - u(t)\mu(x)\psi,\ x \in (0,1),\ t \in (0,+\infty), \qquad \psi(t,0) = \psi(t,1) = 0,$$

where the state is $\psi$, the control is $u$ and $\mu \in C^\infty([0,1])$. It proves that this system is not exactly controllable in $H^2((0,1), \mathbb{C})$ with controls in $L^2_{loc}([0,+\infty))$. However it is proved in [2,4] that this system, with $\mu(x) = (x - 1/2)$, is exactly controllable


in $H^7((0,1), \mathbb{C})$ with controls in $H_0^1(0,T)$, locally around the eigenstates, for $T$ large enough. The conclusion of [2–4] is that, sometimes, the negative result of Theorem 1 is only due to an unfortunate choice of functional spaces that do not allow the controllability; but positive controllability results may be expected in different functional spaces.

Therefore, one may still hope to prove the exact controllability of the Bloch equation in some well chosen functional spaces. We will see in this article that it is not the case: the Bloch equation is not exactly controllable in a much stronger sense than the one of Theorem 1. Indeed, we will prove that, when $(\omega_*, \omega^*) = (-\infty, +\infty)$, the reachable set (in finite time $T$ and with small $L^2(0,T)$-controls) from $M_0 \equiv e_3$ is a non-flat submanifold of the functional space $L^2 \cap C_b^0(\mathbb{R})$, with infinite codimension. When the domain $(\omega_*, \omega^*)$ is a bounded interval of $\mathbb{R}$, we will see that there exist analytic targets, arbitrarily close to $e_3$, that cannot be reached exactly from $e_3$ with bounded controls in $L^2(0,T)$. Thus, the non controllability of (3) is not related to a regularity problem and this equation corresponds to a very different situation from [2–4].

1.3. Outline and open problems. In Sect. 2, we study the linearized system of (3) around the steady-state $(M \equiv e_3, (u,v) \equiv 0)$ with $-\infty < \omega_* < \omega^* < +\infty$. This system is shown to be approximately controllable in $C^0([\omega_*, \omega^*], \mathbb{R}^3)$, in any finite time $T$, with unbounded controls $(u,v) \in C_c^\infty((0,T), \mathbb{R}^2)$. But it is not exactly controllable either in finite time or in infinite time. Moreover, for any reachable target, there exists only one control which steers the control system to the target.

In Sect. 3, we study the exact controllability of the nonlinear system (3), locally around $M \equiv e_3$, in finite time. First, we prove that the simultaneous exact controllability with respect to $\omega$ in the whole space $\mathbb{R}$ (i.e. $\omega_* = -\infty$, $\omega^* = +\infty$) does not hold with bounded controls. Indeed, for every time $T > 0$, the reachable set, from $M_0 \equiv e_3$, with bounded controls in $L^2(0,T)$, is a non flat submanifold of the functional space $L^2 \cap C_b^0(\mathbb{R})$, with infinite codimension. Then, with an analyticity argument, we deduce that the simultaneous exact controllability with respect to $\omega$ in a bounded interval $(\omega_*, \omega^*)$, $-\infty < \omega_* < \omega^* < +\infty$, does not hold either.

The exact controllability of (3) being impossible with bounded controls, in Sects. 4 and 5, we investigate the exact controllability of (3) with unbounded controls. In Sect. 4, completing the arguments of [10–12], we prove the ensemble controllability of (3): any measurable initial condition $M_0 : (\omega_*, \omega^*) \to \mathbb{S}^2$ can be steered approximately in $L^2(\omega_*, \omega^*)$ to $e_3$. This approximate controllability indeed holds for stronger norms, for instance $\|\cdot\|_{L^\infty}$ and $\|\cdot\|_{H^s}$, $\forall s \in (0,1)$. The controls used to realize this motion are sequences of pulses presented in [10] (but one may also use controls in $L^\infty_{loc}([0,+\infty))$) and the proof relies on non-commutativity and functional analysis.

In Sect. 5, we propose other explicit unbounded controls realizing the asymptotic local (exact) controllability to $e_3$, simultaneously with respect to $\omega$ in a bounded interval. Here, the proof relies on Fourier analysis.

Finally, in Sect. 6, we compare the feasibility, the time and the cost of the two controllability processes presented in Sects. 4 and 5, on particular motions.

Let us emphasize that the behavior of the nonlinear system around $e_3$ is very different from the one of the linearized system around $e_3$. Indeed,
• first, the linearized system is not asymptotically zero controllable whereas the nonlinear system is asymptotically locally controllable to $e_3$,
• then, as seen in Sect. 2, for the linearized system and for any reachable target, only a single control works, whereas for the nonlinear system and for any initial condition, many controls allow one to reach $e_3$ exactly (in infinite time).


Thus, the nonlinearity allows one to recover controllability.

Finally, let us mention some open problems. In Sect. 3, we prove the non exact controllability with bounded $L^2$-controls, in finite time, because the reachable set from $e_3$ is a non flat submanifold of the functional space $L^2 \cap C_b^0(\mathbb{R})$, with infinite codimension. The equation of this submanifold and the validity of the same negative result in infinite time (i.e. the non asymptotic exact controllability to $e_3$ with bounded controls) are open problems. In Sect. 5, we prove the exact controllability to $e_3$ with unbounded controls, in infinite time. The validity of the same result in finite time is also open.

Before starting the mathematical study let us introduce some notations that will be used throughout the paper. We write

$$M(t,\omega) := \begin{pmatrix} x(t,\omega) \\ y(t,\omega) \\ z(t,\omega) \end{pmatrix}, \tag{5}$$

$$Z(t,\omega) := (x + iy)(t,\omega), \qquad w(t) := (-v + iu)(t). \tag{6}$$

Thus, when, for some time $T > 0$, $z(t,\omega) > 0$ on $(0,T) \times (\omega_*, \omega^*)$, then the system (3) implies

$$\frac{\partial Z}{\partial t}(t,\omega) = i\omega Z(t,\omega) - w(t)\sqrt{1 - |Z(t,\omega)|^2}, \quad (t,\omega) \in (0,T) \times (\omega_*, \omega^*), \tag{7}$$

so

$$Z(t,\omega) = \left( Z_0(\omega) - \int_0^t w(\tau)\sqrt{1 - |Z(\tau,\omega)|^2}\, e^{-i\omega\tau}\, d\tau \right) e^{i\omega t}, \quad (t,\omega) \in (0,T) \times (\omega_*, \omega^*).$$

Unless otherwise specified, the functions considered are complex valued and, for example, we write $L^2(\mathbb{R})$ for $L^2(\mathbb{R}, \mathbb{C})$. When the functions considered are real valued we specify it and, for example, we write $L^2(\mathbb{R}, \mathbb{R})$. The notation $C_b^0$ refers to continuous and bounded functions.

2. Linearized System Around $(M \equiv e_3, (u,v) \equiv 0)$

In this section $-\infty < \omega_* < \omega^* < +\infty$. We are interested in the linearized system of (3) around $(M \equiv e_3, (u,v) \equiv 0)$, or, equivalently, in the linearized system of (7) around $(Z \equiv 0, w \equiv 0)$,

$$\dot{Z}(t,\omega) = i\omega Z(t,\omega) - w(t), \qquad Z(0,\omega) = Z_0(\omega), \tag{8}$$

whose solution is

$$Z(t,\omega) = \left( Z_0(\omega) - \int_0^t w(\tau)e^{-i\omega\tau}\, d\tau \right) e^{i\omega t}. \tag{9}$$

We prove its non exact controllability and its approximate controllability with unbounded controls.
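The explicit solution (9) can be checked numerically. The sketch below is only an illustration under assumed data: the control `w`, the frequency, the initial value and the step counts are arbitrary choices, and the helper names are hypothetical. It evaluates the right-hand side of (9) with a Simpson quadrature and compares it with a direct Runge-Kutta integration of (8).

```python
import cmath
import math

def w(t):
    # Arbitrary smooth complex control chosen for illustration.
    return math.cos(t) + 0.5j

def simpson(f, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

def z_formula(t, omega, z0):
    """Right-hand side of (9)."""
    integral = simpson(lambda tau: w(tau) * cmath.exp(-1j * omega * tau), 0.0, t)
    return (z0 - integral) * cmath.exp(1j * omega * t)

def z_rk4(t_end, omega, z0, steps=2000):
    """Integrate (8) with the classical fourth-order Runge-Kutta scheme."""
    def rhs(t, z):
        return 1j * omega * z - w(t)
    h = t_end / steps
    t, z = 0.0, z0
    for _ in range(steps):
        k1 = rhs(t, z)
        k2 = rhs(t + h / 2, z + h / 2 * k1)
        k3 = rhs(t + h / 2, z + h / 2 * k2)
        k4 = rhs(t + h, z + h * k3)
        z += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return z

omega, z0, T = 2.0, 0.2 + 0.1j, 1.0
gap = abs(z_formula(T, omega, z0) - z_rk4(T, omega, z0))
print(gap)   # small: both computations agree up to discretisation error
```

Differentiating (9) in t gives back (8) directly; the numerical comparison is just a safeguard against sign mistakes in the exponentials.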


2.1. Non exact controllability. We denote by $\mathcal{F}$ the 1-D Fourier transform:

$$\mathcal{F}(w)(\omega) = \int_{\mathbb{R}} w(t)e^{-i\omega t}\, dt.$$

When a function is defined on $I \subset \mathbb{R}$, we extend it by 0 on $\mathbb{R} \setminus I$. One has the following proposition.

Proposition 1. Let $T \in (0,+\infty)$. The reachable set from $Z_0 = 0$ for (8) with controls $w \in L^1(0,T)$ is

$$\{Z(T);\ w \in L^1(0,T)\} = \mathcal{F}[L^1(-T,0)].$$

The set of initial conditions $Z_0$ that are asymptotically zero controllable with controls $w \in L^1(0,+\infty)$ for (8) is $\mathcal{F}[L^1(0,+\infty)]$. For every $Z_0 \in \mathcal{F}[L^1(0,+\infty)]$, the function $w := \mathcal{F}^{-1}[Z_0]$ is the unique control in $L^1(0,+\infty)$ that steers the control system (8) from $Z_0$ to 0.

Proof of Proposition 1. The first two statements are direct consequences of the explicit expression (9). Concerning the third statement, it is sufficient to prove that if $w \in L^1(0,+\infty)$ and if $\mathcal{F}[w] \equiv 0$ on $(\omega_*, \omega^*)$, then $w = 0$. Let $w$ be such a function and consider $\varphi : \mathbb{C}_+ \cup \mathbb{C}_- \cup (\omega_*, \omega^*) \to \mathbb{C}$, defined by

$$\varphi(z) := \begin{cases} \mathcal{F}[w](z), & \text{if } z \in \mathbb{C}_-, \\ \overline{\mathcal{F}[w](\bar{z})}, & \text{if } z \in \mathbb{C}_+, \\ 0, & \text{if } z \in (\omega_*, \omega^*), \end{cases}$$

where $\mathbb{C}_+ := \{z \in \mathbb{C};\ \Im(z) > 0\}$ and $\mathbb{C}_- := \{z \in \mathbb{C};\ \Im(z) < 0\}$. Then $\varphi$ is holomorphic on $\mathbb{C}_+$ and on $\mathbb{C}_-$ and continuous on $\mathbb{C}_+ \cup \mathbb{C}_- \cup (\omega_*, \omega^*)$, so it is holomorphic on $\mathbb{C}_+ \cup \mathbb{C}_- \cup (\omega_*, \omega^*)$. Since $\varphi$ vanishes on $(\omega_*, \omega^*)$, then $\varphi \equiv 0$. Thus $w = 0$.
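To make the asymptotic statement of Proposition 1 concrete, here is a small sketch under an assumed example that is not taken from the paper: the control w(t) = exp(−t) on (0, +∞), whose Fourier transform is 1/(1 + iω). Plugging Z_0 = 1/(1 + iω) and this control into the explicit solution of the linearized system gives Z(t, ω) = exp(−t)/(1 + iω), which tends to 0 uniformly in ω. The code compares a quadrature evaluation with this closed form at a few arbitrary frequencies; the helper names are hypothetical.

```python
import cmath
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

def z0(omega):
    # Fourier transform of w(t) = exp(-t) on (0, +inf): 1 / (1 + i*omega).
    return 1.0 / (1.0 + 1j * omega)

def z_at(t, omega):
    """Explicit solution of the linearized system with Z_0 = F[w], w(t) = exp(-t)."""
    integral = simpson(
        lambda tau: math.exp(-tau) * cmath.exp(-1j * omega * tau), 0.0, t)
    return (z0(omega) - integral) * cmath.exp(1j * omega * t)

omegas = [-3.0, -1.0, 0.0, 0.5, 2.0]   # arbitrary sample frequencies
times = [0.0, 1.0, 5.0, 10.0]

# Sup over the sampled frequencies of |Z(t, .)| at each time.
sup_norms = [max(abs(z_at(t, om)) for om in omegas) for t in times]

# Distance to the closed form exp(-t) / (1 + i*omega).
max_err = max(abs(z_at(t, om) - math.exp(-t) / (1.0 + 1j * om))
              for t in times for om in omegas)
print(sup_norms, max_err)
```

The decay of `sup_norms` reflects the identity Z(t, ω) = (∫_t^∞ w(τ) e^{−iωτ} dτ) e^{iωt}, whose modulus is bounded by the tail of the L¹ norm of w.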

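2.2. Approximate controllability with unbounded controls.

Proposition 2. Let $T > 0$, $Z_f \in C^0([\omega_*, \omega^*])$ and $\eta > 0$. There exists $w \in C_c^\infty((0,T))$ such that the solution of (8) with $Z_0 = 0$ satisfies

$$\|Z(T) - Z_f\|_{L^\infty(\omega_*, \omega^*)} < \eta.$$

Proof of Proposition 2. Let $T > 0$, $Z_f \in C^0([\omega_*, \omega^*])$ and $\eta > 0$. Thanks to the Weierstrass theorem, there exists a polynomial $P \in \mathbb{C}[X]$ such that

$$\|Z_f(\omega)e^{-i\omega \frac{T}{2}} - P(i\omega)\|_{L^\infty(\omega_*, \omega^*)} < \frac{\eta}{2}.$$

Applying the control $w(t) := -P(\partial_t)\delta_{T/2}(t)$ in (9) with $Z_0 = 0$, we get

$$Z(T,\omega) = -\mathcal{F}[w](\omega)e^{i\omega T} = P(i\omega)e^{-i\omega \frac{T}{2}}e^{i\omega T} = P(i\omega)e^{i\omega \frac{T}{2}},$$

thus

$$\|Z(T) - Z_f\|_{L^\infty(\omega_*, \omega^*)} < \frac{\eta}{2}.$$

Now, let us smooth this control candidate in order to provide a smooth control. Let $g \in C_c^\infty((-1,1), \mathbb{R}_+)$ be such that $\int_{\mathbb{R}} g = 1$. For $\epsilon \in (0, T/2)$, the function

$$g_\epsilon(t) := \frac{1}{\epsilon}\, g\!\left(\frac{t - T/2}{\epsilon}\right)$$

is supported in $(0,T)$. Applying the control $w_\epsilon(t) := -P(\partial_t)g_\epsilon(t)$ in (8), we get $Z_\epsilon(T,\omega) = P(i\omega)\hat{g}_\epsilon(\omega)e^{i\omega T}$. Noticing that

$$\hat{g}_\epsilon(\omega) - e^{-i\omega \frac{T}{2}} = \left( \int_{-1}^{1} g(y)\bigl[e^{-i\omega\epsilon y} - 1\bigr]\, dy \right) e^{-i\omega \frac{T}{2}},$$

we get

$$\bigl\|P(i\omega)\bigl[\hat{g}_\epsilon(\omega) - e^{-i\omega \frac{T}{2}}\bigr]\bigr\|_{L^\infty(\omega_*, \omega^*)} \leq \|P(i\omega)\|_{L^\infty(\omega_*, \omega^*)}\, \bigl\|\hat{g}_\epsilon(\omega) - e^{-i\omega \frac{T}{2}}\bigr\|_{L^\infty(\omega_*, \omega^*)} \to 0 \text{ when } \epsilon \to 0.$$

Thus, for $\epsilon$ small enough, $\|Z_\epsilon(T) - Z_f\|_{L^\infty(\omega_*, \omega^*)} < \eta$.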
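The smoothing step of this proof can be illustrated numerically. The sketch below is an assumption-laden illustration: the particular bump function, the midpoint quadrature, the frequency and the scales are choices made here, not taken from the paper. It evaluates the modulus of the integral of g(y)[exp(−iω·scale·y) − 1] over (−1, 1), which is exactly the distance between the Fourier transform of the rescaled bump and the pure phase exp(−iωT/2), and checks that it shrinks with the scale and is bounded by scale·|ω|.

```python
import cmath
import math

def bump(y):
    """Unnormalised smooth bump supported in (-1, 1)."""
    return math.exp(-1.0 / (1.0 - y * y)) if abs(y) < 1.0 else 0.0

N = 2000
H = 2.0 / N
NODES = [-1.0 + (k + 0.5) * H for k in range(N)]   # midpoint rule on (-1, 1)
MASS = sum(bump(y) for y in NODES) * H             # normalising constant

def g(y):
    """Bump normalised so that its midpoint-rule integral equals 1."""
    return bump(y) / MASS

def mollifier_error(eps, omega):
    """Modulus of the integral of g(y) * (exp(-i*omega*eps*y) - 1) over (-1, 1).

    This equals |g_eps_hat(omega) - exp(-i*omega*T/2)| and does not depend on T.
    """
    total = sum(g(y) * (cmath.exp(-1j * omega * eps * y) - 1.0) for y in NODES) * H
    return abs(total)

omega = 3.0                          # arbitrary sample frequency
eps_values = [0.4, 0.2, 0.1, 0.05]   # decreasing smoothing scales
errors = [mollifier_error(eps, omega) for eps in eps_values]
print(errors)
```

The bound by scale·|ω| follows from |exp(iθ) − 1| ≤ |θ| and |y| ≤ 1 on the support of g; on a bounded frequency interval it gives the uniform convergence used in the proof.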

3. Non-exact Controllability with Bounded Controls

In this section, we study the reachable set from $M(0) \equiv e_3$ for (1) with bounded controls $(u,v) \in L^2((0,T), \mathbb{R}^2)$. Notice that, when $M(0) \equiv e_3$ and $w$ is small enough in $L^1(0,T)$, then $z(t,\omega) > 0$ for every $(t,\omega) \in (0,T) \times \mathbb{R}$ and

$$Z(t,\omega) = -\left( \int_0^t w(\tau)\sqrt{1 - |Z(\tau,\omega)|^2}\, e^{-i\omega\tau}\, d\tau \right) e^{i\omega t}, \quad \forall (t,\omega) \in [0,T] \times \mathbb{R}. \tag{10}$$

3.1. Case $\omega_* = -\infty$, $\omega^* = +\infty$. In this section, we take $\omega_* = -\infty$, $\omega^* = +\infty$. In a first subsection we make precise the functional framework in which (10) is well posed. In a second subsection, we prove that the reachable set from zero, with bounded controls $(u,v)$ in $L^2((0,T), \mathbb{R}^2)$, is a non flat submanifold of the functional space $L^2 \cap C_b^0(\mathbb{R})$, with infinite codimension. In particular, (3) is not locally controllable with bounded controls $(u,v)$ in $L^2((0,T), \mathbb{R}^2)$.

3.1.1. Solutions on $[0,T]$.

Proposition 3. Let $T > 0$ and $R := 1/(2\sqrt{T})$. For every $w \in L^2(0,T)$ with $\|w\|_{L^2(0,T)} < R$, there exists a unique $Z \in C^0([0,T], L^2(\mathbb{R})) \cap C_b^0([0,T] \times \mathbb{R})$ solution of (10) and it satisfies

$$\|Z\|_{L^\infty((0,T) \times \mathbb{R})} \leq \sqrt{T}\, \|w\|_{L^2(0,T)}, \tag{11}$$

$$\|Z\|_{C^0([0,T], L^2(\mathbb{R}))} \leq 2\sqrt{2\pi}\, \|w\|_{L^2(0,T)}. \tag{12}$$

Moreover, for every $w_1, w_2 \in L^2(0,T)$ with $\|w_1\|_{L^2(0,T)} < R$ and $\|w_2\|_{L^2(0,T)} < R$, we have

$$\|Z_1 - Z_2\|_{L^\infty((0,T) \times \mathbb{R})} \leq 2\sqrt{T}\, \|w_1 - w_2\|_{L^2(0,T)}, \tag{13}$$

$$\|Z_1 - Z_2\|_{C^0([0,T], L^2(\mathbb{R}))} \leq 4\sqrt{2\pi}\, \|w_1 - w_2\|_{L^2(0,T)}, \tag{14}$$

where, for $j \in \{1,2\}$, $Z_j$ denotes the unique solution of (10) for $w := w_j$.


√ Proof of Proposition 3. Let T > 0 and c := 1/ 3, which is chosen so that | f (x)| c, ∀x ∈ [0, 1/2], where f (x) := 1 − x2 .

(15)

Let w ∈ L 2 (0, T ) be such that w L 2 (0,T ) < R. We apply the Banach fixed point theorem to the map defined on B := C 0 ([0, T ], L 2 (R)) ∩ C 0 ([0, T ] × R, BC (0, 1/2)) by (ξ ) = Z where BC (0, 1/2) := {ξ ∈ C; |z| 1/2}, t Z (t, ω) = − w(τ ) 1 − |ξ(τ, ω)|2 e−iωτ dτ eiωt , ∀(t, ω) ∈ [0, T ] × R. 0

Note that B is a nonempty closed subset of the√Banach space C 0 ([0, T ], L 2 (R)). First Step: takes its values in B because R T 1/2. Let ξ ∈ B and Z := (ξ ). The Cauchy-Schwarz inequality leads to √ √ |Z (t, ω)| w L 1 (0,T ) T w L 2 (0,T ) T R 1/2. (16) Thus Z ∈ C 0 ([0, T ] × R, BC (0, 1/2)). Thanks to the decomposition t w(τ ) 1 − |ξ(τ, ω)|2 − 1 e−iωτ dτ eiωt , (17) Z (t, ω) = −F[τ−t w|[0,t] ](ω) − 0

where τa (ϕ)(s) := ϕ(s − a), thanks to the Plancherel theorem, (15) and the CauchySchwarz inequality, we get √ Z (t) L 2 (R) 2πw L 2 (0,T )

t 2 1/2 + w(τ ) 1 − |ξ(τ, ω)|2 − 1 e−iωτ dτ dω R

√

0

w2L 2 (0,T )

t

1/2

2πw L 2 (0,T ) + c |ξ(τ, ω)| dτ dω 0 R √ √ w L 2 (0,T ) 2π + c T ξ C 0 ([0,T ],L 2 (R)) , 2

2

(18)

so Z (t) ∈ L 2 (R) for every t ∈ [0, T ]. In the right-hand side of (17), the first term belongs to C 0 ([0, T ], L 2 (R)) and the second term also, as one can prove by applying the dominated convergence theorem with the following domination, that holds for every (t, ω) ∈ [0, T ] × R,

T 1/2 t −iωτ 2 2 w(τ ) 1 − |ξ(τ, ω)| − 1 e dτ cw L 2 (0,T ) |ξ(τ, ω)| dτ . 0

√

0

Second step: is a contraction on B because c T R < 1. Let ξ1 , ξ2 ∈ B, Z 1 := (ξ1 ) and Z 2 := (ξ2 ). We have t 1 − |ξ1 (τ, ω)|2 − 1 − |ξ2 (τ, ω)|2 e−iωτ dτ eiωt . (Z 1 − Z 2 )(t, ω) = − w(τ ) 0

Controllability Issues for Continuous-Spectrum Systems and Bloch Equations

533

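Using (15) and the Cauchy-Schwarz inequality, we get
\[
\|Z_1 - Z_2\|_{L^\infty((0,T)\times\mathbb{R})} \le \|w\|_{L^1(0,T)}\, c\, \|\xi_1-\xi_2\|_{L^\infty((0,T)\times\mathbb{R})} \le c\sqrt{T}R\,\|\xi_1-\xi_2\|_{L^\infty((0,T)\times\mathbb{R})}.
\]
Working as in (18), we also get
\[
\|(Z_1-Z_2)(t,.)\|_{L^2(\mathbb{R})} \le c\sqrt{T}R\,\|\xi_1-\xi_2\|_{C^0([0,T],L^2(\mathbb{R}))}, \quad \forall t \in [0,T].
\]

Third step: Proof of (11) and (12) thanks to $c\sqrt{T}R \le 1/2$. Since $c\sqrt{T}R \le 1/2$, the inequalities (11) and (12) are consequences of (16) and (18) with $\xi = Z$.

Fourth step: Proof of (13) and (14) thanks to $c\sqrt{T}R \le 1/2$. Using the decomposition
\[
\begin{aligned}
(Z_1 - Z_2)(t,\omega) = {} & -\int_0^t (w_1-w_2)(\tau)\sqrt{1-|Z_1(\tau,\omega)|^2}\,e^{-i\omega\tau}\,d\tau\, e^{i\omega t} \\
& -\int_0^t w_2(\tau)\Big[\sqrt{1-|Z_1(\tau,\omega)|^2}-\sqrt{1-|Z_2(\tau,\omega)|^2}\Big]e^{-i\omega\tau}\,d\tau\, e^{i\omega t},
\end{aligned}
\]
the Cauchy-Schwarz inequality and (15), we get
\[
|(Z_1-Z_2)(t,\omega)| \le \sqrt{T}\,\|w_1-w_2\|_{L^2(0,T)} + c\sqrt{T}R\,\|Z_1-Z_2\|_{L^\infty((0,T)\times\mathbb{R})}, \quad \forall (t,\omega)\in[0,T]\times\mathbb{R},
\]
which, since $c\sqrt{T}R \le 1/2$, leads to
\[
\|Z_1-Z_2\|_{L^\infty((0,T)\times\mathbb{R})} \le 2\sqrt{T}\,\|w_1-w_2\|_{L^2(0,T)}.
\]
Using the decomposition
\[
\begin{aligned}
(Z_1 - Z_2)(t,\omega) = {} & -\mathcal{F}\big[\tau_{-t}(w_1-w_2)|_{[0,t]}\big](\omega) \\
& -\int_0^t (w_1-w_2)(\tau)\Big[\sqrt{1-|Z_1(\tau,\omega)|^2}-1\Big]e^{-i\omega\tau}\,d\tau\, e^{i\omega t} \\
& -\int_0^t w_2(\tau)\Big[\sqrt{1-|Z_1(\tau,\omega)|^2}-\sqrt{1-|Z_2(\tau,\omega)|^2}\Big]e^{-i\omega\tau}\,d\tau\, e^{i\omega t},
\end{aligned}
\]
and working as in (18), we get
\[
\begin{aligned}
\|(Z_1-Z_2)(t)\|_{L^2(\mathbb{R})} \le {} & \sqrt{2\pi}\,\|w_1-w_2\|_{L^2(0,T)} + \|w_1-w_2\|_{L^2(0,T)}\, c\sqrt{T}\,\|Z_1\|_{C^0([0,T],L^2(\mathbb{R}))} \\
& + \|w_2\|_{L^2(0,T)}\, c\sqrt{T}\,\|Z_1-Z_2\|_{C^0([0,T],L^2(\mathbb{R}))}.
\end{aligned}
\]
Thus, using (12) and $c\sqrt{T}R \le 1/2$, we get
\[
\|Z_1-Z_2\|_{C^0([0,T],L^2(\mathbb{R}))} \le 2\sqrt{2\pi}\,\big(1+2c\sqrt{T}R\big)\|w_1-w_2\|_{L^2(0,T)} \le 4\sqrt{2\pi}\,\|w_1-w_2\|_{L^2(0,T)}.
\]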

This shows the existence part of Proposition 3, and the uniqueness if one requires that $Z$ takes its values in $B_{\mathbb{C}}(0,1/2)$. The uniqueness without this last assumption can be easily obtained from the previous study by noticing that it implies that, if two solutions are equal on $[0,\tau]$ with $\tau \in [0,T)$, then they are equal on $[0,\tau']$ for some $\tau' > \tau$.

534

K. Beauchard, J.-M. Coron, P. Rouchon

3.1.2. Structure of the reachable set from zero in time T

The goal of this section is the proof of the following result, where $B_R[L^2(0,T)] := \{w \in L^2(0,T);\ \|w\|_{L^2(0,T)} < R\}$.

Theorem 2. Let $T > 0$ and $R := 1/(4\sqrt{3T})$. The image of the end point map
\[
F_T : B_R[L^2(0,T)] \to L^2 \cap C_b^0(\mathbb{R}), \qquad w \mapsto Z(T,.) \ \text{ where } Z \text{ solves (10)}, \tag{19}
\]
is a strict submanifold of $L^2 \cap C_b^0(\mathbb{R})$ of infinite codimension that does not coincide with its tangent space at zero.

The proof of Theorem 2 relies on the following results (see [17, Theorem 73.E and Corollary 73.45, Chap. 73]).

Theorem 3. Let $M$ and $N$ be two $C^k$-Banach manifolds with chart space over $\mathbb{R}$ and $k \in \{\infty\} \cup \mathbb{N} \setminus \{0\}$. Let $F : M \to N$ be a map of class $C^k$. If $F$ is a $C^k$ embedding, then $S := F(M)$ is a submanifold of $N$ and in particular a $C^k$-Banach manifold.

Theorem 4. Under the same assumptions as in Theorem 3, if $F$ is an injective $C^k$ immersion and if $F$ is closed, then $F$ is a $C^k$ embedding.

Proof of Theorem 2. We take $M := B_R[L^2(0,T)]$ and $N := L^2 \cap C_b^0(\mathbb{R})$. They are both $C^\infty$-Banach manifolds as open subsets of Banach spaces. The continuity of $F_T$ is a consequence of (13) and (14). With similar manipulations as in the proof of (13) and (14), one can prove that $F_T$ is $C^1$ and $dF_T(w).W = \xi(T,.)$, where $\xi$ is defined by
\[
\begin{aligned}
\xi(t,\omega) = {} & -\int_0^t W(\tau)\sqrt{1-|Z(\tau,\omega)|^2}\,e^{-i\omega\tau}\,d\tau\, e^{i\omega t} \\
& + \int_0^t w(\tau)\,\frac{\Re\big[\overline{Z(\tau,\omega)}\,\xi(\tau,\omega)\big]}{\sqrt{1-|Z(\tau,\omega)|^2}}\,e^{-i\omega\tau}\,d\tau\, e^{i\omega t}, \quad \forall (t,\omega) \in [0,T]\times\mathbb{R}.
\end{aligned}
\tag{20}
\]

We use the same notation $c$ as in the previous proof (see (15)).

First step: $F_T$ is injective on $B_R[L^2(0,T)]$ because $6c\sqrt{T}R < 1$. Let $w_1, w_2 \in B_R[L^2(0,T)]$ be such that $F_T(w_1) = F_T(w_2)$. From
\[
\int_0^T w_1(t)\sqrt{1-|Z_1(t,\omega)|^2}\,e^{-i\omega t}\,dt = \int_0^T w_2(t)\sqrt{1-|Z_2(t,\omega)|^2}\,e^{-i\omega t}\,dt,
\]
we deduce
\[
\begin{aligned}
\mathcal{F}[w_1-w_2](\omega) = {} & \int_0^T (w_2-w_1)(t)\Big[\sqrt{1-|Z_2(t,\omega)|^2}-1\Big]e^{-i\omega t}\,dt \\
& + \int_0^T w_1(t)\Big[\sqrt{1-|Z_2(t,\omega)|^2}-\sqrt{1-|Z_1(t,\omega)|^2}\Big]e^{-i\omega t}\,dt.
\end{aligned}
\]
Considering the $L^2(\mathbb{R})$-norm of both sides, using the Plancherel equality and working as in (18), we get
\[
\sqrt{2\pi}\,\|w_1-w_2\|_{L^2(0,T)} \le \|w_2-w_1\|_{L^2(0,T)}\, c\sqrt{T}\,\|Z_2\|_{C^0([0,T],L^2(\mathbb{R}))} + \|w_1\|_{L^2(0,T)}\, c\sqrt{T}\,\|Z_1-Z_2\|_{C^0([0,T],L^2(\mathbb{R}))}.
\]


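Using (12) and (14), we deduce
\[
\sqrt{2\pi}\,\|w_1-w_2\|_{L^2(0,T)} \le 6c\sqrt{2\pi}\sqrt{T}R\,\|w_1-w_2\|_{L^2(0,T)},
\]
which gives the conclusion, because $6c\sqrt{T}R < 1$.

Second step: $F_T$ is an immersion because $6c\sqrt{T}R < 1$. Let $w \in B_R[L^2(0,T)]$ and $W \in L^2(0,T)$ be such that $dF_T(w).W = 0$. Thanks to (20), we have
\[
\mathcal{F}[W](\omega) = -\int_0^T W(\tau)\Big[\sqrt{1-|Z(\tau,\omega)|^2}-1\Big]e^{-i\omega\tau}\,d\tau + \int_0^T w(\tau)\,\frac{\Re\big[\overline{Z(\tau,\omega)}\,\xi(\tau,\omega)\big]}{\sqrt{1-|Z(\tau,\omega)|^2}}\,e^{-i\omega\tau}\,d\tau, \quad \forall \omega \in \mathbb{R}.
\]
Considering the $L^2(\mathbb{R})$-norm of both sides and working as in (18), we get
\[
\sqrt{2\pi}\,\|W\|_{L^2(0,T)} \le \|W\|_{L^2(0,T)}\, c\sqrt{T}\,\|Z\|_{C^0([0,T],L^2(\mathbb{R}))} + \|w\|_{L^2(0,T)}\, c\sqrt{T}\,\|\xi\|_{C^0([0,T],L^2(\mathbb{R}))}.
\]
Admitting the following inequality:
\[
\|\xi\|_{C^0([0,T],L^2(\mathbb{R}))} \le 4\sqrt{2\pi}\,\|W\|_{L^2(0,T)}, \tag{21}
\]
and using (12), we get
\[
\sqrt{2\pi}\,\|W\|_{L^2(0,T)} \le 6c\sqrt{2\pi}\sqrt{T}R\,\|W\|_{L^2(0,T)},
\]
which gives the conclusion because $6c\sqrt{T}R < 1$. Now, let us prove (21). Using the decomposition
\[
\begin{aligned}
\xi(t,\omega) = {} & -\mathcal{F}[\tau_{-t}W](\omega) - \int_0^t W(s)\Big[\sqrt{1-|Z(s,\omega)|^2}-1\Big]e^{-i\omega s}\,ds\, e^{i\omega t} \\
& + \int_0^t w(s)\,\frac{\Re\big[\overline{Z(s,\omega)}\,\xi(s,\omega)\big]}{\sqrt{1-|Z(s,\omega)|^2}}\,e^{-i\omega s}\,ds\, e^{i\omega t},
\end{aligned}
\]
and working as in (18), we get
\[
\|\xi(t)\|_{L^2(\mathbb{R})} \le \sqrt{2\pi}\,\|W\|_{L^2(0,T)} + \|W\|_{L^2(0,T)}\, c\sqrt{T}\,\|Z\|_{C^0([0,T],L^2(\mathbb{R}))} + \|w\|_{L^2(0,T)}\, c\sqrt{T}\,\|\xi\|_{C^0([0,T],L^2(\mathbb{R}))}.
\]
Using (12) and $c\sqrt{T}R \le 1/2$, we get (21).

Third step: $F_T$ is a closed map because $6c\sqrt{T}R \le 1/2$. Let $A$ be a closed subset of $B_R[L^2(0,T)]$. Let $(Z_n(T,.) = F_T(w_n))_{n\in\mathbb{N}}$ be a sequence of $F_T(A)$ that converges to $Z_\infty(.)$ in $L^2 \cap C_b^0(\mathbb{R})$. In order to prove that $Z_\infty \in F_T(A)$, we prove that $(w_n)_{n\in\mathbb{N}}$ is a Cauchy sequence in $L^2(0,T)$. For every $n \in \mathbb{N}$, we have
\[
Z_n(T,\omega) = -\mathcal{F}[\tau_{-T}w_n](\omega) - \int_0^T w_n(t)\Big[\sqrt{1-|Z_n(t,\omega)|^2}-1\Big]e^{-i\omega t}\,dt\, e^{i\omega T},
\]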


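so, for $n, p \in \mathbb{N}$, we have
\[
\begin{aligned}
\mathcal{F}[\tau_{-T}(w_n-w_p)](\omega) = {} & (Z_p - Z_n)(T,\omega) - \int_0^T (w_n-w_p)(t)\Big[\sqrt{1-|Z_n(t,\omega)|^2}-1\Big]e^{-i\omega t}\,dt\, e^{i\omega T} \\
& - \int_0^T w_p(t)\Big[\sqrt{1-|Z_n(t,\omega)|^2}-\sqrt{1-|Z_p(t,\omega)|^2}\Big]e^{-i\omega t}\,dt\, e^{i\omega T}.
\end{aligned}
\]
Considering the $L^2(\mathbb{R})$-norm of each side, using the Plancherel equality and working as in (18), we get
\[
\begin{aligned}
\sqrt{2\pi}\,\|w_n-w_p\|_{L^2(0,T)} \le {} & \|(Z_n-Z_p)(T)\|_{L^2(\mathbb{R})} + \|w_n-w_p\|_{L^2(0,T)}\, c\sqrt{T}\,\|Z_n\|_{C^0([0,T],L^2(\mathbb{R}))} \\
& + \|w_p\|_{L^2(0,T)}\, c\sqrt{T}\,\|Z_n-Z_p\|_{C^0([0,T],L^2(\mathbb{R}))}.
\end{aligned}
\]
Using (12) and (14), we get
\[
\sqrt{2\pi}\,\|w_n-w_p\|_{L^2(0,T)} \le \|(Z_n-Z_p)(T)\|_{L^2(\mathbb{R})} + 6c\sqrt{2\pi}\sqrt{T}R\,\|w_n-w_p\|_{L^2(0,T)},
\]
which gives the conclusion because $6c\sqrt{T}R = 1/2$.

Fourth step: The manifold $S := F_T\big(B_R[L^2(0,T)]\big)$ does not coincide with its tangent space at zero. We have $dF_T(0).W = -\mathcal{F}[\tau_{-T}W]$, thus $T_0S = \mathcal{F}[L^2((-T,0))]$. Let us compute the third order development of $F_T$ around 0,
\[
w(t) = \epsilon W_1(t) + \epsilon^2 W_2(t) + \epsilon^3 W_3(t) + \cdots, \qquad Z(t,\omega) = \epsilon Z_1(t,\omega) + \epsilon^2 Z_2(t,\omega) + \epsilon^3 Z_3(t,\omega) + \cdots
\]
Since, as $\epsilon \to 0$,
\[
\sqrt{1-|\epsilon Z_1 + \epsilon^2 Z_2 + \epsilon^3 Z_3|^2} = \sqrt{1-\epsilon^2|Z_1|^2 + o(\epsilon^2)} = 1 - \tfrac{1}{2}\epsilon^2|Z_1|^2 + o(\epsilon^2),
\]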

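we have
\[
\begin{aligned}
Z_1(t,\omega) &= -\mathcal{F}\big[\tau_{-t}(W_1)|_{[0,t]}\big](\omega), \\
Z_2(t,\omega) &= -\mathcal{F}\big[\tau_{-t}(W_2)|_{[0,t]}\big](\omega), \\
Z_3(t,\omega) &= -\mathcal{F}\big[\tau_{-t}(W_3)|_{[0,t]}\big](\omega) + \frac{1}{2}\int_0^t W_1(\tau)|Z_1(\tau,\omega)|^2 e^{-i\omega\tau}\,d\tau\, e^{i\omega t}.
\end{aligned}
\]
We want to prove the existence of $W_1 \in L^2(0,T)$ such that the map
\[
\omega \mapsto \int_0^T W_1(\tau)|Z_1(\tau,\omega)|^2 e^{-i\omega\tau}\,d\tau
\]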


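does not belong to $\mathcal{F}[L^2(0,T)]$. Using the explicit expression of $Z_1$, the change of variable $\sigma_2 \to x = \tau + \sigma_1 - \sigma_2$ and Fubini's theorem, we get
\[
\begin{aligned}
\int_0^T W_1(\tau)|Z_1(\tau,\omega)|^2 e^{-i\omega\tau}\,d\tau
&= \int_0^T W_1(\tau)\Big(\int_0^\tau W_1(\sigma_1)e^{-i\omega\sigma_1}\,d\sigma_1\Big)\Big(\int_0^\tau \overline{W_1(\sigma_2)}e^{i\omega\sigma_2}\,d\sigma_2\Big)e^{-i\omega\tau}\,d\tau \\
&= \int_{\tau=0}^T\int_{\sigma_1=0}^\tau\int_{\sigma_2=0}^\tau W_1(\tau)W_1(\sigma_1)\overline{W_1(\sigma_2)}\,e^{-i\omega(\tau+\sigma_1-\sigma_2)}\,d\sigma_2\,d\sigma_1\,d\tau \\
&= \int_{\tau=0}^T\int_{\sigma_1=0}^\tau\int_{x=\sigma_1}^{\sigma_1+\tau} W_1(\tau)W_1(\sigma_1)\overline{W_1(\tau+\sigma_1-x)}\,e^{-i\omega x}\,dx\,d\sigma_1\,d\tau \\
&= \mathcal{F}[\widetilde{W_1}](\omega),
\end{aligned}
\]
where
\[
\widetilde{W}(x) :=
\begin{cases}
\displaystyle\int_{\tau=0}^T\int_{\sigma_1=\max\{0,x-\tau\}}^{\min\{\tau,x\}} W(\tau)W(\sigma_1)\overline{W(\tau+\sigma_1-x)}\,d\sigma_1\,d\tau & \text{if } x \in (0,2T), \\
0 & \text{if } x \notin (0,2T).
\end{cases}
\]
Computing the map $\widetilde{1}$ associated to $W = 1_{[0,T]}$, we get $\widetilde{1}(3T/2) = T^2/16$, thus, for $\epsilon$ small enough, $F_T(\epsilon 1_{[0,T]}) \notin T_0S$.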
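The value $T^2/16$ quoted above can be checked numerically. The following is a minimal sketch, not part of the original argument; the function name `w_tilde_indicator` and the midpoint quadrature are illustrative choices. For $W = 1_{[0,T]}$ the integrand equals 1 on the admissible set, so the inner integral reduces to the length of $[\max\{0,x-\tau\},\min\{\tau,x\}]$.

```python
import numpy as np

def w_tilde_indicator(x, T=1.0, n=4000):
    """Triple-product integral at x for W = indicator of [0, T] (hypothetical helper).

    The inner integral over sigma_1 is the length of the interval
    [max(0, x - tau), min(tau, x)], or 0 when that interval is empty.
    The outer integral over tau in [0, T] uses the midpoint rule.
    """
    if not (0.0 < x < 2.0 * T):
        return 0.0
    h = T / n
    tau = (np.arange(n) + 0.5) * h                  # midpoints of the tau-grid
    inner = np.minimum(tau, x) - np.maximum(0.0, x - tau)
    return float(np.sum(np.clip(inner, 0.0, None)) * h)

T = 1.0
value = w_tilde_indicator(1.5 * T, T)
print(value, T**2 / 16)   # the two numbers should agree up to rounding
```

Since $3T/2$ lies outside $(0,T)$ and the value there is nonzero, the function is not supported in $[0,T]$, which is the property the argument relies on.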

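3.2. Case $-\infty < \omega_* < \omega^* < +\infty$.

The goal of this section is the proof of the following result.

Theorem 5. (i) Let $T > 0$, $u, v \in L^2(0,T)$ and $M$ be the solution of
\[
\begin{cases}
\dfrac{\partial M}{\partial t}(t,\omega) = [\omega\Omega_z + u(t)\Omega_x + v(t)\Omega_y]M(t,\omega), & (t,\omega) \in (0,T)\times\mathbb{C}, \\
M(0,\omega) = e_3.
\end{cases}
\tag{22}
\]
Then $\omega \in \mathbb{C} \mapsto Z(T,\omega)$ is holomorphic.

(ii) Let $T > 0$ and $R := 1/(4\sqrt{3T})$. There exists $Z_f : (\omega_*,\omega^*) \to \mathbb{C}$ analytic such that, for every $\epsilon^* > 0$, there exists $\epsilon \in (0,\epsilon^*)$ such that, for every $w \in B_R[L^2(0,T)]$, the solution of (3) satisfies $Z(T) \neq \epsilon Z_f$.

As a consequence, there are arbitrarily small analytic targets on $(\omega_*,\omega^*)$ that cannot be reached exactly in finite time, with controls having a prescribed $L^2$-bound.

Proof of Theorem 5. (i) Let $T > 0$, $u, v \in L^2(0,T)$ and $M$ be the solution of (22). We introduce the functions $M_1, M_2 : [0,T]\times\mathbb{R}\times\mathbb{R} \to \mathbb{R}^3$ defined by
\[
M_1(t,\omega_1,\omega_2) := \Re[M(t,\omega_1+i\omega_2)], \qquad M_2(t,\omega_1,\omega_2) := \Im[M(t,\omega_1+i\omega_2)].
\]
The function $\widetilde{M}(t,\omega_1,\omega_2) := (M_1(t,\omega_1,\omega_2), M_2(t,\omega_1,\omega_2))^t$ solves an equation of the form
\[
\frac{\partial \widetilde{M}}{\partial t} = f(t,\widetilde{M},\omega_1,\omega_2). \tag{23}
\]
The map $f$ is measurable, of class $C^1$ with respect to $(\widetilde{M},\omega_1,\omega_2) \in \mathbb{R}^6\times\mathbb{R}\times\mathbb{R}$ and satisfies
\[
\begin{aligned}
|f(t,\widetilde{M},\omega_1,\omega_2)| &\le (|\omega_1|+|\omega_2|+|u|+|v|)\,|\widetilde{M}|, &
|f_{\widetilde{M}}(t,\widetilde{M},\omega_1,\omega_2)| &\le |\omega_1|+|\omega_2|+|u|+|v|, \\
|f_{\omega_1}(t,\widetilde{M},\omega_1,\omega_2)| &\le |\widetilde{M}|, &
|f_{\omega_2}(t,\widetilde{M},\omega_1,\omega_2)| &\le |\widetilde{M}|.
\end{aligned}
\]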


Thus $\widetilde{M}$ has partial derivatives with respect to $\omega_1$ and $\omega_2$. (To check that, one can, for example, adapt the proof of [9, Theorem 4.1, Chap. 4, p. 100].) Let us prove that they satisfy the Cauchy-Riemann relations, in order to get the holomorphy of $\omega \in \mathbb{C} \mapsto M(T,\omega)$. We introduce the notation
\[
Y_{k,l}(t,\omega_1,\omega_2) := \frac{\partial M_k}{\partial \omega_l}(t,\omega_1,\omega_2), \quad \text{for } k,l \in \{1,2\}.
\]
Differentiating the system (23) with respect to $\omega_1$ and $\omega_2$, we get
\[
\begin{cases}
\dfrac{\partial (Y_{11}-Y_{22})}{\partial t} = A(t,\omega_1)(Y_{11}-Y_{22}) - B(\omega_2)(Y_{12}+Y_{21}), \\[4pt]
\dfrac{\partial (Y_{12}+Y_{21})}{\partial t} = A(t,\omega_1)(Y_{12}+Y_{21}) + B(\omega_2)(Y_{11}-Y_{22}), \\[4pt]
(Y_{11}-Y_{22})(0,\omega_1,\omega_2) = 0, \\
(Y_{12}+Y_{21})(0,\omega_1,\omega_2) = 0,
\end{cases}
\]
where
\[
A(t,\omega_1) := \begin{pmatrix} 0 & -\omega_1 & v(t) \\ \omega_1 & 0 & -u(t) \\ -v(t) & u(t) & 0 \end{pmatrix}
\quad \text{and} \quad
B(\omega_2) := \begin{pmatrix} 0 & -\omega_2 & 0 \\ \omega_2 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\]
The uniqueness of the solution of the Cauchy problem ensures that $Y_{11} = Y_{22}$ and $Y_{12} = -Y_{21}$.

(ii) Let $Z_f : \mathbb{R} \to \mathbb{C}$ be an analytic function that does not belong to the tangent space of the image of $F_T$ at zero (i.e. which is not the Fourier transform of a function in $L^2((-T,0))$). Then, for every $\epsilon^* > 0$, there exists $\epsilon \in (0,\epsilon^*)$ such that $\epsilon Z_f$ does not belong to the image of $F_T$. Thanks to (i), reaching $\epsilon Z_f$ on $(\omega_*,\omega^*)$ in time $T$ with controls $u$ and $v$ in $L^2((0,T),\mathbb{R})$ is equivalent to reaching it on $\mathbb{R}$. But $\epsilon Z_f$ does not belong to the image of $F_T$. Therefore $\epsilon Z_f$ cannot be reached on $(\omega_*,\omega^*)$, in time $T$, with controls $u$ and $v$ in $B_R[L^2((0,T),\mathbb{R})]$.

4. Ensemble Controllability with Unbounded Controls

In this section, we take $-\infty < \omega_* < \omega^* < +\infty$. The goal of this section is to complete the very interesting arguments of [10–12] with functional analysis ideas, to prove the ensemble controllability of (3) (i.e. the approximate controllability of (3) in $L^2(\omega_*,\omega^*)$) with unbounded controls. Actually, we prove a stronger result. First, let us introduce the definition of solutions of (3) with Dirac controls.

Definition 1. Let $b \in [0,+\infty)$, $\beta, \gamma \in \mathbb{R}$ and $M_0 : (\omega_*,\omega^*) \to \mathbb{S}^2$. The solution of (3) with $M(0) = M_0$, $u(t) = \beta\delta_b(t)$, $v(t) = \gamma\delta_b(t)$ is
\[
M(t,\omega) =
\begin{cases}
\exp(\omega\Omega_z t)M_0(\omega) & \text{for } t \in [0,b), \\
\exp(\omega\Omega_z(t-b))\exp(\beta\Omega_x+\gamma\Omega_y)\exp(\omega\Omega_z b)M_0(\omega) & \text{for } t \in (b,+\infty),
\end{cases}
\]
i.e. $M(b^+,\omega) = \exp(\beta\Omega_x+\gamma\Omega_y)M(b^-,\omega)$.


Let $H^1((\omega_*,\omega^*),\mathbb{S}^2) := \{M \in H^1((\omega_*,\omega^*),\mathbb{R}^3);\ M(\omega) \in \mathbb{S}^2,\ \forall \omega \in (\omega_*,\omega^*)\}$. Let us also denote by $U[t;u,v,M_0]$ the value at time $t$ of the solution of (3) with initial condition $M_0$ at time 0. Thus, $U[t;u,v,M_0]$ is a function of $\omega \in (\omega_*,\omega^*)$. Definition 1 is motivated by the following result, which is a consequence of explicit expressions and the boundedness of $(\omega_*,\omega^*)$.

Proposition 4. For every $\beta, \gamma \in \mathbb{R}$, we have
\[
\lim_{\epsilon \to 0} \Big\| U\Big[b+\epsilon;\ \frac{\beta}{\epsilon}1_{[b,b+\epsilon]},\ \frac{\gamma}{\epsilon}1_{[b,b+\epsilon]},\ .\Big] - U[b^+;\beta\delta_b,\gamma\delta_b,.] \Big\|_{\mathcal{L}(H^1((\omega_*,\omega^*),\mathbb{R}^3),\,H^1((\omega_*,\omega^*),\mathbb{R}^3))} = 0.
\]

Let us introduce the set $D$ of finite sums of Dirac masses on $[0,+\infty)$. The goal of this section is to prove the following result.

Theorem 6. Let $M_0 \in H^1((\omega_*,\omega^*),\mathbb{S}^2)$. There exist $(t_n)_{n\in\mathbb{N}} \in [0,+\infty)^{\mathbb{N}}$ and $(u_n)_{n\in\mathbb{N}}, (v_n)_{n\in\mathbb{N}} \in D^{\mathbb{N}}$ such that
\[
U[t_n^+;u_n,v_n,M_0] \to e_3 \quad \text{weakly in } H^1((\omega_*,\omega^*),\mathbb{R}^3).
\]

Thanks to Proposition 4, one easily gets from Theorem 6 the following corollary.

Corollary 1. Let $M_0 \in H^1((\omega_*,\omega^*),\mathbb{S}^2)$. There exist $(t_n)_{n\in\mathbb{N}} \in [0,+\infty)^{\mathbb{N}}$ and $(u_n)_{n\in\mathbb{N}}, (v_n)_{n\in\mathbb{N}} \in L^\infty_{loc}([0,+\infty),\mathbb{R})^{\mathbb{N}}$ such that
\[
U[t_n;u_n,v_n,M_0] \to e_3 \quad \text{weakly in } H^1((\omega_*,\omega^*),\mathbb{R}^3).
\]

Thanks to the compactness of the injection $H^1(\omega_*,\omega^*) \to L^2(\omega_*,\omega^*)$ (resp. $H^1(\omega_*,\omega^*) \to L^\infty(\omega_*,\omega^*)$), Theorem 6 and Corollary 1 give the approximate controllability of (3), to $e_3$, for the $L^2((\omega_*,\omega^*),\mathbb{R}^3)$-norm (resp. the $L^\infty(\omega_*,\omega^*)$-norm), in finite time. The proof of Theorem 6 relies on the following lemma, that will be proved later on.

Lemma 1. (1) Let $M \in H^1((\omega_*,\omega^*),\mathbb{S}^2)$ be such that $M' \neq 0$. There exist $T > 0$, $u, v \in D$ such that
• one has $\|U[T^+;u,v,M]'\|_{L^2} < \|M'\|_{L^2}$,
• for every sequence $(M_n)_{n\in\mathbb{N}} \in H^1((\omega_*,\omega^*),\mathbb{S}^2)^{\mathbb{N}}$ satisfying
\[
\|M_n'\|_{L^2} \le \|M'\|_{L^2}, \quad \forall n \in \mathbb{N}, \tag{24}
\]
\[
M_n \to M \quad \text{weakly in } H^1((\omega_*,\omega^*),\mathbb{R}^3), \tag{25}
\]
there exists an extraction $\varphi$ such that $\|U[T^+;u,v,M_{\varphi(n)}]'\|_{L^2} \le \|M_{\varphi(n)}'\|_{L^2}$, $\forall n \in \mathbb{N}$.

(2) Let $M \in \mathbb{S}^2$ be such that $M \neq e_3$. There exists $\theta \in [0,2\pi)$ such that, for some $(u,v) \in \{(\pi\delta_1+(\pi+\theta)\delta_2, 0),\ (0, \pi\delta_1+(\pi+\theta)\delta_2)\}$, $U[2^+;u,v,M]$ is constant over $(\omega_*,\omega^*)$ and $|U[2^+;u,v,M]-e_3| < |M-e_3|$.

In Sect. 4.1, we prove Theorem 6 thanks to functional analysis and Lemma 1, which is proved in Sects. 4.2 and 4.3. In Sect. 4.2, we recall a preliminary result, which is already presented in [10,12]. In Sect. 4.3, we deduce the proof of Lemma 1.


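4.1. Proof of Theorem 6 thanks to Lemma 1.

In this section, we deduce Theorem 6 from Lemma 1, using similar arguments as in [14].

Proof of Theorem 6 thanks to Lemma 1. Let $M_0 \in H^1((\omega_*,\omega^*),\mathbb{S}^2)$ be such that $M_0 \neq e_3$ (otherwise $t_n \equiv 0$ gives the conclusion). We introduce the set
\[
\begin{aligned}
K := \big\{ \widetilde{M} \in H^1((\omega_*,\omega^*),\mathbb{S}^2);\ & \exists (t_n)_{n\in\mathbb{N}} \in [0,\infty)^{\mathbb{N}},\ \exists (u_n)_{n\in\mathbb{N}}, (v_n)_{n\in\mathbb{N}} \in D^{\mathbb{N}} \text{ such that} \\
& \|U[t_n^+;u_n,v_n,M_0]'\|_{L^2} \le \|M_0'\|_{L^2},\ \forall n \in \mathbb{N}, \\
& \text{and } U[t_n^+;u_n,v_n,M_0] \to \widetilde{M} \text{ weakly in } H^1((\omega_*,\omega^*),\mathbb{R}^3) \big\}
\end{aligned}
\]
and the quantity
\[
m := \inf\{\|\widetilde{M}'\|_{L^2};\ \widetilde{M} \in K\}.
\]
Notice that $K$ is not empty because it contains $M_0$ (take $t_n \equiv 0$).

First step: Let us prove the existence of $e \in K$ such that $\|e'\|_{L^2} = m$. Let $(M_n)_{n\in\mathbb{N}^*} \in K^{\mathbb{N}^*}$ be such that $\|M_n'\|_{L^2} \to m$ when $n \to +\infty$. Then $(M_n)_{n\in\mathbb{N}^*}$ is a bounded sequence in $H^1((\omega_*,\omega^*),\mathbb{R}^3)$, thus, there exists $e \in H^1((\omega_*,\omega^*),\mathbb{S}^2)$ such that (up to an extraction)
\[
M_n \to e \quad \text{weakly in } H^1 \text{ and strongly in } L^2. \tag{26}
\]
Then,
\[
\|e'\|_{L^2} \le \liminf_{n\to+\infty} \|M_n'\|_{L^2} = m.
\]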

Let us prove that $e$ belongs to $K$, which gives the conclusion of the first step. For every $n \in \mathbb{N}^*$, $M_n \in K$ so there exist $(t_n^p)_{p\in\mathbb{N}} \in [0,+\infty)^{\mathbb{N}}$, $(u_n^p)_{p\in\mathbb{N}}, (v_n^p)_{p\in\mathbb{N}} \in D^{\mathbb{N}}$ such that
\[
\|U[t_n^{p+};u_n^p,v_n^p,M_0]'\|_{L^2} \le \|M_0'\|_{L^2}, \quad \forall p \in \mathbb{N},\ \forall n \in \mathbb{N}^* \tag{27}
\]
and
\[
U[t_n^{p+};u_n^p,v_n^p,M_0] \to M_n \quad \text{weakly in } H^1 \text{ and strongly in } L^2, \text{ when } p \to +\infty,\ \forall n \in \mathbb{N}^*.
\]
For every $n \in \mathbb{N}^*$, we choose $p = p(n) \in \mathbb{N}$ such that
\[
\big\|U[t_n^{p(n)+};u_n^{p(n)},v_n^{p(n)},M_0] - M_n\big\|_{L^2} \le \frac{1}{n}. \tag{28}
\]
The sequence $(Y_n := U[t_n^{p(n)+};u_n^{p(n)},v_n^{p(n)},M_0])_{n\in\mathbb{N}^*}$ is bounded in $H^1((\omega_*,\omega^*),\mathbb{R}^3)$ because of (27). Thus, there exists $\hat{e} \in H^1((\omega_*,\omega^*),\mathbb{S}^2)$ such that (up to an extraction)
\[
Y_n \to \hat{e} \quad \text{weakly in } H^1 \text{ and strongly in } L^2.
\]
The definition of $K$ ensures that $\hat{e}$ belongs to $K$. Moreover, because of (26) and (28), $Y_n \to e$ strongly in $L^2((\omega_*,\omega^*),\mathbb{R}^3)$, thus (uniqueness of the strong $L^2$ limit) $e = \hat{e}$ and $e \in K$.

Second step: Let us prove that $m = 0$. Working by contradiction, we assume that $m > 0$. Then $e' \neq 0$, thus we can apply Lemma 1 (1). There exist $T > 0$, $u, v \in D$ such that
\[
\|U[T^+;u,v,e]'\|_{L^2} < \|e'\|_{L^2} = m,
\]


and an extraction $\varphi$ such that, with the notations of the first step,
\[
\|U[T^+;u,v,Y_{\varphi(n)}]'\|_{L^2} \le \|Y_{\varphi(n)}'\|_{L^2}, \quad \forall n \in \mathbb{N}. \tag{29}
\]
Let us prove that $U[T^+;u,v,e]$ belongs to $K$, which gives the contradiction. Using (27) and (29), we have
\[
\|U[T^+;u,v,Y_{\varphi(n)}]'\|_{L^2} \le \|M_0'\|_{L^2}, \quad \forall n \in \mathbb{N}.
\]
Thus, there exists $e^* \in H^1((\omega_*,\omega^*),\mathbb{S}^2)$ such that (up to an extraction)
\[
U[T^+;u,v,Y_{\varphi(n)}] \to e^* \quad \text{weakly in } H^1 \text{ and strongly in } L^2.
\]
Then, $e^* \in K$, by definition of $K$. But we have
\[
\|U[T^+;u,v,Y_{\varphi(n)}] - U[T^+;u,v,e]\|_{L^2} = \|Y_{\varphi(n)} - e\|_{L^2} \to 0 \quad \text{when } n \to +\infty,
\]
thus $U[T^+;u,v,e] = e^*$ (uniqueness of the strong $L^2$ limit), and $U[T^+;u,v,e] \in K$. This ends the proof of the second step.

Third step: With a slight abuse of notations, let us still denote by $\mathbb{S}^2$ the set of constant functions from $(\omega_*,\omega^*)$ with values in $\mathbb{S}^2$. Thanks to the first and second steps, the set $K \cap \mathbb{S}^2$ is not empty, so we can consider
\[
m' := \inf\{|\widetilde{M} - e_3|;\ \widetilde{M} \in K \cap \mathbb{S}^2\}.
\]
Working exactly as in the first step, one can prove that $K \cap \mathbb{S}^2$ is a closed subset of $\mathbb{S}^2$, thus $K \cap \mathbb{S}^2$ is compact and there exists $\tilde{e} \in K \cap \mathbb{S}^2$ such that $|\tilde{e} - e_3| = m'$.

Fourth step: Let us prove that $m' = 0$, which gives the conclusion. Working by contradiction, we assume $m' > 0$. Then $\tilde{e} \neq e_3$ and we can apply Lemma 1 (2). There exists $\theta \in [0,2\pi)$ such that, for some $(u,v) \in \{(\pi\delta_1+(\pi+\theta)\delta_2, 0),\ (0, \pi\delta_1+(\pi+\theta)\delta_2)\}$, $U[2^+;u,v,\tilde{e}]$ is constant over $(\omega_*,\omega^*)$ and
\[
|U[2^+;u,v,\tilde{e}] - e_3| < |\tilde{e} - e_3|.
\]
Let us prove that $U[2^+;u,v,\tilde{e}]$ belongs to $K$, which gives the contradiction. First, we emphasize that explicit computations show that
\[
\exp(\pi\Omega_\xi)\exp(\omega\Omega_z)\exp(\pi\Omega_\xi) = \exp(-\omega\Omega_z), \quad \forall \xi \in \{x,y\}. \tag{30}
\]
Thus, for some $\xi \in \{x,y\}$, we have
\[
\begin{aligned}
U[2^+;u,v,.] &= \exp((\theta+\pi)\Omega_\xi)\exp(\omega\Omega_z)\exp(\pi\Omega_\xi)\exp(\omega\Omega_z) \\
&= \exp(\theta\Omega_\xi)\exp(\pi\Omega_\xi)\exp(\omega\Omega_z)\exp(\pi\Omega_\xi)\exp(\omega\Omega_z) \\
&= \exp(\theta\Omega_\xi)\exp(-\omega\Omega_z)\exp(\omega\Omega_z) = \exp(\theta\Omega_\xi).
\end{aligned}
\]
Since $\tilde{e} \in K$, there exist $(s_n)_{n\in\mathbb{N}} \in [0,+\infty)^{\mathbb{N}}$, $(\mu_n)_{n\in\mathbb{N}}, (\nu_n)_{n\in\mathbb{N}} \in D^{\mathbb{N}}$ such that
\[
\|U[s_n^+;\mu_n,\nu_n,M_0]'\|_{L^2} \le \|M_0'\|_{L^2},\ \forall n \in \mathbb{N}, \qquad U[s_n^+;\mu_n,\nu_n,M_0] \to \tilde{e} \ \text{weakly in } H^1.
\]
Let $Z_n := U[s_n^+;\mu_n,\nu_n,M_0]$. We have $U[2^+;u,v,Z_n] = \exp(\theta\Omega_\xi)Z_n$. Thus
\[
\|U[2^+;u,v,Z_n]'\|_{L^2} = \|Z_n'\|_{L^2} \le \|M_0'\|_{L^2}, \quad \forall n \in \mathbb{N},
\]
and $U[2^+;u,v,Z_n] \to \exp(\theta\Omega_\xi)\tilde{e} = U[2^+;u,v,\tilde{e}]$ weakly in $H^1$. Thus $U[2^+;u,v,\tilde{e}] \in K$.
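Identity (30) and the resulting collapse of the four-factor product to a single rotation can be checked numerically. The sketch below is illustrative only: it assumes that $\Omega_x$, $\Omega_y$, $\Omega_z$ are the standard generators of rotations about the coordinate axes (an assumption consistent with the rotation matrices displayed in this paper), and the helper names `rot` and `check_identity_30` are hypothetical.

```python
import numpy as np

# Assumed standard so(3) generators (rotations about the x, y and z axes).
OMEGA_X = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
OMEGA_Y = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
OMEGA_Z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])

def rot(generator, angle):
    """exp(angle * generator) by Rodrigues' formula (generator of a unit axis)."""
    return (np.eye(3) + np.sin(angle) * generator
            + (1.0 - np.cos(angle)) * generator @ generator)

def check_identity_30(omega, theta):
    """Check (30) and the product identity used in the fourth step."""
    ok = True
    for g in (OMEGA_X, OMEGA_Y):
        # (30): a pi-rotation about x or y reverses the free rotation about z.
        lhs = rot(g, np.pi) @ rot(OMEGA_Z, omega) @ rot(g, np.pi)
        ok = ok and np.allclose(lhs, rot(OMEGA_Z, -omega))
        # Two Dirac pulses (pi at time 1, pi + theta at time 2) with free flow.
        flow = (rot(g, np.pi + theta) @ rot(OMEGA_Z, omega)
                @ rot(g, np.pi) @ rot(OMEGA_Z, omega))
        ok = ok and np.allclose(flow, rot(g, theta))
    return bool(ok)

print(check_identity_30(omega=0.7, theta=1.3))
```

The resulting flow does not depend on $\omega$, which is why constant functions are mapped to constant functions.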


4.2. The argument of [10,12].

The goal of this section is to recall the proof of the following result, which is already presented in [10,12].

Proposition 5. Let $P, Q \in \mathbb{R}[X]$. The flow of (3) can generate
\[
I + \tau[P(\omega)\Omega_x + Q(\omega)\Omega_y] + o(\tau) \quad \text{when } \tau \to 0,
\]
with controls that are finite sums of Dirac masses. More precisely, for every $\epsilon > 0$, there exists $\tau^* = \tau^*(P,Q,\epsilon) > 0$ such that, for every $\tau \in [0,\tau^*]$, there exist $T > 0$ and $u, v \in D$ such that
\[
\Big\| U[T^+;u,v,.] - \big(I + \tau[P(\omega)\Omega_x + Q(\omega)\Omega_y]\big) \Big\|_{\mathcal{L}(H^1((\omega_*,\omega^*),\mathbb{R}^3),\,H^1((\omega_*,\omega^*),\mathbb{R}^3))} \le \epsilon\tau.
\]

Remark 1. Let us explain why Proposition 5 may not be sufficient to prove the global approximate controllability of (3) in $L^2(\omega_*,\omega^*)$, with Dirac controls. First, let us remark that, for every point $M = (x^{(1)},x^{(2)},x^{(3)}) \in \mathbb{S}^2$ such that $x^{(3)} \neq 0$, $(\Omega_x M, \Omega_y M)$ is a basis of $T_M\mathbb{S}^2$ (the tangent space of $\mathbb{S}^2$ at $M$). Let $\epsilon > 0$ and $M_0 = (x_0,y_0,z_0) \in L^2((\omega_*,\omega^*),\mathbb{S}^2)$ be such that $z_0(\omega) \neq 0$ for almost every $\omega \in (\omega_*,\omega^*)$. Following a classical strategy, we consider a homotopy
\[
H : [0,1]\times(\omega_*,\omega^*) \to \mathbb{S}^2, \qquad (s,\omega) \mapsto H(s,\omega),
\]
such that $H \in C^1([0,1],L^2((\omega_*,\omega^*),\mathbb{S}^2))$, $H(0,\omega) = M_0(\omega)$, $H(1,\omega) = e_3$, and we try to reach $e_3$ from $M_0$ by following the path given by $H$. Since $z_0 \neq 0$ a.e. on $(\omega_*,\omega^*)$, there exist $f, g \in L^2((\omega_*,\omega^*),\mathbb{R})$ such that
\[
\frac{\partial H}{\partial s}(0,\omega) = f(\omega)\Omega_x M_0(\omega) + g(\omega)\Omega_y M_0(\omega).
\]
Thanks to the Weierstrass theorem, there exist $P, Q \in \mathbb{R}[X]$ such that $\|f-P\|_{L^2} < \epsilon$ and $\|g-Q\|_{L^2} < \epsilon$. Applying Proposition 5, one may follow (approximately) the direction given by $\frac{\partial H}{\partial s}(0,\omega)$, with a small amplitude $\tau^*$ that depends on this direction. If one wants to be sure to reach $e_3$ in finite time, by iteration of this process, one would need, at least, the independence of the amplitude $\tau^*$ with respect to the direction (otherwise, one may stop in the middle of the path). However, the maximum amplitude $\tau^*$ given by Proposition 5 depends on the direction, through the choice of the polynomials $P, Q$.

Proof of Proposition 5. In this proof $\tau$ is a positive real number. By (30),
\[
\exp(\pi\Omega_x)\exp(\omega\Omega_z\tau)\exp(-\pi\Omega_x) = \exp(-\omega\Omega_z\tau), \tag{31}
\]
and this evolution is generated, in time $\tau$, by the controls $u(t) = -\pi\delta_0(t) + \pi\delta_\tau(t)$, $v \equiv 0$. The controls
\[
u(t) = \sqrt{\tau}\,\delta_0(t) - (\pi+\sqrt{\tau})\,\delta_{\sqrt{\tau}}(t) + \pi\,\delta_{2\sqrt{\tau}}(t)
\]
\[
\big(\text{resp. } u(t) = -\pi\,\delta_0(t) + (\pi-\sqrt{\tau})\,\delta_{\sqrt{\tau}}(t) + \sqrt{\tau}\,\delta_{2\sqrt{\tau}}(t)\big),
\]


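$v \equiv 0$, generate in time $2\sqrt{\tau}$ the evolution
\[
\begin{aligned}
U_1(\tau) &:= \exp(\pi\Omega_x)\exp(\omega\Omega_z\sqrt{\tau})\exp(-(\pi+\sqrt{\tau})\Omega_x)\exp(\omega\Omega_z\sqrt{\tau})\exp(\Omega_x\sqrt{\tau}) \\
&= \exp(-\omega\Omega_z\sqrt{\tau})\exp(-\Omega_x\sqrt{\tau})\exp(\omega\Omega_z\sqrt{\tau})\exp(\Omega_x\sqrt{\tau}) \\
&= I + \tau\omega[\Omega_z,\Omega_x] + o(\tau) = I + \tau\omega\Omega_y + o(\tau)
\end{aligned}
\]
(resp.
\[
\begin{aligned}
U_1(-\tau) &:= \exp(\Omega_x\sqrt{\tau})\exp(\omega\Omega_z\sqrt{\tau})\exp((\pi-\sqrt{\tau})\Omega_x)\exp(\omega\Omega_z\sqrt{\tau})\exp(-\pi\Omega_x) \\
&= \exp(\Omega_x\sqrt{\tau})\exp(\omega\Omega_z\sqrt{\tau})\exp(-\Omega_x\sqrt{\tau})\exp(-\omega\Omega_z\sqrt{\tau}) \\
&= I - \tau\omega[\Omega_z,\Omega_x] + o(\tau) = I - \tau\omega\Omega_y + o(\tau)),
\end{aligned}
\]
where we have used (31) to pass from the first to the second line. Here and in the following, $o(\tau)$ denote quantities which tend to 0 in the $\mathcal{L}(H^1((\omega_*,\omega^*),\mathbb{R}^3),H^1((\omega_*,\omega^*),\mathbb{R}^3))$-norm as $\tau \to 0^+$. In the same way, there exist controls, that are sums of Dirac masses, that generate in time $6\sqrt{\tau}$ the evolutions
\[
\begin{aligned}
U_2(\tau) &:= \exp(-\omega\Omega_z\sqrt{\tau})\,U_1(-\tau)\exp(\omega\Omega_z\sqrt{\tau})\,U_1(\tau) = I + \tau^{3/2}\omega^2[\Omega_z,\Omega_y] + o(\tau^{3/2}) = I - \tau^{3/2}\omega^2\Omega_x + o(\tau^{3/2}), \\
U_2(-\tau) &:= U_1(\tau)\exp(\omega\Omega_z\sqrt{\tau})\,U_1(-\tau)\exp(-\omega\Omega_z\sqrt{\tau}) = I - \tau^{3/2}\omega^2[\Omega_z,\Omega_y] + o(\tau^{3/2}) = I + \tau^{3/2}\omega^2\Omega_x + o(\tau^{3/2}),
\end{aligned}
\]
and in time $14\sqrt{\tau}$ the evolutions
\[
\begin{aligned}
U_3(\tau) &:= \exp(-\omega\Omega_z\sqrt{\tau})\,U_2(-\tau)\exp(\omega\Omega_z\sqrt{\tau})\,U_2(\tau) = I - \tau^2\omega^3[\Omega_z,\Omega_x] + o(\tau^2) = I - \tau^2\omega^3\Omega_y + o(\tau^2), \\
U_3(-\tau) &:= U_2(\tau)\exp(\omega\Omega_z\sqrt{\tau})\,U_2(-\tau)\exp(-\omega\Omega_z\sqrt{\tau}) = I + \tau^2\omega^3[\Omega_z,\Omega_x] + o(\tau^2) = I + \tau^2\omega^3\Omega_y + o(\tau^2).
\end{aligned}
\]
Thus, one can generate $I \pm \tau\omega^2\Omega_x + o(\tau)$ in time $6\tau^{1/3}$, and $I \pm \tau\omega^3\Omega_y + o(\tau)$ in time $14\tau^{1/4}$. Iterating this process, for every $n \in \mathbb{N}$, one can generate $I \pm \tau\omega^{2n}\Omega_x + o(\tau)$ and $I \pm \tau\omega^{2n+1}\Omega_y + o(\tau)$ in a time $T_n$ that behaves like $4^n\tau^{1/(2n)}$. The same argument with $\Omega_x$ replaced by $\Omega_y$ in $U_j(\tau)$, $j \ge 1$, shows that for every $n \in \mathbb{N}$, one can generate $I \pm \tau\omega^{2n+1}\Omega_x + o(\tau)$ and $I \pm \tau\omega^{2n}\Omega_y + o(\tau)$ in a time $T_n$ that behaves like $4^n\tau^{1/(2n)}$. Thus, for every $P, Q \in \mathbb{R}[X]$, one can generate $I + \tau[P(\omega)\Omega_x + Q(\omega)\Omega_y] + o(\tau)$ in finite time, by composing the previous evolutions.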
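The first-order commutator expansion of $U_1(\tau)$ can be observed numerically for a single frequency $\omega$. This is a sketch under the assumption that $\Omega_x$, $\Omega_y$, $\Omega_z$ are the standard rotation generators; the helper names `rot`, `u1` and `relative_error` are hypothetical, and the check is pointwise in $\omega$, not in the operator norm on $H^1$ used in the proof.

```python
import numpy as np

# Assumed standard so(3) generators (rotations about the x, y and z axes).
OMEGA_X = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
OMEGA_Y = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
OMEGA_Z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])

def rot(generator, angle):
    """exp(angle * generator) by Rodrigues' formula (generator of a unit axis)."""
    return (np.eye(3) + np.sin(angle) * generator
            + (1.0 - np.cos(angle)) * generator @ generator)

def u1(tau, omega):
    """Five-factor product defining U_1(tau): three pulses, two free flows."""
    s = np.sqrt(tau)
    return (rot(OMEGA_X, np.pi) @ rot(OMEGA_Z, omega * s)
            @ rot(OMEGA_X, -(np.pi + s)) @ rot(OMEGA_Z, omega * s)
            @ rot(OMEGA_X, s))

def relative_error(tau, omega):
    """|U_1(tau) - (I + tau*omega*Omega_y)| / tau in the Frobenius norm."""
    target = np.eye(3) + tau * omega * OMEGA_Y
    return np.linalg.norm(u1(tau, omega) - target) / tau

for tau in (1e-2, 1e-4, 1e-6):
    print(tau, relative_error(tau, omega=1.0))   # ratio shrinks with tau
```

The printed ratio decreasing with $\tau$ is what the notation $o(\tau)$ expresses for this fixed $\omega$.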


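4.3. Proof of Lemma 1.

The goal of this subsection is the proof of Lemma 1, thanks to the previous subsection.

Proof of Lemma 1. Proof of (1) of Lemma 1. It is sufficient to prove this statement under the additional assumption
\[
z \not\equiv 0. \tag{32}
\]
Indeed, let us assume that it is proved when (32) holds. Let $M = (x,y,z) \in H^1((\omega_*,\omega^*),\mathbb{S}^2)$ be such that $M' \neq 0$ and $z \equiv 0$. Then $x^2+y^2 \equiv 1$, thus $x \not\equiv 0$ or $y \not\equiv 0$. Let us assume, for example, that $y \not\equiv 0$. Thanks to (30), we have
\[
\begin{aligned}
U\big[2;\tfrac{3\pi}{2}\delta_0+\pi\delta_1,0,.\big] &= \exp(\omega\Omega_z)\exp(\pi\Omega_x)\exp(\omega\Omega_z)\exp\big(\tfrac{3\pi}{2}\Omega_x\big) \\
&= \exp(\omega\Omega_z)\exp(\pi\Omega_x)\exp(\omega\Omega_z)\exp(\pi\Omega_x)\exp\big(\tfrac{\pi}{2}\Omega_x\big) \\
&= \exp(\omega\Omega_z)\exp(-\omega\Omega_z)\exp\big(\tfrac{\pi}{2}\Omega_x\big) = \exp\big(\tfrac{\pi}{2}\Omega_x\big).
\end{aligned}
\]
Thus the function
\[
U\big[2;\tfrac{3\pi}{2}\delta_0+\pi\delta_1,0,M\big] = \begin{pmatrix} x \\ 0 \\ y \end{pmatrix}
\]
has a non vanishing third component and the $L^2$ norm of its derivative is the same as that of $M$. Applying Lemma 1 (1) to $U[2;\tfrac{3\pi}{2}\delta_0+\pi\delta_1,0,M]$, we get the conclusion of Lemma 1 (1) for $M$.

Let $M = (x,y,z) \in H^1((\omega_*,\omega^*),\mathbb{S}^2)$ be such that $M' \neq 0$ and $z \not\equiv 0$.

First step: Let us prove the existence of $P, Q \in \mathbb{R}[X]$, $\alpha > 0$ and $\tau_0^* > 0$ such that, for every $\tau \in (0,\tau_0^*)$,
• one has
\[
\Big\|\frac{d}{d\omega}\Big[\big(I+\tau[P(\omega)\Omega_x+Q(\omega)\Omega_y]\big)M\Big]\Big\|^2_{L^2} \le \|M'\|^2_{L^2} - \tau\alpha, \tag{33}
\]
• for every sequence $(M_n)_{n\in\mathbb{N}} \in H^1((\omega_*,\omega^*),\mathbb{S}^2)^{\mathbb{N}}$ satisfying (24) and (25), there exists an extraction $\varphi$ such that
\[
\Big\|\frac{d}{d\omega}\Big[\big(I+\tau[P(\omega)\Omega_x+Q(\omega)\Omega_y]\big)M_{\varphi(n)}\Big]\Big\|^2_{L^2} \le \|M_{\varphi(n)}'\|^2_{L^2} - \tau\alpha, \quad \forall n \in \mathbb{N}. \tag{34}
\]
Developing the square, we get, for $\tau \ge 0$ and $P, Q \in \mathbb{R}[X]$,
\[
\begin{aligned}
\Big\|\frac{d}{d\omega}\Big[\big(I+\tau[P(\omega)\Omega_x+Q(\omega)\Omega_y]\big)M\Big]\Big\|^2_{L^2} &= \|M'\|^2_{L^2} + 2\tau A(P,Q) + \tau^2 B(P,Q), \\
\Big\|\frac{d}{d\omega}\Big[\big(I+\tau[P(\omega)\Omega_x+Q(\omega)\Omega_y]\big)M_n\Big]\Big\|^2_{L^2} &= \|M_n'\|^2_{L^2} + 2\tau A_n(P,Q) + \tau^2 B_n(P,Q),
\end{aligned}
\]
where $A(P,Q)$, $A_n(P,Q)$, $B(P,Q)$, $B_n(P,Q)$ are real constants. Straightforward computations give
\[
\begin{aligned}
A(P,Q) &= \int_{\omega_*}^{\omega^*} \Big\langle \frac{d}{d\omega}\big([P(\omega)\Omega_x+Q(\omega)\Omega_y]M(\omega)\big), M'(\omega)\Big\rangle\,d\omega \\
&= \int_{\omega_*}^{\omega^*} P'\,[-zy'+yz'] + Q'\,[zx'-xz']\,d\omega.
\end{aligned}
\]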


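We look for $P, Q \in \mathbb{R}[X]$ such that $A(P,Q) < 0$. Since $A$ is a linear form in $(P,Q)$, it is sufficient to prove that $A \neq 0$. Working by contradiction, we assume $A = 0$. Thanks to the density of polynomials in $L^2(\omega_*,\omega^*)$, we have
\[
zy' - yz' = 0, \qquad zx' - xz' = 0. \tag{35}
\]
Let $I$ be a nonempty connected component of $\{\omega \in (\omega_*,\omega^*);\ z(\omega) \neq 0\}$. Since $z \not\equiv 0$, such an $I$ exists. By (35), there exist $a, b \in \mathbb{R}$ such that $x(\omega) = az(\omega)$ and $y(\omega) = bz(\omega)$, $\forall \omega \in I$. Since $M$ takes values in $\mathbb{S}^2$, we have
\[
1 = x(\omega)^2 + y(\omega)^2 + z(\omega)^2 = (a^2+b^2+1)\,z(\omega)^2.
\]
This shows that $I = (\omega_*,\omega^*)$ and that $M$ is constant over $(\omega_*,\omega^*)$, which is in contradiction with the assumption $M' \neq 0$. Therefore, there exist $P, Q \in \mathbb{R}[X]$ such that $A(P,Q) < 0$. For every $n \in \mathbb{N}$, we have
\[
A_n(P,Q) = \int_{\omega_*}^{\omega^*} P'\,(-z_n y_n' + y_n z_n') + Q'\,(z_n x_n' - x_n z_n').
\]
Thanks to (25), there exists an extraction $\varphi$ such that $M_{\varphi(n)} \to M$ weakly in $H^1$ and strongly in $L^2$. Then $A_{\varphi(n)}(P,Q) \to A(P,Q)$ when $n \to +\infty$. Thus, we can assume that
\[
A_{\varphi(n)}(P,Q) < \frac{3}{4}A(P,Q), \quad \forall n \in \mathbb{N} \tag{36}
\]
(otherwise take another extraction). We have
\[
\begin{aligned}
\sqrt{B(P,Q)} &:= \Big\|\frac{d}{d\omega}\big([P(\omega)\Omega_x+Q(\omega)\Omega_y]M\big)\Big\|_{L^2} \le \|P'\|_{L^2} + \|Q'\|_{L^2} + [\|P\|_{L^\infty}+\|Q\|_{L^\infty}]\,\|M'\|_{L^2}, \\
\sqrt{B_n(P,Q)} &:= \Big\|\frac{d}{d\omega}\big([P(\omega)\Omega_x+Q(\omega)\Omega_y]M_n\big)\Big\|_{L^2} \le \|P'\|_{L^2} + \|Q'\|_{L^2} + [\|P\|_{L^\infty}+\|Q\|_{L^\infty}]\,\|M_n'\|_{L^2} \\
&\le \|P'\|_{L^2} + \|Q'\|_{L^2} + [\|P\|_{L^\infty}+\|Q\|_{L^\infty}]\,\|M'\|_{L^2}.
\end{aligned}
\]
Let $\tau_0^* = \tau_0^*(M) > 0$ be such that
\[
\tau_0^*\Big(\|P'\|_{L^2} + \|Q'\|_{L^2} + [\|P\|_{L^\infty}+\|Q\|_{L^\infty}]\,\|M'\|_{L^2}\Big)^2 < \frac{|A(P,Q)|}{2}.
\]
Then, for every $\tau \in [0,\tau_0^*]$, we have (33) and (34) with $\alpha := -A(P,Q)$.

Second step: Conclusion. Let $P, Q$ be as in the first step. Let $\epsilon_1 > 0$ be such that
\[
\epsilon_1\|M\|_{H^1} < \frac{\alpha}{2\|M'\|_{L^2}}. \tag{37}
\]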


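Let $\tau^* = \tau^*(P,Q,\epsilon_1)$ be as in Proposition 5 and $\tau_1^* := \min\{\tau^*,\tau_0^*\}$. Thanks to Proposition 5, there exist $T > 0$, $u, v \in L^\infty_{loc}([0,+\infty),\mathbb{R})$ such that
\[
\Big\|U[T^+;u,v,.] - \big(I+\tau_1^*[P(\omega)\Omega_x+Q(\omega)\Omega_y]\big)\Big\|_{\mathcal{L}(H^1,H^1)} \le \epsilon_1\tau_1^*. \tag{38}
\]
Then, using (38), (33) and (37), we get
\[
\begin{aligned}
\Big\|\frac{d}{d\omega}U[T^+;u,v,M]\Big\|_{L^2}
&\le \Big\|\frac{d}{d\omega}\Big[\Big(U[T^+;u,v,.] - \big(I+\tau_1^*[P(\omega)\Omega_x+Q(\omega)\Omega_y]\big)\Big)M\Big]\Big\|_{L^2} + \Big\|\frac{d}{d\omega}\Big[\big(I+\tau_1^*[P(\omega)\Omega_x+Q(\omega)\Omega_y]\big)M\Big]\Big\|_{L^2} \\
&\le \epsilon_1\tau_1^*\|M\|_{H^1} + \big(\|M'\|^2_{L^2} - \alpha\tau_1^*\big)^{1/2} \\
&\le \epsilon_1\tau_1^*\|M\|_{H^1} + \|M'\|_{L^2} - \frac{\alpha\tau_1^*}{2\|M'\|_{L^2}} \\
&< \|M'\|_{L^2}.
\end{aligned}
\]
Similarly, we have
\[
\begin{aligned}
\|U[T^+;u,v,M_{\varphi(n)}]'\|_{L^2}
&\le \epsilon_1\tau_1^*\|M_{\varphi(n)}\|_{H^1} + \|M_{\varphi(n)}'\|_{L^2} - \frac{\alpha\tau_1^*}{2\|M_{\varphi(n)}'\|_{L^2}} \\
&\le \epsilon_1\tau_1^*\|M\|_{H^1} + \|M_{\varphi(n)}'\|_{L^2} - \frac{\alpha\tau_1^*}{2\|M'\|_{L^2}} \\
&< \|M_{\varphi(n)}'\|_{L^2}.
\end{aligned}
\]
This ends the proof of the first statement of Lemma 1.

Proof of (2) of Lemma 1. Let $M = (x,y,z) \in \mathbb{S}^2$ be such that $M \neq e_3$. Then $x \neq 0$ or $y \neq 0$. We assume, for example, that $y \neq 0$. Thanks to (30), we have
\[
\begin{aligned}
U[2^+;\pi\delta_1+(\pi+\theta)\delta_2,0,.] &= \exp((\pi+\theta)\Omega_x)\exp(\omega\Omega_z)\exp(\pi\Omega_x)\exp(\omega\Omega_z) \\
&= \exp(\theta\Omega_x)\exp(\pi\Omega_x)\exp(\omega\Omega_z)\exp(\pi\Omega_x)\exp(\omega\Omega_z) \\
&= \exp(\theta\Omega_x)\exp(-\omega\Omega_z)\exp(\omega\Omega_z) = \exp(\theta\Omega_x).
\end{aligned}
\]
Thus,
\[
U[2^+;\pi\delta_1+(\pi+\theta)\delta_2,0,M] = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos(\theta) & -\sin(\theta) \\ 0 & \sin(\theta) & \cos(\theta) \end{pmatrix} M.
\]
We get the conclusion with $\theta \in [0,2\pi)$ such that
\[
U[2^+;\pi\delta_1+(\pi+\theta)\delta_2,0,M] = \begin{pmatrix} x \\ 0 \\ \sqrt{y^2+z^2} \end{pmatrix}.
\]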


5. Explicit Controls for the Asymptotic Exact Controllability to $e_3$

In this section, $\omega_* = 0$, $\omega^* = \pi$. We propose explicit controls realizing the asymptotic exact controllability to $-e_3$, locally around $-e_3$. First, let us introduce some notations. For a function $f : (-\pi,\pi) \to \mathbb{C}$, we denote by $c_n(f)$ its Fourier coefficients and by $N(f)$ their $l^1$-norm:
\[
c_n(f) := \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\omega)e^{-in\omega}\,d\omega, \qquad N(f) := \sum_{n\in\mathbb{Z}} |c_n(f)|.
\]
For a function $f : (0,\pi) \to \mathbb{C}$, we define
\[
c_n(f) := c_n(\tilde{f}),\ \forall n \in \mathbb{Z}, \qquad N(f) := N(\tilde{f}),
\]
where $\tilde{f} : (-\pi,\pi) \to \mathbb{C}$, $\tilde{f}(\omega) := f(|\omega|)$. For a vector valued map $M = (x,y,z) : (0,\pi) \to \mathbb{R}^3$, we define $N(M) := N(x) + N(y) + N(z)$. Then, we have the following results.

Lemma 2. For every $f, g : [0,\pi] \to \mathbb{C}$ such that $N(f), N(g) < \infty$, we have
\[
N(fg) \le N(f)N(g). \tag{39}
\]
For every $(x,y) \in L^1((0,\pi),\mathbb{R}^2)$, we have, with $Z := x + iy$,
\[
\frac{1}{2}\big(N(x)+N(y)\big) \le N(Z) \le N(x)+N(y). \tag{40}
\]
For every $M : [0,\pi] \to \mathbb{S}^2$ such that $N(M) < +\infty$ and $z(\omega) > 0$, $\forall \omega \in [-\pi,\pi]$, we have
\[
N(z-1) \le 2N(Z)^2. \tag{41}
\]
If, moreover, $N(Z) \le 1/4$, then
\[
\frac{1}{2}N(Z) \le N(M-e_3) \le 3N(Z). \tag{42}
\]
As a consequence, for a map $M : [0,\pi] \to \mathbb{S}^2$ such that $N(Z) \le 1/4$ and $z > 0$, the quantity $N(Z)$ measures the $N$-distance from $M$ to $e_3$.

Proof of Lemma 2. We have
\[
N(fg) = \sum_{n\in\mathbb{Z}} \Big|\sum_{p\in\mathbb{Z}} c_{n-p}(f)c_p(g)\Big| \le \sum_{p\in\mathbb{Z}} |c_p(g)| \sum_{n\in\mathbb{Z}} |c_{n-p}(f)| = N(f)N(g).
\]
The inequality (40) is a consequence of the triangle inequality because $N(Z) = N(x+iy)$ and $N(x)+N(y) = N((Z+\overline{Z})/2) + N((Z-\overline{Z})/2i)$. Let $M : [0,\pi] \to \mathbb{S}^2$ be such that $N(M) < +\infty$ and $z > 0$. We have
\[
N(z-1) = \Big|\frac{1}{2\pi}\int_{-\pi}^{\pi}\Big(1-\sqrt{1-|\tilde{Z}(\omega)|^2}\Big)d\omega\Big| + \sum_{n\in\mathbb{Z}-\{0\}} \Big|\frac{1}{2\pi}\int_{-\pi}^{\pi}\sqrt{1-|\tilde{Z}(\omega)|^2}\,e^{-in\omega}\,d\omega\Big|.
\]


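Using $1-\sqrt{1-x} \le x$, $\forall x \in (0,1)$, the expansion $\sqrt{1-x} = 1 + \sum_{p=1}^{\infty}\alpha_p x^p$, that converges uniformly with respect to $x \in [0,\delta]$ when $\delta < 1$ and where $\alpha_p < 0$ for every $p \in \mathbb{N}^*$, and (39), we get
\[
\begin{aligned}
N(z-1) &\le \|\tilde{Z}\|^2_{L^\infty(0,2\pi)} - \sum_{p=1}^{\infty}\alpha_p \sum_{n\in\mathbb{Z}-\{0\}} \Big|\frac{1}{2\pi}\int_{-\pi}^{\pi}|\tilde{Z}(\omega)|^{2p}e^{-in\omega}\,d\omega\Big| \\
&\le N(Z)^2 - \sum_{p=1}^{\infty}\alpha_p N(|Z|^{2p}) \\
&\le N(Z)^2 - \sum_{p=1}^{\infty}\alpha_p N(Z)^{2p} \\
&= N(Z)^2 + 1 - \sqrt{1-N(Z)^2} \\
&\le 2N(Z)^2.
\end{aligned}
\]
Formula (42) is a direct consequence of the previous inequalities.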
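Inequality (39) can be illustrated on trigonometric polynomials, for which the Fourier coefficients of a product are the discrete convolution of the coefficient sequences. The following is a minimal sketch with arbitrary random coefficients; the helper names `l1_norm` and `product_coeffs` are illustrative and not part of the paper.

```python
import numpy as np

def l1_norm(coeffs):
    """N(f): the l^1 norm of a finite sequence of Fourier coefficients."""
    return float(np.sum(np.abs(coeffs)))

def product_coeffs(cf, cg):
    """Fourier coefficients of f*g: c_n(fg) = sum_p c_{n-p}(f) c_p(g)."""
    return np.convolve(cf, cg)

rng = np.random.default_rng(0)
cf = rng.normal(size=7) + 1j * rng.normal(size=7)   # c_{-3}, ..., c_3 of f
cg = rng.normal(size=5) + 1j * rng.normal(size=5)   # c_{-2}, ..., c_2 of g

lhs = l1_norm(product_coeffs(cf, cg))
rhs = l1_norm(cf) * l1_norm(cg)
print(lhs <= rhs, lhs, rhs)
```

The index shift introduced by storing coefficients in arrays does not affect the $l^1$ norm, so the comparison is the same as in (39).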

The goal of this section is the proof of the following theorem.

Theorem 7. There exists $\delta > 0$ such that, for every $M_0 : [0,\pi] \to \mathbb{S}^2$ with $N[Z_0] < \delta$ and $z_0 < -1/2$, there exists $\epsilon = \epsilon(M_0) > 0$ such that the solution of (3) with $M(0) = M_0$,
\[
u(t) := \frac{\pi}{\epsilon}1_{[k,k+\epsilon]}(t) - \sum_{p=1}^{2k-1} \Im\big(c_{-k+p}(Z_0)\big)\,\frac{1}{\epsilon}1_{[k+p,k+p+\epsilon]}(t) + \frac{\pi}{\epsilon}1_{[3k,3k+\epsilon]}(t),
\]
\[
v(t) := -\sum_{p=1}^{2k-1} \Re\big(c_{-k+p}(Z_0)\big)\,\frac{1}{\epsilon}1_{[k+p,k+p+\epsilon]}(t),
\]
where $k = k(M_0) \in \mathbb{N}$ is such that
\[
\sum_{|n|>k} |c_n(Z_0)| < \frac{N(Z_0)}{4}, \tag{43}
\]
satisfies
\[
N[Z(3k+\epsilon)] < \frac{N[Z_0]}{2}, \qquad z(3k+\epsilon) < -1/2.
\]
By iterating this process, we find an increasing sequence $(t_n)_{n\in\mathbb{N}} \in [0,+\infty)^{\mathbb{N}}$ and two controls $u, v \in L^\infty_{loc}([0,+\infty),\mathbb{R})$ such that
\[
N[Z(t_n)] < \frac{1}{2^n}N[Z_0].
\]
Thus, $\|M(t_n)+e_3\|_{L^\infty} \to 0$ when $n \to +\infty$. These explicit controls provide the exact asymptotic controllability to $-e_3$. In Sect. 5.1, we present the heuristic of the proof of Theorem 7, which is detailed in Sect. 5.2.


5.1. Heuristic. Let us sketch the proof of Theorem 7. It is inspired by the return method, introduced in [6,7] and already used for the control of quantum systems in [2,4] (for other applications see the book [8]). It consists here in going close to +e3 in order to delete the main Fourier coefficients of the initial condition, and then to move back to −e3 . Notice that, when z > 0, the system (3) implies that Z˙ (t, ω) = iωZ (t, ω) − w(t) 1 − |Z (t, ω)|2 , (t, ω) ∈ (0, +∞) × (0, π ), (44) z˙ (t, ω) = −[w(t)Z (t, ω)], (t, ω) ∈ (0, +∞) × (0, π ). (45) We have Z 0 (ω) =

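\[
\sum_{n\in\mathbb{Z}} d_n e^{in\omega}, \quad \text{where } d_n := c_n(Z_0). \tag{46}
\]
Let $k \in \mathbb{N}^*$ that will be chosen later on. On the time interval $[0,k)$ we take $w = 0$, thus
\[
Z(k^-,\omega) = Z_0(\omega)e^{ik\omega} = \sum_{n\in\mathbb{Z}} d_n e^{i(n+k)\omega} \quad \text{and} \quad z(k^-,\omega) = z_0(\omega).
\]
At time $k$, we apply the control $w(t) = i\pi\delta_k(t)$ in order to move close to $+e_3$. Indeed, thanks to Definition 1, we have
\[
M(k^+,\omega) = \exp(\pi\Omega_x)M(k^-,\omega) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} M(k^-,\omega),
\]
thus
\[
Z(k^+,\omega) = \overline{Z(k^-,\omega)} = \sum_{n\in\mathbb{Z}} \overline{d_n}\,e^{i(-n-k)\omega} \quad \text{and} \quad z(k^+,\omega) = -z(k^-,\omega).
\]
On the time interval $(k,3k)$ we apply a control of the form
\[
w(t) = \sum_{p=1}^{2k-1} w_p\,\delta_{p+k}(t),
\]
where $w_p \in \mathbb{C}$. Approaching the nonlinear system (44) by its linearized system around $(Z \equiv 0, w \equiv 0)$, we get
\[
\begin{aligned}
Z(3k^-,\omega) &\sim \Big[Z(k^+,\omega) - \int_k^{3k} w(t)e^{-i\omega(t-k)}\,dt\Big]e^{i2k\omega} \\
&\sim \Big[\sum_{n\in\mathbb{Z}} \overline{d_n}\,e^{i(-n-k)\omega} - \sum_{p=1}^{2k-1} w_p e^{-ip\omega}\Big]e^{i2k\omega}.
\end{aligned}
\tag{47}
\]
Moreover, $z$ stays close to $+1$ because the control applied is small. Choosing $w_p := \overline{d_{p-k}}$, we get
\[
Z(3k^-,\omega) \sim \sum_{|n|\ge k} \overline{d_n}\,e^{i(-n+k)\omega}.
\]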


Finally, at time $3k$, we apply the control $w(t) = i\pi\delta_{3k}(t)$ in order to return to $-e_3$:
\[
Z(3k^+,\omega) = \overline{Z(3k^-,\omega)} \sim \sum_{|n|\ge k} d_n e^{i(n-k)\omega},
\]
and $z(3k^+,\omega) = -z(3k^-,\omega)$ is close to $-1$. Now, by choosing $k = k(Z_0)$ such that
\[
\sum_{|n|\ge k} |d_n| < \frac{1}{2}N(Z_0),
\]
we get the existence of a time $T = T(Z_0) := 3k$ and a control $w : [0,T] \to \mathbb{C}$ such that $N[Z(T)] < N[Z_0]/2$. Finally, the steps that need to be justified are
• the approximation of the nonlinear system by its linearized system in (47),
• the convergence, for the norm $N$, of the solutions of (3) when we approximate the Dirac controls by controls in $L^\infty_{loc}$.
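At the level of the linearized heuristic only, one pass of this procedure acts on the Fourier coefficients as a truncation followed by an index shift. The sketch below illustrates that bookkeeping; it ignores the nonlinear corrections that the proof has to control, and the function names `choose_k` and `linearized_step` are hypothetical.

```python
def choose_k(d):
    """Smallest k >= 1 with sum_{|n| >= k} |d_n| < N(Z_0) / 2.

    `d` maps an integer index n to the coefficient d_n (finitely many entries).
    """
    total = sum(abs(v) for v in d.values())
    if total == 0:
        return 1
    k = 1
    while sum(abs(v) for n, v in d.items() if abs(n) >= k) >= 0.5 * total:
        k += 1
    return k

def linearized_step(d):
    """One pass of the heuristic: the modes |n| < k are cancelled and the
    surviving coefficient d_n is moved to the index n - k."""
    k = choose_k(d)
    remaining = {n - k: v for n, v in d.items() if abs(n) >= k}
    return k, remaining

d0 = {0: 1.0, 1: 0.5j, -2: 0.2, 5: 0.1}      # arbitrary example coefficients
k, d1 = linearized_step(d0)
n0 = sum(abs(v) for v in d0.values())
n1 = sum(abs(v) for v in d1.values())
print(k, n1 < 0.5 * n0)
```

Iterating `linearized_step` mirrors the repeated halving of $N[Z]$ described in Theorem 7, again only within the linear approximation.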

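5.2. Proof of Theorem 7.

Let us recall that the solutions of (3) with Dirac controls have been defined in Definition 1, and that we have the following result.

Proposition 6. Let $\beta, \gamma \in \mathbb{R}$, $M_0 \in C^0([0,\pi],\mathbb{S}^2)$ be such that $N(M_0) < +\infty$. Let $M$ be the solution of (3) with $M(0) = M_0$, $u(t) = \beta\delta_0(t)$, and $v(t) = \gamma\delta_0(t)$. For $\epsilon > 0$, let $M^\epsilon$ be the (classical) solution of (3) with $M^\epsilon(0) = M_0$, $u(t) = (\beta/\epsilon)1_{(0,\epsilon)}(t)$, and $v(t) = (\gamma/\epsilon)1_{(0,\epsilon)}(t)$. Then
\[
N\big(M^\epsilon(\epsilon) - M(0^+)\big) \to 0 \quad \text{when } \epsilon \to 0. \tag{48}
\]

Proof of Proposition 6. We have
\[
M^\epsilon(\epsilon,\omega) = \exp[\epsilon|\omega|\Omega_z + \beta\Omega_x + \gamma\Omega_y]M_0(\omega), \qquad M(0^+,\omega) = \exp[\beta\Omega_x + \gamma\Omega_y]M_0(\omega).
\]
One has
\[
N\big(M^\epsilon(\epsilon) - M(0^+)\big) \le \sum_{n=1}^{+\infty} \frac{a_n(\epsilon)}{n!}, \tag{49}
\]
with
\[
a_n(\epsilon) := N\Big(\big[(\epsilon|\omega|\Omega_z + \beta\Omega_x + \gamma\Omega_y)^n - (\beta\Omega_x + \gamma\Omega_y)^n\big]M_0(\omega)\Big). \tag{50}
\]
Noticing that $N(|\omega|) < +\infty$, using (39) together with standard estimates and the Weierstrass M-test, one easily sees that (48) follows from (49) and (50).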
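For a single frequency, the convergence of the short-pulse flow to the Dirac flow can be observed directly on the rotation matrices. The sketch below is a pointwise illustration only, not the $N$-norm statement (48); it assumes the standard rotation generators for $\Omega_x$, $\Omega_y$, $\Omega_z$, and the helper names `exp_so3` and `pulse_error` are hypothetical.

```python
import numpy as np

def exp_so3(a):
    """exp(a[0]*Omega_x + a[1]*Omega_y + a[2]*Omega_z) by Rodrigues' formula,
    assuming the standard so(3) generators."""
    a = np.asarray(a, dtype=float)
    theta = np.linalg.norm(a)
    K = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    if theta < 1e-12:
        return np.eye(3) + K            # first-order approximation near zero
    return (np.eye(3) + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * K @ K)

def pulse_error(eps, omega, beta, gamma):
    """Frobenius distance between the flow of a pulse of width eps and
    amplitudes beta/eps, gamma/eps, and the Dirac jump exp(beta*Ox + gamma*Oy)."""
    pulse = exp_so3([beta, gamma, eps * omega])
    dirac = exp_so3([beta, gamma, 0.0])
    return float(np.linalg.norm(pulse - dirac))

for eps in (1e-1, 1e-3, 1e-5):
    print(eps, pulse_error(eps, omega=2.0, beta=0.7, gamma=-0.4))
```

For rotations the distance is bounded by the Frobenius norm of the difference of the exponents, here $\sqrt{2}\,\epsilon|\omega|$, which explains the linear decay in $\epsilon$.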

Thanks to Proposition 6, Theorem 7 is a consequence of the following result.


Theorem 8. There exists δ > 0 such that, for every M0 : [0, π ] → S2 with N [Z 0 ] < δ and z 0 < −1/2, the solution of (3) with M(0) = M0 , u(t) := π δk (t) −

2k−1

c−k+ p (Z 0 ) δk+ p (t) + π δ3k (t),

p=1

v(t) := −

2k−1

c−k+ p (Z 0 ) δk+ p (t),

p=1

where $k = k(M_0) \in \mathbb{N}$ is such that (43) holds, satisfies
\[ N[Z(3k^+)] < \tfrac{1}{2} N(Z_0), \tag{51} \]
\[ z(3k^+) < -\tfrac{1}{2}. \tag{52} \]

The key point of the proof of Theorem 8 is the following result. Proposition 7. There exist C > 0 and C > 0 such that, for every d0 ∈ C with |d0 | 1, for every M0 = (x0 , y0 , z 0 ) : [0, π ] → S2 with N (Z 0 ) 1 and z 0 > 0, the solution of (3) with M(0) = M0 , v(t) = −(d0 )δ0 (t), u(t) = (d0 )δ0 (t) satisfies N Z (0+ ) − Z 0 + d0 C|d0 | max{|d0 |, N (Z 0 )}, (53) z(0+ , ω) z 0 (ω) − C |d0 | max{|d0 |, N (Z 0 )}.

(54)

Proof of Proposition 7. Let us write d0 = β0 + iγ0 , with β0 , γ0 ∈ R. We have M(0+ , ω) = exp[β0 x + γ0 y ]M0 (ω). Using the decomposition exp[β0 x + γ0 y ] = I + β0 x + γ0 y + R, where R = O(|d0 |2 ) as d0 → 0, (55) we get Z (0+ , ω) = Z 0 (ω) − d0 z 0 (ω) + R1 x0 (ω) + R2 y0 (ω) + R3 z 0 (ω), where R j ∈ C, |R j | C|d0 |2 for j = 1, 2, 3, and C is a universal constant. Therefore, we have cn [Z (0+ ) − Z 0 + d0 ] = d0 cn [1 − z 0 ] + R1 cn [x0 ] + R2 cn [y0 ] + R3 cn [z 0 ], ∀n ∈ Z. Using (40) and (41), we get N Z (0+ ) − Z 0 + d0 |d0 |N (z 0 − 1) + |R1 |N (x0 ) + |R2 |N (y0 ) + |R3 |N (z 0 ) 2|d0 |N (Z 0 )2 + C|d0 |2 [2N (Z 0 ) + 1 + 2N (Z 0 )2 ], which gives (53) with C = 2 + 5C. From (55) we get z(0+ , ω) = z 0 (ω) + d 0 Z 0 (ω) + R1 x0 (ω) + R2 y0 (ω) + R3 z 0 (ω),


where R j ∈ C, |R j | C |d0 |2 for j = 1, 2, 3, where C is another universal constant. Using (40) and (41), we get z(0+ , ω) z 0 (ω) − |d0 ||Z 0 (ω)| − C |d0 |2 [|x0 (ω)| + |y0 (ω)| + |z 0 (ω)|] z 0 (ω) − |d0 |N (Z 0 ) − C |d0 |2 [N (x0 ) + N (y0 ) + N (z 0 )] z 0 (ω) − |d0 |N (Z 0 ) − C |d0 |2 [2N (Z 0 ) + 1 + 2N (Z 0 )2 ],

which gives (54) with C = 1 + 5C . Proof of Theorem 8. Let δ be such that 4Cδ < 1, C δ < 1/2, δ ∈ (0, 1),

(56)

where C, C are as in Proposition 7. Let M0 , k, u, v be as in Theorem 8. We use the notation (46). First step: on [0,k]. We have (see the previous section) Z (k + , ω) = dn ei(−n−k)ω and z(k + , ω) = −z 0 (ω). (57) n∈Z

Second step: on (k,3k). Let us prove by induction on p ∈ {0, . . . , 2k − 1} that for every p ∈ {0, . . . , 2k − 1}, we have ! (H p ): N Z (k + p)+ − n∈Z−{−k+1,...,−k+ p} dn ei(−n−k+ p)|ω| (58) C[|d−k+1 | + · · · + |d−k+ p| ]N (Z 0 ), (H p ): z((k + p)+ , ω) −z 0 (ω) − C [|d−k+1 | + · · · + |d−k+ p |]N (Z 0 ).

(59)

Notice that (H2k−1 ) and (56) provide ⎡ ⎤ − N ⎣ Z 3k − dn ei(−n+k)|ω| ⎦ C N (Z 0 )2 , n∈Z,|n|k

thus, thanks to (56) and (43), we have N [Z (3k − )] < N [Z 0 ]/2.

(60)

) and (56), We also have, thanks to (H2k−1

z(3k − , ω) = z((3k − 1)+ , ω) −z 0 (ω) + C δ 2 > 0, thus z(3k − , ω) =

1 − |Z (3k + , ω)|2 >

1 − δ/2 > 1/2.

(61)

The properties (H0 ) and (H0 ) come from (57). Now, let p ∈ {1, . . . , 2k − 1} and let us assume that (H p−1 ) and (H p−1 ) hold. Thanks to (H p−1 ) and (56), we have N [Z ((k + p)− )] = N [Z ((k + p − 1)+ )] N [Z 0 ] δ 1,


and thanks to (H p−1 ) we have

z((k + p)− , ω) = z((k + p − 1)+ , ω) −z 0 (ω) − C N (Z 0 )2 >

1 1 − = 0, 2 2

thus we can apply Proposition 7. Thanks to Proposition 7 and (H p−1 ), we get & ' + i(−n−k+ p)ω N Z (k + p) − dn e n∈Z−{−k+1,...,−k+ p}

N Z& (k + p)+ − Z (k + p)− + d−k+ p ' − i(−n−k+ p)ω dn e +N Z (k + p) − n∈Z−{−k+1,...,−k+ p−1}

− C|d−k+ & p |N [Z ((k + p) )] +N Z (k + p − 1)+ −

n∈Z−{−k+1,...,−k+ p−1}

' dn

ei(−n−k+ p−1)ω

C|d−k+ p |N [Z 0 ] + C[|d−k+1 | + · · · + |d−k+ p−1 |]N (Z 0 ), ), we get which proves (H p ). Thanks to Proposition 7 and (H p−1

z((k + p)+ , ω) z((k + p)− , ω) − C |d−k+ p |N [Z ((k + p)− )] z((k + p − 1)+ , ω) − C |d−k+ p |N [Z 0 ] −z 0 (ω) − C [|d−k+1 + · · · + |d−k+ p−1 | + |d−k+ p |]N [Z 0 ]. Third step: at 3k. We have Z (3k + , ω) = Z (3k − , ω) and z(3k + , ω) = −z(3k − , ω), thus (60) and (61) give (51) and (52).

6. Comparison

In this section, we compare the control results and processes presented in Sects. 4 and 5. First, let us compare the statements of Theorems 6 (or Corollary 1) and 7. On one hand, the statement of Theorem 6 is stronger than that of Theorem 7 because it is global and it gives the approximate controllability of (3) for the norms $\|\cdot\|_{H^s}$, $\forall s < 1$ (whereas Theorem 7 only provides the approximate controllability for $N$). On the other hand, Theorem 7 is stronger than Theorem 6 because it needs less regular initial data.

Now, let us compare the control processes detailed in the proofs of Theorems 6 and 7. Given $M_0 \in H^1((\omega_*,\omega^*), S^2)$, the proofs of Lemma 1 and Proposition 5 give an explicit way to find $T > 0$ and $u, v \in D$ such that $\|U[T^+; u, v, M_0]\|_{L^2} < \|M_0\|_{L^2}$. Iterating this process, we produce a sequence of reachable points $(M_n)_{n\in\mathbb{N}} \subset H^1((\omega_*,\omega^*), S^2)$ such that $(\|M_n\|_{L^2})_{n\in\mathbb{N}}$ decreases. We expect that $\|M_n\|_{L^2} \to 0$ when $n \to +\infty$, and once this norm is small enough, we apply a control given in Lemma 1 (2) to go closer to $e_3$. However, the sequence $(M_n)_{n\in\mathbb{N}}$ may not converge to 0. Thus, the control process presented in Sect. 4 is not completely satisfying from a practical point of view. Moreover, even if the sequence $(M_n)_{n\in\mathbb{N}}$ converges to 0 in $L^2((\omega_*,\omega^*), \mathbb{R}^3)$, the controllability process may take a long time (in particular, the controllability time is not a priori bounded by a quantity depending only on $\|M_0\|_{H^1}$) and cost a lot (because at each step, one has to compute new controls $u$ and $v$, and because the commands proposed in the proof of Proposition 5 involve many trips between $-e_3$ and $+e_3$). On the contrary, the controllability process presented in Sect. 5 works within a time $T$ which is explicit, with controls $u, v$ that are also explicit in terms of the Fourier coefficients of $M_0$, and needs only two trips between $\pm e_3$. Thus, the time and the cost are well known.

Let us compare the time and the cost involved in the two controllability processes on a particular example. We take $(\omega_*,\omega^*) = (0, \pi/2)$, and an initial data of the form
\[ M_0(\omega) = \begin{pmatrix} \epsilon x_\epsilon(\omega) \\ 0 \\ \sqrt{1 - \epsilon^2 x_\epsilon(\omega)^2} \end{pmatrix}, \]
where $\epsilon > 0$ is small,
\[ x_\epsilon(\omega) = \sum_{k=1}^{N} a_k(\epsilon)\cos((2k-1)\omega) + \cos((2N+1)\omega), \quad \forall \omega \in (0, \pi/2), \tag{62} \]
and $(a_k(\epsilon))_{1\le k\le N} \in \mathbb{R}^N$ are such that
\[ \int_0^{\pi/2} \frac{x_\epsilon(\omega)}{\sqrt{1 - \epsilon^2 x_\epsilon(\omega)^2}}\,\omega^K\,d\omega = 0, \quad \forall K \in \{0, \dots, N-1\}. \tag{63} \]

We will prove later the existence of such coefficients. We want to reach e3 . Let us apply the strategy presented in Sect. 4, to find explicit T > 0, u, v ∈ D such that U [T + ; u, v, M0 ] L 2 < M0 L 2 . One needs a polynomial Q ∈ R[X ] such that π/2 (zx − z x)Qdω < 0. 0

Then, deg(Q) N , because of (63). Thanks to the proof of Lemma 1, there exists τ ∗ = τ ∗ (Q, x ) > 0 and α > 0 such that, for every τ ∈ (0, τ ∗ ), there exist T > 0, u, v ∈ D such that U [T + ; u, v, M0 ] 2L 2 M0 2L 2 − ατ. However τ ∗ cannot be quantified, thus, we do not know the size of the decrease. Moreover, as emphasized in the proof of Proposition 5, the time of control T satisfies T 2 N τ 1/N (time needed to generate I + τ ω N x + o(τ )) and one makes more than 2 N trips between ±e3 (just count how many times the matrices exp(π x ) or exp(π y ) appear in the generation of I + τ ω N x + o(τ ) in the proof of Proposition 5). With the strategy of Sect. 5 taking the same explicit expression for M0 on (−π, π ), we know the existence of ∗ > 0 such that, for every ∈ (0, ∗ ), the explicit controls u(t) := π δ2N +1 (t) + π δ6N +3 (t), v(t) := −

N 2N a N +1−m ()δ2N +1+2m (t) − am−N ()δ2N +1+2m (t), 2 2 m=1

m=N +1


with the convention $a_{N+1}(\epsilon) = a_{-N-1}(\epsilon) = 1$, realize
\[ N\big[U[(6N+3)^+; u, v, M_0] + e_3\big] < \tfrac{1}{2} N[M_0 + e_3]. \]
Here, the controls are explicit, the time scales like $6N$, we have a bound from below for the decrease of the $N$-distance to $e_3$, and the process needs only 2 trips between $\pm e_3$. Now, let us prove the existence of the coefficients $(a_k(\epsilon))_{1\le k\le N}$.

Lemma 3. Let $N \in \mathbb{N}^*$.
(i) The matrix $A \in M_N(\mathbb{R})$ with coefficients
\[ A_{k,K} := \int_0^{\pi/2} (2k-1)\sin((2k-1)\omega)\,\omega^K\,d\omega, \quad 1 \le k \le N,\ 0 \le K \le N-1, \]
is invertible.
(ii) There exist $\epsilon^* > 0$ and a $C^1$ map $\epsilon \in [0, \epsilon^*] \mapsto (a_k(\epsilon))_{1\le k\le N} \in \mathbb{R}^N$ such that (62)-(63) hold.

Proof. (i) We assume that $A$ is not invertible. Then, there exists $(\lambda_1, \dots, \lambda_N) \in \mathbb{R}^N - \{0\}$ such that
\[ \int_0^{\pi/2} \sum_{k=1}^{N} \lambda_k \sin((2k-1)\omega)\,\omega^K\,d\omega = 0, \quad \forall\, 0 \le K \le N-1. \tag{64} \]
Let $f(\omega) := \sum_{k=1}^{N} \lambda_k \sin((2k-1)\omega)$ and let $0 < \omega_1 < \cdots < \omega_L < \pi/2$ be all the points of the open interval $(0, \pi/2)$ at which $f$ vanishes and changes its sign. Then, the function $\omega \mapsto f(\omega)(\omega-\omega_1)\cdots(\omega-\omega_L)$ has a constant sign on $(0, \pi/2)$ and is not identically zero, thus
\[ \int_0^{\pi/2} f(\omega)(\omega-\omega_1)\cdots(\omega-\omega_L)\,d\omega \ne 0. \]
The assumption (64) ensures that $L \ge N$. Thanks to trigonometric formulas, there exists $(\mu_1, \dots, \mu_N) \in \mathbb{R}^N - \{0\}$ such that
\[ f(\omega) = \sum_{k=1}^{N} \mu_k \sin(\omega)^{2k-1} = \sin(\omega) \sum_{k=1}^{N} \mu_k \sin(\omega)^{2(k-1)}. \]
Since $\omega_1, \dots, \omega_N \in (0, \pi/2)$, the quantities $\sin(\omega_1)^2, \dots, \sin(\omega_N)^2$ are pairwise distinct and different from zero, so they provide $N$ roots for the polynomial
\[ \sum_{k=1}^{N} \mu_k X^{k-1}, \]
which has degree at most $N-1$ and is not identically zero. This is a contradiction.
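As a sanity check on Lemma 3 (i), the matrix $A$ can be assembled numerically for small $N$. The sketch below is illustrative only: the helper names are hypothetical, the integrals are approximated with a composite Simpson rule, and a nonzero numerical determinant is evidence for, not a proof of, invertibility.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3

def lemma3_matrix(N):
    """A[k-1][K] ~ int_0^{pi/2} (2k-1) sin((2k-1) w) w^K dw, 1<=k<=N, 0<=K<=N-1."""
    return [[simpson(lambda w, m=2 * k - 1, K=K: m * math.sin(m * w) * w ** K,
                     0.0, math.pi / 2)
             for K in range(N)]
            for k in range(1, N + 1)]

def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n, d = len(M), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[p][c] == 0.0:
            return 0.0
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for j in range(c, n):
                M[r][j] -= factor * M[c][j]
    return d

for N in range(1, 5):
    print(N, det(lemma3_matrix(N)))
```

For $N = 2$ the entries can be computed by hand, $A = \begin{pmatrix} 1 & 1 \\ 1 & -1/3 \end{pmatrix}$, so $\det A = -4/3$, which the quadrature reproduces to high accuracy.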


(ii) Thanks to (i), there exists (α1 , . . . , α N ) ∈ R N such that π/2 N αk (2k − 1) sin((2k − 1)ω) + (2N + 1) sin((2N + 1)ω) ω K dω = 0, 0

k=1

∀0 K N . There exists M > 0 such that N αk cos((2k − 1)ω) + cos((2N + 1)ω) M, ∀ω ∈ (0, π/2). k=1

When b = (b1 , . . . , b N )t ∈ R N we have N √ (αk + bk ) cos((2k − 1)ω) + cos((2N + 1)ω) M + N b, k=1

thus the following map F is well defined: 1 F : 0, 2M × BR N 0, √M → R N N , ( , b), → F(, b)

π/2

F(, b) := 0

yb (ω)

ω dω 1 − 2 yb (ω)2

,

K

1K N

where yb (ω) :=

N (αk + bk ) cos((2k − 1)ω) + cos((2N + 1)ω), ∀ω ∈ (0, π/2). k=1

Then F(0, 0) = 0 and db F(0, 0) is invertible, thanks to (i). Thus, the implicit function theorem gives the conclusion.

Acknowledgements. The authors would like to thank Gabriel Turinici for helpful comments.

References
1. Ball, J.M., Marsden, J.E., Slemrod, M.: Controllability for distributed bilinear systems. SIAM J. Control Optim. 20(4), 575–597 (1982)
2. Beauchard, K.: Local controllability of a 1-D Schrödinger equation. J. Math. Pures Appl. 84, 851–956 (2005)
3. Beauchard, K.: Local controllability of a 1D beam equation. SIAM J. Control Optim. 47(3), 1219–1273 (2008)
4. Beauchard, K., Coron, J.-M.: Controllability of a quantum particle in a moving potential well. J. Funct. Anal. 232(2), 328–389 (2006)
5. Chambrion, T., Mason, P., Sigalotti, M., Boscain, U.: Controllability of the discrete-spectrum Schrödinger equation driven by an external field. Ann. Inst. H. Poincaré Anal. Non Linéaire 26(1), 329–349 (2009)
6. Coron, J.-M.: Global asymptotic stabilization for controllable systems without drift. Math. Control Signals Systems 5(3), 295–312 (1992)
7. Coron, J.-M.: On the controllability of 2-D incompressible perfect fluids. J. Math. Pures Appl. 75(2), 155–188 (1996)
8. Coron, J.-M.: Control and Nonlinearity. Mathematical Surveys and Monographs, Vol. 136, Providence, RI: Amer. Math. Soc., 2007
9. Hartman, P.: Ordinary Differential Equations. New York: John Wiley and Sons, 1964
10. Li, J.-S., Khaneja, N.: Ensemble controllability of the Bloch equations. In: Proceedings of the 45th IEEE Conference on Decision & Control, Washington, DC: IEEE Comp. Soc., (San Diego, 2006), pp. 2483–2487, 2006
11. Li, J.-S., Khaneja, N.: Control of inhomogeneous quantum ensemble. Phys. Rev. A 73, 030302(R) (2006)
12. Li, J.-S., Khaneja, N.: Ensemble control of Bloch equations. IEEE Trans. Automatic Control, 2009, to appear
13. Mirrahimi, M.: Lyapunov control of a quantum particle in a decaying potential. Ann. Inst. H. Poincaré Anal. Non Lin. 26(5), 1743–1765 (2009)
14. Nersesyan, V.: Growth of Sobolev norms and controllability of Schrödinger equations. Preprint 2008, available at http://arxiv.org/abs/0864.3982v2[math.AP], 2008
15. Turinici, G.: On the controllability of bilinear quantum systems. In: C. Le Bris, M. Defranceschi, eds., Mathematical Models and Methods for Ab Initio Quantum Chemistry, Volume 74 of Lecture Notes in Chemistry, Berlin-Heidelberg-New York: Springer, 2000
16. Turinici, G., Rabitz, H.: Optimally controlling the internal dynamics of a randomly oriented ensemble of molecules. Phys. Rev. A 70, 063412 (2004)
17. Zeidler, E.: Nonlinear Functional Analysis and its Applications, Vol. 4: Applications to mathematical physics. New York: Springer, 1988

Communicated by I.M. Sigal

Commun. Math. Phys. 296, 559–587 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1028-5

Communications in

Mathematical Physics

A Priori Estimates for the Free-Boundary 3D Compressible Euler Equations in Physical Vacuum Daniel Coutand1 , Hans Lindblad2 , Steve Shkoller3 1 CANPDE, Maxwell Institute for Mathematical Sciences and Department of Mathematics,

Heriot-Watt University, Edinburgh, EH14 4AS, UK. E-mail: [email protected]

2 Department of Mathematics, University of California, San Diego, CA 92093, USA.

E-mail: [email protected]

3 Department of Mathematics, University of California, Davis, CA 95616, USA.

E-mail: [email protected] Received: 10 April 2009 / Accepted: 27 December 2009 Published online: 11 March 2010 – © The Author(s) 2010. This article is published with open access at Springerlink.com

Abstract: We prove a priori estimates for the three-dimensional compressible Euler equations with moving physical vacuum boundary, with an equation of state given by p(ρ) = Cγ ρ γ for γ > 1. The vacuum condition necessitates the vanishing of the pressure, and hence density, on the dynamic boundary, which creates a degenerate and characteristic hyperbolic free-boundary system to which standard methods of symmetrizable hyperbolic equations cannot be applied.

Contents
1. Introduction
2. Notation and Weighted Spaces
3. The Lagrangian Vorticity
4. Properties of the Determinant J, Cofactor Matrix a, Unit Normal n, and a Polynomial-type Inequality
5. Trace Estimates and the Hodge Decomposition Elliptic Estimates
6. The a priori Estimates
7. The Case of General γ > 1

1. Introduction

1.1. The compressible Euler equations in Eulerian variables. For $0 \le t \le T$, the evolution of a three-dimensional compressible gas moving inside of a dynamic vacuum boundary is modeled by the one-phase compressible Euler equations:
\begin{align}
\rho[u_t + u\cdot Du] + Dp(\rho) &= 0 && \text{in } \Omega(t), \tag{1.1a}\\
\rho_t + \operatorname{div}(\rho u) &= 0 && \text{in } \Omega(t), \tag{1.1b}\\
p &= 0 && \text{on } \Gamma(t), \tag{1.1c}\\
V(\Gamma(t)) &= u\cdot n, && \tag{1.1d}\\
(\rho, u) &= (\rho_0, u_0) && \text{on } \Omega(0), \tag{1.1e}\\
\Omega(0) &= \Omega. && \tag{1.1f}
\end{align}
The open, bounded subset $\Omega(t) \subset \mathbb{R}^3$ denotes the changing volume occupied by the gas, $\Gamma(t) := \partial\Omega(t)$ denotes the moving vacuum boundary, $V(\Gamma(t))$ denotes the normal velocity of $\Gamma(t)$, and $n$ denotes the exterior unit normal vector to $\Gamma(t)$. The vector field $u = (u_1, u_2, u_3)$ denotes the Eulerian velocity field, $p$ denotes the pressure function, and $\rho$ denotes the density of the gas. The equation of state $p(\rho)$ is given by
\[ p(x,t) = C_\gamma\,\rho(x,t)^\gamma \quad \text{for } \gamma > 1, \tag{1.2} \]
where $C_\gamma$ is the adiabatic constant, which we set to unity, and
\[ \rho > 0 \ \text{in } \Omega(t) \quad \text{and} \quad \rho = 0 \ \text{on } \Gamma(t). \]
Equation (1.1a) is the conservation of momentum; (1.1b) is the conservation of mass; the boundary condition (1.1c) states that the pressure (and hence the density) vanishes along the vacuum boundary; (1.1d) states that the vacuum boundary is moving with the normal component of the fluid velocity; and (1.1e)–(1.1f) are the initial conditions for the density, velocity, and domain. Using the equation of state (1.2), (1.1a) is written as
\[ \rho[u_t + u\cdot Du] + D\rho^\gamma = 0 \quad \text{in } \Omega(t). \tag{1.1a'} \]

1.2. Physical vacuum. With the sound speed given by $c := \sqrt{\partial p/\partial\rho}$ and $N$ denoting the outward unit normal to $\Gamma$, satisfaction of the condition
\[ \frac{\partial c_0^2}{\partial N} < 0 \quad \text{on } \Gamma \tag{1.3} \]
defines a physical vacuum boundary (see [10,12–15,20]), where $c_0 = c|_{t=0}$. The physical vacuum condition (1.3) is equivalent to the requirement that
\[ \frac{\partial \rho_0^{\gamma-1}}{\partial N} < 0 \quad \text{on } \Gamma. \tag{1.4} \]
Since $\rho_0 > 0$ in $\Omega$, (1.4) implies that for some positive constant $C$ and $x \in \Omega$ near the vacuum boundary $\Gamma$,
\[ \rho_0^{\gamma-1}(x) \ge C\,\mathrm{dist}(x, \Gamma) \quad \text{for } x \text{ near } \Gamma. \tag{1.5} \]
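To make conditions (1.3) and (1.5) concrete, here is a small numerical sketch on a slab where the vacuum boundary is $\{x_3 = 1\}$ and $N = (0,0,1)$. The two density profiles and all function names are illustrative assumptions, not taken from the paper; with $C_\gamma = 1$ the sound speed satisfies $c_0^2 = \gamma\rho_0^{\gamma-1}$.

```python
def normal_derivative_c0_squared(rho0, gamma, x3=1.0, h=1e-6):
    """One-sided difference quotient of c0^2 = gamma * rho0^(gamma-1) along N at x3."""
    c2 = lambda s: gamma * rho0(s) ** (gamma - 1)
    return (c2(x3) - c2(x3 - h)) / h

def min_ratio_to_distance(rho0, gamma, samples=1000):
    """Smallest sampled value of rho0^(gamma-1) / dist(x, Gamma), with dist = 1 - x3."""
    ratios = []
    for j in range(1, samples + 1):
        x3 = 1.0 - j / samples
        dist = 1.0 - x3
        ratios.append(rho0(x3) ** (gamma - 1) / dist)
    return min(ratios)

linear = lambda x3: 1.0 - x3             # hypothetical profile, vanishes like dist
quadratic = lambda x3: (1.0 - x3) ** 2   # hypothetical profile, vanishes like dist^2

gamma = 2
print(normal_derivative_c0_squared(linear, gamma), min_ratio_to_distance(linear, gamma))
print(normal_derivative_c0_squared(quadratic, gamma), min_ratio_to_distance(quadratic, gamma))
```

The linear profile has a strictly negative normal derivative of $c_0^2$ and a ratio bounded below, as in (1.3) and (1.5). For the quadratic profile the difference quotient shrinks with the step `h` and the ratio shrinks with the distance, so no positive constant $C$ works in (1.5).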

Because of condition (1.5), the compressible Euler system (1.1) is a degenerate and characteristic hyperbolic system to which standard methods of symmetric hyperbolic conservation laws cannot be applied. We note that by choosing a lower bound with a faster rate of degeneracy such as, for example, $\mathrm{dist}(x, \Gamma(t))^b$ for $b = 2, 3, \dots$, the analysis becomes significantly easier; for instance, if $b = 2$, then
\[ \frac{D\rho_0^{\gamma-1}(x,t)}{\sqrt{\rho_0^{\gamma-1}(x,t)}} \]
is bounded for all $x \in \Omega$. This bound makes

it possible to easily control error terms in energy estimates, and in effect removes the singular behavior associated with the physical vacuum condition (1.5).

1.3. Fixing the domain and the Lagrangian variables on Ω. We transform the system (1.1) into Lagrangian variables. We let $\eta(x,t)$ denote the "position" of the gas particle $x$ at time $t$. Thus,
\[ \partial_t\eta = u\circ\eta \ \text{ for } t > 0 \quad \text{and} \quad \eta(x,0) = x, \]
where $\circ$ denotes composition so that $[u\circ\eta](x,t) := u(\eta(x,t),t)$. We set
\begin{align*}
v &= u\circ\eta && \text{(Lagrangian velocity)},\\
f &= \rho\circ\eta && \text{(Lagrangian density)},\\
A &= [D\eta]^{-1} && \text{(inverse of deformation tensor)},\\
J &= \det D\eta && \text{(Jacobian determinant)},\\
a &= J A && \text{(transpose of cofactor matrix)}.
\end{align*}
Using Einstein's summation convention defined in Sect. 2.3 below, and using the notation $F,_k$ to denote $\partial F/\partial x_k$, the $k$th partial derivative of $F$ for $k = 1, 2, 3$, the Lagrangian version of Eqs. (1.1a)–(1.1b) can be written on the fixed reference domain $\Omega$ as
\begin{align}
f v_t^i + A_i^k f^\gamma,_k &= 0 && \text{in } \Omega\times(0,T], \tag{1.6a}\\
f_t + f A_i^j v^i,_j &= 0 && \text{in } \Omega\times(0,T], \tag{1.6b}\\
f &= 0 && \text{on } \Gamma\times(0,T], \tag{1.6c}\\
(f, v, \eta) &= (\rho_0, u_0, e) && \text{in } \Omega\times\{t=0\}, \tag{1.6d}
\end{align}
where $e(x) = x$ denotes the identity map on $\Omega$. Since $J_t = J A_i^j v^i,_j$, it follows that
\[ f = \rho_0 J^{-1}, \tag{1.7} \]
so that the initial density function $\rho_0$ can be viewed as a parameter in the Euler equations. Let $\Gamma := \partial\Omega$ denote the initial vacuum boundary; using that $A_i^k = J^{-1}a_i^k$, we write the compressible Euler equations (1.6) as
\begin{align}
\rho_0 v_t^i + a_i^k(\rho_0^\gamma J^{-\gamma}),_k &= 0 && \text{in } \Omega\times(0,T], \tag{1.8a}\\
(\eta, v) &= (e, u_0) && \text{in } \Omega\times\{t=0\}, \tag{1.8b}\\
\rho_0^{\gamma-1} &= 0 && \text{on } \Gamma, \tag{1.8c}
\end{align}
with $\rho_0^{\gamma-1}(x) \ge C\,\mathrm{dist}(x,\Gamma)$ for $x \in \Omega$ near $\Gamma$.


1.4. Setting γ = 2. We will focus our analysis on the case $\gamma = 2$, and in Sect. 7, we will explain the changes in the higher-order energy function for the general case of $\gamma > 1$. We seek solutions $\eta$ to the following system:
\begin{align}
\rho_0 v_t^i + a_i^k(\rho_0^2 J^{-2}),_k &= 0 && \text{in } \Omega\times(0,T], \tag{1.9a}\\
(\eta, v) &= (e, u_0) && \text{on } \Omega\times\{t=0\}, \tag{1.9b}\\
\rho_0 &= 0 && \text{on } \Gamma, \tag{1.9c}
\end{align}
with $\rho_0(x) \ge C\,\mathrm{dist}(x,\Gamma)$ for $x \in \Omega$ near $\Gamma$. Equation (1.9a) is equivalent to
\[ v_t^i + 2A_i^k(\rho_0 J^{-1}),_k = 0, \tag{1.10} \]
and (1.10) can be written as
\[ v_t^i + \rho_0 a_i^k J^{-2},_k + 2\rho_0,_k a_i^k J^{-2} = 0. \tag{1.11} \]
Because of the degeneracy caused by $\rho_0 = 0$ on $\Gamma$, all three equivalent forms of the compressible Euler equations are crucially used in our analysis. Equation (1.9a) is used for energy estimates, while (1.10) is used for estimates of the vorticity, and (1.11) is used for additional elliptic-type estimates used to recover the bounds for normal derivatives.

1.5. The reference domain Ω. To avoid the use of local coordinate charts necessary for arbitrary geometries, for simplicity, we will assume that the initial domain $\Omega \subset \mathbb{R}^3$ at time $t = 0$ is given by
\[ \Omega = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid (x_1, x_2) \in \mathbb{T}^2,\ x_3 \in (0,1)\}, \]
where $\mathbb{T}^2$ denotes the 2-torus, which can be thought of as the unit square with periodic boundary conditions. This permits the use of one global Cartesian coordinate system. At $t = 0$, the reference vacuum boundary is the top boundary $\Gamma = \{x_3 = 1\}$, while the bottom boundary $\{x_3 = 0\}$ is fixed with boundary condition $u_3 = 0$ on $\{x_3 = 0\}\times[0,T]$. The moving vacuum boundary is then given by $\Gamma(t) = \eta(t)(\Gamma) = \eta(x_1, x_2, 1, t)$.

1.6. The higher-order energy function. For $\gamma = 2$, the physical energy $\int_\Omega [\tfrac{1}{2}\rho_0|v|^2 + \rho_0^2 J^{-1}]\,dx$ is a conserved quantity, but is far too weak for the purposes of constructing solutions; instead, we consider the higher-order energy function

\begin{align}
E(t) ={}& \sum_{a=0}^{4} \|\partial_t^{2a}\eta(t)\|_{4-a}^2 + \sum_{a=0}^{4} \Big[ \|\rho_0\,\bar\partial^{4-a}\partial_t^{2a} D\eta(t)\|_0^2 + \|\sqrt{\rho_0}\,\bar\partial^{4-a}\partial_t^{2a} v(t)\|_0^2 \Big] \nonumber\\
&+ \sum_{a=0}^{3} \|\rho_0\,\partial_t^{2a} J^{-2}(t)\|_{4-a}^2 + \|\operatorname{curl}_\eta v(t)\|_3^2 + \|\rho_0\,\bar\partial^4 \operatorname{curl}_\eta v(t)\|_0^2, \tag{1.12}
\end{align}
where $\bar\partial = (\partial/\partial x_1, \partial/\partial x_2)$. Section 2 explains the notation. While this function is not conserved, it is possible to show that $\sup_{t\in[0,T]} E(t)$ remains bounded for sufficiently smooth solutions of (1.9), whenever $T > 0$ is taken sufficiently small; the bound depends only on $E(0)$.

1.7. Main Result.

Theorem 1.1 (The case γ = 2). Suppose that $\eta(t)$ is a smooth solution of (1.9) on a time interval $[0, \bar T]$ satisfying the initial bound $E(0) < \infty$, and that the initial density function $\rho_0 \in H^4(\Omega)$, with $\rho_0 > 0$ in $\Omega$, satisfies the physical vacuum condition (1.5). Then for $T > 0$ taken sufficiently small, the energy function $E(t)$ constructed from the solution $\eta(t)$ satisfies the a priori estimate
\[ \sup_{t\in[0,T]} E(t) \le M_0, \]
where $M_0$ and $T$ are functions of $E(0)$.

Of course, our theorem also covers the case that $\Omega \subset \mathbb{R}^d$ for $d = 1$ or 2, and by using a collection of coordinate charts, we can allow arbitrary initial domains, as long as the initial boundary is of Sobolev class $H^{3.5}$. We announced Theorem 1.1 in [4].

1.8. History of prior results for the compressible Euler equations with vacuum boundary. We are aware of only a handful of previous theorems pertaining to the existence of solutions to the compressible and inviscid Euler equations with moving vacuum boundary. Makino [16] considered compactly supported initial data, and treated the compressible Euler equations for a gas as being set on $\mathbb{R}^3\times(0,T]$. With his methodology, it is not possible to track the location of the vacuum boundary (nor is it necessary); nevertheless, an existence theory was developed in this context, by a variable change that permitted the standard theory of symmetric hyperbolic systems to be employed. Unfortunately, the constraints on the data are too severe to allow for the evolution of the physical vacuum boundary.

In [11], Lindblad proved existence and uniqueness for the 3D compressible Euler equations modeling a liquid rather than a gas. For a compressible liquid, the density $\rho > 0$ is assumed to be a positive constant on the moving vacuum boundary $\Gamma(t)$ and is thus uniformly bounded below by a positive constant. As such, the compressible liquid provides a uniformly hyperbolic, but characteristic, system. Lindblad used Lagrangian variables combined with Nash-Moser iteration to construct solutions. More recently, Trakhinin [19] provided an alternative proof for the existence of a compressible liquid, employing a solution strategy based on symmetric hyperbolic systems combined with Nash-Moser iteration.

The only existence theory for the physical vacuum singularity that we are aware of can be found in the recent paper by Jang and Masmoudi [6] for the 1D compressible


gas; we refer the interested reader to the introduction in that paper for a nice history of the analysis of the 1D compressible Euler equations with damping.

1.9. Generalization of the isentropic gas assumption. The general form of the compressible Euler equations in three space dimensions is the 5 × 5 system of conservation laws
\begin{align}
\rho[u_t + u\cdot Du] + Dp(\rho) &= 0, \tag{1.13a}\\
\rho_t + \operatorname{div}(\rho u) &= 0, \tag{1.13b}\\
(\rho E)_t + \operatorname{div}(\rho u E + p u) &= 0, \tag{1.13c}
\end{align}
where (1.13a), (1.13b) and (1.13c) represent the respective conservation of momentum, mass, and total energy. Here, the quantity $E$ is the sum of contributions from the kinetic energy $\tfrac{1}{2}|u|^2$ and the internal energy $e$, i.e., $E = \tfrac{1}{2}|u|^2 + e$. For a single phase of compressible liquid or gas, $e$ becomes a well-defined function of $\rho$ and $p$ through the theory of thermodynamics, $e = e(\rho, p)$. Other interesting and useful physical quantities, the temperature $T(\rho, p)$ and the entropy $S(\rho, p)$, are defined through the following consequence of the second law of thermodynamics:
\[ T\,dS = de - \frac{p}{\rho^2}\,d\rho. \]
For ideal gases, the quantities $e, T, S$ have the explicit formulae:
\begin{align*}
e(\rho, p) &= \frac{p}{\rho(\gamma-1)} = \frac{T}{\gamma-1},\\
T(\rho, p) &= \frac{p}{\rho},\\
p &= e^S\rho^\gamma, \quad \gamma > 1 \text{ constant}.
\end{align*}
In regions of smoothness, one often uses velocity and a convenient choice of two additional variables among the five quantities $S, T, p, \rho, e$ as independent variables. For the Lagrangian formulation, the entropy $S$ plays an important role, as it satisfies the transport equation $S_t + (u\cdot D)S = 0$, and as such, $S\circ\eta = S_0$, where $S_0(x) = S(x,0)$ is the initial entropy function. Thus, by replacing $f$ with $e^{S\circ\eta}\rho_0^\gamma J^{-\gamma}$, our analysis for the isentropic case naturally generalizes to the 5 × 5 system of conservation laws.

2. Notation and Weighted Spaces

2.1. Differentiation and norms in the open set Ω. The reference domain $\Omega$ is defined in Sect. 1.5. Throughout the paper the symbol $D$ will be used to denote the three-dimensional gradient vector
\[ D = \Big(\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \frac{\partial}{\partial x_3}\Big). \]


For integers $k \ge 0$ and a smooth, open domain $\Omega$ of $\mathbb{R}^3$, we define the Sobolev space $H^k(\Omega)$ ($H^k(\Omega;\mathbb{R}^3)$) to be the completion of $C^\infty(\Omega)$ ($C^\infty(\Omega;\mathbb{R}^3)$) in the norm
\[ \|u\|_k := \Big( \sum_{|a|\le k} \int_\Omega \Big| \frac{\partial^{|a|}u(x)}{\partial x_1^{a_1}\partial x_2^{a_2}\partial x_3^{a_3}} \Big|^2 dx_1\,dx_2\,dx_3 \Big)^{1/2}, \]
for a multi-index $a \in \mathbb{Z}_+^3$, with the standard convention that $|a| = a_1 + a_2 + a_3$. For real numbers $s \ge 0$, the Sobolev spaces $H^s(\Omega)$ and the norms $\|\cdot\|_s$ are defined by interpolation. We will write $H^s(\Omega)$ instead of $H^s(\Omega;\mathbb{R}^3)$ for vector-valued functions. In the case that $s \ge 3$, the above definition also holds for domains $\Omega$ of class $H^s$. We will write $dx$ to denote the 3-D Lebesgue measure $dx_1\,dx_2\,dx_3$.

2.2. Tangent and normal vectors to Γ. The outward-pointing unit normal vector to $\Gamma$ is given by $N = (0, 0, 1)$. Similarly, the unit tangent vectors on $\Gamma$ are given by $T_1 = (1, 0, 0)$ and $T_2 = (0, 1, 0)$.

2.3. Einstein's summation convention. Repeated Latin indices $i, j, k$, etc., are summed from 1 to 3, and repeated Greek indices $\alpha, \beta, \gamma$, etc., are summed from 1 to 2. For example,
\[ F,_{ii} := \sum_{i=1}^{3} \frac{\partial^2 F}{\partial x_i\partial x_i}, \quad \text{and} \quad F^i,_\alpha I^{\alpha\beta} G^i,_\beta := \sum_{i=1}^{3}\sum_{\alpha=1}^{2}\sum_{\beta=1}^{2} \frac{\partial F^i}{\partial x_\alpha} I^{\alpha\beta} \frac{\partial G^i}{\partial x_\beta}. \]

2.4. Sobolev spaces on Γ. For functions $u \in H^k(\Gamma)$, $k \ge 0$, we set
\[ |u|_k := \Big( \sum_{|\alpha|\le k} \int_\Gamma \Big| \frac{\partial^{|\alpha|}u(x)}{\partial x_1^{\alpha_1}\partial x_2^{\alpha_2}} \Big|^2 dx_1\,dx_2 \Big)^{1/2}, \]
for a multi-index $\alpha \in \mathbb{Z}_+^2$. For real $s \ge 0$, the Hilbert space $H^s(\Gamma)$ and the boundary norm $|\cdot|_s$ are defined by interpolation. The negative-order Sobolev spaces $H^{-s}(\Gamma)$ are defined via duality: for real $s \ge 0$, $H^{-s}(\Gamma) := [H^s(\Gamma)]'$.

2.5. Notation for derivatives and norms. Throughout the paper, we will use the following notation:
$D$ = three-dimensional gradient vector = $\big(\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \frac{\partial}{\partial x_3}\big)$,
$\bar\partial$ = two-dimensional gradient vector or horizontal derivative = $\big(\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}\big)$,
$\|\cdot\|_s$ = $H^s(\Omega)$ interior norm,
$|\cdot|_s$ = $H^s(\Gamma)$ boundary norm.
The $k$th partial derivative of $F$ will be denoted by $F,_k = \frac{\partial F}{\partial x_k}$.


2.6. The embedding of a weighted Sobolev space. Using $d$ to denote the distance function to the boundary $\Gamma$, and letting $p = 1$ or 2, the weighted Sobolev space $H^1_{d^p}(\Omega)$, with norm given by $\int_\Omega d(x)^p(|F(x)|^2 + |DF(x)|^2)\,dx$ for any $F \in H^1_{d^p}(\Omega)$, satisfies the following embedding:
\[ H^1_{d^p}(\Omega) \hookrightarrow H^{1-\frac{p}{2}}(\Omega), \]
so that there is a constant $C > 0$ depending only on $\Omega$ and $p$ such that
\[ \|F\|_{1-p/2}^2 \le C \int_\Omega d(x)^p\big(|F(x)|^2 + |DF(x)|^2\big)\,dx. \tag{2.1} \]
See, for example, Sect. 8.8 in Kufner [9].

3. The Lagrangian Vorticity

We make use of the permutation symbol
\[ \varepsilon_{ijk} = \begin{cases} 1, & \text{even permutation of } \{1,2,3\},\\ -1, & \text{odd permutation of } \{1,2,3\},\\ 0, & \text{otherwise}, \end{cases} \]
and the basic identity regarding the $i$th component of the curl of a vector field $u$:
\[ (\operatorname{curl} u)_i = \varepsilon_{ijk}\,u^k,_j. \]
The chain rule shows that
\[ (\operatorname{curl} u(\eta))_i = (\operatorname{curl}_\eta v)_i := \varepsilon_{ijk} A_j^s v^k,_s, \]
the right-hand side defining the Lagrangian curl operator $\operatorname{curl}_\eta$. Taking the Lagrangian curl of (1.10) yields the Lagrangian vorticity equation
\[ \varepsilon_{kji} A_j^s v_t^i,_s = 0, \quad \text{or} \quad \operatorname{curl}_\eta v_t = 0. \tag{3.1} \]
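The index formulas for the curl and the Lagrangian curl can be exercised on constant gradients. The sketch below is only an illustration: the function names are hypothetical, indices run over 0, 1, 2 rather than 1, 2, 3, and the storage conventions `grad[k][j]` $= u^k,_j$ and `A[s][j]` $= A^s_j$ are assumptions made for this example.

```python
def levi_civita(i, j, k):
    """Permutation symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def curl_from_gradient(grad):
    """(curl u)_i = eps_{ijk} u^k,_j, with grad[k][j] = d u^k / d x_j."""
    return [sum(levi_civita(i, j, k) * grad[k][j]
                for j in range(3) for k in range(3))
            for i in range(3)]

def lagrangian_curl(A, grad_v):
    """(curl_eta v)_i = eps_{ijk} A^s_j v^k,_s, with A[s][j] = A^s_j."""
    return [sum(levi_civita(i, j, k) * A[s][j] * grad_v[k][s]
                for j in range(3) for k in range(3) for s in range(3))
            for i in range(3)]

# Rigid rotation u(x) = (-x2, x1, 0): its gradient is constant.
grad_u = [[0, -1, 0],
          [1, 0, 0],
          [0, 0, 0]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(curl_from_gradient(grad_u))
print(lagrangian_curl(identity, grad_u))
```

When $\eta$ is the identity map, $A$ is the identity matrix and the Lagrangian curl reduces to the ordinary curl, which for this rotation field is $(0, 0, 2)$.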

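4. Properties of the Determinant J, Cofactor Matrix a, Unit Normal n, and a Polynomial-type Inequality

4.1. Differentiating the Jacobian determinant. The following identities will be useful to us:
\begin{align}
\bar\partial J &= a_r^s\,\frac{\partial\,\bar\partial\eta^r}{\partial x_s} && \text{(horizontal differentiation)}, \tag{4.1}\\
\partial_t J &= a_r^s\,\frac{\partial v^r}{\partial x_s} && \text{(time differentiation using } v = \eta_t\text{)}. \tag{4.2}
\end{align}

4.2. Differentiating the cofactor matrix. Using (4.1) and (4.2) and the fact that $a = JA$, we find that
\begin{align}
\bar\partial a_i^k &= \frac{\partial\,\bar\partial\eta^r}{\partial x_s}\,J^{-1}\big[a_r^s a_i^k - a_i^s a_r^k\big] && \text{(horizontal differentiation)}, \tag{4.3}\\
\partial_t a_i^k &= \frac{\partial v^r}{\partial x_s}\,J^{-1}\big[a_r^s a_i^k - a_i^s a_r^k\big] && \text{(time differentiation using } v = \eta_t\text{)}. \tag{4.4}
\end{align}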
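Identities (4.1) and (4.2) are instances of Jacobi's formula for the derivative of a determinant. A minimal numerical sketch follows, with the assumptions stated loudly: `F[r][s]` stands for $\partial\eta^r/\partial x_s$, `G[r][s]` for the differentiated gradient ($\partial v^r/\partial x_s$ or $\partial\bar\partial\eta^r/\partial x_s$), the sample matrices are arbitrary, and all function names are hypothetical.

```python
def det3(F):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (F[0][0] * (F[1][1] * F[2][2] - F[1][2] * F[2][1])
            - F[0][1] * (F[1][0] * F[2][2] - F[1][2] * F[2][0])
            + F[0][2] * (F[1][0] * F[2][1] - F[1][1] * F[2][0]))

def adjugate3(F):
    """a = J * F^{-1}: entry a[s][r] is the (r, s) cofactor of F."""
    def cof(r, s):
        rows = [i for i in range(3) if i != r]
        cols = [j for j in range(3) if j != s]
        minor = (F[rows[0]][cols[0]] * F[rows[1]][cols[1]]
                 - F[rows[0]][cols[1]] * F[rows[1]][cols[0]])
        return (-1) ** (r + s) * minor
    return [[cof(r, s) for r in range(3)] for s in range(3)]

def jacobi_rhs(F, G):
    """a^s_r G^r_s, the right-hand side of (4.1)/(4.2)."""
    a = adjugate3(F)
    return sum(a[s][r] * G[r][s] for r in range(3) for s in range(3))

def jacobi_lhs(F, G, h=1e-6):
    """Central finite difference of det(F + h G) with respect to h at h = 0."""
    plus = [[F[r][s] + h * G[r][s] for s in range(3)] for r in range(3)]
    minus = [[F[r][s] - h * G[r][s] for s in range(3)] for r in range(3)]
    return (det3(plus) - det3(minus)) / (2 * h)

# Arbitrary sample data (assumed, not from the paper).
F = [[1.0, 0.2, 0.0], [0.1, 1.1, 0.3], [0.0, -0.2, 0.9]]
G = [[0.5, -1.0, 0.2], [0.3, 0.0, 0.7], [-0.4, 0.6, 0.1]]
print(jacobi_lhs(F, G), jacobi_rhs(F, G))
```

The two printed numbers agree up to the finite-difference error, which is the content of $\partial J = a_r^s\,\partial(\partial\eta^r/\partial x_s)$ for any derivative $\partial$.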


4.3. The Piola identity. It is a fact that the columns of every cofactor matrix are divergence-free and satisfy
\[ a_i^k,_k = 0. \tag{4.5} \]
The identity (4.5) will play a vital role in our energy estimates. (Note that we use the notation cofactor for what is commonly termed the adjugate matrix, or the transpose of the cofactor.)

4.4. Geometric identities. The vectors $\eta,_\alpha$ for $\alpha = 1, 2$ span the tangent plane to the surface $\Gamma(t) = \eta(t)(\Gamma)$ in $\mathbb{R}^3$, and
\[ \tau_1 := \frac{\eta,_1}{|\eta,_1|}, \quad \tau_2 := \frac{\eta,_2}{|\eta,_2|}, \quad \text{and} \quad n := \frac{\eta,_1\times\eta,_2}{|\eta,_1\times\eta,_2|} \]
are the unit tangent and normal vectors, respectively, to $\Gamma(t)$. By definition of the cofactor matrix,
\[ a_i^3 = \begin{bmatrix} \eta^2,_1\eta^3,_2 - \eta^3,_1\eta^2,_2 \\ \eta^3,_1\eta^1,_2 - \eta^1,_1\eta^3,_2 \\ \eta^1,_1\eta^2,_2 - \eta^1,_2\eta^2,_1 \end{bmatrix}. \tag{4.6} \]

4.5. A polynomial-type inequality. For a constant $M_0 \ge 0$, suppose that $f(t) \ge 0$, $t \mapsto f(t)$ is continuous, and
\[ f(t) \le M_0 + C\,t\,P(f(t)), \tag{4.7} \]
where $P$ denotes a polynomial function, and $C$ is a generic constant. Then for $t$ taken sufficiently small, we have the bound $f(t) \le 2M_0$. This type of inequality, which we introduced in [2], can be viewed as a generalization of standard nonlinear Gronwall inequalities. With $E(t)$ defined by (1.12), we will show that $\sup_{t\in[0,T]} E(t)$ satisfies the inequality (4.7).

5. Trace Estimates and the Hodge Decomposition Elliptic Estimates

The normal trace theorem states that the existence of the normal trace of a velocity field $w \in L^2(\Omega)$ relies on the regularity of $\operatorname{div} w$ (see, for example, [18]). If $\operatorname{div} w \in H^1(\Omega)'$, then $w\cdot N$, the normal trace, exists in $H^{-0.5}(\Gamma)$ so that
\[ \|w\cdot N\|_{H^{-0.5}(\Gamma)}^2 \le C\big[\|w\|_{L^2(\Omega)}^2 + \|\operatorname{div} w\|_{H^1(\Omega)'}^2\big] \tag{5.1} \]
for some constant $C$ independent of $w$. In addition to the normal trace theorem, we have the following.

Lemma 5.1. Let $w \in L^2(\Omega)$ so that $\operatorname{curl} w \in H^1(\Omega)'$, and let $T_1, T_2$ denote the unit tangent vectors on $\Gamma$, so that any vector field $u$ on $\Gamma$ can be uniquely written as $u^\alpha T_\alpha$. Then
\[ \|w\cdot T_\alpha\|_{H^{-0.5}(\Gamma)}^2 \le C\big[\|w\|_{L^2(\Omega)}^2 + \|\operatorname{curl} w\|_{H^1(\Omega)'}^2\big], \quad \alpha = 1, 2, \tag{5.2} \]
for some constant $C$ independent of $w$.

See [1] for the proof. Combining (5.1) and (5.2),
\[ \|w\|_{H^{-0.5}(\Gamma)} \le C\big[\|w\|_{L^2(\Omega)} + \|\operatorname{div} w\|_{H^1(\Omega)'} + \|\operatorname{curl} w\|_{H^1(\Omega)'}\big] \tag{5.3} \]
for some constant $C$ independent of $w$. The construction of our higher-order energy function is based on the following Hodge-type elliptic estimate:

Proposition 5.2. For an $H^r$ domain $\Omega$, $r \ge 3$, if $F \in L^2(\Omega;\mathbb{R}^3)$ with $\operatorname{curl} F \in H^{s-1}(\Omega;\mathbb{R}^3)$, $\operatorname{div} F \in H^{s-1}(\Omega)$, and $F\cdot N|_\Gamma \in H^{s-\frac{1}{2}}(\Gamma)$ for $1 \le s \le r$, then there exists a constant $\bar C > 0$ depending only on $\Omega$ such that
\begin{align}
\|F\|_s &\le \bar C\Big[\|F\|_0 + \|\operatorname{curl} F\|_{s-1} + \|\operatorname{div} F\|_{s-1} + |\bar\partial F\cdot N|_{s-\frac{3}{2}}\Big], \nonumber\\
\|F\|_s &\le \bar C\Big[\|F\|_0 + \|\operatorname{curl} F\|_{s-1} + \|\operatorname{div} F\|_{s-1} + \sum_{\alpha=1}^{2} |\bar\partial F\cdot T_\alpha|_{s-\frac{3}{2}}\Big], \tag{5.4}
\end{align}
where $N$ denotes the outward unit normal to $\Gamma$, and $T_\alpha$ are tangent vectors for $\alpha = 1, 2$.

These estimates are well known and follow from the identity $-\Delta F = \operatorname{curl}\operatorname{curl} F - D\operatorname{div} F$; a convenient reference is Taylor [17].

6. The a priori Estimates

6.1. Curl Estimates. Following Lemma 10.1 in [3], we obtain the following estimates:

Proposition 6.1. For all $t \in (0, T)$,
\[ \sum_{a=0}^{3} \|\operatorname{curl}\partial_t^{2a}\eta(t)\|_{3-a}^2 + \sum_{l=0}^{4} \|\rho_0\,\bar\partial^{4-l}\operatorname{curl}\partial_t^{2l}\eta(t)\|_0^2 \le M_0 + C\,T\,P\Big(\sup_{t\in[0,T]} E(t)\Big). \tag{6.1} \]

Proof. From (3.1), $(\operatorname{curl}_\eta v)^k_t = \varepsilon_{kji}\,\partial_t A_j^s\,v^i,_s =: B(A, Dv)$, where $B$ is quadratic in its arguments; hence,
\[ \operatorname{curl}_\eta v(t) = \operatorname{curl} u_0 + \int_0^t B(A(t'), Dv(t'))\,dt', \tag{6.2} \]

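and computing the gradient of this relation yields

$$\operatorname{curl}_\eta Dv(t) = \operatorname{curl} Du_0 - \varepsilon_{\cdot ji}\, DA^s_j\, v^i{}_{,s} + \int_0^t DB(A(t'), Dv(t'))\,dt'.$$

Applying the fundamental theorem of calculus once again shows that

$$\operatorname{curl}_\eta D\eta(t) = t\, D\operatorname{curl} u_0 + \varepsilon_{\cdot ji} \int_0^t \big[(A_t)^s_j\, D\eta^i{}_{,s} - DA^s_j\, v^i{}_{,s}\big]\,dt' + \int_0^t\!\!\int_0^{t'} DB(A(t''), Dv(t''))\,dt''\,dt',$$

and finally that

$$\operatorname{curl} D\eta(t) = t \operatorname{curl} Du_0 - \varepsilon_{\cdot ji} \int_0^t (A_t)^s_j(t')\,dt'\; D\eta^i{}_{,s} + \varepsilon_{\cdot ji} \int_0^t \big[(A_t)^s_j\, D\eta^i{}_{,s} - DA^s_j\, v^i{}_{,s}\big]\,dt' + \int_0^t\!\!\int_0^{t'} DB(A(t''), Dv(t''))\,dt''\,dt'. \tag{6.3}$$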

To obtain an estimate for $\|\operatorname{curl}\eta(t)\|^2_3$, we let $D^2$ act on (6.3). With $\partial_t A^s_j = -A^s_l\, v^l{}_{,p}\, A^p_j$ and $DA^s_j = -A^s_l\, D\eta^l{}_{,p}\, A^p_j$, we see that the first three terms on the right-hand side of (6.3) are bounded by $M_0 + C\,T\,P(\sup_{t\in[0,T]} E(t))$, where we remind the reader that $M_0 = P(E(0))$ is a polynomial function of $E$ at time $t = 0$. Since

$$DB(A, Dv) = -\varepsilon_{kji}\big[Dv^i{}_{,s}\, A^s_l\, v^l{}_{,p}\, A^p_j + v^i{}_{,s}\, A^s_l\, Dv^l{}_{,p}\, A^p_j + v^i{}_{,s}\, v^l{}_{,p}\, D(A^s_l A^p_j)\big],$$

the highest-order term arising from the action of $D^2$ on $DB(A, Dv)$ is written as

$$-\varepsilon_{kji} \int_0^t\!\!\int_0^{t'} \big[D^3 v^i{}_{,s}\, A^s_l\, v^l{}_{,p}\, A^p_j + v^i{}_{,s}\, A^s_l\, D^3 v^l{}_{,p}\, A^p_j\big]\,dt''\,dt'.$$

Both summands in the integrand scale like $D^3 v\, Dv\, A\, A$. The precise structure of this summand is not very important; rather, the derivative count is the focus. Integrating by parts in time,

$$\int_0^t\!\!\int_0^{t'} D^3 v\, Dv\, A\, A\,dt''\,dt' = -\int_0^t\!\!\int_0^{t'} D^3\eta\, (Dv\, A\, A)_t\,dt''\,dt' + \int_0^t D^3\eta\, Dv\, A\, A\,dt',$$

from which it follows that

$$\Big\|\int_0^t\!\!\int_0^{t'} D^3 B(A(t''), Dv(t''))\,dt''\,dt'\Big\|^2_0 \le C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big),$$

and hence

$$\sup_{t\in[0,T]} \|\operatorname{curl}\eta(t)\|^2_3 \le M_0 + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$
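The integration by parts in time used above can be spelled out in one line. The following is a sketch only, writing $G := Dv\, A\, A$ as shorthand (a label introduced here for illustration) and using $v = \eta_t$ together with $\eta(x, 0) = x$, so that $D^3\eta(0) = 0$.

```latex
\begin{align*}
\int_0^{t'} D^3 v\, G \,dt''
  &= \int_0^{t'} \partial_t\big(D^3\eta\big)\, G \,dt'' \\
  &= D^3\eta(t')\,G(t') - D^3\eta(0)\,G(0) - \int_0^{t'} D^3\eta\; G_t \,dt'' \\
  &= D^3\eta(t')\,G(t') - \int_0^{t'} D^3\eta\; G_t \,dt'' .
\end{align*}
% Integrating once more in t' over (0,t) gives the identity in the text:
% the boundary term becomes the single time integral of D^3\eta\, G.
```

The point of the manipulation is the derivative count: three space derivatives now fall on $\eta$ rather than on $v$, at the cost of one time derivative on the lower-order factor $G$.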

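Next, we show that

$$\|\operatorname{curl} v_t(t)\|^2_2 \le M_0 + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big). \tag{6.4}$$

From (3.1),

$$\operatorname{curl} v_t = \varepsilon_{j\cdot i} \int_0^t (A_t)^s_j(t')\,dt'\; v_t^i{}_{,s}.$$

Since $H^2(\Omega)$ is a multiplicative algebra, we can directly estimate the $H^2(\Omega)$-norm of $\operatorname{curl} v_t$ to prove that (6.4) holds. The estimates for $\operatorname{curl} v_{ttt}(t)$ in $H^1(\Omega)$ and $\operatorname{curl} \partial_t^5 v(t)$ in $L^2(\Omega)$ follow the same argument.

The weighted estimates follow from similar reasoning. We first show that

$$\|\rho_0 \bar\partial^4 \operatorname{curl}\eta(t)\|^2_0 \le M_0 + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big). \tag{6.5}$$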


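To prove this weighted estimate, we write (6.2) as

$$\operatorname{curl} v(t) = \varepsilon_{jki}\, v^i{}_{,s} \int_0^t (A_t)^s_j(t')\,dt' + \operatorname{curl} u_0 + \int_0^t B(A(t'), Dv(t'))\,dt',$$

and integrate in time to find that

$$\operatorname{curl}\eta(t) = t \operatorname{curl} u_0 + \int_0^t \varepsilon_{jki}\, v^i{}_{,s} \int_0^{t'} (A_t)^s_j(t'')\,dt''\,dt' + \int_0^t\!\!\int_0^{t'} B(A(t''), Dv(t''))\,dt''\,dt'.$$

It follows that

$$\begin{aligned} \rho_0 \bar\partial^4 \operatorname{curl}\eta(t) = {} & t\rho_0 \bar\partial^4 \operatorname{curl} u_0 + \int_0^t\!\!\int_0^{t'} \varepsilon_{kji}\, (A_t)^s_j\, \rho_0 \bar\partial^4 v^i{}_{,s}\,dt''\,dt' + \int_0^t\!\!\int_0^{t'} \varepsilon_{kji}\, \rho_0 \bar\partial^4 (A_t)^s_j\, v^i{}_{,s}\,dt''\,dt' \\ & + \int_0^t \varepsilon_{jki}\, \rho_0 \bar\partial^4 v^i{}_{,s} \int_0^{t'} (A_t)^s_j(t'')\,dt''\,dt' + \int_0^t \varepsilon_{jki}\, v^i{}_{,s} \int_0^{t'} \rho_0 \bar\partial^4 (A_t)^s_j(t'')\,dt''\,dt' + R_2, \end{aligned} \tag{6.6}$$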

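where $R_2$ denotes terms which are lower-order in the derivative count; in particular the terms with the highest derivative count in $R_2$ scale like $\rho_0 \bar\partial^3 Dv$ or $\rho_0 \bar\partial^4\eta$, and hence satisfy the inequality $\|R_2(t)\|^2_0 \le M_0 + C\,T\,P(\sup_{t\in[0,T]} E(t))$. We focus on the first integral on the right-hand side of (6.6); integrating by parts in time, we find that

$$\int_0^t\!\!\int_0^{t'} \varepsilon_{kji}\, (A_t)^s_j\, \rho_0 \bar\partial^4 v^i{}_{,s}\,dt''\,dt' = -\int_0^t\!\!\int_0^{t'} \varepsilon_{kji}\, (A_{tt})^s_j\, \rho_0 \bar\partial^4 \eta^i{}_{,s}\,dt''\,dt' + \int_0^t \varepsilon_{kji}\, (A_t)^s_j\, \rho_0 \bar\partial^4 \eta^i{}_{,s}\,dt',$$

and hence

$$\Big\|\int_0^t\!\!\int_0^{t'} \varepsilon_{kji}\, (A_t)^s_j\, \rho_0 \bar\partial^4 v^i{}_{,s}\,dt''\,dt'\Big\|^2_0 \le M_0 + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$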

The other time integrals in (6.6) can be estimated in the same fashion, which proves that (6.5) holds. The weighted estimates for the curl of vt , vttt and ∂t5 v are obtained similarly.

6.2. Energy estimates. We assume that we have smooth solutions $\eta$ on a time interval $[0, T]$, and that for all such solutions, the time $T > 0$ is taken sufficiently small so that for $t \in [0, T]$,

$$\tfrac12 \le J(t) \le \tfrac32, \qquad \|\eta(t)\|^2_{3.5} \le 2\|e\|^2_{3.5} + 1, \qquad \|\partial_t^a v(t)\|^2_{3-a/2} \le 2\|\partial_t^a v(0)\|^2_{3-a/2} + 1 \quad \text{for } a = 0, 1, \dots, 6. \tag{6.7}$$

The right-hand sides appearing in these inequalities shall be denoted by a generic constant $C$ in the estimates appearing below. Once we establish our a priori bounds, we can ensure that our solution verifies the assumptions (6.7) by means of the fundamental theorem of calculus.


6.2.1. The structure of the estimates. Due to the degeneracy of the initial density function $\rho_0$, one time derivative scales like one-half of a space derivative. The energy estimates for the time and tangential derivatives are obtained by first studying the $\bar\partial^4$-differentiated Euler equations, then the $\bar\partial^3\partial_t^2$-differentiated Euler equations, and so on, until we reach the $\bar\partial^0\partial_t^8$-differentiated Euler equations. The estimates for the normal derivatives are then found using elliptic-type estimates. The Sobolev embedding theorem requires that we use $H^4(\Omega)$ as the minimal regularity of $\eta(t)$.

6.2.2. The $\bar\partial^4$-problem.

Proposition 6.2. For $\delta > 0$ and letting the constant $M_0$ depend on $1/\delta$,

$$\sup_{t\in[0,T]} \Big[\int_\Omega \rho_0(x)\,|\bar\partial^4 v(x,t)|^2\,dx + \int_\Omega \rho_0^2(x)\,|\bar\partial^4 D\eta(x,t)|^2\,dx\Big] \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big). \tag{6.8}$$

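Proof. Letting $\bar\partial^4$ act on $\rho_0 v_t^i + a_i^k (\rho_0^2 J^{-2})_{,k} = 0$, and taking the $L^2(\Omega)$-inner product with $\bar\partial^4 v^i$, we obtain

$$\begin{aligned} \frac12 \frac{d}{dt} \int_\Omega \rho_0 |\bar\partial^4 v|^2\,dx & + \int_\Omega \bar\partial^4 a_i^k\, (\rho_0^2 J^{-2})_{,k}\, \bar\partial^4 v^i\,dx + \int_\Omega a_i^k\, (\rho_0^2\, \bar\partial^4 J^{-2})_{,k}\, \bar\partial^4 v^i\,dx \\ & + \int_\Omega a_i^k\, (\bar\partial^4 \rho_0^2\, J^{-2})_{,k}\, \bar\partial^4 v^i\,dx = \sum_{l=1}^{3} c_l \int_\Omega \bar\partial^{4-l} a_i^k\, (\rho_0^2\, \bar\partial^l J^{-2})_{,k}\, \bar\partial^4 v^i\,dx. \end{aligned} \tag{6.9}$$

Integrating the first term from $0$ to $t \in (0, T]$ produces the first term on the left-hand side of (6.8). We define the following integrals:

$$\begin{aligned} I_1 &= \int_\Omega \bar\partial^4 a_i^k\, (\rho_0^2 J^{-2})_{,k}\, \bar\partial^4 v^i\,dx, \\ I_2 &= \int_\Omega a_i^k\, (\rho_0^2\, \bar\partial^4 J^{-2})_{,k}\, \bar\partial^4 v^i\,dx, \\ I_3 &= \int_\Omega a_i^k\, (\bar\partial^4 \rho_0^2\, J^{-2})_{,k}\, \bar\partial^4 v^i\,dx, \\ R &= \sum_{l=1}^{3} c_l \int_\Omega \bar\partial^{4-l} a_i^k\, \bar\partial^l (\rho_0^2 J^{-2})_{,k}\, \bar\partial^4 v^i\,dx. \end{aligned}$$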

The last integral introduces our notation $R$ for the remainder, which throughout the paper will consist of integrals of lower-order terms which can, via elementary inequalities together with our assumptions (6.7), easily be shown to satisfy the following estimate:

$$\int_0^T R(t)\,dt \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big). \tag{6.10}$$

The sum of $\int_0^T [I_1(t) + I_2(t) + I_3(t)]\,dt$ together with the estimates for $\operatorname{curl}\eta$ given by Proposition 6.1 will provide the remaining energy contribution $\int_\Omega \rho_0^2(x)\,|\bar\partial^4 D\eta(x,t)|^2\,dx$ plus error terms which have the same bound as $R$.


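Analysis of $\int_0^T R\,dt$. Using the identity (4.5), we integrate by parts with respect to $x_k$ and then with respect to the time derivative $\partial_t$, and use (4.5) to obtain that

$$\begin{aligned} \int_0^T R\,dt &= -\sum_{l=1}^{3} c_l \int_0^T\!\!\int_\Omega \bar\partial^{4-l} a_i^k\, \bar\partial^l(\rho_0^2 J^{-2})\, \bar\partial^4 v^i{}_{,k}\,dx\,dt \\ &= \sum_{l=1}^{3} c_l \int_0^T\!\!\int_\Omega \big(\bar\partial^{4-l} a_i^k\, \bar\partial^l(\rho_0^2 J^{-2})\big)_t\, \bar\partial^4 \eta^i{}_{,k}\,dx\,dt - \sum_{l=1}^{3} c_l \int_\Omega \bar\partial^{4-l} a_i^k\, \bar\partial^l(\rho_0^2 J^{-2})\, \bar\partial^4 \eta^i{}_{,k}\,dx\,\Big|_0^T. \end{aligned}$$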

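Notice that when $l = 3$, the highest-order integrand in the spacetime integral on the right-hand side scales like

$$\big[\bar\partial D\eta\; \rho_0 \bar\partial^3 \partial_t J^{-2} + \bar\partial Dv\; \rho_0 \bar\partial^3 J^{-2}\big]\; \rho_0 \bar\partial^4 D\eta,$$

multiplied by an $L^\infty((0, T) \times \Omega)$ function. Since $\|\rho_0 \partial_t^2 J^{-2}(t)\|^2_3$ is contained in the energy function $E(t)$ and since $\bar\partial D\eta(t) \in L^\infty(\Omega)$, the first summand is estimated using an $L^\infty$-$L^2$-$L^2$ Hölder’s inequality, while for the second summand, we use that $\|\rho_0 J^{-2}(t)\|^2_4$ is contained in $E(t)$ together with an $L^4$-$L^4$-$L^2$ Hölder’s inequality.

When $l = 1$, the integrand in the spacetime integral on the right-hand side scales like

$$\big[\bar\partial D\eta\; \rho_0 \bar\partial^3 (a_t)_i^k + \bar\partial Dv\; \rho_0 \bar\partial^3 a_i^k\big]\; \rho_0 \bar\partial^4 \eta^i{}_{,k},$$

again multiplied by an $L^\infty((0, T) \times \Omega)$ function. Since $\|\rho_0 \bar\partial^3 Dv_t(t)\|^2_0$ is contained in the energy function $E(t)$ and since $\bar\partial D\eta \in L^\infty(\Omega)$, the first summand is estimated using an $L^\infty$-$L^2$-$L^2$ Hölder’s inequality. We write the second summand as

$$\bar\partial Dv\; \rho_0 \bar\partial^3 a_i^\beta\; \rho_0 \bar\partial^4 \eta^i{}_{,\beta} + \bar\partial Dv\; \rho_0 \bar\partial^3 a_i^3\; \rho_0 \bar\partial^4 \eta^i{}_{,3}.$$

We estimate

$$\begin{aligned} \int_0^T\!\!\int_\Omega \bar\partial Dv\; \rho_0 \bar\partial^3 a_i^\beta\; \rho_0 \bar\partial^4 \eta^i{}_{,\beta}\,dx\,dt &= -\int_0^T\!\!\int_\Omega \big[\bar\partial Dv\; \rho_0 \bar\partial^3 a_i^\beta{}_{,\beta}\; \rho_0 \bar\partial^4 \eta^i + \bar\partial Dv_{,\beta}\; \rho_0 \bar\partial^3 a_i^\beta\; \rho_0 \bar\partial^4 \eta^i\big]\,dx\,dt \\ &\le C \int_0^T \big[\|\bar\partial Dv(t)\|_{L^3(\Omega)}\, \|\rho_0 \bar\partial^4 a(t)\|_0\, \|\rho_0 \bar\partial^4 \eta(t)\|_{L^6(\Omega)} + \|\bar\partial^2 Dv(t)\|_{L^3(\Omega)}\, \|\rho_0 \bar\partial^4 \eta(t)\|_{L^6(\Omega)}\, \|\bar\partial^3 a\|_0\big]\,dt \\ &\le C \int_0^T \big[\|\bar\partial Dv(t)\|_{H^{0.5}(\Omega)}\, \|\rho_0 \bar\partial^4 a(t)\|_0\, \|\rho_0 \bar\partial^4 \eta(t)\|_1 + \|\bar\partial^2 Dv(t)\|_{H^{0.5}(\Omega)}\, \|\rho_0 \bar\partial^4 \eta(t)\|_1\, \|\bar\partial^3 a\|_0\big]\,dt \\ &\le C \int_0^T \big[\|v(t)\|_{H^{3.5}(\Omega)}\, \|\rho_0 \bar\partial^4 D\eta(t)\|^2_0 + \|v(t)\|_{H^{2.5}(\Omega)}\, \|\rho_0 \bar\partial^4 D\eta(t)\|_0\, \|\eta(t)\|_4 + \|v(t)\|_{H^{3.5}(\Omega)}\, \|\eta(t)\|^2_4\big]\,dt, \end{aligned} \tag{6.11}$$

where we have used Hölder’s inequality, followed by the Sobolev embeddings $H^{0.5}(\Omega) \hookrightarrow L^3(\Omega)$ and $H^1(\Omega) \hookrightarrow L^6(\Omega)$.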
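The exponents used in (6.11) can be checked by a short computation. This is only a sketch of the bookkeeping, assuming the standard fractional Sobolev embedding in dimension $n = 3$, namely $H^s(\Omega) \hookrightarrow L^p(\Omega)$ with $1/p = 1/2 - s/n$ for $0 \le s < n/2$.

```latex
\begin{align*}
% Hoelder with three factors requires the reciprocals to sum to one:
\tfrac{1}{3} + \tfrac{1}{2} + \tfrac{1}{6} &= 1
  && (L^3\text{-}L^2\text{-}L^6), \\
% Sobolev embedding exponents in dimension n = 3:
s = \tfrac12:\quad \tfrac{1}{p} = \tfrac12 - \tfrac{1/2}{3} &= \tfrac13
  && \Rightarrow\ H^{0.5}(\Omega) \hookrightarrow L^3(\Omega), \\
s = 1:\quad \tfrac{1}{p} = \tfrac12 - \tfrac{1}{3} &= \tfrac16
  && \Rightarrow\ H^{1}(\Omega) \hookrightarrow L^6(\Omega).
\end{align*}
```

The same count explains the $L^4$-$L^4$-$L^2$ choice mentioned for the case $l = 3$, since $1/4 + 1/4 + 1/2 = 1$.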


We also rely on the interpolation estimate

$$\begin{aligned} \|v\|^2_{L^2(0,T;H^{3.5}(\Omega))} &\le C\big(\|v(t)\|_3\, \|\eta\|_4\big)\Big|_0^T + C\|v_t\|_{L^2(0,T;H^3(\Omega))}\, \|\eta\|_{L^2(0,T;H^4(\Omega))} \\ &\le M_0 + \delta \sup_{t\in[0,T]} \|\eta(t)\|^2_4 + C\,T \sup_{t\in[0,T]} \big[\|\eta(t)\|^2_4 + \|v_t(t)\|^2_3\big], \end{aligned} \tag{6.12}$$

where the last inequality follows from Young’s and Jensen’s inequalities. Using this together with the Cauchy-Schwarz inequality, (6.11) is bounded by $C\,T\,P(\sup_{t\in[0,T]} E(t))$. Next, since (4.6) shows that each component of $a_i^3$ is quadratic in $\bar\partial\eta$, we see that the same analysis shows the spacetime integral of $\bar\partial Dv\; \rho_0 \bar\partial^3 a_i^3\; \rho_0 \bar\partial^4 \eta^i{}_{,3}$ has the same bound, and so we have estimated the case $l = 1$.

For the case that $l = 2$, the integrand in the spacetime integral on the right-hand side of the expression for $R$ scales like $\bar\partial^2 D\eta\; \bar\partial^2 Dv\; \rho_0 \bar\partial^4 D\eta$, so that an $L^6$-$L^3$-$L^2$ Hölder’s inequality, followed by the same analysis as for the case $l = 1$, provides the same bound as for the case $l = 1$.

To deal with the space integral on the right-hand side of the expression for $R$, the integral at time $t = 0$ is equal to zero since $\eta(x, 0) = x$, whereas the integral evaluated at $t = T$ is written, using the fundamental theorem of calculus, as

$$-\sum_{l=1}^{3} c_l \int_\Omega \rho_0\, \bar\partial^{4-l} a_i^k\, \bar\partial^l J^{-2}\; \rho_0 \bar\partial^4 \eta^i{}_{,k}\,dx\,\Big|_{t=T} = -\sum_{l=1}^{3} c_l \int_\Omega \rho_0 \int_0^T \big(\bar\partial^{4-l} a_i^k\, \bar\partial^l J^{-2}\big)_t\,dt\; \rho_0 \bar\partial^4 \eta^i{}_{,k}(T)\,dx,$$

which can be estimated in the identical fashion as the corresponding spacetime integral. As such, we have shown that $R$ has the claimed bound (6.10).

Analysis of the integral $I_1$. Because $\rho_0 = 0$ on $\Gamma = \{x_3 = 1\}$, we use the identity (4.5) to integrate by parts with respect to $x_k$ to find that

$$\begin{aligned} I_1 &= -\int_\Omega \rho_0^2 J^{-2}\, \bar\partial^4 a_i^k\, \bar\partial^4 v^i{}_{,k}\,dx + \int_{\{x_3 = 0\}} \rho_0^2 J^{-2}\, \bar\partial^4 a_i^3\, \bar\partial^4 v^i\,dx_1\,dx_2 \\ &= -\int_\Omega \rho_0^2 J^{-2}\, \bar\partial^4 a_i^k\, \bar\partial^4 v^i{}_{,k}\,dx, \end{aligned}$$

since on the fixed boundary $\{x_3 = 0\}$, $\eta^3 = x_3$ so that according to (4.6), the components $a_1^3 = 0$ and $a_2^3 = 0$ on $\{x_3 = 0\}$, and $v^3 = 0$ on $\{x_3 = 0\}$, so that $\bar\partial^4 a_i^3\, \bar\partial^4 v^i = 0$ on $\{x_3 = 0\}$. To estimate $I_1$, we use the formula (4.3) for horizontally differentiating the cofactor matrix:

$$I_1 = \int_\Omega \rho_0^2 J^{-3}\, \bar\partial^4 \eta^r{}_{,s}\, \big[a_i^s a_r^k - a_r^s a_i^k\big]\, \bar\partial^4 v^i{}_{,k}\,dx + R,$$

where the remainder $R$ satisfies (6.10). We decompose the highest-order term in $I_1$ as the sum of the following two integrals:

$$\begin{aligned} I_{1a} &= \int_\Omega \rho_0^2 J^{-3}\, (\bar\partial^4 \eta^r{}_{,s}\, a_i^s)(\bar\partial^4 v^i{}_{,k}\, a_r^k)\,dx, \\ I_{1b} &= -\int_\Omega \rho_0^2 J^{-3}\, (\bar\partial^4 \eta^r{}_{,s}\, a_r^s)(\bar\partial^4 v^i{}_{,k}\, a_i^k)\,dx. \end{aligned}$$

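Since $v = \eta_t$, $I_{1a}$ is an exact derivative modulo an antisymmetric commutation with respect to the free indices $i$ and $r$; namely,

$$\bar\partial^4 \eta^r{}_{,s}\, a_i^s\, \bar\partial^4 v^i{}_{,k}\, a_r^k = \bar\partial^4 \eta^i{}_{,s}\, a_r^s\, \bar\partial^4 v^i{}_{,k}\, a_r^k + \big(\bar\partial^4 \eta^r{}_{,s}\, a_i^s - \bar\partial^4 \eta^i{}_{,s}\, a_r^s\big)\, \bar\partial^4 v^i{}_{,k}\, a_r^k. \tag{6.13}$$

Using the notation $[D_\eta F]^i_r = a_r^s F^i{}_{,s}$ for any vector field $F$,

$$\bar\partial^4 \eta^i{}_{,s}\, a_r^s\, \bar\partial^4 v^i{}_{,k}\, a_r^k = \frac12 \frac{d}{dt} |D_\eta \bar\partial^4 \eta|^2 - \frac12\, \bar\partial^4 \eta^r{}_{,s}\, \bar\partial^4 \eta^i{}_{,k}\, (a_r^s a_i^k)_t, \tag{6.14}$$

so the first term on the right-hand side of (6.13) produces an exact derivative in time. For the second term on the right-hand side of (6.13), note the identity

$$\big(\bar\partial^4 \eta^r{}_{,s}\, a_i^s - \bar\partial^4 \eta^i{}_{,s}\, a_r^s\big)\, \bar\partial^4 v^i{}_{,k}\, a_r^k = -J^2\, \varepsilon_{ijk}\, \bar\partial^4 \eta^k{}_{,r}\, A^r_j\; \varepsilon_{imn}\, \bar\partial^4 v^n{}_{,s}\, A^s_m. \tag{6.15}$$

We have used the permutation symbol $\varepsilon$ to encode the anti-symmetry in this relation, and the basic fact that the trace of the product of symmetric and antisymmetric matrices is equal to zero. Recalling our notation $[\operatorname{curl}_\eta F]_i = \varepsilon_{ijk}\, F^k{}_{,r}\, A^r_j$, (6.15) can be written as

$$\big(\bar\partial^4 \eta^r{}_{,s}\, a_i^s - \bar\partial^4 \eta^i{}_{,s}\, a_r^s\big)\, \bar\partial^4 v^i{}_{,k}\, a_r^k = -J^2 \operatorname{curl}_\eta \bar\partial^4 \eta \cdot \operatorname{curl}_\eta \bar\partial^4 v, \tag{6.16}$$

which can also be written as an exact derivative in time:

$$\operatorname{curl}_\eta \bar\partial^4 \eta \cdot \operatorname{curl}_\eta \bar\partial^4 v = \frac12 \frac{d}{dt} |\operatorname{curl}_\eta \bar\partial^4 \eta|^2 - \bar\partial^4 \eta^k{}_{,r}\, \bar\partial^4 \eta^k{}_{,s}\, (A^r_j A^s_j)_t + \bar\partial^4 \eta^k{}_{,r}\, \bar\partial^4 \eta^j{}_{,s}\, (A^r_j A^s_k)_t. \tag{6.17}$$

The terms in (6.14) and (6.17) which are not the exact time derivatives are quadratic in $\rho_0 \bar\partial^4 D\eta$ with coefficients in $L^\infty([0, T] \times \Omega)$; denoting the integral over $\Omega$ of such terms by $Q_{\rho_0 \bar\partial^4 D\eta}$,

$$I_{1a} = \frac12 \frac{d}{dt} \int_\Omega \rho_0^2 J^{-3}\, |D_\eta \bar\partial^4 \eta|^2\,dx - \frac12 \frac{d}{dt} \int_\Omega \rho_0^2 J^{-1}\, |\operatorname{curl}_\eta \bar\partial^4 \eta|^2\,dx + Q_{\rho_0 \bar\partial^4 D\eta} + R,$$

where $\int_0^T |Q_{\rho_0 \bar\partial^4 D\eta}|\,dt \le C\,T\,P(\sup_{t\in[0,T]} E(t))$, and $R$ satisfies (6.10).

With the notation $\operatorname{div}_\eta F = A^j_i F^i{}_{,j}$, the differentiation formula (4.1) shows that $I_{1b}$ can be written as

$$I_{1b} = -\frac12 \frac{d}{dt} \int_\Omega \rho_0^2 J^{-1}\, |\operatorname{div}_\eta \bar\partial^4 \eta|^2\,dx + Q_{\rho_0 \bar\partial^4 D\eta} + R.$$

It follows that

$$\begin{aligned} I_1 &= \frac12 \frac{d}{dt} \int_\Omega \rho_0^2 \Big[J^{-3}\, |D_\eta \bar\partial^4 \eta|^2 - J^{-1}\, |\operatorname{curl}_\eta \bar\partial^4 \eta|^2 - J^{-1}\, |\operatorname{div}_\eta \bar\partial^4 \eta|^2\Big]\,dx + R \\ &= \frac12 \frac{d}{dt} \int_\Omega \rho_0^2 \Big[|D \bar\partial^4 \eta|^2 - J^{-1}\, |\operatorname{curl}_\eta \bar\partial^4 \eta|^2 - J^{-1}\, |\operatorname{div}_\eta \bar\partial^4 \eta|^2\Big]\,dx + R, \end{aligned}$$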


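where we have used the fundamental theorem of calculus for the second equality on the term $J^{-3} D_\eta \bar\partial^4 \eta$. Since $\frac12 < J(t) < \frac32$, we see that

$$\int_0^T I_1(t)\,dt = \frac12 \int_\Omega \rho_0^2 \Big[|D \bar\partial^4 \eta(T)|^2 - J^{-1}\, |\operatorname{curl}_\eta \bar\partial^4 \eta(T)|^2 - J^{-1}\, |\operatorname{div}_\eta \bar\partial^4 \eta|^2(T)\Big]\,dx - M_0 + \int_0^T R(t)\,dt. \tag{6.18}$$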

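Analysis of the integral $I_2$. Integration by parts once again, using (4.5), yields

$$I_2 = -\int_\Omega \rho_0^2\, \bar\partial^4 J^{-2}\, a_i^k\, \bar\partial^4 v^i{}_{,k}\,dx.$$

Here $\bar\partial^4 J^{-2} = -2J^{-3}\, \bar\partial^4 J$ plus lower-order terms, which have at most three horizontal derivatives acting on $J$. For such lower-order terms, we integrate by parts with respect to $\partial_t$, and estimate the resulting integrals in the same manner as we estimated the remainder term $R$, and obtain the same bound. Thus,

$$\begin{aligned} I_2 &= 2 \int_\Omega \rho_0^2 J^{-3}\, a_s^r\, \bar\partial^4 \eta^s{}_{,r}\; a_i^k\, \bar\partial^4 v^i{}_{,k}\,dx + R \\ &= \frac{d}{dt} \int_\Omega \rho_0^2 J^{-3}\, a_s^r\, \bar\partial^4 \eta^s{}_{,r}\; a_i^k\, \bar\partial^4 \eta^i{}_{,k}\,dx - \int_\Omega \rho_0^2\, (J^{-3} a_s^r a_i^k)_t\; \bar\partial^4 \eta^s{}_{,r}\, \bar\partial^4 \eta^i{}_{,k}\,dx + R. \end{aligned}$$

Given our identities for differentiating $a$ and $J$, the Sobolev embedding theorem together with our assumptions (6.7) and the Cauchy-Schwarz inequality show that

$$\int_0^T\!\!\int_\Omega \rho_0^2\, (J^{-3} a_s^r a_i^k)_t\; \bar\partial^4 \eta^s{}_{,r}\, \bar\partial^4 \eta^i{}_{,k}\,dx\,dt \le C\,T \sup_{t\in[0,T]} E(t);$$

consequently, we can write

$$\int_\Omega \rho_0^2 J^{-3}\, a_s^r\, \bar\partial^4 \eta^s{}_{,r}\; a_i^k\, \bar\partial^4 \eta^i{}_{,k}\,dx - M_0 = \int_0^t [I_2(t') + R(t')]\,dt'. \tag{6.19}$$

On the other hand,

$$\begin{aligned} \int_\Omega \rho_0^2 J^{-3}\, a_s^r\, \bar\partial^4 \eta^s{}_{,r}\; a_i^k\, \bar\partial^4 \eta^i{}_{,k}\,dx &= \int_\Omega \rho_0^2 J^{-3} \Big[\bar\partial^4 \operatorname{div}\eta + \bar\partial^4 \eta^s{}_{,r} \int_0^t (a_t)^r_s\,dt'\Big] \Big[\bar\partial^4 \operatorname{div}\eta + \bar\partial^4 \eta^i{}_{,k} \int_0^t (a_t)^k_i\,dt'\Big]\,dx \\ &= \int_\Omega \rho_0^2 J^{-3}\, |\bar\partial^4 \operatorname{div}\eta|^2\,dx + 2 \int_\Omega \rho_0^2 J^{-3}\, \bar\partial^4 \operatorname{div}\eta\; \bar\partial^4 \eta^s{}_{,r} \int_0^t (a_t)^r_s\,dt'\,dx \\ &\quad + \int_\Omega \rho_0^2 J^{-3}\, \bar\partial^4 \eta^s{}_{,r} \int_0^t (a_t)^r_s\,dt'\; \bar\partial^4 \eta^i{}_{,k} \int_0^t (a_t)^k_i\,dt'\,dx. \end{aligned} \tag{6.20}$$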

Yet another application of the Sobolev embedding theorem together with our assumptions (6.7) and the Cauchy-Schwarz inequality shows that the second and third integrals on the right-hand side are bounded by $M_0 + C\,T \sup_{t\in[0,T]} E(t)$, so that combining (6.19) and (6.20), we find that

$$\int_0^T I_2(t)\,dt = \int_\Omega \rho_0^2 J^{-1}\, |\bar\partial^4 \operatorname{div}\eta|^2\,dx - M_0 + \int_0^T R(t)\,dt. \tag{6.21}$$

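Analysis of the integral $I_3$. Integration by parts using (4.5) shows that

$$\begin{aligned} \int_0^T I_3(t)\,dt &= -\int_0^T\!\!\int_\Omega \bar\partial^4 \rho_0^2\; J^{-2} a_i^k\, \bar\partial^4 v^i{}_{,k}\,dx\,dt \\ &= \int_0^T\!\!\int_\Omega \bar\partial^4 \rho_0^2\; (J^{-2} a_i^k)_t\, \bar\partial^4 \eta^i{}_{,k}\,dx\,dt - \int_\Omega \bar\partial^4 \rho_0^2\; J^{-2} a_i^k\, \bar\partial^4 \eta^i{}_{,k}\,dx\,\Big|_{t=T} \\ &= \int_0^T\!\!\int_\Omega \bar\partial^4 \rho_0^2\; (J^{-2} a_i^k)_t\, \bar\partial^4 \eta^i{}_{,k}\,dx\,dt - \int_\Omega \bar\partial^4 \rho_0^2\; \bar\partial^4 \operatorname{div}\eta(T)\,dx - \int_\Omega \bar\partial^4 \rho_0^2 \int_0^T (J^{-2} a_i^k)_t\,dt\; \bar\partial^4 \eta^i{}_{,k}(T)\,dx, \end{aligned}$$

so that by the Cauchy-Schwarz inequality and Young’s inequality,

$$\int_0^T I_3(t)\,dt \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big). \tag{6.22}$$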

Summing inequalities. We integrate (6.9) from $0$ to $T$, and sum (6.10), (6.18), (6.21), and (6.22) to find that

$$\sup_{t\in[0,T]} \frac12 \Big[\int_\Omega \rho_0^2\, |\bar\partial^4 D\eta|^2\,dx + \int_\Omega \rho_0^2 J^{-1}\, |\bar\partial^4 \operatorname{div}\eta|^2\,dx - \int_\Omega \rho_0^2 J^{-1}\, |\bar\partial^4 \operatorname{curl}\eta|^2\,dx\Big] \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

Note the factor of $\frac12$ in (6.18) which, in conjunction with (6.21), gives the positivity of the divergence term. Adding to this the inequality (6.1), and possibly readjusting our constants, we obtain the desired result, and complete the proof of the proposition.

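With the unit tangent vectors $T_1 = (1, 0, 0)$ and $T_2 = (0, 1, 0)$, $\eta \cdot T_\alpha = \eta^\alpha$ for $\alpha = 1, 2$, and we have the following

Corollary 6.3. For $\alpha = 1, 2$,

$$\sup_{t\in[0,T]} |\eta^\alpha|^2_{3.5} \le M_0 + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

Proof. The weighted embedding estimate (2.1) shows that

$$\|\bar\partial^4 \eta\|^2_0 \le C \int_\Omega \rho_0^2 \big(|\bar\partial^4 \eta|^2 + |\bar\partial^4 D\eta|^2\big)\,dx.$$

Now

$$\sup_{t\in[0,T]} \int_\Omega \rho_0^2\, |\bar\partial^4 \eta|^2\,dx = \sup_{t\in[0,T]} \int_\Omega \rho_0^2\, \Big|\int_0^t \bar\partial^4 v\,dt'\Big|^2\,dx \le T^2 \sup_{t\in[0,T]} \|\sqrt{\rho_0}\, \bar\partial^4 v\|^2_0.$$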


It follows from Proposition 6.2 that

$$\sup_{t\in[0,T]} \|\bar\partial^4 \eta\|^2_0 \le M_0 + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

According to our curl estimates (6.1), $\sup_{t\in[0,T]} \|\operatorname{curl}\eta\|^2_3 \le M_0 + C\,T\,P(\sup_{t\in[0,T]} E(t))$, from which it follows that

$$\sup_{t\in[0,T]} \|\bar\partial^4 \operatorname{curl}\eta\|^2_{H^1(\Omega)'} \le M_0 + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big),$$

since $\bar\partial$ is a horizontal derivative, and integration by parts with respect to $\bar\partial$ does not produce any boundary contributions. From the tangential trace inequality (5.2), we find that

$$\sup_{t\in[0,T]} |\bar\partial^4 \eta^\alpha|^2_{-1/2} \le M_0 + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big),$$

from which the assertion of the corollary follows.

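6.3. The $\partial_t^8$-problem.

Proposition 6.4. For $\delta > 0$ and letting the constant $M_0$ depend on $1/\delta$,

$$\sup_{t\in[0,T]} \Big[\int_\Omega \rho_0\, |\partial_t^8 v(x,t)|^2\,dx + \int_\Omega \rho_0^2\, |\partial_t^7 Dv(x,t)|^2\,dx\Big] \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big). \tag{6.23}$$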

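Proof. Letting $\partial_t^8$ act on $\rho_0 v_t^i + a_i^k (\rho_0^2 J^{-2})_{,k} = 0$, and taking the $L^2(\Omega)$-inner product with $\partial_t^8 v^i$, we obtain

$$\frac12 \frac{d}{dt} \int_\Omega \rho_0\, |\partial_t^8 v|^2\,dx + \int_\Omega \partial_t^8 a_i^k\, (\rho_0^2 J^{-2})_{,k}\, \partial_t^8 v^i\,dx + \int_\Omega a_i^k\, (\rho_0^2\, \partial_t^8 J^{-2})_{,k}\, \partial_t^8 v^i\,dx = \sum_{l=1}^{7} c_l \int_\Omega \partial_t^{8-l} a_i^k\, (\rho_0^2\, \partial_t^l J^{-2})_{,k}\, \partial_t^8 v^i\,dx. \tag{6.24}$$

Integrating the first term from $0$ to $t \in (0, T]$ produces the first term on the left-hand side of (6.23). We define the following three integrals:

$$\begin{aligned} I_1 &= \int_\Omega \partial_t^8 a_i^k\, (\rho_0^2 J^{-2})_{,k}\, \partial_t^8 v^i\,dx, \\ I_2 &= \int_\Omega a_i^k\, (\rho_0^2\, \partial_t^8 J^{-2})_{,k}\, \partial_t^8 v^i\,dx, \\ R &= \sum_{l=1}^{7} c_l \int_\Omega \partial_t^{8-l} a_i^k\, (\rho_0^2\, \partial_t^l J^{-2})_{,k}\, \partial_t^8 v^i\,dx. \end{aligned}$$

The sum of $\int_0^T [I_1(t) + I_2(t)]\,dt$ together with the curl estimates given by Proposition 6.1 will provide the remaining energy contribution $\int_\Omega \rho_0^2\, |\partial_t^7 Dv(x,t)|^2\,dx$ plus error terms which have the same bound as $R$, namely (6.10).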


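Analysis of $\int_0^T R\,dt$. We use the identity (4.5) to integrate by parts with respect to $x_k$ and then with respect to the time derivative $\partial_t$ to obtain that

$$\begin{aligned} \int_0^T R\,dt &= -\sum_{l=1}^{7} c_l \int_0^T\!\!\int_\Omega \partial_t^{8-l} a_i^k\; \rho_0^2\, \partial_t^l J^{-2}\; \partial_t^8 v^i{}_{,k}\,dx\,dt \\ &= \sum_{l=1}^{7} c_l \int_0^T\!\!\int_\Omega \rho_0\, \big(\partial_t^{8-l} a_i^k\, \partial_t^l J^{-2}\big)_t\; \rho_0\, \partial_t^7 v^i{}_{,k}\,dx\,dt - \sum_{l=1}^{7} c_l \int_\Omega \rho_0\, \partial_t^{8-l} a_i^k\, \partial_t^l J^{-2}\; \rho_0\, \partial_t^7 v^i{}_{,k}\,dx\,\Big|_0^T. \end{aligned}$$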

Notice that when $l = 7$, the integrand in the spacetime integral on the right-hand side scales like

$$\big[Dv_t\; \rho_0 \partial_t^6 Dv + Dv\; \rho_0 \partial_t^7 Dv\big]\; \rho_0 \partial_t^7 Dv,$$

multiplied by an $L^\infty(\Omega)$ function. Since $\|\rho_0 \partial_t^7 Dv(t)\|^2_0$ is contained in the energy function $E(t)$, $Dv_t(t)$ is bounded in $L^\infty(\Omega)$, and since we can write $\rho_0 \partial_t^6 Dv(t) = \rho_0 \partial_t^6 Dv(0) + \int_0^t \rho_0 \partial_t^7 Dv(t')\,dt'$, the first and second summands are both estimated using an $L^\infty$-$L^2$-$L^2$ Hölder’s inequality.

The case $l = 6$ is estimated exactly the same way as the case $l = 3$ in the proof of Proposition 6.2. For the case $l = 5$, the integrand in the spacetime integral scales like

$$\big[Dv_{tt}\; \rho_0 \partial_t^6 J^{-2} + Dv_{ttt}\; \rho_0 Dv_{tttt}\big]\; \rho_0 \partial_t^7 Dv.$$

Both summands can be estimated using an $L^3$-$L^6$-$L^2$ Hölder’s inequality. The cases $l = 4$ and $l = 3$ are treated in the same way as the case $l = 5$. The case $l = 2$ is estimated exactly the same way as the case $l = 1$ in the proof of Proposition 6.2. The case $l = 1$ is treated in the same way as the case $l = 7$.

To deal with the space integral on the right-hand side of the expression for $R$, the integral at time $t = 0$ is bounded by $M_0$, whereas the integral evaluated at $t = T$ is written, using the fundamental theorem of calculus, as

$$\begin{aligned} \sum_{l=1}^{7} c_l \int_\Omega \rho_0\, \partial_t^{8-l} a_i^k\, \partial_t^l J^{-2}\; \rho_0\, \partial_t^7 v^i{}_{,k}\,dx\,\Big|_{t=T} = {} & \sum_{l=1}^{7} c_l \int_\Omega \rho_0\, \partial_t^{8-l} a_i^k(0)\, \partial_t^l J^{-2}(0)\; \rho_0\, \partial_t^7 v^i{}_{,k}(T)\,dx \\ & + \sum_{l=1}^{7} c_l \int_\Omega \rho_0 \int_0^T \big(\partial_t^{8-l} a_i^k\, \partial_t^l J^{-2}\big)_t\,dt\; \rho_0\, \partial_t^7 v^i{}_{,k}(T)\,dx. \end{aligned}$$

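The first integral on the right-hand side is estimated using Young’s inequality, and is bounded by $M_0 + \delta \sup_{t\in[0,T]} E(t)$, while the second integral can be estimated in the identical fashion as the corresponding spacetime integral. As such, we have shown that $R$ has the claimed bound (6.10).

Analysis of the integral $I_1$. As to the term $I_1$, using the identity (4.4), the same computation as for the $\bar\partial^4$-differentiated problem shows that

$$\rho_0^2\, (\partial_t^7 v^r{}_{,s}\, A^s_i)(\partial_t^8 v^i{}_{,k}\, A^k_r) = \frac12 \frac{d}{dt} |\rho_0 D_\eta \partial_t^7 v(t)|^2 - \frac12 \frac{d}{dt} |\rho_0 \operatorname{curl}_\eta \partial_t^7 v(t)|^2 + \frac12\, \rho_0^2\, \partial_t^7 v^k{}_{,r}\, \partial_t^7 v^b{}_{,s}\, (A^r_j A^s_m)_t\, \big[\delta^j_m \delta^k_b - \delta^j_b \delta^k_m\big],$$

and

$$-\rho_0^2\, (\partial_t^7 v^r{}_{,s}\, A^s_r)(\partial_t^8 v^i{}_{,k}\, A^k_i) = -\frac12 \frac{d}{dt} |\rho_0 \operatorname{div}_\eta \partial_t^7 v|^2 + \frac12\, \rho_0^2\, \partial_t^7 v^r{}_{,s}\, \partial_t^7 v^i{}_{,k}\, (A^s_r A^k_i)_t,$$

and hence

$$I_1 = \frac12 \frac{d}{dt} \int_\Omega \rho_0^2 \Big[J^{-3}\, |D_\eta \partial_t^7 v|^2 - J^{-1}\, |\operatorname{curl}_\eta \partial_t^7 v|^2 - J^{-1}\, |\operatorname{div}_\eta \partial_t^7 v|^2\Big]\,dx + R.$$

It follows that

$$\int_0^T I_1(t)\,dt = \frac12 \int_\Omega \rho_0^2 \Big[|D \partial_t^7 v(T)|^2 - J^{-1}\, |\operatorname{curl}_\eta \partial_t^7 v(T)|^2 - J^{-1}\, |\operatorname{div}_\eta \partial_t^7 v(T)|^2\Big]\,dx - M_0 + \int_0^T R(t)\,dt. \tag{6.25}$$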

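Analysis of the integral $I_2$. Integration by parts, using (4.5), once again yields

$$I_2 = -\int_\Omega \rho_0^2\, \partial_t^8 J^{-2}\, a_i^k\, \partial_t^8 v^i{}_{,k}\,dx.$$

Here $\partial_t^8 J^{-2} = -2J^{-3}\, \partial_t^8 J$ plus lower-order terms, which have at most seven time derivatives on $J$, and can be estimated in the same fashion as the remainder term $R$ above. We see that

$$\begin{aligned} I_2 &= 2 \int_\Omega \rho_0^2 J^{-3}\, a_s^r\, \partial_t^7 v^s{}_{,r}\; a_i^k\, \partial_t^8 v^i{}_{,k}\,dx + R \\ &= \frac{d}{dt} \int_\Omega \rho_0^2 J^{-3}\, a_s^r\, \partial_t^7 v^s{}_{,r}\; a_i^k\, \partial_t^7 v^i{}_{,k}\,dx - \int_\Omega \rho_0^2\, (J^{-3} a_s^r a_i^k)_t\; \partial_t^7 v^s{}_{,r}\, \partial_t^7 v^i{}_{,k}\,dx + R. \end{aligned}$$

Following our analysis of the term $I_2$ in the $\bar\partial^4$-problem, we see that

$$\int_\Omega \rho_0^2 J^{-3}\, a_s^r\, \partial_t^7 v^s{}_{,r}\; a_i^k\, \partial_t^7 v^i{}_{,k}\,dx = M_0 + \int_0^T [I_2(t) + R(t)]\,dt. \tag{6.26}$$

On the other hand,

$$\begin{aligned} \int_\Omega \rho_0^2 J^{-3}\, a_s^r\, \partial_t^7 v^s{}_{,r}\; a_i^k\, \partial_t^7 v^i{}_{,k}\,dx &= \int_\Omega \rho_0^2 J^{-2} \Big[\partial_t^7 \operatorname{div} v + \partial_t^7 v^s{}_{,r} \int_0^t (a_t)^r_s\,dt'\Big] \Big[\partial_t^7 \operatorname{div} v + \partial_t^7 v^i{}_{,k} \int_0^t (a_t)^k_i\,dt'\Big]\,dx \\ &= \int_\Omega \rho_0^2 J^{-2}\, |\partial_t^7 \operatorname{div} v|^2\,dx + 2 \int_\Omega \rho_0^2 J^{-2}\, \partial_t^7 \operatorname{div} v\; \partial_t^7 v^s{}_{,r} \int_0^t (a_t)^r_s\,dt'\,dx \\ &\quad + \int_\Omega \rho_0^2 J^{-2}\, \partial_t^7 v^s{}_{,r} \int_0^t (a_t)^r_s\,dt'\; \partial_t^7 v^i{}_{,k} \int_0^t (a_t)^k_i\,dt'\,dx. \end{aligned} \tag{6.27}$$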

Yet another application of the Sobolev embedding theorem together with our assumptions (6.7) and the Cauchy-Schwarz inequality shows that the second and third integrals on the right-hand side are bounded by $M_0 + C \sup_{t\in[0,T]} E(t)$, so that summing (6.26) and (6.27) shows that

$$\int_0^T I_2(t)\,dt = \int_\Omega \rho_0^2 J^{-2}\, |\partial_t^7 \operatorname{div} v(T)|^2\,dx - M_0 + \int_0^T R(t)\,dt. \tag{6.28}$$

Summing inequalities. We integrate (6.24) from $0$ to $T$, and sum (6.10), (6.25), and (6.28) to find that

$$\sup_{t\in[0,T]} \frac12 \Big[\int_\Omega \rho_0^2 J^{-2}\, |\partial_t^7 Dv|^2\,dx + \int_\Omega \rho_0^2 J^{-2}\, |\partial_t^7 \operatorname{div} v|^2\,dx - \int_\Omega \rho_0^2 J^{-2}\, |\partial_t^7 \operatorname{curl} v|^2\,dx\Big] \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

Adding the curl estimate (6.1) and readjusting our constants, we obtain the desired result, and complete the proof of the proposition.

6.4. The $\partial_t^2 \bar\partial^3$, $\partial_t^4 \bar\partial^2$, and $\partial_t^6 \bar\partial$ problems. We have provided detailed proofs of the energy estimates for the two end-point cases: all space derivatives (the $\bar\partial^4$ problem) and all time derivatives (the $\partial_t^8$ problem). These proofs cover all of the estimation strategies for all possible error terms in the three remaining intermediate problems; meanwhile, the energy contributions for the three intermediate problems are found in the identical fashion as for the $\bar\partial^4$ and $\partial_t^8$ problems. As such, we have the additional estimate

Proposition 6.5. For $\delta > 0$ and letting the constant $M_0$ depend on $1/\delta$, for $\alpha = 1, 2$,

$$\sup_{t\in[0,T]} \sum_{a=1}^{3} \Big[|\partial_t^{2a} \eta^\alpha(t)|^2_{3.5-a} + \|\sqrt{\rho_0}\, \bar\partial^{4-a} \partial_t^{2a} v(t)\|^2_0 + \|\rho_0\, \bar\partial^{4-a} \partial_t^{2a} D\eta(t)\|^2_0\Big] \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

6.5. Additional elliptic-type estimates for normal derivatives. Our energy estimates provide a priori control of horizontal and time derivatives of $\eta$; it remains to gain a priori control of the normal (or vertical) derivatives of $\eta$. This is accomplished via a bootstrapping procedure relying on having $\partial_t^7 v(t)$ bounded in $L^2(\Omega)$.

Proposition 6.6. For $t \in [0, T]$, $\partial_t^5 v(t) \in H^1(\Omega)$, $\rho_0 \partial_t^6 J^{-2}(t) \in H^1(\Omega)$ and

$$\sup_{t\in[0,T]} \Big[\|\partial_t^5 v(t)\|^2_1 + \|\rho_0 \partial_t^6 J^{-2}(t)\|^2_1\Big] \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

Proof. We will first assume that $\rho_0(x_3) = 1 - x_3$, and after establishing our estimates for this particular choice of $\rho_0$, we will explain the minor modifications required for the general case of $0 < \rho_0 \in H^4(\Omega)$ satisfying (1.5). We write (1.9a) as $v_t^i + 2A_i^k (\rho_0 J^{-1})_{,k} = 0$, which we rewrite as

$$v_t^i + \rho_0\, a_i^k\, J^{-2}{}_{,k} - 2a_i^3 J^{-2} = 0. \tag{6.29}$$

We have used the fact that $\rho_{0,\beta} = 0$ for $\beta = 1, 2$, and $\rho_{0,3} = -1$.
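For orientation, here is a brief sketch of how (6.29) can be recovered from the form $\rho_0 v_t^i + a_i^k (\rho_0^2 J^{-2})_{,k} = 0$ of the momentum equation used in the energy estimates. The only assumptions are the product rule and division by $\rho_0 > 0$ in the interior of $\Omega$.

```latex
\begin{align*}
(\rho_0^2 J^{-2})_{,k}
  &= \rho_0^2\, J^{-2}{}_{,k} + 2\rho_0\,\rho_{0,k}\, J^{-2}, \\
% divide the momentum equation by \rho_0 > 0:
0 &= v_t^i + \rho_0\, a_i^k\, J^{-2}{}_{,k} + 2\rho_{0,k}\, a_i^k\, J^{-2}, \\
% special case \rho_0 = 1 - x_3, so \rho_{0,\beta} = 0 and \rho_{0,3} = -1:
0 &= v_t^i + \rho_0\, a_i^k\, J^{-2}{}_{,k} - 2\, a_i^3\, J^{-2}.
\end{align*}
```

For a general $\rho_0$ the middle line keeps both the $\rho_{0,3}$ and the $\rho_{0,\beta}$ contributions, which is the origin of the extra horizontal term in the general case.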


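Letting $\partial_t^6$ act on Eq. (6.29), we have that

$$\rho_0\, a_i^3\, \partial_t^6 J^{-2}{}_{,3} - 2a_i^3\, \partial_t^6 J^{-2} = -\partial_t^7 v^i - \rho_0\, \partial_t^6 (a_i^\beta J^{-2}{}_{,\beta}) - (\partial_t^6 a_i^3)\big[-2J^{-2} + \rho_0 J^{-2}{}_{,3}\big] + \sum_{a=1}^{5} c_a\, \partial_t^a a_i^3\, \partial_t^{6-a}\big[-2J^{-2} + \rho_0 J^{-2}{}_{,3}\big].$$

According to Propositions 6.4 and 6.5,

$$\sup_{t\in[0,T]} \Big[\|\partial_t^7 v(t)\|^2_0 + \|\rho_0\, \bar\partial D \partial_t^5 v(t)\|^2_0\Big] \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big),$$

and since (4.6) shows that $a_i^3$ is quadratic in $\bar\partial\eta$, we see that for all $t \in [0, T]$,

$$\big\|\big[\rho_0\, a_i^3\, \partial_t^6 J^{-2}{}_{,3} - 2a_i^3\, \partial_t^6 J^{-2}\big](t)\big\|^2_0 \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

It follows that

$$\big\|\rho_0\, |a_\cdot^3|\, \partial_t^6 J^{-2}{}_{,3}(t)\big\|^2_0 + 4\big\||a_\cdot^3|\, \partial_t^6 J^{-2}(t)\big\|^2_0 - 4 \int_\Omega \rho_0\, |a_\cdot^3|^2\, \partial_t^6 J^{-2}\, \partial_t^6 J^{-2}{}_{,3}\,dx \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

We assume that our solution is sufficiently smooth so that $\rho_0 [(\partial_t^6 J^{-2})^2]_{,3}$ is well-defined and integrable. As such, we write$^1$

$$-4 \int_\Omega \rho_0\, |a_\cdot^3|^2\, \partial_t^6 J^{-2}\, \partial_t^6 J^{-2}{}_{,3}\,dx = -2\big\||a_\cdot^3|\, \partial_t^6 J^{-2}(t)\big\|^2_0 + 2 \int_\Omega \rho_0\, (|a_\cdot^3|^2)_{,3}\, (\partial_t^6 J^{-2})^2\,dx + 4 \int_{\{x_3 = 0\}} |\partial_t^6 J^{-2}|^2\,dx_1\,dx_2,$$

so that together with our previous inequality,

$$\|\rho_0\, \partial_t^6 J^{-2}{}_{,3}(t)\|^2_0 + \|\partial_t^6 J^{-2}(t)\|^2_0 \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big) + C \int_\Omega \rho_0\, |\partial_t^6 J^{-2}|^2\,dx.$$

Since $\rho_0\, \bar\partial \partial_t^6 J^{-2}(t)$ is already estimated by Proposition 6.5, then

$$\|\rho_0\, \partial_t^6 J^{-2}(t)\|^2_1 + \|\partial_t^6 J^{-2}(t)\|^2_0 \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big) + C \int_\Omega \rho_0\, |\partial_t^6 J^{-2}|^2\,dx.$$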

1 Jang & Masmoudi [7] have counterexamples to the obtained inequality when J −2 is not sufficiently smooth. It is important that the function J −2 has greater regularity than the desired a priori estimate indicates, and in particular, as we noted, ρ0 [(∂t6 J −2 )2 ],3 must be well-defined and integrable.


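We use Young’s inequality and the fundamental theorem of calculus (with respect to $t$) for the last integral to find that for $\delta > 0$,

$$\begin{aligned} C \int_\Omega \rho_0\, \partial_t^6 J^{-2}\, \partial_t^6 J^{-2}\,dx &\le \delta \big\|\partial_t^6 J^{-2}(t)\big\|^2_0 + C_\delta \big\|\rho_0\, \partial_t^5 Dv(t)\big\|^2_0 + M_0 + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big) \\ &\le \delta \big\|\partial_t^6 J^{-2}(t)\big\|^2_0 + M_0 + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big), \end{aligned}$$

where we have used the fact that $\|\rho_0\, \partial_t^7 Dv(t)\|^2_0$ is contained in the energy function $E(t)$. By once again readjusting the constants, we see that on $[0, T]$,

$$\|\rho_0\, \partial_t^6 J^{-2}(t)\|^2_1 + \big\|\partial_t^6 J^{-2}(t)\big\|^2_0 \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big). \tag{6.30}$$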

With $J_t = a_i^j\, v^i{}_{,j}$, we see that

$$a_i^j\, \partial_t^5 v^i{}_{,j} = \partial_t^6 J - v^i{}_{,j}\, \partial_t^5 a_i^j - \sum_{a=1}^{4} c_a\, \partial_t^a a_i^j\, \partial_t^{5-a} v^i{}_{,j},$$

so that using (6.30), together with the fundamental theorem of calculus to estimate the last two terms on the right-hand side, we see that

$$\big\|a_i^j\, \partial_t^5 v^i{}_{,j}(t)\big\|^2_0 \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big),$$

from which it follows that

$$\big\|\operatorname{div} \partial_t^5 v(t)\big\|^2_0 \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

According to Proposition 6.1, $\|\operatorname{curl} \partial_t^5 v(t)\|^2_0 \le M_0 + C\,T\,P(\sup_{t\in[0,T]} E(t))$ and with the bound on $\partial_t^5 v^\alpha$ given by Proposition 6.5, Proposition 5.2 provides the estimate

$$\big\|\partial_t^5 v(t)\big\|^2_1 \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

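More generally, for any $0 < \rho_0 \in H^4(\Omega)$ satisfying (1.5), Eq. (6.29) takes the form, for $\beta = 1, 2$,

$$v_t^i + \rho_0\, a_i^k\, J^{-2}{}_{,k} + 2\rho_{0,3}\, a_i^3\, J^{-2} + 2\rho_{0,\beta}\, a_i^\beta\, J^{-2} = 0.$$

Letting $\partial_t^6$ act on this equation yields

$$\begin{aligned} \rho_0\, a_i^3\, \partial_t^6 J^{-2}{}_{,3} + 2\rho_{0,3}\, a_i^3\, \partial_t^6 J^{-2} = {} & -\partial_t^7 v^i - \rho_0\, \partial_t^6 (a_i^\beta J^{-2}{}_{,\beta}) - 2\rho_{0,\beta}\, \partial_t^6 (a_i^\beta J^{-2}) \\ & - (\partial_t^6 a_i^3)\big[\rho_0 J^{-2}{}_{,3} + 2\rho_{0,3} J^{-2}\big] + \sum_{a=1}^{5} c_a\, \partial_t^a a_i^3\, \partial_t^{6-a}\big[\rho_0 J^{-2}{}_{,3} + 2\rho_{0,3} J^{-2}\big]. \end{aligned} \tag{6.31}$$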


As such, if the $L^2(\Omega)$-norm of the right-hand side of (6.31) is bounded by $M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P(\sup_{t\in[0,T]} E(t))$, then the identical argument detailed above would lead to the inequality

$$\|\rho_0\, \partial_t^6 J^{-2}(t)\|^2_1 + \big\||\rho_{0,3}|\, \partial_t^6 J^{-2}(t)\big\|^2_0 \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big). \tag{6.32}$$

By the physical vacuum condition (1.5), for $\epsilon > 0$ taken sufficiently small, there are constants $\theta_1, \theta_2 > 0$ such that $|\rho_{0,3}(x)| \ge \theta_1$ whenever $1 - \epsilon \le x_3 \le 1$, and $\rho_0(x) > \theta_2$ whenever $0 \le x_3 \le 1 - \epsilon$; hence, by readjusting the constants, we obtain the inequality (6.30).

As for the $L^2(\Omega)$-bound for the right-hand side of (6.31), the only new type of term that the general function $\rho_0$ produces is $-2\rho_{0,\beta}\, \partial_t^6 (a_i^\beta J^{-2})$. On the other hand, given that $\rho_0 \in H^4(\Omega)$ and that $\rho_0 = 0$ on $\Gamma$, the Sobolev embedding theorem shows that for $\beta = 1, 2$,

$$\|\rho_{0,\beta}/\rho_0\|_{L^\infty(\Omega)} \le C\|\rho_{0,\beta}/\rho_0\|_2 \le \|\rho_0\|_4 < C,$$

so that $|\rho_{0,\beta}(x)| \le C\rho_0(x)$. This shows that

$$\big\|2\rho_{0,\beta}\, \partial_t^6 (a_i^\beta J^{-2})\big\|^2_0 \le C\big\|2\rho_0\, \partial_t^6 ([a_i^1 + a_i^2] J^{-2})\big\|^2_0 \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

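Having a good bound for $\partial_t^5 v(t)$ in $H^1(\Omega)$ we proceed with our bootstrapping.

Proposition 6.7. For $t \in [0, T]$, $v_{ttt}(t) \in H^2(\Omega)$, $\rho_0 \partial_t^4 J^{-2}(t) \in H^2(\Omega)$ and

$$\sup_{t\in[0,T]} \Big[\|v_{ttt}(t)\|^2_2 + \|\rho_0\, \partial_t^4 J^{-2}(t)\|^2_2\Big] \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

Proof. We let $\partial_t^4$ act on Eq. (6.29), and using the argument just given above, it suffices to consider the case that $\rho_0 = 1 - x_3$. It follows that

$$\rho_0\, a_i^3\, \partial_t^4 J^{-2}{}_{,3} - 2a_i^3\, \partial_t^4 J^{-2} = -\partial_t^5 v^i - \rho_0\, \partial_t^4 (a_i^\beta J^{-2}{}_{,\beta}) - (\partial_t^4 a_i^3)\big[-2J^{-2} + \rho_0 J^{-2}{}_{,3}\big] + \sum_{a=1}^{3} c_a\, \partial_t^a a_i^3\, \partial_t^{4-a}\big[-2J^{-2} + \rho_0 J^{-2}{}_{,3}\big]. \tag{6.33}$$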

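In order to estimate $\partial_t^4 J^{-2}(t)$ in $H^1(\Omega)$, we first estimate horizontal derivatives of $\partial_t^4 J^{-2}(t)$ in $L^2(\Omega)$. As such, we consider for $\alpha = 1, 2$,

$$\begin{aligned} \rho_0\, a_i^3\, \partial_t^4 J^{-2}{}_{,3\alpha} - 2a_i^3\, \partial_t^4 J^{-2}{}_{,\alpha} = {} & \Big[-\partial_t^5 v^i - \rho_0\, \partial_t^4 (a_i^\beta J^{-2}{}_{,\beta}) - (\partial_t^4 a_i^3)(-2J^{-2} + \rho_0 J^{-2}{}_{,3}) + \sum_{a=1}^{3} c_a\, \partial_t^a a_i^3\, \partial_t^{4-a}(-2J^{-2} + \rho_0 J^{-2}{}_{,3})\Big]_{,\alpha} \\ & - \rho_0\, a_i^3{}_{,\alpha}\, \partial_t^4 J^{-2}{}_{,3} + 2a_i^3{}_{,\alpha}\, \partial_t^4 J^{-2}. \end{aligned} \tag{6.34}$$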


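According to Proposition 6.5, the right-hand side of (6.34) is bounded in $L^2(\Omega)$ by $M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P(\sup_{t\in[0,T]} E(t))$. Using the argument just given above in the proof of Proposition 6.6, we conclude that for $\alpha = 1, 2$,

$$\sup_{t\in[0,T]} \Big[\|v_{ttt,\alpha}(t)\|^2_1 + \|\rho_0\, \partial_t^4 J^{-2}{}_{,\alpha}(t)\|^2_1\Big] \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big). \tag{6.35}$$

We next differentiate (6.33) in the vertical direction $x_3$ to obtain

$$\begin{aligned} \rho_0\, a_i^3\, \partial_t^4 J^{-2}{}_{,33} - 3a_i^3\, \partial_t^4 J^{-2}{}_{,3} = {} & \Big[-\partial_t^5 v^i - \rho_0\, \partial_t^4 (a_i^\beta J^{-2}{}_{,\beta}) - (\partial_t^4 a_i^3)(-2J^{-2} + \rho_0 J^{-2}{}_{,3}) + \sum_{a=1}^{3} c_a\, \partial_t^a a_i^3\, \partial_t^{4-a}(-2J^{-2} + \rho_0 J^{-2}{}_{,3})\Big]_{,3} \\ & - \rho_0\, a_i^3{}_{,3}\, \partial_t^4 J^{-2}{}_{,3} + 2a_i^3{}_{,3}\, \partial_t^4 J^{-2}. \end{aligned} \tag{6.36}$$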
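The change of the coefficient from $-2$ in (6.33) to $-3$ in (6.36) can be verified directly. The sketch below writes $f := \partial_t^4 J^{-2}$ (a shorthand used only here) and assumes the special density $\rho_0 = 1 - x_3$, so that $\rho_{0,3} = -1$.

```latex
\begin{align*}
\partial_3\big(\rho_0\, a_i^3\, f_{,3} - 2\, a_i^3\, f\big)
  &= \rho_{0,3}\, a_i^3\, f_{,3} + \rho_0\, a_i^3{}_{,3}\, f_{,3}
     + \rho_0\, a_i^3\, f_{,33}
     - 2\, a_i^3{}_{,3}\, f - 2\, a_i^3\, f_{,3} \\
  &= \rho_0\, a_i^3\, f_{,33} - 3\, a_i^3\, f_{,3}
     + \rho_0\, a_i^3{}_{,3}\, f_{,3} - 2\, a_i^3{}_{,3}\, f .
\end{align*}
% Moving the last two terms to the right-hand side reproduces the
% terms -\rho_0 a_i^3{}_{,3} f_{,3} + 2 a_i^3{}_{,3} f in (6.36).
```

The extra $-a_i^3 f_{,3}$ comes entirely from $\rho_{0,3} = -1$; horizontal differentiation leaves the coefficient at $-2$ because $\rho_{0,\alpha} = 0$.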

Now the inequality (6.35) together with Propositions 6.5 and 6.6 show that the right-hand side of (6.36) is bounded in $L^2(\Omega)$ by $M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P(\sup_{t\in[0,T]} E(t))$. It follows that for $k = 1, 2, 3$,

$$\big\|\rho_0\, a_i^3\, \partial_t^4 J^{-2}{}_{,k3} - 3a_i^3\, \partial_t^4 J^{-2}{}_{,k}\big\|^2_0 \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

Note that the coefficient in front of $a_i^3\, \partial_t^4 J^{-2}$ has changed from $-2$ to $-3$, but the identical integration-by-parts argument that we used in the proof of Proposition 6.6 is once again employed and shows that

$$\|\rho_0\, \partial_t^4 J^{-2}(t)\|^2_2 + \|\partial_t^4 J^{-2}(t)\|^2_1 \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

We can thus infer that

$$\|\operatorname{div} v_{ttt}(t)\|^2_1 \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

According to Proposition 6.1, $\|\operatorname{curl} v_{ttt}(t)\|^2_1 \le M_0 + C\,T\,P(\sup_{t\in[0,T]} E(t))$ and with the bound on $v^\alpha_{ttt}$ given by Proposition 6.5, Proposition 5.2 provides the estimate

$$\|v_{ttt}(t)\|^2_2 \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$

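Proposition 6.8. For $t \in [0, T]$, $v_t(t) \in H^3(\Omega)$, $\rho_0 \partial_t^2 J^{-2}(t) \in H^3(\Omega)$ and

$$\sup_{t\in[0,T]} \Big[\|v_t(t)\|^2_3 + \|\rho_0\, \partial_t^2 J^{-2}(t)\|^2_3\Big] \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\big(\sup_{t\in[0,T]} E(t)\big).$$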


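Proof. Next, we let $\partial_t^2$ act on Eq. (6.29), so that

$$\rho_0\, a_i^3\, \partial_t^2 J^{-2}{}_{,3} - 2a_i^3\, \partial_t^2 J^{-2} = -\partial_t^3 v^i - \rho_0\, \partial_t^2 (a_i^\beta J^{-2}{}_{,\beta}) - (\partial_t^2 a_i^3)\big[-2J^{-2} + \rho_0 J^{-2}{}_{,3}\big] + 2(\partial_t a_i^3)\, \partial_t\big[-2J^{-2} + \rho_0 J^{-2}{}_{,3}\big].$$

The same argument used in the proof of Proposition 6.7 provides the desired inequality.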

Proposition 6.9. For $t \in [0, T]$, $\eta(t) \in H^4(\Omega)$, $\rho_0 J^{-2}(t) \in H^4(\Omega)$ and

$$\sup_{t\in[0,T]} \left( \|\eta(t)\|_4^2 + \|\rho_0 J^{-2}(t)\|_4^2 \right) \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\Big(\sup_{t\in[0,T]} E(t)\Big).$$

Proof. We use the identity

$$\rho_0 a_i^3 J^{-2}{}_{,3} - 2 a_i^3 J^{-2} = -v_t^i - \rho_0 a_i^\beta J^{-2}{}_{,\beta}.$$

The same argument used in the proof of Proposition 6.7 provides the desired inequality.

6.6. Estimates for $\operatorname{curl}_\eta v$. The Lagrangian curl of $v$ gains regularity.

Corollary 6.10.

$$\sup_{t\in[0,T]} \left( \|\operatorname{curl}_\eta v(t)\|_3^2 + \|\rho_0\,\bar\partial^4 \operatorname{curl}_\eta v(t)\|_0^2 \right) \le M_0 + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\Big(\sup_{t\in[0,T]} E(t)\Big).$$

Proof. Letting $D^3$ act on the identity (6.2) for $\operatorname{curl}_\eta v$, we see that the highest-order term scales like

$$D^3 \operatorname{curl} u_0 + \int_0^t D^4 v\; Dv\, A\, A\; dt'.$$

We integrate by parts in time to see that the highest-order contribution to $D^3 \operatorname{curl}_\eta v(t)$ can be written as

$$D^3 \operatorname{curl} u_0 - \int_0^t D^4\eta\; [Dv\, A\, A]_t\; dt' + D^4\eta(t)\; Dv(t)\, A(t)\, A(t),$$

which, according to Proposition 6.9, has $L^2(\Omega)$-norm bounded by

$$M_0(\delta) + \delta \sup_{t\in[0,T]} E(t) + C\,T\,P\Big(\sup_{t\in[0,T]} E(t)\Big),$$

after readjusting the constants; thus, the inequality for the $H^3(\Omega)$-norm of $\operatorname{curl}_\eta v(t)$ is proved. The same type of analysis works for the weighted estimate. After integration by parts in time, the highest-order term in the expression for $\rho_0\,\bar\partial^4 \operatorname{curl}_\eta v(t)$ scales like

$$\rho_0\,\bar\partial^4 \operatorname{curl} u_0 - \int_0^t \rho_0\,\bar\partial^4 D\eta\; [Dv\, A\, A]_t\; dt' + \rho_0\,\bar\partial^4 D\eta(t)\; Dv(t)\, A(t)\, A(t).$$

Hence, the inequality (6.8) shows that the weighted estimate holds as well.


6.7. The a priori bound. Summing the inequalities provided by our energy estimates, the additional elliptic estimates, and the estimates for $\operatorname{curl}_\eta v$ shows that

$$\sup_{t\in[0,T]} E(t) \le M_0 + C\,T\,P\Big(\sup_{t\in[0,T]} E(t)\Big).$$

According to our polynomial-type inequality given in Sect. 4.5, by taking $T > 0$ sufficiently small, we have the a priori bound

$$\sup_{t\in[0,T]} E(t) \le 2M_0.$$
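The way a polynomial-type inequality of this kind closes can be illustrated numerically. The following is only a sketch under stated assumptions: the polynomial `P`, the constants `M0` and `C`, and the helper `admissible_time` are hypothetical choices made for illustration, and the reasoning assumes that $\sup_{[0,t]} E$ is continuous in $t$, starts at or below $M_0$, and that $P$ is nondecreasing. Under those assumptions, any $T$ with $M_0 + C\,T\,P(2M_0) < 2M_0$ prevents $\sup E$ from ever reaching $2M_0$ on $[0,T]$.

```python
# Illustrative sketch (not taken from the paper): closing the inequality
#   sup E <= M0 + C*T*P(sup E)
# by a continuity argument. P, M0 and C are hypothetical choices.

def P(x):
    """A sample polynomial with nonnegative coefficients (assumption)."""
    return 1.0 + x + x**2

def admissible_time(M0, C):
    """Return a T > 0 such that M0 + C*T*P(2*M0) < 2*M0.

    Any T below M0 / (C * P(2*M0)) works; half of that threshold is used.
    """
    return 0.5 * M0 / (C * P(2.0 * M0))

M0, C = 3.0, 10.0
T = admissible_time(M0, C)

# If sup E reached the value 2*M0 at some time in [0, T], the inequality
# would give 2*M0 <= M0 + C*T*P(2*M0) < 2*M0, which is impossible.
gap = 2.0 * M0 - (M0 + C * T * P(2.0 * M0))
print("T =", T, " gap =", gap)
```

The positive `gap` is the margin by which the right-hand side stays below $2M_0$; shrinking `T` further only increases it.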

7. The Case of General $\gamma > 1$

We denote by $a_0$ the integer satisfying the inequality

$$1 < 1 + \frac{1}{\gamma - 1} - a_0 \le 2.$$
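The defining inequality for $a_0$ is equivalent to $a_0 < 1/(\gamma-1) \le a_0 + 1$, so that $a_0 = \lceil 1/(\gamma-1) \rceil - 1$. A minimal sketch of this computation follows; the function name `a0` and the use of exact rational arithmetic are choices made here for illustration, not part of the paper.

```python
from fractions import Fraction
import math

def a0(gamma):
    """Integer a0 with 1 < 1 + 1/(gamma-1) - a0 <= 2, for rational gamma > 1."""
    gamma = Fraction(gamma)
    if gamma <= 1:
        raise ValueError("gamma must exceed 1")
    s = 1 / (gamma - 1)            # exact value of 1/(gamma-1)
    n = math.ceil(s) - 1           # equivalent to a0 < s <= a0 + 1
    assert 1 < 1 + s - n <= 2      # check the defining inequality
    return n

# a0 grows as gamma decreases towards 1.
for g in (Fraction(3), Fraction(2), Fraction(5, 3), Fraction(3, 2),
          Fraction(4, 3), Fraction(11, 10)):
    print(g, a0(g))
```

Exact fractions avoid rounding problems at the endpoint cases where $1/(\gamma-1)$ is an integer, such as $\gamma = 2$ or $\gamma = 3/2$.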

The general higher-order energy function is given by

$$\begin{aligned}
E_\gamma(t) ={}& \sum_{a=0}^{4} \|\partial_t^{2a}\eta(t)\|_{4-a}^2 + \sum_{a=0}^{4} \Big[ \|\rho_0\,\bar\partial^{4-a}\partial_t^{2a} D\eta(t)\|_0^2 + \|\sqrt{\rho_0}\,\bar\partial^{4-a}\partial_t^{2a} v(t)\|_0^2 \Big] \\
& + \sum_{a=0}^{3} \|\rho_0\,\partial_t^{2a} J^{-2}(t)\|_{4-a}^2 + \|\operatorname{curl}_\eta v(t)\|_3^2 + \|\rho_0\,\bar\partial^4 \operatorname{curl}_\eta v(t)\|_0^2 \\
& + \sum_{a=0}^{a_0} \Big\|\sqrt{\rho_0}^{\,1+\frac{1}{\gamma-1}-a}\; \partial_t^{7+a_0-a} Dv(t)\Big\|_0^2.
\end{aligned}$$

Notice the last sum in $E_\gamma$ appears whenever $\gamma < 2$, and the number of time-differentiated problems increases as $\gamma$ approaches 1. Using this energy function, the same methodology as we used for the case $\gamma = 2$ shows that $\sup_{t\in[0,T]} E_\gamma(t)$ remains bounded for $T > 0$ taken sufficiently small.

Acknowledgements. We thank the referee for useful suggestions which have improved the manuscript. SS was supported by the National Science Foundation under grant DMS-0701056. HL was supported by the National Science Foundation under grant DMS-0801120.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Communicated by P. Constantin

Commun. Math. Phys. 296, 589–623 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1020-0

Topological Open Strings on Orbifolds

Vincent Bouchard$^1$, Albrecht Klemm$^2$, Marcos Mariño$^3$, Sara Pasquetti$^4$

$^1$ Jefferson Physical Laboratory, Harvard University, 17 Oxford St., Cambridge, MA 02138, USA
$^2$ Physikalisches Institut der Universität Bonn, Nußallee 12, D-53115 Bonn, Germany
$^3$ Département de Physique Théorique et Section de Mathématiques, Université de Genève, CH 1211 Genève 4, Switzerland
$^4$ Institut de Physique, Université de Neuchâtel, Rue A. L. Breguet 1, CH-2000 Neuchâtel, Switzerland

Received: 5 September 2008 / Accepted: 16 January 2010
Published online: 7 March 2010 – © Springer-Verlag 2010

Abstract: We use the remodeling approach to the B-model topological string in terms of recursion relations to study open string amplitudes at orbifold points. To this end, we clarify modular properties of the open amplitudes and rewrite them in a form that makes their transformation properties under the modular group manifest. We exemplify this procedure for the $\mathbb{C}^3/\mathbb{Z}_3$ orbifold point of local $\mathbb{P}^2$, where we present results for topological string amplitudes for genus zero and up to three holes, and for the one-holed torus. These amplitudes can be understood as generating functions for either open orbifold Gromov–Witten invariants of $\mathbb{C}^3/\mathbb{Z}_3$, or correlation functions in the orbifold CFT involving insertions of both bulk and boundary operators.

1. Introduction

Topological string theory on Calabi–Yau threefolds has played a crucial role in our understanding of string theory and Gromov–Witten theory. One of the most fascinating aspects of this topological sector of string theory is that very often amplitudes can be computed exactly, and their dependence on the moduli can be studied in detail. This has led to very rich pictures of the moduli space of the theory, involving different phases which exhibit different physics [10,56]. Modular and analytic properties of the amplitudes connect the different phases of the Calabi–Yau moduli space in a very precise way. Each phase of the moduli space is characterized by a set of “good coordinates,” and different good coordinates corresponding to different phases are related by a transformation in the modular group of the theory. As explained in [1], topological string amplitudes are modular objects with specific transformation properties under this group, and as one goes from one phase to the other, the amplitudes have to be transformed accordingly. For example, when expanded at the large radius limit in moduli space, topological string amplitudes are generating functions of Gromov–Witten invariants.
As one moves away from this point towards different regions in moduli space, the large radius expansion eventually ceases to converge, but after suitable modular transformations and analytic continuations, the topological string amplitudes can be re-expanded in terms of the good variables of the new phase. In particular, when going to orbifold points of the moduli space, the amplitudes become generating functions for orbifold Gromov–Witten invariants. A detailed understanding of the modular transformation properties of the amplitudes makes it then possible to relate Gromov–Witten invariants to orbifold Gromov–Witten invariants, in the spirit of the crepant resolution conjecture [19,24,53]. In [1] this was used to calculate generating functions of orbifold Gromov–Witten invariants in the case of the $\mathbb{C}^3/\mathbb{Z}_3$ orbifold, which corresponds to a phase in the moduli space of local $\mathbb{P}^2$, its crepant resolution. The predictions obtained in this way have been later verified mathematically in orbifold Gromov–Witten theory [11,17,20,25], and other examples have been recently calculated [18,26]. A crucial ingredient in the approach of [1] is the ability to obtain exact results for the topological string amplitudes on the whole of moduli space, so that they can be expanded in different phases. These exact expressions are typically calculated by using the B-model and mirror symmetry. On top of that, it is extremely useful to write these exact results in a way that makes the transformation properties manifest. For local Calabi–Yau threefolds, the mirror manifold reduces to an algebraic curve and the modular group is essentially the symplectic group acting on the homology of the surface. Topological string amplitudes can then be written in terms of modular forms with respect to this group, and when the curve has genus one, as in the case of the mirror to local $\mathbb{P}^2$, one can write them in terms of elliptic functions [1]. The results of [1] were obtained for closed string amplitudes, and it is natural to ask how one could extend these results to open topological string amplitudes.
As in the closed case, we first need a formalism to compute open topological string amplitudes exactly on the whole closed and open moduli space. For the case of toric Calabi–Yau threefolds, this formalism has been proposed in [16,49] and it is based on a recursion relation first obtained in the context of matrix models [32,34]. One advantage of the framework developed in [16,49], as compared to the holomorphic anomaly equations of [12,54], is that the amplitudes are completely fixed by the recursion. It is then natural to use this formalism in order to understand the properties of open string amplitudes as one moves in the open and closed moduli space of toric Calabi–Yau threefolds, and in particular to extract information about the open counterparts of orbifold Gromov–Witten invariants (which so far have not been defined in the mathematical literature). In [16] some steps were taken in this direction. In particular we discussed how to find “good coordinates” for the open moduli at the orbifold point, and we made a preliminary analysis of the disk amplitude. In this paper we present a detailed study of open topological string amplitudes at the orbifold point, focusing on the case of $\mathbb{C}^3/\mathbb{Z}_3$. First of all, we clarify the transformation properties of the string amplitudes in the open sector, and we present expressions for them which make their modular transformation properties manifest. Since the recursion of [16,34,49] is based on the Bergman kernel of the mirror curve, our first step is to write it (for a curve of genus one) in terms of elliptic functions. One can then plug the resulting expression in the recursion to find modular expressions for all the open string amplitudes. This leads to considerable improvements in terms of computational efficiency of the recursion relations.
As a consequence of this refinement of the formalism of [16], we are able to calculate open orbifold string amplitudes at high order, and we present explicit expressions for amplitudes with $(g, h) = (0, 2)$, $(0, 3)$ and $(1, 1)$. These expressions give generating functions for open orbifold Gromov–Witten invariants, and from the CFT point of view they compute correlation functions of arbitrary insertions of both bulk operators, associated with twist fields, and boundary operators, associated with deformation modes of the D-brane open moduli.

The organization of the paper is as follows. We start by reviewing the remodeling approach to the B-model using recursion relations in Sects. 2.1 to 2.3. In Sect. 2.4, we study modularity of the open amplitudes, which we rewrite in a form that makes their transformation properties explicit in Sect. 2.5. Section 3 is then devoted to the study of topological open string amplitudes at the $\mathbb{C}^3/\mathbb{Z}_3$ orbifold point in the moduli space of local $\mathbb{P}^2$, using the formalism presented in Sect. 2. We also briefly comment on the calculation of the open amplitudes at the conifold point in the moduli space of local $\mathbb{P}^2$ in Sect. 3.5.

2. Open B-model on Mirrors of Toric Calabi-Yau Threefolds

2.1. The geometry. Consider the A-twisted sigma model on a (noncompact) toric Calabi-Yau threefold $X$. A-branes are objects in the “derived Fukaya category” of $X$; roughly speaking, they correspond to Lagrangian submanifolds of $X$ with bundles on them. We consider a simple class of A-branes, given by noncompact special Lagrangian submanifolds $L \subset X$ with trivial bundle, with topology $\mathbb{R}^2 \times S^1$; those were constructed in [4,5,39] (see also [16] for a detailed description).

The mirror theory is a B-twisted sigma model$^1$ on a family $\pi : Y \to \mathcal{M}$ of noncompact Calabi-Yau threefolds, where $\mathcal{M}$ is the moduli space of the closed B-model. Let $z = (z_1, \ldots, z_k)$ be coordinates on $\mathcal{M}$ centered at a point of maximally unipotent monodromy. The fiber $Y_z = \pi^{-1}(z_1, \ldots, z_k)$ of the family has the form

$$Y_z = \{ w w' = H(x, y; z) \} \subset (\mathbb{C})^2 \times (\mathbb{C}^*)^2, \tag{2.1}$$

where $H(x, y; z)$ is a Laurent polynomial in $x, y \in \mathbb{C}^*$ of degree 1. The precise form of $H(x, y; z)$ is dictated by the toric data of the mirror $X$. $Y_z$ is a quadric fibration over $(\mathbb{C}^*)^2$, with degeneration locus the Riemann surface

$$\Sigma_z = \{ H(x, y; z) = 0 \} \subset (\mathbb{C}^*)^2. \tag{2.2}$$

B-branes are objects in the derived category of coherent sheaves, some of which correspond to holomorphic submanifolds of $Y_z$ with bundles on them. The B-branes mirror to the simple A-branes considered above can be described as wrapping a holomorphic curve in $Y_z$, with trivial bundle on it. More precisely, fix a point $p_0 \in \Sigma_z$ parameterized by $(x_0, y_0)$, and denote by $C_z(p_0)$ the holomorphic submanifold of $Y_z$ defined by

$$w' = 0 = H(x_0, y_0; z). \tag{2.3}$$

It is given by the line parameterized by $w$ over the point $p_0 \in \Sigma_z$. This is the holomorphic curve which is wrapped by the B-brane. The open moduli space corresponds to deformations of the B-brane $C_z(p_0)$ in $Y_z$, which are parameterized by the point $p_0 \in \Sigma_z$. As a result, the moduli space of the open B-model on $(Y, C)$ is given by the family of Riemann surfaces $\Sigma \to \mathcal{M}$, with fiber (2.2).

$^1$ The mirror is generally presented as a Landau-Ginzburg model; we explain the correspondence between the Landau-Ginzburg model and the sigma model in Appendix A.


Example 2.1. The main example that we will study is the mirror to local $\mathbb{P}^2$. Let $X = K_{\mathbb{P}^2}$ be the total space of the canonical bundle over $\mathbb{P}^2$. Its mirror is the family of Calabi-Yau threefolds $Y \to \mathcal{M}$, where the closed moduli space $\mathcal{M}$ is one-dimensional, whose fibers $Y_z$ are given by (2.1) with

$$H(x, y; z) = 1 + x + y + \frac{z}{xy}. \tag{2.4}$$

The family of Riemann surfaces $\Sigma \to \mathcal{M}$ has fibers $\Sigma_z$ (2.2), which are elliptic curves with three punctures.
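As a hedged illustration of the curve in Example 2.1, one can locate the points where the projection of $\{H = 0\}$ onto the $x$-axis ramifies, for the embedding (2.4) exactly as written. Multiplying $H = 0$ by $xy$ gives the quadratic $x y^2 + x(1+x)\,y + z = 0$ in $y$, whose discriminant vanishes on $\mathbb{C}^*$ when $x^3 + 2x^2 + x - 4z = 0$. The sketch below finds the three roots numerically and checks that their elementary symmetric functions are the polynomials $-2$, $1$ and $4z$ in $z$. The helper names (`cubic_roots`, `branch_points`, `H`) and the sample value of `z` are assumptions made for illustration; the root finder is a generic Durand-Kerner iteration, not a method used in the paper.

```python
def cubic_roots(c2, c1, c0, iterations=200):
    """Roots of x^3 + c2*x^2 + c1*x + c0 by Durand-Kerner iteration."""
    def f(x):
        return ((x + c2) * x + c1) * x + c0
    # Standard non-real, non-symmetric starting guesses.
    roots = [complex(0.4, 0.9) ** k for k in range(3)]
    for _ in range(iterations):
        updated = []
        for i, r in enumerate(roots):
            denom = 1.0 + 0j
            for j, s in enumerate(roots):
                if j != i:
                    denom *= r - s
            updated.append(r - f(r) / denom)
        roots = updated
    return roots

def H(x, y, z):
    """The Laurent polynomial (2.4)."""
    return 1 + x + y + z / (x * y)

def branch_points(z):
    """x-values where the two sheets y(x) of H = 0 coincide (x != 0)."""
    return cubic_roots(2.0, 1.0, -4.0 * z)

z = 0.1 + 0.05j                      # sample modulus (hypothetical value)
xs = branch_points(z)

# Elementary symmetric functions of the three branch points.
e1 = xs[0] + xs[1] + xs[2]
e2 = xs[0] * xs[1] + xs[0] * xs[2] + xs[1] * xs[2]
e3 = xs[0] * xs[1] * xs[2]
print(e1, e2, e3, 4 * z)

# At a branch point the double root of the quadratic is y = -(1 + x)/2.
residuals = [abs(H(x, -(1 + x) / 2, z)) for x in xs]
print(residuals)
```

The discriminant of the quadratic in $y$ is a quartic in $x$ with roots at $0$ and at the three points found above, which is consistent with the fibers being of genus one.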

2.2. Disk amplitude. In this paper we focus on the open amplitudes of the B-model. Let us start with the simplest amplitude, the disk amplitude (genus 0, 1 hole). Roughly speaking, it is the open analog of the genus 0 closed amplitude, which corresponds to the prepotential of special geometry of the closed moduli space $\mathcal{M}$. The disk amplitude on $(Y, C)$ similarly admits a simple definition as follows. Recall that the moduli space of the open B-model consists of a family of Riemann surfaces (with punctures) $\Sigma \to \mathcal{M}$. Choose an embedding$^2$ of the fibers $\Sigma_z$ in $(\mathbb{C}^*)^2$,

$$\Sigma_z = \{ H(x, y; z) = 0 \} \subset (\mathbb{C}^*)^2, \tag{2.5}$$

and define the one-form

$$\omega(p) = \log y(x(p))\, \frac{dx(p)}{x(p)} = \log y(x)\, \frac{dx}{x} \tag{2.6}$$

on $\Sigma_z$, where $p \in \Sigma_z$ and $x$ is chosen as local coordinate.

Remark 2.1. Note that in the following we will always omit the dependence on $z$ to simplify the notation. But since $\Sigma \to \mathcal{M}$ is a family of curves, all the objects we define on the fiber $\Sigma_z$ will have an implicit dependence on $z$.

The main conjecture of [4,5], which comes from dimensional reduction of the holomorphic Chern-Simons theory on the brane $C$, goes as follows.

Conjecture 2.1 ([4,5]). The “Abel-Jacobi” map

$$F^{(0,1)} = \int_\gamma \omega(p), \tag{2.7}$$

where $\gamma$ is the chain $[q^*, q]$ and $q^* \in \Sigma_z$ is a reference point, gives the B-model disk amplitude, up to classical terms.

$F^{(0,1)}$ should be understood as a series expansion in the local coordinate $x$ near $x = 0$, where $x$ corresponds to the open modulus associated to the brane.

$^2$ The choice of embedding of $\Sigma_z$ in $(\mathbb{C}^*)^2$ corresponds to a choice of phase and framing of the mirror brane. This was considered in detail in [16].


The Abel-Jacobi map is defined on the Jacobian, that is only up to addition of integrals of $\omega(p)$ over one-cycles. But here we will only be interested in the series expansion of the amplitude in the open modulus, and so the ambiguity is irrelevant. Note that this conjecture is the local analog of the result of [50], where the disk amplitude is computed in terms of normal functions. This formula has been verified in many examples, by expanding the disk amplitude near a point of maximally unipotent monodromy in the closed moduli space, and comparing with open A-model amplitudes on the toric mirror. It requires an explicit knowledge of the closed and open mirror maps, which can be understood as solutions of an extended Picard-Fuchs system (the latter was derived in the language of mixed Hodge structures and relative cohomology in [48]).

2.3. General formalism. We now move on to the general amplitudes $F^{(g,h)}$ with genus $g$ and $h$ holes. As for the closed amplitudes $F^{(g)}$, the physical B-model open amplitudes are generally non-holomorphic, and satisfy an open analog of the holomorphic anomaly equations of [12]. However, to compare with the A-model Gromov-Witten generating functions, one needs to consider the holomorphic limit of the physical B-model amplitudes expanded near a special point in the moduli space. The $F^{(g,h)}$ that we consider here are these holomorphic objects, rather than the physical B-model amplitudes. Stated from a modularity point of view, what we construct here are the quasi-modular forms, rather than the almost holomorphic modular forms [1]. We will discuss this point in more detail in the next subsection.

In [16,49] a general recursive formalism for computing B-model genus $g$, $h$ hole open amplitudes $F^{(g,h)}$ on $(Y, C)$ was proposed. From a mathematical point of view, since the open B-model is not really well understood, this can be taken as a proposal for a definition of the open B-model on these geometries. Consider again the following data:

– A family of (punctured) Riemann surfaces $\Sigma \to \mathcal{M}$ (the open B-model moduli space);
– A choice of embedding of the fibers $\Sigma_z$ in $(\mathbb{C}^*)^2$,

$$\Sigma_z = \{ H(x, y; z) = 0 \} \subset (\mathbb{C}^*)^2. \tag{2.8}$$

We claim that these data fully characterize the open B-model on $(Y, C)$, with arbitrary genus and number of holes. By projecting onto the $x$-axis we may see $\Sigma_z$ as a branched cover of $\mathbb{C}^*$. Denote by $q_i \in \Sigma_z$ the ramification points of the projection map, such that $dx(q_i) = 0$. Let $\lambda_i := x(q_i) \in \mathbb{C}^*$ be the branch points. We assume that they all have branching order two. Then, near $q_i$, there are two points $q, \bar q \in \Sigma_z$ with the same projection $x(q) = x(\bar q)$ (those are defined only locally near $q_i$). As before, define the one-form $\omega(p)$, which reads in local coordinates

$$\omega(p) = \log y(x)\, \frac{dx}{x}. \tag{2.9}$$

Definition 2.1. The Bergman kernel $B(p, q)$ is the unique bilinear differential on $\Sigma_z$ with a double pole at $p = q$ with no residue, and no other pole. It is normalized by

$$\oint_{A_I} B(p, q) = 0, \tag{2.10}$$

where $(A_I, B^I)$ is a symplectic basis of cycles on $\Sigma_z$.


Note that the Bergman kernel is defined on the Riemann surface itself, and does not depend on the embedding in $(\mathbb{C}^*)^2$. Its definition however requires a choice of symplectic basis of cycles on $\Sigma_z$.

Definition 2.2. Near $q_i \in \Sigma_z$, define the one-form

$$dE_{q,\bar q}(p) = \frac{1}{2} \int_{\xi = q}^{\bar q} B(p, \xi), \tag{2.11}$$

where the integration is in a neighborhood of $q_i$. Note that this is defined only locally near $q_i$.

We are now ready to state the recursion, which was first derived in the context of matrix models in [22,32,34].

Definition 2.3. Let $\widetilde{W}^{(g,h)}(p_1, p_2, \ldots, p_h)$, $g, h \in \mathbb{Z}$, $g \ge 0$, $h \ge 1$, be multilinear differentials on $\Sigma_z$. Fix the initial conditions

$$\widetilde{W}^{(0,1)}(p_1) = 0, \qquad \widetilde{W}^{(0,2)}(p_1, p_2) = B(p_1, p_2). \tag{2.12}$$

Define the remaining differentials by the recursion$^3$

$$\widetilde{W}^{(g,h)}(p_1, p_2, \ldots, p_h) = \sum_{q_i} \operatorname*{Res}_{q = q_i} \frac{dE_{q,\bar q}(p_1)}{\omega(q) - \omega(\bar q)} \Bigg( \widetilde{W}^{(g-1,h+1)}(q, \bar q, p_2, \ldots, p_h) + \sum_{l=0}^{g} \sum_{J \subseteq H} \widetilde{W}^{(g-l,|J|+1)}(q, p_J)\, \widetilde{W}^{(l,|H|-|J|+1)}(\bar q, p_{H \setminus J}) \Bigg), \tag{2.13}$$

where we used the notation $H = \{2, 3, \ldots, h\}$, and given any subset $J = \{i_1, \ldots, i_j\} \subseteq H$ we defined $p_J = \{p_{i_1}, \ldots, p_{i_j}\}$.

There is a second recursion which reads as follows.

Definition 2.4. Let $F^{(g)}$, $g \in \mathbb{Z}$, $g \ge 2$, be functions on $\Sigma_z$ defined by

$$F^{(g)} = \frac{1}{2g - 2} \sum_{q_i} \operatorname*{Res}_{q = q_i} \theta(q)\, \widetilde{W}^{(g,1)}(q), \tag{2.14}$$

where $\theta(q)$ is any primitive of $\omega(q)$, i.e. $d\theta(q) = \omega(q)$.

We define the function $F^{(1)}$ by:

Definition 2.5. Define

$$F^{(1)} = -\frac{1}{2} \log \tau_B - \frac{1}{24} \log \prod_i \omega'(q_i), \tag{2.15}$$

where $\tau_B$ is the Bergman tau-function and

$$\omega'(q_i) = \frac{1}{dz_i(p)}\, d\!\left( \frac{\log y(x)}{x} \right) \Bigg|_{p = q_i}, \qquad z_i(p) = \sqrt{x(p) - x(q_i)}. \tag{2.16}$$

We refer the reader to [34] for more details.

$^3$ Note that the integrand in the right-hand side is only defined locally near the ramification points $q_i$.
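The combinatorial structure of the right-hand side of (2.13) can be made concrete by listing, for given $(g, h)$, which products actually contribute once the initial condition $\widetilde{W}^{(0,1)} = 0$ of (2.12) is taken into account. The sketch below is purely symbolic bookkeeping under that assumption: it does not evaluate any residues, and the function name `recursion_terms` and the tuple encoding of the factors are illustrative choices.

```python
from itertools import combinations

def recursion_terms(g, h):
    """List the nonvanishing terms on the right-hand side of (2.13).

    Each term is a tuple of factors; a factor is (genus, arguments), where
    the arguments are 'q', 'qbar' and the labels 2..h of p_2, ..., p_h.
    """
    H = tuple(range(2, h + 1))
    terms = []
    if g >= 1:
        # First term: W~(g-1,h+1)(q, qbar, p_2, ..., p_h).
        terms.append(((g - 1, ("q", "qbar") + H),))
    for l in range(g + 1):
        for size in range(len(H) + 1):
            for J in combinations(H, size):
                rest = tuple(i for i in H if i not in J)
                left = (g - l, ("q",) + J)
                right = (l, ("qbar",) + rest)
                # W~(0,1) = 0 by (2.12), so products containing it vanish.
                if any(genus == 0 and len(args) == 1
                       for genus, args in (left, right)):
                    continue
                terms.append((left, right))
    return terms

for g, h in [(0, 3), (1, 1), (1, 2)]:
    print((g, h), recursion_terms(g, h))
```

For $(g,h) = (1,1)$ only the term $\widetilde{W}^{(0,2)}(q, \bar q)$ survives, and for $(0,3)$ only the two products of Bergman kernels $\widetilde{W}^{(0,2)}(q, p_2)\,\widetilde{W}^{(0,2)}(\bar q, p_3)$ and $\widetilde{W}^{(0,2)}(q, p_3)\,\widetilde{W}^{(0,2)}(\bar q, p_2)$ appear, so every amplitude is ultimately built from $B$ and $\omega$.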


The main conjecture of [16,49], which relates the objects defined above recursively to the B-model amplitudes, could be stated as follows.

Conjecture 2.2. Let $F^{(0)}$ be the prepotential of special geometry, $F^{(1)}$ be as in Definition 2.5, and the $F^{(g)}$’s for $g \ge 2$ be as in Definition 2.4. For $g \ge 0$, $h \ge 1$, and $(g, h) \ne (0, 1), (0, 2)$, define the multilinear differentials

$$W^{(g,h)}(p_1, \ldots, p_h) = \widetilde{W}^{(g,h)}(p_1, \ldots, p_h), \tag{2.17}$$

using Definition 2.3. Let

$$W^{(0,2)}(p_1, p_2) = B(p_1, p_2) - \frac{dp_1\, dp_2}{(p_1 - p_2)^2}, \tag{2.18}$$

and

$$W^{(0,1)}(p) = \omega(p). \tag{2.19}$$

Define

$$F^{(g,h)} = \int_{\gamma_1} \cdots \int_{\gamma_h} W^{(g,h)}(p_1, \ldots, p_h), \tag{2.20}$$

where the $\gamma_i$’s are the chains $[q_i^*, q_i]$, with the $q_i^* \in \Sigma_z$ reference points. The $F^{(g)}$ constructed above are the genus $g$ closed B-model amplitudes on $Y$, and the $F^{(g,h)}$ are the genus $g$, $h$ hole open B-model amplitudes on $(Y, C)$.

The $F^{(g,h)}$ should be understood as series expansions in the local coordinates $x_i := x(p_i)$, which correspond to the open moduli associated to the branes. Note that as for the disk amplitude $F^{(0,1)}$, the $F^{(g,h)}$ are only defined modulo integration over closed cycles; but again, this ambiguity will be irrelevant since we only consider instanton expansions of the amplitudes.

Remark 2.2. Note that the conjecture for $F^{(1)}$ should probably follow from the result of Dubrovin and Zhang for the G-function associated to Frobenius manifolds [31], which was also studied by Givental [36]. It can also be understood from a topological field theory point of view as in [13].

There are various arguments behind this conjecture. First, a strong piece of evidence comes from direct calculation. In [16,49], various amplitudes for the mirrors of $\mathbb{C}^3$, local $\mathbb{P}^1$, $\mathbb{P}^2$, $\mathbb{P}^1 \times \mathbb{P}^1$, $\mathbb{F}_1$, $\mathbb{F}_2$ were computed. By expanding the amplitudes near a point of maximally unipotent monodromy and plugging in the open and closed mirror maps, it was shown that one recovers the open A-model amplitudes on the toric mirrors. This however only tests the conjecture at large radius; in [16] the conjecture was also tested at the orbifold point of local $\mathbb{P}^1 \times \mathbb{P}^1$, by comparing with perturbative Chern-Simons theory on the lens space $S^3/\mathbb{Z}_2$.

A more conceptual argument for the conjecture goes as follows. The recursions were derived by [22,34] in the context of matrix models. When $\Sigma$ is the spectral curve of a matrix model, the recursions (2.13) and (2.14) respectively generate the correlation functions and free energies of the matrix model.
For some B-model geometries, using large $N$ dualities on the mirror side, we can find matrix model representations with spectral curve $\Sigma_z$, which justifies the conjecture. However, in general no matrix model representation is known; but it was argued first in [49], and then in much more detail in [30], that the B-model amplitudes should indeed satisfy the recursions (2.13) and (2.14), whether there is a matrix model representation or not. This involves a detailed analysis of the B-model, understood in the chiral boson picture developed in [2]. One can also show that the amplitudes obtained through the recursion, after restoring non-holomorphicity using modularity as in [1], satisfy the holomorphic anomaly equations (and their open analogs) [33]. In any case, in the following we will take this conjecture for granted and explore some of its consequences.

2.4. Modularity. In the previous subsection we introduced a recursive formalism to compute open and closed B-model amplitudes. While the formalism is very elegant conceptually, and provides a complete solution to the B-model on these geometries, it turns out to be rather complicated computationally. One reason is that the formalism makes no explicit use of the modular properties of the amplitudes; on the contrary, the intermediate step of taking residues at the branch points destroys the symmetry of the amplitudes. Indeed, the branch points are in general complicated functions of $z$, since the projection $\Sigma_z \to \mathbb{C}^*$ is a branched cover. But the final amplitudes are simple functions of $z$; in fact, only symmetric combinations of the branch points, which are simple rational functions of $z$, appear in the final amplitudes.

It thus seems desirable to recast the recursion in a different form, bypassing the intermediate step of taking residues at the branch points. However, one problem is that the integrand in the right hand side of (2.13) is only defined locally near the branch points. Hence, one cannot simply deform the contour integral to pick up residues at the other poles in a straightforward way. Indeed, localizing the integrand at the branch points turns out to be crucial in the derivation of the recursion in the matrix model context (see for instance pp. 14-15 of [22]), in order to get rid of unfixed polynomials. So it seems that the intermediate step plays a more important role than one would have expected at first sight.$^4$

Even though it seems difficult to reformulate the recursion in a more computationally effective way, what we can do is use our knowledge of the modular properties of the amplitudes to rewrite the amplitudes a posteriori. That is, using modularity and regularity of the amplitudes we write down a general ansatz for the amplitudes, either in terms of modular forms, or as functionals of solutions of the Picard-Fuchs equations. At each genus $g$ and number of holes $h$, the ansatz involves rational functions in the open and closed moduli comprising a finite number of unknown parameters. The latter can be fixed by comparing the ansatz with the result obtained from the recursion (2.13). Alternatively, the parameters can be fixed by comparing with a mirror calculation at large radius using the topological vertex formalism [3]. These formulae prove to be very useful in studying the amplitudes at various points in the moduli space, as we will do in the next section. However, the rational functions become rather involved and increasingly difficult to determine for higher genus and a larger number of holes.

$^4$ It is tempting to speculate that the process of localizing at the branch points is a B-model mirror analog to the process of using localization with respect to a torus action in Gromov-Witten theory.

2.4.1. Picard-Fuchs equations, monodromy and modularity of the closed amplitudes. Given a family of Calabi-Yau threefolds $Y \to \mathcal{M}$, it is standard to associate a system of differential equations, called the Picard-Fuchs equations, which annihilate periods of the holomorphic volume form $\Omega_z$ on the fiber $Y_z$. In the noncompact setting, the Picard-Fuchs system can be extracted either by taking the limit of a compact threefold [23], or by considering the equivalent Landau-Ginzburg setting [36,41]. Solutions to the Picard-Fuchs equations provide a set of flat coordinates on $\mathcal{M}$. When $Y$ is of the form studied previously, it can be shown that the geometry “reduces” to the family of curves $\Sigma \to \mathcal{M}$, and the Picard-Fuchs equations annihilate periods of the one-form $\omega(p)$ over one-cycles on the Riemann surface $\Sigma_z$.

From now on, we focus on the case where $\Sigma_z$ is a genus one curve. Let $(A, B)$ be a canonical basis of one-cycles on the genus one curve $\Sigma_z$. Apart from a constant solution, there are two more solutions to the Picard-Fuchs equations, which provide a basis of dual periods:

$$T = \oint_A \omega(p), \qquad T_D = \oint_B \omega(p). \tag{2.21}$$

The Picard-Fuchs differential equations have regular singular points, around which the periods have monodromy. The monodromy group is a finite index subgroup of S L(2, Z). A natural question is to study modularity of the B-model amplitudes with respect to the monodromy group. This question was approached for the closed amplitudes from physical principles in [1]. The physical closed B-model amplitudes F (g) are invariant under the monodromy group — indeed, this is required for consistency of the physical theory all over the moduli space M — but they are non-holomorphic. This can be reformulated in terms of modularity with respect to the modular parameter of z : τ=

∂ 2 F (0) ∂ TD = , ∂T ∂T 2

(2.22)

where F (0) is the prepotential of special geometry giving the genus 0 closed B-model amplitude. In this language, the statement becomes that for g ≥ 2, the physical amplitudes F (g) are almost holomorphic modular forms with respect to the monodromy group [1]. However, there is a canonical isomorphism between the ring of almost holomorphic modular forms and the ring of quasi-modular forms — forms that transform with a shift [44]. This is given by “taking the holomorphic limit” of F (g) , which breaks the modular invariance by keeping only the constant term in the finite expansion in Im(τ )−1 . We thus obtain the holomorphic closed B-model amplitudes F (g) , which are quasi-modular with respect to the monodromy group. Those are the amplitudes that were constructed through the recursion. 2.4.2. Modularity of the open amplitudes We now want to understand the modular properties of the open amplitudes F (g,h) , which are the holomorphic limits of the monodromy invariant physical amplitudes F (g,h) . This was studied in [33,34] using the recursion. Let τ be the modular parameter of z , which parameterizes the upper half plane. Let τ˜ =

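$$\frac{A\tau + B}{C\tau + D}, \qquad \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \Gamma \subset SL(2,\mathbb{Z}), \qquad (2.23)$$

be a symplectic transformation of the periods in the monodromy group $\Gamma$. Under (2.23), the Bergman kernel transforms as

$$\tilde B(p, q) = B(p, q) - 2\pi i\, u(p)\,(C\tau + D)^{-1} C\, u(q), \qquad (2.24)$$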

V. Bouchard, A. Klemm, M. Mariño, S. Pasquetti

where u( p) is the holomorphic differential. The shift makes the Bergman kernel a quasimodular form of weight 0. Through the recursion (2.13), this induces quasi-modular properties for all the open amplitudes W (g,h) . One can compute the explicit transformation properties of the differentials W (g,h) by plugging in the transformation properties of the Bergman kernel directly in the recursion, as was done in [33,34]; we will not repeat the analysis here. Instead, what we are doing next is to use our knowledge of modularity to write down explicit expressions for the (low genus and number of hole) amplitudes in terms of modular forms, and as functionals of solutions of the Picard-Fuchs equations (the periods).

2.5. Modular forms and functionals. As we have just seen, the multilinear differentials W (g,h) are quasi-modular forms of weight 0 with respect to the monodromy group. They are obtained by taking the holomorphic limit of the non-holomorphic differentials W (g,h) , which correspond to the physical amplitudes, therefore are monodromy invariant. As a consequence, the holomorphic amplitudes W (g,h) can be universally written as functionals of the periods and their derivatives, where the periods are functions of some local coordinates on the moduli space. The functional point of view provides a very useful way of computing modular transformations of the amplitudes, since changing the period in the functional directly implements the symplectic transformation between the periods. In other words, the choice of period in the functional corresponds to a choice of modular parameter, or equivalently to a choice of canonical basis of cycles in the definition of the Bergman kernel. This approach renders the computation of the amplitudes everywhere in the moduli space straightforward. To see how it goes, let us start by deriving a general expression for the annulus amplitude (or the Bergman kernel) in terms of modular forms, which is the main ingredient in the recursion relation, and induces the quasi-modular properties of the amplitudes. We then explain how it can be written generally as a functional; we will propose an exact form for the functional in the next section when we specialize to the local P2 geometry. Finally we propose a general ansatz for the higher order amplitudes, which we use in the next section to derive functional expressions and compute the amplitudes at the orbifold point of local P2 . 2.5.1. The annulus amplitude As usual, we start with a family of (punctured) Riemann surfaces → M (the open B-model moduli space), and a choice of embedding of the fibers z in (C∗ )2 , z = {H (x, y; z) = 0} ⊂ (C∗ )2 .

(2.25)

We specialize to the case where z is a genus one curve. Denote by qi ∈ z the ramification points of the projection map onto the x-axis, and by λi := x(qi ) ∈ C∗ the branch points. When z has genus one, the annulus amplitude W (0,2) can be written in terms of the Weierstrass elliptic function, using uniformization parameters for the elliptic curve. Alternatively, when z has four distinct branch points λi , i = 1, . . . , 4, one can work directly on the C∗ which is the image of the x-projection. In terms of x-projected variables x1 , x2 ∈ C∗ (i.e. local coordinates x1 := x( p1 ) and x2 := x( p2 )), Akemann derived in [6] a formula for the annulus amplitude, which reads

Topological Open Strings on Orbifolds

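$$W^{(0,2)}(p_1, p_2) = \frac{dx_1\,dx_2}{4\sqrt{\sigma(x_1)\sigma(x_2)}} \left( \frac{M(x_1,x_2) + M(x_2,x_1)}{(x_1-x_2)^2} - (\lambda_1-\lambda_3)(\lambda_2-\lambda_4)\,\frac{E(k)}{K(k)} \right) - \frac{dx_1\,dx_2}{2(x_1-x_2)^2}, \qquad (2.26)$$

where

$$M(x_1, x_2) = (x_1-\lambda_1)(x_1-\lambda_2)(x_2-\lambda_3)(x_2-\lambda_4), \qquad (2.27)$$

$$\sigma(x) = \prod_{i=1}^{4} (x - \lambda_i), \qquad (2.28)$$

and $K(k)$ and $E(k)$ are elliptic integrals with modulus

$$k^2 = \frac{(\lambda_1-\lambda_2)(\lambda_3-\lambda_4)}{(\lambda_1-\lambda_3)(\lambda_2-\lambda_4)}. \qquad (2.29)$$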

Note that the amplitude depends on a choice of ordering of the branch points, which corresponds to a choice of canonical basis of cycles on $\Sigma_z$. Let us start by rewriting the amplitude in terms of modular forms.

Proposition 2.1. Let

$$S_k = \sum_{1 \le j_1 < j_2 < \cdots < j_k \le 4} \lambda_{j_1} \cdots \lambda_{j_k} \qquad (2.30)$$

be the elementary symmetric polynomials in the four branch points, and let

$$u(x)\,dx = \frac{i\sqrt{(\lambda_1-\lambda_3)(\lambda_2-\lambda_4)}}{4\sqrt{\sigma(x)}\,K(k)}\,dx \qquad (2.31)$$

be the holomorphic differential. The annulus amplitude can be written as

$$W^{(0,2)}(p_1, p_2) = \left( -\frac{1}{2(x_1-x_2)^2} + \frac{f_0^{(0,2)}(x_1,x_2)}{4\sqrt{\sigma(x_1)\sigma(x_2)}} + \frac{\pi^2}{3}\,u(x_1)E_2(\tau)u(x_2) \right) dx_1\,dx_2, \qquad (2.32)$$

where $\tau$ is the modular parameter, $E_2(\tau)$ is the second Eisenstein series, and the rational function $f_0^{(0,2)}(x_1,x_2)$ reads⁵

$$f_0^{(0,2)}(x_1,x_2) = \frac{1}{3(x_1-x_2)^2}\Big( 6x_1^2x_2^2 - 3x_1x_2(x_1+x_2)S_1 + (x_1^2+4x_1x_2+x_2^2)S_2 - 3(x_1+x_2)S_3 + 6S_4 \Big). \qquad (2.33)$$

5 Recall from Remark 2.1 that we do not write explicitly the dependence on z for simplicity.


Proof. We start with Akemann's formula (2.26). Let us introduce

$$\omega_1 = \frac{2i}{\pi}\,\frac{K(k)}{\sqrt{(\lambda_1-\lambda_3)(\lambda_2-\lambda_4)}}, \qquad e_3 = \frac{1}{12}\big(S_2 - 3(\lambda_1\lambda_2 + \lambda_3\lambda_4)\big). \qquad (2.34)$$

$e_3$ is one of the three roots of the elliptic curve in Weierstrass form. Then, manipulating some of Akhiezer's identities for elliptic integrals [7], we obtain the identity

$$E(k)K(k) = \pi^2\left( \frac{1}{12}E_2(\tau) + \omega_1^2 e_3 \right). \qquad (2.35)$$

From this we rewrite the second term in (2.26) as

$$-\frac{1}{4\sqrt{\sigma(x_1)\sigma(x_2)}}\,\frac{E(k)}{K(k)}\,(\lambda_1-\lambda_3)(\lambda_2-\lambda_4) = \frac{\pi^2}{3}\,u(x_1)E_2(\tau)u(x_2) + \frac{e_3}{\sqrt{\sigma(x_1)\sigma(x_2)}}, \qquad (2.36)$$

using the definition of the holomorphic differential above. By expanding the function $M(x_1,x_2)$ and combining with the $e_3$ term, we can rewrite the other terms of (2.26) in terms of elementary symmetric polynomials of the branch points, and we obtain (2.32).

Remark 2.3. In (2.32), the only term which is not quite modular invariant is the term with $E_2(\tau)$. Since $E_2(\tau)$ is a quasi-modular form of weight 2, and the holomorphic differentials are modular of weight −1, we see explicitly that the annulus amplitude is a quasi-modular form of weight 0, as it should be. The shift in the modular transformation of the annulus amplitude comes, as in the closed case [1], from the shift in the modular transformation of the second Eisenstein series $E_2(\tau)$.
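The shift property of $E_2(\tau)$ invoked in Remark 2.3 can be illustrated numerically. The following is a minimal sketch, not part of the original argument: it assumes the standard normalization $E_2(\tau) = 1 - 24\sum_{n\ge1} n q^n/(1-q^n)$ with $q = e^{2\pi i\tau}$, and checks the standard transformation $E_2(-1/\tau) = \tau^2 E_2(\tau) + 6\tau/(\pi i)$. The function names are hypothetical.

```python
import cmath
import math

def eisenstein_e2(tau, terms=200):
    """E_2(tau) = 1 - 24 * sum_{n>=1} n q^n / (1 - q^n), with q = exp(2 pi i tau).

    Assumes Im(tau) > 0 so that the Lambert series converges.
    """
    q = cmath.exp(2j * math.pi * tau)
    total = 0
    qn = 1
    for n in range(1, terms + 1):
        qn *= q                      # qn = q^n
        total += n * qn / (1 - qn)
    return 1 - 24 * total

def s_transformation_defect(tau):
    """E_2(-1/tau) - tau^2 E_2(tau): the failure of E_2 to be modular of weight 2."""
    return eisenstein_e2(-1 / tau) - tau**2 * eisenstein_e2(tau)

tau = 0.3 + 1.1j
print(s_transformation_defect(tau), 6 * tau / (math.pi * 1j))
```

The two printed numbers agree to numerical precision: the defect is linear in $\tau$, which is the "shift" that makes $E_2$ quasi-modular rather than modular.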

Remark 2.4. Note that the function $f_0^{(0,2)}(x_1,x_2)$ is also rational in z, hence manifestly modular invariant, since it involves only symmetric combinations of the branch points, which are necessarily rational functions of z. The function $f_0^{(0,2)}(x_1,x_2)$ corresponds to the “holomorphic ambiguity” in the integration of the holomorphic anomaly equation for the open amplitudes.

Let us now define

$$G(\tau) = \frac{E_2(\tau)}{3\omega_1^2}, \qquad (2.37)$$

which is a function of z through the definition of $\omega_1$, and depends on a choice of modular parameter $\tau$, but does not depend on the open string variables $x_1$ and $x_2$. The annulus can now be rewritten as

$$W^{(0,2)}(p_1, p_2) = -\frac{dx_1\,dx_2}{2(x_1-x_2)^2} + \frac{G(\tau) + f_0^{(0,2)}(x_1,x_2)}{4\sqrt{\sigma(x_1)\sigma(x_2)}}\,dx_1\,dx_2. \qquad (2.38)$$

The Bergman kernel $B(p_1, p_2)$ is obtained by changing the sign in front of the first term. $G(\tau)$ plays an important role in the following, since it encodes the quasi-modular properties of the amplitudes. As a result, $G(\tau)$ can be expressed as a functional of the period T and its derivative, in which case we will denote it as G[T; z]. The choice of period T corresponds to the choice of modular parameter $\tau$. The exact form of G[T; z] depends on the curve $\Sigma_z$; we will present it for the mirror of local $\mathbb{P}^2$ in the next section. To summarize, we now have an expression for the annulus amplitude in terms of modular forms, which can be rewritten as a functional of the period and its derivatives, using G[T; z]. Let us now study the higher order amplitudes.


2.5.2. Higher amplitudes Now that we have a functional expression for the annulus amplitude (2.38), which is the main ingredient of the recursion, we can derive the principal functional form of the higher genus amplitudes from (2.13).

Lemma 2.1. For g ≥ 0, h ≥ 1, and (g, h) ≠ (0, 1), (0, 2), the general form of the amplitudes is

$$W^{(g,h)}(p_1, \ldots, p_h) = \frac{dx_1 \cdots dx_h}{\Delta^{2g-2+h} \prod_{i=1}^{h} \sqrt{\sigma(x_i)}} \sum_{i=0}^{3g-3+2h} G^i[T; z]\, f_i^{(g,h)}(x_1, \ldots, x_h), \qquad (2.39)$$

where

$$\Delta = \prod_{i<j} (\lambda_i - \lambda_j)^2 \qquad (2.40)$$

is the discriminant of the curve. The functions $f_i^{(g,h)}(x_1, \ldots, x_h)$ are rational in their arguments and in the closed parameter z. Moreover, they have the form

$$f_i^{(g,h)}(x_1, \ldots, x_h) = \frac{Q_i^{(g,h)}(x_1, \ldots, x_h)}{\prod_{j=1}^{h} \sigma(x_j)^{3g-2+h}}, \qquad (2.41)$$

where the $Q_i^{(g,h)}(x_1, \ldots, x_h)$ are polynomials of finite degree in their arguments and in z.

Sketch of the proof. We obtain this general form by close inspection of the recursion (2.13), and using the functional formula (2.38) for the annulus amplitude. Let us simply sketch the main lines of the argument. Let

$$W^{(g,h)}(p_1, \ldots, p_h) = w^{(g,h)}(x_1, \ldots, x_h)\,dx_1 \cdots dx_h. \qquad (2.42)$$

First, it is clear from the definition that the functions

$$\sqrt{\sigma(x_1) \cdots \sigma(x_h)}\; w^{(g,h)}(x_1, \ldots, x_h) \qquad (2.43)$$

are rational in the $x_i$'s, since multiplying by the square roots amounts to cancelling the branch cuts. Second, by pushing down the recursion (2.13) in order to obtain the analogs of Feynman rules, as in Definition 4.5 of [34], we see that each amplitude is represented by a graph with 3g − 3 + 2h edges. Each edge gives a factor of either $B(p, q)$ or $\frac{dE_{q,\bar q}(p)}{\omega(q) - \omega(\bar q)}$. Since both of these factors are polynomials of degree 1 in G[T; z], we obtain that $w^{(g,h)}$ must be a polynomial of order 3g − 3 + 2h in G[T; z]. So what we know so far is that

$$w^{(g,h)}(x_1, \ldots, x_h) = \frac{1}{\prod_{i=1}^{h} \sqrt{\sigma(x_i)}} \sum_{i=0}^{3g-3+2h} G^i[T; z]\, \tilde f_i^{(g,h)}(x_1, \ldots, x_h), \qquad (2.44)$$


where the $\tilde f_i^{(g,h)}(x_1, \ldots, x_h)$ are rational in the $x_i$'s. It is also clear that the $\tilde f_i^{(g,h)}$ are rational in z, since we are summing over branch points, hence the $\tilde f_i^{(g,h)}$ can be expressed in terms of elementary symmetric polynomials in the branch points, which must be rational functions of z. Finally, the denominators of the functions $\tilde f_i^{(g,h)}(x_1, \ldots, x_h)$ can be obtained from the pole structure of the integrand in the recursion (2.13). The analysis is rather subtle, and we leave the details to the reader. Roughly speaking, after taking residues and summing over branch points, each pole of the form $\sigma(x)^{-k}$ contributes a factor of $\Delta^k$ in the denominator, and the double poles of the Bergman kernels combine to give the factors of $\sigma(x)$ in the denominator.

For a particular geometry, by comparing the generic form of the amplitudes (2.39) with the explicit result obtained with the recursion, we can determine the functions $f_i^{(g,h)}(x_1, \ldots, x_h)$ at each genus and number of holes. Once this is done, the main advantage of the functional form of the amplitudes is that the computation of the amplitudes at various points in the moduli space simply amounts to inserting the right period T in the functional. We exemplify this procedure in detail in the next section by studying the mirror of local $\mathbb{P}^2$ at the $\mathbb{C}^3/\mathbb{Z}_3$ orbifold point.

Note that the general form of the amplitudes (2.39) was obtained directly by inspection of the recursion and using the functional formula for the annulus amplitude. Alternatively, it could have been obtained through direct integration of the open version of the holomorphic anomaly equation, in which case the functions $f_0^{(g,h)}(x_1, \ldots, x_h)$ would correspond to the holomorphic ambiguities. This complementary approach sheds new light on the structural constraints of the amplitudes coming directly from modularity; we hope to report on that in future work.

3. Open Orbifold Gromov-Witten Invariants of $\mathbb{C}^3/\mathbb{Z}_3$

In this section we apply our formalism to the study of the mirror to local $\mathbb{P}^2$ at the orbifold point in moduli space.

3.1. Geometry. We consider the geometry described in Example 2.1. $\Sigma \to \mathcal{M}$ is the one-parameter family of genus one Riemann surfaces with three punctures. We choose the following embedding for the fibers,

$$\Sigma_z = \{ y^2 + y(1 + x) + z x^3 = 0 \} \subset (\mathbb{C}^*)^2. \qquad (3.1)$$

We consider the B-model on this geometry, with B-branes wrapping the curve C ⊂ Y as usual. The mirror theory is the A-model on the target space $X = K_{\mathbb{P}^2}$, with a noncompact A-brane wrapping a special Lagrangian submanifold of topology $\mathbb{R}^2 \times S^1$ (see [5,4,16] for a detailed description of these branes). The parameterization of the curve $\Sigma_z$ above corresponds on the A-model side to an “outer brane with zero framing”, in the nomenclature of [4,16]. Unless otherwise specified, all our calculations in this section will be in this parameterization. By mirror symmetry, the closed A-model moduli space is isomorphic to $\mathcal{M}$. It has two patches, which correspond to two phases of the A-model. In each phase, there is a limit


point near which the A-model amplitudes have a convergent expansion, and become the amplitudes of a non-linear sigma model (coupled to two-dimensional gravity). In the first patch, the limit point is the large radius point, which is located at z = 0. The amplitudes expanded near this point become generating functions of Gromov-Witten invariants of X = K P2 . In the second patch, the limit point is the orbifold point, located at z = ∞; a good local coordinate is ψ = z −1/3 . The amplitudes expanded near this point become generating functions of orbifold Gromov-Witten invariants of X = C3 /Z3 . As a result, moving from one patch to the other in M induces a topologically-changing transition of the target space. This analysis also extends to the open sector. In the large radius patch, the amplitudes F (g,h) expanded near the large radius point become generating functions for open Gromov-Witten invariants of (X, L), where L is the special Lagrangian submanifold mirror to C. Open Gromov-Witten invariants are defined in terms of stable maps from bordered Riemann surfaces with Lagrangian boundary conditions [45] — see also [37]. If X admits a U (1) action which fixes L, then the U (1) acts naturally on the space of stable maps, and one can use localization to compute open Gromov-Witten invariants [45]. In the orbifold patch, one expects a similar story to hold, and the amplitudes expanded near the orbifold point should be generating functions for open orbifold Gromov-Witten invariants of (X , L ). Here, L is a Lagrangian submanifold of C3 which is fixed by the Z3 action, hence descends to a Lagrangian submanifold of the orbifold. This Lagrangian L corresponds to the original Lagrangian L at the large radius point, and should exist as a consequence of the A-version of the McKay correspondence for derived categories. 
Therefore, one can consider stable maps from bordered Riemann surfaces to the orbifold C3 /Z3 , in such a way that the boundaries are mapped to L , and construct the corresponding open orbifold Gromov–Witten invariants. One could then follow the approach of [45] in the context of orbifolds, and use localization with respect to a U (1) action to compute open orbifold Gromov-Witten invariants of C3 /Z3 . Such open orbifold Gromov-Witten invariants have not been defined in the mathematical literature yet. However, Renzo Cavalieri informed us that he is presently working on this [21]. In particular, he has managed to compute disk orbifold Gromov-Witten invariants of C3 /Z3 using localization of Z3 -Hodge integrals. His calculation matches perfectly with the results we present in Subsect. 3.3.1, as we explain there. Another useful point of view on these open orbifold Gromov–Witten invariants is to consider the topological string theory near the orbifold point as a perturbed N = 2 orbifold conformal field theory (CFT) coupled to gravity. From this point of view, the open topological string amplitude F (g,h) is a generating function of arbitrary insertions of bulk and boundary operators of the orbifold CFT. In the case of C3 /Z3 there is only one bulk operator O. This is a twist operator which corresponds to a blow-up mode of the orbifold singularity, i.e. to a deformation mode of the closed string modulus. In the presence of Lagrangian boundary conditions specified by L , one also has boundary preserving operators. These operators correspond to the insertion of open string states on the boundaries of the Riemann surface which maps to L , and they are in one-to-one correspondence with H 1 (L , End(E)), where E is an appropriate vector bundle on L [9,55]. Since we have open strings with h boundaries, the most general configuration can be obtained by considering h branes wrapping L . In our case b1 (L ) = 1, therefore there will be h (integrated) boundary operators , = 1, . 
. . , h, corresponding to the h branes wrapping L . We then have


$$F^{(g,h)} = \left\langle \exp\left( T_{orb}\,\mathcal{O} + \sum_{\ell=1}^{h} X_\ell \Psi_\ell \right) \right\rangle_{g,h} = \sum_{j, i_1, \ldots, i_h \ge 0} \frac{1}{j!}\, N^{(g,h)}_{(i_1,\ldots,i_h),j}\, T_{orb}^{\,j}\, X_1^{i_1} \cdots X_h^{i_h}, \qquad (3.2)$$

where

$$N^{(g,h)}_{(i_1,\ldots,i_h),j} = \frac{1}{i_1! \cdots i_h!} \left\langle \mathcal{O}^j\, \Psi_1^{i_1} \cdots \Psi_h^{i_h} \right\rangle_{g,h}, \qquad (3.3)$$

and the vevs are calculated for the twisted N = 2 SCFT of the orbifold coupled to gravity on a Riemann surface $\Sigma_{g,h}$. Here $\Psi_\ell$, $\ell = 1, \ldots, h$, denote the boundary operators. The numbers $N^{(g,h)}_{(i_1,\ldots,i_h),j}$ should be identified with the open orbifold Gromov–Witten invariants. The combinatorial factor $i_1! \cdots i_h!$ is included in the invariant in order to agree with the conventions of Cavalieri for the open orbifold Gromov–Witten invariants to which we will compare our results later on. Our goal in this section is to use mirror symmetry and our B-model recursive formalism to compute generating functions of open orbifold Gromov-Witten invariants of $\mathbb{C}^3/\mathbb{Z}_3$. This can be done in two ways: either by extracting the B-model amplitudes at the orbifold point from the large radius ones using the quasi-modular properties of the amplitudes, or by generating the amplitudes directly at the orbifold point using the functional expressions derived in the previous section. But before doing that, we need to understand the open and closed mirror maps near the orbifold point, in order to map the B-model amplitudes to the A-model amplitudes.

3.2. Open and closed mirror maps.

3.2.1. Closed mirror map The closed mirror map provides a local isomorphism between the closed A- and B-model moduli spaces. One needs to compute the flat coordinate near a given point of $\mathcal{M}$, which is given by a solution of the associated Picard-Fuchs system. Inversion of the flat coordinate gives the closed mirror map. For the case under consideration, one obtains a single Picard-Fuchs equation, which reads

$$\left( \theta^3 + 3z(3\theta + 2)(3\theta + 1)\theta \right) f = 0, \qquad (3.4)$$

with the logarithmic derivative $\theta = z \frac{\partial}{\partial z}$. The constant function f = 1 is always a solution of (3.4). Near z = 0, the two other solutions are

$$T(z) = \log z - 6z + 45z^2 - 560z^3 + \cdots := \log z + \sigma(z),$$
$$T_D(z) = (\log z)^2 + 2\sigma(z)\log z - 18z + \frac{423}{2}z^2 - 2972z^3 + \cdots. \qquad (3.5)$$

T(z) is the flat coordinate in the large radius patch. At z = 0, T(0) → −∞, and the expansion parameter is set to $Q = e^T$. The closed mirror map in this patch, which expresses z in terms of the flat parameter T, is then given by

$$z(Q) = Q + 6Q^2 + 9Q^3 + 56Q^4 + \cdots. \qquad (3.6)$$
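The expansions (3.5) and (3.6) can be reproduced with exact rational arithmetic. The sketch below is an illustration rather than the authors' procedure: it assumes the Picard-Fuchs operator $\theta^3 + 3z\theta(3\theta+1)(3\theta+2)$, from which the coefficients $a_n$ of $\sigma(z)$ obey $n^3 a_n = -3(n-1)(3n-2)(3n-1)\,a_{n-1}$ with $a_1 = -6$, and then inverts $Q = z\,e^{\sigma(z)}$. It also evaluates $e^{\sigma(z)/3}$ at $z(Q)$, which is the ratio $x/X$ in the large radius open mirror map. All helper names are hypothetical.

```python
from fractions import Fraction

ORDER = 5  # keep coefficients of z^0 .. z^4 (and Q^0 .. Q^4)

def mul(a, b):
    """Product of two truncated power series."""
    out = [Fraction(0)] * ORDER
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < ORDER:
                out[i + j] += ai * bj
    return out

def compose(outer, inner):
    """outer(inner(t)) for a series `inner` without constant term."""
    out = [Fraction(0)] * ORDER
    power = [Fraction(1)] + [Fraction(0)] * (ORDER - 1)
    for c in outer:
        out = [o + c * p for o, p in zip(out, power)]
        power = mul(power, inner)
    return out

def exp_series(s):
    """exp(s(t)) for a series s without constant term."""
    inv_factorials = [Fraction(1)]
    for k in range(1, ORDER):
        inv_factorials.append(inv_factorials[-1] / k)
    return compose(inv_factorials, s)

def invert(series):
    """Compositional inverse of t + O(t^2) by fixed-point iteration."""
    identity = [Fraction(0), Fraction(1)] + [Fraction(0)] * (ORDER - 2)
    inv = identity[:]
    for _ in range(ORDER):  # each pass fixes at least one more order
        residual = [c - i for c, i in zip(compose(series, inv), identity)]
        inv = [v - r for v, r in zip(inv, residual)]
    return inv

# sigma(z): power-series part of T(z) = log z + sigma(z)
sigma = [Fraction(0), Fraction(-6)]
for n in range(2, ORDER):
    sigma.append(-Fraction(3 * (n - 1) * (3 * n - 2) * (3 * n - 1), n**3) * sigma[-1])

Q_of_z = [Fraction(0)] + exp_series(sigma)[:ORDER - 1]   # Q = z * exp(sigma(z))
z_of_Q = invert(Q_of_z)                                   # closed mirror map
x_over_X = compose(exp_series([c / 3 for c in sigma]), z_of_Q)

print(z_of_Q)      # coefficients of Q^0 .. Q^4
print(x_over_X)    # coefficients of Q^0 .. Q^4 in x / X
```

Raising `ORDER` extends both expansions; the fixed-point inversion is chosen for brevity, not efficiency.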


In the orbifold patch, the two non-trivial solutions to (3.4) read

$$B_k(\psi) = \frac{(-1)^{k+1}\psi^k}{k}\; {}_3F_2\!\left( \frac{k}{3}, \frac{k}{3}, \frac{k}{3};\, \frac{2k}{3},\, 1 + \frac{k}{3};\, -\left(\frac{\psi}{3}\right)^{3} \right), \qquad (3.7)$$

with k = 1, 2, and we used the local coordinate $\psi = z^{-1/3}$. Using the explicit expansion of the hypergeometric system we get

$$B_k(\psi) = \sum_{n \ge 0} \frac{(-1)^{3n+k+1}\,\psi^{3n+k}}{(3n+k)!} \left( \frac{\Gamma\!\left(n + \frac{k}{3}\right)}{\Gamma\!\left(\frac{k}{3}\right)} \right)^{3}. \qquad (3.8)$$

The flat parameter in this patch reads [1]

$$T_{orb}(\psi) = B_1(\psi), \qquad (3.9)$$

and the dual period is $T_{orb,D}(\psi) = B_2(\psi)$. At ψ = 0, we get $T_{orb}(0) = 0$, hence $T_{orb}$ itself is a good expansion parameter. The closed mirror map reads

$$\psi(T_{orb}) = T_{orb} + \frac{1}{648}T_{orb}^4 - \frac{29}{3674160}T_{orb}^7 + \frac{6607}{71425670400}T_{orb}^{10} + \cdots. \qquad (3.10)$$
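The orbifold mirror map (3.10) follows from inverting the series (3.8) for k = 1. A minimal sketch, under the assumption that $\Gamma(n + k/3)/\Gamma(k/3)$ is evaluated as the rational product $\prod_{m=0}^{n-1}(m + k/3)$; the helper names are hypothetical and the series inversion uses a simple fixed-point iteration.

```python
from fractions import Fraction
from math import factorial

ORDER = 11  # keep powers psi^0 .. psi^10

def mul(a, b):
    """Product of two truncated power series."""
    out = [Fraction(0)] * ORDER
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < ORDER:
                out[i + j] += ai * bj
    return out

def compose(outer, inner):
    """outer(inner(t)) for a series `inner` without constant term."""
    out = [Fraction(0)] * ORDER
    power = [Fraction(1)] + [Fraction(0)] * (ORDER - 1)
    for c in outer:
        out = [o + c * p for o, p in zip(out, power)]
        power = mul(power, inner)
    return out

def invert(series):
    """Compositional inverse of t + O(t^2) by fixed-point iteration."""
    identity = [Fraction(0), Fraction(1)] + [Fraction(0)] * (ORDER - 2)
    inv = identity[:]
    for _ in range(ORDER):
        residual = [c - i for c, i in zip(compose(series, inv), identity)]
        inv = [v - r for v, r in zip(inv, residual)]
    return inv

def orbifold_period(k):
    """Coefficients of B_k(psi) from the series expansion, truncated at ORDER."""
    coeffs = [Fraction(0)] * ORDER
    ratio = Fraction(1)  # Gamma(n + k/3) / Gamma(k/3), starting at n = 0
    n = 0
    while 3 * n + k < ORDER:
        p = 3 * n + k
        coeffs[p] = Fraction((-1) ** (p + 1), factorial(p)) * ratio ** 3
        ratio *= Fraction(3 * n + k, 3)  # multiply by (n + k/3): step n -> n + 1
        n += 1
    return coeffs

T_orb = orbifold_period(1)   # T_orb(psi) = B_1(psi)
psi_of_T = invert(T_orb)     # closed mirror map psi(T_orb)

for power in (1, 4, 7, 10):
    print(power, psi_of_T[power])
```

Multiplying the coefficient of $T_{orb}^{\,j}$ by $j!$ gives 1/27, −29/729 and 6607/19683 for j = 4, 7, 10, the same rational numbers that appear as nonzero disk invariants at i = 1.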

3.2.2. The open mirror map The open mirror map extends the isomorphism to the open sector, which in the case under consideration is the fiber of the moduli space $\Sigma \to \mathcal{M}$. Again, one needs to determine the open flat coordinate, which is a solution of the extended Picard-Fuchs system, as derived in [47,48]. The open mirror map is given by inverting this open flat coordinate. We refer the reader to [16,47,48] for a detailed explanation of the extended Picard-Fuchs system. In the large radius patch, it was shown in [4,47] that the open flat coordinate is given by

$$X(x, z) = x\, e^{\frac{1}{3}(\log z - T(z))}, \qquad (3.11)$$

where x is the local coordinate on $\Sigma_z$, and T(z) is the closed flat coordinate. Note that X(x, z) is monodromy-invariant under $z \to e^{2\pi i} z$. At (x, z) = (0, 0), we have X → 0, hence it is a good expansion parameter. The open mirror map becomes

$$x(Q, X) = X(1 - 2Q + 5Q^2 - 32Q^3 + \ldots). \qquad (3.12)$$

In the orbifold patch, we argued in [16] that the open flat coordinate must be given by

$$X_{orb}(x, \psi) = x z^{1/3} = x\psi^{-1}. \qquad (3.13)$$

The argument requires that the disk amplitude, when expressed in flat coordinates, be monodromy-invariant under the $\mathbb{Z}_3$ orbifold monodromy $\psi \to e^{2\pi i/3}\psi$, which fixes the open flat coordinate uniquely, up to scale. The open mirror map simply becomes

$$x(X_{orb}, T_{orb}) = X_{orb}\, \psi(T_{orb}), \qquad (3.14)$$

where $\psi(T_{orb})$ is the closed mirror map.

Table 3.1. Some invariants $N^{(0,1)}_{i,j}$ for the orbifold disk amplitude of $\mathbb{C}^3/\mathbb{Z}_3$ at zero framing

| j \ i | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | −1/3 | 0 | 0 | −1/4 | 0 | 0 | −10/27 | 0 |
| 1 | 1 | 0 | 0 | 1/2 | 0 | 0 | 6/7 | 0 | 0 | 2 |
| 2 | 0 | −1/2 | 0 | 0 | −5/6 | 0 | 0 | −15/4 | 0 | 0 |
| 3 | 0 | 0 | 2/3 | 0 | 0 | 4 | 0 | 0 | 20 | 0 |
| 4 | 1/27 | 0 | 0 | −40/27 | 0 | 0 | −154/9 | 0 | 0 | −3400/27 |
| 5 | 0 | −5/54 | 0 | 0 | 206/45 | 0 | 0 | 3215/36 | 0 | 0 |
| 6 | 0 | 0 | 10/27 | 0 | 0 | −160/9 | 0 | 0 | −4940/9 | 0 |
| 7 | −29/729 | 0 | 0 | −1432/729 | 0 | 0 | 19586/243 | 0 | 0 | 2820200/729 |
| 8 | 0 | 197/1458 | 0 | 0 | 15514/1215 | 0 | 0 | −384575/972 | 0 | 0 |
| 9 | 0 | 0 | −2/3 | 0 | 0 | −292/3 | 0 | 0 | 5540/3 | 0 |
| 10 | 6607/19683 | 0 | 0 | 80456/19683 | 0 | 0 | 5544602/6561 | 0 | 0 | −90503800/19683 |
| 11 | 0 | −63107/39366 | 0 | 0 | −945934/32805 | 0 | 0 | −214690135/26244 | 0 | 0 |
| 12 | 0 | 0 | 8074/729 | 0 | 0 | 53768/243 | 0 | 0 | 21092500/243 | 0 |
| 13 | −4736087/531441 | 0 | 0 | −51705832/531441 | 0 | 0 | −307254682/177147 | 0 | 0 | −528718078600/531441 |
| 14 | 0 | 58248455/1062882 | 0 | 0 | 906117742/885735 | 0 | 0 | 8720423035/708588 | 0 | 0 |

3.3. Quasi-modular transformations. Let us start by computing the amplitudes explicitly, using the quasi-modular transformation of the amplitudes from large radius to the orbifold point.

3.3.1. Disk amplitude For completeness, let us review the calculation of the orbifold disk amplitude, which was done in [16]. Recall that the disk amplitude is simply given by the Abel-Jacobi map

$$F^{(0,1)} = \int \log(y(x))\, \frac{dx}{x}, \qquad (3.15)$$

up to classical terms. y(x) is obtained by solving the curve $\Sigma_z$ and keeping the relevant branch:

$$y(x) = \frac{1}{2}\left( 1 + x + \sqrt{(1 + x)^2 - 4 z x^3} \right). \qquad (3.16)$$

We want to expand the Abel-Jacobi map at the orbifold point. Note that the open mirror map (3.14) is linear in ψ. Hence we must plug in the open mirror map before expanding in the closed coordinate to get a meaningful expansion. This being done, we get the orbifold disk amplitude

$$F_{orb}^{(0,1)} = \sum_{i,j} \frac{1}{j!}\, N^{(0,1)}_{i,j}\, X_{orb}^{\,i}\, T_{orb}^{\,j}, \qquad (3.17)$$

with the invariants given in Table 3.1.

3.3.2. Incorporating framing As we mentioned earlier, the calculation above was done for an outer brane with zero framing. However, for the orbifold disk amplitude the calculation can be easily generalized to arbitrary framing. From a Gromov-Witten point of view, framing corresponds to a choice of torus action in the localization process.

Table 3.2. Some invariants $N^{(0,1)}_{i,j}$ for the orbifold disk amplitude of $\mathbb{C}^3/\mathbb{Z}_3$ at general framing f

| j \ i | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 0 | 0 | 0 | −1/3 | 0 |
| 1 | 1 | 0 | 0 | f + 1/2 |
| 2 | 0 | −f − 1/2 | 0 | 0 |
| 3 | 0 | 0 | 3f^2 + 3f + 2/3 | 0 |
| 4 | 1/27 | 0 | 0 | −8(54f^3 + 81f^2 + 37f + 5)/27 |
| 5 | 0 | −5(2f + 1)/54 | 0 | 0 |
| 6 | 0 | 0 | 5(9f^2 + 9f + 2)/27 | 0 |
| 7 | −29/729 | 0 | 0 | −8(1890f^3 + 2835f^2 + 1303f + 179)/729 |
| 8 | 0 | 197(2f + 1)/1458 | 0 | 0 |
| 9 | 0 | 0 | −(9f^2 + 9f + 2)/3 | 0 |
| 10 | 6607/19683 | 0 | 0 | 8(102870f^3 + 154305f^2 + 71549f + 10057)/19683 |
| 11 | 0 | −63107(2f + 1)/39366 | 0 | 0 |
| 12 | 0 | 0 | 4037(9f^2 + 9f + 2)/729 | 0 |
| 13 | −4736087/531441 | 0 | 0 | −8(65783718f^3 + 98675577f^2 + 45818317f + 6463229)/531441 |
| 14 | 0 | 58248455(2f + 1)/1062882 | 0 | 0 |

Hence the calculation at arbitrary framing is relevant for comparison with localization computations in Gromov-Witten theory. Recall from [16] that a framing transformation of the brane is given by reparameterizing the embedding of the fibers $\Sigma_z$ in $(\mathbb{C}^*)^2$ by

$$(x_f, y_f) = (x y^f, y), \qquad (3.18)$$

where $(x_f, y_f)$ are the new coordinates, and $f \in \mathbb{Z}$ is the framing. In particular, the embedding of $\Sigma_z$ becomes

$$\Sigma_z = \{ y_f^{3f+2} + y_f^{3f+1} + x_f\, y_f^{2f+1} + z x_f^3 = 0 \} \subset (\mathbb{C}^*)^2. \qquad (3.19)$$

We compute the disk amplitude for this curve as

$$F_f^{(0,1)} = \int \log(y_f(x_f))\, \frac{dx_f}{x_f}, \qquad (3.20)$$

where the function $y_f(x_f)$ is obtained by solving (3.19) for $x_f$ (as a series expansion). Plugging in the mirror map, we obtain the invariants presented in Table 3.2, for general framing f. As we mentioned already, the framing f is correlated to the choice of torus weights for localization of the Hodge integrals in Gromov–Witten theory. Renzo Cavalieri has implemented the Hodge integral calculation for the disk amplitude of $\mathbb{C}^3/\mathbb{Z}_3$ [21]. It turns out that the most natural choice of torus weights in Gromov–Witten theory does not correspond to f = 0, but rather to f = −2/3 (or f = −1/3). To ease comparisons, we present in Table 3.3 the disk invariants for f = −2/3. Rather amazingly, these invariants are precisely equal to the invariants computed by Cavalieri in orbifold Gromov–Witten theory! Since the Gromov–Witten calculation is done on the A-model side, this comparison also shows that our choice of open orbifold mirror map (3.14), which was argued in [16] from monodromy considerations, is correct, including the scale.
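The j = 0 rows of the disk tables can be cross-checked in closed form. The following sketch rests on assumptions that are not spelled out in the text: at ψ = 0 with $X_{orb}$ held fixed one has $x \to 0$ and $z x^3 = X_{orb}^3$, so the branch (3.16) satisfies $X_{orb}^3 = y(1 - y)$; after the framing transformation (3.18) this becomes $X_f^3 = y^{3f+1}(1 - y)$. Lagrange inversion then gives the coefficient of $X_f^{3n}$ in $\log y$ as $-\frac{1}{n}\binom{n(3f+2)-1}{n-1}$, and integrating against $dX_f/X_f$ divides by $3n$. The function names are hypothetical.

```python
from fractions import Fraction

def generalized_binomial(a, k):
    """binom(a, k) for rational a and integer k >= 0."""
    out = Fraction(1)
    for m in range(k):
        out = out * (a - m) / (m + 1)
    return out

def disk_invariant_at_orbifold_point(f, n):
    """Coefficient of X^(3n) T^0 in the framed disk amplitude (derived sketch)."""
    f = Fraction(f)
    # coefficient of X_f^(3n) in log y, from Lagrange inversion
    log_y_coeff = -generalized_binomial(n * (3 * f + 2) - 1, n - 1) / n
    # integrating log(y) dX/X divides the X^(3n) coefficient by 3n
    return log_y_coeff / (3 * n)

zero_framing = [disk_invariant_at_orbifold_point(0, n) for n in (1, 2, 3)]
framing_minus_two_thirds = [
    disk_invariant_at_orbifold_point(Fraction(-2, 3), n) for n in (1, 2, 3)
]
print(zero_framing)
print(framing_minus_two_thirds)
```

Under these assumptions the values at i = 3, 6, 9 are −1/3, −1/4, −10/27 for f = 0 and −1/3, 1/12, −1/27 for f = −2/3, and the i = 3 entry does not depend on f.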

Table 3.3. Some invariants $N^{(0,1)}_{i,j}$ for the orbifold disk amplitude of $\mathbb{C}^3/\mathbb{Z}_3$ at framing f = −2/3

| j \ i | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | −1/3 | 0 | 0 | 1/12 | 0 | 0 | −1/27 | 0 |
| 1 | 1 | 0 | 0 | −1/6 | 0 | 0 | 5/63 | 0 | 0 | −4/81 |
| 2 | 0 | 1/6 | 0 | 0 | −4/45 | 0 | 0 | 7/108 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 1/27 | 0 | 0 | −8/81 | 0 | 0 | 35/243 | 0 | 0 | −400/2187 |
| 5 | 0 | 5/162 | 0 | 0 | −188/1215 | 0 | 0 | 875/2916 | 0 | 0 |
| 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | −29/729 | 0 | 0 | −248/2187 | 0 | 0 | 5705/6561 | 0 | 0 | −146800/59049 |
| 8 | 0 | −197/4374 | 0 | 0 | −10972/32805 | 0 | 0 | 221221/78732 | 0 | 0 |
| 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 10 | 6607/19683 | 0 | 0 | 10984/59049 | 0 | 0 | 889805/177147 | 0 | 0 | −74714800/1594323 |
| 11 | 0 | 63107/118098 | 0 | 0 | 385132/885735 | 0 | 0 | 51307949/2125764 | 0 | 0 |
| 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 13 | −4736087/531441 | 0 | 0 | −6768584/1594323 | 0 | 0 | 17027675/4782969 | 0 | 0 | −33798787600/43046721 |
| 14 | 0 | −58248455/3188646 | 0 | 0 | −381155716/23914845 | 0 | 0 | 3576521095/57395628 | 0 | 0 |

It may seem however odd to assign a non-integral value to f ; it would be interesting to understand this issue better. Presumably, the denominator of 3 comes from the orbifold Z3 action at the orbifold point — indeed, framing has so far only been interpreted from a large radius point of view in topological strings. Note however that non-integral framings have already been considered, although in a different context [27].

3.3.3. Annulus amplitude We now want to compute the annulus amplitude, which is slightly more complicated, since it has non-trivial modular properties and transforms with a shift. More precisely, recall from (2.24) that the annulus transforms as

$$W_{orb}^{(0,2)}(p_1, p_2) = W^{(0,2)}(p_1, p_2) - 2\pi i\, u(p_1)\,(C\tau + D)^{-1} C\, u(p_2), \qquad (3.21)$$

where W (0,2) ( p1 , p2 ) is the large radius annulus amplitude and (Cτ + D)−1 C comes from the modular transformation of the period matrix τ from large radius to the orbifold. The first step consists then in computing the annulus amplitude at large radius, using Akemann’s formula (2.26). This was done in [16,49], and we will not repeat the calculation here. What we need to do however is to analytically continue this result to the orbifold point, to obtain the first term on the right hand side of (3.21). The analytic continuation can be done directly in Akemann’s formula, by expanding the branch points around ψ = 0. However, it is important to note that as for the disk amplitude, we must write things in terms of the open flat coordinates X 1 and X 2 — henceforth we will drop the subscript or b — before expanding in ψ, since the open mirror map is linear in ψ. After using a few identities involving elliptic functions and -functions, we obtain the following analytic continuation of the large radius annulus amplitude to the orbifold point, in orbifold flat coordinates X 1 and X 2 :

W

(0,2)

√ 6 12 18 81 ψ ( 23 ) 3 ( 23 ) 243 3 ( 23 ) 1 2 = dX 1 dX 2 − +ψ − 8 π3 64 π 6 18 512 π 9 √ √ 6 12 6 81 ψ 2 ( 23 ) 9 3 ψ 2 ( 23 ) 9 3 ψ ( 23 ) + + 1+ X 1 + −2ψ − X 12 8 π3 64 π 6 8 π3 √ 6 12 9 3 ψ ( 23 ) 81 ψ 2 ( 23 ) + 1+ + 8 π3 64 π 6 √ 6 9 3 ψ 2 ( 23 ) + −3 ψ − X 1 + 5 ψ 2 X 12 X 2 8 π3 √ 6 9 3 ψ 2 ( 23 ) 2 2 2 + −2 ψ − + 5 ψ X1 + 3 X1 (3.22) X2 + · · · . 8 π3 −9

√

One can see that it is not rational, as expected; the non-rational terms should be cancelled by the shift in (3.21). The next step is to compute the modular transformation between the large radius periods $(T, T_D)$ and the orbifold periods $(T_{orb}, T_{orb,D})$. This can be done by standard analytic continuation, as in [1]. Define

$$c_1 = -\frac{1}{2\pi i}\,\frac{\Gamma(1/3)}{\Gamma(2/3)^2}, \qquad c_2 = \frac{1}{2\pi i}\,\frac{\Gamma(2/3)}{\Gamma(1/3)^2}, \qquad \beta = \frac{1}{(2\pi i)^3}, \qquad \omega = e^{2\pi i/3}. \qquad (3.23)$$

We get the transformation

$$\begin{pmatrix} T_D \\ T \\ 1 \end{pmatrix} = \begin{pmatrix} \dfrac{\beta\omega^2}{c_1} & \dfrac{\beta\omega}{c_2} & \dfrac{1}{3} \\ -c_2 & c_1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} T_{orb,D} \\ T_{orb} \\ 1 \end{pmatrix}. \qquad (3.24)$$

Note that this transformation is not quite symplectic, since its determinant is −β; that is, it changes the scale of the symplectic form. However, this can be taken into account by renormalizing the string coupling constant, as in [1]. Now the modular transformation that we want is given by the inverse of this matrix. We get that

$$C = -\frac{c_2}{\beta}, \qquad D = -\frac{\omega^2}{c_1}. \qquad (3.25)$$
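The determinant and the entries C and D quoted above can be checked numerically. This is a minimal sketch assuming the constants (3.23) and the 2×2 block of the period matrix acting on $(T_{orb,D}, T_{orb})$, with rows $(\beta\omega^2/c_1,\ \beta\omega/c_2)$ and $(-c_2,\ c_1)$; the variable names are hypothetical.

```python
import cmath
from math import gamma, pi

omega = cmath.exp(2j * pi / 3)
c1 = -gamma(1 / 3) / (2j * pi * gamma(2 / 3) ** 2)
c2 = gamma(2 / 3) / (2j * pi * gamma(1 / 3) ** 2)
beta = 1 / (2j * pi) ** 3

# 2x2 block relating (T_D, T) to (T_orb_D, T_orb); the constant column is ignored
p, q = beta * omega**2 / c1, beta * omega / c2
r, s = -c2, c1

det = p * s - q * r                 # beta * (omega^2 + omega) = -beta
inverse = [[s / det, -q / det],
           [-r / det, p / det]]
C, D = inverse[1]                   # bottom row of the inverse matrix

print(det, -beta)
print(C, -c2 / beta)
print(D, -omega**2 / c1)
```

The determinant equals −β because $1 + \omega + \omega^2 = 0$, and the bottom row of the inverse reproduces the quoted C and D.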

We also need the large radius period matrix τ, analytically continued around ψ = 0. By definition, it is given by

$$\tau(\psi) = \frac{\partial T_D}{\partial T} = \frac{\partial T_D/\partial\psi}{\partial T/\partial\psi}. \qquad (3.26)$$

Using the transformation above between TD , T and Tor b,D , Tor b , and expanding around ψ = 0, we get 1

1

2

7

i ψ 2 ( 23 ) i 2 3 ψ ( 23 ) (−1) 6 τ (ψ) = − √ − − + O(ψ 3 ). 2 5 3 ( 16 ) 2π ( 13 )

(3.27)

Table 3.4. Some invariants $N^{(0,2)}_{(i_1,i_2),j}$ for the orbifold annulus amplitude of $\mathbb{C}^3/\mathbb{Z}_3$

| j \ (i_1,i_2) | (1,1) | (2,1) | (3,1) | (2,2) | (4,1) | (3,2) | (5,1) | (4,2) | (3,3) |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1/2 | 0 | 0 | 0 | 0 | 3/5 | 1/2 | 1/3 |
| 1 | 0 | 0 | −2/3 | −3/4 | 0 | 0 | 0 | 0 | 0 |
| 2 | 1/9 | 0 | 0 | 0 | 14/9 | 5/3 | 0 | 0 | 0 |
| 3 | 0 | −1/6 | 0 | 0 | 0 | 0 | −26/5 | −16/3 | −16/3 |
| 4 | 0 | 0 | 34/81 | 11/36 | 0 | 0 | 0 | 0 | 0 |
| 5 | −1/243 | 0 | 0 | 0 | −338/243 | −65/81 | 0 | 0 | 0 |
| 6 | 0 | −1/54 | 0 | 0 | 0 | 0 | 238/45 | 56/27 | 40/27 |
| 7 | 0 | 0 | 562/2187 | 197/972 | 0 | 0 | 0 | 0 | 0 |
| 8 | 391/6561 | 0 | 0 | 0 | −17206/6561 | −4261/2187 | 0 | 0 | 0 |
| 9 | 0 | −29/162 | 0 | 0 | 0 | 0 | 3614/135 | 1552/81 | 160/9 |
| 10 | 0 | 0 | 31606/59049 | 8333/26244 | 0 | 0 | 0 | 0 | 0 |
| 11 | −225595/177147 | 0 | 0 | 0 | 30802/177147 | 158125/59049 | 0 | 0 | 0 |
| 12 | 0 | 8455/1458 | 0 | 0 | 0 | 0 | −44338/1215 | −48104/729 | −52712/729 |
| 13 | 0 | 0 | −49954466/1594323 | −15072793/708588 | 0 | 0 | 0 | 0 | 0 |
| 14 | 301065409/4782969 | 0 | 0 | 0 | 712334462/4782969 | 8347925/1594323 | 0 | 0 | 0 |

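Finally, we can compute the holomorphic differential u(p) from the standard formula (2.31). Putting all this together, and integrating, we obtain the orbifold annulus amplitude in flat orbifold coordinates $X_1$, $X_2$, $T_{orb}$,

$$F_{orb}^{(0,2)} = \sum_{i_1, i_2, j} \frac{1}{j!}\, N^{(0,2)}_{(i_1,i_2),j}\, X_1^{i_1} X_2^{i_2}\, T_{orb}^{\,j}, \qquad (3.28)$$

with the invariants $N^{(0,2)}_{(i_1,i_2),j}$ given in Table 3.4; the invariants are symmetric in $(i_1, i_2)$. The invariants are rational, as they should be. Moreover, it is easy to see that the amplitude is invariant under the $\mathbb{Z}_3$ orbifold monodromy. Indeed, the orbifold monodromy is given by

$$(T_{orb}, X_1, X_2) \mapsto (\omega T_{orb}, \omega^2 X_1, \omega^2 X_2), \qquad \omega = e^{2\pi i/3}. \qquad (3.29)$$

A monomial $X_1^{i_1} X_2^{i_2} T_{orb}^{\,j}$ therefore acquires the phase $\omega^{j + 2(i_1 + i_2)}$, which is trivial precisely when $j \equiv i_1 + i_2 \pmod 3$; these are the only entries that are nonzero in Table 3.4.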

Thus all terms in the expansion above are monodromy invariant: the monomial $X_1^{i_1} X_2^{i_2} T_{orb}^{\,j}$ acquires the phase $\omega^{\,j+2(i_1+i_2)}$, and every non-vanishing invariant in Table 3.4 has $j + 2(i_1+i_2) \equiv 0 \pmod 3$. In Table 3.5 we also present some results for the corresponding framed invariants.

3.3.4. Higher amplitudes. Computing the higher amplitudes directly using the modular shift is rather complicated, partly because of all the elliptic functions involved in the calculation. It is much simpler to use the functional expressions to compute the orbifold amplitudes. We have, however, checked that the genus 0, three-hole amplitude computed through the shift also matches the functional calculation; for brevity we do not present that calculation here.
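The monodromy selection rule implied by (3.29) can be checked mechanically. The following is a minimal sketch, not part of the original computation: the function names are ours, and the sample entries are a handful of non-vanishing invariants transcribed from Table 3.4.

```python
from fractions import Fraction


def monodromy_exponent(i1: int, i2: int, j: int) -> int:
    """Power of omega (mod 3) picked up by X1^i1 * X2^i2 * T_orb^j under (3.29).

    T_orb -> omega * T_orb contributes j, and each X -> omega^2 * X contributes 2.
    """
    return (j + 2 * (i1 + i2)) % 3


# A few non-vanishing entries of Table 3.4, keyed by (i1, i2, j).
SAMPLE_INVARIANTS = {
    (1, 1, 2): Fraction(1, 9),
    (2, 1, 0): Fraction(1, 2),
    (3, 1, 1): Fraction(-2, 3),
    (2, 2, 4): Fraction(11, 36),
    (4, 1, 5): Fraction(-338, 243),
    (3, 3, 12): Fraction(-52712, 729),
}


def all_invariant(entries) -> bool:
    """True if every listed monomial is invariant under the Z_3 monodromy."""
    return all(monodromy_exponent(i1, i2, j) == 0 for (i1, i2, j) in entries)


if __name__ == "__main__":
    print(all_invariant(SAMPLE_INVARIANTS))
```

The same rule explains the pattern of zeros in the table: for fixed $(i_1, i_2)$ only every third value of $j$ can contribute.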

3.4. Calculation using the functionals. Let us now use the functional expressions for the amplitude derived in the previous section to compute the open orbifold amplitudes. First, we need to specify what the functional G[T ; z] is for the curve z given by (3.1).

Table 3.5. Some invariants $N^{(0,2)}_{(i_1,i_2),j}$ for the framed orbifold annulus amplitude of $\mathbb{C}^3/\mathbb{Z}_3$

   j   (1,1)                        (2,1)
   0   0                            1/2 + f
   1   0                            0
   2   1/9 + f(f + 1)               0
   3   0                            −1/6 − (1/3) f (12f^2 + 18f + 7)
   4   0                            0
   5   −1/243 + 5f(1 + f)/27        0
   6   0                            −1/54 − f(31 + 90f + 60f^2)/27

   j   (3,1)                                              (2,2)
   0   0                                                  0
   1   −2/3 + f(2f − 1)                                   −3/4 + f(2f − 1)
   2   0                                                  0
   3   0                                                  0
   4   34/81 − f(2646f^3 + 2592f^2 + 538f − 53)/27        11/36 − f(2727f^3 + 2754f^2 + 637f − 35)/27
   5   0                                                  0
   6   0                                                  0

3.4.1. Generalities. First, from the embedding of the elliptic curve (3.1), we obtain

$$\sigma(x) = (x + 1)^2 - 4x^3 z, \tag{3.30}$$

and the discriminant

$$\Delta = 1 + 27z. \tag{3.31}$$
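One way to see where the factor $1 + 27z$ comes from is to compute the ordinary polynomial discriminant of the cubic $\sigma(x)$ in $x$. This is an illustrative check under our own assumptions (the helper names are hypothetical, and we do not claim this is how the authors define $\Delta$): with exact rational arithmetic one finds $\mathrm{disc}_x\,\sigma = -16\,z\,(1 + 27z)$, which vanishes exactly when $\Delta = 0$.

```python
from fractions import Fraction


def cubic_discriminant(a, b, c, d):
    """Discriminant of the cubic a*x^3 + b*x^2 + c*x + d."""
    return (b * b * c * c - 4 * a * c ** 3 - 4 * b ** 3 * d
            - 27 * a * a * d * d + 18 * a * b * c * d)


def sigma_discriminant(z):
    """x-discriminant of sigma(x) = (x + 1)^2 - 4 x^3 z = -4z x^3 + x^2 + 2x + 1."""
    return cubic_discriminant(-4 * z, 1, 2, 1)


def delta(z):
    """The discriminant factor of (3.31)."""
    return 1 + 27 * z


if __name__ == "__main__":
    z = Fraction(1, 5)
    print(sigma_discriminant(z), -16 * z * delta(z))
```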

We claim that the functional $G[T; z]$ reads

$$G[T; z] = -\frac{1}{z^2 C_{zzz}}\,\frac{\partial}{\partial z}\left(4\log\frac{\partial T}{\partial z} + \frac{1}{3}\log\Delta + 5\log z\right), \tag{3.32}$$

where C zzz =

∂ 3 F (0) 3 = 3 ∂z 3 z

(3.33)

is the Yukawa coupling in the local variable $z$. Let us sketch the derivation of this functional formula. The genus one amplitude $F^{(1)}$ was defined in Definition 2.5. In our context, one can show that (2.15) becomes, up to a constant term,

$$F^{(1)} = -\log\eta(\tau) - \frac{1}{24}\log\tilde\Delta(\psi), \tag{3.34}$$

where $\tilde\Delta(\psi) = 27 + \psi^3$ is the discriminant in terms of $\psi = z^{-1/3}$, and $\eta(\tau)$ is the Dedekind $\eta$-function. Alternatively, $F^{(1)}$ can also be expressed as [13]

$$F^{(1)} = -\frac{1}{2}\log\frac{\partial T}{\partial\psi} - \frac{1}{12}\log\tilde\Delta(\psi). \tag{3.35}$$

Combining the two formulae, we obtain

$$\log\eta(\tau) = \frac{1}{2}\log\frac{\partial T}{\partial\psi} + \frac{1}{24}\log\tilde\Delta(\psi). \tag{3.36}$$

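Now the second Eisenstein series $E_2(\tau)$ is related to the Dedekind $\eta$-function by:

$$E_2(\tau) = 24\,\frac{d}{d\tau}\log\eta(\tau). \tag{3.37}$$

As a result, we get

$$E_2(\tau) = \frac{\partial}{\partial\tau}\left(12\log\frac{\partial T}{\partial\psi} + \log\tilde\Delta(\psi)\right). \tag{3.38}$$

Using the fact that

$$\tau = \frac{\partial^2 F^{(0)}}{\partial T^2}, \qquad \tilde\Delta(z^{-1/3}) = \frac{\Delta}{z}, \tag{3.39}$$

we obtain

$$E_2(\tau) = \left(\frac{\partial T}{\partial z}\right)^{2}\frac{1}{C_{zzz}}\,\frac{\partial}{\partial z}\left(12\log\frac{\partial T}{\partial z} + \log\Delta + 15\log z\right). \tag{3.40}$$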
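The passage from (3.38) to (3.40) is a change of variables. The following sketch spells it out under one assumption that is ours rather than stated in the text: that the Yukawa coupling transforms as a rank-three tensor, $C_{zzz} = (\partial T/\partial z)^3\,\partial^3 F^{(0)}/\partial T^3$. Additive constants inside the logarithms are dropped, since they are annihilated by the derivative.

```latex
\begin{align*}
\frac{\partial\tau}{\partial z}
  &= \frac{\partial T}{\partial z}\,\frac{\partial^3 F^{(0)}}{\partial T^3}
   = C_{zzz}\left(\frac{\partial T}{\partial z}\right)^{-2}
  \quad\Longrightarrow\quad
  \frac{\partial}{\partial\tau}
   = \left(\frac{\partial T}{\partial z}\right)^{2}\frac{1}{C_{zzz}}\,
     \frac{\partial}{\partial z},\\[4pt]
z = \psi^{-3}:\quad
\frac{\partial T}{\partial\psi}
  &= -3\,z^{4/3}\,\frac{\partial T}{\partial z}
  \quad\Longrightarrow\quad
  12\log\frac{\partial T}{\partial\psi}
   = 12\log\frac{\partial T}{\partial z} + 16\log z + \text{const},\\[4pt]
\log\tilde\Delta(\psi)
  &= \log\frac{\Delta}{z} = \log\Delta - \log z,\\[4pt]
12\log\frac{\partial T}{\partial\psi} + \log\tilde\Delta(\psi)
  &= 12\log\frac{\partial T}{\partial z} + \log\Delta + 15\log z + \text{const}.
\end{align*}
```

Dividing (3.40) by $3\omega_1^2 = -3z^2(\partial T/\partial z)^2$, as prescribed by (3.41) and (3.42), then reproduces the coefficients $4$, $\tfrac13$ and $5$ appearing in (3.32).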

Finally, recall that $G[T; z]$ is defined by

$$G[T; z] = \frac{E_2(\tau)}{3\omega_1^2}. \tag{3.41}$$

By direct computation, we can write $\omega_1$ as a functional of $T$ and $z$,

$$\omega_1 = i z\,\frac{\partial T}{\partial z}, \tag{3.42}$$

and we obtain the final formula for $G[T; z]$ given in (3.32). With this explicit formula for the functional $G[T; z]$, we can proceed with the calculation of the higher amplitudes, using our ansatz (2.39). As explained previously, to compute the amplitudes at the orbifold point, all that we need to do is to input the period $T_{orb}$ corresponding to the flat parameter at the orbifold point in the functional.

3.4.2. Annulus amplitude. The functional expression for the annulus amplitude was obtained in (2.38), using the expression (3.32) for $G[T; z]$. For the curve $\Sigma_z$ under consideration, the rational function $f_0^{(0,2)}(x_1, x_2)$ can be computed, and reads

$$f_0^{(0,2)}(x_1, x_2) = \frac{6 + 6x_2 + x_2^2 + x_1^2(1 - 12x_2 z) + x_1(6 + 4x_2 - 12x_2^2 z)}{3(x_1 - x_2)^2}. \tag{3.43}$$
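Two simple consistency checks on (3.43) can be run with exact arithmetic. The sketch below (function names are ours) verifies that $f_0^{(0,2)}$ is symmetric under $x_1 \leftrightarrow x_2$, as it must be for the invariants to be symmetric in $(i_1, i_2)$, and that on the diagonal $x_1 = x_2 = x$ the numerator collapses to $6\,\sigma(x)$ with $\sigma$ as in (3.30).

```python
from fractions import Fraction


def sigma(x, z):
    """sigma(x) of (3.30)."""
    return (x + 1) ** 2 - 4 * x ** 3 * z


def f0_numerator(x1, x2, z):
    """Numerator of f_0^{(0,2)}(x1, x2) in (3.43)."""
    return (6 + 6 * x2 + x2 ** 2
            + x1 ** 2 * (1 - 12 * x2 * z)
            + x1 * (6 + 4 * x2 - 12 * x2 ** 2 * z))


def f0_annulus(x1, x2, z):
    """f_0^{(0,2)}(x1, x2) of (3.43); requires x1 != x2."""
    return f0_numerator(x1, x2, z) / (3 * (x1 - x2) ** 2)


if __name__ == "__main__":
    x1, x2, z = Fraction(1, 2), Fraction(-3, 7), Fraction(2, 9)
    print(f0_annulus(x1, x2, z) == f0_annulus(x2, x1, z))
    print(f0_numerator(x1, x1, z) == 6 * sigma(x1, z))
```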

To obtain the orbifold annulus amplitude, one changes variable to $z = \psi^{-3}$, replaces the open moduli $x_1$ and $x_2$ by the open orbifold mirror map $x_{1,2} = X_{1,2}\,\psi$, where $X_1$ and $X_2$ are the open flat coordinates, and inserts the closed flat orbifold coordinate $T = T_{orb}(\psi)$ in the functional. Then, we plug in the closed mirror map in the result and expand in $X_1$, $X_2$ and $T_{orb}$ to obtain the orbifold annulus amplitude. It is easy to show that we obtain precisely (3.28) with the invariants of Table 3.4; note, however, how much simpler this calculation is.

Table 3.6. Some invariants $N^{(1,1)}_{i,j}$ for the genus 1, one-hole orbifold amplitude of $\mathbb{C}^3/\mathbb{Z}_3$

   j   i = 1                  i = 2                     i = 3
   0   0                      0                         5/24
   1   1/72                   0                         0
   2   0                      −1/36                     0
   3   0                      0                         1/12
   4   1/1944                 0                         0
   5   0                      −11/972                   0
   6   0                      0                         31/324
   7   475/52488              0                         0
   8   0                      −223/26244                0
   9   0                      0                         −1/12
  10   −395585/1417176        0                         0
  11   0                      712639/708588             0
  12   0                      0                         −38945/8748
  13   640118305/38263752     0                         0
  14   0                      −1726238977/19131876      0

   j   i = 4                  i = 5                       i = 6
   0   0                      0                           11/8
   1   −5/9                   0                           0
   2   0                      25/12                       0
   3   0                      0                           −10
   4   −86/243                0                           0
   5   0                      3301/1620                   0
   6   0                      0                           −412/27
   7   −5210/6561             0                           0
   8   0                      307847/43740                0
   9   0                      0                           −610/9
  10   172678/177147          0                           0
  11   0                      −5242661/1180980            0
  12   0                      0                           −62488/729
  13   133378114/4782969      0                           0
  14   0                      −11317800859/31886460       0

   j   i = 7                    i = 8                       i = 9
   0   0                        0                           85/12
   1   −77/12                   0                           0
   2   0                        110/3                       0
   3   0                        0                           −495/2
   4   18823/324                0                           0
   5   0                        −127415/324                 0
   6   0                        0                           162755/54
   7   1237285/8748             0                           0
   8   0                        −6757145/4374               0
   9   0                        0                           344095/18
  10   168774025/236196         0                           0
  11   0                        −1966276115/236196          0
  12   0                        0                           158337275/1458
  13   23152439695/6377292      0                           0
  14   0                        −152933889775/1594323       0

3.4.3. Genus 1, one-hole. The amplitude has the form predicted by the ansatz (2.39). By comparing with the result obtained through the recursion, we can fix the functions $f_i^{(1,1)}(x)$. We obtain

$$W^{(1,1)} = \frac{dx}{\sqrt{\sigma(x)}}\left(\frac{9}{32}\,G^2[T; z] + f_1^{(1,1)}(x)\,G[T; z] + f_0^{(1,1)}(x)\right), \tag{3.44}$$

with the functions:

$$f_1^{(1,1)}(x) = \frac{x(1 + x)}{8\,\sigma(x)},$$

$$f_0^{(1,1)}(x) = \frac{1}{96\,\sigma(x)^2}\Big(1 + 36z + 4x(1 + 36z) + 16x^6 z^2(1 + 36z) + 6x^2(1 + 46z + 270z^2) + x^4(1 + 56z + 396z^2) + 4x^3(1 + 55z + 495z^2) + 4x^5 z(1 + 57z + 1296z^2)\Big). \tag{3.45}$$

Doing the transformations as above to go to the orbifold point, we obtain the amplitude

$$F^{(1,1)}_{orb} = \sum_{i,j} \frac{1}{j!}\, N^{(1,1)}_{i,j}\, X^i\, T_{orb}^{\,j}, \tag{3.46}$$

with the invariants given in Table 3.6.

3.4.4. Genus 0, three-hole. The amplitude has again the form predicted by the ansatz (2.39). We can fix the functions $f_i^{(0,3)}(x_1, x_2, x_3)$ by comparing with the recursion, and we obtain

$$W^{(0,3)} = \frac{dx_1\,dx_2\,dx_3}{\sqrt{\sigma(x_1)\sigma(x_2)\sigma(x_3)}}\left(\frac{9}{64}\,G^3[T; z] + f_2^{(0,3)}(x_1, x_2, x_3)\,G^2[T; z] + f_1^{(0,3)}(x_1, x_2, x_3)\,G[T; z] + f_0^{(0,3)}(x_1, x_2, x_3)\right). \tag{3.47}$$

Table 3.7. Some invariants $N^{(0,3)}_{(i_1,i_2,i_3),j}$ for the genus 0, three-hole orbifold amplitude of $\mathbb{C}^3/\mathbb{Z}_3$

   j   (1,1,1)           (2,1,1)                 (3,1,1)                    (2,2,1)
   0   2/3               0                       0                          0
   1   0                 −5/6                    0                          0
   2   0                 0                       52/27                      23/12
   3   −1/27             0                       0                          0
   4   0                 13/162                  0                          0
   5   0                 0                       −124/729                   −11/324
   6   −1/81             0                       0                          0
   7   0                 397/4374                0                          0
   8   0                 0                       −17972/19683               −7303/8748
   9   37/243            0                       0                          0
  10   0                 −92273/118098           0                          0
  11   0                 0                       3393164/531441             1584895/236196
  12   −42703/6561       0                       0                          0
  13   0                 120276571/3188646       0                          0
  14   0                 0                       −4470350924/14348907       −1939962841/6377292

   j   (4,1,1)           (3,2,1)           (2,2,2)
   0   4/3               1                 9/8
   1   0                 0                 0
   2   0                 0                 0
   3   −176/27           −19/3             −49/8
   4   0                 0                 0
   5   0                 0                 0
   6   −32/81            −37/27            −133/72
   7   0                 0                 0
   8   0                 0                 0
   9   2480/243          709/81            1771/216
  10   0                 0                 0
  11   0                 0                 0
  12   −475424/6561      −57857/729        −172613/1944
  13   0                 0                 0
  14   0                 0                 0

The functions $f_i^{(0,3)}(x_1, x_2, x_3)$ are rather complicated; we present them in Appendix B. Doing the transformations as above to go to the orbifold point, we obtain the amplitude

$$F^{(0,3)}_{orb} = \sum_{i_1,i_2,i_3,j} \frac{1}{j!}\, N^{(0,3)}_{(i_1,i_2,i_3),j}\, X_1^{i_1} X_2^{i_2} X_3^{i_3}\, T_{orb}^{\,j}, \tag{3.48}$$

with the invariants given in Table 3.7. The invariants are symmetric in $(i_1, i_2, i_3)$.

3.5. Conifold point. So far we considered the A- and B-model amplitudes in the two distinct phases of M, namely the large radius phase and the orbifold one. There is however a third point around which the amplitudes have an interesting expansion, which is the conifold point. This is not a limit point of a phase of M; rather, it is a singular point of the moduli space, where the target space of the A-model develops a conifold singularity.6 This point is located at $z = -\frac{1}{27}$.

6 In the gauged linear sigma model description of the A-model, at the conifold point new massless modes appear, which define a new branch of vacua.

It is generally interesting to expand the amplitudes near the conifold point. For instance, the leading behavior of the closed amplitudes $F^{(g)}_{con}$ expanded at the conifold point can be understood as the amplitudes of the non-critical c = 1 string at the self-dual radius [35]. Moreover, the amplitudes $F^{(g)}_{con}$ seem to possess a universal gap, as discovered in [42]. That is, the leading behavior of the closed amplitudes is of the form:

$$F^{(g)}_{con} = \frac{B_{2g}}{2g(2g - 2)\,T_{con}^{2g-2}} + k_1^{(g)}\,T_{con} + O(T_{con}^2), \tag{3.49}$$

where the $B_n$ are the Bernoulli numbers and $T_{con}$ is the vanishing period at the conifold. This feature is rather striking, and very useful computationally. Indeed, one of the most


effective approaches for computing closed amplitudes is by directly integrating [38] the holomorphic anomaly equation of [12], using the polynomial structure of the amplitudes proposed in [57]. However, the holomorphic anomaly equation is not complete; at each genus one needs to fix a finite number of constants (the holomorphic ambiguity) using extra data. In conjunction with the leading behavior of the amplitudes, the gap behavior at the conifold point — more precisely the absence of the 2g − 3 subleading negative powers in the Tcon expansion — imposes 2g − 2 such extra conditions, which have been shown to completely fix the holomorphic ambiguity in many local geometries. In the compact setting, they allow computation of closed amplitudes to very high genus [43]. One may wonder if this approach has an open counterpart. So far, we relied entirely on the recursion formalism to compute open amplitudes. As we have seen, while this formalism is very satisfactory conceptually, it is rather cumbersome computationally. Direct integration of the open holomorphic anomaly equations — recently derived in [33] in the local setting — would provide an alternative method to compute the open amplitudes. In particular, one could hope that a gap behavior exists for the open amplitudes expanded at the conifold point, providing sufficient boundary conditions to fix the holomorphic ambiguity. This is surely enough motivation for studying in more detail the open amplitudes near the conifold point. In what follows we present general properties of the amplitudes; technical and computational aspects are relegated to Appendix C. Consider as usual the moduli space → M. As mentioned before, the conifold point in the closed moduli space M is located at z = −1/27; a good local coordinate is w = 27z + 1.

(3.50)
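For orientation, the coefficient of the leading pole in the closed-string gap formula (3.49), $B_{2g}/\big(2g(2g-2)\big)$, is easy to tabulate exactly. The sketch below is only an illustration with hypothetical helper names; it uses the standard recursion for Bernoulli numbers in the convention $B_1 = -\tfrac12$.

```python
from fractions import Fraction
from math import comb


def bernoulli(n: int) -> Fraction:
    """Bernoulli number B_n (with B_1 = -1/2), via sum_{k<=m} C(m+1, k) B_k = 0."""
    b = [Fraction(0)] * (n + 1)
    b[0] = Fraction(1)
    for m in range(1, n + 1):
        b[m] = -sum(comb(m + 1, k) * b[k] for k in range(m)) / (m + 1)
    return b[n]


def gap_coefficient(g: int) -> Fraction:
    """Coefficient of T_con^(2 - 2g) in (3.49), defined for genus g >= 2."""
    if g < 2:
        raise ValueError("the pole term of (3.49) requires g >= 2")
    return bernoulli(2 * g) / (2 * g * (2 * g - 2))


if __name__ == "__main__":
    for g in range(2, 6):
        print(g, gap_coefficient(g))
```

At genus $g$ the gap condition states that the $2g - 3$ powers $T_{con}^{3-2g}, \dots, T_{con}^{-1}$ are absent, which together with the leading coefficient above supplies the $2g - 2$ conditions mentioned in the text.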

Since we are computing open amplitudes, we must also specify where we expand the amplitudes in the open moduli space $\Sigma_w$. At the conifold point $w = 0$, it turns out that two of the branch points of the x-projection of the curve $\Sigma_w$ collapse to the same value, $x = -1/3$. Instead of expanding the open amplitudes near $x = 0$, we will now expand the amplitudes near this critical point $x = -1/3$, using a new local coordinate centered at this point:7

$$p = \frac{1}{x} + 3. \tag{3.51}$$

On the mirror A-model side, expanding the amplitudes near this point should correspond to considering branes located near a vertex of the toric diagram. The open B-model amplitudes expanded near this critical point should correspond, at leading order, to c = 1 string amplitudes at the self-dual radius,8 which are in turn equivalent to Gaussian matrix model amplitudes [29]. More precisely, the expected leading behavior of the open amplitudes near the critical point is [14]:

$$F^{(g,h)} \sim T_{con}^{\,2-2g-h}\,\tilde F^{(g,h)}, \tag{3.52}$$

7 Note however that this critical point is a singular limit of the curve $\Sigma_w$, hence one has to choose an appropriate set of coordinates to smooth out the singularity. In particular, one must consider a double-scaling limit, where the open coordinate p is rescaled with the closed modulus, as explained in [34]. We will come back to that in the explicit computations in Appendix C.

8 Note that such critical points have already been considered in the context of matrix models. In [28] it was proposed that c = 1 amplitudes can be obtained in a two-cut matrix model by considering the critical limit where the two cuts touch each other.


where Tcon is the closed flat coordinate near the conifold point w = 0, which corresponds to the vanishing period. The amplitudes F˜ (g,h) , which are independent of Tcon , are to be identified with the amplitudes for FZZT branes in the c = 1 string at the self-dual radius. Indeed, as noticed in a similar context in [49], in the critical limit toric branes should become FZZT branes. The leading behavior (3.52) is the open analog to the leading behavior for the closed amplitudes, as presented in (3.49). Recalling the discussion above for the closed amplitudes, the open amplitudes would possess a gap if the subleading terms in Tcon with negative exponents vanished. We can use the recursion and the formalism developed in Sect. 2 to compute the B-model amplitudes explicitly at the critical point; we report this calculation in Appendix C. The calculation shows that the amplitudes indeed possess the expected leading behavior in Tcon . However, the subleading terms in Tcon are not vanishing, in contrast with the closed amplitudes. As a result, we conclude that in the open case, there is no simple gap behavior at the conifold. This renders the use of the direct integration of the holomorphic anomaly equations as a method to solve for the amplitudes rather limited in the open case, since one lacks the boundary conditions provided by the gap behavior and required to fix the holomorphic ambiguity. A word of caution to end this section; as we discuss in Appendix C, it is not clear to us how to fix the open flat coordinate near the critical point (w, p) = (0, 0). This prevents us from providing unambiguous results for the open amplitudes near the conifold point. It would be interesting to clarify these issues further. A. Landau-Ginzburg vs Sigma Model In this Appendix we explain the relation between the standard Landau-Ginzburg mirrors to toric threefolds and the sigma models described previously. We follow the argument presented by Hori, Iqbal and Vafa in p.93 of [40]. 
Consider the A-twisted sigma model on a (noncompact) toric Calabi-Yau threefold X defined by the toric charge vectors $Q^a$, $a = 1, \dots, k$. Its mirror [36,41] (see also [24] for a clear explanation) is a B-twisted Landau-Ginzburg model on the family of algebraic tori $\pi : V \to M$, with $V = (\mathbb{C}^*)^{3+k}$ and $M = (\mathbb{C}^*)^k$, with projection map

$$\pi : (y_0, \dots, y_{k+2}) \mapsto (z_1, \dots, z_k) = \left(\prod_{i=0}^{k+2} y_i^{Q_i^1},\ \dots,\ \prod_{i=0}^{k+2} y_i^{Q_i^k}\right). \tag{A.1}$$

The Landau-Ginzburg superpotential $W : V \to \mathbb{C}$ reads

$$W = \sum_{i=0}^{k+2} y_i. \tag{A.2}$$

Choose local coordinates $y_0, y_1, y_2$ on the fiber $V_z = \pi^{-1}(z_1, \dots, z_k)$, and use the projection map $\pi$ to rewrite the superpotential as

$$W_z := W\big|_{V_z} = G(y_0, y_1, y_2; z), \tag{A.3}$$

where $G(y_0, y_1, y_2; z)$ is a homogeneous Laurent polynomial in $(y_0, y_1, y_2)$ of degree 1.

Consider now the Landau-Ginzburg model on $V'_z = (\mathbb{C}^*)^3 \times \mathbb{C}^2$, with superpotential

$$W'_z = G(y_0, y_1, y_2; z) - w w'. \tag{A.4}$$

By Knörrer periodicity, the category of B-branes in the Landau-Ginzburg model $(V_z, W_z)$ is equivalent to the category of B-branes in the Landau-Ginzburg model $(V'_z, W'_z)$ [51]. The "periods" of the Landau-Ginzburg model consist of integrals of the form

$$\int e^{G(y_0, y_1, y_2; z) - w w'}\; d\log y_0 \wedge d\log y_1 \wedge d\log y_2 \wedge dw \wedge dw'. \tag{A.5}$$

Since $y_0$ is a $\mathbb{C}^*$-coordinate, we can define new coordinates $\tilde y_i = y_i/y_0$, $i = 1, 2$, and $\tilde w = w/y_0$. The superpotential becomes

$$W'_z = y_0\left(G(1, \tilde y_1, \tilde y_2; z) - \tilde w w'\right), \tag{A.6}$$

and the periods now take the form

$$\int e^{y_0\left(G(1, \tilde y_1, \tilde y_2; z) - \tilde w w'\right)}\; dy_0 \wedge d\log\tilde y_1 \wedge d\log\tilde y_2 \wedge d\tilde w \wedge dw'. \tag{A.7}$$

Note that $d\log y_0$ has become $dy_0$, due to the rescaling of $w$. As a result, we can "integrate out" $y_0$, and we obtain a delta function

$$\delta\left(G(1, \tilde y_1, \tilde y_2; z) - \tilde w w'\right). \tag{A.8}$$
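The appearance of the delta function can be made explicit with a one-line Fourier identity. The sketch below assumes that the $y_0$ integral is taken along the imaginary axis, $y_0 = i s$ with $s$ real, and abbreviates the exponent by $\Phi$, with $w'$ denoting the second $\mathbb{C}^2$ coordinate; both the choice of contour and the symbol $\Phi$ are our illustrative assumptions, not something specified in the text.

```latex
\[
\Phi := G(1,\tilde y_1,\tilde y_2;z) - \tilde w\, w' ,
\qquad
\int_{i\mathbb{R}} dy_0\; e^{\,y_0 \Phi}
  \;=\; i\int_{-\infty}^{\infty} ds\; e^{\,i s \Phi}
  \;=\; 2\pi i\,\delta(\Phi).
\]
```

Up to the overall constant, the remaining integral over $\tilde y_1$, $\tilde y_2$, $\tilde w$, $w'$ is therefore supported on the locus $\Phi = 0$.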

In other words, the B-twisted Landau-Ginzburg model "localizes" on the B-twisted sigma model on the family of noncompact threefolds $Y \to M$ with fiber

$$Y_z = \{\, w w' = H(x, y; z) \,\} \subset (\mathbb{C}^*)^2 \times \mathbb{C}^2, \tag{A.9}$$

where we redefined $x = \tilde y_1$, $y = \tilde y_2$, $w = \tilde w$ and $H(x, y; z) := G(1, x, y; z)$. Note however that we have only shown equivalence of the period integrals. What one would need to show is the equivalence of the category of B-branes for both models, as in the first step involving Knörrer periodicity. In other words, there should be an equivalence between the category of B-branes of the Landau-Ginzburg model $(V_z, W_z)$, which is generally understood as the category of matrix factorizations, and the derived category of coherent sheaves of $Y_z$. It would be very interesting to understand this relation better, in the spirit of the Landau-Ginzburg/Calabi-Yau correspondence which was derived in [52].

B. Functions $f_i^{(0,3)}(x_1, x_2, x_3)$ for the Genus 0, 3 Hole Amplitude

We present in this Appendix the functions $f_i^{(0,3)}(x_1, x_2, x_3)$, $i = 0, 1, 2$, entering into the expression for the amplitude $W^{(0,3)}$ (3.47):


V. Bouchard, A. Klemm, M. Mariño, S. Pasquetti (0,3)

f2

=

1 (4z(4z(12zx33 + (108z + 1)x32 + 2(54z − 1)x3 − 3)x23 + (4z(108z + 1)x33 64σ (x1 )σ (x2 )σ (x3 ) −(216z + 5)x32 − 6(54z + 1)x3 − 108z − 1)x22 + 2(4z(54z − 1)x33 − 3(54z + 1)x32 −2(108z + 1)x3 − 54z + 1)x2 − 12zx33 − (108z + 1)x32 + (2 − 108z)x3 + 3)x13 +(4z(4z(108z + 1)x33 − (216z + 5)x32 − 6(54z + 1)x3 − 108z − 1)x23 +(−4z(216z + 5)x33 + 9(36z + 1)x32 + 2(270z + 7)x3 + 216z + 5)x22 + (−24z(54z + 1)x33 +2(270z + 7)x32 + 4(216z + 5)x3 + 324z + 6)x2 − 4z(108z + 1)x33 + (216z + 5)x32 + 108z +6(54z + 1)x3 + 1)x12 + 2(4z(4z(54z − 1)x33 − 3(54z + 1)x32 − 2(108z + 1)x3 − 54z + 1)x23 +(−12z(54z + 1)x33 + (270z + 7)x32 + 2(216z + 5)x3 + 162z + 3)x22 + (−8z(108z + 1)x33 +2(216z + 5)x32 + 12(54z + 1)x3 + 216z + 2)x2 + 4(1 − 54z)zx33 + 3(54z + 1)x32 + 54z +(216z + 2)x3 − 1)x1 + 12zx33 + 108zx32 + x32 + 108zx3 − 2x3 −4zx23 (12zx33 + (108z + 1)x32 + 2(54z − 1)x3 − 3) + x2 (8(1 − 54z)zx33 + 6(54z + 1)x32 +(432z + 4)x3 + 108z − 2) + x22 (−4z(108z + 1)x33 + (216z + 5)x32 +6(54z + 1)x3 + 108z + 1) − 3),

(0,3)

f1

=

(B.1)

1 (−4z(4z(12z(180z + 7)x33 + (2592z 2 − 12z − 5)x32 − 2x3 + 108z + 3)x23 192σ (x1 )σ (x2 )σ (x3 ) +(4z(2592z 2 − 12z − 5)x33 +(1296z 2 +48z +1)x32 +6(648z 2 − 12z − 1)x3 +1296z 2 − 168z − 7)x22 −2(4zx33 + (−1944z 2 + 36z + 3)x32 + (−1944z 2 + 360z + 14)x3 + 324z + 11)x2 +12z(36z + 1)x33 + (1296z 2 − 168z − 7)x32 − 3(144z + 5) − 2(324z + 11)x3 )x13 +(−4z(4z(2592z 2 − 12z − 5)x33 + (1296z 2 + 48z + 1)x32 +6(648z 2 − 12z − 1)x3 + 1296z 2 − 168z − 7)x23 + (−4z(1296z 2 + 48z + 1)x33 +2(10368z 2 + 492z + 5)x3 + 9072z 2 + 9(36zx3 + x3 )2 + 336z + 1)x22 +2(12z(−648z 2 + 12z + 1)x33 + (10368z 2 + 492z + 5)x32 + 2(7452z 2 + 276z + 1)x3 +5832z 2 + 108z − 3)x2 + 4z(−1296z 2 + 168z + 7)x33 + 5184z 2 + (9072z 2 + 336z + 1)x32 −24z + 6(1944z 2 + 36z − 1)x3 − 7)x12 + 2(4z(4zx33 + (−1944z 2 + 36z + 3)x32 +(−1944z 2 + 360z + 14)x3 + 324z + 11)x23 + (12z(−648z 2 + 12z + 1)x33 +(10368z 2 + 492z + 5)x32 + 2(7452z 2 + 276z + 1)x3 + 5832z 2 + 108z − 3)x22 −2(4z(972z 2 − 180z − 7)x33 − (7452z 2 + 276z + 1)x32 + (6 − 5832z 2 )x3 − 972z 2 + 180z + 7)x2 +4z(324z +11)x33 +3(1944z 2 +36z − 1)x32 − 324z +2(972z 2 − 180z − 7)x3 − 11)x1 +1728z 2 x33 +60zx33 + 5184z 2 x32 − 24zx32 − 7x32 − 432z − 648zx3 − 22x3 +4zx23 (−12z(36z + 1)x33 + (−1296z 2 + 168z + 7)x32 + (648z + 22)x3 + 432z + 15) +2x2 (4z(324z + 11)x33 + 3(1944z 2 + 36z − 1)x32 + 2(972z 2 − 180z − 7)x3 − 324z − 11) +x22 (4z(−1296z 2 + 168z + 7)x33 + (9072z 2 + 336z + 1)x32 + 6(1944z 2 + 36z − 1)x3 +5184z 2 − 24z − 7) − 15),

(B.2)

Topological Open Strings on Orbifolds

(0,3)

f0

=

619

1 (4z(4z(36z(144z 2 + 32z + 1)x33 + (7776z 2 + 480z + 7)x32 576σ (x1 )σ (x2 )σ (x3 ) +2(4536z 2 + 306z + 5)x3 + 3(864z 2 + 60z + 1))x23 + (4z(7776z 2 + 480z + 7)x33 + (108864z 3 +7920z 2 + 168z + 1)x32 + 6(7776z 3 + 1368z 2 + 66z + 1)x3 + 3888z 2 + 276z + 5)x22 +2(4z(4536z 2 + 306z + 5)x33 + 3(7776z 3 + 1368z 2 + 66z + 1)x32 +2(4212z 2 + 288z + 5)x3 + 5184z 2 + 378z + 7)x2 + 12z(864z 2 + 60z + 1)x33 + 9(24z + 1)2 +(3888z 2 + 276z + 5)x32 + 2(5184z 2 + 378z + 7)x3 )x13 +(4z(4z(7776z 2 + 480z + 7)x33 + (108864z 3 + 7920z 2 + 168z + 1)x32 +6(7776z 3 + 1368z 2 + 66z + 1)x3 + 3888z 2 + 276z + 5)x23 + (124416z 3 + 8496z 2 +4(108864z 3 + 7920z 2 + 168z + 1)x33 z + 120z + 3(186624z 4 + 53568z 3 + 3888z 2 + 108z + 1)x32 +(264384z 3 + 20160z 2 + 444z + 2)x3 − 1)x22 + 2(46656z 3 + 2376z 2 +12(7776z 3 + 1368z 2 + 66z + 1)x33 z − 54z + (132192z 3 + 10080z 2 + 222z + 1)x32 +2(81648z 3 + 5292z 2 + 60z − 1)x3 − 3)x2 + 4z(3888z 2 + 276z + 5)x33 +(124416z 3 + 8496z 2 + 120z − 1)x32 − 132z + 6(15552z 3 + 792z 2 − 18z − 1)x3 − 5)x12 +2(4z(4z(4536z 2 + 306z + 5)x33 + 3(7776z 3 + 1368z 2 + 66z + 1)x32 +2(4212z 2 + 288z + 5)x3 + 5184z 2 + 378z + 7)x23 + (46656z 3 + 2376z 2 +12(7776z 3 + 1368z 2 + 66z + 1)x33 z − 54z + (132192z 3 + 10080z 2 + 222z + 1)x32 +2(81648z 3 + 5292z 2 + 60z − 1)x3 − 3)x22 + 2(4z(4212z 2 + 288z + 5)x33 +(81648z 3 + 5292z 2 + 60z − 1)x32 + 6(5832z 3 + 108z 2 − 30z − 1)x3 − 324z 2 − 144z − 5)x2 +4z(5184z 2 + 378z + 7)x33 + 2592z 2 + 3(15552z 3 + 792z 2 − 18z − 1)x32 − 90z −2(324z 2 + 144z + 5)x3 − 7)x1 + 20736z 3 x33 + 1728z 2 x33 + 36zx33 + 10368z 2 − 132zx32 −5x32 + 144z + 5184z 2 x3 − 180zx3 − 14x3 + 4zx23 (12z(864z 2 + 60z + 1)x33 +(3888z 2 + 276z + 5)x32 + 2(5184z 2 + 378z + 7)x3 + 9(24z + 1)2 ) +x22 (4z(3888z 2 + 276z + 5)x33 + (124416z 3 + 8496z 2 + 120z − 1)x32 +6(15552z 3 + 792z 2 − 18z − 1)x3 − 132z − 5) + 2x2 (4z(5184z 2 + 378z + 7)x33 +3(15552z 3 + 792z 2 − 18z − 1)x32 − 2(324z 2 + 144z + 5)x3 + 2592z 2 − 90z − 7) − 9).

(B.3)

C. Conifold Expansion

In this Appendix we provide detailed calculations supporting the discussion of the open amplitudes near the conifold point in Subsect. 3.5. Before computing the amplitudes, one needs to fix the open and closed mirror maps near the critical point $(w, p) = (0, 0)$ in the moduli space. First, the closed mirror map can be easily obtained by performing analytic continuation of the large radius periods to the conifold point $w = 0$. One obtains

$$T_{con} = w + \frac{11w^2}{18} + \frac{109w^3}{243} + \frac{9389w^4}{26244} + \cdots,$$

$$T^D_{con} = w\log(w) + \left(\frac{11\log(w)}{18} + \frac{7}{12}\right)w^2 + \left(\frac{109\log(w)}{243} + \frac{877}{1458}\right)w^3 + \cdots. \tag{C.1}$$
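Inverting the series for $T_{con}$ in (C.1) can be done order by order. The following is a minimal sketch with exact rational arithmetic; the helper names and the fixed-point scheme are our own illustrative choices, and only the coefficients of $T_{con}(w)$ are taken from (C.1).

```python
from fractions import Fraction as Fr

# Coefficients of T_con(w) from (C.1), indexed by the power of w.
T_CON = [Fr(0), Fr(1), Fr(11, 18), Fr(109, 243), Fr(9389, 26244)]


def multiply(a, b, order):
    """Product of two truncated power series, keeping powers 0..order."""
    out = [Fr(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= order:
                out[i + j] += ai * bj
    return out


def compose(outer, inner, order):
    """outer(inner(T)) truncated at T^order; inner must have no constant term."""
    result = [Fr(0)] * (order + 1)
    power = [Fr(1)] + [Fr(0)] * order  # inner^0
    for coeff in outer:
        result = [r + coeff * p for r, p in zip(result, power)]
        power = multiply(power, inner, order)
    return result


def invert(series, order):
    """Compositional inverse of series = w + O(w^2) by fixed-point iteration."""
    identity = [Fr(0), Fr(1)] + [Fr(0)] * (order - 1)
    nonlinear = [Fr(0), Fr(0)] + list(series[2:order + 1])
    inverse = identity[:]
    for _ in range(order):
        # Each pass w <- T - (nonlinear part)(w) fixes one more order.
        correction = compose(nonlinear, inverse, order)
        inverse = [t - c for t, c in zip(identity, correction)]
    return inverse


if __name__ == "__main__":
    for power, coeff in enumerate(invert(T_CON, 4)):
        print(f"T_con^{power}: {coeff}")
```

The result is the closed mirror map $w(T_{con})$ truncated at fourth order; composing it back with $T_{con}(w)$ returns $T_{con}$ to that order.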

The vanishing period Tcon at w = 0 gives the closed flat coordinate at the conifold point, and the closed mirror map is obtained as usual by inverting the series. The open mirror map is much more delicate. As explained in [16], it should be given by a linear combination of solutions of the extended Picard-Fuchs system. That is, by a linear combination of the constant solution, the closed periods (C.1), the solution (see [16]): w−1 1 u = log( p − 3) + log , (C.2) 3 27 and the other relevant solution which is given by the disk amplitude.9 At the critical point (w, p) = (0, 0), the latter reads √ p 2 11 p 3 47 p 4 (0,1) + + + ··· F =i 3 36 972 11664 1 p p2 23 p 3 +w − log( p) − + + + ··· 3 54 324 13122 11 1 1 31 p 2 29 p + 2− + + · · · + · · · . (C.3) +w 2 − log( p) − 54 2p 2p 2916 17496 Therefore, generically, the open flat coordinate should be of the form: D + G. P = A u + B F (0,1) + C Tcon + D Tcon

(C.4)

We can directly set B and D to zero, as both $F^{(0,1)}$ and $T^D_{con}$ contain a logarithm which would then introduce non-trivial monodromy in the physical disk amplitude. We further decide to fix $A = -1$ and $G = \frac{4\pi i}{3}$, for the following reasons. First, fixing A just fixes the overall scale of the map. For instance, for $A = -1$ we get that

$$P(p, w) = \frac{p}{3} + \frac{w}{3} + \cdots + C\,T_{con} + G - \frac{4\pi i}{3}. \tag{C.5}$$

Then, we fix $G = \frac{4\pi i}{3}$ to cancel the constant term in the p, w expansion, as we want the flat coordinate to vanish at $(p, w) = (0, 0)$. We then obtain

$$P(p, w) = \frac{p}{3} + \frac{p^2}{18} + \cdots + \frac{w}{3} + \frac{w^2}{6} + \cdots + C\,T_{con}, \tag{C.6}$$

9 By disk amplitude here we mean its completion with classical terms such that it is a solution of the extended Picard-Fuchs system.


and the inverse mirror map reads: 1 2 27C 2 + 18C + 1 Tcon p = −(3C + 1)Tcon − + ··· 18 1 2 2 27C + 18C + 1 Tcon + · · · P + · · · . (C.7) + 3 + (3C + 1)Tcon + 18 We did not however find any argument to fix the constant C in the open mirror map. As a result, we are left with a one-parameter family of open mirror maps at the conifold, parameterized by C. Now, as we already mentioned in footnote 7, the conifold point is a singular limit for the mirror curve, and one needs to choose an appropriate coordinate on the resolution. To smooth out the singularity, as in [34] we introduce the rescaled open flat coordinate: X = P Tcon . (C.8) We will then expand the open amplitudes in the flat coordinates X and Tcon . Conifold amplitudes can be easily obtained with the method developed in Sect. 2. As for the orbifold point studied in Sect. 3, basically we only need to input the flat coordinate Tcon in the functionals W (g,h) , and expand the result in the flat coordinates Tcon and X at the conifold point. We obtain the following results. The disk amplitude at the conifold point reads: 3C 1 3X 2 3C X 3/2 + · · · Tcon + − + + X 2 + · · · Tcon F (0,1) = X + 4 2 8 8 1 3C 2 C 2 + + X + · · · Tcon + − + ... . (C.9) 4 72 8 The annulus amplitude: F (0,2) =

3X 1 X 2 9X 12 X 22 9(X 13 X 2 + X 1 X 23 ) + + + ··· 512 256 16 1 2 2 + Tcon (−9C + 1) (X 2 X 1 + X 1 X 2 ) + · · · 64 C 9C 2 1 − + X 1 X 2 + · · · Tcon + · · · , + 24 16 32

(C.10)

and the genus 1, one-hole amplitude: 15 3 3 1 1 1 F (1,1) = X+ X + ··· + (1 − 45C) X 2 + . . . √ 32 256 Tcon 256 Tcon C 77 15C 3 C 2 77C 83 45C 2 − − X + ··· + − + + + + 256 128 2304 256 256 2304 6912 135C 2 595C 35 2 + + + X + ··· Tcon + · · · . (C.11) 256 3072 9216 2−2g−h

These amplitudes indeed have the expected leading behavior $T_{con}^{\,2-2g-h}$, as explained in (3.52). However, the subleading terms do not vanish (for any value of C), and so there is no simple gap behavior.
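The open mirror map of this Appendix can be cross-checked with a short exact computation. The sketch below rests on assumptions that we state explicitly: we read (C.2) as $u = \log(p - 3) + \tfrac13\log\frac{w - 1}{27}$, so that with $A = -1$, $B = D = 0$ and $G = \frac{4\pi i}{3}$ the map (C.4) takes the closed form $P = -\log(1 - p/3) - \tfrac13\log(1 - w) + C\,T_{con}$, and hence $p = 3 - 3\,e^{-P}\,e^{C T_{con}}(1 - w)^{-1/3}$. The function names are hypothetical. The code checks the constant $-\frac{4\pi i}{3}$ of (C.5) numerically and the first coefficients of the inverse map (C.7) exactly.

```python
import cmath
from fractions import Fraction as Fr

ORDER = 3  # number of coefficients kept: T_con^0, T_con^1, T_con^2


def mul(a, b):
    """Product of two power series truncated to ORDER coefficients."""
    out = [Fr(0)] * ORDER
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < ORDER:
                out[i + j] += ai * bj
    return out


def compose(outer, inner):
    """outer(inner(T)), truncated; inner must have zero constant term."""
    result = [Fr(0)] * ORDER
    power = [Fr(1)] + [Fr(0)] * (ORDER - 1)
    for coeff in outer:
        result = [r + coeff * p for r, p in zip(result, power)]
        power = mul(power, inner)
    return result


def inverse_map_coefficients(C):
    """Series in T_con of (p at P = 0, coefficient of P), assuming the closed form above."""
    C = Fr(C)
    w_of_T = [Fr(0), Fr(1), Fr(-11, 18)]   # inverse of T_con = w + 11/18 w^2 + ... from (C.1)
    binom = [Fr(1), Fr(1, 3), Fr(2, 9)]    # (1 - w)^(-1/3) in powers of w
    exp_CT = [Fr(1), C, C * C / 2]         # exp(C * T_con)
    S = mul(exp_CT, compose(binom, w_of_T))
    slope = [3 * s for s in S]
    offset = [3 * (Fr(1) if k == 0 else Fr(0)) - 3 * s for k, s in enumerate(S)]
    return offset, slope


def expected_from_C7(C):
    """Coefficients as they appear in (C.7)."""
    C = Fr(C)
    quad = (27 * C * C + 18 * C + 1) / 18
    return [Fr(0), -(3 * C + 1), -quad], [Fr(3), 3 * C + 1, quad]


def constant_term_of_minus_u():
    """Value of -u at (p, w) = (0, 0) on the principal branch of the logarithm."""
    return -(cmath.log(-3) + cmath.log(-1 / 27) / 3)


if __name__ == "__main__":
    print(inverse_map_coefficients(Fr(1, 2)) == expected_from_C7(Fr(1, 2)))
    print(constant_term_of_minus_u())
```

Because the closed form contains no mixed $p\,w$ terms, the quadratic coefficients $\tfrac{1}{18}$ and $\tfrac16$ in (C.6) follow directly from expanding $-\log(1 - p/3)$ and $-\tfrac13\log(1 - w)$.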


Acknowledgements. We would like to thank Alireza Tavanfar and Marlene Weiss for collaboration at the initial stages of this work, and Renzo Cavalieri for sharing with us his results on open orbifold Gromov–Witten invariants prior to publication. We would also like to thank Paul Johnson, Manfred Herbst, Yongbin Ruan and Ed Segal for interesting discussions. The work of S.P. was partly supported by the Swiss National Science Foundation and by the European Commission under contracts MRTN-CT-2004-005104. The work of V.B. is supported in part by the Center for the Fundamental Laws of Nature at Harvard University and by NSF grants PHY-0244821 and DMS-0244464.

References 1. Aganagic, M., Bouchard, V., Klemm, A.: Topological strings and (Almost) modular forms. Commun. Math. Phys. 277, 771 (2008) 2. Aganagic, M., Dijkgraaf, R., Klemm, A., Mariño, M., Vafa, C.: Topological strings and integrable hierarchies. Commun. Math. Phys. 261, 451 (2006) 3. Aganagic, M., Klemm, A., Mariño, M., Vafa, C.: The topological vertex. Commun. Math. Phys. 254, 425 (2005) 4. Aganagic, M., Klemm, A., Vafa, C.: Disk instantons, mirror symmetry and the duality web. Z. Naturforsch. A 57, 1 (2002) 5. Aganagic, M., Vafa, C.: Mirror symmetry, D-branes and counting holomorphic discs. http://arXiv.org/ abs/hep-th/0012041v1, 2000 6. Akemann, G.: Higher genus correlators for the Hermitian matrix model with multiple cuts. Nucl. Phys. B 482, 403 (1996) 7. Akhiezer, N.I.: Elements of Theory of Elliptic Functions, AMS, Providence, RI: Amer. Math.Soc., 1999 8. Alim, M., Lange, J.D.: Polynomial structure of the (Open) topological string partition function. JHEP 0710, 045 (2007) 9. Aspinwall, P.S.: D-branes on Calabi-Yau manifolds. http://arXiv.org/abs/hep-th/0403166v1, 2004 10. Aspinwall, P.S., Greene, B.R., Morrison, D.R.: Calabi-Yau moduli space, mirror manifolds and spacetime topology change in string theory. Nucl. Phys. B 416, 414 (1994) 11. Bayer, A., Cadman, C.: Quantum cohomology of [Cn /µr ]. http://arXiv.org/abs/0705.2160v2[math.AG], 2009 12. Bershadsky, M., Cecotti, S., Ooguri, H., Vafa, C.: Kodaira-Spencer theory of gravity and exact results for quantum string amplitudes. Commun. Math. Phys. 165, 311 (1994) 13. Bershadsky, M., Cecotti, S., Ooguri, H., Vafa, C.: Holomorphic anomalies in topological field theories. Nucl. Phys. B 405, 279 (1993) 14. Bertoldi, G., Hollowood, T.J.: Large N gauge theories and topological cigars. JHEP 0704, 078 (2007) 15. Bonelli, G., Tanzini, A.: The holomorphic anomaly for open string moduli. JHEP 0710, 060 (2007) 16. Bouchard, V., Klemm, A., Mariño, M., Pasquetti, S.: Remodeling the B-model. Commun. Math. Phys. 
287, 117–178 (2009) 17. Bouchard, V., Cavalieri, R.: On the mathematics and physics of high genus invariants of C3 /Z3 . http:// arXiv.org/abs/0709.3805v1[math.AG], 2007 18. Brini, A., Tanzini, A.: Exact results for topological strings on resolved Y(p,q) singularities. http://arXiv. org/abs/0804.2598v4[hep-th], 2008 19. Bryan, J., Graber, T.: The crepant resolution conjecture. http://arXiv.org/abs/arXiv:math/ 0610129v2[math.AG], 2007 20. Cadman, C., Cavalieri, R.: Gerby localization, Z3 -Hodge integrals and the GW theory of [C3 /Z3 ]. http:// arXiv.org/abs/0705.2158v3[math.AG], 2007 21. Cavalieri, R.: Private communication 22. Chekhov, L., Eynard, B., Orantin, N.: Free energy topological expansion for the 2-matrix model. JHEP 0612, 053 (2006) 23. Chiang, T.M., Klemm, A., Yau, S.T., Zaslow, E.: Local mirror symmetry: Calculations and interpretations. Adv. Theor. Math. Phys. 3, 495 (1999) 24. Coates, T., Corti, A., Iritani, H., Tseng, H.-H.: Wall-Crossings in toric Gromov-Witten theory I: crepant examples. http://arXiv.org/abs/math/0611550v3[math.AG], 2006 25. Coates, T., Corti, A., Iritani, H., Tseng, H.-H.: Computing Genus-Zero twisted Gromov-Witten invariants. http://arXiv.org/abs/math/0702234v2[math.AG], 2007 26. Coates, T.: Wall-Crossings in toric Gromov-Witten theory II: local examples. http://arXiv.org/abs/0804. 2592v1[math.AG], 2008 27. Diaconescu, D.E., Florea, B.: Large N duality for compact Calabi-Yau threefolds. Adv. Theor. Math. Phys. 9, 31 (2005) 28. Dijkgraaf, R., Gukov, S., Kazakov, V.A., Vafa, C.: Perturbative analysis of gauged matrix models. Phys. Rev. D 68, 045007 (2003)


Communicated by N.A. Nekrasov

Commun. Math. Phys. 296, 625–654 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1007-x

Communications in Mathematical Physics

Continuity of the von Neumann Entropy

M. E. Shirokov
Steklov Mathematical Institute, Gubkina Str. 8, 11991 Moscow, Russia. E-mail: [email protected]

Received: 20 April 2009 / Accepted: 10 November 2009
Published online: 2 March 2010 – © Springer-Verlag 2010

Abstract: A general method for proving continuity of the von Neumann entropy on subsets of positive trace-class operators is considered. This makes it possible to rederive the known conditions for continuity of the entropy in more general forms and to obtain several new conditions. The method is based on a particular approximation of the von Neumann entropy by an increasing sequence of concave continuous unitary invariant functions defined using decompositions into finite rank operators. The existence of this approximation is a corollary of a general property of the set of quantum states as a convex topological space called the strong stability property. This is considered in the first part of the paper.

Contents

1. Introduction
2. Preliminaries
3. The Strong Stability Property of S(H)
   3.1 The definition
   3.2 Some implications
4. On Approximation of Concave (Convex) Functions on S(H)
5. The Approximation of the von Neumann Entropy and the Continuity Conditions
   5.1 The uniform approximation property
   5.2 The continuity conditions
6. Conclusion
7. Appendix
   7.1 One property of the positive cone of trace-class operators
   7.2 The proofs of the auxiliary results


1. Introduction

The set of quantum states, i.e. density operators in a separable Hilbert space, plays the central role in the analysis of general infinite dimensional quantum systems. One of the technical problems in this analysis is related to noncompactness of the set of quantum states and to nonexistence of inner points of this set considered as a closed convex subset of the separable Banach space of all trace-class operators. Another technical problem consists in discontinuity and unboundedness of basic characteristics of quantum states such as the von Neumann entropy, the relative entropy, etc.

The above problems can be partially overcome by using two special properties of the set of quantum states considered in detail in the first part of [25]. The first of them can be considered as a kind of “weak compactness”, since it provides a generalization to the set of quantum states of several results well known for compact convex sets (see Sect. 2), while the second one, called the stability property, reveals the special relation between the topology and the convex structure of the set of quantum states (see Subsect. 3.1). These two properties provide the foundation for the analysis of continuity of several important characteristics of quantum systems and quantum channels (see [25] and the references therein).

In this paper we prove a stronger version of the stability property of the set of quantum states, naturally called strong stability, and consider its applications to the problem of approximation of concave (convex) functions on the set of quantum states, which provide a new approach to the analysis of continuity of such functions. The main application of the strong stability property considered in this paper is the development of a method of proving continuity of the von Neumann entropy.

In infinite dimensions the von Neumann entropy is a nonnegative concave lower semicontinuous function on the set of quantum states taking the value +∞ on a dense subset of this set.¹ Nevertheless the von Neumann entropy has continuous bounded restrictions to some important subsets of quantum states, for example, to the set of states of a system of quantum oscillators with bounded mean energy. Since continuity of the entropy is a very desirable property in the analysis of quantum systems, various sufficient continuity conditions have been obtained up to now. The earliest among them seem to be Simon’s dominated convergence theorems presented in [26] and widely used in applications (generalized forms of these theorems are presented in Corollary 4). Another useful continuity condition originally appeared in [29] (as far as I know) and can be formulated as continuity of the entropy on each subset of states characterized by a bounded mean value of a given positive unbounded operator with discrete spectrum, provided that its sequence of eigenvalues has a sufficient rate of increase (see Example 1). Some special conditions of continuity of the von Neumann entropy are considered in [24].

It turns out that the strong stability property of the set of quantum states (more precisely, the approximation technique based on this property) provides a new method of proving continuity of the von Neumann entropy on a set of quantum states. The method is based on the established relation between this property and the special uniform approximation property of this set defined via the relative entropy. Well known results concerning the relative entropy make it possible to show conservation of the uniform approximation property under different set-operations, which implies, roughly speaking, “preservation of continuity” of the entropy under these set-operations.
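The discontinuity of the entropy in infinite dimensions can be made concrete with a minimal numerical sketch. It assumes a family of diagonal states chosen purely for illustration (the function name `entropy_and_distance` and the parameters `eps` and `log_d` are not from the paper): the state ρ_n puts weight 1 − 1/n on a fixed pure state and spreads the remaining weight 1/n uniformly over 2^n further levels. The trace-norm distance to the pure state tends to zero, while the entropy (in nats) tends to log 2 rather than to the entropy 0 of the limit state.

```python
import math

def entropy_and_distance(n):
    """Entropy of rho_n and its trace-norm distance to the pure state |0><0|.

    rho_n = (1 - eps)|0><0| + eps * (uniform mixture over d further levels),
    with eps = 1/n and d = 2**n (an illustrative choice, requires n >= 2).
    """
    eps = 1.0 / n
    log_d = n * math.log(2.0)  # log d, computed without forming 2**n as a float
    # Eigenvalues: 1 - eps (once) and eps/d (d times), hence
    # H = -(1 - eps) log(1 - eps) - eps * (log eps - log d).
    h = -(1.0 - eps) * math.log(1.0 - eps) - eps * (math.log(eps) - log_d)
    trace_distance = 2.0 * eps  # || rho_n - |0><0| ||_1 = eps + eps
    return h, trace_distance

for n in (10, 100, 1000):
    h, dist = entropy_and_distance(n)
    print(n, round(dist, 4), round(h, 4))
```

The sketch is consistent with lower semicontinuity (the limit of the entropies is not smaller than the entropy of the limit state) and shows why additional conditions are needed for continuity.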
The proposed method makes it possible to re-derive the known conditions of continuity of the von Neumann entropy mentioned above (in more general forms) and to obtain several new (as far as I know) conditions, which seem to be useful in the analysis of quantum systems.

¹ Moreover, the set of states with finite entropy is a first category subset of the set of all quantum states [29].

2. Preliminaries

Let H be a separable Hilbert space, B(H) the Banach space of all bounded operators in H with the operator norm $\|\cdot\|$, and T(H) the Banach space of all trace-class operators in H with the trace norm $\|\cdot\|_1$, containing the cone $T_+(H)$ of all positive trace-class operators. The closed convex subsets

$$T_1(H) = \{A \in T_+(H) \mid \mathrm{Tr}\,A \le 1\} \quad\text{and}\quad S(H) = \{A \in T_+(H) \mid \mathrm{Tr}\,A = 1\}$$

are complete separable metric spaces with the metric defined by the trace norm. Operators in S(H) are denoted ρ, σ, ω, ... and called density operators or states, since each density operator uniquely defines a normal state on B(H) [3].

In what follows A is a subset of the cone of positive trace-class operators. We denote by cl(A), co(A), σ-co(A), $\overline{\mathrm{co}}(A)$ and extr(A) the closure, the convex hull, the σ-convex hull, the convex closure and the set of all extreme points of a set A, respectively [12,22].

In what follows we consider functions on subsets of $T_+(H)$ taking values in [−∞, +∞], which are semibounded (either lower or upper bounded) on these subsets. We denote by co f and $\overline{\mathrm{co}} f$ the convex hull and the convex closure of a function f on a convex set A [12,22]. The set of all bounded continuous functions on a set A is denoted C(A).

The set of all Borel probability measures on a closed set A endowed with the topology of weak convergence is denoted P(A). This set can be considered as a complete separable metric space [19]. The barycenter b(µ) of a measure µ in P(A) is the operator in $\overline{\mathrm{co}}(A)$ defined by the Bochner integral

$$b(\mu) = \int_{\mathcal{A}} A\,\mu(dA)$$

(under the integral sign the set is written $\mathcal{A}$ to distinguish it from the integration variable $A$),

which always exists if the set A is bounded. For an arbitrary subset B ⊆ $\overline{\mathrm{co}}(A)$ let $P_B(A)$ be the subset of P(A) consisting of all measures with barycenter in B.

Let $P^a(A)$ be the subset of P(A) consisting of atomic measures and let $P^f(A)$ be the subset of $P^a(A)$ consisting of measures with a finite number of atoms. Each measure in $P^a(A)$ corresponds to a collection of operators $\{A_i\} \subset A$ with probability distribution $\{\pi_i\}$, conventionally called an ensemble and denoted $\{\pi_i, A_i\}$. The barycenter of this measure is the average $\sum_i \pi_i A_i$ of the corresponding ensemble.

We use the following two strengthened versions of the well known notion of a concave function. A semibounded function f on the set S(H) is called σ-concave at a state $\rho_0 \in S(H)$ if the discrete Jensen’s inequality

$$f(\rho_0) \ge \sum_i \pi_i f(\rho_i)$$

holds for an arbitrary countable ensemble $\{\pi_i, \rho_i\}$ of states in S(H) with the average state $\rho_0$.
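As a minimal sketch of the discrete Jensen inequality, the following toy check takes f to be the entropy and restricts to a finite ensemble of commuting (diagonal) states in dimension three, so that each state is represented by its list of eigenvalues. The ensemble, the weights and the helper names `entropy` and `average` are illustrative assumptions, not part of the paper.

```python
import math

def entropy(p):
    """Entropy (in nats) of a diagonal state given by its eigenvalue list."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def average(weights, states):
    """Barycenter sum_i pi_i rho_i of an ensemble of diagonal states."""
    dim = len(states[0])
    return [sum(w * s[j] for w, s in zip(weights, states)) for j in range(dim)]

# Hypothetical ensemble {pi_i, rho_i} of three diagonal states.
weights = [0.5, 0.3, 0.2]
states = [[1.0, 0.0, 0.0], [0.2, 0.8, 0.0], [0.1, 0.1, 0.8]]

rho0 = average(weights, states)                              # average state
lhs = entropy(rho0)                                          # f(rho_0)
rhs = sum(w * entropy(s) for w, s in zip(weights, states))   # sum_i pi_i f(rho_i)
print(lhs >= rhs)
```

The check covers only one finite ensemble; σ-concavity requires the inequality for every countable ensemble with the given average state.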


A semibounded universally measurable² function f on the set S(H) is called µ-concave at a state $\rho_0 \in S(H)$ if the integral Jensen’s inequality

$$f(\rho_0) \ge \int_{S(H)} f(\rho)\,\mu(d\rho)$$

holds for an arbitrary measure µ in P(S(H)) with the barycenter $\rho_0$.

σ-convexity and µ-convexity of a function f are naturally defined via the above notions applied to the function −f. Examples of semibounded functions (in particular, Borel functions) on the set S(H) which are convex but not σ-convex, or σ-convex but not µ-convex, at particular states, as well as sufficient conditions for σ-convexity and µ-convexity of a convex function at any state, are considered in [25].

The identity operator in a Hilbert space H and the identity transformation of the space T(H) are denoted $I_H$ and $\mathrm{Id}_H$, respectively.

Following [11], an arbitrary positive unbounded operator in a Hilbert space with discrete spectrum of finite multiplicity is called an H-operator.

The set S(H) is not compact if dim H = +∞, but it has the following property: the pre-image $b^{-1}(A) \subset P(S(H))$ of any compact subset A of S(H) under the map µ ↦ b(µ) is compact [11, Prop. 2]. This property can be used to prove for the set S(H) and for its subsets several results well known for compact sets. It is studied in detail (in the general context of a metrizable convex subset of a locally convex space) in [20], where it is called µ-compactness. It implies, in particular, the following Choquet-type assertion and the lemma below.

Lemma 1. Let A be a closed subset of S(H). For an arbitrary state ρ in $\overline{\mathrm{co}}(A)$ there exists a measure µ in P(A) such that b(µ) = ρ.

Proof. Let $\rho_0 \in \overline{\mathrm{co}}(A)$ and let $\{\rho_n\} \subset \mathrm{co}(A)$ be a sequence converging to the state $\rho_0$. For each n ∈ N there exists a measure $\mu_n \in P(A)$ with finite support such that $\rho_n = b(\mu_n)$. By µ-compactness of the set S(H) the sequence $\{\mu_n\}$ has a limit point $\mu_0 \in P(A)$. Continuity of the map µ ↦ b(µ) implies $b(\mu_0) = \rho_0$.

Lemma 1 ensures that the functions in the following lemma, proved in the Appendix, are well defined.

Lemma 2. Let f be a lower semicontinuous lower bounded function on a closed subset A of S(H).

A) The function

$$\check{f}_A(\rho) = \inf_{\mu \in P_{\{\rho\}}(A)} \int_A f(\sigma)\,\mu(d\sigma)$$

is convex and lower semicontinuous on the set $\overline{\mathrm{co}}(A)$. For an arbitrary ρ in $\overline{\mathrm{co}}(A)$ the infimum in the definition of the value $\check{f}_A(\rho)$ is achieved at a particular measure in $P_{\{\rho\}}(A)$.

B) If the map $P(A) \ni \mu \mapsto b(\mu) \in \overline{\mathrm{co}}(A)$ is open then the function

$$\hat{f}_A(\rho) = \sup_{\mu \in P_{\{\rho\}}(A)} \int_A f(\sigma)\,\mu(d\sigma)$$

is concave and lower semicontinuous on the set $\overline{\mathrm{co}}(A)$.

² This means that the function f is measurable with respect to any measure in P(S(H)).


For a given natural k we denote by $T_+^k(H)$ (respectively by $S_k(H)$) the set of positive trace-class operators (respectively states) having rank ≤ k. The convex set $\bigcup_{k=1}^{+\infty} S_k(H)$ of all finite rank states is denoted $S_f(H)$.

A linear positive trace-nonincreasing map $\Phi : T(H) \to T(H)$ such that the dual map $\Phi^* : B(H) \to B(H)$ is completely positive is called a quantum operation [10]. The convex set of all quantum operations from T(H) to itself is denoted $F_{\le 1}(H)$. If a quantum operation is trace-preserving then it is called a quantum channel.

An arbitrary quantum operation (respectively channel) $\Phi \in F_{\le 1}(H)$ has the following Kraus representation:

$$\Phi(\cdot) = \sum_{j=1}^{+\infty} V_j (\cdot) V_j^*,$$

where $\{V_j\}_{j=1}^{+\infty}$ is a set of operators in B(H) such that $\sum_{j=1}^{+\infty} V_j^* V_j \le I_H$ (respectively $\sum_{j=1}^{+\infty} V_j^* V_j = I_H$). For a given natural n we denote by $F^n_{\le 1}(H)$ the subset of $F_{\le 1}(H)$ consisting of quantum operations having a Kraus representation with ≤ n nonzero summands.

We will use the following result of the purification theory.

Lemma 3. Let H and K be Hilbert spaces such that dim H = dim K. For an arbitrary pure state $\omega_0$ in S(H ⊗ K) and an arbitrary sequence $\{\rho_n\}$ of states in S(H) converging to the state $\rho_0 = \mathrm{Tr}_K\,\omega_0$ there exists a sequence $\{\omega_n\}$ of pure states in S(H ⊗ K) converging to the state $\omega_0$ such that $\rho_n = \mathrm{Tr}_K\,\omega_n$ for all n.

The assertion of this lemma can be proved by noting that the infimum in the definition of the Bures distance (or the supremum in the definition of the Uhlmann fidelity) between two quantum states can be taken only over all purifications of one state with a fixed purification of the other state, and that convergence of a sequence of states in the trace norm distance implies its convergence in the Bures distance [9,14].

Let $P_n$ be the set of all probability distributions with n ≤ +∞ outcomes endowed with the total variation topology.

Note. In what follows continuity of a function f on a set $A \subset T_+(H)$ implies its finiteness on this set (in contrast to lower (upper) semicontinuity).

3. The Strong Stability Property of S(H)

3.1. The definition. The notion of stability of a convex subset of a linear topological space appeared at the end of the 1970’s as a result of the study of the properties of compact convex sets, which led in particular to proving equivalence of continuity of the convex hull (envelope) of an arbitrary continuous function (the CE-property), openness of the mixture map and openness of the barycenter map for a given compact convex set (the Vesterstrom-O’Brien theorem [4]).
In the subsequent papers (see [5,8,18] and the references therein) the term stability was used to denote openness of the mixture map for an arbitrary convex subset of a linear topological space (which is not equivalent in general to the CE-property).

The stability property of the set S(H) of quantum states and its corollaries are considered in detail in [25]. It consists in the validity of the following equivalent³ statements:

• the map $S(H)^{\times 2} \times [0,1] \ni (\rho, \sigma, \lambda) \mapsto \lambda\rho + (1-\lambda)\sigma \in S(H)$ is open;
• the map $P(S(H)) \ni \mu \mapsto b(\mu) \in S(H)$ is open;
• the map $P(\mathrm{extr}\,S(H)) \ni \mu \mapsto b(\mu) \in S(H)$ is open;
• $\mathrm{co} f = \overline{\mathrm{co}} f \in C(S(H))$ for arbitrary $f \in C(S(H))$;
• $f_*^{\sigma} = f_*^{\mu} \in C(S(H))$ for arbitrary $f \in C(\mathrm{extr}\,S(H))$, where $f_*^{\sigma}$ and $f_*^{\mu}$ are the σ-convex roof and the µ-convex roof of the function f [25].

³ Equivalence of these statements follows from the µ-compact generalization of the Vesterstrom-O’Brien theorem [20, Theorem 1].

Physically, openness of the map $P(S(H)) \ni \mu \mapsto b(\mu) \in S(H)$ (respectively of the map $P(\mathrm{extr}\,S(H)) \ni \mu \mapsto b(\mu) \in S(H)$) means, roughly speaking, that any small perturbation of the average state of a given continuous ensemble of states (respectively of pure states) can be realized by appropriate small perturbations of the states of this ensemble.

It turns out that the stability property of the set S(H) can be strengthened by showing that any small perturbation of the average state of a given (countable or continuous) ensemble of finite rank states can be realized by appropriate small perturbations of the states of this ensemble without increasing the maximal rank of these states. Mathematically this strong stability property of the set S(H) is formulated in the following theorem.

Theorem 1. The surjective continuous maps

$$P(S_k(H)) \ni \mu \mapsto b(\mu) \in S(H) \quad\text{and}\quad P^a(S_k(H)) \ni \mu \mapsto b(\mu) \in S(H)$$

are open for each natural k.

As mentioned before, the assertion of Theorem 1 for k = 1 is equivalent to openness of the map $P(S(H)) \ni \mu \mapsto b(\mu) \in S(H)$. The proof of this equivalence is based on coincidence of the set $S_1(H)$ with the set extr S(H) and is universal in the sense that it is valid for an arbitrary compact or µ-compact convex set in the role of S(H) [4,20]. In contrast to this, the proof of the assertion of Theorem 1 for k > 1 essentially uses the specific structure of the set S(H).

The basic ingredients of the proof of the above theorem are the following lemma and Lemma 5 below.

Lemma 4. Let $\{\pi_i^0, \rho_i^0\}$ be a countable ensemble of states in $S_k(H)$ with the average state $\rho_0 = \sum_{i=1}^{+\infty} \pi_i^0 \rho_i^0$. For an arbitrary sequence $\{\rho_n\} \subset S(H)$ converging to the state $\rho_0$ there exists a sequence $\{\{\pi_i^n, \rho_i^n\}\}_n$ of countable ensembles of states in $S_k(H)$ such that

$$\lim_{n\to+\infty} \pi_i^n = \pi_i^0, \qquad \pi_i^0 > 0 \;\Rightarrow\; \lim_{n\to+\infty} \rho_i^n = \rho_i^0, \quad \forall i, \qquad\text{and}\qquad \rho_n = \sum_{i=1}^{+\infty} \pi_i^n \rho_i^n, \quad \forall n.$$

The assertion of this lemma implies weak convergence of the sequence $\{\{\pi_i^n, \rho_i^n\}\}_n$ of atomic measures to the atomic measure $\{\pi_i^0, \rho_i^0\}$, i.e. convergence in $P(S_k(H))$, which means that

$$\lim_{n\to+\infty} \sum_i \pi_i^n f(\rho_i^n) = \sum_i \pi_i^0 f(\rho_i^0)$$

for any function f in $C(S_k(H))$. This relation can be easily proved by noting that pointwise convergence of the sequence $\{\{\pi_i^n\}_i\}_n$ of probability distributions to the probability distribution $\{\pi_i^0\}$ implies its convergence in the norm of total variation.

Proof of Lemma 4. For each i let $|\varphi_i\rangle$ be a unit vector in $H \otimes H_k$ such that $\mathrm{Tr}_{H_k} |\varphi_i\rangle\langle\varphi_i| = \rho_i^0$, where $H_k$ is an auxiliary k-dimensional Hilbert space. Let $\{|i\rangle\}_{i=1}^{+\infty}$ be an orthonormal basis in a separable Hilbert space $H'$. Consider the unit vector

$$|\psi_0\rangle = \sum_{i=1}^{+\infty} \sqrt{\pi_i^0}\, |\varphi_i\rangle \otimes |i\rangle$$

in the space $H \otimes H_k \otimes H'$. It is easy to see that $\mathrm{Tr}_{H_k \otimes H'} |\psi_0\rangle\langle\psi_0| = \rho_0$. By Lemma 3 there exists a sequence $\{|\psi_n\rangle\}$ of unit vectors in $H \otimes H_k \otimes H'$ converging to the vector $|\psi_0\rangle$ such that $\mathrm{Tr}_{H_k \otimes H'} |\psi_n\rangle\langle\psi_n| = \rho_n$ for each n.


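Let $\{E_i = I_H \otimes I_{H_k} \otimes |i\rangle\langle i|\}_{i=1}^{+\infty}$ be the local measurement in the space $H \otimes H_k \otimes H'$ [10]. Since $E_i|\psi_0\rangle = \sqrt{\pi_i^0}\,|\varphi_i\rangle \otimes |i\rangle$ for each i, we have

$$\pi_i^0 = \mathrm{Tr}\,E_i|\psi_0\rangle\langle\psi_0| \quad\text{and}\quad \pi_i^0 \rho_i^0 = \mathrm{Tr}_{H_k \otimes H'}\, E_i|\psi_0\rangle\langle\psi_0|E_i.$$

Let $\pi_i^n = \mathrm{Tr}\,E_i|\psi_n\rangle\langle\psi_n|$ and

$$\rho_i^n = \begin{cases} (\pi_i^n)^{-1}\, \mathrm{Tr}_{H_k \otimes H'}\, E_i|\psi_n\rangle\langle\psi_n|E_i, & \pi_i^n > 0, \\ \rho_i^0, & \pi_i^n = 0. \end{cases}$$

Then $\mathrm{rank}\,\rho_i^n \le k$ for all n and i. The sequence of ensembles $\{\pi_i^n, \rho_i^n\}$ has the required properties.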
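The measurement step of this proof can be illustrated by a small sketch under strong simplifying assumptions: real amplitudes, a two-dimensional space in the role of H, k = 1 (so the auxiliary k-dimensional factor is trivial) and two outcomes. A unit vector of the composite space is stored as a list of "columns" c_i, its components along the basis vectors |i⟩ of the last factor. The sketch does not implement Lemma 3: the nearby unit vector is simply chosen by hand, and all names (`reduced_state`, `measure`, `normalize`, `psi_n`) are hypothetical.

```python
import math

def outer(u, v):
    """Matrix u v^T for real vectors u and v."""
    return [[a * b for b in v] for a in u]

def reduced_state(cols):
    """Partial trace over the last factor: sum_i c_i c_i^T."""
    d = len(cols[0])
    rho = [[0.0] * d for _ in range(d)]
    for c in cols:
        for a in range(d):
            for b in range(d):
                rho[a][b] += c[a] * c[b]
    return rho

def measure(cols):
    """Ensemble {pi_i, rho_i} produced by the projectors E_i = I (x) |i><i|.

    Outcomes of zero probability are skipped here, whereas the proof
    assigns them the original state rho_i^0.
    """
    ensemble = []
    for c in cols:
        p = sum(x * x for x in c)  # pi_i = Tr E_i |psi><psi|
        if p > 0.0:
            rho_i = [[x / p for x in row] for row in outer(c, c)]  # rank one
            ensemble.append((p, rho_i))
    return ensemble

def normalize(cols):
    """Rescale the stored vector to unit norm."""
    norm = math.sqrt(sum(x * x for c in cols for x in c))
    return [[x / norm for x in c] for c in cols]

# psi_0 = sum_i sqrt(pi_i^0) phi_i (x) |i> with pi^0 = (1/2, 1/2), phi_1 = e_1, phi_2 = e_2.
psi0 = [[math.sqrt(0.5), 0.0], [0.0, math.sqrt(0.5)]]
# A nearby unit vector psi_n (hand-picked perturbation, an assumption of this sketch).
psi_n = normalize([[math.sqrt(0.5), 0.05], [0.02, math.sqrt(0.5)]])

rho_n = reduced_state(psi_n)   # perturbed average state
ens = measure(psi_n)           # ensemble of rank-one states
avg = [[sum(p * r[a][b] for p, r in ens) for b in range(2)] for a in range(2)]
print([round(p, 4) for p, _ in measure(psi0)], [round(p, 4) for p, _ in ens])
```

In this toy setting the average of the measured ensemble reproduces the perturbed state exactly, and the weights stay close to the original ones, which is the mechanism the proof exploits.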

Remark 1. It is interesting to compare the above lemma with Lemma 3 in [23], which contains the analogous assertion concerning finite ensembles with no rank restriction on the states of the ensembles. The case of a finite ensemble $\{\pi_i^0, \rho_i^0\}_{i=1}^{m}$ is naturally embedded in the condition of Lemma 4 by setting $\pi_i^0 = 0$ for all i > m, but this lemma does not guarantee that the sequence $\{\{\pi_i^n, \rho_i^n\}\}_n$ consists of ensembles of m states, in contrast to the assertion of Lemma 3 in [23]. Increasing dimensionality of the ensembles of the sequence $\{\{\pi_i^n, \rho_i^n\}\}_n$ is the cost of the rank restriction on the states of these ensembles.

For an arbitrary state ρ in S(H) the set $P^a_{\{\rho\}}(S(H))$ is a dense subset of $P_{\{\rho\}}(S(H))$ [11, Lemma 1]. This simple result can be strengthened as follows.

Lemma 5. For an arbitrary state ρ in S(H) and k ∈ N the set $P^a_{\{\rho\}}(S_k(H))$ is a dense subset of $P_{\{\rho\}}(S_k(H))$.

This means that any probability measure supported by the set of states of rank ≤ k can be weakly approximated by some sequence of atomic measures, i.e. countable ensembles of states of rank ≤ k, with the same barycenter.

Proof. To prove the assertion of the lemma for k = 1 consider the Choquet ordering on the set P(S(H)). We say that $\mu \succ \nu$ if and only if

$$\int_{S(H)} f(\sigma)\,\mu(d\sigma) \ge \int_{S(H)} f(\sigma)\,\nu(d\sigma)$$

for an arbitrary convex continuous bounded function f on the set S(H) [7].

By Lemma 1 in [11], for a given measure $\mu_0$ in $P(S_1(H))$ there exists a sequence $\{\mu_n\}$ of measures in P(S(H)) with finite support converging to the measure $\mu_0$ such that $b(\mu_n) = b(\mu_0)$ for all n. Decomposing each atom of the measure $\mu_n$ into a convex combination of pure states we obtain the measure $\hat{\mu}_n$ in $P^a(S_1(H))$ with the same barycenter. It is easy to see that $\hat{\mu}_n \succ \mu_n$. By µ-compactness of the set S(H) the set $\{\hat{\mu}_n\}_{n>0}$ is relatively compact. This implies existence of a subsequence $\{\hat{\mu}_{n_k}\}$ converging to a measure $\hat{\mu}_0$ in $P(S_1(H))$ [19, Theorem 6.1]. Since $\hat{\mu}_{n_k} \succ \mu_{n_k}$ for all k, the definition of the weak convergence implies $\hat{\mu}_0 \succ \mu_0$ and hence $\hat{\mu}_0 = \mu_0$ by maximality of the measure $\mu_0$ with respect to the Choquet ordering (see the Appendix). Density of the set $P^a_{\{\rho\}}(S_1(H))$ in $P_{\{\rho\}}(S_1(H))$ is proved.

Let k > 1 and let $H_k$ be the k-dimensional Hilbert space. Let $\Lambda$ be the multi-valued map from $S_k(H)$ into the set $2^{S_1(H \otimes H_k)}$ such that $\Lambda(\rho)$ is the set of all purifications in $S_1(H \otimes H_k)$ of the state $\rho \in S_k(H)$. It is clear that the map $\Lambda$ is closed-valued. Thus by Theorem 3.1 in [28], to prove existence of a measurable selection of the map $\Lambda$ it is sufficient to show weak measurability of this map in terms of [28]. Let U be an open subset of $S_1(H \otimes H_k)$. Then

$$\Lambda^{-}(U) = \{\rho \in S_k(H) \mid \Lambda(\rho) \cap U \ne \emptyset\} = \Theta(U),$$

where $\Theta(\cdot) = \mathrm{Tr}_{H_k}(\cdot)$ is the affine (single valued) map from $S(H \otimes H_k)$ onto $S_k(H)$. Since the restriction of the map $\Theta$ to the set $S_1(H \otimes H_k)$ is open,⁴ the set $\Lambda^{-}(U) = \Theta(U)$ is open and hence Borel. As mentioned before, this implies existence of a measurable selection $\Lambda_*$ of the map $\Lambda$.

Let $\mu_0$ be an arbitrary measure in $P(S_k(H))$ and let $\nu_0 = \mu_0 \circ \Lambda_*^{-1}$ be the image of the measure $\mu_0$ under the map $\Lambda_*$. It is clear that $\nu_0 \in P(S_1(H \otimes H_k))$. By the assertion of the lemma for k = 1 there exists a sequence $\{\nu_n\}$ of measures in $P^a(S_1(H \otimes H_k))$ converging to the measure $\nu_0$ such that $b(\nu_n) = b(\nu_0)$ for all n. Since $\Theta \circ \Lambda_* = \mathrm{Id}_H$, the image $\nu_0 \circ \Theta^{-1}$ of the measure $\nu_0$ under the map $\Theta$ coincides with $\mu_0$. This and continuity of the map $\Theta$ imply convergence of the sequence $\{\mu_n = \nu_n \circ \Theta^{-1}\}$ of measures in $P^a(S_k(H))$ to the measure $\mu_0$. Since the map $\Theta$ is affine we have $b(\mu_n) = \Theta(b(\nu_n)) = \Theta(b(\nu_0)) = b(\mu_0)$ for all n. Thus the sequence $\{\mu_n\}$ has the required properties.

Proof of Theorem 1. By Lemma 5 it is sufficient to prove openness of the surjective map $P^a(S_k(H)) \ni \mu \mapsto b(\mu) \in S(H)$ for each natural k.

Let U be an arbitrary open subset of $P^a(S_k(H))$. Suppose b(U) is not open in S(H). Then there exist a state $\rho_0 \in b(U)$ and a sequence $\{\rho_n\}$ of states in $S(H) \setminus b(U)$ converging to the state $\rho_0$. Let $\mu_0 = \{\pi_i^0, \rho_i^0\}$ be a measure in U such that $b(\mu_0) = \rho_0$. By Lemma 4 (and the remark after it) there exists a sequence of measures $\mu_n = \{\pi_i^n, \rho_i^n\}$ in $P^a(S_k(H))$ converging to the measure $\mu_0 = \{\pi_i^0, \rho_i^0\}$ such that $b(\mu_n) = \rho_n$ for all n. Openness of the set U implies $\mu_n \in U$ for all sufficiently large n, contradicting the choice of the sequence $\{\rho_n\}$.

⁴ This means that for an arbitrary state $\omega_0 \in S_1(H \otimes H_k)$ and a sequence $\{\rho_n\} \subset S_k(H)$ converging to the state $\rho_0 = \mathrm{Tr}_{H_k}\,\omega_0$ there exist a subsequence $\{\rho_{n_k}\}$ and a sequence $\{\omega_k\} \subset S_1(H \otimes H_k)$ converging to the state $\omega_0$ such that $\mathrm{Tr}_{H_k}\,\omega_k = \rho_{n_k}$ for all k. The last property can be verified by using the standard arguments of the purification theory.

3.2. Some implications. In the case dim H < +∞ the convex (concave) roof extension to the set S(H) of a function f on the set of pure states $S_1(H) = \mathrm{extr}\,S(H)$ is defined at a mixed state ρ as the minimal (maximal) value of $\sum_i \pi_i f(\rho_i)$ over all decompositions $\rho = \sum_i \pi_i \rho_i$ of this state into a finite convex combination of pure states [27]. This extension is widely used in quantum information theory, in particular, in the construction of entanglement monotones [21].

The convex (concave) roof extension has two natural generalizations to the case dim H = +∞, called in [25] the σ-convex (concave) roof and the µ-convex (concave) roof, respectively. The first extension is defined via all decompositions of a state into a countable convex combination of pure states, while the second one is defined via all “continuous” decompositions corresponding to Borel probability measures on the set of pure states with a given barycenter.

Generalizing the σ-concave roof construction, for a given natural k and a semibounded function f on the set $S_k(H)$ consider the function

$$S(H) \ni \rho \mapsto \hat{f}_k^{\sigma}(\rho) = \sup_{\{\pi_i, \rho_i\} \in P^a_{\{\rho\}}(S_k(H))} \sum_i \pi_i f(\rho_i)$$

(the supremum is over all decompositions of the state ρ into a countable convex combination of states of rank ≤ k). This function is obviously σ-concave on the set S(H) (see Sect. 2). If the function f is σ-concave at any state in $S_k(H)$ then the functions $\hat{f}_k^{\sigma}$ and f coincide on the set $S_k(H)$, so in this case the function $\hat{f}_k^{\sigma}$ can be considered as an extension of the function f to the set S(H).

Generalizing the µ-concave roof construction, for a given natural k and a semibounded Borel function f on the set $S_k(H)$ consider the function

$$S(H) \ni \rho \mapsto \hat{f}_k^{\mu}(\rho) = \sup_{\mu \in P_{\{\rho\}}(S_k(H))} \int_{S_k(H)} f(\sigma)\,\mu(d\sigma)$$

(the supremum is over all probability measures with the barycenter ρ supported by states of rank ≤ k). This function is also obviously σ-concave on the set S(H), but its µ-concavity depends on the question of its universal measurability. By Propositions 1 and 2 below, the function $\hat{f}_k^{\mu}$ is µ-concave on the set S(H) if the function f is either lower bounded and lower semicontinuous or upper bounded and upper semicontinuous on the set $S_k(H)$. If the function f is µ-concave at any state in $S_k(H)$ then the functions $\hat{f}_k^{\mu}$ and f coincide on the set $S_k(H)$, so in this case the function $\hat{f}_k^{\mu}$ can be considered as an extension of the function f to the set S(H).

The strong stability property of the set S(H) stated in Theorem 1 and Lemma 5 imply the following result.

Proposition 1. Let f be a lower semicontinuous lower bounded function on the set $S_k(H)$. Then $\hat{f}_k^{\sigma} = \hat{f}_k^{\mu}$ and this function is lower semicontinuous and µ-concave on the set S(H).

Proof. Coincidence of the functions $\hat{f}_k^{\sigma}$ and $\hat{f}_k^{\mu}$ follows from lower semicontinuity of the functional $P(S_k(H)) \ni \mu \mapsto \int_{S_k(H)} f(\sigma)\,\mu(d\sigma)$ (proved by the standard argument) and Lemma 5. Theorem 1 and Lemma 2 imply lower semicontinuity of the lower bounded function $\hat{f}_k^{\sigma} = \hat{f}_k^{\mu}$, which guarantees its µ-concavity (by Proposition A-2 in the Appendix in [25]).

The µ-compactness property of the set S(H) (described before Lemma 1) implies the following result.

Proposition 2. Let f be an upper semicontinuous upper bounded function on the set $S_k(H)$. Then the function $\hat{f}_k^{\mu}$ is upper semicontinuous and µ-concave on the set S(H). For an arbitrary state ρ in S(H) the supremum in the definition of the value $\hat{f}_k^{\mu}(\rho)$ is achieved at some measure in $P_{\{\rho\}}(S_k(H))$.

Proof. Lemma 2 implies attainability of the supremum in the definition of the value $\hat{f}_k^{\mu}(\rho)$ and upper semicontinuity of the function $\hat{f}_k^{\mu}$, which guarantees its µ-concavity (by Proposition A-2 in the Appendix in [25]).

Under the condition of Proposition 2 we can say nothing about upper semicontinuity and µ-concavity of the function $\hat{f}_k^{\sigma}$ (see Example 2 in [25]).

The above two propositions have the following obvious corollary.

Corollary 1. Let f be a continuous bounded function on the set $S_k(H)$. Then $\hat{f}_k^{\sigma} = \hat{f}_k^{\mu}$ and this function is continuous on the set S(H).


4. On Approximation of Concave (Convex) Functions on S(H)

The functional constructions considered in Subsect. 3.2 can be used in the study of the following approximation problem: for a given concave (convex) function f on the set S(H) having some particular symmetry,⁵ find a monotonic sequence $\{f_k\}$ of concave (convex) functions on the set S(H) having the same symmetry, satisfying additional analytical requirements and such that

$$f_k|_{S_k(H)} = f|_{S_k(H)}, \quad \forall k, \qquad\text{and}\qquad \lim_{k\to+\infty} f_k(\rho) = f(\rho), \quad \forall \rho \in S(H).$$

Let f be a function on the set S(H) having semibounded restriction to the set $S_k(H)$ for each k. We can consider the nondecreasing sequence $\{\hat{f}_k^{\sigma}\}$ of concave functions on the set S(H) and its pointwise limit $\hat{f}_{+\infty}^{\sigma} = \sup_k \hat{f}_k^{\sigma}$. If the restriction of the function f to the set $S_k(H)$ is universally measurable for each k then we can also consider the nondecreasing sequence $\{\hat{f}_k^{\mu}\}$ of concave functions on the set S(H) and its pointwise limit $\hat{f}_{+\infty}^{\mu} = \sup_k \hat{f}_k^{\mu}$.

By construction all the functions in the sequences $\{\hat{f}_k^{\sigma}\}$ and $\{\hat{f}_k^{\mu}\}$ inherit the arbitrary symmetry of the function f. Hence the same assertion holds for the functions $\hat{f}_{+\infty}^{\sigma}$ and $\hat{f}_{+\infty}^{\mu}$.

The functions $\hat{f}_{+\infty}^{\sigma}$ and $\hat{f}_{+\infty}^{\mu}$ are concave on the set S(H). By construction $\hat{f}_{+\infty}^{\sigma} \le \hat{f}_{+\infty}^{\mu}$ and $f|_{S_f(H)} \le \hat{f}_{+\infty}^{\sigma}|_{S_f(H)}$ ($S_f(H)$ is the convex subset of S(H) consisting of finite rank states). If the function f is σ-concave on the set S(H) then $\hat{f}_{+\infty}^{\sigma} \le f$; if the function f is µ-concave on the set S(H) then $\hat{f}_{+\infty}^{\mu} \le f$. To show coincidence of the functions $\hat{f}_{+\infty}^{\sigma}$ and $\hat{f}_{+\infty}^{\mu}$ with the function f additional conditions are required.

Proposition 3. Let f be a concave lower semicontinuous lower bounded function on the set S(H), having some particular symmetry.

A) For each natural k the concave lower semicontinuous function $\hat{f}_k^{\sigma} = \hat{f}_k^{\mu}$ has the same symmetry and coincides with the function f on the set $S_k(H)$. The pointwise limit $\hat{f}_{+\infty}^{\sigma} = \hat{f}_{+\infty}^{\mu}$ of the monotonic sequence $\{\hat{f}_k^{\sigma} = \hat{f}_k^{\mu}\}$ coincides with the function f on the set S(H).

B) If the function f has continuous restriction to the set $S_k(H)$ for each natural k then the sequence $\{\hat{f}_k^{\sigma} = \hat{f}_k^{\mu}\}$ consists of concave continuous bounded functions on the set S(H).

Proof. By Proposition 1, $\hat{f}_k^{\sigma} = \hat{f}_k^{\mu}$ and this function is lower semicontinuous for each k. This implies $\hat{f}_{+\infty}^{\sigma} = \hat{f}_{+\infty}^{\mu}$ and lower semicontinuity of the last function. Since the function f is µ-concave by Proposition A-2 in the Appendix in [25], the first assertion of the proposition follows from the previous observations and Lemma 6 below.

The second assertion of the proposition follows from Corollary 1, since it is easy to see that continuity of the restrictions of the concave function f to the set $S_k(H)$ for all k implies boundedness of these restrictions.

Lemma 6. A lower semicontinuous lower bounded concave function f on the set S(H) is uniquely determined by its restriction to the set $S_f(H)$ of finite rank states.

⁵ This means that the function f is invariant with respect to the particular family of symmetries of the set S(H).

Continuity of the von Neumann Entropy


Proof. It is sufficient to consider the case of a nonnegative function $f$. Let $\rho_0$ be an arbitrary state and let $\rho_n = (\mathrm{Tr}\,P_n\rho_0)^{-1}P_n\rho_0$ be the sequence of finite rank states converging to the state $\rho_0$, where $\{P_n\}$ is the sequence of finite rank spectral projectors of the state $\rho_0$ increasing to the identity operator $I_{\mathcal{H}}$. For each $n$ the inequality $\lambda_n\rho_n \le \rho_0$ with $\lambda_n = \mathrm{Tr}\,P_n\rho_0$ implies the decomposition $\rho_0 = \lambda_n\rho_n + (1-\lambda_n)\sigma_n$, where $\sigma_n = (1-\lambda_n)^{-1}(\rho_0 - \lambda_n\rho_n)$ is a state. By concavity and nonnegativity of the function $f$ we have $f(\rho_0) \ge \lambda_n f(\rho_n)$ for all $n$, which, since $\lambda_n \to 1$, implies $\limsup_{n\to+\infty} f(\rho_n) \le f(\rho_0)$. By lower semicontinuity of the function $f$ we have $\lim_{n\to+\infty} f(\rho_n) = f(\rho_0)$, so the value $f(\rho_0)$ is determined by the values of $f$ on finite rank states.

Remark 2. The first assertion of Proposition 3 can be considered as a "constructive form" of Lemma 6 since it provides a constructive way of restoring a lower semicontinuous lower bounded concave function on the set $S(\mathcal{H})$ by means of its restriction to the set $S_f(\mathcal{H})$.

Note that the above functions $\hat f_{+\infty}^{\sigma}$ and $\hat f_{+\infty}^{\mu}$ can be used in the study of the following construction problem: for a given concave function defined on the convex set $S_f(\mathcal{H})$ of finite rank states and having some particular analytical and symmetry properties, construct its concave extension to the set $S(\mathcal{H})$ of all states preserving these properties. Since only the restriction of the function $f$ to the set $S_f(\mathcal{H})$ is used in the proof of Proposition 3, it shows that for an arbitrary concave lower bounded function $f$ on the set $S_f(\mathcal{H})$ with some particular symmetry such that its restriction to the set $S_k(\mathcal{H})$ is lower semicontinuous for each $k$ there exists a unique concave lower semicontinuous extension $\hat f_{+\infty}^{\sigma} = \hat f_{+\infty}^{\mu}$ to the set $S(\mathcal{H})$ with the same symmetry. For example, if $f$ is an entropy-type (i.e. nonnegative concave lower semicontinuous unitary invariant) function defined on the set of finite rank states, then $\hat f_{+\infty}^{\sigma} = \hat f_{+\infty}^{\mu}$ is its unique entropy-type extension to the set of all states.

The second assertion of Proposition 3 and the generalized Dini's lemma (in which the condition of continuity of functions of an increasing sequence is relaxed to their lower semicontinuity) imply the following continuity condition.

Corollary 2. Let $f$ be a concave lower semicontinuous lower bounded function on the set $S(\mathcal{H})$.

A) If the function $f$ has continuous restriction to the set $S_k(\mathcal{H})$ for each $k$ then uniform convergence of the sequence $\{\hat f_k^{\sigma} = \hat f_k^{\mu}\}$ on a particular subset $\mathcal{A} \subseteq S(\mathcal{H})$ implies continuity of the function $f$ on this subset.

B) Continuity of the function $f$ on a compact subset $\mathcal{A} \subseteq S(\mathcal{H})$ implies uniform convergence of the sequence $\{\hat f_k^{\sigma} = \hat f_k^{\mu}\}$ on this subset.

We will use Corollary 2 in the next section to obtain continuity conditions for the von Neumann entropy.

5. The Approximation of the von Neumann Entropy and the Continuity Conditions

The von Neumann entropy $H(\rho) = -\mathrm{Tr}\,\rho\log\rho$ is a lower semicontinuous concave unitary invariant function on the set $S(\mathcal{H})$ of quantum states with the range $[0,+\infty]$, having continuous restriction to the set $S_k(\mathcal{H})$ for each $k$. By Proposition 3 the function $H$ is


M. E. Shirokov

a pointwise limit of the increasing sequence $\{H_k\}$ of nonnegative concave continuous bounded$^6$ unitary invariant functions on the set $S(\mathcal{H})$ defined as follows:

$$H_k(\rho) = \sup_{\{\pi_i,\rho_i\}\in P^{a}_{\{\rho\}}(S_k(\mathcal{H}))} \sum_i \pi_i H(\rho_i) = \sup_{\mu\in P_{\{\rho\}}(S_k(\mathcal{H}))} \int_{S_k(\mathcal{H})} H(\sigma)\,\mu(d\sigma).$$

(The first supremum is over all decompositions of the state $\rho$ into a countable convex combination of states of rank $\le k$ while the second one is over all probability measures with the barycenter $\rho$ supported by states of rank $\le k$.) For each $k$ the function $H_k$ may be called the entropy approximator of order $k$ or briefly the $k$-approximator. By construction the von Neumann entropy coincides with its $k$-approximator on the set $S_k(\mathcal{H})$ of all states of rank $\le k$. For an arbitrary state $\rho \in S(\mathcal{H})$ the difference $\Delta_k(\rho) = H(\rho) - H_k(\rho)$ between the von Neumann entropy and its $k$-approximator can be expressed as follows:

$$\Delta_k(\rho) = \inf_{\{\pi_i,\rho_i\}\in P^{a}_{\{\rho\}}(S_k(\mathcal{H}))} \sum_i \pi_i H(\rho_i\|\rho) = \inf_{\mu\in P_{\{\rho\}}(S_k(\mathcal{H}))} \int_{S_k(\mathcal{H})} H(\sigma\|\rho)\,\mu(d\sigma),$$

where $H(\cdot\|\cdot)$ is the relative entropy [15,29] (the first equality follows from expression (4) below, the second one from Proposition 1 in [11]). The possibility to express the value $\Delta_k(\rho)$ via the relative entropy is essentially used in what follows (see Lemma 8 below).

The representation of the von Neumann entropy as a limit of the increasing sequence $\{H_k\}$ of concave continuous bounded unitary invariant functions can be used for different purposes, in particular, for construction of the increasing sequence of continuous entanglement monotones providing approximation of the Entanglement of Formation (see Sect. 6 in [25]). By Corollary 2 this representation can be used for proving continuity of the von Neumann entropy on a subset of states by showing uniform convergence to zero of the sequence $\{\Delta_k\}$ on this subset. The last property of a subset of states, in what follows called the uniform approximation property, is considered in detail in the next subsection (in the extended context of subsets of the positive cone of trace-class operators).

5.1. The uniform approximation property. Since in many applications it is necessary to deal with the following extensions (cf. [13])

$$S(A) = -\mathrm{Tr}\,A\log A \quad\text{and}\quad H(A) = S(A) - \eta(\mathrm{Tr}\,A)$$

of the von Neumann entropy to the cone $T_+(\mathcal{H})$ of all positive trace-class operators (where $\eta(x) = -x\log x$), we will obtain the continuity conditions for the function $A \mapsto H(A)$ on this extended domain. In what follows the function $A \mapsto H(A)$ on the cone $T_+(\mathcal{H})$ is called the quantum entropy, while the function $\{x_i\} \mapsto H(\{x_i\}) = \sum_i \eta(x_i) - \eta\big(\sum_i x_i\big)$ on the positive cone of the space $l_1$, coinciding with the Shannon entropy on the set $P_{+\infty}$ of probability distributions, is called the classical entropy.

6 It is easy to see that the range of the function $H_k$ coincides with $[0, \log k]$.
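The definition of the classical entropy on the positive cone of $l_1$ can be illustrated by a minimal Python sketch for finite sequences. The function names `eta`, `classical_entropy` and `shannon_entropy` are hypothetical helpers introduced only for this illustration; the sketch checks that $H(\{x_i\})$ reduces to the Shannon entropy on a probability distribution and is homogeneous of degree one.

```python
import math

def eta(x: float) -> float:
    """eta(x) = -x log x, extended by eta(0) = 0."""
    return 0.0 if x == 0.0 else -x * math.log(x)

def classical_entropy(xs):
    """H({x_i}) = sum_i eta(x_i) - eta(sum_i x_i) for a finite nonnegative sequence."""
    return sum(eta(x) for x in xs) - eta(sum(xs))

def shannon_entropy(ps):
    """Shannon entropy (natural logarithm) of a probability distribution."""
    return sum(eta(p) for p in ps)

probs = [0.5, 0.25, 0.125, 0.125]
scaled = [3.0 * p for p in probs]       # a point of the cone with total mass 3

h_prob = classical_entropy(probs)       # equals the Shannon entropy of probs
h_scaled = classical_entropy(scaled)    # equals 3 * h_prob by homogeneity
```

The subtraction of $\eta(\sum_i x_i)$ is what makes the function homogeneous; without it, scaling the sequence would add a term proportional to the total mass.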


The von Neumann entropy has the important property expressed in the following inequality:

$$H\Big(\sum_{i=1}^{n}\lambda_i\rho_i\Big) \le \sum_{i=1}^{n}\lambda_i H(\rho_i) + \sum_{i=1}^{n}\eta(\lambda_i), \tag{1}$$

valid for an arbitrary set $\{\rho_i\}_{i=1}^{n}$ of states and probability distribution $\{\lambda_i\}_{i=1}^{n}$, where $n \le +\infty$ (Proposition 6.2 in [15] and the simple approximation). The definition and inequality (1) with $n = 2$ imply the following properties of the quantum entropy:

$$H(\lambda A) = \lambda H(A), \tag{2}$$

$$H(A) + H(B-A) \le H(B) \le H(A) + H(B-A) + \mathrm{Tr}\,B\; h_2\!\left(\frac{\mathrm{Tr}\,A}{\mathrm{Tr}\,B}\right), \tag{3}$$

where $A, B \in T_+(\mathcal{H})$, $A \le B$, $\lambda \ge 0$ and $h_2(x) = \eta(x) + \eta(1-x)$. Note that

$$S(A) - \sum_i \pi_i S(A_i) = \sum_i \pi_i H(A_i\|A) \tag{4}$$

for an arbitrary ensemble $\{\pi_i, A_i\}$ of operators in $T_+(\mathcal{H})$ with the average $A$, where $H(\cdot\|\cdot)$ is the (extended) relative entropy defined for arbitrary operators $A$ and $B$ in $T_+(\mathcal{H})$ as follows (cf. [13]):

$$H(A\|B) = \sum_i \langle i|\,(A\log A - A\log B + B - A)\,|i\rangle,$$

where $\{|i\rangle\}$ is the orthonormal basis of eigenvectors of $A$ and it is assumed that $H(A\|B) = +\infty$ if $\mathrm{supp}\,A$ is not contained in $\mathrm{supp}\,B$. It is easy to verify that

$$H(\lambda A\|\lambda B) = \lambda H(A\|B), \quad \lambda \ge 0. \tag{5}$$
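Identity (4) and the homogeneity relation (5) can be checked numerically in the simplest setting. The following Python sketch assumes that all operators are diagonal in one fixed basis (so they commute and are represented by lists of nonnegative numbers); this restriction, the helper names and the sample ensemble are assumptions made for illustration only.

```python
import math

def eta(x):
    """eta(x) = -x log x, extended by eta(0) = 0."""
    return 0.0 if x == 0.0 else -x * math.log(x)

def S(a):
    """S(A) = -Tr A log A for A = diag(a)."""
    return sum(eta(x) for x in a)

def rel_entropy(a, b):
    """H(A||B) = Tr(A log A - A log B + B - A) for commuting A = diag(a), B = diag(b)."""
    total = 0.0
    for x, y in zip(a, b):
        if x > 0.0 and y == 0.0:
            return math.inf  # supp A is not contained in supp B
        if x > 0.0:
            total += x * (math.log(x) - math.log(y))
        total += y - x
    return total

weights = [0.3, 0.7]
ensemble = [[0.5, 0.2, 0.0], [0.1, 0.6, 0.4]]   # diagonals with different traces
average = [sum(w * a[j] for w, a in zip(weights, ensemble)) for j in range(3)]

lhs = S(average) - sum(w * S(a) for w, a in zip(weights, ensemble))        # left side of (4)
rhs = sum(w * rel_entropy(a, average) for w, a in zip(weights, ensemble))  # right side of (4)
```

The terms $B - A$ in the extended relative entropy cancel on average because the ensemble has the average $A$, which is why (4) holds without normalizing the operators.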

For given natural $k$ consider the function

$$H_k(A) = \sup_{\{\pi_i,A_i\}\in P^{a}_{\{A\}}(T_+^{k}(\mathcal{H}))} \sum_i \pi_i H(A_i)$$

on the set $T_+(\mathcal{H})$ (the supremum is over all decompositions of the operator $A$ into a countable convex combination of operators of rank $\le k$). By using (2) it is easy to see that the restriction of the above function $H_k$ to the set $S(\mathcal{H})$ coincides with the $k$-approximator of the von Neumann entropy defined in the first part of this section (so we use the same notation) and that

$$H_k(\lambda A) = \lambda H_k(A), \quad A \in T_+(\mathcal{H}),\ \lambda \ge 0.$$

Thus we have

$$H_k(A) = \|A\|_1\, \hat H_k^{\sigma}\big(\|A\|_1^{-1}A\big) \le \|A\|_1 \log k, \quad A \in T_+(\mathcal{H}). \tag{6}$$

The contribution of the strong stability property of the set $S(\mathcal{H})$ to the results below is based on the following observation.


Lemma 7. For arbitrary natural $k$ the function $A \mapsto H_k(A)$ is continuous on the cone $T_+(\mathcal{H})$.

Proof. By means of (6) the assertion of the lemma follows from Corollary 1 showing continuity of the function $\rho \mapsto \hat H_k^{\sigma}(\rho)$ on the set $S(\mathcal{H})$.

For given natural $k$ consider the function

$$\Delta_k(A) = \inf_{\{\pi_i,A_i\}\in P^{a}_{\{A\}}(T_+^{k}(\mathcal{H}))} \sum_i \pi_i H(A_i\|A) \tag{7}$$

on the set $T_+(\mathcal{H})$ (the infimum is over all decompositions of the operator $A$ into a countable convex combination of operators of rank $\le k$). It follows from (5) that

$$\Delta_k(\lambda A) = \lambda\Delta_k(A), \quad A \in T_+(\mathcal{H}),\ \lambda \ge 0. \tag{8}$$

By Lemma 8 below the restriction of the function $\Delta_k$ defined in (7) to the set $S(\mathcal{H})$ coincides with the function $\Delta_k = H - H_k$ defined in the first part of this section (so we use the same notation). We will use the following properties of the function $\Delta_k$.

Lemma 8. For each natural $k$ the following assertions hold:

A) For an arbitrary operator $A \in T_+(\mathcal{H})$ the infimum in definition (7) of the value $\Delta_k(A)$ can be taken over the subset of $P^{a}_{\{A\}}(T_+^{k}(\mathcal{H}))$ consisting of ensembles $\{\pi_i, A_i\}$ such that $\mathrm{Tr}\,A_i = \mathrm{Tr}\,A$ for all $i$ and hence $\Delta_k(A) = H(A) - H_k(A)$.

B) The function $T_+(\mathcal{H}) \ni A \mapsto \Delta_k(A)$ is nonnegative, lower semicontinuous, unitary invariant and homogeneous in the sense of (8). $\Delta_k^{-1}(0) = T_+^{k}(\mathcal{H})$. Continuity of this function on a subset $\mathcal{A} \subset T_+(\mathcal{H})$ means continuity of the quantum entropy on the subset $\mathcal{A}$.

C) The function $A \mapsto \Delta_k(A)$ is monotone with respect to the operator order:

$$A \le B \ \Rightarrow\ \Delta_k(A) \le \Delta_k(B), \quad \forall A, B \in T_+(\mathcal{H}).$$

D) Let $\{\lambda_i(A)\}$ be the sequence of the eigenvalues of the operator $A \in T_+(\mathcal{H})$ arranged in nonincreasing order;$^7$ then

$$\Delta_k(A) \le \widehat{\Delta}_k(A) \doteq H(\{\lambda_i^{k}(A)\}) = \sum_{i=1}^{+\infty}\eta(\lambda_i^{k}(A)) - \eta(\|A\|_1),$$

where the sequence $\{\lambda_i^{k}(A)\}$ is the $k$-order coarse-graining of the sequence $\{\lambda_i(A)\}$, i.e. $\lambda_i^{k}(A) = \lambda_{(i-1)k+1}(A) + \cdots + \lambda_{ik}(A)$ for all $i = 1, 2, \ldots$

E) For arbitrary operators $A$ in $T_+(\mathcal{H})$ and $C$ in $B(\mathcal{H})$ the following inequality holds:

$$\Delta_k(CAC^{*}) \le \|C\|^{2}\Delta_k(A).$$

7 It is possible to take the sequence $\{\lambda_i(A)\}$ in arbitrary order, but the corresponding sequence $\{\lambda_i^{k}(A)\}$ is closest to the sequence $(\|A\|_1, 0, 0, \ldots)$ having zero entropy provided that the nonincreasing order is used. The relation between $\Delta_k(A)$ and $\widehat{\Delta}_k(A)$ is considered in Remark 3 below.


F) For an arbitrary operator $A$ in $T_+(\mathcal{H})$ and an arbitrary sequence $\{P_n\}$ of projectors in $B(\mathcal{H})$ strongly converging to the identity operator $I_{\mathcal{H}}$ the following relation holds:

$$\lim_{n\to+\infty}\Delta_k(P_nAP_n) = \Delta_k(A).$$

G) For an arbitrary operator $A$ in $T_+(\mathcal{H})$ and an arbitrary family $\{P_i\}_{i=1}^{m}$ of mutually orthogonal projectors in $B(\mathcal{H})$ ($m \le +\infty$) the following inequality holds:

$$\Delta_k(A) \ge \sum_{i=1}^{m}\Delta_k(P_iAP_i).$$

H) For an arbitrary operator $A$ in $T_+(\mathcal{H})$ and an arbitrary quantum operation $\Phi : T(\mathcal{H}) \to T(\mathcal{H})$ having the Kraus representation consisting of $\le n$ summands the following inequality holds:

$$\Delta_{nk}(\Phi(A)) \le \Delta_k(A).$$

I) For an arbitrary finite set $\{A_i\}_{i=1}^{m}$ of operators in $T_+(\mathcal{H})$ and corresponding set $\{k_i\}_{i=1}^{m}$ of natural numbers the following inequality holds:

$$\Delta_{k_1+k_2+\cdots+k_m}\Big(\sum_{i=1}^{m}A_i\Big) \le \sum_{i=1}^{m}\Delta_{k_i}(A_i).$$

J) For an arbitrary countable set $\{A_i\}_{i=1}^{+\infty}$ of operators in $T_+(\mathcal{H})$, probability distribution $\{\lambda_i\}_{i=1}^{+\infty}$ and natural $m$ the following inequality holds:

$$\Delta_{mk}\Big(\sum_{i=1}^{+\infty}\lambda_iA_i\Big) \le \sum_{i=1}^{+\infty}\lambda_i\Delta_k(A_i) + \sum_{i=m}^{+\infty}\lambda_i H\Big(A_i\,\Big\|\,\Big(\sum_{j=m}^{+\infty}\lambda_j\Big)^{-1}\sum_{j=m}^{+\infty}\lambda_jA_j\Big) \le \sum_{i=1}^{+\infty}\lambda_i\Delta_k(A_i) + \sup_{i\ge m}\|A_i\|_1\, H(\{\lambda_i\}_{i\ge m}).$$

Proof. A) For an arbitrary ensemble $\{\pi_i, A_i\}$ in $P^{a}_{\{A\}}(T_+^{k}(\mathcal{H}))$ one can consider the ensemble $\{\lambda_i, B_i\}$ in $P^{a}_{\{A\}}(T_+^{k}(\mathcal{H}))$, where $\lambda_i = \pi_i\|A_i\|_1\|A\|_1^{-1}$ and $B_i = \|A\|_1\|A_i\|_1^{-1}A_i$, such that

$$\sum_i \lambda_i H(B_i\|A) = \sum_i \pi_i H(A_i\|A) - \Big[\eta(\|A\|_1) - \sum_i \pi_i\eta(\|A_i\|_1)\Big] \le \sum_i \pi_i H(A_i\|A),$$

where the last inequality follows from concavity of the function $\eta$, since $\sum_i \pi_i\|A_i\|_1 = \|A\|_1$. By (2) and (6) this implies $\Delta_k(A) = H(A) - H_k(A)$.

B) Lemma 7 and Assertion A imply the first and the third parts of this assertion. To prove the second one note that the inclusion $T_+^{k}(\mathcal{H}) \subseteq \Delta_k^{-1}(0)$ follows from the


definition of the function $\Delta_k$, while the converse inclusion is easily derived from the implication $\rho \in S(\mathcal{H}) \setminus S_k(\mathcal{H}) \Rightarrow H(\rho) > H_k(\rho)$, which follows from strict concavity of the von Neumann entropy and the last assertion of Proposition 2, implying attainability of the supremum in the second (continuous) expression in the definition of the function $H_k(\rho)$.

C) If $A \le B$ then there exists a contraction $C$ such that $A = CBC^{*}$. Indeed, on the subspace $\mathrm{supp}\,B$ this contraction is constructed as the continuous extension to this subspace of the linear operator $A^{1/2}B^{-1/2}$ defined on the linear hull of the eigenvectors of the operator $B$ corresponding to the positive eigenvalues, while on the subspace $\mathcal{H} \ominus \mathrm{supp}\,B$ it acts as the zero operator. Hence this assertion follows from Assertion H proved below.

D) Let $\{P_i^{k}\}_i$ be the sequence of spectral projectors of the operator $A$ such that the projector $P_i^{k}$ corresponds to the eigenvalues $\lambda_{(i-1)k+1}(A), \ldots, \lambda_{ik}(A)$. Then $\lambda_i^{k}(A) = \mathrm{Tr}\,P_i^{k}A$ for all $i$ and the ensemble $\{\pi_i^{k}, (\pi_i^{k})^{-1}P_i^{k}A\}$, where $\pi_i^{k} = \lambda_i^{k}(A)\|A\|_1^{-1}$, belongs to the set $P^{a}_{\{A\}}(T_+^{k}(\mathcal{H}))$. Hence

$$\Delta_k(A) \le \sum_i \pi_i^{k} H\big((\pi_i^{k})^{-1}P_i^{k}A\,\big\|\,A\big) = H(\{\lambda_i^{k}(A)\}).$$

E) By means of (8) this follows from Assertion H proved below.

F) By lower semicontinuity of the function $\Delta_k$ (Assertion B) this follows from Assertion E.

G) It is sufficient to prove that

$$\Delta_k(A) \ge \Delta_k(PAP) + \Delta_k(\bar P A\bar P),$$

where $\bar P = I_{\mathcal{H}} - P$, for an arbitrary projector $P$. This inequality is easily proved by using the definition of the function $\Delta_k$ and the inequality

$$H(A\|B) \ge H(PAP\|PBP) + H(\bar PA\bar P\|\bar PB\bar P)$$

valid for arbitrary operators $A$ and $B$ in $T_+(\mathcal{H})$ (Lemma 3 in [13]).

H) This follows from monotonicity of the relative entropy since for an arbitrary ensemble $\{\pi_i, A_i\}$ in $P^{a}_{\{A\}}(T_+^{k}(\mathcal{H}))$ the ensemble $\{\pi_i, \Phi(A_i)\}$ lies in $P^{a}_{\{\Phi(A)\}}(T_+^{nk}(\mathcal{H}))$.

I) By means of (8) it is sufficient to show that

$$\Delta_{k'+k''}(\gamma A + (1-\gamma)B) \le \gamma\Delta_{k'}(A) + (1-\gamma)\Delta_{k''}(B) \tag{9}$$

for arbitrary operators $A$ and $B$ in $T_+(\mathcal{H})$ and $\gamma \in [0,1]$. For given $k'$ and $k''$ let $\{\pi_i, A_i\}_i$ and $\{\lambda_j, B_j\}_j$ be ensembles of operators of rank $\le k'$ with the average $A$ and of rank $\le k''$ with the average $B$ correspondingly. Then the ensemble $\{\pi_i\lambda_j, \gamma A_i + (1-\gamma)B_j\}_{i,j}$ has the average $\gamma A + (1-\gamma)B$ and consists of operators of rank $\le k'+k''$. By joint convexity of the relative entropy we have

$$\Delta_{k'+k''}(\gamma A + (1-\gamma)B) \le \sum_{i,j}\pi_i\lambda_j H\big(\gamma A_i + (1-\gamma)B_j\,\big\|\,\gamma A + (1-\gamma)B\big) \le \gamma\sum_i \pi_i H(A_i\|A) + (1-\gamma)\sum_j \lambda_j H(B_j\|B),$$

which implies inequality (9).


J) The first inequality with $m = 1$ is easily derived from the definition of the function $\Delta_k$ by using Donald's identity

$$\sum_i \pi_i H(A_i\|B) = \sum_i \pi_i H(A_i\|A) + H(A\|B),$$

valid for an arbitrary ensemble $\{\pi_i, A_i\}$ of positive trace-class operators with the average $A$ and an arbitrary trace-class operator $B$ [15]. The case $m > 1$ is reduced to the case $m = 1$ by applying Assertion I (with (8)) to the sum $\sum_{i=1}^{m}A'_i$, where $A'_i = \lambda_iA_i$ for $i = 1, \ldots, m-1$ and $A'_m = \sum_{i=m}^{+\infty}\lambda_iA_i$. The second inequality follows from the estimation

$$\sum_i \pi_i H(A_i\|A) \le \sup_i \|A_i\|_1\, H(\{\pi_i\}),$$

valid for an arbitrary ensemble $\{\pi_i, A_i\}$ of trace-class operators with the average $A$, which can be proved by using monotonicity of the relative entropy:

$$\sum_i \pi_i H(A_i\|A) = c\sum_i \pi_i H(\Phi(\rho_i)\|\Phi(\rho)) \le c\sum_i \pi_i H(\rho_i\|\rho) = cH(\{\pi_i\}),$$

where $c = \sup_i\|A_i\|_1$, $\Phi(\cdot) = c^{-1}\sum_i \langle i|\cdot|i\rangle A_i$, $\rho_i = |i\rangle\langle i|$ and $\rho = \sum_i \pi_i\rho_i$.

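Remark 3. It is easy to show that the upper bound $\widehat{\Delta}_k(A)$ in Assertion D of Lemma 8, obtained by using the spectral decomposition of the operator $A$, tends to zero if $H(A)$ is finite, which provides an additional proof of convergence of the sequence $\{H_k\}$ to the function $H$ on the cone $T_+(\mathcal{H})$. Noncoincidence of the functions $\Delta_k$ and $\widehat{\Delta}_k$, i.e. the existence of an operator $A$ in $T_+(\mathcal{H})$ such that $\Delta_k(A) < \widehat{\Delta}_k(A)$, can be shown by the following example. Let $\rho$ be the chaotic state in a particular 3-D subspace $\mathcal{H}_0 \subset \mathcal{H}$. It is clear that $\widehat{\Delta}_2(\rho) = \log 3 - \frac{2}{3}\log 2 \approx 0.64$ (we use the natural logarithm). In the subspace $\mathcal{H}_0$ consider four unit vectors

$$|\varphi_1\rangle = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix},\quad |\varphi_2\rangle = \begin{bmatrix} -1/2 \\ \sqrt{3}/2 \\ 0 \end{bmatrix},\quad |\varphi_3\rangle = \begin{bmatrix} -1/2 \\ -\sqrt{3}/2 \\ 0 \end{bmatrix},\quad |\varphi_4\rangle = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.$$

By direct calculation of eigenvalues one can show that the rank two states $\rho_1 = \frac{1}{2}|\varphi_1\rangle\langle\varphi_1| + \frac{1}{2}|\varphi_2\rangle\langle\varphi_2|$ and $\rho_2 = \frac{2}{5}|\varphi_3\rangle\langle\varphi_3| + \frac{3}{5}|\varphi_4\rangle\langle\varphi_4|$ have the entropies $H(\rho_1) \approx 0.56$ and $H(\rho_2) \approx 0.67$. Since $\frac{4}{9}\rho_1 + \frac{5}{9}\rho_2 = \rho$ we can conclude that $H_2(\rho) \ge \frac{4}{9}H(\rho_1) + \frac{5}{9}H(\rho_2) \approx 0.62$. Thus $\Delta_2(\rho) = H(\rho) - H_2(\rho) \le \log 3 - 0.62 \approx 0.47 < \widehat{\Delta}_2(\rho)$.

The following notion plays the central role in this paper.

Definition 1. A subset $\mathcal{A}$ of $T_+(\mathcal{H})$ has the uniform approximation property (briefly the UA-property) if

$$\lim_{k\to+\infty}\sup_{A\in\mathcal{A}}\Delta_k(A) = 0.$$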
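The coarse-graining bound of Lemma 8D and the numbers in the example of Remark 3 can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the helper names are hypothetical, and the eigenvalues of a mixture $p|a\rangle\langle a| + (1-p)|b\rangle\langle b|$ of two pure states are taken from the standard closed form $\frac{1}{2}\big(1 \pm \sqrt{1 - 4p(1-p)(1-|\langle a|b\rangle|^2)}\big)$ instead of a general eigenvalue solver.

```python
import math

def eta(x):
    """eta(x) = -x log x, extended by eta(0) = 0."""
    return 0.0 if x == 0.0 else -x * math.log(x)

def classical_entropy(xs):
    """H({x_i}) = sum_i eta(x_i) - eta(sum_i x_i)."""
    return sum(eta(x) for x in xs) - eta(sum(xs))

def coarse_grain(eigs, k):
    """k-order coarse-graining of the eigenvalues taken in nonincreasing order."""
    s = sorted(eigs, reverse=True)
    return [sum(s[i:i + k]) for i in range(0, len(s), k)]

def delta_hat(eigs, k):
    """Upper bound of Lemma 8D: entropy of the coarse-grained spectrum."""
    return classical_entropy(coarse_grain(eigs, k))

def two_pure_state_mixture_eigs(p, overlap):
    """Nonzero eigenvalues of p|a><a| + (1-p)|b><b| with real overlap <a|b>."""
    disc = math.sqrt(1.0 - 4.0 * p * (1.0 - p) * (1.0 - overlap ** 2))
    return [(1.0 + disc) / 2.0, (1.0 - disc) / 2.0]

rho_eigs = [1 / 3, 1 / 3, 1 / 3]                 # chaotic state on a 3-D subspace
h_rho = classical_entropy(rho_eigs)              # log 3
h_rho1 = classical_entropy(two_pure_state_mixture_eigs(0.5, -0.5))  # <phi1|phi2> = -1/2
h_rho2 = classical_entropy(two_pure_state_mixture_eigs(0.4, 0.0))   # phi3 is orthogonal to phi4
h2_lower = (4 / 9) * h_rho1 + (5 / 9) * h_rho2   # lower bound for H_2(rho)
delta2_upper = h_rho - h2_lower                  # upper bound for Delta_2(rho)
delta2_hat = delta_hat(rho_eigs, 2)              # log 3 - (2/3) log 2
```

With these inputs the entropies of $\rho_1$ and $\rho_2$ come out at roughly 0.562 and 0.673, so the resulting bound on $\Delta_2(\rho)$ stays strictly below $\widehat{\Delta}_2(\rho)$.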


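Importance of the UA-property is justified by its close relation to continuity of the quantum entropy considered in Theorem 2 in the next subsection. Usefulness of this relation is based on the following observation, showing the conservation of the UA-property under different set-operations.

Proposition 4. Let $\mathcal{A}$ be a subset of $T_+(\mathcal{H})$ having the UA-property.

A) The UA-property holds for the closure $\mathrm{cl}(\mathcal{A})$ of the set $\mathcal{A}$.

B) For each $\lambda > 0$ the UA-property holds for the set $M_\lambda(\mathcal{A}) = \{\lambda A \mid A \in \mathcal{A}\}$.

C) If $\inf_{A\in\mathcal{A}}\|A\|_1 > 0$ then the UA-property holds for the set $E(\mathcal{A}) = \{\lambda A \mid A \in \mathcal{A},\ \lambda \ge 0\} \cap T_1(\mathcal{H})$.

D) For each natural $m$ the UA-property holds for the set

$$\mathrm{co}_m(\mathcal{A}) = \Big\{\sum_{i=1}^{m}\pi_iA_i \;\Big|\; \{\pi_i\} \in P_m,\ \{A_i\} \subseteq \mathcal{A}\Big\}.$$

If the set $\mathcal{A}$ is bounded then the UA-property holds for the set

$$\mathrm{co}_P(\mathcal{A}) = \Big\{\sum_{i=1}^{+\infty}\pi_iA_i \;\Big|\; \{\pi_i\} \in P,\ \{A_i\} \subseteq \mathcal{A}\Big\},$$

where $P$ is a subset of $P_{+\infty}$ such that

$$\lim_{m\to+\infty}\sup_{\{\pi_i\}\in P} H(\{\pi_i\}_{i>m}) = 0.$$

E) The UA-property holds for the sets

$$D(\mathcal{A}) = \{B \in T_+(\mathcal{H}) \mid \exists A \in \mathcal{A} : B \le A\} \quad\text{and}\quad \widehat{D}(\mathcal{A}) = \{B \in T_+(\mathcal{H}) \mid \exists A \in \mathcal{A} : \lambda_i(B) \le \lambda_i(A)\ \text{for all } i\},$$

where the condition defining $\widehat{D}(\mathcal{A})$ means that the sequence $\{\lambda_i(B)\}$ of eigenvalues of the operator $B$ is majorized by the sequence $\{\lambda_i(A)\}$ of eigenvalues of the operator $A$. If the set $\mathcal{A}$ is compact and does not contain the null operator, then the UA-property holds for the set

$$\widetilde{D}(\mathcal{A}) = \big\{B \in T_1(\mathcal{H}) \;\big|\; \exists A \in \mathcal{A} : B\|B\|_1^{-1} \prec A\|A\|_1^{-1}\big\},$$

where $\rho \prec \sigma$ means that the state $\sigma$ is more chaotic than the state $\rho$ in the Uhlmann sense [1,30], i.e. for the sequences $\{\lambda_i(\rho)\}$ and $\{\lambda_i(\sigma)\}$ of eigenvalues of the states $\rho$ and $\sigma$ arranged in nonincreasing order the inequality $\sum_{i=1}^{n}\lambda_i(\rho) \ge \sum_{i=1}^{n}\lambda_i(\sigma)$ holds for each natural $n$.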


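F) The UA-property holds for the sets $Q_n(\mathcal{A}) = \{\Phi(A) \mid \Phi \in F^{n}_{\le 1}(\mathcal{H}),\ A \in \mathcal{A}\}$, $n \in \mathbb{N}$, and $Q_F(\mathcal{A}) = \{\Phi(A) \mid \Phi \in F,\ A \in \mathcal{A}\}$, where $F$ is a subset of $F_{\le 1}(\mathcal{H})$ such that for the corresponding set $\mathcal{V}$ of sequences$^8$ $\{V_j\}_{j=1}^{+\infty}$ of Kraus operators the following two conditions hold:

1) either $\mathrm{Ran}\,V_j^{*} \perp \mathrm{Ran}\,V_{j'}^{*}$ for all $\{V_j\}_{j=1}^{+\infty} \in \mathcal{V}$ and all $j \ne j'$ exceeding some natural $n$, or $\lim_{m\to+\infty}\sup_{\{V_j\}\in\mathcal{V},\,A\in\mathcal{A}} \sum_{j\ge m} H(V_jAV_j^{*}) = 0$;

2) $\lim_{m\to+\infty}\sup_{\{V_j\}\in\mathcal{V},\,A\in\mathcal{A}} H\big(\{\mathrm{Tr}\,V_jAV_j^{*}\}_{j\ge m}\big) = 0$.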

Remark 4. In connection with Assertion D note that the UA-property of a set $\mathcal{A}$ does not imply the UA-property of its $\sigma$-convex hull $\sigma\text{-co}(\mathcal{A}) = \{\sum_{i=1}^{+\infty}\pi_iA_i \mid \{\pi_i\} \in P_{+\infty},\ \{A_i\} \subseteq \mathcal{A}\}$ even if the set $\mathcal{A}$ is compact. As an example one can consider the converging sequence of pure states from Example 1 in the second part of [24], such that the von Neumann entropy is not continuous on the $\sigma$-convex hull of this sequence (since the UA-property implies continuity of the entropy by Lemma 7).

Note that the condition $\lim_{m\to+\infty}\sup_{\{\pi_i\}\in P} H(\{\pi_i\}_{i>m}) = 0$ means continuity of the classical entropy on the set $P$ provided that this set is compact.

Proof of Proposition 4. A) This follows from lower semicontinuity of the function $\Delta_k$ on the set $T_+(\mathcal{H})$ for each $k$ (Lemma 8B).

B) This is an obvious corollary of (8).

C) This also follows from (8) since

$$\sup_{B\in E(\mathcal{A})}\{\lambda \mid B = \lambda A,\ A \in \mathcal{A}\} \le \Big(\inf_{A\in\mathcal{A}}\|A\|_1\Big)^{-1}.$$

D) The first part follows from Lemma 8I and (8) implying

$$\Delta_{km}\Big(\sum_{i=1}^{m}\pi_iA_i\Big) \le \sum_{i=1}^{m}\pi_i\Delta_k(A_i), \quad \forall\,\{\pi_i\}_{i=1}^{m} \in P_m.$$

The second part follows from Lemma 8J since for arbitrary $k$ and $m$ it implies

$$\Delta_{km}\Big(\sum_{i=1}^{+\infty}\pi_iA_i\Big) \le \sum_{i=1}^{+\infty}\pi_i\Delta_k(A_i) + \sup_{i\ge m}\|A_i\|_1\, H(\{\pi_i\}_{i\ge m}) \le \sup_{A\in\mathcal{A}}\Delta_k(A) + \sup_{A\in\mathcal{A}}\|A\|_1\, H(\{\pi_i\}_{i\ge m}),$$

$\forall\,\{A_i\}_{i=1}^{+\infty} \subseteq \mathcal{A}$, $\forall\,\{\pi_i\}_{i=1}^{+\infty} \in P_{+\infty}$.

8 The ways to show validity of these conditions are considered in the proof of Corollary 9 below.


E) The first part follows from Lemma 8C and unitary invariance of the function $\Delta_k$. The second part follows from Lemma 8D and Lemma 9 below since

$$B\|B\|_1^{-1} \prec A\|A\|_1^{-1} \ \Rightarrow\ H(\{\lambda_i^{k}(B)\}) \le \|B\|_1\|A\|_1^{-1} H(\{\lambda_i^{k}(A)\})$$

for each natural $k$ by Schur concavity of the von Neumann entropy [30].

F) The first part follows from Lemma 8H. To prove the second part note that Lemma 8J implies the inequality

$$\Delta_{km}\Big(\sum_{j=1}^{+\infty}V_jAV_j^{*}\Big) \le \sum_{j=1}^{+\infty}\Delta_k(V_jAV_j^{*}) + H\big(\{\mathrm{Tr}\,V_jAV_j^{*}\}_{j\ge m}\big).$$

Thus it is sufficient to show that condition 1) implies

$$\lim_{k\to+\infty}\sup_{\{V_j\}\in\mathcal{V},\,A\in\mathcal{A}} \sum_{j=1}^{+\infty}\Delta_k(V_jAV_j^{*}) = 0. \tag{10}$$

If the first alternative in condition 1) holds then Assertions E and G of Lemma 8 provide the estimation

$$\sum_{j=1}^{+\infty}\Delta_k(V_jAV_j^{*}) = \sum_{j=1}^{n}\Delta_k(V_jAV_j^{*}) + \sum_{j>n}\Delta_k(V_jAV_j^{*}) \le n\Delta_k(A) + \sum_{j>n}\Delta_k(P_jAP_j) \le (n+1)\Delta_k(A), \quad \forall\,\{V_j\} \in \mathcal{V},$$

where $P_j$ is the projector on the subspace $\mathrm{Ran}\,V_j^{*}$, which implies (10) by the UA-property of the set $\mathcal{A}$. If the second alternative in condition 1) holds then the similar estimation, in which the term $\sum_{j>n}\Delta_k(V_jAV_j^{*})$ is majorized by $\sum_{j>n}H(V_jAV_j^{*})$, also implies (10) by the UA-property of the set $\mathcal{A}$.

Lemma 9. Let $\mathcal{A}$ be a compact subset of $T_+(\mathcal{H})$ having the UA-property. Then

$$\lim_{k\to+\infty}\sup_{A\in\mathcal{A}}\widehat{\Delta}_k(A) = 0,$$

where $\widehat{\Delta}_k$ is the upper bound for the function $\Delta_k$ defined in Lemma 8D.

Proof. By Lemma 7 the UA-property of the set $\mathcal{A}$ implies continuity of the function $A \mapsto H(A)$ on this set. Let $\{P_i^{k}\}_i$ be the sequence of spectral projectors of the operator $A$ defined in the proof of Assertion D of Lemma 8 and $\pi_i^{k} = \|A\|_1^{-1}\mathrm{Tr}\,P_i^{k}A$ for all $i$. By Lemma 4 in [13] the sequence of continuous functions $A \mapsto H(P_1^{k}A)$ converges monotonically to the function $A \mapsto H(A)$ as $k \to +\infty$. By Dini's lemma this sequence converges uniformly on the set $\mathcal{A}$. This implies the assertion of the lemma since

$$\widehat{\Delta}_k(A) = \sum_i \pi_i^{k} H\big((\pi_i^{k})^{-1}P_i^{k}A\,\big\|\,A\big) \le H(A) - H(P_1^{k}A), \quad A \in \mathcal{A}.$$


By definition the UA-property of sets $\mathcal{A}$ and $\mathcal{B}$ implies the UA-property of their union $\mathcal{A} \cup \mathcal{B}$. By Lemma 8I and Proposition 4D we have the following observations.

Corollary 3. Let $\mathcal{A}$ and $\mathcal{B}$ be subsets of $T_+(\mathcal{H})$ having the UA-property.

A) The UA-property holds for the set $\{A + B \mid A \in \mathcal{A},\ B \in \mathcal{B}\}$ (generally called the Minkowski sum of the sets $\mathcal{A}$ and $\mathcal{B}$);

B) The UA-property holds for the convex closure $\overline{\mathrm{co}}(\mathcal{A} \cup \mathcal{B})$ of the union of $\mathcal{A}$ and $\mathcal{B}$ provided these sets are convex.

5.2. The continuity conditions. Lemmas 7 and 8, Dini's lemma and Proposition 4 imply the following theorem, containing the main results of this paper.

Theorem 2. A) If a set $\mathcal{A} \subset T_+(\mathcal{H})$ has the UA-property then the quantum entropy is continuous on this set.

B) If the quantum entropy is continuous on a compact set $\mathcal{A} \subset T_+(\mathcal{H})$, then this set has the UA-property.

C) If a set $\mathcal{A} \subset T_+(\mathcal{H})$ has the UA-property then the quantum entropy is continuous on the image of this set under an arbitrary finite composition of the set-operations $\mathrm{cl}$, $M_\lambda$, $E$, $\mathrm{co}_m$, $\mathrm{co}_P$, $D$, $\widehat{D}$, $\widetilde{D}$, $Q_n$, $Q_F$ considered in Proposition 4 with arbitrary parameters $m, n \in \mathbb{N}$ and $\lambda > 0$, provided the sets $P$, $F$ and the arguments of $E$, $\mathrm{co}_P$, $\widetilde{D}$, $Q_F$ satisfy the conditions mentioned in this proposition.

Remark 5. As the simplest example showing importance of the compactness condition in the second assertion of Theorem 2, one can consider the set $\mathcal{A} = \{\lambda\rho \mid \lambda \in \mathbb{R}_+\}$, where $\rho$ is an infinite rank state with finite entropy. The following example shows that the second assertion of Theorem 2 is not valid in general even for relatively compact convex sets of states.

Let $\{\rho_i\}_{i\ge 0}$ be a sequence of finite rank states in $S(\mathcal{H})$ such that $\rho_0$ is a pure state, $H(\rho_i) \ge 1$ for all $i > 0$, $\mathrm{supp}\,\rho_n \subset \mathcal{H} \ominus \big(\bigoplus_{i=0}^{n-1}\mathrm{supp}\,\rho_i\big)$ and $\sum_{i=1}^{+\infty}e^{-\lambda H(\rho_i)} < +\infty$ for all $\lambda > 0$. Let $\lambda_i = (H(\rho_i))^{-1}$ for each $i \in \mathbb{N}$. Consider the sequence of states $\sigma_i = (1-\lambda_i)\rho_0 + \lambda_i\rho_i$, $i \in \mathbb{N}$, obviously converging to the state $\rho_0$.

In Appendix 7.2 it is proved that the von Neumann entropy is continuous on the convex set $\mathcal{A} = \sigma\text{-co}(\{\sigma_i\}_{i\in\mathbb{N}}) = \{\sum_{i=1}^{+\infty}\pi_i\sigma_i \mid \{\pi_i\} \in P_{+\infty}\}$, but it is not continuous on the set $\mathrm{cl}(\mathcal{A}) = \overline{\mathrm{co}}(\{\sigma_i\}_{i\in\mathbb{N}}) = \mathcal{A} \cup \{\rho_0\}$. By the first assertion of Theorem 2 and Proposition 4A the UA-property does not hold for the set $\mathcal{A}$.

We first show that Theorem 2 makes it possible to re-derive the continuity conditions mentioned in the Introduction in generalized forms.

Example 1. Let $\{h_i\}$ be a nondecreasing sequence of nonnegative numbers and $P_{\{h_i\},h}$ be the subset of $P_{+\infty}$ consisting of probability distributions $\{\pi_i\}$ satisfying the inequality $\sum_i h_i\pi_i \le h$. By Lemma 11 in the Appendix the set $P_{\{h_i\},h}$ satisfies the condition in Proposition 4D if and only if $g(\{h_i\}) = \inf\{\lambda > 0 \mid \sum_i e^{-\lambda h_i} < +\infty\} = 0$. By Theorem 2C the von Neumann entropy is continuous on the set $\mathrm{cl}(\mathrm{co}_{P_{\{h_i\},h}}(S_k(\mathcal{H})))$ for each $k$. This observation provides another proof$^9$ of the well known result stating that the entropy is continuous on the set $K_{H,h} = \{\rho \in S(\mathcal{H}) \mid \mathrm{Tr}\,H\rho \le h\}$, where $H$ is an H-operator such that $g(H) = \inf\{\lambda > 0 \mid \mathrm{Tr}\,e^{-\lambda H} < +\infty\} = 0$, since by using the extremal properties of eigenvalues of a positive operator it is easy to see that the set $\mathrm{cl}(\mathrm{co}_{P_{\{h_i\},h}}(S_1(\mathcal{H})))$, where $\{h_i\}$ is the sequence of eigenvalues of the operator $H$, contains the set $K_{H,h}$ (and all its unitary translations). The von Neumann entropy is not continuous on $\mathrm{cl}(\mathrm{co}_{P_{\{h_i\},h}}(S_1(\mathcal{H})))$ if $g(\{h_i\}) > 0$ since it is not continuous on the set $K_{H,h}$ if $g(H) > 0$ [24].

9 The original proof of this result is based on lower semicontinuity of the function $\rho \mapsto H(\rho\|\sigma_\lambda)$, where $\sigma_\lambda = (\mathrm{Tr}\,e^{-\lambda H})^{-1}e^{-\lambda H}$, for all $\lambda > 0$ [15,29].
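The condition $g(\{h_i\}) = 0$ in Example 1 concerns convergence of the series $\sum_i e^{-\lambda h_i}$ for every $\lambda > 0$. As a rough numerical illustration (not a proof of convergence or divergence), the following Python sketch compares partial sums for two sample sequences chosen here as assumptions: $h_i = i$, for which the geometric series converges for all $\lambda > 0$, and $h_i = \log i$, for which $\sum_i i^{-\lambda}$ diverges when $\lambda \le 1$, so that $g = 1$. The function name `partial_partition_sums` is hypothetical.

```python
import math

def partial_partition_sums(level_fn, lam, cutoffs):
    """Partial sums sum_{i=1}^{n} exp(-lam * h_i) for each cutoff n."""
    return [
        sum(math.exp(-lam * level_fn(i)) for i in range(1, n + 1))
        for n in cutoffs
    ]

cutoffs = [10, 100, 1000]
lam = 0.5

# h_i = i: the partial sums stabilise near 1 / (exp(lam) - 1).
linear = partial_partition_sums(lambda i: float(i), lam, cutoffs)

# h_i = log i: the terms are i ** (-lam), and the partial sums keep growing for lam <= 1.
logarithmic = partial_partition_sums(lambda i: math.log(i), lam, cutoffs)
```

In the first case the partial sums agree with the geometric series limit to machine precision, while in the second they grow roughly like $2\sqrt{n}$, consistent with $g(\{\log i\}) = 1 > 0$.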

Theorem 2 implies the following generalization of Simon's dominated convergence theorems [26].

Corollary 4 (Generalized Simon's convergence theorem).$^{10}$ If the quantum entropy is continuous on a compact subset $\mathcal{A}$ of $T_+(\mathcal{H})$ then it is continuous on the sets $D(\mathcal{A})$ and $\widehat{D}(\mathcal{A})$ defined in the first part of Assertion E of Proposition 4.

This condition and Corollary 3 show that

$$\{H(A_n + B_n) \to H(A_0 + B_0)\} \ \Leftrightarrow\ \{H(A_n) \to H(A_0)\} \wedge \{H(B_n) \to H(B_0)\},$$

where $\{A_n\}$ and $\{B_n\}$ are sequences of positive trace class operators converging respectively to operators $A_0$ and $B_0$.

The above "dominated-type" continuity conditions can be enriched by the following one.

Corollary 5. If the quantum entropy is continuous on a compact subset $\mathcal{A}$ of $T_+(\mathcal{H})$, which does not contain the null operator, then it is continuous on the set $\widetilde{D}(\mathcal{A})$ defined in the second part of Assertion E of Proposition 4.

By Corollary 5 and Theorem 13 in [30], to prove continuity of the von Neumann entropy on a set $\mathcal{A} \subset S(\mathcal{H})$ it suffices to show its continuity on the image of this set under the expectation $\rho \mapsto \sum_i P_i\rho P_i$ for a particular set $\{P_i\}$ of mutually orthogonal projectors such that $\sum_i P_i = I_{\mathcal{H}}$.

Corollary 5 and the infinite-dimensional generalization of Nielsen's theorem provide the following observation concerning the notion of entanglement of a state of a composite quantum system.

Example 2. Let $\mathcal{H}$ and $\mathcal{K}$ be separable Hilbert spaces. The entanglement $E(\omega)$ of a pure state $\omega$ in $S(\mathcal{H}\otimes\mathcal{K})$ is defined as the von Neumann entropy of its reduced states (cf. [2]):

$$E(\omega) = H(\mathrm{Tr}_{\mathcal{K}}\,\omega) = H(\mathrm{Tr}_{\mathcal{H}}\,\omega).$$

Let $L(\mathcal{H},\mathcal{K})$ be the set of all LOCC-operations transforming the set $S(\mathcal{H}\otimes\mathcal{K})$ into itself. Corollary 5 and Lemma 2 in [17]$^{11}$ imply the following assertion:

If the function $\omega \mapsto E(\omega)$ is continuous on a compact set $\mathcal{C} \subset \mathrm{extr}\,S(\mathcal{H}\otimes\mathcal{K})$, then it is continuous on the set $\{\Lambda(\omega) \mid \omega \in \mathcal{C},\ \Lambda \in L(\mathcal{H},\mathcal{K})\} \cap \mathrm{extr}\,S(\mathcal{H}\otimes\mathcal{K})$.

This shows that for an arbitrary sequence $\{\omega_n\}$ of pure states in $S(\mathcal{H}\otimes\mathcal{K})$ converging to a state $\omega_0$ and an arbitrary set $\{\Lambda_n\}_{n\ge 0}$ of LOCC-operations such that the sequence $\{\Lambda_n(\omega_n)\}$ consists of pure states and converges to the state $\Lambda_0(\omega_0)$, the following implication holds:

$$\lim_{n\to+\infty}E(\omega_n) = E(\omega_0) \ \Rightarrow\ \lim_{n\to+\infty}E(\Lambda_n(\omega_n)) = E(\Lambda_0(\omega_0)).$$

10 In the original versions of these theorems the weaker topologies are used. Since the set $D(\mathcal{A})$ is compact (by the compactness criterion in Lemma 10 in the Appendix), the weak operator topology on this set coincides with the trace norm topology. The $\mu$-convergence topology does not coincide with the trace norm topology on the set $\widehat{D}(\mathcal{A})$, but by noting that the sequences of eigenvalues of the operators in $\widehat{D}(\mathcal{A})$ form a compact subset of the space $l_1$ it is easy to see that $\mu$-convergence of a sequence $\{A_n\} \subset \widehat{D}(\mathcal{A})$ to an operator $A_0 \in \widehat{D}(\mathcal{A})$ means trace norm convergence of the sequence $\{U_nA_nU_n^{*}\} \subset \widehat{D}(\mathcal{A})$ to the operator $A_0$ for some set $\{U_n\}$ of unitaries.

11 In [17] the majorization order is used, which is converse to the Uhlmann order "$\prec$" used in this paper.

By Corollary 10 in the Appendix, for an arbitrary closed set $\mathcal{A} \subset T_+(\mathcal{H})$ (not necessarily compact) and arbitrary natural $m$ the set $\mathrm{co}_m(\mathcal{A})$ defined in Assertion D of Proposition 4 is closed. Theorem 2 implies the following result.

Corollary 6. A) If the quantum entropy is continuous and bounded on a closed bounded set $\mathcal{A} \subset T_+(\mathcal{H})$ then it is continuous on the set $\mathrm{co}_m(\mathcal{A})$ for arbitrary natural $m$.

B) If the quantum entropy is continuous on a compact set $\mathcal{A} \subset T_+(\mathcal{H})$, then it is continuous on the set $\mathrm{cl}(\mathrm{co}_P(\mathcal{A}))$ for an arbitrary subset $P$ of $P_{+\infty}$ such that $\lim_{m\to+\infty}\sup_{\{\pi_i\}\in P} H(\{\pi_i\}_{i>m}) = 0$.

By Remark 4 the set $\mathrm{cl}(\mathrm{co}_P(\mathcal{A}))$ in the second assertion of this corollary cannot be replaced by the $\sigma$-convex hull $\sigma\text{-co}(\mathcal{A})$ of the set $\mathcal{A}$.

Proof. A) Let $\{A_n\} \subset \mathrm{co}_m(\mathcal{A})$ be a sequence converging to an operator $A_0 \in \mathrm{co}_m(\mathcal{A})$. Suppose

$$\lim_{n\to+\infty}H(A_n) > H(A_0). \tag{11}$$

By the construction of the set $\mathrm{co}_m(\mathcal{A})$ for each $n$ there exists an ensemble $\{\pi_i^{n}, A_i^{n}\}_{i=1}^{m}$ of operators in $\mathcal{A}$ such that $A_n = \sum_{i=1}^{m}\pi_i^{n}A_i^{n}$. By using Proposition 5 in the Appendix and boundedness of the set $\mathcal{A}$ we may assume (by replacing the sequence $\{A_n\}$ by some subsequence) that there exists an ensemble $\{\pi_i^{0}, A_i^{0}\}_{i=1}^{m}$ of operators in $\mathcal{A}$ such that $\lim_{n\to+\infty}\pi_i^{n} = \pi_i^{0}$, $\lim_{n\to+\infty}\pi_i^{n}A_i^{n} = \pi_i^{0}A_i^{0}$ for each $i = 1, \ldots, m$ and $A_0 = \sum_{i=1}^{m}\pi_i^{0}A_i^{0}$. Since the entropy is continuous and bounded on the set $\mathcal{A}$ we have $\lim_{n\to+\infty}H(\pi_i^{n}A_i^{n}) = H(\pi_i^{0}A_i^{0})$ for each $i = 1, \ldots, m$. By the part "$\Leftarrow$" of the remark after Corollary 4 this implies a contradiction to (11).

B) This directly follows from Theorem 2.

If $\mathcal{A}$ is a union of $m < +\infty$ closed convex sets then Corollary 10 in the Appendix implies $\mathrm{co}_m(\mathcal{A}) = \overline{\mathrm{co}}(\mathcal{A})$, so we obtain from Corollary 6 the following result.

Corollary 7. If the quantum entropy is continuous on each set from a finite collection $\{\mathcal{A}_i\}_{i=1}^{m}$ of convex closed bounded subsets of $T_+(\mathcal{H})$ then it is continuous on the convex closure $\overline{\mathrm{co}}\big(\bigcup_{i=1}^{m}\mathcal{A}_i\big)$ of this collection.

Remark 6. The condition of closedness of all the sets from the collection $\{\mathcal{A}_i\}_{i=1}^{m}$ in Corollary 7 is essential. A simple example showing this can be constructed as follows. Let $\mathcal{A}_1 = \{\rho_0\}$ and $\mathcal{A}_2 = \sigma\text{-co}(\{\sigma_i\}_{i\in\mathbb{N}})$, where the state $\rho_0$ and the sequence $\{\sigma_i\}_{i\in\mathbb{N}}$ are taken from the example in Remark 5. As shown in this example the entropy is continuous on the convex bounded sets $\mathcal{A}_1$ and $\mathcal{A}_2$ but it is not continuous on the convex set $\mathcal{A}_1 \cup \mathcal{A}_2$.

Theorem 2 also implies the following continuity condition.


Corollary 8. Let $\{\mathcal{A}_i\}_{i=1}^{n}$ be a finite collection of subsets of $T_+(\mathcal{H})$ having the UA-property (for example, compact subsets on which the quantum entropy is continuous). Then for arbitrary natural $m$ the quantum entropy is continuous on the set

$$\mathrm{cl}\left(\left\{\sum_{i=1}^{n}\sum_{j=1}^{m}V_{ij}A_iV_{ij}^{*} \;\middle|\; A_i \in \mathcal{A}_i,\ V_{ij} \in B(\mathcal{H}),\ \|V_{ij}\| \le 1\right\}\right).$$

The following observation can be used in the study of quantum channels and in the theory of quantum measurements (see Example 3 below).

Corollary 9. Let $\mathcal{V}_{=1}$ be the set of all sequences $\{V_i\}_{i=1}^{+\infty} \subset B(\mathcal{H})$ such that $\sum_{i=1}^{+\infty}V_i^{*}V_i = I_{\mathcal{H}}$ endowed with the Cartesian product of the strong* operator topology (the topology of coordinate-wise strong* operator convergence).$^{12}$ Let $\mathcal{A}$ be a subset of $T_+(\mathcal{H})$ on which the quantum entropy is continuous.

1) The function $(\{V_i\}, A) \mapsto \sum_{i=1}^{+\infty}H(V_iAV_i^{*})$ is continuous on $\mathcal{V}_{=1} \times \mathcal{A}$.

2) If $\mathcal{V}_0$ is a subset of $\mathcal{V}_{=1}$ such that the function

$$(\{V_i\}, A) \mapsto H\big(\{\mathrm{Tr}\,V_iAV_i^{*}\}_{i=1}^{+\infty}\big) \tag{12}$$

is continuous on $\mathcal{V}_0 \times \mathcal{A}$ then the function $(\{V_i\}, A) \mapsto H\big(\sum_{i=1}^{+\infty}V_iAV_i^{*}\big)$ is continuous on $\mathcal{V}_0 \times \mathcal{A}$.

Proof. We can consider that the sets $\mathcal{A}$ and $V_0$ are compact.

1) It follows from Corollary 8 that the function $F_m((\{V_i\},A))=H(C_mAC_m)$, where $C_m=\sqrt{\sum_{i=1}^{m}V_i^{*}V_i}$ and $m\in\mathbb{N}$, is continuous on $V_{=1}\times\mathcal{A}$. Since $C_m^2\le C_{m+1}^2$ for all $m$ the sequence $\{F_m\}$ is nondecreasing. By noting that convergence of a sequence $\{A_n\}\subset T_+(\mathcal{H})$ to an operator $A_0\in T_+(\mathcal{H})$ follows from its convergence in the weak operator topology provided that $\lim_n\mathrm{Tr}A_n=\mathrm{Tr}A_0$ (Theorem 1 in [6]) and by using Corollary 8 we conclude that $\lim_{m\to+\infty}F_m((\{V_i\},A))=H(A)$. By the Groenewold-Lindblad-Ozawa inequality (see [16]) we have

$$\sum_{i>m}H(V_iAV_i^{*})\le H(A)-F_m((\{V_i\},A)).$$

Hence continuity of the function $A\mapsto H(A)$ and Dini's lemma show that $\lim_{m\to+\infty}\sup_{\{V_i\}\in V_c,\,A\in\mathcal{A}}\sum_{i>m}H(V_iAV_i^{*})=0$ for an arbitrary compact subset $V_c$ of $V_{=1}$. This and continuity of the function $(\{V_i\},A)\mapsto\sum_{i=1}^{m}H(V_iAV_i^{*})$ for each $m$ (provided by Corollary 8) imply the first assertion of the corollary.

2) By the above observation the second alternative in condition 1) in Proposition 4F holds for the sets $V_0$ and $\mathcal{A}$. Since condition 2) in this proposition follows from continuity of function (12) by Dini's lemma, the set $\left\{\sum_{i=1}^{+\infty}V_iAV_i^{*}\mid\{V_i\}\in V_0,\ A\in\mathcal{A}\right\}$ has the UA-property. By Theorem 2 this implies the second assertion of the corollary.

12 The strong* operator topology on $B(\mathcal{H})$ is defined by the family of seminorms $A\mapsto\|A|\varphi\rangle\|+\|A^{*}|\varphi\rangle\|$, $|\varphi\rangle\in\mathcal{H}$ [3]. By using more complicated analysis it is possible to replace this topology here by the strong operator topology.


Example 3. Let $M_m(\mathcal{H})$ be the set of all quantum measurements with $m\le+\infty$ outcomes on the quantum system associated with the Hilbert space $\mathcal{H}$. Each measurement $M$ in $M_m(\mathcal{H})$ is described by a set $\{V_i\}_{i=1}^{m}$ of operators in $B(\mathcal{H})$ such that $\sum_{i=1}^{m}V_i^{*}V_i=I_{\mathcal{H}}$ and its action on an arbitrary a priori state $\rho\in S(\mathcal{H})$ results in the posteriori ensemble $\{\pi_i(M,\rho),\rho_i(M,\rho)\}_{i=1}^{m}$, where $\rho_i(M,\rho)=(\mathrm{Tr}\,V_i\rho V_i^{*})^{-1}V_i\rho V_i^{*}$ is the posteriori state corresponding to the $i$-th outcome and $\pi_i(M,\rho)=\mathrm{Tr}\,V_i\rho V_i^{*}$ is the probability of this outcome (if $\mathrm{Tr}\,V_i\rho V_i^{*}=0$ then the posteriori state $\rho_i(M,\rho)$ is not defined) [10]. The mean posteriori state $\bar{\rho}(M,\rho)=\sum_{i=1}^{m}\pi_i(M,\rho)\rho_i(M,\rho)=\sum_{i=1}^{m}V_i\rho V_i^{*}$ corresponds to the nonselective measurement.

We will consider that a sequence $\{M_n\}\subset M_m(\mathcal{H})$ converges to a measurement $M_0\in M_m(\mathcal{H})$ if $\lim_{n\to+\infty}V_i^{n}=V_i^{0}$ for all $i=\overline{1,m}$ in the strong* operator topology, where $\{V_i^{n}\}_{i=1}^{m}$ is the set of operators describing the measurement $M_n$.

Let $\mathcal{A}$ be a subset of $S(\mathcal{H})$ on which the von Neumann entropy is continuous. Corollaries 8 and 9 imply the following assertions:

• the von Neumann entropy of the posteriori state $H(\rho_i(M,\rho))$ is continuous on the subset of $M_m(\mathcal{H})\times\mathcal{A}$ on which $\rho_i(M,\rho)$ is defined, $i=\overline{1,m}$;

• the mean entropy of posteriori states $\sum_{i=1}^{m}\pi_i(M,\rho)H(\rho_i(M,\rho))$ is continuous on $M_m(\mathcal{H})\times\mathcal{A}$;

• if $\mathcal{M}$ is a subset of $M_m(\mathcal{H})$ such that the Shannon entropy of the outcome's probability distribution $H\left(\{\pi_i(M,\rho)\}_{i=1}^{m}\right)$ is continuous on $\mathcal{M}\times\mathcal{A}$ then the von Neumann entropy of the mean posteriori state $H(\bar{\rho}(M,\rho))$ is continuous on $\mathcal{M}\times\mathcal{A}$. If $m<+\infty$, then the function $H(\bar{\rho}(M,\rho))$ is continuous on $M_m(\mathcal{H})\times\mathcal{A}$.
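The objects in Example 3 can be made concrete in the simplest commuting case. The following is only a minimal sketch, under the assumption (not made in the text) that the state and all measurement operators are diagonal in one fixed basis, so that operators reduce to lists of diagonal entries and the von Neumann entropy reduces to the Shannon entropy. The names `shannon` and `posterior_ensemble` and the numerical values are illustrative.

```python
import math

def shannon(p):
    """Entropy -sum x log x (natural logarithm) of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)

def posterior_ensemble(rho, V):
    """rho: diagonal of the a priori state; V[i]: diagonal of the operator V_i.
    Returns the outcome probabilities pi_i and the diagonals of the states rho_i."""
    probs, states = [], []
    for v in V:
        unnorm = [abs(a) ** 2 * r for a, r in zip(v, rho)]  # diagonal of V_i rho V_i^*
        p = sum(unnorm)                                      # pi_i = Tr V_i rho V_i^*
        probs.append(p)
        # The posteriori state is undefined when pi_i = 0.
        states.append([u / p for u in unnorm] if p > 0 else None)
    return probs, states

# Assumed toy data: a diagonal state on a 3-dimensional space.
rho = [0.5, 0.3, 0.2]
# Two-outcome measurement with V_1^* V_1 + V_2^* V_2 = I (checked entrywise).
w = [0.9, 0.5, 0.1]
V = [[math.sqrt(x) for x in w], [math.sqrt(1 - x) for x in w]]

probs, states = posterior_ensemble(rho, V)
# Mean posteriori state sum_i pi_i rho_i = sum_i V_i rho V_i^*.
mean_state = [sum(p * s[k] for p, s in zip(probs, states)) for k in range(len(rho))]
# Mean entropy of posteriori states sum_i pi_i H(rho_i).
mean_entropy = sum(p * shannon(s) for p, s in zip(probs, states))
```

In this commuting case the mean posteriori state coincides with the a priori state, and the mean entropy of the posteriori states does not exceed the entropy of the a priori state, in line with the Groenewold-Lindblad-Ozawa inequality used in the proof of Corollary 9.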

Remark 7. The continuity conditions considered in this subsection are formulated for subsets of $T_+(\mathcal{H})$. They can be reformulated for subsets of $S(\mathcal{H})$ by using the following obvious observation: If the quantum entropy is continuous on a subset $\mathcal{A}$ of $T_+(\mathcal{H})$ such that $\inf_{A\in\mathcal{A}}\|A\|_1>0$ then the von Neumann entropy is continuous on the subset $\left\{\|A\|_1^{-1}A\mid A\in\mathcal{A}\right\}$ of $S(\mathcal{H})$.

6. Conclusion

The method of proving continuity of the von Neumann entropy proposed in this paper is essentially based on the strong stability property (stated in Theorem 1) and on the µ-compactness (described before Lemma 1) of the set of quantum states, which reveal the special relations between the topology and the convex structure of this set. Of course, this does not mean that the validity of the continuity conditions obtained by this method depends on the validity of these abstract properties, or that these conditions cannot be proved by other methods. For example, the assertion of Corollary 7 for sets of quantum states can be shown by noting that continuity of the entropy on any closed convex set of states implies compactness of this set (this follows from Lemma 2 in [25] and Corollary 7 in [24]) and by applying spectral finite dimensional approximation based on inequality (1) and Dini's lemma, but the proposed method provides a simpler and in a sense more natural way of doing this. The special approximation of concave lower semicontinuous functions considered in this paper, in particular the approximation of the von Neumann entropy used in proving its continuity, seems to be interesting for other applications.


7. Appendix

7.1. One property of the positive cone of trace-class operators. The positive cone $T_+(\mathcal{H})$ has the following important property.

Proposition 5. Let $\{\{\pi_i^{n},A_i^{n}\}_{i=1}^{m}\}_n$ be a sequence of ensembles consisting of $m<+\infty$ operators in $T_+(\mathcal{H})$ such that the sequence $\{\sum_{i=1}^{m}\pi_i^{n}A_i^{n}\}_n$ of their averages converges to an operator $A_0$. There exists a subsequence $\{\{\pi_i^{n_k},A_i^{n_k}\}_{i=1}^{m}\}_k$ converging to a particular ensemble $\{\pi_i^{0},A_i^{0}\}_{i=1}^{m}$ with the average $A_0$ in the following sense:

$$\lim_{k\to+\infty}\pi_i^{n_k}=\pi_i^{0}\quad\text{and}\quad\pi_i^{0}>0\ \Rightarrow\ \lim_{k\to+\infty}A_i^{n_k}=A_i^{0},\qquad i=\overline{1,m}.$$

Note that this proposition does not assert that $A_i^{0}\neq A_j^{0}$ for all $i\neq j$.

Proof. We may assume that the sequence $\{A_n=\sum_{i=1}^{m}\pi_i^{n}A_i^{n}\}_n$ belongs to the set $T_1(\mathcal{H})$ of positive operators with trace $\le 1$. Let $B_i^{n}=\pi_i^{n}A_i^{n}$ be an operator in $T_1(\mathcal{H})$ such that $B_i^{n}\le A_n$ for all $n>0$ and $i=\overline{1,m}$. Since the set $\{A_n\}_{n\ge 0}$ is compact, the compactness criterion for subsets of $T_1(\mathcal{H})$ in Lemma 10 below implies relative compactness of the sequence $\{B_i^{n}\}_{n>0}$ for each $i=\overline{1,m}$. Hence we can find an increasing sequence $\{n_k\}$ of natural numbers such that there exist

$$\lim_{k\to+\infty}\pi_i^{n_k}=\pi_i^{0}\quad\text{and}\quad\lim_{k\to+\infty}B_i^{n_k}=B_i^{0}\qquad\forall i=\overline{1,m},$$

where $\{\pi_i^{0}\}_{i=1}^{m}$ is a probability distribution and $\{B_i^{0}\}_{i=1}^{m}$ is a collection of operators in $T_1(\mathcal{H})$. Since $\sum_{i=1}^{m}B_i^{n_k}=A_{n_k}$ for all $k$ we have $\sum_{i=1}^{m}B_i^{0}=A_0$. Let $A_i^{0}=(\pi_i^{0})^{-1}B_i^{0}$ if $\pi_i^{0}\neq 0$ and $A_i^{0}=0$ otherwise. The subsequence $\{\{\pi_i^{n_k},A_i^{n_k}\}_{i=1}^{m}\}_k$ and the ensemble $\{\pi_i^{0},A_i^{0}\}_{i=1}^{m}$ have the required properties.

Corollary 10. For an arbitrary closed subset $\mathcal{A}$ of $T_+(\mathcal{H})$ and an arbitrary natural $m$ the set $\mathrm{co}_m(\mathcal{A})=\left\{\sum_{i=1}^{m}\pi_iA_i\mid\{\pi_i\}\in P_m,\ \{A_i\}\subset\mathcal{A}\right\}$ is closed.

The following compactness criterion for subsets of $T_1(\mathcal{H})$ can be derived from the compactness criterion for subsets of $S(\mathcal{H})$, presented in [11, the Appendix] (by considering the set $\{A+(1-\mathrm{Tr}A)\rho_0\mid A\in\mathcal{A}\}\subset S(\mathcal{H})$ for a given set $\mathcal{A}\subset T_1(\mathcal{H})$, where $\rho_0$ is a fixed pure state).

Lemma 10. A closed subset $\mathcal{A}$ of $T_1(\mathcal{H})$ is compact if and only if for arbitrary $\varepsilon>0$ there exists a finite rank projector $P_\varepsilon$ in $B(\mathcal{H})$ such that $\mathrm{Tr}(I_{\mathcal{H}}-P_\varepsilon)A<\varepsilon$ for all $A\in\mathcal{A}$.

7.2. The proofs of the auxiliary results.

On maximality of any measure supported by pure states. Maximality of a measure $\mu$ in $\mathcal{P}(S_1(\mathcal{H}))$ with respect to the Choquet ordering follows from coincidence of this ordering with the dilation ordering [7], but it can be easily proved in this special case as follows. Suppose $\nu\succ\mu$. Since for an arbitrary concave continuous bounded function $f$ on the set $S(\mathcal{H})$ stability of the set $S(\mathcal{H})$ implies continuity of its convex hull $\mathrm{co}f$ (see [20, Theorem 1]) we have $\mu(f)\ge\nu(f)$ and $\mu(\mathrm{co}f)\le\nu(\mathrm{co}f)$, where $\mu(f)=\int_{S(\mathcal{H})}f(\sigma)\mu(d\sigma)$.


By noting that $f\ge\mathrm{co}f$ and that these functions coincide on the support of the measure $\mu$ we conclude that $\mu(f)=\nu(f)$ and hence $\mu=\nu$.

The proof of Lemma 2. It is easy to see that lower semicontinuity and lower boundedness of the function $f$ imply lower semicontinuity of the functional

$$\mathcal{P}(\mathcal{A})\ni\mu\mapsto F(\mu)=\int_{\mathcal{A}}f(\sigma)\mu(d\sigma).\tag{13}$$

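A) Convexity of the function $\check{f}_{\mathcal{A}}$ follows from its definition. By lower semicontinuity of the functional (13) and compactness of the set $\mathcal{P}_{\{\rho\}}(\mathcal{A})$ (provided by µ-compactness of the set $S(\mathcal{H})$) the infimum in the definition of the value $\check{f}_{\mathcal{A}}(\rho)$ for each $\rho$ in $\mathrm{co}(\mathcal{A})$ is achieved at a particular measure in $\mathcal{P}_{\{\rho\}}(\mathcal{A})$. Suppose the function $\check{f}_{\mathcal{A}}$ is not lower semicontinuous. Then there exists a sequence $\{\rho_n\}\subset\mathrm{co}(\mathcal{A})$ converging to a state $\rho_0\in\mathrm{co}(\mathcal{A})$ such that

$$\lim_{n\to+\infty}\check{f}_{\mathcal{A}}(\rho_n)<\check{f}_{\mathcal{A}}(\rho_0).\tag{14}$$

As proved before, for each $n=1,2,\ldots$ there exists a measure $\mu_n$ in $\mathcal{P}_{\{\rho_n\}}(\mathcal{A})$ such that $\check{f}_{\mathcal{A}}(\rho_n)=F(\mu_n)$. µ-compactness of the set $S(\mathcal{H})$ implies existence of a subsequence $\{\mu_{n_k}\}$ converging to a particular measure $\mu_0$. By continuity of the map $\mu\mapsto b(\mu)$ the measure $\mu_0$ belongs to the set $\mathcal{P}_{\{\rho_0\}}(\mathcal{A})$. Lower semicontinuity of functional (13) implies

$$\check{f}_{\mathcal{A}}(\rho_0)\le F(\mu_0)\le\liminf_{k\to+\infty}F(\mu_{n_k})=\lim_{k\to+\infty}\check{f}_{\mathcal{A}}(\rho_{n_k}),$$

contradicting (14).

B) Concavity of the function $\hat{f}_{\mathcal{A}}$ follows from its definition. Suppose the function $\hat{f}_{\mathcal{A}}$ is not lower semicontinuous. Then there exists a sequence $\{\rho_n\}\subset\mathrm{co}(\mathcal{A})$ converging to a state $\rho_0\in\mathrm{co}(\mathcal{A})$ such that

$$\lim_{n\to+\infty}\hat{f}_{\mathcal{A}}(\rho_n)<\hat{f}_{\mathcal{A}}(\rho_0).\tag{15}$$

For $\varepsilon>0$ let $\mu_0^{\varepsilon}$ be a measure in $\mathcal{P}_{\{\rho_0\}}(\mathcal{A})$ such that $\hat{f}_{\mathcal{A}}(\rho_0)<F(\mu_0^{\varepsilon})+\varepsilon$. By openness of the map $\mathcal{P}(\mathcal{A})\ni\mu\mapsto b(\mu)$ there exist a subsequence $\{\rho_{n_k}\}$ and a sequence $\{\mu_k\}\subset\mathcal{P}(\mathcal{A})$ converging to the measure $\mu_0^{\varepsilon}$ such that $b(\mu_k)=\rho_{n_k}$ for each $k$. Lower semicontinuity of functional (13) implies

$$\hat{f}_{\mathcal{A}}(\rho_0)\le F(\mu_0^{\varepsilon})+\varepsilon\le\liminf_{k\to+\infty}F(\mu_k)+\varepsilon\le\lim_{k\to+\infty}\hat{f}_{\mathcal{A}}(\rho_{n_k})+\varepsilon,$$

contradicting (15) (since $\varepsilon$ is arbitrary).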

The proof of the assertion in Remark 5. For an arbitrary state $\rho$ in $\mathcal{A}=\sigma\text{-co}(\{\sigma_i\})$ there exists a probability distribution $\{\pi_i\}\in P_{+\infty}$ such that $\rho=\sum_{i=1}^{+\infty}\pi_i\sigma_i$. This distribution is unique since $P_i\rho=\pi_i\lambda_i\rho_i$ for each $i$, where $P_i$ is the projector on the subspace $\mathrm{supp}\,\rho_i$. The one-to-one correspondence $P_{+\infty}\ni\{\pi_i\}\leftrightarrow\sum_i\pi_i\sigma_i\in\mathcal{A}$ is continuous in both directions (i.e. it is a homeomorphism). Indeed, continuity of the map "→" is obvious while continuity of the map "←" can be proved by using the above set $\{P_i\}$ of projectors and by noting that pointwise convergence of a sequence of probability distributions to a probability distribution implies its convergence in the norm of total variation.


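Thus to prove continuity of the von Neumann entropy on the set $\mathcal{A}$ it is sufficient to show continuity of the function $P_{+\infty}\ni\{\pi_i\}\mapsto H\left(\sum_i\pi_i\sigma_i\right)$. By the construction of the sequence $\{\sigma_i\}$ we have

$$\begin{aligned}
H\Big(\sum_{i=1}^{+\infty}\pi_i\sigma_i\Big)
&=H\Big(\Big(\sum_{i=1}^{+\infty}\pi_i(1-\lambda_i)\Big)\rho_0\oplus\bigoplus_{i=1}^{+\infty}\pi_i\lambda_i\rho_i\Big)\\
&=\sum_{i=1}^{+\infty}\pi_i(1-\lambda_i)H(\rho_0)+\sum_{i=1}^{+\infty}\pi_i\lambda_iH(\rho_i)+\eta\Big(\sum_{i=1}^{+\infty}\pi_i(1-\lambda_i)\Big)+\sum_{i=1}^{+\infty}\eta(\pi_i\lambda_i)\\
&=1+\eta\Big(\sum_{i=1}^{+\infty}\pi_i(1-\lambda_i)\Big)+\sum_{i=1}^{+\infty}\pi_i\lambda_i(-\log\pi_i)+\sum_{i=1}^{+\infty}\pi_i\lambda_i(-\log\lambda_i).
\end{aligned}$$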

By using properties of the function $x\mapsto\eta(x)$ and Lemma 11 below it is easy to show continuity of all the terms in the right side of the above expression as functions of $\{\pi_i\}$. Discontinuity of the von Neumann entropy on the set $\mathrm{cl}(\mathcal{A})=\mathcal{A}\cup\{\rho_0\}$ follows from the inequality $H(\sigma_i)\ge\lambda_iH(\rho_i)=1$, $i>0$, since $H(\rho_0)=0$.

Lemma 11. Let $\{h_j\}_{j=1}^{+\infty}$ be a nondecreasing sequence of positive numbers such that $g(\{h_j\})=\inf\left\{\lambda>0\mid\sum_{j=1}^{+\infty}e^{-\lambda h_j}<+\infty\right\}<+\infty$. Then

$$\lim_{m\to+\infty}\ \sup_{\{x_j\}\in B_1}\ \sum_{j\ge m}\eta(x_j)h_j^{-1}=g(\{h_j\}),$$

where $B_1$ is the positive part of the unit ball of the Banach space $l_1$.

Proof. We will prove first that

$$\lambda^{*}\le\sup_{\{x_j\}\in B_1}\sum_{j=1}^{+\infty}\eta(x_j)h_j^{-1}\le\lambda^{*}+h_1^{-1},\tag{16}$$

where $\lambda^{*}$ is either the unique solution of the equation $\sum_{j=1}^{+\infty}e^{-\lambda h_j}=e$ if it exists or $g(\{h_j\})$ otherwise (if $\sum_{j=1}^{+\infty}e^{-g(\{h_j\})h_j}<e$).

By using the Lagrange method it is easy to show that the function $\{x_j\}_{j=1}^{n}\mapsto\sum_{j=1}^{n}\eta(x_j)h_j^{-1}$ attains its maximum on the positive part $B_1^{n}$ of the unit ball of $\mathbb{R}^{n}$ at the vector $\{e^{-\lambda_nh_j-1}\}_{j=1}^{n}$, where $\lambda_n$ is the unique solution of the equation $\sum_{j=1}^{n}e^{-\lambda_nh_j}=e$, and hence

$$\lambda_n\le\sup_{\{x_j\}\in B_1^{n}}\sum_{j=1}^{n}\eta(x_j)h_j^{-1}=\lambda_n+\sum_{j=1}^{n}e^{-\lambda_nh_j-1}h_j^{-1}\le\lambda_n+h_1^{-1}.$$


It is easy to see that the increasing sequence $\{\lambda_n\}$ converges to $\lambda^{*}$, so by noting that $\{x_j\}\mapsto\sum_{j=1}^{+\infty}\eta(x_j)h_j^{-1}$ is a lower semicontinuous function and by passing to the limit in the above expression we obtain (16).

The assertion of the lemma can be derived from (16) applied to the sequence $\{h_{j+m}\}_{j=1}^{+\infty}$, since if the solution of the equation $\sum_{j=1}^{+\infty}e^{-\lambda h_{j+m}}=e$ exists for all $m$ it tends to $g(\{h_j\})$ as $m\to+\infty$.
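The finite-dimensional step of this proof (the Lagrange maximizer and the resulting two-sided bound) can be checked numerically. The following is only a sketch for one assumed sequence $h=(1,2,3,4,5)$; the helper names `eta`, `solve_lambda` and `objective` are illustrative, and the root $\lambda_n$ is found by plain bisection.

```python
import math

def eta(x):
    """eta(x) = -x log x, with eta(0) = 0."""
    return -x * math.log(x) if x > 0 else 0.0

def solve_lambda(h, iters=200):
    """Bisection for the root of sum_j exp(-lam*h_j) = e.
    The left side is decreasing in lam; a root lam >= 0 needs len(h) >= 3."""
    f = lambda lam: sum(math.exp(-lam * hj) for hj in h) - math.e
    lo, hi = 0.0, 1.0
    while f(hi) > 0:          # enlarge the bracket until the sign changes
        hi *= 2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def objective(x, h):
    """The function sum_j eta(x_j) / h_j from the lemma."""
    return sum(eta(xj) / hj for xj, hj in zip(x, h))

h = [1.0, 2.0, 3.0, 4.0, 5.0]                 # assumed nondecreasing positive sequence
lam = solve_lambda(h)
x_star = [math.exp(-lam * hj - 1) for hj in h]  # candidate maximizer e^{-lam h_j - 1}
value = objective(x_star, h)
# Closed form of the maximum: lam + sum_j e^{-lam h_j - 1} / h_j.
closed_form = lam + sum(xj / hj for xj, hj in zip(x_star, h))
uniform_value = objective([1.0 / len(h)] * len(h), h)
```

For this choice the candidate vector has unit $l_1$-norm, `value` agrees with `closed_form`, lies between `lam` and `lam + 1/h[0]`, and is not smaller than the value at the uniform vector.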

Acknowledgements. I am grateful to A.S.Holevo for the help and useful discussion. I am also grateful to I.Nechita for the information about the generalization of Nielsen’s theorem and to the referees for the useful remarks. This work is partially supported by the program “Mathematical control theory” of Russian Academy of Sciences, by the federal target program “Scientific and pedagogical staff of innovative Russia” (program 1.2.1, contract P 938), by the analytical departmental target program “Development of scientific potential of the higher school 2009-2010” (project 2.1.1/500) and by RFBR grants 09-01-00424-a and 10-01-00139-a. I am grateful to the organizers of the workshop Thematic Program on Mathematics in Quantum Information at the Fields Institute, where some work improving the paper was done.

References

1. Alberti, P.M., Uhlmann, A.: Stochasticity and Partial Order. Doubly Stochastic Maps and Unitary Mixing. Berlin: VEB Deutscher Verlag Wiss., 1981
2. Bennett, C.H., DiVincenzo, D.P., Smolin, J.A., Wootters, W.K.: Mixed state entanglement and quantum error correction. Phys. Rev. A 54, 3824–3851 (1996)
3. Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics. vol. I. New York-Heidelberg-Berlin: Springer Verlag, 1979
4. O'Brien, R.: On the openness of the barycentre map. Math. Ann. 223(3), 207–212 (1976)
5. Clausing, A., Papadopoulou, S.: Stable convex sets and extremal operators. Math. Ann. 231, 193–203 (1978)
6. Dell'Antonio, G.F.: On the limits of sequences of normal states. Commun. Pure Appl. Math. 20, 413–430 (1967)
7. Edgar, G.A.: On the Radon-Nikodim Property and Martingale Convergence. Lecture Notes in Mathematics 645, Berlin-Heidelberg: Springer, 1978, pp. 62–76
8. Grzaslewicz, R.: Extreme continuous function property. Acta. Math. Hungar. 74, 93–99 (1997)
9. Holevo, A.S.: On quasi-equivalence of locally normal states. Theor. Math. Phys. 13(2), 184–199 (1972)
10. Holevo, A.S.: Statistical Structure of Quantum Theory. Berlin-Heidelberg-New York: Springer-Verlag, 2001
11. Holevo, A.S., Shirokov, M.E.: Continuous ensembles and the χ-capacity of infinite dimensional channels. Theor. Prob. and Appl. 50(1), 86–98 (2005)
12. Joffe, A.D., Tikhomirov, W.M.: Theory of Extremum Problems. New York: Academic Press, 1979
13. Lindblad, G.: Expectation and entropy inequalities for finite quantum systems. Commun. Math. Phys. 39(2), 111–119 (1974)
14. Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information. Cambridge: Cambridge University Press, 2000
15. Ohya, M., Petz, D.: Quantum Entropy and Its Use. Texts and Monographs in Physics. Berlin: Springer-Verlag, 1993
16. Ozawa, M.: On information gain by quantum measurement of continuous observable. J. Math. Phys. 27, 759–763 (1986)
17. Owari, M., Braunstein, S.L., Nemoto, K., Murao, M.: ε-convertibility of entangled states and extension of Schmidt rank in infinite-dimensional systems. Quant. Inf. Comp. 8(1–2), 0030–0052 (2008)
18. Papadopoulou, S.: On the geometry of stable compact convex sets. Math. Ann. 229, 193–200 (1977)
19. Parthasarathy, K.: Probability Measures on Metric Spaces. New York-London: Academic Press, 1967
20. Protasov, V.Yu., Shirokov, M.E.: Generalized compactness in linear spaces and its applications. Sbornik: Math. 200(5), 697–722 (2009)
21. Plenio, M.B., Virmani, S.: An introduction to entanglement measures. Quant. Inf. Comp. 7, 1 (2007)
22. Rockafellar, R.: Convex Analysis. Princeton, NJ: Princeton Univ. Press, 1970
23. Shirokov, M.E.: The Holevo capacity of infinite dimensional channels and the additivity problem. Commun. Math. Phys. 262, 137–159 (2006)
24. Shirokov, M.E.: Entropic characteristics of subsets of states-I. Izvestiya: Math. 70(6), 1265–1292 (2006)


25. Shirokov, M.E.: On properties of the space of quantum states and their application to construction of entanglement monotones. Izvestiya: Math., to appear, available at http://arxiv.org/abs/0804.1515v2 [math-ph], 2009
26. Simon, B.: Convergence Theorem for Entropy. Appendix in E.H. Lieb, M.B. Ruskai, Proof of the strong subadditivity of quantum mechanical entropy. J. Math. Phys. 14, 1938 (1973)
27. Uhlmann, A.: Entropy and optimal decomposition of states relative to a maximal commutative subalgebra. Open Syst. Inf. Dyn. 5(3), 209–228 (1998)
28. Wagner, D.H.: Survey of Measurable Selection Theorems. Lecture Notes in Mathematics 794, Berlin: Springer, 1980, pp. 176–219
29. Wehrl, A.: General properties of entropy. Rev. Mod. Phys. 50, 221–250 (1978)
30. Wehrl, A.: How chaotic is a state of a quantum system. Rep. Math. Phys. 6, 15–28 (1974)

Communicated by M.B. Ruskai

Commun. Math. Phys. 296, 655–680 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1034-7


Lax Pair Equations and Connes-Kreimer Renormalization

Gabriel Bădiţoiu1, Steven Rosenberg2

1 Institute of Mathematics "Simion Stoilow" of the Romanian Academy, P.O. Box 1-764, 014700 Bucharest, Romania. E-mail: [email protected]

2 Department of Mathematics and Statistics, Boston University, Boston, MA 02215, USA. E-mail: [email protected]

Received: 3 May 2009 / Accepted: 15 January 2010
Published online: 14 March 2010 – © Springer-Verlag 2010

Abstract: We find a Lax pair equation corresponding to the Connes-Kreimer Birkhoff factorization of the character group of a Hopf algebra. This flow preserves the locality of counterterms. In particular, we obtain a flow for the character given by Feynman rules, and relate this flow to the Renormalization Group Flow.

1. Introduction

In the theory of integrable systems, many classical mechanical systems are described by a Lax pair equation associated to a coadjoint orbit of a semisimple Lie group, for example via the Adler-Kostant-Symes theorem [1]. Solutions are given by a Birkhoff factorization on the group, and in some cases this technique extends to loop group formulations of physically interesting systems such as the Toda lattice [9,13]. By the work of Connes-Kreimer [2], there is a Birkhoff factorization of characters on general Hopf algebras, in particular on the Kreimer Hopf algebra of 1PI Feynman diagrams. In this paper, we reverse the usual procedure in integrable systems: we construct a Lax pair equation $\frac{dL}{dt}=[L,M]$ on the Lie algebra of infinitesimal characters of the Hopf algebra whose solution is given precisely by the Connes-Kreimer Birkhoff factorization (Theorem 5.9). The Lax pair equation is nontrivial in the sense that it is not an infinitesimal inner automorphism. The main technical issue, that the Lie algebra of infinitesimal characters is not semisimple, is overcome by passing to the double Lie algebra with the simplest possible Lie algebra structure.

In particular, the Lax pair equation induces a flow for the character given by Feynman rules in dimensional regularization. This flow has the physical significance that it preserves locality, the independence of the character's counterterm from the mass parameter. The flow also induces Lax pair flows for the β-functions of characters.

In Sects. 1–4, we introduce a method to produce a Lax pair on any Lie algebra from equations of motion on the double Lie algebra. In Sect. 5, we apply this method to the particular case of the Lie algebra of infinitesimal characters of a Hopf algebra, and prove


Theorem 5.9. Although we focus on the minimal subtraction scheme for simplicity, the main results hold in any renormalization scheme.

In Sects. 6–8, we discuss physical implications of the Lax pair flow. These implications are of two types: results in Sect. 7 which say that local characters remain local under the flow, and results in Sect. 8 which compare the Lax pair flow to the Renormalization Group Flow (RGF) usually considered in quantum field theory. As discussed in the beginning of Sect. 6, the RGF is a flow on the group $G_{\mathbb{C}}$ of scalar valued (i.e. renormalized) characters on the Hopf algebra of Feynman diagrams, while the Lax pair flow is on the Lie algebra $\mathfrak{g}_{\mathcal{A}}$ of infinitesimal characters with values in Laurent series. There are several ways to compare the RGF to the Lax pair flow, all of which involve some identifications of the different spaces for the flows.

Working at the group level, we can consider an unrenormalized character $\varphi$ (e.g. before dimensional regularization and minimal subtraction) as an element of the corresponding group $G_{\mathcal{A}}$. Manchon's bijection [12] $\tilde{R}:G_{\mathcal{A}}\to\mathfrak{g}_{\mathcal{A}}$ transfers $\varphi$ to an infinitesimal character $\tilde{R}(\varphi)$; this bijection has better behavior than the logarithm map. This infinitesimal character has a Lax pair flow, which we can then transfer back to $G_{\mathcal{A}}$ to obtain a flow $\varphi_t$. Finally, we can compare the flow of the renormalized scalar valued characters $(\varphi_t)_+(0)$ in Connes-Kreimer notation to the RGF of $\varphi_+(0)$. However, even in simple examples these two flows are not the same.

Working at the infinitesimal level, we consider the β-function of a renormalized character, since the β-function is essentially the infinitesimal generator of the RGF. The β-function of a character is an element of the Lie algebra $\mathfrak{g}_{\mathbb{C}}$ of scalar valued characters, so in Sect. 6 we extend the β-function to a β-character in $\mathfrak{g}_{\mathcal{A}}$. (This extension has previously appeared in the literature in a different context.) This material is used in the main results in Sects. 7–8.

In preparation for the study of the RGF, in Sect. 7 we discuss "how physical" the Lax pair flow is. We first show that the Lax pair flow is trivial on primitives in the Hopf algebra. We then prove the main result (Theorem 7.3) that local characters for the minimal subtraction scheme remain local under the Lax pair flow, so the flow stays inside the set of physically plausible characters. We also discuss the dependence of this result on the renormalization scheme, and identify characters for which the Lax pair flow is trivial.

From the results in Sect. 7, it seems unlikely that one can directly identify the RGF with a Lax pair flow, so in Sect. 8 we track how the RGF changes under the Lax pair flow. For example, even if the Lax pair flow is nontrivial for a given initial physically plausible character, one might hope that the RGF is unchanged. In Sect. 8, we give a criterion (Corollary 8.4) for when the RGF is fixed under the Lax pair flow. We show that the β-character and the β-function satisfy Lax pair equations, and briefly discuss the complete integrability of Lax pair flows of characters and β-functions.

We would like to thank Dirk Kreimer for suggesting we investigate the connection between the Connes-Kreimer factorization and integrable systems, Dominique Manchon for helpful conversations, and the referee for valuable suggestions.

2. The Double Lie Algebra and its Associated Lie Group

There is a well known method to associate a Lax pair equation to a Casimir element on the dual $\mathfrak{g}^{*}$ of a semisimple Lie algebra $\mathfrak{g}$ [13]. The semisimplicity is used to produce an Ad-invariant, symmetric, non-degenerate bilinear form on $\mathfrak{g}$, allowing an identification of $\mathfrak{g}$ with $\mathfrak{g}^{*}$. For a general Lie algebra $\mathfrak{g}$, there may be no such bilinear form. To produce a Lax pair, we need to extend $\mathfrak{g}$ to a larger Lie algebra with the desired bilinear form. We


do this by constructing a Lie bialgebra structure on $\mathfrak{g}$, whose definition we now recall (see e.g. [10]).

Definition 2.1. A Lie bialgebra is a Lie algebra $(\mathfrak{g},[\cdot,\cdot])$ with a linear map $\gamma:\mathfrak{g}\to\mathfrak{g}\otimes\mathfrak{g}$ such that

(a) ${}^{t}\gamma:\mathfrak{g}^{*}\otimes\mathfrak{g}^{*}\to\mathfrak{g}^{*}$ defines a Lie bracket on $\mathfrak{g}^{*}$,

(b) $\gamma$ is a 1-cocycle of $\mathfrak{g}$, i.e.

$$\mathrm{ad}^{(2)}_x(\gamma(y))-\mathrm{ad}^{(2)}_y(\gamma(x))-\gamma([x,y])=0,$$

where $\mathrm{ad}^{(2)}_x:\mathfrak{g}\otimes\mathfrak{g}\to\mathfrak{g}\otimes\mathfrak{g}$ is given by $\mathrm{ad}^{(2)}_x(y\otimes z)=\mathrm{ad}_x(y)\otimes z+y\otimes\mathrm{ad}_x(z)=[x,y]\otimes z+y\otimes[x,z]$.

A Lie bialgebra $(\mathfrak{g},[\cdot,\cdot],\gamma)$ induces a Lie algebra structure on the double Lie algebra $\mathfrak{g}\oplus\mathfrak{g}^{*}$ by

$$[X,Y]_{\mathfrak{g}\oplus\mathfrak{g}^{*}}=[X,Y],\qquad[X^{*},Y^{*}]_{\mathfrak{g}\oplus\mathfrak{g}^{*}}={}^{t}\gamma(X^{*}\otimes Y^{*}),\qquad[X,Y^{*}]=\mathrm{ad}^{*}_X(Y^{*}),$$

for $X,Y\in\mathfrak{g}$ and $X^{*},Y^{*}\in\mathfrak{g}^{*}$, where $\mathrm{ad}^{*}$ is the coadjoint representation given by $\mathrm{ad}^{*}_X(Y^{*})(Z)=-Y^{*}(\mathrm{ad}_X(Z))$ for $Z\in\mathfrak{g}$.

Since it is difficult to construct explicitly the Lie group associated to the Lie algebra $\mathfrak{g}\oplus\mathfrak{g}^{*}$, we will choose the trivial Lie bialgebra given by the cocycle $\gamma=0$ and denote by $\delta=\mathfrak{g}\oplus\mathfrak{g}^{*}$ the associated Lie algebra. Let $\{Y_i,\ i=1,\ldots,l\}$ be a basis of $\mathfrak{g}$, with dual basis $\{Y_i^{*}\}$. The Lie bracket $[\cdot,\cdot]_\delta$ on $\delta$ is given by

$$[Y_i,Y_j]_\delta=[Y_i,Y_j],\qquad[Y_i^{*},Y_j^{*}]_\delta=0,\qquad[Y_i,Y_j^{*}]_\delta=-\sum_k c_{ik}^{j}Y_k^{*},$$

where the $c_{ij}^{k}$ are the structure constants: $[Y_i,Y_j]=\sum_k c_{ij}^{k}Y_k$. The Lie group naturally associated to $\delta$ is given by the following proposition.

Proposition 2.2. Let $G$ be the simply connected Lie group with Lie algebra $\mathfrak{g}$ and let $\theta:G\times\mathfrak{g}^{*}\to\mathfrak{g}^{*}$ be the coadjoint representation $\theta(g,X)=\mathrm{Ad}^{*}_G(g)(X)$. Then the Lie algebra of the semi-direct product $\tilde{G}=G\ltimes_\theta\mathfrak{g}^{*}$ is the double Lie algebra $\delta$.

Proof. The Lie group law on the semi-direct product $\tilde{G}$ is given by

$$(g,X)\cdot(g',X')=(gg',\,X+\theta(g,X')).$$

Let $\tilde{\mathfrak{g}}$ be the Lie algebra of $\tilde{G}$. Then the bracket on $\tilde{\mathfrak{g}}$ is given by

$$[X,Y^{*}]_{\tilde{\mathfrak{g}}}=d\theta(X,Y^{*}),\qquad[X,Y]_{\tilde{\mathfrak{g}}}=[X,Y],\qquad[X^{*},Y^{*}]_{\tilde{\mathfrak{g}}}=0,$$

for left-invariant vector fields $X,Y$ of $G$ and $X^{*},Y^{*}\in\mathfrak{g}^{*}$. We have $d\theta(X,Y^{*})=d\mathrm{Ad}^{*}_G(X)(Y^{*})=[X,Y^{*}]_\delta$, since $d\mathrm{Ad}_G=\mathrm{ad}_{\mathfrak{g}}$.

The main point of this construction is the existence of a good bilinear form on the double.


Lemma 2.3. The natural pairing $\langle\cdot,\cdot\rangle:\delta\otimes\delta\to\mathbb{C}$ given by

$$\langle(a,b^{*}),(c,d^{*})\rangle=d^{*}(a)+b^{*}(c),\qquad a,c\in\mathfrak{g},\ b^{*},d^{*}\in\mathfrak{g}^{*},$$

is an Ad-invariant symmetric non-degenerate bilinear form on the Lie algebra $\delta$.

Proof. By [10], this bilinear form is ad-invariant. Since $\tilde{G}$ is simply connected, the Ad-invariance follows. As an explicit example, we have $\mathrm{Ad}_{\tilde{G}}((g,0))(Y_i,0)=(\mathrm{Ad}_G(g)(Y_i),0)$, and $\mathrm{Ad}_{\tilde{G}}((g,0))(0,Y_j^{*})=(0,\mathrm{Ad}^{*}_G(g)(Y_j^{*}))$, from which the invariance under $\mathrm{Ad}_{\tilde{G}}(g,0)$ follows.
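The structure of the double with $\gamma=0$ can be checked on a small example. The sketch below is illustrative only: it assumes $\mathfrak{g}=\mathfrak{so}(3)$, whose structure constants are the Levi-Civita symbol, stores an element of $\delta$ as a list of $2l$ integer coordinates (first the $Y_i$, then the $Y_i^{*}$), and verifies the Jacobi identity and the ad-invariance of the natural pairing on basis elements. The function names are hypothetical.

```python
from itertools import product

l = 3  # dim g for the assumed example g = so(3)

def eps(i, j, k):
    """Levi-Civita symbol on indices 0..2: structure constants c_{ij}^k of so(3)."""
    return (i - j) * (j - k) * (k - i) // 2

def bracket(u, v):
    """Bracket on delta = g + g* for the trivial cocycle gamma = 0."""
    w = [0] * (2 * l)
    for i, j, k in product(range(l), repeat=3):
        # [Y_i, Y_j] = sum_k c_{ij}^k Y_k
        w[k] += eps(i, j, k) * u[i] * v[j]
        # [Y_i, Y_j^*] = -sum_k c_{ik}^j Y_k^*, and [Y_j^*, Y_i] = -[Y_i, Y_j^*]
        d = eps(i, k, j)
        w[l + k] += -d * u[i] * v[l + j] + d * v[i] * u[l + j]
    return w

def pairing(u, v):
    """Natural pairing <(a, b*), (c, d*)> = d*(a) + b*(c)."""
    return sum(u[i] * v[l + i] + u[l + i] * v[i] for i in range(l))

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def basis(n):
    e = [0] * (2 * l)
    e[n] = 1
    return e

E = [basis(n) for n in range(2 * l)]
zero = [0] * (2 * l)

# Jacobi identity [a,[b,c]] + [b,[c,a]] + [c,[a,b]] = 0 on all basis triples.
jacobi_ok = all(
    add(add(bracket(a, bracket(b, c)), bracket(b, bracket(c, a))),
        bracket(c, bracket(a, b))) == zero
    for a, b, c in product(E, repeat=3))

# ad-invariance <[a,b],c> + <b,[a,c]> = 0 on all basis triples.
invariant_ok = all(
    pairing(bracket(a, b), c) + pairing(b, bracket(a, c)) == 0
    for a, b, c in product(E, repeat=3))
```

Since both identities are multilinear, checking them on basis elements suffices. Note that $\mathfrak{so}(3)$ is semisimple, so this example only illustrates the bracket and pairing formulas, not the non-semisimple situation the construction is designed for.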

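3. The Loop Algebra of a Lie Algebra

Following [1], we consider the loop algebra

$$L\delta=\Big\{L(\lambda)=\sum_{j=M}^{N}\lambda^{j}L_j\ \Big|\ M,N\in\mathbb{Z},\ L_j\in\delta\Big\}.$$

The natural Lie bracket on $L\delta$ is given by

$$\Big[\sum_i\lambda^{i}L_i,\ \sum_j\lambda^{j}L'_j\Big]=\sum_k\lambda^{k}\sum_{i+j=k}[L_i,L'_j].$$

Set

$$L\delta_+=\Big\{L(\lambda)=\sum_{j=0}^{N}\lambda^{j}L_j\ \Big|\ N\in\mathbb{Z}^{+}\cup\{0\},\ L_j\in\delta\Big\},\qquad
L\delta_-=\Big\{L(\lambda)=\sum_{j=-M}^{-1}\lambda^{j}L_j\ \Big|\ M\in\mathbb{Z}^{+},\ L_j\in\delta\Big\}.$$

Let $P_+:L\delta\to L\delta_+$ and $P_-:L\delta\to L\delta_-$ be the natural projections and set $R=P_+-P_-$. The natural pairing $\langle\cdot,\cdot\rangle$ on $\delta$ yields an Ad-invariant, symmetric, non-degenerate pairing on $L\delta$ by setting

$$\Big\langle\sum_{i=M}^{N}\lambda^{i}L_i,\ \sum_{j=M'}^{N'}\lambda^{j}L'_j\Big\rangle=\sum_{i+j=-1}\langle L_i,L'_j\rangle.$$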
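The projections $P_\pm$, the operator $R=P_+-P_-$ and the residue-type pairing can be sketched on finite Laurent polynomials. This is only an illustration under assumptions of our own: a loop is stored as a dictionary mapping each power of $\lambda$ to a coefficient in $\delta$, $\delta$ has $2l$ coordinates with $l=2$, and all names and sample values are hypothetical.

```python
l = 2  # assumed dimension of g; coefficients in delta = g + g* have 2l coordinates

def pairing(u, v):
    """Natural pairing on delta: <(a, b*), (c, d*)> = d*(a) + b*(c)."""
    return sum(u[i] * v[l + i] + u[l + i] * v[i] for i in range(l))

def P_plus(L):
    """Projection onto nonnegative powers of lambda."""
    return {j: c for j, c in L.items() if j >= 0}

def P_minus(L):
    """Projection onto negative powers of lambda."""
    return {j: c for j, c in L.items() if j < 0}

def R(L):
    """R = P_+ - P_-: flips the sign of the negative-power coefficients."""
    out = {j: list(c) for j, c in P_plus(L).items()}
    out.update({j: [-x for x in c] for j, c in P_minus(L).items()})
    return out

def loop_pairing(L1, L2):
    """Sum of <L_i, L'_j> over i + j = -1."""
    return sum(pairing(c, L2[-1 - i]) for i, c in L1.items() if (-1 - i) in L2)

# Two sample loops (arbitrary integer coefficients).
L = {-2: [1, 0, 0, 3], -1: [0, 2, 1, 0], 0: [1, 1, 0, 0], 1: [0, 0, 2, 5]}
K = {-1: [0, 1, 4, 0], 0: [2, 0, 0, 1], 3: [1, 1, 1, 1]}
```

Because $i+j=-1$ cannot hold with both indices nonnegative or both negative, each of $L\delta_+$ and $L\delta_-$ pairs to zero with itself; only the cross terms between the two halves contribute.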

For our choice of basis $\{Y_i\}$ of $\mathfrak{g}$, we get an isomorphism $I:L(\delta^{*})\to L\delta$ with

$$I\Big(\sum L_i^{j}Y_j\lambda^{i}\Big)=\sum L_i^{j}Y_j^{*}\lambda^{-1-i}.\tag{3.1}$$

We will need the following lemmas.


Lemma 3.1 [1]. We have the following natural identifications: $L\delta_+=L(\delta^{*})_-$ and $L\delta_-=L(\delta^{*})_+$.

Lemma 3.2 [13, Lem. 4.1]. Let $\varphi$ be an Ad-invariant polynomial on $\delta$. Then

$$\varphi_{m,n}[L(\lambda)]=\mathrm{Res}_{\lambda=0}\big(\lambda^{-n}\varphi(\lambda^{m}L(\lambda))\big)$$

is an Ad-invariant polynomial on $L\delta$ for $m,n\in\mathbb{Z}$.

As a double Lie algebra, $\delta$ has an Ad-invariant polynomial, the quadratic polynomial

$$\psi(Y)=\langle Y,Y\rangle$$

associated to the natural pairing. Let $Y_{l+i}=Y_i^{*}$ for $i\in\{1,\ldots,l=\dim(\mathfrak{g})\}$, so elements of $L\delta$ can be written $L(\lambda)=\sum_{j=1}^{2l}\sum_{i=-M}^{N}L_i^{j}Y_j\lambda^{i}$. Then the Ad-invariant polynomials

$$\psi_{m,n}(L(\lambda))=\mathrm{Res}_{\lambda=0}\big(\lambda^{-n}\psi(\lambda^{m}L(\lambda))\big),\tag{3.2}$$

defined as in Lemma 3.2 are given by

$$\psi_{m,n}(L(\lambda))=2\sum_{j=1}^{l}\ \sum_{i+k-n+2m=-1}L_i^{j}L_k^{j+l}.\tag{3.3}$$

Note that powers of $\psi$ are also Ad-invariant polynomials on $\delta$, so

$$\psi^{k}_{m,n}(L(\lambda))=\mathrm{Res}_{\lambda=0}\big(\lambda^{-n}\psi^{k}(\lambda^{m}L(\lambda))\big)\tag{3.4}$$

are Ad-invariant polynomials on $L\delta$. It would be interesting to classify all Ad-invariant polynomials on $L\delta$ in general.

4. The Lax Pair Equation

Let $P_+,P_-$ be endomorphisms of a Lie algebra $\mathfrak{h}$ and set $R=P_+-P_-$. Assume that $[X,Y]_R=[P_+X,P_+Y]-[P_-X,P_-Y]$ is a Lie bracket on $\mathfrak{h}$. From [13, Theorem 2.1], the equations of motion induced by a Casimir (i.e. Ad-invariant) function $\varphi$ on $\mathfrak{h}^{*}$ are given by

$$\frac{dL}{dt}=-\mathrm{ad}^{*}_{\mathfrak{h}}M\cdot L,\tag{4.1}$$

for $L\in\mathfrak{h}^{*}$, where $M=\frac{1}{2}R(d\varphi(L))\in\mathfrak{h}$.

Now we take $\mathfrak{h}=(L\delta)^{*}=L(\delta^{*})$, with $\delta$ a finite dimensional Lie algebra and with the understanding that $(L\delta)^{*}$ is the graded dual with respect to the standard $\mathbb{Z}$-grading on $L\delta$. Let $P_\pm$ be the projections of $L\delta^{*}$ onto $L\delta^{*}_\pm$. After identifying $L\delta^{*}=L\delta$ and $\mathrm{ad}^{*}=-\mathrm{ad}$ via the map $I$ in (3.1), the equations of motion (4.1) can be written in Lax pair form

$$\frac{dL}{dt}=[M,L],\tag{4.2}$$

where $M=\frac{1}{2}R(I(d\varphi(L(\lambda))))\in L\delta$, and $\varphi$ is a Casimir function on $L\delta^{*}=L\delta$ [13, Th. 2.1]. Finding a solution for (4.2) reduces to the Riemann-Hilbert (or Birkhoff) factorization problem. The following theorem is a corollary of [1, Th. 4.37] [13, Th. 2.2].


Theorem 4.1. Let $\varphi$ be a Casimir function on $L\delta$ and set $X=I(d\varphi(L(\lambda)))\in L\delta$, for $L(\lambda)=L(0)(\lambda)\in L\delta$. Let $g_\pm(t)$ be the smooth curves in $L\tilde{G}$ which solve the factorization problem

$$\exp(-tX)=g_-(t)^{-1}g_+(t),$$

with $g_\pm(0)=e$, and with $g_+(t)=g_+(t)(\lambda)$ holomorphic in $\lambda\in\mathbb{C}$ and $g_-(t)$ a polynomial in $1/\lambda$ with no constant term. Let $M=\frac{1}{2}R(I(d\varphi(L(\lambda))))\in L\delta$. Then the integral curve $L(t)$ of the Lax pair equation

$$\frac{dL}{dt}=[L,M]$$

is given by

$$L(t)=\mathrm{Ad}_{L\tilde{G}}\,g_\pm(t)\cdot L(0).\tag{4.3}$$
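Formula (4.3) says that the solution stays on an adjoint orbit. For a matrix Lie algebra this forces the traces $\mathrm{tr}(L^{k})$ to be constant along any equation of Lax form, which can be seen at the infinitesimal level from $\frac{d}{dt}\mathrm{tr}(L^{k})=k\,\mathrm{tr}(L^{k-1}[L,M])=0$. The sketch below checks this identity exactly on integer matrices. It is a finite-dimensional illustration only: the $3\times 3$ matrices `L` and `M` are arbitrary choices of ours and are not the loop-algebra elements of the theorem.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def comm(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(AB, BA)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Arbitrary integer matrices standing in for L and M.
L = [[1, 2, 0], [3, -1, 4], [0, 5, 2]]
M = [[0, 1, 2], [-1, 3, 0], [4, 0, -2]]

dL = comm(L, M)  # right-hand side of dL/dt = [L, M]

# d/dt tr(L^k) = k * tr(L^(k-1) * dL) for k = 1, 2, 3.
derivs = []
P = [[int(i == j) for j in range(3)] for i in range(3)]  # L^0 = identity
for k in range(1, 4):
    derivs.append(k * trace(matmul(P, dL)))
    P = matmul(P, L)
```

All three derivatives vanish by cyclicity of the trace even though `dL` itself is nonzero, which is the elementary mechanism behind conserved quantities of Lax pair flows.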

This Lax pair equation projects to a Lax pair equation on the loop algebra of the original Lie algebra $\mathfrak{g}$. Let $\pi_1$ be either the projection of $\tilde{G}$ onto $G$ or its differential from $\delta$ onto $\mathfrak{g}$. This extends to a projection of $L\delta$ onto $L\mathfrak{g}$. The projection of (4.2) onto $L\mathfrak{g}$ is

$$\frac{d(\pi_1(L(t)))}{dt}=[\pi_1(L),\pi_1(M)],\tag{4.4}$$

since $\pi_1=d\pi_1$ commutes with the bracket. Thus the equations of motion (4.2) induce a Lax pair equation on $L\mathfrak{g}$, although this is not the equations of motion for a Casimir on $L\mathfrak{g}$.

Theorem 4.2. The Lax pair equation of Theorem 4.1 projects to a Lax pair equation on $L\mathfrak{g}$.

Remark 4.3. The content of this theorem is that a Lax pair equation on the Lie algebra of a semi-direct product $G\ltimes G'$ evolves on an adjoint orbit, and the projection onto $\mathfrak{g}$ evolves on an adjoint orbit and is still in Lax pair form. Lax pair equations often appear as equations of motion for some Hamiltonian, but the projection may not be the equations of motion for any function on the smaller Lie algebra. We thank B. Khesin for this observation.

When $\psi_{m,n}$ is the Casimir function on $L\delta$ given by (3.2), $X$ can be written nicely in terms of $L(\lambda)$.

Proposition 4.4. Let $X=I(d\psi_{m,n}(L(\lambda)))$. Then

$$X=2\lambda^{-n+2m}L(\lambda).\tag{4.5}$$

Proof. Write $L(\lambda)=\sum_{i,j}L_i^{j}\lambda^{i}Y_j$. By formula (3.3), we have

$$\frac{\partial\psi_{m,n}}{\partial L_p^{t}}=\begin{cases}2L^{t+l}_{n-1-2m-p}, & \text{if } t\le l,\\[2pt] 2L^{t-l}_{n-1-2m-p}, & \text{if } t>l.\end{cases}\tag{4.6}$$


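Therefore

$$\begin{aligned}
X=I(d\psi_{m,n}(L(\lambda)))&=\sum_{p,t}\frac{\partial\psi_{m,n}}{\partial L_p^{t}}\lambda^{-1-p}Y_t^{*}\\
&=2\lambda^{-n+2m}\sum_p\Big(\sum_{t=1}^{l}L^{t+l}_{n-1-2m-p}Y_{t+l}\lambda^{n-1-2m-p}+\sum_{t=l+1}^{2l}L^{t-l}_{n-1-2m-p}Y_{t-l}\lambda^{n-1-2m-p}\Big)\\
&=2\lambda^{-n+2m}L(\lambda).
\end{aligned}\tag{4.7}$$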

5. The Main Theorem for Hopf Algebras

In this section we give formulas for the Birkhoff decomposition of a loop in the Lie group of characters of a Hopf algebra and produce the Lax pair equations associated to the Birkhoff decomposition. We present two approaches, both motivated by the Connes-Kreimer Hopf algebra of 1PI Feynman graphs. First, in analogy to truncating Feynman integral calculations at a certain loop level, we truncate a (possibly infinitely generated) Hopf algebra to a finitely generated Hopf algebra, and solve Lax pair equations on the finite dimensional piece (Theorem 5.4). We also discuss the compatibility of solutions related to different truncations. Second, we solve a Lax pair equation associated to Ad-covariant maps on the full Hopf algebra (Theorem 5.9). These results are proven for the minimal subtraction scheme, but apply to other renormalization schemes.

In Sect. 5.1, we introduce notation and prove a Birkhoff decomposition for the loop group associated to a doubled Lie algebra. In Sect. 5.2, we introduce the truncation process and prove Theorem 5.4. In Sect. 5.3, we treat the Feynman rules character and prove Theorem 5.9. In Sect. 5.4, we note that the methods of this section apply to any renormalization scheme given by a linear projection satisfying the Rota-Baxter equation.

5.1. Birkhoff decompositions for doubled Lie algebras. Let $H=(H,1,\mu,\Delta,\varepsilon,S)$ be a graded connected Hopf algebra over $\mathbb{C}$. Let $\mathcal{A}$ be a unital commutative algebra with unit $1_{\mathcal{A}}$. Unless stated otherwise, $\mathcal{A}$ will be the algebra of Laurent series; the only other occurrence in this paper is $\mathcal{A}=\mathbb{C}$.

Definition 5.1. The character group $G_{\mathcal{A}}$ of the Hopf algebra $H$ is the set of algebra morphisms $\phi:H\to\mathcal{A}$ with $\phi(1)=1_{\mathcal{A}}$. The group law is given by the convolution product

$$(\psi_1\star\psi_2)(h)=\langle\psi_1\otimes\psi_2,\Delta h\rangle;$$

the unit element is $\varepsilon$.

Definition 5.2. An $\mathcal{A}$-valued infinitesimal character of a Hopf algebra $H$ is a $\mathbb{C}$-linear map $Z:H\to\mathcal{A}$ satisfying

$$\langle Z,hk\rangle=\langle Z,h\rangle\varepsilon(k)+\varepsilon(h)\langle Z,k\rangle.$$


G. B˘adi¸toiu, S. Rosenberg

The set of infinitesimal characters is denoted by g_A and is endowed with a Lie algebra bracket: [Z′, Z″] = Z′ ⋆ Z″ − Z″ ⋆ Z′, for Z′, Z″ ∈ g_A, where ⟨Z′ ⋆ Z″, h⟩ = ⟨Z′ ⊗ Z″, Δ(h)⟩. Notice that Z(1) = 0. For a finitely generated Hopf algebra, G_C is a Lie group with Lie algebra g_C, and for any Hopf algebra and any A, the same is true at least formally. We recall that δ = g_C ⊕ g_C^* is the double of g_C, where g_C^* is the graded dual of g_C. We consider the algebra Ωδ = δ ⊗ A of formal Laurent series with values in δ:

\[
\Omega\delta = \Big\{ L(\lambda) = \sum_{j=-N}^{\infty} \lambda^{j} L_j \;\Big|\; L_j \in \delta,\ N \in \mathbb{Z} \Big\}.
\]

The natural Lie bracket on Ωδ is

\[
\Big[ \sum_i \lambda^{i} L_i,\ \sum_j \lambda^{j} L_j \Big] = \sum_k \lambda^{k} \sum_{i+j=k} [L_i, L_j].
\]

Set

\[
\Omega\delta_+ = \Big\{ L(\lambda) = \sum_{j=0}^{\infty} \lambda^{j} L_j \;\Big|\; L_j \in \delta \Big\},
\qquad
\Omega\delta_- = \Big\{ L(\lambda) = \sum_{j=-N}^{-1} \lambda^{j} L_j \;\Big|\; L_j \in \delta,\ N \in \mathbb{Z}^{+} \Big\}.
\]

Recall that for any Lie group K, a loop L(λ) with values in K has a Birkhoff decomposition if L(λ) = L(λ)_-^{-1} L(λ)_+ with L(λ)_-^{-1} holomorphic in λ^{-1} ∈ P^1 − {0} and L(λ)_+ holomorphic in λ ∈ P^1 − {∞}. In the next lemma, G̃ refers to G ⋉_θ g^* as in Prop. 2.2. We prove the existence of a Birkhoff decomposition for any element (g, α) ∈ G̃.

Theorem 5.3. Every (g, α) ∈ G̃ = G_A ⋉_{Ad^*_{G_A}} g_A^* has a Birkhoff decomposition (g, α) = (g_-, α_-)^{-1} (g_+, α_+) with (g_+, α_+) holomorphic in λ and (g_-, α_-) a polynomial in λ^{-1} without constant term.

Proof. We recall that (g_1, α_1)(g_2, α_2) = (g_1 g_2, α_1 + Ad^*(g_1)(α_2)). Thus (g, α) = (g_-, α_-)^{-1} (g_+, α_+) if and only if g = g_-^{-1} g_+ and α = Ad^*(g_-^{-1})(−α_- + α_+). Let g = g_-^{-1} g_+ be the Birkhoff decomposition of g in G_A given in [2,6,12]. Set α_+ = P_+(Ad^*(g_-)(α)) and α_- = −P_-(Ad^*(g_-)(α)), where P_+ and P_- are the holomorphic and pole part, respectively. Then for this choice of α_+ and α_-, we have (g, α) = (g_-, α_-)^{-1} (g_+, α_+). Note that the Birkhoff decomposition is unique.
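The "if and only if" step in the proof of Theorem 5.3 can be spelled out as follows. This is only a sketch: it assumes that Ad^* is a group representation, so that Ad^*(g_1)Ad^*(g_2) = Ad^*(g_1 g_2), and that P_+ + P_- = Id on Laurent series.

```latex
% Group law recalled in the proof:
%   (g_1,\alpha_1)(g_2,\alpha_2) = (g_1 g_2,\ \alpha_1 + \mathrm{Ad}^*(g_1)\alpha_2).
\begin{align*}
(g_-,\alpha_-)^{-1} &= \bigl(g_-^{-1},\ -\mathrm{Ad}^*(g_-^{-1})\alpha_-\bigr),\\
(g_-,\alpha_-)^{-1}(g_+,\alpha_+)
  &= \bigl(g_-^{-1}g_+,\ -\mathrm{Ad}^*(g_-^{-1})\alpha_- + \mathrm{Ad}^*(g_-^{-1})\alpha_+\bigr)\\
  &= \bigl(g_-^{-1}g_+,\ \mathrm{Ad}^*(g_-^{-1})(-\alpha_- + \alpha_+)\bigr).
\end{align*}
% With \alpha_+ = P_+(\mathrm{Ad}^*(g_-)\alpha) and \alpha_- = -P_-(\mathrm{Ad}^*(g_-)\alpha):
\begin{align*}
-\alpha_- + \alpha_+ &= (P_- + P_+)\bigl(\mathrm{Ad}^*(g_-)\alpha\bigr) = \mathrm{Ad}^*(g_-)\alpha,\\
\mathrm{Ad}^*(g_-^{-1})(-\alpha_- + \alpha_+) &= \mathrm{Ad}^*(g_-^{-1})\,\mathrm{Ad}^*(g_-)\alpha = \alpha .
\end{align*}
```

The second component therefore reproduces α exactly when α_± are the holomorphic and pole parts of Ad^*(g_-)(α), up to the sign convention on α_-.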

Lax Pair Equations and Connes-Kreimer Renormalization


5.2. Lax pair equations for the truncated Lie algebra of infinitesimal characters. For a finitely generated Hopf algebra, we can apply Theorems 4.1, 4.2 to produce a Lax pair equation on Lδ and on the loop space of infinitesimal characters Lg. However, the common Hopf algebras of 1PI Feynman diagrams and rooted trees are not finitely generated. As we now explain, we can truncate the Hopf algebra to a finitely generated Hopf algebra, and use the Birkhoff decomposition to solve a Lax pair equation on the infinitesimal character group of the truncation.

A graded Hopf algebra H = ⊕_{n∈N} H_n is said to be of finite type if each homogeneous component H_n is a finite dimensional vector space. Let B = {T_i}_{i∈N} be a minimal set of homogeneous generators of the Hopf algebra H such that deg(T_i) ≤ deg(T_j) if i < j and such that T_0 = 1. For i > 0, we define the C-valued infinitesimal character Z_i on generators by Z_i(T_j) = δ_{ij}. The Lie algebra of infinitesimal characters g is a graded Lie algebra generated by {Z_i}_{i>0}. Let g^{(k)} be the vector space generated by {Z_i | deg(T_i) ≤ k}. We define deg(Z_i) = deg(T_i) and set

\[
[Z_i, Z_j]_{\mathfrak{g}^{(k)}} =
\begin{cases}
[Z_i, Z_j] & \text{if } \deg(Z_i) + \deg(Z_j) \le k,\\
0 & \text{if } \deg(Z_i) + \deg(Z_j) > k.
\end{cases}
\]

We identify ϕ ∈ G_C with {ϕ(T_i)} ∈ C^N and on C^N we set a group law given by {ϕ1(T_i)} ⊕ {ϕ2(T_i)} = {(ϕ1 ⋆ ϕ2)(T_i)}. G^{(k)} = {{ϕ(T_i)}_{{i | deg(T_i) ≤ k}} | ϕ ∈ G_C} is a finite dimensional Lie subgroup of G_C = (C^N, ⊕) and the Lie algebra of G^{(k)} is g^{(k)}. There is no loss of information under this identification, as ϕ(T_i T_j) = ϕ(T_i)ϕ(T_j). Let δ^{(k)} be the double Lie algebra of g^{(k)} and let G̃^{(k)} be the simply connected Lie group with Lie(G̃^{(k)}) = δ^{(k)} as in Proposition 2.2. The following theorem is a restatement of Theorem 4.1 in our new setup.

Theorem 5.4. Let H = ⊕_n H_n be a graded connected Hopf algebra of finite type, and let ψ : Lδ^{(k)} → C be a Casimir function (e.g. ψ(L) = ψ_{m,n}(L(λ)) = Res_{λ=0}(λ^m ψ(λ^n L(λ))) with ψ : δ^{(k)} × δ^{(k)} → C the natural pairing of δ^{(k)}). Set X = I(dψ(L_0)) for L_0 ∈ Lδ^{(k)}. Then the solution in Lδ^{(k)} of

\[
\frac{dL}{dt} = [L, M]_{L\delta^{(k)}}, \qquad M = \frac{1}{2} R(I(d\psi(L))) \tag{5.1}
\]

with initial condition L(0) = L_0 is given by

\[
L(t) = \mathrm{Ad}_{L\tilde{G}^{(k)}}\, g_{\pm}(t) \cdot L_0, \tag{5.2}
\]

where exp(−tX) has the Connes-Kreimer Birkhoff factorization exp(−tX) = g_-(t)^{-1} g_+(t).

Remark 5.5. (i) If L_0 ∈ Lδ, there exists k ∈ N such that L_0 ∈ Lδ^{(k)}. Indeed L_0 ∈ Lδ is generated over C[λ, λ^{-1}] by a finite number of {Z_i}, and we can choose k ≥ max{deg(Z_i)}. (ii) While the Hopf algebra of rooted trees and the Connes-Kreimer Hopf algebra of 1PI Feynman diagrams satisfy the hypothesis of Theorem 5.4, the Feynman rules character does not lie in LG̃, as explained below.

In the next sections, we will investigate the relationship between the Lax pair flow L(t) and the Renormalization Group Equation. In preparation, we project from Lδ^{(k)} to Lg^{(k)} via π_1 as in Sect. 4.
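To make the Birkhoff factorization used in Theorem 5.4 concrete, here is a minimal Python sketch on a toy truncated Hopf algebra. Everything in it is an illustrative assumption rather than the authors' construction: the generators t_0 = 1, t_1, ..., t_k carry the ladder-type coproduct Δt_n = Σ_{j=0}^{n} t_j ⊗ t_{n−j}, Laurent polynomials are stored as dictionaries, and the factorization is computed by the standard Bogoliubov-type recursion with minimal subtraction, which this section cites from [2] but does not write out.

```python
from fractions import Fraction

# A Laurent polynomial in lambda is a dict {exponent: coefficient}.

def add(a, b):
    out = dict(a)
    for e, c in b.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}

def mul(a, b):
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[ea + eb] = out.get(ea + eb, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}

def neg(a):
    return {e: -c for e, c in a.items()}

def pole_part(a):
    # Minimal subtraction: keep strictly negative powers of lambda.
    return {e: c for e, c in a.items() if e < 0}

def regular_part(a):
    return {e: c for e, c in a.items() if e >= 0}

def convolve(a, b):
    # (a * b)(t_n) = sum_j a(t_j) b(t_{n-j}) for the toy coproduct (assumption).
    out = []
    for n in range(len(a)):
        total = {}
        for j in range(n + 1):
            total = add(total, mul(a[j], b[n - j]))
        out.append(total)
    return out

def birkhoff(phi):
    # phi[n] is the value of the character on t_n, with phi[0] = 1.
    # Returns (phi_minus, phi_plus) with phi_minus * phi = phi_plus,
    # i.e. phi = phi_minus^{-1} * phi_plus.
    one = {0: Fraction(1)}
    minus, plus = [one], [one]
    for n in range(1, len(phi)):
        bar = phi[n]
        for j in range(1, n):
            bar = add(bar, mul(minus[j], phi[n - j]))
        minus.append(neg(pole_part(bar)))
        plus.append(regular_part(bar))
    return minus, plus

# Hypothetical character values on t_0, ..., t_3 (truncation at degree 3).
phi = [
    {0: Fraction(1)},
    {-1: Fraction(1), 0: Fraction(1)},
    {-2: Fraction(1, 2), -1: Fraction(1), 0: Fraction(3)},
    {-3: Fraction(1, 6), 1: Fraction(2)},
]
phi_minus, phi_plus = birkhoff(phi)
```

On the generators, `phi_plus` contains no negative powers of λ and `phi_minus` contains only negative powers (apart from its value 1 on t_0), which is the finite-dimensional picture behind g = g_-^{-1} g_+.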


Corollary 5.6. Let ψ be a Casimir function on Lδ^{(k)}. Set L_0 ∈ Lg^{(k)} ⊂ Lδ^{(k)}, X = π_1(I(dψ(L_0))). Then the solution of the following equation in Lg^{(k)}

\[
\frac{dL}{dt} = [L, M_1]_{L\mathfrak{g}^{(k)}}, \qquad M_1 = \frac{1}{2}\,\pi_1 R(I(d\psi(L))) \tag{5.3}
\]

with initial condition L(0) = L_0 is given by

\[
L(t) = \mathrm{Ad}_{LG^{(k)}}\, g_{\pm}(t) \cdot L_0, \tag{5.4}
\]

where exp(−tX) has the Connes-Kreimer Birkhoff factorization in Lg^{(k)}, exp(−tX) = g_-(t)^{-1} g_+(t).

Remark 5.7. (i) For Feynman graphs, this truncation corresponds to halting calculations after a certain loop level. From our point of view, this truncation is somewhat crude. g^{(k)} is not a subalgebra of g, and if k < ℓ, g^{(k)} is not a subalgebra of g^{(ℓ)}. Although the Casimirs ψ_{m,n} and the exponential map restrict well from g to g^{(k)}, the Birkhoff decomposition of exp(−tX) for X ∈ Lg^{(k)} is very different from the Birkhoff decompositions in Lg, Lg^{(ℓ)}. In fact, if g ∈ G^{(k)} has Birkhoff decomposition g = g_-^{-1} g_+ in G, there does not seem to be f(k) ∈ N such that g_± ∈ G^{(f(k))}. (ii) It would be interesting to know, especially for the Hopf algebras of Feynman graphs or rooted trees, whether there exists a larger connected graded Hopf algebra H′ containing H such that the associated infinitesimal Lie algebra Lie(G′_C) is the double δ. This would provide a Lax pair equation associated to an equation of motion on the infinitesimal Lie algebra of H′. The most natural candidate, the Drinfeld double D(H) of H, does not work since the dimension of the Lie algebra associated to D(H) is larger than the dimension of δ.

5.3. Lax pair equations in the general case. In [2], Connes and Kreimer give a Birkhoff decomposition for the character group of the Hopf algebra of 1PI graphs, and in particular for the Feynman rules character ϕ(λ) given by minimal subtraction and dimensional regularization. The truncation process treated above does not handle the Feynman rules character, as the regularized toy model character defined in [5,7,11] of the Hopf algebra of integer decorated rooted trees and the Feynman rules character are not polynomials in λ, λ^{-1}, but Laurent series in λ. Thus Corollary 5.6 does not apply, as in our notation log(ϕ(λ)) ∈ g\Lg. This and Remark 5.7(i) force us to consider a direct approach in g as in the next theorem. However, we cannot expect that the Lax pair equation is associated to any Hamiltonian equation, and we replace Casimirs with Ad-covariant functions.

Definition 5.8 [14]. Let G be a Lie group with Lie algebra g. A map f : g → g is Ad-covariant if Ad(g)(f(L)) = f(Ad(g)(L)) for all g ∈ G, L ∈ g.

Theorem 5.9. Let H be a connected graded commutative Hopf algebra with g_A the associated Lie algebra of infinitesimal characters with values in Laurent series. Let f : g_A → g_A be an Ad-covariant map. Let L_0 ∈ g_A satisfy [f(L_0), L_0] = 0. Set X = f(L_0). Then the solution of

\[
\frac{dL}{dt} = [L, M], \qquad M = \frac{1}{2} R(f(L)) \tag{5.5}
\]


with initial condition L(0) = L_0 is given by

\[
L(t) = \mathrm{Ad}_{G}\, g_{\pm}(t) \cdot L_0, \tag{5.6}
\]

where exp(−tX) has the Connes-Kreimer Birkhoff factorization exp(−tX) = g_-(t)^{-1} g_+(t).

Proof. The proof is similar to [13, Theorem 2.2]. First notice that

\[
\begin{aligned}
\frac{d}{dt}\,\mathrm{Ad}(g_-(t)^{-1} g_+(t)) \cdot L_0
 &= \frac{d}{dt}\bigl(\exp(-tX)\, L_0 \exp(tX)\bigr)\\
 &= -\exp(-tX)\, X L_0 \exp(tX) + \exp(-tX)\, L_0 X \exp(tX)\\
 &= -\exp(-tX)\,[X, L_0]\,\exp(tX) = 0,
\end{aligned}
\]

which implies Ad(g_-(t)^{-1} g_+(t)) · L_0 = L_0 and Ad(g_-(t)) · L_0 = Ad(g_+(t)) · L_0. Set L(t) = Ad(g_±(t)) · L_0 = g_±(t) L_0 g_±(t)^{-1}. As usual,

\[
\frac{dL}{dt} = \Bigl[\frac{dg_{\pm}(t)}{dt}\, g_{\pm}(t)^{-1},\ L(t)\Bigr],
\]

so

\[
\frac{dL}{dt} = \frac{1}{2}\Bigl[\frac{dg_+(t)}{dt}\, g_+(t)^{-1} + \frac{dg_-(t)}{dt}\, g_-(t)^{-1},\ L(t)\Bigr].
\]

The Birkhoff factorization g_+(t) = g_-(t) exp(−tX) gives

\[
\frac{dg_+(t)}{dt} = \frac{dg_-(t)}{dt}\exp(-tX) + g_-(t)(-X)\exp(-tX),
\]

and so

\[
\frac{dg_+(t)}{dt}\, g_+(t)^{-1} = \frac{dg_-(t)}{dt}\, g_-(t)^{-1} + g_-(t)(-X)\, g_-(t)^{-1}.
\]

Thus

\[
\begin{aligned}
2M = R(f(L(t))) &= R\bigl(f(\mathrm{Ad}(g_-(t)) \cdot L_0)\bigr) = R\bigl(\mathrm{Ad}(g_-(t)) \cdot f(L_0)\bigr) = R\bigl(\mathrm{Ad}(g_-(t)) \cdot X\bigr)\\
 &= -R\Bigl(\frac{dg_+(t)}{dt}\, g_+(t)^{-1}\Bigr) + R\Bigl(\frac{dg_-(t)}{dt}\, g_-(t)^{-1}\Bigr)\\
 &= -\frac{dg_+(t)}{dt}\, g_+(t)^{-1} - \frac{dg_-(t)}{dt}\, g_-(t)^{-1}.
\end{aligned}
\]

Here we use (dg_±(t)/dt) g_±(t)^{-1}(x) ∈ A_± for x ∈ H. Thus dL/dt = [L, M].

Remark 5.10. In a particular case of Theorem 5.9, we get a Hamiltonian system. First, since the proof of Theorem 5.9 depends on the splitting g_A = g_{A−} ⊕ g_{A+} of the Lie algebra and the Birkhoff decomposition of the Lie group, and since this splitting and Birkhoff decomposition exist (see Theorem 5.3) on Lδ^{(k)}, Theorem 5.9 extends to Lδ^{(k)}. Let ψ : Lδ^{(k)} → C be a Casimir function on Lδ^{(k)}. For f(L) = (∇ψ)(L) the gradient of ψ, f is an Ad-covariant function on Lδ^{(k)}. Since ψ is a Casimir function, [L, (∇ψ)(L)] = 0 for all L (see [1,14]), so in particular [f(L_0), L_0] = 0. Thus, the Hamiltonian system (5.5) satisfies (5.6). Since I(dψ) = ∇ψ, Theorem 5.4 is a particular case of the Lδ^{(k)} version of Theorem 5.9. It is natural to ask if the Hamiltonian system (5.1) is completely integrable; this is discussed briefly in Sect. 8.
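The step introduced by "As usual" in the proof of Theorem 5.9 is the standard derivative of a conjugation orbit. A short sketch, written in matrix-style notation as in the proof (so products and inverses are understood formally in the character group), with g standing for either g_+(t) or g_-(t):

```latex
\begin{align*}
\frac{d}{dt}\bigl(g^{-1}\bigr) &= -\,g^{-1}\,\frac{dg}{dt}\,g^{-1}
  \qquad\text{(differentiate } g\,g^{-1} = 1\text{)},\\
\frac{d}{dt}\bigl(g L_0 g^{-1}\bigr)
  &= \frac{dg}{dt}\,L_0\,g^{-1} - g L_0\, g^{-1}\frac{dg}{dt}\,g^{-1}\\
  &= \Bigl(\frac{dg}{dt}\,g^{-1}\Bigr)\bigl(g L_0 g^{-1}\bigr)
     - \bigl(g L_0 g^{-1}\bigr)\Bigl(\frac{dg}{dt}\,g^{-1}\Bigr)
   = \Bigl[\frac{dg}{dt}\,g^{-1},\ L(t)\Bigr].
\end{align*}
```

Since both choices g = g_+(t) and g = g_-(t) give the same L(t), averaging the two resulting expressions yields the symmetric formula with the factor 1/2 used in the proof.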


If f : g_A → g_A is given by f(L) = 2λ^{−n+2m} L, then f is Ad-covariant and [f(L_0), L_0] = [2λ^{−n+2m} L_0, L_0] = 0.

Corollary 5.11. Let H be a connected graded commutative Hopf algebra with g_A the Lie algebra of infinitesimal characters with values in Laurent series. Pick L_0 ∈ g_A and set X = 2λ^{−n+2m} L_0. Then the solution of

\[
\frac{dL}{dt} = [L, M], \qquad M = R(\lambda^{-n+2m} L) \tag{5.7}
\]

with initial condition L(0) = L_0 is given by

\[
L(t) = \mathrm{Ad}_{G_A}\, g_{\pm}(t) \cdot L_0, \tag{5.8}
\]

where exp(−tX) has the Connes-Kreimer Birkhoff factorization exp(−tX) = g_-(t)^{-1} g_+(t).

Remark 5.12. Let ϕ be the Feynman rules character. We can find the Birkhoff factorization of ϕ itself within this framework by adjusting the initial condition. Namely, set L_0(λ) = (1/2) λ^{n−2m} exp^{-1}(ϕ(λ)). Then exp(X) = ϕ by Prop. 4.4, so the solution of (5.7) involves the Birkhoff factorization ϕ = g_-(−1)^{-1} g_+(−1). Namely, we have

\[
L(-1) = \frac{\lambda^{n-2m}}{2}\,\mathrm{Ad}_{G_A}\, g_{\pm}(-1)\, \exp^{-1}(\varphi).
\]

5.4. Other renormalization schemes. Although the renormalization scheme considered so far is the minimal subtraction scheme, the results on Lax pair equations are valid for any suitably defined renormalization scheme. Roughly speaking, different renormalization schemes correspond to different splittings of A, as we now briefly explain. However, for simplicity in most sections we will treat only the minimal subtraction scheme.

Let A be the algebra of Laurent series. Let π : A → A be a Rota-Baxter map [6], which by definition is a linear map satisfying the Rota-Baxter equation: π(ab) + π(a)π(b) = π(aπ(b)) + π(π(a)b) for a, b ∈ A. Let R : g_A → g_A be given by R(X) = X − 2π ◦ X for any infinitesimal character X : H → A. By [6, Prop. 2.6], R satisfies the modified classical Yang-Baxter equation (mCYBE) [R(X), R(Y)] − R([R(X), Y] + [X, R(Y)]) = −[X, Y], which implies that R is an R-operator, i.e. the bracket [·, ·]_R is a Lie bracket (cf. [13]). If additionally we assume that the Rota-Baxter map is a projection, π^2 = π, then A splits into a direct sum of two subalgebras A = A_- ⊕ A_+ with A_- = Im(π), and the Birkhoff decomposition is unique. As a consequence, the results of this section extend to any Rota-Baxter projection π. For the minimal subtraction scheme, π is just the projection of a Laurent series onto its pole part.

In summary, in Sect. 5 we have presented methods for applying the Connes-Kreimer renormalization theory to produce Lax pair equations for both truncated and full character algebras. This theory both encodes the traditional Bogoliubov-Parasiuk-Hepp-Zimmermann procedure and emphasizes the pro-unipotent complex group of characters associated to the commutative Hopf algebra of Feynman graphs (see [4, Ch. 1, Sect. 6]). This pro-unipotent group is by definition a projective limit of finite dimensional unipotent Lie groups, and has an associated pro-nilpotent Lie algebra, a projective limit of nilpotent Lie algebras. In our case, the finite dimensional nilpotent Lie algebras are the double of the infinitesimal characters of the truncated Hopf algebras, and the corresponding unipotent groups are the exponentials of the nilpotent algebras.
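The Rota-Baxter equation of Sect. 5.4 can be checked mechanically on Laurent polynomials. The sketch below is illustrative only: the dictionary representation, the helper names and the sample series are assumptions made for the example. It tests the pole-part projection (minimal subtraction), its complement Id − π, and, for contrast, a linear map that keeps only the simple-pole term and fails the equation.

```python
from fractions import Fraction
from itertools import product

# A Laurent polynomial in lambda is a dict {exponent: coefficient}.

def add(a, b):
    out = dict(a)
    for e, c in b.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}

def mul(a, b):
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[ea + eb] = out.get(ea + eb, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}

def pole_part(a):
    # pi for minimal subtraction: projection onto strictly negative powers.
    return {e: c for e, c in a.items() if e < 0}

def regular_part(a):
    # Id - pi: projection onto non-negative powers.
    return {e: c for e, c in a.items() if e >= 0}

def simple_pole_only(a):
    # A linear projection that is NOT Rota-Baxter (kept for contrast).
    return {e: c for e, c in a.items() if e == -1}

def rota_baxter_holds(proj, a, b):
    # Checks proj(ab) + proj(a)proj(b) == proj(a proj(b)) + proj(proj(a) b).
    lhs = add(proj(mul(a, b)), mul(proj(a), proj(b)))
    rhs = add(proj(mul(a, proj(b))), proj(mul(proj(a), b)))
    return lhs == rhs

# Hypothetical sample series.
samples = [
    {-2: Fraction(1), -1: Fraction(3), 0: Fraction(-1), 1: Fraction(2)},
    {-1: Fraction(1)},
    {1: Fraction(1)},
    {-3: Fraction(1, 2), 2: Fraction(5)},
    {0: Fraction(1)},
]

minimal_subtraction_ok = all(
    rota_baxter_holds(pole_part, a, b) for a, b in product(samples, repeat=2)
)
complement_ok = all(
    rota_baxter_holds(regular_part, a, b) for a, b in product(samples, repeat=2)
)
```

Both projections pass because each of the images A_- and A_+ is closed under multiplication; the simple-pole map fails already on a = b = λ^{-1}, since its image is not a subalgebra.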

6. The Connes-Kreimer β-Function and the β-Character 6.1. Overview of Sects. 6–8. The next three sections are devoted to studying “how physical” the Lax pair flow is. This vague question can be approached in at least two ways: (i) given a property of a physically plausible character, we ask if this property is preserved under the Lax pair flow (see Sect. 7); (ii) we compare the Lax pair flow to the renormalization group flow (RGF) which is fundamental in quantum field theory (see Sect. 8). Before addressing either topic, we have to recall where various flows live. A character for us is an element of G A , the homomorphisms from the Hopf algebra of Feynman diagrams to the algebra A of Laurent series (although many of our results hold in more generality). A renormalized character, such as Feynman rules given by dimensional regularization and minimal subtraction, is an element of G C , the homomorphisms from the Hopf algebra to C. The RGF is a flow on G C . In contrast, the Lax pair flow is on the Lie algebra gA of infinitesimal characters, which is formally the Lie algebra of G A . As an example of topic (i), in Sect. 7 we ask if the physically necessary property of locality (see Def. 6.2) of an unrenormalized character ϕ ∈ G A is preserved under the Lax pair flow. To make sense of this question, we must choose a bijection α : G A → gA , take α(ϕ) ∈ gA , let α(ϕ) flow to L(t) under the Lax pair flow, and set ϕt = α −1 (L(t)). The question, “If ϕ is local, is ϕt local,” is now well defined but depends on the choice of α and a choice of Casimir function for the Lax pair flow. A natural choice of α is the logarithm, the inverse of the exponential map, but it turns out that another bijection due to Manchon has much better behavior. As an example of (ii), we can ask to what extent the RGF is related to the Lax pair flow. 
Since these flows live on different spaces, there are several ways to interpret this question, and each interpretation involves a choice of identification. For example, we can ask if the family of renormalized scalar characters [α −1 (L(t))]+ (λ = 0) ∈ G C ever coincides with the RGF of a character ϕ ∈ G A . The answer to this is negative for the bijections mentioned above. Since the RGF does not equal the Lax pair flow under these identifications, it is better to ask how the RGF is affected by the Lax pair flow. In particular, we consider the β-function β = βϕ ∈ gC , the infinitesimal generator of the RGF. We can extend β to an element β˜ ∈ gA , and ask for the behavior of β˜ under the Lax pair flow. We show in Sect. 8 that β˜ and hence β satisfy a Lax pair flow. This allows us to give a fixed point equation (Corollary 8.4) which has a solution iff the RGF of (suitably identified) ϕt coincide. In summary, we will see in Sect. 7 that the physically important property of locality is preserved under the Lax pair flow. After preliminary work in Sect. 6 on the β-character, we will see in Sect. 8 that the RGF of a family of characters ϕt given by a Lax pair flow is itself controlled by a Lax pair flow.


6.2. The β-character. As mentioned above, we extend the (scalar) β-function β_ϕ ∈ g_C of a local character ϕ to an infinitesimal character β̃_ϕ ∈ g_A (Lemma 6.6). This "β-character" has already appeared in the literature: β̃_ϕ = λR̃(ϕ), in the language of [12] explained below (Lemma 6.7). To define the β-character, we recall material from [3,7,12]. Throughout this section, A denotes the algebra of Laurent series. Let H = ⊕_n H_n be a connected graded Hopf algebra. Let Y be the biderivation on H given on homogeneous elements by

\[
Y : H_n \to H_n, \qquad Y(x) = nx \ \text{ for } x \in H_n.
\]

Definition 6.1 [12]. We define the bijection R̃ : G_A → g_A by R̃(ϕ) = ϕ^{-1} ⋆ (ϕ ◦ Y).

We now define an action of C on G_A. For s ∈ C and ϕ ∈ G_A we define ϕ^s(x) on a homogeneous element x ∈ H by

\[
\varphi^{s}(x)(\lambda) = e^{s\lambda|x|}\,\varphi(x)(\lambda), \tag{6.1}
\]

for any λ ∈ C, where |x| is the degree of x.

Definition 6.2. Let

\[
G_A^{\Phi} = \Bigl\{ \varphi \in G_A \;\Big|\; \frac{d}{ds}(\varphi^{s})_- = 0 \Bigr\} \tag{6.2}
\]

be the set of characters with the negative part of the Birkhoff decomposition independent of s. Elements of G_A^Φ are called local characters.

The dimensional regularized Feynman rule character ϕ is local. Referring to [3,7], the physical meaning of locality is that the counterterm ϕ_- does not depend on the mass parameter µ: ∂ϕ_-/∂µ = 0, and this in turn reflects the locality of the Lagrangian.

Proposition 6.3 ([3,12,7]). Let ϕ ∈ G_A^Φ. Then the limit

\[
F_{\varphi}(s) = \lim_{\lambda \to 0} \varphi^{-1}(\lambda) \star \varphi^{s}(\lambda)
\]

exists and is a one-parameter subgroup in G_A ∩ G_C of scalar valued characters of H.

Notice that (ϕ^{-1}(λ) ⋆ ϕ^s(λ))(x) ∈ A_+ for x ∈ H, as

\[
\varphi^{-1}(\lambda) \star \varphi^{s}(\lambda) = \varphi_+^{-1} \star \varphi_- \star (\varphi^{s})_-^{-1} \star (\varphi^{s})_+ = \varphi_+^{-1} \star (\varphi^{s})_+ .
\]

Definition 6.4. For ϕ ∈ G_A^Φ, the β-function of ϕ is defined to be β_ϕ = −(Res(ϕ_-)) ◦ Y. We have [3]

\[
\beta_{\varphi} = \frac{d}{ds}\Big|_{s=0} F_{\varphi_-^{-1}}(s),
\]

where F_{ϕ_-^{-1}} is the one-parameter subgroup associated to ϕ_-^{-1}, which also belongs to G_A^Φ.

To relate the β-function β_ϕ ∈ g_C to our Lax pair equations, which live on g_A, we can either consider g_C as a subset of g_A, or we can extend β_ϕ to an element of g_A. Since g_C is not preserved under the Lax pair flow, we take the second approach.


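Definition 6.5. For ϕ ∈ G_A, x ∈ H, set

\[
\tilde{\beta}_{\varphi}(x)(\lambda) = \frac{d}{ds}\Big|_{s=0} (\varphi^{-1} \star \varphi^{s})(x)(\lambda).
\]

The following lemma establishes that β̃ is an infinitesimal character.

Lemma 6.6. Let ϕ ∈ G_A^Φ. (i) β̃_ϕ is an infinitesimal character in g_A. (ii) β̃_ϕ is holomorphic (i.e. β̃_ϕ(x) ∈ A_+ for any x).

Proof. (i) For two homogeneous elements x, y ∈ H, we have ϕ^s(xy) = e^{s|xy|λ} ϕ(xy) = e^{s|x|λ} ϕ(x) e^{s|y|λ} ϕ(y) = ϕ^s(x) ϕ^s(y). Therefore ϕ^{-1} ⋆ ϕ^s ∈ G_A. Since ϕ^{-1} ⋆ ϕ^0 = ε we get

\[
\frac{d}{ds}\Big|_{s=0} \varphi^{-1} \star \varphi^{s} \in \mathfrak{g}_A .
\]

(ii) Since (d/ds)(ϕ^s)_- = 0, we get

\[
\tilde{\beta}_{\varphi} = \frac{d}{ds}\Big|_{s=0} (\varphi_+)^{-1} \star \varphi_- \star ((\varphi^{s})_-)^{-1} \star (\varphi^{s})_+ = (\varphi_+)^{-1} \star \frac{d}{ds}\Big|_{s=0} (\varphi^{s})_+ .
\]

Then

\[
\tilde{\beta}_{\varphi}(x) = \sum (\varphi_+)^{-1}(x')\, \frac{d}{ds}\Big|_{s=0} (\varphi^{s})_+(x'') = \sum (\varphi_+)(S(x'))\, \frac{d}{ds}\Big|_{s=0} (\varphi^{s})_+(x'').
\]

Therefore β̃_ϕ(x) ∈ A_+.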
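A worked example on a single primitive element may help fix the definitions of ϕ^s, locality, β_ϕ and β̃_ϕ. It is only a sketch under stated assumptions: x is a primitive element of degree n ≥ 1, the values c and a are hypothetical, minimal subtraction is used, and we use that on a primitive element convolution adds values, so that ψ^{-1}(x) = −ψ(x) for any character ψ.

```latex
% Assumption: \varphi(x) = c/\lambda + a + O(\lambda), with x primitive of degree n.
\begin{align*}
\varphi^{s}(x) &= e^{sn\lambda}\Bigl(\frac{c}{\lambda} + a + O(\lambda)\Bigr)
               = \frac{c}{\lambda} + (a + snc) + O(\lambda),\\
\varphi^{s}(x) &= -(\varphi^{s})_-(x) + (\varphi^{s})_+(x)
  \ \Longrightarrow\ (\varphi^{s})_-(x) = -\frac{c}{\lambda}
  \quad\text{(independent of } s\text{)},\\
\beta_{\varphi}(x) &= -\mathrm{Res}(\varphi_-)\bigl(Y(x)\bigr) = -n\,\mathrm{Res}\Bigl(-\frac{c}{\lambda}\Bigr) = nc,\\
\tilde{\beta}_{\varphi}(x) &= \frac{d}{ds}\Big|_{s=0}\bigl(-\varphi(x) + e^{sn\lambda}\varphi(x)\bigr)
  = n\lambda\,\varphi(x) = nc + na\lambda + O(\lambda^{2}) \in A_+ .
\end{align*}
% Contrast: if \varphi(x) = c_2/\lambda^{2}, then
%   (\varphi^{s})_-(x) = -c_2/\lambda^{2} - s\,n\,c_2/\lambda ,
% which depends on s, so such a character is not local.
```

In this example β̃_ϕ(x) evaluated at λ = 0 equals β_ϕ(x) = nc, in agreement with Lemma 6.7(ii), since the adjoint action is trivial on primitive elements (see (7.1)).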

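Lemma 6.7. If ϕ ∈ G_A^Φ then
(i) β̃_ϕ = λR̃(ϕ),
(ii) β_ϕ = Ad(ϕ_+(0))(β̃_ϕ|_{λ=0}),
(iii) β̃_{ϕ_-}(λ) is independent of λ and satisfies β̃_{ϕ_-}(x)(λ = 0) = −β_ϕ(x).

Proof. (i) For Δ(x) = Σ x′ ⊗ x″, we have

\[
\begin{aligned}
\tilde{\beta}_{\varphi}(x)(\lambda) &= \frac{d}{ds}\Big|_{s=0} (\varphi^{-1} \star \varphi^{s})(x)(\lambda) = \sum \varphi^{-1}(x')\, \frac{d}{ds}\Big|_{s=0} (\varphi^{s})(x'')\\
 &= \sum \varphi^{-1}(x')\, \lambda \cdot \deg(x'')\, \varphi(x'') = \lambda \sum \varphi^{-1}(x')\, (\varphi \circ Y)(x'')\\
 &= \lambda\,(\varphi^{-1} \star (\varphi \circ Y))(x) = \lambda \tilde{R}(\varphi)(x).
\end{aligned}
\]

(ii) The cocycle property of R̃ [7], R̃(φ1 ⋆ φ2) = R̃(φ2) + φ2^{-1} ⋆ R̃(φ1) ⋆ φ2, implies that

\[
\lambda \tilde{R}(\varphi) = \lambda \tilde{R}(\varphi_-^{-1} \star \varphi_+) = \lambda \tilde{R}(\varphi_+) + \varphi_+^{-1} \star \lambda \tilde{R}(\varphi_-^{-1}) \star \varphi_+ . \tag{6.3}
\]

Since R̃(ϕ_+) = ϕ_+^{-1} ⋆ (ϕ_+ ◦ Y) is always holomorphic and since λR̃(ϕ_-^{-1}) = Res(ϕ_-^{-1}) ◦ Y = −Res(ϕ_-) ◦ Y = β by [12, Th. IV.4.4], when we evaluate (6.3) at λ = 0 we get β̃(ϕ)|_{λ=0} = Ad(ϕ_+^{-1}(0))β.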


(iii) The Birkhoff decomposition of ϕ_- = ((ϕ_-)_-)^{-1} ⋆ (ϕ_-)_+ is given by (ϕ_-)_- = ϕ_-^{-1} and (ϕ_-)_+ = ε. By definition, β_{ϕ_-} = −Res((ϕ_-)_-) ◦ Y = −Res(ϕ_-^{-1}) ◦ Y = Res(ϕ_-) ◦ Y = −β_ϕ. Applying (ii) to ϕ_-, we get

\[
-\beta_{\varphi} = \beta_{\varphi_-} = \mathrm{Ad}(\varepsilon|_{\lambda=0})\bigl(\tilde{\beta}_{\varphi_-}\big|_{\lambda=0}\bigr) = \tilde{\beta}_{\varphi_-}\big|_{\lambda=0} .
\]

By part (i), β̃_{ϕ_-} = λR̃(ϕ_-), which by [12, Th. IV.4.4] belongs to g_C, i.e. it does not depend on λ.

These results will be used in Sect. 8.

7. The Lax Pair Flow and Locality in Minimal Subtraction and Other Renormalization Schemes

The Lax pair flow lives on the Lie algebra g_A of infinitesimal characters. As described in the Introduction, the bijection R̃^{-1} : g_A → G_A of [12] transfers the Lax pair flow to the Lie group. The main result in this section is that using R̃^{-1}, local characters remain local under the Lax pair flow (Theorem 7.3). This would not be true if we used the exponential map from g_A to G_A. We also discuss to what extent these results are independent of the choice of renormalization scheme.

This section is organized as follows. In Sect. 7.1, we prove that the Lax pair flow is trivial on primitive elements in the Hopf algebra, and prove the locality of characters under the Lax pair flow. In Sect. 7.2, we describe how the results of this section carry over for renormalization schemes other than minimal subtraction, and identify certain characters which are fixed points of the Lax pair flow.

7.1. The Lax pair flow: the role of primitives, pole order, and locality. In this subsection, we use the minimal subtraction scheme for renormalization. We first show that the Lax pair flow is trivial on primitive elements.

Proposition 7.1. If x is a primitive element of H, then the solution L(t)(x) of the Lax pair flow does not depend on t.

Proof. We first compute the adjoint representation in terms of the Hopf algebra coproduct. We use Sweedler's notation for the reduced coproduct Δ̃(x) = Σ x′ ⊗ x″, where Δ̃(x) = Δ(x) − x ⊗ 1 − 1 ⊗ x. Notice that deg(x′) + deg(x″) = deg(x) and 1 ≤ deg(x′), deg(x″) < deg(x). For x ≠ 1 and Δ̃(x′) = Σ (x′)′ ⊗ (x′)″, we have

\[
\begin{aligned}
((\varphi_1 \star \varphi_2) \star \varphi_1^{-1})(x)
 &= \Bigl\langle (\varphi_1 \star \varphi_2) \otimes \varphi_1^{-1},\ x \otimes 1 + 1 \otimes x + \sum x' \otimes x'' \Bigr\rangle\\
 &= (\varphi_1 \star \varphi_2)(x) + \varphi_1^{-1}(x) + \sum (\varphi_1 \star \varphi_2)(x')\,\varphi_1^{-1}(x'')\\
 &= \varphi_1(x) + \varphi_2(x) + \sum \varphi_1(x')\varphi_2(x'') + \varphi_1^{-1}(x)\\
 &\quad + \sum \Bigl( \varphi_1(x') + \varphi_2(x') + \sum \varphi_1((x')')\varphi_2((x')'') \Bigr)\varphi_1^{-1}(x'')\\
 &= \varphi_2(x) + \sum \varphi_1(x')\varphi_2(x'') + \Bigl[ \varphi_1(x) + \varphi_1^{-1}(x) + \sum \varphi_1(x')\varphi_1^{-1}(x'') \Bigr]\\
 &\quad + \sum \varphi_2(x')\varphi_1^{-1}(x'') + \sum \varphi_1((x')')\varphi_2((x')'')\varphi_1^{-1}(x'').
\end{aligned}
\]

Differentiating with respect to ϕ2 and setting L = ϕ̇2 gives the adjoint representation:

\[
\mathrm{Ad}(\varphi_1)(L)(x) = L(x) + \sum \varphi_1(x')L(x'') + \sum L(x')\varphi_1(Sx'') + \sum \varphi_1((x')')L((x')'')\varphi_1(Sx''), \tag{7.1}
\]

where S is the antipode of the Hopf algebra. For a primitive element x, the reduced coproduct vanishes, Δ̃(x) = 0, thus by relation (7.1), L(t)(x) = (Ad(g_±(t))(L_0))(x) = L_0(x).

Thus everything of interest in the Lax pair flow occurs off the primitives. The following lemma will be used in the proof of the locality of the Lax pair flow.

Lemma 7.2. (i) If the initial condition L_0 ∈ g_A is holomorphic in λ, then the solution L(t) = Ad(g_+(t))L_0 of the Lax pair equation is holomorphic in λ. (ii) If L_0 ∈ g_A has a pole of order n, then L(t) = Ad(g_+(t))L_0 has a pole of order at most n.

Proof. By (7.1), we have

\[
\mathrm{Ad}(g_+(t))(L_0)(x) = L_0(x) + \sum g_+(t)(x')L_0(x'') + \sum L_0(x')\,g_+(t)(Sx'') + \sum g_+(t)((x')')\,L_0((x')'')\,g_+(t)(Sx''). \tag{7.2}
\]

Notice that g_+(t)(x) is holomorphic for x ∈ H. If L_0 is holomorphic, then every term of the right hand side of (7.2) is holomorphic, so Ad(g_+(t))(L_0) is holomorphic. Since multiplication with a holomorphic series cannot increase the pole order, L(t) cannot have a pole order greater than the pole order of L_0.

We now show that local characters remain local under the Lax pair flow.

Theorem 7.3. For a local character ϕ ∈ G_A^Φ, let L(t) be the solution of the Lax pair equation (5.5) for any Ad-covariant function f, with the initial condition L_0 = R̃(ϕ). Let ϕ_t be the flow given by ϕ_t = R̃^{-1}(L(t)). Then ϕ_t is a local character for all t.

Proof. Recall from [12, Th. IV.4.1] that λR̃ : G_A → g_A restricts to a bijection from G_A^Φ to g_{A+}, where g_{A+} is the set of infinitesimal characters on H with values in A_+ = C[[λ]]. This can be rephrased as follows: R̃ is a bijection between G_A^Φ and the set of Laurent series with pole order at most one. In particular, L_0 = R̃(ϕ) has a pole of order at most one. By Lemma 7.2, L(t) has pole order at most one, which implies that ϕ_t = R̃^{-1}(L(t)) is a local character.

Corollary 7.4. If x is a primitive element of H, then ϕ_t(x) = ϕ(x) and β_{ϕ_t}(x) = β_ϕ(x) for all t.

Proof. By its definition, ϕ_t = R̃^{-1}(L(t)). By Proposition 7.1, for x primitive we have L(t)(x) = L_0(x) for all t, therefore ϕ_t(x) = ϕ(x) for all t. Thus β_{ϕ_t}(x) = (−Res((ϕ_t)_-) ◦ Y)(x) = −|x| Res((ϕ_t)_-(x)) = −|x| Res(ϕ_-(x)) = β_ϕ(x).
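Formula (7.1) can be tested on the simplest non-primitive elements. The following is a sketch under assumptions introduced only for illustration: x is an element whose reduced coproduct is a single tensor y ⊗ z with y, z primitive, so that S(z) = −z and the last sum in (7.1) vanishes.

```latex
% Assumption: \tilde\Delta(x) = y \otimes z with y, z primitive.
\begin{align*}
\mathrm{Ad}(\varphi_1)(L)(x)
  &= L(x) + \varphi_1(y)L(z) + L(y)\varphi_1(Sz)\\
  &= L(x) + \varphi_1(y)L(z) - L(y)\varphi_1(z).
\end{align*}
% Special case y = z (for instance a two-step ladder-type element):
\begin{align*}
\mathrm{Ad}(\varphi_1)(L)(x) = L(x) + \varphi_1(y)L(y) - L(y)\varphi_1(y) = L(x),
\end{align*}
% because the target algebra A is commutative.
```

So the flow L(t) = Ad(g_±(t))L_0 is constant not only on primitives but also on such symmetric degree-two elements; the first nontrivial contribution comes from elements whose reduced coproduct is not symmetric, in line with the remark that the adjoint representation is trivial for a cocommutative Hopf algebra.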


7.2. Lax pair flows for arbitrary renormalization schemes. In this subsection, we prove that Theorem 7.3 on the locality of the Lax pair flow holds for an arbitrary renormalization scheme, i.e. for any Rota-Baxter projection π : A → A as in Sect. 5.4. Recall that for such projections the Birkhoff decomposition is uniquely determined and that Theorem 5.9 holds. We first need to establish two lemmas. The next lemma proves that the key result [12, Th. IV.4.1] holds for any Rota-Baxter projection. Manchon's proof seems to work only for the minimal subtraction scheme, so we rework the proof.

Lemma 7.5. The map λR̃ : G_A → g_A restricts to a bijection from G_A^Φ to g_{A+}, where g_{A+} is the set of infinitesimal characters on H with values in A_+.

Proof. The definition of R̃ : G_A → g_A is independent of the renormalization scheme and by [12], R̃ is bijective (see also the scattering map in [3]). It is sufficient to show that λR̃(G_A^Φ) = g_{A+}. First, we point out that the proofs of Lemma 6.7(i), i.e. λR̃(ϕ) = β̃_ϕ, and of Lemma 6.6, i.e. β̃_ϕ ∈ g_{A+}, remain identical for any Rota-Baxter projection. This implies that if ϕ ∈ G_A^Φ, then λR̃(ϕ) ∈ g_{A+}.

Let λR̃(ϕ) ∈ g_{A+}, with ϕ ∈ G_A. We prove the converse of Lemma 6.6(ii). We have

\[
\lambda \tilde{R}(\varphi) = \tilde{\beta}_{\varphi} = (\varphi_+)^{-1} \star \varphi_- \star \frac{d}{ds}\Big|_{s=0} ((\varphi^{s})_-)^{-1} \star \varphi_+ + (\varphi_+)^{-1} \star \frac{d}{ds}\Big|_{s=0} (\varphi^{s})_+ ,
\]

which implies

\[
\varphi_- \star \frac{d}{ds}\Big|_{s=0} ((\varphi^{s})_-)^{-1} = \varphi_+ \star \lambda \tilde{R}(\varphi) \star (\varphi_+)^{-1} - \frac{d}{ds}\Big|_{s=0} (\varphi^{s})_+ \star (\varphi_+)^{-1} \in \mathfrak{g}_{A+} \cap \mathfrak{g}_{A-} = \{0\}.
\]

Thus (ϕ^s)_- does not depend on s, so ϕ ∈ G_A^Φ.

The following lemma is due to Manchon [12, Lem. IV.4.3], and its proof extends to any renormalization scheme (i.e. Rota-Baxter projection).

Lemma 7.6. Let ϕ ∈ G_A^Φ. (1) Then (ϕ_-)^{-1} ∈ G_A^Φ. (2) If h ∈ G_{A+}, then ϕ ⋆ h ∈ G_A^Φ.

We can now show that locality of characters is preserved under the Lax pair flow via the R̃ identification for any renormalization scheme.

Theorem 7.7. For a local character ϕ ∈ G_A^Φ, let L(t) be the solution of the Lax pair equation (5.5) for any Ad-covariant function f, with the initial condition L_0 = R̃(ϕ). Let ϕ_t be the flow given by ϕ_t = R̃^{-1}(L(t)). Then ϕ_t is a local character for all t.


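Proof. By [7],

\[
\tilde{R}(\varphi \star \xi) = \tilde{R}(\xi) + \xi^{-1} \star \tilde{R}(\varphi) \star \xi .
\]

Taking ξ = g_+(t)^{-1} and multiplying by λ, we get

\[
\lambda \tilde{R}(\varphi \star g_+(t)^{-1}) = \lambda \tilde{R}(g_+(t)^{-1}) + g_+(t) \star \lambda \tilde{R}(\varphi) \star g_+(t)^{-1} .
\]

Since ϕ ∈ G_A^Φ and g_+(t)^{-1} ∈ G_{A+}, by Lemma 7.6, ϕ ⋆ g_+(t)^{-1} ∈ G_A^Φ. Thus λR̃(ϕ ⋆ g_+(t)^{-1}) ∈ g_{A+}. Since g_+(t)^{-1} ∈ G_{A+}, by definition of R̃ we get λR̃(g_+(t)^{-1}) ∈ g_{A+}. It follows that

\[
\varphi_t = \tilde{R}^{-1}(\mathrm{Ad}(g_+(t))L_0) = (\lambda \tilde{R})^{-1}\bigl(\lambda\,\mathrm{Ad}(g_+(t))\tilde{R}(\varphi)\bigr) = (\lambda \tilde{R})^{-1}\bigl(g_+(t) \star \lambda \tilde{R}(\varphi) \star g_+(t)^{-1}\bigr) \in G_A^{\Phi}. \tag{7.3}
\]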

We can also show that for certain initial conditions, the flow ϕ_t is constant.

Proposition 7.8. If ϕ ∈ G_A^Φ and ϕ_+ = ε (i.e. ϕ has only a pole part), then the flow ϕ_t of Theorem 7.7 for the Ad-covariant function f(L) = λ^{−n+2m} L has ϕ_t = ϕ for all t.

Proof. If we show that one of g_±(t) equals ε, then

\[
\varphi_t = \tilde{R}^{-1}\bigl(g_{\pm}(t) \star \tilde{R}(\varphi) \star g_{\pm}(t)^{-1}\bigr) = \tilde{R}^{-1}\bigl(\varepsilon \star \tilde{R}(\varphi) \star \varepsilon^{-1}\bigr) = \varphi .
\]

g_±(t) are given by the Birkhoff decomposition of

\[
g(t) = \exp(-2t\lambda^{-n+2m} L_0) = \sum_{k=0}^{\infty} \frac{(-2t\lambda^{-n+2m-1})^{k}\,(\lambda L_0)^{k}}{k!}, \tag{7.4}
\]

where λL_0 = λR̃(ϕ) ∈ g_C [12, Th. IV.4.4]. If −n + 2m − 1 ≥ 0, then g(t)(x) ∈ A_+ for any x, which implies g_-(t) = ε. Similarly, if −n + 2m − 1 < 0, then g(t)(x) ∈ A_- for any x, which implies g_+(t) = ε. Notice that the right hand side of (7.4) is a finite sum, namely up to k = deg(x) when evaluated on x ∈ H.

Starting with a local character, we can produce examples of the previous proposition.

Corollary 7.9. If ϕ ∈ G_A^Φ, then the flow associated to the Ad-covariant function f(L) = λ^{−n+2m} L has ((ϕ_-)^{-1})_t = (ϕ_-)^{-1} for all t.

Proof. Since ϕ ∈ G_A^Φ, by Lemma 7.6, (ϕ_-)^{-1} ∈ G_A^Φ. Since (ϕ_-)^{-1} has only a pole part, it follows by the previous proposition that ((ϕ_-)^{-1})_t = (ϕ_-)^{-1}.

Remark 7.10. Lemma 6.7(ii) and the results in the next section cannot be extended to an arbitrary renormalization scheme (with the same proofs). This comes from the following simple fact. If R is a Rota-Baxter map, then Id − R is also a Rota-Baxter map. In particular, when R is the minimal subtraction scheme, Id − R, the projection to the holomorphic part of the Laurent series, is also a Rota-Baxter projection and thus if we renormalize with respect to Id − R, then ϕ_+^{Id−R}(x) = −ϕ_-^{R}(x) for any primitive element x. Therefore ϕ_+^{Id−R}(λ = 0) might not be defined.


8. The β-Function, the Renormalization Group Flow, and the Lax Pair Flow

It is natural to ask if the Lax pair flow can be identified with the more usual Renormalization Group Flow (RGF). As mentioned above, the RGF (ϕ^t)_+(λ = 0) lives in the Lie group of characters G_C, while the Lax pair flow L(t) lives in the Lie algebra g_A. To match these flows, we can transfer the Lax pair flow to the Lie group level using either of the maps R̃^{-1} and exp, namely by defining

\[
\varphi_t = \tilde{R}^{-1}(L(t)) \quad\text{and}\quad \chi_t = \exp(L(t)) \tag{8.1}
\]

and then setting λ = 0. However, it is easy to show that even on a commutative, graded connected Hopf algebra H, ϕ_t ≠ ϕ^t and χ_t ≠ ϕ^t, where A is the algebra of Laurent series.

In Sect. 8.1, we give a criterion (Corollary 8.4) under which the RGF is independent of the Lax flow parameter t. Strictly speaking, we compare RGFs translated back to the identity in the character group in order to make the comparison. In Sect. 8.2, we show that both the β-function and the β-character satisfy Lax pair equations (Proposition 8.5, Lemma 8.6, Theorem 8.7). Finally, in Sect. 8.3 we make some preliminary remarks on the complete integrability of the Lax pair flows for characters and for β-functions. Throughout this section we work with the minimal subtraction renormalization scheme.

8.1. Relations between the Renormalization Group Flow and the Lax pair flow. Local characters satisfy the abstract Renormalization Group Equation [8], which we now recall. For a local character ϕ ∈ G_A^Φ with ϕ^s given by (6.1), the renormalized characters are defined by ϕ_ren(s) = (ϕ^s)_+(λ = 0).

Theorem 8.1. For ϕ ∈ G_A^Φ, the renormalized characters ϕ_ren(s) satisfy the abstract Renormalization Group Equation:

\[
\frac{\partial}{\partial s}\varphi_{\mathrm{ren}}(s) = \beta_{\varphi} \star \varphi_{\mathrm{ren}}(s).
\]

Here our parameter s corresponds to e^s in [8]. The abstract RGE of a local character ϕ can be written as (d/ds)(ϕ_ren(s)) ⋆ ϕ_ren(s)^{-1} = β_ϕ, thus the renormalization group flow ϕ_ren is in fact the integral flow associated to the beta function and in consequence ϕ_ren(s) = exp(sβ_ϕ) ⋆ ϕ_ren(0). We now give an expression for the β-function of a character under the Lax pair flow from Theorem 7.3. This will be put into Lax pair form in Proposition 8.5.

Proposition 8.2. In the setup of Theorem 5.9, we have β_{ϕ_t} = Ad(A(t))(β_ϕ) with A(t) given by A(t) = (ϕ_t)_+(0) ⋆ g_+(t)(0) ⋆ ϕ_+^{-1}(0) ∈ G_C, and where g_+(t) is given as in Theorem 5.9.

Proof. By Lemma 6.7, we get β̃_{ϕ_t} = λR̃(ϕ_t) = λL(t), which by Theorem 5.9 becomes β̃_{ϕ_t} = λ Ad(g_+(t))L_0 = g_+(t) ⋆ (λL_0) ⋆ g_+(t)^{-1} = g_+(t) ⋆ (β̃_ϕ) ⋆ g_+(t)^{-1}. Since β̃_ϕ and g_+(t) are holomorphic, evaluating at λ = 0, we get that β̃_{ϕ_t}(0) = Ad(g_+(t)(0))(β̃_ϕ(0)), which by Lemma 6.7 implies that β_{ϕ_t} = Ad((ϕ_t)_+(0) ⋆ g_+(t)(0) ⋆ ϕ_+^{-1}(0))(β_ϕ).

Exponentiating the formula of sβ_{ϕ_t} given in the previous proposition, a straightforward computation gives the RGF of the local character ϕ_t in terms of the RGF of the character ϕ:

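$$(\varphi_t)_{ren}(s) = (\varphi_t)_+(0)\,\mathrm{Ad}(g_+(t)(0))\big(\varphi_+(0)^{-1}\varphi_{ren}(s)\big).$$
We can now relate the RGF $(\varphi_t)_{ren}(s)$ to other flows in this paper. Assume as usual that $\varphi$ is a local character. To compare the flows $(\varphi_t)_{ren}(s)$ and $\varphi_{ren}(s)$, we introduce the translated RGF $\varphi_t(s)$ of $(\varphi_t)_{ren}(s)$ by
$$\varphi_t(s) = ((\varphi_t)_+(0))^{-1}(\varphi_t)_{ren}(s).$$
A natural question is to find the values of $t$ where the translated RG flows $\varphi_t(s)$, $\varphi(s)$ coincide. Notice that $\varphi_t(s)$, $\varphi(s)$ coincide at $s = 0$. We set
$$\tilde\beta_0(t) = \tilde\beta_{\varphi_t}\big|_{\lambda=0}.$$
By its definition, $\varphi_t$ satisfies $L(t) = \tilde R(\varphi_t)$, so by Lemma 6.7,
$$\tilde\beta_0(t) = \lambda\tilde R(\varphi_t)\big|_{\lambda=0} = \lambda L(t)\big|_{\lambda=0} = \mathrm{Res}(L(t)).$$

Lemma 8.3. If $\varphi$ is a local character, then $\varphi_t(s) = \exp(s\tilde\beta_0(t))$.

Proof. We consider the Taylor expansion of $\varphi_t(s)$ at $s = 0$:
$$\varphi_t(s) = \sum_{k\ge 0}\frac{d^k\varphi_t}{ds^k}(0)\,\frac{s^k}{k!}.$$
By the abstract RGE Theorem 8.1 and Lemma 6.7, we have
$$\frac{d^k\varphi_t(s)}{ds^k}\Big|_{s=0} = ((\varphi_t)_+(0))^{-1}\,\beta_{\varphi_t}^{*k}\,(\varphi_t)_{ren}(s)\Big|_{s=0} = \big(\mathrm{Ad}(((\varphi_t)_+(0))^{-1})(\beta_{\varphi_t})\big)^{*k}.$$
Therefore $\varphi_t(s) = \exp(s\tilde\beta_0(t))$.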
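The "straightforward computation" behind the formula for $(\varphi_t)_{ren}(s)$ stated before Lemma 8.3 can be sketched as follows. This is only a sketch: it assumes $\varphi_{ren}(0) = \varphi_+(0)$ and $(\varphi_t)_{ren}(0) = (\varphi_t)_+(0)$, which are not restated in this section, and the shorthand $P$, $g$, $Q$ is introduced here purely for readability.

```latex
% Shorthand (not in the original): P = (\varphi_t)_+(0), g = g_+(t)(0), Q = \varphi_+(0).
% Assumption: \varphi_{ren}(0) = Q and (\varphi_t)_{ren}(0) = P.
\begin{align*}
(\varphi_t)_{ren}(s)
  &= \exp(s\beta_{\varphi_t})\,P
     && \text{(Theorem 8.1 applied to } \varphi_t\text{)} \\
  &= A(t)\exp(s\beta_\varphi)A(t)^{-1}P,
     \qquad A(t) = P\,g\,Q^{-1}
     && \text{(Proposition 8.2)} \\
  &= P\,g\,Q^{-1}\exp(s\beta_\varphi)\,Q\,g^{-1}
     && \text{(since } A(t)^{-1}P = Q\,g^{-1}\text{)} \\
  &= P\,g\,\bigl(Q^{-1}\varphi_{ren}(s)\bigr)\,g^{-1}
     && \text{(since } \varphi_{ren}(s) = \exp(s\beta_\varphi)\,Q\text{)} \\
  &= P\,\mathrm{Ad}(g)\bigl(Q^{-1}\varphi_{ren}(s)\bigr).
\end{align*}
```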

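Recall that $L(t) = \mathrm{Ad}(g_+(t))(\tilde R(\varphi))$ and that, by its definition, $\tilde R(\varphi_t) = L(t)$. Thus
$$\tilde\beta_{\varphi_t} = \lambda\tilde R(\varphi_t) = \mathrm{Ad}(g_+(t))(\lambda\tilde R(\varphi)) = \mathrm{Ad}(g_+(t))(\tilde\beta_\varphi),$$
which evaluated at $\lambda = 0$ gives $\tilde\beta_0(t) = \mathrm{Ad}(g_+(t)(0))(\tilde\beta_0(0))$. This implies the following equation for $\varphi_t$.

Corollary 8.4. For fixed $t$, the translated RGF $\varphi_t(s)$ equals $\varphi(s)$ iff
$$\mathrm{Ad}(g_+(t)(0))\cdot\tilde\beta_0(0) = \tilde\beta_0(0). \tag{8.2}$$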
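The chain of equivalences behind Corollary 8.4 can be spelled out as below. This is a hedged sketch: it assumes that Lemma 8.3 may be applied both at parameter $t$ and at $t = 0$, where the flow starts at $\varphi$.

```latex
% Assumption: Lemma 8.3 holds at t and at t = 0 (where \varphi_0 = \varphi).
\begin{align*}
\varphi_t(s) = \varphi(s)\ \text{for all } s
  &\iff \exp(s\tilde\beta_0(t)) = \exp(s\tilde\beta_0(0))\ \text{for all } s \\
  &\iff \tilde\beta_0(t) = \tilde\beta_0(0)
     && \text{(differentiate at } s = 0\text{)} \\
  &\iff \mathrm{Ad}(g_+(t)(0))\,\tilde\beta_0(0) = \tilde\beta_0(0)
     && \text{(since } \tilde\beta_0(t) = \mathrm{Ad}(g_+(t)(0))\,\tilde\beta_0(0)\text{)}.
\end{align*}
```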


By Theorem 5.9,
$$g_+(t) = \big(\exp(-t f(L_0))\big)_+ = \Big(\exp\Big(-t f\Big(\frac{\tilde\beta_\varphi}{\lambda}\Big)\Big)\Big)_+$$
depends only on the initial character $\varphi$ and the choice of an Ad-covariant function $f$. Thus we can consider (8.2) as a fixed point equation for the $t$ flow of $\varphi$. This equation is satisfied for a cocommutative Hopf algebra (for which the adjoint representation is trivial), in the setup of Proposition 7.8, and in Theorem 8.8 below. However, for the Hopf algebra of integer decorated rooted trees and the regularized toy model character defined in [5,7,11], the only value of $t$ for which $\varphi_t = \varphi$ is $t = 0$.

8.2. Lax pair equations for the β-function. We now show that the β-functions and β-characters of $\varphi_t$ also satisfy a Lax pair flow.

Proposition 8.5. In the setup of Theorem 5.9, we have
$$\frac{d\beta_{\varphi_t}}{dt} = \left[\frac{d((\varphi_t)_+(0))}{dt}\,((\varphi_t)_+(0))^{-1} + \mathrm{Ad}((\varphi_t)_+(0))(M_+(0)),\ \beta_{\varphi_t}\right],$$
where $M$ comes from the Lax pair equation $dL/dt = [L, M]$, and $M_+$ is the projection of $M$ into $\mathfrak{g}_{\mathcal{A}_+}$.

Proof. In the notation of Proposition 8.2, $\beta_{\varphi_t} = \mathrm{Ad}(A(t))(\beta_\varphi)$, which as usual implies that
$$\frac{d\beta_{\varphi_t}}{dt} = \left[\frac{dA(t)}{dt}A(t)^{-1},\ \beta_{\varphi_t}\right].$$
We have
$$\frac{dA(t)}{dt}A(t)^{-1} = \frac{d((\varphi_t)_+(0))}{dt}\,((\varphi_t)_+(0))^{-1} + (\varphi_t)_+(0)\,\frac{dg_+(t)(0)}{dt}\,(g_+(t)(0))^{-1}\,((\varphi_t)_+(0))^{-1}.$$
By the proof of Theorem 5.9, we have
$$\frac{d(g_+(t)(0))}{dt}\,(g_+(t)(0))^{-1} = M_+(0).$$

Since $\tilde\beta_{\varphi_t}$ and its value at $\lambda = 0$ are simpler and carry better geometric properties, we now restrict our attention to them.

Lemma 8.6. For a local character $\varphi \in G_{\mathcal{A}}$, let $\varphi_t$ be the flow from Theorem 7.3. Then
$$\frac{d\tilde\beta_{\varphi_t}}{dt} = [\tilde\beta_{\varphi_t}, M]. \tag{8.3}$$

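Proof. By Lemma 6.7, we get $\tilde\beta_{\varphi_t} = \lambda\tilde R(\varphi_t) = \lambda L(t)$. Then
$$\frac{d\tilde\beta_{\varphi_t}}{dt} = \frac{d(\lambda\tilde R(\varphi_t))}{dt} = \lambda\frac{dL(t)}{dt} = \lambda[L(t), M] = [\tilde\beta_{\varphi_t}, M].$$

Let the Taylor expansion of $\tilde\beta_{\varphi_t}$ be
$$\tilde\beta_{\varphi_t} = \sum_{k=0}^{\infty}\tilde\beta_k(t)\lambda^k.$$
In the setup of Corollary 5.11, we get a Lax pair equation for $\tilde\beta_0(t) = \tilde\beta_{\varphi_t}\big|_{\lambda=0}$.

Theorem 8.7. For a local character $\varphi \in G_{\mathcal{A}}$, let $L(t)$ be the Lax pair flow of Corollary 5.11 with initial condition $L_0 = \tilde R(\varphi)$. Let $\varphi_t = \tilde R^{-1}(L(t))$. Then
(i) For $-n + 2m \ge 1$, $\varphi_t = \varphi$ and hence $\beta_{\varphi_t} = \beta_\varphi$ and $\tilde\beta_0(t) = \tilde\beta_0(0)$ for all $t$.
(ii) For $-n + 2m \le 0$, $\beta_{\varphi_t} \in \mathfrak{g}_{\mathbb{C}}$ satisfies
$$\frac{d\tilde\beta_0(t)}{dt} = 2[\tilde\beta_0(t), \tilde\beta_{n-2m+1}(t)]. \tag{8.4}$$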

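Proof. By Theorem 7.3, $\varphi_t$ are local characters, so by [12, Th. IV.4.], $\tilde\beta_{\varphi_t} = \lambda L(t) = \lambda\tilde R(\varphi_t)$ is holomorphic.

(i) If $-n + 2m \ge 1$, then $\lambda^{-n+2m}L(t)$ is holomorphic, which implies
$$M = R(\lambda^{-n+2m}L(t)) = \lambda^{-n+2m}L(t) = \lambda^{-n+2m-1}\tilde\beta_{\varphi_t}.$$
$L(t)$ satisfies the Lax pair equation
$$\frac{dL}{dt} = [L, M] = [L, \lambda^{-n+2m}L] = \lambda^{-n+2m}[L, L] = 0.$$
Thus $L(t) = L_0$ for all $t$, which gives $\varphi_t = \tilde R^{-1}(L(t)) = \tilde R^{-1}(L_0) = \varphi$ for all $t$.

(ii) For $-n + 2m \le 0$, we have
$$M = R(\lambda^{-n+2m}L(t)) = \lambda^{-n+2m}L(t) - 2P_-(\lambda^{-n+2m}L(t)) = \lambda^{-n+2m-1}\tilde\beta_{\varphi_t} - 2P_-(\lambda^{-n+2m-1}\tilde\beta_{\varphi_t}).$$
Equation (8.3) becomes
$$\frac{d\tilde\beta_{\varphi_t}}{dt} = -2[\tilde\beta_{\varphi_t}, P_-(\lambda^{-n+2m-1}\tilde\beta_{\varphi_t})] = -2[P_+(\lambda^{-n+2m-1}\tilde\beta_{\varphi_t}),\ \lambda^{n-2m+1}P_-(\lambda^{-n+2m-1}\tilde\beta_{\varphi_t})].$$
Expand $\tilde\beta_{\varphi_t}$ as
$$\tilde\beta_{\varphi_t} = \sum_{k=0}^{\infty}\tilde\beta_k(t)\lambda^k.$$
Then
$$\frac{d\tilde\beta_{\varphi_t}}{dt} = -2\left[\sum_{k=n-2m+1}^{\infty}\tilde\beta_k(t)\lambda^{k-n+2m-1},\ \sum_{j=0}^{n-2m}\tilde\beta_j(t)\lambda^j\right],$$
and evaluating at $\lambda = 0$ gives
$$\frac{d\tilde\beta_0(t)}{dt} = -2[\tilde\beta_{n-2m+1}(t), \tilde\beta_0(t)] = 2[\tilde\beta_0(t), \tilde\beta_{n-2m+1}(t)]. \tag{8.5}$$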
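The evaluation at $\lambda = 0$ that yields (8.5) can be made explicit as follows. The sketch only expands the bracket bilinearly and uses that both exponents of $\lambda$ are nonnegative in case (ii), so a single term survives.

```latex
% Both exponents k-n+2m-1 and j are >= 0, so their sum vanishes only if each vanishes.
\begin{align*}
\Bigl[\sum_{k\ge n-2m+1}\tilde\beta_k(t)\lambda^{k-n+2m-1},\ \sum_{j=0}^{n-2m}\tilde\beta_j(t)\lambda^j\Bigr]
  &= \sum_{k\ge n-2m+1}\ \sum_{j=0}^{n-2m}
     [\tilde\beta_k(t),\tilde\beta_j(t)]\,\lambda^{(k-n+2m-1)+j}, \\
(k-n+2m-1)+j = 0
  &\iff k = n-2m+1 \ \text{and}\ j = 0, \\
\frac{d\tilde\beta_0(t)}{dt}
  &= -2[\tilde\beta_{n-2m+1}(t),\tilde\beta_0(t)]
   = 2[\tilde\beta_0(t),\tilde\beta_{n-2m+1}(t)].
\end{align*}
```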

In the setup of Corollary 5.11, Proposition 8.5 can be restated as follows.

Corollary 8.8. Let $H$ be a connected graded commutative Hopf algebra with $\mathfrak{g}_{\mathcal{A}}$ the Lie algebra of infinitesimal characters with values in Laurent series. Pick $L_0 \in \mathfrak{g}_{\mathcal{A}}$ and set $X = 2\lambda^{-n+2m}L_0$. Then
$$\frac{d\beta_{\varphi_t}}{dt} = \left[\beta_{\varphi_t},\ -\frac{d((\varphi_t)_+(0))}{dt}\,((\varphi_t)_+(0))^{-1} + 2\,\mathrm{Ad}((\varphi_t)_+(0))\big(\tilde\beta_{n-2m+1}(t)\big)\right]. \tag{8.6}$$

Proof. A gauge transformation $X \mapsto X' = \xi X\xi^{-1}$ changes a Lax pair equation of the form $(d/dt)X = [Y, X]$ into $(d/dt)X' = [Y', X']$ with $Y' = \xi Y\xi^{-1} + (d\xi/dt)\xi^{-1}$. Taking $X = \tilde\beta_0(t)$, $Y = -2\tilde\beta_{n-2m+1}(t)$, $\xi = (\varphi_t)_+(0)$ and using Lemma 6.7(ii), (8.4) becomes (8.6).

8.3. Complete integrability for the flows of infinitesimal characters. We end with a brief discussion of the complete integrability of our Lax pair equations. We first discuss the complete integrability of the flow of infinitesimal characters $L(t)$ using spectral curve techniques as in [13] for a specific truncated Hopf algebra. We then give an example of a truncated Hopf algebra for which complete integrability can be shown by classical techniques. These results will be discussed more completely elsewhere.

8.3.1. Spectral curve techniques. Let $H_2$ be the Hopf subalgebra generated by the trees $t_0 = 1_T$, $t_1$, $t_2$, $t_3$, $t_4$, $t_5$ [tree diagrams not recoverable]. For $T \in \{t_1, \dots, t_5\}$, let $Z_T$ be the corresponding infinitesimal character. The Lie algebra $\mathfrak{g}_2$ of scalar valued infinitesimal characters of $H_2$ is generated by $Z_{t_1}, \dots, Z_{t_5}$. Let $G_1$ be the scalar valued character group of $H_2$, and let $G_0$ be the semi-direct product $G_1 \rtimes \mathbb{C}$ given by
$$(g, t)\cdot(g', t') = (g\cdot\theta_t(g'),\ t + t'),$$
where $\theta_t(g)(T) = e^{t\deg(T)}g(T)$ for homogeneous $T$. Define a new variable $Z_0$ with $[Z_0, Z_{t_i}] = \deg(t_i)Z_{t_i}$, so formally $Z_0 = \frac{d}{d\theta}$. The Lie algebra $\mathfrak{g}_0$ of $G_0$ is generated by $Z_0, Z_{t_1}, \dots, Z_{t_5}$. Let $\delta = \mathfrak{g}_0 \oplus \mathfrak{g}_0^*$ be the double Lie algebra associated to an arbitrary Lie bialgebra structure. The conditions a) and b) in Definition 2.1 of a Lie bialgebra can be written in a basis as a system of quadratic equations. We can solve this system explicitly, e.g. via Mathematica. It turns out that there are 43 families of Lie bialgebra structures $\gamma$ on $\mathfrak{g}_0$. In more detail, the system of quadratic equations involves 90 variables. Mathematica gives 1 solution with 82 linear relations (and so 8 degrees of freedom), 7 solutions with 83 linear relations, 16 solutions with 84 linear relations, 13 solutions with 85 linear relations, 5 solutions with 86 linear relations, and 1 solution with 87 linear relations.
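As a quick arithmetic check of the count quoted above, the numbers of solutions add up to the stated 43 families. The second line reads off degrees of freedom in the same way the text does for the first family (90 variables minus the number of linear relations); applying that rule to the other families is an assumption made here, not a statement from the text.

```latex
% Under-braces record the number of linear relations of each group of solutions.
\begin{align*}
\underbrace{1}_{82} + \underbrace{7}_{83} + \underbrace{16}_{84}
 + \underbrace{13}_{85} + \underbrace{5}_{86} + \underbrace{1}_{87} &= 43, \\
90 - 82 = 8,\quad 90 - 83 = 7,\quad \dots,\quad 90 - 87 &= 3.
\end{align*}
```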


To any Lax equation of matrices with a spectral parameter, one can associate a spectral curve and study its algebro-geometric properties (see [13]). In our case, we consider the adjoint representation $\mathrm{ad} : \delta \to \mathfrak{gl}(\delta)$ and the induced adjoint representation of the loop algebra. Applying $\mathrm{ad} : L\delta \to \mathfrak{gl}(L\delta)$ to the Lax pair equation (5.1) of Theorem 5.4, for the Hopf algebra $H_2$ we get a Lax pair equation in $\mathfrak{gl}(L\delta)$,
$$\frac{d\,\mathrm{ad}(L)}{dt} = [\mathrm{ad}(L), \mathrm{ad}(M)]. \tag{8.7}$$
The spectral curve of (8.7) is given by the characteristic equation of $\mathrm{ad}(L(\lambda))$, that is, by the set
$$\{(\lambda, \nu) \in (\mathbb{C} - \{0\}) \times \mathbb{C} \mid \det(\mathrm{ad}(L(\lambda)) - \nu\,\mathrm{Id}) = 0\}.$$
The theory of the spectral curve and its Jacobian usually assumes that the spectral curve is irreducible. For all 43 families of Lie bialgebra structures that give $\delta$, the spectral curve itself is the union of degree one curves. Thus each irreducible component has a trivial Jacobian, and the spectral curve theory breaks down. We do not know if spectral curve techniques work for more complicated truncated Hopf algebras.

8.3.2. A completely integrable Lax pair equation. We give an example of a completely integrable system associated to the Lax pair equation of Theorem 5.4. Let $H_3$ be the Hopf subalgebra generated by the trees $t_0 = 1_T$, $t_1$, $t_2$, $t_4$ [tree diagrams not recoverable],

let $\mathfrak{g}_3$ be the Lie algebra of infinitesimal characters, and let $\delta = \mathfrak{g}_3 \oplus \mathfrak{g}_3^*$ be the double Lie algebra (associated to the trivial Lie bialgebra structure). Let $L_0 = l_{-2}\lambda^{-2} + l_{-1}\lambda^{-1} + l_0 \in L\delta$. By Lemma 7.2, $L(t) = \mathrm{Ad}(g_+(t))(L_0)$ has a pole of order at most two. By (7.1) and arguing as in Lemma 7.2, we conclude that $L(t) = \mathrm{Ad}(g_-(t))(L_0)$ has no terms $c_i\lambda^i$, with $i \ge 1$, in its Laurent expansion. Thus
$$L(t) \in L_{-2,0}\delta = \Big\{\sum_{k=-2}^{0}L_k\lambda^k,\ L_k \in \delta\Big\}.$$
Let $\{Y_1, Y_2, Y_3, Y_1^*, Y_2^*, Y_3^*\}$ be a basis of $\delta$ and write
$$x = \sum x_\alpha^i\lambda^iY_\alpha + \sum x_{\alpha^*}^i\lambda^iY_\alpha^* \in L_{-2,0}\delta.$$
The truncated Poisson bracket $\{\cdot,\cdot\}_R$ is given by
$$\{x_a^i, x_b^j\}_R = \epsilon_{i,j}\,C_{a,b}^c\,x_{c^*}^{i+j},$$
with (i) $\epsilon_{i,j} = 1$ if $i, j \ge 0$, $\epsilon_{i,j} = -1$ if $i, j \le -1$, $\epsilon_{i,j} = 0$ otherwise; (ii) $[E_a, E_b] = C_{a,b}^cE_c$; (iii) $x_{c^*}^{i+j} = 0$ if $i + j > 0$ or if $i + j < -2$. The rank of this Poisson bracket is four. Since the dimension of $L_{-2,0}\delta$ is 18, we need a set of $18 - (4/2) = 16$ linearly independent functions in involution to get a completely integrable system [1]. For $(i, k) \in \{(-2, k) \mid 1 \le k \le 3\} \cup \{(-1, 3), (0, 3)\}$ and $(i, r) \in \{(-2, r) \mid 1 \le r \le 3\} \cup \{(i, r) \mid -1 \le i \le 0,\ 1 \le r \le 2\}$, in the coordinates $\{x_k^i, x_{k^*}^i\}_{\{-2\le i\le 0,\ 1\le k\le 3\}}$ set
$$H_k^i(x) = \frac{(x_k^i)^2}{2}, \qquad H_{r+3}^i(x) = \frac{(x_{r^*}^i)^2}{2},$$
$$H_1^0(x) = \frac{(x_1^0)^2}{2} + \frac{(x_2^0)^2}{2}, \qquad H_6^0(x) = \frac{(x_1^0)^2}{2} + \frac{(x_2^0)^2}{2} + \frac{(x_{3^*}^0)^2}{2},$$
$$H_1^{-1}(x) = x_1^{-1}x_1^{-2} + x_2^{-1}x_2^{-2}, \qquad H_6^{-1}(x) = x_1^{-1}x_1^{-2} + x_2^{-1}x_2^{-2} + x_{3^*}^{-1}x_{3^*}^{-2}.$$
The set $S = \{H_j^{-2},\ 1 \le j \le 6\} \cup \{H_1^0, H_1^{-1}\} \cup \{H_j^i,\ -1 \le i \le 0,\ 3 \le j \le 6\}$ is a set of sixteen linearly independent functions in involution. If $\psi$ from Theorem 5.4 is a nonconstant Casimir function with respect to the truncated Poisson bracket $\{\cdot,\cdot\}$

on $L_{-2,0}\delta$, then $S \cup \{\psi\}$ is in involution with respect to the truncated Poisson bracket $\{\cdot,\cdot\}_R$. There then exists a function $F \in S$ such that $\{\psi\} \cup S \setminus \{F\}$ is linearly independent (in the sense that their differentials are linearly independent on a dense open subset of $L_{-2,0}\delta$). Therefore Eq. (5.1) of Theorem 5.4 is completely integrable with respect to the truncated Poisson structure $\{\cdot,\cdot\}_R$ on $L_{-2,0}\delta$.

Remark 8.9. We can apply these techniques to the Lax pair flow for β-characters. For certain Hopf subalgebras, the equation
$$\frac{d\tilde\beta_0(t)}{dt} = 2[\tilde\beta_0(t), \tilde\beta_{n-2m+1}(t)] \tag{8.8}$$
from Theorem 8.7 is a Hamiltonian system. In some cases, (8.8) is an integrable system on the corresponding double Lie algebra $\delta$.

Acknowledgements. Gabriel Baditoiu would like to thank the Max-Planck-Institute for Mathematics, Bonn and the Erwin Schrödinger International Institute for Mathematical Physics for the hospitality. Steven Rosenberg would also like to thank ESI and the Australian National University.

References

1. Adler, M., van Moerbeke, P., Vanhaecke, P.: Algebraic integrability, Painlevé geometry and Lie algebras. Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in Mathematics, Vol. 47, Berlin: Springer-Verlag, 2004
2. Connes, A., Kreimer, D.: Renormalization in Quantum Field Theory and the Riemann-Hilbert Problem I: The Hopf algebra structure of Graphs and the Main Theorem. Commun. Math. Phys. 210, 249–273 (2000)
3. Connes, A., Kreimer, D.: Renormalization in quantum field theory and the Riemann-Hilbert problem. II. The β-function, diffeomorphisms and the renormalization group. Commun. Math. Phys. 216(1), 215–241 (2001)
4. Connes, A., Marcolli, M.: Noncommutative geometry, quantum fields and motives. American Mathematical Society Colloquium Publications, Vol. 55, Providence, RI: Amer. Math. Soc., 2008
5. Delbourgo, R., Kreimer, D.: Using the Hopf algebra structure of QFT in calculations. Phys. Rev. D 60, 105025 (1999)
6. Ebrahimi-Fard, K., Guo, L., Kreimer, D.: Integrable renormalization II: the general case. Ann. Inst. H. Poincaré 6, 369–395 (2005)
7. Ebrahimi-Fard, K., Manchon, D.: On matrix differential equations in the Hopf algebra of renormalization. Adv. Theor. Math. Phys. 10, 879–913 (2006)
8. Ebrahimi-Fard, K., Gracia-Bondía, J.M., Patras, F.: A Lie theoretic approach to renormalization. Commun. Math. Phys. 276(2), 519–549 (2007)
9. Guest, M.: Harmonic Maps, Loop Groups, and Integrable Systems. LMS Student Texts, Vol. 38, Cambridge: Cambridge U. Press, 1997
10. Kosmann-Schwarzbach, Y.: Lie bialgebras, Poisson Lie groups and dressing transformations. In: Integrability of nonlinear systems (Pondicherry, 1996), Lecture Notes in Phys., Vol. 495, Berlin: Springer, 1997, pp. 104–170
11. Kreimer, D.: Chen’s iterated integral represents the operator product expansion. Adv. Theor. Math. Phys. 3(3), 627–670 (1999)
12. Manchon, D.: Hopf algebras, from basics to applications to renormalization, Comptes-rendus des Rencontres mathématiques de Glanon 2001, Published 2003, available at http://arxiv.org/abs/math/0408405v2, 2006
13. Reyman, A.G., Semenov-Tian-Shansky, M.A.: Integrable Systems II: Group-Theoretical Methods in the Theory of Finite-Dimensional Integrable Systems. In: Dynamical systems. VII, Encyclopaedia of Mathematical Sciences, Vol. 16, Berlin: Springer-Verlag, 1994
14. Suris, Y.B.: The problem of integrable discretization: Hamiltonian approach. Progress in Mathematics, Vol. 219, Basel: Birkhäuser Verlag, 2003

Communicated by M. Aizenman

Commun. Math. Phys. 296, 681–738 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1023-x

Communications in Mathematical Physics

Ionization of Coulomb Systems in $\mathbb{R}^3$ by Time Periodic Forcings of Arbitrary Size

O. Costin¹, J. L. Lebowitz²,³, S. Tanveer¹

1 Department of Mathematics, Ohio State University, 231 West 18th Ave., Columbus, OH 43210, USA. E-mail: [email protected]
2 Department of Mathematics, Rutgers University, 110 Frelinghuysen Rd., Piscataway, NJ 08854, USA
3 Department of Physics, Rutgers University, 136 Frelinghuysen Rd., Piscataway, NJ 08854, USA

Received: 5 May 2009 / Accepted: 3 October 2009
Published online: 7 March 2010 – © Springer-Verlag 2010

Abstract: We analyze the long time behavior of solutions of the Schrödinger equation $i\psi_t = (-\Delta - b/r + V(t, x))\psi$, $x \in \mathbb{R}^3$, $r = |x|$, describing a Coulomb system subjected to a spatially compactly supported time periodic potential $V(t, x) = V(t + 2\pi/\omega, x)$ with zero time average. We show that, for any $V(t, x)$ of the form $2\Omega(r)\sin(\omega t - \theta)$, with $\Omega(r)$ nonzero on its support, Floquet bound states do not exist. This implies that the system ionizes, i.e. $P(t, K) = \int_K |\psi(t, x)|^2\,dx \to 0$ as $t \to \infty$ for any compact set $K \subset \mathbb{R}^3$. Furthermore, if the initial state is compactly supported and has only finitely many spherical harmonic modes, then $P(t, K)$ decays like $t^{-5/3}$ as $t \to \infty$. To prove these statements, we develop a rigorous WKB theory for infinite systems of ordinary differential equations.

Contents

1. Introduction and Overview of Results
   1.1 The Coulomb Hamiltonian
   1.2 Setting
   1.3 Ionization
   1.4 Laplace space formulation
   1.5 The homogeneous equation and the PDE-difference equations
2. Main Results
3. Proofs
   3.1 The Hilbert space H
   3.2 Proof of Theorem 2
   3.3 Step 1. Compact operator reformulation
   3.4 Restriction to a ball B; Definition of C
   3.5 Step 2. Regularity of $R_{\beta,l,m}$ at $p = 0$ and of $C_{l,m}$ at $p_1 = 0$
   3.6 Compactness
   3.7 Step 3. The Fredholm alternative
   3.8 End of proof of Theorem 2
   3.9 Proof of Theorem 3
   3.10 Proof of ionization for spherically symmetric $\Omega$, Theorem 1
   3.11 Asymptotic behavior of $v_n$ in (30) as $n \to -\infty$
4. Proofs of Intermediate Steps
   4.1 Proof of Proposition 8
   4.2 Proof of Proposition 15
   4.3 Proof of Proposition 11
   4.4 Proof of Proposition 17
   4.5 Proof of Proposition 19
   4.6 Proof of Proposition 21 and final estimates for Theorem 1
   4.7 Proof of Theorem 3
   4.8 Connection with the Floquet operator
   4.9 Differential equation for w
   4.10 Proof of Proposition 23
   4.11 Proof of Theorem 4
   4.12 Further results on $g_{n_0-k}$ and $h_k$
   4.13 Proof of Lemma 31
   4.14 Proof of Lemma 33
   4.15 Proof of Lemma 35
5. Appendix
   5.1 Short proof of the regularity of the unitary propagator
   5.2 Laplace transform of the Schrödinger equation
   5.3 Analyticity of $(I - C_{l,m})^{-1}$ in X
   5.4 Coulomb Green’s function representation
   5.5 Dependence of A in Eq. (56) on Z, p
   5.6 Asymptotics of $w_2(a)$, $w_2(a)$ for small λ
   5.7 Stationary phase analysis needed to calculate the ionization rate
   5.8 Calculation of jk
   5.9 Generalizations
   5.10 Further remarks on the asymptotics
Acknowledgements
References

1. Introduction and Overview of Results The long time behavior of solutions of the Schrödinger equation of a system with both discrete and continuous spectrum subjected to a time periodic potential is a longstanding problem. Powerful results have been obtained under various assumptions on the potentials, see [5–8,21,32,34,36,37], and references therein. In particular, there are conditional results on the ionization of the Hydrogen atom, subjected to an external time-harmonic dipole field V (t, x) = E · x cos ωt if E is sufficiently small, see [43,44].


In addition, Möller and Skibsted proved the equivalence of absence of point spectrum and ionization for a large class of such systems subject to periodic fields [32]. There are also detailed results about the behavior of the wave function for systems subjected to general time periodic potentials, decaying faster than $r^{-2}$, under the additional assumption of absence of point spectrum of the Floquet operator, see [20].

None of these results however prove or disprove ionization of Coulomb–bound particles subject to time-periodic forcing of fixed amplitude and zero average. In fact, such results have only recently been obtained even for simple model systems, see [11–13,15,16,30] and references cited there. For a periodic dipole field of nonzero average ionization was proved in [33] (we note that the time averaged Hamiltonian has no bound states in this setting).

What experiments and simplified models show is that the behavior of systems with both discrete and continuous spectrum, subject to time-periodic fields of arbitrary strength, can be very complicated. For amplitudes where perturbation theory is not applicable (such fields are becoming of increasing practical importance in technology), qualitative departures from the behavior at small fields are observed. There are even situations, see e.g. [12], where for small enough fields ionization occurs for all initial states while for larger fields there exist localized time–quasiperiodic solutions of the Schrödinger equation, i.e. Floquet bound states. Though these situations are rather exceptional, constructive methods of analysis are required to determine the outcome in specific settings.

In this paper we prove ionization for Coulomb systems with a very special (non-dipole) type of forcing of arbitrary magnitude. This is equivalent to establishing the absence of point and singular continuous spectrum of the corresponding Floquet operators. We also obtain the large time behavior of the wave function. The time decay of the wave function, for compactly supported initial conditions, is of order $t^{-5/6}$. This differs from the $t^{-3/2}$ or, exceptionally, $t^{-1/2}$ power law found for shorter range reference potentials, see [15,20]. The nonperturbative methods include the development of rigorous WKB techniques for infinite systems of ODEs.

1.1. The Coulomb Hamiltonian. In units such that $\hbar^2/2m = 1$, the Coulomb quantum Hamiltonian of a Hydrogen atom (more generally a Rydberg atom) is
$$H_C = -\Delta - \frac{b}{r}, \tag{1}$$
where $b > 0$, $r = |x|$, $x \in \mathbb{R}^3$ and $\Delta$ is the Laplacian. It is well known, see e.g. [28], that $H_C$ is self-adjoint on the Sobolev space $H^2(\mathbb{R}^3) = D(-\Delta)$, the domain of $-\Delta$ (cf. also [28], p. 303). The spectrum of $H_C$ consists of isolated eigenvalues $E_n = -b^2/4n^2$, with multiplicity $n^2$, and an absolutely continuous part, $[0, \infty)$.

1.2. Setting. Our starting point is the time evolution of the wave function $\psi(t, x)$ of the Hydrogen atom described by the Schrödinger equation
$$i\psi_t = H_C\psi + V(t, x)\psi; \qquad \psi(0, x) = \psi_0(x) \in H^2(\mathbb{R}^3), \tag{2}$$
where $V(t, x) = \sum_{j\in\mathbb{Z}}\Omega_j(x)e^{ij\omega t}$ is real valued and $\Omega_0 \equiv 0$.
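As a consistency check on the eigenvalue formula quoted in § 1.1, one can rescale the textbook hydrogen-like Hamiltonian. The sketch below assumes the standard result that $-\tfrac12\Delta - q/r$ has eigenvalues $-q^2/(2n^2)$ with multiplicity $n^2$; the symbol $q$ is introduced here only for the comparison and does not appear in the paper.

```latex
% Assumption: -\tfrac12\Delta - q/r has eigenvalues -q^2/(2n^2), n = 1, 2, ...
\begin{align*}
H_C = -\Delta - \frac{b}{r}
    &= 2\Bigl(-\tfrac12\Delta - \frac{b/2}{r}\Bigr), \\
E_n &= 2\cdot\Bigl(-\frac{(b/2)^2}{2n^2}\Bigr) = -\frac{b^2}{4n^2},
      \qquad n = 1, 2, \dots
\end{align*}
```

The multiplicity $n^2$ is unchanged by the overall factor 2.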


The operator $H_C + V(t, x)$ satisfies the assumptions of Theorem X.71, p. 290, in [31] v.2.; Theorem X.70, p. 285 also applies in our setting. Thus, for any $t$, $\psi(t, \cdot) \in H^2(\mathbb{R}^3)$, and the unitary propagator $U(t)$ for (2) is strongly differentiable in $t$; see § 5.1 for a short proof in our case.

Assumption 1. The $\Omega_j(x)$, $j \in \mathbb{Z}$ are smooth inside a common compact support, chosen without loss of generality to be the ball $B_1 \subset \mathbb{R}^3$ of radius 1, and $\sum_{j\in\mathbb{Z}}(1 + |j|)\|\Omega_j\|_{L^\infty(B_1)} < \infty$.

1.3. Ionization. We say that the system ionizes if the probability to find the particle in any compact set vanishes for large $t$, i.e., for any $a > 0$ we have
$$P(t, B_a) = \int_{B_a}|\psi(t, x)|^2\,dx \to 0 \quad\text{as } t \to \infty, \tag{3}$$
where $B_a = \{x : |x| < a\}$. To prove ionization, it clearly suffices to prove (3) for all $a > 1$. A simple way in which ionization may fail is the existence of a solution of the Schrödinger equation in the form
$$\psi(t, x) = e^{i\phi t}v(t, x) \quad\text{with } \phi \in \mathbb{R} \text{ and } v \in L^2([0, 2\pi/\omega] \times \mathbb{R}^3) \text{ time-periodic.} \tag{4}$$
Substitution in (2) leads to the equation
$$Kv = \phi v, \tag{5}$$
where
$$K = i\frac{\partial}{\partial t} - \big(-\Delta - br^{-1} + V(t, x)\big) \tag{6}$$
is the Floquet operator, densely defined on $L^2([0, 2\pi/\omega] \times \mathbb{R}^3)$; $0 \ne v \in L^2$ implies by definition that $\phi \in \sigma_p(K)$, the point spectrum of $K$. Somewhat surprisingly, in all studied systems, $\sigma_p(K) \ne \emptyset$ is in fact the only possibility for ionization to fail. As we will show this is also true for (2). The proof of ionization also implies that $K$ does not have any singular continuous spectrum. This turns out to be a consequence of the existence of an underlying compact operator formulation, the operator being closely related to $K$. Generic ionization is then expected since $L^2$ solutions of the Schrödinger equation of the special form (4) are unlikely. We prove that for $V(t, x) = 2\Omega(r)\sin(\omega t - \theta)$, $\Omega > 0$ on $[0, 1]$ and sufficiently smooth, they do not exist.

1.4. Laplace space formulation. For $\psi \in H^2(\mathbb{R}^3)$, the Laplace transform
$$\hat\psi(p, \cdot) := \int_0^\infty\psi(t, \cdot)e^{-pt}\,dt$$
exists for $p \in \mathbb{H}$, the right half complex plane, and the map $p \mapsto \hat\psi(p, \cdot)$ is $H^2$ valued analytic in $\mathrm{Re}\,p > 0$. The Laplace transform converts the asymptotic problem (3) into an analytical one.


To improve the decay in $p$ of the Laplace transform, it is convenient to write
$$\psi(t, x) = \psi_0(x)e^{-t} + y(t, x). \tag{7}$$
Now, $y(t, x)$ satisfies
$$iy_t - H_Cy - V(t, x)y = e^{-t}\big[i\psi_0 + H_C\psi_0 + V(t, x)\psi_0\big] \equiv -y^0(t, x); \qquad y(0, x) = 0. \tag{8}$$
Standard arguments (see Appendix 5.2) show that the $t$-Laplace transform of $y$, $\hat y$, is in $H^2$ and satisfies
$$(H_C - ip)\hat y(p, x) = \hat y^0(p, x) - \sum_{j\in\mathbb{Z}}\Omega_j(x)\hat y(p - ij\omega, x), \tag{9}$$
where
$$\hat y^0(p, x) = -\frac{1}{1 + p}(i\psi_0 + H_C\psi_0) - \sum_{j\in\mathbb{Z}}\frac{\Omega_j(x)\psi_0(x)}{1 + p - ij\omega}. \tag{10}$$
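The passage from (8) to (9) and (10) can be sketched term by term. The sketch assumes the expansion $V(t, x) = \sum_j \Omega_j(x)e^{ij\omega t}$ from § 1.2 and that the sum over $j$ may be exchanged with the Laplace integral; the rigorous argument is the one referred to in Appendix 5.2. The notation $\mathcal{L}$ for the Laplace transform in $t$ is introduced here for convenience.

```latex
% \mathcal{L} denotes the Laplace transform in t; y(0,x) = 0 is used in the first line.
\begin{align*}
\mathcal{L}[i y_t](p) &= i p\,\hat y(p,x), \\
\mathcal{L}[\Omega_j e^{ij\omega t} y](p) &= \Omega_j(x)\,\hat y(p - ij\omega, x), \\
\mathcal{L}[e^{-t}](p) = \frac{1}{1+p}, &\qquad
\mathcal{L}[e^{-t}e^{ij\omega t}](p) = \frac{1}{1 + p - ij\omega}, \\
i p\,\hat y - H_C\hat y - \sum_{j\in\mathbb{Z}}\Omega_j\,\hat y(p - ij\omega,\cdot)
 &= \frac{i\psi_0 + H_C\psi_0}{1+p}
  + \sum_{j\in\mathbb{Z}}\frac{\Omega_j\psi_0}{1+p-ij\omega} = -\hat y^0(p,\cdot).
\end{align*}
```

Multiplying the last line by $-1$ gives (9) with $\hat y^0$ as in (10).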

1.5. The homogeneous equation and the PDE-difference equations. The homogeneous system associated to (9) is
$$(-\Delta - b/r - ip)\,w(p, x) = -\sum_{j\in\mathbb{Z}}\Omega_j(x)\,w(p - ij\omega, x). \tag{11}$$

Note 2. (i) Clearly, (9) and (11) couple two values of $p$ only if $(p_1 - p_2) \in i\omega\mathbb{Z}$, and are effectively infinite systems of partial differential equations. Setting
$$p = p_1 + in\omega, \quad\text{with } p_1 \in \mathbb{C} \bmod (i\omega), \tag{12}$$
we denote $y_n(p_1, x) = \hat y(p_1 + in\omega, x)$, $y_n^0(p_1, x) = \hat y^0(p_1 + in\omega, x)$, $w_n(p_1, x) = w(p_1 + in\omega, x)$. Equations (9) and (11) now become
$$(H_C - ip_1 + n\omega)\,y_n = y_n^0 - \sum_{j\in\mathbb{Z}}\Omega_j(x)\,y_{n-j}, \tag{13}$$
$$(H_C - ip_1 + n\omega)\,w_n = -\sum_{j\in\mathbb{Z}}\Omega_j(x)\,w_{n-j}. \tag{14}$$

Note 3. Seen as a differential difference equation, the solution $\hat y(p, x)$ is then a vector $\{y_n(p_1, x)\}_{n\in\mathbb{Z}}$ and the whole problem depends only parametrically on $p_1$. We have
$$y_n(p_1 + i\omega, x) = y_{n+1}(p_1, x), \tag{15}$$
and the analysis can be restricted to $S_0 = \{p \in \overline{\mathbb{H}} : \mathrm{Im}\,p \in [0, \omega)\}$, where $\overline{\mathbb{H}}$ is the closure of $\mathbb{H}$. There is arbitrariness in the choice of $S_0$ and, to see analyticity in $p_1$ on $\partial S_0$, it is convenient to allow $p_1 \in \overline{\mathbb{H}}$, using (15) to identify different strips of width $\omega$.
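The step from (9) and (11) to (13) and (14) is a direct substitution of (12); the two identities below, written out here as a sketch, are all that is used.

```latex
% Substituting p = p_1 + i n \omega from (12) into the operator and into the shifted argument.
\begin{align*}
H_C - i p &= H_C - i(p_1 + i n\omega) = H_C - i p_1 + n\omega, \\
\hat y(p - i j\omega, x) &= \hat y\bigl(p_1 + i(n-j)\omega, x\bigr) = y_{n-j}(p_1, x).
\end{align*}
```

The same substitution applied to $w$ gives (14).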


2. Main Results

Theorem 1. Assume $V(t, x) = 2\Omega(r)\sin(\omega t - \theta)$, with $\Omega(r) = 0$ for $r > 1$, $\Omega(r) > 0$ for $r \le 1$ and $\Omega(r) \in C^\infty[0, 1]$. Then $\sigma_p(K) = \emptyset$ and ionization always occurs. Furthermore, if $\psi_0(x)$ is compactly supported and has only finitely many spherical harmonics, then $P(t, B_a) = O(t^{-5/3})$.

For the proof, given in § 3.10, § 3.11 and § 4.6, we develop a relatively general rigorous WKB theory for infinite systems of differential equations. This yields the asymptotic behavior of $w_n$ as $n \to -\infty$. The argument relies on Theorems 2 and 3 below.

Remark 4. The condition $\Omega(1^-) \ne 0$ simplifies the arguments but these could accommodate an algebraically vanishing $\Omega$. (We also note that some one-dimensional models with rough $\Omega$ such as a δ mass show failure of ionization, see [12,30 and 35].)

We will later derive equivalent systems of integral equations, (22), allowing for a compact operator reformulation of the problem.

Theorem 2. In the setting § 1.2, assuming spherical symmetry in $x$ of the forcing $V(t, x)$, ionization occurs iff for all $p_1 \in \overline{\mathbb{H}}$, (14) has only zero $H^2$ solutions decaying in $n$.¹ This is true iff $\sigma_p(K) = \emptyset$.

This extends results about absence of singular continuous spectrum of $K$, [20], to this class of systems, with Coulombic potential and nonanalytic forcing. The proof is given in § 3.2 and § 3.8.

Properties of Floquet bound states for general compactly supported $V(t, x)$.

Theorem 3. If there exists an $H^2$ nonzero solution $w$ of (14) decaying in $n$,¹ then it has the further property
$$w_n = \chi_{B_1}w_n \quad\text{for all } n < 0, \tag{16}$$
with $\chi_A$ the characteristic function of the set $A$.

The general idea of the proof is explained in § 3.9 and the details are given in § 4.7.

Note 5. (i) The Sobolev embedding theorem implies that $w_n$ is continuous in $x$. From (14), $w_n$ is piecewise $C^2$, implying continuity of $\nabla w_n$ up to $\partial B_1$. (ii) Equation (16) makes the second order system (14) formally overdetermined since the regularity of $w$ in $x$ imposes both Dirichlet and Neumann conditions on $\partial B_1$ for $n < 0$. Nontrivial solutions are not, in general, expected to exist.

3. Proofs

Outline of the ideas. As in our previous work [10–15], summarized in [18] on simpler systems, we rely on a modified Fredholm theory to prove a dichotomy: there are bound Floquet states, or the system gets ionized. Mathematically the Coulomb potential introduces a number of substantial difficulties compared to the potentials considered before (for references, see e.g. [15]), due to its singular behavior at the origin and, more importantly, its very slow decay at infinity.

¹ For precise conditions, see § 3.4 below and the integral form (22).


The slow decay translates into potential-specific corrections at infinity, and standard general methods to show compactness in weighted spaces of the Floquet resolvent, such as those in [20], or our previous ones do not apply. Instead, the asymptotic behavior in the far field of the resolvent has to be calculated in detail. The accumulation of eigenvalues of increasing multiplicity at the top of the discrete spectrum of $H_C$ produces an essential singularity at zero of the Floquet resolvent with a local expansion of the form $\sum_{j,l}p^{l/2}e^{ijA(-ip)^{-1/2}}$ for small $p$ when $\mathrm{Re}\,p \ge 0$, where $A = \pi b/2$. For sufficiently rapidly decaying potentials the exponentials would be absent. Their presence clearly makes the analysis at $p = 0$ of the Floquet resolvent more delicate and is responsible for the change in the large time asymptotic behavior of the wave function, from $t^{-3/2}$ to $t^{-5/6}$.

We introduce an extended parameter $X = (p^{1/2}, e^{iA(-ip)^{-1/2}})$ and prove analyticity of the solution $\hat y$ in $X$, whose $p$-counterpart is $p$ small, $\mathrm{Re}\,p \ge 0$, and similarly in regions near the special points $p \in i\omega\mathbb{Z}$. We reformulate the problem in terms of an integral operator $C$, defined in § 3.4, closely related to the Floquet resolvent, shown to be compact in a suitable space and analytic in a variable corresponding to $X$. Then, by the Fredholm alternative, $(I - C)^{-1}$ is meromorphic, and in fact analytic in $X$, since we show absence of eigenvalues of $I - C$ for any $p \in \overline{\mathbb{H}}$.

3.1. The Hilbert space H. Let $\mathcal{H}$ be the Hilbert space of sequences $Y = \{y_n\}_{n\in\mathbb{Z}}$, $y_n \in L^2(B_a)$, with $a > 1$, and with
$$\|Y\|^2 := \|Y\|_a^2 = \sum_{n\in\mathbb{Z}}(1 + |n|)^{4/3}\|y_n\|^2_{L^2(B_a)} < \infty.$$

Note 6. The properties of $(I - C)^{-1}$ as $\mathrm{Re}\,p_1 \to 0^+$ ensure that $Y(p_1, \cdot) \in \mathcal{H} \cup H^2$ and is locally integrable in $p_1$ along $i\mathbb{R}$. We then extend the stationary phase method to such a setting, cf. § 5.7, to evaluate, asymptotically for large $t$, the inverse Laplace transform of $\hat y$ on $i\mathbb{R}$ and obtain the ionization result and time decay estimates.

To show ionization we then have to rule out the existence of a point spectrum of the Floquet operator, that is the existence of nontrivial solutions of (14). We use the general criterion in Theorem 3 to show that, if there exists a nonzero solution to (14), then a subsequence of $\{w_n\}_{n\in-\mathbb{N}}$ would be singular at $x = 0$, in contradiction with Note 5, (i). To find the behavior of solutions for large $n$, we develop a WKB theory for infinite systems of ODEs and find the asymptotic behavior of $w_n$ in $n$ in detail. The formal WKB calculation of the behavior is straightforward algebra, relatively easy even in much more general settings, see § 5.9. Justifying the procedure is however delicate, and a good part of the paper is devoted to that; cf. § 4.11, § 4.12. The procedure of introducing an enlarged set of parameters with respect to which the solution is regular, when this does not hold in the original parameter, should also be applicable to other problems where complicated singularities arise.

3.2. Proof of Theorem 2. We show that $\hat y$ has a limit in $L^1_{loc}$ on $\partial\mathbb{H} = i\mathbb{R}$, where it is smooth except for possible poles and a discrete set of essential (but $L^1$) singularities. Poles are present iff the integral form (22) of (11) has nontrivial solutions in $\mathcal{H}$. There is sufficient decay in $p$ at infinity, so that, when poles are absent, the Riemann-Lebesgue


O. Costin, J. L. Lebowitz, S. Tanveer

lemma applies, implying that $y$ decays as $t \to \infty$, proving ionization, since $\psi_0(x)e^{-t}$ obviously goes to zero in this limit. More detailed analysis of the resolvent reveals the nature of the essential singularities at $p \in i\omega\mathbb{Z}$. Stationary phase analysis shows a $t^{-5/6}$ decay of the wave function if the initial condition is spatially compactly supported and contains only a finite number of spherical harmonics.

Proposition 7. Ionization holds for every $\psi_0 \in L^2$ iff it holds for any $\psi_0$ in a set densely spanning $L^2$.

Proof. We make use of the standard triangle inequality argument to estimate $U(t)\psi_0$, where $U(t)$ is the unitary operator associated to the Schrödinger evolution (2).

We choose $\psi_0$ in a dense set $C_c^\infty(\mathbb{R}^3)$, the smooth, compactly supported functions in $\mathbb{R}^3$. Define as usual the angular momentum operators
\[ -L^2 = \frac{\partial^2}{\partial\theta^2} + \frac{1}{\tan\theta}\frac{\partial}{\partial\theta} + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2} \quad\text{and}\quad L_z = -i\frac{\partial}{\partial\phi}. \]

Let Pl,m be the orthogonal projector on {φ : L2 φ = l(l + 1)φ, L z φ = mφ} for some m ∈ Z, |m|≤ l ∈ N ∪ {0}. Since P = I , we can now assume without loss of generality that l,m l,m ψ0 ∈ Pl,m Cc∞ (R3 ) if l and m are arbitrary. Likewise, if P(t, Ba ) decays like t −5/3 when ψ0 ∈ Pl,m Cc∞ (R3 ) , then the same decay rate clearly holds for any ψ0 given by a finite linear combination over (l, m) (but not, in general, for any ψ0 ∈ L 2 (R3 )). Further notations. As usual we write D = {z : |z| < }, D = D1 and we denote D+ = D ∩{z : arg z ∈ (−π/4, π/4)}. We also let I = i[−, 0], H+c = { p+c : p ∈ H}, α = { p : Re( p) ≥ 0, Im( p) = α} and for a set A, A\α = A\α . We denote D = H ∪ iR+ , and O(D) will denote some small open neighborhood of D. 3.3. Step 1. Compact operator reformulation. To investigate the analytic properties of ψˆ it is convenient to introduce a new operator Aβ which is a complex perturbation of HC , having no real eigenvalues. More precisely, define Aβ := HC −iβ( p)χ Ba (r )−i p; (with the understanding that A0 = HC −i p), (17) where a > 1 and β = β( p) =

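$\begin{cases} c > 0 & \text{if } \operatorname{Im} p \in [-\epsilon, p_c] \text{ and } \operatorname{Re} p \ge 0, \\ 0 & \text{otherwise.} \end{cases}$ (18)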

Here $\epsilon < \omega/2$ is small, as required in Proposition 17 below, and we choose $p_c$ so that $p_c/\omega \notin \mathbb{Z}$ and $p_c > -E_0 = b^2/4$, where $E_0$ is the ground state energy of the unperturbed atom.
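The cutoff (18) and the two constraints on $p_c$ can be sketched in a few lines of Python. This is only an illustration: the function names `beta` and `admissible_pc` and all numerical values are hypothetical, and the small parameter is assumed to be the $\epsilon < \omega/2$ of the preceding sentence.

```python
def beta(p, c, eps, pc):
    """Piecewise-constant cutoff of (18): c on the strip, 0 elsewhere."""
    if p.real >= 0 and -eps <= p.imag <= pc:
        return c
    return 0.0


def admissible_pc(pc, omega, b, tol=1e-12):
    """Check the two requirements on pc: pc/omega is not an integer and pc > b^2/4."""
    ratio = pc / omega
    return abs(ratio - round(ratio)) > tol and pc > b * b / 4.0


# Hypothetical sample values, chosen only for illustration.
omega, b, c, eps, pc = 1.0, 2.0, 2.0, 0.05, 1.3
inside = beta(complex(0.0, 0.1), c, eps, pc)    # Im p inside [-eps, pc], Re p >= 0
outside = beta(complex(0.0, -0.2), c, eps, pc)  # Im p below -eps
```

With these sample values the cutoff is active only on the strip, and a choice such as $p_c = 2\omega$ would be rejected because $p_c/\omega$ is an integer.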

Ionization of Coulomb Systems in R3


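Clearly $A_\beta$ is defined on $D(H_0)$ and $A_\beta^* = A_{-\beta} + ip^* + ip$. We rewrite (13) and (14) in the equivalent form
\[ \bigl(H_C - ip_1 + n\omega - i\beta(p)\chi_{B_a}(r)\bigr)\, y_n = y_n^0 - i\beta(p)\chi_{B_a}(r)\, y_n - \sum_{j\in\mathbb{Z}} \Omega_j(x)\, y_{n-j}, \tag{19} \]
\[ \bigl(H_C - ip_1 + n\omega - i\beta(p)\chi_{B_a}(r)\bigr)\, w_n = -i\beta(p)\chi_{B_a}(r)\, w_n - \sum_{j\in\mathbb{Z}} \Omega_j(x)\, w_{n-j}. \tag{20} \]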

We show next that

A−1 β

is analytic in p ∈ H\{ pc ∪ − }, and sufficiently regular on

iR. Since the parameter pc is artificial, the non-analyticity at pc ∪ − of A−1 β is not reflected in the actual solution yˆ , as discussed in Note 9.

Proposition 8. There exists an open neighborhood O of D\ pc ∪ − , not containing ∪ the origin 0, such that the operator Rβ = A−1 β exists and is analytic in p ∈ O\( pc − ). Furthermore, for any p for which Rβ exists, we have Rβ : L 2 (R3 ) → H 2 R3 . The proof is given in § 4.1.

3.4. Restriction to a ball B; Definition of C. To study ionization, we only need to know $y(t, x)$ for $x$ in a fixed (but arbitrary) ball $B_a \supset B_1$. Henceforth, to simplify the notation, we write $B_a = B$. We shall therefore need to study the properties of $\chi_B R_\beta \chi_B$. This sandwiched operator (which preserves information about $L^2(\mathbb{R}^3)$ through built-in boundary conditions on $\partial B$) is the one that we shall most often use below. We recall that $p = p_1 + in\omega$ and $\hat{y}(p_1 + in\omega, x) = y_n(p_1, x)$. Since $\Omega_j(x) = \chi_B \Omega_j(x)$, (13) implies that for $x \in B$,
\[ y_n(p_1, x) = \chi_B R_\beta\, y_n^0 + \chi_B R_\beta \chi_B \Bigl[ -i\beta\, y_n(p_1, x) - \sum_{j\in\mathbb{Z}} \Omega_j(x)\, y_{n-j}(p_1, x) \Bigr], \tag{21} \]
where we may assume that $B$ contains the support of $\psi_0(x)$, and therefore of $y_n^0$. Note that $R_\beta$ depends on $n$ through $p = in\omega + p_1$. Corresponding to (21), we obtain the homogeneous system:
\[ w_n(p_1, x) = \chi_B R_\beta \chi_B \Bigl[ -i\beta\, w_n(p_1, x) - \sum_{j\in\mathbb{Z}} \Omega_j(x)\, w_{n-j}(p_1, x) \Bigr]. \tag{22} \]
The elements of H will be denoted by capital letters, e.g. $\{y_n\}_{n\in\mathbb{Z}} =: Y$, $\{y_n^0\}_{n\in\mathbb{Z}} =: Y_0$. We define the operators T on $L^2(B)$ by $\{TY_0\}_n = \chi_B R_\beta\, y_n^0$, and C on H by
\[ \{CY\}_n = \chi_B R_\beta \chi_B \Bigl[ -i\beta\, y_n(p_1, x) - \sum_{j\in\mathbb{Z}} \Omega_j(x)\, y_{n-j}(p_1, x) \Bigr]. \]
Then, we rewrite (21) in the form
\[ Y = TY_0 + CY. \tag{23} \]


Note 9. We shall see that for any β satisfying (18), Eq. (23) has a unique solution in H (call it now Y (β) ). Thus, away from the artificial cuts, all these solutions coincide (since the domain of Y (0) corresponding to c = 0, contains all of the others). Hence, wherever some Y (β) has analytic continuation, so will Y (0) . The homogeneous system corresponding to (23) is given by w = Cw.

(24)

ˆ Note 10. We have shown (cf. § 1.4 and § 5.2) that ψ and the Laplace transform yˆ ( p, ·) = L ψ − ψ0 e−t exist for Re p > 0. The corresponding Y = {yn }, yn = yˆ (inω + p1 , ·), restricted to B, will therefore satisfy (23) for β = 0 when Re p1 > 0. It will be shown that (23) has a unique solution Y ∈ H for any Re p1 ≥ 0, and that Y is analytic in 1 limit on iR, with sufficient decay in n. The implied decay and p1 ∈ H and has an L loc regularity properties of yˆ ( p, ·) on iR show that L−1 yˆ + e−t ψ0 (the integration contour taken to be iR) equals ψ for x ∈ B. Proposition 11. (Asymptotic behavior of χ B Rβ χ B ). If Re p = 0 and |Im p| → ∞ (see Note 3), then χ B Rβ χ B = O(| p|−1/2 ) (recall that β = 0 if |Im p| is large). Moreover, for any > 0, χ B Rβ χ B is analytic in p in an open set containing −i(, ∞). For Im p → +∞, the | p|−1/2 decay rate follows from the spectral theorem since we are outside the spectrum, while for Im p → −∞, the rate is obtained using Mourre estimates [27], Theorem 6.1. The rest of the proof involving analyticity is given in § 4.3 and relies on an explicit representation of the resolvent for HC , see § 5.4. Using spherical symmetry, the explicit Green’s function could be avoided, but in view of possible future generalizations to non-spherical V (t, x), we prefer this more delicate approach. Lemma 12. We have TY0 ∈ H. The operators S := Y → { j∈Z j (x)yn− j ( p1 , x)}n∈Z and C are bounded in H. Proof. We note from Proposition 11 that Rβ = O | p|−1/2 for large p, i.e. O |n|−1/2 for large |n|, since p = inω + p1 . Therefore, from the expression of yˆn0 in (10), (1 + |n|)4/3 {TY0 }n 2 n∈Z

≤

⎛ (1 + |n|)

ψ0 L 2 + ψ0 L 2 χ B Rβ χ B

4/3 ⎝

|1 + p1 + inω|

n∈Z

⎞2

j L 2 ⎠ |1 + p1 + i(n − j)ω| j∈Z ⎡ ⎛ ⎞2 ⎤ j L 2 ⎢ ⎠ ⎥ ≤ C ⎣1 + (1 + |n|)6/7 ⎝ ⎦. (1 + |n − j|) +χ B Rβ χ B ψ0 L 2

n∈Z

j∈Z

Using (7.12) and (7.13) in [15] with γ = 6/7, the above is finite since (1 + | j|)3/7 j L 2 ≤ (1 + | j|) j L 2 < ∞. j∈Z

j∈Z


The proof that S is the same as that of Lemma 27 of [15], with γ = 43 , replacing absolute values by norms in x. Since Rβ is uniformly bounded (in the operator norm) and acts diagonally in n, C is bounded too. Lemma 13. Both TY0 and the operator C are analytic in p1 for p1 ∈ O H\ { pc + iωZ} ∪ {− + iωZ} ∪ {I + iωZ} . Proof. Propositions 8 and 11 imply that Rβ is analytic in p ∈ O\{ pc ∪ − } and in an open set containing −i(, ∞). Analyticity of TY0 and C follow from their definition (we note C is a norm limit of analytic operators: its restrictions to the subspaces with nonzero components for |n| ≤ N only). Remark 14. As shown later, (I − C) is invertible. Since the solution Y cannot depend on the arbitrary parameters and pc (see Note 9), the non-analyticity of C and TY0 for p1 ∈ { pc + iωZ} ∪ {− + iωZ} ∪ {I + iωZ} is not reflected in Y . Proposition 15. For Re p1 > 0 large enough, (23) has a unique solution in H. The inverse Laplace transform in p of yˆ ( p, x) =: yn ( p1 , x), where p = inω + p1 , solves the initial value problem (8) in B (see Note 10). The proof is given in § 4.2. 3.5. Step 2. Regularity of Rβ,l,m at p = 0 and of Cl,m at p1 = 0. Define Rβ,l,m = Pl,m Rβ and Cl,m = Pl,m C. Note 16. (Compactness versus regularity of Rβ,l,m ). The term −iβ χ B was introduced in § 3.2 to ensure that Rβ,l,m is bounded in H. Since −iβ χ B is localized in x, the shifts in the poles created by the point spectrum of HC are smaller as p → 0 (the size of the orbitals of the Hydrogen atom grows when the energy approaches zero.) The resulting integral operators have an essential singularity at p = 0. The factor χ B is needed to ensure compactness, simplifying the analysis. The poles of the resolvent Rβ,l,m accumulate at p = 0 from −H, along a curve tangent to the positive imaginary p-axis (see Note 58 in §5.6). As a result, while being uniformly bounded, Rβ,l,m is not continuous along the imaginary p line at zero but oscillates without limit. 
Boundedness of $\chi_B R_{\beta,l,m} \chi_B$ (which is not difficult to prove) does not ensure boundedness of the solution $Y$. However, we do have analyticity in an extended, two-dimensional, parameter. Let $\lambda := \sqrt{-ip}$ (with the usual branch of the square root, $\operatorname{Im}\lambda < 0$ if $p \in \mathbb{H}$) and let
\[ X := (p^{1/2}, Z) \quad\text{with}\quad Z = e^{i\pi b/(2\lambda)}. \tag{25} \]
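The behavior described in Note 16 can be seen numerically from (25). The following Python sketch is only an illustration, with a hypothetical function name `extended_parameter` and a hypothetical value of $b$; it evaluates $Z$ with the principal branch of the square root, which gives $\operatorname{Im}\lambda < 0$ for $\operatorname{Re} p > 0$ and real positive $\lambda$ on the positive imaginary $p$-axis.

```python
import cmath
import math


def extended_parameter(p, b):
    """Return X = (p^(1/2), Z) with Z = exp(i*pi*b / (2*lambda)), lambda = sqrt(-i p)."""
    # Principal branch: Im(lambda) < 0 for Re(p) > 0, lambda > 0 for p on i*(0, inf).
    lam = cmath.sqrt(-1j * p)
    Z = cmath.exp(1j * math.pi * b / (2.0 * lam))
    return cmath.sqrt(p), Z


b = 1.0  # hypothetical Coulomb strength
# |Z| along the positive imaginary axis, approaching p = 0
moduli_on_axis = [abs(extended_parameter(1j * s, b)[1]) for s in (1.0, 0.1, 0.01)]
# |Z| at a few points with Re p > 0
moduli_inside = [abs(extended_parameter(p, b)[1]) for p in (0.1 + 0.2j, 1e-3 + 0j, 0.5 - 0.3j)]
```

On the positive imaginary axis $|Z| = 1$ while the phase $\pi b/(2\lambda)$ grows without bound as $p \to 0$, so $Z$ is bounded but has no limit there; for $\operatorname{Re} p > 0$ one finds $|Z| < 1$.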

(The dependence of Z on λ reflects the actual behavior of the solution.) The resolvent is analytic in X and a useful Fredholm alternative can be applied. For any a > 1 we can choose a c in (18) (see § 4.4 below) such that the following statement holds.


Proposition 17 (Analyticity in X ). χ B Rβ,l,m χ B is analytic with respect to X on the compact set D+ × D.2 The proof is given in §4.4. As a corollary, we have the following regularity property of Cl,m and Tl,m Y0 . Let S− = { p : Im p ∈ [−, ω − ), Re p ≥ 0}.

(26)

√ 1/2 Corollary 18. For p1 ∈ S− , define X 1 = p1 , Z 1 , where Z 1 = exp iπ b/ 2 −i p1 . Then, Tl,m Y0 and Cl,m are analytic in X 1 on the compact set D+ × D. Proof. Note first that Propositions 8 and 11 and the relative arbitrariness in the choice of pc and imply that χ B Rβ χ B is analytic in p in a neighborhood of p = inω, n ∈ Z\{0}. Since for large |n|, χ B Rβ χ B = χ B R0 χ B , its expression as an integral operator involving G in (49) (see Note (28) as well) implies a lower bound of the analyticity radius independent of n. For sufficiently small for any n ∈ Z, including n = 0, then, analyticity of χ B Rβ χ B in the expanded variable iπ b p − inω, exp √ (2 −i( p − inω)) follows in the domain iπ b 1/2 ≤1 | p − inω| ≤ ; exp √ (2 −i( p − inω)) (since Proposition 17 gives analyticity X ∈ D+ ×D). Analyticity of Cl,m in X 1 ∈ D+ ×D now follows since Cl,m is the norm limit of analytic operators (the restrictions of Cl,m to the subspaces of H with zero components for |n| > N ). (See Proposition 11 for the necessary estimates of decay in N .) The analyticity of Tl,m Y0 follows from its definition. 3.6. Compactness. Proposition 19. Cl,m is compact in H (cf. Note 2) for p1 ∈ H. The proof is given in §4.5.

3.7. Step 3. The Fredholm alternative. We can now formulate the ionization condition using the Fredholm alternative.

Proposition 20. If (24) has no nontrivial solution in H for p1 ∈ S− , then $(I - C_{l,m})^{-1}$ exists and the system ionizes (cf. (3)).

2 As usual, by analyticity in a compact set, we mean analyticity in some open set containing the compact set. Analyticity in D+ × D of course implies that $\chi_B R_{\beta,l,m} \chi_B$ is given by a convergent double series in $p^{1/2}$ and $e^{i\pi b/(2\lambda)}$.


The first part is simply the Fredholm alternative. Ionization follows from the following proposition. We recall that yˆ ( p1 + inω, x) = yn are the components of Y . Proposition 21. Assume (24) has no nontrivial solution when p1 ∈ H. Then, for ψ0 ∈ Pl,m C0∞ (R3 ) , the solution Y ∈ H to (23) is analytic in p1 ∈ H\{iωZ} and analytic with respect to X 1 in D+ × D. In particular Y is bounded at p1 = 0. These properties imply sufficient regularity and decay of yˆ ( p, x) so that the integration contour in L−1 yˆ can be taken to be iR. By the Riemann-Lebesgue lemma, P(t, B) → 0 as t → ∞. The proof is given in § 4.6. 3.8. End of proof of Theorem 2. It only remains to make the connection with Floquet theory. This is done in § 4.8. 3.9. Proof of Theorem 3. Equation (14), restricted to B, follows from the homogeneous system w = Cw. Multiplying (14) by wn , summing over n, and integrating over Ba˜ , where a˜ ∈ (1, a], we are lead to a nonnegative definite quantity involving wn |∂Ba˜ being zero for n < 0. Details are given in § 4.7. 3.10. Proof of ionization for spherically symmetric , Theorem 1. We consider the case V (t, x) = 2(r ) sin ωt, corresponding to 1 = −i and −1 = i ( is real valued). The proof in the slightly more general case 2(r ) sin (ωt − θ ) amounts to replacing t by t − θ/ω and ψ0 (x) by ψ (θ/ω, x) in our proof. Recall Cl,m = Pl,m C.3 We obtain by projection of (23) to Pl,m L 2 (R3 ) , Y = Y0 + Clm Y.

(27)

The homogeneous equation associated to (27) is w = Clm w, w ∈ H.

(28)

The Fredholm alternative applies and (27) has a unique solution in H iff (28) implies w = 0.

Note 22. By separation of variables in spherical coordinates, we see that $C_{l,m}$ can be defined in the same way as C, replacing $A_\beta$ in (17) by
\[ A_{\beta,r} = -\frac{d^2}{dr^2} - \frac{2}{r}\frac{d}{dr} + \frac{l(l+1)}{r^2} - \frac{b}{r} - ip_1 + n\omega - i\beta\chi_B \tag{29} \]
and the associated differential-difference systems are obtained by replacing $-\Delta - b/r$ with $-\frac{d^2}{dr^2} - \frac{2}{r}\frac{d}{dr} + \frac{l(l+1)}{r^2} - \frac{b}{r}$. Clearly if there exists a nontrivial solution $w \in$ H of (28), then, again by elliptic regularity (see Proposition 8), $v$ defined by $w = Y_{l,m}\, v(r)$, where $Y_{l,m}$ are the spherical harmonics, is a nontrivial solution to
\[ A_{\beta,r}\, v_n = -i\Omega\,(v_{n+1} - v_{n-1}) - i\beta\chi_B v_n, \quad\text{implying}\quad A_{0,r}\, v_n = -i\Omega\,(v_{n+1} - v_{n-1}). \tag{30} \]

3 As discussed, it suffices to show ionization on a dense subset of initial conditions.


Proposition 23. If v satisfies (30), then there exists n ≥ 0 such that either (i) vn (1) = 0; or (ii) vn (1) = 0, but vn (1) = 0; let n 0 be the smallest √ such n. By homogeneity, we can assume that vn 0 (1) = 1 in case (i) and vn 0 (1) = − (1) in case (ii) (we use the positivity of ). The proof is given in § 4.10. Definition. We define τ to be 0 or 1 in case (i) and 1 in case (ii) respectively. 3.11. Asymptotic behavior of vn in (30) as n → −∞. In view of Proposition 21 we see that (30) holds the necessary ionization information. 3.11.1. Notation. Let

s(r ) :=

1

(ρ)dρ (r ∈ (0, 1)).

(31)

r

By assumption > 0 is smooth and then so is s. Let n 0 − k, 0 = (0), 0 = (0), s0 = s(0), α = Denote

H0 (ζ ) :=

2 ζ 1/2 e ζ K l+1/2 (ζ ); G 0 (ζ ) = π

√ 2 0 , ζ = αkr. s0

(32)

π ζ 1/2 e ζ Il+1/2 (ζ ), 2

where K l+1/2 and Il+1/2 are the modified Bessel functions of order l + 1/2. It follows that for small ζ , H0 (ζ ) ∼ 2−l ζ −l (2l)!/l!. Let Hˇ (ζ ; k, l) be the unique solution of the integral equation ζ e−2s G 0 (s)R(H0 + k −1 Hˇ )(s)ds Hˇ (ζ ; k, l) = G 0 (ζ ) 0

−H0 (ζ )

ζ

e−2s H0 (s)R(H0 + k −1 Hˇ )(s)ds

(33)

kα

for ζ ∈ [0, kα], where the operator R is defined by 0 (1 + 2ζ ) τ ω bf f − . + (R f ) (ζ ) = 2 − 2 + 2α 4α0 2 αζ Define H (ζ ) = H (ζ ; k, l) := H0 (ζ ) + k −1 Hˇ (ζ ; k, l). It can be checked that H satisfies 0 (1 + 2ζ ) τ ω l(l + 1) b H H + + + − H = 2 1 − 2kα 2 4kα0 2k ζ2 αζ k

(34)


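with the following asymptotic condition$^4$
\[ H(\zeta) \sim 1 + \frac{l(l+1)}{2\zeta} + \frac{b}{2k\alpha}\log\zeta + O\!\left(\frac{\log\zeta}{k\zeta}, \frac{1}{\zeta^2}\right) \qquad (\zeta, k \to \infty,\ \zeta \ll k\alpha). \tag{35} \]

Remark 24. (i) $H(\zeta; k, l) \sim \sqrt{2/\pi}\; \zeta^{1/2} e^{\zeta} K_{l+1/2}(\zeta)\,(1 + o(1))$ as $k \to \infty$.
(ii) From the expression (33) for $\check{H}$ it is seen that as $\zeta \to 0$ we have $\check{H}(\zeta; k, l) \sim \mathrm{const}\cdot\zeta^{-l+1}$ for $l \ne 1$ and $\check{H} \sim \mathrm{const} + \mathrm{const}\cdot\zeta\log\zeta$ for $l = 1$. For $\tau = 0$ or $1$, $\check{H}$ is less singular than $H_0$ at $\zeta = 0$.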
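The small-$\zeta$ behavior of $H_0$ quoted in § 3.11.1 and the first two terms of (35) can be checked against the standard finite-sum formula for modified Bessel functions of half-integer order, $K_{l+1/2}(\zeta) = \sqrt{\pi/(2\zeta)}\, e^{-\zeta} \sum_{k=0}^{l} \frac{(l+k)!}{k!\,(l-k)!} (2\zeta)^{-k}$. The Python sketch below is only an illustration; it assumes the normalization $H_0(\zeta) = \sqrt{2/\pi}\, \zeta^{1/2} e^{\zeta} K_{l+1/2}(\zeta)$, for which $H_0 \to 1$ as $\zeta \to \infty$, and the function names are hypothetical.

```python
from math import factorial


def H0(zeta, l):
    """H0 as a finite sum, assuming H0 = sqrt(2/pi) * zeta^(1/2) * e^zeta * K_{l+1/2}(zeta)."""
    return sum(
        factorial(l + k) / (factorial(k) * factorial(l - k)) * (2.0 * zeta) ** (-k)
        for k in range(l + 1)
    )


def H0_small_zeta(zeta, l):
    """Leading small-zeta behavior 2^(-l) * zeta^(-l) * (2l)! / l!."""
    return 2.0 ** (-l) * zeta ** (-l) * factorial(2 * l) / factorial(l)


ratio_small = H0(1e-4, 2) / H0_small_zeta(1e-4, 2)       # close to 1 for small zeta
large_zeta_gap = H0(1e3, 3) - (1.0 + 3 * 4 / (2 * 1e3))  # remainder beyond 1 + l(l+1)/(2 zeta)
```

The term $k = l$ of the sum reproduces the stated singular behavior at $\zeta = 0$, and the terms $k = 0, 1$ give $1 + l(l+1)/(2\zeta)$, in agreement with the $k$-independent part of (35).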

Define

1

m k (r ) =

s2k+τ 4 (1)H (αkr ) 1

(2k + τ )! 4 (r )H (αk)

exp

ω 4

1

r

s2k+τ s(s)ds Fk (r ). := √ (2k + τ )! (s)

(36)

Note 25. From standard properties of the modified Bessel function $K_{l+1/2}$, it follows that for large enough $k$, $H(\alpha k r)$ is continuous and nonzero for $r \in (0, 1]$ and that as $r \to 0$, $H(\alpha k r)$ is singular as $r^{-l}$. Therefore for any $k$ sufficiently large $u_k \equiv r^l m_k$ has a finite nonzero limit as $r \to 0^+$.

Definition 26. With $n_0$ as in Proposition 23, we define $h_k(r)$ by
\[ w_{n_0-k} = \frac{i^k}{r}\, m_k(r)\, h_k(r)\, Y_{l,m}. \tag{37} \]

Theorem 4. (Behavior as $k \to +\infty$ (i.e. $n \to -\infty$)) For any sufficiently large $k$, $u_k := r^l m_k(r)$ is continuous in $r \in (0, 1]$ and $u_k(r) \to \mathrm{const} \ne 0$ as $r \to 0^+$. Furthermore, if there is a nontrivial solution to (30), then there exists a subsequence $k_j \to \infty$ such that for any $r \in [0, 1]$,
\[ \lim_{j\to\infty} h_{k_j}(r) = 1. \tag{38} \]

The first part follows simply from Note 25. The rest of the proof is given in § 4.11.

Proposition 27. There is no nonzero solution of (28) in H.

Indeed, Theorem 4 shows that otherwise $(r^{l+1} v_n)(0) = m_n \ne 0$ for a subsequence of $n < 0$. This implies that the corresponding $w_n(x) \sim m_n r^{-l-1} Y_{l,m}$ for $r = |x| \to 0$. This singularity is incompatible with $w_n \in H^2$ (see Proposition 8). Thus there is no admissible solution of the homogeneous system and the first part of Theorem 1 follows from Theorem 2 (i). See also the remarks in § 5.10. The result on the decay rate follows from the type of essential singularities of $\hat{y}$ for $p \in i\omega\mathbb{Z}$; see § 4.6.

4 As is common, the notation $O(a, b) \equiv O(|a| + |b|)$; similarly $O(a, b, c) \equiv O(|a| + |b| + |c|)$.


4. Proofs of Intermediate Steps 4.1. Proof of Proposition 8. As mentioned in § 3.3, Aβ and A∗β are adjoints of eachother. They are furthermore densely defined and hence (see, e.g [28], Theorems 5.28 and 5.29, p. 168), closed. Once we show that Aβ ( p) is invertible in D, analyticity of Rβ in O\( pc ∪ − ) follows (the spectrum of the closed operator HC − iβ χ B (r ) is a closed set). Analyticity holds wherever Aβ is analytic, [31], Vol. 1, Theorem VIII.2, p. 254). (1) Eigenvalues. We first show below that no i p ∈ iD is an eigenvalue of HC − iβ χ B (r ). Assume we had Aβ ψ = 0. If β = 0, A0 ψ = 0 implies i p ∈ σ p (HC ), but, by construction, these values of p correspond to the region where β = 0. So we can assume β > 0. Then ψ, (− − i p − br −1 )ψ + ψ, −iβ χ B ψ = 0.

(39)

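Taking the imaginary part of (39) we get
\[ \operatorname{Re} p\, \langle\psi, \psi\rangle + c\,\langle\psi, \chi_B\psi\rangle = \operatorname{Re} p\, \langle\psi, \psi\rangle + c\,\langle\chi_B\psi, \chi_B\psi\rangle = 0. \tag{40} \]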

If Re p > 0 this immediately implies ψ = 0. If Re p = 0 we get χ B ψ = 0. But χ B ψ = 0 implies 0 = Aβ ψ = A0 ψ. In spherical coordinates the equation A0 ψ = 0 becomes a system of ordinary differential equations d2 l(l + 1) 2 d −1 − 2− + − br − i p ψn,l,m = 0. (41) dr r dr r2 Since χ B ψ = 0, the solution of (41) vanishes identically on [0, a]; but then, by standard arguments the solution is identically zero. If Im p ∈ / [−, pc ], with p ∈ D, then Aβ = A0 , and we are, by construction, outside the spectrum of A0 , and thus A0 ψ = 0 implies ψ = 0. (2) The range of Aβ is dense. Indeed, the opposite would imply5 Ker(A∗β ) = 0, which leads to the same contradiction as in Step 1 (note that A∗β is simply Aβ with the signs of β and Re p changed at the same time). (3) For any p ∈ D there is an > 0 such that Aβ ψ > ψ. (a) If Re p > 0 and ψ = 1, then Aβ ψ Aβ ψ, ψ = |A0 ψ, ψ − icχ B ψ, χ B ψ| |Re pψ, ψ| ≥ Re p. (42) (b) Let now Re p = 0, and assume Im p is between two eigenvalues of −HC , the distance to the nearest being δ > 0. To get a contradiction, assume that ψ j = 1 and Aβ ψ j = j → 0. Then j = Aβ ψ j Aβ ψ j , ψ j = A0 ψ j , ψ j − icχ B ψ j , χ B ψ j (43) c χ B ψ j , χ B ψ j → 0, thus χ B ψ j → 0, and by the definition of Aβ and A0 we get A0 ψ j → 0,

(44)

which is impossible, since our assumption and (44) imply noninvertibility of HC − i p while i p is outside the spectrum of HC . 5 [28], p. 267.


(c) In the last case, Re p = 0, Im p ∈ σ p (−HC ); then if we assume there is a sequence ψ j , ψ j = 1 such that Aβ ψ j → 0 as j → ∞ we get Aβ ψ j |Aβ ψ j , ψ j | = A0 ψ j , ψ j − icχ B ψ j , χ B ψ j (45) cχ B ψ j , χ B ψ j → 0. Since A0 ψ j ≤ Aβ ψ j + cχ B ψ j , (45) implies A0 ψ j → 0. On the other hand, with P the orthogonal projection on the finite dimensional eigenspace of HC corresponding to the eigenvalue i p, we have A0 P = 0 ⇒ A0 = A0 (I − P) and then since A0 ψ j → 0, A0 (I − P)ψ j → 0.

(46)

But by definition $A_0$ is invertible on $(I - P)L^2(\mathbb{R}^3)$ and (46) then implies $(I - P)\psi_j \to 0$, i.e. $\|P\psi_j - \psi_j\| \to 0$. Since $\|\psi_j\| = 1$, $\|P\psi_j\| \to 1$. Then $P\psi_j$ is a bounded sequence in the finite dimensional space $PL^2(\mathbb{R}^3)$, hence we can extract a convergent subsequence, which we may without loss of generality assume to be $P\psi_j$ itself, $P\psi_j \to \psi$, $\|\psi\| = 1$, and also $\psi_j - P\psi_j \to 0$, so that $\psi_j \to \psi$; thus $P\psi = \psi$. Therefore, $A_0\psi = A_0 P\psi = 0$. Also, since multiplication by $c\chi_B$ is a bounded operator we have $c\chi_B\psi_j \to c\chi_B\psi = 0$, since $c\chi_B\psi_j \to 0$. Therefore, $\|A_\beta\psi\| \le \|A_0\psi\| + \|c\chi_B\psi\| = 0$, in contradiction to the absence of eigenvalues.

(4) Definition of the inverse. This is standard: we let $\psi \in D(A_\beta)$, $A_\beta\psi = \phi$ and define $R_\beta\phi = \psi$. This is well defined since $A_\beta\psi_1 = A_\beta\psi_2$ entails, by Step 1, $\psi_1 = \psi_2$. By Step 2, $R_\beta$ is defined on a dense set. By Step 3, for any $p$ there is an $\epsilon > 0$ such that $\|R_\beta\| < \epsilon^{-1}$. Thus $R_\beta$ extends by density to $L^2(\mathbb{R}^3)$ and by construction $A_\beta R_\beta\phi = \phi$ whenever $R_\beta\phi \in D(A_\beta)$. Conversely, if $\phi \in D(A_\beta)$ and $A_\beta\phi = u$, then $R_\beta u = \phi$, entailing $R_\beta A_\beta\phi = \phi$ on the dense set $D(A_\beta)$.

For the regularity of $R_\beta$ in $x$, we first note that if we define $Q = (I - \Delta)^{-1}$, we have the following identity:
\[ R_\beta = Q\left[1 - \left(\frac{b}{r} + i\beta\chi_B + ip + 1\right) Q\right]^{-1}. \tag{47} \]
It is clear that if $\phi \in L^2(\mathbb{R}^3)$, then $Q\phi \in H^2(\mathbb{R}^3)$ and so $(b/r + i\beta\chi_B + ip + 1)\, Q\phi \in L^2$. Therefore, from (47), $R_\beta : L^2(\mathbb{R}^3) \to H^2(\mathbb{R}^3)$.

4.2. Proof of Proposition 15. The shift operator S, defined by $(SY)_j = y_{j+1}$, is quite straightforwardly shown to be bounded in H: the proof of Lemma 27 in [15] goes through without changes. By the second resolvent identity we have $R_\beta = (1 - i\beta R_0\chi_B)^{-1} R_0$. Since $-\Delta - br^{-1}$ is self-adjoint, we have by the spectral theorem, for some $C > 0$ independent of $p$,
\[ \|(-\Delta - br^{-1} - ip)^{-1}\|_{L^2(\mathbb{R}^3)} \le C\,(\operatorname{Re} p)^{-1}, \tag{48} \]
and thus $\|R_\beta\|_{L^2(B)} \le C_1 (1 + |\operatorname{Re} p|)^{-1}$. Since $R_\beta$ is diagonal (in $n$) and S is bounded (cf. Lemma 12), we have $\|C\|_H \le C_2 (1 + |\operatorname{Re} p_1|)^{-1}$. Thus $\|C\|_H$ is small for large $\operatorname{Re} p_1$, and therefore $(I - C)Y = Y^{(0)}$ has a unique solution $Y \in$ H and the proof follows.
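The mechanism behind this uniqueness statement is the convergence of the Neumann series when the norm of C is below one. The following Python sketch illustrates it on a finite-dimensional toy problem; the 2 × 2 matrix, the right-hand side and the function names are hypothetical stand-ins and have nothing to do with the actual operator C on H.

```python
def mat_vec(C, y):
    """Apply the matrix C (list of rows) to the vector y."""
    return [sum(c_ij * y_j for c_ij, y_j in zip(row, y)) for row in C]


def solve_by_neumann(C, f, iterations=200):
    """Iterate Y <- f + C Y; the iteration converges when the norm of C is below 1."""
    y = list(f)
    for _ in range(iterations):
        Cy = mat_vec(C, y)
        y = [f_i + cy_i for f_i, cy_i in zip(f, Cy)]
    return y


# Toy data: a small-norm matrix in the role of C and a vector in the role of the right-hand side.
C = [[0.2, 0.1], [0.0, 0.3]]
f = [1.0, 2.0]
Y = solve_by_neumann(C, f)
residual = max(abs(y_i - f_i - cy_i) for y_i, f_i, cy_i in zip(Y, f, mat_vec(C, Y)))
```

The fixed point of the iteration is the unique solution of $(I - C)Y = f$, and the residual measures how well the computed $Y$ satisfies that equation.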


4.3. Proof of Proposition 11. Proof. The estimate χ B R0 χ B = O( p −1/2 ) is shown right after the statement of Proposition 11. We now consider the analyticity of Rβ in an open set on the imaginary p axis for Im p < −. There, β = 0 and χ B Rβ χ B = χ B R0 χ B is manifestly analytic from its representation as an integral operator, whose kernel G is given below (see [26] and Appendix §5.4 √for details). With k = i p (using the principal branch of the square root), and ν = b/(2k), G(x, x ; k) ik(η − ξ )I (−ikξ )J (−ikη) − k 2 ξ η[I (−ikξ ) J˙(−ikη) − J (−ikη) I˙(−ikξ )] = (1 − iν)(1 + iν) ik

e 2 (ξ +η) × , 4π |x − x |

(49)

where ξ = |x| + |x | + |x − x |, η = |x| + |x | − |x − x |, i∞ i∞ I (z 1 ) = e−z 1 t t −iν (1 + t)iν dt, I˙(z 1 ) = − e−z 1 t t 1−iν (1 + t)iν dt 0

1

J (z 2 ) =

e z 2 t t −iν (1 − t)iν dt; J˙(z 2 ) =

0

(50) (51)

0 1

e z 2 t t 1−iν (1 − t)iν dt.

0

Further properties of function G are discussed in §5.4. √ Note 28. Note that (49) still holds for p ∈ iR+ , with k = i p, with the choice + arg k = π/2 for p ∈ iR , and with the upper limits i∞ in (51) replaced by +∞. 4.4. Proof of Proposition 17. The function f = Rβ,l,m χ B g is the solution of the equation l(l + 1) b d2 2 d 2 + + λ − 2− − − ic f = g; r a, dr r dr r2 r (52) d2 l(l + 1) b 2 d 2 − 2− + − + λ f = 0; r > a, dr r dr r2 r √ such that f decays at infinity, is regular at the origin and C 1 at r = √ a. We note λ = −i p is in the closure of the fourth quadrant for Re p 0. We let α = λ2 − ic, κ1 = b/(2α), κ = b/(2λ), µ = 2l + 1 and define (in terms of the Whittaker functions M and W 6 ) m 1 (s) := s −1 Mκ1 ,µ/2 (2αs); w1 (s) := s −1 Wκ1 ,µ/2 (2αs); w2 (s) := s −1 Wκ,µ/2 (2λs). 6 See [9], pp. 60, Eq. (1) and pp. 63, Eq. (5).

(53)


For r > a we have f = Bw2 (r ) since r −1 Mκ,µ/2 (2λr ) grows with r as r → ∞. For r a we must have f = Am 1 + f 0 ,

(54)

where, using standard results about the Wronskian of M and W, see [9], pp. 25 and [1], pp 505, 508, we have r 2α(1 + µ) χ [0,a] (s)s 2 m 1 (s)g(s)ds f = w (r ) 0 1 1 1 2 + 2 µ − κ1 0 a s 2 w1 (s)χ [0,a] (s)g(s)ds. (55) +m 1 (r ) r

The integral representations of the functions M and W entail immediately that the functions f 0 , f and Mκ1 ,µ/2 (2αr ) depend analytically on λ for small λ. Continuity of f and f at a 1 imply that A defined in (54), is given by A=

f 0 (a)w2 (a) − f 0 (a)w2 (a) . m 1 (a)w2 (a) − m 1 (a)w2 (a)

(56)

In § 5.5 it is shown that that A is analytic in (λ, exp [iπ b/(2λ)]) in a domain corresponding to λ small in the closure of the fourth quadrant, if a and √ c are chosen

large √ enough. It follows that resolvent Rβ,m,n is analytic in X for X = p, exp iπ b/ 2 −i p ∈ + D × D for small . 4.5. Proof of Proposition 19. By adding and subtracting 1 from Aβ and using the second resolvent formula, whenever everything is well defined, we have

χ B A−1 χ B = : χ B Rβ χ B = χ B (− + 1)−1 χ B β −χ B Rβ (−br −1 − iβ χ B − 1 − i p)(− + 1)−1 χ B .

(57)

The Green’s function for $-\Delta + 1$ is
\[ G(x, y) = \frac{1}{4\pi}\, \frac{e^{-|x-y|}}{|x-y|}. \tag{58} \]
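As a quick sanity check of (58), one can verify numerically that the radial profile $G(r) = e^{-r}/(4\pi r)$ is annihilated by $-\Delta + 1$ away from the origin. The Python sketch below is only an illustration using central finite differences and the radial form of the Laplacian; the function names and step size are hypothetical choices.

```python
import math


def G(r):
    """Radial profile of the Green's function (58), with r = |x - y| > 0."""
    return math.exp(-r) / (4.0 * math.pi * r)


def residual(r, h=1e-4):
    """Finite-difference value of (-Delta + 1) G at radius r > 0 (radial Laplacian)."""
    d1 = (G(r + h) - G(r - h)) / (2.0 * h)
    d2 = (G(r + h) - 2.0 * G(r) + G(r - h)) / (h * h)
    return -(d2 + 2.0 * d1 / r) + G(r)


residuals = [residual(r) for r in (0.5, 1.0, 2.0)]
```

The residuals vanish up to discretization error; the delta function at $x = y$ is not captured by this pointwise check.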

Now if φ j L 2 (B) 1 then the functions f j = (− + 1)−1 χ B φ j are seen by straightforward calculation to be equicontinuous on the one point compactification of R3 . A subsequence, without loss of generality assumed to be the f j ’s themselves, converges in L 2 (R3 ) as well (to a function with exponential decay, since there is a δ1 > 0 small enough and independent of j so that eδ1 |x| (− + 1)−1 χ B φ j ) is also equicontinuous on the compactification of R3 ). In particular, χ B (− + 1)−1 χ B is compact. Now f j converge in the sup norm with weight eδ1 |x| , and thus (−br −1 − iβ χ B − 1 − i p) f j converge in L 2 (R3 ). Since Rβ is bounded, compactness of χ B Rβ χ B follows. By Proposition 11, and the previous argument, C is a norm limit of compact operators (the truncations of C to the subspaces of H with vanishing components for |n| > N ). Therefore, Cl,m = Pl,m C is also compact.


4.6. Proof of Proposition 21 and final estimates for Theorem 1. If (24) has no nontrivial −1 solution for any p1 ∈ H, then compactness of Cl,m implies that I − Cl,m exists. Lemma 13, Corollary 18 and Proposition 11 give the analytic and continuity properties −1 of Cl,m and Tl,m Y0 . Analyticity of I − Cl,m in X 1 for X 1 ∈ D+ × D, follows in a standard way from analyticity of Cl,m and the second resolvent formula, A−1 − B −1 = B −1 (B − A)A−1

(59)

(see § 5.3). The same resolvent identity can be applied to show analyticity of (I −Cl,m )−1 with respect to p1 in a neighborhood of

H\ ( pc + iωZ) ∪ (− + iωZ) ∪ (I + iωZ) . −1 Hence, the solution Y = I − Cl,m Tl,m Y0 is analytic for p1 ∈ H\{iωZ}, since , β and pc are artificially introduced parameters the value of which cannot affect Y , since Y0 is independent of these choices (see Remark 14.) The function yˆ ( p; x) = yn ( p1 , x), with p = inω + p1 , is analytic in p for p ∈ iR\iωZ and by analyticity of Y in X 1 , boundedness at p = iωZ follows. In particular, as p → inω from the right half-plane, yˆ ( p, x) is analytic in the extended variable iπ b 1/2 ( p − inω) , exp √ . (60) 2 −i( p − inω) The regularity properties of Y in p and the decay properties in |n| of its components yn for large |n|, simply stemming from Y ∈ H, imply that y(t, x) can be expressed as an inverse Laplace transform of yˆ ( p, x) on iR. We now show that P(t, B) = ψ0 (x)e−t + y(x, t)2L 2 (B) ≤ 2e−2t ψ0 2L 2 (B) +2y(x, t)2L 2 (B) → 0 as t → ∞. We note that ∞ ∞ d x|y(t, x)|2 = dx eit (s−s ) yˆ (is, x) yˆ (is , x)dsds B B −∞ −∞ ∞ ∞ i s˜ t = e yˆ (i s˜ + is , x) yˆ (is , x)d x ds d s˜ . −∞

−∞

(61)

B

So, in order to show ionization, it suffices from the Riemann-Lebesgue Lemma to show that ∞ yˆ (is , x) yˆ (is + i s˜ , x)d x ds −∞

B

is in L 1 (d s˜ ). This follows from Cauchy-Schwarz, since

∞ −∞

∞

−∞

B

2 | yˆ (is , x)|| yˆ (is +i s˜ , x)|d x ds d s˜≤ yˆ (is , ·) L 2 (B) ds . R

(62)


However, R

yˆ (is , ·) L 2 (B) ds =

0

!

ω

n∈Z ω

≤C 0

ω

≤C 0

yn (iq, ·) L 2 (B) dq (1 + |n|)4/3 yn (iq, ·)2L 2 (B) dq

n∈Z

Y (iq, ·)2H dq < ∞,

(63)

since Y is bounded in p1 = iq for q ∈ [0, ω]. Since yˆ ( p1 +inω, x), n ∈ Z, is analytic in the variable (60), standard stationary phase analysis (see Appendix §5.7 ) shows that y(t, x) = O(t −5/6 ), and hence P(t, B) = O(t −5/3 ) as t → ∞. 4.7. Proof of Theorem 3. Since (14) (restricted to B) follows from the homogeneous system w = Cw (see also Proposition 8 for the necessary regularity), we look for a nontrivial solution of (14) in H. We multiply (14) by wn , integrate over the ball Ba˜ (of radius a˜ ∈ (1, a]), sum over n (this is legitimate since w ∈ H) and take the imaginary part of the resulting expression. Noting that

j (x)wn− j wn =

j,n∈Z

− j w n− j wn =

j,n∈Z

=

j w n+ j wn

j,n∈Z

j (x)wm wm− j

(64)

j,m∈Z

so the sum (64) is real, we get from (14), " # 2 0 = Im +i p1 |wn (x)| d x + d xw n wn n∈Z Ba˜

= +Re p1

Ba˜ n∈Z

n∈Z

1 |wn (x)|2 d x + 2i Ba˜

"

∂Ba˜

# w n ∇wn − wn ∇w n · n d S.

(65)

n∈Z

It is convenient to decompose wn using spherical harmonics. We write wn = Rn,l,m (r )Ylm (θ, φ).

(66)

l 0,|m|l

The last integral in (65), including the prefactor, then equals % $ i R n,m,l Rn,m,l − R n,m,l Rn,m,l − a2 2 n∈Z m,l

i W[R n,m,l , Rn,m,l ], = − a2 2 n∈Z m,l

(67)


where $W[f, g]$ is the Wronskian of $f$ and $g$. On the other hand, we have outside of $B_{\tilde{a}}$,
\[ \Delta w_n + br^{-1} w_n + (ip_1 - n\omega)\, w_n = 0, \tag{68} \]
and then by (66), the $R_{n,l,m}$ satisfy for $r > a$ the equation
\[ R'' + \frac{2}{r} R' + br^{-1} R - \frac{l(l+1)}{r^2} R = (-ip_1 + n\omega)\, R, \tag{69} \]
where we have suppressed the subscripts. Let $g_{n,l,m} = r R_{n,l,m}$. Then for the $g_{n,l,m}$ we get
\[ g'' - \left(\frac{l(l+1)}{r^2} - ip_1 + n\omega - br^{-1}\right) g = 0. \tag{70} \]
Thus
\[ \overline{R}\, R' = \frac{\overline{g}\, g'}{r^2} - \frac{|g|^2}{r^3} \tag{71} \]
and
\[ r^2\, W[\overline{R}, R] = W[\overline{g}, g] =: W_n. \tag{72} \]
Multiplying (70) by $\overline{g}$, and the conjugate of (70) by $g$ and subtracting, we get for $r > a$,
\[ W_n' = -i(p_1 + \overline{p_1})\,|g|^2 = -2i\,|g|^2 \operatorname{Re} p_1. \tag{73} \]
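The algebra leading to (73) can be checked pointwise: writing (70) as $g'' = q\,g$, the derivative of $W[\overline{g}, g] = \overline{g}\,g' - \overline{g}'\,g$ equals $(q - \overline{q})\,|g|^2$, and only the term $-ip_1$ of $q$ has a nonzero imaginary part. The Python sketch below is only an illustration, with hypothetical function names and sample values, and with the Wronskian convention $W[f, g] = f g' - f' g$ assumed.

```python
def q_coefficient(r, p1, n, omega, b, l):
    """Coefficient q(r) in g'' = q g, read off from (70)."""
    return l * (l + 1) / r**2 - 1j * p1 + n * omega - b / r


def wronskian_derivative(g, q):
    """d/dr of W[conj(g), g] = conj(g) g' - conj(g') g, after substituting g'' = q g."""
    d2g = q * g
    return g.conjugate() * d2g - d2g.conjugate() * g


# Hypothetical sample values (any r > a, complex p1 and complex g would do).
p1 = 0.3 + 0.7j
g = 1.5 - 0.4j
q = q_coefficient(r=2.0, p1=p1, n=-3, omega=1.1, b=2.0, l=1)
lhs = wronskian_derivative(g, q)
rhs = -2j * p1.real * abs(g) ** 2
```

The two sides agree to rounding error for any choice of the sample values, which is the content of (73); in particular $W_n$ is constant in $r$ when $\operatorname{Re} p_1 = 0$.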

Remark 29. Direct estimates using the Green’s function representation (49) imply that wn (x) =

e−κn r cn (θ, φ) + O(r −1 ) as r → ∞ b r 1+ 2κn

with cn (θ, φ) independent of r and with κn = −i p1 + nω (when Re p1 > 0, κn is in the fourth quadrant when n < 0).

(74)

(75)

(i) We first take Re p1 > 0, to illustrate the argument. Using (74) we get b

g ∼ Ce−κn r r − 2κn (1 + o(1)) as r → ∞.

(76)

There is a one-parameter family of solutions of (70) satisfying (76) and the asymptotic expansion can be differentiated [42]. We assume, to get a contradiction, that there exist n < 0 for which g = gn = 0. For these n we have, using (76), differentiability of this asymptotic expansion and the definition of κn that 1 lim |gn |−2 Wn = −Im κn > 0. 2i r →∞

(77)

It follows from (73) and (77) that Wn /(2i) is strictly positive for all r > a (by monotonicity and positivity at infinity) and all n for which gn = 0. This implies that the last term in (65) is a sum of nonnegative terms which shows that (65) cannot be satisfied nontrivially. (ii) Re p1 = 0. For n < 0, we use Remark 29 (and differentiability of the asymptotic expansion as in Case (i)) to calculate Wn in the limit r → ∞: Wn = 2i|cn |2 |κn |(1+o(1)).


Since for Re p1 = 0, Wn is constant, cf. (73), it follows that Wn = 2i|cn |2 |κn ||gn |2 exactly. Thus, (65) cannot be non-trivially satisfied, implying that wn (x) = 0 for all n < 0 and |x| = r = a˜ ∈ (1, a].

(78)

For $\tilde a > r > 1$ (where $V(t, x) = 0$) we have $O w_n = 0$, where $O$ is the elliptic operator $-\Delta - b/r - i p_1 + n\omega$. The proof that $w_n(x) = 0$ for $r > 1$ then follows immediately from (78), by standard unique continuation results [17,23] (in fact, $O$ is analytic hypo-elliptic). See also Note 5.

4.8. Connection with the Floquet operator. It is easy to check that the discrete time-Fourier transform of the eigenvalue equation for the Floquet operator, Eq. (5), $K v = \phi v$, with $p_1 = i\phi$, coincides with (14), the differential version of the homogeneous equation associated to (23). Now, (78) shows that a solution of (14) is an eigenvector of $K$. In the opposite direction, the existence of a Floquet eigenfunction entails failure of ionization, since it implies the existence of a solution of (2) for which the absolute value is time-periodic.

4.9. Differential equation for w. We seek to show that the only solution to the homogeneous system

$$w = C_{l,m}\, w \tag{79}$$

in the space $H$ is $w = 0$. Since $w$ is piecewise $C^2$ (see Note 5), (79) implies that the components of $w = \{Y_{l,m}\, r^{-1} g_n(r)\}_{n\in\mathbb{Z}}$ satisfy the differential-difference system (see Note 5):

$$\frac{d^2}{dr^2}\, g_n - \left(-b r^{-1} + n\omega - i p_1 + \frac{l(l+1)}{r^2}\right) g_n = i\,\Omega\,(g_{n+1} - g_{n-1}). \tag{80}$$

First, we notice that for $n < 0$, Theorem 2 implies that $g_n(r) = 0$ for $r \ge 1$. Thus $g_n(1) = 0$, $g_n'(1) = 0$ for all $n < 0$.

4.10. Proof of Proposition 23. The gist of the proof is that contractive mapping arguments show that if the statement were false, then the solution would vanish.

Lemma 30. If $Y \ne 0$, then there exists some $n_0 \ge 0$ so that either $g_{n_0}(1) \ne 0$ or $g_{n_0}'(1) \ne 0$. (As before, in the sequel, we shall define $n_0$ to be the smallest such integer.)

Proof. To get a contradiction, assume the statement is false. Since the functions $w_n$ are in the domain of the operator (see Note 5), then, in particular, for any $n$, $g_n$ is continuous in $r$. Thus, the set $Z_n := \{r : g_n(r) = 0\}$ is closed and so is the (possibly empty) left connected component of 1 in $Z_n$, call it $K_n$. Let

$$K = \bigcap_{n\in\mathbb{Z}} K_n.$$

Assume to get a contradiction that $K$ is nonempty: let then $K = [a, 1]$. If $a = 0$, then $Y \equiv 0$, since $g_n(1) = 0$, $g_n'(1) = 0$ imply $g_n(r) = 0$ for $r > 1$. Then $Y \ne 0$ implies


O. Costin, J. L. Lebowitz, S. Tanveer

$a > 0$. We first take $0 < a < 1$. We write the differential equation for $g_n(r)$ in integral form and use the conditions $g_n(a) = 0 = g_n'(a)$, since $g_n$ vanishes on $[a, 1]$:

$$e^{\sqrt{n\omega}\,r}\, g_n(r) = \int_r^a \frac{1 - e^{-2\sqrt{n\omega}(s-r)}}{2\sqrt{n\omega}}\; e^{\sqrt{n\omega}\,s} \left[\left(\frac{l(l+1)}{s^2} - i p_1 + \tilde V(s)\right) g_n(s) - i\,\Omega(s)\bigl(g_{n-1}(s) - g_{n+1}(s)\bigr)\right] ds. \tag{81}$$

Consider the Banach space of sequences $\{g_n(r)\}_{n=-\infty}^{\infty}$ in the norm

$$\sup_{n\in\mathbb{Z},\; r\in[a-\epsilon,\,a]} \left| e^{\sqrt{n\omega}\,r}\, g_n(r) \right|.$$

It is easy to see that the rhs of (81) is a contractive mapping if $\epsilon$ is small enough, and then $g_n(r) = 0$ for $r \in [a - \epsilon, a]$, contradicting the definition of $a$. The same is true if $a = 1$, since $g_n(1) = 0$ and $g_n'(1) = 0$ would imply, with the same proof as before, that $g_n = 0$ for $r \in [1 - \epsilon, 1]$, for some $\epsilon > 0$, contradicting the definition of $a$.

4.11. Proof of Theorem 4. For a heuristic discussion see § 5.9. The proof is by rigorous WKB. The fact that there are two competing potentially large variables, $k$ and $1/r$, makes it necessary to rigorously match two regimes. First, note that (37) implies

$$g_{n_0-k}(r) = i^k\, m_k(r)\, h_k(r). \tag{82}$$

We need a few more preliminary results.

Lemma 31. For any $\epsilon_1 > 0$, there exists $C_3 > 0$ independent of $k$ and $\epsilon_1$ so that for $k \ge k_0 = C_3\,\epsilon_1^{-1}$, and for $r \in [\epsilon_1, 1]$,

$$\sup_{\epsilon_1 \le r \le 1} |h_k'| \le C_4\, k_0 \left(\frac{k_0}{k}\right)^{1/2}, \tag{83}$$

where $C_4$ is independent of $\epsilon_1$ and $k$. The proof of Lemma 31 is given in §4.13.

Definition 32. For fixed $\epsilon$, we define $L_\epsilon = \alpha C_3 (2 C_4 C_3/\epsilon)^2$, with $C_3$ and $C_4$ defined in Lemma 31, and $\zeta = \alpha k r$, where $\alpha$ is given in (32). We will take $\epsilon$ small enough so that $L_\epsilon \ge C_3\alpha$. Finally, in what follows, $c_*$ is a positive "generic" constant, the value of which is immaterial.

Lemma 33. For $\epsilon > 0$ small enough and $k\alpha r = \zeta \in [L_\epsilon, k\alpha]$, we have

$$|h_k(r) - 1| \le \epsilon. \tag{84}$$

The proof of Lemma 33 is given in § 4.14.


Definition 34. Let $\tilde h_k(\zeta) = h_k(\zeta/(\alpha k))$.

Lemma 35. For any small $\epsilon > 0$, there exists a subsequence $S = \{\tilde h_{k_j}\}_{j\in\mathbb{N}}$ that converges to a continuous function $\tilde h$ for $\zeta \in [0, L_\epsilon]$. For the limiting function $\tilde h(\zeta)$, we have $|\tilde h(\zeta) - 1| \le 4\epsilon$ for $\zeta \in [0, L_\epsilon]$.

The proof of this proposition is given in § 4.15.

Proposition 36. For any $r \in [0, 1]$, $\lim_{j\to\infty} h_{k_j}(r) = 1$.

Proof. From Lemma 35 and Lemma 33 it follows that for any $r \in [0, 1]$ and any $\epsilon > 0$ we have $\lim_{j\to\infty} |h_{k_j}(r) - 1| \le 4\epsilon$.

The proof of Theorem 4 now follows from the definition of $h_k$ in (36), Remark 24, Note 5 and Proposition 36.

4.12. Further results on $g_{n_0-k}$ and $h_k$.

Lemma 37. For any $j, k \in \mathbb{N} \cup \{0\}$ we have, at $r = 1$, i.e. at $s = 0$,

$$\frac{\partial^{j+\tau}}{\partial s^{j+\tau}}\, g_{n_0-k}\Big|_{s=0} = \delta_{j,2k}\, i^k \quad \text{for } 0 \le j \le 2k.$$

Proof. In case (i) (corresponding to $\tau = 0$), note that (80) may be rewritten, cf. (31), as

$$(g_{n_0-k})_{ss} - \frac{\Omega'}{2\Omega^{3/2}}\,(g_{n_0-k})_s + \frac{Q_k}{\Omega}\, g_{n_0-k} = i\left[g_{n_0-k+1} - g_{n_0-k-1}\right], \tag{85}$$

where

$$Q_k = \frac{b}{r} + (k - n_0)\omega + i p_1 - \frac{l(l+1)}{r^2}. \tag{86}$$

Since $g_{n_0-k}(1) = 0 = g_{n_0-k}'(1)$ for all $k \ge 1$, while $g_{n_0}(1) = 1$, the statement follows from (85) for any $0 \le j \le 2$, if $2k \ge j$. Assuming the statement holds for some $j \ge 2$ for $2k \ge j$, we prove it for $(j+1)$ for $2k \ge (j+1)$. Taking $(j-1)$ derivatives in $s$ of (85) at $s = 0$, we obtain

$$\frac{\partial^{j+1}}{\partial s^{j+1}}\, g_{n_0-k} = i\,\frac{\partial^{j-1}}{\partial s^{j-1}}\, g_{n_0-(k-1)} - i\,\frac{\partial^{j-1}}{\partial s^{j-1}}\, g_{n_0-(k+1)} + L,$$

where $L$ is a linear combination of derivatives of $g_{n_0-k}$ up to order $j$, which are all zero since $2k \ge (j+1) > j$. The first two terms on the rhs give a contribution of $i\, i^{k-1}\delta_{(j-1),2(k-1)} + 0$, since $2k \ge (j+1)$ implies $2(k-1) \ge (j-1)$ and $2(k+1) > (j-1)$, completing the inductive step.

In case (ii) (corresponding to $\tau = 1$): since $g_{n_0}(1) = 0$ and $g_{n_0-k}(1) = 0 = g_{n_0-k}'(1)$ for all $k \ge 1$, it follows from (85) that $g_{n_0-k}'' = 0$ for all $k \ge 1$, implying the conclusion for $j = 0$ and $j = 1$. By taking an additional derivative of (85) with respect to $s$ and evaluating at $s = 0$, we obtain

$$\frac{\partial^3}{\partial s^3}\, g_{n_0-k} = i\,\delta_{2,2k}\,\frac{\partial}{\partial s}\, g_{n_0}\Big|_{s=0} = i\,\delta_{2,2k}\,\frac{g_{n_0}'(1)}{-\sqrt{\Omega(1)}} = i\,\delta_{2,2k},$$

so the statement holds for $j = 2$ and any $k$ with $2k \ge j$. The rest of the proof is very similar to that for $\tau = 0$.


Let $\psi_{1,k}$, $\psi_{2,k}$ be two independent solutions of $L_k\psi = 0$, and

$$W_k = \psi_{1,k}(r)\,\psi_{2,k}'(r) - \psi_{2,k}(r)\,\psi_{1,k}'(r), \tag{87}$$

where

$$L_k\psi = \psi'' + Q_k\psi. \tag{88}$$

From the form of the equation we see that $W_k$ is independent of $r$.

Lemma 38. For $n = n_0 - k$, $k \ge 1$, the system (80) is equivalent to

$$g_{n_0-k}(r) = i\int_r^1 \Omega(s)\left[g_{n_0-k+1}(s) - g_{n_0-k-1}(s)\right] G_k(r, s)\,ds, \qquad k \ge 1, \tag{89}$$

where

$$G_k(r, s) = W_k^{-1}\left[\psi_{1,k}(r)\,\psi_{2,k}(s) - \psi_{2,k}(r)\,\psi_{1,k}(s)\right]. \tag{90}$$
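The two facts used around Lemma 38, namely that $W_k$ is constant and that the kernel $G_k$ inverts $L_k$ with zero data at $r = 1$, can be verified in a few lines. The following is only a sketch: $F$ is a shorthand introduced here (not in the paper) for the right-hand side obtained when (80) is written as $L_k g_{n_0-k} = F$.

```latex
% W_k is constant because L_k has no first-order term:
\begin{aligned}
W_k' &= \psi_{1,k}\,\psi_{2,k}'' - \psi_{2,k}\,\psi_{1,k}''
      = -Q_k\,\psi_{1,k}\psi_{2,k} + Q_k\,\psi_{2,k}\psi_{1,k} = 0, \\
% Properties of the kernel defined in (90):
G_k(r,r) &= 0, \qquad
\partial_r G_k(r,s)\big|_{s=r} = -1, \qquad
\partial_r^2 G_k(r,s) = -Q_k(r)\,G_k(r,s), \\
% Hence g(r) = \int_r^1 F(s)\,G_k(r,s)\,ds satisfies
g'(r) &= \int_r^1 F(s)\,\partial_r G_k(r,s)\,ds, \\
g''(r) &= F(r) - Q_k(r)\,g(r), \qquad g(1) = g'(1) = 0 .
\end{aligned}
```

So $g'' + Q_k g = F$ with vanishing Cauchy data at $r = 1$, which is the content of the variation of parameters step.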

Proof. The proof simply follows from variation of parameters and the two boundary conditions at $r = 1$, $g_{n_0-k}(1) = g_{n_0-k}'(1) = 0$.

Definition 39. Define

$$j_k = s\,\frac{L_k m_k - \Omega\, m_{k-1}}{m_k}. \tag{91}$$

Lemma 40. For $k \ge 1$, there exist constants $C_1$, $C_2$ and $c_*$, independent of $k$, so that for any $r \in (0, 1]$ we have $|j_k| \le c_*$. For $r \ge \frac{1}{k}$, we have $|j_k'(r)| \le C_1/(k r^2) + C_2$.

Proof. In the Appendix, (253), we obtain an explicit expression for $j_k$. Routine asymptotics for large $k$ in different regimes of $r \in (0, 1]$, discussed in the Appendix §5.8, show that $k^2 j_k^{(2)} + k j_k^{(1)} = O(1)$ in all cases and hence $j_k = O(1)$. In fact, as $r \to 0$ and $k \to \infty$ with $\zeta = k\alpha r = O(1)$ fixed, we have $j_k \to g(\zeta)$, where $g(\zeta)$ is bounded. Also, taking the $r$-derivative of $j_k$ for $r = O(1)$ not small, we get $j_k'(r) = O(1)$. When $r \ll 1$, the asymptotics in the regime $\frac{1}{k} \ll r \ll 1$ gives $j_k = O(\zeta^{-1}) = O(1/(kr))$. Since the asymptotics is differentiable, we have $j_k'(r) = O(1/(k r^2))$. Finally, we look at $\zeta = O(1)$, $\zeta \ge 1$. Since $\frac{d}{dr} j_k = k \frac{d}{d\zeta} j_k \sim k\, g'(\zeta)$, where $\zeta^2 g'(\zeta)$ is bounded for all $\zeta$, it follows that $|j_k'(r)| \le C_1/(k r^2) + C_2$ for $r \ge 1/k$.

Lemma 41. For $k \ge 1$, $h_k(r)$ defined in (82) satisfies the system of differential equations

$$h_k'' + 2 h_k'\,\frac{m_k'}{m_k} + \left(\Omega\,\frac{m_{k-1}}{m_k} + \frac{j_k}{s}\right) h_k = \Omega\,\frac{m_{k-1}}{m_k}\, h_{k-1}(r) + \Omega\,\frac{m_{k+1}}{m_k}\, h_{k+1}(r), \tag{92}$$

and the system of integral equations (89) is equivalent, for $k \ge 1$, to

$$h_k(r) = \frac{1}{m_k(r)}\int_r^1 \Omega(s)\, m_{k-1}(s)\, G_k(r, s)\, h_{k-1}(s)\,ds + \frac{1}{m_k(r)}\int_r^1 \Omega(s)\, m_{k+1}(s)\, G_k(r, s)\, h_{k+1}(s)\,ds =: A_k h_{k-1} + H_k h_{k+1}. \tag{93}$$


Proof. This simply follows by substituting $g_{n_0-k}(r) = i^k m_k(r) h_k(r)$ into (80) and (89), and using

$$\frac{m_k''}{m_k} + Q_k = \frac{L_k m_k}{m_k} = \Omega\,\frac{m_{k-1}}{m_k} + \frac{j_k}{s},$$

in turn a consequence of Lemma 40.

Remark 42. Let now r ∈ [ˆ , 1], where ˆ C2 k −1 for sufficiently large C2 independent of k. It is convenient to rewrite Ak and Hk in (93) in terms of s (see (3.11.1)). Furthermore, changing the variable of integration from s to t = s(s)/s(r ), we obtain 1 [Ak h k−1 ](s) = (2k + τ )(2k + τ − 1) t 2k−2+τ Tk (s, t)h k−1 (st)dt, (94) 0

where, using (36), we get Tk (s, t) = and

√ (r (st))Fk−1 (r (st) G k (r (s), r (st)) sFk (r (s))

1 s3 (r (st)t 2k+2+τ [Hk h k+1 ](s) = (2k + 2 + τ )(2k + 1 + τ ) 0 Fk+1 (r (st) × G k (r (s), r (st))h k+1 (st)dt. Fk (r (s))

(95)

(96)

In evaluating Ak for large k, it is useful to calculate the Taylor expansion of Tk (s, t) and its s derivative at t = 1. To do so, we first note that (r ) Fk−1 ∂ Tk (r )Fk−1 (r ) Fk−1 (r ) ∂G k = − − G (r, r ) − (r, r ), (97) k ∂t 2(r )Fk (r ) Fk (r ) Fk (r ) ∂r where, simplify notation, we wrote r (s) = r and r (ts) = r and used ∂r s(r ) = √ to − (r ). From (87) and (90) we get G k (r, r ) = 0 and ∂r G k (r, r ) = 1 at r = r ; (97) implies ∂ Tk Fk−1 (r ) . (98) =− ∂t t=1 Fk (r ) Using (97), taking an additional derivative with respect to t, using also (86) and (88) to see that ∂r r G k = −Q k G k , we obtain (r ) 2ξ Fk−1 ∂ 2 Tk ξ (r ) Fk−1 (r ) + . (99) = √ ∂t 2 t=1 Fk (r ) 23/2 (r ) (r )Fk−1 (r ) A similar calculation can be carried out for the third derivative. We only write down the potentially largest term in the regime kr ≥ C2 (for large k and small r ) ξ 2 Fk−1 (r ) 1 l(l + 1 ξ 2 Fk−1 (r )Q k (r ) ∂ 3 Tk = + O 1, kω − = ∂t 3 t=1 (r )Fk (r ) kr 3 (r )Fk (r ) r2 1 (100) +O 1, 3 . kr


Note that if kr is sufficiently large, (35) gives Fk−1 (r ) H (α(k − 1)r )H (αk) = = 1 + O(k −2 r −1 ) Fk (r ) H (αkr )H (α(k − 1)

(101)

and (r ) Fk−1

l(l + 1) +O 2αkr 2

1 k 2r 3

. (102) Fk−1 (r ) √ Note also that (32) implies α − 2 (r )/ (s(r )) = O(r ) for small r . Including all terms that become important when r is small, we note that in the regime when kr is sufficiently large, we have 2 k f2 (1 − t)2 (1 − t)3 − Tk = (1 − t) + − f 1 + 2 4 r 3 k 4 3 3 2 (1 − t) (1 − t) (1 − t) (1 − t) (1 − t)2 (1 − t) +O , , , (103) , , r3 kr 3 r kr k 2r 3 k 2r 2 ∂ Tk k f3 (1 − t)2 3 = − f1 + 3 (1 − t) − ∂s 4 r 3 k 4 3 3 (1 − t) (1 − t) (1 − t) (1 − t)2 (1 − t)2 (1 − t) , (104) +O , , , , , 2 2 r4 kr 4 r2 kr 2 k 2r 4 k r ∼−

where ωs2 , (r (s)) l(l + 1)s2 , f 2 (s) = 4 l(l + 1)s2 f 3 (s) = . 23/2 f 1 (s) =

(105) (106) (107)
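One way to see why the prefactor $(2k+\tau)(2k+\tau-1)$ in (94) is the natural normalization: if $T_k$ is replaced by its leading term $(1-t)$ from (103) and $h_{k-1}$ by 1, the integral in (94) becomes an elementary Beta integral and the product equals exactly 1. The sketch below checks this in exact rational arithmetic. The function names are illustrative and not from the paper, and all correction terms in (103) are ignored.

```python
from fractions import Fraction


def moment(k, tau, p):
    """Exact value of (2k+tau)(2k+tau-1) * int_0^1 t^(2k-2+tau) (1-t)^p dt.

    Uses the Beta integral int_0^1 t^n (1-t)^p dt = p! / ((n+1)(n+2)...(n+p+1)).
    """
    n = 2 * k - 2 + tau
    integral = Fraction(1)
    for i in range(1, p + 1):
        # Accumulates p! / ((n+1)(n+2)...(n+p)) one factor at a time.
        integral *= Fraction(i, n + i)
    integral /= (n + p + 1)
    return (2 * k + tau) * (2 * k + tau - 1) * integral


def leading_term(k, tau):
    """Value of A_k[1] when T_k is replaced by its leading term (1 - t)."""
    return moment(k, tau, 1)
```

With `p = 2` the same computation gives `2/(2k+tau+1)`, so each additional power of $(1-t)$ costs roughly one power of $k$; this is why the higher-order terms in (103) have to be weighed against the powers of $k$ and $r$ that multiply them.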

When r ∈ [0, ], ˆ for ˆ = C2 /k, it is sometimes more convenient to express Ak in terms of ζ = kαr . For that purpose, we define ! 1 r (0) 4 1 s(0)−s ωs(r ) − log exp dr √ Q(ζ ) = −2k log 1 − , (108) s(0) (r ) 4 0 (r ) where we recall the relation (31) between s and r = ζ /(kα), ζ ∈ [0, kα]. A series expansion in k −1 leads to 3 2 ω ζ2 (0) ζ ζ (0) ζ . (109) + 1 + + O − , Q(ζ ) = ζ − k 2α 2 4(0)α 4k α(0) k2 k2 We choose ˆ1 = C˜ 2 k −1 log k, for some k-independent C˜ 2 (chosen more precisely later). We define δˆ1 , dependent of r , so that (1 − δˆ1 )s(r ) = s(1 ).

(110)


From (31), it follows that for sufficiently large C˜ 2 we have δˆ1 1 −

(5 + l) log k s(1 ) . s() (4k + 2τ )k

(111)

It follows from the definition of Ak in (93) that for r ∈ [0, ], ˆ i.e. ζ ∈ [0, kα ˆ ], kα ˆ1 a H (η(1−k −1 ))

1 Ak h k−1 (ζ ) = e−Q(η)+Q(ζ ) 1+ G(ζ, η)h k−1 (η(1−k −1 ))dη k H (ζ ) ζ +(2k + τ )(2k + τ − 1) 1 2k−2+τ Fk−1 (r ) s(r ) h k−1 (r )dr × (r )G k r, r 2 s(r ) s Fk (r ) ˆ1 =: [A0k h k−1 ](ζ ) + [A1k h k−1 ](r ),

(112)

where G(ζ, η) is defined by G(ζ, η) = kαG k (r (ζ ), r (η)), while

(113)

a1 (η, ζ ) H (αk) τ τ − 1 s2 (0)(η/(kα)) = 1+ 1+ k H (α(k − 1)) 2k 2k s2 (η/(kα))(0) τ s(η/(kα)) − 1, × s(ζ /(kα))

while for large k and 0 < ζ ≤ η ≤ ˆ1 α we have 2 (0) τ η η 1 η + (ζ − η) + O , . a1 (η, ζ ) = τ − + 1 + 2 α(0) 2 k k

(114)

(115)

Similarly, for kαr = ζ ∈ (0, k ˆ1 α), we define b1 (η, ζ ) H (αk) s2 (η/(kα))(η/(kα)) = k H (α(k + 1)) s2 (0)(0)

s(η/(kα)) s(ζ /(kα))

τ

− 1.

(116)

We then have [Hk h k+1 ] (ζ ) =

(0)s2 (0) α 2 k 2 (2k + 2 + τ )(2k + 1 + τ ) kα ˆ1 b1 H (η(1+k −1 )) −Q(η)+Q(ζ ) 1+ G(ζ, η)h k+1 (η(1+k −1 ))dη × e k H (ζ ) ζ s2 (2k + 2)(2k + 1 + 2τ ) 1−δˆ1 Fk+1 (r (st)) h k+1 (st)dt × (r (st))G k (r (s), r (st)) t 2k+2+τ Fk (r (s)) 0 +

=: [Hk0 h k+1 ] + [Hk1 h k+1 ].

(117)


Lemma 43. For k 2 and k1 ∈ {k − 1, k, k + 1} we have

(1) If r ∈ (0, 1) and s ∈ (r, r + δ), where δ ≤ min C2 k −1 log k, 1 − r , then G k (r, s) Fk1 (s) c∗ , ∂ G k (r, s) Fk1 (s) < c∗ k 1/2 . Fk (r ) k 1/2 ∂r Fk (r ) (2) If r ∈ (0, 1), δ ≤ C2 k −1 log k with r + δ < 1, then for s ∈ (r + δ, 1), G k (r, s) Fk1 (s) < c∗ k l/2−1/2 , ∂ G k (r, s) Fk1 (s) < c∗ k l/2+1/2 . Fk (r ) ∂r Fk (r ) Proof. It suffices to find bounds for G k (r, s)H (αk1 s)/H (αkr ) since the other functions involved are regular everywhere for r, s ∈ [0, 1], see (36). We first consider k → +∞. It is easily verified that G(ζ, η), defined in (113), is the Green’s function (see (86), (88) ) for l(l + 1) ω b L := → − + + (118) + 2 2 [i p1 − n 0 ω] 2 2 ζ k α αζ k α and is given by G(ζ, η) := kαG k (r (ζ ), r (η)) =

1 (ζ )2 (η) − 2 (ζ )1 (η) , W

(119)

where 1 , 2 are two independent solutions of L = 0 and W = 1 (ζ )2 (ζ ) − 2 (ζ )1 (ζ ) is their constant Wronskian. Standard asymptotic results show there exist √ two independent solutions 1 , 2 such that for large k, we have uniformly in z ∈ [0, ωk], √ ω 2l l! π z Yl+1/2 (z) ; where z = ζ = ωkr, 1 ∼ − (120) 2 (2l)! 2 α k 2−l−1 (2l + 2)! π z Jl+1/2 (z). (121) 2 ∼ (l + 1)! 2 √ √ The Wronskian W is asymptotic, for large k, to (2l + 1) ω/ α 2 k. The expressions (120) and (121) may also be used to determine the asymptotics of 1 and 2 . Using (119), (120), (121) and (36) and the bounds on W , with l1 = l + 21 it follows that ' k1 H α z ω Fk1 (s) c∗ |zz |1/2 G (r, s) (z)J (z ) − J (z)Y (z )] [Y l l l l ' 1 1 1 1 , (122) F (r ) k k 1/2 k k H α ωz √ √ √ where z = η ω/ α 2 k = ωks. A similar bound holds for ∂ Fk1 (s) ∂r F (r ) G k (r, s) . k We now prove part (1). We break this case up into two subcases: (a) r ∈ [k −2/3 , 1] and (b) r ∈ [0, k −2/3 ]. In case (a), we note that s ∈ [r, r + δ] implies s/r and therefore


$1 \le z'/z = O(1)$. The function $H$ in (122) is close to 1 because its argument is large. Furthermore, note that $\sqrt{z}\,Y_{l+1/2}(z)$ and $\sqrt{z}\,J_{l+1/2}(z)$ are bounded for large $z$, while they are asymptotic to constant multiples of $z^{-l}$ and $z^{l+1}$ for small $z$. Using (122), part 1 of the lemma follows by inspection in case (a). For case (b), (122) further simplifies since $z$, $z'$ are small and

$$\frac{H(k_1\eta/k)}{H(\zeta)}\, G_k(r(\zeta), r(\eta)) = \frac{H(k_1\eta/k)}{k\alpha\, H(\zeta)}\, G(\zeta, \eta) \sim \frac{H(k_1\eta/k)}{k\alpha\, H(\zeta)(2l+1)}\left[\eta^{l+1}\zeta^{-l} - \zeta^{l+1}\eta^{-l}\right]. \tag{123}$$

When $\zeta \in [\log k, \alpha k^{1/3}]$ and $\eta \in [\zeta, \zeta + \alpha k\delta]$, we have $1 \le [\eta/\zeta]^l \le c_*$ and therefore

$$\left|\frac{H(k_1\eta/k)}{H(\zeta)}\, G_k(r(\zeta), r(\eta))\right| = \left|\frac{H(k_1\eta/k)}{k\alpha\, H(\zeta)}\, G(\zeta, \eta)\right| \le \frac{c_*}{k^{1/2}}.$$

The same inequality holds if $\zeta \in [0, \log k]$, where $\eta \in [\zeta, (C_2 + 1)\log k]$, since in this regime $\zeta^{-l}/H(\zeta)$ is bounded and the logarithmic growth in $k$ of terms involving $\eta$ can be bounded by, say, $k^{1/2}$, while for small $\eta$, $\eta^l H(k_1\eta/k)$ is bounded. The bounds on derivatives follow in a similar manner using $\frac{d}{dr} = k\alpha\frac{d}{d\zeta}$. Part 2 (which is only relevant for $r + \delta \le 1$) follows similarly on careful inspection of (122), from the asymptotic behavior in different regimes of $z$ and $z'$.

Lemma 44. Let $r \in (0, \hat\epsilon]$, with $\hat\epsilon_1 = \frac{C_2}{k}\log k$. We choose $C_2$ large enough so that $\frac{s(\hat\epsilon_1)}{s(r)} = (1 - \delta_1) \le 1 - \frac{(5+l)\log k}{4k + 2\tau}$. Then

$$|[A_k^1 f](r)| \le c_*\, k^{l/2-1/2}(1 - \delta_1)^{2k-2+\tau}\|f\|_\infty \le c_*\, k^{-3}\|f\|_\infty$$

and

$$\left|\frac{d}{dr}[A_k^1 f](r)\right| \le c_*\, k^{l/2+1/2}(1 - \delta_1)^{2k-2+\tau}\|f\|_\infty \le c_*\, k^{-2}\|f\|_\infty.$$

Proof. Consider $A_k^1$ given by (112). We note that $s^{-2}\Omega(s)$ and its $r$-derivative are bounded, while $G_k(s, r)F_k(s)/F_k(r)$ and its $r$-derivative are bounded by $c_* k^{l/2-1/2}$ and $c_* k^{l/2+1/2}$ respectively for any $\tau$ (cf. Lemma 43). Further, $|s(s)/s(r)| \le (1 - \delta_1)$, and from (111) we have

$$(1 - \delta_1)^{2k-2+\tau} \le \frac{c_*}{k^{l/2+5/2}},$$

and the lemma follows.
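The exponent $l/2 + 5/2$ in the proof of Lemma 44 comes directly from the choice of $\delta_1$. As a rough numerical sanity check, assuming the reading $\delta_1 = (5+l)\log k/(4k+2\tau)$ of the condition in the lemma (taken with equality), the product $k^{l/2+5/2}(1-\delta_1)^{2k-2+\tau}$ stays bounded as $k$ grows. The helper name below is illustrative and not from the paper.

```python
import math


def decay_ratio(k, l, tau):
    """Return k^(l/2 + 5/2) * (1 - delta1)^(2k - 2 + tau).

    Assumption: delta1 = (5 + l) * log(k) / (4k + 2*tau), i.e. the bound in
    Lemma 44 taken with equality.
    """
    delta1 = (5 + l) * math.log(k) / (4 * k + 2 * tau)
    if not 0.0 < delta1 < 1.0:
        raise ValueError("delta1 must lie in (0, 1); take k larger")
    # Work with logarithms to avoid underflow for large k.
    log_ratio = (l / 2 + 2.5) * math.log(k) \
        + (2 * k - 2 + tau) * math.log1p(-delta1)
    return math.exp(log_ratio)
```

Since $\log(1-\delta_1) \le -\delta_1$, the logarithm of the ratio is at most $(5+l)\log k/(2k+\tau)$, which tends to 0; the ratio is therefore bounded by a constant independent of $k$, playing the role of $c_*$.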

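Remark 45. Since for $r \in (0, \hat\epsilon]$ the bound in Lemma 44 on $A_k^1$ is $O(k^{-2})$, we will see later that $A_k$ is dominated by $A_k^0$ (defined in (112)) as $k \to \infty$.

Lemma 46. Define $G_0(\zeta, \eta) = \lim_{k\to\infty} G(\zeta, \eta)$ and $H_0(\zeta) = \lim_{k\to\infty} H(\zeta)$, where $\zeta, \eta \ll k^{1/2}$ as $k \to \infty$. Then,

$$\int_\zeta^\infty e^{-\eta+\zeta}\, G_0(\zeta, \eta)\,\frac{H_0(\eta)}{H_0(\zeta)}\,d\eta = 1, \tag{124}$$

$$\int_\zeta^\infty e^{-\eta+\zeta}\, G_{0\zeta}(\zeta, \eta)\,\frac{H_0(\eta)}{H_0(\zeta)}\,d\eta = -1 + \frac{H_0'(\zeta)}{H_0(\zeta)}. \tag{125}$$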
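Identity (124) can be sanity-checked numerically. The sketch below assumes the closed forms $G_0(\zeta,\eta) = (\eta^{l+1}\zeta^{-l} - \zeta^{l+1}\eta^{-l})/(2l+1)$ from (126) and $H_0(\zeta) = (2\zeta/\pi)^{1/2} e^{\zeta} K_{l+1/2}(\zeta)$, normalized so that $H_0 \to 1$ at infinity; for half-integer order this is the finite sum $\sum_{j=0}^{l} \frac{(l+j)!}{j!\,(l-j)!}(2\zeta)^{-j}$. The quadrature (composite Simpson on a truncated interval) and all function names are choices made here for illustration, not part of the paper.

```python
import math


def H0(zeta, l):
    """(2*zeta/pi)^(1/2) * e^zeta * K_{l+1/2}(zeta), via the finite half-integer sum."""
    return sum(
        math.factorial(l + j) / (math.factorial(j) * math.factorial(l - j))
        * (2.0 * zeta) ** (-j)
        for j in range(l + 1)
    )


def G0(zeta, eta, l):
    """Limiting Green's function, cf. (126)."""
    return (eta ** (l + 1) * zeta ** (-l)
            - zeta ** (l + 1) * eta ** (-l)) / (2 * l + 1)


def identity_124(zeta, l, cutoff=60.0, steps=20000):
    """Approximate int_zeta^inf e^(-eta+zeta) G0(zeta,eta) H0(eta)/H0(zeta) d(eta).

    The integral is truncated at eta = zeta + cutoff; `steps` must be even
    because composite Simpson's rule is used.
    """
    def integrand(u):  # substitution u = eta - zeta
        eta = zeta + u
        return math.exp(-u) * G0(zeta, eta, l) * H0(eta, l) / H0(zeta, l)

    h = cutoff / steps
    total = integrand(0.0) + integrand(cutoff)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0
```

For $l = 0$ the check is transparent: $H_0 \equiv 1$ and $G_0 = \eta - \zeta$, so the integral reduces to $\int_0^\infty u\,e^{-u}\,du = 1$.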


Proof. Using (120) and (121) and the behavior of Bessel functions for small argument [1], it follows that for $\zeta, \eta \ll k^{1/2}$ we have

$$G_0(\zeta, \eta) = \lim_{k\to\infty} G(\zeta, \eta) = \frac{\eta^{l+1}\zeta^{-l} - \zeta^{l+1}\eta^{-l}}{2l+1} \tag{126}$$

and $H_0(\zeta) = \lim_{k\to\infty} H(\zeta) = \left(\frac{2\zeta}{\pi}\right)^{1/2} e^{\zeta} K_{l+1/2}(\zeta)$. Now, using the modified Bessel function equation, it is easily verified that $f(\zeta) = e^{-\zeta}H_0(\zeta)$ satisfies

$$f'' - \frac{l(l+1)}{\zeta^2}\, f = f$$

with $f(\zeta) \sim e^{-\zeta}$ as $\zeta \to \infty$. Using variation of parameters to invert the left hand side of the above equation, and using the boundary conditions at $\infty$, we obtain

$$f(\zeta) = \int_\zeta^\infty G_0(\zeta, \eta)\, f(\eta)\,d\eta.$$

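Dividing through by $f(\zeta)$, the first identity in the lemma follows. By differentiating the first identity with respect to $\zeta$, and using the first identity in the resulting expression, we obtain the second identity.

Lemma 47. For any $r \in (0, 1)$,

$$\left|A_k[1](r) - 1\right| = \left|\int_r^1 \Omega(s)\,\frac{m_{k-1}(s)}{m_k(r)}\, G_k(r, s)\,ds - 1\right| \le \frac{c_*}{k^2}. \tag{127}$$

For $\frac{1}{k} \le r \le \frac{1}{2}$ we get

$$\left|\frac{d}{dr} A_k[1](r)\right| = \left|\frac{d}{dr}\int_r^1 \Omega(s)\,\frac{m_{k-1}(s)}{m_k(r)}\, G_k(r, s)\,ds\right| \le \frac{c_*}{k^2} + \frac{c_*}{k^3 r^2}, \tag{128}$$

while for any $r \in [0, \frac{1}{2}]$,

$$\int_r^1 \Omega(s)\left|\frac{\partial}{\partial r}\left[\frac{m_{k-1}(s)}{m_k(r)}\, G_k(r, s)\right]\right| ds \le c_*\, k. \tag{129}$$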

Proof. Recalling the definition (93), it follows from (39) and Lemma 40 that Lk m k − m k−1 =

jk (r ) mk , s

(130)

where jk (r ) = O(1) as k → +∞ for any r ∈ [0, 1]. We can check from (36) that m k (1) = 0, m k (1) = 0 for k 1. From (130), inversion of Lk yields 1 jk (s) m k (s) ds. G k (r, s) (s)m k−1 (s) + (131) m k (r ) = s(s) r Therefore, r

1

G k (r, s)

(s)m k−1 (s) ds = 1 − m k (r )

r

1

G k (r, s)

jk (s)m k (s) ds. s(s)m k (r )

(132)


First, we choose δˆ1 so that 1 − δˆ1 = (5 + l) log k/ (4k + 2τ ). We then define ˆ It is clear that for large k we have δˆ ∼ δˆ so that (1 − δˆ1 )s(r) = s(r + δ). (5/2 + l/2) s(r ) log k/ (2k + τ ) (r ) . Lemma 43, and the fact that k l/2−1/2 (1 − δˆ1 )2k+1+τ /(2k + 1 + τ ) 13 give k

ˆ jk (s)m k (s) 1−δ1 2k+τ G k (r, s) t ds s(s)m k (r ) 0 r +δˆ

1

Fk (r (st) G k (r (s), r (st)) jk (r (st))dt (r (st))Fk (r (s) c∗ c∗ 3 jk ∞ 3 . (133) k k × √

r +δˆ Now, consider the contribution from r . There are again two cases: (i) 1 r k −2/3 and (ii) 0 < r k −2/3 . In the√first case, Taylor expanding G k (r, s) near s = r we get G k = (s−r )+O((s−r )3 Q k ) = (r )s(1 − t) + O(k 4/3 (1 − t)3 , (1 − t)2 ). Hence, 1 r +δˆ jk (s)m k (s) c∗ ds c∗ jk ∞ G k (r, s) t 2k+τ −1 (1 − t)dt 2 . (134) r s(s)m k (r ) k 1−δˆ1 For the case (ii), we rewrite the integral in terms of ζ = kαr , to obtain ζ +kα δˆ r +δˆ jk (s)m k (s) c∗ H (η) ds 2 jk ∞ dη G k (r, s) e−Q(η)+Q(ζ ) G(ζ, η) r s(s)m k (r ) k H (ζ ) ζ ˆ c∗ ζ +kα δ H0 (η) 2 dηe−η+ζ G0 (ζ, η) k ζ H0 (ζ ) c∗ (η) H c∗ ∞ 0 2 dηe−η+ζ G0 (ζ, η) 2 k ζ H0 (ζ ) k

(135)

by Lemma 46. Using (132) and (135), the first part follows. To prove (128), we note that if C3 is large and r k > C3 , Taylor expansion gives Fk (r (st) U1 (s, t) := √ G k (r (s), r (st)) (r (st))Fk (r (s) 3 (1 − t)2 2 3 (1 − t) (136) = f 4 (s)(1 − t) + O (1 − t) , k(1 − t) , , r2 kr 2 for f 4 = −s/ (r (s)), while ∂ (1 − t)3 (1 − t)2 . U1 (s, t) = f 4 (s)(1 − t) + O (1 − t)2 , k(1 − t)3 , , ∂s r3 kr 3


From (132) we note that 1 j (r (st)) d U1 (s, t)dt t 2k+τ +1 √k Ak [1](r ) = − (r (s)) dr (r (st)) 1−δˆ1 1 t 2k+τ jk (r (st))U1,s(s, t)dt − 1−δˆ1

−

d dr

1

r +δˆ

ξ(s) ξ(r )

2k+τ

jk (s)Fk (s) G k (r, s)ds. ξ(s)Fk (r )

(137)

We note further that

ξ(s) 2k+τ jk (s)Fk (s) G k (r, s)ds ξ(s)Fk (r ) r +δˆ ξ(r ) 1 ξ(s) 2k+τ jk (s) ∂ Fk (s) G k (r, s) ds =− ξ(s) ∂r Fk (r ) r +δˆ ξ(r ) ξ (r ) 1 ξ(s) 2k+τ jk (s)Fk (s) G k (r, s)ds +(2k + τ ) ξ(r ) r +δˆ ξ(r ) ξ(s)Fk (r ) " #2k+τ ˆ ˆ k (r + δ) ˆ ξ(r + δ) jk (r + δ)F ˆ 1 + δˆ (r ) . + G k (r, r + δ) ˆ k (r ) ξ(r ) ξ(r + δ)F

−

d dr

1

(138)

From the bounds in Lemmas 40 and 43 and the fact that ξ(s)/ξ(r ) ≤ (1 − δˆ1 ), we easily 1 d Ak [1](r ) is O(1/k 2 ). conclude that the contribution of r +δˆ in (138) to dr Since Lemma 40 implies | jk (r )| < c∗ and | jk (r )| < c∗ + c∗ /(kr 2 ) for 21 ≥ r ≥ k1 , it follows from the local expansion of U1 (s, t) and its s-derivative in a neighborhood of t = 1 in the first integral in (137) that d r +δˆ c∗ jk (s)m k (s) c∗ ds G k (r, s) + dr r s(s)m k (r ) k 3r 2 k 2 and (128) follows. ˆ from (90), We now prove (129). We first note that for r ≥ k −2/3 , s ∈ (r, r + δ), ∂r G(r, s) = −1 at s = r and therefore, from (120), (121), it follows that for s − r = ˆ The same is true O(k −1 log k) k −1/2 , ∂r G(r, s) ∼ −1 < 0 for s ∈ (r, r + δ). −2/3 for r ∈ [0, k ] since in this regime, ∂r G k (r, s) ∼ ∂ζ G0 (ζ, η) (see (126)), with ζ = r/(αk), η = s/(αk). Therefore, from (31) and (36), we get −

∂ ∂r

m k−1 (r )G k (r, s) m k (r )

√ F (r ) m k−1 (r )G k (r, s) (r ) = − (2k + τ − 2) − k ξ(r ) Fk (r ) m k (r ) m k−1 (s) . (139) −∂r G k (r, s) m k (r )


1 Since the contributions to the integrals from r +δˆ is O( k12 ), and the first term on the 1 right on (139) is negative for large k, while the second is positive, it follows that 1 d ∂ m k−1 (r ) ds ≤ Ak [1](r ) (s) ∂r m k (r )G k (r, s) dr r √ Fk (r ) (r ) 1 +2 (2k + τ − 2) ≤ c∗ k − |Ak [1](r )| + O (140) ξ(r ) Fk (r ) k2 1

for r ∈ C2 k −1 , 1 . For r ∈ 0, C2 k −1 , we note that since the contribution from r +δˆ d for dr Ak [1](r ) is negligible, we have d d Ak [1](r ) ∼ kα dr dζ

ζ +kα δˆ1

ζ

a1 H (η(1 − 1/k)) e−Q(η)+Q(ζ ) 1 + G(ζ, η) k H (ζ )

ˆ

ζ +kα δ1 H0 (η) d G0 (ζ, η) e−η+ζ dζ ζ H0 (ζ ) ∞ H0 (η) d G0 (ζ, η); ∼ kα e−η+ζ dζ ζ H0 (ζ )

∼ kα

(141)

d Ak [1](r )| ≤ c∗ k. Hence it follows immediately from Lemma 46 that in this case , | dr the inequality in (140) is valid for all r ∈ [0, 1/2].

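Lemma 48. For any $f \in L^\infty[0, 1]$:

(a) For $r \in [0, 1]$,
$$\|A_k f\|_\infty \le \left(1 + \frac{c_*}{k^2}\right)\|f\|_\infty. \tag{142}$$

(b) For $r \in [0, \frac{1}{2}]$,
$$\left\|\frac{d}{dr}[A_k f](r)\right\|_\infty \le c_*\, k\,\|f\|_\infty. \tag{143}$$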

Proof. Consider the expression for Ak f from (93). We break up the integral into 1 and r +δ , where δ = C2 k −1 log k, with C2 large enough so that (1 − δ1 )2k−2+τ

1 k l/2+7/2

; (1 − δ1 ) :=

r +δ r

s(r + δ) . s(r )

From (36) and Lemma 43, part (2), transforming the integration variable to t, it follows that 1 c∗ m k (s) f ∞ . G (s) (r, s) f (s)ds (144) k k2 m k (r ) r +δ r +δ In r (we replace the upper limit r + δ by 1 if r + δ > 1). Since δ1 = O k −1 log k and t ∈ (1 − δ1 , 1) then Tk (s, t) ≥ 0 and G k (r, s) 0 for r ∈ [k −2/3 , 1]. Therefore, r +δ (s)m k−1 (s) c∗ (145) G k (r, s) + 2 . Ak f ∞ f ∞ m k (r ) k r From (144) we get

c∗ m k−1 (s) G k (r, s)ds 2 . (s) m k (r ) k r +δ 1


Hence r +δ r

(s)

m k−1 (s) G k (r, s)ds = m k (r )

1

(s)

r

m k−1 (s) G k (r, s)ds + O m k (r )

1 k2

.

Using Lemma 47, (142) (a) follows. For (b) we write 1 m k−1 (s) d (s) G k (r, s) f (s)ds dr r m k (r ) 1 ∂ m k−1 (s) G k (r, s) f (s)ds. = (s) ∂r m k (r ) r By Lemma 47, the quantity above is bounded by c∗ k f ∞ .

(146)

(147)

Lemma 49. For any $f \in L^\infty[0, 1]$,

$$\|H_k f\|_\infty \le \frac{c_*}{k^2}\,\|f\|_\infty, \qquad \left\|\frac{d}{dr}[H_k f](r)\right\|_\infty \le \frac{c_*}{k^2}\,\|f\|_\infty.$$

Proof. As before, we choose δ = C2 k −1 log k large C2 independent of k. Using Lemma 43, it follows that 1 (s)m k+1 (s) G k (r, s) f (s)ds c∗ (1 − δ1 )2k+2 k l/2−5/2 f ∞ m k(r ) r +δ c∗ 4 f ∞ , (148) k 1 ∂ (s)m k+1 (s) G (r, s) f (s)ds k ∂r m r +δ

c∗ (1 − δ1 )

k(r )

2k+2 l/2−3/2

k

f ∞

c∗ f ∞ . k3

Now, Lemma 43 implies r +δ 1 (s)m k+1 (s) c∗ f ∞ G (r, s) f (s)ds t 2k+2+τ dt k m k(r ) k2 r 0 c∗ f ∞ , k3 r +δ c∗ f ∞ 1 2k+2+τ ∂ (s)m k+1 (s) G k (r, s) f (s)ds t dt ∂r m k(r ) k r 0 c∗ f ∞ . k2

(149)

(150)

(151)

Lemma 50. There exist $k_0$ and $c_*$, independent of $k$, so that for $k > k_0$, over the $r$-interval $(0, 1)$,

$$\|h_k\|_\infty < c_*. \tag{152}$$


Proof. First we note that for $k_0$ sufficiently large, $\|h_{k_0}\|_\infty$ exists, since $g_{k_0}$ is continuous for $r \in [0, 1]$ and the expression for $m_k$ in (36) shows that $1/m_{k_0}$ is bounded as well for sufficiently large $k_0$, since $K_{l+1/2}$ has no zeros in the region of interest. Define $r_k = H_k h_{k+1}$. Note that

$$h_k = A_k\left(A_{k-1}h_{k-2} + r_{k-1}\right) + r_k. \tag{153}$$

In $k - k_0$ inductive steps we get

$$h_k = A_k A_{k-1}\cdots A_{k_0+1}\, h_{k_0} + H_k h_{k+1} + \sum_{m=1}^{k-k_0-1}\left(\prod_{j=1}^{m} A_{k-j+1}\right) H_{k-m}\, h_{k-m+1}. \tag{154}$$

We write this abstractly as

$$h = h^0 + N h, \tag{155}$$

where

$$h^0_k = A_k A_{k-1}\cdots A_{k_0+1}\, h_{k_0}; \qquad [N h]_k = H_k h_{k+1} + \sum_{m=1}^{k-k_0-1}\left(\prod_{j=1}^{m} A_{k-j+1}\right) H_{k-m}\, h_{k-m+1}, \tag{156}$$

and $N$ is defined on the space $S$ of sequences $h = \{h_k\}_{k=k_0+1}^{\infty}$ in the norm

$$\|h\| = \sup_{k \ge k_0+1}\|h_k\|_\infty. \tag{157}$$

Lemmas 48 and 49 imply

$$|[N h]_k| \le \|h\|_\infty\left(\frac{c_*}{k_0^2} + c_*\sum_{m=1}^{k-k_0-1}\left\{\prod_{j=1}^{m}\left(1 + \frac{c_*}{(k-j+1)^2}\right)\right\}\frac{1}{(k-m)^2}\right) < \nu\,\|h\|_\infty, \tag{158}$$

where, if $k_0$ is large, $\nu < 1$ is independent of $k$. Thus, $N$ is contractive and there is a unique solution of (155) in $S$.
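To see concretely why a large $k_0$ makes $N$ contractive, one can evaluate the bracket in (158) numerically. The sketch below assumes the form $c_*/k_0^2 + c_*\sum_{m=1}^{k-k_0-1}\prod_{j=1}^{m}\bigl(1 + c_*/(k-j+1)^2\bigr)(k-m)^{-2}$ suggested by Lemmas 48 and 49, with a single illustrative constant `c` standing in for every occurrence of $c_*$. Neither the value of `c` nor the function names come from the paper.

```python
def contraction_bound(c, k0, k):
    """Evaluate c/k0^2 + c * sum_{m=1}^{k-k0-1} prod_{j=1}^{m} (1 + c/(k-j+1)^2) / (k-m)^2."""
    if k <= k0:
        raise ValueError("need k > k0")
    total = c / k0 ** 2
    product = 1.0
    for m in range(1, k - k0):
        # Factor j = m of the running product of norm bounds for A_{k-j+1}.
        product *= 1.0 + c / (k - m + 1) ** 2
        # Norm bound for H_{k-m}, weighted by the product accumulated so far.
        total += c * product / (k - m) ** 2
    return total


def sup_bound(c, k0, k_max):
    """Largest value of the bound over k0 < k <= k_max (a finite proxy for the sup over k)."""
    return max(contraction_bound(c, k0, k) for k in range(k0 + 1, k_max + 1))
```

The product is at most $\exp(c_*/k_0)$ and the sum at most $c_*/k_0$, so the bound behaves like $c_*/k_0$ for large $k_0$, uniformly in $k$. With a large constant and a small $k_0$ (for instance `c = 10`, `k0 = 5`) the same expression exceeds 1, which illustrates why the lemma requires $k_0$ to be large.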

Lemma 51. For any $r \in [0, \frac{1}{2}]$ and for large enough $k$ we have $\left\|\frac{d}{dr} h_k\right\|_\infty \le c_*\, k$.

Proof. Since by Lemma 50 $h_k$ is bounded, Lemmas 49 and 48 imply

$$|h_k'(r)| \le \left|\frac{d}{dr}[A_k h_{k-1}](r)\right| + \left|\frac{d}{dr}[H_k h_{k+1}](r)\right| \le c_*\, k.$$

Lemma 52. For all $k \ge 1$, $h_k(1) = 1$.


Proof. In case (i), a simple computation shows that

$$\frac{\partial^{2k}}{\partial s^{2k}}\, g_{n_0-k}\Big|_{s=0} = i^k h_k(1) \qquad (g_{n_0-k} := i^k m_k h_k).$$

(By the differential equation for $h_k$, all derivatives exist.) Lemma 37 with $j = 2k$ gives

$$i^k = \frac{\partial^{2k}}{\partial s^{2k}}\Big|_{s=0}\, g_{n_0-k} = i^k h_k(1),$$

implying the result in case (i). In case (ii), using Lemma 38, a similar computation shows that

$$i^k = \frac{\partial^{2k+1}}{\partial s^{2k+1}}\Big|_{s=0}\, g_{n_0-k} = i^k h_k(1) \qquad (g_{n_0-k} := i^k m_k h_k).$$

Definition 53. Let Tˆk (s, s) = s −2k+1−τ

s

t 2k−2+τ s

0

∂ Tk (s, t)dt, ∂s

(159)

where Tk (s, t) is defined in (95). Lemma 54. Let δ = k −1 log k and Sk (s) := ∂∂s s ∈ (0, δ) and r (s) k −1 C2 , we have

1 0

t 2k−2 Tk (s, t)dt. If C2 is large enough,

s f (s) s f 3 (s) Tˆk (s, s) = sSk (s) − 1 (1 − s)3 + (1 − s)3 12 3kr 3 (1 − s)4 (1 − s)3 (1 − s)2 (1 − s) (1 − s)3 (1 − s)2 . (160) +O , , , , , kr 4 k 2r 4 k 3r 4 k 4r 3 kr 2 k 2r 2 Proof. This simply follows by integrating (103) from t = 1 to s of Tk and the fact that Tˆk (s, 1) = sSk (s). 4.13. Proof of Lemma 31. First choose 1 > 0. From Lemma 49, it follows that ( ( (d ( ( [Hk h k+1 ]( ( ds (

∞

c∗ c∗ h k+1 ∞ 2 , k2 k

where we applied Lemma 50. Further, we note that d 1 Ak h k−1 (s) (2k + τ )(2k + τ − 1) ds 1 1 ∂ Tk (s, t)h k−1 (st)dt + t 2k+τ −2 t 2k+τ −1 Tk (s, t)h k−1 (st)dt. = ∂s 0 0

(161)


We have

2k+τ −2 ∂ Tk

∂ Tk (s, t) dt t 2k+τ −2 s ∂s 0 0 s 1 1 2k+τ −2 ∂ Tk × (s, t)dt ds h k−1 (ss)ds = h k−1 (s)Sk (s) − h k−1 (ss) t s ∂s t 0 0 1 = h k−1 (s)Sk (s) − h k−1 (ss)s 2k−1+τ Tˆk (s, s)ds = (2k + τ − 1)Sk (s) 1

t

∂s

(s, t)h k−1 (st)dt = h k−1 (s)Sk (s) −

0

1

×

1

s 2k−2+τ h k−1 (ss)ds −

0

0

1

s 2k−1+τ [Tˆk (s, s) − Tˆk (s, 1)]h k−1 (ss)ds. (162)

Therefore, d d s Ak [h k−1 ](s)

(2k + τ )(2k + τ − 1)

1

=

[Tk (s, s) − Tˆk (s, s) + sSk (s)]s 2k+τ −1

0

×h k−1 (ss)ds + (2k + τ − 1)Sk (s)

1

s 2k+τ −2 h k−1 (ss)ds.

0

(163) We note that ∂ (2k + τ )(2k + τ − 1)Sk (s) = [Ak [1](s)] = O ∂s

"

1 1 , 2 2 3 k 1 k

#

1 and that (2k + τ − 1) 0 s 2k+τ −2 h k−1 (ss)ds has a bound independent of k. Combining (103) with Lemma 54, if k is large so that k1 is large, then f2 k f1 ˆ + 2 Tk (s, s) − [Tk (s, s) − sSk (s)] = (1 − s) + − 4 r f (1 − s)2 2 f 3 × − (1 − s)3 + (1 − s)3 − s − 1 + k 3 12 3kr 3 (1 − s)4 (1 − s)3 (1 − s)2 (1 − s) (1 − s)3 (1 − s)2 +O , , , 4 3 , , , kr 4 k 2r 4 k 3r 4 k r kr 2 k 2r 2 (1 − s)4 (1 − s)3 (1 − s)3 (1 − s)2 , . , , × r3 kr 3 r kr

(164)

From (164), it is clear that Tk (s, s) −Tˆk (s, s) + sSk (s) > 0 if s ∈ (1 − δ, 1) and k1 is sufficiently large. Now, s f 3 / 3kr 3 (1 − s)3 > 0 exceeds any term following it in (164), except possibly when 1 − r , i.e. s is small. Thus, if we define Mk =

sup

r (s)∈[1 ,1]

|h k (s)|

(165)


we get

k f1 f2 + 2 s 2k+τ −1 (1−s)+ − 4 r 1−δ1 f c∗ 2 c∗ 1 × [− (1 − s)2 + (1 − s)3 ] + s 1 (1 − s)3 ds + 2 + 3 2 . k 3 12 k k 1

|h k (s)| (2k + τ )(2k +τ −1)Mk−1

1

(166)

When (1 − r ) (and thus s) is small, we can replace the term s f 1 /(12)(1 − s)3 on the right side of the above equation simply by (1 − s)3 , which is clearly bigger. From the 1 fact that 1−δ s 2k−1 [−k −1 (1 − s)2 + (2/3)(1 − s)3 ]ds = O(k −5 ), it follows that " # 2k − 1 + τ c∗ c∗ c∗ c∗ + (167) Mk Mk−1 + 2 + 3 2. + 2k + 1 + τ k 2 k 3 12 k k 1 Let C3 be large enough and define k0 (1 ) = C3 /1 , so that for k k0 we have " # 2k + τ − 1 c∗ c∗ k − 1 1/2 + 2 3+ 2 . 2k + τ + 1 1 k k k Then for k k0 ,

Mk

implying Mk Mk 0 +

k0 k

k−1 k

1/2 Mk−1 +

c∗ c∗ + 3 2, 2 k k 1

(168)

1/2

k 3/2 k0 1 c∗ 1 c∗ c∗ c∗ + c + . ∗ 1/2 + 1/2 3/2 5/2 2 1/2 1/2 k 1/2 j 3/2 k 1 /2 k j k k0 k k0 12 1 j=k0 j=k0

(169)

The result follows from the definition of $M_k$ and noting that the last two terms in (169) are $O(c_*\, k_0^{3/2} k^{-1/2})$.

4.14. Proof of Lemma 33. From Lemma 31 and the definition of $k_0$, it follows that

$$|h_k'(\epsilon_1)| \le \frac{C_4\, C_3^{3/2}}{\epsilon_1^{3/2}\, k^{1/2}}$$

for $k \ge C_3\,\epsilon_1^{-1} = k_0$. Using $h_k(1) = 1$, it follows that for $k \ge C_3/r$,

$$|h_k(r) - 1| \le \int_r^1 |h_k'(r')|\,dr' \le \frac{2\, C_4\, C_3^{3/2}}{(kr)^{1/2}}.$$

Additionally, if $\alpha k r \ge \left(2\, C_4\, C_3^{3/2}\alpha^{1/2}/\epsilon\right)^2 = L_\epsilon$, then $|h_k(r) - 1| \le \epsilon$. (Footnote 7: It is to be noted that for small enough $\epsilon$ the inequality $\alpha k r \ge L_\epsilon$ always implies $k \ge C_3/r$.)


4.15. Proof of Lemma 35. For $\zeta \in [0, L_\epsilon]$, using the a priori boundedness of $h_k$ in $k$ and Lemma 51, we note that both $\tilde h_k(\zeta) := h_k(r(\zeta))$ and $(\tilde h_k)_\zeta$ are bounded independently of $k$. Hence the sequence $\{\tilde h_k\}_{k \ge 2}$ is bounded and equicontinuous. By the Ascoli-Arzelà theorem, there exists a subsequence $\tilde h_{k_j}(\zeta)$ converging to a continuous function $\tilde h$. The first part of the result is proved. We now prove that $|\tilde h(\zeta) - 1| \le 4\epsilon$. From Lemma 33,

$$|\tilde h_k(\zeta) - 1| \le \epsilon \quad \text{for } \zeta \in [L_\epsilon, \alpha k] \text{ for sufficiently large } k. \tag{170}$$

Let $\tilde h_{k_j}$ be a subsequence that converges to $\tilde h$ for $\zeta \in [0, L_\epsilon]$. Let $\zeta_m$, $\zeta_M$ be a minimum and a maximum point of $\tilde h$ on $[0, L_\epsilon]$, and denote the corresponding minimum and maximum values by $m$ and $M$ respectively. Continuity at the endpoint $\zeta = L_\epsilon$ implies that $M \ge 1 - \epsilon$, $m \le 1 + \epsilon$. If both $M - 1 - \epsilon < 0$ and $m - 1 + \epsilon > 0$, there is nothing to prove, because in that case it is clear that $|\tilde h(\zeta) - 1| \le 2\epsilon$. Now, consider the possibility that (i): $M > 1 + \epsilon$. In a similar manner, we will also consider the possibility (ii): $m < 1 - \epsilon$. Consider (i) first. Since at the end point of the interval $\tilde h(L_\epsilon) < 1 + \epsilon$, from continuity there exists an interval $[a, b] \subset [\zeta_M, L_\epsilon]$ of nonzero length for which

$$\tilde h(\eta) \le \frac{1}{2}(M + 1 + \epsilon) < M \quad \text{for } \eta \in [a, b]. \tag{171}$$

For some Lˆ > L , independent of k (to be determined shortly), we write " ˆ # $ % kα1 L 0 Ak f (ζ ) = + K (ζ, η) f (η(1 − k −1 ))dη Lˆ

ζ

a1 H (η(1 − k −1 ) G(ζ, η)dη with K (ζ, η) := e−Q(η)+Q(ζ ) 1 + k H (ζ ) 01 =: [A00 k f ](ζ ) + [Ak f ](ζ ).

(172)

For fixed ζ and η we have lim K (ζ, η) = K 0 (ζ, η) = e−η+ζ

k→∞

H0 (η) G0 (ζ, η). H0 (ζ )

(173)

On our interval we have η ζ . Thus G0 0 (see (126)); G0 can vanish only if η = ζ . Furthermore, by (171) we have ζ M ∈ [a, b]. We can then define J=

˜ 3 sup[0,L ] |h| , where K m = min K 0 (ζ M , η) > 0. η∈[a,b] (b − a)K m

Note that Q(η) ∼ η for large k and, aside from the exponential term, K is algebraically bounded. We can thus choose Lˆ > L large enough independently of k, so that −1 f ∞,[L ,kα1 ] . |[A01 k f ](ζ )| J

(174)

ˆ for simplicity, There is a subsequence of h˜ k j that converges uniformly on ∈ [0, L]; ˜ ˜ ) we will use the same notation h k, j for the subsequence. It is clear that the limit is h(ζ ˜ ˆ if ζ ∈ [0, L ]. We keep the notation h for the limit on [0, L]. We note that (170) implies ˜ ) − 1| for ζ ∈ [L , L]. ˆ |h(ζ

(175)


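Now choose a small $\epsilon_2 > 0$. It is clear that in the interval $[L, \hat L]$, $\tilde h(\zeta) \le 1 + \epsilon < M$. For sufficiently large $k_j$, using continuity of $\tilde h(\zeta)$, we have
\[
\begin{aligned}
[A^{00}_{k_j} \tilde h](\zeta_M)
&\le \int_{\eta \in [\zeta_M, \hat L] \setminus [a,b]} K(\zeta_M,\eta)\, \tilde h(\eta)\, d\eta + \int_a^b K(\zeta_M,\eta)\, \tilde h(\eta)\, d\eta + M \epsilon_2 \\
&\le M \int_{\eta \in [\zeta_M, \hat L] \setminus [a,b]} K(\zeta_M,\eta)\, d\eta + \tfrac12 (M + 1 + \epsilon) \int_a^b K(\zeta_M,\eta)\, d\eta + \epsilon_2 M \\
&= M \int_{\zeta_M}^{\hat L} K(\zeta_M,\eta)\, d\eta - \tfrac12 (M - 1 - \epsilon) \int_a^b K(\zeta_M,\eta)\, d\eta + M \epsilon_2 \\
&\le M A^{00}_{k_j}[1](\zeta_M) - \frac{(b-a)}{3} (M - 1 - \epsilon) K_m + M \epsilon_2 .
\end{aligned}
\]
Since $A_{k_j}[1] = A^{00}_{k_j}[1] + A^{01}_{k_j}[1] + A^{1}_{k_j}[1]$ (see (112) and (172)), Lemmas 47, 44 and (174) imply that for large $k_j$ we have
\[ [A^{00}_{k_j}[1]](\zeta_M) \le 1 + \frac{\epsilon}{J} + \epsilon_2 . \]
Hence, for large $k_j$ we have
\[ [A^{00}_{k_j} \tilde h](\zeta_M) \le M \left( 1 + \frac{\epsilon}{J} + 2\epsilon_2 \right) - \frac{K_m}{3} (M - 1 - \epsilon)(b - a). \tag{176} \]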

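Now, there exists $N$ so that if $j \ge N$, then $\|\tilde h_{k_j} - \tilde h\|_{\infty,[0,\hat L]} < \epsilon_2$ and $\mathcal{A}_j = A_{k_{j+1}} \cdots A_{k_j+1}$ satisfies $\|\mathcal{A}_j - I\|_\infty \le \epsilon_2$, while
\[ r_{j+1} := B_{k_{j+1}} + \sum_{m=1}^{k_{j+1}-k_j-1} \prod_{l=1}^{m} A_{k_{j+1}-l+1}\, B_{k_{j+1}-m}, \]
where $B_l = H_l h_{l+1}$, satisfies the estimate $|r_{j+1}| < \epsilon_2$. Therefore, from $\tilde h_{k_{j+1}} = \mathcal{A}_j A_{k_j} \tilde h_{k_j} + r_{j+1}$ it follows that
\[ \tilde h_{k_{j+1}}(\zeta_M) \ge \tilde h(\zeta_M) - \epsilon_2 = M - \epsilon_2 . \]
On the other hand, at $\zeta = \zeta_M$ we have
\[ \mathcal{A}_j A_{k_j} \tilde h_{k_j} + r_{j+1} \le (1 + \epsilon_2) \left[ M \left( 1 + \frac{\epsilon}{J} + 2\epsilon_2 \right) + \epsilon_2 - \frac{K_m}{3} (M - 1 - \epsilon)(b - a) \right] + \epsilon_2 . \tag{177} \]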


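Thus,
\[ M - \epsilon_2 \le (1 + \epsilon_2) \left[ M \left( 1 + \frac{\epsilon}{J} + 2\epsilon_2 \right) + \epsilon_2 - \frac{K_m}{3} (M - 1 - \epsilon)(b - a) \right] + \epsilon_2 . \]
This is true for any $\epsilon_2$, hence as $\epsilon_2 \downarrow 0$. Thus,
\[ M \le M \left( 1 + \frac{\epsilon}{J} \right) - \frac{K_m}{3} (M - 1 - \epsilon)(b - a). \]
However, from the definition of $J$, this implies $M - 1 - \epsilon \le \epsilon$. We note that for (ii), we repeat the above argument for $-\tilde h$, which has a maximum at $\zeta_m$, to conclude that either $(-m) - (-1 + \epsilon) \le 0$ or $(-m) - (-1 + \epsilon) = 1 - \epsilon - m \le \epsilon$. Therefore, $1 - 2\epsilon \le m \le M \le 1 + 2\epsilon$, implying that $|\tilde h - 1| \le 4\epsilon$.

5. Appendix

5.1. Short proof of the regularity of the unitary propagator.

Theorem 5. Assume that $H_1 = H + V(x, t)$, where $H$ is time independent and self-adjoint, and $V(\cdot, t)$ is in $L^\infty(\mathbb{R}^n)$ for every $t$ and is differentiable in time, with integrable derivative. Consider the Schrödinger problem
\[ i\psi_t = H_1 \psi; \quad \psi(x, 0) \in D(H). \tag{178} \]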

Then there exists a strongly differentiable unitary propagator $U(t)$ on $L^2(\mathbb{R}^n)$ so that $\psi(x, t) = U(t)\psi_0 \in D(H)$ for all $t$ and $\psi(x, t)$ solves (178).

Proof. We note that it is enough to prove this property on a finite interval $[0, \epsilon]$, since the problem can be restarted at $t = \epsilon$. Let $y = \psi - e^{-t}\psi_0$. Then $y$ satisfies the inhomogeneous Schrödinger equation
\[ i y_t = y_0 e^{-t} + H y + V y; \quad y_0 := i\psi_0 + H\psi_0 + V\psi_0, \quad y(0) = 0. \tag{179} \]
We transform this equation into an integral equation, formally for now. Straightforward calculations show that
\[ i \left( e^{iHt} y \right)_t = e^{iHt} e^{-t} y_0 + e^{iHt} V y \tag{180} \]
or (still formally)
\[
\begin{aligned}
i e^{iHt} y &= \int_0^t e^{(iH-1)s}\, ds\; y_0 + \int_0^t e^{iHs} V(s) y(s)\, ds \\
&= (iH - 1)^{-1} \left( e^{iHt - t} - 1 \right) y_0 + \int_0^t e^{iHs} V(s) y(s)\, ds
\end{aligned} \tag{181}
\]
or, equivalently,
\[ i y = (iH - 1)^{-1} \left( e^{-t} - e^{-iHt} \right) y_0 + e^{-iHt} \int_0^t e^{iHs} V(s) y(s)\, ds. \tag{182} \]


It is clear that (182) is contractive in the norm $\sup_{t \in [0,\epsilon]} \|\cdot\|_{L^2(\mathbb{R}^3)}$ for small $\epsilon$, and has a unique solution. Clearly, the first term on the right side of (182) is differentiable in time and the derivative is continuous since $e^{-iHt}$ is; let $u_0$ denote this derivative. We now write a formal equation for $u = y_t$. We have
\[ i u = u_0 + \int_0^t e^{-iHs}\, V'(t - s) \int_0^{t-s} u(s')\, ds'\, ds + \int_0^t e^{-iHs}\, V(t - s)\, u(t - s)\, ds. \tag{183} \]
This equation is also contractive, and has a unique solution, in the same space. Thus both sides of (183) are integrable in time. By integration and appropriate changes of variables and order of integration, we see that $\int_0^t u(s)\, ds$ satisfies the same equation as $y$, which has a unique solution. Thus $y = \int_0^t u(s)\, ds$ is strongly differentiable. Since both $y$ and $e^{iHt} y$ are strongly differentiable (the latter by inspection from (181)), $y \in D(H)$ for all $t$ and is strongly differentiable. It is clear that $\psi \in D(H)$ and easy to check that it is differentiable and satisfies (178).

5.2. Laplace transform of the Schrödinger equation.

We look more generally at equations of the form
\[ i\psi_t = H\psi + V(t, x)\psi, \tag{184} \]

where $H$ is self-adjoint and time independent, and $V(x, t)$ is bounded on $\mathbb{R}^3$ and differentiable and bounded in $t$, and $\psi(x, 0) \in D(H)$. The conditions on $V$ can be relaxed. (For the purpose of this paper, $H$ would be taken to be $H_C$.)

Proposition 55. Under the assumptions above, the Laplace transform $\hat\psi(p, \cdot)$ of $\psi(t, \cdot)$ exists for $\mathrm{Re}\, p > 0$; it is in $D(H)$ and satisfies
\[ (p + iH)\hat\psi = \psi_0 - i\, \widehat{V\psi}. \tag{185} \]

Proof. We take the unitary propagator of the time-independent problem, $U = e^{-iHt}$, and apply $U^*(t) = U^{-1}(t)$ to both sides of (184). Since (cf. § 1.2) $U^{-1}$ is strongly differentiable, with derivative $iU^{-1}H$, and $\psi$ is $t$-differentiable in $L^2$, $U^{-1}\psi$ is differentiable and we get
\[ (U^{-1}\psi)_t = iU^{-1}H\psi + U^{-1}\psi_t = -iU^{-1}V\psi. \tag{186} \]
Since $U^{-1}V\psi$ is continuous in $t$, we can integrate both sides and get, after multiplication by $U$ and using the fact that $U^{-1}(t) = U(-t)$,
\[ \psi = U\psi_0 - iU \int_0^t U^{-1} V\psi(s)\, ds = U\psi_0 - i \int_0^t U(t - s)(V\psi)(s)\, ds = U\psi_0 - iU * (V\psi), \tag{187} \]
where $*$ is the usual Laplace convolution. Taking the Laplace transform (which clearly exists) in (187) and using standard functional calculus we get
\[ \hat\psi = (p + iH)^{-1}\psi_0 - i(p + iH)^{-1}\, \widehat{V\psi}, \tag{188} \]
and thus $\hat\psi$ is a $D(H)$ solution of (185).
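The Duhamel formula (187) can be illustrated on a scalar toy model. The sketch below is only an assumption-laden illustration, not part of the proof: it replaces $H$ by a real number `h`, takes the hypothetical potential $V(t) = \cos t$, and iterates $\psi \mapsto U\psi_0 - iU * (V\psi)$ on a uniform grid with the trapezoid rule. For this scalar case the exact solution is $\psi(t) = \exp[-i(ht + \sin t)]$, which gives something to compare against.

```python
import cmath
import math

def duhamel_picard(h, V, T, n_steps=200, n_iter=20):
    """Iterate psi <- U psi0 - i * (U * (V psi)) on a uniform grid (scalar toy model)."""
    dt = T / n_steps
    ts = [k * dt for k in range(n_steps + 1)]
    U = lambda t: cmath.exp(-1j * h * t)      # free propagator e^{-iht}
    psi0 = 1.0 + 0.0j
    psi = [U(t) * psi0 for t in ts]           # zeroth iterate: free evolution
    for _ in range(n_iter):
        new = []
        for k, t in enumerate(ts):
            # trapezoid rule for the convolution integral over [0, t]
            vals = [U(t - ts[m]) * V(ts[m]) * psi[m] for m in range(k + 1)]
            integral = 0.0j
            if k > 0:
                integral = dt * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
            new.append(U(t) * psi0 - 1j * integral)
        psi = new
    return ts, psi

def exact(h, t):
    # closed form for the assumed potential V(t) = cos(t)
    return cmath.exp(-1j * (h * t + math.sin(t)))

ts, psi = duhamel_picard(h=2.0, V=math.cos, T=1.0)
max_err = max(abs(p - exact(2.0, t)) for t, p in zip(ts, psi))
print("max deviation from the exact scalar solution:", max_err)
```

The iteration converges because the integral operator is of Volterra type; the remaining deviation comes from the quadrature step, and the modulus of the iterate stays close to 1, as unitarity suggests.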


Now, from Eq. (7), it follows that $\hat y$ satisfies (9). Furthermore, using (188) and the fact that $y^0$ and $\Omega_j$ are compactly supported, we see that $\hat y$ also satisfies
\[ \hat y(p, \cdot) = R_0 \chi_B\, \hat y^0(p, \cdot) - R_0 \chi_B \left[ \sum_{j \in \mathbb{Z}} \Omega_j\, \hat y(p - ij\omega, \cdot) \right], \tag{189} \]
where $R_0 = (H_C - ip)^{-1}$.

5.3. Analyticity of $(I - C_{l,m})^{-1}$ in $X$.

This is standard, and can be seen directly from analytic functional calculus. We provide a self-contained argument, for completeness. We write $C_X$ to emphasize the $X$-dependence of $C$, and for simplicity of notation we drop the $(l, m)$ subscript. We have
\[ (I - C_{X_1})^{-1} - (I - C_X)^{-1} = (I - C_X)^{-1} (C_{X_1} - C_X)(I - C_{X_1})^{-1} \]
and
\[ (I - C_X)^{-1} \left[ I + (C_{X_1} - C_X)(I - C_{X_1})^{-1} \right] = (I - C_{X_1})^{-1}. \tag{190} \]
We fix $X_1$ and let $X \to X_1$. Since $(I - C_{X_1})^{-1}$ is bounded, $(C_{X_1} - C_X)(I - C_{X_1})^{-1} \to 0$ as $X \to X_1$ and
\[ I + (C_{X_1} - C_X)(I - C_{X_1})^{-1} \tag{191} \]
is invertible when $X_1$ and $X$ are close enough, and $[I + (C_{X_1} - C_X)(I - C_{X_1})^{-1}]^{-1} \to I$ in operator norm as $X \to X_1$. Thus
\[ (I - C_X)^{-1} \to (I - C_{X_1})^{-1} \text{ in operator norm, as } X \to X_1. \tag{192} \]
Now differentiability in $X$ follows from (190).
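The resolvent identity behind (190) and the differentiability claim can be checked numerically in finite dimensions. The sketch below is an illustration under stated assumptions: `C(x)` is a hypothetical $2 \times 2$ analytic matrix family standing in for $C_X$, with a scalar parameter `x`, and `dC(x)` is its derivative. It verifies the identity and compares a difference quotient of $(I - C_x)^{-1}$ with $(I - C_x)^{-1} C_x' (I - C_x)^{-1}$, which is what dividing the identity by $X_1 - X$ and passing to the limit yields.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def mat_inv(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def max_abs(A):
    return max(abs(v) for row in A for v in row)

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def C(x):
    # hypothetical analytic matrix family standing in for C_X
    return [[0.3 * x, 0.1], [0.2 * x * x, 0.1 * x]]

def dC(x):
    # entrywise derivative of the hypothetical family above
    return [[0.3, 0.0], [0.4 * x, 0.1]]

def resolvent(x):
    return mat_inv(mat_sub(IDENTITY, C(x)))

# (I - C_{x1})^{-1} - (I - C_x)^{-1} = (I - C_x)^{-1} (C_{x1} - C_x) (I - C_{x1})^{-1}
x, x1 = 0.5, 0.7
lhs = mat_sub(resolvent(x1), resolvent(x))
rhs = mat_mul(mat_mul(resolvent(x), mat_sub(C(x1), C(x))), resolvent(x1))
identity_gap = max_abs(mat_sub(lhs, rhs))

# difference quotient versus the derivative formula R C' R
h = 1e-6
quotient = [[v / h for v in row] for row in mat_sub(resolvent(x1 + h), resolvent(x1))]
derivative = mat_mul(mat_mul(resolvent(x1), dC(x1)), resolvent(x1))
derivative_gap = max_abs(mat_sub(quotient, derivative))
print(identity_gap, derivative_gap)
```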

5.4. Coulomb Green's function representation.

The retarded Green's function $G = G_+$ is defined as the solution of the equation
\[ A_0 G(x, x'; k) = \delta(x - x') \tag{193} \]
in distributions, satisfying the radiation condition
\[ G(x, x'; k) \sim F(\theta, \phi)\, e^{ikr} r^{-1-i\nu} \quad \text{as } r \to \infty, \tag{194} \]
where
\[ k = \sqrt{ip} \ (\mathrm{Im}\, k > 0 \text{ if } \mathrm{Re}\, p > 0), \quad \nu = \frac{b}{2k}. \tag{195} \]
Equivalently, $G$ is the $\mathbb{R}^3 \setminus \{0\}$ solution of (193) with zero right hand side, satisfying (194) and $|x - x'|\, G(x, x'; k) \to (4\pi)^{-1}$ as $x - x' \to 0$.

Proposition 56.
\[ R_0 \chi_B g = \int_B G(x, x'; k)\, g(x')\, dx'. \tag{196} \]


Proof. The function
\[ f := \int_B G(x, x'; k)\, g(x')\, dx' \tag{197} \]
solves, as can be checked, the equation
\[ A_0 f = \chi_B g \tag{198} \]
with the radiation condition (194). Such a solution is unique since the difference of two solutions satisfies the equation $A_0 f = 0$ (with the radiation condition (194)). Multiplying by $G(x, x'; k)$, integrating over a volume and passing to the limit where the volume approaches $\mathbb{R}^3$, we see that $f \equiv 0$.

Symmetries of the Coulomb potential $-b/r$ allow for a closed form of $G$ (cf. [26], where the sign is chosen differently) in terms of Whittaker functions $W$ and $M$,
\[ G(x; x'; k) = -\frac{\Gamma(1 - i\nu)}{4\pi i k |x - x'|} \left( \frac{\partial}{\partial \xi} - \frac{\partial}{\partial \eta} \right) W_{i\nu, \frac12}(-ik\xi)\, M_{i\nu, \frac12}(-ik\eta), \tag{199} \]
where $\mathrm{Im}\, k > 0$, $2k\nu = b$ and
\[ \xi = |x| + |x'| + |x - x'|, \quad \eta = |x| + |x'| - |x - x'|. \tag{200} \]
The Whittaker functions are defined in terms of the Kummer functions $M$ and $U$ by the relations, see [1], Chap. 13,
\[
\begin{aligned}
M_{\kappa,\mu}(z) &= e^{-\frac12 z} z^{\frac12 + \mu}\, M\!\left( \tfrac12 + \mu - \kappa,\, 1 + 2\mu,\, z \right), \quad -\pi < \arg z \le \pi, \\
W_{\kappa,\mu}(z) &= e^{-\frac12 z} z^{\frac12 + \mu}\, U\!\left( \tfrac12 + \mu - \kappa,\, 1 + 2\mu,\, z \right), \quad -\pi < \arg z \le \pi.
\end{aligned} \tag{201}
\]
The following integral representation follows from [1], Chap. 13, for the values we are interested in, $z_1 = -ik\xi$, $z_2 = -ik\eta$, $a = 1 - i\nu$, $b = 2$ (a different "$b$" than the one in our Coulomb potential):
\[ M_{i\nu; \frac12}(z) = \frac{e^{-\frac12 z} z\, J(z)}{\Gamma(1 - i\nu)\Gamma(1 + i\nu)}; \quad W_{i\nu; \frac12}(z) = \frac{e^{-\frac12 z} z\, I(z)}{\Gamma(1 - i\nu)}, \tag{202} \]
where $I$ and $J$ are as defined in (51) and the expression is valid in the regions where the integrals converge (in particular, $|\mathrm{Im}\, \nu| < 1$). For other values of $\nu$ of interest, the integrals can be replaced by appropriate contour integrals. For instance $J$ would be replaced by
\[ \left( 1 - e^{-2\pi\nu} \right)^{-1} \oint_C e^{zt}\, t^{-i\nu} (1 - t)^{i\nu}\, dt, \]
where $C$ is a smooth simple curve encircling $[0, 1]$, as can be checked by calculating the jump across the cut of the integrand. It follows from these integral representations that the Green's function is analytic at any (small) p, Re p = 0. Substituting (202) into (199), we obtain (49).


5.5. Dependence of $A$ in Eq. (56) on $Z$, $p$.

We now seek to determine the asymptotics of $A$ in (56) in the resolvent $\chi_B R_\beta \chi_B$ in terms of $\lambda = \sqrt{-ip}$ and $Z = \exp[i\pi b/(2\lambda)]$ for $X = (\sqrt{p}, Z) \in D^+ \times D$ for sufficiently small $\epsilon$. Recall the expression $A$ in (56). Note that since
\[ \alpha = \sqrt{\lambda^2 - ic} \sim e^{-i\pi/4} c^{1/2} \left( 1 + O(\lambda^2) \right) \equiv \alpha_0 + \lambda^2 \alpha_1 + \cdots, \tag{203} \]
\[ \kappa_1 = \frac{b}{2\alpha} = \frac{b\, e^{i\pi/4}}{2\sqrt{c}} \left[ 1 + O(\lambda^2) \right] \equiv \kappa_{1,0} + \lambda^2 \kappa_{1,2} + \cdots, \tag{204} \]
each of $m_1(a)$ and $w_1(a)$ is analytic in $\lambda$ for small $\lambda$, with the expansion
\[ m_1(a) = \frac{1}{a} M_{\kappa_1, l+1/2}(2\alpha a) \sim m_{1,0}(a) + \lambda\, m_{1,1}(a) + \cdots, \tag{205} \]
\[ w_1(a) = \frac{1}{a} W_{\kappa_1, l+1/2}(2\alpha a) \sim w_{1,0}(a) + \lambda\, w_{1,1}(a) + \cdots. \tag{206} \]
The asymptotics in this case is also differentiable with respect to $a$ and we get similar expressions as above for $m_1'(a)$ and $w_1'(a)$. It follows that the expression for $f_0$ in (55) also possesses a regular series expansion in $\lambda$:
\[ f_0(a) = f_{0,0}(a) + \lambda f_{0,1}(a) + \cdots. \tag{207} \]
To simplify $A$ as in (56) for small $\lambda$, we now consider the asymptotics of $w_2(a)$ and $w_2'(a)$ for small $\lambda$.

5.6. Asymptotics of $w_2(a)$, $w_2'(a)$ for small $\lambda$.

Since $w_2(a) = \frac{1}{a} W_{\kappa, l+1/2}(2\lambda a)$, with $\kappa = b/(2\lambda)$, it follows from formula (13.1.33) and analytic continuation to larger values of $\kappa$ of (13.2.5) of [1], p. 505 and the identity $\Gamma(x)\Gamma(1 - x) = \pi / \sin[\pi x]$ that
\[ w_2(a) = -\frac{e^{-i\pi(l - \kappa)} e^{-\lambda a} (2\lambda a)^{l+1}\, \Gamma(\kappa - l)}{2\pi i a}\, H(2\lambda a; \kappa, l), \quad \text{where } H(z; \kappa, l) = \int_C e^{-zt}\, t^{l - \kappa} (1 + t)^{l + \kappa}\, dt, \tag{208} \]
where the contour $C$ starts at $\infty e^{i0}$, circles around the origin once counter-clockwise to the right of $t = -1$ and goes to $\infty e^{i2\pi}$. In defining the integrand, we choose $\arg t \in [0, 2\pi]$, $\arg(1 + t) \in (-\pi, \pi]$ so that there is no branch cut on the real axis between $-1$ and $0$. It follows from (208) that
\[ w_2'(a) = \left( -\lambda + \frac{l}{a} \right) w_2(a) + \frac{e^{-i\pi(l - \kappa)} e^{-\lambda a} (2\lambda)^{l+2} a^l\, \Gamma(\kappa - l)}{2\pi i}\, H_1(2\lambda a; \kappa, l), \quad \text{where } H_1(z; \kappa, l) = \int_C e^{-zt}\, t^{l - \kappa + 1} (1 + t)^{l + \kappa}\, dt. \tag{209} \]
We now seek to determine $H(2\lambda a; b/(2\lambda), l)$ and $H_1(2\lambda a; b/(2\lambda), l)$ asymptotically for small $\lambda$. For that purpose it is convenient to define
\[ \epsilon_2 = 2\lambda \left( \frac{a}{b} \right)^{1/2}, \quad \tau = \epsilon_2 t, \quad P(\tau; \epsilon_2) = -\frac{1}{\epsilon_2} \log\left( 1 + \frac{\epsilon_2}{\tau} \right) + \tau, \tag{210} \]


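where we use the principal branch of log in defining $P(\tau; \epsilon_2)$ above. Then, noting that in the definition of $\log \tau$ and $\log(\tau + \epsilon_2)$, $\arg \tau \in [0, 2\pi)$ and $\arg(\tau + \epsilon_2) \in (-\pi, \pi]$, we have
\[ \log \tau - \log(\tau + \epsilon_2) = -\log\left( 1 + \frac{\epsilon_2}{\tau} \right) \]
for $\tau$ in the upper-half plane, while for $\tau$ in the lower-half plane, we have
\[ \log \tau - \log(\tau + \epsilon_2) = i2\pi - \log\left( 1 + \frac{\epsilon_2}{\tau} \right). \]
It is readily checked that
\[ H\!\left( 2\lambda a; \frac{b}{2\lambda}, l \right) = \frac{b^{l+1/2}}{2^{2l+1} \lambda^{2l+1} a^{l+1/2}} \left\{ \int_{C_1} \tau^l (\tau + \epsilon_2)^l \exp\left[ -\sqrt{ab}\, P(\tau; \epsilon_2) \right] d\tau + \exp\left[ -\frac{i\pi b}{\lambda} \right] \int_{C_2} \tau^l (\tau + \epsilon_2)^l \exp\left[ -\sqrt{ab}\, P(\tau; \epsilon_2) \right] d\tau \right\}, \tag{211} \]
\[ H_1\!\left( 2\lambda a; \frac{b}{2\lambda}, l \right) = \frac{b^{l+1}}{2^{2l+2} \lambda^{2l+2} a^{l+1}} \left\{ \int_{C_1} \tau^{l+1} (\tau + \epsilon_2)^l \exp\left[ -\sqrt{ab}\, P(\tau; \epsilon_2) \right] d\tau + \exp\left[ -\frac{i\pi b}{\lambda} \right] \int_{C_2} \tau^{l+1} (\tau + \epsilon_2)^l \exp\left[ -\sqrt{ab}\, P(\tau; \epsilon_2) \right] d\tau \right\}. \tag{212} \]
Here $C_1$ is a contour in the upper-half complex $\tau$-plane from $+\infty$ to $-\epsilon_2$ along a steepest descent line, passing through the saddle point $\tau_{s,1} = i(1 + o(1))$, where $P'(\tau_{s,1}; \epsilon_2) = 0$. The contour $C_2$ is the steepest descent line in the lower-half $\tau$-plane from $\tau = -\epsilon_2$ to $+\infty$ through the saddle point $\tau_{s,2} = -i(1 + o(1))$, where $P'(\tau_{s,2}; \epsilon_2) = 0$. We rewrite $w_2$ and $w_2'$ as
\[ w_2(a) = \frac{(-1)^{l+1} e^{-\lambda a}\, b^{l+1/2}\, \Gamma(\kappa - l)}{2^{l+1} \sqrt{a}\, \lambda^l} \left[ Z M_1(\sqrt{ab}, \epsilon_2) + Z^{-1} M_2(\sqrt{ab}, \epsilon_2) \right], \tag{213} \]
where
\[ M_1(\zeta, \epsilon_2) = \frac{1}{\pi i} \int_{C_1} e^{-\zeta P(\tau; \epsilon_2)}\, \tau^l (\tau + \epsilon_2)^l\, d\tau, \quad M_2(\zeta, \epsilon_2) = \frac{1}{\pi i} \int_{C_2} e^{-\zeta P(\tau; \epsilon_2)}\, \tau^l (\tau + \epsilon_2)^l\, d\tau, \tag{214} \]
\[ w_2'(a) = \left( -\lambda + \frac{l}{a} \right) w_2(a) + \frac{(-1)^l e^{-\lambda a}\, b^{l+1}\, \Gamma(\kappa - l)}{2^{l+1} a\, \lambda^l} \left[ Z M_3(\sqrt{ab}) + Z^{-1} M_4(\sqrt{ab}) \right], \tag{215} \]
where
\[ M_3(\zeta, \epsilon_2) = \frac{1}{\pi i} \int_{C_1} e^{-\zeta P(\tau; \epsilon_2)}\, \tau^{l+1} (\tau + \epsilon_2)^l\, d\tau \quad \text{and} \quad M_4(\zeta, \epsilon_2) = \frac{1}{\pi i} \int_{C_2} e^{-\zeta P(\tau; \epsilon_2)}\, \tau^{l+1} (\tau + \epsilon_2)^l\, d\tau. \tag{216} \]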


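It follows that, with $\epsilon_2 = 2\lambda\sqrt{a/b}$, we have
\[ \frac{w_2'(a)}{w_2(a)} = -\lambda + \frac{l}{a} - \frac{b^{1/2}}{a^{1/2}} \left[ \frac{Z^2 M_3(\sqrt{ab}, \epsilon_2) + M_4(\sqrt{ab}, \epsilon_2)}{Z^2 M_1(\sqrt{ab}, \epsilon_2) + M_2(\sqrt{ab}, \epsilon_2)} \right]. \tag{217} \]

5.6.1. Analyticity in $\epsilon_2$

Proposition 57. The functions $M_i(\sqrt{ab}, \cdot)$, $i = 1, \ldots, 4$, are analytic near zero.

Proof. We look at $M_1$, the others being similar. We can make a change of variable
\[ q = P(\tau; \epsilon_2) - P(\tau_{s,1}; \epsilon_2), \tag{218} \]
where the function $q$ is real on the steepest descent contour and changes monotonically from $\infty$ to $0$ as we move from $+\infty$ to $\tau = \tau_{s,1}$, and then increases monotonically again from $0$ to $\infty$ as we move along the steepest descent path from $\tau = \tau_{s,1}$ to $\tau = -\epsilon_2$. We denote the two branches of the inverse function $\tau(q)$ in (218) by $\tau_1(q)$ and $\tau_2(q)$. Noting that
\[ \frac{d}{d\tau} P(\tau; \epsilon_2) = \frac{1}{\tau(\tau + \epsilon_2)} + 1, \]
we have
\[ M_1(\zeta, \epsilon_2) = e^{-\zeta \tau_{s,1}} \left[ \int_0^\infty e^{-\zeta q} \left[ \frac{\tau_2^{l+1} (\tau_2 + \epsilon_2)^{l+1}}{\tau_2^2 + 1 + \epsilon_2 \tau_2} \right] dq - \int_0^\infty e^{-\zeta q} \left[ \frac{\tau_1^{l+1} (\tau_1 + \epsilon_2)^{l+1}}{\tau_1^2 + 1 + \epsilon_2 \tau_1} \right] dq \right]. \tag{219} \]
It is easy to check that $(\tau_i - \tau_{s,1})^2$ is analytic for small $\epsilon_2$, regular in $q$ and nonzero at $\epsilon_2 = 0$ for all $q$. Furthermore, the integrands in (219) are clearly bounded by an $L^1$ function uniformly in $\epsilon_2$ (see (210) and (218)), ensuring $\epsilon_2$-analyticity of the integrals. Returning to the original variable $\tau$ we get
\[
\begin{aligned}
M_1(\sqrt{ab}, 0) &= \frac{1}{\pi i} \int_{C_{1,0}} e^{-\sqrt{ab} \left( -\frac{1}{\tau} + \tau \right)}\, \tau^{2l}\, d\tau
= \frac{1}{\pi} \int_0^\pi \exp\left\{ i \left[ (2l + 1)\theta - 2\sqrt{ab} \sin\theta \right] \right\} d\theta \\
&= J_{2l+1}\!\left( 2\sqrt{ab} \right) - i \left[ Y_{2l+1}\!\left( 2\sqrt{ab} \right) + G_{2l+1}\!\left( 2\sqrt{ab} \right) \right]
\end{aligned} \tag{220}
\]
and
\[
\begin{aligned}
M_2(\sqrt{ab}, 0) &= \frac{1}{\pi i} \int_{C_{2,0}} e^{-\sqrt{ab} \left( -\frac{1}{\tau} + \tau \right)}\, \tau^{2l}\, d\tau
= \frac{1}{\pi} \int_{-\pi}^0 \exp\left\{ i \left[ (2l + 1)\theta - 2\sqrt{ab} \sin\theta \right] \right\} d\theta \\
&= J_{2l+1}\!\left( 2\sqrt{ab} \right) + i \left[ Y_{2l+1}\!\left( 2\sqrt{ab} \right) + G_{2l+1}\!\left( 2\sqrt{ab} \right) \right],
\end{aligned} \tag{221}
\]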


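where $J_{2l+1}$ and $Y_{2l+1}$ are the usual Bessel functions of order $2l + 1$ and
\[ G_{2l+1}(\nu) \equiv \frac{1}{\pi} \int_0^\infty \left\{ \exp[(2l + 1)t] + (-1)^{2l+1} \exp[-(2l + 1)t] \right\} e^{-\nu \sinh t}\, dt = \frac{2}{\pi} \int_0^\infty \sinh((2l + 1)t)\, e^{-\nu \sinh t}\, dt, \tag{222} \]
\[ \tau_{s,1} = i \sqrt{1 - \frac{\epsilon_2^2}{4}} - \frac{\epsilon_2}{2} = i + \text{Series in } \epsilon_2. \tag{223} \]
Thus, asymptotically, to the leading order in $\lambda$, we have with $\nu = 2\sqrt{ab}$,
\[ -\sqrt{\frac{a}{b}}\, \frac{w_2'(a)}{w_2(a)} = \frac{Z^2 \left( J_{2l+2}(\nu) - iY_{2l+2}(\nu) - iG_{2l+2}(\nu) \right) + \left( J_{2l+2}(\nu) + iY_{2l+2}(\nu) + iG_{2l+2}(\nu) \right)}{Z^2 \left( J_{2l+1}(\nu) - iY_{2l+1}(\nu) - iG_{2l+1}(\nu) \right) + \left( J_{2l+1}(\nu) + iY_{2l+1}(\nu) + iG_{2l+1}(\nu) \right)} \times (1 + O(\lambda)). \tag{224} \]
The discussion on $w_2'(a)/w_2(a)$ shows that
\[ A = \frac{f_0(a)\, w_2'(a) - f_0'(a)\, w_2(a)}{m_1(a)\, w_2'(a) - m_1'(a)\, w_2(a)} \tag{225} \]
is an analytic function of the extended parameter set $X$ for $X = (\sqrt{p}, Z) \in D^+ \times D$ as long as the denominator for $A$ is nonvanishing as $\lambda \to 0$. We can prove it is nonvanishing by simplifying the leading order expression in $\lambda$ for $w_2'(a)/w_2(a)$, defined as $w_{2,0}'(a)/w_{2,0}(a)$, under the further assumption that $a$ and $c$ (as in the definition of $\beta$) are sufficiently large.

5.6.2. Further simplification for large $a$.

For large $a$, there is additional simplification since
\[ J_{2l+1}\!\left( 2\sqrt{ab} \right) \pm iY_{2l+1}\!\left( 2\sqrt{ab} \right) \sim \frac{(-1)^{l+1}}{\pi^{1/2} a^{1/4} b^{1/4}} \exp\left[ \pm i \left( 2\sqrt{ab} + \frac{\pi}{4} \right) \right], \tag{226} \]
\[ J_{2l+2}\!\left( 2\sqrt{ab} \right) \pm iY_{2l+2}\!\left( 2\sqrt{ab} \right) \sim \frac{(-1)^{l+1}}{\pi^{1/2} a^{1/4} b^{1/4}} \exp\left[ \pm i \left( 2\sqrt{ab} - \frac{\pi}{4} \right) \right], \tag{227} \]
and from Watson's Lemma, we get
\[ G_{2l+1}\!\left( 2\sqrt{ab} \right) = O(1/a), \quad G_{2l+2}\!\left( 2\sqrt{ab} \right) = O(1/\sqrt{a}). \tag{228} \]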
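The orders in (228) can be checked numerically. The following sketch assumes the definition in (222), extended to even order $n$ by the same formula $\frac{1}{\pi}\int_0^\infty [e^{nt} + (-1)^n e^{-nt}]\, e^{-\nu \sinh t}\, dt$, and takes $l = 0$. Expanding the integrand near $t = 0$ (Watson's Lemma) suggests $G_1(\nu) \approx \frac{2}{\pi}\nu^{-2}$ and $G_2(\nu) \approx \frac{2}{\pi}\nu^{-1}$, which with $\nu = 2\sqrt{ab}$ correspond to $O(1/a)$ and $O(1/\sqrt{a})$. The function name `G`, the truncation point `T` and the step count are choices made for this illustration only.

```python
import math

def G(n, nu, T=3.0, steps=30000):
    """(1/pi) * int_0^T [e^{nt} + (-1)^n e^{-nt}] e^{-nu sinh t} dt by Simpson's rule.

    The integrand is negligible beyond t = T for the values of nu used below.
    """
    sign = -1.0 if n % 2 else 1.0

    def f(t):
        return (math.exp(n * t) + sign * math.exp(-n * t)) * math.exp(-nu * math.sinh(t))

    h = T / steps
    total = f(0.0) + f(T)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / (3.0 * math.pi)

nu = 50.0
odd_ratio = G(1, nu) * (math.pi / 2.0) * nu ** 2   # should be close to 1
even_ratio = G(2, nu) * (math.pi / 2.0) * nu       # should be close to 1
print(odd_ratio, even_ratio)
```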

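It follows that for large $a$,
\[ \frac{w_{2,0}'(a)}{w_{2,0}(a)} \sim \frac{r_2}{a} \left( \frac{n_1 - Z^2}{Z^2 + n_1} \right) \left( 1 + O(a^{-1/2}) \right), \tag{229} \]
where
\[ n_1 = i e^{4i\sqrt{ba}}, \quad Z = \exp\left[ \frac{i\pi b}{2\lambda} \right], \quad r_2 = i\sqrt{ab}. \tag{230} \]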

5.6.3. Nonvanishing of the denominator of A in (225) Now, defining m = m 1 (a), m = m 1 (a), f = f 0 (a), f = f 0 (a), we have to the leading order in λ, for large a, n 1 − Z 2 + O(λ) − af r2 n 1 + Z 2 + O(λ, a −1/2 ) −A= 3 n 1 − Z 2 + O(λ) − am − r2 m n 1 + Z 2 + O(λ, a −1/2 ) 4r2

f 4r2 (n 1 − Z 2 ) + O(λ) − 4a f n 1 + Z 2 + O(λ)

. =

m 4r2 (n 1 − Z 2 ) + O(λ) − 4am n 1 + Z 2 + O(λ) The denominator of A is m m 2 + 4r2 + O(λ) + n 1 4a − 4r2 + O(λ) . D = −m Z 4a m m

(231)

(232)

(233)

We note that

1 b m ≡ m 1 (a) = M b ,l+1/2 (2αa) = e−αa (2α)l+1 a l M l +1− , 2l +2, 2αa a 2α 2α $ % αa e ∼ (2λ)l+1 a l for α large, 1 + O αa)−1 b l + 1 − 2α

(234)

and for large α in the fourth quadrant % $ eαa −1 , 1 + O(αa) b l + 1 − 2α $ π% α = λ2 − ic → c1/2 exp −i as λ → 0. 4 Therefore, D can be zero for large enough c (i.e. large β) only if √ % $ √ Z 2 2 2ac1/2 (1 − i) 1 + O (ca)−1 + 4i ba √ % $ √ = −n 1 2 2ac1/2 (1 − i) 1 + O (ca)−1 − 4i ba + O(λ). m ≡ m 1 (a) ∼ (2α)l+1 a l α

(235) (236)

Taking the absolute square of both sides, we obtain, √ √ √ |Z |2 [2a 2c]2 1 + O(c−1 a −1 + [4 ab − 2a 2c 1 + O(c−1 a −1 ]2 √ √ √ = [2a 2c]2 1 + O(c−1 a −1 + [4 ab + 2a 2c(1 + O(c−1 a −1 )]2 + O(λ). This is impossible, since |Z | ≤ 1. This means that for large enough a and c (that is, β large), D cannot be zero. It means that the resolvent is well-defined as p = 0 is approached from the closure of H.
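The elementary inequality behind the last step can be made concrete. The sketch below drops all $O(\cdot)$ corrections and uses sample values of $a$, $b$, $c$ chosen only for illustration; writing $X = 2a\sqrt{2c}$ and $Y = 4\sqrt{ab}$, it compares the bracket multiplying the power of $|Z|$ on the left, $X^2 + (Y - X)^2$, with the right side, $X^2 + (Y + X)^2$. Their difference is $4XY > 0$, so the factor of $|Z|$ would have to exceed 1.

```python
import math

def brackets(a, b, c):
    """Leading-order left and right brackets of the absolute-square relation."""
    X = 2.0 * a * math.sqrt(2.0 * c)   # 2a*sqrt(2c)
    Y = 4.0 * math.sqrt(a * b)         # 4*sqrt(ab)
    left = X ** 2 + (Y - X) ** 2       # |X(1 - i) + iY|^2
    right = X ** 2 + (Y + X) ** 2      # |X(1 - i) - iY|^2
    return left, right

def complex_check(a, b, c):
    """Same two quantities computed directly with complex arithmetic."""
    X = 2.0 * a * math.sqrt(2.0 * c)
    Y = 4.0 * math.sqrt(a * b)
    return abs(X * (1 - 1j) + 1j * Y) ** 2, abs(X * (1 - 1j) - 1j * Y) ** 2

# sample (a, b, c) values, assumptions for illustration only
samples = [(5.0, 1.0, 10.0), (20.0, 0.5, 40.0), (100.0, 2.0, 100.0)]
ratios = []
for a, b, c in samples:
    left, right = brackets(a, b, c)
    ratios.append(right / left)
print(ratios)  # every ratio exceeds 1
```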


Note 58. Note that the denominator of A in (232) vanishes at points in the region |Z | > 1, where, as a result, the resolvent Rβ has poles. From the relation between Z and p, it follows that p = 0 is an accumulation point of a sequence of poles in the left half plane approaching zero tangentially to iR.

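5.7. Stationary phase analysis needed to calculate the ionization rate.

We know that the solution $\hat y(is, x)$ is analytic in the extended parameter $(\sqrt{is}, Z)$, where
\[ Z = \exp\left[ i\pi b / (2\sqrt{s}) \right]. \tag{237} \]
So, for $X = (\sqrt{is}, Z) \in D^+ \times D$,
\[ \hat y(is, x) = \sum_{l=0}^\infty s^{l/2} F_l(Z). \tag{238} \]
Consider
\[ G(s) \equiv \sum_{l=4}^\infty s^{l/2} F_l\!\left( \exp\left[ \frac{i\pi b}{2\sqrt{s}} \right] \right). \tag{239} \]
It is clear that $G(s)$ is a $C^1$ function of $s$ in $[-a, a]$. Integration by parts gives
\[ \int_{-a}^a G(s)\, e^{ist}\, ds = O(t^{-1}). \tag{240} \]
Now note that
\[ F_l\!\left( \exp\left[ \frac{i\pi b}{2\sqrt{s}} \right] \right) = \sum_{j \ge 0} D_{j,l} \exp\left[ i \frac{\pi b j}{2\sqrt{s}} \right] \tag{241} \]
with $D_{j,l}$ decreasing exponentially with $j$, because of analyticity of $F_l(Z)$ for $|Z| \le 1$. For $0 \le l \le 3$, it follows that there exist constants $c$ and $C$ independent of $j$ so that
\[ \sum_{l=0}^3 |D_{j,l}| \le C e^{-cj}. \tag{242} \]
It follows that for large $t$, we have
\[ \left| \sum_{l=0}^3 \sum_{j=[\sqrt{t}]+1}^\infty D_{j,l} \int_{-a}^a \exp\left[ \frac{i\pi b j}{2\sqrt{s}} \right] e^{ist}\, s^{l/2}\, ds \right| \le C_1 e^{-c\sqrt{t}}. \tag{243} \]
Further, for large $t$,
\[ \left| \sum_{l=0}^3 D_{0,l} \int_{-a}^a e^{ist}\, s^{l/2}\, ds \right| \le \frac{C_2}{t}. \tag{244} \]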


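Therefore,
\[
\begin{aligned}
\int_{-a}^a \sum_{l=0}^3 s^{l/2} F_l(Z)\, e^{ist}\, ds &= \sum_{0 \le l \le 3} \int_{-a}^a s^{l/2} F_l(Z)\, e^{ist}\, ds \\
&\sim \sum_{0 \le l \le 3} \sum_{j=1}^{[\sqrt{t}]} D_{j,l} \int_{-a}^a s^{l/2} \exp\left[ i \left( st + j \frac{\pi b}{2\sqrt{s}} \right) \right] ds + O\!\left( \frac{1}{t} \right).
\end{aligned} \tag{245}
\]
We first evaluate the terms of the form
\[ \int_{-a}^a s^{l/2}\, e^{its + i d_j / \sqrt{s}}\, ds \tag{246} \]
for large $t$, where
\[ d_j = \frac{j\pi b}{2}. \]
The contribution from $\int_{-a}^0$ is obviously small, at most $O(1/t)$, uniformly for all $t$, since the integrand vanishes exponentially as $s \to 0^-$. So we only consider, for $1 \le j \le \sqrt{t}$,
\[ \int_0^a s^{l/2}\, e^{its + i d_j s^{-1/2}}\, ds. \tag{247} \]
We have a point of stationary phase at $s = s_{0,j}$, where
\[ s_{0,j} = \left( \frac{d_j}{2t} \right)^{2/3}. \tag{248} \]
Note that $s_{0,j} \ll 1$ for $t$ large since $j$ is restricted to $j \le \sqrt{t}$. It is then convenient to rescale $s = s_{0,j}\, q$, to obtain
\[ s_{0,j}^{1+l/2} \int_0^{a/s_{0,j}} \exp\left[ i\nu_j \left( q + 2q^{-1/2} \right) \right] q^{l/2}\, dq, \quad \text{where } \nu_j = 2^{-2/3} d_j^{2/3} t^{1/3}. \tag{249} \]
Using standard stationary phase arguments we obtain that, for large $t$, and hence large $\nu_j$,
\[ \left| s_{0,j}^{1+l/2} \int_0^{a/s_{0,j}} \exp\left[ i\nu_j \left( q + 2q^{-1/2} \right) \right] q^{l/2}\, dq - \frac{\sqrt{2\pi}\, s_{0,j}^{1+l/2}\, e^{i\nu_j}}{\sqrt{\nu_j}}\, e^{-i\pi/4} \right| \le C\, \frac{s_{0,j}^{1+l/2}}{\nu_j}. \tag{250} \]
For large $t$, the dominant contribution comes from the term with $l = 0$ and so
\[ \left| \int_{-a}^a \hat y(is, x)\, e^{ist}\, ds - \sum_{j=0}^{\sqrt{t}} D_{j,0}\, \frac{\sqrt{2\pi}\, s_{0,j}\, e^{i\nu_j}}{\sqrt{\nu_j}}\, e^{-i\pi/4} \right| \le C t^{-1} \sum_{j=0}^{\sqrt{t}} e^{-cj} \le C_1 t^{-1}. \tag{251} \]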


The sum over $j$ is clearly convergent because of the exponential decay of $D_{j,0}$; hence $\sqrt{t}$ in the upper limit can be replaced by $\infty$. From the definition of $s_{0,j}$ and $\nu_j$, it follows that
\[ \int_{-a}^a \hat y(is, x)\, e^{ist}\, ds = O\!\left( t^{-5/6} \right). \tag{252} \]
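The scaling that leads to the $t^{-5/6}$ rate can be verified with a few lines of arithmetic. The sketch below uses only the definitions in (247)–(249): the phase $ts + d\, s^{-1/2}$, its stationary point $s_0 = (d/2t)^{2/3}$, and $\nu = 2^{-2/3} d^{2/3} t^{1/3}$. It checks that the phase derivative vanishes at $s_0$, that the rescaling $s = s_0 q$ turns the phase into $\nu(q + 2q^{-1/2})$, and that the size $s_0/\sqrt{\nu}$ of the $l = 0$ term decays like $t^{-5/6}$. The numerical values of `d` and `t` are arbitrary choices for illustration.

```python
import math

def stationary_point(d, t):
    return (d / (2.0 * t)) ** (2.0 / 3.0)

def nu(d, t):
    return 2.0 ** (-2.0 / 3.0) * d ** (2.0 / 3.0) * t ** (1.0 / 3.0)

def phase(s, d, t):
    return t * s + d / math.sqrt(s)

def dphase(s, d, t):
    # derivative of the phase with respect to s
    return t - 0.5 * d * s ** -1.5

def leading_size(d, t):
    # magnitude s0 / sqrt(nu) of the l = 0 stationary-phase term
    return stationary_point(d, t) / math.sqrt(nu(d, t))

d, t = 3.0, 1.0e4           # arbitrary sample values
s0 = stationary_point(d, t)
q = 0.7
rescaled_gap = abs(phase(s0 * q, d, t) - nu(d, t) * (q + 2.0 / math.sqrt(q)))
exponent = math.log(leading_size(d, 8.0 * t) / leading_size(d, t)) / math.log(8.0)
print(dphase(s0, d, t), rescaled_gap, exponent)
```

The printed exponent is the observed power of $t$ in `leading_size`, obtained from the ratio at $8t$ and $t$.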

At all other singular points, $p = in\omega$, $n \in \mathbb{Z}$, the behavior is similar, and a similar calculation gives an $e^{in\omega t}\, t^{-5/6}$ contribution. Since $Y \in H$, there is sufficient decay in $n$ to ensure that the sum over all such contributions is convergent.

5.8. Calculation of $j_k$.

Substituting the explicit expressions for $m_k(r)$ and $m_{(k-1)}(r)$, it may be checked that in both cases, $\tau = 0$ and $\tau = 1$, corresponding to (i) and (ii) respectively (2)

(1)

(0)

jk = k 2 α 2 s(r ) jk + kα 2 s(r ) jk + jk , where (253) √ H (αk)H (ζ − ζ /k) H (ζ ) l(l + 1) 4 (r ) H (ζ ) 4 + − jk(2) = 2 2 1 − − , α s H (α(k − 1))H (ζ ) H (ζ ) ζ2 αs(r ) H (ζ ) H (αk)H (ζ − ζ /k)) b 2(1 − 2τ ) (1) 1 − + jk = − 2 2 α s H (α(k − 1))H (ζ ) αζ ! √ 2τ ωs H (ζ ) + − − + √ , αs 2α H (ζ ) 2 α (1 + 2τ ) ω2 s3 5s 2 ωs2 s − ωs + − (ωn 0 − i p1 )s, − − 162 43/2 4 4 16 1√ where s(r ) = r (s)ds, ζ = kαr and jk := s[Lk m k − m k−1 ]/m k . Recall that H (ζ ) satisfies ω (0)(1 + 2ζ ) τ l(l + 1) b + H H, (254) H = 2 1 − + + − 2kα 2 4kα(0) 2k ζ2 αζ k jk(0) =

where

√ α=2

(0) , s(0)

(255)

and that H (ζ ) has the following asymptotic behavior: l(l +1) b log ζ 1 , (ζ, k → ∞, , ζ ≤ kα) . H (ζ ) ∼ 1+ + log ζ + O , 2ζ 2kα kζ ζ 2 (2)

(1)

(256)

Now, we claim that for any r ∈ (0, 1), | jk + k −1 jk | ≤ Ck −2 . In the regime r 1, we use Taylor expansion: 1 1 ζ ζ , s = s(0) − (0) (257) = (0) + (0) + O + O kα k2 kα k2 √ (2) (1) and substitute r = ζ /(kα) in jk + k −1 jk ; we then use α = 2 (0)/s(0), (254) (2) and the asymptotic behavior (256) to evaluate H (αk) and H (α(k − 1)) to find jk +


$k^{-1} j_k^{(1)} \sim k^{-2} g(\zeta)$ for some bounded differentiable function $g(\zeta)$, with asymptotic behavior $g(\zeta) \sim \text{const.}/\zeta$ for large $\zeta$. When $r$ is not small, we use the asymptotic behavior (256) to evaluate all terms involving the function $H$ and to find the same inequality $|j_k^{(2)} + k^{-1} j_k^{(1)}| \le Ck^{-2}$. Therefore, $j_k(r) = O(1)$ in all regimes. Further, it is easily checked that in the regime $k \gg \zeta \gg 1$, $j_k(r) = O(1, \zeta^{-1}) = O(1/(kr), 1)$. Since the asymptotics is differentiable (since $H$ satisfies a second order differential equation), it follows that $j_k'(r) = O(k^{-1} r^{-2}, 1)$. When $r$ is not small, using (256), it is readily checked that $j_k' = O(1)$.

5.9. Generalizations.

In fact, the same asymptotic arguments hold more generally if
\[ V(t, x) = \sum_{j=-M}^{M} e^{ij\omega t}\, \Omega_j(r) \]
with $\Omega_j(r)$ satisfying the conditions we used for $\Omega$. We substitute for $r = O(1)$,
\[ g_{n_0 - k}(r) = \frac{c_*}{\Gamma(2k/M + 1)} \exp\left[ k \log f_0(r) + \sum_{j=1}^{M} k^{1 - j/M} f_j(r) \right], \]

and calculate the error term Rk as before. By requiring that the O(k 2−2 j/M ) terms vanish for j = 0, .., M, we obtain (M + 1) first order differential equations for f j . To leading order 1 2/M f 0 (r ) = −M (s)ds . r

The expressions for f j (r ) for j 1 are more complicated and involve arbitrary constants to be determined from the information for small k at r = 1. Again because of the presence of r −2 l(l + 1) in Lk , the remainder is O(r −2 ), which is O(k 2 ) when r = O(k −1 ). We write ⎡ ⎤ M H (αkr ) . (258) gn 0 −k (r ) ∼ c∗ exp ⎣k log f 0 (r ) + k 1− j/M f j (r )⎦ (2k/M + 1) j=1

Then, if ζ = O(1), we find to leading order H (ζ ) ∼ H0 (ζ ), where l(l + 1) H0 = 0, ζ2 1 where now α = 2 −M (0)/s(0) and s(r ) = r −M (s). As for M = 1, we have to require H0 (ζ ) ∼ 1 as ζ → ∞. This leads to 2 ζ 1/2 H0 (ζ ) = e ζ K l+ 1 (ζ ). 2 π H0 − 2H0 −

For nonzero gn 0 −k , the constant multiple in (258) is expected to be nonzero. On the other hand, the asymptotic behavior as ζ ↓ 0, H0 (ζ ) ∼ c∗ ζ −l implies that the behavior at r = 0 of gn 0 −k /r is not acceptable unless every gn vanishes identically.


The analysis is likely to extend to systems with $H_C$ replaced by $H_W = -\Delta - b/r + W(r)$, where $b$ may be zero and $W(r) = O(r^{-1-\epsilon})$ for large $r$ and is in $L^\infty(\mathbb{R}^3)$. Under these assumptions, $W(r)$ does not participate in the asymptotics, to the orders relevant to the proofs.

5.10. Further remarks on the asymptotics.

Remark 59. A weaker statement than Theorem 4 suffices to complete the proof of Theorem 1. For instance, it suffices to show that for sufficiently large $j$, $|R_{k_j}| < 1$, where $r^{l+1} v_{n_0 - k_j}(r) = i^{k_j} r^l m_{k_j}(r) [1 + R_{k_j}(r)]$.

Remark 60. Stronger results than those in Proposition 36 hold. Noting that for any integer $q \ge 0$ we have
\[ \left\| A_{k_j + q} \cdots A_{k_j + 2} A_{k_j + 1} A_{k_j} [\tilde h - 1] \right\|_\infty \le \prod_{q' = 0}^{\infty} \left( 1 + \frac{c_*}{(k_j + q')^2} \right) \left\| \tilde h - 1 \right\|_\infty, \]
while
\[ A_{k_j + q} \cdots A_{k_j + 2} A_{k_j + 1} A_{k_j} [1] = 1 + O(k_j^{-1}), \]
and the fact that $\| H_{k_j + q} \tilde h \|_\infty \le c_* k_j^{-2}$, it follows that the sequence $\tilde h_k$, satisfying $\tilde h_k = A_k \tilde h_{k-1} + H_k \tilde h_{k+1}$, has the property $\lim_{k \to \infty} \tilde h_k = 1$. Indeed, this is in accordance with the heuristic arguments presented in § 5.9. While these results completely justify the formal asymptotics, they are not needed in the proofs and we omit the details.
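The infinite product in Remark 60 stays close to 1 for large $k_j$, which is what makes the bound useful. A minimal numerical sketch, assuming a sample constant `c = 2.0` in place of $c_*$ (its actual value is not given here): since $\log(1 + x) \le x$ and $\sum_{n \ge k} n^{-2} \le 1/(k - 1)$, the product is at most $\exp[c/(k - 1)]$, and the code compares long partial products with that bound.

```python
import math

def partial_product(c, k, terms):
    """Product over q = 0 .. terms-1 of (1 + c / (k + q)^2), accumulated in log form."""
    log_p = 0.0
    for q in range(terms):
        log_p += math.log1p(c / (k + q) ** 2)
    return math.exp(log_p)

def bound(c, k):
    # exp(c * sum_{n >= k} 1/n^2) <= exp(c / (k - 1)) for k >= 2
    return math.exp(c / (k - 1))

c = 2.0  # sample stand-in for c_*, an assumption for illustration
small_k = partial_product(c, 10, 200000)
large_k = partial_product(c, 1000, 200000)
print(small_k, bound(c, 10), large_k, bound(c, 1000))
```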

Acknowledgements. We thank R. D. Costin, S. Goldstein, W. Schlag, A. Soffer and C. Stucchio for very useful discussions. We are very grateful to Kenji Yajima for many useful comments and suggestions on earlier drafts of this paper. Work supported in part by NSF Grants DMS-0100495, DMS-0406193, DMS-0600369, DMS-0100490, DMS-0807266, DMR 01-279-26 and AFOSR grant AF-FA9550-04. O. C. and J. L. L. acknowledge the partial support from IAS and IHES and S.T. acknowledges support by the EPSRC and the Mathematics Institute at Imperial College during his 2005-2006 stay. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.


References

1. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. New York: Wiley-Interscience, 1984
2. Agmon, S.: Spectral properties of Schrödinger operators and scattering theory. Ann. Scuola. Norm. Sup. Pisa, Ser. IV 2, 151–218 (1975)
3. Agmon, S.: Analyticity properties in scattering and spectral theory for Schrödinger operators with long-range radial potentials. Duke Math. J. 68(2), 337–399 (1992)
4. Belissard, J.: Stability and instability in quantum mechanics. In: Trends and Developments in the Eighties, Albeverio, S., Blanchard, Ph. (eds.) Singapore: World Scientific, 1985, pp. 1–106
5. Bourgain, J.: On long-time behaviour of solutions of linear Schrödinger equations with smooth time-dependent potential. In: Geometric Aspects of Functional Analysis, Lecture Notes in Math. 1807, Berlin: Springer, 2003, pp. 99–113
6. Bourgain, J.: Growth of Sobolev norms in linear Schrödinger equations with quasi-periodic potential. Commun. Math. Phys. 204(1), 207–240 (1999)
7. Bourgain, J.: On growth of Sobolev norms in linear Schrödinger equations with smooth time-dependent potential. J. Anal Math. 77, 315–348 (1999)
8. Bourgain, J.: Fourier transform restriction phenomena for certain lattice subsets and applications to nonlinear evolution equations. I. Schrödinger equations. Geom. Funct. Anal 3(2), 107–156 (1993)
9. Buchholz, H.: The Confluent Hypergeometric Function. Berlin-Heidelberg-New York: Springer-Verlag, 1969
10. Costin, O., Costin, R.D., Lebowitz, J.L.: Transition to the Continuum of a Particle in Time-Periodic Potentials, Advances in Differential Equations and Mathematical Physics, AMS Contemporary Mathematics 327 ed. Karpeshina, Yu., Stolz, C., Weikard, R., Zeng, Y. Providence, RI: Amer. Math. Soc., 2003, pp. 75–86
11. Costin, O., Lebowitz, J.L., Rokhlenko, A.: Exact results for the ionization of a model quantum system. J. Phys. A: Math. Gen. 33, 1–9 (2000)
12. Costin, O., Costin, R.D., Lebowitz, J.L., Rokhlenko, A.: Evolution of a model quantum system under time periodic forcing: conditions for complete ionization. Commun. Math. Phys. 221(1), 1–26 (2001)
13. Costin, O., Rokhlenko, A., Lebowitz, J.L.: On the Complete Ionization of a Periodically Perturbed Quantum System. CRM Proceedings and Lecture Notes 27, Providence, RI: Amer. Math. Soc., 2001, pp. 51–61
14. Costin, O., Soffer, A.: Resonance theory for Schrödinger operators. Commun. Math. Phys. 224, 133–152 (2001)
15. Costin, O., Costin, R.D., Lebowitz, J.L.: Time asymptotics of the Schrödinger wave function in time-periodic potentials. J. Stat. Phys. 116(1–4), 283–310 (2004)
16. Costin, O., Lebowitz, J.L., Stucchio, C.: Ionization in a one-dimensional dipole model. Rev. Math. Phys. 7, 835–872 (2008)
17. Treves, F.: Basic Linear Partial Differential Equations. London-New York: Academic Press, 1975
18. Costin, O., Lebowitz, J.L., Stucchio, C., Tanveer, S.: Exact results for ionization of model atomic systems. J. Math Phys. 51, 015211 (2010)
19. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger Operators. Berlin-Heidelberg-New York: Springer-Verlag, 1987
20. Galtbayar, A., Jensen, A., Yajima, K.: Local time-decay of solutions to Schrödinger equations with time-periodic potentials. J. Stat. Phys. 116(1–4), 231–282 (2004)
21. Goldberg, M.: Strichartz estimates for the Schrödinger equation with time-periodic $L^{n/2}$ potentials. J. Funct. Anal. 256(3), 718–746 (2009)
22. Hislop, P.D., Sigal, I.M.: Introduction to Spectral Theory with Applications to Schrödinger Operators. Applied Mathematical Sciences 113, Berlin-Heidelberg-New York: Springer, 1996
23. Hörmander, L.: Linear Partial Differential Operators. Berlin-Heidelberg-New York: Springer, 1963
24. Howland, J.S.: Stationary scattering theory for time dependent Hamiltonians. Math. Ann. 207, 315–335 (1974)
25. Jauslin, H.R., Lebowitz, J.L.: Spectral and stability aspects of quantum Chaos. Chaos 1, 114–121 (1991)
26. Hostler, L., Pratt, R.H.: Coulomb's Green's function in closed form. Phys. Rev. Lett. 10(11), 469–470 (1963)
27. Jensen, A.: High energy resolvent estimates for generalized many-body Schrödinger operators. Publ. RIMS, Kyoto U. 25, 155–167 (1989)
28. Kato, T.: Perturbation Theory for Linear Operators. Berlin-Heidelberg-New York: Springer Verlag, 1995
29. Koch, P.M., van Leeuven, K.A.H.: The importance of resonances in microwave "Ionization" of excited hydrogen atoms. Phys. Repts. 255, 289–403 (1995)
30. Miller, P.D., Soffer, A., Weinstein, M.I.: Metastability of breather modes of time dependent potentials. Nonlinearity 13, 507–568 (2000)
31. Reed, M., Simon, B.: Methods of Modern Mathematical Physics. New York: Academic Press, 1972
32. Möller, J.S., Skibsted, E.: Spectral theory of time-periodic many-body systems. Adv. Math. 188(1), 137–221 (2004)
33. Möller, J.S.: Two-body short-range systems in a time-periodic electric field. Duke Math. J. 105(1), 135–166 (2000)
34. Rodnianski, I., Tao, T.: Long-time Decay Estimates for Schrödinger Equations on Manifolds. Ann. of Math. Stud. 163, Princeton, NJ: Princeton Univ. Press, 2007
35. Rokhlenko, A., Costin, O., Lebowitz, J.L.: Decay versus survival of a local state subjected to harmonic forcing: exact results. J. Phys. A: Mathematical and General 35, 8943 (2002)
36. Schlag, W., Rodnianski, I.: Time decay for solutions of Schrödinger equations with rough and time-dependent potentials. Invent. Math 3, 451–513 (2004)
37. Herbst, I., Möller, J.S., Skibsted, E.: Asymptotic completeness for N-body Stark Hamiltonians. Commun. Math. Phys. 174(3), 509–535 (1996)
38. Merzbacher, E.: Quantum Mechanics. 3rd ed., New York: Wiley, 1998
39. Simon, B.: Schrödinger operators in the twentieth century. J. Math. Phys. 41, 3523 (2000)
40. Slater, L.J.: Confluent hypergeometric functions. Cambridge: Cambridge University Press, 1960
41. Soffer, A., Weinstein, M.I.: Nonautonomous Hamiltonians. J. Stat. Phys. 93, 359–391 (1998)
42. Wasow, W.: Asymptotic Expansions for Ordinary Differential Equations. New York: Interscience Publishers, 1968
43. Yajima, K.: Resonances for the AC-Stark effect. Commun. Math. Phys. 87(3), 331–352 (1982/83)
44. Graffi, S., Yajima, K.: Exterior complex scaling and the AC-Stark effect in a Coulomb field. Commun. Math. Phys. 89(2), 277–301 (1983)
45. Yajima, K.: Scattering theory for Schrödinger equations with potentials periodic in time. J. Math. Soc. Japan 29, 729 (1977)
46. Yajima, K.: Existence of solutions of Schrödinger evolution equations. Commun. Math. Phys. 110, 415 (1987)

Communicated by M. Aizenman

Commun. Math. Phys. 296, 739–767 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1027-6

Communications in

Mathematical Physics

Statistical Stability and Continuity of SRB Entropy for Systems with Gibbs-Markov Structures José F. Alves, Maria Carvalho, Jorge Milhazes Freitas Centro de Matemática da Universidade do Porto, Rua do Campo Alegre 687, 4169-007 Porto, Portugal. E-mail: [email protected]; [email protected]; [email protected]

Received: 7 May 2009 / Accepted: 22 November 2009 Published online: 12 March 2010 – © Springer-Verlag 2010

Abstract: We present conditions on families of diffeomorphisms that guarantee statistical stability and SRB entropy continuity. They rely on the existence of horseshoe-like sets with infinitely many branches and variable return times. As an application we consider the family of Hénon maps within the set of Benedicks-Carleson parameters.

Contents
1. Introduction
1.1 Gibbs-Markov structure
1.2 Uniform families
1.3 Statement of results
2. Quotient Dynamics and Lifting Back
2.1 The natural measure
2.2 Lifting to the Gibbs-Markov structure
2.3 Entropy formula
3. Statistical Stability
3.1 Convergence of the densities on the reference leaf
3.2 Continuity of the SRB measures
4. Entropy Continuity
4.1 Auxiliary results
4.2 Convergence of metric entropies
References

1. Introduction

A physical measure for a smooth map f : M → M on a manifold M is a Borel probability measure µ on M for which there is a positive Lebesgue measure set of points x ∈ M, called the basin of µ, such that

$$\lim_{n\to\infty} \frac{1}{n}\sum_{j=0}^{n-1} \delta_{f^j(x)} = \mu \tag{1.1}$$

in the weak* topology, where δz stands for the Dirac measure on z ∈ M. Sinai, Ruelle and Bowen showed the existence of physical measures for Axiom A smooth dynamical systems. These were obtained as equilibrium states for the logarithm of the Jacobian along the unstable direction. Besides, such probability measures exhibit positive Lyapunov exponents and conditionals which are absolutely continuous with respect to Lebesgue measure on local unstable leaves; probability measures with the latter properties are nowadays known as Sinai-Ruelle-Bowen measures (SRB measures, for short). Statistical properties and their stability have met with wide interest, particularly in the context of dynamical systems which do not satisfy classical structural stability. This may be checked through the continuous variation of the SRB measures, referred to in [AV] as statistical stability. Another characterization of stability addresses the continuity of the metric entropy of SRB measures. Although an old issue, going back to [N] and [Y1] for example, this continuity (topological or metric) is in general a hard problem. Notice that for families of smooth diffeomorphisms verifying the entropy formula, see [LY2], and whose Jacobian along the unstable direction depends continuously on the map, the entropy continuity is an immediate consequence of the statistical stability. This holds for instance in the setting of Axiom A attractors whose statistical stability was established in [R] and [M]. The regularity of the SRB entropy for Axiom A flows was proved in [C]. Analyticity of metric entropy for Anosov diffeomorphisms was proved in [P]. More recently, statistical stability for families of partially hyperbolic diffeomorphisms with non-uniformly expanding centre-unstable direction was established in [V]. Due to the continuous variation of the centre-unstable direction in the partial hyperbolicity context, the entropy continuity follows as in the Axiom A case. 
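As a numerical illustration of the time averages in (1.1), the following minimal sketch computes a Birkhoff average for a one-dimensional quadratic map. The map x ↦ 1 − 2x² on [−1, 1], the observable φ(x) = x², the initial point and the helper names are all assumptions chosen for illustration; they are not taken from the paper. For this particular map the absolutely continuous invariant measure has density 1/(π√(1 − x²)), so the expected limit of the average of x² is 1/2.

```python
def time_average(phi, step, x0, n, burn_in=1000):
    """Birkhoff average (1/n) * sum_{j<n} phi(f^j(x)), taken after a short burn-in."""
    x = x0
    for _ in range(burn_in):
        x = step(x)
    total = 0.0
    for _ in range(n):
        total += phi(x)
        x = step(x)
    return total / n


def quadratic_map(x):
    """Illustrative interval map x -> 1 - 2 x^2 on [-1, 1] (not a Henon map)."""
    return 1.0 - 2.0 * x * x


def square(x):
    """Test observable phi(x) = x^2."""
    return x * x


# Time average of phi along one floating-point orbit (hypothetical initial point).
average = time_average(square, quadratic_map, x0=0.3, n=200_000)
print(average)
```

Testing against a single observable only probes one coordinate of the weak* convergence, and floating-point orbits are merely pseudo-orbits, so this should be read as a plausibility check rather than a proof.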
Statistical stability for Hénon maps within Benedicks-Carleson parameters has been proved in [ACF]; the entropy continuity for this family is a more delicate issue, since the lack of partial hyperbolicity, mostly due to the presence of “critical” points, originates a highly irregular behavior of the unstable direction. In the endomorphism setting, many advances have been obtained for important families of maps, for instance in [RS,T2,T1,AV,A,F,FT] concerning statistical stability, and in [AOT] for the entropy continuity. Actually, our main theorem may be regarded as a version for diffeomorphisms of the entropy continuity result in [AOT].

In this work we give sufficient conditions on families of smooth diffeomorphisms for the statistical stability and the continuous variation of the SRB entropies. The families we study here, though having directions of non-uniform expansion, do not allow the approach of the hyperbolic case, since no continuity assumptions on these directions with the map will be assumed. Instead, we consider diffeomorphisms admitting Gibbs-Markov structures as in [Y2] that may be thought of as “horseshoes” with infinitely many branches and variable return times. This is mainly motivated by the important class of Hénon maps presented in the next paragraph. Our assumptions, which have a geometrical and dynamical nature, ensure in particular the existence of SRB measures. Gibbs-Markov structures were used in [Y2] to derive decay of correlations and the validity of the Central Limit Theorem for the SRB measure. Here we prove that under some additional uniformity requirements on the family we obtain statistical stability and SRB entropy continuity.


The major application of our main result concerns the Benedicks-Carleson family of Hénon maps,

$$f_{a,b} : \mathbb{R}^2 \to \mathbb{R}^2, \qquad (x, y) \mapsto (1 - ax^2 + y,\; bx). \tag{1.2}$$
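The map (1.2) is easy to experiment with numerically. The sketch below iterates it, checks that the Jacobian determinant equals −b (which is the source of the dissipativity), and estimates the largest Lyapunov exponent by renormalizing a tangent vector along an orbit. The parameter values a = 1.4, b = 0.3 are the classical ones used in numerical studies and are an assumption made for illustration only: they are not claimed to belong to the Benedicks-Carleson parameter set, for which b must be very small. All function names are hypothetical.

```python
import math

A, B = 1.4, 0.3  # classical illustrative parameters, NOT asserted to lie in BC


def henon(x, y, a=A, b=B):
    """One step of f_{a,b}(x, y) = (1 - a x^2 + y, b x)."""
    return 1.0 - a * x * x + y, b * x


def jacobian(x, y, a=A, b=B):
    """Derivative of f_{a,b} at (x, y), as a 2x2 tuple of rows."""
    return ((-2.0 * a * x, 1.0), (b, 0.0))


def orbit(n, x=0.0, y=0.0, a=A, b=B):
    """First n points of the forward orbit of (x, y)."""
    points = []
    for _ in range(n):
        points.append((x, y))
        x, y = henon(x, y, a, b)
    return points


def largest_lyapunov(n=20000, burn_in=1000, a=A, b=B):
    """Estimate the largest Lyapunov exponent by tangent-vector renormalization."""
    x, y = 0.0, 0.0
    for _ in range(burn_in):
        x, y = henon(x, y, a, b)
    vx, vy = 1.0, 0.0
    total = 0.0
    for _ in range(n):
        (j11, j12), (j21, j22) = jacobian(x, y, a, b)
        vx, vy = j11 * vx + j12 * vy, j21 * vx + j22 * vy
        norm = math.hypot(vx, vy)
        total += math.log(norm)
        vx, vy = vx / norm, vy / norm
        x, y = henon(x, y, a, b)
    return total / n


print(largest_lyapunov())
```

For these classical parameters, numerical studies commonly report a largest exponent near 0.42; the estimate above is only a finite-time approximation along one orbit.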

For small b > 0 values, $f_{a,b}$ is strongly dissipative, and may be seen as an “unfolded” version of a quadratic interval map. It is known that for small b there is a trapping region whose topological attractor coincides with the closure of the unstable manifold W of a fixed point $z^*_{a,b}$ of $f_{a,b}$. In [BC] it was shown that for each sufficiently small b > 0 there is a positive Lebesgue measure set of parameters a ∈ [1, 2] for which $f_{a,b}$ has a dense orbit in W with a positive Lyapunov exponent, which makes this a non-trivial and strange attractor. We denote by BC the set of those parameters (a, b) and call it the Benedicks-Carleson family of Hénon maps. As shown in [BY1], each of these nonhyperbolic attractors supports a unique SRB measure $\mu_{a,b}$, whose main features were further studied in [BY2,BV1,BV2]. In [BY2] a Gibbs-Markov structure was built for each $f_{a,b}$ with (a, b) ∈ BC, which has been used to obtain statistical behavior of Hölder observables. These structures have also been used in [ACF] to deduce the statistical stability of this family. In this work we add the metric entropy continuity with respect to these measures.

1.1. Gibbs-Markov structure. Let f : M → M be the $C^k$ diffeomorphism (k ≥ 2) defined on a finite dimensional Riemannian manifold M, endowed with a normalized volume form on the Borel sets that we denote by Leb and call Lebesgue measure. Given a submanifold γ ⊂ M we use $\mathrm{Leb}_\gamma$ to denote the measure on γ induced by the restriction of the Riemannian structure to γ. An embedded disk γ ⊂ M is called an unstable manifold if $\mathrm{dist}(f^{-n}(x), f^{-n}(y)) \to 0$ as n → ∞ for every x, y ∈ γ. Similarly, γ is called a stable manifold if $\mathrm{dist}(f^{n}(x), f^{n}(y)) \to 0$ as n → ∞ for every x, y ∈ γ.

Definition 1. Let $D^u$ be the unit disk in some Euclidean space and $\mathrm{Emb}^1(D^u, M)$ be the space of $C^1$ embeddings from $D^u$ into M. We say that $\Gamma^u = \{\gamma^u\}$ is a continuous family of $C^1$ unstable manifolds if there is a compact set $K^s$ and $\Phi^u : K^s \times D^u \to M$ such that
i) $\gamma^u = \Phi^u(\{x\} \times D^u)$ is an unstable manifold;
ii) $\Phi^u$ maps $K^s \times D^u$ homeomorphically onto its image;
iii) $x \mapsto \Phi^u|(\{x\} \times D^u)$ defines a continuous map from $K^s$ into $\mathrm{Emb}^1(D^u, M)$.
Continuous families of $C^1$ stable manifolds are defined similarly.

Definition 2. We say that Λ ⊂ M has a hyperbolic product structure if there exist a continuous family of unstable manifolds $\Gamma^u = \{\gamma^u\}$ and a continuous family of stable manifolds $\Gamma^s = \{\gamma^s\}$ such that
i) $\Lambda = (\cup\gamma^u) \cap (\cup\gamma^s)$;
ii) $\dim \gamma^u + \dim \gamma^s = \dim M$;
iii) each $\gamma^s$ meets each $\gamma^u$ in exactly one point;
iv) stable and unstable manifolds meet with angles larger than some θ > 0.


Let Λ ⊂ M have a hyperbolic product structure, whose defining families are $\Gamma^s$ and $\Gamma^u$. A subset $\Upsilon_0 \subset \Lambda$ is called an s-subset if $\Upsilon_0$ also has a hyperbolic product structure and its defining families $\Gamma_0^s$ and $\Gamma_0^u$ can be chosen with $\Gamma_0^s \subset \Gamma^s$ and $\Gamma_0^u = \Gamma^u$; u-subsets are defined analogously. Given x ∈ Λ, let $\gamma^*(x)$ denote the element of $\Gamma^*$ containing x, for ∗ = s, u. For each n ≥ 1, let $(f^n)^u$ denote the restriction of the map $f^n$ to $\gamma^u$-disks, and let $\det D(f^n)^u$ be the Jacobian of $D(f^n)^u$. In the sequel C > 0 and 0 < β < 1 are constants, and we require the following properties from the hyperbolic product structure Λ:

(P0). Positive measure: for every $\gamma \in \Gamma^u$ we have $\mathrm{Leb}_\gamma(\Lambda \cap \gamma) > 0$.

(P1). Markovian: there are pairwise disjoint s-subsets $\Upsilon_1, \Upsilon_2, \dots \subset \Lambda$ such that
(a) $\mathrm{Leb}_\gamma((\Lambda \setminus \cup\Upsilon_i) \cap \gamma) = 0$ on each $\gamma \in \Gamma^u$;
(b) for each i ∈ N there is $\tau_i \in N$ such that $f^{\tau_i}(\Upsilon_i)$ is a u-subset, and for all $x \in \Upsilon_i$, $f^{\tau_i}(\gamma^s(x)) \subset \gamma^s(f^{\tau_i}(x))$ and $f^{\tau_i}(\gamma^u(x)) \supset \gamma^u(f^{\tau_i}(x))$;
(c) for each n ∈ N there are finitely many i’s with $\tau_i = n$.

(P2). Contraction on stable leaves: for each $\gamma^s \in \Gamma^s$ and each $y \in \gamma^s(x)$, $\mathrm{dist}(f^n(y), f^n(x)) \le C\beta^n$, ∀n ≥ 1.

For the last two properties we introduce the return time R : Λ → N and the induced map $F = f^R : \Lambda \to \Lambda$, which are defined for each i ∈ N as $R|_{\Upsilon_i} = \tau_i$ and $f^R|_{\Upsilon_i} = f^{\tau_i}|_{\Upsilon_i}$, and, for each x, y ∈ Λ, the separation time s(x, y) is given by

$$s(x, y) = \min\{\, n \ge 0 : (f^R)^n(x) \text{ and } (f^R)^n(y) \text{ lie in distinct } \Upsilon_i\text{'s} \,\}.$$

(P3). Regularity of the stable foliation:
(a) for $y \in \gamma^s(x)$ and n ≥ 0,
$$\log \prod_{i=n}^{\infty} \frac{\det Df^u(f^i(x))}{\det Df^u(f^i(y))} \le C\beta^n;$$
(b) given $\gamma, \gamma' \in \Gamma^u$, we define $\Theta : \gamma \cap \Lambda \to \gamma' \cap \Lambda$ by $\Theta(x) = \gamma^s(x) \cap \gamma'$. Then Θ is absolutely continuous and
$$\frac{d(\Theta_* \mathrm{Leb}_\gamma)}{d\,\mathrm{Leb}_{\gamma'}}(x) = \prod_{i=0}^{\infty} \frac{\det Df^u(f^i(x))}{\det Df^u(f^i(\Theta^{-1}(x)))};$$
(c) letting v(x) denote the density in item (b), we have
$$\log \frac{v(x)}{v(y)} \le C\beta^{s(x,y)}, \quad \text{for } x, y \in \gamma' \cap \Lambda.$$

(P4). Bounded distortion: for $\gamma \in \Gamma^u$ and x, y ∈ Λ ∩ γ,
$$\log \frac{\det D(f^R)^u(x)}{\det D(f^R)^u(y)} \le C\beta^{s(f^R(x), f^R(y))}.$$
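To make the return time, the induced map and the separation time concrete, here is a small sketch on a one-dimensional toy model. It is an assumption made purely for illustration and is not the Hénon construction of [BY2]: the base is (0, 1), the partition elements are the intervals [2^{-i}, 2^{-(i-1)}), the return time on the i-th interval is i, and the induced map is the i-th iterate of the doubling map, which is the linear full branch x ↦ 2^i x − 1. Exact rational arithmetic avoids rounding issues; all names are hypothetical.

```python
from fractions import Fraction


def branch(x):
    """Index i >= 1 of the toy partition element [2^-i, 2^-(i-1)) containing x."""
    if not 0 < x < 1:
        raise ValueError("x must lie in the open interval (0, 1)")
    i = 1
    while x < Fraction(1, 2**i):
        i += 1
    return i


def induced_map(x):
    """Toy induced map F = f^R: linear full branch x -> 2^i x - 1 on element i."""
    return 2 ** branch(x) * x - 1


def separation_time(x, y, max_iter=1000):
    """Smallest n >= 0 such that F^n(x) and F^n(y) lie in distinct elements."""
    for n in range(max_iter):
        if branch(x) != branch(y):
            return n
        x, y = induced_map(x), induced_map(y)
    raise RuntimeError("points not separated within max_iter iterations")


# 3/5 and 39/64 share the itinerary (1, 3, 1) and then separate.
print(separation_time(Fraction(3, 5), Fraction(39, 64)))
```

In this toy model two points with separation time s lie in a common interval on which the s-th iterate of the induced map is linear, which is the mechanism behind bounds of the form Cβ^{s(x,y)} in (P3)(c) and (P4).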


Remark 1.1. We do not assume uniform backward contraction along unstable leaves as (P4)(a) in [Y2]. Properties (P3 )(c) and (P4 ) are new if comparing our setup to that in [Y2]. However, these are a consequence of (P4) and (P5) of [Y2] as done in [Y2, Lemma 1]. In spite of the uniform contraction on stable leaves demanded in (P2 ), this is not too restrictive in systems having regions where the contraction fails to be uniform, since we are allowed to remove stable leaves, provided a subset with positive measure of leaves remains in the end. This has been carried out for Hénon maps in [BY2].

1.2. Uniform families. Let F be a family of $C^k$ maps (k ≥ 2) from the finite dimensional Riemannian manifold M into itself, and endow F with the $C^k$ topology. Assume that each map f ∈ F admits a Gibbs-Markov structure $\Lambda_f$ as described in Sect. 1.1. Let $\Gamma_f^u = \{\gamma_f^u\}$ and $\Gamma_f^s = \{\gamma_f^s\}$ be its defining families of unstable and stable curves. Denote by $R_f : \Lambda_f \to N$ the corresponding return time function. Given $f_0 \in F$, take a sequence $f_n \in F$ such that $f_n \to f_0$ in the $C^1$ topology as n → ∞. For the sake of notational simplicity, for each n ≥ 0 we will indicate the dependence of the previous objects on $f_n$ just by means of the index or supra-index n. If $\gamma_n^u \in \Gamma_n^u$ is sufficiently close to $\gamma_0^u \in \Gamma_0^u$ in the $C^k$ topology, we may define a projection by sliding through the stable manifolds of $\Lambda_0$,

$$H_n : \gamma_n^u \cap \Gamma_0^s \to \gamma_0^u, \qquad z \mapsto \gamma_0^s(z) \cap \gamma_0^u,$$

and set

$$\Omega_0 = \gamma_0^u \cap \Lambda_0, \quad \Omega_0^n = H_n^{-1}(\Omega_0), \quad \Omega_n = \gamma_n^u \cap \Lambda_n, \quad \Omega_n^0 = H_n(\Omega_n \cap \Omega_0^n). \tag{1.3}$$

Given k ∈ N and positive integers $i_1, \dots, i_k$, we denote by $\Upsilon^n_{i_1,\dots,i_k}$ the s-sublattice that satisfies $F_n^j(\Upsilon^n_{i_1,\dots,i_k}) \subset \Upsilon^n_{i_j}$ for every 1 ≤ j < k and $F_n^k(\Upsilon^n_{i_1,\dots,i_k}) = \Upsilon^n_{i_k}$.

Definition 3. F is called a uniform family if the conditions (U0)–(U5) below hold:

(U0). Absolute constants: The constants C and β in (P2), (P3) and (P4) can be chosen the same for all f ∈ F.

(U1). Proximity of unstable leaves: There are unstable leaves $\hat\gamma_0 \in \Gamma_0^u$ and $\hat\gamma_n \in \Gamma_n^u$ such that $\hat\gamma_n \to \hat\gamma_0$ in the $C^1$ topology as n → ∞.

(U2). Matching of structures: Defining the objects of (1.3) with $\hat\gamma_n$ replacing $\gamma_n^u$, we have
$$\mathrm{Leb}_{\hat\gamma_n}\big(\Omega_n \,\triangle\, \Omega_0^n\big) \to 0, \quad \text{as } n \to \infty.$$

(U3). Proximity of stable directions: For every $z \in \Omega_n^0 \cap \Omega_0$ we have $\gamma_n^s(z) \to \gamma_0^s(z)$ in the $C^1$ topology as n → ∞.

(U4). Matching of s-sublattices: Given N, k ∈ N and $\Upsilon^0_{i_1,\dots,i_k}$ with $R_0(\Upsilon^0_{i_j}) \le N$ for 1 ≤ j ≤ k, there is $\Upsilon^n_{\ell_1,\dots,\ell_k}$ such that $R_n(\Upsilon^n_{\ell_j}) = R_0(\Upsilon^0_{i_j})$ for 1 ≤ j ≤ k and
$$\mathrm{Leb}_{\hat\gamma_0}\Big( H_n\big(\Upsilon^n_{\ell_1,\dots,\ell_k} \cap \hat\gamma_n\big) \,\triangle\, \big(\Upsilon^0_{i_1,\dots,i_k} \cap \hat\gamma_0\big) \Big) \to 0, \quad \text{as } n \to \infty.$$

(U5). Uniform tail: Given ε > 0, there are N = N(ε) and J = J(ε, N) such that
$$\sum_{j=N}^{\infty} j\, \mathrm{Leb}_{\hat\gamma_n}\{R_n = j\} < \varepsilon, \quad \forall n > J.$$

This last property ensures in particular that $\int_{\hat\gamma_n} R_n \, d\,\mathrm{Leb}_{\hat\gamma_n} < \infty$ for large n, which by [Y2, Theorem 1] implies the existence of an SRB measure for each $f_n$.

Remark 1.2. Using that stable and unstable manifolds of $f_0$ meet with angles uniformly bounded away from zero at points in $\Lambda_0$, and the proximities given by (U1) and (U3), it follows that there is some θ > 0 such that, for n large enough, the stable manifolds through points in $\Omega_0^n$ meet $\hat\gamma_n$ with an angle bigger than θ. Together with (P3) and (U1), this implies that:
i) $(H_n)_* \mathrm{Leb}_{\hat\gamma_n} \ll \mathrm{Leb}_{\hat\gamma_0}$ with uniformly bounded density;
ii) $\dfrac{d(H_n)_* \mathrm{Leb}_{\hat\gamma_n}}{d\,\mathrm{Leb}_{\hat\gamma_0}} \to 1$ in $L^1(\mathrm{Leb}_{\hat\gamma_0})$, as n → ∞.
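The uniform tail condition (U5) can be illustrated on a toy return-time law. The sketch below assumes, purely for illustration, that Leb{R = j} = 2^{-j} for every map in the family, so the same cutoff N(ε) works for all of them; this law is not derived from the Hénon family. For this law the tail has the closed form Σ_{j≥N} j 2^{-j} = (N + 1) 2^{-(N-1)}, which the code compares with truncated partial sums. The function names are hypothetical.

```python
from fractions import Fraction


def leb_return(j):
    """Toy law Leb{R = j} = 2^-j (an assumption, not the Henon tail)."""
    return Fraction(1, 2**j)


def partial_tail(N, M):
    """Truncated tail sum_{j=N}^{M} j * Leb{R = j}."""
    return sum(j * leb_return(j) for j in range(N, M + 1))


def tail(N):
    """Closed form of sum_{j>=N} j * 2^-j, namely (N + 1) / 2^(N - 1)."""
    return Fraction(N + 1, 2 ** (N - 1))


def smallest_cutoff(eps):
    """Smallest N with sum_{j>=N} j * Leb{R = j} < eps, as required in (U5)."""
    N = 1
    while tail(N) >= eps:
        N += 1
    return N


print(tail(1))                             # total integral of R for the toy law
print(smallest_cutoff(Fraction(1, 100)))   # cutoff N(eps) for eps = 1/100
```

Taking N = 1 recovers the integrability of the return time, which is the hypothesis used to obtain an SRB measure.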

1.3. Statement of results. Consider a family F such that each f ∈ F admits a unique SRB measure $\mu_f$. Letting P(M) denote the space of probability measures on M endowed with the weak* topology, we say that F is statistically stable if the map

$$F \to P(M), \qquad f \mapsto \mu_f,$$

is continuous. In the sequel $h_{\mu_f}$ denotes the metric entropy of f with respect to the measure $\mu_f$.

Theorem A. Let F be a uniform family such that each f ∈ F admits a unique SRB measure. Then
(1) F is statistically stable;
(2) $F \ni f \mapsto h_{\mu_f}$ is continuous.

Corollary B. The family BC is statistically stable and the map $BC \ni (a, b) \mapsto h_{\mu_{a,b}}$ is continuous.

This corollary follows immediately after building Gibbs-Markov structures satisfying (P0)–(P4), as was done in [BY2], and verifying the uniformity conditions (U0)–(U5), as in [ACF]. For the sake of clarity, the following list specifies exactly where each property is obtained.

(P0)     [BY2, Proposition A(3)]
(P1)     [BY2, Proposition A(1),(2)]
(P2)     [BY2, Proposition A(2)]
(P3)(a)  [BY2, Sublemma 8]
(P3)(b)  [BY2, Sublemma 10]
(P3)(c)  [BY2, Sublemma 11]
(P4)     [BY2, Sublemma 9]
(U0)     [ACF, Sects. 6,7,8]
(U1)     Hyperbolicity of the fixed point z∗
(U2)     [ACF, Sect. 6 in particular Corollary 6.4]
(U3)     [ACF, Sect. 7 in particular Proposition 7.3]
(U4)     [ACF, Sect. 8 in particular Proposition 8.9]
(U5)     [BY2, Proposition A(4)]

Concerning (U0) and (U5), observe that the constants depend exclusively on the maximum value for b > 0 and the minimum for a < 2 in the choice of Benedicks-Carleson parameters.

2. Quotient Dynamics and Lifting Back

In this section we shall analyze some dynamical features of a diffeomorphism f admitting a Gibbs-Markov structure Λ that verifies properties (P0)–(P4). Consider a quotient space $\bar\Lambda$ obtained by collapsing the stable curves of Λ; i.e. $\bar\Lambda = \Lambda/\!\sim$, where $z \sim z'$ if and only if $z' \in \gamma^s(z)$. Since by (P1)(b) the induced map $F = f^R : \Lambda \to \Lambda$ takes $\gamma^s$ leaves to $\gamma^s$ leaves, the quotient induced map $\bar F : \bar\Lambda \to \bar\Lambda$ is well defined and if $\bar\Upsilon_i$ is the quotient of $\Upsilon_i$, then $\bar F$ takes the sets $\bar\Upsilon_i$ homeomorphically onto $\bar\Lambda$. Given an unstable leaf γ, the set γ ∩ Λ is suited as a model for $\bar\Lambda$ through the canonical projection $\bar\pi : \Lambda \to \bar\Lambda$. We will see in Sect. 2.1 that we may define a natural reference measure $\bar m$ on $\bar\Lambda$. Besides, $\bar F$ is an expanding Markov map (see Lemma 2.1), thus having an absolutely continuous (w.r.t. $\bar m$), $\bar F$-invariant probability measure $\bar\mu$. Moreover, if $\tilde\mu$ denotes the F-invariant measure supported on Λ then $\bar\mu = \bar\pi_*(\tilde\mu)$. To build an SRB measure µ out of $\tilde\mu$ is just a matter of saturating the measure $\tilde\mu$. The existence of the measures $\bar\mu$, $\tilde\mu$ and the fact that $\bar\mu = \bar\pi_*(\tilde\mu)$ follow from standard methods, which can be found for instance in [Y2]. For the sake of completeness we will present the construction of the SRB measure, also having in mind how some properties can be carried up through the lifting. We will accomplish this by adapting some ideas used in the construction of Gibbs states; see [B].

2.1. The natural measure. The purpose of this subsection is to introduce a natural probability measure $\bar m$ on $\bar\Lambda$ and establish some properties of the Jacobian of $\bar F$ with respect to $\bar m$. Moreover, we show the existence of an $\bar F$-invariant density $\bar\rho$ with respect to the measure $\bar m$. Fix an arbitrary $\hat\gamma \in \Gamma^u$. The restriction of $\bar\pi$ to $\hat\gamma \cap \Lambda$ gives a homeomorphism that we denote by $\hat\pi : \hat\gamma \cap \Lambda \to \bar\Lambda$. Given $\gamma \in \Gamma^u$ and x ∈ γ ∩ Λ, let $\hat x$ be the point in $\gamma^s(x) \cap \hat\gamma$. Defining for x ∈ γ ∩ Λ,

$$\hat u(x) = \prod_{i=0}^{\infty} \frac{\det Df^u(f^i(x))}{\det Df^u(f^i(\hat x))}, \tag{2.1}$$


we have that $\hat u$ satisfies the bounded distortion property (P3)(c). For each $\gamma \in \Gamma^u$ let $m_\gamma$ be the measure in γ such that

$$\frac{dm_\gamma}{d\,\mathrm{Leb}_\gamma} = \hat u\, 1_{\gamma \cap \Lambda},$$

where $1_{\gamma \cap \Lambda}$ is the characteristic function of the set γ ∩ Λ. These measures have been defined in such a way that if $\gamma, \gamma' \in \Gamma^u$ and Θ is obtained by sliding along stable leaves from γ ∩ Λ to γ′ ∩ Λ, then

$$\Theta_* m_\gamma = m_{\gamma'}. \tag{2.2}$$

To verify this let us show that the densities of these two measures with respect to $\mathrm{Leb}_{\gamma'}$ coincide. Take x ∈ γ ∩ Λ and x′ ∈ γ′ ∩ Λ such that Θ(x) = x′. By (P3)(b) one has

$$\frac{d\Theta_* \mathrm{Leb}_\gamma}{d\,\mathrm{Leb}_{\gamma'}}(x') = \frac{\hat u(x')}{\hat u(x)},$$

which implies that

$$\frac{d\Theta_* m_\gamma}{d\,\mathrm{Leb}_{\gamma'}}(x') = \hat u(x)\, \frac{d\Theta_* \mathrm{Leb}_\gamma}{d\,\mathrm{Leb}_{\gamma'}}(x') = \hat u(x') = \frac{dm_{\gamma'}}{d\,\mathrm{Leb}_{\gamma'}}(x').$$

Conditions (P0) and (2.2) allow us to define the reference probability measure $\bar m$ whose representative in each unstable leaf $\gamma \in \Gamma^u$ is exactly $\frac{1}{\mathrm{Leb}_{\hat\gamma}(\Lambda)}\, m_\gamma$.

Let $T : (X_1, m_1) \to (X_2, m_2)$ be a measurable bijection between two probability measure spaces. T is called nonsingular if it maps sets of zero $m_1$ measure to sets of zero $m_2$ measure. For a nonsingular transformation T we define the Jacobian of T with respect to $m_1$ and $m_2$, denoted by $J_{m_1,m_2}(T)$, as the Radon-Nikodym derivative $\frac{dT^{-1}_*(m_2)}{dm_1}$. By assertion (1) of the following lemma it makes sense to consider the Jacobian of the quotient map $\bar F : (\bar\Lambda, \bar m) \to (\bar\Lambda, \bar m)$ that we simply denote $J\bar F$.

Lemma 2.1. Assuming that $F(\gamma \cap \Upsilon_i) \subset \gamma'$ for $\gamma, \gamma' \in \Gamma^u$, let JF(x) denote the Jacobian of F with respect to the measures $m_\gamma$ and $m_{\gamma'}$. Then
(1) JF(x) = JF(y) for every $y \in \gamma^s(x)$;
(2) there is $C_0 > 0$ such that for every $x, y \in \gamma \cap \Upsilon_i$,
$$\left| \frac{JF(x)}{JF(y)} - 1 \right| \le C_0\, \beta^{s(F(x),F(y))};$$
(3) for every k ∈ N and any k positive integers $i_1, \dots, i_k$, there is $C_1 > 0$ such that for every $x, y \in \Upsilon_{i_1,\dots,i_k} \cap \gamma$,
$$\frac{JF^k(x)}{JF^k(y)} \le C_1.$$

Proof. (1) For $\mathrm{Leb}_\gamma$ almost every x ∈ γ ∩ Λ we have

$$JF(x) = \det DF^u(x) \cdot \frac{\hat u(F(x))}{\hat u(x)}. \tag{2.3}$$


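Denoting $\varphi(x) = \log|\det Df^u(x)|$ we may write

$$\begin{aligned}
\log JF(x) &= \sum_{i=0}^{R-1} \varphi(f^i(x)) + \sum_{i=0}^{\infty} \Big[ \varphi(f^i(F(x))) - \varphi\big(f^i(\widehat{F(x)})\big) \Big] - \sum_{i=0}^{\infty} \Big[ \varphi(f^i(x)) - \varphi(f^i(\hat x)) \Big] \\
&= \sum_{i=0}^{R-1} \varphi(f^i(\hat x)) + \sum_{i=0}^{\infty} \Big[ \varphi(f^i(F(\hat x))) - \varphi\big(f^i(\widehat{F(x)})\big) \Big].
\end{aligned}$$

Thus we have shown that JF(x) can be expressed just in terms of $\hat x$ and $\widehat{F(x)}$, which is enough for proving the first part of the lemma.

(2) It follows from (2.3) that

$$\log \frac{JF(x)}{JF(y)} = \log \frac{\det DF^u(x)}{\det DF^u(y)} + \log \frac{\hat u(F(x))}{\hat u(F(y))} + \log \frac{\hat u(y)}{\hat u(x)}.$$

Observing that s(x, y) > s(F(x), F(y)) the conclusion follows from (P3)(c) and (P4).

(3) Again, from (2.3), we obtain

$$\log \frac{JF^k(x)}{JF^k(y)} = \log \frac{\det D(F^k)^u(x)}{\det D(F^k)^u(y)} + \log \frac{\hat u(F^k(x))}{\hat u(F^k(y))} + \log \frac{\hat u(y)}{\hat u(x)}.$$

By (P4) we have

$$\log \frac{\det D(F^k)^u(x)}{\det D(F^k)^u(y)} \le \sum_{l=1}^{k} C\beta^{s(F^l(x),F^l(y))} \le C \sum_{l=0}^{\infty} \beta^l < \infty.$$

The remaining terms are easily controlled once again due to (P3)(c).

Lemma 2.2. The map $\bar F : \bar\Lambda \to \bar\Lambda$ has an invariant probability measure $\bar\mu$ with $d\bar\mu = \bar\rho\, d\bar m$, where $K^{-1} \le \bar\rho \le K$, for some $K = K(C_1, \beta) > 0$.

Proof. We construct $\bar\rho$ as the density with respect to $\bar m$ of an accumulation point of $\bar\mu^{(n)} = \frac{1}{n}\sum_{i=0}^{n-1} \bar F^i_*(\bar m)$. Let $\bar\rho^{(n)}$ denote the density of $\bar\mu^{(n)}$ and $\bar\rho^i$ the density of $\bar F^i_*(\bar m)$. Also, let $\bar\rho^i = \sum_j \bar\rho^i_j$, where $\bar\rho^i_j$ is the density of $\bar F^i_*(\bar m|\sigma^i_j)$ and the $\sigma^i_j$’s range over all components of $\bar\Lambda$ such that $\bar F^i(\sigma^i_j) = \bar\Lambda$.

Consider the normalized density $\tilde\rho^i_j = \bar\rho^i_j / \bar m(\sigma^i_j)$. We have for $\bar x' \in \sigma^i_j$ such that $\bar x = \bar F^i(\bar x')$ and for some $\bar y' \in \sigma^i_j$,

$$\tilde\rho^i_j(\bar x) = \frac{J\bar F^i(\bar y')}{J\bar F^i(\bar x')}\, (\bar m(\bar\Lambda))^{-1} = (\bar m(\bar\Lambda))^{-1} \prod_{k=1}^{i} \frac{J\bar F(\bar F^{k-1}(\bar y'))}{J\bar F(\bar F^{k-1}(\bar x'))}.$$

By Lemma 2.1(2) we have for every k = 1, . . . , i,

$$\frac{J\bar F(\bar F^{k-1}(\bar y'))}{J\bar F(\bar F^{k-1}(\bar x'))} \le \exp\Big\{ C_1 \beta^{s(\bar F^k(\bar y'), \bar F^k(\bar x'))} \Big\} \le \exp\Big\{ C_1 \beta^{(i-k)+s(\bar x, \bar y)} \Big\},$$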


from where we conclude that

$$\tilde\rho^i_j(\bar x) \le \exp\Big\{ C_1 \beta^{s(\bar x, \bar y)} \sum_{j \ge 0} \beta^j \Big\} \le \exp\{ C_1/(1-\beta) \} = K.$$

Observe that we also get

$$\frac{1}{\tilde\rho^i_j(\bar x)} = \frac{J\bar F^i(\bar x')}{J\bar F^i(\bar y')}\, \bar m(\bar\Lambda) \le K,$$

which yields $\tilde\rho^i_j(\bar x) \ge K^{-1}$. Now, since $\bar\rho^i = \sum_j \bar m(\sigma^i_j)\, \tilde\rho^i_j$, we have $K^{-1} \le \bar\rho^i \le K$, which implies that $K^{-1} \le \bar\rho^{(n)} \le K$, from where we obtain that $K^{-1} \le \bar\rho \le K$.

2.2. Lifting to the Gibbs-Markov structure. We now adapt standard techniques for lifting the $\bar F$-invariant measure on the quotient space to an F-invariant measure on the initial Gibbs-Markov structure. Given an $\bar F$-invariant probability measure $\bar\mu$, we define a probability measure $\tilde\mu$ on Λ as follows. For each bounded φ : Λ → R consider its discretizations $\varphi^\bullet : \hat\gamma \cap \Lambda \to R$ and $\varphi^* : \bar\Lambda \to R$ defined by

$$\varphi^\bullet(x) = \inf\{ \varphi(z) : z \in \gamma^s(x) \}, \quad \text{and} \quad \varphi^* = \varphi^\bullet \circ \hat\pi^{-1}. \tag{2.4}$$

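If φ is continuous, as its domain is compact, we may define

$$\operatorname{var}\varphi(k) = \sup\{ |\varphi(z) - \varphi(\zeta)| : |z - \zeta| \le C\beta^k \},$$

in which case var φ(k) → 0 as k → ∞.

Lemma 2.3. Given any continuous φ : Λ → R, for all k, l ∈ N we have

$$\left| \int (\varphi \circ F^k)^*\, d\bar\mu - \int (\varphi \circ F^{k+l})^*\, d\bar\mu \right| \le \operatorname{var}\varphi(k).$$

Proof. Since $\bar\mu$ is $\bar F$-invariant,

$$\begin{aligned}
\left| \int (\varphi \circ F^k)^*\, d\bar\mu - \int (\varphi \circ F^{k+l})^*\, d\bar\mu \right| &= \left| \int (\varphi \circ F^k)^* \circ \bar F^l\, d\bar\mu - \int (\varphi \circ F^{k+l})^*\, d\bar\mu \right| \\
&\le \int \left| (\varphi \circ F^k)^* \circ \bar F^l - (\varphi \circ F^{k+l})^* \right| d\bar\mu.
\end{aligned}$$

By definition of the discretization we have

$$(\varphi \circ F^k)^* \circ \bar F^l(x) = \inf\big\{ \varphi(z) : z \in F^k\big(\gamma^s(\bar F^l(x))\big) \big\}$$

and

$$(\varphi \circ F^{k+l})^*(x) = \inf\big\{ \varphi(\zeta) : \zeta \in F^{k+l}\big(\gamma^s(x)\big) \big\}.$$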


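Observe that $F^{k+l}(\gamma^s(x)) \subset F^k\big(\gamma^s(\bar F^l(x))\big)$ and by (P2), $\operatorname{diam} F^k\big(\gamma^s(\bar F^l(x))\big) \le C\beta^k$. Thus, $\big| (\varphi \circ F^k)^* \circ \bar F^l - (\varphi \circ F^{k+l})^* \big| \le \operatorname{var}\varphi(k)$.

By the Cauchy criterion the sequence $\big( \int (\varphi \circ F^k)^*\, d\bar\mu \big)_{k \in N}$ converges. Hence, the Riesz Representation Theorem yields a probability measure $\tilde\mu$ on Λ,

$$\int \varphi\, d\tilde\mu := \lim_{k\to\infty} \int (\varphi \circ F^k)^*\, d\bar\mu \tag{2.5}$$

for every continuous function φ : Λ → R.

Proposition 2.4. The probability measure $\tilde\mu$ is F-invariant and has absolutely continuous conditional measures on $\gamma^u$ leaves. Moreover, given any continuous φ : Λ → R we have
(1) $\left| \int \varphi\, d\tilde\mu - \int (\varphi \circ F^k)^*\, d\bar\mu \right| \le \operatorname{var}\varphi(k)$;
(2) if φ is constant in each $\gamma^s$, then $\int \varphi\, d\tilde\mu = \int \bar\varphi\, d\bar\mu$, where $\bar\varphi : \bar\Lambda \to R$ is defined by $\bar\varphi(x) = \varphi(z)$, where $z \in \bar\pi^{-1}(x)$;
(3) if φ is constant in each $\gamma^s$ and ψ : Λ → R is continuous, then
$$\left| \int \psi \cdot \varphi\, d\tilde\mu - \int (\psi \circ F^k)^* (\varphi \circ F^k)^*\, d\bar\mu \right| \le \|\varphi\|_1 \operatorname{var}\psi(k);$$
(4) $\tilde\mu$ is ergodic.

Proof. Regarding the F-invariance property, note that for any continuous φ : Λ → R,

$$\int \varphi \circ F\, d\tilde\mu = \lim_{k\to\infty} \int (\varphi \circ F^{k+1})^*\, d\bar\mu = \int \varphi\, d\tilde\mu,$$

by Lemma 2.3. Assertion (1) is an immediate consequence of Lemma 2.3. Property (2) follows from

$$\int \varphi\, d\tilde\mu = \lim_{k\to\infty} \int (\varphi \circ F^k)^*\, d\bar\mu = \lim_{k\to\infty} \int \bar\varphi \circ \bar F^k\, d\bar\mu = \int \bar\varphi\, d\bar\mu,$$

which holds by definition of $\tilde\mu$, and the $\bar F$-invariance of $\bar\mu$. For statement (3) let $\bar\varphi : \bar\Lambda \to R$ be defined by $\bar\varphi(x) = \varphi(z)$, where $z \in \bar\pi^{-1}(x)$. For any k, l positive integers observe that

$$\int (\psi \cdot \varphi \circ F^k)^*\, d\bar\mu = \int (\psi \circ F^k)^* (\varphi \circ F^k)^*\, d\bar\mu$$

and

$$\begin{aligned}
\left| \int (\psi\varphi \circ F^{k+l})^*\, d\bar\mu - \int (\psi\varphi \circ F^k)^*\, d\bar\mu \right| &= \left| \int (\psi \circ F^{k+l})^*\, \bar\varphi \circ \bar F^{k+l}\, d\bar\mu - \int (\psi \circ F^k)^*\, \bar\varphi \circ \bar F^k\, d\bar\mu \right| \\
&\le \int \left| (\psi \circ F^{k+l})^* - (\psi \circ F^k)^* \circ \bar F^l \right| \, |\bar\varphi \circ \bar F^{k+l}|\, d\bar\mu \\
&\le \operatorname{var}\psi(k)\, \|\varphi\|_1.
\end{aligned}$$

Inequality (3) follows letting l go to ∞.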


Remark 2.5. Since the continuous functions are a dense subset of $L^1$-functions, properties (2) and (3) also hold when $\varphi \in L^1$, by dominated convergence. In particular, this gives that $\bar\pi_* \tilde\mu = \bar\mu$.

In order to prove item (4), let $\tilde E$ denote the set of points z ∈ Λ such that for every $g \in L^1(\tilde\mu)$ we have

$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} g \circ F^i(z) = \int g\, d\tilde\mu. \tag{2.6}$$

We define similarly $\bar E$ with respect to $\bar\mu$, points in $\bar\Lambda$ and $g \in L^1(\bar\mu)$. Recall that ergodicity of $\bar\mu$ and $\tilde\mu$ is equivalent to $\bar\mu(\bar E) = 1$ and $\tilde\mu(\tilde E) = 1$, respectively. Actually, it is enough to consider continuous functions in the previous definitions. We will show that $\bar\pi^{-1}(\bar E) \subset \tilde E$, which then by Remark 2.5 implies that $\tilde\mu$ is ergodic. Let $\bar x \in \bar E$, $z \in \bar\pi^{-1}(\bar x)$ and consider a continuous function g : Λ → R. Then for every k ∈ N we have

$$\begin{aligned}
\left| \frac{1}{n} \sum_{i=0}^{n-1} g(F^i(z)) - \int g\, d\tilde\mu \right| \le{}& \left| \frac{1}{n} \sum_{i=0}^{k-1} g(F^i(z)) \right| + \left| \frac{1}{n} \sum_{i=0}^{n-1} \Big[ g(F^{k+i}(z)) - (g \circ F^k)^*(\bar F^i(\bar x)) \Big] \right| \\
&+ \left| \frac{1}{n} \sum_{i=n}^{n+k-1} g(F^i(z)) \right| + \left| \frac{1}{n} \sum_{i=0}^{n-1} (g \circ F^k)^*(\bar F^i(\bar x)) - \int (g \circ F^k)^*\, d\bar\mu \right| \\
&+ \left| \int (g \circ F^k)^*\, d\bar\mu - \int g\, d\tilde\mu \right| \\
\le{}& \frac{2k\|g\|_\infty}{n} + 2\operatorname{var} g(k) + \left| \frac{1}{n} \sum_{i=0}^{n-1} (g \circ F^k)^*(\bar F^i(\bar x)) - \int (g \circ F^k)^*\, d\bar\mu \right| \xrightarrow[n\to\infty]{} 2\operatorname{var} g(k),
\end{aligned}$$

and the conclusion follows by letting k → ∞.

We are then left to verify the absolute continuity of $\tilde\mu$. We already know that Λ supports an F-invariant ergodic measure µ with absolutely continuous conditional measures on $\gamma^u$ leaves; see e.g. [Y2, Sect. 2]. In fact, we know that on a.e. $\gamma^u$, the conditional measure $\mu_{\gamma^u}$ is equivalent to the conditional Lebesgue measure $\mathrm{Leb}_{\gamma^u}$, when restricted to Λ. We are going to show that $\tilde\mu = \mu$. Consider E the set of points z ∈ Λ for which (2.6) holds with µ instead of $\tilde\mu$. Our goal is to show that $E \cap \tilde E \ne \emptyset$, which gives the desired equality of the two measures. As µ(E) = 1, by the equivalence (on a.e. $\gamma^u$) between $\mu_{\gamma^u}$ and $\mathrm{Leb}_{\gamma^u}$ restricted to Λ, there exists an unstable leaf $\gamma^u$ such that $\mathrm{Leb}_{\gamma^u}((\Lambda \setminus E) \cap \gamma^u) = 0$. By (4) we have $\tilde\mu(\tilde E) = 1$ which implies that $\bar\mu(\bar\pi(\Lambda \setminus \tilde E)) = 0$ because $\bar\pi_* \tilde\mu = \bar\mu$ by Remark 2.5. Since $\bar\mu$ is equivalent to $\bar m$, it follows that $\bar m(\bar\pi(\Lambda \setminus \tilde E)) = 0$. Now, since the representative of $\bar m$ on $\gamma^u$ is also equivalent to $\mathrm{Leb}_{\gamma^u}$ restricted to Λ, we also have that $\mathrm{Leb}_{\gamma^u}((\Lambda \setminus \tilde E) \cap \gamma^u) = 0$. Consequently, we have $\mathrm{Leb}_{\gamma^u}((E \cap \tilde E) \cap \gamma^u) > 0$ which proves that $E \cap \tilde E \ne \emptyset$.

2.3. Entropy formula. Let $\tilde\mu$ be the SRB measure for F obtained from $\bar\mu = \bar\rho\, \bar m$ as in (2.5). We define the saturation of $\tilde\mu$ by

$$\mu^* = \sum_{l=0}^{\infty} f^l_*\big( \tilde\mu \,|\, \{R > l\} \big). \tag{2.7}$$
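Since push-forwards preserve total mass, the l-th term of the saturation (2.7) has mass equal to the measure of {R > l}, and summing over l gives the integral of R. The sketch below checks this bookkeeping exactly on a finite toy return-time law; the law {1: 1/2, 2: 1/4, 3: 1/4} and the function name are assumptions made for illustration only.

```python
from fractions import Fraction


def saturation_level_masses(return_law):
    """Masses of the terms in the saturation sum: entry l is the mass of {R > l}.

    return_law maps each return time j to the mass of {R = j} (hypothetical data).
    """
    max_return = max(return_law)
    return [
        sum(mass for j, mass in return_law.items() if j > l)
        for l in range(max_return)
    ]


# Toy law of the return time under the base measure (total mass one).
return_law = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}

levels = saturation_level_masses(return_law)
total_mass = sum(levels)                                   # mass of the saturation
mean_return = sum(j * mass for j, mass in return_law.items())  # integral of R
normalized_weights = [mass / total_mass for mass in levels]

print(levels, total_mass, mean_return)
```

Dividing by the total mass, as in the last line of the computation, mirrors the normalization that turns the saturated measure into a probability measure.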

It is well known that µ∗ is f-invariant and that the finiteness of µ∗ is equivalent to $\int R\, d\tilde\mu = \int R\, d\bar\mu < \infty$. By construction of Λ and $\bar m$ and $\bar\mu$, the finiteness of µ∗ is also equivalent to $\int_{\gamma \cap \Lambda} R\, d\,\mathrm{Leb}_\gamma < \infty$. Clearly, each $f^l_*(\tilde\mu|\{R > l\})$ has absolutely continuous conditional measures on $\{f^l \gamma^u\}$, which are Pesin unstable manifolds. Consequently

$$\mu = \frac{1}{\mu^*(M)}\, \mu^*$$

is an SRB measure for f.

Lemma 2.6. If λ is a Lyapunov exponent of $\tilde\mu$, then λ/σ is a Lyapunov exponent of µ, where $\sigma = \int R\, d\tilde\mu$.

Proof. As µ is obtained by saturating $\tilde\mu$ in (2.7), one easily gets $\mu^*(\Lambda) \ge \tilde\mu(\Lambda) = 1$, and so µ(Λ) > 0. By ergodicity, it is enough to compare the Lyapunov exponents for points z ∈ Λ. Let n be a positive integer. We have for each z ∈ Λ, $F^n(z) = f^{S_n(z)}(z)$, where

$$S_n(z) = \sum_{i=0}^{n-1} R(F^i(z)).$$

As $S_n(z) = S_n(\zeta)$ for Lebesgue almost every z ∈ Λ and ζ close to z, we have for $v \in T_z M$,

$$\frac{1}{S_n(z)} \log \| Df^{S_n(z)}(z) v \| = \frac{n}{S_n(z)} \cdot \frac{1}{n} \log \| DF^n(z) v \|. \tag{2.8}$$

Since $\tilde\mu$ is ergodic, the Birkhoff ergodic theorem yields

$$\lim_{n\to\infty} \frac{S_n(z)}{n} = \int R\, d\tilde\mu = \sigma \tag{2.9}$$

for $\tilde\mu$ almost every z ∈ Λ.

Proposition 2.7. Let $J\bar F$ be the Jacobian of $\bar F$ with respect to the measure $\bar m$ on $\bar\Lambda$. Then

$$h_\mu = \sigma^{-1} \int_{\bar\Lambda} \log J\bar F\, d\bar m.$$

Proof. By [LY2, Cor. 7.4.2] we have

$$h_\mu = \sum_{\lambda_i > 0} \lambda_i \dim E_i, \tag{2.10}$$

where $\lambda_i$ are Lyapunov exponents of µ and $E_i$ the corresponding linear spaces given by Oseledets’ decomposition. By Lemma 2.6 we have

$$h_\mu = \sigma^{-1} \sum_{\tilde\lambda_i > 0} \tilde\lambda_i \dim E_i,$$

where $\tilde\lambda_i$ are Lyapunov exponents of $\tilde\mu$. As a consequence of Oseledets theorem we may also write

$$\sum_{\tilde\lambda_i > 0} \tilde\lambda_i \dim E_i = \int \log \det DF^u\, d\tilde\mu.$$

According to (2.3),

$$\int \log JF\, d\tilde\mu = \int \log \det DF^u\, d\tilde\mu + \int \log \hat u \circ F\, d\tilde\mu - \int \log \hat u\, d\tilde\mu = \int \log \det DF^u\, d\tilde\mu,$$

where the last equality follows from the F-invariance of $\tilde\mu$. Finally, since by Lemma 2.1 JF is constant in each $\gamma^s$-leaf it follows from Proposition 2.4 (2) that

$$\int \log JF\, d\tilde\mu = \int_{\bar\Lambda} \log J\bar F\, d\bar m.$$
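The rescaling by σ in Lemma 2.6 and in the entropy formula can be checked on a one-dimensional toy model in which everything is explicit. The sketch below assumes, for illustration only, an induced map with linear full branches: the i-th branch has Lebesgue measure 2^{-i}, return time i and slope 2^i, so it is the i-th iterate of the doubling map. In this toy model Lebesgue measure is invariant for the induced map, so the reference measure and the invariant measure coincide, and the entropy of the doubling map with respect to Lebesgue measure is log 2. The function name and the truncation level are hypothetical.

```python
import math


def toy_entropy(num_branches=200):
    """Return (entropy estimate, sigma) for the toy induced map.

    Branch i has measure 2^-i, return time i and Jacobian 2^i (assumed model).
    """
    sigma = 0.0              # integral of the return time R
    integral_log_jac = 0.0   # integral of log(Jacobian of the induced map)
    for i in range(1, num_branches + 1):
        weight = 2.0 ** (-i)
        sigma += weight * i
        integral_log_jac += weight * i * math.log(2.0)
    return integral_log_jac / sigma, sigma


entropy, sigma = toy_entropy()
print(entropy, sigma)
```

Here the exponent of the induced map is 2 log 2 and the mean return time is 2, so dividing one by the other recovers log 2, in agreement with the statement that exponents of the original map are those of the induced map divided by σ.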

3. Statistical Stability

Let F be a uniform family of maps. Fix $f_0 \in F$ and take any sequence $(f_n)_{n \ge 1}$ in F such that $f_n \to f_0$, as n → ∞, in the $C^k$ topology. For each n ≥ 0, let $\mu_n$ denote the (unique) SRB measure for $f_n$. Given n ≥ 0, the map $f_n \in F$ admits a Gibbs-Markov structure $\Lambda_n$ with $\Gamma_n^u = \{\gamma_n^u\}$ and $\Gamma_n^s = \{\gamma_n^s\}$ its defining families of unstable and stable leaves. Consider $R_n : \Lambda_n \to N$ the return time, $F_n : \Lambda_n \to \Lambda_n$ the induced map, $\hat\gamma_n$ the special unstable leaf given by condition (U1) and $H_n : \hat\gamma_n \cap \Gamma_0^s \to \hat\gamma_0$ obtained by sliding through the stable leaves of $\Lambda_0$. Recall that $\Omega_n^0 = H_n(\hat\gamma_n \cap \Lambda_n)$ and $\Omega_0 = \hat\gamma_0 \cap \Lambda_0$.

Remark 3.1. Since $f_n \to f_0$, as n → ∞, in the $C^k$ topology and (U1) holds, then for every ε > 0 and ℓ ∈ N, there exists $N_0 \in N$ such that for every $n \ge N_0$ we have

$$\| \hat\gamma_n - \hat\gamma_0 \|_1 < \varepsilon,$$

$$\max_{x \in \Omega_0 \cap \Omega_n^0} \Big\{ |(f_n \circ H_n^{-1} - f_0)(x)|, \dots, |(f_n^\ell \circ H_n^{-1} - f_0^\ell)(x)| \Big\} < \varepsilon,$$

and

$$\max_{x \in \Omega_0 \cap \Omega_n^0} \left\{ \left| \log \frac{\det Df_n^u(f_n \circ H_n^{-1}(x))}{\det Df_0^u(f_0(x))} \right|, \dots, \left| \log \frac{\det Df_n^u(f_n^\ell \circ H_n^{-1}(x))}{\det Df_0^u(f_0^\ell(x))} \right| \right\} < \varepsilon.$$

Our goal is to show that $\mu_n \to \mu_0$ in the weak* topology, i.e. for each continuous function g : M → R the sequence $\int g\, d\mu_n$ converges to $\int g\, d\mu_0$. We will show that given any continuous g : M → R, each subsequence of $\int g\, d\mu_n$ admits a subsequence converging to $\int g\, d\mu_0$.

3.1. Convergence of the densities on the reference leaf. In Sect. 2.1 we built a family of holonomy invariant measures on unstable leaves that gives rise to a measure $\bar m_n$ on $\bar\Lambda_n$. Moreover,

$$(\hat\pi_n)_* m_{\hat\gamma_n} = \bar m_n \quad \text{and} \quad m_{\hat\gamma_n} = 1_{\hat\gamma_n \cap \Lambda_n}\, \mathrm{Leb}_{\hat\gamma_n}, \tag{3.1}$$


where 1(·) stands for the indicator function. By Lemma 2.2, for each n ≥ 0 there is an F¯n -invariant measure µ¯ n = ρ¯n m¯ n with ρ¯n ∞ ≤ K for all n ≥ 0. We define the sequence (n )n≥0 of functions in γˆ0 as n = ρ¯n ◦ πˆ n ◦ Hn−1 · 1n0 ,

(3.2)

which in particular gives 0 = ρ¯0 ◦ πˆ 0 . The main purpose of this section is to prove that the sequence (n )n∈N converges theorem there is a subsequence to 0 in the weak* topology. By the Banach-Alaoglu n i i∈N converging to some ∞ ∈ L ∞ (Lebγˆ0 ) in the weak* topology, i.e. φn i d Lebγˆ0 −−−→ φ∞ d Lebγˆ0 , ∀φ ∈ L 1 (Lebγˆ0 ). (3.3) i→∞

The following lemma establishes that integration with respect to m¯ n is close to integration with respect to n Lebγˆ0 , up to a small error. Lemma 3.2. Let φ¯ ∈ L ∞ (m¯ n ). If n is sufficiently large, then −1 ¯ ∞ Qn , (φ¯ ◦ πˆ n ◦ Hn )n d Lebγˆ0 ≤ K φ φ¯ ρ¯n d m¯ n − n ¯ n 0

where Q n = Lebγˆn (0n n ) + n d(Hn )∗ Lebγˆn − n d Lebγˆ0 .

0

φ¯ ρ¯n d m¯ n =

0

¯

d Lebγˆn . It follows that −1 ¯ ¯ ¯ φ ρ¯n d m¯ n − n (φ ◦ πˆ n ◦ Hn )n d Lebγˆ0 0 n ≤ (φ¯ ◦ πˆ n )(ρ¯n ◦ πˆ n ) d Lebγˆn 0n n + (φ¯ ◦ πˆ n )(ρ¯n ◦ πˆ n ) d Lebγˆn − (φ¯ ◦ πˆ n ◦ Hn−1 )n d Lebγˆ0 n 0n ∩n

Proof. By (3.1), we have

¯n

ˆ n )(ρ¯n ◦ πˆ n ) n (φ ◦ π

0

¯ ∞ Lebγˆ (0n n ) ≤ K φ n −1 −1 (φ¯ ◦ πˆ n ◦ Hn )n d(Hn )∗ Lebγˆn − (φ¯ ◦ πˆ n ◦ Hn )n d Lebγˆ0 + n n 0 0 ¯ ∞ Lebγˆ (0n n ) + K φ ¯ ∞ ≤ K φ d(Hn )∗ Lebγˆn − d Lebγˆ0 . n n n 0

0

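Consider the maps $G_0 : \hat\gamma_0 \to \hat\gamma_0$ and $G_n : \hat\gamma_0 \to \hat\gamma_n$ defined by

$$G_0 = \hat\pi_0^{-1} \circ \bar F_0 \circ \hat\pi_0 \quad \text{and} \quad G_n = \hat\pi_n^{-1} \circ \bar F_n \circ \hat\pi_n \circ H_n^{-1}.$$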


J. F. Alves, M. Carvalho, J. M. Freitas

Fig. 1.

Lemma 3.3. For every $\varepsilon > 0$, $n \in \mathbb{N}$ sufficiently large and $\mathrm{Leb}_{\hat\gamma_0}$-almost every $x \in \Omega_0\cap\Omega_0^n\cap\{R_n = \ell\}\cap\{R_0 = \ell\}$ we have $|G_n(x) - G_0(x)| < \varepsilon$.

Proof. Consider a point $x \in \Omega_0\cap\Omega_0^n\cap\{R_n = \ell\}\cap\{R_0 = \ell\}$. We may assume that $G_n(x)$ is a Lebesgue density point of $\Omega_n$. Then, using (U2) and the continuity of the stable foliation (see Definition 1 (iii)), for sufficiently large $n \in \mathbb{N}$ we may guarantee the existence of a point $\tilde y \in \Omega_n^0\cap\Omega_n$ such that $\gamma_n^s(\tilde y)$ is at most $\varepsilon\sin(\theta)/4$ apart from $\gamma_n^s(G_n(x))$ in the $C^1$-norm; recall Remark 1.2 (see Fig. 1). Using (U3) we may assume that $n \in \mathbb{N}$ is also sufficiently large so that the distance in the $C^1$-norm between $\gamma_n^s(\tilde y)$ and $\gamma_0^s(\tilde y)$ is at most $\varepsilon\sin(\theta)/4$. Taking into account Remark 3.1 and the continuity of the stable foliation, we may assume that $n \in \mathbb{N}$ is large enough so that $|f_n^\ell(H_n^{-1}(x)) - f_0^\ell(x)|$ is sufficiently small in order for $\gamma_0^s(f_0^\ell(x))$ to belong to an $\varepsilon\sin(\theta)/4$-neighborhood of $\gamma_0^s(\tilde y)$, in the $C^1$-norm. It follows that $\gamma_n^s(f_n^\ell(H_n^{-1}(x)))$ and $\gamma_0^s(f_0^\ell(x))$ are at most $3\varepsilon\sin(\theta)/4$ apart, in the $C^1$-norm. Finally, observing that $G_n(x) = \gamma_n^s(f_n^\ell(H_n^{-1}(x)))\cap\gamma_n^u$, $G_0(x) = \gamma_0^s(f_0^\ell(x))\cap\gamma_0^u$ and that $\gamma_n^u$ can be made arbitrarily close to $\gamma_0^u$ in the $C^1$-norm (by (U1)), then, as long as $n$ is sufficiently large, we have $|G_n(x) - G_0(x)| < \varepsilon$.

Proposition 3.4. The measure $(\varrho_\infty\circ\hat\pi_0^{-1})\,\bar m_0$ is $\bar F_0$-invariant.

Proof. We just have to verify that for every continuous $\phi : \bar\Lambda_0 \to \mathbb{R}$,

$$\int (\phi\circ\bar F_0)(\varrho_\infty\circ\hat\pi_0^{-1})\,d\bar m_0 = \int \phi\,(\varrho_\infty\circ\hat\pi_0^{-1})\,d\bar m_0.$$

Given such $\phi$, consider a continuous function $\varphi : M \to \mathbb{R}$ such that $\|\varphi\|_\infty \le \|\phi\|_\infty$ and $\varphi|_{\Omega_0} = \phi\circ\hat\pi_0$. Since $\bar\mu_{n_i} = \bar\rho_{n_i}\,d\bar m_{n_i}$ is $\bar F_{n_i}$-invariant we have

$$\int (\varphi\circ\hat\pi_{n_i}^{-1}\circ\bar F_{n_i})\,\bar\rho_{n_i}\,d\bar m_{n_i} = \int (\varphi\circ\hat\pi_{n_i}^{-1})\,\bar\rho_{n_i}\,d\bar m_{n_i}. \tag{3.4}$$


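Recalling definitions (3.1), (3.2), the fact that $\varrho_{n_i}$ is supported on $\Omega_0^{n_i} \subset \Omega_0$ and applying Lemmas 3.2 and 2.2 we get

$$\begin{aligned}
&\left| \int (\varphi\circ\hat\pi_{n_i}^{-1})\,\bar\rho_{n_i}\,d\bar m_{n_i} - \int \phi\,(\varrho_\infty\circ\hat\pi_0^{-1})\,d\bar m_0 \right| \\
&\quad\le \left| \int (\varphi\circ H_{n_i}^{-1})\,\varrho_{n_i}\,d\mathrm{Leb}_{\hat\gamma_0} - \int (\phi\circ\hat\pi_0)\,\varrho_\infty\,d\mathrm{Leb}_{\hat\gamma_0} \right| + Q_{n_i} \\
&\quad= \left| \int (\varphi\circ H_{n_i}^{-1})\,\varrho_{n_i}\,d\mathrm{Leb}_{\hat\gamma_0} - \int \varphi\,\varrho_\infty\,d\mathrm{Leb}_{\hat\gamma_0} \right| + Q_{n_i} \\
&\quad\le \left| \int (\varphi\circ H_{n_i}^{-1})\,\varrho_{n_i}\,d\mathrm{Leb}_{\hat\gamma_0} - \int \varphi\,\varrho_{n_i}\,d\mathrm{Leb}_{\hat\gamma_0} \right| + \left| \int \varphi\,\varrho_{n_i}\,d\mathrm{Leb}_{\hat\gamma_0} - \int \varphi\,\varrho_\infty\,d\mathrm{Leb}_{\hat\gamma_0} \right| + Q_{n_i} \\
&\quad\le K \int \left| \varphi\circ H_{n_i}^{-1} - \varphi \right| d\mathrm{Leb}_{\hat\gamma_0} + \left| \int \varphi\,\varrho_{n_i}\,d\mathrm{Leb}_{\hat\gamma_0} - \int \varphi\,\varrho_\infty\,d\mathrm{Leb}_{\hat\gamma_0} \right| + Q_{n_i}.
\end{aligned}$$

Therefore, using (U1) for the first term on the right, (3.3) for the second and (U2) plus Remark 1.2 for the $Q$ term, we conclude that

$$\int (\varphi\circ\hat\pi_{n_i}^{-1})\,\bar\rho_{n_i}\,d\bar m_{n_i} \xrightarrow[i\to\infty]{} \int \phi\,(\varrho_\infty\circ\hat\pi_0^{-1})\,d\bar m_0. \tag{3.5}$$

Once we prove the next claim, then equality (3.4), the limit (3.5) and the uniqueness of the limit give the desired result.

Claim 3.1. $\displaystyle\int (\varphi\circ\hat\pi_{n_i}^{-1}\circ\bar F_{n_i})\,\bar\rho_{n_i}\,d\bar m_{n_i} \xrightarrow[i\to\infty]{} \int (\phi\circ\bar F_0)(\varrho_\infty\circ\hat\pi_0^{-1})\,d\bar m_0.$

Let

$$E_1 := \left| \int (\varphi\circ\hat\pi_{n_i}^{-1}\circ\bar F_{n_i})\,\bar\rho_{n_i}\,d\bar m_{n_i} - \int (\phi\circ\bar F_0)(\varrho_\infty\circ\hat\pi_0^{-1})\,d\bar m_0 \right|.$$

Again, using definitions (3.1), (3.2) and applying Lemma 3.2 we get

$$E_1 \le \left| \int (\varphi\circ G_{n_i})\,\varrho_{n_i}\,d\mathrm{Leb}_{\hat\gamma_0} - \int (\varphi\circ G_0)\,\varrho_\infty\,d\mathrm{Leb}_{\hat\gamma_0} \right| + Q_{n_i}.$$

Now, observe that by (U2) and Remark 1.2 the term $Q_{n_i}$ can be made arbitrarily small for large $i$. This leaves us with the first term on the right, which we denote by $E_2$. Using Lemma 2.2 we have

$$\begin{aligned}
E_2 &\le \int \left| \varphi\circ G_{n_i} - \varphi\circ G_0 \right| \varrho_{n_i}\,d\mathrm{Leb}_{\hat\gamma_0} + \left| \int (\varphi\circ G_0)\,\varrho_{n_i}\,d\mathrm{Leb}_{\hat\gamma_0} - \int (\varphi\circ G_0)\,\varrho_\infty\,d\mathrm{Leb}_{\hat\gamma_0} \right| \\
&\le K \int \left| \varphi\circ G_{n_i} - \varphi\circ G_0 \right| d\mathrm{Leb}_{\hat\gamma_0} + \left| \int (\varphi\circ G_0)\,\varrho_{n_i}\,d\mathrm{Leb}_{\hat\gamma_0} - \int (\varphi\circ G_0)\,\varrho_\infty\,d\mathrm{Leb}_{\hat\gamma_0} \right|.
\end{aligned}$$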


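According to Eq. (3.3) it is clear that the last term on the right can be made arbitrarily small provided $i$ is large enough. So, denote by $E_3$ the first term on the right. Recalling the fact that $\varrho_{n_i}$ is supported on $\Omega_0^{n_i} \subset \Omega_0$, we have for any $N$,

$$\begin{aligned}
E_3 &\le K\,\|\varphi\|_\infty \sum_{\ell=N+1}^{\infty} \Big( \mathrm{Leb}_{\hat\gamma_0}(\{R_{n_i} = \ell\}) + \mathrm{Leb}_{\hat\gamma_0}(\{R_0 = \ell\}) \Big) \\
&\quad + K\,\|\varphi\|_\infty \sum_{\ell=1}^{N} \mathrm{Leb}_{\hat\gamma_0}\big(\{R_{n_i} = \ell\}\triangle\{R_0 = \ell\}\big) \\
&\quad + K \sum_{\ell=1}^{N} \int_{\{R_{n_i}=\ell\}\cap\{R_0=\ell\}\cap\Omega_0\cap\Omega_0^{n_i}} \left| \varphi\circ G_{n_i} - \varphi\circ G_0 \right| d\mathrm{Leb}_{\hat\gamma_0}.
\end{aligned}$$

Denote by $E_4$, $E_5$ and $E_6$ respectively the terms in the last sum. Having in mind (U5) and Remark 1.2, we may choose $N \in \mathbb{N}$ sufficiently large so that $E_4$ is small for large $i$. For this choice of $N$, by (U4), we also have that $E_5$ is small for large $i$. We now turn our attention to $E_6$. For $\ell = 1, \ldots, N$, let

$$E_6^\ell = \int \left| \varphi\circ G_{n_i} - \varphi\circ G_0 \right| \mathbf{1}_{\{R_{n_i}=\ell\}\cap\{R_0=\ell\}\cap\Omega_0\cap\Omega_0^{n_i}}\,d\mathrm{Leb}_{\hat\gamma_0}.$$

Since $\varphi$ is continuous and $M$ is compact, each $E_6^\ell$ can be made arbitrarily small by Lemma 3.3.

Corollary 3.5. Given $\varphi \in L^1(\mathrm{Leb}_{\hat\gamma_0})$, we have

$$\int \varphi\,\varrho_n\,d\mathrm{Leb}_{\hat\gamma_0} \xrightarrow[n\to\infty]{} \int \varphi\,\varrho_0\,d\mathrm{Leb}_{\hat\gamma_0}.$$

Proof. By uniqueness of the absolutely continuous invariant measure for $\bar F$, it follows from Proposition 3.4 that $\bar\rho_0 = \varrho_\infty\circ\hat\pi_0^{-1}$, which immediately yields $\varrho_\infty = \varrho_0$. Hence

$$\int \varphi\,\varrho_{n_i}\,d\mathrm{Leb}_{\hat\gamma_0} \xrightarrow[i\to\infty]{} \int \varphi\,\varrho_0\,d\mathrm{Leb}_{\hat\gamma_0}, \quad\text{for all continuous } \varphi. \tag{3.6}$$

The same argument proves that any subsequence of $(\varrho_n)_n$ has a weak* convergent subsequence with limit also equal to $\varrho_0$. This shows that $(\varrho_n)_n$ itself converges to $\varrho_0$ in the weak* topology. Since continuous functions are dense in $L^1(\mathrm{Leb}_{\hat\gamma_0})$ and the densities $\varrho_n$ are uniformly bounded, by Lemma 2.2, the result follows easily from (3.6).

3.2. Continuity of the SRB measures. For each $n \ge 0$ let $\tilde\mu_n$ be the $F_n$-invariant measure lifted from $\bar\mu_n$ as in (2.5), $\mu_n^*$ the saturation of $\tilde\mu_n$ as in (2.7), and $\mu_n = \mu_n^*/\mu_n^*(M)$ the SRB measure. The main goal of this section is to prove the following result.

Proposition 3.6. For every continuous $g : M \to \mathbb{R}$,

$$\int g\,d\mu_n^* \xrightarrow[n\to\infty]{} \int g\,d\mu_0^*.$$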


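Proof. As $M$ is compact, $g$ is uniformly continuous and $\|g\|_\infty < \infty$. Recalling (2.7) we may write for all $n \in \mathbb{N}_0$ and every integer $N_0$,

$$\mu_n^* = \sum_{\ell=0}^{N_0-1} \mu_n^\ell + \eta_n,$$

where $\mu_n^\ell = f_*^\ell(\tilde\mu_n|\{R_n > \ell\})$ and $\eta_n = \sum_{\ell\ge N_0} f_*^\ell(\tilde\mu_n|\{R_n > \ell\})$. By (U5), we may choose $N_0$ so that $\eta_n(M)$ is as small as we want, for all $n \in \mathbb{N}_0$. We are left to show that for every $\ell < N_0$, if $n$ is large enough then

$$\left| \int (g\circ f_n^\ell)\,\mathbf{1}_{\{R_n>\ell\}}\,d\tilde\mu_n - \int (g\circ f_0^\ell)\,\mathbf{1}_{\{R_0>\ell\}}\,d\tilde\mu_0 \right|$$

is arbitrarily small. We fix $\ell < N_0$ and take $k \in \mathbb{N}$ large so that $\mathrm{var}(g(k))$ is sufficiently small. Then, we use Proposition 2.4 (3) and its Remark 2.5 to reduce our problem to controlling the following error term:

$$E := \left| \int (g\circ f_n^\ell\circ F_n^k)^*\,(\mathbf{1}_{\{R_n>\ell\}}\circ F_n^k)^*\,d\bar\mu_n - \int (g\circ f_0^\ell\circ F_0^k)^*\,(\mathbf{1}_{\{R_0>\ell\}}\circ F_0^k)^*\,d\bar\mu_0 \right|.$$

Let $\varrho_0 : \hat\gamma_0 \to \mathbb{R}$ be such that $\varrho_0 = \bar\rho_0\circ\hat\pi_0\cdot\mathbf{1}_{\Omega_0}$ and define

$$E_0 = \left| \int \Big[ (g\circ f_n^\ell\circ F_n^k)^\bullet\,(\mathbf{1}_{\{R_n>\ell\}}\circ F_n^k)^\bullet \Big]\circ H_n^{-1}\,\varrho_n\,d\mathrm{Leb}_{\hat\gamma_0} - \int (g\circ f_0^\ell\circ F_0^k)^\bullet\,(\mathbf{1}_{\{R_0>\ell\}}\circ F_0^k)^\bullet\,\varrho_0\,d\mathrm{Leb}_{\hat\gamma_0} \right|.$$

By Lemma 3.2, we have $E \le E_0 + K\|g\|_\infty Q_n$. Observe that by (U2) and Remark 1.2 we may consider $n$ large enough so that $K\|g\|_\infty Q_n$ is negligible. Applying the triangle inequality we get

$$\begin{aligned}
E_0 &\le K \int \left| (g\circ f_n^\ell\circ F_n^k)^\bullet\circ H_n^{-1} - (g\circ f_0^\ell\circ F_0^k)^\bullet \right| \mathbf{1}_{\Omega_0\cap\Omega_0^n}\,d\mathrm{Leb}_{\hat\gamma_0} \\
&\quad + K\,\|g\|_\infty \int \left| (\mathbf{1}_{\{R_n>\ell\}}\circ F_n^k)^\bullet\circ H_n^{-1} - (\mathbf{1}_{\{R_0>\ell\}}\circ F_0^k)^\bullet \right| \mathbf{1}_{\Omega_0\cap\Omega_0^n}\,d\mathrm{Leb}_{\hat\gamma_0} \\
&\quad + \left| \int (g\circ f_0^\ell\circ F_0^k)^\bullet\,(\mathbf{1}_{\{R_0>\ell\}}\circ F_0^k)^\bullet\,\mathbf{1}_{\Omega_0\cap\Omega_0^n}\,(\varrho_n - \varrho_0)\,d\mathrm{Leb}_{\hat\gamma_0} \right|.
\end{aligned}$$

By Corollary 3.5 the term

$$\left| \int (g\circ f_0^\ell\circ F_0^k)^\bullet\,(\mathbf{1}_{\{R_0>\ell\}}\circ F_0^k)^\bullet\,\mathbf{1}_{\Omega_0\cap\Omega_0^n}\,(\varrho_n - \varrho_0)\,d\mathrm{Leb}_{\hat\gamma_0} \right|$$

is as small as we want as long as $n$ is large enough. The analysis of the remaining terms

$$\int \left| (g\circ f_n^\ell\circ F_n^k)^\bullet\circ H_n^{-1} - (g\circ f_0^\ell\circ F_0^k)^\bullet \right| \mathbf{1}_{\Omega_0\cap\Omega_0^n}\,d\mathrm{Leb}_{\hat\gamma_0}$$

and

$$\int \left| (\mathbf{1}_{\{R_n>\ell\}}\circ F_n^k)^\bullet\circ H_n^{-1} - (\mathbf{1}_{\{R_0>\ell\}}\circ F_0^k)^\bullet \right| \mathbf{1}_{\Omega_0\cap\Omega_0^n}\,d\mathrm{Leb}_{\hat\gamma_0}$$

is left to Lemmas 3.8 and 3.9, respectively.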
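Proposition 3.6 concerns the unnormalized measures $\mu_n^*$. The following sketch indicates how the corresponding statement for the SRB measures $\mu_n = \mu_n^*/\mu_n^*(M)$ would then follow; it assumes $\mu_0^*(M) > 0$, which is implicit in the normalization.

```latex
% Take g \equiv 1 in Proposition 3.6:
\[
  \mu_n^*(M) = \int 1 \, d\mu_n^* \longrightarrow \int 1 \, d\mu_0^* = \mu_0^*(M),
  \qquad n \to \infty .
\]
% Hence, for every continuous g : M \to \mathbb{R},
\[
  \int g \, d\mu_n = \frac{\int g \, d\mu_n^*}{\mu_n^*(M)}
  \longrightarrow \frac{\int g \, d\mu_0^*}{\mu_0^*(M)} = \int g \, d\mu_0 ,
  \qquad n \to \infty .
\]
```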


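In the proofs of Lemmas 3.8 and 3.9 we have to produce a suitable positive integer $N$ so that returns that take longer than $N$ iterations are negligible. The next lemma provides the tools for an adequate choice. We consider the sequence of consecutive return times for $z \in \Lambda$,

$$R^1(z) = R(z) \quad\text{and}\quad R^n(z) = R\big(f^{R^1+R^2+\cdots+R^{n-1}}(z)\big). \tag{3.7}$$

Lemma 3.7. Given $k, N \in \mathbb{N}$,

$$\bar m\big(\{ z \in \bar\Lambda : \exists\, t \in \{1,\ldots,k\} \text{ such that } R^t(z) > N \}\big) \le k\,C_1\,\bar m(\{R > N\}).$$

Proof. We may write

$$\{ z \in \bar\Lambda : \exists\, t \in \{1,\ldots,k\} \text{ such that } R^t(z) > N \} = \bigcup_{t=0}^{k-1} B_t,$$

where

$$B_t = \{ z \in \bar\Lambda : R(z) \le N, \ldots, R^t(z) \le N,\ R^{t+1}(z) > N \}.$$

If $R(z) \le N, \ldots, R^t(z) \le N$ then there exist $j_1, \ldots, j_t \le N$ with $R(\Upsilon_{j_l}) \le N$ for every $l = 1, \ldots, t$ and $z \in \Upsilon_{j_1,\ldots,j_t}$. Observe that $\bar F^t(\Upsilon_{j_1,\ldots,j_t}) = \bar\Lambda$ and there is $y \in \Upsilon_{j_1,\ldots,j_t}$ such that $\bar m(\bar\Lambda) \le J\bar F^t(y)\cdot\bar m(\Upsilon_{j_1,\ldots,j_t})$. Also, there exists $x \in \Upsilon_{j_1,\ldots,j_t}\cap\bar F^{-t}(\{R > N\})$ such that $\bar m(\{R > N\}) \ge J\bar F^t(x)\cdot\bar m(\Upsilon_{j_1,\ldots,j_t}\cap\bar F^{-t}(\{R > N\}))$. Then, using bounded distortion we obtain

$$\frac{\bar m\big(\Upsilon_{j_1,\ldots,j_t}\cap\bar F^{-t}(\{R > N\})\big)}{\bar m(\Upsilon_{j_1,\ldots,j_t})} \le \frac{J\bar F^t(y)}{J\bar F^t(x)}\,\frac{\bar m(\{R > N\})}{\bar m(\bar\Lambda)} \le C_1\,\bar m(\{R > N\}).$$

Finally, we conclude that

$$|B_t| = \sum_{\substack{j_1,\ldots,j_t:\\ R(\Upsilon_{j_l})\le N,\ l=1,\ldots,t}} \bar m\big(\Upsilon_{j_1,\ldots,j_t}\cap\bar F^{-t}(\{R > N\})\big) \le C_1\,\bar m(\{R > N\}) \sum_{\substack{j_1,\ldots,j_t:\\ R(\Upsilon_{j_l})\le N,\ l=1,\ldots,t}} \bar m(\Upsilon_{j_1,\ldots,j_t}) \le C_1\,\bar m(\{R > N\}).$$

Lemma 3.8. Given $\ell, k \in \mathbb{N}$ and $\varepsilon > 0$ there is $J \in \mathbb{N}$ such that for every $n > J$,

$$\int \left| (g\circ f_n^\ell\circ F_n^k)^\bullet\circ H_n^{-1} - (g\circ f_0^\ell\circ F_0^k)^\bullet \right| \mathbf{1}_{\Omega_0\cap\Omega_0^n}\,d\mathrm{Leb}_{\hat\gamma_0} < \varepsilon.$$

Proof. We split the argument into three steps:

(1) We appeal to Lemma 3.7 to choose $N \in \mathbb{N}$ sufficiently large so that the set $L := \{ x \in \Omega_0\cap\Omega_0^n : \exists\, t \in \{1,\ldots,k\},\ R_0^t(x) > N \text{ or } R_n^t(x) > N \}$ has sufficiently small mass.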


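(2) We pick $J \in \mathbb{N}$ large enough to guarantee that, according to condition (U4), for every $k$ positive integers $j_1, \ldots, j_k$ such that $R_0(\Upsilon^0_{j_l}) \le N$ for all $l = 1, \ldots, k$, each set $\Upsilon^0_{j_1,\ldots,j_k}$ and its corresponding $\Upsilon^n_{j_1,\ldots,j_k}$ satisfy the condition: $\Upsilon^0_{j_1,\ldots,j_k}\triangle H_n(\Upsilon^n_{j_1,\ldots,j_k})$ has sufficiently small conditional Lebesgue measure.

(3) Finally, in each set $\Upsilon^0_{j_1,\ldots,j_k}\cap H_n(\Upsilon^n_{j_1,\ldots,j_k})$ we control $\left| (g\circ f_n^\ell\circ F_n^k)^\bullet\circ H_n^{-1} - (g\circ f_0^\ell\circ F_0^k)^\bullet \right|$.

Step (1). From Lemma 3.7 we have $|L| \le k\,C_1\big( \mathrm{Leb}_{\hat\gamma_0}(\{R_0 > N\}) + \mathrm{Leb}_{\hat\gamma_n}(\{R_n > N\}) \big)$. So, by assumption (U5), we may choose $N$ and $J$ large enough so that

$$2\,\|g\|_\infty\,k\,C_1\big( \mathrm{Leb}_{\hat\gamma_0}(\{R_0 > N\}) + \mathrm{Leb}_{\hat\gamma_n}(\{R_n > N\}) \big) < \frac{\varepsilon}{3},$$

which implies that

$$\int_L \left| (g\circ f_n^\ell\circ F_n^k)^\bullet\circ H_n^{-1} - (g\circ f_0^\ell\circ F_0^k)^\bullet \right| \mathbf{1}_{\Omega_0\cap\Omega_0^n}\,d\mathrm{Leb}_{\hat\gamma_0} < \frac{\varepsilon}{3}.$$

Step (2). By (P1)(c) it is possible to define $V = V(N, k)$ as the total number of sets $\Upsilon_{j_1,\ldots,j_k}$ such that $R(\Upsilon_{j_l}) \le N$ for all $l = 1, \ldots, k$. Now, using (U4), we may choose $J$ so that for every $n > J$ and $\Upsilon^0_{j_1,\ldots,j_k}$ such that $R_0(\Upsilon^0_{j_l}) \le N$ for all $l = 1, \ldots, k$, the corresponding $\Upsilon^n_{j_1,\ldots,j_k}$ is such that

$$\mathrm{Leb}_{\hat\gamma_0}\big( \Upsilon^0_{j_1,\ldots,j_k}\triangle H_n(\Upsilon^n_{j_1,\ldots,j_k}) \big) < \frac{\varepsilon}{3}\,V^{-1}\,(2\max\{1, \|g\|_\infty\})^{-1}.$$

Under these circumstances we have

$$\sum_{\substack{j_1,\ldots,j_k:\\ R_0(\Upsilon^0_{j_l})\le N,\ l=1,\ldots,k}} \int_{\Upsilon^0_{j_1,\ldots,j_k}\triangle H_n(\Upsilon^n_{j_1,\ldots,j_k})} \left| (g\circ f_n^\ell\circ F_n^k)^\bullet\circ H_n^{-1} - (g\circ f_0^\ell\circ F_0^k)^\bullet \right| \mathbf{1}_{\Omega_0\cap\Omega_0^n}\,d\mathrm{Leb}_{\hat\gamma_0} < \frac{\varepsilon}{3}.$$

Step (3). For each $i = 1, \ldots, k$, let $\tau_{j_i} = R_0(\Upsilon^0_{j_i})$. In each set $\Upsilon^0_{j_1,\ldots,j_k}\cap\Upsilon^n_{j_1,\ldots,j_k}$ we have that $F_0^k = f_0^{\tau_1+\cdots+\tau_k}$ and $F_n^k = f_n^{\tau_1+\cdots+\tau_k}$. Since $M$ is compact, each $f_n$ is $C^k$ and $f_n \to f_0$, as $n \to \infty$, in the $C^k$ topology, then
• there exists $\vartheta > 0$ such that $|z - \zeta| < \vartheta \Rightarrow |g(z) - g(\zeta)| < \frac{\varepsilon}{3}V^{-1}$;
• there exists $J_1$ such that for all $n > J_1$ and $z \in M$ we have $\max\{ |f_0(z) - f_n(z)|, \ldots, |f_0^{kN+\ell}(z) - f_n^{kN+\ell}(z)| \} < \frac{\vartheta}{2}$;
• there exists $\eta > 0$ such that for all $z, \zeta \in M$ and $f \in \mathcal{F}$, $|z - \zeta| < \eta \Rightarrow \max\{ |f(z) - f(\zeta)|, \ldots, |f^{kN+\ell}(z) - f^{kN+\ell}(\zeta)| \} < \frac{\vartheta}{2}$.
Furthermore, according to (U3),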


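• there is $J_2$ such that for every $n > J_2$ and $x \in \Omega_0\cap\Omega_0^n$ we have $\| \gamma_n^s(x) - \gamma_0^s(x) \|_{C^1} < \eta$.

Let $n > \max\{J_1, J_2\}$, $z \in \gamma_0^s(x)$ and take $\zeta \in \gamma_n^s(x)$ such that $|z - \zeta| < \eta$. This together with the choices of $\eta$ and $J_1$ implies

$$\begin{aligned}
\left| f_0^\ell\circ F_0^k(z) - f_n^\ell\circ F_n^k(\zeta) \right| &\le \left| f_0^{\tau_1+\cdots+\tau_k+\ell}(z) - f_0^{\tau_1+\cdots+\tau_k+\ell}(\zeta) \right| + \left| f_0^{\tau_1+\cdots+\tau_k+\ell}(\zeta) - f_n^{\tau_1+\cdots+\tau_k+\ell}(\zeta) \right| \\
&< \vartheta/2 + \vartheta/2 = \vartheta.
\end{aligned}$$

Finally, the above considerations and the choice of $\vartheta$ allow us to conclude that for every $n > \max\{J_1, J_2\}$, $x \in \Omega_0\cap\Omega_0^n$ and $z \in \gamma_0^s(x)$, there exists $\zeta \in \gamma_n^s(x)$ such that

$$\left| g(f_n^\ell\circ F_n^k(\zeta)) - g(f_0^\ell\circ F_0^k(z)) \right| < \frac{\varepsilon}{3}\,V^{-1}. \tag{3.8}$$

Attending to (2.4), (3.8) and the fact that we can interchange the roles of $z$ and $\zeta$ in the latter, we obtain that for every $n > \max\{J_1, J_2\}$,

$$\left| (g\circ f_n^\ell\circ F_n^k)^\bullet\circ H_n^{-1} - (g\circ f_0^\ell\circ F_0^k)^\bullet \right| < \frac{\varepsilon}{3}\,V^{-1},$$

from which we deduce that

$$\sum_{\substack{j_1,\ldots,j_k:\\ R_0(\Upsilon^0_{j_l})\le N,\ 1\le l\le k}} \int_{\Upsilon^0_{j_1,\ldots,j_k}\cap H_n(\Upsilon^n_{j_1,\ldots,j_k})} \left| (g\circ f_n^\ell\circ F_n^k)^\bullet\circ H_n^{-1} - (g\circ f_0^\ell\circ F_0^k)^\bullet \right| \mathbf{1}_{\Omega_0\cap\Omega_0^n}\,d\mathrm{Leb}_{\hat\gamma_0} < \frac{\varepsilon}{3}.$$

Lemma 3.9. Given $\ell, k \in \mathbb{N}$ and $\varepsilon > 0$ there exists $J \in \mathbb{N}$ such that for every $n > J$,

$$\int \left| (\mathbf{1}_{\{R_n>\ell\}}\circ F_n^k)^\bullet\circ H_n^{-1} - (\mathbf{1}_{\{R_0>\ell\}}\circ F_0^k)^\bullet \right| \mathbf{1}_{\Omega_0\cap\Omega_0^n}\,d\mathrm{Leb}_{\hat\gamma_0} < \varepsilon.$$

Proof. As in the proof of Lemma 3.8, we divide the argument into three steps.

(1) The condition on $N$: Consider the set

$$L_1 = \{ x \in \Omega_0\cap\Omega_0^n : \exists\, t \in \{1,\ldots,k+1\} \text{ such that } R_0^t(x) > N \text{ or } R_n^t(x) > N \}.$$

From Lemma 3.7 we have $|L_1| \le (k+1)\,C_1\big( \mathrm{Leb}_{\hat\gamma_0}(\{R_0 > N\}) + \mathrm{Leb}_{\hat\gamma_n}(\{R_n > N\}) \big)$. So we choose $N$ large enough so that

$$2\,\|g\|_\infty\,(k+1)\,C_1\big( \mathrm{Leb}_{\hat\gamma_0}(\{R_0 > N\}) + \mathrm{Leb}_{\hat\gamma_n}(\{R_n > N\}) \big) < \frac{\varepsilon}{3},$$

which implies that

$$\int_{L_1} \left| (\mathbf{1}_{\{R_n>\ell\}}\circ F_n^k)^\bullet\circ H_n^{-1} - (\mathbf{1}_{\{R_0>\ell\}}\circ F_0^k)^\bullet \right| \mathbf{1}_{\Omega_0\cap\Omega_0^n}\,d\mathrm{Leb}_{\hat\gamma_0} < \frac{\varepsilon}{3}.$$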


(2) Let as before $V = V(N, k+1)$ be the total number of sets $\Upsilon_{j_1,\ldots,j_{k+1}}$ such that $R(\Upsilon_{j_i}) \le N$ for all $i = 1, \ldots, k+1$. Now, using (U4), we may choose $J$ so that for every $n > J$ and $\Upsilon^0_{j_1,\ldots,j_{k+1}}$ such that $R_0(\Upsilon^0_{j_i}) \le N$ for all $i = 1, \ldots, k+1$, the corresponding $\Upsilon^n_{j_1,\ldots,j_{k+1}}$ is such that

$$\mathrm{Leb}_{\hat\gamma_0}\big( \Upsilon^0_{j_1,\ldots,j_{k+1}}\triangle H_n(\Upsilon^n_{j_1,\ldots,j_{k+1}}) \big) < \frac{\varepsilon}{3}\,V^{-1}\,(2\max\{1, \|g\|_\infty\})^{-1}.$$

Let $L_2 = \Upsilon^0_{j_1,\ldots,j_{k+1}}\triangle H_n(\Upsilon^n_{j_1,\ldots,j_{k+1}})$ and observe that

$$\sum_{\substack{j_1,\ldots,j_{k+1}:\\ R_0(\Upsilon^0_{j_l})\le N,\ l=1,\ldots,k+1}} \int_{L_2} \left| (\mathbf{1}_{\{R_n>\ell\}}\circ F_n^k)^\bullet\circ H_n^{-1} - (\mathbf{1}_{\{R_0>\ell\}}\circ F_0^k)^\bullet \right| \mathbf{1}_{\Omega_0\cap\Omega_0^n}\,d\mathrm{Leb}_{\hat\gamma_0} < \frac{\varepsilon}{3}.$$

(3) At last, notice that in each set $\Upsilon^0_{j_1,\ldots,j_{k+1}}\cap H_n(\Upsilon^n_{j_1,\ldots,j_{k+1}})$ we have

$$(\mathbf{1}_{\{R_n>\ell\}}\circ F_n^k)^\bullet\circ H_n^{-1} - (\mathbf{1}_{\{R_0>\ell\}}\circ F_0^k)^\bullet = 0,$$

which gives the result.

4. Entropy Continuity

In Proposition 2.7 we have seen that the SRB entropy can be written just in terms of the quotient dynamics. Our aim now is to show that the integrals appearing in that formula are close for nearby dynamics, and this is the content of Proposition 4.4. Notice that since the integrands are not necessarily continuous functions, the continuity of the integrals is not an immediate consequence of the statistical stability.
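The last remark can be illustrated with a standard toy example, which is not taken from the paper (the measures $\nu_n$ and the function $h$ below are hypothetical): weak* convergence of measures alone does not give convergence of integrals of discontinuous functions.

```latex
% On [0,1], let \nu_n = \delta_{1/n} and \nu_0 = \delta_0.
% For continuous g: \int g \, d\nu_n = g(1/n) \to g(0) = \int g \, d\nu_0,
% so \nu_n \to \nu_0 in the weak* topology.
% For the discontinuous function h = \mathbf{1}_{(0,1]}:
\[
  \int h \, d\nu_n = h(1/n) = 1 \quad \text{for all } n \ge 1,
  \qquad \text{while} \qquad
  \int h \, d\nu_0 = h(0) = 0 .
\]
```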

4.1. Auxiliary results.

Lemma 4.1. Let $(\phi_n)_{n\in\mathbb{N}}$ be a bounded sequence of $m$-measurable functions defined on $M$ belonging to $L^\infty(m)$. If $\phi_n \to \phi$ in the $L^1(m)$-norm and $\psi \in L^1(m)$, then $\int \psi(\phi_n - \phi)\,dm \to 0$, when $n \to \infty$.

Proof. Take any $\varepsilon > 0$. Let $C > 0$ be an upper bound for $\|\phi_n\|_\infty$. Since $\psi \in L^1(m)$, there is $\delta > 0$ such that for any Borel set $B \subset M$,

$$m(B) < \delta \;\Rightarrow\; \int_B |\psi|\,dm < \frac{\varepsilon}{4C}. \tag{4.1}$$

Define for each $n \ge 1$,

$$B_n = \left\{ x \in M : |\phi_n(x) - \phi(x)| > \frac{\varepsilon}{2\|\psi\|_1} \right\}.$$

Since $\|\phi_n - \phi\|_1 \to 0$ when $n \to \infty$, there is $n_0 \in \mathbb{N}$ such that $m(B_n) < \delta$ for every $n \ge n_0$. Taking into account the definition of $B_n$, we may write

$$\begin{aligned}
\int |\psi|\,|\phi_n - \phi|\,dm &= \int_{B_n} |\psi|\,|\phi_n - \phi|\,dm + \int_{M\setminus B_n} |\psi|\,|\phi_n - \phi|\,dm \\
&\le 2C \int_{B_n} |\psi|\,dm + \frac{\varepsilon}{2\|\psi\|_1} \int_{M\setminus B_n} |\psi|\,dm.
\end{aligned}$$

Then, using (4.1), this last sum is upper bounded by $\varepsilon$, as long as $n \ge n_0$.

Lemma 4.2. There is $C_2 > 0$ such that $\log J\bar F_n \le C_2 R_n$ for every $n \ge 0$.

Proof. Define $L_n = \max_{x\in M}\{ |\det Df_n^u(x)| \}$, for each $n \ge 0$. By the compactness of $M$ and the continuity on the first order derivative, there is $L > 1$ such that $L_n \le L$ for all $n \ge 0$. We have

$$|\det D(F_n)^u(x)| = \prod_{j=0}^{R_n(x)-1} |\det Df_n^u(f_n^j(x))| \le L^{R_n(x)}.$$

By (2.3) it follows that

$$\log J(F_n)(x) = \log|\det DF_n^u(x)| + \log\hat u(F_n(x)) - \log\hat u(x).$$

Observing that by (P3)(a) it follows that $|\log\hat u(F_n(x)) - \log\hat u(x)| \le 2C\beta^0 = 2C$, we have $\log J(F_n)(x) \le R_n(x)\log L + 2C$. To conclude, we take $C_2 = \log L + 2C$.

Lemma 4.3. Given $\varepsilon > 0$, there is $J \in \mathbb{N}$ such that for all $n > J$,

$$\int_{\Omega_0^n\cap\Omega_0} |R_n - R_0|\,d\mathrm{Leb}_{\hat\gamma_0} \le \varepsilon.$$

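Proof. Let $\varepsilon > 0$ be given. Using condition (U5) and Remark 1.2, take $N \ge 1$ and $J = J(N, \varepsilon) > 0$ in such a way that $\sum_{j=N}^{\infty} j\,\mathrm{Leb}_{\hat\gamma_0}\{R_n = j\} < \varepsilon/3$ and $\sum_{j=N}^{\infty} j\,\mathrm{Leb}_{\hat\gamma_0}\{R_0 = j\} < \varepsilon/3$. Since

$$R_n = \sum_{j=0}^{\infty} \mathbf{1}_{\{R_n > j\}},$$

we may write

$$\begin{aligned}
\|R_n - R_0\|_1 &= \Big\| R_n - \sum_{j=0}^{N-1} \mathbf{1}_{\{R_n>j\}} + \sum_{j=0}^{N-1} \big( \mathbf{1}_{\{R_n>j\}} - \mathbf{1}_{\{R_0>j\}} \big) + \sum_{j=0}^{N-1} \mathbf{1}_{\{R_0>j\}} - R_0 \Big\|_1 \\
&\le \sum_{j=N}^{\infty} \big\| \mathbf{1}_{\{R_n>j\}} \big\|_1 + \sum_{j=0}^{N-1} \big\| \mathbf{1}_{\{R_n>j\}} - \mathbf{1}_{\{R_0>j\}} \big\|_1 + \sum_{j=N}^{\infty} \big\| \mathbf{1}_{\{R_0>j\}} \big\|_1 \\
&= \sum_{j=N}^{\infty} \big\| \mathbf{1}_{\{R_n>j\}} \big\|_1 + \sum_{j=0}^{N-1} \big\| \mathbf{1}_{\{R_n\le j\}} - \mathbf{1}_{\{R_0\le j\}} \big\|_1 + \sum_{j=N}^{\infty} \big\| \mathbf{1}_{\{R_0>j\}} \big\|_1.
\end{aligned}$$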


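By the choices of $N$ and $J$, the first and third terms in this last sum are smaller than $\varepsilon/3$. By (U4), increasing $J$ if necessary, we can make $\mathrm{Leb}_{\hat\gamma_0}(\{R_n = j\}\triangle\{R_0 = j\})$ sufficiently small in order to have the second term smaller than $\varepsilon/3$.

4.2. Convergence of metric entropies. Our aim is to show that $h_{\mu_n} \to h_{\mu_0}$ as $n \to \infty$, which by Proposition 2.7 can be rewritten as

$$\sigma_n^{-1} \int_{\bar\Lambda_n} \log J\bar F_n\,d\bar\mu_n \longrightarrow \sigma_0^{-1} \int_{\bar\Lambda_0} \log J\bar F_0\,d\bar\mu_0, \quad\text{as } n \to \infty. \tag{4.2}$$

Observing that $\sigma_n = \int_{\Lambda_n} R_n\,d\tilde\mu_n = \mu_n^*(M)$, then by Proposition 3.6 we have $\sigma_n \to \sigma_0$, as $n \to \infty$. Hence, (4.2) is a consequence of the next result.

Proposition 4.4. $\displaystyle\int_{\bar\Lambda_n} \log J\bar F_n\,d\bar\mu_n \longrightarrow \int_{\bar\Lambda_0} \log J\bar F_0\,d\bar\mu_0$ as $n \to \infty$.

Proof. The convergence above will follow if we show that the following term is arbitrarily small for large $n \in \mathbb{N}$:

$$E := \left| \int_{\Omega_n} (\log J\bar F_n\circ\hat\pi_n)(\bar\rho_n\circ\hat\pi_n)\,d\mathrm{Leb}_{\hat\gamma_n} - \int_{\Omega_0} (\log J\bar F_0\circ\hat\pi_0)\,\varrho_0\,d\mathrm{Leb}_{\hat\gamma_0} \right|.$$

Recall that $\varrho_0 = \bar\rho_0\circ\hat\pi_0$ and $\varrho_n = \bar\rho_n\circ\hat\pi_n\circ H_n^{-1}$, for every $n \in \mathbb{N}$. Define

$$E_0 := \left| \int_{\Omega_0^n\cap\Omega_0} (\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1})\,\varrho_n\,d(H_n)_*\mathrm{Leb}_{\hat\gamma_n} - \int_{\Omega_0^n\cap\Omega_0} (\log J\bar F_0\circ\hat\pi_0)\,\varrho_0\,d\mathrm{Leb}_{\hat\gamma_0} \right|.$$

By Lemmas 2.2 and 4.2 we have

$$E \le E_0 + K C_2 \int_{\Omega_n\setminus\Omega_n^0} R_n\,d\mathrm{Leb}_{\hat\gamma_n} + K C_2 \int_{\Omega_0\setminus\Omega_0^n} R_0\,d\mathrm{Leb}_{\hat\gamma_0}.$$

Since $R_0 \in L^1(\mathrm{Leb}_{\hat\gamma_0})$, then, by (U2) and Remark 1.2, for large $n$ we may have $\mathrm{Leb}_{\hat\gamma_0}(\Omega_0\triangle\Omega_0^n)$ small so that $\int_{\Omega_0\setminus\Omega_0^n} R_0\,d\mathrm{Leb}_{\hat\gamma_0}$ becomes negligible. Now, for each $N \in \mathbb{N}$,

$$\int_{\Omega_n\setminus\Omega_n^0} R_n\,d\mathrm{Leb}_{\hat\gamma_n} \le N \int_{\Omega_n\setminus\Omega_n^0} d\mathrm{Leb}_{\hat\gamma_n} + \int_{\{R_n>N\}} R_n\,d\mathrm{Leb}_{\hat\gamma_n}.$$

Using condition (U5) we may choose $N$ so that for all $n \in \mathbb{N}$ large enough the quantity $\int_{\{R_n>N\}} R_n\,d\mathrm{Leb}_{\hat\gamma_n} = \sum_{j=N+1}^{\infty} j\,\mathrm{Leb}_{\hat\gamma_n}\{R_n = j\}$ is arbitrarily small. Again, using (U2), if $n \in \mathbb{N}$ is sufficiently large then $\int_{\Omega_n\setminus\Omega_n^0} d\mathrm{Leb}_{\hat\gamma_n}$ is as small as we want. Therefore, we are reduced to estimating $E_0$.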


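Note that by definition $\Omega_0^n \subset \Omega_0$. Having this in mind, we split $E_0$ into the next three terms that we call $E_1$, $E_2$, $E_3$ respectively:

$$\begin{aligned}
E_0 &\le \left| \int_{\Omega_0^n} (\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1})\,\varrho_n\,d(H_n)_*\mathrm{Leb}_{\hat\gamma_n} - \int_{\Omega_0^n} (\log J\bar F_0\circ\hat\pi_0)\,\varrho_n\,d(H_n)_*\mathrm{Leb}_{\hat\gamma_n} \right| \\
&\quad + \left| \int_{\Omega_0^n} (\log J\bar F_0\circ\hat\pi_0)\,\varrho_n\,d(H_n)_*\mathrm{Leb}_{\hat\gamma_n} - \int_{\Omega_0^n} (\log J\bar F_0\circ\hat\pi_0)\,\varrho_n\,d\mathrm{Leb}_{\hat\gamma_0} \right| \\
&\quad + \left| \int_{\Omega_0^n} (\log J\bar F_0\circ\hat\pi_0)\,\varrho_n\,d\mathrm{Leb}_{\hat\gamma_0} - \int_{\Omega_0^n} (\log J\bar F_0\circ\hat\pi_0)\,\varrho_0\,d\mathrm{Leb}_{\hat\gamma_0} \right|.
\end{aligned}$$

Concerning $E_2$, using Lemma 2.2 and Lemma 4.2 we have

$$E_2 \le \int_{\Omega_0^n} |\log J\bar F_0|\,|\varrho_n| \left| \frac{d(H_n)_*\mathrm{Leb}_{\hat\gamma_n}}{d\mathrm{Leb}_{\hat\gamma_0}} - 1 \right| d\mathrm{Leb}_{\hat\gamma_0} \le K C_2 \int_{\Omega_0^n} R_0 \left| \frac{d(H_n)_*\mathrm{Leb}_{\hat\gamma_n}}{d\mathrm{Leb}_{\hat\gamma_0}} - 1 \right| d\mathrm{Leb}_{\hat\gamma_0}.$$

Now, Remark 1.2 and Lemma 4.1 guarantee that $E_2$ can be made arbitrarily small for large $n \in \mathbb{N}$. Using Corollary 3.5, $E_3$ can also be made small for large $n$. We are left with $E_1$. By Lemma 2.2 and Remark 1.2 we only need to control

$$\int_{\Omega_0^n\cap\Omega_0} \left| (\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1}) - (\log J\bar F_0\circ\hat\pi_0) \right| d\mathrm{Leb}_{\hat\gamma_0},$$

whose estimation we leave to Lemma 4.6.

Remark 4.5. Assume that $\gamma_n$ is a compact unstable manifold of the map $f_n$ for $n \ge 0$ and $\gamma_n \to \gamma_0$ in the $C^1$ topology. The convergence of $f_n$ to $f_0$ in the $C^1$ topology ensures that given $\ell \in \mathbb{N}$ and $\epsilon > 0$ there exist $\delta = \delta(\ell, \epsilon) > 0$ and $J = J(\delta) \in \mathbb{N}$ such that for every $n > J$, $x \in \gamma_0$ and $y \in \gamma_n$ with $|x - y| < \delta$,

$$\max_{j=1,\ldots,\ell} \left\{ |f_n^j(y) - f_0^j(x)|,\ \left| \log\det(Df_n^j)^u(y) - \log\det(Df_0^j)^u(x) \right| \right\} < \epsilon.$$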

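Lemma 4.6. Given any $\varepsilon > 0$ there exists $J \in \mathbb{N}$ such that for every $n > J$,

$$\int_{\Omega_0^n\cap\Omega_0} \left| (\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1}) - (\log J\bar F_0\circ\hat\pi_0) \right| d\mathrm{Leb}_{\hat\gamma_0} < \varepsilon.$$

Proof. Let $\varepsilon > 0$ be given. For $n, N \in \mathbb{N}$ define $A_{n,N} = \{R_n \le N\}\cap\{R_0 \le N\}$ and $A^c_{n,N} = \{R_n > N\}\cup\{R_0 > N\}$. By Lemma 4.2 we have

$$\int_{\Omega_0^n\cap A^c_{n,N}} \left| (\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1}) - (\log J\bar F_0\circ\hat\pi_0) \right| d\mathrm{Leb}_{\hat\gamma_0} \le C_2 \int_{\Omega_0^n\cap A^c_{n,N}} R_n\,d\mathrm{Leb}_{\hat\gamma_0} + C_2 \int_{\Omega_0^n\cap A^c_{n,N}} R_0\,d\mathrm{Leb}_{\hat\gamma_0}.$$

Since $R_0 \in L^1(\mathrm{Leb}_{\hat\gamma_0})$, there is $\delta > 0$ such that if a measurable set $A$ has $\mathrm{Leb}_{\hat\gamma_0}(A) < \delta$, then $\int_A R_0\,d\mathrm{Leb}_{\hat\gamma_0} < \varepsilon/(4C_2)$. According to (U5), we may pick $N \in \mathbb{N}$ and choose $J \in \mathbb{N}$ such that for every $n > J$ we get $\mathrm{Leb}_{\hat\gamma_0}(A^c_{n,N}) < \delta$. This implies that the second term on the right hand side of the inequality above is smaller than $\varepsilon/4$. The same argument and Lemma 4.3 allow us to conclude that for a convenient choice of $N \in \mathbb{N}$ and for $J \in \mathbb{N}$ sufficiently large,

$$C_2 \int_{\Omega_0^n\cap A^c_{n,N}} R_n\,d\mathrm{Leb}_{\hat\gamma_0} \le C_2 \int_{\Omega_0^n\cap A^c_{n,N}} R_0\,d\mathrm{Leb}_{\hat\gamma_0} + C_2 \int_{\Omega_0^n} |R_n - R_0|\,d\mathrm{Leb}_{\hat\gamma_0} \le \frac{\varepsilon}{4}.$$

So, assuming that $N$ has been chosen and $J$ is sufficiently large so that

$$\int_{\Omega_0^n\cap A^c_{n,N}} \left| (\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1}) - (\log J\bar F_0\circ\hat\pi_0) \right| d\mathrm{Leb}_{\hat\gamma_0} \le \varepsilon/2,$$

we are left to deal with

$$\begin{aligned}
&\int_{\Omega_0^n\cap A_{n,N}} \left| (\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1}) - (\log J\bar F_0\circ\hat\pi_0) \right| d\mathrm{Leb}_{\hat\gamma_0} \\
&\quad\le \sum_{i : R_0(\Upsilon_i^0)\le N} \int_{\Upsilon_i^0\cap\Upsilon_i^n} \left| (\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1}) - (\log J\bar F_0\circ\hat\pi_0) \right| \mathbf{1}_{\Omega_0^n\cap\Omega_0}\,d\mathrm{Leb}_{\hat\gamma_0} \\
&\qquad + \sum_{i : R_0(\Upsilon_i^0)\le N} \int_{\Upsilon_i^0\triangle\Upsilon_i^n} \left| (\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1}) - (\log J\bar F_0\circ\hat\pi_0) \right| \mathbf{1}_{\Omega_0^n\cap\Omega_0\cap A_{n,N}}\,d\mathrm{Leb}_{\hat\gamma_0}.
\end{aligned}$$

Denote by $S_1$ and $S_2$ respectively the first and second sums above, and by $v$ the number of terms in $S_1$ and $S_2$. By Lemma 4.2 we have

$$S_2 \le C_2 \sum_{i} \int_{\Upsilon_i^0\triangle\Upsilon_i^n} (R_n + R_0)\,\mathbf{1}_{\Omega_0^n\cap\Omega_0\cap A_{n,N}}\,d\mathrm{Leb}_{\hat\gamma_0} \le 2C_2 N \sum_{i} \mathrm{Leb}_{\hat\gamma_0}(\Upsilon_i^0\triangle\Upsilon_i^n).$$

Hence, using (U4) we consider $J \in \mathbb{N}$ large enough to have $\mathrm{Leb}_{\hat\gamma_0}(\Upsilon_i^0\triangle\Upsilon_i^n) < \varepsilon/(8C_2 N v)$, and so $S_2 \le \varepsilon/4$. Let $\tau_i = R_0(\Upsilon_i^0) = R_n(\Upsilon_i^n) \le N$. We want to see that for all $n$ large enough and all $x \in \Upsilon_i^0\cap\Upsilon_i^n$ with $\tau_i \le N$,

$$\left| (\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1})(x) - (\log J\bar F_0\circ\hat\pi_0)(x) \right| \le \varepsilon/4v, \tag{4.3}$$

which yields $S_1 \le \varepsilon/4$. Using (2.3) and observing that the curves $\hat\gamma_n$, $\hat\gamma_0$ are the leaves we chose to define the reference measures $\bar m_n$, $\bar m_0$, we easily get for $y = H_n^{-1}(x)$,

$$\left| \log J\bar F_n\circ\hat\pi_n(y) - \log J\bar F_0\circ\hat\pi_0(x) \right| \le \left| \log\det(Df_n^{\tau_i})^u(y) - \log\det(Df_0^{\tau_i})^u(x) \right| + \left| \log\hat u_n(f_n^{\tau_i}(y)) - \log\hat u_0(f_0^{\tau_i}(x)) \right|.$$

Using Remark 4.5 with $\ell = N$ and $\varepsilon/8v$ instead of $\epsilon$, and recalling that $\tau_i \le N$, we may find $\delta > 0$ and $J \in \mathbb{N}$ so that for all $n > J$,

$$\left| \log\det(Df_n^{\tau_i})^u(y) - \log\det(Df_0^{\tau_i})^u(x) \right| < \varepsilon/8v. \tag{4.4}$$

Observe that $|x - y| < \delta$ as long as $J$ is sufficiently large, since $x = H_n(y)$.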


For every $n, k \in \mathbb{N}_0$ and $t \in \Lambda_n$, let

$$\hat u_n^k(t) = \prod_{j=0}^{k} \frac{\det Df_n^u(f_n^j(t))}{\det Df_n^u(f_n^j(\hat t))}.$$

By definition of $\hat u_n$ (see (2.1)) and by (P3)(a), there is $k \in \mathbb{N}$ such that for every $n \in \mathbb{N}_0$ and $t \in \Lambda_n$ we have $|\log\hat u_n(t) - \log\hat u_n^k(t)| < \varepsilon/(48v)$. Thus,

$$\begin{aligned}
\left| \log\hat u_n(f_n^{\tau_i}(y)) - \log\hat u_0(f_0^{\tau_i}(x)) \right| &\le \left| \log\hat u_n(f_n^{\tau_i}(y)) - \log\hat u_n^k(f_n^{\tau_i}(y)) \right| \\
&\quad + \left| \log\hat u_n^k(f_n^{\tau_i}(y)) - \log\hat u_0^k(f_0^{\tau_i}(x)) \right| \\
&\quad + \left| \log\hat u_0^k(f_0^{\tau_i}(x)) - \log\hat u_0(f_0^{\tau_i}(x)) \right| \\
&\le \sum_{j=0}^{k} \left| \log\det Df_n^u(f_n^j(\zeta)) - \log\det Df_0^u(f_0^j(z)) \right| \\
&\quad + \sum_{j=0}^{k} \left| \log\det Df_n^u(f_n^j(\hat\zeta)) - \log\det Df_0^u(f_0^j(\hat z)) \right| + \frac{\varepsilon}{24v},
\end{aligned}$$

where $z = f_0^{\tau_i}(x)$, $\zeta = f_n^{\tau_i}(y)$, $\hat z$ is the only point on the set $\gamma_0^s(z)\cap\hat\gamma_0$ and $\hat\zeta$ is the unique point on the set $\gamma_n^s(\zeta)\cap\hat\gamma_n$. Observe that since $\hat\gamma_n \to \hat\gamma_0$ and $f_n \to f_0$ in the $C^1$ topology, and $\tau_i \le N$, then $\gamma_n^u(\zeta) \to \gamma_0^u(z)$ in the $C^1$ topology. Besides, using Lemma 3.3 we also have $|\hat z - \hat\zeta|$ as small as we want for $J$ large enough. Consequently, by Remark 4.5, we may find $J \in \mathbb{N}$ sufficiently large so that for all $n > J$, we have

$$\sum_{j=0}^{k} \left| \log\det Df_n^u(f_n^j(\zeta)) - \log\det Df_0^u(f_0^j(z)) \right| < \varepsilon/(24v), \tag{4.5}$$

and

$$\sum_{j=0}^{k} \left| \log\det Df_n^u(f_n^j(\hat\zeta)) - \log\det Df_0^u(f_0^j(\hat z)) \right| < \varepsilon/(24v). \tag{4.6}$$

Estimates (4.4), (4.5) and (4.6) yield (4.3).
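For the reader's convenience, the bookkeeping of the constants in the proof of Lemma 4.6 can be checked as follows. This is only a sketch that adds up the bounds stated in the proof; it introduces no new estimate.

```latex
% Pointwise bound (4.3): one term from (4.4), two truncation errors of size
% \varepsilon/(48v) (for \hat u_n and \hat u_0), and the bounds (4.5), (4.6).
\[
  \frac{\varepsilon}{8v} + 2\cdot\frac{\varepsilon}{48v}
  + \frac{\varepsilon}{24v} + \frac{\varepsilon}{24v}
  = \frac{3\varepsilon + \varepsilon + \varepsilon + \varepsilon}{24v}
  = \frac{\varepsilon}{4v}.
\]
% Total: the part over A^c_{n,N}, followed by S_1 and S_2.
\[
  \frac{\varepsilon}{2} + \frac{\varepsilon}{4} + \frac{\varepsilon}{4} = \varepsilon .
\]
```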


Communicated by G. Gallavotti

Commun. Math. Phys. 296, 769–826 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1016-9

Communications in

Mathematical Physics

Equilibrium Fluctuations for a Model of Coagulating-Fragmenting Planar Brownian Particles

Mojtaba Ranjbar1, Fraydoun Rezakhanlou2

1 Mathematics and Computer Science Faculty, Amirkabir University, Tehran, Iran
2 Department of Mathematics, University of California, Berkeley, California 94720–3830, USA. E-mail: [email protected]

Received: 29 May 2009 / Accepted: 24 November 2009 Published online: 18 February 2010 – © The Author(s) 2010. This article is published with open access at Springerlink.com

Abstract: We study a model of mass-bearing coagulating-fragmenting planar Brownian particles. Coagulation occurs when two particles are within a distance of order ε. Our model is macroscopically described by an inhomogeneous Smoluchowski’s equation in the low ε limit provided that the initial number of particles N is of order | log ε|. When a detailed balance condition is satisfied, we establish a central limit theorem by showing that in the low ε limit, the particle density fluctuation fields obey an Ornstein-Uhlenbeck stochastic differential equation.

1. Introduction

One of the main purposes of statistical mechanics is to explain the macroscopic behavior of various phenomena in terms of the statistics of their microscopic structures. Macroscopically we often have a PDE involving a small number of parameters. The microscopic description however involves a large number of components that are evolving by either deterministic or stochastic rules. Let us name three reasons to justify our interest in understanding the connection between the microscopic and macroscopic descriptions.

As our first reason, we remark that historically the macroscopic equation is formally derived from the microscopic description of the phenomenon under study. It is an important task of statistical mechanics to justify such a derivation rigorously and verify the validity of the macroscopic PDE.

For our second reason, we note that we often have simple dynamical rules for the microscopic model and the main challenging feature of the model has to do with its large size. On the other hand, the macroscopic evolution involves only a few variables but is dictated by rather sophisticated nonlinear rules. It is the case for many examples that the macroscopic equation is not fully understood. Hopefully by exploring its relation with its microscopic counterpart we may discover new tools and techniques for the macroscopic equation.

This work is supported in part by NSF grant DMS-0707890.


As our third reason, we should mention that even though the macroscopic equation is preferred because of its dependence on a small number of variables, it is only a reduced description of the microscopic phenomenon at hand and we would like to find practical ways of recovering some of the lost information as we switch to the macroscopic world. Since the passage from the microscopic details to macroscopic parameters can be recast as a law of large numbers for some conserved quantities in many scenarios, probability theory suggests some standard routes for going beyond the law of large numbers and gain new information. The celebrated central limit theorem and large deviations for classical examples are guidelines for producing some vital information for the microscopic model under study. It is the latter reason which is the chief motivation for the present article. Our microscopic model is a system of coagulating-fragmenting Brownain particles which is macroscopically described by an inhomogeneous Smoluchowski equation. This equation is derived as a law of large numbers. The main contribution of this article is a central limit theorem for the aforementioned law of large numbers when the system is in equilibrium. In our model, we start with N particles with each particle traveling in Rd as a Brownain motion. Each particle has a size m ∈ Z and a radius r ∈ (0, ∞). In fact our interpretation of the location x of a particle is that x is the center of a ball of radius r and in some sense only a small fraction of the ball is occupied by the true particle. It turns out that in reality each particle is a cluster of smaller objects and the cluster is a complex fractal like entity that is too complicated to be treated with the existing techniques. 
That is why we simplify the model by replacing the cluster with a ball of radius r (m) = m χ so that when χ > d1 , we are taking into account the fact that the mass of the particle comes from a small portion of the ball which is occupied by the cluster. This may appear somewhat native and not too realistic from a physical point of view. Nevertheless, as was explained in [HR1–3] and [R2], the model does exhibit some expected features of the underlying 1 physics. For example, the condition χ < d−2 guarantees that no gelation occurs in finite time. That is, no particle of size infinity is formed in finite time at the macroscopic level. 1 We also conjecture that an instantaneous gelation would occur if χ > d−2 . In fact the true radius of a particle is εr with ε very small. A calculation involving Wiener sausages reveals that if N = ε2−d when d ≥ 3 and N =| log ε | when d = 2, then the expected value of the number of times a particle coagulates with other particles in one unit of time stays positive and finite as → 0. This property allows us to obtain the Smoluchowski’s equation for the evolution of cluster densities in the low ε limit. To further simplify the involved mathematical technicalities, we forget about balls presenting each particle and regard each particle as a point. Now the coagulation occurs stochastically only when particles of positions x and y and masses m and n, satisfy | x − y |≤ c0 ε(m χ + n χ ), for a constant c0 . In the preceding works [HR1–2], [R2 and HRY], we were able to derive the macroscopic equation as a law of large numbers; if we label the locations and masses of particles as (xi , m i ), i ∈ I , then our law of large numbers asserts 1 δxi (t) (d x)11(m i (t) = n) → f n (x, t)d x Kε i∈I

with $f_n$ solving Smoluchowski’s equation (2.6) of Sect. 2. Here
\[ K_\varepsilon = \begin{cases} \varepsilon^{2-d} & \text{if } d \ge 3, \\ |\log \varepsilon| & \text{if } d = 2. \end{cases} \tag{1.1} \]
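To make the scaling (1.1) concrete, here is a minimal Python sketch that evaluates $K_\varepsilon$ for a given $\varepsilon$ and dimension $d$. The function name `scaling_K` is an illustrative choice and does not come from the article.

```python
import math


def scaling_K(eps: float, d: int) -> float:
    """Particle-number scale K_eps of (1.1): eps^(2-d) for d >= 3, |log eps| for d = 2."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    if d == 2:
        return abs(math.log(eps))
    if d >= 3:
        return eps ** (2 - d)
    raise ValueError("the scaling is only stated for d >= 2")


if __name__ == "__main__":
    # Compare the planar and three-dimensional scales for a few values of eps.
    for eps in (1e-2, 1e-4, 1e-8):
        print(eps, scaling_K(eps, 2), scaling_K(eps, 3))
```

The comparison shows how slowly the planar scale $|\log \varepsilon|$ grows next to the power law that holds for $d \ge 3$.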

Equilibrium Fluctuations of Coagulating-Fragmenting Planar Brownian Particles


As a central limit theorem, we are interested in the limit of the fluctuation fields
\[ \xi_n^{\varepsilon}(dx, t) = K_\varepsilon^{1/2} \Big( K_\varepsilon^{-1} \sum_i \delta_{x_i(t)}(dx)\, \mathbf{1}(m_i(t) = n) - f_n(x, t)\,dx \Big) \tag{1.2} \]

as $\varepsilon \to 0$. In Sect. 2, we state a conjecture regarding the evolution of $\xi_n^{\varepsilon}$ in the low $\varepsilon$ limit. According to this conjecture, the limit $\xi$ solves an Ornstein–Uhlenbeck stochastic differential equation in an infinite dimensional setting with $\xi$ living in a negative Sobolev space. The conjecture is formulated using the so-called fluctuation–dissipation principle of non-equilibrium statistical mechanics. In this article, we establish the conjecture only when the dimension is 2 and the model satisfies a detailed-balance condition. Some steps of our proof do not apply to higher dimensions. The case $d \ge 3$ is more challenging and is left for a future investigation. We continue this Introduction by mentioning some previous work related to our model. Smoluchowski’s equation was introduced by Smoluchowski in the seminal work [Sm]. The first mathematically rigorous derivation of Smoluchowski’s equation from a model of coagulating Brownian particles was carried out by Lang–Nguyen [LN] when $d = 3$ and all particles have the same diffusion coefficient. A related problem has been studied by Sznitman [Sz] when $d = 2$. A completely different approach has been employed in [HR1-2 and YRH] to treat the model in general. A thorough survey on related models and their applications can be found in Aldous [A]. In fact Open Problem 16 in [A] is exactly our central limit theorem when there is no spatial dependence. We refer to the monograph [Sp] for an introduction to related questions in statistical mechanics and a discussion of the fluctuation–dissipation principle. An equilibrium fluctuation result has been studied in [R1] for a model of the colliding particles associated with the discrete Boltzmann equation. We end this section with an outline of the paper. In Sect. 2, we state a conjecture for the macroscopic evolution of the fluctuation fields. In Sect. 3, a family of reversible invariant measures for the microscopic model is constructed.
In this section, the conjecture is restated as the main result of this paper under the assumption that the model starts from one of the reversible measures and that the dimension is 2. In Sect. 4, the strategy of the proof is described. The first step of the proof is a regularity of the coagulation term and is carried out in Sects. 5–7. The proof of the main result is completed in Sects. 8 and 9.

2. A Conjecture

We start with the description of our model. The configuration space $\Omega$ consists of pairs $\omega = (x, m)$ with $x$ a subset of $\mathbb{R}^d$ with no accumulation point and $m : x \to \mathbb{N} = \{1, 2, 3, \dots\}$ a map that assigns a positive integer to each element of $x$. Throughout this section we assume that $d \ge 2$. It is often convenient to write $\omega = (x_i, m_i)_{i \in I}$ with $x_i \in \mathbb{R}^d$ and $m_i \in \mathbb{N}$, where $I = I(\omega)$ is a countable index set. We regard $x$ as a collection of cluster positions in $\mathbb{R}^d$ with no accumulation point, and $m$ assigns a size to each such position. We may also identify $\omega = (x, m)$ with the discrete measure $\hat\omega = \sum_{i \in I} \delta_{(x_i, m_i)}$ on $\mathbb{R}^d \times \mathbb{N}$. Using this identification we equip the space $\Omega$ with the topology of vague convergence. We now describe the evolution of coagulating and fragmenting Brownian clusters as a Markov process on the configuration space $\Omega$. For this, functions $d : \mathbb{N} \to (0, \infty)$, $\alpha : \mathbb{N} \times \mathbb{N} \to (0, \infty)$ and $\beta : \mathbb{N} \times \mathbb{N} \to (0, \infty)$ are given which represent the diffusion


M. Ranjbar, F. Rezakhanlou

coefficient, the coagulation rate and the fragmentation rate respectively. We assume that both $\alpha$ and $\beta$ are symmetric. Also a parameter $\chi \in [0, \infty)$ and a continuously differentiable function $V : \mathbb{R}^d \to [0, \infty)$ are given for our model. We then define a Markov process $\omega(t)$ with infinitesimal generator $A = A_0 + A_c + A_f$, where $A_0$ represents the “Brownian motion” part of the dynamics, and $A_c$ and $A_f$ represent the coagulation and fragmentation parts of the evolution. For the “Brownian motion” part, we use the representation $\omega = (x_i, m_i)_{i \in I}$ to write
\[ A_0 F(\omega) = \sum_{i \in I} d(m_i)\, \Delta_{x_i} F(\omega), \tag{2.1} \]

for any $C^2$ function $F$. Here $\Delta_{x_i}$ represents the Laplace operator acting on the variable $x_i$. As for the coagulation part, we write $A_c F(\omega) = A_c^+ F(\omega) - A_c^- F(\omega)$ with $A_c^+$ and $A_c^-$ given by
\[ A_c^+ F(\omega) = \frac{1}{2} \sum_{i, j \in I,\ i \neq j} \alpha(m_i, m_j) V_\varepsilon(x_i - x_j; m_i, m_j) \Big[ \frac{m_i}{m_i + m_j} F(S^1_{ij}\omega) + \frac{m_j}{m_i + m_j} F(S^2_{ij}\omega) \Big], \]
\[ A_c^- F(\omega) = \frac{1}{2} \sum_{i, j \in I,\ i \neq j} \alpha(m_i, m_j) V_\varepsilon(x_i - x_j; m_i, m_j) F(\omega). \]

Here,
(i) $\varepsilon > 0$ is a small parameter that represents the range of interaction.
(ii) The function $V_\varepsilon(x; m, n) = \lambda(\varepsilon) V(x/\varepsilon; m, n)$, where
\[ \lambda(\varepsilon) = \begin{cases} |\log \varepsilon|^{-1} \varepsilon^{-2} & \text{if } d = 2, \\ \varepsilon^{-2} & \text{if } d \ge 3, \end{cases} \tag{2.2} \]
and
\[ V(x; m, n) = r(m, n)^{-2}\, V\Big( \frac{x}{r(m, n)} \Big) \tag{2.3} \]
with $r(m, n) = r(m) + r(n)$, for $r(n) = n^{\chi}$, and $V$ is a symmetric Hölder continuous function of compact support and total integral 1.
(iii) We denote by $S^1_{ij}\omega$ the configuration formed from $\omega$ by removing $x_j$ from $x$ and assigning the size $m_i + m_j$ to the surviving cluster at $x_i$. The configuration $S^2_{ij}\omega$ is defined in the same way, except that we remove $x_i$ from $x$ and assign the size $m_i + m_j$ to the cluster at $x_j$. We note that the cluster at $x_i$ survives the coagulation with probability $m_i/(m_i + m_j)$.

Before describing the fragmentation part of the dynamics, let us explain the form of the function $V_\varepsilon$. Note that $V_\varepsilon(x_i - x_j; m_i, m_j) \neq 0$ only if $x_i - x_j$ is of order $\varepsilon(r(m_i) + r(m_j))$ with $r(m) = m^{\chi}$. This means that we regard a particle of size $m$ to be roughly a ball of radius $\varepsilon r(m)$ so that a pair of clusters of centers $x_i$ and $x_j$ coagulate


when their corresponding balls overlap. If we assume that the mass of the $i$th cluster is distributed evenly in the ball $B_{r(m_i)}(x_i)$, then we expect to have $\chi = 1/d$. However, in reality a cluster is far from being a round ball and is expected to be a fractal-like object. By allowing $\chi \in (0, \infty)$ we are hoping to have a more physically relevant model. In particular, the case $\chi > 1/d$ represents a scenario in which the ball $B_{r(m_i)}(x_i)$ contains the true cluster and only a fraction of the ball is occupied by the cluster. We believe that the case $\chi > 1/(d-2)$ corresponds to the occurrence of “gelation”. We refer to [HR1,HR3 and R2] for more discussions. (Note that no finite $\chi$ can cause gelation when $d = 2$; we guess that the radius must grow exponentially with the mass in order to have a gel in this case.) The factor $\lambda(\varepsilon)$ in the definition of $V_\varepsilon$ guarantees that when two clusters collide, they coagulate with a probability that stays away from 0 as $\varepsilon \to 0$. Indeed $B = x_i - x_j$ is a Brownian motion that spends a time of order $\varepsilon^2 r(m_i, m_j)^2$ (respectively $\varepsilon^2 r(m_i, m_j)^2 |\log \varepsilon|$) in the support of $V_\varepsilon$ when $d \ge 3$ (respectively $d = 2$). We also multiply the sum in the definition of $A_c$ by 1/2 to ensure that the summation is over unordered pairs $\{i, j\}$. As for the fragmentation part, $A_f F(\omega) = A_f^+ F(\omega) - A_f^- F(\omega)$ is given by

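\[ \frac{1}{2} \sum_i \sum_{m=1}^{m_i - 1} \beta(m, m_i - m) \int V^{\varepsilon}(x_i - y; m_i - m, m) \big( F(S_i^{y,m}\omega) - F(\omega) \big)\,dy, \]
with
\[ A_f^+ F(\omega) = \frac{1}{2} \sum_i \sum_{m=1}^{m_i - 1} \beta(m, m_i - m) \int V^{\varepsilon}(x_i - y; m_i - m, m)\, F(S_i^{y,m}\omega)\,dy. \]
Here,
\[ V^{\varepsilon}(a; m, n) = \varepsilon^{-d}\, V\Big( \frac{a}{\varepsilon}; m, n \Big), \tag{2.4} \]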

and $S_i^{y,m}\omega$ is the configuration formed from $\omega$ by replacing $(x_i, m_i)$ with a pair of clusters of positions $x_i$ and $y$ and sizes $m$ and $m_i - m$. The central object to study is the cluster density of a given size. Microscopically we are interested in the empirical measures
\[ g_n^{\varepsilon}(dx, t) = K_\varepsilon^{-1} \sum_i \delta_{x_i(t)}(dx)\, \mathbf{1}(m_i(t) = n), \]
where $K_\varepsilon$ was defined by (1.1). If for example we select $(x_1(0), m_1(0)), \dots, (x_N(0), m_N(0))$ randomly and independently with the law
\[ P(x_i(0) \in A,\ m_i(0) = n) = \frac{1}{Z} \int_A f_n^0(x)\,dx, \tag{2.5} \]
with $Z = \sum_n \int f_n^0\,dx$, then by the law of large numbers, $g_n^{\varepsilon}(dx, 0)$ converges weakly to $f_n^0(x)\,dx$ provided that $N = K_\varepsilon Z$ and $\varepsilon \to 0$. Note that such a choice of initial condition implies that on average there are $K_\varepsilon f_n^0(x)\,dx$ many particles of size $n$. A Wiener sausage calculation reveals that on average, each particle in our model experiences finitely many coagulations per unit time. This explains our reason for choosing


$K_\varepsilon$ as above. The main result of [HR1,HR2 and R1] states that if $\chi < 1/(d-2)$, there is no fragmentation, and $\alpha$ satisfies some technical conditions, then the empirical density $g_n^{\varepsilon}(dx, t)$ converges to $f_n(x, t)\,dx$, where $f_n$ is a solution to Smoluchowski’s equation subject to the initial condition $f_n(x, 0) = f_n^0(x)$. It is shown in [HR3] that this solution is unique. Smoluchowski’s equation has the form
\[ \frac{\partial f_n}{\partial t}(x, t) = d(n) \Delta_x f_n(x, t) + Q_n^{+,c}(\mathbf{f}) - Q_n^{-,c}(\mathbf{f}) + Q_n^{+,f}(\mathbf{f}) - Q_n^{-,f}(\mathbf{f}), \]
where $\mathbf{f} = (f_n : n \in \mathbb{N})$, and
\[ \begin{aligned} Q_n^{+,c}(\mathbf{f}) &= \frac{1}{2} \sum_{m=1}^{n-1} \hat\alpha(m, n - m) f_m f_{n-m}, & Q_n^{-,c}(\mathbf{f}) &= \sum_{m=1}^{\infty} \hat\alpha(m, n) f_m f_n, \\ Q_n^{+,f}(\mathbf{f}) &= \sum_{m=1}^{\infty} \hat\beta(m, n) f_{n+m}, & Q_n^{-,f}(\mathbf{f}) &= \frac{1}{2} \sum_{m=1}^{n-1} \hat\beta(m, n - m) f_n, \end{aligned} \tag{2.6} \]
with
\[ \hat\alpha(m, n) = \eta(m, n) \alpha(m, n), \qquad \hat\beta(m, n) = \eta(m, n) \beta(m, n). \tag{2.7} \]

The function $\eta(m, n)$ is calculated in terms of the microscopic details of the model. We start with the case $d = 2$. In this case $\eta$ is independent of the function $V$ and the parameter $\chi$, and is simply given by
\[ \eta(m, n) = \frac{2\pi (d(m) + d(n))}{2\pi (d(m) + d(n)) + \alpha(m, n)}. \tag{2.8} \]
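As an illustration of (2.7) and (2.8), the following Python sketch computes the planar reduction factor and the resulting macroscopic rates. The function names `eta_2d` and `macroscopic_rates` are hypothetical and chosen only for this example.

```python
import math


def eta_2d(d_m: float, d_n: float, alpha_mn: float) -> float:
    """Reduction factor eta(m, n) of (2.8), valid for d = 2 only."""
    if d_m <= 0 or d_n <= 0 or alpha_mn <= 0:
        raise ValueError("diffusion coefficients and rates must be positive")
    s = 2.0 * math.pi * (d_m + d_n)
    return s / (s + alpha_mn)


def macroscopic_rates(d_m: float, d_n: float, alpha_mn: float, beta_mn: float):
    """Return (alpha_hat, beta_hat) = (eta * alpha, eta * beta) as in (2.7)."""
    eta = eta_2d(d_m, d_n, alpha_mn)
    return eta * alpha_mn, eta * beta_mn


if __name__ == "__main__":
    # The macroscopic coagulation rate saturates as the microscopic rate grows.
    for alpha in (0.1, 1.0, 10.0, 1e3, 1e6):
        print(alpha, macroscopic_rates(1.0, 1.0, alpha, 1.0))
```

One consequence visible in the output is that $\hat\alpha(m, n)$ never exceeds $2\pi(d(m) + d(n))$, however large the microscopic rate $\alpha(m, n)$ is, since $\hat\alpha = s\alpha/(s + \alpha)$ with $s = 2\pi(d(m) + d(n))$.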

The formula for $\eta(m, n)$ is more complicated when $d \ge 3$ and does depend on both $V$ and $\chi$. Here is the recipe for $\eta$: first we find the unique solution to the equation
\[ (d(m) + d(n))\, \Delta u_{m,n}(x) = \alpha(m, n) V(x; m, n) (1 + u_{m,n}(x)) \tag{2.9} \]
with $u(x; m, n) = u_{m,n}(x)$ satisfying $u_{m,n}(x) \to 0$ as $|x| \to \infty$. Then we set
\[ \eta(m, n) = \int V(x; m, n) (1 + u_{m,n}(x))\,dx. \tag{2.10} \]

Remark 2.1.
• For the purposes of this section, we have assumed that $\sum_n \int f_n^0\,dx < \infty$, which implies that there are finitely many particles almost surely. However in Sect. 3, when the main result of this article is discussed, the density $f_n^0$ is constant and the system involves infinitely many particles. The existence of such a particle system is no longer obvious, and in Remark 3.5 we will explain how such a particle system is constructed.
• Note that we deliberately choose a mechanism for the fragmentation that is, in some sense, dual to the coagulation mechanism. This allows us to easily construct reversible invariant measures for the process $\omega(t)$. In other words the fragmentation is defined in such a way that if we reverse time after a coagulation, we obtain a fragmentation. For the kinetic limit however, we can use a kernel $W$ for the fragmentation that is different from $V$, or even choose two new locations $y_1$ and $y_2$ near $x_i$ for the locations of the new clusters of a fragmented cluster. However, for this fragmentation mechanism, the macroscopic coagulation and fragmentation rates read $\hat\alpha = \alpha\eta$, $\hat\beta = \beta\eta'$ with possibly $\eta \neq \eta'$.


• Let us write $Q_n = Q_n^{+,c} - Q_n^{-,c} + Q_n^{+,f} - Q_n^{-,f}$. We then have the following useful formula: for any sequence $(J_n : n \in \mathbb{N})$,
\[ \sum_n J_n Q_n = \frac{1}{2} \sum_{n,m} \big( \hat\alpha(m, n) f_m f_n - \hat\beta(m, n) f_{m+n} \big) (J_{m+n} - J_m - J_n). \]
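The weak formula for $\sum_n J_n Q_n$ can be checked numerically. The sketch below is a sanity check under stated assumptions, not part of the article: densities are taken to vanish beyond a finite size $N$ so that all sums are finite, the dependence on $x$ is suppressed, and the kernels passed in are arbitrary symmetric test functions. All function names are hypothetical.

```python
import math
import random


def _density(f, n):
    """f[1..N] holds the densities; sizes outside 1..N carry zero density (assumption)."""
    return f[n] if 1 <= n < len(f) else 0.0


def collision_terms(f, alpha_hat, beta_hat):
    """Q_n = Q^{+,c} - Q^{-,c} + Q^{+,f} - Q^{-,f} of (2.6) for n = 1..2N."""
    N = len(f) - 1
    Q = [0.0] * (2 * N + 1)
    for n in range(1, 2 * N + 1):
        fn = _density(f, n)
        gain_c = 0.5 * sum(alpha_hat(m, n - m) * _density(f, m) * _density(f, n - m)
                           for m in range(1, n))
        loss_c = sum(alpha_hat(m, n) * f[m] * fn for m in range(1, N + 1))
        gain_f = sum(beta_hat(m, n) * _density(f, n + m) for m in range(1, N + 1))
        loss_f = 0.5 * sum(beta_hat(m, n - m) * fn for m in range(1, n))
        Q[n] = gain_c - loss_c + gain_f - loss_f
    return Q


def weak_form_sides(f, J, alpha_hat, beta_hat):
    """Return (sum_n J_n Q_n, right-hand side of the weak formula); J needs indices up to 2N."""
    N = len(f) - 1
    Q = collision_terms(f, alpha_hat, beta_hat)
    lhs = sum(J[n] * Q[n] for n in range(1, 2 * N + 1))
    rhs = 0.5 * sum(
        (alpha_hat(m, n) * f[m] * f[n] - beta_hat(m, n) * _density(f, m + n))
        * (J[m + n] - J[m] - J[n])
        for m in range(1, N + 1) for n in range(1, N + 1))
    return lhs, rhs


if __name__ == "__main__":
    rng = random.Random(0)
    N = 6
    f = [0.0] + [rng.random() for _ in range(N)]
    J = [0.0] + [rng.uniform(-1.0, 1.0) for _ in range(2 * N)]
    lhs, rhs = weak_form_sides(f, J, lambda m, n: 1.0 + m * n, lambda m, n: 1.0 / (m + n))
    print(lhs, rhs, math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-9))
```

Choosing $J_n = n$ makes the factor $J_{m+n} - J_m - J_n$ vanish, so the same check recovers conservation of mass, $\sum_n n Q_n = 0$.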

The main goal of this article is to derive an equation for the evolution of the density fluctuations about the solution to Smoluchowski’s equation. To this end, recall the fluctuation fields $\xi_n^{\varepsilon}(dx, t)$ that were defined by (1.2). Let us assume that $\chi < (d - 2)^{-1}$ and that the total mass $\sum_n n \int f_n^0\,dx$ is finite.

Conjecture 2.1. As $\varepsilon \to 0$, the process $\xi_n^{\varepsilon}$ converges to $\xi_n$, where $\xi_n$ is the unique solution to the Ornstein–Uhlenbeck equation
\[ \frac{\partial \xi_n}{\partial t} = d(n) \Delta_x \xi_n + \mathcal{L}_n^c \xi + \mathcal{L}_n^f \xi + \gamma_n, \qquad \xi_n(x, 0) = \bar\xi_n(x), \tag{2.11} \]
where $\xi = (\xi_n : n \in \mathbb{N})$, and
\[ \mathcal{L}_n^c = \mathcal{L}_n^{+,c} - \mathcal{L}_n^{-,c}, \qquad \mathcal{L}_n^f = \mathcal{L}_n^{+,f} - \mathcal{L}_n^{-,f}, \tag{2.12} \]
with
\[ \mathcal{L}_n^{+,c} \xi = \sum_{m=1}^{n-1} \hat\alpha(m, n - m) f_m \xi_{n-m}, \qquad \mathcal{L}_n^{-,c} \xi = \sum_{m=1}^{\infty} \hat\alpha(m, n) (f_m \xi_n + f_n \xi_m), \tag{2.13} \]
\[ \mathcal{L}_n^{+,f} \xi = \sum_{m=1}^{\infty} \hat\beta(m, n) \xi_{n+m}, \qquad \mathcal{L}_n^{-,f} \xi = \frac{1}{2} \sum_{m=1}^{n-1} \hat\beta(m, n - m) \xi_n, \tag{2.14} \]
and $\gamma_n$ is a space-time white noise with variance given by
\[ \begin{aligned} \mathbb{E} \Big( \sum_n \int J_n \gamma_n\,dx\,dt \Big)^2 = {} & 2 \sum_n \int d(n) f_n |\nabla J_n|^2\,dx\,dt \\ & + \frac{1}{2} \sum_{m,n} \int \hat\alpha(m, n) f_n f_m (J_{n+m} - J_n - J_m)^2\,dx\,dt \\ & + \frac{1}{2} \sum_{m,n} \int \hat\beta(m, n) f_{n+m} (J_{n+m} - J_n - J_m)^2\,dx\,dt \end{aligned} \tag{2.15} \]
for any smooth test function $J = (J_n : n \in \mathbb{N})$ of compact support in $\mathbb{R}^d \times (0, \infty)$. In fact $\gamma$ belongs to a suitable negative Sobolev space and the integral of $J_n \gamma_n$ must be understood as the value of the distribution $\gamma_n$ at the smooth test function $J_n$. See the next section or the beginning of Sect. 8 for the precise definition of $\xi$ and $\gamma$ and the meaning of Eq. (2.11). The main result of this paper asserts that Conjecture 2.1 is valid if the initial distribution of the clusters is chosen according to a reversible equilibrium state and $d = 2$.


3. Equilibrium Fluctuations

We start with constructing reversible invariant measures for the process $\omega(t)$. For this we take a collection of positive numbers $\lambda = (\lambda_n : n \in \mathbb{N})$ such that $\sum_n \lambda_n < \infty$, and
\[ \alpha(m, n) \lambda_n \lambda_m = \beta(m, n) \lambda_{n+m} \tag{3.1} \]
for every $m, n \in \mathbb{N}$. Note that for such a collection, the functions $f_n(x, t) \equiv \lambda_n$ do solve Smoluchowski’s equation because by (2.7) and (3.1),
\[ \hat\alpha(m, n) \lambda_n \lambda_m = \hat\beta(m, n) \lambda_{n+m}, \tag{3.2} \]
and this in turn implies
\[ Q_n^{+,c}(\lambda) = Q_n^{-,f}(\lambda), \qquad Q_n^{-,c}(\lambda) = Q_n^{+,f}(\lambda). \tag{3.3} \]
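The passage from detailed balance (3.1) to the identities (3.3) can be illustrated numerically. In the following Python sketch the coagulation rate $\alpha(m, n) = m + n$ and the weights $\lambda_n = 2^{-n}/n$ are assumptions made purely for the example, the infinite sums are truncated at a hypothetical `cutoff`, and the common factor $\eta(m, n)$ is omitted because it multiplies both sides of (3.2) equally.

```python
import math


def beta_from_detailed_balance(alpha, lam):
    """Fragmentation rate forced by (3.1): beta(m, n) = alpha(m, n) lam(m) lam(n) / lam(m + n)."""
    def beta(m, n):
        return alpha(m, n) * lam(m) * lam(n) / lam(m + n)
    return beta


def balance_gaps(alpha, beta, lam, n, cutoff):
    """Return (Q^{+,c}_n - Q^{-,f}_n, Q^{-,c}_n - Q^{+,f}_n) at f = lam.

    The sums over m = 1, 2, ... are truncated at `cutoff`; both sides are
    truncated in the same way, so the comparison is term by term.
    """
    gain_c = 0.5 * sum(alpha(m, n - m) * lam(m) * lam(n - m) for m in range(1, n))
    loss_f = 0.5 * sum(beta(m, n - m) * lam(n) for m in range(1, n))
    loss_c = sum(alpha(m, n) * lam(m) * lam(n) for m in range(1, cutoff + 1))
    gain_f = sum(beta(m, n) * lam(n + m) for m in range(1, cutoff + 1))
    return gain_c - loss_f, loss_c - gain_f


if __name__ == "__main__":
    alpha = lambda m, n: float(m + n)      # illustrative symmetric rate
    lam = lambda n: 0.5 ** n / n           # illustrative summable weights
    beta = beta_from_detailed_balance(alpha, lam)
    for n in range(1, 6):
        print(n, balance_gaps(alpha, beta, lam, n, cutoff=50))
```

With these choices the induced fragmentation rate is $\beta(m, n) = (m + n)^2/(mn)$, and both gaps vanish up to rounding error for every $n$.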

Given such $\lambda$, we construct a reversible invariant measure $\mu_\lambda$ for our process $\omega(t)$: let $x^n$ be a Poisson point process with intensity $K_\varepsilon \lambda_n$. Assume that $(x^n, n \in \mathbb{N})$ are independent and define $\omega = (x, m)$ by $x = \bigcup_{n=1}^{\infty} x^n$ and $m(a) = n$ for $a \in x^n$. In words, particles of size $n$ form a Poisson point process of intensity $K_\varepsilon \lambda_n$ and these processes are independent for different choices of $n$. We note that if $\Lambda$ is a bounded subset of $\mathbb{R}^d$, then
\[ \int M_\Lambda\,d\mu_\lambda = |\Lambda| K_\varepsilon \sum_n \lambda_n, \]
where
\[ M_\Lambda^n(\omega) = M_\Lambda^n(x, m) = \#\{a \in x : a \in \Lambda,\ m(a) = n\}, \qquad M_\Lambda = \sum_{n=1}^{\infty} M_\Lambda^n. \]
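The construction of the equilibrium measure can be imitated on a computer. The Python sketch below samples the restriction of such a configuration to a box $[0, \text{side}]^d$: for each size $n$ it draws a Poisson number of clusters with mean $K_\varepsilon \lambda_n \cdot \text{side}^d$ and places them uniformly. The truncation at `max_size`, the box shape, and all function names are assumptions made for illustration only.

```python
import random


def poisson(mean: float, rng: random.Random) -> int:
    """Poisson variate obtained by counting unit-rate exponential arrivals in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def sample_equilibrium(lam, K_eps, side, d=2, max_size=50, seed=0):
    """Sample clusters (position, size) in [0, side]^d with independent Poisson counts per size.

    Sizes above max_size are ignored, which is harmless only when the
    weights lam(n) are negligible there (an assumption of this sketch).
    """
    rng = random.Random(seed)
    volume = side ** d
    config = []
    for n in range(1, max_size + 1):
        for _ in range(poisson(K_eps * lam(n) * volume, rng)):
            position = tuple(rng.uniform(0.0, side) for _ in range(d))
            config.append((position, n))
    return config


if __name__ == "__main__":
    clusters = sample_equilibrium(lambda n: 0.5 ** n, K_eps=200.0, side=1.0)
    print(len(clusters), "clusters; expected about", 200.0 * (1.0 - 0.5 ** 50))
```

The printed count should fluctuate around $|\Lambda| K_\varepsilon \sum_n \lambda_n$, in line with the formula for $\int M_\Lambda\,d\mu_\lambda$.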

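Hence, if we assume that $\sum_n \lambda_n < \infty$, then there are finitely many clusters in a bounded domain almost surely with respect to $\mu_\lambda$. We now assert that $\mu_\lambda$ is indeed reversible. To explain this, let us take two bounded local $C^2$ functions $F, G : \Omega \to \mathbb{R}$. By a local function $F$ we mean that there exists a positive constant $c_0$ such that $F$ depends only on particles $(x_i, m_i)$ such that $|x_i|, m_i \le c_0$. We then have
\[ \int G\, A F\,d\mu_\lambda = \int F\, A G\,d\mu_\lambda. \tag{3.4} \]
Indeed,
\[ \int G\, A_0 F\,d\mu_\lambda = - \int \sum_i d(m_i) \nabla_{x_i} F \cdot \nabla_{x_i} G\,d\mu_\lambda, \tag{3.5} \]
\[ \int G\, A_c^+ F\,d\mu_\lambda = \int F\, A_f^- G\,d\mu_\lambda, \qquad \int G\, A_c^- F\,d\mu_\lambda = \int F\, A_f^+ G\,d\mu_\lambda. \tag{3.6} \]
Note that (3.6) is the microscopic analog of (3.3), and together with (3.5) implies (3.4). The proof of (3.5) follows from an integration by parts. As for (3.6), observe that for any bounded set $\Lambda$,
\[ \mu_\lambda^{\Lambda}(d\omega_\Lambda) = \sum_{L_1, L_2, \dots} \prod_{n=1}^{\infty} e^{-\lambda_n K_\varepsilon |\Lambda|} \frac{(\lambda_n K_\varepsilon)^{L_n}}{L_n!} \prod_{i=1}^{L_n} \mathbf{1}(m_{ni} = n,\ x_{ni} \in \Lambda)\,dx_{ni}, \tag{3.7} \]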


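where $\omega_\Lambda$ is the configuration in the set $\Lambda$ and $\mu_\lambda^{\Lambda}$ is the law of $\omega_\Lambda$ under $\mu_\lambda$. Here we have labeled particles of size $n$ by $n1, n2, \dots, nL_n$, and $L_n = M_\Lambda^n$ is the number of such particles. Using the representation (3.7), one can readily verify (3.6). Let us write $P_\omega^{\varepsilon}$ and $E_\omega^{\varepsilon}$ for the probability and the expectation with respect to the process $\omega(\cdot)$ subject to the initial condition $\omega(0) = \omega$. When $\omega(0)$ is distributed according to an invariant measure $\mu_\lambda$, we write $P_\varepsilon^{eq}$ and $E_\varepsilon^{eq}$ instead. Given $\omega(\cdot)$, we define
\[ \xi_n^{\varepsilon}(t, J) = K_\varepsilon^{1/2} \Big( \frac{1}{K_\varepsilon} \sum_i J(x_i(t))\, \mathbf{1}(m_i(t) = n) - \lambda_n \int J(x)\,dx \Big) \tag{3.8} \]
for every smooth $J : \mathbb{R}^d \to \mathbb{R}$ of compact support. Let $\mathcal{D}$ denote the space of smooth functions of compact support and let $\mathcal{D}'$ denote the space of distributions (the dual of $\mathcal{D}$). We regard $\xi^{\varepsilon}$ as an element of the Skorohod space $\mathbf{D} = D([0, T], (\mathcal{D}')^{\mathbb{N}})$. The transformation $\omega(\cdot) \mapsto \xi^{\varepsilon}$ induces a probability measure $\mathcal{P}^{\varepsilon}$ on $\mathbf{D}$. We regard $\xi_n^{\varepsilon}(t, J)$ as the value of the distribution $\xi_n^{\varepsilon}(t)$ at $J$. To state our assumptions, take a nondecreasing function $a \ge 1$ such that $\alpha'(m, n) := \alpha(m, n)/(d(m) + d(n)) \le a(n) + a(m)$, and set $\beta'(n) = \sum_{m=1}^{n-1} \beta(n - m, m)$.

Hypothesis 3.1. The function $d(\cdot)$ is bounded. Moreover, for some $\theta > 1/2$,
\[ \lim_{\varepsilon \to 0} \tau(\varepsilon) := \lim_{\varepsilon \to 0} K_\varepsilon^{1/2} \sum_{n:\ 2\varepsilon r(n) > \delta(\varepsilon)} a(n) \lambda_n = 0, \tag{3.9} \]
where $\delta(\varepsilon) = |\log \varepsilon|^{-\theta}$, and
\[ \sum_n \big[ a(n) (r(n) + \beta'(n) \log n) + a(n)^2 (a(n) + \log n) \big] \lambda_n < \infty. \tag{3.10} \]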

Remark 3.1. Note that by detailed balance, we have that $\beta(n, m) = \alpha(m, n) \lambda_n \lambda_m / \lambda_{m+n}$. Hence, if $\alpha$ and $\lambda$ are known, then $\beta$ is determined. As an example, consider the case with $\lambda_n$ decaying like $e^{-cn}$ as $n \to \infty$. In this case, we can readily see that if $a(n)$ grows at most like a polynomial as $n$ gets large, then both (3.9) and (3.10) are satisfied.

Theorem 3.1. Assume Hypothesis 3.1 and that the dimension $d = 2$. Then the finite dimensional marginals of the sequence $\mathcal{P}^{\varepsilon}$ converge to the finite dimensional marginals of $\mathcal{P}$, where $\mathcal{P}$ is the distribution of a stationary Ornstein–Uhlenbeck Gaussian process with covariance
\[ \int \sum_{n=1}^{\infty} \xi_n(t, J_n)\, \xi_n(0, H_n)\, \mathcal{P}(d\xi) = \sum_{n=1}^{\infty} \int (T_t J_n)(x) H_n(x) \lambda_n\,dx. \tag{3.11} \]
Here $J_n, H_n \in \mathcal{D}$ for $n \in \mathbb{N}$ and $T_t$ is the semigroup generated by the linear Smoluchowski operator
\[ \begin{aligned} (\mathcal{L} J)_n = {} & d(n) \Delta_x J_n + \sum_{m=1}^{\infty} \hat\beta(m, n) J_{n+m} - \frac{1}{2} \sum_{m=1}^{n-1} \hat\beta(m, n - m) J_n \\ & + \sum_{m=1}^{n-1} \hat\alpha(m, n - m) J_{n-m} \lambda_m - \sum_{m=1}^{\infty} \hat\alpha(m, n) (J_n \lambda_m + J_m \lambda_n). \end{aligned} \tag{3.12} \]


Remark 3.2. Note that the macroscopic coagulation and fragmentation rates $\hat\alpha$ and $\hat\beta$ are strictly smaller than their microscopic counterparts $\alpha$ and $\beta$. We refer the reader to Sect. 4 for a heuristic explanation and for how a fundamental auxiliary function $u^{\varepsilon}$ allows us to switch from the microscopic rates $\alpha$ and $\beta$ to the macroscopic rates $\hat\alpha$ and $\hat\beta$. Note also that even though the “strengths” of the noises associated with the coagulation and fragmentation are given by $\alpha$ and $\beta$, the corresponding macroscopic “strengths” are given by $\hat\alpha$ and $\hat\beta$, as the expression (2.15) indicates. In fact this reduction in the strength happens in a very curious way:
– The auxiliary function $u^{\varepsilon}$ corrects the original noises coming from the coagulation and fragmentation by reducing their strengths to $\tilde\alpha = \alpha\eta^2$ and $\tilde\beta = \beta\eta^2$. (See formulas (8.33) and (8.37) and the definitions of $A_{c0}$ and $A_{f0}$ which are given right after (8.26) and (8.34).)
– The Brownian part of the dynamics uses the corrector $u^{\varepsilon}$ and produces some noise which enhances the reduced strengths $\tilde\alpha$ and $\tilde\beta$ to their final values $\hat\alpha$ and $\hat\beta$. (See formula (8.24), expression A02121111 which is defined right before (8.23), and the final step of the proof of (8.4).)

Remark 3.3. In fact what we can prove is somewhat stronger than what has appeared in the statement of Theorem 3.1. We will show that the process $\xi^{\varepsilon} = \xi' - \xi''$ with both $\xi'$ and $\xi''$ stationary processes in time, where the law of $\xi'$ under $P_\varepsilon^{eq}$ converges to $\mathcal{P}$, and
\[ \lim_{\varepsilon \to 0} E_\varepsilon^{eq} |\xi''(t, J)| = 0 \]

for every $t$, $n \in \mathbb{N}$, and test function $J$. We refer the reader to Sect. 9 for the details. An alternative description of the law $\mathcal{P}$ is the martingale formulation of Holley and Stroock [HS] that will be defined in Sect. 8. It is this formulation which we use for the proof of Theorem 3.1.

Remark 3.4. The intuition behind (3.11) is the standard dissipation-fluctuation principle. This principle is used to predict the form of the diffusion coefficient once the drift and the invariant measure for the fluctuation fields are known. In fact (3.11) is equivalent to saying that the process $\xi$ is a solution to the stochastic differential equation
\[ d\xi = \mathcal{L}\xi\,dt + B\,dW_t, \tag{3.13} \]
where $dW_t = (dW_{1,t}, \dots, dW_{n,t}, \dots)$ with $(dW_n : n \in \mathbb{N})$ independent space-time white noises, and the operator $B$ is determined by
\[ \begin{aligned} \sum_{n=1}^{\infty} \int (BJ)_n (BH)_n\,dx = {} & 2 \sum_{n=1}^{\infty} d(n) \lambda_n \int \nabla_x J_n \cdot \nabla_x H_n\,dx \\ & + \frac{1}{2} \sum_{m,n=1}^{\infty} \hat\alpha(m, n) \lambda_n \lambda_m \int (J_{n+m} - J_n - J_m)(H_{n+m} - H_n - H_m)\,dx \\ & + \frac{1}{2} \sum_{m,n=1}^{\infty} \hat\beta(m, n) \lambda_{n+m} \int (J_{n+m} - J_n - J_m)(H_{n+m} - H_n - H_m)\,dx. \end{aligned} \]


Indeed if we start with the ansatz that $\xi$ satisfies an Ornstein–Uhlenbeck equation of the form (3.13), then we have an obvious guess for the linear drift $\mathcal{L}\xi$, namely the linearization of the right-hand side of the macroscopic equation (2.6). We also have a candidate for its invariant measure, namely the measure $\mathcal{P}^0$ given by (3.11) at $t = 0$:
\[ \int \sum_{n=1}^{\infty} \xi_n^0(J_n)\, \xi_n^0(H_n)\, \mathcal{P}^0(d\xi^0) = \sum_{n=1}^{\infty} \int J_n(x) H_n(x) \lambda_n\,dx. \]

We then select the diffusion operator $B$ to be compatible with what we have for the drift and the invariant measure of the process $\xi$.

Remark 3.5. As our final remark, we comment that it is not obvious that our Markov process $\omega(\cdot)$ exists because we are dealing with infinitely many interacting diffusions. However, since we are only interested in the process $\omega(\cdot)$ at equilibrium, its existence can be shown by rather standard arguments which we now sketch.
(i) Observe that if the initial macroscopic densities $(f_n^0 : n \in \mathbb{N})$ satisfy $\sum_n \int f_n^0\,dx < \infty$, then we can construct our process by starting from $N$ independent particles $(x_1, m_1), \dots, (x_N, m_N)$ satisfying (2.5), where $N$ and $\varepsilon$ are related by the equation $N = K_\varepsilon \sum_n \int f_n^0\,dx$. In other words, if the total density is finite macroscopically, then initially we are dealing with finitely many particles almost surely and the existence of the process $\omega(\cdot)$ is obvious. However, since at equilibrium $f_n^0 \equiv \lambda_n$ is not integrable, we need to consider a Poisson point process with infinitely many particles.
(ii) We now argue that we can construct our process if we make two assumptions:
\[ \sum_n \int_{|x| \le r} n f_n^0(x)\,dx < \infty, \tag{3.14} \]
\[ \alpha(m, n) = \beta(m, n) = 0 \quad \text{if } m + n > \ell, \tag{3.15} \]
for every $r > 0$ and some $\ell > 0$. In other words, we assume that locally the total mass is finite macroscopically but now we assume that no interaction occurs if particles are large. To construct $\omega(\cdot)$ in this case, we first replace $f^0$ with $f^0 \mathbf{1}(|x| \le k)$. Our process exists for such an initial macroscopic density by (i). The corresponding process is denoted by $\omega^k(\cdot)$. We now want to send $k$ to infinity and show that the sequence $(\omega^k : k \in \mathbb{N})$ is tight and that any of its limit points $\omega$ is a solution to the martingale problem associated with the generator $A$. That is, $F(\omega(t)) - \int_0^t AF(\omega(s))\,ds$ is a martingale for every $C^2$ local function $F$. This can be readily achieved by establishing a control on the total number of particles in a ball $\{x : |x| \le r\}$. Here is a way of establishing such a control uniformly in $k$: pick a positive smooth function $J$ which equals $\exp(-|x|)$ for large $x$, and set $H(x) = -\int_{|y| \le 1} \log|y|\, J(x - y)\,dy$. We can readily show that $H > 0$ and that $\Delta H \le c_0 H$ for a constant $c_0$. Then use the martingale $M(t) = F(\omega^k(t)) - \int_0^t AF(\omega^k(s))\,ds$ for $F(\omega) = \sum_i H(x_i) m_i$ to show
\[ \sup_k E \sup_{t \in [0, T]} F(\omega^k(t))^2 < \infty, \tag{3.16} \]
for every $T$. This can be used to establish the tightness of $\omega^k$ and the existence of our process provided that (3.14) and (3.15) are true.


(iii) It remains to relax the restriction (3.15). We now would like to take advantage of the fact that we only need to consider $f_n^0 \equiv \lambda_n$. More precisely, by (ii), we know that $P^{eq}$ exists if we assume that (3.15) is true. Let us write $\omega^{\ell}$ for our process when $\alpha$ and $\beta$ are replaced with $\alpha^{\ell}(m, n) = \alpha(m, n) \mathbf{1}(m + n \le \ell)$, $\beta^{\ell}(m, n) = \beta(m, n) \mathbf{1}(m + n \le \ell)$. Again, we need to show the tightness of $\omega^{\ell}$ and pass to the limit in the martingale formulation of our process. For this, we need to show something like (3.16) for the sequence $\omega^{\ell}$. This can be readily achieved by bounding various terms that appear in the martingale $M(\cdot)$, using the fact that the process $\omega^{\ell}$ is stationary in time.

4. A Sketch of the Proof

We aim to show that the expression
\[ X_\varepsilon(\omega(t)) = K_\varepsilon^{-1/2} \sum_i J(x_i(t), m_i(t)), \tag{4.1} \]

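with $J(x, n) = J_n(x)$ satisfying $\int J_n\,dx = 0$, is close to $\sum_n \xi_n(t, J_n)$, with the distributions $(\xi_n : n \in \mathbb{N})$ solving (3.13) in the weak sense. To derive (3.13), we use the Markov property of the process $\omega(t)$ to write
\[ \begin{aligned} X_\varepsilon(\omega(t)) &= X_\varepsilon(\omega(0)) + \int_0^t A_0 X_\varepsilon(\omega(s))\,ds + \int_0^t A_c X_\varepsilon(\omega(s))\,ds + \int_0^t A_f X_\varepsilon(\omega(s))\,ds + M_\varepsilon(t) \\ &=: Y_\varepsilon^1 + Y_\varepsilon^2(t) + Y_\varepsilon^3(t) + Y_\varepsilon^4(t) + M_\varepsilon(t), \end{aligned} \tag{4.2} \]
with $M_\varepsilon$ a martingale for which
\[ N_\varepsilon(t) = M_\varepsilon(t)^2 - \int_0^t (A X_\varepsilon^2 - 2 X_\varepsilon A X_\varepsilon)(\omega(s))\,ds \tag{4.3} \]
is a martingale. The identity (4.2) should be compared to what we have as the weak form of (3.13), namely
\[ \begin{aligned} \sum_n \xi_n(J_n, t) = {} & \sum_n \xi_n(J_n, 0) + \int_0^t \sum_n d(n)\, \xi_n(\Delta J_n, s)\,ds \\ & + \int_0^t \sum_{m,n} \hat\alpha(m, n) \lambda_m\, \xi_n(J_{n+m} - J_m - J_n, s)\,ds \\ & - \frac{1}{2} \int_0^t \sum_{m,n} \hat\beta(m, n)\, \xi_{n+m}(J_{n+m} - J_m - J_n, s)\,ds + M(t) \\ =: {} & Y^1 + Y^2(t) + Y^3(t) + Y^4(t) + M(t), \end{aligned} \tag{4.4} \]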


where the process $M(t)$ is a martingale for which
\[ N(t) = M(t)^2 - 2t \sum_n \int d(n) \lambda_n |\nabla J_n(x)|^2\,dx - \frac{t}{2} \sum_{m,n} \int \hat\alpha(m, n) \lambda_m \lambda_n (J_{n+m} - J_n - J_m)^2(x)\,dx - \frac{t}{2} \sum_{m,n} \int \hat\beta(m, n) \lambda_{n+m} (J_{n+m} - J_n - J_m)^2(x)\,dx \]
is a martingale. To establish Theorem 3.1, we may try to show
\[ Y_\varepsilon^j \to Y^j, \qquad M_\varepsilon \to M, \]
for $j = 1, \dots, 4$. It turns out that this is not what is going on! Firstly, it is rather straightforward to show that $Y_\varepsilon^1 \to Y^1$ by the classical central limit theorem, with $Y^1$ a Gaussian random variable with variance $\sum_n \lambda_n \int J_n^2\,dx$. Also, virtually by definition, we have that if $\xi^{\varepsilon}$ converges to $\xi$, then $Y_\varepsilon^2 \to \sum_n d(n) \int_0^t \xi_n(\Delta J_n, s)\,ds$. This stems from the fact that $Y_\varepsilon^2$ corresponds to the “non-interacting” part of the evolution, namely the Laplacian operator $\Delta$. However we need to split the “interacting” part of the microscopic evolution into 3 distinct parts of completely different natures. Indeed, we have a decomposition
\[ Y_\varepsilon^3 = Y_\varepsilon^{3,1} + Y_\varepsilon^{3,2} + Y_\varepsilon^{3,3}, \tag{4.5} \]

where $Y_\varepsilon^{3,1} \to Y^3$ as $\varepsilon \to 0$, the term $Y_\varepsilon^{3,2}$ contributes to the fragmentation term so that $Y_\varepsilon^{3,2} + Y_\varepsilon^4 \to Y^4$, and $Y_\varepsilon^{3,3}$ contributes to the martingale part. That is, $Y_\varepsilon^{3,3} + M_\varepsilon \to M$. It is as if a part of the microscopic “drift” becomes some type of “white noise” as $\varepsilon \to 0$. Perhaps this is the most surprising aspect of the present work and is in complete contrast with some earlier works on the equilibrium and non-equilibrium fluctuations on models with diffusive scaling [CY,C] and a stochastic model with kinetic scaling [R1]. This ramification of the diffusion coefficient by the “drift” is reminiscent of a similar phenomenon for the tagged particles in the exclusion processes (see Kipnis–Varadhan [KV]). In our setting however, the ramification of the noise happens in a rather curious way as we explained in Remark 3.2. To explain the decomposition (4.5), and sketch our method of proof further, we need to recall how Smoluchowski’s equation has been derived from our microscopic model in the articles [HR1,HR2,R2 and HRY]. For this derivation, we need to understand how the microscopic coagulation (respectively fragmentation) rate $\alpha(m, n)$ (respectively $\beta(m, n)$) leads to the macroscopic coagulation rate $\hat\alpha(m, n)$ (respectively fragmentation rate $\hat\beta(m, n)$). For the derivation of (2.6), we start from the expression
\[ \hat X_\varepsilon(\omega(t)) = K_\varepsilon^{-1/2} X_\varepsilon(\omega(t)) = K_\varepsilon^{-1} \sum_i J(x_i(t), m_i(t)), \]
and study the corresponding version of (4.2), which we obtain by multiplying both sides of (4.2) by $K_\varepsilon^{-1/2}$. Since $K_\varepsilon^{-1/2} M_\varepsilon \to 0$, we only need to concentrate on $K_\varepsilon^{-1/2} Y_\varepsilon^3$ and $K_\varepsilon^{-1/2} Y_\varepsilon^4$. The term $K_\varepsilon^{-1/2} Y_\varepsilon^4$ is in some sense linear and all challenges come from $K_\varepsilon^{-1/2} Y_\varepsilon^3$. It turns out that there is a splitting $K_\varepsilon^{-1/2} Y_\varepsilon^3 = Z_\varepsilon^1 + Z_\varepsilon^2$ with


$Z_\varepsilon^1$ converging to $\int_0^t \sum_n \int J_n(x) Q_n^c(\mathbf{f})(x, s)\,dx\,ds$ and $Z_\varepsilon^2 + K_\varepsilon^{-1/2} Y_\varepsilon^4$ converging to $\int_0^t \sum_n \int J_n(x) Q_n^f(\mathbf{f})(x, s)\,dx\,ds$. This splitting is not hard to justify; when a fragmentation occurs, a pair of particles is produced which are within a distance of order $O(\varepsilon)$ and prone to coagulate. Of course such a coagulation undoes the fragmentation that has just occurred. Indeed $Z_\varepsilon^2$ is negative, which results in a macroscopic fragmentation rate $\hat\beta$ strictly less than $\beta$. To describe the decomposition (4.5), let us observe
\[ \begin{aligned} Y_\varepsilon^3 &= \frac{1}{2} K_\varepsilon^{-1/2} \sum_{i,j} \alpha(m_i, m_j) V_\varepsilon(x_i - x_j; m_i, m_j)\, \tilde J(x_i, m_i, x_j, m_j) \\ &= \frac{1}{2} K_\varepsilon^{-3/2} \sum_{i,j} \alpha(m_i, m_j) V^{\varepsilon}(x_i - x_j; m_i, m_j)\, \tilde J(x_i, m_i, x_j, m_j), \end{aligned} \tag{4.6} \]
where $V^{\varepsilon} = K_\varepsilon V_\varepsilon$ and $\tilde J(x_i, m_i, x_j, m_j)$ is given by
\[ \frac{m_i}{m_i + m_j} J(x_i, m_i + m_j) + \frac{m_j}{m_i + m_j} J(x_j, m_i + m_j) - J(x_i, m_i) - J(x_j, m_j). \tag{4.7} \]
Our goal would be a decomposition of the form
\[ \int_0^t Y_\varepsilon^3(s)\,ds = \int_0^t B_\varepsilon^z(\omega(s))\,ds + \int_0^t C_\varepsilon(\omega(s))\,ds + D_\varepsilon(t) + \mathit{Error}, \tag{4.8} \]
where
\[ B_\varepsilon^z(\omega) = \frac{1}{2} K_\varepsilon^{-3/2} \sum_{i,j} \alpha(m_i, m_j) W^{\varepsilon}(x_i - x_j + z; m_i, m_j)\, \tilde J(x_i, m_i, x_j, m_j), \tag{4.9} \]
for a suitable function $W^{\varepsilon}$ which will be defined shortly, and $\mathit{Error}$ represents a term that will go to zero as $\varepsilon \to 0$ and $|z| \to 0$. The form of $W^{\varepsilon}$ would allow us to replace $\alpha$ with its macroscopic counterpart $\hat\alpha$. The term $C_\varepsilon$ is given by
\[ K_\varepsilon^{-3/2} \sum_i \sum_{m=1}^{m_i - 1} \beta(m, m_i - m) \int V^{\varepsilon}(x_i - y; m, m_i - m)\, u^{\varepsilon}(x_i - y; m, m_i - m)\, \tilde J(x_i, m, y, m_i - m)\,dy, \]
and the term $D_\varepsilon(t)$ is a martingale. It is the decomposition (4.8) that leads to the decomposition (4.5). To achieve the decomposition (4.8), fix $z$ and start from the expression
\[ G_\varepsilon(\omega) = K_\varepsilon^{-3/2} \sum_{i,j} \hat u^{\varepsilon}(x_i - x_j; m_i, m_j)\, \tilde J(x_i, m_i, x_j, m_j), \tag{4.10} \]
where $\hat u^{\varepsilon}(a; m, n) = u^{\varepsilon}(a + z; m, n) - u^{\varepsilon}(a; m, n)$, with $u^{\varepsilon}(a; m, n)$ satisfying the equation
\[ (d(m) + d(n))\, \Delta u^{\varepsilon}(x; m, n) = \alpha(m, n) \big[ V_\varepsilon(x; m, n) u^{\varepsilon}(x; m, n) + V^{\varepsilon}(x; m, n) \big]. \tag{4.11} \]


(The functions $V^{\varepsilon}$ and $V_\varepsilon$ were defined by (2.4) and right before (2.2) respectively.) We then apply the martingale decomposition as in (4.2) to assert
\[ G_\varepsilon(\omega(t)) = G_\varepsilon(\omega(0)) + \int_0^t A G_\varepsilon(\omega(s))\,ds + E_\varepsilon(t), \tag{4.12} \]
with $E_\varepsilon(t)$ a martingale. This involves various terms as we apply the operators $A_0$, $A_c$ and $A_f$ to $G_\varepsilon$. As it turns out, the first term $G_\varepsilon(\omega(0))$ and many other terms on the right-hand side of (4.12) are small if $|z|$ is sufficiently small. However, the choice of $u^{\varepsilon}$ results in a component in $(A_0 + A_c) G_\varepsilon$, which is exactly our $2(B_\varepsilon^z - Y_\varepsilon^3)$ in (4.9), and a component in $A_f G_\varepsilon$ which is exactly $C_\varepsilon$. The function $W^{\varepsilon}$ in (4.9) is given by
\[ W^{\varepsilon}(a; m, n) = V^{\varepsilon}(a; m, n) \big( 1 + K_\varepsilon^{-1} u^{\varepsilon}(a; m, n) \big). \tag{4.13} \]

Of course we need to show that all other components in $(A_0 + A_c) G_\varepsilon$ and $A_f G_\varepsilon$ are small if $\varepsilon$ and $|z|$ are small. This can be achieved by rather straightforward reasoning if we require
\[ K_\varepsilon^{1/2} |z|\, |\log|z|| \to 0, \qquad K_\varepsilon^{-1/2} |\log|z|| \to 0. \tag{4.14} \]
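In the planar case $K_\varepsilon = |\log \varepsilon|$, and the choice $|z| = |\log \varepsilon|^{-\theta}$ used in the text turns the two quantities in (4.14) into $\theta\, L^{1/2 - \theta} \log L$ and $\theta\, L^{-1/2} \log L$ with $L = |\log \varepsilon|$. The Python sketch below tabulates both. It is parametrized by $L$ rather than by $\varepsilon$ (an implementation choice, to avoid floating-point underflow), and the function name `cutoff_conditions` is hypothetical.

```python
import math


def cutoff_conditions(L: float, theta: float):
    """Both quantities of (4.14) for d = 2, with K_eps = L = |log eps| and |z| = L**(-theta)."""
    if L <= 1.0:
        raise ValueError("L = |log eps| must exceed 1")
    z = L ** (-theta)
    log_z = abs(math.log(z))           # equals theta * log L
    first = math.sqrt(L) * z * log_z   # K^{1/2} |z| |log|z||
    second = log_z / math.sqrt(L)      # K^{-1/2} |log|z||
    return first, second


if __name__ == "__main__":
    for theta in (0.5, 0.75, 1.0):
        for L in (1e2, 1e4, 1e8):
            print(theta, L, cutoff_conditions(L, theta))
```

For $\theta = 1/2$ the first quantity equals $\tfrac{1}{2} \log L$ and grows without bound, whereas for any $\theta > 1/2$ both quantities eventually decay, although only at a rate that is polynomial in $L$, hence logarithmic in $\varepsilon$.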

(In higher dimensions, the second condition is replaced with $K_\varepsilon^{-1/2} |z|^{2-d} \to 0$, which is inconsistent with the first condition if $d \ge 3$.) These two conditions are satisfied if $|z| = |\log \varepsilon|^{-\theta}$ for some $\theta > 1/2$. At this stage, we simply use the smallness of $\hat u^{\varepsilon}$ for $z$ satisfying $\varepsilon \ll |z| \ll 1$. In other words, we do not take advantage of the fact that $J$ is of 0 average and do not apply any central limit-type arguments. (For higher dimensions, this line of reasoning is not applicable and we really need to establish a central limit-type theorem to show that the error term in (4.8) is small.) Of course we may try to square the error term and take advantage of the fact that $J$ is of 0 average and that particles are independent at equilibrium. This turns out to be rather technical and challenging and will be dealt with in a future publication. So far we have learned that the expression $Y_\varepsilon^3$ can be replaced with the right-hand side of (4.8). Once this is achieved, we take a smooth function $\zeta$ of compact support, set $\zeta^{\delta}(a) = \delta^{-d} \zeta(a/\delta)$, and define
\[ f_n^{\delta}(x, t) = K_\varepsilon^{-1} \sum_i \zeta^{\delta}(x_i(t) - x)\, \mathbf{1}(m_i(t) = n). \]
We think of this as an approximation of the density of particles of cluster size $n$. We choose $\delta = \delta(\varepsilon) = |\log \varepsilon|^{-\theta}$ with $\theta > 1/2$. The outcome $\tilde f_n^{\varepsilon}(x, t) = f_n^{\delta(\varepsilon)}(x, t)$ converges weakly to $\lambda_n$ as $\varepsilon \to 0$. So far we have not carried out any CLT. We know that if $|z_2 - z_1| \le \delta(\varepsilon)$, then
\[ \int_0^t Y_\varepsilon^3(s)\,ds = \int_0^t B_\varepsilon^{z_2 - z_1}(\omega(s))\,ds + \int_0^t C_\varepsilon(\omega(s))\,ds + D_\varepsilon(t) + \mathit{Error}_1(\varepsilon) \tag{4.15} \]
with $\lim_{\varepsilon \to 0} \mathit{Error}_1(\varepsilon) = 0$ and $D_\varepsilon(t)$ a martingale. We multiply both sides of (4.15) by $\zeta^{\delta}(z_1) \zeta^{\delta}(z_2)$ and integrate with respect to $z_1$ and $z_2$. After a change of variables $z_1 \to z_1 - x_i$, $z_2 \to z_2 - x_j$, we obtain
\[ \frac{1}{2} K_\varepsilon^{-3/2} \int_0^t \sum_{i,j} \iint \alpha(m_i, m_j) W^{\varepsilon}(z_1 - z_2; m_i, m_j)\, \tilde J(x_i, m_i, x_j, m_j)\, \zeta^{\delta(\varepsilon)}(x_i - z_1)\, \zeta^{\delta(\varepsilon)}(x_j - z_2)\,dz_1\,dz_2\,ds, \]

784

M. Ranjbar, F. Rezakhanlou

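for the first term of the right-hand side of (4.15). Since $J$ is smooth and $\zeta$ is of compact support, we may replace $\tilde J(x_i, m_i, x_j, m_j)$ with $\tilde J(z_1, m_i, z_2, m_j)$ for an error of order $O(\delta(\varepsilon))$. We then carry out the summation over $i$ and $j$ to obtain

$$\frac12\sum_{m,n}\alpha(m,n)\,K_\varepsilon^{1/2}\int_0^t\iint f_n^{\delta(\varepsilon)}(z_1,s)\,f_m^{\delta(\varepsilon)}(z_2,s)\,W^\varepsilon(z_1 - z_2; m, n)\,\tilde J(z_1, m, z_2, n)\,dz_1\,dz_2\,ds.$$

Since $J$ is of zero average, the integrand

$$\Omega := K_\varepsilon^{1/2}\big(f_n^{\delta(\varepsilon)}(z_1,s)\,f_m^{\delta(\varepsilon)}(z_2,s) - \lambda_n\lambda_m\big)\,W^\varepsilon(z_1 - z_2; m, n)\,\tilde J(z_1, m, z_2, n)$$

can be written as

$$\Omega = \Omega_1 + \Omega_2 + \Omega_3, \tag{4.16}$$

where

$$\Omega_1 = K_\varepsilon^{1/2}\big(f_n^{\delta(\varepsilon)}(z_1,s) - \lambda_n\big)\lambda_m\,W^\varepsilon(z_1 - z_2; m, n)\,\tilde J(z_1, m, z_2, n),$$

$$\Omega_2 = K_\varepsilon^{1/2}\big(f_m^{\delta(\varepsilon)}(z_2,s) - \lambda_m\big)\lambda_n\,W^\varepsilon(z_1 - z_2; m, n)\,\tilde J(z_1, m, z_2, n),$$

$$\Omega_3 = K_\varepsilon^{1/2}\big[\big(f_n^{\delta(\varepsilon)}(z_1,s) - \lambda_n\big)\big(f_m^{\delta(\varepsilon)}(z_2,s) - \lambda_m\big)\big]\,W^\varepsilon(z_1 - z_2; m, n)\,\tilde J(z_1, m, z_2, n).$$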

To achieve our goals, we wish to show that the term $\Omega_3$ of (4.16) is small in average. Formally, if $\Omega_1$ is a bounded quantity, then $\Omega_3$ is smaller than $\Omega_1$ because of the additional small term $f_m^{\delta(\varepsilon)} - \lambda_m$. This turns out to be wrong; the term $f_m^\delta - \lambda_m$ is small only in a weak sense and its product with $f_n^\delta - \lambda_n$ is no longer small. This is not surprising at all because $\delta = \delta(\varepsilon)$ is not sufficiently large for a central limit theorem to take place. Indeed the support of $\zeta^\delta$ is a set of volume $O(\delta^d)$ and as a result, the particle density $f_n^\delta$ involves $O(K_\varepsilon\delta^d)$ many particles in average. For a CLT to take place, we need a density which deals with a large number of particles. In other words, we expect $\Omega_3$ to be small only when $K_\varepsilon\delta^d \to \infty$ as $\varepsilon \to 0$. This would not be the case if $\delta = |\log\varepsilon|^{-\theta}$ for a $\theta > 1/2$. In order to figure out a successful way of going beyond $|\log\varepsilon|^{-\theta}$ and reaching a density $f^\delta$ with $\delta$ satisfying $K_\varepsilon\delta^d \to \infty$, we need to review what has been achieved so far and what to learn from it. Basically our goal is a central limit theorem (CLT) for the particle density (4.1) and for this we need to perform some type of CLT for the time average of (4.6). Note that $Y_\varepsilon^3$ is in some sense singular because the function $V^\varepsilon$ is a delta-type expression. That is, in a region of volume $O(\varepsilon^d)$, $V^\varepsilon$ is of order $O(\varepsilon^{-d})$. In fact if we calculate $\mathbb{E}^{eq}_\varepsilon Y_\varepsilon^3(\omega)^2$, we get an expression that blows up as $\varepsilon \to 0$. All this ultimately stems from the fact that the coagulation occurs when particles are microscopically close. We wish to replace $V^\varepsilon$ with a smoother kernel and this is exactly the purpose (4.8) serves. We try to replace $x_i - x_j$, the argument of $V^\varepsilon$, with $x_i - x_j + z$. That is, we try to figure out the coagulation rate when particles $x_i$ and $x_j$ are not microscopically close but only macroscopically close, i.e., $x_i - x_j = z + O(\varepsilon)$ with $|z| \to 0$ after sending $\varepsilon \to 0$. (For example $|z|$ could be as “large” as $|\log\varepsilon|^{-\theta}$.)
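The particle-count heuristic above can be made explicit in the planar case. The following is only a sketch under two assumptions that are not stated in this passage: $d = 2$, and $K_\varepsilon = |\log\varepsilon|$ (the normalization suggested by the factor $|\log\varepsilon|^{-1}$ in the equation for $u^\varepsilon$ recalled in Sect. 6).

```latex
% Assumptions (labelled, not taken from this passage): d = 2, K_\varepsilon = |\log\varepsilon|.
% Expected number of particles seen by the mollifier \zeta^\delta with \delta = |\log\varepsilon|^{-\theta}:
\[
K_\varepsilon\,\delta(\varepsilon)^{2}
  = |\log\varepsilon|\cdot|\log\varepsilon|^{-2\theta}
  = |\log\varepsilon|^{\,1-2\theta}
  \;\longrightarrow\;
  \begin{cases}
    0,      & \theta > 1/2,\\
    1,      & \theta = 1/2,\\
    \infty, & \theta < 1/2,
  \end{cases}
  \qquad \varepsilon \to 0 .
\]
% For a doubly logarithmic scale \delta' = |\log|\log\varepsilon||^{-\theta'} with \theta' > 0:
\[
K_\varepsilon\,\delta'(\varepsilon)^{2}
  = \frac{|\log\varepsilon|}{\bigl(\log|\log\varepsilon|\bigr)^{2\theta'}}
  \;\longrightarrow\; \infty .
\]
```

Under these assumptions the scale $|\log\varepsilon|^{-\theta}$ with $\theta > 1/2$ sees a vanishing number of particles per mollifier window, whereas a doubly logarithmic scale sees a diverging number, which is the regime in which a CLT can be expected.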
However there is a price to pay for such a replacement; we need to replace $V^\varepsilon$ with $W^\varepsilon$ and modify the fragmentation term (we are referring to the term $C_\varepsilon$), and even the noise is modified (the term $D_\varepsilon$). To carry this out, we encountered various additional terms which are presumably small. We have a relatively easy ride if $|z| \ll |\log\varepsilon|^{-1/2}$. Even though we have not reached our ultimate goal $|z| \gg |\log\varepsilon|^{-1/2}$, we have already achieved three important tasks:

Equilibrium Fluctuations of Coagulating-Fragmenting Planar Brownian Particles


(i) The correctors $C_\varepsilon$ and $D_\varepsilon$ would modify the fragmentation and martingale terms as required in the proof of the main result Theorem 3.1.
(ii) The term $W^\varepsilon$ would allow us to replace $\alpha$ with $\hat\alpha$ because $\lim\int W^\varepsilon = \eta$ as $\varepsilon \to 0$.
(iii) We have been able to replace the singular term $V^\varepsilon(a; m, n)$ with a less singular term $\bar W^\varepsilon(a; m, n) = \int W^\varepsilon(a + z; m, n)\,\zeta^{\delta(\varepsilon)}(z)\,dz$, where $\delta(\varepsilon) = |\log\varepsilon|^{-\theta}$ for some $\theta > 1/2$.

We are now in a position to explain the central role of Eq. (4.11). Because of the time average in $\int_0^t Y_\varepsilon^3(s)\,ds$, we are dealing with an expression which is almost as smooth as $\mathcal{A}^{-1}Y_\varepsilon^3$. Of course $\mathcal{A}^{-1}$ is too complicated to use. The message behind Eq. (4.11) and its use is that we only need to consider 2-particle dynamics. Namely, $x_i - x_j$ is a diffusion with generator $(d(m_i) + d(m_j))\Delta$, and once a coagulation occurs with rate $\alpha(m_i, m_j)$ between the $i$-th and $j$-th particles, $(x_i, x_j)$ as a pair no longer exists. Hence the dynamics of $x_i - x_j$ has the infinitesimal generator of a killed Brownian motion:

$$\mathcal{L}_\varepsilon = (d(m) + d(n))\Delta - \alpha(m,n)V_\varepsilon(\cdot\,; m, n),$$

with $m = m_i$ and $n = m_j$. Now the function $u^\varepsilon = \mathcal{L}_\varepsilon^{-1}V^\varepsilon$ is smoother than $V^\varepsilon$ and this allows us to perturb its argument by a small vector $z$ and obtain (4.8). By (iii), we are now dealing with $\bar W^\varepsilon$ in place of $V^\varepsilon$. We note that $\bar W^\varepsilon = O(\delta(\varepsilon)^{-d})$, and has a support of diameter $O(\delta(\varepsilon))$. To replace $\bar W^\varepsilon$ with

$$\tilde W^\varepsilon(a; m, n) = \int W^\varepsilon(a + z; m, n)\,\zeta^{\delta'(\varepsilon)}(z)\,dz,$$

for some $\delta'(\varepsilon) \gg |\log\varepsilon|^{-1/2}$, we almost repeat the formula (4.12) where $V^\varepsilon$ is replaced with $\bar W^\varepsilon$, and $u^\varepsilon$ is replaced with $v^\varepsilon$ which now solves

$$(d(m) + d(n))\Delta v^\varepsilon(x; m, n) = \alpha(m,n)\bar W^\varepsilon(x; m, n). \tag{4.17}$$

This time we can show that various terms that appear in $\mathcal{A}G^\varepsilon$ are small provided that $|z| \le \delta'(\varepsilon)$, where $\delta'$ can now be chosen as large as $|\log\log\varepsilon|^{-\theta'}$ for any $\theta' > 1/2$. For this step of the proof we show that all the error terms have small second moments; in other words, a CLT is taking place and the errors have small variances. (See Sect. 7.)
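The choice $\delta(\varepsilon) = |\log\varepsilon|^{-\theta}$ with $\theta > 1/2$ in task (iii) can be checked against the two requirements in (4.14). The computation below is a sketch that assumes $K_\varepsilon = |\log\varepsilon|$ (an assumption read off from the planar normalization, not a statement made in this paragraph) and writes $L = |\log\varepsilon|$ as a shorthand introduced here.

```latex
% Assumption: K_\varepsilon = L := |\log\varepsilon|, and |z| = L^{-\theta} with \theta > 0,
% so that |\log|z|| = \theta \log L.
\[
K_\varepsilon^{1/2}\,|z|\,\bigl|\log|z|\bigr|
   = \theta\,L^{\frac12-\theta}\,\log L \;\longrightarrow\; 0
   \quad\Longleftrightarrow\quad \theta > \tfrac12 ,
\]
\[
K_\varepsilon^{-1/2}\,\bigl|\log|z|\bigr|
   = \theta\,L^{-\frac12}\,\log L \;\longrightarrow\; 0
   \quad\text{for every } \theta > 0 .
\]
```

So under this assumption only the first condition restricts $\theta$, and it is the one that forces $|z| \ll |\log\varepsilon|^{-1/2}$ at this stage of the argument.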
As a consequence of the main result of Sect. 7, we have the decomposition (4.15) where $\delta(\varepsilon)$ is replaced with $\delta'(\varepsilon)$. We can now rigorously show that the term $\Omega_3$ of (4.16) is small by ignoring the time integration and showing that the integrand has a small second moment with respect to the equilibrium measure. As for $\Omega_1$, we first carry out the $dz_2$ integration and use the fact that $\lim\int W^\varepsilon(a; m, n)\,da = \eta(m,n)$, as $\varepsilon \to 0$. (This was proved as Theorem 3.2 in [HR2].) After some straightforward manipulations,

$$\int_0^t\iint \Omega_1\,dz_1\,dz_2\,ds = \int_0^t \xi_n(s, J)\,ds\;\eta(m,n)\lambda_m + \mathrm{Error}_2(\varepsilon).$$

As for $\Omega_2$, we first replace $J(z_1)$ with $J(z_2)$ for a small error because $|z_1 - z_2| = O(\varepsilon)$. We then integrate with respect to $z_1$ and repeat our reasoning for $\Omega_1$ to obtain

$$\int_0^t\iint \Omega_2\,dz_1\,dz_2\,ds = \int_0^t \xi_m(s, J)\,ds\;\eta(m,n)\lambda_n + \mathrm{Error}_3(\varepsilon).$$

In summary

$$\int_0^t Y_\varepsilon^3(s)\,ds = \frac12\sum_{m,n}\hat\alpha(m,n)\int_0^t\big(\xi_m(s,J)\lambda_n + \xi_n(s,J)\lambda_m\big)\,ds + D_\varepsilon(t) + \mathrm{Error}(\varepsilon) \tag{4.18}$$

for an error $\mathrm{Error}(\varepsilon)$ that goes to 0 as $\varepsilon \to 0$.


5. Regularity of the Coagulation Term, Part I

As we mentioned in Sect. 4, the main ingredient for the proof of Theorem 3.1 is the statement (4.18). In this section this statement is partially established and the full proof of (4.18) will be achieved in Sect. 7. We now prepare for the main result of this section, which will appear as Theorem 5.1 at the end of the section. The proof of Theorem 5.1 will be given in Sect. 6.

Note that the function $J$ in (4.1) is of compact support and satisfies $\int J(x,n)\,dx = 0$, for every $n$. In fact we only need to consider $J(x,n) = \bar J(x)\,\mathbb{1}(n = \bar m)$ with $\int\bar J(x)\,dx = 0$. Evidently for such a function $J$, we have $\int J(x,n)\,dx = 0$ for every $n$. Note that $\tilde J$ of (4.7) is not of compact support. However, for some positive $l$, we have that $\tilde J(x, m, y, n) = 0$, if either $m, n \ge l$ or $|x|, |y| \ge l$. Because of the $V_\varepsilon$ term in the definition of $Y_\varepsilon^3$, we may replace $\tilde J$ with

$$\hat J(x_i, m_i, x_j, m_j) = \tilde J(x_i, m_i, x_j, m_j)\,K(x_i - x_j),$$

for a smooth symmetric function $K(a)$ of compact support which is 1 whenever $|a| \le 1$. The advantage of $\hat J$ over $\tilde J$ is that the former is of compact support in the spatial variables. Note however that the term $V_\varepsilon$ only implies that $|x_i - x_j| \le c_0\varepsilon r(m_i, m_j)$ for a constant $c_0$. Hence such a replacement is valid only if $c_0\varepsilon r(m_i, m_j) \le 1$. This causes an error that can be readily handled with the aid of our hypothesis (3.9). (See the first step of the proof of Theorem 8.1 in Sect. 8.)

Recall that $u^\varepsilon(x; m, n)$ solves

$$(d(m) + d(n))\Delta u^\varepsilon(x; m, n) = \alpha(m,n)\big[V_\varepsilon(x; m, n)\,u^\varepsilon(x; m, n) + V^\varepsilon(x; m, n)\big],$$

where $V^\varepsilon(x; m, n) = \varepsilon^{-2}V(x/\varepsilon; m, n)$, and $V_\varepsilon(x; m, n) = K_\varepsilon^{-1}\varepsilon^{-2}V(x/\varepsilon; m, n)$. Given such a function $u^\varepsilon$, we define

$$G(\omega; z) = G(\omega) = K_\varepsilon^{-3/2}\sum_{i,j}\hat u^\varepsilon(x_i - x_j; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j), \tag{5.1}$$

where $\hat u^\varepsilon(a; m, n) = u^\varepsilon(a + z; m, n) - u^\varepsilon(a; m, n)$. We have

$$G(\omega(t)) = G(\omega(0)) + \int_0^t \mathcal{A}G(\omega(s))\,ds + M_t,$$

where $M_t$ is a martingale. We write

$$\mathcal{A}G = \mathcal{A}_0 G + \mathcal{A}_c G + \mathcal{A}_f G =: H_1 + H_2 + H_3. \tag{5.2}$$

We now study various terms which appeared on the right-hand side. We write $\hat J_x$ and $\hat J_y$ for the derivatives of $\hat J$ with respect to its first and second spatial arguments. We then write

$$H_1 = H_{11} + H_{12} + H_{13},$$

with

$$H_{11}(\omega) = K_\varepsilon^{-3/2}\sum_{i,j}\hat u^\varepsilon(x_i - x_j; m_i, m_j)\big[(d(m_i)\Delta_{x_i} + d(m_j)\Delta_{x_j})\hat J(x_i, m_i, x_j, m_j)\big],$$

$$H_{12}(\omega) = K_\varepsilon^{-3/2}\sum_{i,j}d(m_i)\,\hat u^\varepsilon_x(x_i - x_j; m_i, m_j)\cdot\hat J_x(x_i, m_i, x_j, m_j) - K_\varepsilon^{-3/2}\sum_{i,j}d(m_j)\,\hat u^\varepsilon_x(x_i - x_j; m_i, m_j)\cdot\hat J_y(x_i, m_i, x_j, m_j) =: H_{121}(\omega) - H_{122}(\omega),$$

$$H_{13}(\omega) = K_\varepsilon^{-3/2}\sum_{i,j}(d(m_i) + d(m_j))\,\Delta\hat u^\varepsilon(x_i - x_j; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j) =: H_{13}^z(\omega) - H_{131}^0(\omega) - H_{132}^0(\omega),$$

where

$$H_{13}^z(\omega) = K_\varepsilon^{-3/2}\sum_{i,j}\alpha(m_i, m_j)\,W^\varepsilon(x_i - x_j + z; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j)$$

with $W^\varepsilon(a; m, n) = u^\varepsilon(a; m, n)V_\varepsilon(a; m, n) + V^\varepsilon(a; m, n)$, and

$$H_{131}^0(\omega) = K_\varepsilon^{-3/2}\sum_{i,j}\alpha(m_i, m_j)\,u^\varepsilon(x_i - x_j; m_i, m_j)\,V_\varepsilon(x_i - x_j; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j),$$

$$H_{132}^0(\omega) = K_\varepsilon^{-3/2}\sum_{i,j}\alpha(m_i, m_j)\,V^\varepsilon(x_i - x_j; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j).$$

We also write

$$H_2 = H_{21} + H_{22}, \qquad H_{21} = H_{21}^z - H_{21}^0,$$

with $H_{21}^z(\omega)$ given by

$$-\frac12 K_\varepsilon^{-3/2}\sum_{i,j}\alpha(m_i, m_j)V_\varepsilon(x_i - x_j; m_i, m_j)\big[u^\varepsilon(x_i - x_j + z; m_i, m_j)\hat J(x_i, m_i, x_j, m_j) + u^\varepsilon(x_j - x_i + z; m_i, m_j)\hat J(x_j, m_j, x_i, m_i)\big]$$

$$= -K_\varepsilon^{-3/2}\sum_{i,j}\alpha(m_i, m_j)V_\varepsilon(x_i - x_j; m_i, m_j)\,u^\varepsilon(x_i - x_j + z; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j).$$

Moreover,

$$H_{22}(\omega) = \frac12 K_\varepsilon^{-3/2}\sum_{i,j}\alpha(m_i, m_j)V_\varepsilon(x_i - x_j; m_i, m_j)\sum_k\Big\{\frac{m_i}{m_i + m_j}\big[\hat u^\varepsilon(x_i - x_k; m_i + m_j, m_k)\hat J(x_i, m_i + m_j, x_k, m_k) + \hat u^\varepsilon(x_k - x_i; m_k, m_i + m_j)\hat J(x_k, m_k, x_i, m_i + m_j)\big]$$

$$+ \frac{m_j}{m_i + m_j}\big[\hat u^\varepsilon(x_j - x_k; m_i + m_j, m_k)\hat J(x_j, m_i + m_j, x_k, m_k) + \hat u^\varepsilon(x_k - x_j; m_k, m_i + m_j)\hat J(x_k, m_k, x_j, m_i + m_j)\big]$$

$$- \big[\hat u^\varepsilon(x_i - x_k; m_i, m_k)\hat J(x_i, m_i, x_k, m_k) + \hat u^\varepsilon(x_k - x_i; m_k, m_i)\hat J(x_k, m_k, x_i, m_i)\big]$$

$$- \big[\hat u^\varepsilon(x_j - x_k; m_j, m_k)\hat J(x_j, m_j, x_k, m_k) + \hat u^\varepsilon(x_k - x_j; m_k, m_j)\hat J(x_k, m_k, x_j, m_j)\big]\Big\}.$$

The expression $H_{22}$ arises from the changes in the function $G$ when a coagulation occurs due to the influence of the appearance and disappearance of particles on other particles that are not directly involved. The expression $H_{21}$ represents those terms in $G$ that are absent after a coagulation. Note that for our formula for $H_{12}$, we used the fact that $K$ is symmetric and, since $V$ is symmetric, the function $u^\varepsilon$ is also symmetric.

As for the fragmentation part of the dynamics, we have $H_3 = H_{31} + H_{32} + H_{33}$, where $H_{31} = H_{311} + H_{312}$, with

$$H_{311}(\omega) = \frac12 K_\varepsilon^{-3/2}\sum_{i,j}\sum_{m=1}^{m_i - 1}\int\beta(m, m_i - m)V^\varepsilon(x_i - y; m, m_i - m)\big[\hat u^\varepsilon(x_i - x_j; m, m_j)\hat J(x_i, m, x_j, m_j) - \hat u^\varepsilon(x_i - x_j; m_i, m_j)\hat J(x_i, m_i, x_j, m_j)\big]\,dy,$$

$$H_{312}(\omega) = \frac12 K_\varepsilon^{-3/2}\sum_{i,j}\sum_{m=1}^{m_j - 1}\int\beta(m, m_j - m)V^\varepsilon(x_j - y; m, m_j - m)\big[\hat u^\varepsilon(x_i - x_j; m_i, m)\hat J(x_i, m_i, x_j, m) - \hat u^\varepsilon(x_i - x_j; m_i, m_j)\hat J(x_i, m_i, x_j, m_j)\big]\,dy.$$

We carry out the $dy$ integration and use symmetry to obtain that $H_{31} = 2H_{311}$, where

$$H_{311}(\omega) = \frac12 K_\varepsilon^{-3/2}\sum_{i,j}\sum_{m=1}^{m_i - 1}\beta(m, m_i - m)\big[\hat u^\varepsilon(x_i - x_j; m, m_j)\hat J(x_i, m, x_j, m_j) - \hat u^\varepsilon(x_i - x_j; m_i, m_j)\hat J(x_i, m_i, x_j, m_j)\big].$$

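Also, $H_{32} = H_{321} + H_{322}$, with

$$H_{321}(\omega) = \frac12 K_\varepsilon^{-3/2}\sum_{i,j}\sum_{m=1}^{m_i - 1}\int\beta(m, m_i - m)V^\varepsilon(x_i - y; m, m_i - m)\,\hat u^\varepsilon(y - x_j; m, m_j)\,\hat J(y, m, x_j, m_j)\,dy,$$

$$H_{322}(\omega) = \frac12 K_\varepsilon^{-3/2}\sum_{i,j}\sum_{m=1}^{m_j - 1}\int\beta(m, m_j - m)V^\varepsilon(x_j - y; m, m_j - m)\,\hat u^\varepsilon(x_i - y; m_i, m)\,\hat J(x_i, m_i, y, m)\,dy,$$

and $H_{33} = H_{33}^z - H_{33}^0$, with

$$H_{33}^z(\omega) = K_\varepsilon^{-3/2}\sum_i\sum_{m=1}^{m_i - 1}\int\beta(m, m_i - m)V^\varepsilon(x_i - y; m, m_i - m)\,u^\varepsilon(x_i - y + z; m, m_i - m)\,\hat J(x_i, m, y, m_i - m)\,dy.$$

Note that $H_{131}^0 + H_{21}^0 = 0$. We may rewrite (5.2) as

$$\mathcal{A}G + [H_{132}^0 - H_{13}^z + H_{33}^0] = (H_{11} + H_{12}) + H_{21}^z + (H_{22} + H_{31} + H_{32}) + H_{33}^z. \tag{5.3}$$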
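For the reader's convenience, the bookkeeping that turns (5.2) into (5.3) can be written out. The following uses only the decompositions of $H_1$, $H_2$, $H_3$ introduced above and the cancellation $H_{131}^0 + H_{21}^0 = 0$.

```latex
\begin{align*}
\mathcal{A}G
 &= H_1 + H_2 + H_3 \\
 &= \bigl(H_{11} + H_{12} + H_{13}^{z} - H_{131}^{0} - H_{132}^{0}\bigr)
    + \bigl(H_{21}^{z} - H_{21}^{0} + H_{22}\bigr)
    + \bigl(H_{31} + H_{32} + H_{33}^{z} - H_{33}^{0}\bigr) \\
 &= (H_{11} + H_{12}) + H_{21}^{z} + (H_{22} + H_{31} + H_{32}) + H_{33}^{z}
    + \bigl[H_{13}^{z} - H_{132}^{0} - H_{33}^{0}\bigr]
    - \underbrace{\bigl(H_{131}^{0} + H_{21}^{0}\bigr)}_{=\,0} .
\end{align*}
% Moving the bracket to the left-hand side gives (5.3).
```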

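We are now ready to state the main result of this section.

Theorem 5.1. Let $\hat J$ be as above and assume that $\sqrt\varepsilon < |z| < 1$. Then

$$\mathbb{E}^{eq}_\varepsilon\left|\int_0^t \mathcal{A}G(\omega(s))\,ds + \int_0^t\big[H_{132}^0(\omega(s)) - H_{13}^z(\omega(s)) + H_{33}^0(\omega(s))\big]\,ds\right| \le C_0\,t\left(K_\varepsilon^{1/2}|z||\log|z|| + K_\varepsilon^{-1/2}|\log|z||\right). \tag{5.4}$$

We establish Theorem 5.1 by examining various terms that appeared on the right-hand side of (5.3). Indeed we show

$$\mathbb{E}^{eq}_\varepsilon|G(\omega(t))| = \mathbb{E}^{eq}_\varepsilon|G(\omega(0))| \le C_0 K_\varepsilon^{1/2}|z|, \tag{5.5}$$

$$\mathbb{E}^{eq}_\varepsilon|H_{11}(\omega(s))| \le C_0 K_\varepsilon^{1/2}|z|, \tag{5.6}$$

$$\mathbb{E}^{eq}_\varepsilon|H_{12}(\omega(s))| \le C_0 K_\varepsilon^{1/2}|z||\log|z||, \tag{5.7}$$

$$\mathbb{E}^{eq}_\varepsilon|H_{22}(\omega(s))| \le C_0 K_\varepsilon^{1/2}|z|, \tag{5.8}$$

$$\mathbb{E}^{eq}_\varepsilon|H_{31}(\omega(s))| \le C_0 K_\varepsilon^{1/2}|z|, \tag{5.9}$$

$$\mathbb{E}^{eq}_\varepsilon|H_{32}(\omega(s))| \le C_0 K_\varepsilon^{1/2}|z|, \tag{5.10}$$

$$\mathbb{E}^{eq}_\varepsilon\big|H_{21}^z(\omega(s))\big| \le C_0 K_\varepsilon^{-1/2}|\log|z||, \tag{5.11}$$

$$\mathbb{E}^{eq}_\varepsilon\big|H_{33}^z(\omega(s))\big| \le C_0 K_\varepsilon^{-1/2}|\log|z||. \tag{5.12}$$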

Theorem 5.1 is an immediate consequence of (5.6)–(5.12), which bound the terms on the right-hand side of (5.3). The bound (5.5) will be used for the proof of Theorem 3.1.

As we mentioned in Sect. 4, our method of proof can be used to establish a law of large numbers (LLN) for the expression $\int_0^t K_\varepsilon^{1/2}Y_\varepsilon^3(s)\,ds$ with $Y_\varepsilon^3$ as in (4.4). This can be achieved as in [HR2] by using the regularity of the coagulation term and this time $z$ can be chosen to be any small vector. Moreover for $\tilde J$, we may choose any smooth function of compact support. Note that since we are at equilibrium, the proof of the LLN is much easier than what we have in [HR2] because all the correlation bounds needed for the proof are trivially true. This would allow us to find the limit of $\int_0^t K_\varepsilon^{1/2}Y_\varepsilon^3(s)\,ds$ as $\varepsilon \to 0$. Since this limit is not random, the limit can be calculated by passing to the limit in $\mathbb{E}^{eq}_\varepsilon\int_0^t K_\varepsilon^{1/2}Y_\varepsilon^3(s)\,ds = t\,\mathbb{E}^{eq}_\varepsilon K_\varepsilon^{1/2}Y_\varepsilon^3(0)$. In summary,

Lemma 5.1. Let $K(x, m, y, n)$ be any smooth function of compact support. Then

$$\lim_{\varepsilon\to 0}\mathbb{E}^{eq}_\varepsilon\left|\int_0^t Z^\varepsilon(\omega(s))\,ds - t\bar Z\right| = 0, \tag{5.13}$$

where

$$Z^\varepsilon(\omega) = K_\varepsilon^{-2}\sum_{i,j}\alpha(m_i, m_j)V^\varepsilon(x_i - x_j; m_i, m_j)K(x_i, m_i, x_j, m_j), \qquad \bar Z = \sum_{m,n}\lambda_m\lambda_n\alpha(m,n)\int K(x, m, x, n)\,dx.$$

Lemma 5.1 will be needed in Sect. 8. In Sect. 8 we also need another LLN which can be established with a similar argument. This time our $Z^\varepsilon(\omega)$ is given by

$$K_\varepsilon^{-2}\sum_{i,j}d(m_i)\,K_\varepsilon^{-1}|\nabla u^\varepsilon(x_i - x_j; m_i, m_j)|^2\,\tilde J(x_i, m_i, x_i, m_j)^2\,\mathbb{1}(|x_i - x_j| \le 1). \tag{5.14}$$

As we will see in Lemma 6.1 of Sect. 6, the function

$$W^\varepsilon(a; m, n) = K_\varepsilon^{-1}|\nabla u^\varepsilon(a; m, n)|^2\,\mathbb{1}(|a| \le 1)$$

is almost as singular as $V^\varepsilon(a; m, n)$ because $W^\varepsilon(a; m, n) = O(\varepsilon^{-2}K_\varepsilon^{-1})$ when $|a| \le \varepsilon$. However $\int W^\varepsilon\,da$ stays bounded as $\varepsilon \to 0$. We will calculate $\gamma = \lim_{\varepsilon\to 0}\int W^\varepsilon\,da$ in Sect. 8 (see the final step of the proof of (8.4)). We have,

Lemma 5.2. Let $Z^\varepsilon$ be as in (5.14). Then (5.13) is true for

$$\bar Z = \sum_{m,n}\lambda_m\lambda_n\,d(m)\gamma(m,n)\int\tilde J(x, m, x, n)^2\,dx.$$

This lemma can be proved in a similar way. This time we start with a function $w^\varepsilon(x; m, n)$ that now solves

$$(d(m) + d(n))\Delta w^\varepsilon(x; m, n) = \alpha(m,n)V_\varepsilon(x; m, n)\,w^\varepsilon(x; m, n) + d(m)W^\varepsilon(x; m, n),$$

and define

$$G(\omega) = K_\varepsilon^{-2}\sum_{i,j}\hat w^\varepsilon(x_i - x_j; m_i, m_j)\,\tilde J(x_i, m_i, x_i, m_j)^2, \tag{5.15}$$

where $\hat w^\varepsilon(a; m, n) = w^\varepsilon(a + z; m, n) - w^\varepsilon(a; m, n)$. Again, using the same method of proof as [HR2] we can show that the limit in (5.13) exists and then, by taking the expectation of $Z^\varepsilon$, we identify the limit.

6. Proof of Theorem 5.1

In this section, we establish (5.5)–(5.12). As a preliminary step, we state a lemma about the regularity of the function $u^\varepsilon$. Recall that $u^\varepsilon$ satisfies (4.11) or equivalently

$$(d(m) + d(n))\Delta_x u^\varepsilon(x; m, n) = \alpha(m,n)V^\varepsilon(x; m, n)\big[|\log\varepsilon|^{-1}u^\varepsilon(x; m, n) + 1\big].$$

In fact $\log\varepsilon \le u^\varepsilon$ and $u^\varepsilon$ is given by

$$\frac{1}{2\pi}\alpha'(m,n)\int\log|x - y|\,V^\varepsilon(y; m, n)\big[|\log\varepsilon|^{-1}u^\varepsilon(y; m, n) + 1\big]\,dy,$$

where $\alpha'(m,n) = \alpha(m,n)/(d(m) + d(n))$. To ease the notation, we do not display the dependence of $\alpha'(m,n)$ and $r(m,n)$ on $m$ and $n$.

Lemma 6.1. There exist positive constants $C_1$ and $C_2$ such that for all $x$,

$$|u^\varepsilon(x; m, n)| \le C_1\alpha'\min\left\{1 + \left|\log\frac{|x|}{r}\right|,\ |\log\varepsilon|\right\}, \tag{6.1}$$

$$|\nabla u^\varepsilon(x; m, n)| \le C_1\alpha'\min\left\{|x|^{-1},\ (r\varepsilon)^{-1}\right\}, \tag{6.2}$$

and for $|x| \ge 2|z| + C_2 r\varepsilon$,

$$|\nabla u^\varepsilon(x + z; m, n) - \nabla u^\varepsilon(x; m, n)| \le C_1\alpha'|x|^{-2}|z|. \tag{6.3}$$

Also,

$$\int_{|a|\le l}|\nabla u^\varepsilon(a; m, n)|\,da \le C_1\alpha' l, \tag{6.4}$$

$$\int_{|a|\le l}|\hat u^\varepsilon(a; m, n)|\,da \le C_1\alpha'(l + |z|)|z|, \tag{6.5}$$

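$$\int_{|a|\le l}|\nabla\hat u^\varepsilon(a; m, n)|\,da \le C_1\alpha'\Big[|z|\big(|\log(|z| + r\varepsilon)| + 1 + \log^+ l\big) + r\varepsilon\Big], \tag{6.6}$$

$$\int_{|a|\le l}u^\varepsilon(a; m, n)^2\,da \le C_1\alpha'^2\left[r^2\varepsilon^2|\log\varepsilon|^2 + l^2\left(\Big(\log^+\frac{l}{r}\Big)^2 + 1\right)\right], \tag{6.7}$$

$$\int_{|a|\le l}|\nabla u^\varepsilon(a; m, n)|^2\,da \le C_1\alpha'^2\left[1 + \log^+\frac{l}{r\varepsilon}\right]. \tag{6.8}$$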

Proof. The proofs of (6.1), (6.2) and (6.3) are omitted and can be found in Sect. 2.2 of [HR2]. Note however that in [HR2] we are assuming that $\chi = 0$ and that we were dealing with $V^\varepsilon(x) = \varepsilon^{-2}V(x/\varepsilon)$ instead of $(\varepsilon r)^{-2}V(x/(r\varepsilon))$. Since we have $u^\varepsilon(x; m, n) = v^\varepsilon(x/r)$ for $r = r(m,n)$ and $v^\varepsilon$ solving

$$(d(n) + d(m))\Delta v^\varepsilon(x) = \alpha(m,n)V^\varepsilon(x)\big[|\log\varepsilon|^{-1}v^\varepsilon(x) + 1\big],$$

we can readily use the results of [HR2] to obtain (6.1), (6.2) and (6.3). As for (6.4), we apply (6.2) to assert

$$\int_{|a|\le l}|\nabla u^\varepsilon(a; m, n)|\,da \le c_1\alpha'\int_{|a|\le l}\min\left\{|a|^{-1}, (r\varepsilon)^{-1}\right\}da \le c_2\alpha' l.$$
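The last inequality in the proof of (6.4) is a planar polar-coordinate computation. A short worked version, written for $d = 2$ (the planar setting of the paper) with the two cases $l \ge r\varepsilon$ and $l < r\varepsilon$ treated separately:

```latex
% Case l >= r\varepsilon: split the disc at radius r\varepsilon and use da = 2\pi\rho\,d\rho.
\[
\int_{|a|\le l}\min\bigl\{|a|^{-1},(r\varepsilon)^{-1}\bigr\}\,da
 = \frac{\pi (r\varepsilon)^2}{r\varepsilon}
   + \int_{r\varepsilon}^{l}\frac{1}{\rho}\,2\pi\rho\,d\rho
 = \pi r\varepsilon + 2\pi(l - r\varepsilon)
 = 2\pi l - \pi r\varepsilon \;\le\; 2\pi l .
\]
% Case l < r\varepsilon: the minimum equals (r\varepsilon)^{-1} on the whole disc.
\[
\int_{|a|\le l}(r\varepsilon)^{-1}\,da = \frac{\pi l^{2}}{r\varepsilon} \;\le\; \pi l .
\]
```

In both cases the integral is at most $2\pi l$, so (6.4) follows from (6.2) with a constant that does not depend on $\varepsilon$ or $r$.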

As for (6.5), we simply write,

$$\int_{|a|\le l}|\hat u^\varepsilon(a; m, n)|\,da = \int_{|a|\le l}\left|\int_0^1\nabla u^\varepsilon(a + tz; m, n)\cdot z\,dt\right|da \le |z|\int_{|a|\le l + |z|}|\nabla u^\varepsilon(a; m, n)|\,da,$$

and apply (6.4). As for (6.6), we use (6.3) and (6.4) to write

$$\int_{|a|\le l}|\nabla\hat u^\varepsilon(a; m, n)|\,da \le \int_{|a|\le 2|z| + C_2 r\varepsilon}|\nabla\hat u^\varepsilon(a; m, n)|\,da + C_1\alpha'\int_{2|z| + C_2 r\varepsilon\le|a|\le l}|a|^{-2}|z|\,da \le c_2\alpha'\Big[(|z| + r\varepsilon) + |z||\log(|z| + r\varepsilon)| + |z||\log l|\Big].$$

For the proof of (6.7), let us write $A(l; m, n)$ for the left-hand side of (6.7). We use (6.1) to assert that if $l \le \varepsilon r$, then

$$A(l; m, n) \le c_2 l^2\alpha'^2|\log\varepsilon|^2 \le c_2\alpha'^2 r^2\varepsilon^2|\log\varepsilon|^2,$$

and if $\varepsilon r < l$, then $A(l; m, n)$ is bounded above by

$$c_3\alpha'^2\left[r^2\varepsilon^2|\log\varepsilon|^2 + \int\mathbb{1}(|a|\in(\varepsilon r, l))\log^2\frac{|a|}{r}\,da\right] \le c_4\alpha'^2\left[r^2\varepsilon^2|\log\varepsilon|^2 + l^2\left(\log^2\frac{l}{r} + 1\right)\right],$$

completing the proof of (6.7). In the same fashion, we can readily establish (6.8).
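The remark that (6.8) follows in the same fashion can be illustrated by squaring the gradient bound (6.2) and integrating in polar coordinates. This is a sketch of that one step in the planar case $d = 2$; it yields a bound of the logarithmic type asserted in (6.8), with the constant $2\pi$ being specific to this computation.

```latex
% Square (6.2): |\nabla u^\varepsilon(a)|^2 \le C_1^2 \alpha'^2 \min\{|a|^{-2}, (r\varepsilon)^{-2}\}.
% Case l >= r\varepsilon:
\[
\int_{|a|\le l}\min\bigl\{|a|^{-2},(r\varepsilon)^{-2}\bigr\}\,da
 = \frac{\pi (r\varepsilon)^2}{(r\varepsilon)^2}
   + \int_{r\varepsilon}^{l}\frac{1}{\rho^{2}}\,2\pi\rho\,d\rho
 = \pi + 2\pi\log\frac{l}{r\varepsilon} .
\]
% Case l < r\varepsilon:
\[
\int_{|a|\le l}(r\varepsilon)^{-2}\,da = \pi\,\frac{l^{2}}{(r\varepsilon)^{2}} \;\le\; \pi .
\]
% Hence, in both cases,
\[
\int_{|a|\le l}|\nabla u^\varepsilon(a;m,n)|^{2}\,da
 \;\le\; 2\pi\,C_1^{2}\,\alpha'^{2}\Bigl[\,1 + \log^{+}\frac{l}{r\varepsilon}\Bigr].
\]
```

The logarithm in $l/(r\varepsilon)$ is what makes $K_\varepsilon^{-1}|\nabla u^\varepsilon|^2$ integrable uniformly in $\varepsilon$ when $K_\varepsilon$ is of order $|\log\varepsilon|$.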

Proof of (5.5), (5.6) and (5.7). We omit the proof of (5.6) because its proof is very similar to the proof of (5.5). Evidently

$$\mathbb{E}^{eq}_\varepsilon|G(\omega(0))| \le c_1 K_\varepsilon^{1/2}\sum_{m,n}\lambda_m\lambda_n\int_{|a|\le 1}|\hat u^\varepsilon(a; m, n)|\,da \le c_2 K_\varepsilon^{1/2}|z|\sum_{m,n}\alpha'(m,n)\lambda_m\lambda_n \le c_3 K_\varepsilon^{1/2}|z|,$$

where we used (6.5) and (3.10) for the second and third inequalities respectively. This proves (5.5). We now turn to the proof of (5.7). We certainly have

$$\mathbb{E}^{eq}_\varepsilon|H_{12}(\omega(0))| \le c_1 K_\varepsilon^{1/2}\sum_{m,n}\lambda_m\lambda_n\int_{|a|\le 1}|\nabla\hat u^\varepsilon(a; m, n)|\,da \le c_2 K_\varepsilon^{1/2}\sum_{m,n}\alpha'(m,n)\big(r(m,n)\varepsilon + |z||\log(|z| + r(m,n)\varepsilon)|\big)\lambda_m\lambda_n \le c_3 K_\varepsilon^{1/2}|z||\log|z||,$$

by (6.6) of Lemma 6.1. We now use (3.10) to deduce (5.7).

Proof of (5.8). Evidently the expression $\mathbb{E}^{eq}_\varepsilon|H_{22}(\omega(0))|$ is bounded by

$$c_1 K_\varepsilon^{1/2}\sum_{m,n,p}\iint_{|a|\le 1}\alpha(m,n)V^\varepsilon(b; m, n)\big[|\hat u^\varepsilon(a; m, p)| + |\hat u^\varepsilon(a; m + n, p)|\big]\lambda_m\lambda_n\lambda_p\,da\,db \le c_2 K_\varepsilon^{1/2}|z|\sum_{m,n,p}\alpha(m,n)\big(\alpha'(n,p) + \alpha'(m + n, p)\big)\lambda_m\lambda_n\lambda_p \le c_3 K_\varepsilon^{1/2}|z|,$$

where we used (6.5) and (3.10) for the second and third inequalities respectively. This proves (5.8).

Proof of (5.9) and (5.10). We start with the proof of (5.9). We have $H_{311} = H_{3111} - H_{3112}$, where

$$H_{3112}(\omega) = \frac12 K_\varepsilon^{-3/2}\sum_{i,j}\beta'(m_i)\,\hat u^\varepsilon(x_i - x_j; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j),$$

with

$$\beta'(m_i) = \sum_{m=1}^{m_i - 1}\beta(m, m_i - m).$$

Repeating the proof of (5.5) yields that $\mathbb{E}^{eq}_\varepsilon|H_{3112}(\omega(0))| \le c_1 K_\varepsilon^{1/2}|z|$. The term $H_{3111}$ is treated in the same fashion:

$$\mathbb{E}^{eq}_\varepsilon|H_{3111}(\omega(0))| \le c_2 K_\varepsilon^{1/2}|z|\sum_{n,p}\sum_{m=1}^{n-1}\beta(m, n - m)(a(p) + a(m))\lambda_p\lambda_n \le c_3 K_\varepsilon^{1/2}|z|.$$

This completes the proof of (5.9). We now turn to the proof of (5.10). The terms $H_{321}$ and $H_{322}$ are similar and both can be treated as (5.9). We only treat the former. We certainly have that $\mathbb{E}^{eq}_\varepsilon|H_{321}(\omega(0))|$ is bounded by

$$c_1 K_\varepsilon^{1/2}\sum_{n,p}\sum_{m=1}^{n-1}\beta(m, n - m)\iiint V^\varepsilon(a - y; m, n - m)\,|\hat u^\varepsilon(y - b; m, p)|\,\mathbb{1}(|y - b|\le 1,\ |y|, |b|\le l)\,da\,db\,dy$$

$$= c_1 K_\varepsilon^{1/2}\sum_{n,p}\sum_{m=1}^{n-1}\beta(m, n - m)\iint|\hat u^\varepsilon(y - b; m, p)|\,\mathbb{1}(|y - b|\le 1,\ |y|\le l)\,db\,dy \le c_2 K_\varepsilon^{1/2}|z|\sum_{n,p}\sum_{m=1}^{n-1}\beta(m, n - m)(a(p) + a(m))\lambda_p\lambda_n \le c_3 K_\varepsilon^{1/2}|z|,$$

completing the proof of (5.10).

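Proof of (5.11) and (5.12). We certainly have that the term $\mathbb{E}^{eq}_\varepsilon|H_{21}^z(\omega)|$ is bounded above by

$$c_1\mathbb{E}^{eq}_\varepsilon K_\varepsilon^{-3/2}\sum_{i,j}\alpha(m_i, m_j)V_\varepsilon(x_i - x_j; m_i, m_j)\,|u^\varepsilon(x_i - x_j + z; m_i, m_j)|\,\mathbb{1}(|x_i|\le l)$$

$$\le c_2 K_\varepsilon^{-1/2}\sum_{m,n}\alpha(m,n)\alpha'(m,n)\left|\log\frac{c_2|z|}{r(m,n)}\right|\mathbb{1}(\varepsilon r(m,n)\le|z|)\lambda_m\lambda_n + c_2 K_\varepsilon^{1/2}\sum_{m,n}\alpha(m,n)\alpha'(m,n)\lambda_m\lambda_n\,\mathbb{1}(\varepsilon r(m,n) > |z|)$$

$$\le c_3 K_\varepsilon^{-1/2}|\log|z|| + c_3 K_\varepsilon^{-1/2}\sum_{m,n}\alpha(m,n)\alpha'(m,n)\log r(m,n)\,\lambda_m\lambda_n + c_3 K_\varepsilon^{1/2}\sum_n a(n)^2\lambda_n\,\mathbb{1}(\varepsilon r(n) > |z|)$$

$$\le c_3 K_\varepsilon^{-1/2}|\log|z|| + c_4 K_\varepsilon^{-1/2}\sum_n a(n)^2\log n\,\lambda_n + c_4 K_\varepsilon^{1/2}\sum_n a(n)^2\,\mathbb{1}(\varepsilon r(n) > |z|)\lambda_n$$

$$\le c_3 K_\varepsilon^{-1/2}|\log|z|| + c_5 K_\varepsilon^{1/2}\left(\log\frac{|z|}{\varepsilon}\right)^{-1}\sum_n a(n)^2\log n\,\lambda_n \le c_3 K_\varepsilon^{-1/2}|\log|z|| + c_5 K_\varepsilon^{-1},$$

where we used Lemma 6.1 for the first inequality. This completes the proof of (5.11). Similarly the term $\mathbb{E}^{eq}_\varepsilon|H_{33}^z(\omega)|$ is bounded above by

$$c_1\mathbb{E}^{eq}_\varepsilon K_\varepsilon^{-3/2}\sum_i\sum_{m=1}^{m_i - 1}\int\beta(m, m_i - m)V^\varepsilon(x_i - y; m, m_i - m)\,|u^\varepsilon(x_i - y + z; m, m_i - m)|\,\mathbb{1}(|x_i|\le l)\,dy$$

$$\le c_2 K_\varepsilon^{-1/2}\sum_n\sum_{m=1}^{n-1}\beta(m, n - m)\alpha'(m, n - m)\left|\log\frac{c_2|z|}{r(m, n - m)}\right|\mathbb{1}(\varepsilon r(m, n - m)\le|z|)\lambda_n + c_2 K_\varepsilon^{1/2}\sum_n\sum_{m=1}^{n-1}\beta(m, n - m)\alpha'(m, n - m)\,\mathbb{1}(\varepsilon r(m, n - m) > |z|)\lambda_n$$

$$\le c_3 K_\varepsilon^{-1/2}|\log|z|| + c_3 K_\varepsilon^{-1}.$$

This completes the proof of (5.12).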

7. Regularity of the Coagulation Term, Part II

As we explained in Sect. 4, one of the main steps of the proof of Theorem 3.1 is the replacement of the expression $V^\varepsilon(\cdot)$ in the collision term $H_{132}^0$ with a more manageable expression $W^\varepsilon(\cdot + z)$ for small $z$. Ultimately we average out $W^\varepsilon(\cdot + z)$ over $z$ and apply a CLT. For this to succeed, we need to make sure that we can afford a small $z$ which is as big as $|\log\varepsilon|^{-a}$ for some $a < 1/2$. In Sect. 5, we used the auxiliary function $G$ in order to relate $H_{132}$ to $H_{13}^z$ provided that $|z|$ is of order $\delta(\varepsilon) = |\log\varepsilon|^{-\theta}$ for $\theta > 1/2$. In this section, we would like to fill the gap by showing that in fact $z$ can be chosen so that $|z|$ is as large as $\delta'(\varepsilon) = |\log\log\varepsilon|^{-\theta'}$, provided that $\theta'\in(0, 1/2)$. To achieve this, we fix a $\theta'\in(0, 1/2)$ and set $\bar H_{13}(\omega)$ to be equal to

$$\int H_{13}^z(\omega)\,\zeta^{\delta(\varepsilon)}(z)\,dz = K_\varepsilon^{-3/2}\sum_{i,j}\alpha(m_i, m_j)\,\bar W^\varepsilon(x_i - x_j; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j),$$

where $\bar W^\varepsilon(a; m, n) = \int W^\varepsilon(a + z; m, n)\,\zeta^{\delta(\varepsilon)}(z)\,dz$ and $W^\varepsilon$ was defined by (4.13). First observe that there exists a constant $c_1$ such that the function $\bar W^\varepsilon$ has a support that is contained in a ball of center 0 and radius $\delta(\varepsilon; m, n) = c_1\delta(\varepsilon) + r(m,n)\varepsilon$. For our purposes, it is more convenient to assume that $r(m,n)\varepsilon\le\delta(\varepsilon)$ so that for a constant $c_2$, the support of $\bar W^\varepsilon$ is contained in a ball of center 0 and radius $c_2\delta(\varepsilon)$, with $c_2 = c_1 + 1$, and that $|\bar W^\varepsilon|\le c_2\delta(\varepsilon)^{-2}$. Such a restriction causes a small error. Indeed, if we set

$$\bar H'_{13}(\omega) := K_\varepsilon^{-3/2}\sum_{i,j}\alpha(m_i, m_j)\,\bar W^\varepsilon(x_i - x_j; m_i, m_j)\,\bar J(x_i, m_i, x_j, m_j), \tag{7.1}$$

with $\bar J(x_i, m_i, x_j, m_j) = \tilde J(x_i, m_i, x_j, m_j)\,\mathbb{1}(r(m,n)\varepsilon\le\delta(\varepsilon))$, then

$$\mathbb{E}^{eq}_\varepsilon|\bar H'_{13}(\omega) - \bar H_{13}(\omega)| \le c_1 K_\varepsilon^{1/2}\sum_{m,n}\alpha(m,n)\lambda_m\lambda_n\,\mathbb{1}(r(m,n)\varepsilon > \delta(\varepsilon)) \le c_2 K_\varepsilon^{1/2}\sum_n a(n)\lambda_n\,\mathbb{1}(2r(n)\varepsilon > \delta(\varepsilon)), \tag{7.2}$$

which goes to 0 by our assumption (3.9). Define $v^\varepsilon$ by

$$v^\varepsilon(x; m, n) = \frac{1}{2\pi}\int\log|x - y|\,\bar W^\varepsilon(y; m, n)\,dy. \tag{7.3}$$

We then set

$$G'(\omega; z) = G'(\omega) = K_\varepsilon^{-3/2}\sum_{i,j}\hat q^\varepsilon(x_i - x_j; m_i, m_j)\,\bar J(x_i, m_i, x_j, m_j), \tag{7.4}$$

where $\hat q^\varepsilon(a; m, n) = v^\varepsilon(a + z; m, n)K(a + z) - v^\varepsilon(a; m, n)K(a)$. We have

$$G'(\omega(t)) = G'(\omega(0)) + \int_0^t\mathcal{A}G'(\omega(s))\,ds + M'_t,$$

where $M'_t$ is a martingale. Note that $G'$ is very similar to $G$ of Sect. 5; $\hat u$ is replaced with $\hat q$ and $\hat J$ is replaced with $\bar J$. The latter difference has to do with the fact that now the function $K$ appears in the definition of $\hat q$ and we no longer need to multiply $\tilde J$ with a cut-off function. We write

$$\mathcal{A}G' = \mathcal{A}_0 G' + \mathcal{A}_c G' + \mathcal{A}_f G' =: H'_1 + H'_2 + H'_3. \tag{7.5}$$

We now study various terms which appeared on the right-hand side. We write $H'_1 = H'_{11} + H'_{12} + H'_{13}$. We do not repeat the definition of the various $H'$-expressions, which all correspond to $H$-expressions of Sect. 5. However, since $v^\varepsilon$ satisfies (7.3), we have a different decomposition for $H'_{13}$. The decomposition

$$\Delta\hat q^\varepsilon(a; m, n) = \big[\Delta v^\varepsilon(a + z; m, n)K(a + z) - \Delta v^\varepsilon(a; m, n)K(a)\big] + \big[\nabla v^\varepsilon(a + z; m, n)\cdot\nabla K(a + z) - \nabla v^\varepsilon(a; m, n)\cdot\nabla K(a)\big] + \big[v^\varepsilon(a + z; m, n)\Delta K(a + z) - v^\varepsilon(a; m, n)\Delta K(a)\big] =: \hat q_1^\varepsilon(a; m, n) + \hat q_2^\varepsilon(a; m, n) + \hat q_3^\varepsilon(a; m, n)$$

results in a decomposition $H'_{13} = H'_{131} + H'_{132} + H'_{133}$, where

$$H'_{13r} = K_\varepsilon^{-3/2}\sum_{i,j}\hat q_r^\varepsilon(x_i - x_j; m_i, m_j)\,\tilde J(x_i, m_i, x_j, m_j).$$

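We may rewrite (7.5) as

$$\int_0^t H'_{131}(\omega(s))\,ds = G'(\omega(t)) - G'(\omega(0)) - M'_t - \int_0^t(H'_{11} + H'_{12} + H'_{132} + H'_{133} + H'_2 + H'_3)(\omega(s))\,ds. \tag{7.6}$$

We are now ready to state the main result of this section.

Theorem 7.1. Assume that $\delta(\varepsilon) < |z| < 1$. Then

$$\mathbb{E}^{eq}_\varepsilon\left|\int_0^t H'_{131}(\omega(s))\,ds\right| \le C_0(t + 1)\left(|z| + K_\varepsilon^{-1/2}|\log\delta(\varepsilon)|^{1/2}\right).$$

Remark 7.1. With the aid of this theorem, we can readily improve the $z$-average from $|z| = O(\delta(\varepsilon))$ to $|z| = O(\delta'(\varepsilon))$. Indeed $H'_{131}$ is given by

$$K_\varepsilon^{-3/2}\sum_{i,j}\bar W^\varepsilon(x_i - x_j + z; m_i, m_j)K(x_i - x_j + z)\,\bar J(x_i, m_i, x_j, m_j) - \bar H'_{13},$$

and by (7.2), the term $\bar H'_{13}$ can be replaced with $\bar H_{13}$, for an error that goes to 0 as $\varepsilon\to 0$. From this and Theorem 7.1 we deduce that $\bar H_{13} = \int H_{13}^z(\omega)\,\zeta^{\delta(\varepsilon)}(z)\,dz$ can be replaced with

$$K_\varepsilon^{-3/2}\sum_{i,j}\bar W^\varepsilon(x_i - x_j + z; m_i, m_j)K(x_i - x_j + z)\,\tilde J(x_i, m_i, x_j, m_j),$$

so long as $|z| = \delta'(\varepsilon)$.

We establish Theorem 7.1 by examining various terms that appeared on the right-hand side of (7.6). Indeed we show

$$\mathbb{E}^{eq}_\varepsilon|G'(\omega(t))| \le C_0\left(K_\varepsilon^{-1/2} + |z|\right), \tag{7.7}$$

$$\mathbb{E}^{eq}_\varepsilon\big|(H'_{11} + H'_{133})(\omega(s))\big| \le C_0\left(K_\varepsilon^{-1/2} + |z|\right), \tag{7.8}$$

$$\mathbb{E}^{eq}_\varepsilon\big|(H'_{12} + H'_{132})(\omega(s))\big| \le C_0 K_\varepsilon^{1/2}|z||\log|z||, \tag{7.9}$$

$$\mathbb{E}^{eq}_\varepsilon\big|H'_{22}(\omega(s))\big| \le C_0|z||\log\delta(\varepsilon)|^{1/2}, \tag{7.10}$$

$$\mathbb{E}^{eq}_\varepsilon\big|(H'_{31} + H'_{32})(\omega(s))\big| \le C_0\left(K_\varepsilon^{-1/2} + |z|\right), \tag{7.11}$$

$$\mathbb{E}^{eq}_\varepsilon\big|(H'_{21} + H'_{33})(\omega(s))\big| \le C_0 K_\varepsilon^{-1/2}|\log\delta(\varepsilon)|, \tag{7.12}$$

$$\mathbb{E}^{eq}_\varepsilon(M'_t)^2 \le C_0\,t\left(K_\varepsilon^{-1}\delta(\varepsilon) + |z|^2(\log|z|)^2\right). \tag{7.13}$$

To prepare for the proof of Theorem 7.1, we start with an elementary lemma.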

Lemma 7.1. Assume that $\iint G(x, y, m, n)\,dx\,dy = 0$, for every $m, n\in\mathbb{N}$. Then $\int Y^2\,d\nu_\lambda = Z_1 + Z_2 + Z_3$, where

$$Z_1 = N(N - 1)(N - 2)\sum_{n_1, n_2, n_3}\iiint G(y_1, y_2, n_1, n_2)\,G(y_1, y_3, n_1, n_3)\,\lambda_{n_1}\lambda_{n_2}\lambda_{n_3}\,dy_1\,dy_2\,dy_3,$$

$$Z_2 = N(N - 1)(N - 2)\sum_{n_1, n_2, n_3}\iiint G(y_1, y_2, n_1, n_2)\,G(y_3, y_2, n_3, n_2)\,\lambda_{n_1}\lambda_{n_2}\lambda_{n_3}\,dy_1\,dy_2\,dy_3,$$

$$Z_3 = N(N - 1)\sum_{n_1, n_2}\iint G(y_1, y_2, n_1, n_2)^2\,dy_1\,dy_2\,\lambda_{n_1}\lambda_{n_2}.$$

The straightforward proof of Lemma 7.1 is omitted. See also Lemma 3.3 of [R1] where a similar lemma is proved. As our next lemma we state some bounds on the function $v^\varepsilon$. The proof of this lemma is omitted because it is identical to the proof of Lemma 6.1.

Lemma 7.2. There exist positive constants $C_1$ and $C_2$ such that for all $x$,

$$|v^\varepsilon(x; m, n)| \le C_1\alpha'\min\{1 + |\log|x||,\ |\log\delta(\varepsilon)|\}, \tag{7.14}$$

$$|\nabla v^\varepsilon(x; m, n)| \le C_1\alpha'\min\left\{|x|^{-1},\ \delta(\varepsilon)^{-1}\right\}, \tag{7.15}$$

and for $|x|\ge 2|z| + C_2\delta(\varepsilon)$,

$$|\nabla v^\varepsilon(x + z; m, n) - \nabla v^\varepsilon(x; m, n)| \le C_1\gamma(\varepsilon; m, n)\,\alpha'|x|^{-2}|z|. \tag{7.16}$$

Also,

$$\int|\nabla q^\varepsilon(a; m, n)|\,da \le C_1\alpha', \tag{7.17}$$

$$\int|\hat q^\varepsilon(a; m, n)|\,da \le C_1\alpha'|z|, \tag{7.18}$$

$$\int|\nabla\hat q^\varepsilon(a; m, n)|\,da \le C_1\alpha'\big\{|z|\,[|\log(|z| + \delta(\varepsilon))| + 1] + \delta(\varepsilon)\big\}, \tag{7.19}$$

$$\int q^\varepsilon(a; m, n)^2\,da \le C_1\alpha'^2, \tag{7.20}$$

$$\int|\nabla q^\varepsilon(a; m, n)|^2\,da \le C_1\alpha'|\log\delta(\varepsilon)|. \tag{7.21}$$

Proof of (7.7) and (7.8). We only prove (7.7) because (7.8) can be proved by a verbatim argument. To apply Lemma 7.1, we need to check that for every $n_1$ and $n_2$,

$$\iint\hat q^\varepsilon(y_1 - y_2; n_1, n_2)\,\tilde J(y_1, n_1, y_2, n_2)\,dy_1\,dy_2 = 0. \tag{7.22}$$

We certainly have

$$\iint\hat q^\varepsilon(y_1 - y_2; n_1, n_2)\,J(y_1, n_1)\,dy_1\,dy_2 = \int J(y, n_1)\,dy\int\hat q^\varepsilon(a; n_1, n_2)\,da = 0.$$

The same is true if we replace $J(y_1, n_1)$ with $J(y_1, n_1 + n_2)$. This completes the proof of (7.22). In view of Lemma 7.1,

$$\mathbb{E}^{eq}_\varepsilon G'(\omega(0))^2 = \int G'^2(\omega)\,\nu_\lambda(d\omega) = R_1 + R_2 + R_3,$$

with

$$R_1 = K_\varepsilon(K_\varepsilon - 1)(K_\varepsilon - 2)K_\varepsilon^{-3}\sum_{n_1, n_2, n_3}\iiint\hat q^\varepsilon(y_1 - y_2; n_1, n_2)\,\hat q^\varepsilon(y_1 - y_3; n_1, n_3)\,\bar J(y_1, n_1, y_2, n_2)\,\bar J(y_1, n_1, y_3, n_3)\,\lambda_{n_1}\lambda_{n_2}\lambda_{n_3}\,dy_1\,dy_2\,dy_3,$$

$$R_3 = K_\varepsilon(K_\varepsilon - 1)K_\varepsilon^{-3}\sum_{n_1, n_2}\iint\hat q^\varepsilon(y_1 - y_2; n_1, n_2)^2\,\bar J^2(y_1, n_1, y_2, n_2)\,\lambda_{n_1}\lambda_{n_2}\,dy_1\,dy_2,$$

and $R_2$ is given by an expression similar to $R_1$. We start with bounding $R_3$. We certainly have

$$R_3 \le c_1 K_\varepsilon^{-1}\sum_{n_1, n_2}\lambda_{n_1}\lambda_{n_2}\int\big[q^\varepsilon(a; n_1, n_2)^2 + q^\varepsilon(a + z; n_1, n_2)^2\big]\,da.$$

By Lemma 7.2,

$$R_3 \le c_2 K_\varepsilon^{-1}\sum_{n_1, n_2}\alpha'(n_1, n_2)^2\lambda_{n_1}\lambda_{n_2} \le c_3 K_\varepsilon^{-1}\sum_n a(n)^2\lambda_n \le c_4 K_\varepsilon^{-1}. \tag{7.23}$$

We now turn to $R_1$. First observe that $R_1\le R'_1$, where

$$R'_1 = c_1\sum_{n_1, n_2, n_3}\lambda_{n_1}\lambda_{n_2}\lambda_{n_3}\int|\hat q^\varepsilon(a; n_1, n_2)|\,da\int|\hat q^\varepsilon(a; n_1, n_3)|\,da.$$

By Lemma 7.2 we deduce

$$R'_1 \le c_1|z|^2\sum_n a(n)^2\lambda_n \le c_2|z|^2. \tag{7.24}$$

From this and (7.23) we deduce (7.7).
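The passage from the second-moment bounds (7.23) and (7.24) to the first-moment bound (7.7) is a Cauchy–Schwarz step that the text leaves implicit. A sketch, assuming (as the text indicates) that $R_2$ obeys the same bound as $R_1$, and using stationarity of the equilibrium measure to pass from time $0$ to time $t$; the constant $c$ below is a generic constant introduced here.

```latex
\begin{align*}
\mathbb{E}^{eq}_\varepsilon\,|G'(\omega(t))|
 &= \mathbb{E}^{eq}_\varepsilon\,|G'(\omega(0))|
  \;\le\; \Bigl(\mathbb{E}^{eq}_\varepsilon\, G'(\omega(0))^{2}\Bigr)^{1/2}
  = \bigl(R_1 + R_2 + R_3\bigr)^{1/2} \\
 &\le \bigl(c\,|z|^{2} + c\,K_\varepsilon^{-1}\bigr)^{1/2}
  \;\le\; \sqrt{c}\,\bigl(|z| + K_\varepsilon^{-1/2}\bigr),
\end{align*}
% where the last step uses \sqrt{a+b} \le \sqrt{a} + \sqrt{b} for a, b \ge 0.
```

The same two-line argument converts each second-moment estimate in this section into the corresponding first-moment bound.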

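Proof of (7.9). Since $H'_{12}$ is very similar to $H'_{132}$, we only establish (7.9) for $H'_{12}$. In view of Lemma 7.1,

$$\mathbb{E}^{eq}_\varepsilon H'_{12}(\omega(0))^2 = R_1 + R_2 + R_3,$$

with

$$R_1 = K_\varepsilon(K_\varepsilon - 1)(K_\varepsilon - 2)K_\varepsilon^{-3}\sum_{n_1, n_2, n_3}\iiint\nabla\hat q^\varepsilon(y_1 - y_2; n_1, n_2)\cdot\nabla\hat q^\varepsilon(y_1 - y_3; n_1, n_3)\,\bar J(y_1, n_1, y_2, n_2)\,\bar J(y_1, n_1, y_3, n_3)\,\lambda_{n_1}\lambda_{n_2}\lambda_{n_3}\,dy_1\,dy_2\,dy_3,$$

$$R_3 = K_\varepsilon(K_\varepsilon - 1)K_\varepsilon^{-3}\sum_{n_1, n_2}\iint|\nabla\hat q^\varepsilon(y_1 - y_2; n_1, n_2)|^2\,\bar J^2(y_1, n_1, y_2, n_2)\,\lambda_{n_1}\lambda_{n_2}\,dy_1\,dy_2,$$

and $R_2$ is given by an expression similar to $R_1$. We start with bounding $R_3$. We certainly have

$$R_3 \le c_1 K_\varepsilon^{-1}\sum_{n_1, n_2}\lambda_{n_1}\lambda_{n_2}\int\big[|\nabla q^\varepsilon(a; n_1, n_2)|^2 + |\nabla q^\varepsilon(a + z; n_1, n_2)|^2\big]\,da.$$

By Lemma 7.2,

$$R_3 \le c_2|\log\delta(\varepsilon)|K_\varepsilon^{-1}\sum_{n_1, n_2}\alpha'(n_1, n_2)^2\lambda_{n_1}\lambda_{n_2} \le c_3 K_\varepsilon^{-1}|\log\delta(\varepsilon)|\sum_n a(n)^2\lambda_n \le c_4 K_\varepsilon^{-1}|\log\delta(\varepsilon)|. \tag{7.25}$$

We now turn to $R_1$. First observe that $R_1\le R'_1$, where

$$R'_1 = c_1\sum_{n_1, n_2, n_3}\lambda_{n_1}\lambda_{n_2}\lambda_{n_3}\int|\hat q^\varepsilon(a; n_1, n_2)|\,da\int|\hat q^\varepsilon(a; n_1, n_3)|\,da.$$

By Lemma 7.2 we deduce

$$R'_1 \le c_1|z|^2\sum_{n_1, n_2, n_3}\alpha'(n_1, n_2)\alpha'(n_1, n_3)\lambda_{n_1}\lambda_{n_2}\lambda_{n_3} \le c_2|z|^2. \tag{7.26}$$

From this and (7.25) we deduce (7.9).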

Proof of (7.11). As in the proof of (5.8) and (5.9), we have that $H'_{31} = 2H'_{311}$, $H'_{311} = H'_{3111} - H'_{3112}$, where

$$H'_{3112}(\omega) = \frac12 K_\varepsilon^{-3/2}\sum_{i,j}\beta'(m_i)\,\hat q^\varepsilon(x_i - x_j; m_i, m_j)\,\tilde J(x_i, m_i, x_j, m_j).$$

Repeating the proof of (7.9) yields

$$\mathbb{E}^{eq}_\varepsilon H'_{3112}(\omega(s))^2 \le c_3\left(|z|^2 + K_\varepsilon^{-1}\right).$$

The term $H'_{3111}$ is handled in just the same way we handle $H'_{32}$ below. We now turn to $H'_{32}$. The terms $H'_{321}$ and $H'_{322}$ are similar and both can be treated as (7.11). We only treat the latter. We apply Lemma 7.1 for $G(x_i, x_j, m_i, m_j)$ given by

$$\frac12\sum_{m=1}^{m_j - 1}\int\beta(m, m_j - m)V^\varepsilon(x_j - y; m, m_j - m)\,\hat q^\varepsilon(x_i - y; m_i, m)\,\bar J(x_i, m_i, y, m)\,dy.$$

As a result,

$$\mathbb{E}^{eq}_\varepsilon[H'_{322}(\omega(0))]^2 = R_1 + R_2 + R_3,$$

with $R_1$, $R_2$, and $R_3$ corresponding to $Z_1$, $Z_2$, and $Z_3$ in Lemma 7.1. We first treat $R_3$. For this term we need to bound $|G(x_i, x_j, m_i, m_j)|$. In this case we simply move the absolute value inside the summation and replace $|\hat q^\varepsilon(a; m_i, m)|$ with a constant multiple of $|q^\varepsilon(a; m_i, m)| + |q^\varepsilon(a + z; m_i, m)|$. We then apply Lemma 6.1 to assert

$$|G(x_i, x_j, m_i, m_j)| \le S(x_i - x_j, m_i, m_j) + S(x_i - x_j + z, m_i, m_j),$$

with $S(a, p, n)$ given by

$$c_2\sum_{m=1}^{n-1}\int\beta(m, n - m)V^\varepsilon(y; m, n - m)\,\alpha'(p, m)\min\{|\log\delta(\varepsilon)|,\ |\log|a + y||\}\,\mathbb{1}(|a|\le 2)\,dy.$$

We have that there exist constants $c_3$ and $c_4$ such that

$$S(a, p, n) \le \begin{cases} c_3\sum_{m=1}^{n-1}\beta(m, n - m)\alpha'(p, m)(|\log|a|| + 1), & \text{if } |a|\ge c_4(r(n)\varepsilon + \delta(\varepsilon)),\\ c_3\sum_{m=1}^{n-1}\beta(m, n - m)\alpha'(p, m)|\log\delta(\varepsilon)|, & \text{otherwise.}\end{cases}$$

From this we can readily deduce that $R_3\le c_5 K_\varepsilon^{-1}$, as in the proof of (7.11). (Note that $\sum_{m=1}^{n-1}\beta(m, n - m)\alpha'(p, m)\le(a(n) + a(p))\beta'(n)$ because by our choice, the function $a$ is non-decreasing.) We now turn to $R_1$ and $R_2$. We certainly have that $|G(x_i, x_j, m_i, m_j)|$ is bounded above by

$$|z|\int_0^1\sum_{m=1}^{m_j - 1}\int\beta(m, m_j - m)V^\varepsilon(x_j - y; m, m_j - m)\,|\nabla q^\varepsilon(x_i - y + tz; m_i, m)\,\bar J(x_i, m_i, y, m)|\,dy\,dt.$$

We then apply Lemma 7.2 to assert

$$|G(x_i, x_j, m_i, m_j)| \le |z|\int_0^1 L(x_i - x_j + tz, m_i, m_j)\,dt,$$

with $L(a, p, n)$ given by

$$c_5\sum_{m=1}^{n-1}\int\beta(m, n - m)V^\varepsilon(y; m, n - m)\,\alpha'(p, m)\min\left\{\delta(\varepsilon)^{-1},\ |a + y|^{-1}\right\}\mathbb{1}(|a|\le 2)\,dy.$$

Again we can readily show

$$L(a, n) \le \begin{cases} c_3\sum_{m=1}^{n-1}\alpha'(m, n)\beta(m, n - m)|a|^{-1}, & \text{if } |a|\ge c_4(r(n)\varepsilon + \delta(\varepsilon)),\\ c_3\sum_{m=1}^{n-1}\alpha'(m, n)\beta(m, n - m)\delta(\varepsilon)^{-1}, & \text{otherwise.}\end{cases}$$

Repeating the proof of (7.7) yields that $R_1 + R_2\le c_7|z|^2$, completing the proof of (7.11).

Proof of (7.12). We only establish (7.12) for $H'_{21}$ because $H'_{33}$ can be treated by an identical argument. Choose $c_1$ so that $V(a) = 0$ if $|a| > c_1$. We certainly have that the expression $\mathbb{E}^{eq}_\varepsilon|H'^z_{21}(\omega)|$ is bounded above by

$$\mathbb{E}^{eq}_\varepsilon K_\varepsilon^{-3/2}\sum_{i,j}\alpha(m_i, m_j)V_\varepsilon(x_i - x_j; m_i, m_j)\,|\hat q^\varepsilon(x_i - x_j + z; m_i, m_j)| \le c_1|\log\delta(\varepsilon)|K_\varepsilon^{-1/2}\sum_{m,n}\alpha(m,n)\alpha'(m,n)\lambda_m\lambda_n \le c_2|\log\delta(\varepsilon)|K_\varepsilon^{-1/2},$$

where we used Lemma 6.1 for the first inequality. This completes the proof of (7.12).


Proof of (7.10). We note that $H_{22}$ is a sum of eight terms $H_{22i}$, $i = 1, \dots, 8$, and we establish (7.10) by showing the analogous bound for each $H_{22i}$. Since all eight terms can be treated in the same way, we only treat the sixth term, which is given by

1 −3/2 α(m i , m j )Vε (xi − x j ; m i , m j ) K 2 ε

H226 (ω) =

i, j,k

q (xk − xi ; m k , m i ) J¯(xk , m k , xi , m i ). ε

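We note that $\bar J$ is a sum of 4 terms, which yields a decomposition
\[
H_{226} = H_{2261} + H_{2262} - H_{2263} - H_{2264}. \tag{7.27}
\]
Again all 4 terms can be treated in the same way, so we only treat $H_{2264}$, which is given by
\[
\frac{1}{2} K_\varepsilon^{-3/2} \sum_{i,j,k} \alpha(m_i, m_j) V_\varepsilon(x_i - x_j; m_i, m_j)\, \tilde q^{\varepsilon}(x_k - x_i; m_k, m_i)\, J(x_i, m_i),
\]
where $\tilde q(a; m_k, m_i) = \hat q(a; m_k, m_i)\,\mathbf{1}(r(m_i, m_k)\varepsilon \le \delta(\varepsilon))$. We use the elementary inequality $|a| \le \delta + \delta^{-1} a^2$ to assert
\[
|H_{2264}| \le H_{22641} + H_{22642}, \tag{7.28}
\]
where $H_{22641}$ and $H_{22642}$ are respectively given by
\[
\frac{\delta}{2} K_\varepsilon^{-1} \sum_{i,j} \alpha(m_i, m_j) V_\varepsilon(x_i - x_j; m_i, m_j)\, |J(x_i, m_i)|,
\]
\[
\frac{\delta^{-1}}{2} K_\varepsilon^{-1} \sum_{i,j} \alpha(m_i, m_j) V_\varepsilon(x_i - x_j; m_i, m_j)\, |J(x_i, m_i)| \Big[ K_\varepsilon^{-1/2} \sum_{k} \tilde q^{\varepsilon}(x_k - x_i; m_k, m_i) \Big]^2 .
\]
Evidently, $\mathbb{E}^{\mathrm{eq}}_\varepsilon H_{22641} \le c_1 \delta$ for some constant $c_1$. Moreover, by squaring the expression in the brackets, we learn that $H_{22642} = H_{226421} + H_{226422}$, where
\[
H_{226421} = \delta^{-1} K_\varepsilon^{-2} \sum_{i,j,k} \alpha(m_i, m_j) V_\varepsilon(x_i - x_j; m_i, m_j)\, |J(x_i, m_i)|\, \tilde q^{\varepsilon}(x_k - x_i; m_k, m_i)^2 ,
\]
\[
H_{226422} = \delta^{-1} K_\varepsilon^{-2} \sum_{i,j,\,k \ne l} \alpha(m_i, m_j) V_\varepsilon(x_i - x_j; m_i, m_j)\, |J(x_i, m_i)|\, \tilde q^{\varepsilon}(x_k - x_i; m_k, m_i)\, \tilde q^{\varepsilon}(x_l - x_i; m_l, m_i).
\]
Because of our choice of $\tilde q$, we have that $\int \tilde q(a; m, n)\, da = 0$. As a consequence,
\[
\mathbb{E}^{\mathrm{eq}}_\varepsilon H_{226422} = 0. \tag{7.29}
\]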


We certainly have that Eεeq H226421 is bounded above by

α(n 1 , n 2 )λn 1 λn 2 λn 3

n 1 ,n 2 ,n 3

= c2 δ −1

c2 δ −1 K ε

Vε (a; n 1 , n 2 )da

α(n 1 , n 2 )λn 1 λn 2 λn 3

qˆ ε (b; n 3 , n 1 )2 db

qˆ ε (b; n 3 , n 1 )2 db.

n 1 ,n 2 ,n 3

On the other hand, by Lemma 7.2,

ε 2 2 qˆ (b; n 3 , n 1 ) db ≤ |z| ∇q ε (b; n 3 , n 1 )|2 db ≤ c3 α (n 3 , n 1 )2 | log δ(ε)||z|2 . As a result, Eεeq H226421 is bounded above by

c4 δ −1 |z|2 | log δ(ε)|

α(n 1 , n 2 )α (n 3 , n 1 )2 λn 1 λn 2 λn 3 ≤ c5 δ −1 |z|2 | log δ(ε)|.

n 1 ,n 2 ,n 3

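In summary, from this, (7.28), and (7.29) we deduce
\[
\mathbb{E}^{\mathrm{eq}}_\varepsilon H_{2264} \le c_1 \delta + c_5 \delta^{-1} |z|^2 |\log \delta(\varepsilon)|.
\]
By choosing $\delta = |z|\,|\log \delta(\varepsilon)|^{1/2}$ we deduce (7.10).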
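The choice of δ in the last step balances the two terms of the bound c1 δ + c5 δ^{-1}|z|^2 |log δ(ε)|. The following is a minimal numerical sketch of that step, not part of the proof: the function names and the sample constants c1 = c5 = 1 are illustrative assumptions. It checks the elementary inequality |a| ≤ δ + δ^{-1}a^2 on a grid and evaluates the bound at the balanced choice δ = |z||log δ(ε)|^{1/2}.

```python
import math


def young_rhs(a: float, delta: float) -> float:
    """Right-hand side of |a| <= delta + a^2 / delta, valid for any delta > 0."""
    return delta + a * a / delta


def inequality_holds_on_grid(deltas, values) -> bool:
    """Check |a| <= delta + a^2 / delta for every pair taken from the two grids."""
    return all(abs(a) <= young_rhs(a, d) for d in deltas for a in values)


def summary_bound(delta, z_abs, log_term, c1=1.0, c5=1.0):
    """c1*delta + c5*|z|^2*|log delta(eps)|/delta; c1 and c5 are placeholder constants."""
    return c1 * delta + c5 * z_abs ** 2 * log_term / delta


def balanced_delta(z_abs, log_term):
    """The choice delta = |z| * |log delta(eps)|^(1/2) made in the text."""
    return z_abs * math.sqrt(log_term)
```

With this choice both terms of the bound are equal, so the total is (c1 + c5)|z||log δ(ε)|^{1/2}.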

Proof of (7.13). As it is well-known,

t 2 eq [M ] = E (AG − 2G AG )(ω(s))ds = t (Z 1 + Z 2 + Z 3 ), Eeq ε t ε 0

where Z 1 = 2Eeq ε (A0 G − 2G A0 G )(ω), Z 2 = Eeq ε (Ac G − 2G Ac G )(ω), Z 3 = Eeq ε (A f G − 2G A f G )(ω).

We start with bounding Z 1 : d(m i )|∇xi G (ω)|2 ≤ Z 11 + Z 12 + Z 13 + Z 14 , Z 1 = K ε−3 Eeq ε i

where Z 11

Z 12

2 −3 eq ε ¯ = 4K ε Eε d(m i ) ∇ qˆ (xi − x j ; m i , m j ) J (xi , m i , x j , m j ) , j i 2 ε ¯ = 4K ε−3 Eeq d(m ) q ˆ (x − x , m , m )∇ , m , x , m ) J (x i i j i j xi i i j j . ε j i


The term Z 13 and Z 14 are given by a similar expression; xi and x j are swapped inside the absolute values. We only bound Z 11 because Z 11 involves ∇ qˆ ε which is more singular than qˆ ε . The remaining Z 1r can be bounded in a similar way. Squaring yields Z 11 ≤ 4K ε−3 Eeq d(m i ) ∇ qˆ ε (xi − x j ; m i , m j ) · ∇ qˆ ε (xi − xk ; m i , m k ) ε j=k

i

J¯(xi , m i , x j , m j ) J¯(xi , m i , xk , m k ) +4K ε−3 Eeq d(m i ) |∇ qˆ ε (xi − x j ; m i , m j )|2 J¯(xi , m i , x j , m j )2 ε i

j

=: Z 111 + Z 112 . Use Lemma 7.2 to deduce Z 112 ≤ c1 K ε−1 | log δ(ε)|

n 1 ,n 2

d(n 1 )α (n 1 , n 2 )2 λn 1 λn 2 ≤ c2 K ε−1 | log δ(ε)|.

We now turn to Z 111 . By Lemma 7.2, 2 Z 111 ≤ c1 d(n 1 ) |z|| log |z|| + δ(ε) α (n 1 , n 2 )α (n 1 , n 3 )λn 1 λn 2 λn 3 n 1 ,n 2 ,n 3

2 2 ≤ c2 |z|| log |z|| + δ(ε) ≤ 4c2 |z|| log |z|| . In summary Z 1 ≤ c3 K ε−1 | log δ(ε)| + c3 [|z|| log |z||]2 .

(7.30)

We now look at Z 2 . We have Z2 =

1 −3 eq K E α(m i , m j )Vε (xi − x j ; m i , m j ) 2 ε ε i, j ⎧ ⎡ ⎤⎫2 8 ⎬ ⎨ ⎣i, j (0) + i, j,k ( p)⎦ , × ⎭ ⎩ p=1

k

where 8p=1 i, j,k ( p) represents the eight terms that appeared in the definition of H¯ 22 ε and i, j (0) = −qˆ (xi − x j ; m i , m j ) J¯(xi , m i , x j , m j ). An application of the inequality ⎞2 ⎛ 8 8 ⎠ ⎝ ap ≤9 a 2p , p=0

p=0

yields that Z 2 is bounded by

⎡ 2 ⎤ 8 9 −3 eq K E α(m i , m j )Vε (xi − x j ; m i , m j ) ⎣i, j (0)2 + i, j,k ( p) ⎦ 2 ε ε i, j

=:

8 p=0

Z2 p

p=1

k


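with, for example,
\[
Z_{28} \le \frac{9}{2}\, \mathbb{E}^{\mathrm{eq}}_\varepsilon K_\varepsilon^{-3} \sum_{i,j} \alpha(m_i, m_j) V_\varepsilon(x_i - x_j; m_i, m_j) \Big[ \sum_{k} \hat q^{\varepsilon}(x_k - x_j; m_k, m_j)\, \bar J(x_k, m_k, x_j, m_j) \Big]^2 .
\]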
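The inequality that bounds the square of a sum of nine terms by nine times the sum of their squares is the Cauchy-Schwarz inequality applied to nine summands. The sketch below checks it numerically; the helper names and the random test inputs are illustrative assumptions and play no role in the argument.

```python
import random


def square_of_sum_bound(terms):
    """Return ((sum a_p)^2, n * sum a_p^2) for a list of n real numbers."""
    n = len(terms)
    lhs = sum(terms) ** 2
    rhs = n * sum(a * a for a in terms)
    return lhs, rhs


def check_random_samples(n_terms=9, trials=1000, seed=0):
    """Check (sum a_p)^2 <= n * sum a_p^2 on random inputs, up to rounding error."""
    rng = random.Random(seed)
    for _ in range(trials):
        terms = [rng.uniform(-5.0, 5.0) for _ in range(n_terms)]
        lhs, rhs = square_of_sum_bound(terms)
        if lhs > rhs * (1 + 1e-12) + 1e-12:
            return False
    return True
```

Equality holds exactly when all nine terms coincide, which is why the constant 9 cannot be improved in general.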

We only treat Z 20 and Z 28 as the other terms Z 2r for r = 1, . . . , 7 can be treated as Z 28 . We have Z 28 = Z 281 + Z 282 , where 9 −3 Z 281 = Eeq Vε (xi − x j ; m i , m j )α(m i , m j ) ε Kε 2 i, j qˆ ε (xk − x j ; m k , m j )qˆ ε (xl − x j ; m l , m j ) k=l

Z 282

J¯(xk , m k , x j , m j ) J¯(xl , m l , x j , m j ), 9 −3 = Eeq Vε (xi − x j ; m i , m j )α(m i , m j ) ε Kε 2 i, j qˆ ε (xk − x j ; m k , m j )2 J¯(xk , m k , x j , m j )2 . k

We start with the former Z 281 ≤ c1

α(n 1 , n 2 )λn 1 λn 2 λn 3 λn 4

|qˆ ε (a; n 3 , n 2 )|da

n 1 ,n 2 ,n 3 ,n 4

As for Z 282 we have Z 282 ≤ c1 K ε−1

|qˆ ε (a; n 4 , n 2 )|da ≤ c2 |z|2 .

α(n 1 , n 2 )

n 1 ,n 2 ,n 3

qˆ ε (a; n 3 , n 2 )2 daλn 1 λn 2 λn 3 ≤ c2 K ε−1 .

Finally Z 280 ≤ In summary,

c1 K ε−2

α(n 1 , n 2 )λn 1 λn 2

n 1 ,n 2

qˆ ε (a; n 1 , n 2 )2 da ≤ c2 K ε−2 .

Z 2 ≤ c1 K ε−1 + |z|2 .

We now turn to Z 3 . We have Z3 =

m i −1 1 eq −3 Eε K ε β(m, m i − m) V ε (xi − y; m i − m, m) 2 i m=1 ⎡ ⎤2 4 ⎣i (y, m) + i ( p; y, m)⎦ dy, p=1

(7.31)


where for example i (y, m) = qˆ ε (xi − y; m, m i − m) J¯(xi , m, m i − m, y), q¯ ε (y − x j ; m, m j ) J¯(y, m, x j , m j ). i (3; y, m) = j

Again Z 3 ≤ 5 Z 33

4 0

Z 3r with for example

m i −1 5 eq −3 = Eε K ε β(m, m i − m) V ε (xi − y; m i − m, m)i (3; y, m)2 dy. 2 i

m=1

We can now repeat the line of argument we had for Z 2 by squaring out i and use Lemma 7.2 to get Z 3 ≤ c1 K ε−1 + |z|2 . (7.32) From (7.30), (7.31) and (7.32) we deduce (7.13).

8. Kinetic Limit

In this section we establish the main claim of Theorem 3.1. We now state the martingale formulation of the Ornstein–Uhlenbeck diffusion which uniquely determines the solution of Eq. (3.13).

Definition 8.1. We say ξ is a solution of (3.13) if for any smooth function J of compact support with $\int J = 0$, the following processes are martingales:

t M J (t) = M(t) = ξ(t, J ) − ξ(0, J ) − ξ(s, J )ds, 0

N J (t) = N (t) = M(t) − t A(J ). Here J = (Jn : n ∈ N) with Jn : Rd → R and Jn (x)d x = 0, = 0 + c + f , and ξn (t, Jn ), ξ(t, J ) = 2

n

0 ξ(t, J ) =

d(n)ξn (t, x Jn ),

n

c ξ(t, J ) =

α(m, ˆ n)λn ξn (t, Jn+m − Jn − Jm ),

m,n

f (ξ, J ) =

ˆ β(m, n)ξn (t, Jn + Jm − Jn+m ),

m,n

A(J ) = 2

d(n)λn |∇x Jn |2 d x +

n

1 + 2

m,n

1 2

α(m, ˆ n)λn λm (Jn+m − Jn − Jm )2 d x

m,n

ˆ β(m, n)λn+m (Jn+m − Jn − Jm )2 d x.


We note that the last two terms in the definition of A(J ) are equal by the detailed balance assumption. Ideally, we would like to show that the family P ε is tight as ε → 0 and that any limit point solves (3.13). Unfortunately we have not been able to establish the tightness and the difficulty comes from two error terms which go to 0 as ε → 0 for each t. More precisely, let us define ξ (t, J ) = ξ(t, J ) + ξ (t, J ), where J (xi (t), m i (t)), ξ(t, J ) = K ε1/2 i 1 ¯ 2 G ε (ω(t)),

and ξ (t, J ) = with

G (ω; z 2 − z 1 )ζ δ (ε) (z 2 )ζ δ (ε) (z 1 )dz 1 dz 2 . G¯ ε (ω) = G(ω; z)ζ δ(ε) (z)dz + (8.1) Here G and G are as in (5.1) and (7.3), and ζ δ (a) = δ −2 ζ (a/δ) with ζ a smooth non-negative symmetric function of compact support satisfying ζ (a)da = 1. We take a countable dense subset D0 of smooth functions of compact support and write H = L 1 ([0, T ]; R)D0 . The transformation ω(·) → (ξ (·, J ) : J ∈ D0 ) induces a probability measure Pˆ ε on H. Let us write P for the distribution of a process ξ which solves (3.13) and is subject to the following initial condition: ξ(0, J ) is a Gaussian random variable with variance

2 λn J 2 (x, n)d x. ξ(0, J ) P(dξ ) = n

Note that ξ(·, J ) is stationary under P. Note also that P can be regarded as a probability measure on H. It turns out that the tightness of the sequence Pˆ ε can be shown by standard arguments. Theorem 8.1. The sequence Pˆ ε converges to P as ε → 0. Moreover, lim Eeq ε ξ (t, J ) = 0, ε

(8.2)

for every t. We note that (8.2) is an immediate consequence of (5.5) and (7.7). The proof of the convergence of Pˆ ε is naturally divided into two steps. The first step is devoted to the proof of the tightness of the family Pˆ ε . This step will be carried out in Sect. 9. For the second step, we show that any limit point solves (3.13). This is a rather straight forward consequence of Theorem 8.2 below. This theorem is also the main ingredient for the proof of Theorem 3.1. We note that by a celebrated result of Holley and Stroock [HS], (3.13) has a unique solution in the sense of Definition 8.1. Theorem 8.2. There exist martingales Mε and Nε , and processes Err 1,ε and Err 2,ε such that

t ξ (t, J ) − ξ (0, J ) − ξ (s, J )ds = Mε (t) + Err 1,ε (t), (8.3) 0

Mε (t)2 − t A(J ) = Nε (t) + Err 2,ε ,

(8.4)


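where $A(J)$ was defined in Definition 8.1, and
\[
\lim_{\varepsilon \to 0} \mathbb{E}^{\mathrm{eq}}_\varepsilon\, \mathrm{Err}^{1,\varepsilon}(t) = \lim_{\varepsilon \to 0} \mathbb{E}^{\mathrm{eq}}_\varepsilon\, \mathrm{Err}^{2,\varepsilon}(t) = 0.
\]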

The proof of Theorem 8.2 is naturally divided into two parts.

Proof of (8.3). Step 1: Let us write $\bar X_\varepsilon(\omega) = X_\varepsilon(\omega) + \frac{1}{2} \bar G_\varepsilon(\omega)$, where $X_\varepsilon(\omega)$ and $\bar G_\varepsilon(\omega)$ were defined by (4.1) and (8.1) respectively. As is well known for Markov processes, the following process is a martingale:
\[
\bar M_\varepsilon(t) := \bar X_\varepsilon(\omega(t)) - \bar X_\varepsilon(\omega(0)) - \int_0^t A \bar X_\varepsilon(\omega(s))\, ds.
\]

Note that by definition, X¯ ε (ω(t)) = ξ (t, J ). Let us study the term A X¯ ε . We certainly have 1 A X¯ ε = A0 X ε + Ac X ε + A f X ε + AG¯ ε . 2 Note that the term AX ε involves J˜ whereas AG¯ ε involves Jˆ. We replace J˜ of AX ε with eq Jˆ. This causes an error Err 0 which is small because Eε | Err 0 | is bounded above by c1 K ε1/2 α(m, n)11(c0 εr (m, n) ≥ 1)λm λn ≤ c2 K ε1/2 a(n)11(2c0 εr (n) ≥ 1)λn . n,m

n

As a result, we may use (3.9) to deduce
\[
\lim_{\varepsilon \to 0} \mathbb{E}^{\mathrm{eq}}_\varepsilon |\mathrm{Err}_0| = 0.
\]

As a consequence of Theorems 5.1 and 7.1 (see Remark 7.1), we have

t

t

t

t 1 0 Ac X ε + AG¯ ε (ω(s))ds = Q ε (ω(s))ds − H33 (ω(s))ds + Err 1 ds, 2 0 0 0 0 where Q ε (ω) equals

1 α(m i , m j )W¯ ε (xi − x j + z 2 − z 1 ; m i , m j ) K ε−3/2 2 i, j

Jˆ(xi , m i , x j , m j )ζ δ (z 1 )ζ δ (z 2 )dz 1 dz 2 , and

1/2 −1/2 Eeq | log δ(ε)| + c1 δ (ε) + K ε−1/2 | log δ(ε)|1/2 , ε | Err 1 | ≤ c1 K ε δ(ε) + K ε (8.5)

which goes to 0 in the small ε limit.

Step 2: Recall that the summation is over distinct i and j by our overall convention. However, one can readily check that if we allow i = j in the summation, then the discrepancy is of order $O(K_\varepsilon^{-1/2})$. Also, if we replace $\hat J$ with $\tilde J$, the error is of order $O(\tau(\varepsilon))$. The sum of these two errors is denoted by $\mathrm{Err}_2$, and we have
\[
\mathbb{E}^{\mathrm{eq}}_\varepsilon |\mathrm{Err}_2| \le c_1 \big( K_\varepsilon^{-1/2} + \tau(\varepsilon) \big), \tag{8.6}
\]


which goes to 0 in small ε limit. Because of the form of J˜, we may write Q ε (ω) = Q 1ε + Q 2ε − Q 3ε − Q 4ε + Err2 , where for example Q 4ε (ω), given by

1 K ε−3/2 α(m i , m j )W¯ ε (xi − x j + z 2 − z 1 ; m i , m j ) 2 i, j

J (x j , m j )ζ δ (ε) (z 1 )ζ δ (ε) (z 2 )dz 1 dz 2 , with the summation over all i and j. We make a change of variables xi − z 1 = a1 , x j − z 2 = a2 to write that Q 4ε (ω) equals

1 −3/2 Kε α(m i , m j )W¯ ε (a1 − a2 ; m i , m j ) 2 i, j

J (x j , m j )ζ δ (ε) (xi − a1 )ζ δ (ε) (x j − a2 )da1 da2 = K ε1/2 α(m, n)W¯ ε (a1 − a2 ; m, n) f ε (a1 , m; ω) f ε (a2 , n; ω; J )da1 da2 , m,n

where f ε (a, m; ω) = K ε−1

ζ δ (ε) (xi − a)11(m i = m),

i

ε

f (a, m; ω; J ) =

K ε−1

ζ δ (ε) (xi − a)J (xi , m)11(m i = m).

i 42 We then have that Q 4ε (ω) = Q 41 ε (ω) + Q ε (ω) + Err 3 , where

1 Q 41 (ω) = α(m, n)W¯ ε (a1 − a2 , m, n)λm f ε (a2 , n; ω; J )da1 da2 , K ε1/2 ε 2 m,n

1 Q 42 α(m, n)W¯ ε (a1 − a2 ; m, n)λn f ε (a1 , m; ω) J¯ε (a2 , n)da1 da2 , K ε1/2 ε (ω) = 2 m,n

where J¯ε (a, n) = ζ δ (ε) (x − a)J (x, n)d x and Err 3 is given by

1 α(m, n)W¯ ε (a1 − a2 ; m, n) K ε1/2 2 m,n ( f ε (a1 , m; ω) − λm )( f ε (a2 , n; ω; J ) − λn J¯ε (a2 , m))da1 da2

1 = α(m i , m j )W¯ ε (a1 − a2 ; m i , m j ) K ε−3/2 2 i, j

(xi − a1 ) − λm i )(ζ δ (ε) (x j − a2 )J (x j , m j ) − λm j J¯ε (a2 , m j ))da1 da2 . Here we have used the assumption J = 0. We wish to show that Err 3 is small. We first observe that we can write Err3 = Err31 + Err32 , where Err31 is what we obtain

(ζ

δ (ε)


by restricting the summation to indices i = j, and Err32 corresponds to the case i = j. −1/2 It is not hard to show that Err32 is of order O(K ε ). On the other hand, 2 −1 −2 Eeq ε Err 31 ≤ c1 K ε δ (ε) .

(8.7)

To see this, observe that Err 231 equals

1 α(m i , m j )α(m p , m q ) da1 da2 db1 db2 K ε−3 2 i, j, p,q (a1 − a2 ; m i , m j )W¯ ε (b1

W¯

ε

(ζ

δ (ε)

(xi − a1 ) − λm i )(ζ

(ζ

δ (ε)

(x j − a2 )J (x j , m j ) − λm j J¯ε (a2 , m j ))

− b2 ; m p , m q )

δ (ε)

(x p − b1 ) − λm p )

(ζ δ (ε) (xq − b2 )J (xq , m q ) − λm q J¯ε (b2 , m q )) =: E 1 + E 2 + E 3 , where E s represents the above summation with (i, j, p, q) ∈ I (s) with I (1) corresponding to the cases i = p, q or p = i, j or j = p, q or q = i, j, I (2) corresponds to the case i = p and q = j, and I (3) corresponding to the case i = q and p = j. (Recall that the summation in our expression for Err 231 is over i = j and q = p.) We can readily check Eeq ε E 1 = 0.

(8.8)

eq Eε E 2

equals On the other hand

1 −1 Kε W¯ ε (a1 − a2 ; m, n)W¯ ε (b1 − b2 ; m, n)α(m, n)2 (λm γ δ (ε) (a1 − b1 ) − λ2m ) 2 m,n (λn ζ δ (ε) (y − a2 )ζ δ (ε) (y − b2 )J 2 (y, n) − λ2n J¯ε (a2 , n) J¯ε (b2 , n))dyda1 da2 db1 db2 1 = (E 21 + E 22 + E 23 + E 24 ), 2 where γ δ (ε) (a) = δ (ε)−2 γ (a/δ (ε)), for γ (a) = ζ (a + b)ζ (b)db, and E 2r for r = 1, . . . , 4, are given by

−1 E 21 = K ε λm λn W¯ ε (a1 − a2 ; m, n)W¯ ε (b1 − b2 ; m, n)α(m, n)2 γ δ (ε) (a1 − b1 )

m,n

(y − a2 )ζ δ (ε) (y − b2 )J 2 (y, n)dyda1 da2 db1 db2 ,

= K ε−1 λ2m λn W¯ ε (a1 − a2 ; m, n)W¯ ε (b1 − b2 ; m, n)α(m, n)2 ζ

E 22

δ (ε)

m,n δ (ε)

E 23

(y − a2 )ζ δ (ε) (y − b2 )J 2 (y, n)dyda1 da2 db1 db2 ,

= K ε−1 λm λ2n W¯ ε (a1 − a2 ; m, n)W¯ ε (b1 − b2 ; m, n)α(m, n)2 γ δ (ε) (a1 − b1 )

E 24

J¯ε (a2 , n) J¯ε (b2 , n)dyda1 da2 db1 db2 ,

−1 2 2 = Kε λm λn W¯ ε (a1 − a2 ; m, n)W¯ ε (b1 − b2 ; m, n)α(m, n)2

ζ

m,n

m,n

J (a2 , n) J¯ε (b2 , n)dyda1 da2 db1 db2 . ¯ε


We can readily see |E 22 | + |E 23 | + |E 24 | ≤ c1 K ε−1 , for a constant c1 . As for E 21 we have, E 21 ≤ c2 K ε−1 δ (ε)

−2

λm λn

W¯ ε (a1 − a2 ; m, n)W¯ ε

m,n

(b1 − b2 ; m, n)α(m, n)2 γ δ (ε) (a1 − b1 )

ζ δ (ε) (y − a2 )J 2 (y, n)dyda1 da2 db1 db2 ≤ c3 K ε−1 δ (ε)

−2

. −1/2

δ (ε)−2 for a constant c5 . Hence Eε E 2 ≤ c4 K ε−1 δ (ε)−2 . Similarly Eε E 3 ≤ c2 K ε This and (8.8) yield (8.7). Step Note that αˆ = α limε→0 W ε (this was proved in [HR1] as Theorem 3.2), 3: and W¯ ε = W ε . Hence

1 Q 41 (ω) = α(m, ˆ n)λm f ε (a2 , n; ω, J )da2 + Err4 K ε1/2 ε 2 m,n 1 −1/2 = Kε α(m, ˆ n)λm J (x j , n)11(m j = n) + Err41 , (8.9) 2 m,n eq

eq

j

where Err4 is the error we get by replacing W¯ ε = W ε with its limit as ε → 0. Since ⎤ ⎡ ⎣ K ε−1/2 Eeq α(m, ˆ n)λm J (x j , n)11(m j = n)⎦ ≤ c1 , ε j

m,n

for a constant $c_1$ independent of ε, we deduce
\[
\lim_{\varepsilon \to 0} \mathbb{E}^{\mathrm{eq}}_\varepsilon |\mathrm{Err}_{41}|^2 = 0. \tag{8.10}
\]

Moreover, since J¯ε (a, n) = λn J (a, n) + O(δ (ε)), we have that Q 42 ε (ω) equals

1 α(m, n)W¯ ε (a1 − a2 ; m, n)J (a2 , m) f δ (a1 , n; ω)da1 da2 + O(δ) K ε1/2 2 m,n

1 = α(m, ˆ n)λn J (a1 , n) f δ (a1 , m; ω)da1 + Err 42 (8.11) K ε1/2 2 m,n

1 = α(m, ˆ n)λn J (xi , n)11(m i = m) + Err 42 K ε−1/2 2 m,n i

with
\[
\lim_{\varepsilon \to 0} \mathbb{E}^{\mathrm{eq}}_\varepsilon |\mathrm{Err}_{42}|^2 = 0. \tag{8.12}
\]

The terms $Q^{j}_\varepsilon$ for j = 1, 2, 3 can be treated likewise. From (8.6), (8.7), (8.9), (8.10), (8.11), (8.12) and (8.2) we deduce

t

t

t Q δε (ω(s))ds = c ξ(s, J )ds + Err 5 ds, (8.13) 0

0

0

with
\[
\lim_{\varepsilon \to 0} \mathbb{E}^{\mathrm{eq}}_\varepsilon |\mathrm{Err}_5| = 0. \tag{8.14}
\]

0 . Recall Step 4: We now study the term H33

0 (ω) = H33

m i −1 1 −3/2 Kε β(m, m i − m) 2 i m=1

V ε (xi − y; m, m i − m)u ε (xi − y; m, m i − m) Jˆ(xi , m, y, m i − m)dy.

As we discussed in Step 1, we have replaced Jˆ with J˜ for an error which vanishes in small ε limit. Moreover J˜(xi , m, y, m i − m) = J˜(xi , m, xi , m i − m) + O(r (m, m i − m)ε), 0 (ω) + A X equals whenever V ε (xi − y; m, m i − m) = 0. Hence −H33 f ε m i −1 −1 −1/2 K β(m, m i − m) 2 ε i m=1

× W ε (xi − y; m, m i − m) J˜(xi , m, xi , m i − m)dy + Err6

=

m i −1 −1 −1/2 ˆ β(m, m i − m) J˜(xi , m, xi , m i − m) + Err 7 Kε 2 i

m=1

= f (ξ, J ) + Err 7 ,

with
\[
\lim_{\varepsilon \to 0} \mathbb{E}^{\mathrm{eq}}_\varepsilon |\mathrm{Err}_7| = 0. \tag{8.15}
\]

This is proved as in the previous step. For example, we have used

i −1 m −1/2 K β(m, m − m) W ε (xi − y; m, m i − m) Eeq i ε ε i

m=1

∇ Jˆ(xi , m, xi + θ (y − xi ), m i − m) · (y − xi )dθ dy

2

= O(ε2 ),

where ∇ denotes the derivative with respect to the second spatial variable.


Final Step: From (8.4-6) and (8.13-15) we deduce that the martingale M¯ ε (t) satisfies M¯ ε (t) = ξ (t, J ) − ξ (0, J ) − = ξ (t, J ) − ξ (0, J ) −

t

0

t

ξ(s, J )ds +

Err 8 (ω(s))ds

0

t

ξ (s, J )ds +

0

t

Err 9 (ω(s))ds

0

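\[
\lim_{\varepsilon \to 0} \mathbb{E}^{\mathrm{eq}}_\varepsilon |\mathrm{Err}_8(\omega(0))| = \lim_{\varepsilon \to 0} \mathbb{E}^{\mathrm{eq}}_\varepsilon |\mathrm{Err}_9(\omega(0))| = 0. \tag{8.16}
\]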

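Proof of (8.4). Step 1: Define $\hat G_\varepsilon(\omega) = \int G(\omega; z)\, \zeta^{\delta(\varepsilon)}(z)\, dz$ and let us write $\hat X_\varepsilon(\omega) = X_\varepsilon(\omega) + \frac{1}{2} \hat G_\varepsilon(\omega)$. Set
\[
\hat M_\varepsilon(t) := \hat X_\varepsilon(\omega(t)) - \hat X_\varepsilon(\omega(0)) - \int_0^t A \hat X_\varepsilon(\omega(s))\, ds.
\]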

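Note that $\hat M_\varepsilon(t) = \bar M_\varepsilon(t) + M_t$, where $M_t$ was defined in Sect. 7. By (7.13),
\[
\lim_{\varepsilon \to 0} \mathbb{E}^{\mathrm{eq}}_\varepsilon \big[ \bar M_\varepsilon(t) - \hat M_\varepsilon(t) \big]^2 = 0. \tag{8.17}
\]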

As is well known for Markov processes, the following process is also a martingale:
\[
\hat N_\varepsilon(t) = \hat M_\varepsilon(t)^2 - \int_0^t \big( A (\hat X_\varepsilon)^2 - 2 \hat X_\varepsilon A \hat X_\varepsilon \big)(\omega(s))\, ds.
\]
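For a Markov generator, the integrand obtained by applying the generator to the square of a function and subtracting twice the function times its image is the carré du champ; for a pure jump process it equals the rate-weighted sum of squared increments. The following finite-state sketch verifies that identity. The three-state rate matrix and the function values are made-up illustrations, not the particle system of this paper.

```python
def apply_generator(Q, f):
    """(Qf)(x) = sum_y Q[x][y] * f[y], for a rate matrix Q whose rows sum to zero."""
    return [sum(q * fy for q, fy in zip(row, f)) for row in Q]


def carre_du_champ(Q, f):
    """Q(f^2) - 2 f Qf, evaluated at every state."""
    f_sq = [v * v for v in f]
    Qf = apply_generator(Q, f)
    Qf_sq = apply_generator(Q, f_sq)
    return [Qf_sq[x] - 2.0 * f[x] * Qf[x] for x in range(len(f))]


def squared_increment_rates(Q, f):
    """sum over y != x of Q[x][y] * (f[y] - f[x])^2, evaluated at every state."""
    n = len(f)
    return [
        sum(Q[x][y] * (f[y] - f[x]) ** 2 for y in range(n) if y != x)
        for x in range(n)
    ]


# Hypothetical example data: off-diagonal entries are jump rates, rows sum to zero.
Q_EXAMPLE = [
    [-3.0, 1.0, 2.0],
    [0.5, -0.5, 0.0],
    [4.0, 1.0, -5.0],
]
F_EXAMPLE = [1.0, -2.0, 0.5]
```

The two functions agree state by state because the rows of the rate matrix sum to zero, so the term f(x)^2 times the row sum drops out when the square is expanded.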

We certainly have A := A( Xˆ ε )2 − 2 Xˆ ε A Xˆ ε = A0 + Ac + A f ,

(8.18)

with A0 = 2K ε−1

d(m i ) |∇x J (xi , m i ) + Bi (ω)|2 ,

i

where Bi (ω) =

1 −1 K ε ∇xi u¯ ε (xi − x j ; m i , m j ) Jˆ(xi , m i , x j , m j ), 2 j

1 + K ε−1 ∇xi u¯ ε (x j − xi ; m j , m i ) Jˆ(x j , m j , xi , m i ), 2 j

with u¯ ε = u˜ ε − u ε , where u˜ ε (a; m, n) = u ε (a + z; m, n)ζ δ(ε) (z)dz. The exact form of Ac and A f will be given in Steps 2 and 4 respectively. We have Bi = Bi1 + Bi2 , where


Bi1 (ω) and Bi2 (ω) are given by 1 −1 ε u¯ x (xi − x j ; m i , m j ) Jˆ(x j , m j , xi , m i ) K 2 ε j

−u¯ εx (x j − xi ; m j , m i ) Jˆ(xi , m i , x j , m j ) ,

1 −1 ε u¯ (xi − x j ; m i , m j ) Jˆx (xi , m i , x j , m j ) K 2 ε j

−u¯ ε (x j − xi ; m j , m i ) Jˆy (x j , m j , xi , m i ) ,

respectively. We have A0 = 2 A00 + 2 A01 + 2 A02 , with A00 = K ε−1

d(m i )|∇x J (xi , m i )|2 ,

i

A01 =

2K ε−1

d(m i )(Bi1 (ω) + Bi2 (ω)) · ∇x J (xi , m i ),

i

A02 = K ε−1

d(m i ) (Bi1 (ω) + Bi2 (ω))2 .

i

We first show that the term A01 is small. Note that |A01 | ≤ A011 + A012 , for d(m i )|Bir (ω)|, A01r = c1 K ε−1 i

with c1 = 2∇ J ∞ . By Lemma 6.1,

ε λm λn |∇ u¯ ε (a; m, n)|da ≤ c3 δ(ε) log δ(ε), Eeq A011 ≤ c2 m,n

Eεeq A012

≤ c2

λm λn

m,n

|a|≤1

|u¯ ε (a; m, n)|da ≤ c3 δ(ε),

as in the proof of (5.5) and (5.7). As a result, |A01 | ≤ 2c3 δ(ε) log δ(ε). We now turn to A02 . We may write A02 = A021 + A022 + A023 , with 2 d(m i )Bi1 , A021 = K ε−1 i

A022 =

K ε−1

A023 =

2K ε−1

2 d(m i )Bi2 ,

i

i

d(m i )Bi1 Bi2 .

(8.19)


We now use Lemma 6.2 to show that A022 and A023 are small. Indeed after squaring, Eεeq A022 ≤ c4 Eεeq K ε−3 d(m i )|u¯ ε (xi − x j ; m i , m j )| |u¯ ε (xi − xk ; m i , m k )| i, j,k

= A0221 + A0222 , where A0221 and A0222 correspond to the cases k = j and k = j respectively. By Lemma 6.2 and (3.10),

ε ε Eeq A0221 ≤ c5 λm λn λ p |u¯ (a; m, n)|da |u¯ ε (a; m, p)|da ≤ c6 δ(ε)2 , |a|≤1

m,n, p

Eεeq A0222 ≤ c5 K ε−1

|a|≤1

λm λn

m,n

In the same fashion, Eεeq A023 ≤ c7 Eεeq K ε−3

|a|≤1

|u¯ ε (a; m, n)|2 da ≤ c6 K ε−1 .

d(m i )|u¯ ε (xi − x j ; m i , m j )| |∇ u¯ ε (xi − xk ; m i , m k )|

i, j,k

= A0231 + A0232 , where A0231 and A0232 correspond to the cases k = j and k = j respectively. By Lemma 6.2 and (3.10),

λm λn λ p |u¯ ε (a; m, n)|da Eεeq A0231 ≤ c8 |a|≤1

m,n, p

×

|a|≤1

Eεeq A0232 ≤ c10 K ε−1

≤

|a|≤1

c10 K ε−1 ×

λm λn

m,n

×

|∇ u¯ ε (a; m, p)|da ≤ c9 δ(ε)2 | log δ(ε)|,

|u¯ ε (a; m, n)||∇ u¯ ε (a; m, n)|da m,n

|a|≤1

λm λn

|a|≤1

1/2

ε

|u¯ (a; m, n)| da

|∇ u¯ ε (a; m, n)|2 da

2

1/2

≤ c11 K ε−1 δ(ε)K ε1/2 K ε1/2 = c11 δ(ε). As a result,

|A022 + A023 | ≤ c12 δ(ε) + K ε−1 .

(8.20)

We now concentrate on A021 . First observe that since V and ζ are symmetric, we learn that u¯ ε is symmetric. From this and symmetry of J˜ and K we learn Bi1 = K ε−1 u¯ εx (xi − x j ; m i , m j ) Jˆ(xi , m i , x j , m j ). j


After squaring, we obtain A021 = A0211 + A0212 , where A0211 = 2K ε−3 d(m i )u¯ εx (xi − x j ; m i , m j ) · u¯ εx (xi − xk ; m i , m k ) i, j,k

A0212

Jˆ(xi , m i , x j , m j ) Jˆ(xi , m i , xk , m k ), = K ε−3 d(m i )|u¯ εx (xi − x j ; m i , m j )|2 Jˆ(xi , m i , x j , m j )2 , i, j

where j = k in A0211 . Using Lemma 6.1 we deduce

Eεeq |A0211 | ≤ c1 λm λn λ p |∇ u¯ ε (a; m, n)|da m,n, p

×

|a|≤1

|a|≤1

|∇ u¯ ε (a; m, p)|da ≤ c2 δ(ε)2 (log δ(ε))2 .

As for A0212 , we first write, A0212 = A02121 + A02122 + A02123 , where A02121 = K ε−3 d(m i )|u εx (xi − x j ; m i , m j )|2 Jˆ(xi , m i , x j , m j )2 , i, j

A02122 = K ε−3

i, j

A02123 =

−2K ε−3

d(m i )|u˜ εx (xi − x j ; m i , m j )|2 Jˆ(xi , m i , x j , m j )2 ,

d(m i ) u˜ εx · u εx (xi − x j ; m i , m j ) Jˆ(xi , m i , x j , m j )2 .

i, j

We now argue that A02122 and A02123 are small. To see this, first observe that by Lemma 6.2, ε ε δ(ε) ∇u (a + z; m, n)ζ (z)dz |∇ u˜ (a; m, n)| =

−2 ≤ c3 δ(ε) |∇u ε (a + z; m, n)|dz |z|≤c3 δ(ε)

−2 |a + z|−1 dz ≤ c4 δ(ε) α (m, n) |z|≤c3 δ(ε) ≤ c5 α (m, n) min |a|−1 , δ(ε)−1 . As a result,

|a|≤1

|a|≤1

|∇ u˜ ε (a; m, n)|2 da ≤ c6 α (m, n)2 | log δ(ε)|,

|∇ u˜ ε (a; m, n) · ∇u ε (a; m, n)|da ≤ c6 α (m, n)2 | log δ(ε)|1/2 | log ε|1/2 .
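The logarithmic factor in the first of these bounds can be traced to integrating the square of min{|a|^{-1}, δ^{-1}} over the unit disc in the plane. As a sketch under that reading (the helper names and the midpoint quadrature are illustrative choices, and the constants and the α factor are ignored), one can compare a numerical radial integral with the closed form π(1 + 2 log(1/δ)), which grows like |log δ| as δ tends to 0.

```python
import math


def closed_form(delta):
    """Integral of min(1/|a|, 1/delta)^2 over |a| <= 1 in the plane, for 0 < delta <= 1.

    The disc |a| <= delta contributes pi; the annulus delta <= |a| <= 1
    contributes 2*pi*log(1/delta).
    """
    return math.pi * (1.0 + 2.0 * math.log(1.0 / delta))


def radial_quadrature(delta, n=200_000):
    """Midpoint rule in polar coordinates for the same integral.

    Approximates the integral over r in [0, 1] of min(1/r, 1/delta)^2 * 2*pi*r dr.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        total += min(1.0 / r, 1.0 / delta) ** 2 * 2.0 * math.pi * r
    return total * h
```

Halving δ adds 2π log 2 to the integral, which is the planar mechanism behind the |log δ(ε)| factors.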

From this we learn |A02122 | ≤ c7 α (m, n)2

| log δ(ε)| | log δ(ε)|1/2 , |A02123 | ≤ c7 α (m, n)2 . | log ε| | log ε|1/2

(8.21)


From (8.18)-(8.21) we deduce that A0 = 2 A00 + 2 A02121 + Err1 , with −1/2 | log δ(ε)|1/2 . Eeq ε | Err 1 | ≤ c δ(ε)| log δ(ε)| + | log ε|


(8.22)

Furthermore, if we pick a small δ > 0 and write A02121 = A021211 + A021212 , with A021211 = K ε−3 d(m i )|u εx (xi − x j ; m i , m j )|2 Jˆ(xi , m i , x j , m j )2 11(|xi − x j | ≤ δ), i, j

A021212 =

K ε−3

d(m i )|u εx (xi − x j ; m i , m j )|2 Jˆ(xi , m i , x j , m j )2 11(|xi − x j | > δ),

i, j

then we have that Eε |A021212 | ≤ K ε−1 δ −2 , and A021211 = A0212111 + A0212112 , where in the first term we replace the x j argument of Jˆ with xi , and the second term is the error caused by such a replacement. More precisely, A0212111 = K ε−3 d(m i )|u εx (xi − x j ; m i , m j )|2 eq

i, j

J˜(xi , m i , xi , m j )2 11(|xi − x j | ≤ δ),

−1 Eeq |A | ≤ K δc λ λ |∇u ε (a; m, n)|2 da ≤ c2 δ, 0212112 1 m n ε ε |a|≤δ

m,n

where we used the smoothness of J for the first inequality and (6.8) for the second inequality. Now that we have replaced x j with xi in Jˆ, we can drop the condition |xi − x j | ≤ δ. Indeed A0212111 = A02121111 + A02121112 , with d(m i )|∇u ε (xi − x j ; m i , m j )|2 A02121111 = K ε−3 i, j

J˜(xi , m i , xi , m j )2 11(|xi − x j | ≤ 1), −1 −2 δ . Eeq ε |A02121112 | ≤ c3 K

We now choose δ = | log ε|−1/3 . In summary, A02121 = A02121111 + Err2 , where −1/3 Eeq . ε |Err2 | ≤ c4 | log ε|

On the other hand, by Lemma 5.1, t 2 A00 (ω(s))ds − t A0 (J ) = 0, lim Eeq ε ε→0 0 t 2 A02121111 (ω(s))ds − t A (J ) = 0, lim Eeq ε 0 ε→0

(8.24)

0

where A0 (J ) = 2

d(n)λn

|∇ Jn |2 d x,

n

A0 (J ) =

(8.23)

m,n

λm λn d(m)γ (m, n)

|Jm+n − Jn − Jm |2 d x,

(8.25)


with γ (m, n) = lim | log ε|−1 ε→0

|a|≤1

|∇u ε (a; m, n)|2 da.

Step 2: We now study Ac . Recall that u¯ ε = u˜ ε − u ε . We have ! 02 8 1 −1 α(m i , m j )Vε (xi − x j ; m i , m j ) S(i, j, ω) + Rr (i, j, ω) , Ac (ω) = K ε 2 r =0

i, j

where S(i, j, ω) = J˜(xi , m i , x j , m j ) + Jˆ(xi , m i , x j , m j )K ε−1 u ε (xi − x j ; m i , m j ), R0 (i, j, ω) = −K ε−1 Jˆ(xi , m i , x j , m j )u˜ ε (xi − x j ; m i , m j ), mi u¯ ε (xi − xk ; m i + m j , m k ) Jˆ(xi , m i + m j , xk , m k ), R1 (i, j, ω) = K ε−1 mi + m j k mi −1 R2 (i, j, ω) = K ε u¯ ε (xk − xi ; m k , m i + m j ) Jˆ(xk , m k , xi , m i + m j ), mi + m j k mj −1 R3 (i, j, ω) = K ε u¯ ε (x j − xk ; m i + m j , m k ) Jˆ(x j , m i + m j , xk , m k ), mi + m j k mj −1 R4 (i, j, ω) = K ε u¯ ε (xk − x j ; m k , m i + m j ) Jˆ(xk , m k , x j , m i + m j ), mi + m j k −1 u¯ ε (xi − xk ; m i , m k ) Jˆ(xi , m i , xk , m k ), R5 (i, j, ω) = −K ε k

R6 (i, j, ω) =

−K ε−1

R7 (i, j, ω) =

−K ε−1

R8 (i, j, ω) =

−K ε−1

u¯ ε (xk − xi ; m k , m i ) Jˆ(xk , m k , xi , m i ),

k

u¯ ε (x j − xk ; m j , m k ) Jˆ(x j , m j , xk , m k ),

k

u¯ ε (xk − x j ; m k , m j ) Jˆ(xk , m k , x j , m j ),

k

where the summation is over k with k = i, j. Let us write T (i, j, ω) = We then write

8

r =0

Rr (i, j, ω).

Ac (ω) = Ac0 (ω) + Ac1 (ω) + Ac2 (ω) with 1 −1 K α(m i , m j )Vε (xi − x j ; m i , m j )S(i, j, ω)2 , 2 ε i, j Ac1 (ω) = K ε−1 α(m i , m j )Vε (xi − x j ; m i , m j )S(i, j, ω)T (i, j, ω),

Ac0 (ω) =

i, j

1 Ac2 (ω) = K ε−1 α(m i , m j )Vε (xi − x j ; m i , m j )T (i, j, ω)2 . 2 i, j

(8.26)


Our goal is showing that both Ac1 and Ac2 are small. We start with the latter. We use the 2 8 ≤ 9 80 Rr2 , to assert that Ac2 ≤ 9 80 Ac2r . We only bound Ac20 inequality 0 Rr and Ac25 as the remaining Ac2r are similar to Ac25 . To bound Ac25 , we simply square out the summation in k to obtain Ac25 =

1 −3 K α(m i , m j )Vε (xi − x j ; m i , m j ) 2 ε i, j,k,l

ε

u¯ (xi − xk ; m i , m k )u¯ ε (xi − xl ; m i , m l ) Jˆ(xi , m i , xk , m k ) Jˆ(xi , m i , xl , m l ) =: Ac251 + Ac252 , where Ac251 and Ac252 represent the cases k = l and k = l respectively. We then have

ε Eeq A ≤ c α(m, n)λ λ λ λ | u ¯ (a; m, p)|da |u¯ ε (a; m, q)|da 1 m n p q c251 ε |a|≤1

m,n, p,q

≤ c2 δ(ε)

|a|≤1

2

by (6.5). In the same fashion, we may use (6.7) to show that Eε Ac252 ≤ c2 K ε−1 . Treating other Ac2r , r = 1, . . . , 8 in the same way yields eq

Eeq ε

8 r =1

Ac2r ≤ c3 δ(ε)2 + K ε−1 .

(8.27)

eq

As for Ac20 we have that Eε Ac20 is bounded by 1 eq −3 E K α(m i , m j )Vε (xi − x j ; m i , m j )u˜ ε (xi − x j ; m i , m j )2 Jˆ(xi , m i , x j , m j )2 2 ε ε i, j

−1 ≤ c4 K ε α(m, n)λm λn Vε (a; m, n)u˜ ε (a; m, n)2 da. m,n

If we restrict the summation to those m and n such that r (m, n)ε ≥ δ(ε), then we simply use u˜ ε ≤ c1 K ε to show that the sum is bounded by a constant multiple of τ (ε), which is small by our assumption (3.9). On the other hand, when Vε (a; m, n) = 0 and r (m, n)ε ≤ δ(ε), ε ε δ(ε) u (a + z; m, n)ζ (z)dz |u˜ (a; m, n)| =

≤ c5 δ(ε)−2 |u ε (a + z; m, n)|dz |z|≤c5 δ(ε)

−2 | log |z||dz ≤ c6 δ(ε) α (m, n)

|z|≤c5 δ(ε)+c6 εr (m,n)

≤ c7 α (m, n)| log δ(ε)|.

(8.28)

Hence −2 2 Eeq ε Ac20 ≤ c8 | log ε| | log δ(ε)| + c8 τ (ε).

(8.29)


From this and (8.27) we deduce 2 −1 −2 2 δ(ε) A ≤ c + | log ε| + | log ε| | log δ(ε)| + τ (ε) . Eeq c2 9 ε

(8.30)

Step 3: We now turn to Ac1 . By Lemma 6.2, the expression K ε−1 u ε (a; m, n) is uniformly bounded whenever Vε (a; m, n) = 0. Hence |Ac1 | ≤ Ac1 = c1 K ε−2 α(m i , m j )Vε (xi − x j ; m i , m j )|T (i, j, ω)|. i, j,k

Again using the decomposition of T , we write Ac1 = r8=0 Ac1r , with for example α(m i , m j )Vε (xi − x j ; m i , m j ) Ac15 (ω) = c1 K ε−2 i, j,k

ε u¯ (xi − xk ; m i , m k ) Jˆ(xi , m i , xk , m k ) . By (6.5), Eeq ε Ac15

≤ c2

α(m, n)λm λn λ p

|a|≤1

m,n, p

|u¯ ε (a; m, p)|da ≤ c2 δ(ε).

(8.31)

Similarly Ac10 (ω) = c1 K ε−2

α(m i , m j )Vε (xi − x j ; m i , m j )

i, j

ε u˜ (xi − x j ; m i , m j ) Jˆ(xi , x j , m i , m j ) , which yields Eeq ε Ac10 (ω)

≤ c3

α(m, n)λm λn

m,n

|a|≤1

Vε (a; m, n)|u˜ ε (a; m, n)|da.

Again using (8.28) we obtain −1 Eeq ε Ac10 (ω) ≤ c4 | log ε| | log δ(ε)| + c4 τ (ε),

in the same way we obtained (8.28). From this and (8.30) we deduce −1 Eeq ε Ac1 ≤ c5 δ(ε) + | log ε| | log δ(ε)| + τ (ε) . From this, (8.26) and (8.30) we deduce

t

Ac (ω(s))ds = 0

with

t

Ac0 (ω(s))ds + Err 2 ,

0

−1 Eeq + | log ε|−1 | log δ + τ (ε)| . ε | Err 2 | ≤ c δ(ε) + | log ε|

(8.32)


On the other hand, by the Law of Large Numbers, t = 0, lim Eeq A (ω(s))ds − t A (J ) c0 c ε ε→0

where Ac (J ) =

1 2


(8.33)

0

α(m, ˜ n)λn λm (Jn+m − Jn − Jm )2 d x

m,n

and α(m, ˜ n) = α(m, n)η(m, n)2 = α(m, ˆ n)η(m, n), where η = α/α. ˆ Here we have used limε K ε−1 u ε = η − 1 uniformly in the support of Vε . (See Theorem 3.2 of [HR].) Step 4: We now concentrate on A f . We have A f (ω) =

m i −1 1 −1 Kε β(m, m i − m) V ε (xi − y; m i − m, m) 2 i m=1 2 4 Rr (i; y, m; ω) dy, S(i; y, m; ω) + r =0

where S(i; y, m; ω) = J (xi , m) + J (y, m i − m) − J (xi , m i ) −K ε−1 u ε (xi − y; m, m i − m) Jˆ(xi , m, y, m i − m), R0 (i; y, m; ω) = K ε−1 u˜ ε (xi − y; m, m i − m) Jˆ(xi , m, y, m i − m), R1 (i; y, m; ω) = K ε−1 [u¯ ε (xi − x j ; m, m j ) Jˆ(xi , m, x j , m j ) j

−u¯ ε (xi − x j ; m i , m j ) Jˆ(xi , m i , x j , m j )], R2 (i; y, m; ω) = K ε−1 [u¯ ε (xi − x j ; m i , m) Jˆ(xi , m i , x j , m) j

−u¯ (xi − x j ; m i , m j ) Jˆ(xi , m i , x j , m j )], R3 (i; y, m; ω) = K ε−1 u¯ ε (y − x j ; m, m j ) Jˆ(y, m, x j , m j ), ε

j

R4 (i; y, m; ω) = K ε−1

u¯ ε (x j − y; m j , m) Jˆ(x j , m j , y, m).

j

Let us write T =

4 0

Rr , and A f = A f 0 + A f 1 + A f 2,

(8.34)


with Af0

m i −1 1 −1 = Kε β(m, m i − m) V ε (xi − y; m i − m, m)S(i; y, m; ω)2 dy, 2 m=1

i

Af1 =

K ε−1

i −1 m

Af2

i −1 m

We have that A f 2 ≤ A f 2r (ω) =

V ε (xi − y; m i − m, m)T (i; y, m; ω)2 dy.

β(m, m i − m)

m=1

i

K ε−1

V ε (xi − y; m i − m, m)(ST )(i; y, m; ω)dy,

β(m, m i − m)

m=1

i

1 = K ε−1 2

5 2

4 0

i −1 m

i

A f 2r with

V ε (xi − y; m i − m, m)Rr (i; y, m; ω)2 dy.

β(m, m i − m)

m=1

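We then use (8.27) to learn that $A_{f20}(\omega)$ is bounded above by

\[
\begin{aligned}
&c_1K_\varepsilon^{-3}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\int V^\varepsilon(x_i-y;m_i-m,m)\,\tilde u^\varepsilon(x_i-y;m,m_i-m)^2\,dy\\
&\qquad\le c_2K_\varepsilon^{-3}|\log\delta(\varepsilon)|^2\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\,\alpha(m,m_i-m)^2.
\end{aligned}
\]

We then use (3.10) to deduce

\[
\mathbb{E}^{eq}_{\varepsilon}A_{f20}(\omega)\le c_3|\log\varepsilon|^{-2}|\log\delta(\varepsilon)|^2. \tag{8.35}
\]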

On the other hand,

\[
A_{f21}(\omega)=K_\varepsilon^{-1}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\,R_1(i;y,m;\omega)^2
\]

is bounded above by $A_{f211}+A_{f212}$, with

\[
\begin{aligned}
A_{f211}&=2K_\varepsilon^{-3}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\Big[\sum_j\big|\bar u^\varepsilon(x_i-x_j;m,m_j)\hat J(x_i,m,x_j,m_j)\big|\Big]^2,\\
A_{f212}&=2K_\varepsilon^{-3}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\Big[\sum_j\big|\bar u^\varepsilon(x_i-x_j;m_i,m_j)\hat J(x_i,m_i,x_j,m_j)\big|\Big]^2.
\end{aligned}
\]

By squaring out the expression inside brackets, we can readily see

\[
A_{f211}+A_{f212}\le c_1\big(\delta(\varepsilon)^2+K_\varepsilon^{-1}\big)
\]

by Lemma 6.1. The term $A_{f22}$ is treated likewise. Hence,

\[
\mathbb{E}^{eq}_{\varepsilon}\big[A_{f21}(\omega)+A_{f22}(\omega)\big]\le c_2\big(\delta(\varepsilon)^2+K_\varepsilon^{-1}\big). \tag{8.36}
\]

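We now study $A_{f23}$:

\[
\begin{aligned}
A_{f23}(\omega)&=K_\varepsilon^{-3}\sum_{i,j,k}\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\int V^\varepsilon(x_i-y;m_i-m,m)\\
&\qquad\times\bar u^\varepsilon(y-x_j;m,m_j)\hat J(y,m,x_j,m_j)\,\bar u^\varepsilon(y-x_k;m,m_k)\hat J(y,m,x_k,m_k)\,dy\\
&=A_{f231}+A_{f232},
\end{aligned}
\]

with $A_{f231}$ and $A_{f232}$ corresponding to the cases $k=j$ and $k\neq j$. With the aid of (6.5) and (6.8), we can readily deduce

\[
\mathbb{E}^{eq}_{\varepsilon}A_{f231}\le c_3\delta(\varepsilon)^2,\qquad \mathbb{E}^{eq}_{\varepsilon}A_{f232}\le c_3|\log\varepsilon|^{-1}.
\]

The term $A_{f24}$ is treated likewise. In summary

\[
\mathbb{E}^{eq}_{\varepsilon}A_{f2}(\omega)\le c_4\big(\delta(\varepsilon)^2+|\log\varepsilon|^{-1}\big).
\]

We can readily bound $A_{f1}$ as in the previous step:

\[
\mathbb{E}^{eq}_{\varepsilon}A_{f1}(\omega)\le c_5\big(\delta(\varepsilon)+|\log\varepsilon|^{-1}|\log d(\varepsilon)+\tau(\varepsilon)|\big).
\]

From all this we conclude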

\[
\int_0^tA_f(\omega(s))\,ds=\int_0^tA_{f0}(\omega(s))\,ds+Err_3
\]

with $Err_3$ satisfying (8.32). On the other hand

\[
S(i;y,m;\omega)=-J(x_i,m,x_i,m_i-m)\big(1+K_\varepsilon^{-1}u^\varepsilon(x_i-y;m,m_i-m)\big)+O(\varepsilon r(m,m_i-m))
\]

whenever $V^\varepsilon(y-x_i;m,m_i-m)\neq 0$. From this and a law of large numbers, we can readily deduce

\[
\lim_{\varepsilon\to 0}\mathbb{E}^{eq}_{\varepsilon}\left|\int_0^tA_{f0}(\omega(s))\,ds-tA_f(J)\right|=0, \tag{8.37}
\]

where

\[
A_f(J)=\frac12\sum_{m,n}\int\tilde\beta(m,n)\lambda_{n+m}\,(J_{n+m}-J_n-J_m)^2\,dx,
\]

where $\tilde\beta(m,n)=\beta(m,n)\eta(m,n)^2=\hat\beta(m,n)\eta(m,n)$.


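Final Step: From (8.22), (8.23), (8.24), (8.32), (8.35) and (8.37) we learn that the process $\bar M_\varepsilon(t)^2-tA'(J)$ is a sum of a martingale and a small error, where

\[
\begin{aligned}
A'(J)&=2\sum_n\int\lambda_nd(n)|\nabla_xJ_n|^2\,dx+\sum_{n,m}\int\lambda_n\lambda_md(n)\gamma(n,m)\,(J_{n+m}-J_n-J_m)^2\,dx\\
&\quad+\frac12\sum_{m,n}\int\tilde\alpha(m,n)\lambda_n\lambda_m\,(J_{n+m}-J_n-J_m)^2\,dx\\
&\quad+\frac12\sum_{m,n}\int\tilde\beta(m,n)\lambda_{n+m}\,(J_{n+m}-J_n-J_m)^2\,dx.
\end{aligned}
\]

It remains to show that $A(J)=A'(J)$. Since

\[
\lambda_m\lambda_n\tilde\alpha(m,n)=\tilde\beta(m,n)\lambda_{n+m},\qquad\lambda_m\lambda_n\hat\alpha(m,n)=\hat\beta(m,n)\lambda_{n+m},
\]

it suffices to show

\[
\begin{aligned}
(d(m)+d(n))\gamma(m,n)&=\lim_{\varepsilon\to 0}(d(m)+d(n))K_\varepsilon^{-1}\int_{|a|\le 1}|\nabla u^\varepsilon(a;m,n)|^2\,da\\
&=\hat\alpha(m,n)-\tilde\alpha(m,n)=\alpha(m,n)\big(\eta(m,n)-\eta(m,n)^2\big).
\end{aligned} \tag{8.38}
\]

We have

\[
\begin{aligned}
(d(m)+d(n))K_\varepsilon^{-1}\int_{|a|\le 1}|\nabla u^\varepsilon(a;m,n)|^2\,da
&=-(d(m)+d(n))K_\varepsilon^{-1}\int_{|a|\le 1}u^\varepsilon(a;m,n)\,\Delta u^\varepsilon(a;m,n)\,da+O(K_\varepsilon^{-1})\\
&=-\alpha(m,n)\int_{|a|\le 1}V^\varepsilon(a;m,n)K_\varepsilon^{-1}u^\varepsilon(a;m,n)\big(1+K_\varepsilon^{-1}u^\varepsilon(a;m,n)\big)\,da+O(K_\varepsilon^{-1}),
\end{aligned}
\]

where we integrated by parts and used (4.11). This and $\lim_\varepsilon K_\varepsilon^{-1}u^\varepsilon=\eta-1$ (Theorem 3.2 of [HR1]) imply (8.38).

9. Proofs of Theorems 8.1 and 3.1

In this section, we complete the proof of Theorems 8.1 and 3.1. We first show that the process $\xi(t,J)$ is tight. More precisely,

Theorem 9.1. For every smooth function $J$ of compact support and positive $T$, there exists a constant $c(J,T)$ such that

\[
\limsup_{\varepsilon\to 0}\mathbb{E}^{eq}_{\varepsilon}\int_0^T\sup_{0\le h\le\delta}|\xi(t+h,J)-\xi(t,J)|\,dt\le c(J,T)\delta^{1/2}. \tag{9.1}
\]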


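Proof. Recall that by (8.16),

\[
\xi(t,J)-\xi(s,J)=\int_s^t\xi(\theta,J)\,d\theta+\bar M_\varepsilon(t)-\bar M_\varepsilon(s)-\int_s^tErr_8(\omega(\theta))\,d\theta, \tag{9.2}
\]

where,

\[
\limsup_{\varepsilon\to 0}\mathbb{E}^{eq}_{\varepsilon}\sup_{0\le t\le T}\left|\int_0^tErr_8(\omega(\theta))\,d\theta\right|\le T\limsup_{\varepsilon\to 0}\mathbb{E}^{eq}_{\varepsilon}|Err_8(\omega(0))|=0. \tag{9.3}
\]

On the other hand, since

\[
\sup_\varepsilon\mathbb{E}^{eq}_{\varepsilon}\xi(\theta,J)^2=\sup_\varepsilon\mathbb{E}^{eq}_{\varepsilon}\xi(0,J)^2<\infty,
\]

we can readily deduce

\[
\mathbb{E}^{eq}_{\varepsilon}\sup_{0\le s\le t\le s+\delta\le T}\left|\int_s^t\xi(\theta,J)\,d\theta\right|\le(T+\delta)^{1/2}\left[\mathbb{E}^{eq}_{\varepsilon}\int_s^t\xi(\theta,J)^2\,d\theta\right]^{1/2}\le c_1\delta^{1/2}. \tag{9.4}
\]

See Sect. 5 and (5.7) of [R] for more details. It remains to establish the tightness of the martingale $\bar M_\varepsilon$. For this, it suffices to show

\[
\limsup_{\varepsilon\to 0}\ \sup_{0\le t\le T,\ 0<h\le\delta}\mathbb{E}^{eq}_{\varepsilon}\big[\bar M_\varepsilon(t+h)-\bar M_\varepsilon(t)\big]^2\le c_0(T)\delta. \tag{9.5}
\]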

This is an immediate consequence of Doob's inequality and (8.5). From (9.2), (9.3) and (9.5) we deduce (9.1).

We are now ready to complete the proof of Theorems 3.1 and 8.1. Take a countable set $\{t_n:n\in\mathbb{N}\}$ that contains 0 and is dense in some fixed interval $[0,T]$. Write $\tilde P^\varepsilon$ for the law of

\[
\big((\xi(\cdot,J):J\in D_0),\ (\xi(t_n,J):J\in D_0,\ n\in\mathbb{N})\big)\in H\times\mathbb{R}^{D_0\times\mathbb{N}},
\]

with respect to the probability measure $P^{eq}_\varepsilon$. Using Theorem 9.1, we can readily show that the sequence $\tilde P^\varepsilon$ is tight. Let $\tilde P$ be a limit point of the sequence $\tilde P^\varepsilon$. Using Theorem 8.2, it is not hard to show that for every $J\in D_0$, the sequences

\[
\Big(M_J(t_n):=\xi(t_n,J)-\xi(0,J)-\int_0^{t_n}\xi(s,J)\,ds:n\in\mathbb{N}\Big),\qquad
\big(N_J(t_n):=M_J(t_n)^2-t_nA(J):n\in\mathbb{N}\big),
\]

are martingales with respect to the probability measure $\tilde P$. We now extend $M_J$ and $N_J$ to the whole interval $[0,T]$. More precisely, for $t\notin\{t_n:n\in\mathbb{N}\}$, define

\[
M_J(t):=\lim_{t_n\to t-}M_J(t_n),\qquad N_J(t):=\lim_{t_n\to t-}N_J(t_n),
\]

which exist almost surely by the Martingale Upcrossing Theorem. Here by $\lim_{t_n\to t-}$, we mean the limit with respect to a subsequence of $\{t_n\}$ which increases to $t$ from the left. As a result, $\bar\xi(t,J):=\lim_{t_n\to t-}\xi(t_n,J)$ also exists almost surely with respect to $\tilde P$. The process $\bar\xi(t,J)$ is a solution to (3.13) in the sense of Definition 8.1. This completes the proof of Theorem 3.1 because the set $\{t_n:n\in\mathbb{N}\}$ can be chosen to include any given finite collection of points.

To complete the proof of Theorem 8.1, it suffices to show that $\xi(t,J)=\bar\xi(t,J)$ almost everywhere. For this, let us assume that the set $\{t_n:n\in\mathbb{N}\}$ includes the points in the set $\{i/L:i\in\mathbb{N}\}\cap[0,T]$ for every positive integer $L$. Write $T_L(t)$ for $[tL]/L$, where $[tL]$ denotes the integer part of $tL$. From (9.1) we can readily deduce

\[
\int\!\!\int_0^T|\xi(t,J)-\xi(T_L(t),J)|\,dt\,d\tilde P\le c(J,T)L^{-1/2}.
\]

Since $\lim_{L\to\infty}\xi(T_L(t),J)=\bar\xi(t,J)$ by definition, we deduce that $\xi(t,J)=\bar\xi(t,J)$ for almost all $t$ and almost surely.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References

[A] Aldous, D.J.: Deterministic and stochastic models for coalescence (aggregation, coagulation): a review of the mean-field theory for probabilists. Bernoulli 5, 3–48 (1999)
[C] Chang, C.C.: Equilibrium fluctuations of gradient reversible particle systems. Prob. Th. Rel. Fields 100, 269–283 (1994)
[CY] Chang, C.C., Yau, H.T.: Fluctuations of one-dimensional Ginzburg-Landau models in nonequilibrium. Commun. Math. Phys. 145, 209–234 (1992)
[EK] Ethier, S.N., Kurtz, T.G.: Markov Processes. Characterization and Convergence. Wiley, New York (1986)
[HR1] Hammond, A.M., Rezakhanlou, F.: Kinetic limit for a system of coagulating planar Brownian particles. J. Stat. Phys. 123(2–4), 997–1040 (2006)
[HR2] Hammond, A.M., Rezakhanlou, F.: The kinetic limit of a system of coagulating Brownian particles. Arch. Rat. Mech. Anal. 185, 1–67 (2007)
[HR3] Hammond, A.M., Rezakhanlou, F.: Moment bounds for the Smoluchowski equation and their consequences. Commun. Math. Phys. 276, 645–670 (2007)
[HS] Holley, R.A., Stroock, D.W.: Generalized Ornstein-Uhlenbeck processes and infinite particle branching Brownian motions. Publ. Res. Inst. Math. Sci. 14, 741–788 (1978)
[KV] Kipnis, C., Varadhan, S.R.S.: Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusions. Comm. Math. Phys. 104, 1–19 (1986)
[LN] Lang, R., Nguyen, X.-X.: Smoluchowski's theory of coagulation in colloids holds rigorously in the Boltzmann-Grad limit. Z. Wahrsch. Verw. Gebiete 54, 227–280 (1980)
[R1] Rezakhanlou, F.: Equilibrium fluctuations for the discrete Boltzmann equation. Duke J. Math. 93(2), 257–288 (1998)
[R2] Rezakhanlou, F.: The coagulating Brownian particles and Smoluchowski's equation. Markov Process. Rel. Fields 12, 425–445 (2006)
[Sm] Smoluchowski, M.: Drei Vorträge über Diffusion, Brownsche Molekularbewegung und Koagulation von Kolloidteilchen. Phys. Z. XVII, 557–571, 585–599 (1916)
[Sz] Sznitman, A.S.: Propagation of chaos for a system of annihilating Brownian spheres. Comm. Pure Appl. Math. 40, 663–690 (1987)
[Sp] Spohn, H.: Large Scale Dynamics of Interacting Particles. Springer, Berlin-Heidelberg-New York (1991)
[HRY] Yaghouti, M., Rezakhanlou, F., Hammond, A.M.: Coagulation, diffusion and the continuous Smoluchowski equation. Stoch. Proc. Appl. 119, 3042–3080 (2009)

Communicated by H. Spohn

Commun. Math. Phys. 296, 827–860 (2010) Digital Object Identifier (DOI) 10.1007/s00220-009-0962-6

Communications in

Mathematical Physics

Highest Weight Modules Over Quantum Queer Superalgebra $U_q(\mathfrak{q}(n))$

Dimitar Grantcharov¹, Ji Hye Jung², Seok-Jin Kang², Myungho Kim²

1 Department of Mathematics, University of Texas at Arlington, Arlington, TX 76021, USA. E-mail: [email protected]

2 Department of Mathematical Sciences and Research Institute of Mathematics, Seoul National University, San 56-1 Sillim-dong, Gwanak-gu, Seoul 151-747, Korea. E-mail: [email protected]; [email protected]; [email protected]

Received: 1 June 2009 / Accepted: 27 August 2009
Published online: 20 December 2009 – © Springer-Verlag 2009

Abstract: In this paper, we investigate the structure of highest weight modules over the quantum queer superalgebra $U_q(\mathfrak{q}(n))$. The key ingredients are the triangular decomposition of $U_q(\mathfrak{q}(n))$ and the classification of finite dimensional irreducible modules over quantum Clifford superalgebras. The main results we prove are the classical limit theorem and the complete reducibility theorem for $U_q(\mathfrak{q}(n))$-modules in the category $\mathcal{O}_q^{\ge 0}$.

Introduction

Since its inception, the representation theory of Lie superalgebras has been known to be much more complicated than the corresponding theory of Lie algebras. One of the Lie superalgebra series attracts special attention due to its resemblance to the Lie algebra $\mathfrak{gl}_n$ on the one hand and because of the unique properties of its structure and representations on the other. This is the so-called queer (or strange) Lie superalgebra $\mathfrak{q}(n)$, which consists of all endomorphisms of $\mathbb{C}^{n|n}$ that commute with an odd automorphism $P$ of $\mathbb{C}^{n|n}$ such that $P^2=\mathrm{Id}$. The queer nature of $\mathfrak{q}(n)$ is partly due to the nonabelian structure of its Cartan subsuperalgebra $\mathfrak{h}$, which has a nontrivial odd part $\mathfrak{h}_{\bar 1}$. Another unique property of $\mathfrak{q}(n)$ is that, although it has no invariant bilinear form, it admits an invariant odd bilinear form.

Because of the nonabelian structure of $\mathfrak{h}$, the study of the highest weight modules of $\mathfrak{q}(n)$ requires some tools in addition to the standard technique. For example, the highest weight space $v_\lambda$ of an irreducible highest weight $\mathfrak{q}(n)$-module $V(\lambda)$ has a Clifford module structure. The case when $V(\lambda)$ is a tensor module, i.e., a submodule of some tensor power $V^{\otimes r}$ of the natural $\mathfrak{q}(n)$-module $V=\mathbb{C}^{n|n}$, was treated first by Sergeev in 1984. In [Se2] Sergeev established several important results, among which are the complete reducibility of $V^{\otimes r}$, a character formula of $V(\lambda)$, and an analog of the

This research was supported by a UT Arlington REP Grant. This research was supported by KRF Grant # 2007-341-C00001. This research was supported by BK21 Mathematical Sciences Division.


fundamental Schur-Weyl duality, often referred to as Sergeev duality. The characters of all simple finite-dimensional $\mathfrak{q}(n)$-modules were found by Penkov and Serganova in 1996 (see [PS2 and PS3]) via an algorithm using a supergeometric version of the Borel-Weil-Bott Theorem. In 2004 Brundan [B] obtained the character formula of Penkov and Serganova using a different approach and formulated a conjecture for the characters of the irreducible modules in the category $\mathcal{O}$. Important results related to the simplicity of the highest weight $\mathfrak{q}(n)$-modules were obtained recently by Gorelik in [G].

In this paper we initiate the study of highest weight representations of the quantum superalgebra $U_q(\mathfrak{q}(n))$. The aim of this paper is twofold. We want to study highest weight $U_q(\mathfrak{q}(n))$-modules on the one hand, and to build the foundations of the crystal bases theory for the tensor modules of $U_q(\mathfrak{q}(n))$ on the other. The latter problem will be treated in a future work.

A quantum deformation of the universal enveloping algebra of $\mathfrak{q}(n)$ was constructed first by Olshanski in [O]. Olshanski's construction is a flat deformation of the universal enveloping algebra $U(\mathfrak{q}(n))$ of $\mathfrak{q}(n)$ and is a quantum enveloping superalgebra in the sense of Drinfeld ([Dr], Sect. 7). The idea in [O] is to apply a suitable modification of the procedure used by Faddeev, Reshetikhin, and Takhtajan in [RTF], using an element $S$ in $\mathrm{End}(\mathbb{C}^{n|n})^{\otimes 2}$ that satisfies the quantum Yang-Baxter equation. However, as pointed out by Olshanski, the $r$-matrix $r\in\mathfrak{q}(n)^{\otimes 2}$ does not satisfy the classical Yang-Baxter equation. Thus no quantum analogue of $U(\mathfrak{q}(n))$ can be a quasi-triangular Hopf algebra.

In the present paper, based on the description of Olshanski, we give a presentation of $U_q(\mathfrak{q}(n))$ in terms of generators and relations so that the relations are quantum deformations of the relations of $\mathfrak{q}(n)$ obtained in [LS]. Using this presentation, we find a natural triangular decomposition of $U_q(\mathfrak{q}(n))$, and then introduce the notion of highest weight modules and Weyl modules. Similarly to the case of $\mathfrak{q}(n)$, in order to study highest weight modules, one has to describe the modules over the quantum Clifford superalgebra $\mathrm{Cliff}_q(\lambda)$ for a weight $\lambda$ of $\mathfrak{q}(n)$. These modules, as we show in Sect. 3, do not have the same structure as the ones over the classical Clifford superalgebra $\mathrm{Cliff}(\lambda)$. For example, the irreducible modules over $\mathrm{Cliff}_q(\lambda)$ are parity invariant for a much larger set of weights $\lambda$, compared with the irreducibles over $\mathrm{Cliff}(\lambda)$.

In the last two sections of the paper we focus on the category $\mathcal{O}_q^{\ge 0}$ of finite dimensional $U_q(\mathfrak{q}(n))$-modules all of whose weights are of the form $\lambda_1\epsilon_1+\cdots+\lambda_n\epsilon_n$ ($\lambda_i\in\mathbb{Z}_{\ge 0}$). One of our main results is a classical limit theorem for the irreducible modules in $\mathcal{O}_q^{\ge 0}$. Due to the structure of the quantum Clifford superalgebra, the classical limit theorem is non-standard, as it is not true in general that the classical limit $V^1$ of an irreducible highest weight $U_q(\mathfrak{q}(n))$-module $V^q(\lambda)$ is $V(\lambda)$. In fact, as we show in Sect. 5, if $\lambda$ has an even number of nonzero coordinates $\lambda_1>\cdots>\lambda_{2k}$, then $\mathrm{ch}\,V^1=2\,\mathrm{ch}\,V(\lambda)$. The "queer" versions of the classical limit theorem are Theorem 5.14 and Theorem 5.16. With the aid of the classical limit theorems we obtain another important result in the last section: the category $\mathcal{O}_q^{\ge 0}$ is semisimple.

The organization of the paper is as follows. In Sect. 1 we recall some definitions and basic results about $\mathfrak{q}(n)$. The realization of $U_q(\mathfrak{q}(n))$ and its triangular decomposition is provided in Sect. 2. Section 3 is devoted to the study of the quantum Clifford superalgebra and its modules. In Sect. 4 we introduce the notion of highest weight modules and Weyl modules. In particular, we show that every Weyl module $W^q(\lambda)$ has a unique irreducible quotient $V^q(\lambda)$. The classical limit theorem for the category $\mathcal{O}_q^{\ge 0}$ is proved in Sect. 5 and the complete reducibility of $U_q(\mathfrak{q}(n))$-modules in $\mathcal{O}_q^{\ge 0}$ is established in the last section.


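1. The Lie Superalgebra $\mathfrak{q}(n)$ and its Representations

The ground field in this section will be $\mathbb{C}$. By $\mathbb{Z}_{\ge 0}$ and $\mathbb{Z}_{>0}$ we denote the nonnegative integers and strictly positive integers, respectively. We set $\mathbb{Z}_2=\mathbb{Z}/2\mathbb{Z}$. Every vector space $V=V_{\bar 0}\oplus V_{\bar 1}$ over $\mathbb{C}$ is $\mathbb{Z}_2$-graded with even part $V_{\bar 0}$ and odd part $V_{\bar 1}$. We will write $\dim V=m|n$ if $\dim V_{\bar 0}=m$ and $\dim V_{\bar 1}=n$. By $\Pi$ we denote the parity change functor; i.e., $\Pi V$ is a vector space for which $(\Pi V)_{\bar 0}=V_{\bar 1}$ and $(\Pi V)_{\bar 1}=V_{\bar 0}$. The direct sum of $r$ copies of a vector space $V$ will be written as $V^{\oplus r}$.

The Lie subsuperalgebra $\mathfrak{g}=\mathfrak{q}(n)$ of $\mathfrak{gl}(n|n)$ is defined in matrix form by

\[
\mathfrak{g}=\mathfrak{q}(n):=\left\{\begin{pmatrix}A&B\\B&A\end{pmatrix}\ \middle|\ A,B\in\mathfrak{gl}_n\right\}.
\]

By definition, a subsuperalgebra $\mathfrak{h}=\mathfrak{h}_{\bar 0}\oplus\mathfrak{h}_{\bar 1}$ of $\mathfrak{g}$ is a Cartan subsuperalgebra if it is a self-normalizing nilpotent subsuperalgebra. Every such $\mathfrak{h}$ has a nontrivial odd part $\mathfrak{h}_{\bar 1}$. We fix $\mathfrak{h}$ to be the standard Cartan subsuperalgebra, namely the one for which $\mathfrak{h}_{\bar 0}$ has a basis $\{k_1,\dots,k_n\}$ and $\mathfrak{h}_{\bar 1}$ has a basis $\{k_{\bar 1},\dots,k_{\bar n}\}$, where

\[
k_i:=\begin{pmatrix}E_{i,i}&0\\0&E_{i,i}\end{pmatrix},\qquad k_{\bar i}:=\begin{pmatrix}0&E_{i,i}\\E_{i,i}&0\end{pmatrix},
\]

and $E_{i,j}$ is the $n\times n$ matrix having 1 in the $(i,j)$ position and 0 elsewhere. One should note that all Cartan subsuperalgebras of $\mathfrak{g}$ are conjugate to $\mathfrak{h}$.

Let $\{\epsilon_1,\dots,\epsilon_n\}$ be the basis of $\mathfrak{h}_{\bar 0}^*$ dual to $\{k_1,\dots,k_n\}$. We denote $k_i-k_{i+1}$ by $h_i$ for $i=1,2,\dots,n-1$. The root system $\Delta=\Delta_{\bar 0}\cup\Delta_{\bar 1}$ of $\mathfrak{g}$ has identical even and odd parts. Namely, $\Delta_{\bar 0}=\Delta_{\bar 1}=\{\epsilon_i-\epsilon_j\mid 1\le i\neq j\le n\}$. In particular, the root space decomposition $\mathfrak{g}=\bigoplus_{\alpha\in\Delta}\mathfrak{g}_\alpha$ has the property that $\mathfrak{g}_\alpha$ has dimension $1|1$ for every $\alpha\in\Delta$. Set $\alpha_i:=\epsilon_i-\epsilon_{i+1}$. Let $Q=\bigoplus_{i=1}^{n-1}\mathbb{Z}\alpha_i$ be the root lattice and $Q_+=\sum_{i=1}^{n-1}\mathbb{Z}_{\ge 0}\alpha_i$ be the positive root lattice. The notation $Q_-=-Q_+$ will also be used. There is a partial ordering on $\mathfrak{h}_{\bar 0}^*$ defined by $\lambda\ge\mu$ if and only if $\lambda-\mu\in Q_+$ for $\lambda,\mu\in\mathfrak{h}_{\bar 0}^*$. The root space $\mathfrak{g}_{\alpha_i}$ is spanned by

\[
e_i:=\begin{pmatrix}E_{i,i+1}&0\\0&E_{i,i+1}\end{pmatrix}\quad\text{and}\quad e_{\bar i}:=\begin{pmatrix}0&E_{i,i+1}\\E_{i,i+1}&0\end{pmatrix},
\]

while $\mathfrak{g}_{-\alpha_i}$ is spanned by

\[
f_i:=\begin{pmatrix}E_{i+1,i}&0\\0&E_{i+1,i}\end{pmatrix}\quad\text{and}\quad f_{\bar i}:=\begin{pmatrix}0&E_{i+1,i}\\E_{i+1,i}&0\end{pmatrix}.
\]

Let $P:=\bigoplus_{i=1}^{n}\mathbb{Z}\epsilon_i$ be the weight lattice of $\mathfrak{g}$ and denote by $P^\vee:=\bigoplus_{i=1}^{n}\mathbb{Z}k_i$ the dual weight lattice. Let $I:=\{1,2,\dots,n-1\}$ and $J:=\{1,2,\dots,n\}$.

Proposition 1.1. [LS]. The Lie superalgebra $\mathfrak{g}$ is generated by the elements $e_i,e_{\bar i},f_i,f_{\bar i}$ ($i\in I$), $\mathfrak{h}_{\bar 0}$ and $k_{\bar l}$ ($l\in J$) with the following defining relations:

\[
\begin{aligned}
&[h,h']=0\quad\text{for } h,h'\in\mathfrak{h}_{\bar 0},\\
&[h,e_i]=\alpha_i(h)e_i,\quad[h,e_{\bar i}]=\alpha_i(h)e_{\bar i}\quad\text{for } h\in\mathfrak{h}_{\bar 0},\ i\in I,\\
&[h,f_i]=-\alpha_i(h)f_i,\quad[h,f_{\bar i}]=-\alpha_i(h)f_{\bar i}\quad\text{for } h\in\mathfrak{h}_{\bar 0},\ i\in I,\\
&[h,k_{\bar l}]=0\quad\text{for } h\in\mathfrak{h}_{\bar 0},\ l\in J,\\
&[e_i,f_j]=\delta_{ij}(k_i-k_{i+1}),\quad[e_i,f_{\bar j}]=\delta_{ij}(k_{\bar i}-k_{\overline{i+1}})\quad\text{for } i,j\in I,\\
&[e_{\bar i},f_j]=\delta_{ij}(k_{\bar i}-k_{\overline{i+1}}),\quad[k_{\bar l},e_i]=\alpha_i(k_l)e_{\bar i}\quad\text{for } i,j\in I,\ l\in J,\\
&[k_{\bar l},f_i]=-\alpha_i(k_l)f_{\bar i},\quad[e_{\bar i},f_{\bar j}]=\delta_{ij}(k_i+k_{i+1})\quad\text{for } i,j\in I,\ l\in J,\\
&[k_{\bar l},e_{\bar i}]=\begin{cases}e_i&\text{if } l=i,i+1\\0&\text{otherwise}\end{cases}\quad\text{for } i\in I,\ l\in J,\\
&[k_{\bar l},f_{\bar i}]=\begin{cases}f_i&\text{if } l=i,i+1\\0&\text{otherwise}\end{cases}\quad\text{for } i\in I,\ l\in J,\\
&[e_i,e_{\bar j}]=[e_{\bar i},e_{\bar j}]=[f_i,f_{\bar j}]=[f_{\bar i},f_{\bar j}]=0\quad\text{for } i,j\in I,\ |i-j|\neq 1,\\
&[e_i,e_j]=[f_i,f_j]=0\quad\text{for } i,j\in I,\ |i-j|>1,\\
&[e_i,e_{i+1}]=[e_{\bar i},e_{\overline{i+1}}],\quad[e_i,e_{\overline{i+1}}]=[e_{\bar i},e_{i+1}],\\
&[f_{i+1},f_i]=[f_{\overline{i+1}},f_{\bar i}],\quad[f_{i+1},f_{\bar i}]=[f_{\overline{i+1}},f_i],\\
&[k_{\bar i},k_{\bar j}]=\delta_{ij}2k_i\quad\text{for } i,j\in J,\\
&[e_i,[e_i,e_j]]=[e_{\bar i},[e_i,e_j]]=0\quad\text{for } i,j\in I,\ |i-j|=1,\\
&[f_i,[f_i,f_j]]=[f_{\bar i},[f_i,f_j]]=0\quad\text{for } i,j\in I,\ |i-j|=1.
\end{aligned}
\]

Remark. We modified the relations given in [LS]. More precisely, we replaced the relations

\[
[e_{\bar i},[e_i,e_{\bar j}]]=0\ \text{ for } i,j\in I,\ |i-j|=1,\qquad[f_{\bar i},[f_i,f_{\bar j}]]=0\ \text{ for } i,j\in I,\ |i-j|=1 \tag{1.1}
\]

by

\[
[e_i,e_{i+1}]=[e_{\bar i},e_{\overline{i+1}}],\quad[e_i,e_{\overline{i+1}}]=[e_{\bar i},e_{i+1}],\quad[f_{i+1},f_i]=[f_{\overline{i+1}},f_{\bar i}],\quad[f_{i+1},f_{\bar i}]=[f_{\overline{i+1}},f_i]. \tag{1.2}
\]

Since (1.1) can be derived from (1.2) (and other ones), we can easily see that these two presentations are equivalent.

The universal enveloping algebra $U(\mathfrak{g})$ is obtained from the tensor algebra $T(\mathfrak{g})$ by factoring out by the ideal generated by the elements $[u,v]-u\otimes v+(-1)^{\alpha\beta}v\otimes u$, where $\alpha,\beta\in\mathbb{Z}_2$, $u\in\mathfrak{g}_\alpha$, $v\in\mathfrak{g}_\beta$. Let $U^+$ (respectively, $U^0$ and $U^-$) be the subalgebra of $U(\mathfrak{g})$ generated by the elements $e_i,e_{\bar i}$ ($i\in I$) (respectively, by $k_i,k_{\bar i}$ ($i\in J$) and by $f_i,f_{\bar i}$ ($i\in I$)). By the Poincaré-Birkhoff-Witt theorem, the universal enveloping algebra has the triangular decomposition:

\[
U(\mathfrak{g})\cong U^-\otimes U^0\otimes U^+. \tag{1.3}
\]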
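Several of the relations in Proposition 1.1 can be sanity-checked directly on the matrix realization of $\mathfrak{q}(n)$ given above. The following Python sketch is not part of the paper: the helper names (`unit`, `even`, `odd`, `sbracket`, `check_relations`) are illustrative, it assumes NumPy is available, and it only tests a sample of the relations for small $n$, using the superbracket $[x,y]=xy-(-1)^{p(x)p(y)}yx$.

```python
import numpy as np


def unit(n, i, j):
    """n x n matrix unit E_{i,j} (1-indexed)."""
    m = np.zeros((n, n), dtype=int)
    m[i - 1, j - 1] = 1
    return m


def even(a):
    """Even element [[a, 0], [0, a]] of q(n)."""
    z = np.zeros_like(a)
    return np.block([[a, z], [z, a]])


def odd(a):
    """Odd element [[0, a], [a, 0]] of q(n)."""
    z = np.zeros_like(a)
    return np.block([[z, a], [a, z]])


def sbracket(x, px, y, py):
    """Superbracket xy - (-1)^(px*py) yx for parities px, py in {0, 1}."""
    return x @ y - (-1) ** (px * py) * (y @ x)


def check_relations(n):
    """Return True if the sampled relations hold in the matrix model of q(n)."""
    J, I = range(1, n + 1), range(1, n)
    k = {l: even(unit(n, l, l)) for l in J}
    kb = {l: odd(unit(n, l, l)) for l in J}
    e = {i: even(unit(n, i, i + 1)) for i in I}
    eb = {i: odd(unit(n, i, i + 1)) for i in I}
    f = {i: even(unit(n, i + 1, i)) for i in I}
    fb = {i: odd(unit(n, i + 1, i)) for i in I}
    zero = np.zeros((2 * n, 2 * n), dtype=int)
    ok = True
    for i in J:
        for j in J:
            # [k_ibar, k_jbar] = 2 delta_ij k_i
            rhs = 2 * k[i] if i == j else zero
            ok &= np.array_equal(sbracket(kb[i], 1, kb[j], 1), rhs)
    for i in I:
        for j in I:
            d = 1 if i == j else 0
            # [e_i, f_j], [e_i, f_jbar], [e_ibar, f_jbar]
            ok &= np.array_equal(sbracket(e[i], 0, f[j], 0), d * (k[i] - k[i + 1]))
            ok &= np.array_equal(sbracket(e[i], 0, fb[j], 1), d * (kb[i] - kb[i + 1]))
            ok &= np.array_equal(sbracket(eb[i], 1, fb[j], 1), d * (k[i] + k[i + 1]))
        for l in J:
            # [k_lbar, e_i] = alpha_i(k_l) e_ibar; [k_lbar, e_ibar] = e_i if l in {i, i+1}
            alpha = (1 if l == i else 0) - (1 if l == i + 1 else 0)
            ok &= np.array_equal(sbracket(kb[l], 1, e[i], 0), alpha * eb[i])
            rhs = e[i] if l in (i, i + 1) else zero
            ok &= np.array_equal(sbracket(kb[l], 1, eb[i], 1), rhs)
    for i in range(1, n - 1):
        # the two e-relations of (1.2)
        ok &= np.array_equal(sbracket(e[i], 0, e[i + 1], 0),
                             sbracket(eb[i], 1, eb[i + 1], 1))
        ok &= np.array_equal(sbracket(e[i], 0, eb[i + 1], 1),
                             sbracket(eb[i], 1, e[i + 1], 0))
    return bool(ok)
```

Calling `check_relations(3)` exercises all of the sampled relations, including (1.2); for `n = 2` the loop over (1.2) is empty because there is no $e_{i+1}$. A check of this kind only confirms that the matrices satisfy the relations, not that the relations are defining.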

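A $\mathfrak{g}$-module $V$ is called a weight module if it admits a weight space decomposition

\[
V=\bigoplus_{\mu\in\mathfrak{h}_{\bar 0}^*}V_\mu,\quad\text{where } V_\mu=\{v\in V\mid hv=\mu(h)v\ \text{for all } h\in\mathfrak{h}_{\bar 0}\}.
\]

For a weight $\mathfrak{g}$-module $M$ denote by $\mathrm{wt}(M)$ the set of weights $\lambda\in\mathfrak{h}_{\bar 0}^*$ for which $M_\lambda\neq 0$. Every submodule of a weight module is also a weight module. If $\dim_{\mathbb{C}}V_\mu<\infty$ for all $\mu\in\mathfrak{h}_{\bar 0}^*$, the character of $V$ is defined to be

\[
\mathrm{ch}\,V=\sum_{\mu\in\mathfrak{h}_{\bar 0}^*}(\dim_{\mathbb{C}}V_\mu)\,e^\mu,
\]

where $e^\mu$ are formal basis elements of the group algebra $\mathbb{C}[\mathfrak{h}_{\bar 0}^*]$ with the multiplication given by $e^\lambda e^\mu=e^{\lambda+\mu}$ for all $\lambda,\mu\in\mathfrak{h}_{\bar 0}^*$. Denote by $\mathfrak{b}_+$ the standard Borel subsuperalgebra of $\mathfrak{g}$ generated by $k_l,k_{\bar l}$ ($l\in J$) and $e_i,e_{\bar i}$ ($i\in I$). A weight module $V$ is called a highest weight module if it is generated over $\mathfrak{g}$ by a finite dimensional irreducible $\mathfrak{b}_+$-submodule (see [PS1, Def. 4]).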


Proposition 1.2. [P]. Let $\mathbf{v}$ be a finite dimensional irreducible $\mathbb{Z}_2$-graded $\mathfrak{b}_+$-module.
(1) The maximal nilpotent subsuperalgebra $\mathfrak{n}$ of $\mathfrak{b}_+$ acts on $\mathbf{v}$ trivially.
(2) For any weight $\mu\in\mathfrak{h}_{\bar 0}^*$, consider the symmetric bilinear form $F_\mu(u,v):=\mu([u,v])$ on $\mathfrak{h}_{\bar 1}$ and let $\mathrm{Cliff}(\mu)$ be the Clifford superalgebra of the quadratic space $(\mathfrak{h}_{\bar 1},F_\mu)$. Then there exists a unique weight $\lambda\in\mathfrak{h}_{\bar 0}^*$ such that $\mathbf{v}$ is endowed with a canonical $\mathbb{Z}_2$-graded $\mathrm{Cliff}(\lambda)$-module structure and $\mathbf{v}$ is determined by $\lambda$ up to $\Pi$.
(3) $\mathfrak{h}_{\bar 0}$ acts on $\mathbf{v}$ by the weight $\lambda$ determined in (2).

From the above proposition, we know that the dimension of the highest weight space of a highest weight $\mathfrak{g}$-module with highest weight $\lambda$ is the same as the dimension of an irreducible $\mathrm{Cliff}(\lambda)$-module. On the other hand all irreducible $\mathrm{Cliff}(\lambda)$-modules have the same dimension (see, for example, [ABS, Table 2]). Thus the dimension of the highest weight space is constant for all highest weight modules with highest weight $\lambda$.

Definition 1.3. Let $\mathbf{v}(\lambda)$ be the irreducible $\mathfrak{b}_+$-module determined by $\lambda$ up to $\Pi$. The Weyl module $W(\lambda)$ of $\mathfrak{g}$ with highest weight $\lambda$ is defined to be $W(\lambda):=U(\mathfrak{g})\otimes_{U(\mathfrak{b}_+)}\mathbf{v}(\lambda)$. Note that the structure of $W(\lambda)$ is determined by $\lambda$ up to $\Pi$.

Remark. One may define the Verma module corresponding to $\lambda$ by $M(\lambda):=U(\mathfrak{g})\otimes_{U(\mathfrak{b}_+)}\mathrm{Cliff}(\lambda)$. Since the Verma modules are not highest weight modules, they will not be considered in this paper.

We will denote by $\Lambda^+_{\bar 0}$ and $\Lambda^+$ the set of $\mathfrak{gl}_n$-dominant integral weights and the set of $\mathfrak{g}$-dominant integral weights, respectively. These are given by

\[
\begin{aligned}
\Lambda^+_{\bar 0}&:=\{\lambda_1\epsilon_1+\cdots+\lambda_n\epsilon_n\in\mathfrak{h}_{\bar 0}^*\mid\lambda_i-\lambda_{i+1}\in\mathbb{Z}_{\ge 0}\ \text{for all } i\in I\},\\
\Lambda^+&:=\{\lambda_1\epsilon_1+\cdots+\lambda_n\epsilon_n\in\Lambda^+_{\bar 0}\mid\lambda_i=\lambda_{i+1}\Rightarrow\lambda_i=\lambda_{i+1}=0\ \text{for all } i\in I\}.
\end{aligned}
\]

Proposition 1.4. [P].
(1) For any weight $\lambda$, $W(\lambda)$ has a unique maximal submodule $N(\lambda)$.
(2) For each finite dimensional irreducible $\mathfrak{g}$-module $V$, there exists a unique weight $\lambda\in\Lambda^+_{\bar 0}$ such that $V$ is a homomorphic image of $W(\lambda)$.
(3) $V(\lambda):=W(\lambda)/N(\lambda)$ is finite dimensional if and only if $\lambda\in\Lambda^+$.

Now we restrict our attention to the following subcategory of the category of finite dimensional $\mathfrak{g}$-modules.

Definition 1.5. Set $P_{\ge 0}:=\{\lambda=\lambda_1\epsilon_1+\cdots+\lambda_n\epsilon_n\in P\mid\lambda_j\ge 0\ \text{for all } j=1,\dots,n\}$. The category $\mathcal{O}^{\ge 0}$ consists of finite dimensional $U(\mathfrak{g})$-modules $M$ with weight space decomposition $M=\bigoplus_{\lambda\in P}M_\lambda$ such that $\mathrm{wt}(M)\subset P_{\ge 0}$.

Clearly, $\mathcal{O}^{\ge 0}$ is closed under finite direct sum, tensor product and taking submodules and quotient modules. Because a $\mathfrak{q}(n)$-module in $\mathcal{O}^{\ge 0}$ can be decomposed into a direct sum of irreducible highest weight $\mathfrak{gl}_n$-modules, one can easily prove the following proposition (see, for example, [HK, Theorem 7.2.3]).

Proposition 1.6. For each $\lambda\in\Lambda^+\cap P_{\ge 0}$, $V(\lambda)$ is an irreducible $U(\mathfrak{g})$-module in the category $\mathcal{O}^{\ge 0}$. Conversely, every irreducible $U(\mathfrak{g})$-module in the category $\mathcal{O}^{\ge 0}$ has the form $V(\lambda)$ for some $\lambda\in\Lambda^+\cap P_{\ge 0}$.
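The three conditions on a weight used above (being $\mathfrak{gl}_n$-dominant, being $\mathfrak{g}$-dominant, and lying in $P_{\ge 0}$) are easy to state on coordinate tuples $(\lambda_1,\dots,\lambda_n)$. The sketch below, in Python, is only an illustration: the function names are hypothetical, and it assumes the weight lies in $P$, so that all coordinates are integers (the definition of $\Lambda^+_{\bar 0}$ only requires integral differences).

```python
def is_gl_dominant(lam):
    """lambda_i - lambda_{i+1} >= 0 for all i (integer coordinates assumed)."""
    return all(a - b >= 0 for a, b in zip(lam, lam[1:]))


def is_q_dominant(lam):
    """gl_n-dominant, and equal neighbouring coordinates must both be 0."""
    return is_gl_dominant(lam) and all(
        a != b or a == 0 for a, b in zip(lam, lam[1:])
    )


def in_P_nonneg(lam):
    """All coordinates are nonnegative."""
    return all(x >= 0 for x in lam)


# (3, 2, 0, 0) labels an irreducible module in the category of Proposition 1.6;
# (2, 2, 0) is gl_n-dominant but fails the extra condition defining the
# g-dominant weights.
examples = [(3, 2, 0, 0), (2, 2, 0), (1, 0, -1)]
labels = [lam for lam in examples if is_q_dominant(lam) and in_P_nonneg(lam)]
```

Under these assumptions, the weights in $\Lambda^+\cap P_{\ge 0}$ are exactly the tuples whose nonzero coordinates are strictly decreasing positive integers followed by zeros.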


In [Se1], Sergeev has presented an explicit set of generators of $Z=Z(U(\mathfrak{g}))$, the center of $U(\mathfrak{g})$, and showed that each Weyl module $W(\lambda)$ ($\lambda\in\mathfrak{h}_{\bar 0}^*$) admits a central character. Let $\chi_\lambda\in\mathrm{Hom}_{\mathbb{C}}(Z,\mathbb{C})$ be the central character afforded by $W(\lambda)$; i.e., every element $z\in Z$ acts on $W(\lambda)$ as scalar multiplication by $\chi_\lambda(z)$. Following [B, (2.12)], to each weight $\lambda=\lambda_1\epsilon_1+\cdots+\lambda_n\epsilon_n\in P$, one can assign a formal symbol $\delta(\lambda):=\delta_{\lambda_1}+\cdots+\delta_{\lambda_n}$ such that $\delta_0=0$ and $\delta_{-i}=-\delta_i$.

Proposition 1.7. [B, Theorem 4.19], [PS2, Proposition 1.1]. For $\lambda,\mu\in P$, $\chi_\lambda=\chi_\mu$ if and only if $\delta(\lambda)=\delta(\mu)$.

The following proposition will be very useful in Sect. 5.

Proposition 1.8. Let $V$ be a finite dimensional highest weight module over $\mathfrak{g}$ with highest weight $\lambda\in\Lambda^+\cap P_{\ge 0}$. Then $V$ is isomorphic to an irreducible highest weight module $V(\lambda)$.

Proof. If $V$ is reducible, since it is finite dimensional, it contains a nonzero proper irreducible submodule $W$. Then $W$ is isomorphic to an irreducible highest weight module $V(\mu)$ for some weight $\mu\in\Lambda^+\cap P_{\ge 0}$ by Proposition 1.4. We know that $\mu\lneq\lambda$ and $\chi_\lambda=\chi_\mu$. Hence, by Proposition 1.7, $\delta(\lambda)=\delta(\mu)$. Since $\lambda,\mu\in\Lambda^+\cap P_{\ge 0}$, we have $\lambda=\mu$, which is a contradiction. Thus $V$ is irreducible and by Proposition 1.4, it must be isomorphic to the irreducible highest weight module $V(\lambda)$ up to $\Pi$.

The next proposition gives a sufficient condition for the finite dimensionality of a highest weight $\mathfrak{g}$-module.

Proposition 1.9. Let $V$ be a highest weight module over $\mathfrak{g}$ with highest weight $\lambda\in\Lambda^+$. If $f_i^{\lambda(h_i)+1}v=0$ for all $v\in V_\lambda$ and $i\in I$, then $V$ is finite dimensional.

Proof. Let $\{x_1,x_2,\dots,x_r\}$ and $\{y_1,y_2,\dots,y_r\}$ be bases of $\mathfrak{g}_{\bar 0}$ and $\mathfrak{g}_{\bar 1}$, respectively. Then by the Poincaré-Birkhoff-Witt theorem, $U(\mathfrak{g})$ has a basis consisting of elements of the form $y_1^{\varepsilon_1}y_2^{\varepsilon_2}\cdots y_r^{\varepsilon_r}x_1^{n_1}x_2^{n_2}\cdots x_r^{n_r}$, where $\varepsilon_j=0$ or $1$ and $n_j\in\mathbb{N}\cup\{0\}$. Because $\{y_1^{\varepsilon_1}y_2^{\varepsilon_2}\cdots y_r^{\varepsilon_r}\mid\varepsilon_j=0,1\}$ is a finite set, it is enough to show that $U(\mathfrak{g}_{\bar 0})V_\lambda$ is finite dimensional. For any $v\in V_\lambda$, we know that $U(\mathfrak{g}_{\bar 0})v$ is a highest weight module over $\mathfrak{g}_{\bar 0}$ with highest weight $\lambda$ satisfying $f_i^{\lambda(h_i)+1}v=0$ for all $i\in I$. Thus it is finite dimensional. Since $U(\mathfrak{g}_{\bar 0})V_\lambda\subset\sum_{v\in V_\lambda}U(\mathfrak{g}_{\bar 0})v$, we have the desired result.
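The formal symbol $\delta(\lambda)$ used in Proposition 1.7 can be modelled concretely: each positive coordinate $\lambda_i$ contributes $+\delta_{\lambda_i}$, each negative coordinate contributes $-\delta_{|\lambda_i|}$, and zeros contribute nothing. The Python sketch below stores $\delta(\lambda)$ as a dictionary from positive integers to nonzero coefficients; the representation and the function names are assumptions made for illustration, and the comparison is only meaningful for weights in $P$.

```python
from collections import Counter


def delta(lam):
    """Formal symbol delta(lam) as {index: coefficient}, with delta_0 = 0
    and delta_{-i} = -delta_i; zero coefficients are dropped."""
    coeffs = Counter()
    for x in lam:
        if x > 0:
            coeffs[x] += 1
        elif x < 0:
            coeffs[-x] -= 1
    return {i: c for i, c in coeffs.items() if c != 0}


def same_central_character(lam, mu):
    """Criterion of Proposition 1.7 for integer weights lam, mu."""
    return delta(lam) == delta(mu)
```

For instance, $(2,1,-1)$ and $(2,0,0)$ have the same symbol because $\delta_1$ and $\delta_{-1}$ cancel. For weights in $\Lambda^+\cap P_{\ge 0}$ no cancellation can occur and the nonzero coordinates are distinct, which is why equality of symbols forces $\lambda=\mu$ in the proof of Proposition 1.8.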

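We say that a weight $\lambda=\lambda_1\epsilon_1+\cdots+\lambda_n\epsilon_n\in\mathfrak{h}_{\bar 0}^*$ is $\alpha$-typical if $\alpha=\epsilon_i-\epsilon_j$ and $\lambda_i+\lambda_j\neq 0$. In [Se2], Sergeev proved the following character formula for $V(\lambda)$ ($\lambda\in\Lambda^+\cap P_{\ge 0}$):

\[
\mathrm{ch}\,V(\lambda)=\frac{\dim v_\lambda}{D}\sum_{w\in W}\mathrm{sgn}\,w\ w\Big(e^{\lambda+\rho_0}\prod_{\substack{\alpha\in\Delta^+_{\bar 0},\\ \lambda\ \text{is}\ \alpha\text{-typical}}}(1+e^{-\alpha})\Big), \tag{1.4}
\]

where $v_\lambda$ is an irreducible $\mathrm{Cliff}(\lambda)$-module, $W$ is the Weyl group of $\mathfrak{g}_{\bar 0}=\mathfrak{gl}_n$, $\rho_0=\frac12\sum_{\alpha\in\Delta^+_{\bar 0}}\alpha$ and $D=\sum_{w\in W}\mathrm{sgn}\,w\ e^{w(\rho_0)}$ is the Weyl denominator. In [PS2], the formula (1.4) is called the generic character formula and an explicit algorithm for computing the character of an arbitrary finite dimensional irreducible $\mathfrak{g}$-module is presented.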


2. The Quantum Superalgebra $U_q(\mathfrak{q}(n))$

In [O], Olshanski constructed the quantum deformation $U_q(\mathfrak{q}(n))$ of the universal enveloping algebra of $\mathfrak{q}(n)$. The quantum superalgebra $U_q(\mathfrak{q}(n))$ is defined to be the associative algebra over $\mathbb{C}(q)$ generated by $L_{ij}$, $i\le j$, with defining relations

\[
\begin{aligned}
&L_{ii}L_{-i,-i}=L_{-i,-i}L_{ii}=1,\\
&(-1)^{p(i,j)p(k,l)}q^{\varphi(j,l)}L_{ij}L_{kl}+\{k\le j<l\}\,\theta(i,j,k)(q-q^{-1})L_{il}L_{kj}\\
&\qquad+\{i\le -l<j\le -k\}\,\theta(-i,-j,k)(q-q^{-1})L_{i,-l}L_{k,-j}\\
&\quad=q^{\varphi(i,k)}L_{kl}L_{ij}+\{k<i\le l\}\,\theta(i,j,k)(q-q^{-1})L_{il}L_{kj}\\
&\qquad+\{-l\le i<-k\le j\}\,\theta(-i,-j,k)(q-q^{-1})L_{-i,l}L_{-k,j},
\end{aligned} \tag{2.1}
\]

where $\varphi(i,j)=\delta_{|i|,|j|}\,\mathrm{sgn}(j)$, $\theta(i,j,k)=\mathrm{sgn}(\mathrm{sgn}(i)+\mathrm{sgn}(j)+\mathrm{sgn}(k))$,

\[
p(i,j)=\begin{cases}0&\text{if } ij>0,\\1&\text{if } ij<0,\end{cases}
\]

for any indices $i\le j$, $k\le l$ in $\{\pm 1,\dots,\pm n\}$, and the symbol $\{\cdots\}$ (the dots stand for some inequalities) is equal to 1 if all of these inequalities are fulfilled and 0 otherwise. Following [O, Remark 7.3], we consider the set of generators of $U_q(\mathfrak{g})=U_q(\mathfrak{q}(n))$ as follows:

\[
\begin{aligned}
&q^{k_i}:=L_{i,i},\quad q^{-k_i}:=L_{-i,-i},\quad e_i:=-\frac{1}{q-q^{-1}}L_{-i-1,-i},\quad f_i:=\frac{1}{q-q^{-1}}L_{i,i+1},\\
&e_{\bar i}:=-\frac{1}{q-q^{-1}}L_{-i-1,i},\quad f_{\bar i}:=-\frac{1}{q-q^{-1}}L_{-i,i+1},\quad k_{\bar i}:=-\frac{1}{q-q^{-1}}L_{-i,i}.
\end{aligned} \tag{2.2}
\]
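The combinatorial symbols entering (2.1), namely $\varphi$, $\theta$, $p$ and the bracket $\{\cdots\}$, are simple integer-valued functions of the indices. The following Python sketch spells them out as read from the definitions above; the function names are illustrative only, and the indices are assumed to be nonzero integers in $\{\pm 1,\dots,\pm n\}$.

```python
def sgn(x):
    """Sign of an integer: -1, 0 or 1."""
    return (x > 0) - (x < 0)


def phi(i, j):
    """phi(i, j) = delta_{|i|,|j|} * sgn(j)."""
    return sgn(j) if abs(i) == abs(j) else 0


def theta(i, j, k):
    """theta(i, j, k) = sgn(sgn(i) + sgn(j) + sgn(k))."""
    return sgn(sgn(i) + sgn(j) + sgn(k))


def parity(i, j):
    """p(i, j) = 0 if ij > 0 and 1 if ij < 0 (nonzero indices assumed)."""
    return 0 if i * j > 0 else 1


def bracket(*conditions):
    """The symbol {...}: 1 if all listed inequalities hold, 0 otherwise."""
    return 1 if all(conditions) else 0


# Example: the coefficient {k <= j < l} * theta(i, j, k) for one index choice.
i, j, k, l = 1, 2, 1, 3
coefficient = bracket(k <= j, j < l) * theta(i, j, k)
```

With three nonzero indices the inner sum in `theta` is odd, so `theta` never vanishes on valid indices and only records whether positive or negative indices are in the majority.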

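Our first main result is the following presentation of $U_q(\mathfrak{g})$.

Theorem 2.1. The quantum superalgebra $U_q(\mathfrak{g})$ is isomorphic to the unital associative algebra over $\mathbb{C}(q)$ generated by the elements $e_i,f_i,e_{\bar i},f_{\bar i}$ ($i=1,\dots,n-1$), $k_{\bar l}$ ($l=1,\dots,n$), and $q^h$ ($h\in P^\vee$), satisfying the following relations:

\[
\begin{aligned}
&q^0=1,\quad q^{h_1+h_2}=q^{h_1}q^{h_2}\quad\text{for } h_1,h_2\in P^\vee,\\
&q^he_iq^{-h}=q^{\alpha_i(h)}e_i,\quad q^hf_iq^{-h}=q^{-\alpha_i(h)}f_i\quad\text{for } h\in P^\vee,\\
&q^hk_{\bar i}q^{-h}=k_{\bar i},\quad q^he_{\bar i}q^{-h}=q^{\alpha_i(h)}e_{\bar i},\quad q^hf_{\bar i}q^{-h}=q^{-\alpha_i(h)}f_{\bar i}\quad\text{for } h\in P^\vee,\\
&e_if_i-f_ie_i=\frac{q^{k_i-k_{i+1}}-q^{-k_i+k_{i+1}}}{q-q^{-1}},\\
&qe_{i+1}f_i-f_ie_{i+1}=e_if_{i+1}-qf_{i+1}e_i=e_if_j-f_je_i=0\quad\text{if } |i-j|>1,\\
&e_if_{\bar i}-f_{\bar i}e_i=q^{-k_{i+1}}k_{\bar i}-k_{\overline{i+1}}q^{-k_i},\\
&qe_{i+1}f_{\bar i}-f_{\bar i}e_{i+1}=e_if_{\overline{i+1}}-qf_{\overline{i+1}}e_i=e_if_{\bar j}-f_{\bar j}e_i=0\quad\text{if } |i-j|>1,\\
&e_{\bar i}f_i-f_ie_{\bar i}=q^{k_{i+1}}k_{\bar i}-k_{\overline{i+1}}q^{k_i},\\
&qe_{\overline{i+1}}f_i-f_ie_{\overline{i+1}}=e_{\bar i}f_{i+1}-qf_{i+1}e_{\bar i}=e_{\bar i}f_j-f_je_{\bar i}=0\quad\text{if } |i-j|>1,\\
&k_{\bar i}e_i-qe_ik_{\bar i}=e_{\bar i}q^{-k_i},\quad qk_{\bar i}e_{i-1}-e_{i-1}k_{\bar i}=-q^{-k_i}e_{\overline{i-1}},\\
&k_{\bar i}e_j-e_jk_{\bar i}=0\quad\text{for } j\neq i \text{ and } j\neq i-1,\\
&k_{\bar i}f_i-qf_ik_{\bar i}=-f_{\bar i}q^{k_i},\quad qk_{\bar i}f_{i-1}-f_{i-1}k_{\bar i}=q^{k_i}f_{\overline{i-1}},\\
&k_{\bar i}f_j-f_jk_{\bar i}=0\quad\text{for } j\neq i \text{ and } j\neq i-1,\\
&k_{\bar i}^2=\frac{q^{2k_i}-q^{-2k_i}}{q^2-q^{-2}},\quad k_{\bar i}k_{\bar j}=-k_{\bar j}k_{\bar i}\quad\text{for } i\neq j,\\
&e_{\bar i}f_{\bar i}+f_{\bar i}e_{\bar i}=\frac{q^{k_i+k_{i+1}}-q^{-k_i-k_{i+1}}}{q-q^{-1}}+(q-q^{-1})k_{\bar i}k_{\overline{i+1}},\\
&qe_{\overline{i+1}}f_{\bar i}+f_{\bar i}e_{\overline{i+1}}=e_{\bar i}f_{\overline{i+1}}+qf_{\overline{i+1}}e_{\bar i}=e_{\bar i}f_{\bar j}+f_{\bar j}e_{\bar i}=0\quad\text{if } |i-j|>1,\\
&k_{\bar i}e_{\bar i}+qe_{\bar i}k_{\bar i}=e_iq^{-k_i},\quad qk_{\bar i}e_{\overline{i-1}}+e_{\overline{i-1}}k_{\bar i}=q^{-k_i}e_{i-1},\\
&k_{\bar i}e_{\bar j}+e_{\bar j}k_{\bar i}=0\quad\text{for } j\neq i \text{ and } j\neq i-1,\\
&k_{\bar i}f_{\bar i}+qf_{\bar i}k_{\bar i}=f_iq^{k_i},\quad qk_{\bar i}f_{\overline{i-1}}+f_{\overline{i-1}}k_{\bar i}=q^{k_i}f_{i-1},\\
&k_{\bar i}f_{\bar j}+f_{\bar j}k_{\bar i}=0\quad\text{for } j\neq i \text{ and } j\neq i-1,\\
&e_{\bar i}^2=-\frac{q-q^{-1}}{q+q^{-1}}e_i^2,\quad f_{\bar i}^2=\frac{q-q^{-1}}{q+q^{-1}}f_i^2,\\
&e_ie_j-e_je_i=f_if_j-f_jf_i=e_{\bar i}e_{\bar j}+e_{\bar j}e_{\bar i}=f_{\bar i}f_{\bar j}+f_{\bar j}f_{\bar i}=0\quad\text{if } |i-j|>1,\\
&e_ie_{\bar j}-e_{\bar j}e_i=f_if_{\bar j}-f_{\bar j}f_i=0\quad\text{if } |i-j|\neq 1,\\
&e_ie_{i+1}-e_{i+1}e_i=e_{\bar i}e_{\overline{i+1}}+e_{\overline{i+1}}e_{\bar i},\quad f_{i+1}f_i-f_if_{i+1}=f_{\bar i}f_{\overline{i+1}}+f_{\overline{i+1}}f_{\bar i},\\
&e_ie_{\overline{i+1}}-e_{\overline{i+1}}e_i=e_{\bar i}e_{i+1}-e_{i+1}e_{\bar i},\quad f_{\overline{i+1}}f_i-f_if_{\overline{i+1}}=f_{i+1}f_{\bar i}-f_{\bar i}f_{i+1},\\
&qe_i^2e_{i+1}-(q+q^{-1})e_ie_{i+1}e_i+q^{-1}e_{i+1}e_i^2=0,\\
&qf_i^2f_{i+1}-(q+q^{-1})f_if_{i+1}f_i+q^{-1}f_{i+1}f_i^2=0,\\
&qe_ie_{i+1}^2-(q+q^{-1})e_{i+1}e_ie_{i+1}+q^{-1}e_{i+1}^2e_i=0,\\
&qf_if_{i+1}^2-(q+q^{-1})f_{i+1}f_if_{i+1}+q^{-1}f_{i+1}^2f_i=0,
\end{aligned} \tag{2.3}
\]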

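\[
\begin{aligned}
&qe_i^2e_{\overline{i+1}}-(q+q^{-1})e_ie_{\overline{i+1}}e_i+q^{-1}e_{\overline{i+1}}e_i^2=0,\\
&qf_i^2f_{\overline{i+1}}-(q+q^{-1})f_if_{\overline{i+1}}f_i+q^{-1}f_{\overline{i+1}}f_i^2=0,\\
&qe_{\bar i}e_{i+1}^2-(q+q^{-1})e_{i+1}e_{\bar i}e_{i+1}+q^{-1}e_{i+1}^2e_{\bar i}=0,\\
&qf_{\bar i}f_{i+1}^2-(q+q^{-1})f_{i+1}f_{\bar i}f_{i+1}+q^{-1}f_{i+1}^2f_{\bar i}=0.
\end{aligned}
\]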

Proof. Let $U$ be the unital associative algebra over $\mathbb{C}(q)$ generated by the elements $e_i,f_i,e_{\bar i},f_{\bar i}$ ($i=1,\dots,n-1$), $k_{\bar l}$ ($l=1,\dots,n$), and $q^h$ ($h\in P^\vee$) with defining relations given in (2.3). Using (2.1) and (2.2), the relations in (2.3) can be derived easily. Thus there is a well-defined algebra homomorphism $\phi:U\longrightarrow U_q(\mathfrak{g})$.


From the relation (2.1), we obtain L i,i+ j = (q − q −1 )q −

j−1

h=1 ki+h

j−1

ad f i+h ( f i ),

h=1

L −i,i+ j = − (q − q −1 )q −

j−1

h=1 ki+h

j−1

ad f i+h ( f i¯ ),

h=1

L −i− j, i = (−1) (q − q j

−1

)q

j−1

j−1

h=1 ki+h

(2.4) ad ei+h (ei¯ ),

h=1

L −i− j,−i = (−1) (q − q j

−1

)q

j−1

h=1 ki+h

j−1

ad ei+h (ei ),

h=1

j where ad bi (b j ) := bi b j − b j bi , h=1 ad bi+h (bi ) := ad bi+ j · · · ad bi+1 (bi ) and 0 h=1 ad bi+h (bi ) = bi for bi = ei , ei¯ , f i , f i¯ (i = 1, . . . , n − 1, j > 0). It follows that the homomorphism φ must be surjective. It remains to prove φ is injective. For this purpose, we will show that the relations in (2.1) can be derived from the ones in (2.3). The proof of our assertion is quite lengthy and tedious. But the basic idea is just the case-by-case check-up. We define the sets = {(i, j) ∈ Z/{0} × Z/{0} | − n ≤ i ≤ j ≤ n},

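\[
\begin{aligned}
\Omega_1 &= \{(i,j) \in \Omega \mid i > 0,\ j > 0 \text{ and } i < j\},\\
\Omega_2 &= \{(i,j) \in \Omega \mid i < 0,\ j > 0 \text{ and } |i| < |j|\},\\
\Omega_3 &= \{(i,j) \in \Omega \mid i < 0,\ j > 0 \text{ and } |i| > |j|\},\\
\Omega_4 &= \{(i,j) \in \Omega \mid i < 0,\ j < 0 \text{ and } |i| > |j|\},\\
\Omega_5 &= \{(i,j) \in \Omega \mid i < 0,\ j > 0 \text{ and } |i| = |j|\}.
\end{aligned}
\]

For $((i,j),(k,l)) \in \Omega \times \Omega$, let $a = \min\{|i|,|j|\}$, $b = \max\{|i|,|j|\}$, $c = \min\{|k|,|l|\}$, $d = \max\{|k|,|l|\}$. We list all possible subsets of $\Omega \times \Omega$:

\[
\begin{aligned}
C_1 &= \{((i,j),(k,l)) \in \Omega\times\Omega \mid c < d < a < b\}, &
C_2 &= \{((i,j),(k,l)) \in \Omega\times\Omega \mid c < d = a < b\},\\
C_3 &= \{((i,j),(k,l)) \in \Omega\times\Omega \mid c < a < d < b\}, &
C_4 &= \{((i,j),(k,l)) \in \Omega\times\Omega \mid c < a < d = b\},\\
C_5 &= \{((i,j),(k,l)) \in \Omega\times\Omega \mid c < a < b < d\}, &
C_6 &= \{((i,j),(k,l)) \in \Omega\times\Omega \mid c = a < d < b\},\\
C_7 &= \{((i,j),(k,l)) \in \Omega\times\Omega \mid c = a < d = b\}, &
C_8 &= \{((i,j),(k,l)) \in \Omega\times\Omega \mid c = a < b < d\},\\
C_9 &= \{((i,j),(k,l)) \in \Omega\times\Omega \mid a < c < d < b\}, &
C_{10} &= \{((i,j),(k,l)) \in \Omega\times\Omega \mid a < c < d = b\},\\
C_{11} &= \{((i,j),(k,l)) \in \Omega\times\Omega \mid a < c < b < d\}, &
C_{12} &= \{((i,j),(k,l)) \in \Omega\times\Omega \mid a < b = c < d\},\\
C_{13} &= \{((i,j),(k,l)) \in \Omega\times\Omega \mid a < b < c < d\},
\end{aligned}
\]

\[
\begin{aligned}
D_1 &= \{((i,j),(k,l)) \in \Omega_5\times\Omega \mid |i| < c < d\}, &
D_2 &= \{((i,j),(k,l)) \in \Omega_5\times\Omega \mid |i| = c < d\},\\
D_3 &= \{((i,j),(k,l)) \in \Omega_5\times\Omega \mid c < |i| < d\}, &
D_4 &= \{((i,j),(k,l)) \in \Omega_5\times\Omega \mid c < |i| = d\},\\
D_5 &= \{((i,j),(k,l)) \in \Omega_5\times\Omega \mid c < d < |i|\}, &
D_6 &= \{((i,j),(k,l)) \in \Omega\times\Omega_5 \mid |k| < a < b\},\\
D_7 &= \{((i,j),(k,l)) \in \Omega\times\Omega_5 \mid |k| = a < b\}, &
D_8 &= \{((i,j),(k,l)) \in \Omega\times\Omega_5 \mid a < |k| < b\},\\
D_9 &= \{((i,j),(k,l)) \in \Omega\times\Omega_5 \mid a < b = |k|\}, &
D_{10} &= \{((i,j),(k,l)) \in \Omega\times\Omega_5 \mid a < b < |k|\}.
\end{aligned}
\]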
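The claim that the thirteen orderings $C_1, \dots, C_{13}$ cover every configuration with $a < b$ and $c < d$ can be checked mechanically. The following is a minimal sketch, not part of the original argument; the names `CASES`, `classify` and `check_partition` are hypothetical helpers introduced only for this illustration.

```python
from itertools import combinations

# The thirteen relative orderings of a < b and c < d, transcribed from the list C1..C13.
CASES = {
    "C1": lambda a, b, c, d: c < d < a < b,
    "C2": lambda a, b, c, d: c < d == a < b,
    "C3": lambda a, b, c, d: c < a < d < b,
    "C4": lambda a, b, c, d: c < a < d == b,
    "C5": lambda a, b, c, d: c < a < b < d,
    "C6": lambda a, b, c, d: c == a < d < b,
    "C7": lambda a, b, c, d: c == a < d == b,
    "C8": lambda a, b, c, d: c == a < b < d,
    "C9": lambda a, b, c, d: a < c < d < b,
    "C10": lambda a, b, c, d: a < c < d == b,
    "C11": lambda a, b, c, d: a < c < b < d,
    "C12": lambda a, b, c, d: a < b == c < d,
    "C13": lambda a, b, c, d: a < b < c < d,
}


def classify(a, b, c, d):
    """Return the names of all cases whose ordering condition holds."""
    return [name for name, holds in CASES.items() if holds(a, b, c, d)]


def check_partition(n):
    """Check that every (a < b, c < d) with entries in 1..n lies in exactly one case.

    Returns (ok, seen) where seen is the set of case names that occurred.
    """
    seen = set()
    for a, b in combinations(range(1, n + 1), 2):
        for c, d in combinations(range(1, n + 1), 2):
            hits = classify(a, b, c, d)
            if len(hits) != 1:
                return False, seen
            seen.add(hits[0])
    return True, seen
```

Running `check_partition(5)` exercises all thirteen cases, since each ordering involves at most four distinct values.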


D. Grantcharov, J. H. Jung, S.-J. Kang, M. Kim

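We consider all cases for $\Omega_s \times \Omega_t \cap C_i$ ($1 \le s, t \le 4$, $1 \le i \le 13$) and $\Omega_s \times \Omega_t \cap D_i$ ($s = 5$, $1 \le t \le 4$ or $1 \le s \le 4$, $t = 5$, and $1 \le i \le 10$). Since the remaining cases can be checked similarly, we just prove:

\[
L_{i,i} L_{k,l} L_{i,i}^{-1} = q^{\varphi(l,i)-\varphi(k,i)} L_{k,l} \quad \text{if } (k,l) \in \Omega_1 \cup \Omega_2, \tag{2.5}
\]
\[
L_{i,j} L_{k,l} - L_{k,l} L_{i,j} = 0 \quad \text{if } ((i,j),(k,l)) \in \Omega_1 \times \Omega_1 \cap C_1, \tag{2.6}
\]
\[
L_{i,j} L_{k,l} - L_{k,l} L_{i,j} = (q-q^{-1}) L_{i,l} L_{k,j} \quad \text{if } ((i,j),(k,l)) \in \Omega_1 \times \Omega_1 \cap C_2, \tag{2.7}
\]
\[
(L_{i,j})^2 = \frac{q-q^{-1}}{q+q^{-1}} (L_{-i,j})^2 \quad \text{if } (i,j) \in \Omega_2. \tag{2.8}
\]

From (2.4), we obtain

\[
L_{i,j} =
\begin{cases}
\dfrac{L_{j-1,j-1}^{-1}}{q-q^{-1}} \left(L_{j-1,j} L_{i,j-1} - L_{i,j-1} L_{j-1,j}\right) & \text{if } (i,j) \in \Omega_1 \cup \Omega_2,\\[2ex]
\dfrac{L_{-i-1,-i-1}}{q-q^{-1}} \left(L_{i,i+1} L_{i+1,j} - L_{i+1,j} L_{i,i+1}\right) & \text{if } (i,j) \in \Omega_3 \cup \Omega_4.
\end{cases}
\]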

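To prove (2.5), we use induction on $l - k$:

\[
\begin{aligned}
L_{i,i} L_{k,l} L_{i,i}^{-1}
&= \frac{L_{l-1,l-1}^{-1}}{q-q^{-1}}\, L_{i,i} \left(L_{l-1,l} L_{k,l-1} - L_{k,l-1} L_{l-1,l}\right) L_{i,i}^{-1}\\
&= q^{\varphi(l,i)-\varphi(l-1,i)+\varphi(l-1,i)-\varphi(k,i)}\, \frac{L_{l-1,l-1}^{-1}}{q-q^{-1}} \left(L_{l-1,l} L_{k,l-1} - L_{k,l-1} L_{l-1,l}\right)\\
&= q^{\varphi(l,i)-\varphi(k,i)} L_{k,l}.
\end{aligned}
\]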

From (2.3), we know that $f_i f_j - f_j f_i = 0$ if $|i-j| > 1$. By using induction on $j - i$ and (2.5), one can show that $L_{i,j} L_{k,k+1} - L_{k,k+1} L_{i,j} = 0$ when $((i,j),(k,k+1)) \in \Omega_1 \times \Omega_1 \cap C_1$. Similarly, one can prove $L_{i,j} L_{k,l} - L_{k,l} L_{i,j} = 0$ by induction on $l - k$. The proof of (2.7) is analogous (we use induction on $l - k$ and (2.5), (2.6)):

\[
\begin{aligned}
L_{i,j} L_{k,l}
&= \frac{L_{l-1,l-1}^{-1}}{q-q^{-1}}\, L_{i,j} \left(L_{l-1,l} L_{k,l-1} - L_{k,l-1} L_{l-1,l}\right)\\
&= \frac{L_{l-1,l-1}^{-1}}{q-q^{-1}} \left(L_{l-1,l} L_{i,j} L_{k,l-1} + (q-q^{-1}) L_{i,l} L_{l-1,j} L_{k,l-1} - L_{k,l-1} L_{i,j} L_{l-1,l}\right)\\
&= \frac{L_{l-1,l-1}^{-1}}{q-q^{-1}} \left(L_{l-1,l} L_{k,l-1} L_{i,j} + (q-q^{-1}) L_{i,l} L_{l-1,j} L_{k,l-1} - L_{k,l-1} L_{l-1,l} L_{i,j} - (q-q^{-1}) L_{k,l-1} L_{i,l} L_{l-1,j}\right)\\
&= L_{k,l} L_{i,j} + L_{l-1,l-1}^{-1} L_{i,l} \left(L_{l-1,j} L_{k,l-1} - L_{k,l-1} L_{l-1,j}\right)\\
&= L_{k,l} L_{i,j} + (q-q^{-1}) L_{i,l} L_{k,j}.
\end{aligned}
\]

To verify the relation (2.8), it suffices to show that

\[
\left(L_{j-1,j} L_{i,j-1} - L_{i,j-1} L_{j-1,j}\right)^2 = \frac{q-q^{-1}}{q+q^{-1}} \left(L_{j-1,j} L_{-i,j-1} - L_{-i,j-1} L_{j-1,j}\right)^2.
\]


For this purpose, we need the following formulas for $(i,j) \in \Omega_2$, which can be derived using induction:

\[
L_{j-1,j} L_{i,j-1} L_{j-1,j} = \frac{1}{q+q^{-1}} \left(q\, L_{i,j-1} L_{j-1,j}^2 + q^{-1} L_{j-1,j}^2 L_{i,j-1}\right),
\]
\[
q\, L_{-i,j-1} L_{j-1,j}^2 - (q+q^{-1}) L_{j-1,j} L_{-i,j-1} L_{j-1,j} + q^{-1} L_{j-1,j}^2 L_{-i,j-1} = 0.
\]

Using these formulae, we can verify the desired relation:

\[
\begin{aligned}
&\left(L_{j-1,j} L_{i,j-1} - L_{i,j-1} L_{j-1,j}\right)^2\\
&\quad= (L_{j-1,j} L_{i,j-1} L_{j-1,j}) L_{i,j-1} - \frac{q-q^{-1}}{q+q^{-1}} L_{j-1,j} L_{-i,j-1}^2 L_{j-1,j} - L_{i,j-1} L_{j-1,j}^2 L_{i,j-1} + L_{i,j-1} (L_{j-1,j} L_{i,j-1} L_{j-1,j})\\
&\quad= \frac{q-q^{-1}}{q+q^{-1}} \left( \frac{q^{-1}}{q+q^{-1}} L_{j-1,j}^2 L_{-i,j-1}^2 + \frac{q}{q+q^{-1}} L_{-i,j-1}^2 L_{j-1,j}^2 - L_{j-1,j} L_{-i,j-1}^2 L_{j-1,j} \right)\\
&\quad= \frac{q-q^{-1}}{q+q^{-1}} \Big( \big(L_{j-1,j} L_{-i,j-1} L_{j-1,j} - \tfrac{q}{q+q^{-1}} L_{-i,j-1} L_{j-1,j}^2\big) L_{-i,j-1}\\
&\qquad\qquad + L_{-i,j-1} \big(L_{j-1,j} L_{-i,j-1} L_{j-1,j} - \tfrac{q^{-1}}{q+q^{-1}} L_{j-1,j}^2 L_{-i,j-1}\big) - L_{j-1,j} L_{-i,j-1}^2 L_{j-1,j} \Big)\\
&\quad= \frac{q-q^{-1}}{q+q^{-1}} \left(L_{j-1,j} L_{-i,j-1} - L_{-i,j-1} L_{j-1,j}\right)^2.
\end{aligned}
\]

Set $\deg f_i = \deg f_{\bar i} = -\alpha_i$, $\deg q^h = \deg k_{\bar l} = 0$, $\deg e_i = \deg e_{\bar i} = \alpha_i$. Since all the defining relations of the quantum superalgebra $U_q(\mathfrak{g})$ are homogeneous, it has a root space decomposition

\[
U_q(\mathfrak{g}) = \bigoplus_{\alpha \in Q} (U_q)_\alpha,
\]

where $(U_q)_\alpha = \{u \in U_q(\mathfrak{g}) \mid q^h u q^{-h} = q^{\alpha(h)} u \text{ for all } h \in P^\vee\}$.

Remark. If we define $F_i = f_i q^{-k_{i+1}}$, $E_i = q^{k_{i+1}} e_i$, one can see that the relations involving $E_i$, $F_i$ and $q^h$ are the same as the standard relations for $U_q(\mathfrak{gl}_n)$ (see, for example, [HK, Def. 7.1.1]). Hence $U_q(\mathfrak{gl}_n)$ is a subalgebra of $U_q(\mathfrak{g})$.

The comultiplication $\Delta$ of $U_q(\mathfrak{g})$ is given by the formula

\[
\Delta(L_{i,j}) = \sum_{k=i}^{j} L_{i,k} \otimes L_{k,j} \tag{2.9}
\]

(see §4 in [O]). In terms of the new generators we have:


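\[
\begin{aligned}
\Delta(q^h) &= q^h \otimes q^h \quad \text{for every } h \in P^\vee,\\
\Delta(e_i) &= q^{-k_{i+1}} \otimes e_i + e_i \otimes q^{-k_i},\\
\Delta(f_i) &= q^{k_i} \otimes f_i + f_i \otimes q^{k_{i+1}},
\end{aligned}
\]

\[
\begin{aligned}
\Delta(e_{\bar i}) ={}& q^{-k_{i+1}} \otimes e_{\bar i} - (q-q^{-1})\, e_i \otimes k_{\bar i}\\
&+ (q-q^{-1}) \sum_{j=1}^{i-1} (-1)^{j+1}\, q^{\sum_{h=1}^{j} k_{i-j+h}} \prod_{h=1}^{j} \operatorname{ad} e_{i-j+h}\,(e_{i-j}) \otimes q^{-\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \operatorname{ad} f_{i-j+h}\,(f_{\overline{i-j}})\\
&+ (q-q^{-1}) \sum_{j=1}^{i-1} (-1)^{j}\, q^{\sum_{h=1}^{j} k_{i-j+h}} \prod_{h=1}^{j} \operatorname{ad} e_{i-j+h}\,(e_{\overline{i-j}}) \otimes q^{-\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \operatorname{ad} f_{i-j+h}\,(f_{i-j})\\
&+ e_{\bar i} \otimes q^{k_i},
\end{aligned}
\]

\[
\begin{aligned}
\Delta(f_{\bar i}) ={}& q^{-k_i} \otimes f_{\bar i}\\
&+ (q-q^{-1}) \sum_{j=1}^{i-1} (-1)^{j}\, q^{\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \operatorname{ad} e_{i-j+h}\,(e_{i-j}) \otimes q^{-\sum_{h=1}^{j} k_{i-j+h}} \prod_{h=1}^{j} \operatorname{ad} f_{i-j+h}\,(f_{\overline{i-j}})\\
&+ (q-q^{-1}) \sum_{j=1}^{i-1} (-1)^{j+1}\, q^{\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \operatorname{ad} e_{i-j+h}\,(e_{\overline{i-j}}) \otimes q^{-\sum_{h=1}^{j} k_{i-j+h}} \prod_{h=1}^{j} \operatorname{ad} f_{i-j+h}\,(f_{i-j})\\
&+ (q-q^{-1})\, k_{\bar i} \otimes f_i + f_{\bar i} \otimes q^{k_{i+1}},
\end{aligned}
\]

\[
\begin{aligned}
\Delta(k_{\bar i}) ={}& q^{-k_i} \otimes k_{\bar i}\\
&+ (q-q^{-1}) \sum_{j=1}^{i-1} (-1)^{j}\, q^{\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \operatorname{ad} e_{i-j+h}\,(e_{i-j}) \otimes q^{-\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \operatorname{ad} f_{i-j+h}\,(f_{\overline{i-j}})\\
&+ (q-q^{-1}) \sum_{j=1}^{i-1} (-1)^{j+1}\, q^{\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \operatorname{ad} e_{i-j+h}\,(e_{\overline{i-j}}) \otimes q^{-\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \operatorname{ad} f_{i-j+h}\,(f_{i-j})\\
&+ k_{\bar i} \otimes q^{k_i}.
\end{aligned}
\]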
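The shape of these formulas can be read off from (2.9): each summand corresponds to one admissible middle index $k$ between the two indices of $L_{i,j}$, with $k = 0$ excluded. The sketch below only enumerates index pairs and tracks parity; it does not implement the algebra. The helper names `coproduct_terms` and `parity` are hypothetical, and the parity rule (an $L_{i,j}$ is odd exactly when $i$ and $j$ have opposite signs) is an assumption consistent with the generators above.

```python
def coproduct_terms(i, j):
    """Index pairs ((i, k), (k, j)) occurring in Delta(L_{i,j}) = sum_k L_{i,k} (x) L_{k,j}.

    Indices are nonzero integers with i <= j; the middle index k runs over
    the nonzero integers between i and j.
    """
    if i == 0 or j == 0 or i > j:
        raise ValueError("need nonzero indices with i <= j")
    return [((i, k), (k, j)) for k in range(i, j + 1) if k != 0]


def parity(i, j):
    """Assumed parity of L_{i,j}: 1 (odd) when i and j have opposite signs, else 0."""
    return int((i < 0) != (j < 0))
```

For instance, `coproduct_terms(-2, 1)` lists the three tensor factors behind $\Delta(e_{\bar 1})$, and in every term the parities of the two factors add up to the parity of $L_{-2,1}$ modulo 2.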


Let $U_q^+$ (respectively, $U_q^-$) be the subalgebra of $U_q(\mathfrak{g})$ generated by the elements $e_i, e_{\bar i}$ (respectively, $f_i, f_{\bar i}$) for $i = 1, \dots, n-1$, and let $U_q^0$ be the subalgebra of $U_q(\mathfrak{g})$ generated by $q^h$ ($h \in P^\vee$) and $k_{\bar l}$ for $l = 1, \dots, n$. In addition, let $U_q^{\ge 0}$ (respectively, $U_q^{\le 0}$) be the subalgebra of $U_q(\mathfrak{g})$ generated by $U_q^+$ and $U_q^0$ (respectively, by $U_q^-$ and $U_q^0$). We will show that the quantum superalgebra $U_q(\mathfrak{g})$ has a triangular decomposition. For this purpose, we need the following lemma.

Lemma 2.2. $U_q^{\ge 0} \cong U_q^0 \otimes U_q^+$, $\quad U_q^{\le 0} \cong U_q^- \otimes U_q^0$.

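Proof. We will prove the second part. Let $\{f_\zeta \mid \zeta \in \Theta\}$ be a basis of $U_q^-$ consisting of monomials in the $f_i$ and $f_{\bar i}$ ($i \in I$). Consider the set $\Gamma = \{(a_1, \dots, a_n) \mid a_i = 0 \text{ or } 1 \text{ for all } i \in J\}$. Then $\{q^h k_\eta \mid h \in P^\vee, \eta \in \Gamma\}$ is a basis of $U_q^0$, where $k_\eta = k_{\bar 1}^{a_1} \cdots k_{\bar n}^{a_n}$ for $\eta = (a_1, \dots, a_n)$, by [O, Theorem 6.2]. By the defining relations of $U_q(\mathfrak{g})$, it is easy to see that the elements $f_\zeta q^h k_\eta$ ($\zeta \in \Theta$, $h \in P^\vee$, $\eta \in \Gamma$) span $U_q^{\le 0}$. Thus there is a surjective $\mathbb{C}(q)$-linear map $U_q^- \otimes U_q^0 \to U_q^{\le 0}$ given by $f_\zeta \otimes q^h k_\eta \mapsto f_\zeta q^h k_\eta$. To show that this map is injective, it suffices to show that the elements $f_\zeta q^h k_\eta$ ($\zeta \in \Theta$, $h \in P^\vee$, $\eta \in \Gamma$) are linearly independent over $\mathbb{C}(q)$. Suppose

\[
\sum_{\zeta \in \Theta,\ h \in P^\vee,\ \eta \in \Gamma} C_{\zeta,h,\eta}\, f_\zeta q^h k_\eta = 0 \quad \text{for some } C_{\zeta,h,\eta} \in \mathbb{C}(q).
\]

We may write

\[
\sum_{\beta \in Q^+} \Bigg( \sum_{\substack{\deg f_\zeta = -\beta,\\ h \in P^\vee,\ \eta \in \Gamma}} C_{\zeta,h,\eta}\, f_\zeta q^h k_\eta \Bigg) = 0 \quad \text{for some } C_{\zeta,h,\eta} \in \mathbb{C}(q).
\]

Since $U_q(\mathfrak{g}) = \bigoplus_{\beta \in Q} (U_q)_\beta$, we have

\[
\sum_{\substack{\deg f_\zeta = -\beta,\\ h \in P^\vee,\ \eta \in \Gamma}} C_{\zeta,h,\eta}\, f_\zeta q^h k_\eta = 0 \quad \text{for each } \beta \in Q^+.
\]

Write $\beta = \sum_{i=1}^{n-1} m_i \alpha_i$ ($m_i \in \mathbb{Z}_{\ge 0}$), so that $\deg f_\zeta = -\beta$, and let $h_\beta = \sum_{i=1}^{n-1} m_i k_{i+1}$. Since $f_\zeta$ is a monomial in the $f_i$ and $f_{\bar i}$, the term of degree $(-\beta, 0)$ in $\Delta(f_\zeta)$ is $f_\zeta \otimes q^{h_\beta}$. We consider the terms of degree $(0,0)$ in $\Delta(k_\eta)$, where $\eta = (a_1, \dots, a_n)$. These terms can be written as

\[
\begin{aligned}
&(q^{-k_1} \otimes k_{\bar 1} + k_{\bar 1} \otimes q^{k_1})^{a_1} \cdots (q^{-k_n} \otimes k_{\bar n} + k_{\bar n} \otimes q^{k_n})^{a_n}\\
&\quad= \prod_{i=1}^{n} \Bigg( \sum_{j=0}^{a_i} q^{-(a_i-j)k_i} k_{\bar i}^{\,j} \otimes q^{j k_i} k_{\bar i}^{\,a_i-j} \Bigg)\\
&\quad= \sum_{\substack{(j_1,\dots,j_n) \in \Gamma\\ j_i \le a_i,\ i \in J}} \ \prod_{i=1}^{n} q^{-(a_i-j_i)k_i} k_{\bar i}^{\,j_i} \otimes q^{j_i k_i} k_{\bar i}^{\,a_i-j_i}.
\end{aligned}
\]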


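Since the terms of degree $(-\beta, 0)$ of $\sum C_{\zeta,h,\eta}\, \Delta(f_\zeta q^h k_\eta)$ must sum to zero, we have

\[
0 = \sum_{\eta = (a_1,\dots,a_n) \in \Gamma} \ \sum_{\substack{(j_1,\dots,j_n) \in \Gamma\\ j_i \le a_i,\ i \in J}} \ \sum_{\substack{\deg f_\zeta = -\beta,\\ h \in P^\vee}} C_{\zeta,h,\eta}\, f_\zeta q^h \prod_{i=1}^{n} q^{-(a_i-j_i)k_i} k_{\bar i}^{\,j_i} \ \otimes\ q^{h_\beta + h} \prod_{i=1}^{n} q^{j_i k_i} k_{\bar i}^{\,a_i-j_i}. \tag{2.10}
\]

For all $(a_1 - j_1, \dots, a_n - j_n) \in \Gamma$ and $h \in P^\vee$, the elements $q^h \prod_{i=1}^{n} k_{\bar i}^{\,a_i-j_i}$ are linearly independent. Set $\eta_1 := (1, \dots, 1)$. Since there is only one pair of $(a_1, \dots, a_n)$ and $(j_1, \dots, j_n)$ such that $\prod_{i=1}^{n} k_{\bar i}^{\,a_i-j_i} = k_{\eta_1}$ in the above sum, we obtain

\[
\begin{aligned}
0 ={}& \sum_{h \in P^\vee} \Bigg( \sum_{\deg f_\zeta = -\beta} C_{\zeta,h,\eta_1}\, f_\zeta q^{h - \sum_{i=1}^{n} k_i} \Bigg) \otimes q^{h_\beta + h} k_{\eta_1}\\
&+ \sum_{\substack{\eta = (a_1,\dots,a_n),\ (j_1,\dots,j_n),\ h \in P^\vee,\\ (a_1-j_1,\dots,a_n-j_n) \ne \eta_1,\\ \deg f_\zeta = -\beta}} C_{\zeta,h,\eta}\, f_\zeta q^h \prod_{i=1}^{n} q^{-(a_i-j_i)k_i} k_{\bar i}^{\,j_i} \ \otimes\ q^{h_\beta + h} \prod_{i=1}^{n} q^{j_i k_i} k_{\bar i}^{\,a_i-j_i}.
\end{aligned}
\]

Thus we have

\[
\sum_{\deg f_\zeta = -\beta} C_{\zeta,h,\eta_1}\, f_\zeta q^{h - \sum_{i=1}^{n} k_i} = 0 \quad \text{for all } h \in P^\vee.
\]

Multiplying by $q^{-h + \sum_{i=1}^{n} k_i}$ from the right, we obtain

\[
\sum_{\deg f_\zeta = -\beta} C_{\zeta,h,\eta_1}\, f_\zeta = 0 \quad \text{for all } h \in P^\vee.
\]

Using the linear independence of the $f_\zeta$, we conclude that $C_{\zeta,h,\eta_1} = 0$ for all $\zeta \in \Theta$, $h \in P^\vee$.

Now consider a general $\eta = (a_1, \dots, a_n) \in \Gamma$. Assume that for all $\eta' = (a_1', \dots, a_n')$ such that $a_i' \ge a_i$ for all $i \in J$ and $\eta' \ne \eta$, we have $C_{\zeta,h,\eta'} = 0$ for all $\zeta \in \Theta$, $h \in P^\vee$. Then there is only one pair of $(a_1, \dots, a_n)$ and $(j_1, \dots, j_n)$ such that $(a_1 - j_1, \dots, a_n - j_n) = \eta$ in (2.10). Repeating the above argument, we conclude that $C_{\zeta,h,\eta} = 0$ for all $\zeta \in \Theta$, $h \in P^\vee$.

For example, consider $\eta_2 = (0, 1, \dots, 1)$. Since $C_{\zeta,h,\eta_1} = 0$, there is only one pair of $(a_1, \dots, a_n)$ and $(j_1, \dots, j_n)$ such that $(a_1 - j_1, \dots, a_n - j_n) = (0, 1, \dots, 1)$ in (2.10). Thus we have

\[
\sum_{\deg f_\zeta = -\beta} C_{\zeta,h,\eta_2}\, f_\zeta q^{h - \sum_{i=2}^{n} k_i} = 0 \quad \text{for all } h \in P^\vee.
\]

Multiplying by $q^{-h + \sum_{i=2}^{n} k_i}$ and using the linear independence of the $f_\zeta$, we obtain $C_{\zeta,h,\eta_2} = 0$ for all $\zeta \in \Theta$, $h \in P^\vee$.

We are now ready to prove the triangular decomposition for $U_q(\mathfrak{g})$.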


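Theorem 2.3. There is a $\mathbb{C}(q)$-linear isomorphism $U_q(\mathfrak{g}) \cong U_q^- \otimes U_q^0 \otimes U_q^+$.

Proof. Let $\{f_\zeta \mid \zeta \in \Theta\}$, $\{q^h k_\eta \mid h \in P^\vee, \eta \in \Gamma\}$, and $\{e_\tau \mid \tau \in \Theta\}$ be monomial bases of $U_q^-$, $U_q^0$ and $U_q^+$ respectively, where $\Theta$ and $\Gamma$ are the index sets as in the proof of Lemma 2.2. It suffices to show that the elements $f_\zeta q^h k_\eta e_\tau$ ($\zeta, \tau \in \Theta$, $h \in P^\vee$, $\eta \in \Gamma$) are linearly independent over $\mathbb{C}(q)$. Suppose

\[
\sum_{\zeta,h,\eta,\tau} C_{\zeta,h,\eta,\tau}\, f_\zeta q^h k_\eta e_\tau = 0 \quad \text{for some } C_{\zeta,h,\eta,\tau} \in \mathbb{C}(q).
\]

The root space decomposition of $U_q(\mathfrak{g})$ yields

\[
\sum_{\substack{h,\ \eta,\\ \deg f_\zeta + \deg e_\tau = \gamma}} C_{\zeta,h,\eta,\tau}\, f_\zeta q^h k_\eta e_\tau = 0 \quad \text{for all } \gamma \in Q.
\]

Using the partial ordering on $\mathfrak{h}_{\bar 0}^*$, we can choose $\alpha = \deg f_\zeta$ and $\beta = \deg e_\tau$, which are minimal and maximal, respectively, among those for which $\alpha + \beta = \gamma$ and $C_{\zeta,h,\eta,\tau}$ is nonzero. If $\alpha = -\sum m_i \alpha_i$, set $h_\alpha = \sum m_i k_{i+1}$, and if $\beta = \sum n_i \alpha_i$, set $h_\beta = \sum n_i k_{i+1}$. The term of degree $(0, \beta)$ in $\Delta(e_\tau)$ is $q^{-h_\beta} \otimes e_\tau$ and the term of degree $(\alpha, 0)$ in $\Delta(f_\zeta)$ is $f_\zeta \otimes q^{h_\alpha}$. Since the terms of degree $(\alpha, \beta)$ of $\sum C_{\zeta,h,\eta,\tau}\, \Delta(f_\zeta q^h k_\eta e_\tau)$ must sum to zero, we have

\[
\sum_{\substack{\deg f_\zeta = \alpha,\ \deg e_\tau = \beta,\\ h,\ \eta = (a_1,\dots,a_n)}} \ \sum_{\substack{(j_1,\dots,j_n) \in \Gamma\\ j_i \le a_i,\ i \in J}} C_{\zeta,h,\eta,\tau}\, f_\zeta q^h \Bigg( \prod_{i=1}^{n} q^{-(a_i-j_i)k_i} k_{\bar i}^{\,j_i} \Bigg) q^{-h_\beta} \ \otimes\ q^{h_\alpha + h} \Bigg( \prod_{i=1}^{n} q^{j_i k_i} k_{\bar i}^{\,a_i-j_i} \Bigg) e_\tau = 0.
\]

The elements $f_\zeta q^h \prod_{i=1}^{n} k_{\bar i}^{\,j_i}$ are linearly independent for $\zeta \in \Theta$, $h \in P^\vee$, $(j_1, \dots, j_n) \in \Gamma$, by Lemma 2.2. By an argument similar to the one in the proof of Lemma 2.2, we obtain

\[
\sum_{\deg e_\tau = \beta} C_{\zeta,h,\eta,\tau}\, e_\tau = 0 \quad \text{for all } h \in P^\vee,\ \zeta \in \Theta, \text{ and } \eta \in \Gamma.
\]

Using the linear independence of the $e_\tau$, we conclude that $C_{\zeta,h,\eta,\tau} = 0$ for all $\zeta \in \Theta$, $h \in P^\vee$, $\eta \in \Gamma$, and $\tau \in \Theta$, as desired.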


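3. The Quantum Clifford Superalgebra $\mathrm{Cliff}_q(\lambda)$

We first introduce some notation that will be used in this section only. Let $K$ be a field of zero characteristic and $A$ be an associative $K$-algebra. Denote by $\mathrm{Mat}_n(A)$ the associative $K$-algebra of $n \times n$ matrices with entries in $A$. If $A$ is a superalgebra, then $\mathrm{Mat}_n(A)$ is a superalgebra as well by setting $\mathrm{Mat}_n(A)_{\bar\imath} = \mathrm{Mat}_n(A_{\bar\imath})$. By $\mathrm{sMat}_{n|n}(K)$ we denote the associative superalgebra of $2n \times 2n$ matrices $\begin{pmatrix} A & B\\ C & D \end{pmatrix}$, where $A$, $B$, $C$, and $D$ are in $\mathrm{Mat}_n(K)$ and

\[
\mathrm{sMat}_{n|n}(K)_{\bar 0} = \left\{ \begin{pmatrix} A & 0\\ 0 & D \end{pmatrix} \right\}, \qquad
\mathrm{sMat}_{n|n}(K)_{\bar 1} = \left\{ \begin{pmatrix} 0 & B\\ C & 0 \end{pmatrix} \right\}.
\]

Let $Q_n(K)$ be the subsuperalgebra of $\mathrm{sMat}_{n|n}(K)$ with elements $\begin{pmatrix} A & B\\ B & A \end{pmatrix}$. In particular, $Q_n(K)_{\bar 0} = Q_n(K)_{\bar 1} = \mathrm{Mat}_n(K)$. There are $K$-superalgebra isomorphisms

\[
\mathrm{Mat}_r(\mathrm{sMat}_{1|1}(K)) \cong \mathrm{sMat}_{r|r}(K), \qquad \mathrm{Mat}_r(Q_1(K)) \cong Q_r(K).
\]

Note that if $K = \mathbb{C}$, then the superalgebra $Q_n(\mathbb{C})$ coincides with $\mathfrak{g}$ as a complex vector space. Another example of a $K$-superalgebra is any extension $K(\alpha)$ of $K$ of degree 2, considering $\alpha$ as an odd element. If $\alpha^2 = \beta \in K$ we will denote $K(\alpha)$ by $K(\sqrt{\beta})$.

In this section, we set $F = \mathbb{C}(q)$. For every $\lambda \in P$ we define $I^q(\lambda)$ to be the left ideal of $U_q^0$ generated by $q^h - q^{\lambda(h)} 1$, $h \in P^\vee$. Set $\mathrm{Cliff}_q(\lambda) := U_q^0 / I^q(\lambda)$. We may consider $\mathrm{Cliff}_q(\lambda)$ as the associative $F$-algebra generated by the identity $1 = 1 + I^q(\lambda)$ and $t_{\bar i} := k_{\bar i} + I^q(\lambda)$ satisfying the relations

\[
t_{\bar i} t_{\bar j} + t_{\bar j} t_{\bar i} = \delta_{ij}\, \frac{2(q^{2\lambda_i} - q^{-2\lambda_i})}{q^2 - q^{-2}}\, 1, \qquad i, j = 1, \dots, n.
\]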

Furthermore, $\mathrm{Cliff}_q(\lambda)$ has an obvious $\mathbb{Z}_2$-grading (and thus a superalgebra structure) by assuming that the $t_{\bar i}$ are odd. More precisely, $\mathrm{Cliff}_q(\lambda)_{\bar 0}$ is spanned by 1 and the monomials $t_{\bar i_1} \cdots t_{\bar i_{2k}}$ of even degree, while $\mathrm{Cliff}_q(\lambda)_{\bar 1}$ is spanned by those of odd degree. In this section we will describe the structure of $\mathrm{Cliff}_q(\lambda)$ and will classify its irreducible modules. Because of its superalgebra structure, $\mathrm{Cliff}_q(\lambda)$ has both $\mathbb{Z}_2$-graded and nongraded modules, and both cases will be addressed.

The results in this section may be derived from more general statements about quadratic forms and Clifford superalgebras over arbitrary fields (see, for example, [Lam] and [Sh]). For the sake of completeness we will give an outline of the proofs. The results and the proofs in this section will also help us to describe explicitly the action of $U_q^0$ on the highest weight vectors of an irreducible highest weight module over $U_q(\mathfrak{g})$. This is demonstrated in Example 3.10 for the case $n = 3$ and $\lambda = (4, 2, 1)$.

In this section, we fix $V := \bigoplus_{i=1}^{n} F t_{\bar i}$ and $\Lambda := (\Lambda_1, \dots, \Lambda_n) \in F^n$ and denote by $B_\Lambda : V \times V \to F$ the symmetric bilinear form defined by $B_\Lambda(t_{\bar i}, t_{\bar j}) = \delta_{ij} \Lambda_i$. Let $\mathrm{Cliff}_q(\Lambda)$ be the unique (up to isomorphism) Clifford algebra associated to $V$ and $B_\Lambda$. If $\Lambda_i = \dfrac{q^{2\lambda_i} - q^{-2\lambda_i}}{q^2 - q^{-2}}$, then we have $\mathrm{Cliff}_q(\Lambda) \cong \mathrm{Cliff}_q(\lambda)$.

Define $V(\Lambda) := V / \ker B_\Lambda$, where $\ker B_\Lambda := \{v \in V \mid B_\Lambda(v, u) = 0 \text{ for every } u \in V\}$, and denote by $\beta_\Lambda$ the restriction of $B_\Lambda$ on $V(\Lambda)$. Let $N_\Lambda = \{i \mid \Lambda_i \ne 0\}$, $Z_\Lambda = \{j \mid \Lambda_j = 0\}$, and $|\Lambda| = \# N_\Lambda$. Set $\Lambda_N := (\Lambda_{i_1}, \dots, \Lambda_{i_{|\Lambda|}})$ and $0_Z := (\Lambda_{j_1}, \dots, \Lambda_{j_{n-|\Lambda|}}) = (0, \dots, 0)$, where $N_\Lambda = \{i_1, \dots, i_{|\Lambda|}\}$, $Z_\Lambda = \{j_1, \dots, j_{n-|\Lambda|}\}$, and $i_1 < \cdots < i_{|\Lambda|}$. It is clear that $\ker B_\Lambda = \bigoplus_{j \in Z_\Lambda} F t_{\bar j}$, that $V(\Lambda) \cong \bigoplus_{i \in N_\Lambda} F t_{\bar i}$, and that $\mathrm{Cliff}_q(\Lambda_N)$ is the Clifford algebra corresponding to $(V(\Lambda), \beta_\Lambda)$. Furthermore,

\[
\mathrm{Cliff}_q(\Lambda) \cong \mathrm{Cliff}_q(\Lambda_N) \otimes_F \mathrm{Cliff}_q(0_Z) \cong \mathrm{Cliff}_q(\Lambda_N) \otimes_F \textstyle\bigwedge \ker B_\Lambda.
\]

Here $\bigwedge W$ denotes the exterior algebra of the vector space $W$. Thanks to the above isomorphisms every $\mathrm{Cliff}_q(\Lambda)$-module can be considered as a $\mathrm{Cliff}_q(\Lambda_N)$-module under the embedding $\mathrm{Cliff}_q(\Lambda_N) = \mathrm{Cliff}_q(\Lambda_N) \otimes_F 1 \to \mathrm{Cliff}_q(\Lambda_N) \otimes_F \mathrm{Cliff}_q(0_Z)$. The class $\bar\Delta(\Lambda)$ of $\Delta(\Lambda) = \prod_{i \in N_\Lambda} \Lambda_i$ in $\dot F / \dot F^2$ is called the discriminant of $(V, B_\Lambda)$.

The following lemma is standard and the proof is left to the reader.

Lemma 3.1. Let $M$ be an irreducible $\mathrm{Cliff}_q(\Lambda)$-module. Then $M$ is an irreducible $\mathrm{Cliff}_q(\Lambda_N)$-module and $t_{\bar i} v = 0$ for every $i \in Z_\Lambda$. Conversely, if $M_0$ is an irreducible $\mathrm{Cliff}_q(\Lambda_N)$-module, then $M_0$ considered as a $\mathrm{Cliff}_q(\Lambda)$-module with trivial action of $\mathrm{Cliff}_q(0_Z)$ is irreducible as well.

Since our goal in this section is to classify the irreducible representations of $\mathrm{Cliff}_q(\Lambda)$, thanks to the above lemma we may assume that the $\Lambda_i$ are nonzero. So, for simplicity, we fix $Z_\Lambda = \emptyset$, and thus $B_\Lambda = \beta_\Lambda$ and $V(\Lambda) = V$, in all statements preceding Corollary 3.9.

Recall that a vector $v$ in $V$ is called $\beta_\Lambda$-isotropic (or simply isotropic) if $\beta_\Lambda(v, v) = 0$. A subspace $W$ of $V$ is a $\beta_\Lambda$-isotropic subspace if $\beta_\Lambda(u, w) = 0$ for every $u$ and $w$ in $W$. A subspace $W$ of $V$ is anisotropic if it contains no nonzero $\beta_\Lambda$-isotropic vector. An isotropic subspace $W$ of $V$ is maximal isotropic if there is no larger $\beta_\Lambda$-isotropic subspace containing $W$.

Lemma 3.2. Let $W$ be an isotropic subspace of $V$. Then there exist an isotropic subspace $W^*$ and a subspace $Z$ of $V$ such that

\[
V = Z \oplus W \oplus W^*, \quad \dim W = \dim W^*, \quad \beta_\Lambda(z, w) = \beta_\Lambda(z, w^*) = 0 \ \text{ for every } z \in Z,\ w \in W,\ w^* \in W^*.
\]

Moreover, there exist bases $\{w_1, \dots, w_m\}$ and $\{w_1^*, \dots, w_m^*\}$ of $W$ and $W^*$, respectively, such that $\beta_\Lambda(w_i, w_j^*) = \delta_{ij}$.

Proof. The lemma follows by induction on $\dim W$. If $\dim W = 1$, then $W^*$ is spanned by $w_1^* = x - \frac{1}{2} \beta_\Lambda(x, x) w_1$, where $x \in V$ is arbitrarily chosen so that $\beta_\Lambda(w_1, x) = 1$. Then we define $Z$ to be $Z = \{z \in V \mid \beta_\Lambda(z, w_1) = \beta_\Lambda(z, w_1^*) = 0\}$. For the complete proof, see [Sh, Lemma 1.3].

The decomposition $V = Z \oplus W \oplus W^*$ in Lemma 3.2 is called a weak Witt decomposition of $V$. For any weak Witt decomposition $V = Z \oplus W \oplus W^*$, we denote by $\mathrm{Cliff}_q(\Lambda_Z)$ the Clifford algebra corresponding to $(Z, \beta_\Lambda|_Z)$. If $V = Z \oplus W \oplus W^*$ is a weak Witt decomposition for which $Z$ is anisotropic (or, equivalently, $W$ is maximal isotropic), we call it a Witt decomposition. We may identify $W^*$ with the dual space of $W$ via the nondegenerate form $\beta_\Lambda$. If $V = Z \oplus W \oplus W^*$ is a Witt decomposition, the dimension of $W$ is an invariant of $(V, \beta_\Lambda)$ (see [Sh, Lemma 1.4]) and is known as the


Witt index of the form $\beta_\Lambda$. We say that the Witt index is maximal if $\dim Z \le 1$. Recall that if the ground field is $\mathbb{C}$, the Witt index is always maximal. In the case of arbitrary $F$, though, the Witt index is generally not maximal, as we verify in Lemma 3.6. In order to find a Witt decomposition and the Witt index of $(V, \beta_\Lambda)$ we need some preparatory statements.

Lemma 3.3. Let $V = Z \oplus W \oplus W^*$ be a weak Witt decomposition and let $m = 2^{\dim W}$. Then $\mathrm{Cliff}_q(\Lambda) \cong \mathrm{Mat}_m(\mathrm{Cliff}_q(\Lambda_Z))$. Moreover, we have

\[
\mathrm{Cliff}_q(\Lambda)_{\bar 0} \cong
\begin{cases}
\mathrm{Mat}_m(\mathrm{Cliff}_q(\Lambda_Z)_{\bar 0}) & \text{if } Z \ne 0,\\
\mathrm{Mat}_{m/2}(F) \oplus \mathrm{Mat}_{m/2}(F) & \text{if } Z = 0.
\end{cases}
\]

Proof. For the complete proof, see [Sh, Theorem 2.6]. The proof follows by induction on $\dim W$. We sketch the proof for $\dim W = 1$. In this case there is an isomorphism $\Phi : \mathrm{Cliff}_q(\Lambda) \to \mathrm{Mat}_m(\mathrm{Cliff}_q(\Lambda_Z))$ defined by its restriction $\Phi|_V$ on $V$:

\[
z + r w_1 + s w_1^* \mapsto \begin{pmatrix} z & r\\ s & -z \end{pmatrix}.
\]

Notice that if $Z \ne 0$, $\Phi$ is not necessarily parity preserving. In such a case we choose the isomorphism $\Phi' : \mathrm{Cliff}_q(\Lambda) \to \mathrm{Mat}_m(\mathrm{Cliff}_q(\Lambda_Z))$ defined by $\Phi'(\alpha) = D^{-1} \Phi(\alpha) D$, where $D = \begin{pmatrix} g & 0\\ 0 & 1 \end{pmatrix}$ and $g$ is any element of $Z$ with $\beta_\Lambda(g, g) \ne 0$.

Lemma 3.4. The nondegenerate Legendre's equation always has a nontrivial solution in $F$: for every nonzero $A, B, C$ in $F$, there exist $X, Y, Z \in F$ with $(X, Y, Z) \ne (0, 0, 0)$ such that $A X^2 + B Y^2 + C Z^2 = 0$.

Proof. We modify the proof of the classical Legendre's Theorem (see, for example, [IR, §17.3]). We first assume that $A, B, C, X, Y, Z$ are polynomials in $\mathbb{C}[q]$, where $A, B, C$ are square free. We may fix $C = -1$, since if $(X, Y, Z)$ is a solution of $AC X^2 + BC Y^2 = Z^2$, then $(X, Y, \sqrt{-1}\, Z / C)$ is a solution of $A X^2 + B Y^2 + C Z^2 = 0$. We prove that $A X^2 + B Y^2 = Z^2$ has a nontrivial solution by induction on $N := \max\{\deg A, \deg B\}$. If $N = 0$, i.e., $A$ and $B$ are constant polynomials, then $A X^2 + B Y^2 = Z^2$ has a solution (constant polynomials). Assume that $\deg B \le \deg A$ and $\deg A \ge 1$.

Recall that every polynomial $R \in \mathbb{C}[q]$ is a quadratic residue modulo any square free polynomial $S$. Indeed, if $S$ is constant, our assertion is obvious. Otherwise, let $S(q) = \prod_{i=1}^{r} (q - z_i)$ with $z_i \ne z_j$ for $i \ne j$, and let $y_i \in \mathbb{C}$ be such that $y_i^2 = R(z_i)$. Then $y_i^2 \equiv R \pmod{(q - z_i)}$. Using the Chinese Remainder Theorem, we find $y \in \mathbb{C}[q]$ for which $y \equiv y_i \pmod{(q - z_i)}$. But then $y^2 \equiv R \pmod{(q - z_i)}$ and thus $y^2 \equiv R \pmod{S}$.

We fix $C_1$ with $\deg C_1 < \deg A$ such that $C_1^2 \equiv B \pmod{A}$. Then $C_1^2 - B = A T = A A_1 M^2$ for some square free polynomial $A_1$. Since $\deg A + \deg A_1 \le \deg(A A_1 M^2) = \deg(C_1^2 - B) < 2 \deg A$, we have $0 \le \deg A_1 < \deg A$. Now we observe that if $(X_1, Y_1, Z_1)$ is a solution of $A_1 X^2 + B Y^2 = Z^2$, then $(A_1 X_1 M,\ C_1 Y_1 + Z_1,\ Z_1 C_1 + B Y_1)$ is a solution of $A X^2 + B Y^2 = Z^2$. Using the induction hypothesis, we complete the proof.

Remark. Lemma 3.4 may be proved with a standard algebro-geometric argument using dimensions; see, for example, [Har, Exercise 11.6]. The lemma is also a particular case of the following theorem of Tsen-Lang: if $K$ is a field of transcendence degree $n$ over an algebraically closed field $k$, then any quadratic form over $K$ of dimension bigger than $2^n$ is isotropic. For details, see [Lam, Chap. XI].

In what follows, we assume $\Lambda_i = \dfrac{q^{2\lambda_i} - q^{-2\lambda_i}}{q^2 - q^{-2}}$. For simplicity, we will write $\beta_\lambda$, $|\lambda|$, and $\Delta(\lambda)$ for $\beta_\Lambda$, $|\Lambda|$, and $\Delta(\Lambda)$, respectively. The following technical lemma can be easily verified.

Lemma 3.5. Define an equivalence relation $\sim$ in $\{\lambda_i \mid i = 1, \dots, n\}$ by $\lambda_i \sim \lambda_j$ if $\lambda_i^2 = \lambda_j^2$, and denote by $o(\lambda_i)$ the orbit of $\lambda_i$ relative to $\sim$. Then $\bar\Delta(\lambda) = \bar 1$ (or, equivalently, $\Delta(\lambda)$ is a square in $F$) if and only if the orbit $o(\lambda_i)$ of every $\lambda_i \ne \pm 1$ contains an even number of elements.

Lemma 3.6. The space $V$ is anisotropic if and only if $\dim V = 1$, or $\dim V = 2$ and $\bar\Delta(\lambda) \ne \bar 1$. If $V$ is isotropic, there is a Witt decomposition $V = Z \oplus W \oplus W^*$ of $V$ such that

(1) $\dim W = k$ if $\dim V = 2k + 1$, $k \ge 1$ (maximal Witt index);
(2) $\dim W = k - 1$ if $\dim V = 2k$ and $\bar\Delta(\lambda) \ne \bar 1$;
(3) $\dim W = k$ if $\dim V = 2k$ and $\bar\Delta(\lambda) = \bar 1$ (maximal Witt index).

In particular, if $\lambda_1 > \lambda_2 > \cdots > \lambda_n > 0$, then $\dim W = \lfloor \frac{n-1}{2} \rfloor$.

Proof. The proof consists of several steps.

Step 1. The case $\dim V = 1$. This case is straightforward.

Step 2. The case $\dim V = 2$. In this case, $v = a_1 t_{\bar 1} + a_2 t_{\bar 2}$ is $\beta_\lambda$-isotropic if and only if $a_1^2 \Lambda_1 + a_2^2 \Lambda_2 = 0$. The latter equation has a nonzero solution for $a_1$ and $a_2$ if and only if $\Lambda_1 / \Lambda_2$ is a square (or, equivalently, $\Lambda_1 \Lambda_2$ is a square).

Step 3. If $\dim V \ge 3$, then $V \cong F w \oplus F w^* \oplus F v_3 \oplus \cdots \oplus F v_n$, where $\beta_\lambda(w, w) = \beta_\lambda(w^*, w^*) = \beta_\lambda(w, v_i) = \beta_\lambda(w^*, v_i) = 0$ for $i \ge 3$, $\beta_\lambda(w, w^*) = 1$, $\beta_\lambda(v_3, v_3) = \Lambda_1 \Lambda_2 \Lambda_3$, and $\beta_\lambda(v_i, v_i) = \Lambda_i$ if $i \ge 4$.

Let us first consider the case $\dim V = 3$. We use Lemma 3.4 to find $w = x_1 t_{\bar 1} + x_2 t_{\bar 2} + x_3 t_{\bar 3}$ such that $\beta_\lambda(w, w) = 0$. Applying Lemma 3.2 to $W = F w$, we find $w^* = y_1 t_{\bar 1} + y_2 t_{\bar 2} + y_3 t_{\bar 3}$ and $z = z_1 t_{\bar 1} + z_2 t_{\bar 2} + z_3 t_{\bar 3}$ such that $\beta_\lambda(w^*, w^*) = \beta_\lambda(w^*, z) = \beta_\lambda(w, z) = 0$ and $\beta_\lambda(w, w^*) = 1$. The choice of $z$ is unique up to multiplication by a nonzero constant in $F$. A simple calculation shows that the $z_i$ may be chosen as follows:

\[
\begin{aligned}
z_1 &= \sqrt{-1}\, \Lambda_2 \Lambda_3 (x_2 y_3 - x_3 y_2),\\
z_2 &= \sqrt{-1}\, \Lambda_1 \Lambda_3 (x_3 y_1 - x_1 y_3),\\
z_3 &= \sqrt{-1}\, \Lambda_1 \Lambda_2 (x_1 y_2 - x_2 y_1).
\end{aligned}
\]

Then one can easily verify that $\beta_\lambda(z, z) = \Lambda_1 \Lambda_2 \Lambda_3$.


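In the case $\dim V > 3$, write $V = F t_{\bar 1} \oplus F t_{\bar 2} \oplus F t_{\bar 3} \oplus \bigoplus_{i \ge 4} F t_{\bar i}$. Fix $w, w^*, z \in F t_{\bar 1} \oplus F t_{\bar 2} \oplus F t_{\bar 3}$ as above, and set $v_3 = z$ and $v_i = t_{\bar i}$ for $i \ge 4$.

Step 4. If $\dim V \ge 3$, then $V$ has a Witt decomposition $V \cong Z \oplus W \oplus W^*$, where

\[
\dim Z =
\begin{cases}
0 & \text{if } \dim V \text{ is even and } \Lambda_1 \Lambda_2 \cdots \Lambda_n \text{ is a square},\\
1 & \text{if } \dim V \text{ is odd},\\
2 & \text{if } \dim V \text{ is even and } \Lambda_1 \Lambda_2 \cdots \Lambda_n \text{ is not a square}.
\end{cases}
\]

This follows from an inductive argument using Step 1, Step 2, and Step 3.

Lemma 3.7. (1) Assume that $\dim V = 1$. Then

\[
\mathrm{Cliff}_q(\lambda) \cong
\begin{cases}
Q_1(F) & \text{if } \bar\Delta(\lambda) = \bar 1 \text{ (equivalently, } \Lambda_1 \text{ is a square in } F),\\
F(\sqrt{\Lambda_1}) & \text{if } \bar\Delta(\lambda) \ne \bar 1 \text{ (equivalently, } \Lambda_1 \text{ is not a square in } F).
\end{cases}
\]

(2) Assume that $\dim V = 2$. Then $\mathrm{Cliff}_q(\lambda) \cong \mathrm{Mat}_2(F)$ as (nongraded) algebras and $\mathrm{Cliff}_q(\lambda)_{\bar 0} \cong \mathrm{Cliff}_q(\Lambda_1 \Lambda_2)$.

Proof. The case (1) corresponds to the "classical case" (Clifford superalgebra over $\mathbb{C}$) and can be easily verified. (2) Let $A = \mathrm{Cliff}_q(\lambda)$. Then $A$ is a quaternion algebra over $F$. Since it is not a division algebra, by Wedderburn's Theorem we have $A \cong \mathrm{Mat}_2(F)$ (see [Lam, Theorem 2.7] for details). The isomorphism $A_{\bar 0} \cong \mathrm{Cliff}_q(\Lambda_1 \Lambda_2)$ is straightforward.

Remark. The superalgebraic structure of $\mathrm{Cliff}_q(\lambda)$ for $\dim V = 2$ is "explicit" only when $\bar\Delta(\lambda) = \bar 1$. In this case, one can show that $\mathrm{Cliff}_q(\lambda) \cong \mathrm{sMat}_{1|1}(F)$.

We are now ready to describe the superalgebra structure of $\mathrm{Cliff}_q(\lambda)$.

Proposition 3.8. (1) If $n$ is even, then $\mathrm{Cliff}_q(\lambda) \cong \mathrm{Mat}_r(A)$, where $A = \mathrm{Cliff}_q((\Delta(\lambda), 1))$ and $r = 2^{\frac{n}{2} - 1}$. Furthermore, $\mathrm{Cliff}_q(\lambda) \cong \mathrm{Mat}_{2r}(F)$ as (nongraded) algebras and

\[
\mathrm{Cliff}_q(\lambda)_{\bar 0} \cong
\begin{cases}
\mathrm{Mat}_r(F) \oplus \mathrm{Mat}_r(F) & \text{if } \bar\Delta(\lambda) = \bar 1,\\
\mathrm{Mat}_r\big(F(\sqrt{\Delta(\lambda)})\big) & \text{if } \bar\Delta(\lambda) \ne \bar 1.
\end{cases}
\]

(2) If $n$ is odd, then $\mathrm{Cliff}_q(\lambda) \cong \mathrm{Mat}_r(B)$, where $B = \mathrm{Cliff}_q((\Delta(\lambda)))$ and $r = 2^{\frac{n-1}{2}}$. Furthermore,

\[
\begin{aligned}
&\mathrm{Cliff}_q(\lambda) \cong Q_r(F), && \mathrm{Cliff}_q(\lambda)_{\bar 0} \cong \mathrm{Mat}_r(F) && \text{if } \bar\Delta(\lambda) = \bar 1,\\
&\mathrm{Cliff}_q(\lambda) \cong \mathrm{Mat}_r\big(F(\sqrt{\Delta(\lambda)})\big), && \mathrm{Cliff}_q(\lambda)_{\bar 0} \cong \mathrm{Mat}_r(F) && \text{if } \bar\Delta(\lambda) \ne \bar 1.
\end{aligned}
\]

In particular, $\mathrm{Cliff}_q(\lambda)$ is a simple superalgebra which is isomorphic to
• a direct sum of two isomorphic simple algebras if $n$ is odd and $\bar\Delta(\lambda) = \bar 1$;
• a simple algebra otherwise.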
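Taken together, Lemma 3.5 and Proposition 3.8 give a purely combinatorial recipe for reading off the structure of $\mathrm{Cliff}_q(\lambda)$ from $\lambda$. The following is a minimal sketch of that bookkeeping, assuming that all entries of $\lambda$ are nonzero integers; the function names and the string labels such as `"Mat_2(F)"` are hypothetical and only serve as readable tags for the cases of the proposition.

```python
from collections import Counter


def discriminant_is_square(lam):
    """Criterion of Lemma 3.5: every orbit {l : l^2 = lam_i^2} with lam_i != +-1 has even size.

    Assumes every entry of lam is a nonzero integer.
    """
    if any(x == 0 for x in lam):
        raise ValueError("drop the zero entries of lambda first")
    orbit_sizes = Counter(abs(x) for x in lam if abs(x) != 1)
    return all(size % 2 == 0 for size in orbit_sizes.values())


def clifford_structure(lam):
    """Return r and labels for Cliff_q(lambda) and its even part, following Proposition 3.8."""
    n = len(lam)
    if n == 0:
        raise ValueError("lambda must have at least one entry")
    square = discriminant_is_square(lam)
    if n % 2 == 0:
        r = 2 ** (n // 2 - 1)
        even = f"Mat_{r}(F) + Mat_{r}(F)" if square else f"Mat_{r}(F(sqrt(Delta)))"
        return {"r": r, "algebra": f"Mat_{2 * r}(F)", "even_part": even}
    r = 2 ** ((n - 1) // 2)
    algebra = f"Q_{r}(F)" if square else f"Mat_{r}(F(sqrt(Delta)))"
    return {"r": r, "algebra": algebra, "even_part": f"Mat_{r}(F)"}
```

For $\lambda = (4, 2, 1)$ this returns $r = 2$ and the label `Mat_2(F(sqrt(Delta)))`, which is the case treated in Example 3.10.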

Proof. We first consider the case when $n$ is even and let $r = 2^{\frac{n}{2} - 1}$. If $\Lambda_1 \cdots \Lambda_n$ is a square in $F$, then (1) is proved by Lemma 3.6 (3) and Lemma 3.3. Now if $\bar\Delta(\lambda) \ne \bar 1$, by Lemma 3.3 and Step 3 in the proof of Lemma 3.6, we have $\mathrm{Cliff}_q(\Lambda) \cong \mathrm{Mat}_r(A)$, where $A = \mathrm{Cliff}_q(\Lambda_1 \cdots \Lambda_{n-1}, \Lambda_n)$. We now apply Lemma 3.7 (1), (2) and prove (1).

Next, assume that $n$ is odd and let $r = 2^{\frac{n-1}{2}}$. By Lemma 3.3 and Step 3 in the proof of Lemma 3.6, we have $\mathrm{Cliff}_q(\lambda) \cong \mathrm{Mat}_r(B)$, where $B$ is the 2-dimensional Clifford superalgebra $\mathrm{Cliff}_q(\Lambda_1 \cdots \Lambda_n)$. We use Lemma 3.7 (1) to complete the proof.

In the statement of the following corollary we allow $\lambda_i$ to be zero for some $i$. Recall that $|\lambda|$ is the number of nonzero $\lambda_i$. We also set $\lambda_N := (\lambda_{i_1}, \dots, \lambda_{i_{|\lambda|}})$, where $N_\lambda = \{i_1, \dots, i_{|\lambda|}\}$ and $i_1 < \cdots < i_{|\lambda|}$.

Corollary 3.9. Every $\mathbb{Z}_2$-graded $\mathrm{Cliff}_q(\lambda_N)$-module is completely reducible. Furthermore, the superalgebra $\mathrm{Cliff}_q(\lambda)$ has, up to isomorphism,

(1) two simple modules $E^q(\lambda)$ and $\Pi(E^q(\lambda))$ of dimension $2^{k-1} | 2^{k-1}$ if $|\lambda| = 2k$ and $\bar\Delta(\lambda) = \bar 1$;
(2) one simple module $E^q(\lambda) \cong \Pi(E^q(\lambda))$ of dimension $2^k | 2^k$ if $|\lambda| = 2k$ and $\bar\Delta(\lambda) \ne \bar 1$ (in particular, if $\lambda_1 > \cdots > \lambda_{2k} > 0$);
(3) one simple module $E^q(\lambda) \cong \Pi(E^q(\lambda))$ of dimension $2^k | 2^k$ if $|\lambda| = 2k + 1$.

Proof. Thanks to Lemma 3.1, we may assume that $\lambda_i \ne 0$; i.e., $|\lambda| = n$. The category of all $\mathbb{Z}_2$-graded $\mathrm{Cliff}_q(\lambda)$-modules is equivalent to the category of all nongraded $\mathrm{Cliff}_q(\lambda)_{\bar 0}$-modules. Indeed, the reverse correspondence is obtained by $V_0 \mapsto \mathrm{Cliff}_q(\lambda) \otimes_{\mathrm{Cliff}_q(\lambda)_{\bar 0}} V_0$. The corollary follows from Proposition 3.8 and the characterization of the simple and indecomposable (nongraded) modules of $\mathrm{Mat}_r(F) \oplus \mathrm{Mat}_r(F)$, $\mathrm{Mat}_r(F)$, and $\mathrm{Mat}_r(F(\sqrt{\Delta(\lambda)}))$. (This characterization may be found, for example, in [Lang, Chap. XVII].)

Example 3.10. Let $n = 3$ and $\lambda = (4, 2, 1)$. We describe the action of $t_{\bar i}$ ($i = 1, 2, 3$) on $E^q(\lambda)$. We have $\Lambda_1 = (q^2 + q^{-2})(q^4 + q^{-4})$, $\Lambda_2 = q^2 + q^{-2}$, $\Lambda_3 = 1$. For simplicity, let $t = q^2 + q^{-2}$. We first find a solution of Legendre's equation

\[
\Lambda_1 X^2 + \Lambda_2 Y^2 + \Lambda_3 Z^2 = 0. \tag{3.1}
\]

We follow the proof of Lemma 3.4. Let $Z = t Z'$ and $Y = \sqrt{-1}\, Y'$. In order to solve the equation $(t^2 - 2) X^2 + t Z'^2 = Y'^2$ we find $C_1 \in \mathbb{C}[t]$ for which $C_1^2 - t$ is a multiple of $t^2 - 2$. Using the Chinese Remainder Theorem, we choose

\[
C_1 = \frac{\sqrt[4]{8}}{4} (1 - \sqrt{-1})\, t + \frac{\sqrt[4]{2}}{2} (1 + \sqrt{-1}).
\]

Then we solve the equation $A_1 X_1^2 + B Z_1^2 = Y_1^2$ for $A_1 = -\frac{\sqrt{2}}{4} \sqrt{-1}$ and $B = t$. A solution for this is

\[
(X_1, Y_1, Z_1) = \Big(1,\ \frac{\sqrt[4]{8}}{4} (1 - \sqrt{-1}),\ 0\Big).
\]


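Then (3.1) has a solution

\[
\big(A_1 X_1,\ \sqrt{-1}\,(Y_1 C_1 + B Z_1),\ t (C_1 Z_1 + Y_1)\big)
= \Big( -\frac{\sqrt{2}}{4} \sqrt{-1},\ \frac{\sqrt{2}}{4}\, t + \frac{1}{2} \sqrt{-1},\ \frac{\sqrt[4]{8}}{4} (1 - \sqrt{-1})\, t \Big).
\]

Multiplying by an appropriate constant and changing signs, we fix the following solution of (3.1):

\[
w = (X, Y, Z) = \big(1,\ \sqrt{-1}\, t - \sqrt{2},\ \sqrt[4]{2}\, (1 + \sqrt{-1})\, t\big).
\]

We consider $w$ as an element in $V$ relative to the basis $\{t_{\bar 1}, t_{\bar 2}, t_{\bar 3}\}$. We use Lemma 3.2 to find a Witt decomposition $V = F w \oplus F w^* \oplus F z$. As mentioned in the proof of Lemma 3.2, we find

\[
w^* = c\,(X, Y, -Z), \qquad \text{where } c = \frac{\sqrt{-1}\, t^{-2}}{4 \sqrt{2}},
\]

such that $\beta_\lambda(w, w^*) = 1$ and $\beta_\lambda(w^*, w^*) = 0$. Then, as pointed out in Step 3 of the proof of Lemma 3.6, we can find

\[
z = c\,\big(t Y Z,\ -t (t^2 - 2) X Z,\ 0\big)
\]

such that $\beta_\lambda(z, w) = \beta_\lambda(z, w^*) = 0$ and $\beta_\lambda(z, z) = -\frac{1}{4} \Lambda_1 \Lambda_2 \Lambda_3 = -\frac{1}{4} t^2 (t^2 - 2)$. Set $\alpha = \sqrt{\Delta(\lambda)} = \sqrt{t^2 (t^2 - 2)}$. Using Lemma 3.3, we define an isomorphism $\Phi : \mathrm{Cliff}_q(\lambda) \to \mathrm{Mat}_2(F(\alpha))$ by

\[
z \mapsto \begin{pmatrix} \alpha & 0\\ 0 & -\alpha \end{pmatrix}, \qquad
w \mapsto \begin{pmatrix} 0 & \alpha^{-1}\\ 0 & 0 \end{pmatrix}, \qquad
w^* \mapsto \begin{pmatrix} 0 & 0\\ \alpha & 0 \end{pmatrix}.
\]

From Proposition 3.8 and Corollary 3.9 we find that $E^q(\lambda) = F(\alpha)^{\oplus 2}$. Let $v_1$ and $v_2$ be the standard basis vectors of the $F(\alpha)$-vector space $E^q(\lambda)$, and let $\bar v_i = \alpha v_i$ ($i = 1, 2$). The action of $\mathrm{Cliff}_q(\lambda)$ on $E^q(\lambda)$ is given by

\[
\begin{aligned}
z(v_1) &= \bar v_1, & z(v_2) &= -\bar v_2, & z(\bar v_1) &= t^2 (t^2 - 2) v_1, & z(\bar v_2) &= -t^2 (t^2 - 2) v_2,\\
w(v_1) &= 0, & w(v_2) &= (t^2 (t^2 - 2))^{-1} \bar v_1, & w(\bar v_1) &= 0, & w(\bar v_2) &= v_1,\\
w^*(v_1) &= \bar v_2, & w^*(v_2) &= 0, & w^*(\bar v_1) &= t^2 (t^2 - 2) v_2, & w^*(\bar v_2) &= 0.
\end{aligned}
\]

In order to determine the action of $t_{\bar i}$ ($i = 1, 2, 3$) on $E^q(\lambda)$, we need to express $t_{\bar 1}, t_{\bar 2}, t_{\bar 3}$ in terms of $z, w, w^*$. With simple computations we find:

\[
\begin{aligned}
t_{\bar 1} &= \frac{\sqrt{-1}\,(t^2 - 2)}{4 \sqrt{2}\, t}\, w + t (t^2 - 2)\, w^* + \frac{\sqrt[4]{8}\,(1 - \sqrt{-1})(\sqrt{-1}\, t - \sqrt{2})}{2 t}\, z,\\
t_{\bar 2} &= \frac{\sqrt{-1}\,(\sqrt{-1}\, t - \sqrt{2})}{4 \sqrt{2}\, t}\, w + (\sqrt{-1}\, t^2 - \sqrt{2}\, t)\, w^* + \frac{\sqrt[4]{8}\,(\sqrt{-1} - 1)}{2 t}\, z,\\
t_{\bar 3} &= \frac{1 - \sqrt{-1}}{4 \sqrt[4]{2}\, t}\, w + \sqrt[4]{2}\,(1 + \sqrt{-1})\, t\, w^*.
\end{aligned}
\]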
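The explicit vectors of Example 3.10 lend themselves to a quick numerical sanity check. The sketch below treats $t$ as an ordinary real number and works with floating-point complex arithmetic, so it is only a spot check at sample values and not a symbolic proof; all function names are hypothetical. It evaluates the diagonal form $\beta_\lambda$ on $w$, $w^*$, $z$ and recombines them into the coordinate vectors of $t_{\bar 1}, t_{\bar 2}, t_{\bar 3}$.

```python
SQRT2 = 2 ** 0.5
ROOT4_2 = 2 ** 0.25
ROOT4_8 = 8 ** 0.25


def lambdas(t):
    """(Lambda_1, Lambda_2, Lambda_3) for lambda = (4, 2, 1), written in t = q^2 + q^-2."""
    return (t * (t * t - 2), t, 1.0)


def beta(u, v, t):
    """Diagonal symmetric bilinear form with beta(t_i, t_j) = delta_ij * Lambda_i."""
    return sum(lam * a * b for lam, a, b in zip(lambdas(t), u, v))


def witt_vectors(t):
    """Coordinates of w, w*, z in the basis (t_1, t_2, t_3), as given in the example."""
    x, y, z3 = 1.0, 1j * t - SQRT2, ROOT4_2 * (1 + 1j) * t
    c = 1j / (4 * SQRT2 * t * t)
    w = (x, y, z3)
    w_star = (c * x, c * y, -c * z3)
    z = (c * t * y * z3, -c * t * (t * t - 2) * x * z3, 0.0)
    return w, w_star, z


def combine(coeffs, vectors):
    """Linear combination sum_k coeffs[k] * vectors[k] of 3-component vectors."""
    return tuple(sum(a * v[i] for a, v in zip(coeffs, vectors)) for i in range(3))


def generators(t):
    """Recombine w, w*, z with the coefficients stated for t_1, t_2, t_3."""
    vecs = witt_vectors(t)
    y = 1j * t - SQRT2
    t1 = combine((1j * (t * t - 2) / (4 * SQRT2 * t), t * (t * t - 2),
                  ROOT4_8 * (1 - 1j) * y / (2 * t)), vecs)
    t2 = combine((1j * y / (4 * SQRT2 * t), 1j * t * t - SQRT2 * t,
                  ROOT4_8 * (1j - 1) / (2 * t)), vecs)
    t3 = combine(((1 - 1j) / (4 * ROOT4_2 * t), ROOT4_2 * (1 + 1j) * t, 0.0), vecs)
    return t1, t2, t3
```

At any nonzero sample value of `t`, `generators(t)` should return the three standard unit vectors up to rounding error, and `beta` should reproduce the stated values on the Witt vectors.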


4. Highest Weight Representation Theory of $U_q(\mathfrak{g})$

A $U_q(\mathfrak{g})$-module $V^q$ is called a weight module if it admits a weight space decomposition

\[
V^q = \bigoplus_{\mu \in P} V^q_\mu, \qquad \text{where } V^q_\mu = \{v \in V^q \mid q^h v = q^{\mu(h)} v \text{ for all } h \in P^\vee\}.
\]

For a weight $U_q(\mathfrak{g})$-module $V^q$, we set $\mathrm{wt}\, V^q = \{\lambda \in P \mid V^q_\lambda \ne 0\}$. By the same argument as in [HK, Ch. 3], it can be verified that every submodule of a weight $U_q(\mathfrak{g})$-module is also a weight module. If $\dim_{\mathbb{C}(q)} V^q_\mu < \infty$ for all $\mu \in P$, then the character of $V^q$ is defined to be

\[
\mathrm{ch}\, V^q = \sum_{\mu \in P} (\dim_{\mathbb{C}(q)} V^q_\mu)\, e^\mu,
\]

where the $e^\mu$ are formal basis elements of the group algebra $\mathbb{C}(q)[P]$ with the multiplication given by $e^\lambda e^\mu = e^{\lambda + \mu}$ for all $\lambda, \mu \in P$.

A weight module $V^q$ is called a highest weight module if it is generated over $U_q(\mathfrak{g})$ by a finite dimensional irreducible $U_q^{\ge 0}$-module $\mathbf{v}^q$. Note that $\mathbf{v}^q$ also admits a weight space decomposition. We call a vector in $\mathbf{v}^q$ a highest weight vector of $V^q$. Combining Lemma 2.2 and the triangular decomposition of $U_q(\mathfrak{g})$ (Theorem 2.3), we obtain $V^q = U_q^- \mathbf{v}^q$.

Proposition 4.1. If $\mathbf{v}^q$ is a finite dimensional irreducible $U_q^{\ge 0}$-module with a weight space decomposition $\mathbf{v}^q = \bigoplus_{\mu \in P} \mathbf{v}^q_\mu$, then $\mathbf{v}^q$ is irreducible as a $U_q^0$-module and $\mathbf{v}^q = \mathbf{v}^q_\lambda$ for some $\lambda \in P$. Conversely, if $\mathbf{v}^q$ is an irreducible $U_q^0$-module on which the even part of $U_q^0$ acts by a weight $\lambda$, then $\mathbf{v}^q$ can be endowed with the structure of an irreducible $U_q^{\ge 0}$-module by letting $U_q^+$ act trivially on $\mathbf{v}^q$.

Proof. Because $\mathbf{v}^q$ is finite dimensional, there exists a weight $\lambda \in P$ such that $\mathbf{v}^q_\lambda \ne 0$ and $\mathbf{v}^q_{\lambda + \alpha_i} = 0$ for all $i \in I$. Then we have $U_q^+ \mathbf{v}^q_\lambda = \mathbf{v}^q_\lambda$ and $U_q^0 \mathbf{v}^q_\lambda = \mathbf{v}^q_\lambda$. Thus $\mathbf{v}^q_\lambda$ is a $U_q^{\ge 0}$-submodule of $\mathbf{v}^q$ and hence $\mathbf{v}^q_\lambda = \mathbf{v}^q$. The other direction is obvious from the defining relations of $U_q(\mathfrak{g})$ in Theorem 2.1.

Remark. If $\mathbf{v}^q$ is a finite dimensional irreducible $U_q^{\ge 0}$-module which generates a highest weight module $V^q$ of highest weight $\lambda$, then, by Proposition 4.1, we know that $\mathbf{v}^q$ is an irreducible $U_q^0$-module of weight $\lambda$. Thus $\mathbf{v}^q$ is a finite dimensional irreducible module over $\mathrm{Cliff}_q(\lambda) = U_q^0 / I^q(\lambda)$. Conversely, if $E^q$ is a finite dimensional irreducible $\mathrm{Cliff}_q(\lambda)$-module, then it is clear that $E^q$ is an irreducible $U_q^0$-module of weight $\lambda$.

By Corollary 3.9, we know that, up to isomorphism, $\mathrm{Cliff}_q(\lambda)$ has at most two simple modules: $E^q(\lambda)$ and $\Pi(E^q(\lambda))$. The $U_q(\mathfrak{g})$-module $W^q(\lambda) = U_q(\mathfrak{g}) \otimes_{U_q^{\ge 0}} E^q(\lambda)$ is called the Weyl module of $U_q(\mathfrak{g})$ corresponding to $\lambda$ (defined up to $\Pi$).

Proposition 4.2. (1) $W^q(\lambda)$ is a free $U_q^-$-module of rank $\dim E^q(\lambda)$. (2) Every highest weight $U_q(\mathfrak{g})$-module with highest weight $\lambda$ is a homomorphic image of $W^q(\lambda)$. (3) Every Weyl module $W^q(\lambda)$ has a unique maximal submodule $N^q(\lambda)$.


D. Grantcharov, J. H. Jung, S.-J. Kang, M. Kim

Proof. (1) This is clear from the definition.

(2) Let $V^q$ be a highest weight module with highest weight $\lambda$ generated by the irreducible $U_q^{\ge 0}$-module $\mathbf{v}^q$. Because $\mathbf{v}^q$ is irreducible over $\mathrm{Cliff}^q(\lambda)$, it is isomorphic to $E^q(\lambda)$ up to $\Pi$. Thus the map $\phi : W^q(\lambda) \to V^q$ induced by $E^q(\lambda) \to \mathbf{v}^q$ is a surjective $U_q(\mathfrak{g})$-module homomorphism.

(3) Since $E^q(\lambda)$ is an irreducible $\mathrm{Cliff}^q(\lambda)$-module, any proper submodule $N^q$ of $W^q(\lambda)$ does not contain highest weight vectors (the vectors in $E^q(\lambda)$). That is, $N^q$ must lie in $\bigoplus_{\mu<\lambda} W^q(\lambda)_{\mu}$. Thus the sum of two proper submodules is again a proper submodule of $W^q(\lambda)$. Then the sum $N^q(\lambda)$ of all proper submodules of $W^q(\lambda)$ is the unique maximal submodule of $W^q(\lambda)$.

For $\lambda \in P$, the unique irreducible quotient $V^q(\lambda) := W^q(\lambda)/N^q(\lambda)$ is called the irreducible highest weight module over $U_q(\mathfrak{g})$ with highest weight $\lambda$ (defined up to $\Pi$). We introduce the notation

\[ [n]_q := \frac{q^{n} - q^{-n}}{q - q^{-1}}, \]

which is called a q-integer. We also define $[0]_q! := 1$ and $[n]_q! := [n]_q \cdot [n-1]_q \cdots [1]_q$. We define the divided powers of $e_i$ and $f_i$ as follows:

\[ e_i^{(k)} := \frac{e_i^{k}}{[k]_q!}, \qquad f_i^{(k)} := \frac{f_i^{k}}{[k]_q!}. \]
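The q-integers and q-factorials defined above can be evaluated exactly at a sample rational value of $q$. The following is only an illustrative sketch: the function names `q_int` and `q_factorial` and the choice $q = 2$ are assumptions made here, not notation from the paper.

```python
from fractions import Fraction

def q_int(n: int, q: Fraction) -> Fraction:
    """q-integer [n]_q = (q^n - q^-n) / (q - q^-1); q must not be 0, 1 or -1."""
    return (q**n - q**(-n)) / (q - q**(-1))

def q_factorial(n: int, q: Fraction) -> Fraction:
    """[n]_q! = [n]_q [n-1]_q ... [1]_q, with [0]_q! = 1."""
    result = Fraction(1)
    for m in range(1, n + 1):
        result *= q_int(m, q)
    return result

q = Fraction(2)  # sample value, chosen only for illustration
print(q_int(3, q), q_factorial(3, q))  # 21/4 105/8
```

Here $[3]_q = q^2 + 1 + q^{-2}$ and $[2]_q = q + q^{-1}$, so at $q = 2$ the values are $21/4$ and $5/2$, and the divided power $f_i^{(3)}$ would be $f_i^3$ divided by $105/8$.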

By a straightforward induction argument, we can prove the following lemma.

Lemma 4.3. For all $i \in I$ and $k \in \mathbb{Z}_{\ge 0}$, we have

\[ e_i f_i^{(k)} = f_i^{(k)} e_i + f_i^{(k-1)} \, \frac{q^{h_i} q^{-k+1} - q^{-h_i} q^{k-1}}{q - q^{-1}}. \]
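When Lemma 4.3 is applied to a highest weight vector $v$ (so that $e_i v = 0$ and $q^{h_i}$ acts on $v$ as the scalar $q^{\lambda(h_i)}$), the fraction on the right-hand side becomes the q-integer $[\lambda(h_i) - k + 1]_q$, which is the coefficient used in the proof of Proposition 4.4. The sketch below checks this at sample values; the integer `m` standing for $\lambda(h_i)$, the helper names and the value $q = 3/2$ are assumptions for illustration.

```python
from fractions import Fraction

def q_int(n, q):
    """q-integer [n]_q = (q^n - q^-n) / (q - q^-1)."""
    return (q**n - q**(-n)) / (q - q**(-1))

def lemma_4_3_scalar(m, k, q):
    """Scalar by which (q^{h_i} q^{-k+1} - q^{-h_i} q^{k-1}) / (q - q^{-1})
    acts on a vector where q^{h_i} acts as q^m (hypothetical m = lambda(h_i))."""
    return (q**m * q**(-k + 1) - q**(-m) * q**(k - 1)) / (q - q**(-1))

q = Fraction(3, 2)  # sample value, for illustration only
checks = [lemma_4_3_scalar(m, k, q) == q_int(m - k + 1, q)
          for m in range(0, 5) for k in range(1, 7)]
print(all(checks))  # True
```

In particular the scalar vanishes exactly when $k = \lambda(h_i) + 1$, which is the case singled out in the proof.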

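Proposition 4.4. Let $\lambda \in \Lambda^{+}$ and $V^q(\lambda)$ be the irreducible highest weight $U_q(\mathfrak{g})$-module generated by an irreducible finite dimensional $U_q^{\ge 0}$-module $\mathbf{v}^q$. Then $f_i^{\lambda(h_i)+1} v = 0$ for all $v \in \mathbf{v}^q$ and $i \in I$.

Proof. Lemma 4.3 implies

\[ e_i f_i^{(k)} v = [\lambda(h_i) - k + 1]_q \, f_i^{(k-1)} v \quad \text{for all } v \in \mathbf{v}^q. \]

If $k = \lambda(h_i) + 1$, we see that $e_i f_i^{\lambda(h_i)+1} v = 0$. Moreover, for $j \neq i$, we already know $e_j f_i^{\lambda(h_i)+1} v = 0$ and $e_{\bar{j}} f_i^{\lambda(h_i)+1} v = 0$, since $V^q(\lambda) = \bigoplus_{\mu \le \lambda} V^q_{\mu}$.

Suppose that $e_{\bar{i}} f_i^{\lambda(h_i)+1} v \neq 0$. We have

\[ e_i \big( e_{\bar{i}} f_i^{\lambda(h_i)+1} v \big) = e_{\bar{i}} \big( e_i f_i^{\lambda(h_i)+1} v \big) = 0, \]

\[ e_{\bar{i}} \big( e_{\bar{i}} f_i^{\lambda(h_i)+1} v \big) = - \frac{q - q^{-1}}{q + q^{-1}} \, e_i^{2} f_i^{\lambda(h_i)+1} v = 0. \]

Also, $e_j \big( e_{\bar{i}} f_i^{\lambda(h_i)+1} v \big) = e_{\bar{j}} \big( e_{\bar{i}} f_i^{\lambda(h_i)+1} v \big) = 0$ for $j \neq i$, since $V^q(\lambda) = \bigoplus_{\mu \le \lambda} V^q_{\mu}$.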

Highest Weight Modules Over the Quantum Queer Superalgebra Uq (q(n))

If $\lambda(h_i) \ge 1$, then $\mathrm{wt}(e_{\bar{i}} f_i^{\lambda(h_i)+1} v) = \lambda - \lambda(h_i)\alpha_i < \lambda$. Thus $e_{\bar{i}} f_i^{\lambda(h_i)+1} v$ would generate a nontrivial proper submodule of $V^q(\lambda)$, which contradicts the irreducibility of $V^q(\lambda)$. If $\lambda(h_i) = 0$, then we have $\lambda_i = \lambda_{i+1} = 0$ so that $k_{\bar{i}} v = k_{\overline{i+1}} v = 0$ by Lemma 3.1. From the defining relation of $U_q(\mathfrak{g})$, we know

\[ e_{\bar{i}} f_i v = f_i e_{\bar{i}} v + \big( q^{k_{i+1}} k_{\bar{i}} - q^{k_i} k_{\overline{i+1}} \big) v = 0. \]

Therefore, in any case, $e_{\bar{i}} f_i^{\lambda(h_i)+1} v = 0$ for all $v \in \mathbf{v}^q$. Similarly, if $f_i^{\lambda(h_i)+1} v \neq 0$, it would generate a nontrivial proper submodule of $V^q(\lambda)$. Hence we conclude $f_i^{\lambda(h_i)+1} v = 0$ for all $v \in \mathbf{v}^q$.

5. Classical Limits

Let $\mathbf{A}_1 := \{ f/g \in \mathbb{C}(q) \mid f, g \in \mathbb{C}[q],\ g(1) \neq 0 \}$. For an integer $n \in \mathbb{Z}$, we formally define

\[ [y; n]_x := \frac{y x^{n} - y^{-1} x^{-n}}{x - x^{-1}}, \qquad (y; n)_x := \frac{y x^{n} - 1}{x - 1}. \]

For example,

\[ [q^{h}; 0]_q = \frac{q^{h} - q^{-h}}{q - q^{-1}}, \qquad (q^{h}; 0)_q = \frac{q^{h} - 1}{q - 1}. \]

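Definition 5.1. We define the $\mathbf{A}_1$-form $U_{\mathbf{A}_1}$ of the quantum superalgebra $U_q(\mathfrak{g})$ to be the $\mathbf{A}_1$-subalgebra of $U_q(\mathfrak{g})$ with 1 generated by the elements $e_i$, $e_{\bar{i}}$, $f_i$, $f_{\bar{i}}$, $q^{h}$, $k_{\bar{l}}$ and $(q^{h}; 0)_q$ ($i \in I$, $l \in J$, $h \in P^{\vee}$). We denote by $U^{+}_{\mathbf{A}_1}$ (respectively, $U^{-}_{\mathbf{A}_1}$) the $\mathbf{A}_1$-subalgebra of $U_q(\mathfrak{g})$ with 1 generated by $e_i$, $e_{\bar{i}}$ (respectively, $f_i$, $f_{\bar{i}}$) for $i \in I$, and by $U^{0}_{\mathbf{A}_1}$ the $\mathbf{A}_1$-subalgebra of $U_q(\mathfrak{g})$ with 1 generated by $q^{h}$, $k_{\bar{l}}$ and $(q^{h}; 0)_q$ for $l \in J$, $h \in P^{\vee}$.

Lemma 5.2. (1) $(q^{h}; n)_q \in U^{0}_{\mathbf{A}_1}$ for all $n \in \mathbb{Z}$ and $h \in P^{\vee}$. (2) $[q^{h}; 0]_q \in U^{0}_{\mathbf{A}_1}$ for all $n \in \mathbb{Z}$ and $h \in P^{\vee}$.

Proof. Our assertions follow immediately from the following identities:

\[ (q^{h}; n)_q = q^{n} (q^{h}; 0)_q + \frac{q^{n} - 1}{q - 1}, \]

\[ [q^{h}; 0]_q = q \, \frac{q - 1}{q^{2} - 1} \, (1 + q^{-h}) (q^{h}; 0)_q. \]

Note that

\[ k_{\bar{i}}^{2} = [q^{2k_i}; 0]_{q^{2}} = q^{2} \, \frac{q^{2} - 1}{q^{4} - 1} \, (1 + q^{-2k_i}) \, \frac{1}{q + 1} \, (q^{2k_i}; 0)_q. \]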
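The identities in the proof of Lemma 5.2 can be sanity-checked with exact rational arithmetic once $q^{h}$ is replaced by a scalar. The sketch below assumes that $q^{h}$ (and $q^{k_i}$) acts as $q^{m}$ for an integer $m$, as it would on a weight vector; the function names and sample values of $q$ are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction

def round_bracket(y, n, x):
    """(y; n)_x = (y x^n - 1) / (x - 1)."""
    return (y * x**n - 1) / (x - 1)

def square_bracket(y, n, x):
    """[y; n]_x = (y x^n - y^-1 x^-n) / (x - x^-1)."""
    z = y * x**n
    return (z - 1 / z) / (x - 1 / x)

def identities_hold(q, m, n):
    y = q**m  # assumption: q^h acts as the scalar q^m
    first = round_bracket(y, n, q) == (
        q**n * round_bracket(y, 0, q) + (q**n - 1) / (q - 1))
    second = square_bracket(y, 0, q) == (
        q * (q - 1) / (q**2 - 1) * (1 + 1 / y) * round_bracket(y, 0, q))
    # third identity: [q^{2k}; 0]_{q^2} expressed through (q^{2k}; 0)_q
    third = square_bracket(y**2, 0, q**2) == (
        q**2 * (q**2 - 1) / (q**4 - 1) * (1 + 1 / y**2)
        / (q + 1) * round_bracket(y**2, 0, q))
    return first and second and third

samples = [Fraction(2), Fraction(3, 2), Fraction(5, 7)]
ok = all(identities_hold(q, m, n)
         for q in samples for m in range(-3, 4) for n in range(-3, 4))
print(ok)  # True
```

All coefficients appearing on the right-hand sides, such as $q(q-1)/(q^{2}-1) = q/(q+1)$, have denominators that do not vanish at $q = 1$, which is why the right-hand sides lie in the $\mathbf{A}_1$-form.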


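Proposition 5.3. We have the triangular decomposition of the algebra $U_{\mathbf{A}_1}$. Namely, $U_{\mathbf{A}_1} \cong U^{-}_{\mathbf{A}_1} \otimes U^{0}_{\mathbf{A}_1} \otimes U^{+}_{\mathbf{A}_1}$ as $\mathbf{A}_1$-modules.

Proof. Recall the canonical isomorphism $U_q(\mathfrak{g}) \xrightarrow{\ \sim\ } U_q^{-} \otimes U_q^{0} \otimes U_q^{+}$ given by Theorem 2.3. The following commutation relations hold:

\[ e_i (q^{h}; 0)_q = (q^{h}; -\alpha_i(h))_q \, e_i, \qquad e_{\bar{i}} (q^{h}; 0)_q = (q^{h}; -\alpha_i(h))_q \, e_{\bar{i}}, \]

\[ (q^{h}; 0)_q f_i = f_i (q^{h}; -\alpha_i(h))_q, \qquad (q^{h}; 0)_q f_{\bar{i}} = f_{\bar{i}} (q^{h}; -\alpha_i(h))_q, \]

\[ e_i f_i = f_i e_i + [q^{k_i - k_{i+1}}; 0]_q, \quad e_{i+1} f_i = q^{-1} f_i e_{i+1}, \quad e_i f_{i+1} = q f_{i+1} e_i, \quad e_i f_j - f_j e_i = 0 \ \text{for } |i-j| > 1, \]

\[ e_{\bar{i}} f_{\bar{i}} = - f_{\bar{i}} e_{\bar{i}} + [q^{k_i + k_{i+1}}; 0]_q + (q - q^{-1}) k_{\bar{i}} k_{\overline{i+1}}, \quad e_{i+1} f_{\bar{i}} = -q^{-1} f_{\bar{i}} e_{i+1}, \quad e_{\bar{i}} f_{i+1} = -q f_{i+1} e_{\bar{i}}, \quad e_{\bar{i}} f_{\bar{j}} + f_{\bar{j}} e_{\bar{i}} = 0 \ \text{for } |i-j| > 1. \]

Together with Lemma 5.2, one can show that the image of the canonical isomorphism lies inside $U^{-}_{\mathbf{A}_1} \otimes U^{0}_{\mathbf{A}_1} \otimes U^{+}_{\mathbf{A}_1}$ when restricted to $U_{\mathbf{A}_1}$. Its inverse map is given by multiplication. Hence the two spaces are isomorphic as $\mathbf{A}_1$-modules.

In what follows, $V^q$ is a highest weight module over $U_q(\mathfrak{g})$ with highest weight $\lambda \in P$ generated by a finite dimensional irreducible $U_q^{\ge 0}$-submodule $\mathbf{v}^q$. Then $\mathbf{v}^q$ is a finite dimensional irreducible $\mathrm{Cliff}^q(\lambda)$-module. Since it is irreducible, it is generated by a nonzero vector $v \in (\mathbf{v}^q)_{\bar{0}}$; i.e., $\mathbf{v}^q = \mathrm{Cliff}^q(\lambda) v$. Note that

\[ \frac{q^{2n} - q^{-2n}}{q^{2} - q^{-2}} = q^{2n-2} + q^{2n-6} + \cdots + q^{-2n+6} + q^{-2n+2} \in \mathbf{A}_1 \quad \text{for } n \in \mathbb{Z}_{>0}. \]

We denote by $\mathrm{Cliff}_{\mathbf{A}_1}(\lambda)$ the $\mathbf{A}_1$-subalgebra of $\mathrm{Cliff}^q(\lambda)$ generated by $\{ t_{\bar{i}} \mid i \in J \}$.

Definition 5.4. Let $V^q$ be a highest weight $U_q(\mathfrak{g})$-module generated by a finite dimensional irreducible $U_q^{\ge 0}$-module $\mathbf{v}^q$ and let $E_{\mathbf{A}_1}(\lambda)$ be the $\mathrm{Cliff}_{\mathbf{A}_1}(\lambda)$-submodule of $\mathbf{v}^q \cong E^q(\lambda)$ generated by a nonzero element $v \in (\mathbf{v}^q)_{\bar{0}}$. The $\mathbf{A}_1$-form of $V^q$ is defined to be the $U_{\mathbf{A}_1}$-submodule $V_{\mathbf{A}_1}$ of $V^q$ generated by $E_{\mathbf{A}_1}(\lambda)$.

In what follows, $V^q$ will denote a highest weight $U_q(\mathfrak{g})$-module.

Proposition 5.5. $V_{\mathbf{A}_1} = U^{-}_{\mathbf{A}_1} E_{\mathbf{A}_1}(\lambda)$.

Proof. In view of Proposition 5.3, it suffices to show that $U^{+}_{\mathbf{A}_1} E_{\mathbf{A}_1}(\lambda) = E_{\mathbf{A}_1}(\lambda)$ and $U^{0}_{\mathbf{A}_1} E_{\mathbf{A}_1}(\lambda) = E_{\mathbf{A}_1}(\lambda)$. The first assertion is clear by the definition of highest weight modules. For the second assertion, we observe that

\[ q^{h} w = q^{\lambda(h)} w, \qquad (q^{h}; 0)_q w = \frac{q^{\lambda(h)} - 1}{q - 1} \, w \quad \text{for all } w \in E_{\mathbf{A}_1}(\lambda). \]

Hence we obtain $V_{\mathbf{A}_1} = U_{\mathbf{A}_1} E_{\mathbf{A}_1}(\lambda) = U^{-}_{\mathbf{A}_1} E_{\mathbf{A}_1}(\lambda)$.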
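The expansion of $(q^{2n} - q^{-2n})/(q^{2} - q^{-2})$ noted before Definition 5.4 is a finite geometric-type sum, which is why the quotient lies in $\mathbf{A}_1$. A small exact check at sample values may make this concrete; the helper names `lhs` and `rhs` and the choice $q = 3/2$ are assumptions for illustration.

```python
from fractions import Fraction

def lhs(n, q):
    """(q^{2n} - q^{-2n}) / (q^2 - q^{-2}), undefined at q = 1."""
    return (q**(2 * n) - q**(-2 * n)) / (q**2 - q**(-2))

def rhs(n, q):
    """q^{2n-2} + q^{2n-6} + ... + q^{-2n+2}: n terms, exponents step by -4."""
    return sum(q**e for e in range(2 * n - 2, -2 * n + 1, -4))

q = Fraction(3, 2)  # sample value, for illustration only
print(all(lhs(n, q) == rhs(n, q) for n in range(1, 7)))  # True
print(rhs(4, Fraction(1)))  # 4
```

The right-hand side is defined at $q = 1$, where it takes the value $n$.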


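For each $\mu \in P$, let us denote by $(V_{\mathbf{A}_1})_{\mu}$ the space $V_{\mathbf{A}_1} \cap V^q_{\mu}$. The following assertion can be proved using the same arguments as in [HK, Prop. 3.3.6].

Proposition 5.6. $V_{\mathbf{A}_1}$ has the weight space decomposition $V_{\mathbf{A}_1} = \bigoplus_{\mu \le \lambda} (V_{\mathbf{A}_1})_{\mu}$.

Proposition 5.7. For each $\mu \in P$, the weight space $(V_{\mathbf{A}_1})_{\mu}$ is a free $\mathbf{A}_1$-module with $\mathrm{rank}_{\mathbf{A}_1} (V_{\mathbf{A}_1})_{\mu} = \dim_{\mathbb{C}(q)} V^q_{\mu}$. In particular, $\mathrm{rank}_{\mathbf{A}_1} E_{\mathbf{A}_1}(\lambda) = \dim_{\mathbb{C}(q)} E^q(\lambda)$.

Proof. Because $\mathbf{A}_1$ is a principal ideal domain, every finitely generated torsion free module over $\mathbf{A}_1$ is free. Furthermore, since $\mathbb{C}(q)$ is the field of quotients of the integral domain $\mathbf{A}_1$, a finite subset of a $\mathbb{C}(q)$-vector space is linearly independent over $\mathbb{C}(q)$ if and only if it is linearly independent over $\mathbf{A}_1$. Thus it is enough to show that each $V^q_{\mu}$ has a $\mathbb{C}(q)$-basis which is also contained in $(V_{\mathbf{A}_1})_{\mu}$. The highest weight space $\mathbf{v}^q = E^q(\lambda)$ has a linearly independent subset of $\{ t_{\bar{1}}^{\epsilon_1} t_{\bar{2}}^{\epsilon_2} \cdots t_{\bar{n}}^{\epsilon_n} v \mid \epsilon_j = 0 \text{ or } 1 \}$ which generates $E^q(\lambda)$ over $\mathbb{C}(q)$, since $E^q(\lambda) = \mathrm{Cliff}^q(\lambda) v$. By definition, this subset is contained in $E_{\mathbf{A}_1}(\lambda)$. For $V^q_{\mu}$, it is easy to show that there is a basis of $V^q_{\mu}$ whose elements are of the form $f_{\zeta} \, t_{\bar{1}}^{\epsilon_1} t_{\bar{2}}^{\epsilon_2} \cdots t_{\bar{n}}^{\epsilon_n} v$, where $f_{\zeta}$ are monomials in $f_i$ and $f_{\bar{j}}$. This basis is also contained in $(V_{\mathbf{A}_1})_{\mu}$, which proves the proposition.

Corollary 5.8. The map $\phi : \mathbb{C}(q) \otimes_{\mathbf{A}_1} V_{\mathbf{A}_1} \to V^q$ given by $f \otimes v \mapsto f v$ ($f \in \mathbb{C}(q)$, $v \in V_{\mathbf{A}_1}$) is a $\mathbb{C}(q)$-linear isomorphism.

Let $\mathbf{J}_1$ be the ideal of $\mathbf{A}_1$ generated by $q - 1$. Then there is a canonical isomorphism of fields

\[ \mathbf{A}_1 / \mathbf{J}_1 \xrightarrow{\ \sim\ } \mathbb{C} \quad \text{given by} \quad f(q) + \mathbf{J}_1 \mapsto f(1). \]

Define the $\mathbb{C}$-linear vector spaces

\[ U_1 = (\mathbf{A}_1/\mathbf{J}_1) \otimes_{\mathbf{A}_1} U_{\mathbf{A}_1}, \qquad V^1 = (\mathbf{A}_1/\mathbf{J}_1) \otimes_{\mathbf{A}_1} V_{\mathbf{A}_1}. \]

Then $V^1$ is naturally a $U_1$-module. Note that $U_1 \cong U_{\mathbf{A}_1}/\mathbf{J}_1 U_{\mathbf{A}_1}$ and $V^1 \cong V_{\mathbf{A}_1}/\mathbf{J}_1 V_{\mathbf{A}_1}$. We use the bar notation for the images under these maps. The passage under these maps is referred to as taking the classical limit. Since $V_{\mathbf{A}_1} = U_{\mathbf{A}_1} E_{\mathbf{A}_1}(\lambda)$, we have:

\[ V^1 \cong V_{\mathbf{A}_1}/\mathbf{J}_1 V_{\mathbf{A}_1} = U_{\mathbf{A}_1} E_{\mathbf{A}_1}(\lambda) / \mathbf{J}_1 U_{\mathbf{A}_1} E_{\mathbf{A}_1}(\lambda) = (U_{\mathbf{A}_1}/\mathbf{J}_1 U_{\mathbf{A}_1}) \cdot (E_{\mathbf{A}_1}(\lambda)/\mathbf{J}_1 E_{\mathbf{A}_1}(\lambda)). \]

Hence $V^1$ is generated by $E_{\mathbf{A}_1}(\lambda)/\mathbf{J}_1 E_{\mathbf{A}_1}(\lambda)$ over $U_1$. For each $\mu \in P$, denote by $V^1_{\mu}$ the space $(\mathbf{A}_1/\mathbf{J}_1) \otimes_{\mathbf{A}_1} (V_{\mathbf{A}_1})_{\mu} \cong (V_{\mathbf{A}_1})_{\mu} / \mathbf{J}_1 (V_{\mathbf{A}_1})_{\mu}$.

Proposition 5.9. (1) $V^1 = \bigoplus_{\mu \le \lambda} V^1_{\mu}$. (2) For each $\mu \in P$, $\dim_{\mathbb{C}} V^1_{\mu} = \mathrm{rank}_{\mathbf{A}_1} (V_{\mathbf{A}_1})_{\mu}$.

Proof. The first assertion follows from Proposition 5.6. Using the same argument as in [HK, Lemma 3.4.1], we can prove the second assertion.

Let $\bar{h} \in U_1$ be the classical limit of $(q^{h}; 0)_q \in U_{\mathbf{A}_1}$. Using [HK, Lemma 3.4.3], we have:

Lemma 5.10. (1) For all $h \in P^{\vee}$, we have $\overline{q^{h}} = 1$. (2) For any $h, h' \in P^{\vee}$, $\overline{h + h'} = \bar{h} + \bar{h'}$.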


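Theorem 5.11. (1) The elements $\bar{e}_i$, $\bar{e}_{\bar{i}}$, $\bar{f}_i$, $\bar{f}_{\bar{i}}$ ($i \in I$), $\bar{k}_{\bar{l}}$ ($l \in J$) and $\bar{h}$ ($h \in P^{\vee}$) satisfy the defining relations of $U(\mathfrak{g})$. Hence there exists a surjective $\mathbb{C}$-algebra homomorphism $\psi : U(\mathfrak{g}) \to U_1$ and the $U_1$-module $V^1$ has a $U(\mathfrak{g})$-module structure.

(2) For each $\mu \in P$ and $h \in P^{\vee}$, the element $\bar{h}$ acts on $V^1_{\mu}$ as scalar multiplication by $\mu(h)$. So $V^1_{\mu}$ is the $\mu$-weight space of the $U(\mathfrak{g})$-module $V^1$.

(3) There is an isomorphism $\mathrm{Cliff}(\lambda) \xrightarrow{\ \sim\ } \mathrm{Cliff}^1(\lambda) := \mathrm{Cliff}_{\mathbf{A}_1}(\lambda)/\mathbf{J}_1 \mathrm{Cliff}_{\mathbf{A}_1}(\lambda)$.

(4) As a $U(\mathfrak{g})$-module, $V^1$ is a highest weight module or the sum of two highest weight modules with highest weight $\lambda \in P$.

Proof. (1) The first relation for $U(\mathfrak{g})$ is trivial. Since

\[ (q^{h}; 0)_q e_i - e_i (q^{h}; 0)_q = e_i (q^{h}; \alpha_i(h))_q - e_i (q^{h}; 0)_q = \frac{q^{\alpha_i(h)} - 1}{q - 1} \, e_i q^{h}, \]

we obtain $[\bar{h}, \bar{e}_i] = \alpha_i(h) \bar{e}_i$ by letting $q \to 1$. Similarly, $[\bar{h}, \bar{e}_{\bar{i}}] = \alpha_i(h) \bar{e}_{\bar{i}}$, $[\bar{h}, \bar{f}_i] = -\alpha_i(h) \bar{f}_i$, $[\bar{h}, \bar{f}_{\bar{i}}] = -\alpha_i(h) \bar{f}_{\bar{i}}$ and $[\bar{h}, \bar{k}_{\bar{l}}] = 0$. We have

\[ e_i f_i - f_i e_i = [q^{h_i}; 0]_q = \frac{q}{q+1} \, (1 + q^{-h_i}) (q^{h_i}; 0)_q. \]

Taking the classical limit of both sides above leads to $\bar{e}_i \bar{f}_i - \bar{f}_i \bar{e}_i = \tfrac{1}{2} \cdot 2 \bar{h}_i = \bar{h}_i$. Also

\[ k_{\bar{i}}^{2} = [q^{2k_i}; 0]_{q^{2}} = q^{2} \, \frac{q^{2} - 1}{q^{4} - 1} \, (1 + q^{-2k_i}) \, \frac{1}{q+1} \, (q^{2k_i}; 0)_q. \]

When we take $q \to 1$, we obtain $\bar{k}_{\bar{i}}^{2} = \bar{k}_i$. Since we can obtain the following relations in $U(\mathfrak{g})$ by the Jacobi identity,

\[ [e_{\bar{i}}, [e_i, e_j]] = [[e_{\bar{i}}, e_i], e_j] + [e_i, [e_{\bar{i}}, e_j]] = [e_i, [e_{\bar{i}}, e_j]], \quad \text{for } |i - j| = 1, \]

in order to prove the corresponding relations in $U_1$, it suffices to show that $[\bar{e}_i, [\bar{e}_{\bar{i}}, \bar{e}_j]] = 0$. The latter relation can be checked easily by letting $q \to 1$. The rest of the relations can be derived in a similar manner. Therefore, there exists a surjective algebra homomorphism $\psi : U(\mathfrak{g}) \to U_1$ defined by $e_i \mapsto \bar{e}_i$, $e_{\bar{i}} \mapsto \bar{e}_{\bar{i}}$, $f_i \mapsto \bar{f}_i$, $f_{\bar{i}} \mapsto \bar{f}_{\bar{i}}$, $h \mapsto \bar{h}$, $k_{\bar{l}} \mapsto \bar{k}_{\bar{l}}$ ($i \in I$, $l \in J$), which can be used to define a $U(\mathfrak{g})$-module structure on $V^1$.

(2) For $v \in (V_{\mathbf{A}_1})_{\mu}$ and $h \in P^{\vee}$, we have

\[ (q^{h}; 0)_q v = \frac{q^{\mu(h)} - 1}{q - 1} \, v. \]

Taking the classical limit of both sides yields our assertion.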
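The limit used in part (2) can be illustrated numerically: the scalar $(q^{m} - 1)/(q - 1)$, with $m = \mu(h)$, tends to $m$ as $q \to 1$. The sketch below is only an illustration under stated assumptions: $m$ is a sample integer, and the function names are hypothetical.

```python
from fractions import Fraction

def round_bracket_scalar(m, q):
    """Eigenvalue (q^m - 1) / (q - 1) of (q^h; 0)_q on a vector with mu(h) = m."""
    return (q**m - 1) / (q - 1)

def geometric_form(m, q):
    """For m >= 0 the same scalar equals 1 + q + ... + q^(m-1), defined at q = 1."""
    return sum(q**j for j in range(m))

q = Fraction(3, 2)  # sample value, for illustration only
print(all(round_bracket_scalar(m, q) == geometric_form(m, q) for m in range(7)))  # True
print(geometric_form(5, Fraction(1)))  # 5

# For negative m, approach q = 1 through rational values instead.
eps = Fraction(1, 10**6)
print(abs(round_bracket_scalar(-3, 1 + eps) - (-3)) < Fraction(1, 10**4))  # True
```

Setting $q = 1$ in the polynomial form corresponds to reducing modulo the ideal $\mathbf{J}_1$ generated by $q - 1$.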


(3) Note that $\bar{t}_{\bar{i}} \bar{t}_{\bar{j}} + \bar{t}_{\bar{j}} \bar{t}_{\bar{i}} = 2\delta_{ij}\lambda_i$ in $\mathrm{Cliff}^1(\lambda)$ and $\mathrm{Cliff}(\lambda)$ is the associative $\mathbb{C}$-algebra with 1 generated by $\{ k_{\bar{i}} \mid i \in J \}$ with defining relations $k_{\bar{i}} k_{\bar{j}} + k_{\bar{j}} k_{\bar{i}} = 2\delta_{ij}\lambda_i$. Thus we have a surjective $\mathbb{C}$-algebra homomorphism $\mathrm{Cliff}(\lambda) \to \mathrm{Cliff}^1(\lambda)$. Observe that

\[ \dim_{\mathbb{C}} \mathrm{Cliff}^1(\lambda) = \mathrm{rank}_{\mathbf{A}_1} \mathrm{Cliff}_{\mathbf{A}_1}(\lambda) = \dim_{\mathbb{C}(q)} \mathrm{Cliff}^q(\lambda) = \dim_{\mathbb{C}} \mathrm{Cliff}(\lambda). \]

The first two equalities follow by using the same reasoning as in Proposition 5.9 and Proposition 5.7, respectively. It is well known that the dimension of the Clifford algebra associated with a symmetric bilinear form on a vector space of dimension $k$ is $2^{k}$. This result holds for any base field of characteristic different from 2. Thus we proved the last equality.

(4) $V^q$ is generated by a finite dimensional irreducible $U_q^{\ge 0}$-submodule $\mathbf{v}^q \cong E^q(\lambda)$ up to $\Pi$. By Corollary 3.9,

\[ \dim E^q(\lambda) = \begin{cases} 2^{k} & \text{if } |\lambda| = 2k \text{ and } \Delta(\lambda) = \bar{1}, \\ 2^{k+1} & \text{if } |\lambda| = 2k \text{ and } \Delta(\lambda) \neq \bar{1}, \\ 2^{k+1} & \text{if } |\lambda| = 2k + 1. \end{cases} \]

It is well known that the dimension of the $\mathbb{Z}_2$-graded irreducible $\mathrm{Cliff}(\lambda)$-modules is $2^{\left[\frac{|\lambda|-1}{2}\right]} \,\big|\, 2^{\left[\frac{|\lambda|-1}{2}\right]}$ (see, for example, [ABS]). With this in mind we deduce that $E_{\mathbf{A}_1}(\lambda)/\mathbf{J}_1 E_{\mathbf{A}_1}(\lambda)$ is an irreducible $\mathrm{Cliff}(\lambda)$-module when $|\lambda| = 2k + 1$ or $|\lambda| = 2k$ and $\Delta(\lambda) = \bar{1}$, and the direct sum of two irreducible $\mathrm{Cliff}(\lambda)$-modules otherwise. Since $E^q(\lambda)$ is a parity invariant module over $\mathrm{Cliff}^q(\lambda)$ for $|\lambda| = 2k$ and $\Delta(\lambda) \neq \bar{1}$, $E_{\mathbf{A}_1}(\lambda)/\mathbf{J}_1 E_{\mathbf{A}_1}(\lambda)$ is a parity invariant $\mathrm{Cliff}(\lambda)$-module as well. Hence $E_{\mathbf{A}_1}(\lambda)/\mathbf{J}_1 E_{\mathbf{A}_1}(\lambda) = \mathbf{v}(\lambda) \oplus \Pi\mathbf{v}(\lambda)$ for some irreducible $\mathrm{Cliff}(\lambda)$-module $\mathbf{v}(\lambda)$. By definition, $V^1$ is a highest weight $U(\mathfrak{g})$-module generated by $E_{\mathbf{A}_1}(\lambda)/\mathbf{J}_1 E_{\mathbf{A}_1}(\lambda)$ or the sum of two highest weight modules generated by $\mathbf{v}(\lambda)$ and $\Pi\mathbf{v}(\lambda)$ for some irreducible $\mathrm{Cliff}(\lambda)$-module $\mathbf{v}(\lambda)$.

By Propositions 5.7 and 5.9 and Theorem 5.11, we obtain the following identity between the characters of a highest weight $U(\mathfrak{g})$-module and a highest weight $U_q(\mathfrak{g})$-module.

Proposition 5.12. $\mathrm{ch}\, V^1 = \mathrm{ch}\, V^q$.

Corollary 5.13. $V^q(\lambda)$ is finite dimensional if and only if $\lambda \in \Lambda^{+}$.

Proof. Let $V^q = V^q(\lambda)$. If $\lambda \in \Lambda^{+}$, then we have $f_i^{\lambda(h_i)+1} v = 0$ for all $v \in V^q_{\lambda}$ by Proposition 4.4. Taking the classical limit, we have $\bar{f}_i^{\lambda(h_i)+1} \bar{v} = 0$ for all $\bar{v} \in V^1_{\lambda}$. Because $V^1$ is a highest weight module or the sum of two highest weight modules, it is finite dimensional by Proposition 1.9, and hence $V^q$ is finite dimensional by Proposition 5.12. Conversely, assume that $\lambda$ is not in $\Lambda^{+}$. Then $V^1$ has a submodule which is a highest weight module and whose irreducible quotient is isomorphic to an irreducible highest weight module with highest weight $\lambda$. It is not finite dimensional by (2) of Proposition 1.4. Again by Proposition 5.12, $V^q$ cannot be finite dimensional.


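Theorem 5.14. If $\lambda \in \Lambda^{+} \cap P_{\ge 0}$ and $V^q$ is the irreducible highest weight $U_q(\mathfrak{g})$-module $V^q(\lambda)$ with highest weight $\lambda$, then $V^1$ is isomorphic to
(1) $V(\lambda)$ or $\Pi V(\lambda)$ if $|\lambda| = 2k$ and $\Delta(\lambda) = \bar{1}$,
(2) $V(\lambda) \oplus \Pi V(\lambda)$ if $|\lambda| = 2k$ and $\Delta(\lambda) \neq \bar{1}$ (in particular, if $\lambda_1 > \cdots > \lambda_{2k} > 0$),
(3) $V(\lambda) \cong \Pi V(\lambda)$ if $|\lambda| = 2k + 1$.
Hence,

\[ \mathrm{ch}\, V^q(\lambda) = \begin{cases} \mathrm{ch}\, V(\lambda) & \text{if } |\lambda| = 2k \text{ and } \Delta(\lambda) = \bar{1}, \\ 2\, \mathrm{ch}\, V(\lambda) & \text{if } |\lambda| = 2k \text{ and } \Delta(\lambda) \neq \bar{1}, \\ \mathrm{ch}\, V(\lambda) & \text{if } |\lambda| = 2k + 1. \end{cases} \]

Proof. By Theorem 5.11 (4), $V^1$ is a highest weight module or the sum of two highest weight modules over $U(\mathfrak{g})$ with highest weight $\lambda$. By Proposition 1.8, we have

\[ V^1 \cong \begin{cases} V(\lambda) \text{ or } \Pi V(\lambda) & \text{if } |\lambda| = 2k \text{ and } \Delta(\lambda) = \bar{1}, \\ V(\lambda) \oplus \Pi V(\lambda) & \text{if } |\lambda| = 2k \text{ and } \Delta(\lambda) \neq \bar{1}, \\ V(\lambda) \cong \Pi V(\lambda) & \text{if } |\lambda| = 2k + 1. \end{cases} \]

The second assertion follows from Proposition 5.12.

Remark. The main reason we restrict our attention in Theorem 5.14 to the dominant set of weights $\Lambda^{+} \cap P_{\ge 0}$ is the statement of Proposition 1.8. We still believe that the theorem holds in a more general setting and conjecture that it is true for any weight $\lambda \in \Lambda^{+}$ for which the generic character formula (1.4) holds.

Corollary 5.15. If $V^q$ is a finite dimensional highest weight module over $U_q(\mathfrak{g})$ with highest weight $\lambda \in \Lambda^{+} \cap P_{\ge 0}$, then $V^q$ is isomorphic to $V^q(\lambda)$ up to $\Pi$.

Proof. Note that $V^1$ is a highest weight module or the sum of two highest weight modules over $U(\mathfrak{g})$ with highest weight $\lambda$ and it is finite dimensional by Proposition 5.12. From Proposition 1.8, we know that $V^1$ is an irreducible module or the direct sum of two irreducible modules. Thus we get $\mathrm{ch}\, V^q = \mathrm{ch}\, V^1 = \mathrm{ch}\, V^q(\lambda)$ by Theorem 5.14 and hence $V^q \cong V^q(\lambda)$.

Define the subalgebras $U_1^{\pm} := \mathbf{A}_1/\mathbf{J}_1 \otimes_{\mathbf{A}_1} U^{\pm}_{\mathbf{A}_1}$ and $U_1^{0} := \mathbf{A}_1/\mathbf{J}_1 \otimes_{\mathbf{A}_1} U^{0}_{\mathbf{A}_1}$ of $U_1$.

Theorem 5.16. The classical limit $U_1$ of $U_q(\mathfrak{g})$ is isomorphic to the universal enveloping algebra $U(\mathfrak{g})$.

Proof.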
By Theorem 5.11 (1), there exists a surjective algebra homomorphism $\psi : U(\mathfrak{g}) \to U_1$ defined by $e_i \mapsto \bar{e}_i$, $e_{\bar{i}} \mapsto \bar{e}_{\bar{i}}$, $f_i \mapsto \bar{f}_i$, $f_{\bar{i}} \mapsto \bar{f}_{\bar{i}}$, $h \mapsto \bar{h}$, $k_{\bar{l}} \mapsto \bar{k}_{\bar{l}}$ for $i \in I$, $h \in P^{\vee}$ and $l \in J$. From (1.3), $U(\mathfrak{g}) \cong U^{-} \otimes U^{0} \otimes U^{+}$.

We first show that $U^{0}$ is isomorphic to $U_1^{0}$. Consider the restriction $\psi_0$ of $\psi$ to $U^{0}$. Note that $\mathrm{Cliff}_{\mathbf{A}_1}(\lambda)$ is a $U^{0}_{\mathbf{A}_1}$-module. Indeed, as in the proof of Proposition 5.5, we know that

\[ q^{h} w = q^{\lambda(h)} w, \qquad (q^{h}; 0)_q w = \frac{q^{\lambda(h)} - 1}{q - 1} \, w \quad \text{for all } w \in \mathrm{Cliff}_{\mathbf{A}_1}(\lambda). \]

In particular, the action of $k_{\bar{i}}$ is just the left multiplication by $t_{\bar{i}}$. Let $g \in \ker \psi_0$. By the Poincaré-Birkhoff-Witt theorem, we can write $g = \sum_{i=1}^{2^{n}} g_i k^{\eta_i}$, where $k^{\eta_i} = k_{\bar{1}}^{a_1} \cdots k_{\bar{n}}^{a_n}$, $0 \le a_j \le 1$ for all $j \in J$ and each $g_i$ is a polynomial in $k_1, \ldots, k_n$. For each $\lambda \in P$ we have

\[ 0 = \psi_0(g) \cdot 1 = \sum_{i=1}^{2^{n}} \lambda(g_i) \, t^{\eta_i} \in \mathrm{Cliff}^1(\lambda), \]

where $\lambda(g_i)$ denotes the polynomial in $\lambda_j$ corresponding to $g_i$. Since $\{ t^{\eta_i} \}$ is a linearly independent subset of $\mathrm{Cliff}^1(\lambda) \cong \mathrm{Cliff}(\lambda)$, we have $\lambda(g_i) = 0$ for all $i = 1, \ldots, 2^{n}$. Since we may take any integer value for $\lambda_j$, $g_i$ must be zero for all $i = 1, \ldots, 2^{n}$ and hence $g$ is identically zero. Thus $\psi_0$ is injective.

Next we show that the restriction $\psi_{-}$ of $\psi$ to $U^{-}$ is an isomorphism of $U^{-}$ onto $U_1^{-}$. Suppose $\ker \psi_{-} \neq 0$ and $u = \sum b_{\zeta} f_{\zeta} \in \ker \psi_{-}$, where $b_{\zeta} \in \mathbb{C}$ and $f_{\zeta}$ are monomials in the $f_i$ and $f_{\bar{i}}$. Let $N$ be the maximal length of the monomials $f_{\zeta}$ in the expression of $u$, and choose $\lambda \in \Lambda^{+} \cap P_{\ge 0}$ satisfying $\lambda(h_i) > N$ for all $i \in I$ and either $|\lambda| = 2k$ and $\Delta(\lambda) = \bar{1}$, or $|\lambda| = 2k + 1$. By Theorem 5.14, the classical limit $V^1$ of $V^q(\lambda)$ is isomorphic to the irreducible $U(\mathfrak{g})$-module $V(\lambda)$ when $|\lambda| = 2k$ and $\Delta(\lambda) = \bar{1}$, or $|\lambda| = 2k + 1$. Set $r = 2^{\left[\frac{|\lambda|+1}{2}\right]}$. Consider the map $\phi : (U^{-})^{\oplus r} \to V^1$, given by $(x_1, \ldots, x_r) \mapsto \sum_{i=1}^{r} \psi(x_i) \cdot v_i$ for a basis $\{ v_i \mid i = 1, \ldots, r \}$ of $V^1_{\lambda}$. Then by Proposition 1.8 and Proposition 1.9, $\ker \phi$ is the left ideal of $(U^{-})^{\oplus r}$ generated by $(f_i^{\lambda(h_i)+1}, 0, \ldots, 0), \ldots, (0, \ldots, 0, f_i^{\lambda(h_i)+1})$ for $i \in I$. In particular, $(u, 0, \ldots, 0) = (\sum b_{\zeta} f_{\zeta}, 0, \ldots, 0) \notin \ker \phi$. That is, $\psi_{-}(u) v_1 \neq 0$, which is a contradiction. So $\ker \psi_{-} = 0$ and $U^{-}$ is isomorphic to $U_1^{-}$. Similarly, we can show that $U^{+} \cong U_1^{+}$. By the triangular decomposition we have

\[ U(\mathfrak{g}) \cong U^{-} \otimes U^{0} \otimes U^{+} \cong U_1^{-} \otimes U_1^{0} \otimes U_1^{+} \cong U_1. \]

It can be checked easily that this isomorphism is an algebra isomorphism.

Theorem 5.17. Let $\lambda \in P$. If $V^q$ is the Weyl module $W^q(\lambda)$ over $U_q(\mathfrak{g})$ with highest weight $\lambda$, then its classical limit $V^1$ is isomorphic to
(1) $W(\lambda)$ or $\Pi W(\lambda)$ if $|\lambda| = 2k$ and $\Delta(\lambda) = \bar{1}$,
(2) $W(\lambda) \oplus \Pi W(\lambda)$ if $|\lambda| = 2k$ and $\Delta(\lambda) \neq \bar{1}$ (in particular, if $\lambda_1 > \cdots > \lambda_{2k} > 0$),
(3) $W(\lambda) \cong \Pi W(\lambda)$ if $|\lambda| = 2k + 1$.

Proof. Let $\mathbf{v}(\lambda)$ be a finite dimensional irreducible $\mathfrak{b}_{+}$-module of weight $\lambda$ which generates $W(\lambda)$.
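Since $U^{-} \cong U_1^{-}$ and $E_{\mathbf{A}_1}(\lambda)/\mathbf{J}_1 E_{\mathbf{A}_1}(\lambda)$ is isomorphic to $\mathbf{v}(\lambda)$ or $\mathbf{v}(\lambda) \oplus \Pi\mathbf{v}(\lambda)$ as a $\mathrm{Cliff}(\lambda)$-module, it suffices to show that $V^1$ is a free $U_1^{-}$-module whose rank is $\dim_{\mathbb{C}} \mathbf{v}(\lambda)$ or $2 \dim_{\mathbb{C}} \mathbf{v}(\lambda)$. By Proposition 4.2 we know that $W^q(\lambda)$ is a free $U_q^{-}$-module generated by $E^q(\lambda)$. Since $V_{\mathbf{A}_1}$ is a subspace of $V^q$, taking Proposition 5.7 into account, $V_{\mathbf{A}_1}$ is a free $U^{-}_{\mathbf{A}_1}$-module generated by $E_{\mathbf{A}_1}(\lambda)$. Taking the classical limit, we see that $V^1 = U_1^{-} \cdot E_{\mathbf{A}_1}(\lambda)/\mathbf{J}_1 E_{\mathbf{A}_1}(\lambda)$ and

\[ \dim_{\mathbb{C}} E_{\mathbf{A}_1}(\lambda)/\mathbf{J}_1 E_{\mathbf{A}_1}(\lambda) = \dim_{\mathbb{C}(q)} E^q(\lambda) = \dim_{\mathbb{C}} \mathbf{v}(\lambda) \ \text{or} \ 2 \dim_{\mathbb{C}} \mathbf{v}(\lambda). \]

By a similar argument as in [HK, Prop. 3.4.10], we can show that $V^1$ is a free $U_1^{-}$-module. When $|\lambda| = 2k$ and $\Delta(\lambda) \neq \bar{1}$, $E^q(\lambda)$ is parity invariant. Hence we have

\[ V^1 \cong \begin{cases} W(\lambda) \text{ or } \Pi W(\lambda) & \text{if } |\lambda| = 2k \text{ and } \Delta(\lambda) = \bar{1}, \\ W(\lambda) \oplus \Pi W(\lambda) & \text{if } |\lambda| = 2k \text{ and } \Delta(\lambda) \neq \bar{1}, \\ W(\lambda) \cong \Pi W(\lambda) & \text{if } |\lambda| = 2k + 1. \end{cases} \]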

6. Complete Reducibility of the Category $\mathcal{O}_q^{\ge 0}$

In this section, we prove the complete reducibility theorem for $U_q(\mathfrak{g})$-modules in the category $\mathcal{O}_q^{\ge 0}$.

Definition 6.1. The category $\mathcal{O}_q^{\ge 0}$ consists of finite dimensional $U_q(\mathfrak{g})$-modules $M$ with a weight space decomposition $M = \bigoplus_{\lambda \in P} M_{\lambda}$ such that $\mathrm{wt}(M) \subset P_{\ge 0}$.

Remark. The complete reducibility theorem for $\mathcal{O}_q^{\ge 0}$, which we establish at the end of this section, implies that $\mathcal{O}_q^{\ge 0}$ is isomorphic to the category $\mathcal{T}_q$ of tensor modules, i.e., submodules of a tensor power of the natural representation $\mathbb{C}(q)^{n|n}$. Indeed, using the description of $\mathcal{T}_q$ provided by Olshanski and Sergeev we first check that every simple object of $\mathcal{O}_q^{\ge 0}$ is a tensor module. Then, by the complete reducibility result for $\mathcal{T}_q$, obtained again by Sergeev and Olshanski, we conclude that the two categories are isomorphic.

One can easily prove the following proposition (see, for example, [HK, Theorem 7.2.3]).

Proposition 6.2. For each $\lambda \in \Lambda^{+} \cap P_{\ge 0}$, $V^q(\lambda)$ is an irreducible $U_q(\mathfrak{g})$-module in the category $\mathcal{O}_q^{\ge 0}$. Conversely, every finite dimensional irreducible $U_q(\mathfrak{g})$-module in the category $\mathcal{O}_q^{\ge 0}$ has the form $V^q(\lambda)$ for some $\lambda \in \Lambda^{+} \cap P_{\ge 0}$.

Let $S$ be the antipode on $U_q(\mathfrak{g})$ defined in [O, Sect. 4]. We have $S(q^{h}) = q^{-h}$ for all $h \in P^{\vee}$. Because $S$ is an anti-automorphism on $U_q(\mathfrak{g})$, one can define two $U_q(\mathfrak{g})$-module structures on the dual vector space of a $U_q(\mathfrak{g})$-module $V \in \mathcal{O}_q^{\ge 0}$ by

\[ \langle x \cdot \phi, v \rangle := \langle \phi, S(x) \cdot v \rangle \quad \text{and} \quad \langle x \cdot \phi, v \rangle := \langle \phi, S^{-1}(x) \cdot v \rangle \]

for each $x \in U_q(\mathfrak{g})$ and linear functional $\phi$ on $V$. We denote these modules by $V^{*}$ and $V'$, respectively. As vector spaces both modules are just $\bigoplus_{\mu \in P} V_{\mu}^{*}$, where $V_{\mu}^{*} = \mathrm{Hom}_{\mathbb{C}(q)}(V_{\mu}, \mathbb{C}(q))$. The following lemma is an immediate consequence of the definitions.

Lemma 6.3. Suppose that $V$ is a $U_q(\mathfrak{g})$-module in the category $\mathcal{O}_q^{\ge 0}$. (1) There exist canonical $U_q(\mathfrak{g})$-module isomorphisms $(V^{*})' \cong V \cong (V')^{*}$. (2) The space $V_{\mu}^{*}$ is a weight space of weight $-\mu$.

Since $q^{h} S(e_i) q^{-h} = q^{\alpha_i(h)} S(e_i)$, we have $S(e_i) V_{\mu} \subset V_{\mu+\alpha_i}$, which implies $e_i V_{\mu}^{*} \subset V_{\mu-\alpha_i}^{*}$. By Lemma 6.3, we get $e_i (V^{*})_{-\mu} \subset (V^{*})_{-\mu+\alpha_i}$. Similarly, we also have $e_{\bar{i}} (V^{*})_{-\mu} \subset (V^{*})_{-\mu+\alpha_i}$, $f_i (V^{*})_{-\mu} \subset (V^{*})_{-\mu-\alpha_i}$, $f_{\bar{i}} (V^{*})_{-\mu} \subset (V^{*})_{-\mu-\alpha_i}$ for all $i \in I$ and $k_{\bar{i}} (V^{*})_{-\mu} \subset (V^{*})_{-\mu}$ for all $i \in J$.

A weight module $M$ is called a lowest weight module with lowest weight $\lambda \in P$ if it is generated over $U_q(\mathfrak{g})$ by an irreducible finite dimensional $U_q^{\le 0}$-module. By a similar argument as in Proposition 4.1, one can show that $(V^q(\lambda)_{\lambda})^{*}$ is an irreducible $U_q^{\le 0}$-module so that $V^q(\lambda)^{*}$ and $V^q(\lambda)'$ are lowest weight modules of lowest weight $-\lambda$.

Suppose that $V$ is a $U_q(\mathfrak{g})$-module in the category $\mathcal{O}_q^{\ge 0}$. Because $V$ is finite dimensional, we may choose a maximal weight $\lambda \in \mathrm{wt}(V)$ with the property that $\lambda + \alpha_i$ is not a weight of $V$ for any $i \in I$. Then the weight space $V_{\lambda}$ is a $U_q^{\ge 0}$-module. Fix an irreducible $U_q^{\ge 0}$-submodule $\mathbf{v}$ of $V_{\lambda}$ and set $L = U_q(\mathfrak{g})\mathbf{v}$. Then $L$ is a highest weight $U_q(\mathfrak{g})$-module with highest weight $\lambda$. By the assumption, $\lambda \in \Lambda^{+} \cap P_{\ge 0}$ and from Corollary 5.15 we know $L \cong V^q(\lambda)$ up to $\Pi$. Now consider $\bar{\mathbf{v}} = \mathrm{Hom}_{\mathbb{C}(q)}(\mathbf{v}, \mathbb{C}(q)) \subset V^{*}$, and set $\bar{L} = U_q(\mathfrak{g})\bar{\mathbf{v}} \subset V^{*}$. It is easy to show that $\bar{\mathbf{v}}$ is an irreducible $U_q^{\le 0}$-module and $\bar{L}$ is a lowest weight module with lowest weight $-\lambda$. Translating Corollary 5.15 to the case of lowest weight modules, we get the following lemma.

Lemma 6.4. The $U_q(\mathfrak{g})$-module $\bar{L}$ is isomorphic to the irreducible lowest weight module $V^q(\lambda)^{*}$ with lowest weight $-\lambda$ and lowest weight space $\bar{\mathbf{v}}$.

Now we can prove the complete reducibility theorem for $U_q(\mathfrak{g})$-modules in the category $\mathcal{O}_q^{\ge 0}$.

Theorem 6.5. Every $U_q(\mathfrak{g})$-module $V$ in the category $\mathcal{O}_q^{\ge 0}$ is completely reducible.

Proof. Take a maximal weight $\lambda$ and consider a submodule of $V$, say $L$, generated by an irreducible $U_q^{\ge 0}$-submodule of $V_{\lambda}$. We want to show $V \cong L \oplus V/L$. Taking the dual with respect to $S^{-1}$ of the inclusion $\bar{L} \hookrightarrow V^{*}$, we obtain a $U_q(\mathfrak{g})$-module homomorphism $V \cong (V^{*})' \to (\bar{L})'$. Thus we have a map:

\[ \psi : L \hookrightarrow V \to (\bar{L})'. \]

It is easy to check that $\psi$ is a nontrivial homomorphism. Since both $L$ and $(\bar{L})'$ are irreducible, $\psi$ is an isomorphism by Schur's lemma and we see that the following short exact sequence splits:

\[ 0 \to L \to V \to V/L \to 0. \]

Since $V/L \in \mathcal{O}_q^{\ge 0}$, using induction on the dimension of $V$, we complete the proof.

Corollary 6.6. The tensor product of a finite number of $U_q(\mathfrak{g})$-modules in the category $\mathcal{O}_q^{\ge 0}$ is completely reducible.

Remark. The same argument can be applied to prove the complete reducibility of $\mathcal{O}^{\ge 0}$. In that case, the antipode is given by $S(x) = -x$ for all $x \in \mathfrak{g}$ (see [N, Sect. 4]) and Proposition 1.8 plays the same role as Corollary 5.15.

Acknowledgements. We would like to thank Ivan Penkov and Vera Serganova for stimulating discussions. D.G.
gratefully acknowledges the hospitality and excellent working conditions at the Seoul National University where most of this work was completed.


References

[ABS] Atiyah, M.F., Bott, R., Shapiro, A.: Clifford modules. Topology 3, 3–38 (1964)
[B] Brundan, J.: Kazhdan-Lusztig polynomials and character formulae for the Lie superalgebra q(n). Adv. Math. 182, 28–77 (2004)
[BKM] Benkart, G., Kang, S.-J., Melville, D.: Quantized enveloping algebras for Borcherds superalgebras. Trans. Amer. Math. Soc. 350, 3297–3319 (1998)
[Dr] Drinfel'd, V.: Quantum groups. In: Proceedings of the International Congress of Mathematicians, Vol. 1 (Berkeley, Calif., 1986), Providence, RI: Amer. Math. Soc., 1987, pp. 798–820
[G] Gorelik, M.: Shapovalov determinants of q-type Lie superalgebras. Int. Math. Res. Pap., Article ID 96895, 1–71 (2006)
[Har] Harris, J.: Algebraic Geometry, A first course. Corrected reprint of the 1992 original. Graduate Texts in Mathematics 133, New York: Springer-Verlag, 1995
[HK] Hong, J., Kang, S.-J.: Introduction to Quantum Groups and Crystal Bases. Graduate Studies in Mathematics 42, Providence, RI: Amer. Math. Soc., 2002
[IR] Ireland, K., Rosen, M.: A Classical Introduction to Modern Number Theory. 2nd ed., Graduate Texts in Mathematics 84, New York: Springer-Verlag, 1990
[K] Kac, V.: Lie superalgebras. Adv. Math. 26, 8–96 (1977)
[Lam] Lam, T.Y.: Introduction to Quadratic Forms over Fields. Graduate Studies in Mathematics 67, Providence, RI: Amer. Math. Soc., 2005
[Lang] Lang, S.: Algebra. Revised third edition, Graduate Texts in Mathematics 211, New York: Springer-Verlag, 2002
[LS] Leites, D., Serganova, V.: Defining relations for classical Lie superalgebras I. Superalgebras with Cartan matrix or Dynkin-type diagram. In: Proc. Topological and Geometrical Methods in Field Theory, eds. J. Mickelson, et al., Singapore: World Sci., 1992, pp. 194–201
[N] Nazarov, M.: Capelli identities for Lie superalgebras. Ann. Sci. Ecole Norm. Sup. (4) 30, 6, 847–872 (1997)
[O] Olshanski, G.: Quantized universal enveloping superalgebra of type q and a super-extension of the Hecke algebra. Lett. Math. Phys. 24, 93–102 (1992)
[P] Penkov, I.: Characters of typical irreducible finite-dimensional q(n)-modules. Funct. Anal. Appl. 20, 30–37 (1986)
[PS1] Penkov, I., Serganova, V.: Generic irreducible representations of finite-dimensional Lie superalgebras. Int. J. Math. 5, 389–419 (1994)
[PS2] Penkov, I., Serganova, V.: Characters of irreducible G-modules and cohomology of G/P for the Lie supergroup G = Q(N). J. Math. Sci. (New York) 84, 1382–1412 (1997)
[PS3] Penkov, I., Serganova, V.: Characters of finite-dimensional irreducible q(n)-modules. Lett. Math. Phys. 40, 147–158 (1997)
[RTF] Reshetikhin, N., Takhtadzhyan, L., Faddeev, L.: Quantization of Lie groups and Lie algebras. (Russian) Algebra i Analiz 1, 178–206 (1989); translation in Leningrad Math. J. 1, 193–225 (1990)
[Se1] Sergeev, A.: The centre of enveloping algebra for Lie superalgebra Q(n, C). Lett. Math. Phys. 7, 177–179 (1983)
[Se2] Sergeev, A.: Tensor algebra of the identity representation as a module over the Lie superalgebras GL(n, m) and Q(n). (Russian) Mat. Sb. (N.S.) 123(165), 422–430 (1984)
[Sh] Shimura, G.: Arithmetic and Analytic Theories of Quadratic Forms and Clifford Groups. Mathematical Surveys and Monographs 109, Providence, RI: Amer. Math. Soc., 2004

Communicated by Y. Kawahigashi

Commun. Math. Phys. 296, 861–880 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1017-8

Communications in

Mathematical Physics

Global Solution to the Three-Dimensional Incompressible Flow of Liquid Crystals Xianpeng Hu, Dehua Wang Department of Mathematics, University of Pittsburgh, Pittsburgh, PA 15260, USA. E-mail: [email protected]; [email protected] Received: 25 June 2009 / Accepted: 29 November 2009 Published online: 6 February 2010 – © Springer-Verlag 2010

Abstract: The equations for the three-dimensional incompressible flow of liquid crystals are considered in a smooth bounded domain. The existence and uniqueness of the global strong solution with small initial data are established. It is also proved that when the strong solution exists, all the global weak solutions constructed in [16] must be equal to the unique strong solution.

1. Introduction

Liquid crystals are substances that exhibit a phase of matter that has properties between those of a conventional liquid, and those of a solid crystal. For instance, a liquid crystal may flow like a liquid, but its molecules may be oriented in a crystal-like way. There are many different types of liquid crystal phases, which can be distinguished based on their different optical properties. The various liquid crystal phases can be characterized by the type of ordering that is present. One can distinguish positional order and orientational order, and moreover order can be either short-range or long-range. Liquid crystals may have an isotropic phase at high temperature, or anisotropic orientational structure at lower temperature. The diverse phases of liquid crystals have wide applications from the liquid crystal display to biology. (In particular, biological membranes and cell membranes are a form of liquid crystal.) In the 1960s, the theoretical physicist P.-G. de Gennes found fascinating analogies between liquid crystals and superconductors as well as magnetic materials, which was rewarded with the Nobel Prize in Physics in 1991. One of the most common liquid crystal phases is the nematic, where the molecules have no positional order, but they have long-range orientational order. For more details of physics, we refer the readers to the two books of de Gennes-Prost [5] and Chandrasekhar [3].


X. Hu, D. Wang

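The three-dimensional flow of nematic liquid crystals can be governed by the following system of partial differential equations ([5,14–16]):

\[ \frac{\partial u}{\partial t} + u \cdot \nabla u - \mu \Delta u + \nabla P = -\lambda \, \mathrm{div}(\nabla d \odot \nabla d), \tag{1.1a} \]

\[ \frac{\partial d}{\partial t} + u \cdot \nabla d = \gamma \, (\Delta d - f(d)), \tag{1.1b} \]

\[ \mathrm{div}\, u = 0, \tag{1.1c} \]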

where $u \in \mathbb{R}^3$ denotes the velocity, $d \in \mathbb{R}^3$ the director field for the averaged macroscopic molecular orientations, $P \in \mathbb{R}$ the pressure arising from the incompressibility; and they all depend on the spatial variable $x = (x_1, x_2, x_3) \in \mathbb{R}^3$ and the time variable $t > 0$. The positive constants $\mu$, $\lambda$, $\gamma$ stand for viscosity, the competition between kinetic energy and potential energy, and microscopic elastic relaxation time or the Deborah number for the molecular orientation field, respectively. We set these three constants to be one since their sizes do not play any role in our analysis. The symbol $\nabla d \odot \nabla d$ denotes a matrix whose $ij$-th entry is $\langle \partial_{x_i} d, \partial_{x_j} d \rangle$, and it is easy to see that $\nabla d \odot \nabla d = (\nabla d)^{\top} \nabla d$, where $(\nabla d)^{\top}$ denotes the transpose of the $3 \times 3$ matrix $\nabla d$. In (1.1), $f(d)$ is the penalty function which will be assumed to be zero as in [16] for the three-dimensional problem. However, the approach of this paper can be applied to the general case.

The system (1.1) is a simplified version, but still retains most of the essential features, of the Ericksen-Leslie equations ([7,8,10–13]) for the hydrodynamics of nematic liquid crystals; see [16,19,20] for more discussions on the relations of the two models. Both the Ericksen-Leslie system and the simplified one (1.1) describe the time evolution of liquid crystal materials under the influence of both the velocity field $u$ and the director field $d$. In many situations, the flow velocity field does disturb the alignment of the molecule, and, in turn, a change in the alignment will induce velocity.

We consider the initial-boundary value problem of system (1.1) in a bounded domain $\Omega \subset \mathbb{R}^3$ with $C^3$ boundary under the initial-boundary conditions:

\[ d|_{t=0} = d_0, \quad u|_{t=0} = u_0, \tag{1.2} \]

and

\[ u|_{\partial\Omega} = 0, \quad d|_{\partial\Omega} = d_0, \tag{1.3} \]

with $\mathrm{div}\, u_0 = 0$ in $\Omega$, and $d_0 \in C^1(\overline{\Omega})$ satisfying $\nabla d_0 = 0$ on the boundary $\partial\Omega$. We introduce a $3 \times 3$ matrix

\[ F = \nabla d, \tag{1.4} \]

and take the gradient of (1.1b) to rewrite (1.1), with $f(d) = 0$ and $\mu = \lambda = \gamma = 1$, as:

\[ \frac{\partial u}{\partial t} + u \cdot \nabla u - \Delta u + \nabla P = -\mathrm{div}(F^{\top} F), \tag{1.5a} \]

\[ \frac{\partial F}{\partial t} + u \cdot \nabla F + F \nabla u = \Delta F, \tag{1.5b} \]

\[ \mathrm{div}\, u = 0, \tag{1.5c} \]

Global Solution to the Flow of Liquid Crystals


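where we used, for all $i, j, k = 1, 2, 3$ (with summation over the repeated index $j$),

$$\frac{\partial}{\partial x_k}\left(u_j \frac{\partial d_i}{\partial x_j}\right) = \frac{\partial u_j}{\partial x_k}\frac{\partial d_i}{\partial x_j} + u_j \frac{\partial}{\partial x_j}\frac{\partial d_i}{\partial x_k} = (F\nabla u + u \cdot \nabla F)_{ik}.$$

Notice that (1.5a) is the incompressible Navier-Stokes equation with the source term $-\operatorname{div}(F^\top F)$, while (1.5b) is a parabolic equation for $F$. The initial-boundary conditions (1.2) and (1.3) become

$$u|_{t=0} = u_0, \quad F|_{t=0} = F_0 := \nabla d_0, \tag{1.6}$$

and

$$u|_{\partial\Omega} = 0, \quad F|_{\partial\Omega} = 0. \tag{1.7}$$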

There have been some studies on system (1.1). In Lin-Liu [16], the global existence of weak solutions with large initial data was proved under the condition that the orientational configuration $d(x, t)$ belongs to $H^2$, and the global existence of classical solutions was also obtained if the coefficient $\mu$ is large enough in three dimensional spaces. Similar results were obtained also in [20] for a different but similar model. When weak solutions are discussed, the regularity of the weak solution was investigated in [17] (and also [11]).

In this paper, we are interested in strong solutions of (1.5) in the Sobolev space $W^{2,q}(\Omega)$ with $q > 3$. It is worth pointing out that if $F$ belongs to $W^{2,q}(\Omega)$, it is equivalent to saying that $d$ should be in $W^{3,q}(\Omega)$ according to (1.4). By a Strong Solution, we mean a triplet $(u, F, P)$ satisfying (1.5) almost everywhere with the initial condition (1.6) and the boundary condition (1.7). Our strategy to consider (1.5) in $W^{2,q}(\Omega)$ is to linearize (1.5) as

$$\frac{\partial u}{\partial t} - \Delta u + \nabla P = -v \cdot \nabla v - \operatorname{div}(G^\top G), \tag{1.8a}$$

$$\frac{\partial F}{\partial t} - \Delta F = -v \cdot \nabla G - G \nabla v, \tag{1.8b}$$

$$\operatorname{div} u = 0, \tag{1.8c}$$

for some given $v \in \mathbb{R}^3$ and $G \in M^{3\times 3}$. One of the motivations for making such a linearization is that we can use the maximal regularity of the Stokes equations ([4]) and of the parabolic equation ([1]). We first use an iteration method to establish the local existence and uniqueness of a strong solution with general initial data. Then we prove the global existence by establishing some global estimates under the condition that the initial data is small in some sense.

The global weak solution was obtained in Lin-Liu [16], but the uniqueness is still an open problem. We shall prove that when the strong solution exists, all the global weak solutions constructed in [16] must be equal to the unique strong solution, which is called the weak-strong uniqueness. Similar results were obtained by Danchin [4] for the density-dependent incompressible Navier-Stokes equations. We shall establish our results in the spirit of [4], while developing new estimates for the director field $d$.

The rest of the paper is organized as follows. In Sect. 2, we state our main results on local and global existence of a strong solution, as well as the weak-strong uniqueness. In Sect. 3, we recall the maximal regularity for the Stokes equations and the parabolic equation, and also some $L^\infty$ estimates. In Sect. 4, we give the proof of the local existence. In Sect. 5, we prove the global existence. Finally in Sect. 6, we show the weak-strong uniqueness.


X. Hu, D. Wang

2. Main Results

In this section, we state our main results. If $k > 0$ is an integer and $p \ge 1$, we denote by $W^{k,p}$ the set of functions in $L^p(\Omega)$ whose derivatives of up to order $k$ belong to $L^p(\Omega)$. For $T > 0$ and a function space $X$, denote by $L^p(0, T; X)$ the set of Bochner measurable $X$-valued time dependent functions $f$ such that $t \mapsto \|f\|_X$ belongs to $L^p(0, T)$. Let us define the functional spaces in which the existence of solutions is going to be obtained:

Definition 2.1. For $T > 0$ and $1 < p, q < \infty$, we denote by $M_T^{p,q}$ the set of triplets $(u, F, P)$ such that

$$u \in C\big([0, T]; D_{A_q}^{1-\frac1p,\,p}\big) \cap L^p\big(0, T; W^{2,q}(\Omega) \cap W_0^{1,q}(\Omega)\big), \quad \partial_t u \in L^p(0, T; L^q(\Omega)), \quad \operatorname{div} u = 0,$$

$$F \in C\big([0, T]; B_{q,p}^{2(1-\frac1p)}\big) \cap L^p\big(0, T; W^{2,q}(\Omega)\big), \quad \partial_t F \in L^p(0, T; L^q(\Omega)),$$

and

$$P \in L^p\big(0, T; W^{1,q}(\Omega)\big), \quad \int_\Omega P\,dx = 0.$$

The corresponding norm is denoted by $\|\cdot\|_{M_T^{p,q}}$.

We remark that the condition

$$\int_\Omega P\,dx = 0$$

in Definition 2.1 holds automatically if we replace $P$ by

$$P - \frac{1}{|\Omega|}\int_\Omega P\,dx$$

in (1.1) and (1.5). Also, in the above definition, the space $D_{A_q}^{1-\frac1p,\,p}$ stands for some fractional domain of the Stokes operator in $L^q$ (cf. Sect. 2.3 in [4]). Roughly, the vector fields of $D_{A_q}^{1-\frac1p,\,p}$ are vectors which have $2 - \frac{2}{p}$ derivatives in $L^q$, are divergence-free, and vanish on $\partial\Omega$. The Besov space (for a definition, see [2]) $B_{q,p}^{2(1-\frac1p)}$ can be regarded as the interpolation space between $L^q$ and $W^{2,q}$, that is,

$$B_{q,p}^{2(1-\frac1p)} = \big(L^q, W^{2,q}\big)_{1-\frac1p,\,p}.$$

We note that, from Proposition 2.5 in [4],

$$D_{A_q}^{1-\frac1p,\,p} \hookrightarrow B_{q,p}^{2(1-\frac1p)} \cap L^q(\Omega). \tag{2.1}$$

The local existence will be shown by using an iterative method, and if the initial data is sufficiently small in some suitable function spaces, the solution is indeed global in time. More precisely, our existence results read:


Theorem 2.1. Let $\Omega$ be a bounded domain in $\mathbb{R}^3$ with $C^3$ boundary. Assume $1 \le p, q \le \infty$ with $\frac{p}{2}\big(1 - \frac{3}{q}\big) \in (0, 1)$ and $u_0 \in D_{A_q}^{1-\frac1p,\,p}$, $F_0 \in B_{q,p}^{2(1-\frac1p)} \cap L^q$. Then,

(1) there exists a $T_0 > 0$ such that system (1.5) with the initial-boundary conditions (1.6)–(1.7) has a unique local strong solution $(u, F, P) \in M_{T_0}^{p,q}$ in $\Omega \times (0, T_0)$;

(2) moreover, there exists a $\delta_0 > 0$ such that, if the initial data satisfies

$$\|u_0\|_{D_{A_q}^{1-\frac1p,\,p}} \le \delta_0, \quad \|F_0\|_{B_{q,p}^{2(1-\frac1p)} \cap L^q} \le \delta_0,$$

then (1.5) with (1.6)–(1.7) has a unique global strong solution $(u, F, P) \in M_T^{p,q}$ in $\Omega \times (0, T)$ for all $T > 0$.

Remark 2.1. The above theorem gives us the global strong solution near $u = 0$, $F = 0$. An argument similar to the proof of Theorem 2.1 below will also enable us to show the global existence of a strong solution to (1.5) near the equilibrium state $u = 0$, $F = I$ (the $3 \times 3$ identity matrix).

According to Lin-Liu [16], for the given initial-boundary conditions (1.6) and (1.7), there exists at least one Weak Solution to (1.5), but its uniqueness is still an open question. More precisely, a triplet $(v, E, \Pi)$ is called a weak solution to (1.5) with (1.6) and (1.7) in $\Omega \times (0, T)$ if $(v, E, \Pi)$ satisfies the system (1.5) in the sense of distributions, i.e., for all $\psi \in (C_0^\infty(\Omega \times (0, T)))^3$ with $\operatorname{div}\psi = 0$ and $\phi \in (C_0^\infty(\Omega \times (0, T)))^9$, we have

$$\int_0^T\!\!\int_\Omega v \cdot \partial_t \psi\,dx\,dt + \int_0^T\!\!\int_\Omega v \otimes v : \nabla\psi\,dx\,dt - \int_0^T\!\!\int_\Omega \nabla v : \nabla\psi\,dx\,dt = -\int_0^T\!\!\int_\Omega E^\top E : \nabla\psi\,dx\,dt,$$

and

$$\int_0^T\!\!\int_\Omega E : \partial_t\phi\,dx\,dt - \int_0^T\!\!\int_\Omega v \cdot \nabla E : \phi\,dx\,dt - \int_0^T\!\!\int_\Omega E\nabla v : \phi\,dx\,dt = \int_0^T\!\!\int_\Omega \nabla E : \nabla\phi\,dx\,dt,$$

with the energy inequality:

$$\int_\Omega \big(|v(t)|^2 + |E(t)|^2\big)\,dx + \int_0^t\!\!\int_\Omega \big(|\nabla v|^2 + |\nabla E|^2\big)\,dx\,ds \le \int_\Omega \big(|v_0|^2 + |E_0|^2\big)\,dx.$$

In this weak formulation, the pressure $\Pi$ can be determined as in the Navier-Stokes equations, see Galdi [9]. We state here the existence of weak solutions from Theorem A of [16]:

Proposition 2.1. Assume that $u_0 \in L^2$ and $F_0 \in L^2$. Then the system (1.5) with the initial condition (1.6) and the boundary condition (1.7) has a global weak solution $(v, E, \Pi)$ such that $v \in L^2(0, T; H^1) \cap L^\infty(0, T; L^2)$ and $E \in L^2(0, T; H^1) \cap L^\infty(0, T; L^2)$, for all $T \in (0, \infty)$.


For the same initial-boundary conditions, the relation between the weak solution and the strong solution can be formulated as:

Theorem 2.2. Assume that $u_0 \in D_{A_q}^{1-\frac1p,\,p}$ and $F_0 \in B_{q,p}^{2(1-\frac1p)} \cap L^q$. Then the corresponding weak solution to (1.5) with (1.6) and (1.7) is unique and indeed is equal to the unique strong solution.

Usually, we call this kind of uniqueness Weak-Strong Uniqueness. For similar results on the compressible Navier-Stokes equations, we refer readers to [6,18].

3. Maximal Regularity

In this section, we recall the maximal regularities for the parabolic operator and the Stokes operator, as well as some $L^\infty$ estimates. For $T > 0$, $1 < p, q < \infty$, denote

$$\mathcal{W}(0, T) := W^{1,p}\big(0, T; (L^q(\Omega))^3\big) \cap L^p\big(0, T; (W^{2,q}(\Omega))^3\big).$$

Throughout this paper, $C$ stands for a generic positive constant. We first recall the maximal regularity for the parabolic operator (cf. Theorem 4.10.7 and Remark 4.10.9 in [1]):

Theorem 3.1. Given $1 < p, q < \infty$, $\omega_0 \in B_{q,p}^{2(1-\frac1p)}$ and $f \in L^p(0, T; L^q(\mathbb{R}^3)^3)$, the Cauchy problem

$$\frac{d\omega}{dt} - \Delta\omega = f, \quad t \in (0, T), \qquad \omega(0) = \omega_0,$$

has a unique solution $\omega \in \mathcal{W}(0, T)$, and

$$\|\omega\|_{\mathcal{W}(0,T)} \le C\Big(\|f\|_{L^p(0,T;L^q(\mathbb{R}^3))} + \|\omega_0\|_{B_{q,p}^{2(1-\frac1p)}}\Big),$$

where $C$ is independent of $\omega_0$, $f$ and $T$. Moreover, there exists a positive constant $c_0$ independent of $f$ and $T$ such that

$$\|\omega\|_{\mathcal{W}(0,T)} \ge c_0 \sup_{t \in (0,T)} \|\omega(t)\|_{B_{q,p}^{2(1-\frac1p)}}.$$

Now we recall the maximal regularity for the Stokes equations (cf. Theorem 3.2 in [4]):

Theorem 3.2. Let $\Omega$ be a bounded domain with a $C^3$ boundary in $\mathbb{R}^3$ and $1 < p, q < \infty$. Assume that $u_0 \in D_{A_q}^{1-\frac1p,\,p}$ and $f \in L^p(\mathbb{R}^+; L^q)$. Then the system

$$\begin{cases} \partial_t u - \Delta u + \nabla P = f, \quad \int_\Omega P\,dx = 0, \\ \operatorname{div} u = 0, \quad u|_{\partial\Omega} = 0, \\ u|_{t=0} = u_0, \end{cases}$$

has a unique solution $(u, P)$ satisfying the following inequality for all $T > 0$:

$$\|u(T)\|_{D_{A_q}^{1-\frac1p,\,p}} + \left(\int_0^T \|(\nabla P, \Delta u, \partial_t u)\|_{L^q}^p\,dt\right)^{\frac1p} \le C\left(\|u_0\|_{D_{A_q}^{1-\frac1p,\,p}} + \left(\int_0^T \|f(t)\|_{L^q}^p\,dt\right)^{\frac1p}\right) \tag{3.1}$$

with $C = C(q, p, \Omega)$.

Remark 3.1. We notice that (3.1) does not include the estimate for $\|u\|_{L^p(0,T;L^q)}$. Indeed, thanks to $u|_{\partial\Omega} = 0$, Poincaré's inequality, and the fact $\int_\Omega \nabla u\,dx = 0$, we have $\|u\|_{W^{2,q}} \le C\|\Delta u\|_{L^q}$, and then (3.1) can be rewritten as

$$\|u(T)\|_{D_{A_q}^{1-\frac1p,\,p}} + \left(\int_0^T \|(\nabla P, \Delta u, u, \partial_t u)\|_{L^q}^p\,dt\right)^{\frac1p} \le C\left(\|u_0\|_{D_{A_q}^{1-\frac1p,\,p}} + \left(\int_0^T \|f(t)\|_{L^q}^p\,dt\right)^{\frac1p}\right). \tag{3.2}$$

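We have the $L^\infty$ estimate in the spatial variable as follows (cf. Lemma 4.1 in [4]).

Lemma 3.1. Let $1 < p, q, r, s < \infty$ satisfy

$$0 < \frac{p}{2} - \frac{3p}{2r} < 1, \qquad \frac1s = \frac1r + \frac1q.$$

Then the following inequalities hold:

$$\|\nabla f\|_{L^p(0,T;L^\infty)} \le C\,T^{\frac12 - \frac{3}{2r}}\, \|f\|^{1-\theta}_{L^\infty(0,T;D_{A_r}^{1-\frac1p,\,p})}\, \|f\|^{\theta}_{L^p(0,T;W^{2,r})},$$

$$\|\nabla f\|_{L^p(0,T;L^q)} \le C\,T^{\frac12 - \frac{3}{2r}}\, \|f\|^{1-\theta}_{L^\infty(0,T;D_{A_s}^{1-\frac1p,\,p})}\, \|f\|^{\theta}_{L^p(0,T;W^{2,s})},$$

for some constant $C$ depending only on $\Omega$, $p$, $q$ and

$$\frac{1-\theta}{p} = \frac12 - \frac{3}{2r}.$$

Similarly, we have,

Lemma 3.2. Let $1 < p, q < \infty$ satisfy $0 < \frac{p}{2} - \frac{3p}{2q} < 1$. Then one has,

$$\|\nabla f\|_{L^p(0,T;L^\infty)} \le C\,T^{\frac12 - \frac{3}{2q}}\, \|f\|^{1-\theta}_{L^\infty(0,T;B_{q,p}^{2(1-\frac1p)})}\, \|f\|^{\theta}_{L^p(0,T;W^{2,q})},$$

for some constant $C$ depending only on $\Omega$, $p$, $q$ and

$$\frac{1-\theta}{p} = \frac12 - \frac{3}{2q}.$$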


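Proof. First, we notice that

$$\Big(B_{\infty,\infty}^{1-\frac2p-\frac3q},\, B_{\infty,\infty}^{1-\frac3q}\Big)_{\theta,1} = B_{\infty,1}^{0}$$

with

$$\frac{1-\theta}{p} = \frac12 - \frac{3}{2q},$$

see Theorem 6.4.5 in [2]. Also the imbedding $B_{\infty,1}^0 \hookrightarrow L^\infty$ is true due to Theorem 6.2.4 in [2]. Hence, one has

$$\|\nabla f\|_{L^\infty} \le C\|\nabla f\|_{B_{\infty,1}^0} \le C\,\|\nabla f\|^{\theta}_{B_{\infty,\infty}^{1-\frac3q}}\, \|\nabla f\|^{1-\theta}_{B_{\infty,\infty}^{1-\frac2p-\frac3q}}. \tag{3.3}$$

We remark that

$$B_{q,p}^{2(1-\frac1p)} \hookrightarrow B_{\infty,\infty}^{2-\frac2p-\frac3q} \hookrightarrow B_{\infty,\infty}^{1-\frac2p-\frac3q}, \qquad W^{1,q} \hookrightarrow B_{\infty,\infty}^{1-\frac3q},$$

see Theorem 6.2.4 and Theorem 6.5.1 in [2]. Hence, according to (3.3), one deduces that

$$\begin{aligned} \|\nabla f\|_{L^p(0,T;L^\infty)} &\le C\left(\int_0^T \|\nabla f\|^{p\theta}_{B_{\infty,\infty}^{1-\frac3q}}\, \|\nabla f\|^{p(1-\theta)}_{B_{\infty,\infty}^{1-\frac2p-\frac3q}}\,dt\right)^{\frac1p} \\ &\le C\left(\int_0^T \|f\|^{p\theta}_{W^{2,q}}\, \|f\|^{p(1-\theta)}_{B_{q,p}^{2(1-\frac1p)}}\,dt\right)^{\frac1p} \\ &\le C\,T^{\frac12-\frac{3}{2q}}\, \|f\|^{1-\theta}_{L^\infty(0,T;B_{q,p}^{2(1-\frac1p)})}\, \|f\|^{\theta}_{L^p(0,T;W^{2,q})}. \end{aligned}$$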
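The exponent bookkeeping in this proof can be checked by hand. The sketch below assumes the usual rule that real interpolation of Besov spaces with fixed integrability interpolates the smoothness index linearly; the labels $s_0$ and $s_1$ are shorthand introduced here and do not appear in the paper. The last line is Hölder's inequality in time with exponents $1/\theta$ and $1/(1-\theta)$.

```latex
\begin{align*}
% smoothness indices of the two endpoint spaces (our shorthand)
s_0 &:= 1 - \tfrac{2}{p} - \tfrac{3}{q}, \qquad s_1 := 1 - \tfrac{3}{q}, \\
% the interpolated index must vanish to land in B^0_{\infty,1}
0 &= (1-\theta)\,s_0 + \theta\,s_1
   = 1 - \tfrac{3}{q} - \tfrac{2(1-\theta)}{p}
   \;\Longleftrightarrow\;
   \frac{1-\theta}{p} = \frac12 - \frac{3}{2q}, \\
% Hoelder in time produces the power of T in Lemma 3.2
\Big(\int_0^T \|f\|_{W^{2,q}}^{p\theta}\,dt\Big)^{1/p}
  &\le T^{\frac{1-\theta}{p}}\,\|f\|_{L^p(0,T;W^{2,q})}^{\theta}
   = T^{\frac12-\frac{3}{2q}}\,\|f\|_{L^p(0,T;W^{2,q})}^{\theta}.
\end{align*}
```

Note that $1 - \theta = \frac{p}{2} - \frac{3p}{2q}$, so the hypothesis of Lemma 3.2 is exactly the requirement $\theta \in (0, 1)$.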

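Lemma 3.3. For $f \in L^p(0, T; L^q)$ and $\partial_t f \in L^p(0, T; L^q)$ with $f_0 = f(0) \in L^q$, we have, for all $t \in [0, T]$,

$$\|f\|_{L^\infty(0,t;L^q)} \le C\Big(\|f_0\|_{L^q} + \|f\|_{L^p(0,t;L^q)} + \|\partial_t f\|_{L^p(0,t;L^q)}\Big), \tag{3.4}$$

for some positive constant $C$ independent of $T$ and $f$.

Proof. Indeed, we have

$$\begin{aligned} \|f(t)\|^p_{L^q} &= \|f_0\|^p_{L^q} + \int_0^t \frac{d}{ds}\|f(s)\|^p_{L^q}\,ds \\ &= \|f_0\|^p_{L^q} + p\int_0^t \|f(s)\|^{p-q}_{L^q} \int_{\mathbb{R}^3} |f(s)|^{q-2} f(s)\,\partial_s f(s)\,dx\,ds \\ &\le \|f_0\|^p_{L^q} + p\int_0^t \|f(s)\|^{p-1}_{L^q}\, \|\partial_s f\|_{L^q}\,ds \\ &\le \|f_0\|^p_{L^q} + p\left(\int_0^t \|f\|^p_{L^q}\,ds\right)^{\frac{p-1}{p}} \left(\int_0^t \|\partial_s f\|^p_{L^q}\,ds\right)^{\frac1p}, \end{aligned}$$

and consequently, (3.4) follows from Hölder's inequality.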


4. Local Existence

In this section, we prove the local existence and uniqueness of a strong solution in Theorem 2.1. The proof will be divided into several steps: constructing the approximate solutions by iteration, obtaining the uniform estimate, and showing the convergence, consistency, and uniqueness.

4.1. Construction of approximate solutions. We initialize the construction of approximate solutions by setting $F^0 := F_0$ and $u^0 := u_0$. For given $(u^n, F^n)$, the Stokes equations (1.8a) and the parabolic equation (1.8b) enable us to define $(u^{n+1}, F^{n+1}, P^{n+1})$ as the global solution of

$$\frac{\partial u^{n+1}}{\partial t} - \Delta u^{n+1} + \nabla P^{n+1} = -u^n \cdot \nabla u^n - \operatorname{div}\big((F^n)^\top F^n\big), \tag{4.1a}$$

$$\frac{\partial F^{n+1}}{\partial t} - \Delta F^{n+1} = -u^n \cdot \nabla F^n - F^n \nabla u^n, \tag{4.1b}$$

$$\operatorname{div} u^{n+1} = 0, \tag{4.1c}$$

with the initial-boundary conditions:

$$u^{n+1}|_{t=0} = u_0, \quad F^{n+1}|_{t=0} = F_0, \quad u^{n+1}|_{\partial\Omega} = 0, \quad F^{n+1}|_{\partial\Omega} = 0,$$

and

$$\int_\Omega P^{n+1}\,dx = 0.$$

According to Theorem 3.1 and Theorem 3.2, an argument by induction yields a sequence $\{(u^n, F^n, P^n)\}_{n \in \mathbb{N}} \subset M_T^{p,q}$ for all positive $T$.

4.2. Uniform estimate for some small fixed time T. We aim at finding a positive time $T$ independent of $n$ for which $\{(u^n, F^n, P^n)\}_{n \in \mathbb{N}}$ is uniformly bounded in the space $M_T^{p,q}$. Indeed, applying Theorem 3.1 and Theorem 3.2, we obtain

$$\|u^{n+1}(T)\|_{D_{A_q}^{1-\frac1p,\,p}} + \left(\int_0^T \big\|\big(\nabla P^{n+1}, \Delta u^{n+1}, u^{n+1}, \partial_t u^{n+1}\big)\big\|^p_{L^q}\,dt\right)^{\frac1p} \le C\left(\|u_0\|_{D_{A_q}^{1-\frac1p,\,p}} + \left(\int_0^T \big\|u^n \cdot \nabla u^n + \operatorname{div}\big((F^n)^\top F^n\big)\big\|^p_{L^q}\,dt\right)^{\frac1p}\right), \tag{4.2}$$

and

$$\|F^{n+1}(T)\|_{B_{q,p}^{2(1-\frac1p)}} + \|F^{n+1}\|_{\mathcal{W}(0,T)} \le C\Big(\|F_0\|_{B_{q,p}^{2(1-\frac1p)}} + \|F^n \nabla u^n + u^n \cdot \nabla F^n\|_{L^p(0,T;L^q(\Omega))}\Big). \tag{4.3}$$


Now define

$$U^n(t) := \|u^n\|_{L^\infty(0,t;D_{A_q}^{1-\frac1p,\,p})} + \|u^n\|_{L^p(0,t;W^{2,q})} + \|\partial_t u^n\|_{L^p(0,t;L^q)} + \|F^n\|_{L^\infty(0,t;B_{q,p}^{2(1-\frac1p)})} + \|F^n\|_{\mathcal{W}(0,t)},$$

and

$$U^0 = \|u_0\|_{D_{A_q}^{1-\frac1p,\,p}} + \|F_0\|_{B_{q,p}^{2(1-\frac1p)} \cap L^q}.$$

Hence, from (4.2) and (4.3), one has, using Lemmas 3.1–3.3,

$$\begin{aligned} U^{n+1}(t) &\le C\Big(U^0 + \|F^n\|_{L^\infty(0,t;L^q)}\|\nabla F^n\|_{L^p(0,t;L^\infty)} + \|u^n\|_{L^\infty(0,t;L^q)}\|\nabla u^n\|_{L^p(0,t;L^\infty)} \\ &\qquad + \|u^n\|_{L^\infty(0,t;L^q)}\|\nabla F^n\|_{L^p(0,t;L^\infty)} + \|F^n\|_{L^\infty(0,t;L^q)}\|\nabla u^n\|_{L^p(0,t;L^\infty)}\Big) \\ &\le C\Big(U^0 + t^{\frac12-\frac{3}{2q}}\big(U^0 + U^n(t)\big)U^n(t)\Big). \end{aligned} \tag{4.4}$$

Hence, if we assume that $U^n(t) \le 4CU^0$ on $[0, T_0]$ with

$$0 < T_0 \le \left(\frac{3}{4C(4C+1)U^0}\right)^{\frac{2q}{q-3}}, \tag{4.5}$$

then a direct computation yields

$$U^{n+1}(t) \le 4CU^0 \quad \text{on } [0, T_0].$$

Coming back to (4.2), (4.3), and (4.4), we conclude that the sequence $\{(u^n, F^n, P^n)\}_{n=1}^\infty$ is uniformly bounded in $M_{T_0}^{p,q}$. More precisely, we have

Lemma 4.1. For all $t \in [0, T_0]$ with $T_0$ satisfying (4.5),

$$U^n(t) \le 4CU^0. \tag{4.6}$$
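The "direct computation" behind the induction step can be written out as follows. This is only a sketch: the abbreviation $a$ for the time exponent is introduced here for readability, and $C$ is the same generic constant as in (4.4).

```latex
\begin{align*}
% shorthand for the time exponent; positive because q > 3
a &:= \frac12 - \frac{3}{2q} = \frac{q-3}{2q} > 0, \\
% insert the induction hypothesis U^n(t) <= 4 C U^0 into (4.4)
U^{n+1}(t)
  &\le C\Big(U^0 + t^{a}\,(U^0 + 4CU^0)\,4CU^0\Big)
   = CU^0\Big(1 + 4C(4C+1)\,U^0\,t^{a}\Big), \\
% condition (4.5) is exactly what makes the bracket at most 4
t \le T_0 \le \Big(\frac{3}{4C(4C+1)U^0}\Big)^{1/a}
  &\;\Longrightarrow\; 4C(4C+1)\,U^0\,t^{a} \le 3
   \;\Longrightarrow\; U^{n+1}(t) \le 4CU^0 .
\end{align*}
```

Since $1/a = \frac{2q}{q-3}$, the threshold on $T_0$ in the last line coincides with (4.5).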

4.3. Convergence of the approximate sequence. We now prove

Lemma 4.2. $\{(u^n, F^n, P^n)\}_{n=1}^\infty$ is a Cauchy sequence and thus converges in $M_{T_0}^{p,q}$.

Proof. Let

$$\delta u^n := u^{n+1} - u^n, \quad \delta P^n := P^{n+1} - P^n, \quad \delta F^n := F^{n+1} - F^n.$$

Define

$$\delta U^n(t) := \|\delta u^n\|_{L^\infty(0,t;D_{A_q}^{1-\frac1p,\,p})} + \|\delta u^n\|_{L^p(0,t;W^{2,q})} + \|\partial_t \delta u^n\|_{L^p(0,t;L^q)} + \|\delta F^n\|_{L^\infty(0,t;B_{q,p}^{2(1-\frac1p)})} + \|\delta F^n\|_{\mathcal{W}(0,t)}. \tag{4.7}$$


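The triplet $(\delta u^n, \delta F^n, \delta P^n)$ satisfies

$$\begin{cases} \dfrac{\partial \delta u^n}{\partial t} - \Delta \delta u^n + \nabla \delta P^n = -u^n \cdot \nabla u^n + u^{n-1} \cdot \nabla u^{n-1} - \operatorname{div}\big((F^n)^\top F^n\big) + \operatorname{div}\big((F^{n-1})^\top F^{n-1}\big), \\[1ex] \dfrac{\partial \delta F^n}{\partial t} - \Delta \delta F^n = -u^n \cdot \nabla F^n - F^n \nabla u^n + u^{n-1} \cdot \nabla F^{n-1} + F^{n-1} \nabla u^{n-1}, \\[1ex] \operatorname{div} \delta u^n = 0, \end{cases} \tag{4.8}$$

with

$$\delta u^n|_{t=0} = \delta u^n|_{\partial\Omega} = 0, \quad \delta F^n|_{t=0} = \delta F^n|_{\partial\Omega} = 0,$$

and

$$\int_\Omega \delta P^n\,dx = 0.$$

Notice that, using Lemma 3.1 and Lemma 3.2,

$$\begin{aligned} &\|{-u^n} \cdot \nabla u^n + u^{n-1} \cdot \nabla u^{n-1}\|_{L^p(0,T;L^q(\Omega))} \\ &\quad = \|\delta u^{n-1} \cdot \nabla u^n + u^{n-1} \cdot \nabla \delta u^{n-1}\|_{L^p(0,T;L^q)} \\ &\quad \le \|u^{n-1}\|_{L^\infty(0,T;L^q)}\|\nabla \delta u^{n-1}\|_{L^p(0,T;L^\infty)} + \|\delta u^{n-1}\|_{L^\infty(0,T;L^q)}\|\nabla u^n\|_{L^p(0,T;L^\infty)} \\ &\quad \le 4CU^0\Big(\|\nabla \delta u^{n-1}\|_{L^p(0,T;L^\infty)} + T^{\frac12-\frac{3}{2q}}\|\delta u^{n-1}\|_{L^\infty(0,T;L^q)}\Big), \end{aligned} \tag{4.9}$$

$$\begin{aligned} &\big\|{-\operatorname{div}}\big((F^n)^\top F^n\big) + \operatorname{div}\big((F^{n-1})^\top F^{n-1}\big)\big\|_{L^p(0,T;L^q(\Omega))} \\ &\quad = \big\|{-\operatorname{div}}\big((\delta F^{n-1})^\top F^n\big) - \operatorname{div}\big((F^{n-1})^\top \delta F^{n-1}\big)\big\|_{L^p(0,T;L^q)} \\ &\quad \le \|F^n\|_{L^\infty(0,T;L^q)}\|\nabla \delta F^{n-1}\|_{L^p(0,T;L^\infty)} + \|\delta F^{n-1}\|_{L^\infty(0,T;L^q)}\|\nabla F^{n-1}\|_{L^p(0,T;L^\infty)} \\ &\qquad + \|F^{n-1}\|_{L^\infty(0,T;L^q)}\|\nabla \delta F^{n-1}\|_{L^p(0,T;L^\infty)} + \|\delta F^{n-1}\|_{L^\infty(0,T;L^q)}\|\nabla F^n\|_{L^p(0,T;L^\infty)} \\ &\quad \le 4CU^0\Big(\|\nabla \delta F^{n-1}\|_{L^p(0,T;L^\infty)} + T^{\frac12-\frac{3}{2q}}\|\delta F^{n-1}\|_{L^\infty(0,T;L^q)}\Big), \end{aligned} \tag{4.10}$$

$$\begin{aligned} &\|{-u^n} \cdot \nabla F^n + u^{n-1} \cdot \nabla F^{n-1}\|_{L^p(0,T;L^q)} \\ &\quad = \|u^n \cdot \nabla \delta F^{n-1} + \delta u^{n-1} \cdot \nabla F^{n-1}\|_{L^p(0,T;L^q)} \\ &\quad \le \|u^n\|_{L^\infty(0,T;L^q)}\|\nabla \delta F^{n-1}\|_{L^p(0,T;L^\infty)} + \|\delta u^{n-1}\|_{L^\infty(0,T;L^q)}\|\nabla F^{n-1}\|_{L^p(0,T;L^\infty)} \\ &\quad \le 4CU^0\Big(\|\nabla \delta F^{n-1}\|_{L^p(0,T;L^\infty)} + T^{\frac12-\frac{3}{2q}}\|\delta u^{n-1}\|_{L^\infty(0,T;L^q)}\Big), \end{aligned} \tag{4.11}$$

and

$$\begin{aligned} &\|{-F^n} \nabla u^n + F^{n-1} \nabla u^{n-1}\|_{L^p(0,T;L^q)} \\ &\quad = \|F^n \nabla \delta u^{n-1} + \delta F^{n-1} \nabla u^{n-1}\|_{L^p(0,T;L^q)} \\ &\quad \le \|F^n\|_{L^\infty(0,T;L^q)}\|\nabla \delta u^{n-1}\|_{L^p(0,T;L^\infty)} + \|\delta F^{n-1}\|_{L^\infty(0,T;L^q)}\|\nabla u^{n-1}\|_{L^p(0,T;L^\infty)} \\ &\quad \le 4CU^0\Big(\|\nabla \delta u^{n-1}\|_{L^p(0,T;L^\infty)} + T^{\frac12-\frac{3}{2q}}\|\delta F^{n-1}\|_{L^\infty(0,T;L^q)}\Big). \end{aligned} \tag{4.12}$$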


Applying Theorems 3.1–3.2 with the help of (4.9)–(4.12), one deduces that

$$\delta U^n(t) \le 8CU^0\Big(\|\nabla \delta u^{n-1}\|_{L^p(0,t;L^\infty)} + \|\nabla \delta F^{n-1}\|_{L^p(0,t;L^\infty)} + t^{\frac12-\frac{3}{2q}}\big(\|\delta F^{n-1}\|_{L^\infty(0,t;L^q)} + \|\delta u^{n-1}\|_{L^\infty(0,t;L^q)}\big)\Big). \tag{4.13}$$

On the other hand, (4.7) implies that, by Lemma 3.3,

$$\|\delta F^{n-1}\|_{L^\infty(0,t;L^q)} + \|\delta u^{n-1}\|_{L^\infty(0,t;L^q)} \le \delta U^{n-1}(t),$$

which, combined with (4.13), Lemma 3.1 and Lemma 3.2, gives

$$\delta U^n(t) \le 16CU^0\, t^{\frac12-\frac{3}{2q}}\, \delta U^{n-1}(t). \tag{4.14}$$

Thus, if we choose $T_0$ satisfying (4.5) such that the condition

$$16CU^0\, T_0^{\frac12-\frac{3}{2q}} \le \frac12$$

is fulfilled, it is clear that $\{(u^n, F^n, P^n)\}_{n=1}^\infty$ is a Cauchy sequence in $M_{T_0}^{p,q}$.
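The Cauchy property follows from (4.14) by the standard contraction argument. A minimal sketch, stated only for the quantities that enter $\delta U^n$ and assuming (4.14) holds for every $n \ge 1$, reads:

```latex
\begin{align*}
% iterate the contraction estimate (4.14) at time T_0
\delta U^n(T_0) &\le \tfrac12\,\delta U^{n-1}(T_0)
  \le \dots \le 2^{-n}\,\delta U^0(T_0), \\
% telescoping sum: for m > n the differences form a geometric tail
\sum_{k=n}^{m-1} \delta U^k(T_0)
  &\le \delta U^0(T_0) \sum_{k=n}^{\infty} 2^{-k}
   = 2^{\,1-n}\,\delta U^0(T_0)
   \;\xrightarrow[n \to \infty]{}\; 0 .
\end{align*}
```

By the triangle inequality, the left-hand side of the second line bounds the distance between the $m$-th and $n$-th iterates in each of the norms appearing in (4.7).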

4.4. The limit is a solution. Since $\{(u^n, F^n, P^n)\}_{n=1}^\infty$ is a Cauchy sequence in $M_{T_0}^{p,q}$, it converges. Let $(u, F, P) \in M_{T_0}^{p,q}$ be the limit of the sequence $\{(u^n, F^n, P^n)\}_{n=1}^\infty$ in $M_{T_0}^{p,q}$. We claim that all the nonlinear terms in (4.1) converge to their corresponding terms in (1.5) in $L^p(0, T_0; L^q)$. Indeed, using Lemmas 3.1 and 3.3, we have,

$$\begin{aligned} \|u^n \cdot \nabla u^n - u \cdot \nabla u\|_{L^p(0,T_0;L^q)} &= \|(u^n - u) \cdot \nabla u^n + u \cdot \nabla(u^n - u)\|_{L^p(0,T_0;L^q)} \\ &\le \|u^n - u\|_{L^\infty(0,T_0;L^q)}\|\nabla u^n\|_{L^p(0,T_0;L^\infty)} + \|u\|_{L^\infty(0,T_0;L^q)}\|\nabla u^n - \nabla u\|_{L^p(0,T_0;L^\infty)} \\ &\le C\|u^n - u\|_{M_{T_0}^{p,q}}\, T_0^{\frac12-\frac{3}{2q}}\, CU^0 + C\|u\|_{L^\infty(0,T_0;L^q)}\, T_0^{\frac12-\frac{3}{2q}}\, \|u^n - u\|_{M_{T_0}^{p,q}} \\ &\to 0, \end{aligned}$$

as $n \to \infty$ due to the convergence of $u^n$ to $u$ in $M_{T_0}^{p,q}$ and Lemma 3.3. Hence,

$$u^n \cdot \nabla u^n \to u \cdot \nabla u \quad \text{in } L^p(0, T_0; L^q).$$

Similarly, we have

$$\operatorname{div}\big((F^n)^\top F^n\big) \to \operatorname{div}(F^\top F), \quad u^n \cdot \nabla F^n \to u \cdot \nabla F, \quad F^n \nabla u^n \to F \nabla u, \quad \text{in } L^p(0, T_0; L^q).$$

Thus, taking the limit as $n \to \infty$ in (4.1), we conclude that (1.5) holds in $L^p(0, T_0; L^q)$, and hence almost everywhere on $\Omega \times [0, T_0]$.


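4.5. Uniqueness. Let $(u_1, F_1, P_1)$ and $(u_2, F_2, P_2)$ be two solutions to (1.5) with the initial-boundary conditions (1.6) and (1.7). Denote

$$\delta u = u_1 - u_2, \quad \delta F = F_1 - F_2, \quad \delta P = P_1 - P_2.$$

Note that the triplet $(\delta u, \delta F, \delta P)$ satisfies the following system:

$$\begin{cases} \partial_t \delta u - \Delta \delta u + \nabla \delta P = -u_2 \cdot \nabla \delta u - \delta u \cdot \nabla u_1 - \operatorname{div}\big((\delta F)^\top F_1 + F_2^\top \delta F\big), \\ \partial_t \delta F - \Delta \delta F = -u_1 \cdot \nabla \delta F - \delta u \cdot \nabla F_2 - F_1 \nabla \delta u - \delta F \nabla u_2, \\ \operatorname{div} \delta u = 0, \end{cases} \tag{4.15}$$

with the initial-boundary conditions

$$\delta u|_{t=0} = \delta u|_{\partial\Omega} = 0, \quad \delta F|_{t=0} = \delta F|_{\partial\Omega} = 0,$$

and

$$\int_\Omega \delta P\,dx = 0.$$

Define

$$X(t) := \|\delta u\|_{L^\infty(0,t;D_{A_q}^{1-\frac1p,\,p})} + \|\delta u\|_{L^p(0,t;W^{2,q})} + \|\partial_t \delta u\|_{L^p(0,t;L^q)} + \|\delta F\|_{L^\infty(0,t;B_{q,p}^{2(1-\frac1p)})} + \|\delta F\|_{\mathcal{W}(0,t)}.$$

Thus, applying Lemmas 3.1 and 3.2 to (4.15), one has, repeating the argument in (4.9)–(4.12),

$$\begin{aligned} X(t) &\le 4CU^0\Big(\|\nabla \delta u\|_{L^p(0,t;L^\infty)} + \|\nabla \delta F\|_{L^p(0,t;L^\infty)} + t^{\frac12-\frac{3}{2q}}\big(\|\delta F\|_{L^\infty(0,t;L^q)} + \|\delta u\|_{L^\infty(0,t;L^q)}\big)\Big) \\ &\le 16CU^0\, t^{\frac12-\frac{3}{2q}}\, X(t) \le \frac12 X(t). \end{aligned}$$

Hence, $X(t) = 0$ for all $t \in [0, T_0]$, which guarantees the uniqueness on the interval $[0, T_0]$.

5. Global Existence

In this section, we prove that, if the initial data is sufficiently small, the local solution established in the previous section is indeed global in time. To this end, we first denote by $T^*$ the maximal time of existence for $(u, F, P)$. Define the function $H(t)$ as

$$H(t) := \|u\|_{L^p(0,t;W^{2,q})} + \|\partial_t u\|_{L^p(0,t;L^q)} + \|u\|_{L^\infty(0,t;D_{A_q}^{1-\frac1p,\,p})} + \|P\|_{L^p(0,t;W^{2,q})} + \|F\|_{L^\infty(0,t;B_{q,p}^{2(1-\frac1p)})} + \|F\|_{\mathcal{W}(0,t)},$$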


and

$$H_0 := \|u_0\|_{D_{A_q}^{1-\frac1p,\,p}} + \|F_0\|_{B_{q,p}^{2(1-\frac1p)} \cap L^q}.$$

To extend the local solution, we need to control the maximal time $T^*$ only in terms of the initial data. For this purpose, we observe that $H(t)$ is an increasing and continuous function in $[0, T^*)$, and for all $t \in [0, T^*)$, we have, using Lemmas 3.1 and 3.2,

$$H(t) \le C\Big(H_0 + \|u \cdot \nabla u\|_{L^p(0,t;L^q)} + \|\operatorname{div}(F^\top F)\|_{L^p(0,t;L^q)} + \|u \cdot \nabla F + F \nabla u\|_{L^p(0,t;L^q)}\Big). \tag{5.1}$$

On the other hand, Lemmas 3.1–3.3 imply that

$$\begin{aligned} \|u \cdot \nabla u\|_{L^p(0,t;L^q)} &\le \|u\|_{L^\infty(0,t;L^q(\Omega))}\|\nabla u\|_{L^p(0,t;L^\infty)} \\ &\le C\big(\|u_0\|_{L^q} + H(t)\big)H(t)\,t^{\frac12-\frac{3}{2q}} \\ &\le C\big(H_0 + H(t)\big)H(t)\,t^{\frac12-\frac{3}{2q}}, \end{aligned} \tag{5.2}$$

$$\begin{aligned} \|\operatorname{div}(F^\top F)\|_{L^p(0,t;L^q)} &\le C\|F\|_{L^\infty(0,t;L^q)}\|\nabla F\|_{L^p(0,t;L^\infty)} \\ &\le C\big(\|F_0\|_{L^q} + H(t)\big)H(t)\,t^{\frac12-\frac{3}{2q}} \\ &\le C\big(H_0 + H(t)\big)H(t)\,t^{\frac12-\frac{3}{2q}}, \end{aligned} \tag{5.3}$$

and, similarly, by Lemma 3.3,

$$\begin{aligned} \|u \cdot \nabla F + F \nabla u\|_{L^p(0,t;L^q)} &\le \|u\|_{L^\infty(0,t;L^q)}\|\nabla F\|_{L^p(0,t;L^\infty)} + \|F\|_{L^\infty(0,t;L^q)}\|\nabla u\|_{L^p(0,t;L^\infty)} \\ &\le C\big(\|u_0\|_{L^q} + H(t)\big)H(t)\,t^{\frac12-\frac{3}{2q}} + C\big(\|F_0\|_{L^q} + H(t)\big)H(t)\,t^{\frac12-\frac{3}{2q}} \\ &\le C\big(H_0 + H(t)\big)H(t)\,t^{\frac12-\frac{3}{2q}}. \end{aligned} \tag{5.4}$$

Substituting (5.2)–(5.4) into (5.1), we get

$$H(t) \le C\Big(H_0 + \big(H_0 + H(t)\big)H(t)\,t^{\frac12-\frac{3}{2q}}\Big). \tag{5.5}$$

Assume that $T$ is the smallest number such that $H(T) = 4CH_0$. This is possible because $H(t)$ is an increasing and continuous function in time. Then,

$$H(t) < H(T) = 4CH_0 \quad \text{for all } t \in [0, T),$$

and from (5.5), we deduce that

$$3 \le (H_0 + 4CH_0)\,4C\,T^{\frac12-\frac{3}{2q}}.$$

Hence, we have

$$T^* > T \ge \left(\frac{3}{8C(H_0 + 4CH_0)}\right)^{\frac{2q}{q-3}}.$$


This implies that the maximal time of existence will go to infinity when the initial data approaches zero. More precisely, we can show that, if the initial data is sufficiently small, the solution exists globally in time. To this end, we need some other estimates for the terms on the right side of (5.1). Indeed, by the imbedding $W^{1,q} \hookrightarrow L^\infty$, as $q > 3$, we have

$$\begin{aligned} \|u \cdot \nabla u\|_{L^p(0,t;L^q)} &\le \|u\|_{L^\infty(0,t;L^q(\Omega))}\|\nabla u\|_{L^p(0,t;L^\infty)} \\ &\le C\big(\|u_0\|_{L^q} + H(t)\big)\|u\|_{L^p(0,t;W^{2,q})} \\ &\le C\big(H_0 + H(t)\big)H(t). \end{aligned}$$

Similarly, we have

$$\|\operatorname{div}(F^\top F)\|_{L^p(0,t;L^q)} \le C\big(H_0 + H(t)\big)H(t),$$

and

$$\|u \cdot \nabla F + F \nabla u\|_{L^p(0,t;L^q)} \le C\big(H_0 + H(t)\big)H(t).$$

Thus, (5.1) turns out to be

$$H(t) \le C\Big(H_0 + \big(H_0 + H(t)\big)H(t)\Big). \tag{5.6}$$

By the Cauchy-Schwarz inequality, (5.6) becomes

$$H(t) \le C\big(H_0 + H_0^2 + 2H^2(t)\big), \tag{5.7}$$

for all $t \in [0, T^*)$. Now we take $H_0$ sufficiently small such that

$$H_0 + H_0^2 \le \delta := \frac{1}{8C^2}. \tag{5.8}$$

Then, under the assumption (5.8), we compute directly from (5.7) and the continuity of $H(t)$ that

$$H(t) \le \frac{1 - \sqrt{1 - 8C^2(H_0 + H_0^2)}}{4C} \le \frac{1}{4C}, \tag{5.9}$$

for all $t \in [0, T^*)$. In particular, this implies that

$$\|(u, F, P)\|_{M_{T^*}^{p,q}} \le \frac{1}{4C} < \infty.$$

Hence, according to the local existence in the previous section, we can extend the solution on [0, T ∗ ) to some larger interval [0, T ∗ + T0 ) with T0 > 0. This is impossible since T ∗ is already the maximal time of existence. Hence, when the initial data satisfies (5.8), the strong solution is indeed global in time. The proof of Theorem 2.1 is complete.
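The passage from (5.7) to (5.9) is a quadratic inequality combined with a continuity argument. The following sketch spells it out under two assumptions that are ours rather than the paper's: the inequality in (5.8) is taken to be strict, so that the two roots are distinct, and $H(0) \le H_-$ is taken for granted (it holds, for instance, when $H(0) \le H_0$ and $C \ge 1$). The symbols $\varepsilon$ and $H_\pm$ are shorthand introduced here.

```latex
\begin{align*}
% abbreviate the small quantity controlled by (5.8)
\varepsilon &:= H_0 + H_0^2 < \frac{1}{8C^2}, \\
% (5.7) rewritten as a quadratic inequality in H(t)
2C\,H(t)^2 - H(t) + C\varepsilon &\ge 0
  \quad\Longleftrightarrow\quad
  H(t) \le H_- \ \text{ or } \ H(t) \ge H_+, \\
% the two real roots of 2C x^2 - x + C eps = 0
H_\pm &:= \frac{1 \pm \sqrt{1 - 8C^2\varepsilon}}{4C},
  \qquad H_- < \frac{1}{4C} < H_+ .
\end{align*}
```

Because $H$ is continuous and starts below $H_-$, it cannot jump across the forbidden interval $(H_-, H_+)$, so $H(t) \le H_-$ on $[0, T^*)$, which is (5.9).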


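6. Weak-Strong Uniqueness

The purpose of this section is to show the Weak-Strong Uniqueness in Theorem 2.2. To this end, we first need to obtain an energy estimate for the strong solution to the system (1.5). More precisely, we have

Lemma 6.1. Let $p$, $q$ satisfy the same conditions as in Theorem 2.1 and let $(u, F, P) \in M_{T_0}^{p,q}$ be the unique solution to (1.5) on $\Omega \times [0, T_0]$. Then, one has,

$$\int_\Omega \big(|u(t)|^2 + |F(t)|^2\big)\,dx + \int_0^t\!\!\int_\Omega \big(|\nabla u|^2 + |\nabla F|^2\big)\,dx\,ds = \int_\Omega \big(|u_0|^2 + |F_0|^2\big)\,dx.$$

Proof. Note that

$$u \in C\big([0, T_0]; D_{A_q}^{1-\frac1p,\,p}\big) \cap L^p\big(0, T_0; W^{2,q}\big) \quad \text{with } q > 3.$$

Then, we have $u \in C([0, T_0]; L^2) \cap L^2(0, T_0; H^{1+\alpha})$ for some $\alpha \ge 0$, using

$$D_{A_q}^{1-\frac1p,\,p} \hookrightarrow B_{q,p}^{2(1-\frac1p)} \cap L^q(\Omega) \hookrightarrow L^2(\Omega),$$

Sobolev's embedding $W^{2,q}(\Omega) \hookrightarrow H^2(\Omega)$ as $q > 3$, and the standard interpolation inequality. Similarly, $F \in C([0, T_0]; L^2) \cap L^2(0, T_0; H^{1+\alpha})$. Taking the $L^2$ scalar product in (1.5a) with $u$ and performing integration by parts, we obtain

$$\frac{d}{dt}\int_\Omega |u|^2\,dx + \int_\Omega |\nabla u|^2\,dx = \int_\Omega F^\top F : \nabla u\,dx, \tag{6.1}$$

where the notation $A : B$ means the inner product between two matrices, i.e. $A : B = \sum_{i,j} A_{ij}B_{ij}$. Similarly, taking the $L^2$ inner product in (1.5b) with $F$ and performing integration by parts, we obtain

$$\frac{d}{dt}\int_\Omega |F|^2\,dx + \int_\Omega |\nabla F|^2\,dx = -\int_\Omega F\nabla u : F\,dx - \int_\Omega F : (u \cdot \nabla F)\,dx, \tag{6.2}$$

where $|F|^2 = F : F$ and

$$|\nabla F|^2 = \sum_{i,j,k} \left|\frac{\partial F_{ij}}{\partial x_k}\right|^2.$$

Notice that

$$\int_\Omega F : (u \cdot \nabla F)\,dx = \frac12 \int_\Omega u \cdot \nabla |F|^2\,dx = -\frac12 \int_\Omega \operatorname{div} u\,|F|^2\,dx = 0,$$

and, due to $AB : C = A : CB^\top = B : A^\top C$,

$$\int_\Omega F\nabla u : F\,dx = \int_\Omega \nabla u : F^\top F\,dx.$$

Hence, adding (6.1) and (6.2) together, we have

$$\frac{d}{dt}\int_\Omega \big(|u|^2 + |F|^2\big)\,dx + \int_\Omega \big(|\nabla u|^2 + |\nabla F|^2\big)\,dx = 0.$$

Integrating the above equality over the time interval $[0, t]$, we obtain the energy equality of this lemma.

Now, we recall that for the weak solution $(v, E, \Pi)$ obtained in [16], we have for (almost) all $t \in (0, T)$,

$$\frac12 \int_\Omega \big(|v(t)|^2 + |E(t)|^2\big)\,dx + \int_0^t\!\!\int_\Omega \big(|\nabla v|^2 + |\nabla E|^2\big)\,dx\,ds \le \frac12 \int_\Omega \big(|u_0|^2 + |F_0|^2\big)\,dx. \tag{6.3}$$

We remark that, in view of the regularity of $u$, we deduce from the weak formulation of (1.5) the following equalities:

$$\int_\Omega v \cdot u\,dx + \int_0^t\!\!\int_\Omega \nabla u : \nabla v\,dx\,ds = \int_\Omega |u_0|^2\,dx + \int_0^t\!\!\int_\Omega E^\top E : \nabla u\,dx\,ds + \int_0^t\!\!\int_\Omega v \cdot \left(\frac{\partial u}{\partial t} + v \cdot \nabla u\right)dx\,ds, \tag{6.4}$$

and

$$\int_\Omega F : E\,dx + \int_0^t\!\!\int_\Omega \nabla F : \nabla E\,dx\,ds = \int_\Omega |F_0|^2\,dx - \int_0^t\!\!\int_\Omega v \cdot \nabla E : F\,dx\,ds - \int_0^t\!\!\int_\Omega E\nabla v : F\,dx\,ds + \int_0^t\!\!\int_\Omega E : \frac{\partial F}{\partial t}\,dx\,ds, \tag{6.5}$$

for a.e. $t \in (0, T)$. Here, we used the identity

$$\int_\Omega v \cdot \nabla u \cdot w\,dx = -\int_\Omega v \cdot \nabla w \cdot u\,dx,$$

for a vector $w$, if $\operatorname{div} v = 0$. Since $F$ satisfies Eq. (1.5b), we substitute (1.5b) into (6.5), and use the following two facts:

$$\int_0^t\!\!\int_\Omega \big(v \cdot \nabla E : F + v \cdot \nabla F : E\big)\,dx\,ds = \int_0^t\!\!\int_\Omega v \cdot \nabla(E : F)\,dx\,ds = 0,$$

and

$$E\nabla u : F + E : F\nabla u = \nabla u : (E^\top F + F^\top E),$$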


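to obtain

$$\begin{aligned} \int_\Omega F : E\,dx + 2\int_0^t\!\!\int_\Omega \nabla F : \nabla E\,dx\,ds &= \int_\Omega |F_0|^2\,dx - \int_0^t\!\!\int_\Omega \nabla u : (E^\top F + F^\top E)\,dx\,ds \\ &\quad + \int_0^t\!\!\int_\Omega (v - u) \cdot \nabla F : E\,dx\,ds - \int_0^t\!\!\int_\Omega F : E\nabla(v - u)\,dx\,ds. \end{aligned} \tag{6.6}$$

On the other hand, we can write the equation for $u$ as

$$\frac{\partial u}{\partial t} + v \cdot \nabla u - \Delta u + \nabla P = (v - u) \cdot \nabla u - \operatorname{div}(F^\top F). \tag{6.7}$$

Multiplying (6.7) by $v$ and integrating over $\Omega \times (0, t)$, we get

$$\int_0^t\!\!\int_\Omega v \cdot \left(\frac{\partial u}{\partial t} + v \cdot \nabla u\right)dx\,ds = -\int_0^t\!\!\int_\Omega \nabla u : \nabla v\,dx\,ds + \int_0^t\!\!\int_\Omega (v - u) \cdot \nabla u \cdot v\,dx\,ds + \int_0^t\!\!\int_\Omega F^\top F : \nabla v\,dx\,ds. \tag{6.8}$$

Substituting (6.8) into (6.4), we obtain

$$\begin{aligned} \int_\Omega u \cdot v\,dx + 2\int_0^t\!\!\int_\Omega \nabla u : \nabla v\,dx\,ds &= \int_\Omega |u_0|^2\,dx + \int_0^t\!\!\int_\Omega E^\top E : \nabla u\,dx\,ds \\ &\quad + \int_0^t\!\!\int_\Omega (v - u) \cdot \nabla u \cdot v\,dx\,ds + \int_0^t\!\!\int_\Omega F^\top F : \nabla v\,dx\,ds. \end{aligned} \tag{6.9}$$

Also, according to Lemma 6.1, we have

$$\frac12 \int_\Omega \big(|u|^2 + |F|^2\big)\,dx + \int_0^t\!\!\int_\Omega \big(|\nabla F|^2 + |\nabla u|^2\big)\,dx\,ds = \frac12 \int_\Omega \big(|u_0|^2 + |F_0|^2\big)\,dx. \tag{6.10}$$

Summing (6.3), (6.10) and subtracting the sum of (6.6) and (6.9), we obtain for almost all $t \in (0, T)$,

$$\begin{aligned} &\frac12 \int_\Omega \big(|u(t) - v(t)|^2 + |F(t) - E(t)|^2\big)\,dx + \int_0^t\!\!\int_\Omega \big(|\nabla u - \nabla v|^2 + |\nabla F - \nabla E|^2\big)\,dx\,ds \\ &\quad \le -\int_0^t\!\!\int_\Omega (F - E)^\top (F - E) : \nabla u\,dx\,ds - \int_0^t\!\!\int_\Omega (v - u) \cdot \nabla u \cdot v\,dx\,ds \\ &\qquad - \int_0^t\!\!\int_\Omega (v - u) \cdot \nabla F : E\,dx\,ds + \int_0^t\!\!\int_\Omega F : E\nabla(v - u)\,dx\,ds - \int_0^t\!\!\int_\Omega F^\top F : \nabla(v - u)\,dx\,ds \\ &\quad = -\int_0^t\!\!\int_\Omega (F - E)^\top (F - E) : \nabla u\,dx\,ds - \int_0^t\!\!\int_\Omega (v - u) \cdot \nabla u \cdot (v - u)\,dx\,ds \\ &\qquad - \int_0^t\!\!\int_\Omega (v - u) \cdot \nabla F : (E - F)\,dx\,ds + \int_0^t\!\!\int_\Omega (E - F)^\top F : \nabla(v - u)\,dx\,ds := I, \end{aligned} \tag{6.11}$$

where we used twice the fact

$$\int_\Omega v \cdot \nabla u \cdot u\,dx = 0,$$

if $\operatorname{div} v = 0$. For $I$, we have, by Hölder's inequality,

$$|I| \le \int_0^t \big(\|\nabla u\|_{L^\infty(\Omega)} + \|\nabla F\|_{L^\infty(\Omega)}\big) \int_\Omega \big(|F - E|^2 + |u - v|^2\big)\,dx\,ds + \frac12 \int_0^t\!\!\int_\Omega |\nabla v - \nabla u|^2\,dx\,ds + C\int_0^t \|F\|^2_{L^\infty}\|E - F\|^2_{L^2}\,ds. \tag{6.12}$$

Substituting (6.12) back into (6.11), one has

$$\begin{aligned} &\frac12 \int_\Omega \big(|u(t) - v(t)|^2 + |F(t) - E(t)|^2\big)\,dx + \frac12 \int_0^t\!\!\int_\Omega \big(|\nabla u - \nabla v|^2 + |\nabla F - \nabla E|^2\big)\,dx\,ds \\ &\quad \le \int_0^t \big(\|\nabla u\|_{L^\infty(\Omega)} + \|\nabla F\|_{L^\infty(\Omega)} + C\|F\|^2_{L^\infty(\Omega)}\big) \int_\Omega \big(|F - E|^2 + |u - v|^2\big)\,dx\,ds. \end{aligned} \tag{6.13}$$

Notice that

$$\|\nabla u\|_{L^\infty(\Omega)} + \|\nabla F\|_{L^\infty(\Omega)} + \|F\|^2_{L^\infty(\Omega)} \in L^1(0, T).$$

Therefore, using (6.13) together with Grönwall's inequality, we finally conclude that $u = v$, $F = E$ a.e., and thus $P = \Pi$ up to a constant in $\Omega \times (0, T)$. The proof of Theorem 2.2 is complete.

Acknowledgements. Xianpeng Hu's research was supported in part by the National Science Foundation grant DMS-0604362 and by the Mellon Predoctoral Fellowship of the University of Pittsburgh. Dehua Wang's research was supported in part by the National Science Foundation under grants DMS-0604362 and DMS-0906160, and by the Office of Naval Research under grant N00014-07-1-0668.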

References

1. Amann, H.: Linear and Quasilinear Parabolic Problems. Vol. I. Abstract linear theory. Boston, MA: Birkhäuser Boston, Inc., 1995
2. Bergh, J., Löfström, J.: Interpolation Spaces. An Introduction. Grundlehren der Mathematischen Wissenschaften, Berlin-New York: Springer-Verlag, 1976
3. Chandrasekhar, S.: Liquid Crystals. 2nd ed., Cambridge: Cambridge University Press, 1992
4. Danchin, R.: Density-dependent incompressible fluids in bounded domains. J. Math. Fluid Mech. 8, 333–381 (2006)
5. de Gennes, P.G., Prost, J.: The Physics of Liquid Crystals. New York: Oxford University Press, 1993
6. Desjardins, B.: Regularity of weak solutions of the compressible isentropic Navier-Stokes equations. Comm. Part. Diff. Eqs. 22, 977–1008 (1997)
7. Ericksen, J.L.: Conservation laws for liquid crystals. Trans. Soc. Rheology 5, 23–34 (1961)
8. Ericksen, J.L.: Continuum theory of nematic liquid crystals. Res. Mechanica 21, 381–392 (1987)
9. Galdi, G.P.: An Introduction to the Mathematical Theory of the Navier-Stokes Equations. Vol. I. Linearized steady problems. New York: Springer-Verlag, 1994
10. Hardt, R., Kinderlehrer, D.: Mathematical Questions of Liquid Crystal Theory. The IMA Volumes in Mathematics and its Applications 5, New York: Springer-Verlag, 1987
11. Hardt, R., Kinderlehrer, D., Lin, F.: Existence and partial regularity of static liquid crystal configurations. Commun. Math. Phys. 105, 547–570 (1986)
12. Leslie, F.: Some constitutive equations for liquid crystals. Arch. Rat. Mech. Anal. 28, 265–283 (1968)
13. Leslie, F.: Theory of flow phenomena in liquid crystals. In: The Theory of Liquid Crystals, 4, London-New York: Academic Press, 1979, pp. 1–81
14. Lin, F.-H.: Nonlinear theory of defects in nematic liquid crystals; phase transition and flow phenomena. Commun. Pure. Appl. Math. 42, 789–814 (1989)
15. Lin, F.-H.: Mathematics theory of liquid crystals. In: Applied Mathematics at the Turn of the Century, Lecture Notes of the 1993 Summer School, Universidad Complutense de Madrid, Madrid: Editorial Complutense, 1995
16. Lin, F.-H., Liu, C.: Nonparabolic dissipative systems modeling the flow of liquid crystals. Comm. Pure Appl. Math. 48, 501–537 (1995)
17. Lin, F.-H., Liu, C.: Partial regularity of the dynamic system modeling the flow of liquid crystals. Disc. Cont. Dyn. Sys. 2, 1–22 (1996)
18. Lions, P.-L.: Mathematical Topics in Fluid Mechanics. Vol. 1. Incompressible models. Oxford Lecture Series in Mathematics and its Applications, 3. Oxford Science Publications. New York: The Clarendon Press/Oxford University Press, 1996
19. Liu, C., Walkington, N.J.: Approximation of liquid crystal flow. SIAM J. Numer. Anal. 37, 725–741 (2000)
20. Sun, H., Liu, C.: On energetic variational approaches in modeling the nematic liquid crystal flows. Disc. Contin. Dyn. Syst. 23, 455–475 (2009)

Communicated by P. Constantin

Commun. Math. Phys. 296, 881–898 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-0995-x

Communications in

Mathematical Physics

Ambient Metrics for n-Dimensional pp-Waves

Thomas Leistner1, Pawel Nurowski2,3

1 School of Mathematical Sciences, The University of Adelaide, Adelaide, SA 5005, Australia. E-mail: [email protected]
2 Instytut Fizyki Teoretycznej, Uniwersytet Warszawski, ul. Hoża 69, 00-681 Warszawa, Poland. E-mail: [email protected]
3 Instytut Matematyczny PAN, ul. Śniadeckich 8, 00-956 Warszawa, Poland

Received: 1 July 2009 / Accepted: 2 November 2009
Published online: 4 February 2010 – © Springer-Verlag 2010

Abstract: We provide an explicit formula for the Fefferman-Graham ambient metric of an n-dimensional conformal pp-wave in those cases where it exists. In even dimensions we calculate the obstruction explicitly. Furthermore, we describe all 4-dimensional pp-waves that are Bach-flat, and give a large class of Bach-flat examples which are conformally Cotton-flat, but not conformally Einstein. Finally, as an application, we use the obtained ambient metric to show that even-dimensional pp-waves have vanishing critical Q-curvature.

This work was supported in part by the Polish Ministerstwo Nauki i Informatyzacji grant nr: 1 P03B 07529 and by the Sonderforschungsbereich 676 of the German Research Foundation.

1. Introduction

Plane fronted gravitational waves, called pp-waves, are Lorentzian 4-manifolds $(M, g)$ admitting a covariantly constant null vector field $K$. In addition, their Ricci tensor Ric satisfies

\[ \mathrm{Ric} = \Phi\, \kappa \otimes \kappa, \tag{1} \]

where $\kappa$ is the 1-form on $M$ defined by $\kappa := K \lrcorner\, g$. Physicists also require that the function $\Phi$ be nonnegative for a pp-wave. This is because $\Phi$, via the Einstein field equations, is directly related to the energy-momentum tensor of its gravitational field. pp-waves are important in general relativity since they generalize the concept of a plane wave of classical electrodynamics [41], and because every 4-dimensional spacetime has a special pp-wave as a well defined limit [40], the so-called Penrose limit. Higher dimensional generalizations of the 4-dimensional pp-waves were studied in [42], appeared in Kaluza-Klein theory [28,25,29,9], and later in string theory [5,6,4,35,11,36,12,18,3,37]. Their property of possessing a covariantly constant null vector field $K$ implies that they have reduced Lorentzian holonomy from the full orthogonal group $SO(1, n-1)$ to the subgroup preserving the null vector $K$. In fact, they can be characterised by having Abelian holonomy $\mathbb{R}^{n-2}$ [30,32]. As such they admit many parallel spinors: the dimension of the space of parallel spinors on an n-dimensional pp-wave is at least half of the dimension of the spinor module [30]. In local coordinates $(x^i, u, r)_{i=1,\dots,n-2}$ in $\mathbb{R}^n$, the n-dimensional pp-wave metric can be written as

\[ g = \sum_{i=1}^{n-2} (dx^i)^2 + 2\,du\,(dr + h\,du). \]

Here $h$ is an arbitrary smooth real function of the first $(n-1)$ coordinates, $h = h(x^i, u)$. The covariantly constant null vector field is $K = \partial_r$. Another property of this metric is that it has vanishing scalar curvature. Hence, if it is Einstein then it is Ricci-flat. This happens if and only if $\Delta h = \sum_{i=1}^{n-2} \frac{\partial^2 h}{\partial (x^i)^2} = 0$.

Conformal classes of pp-wave metrics have remarkable properties. One of them was described by their discoverer H. W. Brinkmann as early as 1925. In his seminal paper [8] Brinkmann not only studied the spaces that were later called Brinkmann waves, namely Lorentzian manifolds with a parallel null vector field, but also showed the following [8, Theorems IV and VIII]: A 4-dimensional, not locally conformally flat Einstein manifold $(M, g)$ locally admits a function $\Upsilon$ such that the conformally rescaled metric $e^{2\Upsilon} g$ is again Einstein, but not homothetic to $g$, if and only if $(M, g)$ is a Ricci-flat pp-wave (or its counterpart in neutral signature¹). In this case, the rescaled metric is also Ricci-flat and the gradient of $\Upsilon$ is a null vector. This occurs because the Weyl tensor $W$ of a pp-wave is null and aligned with $K$, i.e. $K \lrcorner\, W = 0$, which makes these metrics not weakly generic in the terminology of [20].

In this paper we discuss another remarkable conformal property of n-dimensional pp-wave metrics, which is related to the ambient metric construction of Fefferman and Graham [15,16], a construction that provides the geometric framework of the AdS/CFT correspondence². The ambient metric construction mimics the situation in the flat model of conformal geometry: here the n-dimensional sphere equipped with the flat conformal structure can be viewed as the projectivisation of the light-cone in $(n+2)$-dimensional Minkowski space. Letting the spheres wander along the light cone recovers the metrics in the conformal class.
¹ Be aware that the coordinates in the relevant Sect. 4.2 of Brinkmann's paper [8] have to be understood as complex and complex conjugate in order to obtain Lorentzian metrics. If they are considered as real coordinates the resulting metric has neutral signature.

² Note that in some papers from the physics literature the term Fefferman-Graham metric has a different meaning than ours. What physicists call the Fefferman-Graham metric, e.g. in [2] or [13], is a related concept that Fefferman and Graham call the Poincaré-Einstein metric. How to obtain one from the other is well known and we shall explain it in Sect. 7.

For a conformal class $[g]$ of signature $(p, q)$ on an $n = (p+q)$-dimensional manifold $M$, the ambient metric is a metric $\tilde g$ of signature $(p+1, q+1)$ on the product $\tilde M := (-\varepsilon, \varepsilon) \times M \times (1-\delta, 1+\delta)$, $\varepsilon > 0$, $\delta > 0$, of $M$ with two intervals, that is compatible with the conformal structure (for details see Definition 1) and, moreover, is Ricci-flat. The Ricci-flat condition ensures that the ambient metric depends uniquely on the conformal structure and encodes all properties of the conformal class $[g]$, but has the downside that the ambient metric does not always exist. Starting with a formal power series

\[ \tilde g = 2\,(t\,d\rho + \rho\,dt)\,dt + t^2 \Big( g + \sum_{k=1}^{\infty} \rho^k \mu_k \Big) \tag{2} \]

with $\rho \in (-\varepsilon, \varepsilon)$, $t \in (1-\delta, 1+\delta)$, Fefferman and Graham showed that if n is odd, the Ricci-flatness of the ambient metric gives equations for $\mu_1, \mu_2, \dots$ that can be solved in principle, but the calculations have been carried out only for very special conformal classes, mainly those related to Einstein spaces [34,31,19]. If $n = 2s$ is even, there is a conformally invariant obstruction to the existence of a Ricci-flat ambient metric, called the Fefferman-Graham obstruction. This obstruction is the nonvanishing of the obstruction tensor $O$, given by the term $\mu_s$. In $n = 4$ this obstruction tensor is the Bach tensor for $g$. In higher dimensions the leading term of $O$ is $\Delta_g^s(g)$, but there are many lower order terms, which, again, are determined in principle, but whose calculation is very cumbersome.

One important feature of the ambient metric is that if the metric $g$ is real analytic then its corresponding ambient metric $\tilde g$ (if it exists) is also real analytic [15,16,27]. Another feature of the ambient metric is that if the conformal class of $g$ includes an Einstein metric $g_E$, then the power series in the ambient metric $\tilde g_E$ truncates at $k = 2$; in particular, for even $n > 3$ the obstruction tensor vanishes. In such a case the metric is given as a second order polynomial in each of the variables $t$ and $\rho$. However, if the metric $g$ is not conformally Einstein, then, except for a few examples [19,39], no explicit formulae for $\mu_k$, $k > 3$, are known.

In this context our main result is the following remarkable conformal property of n-dimensional pp-waves: for them all the coefficients $\mu_k$ in the ambient metric, the obstruction tensor in even dimensions, and hence the condition under which the ambient metric truncates at a given order, can be calculated explicitly. In Sect. 4 we prove

Theorem 1. Let $g = \sum_{i=1}^{n-2} (dx^i)^2 + 2\,du\,(dr + h\,du)$ be an n-dimensional pp-wave metric with a real analytic function $h = h(x^1, \dots, x^{n-2}, u)$.
Then the Fefferman-Graham ambient metric for the conformal class $[g]$ exists if and only if n is odd and $h$ is arbitrary, or if $n = 2s$ is even and $\Delta^s h = 0$. In both cases the ambient metric is given by a formal power series

\[ \tilde g = 2\,d(t\rho)\,dt + t^2 \Big( g + 2 \Big( \sum_{k=1}^{\infty} \frac{\Delta^k h}{k!\,p_k}\, \rho^k \Big) du^2 \Big), \]

with $p_k := \prod_{j=1}^{k} (2j - n)$ and $\Delta := \sum_{i=1}^{n-2} \partial_i^2$. In particular, if $n = 2s$ is even, the obstruction tensor $O$ is given by $O = \Delta^s h\, du^2$.

Thus if $n = 2s$ is even, the ambient metric $\tilde g$ is a polynomial of order $s - 1$ in the variable $\rho$. If n is odd, since the metric $g$ is real analytic, the Fefferman-Graham result guarantees that the above metric $\tilde g$ is also real analytic. This in particular means that the power series $\sum_{k=1}^{\infty} \frac{\Delta^k h}{k!\,p_k} \rho^k$ converges to a real analytic function in the variable $\rho$.

Theorem 1 provides us with a variety of examples of conformal structures with explicit ambient metrics which, in general, are not conformally Einstein. For example, every polynomial $h$ in the $x^i$'s of order lower than $k$, with coefficients being functions of $u$, represents a pp-wave with ambient metric truncated at order lower than $k/2$. In Sect. 6 we construct more general examples than those defined by $h$ being polynomials in the $x^i$'s. In particular, in dimension four we find all Bach-flat 4-dimensional pp-waves and


we prove that most of them are not conformally Einstein. They are defined by quite general functions $h$ and have ambient metrics which are linear in the variable $\rho$. It is interesting to note that these pp-waves, although Bach-flat and conformally Cotton-flat, are not conformally Einstein. Theorem 1 also implies another interesting feature of the pp-waves: their obstruction tensor $O$ (in even dimensions) involves only the terms of the highest possible order in the derivatives of their metric; since all the lower order terms that are usually present in the obstruction tensor vanish, the pp-waves are, in a sense, the closest cousins of the conformally Einstein metrics.

Using the explicit form of the ambient metric and the main result of [24], in Sect. 7 we show that for even-dimensional pp-waves the critical Q-curvature vanishes. This result corresponds to the fact that for a pp-wave all scalar invariants constructed from the curvature tensor vanish (for the proof in arbitrary dimension see [10]). In the final Sect. 8 we study the holonomy of the ambient metric of a pp-wave in relation to results in [31]. We show that it is contained in the stabiliser of a totally null plane.

2. The Fefferman-Graham Ambient Metric

An important tool for constructing invariants in conformal geometry is the so-called Fefferman-Graham ambient metric or ambient space (see [15] and [16]). Let $(M, [g])$ be a smooth n-dimensional manifold $M$ with conformal structure $[g]$ of signature $(p, q)$ with the conformal frame bundle $P^0$. It can also be characterised by a principal $\mathbb{R}^+$-fibre bundle $\pi : Q \to M$ defined as the ray sub-bundle in the bundle of metrics of signature $(p, q)$ given by metrics in the conformal class $[g]$. The action of $\mathbb{R}^+$ on $Q$ shall be denoted by $\varphi$: $\varphi(t, g_x) = t^2 g_x$. From [16] we adopt the following notation.

Definition 1. Let $(M, [g])$ be a conformal structure of signature $(p, q)$ over an n-dimensional manifold $M$, and $\pi : Q \to M$ the corresponding ray bundle.
A semi-Riemannian manifold $(\tilde M, \tilde g)$ of signature $(p+1, q+1)$ is called a pre-ambient space if

(1) there is a free $\mathbb{R}^+$-action $\tilde\varphi$ on $\tilde M$, and
(2) an embedding $\iota : Q \to \tilde M$ that is $\mathbb{R}^+$-equivariant.
(3) If $F$ is the fundamental vector field of $\tilde\varphi$, and $\mathcal{L}$ denotes the Lie derivative, then $\mathcal{L}_F \tilde g = 2 \tilde g$, i.e. the metric $\tilde g$ is homogeneous of degree 2 with respect to the $\mathbb{R}^+$-action.
(4) Any $g_x \in Q$ satisfies the equality $(\iota^* \tilde g)_{g_x} = g_x(d\pi(\cdot), d\pi(\cdot))$ in $\odot^2 T^*_{g_x} Q$.

A pre-ambient space is called an ambient space if its Ricci curvature vanishes.

Under the assumption that the conformal structure is given by a real analytic metric, in odd dimensions a Ricci-flat ambient metric always exists and is also real analytic. In even dimensions $n \geq 4$, the existence of a Ricci-flat ambient metric is obstructed by the nonvanishing of the obstruction tensor $O$ [16, p. 22]. This is a symmetric, trace-free and divergence-free (2, 0)-tensor, which is conformally invariant of weight $(2 - n)$, i.e. if $\hat g = e^{2\varphi} g \in [g]$, then $\hat O = e^{(2-n)\varphi} O$. It is given by

\[ O = \Delta_g^{n/2-2} \big( \Delta_g P - \nabla^2 J \big) + \text{lower order terms}, \]

where $P = \frac{1}{n-2} \big( \mathrm{Ric} - \frac{\mathrm{scal}}{2(n-1)}\, g \big)$ is the Schouten tensor, $J$ its trace, and $\Delta_g$ denotes the Laplacian of $g \in [g]$. For a conformal class in even dimension that is given by a real analytic metric with vanishing obstruction tensor, the ambient metric exists and is also real analytic.

Fixing a metric $g$ in the conformal class, in [15,16] it is shown that an ambient space near $M$ can be written as $\tilde M = (-\varepsilon, \varepsilon) \times M \times (1-\delta, 1+\delta)$ with the ambient metric

\[ \tilde g = 2t\,d\rho\,dt + 2\rho\,dt^2 + t^2 g(\rho), \]

in which $g(\rho)$ is a one-parameter family of metrics on $M$ with $g(0) = g$. This is referred to as $\tilde g$ being in normal form. As the ambient metric is analytic, one can write the family $g(\rho)$ as a power series in $\rho$,

\[ \tilde g = 2t\,d\rho\,dt + 2\rho\,dt^2 + t^2 \Big( g + \rho\, g' + \tfrac{1}{2} \rho^2 g'' + \tfrac{1}{6} \rho^3 g''' + \dots \Big), \]

with $g' = \partial_\rho g(\rho)|_{\rho=0}$, and similarly for the higher derivatives. We summarise the results for the ambient metric in

Theorem 2 ([15,16] and [27]). Let $(M, [g])$ be a real analytic manifold $M$ of dimension $n \geq 2$ equipped with a conformal structure defined by a real analytic semi-Riemannian metric $g$.

(1) If n is odd, or if n is even with $O = 0$, then there exists an ambient space $(\tilde M, \tilde g)$ with real analytic Ricci-flat metric $\tilde g$.
(2) If n is odd the ambient space is unique modulo diffeomorphisms that restrict to the identity along $Q \subset \tilde M$ and commute with $\tilde\varphi$. If n is even with $O = 0$, the ambient space is unique modulo the same set of diffeomorphisms and modulo terms of order $\geq n/2$ in $\rho$, where $\rho$ is the coordinate in the normal form of the ambient metric.

The Ricci-flat condition then determines symmetric (2, 0)-tensors $\mu_k$ such that

\[ \tilde g = 2t\,d\rho\,dt + 2\rho\,dt^2 + t^2 \Big( g + \sum_{k=1}^{\infty} \rho^k \mu_k \Big). \]

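In [16] the first $\mu_k$ are determined explicitly:

\[
\begin{aligned}
(\mu_1)_{ab} &= 2 P_{ab}, \\
(n-4)(\mu_2)_{ab} &= -B_{ab} + (n-4) P_a{}^{c} P_{bc}, \\
3(n-4)(n-6)(\mu_3)_{ab} &= \Delta_g B_{ab} - 2 W_{cabd} B^{cd} - 4(n-6) P_{c(a} B_{b)}{}^{c} - 4 P_c{}^{c} B_{ab} \\
&\quad + 4(n-4) P^{cd} \nabla_d C_{(ab)c} - 2(n-4) C^{c}{}_{a}{}^{d} C_{dbc} + (n-4) C_a{}^{cd} C_{bcd} \\
&\quad + 2(n-4) \nabla_d P_c{}^{c}\, C_{(ab)}{}^{d} - 2(n-4) W_{cabd} P^{c}{}_{e} P^{ed},
\end{aligned}
\tag{3}
\]

where $W_{abcd}$ is the Weyl tensor, $P_{ab}$ is the Schouten tensor, $C_{abc} := \nabla_c P_{ab} - \nabla_b P_{ac}$ is the Cotton tensor, and $B_{ab} = \nabla^c C_{abc} - P^{cd} W_{cabd}$ is the Bach tensor.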


3. pp-Waves and Their Curvature

A pp-wave is a Lorentzian manifold with a parallel null vector field $K$, i.e. $\nabla K = 0$, $K \neq 0$, and $g(K, K) = 0$, whose curvature tensor satisfies the trace condition

\[ R_{ab}{}^{ef} R_{efcd} = 0. \tag{4} \]

If we denote by $\kappa$ the one-form given by $\kappa := K \lrcorner\, g$, the curvature condition (4) is equivalent to each of the following, in which $[ab]$ denotes the skew symmetrisation with respect to $a$ and $b$ [42]:

(1) $\kappa_{[a} R_{bc]de} = 0$;
(2) there is a symmetric (2, 0)-tensor $\varrho$ with $K \lrcorner\, \varrho = 0$, such that $R_{abcd} = \kappa_{[a} \varrho_{b][c} \kappa_{d]}$;
(3) there is a function $\varphi$ such that $R_{a}{}^{e}{}_{b}{}^{f} R_{ecdf} = \varphi\, \kappa_a \kappa_b \kappa_c \kappa_d$.

The Ricci tensor of a pp-wave is given by $\mathrm{Ric} = \Phi\, \kappa \otimes \kappa$ for a smooth function $\Phi$. In dimension $n = 4$ this is even equivalent to the curvature condition (4). In [31] we gave another equivalent definition, without using coordinates or traces, identifying a pp-wave as a Lorentzian manifold with parallel null vector field $K$ whose curvature satisfies

\[ \mathrm{Im}\, R(U, V)|_{K^\perp} \subset \mathbb{R} \cdot K \quad \text{for all } U, V \in TM. \tag{5} \]

This equivalence allows for several generalisations [32] and for an easy proof of another equivalence that is related to holonomy: an n-dimensional Lorentzian manifold is a pp-wave if and only if its holonomy group is contained in the Abelian subgroup $\mathbb{R}^{n-2}$ of the stabiliser in $SO(1, n-1)$ of a null vector [30]. Locally, an n-dimensional pp-wave admits coordinates $(x^1, \dots, x^{n-2}, u, r)$ such that the metric is given by

\[ g = \sum_{i=1}^{n-2} (dx^i)^2 + 2\,du\,(dr + h\,du), \tag{6} \]

with $h$ being a smooth real function of the first $(n-1)$ coordinates, $h = h(x^i, u)$ [42]. In these coordinates the parallel null vector field $K$ is given by $\partial_r$ and, up to symmetries, the only non-vanishing curvature terms of a pp-wave are

\[ R(\partial_i, \partial_u, \partial_j, \partial_u) = \partial_i \partial_j h. \]

Here we use the obvious notation $\partial_r := \frac{\partial}{\partial r}$, $\partial_u := \frac{\partial}{\partial u}$ and $\partial_i := \frac{\partial}{\partial x^i}$, $i = 1, \dots, n-2$. Hence, the function $\Phi$ determining the Ricci tensor is given by $\Phi = -\Delta h$ with $\Delta h = \sum_{i=1}^{n-2} \partial_i^2 h$, i.e.

\[ \mathrm{Ric} = -\Delta h\, du^2. \tag{7} \]

Hence, the image of the Ricci tensor is totally null, and the scalar curvature vanishes. With this at hand, one can easily calculate the tensors related to the conformal geometry of a pp-wave. First, there is the Schouten tensor

\[ P = \frac{1}{n-2}\, \mathrm{Ric} = -\frac{\Delta h}{n-2}\, du^2. \tag{8} \]


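Secondly, the Weyl tensor is given by

\[ W(\partial_i, \partial_u, \partial_j, \partial_u) = \partial_i \partial_j h - \delta_{ij} \frac{\Delta h}{n-2}, \tag{9} \]

and for $n > 3$ we obtain that $\partial_i \partial_j h = \delta_{ij} \frac{\Delta h}{n-2}$ is an equivalent condition on $h$ for $g$ being conformally flat.

Next, we calculate the Cotton tensor $C$. As $\nabla P = -\frac{1}{n-2}\, d(\Delta h) \otimes du^2$, one obtains that

\[ C(\partial_u, \partial_i, \partial_u) = -C(\partial_u, \partial_u, \partial_i) = \frac{\partial_i \Delta h}{n-2} \tag{10} \]

are the only non-vanishing components of the Cotton tensor. Hence, $\partial_i \Delta h = 0$ is the condition on $h$ for 3-dimensional conformally flat pp-waves. Furthermore, we obtain the Bach tensor $B$,

\[ B = -\frac{\Delta^2 h}{n-2}\, du^2. \tag{11} \]

This enables us to calculate the next terms in the ambient metric expansion in Eqs. (3) beyond $\mu_1 = 2P = -\frac{2 \Delta h}{n-2}\, du^2$, namely

\[
\begin{aligned}
\mu_2 &= -\frac{1}{n-4}\, B = \frac{\Delta^2 h}{(n-2)(n-4)}\, du^2, \\
\mu_3 &= \frac{1}{3(n-4)(n-6)}\, \Delta B = -\frac{\Delta^3 h}{3(n-2)(n-4)(n-6)}\, du^2.
\end{aligned}
\]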

The very simple structure of $\mu_1$, $\mu_2$, and $\mu_3$ above, and in particular the appearance of the consecutive powers of the Laplacian, suggests that this pattern may also be present in the next terms of the ambient metric expansion. That this is really the case will be proven in the next section.

4. The pp-Wave Ambient Metric

Looking at the very simple form of the pp-wave metric (6) and the general formula for the ambient metric (2), our ansatz for the ambient metric for this $g$ is

\[ \bar g = 2\,d(\rho t)\,dt + t^2 \Big( 2\,du\,\big(dr + (h + H)\,du\big) + \sum_{i=1}^{n-2} (dx^i)^2 \Big), \tag{12} \]

where $H = H(\rho, x^i, u)$, and

\[ H(\rho, x^i, u)|_{\rho=0} = 0. \tag{13} \]

If we were able to find an analytic function H satisfying (13) and for which the metric (12) was Ricci flat then, by the uniqueness of the Fefferman-Graham Theorem 2, we would conclude that g¯ with this H is the ambient metric for (6). Thus to check our guess it is enough to calculate the Ricci tensor for (12) and to check if its vanishing is possible for the function H in the postulated form (13).


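Lemma 1. The Ricci tensor of the metric (12) is

\[ \mathrm{Ric}(\bar g) = \big( (2-n) H_\rho + 2\rho H_{\rho\rho} - \Delta H - \Delta h \big)\, du^2. \]

Here $\Delta H = \sum_{i=1}^{n-2} \frac{\partial^2 H}{\partial (x^i)^2}$, $H_\rho = \frac{\partial H}{\partial \rho}$, etc.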

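Proof. We start with a coframe

\[ \theta^0 = d(\rho t), \quad \theta^i = t\,dx^i, \quad \theta^{n-1} = t^2 \big( dr + (h + H)\,du \big), \quad \theta^n = du, \quad \theta^{n+1} = dt, \tag{14} \]

in which the metric $\bar g$ reads

\[ \bar g = \bar g_{\mu\nu} \theta^\mu \theta^\nu = 2 \theta^0 \theta^{n+1} + 2 \theta^{n-1} \theta^n + \sum_{i=1}^{n-2} (\theta^i)^2, \qquad \mu, \nu = 0, 1, \dots, n+1. \]

It has the following differentials:

\[
\begin{aligned}
d\theta^0 &= 0, \\
d\theta^i &= -t^{-1}\, \theta^i \wedge \theta^{n+1}, \qquad \forall i = 1, \dots, n-2, \\
d\theta^{n-1} &= t H_\rho\, \theta^0 \wedge \theta^n + t \sum_{i=1}^{n-2} (h_i + H_i)\, \theta^i \wedge \theta^n - 2 t^{-1}\, \theta^{n-1} \wedge \theta^{n+1} + \rho t H_\rho\, \theta^n \wedge \theta^{n+1}, \\
d\theta^n &= 0, \\
d\theta^{n+1} &= 0.
\end{aligned}
\]

In this coframe the Levi-Civita connection 1-forms, i.e. matrix-valued 1-forms satisfying $d\theta^\mu + \Gamma^\mu{}_\nu \wedge \theta^\nu = 0$, $\Gamma_{\mu\nu} + \Gamma_{\nu\mu} = 0$, $\Gamma_{\mu\nu} = \bar g_{\mu\sigma} \Gamma^\sigma{}_\nu$, are:

\[
\begin{aligned}
\Gamma_{0\,n} &= -t H_\rho\, \theta^n, \\
\Gamma_{i\,n} &= -t (h_i + H_i)\, \theta^n, \\
\Gamma_{n-1\,n} &= t^{-1}\, \theta^{n+1}, \\
\Gamma_{i\,n+1} &= t^{-1}\, \theta^i, \\
\Gamma_{n-1\,n+1} &= t^{-1}\, \theta^n, \\
\Gamma_{n\,n+1} &= t^{-1}\, \theta^{n-1} - \rho t H_\rho\, \theta^n.
\end{aligned}
\tag{15}
\]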

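Modulo the symmetry $\Gamma_{\mu\nu} = -\Gamma_{\nu\mu}$ all other connection 1-forms are zero. The curvature 2-forms $\Omega_{\mu\nu} = d\Gamma_{\mu\nu} + \Gamma_{\mu\rho} \wedge \Gamma^\rho{}_\nu$ have the following nonvanishing components:

\[
\begin{aligned}
\Omega_{0\,n} &= -H_{\rho\rho}\, \theta^0 \wedge \theta^n - \sum_{i=1}^{n-2} H_{i\rho}\, \theta^i \wedge \theta^n - \rho H_{\rho\rho}\, \theta^n \wedge \theta^{n+1}, \\
\Omega_{i\,n} &= -H_{i\rho}\, \theta^0 \wedge \theta^n - \sum_{k=1}^{n-2} (\delta_{ik} H_\rho + H_{ik} + h_{ik})\, \theta^k \wedge \theta^n - \rho H_{i\rho}\, \theta^n \wedge \theta^{n+1}, \\
\Omega_{n\,n+1} &= -\rho H_{\rho\rho}\, \theta^0 \wedge \theta^n - \sum_{i=1}^{n-2} \rho H_{i\rho}\, \theta^i \wedge \theta^n - \rho^2 H_{\rho\rho}\, \theta^n \wedge \theta^{n+1},
\end{aligned}
\tag{16}
\]

together with the components that are implied by the symmetry $\Omega_{\mu\nu} = -\Omega_{\nu\mu}$. The Riemann tensor $R_{\mu\nu\rho\sigma}$, defined by $\Omega_{\mu\nu} = \frac{1}{2} R_{\mu\nu\rho\sigma}\, \theta^\rho \wedge \theta^\sigma$, can be read off from Eqs. (16). Using this and the inverse of the metric $\bar g^{\mu\nu}$, $\bar g_{\mu\rho} \bar g^{\rho\nu} = \delta_\mu^\nu$, we calculate the Ricci tensor $R_{\mu\nu} = \bar g^{\rho\sigma} R_{\rho\mu\sigma\nu}$. It turns out that it has

\[ R_{nn} = -2 R_{0\,n\,n\,n+1} + \sum_{i=1}^{n-2} R_{i\,n\,i\,n} \]

as its only nonvanishing component. Explicitly:

\[ R_{nn} = 2\rho H_{\rho\rho} - (n-2) H_\rho - \Delta H - \Delta h. \]

This finishes the proof of the lemma.

The lemma shows that the metric $\bar g$ is Ricci-flat if and only if the function $H$ satisfies the following PDE:

\[ (2-n) H_\rho + 2\rho H_{\rho\rho} - \Delta H = \Delta h. \tag{17} \]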

For $\bar g$ to be the ambient metric for (6) we in addition require the initial condition (13). By looking for the solution of the initial value problem (17), (13) in the form of a power series

\[ H = \sum_{k=0}^{\infty} a_k \rho^k, \tag{18} \]

we immediately get $a_0 = 0$ from the initial condition (13). Inserting (18) in (17) and comparing the coefficients of $\rho^{k-1}$ gives $(2-n)\, a_1 = \Delta h$ and $k(2k-n)\, a_k = \Delta a_{k-1}$ for $k \geq 2$, from which we easily arrive at

Proposition 1. If $n = 2s + 1$, $s \geq 1$, then the initial value problem (17), (13) has a unique power series solution. It is given by

\[ H = \sum_{k=1}^{\infty} \frac{\Delta^k h}{k! \prod_{i=1}^{k} (2i - n)}\, \rho^k. \tag{19} \]

If $n = 2s$ the power series solution exists only if $\Delta^s h = 0$. If this is the case, the solution is also unique and given by the power series (19), which truncates to a polynomial of order $(s-1)$ in the variable $\rho$.

This proposition proves our Theorem 1 of the Introduction. Note that the solution we found is a solution to Eq. 3.17 in [16] that was derived for the Taylor expansion of the ambient metric, here specified for a pp-wave. In particular, for $n = 2s$ the obstruction tensor of an n-dimensional pp-wave is given by $O = \Delta^s h\, du^2$.

With this result at hand, every polynomial $h$ in the $x^i$'s of order lower than $2k$, with coefficients being functions of $u$, gives an example of a pp-wave for which the ambient metric truncates to a polynomial of order lower than $k$. This gives plenty of examples of explicit ambient metrics, also in even dimensions. Moreover, choosing $h$ properly, one gets examples for which the conformal class does not contain an Einstein metric. This will be the aim of Sect. 6. But first we address the issue of convergence of $H$ in odd dimensions.
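As an illustration of formula (19), the following minimal sketch computes the coefficients $a_k$ for a polynomial $h$ in exact rational arithmetic. It is not part of the original argument: the dictionary representation of polynomials and the helper names `laplacian`, `scale` and `series_coefficients` are assumptions made only for this example, and the dependence of $h$ on $u$ is suppressed.

```python
from fractions import Fraction
from math import factorial

# Assumed representation: a polynomial in x^1..x^(n-2) is a dict mapping an
# exponent tuple to a coefficient, e.g. {(4,): 1} stands for (x^1)^4.

def laplacian(poly):
    """Flat Laplacian sum_i d^2/d(x^i)^2 of a polynomial."""
    out = {}
    for exps, coeff in poly.items():
        for i, e in enumerate(exps):
            if e >= 2:
                new = exps[:i] + (e - 2,) + exps[i + 1:]
                out[new] = out.get(new, 0) + coeff * e * (e - 1)
    return {m: c for m, c in out.items() if c != 0}

def scale(poly, factor):
    """Multiply every coefficient of a polynomial by a constant."""
    return {m: c * factor for m, c in poly.items()}

def series_coefficients(h, n):
    """Coefficients [a_1, a_2, ...] of H = sum_k a_k rho^k as in (19).

    Raises ValueError if n = 2s is even and Laplacian^s h != 0, which is
    the case in which the power series solution does not exist.
    """
    coeffs = []
    power = laplacian(h)      # holds Laplacian^k h, starting with k = 1
    k, p_k = 1, 1
    while power:              # stops once Laplacian^k h vanishes
        factor = 2 * k - n
        if factor == 0:
            raise ValueError("obstructed: Laplacian^s h != 0 for n = 2s")
        p_k *= factor         # p_k = prod_{j=1}^{k} (2j - n)
        coeffs.append(scale(power, Fraction(1, factorial(k) * p_k)))
        power = laplacian(power)
        k += 1
    return coeffs

# n = 3, h = x^4: H = -12 x^2 rho - 12 rho^2
print(series_coefficients({(4,): 1}, 3))
# n = 4, h = (x^1)^3 is biharmonic: H = -3 x^1 rho, i.e. linear in rho
print(series_coefficients({(3, 0): 1}, 4))
```

For $n = 4$ and $h = (x^1)^4$ the same function raises `ValueError`, since $\Delta^2 h = 24 \neq 0$ is exactly the obstruction $\Delta^s h$ with $s = 2$.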


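5. Convergence in Three Dimensions

In odd dimensions the solution to the Ricci-flat equation, $H$ in (19), may be given by an infinite series. Since $H$ contains only natural powers of $\rho$, general arguments as in [16] ensure that $H$ converges for an analytic function $h$ and is analytic as well [21]. Here we give a simple argument that proves convergence for $n = 3$:

Proposition 2. Let $h$ be a function on $\mathbb{C} \times \mathbb{R}$ of variables $(z, u)$ which is an entire holomorphic function in $z = x + iy \in \mathbb{C}$, is continuous in $u \in \mathbb{R}$, and is real for $z = x \in \mathbb{R}$. Then the series

\[ H(x, u, \rho) = \sum_{k=1}^{\infty} \frac{(\Delta^k h)(x, u)}{k! \prod_{i=1}^{k} (2i - 3)}\, \rho^k \tag{20} \]

converges uniformly on compact subsets of $\mathbb{R}^3$.

Proof. Let $R > 1$ be a real number and let $C = \sup\{|h(z, u)|\}$ over all values of $(z, u)$ such that $|z - x| \leq R + 2\varepsilon$, $|u| \leq \nu$, and $|x| \leq \varepsilon$, where $\varepsilon > 0$ and $\nu > 0$. Then by Cauchy's estimate for the derivatives of a holomorphic function, the k-th derivative of $h$ at every real point $(x, u) \in [-\varepsilon, \varepsilon] \times [-\nu, \nu]$ satisfies $|h^{(k)}(x, u)| \leq \frac{C\, k!}{R^k}$. This provides the following estimate for the values of the powers of the Laplacian $\Delta^k h = \frac{d^{2k} h}{dz^{2k}}$: for all $(x, u) \in [-\varepsilon, \varepsilon] \times [-\nu, \nu]$ we have

\[ |(\Delta^k h)(x, u)| \leq \frac{C\, (2k)!}{R^{2k}}. \tag{21} \]

Now we rewrite (20) in the equivalent form

\[ H = -\rho\, \Delta h - \sum_{k=1}^{\infty} \frac{\Delta^{k+1} h}{(k+1)! \cdot 1 \cdot 3 \cdots (2k-1)}\, \rho^{k+1}. \]

To show that $H$ converges it is enough to show the convergence of the power series above. This can be done by using the estimate (21):

\[
\begin{aligned}
\sum_{k=1}^{\infty} \left| \frac{\Delta^{k+1} h}{(k+1)! \cdot 1 \cdot 3 \cdots (2k-1)}\, \rho^{k+1} \right|
&\leq C \sum_{k=1}^{\infty} \frac{(2k+2)!}{(k+1)! \cdot 1 \cdot 3 \cdots (2k-1)} \left( \frac{|\rho|}{R^2} \right)^{k+1} \\
&= C \sum_{k=1}^{\infty} \frac{(2 \cdot 4 \cdots 2k) \cdot (2k+1)(2k+2)}{(k+1)!} \left( \frac{|\rho|}{R^2} \right)^{k+1}
= C \sum_{k=1}^{\infty} b_k \left( \frac{|\rho|}{R^2} \right)^{k+1}.
\end{aligned}
\]

Since

\[ \frac{|b_{k+1}|}{|b_k|} = \frac{2(k+1)(2k+3)(2k+4)}{(k+2)(2k+1)(2k+2)} \longrightarrow 2 \quad \text{as } k \to \infty, \]

this series converges for $|\rho| < \frac{R^2}{2}$. As $h$ is entire, $R$ can be chosen arbitrarily large, which gives the uniform convergence on compact subsets. This finishes the proof.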
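The ratio computation for the majorant coefficients $b_k$ can be checked in exact arithmetic. The sketch below is only a sanity check under the definition of $b_k$ read off from the estimate above; the closed form $b_k = 2^{k+1}(2k+1)$ it tests is our own simplification and is not stated in the text, and the helper names `odd_product` and `b` are hypothetical.

```python
from fractions import Fraction
from math import factorial

def odd_product(k):
    """Return 1 * 3 * ... * (2k - 1)."""
    result = 1
    for j in range(1, k + 1):
        result *= 2 * j - 1
    return result

def b(k):
    """Majorant coefficient b_k = (2k+2)! / ((k+1)! * 1*3*...*(2k-1))."""
    return Fraction(factorial(2 * k + 2), factorial(k + 1) * odd_product(k))

for k in range(1, 30):
    # simplified closed form (an observation, not taken from the text)
    assert b(k) == 2 ** (k + 1) * (2 * k + 1)
    # the quotient used in the ratio test; it tends to 2 as k grows
    assert b(k + 1) / b(k) == Fraction(2 * (2 * k + 3), 2 * k + 1)

print(float(b(1001) / b(1000)))  # close to 2
```

Because the quotient tends to 2, the majorant series has radius of convergence $1/2$ in the variable $|\rho|/R^2$.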


6. Bach Flat Metrics that are not Conformally Einstein

With Eq. (11) it is obvious how to obtain Bach-flat pp-waves. It is more difficult to find those that are not conformally Einstein. In this section we want to give examples of 4-dimensional pp-waves that are both Bach-flat and not conformal to Einstein. But first we have to review some necessary conditions for being conformal to Einstein given in [20] for any dimension. In this section, when we write 'conformal to' we mean 'locally conformal to'.

From the formulae for the transformation of the Schouten tensor under conformal changes of the metric one obtains that a metric is conformal to an Einstein metric if and only if there exists a scaling function $\Upsilon$ such that

\[ P - \nabla d\Upsilon + (d\Upsilon)^2 \ \text{is pure trace.} \tag{22} \]

In the following we write $Y$ for the gradient of $\Upsilon$. In [20, Prop. 2.1] the following necessary conditions for the metric to be conformal to Einstein were derived from Eq. (22):

\[ C + W(Y, \cdot, \cdot, \cdot) = 0, \tag{23} \]
\[ B + (n-4)\, W(Y, \cdot, \cdot, Y) = 0. \tag{24} \]

Note that the first condition is satisfied for a gradient $Y$ if and only if the metric is conformally equivalent to a metric with vanishing Cotton tensor, i.e. if it is conformally Cotton-flat. We further mention that the property of being conformally Cotton-flat is also necessary for the metric to be conformally Einstein [20]. For a pp-wave, conditions (23) and (24) are equivalent to the following:

Proposition 1. If the pp-wave (6) is conformally Einstein but not conformally flat and $n > 3$, then there is a vector field $Y$ on $M$, whose components $Y^i := dx^i(Y)$, $i = 1, \dots, n-2$, and $Y^{n-1} := du(Y)$ satisfy the equations

\[ \partial_i \Delta h - Y^i \Delta h + (n-2) \sum_{k=1}^{n-2} Y^k \partial_k \partial_i h = 0, \tag{25} \]

\[ \Delta^2 h - (n-4)\, \Delta h \sum_{k=1}^{n-2} (Y^k)^2 + (n-2)(n-4) \sum_{k,l=1}^{n-2} Y^k Y^l \partial_k \partial_l h = 0, \tag{26} \]

for $i = 1, \dots, n-2$, and $Y^{n-1} = 0$.

Proof. Writing $Y = Y^k \partial_k + Y^{n-1} \partial_u + dr(Y)\, \partial_r$, Eq. (23) and the formulae in Sect. 3 give

\[
\begin{aligned}
0 &= Y^{n-1}\, W(\partial_u, \partial_i, \partial_u, \partial_j), \\
0 &= \frac{\partial_i \Delta h}{n-2} + Y^k \Big( \partial_k \partial_i h - \delta_{ki} \frac{\Delta h}{n-2} \Big).
\end{aligned}
\]

These, when $n > 3$, imply both $Y^{n-1} = 0$ and Eq. (25). Equation (24) gives that

\[ 0 = -\frac{\Delta^2 h}{n-2} - (n-4)\, Y^k Y^l \Big( \partial_k \partial_l h - \delta_{kl} \frac{\Delta h}{n-2} \Big), \tag{27} \]

which implies Eq. (26).


Writing $Y$ as the gradient of $\Upsilon$,

\[ Y = \sum_{k=1}^{n-2} \partial_k \Upsilon\, \partial_k + \partial_r \Upsilon\, \partial_u + \big( \partial_u \Upsilon - 2h\, \partial_r \Upsilon \big)\, \partial_r, \]

the proposition implies that $du(Y) = \partial_r \Upsilon = 0$. Hence, $\partial_r (dr(Y)) = \partial_r (\partial_u \Upsilon - 2h\, \partial_r \Upsilon) = 0$, and we obtain

Corollary 1. Let $g$ be a pp-wave that is conformally Einstein but not conformally flat in dimension $n > 3$, and let $Y$ be the gradient of the scaling function $\Upsilon$ satisfying Eq. (22). Then the function $Y^n = dr(Y)$ does not depend on the $r$-variable.

Example 1. For $n = 3$ a third order polynomial $h$ in $x$ with coefficients being functions of $u$ defines a pp-wave with non-vanishing Cotton tensor. Hence, it is not conformally flat and therefore not conformally Einstein.

Example 2. Set $M = \mathbb{R}^n$ and $h = (x^1)^4 + \dots + (x^{n-2})^4$. Then $\partial_i \partial_j h \neq \delta_{ij} \frac{\Delta h}{n-2}$ on open sets in $M$ and hence $g$ is not conformally flat. On the other hand, Eq. (26) can never be satisfied at $0 \in M$, because there all second order derivatives of $h$ vanish, but $\Delta^2 h = 24(n-2)$. Thus, the pp-wave defined by $h = (x^1)^4 + \dots + (x^{n-2})^4$ is not conformally Einstein.
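The two facts used in Example 2 can be verified mechanically for small dimensions. The following sketch is only an illustration: the polynomial representation (a dictionary from exponent tuples to coefficients) and all helper names are assumptions of this example, not notation from the paper.

```python
def partial(poly, i):
    """Partial derivative of a polynomial with respect to the i-th variable."""
    out = {}
    for exps, c in poly.items():
        if exps[i] >= 1:
            new = exps[:i] + (exps[i] - 1,) + exps[i + 1:]
            out[new] = out.get(new, 0) + c * exps[i]
    return out

def add(p, q):
    """Sum of two polynomials, dropping zero coefficients."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
    return {mono: c for mono, c in out.items() if c != 0}

def laplacian(poly, m):
    """Flat Laplacian in m variables."""
    total = {}
    for i in range(m):
        total = add(total, partial(partial(poly, i), i))
    return total

def value_at_origin(poly, m):
    """Value of the polynomial at x = 0, i.e. its constant term."""
    return poly.get((0,) * m, 0)

def quartic_sum(m):
    """h = (x^1)^4 + ... + (x^m)^4 with m = n - 2."""
    return {tuple(4 if j == i else 0 for j in range(m)): 1 for i in range(m)}

for n in range(4, 9):
    m = n - 2
    h = quartic_sum(m)
    # all second order derivatives of h vanish at the origin
    assert all(value_at_origin(partial(partial(h, k), l), m) == 0
               for k in range(m) for l in range(m))
    # while the bi-Laplacian is the nonzero constant 24 (n - 2)
    assert laplacian(laplacian(h, m), m) == {(0,) * m: 24 * (n - 2)}
```

At the origin the left-hand side of (26) therefore reduces to $\Delta^2 h = 24(n-2)$ for every choice of $Y$, which is the contradiction used in the example.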

Now we turn to dimension $n = 2s = 4$. Here the formula (19) makes sense only if $\Delta^2 h = 0$. In such a case the formula truncates to $H = -\frac{1}{2} \rho\, \Delta h$. Thus it is clear that for the 4-dimensional pp-waves the Fefferman-Graham obstruction is precisely $\Delta^2 h$, which is a multiple of the Bach tensor, and does not involve any lower order terms in the derivatives of the metric functions. In order to write down all such metrics, it is convenient to pass to the complex notation by introducing coordinates $z = \frac{x^1 + i x^2}{\sqrt{2}}$, $\bar z = \frac{x^1 - i x^2}{\sqrt{2}}$. In this notation the most general 4-dimensional pp-wave metric satisfying $\Delta^2 h = 0$ is given by

\[ g_4 = 2\,du \big( dr + ( \bar z \alpha + z \bar\alpha + \beta + \bar\beta )\, du \big) + 2\,dz\,d\bar z. \]

Here $\alpha = \alpha(z, u)$, $\beta = \beta(z, u)$ are holomorphic functions of $z$. This metric is Bach-flat, and in some cases, such as when $\alpha_z + \bar\alpha_{\bar z} = \mathrm{const}$, is conformal to an Einstein metric. Its ambient metric is given by

\[ \tilde g_4 = 2\,d(\rho t)\,dt + t^2 \Big( 2\,du \big[ dr + \big( \bar z \alpha + z \bar\alpha + \beta + \bar\beta - \rho\,(\alpha_z + \bar\alpha_{\bar z}) \big)\, du \big] + 2\,dz\,d\bar z \Big), \]

and by construction is Ricci-flat. We get

Proposition 2. A 4-dimensional pp-wave $g_4$ is Bach-flat if and only if

\[ g_4 = 2\,du \big( dr + ( \bar z \alpha + z \bar\alpha + \beta + \bar\beta )\, du \big) + 2\,dz\,d\bar z, \]

with $\alpha = \alpha(z, u)$, $\beta = \beta(z, u)$ functions of a complex variable $z$ and a real variable $u$ which are holomorphic in $z$.

In general, this Bach-flat metric is not conformally Einstein:


Theorem 3. A 4-dimensional Bach-flat pp-wave

\[ g_4 = 2\,du \big( dr + ( \bar z \alpha + z \bar\alpha )\, du \big) + 2\,dz\,d\bar z \tag{28} \]

with $\beta \equiv 0$ is conformally equivalent to a metric with vanishing Cotton tensor. Moreover, the following three properties are equivalent:

(1) $\partial_z^2 \alpha \equiv 0$,
(2) $g_4$ is conformally flat,
(3) $g_4$ is conformally Einstein.

In particular, any such metric with $\partial_z^2 \alpha \not\equiv 0$ is not conformally Einstein.

Proof. First, in the complex coordinates $(z, \bar z)$ we have $\Delta h = 2\, ( \partial_z \alpha + \partial_{\bar z} \bar\alpha )$. Next, using

\[ \partial_1 = \frac{1}{\sqrt{2}} ( \partial_z + \partial_{\bar z} ), \qquad \partial_2 = \frac{i}{\sqrt{2}} ( \partial_z - \partial_{\bar z} ), \]

in the formula (9) we see that the Weyl tensor vanishes if and only if $\partial_z^2 \alpha = 0$. This proves the equivalence of (1) and (2).

For the remaining statements we try to find a vector field $Y$ that solves the necessary condition (23) for $g$ to be conformally Einstein. We use this equation in the form (25), as in Proposition 1. Recall that in this proposition we proved that such a vector does not have a $\partial_u$-component. Thus we look for $Y$ of the form $Y = F \partial_z + \bar F \partial_{\bar z} + f \partial_r$, where $F = F(z, \bar z, r, u)$ is a complex and $f = f(z, \bar z, r, u)$ is a real function. Equation (25) gives

\[ 0 = \partial_z^2 \alpha\, (1 + \bar z F) + \partial_{\bar z}^2 \bar\alpha\, (1 + z \bar F), \tag{29} \]
\[ 0 = \partial_z^2 \alpha\, (1 + \bar z F) - \partial_{\bar z}^2 \bar\alpha\, (1 + z \bar F), \tag{30} \]

which immediately implies $\partial_z^2 \alpha\, (1 + \bar z F) = 0$. Assuming that $g_4$ is not conformally flat, i.e. $\partial_z^2 \alpha \not\equiv 0$, we get $F = -1/\bar z$. Thus we found that the vector $Y$ solves (23) if and only if $Y = -\frac{1}{\bar z} \partial_z - \frac{1}{z} \partial_{\bar z} + f \partial_r$. Now, $g_4$ is conformally Cotton-flat if we find $f$ such that this $Y$ is a gradient. Setting

\[ Y^\flat = g_4(Y, \cdot) = -\frac{1}{z}\, dz - \frac{1}{\bar z}\, d\bar z + f\, du, \]

we see that $Y$ is locally a gradient, i.e. $dY^\flat = 0$, if and only if $f$ is a function of the variable $u$ alone. Every $f = f(u)$ gives a solution to the conformally Cotton equation.

To prove that (3) implies (2), assume that $g_4$ is not conformally flat but conformally Einstein. Then we plug in the 1-form $Y^\flat$ we have obtained as a solution of Eq. (25), and its corresponding

\[ \nabla Y^\flat = df \otimes du - \Big( \frac{\alpha + z\, \partial_{\bar z} \bar\alpha}{z} + \frac{\bar\alpha + \bar z\, \partial_z \alpha}{\bar z} \Big)\, du^2 + \frac{1}{z^2}\, dz^2 + \frac{1}{\bar z^2}\, d\bar z^2 \]

into $P - \nabla Y^\flat + (Y^\flat)^2$. According to Eq. (22) this must be pure trace if the metric $g_4$ is conformally Einstein. But this cannot happen, since $P - \nabla Y^\flat + (Y^\flat)^2$ has a nowhere vanishing $dz\,d\bar z$-term given by $\frac{2}{z \bar z}\, dz\,d\bar z$, and an identically vanishing $dr\,du$-term. Thus $P - \nabla Y^\flat + (Y^\flat)^2$ is never proportional to $g_4$, which in turn cannot be conformally Einstein.

In the light of discussions in [20], the metrics (28) provide interesting examples because, apart from being Bach-flat, they are conformally Cotton-flat, but not conformally Einstein even though the necessary conditions (23) and (24) are both satisfied for a gradient. This phenomenon is special to Lorentzian and probably to other indefinite signature metrics.

We strongly believe that a similar argument works in any dimension, even though one might not be able to describe the functions with $\Delta^s h = 0$. But under certain assumptions it might be possible to deduce a contradiction between Eqs. (25) and (26) and the fact that the function $dr(Y)$ is independent of the $r$-coordinate, as it occurs for $n = 4$.

We want to conclude this section by returning to the result of Brinkmann in [8] mentioned in the Introduction. If a 4-dimensional pp-wave is Einstein, and hence Ricci-flat, the function $h$ is given by $\alpha + \bar\alpha$ for a holomorphic function $\alpha$. Again, this metric is conformally flat if and only if $\partial_z^2 \alpha = 0$. If it is not conformally flat but conformally Einstein, then the vector field $Y$ is null and a multiple of $\partial_r$, namely $Y = f \partial_r$ with a function $f = f(u)$ that depends on the variable $u$ only. As $P = 0$, Eq. (22) then is equivalent to $f' = f^2$. Hence, any such function yields a conformal rescaling of a Ricci-flat pp-wave to another Einstein metric that is in fact Ricci-flat. The new metric may be isometric to the original one, but in general this is not the case (see also [14]).
Finally, note that a non-trivial solution of $f' = f^2$ is not defined on all of $\mathbb{R}$, and thus, in general, $f$ does not yield a global rescaling to another Einstein metric.

7. The Critical Q-Curvature of a pp-Wave

For a semi-Riemannian manifold $(M, g)$ of even dimension $n = 2s$, in [7] T. Branson introduced a series $\{Q_{2k}\}_{k=1,\dots,s}$ of scalar invariants constructed from the curvature tensor involving $2k$ derivatives of the metric³. As such, for a pp-wave all $Q_{2k}$ are zero. This follows from the general fact that all scalar invariants constructed from the Riemannian curvature tensor of a pp-wave vanish (for a proof in arbitrary dimension see [10]). However, as an application of Theorem 1, in this section we will use the pp-wave ambient metric in order to show that the critical Q-curvature $Q_n$ of a pp-wave vanishes. The so-called subcritical Q-curvatures $Q_2, \dots, Q_{n-2}$ are defined by the inhomogeneous part of the GJMS-operators $P_{2k}$, namely

\[ P_{2k}^{g}(1) = (s - k)\, Q_{2k}. \]

The GJMS-operators $P_{2k}$ introduced in [23] are conformally covariant operators. We will not give a definition of the critical Q-curvature $Q_n$ here (please refer to [17], for example). Instead we will explain a formula for the critical Q-curvature given in [24] that expresses it in terms of the volume of the Poincaré metric.

³ Regarding this section, we would like to thank Andreas Juhl for explaining to us some facts about Q-curvature.


Let $(M, [g])$ be a smooth manifold of even dimension $n = 2s$ with conformal class $[g]$. To this manifold one can assign a Poincaré metric $g_+$. This is a metric on $M_+ = M \times (0, a)$ given by

\[ g_+ = \frac{1}{x^2}\left(dx^2 + g_x\right), \]

where $g_x$ is a 1-parameter family of metrics with the same signature as $g$ and with initial condition $g_0 = g$ such that $g_+$ is asymptotically Einstein, which means that $\mathrm{Ric}(g_+) + n g_+$ vanishes up to terms of order $(n - 2)$ in $x$. The Poincaré metric is unique up to addition of terms of the form $x^n S_x$, where $S_x$ is a 1-parameter family of symmetric (2, 0)-tensors such that $S_0$ is trace-free (for details see [15,16]). For a Poincaré metric one can show, see [22] for details, that $\sqrt{\det(g_x)/\det(g)}$ has the Taylor expansion

\[ \sqrt{\frac{\det(g_x)}{\det(g)}} = 1 + v^{(2)} x^2 + v^{(4)} x^4 + \dots + v^{(n-2)} x^{n-2} + v^{(n)} x^n + \cdots, \tag{31} \]

defining smooth functions $v^{(2k)}$. Then in [24] it is shown that the critical Q-curvature $Q_n$ of $(M, [g])$ is given as

\[ 2n\, c_{n/2}\, Q_n = n\, v^{(n)} + \sum_{k=1}^{s-1} (n - 2k)\, A^*_{2k}\, v^{(n-2k)}. \tag{32} \]

Here $A_{2k}$ are the linear differential operators that appear in the expansion of a harmonic function for a Poincaré metric, the star denotes the formal adjoint, and $c_{n/2}$ is a constant.

Furthermore, one has to recall how the Poincaré metric can be obtained from the ambient metric. Assume that $\bar g = 2\,d(\rho t)\,dt + t^2 g(\rho)$ is a pre-ambient metric for $[g]$ that is Ricci-flat up to terms of order $s$ and higher. Such a metric always exists and is unique up to terms of order $n/2$ in $\rho$. Now, on

\[ M_+ = \{(\rho, p, t) \in \bar M \mid p \in M,\ t^2 \rho = -1\}, \]

the Poincaré metric is given by

\[ g_+ = \frac{1}{x^2}\left(dx^2 + \tfrac{1}{2}\, g(x^2)\right). \]
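To make formula (32) concrete, it may help to write it out in the two lowest even dimensions. The following is only a direct specialisation of (32) as stated above, with $s = n/2$; no values are assumed for the constants $c_{n/2}$ or for the operators $A_{2k}$.

```latex
\begin{align*}
% n = 2 (s = 1): the sum over k = 1, ..., s-1 is empty
4\, c_1\, Q_2 &= 2\, v^{(2)}, \\
% n = 4 (s = 2): only the term k = 1 contributes
8\, c_2\, Q_4 &= 4\, v^{(4)} + 2\, A_2^{*}\, v^{(2)}.
\end{align*}
```

Since the operators $A^*_{2k}$ are linear, the right-hand side of (32) vanishes in every even dimension as soon as all coefficients $v^{(2k)}$ in (31) vanish.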

Note that if the pre-ambient metric is Ricci-flat, then the Poincaré metric obtained in this way is Einstein. We can use the ambient metric of a pp-wave to prove

Theorem 4. The critical Q-curvature of an even-dimensional pp-wave vanishes.

Proof. Let $(M, g)$ be a pp-wave of even dimension $n = 2s$. In Sect. 4 we have also shown that its pre-ambient metric that is Ricci-flat up to terms of order $n/2$ is given by formula (12) with $H$ as in (19). Using the coframe in (14) we can write down the volume form $\omega(\rho)$ of the $\rho$-dependent family of pp-waves,

\[ g(\rho) = 2\,du\,\big(dr + (h + H)\,du\big) + \sum_{i=1}^{n-2} (dx^i)^2, \]

namely

\[ \omega(\rho) = dx^1 \wedge \dots \wedge dx^{n-2} \wedge \big(dr + (h + H)\,du\big) \wedge du = \omega(0). \]

For the family $g_x = \tfrac{1}{2}\, g(x^2)$ defining the Poincaré metric this implies that $\det(g_x) = \det(g_0)$. Hence, all the $v^{(2k)}$ in (31) are zero, and so is the critical Q-curvature by the result of [24] given in formula (32).

Recall that for a pp-wave $(M, g)$ the vanishing of the scalar curvature implies that the Laplacian $\Delta_g$ is conformally covariant. Calculations using formulae in [26] show that the first GJMS-operators $P_2$, $P_4$ and $P_6$ are equal to the corresponding powers of the Laplacian, $\Delta_g$, $\Delta_g^2$ and $\Delta_g^3$. We conjecture that for pp-waves this is also the case for the higher $P_{2k}$.

8. Conformal and Ambient Holonomy

We conclude with a brief remark about the holonomy of the ambient metric and the holonomy of the normal conformal Cartan connection, also called the conformal holonomy, of a pp-wave. Holonomy groups describe the reduction of generic structures down to more special structures, in the semi-Riemannian, the conformal, and in other geometric settings. For a conformal manifold of signature $(r, s)$ the conformal holonomy is contained in $SO(r + 1, s + 1)$. If it is a proper subgroup, then the conformal structure is reduced to a more special structure. Examples are Lorentzian Fefferman spaces (for an overview see [1]), where the conformal holonomy reduces to the special unitary group, or conformal structures in signature (2, 3) with non-compact $G_2$ as structure group [38,39]. In [31] it is proven that the conformal holonomy of an n-dimensional Lorentzian conformal class that is given by a metric with parallel null line and totally null Ricci tensor is contained in the stabiliser in $SO(2, n)$ of a totally null plane $N$. Of course, pp-waves are special examples of such metrics and hence their conformal holonomy reduces to this stabiliser. But we get the same result also for the holonomy of the ambient metric of a pp-wave.

Proposition 3. The metric $\bar g$ defined in Eq. (12) admits a holonomy invariant distribution of totally null planes $N$ spanned by $\partial_r$ and $\partial_\rho$. In particular, all curvature operators $\bar R(V, W)$, $V, W \in T\bar M$, leave invariant the fibres of $N$ and of $N^\perp$, which is spanned by $\partial_r$, $\partial_\rho$, and $\partial_i$.

Proof. The easiest way to see this is to consider the dual frame to the co-frame in (14) given by

\[ E_0 = \frac{1}{t}\,\partial_\rho, \quad E_i = \frac{1}{t}\,\partial_i, \quad E_{n-1} = \frac{1}{t^2}\,\partial_r, \quad E_n = \partial_u - (h + H)\,\partial_r, \quad E_{n+1} = \partial_t - \frac{\rho}{t}\,\partial_\rho. \]

Using the relation between $\bar g(\bar\nabla E_\mu, E_\nu)$ and the connection 1-forms in (15), one can read off that $N = \mathrm{span}(E_0, E_{n-1}) = (\mathrm{span}(E_0, E_i, E_{n-1}))^\perp$ is invariant under the Levi-Civita connection.
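The determinant identity used in the proof of Theorem 4 can also be checked directly in coordinates. The sketch below assumes the coordinate ordering $(x^1, \dots, x^{n-2}, r, u)$, which is a choice made here for illustration, and treats $h + H$ as an arbitrary function, whatever its dependence on $\rho$.

```latex
\begin{align*}
g(\rho) &= 2\,du\,dr + 2(h + H)\,du^2 + \sum_{i=1}^{n-2} (dx^i)^2, \\
% matrix of g(rho) in the assumed ordering (x^1, ..., x^{n-2}, r, u)
\big(g(\rho)_{ab}\big) &=
\begin{pmatrix}
\mathbf{1}_{n-2} & 0 & 0 \\
0 & 0 & 1 \\
0 & 1 & 2(h + H)
\end{pmatrix}, \\
% block-diagonal form: only the (r, u) block contributes a non-trivial factor
\det g(\rho) &= \det \begin{pmatrix} 0 & 1 \\ 1 & 2(h + H) \end{pmatrix} = -1, \\
% the constant factor 1/2 in g_x rescales both determinants by 2^{-n}
\frac{\det(g_x)}{\det(g_0)} &= \frac{2^{-n} \det g(x^2)}{2^{-n} \det g(0)} = 1
\qquad \text{for } g_x = \tfrac{1}{2}\, g(x^2).
\end{align*}
```

Because the determinant does not involve $h + H$ at all, the left-hand side of (31) is identically 1, which is the statement that all $v^{(2k)}$ vanish.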


Corollary 2. Let $G$ be the holonomy group of the ambient metric of a pp-wave in odd dimension or in dimension $2s$ with $\Delta^s h = 0$. Then $G$ is contained in the stabiliser in $SO(2, n)$ of a totally null plane in $\mathbb{R}^{2,n}$.

In general, it is possible to show that the conformal holonomy is always contained in the ambient holonomy [33]. For a conformal class with an Einstein metric or a Ricci-flat metric both holonomy groups are the same [31,34]. For a pp-wave, not necessarily conformally Einstein, we have just seen that both are contained in the isotropy group of a totally null plane. Hence, it is very likely that the conformal holonomy is actually equal to the ambient holonomy. Proving this, however, is beyond the scope of this paper.

References

1. Baum, H.: The conformal analog of Calabi-Yau manifolds. In: Handbook of Pseudo-Riemannian Geometry, IRMA Lectures in Mathematics and Theoretical Physics. Zürich: European Mathematical Society, 2007, in press
2. Bautier, K., Englert, F., Rooman, M., Spindel, P.: The Fefferman-Graham ambiguity and AdS black holes. Phys. Lett. B 479(1-3), 291–298 (2000)
3. Bena, I., Roiban, R.: Supergravity pp solutions with 28 and 24 supercharges. Phys. Rev. D 67, 125014 (2003)
4. Berenstein, D., Maldacena, J., Nastase, H.: Strings in flat space and pp waves from N = 4 super Yang Mills. J. High Energy Phys. (4), No. 13, 30 (2002)
5. Blau, M., Figueroa-O’Farrill, J., Hull, C., Papadopoulos, G.: A new maximally supersymmetric background of type IIB superstring theory. J. High Energy Phys. 01, 047 (2002)
6. Blau, M., Figueroa-O’Farrill, J., Hull, C., Papadopoulos, G.: Penrose limits and maximal supersymmetry. Class. Quant. Grav. 19, L87–L95 (2002)
7. Branson, T.P.: The Functional Determinant. Volume 4 of Lecture Notes Series. Seoul: Seoul National University Research Institute of Mathematics Global Analysis Research Center, 1993
8. Brinkmann, H.W.: Einstein spaces which are mapped conformally on each other. Math. Ann. 94, 119–145 (1925)
9. Chruściel, P.T., Kowalski-Glikman, J.: The isometry group and Killing spinors for the pp wave space-time in D = 11 supergravity. Phys. Lett. B 149(1-3), 107–110 (1984)
10. Coley, A., Milson, R., Pelavas, N., Pravda, V., Pravdová, A., Zalaletdinov, R.: Generalizations of pp-wave spacetimes in higher dimensions. Phys. Rev. D (3) 67(10), 104020, 4 (2003)
11. Cvetič, M., Lü, H., Pope, C.N.: Penrose limits, pp-waves and deformed M2-branes. Phys. Rev. D 69, 046003 (2004)
12. Cvetič, M., Lü, H., Pope, C.N.: M-theory pp-waves, Penrose limits and supernumerary supersymmetries. Nuclear Phys. B 644(1-2), 65–84 (2002)
13. de Haro, S., Skenderis, K., Solodukhin, S.N.: Holographic reconstruction of spacetime and renormalization in the AdS/CFT correspondence. Commun. Math. Phys. 217, 595 (2001)
14. Ehlers, J., Kundt, W.: Exact solutions of the gravitational field equations. In: Gravitation: An Introduction to Current Research. New York: Wiley, 1962, pp. 49–101
15. Fefferman, C., Graham, C.R.: Conformal invariants. In: Élie Cartan et les mathématiques d’aujourd’hui, Astérisque (Numéro Hors Série), 95–116 (1985)
16. Fefferman, C., Graham, C.R.: The ambient metric. http://arxiv.org/abs/0710.0919v2 [math.DG], 2008
17. Fefferman, C., Hirachi, K.: Ambient metric construction of Q-curvature in conformal and CR geometries. Math. Res. Lett. 10(5-6), 819–831 (2003)
18. Gauntlett, J.P., Hull, C.M.: pp-waves in 11-dimensions with extra supersymmetry. J. High Energy Phys. 6(13), 13 (2002)
19. Gover, A.R., Leitner, F.: A sub-product construction of Poincaré-Einstein metrics. Int. J. Math. 20, 1263–1287 (2009)
20. Gover, A.R., Nurowski, P.: Obstructions to conformally Einstein metrics in n dimensions. J. Geom. Phys. 56(3), 450–484 (2006)
21. Graham, C.R.: Personal communication
22. Graham, C.R.: Volume and area renormalizations for conformally compact Einstein metrics. In: The Proceedings of the 19th Winter School “Geometry and Physics” (Srni, 1999), Rend. Circ. Mat. Palermo (2) Suppl. No. 63, 31–42 (2000)
23. Graham, C.R., Jenne, R., Mason, L.J., Sparling, G.A.J.: Conformally invariant powers of the Laplacian. I. Existence. J. London Math. Soc. (2) 46(3), 557–565 (1992)


24. Graham, C.R., Juhl, A.: Holographic formula for Q-curvature. Adv. Math. 216(2), 841–853 (2007)
25. Hull, C.M.: Exact pp-wave solutions of eleven-dimensional supergravity. Phys. Lett. 139B, 39–41 (1984)
26. Juhl, A.: Families of Conformally Covariant Differential Operators, Q-curvature and Holography. Progress in Mathematics 275. Basel: Birkhäuser, 2009
27. Kichenassamy, S.: On a conjecture of Fefferman and Graham. Adv. Math. 184(2), 268–288 (2004)
28. Kowalski-Glikman, J.: Vacuum states in supersymmetric Kaluza-Klein theory. Phys. Lett. B 134(3-4), 194–196 (1984)
29. Kowalski-Glikman, J.: A nontrivial vacuum state in D = 10, N = 1 supergravity. Phys. Lett. B 134(3-4), 159–160 (1984)
30. Leistner, T.: Lorentzian manifolds with special holonomy and parallel spinors. In: Proceedings of the 21st Winter School “Geometry and Physics” (Srni, 2001), Rend. Circ. Mat. Palermo Suppl. 69, 131–159 (2002)
31. Leistner, T.: Conformal holonomy of C-spaces, Ricci-flat, and Lorentzian manifolds. Diff. Geom. Appl. 24(5), 458–478 (2006)
32. Leistner, T.: Screen bundles of Lorentzian manifolds and some generalisations of pp-waves. J. Geom. Phys. 56(10), 2117–2134 (2006)
33. Leistner, T., Nurowski, P.: Conformal classes with $G_{2(2)}$-ambient metrics. http://arxiv.org/abs/0904.0186v2 [math.DG], 2009
34. Leitner, F.: Conformal Killing forms with normalisation condition. Rend. Circ. Mat. Palermo (2) Suppl. 75, 279–292 (2005)
35. Meessen, P.: A small eprint on pp-wave vacua in 6 and 5 dimensions. Phys. Rev. D 65, 087501 (2002)
36. Michelson, J.: (Twisted) toroidal compactification of pp-waves. Phys. Rev. D 66, 066002 (2002)
37. Michelson, J.: A pp-wave with 26 supercharges. Class. Quant. Grav. 19(23), 5935–5949 (2002)
38. Nurowski, P.: Differential equations and conformal structures. J. Geom. Phys. 43(4), 327–340 (2005)
39. Nurowski, P.: Conformal structures with explicit ambient metrics and conformal $G_2$ holonomy. In: Symmetries and Overdetermined Systems of Partial Differential Equations. Volume 144 of IMA Vol. Math. Appl., New York: Springer, 2008, pp. 515–526
40. Penrose, R.: Any space-time has a plane wave as a limit. In: Differential Geometry and Relativity, Mathematical Phys. and Appl. Math., Vol. 3. Dordrecht: Reidel, 1976, pp. 271–275
41. Robinson, I.: A solution of the Maxwell-Einstein equations. Bull. Acad. Polon. Sci. Sér. Sci. Math. Astr. Phys. 7, 351–352 (unbound insert) (1959)
42. Schimming, R.: Riemannsche Räume mit ebenfrontiger und mit ebener Symmetrie. Math. Nach. 59, 128–162 (1974)

Communicated by P.T. Chruściel