Communications in Mathematical Physics - Volume 266


Author: M. Aizenman (Chief Editor)


Commun. Math. Phys. 266, 1–16 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0008-2

Communications in Mathematical Physics

Smoothing Effect of Quenched Disorder on Polymer Depinning Transitions

Giambattista Giacomin¹, Fabio Lucio Toninelli²

1 Laboratoire de Probabilités de P 6 & 7 (CNRS U.M.R. 7599) and Université Paris 7 – Denis Diderot, U.F.R. Mathematiques, Case 7012, 2 place Jussieu, 75251 Paris Cedex 05, France. E-mail: [email protected]

2 Laboratoire de Physique, ENS Lyon (CNRS U.M.R. 5672), 46 Allée d’Italie, 69364 Lyon cedex 07, France. E-mail: [email protected]

Received: 8 July 2005 / Accepted: 28 November 2005
Published online: 11 April 2006 – © Springer-Verlag 2006

Abstract: We consider general disordered models of pinning of directed polymers on a defect line. This class contains in particular the (1 + 1)-dimensional interface wetting model, the disordered Poland–Scheraga model of DNA denaturation and other (1 + d)-dimensional polymers in interaction with flat interfaces. We consider also the case of copolymers with adsorption at a selective interface. Under quite general conditions, these models are known to have a (de)localization transition at some critical line in the phase diagram. In this work we prove in particular that, as soon as disorder is present, the transition is at least of second order, in the sense that the free energy is differentiable at the critical line, so that the order parameter vanishes continuously at the transition. On the other hand, it is known that the corresponding non-disordered models can have a first order (de)localization transition, with a discontinuous first derivative. Our result shows therefore that the presence of the disorder really has a smoothing effect on the transition. The relation with the predictions based on the Harris criterion is discussed.

1. Introduction and Models

Quenched disorder is expected to smooth phase transitions in many situations. This is for instance the case of ferromagnetic spin systems in dimension d ≤ 2 (if the spins are discrete) or in dimension d ≤ 4 (if they have rotation symmetry): when a random magnetic field is present the Imry–Ma argument [20], made rigorous by Aizenman and Wehr [1], implies that these models do not exhibit, at any temperature, the first order phase transition with associated spontaneous magnetization which characterizes the corresponding pure models. For the analogous phenomenon for SOS effective interface models see [6].

In the present work, under some conditions on the disorder distribution, we prove that a similar effect takes place in models of directed polymers in random media exhibiting a localization/delocalization transition on a defect line. Such a transition may be of first or higher order in the corresponding pure, i.e. non-disordered, cases and we show that, as soon as disorder is present, the transition is at least of second order. It is important to emphasize that the mechanism inducing the smoothing of the transition in our case is very different from the Imry–Ma one, and it is rather based on an estimate of the probability that the polymer visits rare but very favorable regions where the disorder produces a large positive fluctuation of the partition function.

We will consider mainly two classes of models: random pinning (or wetting) models and random copolymers at selective interfaces. In pinning models [2, 10, 13, 24, 27] the typical situation one has in mind is that of a directed path in (1 + d) dimensions, which receives a reward (or a penalty) at each intersection with a 1-dimensional region (a defect line, in the physical language) according to whether the charge present on the line at the intersection point is positive or negative. On the other hand the copolymer model, whose study was initiated in [14] in the theoretical physical literature (see [23] and references therein for updated physics developments) and in [5, 25] in the mathematical one, aims at modeling a (1 + 1)-dimensional, directed heteropolymer containing both hydrophobic and hydrophilic components, in the presence of an interface (the line S = 0 in the notation of Sect. 1.2) separating two solvents, situated in the upper and lower half-planes (S > 0 and S < 0). One solvent favors the hydrophilic components and the other favors the hydrophobic ones, i.e., if the charge of the n-th monomer is positive (negative), this monomer tends to be in the upper (lower) half-plane. The main question one would like to answer is whether or not the interaction induces a localization of the polymer along the defect line or interface.

1.1. Random pinning and wetting models. Let $S := \{S_n\}_{n=0,1,\ldots}$ be a homogeneous process on an arbitrary set that contains a point 0, with $S_0 = 0$ and law $\mathbf{P}$. For $\beta \ge 0$, $h \in \mathbb{R}$ and $\omega = \{\omega_n\}_{n=1,2,\ldots} \in \mathbb{R}^{\mathbb{N}}$, one introduces the probability measure

$$\frac{d\mathbf{P}_{N,\omega}}{d\mathbf{P}}(S) = \frac{1}{Z_{N,\omega}} \exp\left(\sum_{n=1}^{N} (\beta\omega_n - h)\,\mathbf{1}_{S_n = 0}\right) \mathbf{1}_{S_N = 0}. \tag{1.1}$$

The choice of setting $S_N = 0$ is just for technical convenience, see Remark 1.1 below. Let us set $\tau_0 = 0$ and, for $i \in \mathbb{N} := \{1, 2, \ldots\}$, $\tau_i = \inf\{n > \tau_{i-1} : S_n = 0\}$ if $\tau_{i-1} < +\infty$ and $\tau_i = +\infty$ if $\tau_{i-1} = +\infty$. We assume that $\tau := \{\tau_i\}_i$ is the sequence of partial sums of an IID sequence of random variables taking values in $\mathbb{N} \cup \{\infty\}$ with discrete density $K(\cdot)$:

$$K(n) = \mathbf{P}(\tau_1 = n). \tag{1.2}$$

This is of course the case if S is a Markov chain and for definiteness one should keep this case in mind. In order to avoid trivialities, we assume that $K(\infty) < 1$ and, for the model to be defined, we assume also that there exists $s \in \mathbb{N}$ such that $\mathbf{P}(S_N = 0) > 0$ for every $N \in s\mathbb{N}$. Therefore, starting with (1.1), we always implicitly assume that $N \in s\mathbb{N}$, when N is the length of the polymer. Moreover the cases we will consider are such that there exists $\alpha \ge 1$ such that

$$\liminf_{n \to \infty} \frac{\log K(sn)}{\log n} \ge -\alpha, \tag{1.3}$$

and $K(n) = 0$ if $n \notin s\mathbb{N}$. We advise the reader who feels uneasy with the weak requirement (1.3) to focus on the case of $K(n)$ behaving like $n^{-\alpha}$, possibly times a slowly varying function (see Appendix A). Note that, for α < 2, the return times to zero of S are not integrable. We stress that we have introduced s to account for the possible periodicity of S: of course the most natural example is that of the simple random walk on $\mathbb{Z}$, $\mathbf{P}(S_i - S_{i-1} = 1) = \mathbf{P}(S_i - S_{i-1} = -1) = 1/2$ and $\{S_i - S_{i-1}\}_i$ IID, for which (1.3) holds with s = 2 and α = 3/2 [11, Ch. III]. Another example is that of the simple random walk on $\mathbb{Z}^d$, d ≥ 2: in this case, α = d/2.

We can rewrite $Z_{N,\omega}$, and in fact the model itself, in terms of the sequence τ: the partition function in (1.1) is

$$Z_{N,\omega} := \mathbf{E}\left[\exp\left(H_{N,\omega}(S)\right);\ \tau_{\mathcal{N}_N} = N\right], \tag{1.4}$$

with $\mathcal{N}_N := \sup\{i : \tau_i \le N\}$ and

$$H_{N,\omega}(S) = \sum_{i=1}^{\mathcal{N}_N} \left(\beta\omega_{\tau_i} - h\right). \tag{1.5}$$
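As a hedged numerical illustration of (1.4)–(1.5): conditioning on the last return before n gives the renewal recursion $Z_{n,\omega} = e^{\beta\omega_n - h}\sum_{j<n} Z_{j,\omega} K(n-j)$ with $Z_{0,\omega} = 1$. This recursion is a standard consequence of the renewal structure and is not written out in the text. The sketch below assumes the simple random walk on $\mathbb{Z}$, for which $K(2n) = \binom{2n}{n} / ((2n-1)4^n)$; the function names are hypothetical.

```python
import math

def srw_return_density(n_max):
    """K(n) = P(tau_1 = n), n = 0..n_max, for the simple random walk on Z.

    Assumption: K(2n) = C(2n, n) / ((2n - 1) * 4**n) and K(odd) = 0.
    Intended for moderate n_max only (floats overflow near n = 1024).
    """
    K = [0.0] * (n_max + 1)
    for n in range(2, n_max + 1, 2):
        K[n] = math.comb(n, n // 2) / ((n - 1) * 2.0 ** n)
    return K

def pinning_partition_function(omega, beta, h, K):
    """Constrained partition function Z_{N,omega} of (1.4), with N = len(omega).

    omega[n - 1] plays the role of the charge omega_n; K[n] is the
    inter-arrival density, with K[0] unused.
    """
    N = len(omega)
    Z = [0.0] * (N + 1)
    Z[0] = 1.0
    for n in range(1, N + 1):
        # Boltzmann weight collected by the contact at site n
        weight = math.exp(beta * omega[n - 1] - h)
        # Sum over the position j of the previous contact
        Z[n] = weight * sum(Z[j] * K[n - j] for j in range(n))
    return Z[N]
```

With β = 0 and h = 0 the recursion reduces to the renewal equation for $\mathbf{P}(S_N = 0)$, which gives a simple sanity check; (1/N) log of the returned value is a finite-volume proxy for the free energy introduced in (1.7).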

The disordered pinning model, which we consider here, is obtained assuming that the sequence ω is chosen as a typical realization of an IID sequence of random variables with law $\mathbb{P}$, still denoted by $\omega = \{\omega_n\}_n$. We assume finiteness of exponential moments:

$$\mathbb{E}[\exp(t\omega_1)] < \infty \tag{1.6}$$

for $t \in \mathbb{R}$ and, without loss of generality, $\mathbb{E}\,\omega_1 = 0$ and $\mathbb{E}\,\omega_1^2 = 1$. Further assumptions on $\mathbb{P}$ will be formulated in Sect. 2, where we state our main results. Under the above assumptions on the disorder the quenched free energy of the model exists, namely the limit

$$\mathrm{f}(\beta, h) := \lim_{N \to \infty} \frac{1}{N} \log Z_{N,\omega} \tag{1.7}$$

exists $\mathbb{P}(d\omega)$-almost surely and in the $L^1(\mathbb{P})$ sense. The existence of this limit can be proven via standard super-additivity arguments based on Kingman’s sub-additive ergodic theorem [22] (we refer for example to [2, 5, 16] for details). This approach yields also automatically the so-called self-averaging property of the free energy, that is the fact that $\mathrm{f}(\beta, h)$ is non-random. A simple but fundamental observation is that

$$\mathrm{f}(\beta, h) \ge 0. \tag{1.8}$$

The proof of such a result is elementary:

$$\frac{1}{N}\mathbb{E}\log Z_{N,\omega} \ge \frac{1}{N}\mathbb{E}\log \mathbf{E}\left[\exp\left(\sum_{i=1}^{\mathcal{N}_N}\left(\beta\omega_{\tau_i} - h\right)\right);\ \tau_1 = N\right] \tag{1.9}$$

$$= \frac{1}{N}\left(\beta\,\mathbb{E}[\omega_N] - h\right) + \frac{1}{N}\log \mathbf{P}(\tau_1 = N) \xrightarrow{N \to \infty} 0, \tag{1.10}$$

where we have used assumption (1.3). The proof of (1.8) suggests the following partition of the parameter space (or phase diagram):

• The localized region: $\mathcal{L} = \{(\beta, h) : \mathrm{f}(\beta, h) > 0\}$;
• The delocalized region: $\mathcal{D} = \{(\beta, h) : \mathrm{f}(\beta, h) = 0\}$.


We set $h_c(\beta) := \sup\{h : \mathrm{f}(\beta, h) > 0\}$, and we will call $h_c(\beta)$ the critical point. Since $\mathrm{f}(\beta, \cdot)$ is non-increasing and continuous, $\mathcal{D} = \{(\beta, h) : h \ge h_c(\beta)\}$. It is rather easy to see that $|h_c(\beta)| < \infty$, see e.g. [2] and [16].

Remark 1.1. It would of course be as natural to consider the model with partition function $Z^{f}_{N,\omega} := \mathbf{E}\left[\exp\left(H_{N,\omega}(S)\right)\right]$. It is therefore worth stressing that by standard arguments, see e.g. [16], one sees that

$$Z_{N,\omega} \le Z^{f}_{N,\omega} \le \max_{n \in s\mathbb{N}:\ n \le N} \frac{\sum_{j=n}^{\infty} K(j)}{K(n)}\, \exp\left(|\beta\omega_N - h|\right) Z_{N,\omega} \le \frac{1}{\min_{n \in s\mathbb{N}:\ n \le N} K(n)}\, \exp\left(|\beta\omega_N - h|\right) Z_{N,\omega}, \tag{1.11}$$

uniformly in $N \in s\mathbb{N}$ and ω. In particular, by (1.3), the free energy is unaffected by the presence of the constraint $\tau_{\mathcal{N}_N} = N$: we stick to the constrained case because it simplifies some technical steps of the proofs.

Remark 1.2. For ease of exposition we have used the generic denomination of pinning model, but our framework, as is possibly clearer when the partition function is cast in the form (1.4), includes a variety of models, like for example the Poland–Scheraga model of DNA denaturation, for which theoretical arguments on models with excluded volume interactions suggest a value of α larger than 2 [21]: note that whether the transition in the disordered case is of first or higher order is a crucial and controversial issue in the field, see for example [9, 8, 15]. We stress also that, since we can choose $K(\infty) > 0$, it is rather easy to see that also the disordered (1 + 1)-dimensional interface wetting models [10, 13] enter the general class we are considering. It is known that the nonrandom case β = 0 is exactly solvable and that, according to the law of $\tau_1$, the transition at $h_c(0) = \log(1 - K(\infty))$ can be either of first or higher order. For completeness and to match our set-up, we give a quick self-contained analysis of this case in Appendix A.

1.2. Random copolymers at a selective interface. In the case of the copolymer model, the natural setting is to assume, in addition, that the state space of the process $\{S_n\}_n$ is $\mathbb{Z}$, that the law $\mathbf{P}$ is invariant under the transformation S → −S and that $(S_i - S_{i-1}) \in \{-1, 0, +1\}$ for every i ≥ 1. For instance, if $\{S_i - S_{i-1}\}_i$ is a sequence of IID variables then it is a classical result [12, Ch. XII.7] that $K(n) = c(1 + o(1))\,n^{-3/2}$, for large values of $n \in s\mathbb{N}$ (so α = 3/2): c is a constant that can be expressed in terms of $\mathbf{P}(S_1 - S_0 = 0)$ and of course s = 1 unless $\mathbf{P}(S_1 - S_0 = 0) = 0$. The copolymer model is defined introducing

$$H_{N,\omega}(S) = \frac{1}{2}\sum_{n=1}^{N} (\beta\omega_n + h)\,\mathrm{sign}(S_n), \tag{1.12}$$

with the convention that sign(0) = +1, and the corresponding partition function

$$Z_{N,\omega} := \mathbf{E}\left[\exp\left(H_{N,\omega}(S)\right);\ S_N = 0\right]. \tag{1.13}$$


We may assume without loss of generality that both β and h are nonnegative. The factor 1/2 in (1.12) is introduced just for convenience (see Sect. 3.2). As in Sect. 1.1, the corresponding random model is obtained by choosing ω as a realization of an IID sequence of centered random variables of unit variance and finite exponential moments. In analogy with (1.7) and (1.8), the limit

$$f(\beta, h) := \lim_{N \to \infty} \frac{1}{N}\log Z_{N,\omega} \tag{1.14}$$

exists $\mathbb{P}(d\omega)$-almost surely and in the $L^1(\mathbb{P})$ sense and one can prove as in (1.9) that $\mathrm{f}(\beta, h) := f(\beta, h) - h/2 \ge 0$. Therefore, also in this case we can partition the phase diagram into a localized and a delocalized phase, as

• The localized region: $\mathcal{L} = \{(\beta, h) : \mathrm{f}(\beta, h) > 0\}$;
• The delocalized region: $\mathcal{D} = \{(\beta, h) : \mathrm{f}(\beta, h) = 0\}$.

Also in this case, we set $h_c(\beta) := \sup\{h : \mathrm{f}(\beta, h) > 0\}$ and we observe that $\mathcal{D} = \{(\beta, h) : h \ge h_c(\beta)\}$.

Many results have been proven for copolymers at selective interfaces, but they are almost always about the case of simple random walks. However they can be extended in a rather straightforward way to the general case we consider here (we omit the details also because they are not directly pertinent to the content of this paper). Above all we have that $0 < h_c(\beta) < \infty$ for every β > 0. Even more, the explicit bounds carry over to the general S we consider here: the upper bound [5] is even independent of the choice of the law of S, while the lower bound [4] depends on α. We take this opportunity to stress also that various results about path behavior in the two regions are available, both for the pinning problem and for the copolymer, and, once again, mostly in the simple random walk case α = 3/2. For instance, it is known that in $\mathcal{L}$ long excursions of the walk from the line S = 0 are exponentially suppressed and the fraction of sites where the polymer crosses the line remains nonzero in the thermodynamic limit (e.g., cf. [16, 25] and references therein, and [18]). On the other hand, in the interior of $\mathcal{D}$ the number of intersections is, in a suitable sense, O(log N) [17] and, for the copolymer, the number of steps where $\mathrm{sign}(S_n) = -1$ is also O(log N) [17].

2. Smoothing of the Depinning Transition

While the free energy can be proven to be infinitely differentiable with respect to all of its parameters in the region $\mathcal{L}$ [18], no results are available about its regularity at the critical point, apart from the obvious fact that $\mathrm{f}$ is continuous since it is convex. The result of the present paper partly fills this gap, showing that the transition is at least of second order, as soon as disorder is present. In order to state our main theorem, we need some further assumptions on the disorder variables ω. We will consider two distinct cases:

C1: Bounded random variables. The random variable $\omega_1$ is bounded,

$$|\omega_1| \le M < \infty. \tag{2.1}$$

C2: Unbounded continuous random variables. The law of $\omega_1$ has a density $P(\cdot)$ with respect to the Lebesgue measure on $\mathbb{R}$, and there exists 0 < R < ∞ such that

$$\int_{\mathbb{R}} P(y + x)\,\log\frac{P(y + x)}{P(y)}\,dy \le R\,x^2 \tag{2.2}$$

in a neighborhood of x = 0. This is true in great generality whenever $P(\cdot)$ is positive, for example when the disorder is Gaussian and, more generally, whenever $P(\cdot) = \exp(-V(\cdot))$, with $V(\cdot)$ a polynomial bounded below. Then, one has:

Theorem 2.1. Under Condition C1 or C2, both for the copolymer and for the pinning model, for every 0 < β < ∞ there exists 0 < c(β) < ∞, possibly depending on $\mathbb{P}$, such that for every 1 ≤ α < ∞,

$$\mathrm{f}(\beta, h) \le \alpha\,c(\beta)\,(h_c(\beta) - h)^2 \tag{2.3}$$

if $h < h_c(\beta)$.

Although the above result, coupled of course with (1.8), seems to be in the same spirit as the rounding effect proven by Aizenman and Wehr [1] for the two-dimensional Random Field Ising Model, the physical mechanisms of smoothing are deeply different in the two cases. While [1] is based on a rigorous version of the Imry–Ma argument [20] (i.e., a comparison between the effect of boundary conditions and of disorder fluctuations in the bulk due to the random magnetic field) in our case the boundary conditions play no role at all and everything is based on an energy-entropy argument inspired by [4].

Remark 2.2. It is important to observe that, as explained in Appendix A, in the pinning case the deterministic model, β = 0, has a first order phase transition whenever $\sum_{n \in \mathbb{N}} n K(n) < +\infty$ [2], in particular when α > 2. Theorem 2.1 therefore shows that the disorder really has a smoothing effect on the transition. But more than that is true: for α ∈ (3/2, 2), $\mathrm{f}(0, h)$ at $h_c(0)$ is strictly less regular than $\mathrm{f}(\beta, h)$ at $h_c(\beta)$. In fact $\mathrm{f}(0, h) > (h_c(0) - h)^{2 - \delta}$ for $\delta \in (0, (2\alpha - 3)/(\alpha - 1))$ and small values of $h_c(0) - h > 0$ (for sharper results, see Appendix A). Notice that this is in agreement with the so-called Harris criterion [19], which predicts that arbitrarily weak disorder modifies the nature of a second-order phase transition as soon as the critical exponent of the specific heat in the pure case is positive. In the present situation, this condition corresponds just to α > 3/2. The Harris criterion also predicts that the critical behavior does not change if α < 3/2, which is compatible with Theorem 2.1. Rigorous work connected to the Harris criterion, in the Ising model context, may be found in [7]. As one realizes easily from the proof of Theorem 2.1 given in Sect. 3, the constant c(β) in (2.3) can be very large (of order $O(\beta^{-2})$) for β small.
This is rather intuitive: for β → 0 one approaches the deterministic situation, where the transition can be of first order.

Remark 2.3. In the theoretical physics literature, the (de)localization transition is claimed to be in some cases of order higher than two [28], or even of infinite order [27, 23]. The method we present here, which is rather insensitive to the details of the model, does not allow one to prove more than second order in general. It is likely that finer results require model-specific techniques.

Remark 2.4. A generalized model: copolymers with adsorption. It is also possible to consider copolymer models with an additional pinning interaction [26], as

$$H_{N,\omega,\tilde\omega}(S) = \sum_{n=1}^{N} (\beta_1\tilde\omega_n - h_1)\,\mathbf{1}_{S_n = 0} + \frac{1}{2}\sum_{n=1}^{N} (\beta_2\omega_n - h_2)\,\mathrm{sign}(S_n). \tag{2.4}$$


Here, both ω and $\tilde\omega$ are sequences of IID centered random variables with unit variance and finite exponential moments. In addition, one assumes $\tilde\omega$ to be independent of ω. This model corresponds to the situation where the interface between the two solvents is not neutral. While for simplicity we will not present details for this model, we sketch here what happens in this case. In analogy with the previous models, one can partition the phase diagram (i.e., the space of the parameters $\beta_1, h_1, \beta_2, h_2$) into a localized and a delocalized region, separated by a critical surface. In this case, Theorem 2.1 is easily generalized to give that the free energy $\mathrm{f}(\beta_1, h_1, \beta_2, h_2)$ has continuous first derivatives with respect to $h_1, h_2$ when these parameters approach the critical surface from the localized region.

3. Proof of the Smoothing Effect

3.1. The pinning case. In this section we prove Theorem 2.1 for the pinning case, and in the next one we explain how the proof can be immediately extended to the copolymer model. The key idea, in analogy with [17], is to introduce a new free energy where the fraction of sites where the polymer comes back to zero is fixed. In other words, recalling that

$$\mathcal{N}_N = |\{1 \le n \le N : S_n = 0\}| \tag{3.1}$$

one introduces, for m ∈ [0, 1],

$$\phi(\beta, m) = \lim_{\varepsilon \searrow 0}\lim_{N \to \infty}\frac{1}{N}\mathbb{E}\log \hat Z_{N,\omega}(\beta; m, \varepsilon) := \lim_{\varepsilon \searrow 0}\lim_{N \to \infty}\frac{1}{N}\mathbb{E}\log \mathbf{E}\left[e^{\beta\sum_{n=1}^{N}\omega_n \mathbf{1}_{S_n = 0}}\,\mathbf{1}_{(\mathcal{N}_N/N) \in [m - \varepsilon, m + \varepsilon]}\,\mathbf{1}_{S_N = 0}\right]. \tag{3.2}$$

Note that the limit is well defined, since $\mathbb{E}\log \hat Z_{N,\omega}(\beta; m, \varepsilon)$ is super-additive in N, thanks to the IID assumption on the increments of the sequence of return times τ, and non-increasing for $\varepsilon \searrow 0$. Therefore the limit $\phi_\varepsilon(\beta, m)$ of $(1/N)\,\mathbb{E}\log \hat Z_{N,\omega}(\beta; m, \varepsilon)$, as N → ∞, exists, as well as the second limit $\phi_\varepsilon(\beta, m) \searrow \phi(\beta, m)$ as $\varepsilon \searrow 0$. Notice moreover that (3.2) holds also without taking the expectation, as for $(1/N)\log Z_{N,\omega}$: in this case of course the limit N → ∞ has to be taken in the $\mathbb{P}(d\omega)$-a.s. sense. Moreover, it is immediate to realize that $\phi(\beta, 0) = 0$ and that

$$\phi(\beta, m) \le \mathrm{f}(\beta, 0) \le \beta, \tag{3.3}$$

so that $\phi(\beta, m)$ is always bounded above. Finally, again thanks to the IID property of the differences of successive return times to zero, it is easy to show that $\phi(\beta, \cdot)$ is concave:

$$\phi(\beta, m) \ge x\,\phi(\beta, m_1) + (1 - x)\,\phi(\beta, m_2) \tag{3.4}$$

if 0 ≤ x ≤ 1 and $m = x m_1 + (1 - x) m_2$. By exploiting the $\mathbb{P}(d\omega)$-a.s. convergence of $(1/N)\log \hat Z_{N,\omega}(\beta; m, \varepsilon)$ to the nonrandom limit $\phi_\varepsilon(\beta, m)$ and the subsequent convergence for $\varepsilon \searrow 0$, one deduces that $\mathrm{f}(\beta, h)$ and $\phi(\beta, m)$ are related by a Legendre transform:

$$\mathrm{f}(\beta, h) = \sup_{m \in [0,1]} \left(\phi(\beta, m) - hm\right). \tag{3.5}$$


In turn, this allows one to identify $h_c(\beta)$ in terms of $\phi(\beta, \cdot)$ as

$$h_c(\beta) = \inf\left\{h : \sup_{m \in [0,1]} \left(\phi(\beta, m) - hm\right) = 0\right\}. \tag{3.6}$$

The key technical step in the proof of Theorem 2.1 is the following:

Theorem 3.1. Under Condition C1 or C2, for every 0 < β < ∞ there exists 0 < C(β) < ∞ such that, for every 1 ≤ α < ∞,

$$\phi(\beta, m) - h_c(\beta)\,m \le -\frac{C(\beta)}{\alpha}\,m^2 \tag{3.7}$$

if 0 ≤ m ≤ 1.

Proof of Theorem 2.1. It is an immediate consequence of Theorem 3.1 and (3.5). In fact, by (3.7) we have

$$\phi(\beta, m) - hm \le -\frac{C(\beta)}{\alpha}\,m^2 + (h_c(\beta) - h)\,m, \tag{3.8}$$

for every m ∈ [0, 1]. Taking the supremum over m on both sides of (3.8), by (3.5) we obtain

$$\mathrm{f}(\beta, h) \le \sup_{m \in [0,1]}\left(-\frac{C(\beta)}{\alpha}\,m^2 + (h_c(\beta) - h)\,m\right) \le \sup_{m \ge 0}\left(-\frac{C(\beta)}{\alpha}\,m^2 + (h_c(\beta) - h)\,m\right) = \frac{\alpha}{4C(\beta)}\,(h_c(\beta) - h)^2. \tag{3.9}$$
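The last step in (3.9) is the maximization of a downward parabola, attained at $m^* = \alpha(h_c(\beta) - h)/(2C(\beta))$. The following sketch, with hypothetical function names and arbitrary illustrative parameter values, compares the closed form with a brute-force maximization over a grid of m ∈ [0, 1]; it only illustrates the algebra and is not part of the proof.

```python
def smoothing_bound(h, h_c, C, alpha):
    """Closed-form right-hand side of (3.9): alpha * (h_c - h)^2 / (4 C)."""
    return alpha * (h_c - h) ** 2 / (4.0 * C)

def grid_supremum(h, h_c, C, alpha, points=100001):
    """Brute-force maximum of -(C/alpha) m^2 + (h_c - h) m over m in [0, 1]."""
    best = 0.0  # value at m = 0
    for i in range(points):
        m = i / (points - 1)
        best = max(best, -(C / alpha) * m * m + (h_c - h) * m)
    return best
```

When $m^* \le 1$ the two quantities agree; when $m^* > 1$ the supremum over [0, 1] is strictly smaller, which is why the second inequality in (3.9) is in general not an equality.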

We now turn to the proof of Theorem 3.1. We will consider first the case in which $\omega_1$ satisfies condition C2, because it is technically lighter. The two cases differ only in the first part of the proof (that is, up to Remark 3.2 below), where the probability of a rare event is estimated from below by changing the law of the disorder and by evaluating the corresponding relative entropy price. In the case of C2 it is sufficient to shift (i.e. to translate) the distribution of the disorder variables, and the relative entropy estimate implied by (2.2) fits well the rest of the proof. Under Assumption C1, instead, we have to tilt the law of ω and the arising expressions need to be re-worked, see Lemma 3.4 below, before stepping to the second part of the proof.

Proof of Theorem 3.1 under Assumption C2. Due to the concavity of $\phi(\beta, \cdot)$, it is enough to prove (3.7) for $0 < m \le c_1$, with $c_1 = c_1(\beta) > 0$, not depending on α. We define

$$\gamma(\beta, m) := -\phi(\beta, m) + h_c(\beta)\,m + c_2\,\frac{\beta^2 m^2}{\alpha} > 0, \tag{3.10}$$

where $c_2 > 0$ is a constant depending only on $\mathbb{P}$, which will be chosen later. We stress that the term containing $c_2$ has been added simply because a priori one knows only that $-\phi(\beta, m) + h_c(\beta)\,m \ge 0$, cf. (3.6), and it turns out to be technically practical to work with $\gamma(\beta, m) > 0$. For $\ell \in s\mathbb{N}$ we define also

$$A_{\ell,m,\varepsilon} = \left\{\omega \in \mathbb{R}^{\ell} : \frac{1}{\ell}\log \hat Z_{\ell,\omega}(\beta; m, \varepsilon) - h_c(\beta)\,m \ge \gamma(\beta, m)\right\}. \tag{3.11}$$


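Moreover, for $0 < m \le c_1$, we let $\tilde{\mathbb{P}}$ be the law obtained from $\mathbb{P}$ by shifting the distribution of $\omega_1, \ldots, \omega_\ell$ so that

$$\tilde{\mathbb{E}}[\omega_j] = 8\,\frac{\gamma(\beta, m)}{\beta m}. \tag{3.12}$$

Note that

$$\lim_{m \to 0} \gamma(\beta, m)/m = 0, \tag{3.13}$$

since otherwise, by convexity of $\gamma(\beta, \cdot)$, one has $\gamma(\beta, m) \ge \delta m$ for some δ > 0 and every m, that is $\mathrm{f}(\beta, h_c(\beta) - \delta) = 0$, which is in contrast with the definition of $h_c(\beta)$. We now show that the event

$$E := \left\{\omega : (\omega_1, \ldots, \omega_\ell) \in A_{\ell,m,\varepsilon}\right\} \tag{3.14}$$

becomes $\tilde{\mathbb{P}}$-typical for ℓ large. We first observe that, thanks to the constraint $\mathcal{N}_\ell/\ell \ge m - \varepsilon$, and assuming that ε ≤ m/2, one has

$$\frac{1}{\ell}\tilde{\mathbb{E}}\log \hat Z_{\ell,\omega}(\beta; m, \varepsilon) - h_c(\beta)\,m \ge 4\gamma(\beta, m) + \left[\frac{1}{\ell}\mathbb{E}\log \hat Z_{\ell,\omega}(\beta; m, \varepsilon) - h_c(\beta)\,m\right]. \tag{3.15}$$

By (3.2) and (3.10), there exist $\varepsilon_0(m) > 0$ and $\ell_0(\varepsilon, m) \in s\mathbb{N}$ such that

$$\frac{1}{\ell}\tilde{\mathbb{E}}\log \hat Z_{\ell,\omega}(\beta; m, \varepsilon) - h_c(\beta)\,m \ge 2\gamma(\beta, m), \tag{3.16}$$

for $\varepsilon \le \varepsilon_0$, $\ell \ge \ell_0$. This in turn implies that

$$\tilde{\mathbb{P}}(E) \ge \tilde{\mathbb{P}}\left(\frac{1}{\ell}\log \hat Z_{\ell,\omega}(\beta; m, \varepsilon) - \frac{1}{\ell}\tilde{\mathbb{E}}\log \hat Z_{\ell,\omega}(\beta; m, \varepsilon) \ge -\gamma(\beta, m)\right), \tag{3.17}$$

which is greater than, say, 1/2 for ℓ sufficiently large, since $\gamma(\beta, m) > 0$ and (recall (3.2) and the discussion following that formula)

$$\frac{1}{\ell}\log \hat Z_{\ell,\omega}(\beta; m, \varepsilon) - \frac{1}{\ell}\tilde{\mathbb{E}}\log \hat Z_{\ell,\omega}(\beta; m, \varepsilon) \xrightarrow{\ell \to \infty} 0, \tag{3.18}$$

in $\tilde{\mathbb{P}}$-probability. The price of shifting $\mathbb{P}$ to $\tilde{\mathbb{P}}$ is directly estimated by using Assumption C2, cf. (2.2), and recalling (3.13): assuming that $m \le c_1(\beta)$, with $c_1(\beta)$ sufficiently small, one obtains the estimate

$$H(\tilde{\mathbb{P}}|\mathbb{P}) := \mathbb{E}\left[\frac{d\tilde{\mathbb{P}}}{d\mathbb{P}}\log\frac{d\tilde{\mathbb{P}}}{d\mathbb{P}}\right] \le 64\,R\,\ell\left(\frac{\gamma(\beta, m)}{\beta m}\right)^2, \tag{3.19}$$

and by applying the relative entropy inequality

$$\log\frac{\mathbb{P}(E)}{\tilde{\mathbb{P}}(E)} \ge -\frac{1}{\tilde{\mathbb{P}}(E)}\left(H(\tilde{\mathbb{P}}|\mathbb{P}) + e^{-1}\right), \tag{3.20}$$

we obtain

$$p_\ell := \mathbb{P}(E) \ge \frac{1}{2}\exp\left(-128\,R\,\ell\,\left(\gamma(\beta, m)/(\beta m)\right)^2\right), \tag{3.21}$$

for ℓ large.

Remark 3.2. Inequality (3.20) holds whenever the measures $\mathbb{P}$, $\tilde{\mathbb{P}}$ are absolutely continuous with respect to each other and for every event E of nonzero measure. It is a simple consequence of Jensen's inequality: since $r\log r \ge -e^{-1}$ for every r > 0, one has

$$\log\frac{\mathbb{P}(E)}{\tilde{\mathbb{P}}(E)} = \log\tilde{\mathbb{E}}\left[\frac{d\mathbb{P}}{d\tilde{\mathbb{P}}}\,\Big|\,E\right] \ge \tilde{\mathbb{E}}\left[\log\frac{d\mathbb{P}}{d\tilde{\mathbb{P}}}\,\Big|\,E\right] = -\frac{1}{\tilde{\mathbb{P}}(E)}\,\mathbb{E}\left[\frac{d\tilde{\mathbb{P}}}{d\mathbb{P}}\log\frac{d\tilde{\mathbb{P}}}{d\mathbb{P}}\,\mathbf{1}_E\right] \ge -\frac{1}{\tilde{\mathbb{P}}(E)}\left(\mathbb{E}\left[\frac{d\tilde{\mathbb{P}}}{d\mathbb{P}}\log\frac{d\tilde{\mathbb{P}}}{d\mathbb{P}}\right] + e^{-1}\right), \tag{3.22}$$

which is just (3.20).

We now apply an energy–entropy argument similar to that of [4] which, in the present case, roughly consists in selecting only those polymer trajectories which visit the rare stretches where the disorder configuration is such as to produce a sufficiently large positive fluctuation of the partition function. Of course the precise definition of these rare stretches is directly related to the event E. This selection strategy gives a lower bound on the free energy, which implies (3.7). More precisely, we consider a system of length kℓ, with $k \in \mathbb{N}$, $\ell \in s\mathbb{N}$, and we divide it into blocks $B_j = \{j\ell + 1, j\ell + 2, \ldots, (j + 1)\ell\}$ of length ℓ, with j = 0, …, k − 1. For a given realization of ω, we denote by $\mathcal{I}(\omega)$ the ordered set of nonnegative integers,

$$\mathcal{I}(\omega) = \{j_1, \ldots, j_{|\mathcal{I}(\omega)|}\} := \left\{0 \le j \le k - 1 : (\omega_{j\ell+1}, \ldots, \omega_{(j+1)\ell}) \in A_{\ell,m,\varepsilon}\right\}, \tag{3.23}$$

where it is understood that $\varepsilon \le \varepsilon_0(m)$, $\ell \ge \ell_0(\varepsilon, m)$ and $m \le c_1$ as above, so that (3.21) holds. We bound the partition function $Z_{N,\omega}$, with N = kℓ, below by inserting in the average over the paths the constraint that $S_i \ne 0$ whenever $i \in B_j$ with $j \notin \mathcal{I}(\omega)$, that $S_i = 0$ whenever $i = j\ell$ or $i = (j + 1)\ell$ with $j \in \mathcal{I}(\omega)$, and that, for every $j \in \mathcal{I}(\omega)$,

$$\frac{|\{i \in B_j : S_i = 0\}|}{\ell} \in [m - \varepsilon, m + \varepsilon]. \tag{3.24}$$

In this way, recalling the definition of $A_{\ell,m,\varepsilon}$ and assuming that $\varepsilon \le \gamma(\beta, m)/(2|h_c(\beta)|)$, one obtains

$$\frac{1}{k\ell}\log Z_{k\ell,\omega}(\beta, h_c(\beta)) \ge \frac{1}{k}\,|\mathcal{I}(\omega)|\,\frac{\gamma(\beta, m)}{2} + \frac{1}{k\ell}\sum_{r = 0, \ldots, |\mathcal{I}(\omega)|:\ L_r \ne 0}\log K(\ell L_r), \tag{3.25}$$

where we recall that $K(n)$ is the $\mathbf{P}$-probability that the first return to 0 of an excursion of the free process occurs at step n, as in (1.2), while the $L_r$'s, the (possibly vanishing) lengths of the excursions of the process between two blocks $B_i$, $B_{i'}$ with $i, i' \in \mathcal{I}(\omega)$, are defined as

$$L_r = |j_{r+1} - j_r - 1|, \tag{3.26}$$

with the convention that $j_0 = -1$, $j_{|\mathcal{I}(\omega)|+1} = k$, see Fig. 1. Taking the expectation with respect to the disorder and using (1.3), one obtains then

$$\frac{1}{k\ell}\,\mathbb{E}\log Z_{k\ell,\omega}(\beta, h_c(\beta)) \ge \frac{\gamma(\beta, m)}{2}\,p_\ell - \frac{2\alpha}{k\ell}\,\mathbb{E}\sum_{r=0}^{|\mathcal{I}(\omega)|}\mathbf{1}_{L_r \ne 0}\log(\ell L_r), \tag{3.27}$$

for ℓ sufficiently large. At this point, as in [4], one uses Jensen’s inequality and the concavity of the logarithm to get

$$\frac{1}{k\ell}\,\mathbb{E}\log Z_{k\ell,\omega}(\beta, h_c(\beta)) \ge \frac{\gamma(\beta, m)}{2}\,p_\ell - \frac{2\alpha}{k\ell}\,\mathbb{E}\left[(|\mathcal{I}(\omega)| + 1)\log\left(\frac{\ell\sum_{r=0}^{|\mathcal{I}(\omega)|}L_r}{|\mathcal{I}(\omega)| + 1}\right)\right] \ge \frac{\gamma(\beta, m)}{2}\,p_\ell - 2\alpha\,\mathbb{E}\left[\frac{|\mathcal{I}(\omega)| + 1}{k\ell}\log\left(\frac{k\ell}{|\mathcal{I}(\omega)| + 1}\right)\right]. \tag{3.28}$$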

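Since the disorder variables in the distinct blocks $\{B_j\}_j$ are independent, the law of large numbers implies

$$\frac{|\mathcal{I}(\omega)|}{k} \xrightarrow{k \to \infty} p_\ell, \tag{3.29}$$

$\mathbb{P}(d\omega)$-a.s., so that, recalling (3.21),

$$0 = \mathrm{f}(\beta, h_c(\beta)) = \lim_{k \to \infty}\frac{1}{k\ell}\,\mathbb{E}\log Z_{k\ell,\omega}(\beta, h_c(\beta)) \ge p_\ell\left(\frac{\gamma(\beta, m)}{2} - \frac{2\alpha}{\ell}\log\frac{\ell}{p_\ell}\right) \ge p_\ell\left(\frac{\gamma(\beta, m)}{2} - 2\alpha\,\frac{\log(2\ell)}{\ell} - 256\,\alpha R\left(\frac{\gamma(\beta, m)}{\beta m}\right)^2\right). \tag{3.30}$$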

Since ℓ is arbitrary, one obtains

$$\gamma(\beta, m) \ge \frac{\beta^2 m^2}{512\,\alpha R}, \tag{3.31}$$

which is the desired inequality and yields (3.7), provided that $c_2$ in (3.10) satisfies $c_2 < (512R)^{-1}$.
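The relative entropy inequality (3.20) used in the proof above can be checked on a toy example. The sketch below is an illustration on a finite probability space, not part of the argument: the function name and the two three-point laws in the usage note are hypothetical, and both laws are assumed to charge every point so that they are mutually absolutely continuous.

```python
import math

def entropy_inequality_sides(p, p_tilde, event):
    """Return (lhs, rhs) of (3.20) for two laws on {0, ..., len(p) - 1}.

    p, p_tilde: lists of strictly positive probabilities summing to 1.
    event: iterable of indices describing the event E.
    lhs = log(P(E) / P~(E)); rhs = -(H(P~|P) + e^{-1}) / P~(E).
    """
    prob_e = sum(p[i] for i in event)
    prob_e_tilde = sum(p_tilde[i] for i in event)
    # Relative entropy H(P~|P) = sum_i P~(i) log(P~(i) / P(i))
    rel_entropy = sum(q * math.log(q / r) for q, r in zip(p_tilde, p))
    lhs = math.log(prob_e / prob_e_tilde)
    rhs = -(rel_entropy + math.exp(-1.0)) / prob_e_tilde
    return lhs, rhs
```

For instance, with `p = [0.5, 0.3, 0.2]`, `p_tilde = [0.2, 0.3, 0.5]` and `event = [2]`, the event is rare under the first law and typical under the second, and the left-hand side stays above the right-hand side as (3.20) requires.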

[Figure 1: a trajectory $S_n$ plotted against n from 0 to kℓ, with the blocks $B_1$, $B_2$, $B_3$, $B_4$, $B_{10}$, $B_{17}$ and the excursion lengths $L_0$, $L_1$, $L_2$, $L_3$ marked on the horizontal axis.]

Fig. 1. A typical trajectory contributing to the lower bound (3.25). In this example k = 22, ℓ = 8, $\mathcal{I}(\omega) = \{4, 10, 17\}$. Note that $S_n \ne 0$ for n in a block $B_j$ with $j \notin \{4, 10, 17\}$, except at the boundary with a block $B_j$ with $j \in \{4, 10, 17\}$, since by construction $S_n = 0$ at the boundaries of these blocks. Inside $B_j$, $j \in \mathcal{I}(\omega)$, the path moves with relative freedom, but it is bound to touch the line S = 0 approximately mℓ times (see the text for the choice of m)
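The bookkeeping of (3.26) can be made concrete on the example of Fig. 1. The helper below is a hypothetical illustration that computes the excursion lengths $L_r$, measured in blocks, from k and the set of selected blocks, using the conventions $j_0 = -1$ and $j_{|\mathcal{I}(\omega)|+1} = k$.

```python
def excursion_lengths(k, good_blocks):
    """Excursion lengths L_r of (3.26), in units of blocks.

    k: total number of blocks.
    good_blocks: indices j in {0, ..., k - 1} of the selected blocks.
    """
    # Pad with the conventional endpoints j_0 = -1 and j_{|I|+1} = k
    j = [-1] + sorted(good_blocks) + [k]
    return [j[r + 1] - j[r] - 1 for r in range(len(j) - 1)]
```

For k = 22 and selected blocks {4, 10, 17} this gives the four gaps 4, 5, 6 and 4, whose sum is the number k − 3 = 19 of blocks that the path crosses in long excursions.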


Remark 3.3. It is easy to check that, in the Gaussian case $\omega_1 \sim \mathcal{N}(0, 1)$, Theorem 2.1 holds with $c(\beta) = c_3\beta^{-2}$, $c_3$ a suitable positive constant. Indeed, in this case the estimate (2.2) holds for every $x \in \mathbb{R}$ (and R = 1/2) and therefore one can take $c_1 = 1$ in the proof of Theorem 3.1, so that (3.31) implies (3.7) with $C(\beta) = c_4\beta^2$. Of course, the same is true, up to constants, for every $\mathbb{P}$ such that inequality (2.2) holds uniformly for $x \in \mathbb{R}$.

Proof of Theorem 3.1 under Assumption C1. The proof proceeds as in case C2, up to the definition of the law $\tilde{\mathbb{P}}$, but in this case the law obtained by shifting $\mathbb{P}$ in general has an infinite entropy with respect to $\mathbb{P}$. Therefore, in this case we define $\tilde{\mathbb{P}}$ rather by tilting the law of the first ℓ variables:

$$\frac{d\tilde{\mathbb{P}}}{d\mathbb{P}}(\omega) = \frac{1}{z^{\ell}}\exp\left(u\sum_{n=1}^{\ell}\omega_n\right), \tag{3.32}$$

where u ≥ 0 will be chosen later and $z = z(u) = \mathbb{E}\,e^{u\omega_1}$. Let, for $\ell \in s\mathbb{N}$,

$$\psi_\ell(u) = \frac{1}{\ell}\,\tilde{\mathbb{E}}\log \hat Z_{\ell,\omega}(\beta; m, \varepsilon) \tag{3.33}$$

and observe that

$$\psi_\ell(0) = \frac{1}{\ell}\,\mathbb{E}\log \hat Z_{\ell,\omega}(\beta; m, \varepsilon).$$

Then, one has

Lemma 3.4. There exist $u_0(\beta) > 0$ and $c_0(\beta) > 0$, possibly depending on $\mathbb{P}$, such that for every 0 < β < ∞, 1 ≤ α < ∞ the following holds: for every m ∈ (0, 1], if ε ≤ m/2 and $0 \le u \le u_0$ we have

$$\psi_\ell(u) - \psi_\ell(0) \ge c_0\,\beta m u. \tag{3.34}$$

Lemma 3.4 will be proven below. To proceed with the proof of Theorem 3.1, we choose $u = 4\gamma(\beta, m)/(\beta m c_0)$ and notice that u is certainly smaller than $u_0$ if $m \le c_1(\beta)$ with $c_1$ sufficiently small (see (3.13)). Then, choosing $0 < m \le c_1$ and ε ≤ m/2, (3.34) implies that (3.15) is valid also in the present case. On the other hand, it is immediate to verify that (3.19) still holds, with R replaced by some $R(\beta, M)$ and $\gamma(\beta, m)$ by $\gamma(\beta, m)/c_0$. The rest of the proof proceeds exactly as in the case C2.

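Proof of Lemma 3.4. Note that

$$\partial_u\psi_\ell(u) = \frac{1}{\ell}\sum_{i=1}^{\ell}\tilde{\mathbb{E}}\left[\omega_i\log \hat Z_{\ell,\omega}(\beta; m, \varepsilon)\right] - \ell\,\xi\,\psi_\ell(u), \tag{3.35}$$

where $\xi = \xi(u) = \tilde{\mathbb{E}}\,\omega_1$. The first term in the right-hand side of (3.35) can be rewritten (with some abuse of notation) as

$$\frac{\xi}{\ell}\sum_{i=1}^{\ell}\tilde{\mathbb{E}}\left[\log \hat Z_{\ell,\omega}(\beta; m, \varepsilon)\Big|_{\omega_i = 0}\right] + \frac{\beta}{\ell}\sum_{i=1}^{\ell}\tilde{\mathbb{E}}\left[\omega_i\int_0^{\omega_i}dy\ \hat{\mathbf{E}}_{\ell,\omega}(\mathbf{1}_{S_i = 0})\Big|_{\omega_i = y}\right]$$

$$= \ell\,\xi\,\psi_\ell(u) + \frac{\beta}{\ell}\sum_{i=1}^{\ell}\tilde{\mathbb{E}}\left[(\omega_i - \xi)\int_0^{\omega_i}dy\ \hat{\mathbf{E}}_{\ell,\omega}(\mathbf{1}_{S_i = 0})\Big|_{\omega_i = y}\right], \tag{3.36}$$

where

$$\hat{\mathbf{E}}_{\ell,\omega}(\cdot) = \frac{\mathbf{E}\left[\,\cdot\ \exp\left(\sum_{i=1}^{\mathcal{N}_\ell}\beta\omega_{\tau_i}\right)\mathbf{1}_{S_\ell = 0}\,\mathbf{1}_{(\mathcal{N}_\ell/\ell) \in [m - \varepsilon, m + \varepsilon]}\right]}{\hat Z_{\ell,\omega}(\beta; m, \varepsilon)} \tag{3.37}$$

and obviously the first term in (3.36) cancels the second one in the right-hand side of (3.35). Next, observe that the following identity holds:

$$\hat{\mathbf{E}}_{\ell,\omega}(\mathbf{1}_{S_i = 0})\Big|_{\omega_i = y} = \frac{e^{\beta(y - y')}\,\hat{\mathbf{E}}_{\ell,\omega}(\mathbf{1}_{S_i = 0})\big|_{\omega_i = y'}}{1 + \left(e^{\beta(y - y')} - 1\right)\hat{\mathbf{E}}_{\ell,\omega}(\mathbf{1}_{S_i = 0})\big|_{\omega_i = y'}}. \tag{3.38}$$

Now recall (2.1), so that it is sufficient to consider y and y′ such that |y − y′| ≤ 2M, and from (3.38) we obtain

$$e^{-2\beta M}\,\hat{\mathbf{E}}_{\ell,\omega}(\mathbf{1}_{S_i = 0})\Big|_{\omega_i = y'} \le \hat{\mathbf{E}}_{\ell,\omega}(\mathbf{1}_{S_i = 0})\Big|_{\omega_i = y} \le e^{+2\beta M}\,\hat{\mathbf{E}}_{\ell,\omega}(\mathbf{1}_{S_i = 0})\Big|_{\omega_i = y'}. \tag{3.39}$$

We can use this inequality to bound below the last term in (3.36). We have in fact

$$\tilde{\mathbb{E}}\left[\omega_i\int_0^{\omega_i}dy\ \hat{\mathbf{E}}_{\ell,\omega}(\mathbf{1}_{S_i = 0})\Big|_{\omega_i = y}\right] \ge e^{-2\beta M}\,\eta\ \tilde{\mathbb{E}}\left[\hat{\mathbf{E}}_{\ell,\omega}(\mathbf{1}_{S_i = 0})\Big|_{\omega_i = 0}\right] \ge e^{-4\beta M}\,\eta\ \tilde{\mathbb{E}}\,\hat{\mathbf{E}}_{\ell,\omega}(\mathbf{1}_{S_i = 0}),$$

where $\eta = \eta(u) = \tilde{\mathbb{E}}\,\omega_1^2$, while

$$\tilde{\mathbb{E}}\int_0^{\omega_i}dy\ \hat{\mathbf{E}}_{\ell,\omega}(\mathbf{1}_{S_i = 0})\Big|_{\omega_i = y} \le M e^{2\beta M}\ \tilde{\mathbb{E}}\,\hat{\mathbf{E}}_{\ell,\omega}(\mathbf{1}_{S_i = 0}). \tag{3.40}$$

Therefore, recalling the constraint $(\mathcal{N}_\ell/\ell) \in [m - \varepsilon, m + \varepsilon]$ in the definition of $\hat{\mathbf{P}}_{\ell,\omega}(\cdot)$, one has the following lower bound:

$$\partial_u\psi_\ell(u) \ge \beta(m - \varepsilon)\,\eta\,e^{-4M\beta} - \beta(m + \varepsilon)\,\xi\,M e^{2\beta M}. \tag{3.41}$$

Now choose ε ≤ m/2 and notice that $\xi = u + O(u^2)$ for $u \ll 1$, while $\eta = 1 + O(u)$. Therefore, there exists $u_0(\beta, M) > 0$ such that, for $0 \le u \le u_0$ and for every m, the following holds:

$$\partial_u\psi_\ell(u) \ge \beta m\,\frac{e^{-4M\beta}}{4} - 2\beta m u\,M e^{2\beta M} \ge \beta m\,\frac{e^{-4M\beta}}{8} =: c_0\,\beta m. \tag{3.42}$$

An integration in u concludes the proof of (3.34).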
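The quantities z(u), ξ(u) and η(u) entering the proof of Lemma 3.4 are easy to evaluate for a bounded discrete law. The sketch below is an illustration under the assumption of a finitely supported $\omega_1$; the function name is hypothetical. It also returns the per-site relative entropy of the tilted law, $u\,\xi(u) - \log z(u)$, which follows directly from (3.32).

```python
import math

def tilted_moments(values, probs, u):
    """Moments of the law tilted by exp(u * omega) / z(u), as in (3.32).

    values, probs: support points and probabilities of omega_1 under P.
    Returns (z, xi, eta, entropy) with z = E exp(u omega_1),
    xi and eta the tilted first and second moments, and
    entropy = u * xi - log z, the relative entropy per tilted variable.
    """
    weights = [p * math.exp(u * v) for v, p in zip(values, probs)]
    z = sum(weights)
    tilted = [w / z for w in weights]
    xi = sum(v * q for v, q in zip(values, tilted))
    eta = sum(v * v * q for v, q in zip(values, tilted))
    entropy = u * xi - math.log(z)
    return z, xi, eta, entropy
```

For symmetric ±1 charges (so M = 1) this gives z = cosh u, ξ = tanh u and η = 1, consistent with the expansions $\xi = u + O(u^2)$ and $\eta = 1 + O(u)$ used above, and the relative entropy is of order $u^2$ for small u.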


3.2. The copolymer case. In order to prove Theorem 2.1 for the copolymer model (1.12), it is convenient to start from the observation that one can rewrite the limit free energy as

\[
f(\beta,h)=\lim_{N\to\infty}\frac1N\,\mathbb E\log\mathbf E\Big[\exp\Big(-\sum_{n=1}^{N}(\beta\omega_n+h)\Delta_n\Big)\mathbf 1_{S_N=0}\Big],
\tag{3.43}
\]

where $\Delta_n=\mathbf 1_{\mathrm{sign}(S_n)=-1}$. Comparing this expression for the copolymer free energy with (1.1), it is clear that in the present case the role of $\mathbf 1_{S_n=0}$ is played by $\Delta_n$. The proof then proceeds exactly as in the pinning case, with the only differences that in the definition (3.2) of $\varphi(\beta,m)$ the constraint $(\mathcal N_N/N)\in[m-\varepsilon,m+\varepsilon]$ has to be replaced by

\[
\frac{|\{1\le n\le N:\ \Delta_n=1\}|}{N}\in[m-\varepsilon,m+\varepsilon]
\tag{3.44}
\]

and that, in the energy–entropy argument, the path is required to satisfy $S_i>0$ (and not just $S_i\neq0$) whenever $i\in B_j$ with $j\notin I(\omega)$. This implies that $K(L)$ in (3.25) has to be replaced by $K(L)/2$, which has the effect of adding a negative term of order $O(p_\ell/\ell)$ in the lower bound (3.30), which is negligible for $\ell$ sufficiently large.

Appendix A. The Non–Disordered Pinning Model

For $\beta=0$, the free energy (1.7) may be identified explicitly with the following procedure. First we consider the equation

\[
\sum_{n\in\mathbb N}K(n)\exp(-bn)=\exp(h),
\tag{A.1}
\]

and we look for a solution $b>0$, which exists only if $\sum_{n\in\mathbb N}K(n)>\exp(h)$, in which case it is unique. Then if we set $\tilde K(n):=\exp(-h-bn)K(n)$, $\tilde K(\cdot)$ is a discrete probability density and one can write

\[
\mathbf E\big[\exp(-h\,\mathcal N_N);\ \tau_{\mathcal N_N}=N\big]
=\sum_{j=1}^{N}\ \sum_{\substack{(\ell_1,\dots,\ell_j)\in\mathbb N^j:\\ \sum_{i=1}^{j}\ell_i=N}}\ \prod_{i=1}^{j}\exp(-h)K(\ell_i)
=\exp(bN)\sum_{j=1}^{N}\ \sum_{\substack{(\ell_1,\dots,\ell_j)\in\mathbb N^j:\\ \sum_{i=1}^{j}\ell_i=N}}\ \prod_{i=1}^{j}\tilde K(\ell_i)
=:\exp(bN)\,G_N,
\tag{A.2}
\]

and one easily sees that $G_N$ is the probability that the random walk which starts at 0 and takes positive integer IID jumps with law $\tilde K(\cdot)$ hits the site $N$. It is a classical fundamental result of renewal theory that $\lim_N G_N=1/\sum_{n\in\mathbb N}n\tilde K(n)$ [11, Ch. XIII]. This of course implies that $b=f(0,h)$. On the other hand, if (A.1) admits no positive solution, by proceeding as in (A.1) and by setting simply $\tilde K(n):=\exp(-h)K(n)$, so that $\tilde K(\cdot)$ is a sub-probability density, one easily sees that $f(0,h)=0$. So Eq. (A.1) contains all the information about the free energy.
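The procedure of this appendix can be sketched numerically. The code below is a minimal illustration, not part of the paper: the kernel (a truncated power law with mass $K(\infty)=0.1$ at infinity), the parameter values and all function names are assumptions chosen for the example. It solves (A.1) for $b$ by bisection, builds the tilted law $\tilde K$, and computes the renewal mass function $G_N$ of (A.2) by the standard renewal recursion, so that the limit $G_N\to 1/\sum_n n\tilde K(n)$ can be observed.

```python
import math

def make_kernel(alpha=2.5, n_max=50, k_inf=0.1):
    """Hypothetical inter-arrival law: K(n) proportional to n^(-alpha) on 1..n_max,
    with total mass 1 - k_inf (k_inf plays the role of K(infinity)). K[n-1] = K(n)."""
    raw = [n ** (-alpha) for n in range(1, n_max + 1)]
    scale = (1.0 - k_inf) / sum(raw)
    return [scale * r for r in raw]

def solve_b(K, h, tol=1e-13):
    """Positive root b of sum_n K(n) exp(-b n) = exp(h), see (A.1); 0.0 if there is none."""
    target = math.exp(h)

    def lhs(b):
        return sum(k * math.exp(-b * n) for n, k in enumerate(K, start=1))

    if lhs(0.0) <= target:          # no positive solution: delocalized regime
        return 0.0
    lo, hi = 0.0, 1.0
    while lhs(hi) > target:         # lhs is decreasing in b, so bracket the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lhs(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tilted_kernel(K, h, b):
    """K~(n) = exp(-h - b n) K(n), a probability law when b solves (A.1)."""
    return [math.exp(-h - b * n) * k for n, k in enumerate(K, start=1)]

def hitting_probabilities(K_tilde, N):
    """G_m = probability that the renewal process with jump law K~ visits m, for m = 0..N."""
    G = [1.0] + [0.0] * N
    for m in range(1, N + 1):
        G[m] = sum(K_tilde[n - 1] * G[m - n] for n in range(1, min(m, len(K_tilde)) + 1))
    return G
```

With these definitions, `b = solve_b(make_kernel(), -0.5)` is the free energy $f(0,h)$ of the toy model at $h=-0.5$, which lies below $h_c(0)=\log(1-K(\infty))=\log 0.9$; for $h$ above that threshold `solve_b` returns `0.0`.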


Let us then observe that (A.1) has a positive solution if and only if $h<\log(1-K(\infty))$ and therefore $h_c(0)=\log(1-K(\infty))\le0$ ($h_c(\beta)$ being defined before Remark 1.1). The behavior at criticality can be extracted from (A.1) in a rather straightforward way too, but of course we need to make precise the requirement on $K(\cdot)$ beyond the lower bound (1.3):

• The case $\sum_{n\in\mathbb N}nK(n)<\infty$. This is a necessary and sufficient condition for the transition to be of first order. More precisely, for $\sum_{n\in\mathbb N}nK(n)\in(0,\infty]$,

\[
f(0,h)=\frac{\exp(h_c(0))}{\sum_{n\in\mathbb N}nK(n)}\,(h_c(0)-h)+o\big(h_c(0)-h\big),\qquad\text{for } h\nearrow h_c(0).
\tag{A.3}
\]

Formula (A.3) follows since by Dominated Convergence if $\sum_{n\in\mathbb N}nK(n)<\infty$ then

\[
\sum_{n\in\mathbb N}K(n)\exp(-bn)=\exp(h_c(0))-b\sum_{n\in\mathbb N}nK(n)+o(b)
\tag{A.4}
\]

for $b\searrow0$, while by a direct estimate $\lim_{b\searrow0}b^{-1}\sum_{n\in\mathbb N}K(n)[1-\exp(-bn)]=\infty$ if $\sum_{n\in\mathbb N}nK(n)=\infty$. Note that the condition $\sum_{n\in\mathbb N}nK(n)<\infty$ holds, in particular, in the case where

\[
K(n)=L(n)/n^{\alpha}
\tag{A.5}
\]

for large $n\in\mathbb N$, with $\alpha>2$ and $L(\cdot)$ a function varying slowly at infinity, i.e., a positive function such that $\lim_{r\to\infty}L(xr)/L(r)=1$ for every $x>0$ (see [3] for more details on slowly varying functions).

• The case $\sum_{n\in\mathbb N}nK(n)=\infty$. We set $\bar K(n)=\sum_{j=n+1}^{\infty}K(j)$ (by this we mean the sum over $j\in\mathbb N$ such that $j>n$: $K(\infty)$ is not included in the sum) and assume that

\[
\sum_{j=1}^{N}\bar K(j)=\widehat L(N)\,N^{2-\alpha},
\]

for some function $\widehat L(\cdot)$ which is slowly varying at infinity. This is true, in particular, in the case (A.5). By the easy (Abelian) part of the classical Tauberian Theorem [12, Ch. XIII.5, Theorem 2] we have that $\sum_{n=0}^{\infty}\exp(-bn)\bar K(n)=c\,b^{\alpha-2}\,\widehat L(1/b)(1+o(1))$ as $b\searrow0$, with $c=c(\alpha)>0$. Therefore

\[
\sum_{n=1}^{\infty}e^{-bn}K(n)=\bar K(0)+\big(e^{-b}-1\big)\sum_{n=0}^{\infty}e^{-bn}\bar K(n)
=\bar K(0)-c\,b^{\alpha-1}\,\widehat L(1/b)(1+o(1)),
\tag{A.6}
\]

and

\[
f(0,h)=(h_c(0)-h)^{1/(\alpha-1)}\,L\big(1/(h_c(0)-h)\big),\qquad\text{for } h\nearrow h_c(0),
\tag{A.7}
\]

with $L(\cdot)$ a slowly varying function (see [3, (1.5.1) and Theorem 1.5.12]). It is therefore clear that the transition is of second order for $\alpha\in(3/2,2]$ (we emphasize, for the case $\alpha=2$, that we are assuming that $\sum_{n\in\mathbb N}nK(n)=+\infty$) and it is of higher order for $\alpha<3/2$. The value $\alpha=3/2$ is borderline and the order of the transition depends then on the slowly varying function $\widehat L(\cdot)$. In the case of one dimensional symmetric random walks with IID increments taking values in $\{-1,0,+1\}$, $\alpha=3/2$ and $L(n)=c(1+o(1))$ for $n$ large, $c$ a positive constant, and therefore the transition is really of second order and not higher.


Acknowledgement. We would like to thank Bernard Derrida, Thomas Garel, Cécile Monthus and David Mukamel for interesting discussions. This research has been conducted in the framework of the GIP–ANR project JC05_42461 (POLINTBIO).

References

1. Aizenman, M., Wehr, J.: Rounding effects of quenched randomness on first–order phase transitions. Commun. Math. Phys. 130, 489–528, (1990)
2. Alexander, K. S., Sidoravicius, V.: Pinning of polymers and interfaces by random potentials. preprint (2005). http://arxiv.org/list/math.PR/0501028, 2005
3. Bingham, N. H., Goldie, C. M., Teugels, J. L.: Regular Variation. Cambridge: Cambridge University Press, 1987
4. Bodineau, T., Giacomin, G.: On the localization transition of random copolymers near selective interfaces. J. Stat. Phys. 117, 801–818, (2004)
5. Bolthausen, E., den Hollander, F.: Localization transition for a polymer near an interface. Ann. Probab. 25, 1334–1366, (1997)
6. Bovier, A., Külske, C.: There are no nice interfaces in (2 + 1)–dimensional SOS models in random media. J. Stat. Phys. 83, 751–759, (1996)
7. Chayes, J. T., Chayes, L., Fisher, D. S., Spencer, T.: Correlation Length Bounds for Disordered Ising Ferromagnets. Commun. Math. Phys. 120, 501–523 (1989)
8. Coluzzi, B.: Numerical study on a disordered model for DNA denaturation transition. Phys. Rev. E. 73, 011911, (2006)
9. Cule, D., Hwa, T.: Denaturation of heterogeneous DNA. Phys. Rev. Lett. 79, 2375–2378, (1997)
10. Derrida, B., Hakim, V., Vannimenus, J.: Effect of disorder on two–dimensional wetting. J. Stat. Phys. 66, 1189–1213 (1992)
11. Feller, W.: An introduction to probability theory and its applications. Vol. I, Third edition, New York–London–Sydney: John Wiley & Sons, Inc., 1968
12. Feller, W.: An introduction to probability theory and its applications. Vol. II, Second edition, New York–London–Sydney: John Wiley & Sons, Inc., 1971
13. Forgacs, G., Luck, J. M., Th. Nieuwenhuizen, M., Orland, H.: Wetting of a Disordered Substrate: Exact Critical behavior in Two Dimensions. Phys. Rev. Lett. 57, 2184–2187, (1986)
14. Garel, T., Huse, D. A., Leibler, S., Orland, H.: Localization transition of random chains at interfaces. Europhys. Lett. 8, 9–13, (1989)
15. Garel, T., Monthus, C.: Numerical study of the disordered Poland–Scheraga model of DNA denaturation. J. Stat. Mech., Theory and Experiments (2005), P06004
16. Giacomin, G.: Localization phenomena in random polymer models. Preprint, 2004; Available online: http://www.proba.jussieu.fr/pageperso/giacomin/pub/publicat.html, 2004
17. Giacomin, G., Toninelli, F. L.: Estimates on path delocalization for copolymers at selective interfaces. Probab. Theor. Rel. Fields 133, 464–482, (2005)
18. Giacomin, G., Toninelli, F. L.: The localized phase of disordered copolymers with adsorption. ALEA 1, 149–180, (2006)
19. Harris, A. B.: Effect of random defects on the critical behaviour of Ising models. J. Phys. C 7, 1671–1692, (1974)
20. Imry, Y., Ma, S.–K.: Random–Field Instability of the Ordered State of Continuous Symmetry. Phys. Rev. Lett. 35, 1399–1401, (1975)
21. Kafri, Y., Mukamel, D., Peliti, L.: Why is the DNA denaturation transition first order? Phys. Rev. Lett. 85, 4988–4991, (2000)
22. Kingman, J. F. C.: Subadditive ergodic theory. Ann. Probab. 1, 882–909, (1973)
23. Monthus, C.: On the localization of random heteropolymers at the interface between two selective solvents. Eur. Phys. J. B 13, 111–130, (2000)
24. Petrelis, N.: Polymer pinning at an interface. Preprint, 2005; available on: http://arxiv.org/list/math.PR/0504464, 2005
25. Sinai, Ya. G.: A random walk with a random potential. Theory Probab. Appl. 38, 382–385, (1993)
26. Soteros, C. E., Whittington, S. G.: The statistical mechanics of random copolymers. J. Phys. A: Math. Gen. 37, R279–R325, (2004)
27. Tang, L.–H., Chaté, H.: Rare–Event Induced Binding Transition of Heteropolymers. Phys. Rev. Lett. 86, 830–833, (2001)
28. Trovato, T., Maritan, A.: A variational approach to the localization transition of heteropolymers at interfaces. Europhys. Lett. 46, 301–306, (1999)

Communicated by H. Spohn

Commun. Math. Phys. 266, 17–35 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0009-1


Homogeneous Statistical Solutions and Local Energy Inequality for 3D Navier-Stokes Equations

Arnaud Basson

École Normale Supérieure, Département de Mathématiques et Applications, 45, rue d’Ulm, 75230 Paris Cedex 05, France. E-mail: [email protected]

Received: 12 July 2005 / Accepted: 3 October 2005
Published online: 14 April 2006 – © Springer-Verlag 2006

Abstract: We are interested in space-time spatially homogeneous statistical solutions of Navier-Stokes equations in space dimension three. We first review the construction of such solutions, and introduce convenient tools to study the pressure gradient. Then we show that given a spatially homogeneous initial measure with finite energy density, one can construct a homogeneous statistical solution concentrated on weak solutions which satisfy the local energy inequality.

Introduction

In this paper, we consider statistical solutions for three dimensional Navier-Stokes equations, which were introduced by Hopf ([Hop52]) as a mathematical model intended to provide the statistical theory of turbulent flows with rigorous foundations based on Navier-Stokes equations. The study of these solutions has been successfully undertaken by Foias ([Foi72]). More precisely, we are interested in space-time spatially homogeneous statistical solutions (which correspond to homogeneous turbulence), in the sense of Vishik and Fursikov ([VF77, VF78, FT80, VF88]; cf. also [FMRT01]). Let us recall in a few words the definition of such solutions. We start with the Navier-Stokes equations in $\mathbb R^3$:

\[
\begin{cases}
\partial_t u-\Delta u+\nabla\cdot(u\otimes u)+\nabla p=0\\
\nabla\cdot u=0\\
u(t=0,x)=u_0(x)
\end{cases}
\qquad(t>0,\ x\in\mathbb R^3).
\tag{1}
\]

Here the initial datum $u_0$ will only be assumed to belong to the space $L^2_{loc}(\mathbb R^3)$ (and of course to be weakly divergence-free). Choose a $T>0$. A statistical solution of (1) is a Borel probability measure $P$ on the space $L^2_{loc}([0,T]\times\mathbb R^3)$ such that $P$-almost every $u\in L^2_{loc}([0,T]\times\mathbb R^3)$ is a weak


solution of (1), i.e. $u$ is weakly divergence-free and for all $\phi\in\mathcal D(]0,T[\times\mathbb R^3)$ with $\nabla\cdot\phi=0$, we have

\[
\iint\big(\partial_t\phi\cdot u+\Delta\phi\cdot u+\nabla\otimes\phi\cdot u\otimes u\big)\,dx\,dt=0.
\]

For statistical solutions, we deal with the initial conditions in the following way. The initial velocity $u_0\in L^2_{loc}(\mathbb R^3)$ becomes a random variable, and we only focus on its law $\mu_0$, which we will call the initial probability measure: it is a probability measure on the space $L^2_{loc}(\mathbb R^3)_\sigma$ of divergence-free locally square integrable vector fields on $\mathbb R^3$. To be specific, let $\mu_0$ be a Borel probability measure on $L^2_{loc}(\mathbb R^3)_\sigma$; then $P$ is a statistical solution of (1) with initial datum $\mu_0$ if moreover we have $P\circ\gamma_0^{-1}=\mu_0$, where $\gamma_0$ is the restriction at time $t=0$: $(\gamma_0u)(x)=u(t=0,x)$. This means formally that on the probability space $(L^2_{loc}(\mathbb R^3)_\sigma,P)$ (equipped with the Borel $\sigma$-algebra), the random variable $u(t=0)$ has law $\mu_0$. We shall see later how to give a rigorous formulation of this fact, i.e. a precise meaning to the expression $P\circ\gamma_0^{-1}$.

A measure $\mu$ on $L^2_{loc}(\mathbb R^3)$ will be said to be spatially homogeneous, or briefly homogeneous, if it is invariant by spatial translations: $\mu(\tau_h\omega)=\mu(\omega)$ for all $h\in\mathbb R^3$ and all Borel subsets $\omega$ of $L^2_{loc}(\mathbb R^3)$. Here, $\tau_h$ is the translation in $x$-space defined by $\tau_hu(x)=u(x+h)$; and we set $\tau_h\omega=\{\tau_hu\}_{u\in\omega}$. Equivalently, for all $F\in L^1(\mu)$, we have $\int F(\tau_hu)\,d\mu(u)=\int F(u)\,d\mu(u)$. Similarly, in the time-dependent case, we set $\tau_hu(t,x)=u(t,x+h)$, and a measure $P$ on $L^2_{loc}([0,T]\times\mathbb R^3)$ which is invariant by these space translations $\tau_h$ ($h\in\mathbb R^3$) is said to be (spatially) homogeneous.

Our main result (see Theorem 2.1) shows that under appropriate hypotheses, when given a spatially homogeneous measure $\mu_0$ on $L^2_{loc}(\mathbb R^3)_\sigma$, one can construct a homogeneous statistical solution $P$ with initial datum $\mu_0$, concentrated on weak solutions of (1) which are suitable in the sense of Caffarelli, Kohn and Nirenberg ([CKN82]), i.e.
which satisfy a local energy inequality (and also satisfy local integrability properties which are easily checked in our situation). As a consequence, for almost every initial velocity $u_0\in L^2_{loc}(\mathbb R^3)_\sigma$ (according to $\mu_0$), there exists a suitable weak solution $u$ of the Cauchy problem (1). The main feature of suitable solutions and motivation for their study is the partial regularity property they enjoy: Caffarelli, Kohn and Nirenberg have shown that any suitable solution is locally bounded on the complement in $]0,T[\times\mathbb R^3$ of a closed set $S$ (called the set of singular points) such that $\mathcal H^1(S)=0$ (where $\mathcal H^1$ is the one dimensional Hausdorff measure in $\mathbb R^4$), see [CKN82].

We now recall the basic property enjoyed by homogeneous measures:

Proposition 0.1 ([VF77, FT80]). Let $\mu$ be a homogeneous measure on $L^2_{loc}(\mathbb R^3)$, and $G$ a mapping from $L^2_{loc}(\mathbb R^3)$ to $L^1_{loc}(\mathbb R^3)$, which commutes with the translations $\tau_h$. Assume that the functionals

\[
u\mapsto\int_QG(u)(x)\,dx\qquad\text{and}\qquad u\mapsto\int_Q|G(u)(x)|\,dx
\]

are $\mu$-integrable for any $Q=[a_1,b_1]\times\cdots\times[a_3,b_3]$. Then there exists a constant $\bar G$ such that for all $w\in L^1(\mathbb R^3)$, the functional $\int|G(u)(x)|\,|w(x)|\,dx$ is finite $\mu$-almost everywhere, $P$-integrable and

\[
\int\!\!\int_{\mathbb R^3}G(u)(x)\,w(x)\,dx\,d\mu(u)=\bar G\int_{\mathbb R^3}w(x)\,dx.
\]

This number $\bar G$ is usually denoted by $\int G(u)(x)\,d\mu(u)$, the variable $x$ being irrelevant. An analogous result is valid for homogeneous measures on $L^2_{loc}([0,T]\times\mathbb R^3)$.

In the sequel, we will always assume that the initial measure $\mu_0$ satisfies $e(\mu_0)=\int\!\int_{[0,1]^3}|u(x)|^2\,dx\,d\mu_0(u)<\infty$, which can be rewritten $e(\mu_0)=\int|u(x)|^2\,d\mu_0(u)<\infty$. The number $e(\mu_0)$ is called the mean energy density of $\mu_0$. Let us introduce a few notations. We will make a constant use of the following functional spaces:

\[
L^2(r)=\Big\{u\in L^2_{loc}(\mathbb R^3)\ \Big|\ \int\frac{|u(x)|^2}{(1+|x|^2)^r}\,dx=\|u\|^2_{L^2(r)}<\infty\Big\},
\]

and $H^1(r)=\{u\in L^2(r)\mid\nabla u\in L^2(r)\}$ normed by $\|u\|^2_{H^1(r)}=\|u\|^2_{L^2(r)}+\|\nabla u\|^2_{L^2(r)}$, with $r>3/2$ (so that they contain the constant functions). We recall that a consequence of Proposition 0.1 is $\mu_0(L^2(\mathbb R^3))=\mu_0(\{0\})$, and $\mu_0(L^2(r))=1$ (cf. [VF77]). The space of $C^\infty$ functions compactly supported in an open set $U$ will be denoted by $\mathcal D(U)$. When $E$ is any space of functions depending on $x$, we will often denote the space $L^p([0,T],E)$ by $L^pE$. In the notations, we do not distinguish between spaces of scalar functions and spaces of vector fields. The index $\sigma$ will denote spaces of divergence-free vector fields. The scalar product in $L^2$ will be denoted by $\langle\cdot,\cdot\rangle$. The open ball of center 0 and radius $R$ in $\mathbb R^3$ will be denoted by $B_R$. The values of the constants denoted by $C$, $C'$, $C_k$, etc. which will appear in our calculations may change from line to line. Finally, we will use the summation convention on repeated indices.

In the first section, we deal with the construction of a spatially homogeneous statistical solution $P$ given a homogeneous initial measure $\mu_0$, with a particular emphasis on new estimates (see Propositions 1.2 and 1.3), which will enable us to show in the second section that $P$-almost every solution $u$ of (1) satisfies the local energy inequality (Theorem 2.1).

1. Construction of the Statistical Solutions

Let $\mu_0$ be a spatially homogeneous probability measure on $L^2_{loc}(\mathbb R^3)_\sigma$, with finite mean energy density $e(\mu_0)$. We want to show the existence of a homogeneous statistical solution $P$ of (1) with initial datum $\mu_0$. In order to do so, we first consider regularized periodic Navier-Stokes equations, with a period $2l$ designed to grow to $+\infty$. We show the existence of a statistical solution $P_l$ for these equations, then we prove convenient estimates.
This enables us to let $l$ tend to infinity to get $P$, the statistical solution of (1) we are seeking, as a limit of the $P_l$'s. This proof was first done by Vishik and Fursikov [VF78, VF88], so we only sketch it briefly, detailing only some new estimates for the measures $P_l$ and $P$.

1.1. Regularized periodic equation. We use the following regularization of Navier-Stokes equations on a periodic domain:

\[
\begin{cases}
\partial_t u-\Delta u+\nabla\cdot[(u*\omega_l)\otimes u]+\nabla p=0\\
\nabla\cdot u=0\\
u(t=0)=u_0
\end{cases}
\qquad(t>0,\ x\in\mathbb R^3),
\tag{2}
\]


where $u_0$ is a square integrable divergence-free periodic vector field, with period $2l$ ($l\in\mathbb N^*$), and $\omega_l$ is a standard mollifier: $\omega_l(x)=l^3\omega_1(lx)$, with $\omega_1$ a $C^\infty$ nonnegative function of integral equal to 1, with compact support in the ball $B_1\subset\mathbb R^3$. This regularization, which is different from that used by Vishik and Fursikov, is much more convenient for what follows than a Fourier truncation. We will denote the space of square integrable $2l$-periodic functions by $L^2_{per}([-l,l]^3)$. The Cauchy problem (2) is well posed: we have existence and uniqueness of a global-in-time regular solution $u$, so we can consider the operator $S_l:L^2_{per}([-l,l]^3)_\sigma\to C^0([0,T],L^2_{per}([-l,l]^3))$, which maps the initial velocity $u_0$ to the associated solution $u$. To define a statistical solution of (2), we need to approximate the initial probability distribution $\mu_0$ by measures $\mu_0^l$ concentrated on periodic functions. Vishik and Komech have proved the following result:

Proposition 1.1 ([VK88]). Let $\mu_0$ be a spatially homogeneous probability measure on $L^2_{loc}(\mathbb R^3)_\sigma$, such that $e(\mu_0)<\infty$. There exists a sequence of spatially homogeneous probability measures $\mu_0^l$, concentrated on $L^2_{per}([-l,l]^3)_\sigma$, with finite mean energy density: $e(\mu_0^l)\le e(\mu_0)$, which converge to $\mu_0$ in the following weak sense: for all $n\in\mathbb N^*$, all bounded continuous functions $f$ on $\mathbb R^n$ and all $\varphi_1,\dots,\varphi_n\in\mathcal D(\mathbb R^3)$, we have

\[
\int f\big(\langle u,\varphi_1\rangle,\dots,\langle u,\varphi_n\rangle\big)\,d\mu_0^l(u)\xrightarrow[l\to+\infty]{}\int f\big(\langle u,\varphi_1\rangle,\dots,\langle u,\varphi_n\rangle\big)\,d\mu_0(u).
\]

We define a probability measure on $L^2_{loc}([0,T]\times\mathbb R^3)$ by $P_l=\mu_0^l\circ S_l^{-1}$. It is clear that $P_l$ is a spatially homogeneous statistical solution of (2), with initial measure $\mu_0^l$ (we have in particular $\int F(u(0))\,dP_l(u)=\int F(u_0)\,d\mu_0^l(u_0)$ for $F\in L^1(\mu_0^l)$). It also satisfies the following basic estimate: there exists a constant $C$ independent of $l$ and $t\in[0,T]$ such that

\[
\int\Big(|u(t,x)|^2+\int_0^T\big(|u(s,x)|^2+|\nabla\otimes u(s,x)|^2\big)\,ds\Big)\,dP_l(u)\le C\,e(\mu_0^l)\le C\,e(\mu_0),
\tag{3}
\]

which implies that $\int\big(\|u(t)\|^2_{L^2(r)}+\|u\|^2_{L^2H^1(r)}\big)\,dP_l\le C\,e(\mu_0^l)$ (recall that $r>3/2$), where the constant $C$ depends only on $r$. Thus $P_l$ is concentrated on $\bigcap_{r>3/2}L^2H^1(r)$. The reader is referred to [VF88] for the details. We emphasize the fact that the “statistical energy estimate” (3), which will remain true for the statistical solution $P$ we will construct, has no known equivalent for individual solutions of the Cauchy problem (1).
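The scaling $\omega_l(x)=l^3\omega_1(lx)$ used in (2) keeps the integral equal to 1 while shrinking the support to the ball of radius $1/l$. The sketch below illustrates this numerically; it assumes a particular radial bump for $\omega_1$ (the classical $\exp(-1/(1-|x|^2))$ profile, which the paper does not specify), and the function names are hypothetical.

```python
import math

def bump(r):
    """Unnormalised smooth radial bump supported in |x| < 1 (an assumed choice of profile)."""
    return math.exp(-1.0 / (1.0 - r * r)) if r < 1.0 else 0.0

def radial_integral(f, radius, n=20000):
    """Midpoint rule for the integral over R^3 of a radial function supported in |x| < radius."""
    dr = radius / n
    total = 0.0
    for j in range(n):
        r = (j + 0.5) * dr
        total += 4.0 * math.pi * r * r * f(r) * dr   # surface of the sphere times profile
    return total

# Constant making omega_1 integrate to 1 (computed with the same quadrature rule).
NORMALISATION = 1.0 / radial_integral(bump, 1.0)

def omega(l, r):
    """omega_l(x) = l^3 * omega_1(l x), evaluated at |x| = r."""
    return l ** 3 * NORMALISATION * bump(l * r)
```

For instance, `radial_integral(lambda r: omega(4, r), 0.25)` stays equal to 1 up to rounding, while `omega(4, r)` vanishes for `r >= 0.25`.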

1.2. How to compute the pressure gradient? In this paragraph and the following one, we will show that for any positive $\alpha$, we have $\int\|u\|_{L^\infty L^2(2+\alpha)}\,dP_l(u)\le C_\alpha$. This new estimate relies on a better understanding of the structure of the pressure term $\nabla p$ when the velocity $u$ does not vanish at infinity. Formally, the pressure satisfies $\nabla p=\nabla R_iR_j(u_iu_j)$, where $R_i$ is the $i$-th Riesz transform, which is a non-local operator. We will split the gradient $\nabla p$ into a “semilocal” term $\nabla p_1$, and a (non-local) smooth bounded term $\nabla p_2$. This decomposition of the pressure is a basic tool in our work, for it is very convenient to handle the pressure gradient in Eqs. (1) and (2). In this paragraph, the function $u$ is not supposed to be a solution of any equation, unless otherwise stated.

Let $\varphi\in\mathcal D(\mathbb R^3)$ be a non-negative function, supported in $B_2$, such that $\varphi(x)=1$ on $B_1$. Set $\psi_k(x)=\varphi(2^{-k-1}x)-\varphi(2^{-k}x)$ (so that $1=\varphi(x)+\sum_{k\ge0}\psi_k(x)$) and $\chi_k(x)=\varphi(2^{-k-3}x)-\varphi(2^{-k+2}x)$. Notice that $\psi_k$ is supported in $B_{2^{k+2}}\setminus B_{2^k}$, and $\chi_k$ in $B_{2^{k+4}}\setminus B_{2^{k-2}}$. We define

\[
p_1(x)=\varphi(x/8)\,R_iR_j(\varphi u_iu_j)+\sum_{k=0}^{+\infty}\chi_k(x)\,R_iR_j(\psi_ku_iu_j),
\tag{4}
\]

and

\[
\nabla p_2=\nabla\big[(1-\varphi(x/8))\,R_iR_j(\varphi u_iu_j)\big]+\sum_{k=0}^{+\infty}\nabla\big[(1-\chi_k)\,R_iR_j(\psi_ku_iu_j)\big].
\tag{5}
\]
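The support properties behind (4) and (5) can be checked on a concrete radial cutoff. The sketch below is an illustration under assumptions: the specific smooth step used to build $\varphi$ and all function names are hypothetical, and only the radial variable $r=|x|$ is tracked. It verifies the telescoping identity $\varphi+\sum_{k\le K}\psi_k=\varphi(2^{-K-1}\,\cdot)$ and the fact that $\chi_k=1$ on the support of $\psi_k$, which is what makes $(1-\chi_k)\psi_k$ vanish.

```python
import math

def _g(t):
    """exp(-1/t) for t > 0, extended by 0: the usual building block of smooth cutoffs."""
    return math.exp(-1.0 / t) if t > 0.0 else 0.0

def _step(s):
    """Smooth step: 0 for s <= 0, 1 for s >= 1."""
    return _g(s) / (_g(s) + _g(1.0 - s))

def phi(r):
    """Radial cutoff: equal to 1 for r <= 1 and to 0 for r >= 2 (assumed profile)."""
    return 1.0 - _step(r - 1.0)

def psi(k, r):
    """psi_k(x) = phi(2^{-k-1} x) - phi(2^{-k} x), as a function of r = |x|."""
    return phi(2.0 ** (-k - 1) * r) - phi(2.0 ** (-k) * r)

def chi(k, r):
    """chi_k(x) = phi(2^{-k-3} x) - phi(2^{-k+2} x), as a function of r = |x|."""
    return phi(2.0 ** (-k - 3) * r) - phi(2.0 ** (-k + 2) * r)

def partition_defect(K=6, points=12000, step=0.01):
    """Largest deviation of phi + sum_{k<=K} psi_k from 1 on 0 <= r <= points*step <= 2^{K+1}."""
    worst = 0.0
    for j in range(points + 1):
        r = j * step
        total = phi(r) + sum(psi(k, r) for k in range(K + 1))
        worst = max(worst, abs(total - 1.0))
    return worst
```

Since $\psi_k$ lives in the shell $2^k\le|x|\le2^{k+2}$ and $\chi_k$ equals 1 on the larger shell $2^{k-1}\le|x|\le2^{k+3}$, the kernel of $R_iR_j$ is only ever evaluated away from its singularity in the terms defining $\nabla p_2$.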

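We now show that $p_1$ and $\nabla p_2$ are well-behaved when $u$ lies in an appropriate functional space. To obtain an estimate for $p_1$, we use the fact that the sum over $k$ in (4) is uniformly locally finite, and the continuity of the Riesz transform on $L^{3/2}(\mathbb R^3)$:

\[
\begin{aligned}
\int\frac{|p_1(x)|^{3/2}}{(1+|x|^2)^r}\,dx
&\le C\int\big|\varphi(\tfrac x8)R_iR_j(\varphi u_iu_j)\big|^{3/2}\frac{dx}{(1+|x|^2)^r}
+C\sum_{k=0}^{+\infty}\int\big|\chi_k(x)R_iR_j(\psi_ku_iu_j)\big|^{3/2}\frac{dx}{(1+|x|^2)^r}\\
&\le C\int\big|R_iR_j(\varphi u_iu_j)\big|^{3/2}dx+C\sum_{k=0}^{+\infty}2^{-2rk}\int\big|R_iR_j(\psi_ku_iu_j)\big|^{3/2}dx\\
&\le C\int|u|^3\Big(\varphi^{3/2}+\sum_{k=0}^{+\infty}2^{-2rk}\psi_k^{3/2}\Big)dx\\
&\le C\int\frac{|u|^3}{(1+|x|^2)^r}\,dx.
\end{aligned}
\]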

We now deal with $p_2$. Let $K_{ij}$ be the kernel of the operator $R_iR_j$, which satisfies $|K_{ij}(x)|\le C/|x|^3$ and $|\nabla K_{ij}(x)|\le C/|x|^4$. We notice that for all $k$, we have $(1-\chi_k(x))\psi_k(y)=0$ and $(1-\varphi(x/8))\varphi(y)=0$, provided $|x-y|\le(1+|y|)/16$. Take an $x$ such that $1-\chi_k(x)\neq0$. Then $\psi_k(y)=0$ on the ball $|x-y|<1/16$, so that $x$ does not belong to the support of $\psi_ku_iu_j$. Hence we can write, for such an $x$,

\[
R_iR_j(\psi_ku_iu_j)(x)=\int K_{ij}(x-y)\psi_k(y)u_i(y)u_j(y)\,dy
=\int_{|x-y|\ge\frac{1+|y|}{16}}K_{ij}(x-y)\psi_k(y)u_i(y)u_j(y)\,dy.
\]

The integral in the r.h.s. remains well defined when $1-\chi_k(x)=0$. Thus for all $x\in\mathbb R^3$, we have

\[
(1-\chi_k(x))\,R_iR_j(\psi_ku_iu_j)(x)=(1-\chi_k(x))\int_{|x-y|\ge\frac{1+|y|}{16}}K_{ij}(x-y)\psi_k(y)u_i(y)u_j(y)\,dy.
\]

A similar result holds for $(1-\varphi(x/8))R_iR_j(\varphi u_iu_j)$. Then we write

\[
\begin{aligned}
\nabla\big[(1-\chi_k)R_iR_j(\psi_ku_iu_j)\big]
={}&-\nabla\chi_k(x)\int_{|x-y|\ge\frac{1+|y|}{16}}K_{ij}(x-y)\psi_k(y)u_i(y)u_j(y)\,dy\\
&+(1-\chi_k)\int_{|x-y|\ge\frac{1+|y|}{16}}\nabla K_{ij}(x-y)\psi_k(y)u_i(y)u_j(y)\,dy.
\end{aligned}
\]

We can estimate the $L^\infty$ norm of this quantity:

\[
\big\|\nabla[(1-\chi_k)R_iR_j(\psi_ku_iu_j)]\big\|_\infty\le C2^{-k}\int\frac{\psi_k(y)|u(y)|^2}{(1+|y|)^3}\,dy+C\int\frac{\psi_k(y)|u(y)|^2}{(1+|y|)^4}\,dy.
\]

Here the constants $C$ do not depend on $k$, hence both terms in the r.h.s. have the same magnitude, for when $y$ belongs to the support of $\psi_k$, we have $2^{-k}(1+|y|)^{-3}\approx(1+|y|)^{-4}\approx C2^{-4k}$. Thus after summation over $k$ we obtain:

\[
\|\nabla p_2\|_\infty\le C\int\frac{|u(y)|^2}{(1+|y|)^4}\,dy.
\]

We sum up these estimates in the following proposition.

Proposition 1.2. Let $p_1$ and $\nabla p_2$ be defined by (4) and (5). The following estimates are valid ($r>\frac32$):

\[
\int\frac{|p_1(x)|^{3/2}}{(1+|x|^2)^r}\,dx\le C\int\frac{|u(x)|^3}{(1+|x|^2)^r}\,dx,
\tag{6}
\]

and

\[
\|\nabla p_2\|_\infty\le C\|u\|^2_{L^2(2)}.
\tag{7}
\]

To deal with the regularized periodic system (2), we define analogous quantities $p_1^l$ and $\nabla p_2^l$, just replacing $u_iu_j$ by $(u_i*\omega_l)u_j$:

\[
p_1^l=\varphi(x/8)\,R_iR_j\big(\varphi(u_i*\omega_l)u_j\big)+\sum_{k=0}^{+\infty}\chi_k(x)\,R_iR_j\big(\psi_k(u_i*\omega_l)u_j\big),
\tag{8}
\]

\[
\nabla p_2^l=\nabla\big[(1-\varphi(x/8))\,R_iR_j\big(\varphi(u_i*\omega_l)u_j\big)\big]+\sum_{k=0}^{+\infty}\nabla\big[(1-\chi_k)\,R_iR_j\big(\psi_k(u_i*\omega_l)u_j\big)\big].
\tag{9}
\]

The estimates (6) and (7) remain valid in this case (with constants $C$ independent of $l$), and it is easily checked that when $u$ is any solution of (2), the associated pressure satisfies $\nabla p=\nabla R_iR_j((u_i*\omega_l)u_j)=\nabla p_1^l+\nabla p_2^l$.


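1.3. Statistical estimate of the $L^\infty L^2(2+\alpha)$ norm. We are now able to control the $L^\infty L^2(2+\alpha)$ norm of the solutions $u$ of the regularized periodic Navier-Stokes equations (2) for $\alpha>0$, uniformly with respect to $l$. This control will provide a statistical estimate of $\|u\|_{L^\infty L^2(2+\alpha)}$ independent of $l$, i.e. a bound for $\int\|u\|_{L^\infty L^2(2+\alpha)}\,dP_l(u)$. In order to do so, multiply Eq. (2) by $(1+|x|^2)^{-2-\alpha}u$ and integrate over $\mathbb R^3$ to get

\[
\begin{aligned}
\frac{d}{dt}\int\frac{|u|^2\,dx}{2(1+|x|^2)^{2+\alpha}}
={}&-\int\frac{|\nabla\otimes u|^2\,dx}{(1+|x|^2)^{2+\alpha}}+\int\frac{|u|^2}{2}\,\Delta(1+|x|^2)^{-2-\alpha}\,dx\\
&+\int\frac{|u|^2}{2}\,(u*\omega_l)\cdot\nabla(1+|x|^2)^{-2-\alpha}\,dx-\int\frac{u\cdot\nabla p\,dx}{(1+|x|^2)^{2+\alpha}}\\
\le{}&C\int\frac{|u|^2\,dx}{(1+|x|^2)^{2+\alpha}}+C\int\frac{|u|^3\,dx}{(1+|x|^2)^{\frac52+\alpha}}-\int\frac{u\cdot\nabla p\,dx}{(1+|x|^2)^{2+\alpha}}.
\end{aligned}
\]

We deal with the last term of the r.h.s., writing $\nabla p=\nabla p_1^l+\nabla p_2^l$, and using inequalities (6) and (7) (more precisely, their equivalents in the periodic case):

\[
\begin{aligned}
\Big|\int\frac{u\cdot\nabla p_1^l\,dx}{(1+|x|^2)^{2+\alpha}}\Big|
&=\Big|\int p_1^l\,u\cdot\nabla(1+|x|^2)^{-2-\alpha}\,dx\Big|\\
&\le C\Big(\int\frac{|p_1^l|^{\frac32}\,dx}{(1+|x|^2)^{\frac52+\alpha}}\Big)^{\frac23}\Big(\int\frac{|u|^3\,dx}{(1+|x|^2)^{\frac52+\alpha}}\Big)^{\frac13}\\
&\le C\int\frac{|u|^3\,dx}{(1+|x|^2)^{\frac52+\alpha}},
\end{aligned}
\]

and

\[
\Big|\int\frac{u\cdot\nabla p_2^l\,dx}{(1+|x|^2)^{2+\alpha}}\Big|
\le\|\nabla p_2^l\|_\infty\int\frac{|u|\,dx}{(1+|x|^2)^{2+\alpha}}
\le C\int\frac{|u|^2\,dx}{(1+|x|^2)^2}\Big(\int\frac{|u|^2\,dx}{(1+|x|^2)^{2+\alpha}}\Big)^{\frac12}.
\]

Let $W(t)=\int|u(t,x)|^2(1+|x|^2)^{-2-\alpha}\,dx=\|u(t)\|^2_{L^2(2+\alpha)}$. We obtain

\[
\frac{dW}{dt}\le CW+C\int\frac{|u|^3\,dx}{(1+|x|^2)^{\frac52+\alpha}}+C\sqrt W\int\frac{|u|^2\,dx}{(1+|x|^2)^2}
\]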

(the constants $C$ do not depend on $u$ of course, nor on the period $l$). Notice that we have the same control for the term involving $p_1$ as for the term related to the nonlinearity in the equation, $\nabla\cdot((u*\omega_l)\otimes u)$; we will encounter this feature again on several occasions. To obtain a Gronwall-type inequality, we still have to deal with the second term in the r.h.s. of the previous inequality. Apply the Cauchy-Schwarz inequality, with $\frac52+\alpha=\frac{2+\alpha}{2}+\frac{3+\alpha}{2}$:

\[
\int\frac{|u|^3\,dx}{(1+|x|^2)^{\frac52+\alpha}}
\le\Big(\int\frac{|u|^2\,dx}{(1+|x|^2)^{2+\alpha}}\Big)^{\frac12}\Big(\int\frac{|u|^4\,dx}{(1+|x|^2)^{3+\alpha}}\Big)^{\frac12}
=\sqrt W\,\big\|u(1+|x|^2)^{-\frac{3+\alpha}{4}}\big\|_4^2.
\]

Bound the $L^4$ norm using Sobolev inequality to get

\[
\begin{aligned}
\int\frac{|u|^3\,dx}{(1+|x|^2)^{\frac52+\alpha}}
&\le C\sqrt W\Big(\int\frac{|u|^2\,dx}{(1+|x|^2)^{\frac32+\frac\alpha2}}+\int\big|\nabla\big(u(1+|x|^2)^{-\frac{3+\alpha}{4}}\big)\big|^2dx\Big)\\
&\le C\sqrt W\int\frac{|u|^2+|\nabla\otimes u|^2}{(1+|x|^2)^{\frac32+\frac\alpha2}}\,dx.
\end{aligned}
\]
So far, we have obtained: √ dW ≤ CW + C W dt

|u|2 + |∇ ⊗ u|2 (1 + |x|2 )

3 α 2+2

dx =

√

W A(t),

√ , from what we can deduce where A(t) = C W (t) + Cu(t)2 2 L [0,T ],H 1 23 + α2 √ √ √ t W (t) ≤ W (0) + 0 A(s) ds. Hence, u L ∞ ([0,T ],L 2 (2+α)) = sup0≤t≤T W (t) ≤ T √ W (0) + 0 A(s) ds. We now integrate with respect to the measure Pl , recalling that l µ0 = Pl ◦ γ0−1 :

u L ∞ ([0,T ],L 2 (2+α)) d Pl (u) ≤

u 0 L 2 (2+α) dµl0 (u 0 ) +

T

A(s) ds d Pl (u).

0

The last term is easily controlled, thanks to the statistical energy estimate (3):

T

A(s) ds d Pl (u) ≤ C T + C

0

u2 2 L

[0,T ],H 1

3 α 2+2

d Pl (u)

≤ C(1 + e(µ0 )),

so that we have proved the following. Proposition 1.3. The statistical solutions Pl of the regularized periodic system (2) satisfy the estimate: u L ∞ ([0,T ],L 2 (2+α)) d Pl (u) ≤ C(1 + e(µ0 )), for any positive α, where C depends only on T and α. The main point in this result is the fact that C does not depend on l.

(10)
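The Gronwall-type step used above, passing from $dW/dt\le\sqrt W\,A(t)$ to $\sqrt{W(t)}\le\sqrt{W(0)}+\int_0^tA$, can be illustrated on the extremal case where the differential inequality is an equality. In that case $\frac{d}{dt}\sqrt W=A/2$, so the bound holds with room to spare. The sketch below is only a numerical illustration; the forcing `A`, the initial value and the function name are assumptions made for the example.

```python
import math

def integrate_extremal_case(w0, A, T, steps=20000):
    """Explicit Euler scheme for dW/dt = sqrt(W) * A(t), W(0) = w0 > 0.
    Returns (W(T), left Riemann sum of A over [0, T])."""
    dt = T / steps
    w = w0
    int_A = 0.0
    for k in range(steps):
        t = k * dt
        w += dt * math.sqrt(w) * A(t)   # one Euler step of the extremal ODE
        int_A += dt * A(t)              # running approximation of the integral of A
    return w, int_A
```

With `A = lambda t: 1.0 + t * t`, `w0 = 0.25` and `T = 1.0`, the computed $\sqrt{W(T)}$ is close to $\sqrt{W(0)}+\frac12\int_0^TA=0.5+2/3$, which is below the bound $\sqrt{W(0)}+\int_0^TA$ used in the text.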


1.4. Passage to the limit. Following Vishik and Fursikov, we need an estimate for $\partial_tu$: we will establish bounds for $\|\partial_tu\|_{H^{-s}(B_N)}$ for any $N$, where $B_N$ is the ball $\{|x|<N\}\subset\mathbb R^3$ and $s$ is great enough. Our study of the pressure makes it very easy, avoiding the rather tedious calculations in [VF88].

Proposition 1.4. The following estimates hold for any $l,N\in\mathbb N^*$, $0\le a<b\le T$ and $s>\frac52$:

\[
\int\|\partial_tu\|_{L^2([0,T],H^{-s}(B_N))}\,dP_l\le C_N,
\tag{11}
\]

\[
\int\|\partial_tu\|_{L^1([a,b],H^{-s}(B_N))}\,dP_l\le C_N(b-a).
\tag{12}
\]

(The constant $C_N$ depends on $N$, and also on $s$, $\mu_0$ and $T$, but not on $l$.)

Proof. Let $u$ be a solution of (2). Let us show that for $r\in\,]\frac32,2[$ and $\alpha>0$ small enough ($\alpha<2-r$),

\[
\|\partial_tu(t)\|_{H^{-s}(B_N)}\le C_N\|u(t)\|_{L^2(r)}\big(1+\|u(t)\|_{L^2(2+\alpha)}\big).
\tag{13}
\]

We have $\|\Delta u\|_{H^{-s}(B_N)}\le C_N\|u\|_{L^2(r)}$ since $s\ge2$; $\|\nabla\cdot((u*\omega_l)\otimes u)\|_{H^{-s}(B_N)}\le C\|(u*\omega_l)\otimes u\|_{L^1(B_N)}$, thanks to the embedding $L^1\subset H^{-s+1}$ ($s>\frac52$); now $\|(u*\omega_l)\otimes u\|_{L^1(B_N)}\le\|u\|^2_{L^2(B_{N+1})}\le C_N\|u\|^2_{L^2(2+\alpha)}$, with $C_N$ independent of $l$. We split the pressure gradient into $\nabla p_1^l+\nabla p_2^l$. The structure of $p_1^l$ enables us to control it as well as the bilinear term $u\otimes u$: there exists $k_1=k_1(N)$ such that for any $x$ in $B_N$,

\[
p_1^l(x)=\varphi(x/8)\,R_iR_j\big(\varphi(u_i*\omega_l)u_j\big)+\sum_{k\le k_1}\chi_k\,R_iR_j\big(\psi_k(u_i*\omega_l)u_j\big),
\]

because the function $\chi_k$ is zero on the ball $B_{2^{k-2}}$, hence it is sufficient that $2^{k_1-1}>N$. Then we can write

\[
\big\|\nabla[\chi_kR_iR_j(\psi_k(u_i*\omega_l)u_j)]\big\|_{H^{-s}(B_N)}\le C_k\big\|R_iR_j(\psi_k(u_i*\omega_l)u_j)\big\|_{H^{-s+1}(\mathbb R^3)}\le C_k\|(u*\omega_l)\otimes u\|_{H^{-s+1}(B_{2^{k+2}})},
\]

with $C_k$ independent of $l$, then $\|(u*\omega_l)\otimes u\|_{H^{-s+1}(B_{2^{k+2}})}\le C_k\|u\|^2_{L^2(2+\alpha)}$ as previously. Hence, we have $\|\nabla p_1\|_{H^{-s}(B_N)}\le C_N\|u\|^2_{L^2(2+\alpha)}$; moreover

\[
\|\nabla p_2\|_{H^{-s}(B_N)}\le\|\nabla p_2\|_\infty\le C\|u\|^2_{L^2(2)}\le C\|u\|_{L^2(2-\alpha)}\|u\|_{L^2(2+\alpha)},
\]

and this ends the proof of (13). As a consequence, we deduce that

\[
\|\partial_tu\|_{L^2([0,T],H^{-s}(B_N))}\le C_N\|u\|_{L^2L^2(r)}\big(1+\|u\|_{L^\infty L^2(2+\alpha)}\big),
\]

\[
\|\partial_tu\|_{L^1([a,b],H^{-s}(B_N))}\le C_N\big(b-a+\|u\|^2_{L^2([a,b],L^2(r))}\big),
\]

and the result follows from (3) and (10) by integration with respect to $P_l$.


We now conclude briefly the construction of the statistical solution $P$, following roughly [VF88]. We fix two numbers $\frac32<r_1<r_0<2$, and an $s>\frac52$. We shall use the space

\[
\Omega=\Big\{u\in\mathcal D'(]0,T[\times\mathbb R^3)\ \Big|\ \|u\|_\Omega=\|u\|_{L^2H^1(r_1)}+\sum_{N\ge1}\frac{1}{2^NC_N}\|\partial_tu\|_{L^1H^{-s}(B_N)}<\infty\Big\},
\]

where the constants $C_N$ are those in Proposition 1.4, inequality (12). We have $\int\|u\|_\Omega\,dP_l(u)\le C$ (uniformly in $l$). The embedding $\Omega\subset L^2([0,T],L^2(r_0))$ is compact. Let $D_R$ be the ball of center 0 and radius $R$ in $\Omega$. Its closure $\bar D_R$ in $L^2L^2(r_0)$ is compact, and we have $P_l({}^c\bar D_R)\le P_l({}^cD_R)\le\frac CR$ (see [VF88]), so we can apply Prokhorov’s compactness criterion ([GS75, VTC87]), which furnishes a subsequence of $(P_l)$ (that we shall still denote by $P_l$) weakly convergent on the space $L^2L^2(r_0)$. The weak limit is a probability measure $P$ on this space, and for all continuous bounded functionals $f$ on $L^2L^2(r_0)$, we have

\[
\int f(u)\,dP_l(u)\xrightarrow[l\to\infty]{}\int f(u)\,dP(u).
\]

As in [VF88], P satisfies the following properties: i) P is spatially homogeneous, ii) u2L 2 H 1 (r ) d P(u) < ∞, 1 iii) P-almost every u is a weak solution of the Navier-Stokes equations; moreover, it can be shown that the associated pressure gradient ∇ p is equal to ∇ p1 + ∇ p2 , where p1 and ∇ p2 are defined by equalities (4) and (5). Passing to the limit, we also get the inequalities in Propositions 1.3 and 1.4, with Pl replaced by P. As a consequence, the measure P is concentrated on L ∞ L 2 (2 + α), and also on V = {u ∈ L 2 L 2 (r0 ) | ∀N , ∂t u ∈ L 2 H −s (B N )}. Thus for almost every −s u, γt u = u(t, ·) is well defined in Hloc (R3 ). Note that the Borel subsets of V are also 2 2 2 Borel subsets of L L (r0 ), hence of L loc ([0, T ] × R3 ), so that P can be seen as a Borel −s measure on V . Thanks to the continuity of γt from V to Hloc , we can define the measure −1 −s P ◦ γt (0 ≤ t ≤ T ) on Hloc . As in [VF88], we show that P ◦ γ0−1 = µ0 (which means that P ◦ γ0−1 is concentrated on L 2loc , and its restriction to this space coincides with µ0 ). So far, we have shown that P is a spatially homogeneous statistical solution of the Navier-Stokes equations, with initial measure µ0 , as well as some additional estimates, which will be useful later. We also obtain the following results (see [VF88]; our study of the pressure term allows us to include new statements about the pressure in these two Vishik and Fursikov results): Theorem 1.5. For almost every initial datum u 0 ∈ L 2loc (R3 ) (in the sense of the measure µ0 ), there exists a weak solution u of the Navier-Stokes equations (1) on ]0, T [×R3 , enjoying the following properties: i) u ∈ L ∞ L 2 (2 + α) for all α > 0, ii) u ∈ L 2 H 1 (r1 ), iii) the associated pressure satisﬁes ∇ p = ∇ p1 + ∇ p2 , iv) u ∈ V and for all N , u(t) tends to u 0 in H −s (B N ).


Theorem 1.6. The family of measures $\mu_t=P\circ\gamma_t^{-1}$ satisfies the extended Hopf equation: $\forall w\in\mathcal D(\mathbb R^3)$,

\[
\hat\mu_t(w)-\hat\mu_0(w)=i\int_0^t\!\!\int\Big(\langle v,\Delta w\rangle+\langle v\otimes v,\nabla\otimes w\rangle+\langle p_1(v),\nabla\cdot w\rangle-\langle\nabla p_2(v),w\rangle\Big)e^{i\langle v,w\rangle}\,d\mu_\tau(v)\,d\tau,
\tag{14}
\]

where $\hat\mu_t$ is the characteristic functional of the measure $\mu_t$, $\hat\mu_t(w)=\int e^{i\langle v,w\rangle}\,d\mu_t(v)$.

(The original Hopf equation [Hop52, VF88] includes no pressure term, and is valid only when $w$ is divergence-free, unlike our Eq. (14).)

Let us conclude this section with a more specific result of convergence for the sequence $P_l$ that we shall need in the next section. Let $\sigma\in\,]\frac12,\frac23[$. We consider the Banach spaces $F_N=L^2L^2(r_0)\cap L^3H^{\frac12}(B_N)$ and $W_N=\Omega\cap L^{\frac2\sigma}H^\sigma(B_N)$. The embedding $W_N\subset F_N$ is compact, because the embeddings $\Omega\subset L^2L^2(r_0)$ and $W^{1,1}H^{-s}(B_N)\cap L^{\frac2\sigma}H^\sigma(B_N)\subset L^3H^{\frac12}(B_N)$ are (recall that $\|\partial_tu\|_{L^1H^{-s}(B_N)}\le2^NC_N\|u\|_\Omega$). Moreover, we have $\int\|u\|_{W_N}\,dP_l(u)\le C_N$ uniformly in $l$. To see that, it is sufficient to notice:

\[
\|u\|_{L^{\frac2\sigma}H^\sigma(B_N)}\le\|u\|^\sigma_{L^2H^1(B_N)}\,\|u\|^{1-\sigma}_{L^\infty L^2(B_N)}\le C_N\big(\|u\|_{L^2H^1(r_1)}+\|u\|_{L^\infty L^2(2+\alpha)}\big).
\]

In the same way as previously, we deduce that the family $P_l$ is weakly compact as a sequence of measures on the space $F_N$. The only possible cluster point of this sequence is $P$, hence we have shown the following result.

Proposition 1.7. For any $N\in\mathbb N^*$, the sequence $P_l$ converges weakly to $P$ on the space $F_N$, i.e. for all bounded continuous functionals $F$ on $F_N$, we have $\int F\,dP_l\to\int F\,dP$.

2. Local Energy Inequality

Let us state our main result. The present section will be devoted to its proof.

Theorem 2.1. Let $P$ be the spatially homogeneous statistical solution constructed in Sect. 1. Then $P$-almost every solution $u$ of the Navier-Stokes equations (1) satisfies the local energy inequality: $\forall\rho\in\mathcal D(]0,T[\times\mathbb R^3)$ such that $\rho\ge0$, we have

\[
2\iint|\nabla\otimes u|^2\rho\,dx\,dt\le\iint|u|^2(\partial_t+\Delta)\rho\,dx\,dt+\iint(|u|^2+2p_1)\,u\cdot\nabla\rho\,dx\,dt-2\iint u\cdot\nabla p_2\,\rho\,dx\,dt.
\tag{15}
\]

This inequality (obtained formally by multiplication of the Navier-Stokes equations by $\rho u$ and integrations by parts) has been considered by Scheffer ([Sch77]) and by Caffarelli, Kohn and Nirenberg ([CKN82]) in order to study the regularity of solutions of (1): it plays a fundamental role in the proof of Caffarelli, Kohn and Nirenberg’s famous regularity theorem (see [CKN82]); cf. also [Lem02]. Our result implies that this theorem can be applied to $P$-almost every solution $u$ of (1).


To prove Theorem 2.1, we fix a test function ρ, and we show that inequality (15) is true for P-almost every u with this function ρ. In order to do so, we notice that we have a similar inequality satisfied by the solutions u of the regularized periodic system (2):

$$2\iint |\nabla\otimes u|^2 \rho \, dx\,dt - \iint |u|^2 (\partial_t + \Delta)\rho \, dx\,dt - \iint \big(|u|^2 (u*\omega_l) + 2pu\big)\cdot\nabla\rho \, dx\,dt = 0. \tag{16}$$

This is in fact an equality (due to the smoothness of u), obtained by multiplying (2) by ρu and integrating. Then we want to let l tend to infinity, using the weak convergence Pl → P. To carry out this limiting process, we proceed in three steps. First, we study the continuity of the terms involved in inequality (15) with respect to u (paragraph 2.1); then we bound the errors which appear in the regularized local energy inequality, due to convolution (paragraph 2.2); and we eventually pass to the limit using an appropriate truncation (paragraph 2.3). So in paragraphs 2.1 to 2.3, the non-negative function ρ ∈ D(]0, T [×R3 ) is fixed. Then in paragraph 2.4 we will prove that for P-almost every u, the local energy inequality (15) is true for all non-negative test functions ρ.

2.1. Continuity of the terms. We choose $0 < a < b < T$ and $N_0 \in \mathbb{N}$ such that $\rho$ is supported in $]a,b[\times B_{N_0}$. Let $N$ be greater than $N_0$. We will work with the space $F_N = L^2L^2(r_0) \cap L^3H^{1/2}(B_N)$. We shall fix $N$ later ($N_0$ and $N$ will depend only on $\rho$).

To begin with, we show that $\iint |u|^2(\partial_t+\Delta)\rho \, dx\,dt + \iint (|u|^2 + 2p_1)\,u\cdot\nabla\rho \, dx\,dt$ is a continuous function of $u \in F_N$. We shall now denote the integrals $\iint \cdots \, dx\,dt$ by the pairing $\langle\cdot,\cdot\rangle$. The mapping $u \mapsto \langle |u|^2, (\partial_t+\Delta)\rho\rangle$ is trivially continuous on $L^2L^2(r_0)$, hence on $F_N$. To deal with the term $\langle |u|^2u, \nabla\rho\rangle$, we consider the trilinear form $(u,v,w) \mapsto \langle (u\cdot v)w, \nabla\rho\rangle$:
$$|\langle (u\cdot v)w, \nabla\rho\rangle| \le \|u\|_{L^3L^3(B_{N_0})} \|v\|_{L^3L^3(B_{N_0})} \|w\|_{L^3L^3(B_{N_0})} \|\nabla\rho\|_\infty \le C \|u\|_{F_N}\|v\|_{F_N}\|w\|_{F_N}$$
(the constant $C$ depends on $\rho$). Hence, $u \mapsto \langle |u|^2u, \nabla\rho\rangle$ is continuous on $F_N$.

We turn to the term $u \mapsto \langle p_1u, \nabla\rho\rangle$. Let $k_1$ be great enough so that
$$\langle p_1u, \nabla\rho\rangle = \Big\langle u\,\varphi(x/8)\,R_iR_j(\varphi u_iu_j) + u\sum_{k=0}^{k_1} \chi_k R_iR_j(\psi_k u_iu_j),\ \nabla\rho \Big\rangle$$
(this is achieved by choosing $2^{k_1-1} > N_0$). We fix $N$ large enough so that $\psi_{k_1}$ is supported in $B_N$ (e.g. $N = 2^{k_1+4}$), and consider a trilinear form in $u, v, w$ as before:


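$$\begin{aligned}
&\Big|\Big\langle u\,\varphi(x/8)\,R_iR_j(\varphi v_iw_j) + u\sum_{k=0}^{k_1}\chi_k R_iR_j(\psi_k v_iw_j),\ \nabla\rho\Big\rangle\Big| \\
&\qquad\le C\|u\|_{L^3L^3(B_{N_0})}\Big(\|R_iR_j(\varphi v_iw_j)\|_{L^{3/2}L^{3/2}(\mathbb{R}^3)} + \sum_{k=0}^{k_1}\|R_iR_j(\psi_k v_iw_j)\|_{L^{3/2}L^{3/2}(\mathbb{R}^3)}\Big) \\
&\qquad\le C\|u\|_{L^3L^3(B_{N_0})}\Big(\|\varphi v_iw_j\|_{L^{3/2}L^{3/2}} + \sum_{k=0}^{k_1}\|\psi_k v_iw_j\|_{L^{3/2}L^{3/2}}\Big) \\
&\qquad\le C\|u\|_{F_N}\|v\|_{F_N}\|w\|_{F_N}
\end{aligned} \tag{17}$$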

(the constant $C$ depends on $N$ and $\rho$).

We now consider the term $\langle u\cdot\nabla p_2, \rho\rangle$. This term may not be continuous in the $F_N$ norm, for $\nabla p_2$ depends on the values of $u$ over the whole space: $\|\nabla p_2\|_\infty \le C\|u\|^2_{L^2(2)}$, so $|\langle u\cdot\nabla p_2, \rho\rangle| \le C\|u\|^2_{L^2(2)}\|u\|_{L^2(B_{N_0})}$, and we lack integrability in time to control this cubic quantity. We solve this problem by regularization in the time variable: let $\delta_k(t)$ be a standard one-dimensional mollifier, and let us consider $\langle (u*\delta_k)\cdot\nabla p_2, \rho\rangle$. This term being cubic in $u$, we again introduce the corresponding trilinear form, namely $\langle (u*\delta_k)\cdot\nabla p_2(v,w), \rho\rangle$, where
$$\nabla p_2(v,w) = \nabla\big[(1-\varphi(x/8))R_iR_j(\varphi v_iw_j)\big] + \sum_{q\ge 0}\nabla\big[(1-\chi_q)R_iR_j(\psi_q v_iw_j)\big]$$
satisfies $\|\nabla p_2(v,w)\|_\infty \le C\int |v||w|(1+|x|^2)^{-2}\,dx$ as in (7). Then (recall that $r_0 < 2$)
$$|\langle (u*\delta_k)\cdot\nabla p_2(v,w), \rho\rangle| \le C\|u*\delta_k\|_{L^\infty([a,b],L^1(B_{N_0}))}\int_0^T\!\!\int \frac{|v||w|}{(1+|x|^2)^2}\,dx\,dt \le C_k\|u\|_{L^2L^2(r_0)}\|v\|_{L^2L^2(r_0)}\|w\|_{L^2L^2(r_0)}.$$

Finally, we consider the term $\langle |\nabla\otimes u|^2, \rho\rangle$. It is not continuous, nevertheless we show that it is lower semi-continuous.

Lemma 2.2. There exists a sequence $v_n \in \mathcal{D}(]a,b[\times B_{N_0})$ such that for all $u \in L^2L^2(r_0)$, we have $\langle |\nabla\otimes u|^2, \rho\rangle = \sup_n |\langle u, \nabla\cdot(v_n\rho)\rangle|^2$ if the l.h.s. is well defined and finite, and $\sup_n |\langle u, \nabla\cdot(v_n\rho)\rangle|^2 = +\infty$ otherwise.

Proof. The quantity $\langle |\nabla\otimes u|^2, \rho\rangle$ is a weighted $L^2$ norm of $\nabla\otimes u$. Let $v_n$ be a sequence of test functions dense in the set $\{v \in L^2H^1(B_{N_0}) \mid \langle |v|^2, \rho\rangle < 1\}$ in the $L^2H^1(B_{N_0})$ topology. This implies that the $v_n$'s are dense in the unit ball of the weighted space $L^2(\rho\,dx\,dt)$, hence if $\langle |\nabla\otimes u|^2, \rho\rangle < \infty$, we have $\langle |\nabla\otimes u|^2, \rho\rangle = \sup_n |\langle \nabla\otimes u, v_n\rho\rangle|^2 = \sup_n |\langle u, \nabla\cdot(v_n\rho)\rangle|^2$.

Conversely, assume that $\sup_n |\langle u, \nabla\cdot(v_n\rho)\rangle|^2 = K^2 < \infty$. We notice that $|\langle u, \nabla\cdot(v\rho)\rangle| \le C_u\|v\|_{L^2H^1(B_{N_0})}$, then we have for any $v \in L^2H^1(B_{N_0})$, $|\langle \nabla\otimes u, v\rho\rangle| = |\langle u, \nabla\cdot(v\rho)\rangle| \le K\|v\|_{L^2(\rho\,dx\,dt)}$, for $v/\|v\|_{L^2(\rho\,dx\,dt)}$ is a cluster point of the sequence $v_n$. Thus $\nabla\otimes u$ belongs to $L^2(\rho\,dx\,dt)$.

Finally, we note that since $|\langle u, \nabla\cdot(v_n\rho)\rangle| \le C_n\|u\|_{L^2L^2(r_0)}$, the mapping $u \mapsto \langle |\nabla\otimes u|^2, \rho\rangle = \sup_n |\langle u, \nabla\cdot(v_n\rho)\rangle|^2$ (with values in $\mathbb{R}^+\cup\{+\infty\}$) is lower semi-continuous with respect to $u$ in the $L^2L^2(r_0)$ norm.

So far, we have proved the following result.

Proposition 2.3. The following mapping is lower semi-continuous on $F_N$:
$$u \mapsto 2\langle |\nabla\otimes u|^2, \rho\rangle - \langle |u|^2, (\partial_t+\Delta)\rho\rangle - \langle |u|^2u + 2p_1u, \nabla\rho\rangle + 2\langle (\delta_k*u)\cdot\nabla p_2, \rho\rangle.$$


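2.2. Control of errors. Let us introduce a few notations:
$$E(f) = 2\langle |\nabla\otimes f|^2, \rho\rangle - \langle |f|^2, (\partial_t+\Delta)\rho\rangle - \langle |f|^2f + 2p_1f, \nabla\rho\rangle + 2\langle f\cdot\nabla p_2, \rho\rangle \tag{18}$$
(this is the expression appearing in (15)),
$$E^{(l)}(f) = 2\langle |\nabla\otimes f|^2, \rho\rangle - \langle |f|^2, (\partial_t+\Delta)\rho\rangle - \langle |f|^2(f*\omega_l) + 2p_1^lf, \nabla\rho\rangle + 2\langle f\cdot\nabla p_2^l, \rho\rangle$$
(this is the expression in (16)), and
$$E_k(f) = 2\langle |\nabla\otimes f|^2, \rho\rangle - \langle |f|^2, (\partial_t+\Delta)\rho\rangle - \langle |f|^2f + 2p_1f, \nabla\rho\rangle + 2\langle (\delta_k*f)\cdot\nabla p_2, \rho\rangle$$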

(this mapping has already appeared in Proposition 2.3). In these three expressions, $p_1$, $\nabla p_2$, $p_1^l$ and $\nabla p_2^l$ are defined respectively by (4), (5), (8) and (9) with $u$ replaced by $f$. We want to bound the quantity $|E_k(f) - E^{(l)}(f)|$ for any $f$ in an appropriate space (regardless of the fact that $f$ is a solution of some equation or not). This quantity represents the errors due to convolutions with $\omega_l$ and $\delta_k$. The four following lemmas carry out this task.

Lemma 2.4. We have, for $\alpha > 0$,
$$|\langle |f|^2(f - f*\omega_l), \nabla\rho\rangle| \le \frac{C}{\sqrt{l}}\,\|f\|^2_{L^2H^1(r_1)}\,\|f\|_{L^\infty L^2(2+\alpha)} \tag{19}$$
($C$ depends only on $\rho$, $r_1$ and $\alpha$, but not on $l$).

Proof. We write, thanks to Hölder and Sobolev inequalities:
$$|\langle |f|^2(f - f*\omega_l), \nabla\rho\rangle| \le C\|f\|^2_{L^4L^3(B_{N_0})}\|f - f*\omega_l\|_{L^2L^3(B_{N_0})} \le C\|f\|_{L^2H^1(r_1)}\|f\|_{L^\infty L^2(2+\alpha)}\|f - f*\omega_l\|_{L^2H^{1/2}(B_{N_0})}.$$
Let $\zeta \in \mathcal{D}(B_{N_0+2})$ such that $\zeta(x) = 1$ for $x \in B_{N_0+1}$; let $g = \zeta f$. We have $f*\omega_l(t,x) = g*\omega_l(t,x)$ for $x \in B_{N_0}$, so that $\|f(t) - f*\omega_l(t)\|_{H^{1/2}(B_{N_0})} \le \|g(t) - g*\omega_l(t)\|_{H^{1/2}(\mathbb{R}^3)}$. Using Fourier transform, one easily sees that $\|g(t) - g*\omega_l(t)\|_{H^{1/2}} \le \frac{C}{\sqrt{l}}\|\nabla\otimes g(t)\|_{L^2}$. Thus we obtain
$$\|f(t) - f*\omega_l(t)\|_{H^{1/2}(B_{N_0})} \le \frac{C}{\sqrt{l}}\,\|\nabla\otimes f(t)\|_{L^2(r_1)},$$
and the result follows.

Lemma 2.5. We have
$$|\langle (p_1 - p_1^l)f, \nabla\rho\rangle| \le \frac{C}{\sqrt{l}}\,\|f\|^2_{L^2H^1(r_1)}\,\|f\|_{L^\infty L^2(2+\alpha)} \tag{20}$$
($C$ is independent of $l$).

Proof. Write $|\langle (p_1 - p_1^l)f, \nabla\rho\rangle| \le C\|p_1 - p_1^l\|_{L^{4/3}L^{3/2}(B_{N_0})}\|f\|_{L^4L^3(B_{N_0})}$. Now, we proceed as in (17), with $v_iw_j$ replaced by $(f_i*\omega_l - f_i)f_j$. For $x \in B_{N_0}$, we have
$$p_1 - p_1^l = \varphi(x/8)\,R_iR_j\big(\varphi(f_i - f_i*\omega_l)f_j\big) + \sum_{k\le k_1}\chi_k R_iR_j\big(\psi_k(f_i - f_i*\omega_l)f_j\big),$$
$$\begin{aligned}
\|p_1(t) - p_1^l(t)\|_{L^{3/2}(B_{N_0})} &\le C\|R_iR_j(\varphi(f_i - f_i*\omega_l)f_j)\|_{L^{3/2}(\mathbb{R}^3)} + C\sum_{k\le k_1}\|R_iR_j(\psi_k(f_i - f_i*\omega_l)f_j)\|_{L^{3/2}(\mathbb{R}^3)} \\
&\le C\|f(t) - f*\omega_l(t)\|_{L^3(B_N)}\|f(t)\|_{L^3(B_N)}.
\end{aligned}$$
Hence $|\langle (p_1 - p_1^l)f, \nabla\rho\rangle| \le C\|f - f*\omega_l\|_{L^2L^3(B_N)}\|f\|^2_{L^4L^3(B_N)}$, and the remaining part of the proof is similar to the previous lemma.

Lemma 2.6. We have, for $0 < \alpha < \frac12$,
$$|\langle f\cdot(\nabla p_2 - \nabla p_2^l), \rho\rangle| \le \frac{C}{l}\,\|f\|^2_{L^\infty L^2(2+\alpha)}\,\|f\|_{L^2H^1(2-\alpha)} \tag{21}$$

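($C$ is independent of $l$).

Proof. We already know a bound for $\|\nabla p_2\|_\infty$, namely inequality (7). Similarly, we have $\|\nabla p_2 - \nabla p_2^l\|_\infty \le C\int \frac{|f - f*\omega_l||f|}{(1+|x|^2)^2}\,dx$. Thus we have, using the Cauchy-Schwarz inequality:
$$\|f\cdot(\nabla p_2 - \nabla p_2^l)\|_{L^1(B_{N_0})} \le C\|f\|_{L^1(B_{N_0})}\|\nabla p_2 - \nabla p_2^l\|_\infty \le C\|f\|_{L^1(B_{N_0})}\|f - f*\omega_l\|_{L^2(2-\alpha)}\|f\|_{L^2(2+\alpha)}.$$
It is not difficult to show that $\|f - f*\omega_l\|_{L^2(2-\alpha)} \le \frac{C}{l}\|\nabla f\|_{L^2(2-\alpha)}$, then $|\langle f\cdot(\nabla p_2 - \nabla p_2^l), \rho\rangle| \le \frac{C}{l}\|f\|^2_{L^\infty L^2(2+\alpha)}\|f\|_{L^2H^1(2-\alpha)}$.

Lemma 2.7. Let $s > 0$. We have, for $0 < \alpha < \frac12$ and $0 < \varepsilon \le \min\big(\frac{1}{2s+1}, \frac13\big)$,
$$|\langle (f - \delta_k*f)\cdot\nabla p_2, \rho\rangle| \le \frac{C}{k^\varepsilon}\,\|\partial_t f\|^\varepsilon_{L^1H^{-s}(B_N)}\,\|f\|^{(3-\varepsilon)/2}_{L^2H^1(2-\alpha)}\,\|f\|^{(3-\varepsilon)/2}_{L^\infty L^2(2+\alpha)} \tag{22}$$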

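($C$ depends on $\rho$, $s$, $\alpha$ and $\varepsilon$, but not on $k$).

Proof. We write
$$|\langle (f - \delta_k*f)\cdot\nabla p_2, \rho\rangle| \le C\|f - \delta_k*f\|_{L^2([a,b],L^2(B_{N_0}))}\|\nabla p_2\|_{L^2L^\infty}.$$
We use again a function $\zeta \in \mathcal{D}(B_N)$ such that $\zeta(x) = 1$ on $B_{N_0}$, and $g = \zeta f$. We have
$$\|f(t) - \delta_k*f(t)\|_{L^2(B_{N_0})} \le \|g(t) - \delta_k*g(t)\|_{L^2(\mathbb{R}^3)} \le \|g(t) - \delta_k*g(t)\|^\varepsilon_{H^{-s}}\,\|g(t) - \delta_k*g(t)\|^{1-\varepsilon}_{H^{1/2}},$$
provided that $0 < \varepsilon \le \frac{1}{2s+1}$. Then the Hölder inequality leads to
$$\int_a^b \|f - \delta_k*f\|^2_{L^2(B_{N_0})}\,dt \le \Big(\int_a^b \|g - \delta_k*g\|_{H^{-s}}\,dt\Big)^{2\varepsilon}\Big(\int_a^b \|g - \delta_k*g\|^{\frac{2(1-\varepsilon)}{1-2\varepsilon}}_{H^{1/2}}\,dt\Big)^{1-2\varepsilon},$$
$$\|f - \delta_k*f\|_{L^2([a,b],L^2(B_{N_0}))} \le \|g - \delta_k*g\|^\varepsilon_{L^1([a,b],H^{-s})}\,\|g - \delta_k*g\|^{1-\varepsilon}_{L^p([a,b],H^{1/2})},$$
with $p = \frac{2(1-\varepsilon)}{1-2\varepsilon}$. We have $\|g - \delta_k*g\|_{L^1([a,b],H^{-s})} \le \frac{1}{k}\|\partial_t g\|_{L^1([0,T],H^{-s})}$. Assume $p \le 4$, i.e. $\varepsilon \le \frac13$, so that $\|g - \delta_k*g\|_{L^pH^{1/2}} \le C\|f\|_{L^4H^{1/2}(B_N)}$, and recall that $\|\nabla p_2\|_{L^2L^\infty} \le C\|f\|^2_{L^4L^2(2)} \le C\|f\|_{L^2L^2(2-\alpha)}\|f\|_{L^\infty L^2(2+\alpha)}$ thanks to (7); then we obtain:
$$\begin{aligned}
|\langle (f - \delta_k*f)\cdot\nabla p_2, \rho\rangle| &\le \frac{C}{k^\varepsilon}\,\|f\|_{L^2L^2(2-\alpha)}\|f\|_{L^\infty L^2(2+\alpha)}\|\partial_t f\|^\varepsilon_{L^1H^{-s}(B_N)}\|f\|^{1-\varepsilon}_{L^4H^{1/2}(B_N)} \\
&\le \frac{C}{k^\varepsilon}\,\|\partial_t f\|^\varepsilon_{L^1H^{-s}(B_N)}\|f\|^{1+(1-\varepsilon)/2}_{L^\infty L^2(2+\alpha)}\|f\|^{(1-\varepsilon)/2}_{L^2H^1(B_N)}\|f\|_{L^2L^2(2-\alpha)},
\end{aligned}$$
since $\|f\|^2_{L^4H^{1/2}(B_N)} \le \|f\|_{L^2H^1(B_N)}\|f\|_{L^\infty L^2(B_N)}$.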

2.3. Passage to the limit. We shall use the following function to truncate some unbounded quantities:
$$\theta(z) = \begin{cases} 0 & \text{if } z \le 0, \\ z & \text{if } 0 \le z \le 1, \\ 1 & \text{if } z \ge 1, \end{cases}$$
which satisfies the following "triangular inequality" for all $z_1, z_2 \in \mathbb{R}$: $\theta(z_1 + z_2) \le \theta(z_1) + \theta(z_2)$. In what follows, we fix $s > \frac52$, $\alpha \in\, ]0, \frac12[$ and $\varepsilon \in\, ]0, \frac{1}{2s+1}[$.

We want to show that for $P$-almost every $f$, $E(f)$ is non-positive, which is equivalent to $\int \theta(E(f))\,dP(f) = 0$. This formulation of the local energy inequality allows us to use the weak convergence of $P_l$ towards $P$. We first establish that $\int \theta(E_k(f))\,dP_l(f)$ is small. Then letting $l$ and $k$ tend to infinity, we will obtain the desired result.

Proposition 2.8. There exists a constant $K = K(\rho, N, \mu_0, T)$ such that for all $l$ and $k$, we have
$$\int \theta(E_k(f))\,dP_l(f) \le \frac{K}{l^{1/6}} + \frac{K}{k^{\varepsilon/3}}.$$
Note that $K$ does not depend on $l$ nor on $k$.

Proof. The solutions $u$ of the regularized periodic system (2) satisfy (16), that is $E^{(l)}(u) = 0$, which implies
$$\begin{aligned}
E_k(u) = E_k(u) - E^{(l)}(u) = {} & -\langle |u|^2(u - u*\omega_l), \nabla\rho\rangle - 2\langle (p_1 - p_1^l)u, \nabla\rho\rangle \\
& + 2\langle u\cdot(\nabla p_2 - \nabla p_2^l), \rho\rangle + 2\langle (\delta_k*u - u)\cdot\nabla p_2, \rho\rangle.
\end{aligned}$$


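Then we have
$$\begin{aligned}
\theta(E_k(u)) \le {} & \theta\big(|\langle |u|^2(u - u*\omega_l), \nabla\rho\rangle|\big) + \theta\big(2|\langle (p_1 - p_1^l)u, \nabla\rho\rangle|\big) \\
& + \theta\big(2|\langle u\cdot(\nabla p_2 - \nabla p_2^l), \rho\rangle|\big) + \theta\big(2|\langle (\delta_k*u - u)\cdot\nabla p_2, \rho\rangle|\big).
\end{aligned} \tag{23}$$
This inequality and the estimates proved in the previous paragraph will give us the required bound for $\int \theta(E_k(u))\,dP_l(u)$. Indeed, we will only need to integrate the previous inequality, for $P_l$ is concentrated on solutions of Eq. (2).

We notice that for all $z \in \mathbb{R}$, $\theta(z) \le |z|^{1/3}$, and we use Lemma 2.4 to obtain:
$$\theta\big(|\langle |u|^2(u - u*\omega_l), \nabla\rho\rangle|\big) \le \frac{C}{l^{1/6}}\,\|u\|^{2/3}_{L^2H^1(r_1)}\|u\|^{1/3}_{L^\infty L^2(2+\alpha)} \le \frac{C}{l^{1/6}}\big(\|u\|_{L^2H^1(r_1)} + \|u\|_{L^\infty L^2(2+\alpha)}\big),$$
then we integrate this inequality with respect to $P_l$, and the estimates (3) and (10) lead to
$$\int \theta\big(|\langle |u|^2(u - u*\omega_l), \nabla\rho\rangle|\big)\,dP_l(u) \le \frac{C}{l^{1/6}}\int \big(\|u\|_{L^2H^1(r_1)} + \|u\|_{L^\infty L^2(2+\alpha)}\big)\,dP_l(u) \le \frac{K}{l^{1/6}}, \tag{24}$$
where the constant $K$ is independent of $l$. Analogously, thanks to Lemmas 2.5 and 2.6, we have respectively
$$\int \theta\big(2|\langle (p_1 - p_1^l)u, \nabla\rho\rangle|\big)\,dP_l(u) \le \frac{K}{l^{1/6}}, \tag{25}$$
$$\int \theta\big(2|\langle u\cdot(\nabla p_2 - \nabla p_2^l), \rho\rangle|\big)\,dP_l(u) \le \frac{K}{l^{1/3}}. \tag{26}$$
We also have, thanks to Lemma 2.7, with $s > \frac52$, recalling that $\theta(z) \le |z|^{1/3}$,
$$\theta\big(2|\langle (\delta_k*u - u)\cdot\nabla p_2, \rho\rangle|\big) \le \frac{C}{k^{\varepsilon/3}}\big(\|\partial_t u\|_{L^1H^{-s}(B_N)} + \|u\|_{L^2H^1(2-\alpha)} + \|u\|_{L^\infty L^2(2+\alpha)}\big).$$
We integrate and use estimates (3), (10) and (12) to get
$$\int \theta\big(2|\langle (\delta_k*u - u)\cdot\nabla p_2, \rho\rangle|\big)\,dP_l(u) \le \frac{K}{k^{\varepsilon/3}}. \tag{27}$$
So far, thanks to inequalities (23) to (27), we have shown that
$$\int \theta(E_k(u))\,dP_l(u) \le \frac{3K}{l^{1/6}} + \frac{K}{k^{\varepsilon/3}}.$$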
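The two elementary properties of the truncation $\theta$ used in this proof, namely the "triangular inequality" $\theta(z_1+z_2) \le \theta(z_1)+\theta(z_2)$ and the bound $\theta(z) \le |z|^{1/3}$, can be checked numerically. The following is only an illustrative sketch: the function names and the sample grid are assumptions made for this example and play no role in the argument.

```python
def theta(z: float) -> float:
    """Truncation: 0 for z <= 0, z on [0, 1], 1 for z >= 1."""
    return min(max(z, 0.0), 1.0)


def check_theta_properties(samples) -> bool:
    """Check theta(z) <= |z|^(1/3) and subadditivity on all sample pairs."""
    tol = 1e-12  # tolerance for floating-point rounding
    for z1 in samples:
        if theta(z1) > abs(z1) ** (1.0 / 3.0) + tol:
            return False
        for z2 in samples:
            if theta(z1 + z2) > theta(z1) + theta(z2) + tol:
                return False
    return True


# Hypothetical sample grid on [-3, 3] with step 1/8.
samples = [k / 8.0 for k in range(-24, 25)]
print(check_theta_properties(samples))
```

A finite grid of course proves nothing; both inequalities follow directly from $\theta(z) = \min(\max(z,0),1)$.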


Next we forget the fact that $P_l$ is concentrated on solutions of (2) and we only focus on the weak convergence of the measures $P_l \to P$. Recall we want to show that $\int \theta(E(u))\,dP(u) = 0$. We first let $l$ tend to infinity. The function $\theta$ is continuous and non-decreasing, thus the mapping $u \mapsto \theta(E_k(u))$ is lower semi-continuous and bounded on $F_N$ (thanks to Proposition 2.3). Then the weak convergence of $P_l$ to $P$ on $F_N$ (Proposition 1.7) implies that
$$\int \theta(E_k(u))\,dP(u) \le \liminf_{l\to\infty}\int \theta(E_k(u))\,dP_l(u) \le \frac{K}{k^{\varepsilon/3}}.$$
Let us recall the definition of $E(u)$ (Eq. (18)), which yields
$$\theta(E(u)) \le \theta(E_k(u)) + \theta\big(2|\langle (\delta_k*u - u)\cdot\nabla p_2, \rho\rangle|\big).$$
We apply Lemma 2.7 again to show that $\int \theta\big(2|\langle (\delta_k*u - u)\cdot\nabla p_2, \rho\rangle|\big)\,dP(u) \le K/k^{\varepsilon/3}$ in the same way as (27), and we obtain
$$\int \theta(E(u))\,dP(u) \le \frac{2K}{k^{\varepsilon/3}}.$$
Hence letting $k$ tend to infinity, we get $\int \theta(E(u))\,dP(u) = 0$, i.e. $E(u) \le 0$ $dP(u)$-almost everywhere. Thus, a non-negative $\rho \in \mathcal{D}(]0,T[\times B_{N_0})$ being given, we have shown that almost every solution $u$ satisfies the local energy inequality with this test function $\rho$:
$$2\langle |\nabla\otimes u|^2, \rho\rangle - \langle |u|^2, (\partial_t+\Delta)\rho\rangle - \langle |u|^2u + 2p_1u, \nabla\rho\rangle + 2\langle u\cdot\nabla p_2, \rho\rangle \le 0.$$

2.4. Final step. It remains to extend this to any $\rho$, more precisely to show that there exists a Borel set $A \subset L^2_{loc}([0,T]\times\mathbb{R}^3)$ with $P(A) = 1$ such that all $u \in A$ and all non-negative $\rho \in \mathcal{D}(]0,T[\times\mathbb{R}^3)$ satisfy the local energy inequality. This can be done by a density argument. For all $N_0$, the set of non-negative smooth functions compactly supported in $]0,T[\times B_{N_0}$ endowed with the Sobolev $H^m$ norm is separable: let $\rho_k^{N_0}$ be a dense sequence of such functions. For all $\rho_k^{N_0}$, we can apply what we have already proved: there exists a Borel subset $A_k^{N_0}$ in $L^2_{loc}([0,T]\times\mathbb{R}^3)$ with $P(A_k^{N_0}) = 1$, such that any $u \in A_k^{N_0}$ satisfies the local energy inequality with the test function $\rho_k^{N_0}$. Moreover, we can assume that $A_k^{N_0} \subset L^2H^1(r_1)\cap L^\infty L^2(2+\alpha)$, for $P$ is concentrated on this space.

We choose $m$ great enough so that given any $u \in L^2H^1(r_1)\cap L^\infty L^2(2+\alpha)$, the mapping
$$\rho \mapsto E(u,\rho) = 2\langle |\nabla\otimes u|^2, \rho\rangle - \langle |u|^2, (\partial_t+\Delta)\rho\rangle - \langle |u|^2u + 2p_1u, \nabla\rho\rangle + 2\langle u\cdot\nabla p_2, \rho\rangle,$$
where $\rho \in \mathcal{D}(]0,T[\times B_{N_0})$, is continuous in the $H^m$ topology; we know that for all $u \in A_k^{N_0}$, $E(u, \rho_k^{N_0}) \le 0$. We deduce from these facts that given $u \in \bigcap_k A_k^{N_0}$, we have $E(u,\rho) \le 0$ for any non-negative $\rho \in \mathcal{D}(]0,T[\times B_{N_0})$. Hence for any $u \in A = \bigcap_{N_0}\bigcap_k A_k^{N_0}$ and any non-negative $\rho \in \mathcal{D}(]0,T[\times\mathbb{R}^3)$, we have $E(u,\rho) \le 0$. Moreover, $P(A) = 1$. This proves the statement of Theorem 2.1.

As a corollary, we can add in the statement of Theorem 1.5 a fifth assertion: $u$ satisfies the local energy inequality (15).


Acknowledgements. I express my deep gratitude to my advisor Professor Pierre-Gilles Lemarié-Rieusset, who brought this subject to my attention, gave me much advice and helped me to solve several critical difficulties.

References

[CKN82] Caffarelli, L., Kohn, R., Nirenberg, L.: Partial regularity of suitable weak solutions of the Navier-Stokes equations. Comm. Pure Appl. Math. 35, 771–831 (1982)
[Foi72] Foias, C.: Statistical study of the Navier-Stokes equations, I. Rend. Sem. Mat. Univ. Padova 48, 219–348 (1972); idem, II, 49, 9–123 (1973)
[FMRT01] Foias, C., Manley, O., Rosa, R., Temam, R.: Navier-Stokes Equations and Turbulence. Cambridge: Cambridge Univ. Press, 2001
[FT80] Foias, C., Temam, R.: Homogeneous Statistical Solutions of Navier-Stokes Equations. Indiana Univ. Math. J. 29, 913–957 (1980)
[GS75] Gihman, I.I., Skorohod, V.V.: The Theory of Stochastic Processes, 1. Berlin: Springer Verlag, 1975
[Hop52] Hopf, E.: Statistical hydrodynamics and functional calculus. J. Rat. Mech. Anal. 1, 87–123 (1952)
[Lem02] Lemarié-Rieusset, P.-G.: Recent developments in the Navier-Stokes problem. London: Chapman & Hall/CRC Press, 2002
[Sch77] Scheffer, V.: Hausdorff measures and the Navier-Stokes equations. Commun. Math. Phys. 55, 97–112 (1977)
[VF77] Vishik, M.I., Fursikov, A.V.: Solutions statistiques homogènes des systèmes différentiels paraboliques et du système de Navier-Stokes. Ann. Scuola Norm. Sup. Pisa, série IV, IV, 531–576 (1977)
[VF78] Vishik, M.I., Fursikov, A.V.: Translationally homogeneous statistical solutions and individual solutions with infinite energy of a system of Navier-Stokes equations. Siberian Math. J. 19, 710–729 (1978)
[VF88] Vishik, M.I., Fursikov, A.V.: Mathematical Problems of Statistical Hydromechanics. Dordrecht: Kluwer Academic Publishers, 1988
[VK88] Vishik, M.I., Komech, A.I.: Periodic approximations of homogeneous measures. In: [VF88, Appendix II].
[VTC87] Vakhania, N.N., Tarieladze, V.I., Chobanyan, S.A.: Probability Distributions on Banach Spaces. Dordrecht: D. Reidel Publishing Company, 1987

Communicated by P. Constantin

Commun. Math. Phys. 266, 37–63 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0034-0

Communications in

Mathematical Physics

Multiplicativity of Completely Bounded p-Norms Implies a New Additivity Result

Igor Devetak1, Marius Junge2, Christoper King3, Mary Beth Ruskai4

1 Electrical Engineering Department, University of Southern California, Los Angeles, CA 90089, USA. E-mail: [email protected]
2 Department of Mathematics, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. E-mail: [email protected]
3 Department of Mathematics, Northeastern University, Boston, MA 02115, USA. E-mail: [email protected]
4 Department of Mathematics, Tufts University, Medford, MA 02155, USA. E-mail: [email protected]

Received: 15 July 2005 / Accepted: 12 January 2006 Published online: 9 May 2006 – © Springer-Verlag 2006

Abstract: We prove additivity of the minimal conditional entropy associated with a quantum channel $\Phi$, represented by a completely positive (CP), trace-preserving map, when the infimum of $S(\gamma_{12}) - S(\gamma_1)$ is restricted to states of the form $(\mathcal{I}\otimes\Phi)(|\psi\rangle\langle\psi|)$. We show that this follows from multiplicativity of the completely bounded norm of $\Phi$ considered as a map from $L_1 \to L_p$ for $L_p$ spaces defined by the Schatten p-norm on matrices, and give another proof based on entropy inequalities. Several related multiplicativity results are discussed and proved. In particular, we show that both the usual $L_1 \to L_p$ norm of a CP map and the corresponding completely bounded norm are achieved for positive semi-definite matrices. Physical interpretations are considered, and a new proof of strong subadditivity is presented.

Contents
1. Introduction
2. Additivity of CB Entropy
   2.1 Multiplicativity questions in quantum information theory
   2.2 Proof of additivity from CB multiplicativity
   2.3 Proof of CB additivity from SSA
3. Completely Bounded Norms
   3.1 Definitions
   3.2 An important lemma
   3.3 Operator spaces
   3.4 Fubini and Minkowski generalizations
   3.5 More facts about $L_q(M_d; L_p(M_n))$ norms
   3.6 State representative of a map
4. Multiplicativity for CB Norms
   4.1 $1 \le q \le p$
   4.2 $q \ge p$
5. Applications of CB Entropy
   5.1 Examples and bounds
   5.2 Entanglement breaking and preservation
   5.3 Operational interpretation
6. Entropy Inequalities
A. Purification

1. Introduction

Quantum channels are represented by completely positive, trace preserving (CPT) maps on $M_d$, the space of $d \times d$ matrices. Results and conjectures about additivity and superadditivity of various types of capacity play an important role in quantum information theory. In this paper, we present a new additivity result which can be stated in terms of a type of minimal conditional entropy defined as
$$S_{CB,\min}(\Phi) = \inf_{\psi\in\mathbb{C}^d\otimes\mathbb{C}^d}\Big(S\big[(\mathcal{I}\otimes\Phi)(|\psi\rangle\langle\psi|)\big] - S\big[\mathrm{Tr}_2(|\psi\rangle\langle\psi|)\big]\Big), \tag{1.1}$$
where $S(Q) = -\mathrm{Tr}\, Q\log Q$ is the von Neumann entropy. The shorthand CB stands for "completely bounded" which will be explained later. We will show that this CB minimal conditional entropy is additive, i.e.,
$$S_{CB,\min}(\Phi_A\otimes\Phi_B) = S_{CB,\min}(\Phi_A) + S_{CB,\min}(\Phi_B). \tag{1.2}$$

The expression (1.1) for $S_{CB,\min}(\Phi)$ should be compared to those for two important types of capacity. To facilitate this, it is useful to let $\gamma_{12} = (\mathcal{I}\otimes\Phi)(|\psi\rangle\langle\psi|)$, and observe that its reduced density matrices are $\gamma_1 = \mathrm{Tr}_2\,(\mathcal{I}\otimes\Phi)(|\psi\rangle\langle\psi|)$ and $\gamma_2 = \mathrm{Tr}_1\,(\mathcal{I}\otimes\Phi)(|\psi\rangle\langle\psi|)$. We can now rewrite (1.1) as
$$-S_{CB,\min}(\Phi) = \sup_\psi\big[S(\gamma_1) - S(\gamma_{12})\big]. \tag{1.3}$$
The capacity of a quantum channel for transmission of classical information when assisted by unlimited entanglement (as in, e.g., dense coding) is given by [5, 6, 16]
$$C_{EA}(\Phi) = \sup_\psi\big[S(\gamma_1) + S(\gamma_2) - S(\gamma_{12})\big]. \tag{1.4}$$
The capacity for transmission of quantum information without additional resources is the coherent information, [4, 10, 35, 47]
$$C_Q(\Phi) = \sup_\psi\big[S(\gamma_2) - S(\gamma_{12})\big]. \tag{1.5}$$

In these expressions, the supremum is taken over all normalized vectors $\psi$ in $\mathbb{C}^d\otimes\mathbb{C}^d$ and $\gamma_{12}$ depends on both $\psi$ and $\Phi$. It has been established that $C_{EA}(\Phi)$ is additive [5, 16], but $C_Q(\Phi)$ is not additive in general [11]. To understand the difference between $C_Q(\Phi)$ and $S_{CB,\min}(\Phi)$, use the trace-preserving property of $\Phi$ to rewrite $\gamma_1 = \mathrm{Tr}_2(|\psi\rangle\langle\psi|)$ and $\gamma_2 = \Phi[\mathrm{Tr}_1(|\psi\rangle\langle\psi|)]$. The additive quantity (1.3) contains $\gamma_1$ which is independent of $\Phi$, while the non-additive quantity (1.5) contains $\gamma_2$ which depends upon $\Phi$.

We do not have a completely satisfactory physical interpretation of the CB entropy, although an operational meaning can be found. It appears to provide a measure of how well a channel preserves entanglement. In particular, if $\Phi$ is entanglement breaking,


$S_{CB,\min}(\Phi) > 0$ (although the converse does not hold). Recently, Horodecki, Oppenheim and Winter [18, 19] gave an elegant interpretation of quantum conditional information which we discuss in the context of our results in Sect. 5.

The additivity (1.2) will follow from the multiplicativity (2.5) of the quantity
$$\omega_p(\Phi) \equiv \sup_{\psi\in\mathbb{C}^d\otimes\mathbb{C}^d}\frac{\|(\mathcal{I}\otimes\Phi)(|\psi\rangle\langle\psi|)\|_p}{\|\mathrm{Tr}_2(|\psi\rangle\langle\psi|)\|_p} = \sup_{\psi\in\mathbb{C}^d\otimes\mathbb{C}^d}\frac{\|\gamma_{12}\|_p}{\|\gamma_1\|_p}. \tag{1.6}$$
We will see that this is a type of CB norm. Recall that one of several equivalent criteria for a map $\Phi$ to be completely positive is that for all integers $d$, the map $\mathcal{I}_d\otimes\Phi$ takes positive semi-definite matrices to positive semi-definite matrices. (We use $\mathcal{I}$ to denote the identity map $\mathcal{I}(\rho) = \rho$ to avoid confusion with the identity matrix $I$.) One can similarly define other concepts, such as completely isometric, in terms of the maps $\mathcal{I}_d\otimes\Phi$. The completely bounded (CB) norm is thus
$$\|\Phi\|_{CB} = \sup_d \|\mathcal{I}_d\otimes\Phi\|. \tag{1.7}$$
However, this depends on the precise definition of the norm on the right side of (1.7) or, equivalently, on the norms used to regard $\Phi$ and $\mathcal{I}_d\otimes\Phi$ as maps between Banach spaces. The appropriate definitions for the situations considered here are described in Sect. 3.1 and 3.3.

In the process of deriving our results, we obtain a number of related results of independent interest. For example, we show that when $\Phi$ is a CP map, both $\|\Phi\|_{q\to p}$ and the corresponding CB norm are attained for a positive semi-definite matrix, extending a result in [52]. The strong subadditivity (SSA) inequality [33, 43] for quantum entropy
$$S(Q_{123}) + S(Q_3) \le S(Q_{23}) + S(Q_{13}) \tag{1.8}$$
is the basis for Holevo's proof of additivity of $C_{EA}(\Phi)$ and the proof of (1.2) given in Sect. 2.3. In Sect. 6 we use operator space methods to obtain a new proof of SSA.

This paper is organized as follows. Section 2 is concerned with our main result, (1.2). After some background, we present two different proofs. In Sect. 3, which is divided into six subsections, we introduce notation and summarize results about CB norms and operator spaces used in the paper. Only the basic notation in Sect. 3.1 and the Minkowski inequalities in Sect. 3.4 are needed for the main result, Theorem 11. A subtle distinction between the norms used to define $\|\Phi\|_{CB}$ and $\|\mathcal{I}_d\otimes\Phi\|_{q\to p}$ often used in quantum information (e.g., [3, 27, 28, 52]) is described in the penultimate paragraph of Sect. 3.2. In Sect. 4, we prove multiplicativity of the CB norm for maps $\Phi : L_q(M_m) \to L_p(M_n)$. When $q \ge p$, we also show that the CB norm equals $\|\Phi\|_{q\to p}$, yielding a proof of multiplicativity for the latter. In Sect. 5, we explicitly give $\|\Phi\|_{CB}$ and $S_{CB,\min}(\Phi)$ for simple examples, including the depolarizing channel; prove that $S_{CB,\min}(\Phi) > 0$ for entanglement breaking channels; and discuss physical interpretations. In Sect. 6, we use the Minkowski inequalities for the CB norms to obtain a new proof of SSA. We also show that the minimizer implicit in $\|X_{12}\|_{(1,p)}$ converges to $X_1$.

2. Additivity of CB Entropy

2.1. Multiplicativity questions in quantum information theory. We are interested in CB norms when $\Phi$ is a map $L_q(M_d) \to L_p(M_d)$ where $L_p(M_d)$ denotes the Banach space


of $d \times d$ matrices with the Schatten norm $\|A\|_p = (\mathrm{Tr}\,|A|^p)^{1/p}$. One then defines the norm
$$\|\Phi\|_{q\to p} \equiv \sup_A \frac{\|\Phi(A)\|_p}{\|A\|_q}. \tag{2.1}$$
Watrous [52] and Audenaert [2] independently showed that this norm is unchanged if the supremum in (2.1) is restricted to positive semi-definite matrices, resolving a question raised in [27]. Thus,
$$\|\Phi\|_{q\to p} = \sup_{A>0} \frac{\|\Phi(A)\|_p}{\|A\|_q}. \tag{2.2}$$
In quantum information theory, the norm $\nu_p(\Phi) = \|\Phi\|_{1\to p}$ plays an important role. It has been conjectured [3] (see also [27]) that
$$\nu_p(\Phi_A\otimes\Phi_B) = \nu_p(\Phi_A)\,\nu_p(\Phi_B) \tag{2.3}$$
in the range $1 \le p \le 2$. Proof of this conjecture would imply additivity of minimal entropy which has been shown to be equivalent to several other important and long-standing conjectures [48]. We note here only that $S_{\min}(\Phi) = \inf_{\rho\in\mathcal{D}} S[\Phi(\rho)]$, where $\mathcal{D} = \{\rho : \rho > 0,\ \mathrm{Tr}\,\rho = 1\}$ denotes the set of density matrices. Note that $\nu_p(\Phi) = \sup_{\rho\in\mathcal{D}}\|\Phi(\rho)\|_p$. Amosov, Holevo and Werner [3] showed that the additivity of minimal entropy
$$S_{\min}(\Phi_A\otimes\Phi_B) = S_{\min}(\Phi_A) + S_{\min}(\Phi_B) \tag{2.4}$$
would follow if (2.3) can be proved. In this paper, we consider instead $\|\Phi\|_{CB,1\to p}$ for which the expression in (1.7) reduces to $\omega_p(\Phi)$, and show that it is multiplicative, i.e., that
$$\omega_p(\Phi_A\otimes\Phi_B) = \omega_p(\Phi_A)\,\omega_p(\Phi_B). \tag{2.5}$$
We first show that (2.5) implies our new additivity result, providing a motivation for the technical material needed to prove (2.5). We subsequently found another proof which does not use CB norms; this is presented in Sect. 2.3. However, the CB proof given next provides an indication of the potential of this machinery for quantum information.

2.2. Proof of additivity from CB multiplicativity. We define a function of a self adjoint matrix with spectral decomposition $A = \sum_k \lambda_k|\phi_k\rangle\langle\phi_k|$ as $f(A) = \sum_k f(\lambda_k)|\phi_k\rangle\langle\phi_k|$. We will need functions of the form $f(t) = t^p\log t$ defined on $[0,\infty)$ so that $f(0) = 0$ for $p > 0$ and $Q^p\log Q$ is 0 on $\ker(Q)$. For any $Q > 0$ we define the entropy as $S(Q) = -\mathrm{Tr}\,Q\log Q$ and note that $S\big(\frac{Q}{\mathrm{Tr}\,Q}\big) = \frac{1}{\mathrm{Tr}\,Q}S(Q) + \log\mathrm{Tr}\,Q$. We will often use the notation $\gamma_{12}$ for density matrices in the tensor product $M_d\otimes M_n \simeq M_{dn}$ and $\gamma_1 = \mathrm{Tr}_2\,\gamma_{12}$ for the corresponding reduced density matrix in $M_d$. (The partial trace $\mathrm{Tr}_2$ denotes the trace on $M_n$. One can similarly define $\gamma_2 = \mathrm{Tr}_1\,\gamma_{12}$. The density matrix $\gamma_{12}$ can be regarded as a probability distribution on $\mathbb{C}^d\otimes\mathbb{C}^n$ in which case $\gamma_1$ and $\gamma_2$ are the non-commutative analogues of its marginals.) We first prove a technical result.


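Lemma 1. The function
$$u(p,\gamma_{12}) \equiv \frac{1}{p-1}\Big(1 - \frac{\mathrm{Tr}_{12}\,\gamma_{12}^p}{\mathrm{Tr}_1\,\gamma_1^p}\Big)$$
is well-defined for $p > 1$ and $\gamma_{12}$ a density matrix. It can be extended by continuity to $p \ge 1$ and this extension satisfies
$$u(1,\gamma_{12}) = -\frac{d}{dp}\Big(\frac{\mathrm{Tr}_{12}\,\gamma_{12}^p}{\mathrm{Tr}_1\,\gamma_1^p}\Big)\Big|_{p=1} = S(\gamma_{12}) - S(\gamma_1). \tag{2.6}$$
Moreover, $u(p,\gamma_{12})$ is uniformly bounded in $\gamma_{12}$ for $p \in [1,2]$ and the continuity at $p = 1$ is uniform in $\gamma_{12}$.

Proof. It is well-known and straightforward to verify that, for any density matrix $\rho$ in $M_m$, $\lim_{p\to 1}(1 - \mathrm{Tr}\,\rho^p)/(p-1) = S(\rho)$ and that $0 \le S(\rho) \le \log m$. It then follows that (2.6) holds; the convergence is uniform in $\gamma_{12}$ because the set of density matrices is compact. By the mean value theorem, for any fixed $p$, $\gamma_{12}$ one can find $\tilde p$ with $1 \le \tilde p \le p$ such that $u(p,\gamma_{12}) = -\frac{d}{dp}\big(\mathrm{Tr}_{12}\,\gamma_{12}^p/\mathrm{Tr}_1\,\gamma_1^p\big)\big|_{p=\tilde p}$. Combining this with the fact that $\gamma \ge \gamma^{\tilde p} \ge \gamma^2$ for any density matrix $\gamma$ and $\tilde p \in (1,2]$ gives the following bound:
$$|u(p,\gamma_{12})| = \frac{\big|\big(\mathrm{Tr}_{12}\,\gamma_{12}^{\tilde p}\log\gamma_{12}\big)\,\mathrm{Tr}_1\,\gamma_1^{\tilde p} - \big(\mathrm{Tr}_1\,\gamma_1^{\tilde p}\log\gamma_1\big)\,\mathrm{Tr}_{12}\,\gamma_{12}^{\tilde p}\big|}{\big(\mathrm{Tr}_1\,\gamma_1^{\tilde p}\big)^2} \tag{2.7}$$
$$\le \frac{S(\gamma_{12}) + S(\gamma_1)}{\big(\mathrm{Tr}_1\,\gamma_1^{\tilde p}\big)^2}, \tag{2.8}$$
which is uniform in $p$ for $p \in (1,2]$.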
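The limit underlying Lemma 1, namely that $(1 - \mathrm{Tr}\,\rho^p)/(p-1)$ tends to $S(\rho)$ as $p \to 1$, can be observed numerically on the spectrum of a density matrix. The sketch below is only an illustration: it assumes $\rho$ is given through its eigenvalues, and the function names and the sample spectrum are hypothetical choices made for this example.

```python
import math


def entropy(spectrum):
    """von Neumann entropy -sum(l * log l) computed from the eigenvalues."""
    return -sum(l * math.log(l) for l in spectrum if l > 0.0)


def power_quotient(spectrum, p):
    """(1 - Tr rho^p) / (p - 1) computed from the eigenvalues, for p != 1."""
    trace_power = sum(l ** p for l in spectrum if l > 0.0)
    return (1.0 - trace_power) / (p - 1.0)


# Hypothetical spectrum of a 3 x 3 density matrix (non-negative, sums to 1).
spectrum = [0.5, 0.3, 0.2]
for p in (1.1, 1.01, 1.001):
    print(p, power_quotient(spectrum, p), entropy(spectrum))
```

As $p$ decreases to 1 the second printed column approaches the third, and the entropy stays between 0 and $\log 3$, in line with the bound $0 \le S(\rho) \le \log m$.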

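The quantity $S_{cond}(\gamma_{12}) \equiv S(\gamma_{12}) - S(\gamma_1)$ is called the conditional entropy. Motivated by (2.6), we define the C.B. minimal entropy as
$$S_{CB,\min}(\Phi) = \inf_{\psi\in\mathbb{C}^d\otimes\mathbb{C}^d} S_{cond}\big[(\mathcal{I}\otimes\Phi)(|\psi\rangle\langle\psi|)\big], \tag{2.9}$$
and observe that it satisfies the following.

Theorem 2. For any CPT map $\Phi$,
$$S_{CB,\min}(\Phi) = \lim_{p\to 1+}\frac{1 - [\omega_p(\Phi)]^p}{p-1}, \tag{2.10}$$
where $\omega_p(\Phi)$ is given by (1.6).

Proof. With $\gamma_{12} = (\mathcal{I}\otimes\Phi)(|\psi\rangle\langle\psi|)$, one finds
$$\begin{aligned}
S_{CB,\min}(\Phi) = \inf_{\psi\in\mathbb{C}^d\otimes\mathbb{C}^d} u(1,\gamma_{12}) &= \inf_\psi \lim_{p\to 1+} u(p,\gamma_{12}) \\
&= \inf_\psi \lim_{p\to 1+}\frac{1}{p-1}\Big(1 - \frac{\mathrm{Tr}_{12}\,\gamma_{12}^p}{\mathrm{Tr}_1\,\gamma_1^p}\Big) \\
&= \lim_{p\to 1+}\inf_\psi \frac{1}{p-1}\Big(1 - \frac{\mathrm{Tr}_{12}\,\gamma_{12}^p}{\mathrm{Tr}_1\,\gamma_1^p}\Big) \\
&= \lim_{p\to 1+}\frac{1}{p-1}\Big(1 - \sup_\psi\frac{\mathrm{Tr}_{12}\,\gamma_{12}^p}{\mathrm{Tr}_1\,\gamma_1^p}\Big),
\end{aligned} \tag{2.11}$$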


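where the interchange of $\lim_{p\to 1+}$ and $\inf_\psi$ is permitted by the uniformity in $\gamma_{12}$ of the continuity of $u(p,\gamma_{12})$ at $p = 1$.

Theorem 3. For all pairs of CPT maps $\Phi_A$, $\Phi_B$,
$$S_{CB,\min}(\Phi_A\otimes\Phi_B) = S_{CB,\min}(\Phi_A) + S_{CB,\min}(\Phi_B).$$

Proof. The result follows easily from the observations above and (2.5).
$$\begin{aligned}
S_{CB,\min}(\Phi_A\otimes\Phi_B) &= \lim_{p\to 1+}\frac{1 - [\omega_p(\Phi_A\otimes\Phi_B)]^p}{p-1} \\
&= \lim_{p\to 1+}\frac{1 - [\omega_p(\Phi_A)]^p[\omega_p(\Phi_B)]^p}{p-1} \\
&= \lim_{p\to 1+}\frac{1 - [\omega_p(\Phi_A)]^p}{p-1} + \lim_{p\to 1+}[\omega_p(\Phi_A)]^p\,\lim_{p\to 1+}\frac{1 - [\omega_p(\Phi_B)]^p}{p-1} \\
&= S_{CB,\min}(\Phi_A) + S_{CB,\min}(\Phi_B),
\end{aligned} \tag{2.12}$$
where we used $\lim_{p\to 1+}[\omega_p(\Phi_A)]^p = 1$.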
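The algebra in (2.12) rests on the identity $1 - ab = (1-a) + a(1-b)$ applied to a multiplicative quantity. A commutative toy model makes this concrete: for a product state the eigenvalues are products, so $\mathrm{Tr}\,(\rho\otimes\sigma)^p = \mathrm{Tr}\,\rho^p\,\mathrm{Tr}\,\sigma^p$ and the same manipulation yields additivity of entropy. The sketch below is only an analogy under that assumption, with $\mathrm{Tr}\,\rho^p$ standing in for $[\omega_p]^p$; it does not compute any CB norm, and all names and sample spectra are hypothetical.

```python
import math


def trace_power(spectrum, p):
    """Tr rho^p computed from the eigenvalues of rho."""
    return sum(l ** p for l in spectrum if l > 0.0)


def quotient(spectrum, p):
    """(1 - Tr rho^p) / (p - 1), which tends to the entropy as p -> 1."""
    return (1.0 - trace_power(spectrum, p)) / (p - 1.0)


def entropy(spectrum):
    """von Neumann entropy computed from the eigenvalues."""
    return -sum(l * math.log(l) for l in spectrum if l > 0.0)


def product_spectrum(a, b):
    """Eigenvalues of the tensor product of two density matrices."""
    return [x * y for x in a for y in b]


# Hypothetical spectra of two density matrices.
a = [0.7, 0.3]
b = [0.5, 0.25, 0.25]
ab = product_spectrum(a, b)

p = 1.001
# Multiplicativity of the p-th power trace on product spectra.
print(trace_power(ab, p), trace_power(a, p) * trace_power(b, p))
# Splitting used in (2.12): 1 - xy = (1 - x) + x (1 - y).
print(quotient(ab, p), quotient(a, p) + trace_power(a, p) * quotient(b, p))
# Limiting statement: additivity of the entropy.
print(entropy(ab), entropy(a) + entropy(b))
```

In Theorem 3 the multiplicativity is the nontrivial input (2.5); in this toy model it holds trivially, which is why the example illustrates only the limiting step.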

This result relies on (2.5) which is a special case of Theorem 11 with $q = 1$. Recently, Jencova [21] found a simple direct proof of (2.5).

2.3. Proof of CB additivity from SSA. Recall that any CPT map $\Phi$ can be represented in the form
$$\Phi(\rho) = \mathrm{Tr}_E\, U_{AE}\,(\rho\otimes\tau_E)\,U_{AE}^\dagger \tag{2.13}$$
with $U_{AE}$ unitary and $\tau_E$ a pure reference state on the environment. The following key result follows from standard purification arguments (which are summarized in Appendix A).

Lemma 4. Let the CPT map $\Phi$ have a representation as in (2.13). One can find a reference system $R$ and a pure state $|\psi_{RA}\rangle\langle\psi_{RA}|$ such that $\mathrm{Tr}_R\,|\psi_{RA}\rangle\langle\psi_{RA}| = \rho$. Define $\gamma_{REA} = (I_R\otimes U_{AE})\,(|\psi_{RA}\rangle\langle\psi_{RA}|\otimes\tau_E)\,(I_R\otimes U_{AE})^\dagger$. Then $\gamma_{REA}$ is also pure and
$$S(\gamma_{EA}) - S(\gamma_E) = S(\gamma_R) - S(\gamma_{RA}), \tag{2.14}$$
where the reduced density matrices are defined via partial traces.

It follows from (1.8) that the conditional entropy is subadditive, i.e., for any state $\gamma_{E_1E_2A_1A_2}$,
$$S(\gamma_{E_1E_2A_1A_2}) - S(\gamma_{E_1E_2}) \le S(\gamma_{E_1A_1}) - S(\gamma_{E_1}) + S(\gamma_{E_2A_2}) - S(\gamma_{E_2}). \tag{2.15}$$
This was proved by Nielsen [37] and appears as Theorem 11.16 in [38]. It follows easily from the observation that (2.15) is the sum of the following pair of inequalities, which are special cases of SSA:
$$S(\gamma_{E_1E_2A_1A_2}) + S(\gamma_{E_1}) \le S(\gamma_{E_1A_1}) + S(\gamma_{E_1E_2A_2}),$$
$$S(\gamma_{E_1E_2A_2}) + S(\gamma_{E_2}) \le S(\gamma_{E_1E_2}) + S(\gamma_{E_2A_2}).$$

Multiplicativity of Completely Bounded p-Norms Implies New Additivity Result


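Now define
\[ S_{\mathrm{CB,min}}(\Phi) = \inf_\psi \bigl( S[(\mathcal{I} \otimes \Phi)(|\psi\rangle\langle\psi|)] - S[\mathrm{Tr}_A\, |\psi\rangle\langle\psi|] \bigr). \tag{2.16} \]
Let $\psi_{RA_1A_2}$ denote the minimizer for $\Phi_1 \otimes \Phi_2$ and
\[ \gamma_{R_1R_2A_1A_2E_1E_2} = (I_R \otimes U_{A_1E_1A_2E_2}) \bigl( |\psi_{RA_1A_2}\rangle\langle\psi_{RA_1A_2}| \otimes \tau_{E_1E_2} \bigr) (I_R \otimes U_{A_1E_1A_2E_2})^\dagger. \tag{2.17} \]
Then
\begin{align*}
S_{\mathrm{CB,min}}(\Phi_1 \otimes \Phi_2) &= S(\gamma_{R_1R_2A_1A_2}) - S(\gamma_{R_1R_2}) \\
 &= S(\gamma_{E_1E_2}) - S(\gamma_{E_1E_2A_1A_2}) \\
 &\ge S(\gamma_{E_1}) - S(\gamma_{E_1A_1}) + S(\gamma_{E_2}) - S(\gamma_{E_2A_2}). \tag{2.18}
\end{align*}
Next, use the lemma to find purifications $\psi_{R_1A_1}$ and $\psi_{R_2A_2}$ so that the last line above
\begin{align*}
 &= S(\gamma_{R_1A_1}) - S(\gamma_{R_1}) + S(\gamma_{R_2A_2}) - S(\gamma_{R_2}) \\
 &\ge S_{\mathrm{CB,min}}(\Phi_1) + S_{\mathrm{CB,min}}(\Phi_2). \tag{2.19}
\end{align*}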

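The reverse inequality can be obtained using product states.

3. Completely Bounded Norms

3.1. Definitions. For the applications in this paper, we can define the completely bounded (CB) norm of a map $\Phi : L_q(M_m) \to L_p(M_n)$ as
\[ \|\Phi\|_{CB,q\to p} \equiv \sup_d \|\mathcal{I}_d \otimes \Phi\|_{(\infty,q)\to(\infty,p)} = \sup_d \sup_Y \frac{\|(\mathcal{I}_d \otimes \Phi)(Y)\|_{(\infty,p)}}{\|Y\|_{(\infty,q)}} \tag{3.1} \]
with
\[ \|Y\|_{(\infty,p)} \equiv \|Y\|_{L_\infty(M_d;L_p(M_n))} = \sup_{A,B\in M_d} \frac{\|(A \otimes I_n)\,Y\,(B \otimes I_n)\|_p}{\|A\|_{2p}\,\|B\|_{2p}}. \tag{3.2} \]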

Effros and Ruan [12, 13] introduced the norm $\|Y\|_{(1,p)}$. Pisier [40, 41] subsequently used complex interpolation between them to define a norm $\|Y\|_{(t,p)}$ for any $1 < t < \infty$. He showed (Theorem 1.5 in [41]) that the norm obtained by this procedure satisfies
\[ \|Y\|_{(t,p)} \equiv \|Y\|_{L_t(M_d;L_p(M_n))} = \inf_{\substack{Y = (A\otimes I_n) Z (B\otimes I_n) \\ A,B\in M_d}} \|A\|_{2t}\,\|B\|_{2t}\,\|Z\|_{(\infty,p)}, \tag{3.3} \]

which we can regard as its definition. The vector space $M_d \otimes M_n$ equipped with the norm (3.3) is a Banach space which we denote by $L_t(M_d; L_p(M_n))$. Given an operator $\Omega : L_t(M_d; L_q(M_m)) \to L_s(M_d; L_p(M_n))$, the usual norm for linear maps from one Banach space to another becomes
\[ \|\Omega\| \equiv \|\Omega\|_{(t,q)\to(s,p)} = \sup_{Q \in M_d\otimes M_m} \frac{\|\Omega(Q)\|_{(s,p)}}{\|Q\|_{(t,q)}}. \tag{3.4} \]


Theorem 1.5 and Lemma 1.7 in Pisier [41] show that one can use this norm to obtain another expression for the CB norm
\[ \|\Phi\|_{CB,q\to p} \equiv \sup_d \|\mathcal{I}_d \otimes \Phi\|_{(t,q)\to(t,p)} = \sup_d \sup_Y \frac{\|(\mathcal{I}_d \otimes \Phi)(Y)\|_{(t,p)}}{\|Y\|_{(t,q)}} \tag{3.5} \]
valid for all $t \ge 1$. In effect, we can replace $\infty$ in (3.1) by any $t \ge 1$. In working with the CB norm, we will find it convenient to choose $t = q$ when $q \le p$ and $t = p$ when $q \ge p$. Thus our working definition of the CB norm is (3.5) with $t = \min\{q, p\}$. For the applications considered in Sect. 2 and 5, this becomes $t = q = 1$.

Remark. When $X > 0$, Hölder's inequality implies
\[ \|A X B^\dagger\|_p \le \|A X A^\dagger\|_p^{1/2}\, \|B X B^\dagger\|_p^{1/2} \le \max\{ \|A X A^\dagger\|_p ,\, \|B X B^\dagger\|_p \} \]
and the unitary invariance of the norm implies that $\|A X A^\dagger\|_p = \|\, |A|\, X\, |A| \,\|_p$. Therefore, when $X \ge 0$, we can replace any expression of the form $\sup_{A,B} \|A X B^\dagger\|_p$ by $\sup_{A>0} \|A X A\|_p$ irrespective of what other restrictions may be placed upon $A, B$. We will show that for CP maps, the CB norm is unchanged if the supremum is taken over $Y > 0$. (See Sect. 3.2, and Theorem 12 and Corollary 14 in Sect. 4.) Thus, when working with CP maps, one can generally assume that $A = B > 0$ in expressions for $\|Y\|_{(q,p)}$. When $Y > 0$, combining (3.2) and (3.3) gives the identity
\[ \|Y\|_{(p,p)} = \inf_{\substack{B>0 \\ \mathrm{Tr}\,B=1}} \; \sup_{\substack{A>0 \\ \mathrm{Tr}\,A=1}} \bigl\| (A \otimes I_n)^{\frac{1}{2p}} (B \otimes I_n)^{-\frac{1}{2p}}\, Y\, (B \otimes I_n)^{-\frac{1}{2p}} (A \otimes I_n)^{\frac{1}{2p}} \bigr\|_p \tag{3.6} \]

for all $p \ge 1$. Since Theorem 7 implies that $\|Y\|_{(p,p)} = \|Y\|_p$, this gives a variational expression for the usual $p$-norm on $M_{dn} \simeq M_d \otimes M_n$. The choice $n = 1$ yields a max-min principle for the $p$-norm on $M_d$.

The Banach space $L_t(M_d; L_p(M_n))$ is a special case of a more general Banach space $L_t(M_d; E)$ for which a norm is defined on $d \times d$ matrices with entries in an operator space $E$ as described in Sect. 3.3. Because we use here only operators $\Phi : L_q(M_m) \to L_p(M_n)$ rather than the general situation of operators $E \to F$ between Banach spaces $E, F$, we give explicit expressions only for norms on $L_t(M_d; L_p(M_n))$. On a few occasions we need to consider spaces $L_t(M_d; E)$ with $E = L_q(M_m; L_p(M_n))$; we denote the norm on these spaces by $\|Y\|_{(t,q,p)}$. In general we will only encounter triples with two distinct indices and will not need additional expressions for these norms. Such cases as $\|Y\|_{(q,q,p)}$ reduce to $L_q(M_{dm}; L_p(M_n))$ via the isomorphism $M_{dm} \otimes M_n \simeq M_d \otimes M_m \otimes M_n$; most situations require only comparisons via Minkowski type inequalities given in Sect. 3.4. In Sect. 3.5 we show that $\|Y\|_{(1,p,1)} = \|Y\|_{(1,p)}$; this is needed only for the application in Sect. 6.

3.2. An important lemma. We illustrate the use of (3.3) by proving the following lemma, which is a special case of a more general result in [24]. It plays a key role in the multiplicativity results of Sect. 4.2 for $q \ge p$. Although not needed for our main result, it also has important implications when $q \le p$. We first define
\[ \|\Phi\|^+_{q\to p} = \sup_{Q>0} \frac{\|\Phi(Q)\|_p}{\|Q\|_q}. \]


Lemma 5. Let $\Phi : L_q(M_m) \to L_p(M_n)$ be a CP map. Then for every $r \ge 1$ the map $\Phi \otimes \mathcal{I}_d : L_q(M_m; L_r(M_d)) \to L_p(M_n; L_r(M_d))$ satisfies
\[ \|\Phi \otimes \mathcal{I}_d\|_{(q,r)\to(p,r)} \le \|\Phi\|^+_{q\to p}. \tag{3.7} \]

Proof of Lemma. For any $Q$, (3.3) implies that one can find $A, Y, B$ such that $Q = (A \otimes I)\,Y\,(B \otimes I)$ and $\|Q\|_{(q,r)} = \|A\|_{2q}\,\|B\|_{2q}\,\|Y\|_{(\infty,r)}$. Since $\Phi$ is CP, one can find $K_j$ satisfying (3.29). Let $V_A$ denote the block row vector with elements $(K_1 A \otimes I_d,\, K_2 A \otimes I_d,\, \ldots,\, K_\nu A \otimes I_d)$, and similarly for $B$. Then
\[ (\Phi \otimes \mathcal{I}_d)(Q) = V_A\,(I_\nu \otimes Y)\,V_B^\dagger = \sum_j (K_j A \otimes I_d)\, Y\, (B K_j^\dagger \otimes I_d). \tag{3.8} \]
(Note that $I_\nu \otimes Y$ denotes a block diagonal matrix with $Y$ along the diagonal, with $I_\nu$ the identity in an additional reference space used to implement the representation (3.29). $Y$ itself is in the tensor product space $M_m \otimes M_d$ on which $\Phi \otimes \mathcal{I}_d$ acts; $K_j$ and $A$ are in $M_m$. We can extend $V$ to an element of $M_\nu \otimes M_m \otimes M_d$ by adding rows of zero blocks; i.e., to $\sum_{i,j=1}^{\nu} \delta_{i1}\, |i\rangle\langle j| \otimes K_j A \otimes I_d$.) Therefore, applying (3.3) on this extended space gives
\begin{align*}
\|(\Phi \otimes \mathcal{I}_d)(Q)\|_{(p,r)}
 &\le \|V_A V_A^\dagger\|_p^{1/2}\, \|V_B V_B^\dagger\|_p^{1/2}\, \|I_\nu \otimes Y\|_{(\infty,\infty,r)} \tag{3.9} \\
 &= \Bigl\| \sum_j K_j A A^\dagger K_j^\dagger \Bigr\|_p^{1/2}\, \Bigl\| \sum_j K_j B^\dagger B K_j^\dagger \Bigr\|_p^{1/2}\, \|Y\|_{(\infty,r)} \\
 &= \|\Phi(A A^\dagger)\|_p^{1/2}\, \|\Phi(B^\dagger B)\|_p^{1/2}\, \|Y\|_{(\infty,r)} \\
 &\le \|\Phi\|^+_{q\to p}\, \|A A^\dagger\|_q^{1/2}\, \|B^\dagger B\|_q^{1/2}\, \|Y\|_{(\infty,r)} \\
 &= \|\Phi\|^+_{q\to p}\, \|Q\|_{(q,r)}, \tag{3.10}
\end{align*}
where we used $\|A^\dagger A\|_q = \|A A^\dagger\|_q = \|A\|_{2q}^2$.

The following corollary implies that for any $p, q$, the norm $\|\Phi\|_{q\to p}$ is achieved on a positive semi-definite matrix $Q > 0$. This was proved earlier by Watrous [52], resolving a question raised in [27]. In Sect. 4, we will see that a similar result holds for CB norms of CP maps. This is stated as Theorem 12 for $q \le p$ and Corollary 14 for $q \ge p$.

Corollary 6. Let $\Phi : L_q(M_m) \to L_p(M_n)$ be a CP map. Then for all $q, p \ge 1$, the norm $\|\Phi\|_{q\to p} = \|\Phi\|^+_{q\to p}$.

Proof. The choice $d = 1$ in Lemma 5 gives $\|\Phi\|_{q\to p} \le \|\Phi\|^+_{q\to p}$. Since the reverse inequality always holds, the result follows.

Note that one can similarly conclude that $\sup_d \|\Phi \otimes \mathcal{I}_d\|_{(q,t)\to(p,t)} = \|\Phi\|^+_{q\to p}$ so that nothing would be gained by defining an alternative to the CB norm in this way. In Sect. 5 we show that the depolarizing channel gives an explicit example of a map with $\|\Phi\|_{CB,1\to p} > \|\Phi\|_{1\to p}$. It is worth commenting on the difference between this result and the proof by Amosov, Holevo and Werner [3] that $\|\mathcal{I} \otimes \Phi\|_{(1,1)\to(p,p)} = \|\Phi\|_{1\to p}$. In the latter, the identity is viewed as an isometry from one Banach space $L_q(M_d)$ to another, $L_p(M_d)$. In the case of the CB norm, the identity is viewed as a map from the Banach space $L_t(M_d)$ onto itself. Thus, we consider $\mathcal{I}_d \otimes \Phi$ with $\mathcal{I}_d : L_t(M_d) \to L_t(M_d)$ and $\Phi : L_q(M_m) \to L_p(M_n)$, for which we need to consider what norm should be used on


the domain $M_d \otimes M_m$ if $t \ne q$, or on the range $M_d \otimes M_n$ if $t \ne p$. When $q \ne p$ this question is unavoidable. One needs a norm which acts like $L_t$ on $M_d$ and $L_p$ on $M_n$, and (3.2) provides such a norm. Some of the motivation for the definitions used here is sketched in the next section. For a discussion of the stability properties of $\|\mathcal{I} \otimes \Phi\|_{(q,q)\to(p,p)}$ see Kitaev [28] and Watrous [52]. Note that in the case $q = p$, the two types of norms for the extension $\Phi \otimes \mathcal{I}_d$ coincide and our results imply that for CP maps $\|\Phi\|_{CB,p\to p} = \|\Phi \otimes \mathcal{I}_d\|_{(p,p)\to(p,p)} = \|\Phi\|^+_{p\to p}$. However, for measuring the difference between channels [28, 52], one is primarily interested in maps of the form $\Phi_1 - \Phi_2$ which are not CP.

3.3. Operator spaces. The Banach space $E = L_p(M_n)$ together with the sequence of norms on the spaces $L_\infty(M_d; L_p(M_n))$ with $d = 1, 2, \ldots$ form what is known as an operator space. More generally, an operator space is a Banach space $E$ and a sequence of norms defined on the spaces $M_d(E)$, whose elements are $d \times d$ matrices with elements in $E$, with certain properties that guarantee that $E$ can be embedded in $B(\mathcal{H})$, the bounded operators on some Hilbert space $\mathcal{H}$. Alternatively, one can begin with a subspace $E \subset B(\mathcal{H})$; then the norm in $M_d(E)$ is given by the inclusion $M_d(E) \subset M_d(B(\mathcal{H})) \simeq B(\mathcal{H}^{\otimes d})$ consistent with interpreting an element of $M_d(E)$ as a block matrix. (Usually such a situation is considered a concrete operator space in contrast to an abstract operator space given by matrix norms satisfying Ruan's axioms [13, 42, 45].) The only operator spaces we use in this paper are those with $E = L_p(M_n)$ and, occasionally, $E = L_t(M_d; L_p(M_n))$. Although a concrete representation for even these spaces is not known, the explicit expressions for the norms given in Sect. 3.1 and 3.5 suffice for many purposes.
(The reader who wishes to explore the literature should be aware that most of it is written in terms of $L_t(M_d; E)$ rather than $L_t(M_d; L_p(M_n))$ and that the notation $S_t(M_d; E)$ (for Schatten norm) is more common than $L_t$.)

For maps $\Phi$ from $B(\mathcal{H})$ to $B(\mathcal{K})$ complete boundedness is just uniform boundedness for the sequence of norms of $\mathcal{I}_d \otimes \Phi$. This notion is built in a manner analogous to the familiar notion of complete positivity. In a similar way, one can define other "complete" notions, such as complete isometry based on the behavior of $\mathcal{I}_d \otimes \Phi$.

The particular type of operator space considered here is called a "vector-valued $L_p$ space". We have already remarked on the need to define a norm on $L_t(M_d; L_p(M_n))$ to give a non-commutative generalization of the classical Banach space $\ell_t(\ell_p)$. Unfortunately, such naive generalizations as $\bigl( \sum_{jk} \|Y_{jk}\|_p^t \bigr)^{1/t}$ or $\bigl[ \mathrm{Tr}_1 (\mathrm{Tr}_2 |Y|^p)^{t/p} \bigr]^{1/t}$ do not even define norms. The norms described in Sect. 3.1, although difficult to work with, yield an elegant structure with the following properties:

a) For the subalgebra of diagonal matrices the norm on $L_t(M_d; L_p(M_n))$ reduces to that on $\ell_t(\ell_p)$.

b) When $Y = A \otimes B$ is a tensor product, $\|Y\|_{(t,p)} = \|A\|_t\, \|B\|_p = (\mathrm{Tr}\, |A|^t)^{1/t}\, (\mathrm{Tr}\, |B|^p)^{1/p}$.

c) The Banach space duality between $L_p$ and $L_{p'}$ with $\frac{1}{p} + \frac{1}{p'} = 1$ generalizes to
\[ L_q(M_d; L_p(M_n))^* = L_{q'}(M_d; L_{p'}(M_n)). \tag{3.11} \]

d) The collection of norms on $\{L_t(M_d; L_p(M_n))\}$ can be obtained from some (abstract) embedding of $L_p(M_d)$ into $B(\mathcal{H})$ providing the operator space structure of $L_p(M_d)$.

e) The structure of $L_t(M_d; L_p(M_n))$ can be used to develop a theory of vector-valued non-commutative integration which generalizes the theory of non-commutative integration developed by Segal [46] and Nelson [36].


Although not used explicitly, properties (c) and (e) play an important role in our results. Consequences of (e) described in Sect. 3.4 play a key role in the proofs in Sect. 4 and Sect. 6. Theorem 10, which gives the simple expression (1.6) for the CB norm in the case $1 \to p$, is an immediate consequence of a fundamental duality theorem. For general information on operator spaces, see Paulsen [39], Effros and Ruan [13] or Pisier [42]. The theory of non-commutative vector valued $L_p$ spaces was developed by Pisier in two monographs [40] and [41]. Additional developments can be found in [22] and [42].

3.4. Fubini and Minkowski generalizations. Because vector valued $L_p$-spaces permit the development of a consistent theory of vector-valued non-commutative integration, one would expect generalizations of fundamental integration theorems. This is indeed the case, and analogues of both Fubini's theorem and Minkowski's inequality play an important role in the results that follow. First, Theorem 1.9 in [41] gives a non-commutative version of Fubini's theorem.

Theorem 7. For any $1 \le p \le \infty$, the isomorphisms $L_p(M_d; L_p(M_n)) \simeq L_p(M_d \otimes M_n) \simeq L_p(M_{dn})$ hold in the sense of complete isometry, which implies that for all $W \in M_d \otimes M_n$,
\[ \|W\|_{L_p(M_d;L_p(M_n))} = \|W\|_{L_p(M_n;L_p(M_d))} = \|W\|_p = \bigl( \mathrm{Tr}\, |W|^p \bigr)^{1/p}. \tag{3.12} \]

The next result, which is Theorem 1.10 in [41], will lead to non-commutative versions of Minkowski's inequality and deals with the flip map $F$ which takes $A \otimes B \mapsto B \otimes A$ and is then extended by linearity to arbitrary elements of a tensor product space so that $W_{12} \mapsto W_{21}$.

Theorem 8. For $q \le p$, the flip map $F : L_q(M_d; L_p(M_n)) \to L_p(M_n; L_q(M_d))$ is a complete contraction.

The fact that $F$ is a contraction yields an analogue of Minkowski's inequality for matrices,
\[ \|W_{21}\|_{(p,q)} = \|F(W_{12})\|_{(p,q)} \le \|W_{12}\|_{(q,p)} \quad \text{for } q \le p. \tag{3.13} \]
The fact that $F$ is a complete contraction means that $\mathcal{I} \otimes F$ is also a contraction, which yields a triple Minkowski inequality
\[ \|W_{132}\|_{(q,p,q)} \le \|W_{123}\|_{(q,q,p)} \tag{3.14} \]
when $q \le p$.

Remark. To see why we regard (3.13) as a non-commutative version of Minkowski's inequality, recall the usual $\ell_p(\ell_q)$ version. For $t \ge 1$,
\[ \Bigl( \sum_j \Bigl( \sum_k |a_{jk}| \Bigr)^t \Bigr)^{1/t} \le \sum_k \Bigl( \sum_j |a_{jk}|^t \Bigr)^{1/t}, \]
and Carlen and Lieb [8] extended this to positive semi-definite matrices
\[ \bigl[ \mathrm{Tr}_1 (\mathrm{Tr}_2\, Q_{12})^t \bigr]^{1/t} \le \mathrm{Tr}_2 \bigl( \mathrm{Tr}_1\, Q_{12}^t \bigr)^{1/t}. \tag{3.15} \]
As in the case of the classical inequalities, (3.15) holds for $t \ge 1$ and the reverse inequality holds for $t \le 1$. Moreover, it follows that for $R \ge 0$,
\[ \bigl[ \mathrm{Tr}_1 \bigl( \mathrm{Tr}_2\, R_{12}^q \bigr)^{p/q} \bigr]^{1/p} \le \bigl[ \mathrm{Tr}_2 \bigl( \mathrm{Tr}_1\, R_{12}^p \bigr)^{q/p} \bigr]^{1/q} \quad \text{for } q \le p. \tag{3.16} \]
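For diagonal (commuting) matrices, (3.16) reduces to the classical mixed-norm Minkowski inequality, which is easy to test numerically. The sketch below is only a sanity check of that classical special case on random non-negative arrays; the function names, array sizes and exponent ranges are arbitrary illustrative choices, and it says nothing about the non-commutative statement itself.

```python
import random

def lhs_316(r, p, q):
    """[sum_j (sum_k r_jk^q)^(p/q)]^(1/p): the inner sum over k plays the role of Tr_2."""
    return sum(sum(x ** q for x in row) ** (p / q) for row in r) ** (1.0 / p)

def rhs_316(r, p, q):
    """[sum_k (sum_j r_jk^p)^(q/p)]^(1/q): the inner sum over j plays the role of Tr_1."""
    columns = list(zip(*r))
    return sum(sum(x ** p for x in col) ** (q / p) for col in columns) ** (1.0 / q)

random.seed(0)
violations = 0
for _ in range(1000):
    # Random 3 x 4 array of non-negative entries (the diagonal of R_12).
    r = [[random.random() for _ in range(4)] for _ in range(3)]
    q = 1.0 + random.random()        # 1 <= q <= 2
    p = q + 2.0 * random.random()    # q <= p
    if lhs_316(r, p, q) > rhs_316(r, p, q) + 1e-12:
        violations += 1
print(violations)
```

Swapping the roles of `p` and `q` in the sampling loop would test the reversed ordering, for which the inequality is not expected to hold in general.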


To see that (3.16) and (3.15) are equivalent, let $t = p/q$ and $Q_{12} = R_{12}^q$. Then raising both sides of (3.16) to the $q$-th power yields (3.15).

In general, the quantity $\bigl[ \mathrm{Tr}_1 (\mathrm{Tr}_2\, R^p)^{q/p} \bigr]^{1/q}$ does not define a norm. Carlen and Lieb [8] conjectured that $\mathrm{Tr}_1 (\mathrm{Tr}_2\, R^p)^{1/p}$ does define a norm for $1 \le p \le 2$, but proved it only in the case $p = 2$. (For $p > 2$ it can be shown not to be a norm.) Their conjecture is that
\[ \mathrm{Tr}_3 \bigl[ \mathrm{Tr}_2 (\mathrm{Tr}_1\, Q_{123})^t \bigr]^{1/t} \le \mathrm{Tr}_{1,3} \bigl( \mathrm{Tr}_2\, Q_{123}^t \bigr)^{1/t} \tag{3.17} \]
which is very similar in form to (3.14) with $q = 1$, $p = t$.

3.5. More facts about $L_q(M_d; L_p(M_n))$ norms. We now state two additional formulas for norms on $L_q(M_d; L_p(M_n))$. Although not needed for the main result, some consequences are needed for Theorem 12 and in Sect. 6. For detailed proofs see [22]. We state both under the assumption $1 \le q \le p \le \infty$ and $1/q = 1/p + 1/r$. Then
\[ \|Y\|_{(p,q)} \equiv \|Y\|_{L_p(M_d;L_q(M_n))} = \sup_{A,B\in M_d} \frac{\|(A \otimes I_n)\,Y\,(B \otimes I_n)\|_q}{\|A\|_{2r}\,\|B\|_{2r}} \tag{3.18} \]
and
\[ \|Y\|_{(q,p)} \equiv \|Y\|_{L_q(M_d;L_p(M_n))} = \inf_{\substack{Y = (A\otimes I_n) Z (B\otimes I_n) \\ A,B\in M_d}} \|A\|_{2r}\,\|B\|_{2r}\,\|Z\|_p. \tag{3.19} \]

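Moreover, when $Y > 0$ is positive semi-definite, one can restrict both optimizations to $A = B > 0$. In the case $X > 0$, $q = 1$ (so that $r = p'$ with $\frac{1}{p} + \frac{1}{p'} = 1$), (3.18) becomes
\begin{align*}
\|X_{12}\|_{(p,1)} &= \sup_{A>0} \frac{\|(A \otimes I_n)\, X_{12}\, (A \otimes I_n)\|_1}{\|A\|_{2p'}^2} \\
 &= \sup_{A>0} \frac{\mathrm{Tr}\, A^2 X_1}{\|A^2\|_{p'}} = \|X_1\|_p, \tag{3.20}
\end{align*}
and (3.19) can be rewritten as
\begin{align*}
\|X\|_{(1,p)} &= \inf_{\substack{A>0 \\ X = (A\otimes I_n) Z (A\otimes I_n)}} \|A\|_{2p'}^2\, \|Z\|_p \tag{3.21} \\
 &= \inf_{B>0,\ \|B\|_1 = 1} \bigl\| (B^{-1/2p'} \otimes I_n)\, X\, (B^{-1/2p'} \otimes I_n) \bigr\|_p \\
 &= \inf_{B>0,\ \|B\|_1 = 1} \bigl\| (B^{-\frac12(1-\frac1p)} \otimes I_n)\, X\, (B^{-\frac12(1-\frac1p)} \otimes I_n) \bigr\|_p. \tag{3.22}
\end{align*}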

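In Sect. 6, we will also need
\begin{align*}
\|W_{132}\|_{(1,p,1)} &= \|W_{132}\|_{L_1(M_d;L_p(M_n;L_1(M_m)))} \\
 &= \inf_{\substack{A\in M_d,\ A>0 \\ W_{132} = (A\otimes I_{32}) Z_{132} (A\otimes I_{32})}} \|A\|_{2p'}^2\, \|Z_{132}\|_{(p,p,1)} \tag{3.23} \\
 &= \inf_{B_1>0,\ \|B_1\|_1 = 1} \bigl\| (B_1^{-1/2p'} \otimes I_3 \otimes I_2)\, W_{132}\, (B_1^{-1/2p'} \otimes I_3 \otimes I_2) \bigr\|_{(p,p,1)} \\
 &= \inf_{B_1>0,\ \|B_1\|_1 = 1} \bigl\| (B_1^{-1/2p'} \otimes I_3)\, W_{13}\, (B_1^{-1/2p'} \otimes I_3) \bigr\|_{(p,p)} \tag{3.24} \\
 &= \|W_{13}\|_{(1,p)},
\end{align*}
where (3.23) is proved in [22] and the reductions which follow used (3.20) and (3.18).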


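Lemma 9. When $1 \le q \le p \le \infty$ and $X$ is a contraction, then
\[ \|C^\dagger X D\|_{(q,p)} \le \|C^\dagger C\|_{(q,p)}^{1/2}\, \|D^\dagger D\|_{(q,p)}^{1/2}. \tag{3.25} \]

Proof. It follows from (3.19) that one can find $A, B \in M_d$ and $Y, Z \in M_{dn}$ such that $A, B > 0$, $\|A\|_{2r} = \|B\|_{2r} = 1$, $Y, Z > 0$ and
\begin{align*}
C^\dagger C &= (A \otimes I_n)\, Y\, (A \otimes I_n), & \|C^\dagger C\|_{(q,p)} &= \|(A \otimes I_n)\, Y\, (A \otimes I_n)\|_p, \\
D^\dagger D &= (B \otimes I_n)\, Z\, (B \otimes I_n), & \|D^\dagger D\|_{(q,p)} &= \|(B \otimes I_n)\, Z\, (B \otimes I_n)\|_p.
\end{align*}
Moreover, there are partial isometries $V, W$ such that $C = V Y^{1/2} (A \otimes I_n)$ and $D = W Z^{1/2} (B \otimes I_n)$. Then
\[ C^\dagger X D = (A \otimes I_n)\, Y^{1/2}\, V^\dagger X W\, Z^{1/2}\, (B \otimes I_n) \tag{3.26} \]
and it follows from (3.19) and Hölder's inequality that
\begin{align*}
\|C^\dagger X D\|_{(q,p)} &\le \|(A \otimes I_n)\, Y^{1/2}\, V^\dagger X W\, Z^{1/2}\, (B \otimes I_n)\|_p \tag{3.27} \\
 &\le \|(A \otimes I_n)\, Y\, (A \otimes I_n)\|_p^{1/2}\; \|V^\dagger X W\|_\infty\; \|(B \otimes I_n)\, Z\, (B \otimes I_n)\|_p^{1/2} \\
 &= \|C^\dagger C\|_{(q,p)}^{1/2}\, \|D^\dagger D\|_{(q,p)}^{1/2}.
\end{align*}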

3.6. State representative of a map. A linear map $\Phi : M_d \to M_d$ can be associated with a block matrix in which the $j, k$ block is the matrix $\Phi(|e_j\rangle\langle e_k|)$ in the standard basis. This is often called the "Choi-Jamiolkowski matrix" or "state representative" in quantum information theory and will be denoted $X_\Phi$. Thus,
\[ X_\Phi = \sum_{jk} |e_j\rangle\langle e_k| \otimes \Phi(|e_j\rangle\langle e_k|). \tag{3.28} \]
Choi [9] showed that the map $\Phi$ is CP if and only if $X_\Phi$ is positive semi-definite. Conversely, given a (positive semi-definite) $d^2 \times d^2$ matrix $X$, one can use (3.28) to define a CP map $\Phi$. In addition, Choi showed that the eigenvectors of $X_\Phi$ can be rearranged to yield operators $K_j$ such that
\[ \Phi(Q) = \sum_j K_j Q K_j^\dagger. \tag{3.29} \]
This representation was obtained independently by Kraus [30, 31] and can be recovered from that of Stinespring [51]. For every CP map $\Phi$ with Choi matrix $X_\Phi$, it follows from (3.28) that
\begin{align*}
\|(A \otimes I)\, X_\Phi\, (A \otimes I)\|_p &= \Bigl\| \sum_{jk} A |e_j\rangle\langle e_k| A \otimes \Phi(|e_j\rangle\langle e_k|) \Bigr\|_p \tag{3.30} \\
 &= \|(\mathcal{I} \otimes \Phi)(|\psi_A\rangle\langle\psi_A|)\|_p,
\end{align*}
where the last equality follows if we choose $|\psi_A\rangle = \sum_j A |e_j\rangle \otimes |e_j\rangle$.


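Theorem 10. For any CP map $\Phi$,
\[ \|\Phi\|_{CB,1\to p} = \|X_\Phi\|_{(\infty,p)} = \sup_{\|\psi\|=1} \frac{\|(\mathcal{I} \otimes \Phi)(|\psi\rangle\langle\psi|)\|_p}{\|\mathrm{Tr}_2 (|\psi\rangle\langle\psi|)\|_p} \equiv \omega_p(\Phi). \tag{3.31} \]

Proof. This result requires a fundamental duality result proved by Blecher and Paulsen [7] and by Effros and Ruan [12, 13] and described in Sect. 2.3 of [42]. It states that
\[ \|\Phi\|_{CB,1\to p} = \|\Phi^*\|_{CB,p'\to\infty} = \|X_\Phi\|_{(\infty,p)}. \tag{3.32} \]
Using (3.2) gives
\begin{align*}
\|\Phi\|_{CB,1\to p} &= \sup_{A>0} \frac{\|(A \otimes I)\, X_\Phi\, (A \otimes I)\|_p}{\|A^2\|_p} \\
 &= \sup_\psi \frac{\|(\mathcal{I} \otimes \Phi)(|\psi\rangle\langle\psi|)\|_p}{\|\mathrm{Tr}_2 (|\psi\rangle\langle\psi|)\|_p}.
\end{align*}
Since the ratio is unchanged if $|\psi\rangle$ is multiplied by a constant, one can restrict the supremum above to $\|\psi\| = 1$.

4. Multiplicativity for CB Norms

4.1. $1 \le q \le p$. We now prove multiplicativity of the CB norm for maps $\Phi : L_q(M_m) \to L_p(M_n)$ with $q \le p$.

Theorem 11. Let $q \le p$ and $\Phi_A : L_q(M_{m_A}) \to L_p(M_{n_A})$ and $\Phi_B : L_q(M_{m_B}) \to L_p(M_{n_B})$ be CP and CB. Then
\[ \|\Phi_A \otimes \Phi_B\|_{CB,q\to p} = \|\Phi_A\|_{CB,q\to p}\, \|\Phi_B\|_{CB,q\to p}. \tag{4.1} \]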

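Proof. Let $Q_{CAB}$ be in $M_d \otimes M_{m_A} \otimes M_{m_B}$ and $R_{CAB} = (\mathcal{I}_d \otimes \mathcal{I}_{m_A} \otimes \Phi_B)(Q_{CAB})$. Then using (3.14), one finds
\begin{align*}
\|\Phi_A \otimes \Phi_B\|_{CB,q\to p}
 &= \sup_d \sup_{Q_{CAB}} \frac{\|(\mathcal{I}_d \otimes \Phi_A \otimes \Phi_B)(Q_{CAB})\|_{(q,p,p)}}{\|Q_{CAB}\|_{(q,q,q)}} \tag{4.2} \\
 &= \sup_d \sup_{Q_{CAB}} \frac{\|(\mathcal{I}_d \otimes \Phi_A \otimes \mathcal{I}_{n_B})(R_{CAB})\|_{(q,p,p)}}{\|R_{CAB}\|_{(q,q,p)}} \cdot \frac{\|(\mathcal{I}_d \otimes \mathcal{I}_{m_A} \otimes \Phi_B)(Q_{CAB})\|_{(q,q,p)}}{\|Q_{CAB}\|_{(q,q,q)}} \\
 &\le \sup_d \sup_{R_{CBA}} \frac{\|(\mathcal{I}_d \otimes \mathcal{I}_{n_B} \otimes \Phi_A)(R_{CBA})\|_{(q,p,p)}}{\|R_{CBA}\|_{(q,p,q)}} \cdot \frac{\|R_{CBA}\|_{(q,p,q)}}{\|R_{CAB}\|_{(q,q,p)}} \\
 &\qquad \times \sup_{Q_{CAB}} \frac{\|(\mathcal{I}_d \otimes \mathcal{I}_{m_A} \otimes \Phi_B)(Q_{CAB})\|_{(q,q,p)}}{\|Q_{CAB}\|_{(q,q,q)}} \tag{4.3} \\
 &\le \|\mathcal{I}_{n_B} \otimes \Phi_A\|_{CB,(p,q)\to(p,p)}\, \|\Phi_B\|_{CB,q\to p} \\
 &= \|\Phi_A\|_{CB,q\to p}\, \|\Phi_B\|_{CB,q\to p}. \tag{4.4}
\end{align*}
For the last two lines, we used $\|\mathcal{I}_n \otimes \Phi_A\|_{CB,(p,q)\to(p,p)}$ to denote the CB norm of $\mathcal{I}_n \otimes \Phi_A : L_p(M_n; L_q(M_m)) \to L_p(M_n; L_p(M_m))$ and then applied Corollary 1.2 in [41], which states that this is the same as the CB norm of $\Phi : L_q(M_m) \to L_p(M_m)$.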


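To prove the reverse direction, we need a slight modification of the standard strategy of showing that the bound can be achieved with a tensor product. It can happen that the CB norm itself is not attained for any finite $\|\mathcal{I}_d \otimes \Phi\|$ norm. Therefore, we first show that any finite product can be achieved, and then use the fact that the CB norm can be approximated arbitrarily closely by such a product. Thus, we begin with the observation that for any $d$ and $X, Y$ in the unit balls for $L_q(M_d \otimes M_m)$ and $L_q(M_d \otimes M_n)$, there exist $Q, R > 0$ in the unit ball of $L_{2q}(M_d)$ such that
\[ \|(Q \otimes I_m)\,(\mathcal{I}_d \otimes \Phi_A)(X)\,(Q \otimes I_m)\|_q = \|(\mathcal{I}_d \otimes \Phi_A)(X)\|_{L_q(M_d;L_p(M_m))} \tag{4.5} \]
and
\[ \|(R \otimes I_n)\,(\mathcal{I}_d \otimes \Phi_B)(Y)\,(R \otimes I_n)\|_q = \|(\mathcal{I}_d \otimes \Phi_B)(Y)\|_{L_q(M_d;L_p(M_n))}. \tag{4.6} \]
Then, using Theorem 7, one finds
\begin{align*}
\|\Phi_A \otimes \Phi_B\|_{CB,q\to p}
 &\ge \|[\mathcal{I}_{M_{d^2}} \otimes (\Phi_A \otimes \Phi_B)](X \otimes Y)\|_{L_q(M_{d^2};L_p(M_{mn}))} \\
 &\ge \|(Q \otimes R \otimes I_{mn})\,[\mathcal{I}_{d^2} \otimes (\Phi_A \otimes \Phi_B)](X \otimes Y)\,(Q \otimes R \otimes I_{mn})\|_q \\
 &= \|(Q \otimes I)[(\mathcal{I}_{M_d} \otimes \Phi_A)(X)](Q \otimes I)\|_q\; \|(R \otimes I)[(\mathcal{I}_{M_d} \otimes \Phi_B)(Y)](R \otimes I)\|_q \\
 &= \|(\mathcal{I}_{M_d} \otimes \Phi_A)(X)\|_{L_q(M_d;L_p(M_m))}\; \|(\mathcal{I}_{M_d} \otimes \Phi_B)(Y)\|_{L_q(M_d;L_p(M_n))}. \tag{4.7}
\end{align*}
Given $\epsilon > 0$, one can find $d, X, Y$ such that $\|\Phi_A\|_{CB,q\to p} < \epsilon + \|(\mathcal{I}_{M_d} \otimes \Phi_A)(X)\|_{L_q(M_d;L_p(M_m))}$ and $\|\Phi_B\|_{CB,q\to p} < \epsilon + \|(\mathcal{I}_{M_d} \otimes \Phi_B)(Y)\|_{L_q(M_d;L_p(M_n))}$. Inserting this in (4.7) above gives
\[ \|\Phi_A \otimes \Phi_B\|_{CB,q\to p} \ge \|\Phi_A\|_{CB,q\to p}\, \|\Phi_B\|_{CB,q\to p} - \epsilon \bigl( \|\Phi_A\|_{CB,q\to p} + \|\Phi_B\|_{CB,q\to p} \bigr) + O(\epsilon^2). \]
Since $\epsilon > 0$ is arbitrary, we can conclude that $\|\Phi_A \otimes \Phi_B\|_{CB,q\to p} \ge \|\Phi_A\|_{CB,q\to p}\, \|\Phi_B\|_{CB,q\to p}$.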

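The next result implies that for CP maps, it suffices to restrict the supremum in the CB norm to positive semi-definite matrices.

Theorem 12. When $q \le p$ and $\Phi : L_q(M_m) \to L_p(M_n)$ is CP, $\|\mathcal{I}_d \otimes \Phi\|_{(q,q)\to(q,p)}$ is achieved with a positive semi-definite matrix, i.e., $\|\mathcal{I}_d \otimes \Phi\|_{(q,q)\to(q,p)} = \|\mathcal{I}_d \otimes \Phi\|^+_{(q,q)\to(q,p)}$.

Proof. First use the polar decomposition of $Q \in M_{dm}$ to write $Q = Q_1^\dagger Q_2$ with $Q_1 = |Q|^{1/2} U$, $Q_2 = |Q|^{1/2}$, where $U$ is a partial isometry and $|Q| = (Q^\dagger Q)^{1/2}$. The matrix
\[ \begin{pmatrix} Q_1^\dagger \\ Q_2^\dagger \end{pmatrix} \begin{pmatrix} Q_1 & Q_2 \end{pmatrix} = \begin{pmatrix} Q_1^\dagger Q_1 & Q_1^\dagger Q_2 \\ Q_2^\dagger Q_1 & Q_2^\dagger Q_2 \end{pmatrix} = \begin{pmatrix} U^\dagger |Q| U & Q \\ Q^\dagger & |Q| \end{pmatrix} > 0 \tag{4.8} \]
is positive semi-definite. Since $\Phi$ is CP, so is $\mathcal{I} \otimes \Phi$, which implies that
\[ \begin{pmatrix} (\mathcal{I} \otimes \Phi)(U^\dagger |Q| U) & (\mathcal{I} \otimes \Phi)(Q) \\ (\mathcal{I} \otimes \Phi)(Q^\dagger) & (\mathcal{I} \otimes \Phi)(|Q|) \end{pmatrix} > 0 \tag{4.9} \]
is positive semi-definite. We now use the fact that a $2 \times 2$ block matrix $\begin{pmatrix} A & C \\ C^\dagger & B \end{pmatrix}$ with $A, B > 0$ is positive semi-definite if and only if $C = A^{1/2} X B^{1/2}$ with $X$ a contraction. Applying this to (4.9) gives
\[ (\mathcal{I} \otimes \Phi)(Q) = [(\mathcal{I} \otimes \Phi)(U^\dagger |Q| U)]^{1/2}\, X\, [(\mathcal{I} \otimes \Phi)(|Q|)]^{1/2} \tag{4.10} \]
with $X$ a contraction. Therefore, it follows from (3.25) that
\begin{align*}
\|(\mathcal{I} \otimes \Phi)(Q)\|_{(q,p)} &= \bigl\| [(\mathcal{I} \otimes \Phi)(U^\dagger |Q| U)]^{1/2}\, X\, [(\mathcal{I} \otimes \Phi)(|Q|)]^{1/2} \bigr\|_{(q,p)} \\
 &\le \|(\mathcal{I} \otimes \Phi)(U^\dagger |Q| U)\|_{(q,p)}^{1/2}\; \|(\mathcal{I} \otimes \Phi)(|Q|)\|_{(q,p)}^{1/2} \\
 &\le \|\mathcal{I} \otimes \Phi\|^+_{(q,q)\to(q,p)}\; \|U^\dagger |Q| U\|_q^{1/2}\; \|\,|Q|\,\|_q^{1/2} \tag{4.11} \\
 &= \|\mathcal{I} \otimes \Phi\|^+_{(q,q)\to(q,p)}\; \|Q\|_q.
\end{align*}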

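4.2. $q \ge p$.

Theorem 13. Let $q \ge p$ and $\Phi_A : L_q(M_{m_A}) \to L_p(M_{n_A})$, $\Phi_B : L_q(M_{m_B}) \to L_p(M_{n_B})$ be maps which are both CP. Then
\begin{align*}
\text{a)}\quad & \|\Phi\|_{CB,q\to p} = \|\Phi\|_{q\to p} = \|\Phi\|^+_{q\to p}, \tag{4.12} \\
\text{b)}\quad & \|\Phi_A \otimes \Phi_B\|_{q\to p} = \|\Phi_A\|_{q\to p}\, \|\Phi_B\|_{q\to p}, \tag{4.13} \\
\text{c)}\quad & \|\Phi_A \otimes \Phi_B\|_{CB,q\to p} = \|\Phi_A\|_{CB,q\to p}\, \|\Phi_B\|_{CB,q\to p}. \tag{4.14}
\end{align*}
Combining part (a) with Corollary 6 implies that it suffices to restrict the supremum in the CB norm to positive semi-definite matrices.

Corollary 14. When $q \ge p$ and $\Phi : L_q(M_m) \to L_p(M_n)$ is CP, $\|\mathcal{I}_d \otimes \Phi\|_{CB,q\to p}$ is achieved with a positive semi-definite matrix.

Proof of Theorem 13. To prove part (a), observe that
\begin{align*}
\|\Phi\|_{CB,q\to p} &= \sup_d \sup_{W_{AB} \in M_d \otimes M_m} \frac{\|(\mathcal{I}_d \otimes \Phi)(W_{AB})\|_p}{\|W_{AB}\|_{(p,q)}} \\
 &= \sup_d \sup_{W_{AB}} \frac{\|(\mathcal{I}_d \otimes \Phi)(W_{AB})\|_p}{\|W_{BA}\|_{(q,p)}} \cdot \frac{\|W_{BA}\|_{(q,p)}}{\|W_{AB}\|_{(p,q)}} \tag{4.15} \\
 &\le \sup_d \sup_{W_{BA} \in M_m \otimes M_d} \frac{\|(\Phi \otimes \mathcal{I}_d)(W_{BA})\|_p}{\|W_{BA}\|_{(q,p)}} \\
 &\le \|\Phi\|^+_{q\to p}. \tag{4.16}
\end{align*}
The first inequality follows from the fact that the second ratio in (4.15) is $\le 1$ by (3.13) and the last inequality then follows from (3.7). When $d = 1$, the supremum over $W$ of the ratio in (4.15) is precisely $\|\Phi\|_{q\to p}$, which implies $\|\Phi\|_{CB,q\to p} \ge \|\Phi\|_{q\to p}$. This proves part (a).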


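To prove part (b), write $\Phi_A \otimes \Phi_B = (\Phi_A \otimes \mathcal{I})(\mathcal{I} \otimes \Phi_B)$ and for any $Q_{AB} \in M_{m_A} \otimes M_{m_B}$, let $R_{AB} = (\mathcal{I} \otimes \Phi_B)(Q)$. Then
\begin{align*}
\|\Phi_A \otimes \Phi_B\|_{q\to p} &= \sup_Q \frac{\|(\Phi_A \otimes \Phi_B)(Q)\|_p}{\|Q\|_q} \tag{4.17} \\
 &\le \sup_Q \frac{\|(\Phi_A \otimes \mathcal{I})(R_{AB})\|_p}{\|R_{AB}\|_{(q,p)}} \cdot \frac{\|R_{AB}\|_{(q,p)}}{\|R_{BA}\|_{(p,q)}} \cdot \frac{\|(\Phi_B \otimes \mathcal{I})(Q_{BA})\|_{(p,q)}}{\|Q\|_q} \\
 &\le \sup_R \frac{\|(\Phi_A \otimes \mathcal{I})(R)\|_{(p,p)}}{\|R\|_{(q,p)}}\; \sup_{Q_{BA}} \frac{\|(\Phi_B \otimes \mathcal{I})(Q_{BA})\|_{(p,q)}}{\|Q_{BA}\|_{(q,q)}} \tag{4.18} \\
 &\le \|\Phi_A\|_{q\to p}\, \|\Phi_B\|_{q\to p},
\end{align*}
where we used (3.13), Fubini, and $R_{BA} = (\Phi_B \otimes \mathcal{I})(Q_{BA})$. This proves (b). Part (c) then follows immediately from (a) and (b).

5. Applications of CB Entropy

5.1. Examples and bounds. It is well-known that conditional information can be negative as well as positive. Therefore, it is not surprising that (1.1) can also be either positive or negative, depending on the channel $\Phi$. As in Sect. 1, we adopt the convention that $\gamma_{12} = (\mathcal{I} \otimes \Phi)(|\psi\rangle\langle\psi|)$. One has the general bounds
\[ -S(\gamma_1) \le S_{\mathrm{CB,min}}(\Phi) \le S(\gamma_2) \tag{5.1} \]
which imply
\[ -\log d \le S_{\mathrm{CB,min}}(\Phi) \le \log d. \tag{5.2} \]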

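The lower bound in (5.2) follows from the definition (1.3) and the positivity of the entropy $S(\gamma_{12}) > 0$; the upper bound follows from subadditivity $S(\gamma_{12}) \le S(\gamma_1) + S(\gamma_2)$. The upper bound is attained if and only if the output $(\mathcal{I} \otimes \Phi)(|\psi\rangle\langle\psi|)$ is always a product. The lower bound in (5.2) is attained for the identity channel, and the upper bound for the completely noisy channel $\Phi(\rho) = (\mathrm{Tr}\,\rho)\, \frac{1}{d} I$.

Next, consider the depolarizing channel $\Phi_\mu(\rho) = \mu\rho + (1-\mu)(\mathrm{Tr}\,\rho)\, \frac{1}{d} I$. This channel satisfies the covariance condition $U \Phi(\rho) U^* = \Phi(U \rho U^*)$ for all unitary $U$. Lemma 2 in the appendix of [21] can therefore be used to show that the minimal CB entropy is achieved when $\gamma_1 = \mathrm{Tr}_2\, (\mathcal{I} \otimes \Phi)(|\psi\rangle\langle\psi|)$ is the maximally mixed state $\frac{1}{d} I$, so that $|\psi\rangle$ is maximally entangled and $S_{\mathrm{CB,min}}(\Phi_\mu) = S\bigl(\tfrac{1}{d} X_{\Phi_\mu}\bigr) - \log d$. Moreover, the state $(\mathcal{I} \otimes \Phi_\mu)(|\psi\rangle\langle\psi|)$ has one non-degenerate eigenvalue
\[ \frac{1 + (d^2-1)\mu}{d^2} \tag{5.3} \]
and the eigenvalue $\frac{1-\mu}{d^2}$ with multiplicity $d^2 - 1$. From this one finds
\[ \omega_p(\Phi_\mu) = d^{-\frac{p+1}{p}} \bigl[ (1 - \mu + d^2\mu)^p + (d^2 - 1)(1 - \mu)^p \bigr]^{1/p} \tag{5.4} \]
and
\begin{align*}
S_{\mathrm{CB,min}}(\Phi_\mu) &= -\frac{1 - \mu + d^2\mu}{d^2} \log \frac{1 - \mu + d^2\mu}{d^2} - (d^2 - 1)\, \frac{1-\mu}{d^2} \log \frac{1-\mu}{d^2} - \log d \tag{5.5} \\
 &= \log d - \frac{1}{d^2} \bigl[ (1 - \mu + d^2\mu) \log(1 - \mu + d^2\mu) + (d^2 - 1)(1 - \mu) \log(1 - \mu) \bigr].
\end{align*}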
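The two forms of (5.5) and the bounds (5.2) can be compared numerically. The Python sketch below is an illustration only: it assumes the eigenvalues (5.3) with the stated multiplicities, uses natural logarithms, and the function names and the grid of $\mu$ values are arbitrary choices.

```python
import math

def xlogx(x):
    """x log x with the usual convention 0 log 0 = 0."""
    return x * math.log(x) if x > 0.0 else 0.0

def s_cb_min_from_eigenvalues(mu, d):
    """First line of (5.5): S(gamma_12) - log d, computed from the eigenvalues."""
    lam1 = (1.0 + (d * d - 1) * mu) / (d * d)   # non-degenerate eigenvalue (5.3)
    lam2 = (1.0 - mu) / (d * d)                 # eigenvalue of multiplicity d^2 - 1
    entropy_12 = -xlogx(lam1) - (d * d - 1) * xlogx(lam2)
    return entropy_12 - math.log(d)

def s_cb_min_simplified(mu, d):
    """Second line of (5.5)."""
    x1 = 1.0 - mu + d * d * mu
    x2 = 1.0 - mu
    return math.log(d) - (xlogx(x1) + (d * d - 1) * xlogx(x2)) / (d * d)

d = 3
grid = [k / 20.0 for k in range(21)]            # mu = 0, 0.05, ..., 1
values = [s_cb_min_simplified(mu, d) for mu in grid]
max_gap = max(abs(s_cb_min_from_eigenvalues(mu, d) - v) for mu, v in zip(grid, values))
print(max_gap, values[0], values[-1])
```

On this grid the values run from $\log d$ at $\mu = 0$ (completely noisy channel) down to $-\log d$ at $\mu = 1$ (identity channel), matching the two extreme cases discussed above.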

In the case of qubits, $d = 2$ and (5.4) becomes
\[ \|\Phi_\mu\|_{CB,1\to p} = \omega_p(\Phi_\mu) = 2^{-(p+1)/p} \bigl[ (1 + 3\mu)^p + 3(1 - \mu)^p \bigr]^{1/p} \tag{5.6} \]
which can be compared to
\[ \|\Phi_\mu\|_{1\to p} = \nu_p(\Phi_\mu) = 2^{-1} \bigl[ (1 + \mu)^p + (1 - \mu)^p \bigr]^{1/p}. \tag{5.7} \]
The strict convexity of $f(x) = x^p$ implies that for $\mu > 0$,
\[ (1 + \mu)^p = \Bigl[ \frac{(1 + 3\mu) + (1 - \mu)}{2} \Bigr]^p < \frac{1}{2} \bigl[ (1 + 3\mu)^p + (1 - \mu)^p \bigr], \]
so that $(1 + \mu)^p + (1 - \mu)^p < \frac{1}{2} \bigl[ (1 + 3\mu)^p + 3(1 - \mu)^p \bigr]$, from which it follows that $\|\Phi_\mu\|_{CB,1\to p} > \|\Phi_\mu\|_{1\to p}$. This confirms that, in general, the CB norm $\|\Phi\|_{CB,1\to p}$ of a map is strictly greater than $\|\Phi\|_{1\to p}$. (This can be seen directly for the identity map $\mathcal{I}$, which corresponds to $\mu = 1$.) For qubits, one can verify explicitly that $S_{\mathrm{CB,min}}(\Phi)$ is achieved with a maximally entangled state and that it decreases monotonically with $\mu$. Numerical work [11] shows that $S_{\mathrm{CB,min}}(\Phi)$ changes from positive to negative at $\mu = 0.74592$, which is also the cut-off for $C_Q(\Phi) = 0$.

The Werner-Holevo channel [53] is $\Phi_{WH}(\rho) = \frac{1}{d-1} \bigl[ (\mathrm{Tr}\,\rho) I - \rho^T \bigr]$. One finds that $\gamma_{12}$ has exactly $\binom{d}{2}$ non-zero eigenvalues $\frac{1}{d-1}(a_j^2 + a_k^2)$ with $j < k$ and $a_j^2$ the eigenvalues of $\gamma_1$. One can then use the concavity of $-x \log x$ to show that $S(\gamma_{12}) \ge S(\gamma_1) + \log \frac{d-1}{2}$, which implies that $S_{\mathrm{CB,min}}(\Phi_{WH}) = \log \frac{d-1}{2}$ is achieved with a maximally entangled input. Moreover, $S_{\mathrm{CB,min}}(\Phi_{WH}) = -1$ for $d = 2$, and $S_{\mathrm{CB,min}}(\Phi_{WH}) = 0$ for $d = 3$. One can also use the covariance property $\Phi_{WH}(U \rho U^*) = \overline{U}\, \Phi_{WH}(\rho)\, U^T$ and Lemma 2 of [21] to see that $\omega_p(\Phi_{WH})$ is achieved with a maximally entangled state, and verify that
\[ \omega_p(\Phi_{WH}) = \Bigl( \frac{2}{d-1} \Bigr)^{1 - \frac{1}{p}} > \Bigl( \frac{1}{d-1} \Bigr)^{1 - \frac{1}{p}} = \nu_p(\Phi_{WH}). \tag{5.8} \]
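The comparisons in (5.6)-(5.8) are simple enough to evaluate directly. The following Python sketch is a numerical illustration, not part of the argument: it evaluates the closed forms (5.6) and (5.7), and it recomputes $\omega_p(\Phi_{WH})$ from the eigenvalue description above under the assumption that the input is maximally entangled, so that $\gamma_1 = I/d$. All function names and sample values are hypothetical.

```python
import math

def omega_depolarizing_qubit(mu, p):
    """CB norm (5.6) of the qubit depolarizing channel."""
    return 2.0 ** (-(p + 1.0) / p) * ((1 + 3 * mu) ** p + 3 * (1 - mu) ** p) ** (1.0 / p)

def nu_depolarizing_qubit(mu, p):
    """Maximal output p-norm (5.7) of the qubit depolarizing channel."""
    return 0.5 * ((1 + mu) ** p + (1 - mu) ** p) ** (1.0 / p)

def omega_wh_from_spectrum(d, p):
    """||gamma_12||_p / ||gamma_1||_p for a maximally entangled input.

    With gamma_1 = I/d, the output has binom(d, 2) eigenvalues equal to
    (1/(d-1)) * (1/d + 1/d) = 2 / (d (d - 1))."""
    count = math.comb(d, 2)
    norm_12 = (count * (2.0 / (d * (d - 1))) ** p) ** (1.0 / p)
    norm_1 = (d * (1.0 / d) ** p) ** (1.0 / p)
    return norm_12 / norm_1

def omega_wh_closed_form(d, p):
    """Left-hand side of (5.8)."""
    return (2.0 / (d - 1)) ** (1.0 - 1.0 / p)

# Smallest gap omega_p - nu_p over a grid of mu in (0, 1] and a few p > 1.
gaps = [omega_depolarizing_qubit(mu, p) - nu_depolarizing_qubit(mu, p)
        for mu in (0.1, 0.25, 0.5, 0.75, 1.0) for p in (1.5, 2.0, 3.0)]
# Largest discrepancy between the spectral computation and the closed form (5.8).
wh_error = max(abs(omega_wh_from_spectrum(d, p) - omega_wh_closed_form(d, p))
               for d in range(2, 7) for p in (1.5, 2.0, 4.0))
print(min(gaps), wh_error)
```

At $p = 1$ both norms in (5.6) and (5.7) equal 1, so the strict inequality is only expected for $p > 1$, which is why the grid starts at $p = 1.5$.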

This gives another example for which the CB norm is strictly greater than $\|\Phi\|_{1\to p}$. However, the CB norm is not always attained on a maximally entangled state. Consider for example the non-unital qubit map $\Phi(\rho) = \lambda\rho + \bigl[ \frac{1-\lambda}{2} I + \frac{t}{2} \sigma_3 \bigr] \mathrm{Tr}\,\rho$, and the one-parameter family of pure bipartite states $|\psi_a\rangle = \sqrt{a}\, |00\rangle + \sqrt{1-a}\, |11\rangle$, where $0 \le a \le 1$. In this case
\[ \gamma_{12} = (\mathcal{I} \otimes \Phi)(|\psi_a\rangle\langle\psi_a|) = \frac{1}{2} \begin{pmatrix} a(1 + t + \lambda) & 0 & 0 & 2\lambda\sqrt{a(1-a)} \\ 0 & (1-a)(1 + t - \lambda) & 0 & 0 \\ 0 & 0 & a(1 - t - \lambda) & 0 \\ 2\lambda\sqrt{a(1-a)} & 0 & 0 & (1-a)(1 - t + \lambda) \end{pmatrix}. \]
Numerical computations show that for $p > 1$, $\|\gamma_{12}\|_p / \|\gamma_1\|_p$ is maximized at values $a > 1/2$ when $t > 0$, and values $a < 1/2$ when $t < 0$. Since the state $|\psi_a\rangle$ is maximally entangled only when $a = 1/2$, this demonstrates that the CB norm $\omega_p(\Phi)$ is achieved at a non-maximally entangled state for this family of maps.


5.2. Entanglement breaking and preservation. The class of channels for which $(\mathcal{I} \otimes \Phi)(\rho)$ is separable for any input is called entanglement breaking (EB). Those which are also trace preserving are denoted EBT. These maps were introduced in [15] by Holevo, who wrote them in the form $\Phi(\rho) = \sum_k R_k\, \mathrm{Tr}\, \rho E_k$ with each $R_k$ a density matrix and $\{E_k\}$ a POVM, i.e., $E_k \ge 0$ and $\sum_k E_k = I$. They were studied in [20] where several equivalent conditions were proved. The next result shows that EBT channels always have positive minimal CB entropy. Therefore, a channel for which $S_{\mathrm{CB,min}}(\Phi)$ is negative always preserves some entanglement.

Lemma 15. If $\Phi : M_m \to M_n$ is an EBT map, then for all $p \ge 1$ and positive semi-definite $Q \in M_n \otimes M_m$,
\[ \|(\mathcal{I}_n \otimes \Phi)(Q)\|_p \le \|\mathrm{Tr}_2\, Q\|_p = \|Q_1\|_p. \tag{5.9} \]

Theorem 16. If $\Phi$ is an EBT map, then $\omega_p(\Phi) \le 1$ and $S_{\mathrm{CB,min}}(\Phi)$ is positive.

Theorem 16 follows immediately from Lemma 15 and Theorem 2 of Sect. 2.2. The converse does not hold, i.e., $S_{\mathrm{CB,min}}(\Phi) \ge 0$ does not imply that $\Phi$ is EBT. For the depolarizing channel, it is known [44] that $\Phi_\alpha$ is EBT if and only if $|\alpha| \le \frac{1}{3}$; however, as reported above, $S_{\mathrm{CB,min}}(\Phi_\alpha) > 0$ for $0 < \alpha < 0.74592$. For $d > 3$, the WH channel also has positive CB entropy, although it cannot break all entanglement because it is known [53] that $\nu_p(\Phi_{WH})$ is not multiplicative for sufficiently large $p$.

The proof of Lemma 15 is similar to King's argument [25] for showing multiplicativity of the maximal $p$-norm for EBT maps, and is based on the following inequality due to Lieb and Thirring [34]:
\[ \mathrm{Tr}\, (C^\dagger D C)^p \le \mathrm{Tr}\, (C C^\dagger)^p D^p \tag{5.10} \]

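for $p \ge 1$ and $D > 0$ positive semi-definite.$^1$

Proof of Lemma 15. By assumption, we can write $\Phi(\rho) = \sum_k R_k\, \mathrm{Tr}\, \rho E_k$ with each $R_k$ a density matrix and $\{E_k\}$ a POVM. Then
\begin{align*}
(\mathcal{I}_n \otimes \Phi)(Q) &= \sum_{k=1}^{\kappa} [\mathrm{Tr}_2\, (I \otimes E_k) Q] \otimes R_k \\
 &= \sum_{k=1}^{\kappa} G_k \otimes R_k, \tag{5.11}
\end{align*}
where $G_k = \mathrm{Tr}_2\, (I \otimes E_k) Q$. Note that
\[ \mathrm{Tr}_2\, Q = \sum_{k=1}^{\kappa} [\mathrm{Tr}_2\, (I \otimes E_k) Q] = \sum_{k=1}^{\kappa} G_k. \tag{5.12} \]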

1 The proof in the Appendix of [34] is based on the concavity of A → Tr (B A1/m B)m for m ≥ 1 and A, B ≥ 0. This was first proved by Epstein [14]; it is also a special case of Lemma 1.14 in [41], which is proved using complex interpolation in the operator space framework. Araki [1] gave another proof of (5.10), and a simple proof based on Hölder’s inequality was given by Simon in Theorem I.4.9 of [49].

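With $|e_k\rangle$ the canonical basis in $\mathbf{C}^\kappa$ we define the following matrices in $M_\kappa \otimes M_n \otimes M_n$:
\[ R = \sum_k |e_k\rangle\langle e_k| \otimes I_n \otimes R_k = \begin{pmatrix} I_n \otimes R_1 & 0 & \cdots & 0 \\ 0 & I_n \otimes R_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & I_n \otimes R_\kappa \end{pmatrix} \tag{5.13} \]
and
\[ V = \widetilde{V} \otimes I_n = \sum_k |e_k\rangle\langle e_1| \otimes G_k^{1/2} \otimes I_n = \begin{pmatrix} \sqrt{G_1} & 0 & \cdots & 0 \\ \sqrt{G_2} & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ \sqrt{G_\kappa} & 0 & \cdots & 0 \end{pmatrix} \otimes I_n, \tag{5.14} \]
where we adopt the convention of using the subscripts 3, 1, 2 for $M_\kappa$, $M_n$, $M_n$ respectively so that the partial traces $\mathrm{Tr}_1$ and $\mathrm{Tr}_2$ retain their original meaning. It follows that
\[ |e_1\rangle\langle e_1| \otimes (\mathcal{I}_n \otimes \Phi)(Q) = V^\dagger R V. \tag{5.15} \]
Applying (5.10) one finds
\begin{align*}
\|(\mathcal{I}_n \otimes \Phi)(Q)\|_p^p &= \mathrm{Tr}\, (V^\dagger R V)^p = \mathrm{Tr}_{312}\, (V^\dagger R V)^p \le \mathrm{Tr}_{312}\, (V V^\dagger)^p R^p \tag{5.16} \\
 &= \sum_k \mathrm{Tr}_{12}\, [(V V^\dagger)^p]_{kk}\, (I_n \otimes R_k)^p \\
 &= \sum_k \mathrm{Tr}_1\, [(\widetilde{V} \widetilde{V}^\dagger)^p]_{kk}\; \mathrm{Tr}_2\, (R_k)^p, \tag{5.17}
\end{align*}
where $[(V V^\dagger)^p]_{kk} = \mathrm{Tr}_3\, (V V^\dagger)^p (|e_k\rangle\langle e_k| \otimes I_n)$ is the $k$-th block on the diagonal of $(V V^\dagger)^p$ and $[(V V^\dagger)^p]_{kk} = [(\widetilde{V} \widetilde{V}^\dagger)^p]_{kk} \otimes I_n$. Since $R_k$ is a density matrix, $\mathrm{Tr}_2\, (R_k)^p \le 1$. (In fact, we could assume wlog that $R_k = |\theta_k\rangle\langle\theta_k|$ so that $R_k^p = R_k$ and $\mathrm{Tr}_2\, (R_k)^p = 1$.) Therefore,
\begin{align*}
\|(\mathcal{I}_n \otimes \Phi)(Q)\|_p^p &\le \sum_k \mathrm{Tr}_1\, [(\widetilde{V} \widetilde{V}^\dagger)^p]_{kk} \tag{5.18} \\
 &= \mathrm{Tr}_{31}\, (\widetilde{V} \widetilde{V}^\dagger)^p = \mathrm{Tr}_{31}\, (\widetilde{V}^\dagger \widetilde{V})^p = \mathrm{Tr}_1 \Bigl( \sum_k G_k \Bigr)^p = \mathrm{Tr}_1\, (\mathrm{Tr}_2\, Q)^p = \|\mathrm{Tr}_2\, Q\|_p^p. \tag{5.19}
\end{align*}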

5.3. Operational interpretation. Recently Horodecki, Oppenheim and Winter [18, 19] (HOW) obtained results which give an important operational meaning to quantum conditional information, consistent with both positive and negative values. Applying their results to the expression $S_{\mathrm{CB,min}}(\Phi) = S(\gamma_{AB}) - S(\gamma_A)$ with $\gamma_{AB} = (\mathcal{I} \otimes \Phi)(|\psi\rangle\langle\psi|)$, where $|\psi\rangle$ is the minimizer in (1.1), gives the following interpretation:


• A channel for which $S_{\mathrm{CB,min}}(\Phi) > 0$ always breaks enough entanglement so that some EPR pairs must be added to enable Alice to transfer her information to Bob.
• A channel for which $S_{\mathrm{CB,min}}(\Phi) < 0$ leaves enough entanglement in the optimal state so that some EPR pairs remain after Alice has transferred her information to Bob.

For example, as discussed in Sect. 5.1 the depolarizing channel is entanglement breaking for $\mu \in [-1/3, 1/3]$; for $\mu \in (1/3, 0.74592)$ it always breaks enough entanglement to require input of EPR pairs to transfer Bob's corrupted state back to Alice; and for $\mu > 0.74592$ maximally entangled states retain enough entanglement to allow the distillation of EPR pairs after Bob's corrupted information is transferred to Alice.

Note, however, that the HOW interpretation [18, 19] is an asymptotic result in the sense that it is based on the assumption of the availability of the tensor product state $\gamma_{AB}^{\otimes n}$ with $n$ arbitrarily large, and is related to the "entanglement of assistance" [50] which is known not to be additive. One would also like to have an interpretation of the additivity of $S_{\mathrm{CB,min}}(\Phi)$ so that the "one-shot" formula $-S_{\mathrm{CB,min}}(\Phi)$ represents the capacity of an asymptotic process which is not enhanced by entangled inputs. Thus far, the only scenarios for which we have found this to be true seem extremely contrived and artificial.

6. Entropy Inequalities

In this section, we show that operator space methods can be used to give a new proof of SSA (1.8). Although the strategy is straightforward, it requires some rather lengthy and tedious bounds on derivatives and norms. Our purpose is not to give another proof of SSA, but to demonstrate the fundamental role of Minkowski-type inequalities and provide some information on the behavior of the $(1, p)$ norm near $p = 1$. Differentiation of inequalities of the type found in Sect. 3.4 often yields entropy inequalities. The procedure is as follows.
Consider an inequality of the form $g_L(p) \le g_R(p)$ valid for $p \ge 1$ which becomes an equality at $p = 1$. Then the function $g(p) = g_R(p) - g_L(p)$ satisfies $g(p) \ge 0$ for $p \ge 1$ and $g(1) = 0$. This implies that the right derivative $g'(1+) \ge 0$ or, equivalently, that $g_L'(1+) \le g_R'(1+)$. Applying this to (3.16) yields

$$-S(Q_1) \le -S(Q_{12}) + S(Q_2) \tag{6.1}$$

which is the well-known subadditivity inequality $S(Q_{12}) \le S(Q_1) + S(Q_2)$. Applying the same principle to conjecture (3.17) yields

$$-S(Q_{23}) + S(Q_3) \le -S(Q_{123}) + S(Q_{13}) \tag{6.2}$$

which is equivalent to strong subadditivity (1.8). (Carlen and Lieb [8] observed that the reverse of (3.17) holds when t ≤ 1 and used the corresponding left derivative inequality $g_L'(1-) \ge g_R'(1-)$ to obtain another proof of SSA.) These entropy inequalities can also be obtained by differentiating the corresponding CB Minkowski inequalities (3.13) and (3.14). We will need the following.

Theorem 17. For any $X = X_{12}$ in $M_m \otimes M_n$, with $X \ge 0$ and $\mathrm{Tr}\, X = 1$,

$$\frac{d}{dp}\,\|X_{12}\|_{(1,p)}^{p}\,\Big|_{p=1} = -S(X_{12}) + S(X_1). \tag{6.3}$$
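As an illustration of (6.3), the following is a minimal numerical sketch in the commutative case only. It assumes that for a diagonal X (a joint probability distribution $x_{ij}$) the (1, p) norm reduces to the classical mixed norm $\sum_i (\sum_j x_{ij}^p)^{1/p}$; this reduction, the sample distribution and all function names are assumptions of the sketch, not part of the paper.

```python
import math

def entropy(probs):
    """Shannon entropy (natural log), the commutative analogue of S."""
    return -sum(x * math.log(x) for x in probs if x > 0)

def norm_1p_to_the_p(joint, p):
    """(sum_i (sum_j x_ij^p)^(1/p))^p for a joint distribution joint[i][j]."""
    inner = [sum(x ** p for x in row) ** (1.0 / p) for row in joint]
    return sum(inner) ** p

def right_derivative_at_one(joint, h=1e-6):
    """Forward finite-difference estimate of the derivative in p at p = 1."""
    return (norm_1p_to_the_p(joint, 1.0 + h) - norm_1p_to_the_p(joint, 1.0)) / h

# Hypothetical joint distribution; rows index system 1, columns system 2.
joint = [[0.10, 0.25, 0.05],
         [0.30, 0.05, 0.25]]
flat = [x for row in joint for x in row]
marginal_1 = [sum(row) for row in joint]  # commutative analogue of X_1 = Tr_2 X

lhs = right_derivative_at_one(joint)
rhs = -entropy(flat) + entropy(marginal_1)
print(lhs, rhs)  # the two values should agree up to finite-difference error
```

The forward difference is used because (6.3) is a statement about the right derivative at p = 1.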

Before proving this result, observe that (3.20) implies W12 (1, p) = W2 p and (3.24) implies W132 (1, p,1) = W13 (1, p) . Then, when q = 1, the inequalities (3.13) and (3.14) imply

$$\|W_2\|_p^p \le \|W_{12}\|_{(1,p)}^p, \tag{6.4}$$

$$\|W_{13}\|_{(1,p)}^p \le \|W_{123}\|_{(1,1,p)}^p. \tag{6.5}$$

Now, under the assumption that $W_{123} > 0$ and $\mathrm{Tr}\, W_{123} = 1$, Theorem 17 implies

$$\frac{d}{dp}\|W_2\|_p^p\Big|_{p=1} = -S(W_2), \qquad \frac{d}{dp}\|W_{12}\|_{(1,p)}^p\Big|_{p=1} = -S(W_{12}) + S(W_1),$$

$$\frac{d}{dp}\|W_{123}\|_{(1,1,p)}^p\Big|_{p=1} = -S(W_{123}) + S(W_{12}), \qquad \frac{d}{dp}\|W_{13}\|_{(1,p)}^p\Big|_{p=1} = -S(W_{13}) + S(W_1).$$

The usual subadditivity and SSA inequalities, (6.1) and (6.2), then follow from the principle $g_L'(1+) \le g_R'(1+)$ above together with (6.4) and (6.5), respectively.

Proof of Theorem 17. The basic strategy is similar to that in Sect. 2.2, but requires some additional details. Let $X_1 = \mathrm{Tr}_2 X$ and let Q denote the orthogonal projection onto $\ker(X_1)$. Since $Q X_1 Q = 0$, it follows that $\mathrm{Tr}\,(Q \otimes I_n) X (Q \otimes I_n) = 0$. Since X is positive semi-definite this implies that $X = ((I_m - Q) \otimes I_n) X ((I_m - Q) \otimes I_n)$. For fixed X the functions

$$v(p, B) = X^{1/2}\,(B^{\frac{1}{p}-1} \otimes I_n)\,X^{1/2}, \quad\text{and} \tag{6.6}$$

$$w(p, B) = X_1^{1/2}\, B^{\frac{1}{p}-1}\, X_1^{1/2} \tag{6.7}$$

are well-defined for p > 1 and $B \in \beta(X_1)$, where $\beta(X_1) = \{B \in D : \ker(B) \subset \ker(X_1)\}$. Since

$$\big\|(B^{-\frac12(1-\frac1p)} \otimes I_n)\, X\, (B^{-\frac12(1-\frac1p)} \otimes I_n)\big\|_p = \big\|X^{1/2}\,(B^{-1/2} B^{1/p} B^{-1/2} \otimes I_n)\,X^{1/2}\big\|_p,$$

it follows from (3.22) and the remarks above that

$$\|X_{12}\|_{(1,p)} = \inf_{B \in D} \|v(p, B)\|_p = \inf_{B \in \beta(X_1)} \|v(p, B)\|_p. \tag{6.8}$$

The set of density matrices D is compact, and $\|v(p, B)\|_p$ is bounded below and continuous, hence for each p > 1 there is a (i.e., at least one) density matrix B(p) which minimizes $\|v(p, B)\|_p$, so that $\|X_{12}\|_{(1,p)} = \|v(p, B(p))\|_p$. Since p > 1 and B(p) is a density matrix, $B(p)^{-1+\frac1p} > I_m$, which implies

$$v(p, B) \ge X \tag{6.9}$$

and

$$w(p, B) \ge X_1. \tag{6.10}$$

Furthermore,

$$1 = \mathrm{Tr}\, X \le \mathrm{Tr}\, v(p, B(p)) \le (mn)^{1-\frac1p}\, \|v(p, B(p))\|_p \tag{6.11}$$

(where the last inequality uses $\|A\|_1 \le d^{\,1-\frac1p} \|A\|_p$ for any positive semi-definite d × d matrix A and any p ≥ 1). Replacing B(p) by another density matrix cannot decrease $\|v(p, B(p))\|_p$, hence

$$\|v(p, B(p))\|_p \le \big\|v\big(p, \tfrac{1}{m} I_m\big)\big\|_p = m^{1-\frac1p}\, \|X\|_p \le m^{1-\frac1p}. \tag{6.12}$$
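The norm comparison used in (6.11) depends only on the eigenvalues of A, so it can be checked on a list of non-negative numbers. The following is a minimal sketch; the sample spectra and function names are hypothetical.

```python
def schatten_norm(eigenvalues, p):
    """Schatten p-norm of a positive semi-definite matrix, from its spectrum."""
    return sum(abs(x) ** p for x in eigenvalues) ** (1.0 / p)

def holder_gap(eigenvalues, p):
    """d^(1 - 1/p) * ||A||_p - ||A||_1, which should be non-negative."""
    d = len(eigenvalues)
    return d ** (1.0 - 1.0 / p) * schatten_norm(eigenvalues, p) - schatten_norm(eigenvalues, 1.0)

spectrum = [0.5, 0.3, 0.15, 0.05, 0.0]           # hypothetical spectrum, d = 5
gaps = [holder_gap(spectrum, p) for p in (1.0, 1.5, 2.0, 4.0)]
uniform_gap = holder_gap([0.2] * 5, 3.0)         # equality case: A proportional to I
print(gaps, uniform_gap)
```

The gap vanishes for a spectrum proportional to the identity, which is why the bound (6.12) evaluated at $B = \frac1m I_m$ is compatible with (6.11) as p → 1+.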


Combining (6.11) and (6.12) shows that

$$\lim_{p \to 1+} \mathrm{Tr}\,\big(v(p, B(p)) - X\big) = 0, \tag{6.13}$$

and, together with (6.10) implies that $v(p, B(p)) \to X$. Also, for any $B \in \beta(X_1)$,

$$\mathrm{Tr}\, v(p, B) = \mathrm{Tr}_{12}\big(X_{12}\,[B^{\frac1p-1} \otimes I_n]\big) = \mathrm{Tr}\, X_1 B^{\frac1p-1} = \mathrm{Tr}\, w(p, B) \tag{6.14}$$

so that $\lim_{p \to 1+} \mathrm{Tr}\,(w(p, B(p)) - X_1) = 0$ and $w(p, B(p)) \to X_1$. Writing out the derivative on the left side of (6.3), we see that we need to show that

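$$\lim_{p \to 1+} \frac{1}{p-1}\Big(\mathrm{Tr}\,[v(p, B(p))]^p - 1\Big) = -S(X) + S(X_1). \tag{6.15}$$

First note that for p > 1,

$$\frac{1}{p-1}\Big(\mathrm{Tr}\,[v(p, B(p))]^p - 1\Big) \le \frac{1}{p-1}\Big(\mathrm{Tr}\,[v(p, X_1)]^p - 1\Big), \tag{6.16}$$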

and a direct calculation shows that the right side of (6.16) converges to −S(X ) + S(X 1 ) as p → 1+. Hence to prove (6.15) it is sufficient to show that

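$$\liminf_{p \to 1+} \frac{1}{p-1}\Big(\mathrm{Tr}\,[v(p, B(p))]^p - 1\Big) \ge -S(X) + S(X_1). \tag{6.17}$$

Hölder’s inequality implies

$$1 = \|X_1\|_1 = \Big\|B(p)^{\frac12-\frac{1}{2p}}\, B(p)^{\frac{1}{2p}-\frac12}\, X_1\, B(p)^{\frac{1}{2p}-\frac12}\, B(p)^{\frac12-\frac{1}{2p}}\Big\|_1 \le \Big\|B(p)^{\frac12-\frac{1}{2p}}\Big\|_{2p/(p-1)}^2\, \Big\|B(p)^{\frac{1}{2p}-\frac12}\, X_1\, B(p)^{\frac{1}{2p}-\frac12}\Big\|_p = \|w(p, B(p))\|_p. \tag{6.18}$$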

Combining this with (6.14) gives a bound on the numerator on the left in (6.15),

$$\mathrm{Tr}\,[v(p, B(p))]^p - 1 \ge \Big(\mathrm{Tr}\,[v(p, B(p))]^p - \mathrm{Tr}\, v(p, B(p))\Big) - \Big(\mathrm{Tr}\,[w(p, B(p))]^p - \mathrm{Tr}\, w(p, B(p))\Big). \tag{6.19}$$

The mean value theorem for the function $g(p) = x^p$ implies that for some $p_1, p_2 \in [1, p]$,

$$\frac{1}{p-1}\Big(\mathrm{Tr}\,[v(p, B(p))]^p - \mathrm{Tr}\, v(p, B(p))\Big) = \mathrm{Tr}\,[v(p, B(p))]^{p_1} \log v(p, B(p)), \tag{6.20a}$$

$$\frac{1}{p-1}\Big(\mathrm{Tr}\,[w(p, B(p))]^p - \mathrm{Tr}\, w(p, B(p))\Big) = \mathrm{Tr}\,[w(p, B(p))]^{p_2} \log w(p, B(p)). \tag{6.20b}$$

The convergence in (6.13) and following (6.14) imply

$$\lim_{p \to 1} \mathrm{Tr}\,[v(p, B(p))]^{p_1} \log v(p, B(p)) = -S(X), \tag{6.21a}$$

$$\lim_{p \to 1} \mathrm{Tr}\,[w(p, B(p))]^{p_2} \log w(p, B(p)) = -S(X_1). \tag{6.21b}$$

Combining (6.20a), (6.21a) and (6.19) gives (6.17).


Remark. The proof above relies on the convergence of $\lim_{p \to 1+} X^{1/2}(B^{\frac1p-1} \otimes I_n) X^{1/2} = X$ and $\lim_{p \to 1+} X_1^{1/2} B^{\frac1p-1} X_1^{1/2} = X_1$, but tells us nothing at all about the behavior of B(p) as p → 1+. By making a few changes at the end of this proof and exploiting Klein’s inequality, we can also show that $\lim_{p \to 1+} B(p) = X_1$. Klein’s inequality [29, 38] states that

$$\mathrm{Tr}\, A \log A - \mathrm{Tr}\, A \log B \ge \mathrm{Tr}\,(A - B) \tag{6.22}$$

with equality in the case Tr A = Tr B if and only if A = B. Now, replace (6.19) by

$$\mathrm{Tr}\,[v(p, B(p))]^p - 1 = \Big(\mathrm{Tr}\,[v(p, B(p))]^p - \mathrm{Tr}\, v(p, B(p))\Big) + \big(\mathrm{Tr}\, w(p, B(p)) - 1\big). \tag{6.23}$$
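Klein’s inequality (6.22) can be sanity-checked in the commuting case, where A and B are diagonal and the inequality reduces to $\sum_i (a_i \log a_i - a_i \log b_i) \ge \sum_i (a_i - b_i)$. The sketch below assumes this diagonal reduction; the vectors and the function name are hypothetical, and the entries of b must be strictly positive.

```python
import math

def klein_gap(a, b):
    """(Tr A log A - Tr A log B) - Tr(A - B) for commuting (diagonal) A, B."""
    lhs = sum(x * math.log(x) - x * math.log(y) for x, y in zip(a, b) if x > 0)
    rhs = sum(a) - sum(b)
    return lhs - rhs

a = [0.5, 0.3, 0.2]
gap_normalized = klein_gap(a, [0.2, 0.2, 0.6])    # Tr A = Tr B = 1, A != B
gap_unnormalized = klein_gap(a, [1.0, 0.1, 0.4])  # Tr B != Tr A
gap_equal = klein_gap(a, a)                       # equality case A = B
print(gap_normalized, gap_unnormalized, gap_equal)
```

In this commuting setting each term satisfies $x \log(x/y) - x + y \ge 0$, so the gap is non-negative term by term and vanishes only when the two vectors coincide.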

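Then use the mean value theorem for the function $g_2(p) = y^{1/p}$ to replace (6.20b) by

$$\frac{1}{p-1}\big(\mathrm{Tr}\, w(p, B(p)) - 1\big) = -\frac{1}{\tilde p^{\,2}}\, \mathrm{Tr}\, X_1^{1/2}\, [B(p)]^{(1/\tilde p) - 1} \log B(p)\; X_1^{1/2} \tag{6.24}$$

for some $\tilde p \in (1, p)$.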

p) p ) X B( p)− 2 (1−1/ together with the We could use (6.22) with A = B( p)−1/2(1−1/ 1 fact that A and w( p , B( p)) have the same non-zero eigenvalues to bound the right side of (6.24) below by −(1/ p 2 )S[w( p , B( p)] + Tr w( p , B( p)) − 1. However, because 1/ p 1 B 1/ p , we cannot extend (6.12) and (6.14) to conclude that this converges to S(X 1 ). Instead, we first observe that the compactness of the set of density matrices D implies that we can find a sequence pk → 1+ such that X 12 (1, p) = v( pk , B( pk )) pk and Bk → B ∗ ∈ D. If B ∗ is not in β(X 1 ), then the right side of the first line of (6.24) → +∞ giving a contradiction with (6.16). Hence B ∗ ∈ β(X 1 ). Therefore, (6.24) and (6.22) imply

$$\lim_{k \to \infty} \frac{1}{p_k - 1}\big(\mathrm{Tr}\, w(p_k, B(p_k)) - 1\big) = -\mathrm{Tr}\, X_1 \log B^* \ge S(X_1). \tag{6.25}$$

Inserting this in (6.23) yields

$$\lim_{k \to \infty} \frac{1}{p_k - 1}\Big(\mathrm{Tr}\,[v(p_k, B(p_k))]^{p_k} - 1\Big) = -S(X_{12}) - \mathrm{Tr}\, X_1 \log B^* \ge -S(X_{12}) + S(X_1). \tag{6.26}$$

Combining these results with (6.16), we conclude that equality holds in (6.26) and that

$$-\mathrm{Tr}\, X_1 \log B^* = S(X_1) = -\mathrm{Tr}\, X_1 \log X_1. \tag{6.27}$$

We can now use the condition for equality in (6.22) to conclude that $B^* = X_1$. Since this is true for the limit of any convergent sequence of minimizers $B(p_k)$ with $p_k \to 1$, we have also proved the following, which is of independent interest.

Corollary 18. For $X \in M_m \otimes M_n$ with $X \ge 0$ and $\mathrm{Tr}\, X = 1$ and $p \in (1, 2]$, let $B(p) \in D$ minimize $\|X\|_{(1,p)}$, i.e., $\big\|X^{1/2}(B^{\frac1p-1} \otimes I_n) X^{1/2}\big\|_p = \|X\|_{(1,p)}$. Then $\lim_{p \to 1+} B(p) = X_1 \equiv \mathrm{Tr}_2 X$.
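The convergence of the minimizer to $X_1$ can be made explicit in the commutative case. The sketch below assumes X and B are both diagonal, so that $\|v(p, B)\|_p^p = \sum_i a_i b_i^{1-p}$ with $a_i = \sum_j x_{ij}^p$; a Lagrange multiplier computation (an assumption of this sketch, not taken from the paper) then gives the minimizer $b_i \propto a_i^{1/p}$. The distribution and function names are hypothetical.

```python
def classical_minimizer(joint, p):
    """Minimizer of sum_i a_i * b_i^(1-p) over probability vectors b, p > 1."""
    a = [sum(x ** p for x in row) for row in joint]
    weights = [ai ** (1.0 / p) for ai in a]
    total = sum(weights)
    return [w / total for w in weights]

def objective(joint, b, p):
    """||v(p, B)||_p^p for diagonal X (entries joint[i][j]) and diagonal B."""
    return sum(sum(x ** p for x in row) * bi ** (1.0 - p)
               for row, bi in zip(joint, b))

joint = [[0.10, 0.25, 0.05],
         [0.30, 0.05, 0.25]]
marginal_1 = [sum(row) for row in joint]   # diagonal of X_1 = Tr_2 X

p = 1.0001
b_p = classical_minimizer(joint, p)
print(b_p, marginal_1)                     # b_p should be close to marginal_1

# At a larger p the minimizer differs visibly from the marginal and from uniform.
b_2 = classical_minimizer(joint, 2.0)
```

As p decreases to 1 the weights $a_i^{1/p}$ tend to the marginal probabilities, which matches the limit stated in Corollary 18 in this restricted setting.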


Acknowledgements. The work of M.J. was supported in part by National Science Foundation grant DMS0301116. The work of C.K. was supported in part by the National Science Foundation under grant DMS0400426. The work of M.B.R. was supported in part by the National Security Agency (NSA) and Advanced Research and Development Activity (ARDA) under Army Research Office (ARO) contract number DAAD1902-1-0065; and by the National Science Foundation under Grant DMS-0314228. This work had its genesis in a workshop in 2002 at the Pacific Institute for the Mathematical Sciences at which M.J. and M.B.R. participated. Part of this work was done while I.D. and M.B.R. were visiting the Isaac Newton Institute. The authors are grateful to these institutions for their hospitality and support. Finally, we thank Professor Andreas Winter for discussions about possible interpretations of $S_{CB,\min}(\Phi)$.

A. Purification

To make this paper self-contained and accessible to people in fields other than quantum information we summarize the results needed to prove Lemma 4. Any density matrix in $D_d$ can be written in terms of its spectral decomposition (restricted to $[\ker(\gamma)]^\perp$) as $\gamma = \sum_{k=1}^m \lambda_k |\phi_k\rangle\langle\phi_k|$, where each eigenvalue $\lambda_k > 0$ is counted according to its multiplicity so that the eigenvectors $\{|\phi_k\rangle\}$ are orthonormal. If we then let $\{|\chi_k\rangle\}$ be any orthonormal basis of $\mathbf{C}^m$ and define $|\Psi\rangle \in \mathbf{C}^d \otimes \mathbf{C}^m$ as

$$|\Psi\rangle = \sum_{k=1}^m \sqrt{\lambda_k}\; |\phi_k\rangle \otimes |\chi_k\rangle, \tag{A.1}$$

then $\gamma = \mathrm{Tr}_2\, |\Psi\rangle\langle\Psi|$ and (A.1) is called a purification of γ. Conversely, given a normalized vector $|\Psi\rangle \in \mathbf{C}^n \otimes \mathbf{C}^m$, it is a straightforward consequence of the singular value decomposition that $|\Psi\rangle$ can be written in the form

$$|\Psi\rangle = \sum_k \mu_k\, |\phi_k\rangle \otimes |\chi_k\rangle \tag{A.2}$$

with $\{|\phi_k\rangle\}$ and $\{|\chi_k\rangle\}$ orthonormal sets in $\mathbf{C}^n$ and $\mathbf{C}^m$ respectively. (This is often called the “Schmidt decomposition” in quantum information theory. For details and some history see Appendix A of [26].) It follows from (A.2) that the reduced density matrices $\gamma_1 = \mathrm{Tr}_2\, |\Psi\rangle\langle\Psi|$ and $\gamma_2 = \mathrm{Tr}_1\, |\Psi\rangle\langle\Psi|$ have the same non-zero eigenvalues. Although our interest here is for $H = \mathbf{C}^m$, these results extend to infinite dimensions and yield the following.

Corollary 19. When $|\Psi_{AB}\rangle$ is a bipartite pure state in $H_A \otimes H_B$, then its reduced density matrices $\gamma_A = \mathrm{Tr}_B\, |\Psi\rangle\langle\Psi|$ and $\gamma_B = \mathrm{Tr}_A\, |\Psi\rangle\langle\Psi|$ have the same entropy, i.e., $S(\gamma_A) = S(\gamma_B)$.

References

1. Araki, H.: On an inequality of Lieb and Thirring. Lett. Math. Phys. 19, 167–170 (1990)
2. Audenaert, K.M.R.: A note on the p → q norms of completely positive maps. http://arxiv.org/list/math-ph/0505085, 2005
3. Amosov, G. G., Holevo, A. S., Werner, R. F.: On Some Additivity Problems in Quantum Information Theory. Prob. Inf. Trans. 36, 305–313 (2000)
4. Barnum, H., Nielsen, M.A., Schumacher, B.: Information transmission through a noisy quantum channel. Phys. Rev. A 57, 4153–4175 (1998)
5. Bennett, C. H., Shor, P.W., Smolin, J. A., Thapliyal, A. V.: Entanglement-assisted classical capacity of noisy quantum channels. Phys. Rev. Lett. 83, 3081–84 (1999)
6. Bennett, C. H., Shor, P.W., Smolin, J. A., Thapliyal, A. V.: Entanglement-assisted capacity of a quantum channel and the reverse Shannon theorem. IEEE Trans. Inform. Theory 48, 2637–2655 (2002)


7. Blecher, D.P., Paulsen, V. I.: Tensor products of operator spaces. J. Funct. Anal. 99, 262–292 (1991)
8. Carlen, E., Lieb, E.: A Minkowski type trace inequality and strong subadditivity of quantum entropy. Amer. Math. Soc. Trans. 189, 59–62 (1999), reprinted in [32]
9. Choi, M-D.: Completely Positive Linear Maps on Complex Matrices. Lin. Alg. Appl. 10, 285–290 (1975)
10. Devetak, I.: The Private Classical Capacity and Quantum Capacity of a Quantum Channel. IEEE Trans. Inform. Theory 51, 44–55 (2005)
11. DiVincenzo, D. P., Shor, P. W., Smolin, J. A.: Quantum-channel capacity of very noisy channels. Phys. Rev. A 57, 830–839 (1998); erratum 59, 1717 (1999)
12. Effros, E. G., Ruan, Z. J.: Self-duality for the Haagerup tensor product and Hilbert space factorizations. J. Funct. Anal. 100, 257–284 (1991)
13. Effros, E. G., Ruan, Z. J.: Operator Spaces. Oxford: Oxford Univ. Press, 2000
14. Epstein, H.: Remarks on two theorems of E. Lieb. Commun. Math. Phys. 31, 317–325 (1973)
15. Holevo, A. S.: Coding Theorem for Quantum Channels. http://arxiv.org/list/quant-ph/9809023; Quantum coding theorems. Russ. Math. Surv. 53, 1295–1331 (1999)
16. Holevo, A. S.: On Entanglement-Assisted Classical Capacity. J. Math. Phys. 43, 4326–4333 (2002)
17. Holevo, A. S., Werner, R.F.: Evaluating capacities of bosonic Gaussian channels. Phys. Rev. A 63, 032312 (2001)
18. Horodecki, M., Oppenheim, J., Winter, A.: Quantum information can be negative. Nature 436, 673–676 (2005)
19. Horodecki, M., Oppenheim, J., Winter, A.: Quantum state merging and negative information. Commun. Math. Phys. http://arxiv.org/list/quant-ph/0512247, 2005 (in press)
20. Horodecki, M., Shor, P., Ruskai, M. B.: Entanglement Breaking Channels. Rev. Math. Phys. 15, 629–641 (2003)
21. Jenčová, A.: A relation between completely bounded norms and conjugate channels. Commun. Math. Phys., DOI 10.1007/s00220-006-0035-z
22. Junge, M.: Factorization theory for spaces of operators. Habilitation thesis, Kiel University, 1996
23. Junge, M.: Vector-valued L_p spaces for von Neumann algebras with QWEP. In preparation
24. Junge, M., Ruan, Z.-J.: Decomposable Maps on Non-commutative L_p spaces. Contemporary Mathematics 365, 355–381 (2004)
25. King, C.: Maximal p-norms of entanglement breaking channels. Quantum Inf. and Comput. 3(2), 186–190 (2003)
26. King, C., Ruskai, M.B.: Minimal Entropy of States Emerging from Noisy Quantum Channels. IEEE Trans. Info. Theory 47, 1–19 (2001)
27. King, C., Ruskai, M. B.: Comments on multiplicativity of maximal p-norms when p = 2. In: Quantum Information, Statistics and Probability, ed. by O. Hirota, River Edge, NJ: World Scientific, 2004, pp. 102–114
28. Kitaev, A.: Classical and Quantum Computation. Providence, RI: AMS, 2002
29. Klein, O.: Zur quantenmechanischen Begründung des zweiten Hauptsatzes der Wärmelehre. Zeit. für Physik 72, 767–775 (1931)
30. Kraus, K.: General state changes in quantum theory. Ann. Phys. 64, 311–335 (1971)
31. Kraus, K.: States, Effects and Operations: Fundamental Notions of Quantum Theory. Berlin-Heidelberg-New York: Springer-Verlag, 1983
32. Inequalities: Selecta of E. Lieb. M. Loss, M.B. Ruskai, eds., Berlin-Heidelberg-New York: Springer, 2002
33. Lieb, E.H., Ruskai, M.B.: Proof of the Strong Subadditivity of Quantum Mechanical Entropy. J. Math. Phys. 14, 1938–1941 (1973), reprinted in [32]
34. Lieb, E., Thirring, W.: Inequalities for the Moments of the Eigenvalues of the Schrödinger Hamiltonian and Their Relation to Sobolev Inequalities. In: Studies in Mathematical Physics, E. Lieb, B. Simon, A. Wightman, eds., Princeton University Press, 1976, pp. 269–303, reprinted in [32]
35. Lloyd, S.: The capacity of a noisy quantum channel. Phys. Rev. A 55, 1613–1622 (1997)
36. Nelson, E.: Notes on non-commutative integration. J. Func. Anal. 15, 103–116 (1974)
37. Nielsen, M.: Quantum Information Theory, PhD thesis. University of New Mexico, 1998
38. Nielsen, M., Chuang, I.: Quantum Computation and Quantum Information. Cambridge: Cambridge University Press, 2000
39. Paulsen, V.: Completely Bounded Maps and Operator Algebras. Cambridge: Cambridge University Press, 2000
40. Pisier, G.: The Operator Hilbert Space OH, Complex Interpolation and Tensor Norms. Memoirs AMS 122, Providence, RI: Amer. Math. Soc. 1996
41. Pisier, G.: Non-Commutative Vector Valued L_p-spaces and Completely p-summing Maps. Paris: Société Mathématique de France, 1998
42. Pisier, G.: Introduction to Operator Space Theory. Cambridge: Cambridge University Press, 2003
43. Ruskai, M.B.: Inequalities for Quantum Entropy: A Review with Conditions for Equality. J. Math. Phys. 43, 4358–4375 (2002); erratum 46, 019901 (2005)

44. Ruskai, M. B.: Qubit Entanglement Breaking Channels. Rev. Math. Phys. 15, 643–662 (2003)
45. Ruan, Z-J.: Subspaces of C*-algebras. J. Funct. Anal. 76, 217–230 (1988)
46. Segal, I. E.: A non-commutative extension of abstract integration. Ann. of Math. 57, 401–457 (1953)
47. Shor, P. W.: Announced at MSRI workshop, (November, 2002). Notes are available at www.msri.org/publications/ln/msri/2002/quantumcrypto/shor/1/index.html
48. Shor, P. W.: Equivalence of Additivity Questions in Quantum Information Theory. Commun. Math. Phys. 246, 453–472 (2004)
49. Simon, B.: The Statistical Mechanics of Lattice Gases. Princeton, NJ: Princeton Univ. Press, 1993
50. Smolin, J., Verstraete, F., Winter, A.: Entanglement of assistance and multipartite state distillation. Phys. Rev. A 72, 052317 (2005)
51. Stinespring, W.F.: Positive functions on C*-algebras. Proc. Amer. Math. Soc. 6, 211–216 (1955)
52. Watrous, J.: Notes on super-operator norms induced by Schatten norms. Quantum Inf. Comput. 5, 57–67 (2005)
53. Werner, R. F., Holevo, A. S.: Counterexample to an additivity conjecture for output purity of quantum channels. J. Math. Phys. 43(9), 4353–4357 (2002)

Communicated by M. Aizenman

Commun. Math. Phys. 266, 65–70 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0035-z

Communications in

Mathematical Physics

A Relation Between Completely Bounded Norms and Conjugate Channels
Anna Jenčová
Mathematical Institute, Slovak Academy of Sciences, Bratislava, Slovakia. E-mail: [email protected]
Received: 10 January 2006 / Accepted: 14 February 2006 Published online: 9 May 2006 – © Springer-Verlag 2006

Abstract: We show a relation between a quantum channel $\Phi$ and its conjugate $\Phi^C$, which implies that the p → p Schatten norm of the channel is the same as the 1 → p completely bounded norm of the conjugate. This relation is used to give an alternative proof of the multiplicativity of both norms.

1. Introduction

A quantum channel is a completely positive trace preserving (CPT) map $\Phi : M_d \to M_d$, where $M_d$ is the set of d × d complex matrices. Any channel can be viewed as a map $L_q(M_d) \to L_p(M_d)$, where $L_q(M_d)$ denotes the space $M_d$ with the Schatten norm $\|A\|_q = (\mathrm{Tr}\,|A|^q)^{1/q}$, 1 ≤ q ≤ ∞. Let $\|\Phi\|_{q \to p}$ be the corresponding norm of $\Phi$,

$$\|\Phi\|_{q \to p} = \sup_{A \in M_d} \frac{\|\Phi(A)\|_p}{\|A\|_q} = \sup_{A \in M_d,\, A \ge 0} \frac{\|\Phi(A)\|_p}{\|A\|_q};$$

the second equality was proved in [1, 9]. We say that norms of this type are multiplicative if $\|\Phi_1 \otimes \Phi_2\|_{q \to p} = \|\Phi_1\|_{q \to p}\, \|\Phi_2\|_{q \to p}$ for any channels $\Phi_1$ and $\Phi_2$. The spaces $L_q(M_d)$ and $L_p(M_d)$ can be endowed with an operator space structure as in [8]; then $\Phi$ is a completely bounded map. Multiplicativity of the corresponding completely bounded norms $\|\Phi\|_{CB, q \to p}$ for all 1 ≤ p, q ≤ ∞ was proved in [3]. In particular, this implies multiplicativity of $\|\Phi\|_{q \to p}$ for q ≥ p, since this is equal to $\|\Phi\|_{CB, q \to p}$ for CPT maps. It was shown that the norm $\|\Phi\|_{CB, 1 \to p}$ is equal to the quantity

$$\omega_p(\Phi) = \sup_{\psi \in \mathbf{C}^d \otimes \mathbf{C}^d} \frac{\|(I \otimes \Phi)(|\psi\rangle\langle\psi|)\|_p}{\|\mathrm{Tr}_2(|\psi\rangle\langle\psi|)\|_p}.$$


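Multiplicativity of $\omega_p$ then yields the additivity for the CB minimal conditional entropy, defined as

$$S_{CB,\min}(\Phi) = \inf_{\psi \in \mathbf{C}^d \otimes \mathbf{C}^d} \Big( S\big[(I \otimes \Phi)(|\psi\rangle\langle\psi|)\big] - S\big[\mathrm{Tr}_2(|\psi\rangle\langle\psi|)\big] \Big).$$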

In the present note, we show that there is a relation between $\omega_p(\Phi)$ and the norm $\|\Phi^C\|_{p \to p}$ of the conjugate map $\Phi^C$. This relation is then used for an alternative proof of multiplicativity of both quantities, avoiding the use of the deep results of the theory of operator spaces and CB norms, involved in the proofs in [3].

2. Representations of CPT Maps and Conjugate Channels

Let $e_1^d, \ldots, e_d^d$ be the standard basis in $\mathbf{C}^d$ and let $\beta_0 = \frac{1}{\sqrt d} \sum_i e_i^d \otimes e_i^d$ be a maximally entangled vector. Let $\Phi : M_d \to M_d$ be a CPT map. Then $\Phi$ is uniquely represented by its Choi-Jamiolkowski matrix $X_\Phi \in M_d \otimes M_d$, defined by

$$X_\Phi = d\,(I \otimes \Phi)(|\beta_0\rangle\langle\beta_0|) = \sum_{i,j} |e_i^d\rangle\langle e_j^d| \otimes \Phi(|e_i^d\rangle\langle e_j^d|). \tag{1}$$

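Other representations of $\Phi$ can be obtained from the Stinespring representation, which in the case of matrices has the form [7]

$$\Phi(\rho) = V^\dagger(\rho \otimes I_\kappa)V, \qquad V : \mathbf{C}^d \to \mathbf{C}^d \otimes H, \qquad \mathrm{Tr}_2\, V V^\dagger = I_d, \tag{2}$$

where H is an auxiliary Hilbert space, $\kappa = \dim H \le d^2$. The Lindblad-Stinespring representation of $\Phi$ is

$$\Phi(\rho) = \mathrm{Tr}_2\, U(\rho \otimes |\phi\rangle\langle\phi|)U^\dagger, \tag{3}$$

where φ is a unit vector in H, and $U : \mathbf{C}^d \otimes H \to \mathbf{C}^d \otimes H$ is a partial isometry. This can be obtained from the Stinespring representation of the dual map $\hat\Phi$. The Kraus representation

$$\Phi(\rho) = \sum_{k=1}^{\kappa} F_k \rho F_k^\dagger, \qquad F_k : \mathbf{C}^d \to \mathbf{C}^d, \qquad \sum_k F_k^\dagger F_k = I_d \tag{4}$$

is related to (2) and (3) by

$$V = \sum_{k=1}^{\kappa} F_k^\dagger \otimes |e_k^\kappa\rangle, \qquad F_k = \mathrm{Tr}_2\, U\big(I \otimes |\phi\rangle\langle e_k^\kappa|\big), \quad k = 1, \ldots, \kappa,$$

where $e_1^\kappa, \ldots, e_\kappa^\kappa$ is an orthonormal basis in H. Let $\Phi$ be given by (3). The conjugate channel to $\Phi$ is the map $\Phi^C : M_d \to B(H)$, defined as

$$\Phi^C(\rho) = \mathrm{Tr}_1\, U(\rho \otimes |\phi\rangle\langle\phi|)U^\dagger = \sum_{j,k} \mathrm{Tr}\,\big(F_j \rho F_k^\dagger\big)\, |e_j^\kappa\rangle\langle e_k^\kappa|. \tag{5}$$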

This definition appeared in [4] (under the name “complementary channels”) and was used in [5, 6] in the context of multiplicativity and additivity problems. The next lemma shows a relation between the Stinespring representation (2) of and the Choi-Jamiolkowski matrix (1) of its conjugate.
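The Kraus form (4) and the conjugate channel (5) can be evaluated directly for a small example. The sketch below uses the qubit amplitude damping channel, which is not discussed in the paper and is chosen here only as a hypothetical illustration; the damping strength, the input state and the helper names are assumptions.

```python
import math

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

GAMMA = 0.3  # hypothetical damping strength
KRAUS = [
    [[1.0, 0.0], [0.0, math.sqrt(1.0 - GAMMA)]],   # F_1
    [[0.0, math.sqrt(GAMMA)], [0.0, 0.0]],         # F_2
]

def channel(rho):
    """Phi(rho) = sum_k F_k rho F_k^dagger, as in (4)."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for f in KRAUS:
        out = add(out, matmul(matmul(f, rho), dagger(f)))
    return out

def conjugate_channel(rho):
    """Phi^C(rho) with (j, k) entry Tr(F_j rho F_k^dagger), as in (5)."""
    return [[trace(matmul(matmul(fj, rho), dagger(fk))) for fk in KRAUS]
            for fj in KRAUS]

def completeness():
    """sum_k F_k^dagger F_k, which should equal the identity."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for f in KRAUS:
        out = add(out, matmul(dagger(f), f))
    return out

rho = [[0.6, 0.2 + 0.1j], [0.2 - 0.1j, 0.4]]       # hypothetical input state
print(completeness())
print(trace(channel(rho)), trace(conjugate_channel(rho)))
```

Both outputs have unit trace because the diagonal entries of $\Phi^C(\rho)$ are the terms $\mathrm{Tr}(F_k \rho F_k^\dagger)$ whose sum is $\mathrm{Tr}\,\Phi(\rho)$.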


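Lemma 1. Let $\Phi$ be a CPT map, such that $\Phi(\rho) = V^\dagger(\rho \otimes I_\kappa)V$ is the Stinespring representation. Then $X_{\Phi^C} = (V V^\dagger)^T$, where $B^T$ is the transpose of the matrix B, $B^T_{ij} = B_{ji}$.

Proof. Let $V = \sum_{k=1}^{\kappa} F_k^\dagger \otimes |e_k^\kappa\rangle$, then using (5), we get

$$V V^\dagger = \sum_{i,j=1}^{\kappa} F_i^\dagger F_j \otimes |e_i^\kappa\rangle\langle e_j^\kappa| = \sum_{i,j=1}^{\kappa} \sum_{k,l=1}^{d} \langle e_k^d| F_i^\dagger F_j |e_l^d\rangle\, |e_k^d\rangle\langle e_l^d| \otimes |e_i^\kappa\rangle\langle e_j^\kappa|$$

$$= \sum_{k,l=1}^{d} |e_k^d\rangle\langle e_l^d| \otimes \sum_{i,j=1}^{\kappa} \mathrm{Tr}\,\big(F_j |e_l^d\rangle\langle e_k^d| F_i^\dagger\big)\, |e_i^\kappa\rangle\langle e_j^\kappa|$$

$$= \sum_{k,l=1}^{d} |e_k^d\rangle\langle e_l^d| \otimes \big[\Phi^C(|e_l^d\rangle\langle e_k^d|)\big]^T = X_{\Phi^C}^T.$$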

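Theorem 1. For a CPT map $\Phi$ and 1 ≤ p ≤ ∞, $\|\Phi\|_{p \to p} = \omega_p(\Phi^C)$.

Proof. Note first that for any CPT map, we have ([3])

$$\omega_p(\Phi) = \sup_{A \ge 0,\, \|A\|_{2p} \le 1} \|(A \otimes I_d)\, X_\Phi\, (A \otimes I_d)\|_p. \tag{6}$$

Let the Stinespring representation (2) of $\Phi$ be $\Phi(\rho) = V^\dagger(\rho \otimes I_\kappa)V$. Then by Lemma 1,

$$\|\Phi\|_{p \to p} = \sup_{A \ge 0,\, \|A\|_p \le 1} \|\Phi(A)\|_p = \sup_{B \ge 0,\, \|B\|_{2p} \le 1} \|V^\dagger(B^2 \otimes I_\kappa)V\|_p = \sup_{B \ge 0,\, \|B\|_{2p} \le 1} \|(B \otimes I_\kappa)\, X_{\Phi^C}^T\, (B \otimes I_\kappa)\|_p = \omega_p(\Phi^C),$$

where the last equality follows from the fact that $B^T \ge 0$ if $B \ge 0$ and $\|B^T\|_p = \|B\|_p$.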

Remark. Let q > p. Exactly as in the above proof, we get that

$$\|\Phi\|_{q \to p} = \sup_{A \ge 0,\, \|A\|_{2q} \le 1} \|(A \otimes I_d)\, X_{\Phi^C}\, (A \otimes I_d)\|_p.$$

The last expression is equal to the $L_r(M_d, L_p(M_d))$ norm $\|X_{\Phi^C}\|_{(r,p)}$ for 1/q + 1/r = 1/p, see Eq. (3.18) in [3]. This is an operator space type of norm, but not a CB norm, in general.


3. Multiplicativity

To prove multiplicativity, we need the following observation:

$$\omega_p(\Phi \otimes \mathrm{Tr}) = \omega_p(\mathrm{Tr} \otimes \Phi) = \omega_p(\Phi). \tag{7}$$

This follows from Lemma 2, proved in the Appendix. We remark that this equality implies that the supremum in the definition of $\omega_p$ can be taken over all $M_d \otimes M_d$, that is,

$$\omega_p(\Phi) = \sup_{X \in M_d \otimes M_d} \frac{\|(I \otimes \Phi)(X)\|_p}{\|\mathrm{Tr}_2(X)\|_p}. \tag{8}$$

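To show this, we first note that the supremum in (8) may be restricted to positive X. Let $X \ge 0$ and let $|\psi_{123}\rangle \in \mathbf{C}^d \otimes \mathbf{C}^d \otimes \mathbf{C}^{d^2}$ be a purification of X, $X = \mathrm{Tr}_3(|\psi_{123}\rangle\langle\psi_{123}|)$. Then

$$\frac{\|(I \otimes \Phi)(X)\|_p}{\|\mathrm{Tr}_2(X)\|_p} = \frac{\|(I_1 \otimes \Phi)(\mathrm{Tr}_3(|\psi_{123}\rangle\langle\psi_{123}|))\|_p}{\|\mathrm{Tr}_{23}(|\psi_{123}\rangle\langle\psi_{123}|)\|_p} = \frac{\|(I_1 \otimes \Phi \otimes \mathrm{Tr})(|\psi_{123}\rangle\langle\psi_{123}|)\|_p}{\|\mathrm{Tr}_{23}(|\psi_{123}\rangle\langle\psi_{123}|)\|_p}.$$

Consequently,

$$\omega_p(\Phi) \le \sup_{X \in M_d \otimes M_d} \frac{\|(I \otimes \Phi)(X)\|_p}{\|\mathrm{Tr}_2(X)\|_p} \le \sup_{\psi \in (\mathbf{C}^d \otimes \mathbf{C}^{d^2})^{\otimes 2}} \frac{\|(I_{12} \otimes \Phi \otimes \mathrm{Tr})(|\psi\rangle\langle\psi|)\|_p}{\|\mathrm{Tr}_{34}(|\psi\rangle\langle\psi|)\|_p} = \omega_p(\Phi \otimes \mathrm{Tr}) = \omega_p(\Phi),$$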

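hence the assertion. We now obtain an alternative proof of multiplicativity of $\|\cdot\|_{p \to p}$ and $\omega_p$.

Theorem 2. For CPT maps $\Phi_1 : M_{d_1} \to M_{d_1}$ and $\Phi_2 : M_{d_2} \to M_{d_2}$ and for 1 ≤ p ≤ ∞,

$$\|\Phi_1 \otimes \Phi_2\|_{p \to p} = \|\Phi_1\|_{p \to p}\, \|\Phi_2\|_{p \to p},$$

$$\omega_p(\Phi_1 \otimes \Phi_2) = \omega_p(\Phi_1)\, \omega_p(\Phi_2).$$

Proof. We first show that the p → p norm of a channel is not changed by tensoring with identity. Indeed, by Theorem 1 and (7),

$$\|\Phi \otimes I\|_{p \to p} = \omega_p((\Phi \otimes I)^C) = \omega_p(\Phi^C \otimes \mathrm{Tr}) = \omega_p(\Phi^C) = \|\Phi\|_{p \to p}.$$

Similarly, $\|I \otimes \Phi\|_{p \to p} = \|\Phi\|_{p \to p}$. Let now $A \in M_{d_1} \otimes M_{d_2}$, $B = (I \otimes \Phi_2)(A)$ and compute

$$\sup_A \frac{\|(\Phi_1 \otimes \Phi_2)(A)\|_p}{\|A\|_p} = \sup_A \frac{\|(\Phi_1 \otimes I)(B)\|_p}{\|B\|_p} \cdot \frac{\|(I \otimes \Phi_2)(A)\|_p}{\|A\|_p} \le \|\Phi_1\|_{p \to p}\, \|\Phi_2\|_{p \to p}.$$

Since the opposite inequality is easy, we get $\|\Phi_1 \otimes \Phi_2\|_{p \to p} = \|\Phi_1\|_{p \to p}\, \|\Phi_2\|_{p \to p}$, which in turn implies the multiplicativity of $\omega_p$.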


Remark. Note that the fact that $\Phi$ preserves trace was never used in the paper, so that the results are valid for all completely positive maps on matrices.

Acknowledgements. This work was done during a visit to Tufts University and thereby partially supported by NSF grant DMS-0314228. The author wishes to thank Mary Beth Ruskai and Christopher King for discussions and valuable comments. The research was supported by Center of Excellence SAS Physics of Information I/2/2005 and Science and Technology Assistance Agency under the contract No. APVT-51-032002.

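Appendix

The following lemma is due to C. King.

Lemma 2. Let $\Psi : M_n \to M_m$ be a channel with the covariance property $\Psi(U \rho U^\dagger) = U' \Psi(\rho) (U')^\dagger$, where $U'$ is a unitary in $M_m$, for any unitary $U \in M_n$. Then for any CPT map $\Phi : M_d \to M_d$, we have $\omega_p(\Psi \otimes \Phi) = \omega_p(\Phi \otimes \Psi) = \omega_p(\Psi)\, \omega_p(\Phi)$.

Proof. The proof uses the fact that there are $n^2$ unitary operators in $M_n$, such that $\sum_{k=0}^{n^2-1} U_k A U_k^\dagger = n\,(\mathrm{Tr}\, A)\, I_n$ for any n × n matrix A, and therefore

$$\sum_k (U_k \otimes I_d)\, A_{12}\, (U_k^\dagger \otimes I_d) = n\, I_n \otimes A_2$$

for $A_{12} \in M_n \otimes M_d$, $A_2 = \mathrm{Tr}_1 A_{12}$. Let us define

$$g_p(\rho, \Psi) = \mathrm{Tr}\, \Big[\big(\rho^{1/2p} \otimes I_d\big)\, X_\Psi\, \big(\rho^{1/2p} \otimes I_d\big)\Big]^p = \mathrm{Tr}\, \Big[X_\Psi^{1/2} \big(\rho^{1/p} \otimes I_d\big) X_\Psi^{1/2}\Big]^p$$

so that $\omega_p(\Psi)^p = \sup_{\rho \ge 0,\, \mathrm{Tr}\rho \le 1} g_p(\rho, \Psi)$. Then by [2], $\rho \mapsto g_p(\rho, \Psi)$ is concave. It is easy to see that $g_p(\rho, \Psi) = g_p(U \rho U^\dagger, \Psi)$ for any unitary operator U on $\mathbf{C}^n$. It follows that for any ρ ≥ 0, Tr ρ = 1,

$$g_p(\rho, \Psi) = \frac{1}{n^2} \sum_k g_p(U_k \rho U_k^\dagger, \Psi) \le g_p\Big(\frac{1}{n^2} \sum_k U_k \rho U_k^\dagger,\ \Psi\Big) = g_p\Big(\frac{1}{n} I_n, \Psi\Big) = \frac{1}{n} \|X_\Psi\|_p^p = \omega_p(\Psi)^p.$$

Similarly, we have

$$g_p(\rho_{12}, \Psi \otimes \Phi) = \frac{1}{n^2} \sum_k g_p\big((U_k \otimes I_d)\, \rho_{12}\, (U_k^\dagger \otimes I_d),\ \Psi \otimes \Phi\big) \le g_p\Big(\frac{1}{n^2} \sum_k (U_k \otimes I_d)\, \rho_{12}\, (U_k^\dagger \otimes I_d),\ \Psi \otimes \Phi\Big)$$

$$= g_p\Big(\frac{1}{n} I_n \otimes \rho_2,\ \Psi \otimes \Phi\Big) = g_p\Big(\frac{1}{n} I_n, \Psi\Big)\, g_p(\rho_2, \Phi).$$

The easy inequality $\omega_p(\Psi)\, \omega_p(\Phi) \le \omega_p(\Psi \otimes \Phi)$ now finishes the proof. The equality $\omega_p(\Phi \otimes \Psi) = \omega_p(\Psi)\, \omega_p(\Phi)$ is proved similarly.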
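The averaging identity used at the start of the proof of Lemma 2 can be checked by hand for n = 2. The sketch below assumes the four Pauli matrices (including the identity) as the family of $n^2$ unitaries; this choice, the test matrix and the helper names are assumptions of the sketch.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# The n^2 = 4 unitaries for n = 2: identity and the three Pauli matrices.
PAULIS = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def twirl(a):
    """sum_k U_k A U_k^dagger over the Pauli family."""
    out = [[0, 0], [0, 0]]
    for u in PAULIS:
        out = add(out, matmul(matmul(u, a), dagger(u)))
    return out

A = [[1 + 2j, 3], [-1j, 0.5]]        # arbitrary, not necessarily Hermitian
result = twirl(A)
expected_diagonal = 2 * (A[0][0] + A[1][1])   # n * Tr(A) with n = 2
print(result, expected_diagonal)
```

The off-diagonal entries cancel and the diagonal entries equal n Tr A, which is the mechanism that replaces an arbitrary input by $\frac1n I_n$ in the concavity step of the proof.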


References

1. Audenaert, K.M.R.: A note on the p → q norms of completely positive maps. http://arxiv.org/list/math-ph/0505085, 2005
2. Epstein, H.: Remarks on two theorems of E. Lieb. Commun. Math. Phys. 31, 317–325 (1973)
3. Devetak, I., Junge, M., King, C., Ruskai, M.B.: Multiplicativity of completely bounded p-norms implies a new additivity result. Commun. Math. Phys., DOI: 10.1007/s00220-006-0034-0
4. Devetak, I., Shor, P.W.: The capacity of a quantum channel for simultaneous transmission of classical and quantum information. Commun. Math. Phys. 256, 287–303 (2005)
5. Holevo, A.S.: On complementary channels and the additivity problem. http://arxiv.org/list/quant-ph/0509101, 2005
6. King, C., Matsumoto, K., Nathanson, M., Ruskai, M.B.: Properties of conjugate channels with applications to additivity and multiplicativity. http://arxiv.org/list/quant-ph/0509126, 2005
7. Paulsen, V.: Completely bounded maps and operator algebras. Cambridge: Cambridge University Press, 2002
8. Pisier, G.: Non-commutative vector valued L_p-spaces and completely p-summing maps. Astérisque (Soc. Math. France) 247, 1–131 (1998)
9. Watrous, J.: Notes on super-operator norms induced by Schatten norms. Quantum Inf. Comput. 5, 57–67 (2005)

Communicated by M.B. Ruskai

Commun. Math. Phys. 266, 71–122 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0010-8

Communications in

Mathematical Physics

Geometric K-Homology of Flat D-Branes Rui M.G. Reis, Richard J. Szabo Department of Mathematics, School of Mathematical and Computer Sciences, Heriot-Watt University, Colin Maclaurin Building, Riccarton, Edinburgh EH14 4AS, U.K. E-mail: [email protected]; [email protected] Received: 20 July 2005 / Accepted: 20 October 2005 Published online: 8 April 2006 – © Springer-Verlag 2006

Abstract: We use the Baum-Douglas construction of K-homology to explicitly describe various aspects of D-branes in Type II superstring theory in the absence of background supergravity form fields. We rigorously derive various stability criteria for states of D-branes and show how standard bound state constructions are naturally realized directly in terms of topological K-cycles. We formulate the mechanism of flux stabilization in terms of the K-homology of non-trivial fibre bundles. Along the way we derive a number of new mathematical results in topological K-homology of independent interest.

Introduction

One of the most exciting recent interactions between physics and mathematics has been through the realization that D-branes in string theory are classified by generalized cohomology theories such as K-theory. The charges of D-branes in Type II superstring theory are classified by the K-theory groups of the spacetime in which they live [36, 49, 53, 64]. In Type I superstring theory one uses instead KO-theory, while the charges of branes in orbifolds and orientifolds are classified by various equivariant K-theories, KR-theories, and extensions thereof [9, 10, 31, 33, 53, 64]. In curved backgrounds and in the presence of a non-trivial B-field, D-brane charge takes values in a twisted K-theory group [14, 40, 64]. In addition, the Ramond-Ramond fields which are typically supported on D-branes similarly take values in appropriate K-theory groups [10, 50, 29]. These realizations have prompted intensive investigations in both the mathematics and physics literature into the properties and definitions of various K-theory groups. 
At the heart of the excitement in these investigations is the fact that the correct physical picture of D-brane charges (and Ramond-Ramond fields) cannot be properly captured in general by ordinary cohomology but rather requires K-theory [30, 24], and conversely that the known physical properties of D-branes in string theory give insights into the rigorous characterizations of certain less widely explored generalized cohomology functors such as those of twisted K-theory [46].


In this paper we will elucidate in detail the observation that K-homology, the homological version of K-theory, is really the more appropriate arena in which to classify D-branes [2, 34, 47, 54, 59, 60]. We treat only the case of Type II D-branes in the absence of non-trivial B-fields. We will build on the classic Baum-Douglas construction of K-homology [6, 7] which is called topological K-homology in order to distinguish it from analytic K-homology, another homological realization of K-theory. In [6, 7] it is proven that this is indeed the homology theory dual to K-theory. For a unified treatment which works for a generic cohomology theory, see [38]. The main advantage of this geometric formulation is that the K-homology cycles encode the most primitive requisite objects that must be carried by any D-brane, such as a spinc structure and a complex vector bundle. Generally, D-branes are much more complicated objects than just subspaces in an ambient spacetime and require a more abstract mathematical notion, such as that of a derived category [26, 61]. Nevertheless, the realization of D-branes in topological K-homology gives them a very natural robust definition in fairly general spacetime backgrounds and reveals various important properties of their (quantum) dynamics that could not be otherwise detected if one classified the brane worldvolumes using only ordinary singular homology. As we will describe in detail in the following, such effects include important stability properties as well as the fact that D-branes do not always wrap submanifolds of the spacetime. Moreover, there is a natural relationship between the Baum-Douglas construction and the realization of certain D-branes as objects in a particular triangulated category. The requisite mathematical material used for this investigation is surveyed in Sect. 1. 
We elaborate on various aspects of K-homology, giving all the relevant definitions (in terms of the Baum-Douglas approach) and describing some new results which to the best of our knowledge have not previously appeared in the literature. We also present different perspectives on K-homology: the spectrum based definition, which gives a more algebraic topological setting, allowing us to use general results in topology to investigate the structure of D-branes; and the analytic definition (based on the Baum-Douglas approach and the Brown-Douglas-Fillmore construction, although one can also use Kasparov’s approach), which relates the study of D-branes with the study of C*-algebras and their K-homology. This second approach makes way to the study of operator algebras, as well as giving a natural setting in which to express the famous index results. Of importance in this section is the set up of several methods to calculate the K-homology groups such as Poincaré Duality, the Universal Coefficient Theorem, and the Chern Character. We then undertake the task of attempting to describe D-branes within this rigorous formalism in Sect. 2. Our goal throughout is two-fold. Firstly, we define and describe the physics of D-branes in a rigorously precise K-homological setting which we hope is accessible to mathematicians with little or no prior knowledge of string theory, while at the same time describing another mathematical framework in which to set and do string theory. Secondly, we emphasize the fact that the (sometimes surprising) physical properties of D-branes are completely transparent when the branes are defined and analysed within the mathematical framework of topological K-homology. Our basic aim will be to find generators for the pertinent K-homology groups which will in turn be identified geometrically with the D-branes of the spacetime. We give some general results on how to do this and some of our results on the description of the K-homology generators (Th. 
2.1, 2.3) deal with the mathematical problem of representation of cycles in a homology theory. These results also relate K-homology with homology and spinc bordism. Many non-trivial dynamical aspects of D-branes are then reformulated as the problem of


finding appropriate changes of bases for the generators of the K-homology groups. Included in this list are the constructions of bound states of D-branes from both the “branes within branes” mechanism [25] and the dielectric effect [51], as well as the decay of unstable systems of D-branes into stable bound states via tachyon condensation on their worldvolumes [36, 53, 57, 64]. While our constructions find their most natural interpretation in the physics of D-branes, the results may also be of independent mathematical interest. In this section we also relate our K-homological approach with the one using derived categories. In Sect. 3 we start turning our attention to explicit examples, including a simple analysis of D-branes in spheres and projective spaces as well as a homological treatment of T-duality. We apply the structure results from Sect. 2 and introduce the Hurewicz homomorphism. This homomorphism is important in the problem of representation of cycles, as well as that of finding the K-homology generators. In Sect. 4 we look at some more complicated examples of D-branes which carry torsion charges. In both the torsion and torsion-free cases, we study in detail the phenomenon of brane instability [24, 46], i.e. that some D-branes may be unstable even though they wrap non-trivial spinc homology cycles of the spacetime. This problem becomes particularly transparent in topological K-homology. A number of explicit examples of torsion D-branes are presented, including those in Lens spaces, in all (even and odd dimensional) real projective spaces and some products thereof, and in the basic Fermat quintic threefold and its mirror Calabi-Yau threefold. We introduce another method of calculating the K-homology groups, namely the Atiyah-Hirzebruch-Whitehead (AHW) spectral sequence, and apply it to Lens spaces. We also deal with the problem of factorizing the Hurewicz homomorphism through K-homology. 
This last issue is non-trivial, since there is no map between the K-homology and the ordinary homology spectra. We give conditions and particular instances in which this occurs. We also calculate and give a geometric interpretation (to the best of our knowledge, for the first time) of the K-homology of the even-dimensional real projective spaces, which are good examples of non-spin$^c$ spaces and therefore do not allow for the application of the usual tools (i.e. Poincaré duality). Finally, in Sect. 5 we examine the problem of stabilizing certain D-branes even when they wrap homologically trivial worldvolumes. This is achieved by regarding the ambient spacetime as the total space of a non-trivial fibration. The characteristic class of the fibre bundle then acts as a source of stabilization and effectively renders the D-brane stable. We present a number of classes of examples of this type, and in each instance explicitly determine the topological K-homology groups. Our analysis includes as a special case the well-known example of spherical D-branes wrapping $S^2 \subset S^3$ [5], which in our construction are rendered stable by virtue of the Hopf fibration over $\mathbb{CP}^1$. The methods of this section are based on the Leray-Serre spectral sequence, of which the AHW spectral sequence is a special case. Using this spectral sequence we give general results on the structure of the K-homology groups of spaces which are total spaces of certain types of fibrations (coverings, spherical fibrations, and fibrations with fibre a sphere). In summary, in this paper we present another way of viewing D-branes. In this setting a D-brane on a background spacetime X is a triple (M, E, φ), where M is a spin$^c$ submanifold of X (usually interpreted as a D-brane in string theory), E is the Chan-Paton bundle on M, and φ : M → X is the embedding.
This contains more information than the usual picture of a D-brane as a submanifold of spacetime, since it also considers the specific embedding of M in X and its Chan-Paton bundle. D-brane charges are then classified by the K-homology group of X , instead of its K-theory. The relations


used to generate the equivalence relation defining the group are physically meaningful and therefore natural. In general, the K-homology group may contain elements whose generators are not as above (e.g., φ may only be a continuous map instead of an embedding of manifolds), and the question of whether this is always the case relates to the mathematical problem of representation of cycles, for which we have given some answers (for instance in Theorems 2.1 and 2.3 and in the numerous examples given). In terms of string theory, one of the main advantages of this method is that it allows one, through calculation, to obtain a complete description, up to equivalence, of all the D-branes that may appear in a specific spacetime. Several methods are given to calculate these, and abstract as well as specific examples are worked out in detail. Also, this paper recasts interpretations of D-brane phenomena done in the setting of K-homology of C∗-algebras into an algebraic topological framework, while supplying the necessary tools to work in this new topological framework. Although we lose the richer structure of a cohomology theory, we think that the ability to obtain a full geometric characterization of D-branes more than compensates for that. Another task we tried to fulfil with this paper was to create, as far as possible, a dictionary between string theory (and particularly D-brane theory) and mathematics, in order to make it easier for mathematicians not necessarily connected with mathematical physics to acquaint themselves with some of the mathematical problems that lie at the heart of string theory, and to put the physical notions in a mathematically rigorous setting as much as possible.

1. K-Homology

In this section we shall develop the mathematical material that will be required throughout this paper, postponing the start of our string theory considerations until the next section.
We define geometric K-homology and describe some basic properties of the topological K-homology groups of a topological space. We also compare this homology theory with other formulations of K-homology as the dual theory to K-theory. Unless otherwise stated, in this paper X will denote a finite CW-complex, i.e. a compact CW-complex (see [58] for a rigorous mathematical definition). For instance, compact manifolds are finite CW-complexes.

1.1. K-cycles.

Definition 1.1. A K-cycle on X is a triple (M, E, φ) where
(i) M is a compact spin$^c$ manifold without boundary;
(ii) E is a complex vector bundle over M; and
(iii) φ : M → X is a continuous map.

There are no connectedness requirements made upon M, and hence the bundle E can have different fibre dimensions on the different connected components of M. It follows that disjoint union

$$(M_1, E_1, \phi_1) \sqcup (M_2, E_2, \phi_2) := (M_1 \sqcup M_2,\, E_1 \sqcup E_2,\, \phi_1 \sqcup \phi_2)$$

is a well-defined operation on the set of K-cycles on X.

Definition 1.2. Two K-cycles $(M_1, E_1, \phi_1)$ and $(M_2, E_2, \phi_2)$ on X are isomorphic if there exists a diffeomorphism $h : M_1 \to M_2$ such that
(i) h preserves the spin$^c$ structures;
(ii) $h^*(E_2) \cong E_1$; and
(iii) the diagram

$$\begin{array}{ccc} M_1 & \stackrel{h}{\longrightarrow} & M_2 \\ & {\scriptstyle\phi_1}\searrow \quad \swarrow{\scriptstyle\phi_2} & \\ & X & \end{array}$$

commutes.

The set of isomorphism classes of K-cycles on X is denoted $\Gamma(X)$.

Definition 1.3 (Bordism). Two K-cycles $(M_1, E_1, \phi_1)$ and $(M_2, E_2, \phi_2)$ on X are bordant if there exist a compact spin$^c$ manifold W with boundary, a complex vector bundle E → W, and a continuous map φ : W → X such that the two K-cycles $(\partial W, E|_{\partial W}, \phi|_{\partial W})$ and $(M_1 \sqcup (-M_2),\, E_1 \sqcup E_2,\, \phi_1 \sqcup \phi_2)$ are isomorphic. Here $-M_2$ denotes $M_2$ with the spin$^c$ structure on its tangent bundle $TM_2$ reversed.

1.2. Clutching construction. Before proceeding with further definitions, we need a construction that will be instrumental in defining the topological K-homology groups. Let M be a compact spin$^c$ manifold and F → M a $C^\infty$ real spin$^c$ vector bundle with even-dimensional fibres and projection map ρ. Let $1\!\!1^{\mathbb{R}}_M := M \times \mathbb{R}$ denote the trivial real line bundle over M. Then $F \oplus 1\!\!1^{\mathbb{R}}_M$ is a real vector bundle over M with odd-dimensional fibres. By choosing a $C^\infty$ metric on it, we may define the unit sphere bundle

$$\widehat{M} = S\big(F \oplus 1\!\!1^{\mathbb{R}}_M\big) \qquad (1.1)$$

by restricting the set of fibre vectors of $F \oplus 1\!\!1^{\mathbb{R}}_M$ to those which have unit norm. The spin$^c$ structures on TM and F induce a spin$^c$ structure on $T\widehat{M}$ by the exact sequence lemma [6], and hence $\widehat{M}$ is a spin$^c$ manifold. By construction, $\widehat{M}$ is a sphere bundle over M with even-dimensional spheres as fibres. We denote the bundle projection by

$$\pi : \widehat{M} \longrightarrow M . \qquad (1.2)$$

Alternatively, we may regard the total space $\widehat{M}$ as consisting of two copies $B_\pm(F)$ of the unit ball bundle B(F) of F (carrying opposite spin$^c$ structures) glued together by the identity map $\mathrm{id}_{S(F)}$ on its boundary, so that

$$\widehat{M} = B_+(F) \cup_{S(F)} B_-(F) . \qquad (1.3)$$

For $p \in M$, let $2n = \dim_{\mathbb{R}} F_p$ with $n \in \mathbb{N}$. The group $\mathrm{Spin}^c(2n)$ has two irreducible half-spin representations. The spin$^c$ structure on F associates to these representations complex vector bundles $S_0(F)$ and $S_1(F)$ of equal rank $2^{n-1}$ over M. Their Whitney sum $S(F) = S_0(F) \oplus S_1(F)$ is a bundle of Clifford modules over M such that $C(F) \otimes \mathbb{C} \cong \mathrm{End}\, S(F)$, where C(F) is the real Clifford algebra bundle of F. Let $S_+(F)$ and $S_-(F)$ be the spinor bundles over F obtained from pullbacks to F by the bundle projection ρ : F → M of $S_0(F)$ and $S_1(F)$, respectively. Clifford multiplication induces a bundle map $F \otimes S_0(F) \to S_1(F)$ that defines a vector bundle map $\sigma : S_+(F) \to S_-(F)$ covering $\mathrm{id}_F$ which is an isomorphism outside the zero section of F. Since the ball bundle B(F) is a sub-bundle of F, we may form spinor bundles over $B_\pm(F)$ as the restriction bundles $\Delta_\pm(F) = S_\pm(F)|_{B_\pm(F)}$. We can then glue $\Delta_+(F)$ and $\Delta_-(F)$ along $S(F) = \partial B(F)$ by the Clifford multiplication map σ, giving a vector bundle over $\widehat{M}$ defined by

$$H(F) = \Delta_+(F) \cup_\sigma \Delta_-(F) . \qquad (1.4)$$

For each $p \in M$, $H(F)|_{\pi^{-1}(p)}$ is the Bott generator vector bundle over the even-dimensional sphere $\pi^{-1}(p)$ [6]. Thus, starting from the triple (M, F, ρ) we have constructed another triple $(\widehat{M}, H(F), \pi)$.

Definition 1.4. If (M, E, φ) is a K-cycle on X and F is a $C^\infty$ real spin$^c$ vector bundle over M with even-dimensional fibres, then the process of obtaining the K-cycle $(\widehat{M},\, H(F) \otimes \pi^*(E),\, \phi \circ \pi)$ from (M, E, φ) is called vector bundle modification.

1.3. Topological K-homology. We are now ready to define the topological K-homology groups of the space X.

Definition 1.5. The topological K-homology group of X is the group obtained from quotienting $\Gamma(X)$ by the equivalence relation ∼ generated by the relations of
(i) bordism;
(ii) direct sum: if $E = E_1 \oplus E_2$, then $(M, E, \phi) \sim (M, E_1, \phi) \sqcup (M, E_2, \phi)$; and
(iii) vector bundle modification.
The group operation is induced by disjoint union of K-cycles. We denote this group by $K^t_\sharp(X) := \Gamma(X) / \sim$, and the homology class of the K-cycle (M, E, φ) by $[M, E, \phi] \in K^t_\sharp(X)$.

The manifolds $M \sqcup N$ and $N \sqcup M$ are bordant through a bordism which clearly induces a bordism of the respective K-cycles. It follows that [∅, ∅, ∅] is the identity for the operation induced by disjoint union of K-cycles and [−M, E, φ] is the inverse of [M, E, φ], where −M denotes M with its spin$^c$ structure reversed. The operation is also clearly associative and commutative. Thus $K^t_\sharp(X)$ is an abelian group. Since the equivalence relation on $\Gamma(X)$ preserves the parity of the dimension of M in K-cycles (M, E, φ), one can define the subgroup $K^t_0(X)$ (resp. $K^t_1(X)$) consisting of classes of K-cycles (M, E, φ) for which all connected components $M_i$ of M are of even (resp. odd) dimension. Then $K^t_\sharp(X) = K^t_0(X) \oplus K^t_1(X)$ has a natural $\mathbb{Z}_2$-grading.

The geometric construction of K-homology is functorial. If f : X → Y is a continuous map, then the induced homomorphism

$$f_* : K^t_\sharp(X) \longrightarrow K^t_\sharp(Y)$$

of $\mathbb{Z}_2$-graded abelian groups is given on classes of K-cycles $[M, E, \phi] \in K^t_\sharp(X)$ by

$$f_*[M, E, \phi] := [M, E, f \circ \phi] .$$

One has $(\mathrm{id}_X)_* = \mathrm{id}_{K^t_\sharp(X)}$ and $(f \circ g)_* = f_* \circ g_*$. Since vector bundles over M extend to vector bundles over M × [0, 1], it follows by bordism that induced homomorphisms depend only on the homotopy classes of the maps inducing them.
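As an illustration of vector bundle modification (Definition 1.4), the following sketch works out the simplest case, which is our own choice of example rather than one taken from the text: the basic K-cycle on a point, modified by a trivial bundle of rank 2n. It assumes that F carries the standard spin$^c$ structure of $\mathbb{R}^{2n}$.

```latex
% Vector bundle modification of the basic K-cycle on a point.
% Assumption: F = pt x R^{2n} with its standard spin^c structure.
\begin{align*}
  (M, E, \phi) &= (\mathrm{pt},\, \mathbb{C},\, \mathrm{id}_{\mathrm{pt}}), \qquad
  F = \mathrm{pt} \times \mathbb{R}^{2n}, \\
  \widehat{M} &= S\bigl(\mathbb{R}^{2n} \oplus \mathbb{R}\bigr) = S^{2n}, \qquad
  \pi : S^{2n} \longrightarrow \mathrm{pt}, \\
  (\mathrm{pt},\, \mathbb{C},\, \mathrm{id}_{\mathrm{pt}}) &\sim
  \bigl(S^{2n},\, H(F) \otimes \pi^{*}\mathbb{C},\, \mathrm{id}_{\mathrm{pt}} \circ \pi\bigr)
  = \bigl(S^{2n},\, H(F),\, \pi\bigr).
\end{align*}
```

Here H(F) is the Bott generator bundle over $S^{2n}$, so the same class on a point is represented equally well by an even-dimensional sphere carrying its Bott bundle.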


If pt denotes a one-point topological space, then the only K-cycles on pt are $(\mathrm{pt}, \mathrm{pt} \times \mathbb{C}^k, \mathrm{id}_{\mathrm{pt}})$ with $k \in \mathbb{N}$. Thus $K^t_0(\mathrm{pt}) \cong \mathbb{Z}$ and $K^t_1(\mathrm{pt}) \cong 0$. The collapsing map ε : X → pt then induces an epimorphism

$$\varepsilon_* : K^t_\sharp(X) \longrightarrow K^t_\sharp(\mathrm{pt}) \cong \mathbb{Z} . \qquad (1.5)$$

The reduced topological K-homology group of X is

$$\widetilde{K}^t_\sharp(X) := \ker \varepsilon_* . \qquad (1.6)$$

Since the map (1.5) is an epimorphism with right inverse induced by the inclusion of a point ι : pt → X, one has $K^t_\sharp(X) \cong \mathbb{Z} \oplus \widetilde{K}^t_\sharp(X)$ for any space X.

1.4. Computational tools. Before adding further structure to this K-homology theory, we pause to describe some basic technical results which will aid in calculating the groups $K^t_\sharp(X)$, particularly in the subsequent sections when we shall seek explicit K-cycle representatives for their generators. In what follows we shall use the notation [n] := {1, 2, . . . , n}.

Lemma 1.1. $K^t_\sharp(X)$ is generated by classes of K-cycles [M, E, φ] where M is connected.

Proof. Let $\{M_i\}_{i \in I}$ be the set of connected components of M. Since M is compact, I is a finite set. Defining $E_i := E|_{M_i}$ and $\phi_i := \phi|_{M_i}$, we have $E \cong \bigsqcup_{i \in I} E_i$ and $\phi = \bigsqcup_{i \in I} \phi_i$, so that $[M, E, \phi] = \sum_{i \in I} [M_i, E_i, \phi_i]$.

Lemma 1.2. If $\{X_j\}_{j \in J}$ is the set of connected components of X then $K^t_\sharp(X) = \bigoplus_{j \in J} K^t_\sharp(X_j)$.

Proof. Let $[M, E, \phi] \in K^t_\sharp(X)$ with $\{M_i\}_{i \in I}$ the set of connected components of M. As in the proof of Lemma 1.1, one has $[M, E, \phi] = \sum_{i \in I} [M_i, E_i, \phi_i]$. For any $i \in I$, $M_i$ is connected and $\phi_i$ is continuous, and so there exists $j_i \in J$ such that $\phi_i(M_i) \subset X_{j_i}$. Thus $[M_i, E_i, \phi_i] \in K^t_\sharp(X_{j_i})$ and so $[M, E, \phi] \in \bigoplus_{j \in J} K^t_\sharp(X_j)$. Conversely, let $[M_i, E_i, \phi_i] \in K^t_\sharp(X_{j_i})$ for each $i \in [n]$ and some $j_i \in J$. Defining $M := \bigsqcup_{i \in [n]} M_i$, $E := \bigsqcup_{i \in [n]} E_i$ and $\phi := \bigsqcup_{i \in [n]} \phi_i$, one has $[M, E, \phi] \in K^t_\sharp(X)$. The conclusion now follows by considering the image of the class $\sum_{i \in [n]} [M_i, E_i, \phi_i]$ in $K^t_\sharp(X)$ under the homomorphism induced by the continuous map $\bigsqcup_{i \in [n]} \iota_{j_i}$, where $\iota_{j_i} : X_{j_i} \to X$ are the canonical inclusions.

Lemma 1.3. Let (M, E, φ) be a K-cycle on X. Suppose that the degree 0 topological K-theory group $K_t^0(M)$ of M is generated as a $\mathbb{Z}$-module by classes $[F_1], \ldots, [F_p]$ of complex vector bundles over M. Then [M, E, φ] belongs to the $\mathbb{Z}$-submodule of $K^t_\sharp(X)$ generated by $\{[M, F_i, \phi]\}_{i \in [p]}$.

Proof. By hypothesis there exist integers $n_1, \ldots$
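, $n_p$ such that $[E] = \sum_{i \in [p]} n_i\, [F_i]$. Without loss of generality we may suppose that $n_j \geq 0$ for all $1 \leq j \leq m$ while $n_j < 0$ for all $m + 1 \leq j \leq p$, for some m with $1 \leq m \leq p$. Then

$$[E] + \sum_{i=m+1}^{p} (-n_i)\, [F_i] = \sum_{i=1}^{m} n_i\, [F_i] ,$$

which implies that there exists an integer $k \geq 0$ such that

$$E \oplus \Big( \bigoplus_{i=m+1}^{p}\, \bigoplus_{j=1}^{-n_i} F_i \Big) \oplus \big(1\!\!1^{\mathbb{C}}_M\big)^{k} = \Big( \bigoplus_{i=1}^{m}\, \bigoplus_{j=1}^{n_i} F_i \Big) \oplus \big(1\!\!1^{\mathbb{C}}_M\big)^{k} .$$

Going down to classes in $K^t_\sharp(X)$ using the direct sum relation, we then have

$$[M, E, \phi] + \sum_{i=m+1}^{p} (-n_i)\, [M, F_i, \phi] + \big[M, (1\!\!1^{\mathbb{C}}_M)^{k}, \phi\big] = \sum_{i=1}^{m} n_i\, [M, F_i, \phi] + \big[M, (1\!\!1^{\mathbb{C}}_M)^{k}, \phi\big] ,$$

which implies that $[M, E, \phi] = \sum_{i \in [p]} n_i\, [M, F_i, \phi]$.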
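A small worked instance of Lemma 1.3 may help. The choice $M = \mathbb{CP}^1$ and the bundles below are our own illustration and do not appear in the text. The only input beyond the lemma is the standard fact that $K_t^0(\mathbb{CP}^1)$ is freely generated by the classes of the trivial line bundle and the hyperplane line bundle L, with relation $([L] - 1)^2 = 0$.

```latex
% Assumptions (illustrative): M = CP^1, F_1 = trivial line bundle, F_2 = L,
% and phi : M -> X is any continuous map.
\begin{align*}
  K_t^{0}(\mathbb{CP}^1) &= \mathbb{Z}\,[1\!\!1^{\mathbb{C}}_M] \oplus \mathbb{Z}\,[L],
  \qquad ([L] - 1)^2 = 0, \\
  % Case with non-negative coefficients: only the direct sum relation is used.
  [M,\, L \oplus L \oplus 1\!\!1^{\mathbb{C}}_M,\, \phi]
    &= 2\,[M, L, \phi] + [M, 1\!\!1^{\mathbb{C}}_M, \phi], \\
  % Case with a negative coefficient: [L \otimes L] = 2[L] - [1] in K-theory.
  [M,\, L \otimes L,\, \phi]
    &= 2\,[M, L, \phi] - [M, 1\!\!1^{\mathbb{C}}_M, \phi].
\end{align*}
```

The second line needs nothing beyond relation (ii) of Definition 1.5, while the third relies on the stabilisation by a trivial bundle used in the proof, since $L \otimes L \oplus 1\!\!1^{\mathbb{C}}_M$ and $L \oplus L$ have the same K-theory class.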

Corollary 1.1. The homology class of a K-cycle (M, E, φ) on X depends only on the K-theory class of E in $K_t^0(M)$.

Lemma 1.4. The homology class of a K-cycle (M, E, φ) on X depends only on the homotopy class of φ in [M, X].

Proof. This follows from $[M, E, \phi] = [M, E, \phi \circ \mathrm{id}_M] = \phi_*[M, E, \mathrm{id}_M]$ and the fact that induced homomorphisms depend only on their homotopy classes.

Corollary 1.2. If X is a compact spin$^c$ manifold without boundary, E → X is a complex vector bundle and φ : X → X is a continuous map, then [X, E, φ] depends only on the homotopy class of φ in [X, X].

1.5. Cap product. The cap product is the $\mathbb{Z}_2$-degree preserving bilinear pairing

$$\cap : K_t^0(X) \otimes K^t_\sharp(X) \longrightarrow K^t_\sharp(X)$$

given for any complex vector bundle F → X and K-cycle class $[M, E, \phi] \in K^t_\sharp(X)$ by

$$[F] \cap [M, E, \phi] := [M,\, \phi^* F \otimes E,\, \phi]$$

and extended linearly. It makes $K^t_\sharp(X)$ into a module over the ring $K_t^0(X)$. Later on (see Sects. 2.7 and 3.2) we will see that this product can be extended to a bilinear form

$$\cap : K_t^i(X) \otimes K^t_j(X) \longrightarrow K^t_{e(i+j)}(X) , \qquad (1.7)$$

where we have denoted the mod 2 congruence class of an integer $n \in \mathbb{Z}$ by

$$e(n) := \begin{cases} 0 , & n \text{ even} \\ 1 , & n \text{ odd} \end{cases} .$$

The construction utilizes Bott periodicity and the isomorphism $K_t^1(X) \cong K_t^0(\Sigma X)$, where $\Sigma X$ is the reduced suspension of the topological space X. The product $\cap : K_t^1(X) \otimes K^t_i(X) \to K^t_{e(i+1)}(X)$ is given by the pairing $\cap : K_t^0(\Sigma X) \otimes K^t_{e(i-1)}(\Sigma X) \to K^t_{e(i-1)}(\Sigma X)$.
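The module property of the degree 0 cap product can be checked directly on K-cycles. The computation below is a sketch that uses only the defining formula $[F] \cap [M, E, \phi] = [M, \phi^* F \otimes E, \phi]$, the compatibility of pullback with tensor products, and the fact that isomorphic K-cycles define the same class. Here F and F′ stand for arbitrary complex vector bundles over X.

```latex
\begin{align*}
  [F'] \cap \bigl([F] \cap [M, E, \phi]\bigr)
    &= [F'] \cap [M,\, \phi^{*}F \otimes E,\, \phi] \\
    &= [M,\, \phi^{*}F' \otimes \phi^{*}F \otimes E,\, \phi] \\
    &= [M,\, \phi^{*}(F' \otimes F) \otimes E,\, \phi]
     = \bigl([F'] \cdot [F]\bigr) \cap [M, E, \phi], \\
  % The trivial line bundle acts as the identity of the ring.
  [1\!\!1^{\mathbb{C}}_X] \cap [M, E, \phi]
    &= [M,\, \phi^{*}1\!\!1^{\mathbb{C}}_X \otimes E,\, \phi] = [M, E, \phi].
\end{align*}
```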


1.6. Exterior product. If X and Y are spaces, then the exterior product

$$\times : K^t_i(X) \otimes K^t_j(Y) \longrightarrow K^t_{e(i+j)}(X \times Y)$$

is given for classes of K-cycles $[M, E, \phi] \in K^t_i(X)$ and $[N, F, \psi] \in K^t_j(Y)$ by

$$[M, E, \phi] \times [N, F, \psi] := [M \times N,\, E \boxtimes F,\, (\phi, \psi)] ,$$

where M × N has the spin$^c$ product structure uniquely induced by the spin$^c$ structures on M and N, and $E \boxtimes F$ is the vector bundle over M × N with fibres $(E \boxtimes F)_{(p,q)} = E_p \otimes F_q$ for $(p, q) \in M \times N$. This product is natural with respect to continuous maps and there is the following version (due to Atiyah) of the Künneth theorem in K-homology [48].

Theorem 1.1. If X is a CW-complex and Y is a compact topological space, then for each i = 0, 1 there is a natural short exact sequence

$$0 \longrightarrow \bigoplus_{e(j+l)=i} K^t_j(X) \otimes K^t_l(Y) \longrightarrow K^t_i(X \times Y) \longrightarrow \bigoplus_{e(j+l)=e(i+1)} \mathrm{Tor}\big(K^t_j(X) ,\, K^t_l(Y)\big) \longrightarrow 0 .$$
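As an illustration of Theorem 1.1, the sketch below applies the Künneth sequence to the two-torus $S^1 \times S^1$. The values $K^t_0(S^1) \cong \mathbb{Z}$ and $K^t_1(S^1) \cong \mathbb{Z}$ are standard and are assumed here rather than derived. The example itself is ours and is not taken from the text.

```latex
% Assumed input: K^t_0(S^1) = Z and K^t_1(S^1) = Z, both torsion-free,
% so every Tor term in the Kunneth sequence vanishes.
\begin{align*}
  \mathrm{Tor}\bigl(K^t_j(S^1),\, K^t_l(S^1)\bigr) &= 0
    \quad\text{for all } j, l \in \{0, 1\}, \\
  K^t_0(S^1 \times S^1) &\cong
    \bigl(K^t_0(S^1) \otimes K^t_0(S^1)\bigr) \oplus \bigl(K^t_1(S^1) \otimes K^t_1(S^1)\bigr)
    \cong \mathbb{Z} \oplus \mathbb{Z}, \\
  K^t_1(S^1 \times S^1) &\cong
    \bigl(K^t_0(S^1) \otimes K^t_1(S^1)\bigr) \oplus \bigl(K^t_1(S^1) \otimes K^t_0(S^1)\bigr)
    \cong \mathbb{Z} \oplus \mathbb{Z}.
\end{align*}
```

Since the first map in the sequence is induced by the exterior product, generators of these groups can be taken to be exterior products of generators of the K-homology of the two circle factors.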

This sequence always splits, although unnaturally (see [12, 23] where a generalization to the case of C∗-algebras is given, and [56]).

1.7. Spectral K-homology. Let X be a general (not necessarily compact) CW-complex. We then define

$$K^t_i(X) := \varinjlim_{Y}\, K^t_i(Y)$$

for i = 0, 1, where the limit runs over the finite CW-subcomplexes Y of X. By defining $K^t_{2k}(X) := K^t_0(X)$ and $K^t_{2k+1}(X) := K^t_1(X)$ for all $k \in \mathbb{Z}$, one has that $K^t_\sharp(X)$ is a 2-periodic unreduced homology theory on the category of CW-complexes. On the other hand, K-theory is a 2-periodic cohomology theory which can be defined in terms of its spectrum $\mathrm{KU} = \{K^U_n\}_{n \in \mathbb{Z}}$, where $K^U_{2k} := \mathbb{Z} \times BU(\infty)$ and $K^U_{2k+1} := U(\infty)$ are the classifying spaces for $K_t^0$ and $K_t^1$, respectively. Thus we can define [58] a homology theory related to $K_t^\sharp$ by the inductive limit

$$K^s_i(X, Y) := \varinjlim_{n}\, \pi_{n+i}\big( (X/Y) \wedge K^U_n \big)$$

for all $i \in \mathbb{Z}$, where Y is a closed subspace of the topological space X and ∧ denotes the smash product. Bott periodicity then implies that this is a 2-periodic homology theory. For any finite CW-complex X, we can construct a map

$$\mu^s : K^t_i(X) \longrightarrow K^s_i(X) := K^s_i(X, \emptyset)$$

given by

$$\mu^s[M, E, \phi] := \phi_*\big( [E] \cap [M]_s \big)$$

on classes of K-cycles and extended by linearity. Here $\cap : K_t^0(X) \otimes K^s_i(X) \to K^s_i(X)$ for i = 0, 1 is the spectrally defined cap product with $[M]_s$ the fundamental class of the manifold M in $K^s_i(X)$ [58]. The transformation $\mu^s$ is an isomorphism which is natural in X, and so it defines a natural equivalence between the functors $K^t_\sharp$ and $K^s_\sharp$ [38]. It follows that $K^t_\sharp(X)$ is a realization of $K^s_\sharp(X)$. The map $\mu^s$ is also compatible with cap products, i.e. $\mu^s(\xi \cap \alpha) = \xi \cap \mu^s(\alpha)$ for all $\xi \in K_t^\sharp(X)$ and $\alpha \in K^t_\sharp(X)$, or equivalently there is a commutative diagram

$$\begin{array}{ccc}
K_t^i(X) \otimes K^t_j(X) & \stackrel{\cap}{\longrightarrow} & K^t_{e(i+j)}(X) \\
{\scriptstyle \mathrm{id}_{K_t^i(X)} \otimes \mu^s} \downarrow & & \downarrow {\scriptstyle \mu^s} \\
K_t^i(X) \otimes K^s_j(X) & \stackrel{\cap}{\longrightarrow} & K^s_{e(i+j)}(X) .
\end{array}$$

In particular, if X is a compact connected spin$^c$ manifold without boundary, then

$$\mu^s\big[X, 1\!\!1^{\mathbb{C}}_X, \mathrm{id}_X\big] = (\mathrm{id}_X)_*\big( [1\!\!1^{\mathbb{C}}_X] \cap [X]_s \big) = (\mathrm{id}_X)_*[X]_s = [X]_s$$

in $K^s_\sharp(X)$, with $1\!\!1^{\mathbb{C}}_X$ the trivial complex line bundle over X. Since $\mu^s$ is a natural equivalence between $K^t_\sharp$ and $K^s_\sharp$ it follows that, within the framework of topological K-homology as the dual theory to K-theory, $[X, 1\!\!1^{\mathbb{C}}_X, \mathrm{id}_X]$ is the fundamental class of X in $K^t_\sharp(X)$.

One can give a definition of relative K-homology groups $K^t_i(X, Y)$ in such a way that there is also a map $\mu^s : K^t_i(X, Y) \to K^s_i(X, Y)$ which defines a natural equivalence between functors on the category of topological spaces having the homotopy type of finite CW-pairs (X, Y) [38]. One can also give a bordism description of $K^t_\sharp(X, Y)$ as follows. We consider the set of all triples (M, E, φ), where
(i) M is a compact spin$^c$ manifold with boundary;
(ii) E is a complex vector bundle over M; and
(iii) φ : M → X is a continuous map with φ(∂M) ⊂ Y.
This set is quotiented by relations of bordism (modified from Definition 1.3 by the requirement that $M_1 \sqcup (-M_2) \subset \partial W$ is a regularly embedded submanifold of codimension 0 with $\phi(\partial W \setminus M_1 \sqcup (-M_2)) \subset Y$), direct sum and vector bundle modification. The collection of equivalence classes is a $\mathbb{Z}_2$-graded abelian group with operation induced by disjoint union of relative K-cycles [38].

Since K-homology is a generalized homology theory, there is a long exact homology sequence for any pair (X, Y). Because $K^t_\sharp$ is a 2-periodic theory, this sequence truncates to the six-term exact sequence

$$\begin{array}{ccccc}
K^t_0(Y) & \xrightarrow{\ \iota_*\ } & K^t_0(X) & \xrightarrow{\ \varsigma_*\ } & K^t_0(X, Y) \\
{\scriptstyle\partial}\uparrow & & & & \downarrow{\scriptstyle\partial} \\
K^t_1(X, Y) & \xleftarrow{\ \varsigma_*\ } & K^t_1(X) & \xleftarrow{\ \iota_*\ } & K^t_1(Y)
\end{array}$$

where the horizontal arrows are induced by the canonical inclusion maps ι : Y → X and ς : (X, ∅) → (X, Y). In the bordism description, the connecting homomorphism is given by the boundary map

$$\partial[M, E, \phi] := [\partial M,\, E|_{\partial M},\, \phi|_{\partial M}]$$


on classes of K-cycles and extended by linearity. One also has the usual excision property. If U ⊂ Y is a subspace whose closure lies in the interior of Y, then the inclusion $\varsigma^U : (X \setminus U, Y \setminus U) \to (X, Y)$ induces an isomorphism

$$\varsigma^U_* : K^t_\sharp(X \setminus U, Y \setminus U) \xrightarrow{\ \approx\ } K^t_\sharp(X, Y)$$

of $\mathbb{Z}_2$-graded abelian groups.

1.8. Analytic K-homology. We will now briefly describe the relationship between $K^t_\sharp(X)$ and the analytic K-homology groups of a finite CW-complex X (the construction of these groups carries over verbatim from the case of a compact metrizable topological space, see [6]).

1.8.1. The group $K^a_0(X)$. Let $\Gamma_0(X)$ be the set of all quintuples $(\mathcal{H}_0, \psi_0, \mathcal{H}_1, \psi_1, T)$ where
(i) for each i = 0, 1, $\mathcal{H}_i$ is a separable Hilbert space;
(ii) for each i = 0, 1, $\psi_i : C(X) \to \mathcal{L}(\mathcal{H}_i)$ is a unital algebra ∗-homomorphism, where C(X) is the C∗-algebra of continuous complex-valued functions on X and $\mathcal{L}(\mathcal{H}_i)$ is the C∗-algebra of bounded linear operators on $\mathcal{H}_i$; and
(iii) $T : \mathcal{H}_0 \to \mathcal{H}_1$ is a bounded Fredholm operator such that the operator $T \circ \psi_0(f) - \psi_1(f) \circ T$ is compact for all $f \in C(X)$.
We can define on $\Gamma_0(X)$ a direct sum operation and an equivalence relation generated by isomorphism, direct sum with a trivial object, and compact perturbation of Fredholm operators. The quotient set is, with direct sum, an abelian group $K^a_0(X)$ called the degree 0 analytic K-homology group of X. There is an epimorphism

$$\mathrm{Index} : K^a_0(X) \longrightarrow \mathbb{Z} \qquad (1.8)$$

given by $\mathrm{Index}[\mathcal{H}_0, \psi_0, \mathcal{H}_1, \psi_1, T] := \mathrm{Index}\, T$.

Suppose that X is a closed $C^\infty$ manifold, $E_0$, $E_1$ are complex $C^\infty$ vector bundles over X and $D : C^\infty(E_0) \to C^\infty(E_1)$ is an elliptic pseudo-differential operator on X. Then one can construct an element $[D] \in K^a_0(X)$ which depends only on D. All elements of $K^a_0(X)$ arise in this way, and in this case we have that Index[D] = Index D is the analytic index of D regarded as a Fredholm operator [6].

1.8.2. The group $K^a_1(X)$. Let $\Gamma_1(X)$ be the set of all pairs $(\mathcal{H}, \tau)$ where
(i) $\mathcal{H}$ is a separable Hilbert space; and
(ii) $\tau : C(X) \to \mathcal{Q}(\mathcal{H})$ is a unital algebra ∗-homomorphism, where $\mathcal{Q}(\mathcal{H}) = \mathcal{L}(\mathcal{H}) / \mathcal{K}(\mathcal{H})$ is the Calkin algebra with $\mathcal{K}(\mathcal{H})$ the closed ideal in $\mathcal{L}(\mathcal{H})$ consisting of compact operators on $\mathcal{H}$.


On $\Gamma_1(X)$ we can define a direct sum operation and an equivalence relation using unitary equivalence and triviality. The quotient set is, with direct sum, an abelian group $K^a_1(X)$ called the degree 1 analytic K-homology group of X. It coincides with the Brown-Douglas-Fillmore group $\mathrm{Ext}(X) := \mathrm{Ext}(C(X), \mathcal{K})$ of equivalence classes of extensions of the C∗-algebra C(X) by compact operators $\mathcal{K}$ [8], defined by C∗-algebras $\mathcal{A}$ which fit into the short exact sequence

$$0 \longrightarrow \mathcal{K} \longrightarrow \mathcal{A} \longrightarrow C(X) \longrightarrow 0 . \qquad (1.9)$$

Suppose that X is a closed $C^\infty$ manifold, E is a complex $C^\infty$ vector bundle over X and $A : C^\infty(E) \to C^\infty(E)$ is a self-adjoint elliptic pseudo-differential operator on X. Then one can construct an element $[A] \in K^a_1(X)$ which depends only on A. All elements of $K^a_1(X)$ arise in this way [6].

1.8.3. The group $K^a_\sharp(X)$. We define $K^a_\sharp(X) := K^a_0(X) \oplus K^a_1(X)$ to be the analytic K-homology group of X. There is a natural notion of induced homomorphism $f_* : K^a_\sharp(X) \to K^a_\sharp(Y)$ for continuous maps f : X → Y such that $K^a_\sharp$ is a 2-periodic homology theory. Let us now describe its explicit relation to the topological K-homology theory $K^t_\sharp$. Let (M, E, φ) be a K-cycle on the finite CW-complex X and $D\!\!\!\!/\,$ the Dirac operator on the spin$^c$ manifold M. Then the twisted Dirac operator $D\!\!\!\!/_E$ is an elliptic first order differential operator on M (self-adjoint if $\dim_{\mathbb{R}} M$ is odd; the "corner operator" that maps odd spinors to even spinors, and vice-versa, if $\dim_{\mathbb{R}} M$ is even). Hence it determines an element $[D\!\!\!\!/_E] = [E] \cap [D\!\!\!\!/\,] \in K^a_\sharp(M)$ with the degree preserved, and $\phi_*[D\!\!\!\!/_E] \in K^a_\sharp(X)$. This element depends only on the K-homology class $[M, E, \phi] \in K^t_\sharp(X)$, and so we get a well-defined map of $\mathbb{Z}_2$-graded abelian groups

$$\mu^a : K^t_\sharp(X) \longrightarrow K^a_\sharp(X)$$

given by

$$\mu^a[M, E, \phi] := \phi_*[D\!\!\!\!/_E]$$

on classes of K-cycles. This map is an isomorphism which is natural [6]. The index epimorphism (1.8) and the epimorphism (1.5) induced by the collapsing map together with the isomorphism $\mu^a$ generate a commutative diagram

$$\begin{array}{ccc}
K^t_\sharp(X) & & \\
{\scriptstyle\mu^a}\downarrow & \searrow{\scriptstyle\varepsilon_*} & \\
K^a_\sharp(X) & \xrightarrow{\ \mathrm{Index}\ } & \mathbb{Z} .
\end{array} \qquad (1.10)$$

As a consequence of the discussion in this and the previous subsection one has the following.

Theorem 1.2. Let X be a finite CW-complex. Then all three forms of K-homology are isomorphic, i.e. $K^t_\sharp(X) \cong K^a_\sharp(X) \cong K^s_\sharp(X)$ as graded groups.


1.9. Poincaré duality. Let X be an n-dimensional compact manifold with (possibly empty) boundary, and B(TX) → X and S(TX) → X the unit ball and sphere bundles of X. An element $\tau \in K_t^{e(n)}(B(TX), S(TX))$ is called a Thom class or an orientation for X if $\tau|_{(B(TX)_x, S(TX)_x)} \in K_t^{e(n)}(B(TX)_x, S(TX)_x) \cong K_t^0(\mathrm{pt})$ is a generator for all $x \in X$. The manifold X is said to be $K_t$-orientable if it has a Thom class. In that case the usual cup product on the topological K-theory ring yields the Thom isomorphism

$$T_X : K_t^i(X) \xrightarrow{\ \approx\ } K_t^{e(i+n)}\big(B(TX) ,\, S(TX)\big)$$

given for i = 0, 1 and $\xi \in K_t^i(X)$ by

$$T_X(\xi) := \pi^*_{B(TX)}(\xi) \cup \tau ,$$

where $\pi_{B(TX)} : B(TX) \to X$ is the bundle projection. This construction also works by replacing the tangent bundle of X with any O(r) vector bundle V → X, defining a Thom isomorphism

$$T_{X,V} : K_t^i(X) \xrightarrow{\ \approx\ } K_t^{e(i+r)}\big(B(V) ,\, S(V)\big)$$

given by

$$T_{X,V}(\xi) := \pi^*_{B(V)}(\xi) \cup \tau_V , \qquad (1.11)$$

where the element $\tau_V \in K_t^{e(r)}(B(V), S(V))$ is called the Thom class of V.

Any $K_t$-oriented manifold X of dimension n has a uniquely determined fundamental class $[X]_s \in K^s_{e(n)}(X, \partial X)$. One then has the Poincaré duality isomorphism

$$K_t^i(X) \xrightarrow{\ \approx\ } K^s_{e(i+n)}(X, \partial X)$$

given for i = 0, 1 and $\xi \in K_t^i(X)$ by taking the cap product

$$\xi \longmapsto \xi \cap [X]_s . \qquad (1.12)$$

In particular, if X is a compact spin$^c$ manifold of dimension n without boundary, then X is $K_t$-oriented and so in this case we also have a Poincaré isomorphism as above [38, 58] giving

$$K^t_0(X) \cong K_t^{e(n)}(X) , \qquad K^t_1(X) \cong K_t^{e(n+1)}(X) .$$
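For a concrete check of these Poincaré isomorphisms, the sketch below treats the spheres $S^n$, which are spin and hence spin$^c$. The K-theory groups of spheres used as input are the standard ones and are assumed rather than derived. The example is ours and is not part of the text.

```latex
% Assumed input (Bott periodicity):
%   K_t^0(S^n) = Z + Z and K_t^1(S^n) = 0 for n even,
%   K_t^0(S^n) = Z     and K_t^1(S^n) = Z for n odd.
\begin{align*}
  n = 2k: \quad
    & K^t_0(S^{2k}) \cong K_t^{e(2k)}(S^{2k}) = K_t^0(S^{2k}) \cong \mathbb{Z} \oplus \mathbb{Z}, \\
    & K^t_1(S^{2k}) \cong K_t^{e(2k+1)}(S^{2k}) = K_t^1(S^{2k}) = 0, \\
  n = 2k+1: \quad
    & K^t_0(S^{2k+1}) \cong K_t^{e(2k+1)}(S^{2k+1}) = K_t^1(S^{2k+1}) \cong \mathbb{Z}, \\
    & K^t_1(S^{2k+1}) \cong K_t^{e(2k+2)}(S^{2k+1}) = K_t^0(S^{2k+1}) \cong \mathbb{Z}.
\end{align*}
```

In odd dimensions the duality exchanges the two degrees, so the class of a point, which lies in $K^t_0$, is dual to a class in $K_t^1$.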

1.10. Universal coefficient theorem. Let X be a compact n-dimensional spin$^c$ manifold without boundary. In the framework of analytic K-homology, the six-term exact sequence in K-theory corresponding to an extension (1.9) reduces to the short exact sequence

$$0 \longrightarrow \underbrace{K_0(\mathcal{K})}_{\mathbb{Z}} \longrightarrow K_0(\mathcal{A}) \longrightarrow K_t^0(X) \longrightarrow 0$$

and therefore defines an element of $\mathrm{Ext}(K_t^0(X), \mathbb{Z})$ in homological algebra. Conversely, there is a universal coefficient theorem given by the short exact sequence [56, 11, 34] (here we are assuming that this extension is in the kernel of $\gamma_{00} : \mathrm{Ext}(X) \to \mathrm{Hom}(K_t^1(X), \mathbb{Z})$, see [28] for a definition of this homomorphism)

$$0 \longrightarrow \mathrm{Ext}\big(K_t^0(X) ,\, \mathbb{Z}\big) \longrightarrow \underbrace{\mathrm{Ext}\big(C(X) ,\, \mathcal{K}\big)}_{K^a_1(X)} \longrightarrow \mathrm{Hom}\big(K_t^1(X) ,\, \mathbb{Z}\big) \longrightarrow 0 .$$

This sequence splits, although not naturally. For definiteness, suppose that the degree 0 K-theory of X can be split as $K_t^0(X) = \Lambda_{K_t^0(X)} \oplus \mathrm{tor}_{K_t^0(X)}$, where the lattice $\Lambda_{K_t^0(X)} = K_t^0(X) / \mathrm{tor}_{K_t^0(X)}$ is the free part of the K-theory group and $\mathrm{tor}_{K_t^0(X)} = \bigoplus_{i=1}^{m} \mathbb{Z}_{n_i}$ is its torsion subgroup (such a split is neither unique nor natural). Since X is a finite CW-complex, the abelian group $K_t^0(X)$ is finitely generated and we have

$$\mathrm{Ext}\big(K_t^0(X) ,\, \mathbb{Z}\big) \cong \mathrm{Ext}\big(\mathrm{tor}_{K_t^0(X)} ,\, \mathbb{Z}\big) = \bigoplus_{i=1}^{m} \mathrm{Ext}\big(\mathbb{Z}_{n_i} ,\, \mathbb{Z}\big) \cong \bigoplus_{i=1}^{m} \mathbb{Z}_{n_i} = \mathrm{tor}_{K_t^0(X)} ,$$

from which it follows that

$$K^a_1(X) \cong \mathrm{Hom}\big(K_t^1(X) ,\, \mathbb{Z}\big) \oplus \mathrm{tor}_{K_t^0(X)} .$$

Although the torsion part of the dual homology group to the topological K-theory group $K_t^1(X)$ can differ from that of the analytic K-homology $K^a_1(X)$, Poincaré duality always asserts an isomorphism between the full groups $K^a_\sharp(X) \cong K^t_\sharp(X)$ and $K_t^\sharp(X)$. Note that if $\dim_{\mathbb{R}} X$ is even and $K_t^0(X)$ is a free abelian group, then $K_t^1(X) \cong K^t_1(X) \cong K^a_1(X) \cong \mathrm{Hom}(K_t^1(X), \mathbb{Z})$.

One can make a stronger statement (due to D. W. Anderson) which works for any CW-complex X with finitely generated K-theory. Since KU is a CW-spectrum and a ring-spectrum, there is a universal coefficient theorem expressed by the (split) exact sequence [1, 44, 66]

$$0 \longrightarrow \mathrm{Ext}\big(K^t_{e(i-1)}(X) ,\, \mathbb{Z}\big) \longrightarrow K_t^i(X) \longrightarrow \mathrm{Hom}\big(K^t_i(X) ,\, \mathbb{Z}\big) \longrightarrow 0$$

for i = 0, 1. The epimorphism is given by the index map. As above, let us consider the splits $K_t^i(X) = \Lambda_{K_t^i(X)} \oplus \mathrm{tor}_{K_t^i(X)}$ and $K^t_i(X) = \Lambda_{K^t_i(X)} \oplus \mathrm{tor}_{K^t_i(X)}$. One then easily concludes that

$$\mathrm{Ext}\big(K^t_{e(i-1)}(X) ,\, \mathbb{Z}\big) \cong \mathrm{Ext}\big(\mathrm{tor}_{K^t_{e(i-1)}(X)} ,\, \mathbb{Z}\big) \cong \mathrm{tor}_{K^t_{e(i-1)}(X)} , \qquad \mathrm{Hom}\big(K^t_i(X) ,\, \mathbb{Z}\big) \cong \mathrm{Hom}\big(\Lambda_{K^t_i(X)} ,\, \mathbb{Z}\big) \cong \Lambda_{K^t_i(X)} .$$

By the universal coefficient theorem it follows that $\mathrm{tor}_{K^t_{e(i-1)}(X)} \cong \mathrm{tor}_{K_t^i(X)}$ and $\Lambda_{K_t^i(X)} \cong \Lambda_{K^t_i(X)}$. We can also conclude the following from these general calculations.

Proposition 1.1. Let X be a finite CW-complex. Then its K-homology group is finitely generated.

Proof. Since X is a finite CW-complex, its K-theory groups are finitely generated. The isomorphisms above then imply the conclusion.
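The split universal coefficient sequence of Sect. 1.10 can be checked against Poincaré duality in a small torsion example. The sketch below uses $X = \mathbb{RP}^3$, a compact spin (hence spin$^c$) 3-manifold, with the standard K-theory groups $K_t^0(\mathbb{RP}^3) \cong \mathbb{Z} \oplus \mathbb{Z}_2$ and $K_t^1(\mathbb{RP}^3) \cong \mathbb{Z}$ taken as assumed input. The example is ours and is meant only as a consistency check.

```latex
% Assumed input: K_t^0(RP^3) = Z + Z_2,  K_t^1(RP^3) = Z.
% Step 1: Poincare duality in odd dimension n = 3 exchanges the degrees.
\begin{align*}
  K^t_0(\mathbb{RP}^3) &\cong K_t^{e(3)}(\mathbb{RP}^3) = K_t^1(\mathbb{RP}^3) \cong \mathbb{Z}, \\
  K^t_1(\mathbb{RP}^3) &\cong K_t^{e(4)}(\mathbb{RP}^3) = K_t^0(\mathbb{RP}^3)
    \cong \mathbb{Z} \oplus \mathbb{Z}_2 .
\end{align*}
% Step 2: feed these into the split universal coefficient sequence.
\begin{align*}
  K_t^0 &\cong \mathrm{Ext}\bigl(K^t_1, \mathbb{Z}\bigr) \oplus \mathrm{Hom}\bigl(K^t_0, \mathbb{Z}\bigr)
        = \mathrm{Ext}\bigl(\mathbb{Z} \oplus \mathbb{Z}_2, \mathbb{Z}\bigr) \oplus \mathrm{Hom}\bigl(\mathbb{Z}, \mathbb{Z}\bigr)
        \cong \mathbb{Z}_2 \oplus \mathbb{Z}, \\
  K_t^1 &\cong \mathrm{Ext}\bigl(K^t_0, \mathbb{Z}\bigr) \oplus \mathrm{Hom}\bigl(K^t_1, \mathbb{Z}\bigr)
        = \mathrm{Ext}\bigl(\mathbb{Z}, \mathbb{Z}\bigr) \oplus \mathrm{Hom}\bigl(\mathbb{Z} \oplus \mathbb{Z}_2, \mathbb{Z}\bigr)
        \cong \mathbb{Z} .
\end{align*}
```

Both lines reproduce the assumed K-theory groups. The torsion of $K_t^0$ comes from the torsion of $K^t_1$, in agreement with the statement that torsion shifts degree by one under the universal coefficient theorem.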


1.11. Chern character . There is a natural transformation ch• : K t (X ) → H (X ; Q) of Z2 -graded homology theories called the (homology) Chern character which is defined in the following way. Recall the Z2 -grading on singular homology given by H (X ; Q) = Heven (X ; Q)⊕Hodd (X ; Q) with Heven (X ; Q) = e(k)=0 Hk (X ; Q) and Hodd (X ; Q) = e(k)=1 Hk (X ; Q). Given a K-cycle (M, E, φ) on X , let φ∗ : H (M; Q) → H (X ; Q) be the homomorphism induced on rational homology by φ. Then ch• (E)∪td(T M)∩[M] is the Poincaré dual on M of the even degree cohomology class ch• (E) ∪ td(T M), where ch• : Kt0 (−) → Heven (−; Q) is the (cohomology) Chern character in K-theory, td denotes the Todd class of a spinc vector bundle and [M] is the orientation cycle of M in H (M; Q) induced by the spinc structure on T M. Then (1.13) ch• (M, E, φ) := φ∗ ch• (E) ∪ td(T M) ∩ [M] is an element of H (X ; Q) which depends only on the K-homology class [M, E, φ] ∈ K t (X ). This map preserves the Z2 -grading. The Chern characters ch• and ch• preserve the cap product, i.e. for every topological space X there is a Z2 -degree preserving commutative diagram

Kt (X ) ⊗ K t (X )

∩

ch• ⊗ch•

H (X ; Q) ⊗ H (X ; Q)

∩

/ K t (X )

ch•

/ H (X ; Q) .

Since X is a finite CW-complex, $K^t_\sharp(X)$ is a finitely generated abelian group and $\mathrm{ch}_\bullet$ induces an isomorphism $K^t_\sharp(X) \otimes_{\mathbb{Z}} \mathbb{Q} \cong H_\sharp(X;\mathbb{Q})$ of $\mathbb{Z}_2$-graded vector spaces over $\mathbb{Q}$. The Chern character can be used to give an explicit formula for the epimorphism (1.5) in terms of characteristic classes as $\varepsilon_*[M,E,\phi] = (\mathrm{ch}^\bullet(E) \cup \mathrm{td}(TM))[M]$. Then the commutative diagram (1.10) can be recast as the equality

$$\mathrm{Index}\,\phi_*[\slashed{D}_E] = \mathrm{Index}\,\mu^a[M,E,\phi] = \varepsilon_*[M,E,\phi] = \big(\mathrm{ch}^\bullet(E) \cup \mathrm{td}(TM)\big)[M] \, . \qquad (1.14)$$
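As a quick sanity check of the characteristic class formula in (1.14), the right-hand side can be evaluated by hand in the simplest non-trivial case. The sketch below is an illustration that is not taken from the text: it assumes $M = \mathbb{CP}^1$ with the spin$^c$ structure induced by its complex structure, $E = \mathcal{O}(k)$, and the collapsing map to a point.

```latex
% Assumptions (not from the text): X = pt, M = \mathbb{CP}^1 with its complex spin^c structure,
% E = \mathcal{O}(k), and x \in H^2(\mathbb{CP}^1;\mathbb{Z}) the hyperplane class with x[\mathbb{CP}^1] = 1.
\begin{align*}
\mathrm{ch}^\bullet(E) &= 1 + k\,x, \qquad
\mathrm{td}(T\mathbb{CP}^1) = 1 + \tfrac12\, c_1(T\mathbb{CP}^1) = 1 + x,\\
\mathrm{ch}^\bullet(E)\cup\mathrm{td}(T\mathbb{CP}^1) &= 1 + (k+1)\,x \qquad (\text{since } x^2 = 0),\\
\varepsilon_*[\mathbb{CP}^1, E, \varepsilon] &= \big(\mathrm{ch}^\bullet(E)\cup\mathrm{td}(T\mathbb{CP}^1)\big)[\mathbb{CP}^1] = k+1 .
\end{align*}
```

For $k \geq 0$ this agrees with $\dim_{\mathbb{C}} H^0(\mathbb{CP}^1;\mathcal{O}(k)) = k+1$, as the Riemann-Roch theorem requires.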

In the special case where X is a point this becomes $\mathrm{Index}\,\slashed{D}_E = (\mathrm{ch}^\bullet(E) \cup \mathrm{td}(TM))[M]$, which is a particular instance of the Atiyah-Singer index theorem.

2. K-Cycles and D-Brane Constructions

We will now bring string theory into the story. Many of our subsequent results find their most natural interpretations in terms of D-branes, with the mathematical formalism of K-homology leading to new insights into the properties of D-branes wrapping cycles in non-trivial spacetimes. We begin with some heuristic physical discussion aimed at motivating the interpretation of D-branes as K-cycles in topological K-homology. Then we move on to more mathematical computations explaining the interplay between K-homology and the properties of D-branes. The analysis will center around finding explicit K-cycle representatives for the generators of the K-homology groups, which will be interpreted as D-branes in the pertinent spacetime. For physical definitions and descriptions of D-branes in string theory, see [39, 55].


R.M.G. Reis, R.J. Szabo

2.1. D-branes. Consider Type II superstring theory on a spacetime X with all background supergravity form fields turned off. X is an oriented ten-dimensional spin manifold. A D-brane in X is an oriented spin$^c$ submanifold $M \subset X$ together with a complex vector bundle $E \to M$ called the Chan-Paton bundle. M itself is referred to as the worldvolume of the D-brane and when $\dim_{\mathbb{R}} M = p + 1$ we will sometimes refer to the brane as a Dp-brane to emphasize its dimensionality. The presence of non-vanishing background form fields would mean that the classification of D-branes requires algebraic and/or twisted (co)homology tools. Their absence means that we can resort to topological methods, which thereby classify flat D-branes. The terminology 'flat' used here is not meant to imply that we consider a flat spacetime geometry. While the set-up works whenever the tangent bundle $TX$ over spacetime is stably trivial, in more general cases it would not describe a true background of string theory. Nonetheless, the topological description will provide geometric insight into the nature of D-branes in curved backgrounds, even in this simplified setting. Thus a very crude definition of a D-brane is as a K-cycle $(M,E,\phi)$ on the spacetime X, with $\phi = \iota : M \to X$ the natural inclusion (the remaining elements of $\Gamma(X)$ then ensure that the quotient $\Gamma(X)/\!\sim$ really is $K^t_\sharp(X)$). Sometimes we regard D-branes as sitting in an ambient space which is a proper subspace of spacetime, for instance when X is a product $X = Q \times Y$ we may be interested in worldvolumes $M \subset Q$. When there is no danger of confusion we will also use the symbol X for this ambient space. In either case, X is also customarily called the target space. D-branes are generally more complicated objects than just submanifolds carrying vector bundles, because in string theory they are realized as (Dirichlet) boundary conditions for a two-dimensional superconformal quantum field theory with target space X.
Any classification based solely on K-theory is expected to capture only those properties that depend on D-brane charge. Nevertheless, the primitive definition of a D-brane as a K-cycle in topological K-homology is very natural and carries much more information than its realization in the dual K-theory framework. We shall see that the geometric description of K-homology is surprisingly rich and provides a simple context in which non-trivial D-brane effects are exhibited in a clear geometrical fashion.

Example 2.1 (B-Branes). Let X be a (possibly singular) complex n-dimensional projective algebraic variety, and let $\mathcal{O}_X$ be the structure sheaf of regular functions on X. Recall that a coherent sheaf on X is a sheaf of $\mathcal{O}_X$-modules which is locally the cokernel of a morphism of holomorphic vector bundles over X. The coherent algebraic sheaves on X form an abelian category denoted coh(X). The bounded derived category of coherent sheaves on X, denoted $D(X) := D(\mathrm{coh}(X))$, is the triangulated category of topological B-model D-branes (or B-branes for short) in X [26, 61]. An object of this category is a bounded differential complex of coherent sheaves. It contains coh(X) as a full subcategory by identifying a coherent sheaf $\mathcal{F}$ with the trivial complex

$$\mathcal{F}^\bullet = \big(\, 0 \xrightarrow{\ 0\ } \cdots \xrightarrow{\ 0\ } 0 \xrightarrow{\ 0\ } \mathcal{F} \xrightarrow{\ 0\ } 0 \xrightarrow{\ 0\ } \cdots \xrightarrow{\ 0\ } 0 \,\big)$$

having $\mathcal{F}$ in degree 0. For details of these and other constructions, see [3]. The relevant K-homology group, denoted $K^\omega_0(X)$, is the Grothendieck group of coherent algebraic sheaves on X obtained by applying the usual Grothendieck completion functor to the abelian category coh(X). There is a natural transformation $\alpha^\omega : D(X) \to K^\omega_0(X)$ which may be described as follows. Let $\mathcal{F}^\bullet \in D(X)$ be a complex. Using a locally-free resolution we may replace $\mathcal{F}^\bullet$ by a quasi-isomorphic complex of locally-free sheaves, each of which has associated to it a holomorphic vector bundle. The virtual Euler class of the complex obtained by replacing locally-free sheaves with their corresponding vector bundles is then the K-homology class $\alpha^\omega(\mathcal{F}^\bullet)$ that we are looking for.

On the other hand, the underlying topological space of X is a finite CW-complex and has topological K-homology group $K^t_\sharp(X)$. We will now construct a natural map $\mu^\omega : K^\omega_0(X) \to K^t_0(X)$, which by composition gives a natural map from the derived category $\mu^\omega \circ \alpha^\omega : D(X) \to K^t_0(X)$ and gives an intrinsic description of B-branes in terms of K-cycles. Consider triples $(M,E,\phi)$ where (i) M is a non-singular complex projective algebraic variety; (ii) E is a complex algebraic vector bundle over M; and (iii) $\phi : M \to X$ is a morphism of algebraic varieties. Two triples $(M_1,E_1,\phi_1)$ and $(M_2,E_2,\phi_2)$ are said to be isomorphic if there exists an isomorphism $h : M_1 \to M_2$ of complex projective algebraic varieties such that $h^*(E_2) \cong E_1$ as complex algebraic vector bundles over $M_1$ and $\phi_1 = \phi_2 \circ h$. The set of isomorphism classes of triples is denoted $\Gamma^\omega(X)$.

Given such a triple $(M,E,\phi)$, the morphism $\phi : M \to X$ induces the direct image functor $\mathrm{coh}(\phi) : \mathrm{coh}(M) \to \mathrm{coh}(X)$, defined by $\mathrm{coh}(\phi)(\mathcal{F}) = \phi_*(\mathcal{F})$ for $\mathcal{F} \in \mathrm{coh}(M)$, which is left exact and induces the i-th right derived functor $R^i\mathrm{coh}(\phi) : D(M) \to D(X)$ for $i = 0, 1, \dots, n$ as follows. We include the category of coherent sheaves into the category of quasi-coherent sheaves, replace a complex by a quasi-isomorphic complex of injectives, and apply the functor $\mathrm{coh}(\phi)$ componentwise to the complex of injectives (if X is singular this requires a resolution of its singularities). Then the induced map $\phi_* : K^\omega_0(M) \to K^\omega_0(X)$ is given for coherent algebraic sheaves $\mathcal{F}$ on M by

$$\phi_*[\mathcal{F}] := \sum_{i=0}^{n} (-1)^i \, \alpha^\omega \circ R^i\mathrm{coh}(\phi)[\mathcal{F}^\bullet] \, . \qquad (2.1)$$

In particular, if $\underline{E}$ denotes the sheaf of germs of algebraic sections of $E \to M$, then $\phi_*[\underline{E}] \in K^\omega_0(X)$. By using a resolution of the singularities of X if necessary and the fact that any coherent sheaf on a non-singular variety admits a resolution by locally free sheaves, one can show that the $\phi_*[\underline{E}]$ obtained from triples in $\Gamma^\omega(X)$ generate the abelian group $K^\omega_0(X)$ [6]. By forgetting some structure a triple $(M,E,\phi)$ becomes a K-cycle on X and hence determines an element $[M,E,\phi] \in K^t_0(X)$. Thus we get a well-defined map $\mu^\omega : K^\omega_0(X) \to K^t_0(X)$ given on generators by $\mu^\omega\big(\phi_*[\underline{E}]\big) := [M,E,\phi]$. This map is a natural transformation of the covariant functors $K^\omega_0$ and $K^t_0$, thus providing an extension of the Grothendieck-Riemann-Roch theorem. However, in contrast to the transformations $\mu^s$ and $\mu^a$ of the previous section, $\mu^\omega$ is not an isomorphism [6]. This suggests that the topological K-homology group $K^t_\sharp(X)$ carries more information about the category of B-branes than the Grothendieck group $K^\omega_\sharp(X)$.


One of the virtues of this mapping of elements in the derived category $D(X)$ to classes of K-cycles in $K^t_0(X)$ is that it allows one to compute B-brane charges even when the variety X is singular. The collapsing map $\varepsilon : X \to \mathrm{pt}$ induces, as before, an epimorphism $\chi : K^\omega_0(X) \to K^\omega_0(\mathrm{pt})$. Since a coherent algebraic sheaf over a point is just a finite dimensional complex vector space, which can be characterized by its dimension, one has $K^\omega_0(\mathrm{pt}) \cong \mathbb{Z}$. The charge of a B-brane represented by a coherent algebraic sheaf $\mathcal{F}$ on X is the image of $[\mathcal{F}] \in K^\omega_0(X)$ in $\mathbb{Z}$ under the epimorphism $\chi$, which using (2.1) is given explicitly by the Euler number

$$\chi(X;\mathcal{F}) := \chi[\mathcal{F}] = \sum_{i=0}^{n} (-1)^i \dim_{\mathbb{C}} H^i(X;\mathcal{F}) \, ,$$

where $H^i(X;\mathcal{F})$ is the i-th cohomology group of X with coefficients in $\mathcal{F}$. Together with the epimorphism (1.5) and the transformation $\mu^\omega$, there is a commutative diagram similar to (1.10) given by

$$\begin{array}{ccc}
K^t_0(X) & & \\
{\scriptstyle \mu^\omega}\big\uparrow & \searrow {\scriptstyle \varepsilon_*} & \\
K^\omega_0(X) & \xrightarrow{\ \ \chi\ \ } & \mathbb{Z} \, .
\end{array}$$

For a B-brane represented by a K-cycle $(M,E,\phi)$, the characteristic class formula for $\varepsilon_*$ in (1.14) then gives the charge $\chi(X;\phi_*\underline{E}) = (\mathrm{ch}^\bullet(E) \cup \mathrm{td}(TM))[M]$. If X is non-singular and E is a complex vector bundle over X, then this charge formula applied to the K-cycle $(X,E,\mathrm{id}_X)$ is just the Hirzebruch-Riemann-Roch theorem which computes the analytic index of the twisted Dolbeault operator $\bar\partial_E$ on X. See [6] for more details. ♦

2.2. Qualitative description of K-cycle classes. We will now explain how the equivalence relations of topological K-homology, as spelled out in Definition 1.5, translate into physical statements about D-branes. Let us begin with bordism. Suppose that X is a locally compact topological space and let $\mathrm{pt} \notin X$ be a distinguished point called "infinity". Let $X^\infty := X \sqcup \mathrm{pt}$ be the one-point compactification of X. We are interested in configurations of D-branes in X which have finite energy. This means that they should be regarded as equivalent to the closed string vacuum asymptotically in $X^\infty$. The charge of a D-brane $(M,E,\phi)$ is given through the index formula (1.14) (in this paper we do not deal with the square root of the $\hat{A}$-genus which usually defines D-brane charges [49]). The condition that the D-brane should have vanishing charge at infinity is tantamount to requiring that its K-homology class have a trivialization at infinity, i.e. that there is a compact subset U of X such that $E|_{X-U}$ is a trivial bundle and $\phi|_{X-U}$ is homotopically trivial, so that $\mathrm{Index}\,\phi_*[\slashed{D}_E] = 0$ over $X - U$. This is the physical meaning of the definition of the reduced K-homology group (1.6), which measures charges of D-branes relative to that of the vacuum.

The K-homology of X in this case should then be defined by the relative K-homology group of Sect. 1.7 as

$$K^t_\sharp(X^\infty, \mathrm{pt}) \cong K^t_\sharp(X) \, ,$$


where we have used excision. Since the six-term exact sequence for this relative K-homology has connecting homomorphism $\partial : K^t_i(X^\infty, \mathrm{pt}) \to K^t_{e(i+1)}(\mathrm{pt})$ acting by $\partial[W,F,\psi] = [\partial W, F|_{\partial W}, \psi|_{\partial W}]$, boundaries $\partial W$ of spin$^c$ manifolds W map into pt and so any D-brane $(M,E,\phi)$ which is bordant to the trivial K-cycle $(\emptyset,\emptyset,\emptyset)$ on X should carry the same charge as the vacuum. This is guaranteed by the bordism relation. The argument extends to a "compactification" of spacetime in which $X = Q \times Y$, with Q locally compact and Y a compact topological space without boundary. D-branes will now either be located at particular points in Y or will wrap submanifolds of Y. Requiring that such configurations be equivalent to the vacuum asymptotically in $Q^\infty$ thus requires adding a copy of Y at infinity in $Q^\infty$, and so we look for K-homology classes with a trivialization on Y at infinity. Using relative bordism and excision, the classes of D-brane K-cycles are now seen to live in the group $K^t_\sharp(X,Y) \cong K^t_\sharp(X/Y, \mathrm{pt}) \cong K^t_\sharp(Q^\infty, \mathrm{pt})$.

Let us now turn to direct sum. It represents "gauge symmetry enhancement for coincident branes" which occurs when several D-branes wrap the same submanifold $M \subset X$ [63]. In this case, the Chan-Paton bundles $E_i \to M$ of the constituent D-branes are augmented to the higher-rank Chan-Paton bundle $E = \bigoplus_i E_i$ of the combined brane configuration. Such a combination is called a bound state of D-branes. The branes are bound together by open string excitations corresponding to classes of bundle morphisms $[f_{ij}] \in \mathrm{Hom}(E_i,E_j)$. Other open string degrees of freedom are described by the higher cohomology groups $\mathrm{Ext}^p(E_i,E_j)$, $p \geq 1$.

Finally, let us look at vector bundle modification. Consider the K-cycle $(\mathrm{pt}, 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota)$ on X where $1\!\!1^{\mathbb{C}}_{\mathrm{pt}} = \mathrm{pt} \times \mathbb{C}$ and $\iota : \mathrm{pt} \to X$ is the inclusion of a point. Let $F = \mathrm{pt} \times \mathbb{R}^{2n}$ with $n \geq 1$. Then $\hat{M} = \mathrm{pt} \times S^{2n} \cong S^{2n}$ is a 2n-dimensional sphere, and $\pi = \varepsilon : S^{2n} \to \mathrm{pt}$ is the collapsing map. From the clutching construction of Sect. 1.2, $H(F) = H(F)|_{S^{2n}} = H(F)|_{\pi^{-1}(\mathrm{pt})}$ is the Bott generator for the K-theory of $S^{2n}$. Using vector bundle modification one then has $[\mathrm{pt}, 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota] = [S^{2n}, H(F) \otimes 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota \circ \pi] = [S^{2n}, H(F), \varepsilon]$, where $\iota \circ \pi = \varepsilon$ is a collapsing map. This equality represents the "blowing up" of a D(−1)-brane (also known as a D-instanton) into a collection of spherical D(2n − 1)-branes and is a simple example of what is known in string theory as the "dielectric effect" [51]. This is described in more generality in Sect. 2.5 below.

The equality also illustrates the crucial point that topological K-homology naturally encodes the fact that D-branes are typically not static objects but will "decay" into stable configurations of branes. Suppose that the spacetime has a split $X = \mathbb{R} \times X'$, where $\mathbb{R}$ is thought of as parametrizing "time" and $X'$ is "space". Consider a D-brane in X whose worldvolume is initially of the form $M = \mathbb{R} \times M'$ with $M' \subset X'$. In the distant future (at large time), this brane will "decay" into stable D-branes of lower dimension which wrap non-trivial homology cycles of $X'$. In the example above, if we regard the 2n-sphere $S^{2n}$ as the one-point compactification of the locally compact space $\mathbb{R}^{2n}$, then the spherical D(2n − 1)-brane can be thought of as decaying into a bound state of D(−1)-branes. There are no conserved charges for higher branes because $\mathbb{R}^{2n}$ is contractible. The precise formulation of this notion of "stability" in K-homology is one of the goals of the present paper.

Type II superstring theory on the spacetime manifold X itself splits into two string theories, called Type IIA and Type IIB. The former contains stable supersymmetric Dp-branes for all even $0 \leq p \leq 8$ while the latter contains stable supersymmetric Dp-branes for all odd $-1 \leq p \leq 9$. Thus Type IIA D-branes are naturally classified by the group $K^t_1(X)$ while $K^t_0(X)$ classifies Type IIB branes.
Since dimR X is even in the present case, by Poincaré duality this is in complete agreement with the corresponding K-theory classification [53, 64]. We shall now proceed to examine properties of these D-branes


by giving more rigorous mathematical calculations in K-homology, beginning with an accurate statement of what is meant by "stable" in the above discussion.

2.3. Stability. In the absence of background form fields, stable supersymmetric D-branes are known to wrap the non-trivial spin$^c$ homology cycles of the spacetime manifold X in which they live [30]. In K-homology, this is asserted by the following fundamental result that will play an important role throughout the rest of this paper.

Definition 2.1. Let X be a space and $Y \subset X$. We say that a D-brane $[M,E,\phi] \in K^t_\sharp(X)$ wraps Y if $\phi(M) \subset Y$.

The idea behind this definition is that the condition above determines a continuous map $\psi : M \to Y$ such that $i \circ \psi = \phi$, where $i : Y \to X$ is the canonical inclusion and $i_*([M,E,\psi]) = [M,E,\phi]$, and so we can interpret the initial D-brane as one over Y. On the other hand, the notion of stability is, mathematically, a heuristic one, since we interpret that a D-brane $[M,E,\phi] \in K^t_\sharp(X)$ is unstable if it is decomposable (non-trivially) in the form

$$[M,E,\phi] = \sum_{p=0}^{n} \sum_{i=1}^{m_p} d_{pi}(M,E)\, \big[M_i^p,\, 1\!\!1^{\mathbb{C}}_{M_i^p},\, \phi_i^p\big] \, ,$$

where the branes $[M_i^p, 1\!\!1^{\mathbb{C}}_{M_i^p}, \phi_i^p]$ are linearly independent generators of $K^t_\sharp(X)$. Namely, the stable objects are the possible linearly independent generators.

Theorem 2.1. Let X be a compact connected finite CW-complex of dimension n whose rational homology can be presented as

$$H_\sharp(X;\mathbb{Q}) = \bigoplus_{p=0}^{n} \bigoplus_{i=1}^{m_p} [M_i^p]\, \mathbb{Q} \, ,$$

where $M_i^p$ is a p-dimensional compact connected spin$^c$ submanifold of X without boundary and with orientation cycle $[M_i^p]$ given by the spin$^c$ structure. Suppose that the canonical inclusion map $\iota_i^p : M_i^p \to X$ induces, for each i, p, a homomorphism $(\iota_i^p)_* : H_p(M_i^p;\mathbb{Q}) \to H_p(X;\mathbb{Q}) \cong \mathbb{Q}^{m_p}$ with the property

$$(\iota_i^p)_* [M_i^p] = \kappa_{ip}\, [M_i^p] \qquad (2.2)$$

for some $\kappa_{ip} \in \mathbb{Q} \setminus \{0\}$. Then the lattice $\Lambda_{K^t_\sharp(X)} := K^t_\sharp(X) / \mathrm{tor}_{K^t_\sharp(X)}$ is generated by the classes of K-cycles

$$\big[M_i^p,\, 1\!\!1^{\mathbb{C}}_{M_i^p},\, \iota_i^p\big] \, , \qquad 0 \leq p \leq n \, , \quad 1 \leq i \leq m_p \, .$$
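A minimal illustration of the hypotheses of Theorem 2.1, which is not taken from the text, is complex projective space with its standard cell structure. The sketch assumes the linearly embedded subspaces $\mathbb{CP}^q \subset \mathbb{CP}^N$ as the submanifolds $M_i^p$ and uses the standard values of the K-homology of $\mathbb{CP}^N$.

```latex
% Assumed example: X = \mathbb{CP}^N has real dimension n = 2N, with m_p = 1 for p = 2q even
% and m_p = 0 for p odd.
\begin{align*}
H_\sharp(\mathbb{CP}^N;\mathbb{Q}) &= \bigoplus_{q=0}^{N} [\mathbb{CP}^q]\,\mathbb{Q},
\qquad M^{2q}_1 = \mathbb{CP}^q \subset \mathbb{CP}^N,\\
(\iota^{2q}_1)_*[\mathbb{CP}^q] &= [\mathbb{CP}^q]
\quad\Longrightarrow\quad \kappa_{1,2q} = 1 \neq 0 .
\end{align*}
% Each \mathbb{CP}^q is a complex manifold and hence spin^c, so condition (2.2) holds and the
% theorem yields the lattice generators
\[
\big[\mathbb{CP}^q,\; 1\!\!1^{\mathbb{C}}_{\mathbb{CP}^q},\; \iota^{2q}_1\big], \qquad q = 0,1,\dots,N,
\]
% consistent with K^t_0(\mathbb{CP}^N) \cong \mathbb{Z}^{N+1} and K^t_1(\mathbb{CP}^N) = 0.
```

In this example the K-homology is torsion-free, so the lattice coincides with the full group $K^t_\sharp(\mathbb{CP}^N)$.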

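Proof. Fixing $0 \leq p \leq n$, $1 \leq i \leq m_p$, let $\{x^{ip}_{ab}\}_{b \in [n^{ip}_a]}$ be cohomology classes in degree a generating the rational cohomology group $H^a(M_i^p;\mathbb{Q}) \cong \mathbb{Q}^{n^{ip}_a}$ for each $a = 0, 1, \dots, p$. Since $M_i^p$ is oriented and connected, one has $H^p(M_i^p;\mathbb{Q}) \cong \mathbb{Q} \cong H^0(M_i^p;\mathbb{Q})$ and hence $n^{ip}_p = 1 = n^{ip}_0$. Without loss of generality we may assume that $x^{ip}_{01} = 1$, and we set $x^{ip}_p := x^{ip}_{p1}$. The rational cohomology ring of the submanifold $M_i^p \subset X$ can thus be presented as

$$H^\sharp(M_i^p;\mathbb{Q}) = 1\,\mathbb{Q} \;\oplus\; \bigoplus_{a=1}^{p-1} \bigoplus_{b=1}^{n^{ip}_a} x^{ip}_{ab}\, \mathbb{Q} \;\oplus\; x^{ip}_p\, \mathbb{Q} \, .$$

In particular, the Todd class $\mathrm{td}(TM_i^p) \in H^{\rm even}(M_i^p;\mathbb{Q})$ may be expressed in the form

$$\mathrm{td}(TM_i^p) = 1 + \sum_{a=1}^{p-1} \sum_{b=1}^{n^{ip}_a} d^{ip}_{ab}\, x^{ip}_{ab} + \delta_{e(p),0}\, x^{ip}_p$$

for some $d^{ip}_{ab} \in \mathbb{Q}$ with $d^{ip}_{ab} = 0$ whenever a is odd. Let us now use the Chern character (1.13) to compute

$$\begin{aligned}
\mathrm{ch}_\bullet\big(M_i^p, 1\!\!1^{\mathbb{C}}_{M_i^p}, \iota_i^p\big)
&= (\iota_i^p)_* \big( \mathrm{td}(TM_i^p) \cap [M_i^p] \big) \\
&= (\iota_i^p)_* \Big( [M_i^p] + \sum_{a=1}^{p-1} \sum_{b=1}^{n^{ip}_a} d^{ip}_{ab}\, x^{ip}_{ab} \cap [M_i^p] + r_{ip}\, [\mathrm{pt}] \Big) \\
&= \kappa_{ip}\, [M_i^p] + \sum_{a=1}^{p-1} \sum_{b=1}^{n^{ip}_a} d^{ip}_{ab}\, (\iota_i^p)_* \big( x^{ip}_{ab} \cap [M_i^p] \big) + r_{ip}\, (\iota_i^p)_* [\mathrm{pt}]
\end{aligned} \qquad (2.3)$$

for some $r_{ip} \in \mathbb{Q}$ with $r_{ip} = 0$ for p odd. We have used $\mathrm{ch}^\bullet(1\!\!1^{\mathbb{C}}_{M_i^p}) = 1$ and (2.2). For each $1 \leq a < p$, $1 \leq b \leq n^{ip}_a$ one has $(\iota_i^p)_*(x^{ip}_{ab} \cap [M_i^p]) \in H_{p-a}(X;\mathbb{Q})$. The ordered collection

$$c = \big( [\mathrm{pt}],\ [M_1^1], \dots, [M_{m_1}^1],\ \dots,\ [M_1^n], \dots, [M_{m_n}^n] \big)$$

of homology cycles is a basis of $H_\sharp(X;\mathbb{Q})$ as a rational vector space. On the other hand, if we set

$$h = \big( \mathrm{ch}_\bullet(\mathrm{pt}, 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota_1^0),\ \mathrm{ch}_\bullet(M_1^1, 1\!\!1^{\mathbb{C}}_{M_1^1}, \iota_1^1), \dots, \mathrm{ch}_\bullet(M_{m_1}^1, 1\!\!1^{\mathbb{C}}_{M_{m_1}^1}, \iota_{m_1}^1),\ \dots,\ \mathrm{ch}_\bullet(M_1^n, 1\!\!1^{\mathbb{C}}_{M_1^n}, \iota_1^n), \dots, \mathrm{ch}_\bullet(M_{m_n}^n, 1\!\!1^{\mathbb{C}}_{M_{m_n}^n}, \iota_{m_n}^n) \big)$$

then from (2.3) it follows that $h = \Xi\, c$, where $\Xi$ is an upper triangular matrix whose diagonal elements are the non-zero rational numbers $\kappa_{ip}$. Thus $\det \Xi \neq 0$ and so the collection h is also a basis of $H_\sharp(X;\mathbb{Q})$ as a rational vector space. Since X is a finite CW-complex, $\mathrm{ch}_\bullet \otimes \mathrm{id}_{\mathbb{Q}} : K^t_\sharp(X) \otimes_{\mathbb{Z}} \mathbb{Q} \to H_\sharp(X;\mathbb{Q})$ is an isomorphism and hence $(\mathrm{ch}_\bullet \otimes \mathrm{id}_{\mathbb{Q}})^{-1} h$ is a set of generators for $K^t_\sharp(X) \otimes_{\mathbb{Z}} \mathbb{Q}$.

Remark 2.1. Some comments are in order concerning the statement of the previous theorem. The inclusions considered are topological embeddings (which, between manifolds, may turn out to be smooth maps), since X is not assumed to be a manifold. Since embedding manifolds into other manifolds can be very troublesome, our assumptions are of a purely combinatorial nature, stated by the shape of the rational homology of X and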


the behavior of the maps induced in homology by the inclusion maps. One instance in which this is a straightforward result is when X has a CW-structure given by manifolds satisfying the stated properties whose skeletons have big enough gaps, for instance if $X^{2k} = X^{2k+1}$ for all k, as is the case for projective spaces other than real projective spaces. ♦

Remark 2.2. We do not know if this theorem can be proven by replacing assumption (2.2) with a weaker condition. The crucial issue is whether or not there is a non-trivial linear relation $\sum_{p,i} n_{pi}\, [M_i^p, 1\!\!1^{\mathbb{C}}_{M_i^p}, \iota_i^p] = 0$ over $\mathbb{Z}$ among the "lifts" $[M_i^p] \mapsto [M_i^p, 1\!\!1^{\mathbb{C}}_{M_i^p}, \iota_i^p]$ of the non-trivial homology cycles of X to K-homology. If such a relation exists, then the lifts of some non-trivial singular homology classes in $K^t_\sharp(X)$ are 0. This means that some D-brane state is unstable, even though it wraps a non-trivial spin$^c$ homology cycle. It either decays into the closed string vacuum state, or is not completely unstable but decays into other D-branes according to the solutions of the linear equation $\sum_{p,i} n_{pi}\, [M_i^p, 1\!\!1^{\mathbb{C}}_{M_i^p}, \iota_i^p] = 0$ over $\mathbb{Z}$. The same argument as that used in Sect. 2.4 below shows that such a decay is always into branes wrapped on manifolds of lower dimension than that of the original D-brane. The condition (2.2) guarantees that this does not occur. We shall analyse this feature from a different perspective in Sect. 4.1. This analysis illustrates the fact that D-branes need not generally simply correspond to subspaces of spacetime. ♦

Another perspective on the problem of finding the K-homology generators is given by Spin$^c$ bordism. Let $\mathrm{MSpin}^c(X)$ be the Spin$^c$ bordism group of a finite CW-complex X. This group is the set of equivalence classes, through bordism, of pairs $(M,f)$, where M is a compact Spin$^c$ manifold without boundary and $f : M \to X$ is a continuous map. Disjoint union induces an abelian group structure on $\mathrm{MSpin}^c(X)$. This group is $\mathbb{Z}$-graded by the dimension of M. This construction gives a generalized homology theory $\mathrm{MSpin}^c(\,\cdot\,)$ on the category of finite CW-complexes, which can also be defined in terms of a spectrum $\mathrm{MSpin}^c$. For more details see [21, 20, 58]. Conner and Floyd ([20]) constructed an orientation map $\mathrm{MU} \to \mathrm{KU}$ between the complex bordism and the K-theory spectra which was later generalized to a map $\mathrm{MSpin}^c \to \mathrm{KU}$ by Atiyah-Bott-Shapiro ([18]). Hence, if X is a finite CW-complex, this translates into a homomorphism $\mathrm{MSpin}^c(X) \to K^t_\sharp(X)$, given by $[M,f] \mapsto [M, 1\!\!1^{\mathbb{C}}_M, f]$, which is a natural transformation of homology theories inducing an $\mathrm{MSpin}^c(\mathrm{pt})$-module structure on $K^t_\sharp(\mathrm{pt})$. The relevant result for our purposes is the following theorem by Hopkins and Hovey ([35]).

Theorem 2.2. The map $\mathrm{MSpin}^c(X) \otimes_{\mathrm{MSpin}^c(\mathrm{pt})} K^t_\sharp(\mathrm{pt}) \to K^t_\sharp(X)$ induced by the Atiyah-Bott-Shapiro orientation is an isomorphism of $K^t_\sharp(\mathrm{pt})$-modules.

This immediately implies the following result, reducing the problem of calculating the K-homology generators to the analogous problem in Spin$^c$ bordism.

Theorem 2.3. Let X be a finite CW-complex. Suppose $[M_i, f_i]$, $1 \leq i \leq m$, are generators of $\mathrm{MSpin}^c(X)$ as an $\mathrm{MSpin}^c(\mathrm{pt})$-module. Then $[M_i, 1\!\!1^{\mathbb{C}}_{M_i}, f_i]$, $1 \leq i \leq m$, generate $K^t_\sharp(X)$ as a $K^t_\sharp(\mathrm{pt})$-module. Namely, for all $n = 0, 1$, $K^t_n(X)$ is generated by the elements $[M_i, 1\!\!1^{\mathbb{C}}_{M_i}, f_i]$, $1 \leq i \leq m$, such that $\dim M_i \equiv n \pmod 2$.

In the case where X is also a manifold, we can consider


differentiable bordism, namely we can consider bordism of pairs $(M,f)$, where M is a compact Spin$^c$ manifold without boundary and $f : M \to X$ is a differentiable map. We then obtain another abelian group $\mathrm{MSpin}^c_{\rm Diff}(X)$, which is canonically isomorphic to $\mathrm{MSpin}^c(X)$ ([21]).

2.4. Branes within branes. A Dp-brane also generally has, in addition to its p-brane charge, lower-dimensional q-brane charges with $q = p - 2, p - 4, \dots$, which depend on the Chan-Paton bundle over its worldvolume [25]. Let $M \subset X$ be a compact connected spin$^c$ manifold without boundary and let E be a complex vector bundle over M. Then $[M,E,\iota] \in K^t_\sharp(X)$, where $\iota : M \to X$ is the natural inclusion. Under the assumptions of Theorem 2.1, if the brane is torsion-free then it has an expansion in terms of the lifted homology basis for the lattice $\Lambda_{K^t_\sharp(X)}$ of the form

$$[M,E,\iota] = \sum_{p=0}^{n} \sum_{i=1}^{m_p} d_{pi}(M,E)\, \big[M_i^p,\, 1\!\!1^{\mathbb{C}}_{M_i^p},\, \iota_i^p\big] \qquad (2.4)$$

with $d_{pi}(M,E) \in \mathbb{Z}$. The crucial point here is that the branes on the right-hand side of (2.4) are of lower dimension than the original brane on the left-hand side, and have even codimension with respect to the worldvolume M.

Lemma 2.1. If $[M_i^p, 1\!\!1^{\mathbb{C}}_{M_i^p}, \iota_i^p] \in \mathrm{im}\,(\iota_i^p)_*$ for each $0 \leq p \leq n$, $1 \leq i \leq m_p$, then $d_{pi}(M,E) = 0$ for all $p \neq \dim_{\mathbb{R}}(M) - 2j$ with $j = 0, 1, \dots, \big\lfloor \tfrac{\dim_{\mathbb{R}}(M)}{2} \big\rfloor$.

Proof. We apply the Chern character (1.13) to both sides of (2.4). Then $\mathrm{ch}_\bullet(M_i^p, 1\!\!1^{\mathbb{C}}_{M_i^p}, \iota_i^p)$ is a sum of homology cycles in $H_{\rm even}(X;\mathbb{Q})$ (resp. $H_{\rm odd}(X;\mathbb{Q})$) for p even (resp. odd) of degree at most p. Since $[M,E,\iota] = \iota_*[M,E,\mathrm{id}_M]$, the conclusion then follows from the commutative diagram

$$\begin{array}{ccc}
K^t_\sharp(M) & \xrightarrow{\ \ \iota_*\ \ } & K^t_\sharp(X) \\
\big\downarrow {\scriptstyle \mathrm{ch}_\bullet} & & \big\downarrow {\scriptstyle \mathrm{ch}_\bullet} \\
H_\sharp(M;\mathbb{Q}) & \xrightarrow{\ \ \iota_*\ \ } & H_\sharp(X;\mathbb{Q}) \, .
\end{array}$$
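The expansion (2.4) and Lemma 2.1 can be made explicit in a small case. The following sketch is an illustration not taken from the text; it assumes $X = M = \mathbb{CP}^1$ with its complex spin$^c$ structure and $E = \mathcal{O}(k)$, and uses the fact that $K^t_0(\mathbb{CP}^1) \cong \mathbb{Z}^2$ is torsion-free, so that the Chern character detects classes.

```latex
% Assumptions (not from the text): X = M = \mathbb{CP}^1, E = \mathcal{O}(k), x the hyperplane class,
% and x \cap [\mathbb{CP}^1] = [\mathrm{pt}].
\begin{align*}
\mathrm{ch}_\bullet\big(\mathbb{CP}^1, \mathcal{O}(k), \mathrm{id}\big)
  &= \big(1 + (k+1)\,x\big)\cap[\mathbb{CP}^1] = [\mathbb{CP}^1] + (k+1)\,[\mathrm{pt}],\\
\mathrm{ch}_\bullet\big(\mathbb{CP}^1, 1\!\!1^{\mathbb{C}}, \mathrm{id}\big) &= [\mathbb{CP}^1] + [\mathrm{pt}],
\qquad
\mathrm{ch}_\bullet\big(\mathrm{pt}, 1\!\!1^{\mathbb{C}}, \iota\big) = [\mathrm{pt}],\\
\Longrightarrow\quad
\big[\mathbb{CP}^1, \mathcal{O}(k), \mathrm{id}\big]
  &= \big[\mathbb{CP}^1, 1\!\!1^{\mathbb{C}}, \mathrm{id}\big] + k\,\big[\mathrm{pt}, 1\!\!1^{\mathbb{C}}, \iota\big] .
\end{align*}
```

Only the terms with $p = 2$ and $p = 0$ appear, i.e. codimension 0 and 2 in the worldvolume, and the lower brane charge $d_{0,1} = k$ is carried entirely by the Chan-Paton bundle.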

Remark 2.3. This relationship represents the possibility of being able to construct stable states of D-branes as bound states in higher-dimensional branes by placing non-trivial Chan-Paton bundles on the higher-dimensional worldvolume. Note that this works even when E is a non-trivial line bundle. In general, it is difficult to determine the lower ( p − 1)-brane charges d pi (M, E) ∈ Z in (2.4) explicitly. We will return to this issue in Sect. 4.1. ♦ 2.5. Polarization. There is an “opposite” effect to the one just described wherein a D p-brane can expand or “polarize” into a higher dimensional brane [51]. This is the dielectric effect which was mentioned in Sect. 2.2 and it is intrinsically due to the nonabelian structure groups that higher rank Chan-Paton bundles possess. We will now


describe this process in more detail. Let $M \subset X$ be a compact spin$^c$ manifold without boundary. Then M is $K_t^\sharp$-oriented. Let F be a real $C^\infty$ spin$^c$ vector bundle over M of even rank 2r with structure group O(2r) and Thom class $\tau_F$. Then $F \oplus 1\!\!1^{\mathbb{R}}_M$ is a smooth O(2r + 1) vector bundle which admits a nowhere zero section $\Sigma^F : M \to F \oplus 1\!\!1^{\mathbb{R}}_M$ given by $\Sigma^F(x) := 0_x \oplus 1$ for $x \in M$. The map $\Sigma^F$ may be regarded as a section of the sphere bundle $\hat{M}$ defined by (1.1,1.2). Then the Poincaré duality isomorphism yields a functorial homomorphism $\Sigma^F_! : K_t^i(M) \to K_t^i(\hat{M})$ given for $i = 0, 1$ and $\xi \in K_t^i(M)$ by

$$\Sigma^F_!(\xi) := \Phi_{\hat{M}}^{-1} \circ \Sigma^F_* \circ \Phi_M(\xi) \, .$$

This is called the Gysin homomorphism, and by using (1.3) it may be represented as the composition [65]

$$\Sigma^F_! \,:\, K_t^i(M) \xrightarrow{\ T_{M,F}\ } K_t^i\big(B_+(F), S(F)\big) \xrightarrow{\ \mathrm{exc}\ } K_t^i\big(\hat{M}, B_-(F)\big) \xrightarrow{\ \varsigma^*\ } K_t^i(\hat{M}) \, ,$$

where the first map is the Thom isomorphism of F, the second map is excision and the third map is restriction induced by the inclusion $\varsigma : (\hat{M}, \emptyset) \to (\hat{M}, B_-(F))$.

Consider now the complex vector bundle $H(F) \to \hat{M}$ defined in (1.4). With $\pi$ the bundle projection (1.2), $[H(F)]|_{\pi^{-1}(x)} = [H(F)|_{\pi^{-1}(x)}]$ is the Bott generator of $\tilde{K}_t^0(B(F)_x / S(F)_x)$ for all $x \in M$. It follows that the K-theory class $[H(F)] \in K_t^0(\hat{M})$ is related to the Thom class of F by $\varsigma^* \circ \mathrm{exc}(\tau_F) = [H(F)]$. Since $\varsigma^*$ and exc are ring isomorphisms, from the explicit expression for the Thom isomorphism (1.11) one has $\Sigma^F_![E] = \pi^*[E] \cup [H(F)] = [\pi^*(E) \otimes H(F)]$ for any complex vector bundle $E \to M$. Thus given a K-cycle $(M,E,\phi)$ on X, we can rewrite vector bundle modification as the equivalence relation

$$\big(M, E, \phi\big) \sim \big(\hat{M},\, \Sigma^F_![E],\, \phi \circ \pi\big) \, . \qquad (2.5)$$

The charge of the D-brane $[M,E,\phi] \in K^t_\sharp(X)$ is, by definition, the index (1.14). From (2.5) we see that we can rewrite it using $\mu^a[M,E,\phi] = \mu^a[\hat{M}, \Sigma^F_![E], \phi \circ \pi]$ to get

$$\mathrm{Index}\,\phi_*[\slashed{D}_E] = \big( \mathrm{ch}^\bullet(\Sigma^F_![E]) \cup \mathrm{td}(T\hat{M}) \big)[\hat{M}] \, .$$

Thus the charge of a polarized Dp-brane can be expressed entirely in terms of characteristic classes associated with the higher-dimensional spherical D(p + 2r)-brane into which it has dissolved to form a bound state. A similar formula was noted in a specific context in [43]. This is a general feature of bound states of D-branes, and they can always be expressed entirely in terms of quantities intrinsic to the ambient space X [49], as we will now proceed to show.
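As a consistency check of the polarized charge formula in the simplest case, one can return to the D-instanton blown up into a sphere in Sect. 2.2. The sketch below is an illustration under stated assumptions, not a computation from the text: it takes M to be a point, assumes the sphere carries the spin$^c$ structure coming from its spin structure, and fixes the orientation so that the top class evaluates to $+1$.

```latex
% Assumptions (not from the text): M = pt, F = pt \times \mathbb{R}^{2n}, so the sphere bundle is S^{2n};
% the spin^c structure on S^{2n} comes from its spin structure, so \mathrm{td}(TS^{2n}) = 1;
% u \in H^{2n}(S^{2n};\mathbb{Z}) is the generator with u[S^{2n}] = 1 in the chosen orientation.
\begin{align*}
\mathrm{ch}^\bullet\big(H(F)\big) &= \mathrm{rk}\,H(F) + u ,\\
\big(\mathrm{ch}^\bullet(H(F))\cup\mathrm{td}(TS^{2n})\big)[S^{2n}] &= u[S^{2n}] = 1
  = \big(\mathrm{ch}^\bullet(1\!\!1^{\mathbb{C}}_{\mathrm{pt}})\cup\mathrm{td}(T\,\mathrm{pt})\big)[\mathrm{pt}] .
\end{align*}
```

The rank term does not contribute because it sits in degree 0, so the spherical brane carries exactly the unit charge of the original D(−1)-brane.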


2.6. Tachyon condensation. The constructions we have thus far presented involve stable states of D-branes. We can also consider configurations of branes which are a priori unstable and decay into stable D-branes. This requires us to start considering virtual elements in K-theory, with the stable states being associated to the positive cone of the K-theory group. Using these elements we can construct an explicit set of generators for $K^t_\sharp(X)$, including its torsion subgroup. The decay mechanism is then represented as a change of basis for the topological K-homology group after we compare these generators with those obtained in Theorem 2.1.

Proposition 2.1. Let X be an n-dimensional compact connected spin$^c$ manifold without boundary whose degree 0 reduced topological K-theory group can be presented as the split

$$\tilde{K}_t^0(X) = \Big( \bigoplus_{j=1}^{m} \xi_j\, \mathbb{Z} \Big) \oplus \Big( \bigoplus_{i=1}^{k} \bigoplus_{l=1}^{p_i} \gamma_{li}\, \mathbb{Z}_{n_i} \Big) \, ,$$

where $n_i \geq 2$, $p_i \geq 1$ and $\xi_j, \gamma_{li} \in \tilde{K}_t^0(X)$ for each $1 \leq j \leq m$, $1 \leq i \leq k$, $1 \leq l \leq p_i$. Choose representatives for the generators $\xi_j = [E_j] - [F_j]$ and $\gamma_{li} = [G_{li}] - [H_{li}]$ in terms of complex vector bundles over X. Then $K^t_{e(n)}(X)$ is generated as an abelian group by the elements

$$[X, E_j, \mathrm{id}_X] - [X, F_j, \mathrm{id}_X] \, , \quad 1 \leq j \leq m \, , \qquad [X, G_{li}, \mathrm{id}_X] - [X, H_{li}, \mathrm{id}_X] \, , \quad 1 \leq i \leq k \, , \ 1 \leq l \leq p_i \, .$$

Proof. We explicitly construct the Poincaré duality isomorphism $\Phi^t_X : K_t^0(X) \to K^t_{e(n)}(X)$ induced from (1.12) in this case. Recall from Sect. 1.7 that the K-cycle class $[X, 1\!\!1^{\mathbb{C}}_X, \mathrm{id}_X]$ is the fundamental class of X in $K^t_{e(n)}(X)$ and that the isomorphism $\mu^s$ is compatible with cap products. It follows that the map $\Phi^t_X := (\mu^s)^{-1} \circ \Phi_X$ is given explicitly by

$$\begin{aligned}
\Phi^t_X(\xi_j) &= \big([E_j] - [F_j]\big) \cap [X, 1\!\!1^{\mathbb{C}}_X, \mathrm{id}_X] \\
&= [X, 1\!\!1^{\mathbb{C}}_X \otimes E_j, \mathrm{id}_X] - [X, 1\!\!1^{\mathbb{C}}_X \otimes F_j, \mathrm{id}_X] \\
&= [X, E_j, \mathrm{id}_X] - [X, F_j, \mathrm{id}_X] \, .
\end{aligned}$$

The conclusion now follows by Poincaré duality.

Remark 2.4. Suppose that X satisfies the conditions of both Theorem 2.1 and Proposition 2.1. Let $[M_i^p, 1\!\!1^{\mathbb{C}}_{M_i^p}, \iota_i^p]$ be generators of the lattice $\Lambda_{K^t_{e(n)}(X)} = K^t_{e(n)}(X) / \mathrm{tor}_{K^t_{e(n)}(X)}$ given by Theorem 2.1, and $[X, E_j, \mathrm{id}_X] - [X, F_j, \mathrm{id}_X]$ the generators given by Proposition 2.1. Since these are bases of the same free $\mathbb{Z}$-module $\Lambda_{K^t_{e(n)}(X)}$, there are uniquely defined integers $a^j_{pi}$ such that

$$[X, E_j, \mathrm{id}_X] - [X, F_j, \mathrm{id}_X] = \sum_{\substack{p=1 \\ e(p)=e(n)}}^{n} \ \sum_{i=1}^{m_p} a^j_{pi}\, \big[M_i^p,\, 1\!\!1^{\mathbb{C}}_{M_i^p},\, \iota_i^p\big] \qquad (2.6)$$

for $1 \leq j \leq m$. In the string theory setting, X is a ten-dimensional spin manifold and (2.6) represents a change of basis on $\Lambda_{K^t_0(X)}$. The right-hand side is an expansion


in terms of the stable torsion-free Type IIB D-branes which wrap the non-trivial spinc homology cycles of X in even degree. The left-hand side is the difference between a pair of spacetime-ﬁlling D9-branes wrapping the entire ambient space X . The relative sign difference indicates that one of these branes should be regarded as oppositely charged relative to the other, i.e. it is an antibrane. The left-hand side thus represents a brane-antibrane system. It is unstable and (2.6) describes its decay into lower-dimensional stable D-branes in X . Note that any stable D-brane of even degree can be constructed from such 9-brane pairs. We have thereby reproduced the construction of Type IIB stable supersymmetric D-branes from spacetime filling brane-antibrane pairs [53, 57, 64]. As before, however, it is in general quite difficult to explicitly determine the ( p − 1)-brane j charges a pi ∈ Z in (2.6). We remark on the analogous construction in Type IIA string theory in Sect. 2.7. ♦ Example 2.2 (ABS Construction). Let X be a compact connected ten-dimensional C∞ spin manifold without boundary. Let (M, E, φ) be a K-cycle on X such that φ : M → X is a proper imbedding with M connected and of even codimension 2k in X (so that the corresponding K-homology class describes a D p-brane in X with p = 9−2k). Choosing a riemannian metric on X , the normal bundle N M → M of M in X fits into an exact sequence of real vector bundles as φ∗

0 −→ T M −→ T X −→ N M −→ 0 . Let wi (F) ∈ Hi (M; Z2 ) denote the i th Stiefel-Whitney class of a real vector bundle F → M with w0 (F) = 1. If the metric on X restricts non-degenerately to the worldvolume M, then one has the Whitney sum formula φ ∗ wi (T X ) =

i

w j (T M) ∪ wi− j (N M) .

j=0

Since X and M are orientable, we have w1 (T X ) = w1 (T M) = 0 and hence w1 (N M) = 0. Since X is spin, we also have w2 (T X ) = 0 and hence w2 (N M) = w2 (T M). Thus endowing M with a spinc structure is equivalent to endowing its normal bundle with a spinc structure. It follows that N M → M is a real spinc C∞ vector bundle with structure group SO(2k). Applying the clutching construction of Sect. 1.2 to F = N M, vector bundle modification then identifies the D-branes

\[ \big[\widehat{M},\, H(NM) \otimes \pi^*(E),\, \phi \circ \pi\big] = \big[M, E, \phi\big] . \tag{2.7} \]

To proceed further we need the following elementary result [18]. Lemma 2.2. Let W be a compact manifold and Z a connected manifold which are both non-empty and have the same dimension. Then any embedding f : W → Z is a diffeomorphism. Proof. Let V = Z \ f (W ). Since W and Z have the same dimension, f (W ) is open in Z . On the other hand, since W is compact, f (W ) is compact in Z and hence closed. Thus one concludes that V is both open and closed in Z . Since Z is connected, it follows that either V = ∅ or V = Z . Since W and Z are non-empty, one has V = ∅.

Geometric K-Homology of Flat D-Branes


which in the present case is a compact Let us apply this lemma to the sphere bundle M, ∼ ten-dimensional submanifold of X . It follows that M = X and so the left-hand side of (2.7) represents a configuration of spacetime-filling D9-branes. To see that it is an element of the basis set provided by Proposition 2.1, we appeal to the Atiyah-Bott-Shapiro (ABS) construction in topological K-theory [4]. For this, we use the metric on X to construct a tubular neighbourhood M of M in X . Let M denote the closure of M in X and M its boundary. The neighbourhood M may be identified with the total space of the normal bundle N M → M. Without loss of generality, we can identify M with the interior of the unit ball bundle B(N M) \ S(N M) (whose fibres consist of normal vectors with norm < 1). Then we have the identifications M = B(N M) and M = S(N M). Let ρ : M → M be the retraction of the regular neighbourhood M onto M, and denote the ± ∗ twisted spinor bundles over M by ± E := (N M) ⊗ ρ (E). Set X = X \ M . − Suppose first that the bundle E admits an extension over X , also denoted − E. Via the (extended) Clifford multiplication map σ E := σ ⊗ idρ ∗ (E) , the bundle +E is isomorphic to − E on M , and so it can also be extended over X by declaring that it − ± be isomorphic to E over X . This gives a pair of bundles E → X that determine 0 an element [+E ] − [− E ] of the reduced K-theory group Kt (X ) which vanishes on X . On the other hand, from (1.4) we see that this K-theory element is just the Gysin homomorphism + − E − E = !N M E = π ∗ E ∪ H (N M) . ∼ Using this fact along with the diffeomorphism M = X , the identification (2.7) becomes X, +E , id X − X, − (2.8) E , id X = ± M, E, φ , and X coincide. where the sign depends on whether or not the spinc structures on M This is the standard construction of a Type IIB D-brane [M, E, φ] in terms of spacetimefilling brane-antibrane pairs [53, 64]. 
In this context the Clifford multiplication map σ E is called the tachyon ﬁeld and the decay mechanism (2.8) is known as tachyon conden sation. If the bundle − E does not admit an extension over X , we use Swan’s theorem to construct a complex vector bundle G → M such that G ⊕ (S1 (N M) ⊗ E) is trivial − ∗ over M, and hence whose pullback ρ ∗ (G) ⊕ − E is trivial over M. Then ρ (G) ⊕ E + ∗ can be extended over the whole of X as a trivial bundle. The bundle ρ (G) ⊕ E is ∗ isomorphic to ρ ∗ (G) ⊕ − E on M under the vector bundle map idρ (G) ⊕ σ E , and so it can also be extended over X by setting it equal to ρ ∗ (G) ⊕ − over X . The resulting E K-theory class is again trivial over X (but not over X ), and by the direct sum relation the Poincaré dual K-homology class coincides with (2.8). ♦ 2.7. Unstable 9-branes. The crux of the constructions of Sect. 2.6 is that one can use virtual elements which signal instability of the given configurations of branes. Using Corollary 1.1 one can replace (X ) with the collection of isomorphism classes of triples (M, ξ, φ), where M and φ are as in Definition 1.1 and ξ ∈ Kt0 (M) is a class in the degree 0 topological K-theory of M. Clearly both definitions lead to the same group K t (X ). One can further extend the definition to triples (M, ξ, φ) with ξ ∈ Kt (M) = Kt0 (M) ⊕ Kt1 (M) [38]. The Z2 -grading on K t (X ) is then defined by taking K0t (X ) (resp. K1t (X )) to be the subgroup given by classes of K-cycles (M, ξ, φ) such that


R.M.G. Reis, R.J. Szabo

ξ | Ml ∈ Ktil (M) for some il = 0, 1 and dimR (Ml ) + il is an even (resp. odd) integer for all connected components Ml of M. Vector bundle modification is now generically described as the equivalence relation (2.5) using the Gysin homomorphism associated to a real spinc vector bundle F → M whose rank has the same parity as that of the dimension of (the connected components of) M. To see that this definition is in fact equivalent to our previous one, let [M, ξ, φ] ∈ j Kit (X ) with i = 0, 1, M connected, and ξ a non-zero element of Kt (M) for some j = 0, 1. If m := dimR (M), then one has m + j ≡ i mod 2 by definition. Consider i+m+1 j and the associated Gysin map !F : Kt (M) → the trivial spinc bundle F = 11R M e( j+i+m) e( j+i+m) ∼ ( M ). Since j +i +m ≡ j +m+ j +m mod 2 ≡ 0 mod 2, one has Kt (M)= Kt 0 Kt ( M ). It follows that there are complex vector bundles E, H → M with !F (ξ ) = E, φ ◦ π ] − [E] − [H ], and by vector bundle modification one has [M, ξ, φ] = [ M, t [ M, H, φ ◦ π ] in Ki (X ). Notice that by using the usual cup product on the K-theory ring Kt (X ), the cap product (1.7) may now be alternatively defined by ξ ∩ [M, ξ, φ] = ∗ [M, φ ξ ∪ ξ, φ] for ξ ∈ Kti (X ) and K-cycle classes [M, ξ, φ] ∈ Ktj (X ). Suppose that X is a compact spinc manifold without boundary obeying the conditions of Theorem 2.1. Let E be a complex vector bundle over X and α an automorphism of E. This defines a degree 1 K-theory class [E, α] ∈ Kt1 (X ) which we assume to be torsion-free. Applying the Poincaré duality isomorphism as before then gives the analog of the expansion (2.6) as

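\[ \big[X, (E, \alpha), \mathrm{id}_X\big] = \sum_{\substack{p=1 \\ e(p)=e(n+1)}}^{n} \; \sum_{i=1}^{m_p} b_{pi}(E, \alpha)\, \big[M_i^p,\, \mathbb{1}^{\mathbb{C}}_{M_i^p},\, \iota_i^p\big] \tag{2.9} \]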

. In the string theory setting, this is a relation in Kt (X ) expressing the in K t 1 e(n+1) (X ) decay of an unstable D9-brane into stable Type IIA D-branes [36, 53]. E → X is the Chan-Paton bundle on the 9-brane and now the automorphism α : E → E plays the role of the tachyon field. As an explicit example of the decay mechanism (2.9), we may construct the Type IIA version of the ABS construction of Example 2.2. Now we consider a K-cycle (M, E, φ) on X with M of odd codimension 2k + 1 in X , so that the corresponding normal bundle N M → M is an SO(2k + 1) vector bundle. By Lemma 2.2 ∼ one again has a diffeomorphism M = X . Define E := (N M) ⊗ ρ ∗ (E), where (N M) = S(N M)| M is the pull-back of the unique irreducible spinor bundle S(N M) over N M, and assume that it admits an extension over X . Then the Gysin homomorphism gives ! [E] = [ E , exp σ E ] in Kt1 (X ), and so by vector bundle modification one has the identification X, ( E , exp σ E ), id X = ± M, E, φ . This is the standard construction of a Type IIA D-brane [M, E, φ] in terms of unstable spacetime-filling 9-branes [36, 53]. Remark 2.5. It is important to realize that one can stick to our original definition and thus avoid Kt1 -classes entirely. One of the great advantages of the geometric formulation of K-homology, in contrast to other homology theories, is that it is naturally defined in terms of stable objects and one need never consider virtual elements. While brane-antibrane systems are straightforward to construct, the unstable D-branes defined by virtual


K-theory classes in degree 1 are not so natural in this framework. This reflects the difficulties encountered in the description of these D-brane states directly in string theory. To illustrate this point further, let X be as in Proposition 2.1 and consider the Gysin e(i+n) homomorphism in K-theory !Fn : Kti (X ) → Kt ( X ), where Fn = X × Rn . Then n X = X × S and the Gysin homomorphism becomes a map !Fn : Kti (X ) −→ Kte(i+n) (X × Sn ) . The sphere bundle projection π : X × Sn → X in this case is the projection onto the first factor. Since π! ◦ !Fn = idK (X ) , it follows that !Fn is a monomorphism t

and π! is an epimorphism. By definition one has !Fn = (tX ×Sn )−1 ◦ ∗Fn ◦ tX and π! = (tX )−1 ◦ π∗ ◦ tX ×Sn . Because the Poincaré duality maps are isomorphisms, one

concludes that the induced maps ∗Fn and π∗ in K-homology are also a monomorphism and an epimorphism, respectively. Assume that the degree 1 topological K-theory group of X admits a split   k pi m 1 i   Kt (X ) = ξj Z ⊕ γl Zn i . j=1

i=1

l=1

There is a commutative diagram F

! 1

Kt1 (X ) tX

/ K0 (X × S1 ) t t

X ×S1

t Ke(n−1) (X )

F

∗ 1

/ Kt (X × S1 ) . e(n−1)

Let !F1 (ξ j ) = [E j ] − [F j ] and !F1 (γli ) = [G li ] − [Hli ] in terms of complex vector bundles over X × S1 . Then for all 1 ≤ j ≤ m one has [X ×S1 , E j , π ]−[X ×S1 , F j , π ] = π∗ [X ×S1 , E j , id X×S1 ]−[X × S1 , F j , id X ×S1 ] = π∗ ◦ ∗F1 [X, τ j , id X ] for some τ j ∈ Kt0 (X ). Since π∗ ◦ ∗F1 = idKt (X ) , by Poincaré duality it follows that

[X × S , E j , π ] − [X × S , F j , π ] , 1 ≤ j ≤ m 1

1

t is a set of generators for the torsion-free part of the K-homology group Ke(n−1) (X ). One 1 can use the same procedure for the other set of generators of Kt (X ) to conclude that

[X × S1 , G li , π ] − [X × S1 , Hli , π ] , 1 ≤ i ≤ k , 1 ≤ l ≤ pi t is a set of generators for the torsion subgroup of Ke(n−1) (X ). In the string theory setting, this gives an “M-theory” realization of unstable Type IIA 9-branes in terms of braneantibrane systems on an 11-dimensional extension X × S1 of the spacetime manifold X [53, 64]. More generally, one can start from any real spinc line bundle F → X and describe these unstable D-brane states in terms of a spinc circle bundle over X . ♦


3. Torsion-Free D-Branes

In this section we will describe some elementary applications of the formalism of the previous section. While for the most part we will arrive at the anticipated results, this simple analysis will illustrate how the known properties of D-branes arise within a mathematically precise formalism. We will only look at examples of torsion-free K-homology groups, deferring the analysis of torsion D-branes to the next section.

3.1. Spherical D-branes. An important role will be played by the D-branes which wrap images of $n$-dimensional spheres $S^n$ in $X$. We will first consider the case where $X$ is an arbitrary finite CW-complex. Let $n \geq 0$ and let $E$ be a complex vector bundle over $S^n$. Using Lemma 1.4 we can construct a homomorphism $\gamma_{n,E} : [S^n, X] \to K^t(X)$ given by $\gamma_{n,E}[\psi] := [S^n, E, \psi]$. We can also construct a homomorphism $h_{n,E} : \pi_n(X) \to K^t(X)$ given by $h_{n,E}[f] := [S^n, E, f]$. The subgroup of $K^t(X)$ generated by the K-cycle classes of the form $[S^n, E, \psi]$ for $n \geq 1$ and $[\mathrm{pt}, F, \phi]$ is denoted $S^t(X)$. It has a natural $\mathbb{Z}_2$-grading $S^t(X) = S^t_0(X) \oplus S^t_1(X)$ by the parity $e(n)$ of the sphere dimensions $n$.

Proposition 3.1. Let $n \geq 0$ and let $E$ be a complex vector bundle over $S^n$. (a) If $X$ is path connected and simply connected, then $\operatorname{im} \gamma_{n,E} = \operatorname{im} h_{n,E}$ in $K^t(X)$. (b) If $E = \mathbb{1}^{\mathbb{C}}_{S^n}$, then $h_n := h_{n, \mathbb{1}^{\mathbb{C}}_{S^n}}$ is the Hurewicz homomorphism in K-homology.

Proof. (a) follows immediately from the fact that $[S^n, X] = \pi_n(X)$ in this case [22]. (b) follows from the fact that $[S^n, E, f] = f_*[S^n, \mathbb{1}^{\mathbb{C}}_{S^n}, \mathrm{id}_{S^n}]$ with $[S^n, \mathbb{1}^{\mathbb{C}}_{S^n}, \mathrm{id}_{S^n}]$ the fundamental class of $S^n$ in $K^t(S^n)$.

Remark 3.1. Let $f : S^1 \to S^1$ be a continuous map with $f(s_0) = x_0 \neq s_0$. Regarding $S^1 \subset \mathbb{C}$, let $\theta \in (0, 1)$ be defined by $x_0 = e^{2\pi i \theta} s_0$. Define $f_0 : S^1 \to S^1$ by $f_0(z) := e^{-2\pi i \theta} f(z)$. Then $f_0$ is continuous with $f_0(s_0) = s_0$, and so it is a based map at $s_0$. The map $H : [0, 1] \times S^1 \to S^1$ defined by $H(t, z) := e^{-2\pi i t \theta} f(z)$ is a homotopy between $f$ and $f_0$. Since homotopy of maps is an equivalence relation, we conclude that the map $G : [S^1, S^1] \to \pi_1(S^1)$ given by the assignment $[f] \mapsto [f_0]$ is well-defined and a bijection. Thus we may consider $\gamma_{1,E} : [S^1, X] \to K^t(X)$ as a homomorphism $\gamma_{1,E} : \pi_1(X) \to K^t(X)$ for any topological space $X$. ♦

We shall now specialize to the case $X = S^n$, whose cellular structure consists of a single $k$-cell in dimensions $k = 0, n$. We will find generators for the subgroup of $K^t(S^n)$ generated by the lower dimensional spheres $S^k$ with $1 \leq k \leq n$ and pt.
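The rebasing homotopy of Remark 3.1 can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the helper name `rebase` and the sample map $f(z) = i z^2$ are hypothetical choices, not taken from the paper, and points of $S^1$ are modelled as unit complex numbers.

```python
import cmath
import math


def rebase(f, s0):
    """Given a map f on the unit circle and a base point s0 (unit complex
    numbers), return (theta, H, f0) as in Remark 3.1.

    theta is defined by f(s0) = exp(2*pi*i*theta) * s0 with theta in [0, 1),
    H(t, z) = exp(-2*pi*i*t*theta) * f(z) is the homotopy, and f0 = H(1, .).
    """
    x0 = f(s0)
    # Angle from s0 to x0, normalised to a fraction of a full turn.
    theta = (cmath.phase(x0 / s0) / (2 * math.pi)) % 1.0

    def homotopy(t, z):
        return cmath.exp(-2j * math.pi * t * theta) * f(z)

    def f0(z):
        return homotopy(1.0, z)

    return theta, homotopy, f0


def sample_map(z):
    # Hypothetical example: a degree-2 map that does not fix the point 1.
    return 1j * z ** 2


theta, H, f0 = rebase(sample_map, 1 + 0j)
```

In this sketch `H(0, z)` agrees with `sample_map(z)`, `H(t, z)` stays on the unit circle for every `t`, and `f0` fixes the base point up to floating-point rounding.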


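Proposition 3.2. $K^t(S^0) = S^t(S^0) \cong \mathbb{Z} \oplus \mathbb{Z}$ with $S^t_0(S^0) \cong \mathbb{Z} \oplus \mathbb{Z}$ and $S^t_1(S^0) \cong 0$.

Proof. By definition of the group $S^t(\mathrm{pt})$ it follows immediately that $S^t_1(\mathrm{pt})$ is the free abelian group generated by the trivial K-cycle class $[\emptyset, \emptyset, \emptyset]$, i.e. $S^t_1(\mathrm{pt}) = 0$. A complex vector bundle $E$ over pt is just a finite dimensional vector space, i.e. there exists an integer $m > 0$ such that $E = \mathrm{pt} \times \mathbb{C}^m = \bigoplus_{i=1}^{m} \mathbb{1}^{\mathbb{C}}_{\mathrm{pt}}$. Since the unique map $\phi : \mathrm{pt} \to \mathrm{pt}$ is the identity, it follows that $[\mathrm{pt}, E, \phi] = m\,[\mathrm{pt}, \mathbb{1}^{\mathbb{C}}_{\mathrm{pt}}, \mathrm{id}_{\mathrm{pt}}]$. Thus $S^t_0(\mathrm{pt}) \cong [\mathrm{pt}, \mathbb{1}^{\mathbb{C}}_{\mathrm{pt}}, \mathrm{id}_{\mathrm{pt}}]\, \mathbb{Z} \cong \mathbb{Z}$ and the conclusion now follows from Lemma 1.2.

As a consequence of Lemma 1.3, Theorem 2.1 and Proposition 3.2 we have the following result.

Lemma 3.1. If $n \geq 1$ then $S^t(S^n)$ is generated by classes of the form $[S^k, E, \phi]$ and $[\mathrm{pt}, \mathbb{1}^{\mathbb{C}}_{\mathrm{pt}}, \iota]$, where $1 \leq k \leq n$, $E$ is a generating vector bundle for the K-theory of $S^k$, and $\iota : \mathrm{pt} \to S^n$ is the inclusion of a point.

Recall that the (complex) K-theory of the spheres is given by

\[ K_t^0(S^n) \cong \begin{cases} \mathbb{Z} \oplus \mathbb{Z} , & n \text{ even} \\ \mathbb{Z} , & n \text{ odd} \end{cases} , \qquad K_t^1(S^n) \cong \begin{cases} 0 , & n \text{ even} \\ \mathbb{Z} , & n \text{ odd} \end{cases} . \]

The trivial line bundle $\mathbb{1}^{\mathbb{C}}_{S^n} = S^n \times \mathbb{C}$ is always a degree 0 generator, given by the monomorphism $K_t^0(\mathrm{pt}) \to K_t^0(S^n)$ induced by the inclusion of a point. The non-trivial generator of $K_t^0(S^2)$ is obtained from the homeomorphism $\mathbb{CP}^1 \cong S^2$ by taking the class of the canonical line bundle $L_1$ over the complex projective line $\mathbb{CP}^1$. The non-trivial generator $[L_1]^p$ of $K_t^0(S^{2p})$, $p \in \mathbb{N}$ is obtained from $[L_1]$ by using the K-theory cup product [41]. The $K_t^1$-groups are obtained through suspension $S^{n+1} \cong \Sigma S^n = S^n \wedge S^1$. By Poincaré duality one concludes that

\[ K^t_0(S^n) \cong K_t^{e(n)}(S^n) \cong \begin{cases} \mathbb{Z} \oplus \mathbb{Z} , & n \text{ even} \\ \mathbb{Z} , & n \text{ odd} \end{cases} , \qquad K^t_1(S^n) \cong K_t^{e(n-1)}(S^n) \cong \begin{cases} 0 , & n \text{ even} \\ \mathbb{Z} , & n \text{ odd} \end{cases} \]

for $n \geq 1$, and hence $K^t(S^n) \cong \mathbb{Z} \oplus \mathbb{Z}$.

Proposition 3.3. Let $n \geq 1$. (a) $K^t(S^n)$ is generated by the classes $[\mathrm{pt}, \mathbb{1}^{\mathbb{C}}_{\mathrm{pt}}, \iota]$ and $[S^n, \mathbb{1}^{\mathbb{C}}_{S^n}, \mathrm{id}_{S^n}]$. (b) $S^t(S^n) = K^t(S^n)$ as $\mathbb{Z}_2$-graded abelian groups.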

Proof. (a) follows from calculating the Chern characters of the classes. Since ch• (11C X) = • (11C ) ∪ 1 for any space X and td(T pt) = 1, it follows that ch• (pt, 11C , ι) = ι (ch ∗ pt pt C C td(T pt)∩[pt]) = ι∗ [pt] = 1. We also have T Sn ⊕11C Sn = 11Sn , so that 1 = td(11Sn ) = C • C n n n n] = td(T Sn ) ∪ td(11C Sn ) = td(T S ) and ch• (S , 11Sn , idSn ) = ch (11Sn ) ∪ td(T S ) ∩ [S n n [S ]. Thus ch• maps the pertinent classes to distinct non-torsion elements of H (S ; Z), and the conclusion follows by Poincaré duality. (b) follows from the fact that the generators of K t (Sn ) are in St (Sn ). n+1

n+1


Remark 3.2. Proposition 3.3(a) is just a special case of Theorem 2.1 and it allows us to conclude, without the assumption of Poincaré duality, that the torsion-free part of $K^t(S^n)$ is generated by the said classes. This is also true of the equality $K^t(\mathrm{pt}) = S^t(\mathrm{pt})$ considered in Proposition 3.2. ♦

Corollary 3.1. Let $n \geq 0$. (a) $S^t_0(S^n) \cong \mathbb{Z} \oplus \mathbb{Z}$ is generated by the classes $[\mathrm{pt}, \mathbb{1}^{\mathbb{C}}_{\mathrm{pt}}, \iota]$ and $[S^n, \mathbb{1}^{\mathbb{C}}_{S^n}, \mathrm{id}_{S^n}]$, while $S^t_1(S^n) \cong 0$. (b) The Hurewicz homomorphism in reduced K-homology $h_n : \pi_n(S^n) \to \widetilde{K}^t(S^n)$ is a bijection.

Proof. (a) follows immediately along the lines of the proof of Proposition 3.3, since $\mathrm{ch}_\bullet$ is an isomorphism in this case. (b) follows from the fact that $h_n = \mathrm{ch}_\bullet^{-1} \circ \rho_n$, where $\rho_n : \pi_n(S^n) \xrightarrow{\ \approx\ } H_n(S^n; \mathbb{Z})$ is the Hurewicz isomorphism.

Remark 3.3. From the discussion of Sect. 2.2 we see that the classes $[S^{2k}, \mathbb{1}^{\mathbb{C}}_{S^{2k}}, \iota_{S^{2k}}]$, with $2k < n$ and $\iota_{S^{2k}} : S^{2k} \to S^n$ the inclusion, are all identified in $K^t(S^n)$ through vector bundle modification. Looking at the proof of Proposition 3.3, one has $[S^n, \mathbb{1}^{\mathbb{C}}_{S^n}, \mathrm{id}_{S^n}] = [S^n, \mathbb{1}^{\mathbb{C}}_{S^n}, \psi_n]$, where $\psi_n : S^n \to S^n$ is a map of winding number $n$. In particular, by Corollary 3.1(b) a K-cycle class $[S^n, \mathbb{1}^{\mathbb{C}}_{S^n}, \phi] \in K^t(S^n)$ depends only on the degree of the map $\phi$. Finally, from (1.14) the charge of the D-brane $[S^n, \mathbb{1}^{\mathbb{C}}_{S^n}, \mathrm{id}_{S^n}]$ is $\mathrm{ch}^\bullet(\mathbb{1}^{\mathbb{C}}_{S^n}) \cup \mathrm{td}(TS^n)[S^n] = 1$, and similarly for the “vacuum” D-brane $[\mathrm{pt}, \mathbb{1}^{\mathbb{C}}_{\mathrm{pt}}, \iota]$. Thus the mathematical analysis above reproduces the well-known physical property that D-branes in flat space carry no lower-dimensional D-brane charges and thus have a simple additive charge. ♦

3.2. T-Duality. Using the reduced version of the exterior product of Sect. 1.6 and the consequent Künneth theorem, we can investigate the relationship between the groups $\widetilde{K}^t_0(\Sigma^n X)$ and $\widetilde{K}^t_{e(n)}(X)$, where $\Sigma^n X = S^n \wedge X$ is the $n$th reduced suspension of the topological space $X$. By Bott periodicity and induction one immediately concludes that $\widetilde{K}^t_0(\Sigma^{2n} X) \cong \widetilde{K}^t_0(X) = \widetilde{K}^t_{e(2n)}(X)$. On the other hand, a simple application of the Künneth theorem in its reduced version yields $\widetilde{K}^t_0(\Sigma^1 X) \cong \widetilde{K}^t_1(X) \otimes \widetilde{K}^t_1(S^1) \cong \widetilde{K}^t_1(X)$, and by induction we conclude that $\widetilde{K}^t_0(\Sigma^{2n+1} X) \cong \widetilde{K}^t_1(X) = \widetilde{K}^t_{e(2n+1)}(X)$.

Let us now consider the group $K^t_0(X \times S^1)$. The Künneth theorem gives $K^t_0(X \times S^1) \cong K^t_0(X) \oplus K^t_1(X)$, and therefore

\[ \widetilde{K}^t_0(X \times S^1) \cong \widetilde{K}^t_0(X) \oplus K^t_1(X) . \tag{3.1} \]

The inclusion $\iota : X \cong X \times \mathrm{pt} \to X \times S^1$ induces a homomorphism $\iota_* : \widetilde{K}^t_0(X) \to \widetilde{K}^t_0(X \times S^1)$. From the decomposition (3.1) it follows that $\iota_*(\alpha) = \alpha \oplus 0$ for all $\alpha \in \widetilde{K}^t_0(X)$, and hence $\operatorname{im} \iota_* = \widetilde{K}^t_0(X)$ and $\operatorname{coker} \iota_* = \widetilde{K}^t_0(X \times S^1) / \widetilde{K}^t_0(X) \cong K^t_1(X)$, where we have identified $K^t_1(X)$ (resp. $\widetilde{K}^t_0(X)$) with the subgroup of $\widetilde{K}^t_0(X \times S^1)$ consisting of K-cycle classes $[M, E, \phi]$ such that up to homotopy $\phi(M) \nsubseteq X \times \mathrm{pt}$ (resp. $\phi(M) \subseteq X \times \mathrm{pt}$). This construction can be used to provide an alternative “M-theory” definition of the unstable 9-branes in Type IIA superstring theory introduced in Sect. 2.7


which does not require virtual K-theory elements. For $X$ a ten-dimensional compact spin manifold without boundary, they are identified with the classes $[X, E, \phi]$ on the 11-dimensional space $X \times S^1$ for which $E = \mathbb{1}^{\mathbb{C}}_X$ and $\phi(X) \nsubseteq X \times \mathrm{pt}$. This is consistent with the construction presented in Remark 2.5.

Another application of these simple observations is to the description of T-duality in topological K-homology. Let $Q$ be a finite CW-complex and let $T^n \cong (S^1)^n$ be an $n$-dimensional torus. By the Künneth theorem one has $K^t_i(T^n) \cong K^t_i(S^1)^{\oplus 2^{n-1}} \cong \mathbb{Z}^{\oplus 2^{n-1}}$ for $i = 0, 1$. Generalizing the computation of (3.1) thus gives the isomorphisms

\[ K^t_0(Q \times T^n) \cong \big( \widetilde{K}^t_0(Q) \oplus K^t_1(Q) \oplus \mathbb{Z} \big)^{\oplus 2^{n-1}} , \qquad K^t_1(Q \times T^n) \cong \big( K^t_1(Q) \oplus \widetilde{K}^t_0(Q) \oplus \mathbb{Z} \big)^{\oplus 2^{n-1}} , \]

and therefore

\[ K^t_0(Q \times T^n) \cong K^t_1(Q \times T^n) . \tag{3.2} \]
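The rank count behind (3.2) can be reproduced with a few lines of Python. This is a minimal sketch that tracks only the ranks of the free parts (so torsion and the Tor terms of the Künneth theorem are ignored, which is an assumption of the sketch), and the function names are illustrative rather than taken from the paper.

```python
def torus_k_homology_ranks(n):
    """Return (rank K_0, rank K_1) of the n-torus for n >= 1.

    Uses the Kunneth rule on ranks for a product with a circle, whose
    K-homology has rank 1 in each degree:
        rank K_0(Y x S^1) = rank K_0(Y) + rank K_1(Y)
        rank K_1(Y x S^1) = rank K_1(Y) + rank K_0(Y)
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    rank0, rank1 = 1, 1  # the circle itself
    for _ in range(n - 1):
        rank0, rank1 = rank0 + rank1, rank1 + rank0
    return rank0, rank1


def product_with_torus_ranks(q_ranks, n):
    """Ranks of the free parts of K_0 and K_1 of Q x T^n, given the
    ranks (a0, a1) of K_0(Q) and K_1(Q)."""
    a0, a1 = q_ranks
    t0, t1 = torus_k_homology_ranks(n)
    return a0 * t0 + a1 * t1, a0 * t1 + a1 * t0
```

For every `n` the two torus ranks coincide and equal `2**(n-1)`, so the two outputs of `product_with_torus_ranks` are always equal, in line with the isomorphism (3.2) at the level of ranks.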

This isomorphism describes a relationship between Type IIB and Type IIA D-branes on the spacetime X = Q × Tn called T-duality. From the identifications above we see that the isomorphism exchanges wrapped D-branes [M, E, φ] (having φ(M) Q ×pt) with unwrapped D-branes (having φ(M) ⊆ Q × pt). The powers of 2n−1 give the expected multiplicity of D p-brane charges arising from wrapping all higher stable D-branes on various cycles of the torus Tn . A more geometrical derivation of the T-duality isomorphism (3.2) may be given as follows. Let ∼ = Zn be a lattice of rank n in a real vector space V of dimension n, and ∨ ∨ let ⊂ V be the dual lattice. Consider the real torus Tn = V / and the corresponding dual torus Tn = V ∨ /∨ . The lattices and ∨ may then be identified with the first homology lattices H1 (Tn ; Z) and H1 ( Tn ; Z), while the first homology lattice of Tn × Tn coincides with ⊗ ∨ . There is a unique line bundle P over the product space Tn × Tn , called the Poincaré line bundle, such that for any point t ∈ Tn the restriction n Pt = P|Tn ×{t} represents an element of the Picard group of T corresponding to t, and n such that the restriction P|{0}× Tn is the trivial complex line bundle over T . This bundle Tn ) which is a K-theory cup product of odd degree generators defines a class in Kt0 (Tn × for the K-theory of the tori Tn and Tn . Consider now the projections Q × Tn × N Tn NNN q q p p qq NN q NNN q q q NN' q x q q n Q×T Q × Tn . The T-duality isomorphism in topological K-theory [15, 37] ≈ e(i+n) T! : Kti Q × Tn −→ Kt Q × Tn is given for ξ ∈ Kti (Q × Tn ) and i = 0, 1 by T! (ξ ) = p! ( p ∗ ξ ⊗ P) ,


e(i+n) where p! : Kti (Q × Tn × Tn ) → Kt (Q × Tn ) is the push-forward map in Ktheory which is given by the topological index. Since we assume that the spacetimes X = Q × Tn are spin (equivalently Q is spin), they are Kt -oriented and X = Q ×Tn and n thus obey Poincaré duality. The K-homology of Q × T thereby has a set of generators given by [Q × Tn , ξ, id Q×Tn ], where ξ ∈ Kt (Q × Tn ) is a generator, and similarly for Q × Tn . It follows that the map t T ! : Kit Q × Tn −→ Ke(i+n) Q × Tn

given by

T ! Q × Tn , ξ, id Q×Tn = Q × Tn , p! ( p ∗ ξ ⊗ P), id Q× Tn

is a well-defined group homomorphism. Since T ! = t n ◦ T! ◦ (tQ×Tn )−1 , it is an Q×T isomorphism. This isomorphism is the T-duality isomorphism in topological K-homology. While this map is defined in terms of virtual K-theory elements, one can straightforwardly obtain a picture with only stable isomorphism classes appearing by applying vector bundle modification along the lines explained in Sect. 2.7. If n is even, the T-duality isomorphism maps a spacetime-filling brane-antibrane pair on X to a spacetime-filling brane-antibrane pair on the 11-dimensional “M-theory” extension X × S1 , as spelled out by the construction of Remark 2.5. In particular, this description can be used to provide a more general construction of T-duality in the case of a spinc torus bundle over Q [15]. The construction also thereby provides a topological K-homology realization of brane descent relations among D-branes [52]. 3.3. Projective D-branes. The simplest example of flat D-branes in a curved background is provided by the complex projective spaces CPn of real dimension 2n. Being complex manifolds they are automatically spinc . The cellular structure in this case may be described by the stratification of CPn into linearly embedded subspaces as CP0 ⊂ CP1 ⊂ · · · ⊂ CPn−1 ⊂ CPn ,

(3.3)

= satisfies the hypotheses of Theorem 2.1, a set of generators t (CPn ) is given by for its reduced topological K-homology group K k C (3.4) CP , 11CPk , ιk , 1 ≤ k ≤ n, where CP0

pt. Since CPn

where ιk : CPk → CPn is the canonical inclusion. On the other hand, let Ln denote the canonical line bundle over CPn and L∨ line bundle. The reduced K-theory of n its dual n 0 n n n ∨ i CP is then given by Kt (CP ) = Kt (CP ) = i=1 ([11C CPn ] − [Ln ]) Z. From PropC ∨ i osition 2.1 it follows that the K-cycle classes ([11CPn ] − [Ln ]) ∩ [CPn , 11C CPn , idCPn ] describe spacetime-filling D-branes on complex projective space and we arrive at the following result. t (CPn ) = K t (CPn ) ∼ Proposition 3.4. For n ≥ 1, K = Z⊕n is the free abelian group

with generators

0

i i (ki ) (ki ) n ∨ ⊗(i−k) n ⊗(i−k) CP , CP , (Ln ) , idCPn − (L∨ ) , id n n CP , k=0 k even

1≤i ≤n.

l=0

k=0 k odd

l=0


Remark 3.4. The decay of the brane-antibrane system provided by Proposition 3.4 into the stable D-branes described by the K-cycle classes (3.4) is rather intricate to describe. Since $T\mathbb{CP}^n \oplus \mathbb{1}^{\mathbb{C}}_{\mathbb{CP}^n} \cong (L_n^\vee)^{\oplus (n+1)}$, one has $\mathrm{td}(T\mathbb{CP}^n) = f(-c_1(L_n))^{n+1}$ where $c_1(L_n)$ is the first Chern class of the canonical line bundle $L_n \to \mathbb{CP}^n$ and

\[ f(x) = 1 + \frac{x}{2} + \sum_{k \geq 1} (-1)^{k-1}\, \frac{B_k}{(2k)!}\, x^{2k} \]

with $B_k \in \mathbb{Q}$ the $k$th Bernoulli number. This fact may be used to attempt to find the change of basis map (2.6) between these two sets of generators of $\widetilde{K}^t(\mathbb{CP}^n)$ via the Chern character and the homology of $\mathbb{CP}^n$. However, both the Chern character and the Todd class lead directly to a strictly rational-valued change of basis matrix. The obstruction to this explicit procedure is encoded in whether or not the Chern character admits an integral lift, i.e. an extension of the usual map into rational homology to a map into integer homology. With the exception of the projective line $\mathbb{CP}^1 \cong S^2$, one easily checks that such a lift is not possible on complex projective spaces. The problem of integral lifts is discussed more thoroughly in the next section. ♦

Remark 3.5. These results generalize straightforwardly to all three spin$^c$ projective spaces $\mathbb{KP}^n$, where $\mathbb{K}$ is one of the three division algebras generated by the complex numbers $\mathbb{C}$, the quaternions $\mathbb{H}$ or the Cayley numbers $\mathbb{O}$. Let $r := \dim_{\mathbb{R}}(\mathbb{K})$, so that $\mathbb{KP}^1 \cong S^r$. Then the only non-trivial reduced homology groups of $\mathbb{KP}^n$ are $H_{rk}(\mathbb{KP}^n; \mathbb{Z}) \cong \mathbb{Z}$, $1 \leq k \leq n$. The cellular structure is determined by a stratification of $\mathbb{KP}^n$ into $(rk)$-cells for $k = 0, 1, \dots, n$ described by linearly embedded subspaces, analogously to (3.3). The K-theory ring is generated by $\frac{r}{2}\,[\mathbb{1}^{\mathbb{C}}_{\mathbb{KP}^n}] - [L^\vee_{\mathbb{KP}^n}]$, with $L_{\mathbb{KP}^n} \to \mathbb{KP}^n$ the canonical complex line bundle. In all three instances one arrives at the isomorphism $\widetilde{K}^t(\mathbb{KP}^n) \cong \widetilde{H}(\mathbb{KP}^n; \mathbb{Z})$, with generators constructed as above. The equivalence between K-homology and singular homology in these cases can be understood through the appropriate spectral sequence and the sparseness of the cellular structure of $\mathbb{KP}^n$. Spectral sequences will play an important role in the investigations of subsequent sections. The real projective spaces $\mathbb{RP}^n$, based on the algebraically open field $\mathbb{K} = \mathbb{R}$, have a more intricate K-homology and will be treated in the next section. ♦
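The rationality statement of Remark 3.4 can be illustrated with exact arithmetic. The Python sketch below assumes that $f$ is the standard Todd series $x/(1 - e^{-x})$, whose expansion begins $1 + x/2 + x^2/12 - x^4/720$, and expands $f(x)^{n+1}$ truncated at order $x^n$, with $x$ standing for $-c_1(L_n)$. The function names are illustrative and not from the paper.

```python
from fractions import Fraction
from math import factorial


def todd_series(order):
    """Coefficients of x / (1 - exp(-x)) up to and including x**order."""
    # (1 - exp(-x)) / x = sum_{k >= 0} (-1)^k x^k / (k+1)!
    g = [Fraction((-1) ** k, factorial(k + 1)) for k in range(order + 1)]
    # Invert the power series g (its constant term is 1).
    inv = [Fraction(0)] * (order + 1)
    inv[0] = Fraction(1)
    for m in range(1, order + 1):
        inv[m] = -sum(g[k] * inv[m - k] for k in range(1, m + 1))
    return inv


def poly_mul(a, b, order):
    """Product of two coefficient lists, truncated above x**order."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= order:
                out[i + j] += ai * bj
    return out


def todd_class_cpn(n):
    """Coefficients of f(x)**(n+1) modulo x**(n+1), i.e. the Todd class
    of CP^n written in powers of the degree-two generator x."""
    f = todd_series(n)
    result = [Fraction(1)] + [Fraction(0)] * n
    for _ in range(n + 1):
        result = poly_mul(result, f, n)
    return result
```

For `n = 2` the coefficients are 1, 3/2, 1, so the degree-one term is not integral, whereas for `n = 1` one gets 1, 1, in line with the exception noted for the projective line.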

4. Torsion D-Branes Let us now turn to the somewhat more interesting situation in which D-branes are described by K-cycles which generally produce torsion elements in K-homology. In these cases one can encounter K-homology groups which do not coincide with the corresponding integral homology groups, and here the K-homology classification makes some genuine predictions that cannot be detected by ordinary homological methods. We will first consider the general problem of finding explicit homology cycles in the spacetime which are wrapped by D-branes. This analysis extends that of Sect. 2.3 to examine general circumstances under which a spinc homology cycle has a non-trivial lift to K-homology and hence is wrapped by a stable D-brane. Then we turn to a number of explicit examples illustrating how the K-cycle representatives of torsion charges are constructed in practice.


4.1. Stability. Since K t is a homology theory defined by means of a ring spectrum, it satisfies the wedge axiom [58]. One consequence of this fact is that we can immedit ( α Snα ) using the K-homology of the spheres, since then ately obtain the groups K t n (Sα ). Another consequence is that we can use the Atiyah-Hirzebt ( α Snα ) = α K K ruch-Whitehead (AHW) spectral sequence [58] to compute the K t -groups of CW-complexes. Let X be a connected finite CW-complex, and let X [n] denote its n-skeleton with X [0] = pt. By the Whitehead cellular approximation theorem, the inclusion ιn : X [n] → X induces an isomorphism in integral homology up to degree n − 1. Consider the AHW spectral sequence {Erp,q , dr }r ∈N; p,q∈Z for reduced K-homology satisfying t t e( (4.1) H p X ; Ke(q) (pt) =⇒ K E2p,q = p+q) X with H p (X ; Z) := H p (X, pt; Z). Convergence of the spectral sequence means that there t (X ) given by is a filtration {F p,n− p } p∈N0 of K e(n) 0 = F0,n ⊆ F1,n−1 ⊆ · · · ⊆ F p,n− p ⊆ · · · ⊆ Fn,0 ⊆ · · · ⊆ F p,q p+q=n t e(n) =K (X ) , [ p] ) → K ∼ t t where F p,q = im (ι p )∗ : K e( p+q) (X e( p+q) (X ) and F p+1,n− p−1 /F p,n− p = ∞ E p+1,n− p−1 . For each j = 1, 2, 3 there is a natural epimorphism

β jX : H j (X ; Z) = E2j,0 −→ E∞ j,0 = F j,0 /F j−1,1 . In particular, since F0,1 = 0 = F1,1 the cases j = 1, 2 yield an epimorphism t β jX : H j (X ; Z) = E2j,0 −→ E∞ j,0 = F j,0 → Ke( j) (X ) . By analysing the spectral sequence, one concludes that β jX is injective if and only if no non-zero differential dr : Erp,q → Erp−r,q+r −1 reaches Ekj,0 for k ≥ 2. Thus if the reduced singular homology H (X ; Z) is concentrated in odd (resp. even) degree, except for possibly H2 (X ; Z) (resp. H1 (X ; Z) and H3 (X ; Z)), then β1X (resp. β2X ) is injective. From these considerations one concludes that if X is a connected CW-complex of dimension ≤ 4, then there are natural short exact sequences 0 −→ H1 (X ; Z) −→ K1t (X ) −→ H3 (X ; Z) −→ 0 , 0t (X ) −→ H4 (X ; Z) −→ 0 . 0 −→ H2 (X ; Z) −→ K The latter sequence splits, yielding an isomorphism ≈ t chZ even : K0 (X ) −→ Heven (X ; Z) . t If X is simply connected, then the map chZ 3 : K1 (X ) → H3 (X ; Z) is a bijection and we thereby obtain an isomorphism ≈ t chZ • : K (X ) −→ H (X ; Z)

(4.2)


of Z2 -graded abelian groups such that the diagram t (X ) K

chZ •

/ H (X ; Z) II II II ch• III $ H (X ; Q)

(4.3)

commutes, where : H (X ; Z) → H (X ; Q) is the homomorphism induced by the inclusion of abelian groups Z → Q. (These calculations were first carried out in [42].) The isomorphism (4.2) is called an integral lift of the Chern character in K-homology. It extends the usual Chern character map between stable D-branes and non-trivial homology cycles in X to include torsion classes. For X of dimension ≤ 3, this isomorphism even exists without the assumption of simple connectivity [48]. For CW-complexes X of higher dimension, the problem of determining which homology cycles lift to stable D-branes is much more difficult, because then the analysis of the spectral sequence is not so clear cut. For instance, in general F2,1 ∼ = H1 (X ; Z) is non-trivial, thereby making this kind of analysis generically impossible. Remark 4.1. The filtration groups F p,q approximating the full K-homology group consist of D-branes [M, E, φ] in X whose worldvolumes are supported in the p-skeleton, i.e. φ(M) ⊆ X [ p] . The extension groups E∞ p,q between successive approximants consist of those D-branes in the p-skeleton which are not supported on the ( p − 1)-skeleton, i.e. φ(M) X [ p−1] , or in other words E∞ p,q consists of D( p − 1)-branes which carry no lower-dimensional brane charges. By definition, the approximations Erp,q for r ≥ 2 compute the homology of the differential dr −1 (E1p,q is the group of singular p-chains t (pt) with d1 the usual simplicial boundary homomorphism). on X with values in Ke(q) Let M ⊂ X be a p-dimensional compact spinc manifold without boundary which defines a non-trivial homology class [M] ∈ E2p,q in (4.1). If [M] extends through the spectral sequence as a non-trivial element of all homology groups Erp,q , then it can represent a non-trivial element of E∞ p,q and hence have a non-trivial lift to K-homology. In this case there exists a D-brane [M, E, φ] wrapping M on the p-skeleton of X which is stable and carries no lower brane charges. 
Conversely, suppose that [M] = dr ω for some r ∈ N and ω ∈ H (X ; Z). Then the homology class [M] can be lifted to K-homology, but the lift is trivial as it vanishes in E∞ p,q . This means that there exists a [ p] D-brane wrapping M in X with no lower brane charges, but this D-brane is unstable. Thus the AHW spectral sequence in this context keeps track of the possible obstructions for a homology cycle of H p (X ; Z), starting from (4.1), to survive to E∞ p+1,n− p−1 . Then, the solution of the K-homology extension problem required to get the filtration groups F p,q from E∞ p,q identifies the lower brane charges carried by D-branes and changes the additive structure in K-homology from that of the singular homology classes. The spectral sequence in this regard measures the possible obstructions to extending [M] non-trivially over higher-dimensional simplices of X . ♦


4.2. A-branes. Let $p, q_1, \dots, q_n$ be integers with $p \geq 1$ and $\gcd(p, q_i) = 1$ for all $i = 1, \dots, n$. There is a free $C^\infty$ action $G : \mathbb{Z}_p \times S^{2n+1} \to S^{2n+1}$ given by

\[ G\big( e^{2\pi i k/p} , (z_0, z_1, \dots, z_n) \big) = \big( e^{2\pi i k/p} z_0 , e^{2\pi i q_1 k/p} z_1 , \dots , e^{2\pi i q_n k/p} z_n \big) , \]

where we regard $\mathbb{Z}_p \subset S^1$ and $S^{2n+1} \subset \mathbb{C}^{n+1}$. The corresponding quotient space $L(p; q_1, \dots, q_n)$ is a compact connected $C^\infty$ manifold of dimension $2n + 1$ called a Lens space. For definiteness we will consider only the case $n = 1$. The corresponding D-branes are then a particular instance of topological A-model D-branes (or A-branes for short) [26, 45, 61] which are mirror duals to the B-branes described in Example 2.1 and belong to the derived Fukaya category of the spacetime [3]. The mirror manifold to the algebraic variety $X$ is taken to be the non-compact Calabi-Yau threefold which is the total space of the rank 2 complex vector bundle $(L_1^\vee)^{\otimes p} \oplus (L_1)^{\otimes (p-2)} \to \mathbb{CP}^1$. For $q = 1$ the Lens space $L(p; 1)$ may be identified with the boundary of $(L_1^\vee)^{\otimes p}$. Higher-dimensional Lens spaces are similarly identified with the boundaries of the total spaces of the line bundles $(L_n^\vee)^{\otimes p} \to \mathbb{CP}^n$.

The Lens space $L(p; q)$ is a compact connected spin three-manifold which admits a CW-complex structure with one $n$-cell for each dimension $n = 0, 1, 2, 3$ [22]. Its singular homology is given by

\[ H_0\big(L(p; q); \mathbb{Z}\big) \cong \mathbb{Z} \cong H_3\big(L(p; q); \mathbb{Z}\big) , \qquad H_1\big(L(p; q); \mathbb{Z}\big) \cong \mathbb{Z}_p , \qquad H_2\big(L(p; q); \mathbb{Z}\big) \cong 0 . \tag{4.4} \]

Since we know the singular homology of $L(p; q)$, we can work out the spectral sequence in this case and thus calculate the topological K-homology $K^t(L(p; q))$.

Proposition 4.1. $K^t_0\big(L(p; q)\big) \cong \mathbb{Z}$, $K^t_1\big(L(p; q)\big) \cong \mathbb{Z} \oplus \mathbb{Z}_p$.

Proof. There exists an AHW spectral sequence $\{E^r_{n,m}, d_r\}_{r \in \mathbb{N};\, n,m \in \mathbb{Z}}$ converging to $K^t(L(p; q))$ with

\[ E^2_{n,m} = H_n\big(L(p; q); K^t_{e(m)}(\mathrm{pt})\big) \cong \begin{cases} H_n\big(L(p; q); \mathbb{Z}\big) , & m \text{ even} \\ 0 , & m \text{ odd} \end{cases} . \]

From (4.4) it follows that the only non-zero groups are $E^2_{0,2k} \cong \mathbb{Z} \cong E^2_{3,2k}$ and $E^2_{1,2k} \cong \mathbb{Z}_p$ with $k \in \mathbb{Z}$.
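The next sequence of homology groups of the differential module is defined by

$$E^3_{n,m} := \frac{\ker\big(d^2 : E^2_{n,m} \to E^2_{n-2,m+1}\big)}{\operatorname{im}\big(d^2 : E^2_{n+2,m-1} \to E^2_{n,m}\big)}.$$

If $m$ is odd, $n \geq 4$ or $n < 0$, then $E^2_{n,m} = 0$ so that $\ker d^2 = 0 = \operatorname{im} d^2$ and hence $E^3_{n,m} = 0$. For the remaining cases with $m = 2k$ and $n = 0, 1, 2, 3$, the pertinent part of the differential bicomplex is of the form

$$\cdots \xrightarrow{\;d^2\;} E^2_{n+2,2k-1} \xrightarrow{\;d^2\;} E^2_{n,2k} \xrightarrow{\;d^2\;} E^2_{n-2,2k+1} \xrightarrow{\;d^2\;} \cdots, \qquad E^2_{n+2,2k-1} = 0, \quad E^2_{n,2k} = H_n(L(p;q);\mathbb{Z}), \quad E^2_{n-2,2k+1} = 0,$$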

Geometric K-Homology of Flat D-Branes


implying that $\operatorname{im} d^2 = 0$ and hence $E^3_{n,2k} = \ker d^2 \cong H_n(L(p;q);\mathbb{Z})$. By induction we conclude from this data that $E^r_{0,2k} \cong \mathbb{Z} \cong E^r_{3,2k}$ and $E^r_{1,2k} \cong \mathbb{Z}_p$ for every $r \geq 2$, with all other homology groups vanishing. We therefore have

$$E^\infty_{n,m} = \varinjlim_{r} E^r_{n,m} \cong \begin{cases} H_n(L(p;q);\mathbb{Z}) = \mathbb{Z}, & m \text{ even},\ n = 0, 3 \\ H_1(L(p;q);\mathbb{Z}) = \mathbb{Z}_p, & m \text{ even},\ n = 1 \\ H_2(L(p;q);\mathbb{Z}) = 0, & \text{otherwise} \end{cases}.$$

For each $l \in \mathbb{Z}$ let $F_{0,l} = E^\infty_{0,l}$. Then solving the extension problems

$$0 \to F_{n-1,1-n} \to F_{n,-n} \to E^\infty_{n,-n} \to 0, \qquad 0 \to F_{n-1,2-n} \to F_{n,1-n} \to E^\infty_{n,1-n} \to 0$$

for every $n \in \mathbb{N}$ will produce groups $F_{n,-n}$ and $F_{n,1-n}$ such that $\{F_{n,-n}\}_{n \in \mathbb{N}_0}$ (resp. $\{F_{n,1-n}\}_{n \in \mathbb{N}_0}$) is a filtration of $K_0^t(L(p;q))$ (resp. $K_1^t(L(p;q))$). Starting from the data above, it is straightforward to compute $F_{n,-n} = \mathbb{Z}$ for all $n \in \mathbb{N}_0$, and hence $K_0^t(L(p;q)) = \mathbb{Z}$. Furthermore, one finds $F_{0,1} \cong 0$, $F_{1,0} \cong \mathbb{Z}_p \cong F_{2,-1}$ and $F_{n,1-n} \cong \mathbb{Z} \oplus \mathbb{Z}_p$ for all $n \geq 3$, so that $K_1^t(L(p;q)) = \mathbb{Z} \oplus \mathbb{Z}_p$.

Let us now work out D-brane representatives for the K-homology groups of Proposition 4.1. Since $\widetilde{K}{}^t_0(L(p;q)) = 0$, it is immediate that $[\mathrm{pt}, 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota]$ is the generator of $K_0^t(L(p;q)) = \mathbb{Z}$. Furthermore, since $L(p;q)$ is an odd-dimensional spin manifold, it is $K_t$-orientable and so it has a fundamental class $[L(p;q), 1\!\!1^{\mathbb{C}}_{L(p;q)}, \mathrm{id}_{L(p;q)}]$ which is the free generator of $K_1^t(L(p;q)) = \mathbb{Z} \oplus \mathbb{Z}_p$. If we take $q = 1$ and let $L_1$ denote as before the canonical line bundle over $\mathbb{CP}^1$, then we can identify the sphere bundle of $L_1^{\otimes p}$ with the Lens space $L(p; 1)$. In this case, from the K-theory of $L(p; 1)$ [41] we can identify the torsion generator of $K_1^t(L(p; 1))$ with the K-cycle class

$$\big[L(p;1), 1\!\!1^{\mathbb{C}}_{L(p;1)}, \mathrm{id}_{L(p;1)}\big] - \big[L(p;1), \pi^*(L_1^\vee), \mathrm{id}_{L(p;1)}\big], \tag{4.5}$$

where $\pi : S(L_1^{\otimes p}) \to \mathbb{CP}^1$ is the bundle projection. To describe the decay of the spacetime-filling brane-antibrane pair (4.5) into stable D-branes, we note that $H_1(L(p;q);\mathbb{Z}) \cong \mathbb{Z}_p \cong \pi_1(L(p;q))$ and that the Hurewicz homomorphism $\rho_1 : \pi_1(L(p;q)) \to H_1(L(p;q);\mathbb{Z})$ given by $\rho_1[f] = f_*[S^1]$ is a bijection. In addition, the Hurewicz homomorphism in K-homology $h_1 : \pi_1(L(p;q)) \to K_1^t(L(p;q))$ is given by $h_1[f] = f_*[S^1, 1\!\!1^{\mathbb{C}}_{S^1}, \mathrm{id}_{S^1}] = [S^1, 1\!\!1^{\mathbb{C}}_{S^1}, f]$. Since $L(p;q)$ is a compact three-dimensional manifold, the homological Chern character admits an integral lift (4.2) fitting into the commutative diagram (4.3) for $X = L(p;q)$. Furthermore, $\mathrm{ch}^{\mathbb{Z}}_{\rm odd} := \mathrm{ch}^{\mathbb{Z}}_1 \oplus \mathrm{ch}^{\mathbb{Z}}_3 : K_1^t(L(p;q)) \to H_1(L(p;q);\mathbb{Z}) \oplus H_3(L(p;q);\mathbb{Z})$ is an isomorphism. In particular, $\mathrm{ch}^{\mathbb{Z}}_1 : \operatorname{Tor}_{K_1^t(L(p;q))} \to H_1(L(p;q);\mathbb{Z})$ is an isomorphism. Its inverse is given by the isomorphism $\beta_1 : H_1(L(p;q);\mathbb{Z}) \to \operatorname{Tor}_{K_1^t(L(p;q))}$ which fits into the commutative diagram [48]

$$\begin{array}{ccc} \pi_1(L(p;q)) & \xrightarrow{\;h_1\;} & \operatorname{Tor}_{K_1^t(L(p;q))} \\ {\scriptstyle \rho_1} \big\downarrow & \nearrow {\scriptstyle \beta_1} & \\ H_1(L(p;q);\mathbb{Z}) & & \end{array}.$$


It follows that $h_1 : \pi_1(L(p;q)) \to \operatorname{Tor}_{K_1^t(L(p;q))}$ is an isomorphism. The generator of the fundamental group $[f] \in \pi_1(L(p;q)) = \mathbb{Z}_p$ may be taken to be any loop obtained by projecting a path on the universal cover $S^3 \to L(p;q)$ connecting two points on $S^3$ that are related by the $\mathbb{Z}_p$-action defining the Lens space. Then $[S^1, 1\!\!1^{\mathbb{C}}_{S^1}, f]$ is the torsion generator of $K_1^t(L(p;q))$. For $q = 1$ it coincides with the generator (4.5).

Remark 4.2. Examining the proof of Proposition 4.1, we see that this construction of the stable D-brane states in $L(p;q)$ follows from the form of the homology groups $E^\infty_{n,m}$ of the differential module in the AHW spectral sequence. In the present example, all homology cycles have non-trivial lifts to K-homology and are thus wrapped by stable states of D-branes. ♦

4.3. Projective D-branes. We will now complete the calculation initiated in Sect. 3.3 by exhibiting the D-branes in the fourth and final real projective spaces $\mathbb{RP}^m$, which arise in certain orbifold spacetimes of string theory [33]. They can be realized as the quotient of the $m$-sphere $S^m$ by the antipodal map. Let $q_m : S^m \to \mathbb{RP}^m$ be the quotient map. With the exception of the projective line $\mathbb{RP}^1 \cong S^1$, the corresponding K-homology groups contain torsion subgroups. Analogously to (3.3), the CW-complex structure of $\mathbb{RP}^m$ may be given by the stratification provided by linearly embedded subspaces, so that its set of $k$-cells consists of the single element $\mathbb{RP}^k$ for $k = 0, 1, \dots, m$ with $\mathbb{RP}^k/\mathbb{RP}^{k-1} \cong S^k$. The singular homology of $\mathbb{RP}^m$ is given by

$$H_0(\mathbb{RP}^{2n+1};\mathbb{Z}) \cong \mathbb{Z} \cong H_m(\mathbb{RP}^m;\mathbb{Z}), \qquad H_{m-2i}(\mathbb{RP}^m;\mathbb{Z}) \cong \mathbb{Z}_2, \quad i = 1, \dots, \tfrac{m}{2}.$$

Let $L_{\mathbb{RP}^m} = S^m \times \mathbb{R}/\mathbb{Z}_2$ be the canonical flat line bundle over $\mathbb{RP}^m$, and $\iota_k : \mathbb{RP}^k \to \mathbb{RP}^m$ the inclusion of the $k$-cell.

4.3.1. $\mathbb{RP}^{2n+1}$. We begin with the odd-dimensional cases $m = 2n + 1$. In this instance $\mathbb{RP}^{2n+1}$ is a spin$^c$ manifold. Thus it is $K_t$-oriented and we can apply Poincaré duality to compute its topological K-homology $K^t(\mathbb{RP}^{2n+1})$ from its known K-theory $K_t(\mathbb{RP}^{2n+1})$ [41].

Proposition 4.2. $K_0^t(\mathbb{RP}^{2n+1}) \cong \mathbb{Z}$, $K_1^t(\mathbb{RP}^{2n+1}) \cong \mathbb{Z} \oplus \mathbb{Z}_{2^n}$.

Applying Proposition 2.1 to the example at hand, one finds that the generating D-brane of $K_0^t(\mathbb{RP}^{2n+1})$ is $[\mathrm{pt}, 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota]$, while the spacetime-filling D-brane $[\mathbb{RP}^{2n+1}, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n+1}}, \mathrm{id}_{\mathbb{RP}^{2n+1}}]$ is the free generator of $K_1^t(\mathbb{RP}^{2n+1})$. The torsion generator of $K_1^t(\mathbb{RP}^{2n+1})$ is the spacetime-filling brane-antibrane pair

$$\big[\mathbb{RP}^{2n+1}, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n+1}}, \mathrm{id}_{\mathbb{RP}^{2n+1}}\big] - \big[\mathbb{RP}^{2n+1}, L_{\mathbb{RP}^{2n+1}} \otimes \mathbb{C}, \mathrm{id}_{\mathbb{RP}^{2n+1}}\big]. \tag{4.6}$$

As in the examples of Lens spaces, the decay products of the brane-antibrane system (4.6) cannot be determined through Theorem 2.1 due to the torsion. The difference between K-homology and singular homology here can be understood by appealing to the AHW spectral sequence. After some calculation one finds the filtration groups $F_{2n-3,4-2n} \cong \mathbb{Z}_{2^{n-1}}$ and $F_{2n-1,2-2n} \cong \mathbb{Z}_{2^n}$ [16, 58], which thereby alter the additive structure in $H_{\rm odd}(\mathbb{RP}^{2n+1};\mathbb{Z})$. The lift of the generator $[\mathbb{RP}^{2n-1}] \in H_{2n-1}(\mathbb{RP}^{2n+1};\mathbb{Z}) \cong$


$\mathbb{Z}_2$ to K-homology is the stable D-brane $\omega_0 = [\mathbb{RP}^{2n-1}, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n-1}}, \iota_{2n-1}] \in K_1^t(\mathbb{RP}^{2n+1})$. While $[\mathbb{RP}^{2n-1}]$ is of order 2 in $H_{\rm odd}(\mathbb{RP}^{2n+1};\mathbb{Z})$, $\omega_0$ is of order $2^n$ in $K_1^t(\mathbb{RP}^{2n+1})$ and is thus equal to (4.6). For every $k = 0, 1, \dots, n$, $\omega_k := 2^k \omega_0 = [\mathbb{RP}^{2n-2k-1}, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n-2k-1}}, \iota_{2n-2k-1}]$ (with $\omega_n := 0$) corresponds to the order 2 generator $[\mathbb{RP}^{2n-2k-1}] \in H_{2n-2k-1}(\mathbb{RP}^{2n+1};\mathbb{Z}) \cong \mathbb{Z}_2$. These associations illustrate that an integral lift $\mathrm{ch}^{\mathbb{Z}}_{\rm odd}$ of the Chern character, along the lines described in Sect. 4.1, does not generally exist in the present class of examples.

Remark 4.3. This example furnishes a nice illustration of D-brane decay [16]. Placing together $2^k$ D$(2n-2)$-branes wrapping $\mathbb{RP}^{2n-1} \subset \mathbb{RP}^{2n+1}$ creates an unstable state that decays into a D$(2n-2k-2)$-brane wrapping $\mathbb{RP}^{2n-2k-1} \subset \mathbb{RP}^{2n-1}$, due to the triviality of the singular homology classes $2^k [\mathbb{RP}^{2n-2k-1}]$ in $H_{\rm odd}(\mathbb{RP}^{2n+1};\mathbb{Z})$. Similarly, stacks of $2^j$ of these D$(2n-2k-2)$-branes for $1 \leq j < n-k$ decay into a D$(2n-2k-2j-2)$-brane, and so on. ♦

Remark 4.4. For $n = 1$ this construction of the torsion generator of $K_1^t(\mathbb{RP}^3)$ coincides with the construction which is completely analogous to that used in Sect. 4.2 to construct the torsion D0-brane wrapping $S^1 \cong \mathbb{RP}^1$ on the Lens spaces $L(p;q)$. ♦

4.3.2. $\mathbb{RP}^{2n}$. The even-dimensional real projective spaces $\mathbb{RP}^{2n}$ are more difficult to deal with because they are not orientable. In particular, they are not spin$^c$ and so most of the techniques used thus far cannot be applied to this case. In fact, this space provides an exotic example whereby not only does the K-homology differ from singular homology, but also where Poincaré duality breaks down and the K-homology differs from the dual K-theory which in the present case is given by $K_t^0(\mathbb{RP}^{2n}) \cong \mathbb{Z} \oplus \mathbb{Z}_{2^{n-1}}$, $K_t^1(\mathbb{RP}^{2n}) \cong 0$.

Proposition 4.3. $K_0^t(\mathbb{RP}^{2n}) \cong \mathbb{Z}$, $K_1^t(\mathbb{RP}^{2n}) \cong \mathbb{Z}_{2^{n-1}}$.

Proof. A simple application of the AHW spectral sequence shows that $\widetilde{K}{}^t_0(\mathbb{RP}^{2n}) = 0$. Since $H_{\rm odd}(\mathbb{RP}^{2n};\mathbb{Z}) = 0$, via the Chern character we conclude that $K_1^t(\mathbb{RP}^{2n})$ has no free part. Finally, by applying the universal coefficient theorem of Sect. 1.10 to $X = \mathbb{RP}^{2n}$ one concludes that $\operatorname{tor}_{K_1^t(\mathbb{RP}^{2n})} \cong \operatorname{tor}_{K_t^0(\mathbb{RP}^{2n})} \cong \mathbb{Z}_{2^{n-1}}$.

As always, the generator of $K_0^t(\mathbb{RP}^{2n}) \cong \mathbb{Z}$ is the D-instanton $[\mathrm{pt}, 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota]$. The remaining torsion generators of $K_1^t(\mathbb{RP}^{2n}) \cong \mathbb{Z}_{2^{n-1}}$ are more difficult to find. They may be constructed as follows. By excision, the quotient map $p_{2n} : (\mathbb{RP}^{2n}, \mathbb{RP}^{2n-1}) \to (\mathbb{RP}^{2n}/\mathbb{RP}^{2n-1}, \mathrm{pt})$ induces an isomorphism $(p_{2n})_* : K^t(\mathbb{RP}^{2n}, \mathbb{RP}^{2n-1}) \to K^t(\mathbb{RP}^{2n}/\mathbb{RP}^{2n-1}, \mathrm{pt})$ giving

$$K^t(\mathbb{RP}^{2n}, \mathbb{RP}^{2n-1}) \cong K^t(\mathbb{RP}^{2n}/\mathbb{RP}^{2n-1}, \mathrm{pt}) \cong \widetilde{K}{}^t(\mathbb{RP}^{2n}/\mathbb{RP}^{2n-1}) \cong \widetilde{K}{}^t(S^{2n})$$

and one concludes that

$$K_0^t(\mathbb{RP}^{2n}, \mathbb{RP}^{2n-1}) \cong \mathbb{Z}, \qquad K_1^t(\mathbb{RP}^{2n}, \mathbb{RP}^{2n-1}) \cong 0.$$

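The six-term exact sequence associated to the pair $(\mathbb{RP}^{2n}, \mathbb{RP}^{2n-1})$ is given by

$$\begin{array}{ccccc}
K_0^t(\mathbb{RP}^{2n-1}) & \xrightarrow{\;(\iota_{2n-1})_*\;} & K_0^t(\mathbb{RP}^{2n}) & \xrightarrow{\;\varsigma_*\;} & K_0^t(\mathbb{RP}^{2n}, \mathbb{RP}^{2n-1}) \\
{\scriptstyle \partial} \big\uparrow & & & & \big\downarrow {\scriptstyle \partial} \\
K_1^t(\mathbb{RP}^{2n}, \mathbb{RP}^{2n-1}) & \xleftarrow{\;\varsigma_*\;} & K_1^t(\mathbb{RP}^{2n}) & \xleftarrow{\;(\iota_{2n-1})_*\;} & K_1^t(\mathbb{RP}^{2n-1})
\end{array} \tag{4.7}$$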


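The homomorphism $(\iota_{2n-1})_* : K_0^t(\mathbb{RP}^{2n-1}) \to K_0^t(\mathbb{RP}^{2n})$ is induced by the inclusion of the $(2n-1)$-skeleton in $\mathbb{RP}^{2n}$. Since both groups $K_0^t(\mathbb{RP}^{2n-1}) = K_0^t(\mathbb{RP}^{2n}) = \mathbb{Z}$ are generated by $[\mathrm{pt}, 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota]$, it follows that $(\iota_{2n-1})_*[\mathrm{pt}, 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota] = [\mathrm{pt}, 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota]$ and hence $(\iota_{2n-1})_*$ is an isomorphism. Combining this fact with (4.7), we conclude that the six-term exact sequence truncates to the short exact sequence given by

$$0 \longrightarrow K_0^t(\mathbb{RP}^{2n}, \mathbb{RP}^{2n-1}) \xrightarrow{\;\partial\;} K_1^t(\mathbb{RP}^{2n-1}) \xrightarrow{\;(\iota_{2n-1})_*\;} K_1^t(\mathbb{RP}^{2n}) \longrightarrow 0.$$

It finally follows that

$$K_1^t(\mathbb{RP}^{2n}) \cong K_1^t(\mathbb{RP}^{2n-1}) / \operatorname{im} \partial \cong \big(\mathbb{Z} \oplus \mathbb{Z}_{2^{n-1}}\big) / \operatorname{im} \partial. \tag{4.8}$$

Comparing with Proposition 4.3 we conclude that the connecting homomorphism has range $\operatorname{im} \partial \cong \mathbb{Z}$. The torsion D-branes in this case are thus supported in the $(2n-1)$-skeleton which is a spin$^c$ submanifold of $\mathbb{RP}^{2n}$. Their explicit K-cycle representatives, along with the pertinent decay products, can be constructed exactly as in our previous example above. Note that there are no spacetime filling branes in $\mathbb{RP}^{2n}$.

Remark 4.5. To understand the geometrical meaning of the quotient in (4.8), consider the commutative diagram

$$\begin{array}{ccc}
 & & (S^{2n}, \mathrm{pt}) \\
 & \overset{p'_{2n}}{\nearrow} & \\
(B^{2n}, S^{2n-1}) & & \Big\uparrow {\scriptstyle p_{2n}} \\
 & \underset{f_{2n}}{\searrow} & \\
 & & (\mathbb{RP}^{2n}, \mathbb{RP}^{2n-1})
\end{array}$$

where $f_{2n}$ is the characteristic map of the $2n$-cell. This induces the commutative diagram in K-homology given by

$$\begin{array}{ccc}
 & & \widetilde{K}{}^t_0(S^{2n}) \\
 & \overset{(p'_{2n})_*}{\nearrow} & \\
K_0^t(B^{2n}, S^{2n-1}) & & \Big\uparrow {\scriptstyle (p_{2n})_*} \\
 & \underset{(f_{2n})_*}{\searrow} & \\
 & & K_0^t(\mathbb{RP}^{2n}, \mathbb{RP}^{2n-1})
\end{array}$$

where the induced maps $(f_{2n})_*$ and $(p'_{2n})_*$ are isomorphisms. It follows that $[B^{2n}, 1\!\!1^{\mathbb{C}}_{B^{2n}}, f_{2n}]$ is the generator of $K_0^t(\mathbb{RP}^{2n}, \mathbb{RP}^{2n-1}) \cong \mathbb{Z}$ with $\partial [B^{2n}, 1\!\!1^{\mathbb{C}}_{B^{2n}}, f_{2n}] = [S^{2n-1}, 1\!\!1^{\mathbb{C}}_{S^{2n-1}}, q_{2n-1}] = h_{2n-1}[q_{2n-1}]$, where $h_{2n-1} : \pi_{2n-1}(\mathbb{RP}^{2n-1}) \to K_1^t(\mathbb{RP}^{2n-1})$ is the Hurewicz homomorphism in K-homology and the quotient map $q_{2n-1} : S^{2n-1} \to \mathbb{RP}^{2n-1}$ is the generator $[q_{2n-1}] \in \pi_{2n-1}(\mathbb{RP}^{2n-1}) \cong \mathbb{Z}$. We thus conclude that $\operatorname{im} \partial = \operatorname{im} h_{2n-1}$, and hence the quotient by the image of the boundary homomorphism in (4.8) projects out the integrally charged D-brane $[S^{2n-1}, 1\!\!1^{\mathbb{C}}_{S^{2n-1}}, q_{2n-1}]$ which fills the entire $(2n-1)$-cell of $\mathbb{RP}^{2n}$. ♦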


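4.3.3. $\mathbb{RP}^{2n+1} \times \mathbb{RP}^{2k+1}$. When dealing with torsion K-homology groups, the structure of D-branes on product manifolds becomes an interesting problem. Let us first consider the representative example $\mathbb{RP}^{2n+1} \times \mathbb{RP}^{2k+1}$ wherein the factors each support torsion D-branes.

Proposition 4.4. $K_0^t(\mathbb{RP}^{2n+1} \times \mathbb{RP}^{2k+1}) \cong \mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z}_{2^k} \oplus \mathbb{Z}_{2^n} \oplus \mathbb{Z}_{2^p} \cong K_1^t(\mathbb{RP}^{2n+1} \times \mathbb{RP}^{2k+1})$, where $p = \gcd(n, k)$.

Proof. We apply the Künneth theorem of Sect. 1.6 (Theorem 1.1). The torsion extension for the $K_0^t$-group is given by

$$\bigoplus_{i+j=1} \operatorname{Tor}\big(K_i^t(\mathbb{RP}^{2n+1}), K_j^t(\mathbb{RP}^{2k+1})\big) = \operatorname{Tor}\big(\mathbb{Z}, \mathbb{Z} \oplus \mathbb{Z}_{2^k}\big) \oplus \operatorname{Tor}\big(\mathbb{Z} \oplus \mathbb{Z}_{2^n}, \mathbb{Z}\big) = 0,$$

and so there is an isomorphism

$$K_0^t(\mathbb{RP}^{2n+1} \times \mathbb{RP}^{2k+1}) = \big(K_0^t(\mathbb{RP}^{2n+1}) \otimes K_0^t(\mathbb{RP}^{2k+1})\big) \oplus \big(K_1^t(\mathbb{RP}^{2n+1}) \otimes K_1^t(\mathbb{RP}^{2k+1})\big) = \mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z}_{2^k} \oplus \mathbb{Z}_{2^n} \oplus \mathbb{Z}_{2^p}. \tag{4.9}$$

On the other hand, the torsion extension for the $K_1^t$-group is $\mathbb{Z}_{2^p}$. Again the short exact sequence of Theorem 1.1 for the present space splits and we find the same isomorphism as in (4.9) for the K-homology group $K_1^t(\mathbb{RP}^{2n+1} \times \mathbb{RP}^{2k+1})$.

The generating D-branes are straightforward to work out as before. Writing $M = \mathbb{RP}^{2n+1} \times \mathbb{RP}^{2k+1}$ for brevity, for the various subgroups of $K_0^t(M)$ given by Proposition 4.4 one finds the generators

$$\begin{aligned}
\mathbb{Z} \oplus \mathbb{Z} &: \ \big[\mathrm{pt}, 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota\big], \ \big[M, 1\!\!1^{\mathbb{C}}_{M}, \mathrm{id}_{M}\big], \\
\mathbb{Z}_{2^n} &: \ \big[M, (L_{\mathbb{RP}^{2n+1}} \otimes \mathbb{C}) \boxtimes 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2k+1}}, \mathrm{id}_{M}\big], \\
\mathbb{Z}_{2^k} &: \ \big[M, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n+1}} \boxtimes (L_{\mathbb{RP}^{2k+1}} \otimes \mathbb{C}), \mathrm{id}_{M}\big], \\
\mathbb{Z}_{2^p} &: \ \big[M, \big((L_{\mathbb{RP}^{2n+1}} \otimes \mathbb{C}) \boxtimes (L_{\mathbb{RP}^{2k+1}} \otimes \mathbb{C})\big) \oplus 1\!\!1^{\mathbb{C}}_{M}, \mathrm{id}_{M}\big] \\
&\quad - \big[M, \big((L_{\mathbb{RP}^{2n+1}} \otimes \mathbb{C}) \boxtimes 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2k+1}}\big) \oplus \big(1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n+1}} \boxtimes (L_{\mathbb{RP}^{2k+1}} \otimes \mathbb{C})\big), \mathrm{id}_{M}\big].
\end{aligned}$$

The $2^n$-torsion and $2^k$-torsion charges come from stable spacetime-filling D-branes on $\mathbb{RP}^{2n+1} \times \mathbb{RP}^{2k+1}$. Since they carry non-trivial line bundles on their worldvolumes, they can be decomposed into lower-dimensional D-branes carrying trivial line bundles along the lines of Sect. 2.4. The precise nature of the constituent D-branes can again be deduced upon careful examination of the AHW spectral sequence. The decomposition of the $2^p$-torsion spacetime-filling brane-antibrane pairs is analogous to that of the $\mathbb{RP}^{2n+1}$ example studied earlier. For the first four subgroups of $K_1^t(\mathbb{RP}^{2n+1} \times \mathbb{RP}^{2k+1})$ one finds the generators:

$$\begin{aligned}
\mathbb{Z} \oplus \mathbb{Z} &: \ \big[\mathbb{RP}^{2n+1}, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n+1}}, \iota_{2n+1}\big], \ \big[\mathbb{RP}^{2k+1}, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2k+1}}, \iota_{2k+1}\big], \\
\mathbb{Z}_{2^n} &: \ \big[\mathbb{RP}^{2n+1}, L_{\mathbb{RP}^{2n+1}} \otimes \mathbb{C}, \iota_{2n+1}\big] - \big[\mathbb{RP}^{2n+1}, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n+1}}, \iota_{2n+1}\big], \\
\mathbb{Z}_{2^k} &: \ \big[\mathbb{RP}^{2k+1}, L_{\mathbb{RP}^{2k+1}} \otimes \mathbb{C}, \iota_{2k+1}\big] - \big[\mathbb{RP}^{2k+1}, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2k+1}}, \iota_{2k+1}\big].
\end{aligned}$$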


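The $2^p$-torsion class is more difficult to determine in this case because it arises from the torsion extension in the Künneth formula. It can be found by again comparing to singular homology and identifying it as the lift of the remaining cycles in $H_{\rm odd}(\mathbb{RP}^{2n+1} \times \mathbb{RP}^{2k+1};\mathbb{Z})$ after the other decay products have been determined from the AHW spectral sequence along the lines of the $\mathbb{RP}^{2n+1}$ example above [16].

4.3.4. $\mathbb{RP}^{2n+1} \times S^k$. For our final example of projective D-branes we consider a product spacetime in which one factor carries only torsion-free D-branes. For the representative example $\mathbb{RP}^{2n+1} \times S^k$ we proceed exactly as in the previous case.

Proposition 4.5.

$$K_0^t(\mathbb{RP}^{2n+1} \times S^k) \cong \begin{cases} \mathbb{Z} \oplus \mathbb{Z}, & k \text{ even} \\ \mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z}_{2^n}, & k \text{ odd} \end{cases}, \qquad K_1^t(\mathbb{RP}^{2n+1} \times S^k) \cong \begin{cases} \mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z}_{2^n} \oplus \mathbb{Z}_{2^n}, & k \text{ even} \\ \mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z}_{2^n}, & k \text{ odd} \end{cases}.$$

The generators of the $K_0^t$-groups are given by

$$\begin{aligned}
k \text{ even} &: \ \big[\mathrm{pt}, 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota\big], \ \big[S^k, 1\!\!1^{\mathbb{C}}_{S^k}, \iota_{S^k}\big]; \\
k \text{ odd} &: \ \big[\mathrm{pt}, 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota\big], \ \big[\mathbb{RP}^{2n+1} \times S^k, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n+1} \times S^k}, \mathrm{id}_{\mathbb{RP}^{2n+1} \times S^k}\big], \\
&\quad \big[\mathbb{RP}^{2n+1} \times S^k, (L_{\mathbb{RP}^{2n+1}} \otimes \mathbb{C}) \boxtimes 1\!\!1^{\mathbb{C}}_{S^k}, \mathrm{id}_{\mathbb{RP}^{2n+1} \times S^k}\big] - \big[\mathbb{RP}^{2n+1} \times S^k, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n+1} \times S^k}, \mathrm{id}_{\mathbb{RP}^{2n+1} \times S^k}\big],
\end{aligned}$$

while the $K_1^t$-groups are generated by the D-branes

$$\begin{aligned}
k \text{ even} &: \ \big[\mathbb{RP}^{2n+1}, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n+1}}, \iota_{2n+1}\big], \ \big[\mathbb{RP}^{2n+1} \times S^k, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n+1} \times S^k}, \mathrm{id}_{\mathbb{RP}^{2n+1} \times S^k}\big], \\
&\quad \big[\mathbb{RP}^{2n+1}, L_{\mathbb{RP}^{2n+1}} \otimes \mathbb{C}, \iota_{2n+1}\big] - \big[\mathbb{RP}^{2n+1}, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n+1}}, \iota_{2n+1}\big], \\
&\quad \big[\mathbb{RP}^{2n+1} \times S^k, (L_{\mathbb{RP}^{2n+1}} \otimes \mathbb{C}) \boxtimes 1\!\!1^{\mathbb{C}}_{S^k}, \mathrm{id}_{\mathbb{RP}^{2n+1} \times S^k}\big] - \big[\mathbb{RP}^{2n+1} \times S^k, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n+1} \times S^k}, \mathrm{id}_{\mathbb{RP}^{2n+1} \times S^k}\big]; \\
k \text{ odd} &: \ \big[\mathbb{RP}^{2n+1}, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n+1}}, \iota_{2n+1}\big], \ \big[S^k, 1\!\!1^{\mathbb{C}}_{S^k}, \iota_{S^k}\big], \\
&\quad \big[\mathbb{RP}^{2n+1}, L_{\mathbb{RP}^{2n+1}} \otimes \mathbb{C}, \iota_{2n+1}\big] - \big[\mathbb{RP}^{2n+1}, 1\!\!1^{\mathbb{C}}_{\mathbb{RP}^{2n+1}}, \iota_{2n+1}\big].
\end{aligned}$$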

4.4. D-branes on Calabi-Yau spaces. We conclude this section by further indicating how our analysis of torsion D-branes can be applied to the topological A-model and B-model examples studied earlier, and ultimately to a K-cycle description of mirror symmetry [3]. For definiteness, let us work with the Fermat quintic threefold $Y$ defined by

$$Y = \Big\{ (z_1, \dots, z_5) \in \mathbb{CP}^4 \ \Big|\ \sum_{i=1}^{5} z_i^5 = 0 \Big\}.$$

This is a simply-connected complex projective algebraic variety of real dimension 6. It is therefore a spin$^c$ manifold and satisfies Poincaré duality. Let $H_Y$ be the hyperplane line bundle on $Y$, i.e. the restriction to $Y$ of the line bundle which is associated with any hyperplane $H$ in $\mathbb{CP}^4$. Let $D$ be the corresponding hyperplane divisor whose zero set in $\mathbb{CP}^4$ is precisely the original hyperplane $H$. Let $C \cong \mathbb{CP}^1$ be a degree 1 rational curve on $Y$. Then $K_t^0(Y) \cong \mathbb{Z}^{\oplus 4}$ [17, 19] and by Poincaré duality we have $K_0^t(Y) \cong \mathbb{Z}^{\oplus 4}$. From the known K-theory generators we can thus identify the generating A-branes,

$$\begin{gathered}
\big[\mathrm{pt}, 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota\big], \qquad \big[Y, 1\!\!1^{\mathbb{C}}_{Y}, \mathrm{id}_Y\big], \\
\big[Y, H_Y, \mathrm{id}_Y\big] - \big[Y, 1\!\!1^{\mathbb{C}}_{Y}, \mathrm{id}_Y\big] = \big[D, 1\!\!1^{\mathbb{C}}_{D}, \iota_D\big], \qquad \big[Y, (\iota_C)_![1\!\!1^{\mathbb{C}}_{C}], \mathrm{id}_Y\big] = \big[C, 1\!\!1^{\mathbb{C}}_{C}, \iota_C\big].
\end{gathered}$$


In addition, one has $K_t^1(Y) \cong \mathbb{Z}^{\oplus 204}$ so that $K_1^t(Y) \cong \mathbb{Z}^{\oplus 204}$. The corresponding D-branes wrap the 204 independent three-cycles of $H_3(Y;\mathbb{Z})$ and are constructed using Theorem 2.1. As expected, the A-branes all wrap lagrangian submanifolds with flat line bundles.

The corresponding B-branes live in the multiply-connected non-singular Calabi-Yau threefold $X$ obtained by quotienting $Y$ by the $\mathbb{Z}_5$-action generated by $z_i \mapsto \zeta^{i-1} z_i$, $i = 1, \dots, 5$, where $\zeta^5 = 1$ [3, 32, 62]. This is also a complex projective algebraic variety of real dimension 6, and hence a spin$^c$ manifold. Let $H_X$ be the hyperplane line bundle restricted to $X$, and let $L_X \to X$ be the flat line bundle $L_X = Y \times \mathbb{C}/\mathbb{Z}_5$ with respect to the $\mathbb{Z}_5$-action above. Then by Poincaré duality one has $K_0^t(X) \cong K_t^0(X) \cong \mathbb{Z}^{\oplus 4} \oplus \mathbb{Z}_5$. Using the known K-theory generators [19] we may write down the D-branes corresponding to the free part as

$$\begin{gathered}
\big[\mathrm{pt}, 1\!\!1^{\mathbb{C}}_{\mathrm{pt}}, \iota\big], \\
\big[X, H_X, \mathrm{id}_X\big] - \big[X, 1\!\!1^{\mathbb{C}}_{X}, \mathrm{id}_X\big], \\
\big[X, (H_X \otimes H_X) \oplus 1\!\!1^{\mathbb{C}}_{X}, \mathrm{id}_X\big] - \big[X, H_X \oplus H_X, \mathrm{id}_X\big], \\
\big[X, (H_X)^{\otimes 3} \oplus (H_X)^{\oplus 3}, \mathrm{id}_X\big] - \big[X, (H_X \otimes H_X)^{\oplus 3} \oplus 1\!\!1^{\mathbb{C}}_{X}, \mathrm{id}_X\big],
\end{gathered}$$

while the torsion generator is the spacetime-filling brane-antibrane pair $[X, L_X, \mathrm{id}_X] - [X, 1\!\!1^{\mathbb{C}}_{X}, \mathrm{id}_X]$. This brane-antibrane system decays into a stable torsion D4-brane at the Gepner point of the given Calabi-Yau moduli space [19]. Furthermore, one has $K_1^t(X) \cong K_t^1(X) \cong \mathbb{Z}^{\oplus 44} \oplus \mathbb{Z}_5$, with the free part generated by lifting the 44 three-cycles in $H_3(X;\mathbb{Z})$ and the torsion generated by applying the Hurewicz homomorphism to the generator of the fundamental group $\pi_1(X) \cong \mathbb{Z}_5$.

5. Flux Stabilization of D-Branes

In this final section we shall consider D-branes which live on the total space $X$ of a fibration

$$\begin{array}{ccc} F & \longrightarrow & X \\ & & \big\downarrow \\ & & B \end{array}$$

where we assume that the base space $B$ is a simply connected finite CW-complex (we can consider $B$ to be only path-connected, which then requires the use of local coefficients). There are many such situations in which one is interested in the classification of D-branes in $X$. In fact, many of the examples we have considered previously fall into this category. For instance, both the Lens spaces $L(p; q_1, \dots, q_n)$ and the real projective spaces $\mathbb{RP}^{2n+1}$ are circle bundles over $\mathbb{CP}^n$. Furthermore, the very application of vector bundle modification identifies those D-branes such that one is a spherical fibration over the other. Both our descriptions of “M-theory” definitions of unstable 9-branes and T-duality also have natural extensions to torus bundles. Since K-homology satisfies both the wedge and weak homotopy equivalence axioms for finite CW-complexes, we may apply the Leray-Serre spectral sequence to calculate


the topological K-homology groups in these instances. The Leray-Serre theorem [58] states that there is a spectral sequence $\{E^r_{p,q}, d^r\}_{r \in \mathbb{N};\, p,q \in \mathbb{Z}}$ converging to $K^t(X)$ and satisfying

$$E^2_{p,q} = H_p\big(B; K^t_{e(q)}(F)\big)$$

for all $p, q \in \mathbb{Z}$. This spectral sequence relates D-branes on $X$ to homology cycles of the base and fibres. Stability criteria can be formulated along the lines of Sect. 4.1 to determine which homology classes in $H_p(B; K^t_{e(q)}(F))$ lift to non-trivial classes in $K^t(X)$. Part of the homology basis of $X$ wrapped by stable D-branes may then contain cycles of the base $B$ embedded in $X$ as the zero section of the fibre bundle along with cycles from the inclusion of some of the fibres $F$. The worldvolumes will typically be labelled by the characteristic class of the fibre bundle. Some of these singular homology cycles may be trivial in the homology of $X$ (but not in the homology of $B$ or $F$), and on its own a D-brane wrapping the given cycle in $X$ would be unstable. However, regarding $X$ as the total space of a non-trivial fibration can effectively render the D-brane state stable in a process somewhat reverse to the decay of branes wrapping non-trivial homology cycles that we studied earlier (Sects. 2.3 and 4.1). We refer to this process as “flux stabilization” [5, 13, 46], with the characteristic class of the fibration playing the role of a “flux” on the D-brane worldvolume. These fluxes act as conserved topological charges on the D-branes which give an obstruction for them to decay to the vacuum state. For example, a circle bundle $X$ over $B$ is classified entirely by its first Chern class $c_1(X) \in H^2(B;\mathbb{Z})$.

5.1. Spherically-fibred D-branes. We begin with the class of fibrations wherein the fibre spaces $F = S^{2n}$ are even-dimensional spheres.

Proposition 5.1. Let $S^{2n} \to X \to B$ be a spherical fibration such that the base space $B$ is a simply connected finite CW-complex with freely generated singular homology concentrated in even degree. Then

$$K_0^t(X) \cong H(B;\mathbb{Z}) \oplus H(B;\mathbb{Z}), \qquad K_1^t(X) \cong 0.$$

Proof. The second term of the Leray-Serre spectral sequence is given by

$$E^2_{p,q} \cong \begin{cases} H_p(B; \mathbb{Z} \oplus \mathbb{Z}), & q \text{ even} \\ 0, & q \text{ odd} \end{cases}.$$

By the universal coefficient theorem for singular homology one has $H_p(B; \mathbb{Z}^{\oplus n}) \cong H_p(B;\mathbb{Z})^{\oplus n}$, and so

$$E^2_{p,q} \cong \begin{cases} H_p(B;\mathbb{Z}) \oplus H_p(B;\mathbb{Z}), & q \text{ even} \\ 0, & q \text{ odd} \end{cases}.$$

Under the assumptions on the homology of $B$, one has $E^2_{p,q} = 0$ if either $p$ or $q$ is odd and hence $E^r_{p,q} \cong E^2_{p,q}$ for all $r \geq 2$. It follows that $E^\infty_{p,q} \cong E^2_{p,q}$. Then it is easy to see


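that $F_{p,1-p} = F_{0,1} = E^\infty_{0,1} = 0$ for all $p \geq 0$, and so $K_1^t(X) = 0$. Furthermore, one has the filtration groups

$$F_{p,-p} \cong \begin{cases} \displaystyle\bigoplus_{q=0}^{p} \big( H_q(B;\mathbb{Z}) \oplus H_q(B;\mathbb{Z}) \big), & p \text{ even} \\[2ex] \displaystyle\bigoplus_{q=0}^{p-1} \big( H_q(B;\mathbb{Z}) \oplus H_q(B;\mathbb{Z}) \big), & p \text{ odd} \end{cases}$$

and hence $K_0^t(X) = H(B;\mathbb{Z}) \oplus H(B;\mathbb{Z})$.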
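As a consistency check of Proposition 5.1, one may take the trivial fibration $X = \mathbb{CP}^m \times S^{2n}$ with $n \geq 1$. The choice of this example, and the comparison with the Künneth theorem of Sect. 1.6, are illustrative assumptions and not part of the original argument; the sketch below only verifies that the two computations agree in this simple case.

```latex
% Worked check of Proposition 5.1 for the product X = CP^m x S^{2n}, n >= 1
% (an illustrative choice of base and fibration, not taken from the text).
\begin{align*}
% Total homology of the base: one free generator in each even degree 0, 2, ..., 2m.
H(\mathbb{CP}^m;\mathbb{Z}) &= \bigoplus_{j=0}^{m} H_{2j}(\mathbb{CP}^m;\mathbb{Z})
  \cong \mathbb{Z}^{\oplus (m+1)}, \\
% Proposition 5.1 applied with B = CP^m.
K_0^t(X) &\cong H(\mathbb{CP}^m;\mathbb{Z}) \oplus H(\mathbb{CP}^m;\mathbb{Z})
  \cong \mathbb{Z}^{\oplus 2(m+1)}, \qquad K_1^t(X) \cong 0, \\
% Kunneth cross-check: all groups involved are free, so the Tor term vanishes,
% and K_1^t of both factors is zero.
K_0^t(\mathbb{CP}^m) \otimes K_0^t(S^{2n})
  &\cong \mathbb{Z}^{\oplus (m+1)} \otimes \mathbb{Z}^{\oplus 2}
  \cong \mathbb{Z}^{\oplus 2(m+1)}.
\end{align*}
```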

5.2. Fractional branes. Next we look at the cases where the fibre $F$ is a finite set of points. In this case, stable D-branes again wrap cycles in the base $B$. Any such D-brane is also accompanied by a set of $|F|$ mirror images called fractional branes [27] which live on the various leaves of the covering space $X$.

Proposition 5.2. Let $F \to X \to B$ be a covering such that the base space $B$ is a simply connected finite CW-complex with freely generated singular homology concentrated in even degree. Then

$$K_0^t(X) \cong H(B;\mathbb{Z})^{\oplus |F|}, \qquad K_1^t(X) \cong 0.$$

Proof. Since the functor $K^t$ satisfies the infinite wedge axiom [38], the second term of the Leray-Serre spectral sequence is given by

$$E^2_{p,q} \cong \begin{cases} H_p\Big(B; \displaystyle\bigoplus_{\alpha \in F} \mathbb{Z}\Big), & q \text{ even} \\ 0, & q \text{ odd} \end{cases}.$$

Applying the universal coefficient theorem allows us to rewrite this term as

$$E^2_{p,q} \cong \begin{cases} H_p(B;\mathbb{Z})^{\oplus |F|}, & q \text{ even} \\ 0, & q \text{ odd} \end{cases}.$$

Under the stated assumptions on the homology of $B$ we thereby conclude that

$$E^\infty_{p,q} \cong E^2_{p,q} \cong \begin{cases} H_p(B;\mathbb{Z})^{\oplus |F|}, & q \text{ even},\ p \geq 0 \text{ even} \\ 0, & \text{otherwise} \end{cases}.$$

One easily shows that $F_{p,1-p} = F_{0,1} = E^\infty_{0,1} = 0$ for all $p \geq 0$, and so $K_1^t(X) = 0$. One also concludes that

$$F_{p,-p} \cong \begin{cases} \displaystyle\bigoplus_{\alpha \in F} \bigoplus_{q=0}^{p} H_q(B;\mathbb{Z}), & p \text{ even} \\[2ex] \displaystyle\bigoplus_{\alpha \in F} \bigoplus_{q=0}^{p-1} H_q(B;\mathbb{Z}), & p \text{ odd} \end{cases}$$

and hence $K_0^t(X) = \bigoplus_{\alpha \in F} H(B;\mathbb{Z})$.
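A small sanity check of Proposition 5.2 can be made with $B = S^2$ and a two-point fibre $F = \{\alpha_1, \alpha_2\}$. Since $B$ is simply connected the covering is trivial, so $X = S^2 \sqcup S^2$. This particular example and the use of additivity of K-homology on disjoint unions are assumptions made here for illustration only.

```latex
% Worked check of Proposition 5.2 for B = S^2 and |F| = 2 (illustrative choice).
\begin{align*}
% Total homology of the base.
H(S^2;\mathbb{Z}) &= H_0(S^2;\mathbb{Z}) \oplus H_2(S^2;\mathbb{Z})
  \cong \mathbb{Z} \oplus \mathbb{Z}, \\
% Proposition 5.2 with |F| = 2.
K_0^t(X) &\cong H(S^2;\mathbb{Z})^{\oplus 2} \cong \mathbb{Z}^{\oplus 4},
  \qquad K_1^t(X) \cong 0, \\
% Direct computation on the disjoint union, using K_0^t(S^2) = Z + Z and K_1^t(S^2) = 0.
K_0^t(S^2 \sqcup S^2) &\cong K_0^t(S^2) \oplus K_0^t(S^2) \cong \mathbb{Z}^{\oplus 4}.
\end{align*}
```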


There is a relative version of the Leray-Serre spectral sequence of the form

$$E^2_{p,q} = H_p\big(B; K^t_{e(q)}(F)\big) \Longrightarrow K^t_{e(p+q)}(X, F).$$

Using an analysis similar to the one just made allows one to determine D-brane states analogous to those of Sect. 5.1 in the case where the homology of $B$ is supported complementarily to that above and the fractional branes are all identified with one another.

Proposition 5.3. Let $F \to X \to B$ be a covering such that the base space $B$ is a simply connected finite CW-complex with freely generated reduced singular homology concentrated in odd degree. Then

$$K_0^t(X, F) \cong 0, \qquad K_1^t(X, F) \cong H(B;\mathbb{Z}) \oplus H(B;\mathbb{Z}).$$

Corollary 5.1. Let $F \to X \to B$ be a covering such that the base space $B$ is a simply connected finite CW-complex with freely generated reduced singular homology concentrated in degree $n$ mod 2. Then

$$K^t_{e(n)}(X) \cong H(B;\mathbb{Z}), \qquad K^t_{e(n+1)}(X) \cong 0.$$

Remark 5.1. Since $B$ is path-connected, an application of these results to the trivial fibration $\mathrm{id}_B : B \to B$ gives as a corollary that $K_0^t(B)$ (resp. $K_1^t(B)$) is isomorphic to $H(B;\mathbb{Z})$ while $K_1^t(B)$ (resp. $K_0^t(B)$) is trivial when $B$ has non-trivial homology only in even (resp. odd) degree. ♦

5.3. Spherically-based D-branes. For our final general class of fibre bundles, we consider the cases where the base space $B$ is a sphere. In such instances the stable D-branes on $X$ are determined by images of K-cycles on the fibre $F$. Our first result completely determines the case of coverings of even-dimensional spheres.

Proposition 5.4. Let $F \to X \to S^{2n}$, $n \geq 1$ be a fibration over the $2n$-sphere such that the topological K-homology group of the fibre $K^t(F) = K_0^t(F)$ is freely generated. Then

$$K_0^t(X) \cong K_0^t(F) \oplus K_0^t(F), \qquad K_1^t(X) \cong 0.$$

Proof. The second term of the Leray-Serre spectral sequence is given by

$$E^2_{p,q} \cong \begin{cases} K^t_{e(q)}(F), & p = 0, 2n \\ 0, & \text{otherwise} \end{cases}.$$

Since $K_1^t(F) = 0$, this becomes

$$E^2_{p,q} \cong \begin{cases} K_0^t(F), & q \text{ even},\ p = 0, 2n \\ 0, & \text{otherwise} \end{cases}.$$

Since $E^2_{p,q} = 0$ if either $p$ or $q$ is odd, it follows that $d^r = 0$ for all $r, p, q$ and so $E^\infty_{p,q} = E^2_{p,q}$. One easily concludes that $F_{p,1-p} = F_{0,1} = E^\infty_{0,1} = K_1^t(F) = 0$ for all $p$, and so $K_1^t(X) = 0$. On the other hand, for $p \leq 2n - 1$ one has $F_{p,-p} = F_{0,0} = E^\infty_{0,0} = K_0^t(F)$ and the only extension problem arises in the exact sequence

$$0 \longrightarrow F_{2n-1,1-2n} \longrightarrow F_{2n,-2n} \longrightarrow E^\infty_{2n,-2n} \longrightarrow 0.$$

Since $\operatorname{Ext}(K_0^t(F), K_0^t(F)) = 0$ it follows that $F_{2n,-2n} = K_0^t(F) \oplus K_0^t(F)$. For $p > 2n$ one has $F_{p,-p} = F_{2n,-2n}$, and so we conclude that $K_0^t(X) = K_0^t(F) \oplus K_0^t(F)$.
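To see Proposition 5.4 at work, consider $S^2$-bundles over $S^2$ (so $n = 1$ and $F = S^2$). The two total spaces named below, the product $S^2 \times S^2$ and the non-trivial bundle $\mathbb{CP}^2 \# \overline{\mathbb{CP}^2}$, are examples chosen here for illustration and do not appear in the text; the cross-check uses Remark 5.1 applied to $X$ itself.

```latex
% Worked check of Proposition 5.4 for F = S^2 over S^2 (illustrative examples).
\begin{align*}
% Fibre data: K-homology of S^2 is free and concentrated in degree 0.
K_0^t(S^2) &\cong \mathbb{Z} \oplus \mathbb{Z}, \qquad K_1^t(S^2) \cong 0, \\
% Proposition 5.4.
K_0^t(X) &\cong K_0^t(S^2) \oplus K_0^t(S^2) \cong \mathbb{Z}^{\oplus 4},
  \qquad K_1^t(X) \cong 0, \\
% Cross-check: both S^2 x S^2 and CP^2 # \overline{CP^2} have free homology
% of ranks 1, 2, 1 in degrees 0, 2, 4, so Remark 5.1 gives the same answer.
H(X;\mathbb{Z}) &\cong \mathbb{Z} \oplus \mathbb{Z}^{\oplus 2} \oplus \mathbb{Z}
  \cong \mathbb{Z}^{\oplus 4} \cong K_0^t(X).
\end{align*}
```

In this example the result does not distinguish the trivial from the twisted bundle, which is consistent with the statement of the proposition depending only on $K_0^t(F)$.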


Proposition 5.5. Let $F \to X \to S^1$ be a fibration over the circle such that the topological K-homology group of the fibre obeys $\operatorname{Ext}(K_i^t(F), K^t_{e(i+1)}(F)) = 0$ for $i = 0, 1$. Then

$$K_0^t(X) = K_1^t(X) = K_0^t(F) \oplus K_1^t(F).$$

Proof. The second term in the Leray-Serre spectral sequence is given by

$$E^2_{p,q} \cong \begin{cases} K^t_{e(q)}(F), & p = 0, 1 \\ 0, & \text{otherwise} \end{cases}.$$

The spectral sequence collapses at the second level, and so

$$E^\infty_{p,q} \cong E^2_{p,q} \cong \begin{cases} K^t_{e(q)}(F), & p = 0, 1 \\ 0, & \text{otherwise} \end{cases}.$$

We therefore have $F_{0,q} \cong E^2_{0,q} \cong K^t_{e(q)}(F)$. Since $\operatorname{Ext}(K_i^t(F), K^t_{e(i+1)}(F)) = 0$, $i = 0, 1$ one finds $F_{p,-p} = K_0^t(F) \oplus K_1^t(F) = F_{p,1-p}$ and the conclusion follows.
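The simplest instance of Proposition 5.5 is the product fibration $S^1 \to T^2 \to S^1$ with total space the two-torus. This example, and the Künneth comparison, are added here as an illustrative sketch under the assumption of a trivial (product) fibration.

```latex
% Worked check of Proposition 5.5 for the two-torus T^2 = S^1 x S^1 (illustrative choice).
\begin{align*}
% Fibre data; Ext(Z, Z) = 0 because Z is free, so the hypothesis holds.
K_0^t(S^1) &\cong \mathbb{Z}, \qquad K_1^t(S^1) \cong \mathbb{Z}, \qquad
  \operatorname{Ext}(\mathbb{Z}, \mathbb{Z}) = 0, \\
% Proposition 5.5.
K_0^t(T^2) &= K_1^t(T^2) = K_0^t(S^1) \oplus K_1^t(S^1) \cong \mathbb{Z} \oplus \mathbb{Z}, \\
% Kunneth cross-check (no Tor term since all groups are free).
K_0^t(T^2) &\cong \big(K_0^t(S^1) \otimes K_0^t(S^1)\big) \oplus \big(K_1^t(S^1) \otimes K_1^t(S^1)\big)
  \cong \mathbb{Z} \oplus \mathbb{Z}, \\
K_1^t(T^2) &\cong \big(K_0^t(S^1) \otimes K_1^t(S^1)\big) \oplus \big(K_1^t(S^1) \otimes K_0^t(S^1)\big)
  \cong \mathbb{Z} \oplus \mathbb{Z}.
\end{align*}
```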

Remark 5.2. For a fibration $F \to X \to S^{2n+1}$ over a generic odd-dimensional sphere, with the additional assumption that $\operatorname{im}\big(d^{2n+1} : E^{2n+1}_{2n+1,q-2n} \to E^{2n+1}_{0,q}\big) = 0 = \ker\big(d^{2n+1} : E^{2n+1}_{2n+1,q} \to E^{2n+1}_{0,q+2n}\big)$ one can derive an analogous result to the one of Proposition 5.5. ♦

Using the relative version of the Leray-Serre spectral sequence and performing an analysis analogous to those just made allows one to conclude the following result.

Proposition 5.6. Let $F \to X \to S^n$, $n \geq 1$ be a fibration over the $n$-sphere. Then for $i = 0, 1$ one has

$$K_i^t(X, F) \cong K^t_{e(i+n)}(F).$$

5.4. Hopf branes. Intimately related to the four classes of projective D-branes studied earlier are the Hopf fibrations. Let $r := \dim_{\mathbb{R}}(\mathbb{K})$, where $\mathbb{K}$ is one of the four normed division algebras over the field of real numbers given by $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$ or $\mathbb{O}$. Then the projective plane $\mathbb{KP}^2$ is the mapping cone of the Hopf fibration

$$\begin{array}{ccc} S^{r-1} & \longrightarrow & S^{2r-1} \\ & & \big\downarrow \\ & & S^r \end{array}.$$

The total space $X = S^{2r-1}$ of this fibre bundle is a sphere in $\mathbb{K}^2$, while its base $B = \mathbb{KP}^1 \cong S^r$ is the one-point compactification $\mathbb{K}_\infty$. The Hopf fibrations are the free generators of the fundamental groups $\pi_{2r-1}(S^r)$. For the case $r = 1$ ($\mathbb{K} = \mathbb{R}$) the corresponding D-branes are represented as solitonic kinks. For $r = 2$ ($\mathbb{K} = \mathbb{C}$) the D-branes are Dirac monopoles corresponding to the magnetic monopole bundle over $\mathbb{CP}^1$. For $r = 4$ ($\mathbb{K} = \mathbb{H}$) the D-branes are SU(2) Yang-Mills instantons corresponding to the holomorphic vector bundle of rank 2 over $\mathbb{CP}^3$. Finally, the case $r = 8$ ($\mathbb{K} = \mathbb{O}$) realizes D-branes as Spin(8) instantons of the Hopf bundle over $\mathbb{RP}^8$. These characterizations [52, 53] are asserted by computing the topological K-homology groups using the relative version of the Leray-Serre spectral sequence. In fact, they are special cases of a more general result.


Proposition 5.7. For any spherical fibration of the form $S^{2n+1} \to X \to S^{2m}$ one has

$$K_i^t(X, S^{2n+1}) \cong H_{2m}(S^{2m};\mathbb{Z}) = \mathbb{Z}$$

for $i = 0, 1$.

Proof. The second term in the relative Leray-Serre spectral sequence is given by

$$E^2_{p,q} \cong \begin{cases} H_{2m}(S^{2m};\mathbb{Z}), & p = 2m \\ 0, & \text{otherwise} \end{cases}.$$

The spectral sequence collapses at the second level so that

$$E^\infty_{p,q} \cong E^2_{p,q} \cong \begin{cases} H_{2m}(S^{2m};\mathbb{Z}), & p = 2m \\ 0, & \text{otherwise} \end{cases}.$$

Since $E^\infty_{p,-p} = 0$ unless $p = 2m$, one has $F_{p,-p} = F_{2m,-2m} = H_{2m}(S^{2m};\mathbb{Z})$ and we conclude that $K_0^t(X, S^{2n+1}) \cong H_{2m}(S^{2m};\mathbb{Z})$. Furthermore, one has the filtration groups $F_{p,1-p} = F_{2m,1-2m} = H_{2m}(S^{2m};\mathbb{Z})$ and it follows that also $K_1^t(X, S^{2n+1}) = H_{2m}(S^{2m};\mathbb{Z})$.
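For the complex Hopf fibration $S^1 \to S^3 \to S^2$ (the case $n = 0$, $m = 1$), the statement of Proposition 5.7 can be cross-checked against the six-term exact sequence of the pair $(S^3, S^1)$. The derivation below is a sketch added for illustration; it assumes only the standard values $K_0^t(S^1) \cong K_1^t(S^1) \cong K_0^t(S^3) \cong K_1^t(S^3) \cong \mathbb{Z}$ and the fact that the fibre inclusion $j : S^1 \to S^3$ is null-homotopic.

```latex
% Cross-check of Proposition 5.7 for the Hopf fibration S^1 -> S^3 -> S^2,
% using the exact sequence of the pair (S^3, S^1). Illustrative sketch.
\begin{align*}
% j is homotopic to a constant map, so j_* factors through K^t(pt):
% it is an isomorphism in degree 0 (point class) and zero in degree 1.
j_* &: K_0^t(S^1) \xrightarrow{\ \cong\ } K_0^t(S^3), \qquad
j_* = 0 : K_1^t(S^1) \longrightarrow K_1^t(S^3), \\
% Degree 0 of the pair: cokernel of j_* in degree 0 and kernel of j_* in degree 1.
0 \to \operatorname{coker}\big(j_*|_{K_0^t}\big) = 0 &\to K_0^t(S^3, S^1) \to
  \ker\big(j_*|_{K_1^t}\big) = \mathbb{Z} \to 0
  \ \Longrightarrow\ K_0^t(S^3, S^1) \cong \mathbb{Z}, \\
% Degree 1 of the pair: cokernel of j_* in degree 1 and kernel of j_* in degree 0.
0 \to \operatorname{coker}\big(j_*|_{K_1^t}\big) = \mathbb{Z} &\to K_1^t(S^3, S^1) \to
  \ker\big(j_*|_{K_0^t}\big) = 0 \to 0
  \ \Longrightarrow\ K_1^t(S^3, S^1) \cong \mathbb{Z}.
\end{align*}
```

Both relative groups come out as $\mathbb{Z} = H_2(S^2;\mathbb{Z})$, in agreement with the proposition for $m = 1$.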

Remark 5.3. Proposition 5.7 shows that the only stable D-branes (in addition to the usual point-like D-instantons) in any of the four non-trivial Hopf fibrations above wrap a spherical submanifold $S^r$ embedded into $S^{2r-1}$ as the zero section of the fibre bundle, even though these submanifolds are homologically trivial. The worldvolume spheres $S^r$ are labelled by the classifying map of the Hopf fibration, which is the generator of $\pi_{r-1}(\mathrm{Spin}(r))/\pi_{r-1}(\mathrm{Spin}(r-1)) \cong \pi_{r-1}(S^{r-1})$, and they are stabilized by the flux given by the $r^{\rm th}$ Chern class $c_r(S^{2r-1}) \in H^r(S^r;\mathbb{Z}) \cong \mathbb{Z}$. For example, for $r = 2$ this construction reproduces the well-known result that the stable branes in $S^3$ are spherical D2-branes wrapping $S^2 \subset S^3$ with integral charge labelled by $\pi_1(S^1) \cong \mathbb{Z}$ [5]. The stabilizing flux in this case is the magnetic charge $c_1(S^3) \in H^2(S^2;\mathbb{Z}) \cong \mathbb{Z}$ given by the first Chern class of the monopole bundle.

Acknowledgements. We thank P. Baum, D. Calderbank, R. Hepworth, J.J. Manjarín, V. Mathai, R. Minasian, A. Ranicki, and E. Rees for helpful discussions. The work of R.M.G.R. is supported in part by FCT grant SFRH/BD/12268/2003. The work of R.J.S. is supported in part by a PPARC Advanced Fellowship, by PPARC Grant PPA/G/S/2002/00478, and by the EU-RTN Network Grant MRTN-CT-2004-005104.

References

1. Adams, J.F.: Stable Homotopy and Generalised Homology. Chicago: The University of Chicago Press, 1974
2. Asakawa, T., Sugimoto, T., Terashima, S.: D-Branes, Matrix Theory and K-Homology. J. High Energy Phys. 0203, 034 (2002)
3. Aspinwall, P.S.: D-Branes on Calabi-Yau Manifolds. http://arXiv.org/list/hep-th/0403166
4. Atiyah, M.F., Bott, R., Shapiro, A.: Clifford Modules. Topology 3, 3–38 (1964)
5. Bachas, C., Douglas, M.R., Schweigert, C.: Flux Stabilization of D-Branes. J. High Energy Phys. 0005, 048 (2000)
6. Baum, P., Douglas, R.G.: K-Homology and Index Theory. Proc. Symp. Pure Math. 38, 117–173 (1982)
7. Baum, P., Douglas, R.G.: Index Theory, Bordism and K-Homology. Contemp. Math. 10, 1–33 (1982)
8. Brown, L.G., Douglas, R.G., Fillmore, P.A.: Extensions of C*-Algebras and K-Homology. Ann. Math. 105, 265–324 (1977)
9. Bergman, O., Gimon, E.G., Hořava, P.: Brane Transfer Operations and T-Duality of Non-BPS States. J. High Energy Phys. 9904, 010 (1999)


10. Bergman, O., Gimon, E.G., Sugimoto, S.: Orientifolds, RR Torsion and K-Theory. J. High Energy Phys. 0105, 047 (2001)
11. Blackadar, B.: K-theory for Operator Algebras. Berlin-Heidelberg-New York: Springer-Verlag, 1986
12. Bödigheimer, C.F.: Splitting the Künneth Sequence in K-theory. Math. Ann. 242(2), 159–171 (1979)
13. Bordalo, P., Ribault, S., Schweigert, C.: Flux Stabilization in Compact Groups. J. High Energy Phys. 0110, 036 (2001)
14. Bouwknegt, P., Mathai, V.: D-Branes, B-Fields and Twisted K-Theory. J. High Energy Phys. 0003, 007 (2000)
15. Bouwknegt, P., Evslin, J., Mathai, V.: T-Duality: Topology Change from H-Flux. Commun. Math. Phys. 249, 383–415 (2004)
16. Bouwknegt, P., Evslin, J., Jurčo, B., Mathai, V., Sati, H.: Flux Compactifications of Projective Spaces and the S-Duality Puzzle. http://arXiv.org/list/hep-th/0501110
17. Braun, V.: K-theory Torsion. http://arXiv.org/list/hep-th/0005103
18. Bröcker, T., Jänich, K.: Introduction to Differential Topology. Cambridge: Cambridge University Press, 1982
19. Brunner, I., Distler, J.: Torsion D-branes in Nongeometrical Phases. Adv. Theor. Math. Phys. 5, 265–309 (2002)
20. Conner, P.E., Floyd, E.E.: The Relation of Cobordism to K-theories. Lecture Notes in Mathematics 28, Berlin-Heidelberg-New York: Springer, 1966
21. Conner, P.E., Floyd, E.E.: Differentiable Periodic Maps. Berlin-Heidelberg-New York: Springer, 1964
22. Davis, J.F., Kirk, P.: Lecture Notes in Algebraic Topology. Providence, RI: Amer. Math. Soc., 2001
23. Deutz, A.: The Splitting of the Künneth sequence in K-theory for C*-algebras. PhD Dissertation, Wayne State University, 1981
24. Diaconescu, D.-E., Moore, G.W., Witten, E.: E8 Gauge Theory and a Derivation of K-Theory from M-Theory. Adv. Theor. Math. Phys. 6, 1031–1134 (2003)
25. Douglas, M.R.: Branes within Branes. In: Strings, Branes and Dualities. Dordrecht: Kluwer, 1999, pp. 267–275
26. Douglas, M.R.: D-Branes, Categories and N = 1 Supersymmetry. J. Math. Phys. 42, 2818–2843 (2001)
27. Douglas, M.R., Moore, G.W.: D-Branes, Quivers and ALE Instantons. http://arXiv.org/list/hep-th/9603167, 1996
28. Douglas, R.G.: C*-algebra Extensions and K-homology. Ann. Math. Studies, Princeton, NJ: Princeton University Press, 1980
29. Freed, D.S., Hopkins, M.J.: On Ramond-Ramond Fields and K-Theory. J. High Energy Phys. 0005, 044 (2000)
30. Freed, D.S., Witten, E.: Anomalies in String Theory with D-Branes. Asian J. Math. 3, 819 (1999)
31. García-Compeán, H.: D-Branes in Orbifold Singularities and Equivariant K-Theory. Nucl. Phys. B 557, 480–504 (1999)
32. Greene, B.R., Plesser, M.R.: Duality in Calabi-Yau Moduli Space. Nucl. Phys. B 338, 15–37 (1990)
33. Gukov, S.: K-Theory, Reality and Orientifolds. Commun. Math. Phys. 210, 621–639 (2000)
34. Harvey, J.A., Moore, G.W.: Noncommutative Tachyons and K-Theory. J. Math. Phys. 42, 2765–2780 (2001)
35. Hopkins, M.J., Hovey, M.A.: Spin Cobordism Determines Real K-theory. Math. Z. 210, 181–196 (1992)
36. Hořava, P.: Type IIA D-Branes, K-Theory and Matrix Theory. Adv. Theor. Math. Phys. 2, 1373–1404 (1999)
37. Hori, K.: D-Branes, T-Duality and Index Theory. Adv. Theor. Math. Phys. 3, 281–342 (1999)
38. Jakob, M.: A Bordism-Type Description of Homology. Manuscr. Math. 96, 67–80 (1998)
39. Johnson, C.V.: D-Branes. Cambridge: Cambridge University Press, 2003
40. Kapustin, A.: D-Branes in a Topologically Non-Trivial B-Field. Adv. Theor. Math. Phys. 4, 127–154 (2000)
41. Karoubi, M.: K-Theory: An Introduction. Berlin-Heidelberg-New York: Springer-Verlag, 1978
42. Kaminker, J., Schochet, C.: Topological Obstructions to Perturbations of Pairs of Operators. In: K-Theory and Operator Algebras, Lecture Notes in Mathematics 575, Berlin-Heidelberg-New York: Springer, 1975
43. Lechtenfeld, O., Popov, A.D., Szabo, R.J.: Noncommutative Instantons in Higher Dimensions, Vortices and Topological K-Cycles. J. High Energy Phys. 0312, 022 (2003)
44. Madsen, I., Rosenberg, J.: The Universal Coefficient Theorem for Equivariant K-Theory of Real and Complex C*-Algebras. Contemp. Math. 70, 145–173 (1988)
45. Maldacena, J.M., Moore, G.W., Seiberg, N.: Geometrical Interpretation of D-Branes in Gauged WZW Models. J. High Energy Phys. 0107, 046 (2001)
46. Maldacena, J.M., Moore, G.W., Seiberg, N.: D-Brane Instantons and K-Theory Charges. J. High Energy Phys. 0111, 062 (2001)
47. Matsuo, Y.: Topological Charges of Noncommutative Soliton. Phys. Lett. B 499, 223–228 (2001)
48. Matthey, M.: Mapping the Homology of a Group to the K-Theory of its C*-Algebra. Ill. Math. J. 46, 953–977 (2002)


49. Minasian, R., Moore, G.W.: K-Theory and Ramond-Ramond Charge. J. High Energy Phys. 9711, 002 (1997) 50. Moore, G.W., Witten, E.: Self-Duality, Ramond-Ramond Fields and K-Theory. J. High Energy Phys. 0005, 032 (2000) 51. Myers, R.C.: Dielectric-Branes. J. High Energy Phys. 9912, 022 (1999) 52. Olsen, K., Szabo, R.J.: Brane Descent Relations in K-Theory. Nucl. Phys. B 566, 562–598 (2000) 53. Olsen,K., Szabo, R.J.: Constructing D-Branes from K-Theory. Adv. Theor. Math. Phys. 4, 889–1025 (2000) 54. Periwal, V.: D-Brane Charges and K-Homology. J. High Energy Phys. 0007, 041 (2000) 55. Polchinski, J.: String Theory, Vol. 2. Cambridge: Cambridge University Press, 1998 56. Rosenberg, J., Schochet, C.: The Künneth Theorem and the Universal Coefficient Theorem for Kasparov’s Generalized K-functor. Duke Math. J. 55, no. 2, 431–474 (1987) 57. Sen, A.: Tachyon Condensation on the Brane-Antibrane System. J. High Energy Phys. 9808, 012 (1998) 58. Switzer, R.M.: Algebraic Topology: An Introduction. Berlin-Heidelberg-New York: Springer-Verlag, 1978 59. Szabo, R.J.: Superconnections, Anomalies and Non-BPS Brane Charges. J. Geom. Phys. 43, 241–292 (2002) 60. Szabo, R.J.: D-Branes, Tachyons and K-Homology. Mod. Phys. Lett. A 17, 2297–2316 (2002) 61. Witten, E.: Chern-Simons Gauge Theory as a String Theory. Prog. Math. 133, 637–678 (1995) 62. Witten, E.: Phases of N = 2 Theories in Two Dimensions. Nucl. Phys. B 403, 159–222 (1993) 63. Witten, E.: Bound States of Strings and p-Branes. Nucl. Phys. B 460, 335–350 (1996) 64. Witten, E.: D-Branes and K-Theory. J. High Energy Phys. 9812, 019 (1998) 65. Würgler, U.: Riemann-Roch Transformationen und Kobordismen. Comm. Math. Helv. 46, 414–424 (1971) 66. Yosimura, Z.: Universal Coefficient Sequences for Cohomology Theories of CW-Spectra. Osaka J. Math. 16, 201–217 (1979) Communicated by M.R. Douglas

Commun. Math. Phys. 266, 123–151 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0007-3

Communications in

Mathematical Physics

Approximate Controllability of Three-Dimensional Navier–Stokes Equations

Armen Shirikyan

Laboratoire de Mathématiques, Université de Paris-Sud XI, Bâtiment 425, 91405 Orsay Cedex, France. E-mail: [email protected]

Received: 25 July 2005 / Accepted: 25 November 2005
Published online: 14 April 2006 – © Springer-Verlag 2006

To the undying memory of my uncle, Professor Sargis A. Markosyan

Abstract: The paper is devoted to studying the problem of controllability for 3D Navier–Stokes equations in a bounded domain. We develop the method introduced by Agrachev and Sarychev in the 2D case and establish a sufficient condition under which the problem in question is approximately controllable by a finite-dimensional force. In the particular case of a torus, it is shown that our sufficient condition is fulfilled for a control of low dimension not depending on the viscosity.

Contents
0. Introduction
1. Preliminaries on 3D Navier–Stokes Equations
   1.1 Functional spaces and Leray projection
   1.2 Parabolic semigroups generated by the Stokes operator
   1.3 Linearised Navier–Stokes system
   1.4 Strong solutions of the Navier–Stokes system
2. Main Results
   2.1 Approximate controllability
   2.2 Proof of Theorem 2.2: reduction to ε-controllability
   2.3 Navier–Stokes equations on a torus
3. Proof of Theorem 2.4
   3.1 Scheme of the proof
   3.2 Proof of Proposition 3.1
   3.3 Proof of Proposition 3.2
4. Appendix
   4.1 A version of the implicit function theorem
   4.2 Proof of Lemma 3.3


0. Introduction

In the pioneering article [AS05a], Agrachev and Sarychev introduced a new method for studying controllability properties of PDE’s perturbed by a finite-dimensional control force. They considered the 2D Navier–Stokes (NS) equations

$$\dot u + (u, \nabla)u - \nu \Delta u + \nabla p = \eta(t, x), \quad \operatorname{div} u = 0, \tag{0.1}$$

where $x \in \mathbb{T}^2 = \mathbb{R}^2/2\pi\mathbb{Z}^2$, $\nu > 0$ is the viscosity, $u(t, x)$ is the velocity field, $p(t, x)$ is the pressure, and $\eta(t, x)$ is a control function that takes on values in a finite-dimensional space $E \subset L^2(\mathbb{T}^2, \mathbb{R}^2)$. One of the main results in [AS05a] states that if $E$ contains sufficiently many Fourier modes, then for any $T > 0$ and $\nu > 0$ Eq. (0.1) is approximately controllable in time $T$. Without going into details, let us explain two key ideas that enable one to prove approximate controllability (AC) of (0.1).¹

Introduce the space

$$H = \{u \in L^2(\mathbb{T}^2, \mathbb{R}^2) : \operatorname{div} u \equiv 0\} \tag{0.2}$$

and denote by $\Pi : L^2(\mathbb{T}^2, \mathbb{R}^2) \to H$ the orthogonal projection in $L^2(\mathbb{T}^2, \mathbb{R}^2)$ onto the subspace $H$. Projecting (0.1) to $H$, we obtain the following evolution equation in $H$, which is equivalent to (0.1):

$$\dot u + \nu L u + B(u) = \eta(t). \tag{0.3}$$

Here $L = -\Pi\Delta$, $B(u) = \Pi\{(u, \nabla)u\}$, and we use the same notation for the right-hand side $\eta$ and its projection to $H$. It is well known that the Cauchy problem for (0.3) is well posed in appropriate functional spaces, and the corresponding solutions defined on the positive half-line are continuous functions of time with range in $H$. Recall that Eq. (0.3) is said to be approximately controllable in time $T$ by an $E$-valued control (where $E \subset H$ is a finite-dimensional subspace) if for any $u_0, \hat u \in H$ and any $\varepsilon > 0$ there is an essentially bounded function $\eta : [0, T] \to E$ such that $\|u(T) - \hat u\| < \varepsilon$, where $u(t)$ denotes the solution of (0.3) issued from $u_0$ and $\|\cdot\|$ stands for the $L^2$-norm. Along with (0.3), let us consider the control system

$$\dot u + \nu L(u + \zeta(t)) + B(u + \zeta(t)) = \eta(t). \tag{0.4}$$

Here $\eta$ and $\zeta$ are $E$-valued control functions. It turns out that the control systems (0.3) and (0.4) are equivalent. Namely, we have the following property, which is an analogue for PDE’s of a more general result established in [AS86] for the case of ODE’s (see also Sects. 6.1 and 12.4 in [AS05a]):

($P_1$) Equation (0.3) with $\eta \in E$ is AC in time $T > 0$ if and only if so is Eq. (0.4) with $\eta, \zeta \in E$.

¹ The scheme presented below is not entirely accurate and differs slightly from the one used in the original paper [AS05a].


We now compare (0.4) with a control system of the form (0.3) in which the control function takes on values in a space $E_1 \supset E$. More precisely, for any subset $A \subset H$, denote by $\operatorname{co} A$ the convex hull of $A$, that is, the set of vectors $v \in H$ that are representable in the form

$$v = \sum_{i=1}^{k} \lambda_i u_i,$$

where $k \ge 1$ is an integer depending on $v$, $u_i \in A$ for $i = 1, \dots, k$, and $\lambda_i > 0$ are some constants whose sum is equal to 1. Let $E_1 \subset H$ be the largest vector space such that

$$B(u) + E_1 \subset \operatorname{co}\{B(u + \zeta) + \nu L\zeta + \eta : \eta, \zeta \in E\} \quad \text{for any } u \in H. \tag{0.5}$$

Consider the control system

$$\dot u + \nu L u + B(u) = \eta_1(t), \tag{0.6}$$

where $\eta_1$ is an $E_1$-valued control. The following property is a version for PDE’s of the well-known convexification principle (for instance, see Theorem 8.2 in [AS04] or Theorem 7 in [Jur97, Chap. 3]).

($P_2$) Suppose that Eq. (0.6) with $\eta_1 \in E_1$ is AC in time $T > 0$. Then so is Eq. (0.4) with $\eta, \zeta \in E$.

Note that, in a general situation, the subspace $E_1$ may coincide with $E$. However, if $E$ is a proper subset of $E_1$, then properties ($P_1$) and ($P_2$) enable one to reduce the question of AC for Eq. (0.3) to a similar problem with a larger control space. Iterating this argument, for any initial space $E$ one can construct a non-decreasing sequence of subspaces $E_1 \subset E_2 \subset \cdots$ such that the following property holds.

($P$) Equation (0.3) with $\eta \in E$ is AC in time $T > 0$ if and only if so is Eq. (0.3) with $\eta \in E_k$ for some $k \ge 1$.

Now let $\{e_j\}$ be an orthonormal basis in $H$ formed of trigonometric polynomials and let $H_N \subset H$ be the vector space spanned by $e_1, \dots, e_N$. It was shown by Agrachev and Sarychev [AS05a] that if $E \supset H_{N_0}$ for a sufficiently large $N_0 \ge 1$, then there is a sequence $N_k \to \infty$ such that $H_{N_k} \subset E_k$ for any $k \ge 1$. This property combined with ($P$) implies that (0.3) is AC.

The Agrachev–Sarychev approach is rather general and does not use any particular property of 2D NS equations other than well-posedness of the Cauchy problem in appropriate functional spaces and the presence of a “mixing” nonlinearity. It can be applied to various controlled PDE’s, including the 2D Euler system and nonlinear Schrödinger equation [AS05b]. Moreover, combining some refined versions of properties ($P_1$) and ($P_2$) with a degree theory argument, it was shown in [AS05a] that the 2D NS system on the torus possesses the property of exact controllability in observed projections.

The aim of this paper is to develop the Agrachev–Sarychev method in such a way that it can be applied to equations for which the well-posedness of the Cauchy problem is not known to hold.
Namely, we consider the 3D Navier–Stokes system on a torus $\mathbb{T}^3$. Let $H$ be the space of divergence-free vector fields on $\mathbb{T}^3$ (cf. (0.2)) and let $V = H^1(\mathbb{T}^3, \mathbb{R}^3) \cap H$. As in the 2D case, one can reduce the problem in question to an evolution equation in $H$ of the form (0.3). Let $E \subset H$ be a finite-dimensional subspace. We shall say that the 3D NS system (0.3) with $\eta \in E$ is approximately controllable in time $T$ if for any $u_0, \hat u \in V$ and any $\varepsilon > 0$ there is an essentially bounded function $\eta : [0, T] \to E$ and a strong solution $u(t)$ of (0.3) such that

$$u(0) = u_0, \quad \|u(T) - \hat u\|_V < \varepsilon.$$

The following theorem is a simplified version of the main result of this paper (see Sect. 2 for more details).

Main Theorem. There is a finite-dimensional subspace $E \subset H$ such that for any $T > 0$ and $\nu > 0$ the 3D Navier–Stokes system (0.3) with $\eta \in E$ is approximately controllable in time $T$.

To prove this result, we show that properties ($P_1$) and ($P_2$) remain valid for the 3D NS system. Their proof, however, is different from that in the 2D case and relies substantially on a perturbative result on existence of strong solutions for 3D NS equations (see Sect. 1.4). We note that even in the 2D case the approach of this paper contains some new elements compared with the proofs in [AS05a]. We also hope that our presentation will help the readers not familiar with the geometric control theory of ODE’s to gain a better understanding of the Agrachev–Sarychev method.

It should be mentioned that the problem of controllability for Navier–Stokes and Euler equations was studied by many authors during the last fifteen years, and a number of deep results have been obtained (see the papers [Lio90, Fur95, Cor96, CF96, Ima98, FE99, Cor99, FC99, Gla00, Zua02] and the references therein). In particular, it was proved that NS equations possess the property of exact controllability (both in 2D and 3D) by a force supported in any given domain (see [Cor96, CF96, Ima98, FE99]). Furthermore, feedback stabilisation properties of NS and Euler equations were studied in [Cor99, BS01, Fur01, Fur04, BT04]. We point out, in particular, the paper [BT04] in which exponential stabilisation to a steady state solution for the 3D NS system is obtained via finite-dimensional controllers.
To the best of my knowledge, this paper provides a first result on approximate controllability of 3D NS equations by a control of finite dimension not depending on the viscosity. In conclusion, we note that our arguments can be used to prove the property of exact controllability in observed projections for the 3D NS system; we shall address this question in a subsequent publication.

The paper is organised as follows. In Sect. 1, we have compiled some preliminaries on Navier–Stokes equations. The main results of the paper are presented in Sect. 2. We establish a sufficient condition under which the 3D NS system is controllable by a finite-dimensional force and then show that it is satisfied in the case of periodic boundary conditions. Sect. 3 is devoted to the proofs.

Notation. We use standard functional spaces arising in the theory of Navier–Stokes equations; they are defined in Sect. 1.1. For a separable Banach space $X$ and a compact interval $J \subset \mathbb{R}$, we introduce the following notation.

$B_X(R)$ is the closed ball in $X$ of radius $R$ centred at the origin.

$L^p(J, X)$ is the space of measurable functions $f : J \to X$ such that

$$\|f\|_{L^p(J,X)} := \Big(\int_J \|f(t)\|_X^p \, dt\Big)^{1/p} < \infty, \tag{0.7}$$

where $\|\cdot\|_X$ stands for the norm in $X$. In the case $p = \infty$, condition (0.7) is replaced by

$$\|f\|_{L^\infty(J,X)} := \operatorname*{ess\,sup}_{t \in J} \|f(t)\|_X < \infty.$$

$C(J, X)$ is the space of continuous functions $f : J \to X$ endowed with the norm

$$\|f\|_{C(J,X)} := \max_{t \in J} \|f(t)\|_X.$$

$\mathcal{L}(X)$ denotes the space of continuous linear operators in $X$ with the usual operator norm $\|\cdot\|_{\mathcal{L}(X)}$.

If $X$ is a Hilbert space and $E \subset X$ is a closed subspace, then $E^\perp$ stands for the orthogonal complement of $E$ in $X$. In this case, we denote by $P = P_E$ and $Q = Q_E$ the orthogonal projections in $X$ onto the subspaces $E$ and $E^\perp$, respectively.

Throughout the paper, we denote by $C_i$, $i = 1, 2, \dots$, unessential positive constants, by $\mathbb{R}_+$ the half-line $[0, +\infty)$, and by $J_T$ the time interval $[0, T]$.

1. Preliminaries on 3D Navier–Stokes Equations In this section, we have compiled some auxiliary results on 3D Navier–Stokes equations. Although the methods used in their proofs are well known, we present a rather detailed justification of all statements, since they will play an essential role in Sects. 2 and 3.

1.1. Functional spaces and Leray projection. Let $D \subset \mathbb{R}^3$ be a bounded domain with $C^2$-smooth boundary $\partial D$. Denote by $H^s = H^s(D, \mathbb{R}^3)$ the space of vector functions $u = (u_1, u_2, u_3)$ whose components belong to the Sobolev space of order $s$ and by $\|\cdot\|_s$ the corresponding norm. In the case $s = 0$, we shall write $L^2 = L^2(D, \mathbb{R}^3)$ and $\|\cdot\|$, respectively. If $s > 1/2$, then $H_0^s(D, \mathbb{R}^3)$ stands for the space of functions $u \in H^s$ vanishing on $\partial D$. Let

$$H = \{u \in L^2(D, \mathbb{R}^3) : \operatorname{div} u = 0 \text{ in } D, \ (u, n)|_{\partial D} = 0\},$$

where $n$ is the outward unit normal to $\partial D$. Introduce the spaces

$$V = H_0^1(D, \mathbb{R}^3) \cap H, \quad U = H^2(D, \mathbb{R}^3) \cap V,$$

endowed with the norms $\|\cdot\|_1$ and $\|\cdot\|_2$, respectively. Let $\Pi : L^2 \to H$ be the Leray projection, that is, the orthogonal projection in $L^2$ onto the closed subspace $H$. The following result is a straightforward consequence of the Hodge–Kodaira decomposition, elliptic regularity, and complex interpolation (for instance, see [Soh01]).

Proposition 1.1. The projection $\Pi$ satisfies the inequality

$$\|\Pi u\|_s \le C \|u\|_s \quad \text{for any } u \in H^s(D, \mathbb{R}^3),$$

where $0 \le s \le 2$ and $C > 0$ is a constant not depending on $u(x)$ and $s$.
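For orientation, in the periodic setting of the Introduction (a torus rather than the bounded domain $D$ considered here), the Leray projection reduces to a Fourier multiplier: each Fourier coefficient $\hat u(k)$ is replaced by $\hat u(k) - k\,(k \cdot \hat u(k))/|k|^2$. The following Python sketch illustrates only this special case. It assumes NumPy, a uniform grid with an odd number of points per direction (so that there is no unpaired Nyquist mode), and the hypothetical helper names `leray_project` and `divergence`; it is not the construction used for general domains.

```python
import numpy as np


def _wavevectors(n):
    """Integer wavevectors k = (k1, k2, k3) on an n x n x n periodic grid."""
    k1 = np.fft.fftfreq(n, d=1.0 / n)
    return np.stack(np.meshgrid(k1, k1, k1, indexing="ij"))


def leray_project(u):
    """Project a periodic field u of shape (3, n, n, n), n odd, onto
    divergence-free fields by removing the gradient part mode by mode."""
    k = _wavevectors(u.shape[1])
    k2 = np.sum(k * k, axis=0)
    k2[0, 0, 0] = 1.0  # the mean mode has k = 0 and is left unchanged
    u_hat = np.fft.fftn(u, axes=(1, 2, 3))
    k_dot_u = np.sum(k * u_hat, axis=0)
    u_hat = u_hat - k * k_dot_u / k2
    return np.real(np.fft.ifftn(u_hat, axes=(1, 2, 3)))


def divergence(u):
    """Spectral divergence of a periodic field of shape (3, n, n, n)."""
    k = _wavevectors(u.shape[1])
    u_hat = np.fft.fftn(u, axes=(1, 2, 3))
    return np.real(np.fft.ifftn(1j * np.sum(k * u_hat, axis=0)))
```

On such a grid the projected field has zero spectral divergence up to rounding, applying the projection twice changes nothing, and the discrete $L^2$ norm does not increase, which corresponds to the case $s = 0$ of Proposition 1.1 with $C = 1$.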


1.2. Parabolic semigroups generated by the Stokes operator. Let $L$ be the Stokes operator, that is, the operator $-\Pi\Delta$ with the domain $U$. It is well known that $L$ is a positive self-adjoint operator in $H$ with discrete spectrum (for instance, see [CF88, Chap. 4]). We shall use sometimes the following equivalent norms on $U$ and $V$:

$$\|u\|_U = (Lu, Lu)^{1/2}, \quad \|u\|_V = (Lu, u)^{1/2},$$

where $(\cdot, \cdot)$ stands for the scalar product in $L^2$. Consider the problem

$$\dot u + Lu = h(t), \tag{1.1}$$
$$u(0) = u_0. \tag{1.2}$$

For any $T > 0$, we set $J_T = [0, T]$ and define the space $\mathcal{X}_T = C(J_T, V) \cap L^2(J_T, U)$ endowed with the norm

$$\|u\|_{\mathcal{X}_T} = \max_{0 \le t \le T} \|u(t)\|_V + \Big(\int_0^T \|u(t)\|_U^2 \, dt\Big)^{1/2}.$$

The following result is a consequence of the above-mentioned properties of $L$ (for instance, see [Hen81, Sect. 1.3]).

Proposition 1.2. For any $h \in L^2(J_T, H)$ and $u_0 \in V$, problem (1.1), (1.2) has a unique solution $u \in \mathcal{X}_T$, which satisfies the inequality

$$\|u(t)\|_V^2 + \int_0^t \|u(s)\|_U^2 \, ds \le \|u_0\|_V^2 + \int_0^t \|h(s)\|^2 \, ds, \quad t \in J_T. \tag{1.3}$$

We now consider the projection of problem (1.1), (1.2) to a subspace of finite codimension. Let $E \subset U$ be a finite-dimensional subspace and let $E^\perp$ be its orthogonal complement in $H$. We denote by $P_E$ and $Q_E$ the orthogonal projections in $H$ onto the subspaces $E$ and $E^\perp$, respectively. Consider the problem

$$\dot w + L_E w = f(t), \tag{1.4}$$
$$w(0) = w_0, \tag{1.5}$$

where $L_E = Q_E L$. Define the space

$$\mathcal{X}_T(E) := C(J_T, V \cap E^\perp) \cap L^2(J_T, U \cap E^\perp),$$

endowed with the norm $\|\cdot\|_{\mathcal{X}_T}$.

Proposition 1.3. For any $f \in L^2(J_T, E^\perp)$ and $w_0 \in V \cap E^\perp$, problem (1.4), (1.5) has a unique solution $w \in \mathcal{X}_T(E)$, which satisfies the inequality

$$\|w\|_{\mathcal{X}_T} \le C \big(\|w_0\|_V + \|f\|_{L^2(J_T, H)}\big), \tag{1.6}$$

where $C > 0$ is a constant depending only on $E$ and $T$.


Proof. Step 1. We first prove the uniqueness of solution. To this end, suppose that $w \in \mathcal{X}_T(E)$ is a solution of problem (1.4), (1.5) with $f = 0$ and $w_0 = 0$. Then

$$\frac{d}{dt}\|w(t)\|^2 = 2(w(t), \dot w(t)) = -2(w(t), L_E w(t)) \le 0,$$

whence it follows that $w \equiv 0$.

Step 2. We now prove the existence of solution. Without loss of generality, we shall assume that $T > 0$ is sufficiently small; the general case can be reduced to the former by iteration. Let us set $\mathcal{Y} = L^2(J_T, E^\perp)$. We claim that there is a continuous operator $S_E : \mathcal{Y} \to \mathcal{Y}$ with the following property: if $u \in \mathcal{X}_T$ is the solution of problem (1.1), (1.2) with $h = S_E f$ and $u_0 = w_0$, then the function $Q_E u$ belongs to $\mathcal{X}_T(E)$ and satisfies Eqs. (1.4), (1.5). If this assertion is proved, then inequality (1.6) is a straightforward consequence of (1.3).

To construct the operator $S_E$, suppose that $u \in \mathcal{X}_T$ is a solution of (1.1), (1.2) with $u_0 = w_0$ and some function $h \in \mathcal{Y}$ and let $w = Q_E u$. Since $E \subset U$ and $\dim E < \infty$, the projection $Q_E$ is continuous in the spaces $U$ and $V$. This implies that $w \in \mathcal{X}_T(E)$. Moreover, it follows from (1.2) that (1.5) holds. Applying $Q_E$ to (1.1), we derive

$$\dot w + L_E w = h - Q_E L P_E u.$$

Thus, $w$ is a solution of (1.4) if and only if

$$h - Q_E L P_E u = f. \tag{1.7}$$

Let us denote by $K : L^2(J_T, H) \to \mathcal{X}_T$ the operator that takes each function $h$ to the solution in $\mathcal{X}_T$ of problem (1.1), (1.2) with $u_0 = 0$:

$$K h(t) = \int_0^t e^{-(t-s)L} h(s) \, ds. \tag{1.8}$$

Then the solution of (1.1), (1.2) with $u_0 = w_0$ can be written in the form

$$u = v + K h, \quad v(t) = e^{-tL} w_0.$$

Substituting this expression for $u$ in the left-hand side of (1.7) and denoting by $I$ the identity operator, we obtain the following functional equation for $h$:

$$(I - Q_E L P_E K) h = f + Q_E L P_E v.$$

The right-hand side of this equation belongs to $\mathcal{Y}$. Therefore, the required assertion will be established if we show that

$$\|Q_E L P_E K\|_{\mathcal{L}(\mathcal{Y})} \le \frac{1}{2} \quad \text{for sufficiently small } T > 0, \tag{1.9}$$

where $\mathcal{L}(\mathcal{Y})$ stands for the space of continuous linear operators in $\mathcal{Y}$.
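The mechanism behind the smallness in (1.9) can be checked numerically on a single eigenmode. As a minimal sketch, assume that $L$ acts as multiplication by one eigenvalue $\alpha > 0$, so that (1.8) becomes the scalar convolution $(Kh)(t) = \int_0^t e^{-\alpha(t-s)} h(s)\,ds$; by Young's inequality its norm in $L^2(0, T)$ is at most $\int_0^T e^{-\alpha s}\,ds \le T$. The function names `duhamel` and `l2_norm`, the uniform grid, and the left-endpoint quadrature below are illustrative assumptions and are not part of the proof.

```python
import numpy as np


def duhamel(h, alpha, T):
    """Left-endpoint Riemann-sum approximation of
    (K h)(t) = int_0^t exp(-alpha (t - s)) h(s) ds
    on a uniform grid of [0, T] with len(h) points."""
    n = len(h)
    dt = T / n
    kernel = np.exp(-alpha * dt * np.arange(n))
    # Causal discrete convolution, truncated to the first n grid points.
    return dt * np.convolve(kernel, h)[:n]


def l2_norm(f, T):
    """Discrete L^2(0, T) norm on the same uniform grid."""
    return np.sqrt(T / len(f) * np.sum(np.asarray(f) ** 2))
```

For any sampled $h$, the ratio `l2_norm(duhamel(h, alpha, T), T) / l2_norm(h, T)` stays below $T$ (a discrete form of Young's inequality), so composing $K$ with a bounded operator gives a norm that tends to zero with $T$.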


Step 3. Let us prove (1.9). Since $\|e^{-tL}\|_{\mathcal{L}(H)} = e^{-\alpha_1 t}$, where $\alpha_1 > 0$ is the first eigenvalue of $L$, it follows from (1.8) that

$$\|K h\|_{\mathcal{Y}} \le C_1 T \|h\|_{\mathcal{Y}}, \tag{1.10}$$

where $C_1 > 0$ does not depend on $T$. Using again the fact that $E \subset U$ is finite-dimensional, we see that

$$\|Q_E L P_E g\| \le C_2 \|g\| \quad \text{for any } g \in H, \tag{1.11}$$

where $C_2 > 0$ depends only on $E$. Combining (1.10) and (1.11), we arrive at (1.9). The proof of Proposition 1.3 is complete.

Remark 1.4. It is clear that inequality (1.6) remains valid if we replace $T$ by any $T' < T$, and the corresponding constant $C$ in the right-hand side will be independent of $T'$. In particular, we obtain the estimate

$$\|w(t)\|_V^2 + \int_0^t \|w(s)\|_U^2 \, ds \le C \Big(\|w_0\|_V^2 + \int_0^t \|f(s)\|^2 \, ds\Big), \quad t \in J_T, \tag{1.12}$$

where $C > 0$ does not depend on $w_0$ and $f$.

We now consider a particular case of (1.4) in which $E$ is a subspace generated by eigenfunctions of the Stokes operator. Let $\{e_j\}$ be an orthonormal basis in $H$ formed of the eigenfunctions of $L$ and let $\{\alpha_j\}$ be the corresponding sequence of eigenvalues indexed in an increasing order. Let us denote by $H_N$ the vector space spanned by $e_1, \dots, e_N$ and by $H_N^\perp$ its orthogonal complement in $H$. We write $P_N$ and $Q_N$ for the orthogonal projections in $H$ onto the subspaces $H_N$ and $H_N^\perp$, respectively. In what follows, we shall need a refinement of inequality (1.6) for the case $E = H_N$.

Corollary 1.5. Suppose that the conditions of Proposition 1.3 are fulfilled with $E = H_N$, where $N \ge 1$ is an integer, and that $f \in L^2(J_T, H^r)$ for some $r \in (0, 1/2)$. Then there is a constant $C > 0$ not depending on $N$ and $r$ such that the solution $w \in \mathcal{X}_T(H_N)$ of problem (1.4), (1.5) with $w_0 = 0$ satisfies the inequality

$$\|w\|_{\mathcal{X}_T} \le C \alpha_{N+1}^{-r/2} \|f\|_{L^2(J_T, H^r)}. \tag{1.13}$$

Proof. Let $D(L^r)$ be the domain of the operator $L^r$:

$$D(L^r) = \Big\{u = \sum_{j=1}^{\infty} u_j e_j \in H : \sum_{j=1}^{\infty} \alpha_j^{2r} u_j^2 < \infty\Big\}.$$

It is well known that (see [Tay97, Chap. 17])

$$D(L^{r/2}) = H^r \cap H \quad \text{for } r \in (0, 1/2). \tag{1.14}$$

Therefore, using the Poincaré inequality and the fact that $f(t) \in H_N^\perp$ for almost every $t \in J_T$, we derive

$$\|f(t)\|_r \ge C_1 \|L^{r/2} f(t)\| \ge C_1 \alpha_{N+1}^{r/2} \|f(t)\| \quad \text{for almost every } t \in J_T.$$

Combining this with (1.6), we arrive at (1.13).
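The second inequality in the last display is the spectral form of the Poincaré inequality: if $f = \sum_{j > N} f_j e_j$, then $\|L^{r/2} f\|^2 = \sum_{j > N} \alpha_j^{r} f_j^2 \ge \alpha_{N+1}^{r} \|f\|^2$. The short Python check below illustrates this on coefficient vectors. It assumes NumPy, a made-up increasing sequence $\alpha_j = j^{2/3}$ standing in for the Stokes eigenvalues, and a hypothetical helper name `fractional_norm`.

```python
import numpy as np


def fractional_norm(coeffs, eigenvalues, s):
    """Norm of L^s f in H for f = sum_j coeffs[j] e_j, where {e_j} is an
    orthonormal eigenbasis and L e_j = eigenvalues[j] e_j."""
    coeffs = np.asarray(coeffs, dtype=float)
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    return np.sqrt(np.sum(eigenvalues ** (2 * s) * coeffs ** 2))
```

With zero-based indexing, a vector whose first $N$ coefficients vanish represents an element of $H_N^\perp$, and `eigenvalues[N]` plays the role of $\alpha_{N+1}$; the quantity `fractional_norm(c, eigenvalues, r / 2)` is then bounded below by `eigenvalues[N] ** (r / 2)` times the Euclidean norm of `c`.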


1.3. Linearised Navier–Stokes system. For any $u, v \in H^2$, we have $(u, \nabla)v \in L^2$, and therefore we can define a bilinear operator by the formula

$$B(u, v) = \Pi\{(u, \nabla)v\}. \tag{1.15}$$

The following proposition, which establishes some continuity properties for $B$, can be proved with the help of Proposition 1.1, Sobolev embedding theorems, and interpolation inequalities (cf. [CF88, Chap. 6]).

Proposition 1.6. There are positive constants $C_1$ and $C_2$ such that, for any $u, v \in H^2$, we have

$$\|B(u, v)\| \le C_1 \min\big\{(\|u\|_1 \|u\|_2)^{1/2} \|v\|_1, \ (\|v\|_1 \|v\|_2)^{1/2} \|u\|_1\big\}, \tag{1.16}$$
$$\|B(u, v)\|_1 \le C_2 (\|u\|_1 \|u\|_2)^{1/2} \|v\|_2. \tag{1.17}$$

In particular, the function $B(u) = B(u, u)$ is continuous from $H^2$ to $H^1 \cap H$.

We now fix a finite-dimensional subspace $E \subset H$ and consider the equation

$$\dot w + L_E w + Q B(v_1(t), w) + Q B(w, v_2(t)) = f(t), \tag{1.18}$$

where $v_1$ and $v_2$ are given functions and $Q$ denotes the orthogonal projection in $H$ onto $E^\perp$.

Proposition 1.7. For any functions $v_1, v_2 \in L^4(J_T, V)$, $f \in L^2(J_T, E^\perp)$ and $w_0 \in V \cap E^\perp$, problem (1.18), (1.5) has a unique solution $w \in \mathcal{X}_T(E)$. Moreover, there is a constant $C > 0$ depending only on $\max\{\|v_i\|_{L^4(J_T, V)}, \ i = 1, 2\}$ such that

$$\|w\|_{\mathcal{X}_T} \le C \big(\|w_0\|_V + \|f\|_{L^2(J_T, H)}\big). \tag{1.19}$$

Proof. Step 1. Let us show that if $w \in \mathcal{X}_T(E)$ is a solution of (1.18), (1.5), then it satisfies inequality (1.19). This will imply, in particular, the uniqueness of solution. It follows from (1.16) that the function

$$\hat f(t) = f(t) - Q\big(B(v_1(t), w(t)) + B(w(t), v_2(t))\big)$$

belongs to the space $L^2(J_T, E^\perp)$ and satisfies the inequality

$$\|\hat f(t)\|^2 \le 2\|f(t)\|^2 + C_1 \big(\|v_1(t)\|_1^2 + \|v_2(t)\|_1^2\big) \|w(t)\|_V \|w(t)\|_U \le 2\|f(t)\|^2 + \delta \|w(t)\|_U^2 + C_2 \big(\|v_1(t)\|_1^4 + \|v_2(t)\|_1^4\big) \|w(t)\|_V^2$$

for any $t \in J_T$, where $\delta > 0$ is an arbitrary constant and $C_2 > 0$ depends only on $\delta$. Combining this with inequality (1.12) in which $f$ is replaced by $\hat f$ and choosing $\delta > 0$ sufficiently small, we obtain

$$\|w(t)\|_V^2 + \frac{1}{2} \int_0^t \|w(s)\|_U^2 \, ds \le C_3 \Big(\|w_0\|_V^2 + \int_0^t \big(\|v_1\|_1^4 + \|v_2\|_1^4\big) \|w\|_V^2 \, ds + \|f\|_{L^2(J_T, H)}^2\Big). \tag{1.20}$$

Ignoring the integral on the left-hand side and applying the Gronwall inequality, we obtain

$$\sup_{0 \le t \le T} \|w(t)\|_V \le C_4 \big(\|w_0\|_V + \|f\|_{L^2(J_T, H)}\big).$$

Combining this with (1.20), we obtain a similar upper bound for $\|w\|_{L^2(J_T, U)}$.


Step 2. We now construct a solution with the help of the contraction mapping principle. Namely, we shall prove the following assertion: there is a constant $\varepsilon > 0$ such that if $v_i \in L^4(J_S, H^1)$, $i = 1, 2$, for some $S \le T$ and

$$\|v_1\|_{L^4(J_S, H^1)} + \|v_2\|_{L^4(J_S, H^1)} \le \varepsilon, \tag{1.21}$$

then for any $w_0 \in V \cap E^\perp$ and $f \in L^2(J_S, E^\perp)$ problem (1.18), (1.5) has a solution $w \in \mathcal{X}_S(E)$. Once this claim is established, existence of the solution on $J_T$ will follow by a simple iteration argument.

Let us consider an operator $F$ that takes each function $\tilde w \in \mathcal{X}_S(E)$ to the solution of the equation

$$\dot w + L_E w = g, \quad g := f - Q\big(B(v_1, \tilde w) + B(\tilde w, v_2)\big), \tag{1.22}$$

supplemented with the initial condition (1.5). Using Proposition 1.6, we easily show that $g \in L^2(J_S, E^\perp)$. Thus, in view of Proposition 1.3, the operator $F$ is well defined. Let us show that $F$ is a contraction. Indeed, if $\tilde w_i \in \mathcal{X}_S(E)$, $i = 1, 2$, and $w_i = F(\tilde w_i)$, then the function $w = w_1 - w_2$ is a solution of problem (1.4), (1.5) with

$$f = -Q\big(B(v_1, \tilde w) + B(\tilde w, v_2)\big),$$

where $\tilde w = \tilde w_1 - \tilde w_2$. Repeating literally the arguments used in Step 1, we can show that

$$\|f\|_{L^2(J_S, E^\perp)} \le C_5 \|\tilde w\|_{\mathcal{X}_S} \big(\|v_1\|_{L^4(J_S, H^1)} + \|v_2\|_{L^4(J_S, H^1)}\big).$$

Proposition 1.3 and Remark 1.4 imply that if (1.21) is satisfied, then

$$\|w\|_{\mathcal{X}_S} = \|F(\tilde w_1) - F(\tilde w_2)\|_{\mathcal{X}_S} \le C_6 \, \varepsilon \, \|\tilde w_1 - \tilde w_2\|_{\mathcal{X}_S}.$$

It follows that $F$ is a contraction for sufficiently small $\varepsilon$. Its unique fixed point $w \in \mathcal{X}_S(E)$ is a solution of problem (1.18), (1.5). The proof is complete.

1.4. Strong solutions of the Navier–Stokes system. In this subsection, we establish two perturbative results on solvability of the 3D Navier–Stokes system. Let us fix a finite-dimensional subspace $E \subset U$ and consider the problem

$$\dot w + L_E w + Q\big(B(w) + B(v, w) + B(w, v)\big) = f(t), \tag{1.23}$$
$$w(0) = w_0, \tag{1.24}$$

where $v \in L^4(J_T, H^1)$, $f \in L^2(J_T, E^\perp)$, and $w_0 \in V \cap E^\perp$ are given functions.

Theorem 1.8. For any $R > 0$ there are positive constants $\varepsilon$ and $C$ such that the following assertions hold.

(i) Let $\hat v \in L^4(J_T, H^1)$, $\hat f \in L^2(J_T, E^\perp)$, and $\hat w_0 \in V \cap E^\perp$ be some functions such that problem (1.23), (1.24) with $v = \hat v$, $f = \hat f$, $w_0 = \hat w_0$ has a solution $\hat w \in \mathcal{X}_T(E)$. Suppose that

$$\|\hat v\|_{L^4(J_T, H^1)} \le R, \quad \|\hat f\|_{L^2(J_T, E^\perp)} \le R, \quad \|\hat w\|_{\mathcal{X}_T} \le R. \tag{1.25}$$

Then, for any triple $(v, f, w_0)$ satisfying the inequalities

$$\|v - \hat v\|_{L^4(J_T, H^1)} \le \varepsilon, \quad \|f - \hat f\|_{L^2(J_T, E^\perp)} \le \varepsilon, \quad \|w_0 - \hat w_0\|_V \le \varepsilon, \tag{1.26}$$

problem (1.23), (1.24) has a unique solution $w \in \mathcal{X}_T(E)$.


(ii) Let $\mathcal{R} : L^4(J_T, H^1) \times L^2(J_T, E^\perp) \times (V \cap E^\perp) \to \mathcal{X}_T(E)$ be an operator that is defined on the set of functions $(v, f, w_0)$ satisfying (1.26) and takes each triple $(v, f, w_0)$ to the solution $w \in \mathcal{X}_T(E)$ of (1.23), (1.24). Then $\mathcal{R}$ is uniformly Lipschitz continuous, and its Lipschitz constant does not exceed $C$.

Proof. We shall use a refined version of the implicit function theorem (IFT). Its exact formulation is given in the Appendix (see Sect. 4.1).

Step 1. In view of Proposition 1.3, problem (1.4), (1.5) is well posed in the space $\mathcal{X}_T(E)$. Therefore, we can define continuous operators

$$K_E : L^2(J_T, E^\perp) \to \mathcal{X}_T(E), \quad M_E : V \cap E^\perp \to \mathcal{X}_T(E) \tag{1.27}$$

by the following rule: $K_E$ takes each function $f \in L^2(J_T, E^\perp)$ to the solution $w \in \mathcal{X}_T(E)$ of problem (1.4), (1.5) with $w_0 = 0$ (cf. (1.8)) and $M_E$ takes each function $w_0 \in V \cap E^\perp$ to the solution $w \in \mathcal{X}_T(E)$ of problem (1.4), (1.5) with $f = 0$. In what follows, we shall omit the subscript $E$ to simplify notation. Let us define the spaces

$$\mathcal{H} = L^4(J_T, H^1) \times L^2(J_T, E^\perp) \times (V \cap E^\perp), \quad \mathcal{Y} = L^2(J_T, E^\perp).$$

We seek a solution of (1.23), (1.24) in the form

$$w = M w_0 + K g, \tag{1.28}$$

where $g \in \mathcal{Y}$ is an unknown function. Substituting (1.28) for $w$ in (1.23), we obtain the following functional equation in the space $\mathcal{Y}$:

$$g + Q\big(B(M w_0 + K g) + B(v, M w_0 + K g) + B(M w_0 + K g, v)\big) - f = 0. \tag{1.29}$$

Let us set $u = (v, f, w_0)$ and denote by $F(u, g)$ the left-hand side of (1.29). It is a matter of direct verification to show that the operator $F : \mathcal{H} \times \mathcal{Y} \to \mathcal{Y}$ is twice continuously differentiable. Furthermore, setting

$$\hat u = (\hat v, \hat f, \hat w_0), \quad \hat g = \hat f - Q\big(B(\hat w) + B(\hat v, \hat w) + B(\hat w, \hat v)\big), \tag{1.30}$$

we see that $F(\hat u, \hat g) = 0$. In view of Proposition 4.1, the desired assertion will be established if we show that for any $R > 0$ there is $\rho(R) > 0$ such that the following three statements hold for any $(\hat v, \hat f, \hat w_0, \hat w)$ satisfying (1.25).

(a) The functions $\hat u$ and $\hat g$ defined by (1.30) satisfy the inequality $\|\hat u\|_{\mathcal{H}} + \|\hat g\|_{\mathcal{Y}} \le \rho(R)$.

(b) The norm of the second derivative of $F$ is bounded uniformly in $(u, g)$.

(c) Let $F'(u, g)$ be the derivative of $F$ with respect to $g$. Then $F'(\hat u, \hat g)$ is an invertible linear operator in $\mathcal{Y}$, and its norm satisfies the inequality

$$\big\|\big(F'(\hat u, \hat g)\big)^{-1}\big\|_{\mathcal{L}(\mathcal{Y})} \le \rho(R).$$


Step 2. To prove (a), we first note that (1.16) implies the inequality

$$\|B(a, b)\|_{\mathcal{Y}} + \|B(b, a)\|_{\mathcal{Y}} \le C_1 \|a\|_{L^4(J_T, H^1)} \|b\|_{\mathcal{X}_T}. \tag{1.31}$$

It follows that

$$\|\hat g\|_{\mathcal{Y}} \le \|\hat f\|_{\mathcal{Y}} + C_2 \big(\|\hat w\|_{L^4(J_T, H^1)} + \|\hat v\|_{L^4(J_T, H^1)}\big) \|\hat w\|_{\mathcal{X}_T} \le C_3(R).$$

A similar inequality for $\hat u$ is obvious.

Step 3. The definition of $F$ implies that the operator $F_1(u, g) = F(u, g) - g + f$ is a sum of bilinear forms with respect to $(v, g, w_0)$. Therefore, the second derivative of $F$ coincides with the symmetrization of $F_1$. Thus, to prove (b), it suffices to show that $F_1$ is continuous in appropriate functional spaces. This assertion is a straightforward consequence of (1.31) and the continuity of operators (1.27).

Step 4. We now prove (c). Let us set $a = M \hat w_0 + K \hat g \in \mathcal{X}_T$. We wish to show that for any $\xi \in \mathcal{Y}$ the equation

$$F'(\hat u, \hat g) h := h + Q B(a + \hat v, K h) + Q B(K h, a + \hat v) = \xi$$

has a unique solution $h \in \mathcal{Y}$, whose norm satisfies the inequality

$$\|h\|_{\mathcal{Y}} \le \rho(R) \, \|\xi\|_{\mathcal{Y}}. \tag{1.32}$$

Setting $\zeta = K h$, we obtain the following problem for $\zeta \in \mathcal{X}_T(E)$:

$$\dot\zeta + L_E \zeta + Q B(a(t) + \hat v(t), \zeta) + Q B(\zeta, a(t) + \hat v(t)) = \xi, \quad \zeta(0) = 0.$$

In view of Proposition 1.7, this problem has a unique solution $\zeta \in \mathcal{X}_T(E)$, which satisfies the inequality

$$\|\zeta\|_{\mathcal{X}_T} \le C_4(R) \, \|\xi\|_{\mathcal{Y}}. \tag{1.33}$$

Since

$$h = \dot\zeta + L_E \zeta = \xi - Q\big(B(a(t) + \hat v(t), \zeta) + B(\zeta, a(t) + \hat v(t))\big),$$

inequality (1.32) follows from (1.33) and (1.31). The proof of Theorem 1.8 is complete.

Remark 1.9. In Sect. 3.3, we shall need to consider perturbations of an equation of the form

$$\dot u + L(u + \zeta) + B(u + \zeta) + B(u + \zeta, v) + B(v, u + \zeta) = g(t), \tag{1.34}$$

where $\zeta \in L^4(J_T, H^2)$, $v \in L^4(J_T, H^1)$, and $g \in L^2(J_T, H)$. In this case, we have a result similar to Theorem 1.8. Namely, for any $R > 0$ there are positive constants $\varepsilon$ and $C$ such that the following assertions hold.


(i) Let $\zeta \in L^4(J_T, H^2)$, $\hat v \in L^4(J_T, H^1)$, and $\hat g \in L^2(J_T, E^\perp)$ be some functions such that problem (1.34), (1.2) with $v = \hat v$ and $g = \hat g$ has a solution $\hat u \in X_T$. Suppose that

\[ \|\zeta\|_{L^4(J_T, H^2)} \le R, \quad \|\hat v\|_{L^4(J_T, H^1)} \le R, \quad \|\hat g\|_{L^2(J_T, H)} \le R, \quad \|\hat u\|_{X_T} \le R. \]

Then, for any pair $(v, g)$ satisfying the inequalities

\[ \|v - \hat v\|_{L^4(J_T, H^1)} \le \varepsilon, \qquad \|g - \hat g\|_{L^2(J_T, H)} \le \varepsilon, \tag{1.35} \]

problem (1.34), (1.2) has a unique solution $u \in X_T$.

(ii) Let $\mathcal{R} : L^4(J_T, H^1) \times L^2(J_T, H) \to X_T$ be the resolving operator that takes each pair $(v, g)$ satisfying (1.35) to the solution $u \in X_T$ of (1.34), (1.2). Then $\mathcal{R}$ is uniformly Lipschitz continuous, and its Lipschitz constant does not exceed $C$.

To prove the above assertions, it suffices to rewrite (1.34) in the form

\[ \dot u + Lu + B(u) + B(u, v + \zeta) + B(v + \zeta, u) = f := g - L\zeta - B(\zeta, v) - B(v, \zeta) - B(\zeta, \zeta) \]

and to apply Theorem 1.8. We shall not dwell on the details.

We now consider problem (1.23), (1.24) in which $E = H_N$.

Proposition 1.10. For any $R > 0$ there is an integer $N_0 \ge 1$ such that if $N \ge N_0$ and functions $v \in X_T$, $f \in L^2(J_T, H_N^\perp)$, and $w_0 \in H_N^\perp \cap V$ satisfy the inequalities

\[ \|v\|_{X_T} \le R, \qquad \|f\|_{L^2(J_T, H)} \le R, \qquad \|w_0\|_V \le R, \tag{1.36} \]

then problem (1.23), (1.24) with $E = H_N$ has a unique solution $w \in X_T(H_N)$, which satisfies the inequality

\[ \|w\|_{X_T} \le C(R), \tag{1.37} \]

where the constant $C(R) > 0$ depends only on $R$.

Proof. The uniqueness of solution can be established by a standard argument. We shall use the contraction mapping principle to construct a solution. Let us denote by $B_\rho$ the set of functions $\tilde w \in X_T(H_N)$ such that $\|\tilde w\|_{X_T} \le \rho$ and $\tilde w(0) = w_0$. Consider an operator $F : B_\rho \to X_T(H_N)$ that takes each function $\tilde w \in B_\rho$ to the solution $w \in X_T(H_N)$ of the problem

\[ \dot w + L_N w = Q_N\bigl(f - B(v + \tilde w) + B(v)\bigr), \tag{1.38} \]
\[ w(0) = w_0, \tag{1.39} \]

where $L_N = Q_N L$. We claim that for any $R > 0$ there is a constant $\rho > 0$ and an integer $N_0 \ge 1$ such that $F$ is a contraction of the set $B_\rho$ into itself for $N \ge N_0$.

Step 1. We first show that $F(B_\rho) \subset B_\rho$ for an appropriate choice of $\rho$ and sufficiently large $N$. Let us fix any $r \in (0, 1/2)$. In view of Proposition 1.3 and Corollary 1.5, the solution $w \in X_T(H_N)$ of (1.38), (1.39) satisfies the inequality

\[ \|w\|_{X_T} \le C_1\bigl(\|w_0\|_V + \|f\|_{L^2(J_T, H)}\bigr) + C_2\, \alpha_{N+1}^{-r/2}\, \|B(v + \tilde w) - B(v)\|_{L^2(J_T, H^r)}. \tag{1.40} \]


It follows from (1.16), (1.17), and an interpolation inequality that

\[ \|B(v + \tilde w) - B(v)\|_r \le \|B(v + \tilde w) - B(v)\|_{1/2} \le C_3\bigl(\|v + \tilde w\|_1 \|v + \tilde w\|_2 + \|v\|_1 \|v\|_2\bigr), \]

whence we see that

\[ \|B(v + \tilde w) - B(v)\|_{L^2(J_T, H^r)} \le C_4\bigl(\|v\|_{X_T}^2 + \|\tilde w\|_{X_T}^2\bigr). \tag{1.41} \]

Combining (1.40) and (1.41), we derive

\[ \|F(\tilde w)\|_{X_T} \le C_5(R) + C_6\, \alpha_{N+1}^{-r/2} \rho^2. \]

Hence, if $\rho = 2C_5(R)$ and $N$ is so large that $\alpha_{N+1} \ge (4C_6 C_5(R))^{2/r}$, then $F(B_\rho) \subset B_\rho$. Indeed, for this choice we have $C_6\, \alpha_{N+1}^{-r/2} \rho^2 \le C_5(R)$, so that $\|F(\tilde w)\|_{X_T} \le 2C_5(R) = \rho$.

Step 2. We now show that $F$ is a contraction. Indeed, if $\tilde w_1, \tilde w_2 \in B_\rho$ and $w_i = F(\tilde w_i)$, $i = 1, 2$, then the difference $w = w_1 - w_2 \in X_T(H_N)$ is a solution of problem (1.4), (1.5) with $E = H_N$, $w_0 = 0$ and

\[ f = Q_N\bigl(B(v + \tilde w_2) - B(v + \tilde w_1)\bigr) = Q_N\bigl(B(\hat u_2, \tilde w) + B(\tilde w, \hat u_1)\bigr), \]

where $\hat u_i = v + \tilde w_i$, $i = 1, 2$, and $\tilde w = \tilde w_2 - \tilde w_1$. It follows from (1.16) and (1.17) that

\[ \|f\| \le C_7\bigl((\|\hat u_1\|_1 \|\hat u_1\|_2)^{1/2} + (\|\hat u_2\|_1 \|\hat u_2\|_2)^{1/2}\bigr)\, \|\tilde w\|_1, \]
\[ \|f\|_1 \le C_8\bigl((\|\hat u_2\|_1 \|\hat u_2\|_2)^{1/2}\, \|\tilde w\|_2 + (\|\tilde w\|_1 \|\tilde w\|_2)^{1/2}\, \|\hat u_1\|_2\bigr). \]

Combining these estimates with an interpolation inequality, we see that

\[ \int_0^T \|f(t)\|_{1/2}^2\, dt \le C_9\, \|\tilde w\|_{X_T}^2 \bigl(\|\hat u_1\|_{X_T}^2 + \|\hat u_2\|_{X_T}^2\bigr) \le C_{10}\,(R^2 + \rho^2)\, \|\tilde w_1 - \tilde w_2\|_{X_T}^2. \]

Applying Corollary 1.5, we arrive at the inequality

\[ \|F(\tilde w_1) - F(\tilde w_2)\|_{X_T} = \|w\|_{X_T} \le C\, \alpha_{N+1}^{-r/2}\, \|f\|_{L^2(J_T, H^r)} \le C_{11}\, \alpha_{N+1}^{-r/2} (R^2 + \rho^2)^{1/2}\, \|\tilde w_1 - \tilde w_2\|_{X_T}. \]

It follows that the operator $F$ is a contraction for sufficiently large $N$ and, hence, has a unique fixed point $w \in B_\rho$, which is a solution of (1.23), (1.24). Since $\rho = 2C_5(R)$, we see that $w$ satisfies (1.37). The proof is complete.

2. Main Results

In this section, we present our main results on approximate controllability of NS equations. To simplify notation, we shall confine ourselves to the case $\nu = 1$. All the results are valid for any positive viscosity, and the proofs remain literally the same.

2.1. Approximate controllability.

Let $L^2_{loc}(\mathbb{R}_+, H)$ be the space of measurable functions $h : \mathbb{R}_+ \to H$ whose restriction to any interval $J_T$ belongs to $L^2(J_T, H)$. Consider the controlled Navier–Stokes system


\[ \dot u + Lu + B(u) = h(t) + \eta(t), \tag{2.1} \]
\[ u(0) = u_0, \tag{2.2} \]

where $h \in L^2_{loc}(\mathbb{R}_+, H)$ and $u_0 \in V$ are given functions and $\eta(t)$ is a control taking on values in a finite-dimensional subspace $E \subset U$. Let us recall the concept of approximate controllability.

Definition 2.1. Let $T > 0$ be a constant. Equation (2.1) is said to be approximately controllable in time $T$ if for any $\varepsilon > 0$ and any points $u_0, \hat u \in V$ there is a control function $\eta \in L^\infty(J_T, E)$ and a solution $u \in X_T = C(J_T, V) \cap L^2(J_T, U)$ of problem (2.1), (2.2) such that

\[ \|u(T) - \hat u\|_1 < \varepsilon. \tag{2.3} \]

To formulate the main result of this paper, we introduce some notation. In view of Proposition 1.6, the nonlinear operator $B(u)$ is continuous from $U$ to $H^1 \cap H$. For any finite-dimensional subspace $G \subset U$, we denote by $F(G)$ the largest vector space $F \subset U$ such that for any $\eta_1 \in F$ there are vectors $\eta, \zeta^1, \dots, \zeta^k \in G$ and positive constants $\alpha_1, \dots, \alpha_k$ satisfying the relation

\[ \eta_1 = \eta - \sum_{j=1}^{k} \alpha_j B(\zeta^j). \]

We emphasise that the integer $k \ge 1$ may depend on $\eta_1$. It is not difficult to see that $F(G)$ is well defined and that $G \subset F(G)$. Moreover, taking into account the fact that $B(u)$ is a bilinear form on $U$, we conclude that $\dim F(G) < \infty$. We now set

\[ E_0 = E, \qquad E_k = F(E_{k-1}) \quad \text{for } k \ge 1, \qquad E_\infty = \bigcup_{k=1}^{\infty} E_k. \tag{2.4} \]

The following theorem is the main result of this paper.

Theorem 2.2. Let $h \in L^2_{loc}(\mathbb{R}_+, H)$ and let $E \subset U$ be a finite-dimensional subspace such that $E_\infty$ is dense in $H$. Then for any $T > 0$ the Navier–Stokes system (2.1) is approximately controllable in time $T$.

The proof of Theorem 2.2 is based on an auxiliary result which is of independent interest (cf. property (P) in the Introduction). To formulate it, we introduce the following definition.

Definition 2.3. Let $T$, $R$, and $\varepsilon$ be positive constants and let $E \subset U$ be a subspace. Equation (2.1) is said to be $(\varepsilon, R)$-controllable in time $T$ if for any $u_0 \in B_V(R)$ and $\hat u \in B_U(R)$ there is a control function $\eta \in L^\infty(J_T, E)$ and a solution $u \in X_T$ of problem (2.1), (2.2) such that (2.3) holds.

Theorem 2.4. Let $T$, $R$, and $\varepsilon$ be positive constants, let $E \subset U$ be a finite-dimensional subspace, let $E_1 = F(E)$, and let $h \in L^2(J_T, H)$. Then Eq. (2.1) with $\eta \in E$ is $(\varepsilon, R)$-controllable in time $T$ if and only if so is the equation

\[ \dot u + Lu + B(u) = h(t) + \eta_1(t), \qquad \eta_1 \in E_1. \tag{2.5} \]

A proof of Theorem 2.4 will be given in Sect. 3. Here we show that Theorem 2.4 implies the approximate controllability of the Navier–Stokes system and that the hypothesis of Theorem 2.2 is fulfilled for the case of a torus in $\mathbb{R}^3$.


2.2. Proof of Theorem 2.2: reduction to ε-controllability.

The required assertion will be established if we show that, for any positive constants $T$, $R$, and $\varepsilon$, Eq. (2.1) is $(\varepsilon, R)$-controllable in time $T$. From now on, we fix $T$, $R$, and $\varepsilon$ and we shall say that a system is ε-controllable if it is $(\varepsilon, R)$-controllable in time $T$.

Step 1. Recall that the subspaces $H_N$ and $H_N^\perp$ were introduced in Sect. 1.2. We first show that there is an integer $N \ge 1$ such that Eq. (2.1) with $\eta \in H_N$ is ε-controllable, and the control function $\eta \in L^\infty(J_T, H_N)$ can be chosen so that

\[ \|\eta\|_{L^\infty(J_T, H)} \le K, \tag{2.6} \]

where $K > 0$ is a constant that depends only on $R$, $T$, and $\varepsilon$. We fix arbitrary points $u_0 \in B_V(R)$ and $\hat u \in B_U(R)$ and set

\[ v_N(t) = T^{-1} P_N\bigl(t\hat u + (T - t)e^{-tL} u_0\bigr) \quad \text{for } 0 \le t \le T. \tag{2.7} \]

Note that

\[ \sup_{N \ge 1} \|v_N\|_{X_T} \le C(R, T). \tag{2.8} \]

Consider the problem

\[ \dot w + Q_N L(w + v_N) + Q_N B(w + v_N) = Q_N h(t), \qquad w(0) = Q_N u_0. \tag{2.9} \]

Since $Q_N L v_N \equiv 0$, $\|Q_N u_0\|_V \le \|u_0\|_V$, and $\|Q_N h(t)\| \le \|h(t)\|$, Proposition 1.10 and inequality (2.8) imply that problem (2.9) has a unique solution $w_N \in X_T(H_N)$ for sufficiently large $N$. It follows that the function $u_N = v_N + w_N$ belongs to the space $X_T$ and satisfies Eqs. (2.1) and (2.2) with

\[ \eta(t) = \dot v_N + P_N\bigl(L u_N + B(u_N) - h(t)\bigr). \tag{2.10} \]

Moreover, it follows from (2.7) that

\[ \|u_N(T) - \hat u\|_1 = \|Q_N(u_N(T) - \hat u)\|_1 \le \|w_N(T)\|_1 + \|Q_N \hat u\|_1. \tag{2.11} \]

The second term in the right-hand side of (2.11) goes to zero as $N \to \infty$ uniformly with respect to $\hat u \in B_U(R)$. Therefore, the ε-controllability of (2.1) with $\eta \in H_N$ will be established if we show that

\[ \sup_{u_0, \hat u} \|w_N(T)\|_1 \to 0 \quad \text{as } N \to \infty, \tag{2.12} \]

where the supremum is taken over $u_0 \in B_V(R)$, $\hat u \in B_U(R)$. To prove (2.12), we take the scalar product in $H$ of the function $2Lw_N$ and the first equation in (2.9). This results in

\[ \partial_t \|w_N\|_V^2 + 2\|w_N\|_U^2 = 2(h, Lw_N) - 2(B(u_N), Lw_N). \tag{2.13} \]


Let us estimate the right-hand side of this relation. By the Cauchy inequality and (1.16), we have

\[ |(h, Lw_N)| \le \tfrac14 \|w_N\|_U^2 + \|h\|^2, \]
\[ |(B(u_N), Lw_N)| \le \tfrac14 \|w_N\|_U^2 + \|B(u_N)\|^2 \le \tfrac14 \|w_N\|_U^2 + C_1 \|u_N\|_1^3 \|u_N\|_2. \]

Substituting these estimates into (2.13) and using the Poincaré inequality, we derive

\[ \partial_t \|w_N\|_V^2 + \alpha_{N+1} \|w_N\|_V^2 \le 2\|h\|^2 + 2C_1 \|u_N\|_1^3 \|u_N\|_2. \]

Applying the Gronwall and Cauchy–Schwarz inequalities, we obtain

\[ \|w_N(T)\|_V^2 \le e^{-\alpha_{N+1} T} \|u_0\|_V^2 + C_2 \int_0^T e^{-\alpha_{N+1}(T - s)} \bigl(\|h\|^2 + \|u_N\|_1^3 \|u_N\|_2\bigr)\, ds \]
\[ \le e^{-\alpha_{N+1} T} \|u_0\|_V^2 + C_2 \int_0^T e^{-\alpha_{N+1}(T - s)} \|h\|^2\, ds + C_3\, \alpha_{N+1}^{-1/2} \|u_N\|_{X_T}^4. \tag{2.14} \]

The first two terms on the right-hand side of (2.14) go to zero as $N \to \infty$ uniformly with respect to $u_0 \in B_V(R)$. If we show that

\[ \sup_{N \ge 1} \|u_N\|_{X_T} \le C_4(R, T), \tag{2.15} \]

then (2.12) will follow from (2.14). Inequality (2.8) and Proposition 1.10 (see (1.37)) imply that

\[ \sup_{N \ge 1} \|w_N\|_{X_T} \le C_5(R, T). \]

Combining this with (2.8), we arrive at (2.15).

Step 2. We now show that, for sufficiently large $k \ge 1$, Eq. (2.1) with $\eta \in E_k$ is ε-controllable. Indeed, let us choose an integer $N \ge 1$ and a constant $K > 0$ such that for any points $u_0 \in B_V(R)$ and $\hat u \in B_U(R)$ and an appropriate control function $\eta_N \in L^\infty(J_T, H_N)$ verifying (2.6) there is a unique solution $u_N \in X_T$ of (2.1), (2.2) with $\eta = \eta_N$, and it satisfies the inequality

\[ \|u_N(T) - \hat u\|_1 < \varepsilon/2. \tag{2.16} \]

By Theorem 1.8, there is $\delta_0 > 0$ such that, for any function $\eta \in L^\infty(J_T, H)$ verifying the condition $\|\eta - \eta_N\|_{L^\infty(J_T, H)} \le \delta_0$, problem (2.1), (2.2) has a unique solution $u \in X_T$, which satisfies the inequality

\[ \|u - u_N\|_{X_T} \le C\, \|\eta - \eta_N\|_{L^\infty(J_T, H)}. \tag{2.17} \]

Since $E_\infty$ is dense in $H$ and $H_N$ is finite-dimensional, for any $\delta > 0$ we can find $k \ge 1$ such that $B_{H_N}(K)$ is contained in the $\delta$-neighbourhood of $E_k$. It follows that for any function $\eta_N \in L^\infty(J_T, H_N)$ satisfying inequality (2.6) there is $\eta \in L^\infty(J_T, E_k)$ such that $\|\eta - \eta_N\|_{L^\infty(J_T, H)} \le \delta$. Let us choose $\delta \in (0, \delta_0)$ so small that $2C\delta < \varepsilon$. Then (2.17) and (2.16) imply that (2.3) holds, since $\|u(T) - \hat u\|_1 \le \|u - u_N\|_{X_T} + \|u_N(T) - \hat u\|_1 < C\delta + \varepsilon/2 < \varepsilon$. Thus, Eq. (2.1) with $\eta \in E_k$ is ε-controllable for a sufficiently large $k$.


Step 3. We now show that Eq. (2.1) with $\eta \in E$ is ε-controllable. Indeed, since Eq. (2.1) with $\eta \in E_k$ is ε-controllable, applying Theorem 2.4 in which $E = E_{k-1}$, we see that so is Eq. (2.1) with $\eta \in E_{k-1}$. Repeating this argument $k$ times, we arrive at the required result. The proof of Theorem 2.2 is complete.

2.3. Navier–Stokes equations on a torus.

In this subsection, we study controlled Navier–Stokes equations with periodic boundary conditions. More precisely, let us fix a vector $q = (q_1, q_2, q_3)$ with positive components and set $\mathbb{T}^3_q = \mathbb{R}^3 / 2\pi \mathbb{Z}^3_q$, where

\[ \mathbb{Z}^3_q = \{x = (x_1, x_2, x_3) \in \mathbb{R}^3 : x_i / q_i \in \mathbb{Z} \ \text{for } i = 1, 2, 3\}. \]

Consider the Navier–Stokes system

\[ \dot u + (u, \nabla)u - \nu \Delta u + \nabla p = h(t, x) + \eta(t, x), \qquad \operatorname{div} u = 0, \tag{2.18} \]

where $x = (x_1, x_2, x_3) \in \mathbb{T}^3_q$. In other words, we assume that all functions are periodic of period $2\pi q_i$ with respect to $x_i$, $i = 1, 2, 3$. To simplify notation, we shall assume, without loss of generality, that the mean values of $u$, $h$, and $\eta$ with respect to $x \in \mathbb{T}^3_q$ are zero. As in the case of a bounded domain with Dirichlet boundary condition, one can reduce (2.18) to an evolution equation in an appropriate Hilbert space. Namely, we set

\[ H = \Bigl\{u \in L^2(\mathbb{T}^3_q, \mathbb{R}^3) : \operatorname{div} u \equiv 0,\ \int_{\mathbb{T}^3_q} u(x)\, dx = 0\Bigr\} \]

and denote by $\Pi : L^2(\mathbb{T}^3_q, \mathbb{R}^3) \to H$ the orthogonal projection in $L^2(\mathbb{T}^3_q, \mathbb{R}^3)$ onto the subspace $H$. Define the spaces

\[ V = H^1(\mathbb{T}^3_q, \mathbb{R}^3) \cap H, \qquad U = H^2(\mathbb{T}^3_q, \mathbb{R}^3) \cap H, \]

endowed with the norms $\|\cdot\|_1$ and $\|\cdot\|_2$, respectively. Projecting (2.18) to the space $H$ and taking $\nu = 1$, we obtain Eq. (2.1) in which $L = -\Pi\Delta$ is the Stokes operator with the domain $D(L) = U$ and $B(u) = \Pi\{(u, \nabla)u\}$. Theorem 2.2, which was formulated for the Dirichlet boundary condition, remains valid in this case as well. Our aim is to describe explicitly a finite-dimensional subspace $E \subset U$ for which the hypothesis of Theorem 2.2 is fulfilled. To this end, we first construct an orthogonal basis in $H$ formed of the eigenfunctions of $L$. For $x, y \in \mathbb{R}^3$, let

\[ \langle x, y \rangle_q = \sum_{i=1}^{3} q_i^{-1} x_i y_i, \qquad (x, y) = \sum_{i=1}^{3} x_i y_i, \qquad |x| = \sum_{i=1}^{3} |x_i|. \]

We set $\mathbb{Z}^3_* = \mathbb{Z}^3 \setminus \{0\}$ and $\mathbb{R}^3_* = \mathbb{R}^3 \setminus \{0\}$. For $a \in \mathbb{R}^3_*$, denote by $a^\perp$ the two-dimensional subspace in $\mathbb{R}^3$ defined by the equation $\langle x, a \rangle_q = 0$. Note that $a^\perp = (-a)^\perp$. For any $m \in \mathbb{Z}^3_*$, let us choose a vector $\ell(m) \in m^\perp$ so that $\{\ell(m), \ell(-m)\}$ is an orthonormal basis in $m^\perp$ with respect to the scalar product $(\cdot, \cdot)$. We now set

\[ c_m(x) = \ell(m) \cos\langle m, x \rangle_q, \qquad s_m(x) = \ell(m) \sin\langle m, x \rangle_q \qquad \text{for } m \in \mathbb{Z}^3_*. \]

It is a matter of direct verification to show that $c_m$ and $s_m$ are eigenfunctions of $L$ and that $\{c_m, s_m, m \in \mathbb{Z}^3_*\}$ is an orthogonal basis in $H$. For a finite family of functions $A$, we denote by $\operatorname{span} A$ the vector space spanned by $A$.


Theorem 2.5. For any vector $q = (q_1, q_2, q_3)$ with positive components there is an integer $d \ge 4$ such that if $E = \operatorname{span}\{c_m, s_m, |m| \le d\}$, then the vector space $E_\infty$ defined in (2.4) is dense in $H$.

Theorems 2.2 and 2.5 imply the following result on approximate controllability of the NS system by a force of finite dimension.

Corollary 2.6. Let $E \subset U$ be the finite-dimensional subspace defined in Theorem 2.5. Then for any $T > 0$ the Navier–Stokes system (2.1) with $\eta \in E$ is approximately controllable in time $T$.

Remark 2.7. In the particular case when $q = (1, 1, 1)$, it is possible to give a more precise description of a subspace $E \subset U$ for which $E_\infty$ is dense in $H$. Namely, let $E$ be the vector space that is spanned by the functions $c_m$ and $s_m$ with indices $m = (m_1, m_2, m_3) \in \mathbb{Z}^3_*$ such that either $|m| \le 2$ or $|m| = 3$ and $m_i \ne 0$ for $i = 1, 2, 3$. Repeating the proof of Theorem 2.5 (see below), it is easy to see that the subspace $E_\infty$ defined in (2.4) is dense in $H$. A simple computation shows that $\dim E = 64$. Thus, for any $T > 0$ and $\nu > 0$ the 3D Navier–Stokes system on the standard torus $\mathbb{T}^3$ is approximately controllable by a 64-dimensional control.

Proof of Theorem 2.5. For any integer $k \ge 1$, set $H_k = \operatorname{span}\{c_m, s_m, |m| \le k\}$, so that $E = H_d$. We shall show by induction that the sequence of subspaces defined in (2.4) satisfies the inclusion

\[ E_{2k} \supset H_{k+d} \quad \text{for any } k \ge 0. \tag{2.19} \]
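As a side check on the count $\dim E = 64$ in Remark 2.7, the index set can be enumerated directly. The sketch below is only an illustration: it reads the second condition as $|m| = 3$ with all three components nonzero, uses $|m| = |m_1| + |m_2| + |m_3|$ as defined in the text, and relies on the fact stated above that the functions $c_m$, $s_m$ form an orthogonal basis, so that each index contributes two independent directions. The function name is hypothetical.

```python
from itertools import product


def remark_2_7_indices():
    """Nonzero integer vectors m with |m| <= 2, or |m| = 3 and all m_i != 0,
    where |m| = |m_1| + |m_2| + |m_3|."""
    selected = []
    for m in product(range(-3, 4), repeat=3):
        norm1 = sum(abs(c) for c in m)
        if norm1 == 0:
            continue  # m = 0 is excluded from the index set
        if norm1 <= 2 or (norm1 == 3 and all(c != 0 for c in m)):
            selected.append(m)
    return selected


indices = remark_2_7_indices()
# Each index m contributes the two basis functions c_m and s_m.
dim_E = 2 * len(indices)
print(len(indices), dim_E)  # prints "32 64"
```

The 32 indices split as 6 with $|m| = 1$, 18 with $|m| = 2$, and the 8 sign patterns of $(\pm 1, \pm 1, \pm 1)$.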

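Since the base of induction is obvious, we shall prove inclusion (2.19) for $k \ge 1$ assuming that it is true for any $k' < k$.

Step 1. Let us endow $\mathbb{R}^3$ with the Euclidean scalar product $(\cdot, \cdot)$ and denote by $P_a$, $a \in \mathbb{R}^3_*$, the orthogonal projection in $\mathbb{R}^3$ onto the subspace $a^\perp$. Define the two-dimensional subspaces

\[ A_m := \operatorname{span}\{c_m, c_{-m}\}, \qquad B_m := \operatorname{span}\{s_m, s_{-m}\}, \qquad m \in \mathbb{Z}^3_*, \]

and note that any functions $f_m \in A_m$ and $g_n \in B_n$ can be represented in the form

\[ f_m(x) = \tilde f_m \cos\langle m, x \rangle_q, \qquad g_n(x) = \tilde g_n \sin\langle n, x \rangle_q, \tag{2.20} \]

where $\tilde f_m$ and $\tilde g_n$ are some vectors such that $\langle \tilde f_m, m \rangle_q = \langle \tilde g_n, n \rangle_q = 0$. Let us show that the following relations hold for any $m, n \in \mathbb{Z}^3_*$:

\[ B(f_m, g_n) = A_{mn}(f_m)\bigl(\cos\langle m - n, x \rangle_q\, P_{m-n} + \cos\langle m + n, x \rangle_q\, P_{m+n}\bigr)\tilde g_n, \tag{2.21} \]
\[ B(f_m, f_n) = A_{mn}(f_m)\bigl(\sin\langle m - n, x \rangle_q\, P_{m-n} - \sin\langle m + n, x \rangle_q\, P_{m+n}\bigr)\tilde f_n, \tag{2.22} \]
\[ B(g_m, f_n) = A_{mn}(g_m)\bigl(\cos\langle m + n, x \rangle_q\, P_{m+n} - \cos\langle m - n, x \rangle_q\, P_{m-n}\bigr)\tilde f_n, \tag{2.23} \]

where $f_l \in A_l$ and $g_l \in B_l$, $l = m, n$, are arbitrary functions, $P_0$ stands for the zero operator in $\mathbb{R}^3$, and

\[ A_{mn}(f_m) = \tfrac12 \langle \tilde f_m, n \rangle_q, \qquad A_{mn}(g_m) = \tfrac12 \langle \tilde g_m, n \rangle_q. \tag{2.24} \]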


We shall confine ourselves to the proof of (2.21), since the other relations can be established in a similar way. It is a matter of direct verification to show that

\[ \Pi\bigl(a \cos\langle l, x \rangle_q\bigr) = (P_l a) \cos\langle l, x \rangle_q, \qquad \Pi\bigl(a \sin\langle l, x \rangle_q\bigr) = (P_l a) \sin\langle l, x \rangle_q \tag{2.25} \]

for any $a \in \mathbb{R}^3$ and $l \in \mathbb{Z}^3_*$. Combining (2.25) and (2.20), we obtain

\[ B(f_m, g_n) = \Pi\bigl\{\tilde g_n \cos\langle m, x \rangle_q\, (\tilde f_m, \nabla) \sin\langle n, x \rangle_q\bigr\} = \Pi\bigl\{\tilde g_n\, \langle \tilde f_m, n \rangle_q \cos\langle m, x \rangle_q \cos\langle n, x \rangle_q\bigr\} \]
\[ = \frac{\langle \tilde f_m, n \rangle_q}{2}\, \Pi\bigl\{\tilde g_n \bigl(\cos\langle m - n, x \rangle_q + \cos\langle m + n, x \rangle_q\bigr)\bigr\} = \frac{\langle \tilde f_m, n \rangle_q}{2} \bigl(\cos\langle m - n, x \rangle_q\, P_{m-n} + \cos\langle m + n, x \rangle_q\, P_{m+n}\bigr)\tilde g_n. \]

Step 2. To prove (2.19), we first show that

\[ E_{2k-1} \supset H^-_{k+d}, \tag{2.26} \]

where $H^-_p \subset H_p$ denotes the subspace spanned by the functions $c_l$ and $s_l$ with indices $l \in \mathbb{Z}^3_*$ such that either $|l| \le p - 1$ or $|l| = p$ and there are at least two non-zero components of $l$. The proof of (2.26) is based on the following proposition.

Proposition 2.8. For any vector $q = (q_1, q_2, q_3)$ with positive components there is a constant $\varepsilon_q > 0$ such that if $m, n, l \in \mathbb{Z}^3_*$ satisfy the conditions

\[ l = m + n, \qquad m \text{ and } n \text{ are not parallel}, \qquad |n| \le \varepsilon_q |m|, \tag{2.27} \]

then for any $f \in A_l$ and $g \in B_l$ there are $a, b \in \operatorname{span}\{A_m, A_n, B_m, B_n\}$ such that

\[ B(a) + f,\ B(b) + g \in \operatorname{span}\{A_{m-n}, B_{m-n}\}. \tag{2.28} \]

Postponing the proof of Proposition 2.8 until the end of this subsection, let us prove (2.26). Take any vector $l \in \mathbb{Z}^3_*$ of length $|l| = k + d$ with at least two non-zero components. Let us choose non-parallel vectors $m, n \in \mathbb{Z}^3_*$ such that

\[ l = m + n, \qquad |m| = k + d - 1, \qquad |n| = 1, \qquad |m - n| = k + d - 2. \tag{2.29} \]

For instance, if $l = (l_1, l_2, l_3)$ and $l_1 \ge 2$, then we can take $m = (l_1 - 1, l_2, l_3)$ and $n = (1, 0, 0)$. If $d \ge 4$ is sufficiently large, then the second and third relations in (2.29) imply that $|n| \le \varepsilon_q |m|$. Therefore, by Proposition 2.8, for any $f \in A_l$ and $g \in B_l$ we can find functions $a, b \in H_{k+d-1}$ such that

\[ B(a) + f,\ B(b) + g \in \operatorname{span}\{A_{m-n}, B_{m-n}\} \subset H_{k+d-2}. \tag{2.30} \]

The definition of $F(E_{2k-2})$ and the induction hypothesis imply that $A_l, B_l \subset E_{2k-1}$. Since $l$ was arbitrary, we obtain (2.26).


Step 3. We can now prove (2.19) using the same argument as in the previous step. In view of (2.26), it suffices to show that $A_l, B_l \subset E_{2k}$ for any vector $l \in \mathbb{Z}^3_*$ of length $|l| = k + d$ with only one non-zero component. To this end, we choose non-parallel vectors $m, n \in \mathbb{Z}^3_*$ such that (cf. (2.29))

\[ l = m + n, \qquad |m| = k + d, \qquad |n| = 2, \qquad |m - n| = k + d, \]

and the vectors $m$, $n$, and $m - n$ have at least two non-zero components. For instance, if $l = (l_1, 0, 0)$ and $l_1 \ge 2$, then we can take $m = (l_1 - 1, 1, 0)$ and $n = (1, -1, 0)$. If $d$ is sufficiently large, then $|n| \le \varepsilon_q |m|$, and using again Proposition 2.8, for any $f \in A_l$ and $g \in B_l$ we can construct $a, b \in H^-_{k+d}$ such that

\[ B(a) + f,\ B(b) + g \in \operatorname{span}\{A_{m-n}, B_{m-n}\} \subset H^-_{k+d}. \]

Recalling the definition of $F(E_{2k-1})$, we see that $A_l, B_l \subset E_{2k}$, and, hence, (2.19) holds.

Proof of Proposition 2.8. We shall confine ourselves to the proof of existence of a vector $a \in \operatorname{span}\{A_m, B_n\}$ such that

\[ B(a) + f \in \operatorname{span}\{A_{m-n}, B_{m-n}\}. \tag{2.31} \]

Step 1. We seek $a$ in the form

\[ a = f_m + g_n, \qquad f_m \in A_m, \quad g_n \in B_n. \tag{2.32} \]

Representing $f_m$ and $g_n$ in the form (2.20) and using relations (2.21) and (2.23), we derive

\[ B(f_m + g_n) = B(f_m, f_m) + B(f_m, g_n) + B(g_n, f_m) + B(g_n, g_n) \]
\[ = \cos\langle m - n, x \rangle_q\, P_{m-n}\bigl(A_{mn}(f_m)\tilde g_n - A_{nm}(g_n)\tilde f_m\bigr) + \cos\langle m + n, x \rangle_q\, P_{m+n}\bigl(A_{mn}(f_m)\tilde g_n + A_{nm}(g_n)\tilde f_m\bigr). \]

Taking into account (2.24), we see that the desired assertion will be established if we show that any vector $c \in (m + n)^\perp$ can be represented in the form

\[ c = P_{m+n}\bigl(\langle \tilde f_m, n \rangle_q\, \tilde g_n + \langle \tilde g_n, m \rangle_q\, \tilde f_m\bigr), \tag{2.33} \]

where $\tilde f_m \in m^\perp$ and $\tilde g_n \in n^\perp$.

Step 2. To establish (2.33), we first show that the image of the bilinear operator

\[ \Phi : m^\perp \times n^\perp \to \mathbb{R}^3, \qquad (\tilde f, \tilde g) \mapsto \langle \tilde f, n \rangle_q\, \tilde g + \langle \tilde g, m \rangle_q\, \tilde f, \]

coincides with $(m - n)^\perp$. Indeed, a simple calculation implies that $\langle \Phi(\tilde f, \tilde g), m - n \rangle_q = 0$ for any $\tilde f \in m^\perp$, $\tilde g \in n^\perp$, and therefore $\Phi(m^\perp, n^\perp) \subset (m - n)^\perp$. To prove the converse inclusion, it suffices to show that $\Phi(m^\perp, n^\perp)$ contains a two-dimensional affine subspace. To this end, let us choose a vector $\tilde g_0 \in n^\perp$ such that $\langle \tilde g_0, m \rangle_q = 1$; this can be done because $m$ and $n$ are not parallel. Then

\[ \Phi(\tilde f, \tilde g_0) = \langle \tilde f, n \rangle_q\, \tilde g_0 + \tilde f. \]

Since $\tilde g_0 \notin m^\perp$, the above relation implies that the affine subspace $\Phi(m^\perp, \tilde g_0)$ is two-dimensional.
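The "simple calculation" in Step 2 can be spot-checked numerically. The following sketch is only an illustration under stated assumptions: the torus parameters `q` and the index vectors `m`, `n` are hypothetical sample values, the helper names are not from the paper, and the bilinear map is implemented exactly as $(\tilde f, \tilde g) \mapsto \langle \tilde f, n \rangle_q\, \tilde g + \langle \tilde g, m \rangle_q\, \tilde f$. It draws random vectors from $m^\perp$ and $n^\perp$ and measures how far the image is from being $\langle \cdot, \cdot \rangle_q$-orthogonal to $m - n$.

```python
import random


def bracket_q(x, y, q):
    """Weighted product <x, y>_q = sum_i x_i * y_i / q_i."""
    return sum(xi * yi / qi for xi, yi, qi in zip(x, y, q))


def random_vector_in_perp(a, q, rng):
    """Random vector v with <v, a>_q = 0: remove from a random vector its
    Euclidean component along the vector with entries a_i / q_i."""
    s = [ai / qi for ai, qi in zip(a, q)]
    v = [rng.uniform(-1.0, 1.0) for _ in a]
    coeff = sum(vi * si for vi, si in zip(v, s)) / sum(si * si for si in s)
    return [vi - coeff * si for vi, si in zip(v, s)]


def bilinear_map(f, g, m, n, q):
    """The map (f, g) -> <f, n>_q g + <g, m>_q f from Step 2."""
    fn = bracket_q(f, n, q)
    gm = bracket_q(g, m, q)
    return [fn * gi + gm * fi for fi, gi in zip(f, g)]


rng = random.Random(0)
q = (1.0, 2.5, 0.7)             # hypothetical torus parameters
m, n = (5, -2, 1), (1, 0, -1)   # hypothetical non-parallel index vectors
m_minus_n = [mi - ni for mi, ni in zip(m, n)]

max_defect = 0.0
for _ in range(1000):
    f = random_vector_in_perp(m, q, rng)
    g = random_vector_in_perp(n, q, rng)
    image = bilinear_map(f, g, m, n, q)
    max_defect = max(max_defect, abs(bracket_q(image, m_minus_n, q)))

print(max_defect)  # should be at the level of floating-point rounding
```

The cancellation behind this is $\langle \tilde f, n \rangle_q \langle \tilde g, m - n \rangle_q + \langle \tilde g, m \rangle_q \langle \tilde f, m - n \rangle_q = \langle \tilde f, n \rangle_q \langle \tilde g, m \rangle_q - \langle \tilde g, m \rangle_q \langle \tilde f, n \rangle_q = 0$, which uses only $\langle \tilde f, m \rangle_q = \langle \tilde g, n \rangle_q = 0$.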


Step 3. To conclude the proof of Proposition 2.8, we shall need the following simple lemma; its proof is obvious.

Lemma 2.9. Let $a, b \in \mathbb{R}^3$ be two nonzero vectors and let $N_b \subset \mathbb{R}^3$ be the orthogonal complement of $b$ for the scalar product $(\cdot, \cdot)$. Then $P_a(N_b) = a^\perp$ if and only if $\langle a, b \rangle_q \ne 0$.

Since the image of the bilinear application $\Phi$ coincides with $(m - n)^\perp$, representation (2.33) will be established if we show that

\[ P_{m+n}\bigl((m - n)^\perp\bigr) = (m + n)^\perp. \tag{2.34} \]

To prove (2.34), we denote by $S_q : \mathbb{R}^3 \to \mathbb{R}^3$ the linear operator such that

\[ \langle a, b \rangle_q = (S_q a, b) \quad \text{for any } a, b \in \mathbb{R}^3. \tag{2.35} \]

Obviously, such an operator exists and is invertible. It follows from (2.35) that for any vector $a \in \mathbb{R}^3$ the subspace $a^\perp$ coincides with the orthogonal complement of $S_q a$ with respect to the scalar product $(\cdot, \cdot)$. Therefore, in view of Lemma 2.9, relation (2.34) holds if and only if

\[ K_{mn} := \langle m + n, S_q(m - n) \rangle_q \ne 0. \tag{2.36} \]

Since all the norms in $\mathbb{R}^3$ are equivalent and $S_q$ is an invertible continuous operator, we can find a constant $C_q > 0$ such that

\[ K_{mn} = (S_q(m + n), S_q(m - n)) = (S_q m, S_q m) - (S_q n, S_q n) \ge C_q^{-1} |m|^2 - C_q |n|^2. \]

Therefore, if $|m| \ge 2C_q |n|$, then (2.36) holds. The proof is complete.

3. Proof of Theorem 2.4

3.1. Scheme of the proof.

Let us fix constants $R$, $T$, and $\varepsilon$. As in the proof of Theorem 2.2, we shall say that a system is ε-controllable if it is $(\varepsilon, R)$-controllable in time $T$. We need to show that if (2.5) is ε-controllable, then so is (2.1). Along with (2.1) and (2.5), let us consider the equation

\[ \dot u + L(u + \zeta(t)) + B(u + \zeta(t)) = h(t) + \eta(t), \tag{3.1} \]

where $\eta$ and $\zeta$ are control functions. Suppose we can prove the following two propositions.

Proposition 3.1. Let $u \in X_T$ be a solution of (3.1) with $\eta, \zeta \in L^\infty(J_T, E)$. Then there are sequences of controls $\eta_k \in L^\infty(J_T, E)$ and of solutions $u_k \in X_T$ for Eq. (2.1) with $\eta = \eta_k$ such that

\[ u_k(0) = u(0) \quad \text{for all } k \ge 1, \tag{3.2} \]
\[ \|u_k(T) - u(T)\|_V \to 0 \quad \text{as } k \to \infty. \tag{3.3} \]

Proposition 3.2. Let $u \in X_T$ be a solution of (2.5) with $\eta_1 \in L^\infty(J_T, E_1)$, where $E_1 = F(E)$. Then there are sequences of controls $\eta_k, \zeta_k \in L^\infty(J_T, E)$ and of solutions $u_k \in X_T$ for Eq. (3.1) with $\eta = \eta_k$ and $\zeta = \zeta_k$ such that (3.2) holds and

\[ \|u_k - u\|_{C(J_T, V)} \to 0 \quad \text{as } k \to \infty. \tag{3.4} \]


Propositions 3.1 and 3.2 imply the following results relating the control systems (2.1), (3.1), (2.5) (cf. properties (P1) and (P2) in the Introduction).

Extension. Equation (2.1) with $\eta \in E$ is ε-controllable if and only if so is Eq. (3.1) with $\eta, \zeta \in E$.

Convexification. Equation (3.1) with $\eta, \zeta \in E$ is ε-controllable if and only if so is Eq. (2.5) with $\eta_1 \in E_1$, where $E_1 = F(E)$.

The claim of Theorem 2.4 is a straightforward consequence of the above assertions. Thus, to establish Theorem 2.4, it suffices to prove Propositions 3.1 and 3.2. Their proofs are given in the next two subsections.

3.2. Proof of Proposition 3.1.

Recall that $P$ and $Q$ stand for the orthogonal projections in $H$ onto the subspaces $E$ and $E^\perp$, respectively. Let us set

\[ v(t) = Pu(t), \qquad w(t) = Qu(t) \qquad \text{for } t \in J_T. \]

It is clear that $v \in C(J_T, E)$ and $w \in X_T(E)$. Moreover, the function $w$ is a solution of the equation

\[ \dot w + L_E w + Q\bigl(B(w) + B(v + \zeta, w) + B(w, v + \zeta)\bigr) = f(t), \]

where we set $f = Q\bigl(h - B(v + \zeta) - L(v + \zeta)\bigr)$. Let us choose a sequence $v_k \in C^1(J_T, E)$ such that

\[ \|v_k - (v + \zeta)\|_{L^4(J_T, V)} \to 0 \quad \text{as } k \to \infty, \tag{3.5} \]
\[ v_k(0) = v(0), \qquad v_k(T) = v(T) \qquad \text{for all } k \ge 1. \tag{3.6} \]

Consider the equation

\[ \dot z + L_E z + Q\bigl(B(z) + B(v_k, z) + B(z, v_k)\bigr) = f_k(t), \tag{3.7} \]

where $f_k = Q\bigl(h - B(v_k) - Lv_k\bigr)$. Using (3.5), (1.16), and the fact that $\dim E < \infty$, it is easy to show that

\[ \|f_k - f\|_{L^2(J_T, H)} \to 0 \quad \text{as } k \to \infty. \tag{3.8} \]

Theorem 1.8 combined with (3.5) and (3.8) implies that, for sufficiently large $k \ge 1$, Eq. (3.7) has a unique solution $w_k \in X_T(E)$ that satisfies the initial condition

\[ w_k(0) = w(0). \tag{3.9} \]

Moreover, since the resolving operator associated with (3.7) is Lipschitz continuous, we see that

\[ \|w_k - w\|_{X_T} \to 0 \quad \text{as } k \to \infty. \tag{3.10} \]


We now set $u_k = v_k + w_k$. The construction implies that the function $u_k$ belongs to the space $X_T$ and satisfies Eq. (2.1) with the function

\[ \eta(t) = \eta_k(t) := \dot v_k(t) + P\bigl(Lu_k(t) + B(u_k(t)) - h(t)\bigr), \]

which belongs to $L^\infty(J_T, E)$. Furthermore, it follows from (3.9) and the first relation in (3.6) that the initial condition (3.2) is also verified. Finally, the second relation in (3.6) and convergence (3.10) imply that

\[ \|u_k(T) - u(T)\|_V = \|w_k(T) - w(T)\|_V \le \|w_k - w\|_{X_T} \to 0 \quad \text{as } k \to \infty. \]

The proof of Proposition 3.1 is complete.

3.3. Proof of Proposition 3.2.

Step 1. Without loss of generality, we can assume that $\eta_1(t)$ is piecewise constant. Indeed, suppose that Proposition 3.2 is proved in this case, and let $u \in X_T$ be a solution of (2.5), (2.2) with some $\eta_1 \in L^\infty(J_T, E_1)$. Then there is a sequence $\eta^m \in L^\infty(J_T, E_1)$ of piecewise constant functions such that

\[ \|\eta^m - \eta_1\|_{L^2(J_T, H)} \to 0 \quad \text{as } m \to \infty. \]

Applying Theorem 1.8 with $E = \{0\}$, we see that, for sufficiently large $m \ge 1$, problem (2.5), (2.2) with $\eta_1$ replaced by $\eta^m$ has a unique solution $u^m \in X_T$, which converges to $u$ in $X_T$ as $m \to \infty$. In particular, for any $\varepsilon > 0$ there is a piecewise constant function $\tilde\eta_1 \in L^\infty(J_T, E_1)$ and a solution $\tilde u \in X_T$ of problem (2.5), (2.2) with $\eta_1 = \tilde\eta_1$ such that

\[ \|\tilde u - u\|_{X_T} < \varepsilon/2. \tag{3.11} \]

By assumption, Proposition 3.2 is true for the piecewise constant function $\tilde\eta_1$. Therefore there are sequences of control functions $\eta_k, \zeta_k \in L^\infty(J_T, E)$ and of solutions $u_k \in X_T$ for problem (3.1), (2.2) with $\zeta = \zeta_k$ and $\eta = \eta_k$ such that

\[ \|u_k - \tilde u\|_{X_T} \to 0 \quad \text{as } k \to \infty. \]

Combining this with (3.11), for any $\varepsilon > 0$ we can find $\eta_\varepsilon, \zeta_\varepsilon \in L^\infty(J_T, E)$ and a solution $u_\varepsilon \in X_T$ of problem (3.1), (2.2) with $\zeta = \zeta_\varepsilon$ and $\eta = \eta_\varepsilon$ such that $\|u_\varepsilon - u\|_{X_T} < \varepsilon$. Since $\varepsilon > 0$ is arbitrary, we obtain the required assertion.

Step 2. We now prove the proposition for piecewise constant functions $\eta_1(t)$. A simple iteration argument combined with Theorem 1.8 shows that it suffices to consider the case in which there is only one interval of constancy. Thus, we assume that $u \in X_T$ is a solution of (2.5), (2.2) with $\eta_1(t) \equiv \eta_1 \in E_1$. We claim that there is a function $\eta \in E$ and a sequence $\zeta_k \in L^\infty(J_T, E)$ such that problem (3.1), (2.2) with $\zeta = \zeta_k$ has a unique solution $u_k \in X_T$, which satisfies (3.4). We shall need the following lemma whose proof is given in the Appendix (see Sect. 4.2).


Lemma 3.3. Let $E \subset U$ be a finite-dimensional space and $E_1 = F(E)$. Then for any $\eta_1 \in E_1$ there are vectors $\zeta^1, \dots, \zeta^m, \eta \in E$ and positive constants $\lambda_1, \dots, \lambda_m$ whose sum is equal to 1 such that

\[ B(u) - \eta_1 = \sum_{j=1}^{m} \lambda_j \bigl(B(u + \zeta^j) + L\zeta^j\bigr) - \eta \quad \text{for any } u \in V. \tag{3.12} \]
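Identity (3.12) is purely algebraic, so it can be sanity-checked in a finite-dimensional toy model. The sketch below is an illustration only, not the paper's setting: the function spaces are replaced by $\mathbb{R}^4$, $B$ by a random bilinear map with $B(u) = B(u, u)$, and $L$ by a random matrix (all hypothetical stand-ins). The weights and vectors follow the choice made in Sect. 4.2: $m = 2k$, $\lambda_j = \alpha_j / (2\alpha)$, and the vectors $\pm\sqrt{\alpha}\, \tilde\zeta^j$ with $\alpha = \alpha_1 + \dots + \alpha_k$.

```python
import math
import random

rng = random.Random(1)
DIM = 4  # hypothetical finite-dimensional stand-in for the function spaces

# Random coefficients of a bilinear map B(u, v) and of a linear map L.
T3 = [[[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]
      for _ in range(DIM)]
LMAT = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]


def rand_vec():
    return [rng.uniform(-1, 1) for _ in range(DIM)]


def B(u, v=None):
    """Bilinear toy map; B(u) is shorthand for B(u, u)."""
    v = u if v is None else v
    return [sum(T3[i][j][k] * u[j] * v[k]
                for j in range(DIM) for k in range(DIM))
            for i in range(DIM)]


def L(z):
    return [sum(LMAT[i][j] * z[j] for j in range(DIM)) for i in range(DIM)]


def add(x, y, c=1.0):
    """Return x + c * y componentwise."""
    return [xi + c * yi for xi, yi in zip(x, y)]


# Data representing eta_1 = eta~ - sum_j alpha_j B(zeta~^j).
k = 3
alphas = [rng.uniform(0.1, 2.0) for _ in range(k)]
zetas_tilde = [rand_vec() for _ in range(k)]
eta_tilde = rand_vec()
eta1 = eta_tilde
for a, z in zip(alphas, zetas_tilde):
    eta1 = add(eta1, B(z), -a)

# Choice from Sect. 4.2: m = 2k vectors in symmetric pairs.
alpha = sum(alphas)
lams = [a / (2 * alpha) for a in alphas] * 2
zetas = ([[math.sqrt(alpha) * c for c in z] for z in zetas_tilde]
         + [[-math.sqrt(alpha) * c for c in z] for z in zetas_tilde])
eta = eta_tilde

# Compare both sides of (3.12) at a random point u.
u = rand_vec()
lhs = add(B(u), eta1, -1.0)
rhs = [-c for c in eta]
for lam, z in zip(lams, zetas):
    rhs = add(rhs, add(B(add(u, z)), L(z)), lam)

residual = max(abs(a - b) for a, b in zip(lhs, rhs))
print(abs(sum(lams) - 1.0), residual)
```

The symmetric pairs $\pm\sqrt{\alpha}\, \tilde\zeta^j$ make every term that is linear in $\zeta^j$ cancel, while the quadratic terms add up to $\sum_j \alpha_j B(\tilde\zeta^j)$; this is why both printed numbers should be at rounding level.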

Relation (3.12) implies that the function $u \in X_T$ satisfies the equation

\[ \partial_t u + Lu + \sum_{j=1}^{m} \lambda_j \bigl(B(u + \zeta^j) + L\zeta^j\bigr) = h(t) + \eta. \tag{3.13} \]

Following a classical idea in the theory of control, we now fix an integer $k \ge 1$ and consider the function

\[ \zeta_k(t) = \zeta(kt/T), \tag{3.14} \]

where $\zeta(t)$ is a 1-periodic function defined by the relation

\[ \zeta(s) = \zeta^j \quad \text{for } 0 \le s - (\lambda_1 + \dots + \lambda_{j-1}) < \lambda_j, \quad j = 1, \dots, m. \]

Equation (3.13) can be rewritten as

\[ \partial_t u + L(u + \zeta_k(t)) + B(u + \zeta_k(t)) = h(t) + \eta + f_k(t), \tag{3.15} \]

where $f_k = f_{k1} + f_{k2}$,

\[ f_{k1}(t) = L\zeta_k(t) - \sum_{j=1}^{m} \lambda_j L\zeta^j, \tag{3.16} \]
\[ f_{k2}(t) = B(u(t) + \zeta_k(t)) - \sum_{j=1}^{m} \lambda_j B(u(t) + \zeta^j). \tag{3.17} \]

It follows from the definition of $\zeta_k$ and inequality (1.16) that the functions $f_k$ belong to the space $L^\infty(J_T, H)$ and satisfy the inequality

\[ \sup_{k \ge 1} \|f_k\|_{L^\infty(J_T, H)} < \infty. \tag{3.18} \]

Setting $\hat u_k = u - Kf_k$, where the operator $K$ is defined by (1.8), we conclude from (3.15) that $\hat u_k \in X_T$ is a solution of the equation

\[ \partial_t \hat u_k + L(\hat u_k + \zeta_k) + B(\hat u_k + \zeta_k) + B(\hat u_k + \zeta_k, Kf_k) + B(Kf_k, \hat u_k + \zeta_k) = h + \eta - B(Kf_k). \tag{3.19} \]

We wish to consider (3.1) as a perturbation of (3.19) and to apply Remark 1.9. To this end, we note that

\[ \|\hat u_k\|_{X_T} + \|\zeta_k\|_{L^\infty(J_T, E)} + \|B(Kf_k)\|_{L^2(J_T, H)} + \|Kf_k\|_{X_T} \le R, \]


where $R > 0$ does not depend on $k$. Therefore, by Remark 1.9, there is $\varepsilon > 0$ depending only on $R$ such that if functions $v \in L^4(J_T, H^1)$ and $f \in L^2(J_T, H)$ satisfy the inequalities

\[ \|v - Kf_k\|_{L^4(J_T, H^1)} \le \varepsilon, \qquad \|f + B(Kf_k)\|_{L^2(J_T, H)} \le \varepsilon, \tag{3.20} \]

then the equation

\[ \partial_t z + L(z + \zeta_k) + B(z + \zeta_k) + B(z + \zeta_k, v) + B(v, z + \zeta_k) = h + \eta + f \tag{3.21} \]

has a unique solution $z \in X_T$ satisfying the initial condition $z(0) = u_0$. Suppose we have shown that

\[ \|Kf_k\|_{C(J_T, V)} + \|B(Kf_k)\|_{L^2(J_T, H)} \to 0 \quad \text{as } k \to \infty. \tag{3.22} \]

In this case, the functions $v \equiv 0$ and $f \equiv 0$ satisfy condition (3.20) for sufficiently large $k$, and we can conclude that problem (3.1), (2.2) with $\zeta = \zeta_k$ has a unique solution $u_k \in X_T$, and

\[ \|u_k - \hat u_k\|_{X_T} \to 0 \quad \text{as } k \to \infty. \tag{3.23} \]

Since $\|Kf_k\|_{C(J_T, V)} \to 0$ as $k \to \infty$ (see (3.22)), convergence (3.23) and the definition of $\hat u_k$ imply that (3.4) holds. Thus, it remains to prove (3.22).

Step 3. To prove (3.22), we note that (1.16) implies the inequality

\[ \|B(Kf_k)\|_{L^2(J_T, H)} \le C_1\, \|Kf_k\|_{L^6(J_T, H^1)}^{3/2}\, \|Kf_k\|_{L^2(J_T, H^2)}^{1/2}. \]

Since the sequence $\{f_k\}$ is bounded in $L^2(J_T, H)$ (see (3.18)), Proposition 1.2 implies that $\|Kf_k\|_{L^2(J_T, H^2)}$ is bounded by a constant not depending on $k$. Therefore convergence (3.22) will be established if we show that

\[ \|Kf_k\|_{C(J_T, V)} \to 0 \quad \text{as } k \to \infty. \tag{3.24} \]

Step 4. To prove (3.24), note that, in view of interpolation inequalities for Sobolev spaces, we have

\[ \|Kf_k\|_{C(J_T, V)} \le C_2\, \|Kf_k\|_{C(J_T, U^*)}^{1/7}\, \|Kf_k\|_{C(J_T, H^{3/2})}^{6/7}, \tag{3.25} \]

where $U^*$ denotes the dual space of $U$ endowed with the norm $\|v\|_{U^*} = \|L^{-1}v\|$. It is a matter of straightforward verification to show that $\|L^r e^{-tL}\|_{L(H)} \le C_3 t^{-r}$ for $r \ge 0$, $t > 0$. Combining this with (3.18), for any $t \in J_T$ we derive

\[ \|Kf_k(t)\|_{H^{3/2}} \le C_4 \int_0^t \|L^{3/4} e^{-(t-s)L}\|_{L(H)}\, \|f_k(s)\|\, ds \le C_5 \sup_{k \ge 1} \|f_k\|_{L^\infty(J_T, H)} \int_0^t (t - s)^{-3/4}\, ds \le C_6. \tag{3.26} \]

Furthermore, integrating by parts, we write

\[ Kf_k(t) = F_k(t) - G_k(t), \tag{3.27} \]


where

\[ F_k(t) = \int_0^t f_k(s)\, ds, \qquad G_k(t) = \int_0^t L e^{-(t-s)L} F_k(s)\, ds. \]

The definition of the norm in $U^*$ implies that

\[ \|G_k\|_{C(J_T, U^*)} \le \max_{t \in J_T} \int_0^t \|e^{-(t-s)L}\|_{L(H)}\, \|F_k(s)\|\, ds \le \|F_k\|_{L^1(J_T, H)}. \tag{3.28} \]

Suppose we have shown that

\[ \|F_k\|_{C(J_T, H)} \to 0 \quad \text{as } k \to \infty. \tag{3.29} \]

Then combining (3.25)–(3.29), we arrive at (3.24).

Step 5. We now prove (3.29). We shall show that for any piecewise constant $H^2$-valued function $u$ on $J_T$, the sequence $\{F_k\}$ converges to zero in the space $C(J_T, H)$. If this assertion is established, then a simple approximation argument combined with inequality (1.16) shows that (3.29) is true for any $u \in X_T$. Convergence (3.29) will be established if we prove the following assertions:

(i) The family $\{F_k\} \subset C(J_T, H)$ is relatively compact.

(ii) For any $t \in J_T$, the sequence $\{F_k(t)\}$ goes to zero in $H$ as $k \to \infty$.

To prove (i), note that, in view of (3.18), the family $\{F_k\}$ is uniformly equicontinuous on $J_T$. Therefore, by the Arzelà–Ascoli theorem, it suffices to show that there exists a compact set $\mathcal{K} \subset H$ such that

\[ F_k(t) \in \mathcal{K} \quad \text{for all } t \in J_T,\ k \ge 1. \]

This assertion follows from the fact that, for piecewise constant functions $u$, the image of $f_k$ is contained in a finite set not depending on $k$. We now prove (ii). Let us denote by $J_q = [t_{q-1}, t_q]$, $q = 1, \dots, L$, the intervals of constancy of $u$. We fix any integer $r$, $0 \le r \le L - 1$, and for any $t \in J_{r+1}$ write

\[ F_k(t) = \int_0^t f_k(s)\, ds = \sum_{q=1}^{r} \int_{t_{q-1}}^{t_q} f_k(s)\, ds + \int_{t_r}^{t} f_k(s)\, ds. \]

Thus, to prove (ii), it suffices to show that, for any $q$, $q = 1, \dots, L$, and $t \in J_q$, we have

\[ \int_{t_{q-1}}^{t} f_k(s)\, ds \to 0 \quad \text{as } k \to \infty. \]

This can be done by a straightforward computation (cf. [Jur97, Chap. 3]). The proof of Proposition 3.2 is complete.


4. Appendix

4.1. A version of the implicit function theorem. Let $X$ and $Y$ be Banach spaces and let $Z = X \times Y$. We denote by $B_X(x,\delta)$ the closed ball in $X$ of radius $\delta$ centred at $x$ and by $B_Z(z)$ the closed ball in $Z$ of radius 1 centred at $z$. Let $F : Z \to Y$ be a $C^2$ function. We write $F_y(z)$ for its Fréchet derivative with respect to $y$ at a point $z$ and denote by $|F|_z$ the $C^2$ norm of the restriction of $F$ to $B_Z(z)$. The following result can be established by repeating the arguments used in a standard proof of the implicit function theorem (for instance, see [Tay97, Chap. 1]).

Proposition 4.1. For any $R > 0$ there are positive constants $C$ and $\delta$ such that the following statements hold:

(i) Let $\hat z = (\hat x, \hat y) \in Z$ be any point such that the linear operator $F_y(\hat z)$ is invertible and

$$|F|_{\hat z} \le R, \qquad \|F_y(\hat z)^{-1}\|_{\mathcal{L}(Y)} \le R.$$

Then there is a unique $C^2$ function $f : B_X(\hat x, \delta) \to Y$ such that

$$F(x, f(x)) = 0 \quad \text{for } x \in B_X(\hat x, \delta).$$

(ii) The function $f$ satisfies the inequality

$$\|f(x_1) - f(x_2)\|_Y \le C\, \|x_1 - x_2\|_X \quad \text{for } x_1, x_2 \in B_X(\hat x, \delta).$$

4.2. Proof of Lemma 3.3. In view of the definition of $F(E)$, there are vectors $\tilde\zeta^1, \dots, \tilde\zeta^k, \tilde\eta \in E$ and constants $\alpha_j > 0$, $j = 1, \dots, k$, such that

$$\eta_1 = \tilde\eta - \sum_{j=1}^{k} \alpha_j B(\tilde\zeta^j).$$

Let us set $m = 2k$, $\eta = \tilde\eta$,

$$\lambda_j = \frac{\alpha_j}{2\alpha}, \quad \zeta^j = \sqrt{\alpha}\,\tilde\zeta^j \quad \text{for } j = 1, \dots, k,$$
$$\lambda_j = \frac{\alpha_{j-k}}{2\alpha}, \quad \zeta^j = -\sqrt{\alpha}\,\tilde\zeta^{j-k} \quad \text{for } j = k+1, \dots, m,$$

where $\alpha = \alpha_1 + \cdots + \alpha_k$. It is a matter of direct verification to show that (3.12) holds.

Acknowledgements. This paper arose from my close cooperation with A. A. Agrachev, S. B. Kuksin and A. V. Sarychev, and I would like to thank them for numerous discussions. I am grateful also to M. Paicu for useful remarks on the Navier–Stokes equations.


References

[AS86] Agrachev, A.A., Sarychev, A.V.: Reduction of a smooth system that is linear with respect to the control. Mat. Sb. (N.S.) 130(172) no. 1, 18–34, 128 (1986)
[AS04] Agrachev, A.A., Sachkov, Yu.L.: Control Theory from Geometric Viewpoint. Berlin: Springer-Verlag, 2004
[AS05a] Agrachev, A.A., Sarychev, A.V.: Navier–Stokes equations: controllability by means of low modes forcing. J. Math. Fluid Mech. 7, 108–152 (2005)
[AS05b] Agrachev, A.A., Sarychev, A.V.: Personal communication
[BS01] Barbu, V., Sritharan, S.S.: Flow invariance preserving feedback controllers for the Navier–Stokes equation. J. Math. Anal. Appl. 255(1), 281–307 (2001)
[BT04] Barbu, V., Triggiani, R.: Internal stabilization of Navier–Stokes equations with finite-dimensional controllers. Indiana Univ. Math. J. 53(5), 1443–1494 (2004)
[CF88] Constantin, P., Foias, C.: Navier–Stokes Equations. Chicago, IL: University of Chicago Press, 1988
[CF96] Coron, J.-M., Fursikov, A.V.: Global exact controllability of the 2D Navier–Stokes equations on a manifold without boundary. Russ. J. Math. Phys. 4(4), 429–448 (1996)
[Cor99] Coron, J.-M.: On the null asymptotic stabilization of the two-dimensional incompressible Euler equations in a simply connected domain. SIAM J. Control Optim. 37(6), 1874–1896 (1999) (electronic)
[Cor96] Coron, J.-M.: On the controllability of the 2-D incompressible Navier–Stokes equations with the Navier slip boundary conditions. ESAIM Contrôle Optim. Calc. Var. 1, 35–75 (1995/96) (electronic)
[FC99] Fernández-Cara, E.: On the approximate and null controllability of the Navier–Stokes equations. SIAM Rev. 41(2), 269–277 (1999) (electronic)
[FE99] Fursikov, A.V., Emanuilov, O.Yu.: Exact controllability of the Navier–Stokes and Boussinesq equations. Usp. Mat. Nauk 54(3) (327), 93–146 (1999)
[Fur95] Fursikov, A.V.: Exact boundary zero controllability of three-dimensional Navier–Stokes equations. J. Dynam. Control Systems 1(3), 325–350 (1995)
[Fur01] Fursikov, A.V.: Stabilizability of two-dimensional Navier–Stokes equations with help of a boundary feedback control. J. Math. Fluid Mech. 3(3), 259–301 (2001)
[Fur04] Fursikov, A.V.: Stabilization for the 3D Navier–Stokes system by feedback boundary control. Discrete Contin. Dyn. Syst. 10(1-2), 289–314 (2004)
[Gla00] Glass, O.: Exact boundary controllability of 3-D Euler equation. ESAIM Control Optim. Calc. Var. 5, 1–44 (2000) (electronic)
[Hen81] Henry, D.: Geometric Theory of Semilinear Parabolic Equations. Lecture Notes in Mathematics, Vol. 840. Berlin: Springer-Verlag, 1981
[Ima98] Imanuvilov, O.Yu.: On exact controllability for the Navier–Stokes equations. ESAIM Control Optim. Calc. Var. 3, 97–131 (1998) (electronic)
[Jur97] Jurdjevic, V.: Geometric Control Theory. Cambridge: Cambridge University Press, 1997
[Lio90] Lions, J.-L.: Remarques sur la contrôlabilité approchée. Spanish-French Conference on Distributed-Systems Control. Málaga: Univ. Málaga, 1990, pp. 77–87
[Soh01] Sohr, H.: The Navier–Stokes Equations. Basel: Birkhäuser Verlag, 2001
[Tay97] Taylor, M.E.: Partial Differential Equations. I–III. New York: Springer-Verlag, 1996–1997
[Zua02] Zuazua, E.: Controllability of partial differential equations and its semi-discrete approximations. Discrete Contin. Dyn. Syst. 8(2), 469–513 (2002)

Communicated by G. Gallavotti

Commun. Math. Phys. 266, 153–196 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0006-4

Communications in

Mathematical Physics

On the Third Critical Field in Ginzburg-Landau Theory

S. Fournais1,2, B. Helffer2

1 Laboratoire de Mathématiques UMR CNRS 8628, Université Paris-Sud - Bât 425, F-91405 Orsay Cedex, France. E-mail: [email protected]
2 CNRS and Laboratoire de Mathématiques UMR CNRS 8628, Université Paris-Sud - Bât 425, F-91405 Orsay Cedex, France. E-mail: [email protected]

Received: 25 July 2005 / Accepted: 29 November 2005
Published online: 8 April 2006 – © Springer-Verlag 2006

Abstract: Using recent results by the authors on the spectral asymptotics of the Neumann Laplacian with magnetic field, we give precise estimates on the critical field, $H_{C_3}$, describing the appearance of superconductivity in superconductors of type II. Furthermore, we prove that the local and global definitions of this field coincide. Near $H_{C_3}$ only a small part of the sample, near the boundary points where the curvature is maximal, carries superconductivity. We give precise estimates on the size of this zone and decay estimates in both the normal (to the boundary) and parallel variables.

Contents
1. Introduction
1.1 Setup for general domains
1.2 The half plane model
1.3 Results for general domains
1.4 Discussion of critical fields
1.5 Results for non-degenerate domains
1.6 Organization of the paper
2. Diamagnetism
2.1 General domains
2.2 The case of the disc
3. Local Critical Fields
3.1 General analysis
3.2 Calculating asymptotics
4. Localization
4.1 Estimates in the normal direction
4.2 Energy estimates
4.3 Agmon estimates in the tangential direction
4.4 An alternative approach to λ1 B A
5. Proofs of Theorem 1.1 and Theorem 1.3
6. Local Equals Global for All Domains
7. An Improved Estimate on $\|\psi\|_{L^\infty}$

The two authors are supported by the European Research Network ‘Postdoctoral Training Program in Mathematical Analysis of Large Quantum Systems’ with contract number HPRN-CT-2002-00277, and the ESF Scientific Programme in Spectral Theory and Partial Differential Equations (SPECT). Part of this work was carried out while S.F. visited CIMAT, Mexico.

1. Introduction

1.1. Setup for general domains. Our main motivation comes from superconductivity. As appeared from the works of Bernoff-Sternberg [BeSt], Lu-Pan [LuPa1, LuPa2, LuPa3, LuPa4], and Helffer-Pan [HePa], the determination of the lowest eigenvalues of the magnetic Schrödinger operator is crucial for a detailed description of the nucleation of superconductivity (on the boundary) for superconductors of Type II and for accurate estimates of the critical field $H_{C_3}$. While the determination of the complete asymptotics of the lowest eigenvalues of the Schrödinger operators was essentially achieved (except for exponentially small effects) in the two-dimensional case with the works of [HeMo2] and [FoHe2], what remained to be determined was the corresponding asymptotics for the critical field. We will actually obtain much more and clarify the links between the various definitions of critical fields considered in the mathematical or physical literature and supposed to define the right critical field.

The Ginzburg-Landau functional is given by

$$\mathcal{E}[\psi, A] = \mathcal{E}_{\kappa,H}[\psi, A] = \int_\Omega \Big\{ |p_{\kappa H A}\psi|^2 - \kappa^2|\psi|^2 + \frac{\kappa^2}{2}|\psi|^4 + \kappa^2 H^2\, |\mathrm{curl}\, A - 1|^2 \Big\}\,dx, \tag{1.1}$$

with $(\psi, A) \in W^{1,2}(\Omega;\mathbb{C}) \times W^{1,2}(\Omega;\mathbb{R}^2)$ and where $p_A = (-i\nabla - A)$. We fix the choice of gauge by imposing that

$$\mathrm{div}\, A = 0 \ \text{in } \Omega, \qquad A\cdot\nu = 0 \ \text{on } \partial\Omega. \tag{1.2}$$

Here and in the rest of the paper, $\nu$ denotes the interior unit normal vector at the boundary $\partial\Omega$. The domains $\Omega$ considered in the paper will generally be smooth, bounded and simply-connected. By variation around a minimum for $\mathcal{E}_{\kappa,H}$ we find that minimizers $(\psi, A)$ satisfy the Ginzburg-Landau equations,

$$\left.\begin{aligned} p_{\kappa H A}^2 \psi &= \kappa^2\big(1 - |\psi|^2\big)\psi \\ \mathrm{curl}^2 A &= -\frac{i}{2\kappa H}\big(\overline{\psi}\nabla\psi - \psi\nabla\overline{\psi}\big) - |\psi|^2 A \end{aligned}\right\} \quad \text{in } \Omega; \tag{1.3a}$$

$$\left.\begin{aligned} (p_{\kappa H A}\psi)\cdot\nu &= 0 \\ \mathrm{curl}\, A - 1 &= 0 \end{aligned}\right\} \quad \text{on } \partial\Omega. \tag{1.3b}$$

Here $\mathrm{curl}\,(A_1, A_2) = \partial_{x_1} A_2 - \partial_{x_2} A_1$, and $\mathrm{curl}^2 A = \big(\partial_{x_2}\mathrm{curl}\, A, -\partial_{x_1}\mathrm{curl}\, A\big)$.


A standard important consequence of the Ginzburg-Landau equations and the maximum principle is the uniform bound, independent of $\kappa$, $H$,

$$\|\psi\|_\infty \le 1, \tag{1.4}$$

for all minimizers $(\psi, A)$ of $\mathcal{E}_{\kappa,H}$ (see [DGP] for a proof). Let $F$ denote the vector potential generating the constant exterior magnetic field,

$$\left.\begin{aligned} \mathrm{div}\, F &= 0 \\ \mathrm{curl}\, F &= 1 \end{aligned}\right\} \ \text{in } \Omega, \qquad F\cdot\nu = 0 \ \text{on } \partial\Omega. \tag{1.5}$$

It is known that, for given values of the parameters $\kappa$, $H$, the functional $\mathcal{E}$ has (possibly non-unique) minimizers. However, after some analysis of the functional, one finds (see [GiPh] for details) that given $\kappa$ there exists $H(\kappa)$ such that if $H > H(\kappa)$ then $(0, F)$ is the only minimizer of $\mathcal{E}_{\kappa,H}$ (up to change of gauge). Following Lu and Pan [LuPa1], we can therefore define

$$H_{C_3}(\kappa) = \inf\big\{ H > 0 : (0, F) \text{ is a minimizer of } \mathcal{E}_{\kappa,H} \big\}. \tag{1.6}$$

In the physical interpretation of a minimizer $(\psi, A)$, $|\psi(x)|$ is a measure of the superconducting properties of the material near the point $x$. Therefore, $H_{C_3}(\kappa)$ is the value of the external magnetic field, $H$, at which the material loses its superconductivity completely. Most of our results will concern the behavior of $H_{C_3}(\kappa)$ for large values of $\kappa$. This behavior is governed (to leading order) by a linear spectral problem on the half-axis. We therefore make a short digression to introduce this simple model.

1.2. The half plane model. In the case when $\Omega$ is the half-plane $\mathbb{R}_s \times (\mathbb{R}_+)_t$ one is led, after linearization around $(0, F)$, to consider the operator

$$\mathcal{H}(\kappa H) = p_{\kappa H F}^2,$$

on $L^2\big(\mathbb{R}_s \times (\mathbb{R}_+)_t\big)$ with magnetic Neumann boundary conditions. After a gauge transformation, a scaling and a partial Fourier transformation in the variable parallel to the boundary, one finds the following operator on the half-line $\mathbb{R}_+$, depending on a real parameter $\zeta$,

$$h(\zeta) := -\frac{d^2}{d\tau^2} + (\zeta + \tau)^2, \tag{1.7}$$

on $L^2(\mathbb{R}_+, d\tau)$. The boundary condition at $\tau = 0$ is the usual Neumann boundary condition. It is intuitively clear that this (self-adjoint) operator is very important for the subject considered in the present paper, and it has been extensively studied. We will here recall the main spectral properties (see [DaHe] and [BeSt]) of $h(\zeta)$. We denote by $\mu(\zeta)$ the lowest eigenvalue of $h(\zeta)$ and by $\varphi_\zeta$ the corresponding strictly positive, normalized eigenfunction. Then the infimum, $\inf_{\zeta\in\mathbb{R}}\big(\inf \mathrm{Spec}(h(\zeta))\big)$, is actually a minimum. There exists $\xi_0 < 0$ such that $\zeta \mapsto \mu(\zeta)$ decreases monotonically on $(-\infty, \xi_0)$ to a minimum value

$$\Theta_0 = \mu(\xi_0), \tag{1.8}$$


satisfying

$$\tfrac{1}{2} < \Theta_0 < 1, \tag{1.9}$$

and then increases monotonically again. It is very important that $\Theta_0 < 1$, since the differential operator $h(\zeta)$, when taken on the entire line $\mathbb{R}$, has 1 as lowest spectral point (for all values of $\zeta$). Thus the inequality (1.9) expresses the lowering of the energy due to the Neumann boundary condition and will ultimately be responsible for the localization phenomena near the boundary observed and applied in this paper. In addition

$$\Theta_0 = \xi_0^2. \tag{1.10}$$

We will write $u_0$ instead of $\varphi_{\xi_0}$ and define

$$C_1 = \frac{u_0^2(0)}{3}. \tag{1.11}$$

Since it turns out that minimizers of the GL-functional are exponentially localized near the boundary, the leading order behavior of minimizers (in the parameter regime considered in the present paper) is governed by the operator $h$ from (1.7).
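The comparison value 1 used with (1.9) can be checked directly. The following short computation is a sketch of the standard harmonic-oscillator argument, not taken from the paper; the variable $x$ is introduced only for this illustration.

```latex
% Whole-line version of (1.7): translate x = \tau + \zeta (x is a helper variable).
\[
  -\frac{d^2}{d\tau^2} + (\zeta+\tau)^2
  \;\;\longrightarrow\;\;
  -\frac{d^2}{dx^2} + x^2
  \quad\text{on } L^2(\mathbb{R}),
\]
% which is the harmonic oscillator with eigenvalues 2n+1, n = 0, 1, 2, \dots
% Ground state check with u(x) = e^{-x^2/2}:
\[
  u'(x) = -x\,e^{-x^2/2},\qquad
  u''(x) = (x^2-1)\,e^{-x^2/2},\qquad
  -u'' + x^2 u = u .
\]
% So the lowest spectral point on the whole line is 1, independently of \zeta.
```

On the half-line the translation is no longer a symmetry, which is why $\mu(\zeta)$ depends on $\zeta$ and can drop below 1.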

1.3. Results for general domains. A central question in the mathematical treatment of Type II¹ superconductors is to establish the asymptotic behavior of $H_{C_3}(\kappa)$ for large $\kappa$. We will also be concerned with this and will, in particular, describe how $H_{C_3}(\kappa)$ can be determined by the study of a linear problem. Our first result is the following strengthening of a result in [HePa].

Theorem 1.1. Suppose $\Omega$ is a bounded simply-connected domain in $\mathbb{R}^2$ with smooth boundary. Let $k_{\max}$ be the maximal curvature of $\partial\Omega$. Then

$$H_{C_3}(\kappa) = \frac{\kappa}{\Theta_0} + \frac{C_1}{\Theta_0^{3/2}}\, k_{\max} + O\big(\kappa^{-1/2}\big), \tag{1.12}$$

where $C_1$, $\Theta_0$ are the universal constants introduced in (1.11) and (1.8). When $\Omega$ is a disc we get the improved estimate

$$H_{C_3}(\kappa) = \frac{\kappa}{\Theta_0} + \frac{C_1}{\Theta_0^{3/2}}\, k_{\max} + O\big(\kappa^{-1}\big). \tag{1.13}$$

The proof of Theorem 1.1 is given in Sect. 5 below.

Remark 1.2. The improvement in (1.12) compared to [HePa, Theorem 1.1] is in the estimate on the remainder: we get $O\big(\kappa^{-1/2}\big)$ instead of $O\big(\kappa^{-1/3}\big)$. Our result is optimal in the sense that the next term depends on detailed geometric properties of the boundary. We believe that (at least ‘generically’) the next term in $H_{C_3}(\kappa)$ is of the form $c_0\kappa^{-a}$, where both $c_0 \in \mathbb{R}$ and $a \ge \frac{1}{2}$ depend on $\partial\Omega$. In order to expand $H_{C_3}$ to higher orders we will impose a geometric condition, Assumption 1.8 below, on $\Omega$. In that case we get a full asymptotic expansion of $H_{C_3}$ given in (1.25) below.

¹ Superconductors of Type II are the ones for which $\kappa$ (in our units) is large.


Our second result is a precise estimate on the size of the superconducting region in the case where $H$ is close to, but below, $H_{C_3}$. To state it we need a bit of notation concerning the boundary, $\partial\Omega$. Let $\gamma : \mathbb{R}/|\partial\Omega| \to \mathbb{R}^2$ be a (counter-clockwise) parametrization of $\partial\Omega$ with $|\gamma'(s)| = 1$. For $s \in \mathbb{R}/|\partial\Omega|$ we denote by $k(s)$ the curvature of $\partial\Omega$ at the point $\gamma(s)$. For more discussion of these boundary coordinates, see Appendix B. We define the coordinate $t = t(x)$ that measures the distance to the boundary,

$$t(x) := \mathrm{dist}(x, \partial\Omega).$$

Let $\nu(s)$ be the interior normal vector to $\partial\Omega$ at the point $\gamma(s)$ and define $\Phi : \mathbb{R}/|\partial\Omega| \times (0, t_0) \to \Omega$ by $\Phi(s, t) = \gamma(s) + t\nu(s)$. Then, for $t_0$ sufficiently small, $\Phi$ is a diffeomorphism with image

$$\Phi\big(\mathbb{R}/|\partial\Omega| \times (0, t_0)\big) = \{ x \in \Omega \mid \mathrm{dist}(x, \partial\Omega) < t_0 \}.$$

Furthermore, $t(\Phi(s, t)) = t$. Thus, in a neighborhood of the boundary, the function $s = s(x)$ is defined (by $(s(x), t(x)) = \Phi^{-1}(x)$).

From the work of Helffer-Morame [HeMo2] (see also Helffer-Pan [HePa] for the non-linear case) we know that minimizers of the Ginzburg-Landau functional are exponentially localized to a region near the boundary (see Theorem 4.1 for a restatement of their results). Here we prove that minimizers are also localized in the tangential variable to a small zone around the points of maximum curvature. The size of that zone depends on the order to which the derivatives of the curvature vanish at such points. Our estimate is an improvement of a similar estimate in [HePa].

Theorem 1.3 (Tangential Agmon estimates (non-linear case)). Let $\Omega$ be a bounded simply-connected domain in $\mathbb{R}^2$ with smooth boundary. Let $(\psi, A) = (\psi_{\kappa,H}, A_{\kappa,H})$ be a family of minimizers of the Ginzburg-Landau functional depending on the parameters $\kappa$, $H$. We suppose that $H = H(\kappa)$ in such a way that $\rho := H_{C_3}(\kappa) - H$ satisfies $0 < \rho = o(1)$ as $\kappa \to \infty$. Then there exist $\alpha, C > 0$ such that if $\kappa > C$, then

$$\int_\Omega \chi_1^2\big(\kappa^{1/4} t\big)\, e^{2\alpha\sqrt{\kappa}\,K(s)}\, |\psi(x)|^2\,dx \le C e^{C\rho\sqrt{\kappa}} \int_\Omega |\psi(x)|^2\,dx. \tag{1.14}$$

Here $K(s)$ is the function defined by

$$K(s) := k_{\max} - k(s). \tag{1.15}$$

The proof of Theorem 1.3 is also given in Sect. 5 below.

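1.4. Discussion of critical fields. Actually, we should define more than one critical field, instead of just $H_{C_3}$. We define an upper and a lower critical field, $\underline{H}_{C_3}(\kappa) \le \overline{H}_{C_3}(\kappa)$, by

$$\begin{aligned} \overline{H}_{C_3}(\kappa) &= \inf\big\{ H > 0 : \text{for all } H' > H,\ (0, F) \text{ is the only minimizer of } \mathcal{E}_{\kappa,H'} \big\}, \\ \underline{H}_{C_3}(\kappa) &= H_{C_3}(\kappa). \end{aligned} \tag{1.16}$$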


Remark 1.4. The uniqueness of $(0, F)$ as a minimizer is only imposed on $\overline{H}_{C_3}(\kappa)$ and not on $\underline{H}_{C_3}(\kappa)$. Our motivation for this is to have $\overline{H}_{C_3}(\kappa)$ being ‘the largest possible reasonable definition’ of the critical field, and similarly $\underline{H}_{C_3}(\kappa)$ as small as possible. The proof of Theorem 1.1 gives a lower bound to $\underline{H}_{C_3}(\kappa)$ and an upper bound to $\overline{H}_{C_3}(\kappa)$, so the expansion in (1.12) is valid for both fields.

The physical idea of a sharp value for the external magnetic field strength at which superconductivity disappears requires the different definitions of the critical field to coincide. Our most precise result, Theorem 1.10, establishes this identification under a (generically satisfied) geometric assumption on $\partial\Omega$.

Most works analyzing $H_{C_3}$ relate (more or less implicitly) these global critical fields to local ones given purely in terms of spectral data of a magnetic Schrödinger operator, i.e. in terms of a linear problem. We will discuss the local fields more deeply in Sect. 3. Here we will give the following definition. Let, for $B \in \mathbb{R}_+$, the magnetic Neumann Laplacian $\mathcal{H}(B)$ be the self-adjoint operator (with Neumann boundary conditions) associated to the quadratic form

$$W^{1,2}(\Omega) \ni u \mapsto \int_\Omega \big|(-i\nabla - BF)u\big|^2\,dx. \tag{1.17}$$

We define $\lambda_1(B)$ as the lowest eigenvalue of $\mathcal{H}(B)$. The local fields can now be defined as follows.

$$\begin{aligned} \overline{H}^{\mathrm{loc}}_{C_3}(\kappa) &= \inf\big\{ H > 0 : \text{for all } H' > H,\ \lambda_1(\kappa H') \ge \kappa^2 \big\}, \\ \underline{H}^{\mathrm{loc}}_{C_3}(\kappa) &= \inf\big\{ H > 0 : \lambda_1(\kappa H) \ge \kappa^2 \big\}. \end{aligned} \tag{1.18}$$

The difference between $\overline{H}^{\mathrm{loc}}_{C_3}(\kappa)$ and $\underline{H}^{\mathrm{loc}}_{C_3}(\kappa)$ (and also between $\overline{H}_{C_3}(\kappa)$ and $\underline{H}_{C_3}(\kappa)$) can be retraced to the general non-existence of an inverse to the function $B \mapsto \lambda_1(B)$, i.e. to lack of strict monotonicity of $\lambda_1$. In Sect. 2 we study this monotonicity question, which, via Theorems 1.6 and 1.7 below, is fundamental for the analysis of $H_{C_3}$.

Remark 1.5. The detailed spectral analysis in Bauman-Phillips-Tang [BaPhTa] in the case where $\Omega$ is a disc does not exclude that, in this case, $\overline{H}^{\mathrm{loc}}_{C_3}(\kappa)$ and $\underline{H}^{\mathrm{loc}}_{C_3}(\kappa)$ differ even for large values of $\kappa$. They prove the estimate [BaPhTa, Theorem 7.2],

$$\overline{H}^{\mathrm{loc}}_{C_3}(\kappa) - \underline{H}^{\mathrm{loc}}_{C_3}(\kappa) \le \frac{C}{\kappa}, \quad \text{in the case of the disc.}$$

However, in Subsect. 2.2 we will make a precise analysis in this special case and conclude that actually (for the disc) $\overline{H}^{\mathrm{loc}}_{C_3}(\kappa) = \underline{H}^{\mathrm{loc}}_{C_3}(\kappa)$ for sufficiently large values of $\kappa$.

A simple result is that the local fields are not larger than their global counterparts.

Theorem 1.6. Let $\Omega$ be a bounded simply-connected domain in $\mathbb{R}^2$ with smooth boundary and let $\kappa > 0$. Then the following general relations hold:

$$\overline{H}_{C_3}(\kappa) \ge \overline{H}^{\mathrm{loc}}_{C_3}(\kappa), \tag{1.19}$$

$$\underline{H}_{C_3}(\kappa) \ge \underline{H}^{\mathrm{loc}}_{C_3}(\kappa). \tag{1.20}$$


The easy proof of Theorem 1.6 is given in Sect. 3. For general domains we do not know that the local fields $\overline{H}^{\mathrm{loc}}_{C_3}(\kappa)$ and $\underline{H}^{\mathrm{loc}}_{C_3}(\kappa)$ coincide. The next theorem complements Theorem 1.6 and is typical of type II materials, in the sense that it is only valid for large values of $\kappa$.

Theorem 1.7. Let $\Omega$ be a bounded simply-connected domain in $\mathbb{R}^2$ with smooth boundary. Then there exists a constant $\kappa_0 > 0$ such that, for $\kappa > \kappa_0$, we have

$$\overline{H}_{C_3}(\kappa) = \overline{H}^{\mathrm{loc}}_{C_3}(\kappa), \qquad \underline{H}_{C_3}(\kappa) = \underline{H}^{\mathrm{loc}}_{C_3}(\kappa). \tag{1.21}$$

Theorem 1.7 is proved in Sect. 6.

1.5. Results for non-degenerate domains. In order to obtain more precise results, we need to impose geometric conditions on $\Omega$. We will mainly work with the following condition.

Assumption 1.8. The domain $\Omega \subset \mathbb{R}^2$ is bounded and simply-connected and has smooth boundary. Furthermore, there exists a finite number $N$ of points $\{s_1, \dots, s_N\} \subset \mathbb{R}/|\partial\Omega|$ of maximal curvature, i.e. such that

$$k(s_j) = \sup_{s \in \mathbb{R}/|\partial\Omega|} k(s), \qquad k(s) < k(s_j), \ \ \forall s \in \mathbb{R}/|\partial\Omega| \setminus \{s_1, \dots, s_N\}. \tag{1.22}$$

Finally, these maxima are non-degenerate, in the sense that $k_{2,j} := -k''(s_j) \ne 0$. We write $k_2 = \min_j k_{2,j}$.

In Fournais-Helffer [FoHe2] the asymptotics of $\lambda_1(B)$, for large $B$, was calculated under the additional condition $N = 1$. In Appendix A we prove that a similar asymptotics holds for any $N$. We will prove in Sect. 2 that, under Assumption 1.8, $\lambda_1 : [B_0, \infty) \to [\lambda_1(B_0), \infty)$ is bijective for $B_0$ sufficiently large. In particular, we get:

Proposition 1.9. Suppose $\Omega$ satisfies Assumption 1.8. Then there exists $\kappa_0$ such that, if $\kappa \ge \kappa_0$, then the equation for $H$:

$$\lambda_1(\kappa H) = \kappa^2, \tag{1.23}$$

has a unique solution $H(\kappa)$. In other words, for large $\kappa$, the upper and lower local fields, defined in (1.18), coincide.

We define, for $\kappa \ge \kappa_0$, the local critical field $H^{\mathrm{loc}}_{C_3}(\kappa)$ to be the solution given by Proposition 1.9, i.e.

$$\lambda_1\big(\kappa H^{\mathrm{loc}}_{C_3}(\kappa)\big) = \kappa^2. \tag{1.24}$$

In Sect. 3.2, we calculate the asymptotics of $H^{\mathrm{loc}}_{C_3}(\kappa)$ (based on the asymptotics of $\lambda_1(B)$ from [FoHe2]). The result is that the solution $H^{\mathrm{loc}}_{C_3}(\kappa)$ to (1.23) has the formal asymptotic expansion given by $H_{\mathrm{formal}}$, where

$$H_{\mathrm{formal}} = \frac{\kappa}{\Theta_0}\left( 1 + \frac{C_1 k_{\max}}{\sqrt{\Theta_0}\,\kappa} - C_1\sqrt{\frac{3k_2}{2}}\;\kappa^{-3/2} + \kappa^{-7/4}\sum_{j=0}^{\infty} \eta_j\, \kappa^{-j/4} \right), \tag{1.25}$$


as $\kappa \to +\infty$. Here the coefficients $\eta_j \in \mathbb{R}$ are computable recursively. The expression for $H_{\mathrm{formal}}$ is to be understood as an asymptotic series; no convergence is proved (or even expected). Using Proposition 1.9 we can identify the lower and upper local fields and therefore find the following result.

Theorem 1.10. Suppose $\Omega$ is either the disc or that it satisfies Assumption 1.8. Then there exists $\kappa_0 > 0$ such that, when $\kappa > \kappa_0$, then

$$H^{\mathrm{loc}}_{C_3}(\kappa) = \underline{H}_{C_3}(\kappa) = \overline{H}_{C_3}(\kappa). \tag{1.26}$$

Furthermore, in the second case, we get a full asymptotic expansion of $H_{C_3}$ as given in (1.25).

Proof. The case of the disc follows from Theorems 1.6 and 1.7 together with Corollary 2.8. For the non-degenerate case, i.e. under Assumption 1.8, Theorem 1.10 follows from combining Proposition 1.9 with Theorems 1.6 and 1.7.

Remark 1.11. Under Assumption 1.8, the known asymptotics, (1.25), of $H^{\mathrm{loc}}_{C_3}(\kappa)$ can, of course, be combined with Theorem 1.10 to find the complete asymptotics of $H_{C_3}(\kappa)$ for $\kappa$ large.

1.6. Organization of the paper. Since the spectral analysis of the operator $\mathcal{H}(B)$ underlies the entire subject, we start by studying this linear model and the associated local critical fields. In Sect. 2, we give general conditions under which $B \mapsto \lambda_1(B)$ is monotone for large $B$ and thus prove Proposition 1.9. A very important domain, the disc, does not satisfy those general conditions. However, as discussed in Sect. 2, the monotonicity of $\lambda_1(B)$ is also true for the disc. The proof of this fact, Proposition 2.7, is a somewhat long application of perturbation theory and we therefore give it in Appendix C. In the short Sect. 3 we start the comparison of the local and global fields and combine the results of the previous section with the complete asymptotic expansion of $\lambda_1(B)$ known under Assumption 1.8 and recalled in Appendix A. Section 4 is technically the most demanding of the paper. Here we discuss precise estimates on the ground state energies of magnetic Neumann operators with nearly constant magnetic field, and the closely related subject of localization of the corresponding eigenstate(s). It is the improved precision in these estimates that allows us, in the following section, to get the strong general results of Theorem 1.1 and Theorem 1.3. That the local and global fields coincide, i.e. Theorem 1.7, is proved in Sect. 6. Finally, in Sect. 7 we prove an improved bound on $\|\psi\|_\infty$, compared to (1.4). This last estimate is not used in the paper; however, it complements the decay estimate, Theorem 1.3. Furthermore, it follows from the methods applied generally in the paper and will be important for further work on the subject.

2. Diamagnetism

2.1. General domains. In this section we will study the behavior for large $B$ of the lowest Neumann eigenvalue $\lambda_1(B)$ of the operator $\mathcal{H}(B)$ associated to the quadratic form in (1.17).


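We will only assume in this section that $\Omega$ is bounded with piecewise Lipschitz boundary. Then the magnetic operator $\mathcal{H}(B)$ has compact resolvent, so the eigenvalues tend to infinity; in particular, the degeneracy of the ground state is finite. Let $B \in \mathbb{R}$ and let $n$ be the degeneracy of $\lambda_1(B)$. By analytic perturbation theory (see for instance [Kato] or [ReSi, Chap. XII]) there exist $\varepsilon > 0$, $n$ analytic functions $(B - \varepsilon, B + \varepsilon) \ni \beta \mapsto \phi_j(\beta) \in H^2(\Omega) \setminus \{0\}$, and $n$ analytic functions $(B - \varepsilon, B + \varepsilon) \ni \beta \mapsto E_j(\beta) \in \mathbb{R}$, such that

$$\mathcal{H}(\beta)\phi_j(\beta) = E_j(\beta)\phi_j(\beta), \qquad E_j(B) = \lambda_1(B).$$

We may choose $\varepsilon$ sufficiently small in order to have the existence (but not necessarily the uniqueness) of $j_+, j_- \in \{1, \dots, n\}$ such that

$$\begin{aligned} \text{For } \beta > B:\quad E_{j_+}(\beta) &= \min_{j \in \{1,\dots,n\}} E_j(\beta), \\ \text{For } \beta < B:\quad E_{j_-}(\beta) &= \min_{j \in \{1,\dots,n\}} E_j(\beta). \end{aligned} \tag{2.1}$$

Define the left and right derivatives of $\lambda_1(B)$:

$$\lambda'_{1,\pm}(B) := \lim_{\varepsilon \to 0_\pm} \frac{\lambda_1(B + \varepsilon) - \lambda_1(B)}{\varepsilon}. \tag{2.2}$$

Proposition 2.1. For all $B \in \mathbb{R}$, the one-sided derivatives $\lambda'_{1,\pm}(B)$ exist and satisfy

$$\lambda'_{1,\pm}(B) = -2\,\big\langle \phi_{j_\pm} \,\big|\, F\cdot(-i\nabla - BF)\,\phi_{j_\pm} \big\rangle.$$

Proof. Clearly, $\lambda'_{1,\pm}(B) = E'_{j_\pm}(B)$. We will prove that

$$E'_{j_\pm}(B) = -2\,\big\langle \phi_{j_\pm} \,\big|\, F\cdot(-i\nabla - BF)\,\phi_{j_\pm} \big\rangle.$$

But this result is just first order perturbation theory (Feynman-Hellmann).

Proposition 2.2. Let $g$ be a function such that for all $\varepsilon \in (-1, 1)$ we have

$$|g(\beta + \varepsilon) - g(\beta)| \to 0 \tag{2.3}$$

as $\beta \to \infty$. Suppose $\Omega$ is such that there exists $\alpha \in \mathbb{R}$ such that $\lambda_1(B) = \alpha B + g(B) + o(1)$, as $B \to +\infty$. Then the limits $\lim_{B\to\infty} \lambda'_{1,+}(B)$ and $\lim_{B\to\infty} \lambda'_{1,-}(B)$ exist and

$$\lim_{B\to\infty} \lambda'_{1,+}(B) = \lim_{B\to\infty} \lambda'_{1,-}(B) = \alpha. \tag{2.4}$$

Remark 2.3. Let $\gamma \in [0, 1)$; then $g(\beta) = \beta^\gamma$ satisfies (2.3). Thus, if there exist $\gamma_1, \dots, \gamma_m \in [0, 1)$ and $\alpha, \alpha_1, \dots, \alpha_m \in \mathbb{R}$, such that (as $B \to \infty$),

$$\lambda_1(B) = \alpha B + \sum_{j=1}^{m} \alpha_j B^{\gamma_j} + o(1),$$

then Proposition 2.2 implies that

$$\lim_{B\to\infty} \lambda'_{1,\pm}(B) = \alpha.$$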
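The first claim of Remark 2.3 is stated without proof. One possible verification (a sketch, with the restriction $\beta > 1$ added for convenience) is:

```latex
% Claim: g(\beta) = \beta^\gamma with 0 <= \gamma < 1 satisfies (2.3).
% For \gamma = 0 the function g is constant, so assume 0 < \gamma < 1,
% |\varepsilon| < 1 and \beta > 1.
\[
  \bigl|(\beta+\varepsilon)^{\gamma} - \beta^{\gamma}\bigr|
  = \gamma\,\Bigl|\int_{\beta}^{\beta+\varepsilon} s^{\gamma-1}\,ds\Bigr|
  \le \gamma\,|\varepsilon|\,(\beta-1)^{\gamma-1}
  \;\longrightarrow\; 0
  \quad\text{as } \beta\to\infty ,
\]
% because s -> s^{\gamma-1} is decreasing and s >= \beta - 1 on the
% interval of integration, while \gamma - 1 < 0.
```

Finite linear combinations of such powers inherit (2.3) by the triangle inequality, which is what the second part of the remark uses.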


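Proof of Proposition 2.2. Clearly², for all $B$, we have $\lambda'_{1,+}(B) \le \lambda'_{1,-}(B)$. So it suffices to prove that

$$\alpha \le \liminf_{B\to\infty} \lambda'_{1,+}(B), \tag{2.5}$$

$$\limsup_{B\to\infty} \lambda'_{1,-}(B) \le \alpha. \tag{2.6}$$

Let $\varepsilon > 0$. Then

$$\begin{aligned} \lambda'_{1,+}(B) &= -2\,\big\langle \phi_{j_+}(B) \,\big|\, F\cdot(-i\nabla - BF)\,\phi_{j_+}(B) \big\rangle \\ &= \frac{1}{\varepsilon}\,\Big\langle \phi_{j_+}(B) \,\Big|\, \big(\mathcal{H}(B + \varepsilon) - \mathcal{H}(B) - \varepsilon^2 F^2\big)\,\phi_{j_+}(B) \Big\rangle. \end{aligned}$$

Therefore, the variational principle implies

$$\lambda'_{1,+}(B) \ge \frac{\lambda_1(B + \varepsilon) - \lambda_1(B)}{\varepsilon} - \varepsilon\,\|F\|^2_{L^\infty(\Omega)}.$$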
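The operator identity behind the second expression for $\lambda'_{1,+}(B)$ can be spelled out as follows. This is a formal sketch at the level of differential expressions (the rigorous statement is in the sense of quadratic forms); the shorthand $P_B$ and $\phi$ is introduced only for this illustration, and $\phi = \phi_{j_+}(B)$ is assumed normalized.

```latex
% Shorthand (hypothetical): P_B := -i\nabla - BF, \phi := \phi_{j_+}(B), \|\phi\| = 1.
\begin{align*}
  \mathcal{H}(B+\varepsilon)
    &= (P_B - \varepsilon F)^2
     = P_B^2 - \varepsilon\,(P_B\cdot F + F\cdot P_B) + \varepsilon^2 |F|^2, \\
  P_B\cdot(F\phi)
    &= F\cdot P_B\phi - i\,(\operatorname{div} F)\,\phi
     = F\cdot P_B\phi
     \qquad (\operatorname{div} F = 0 \text{ by } (1.5)), \\
  \mathcal{H}(B+\varepsilon) - \mathcal{H}(B) - \varepsilon^2 |F|^2
    &= -2\varepsilon\, F\cdot P_B .
\end{align*}
% Variational step, for \varepsilon > 0:
%   <\phi | H(B+\varepsilon) \phi> >= \lambda_1(B+\varepsilon),
%   <\phi | H(B) \phi>              = \lambda_1(B),
%   <\phi | |F|^2 \phi>            <= \|F\|_{L^\infty}^2 .
```

Dividing by $\varepsilon > 0$ then gives the stated lower bound; for $\varepsilon < 0$ the division reverses the inequality, which is the mechanism used for (2.6).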

By assumption there exists a function $f : \mathbb{R}_+ \to \mathbb{R}_+$, with $\lim_{\beta\to\infty} f(\beta) = 0$, and such that $|\lambda_1(\beta) - (\alpha\beta + g(\beta))| \le f(\beta)$. Thus,

$$\lambda'_{1,+}(B) \ge \alpha + \frac{g(B + \varepsilon) - g(B)}{\varepsilon} - \frac{f(B) + f(B + \varepsilon)}{\varepsilon} - \varepsilon\,\|F\|^2_{L^\infty(\Omega)}. \tag{2.7}$$

Therefore, (using (2.3))

$$\liminf_{B\to\infty} \lambda'_{1,+}(B) \ge \alpha - \varepsilon\,\|F\|^2_{L^\infty(\Omega)}.$$

Since $\varepsilon > 0$ was arbitrary, this finishes the proof of (2.5). The proof of (2.6) is similar (taking $\varepsilon < 0$ reverses the inequalities) and is omitted.

From Remark 2.3 it is clear that, in order to prove monotonicity of $\lambda_1(B)$, we only need to have an asymptotic expansion of $\lambda_1(B)$ with an error term of order $o(1)$. This was shown in [FoHe2], under Assumption 1.8, with the additional condition that there exists a unique point of maximal curvature, i.e. $N = 1$. Appendix A generalizes this result to any $N$. In particular, we get that Proposition 2.4 below holds in this case. Notice that with several maxima (and symmetry) one expects the difference between $\lambda_1(B)$ and $\lambda_2(B)$ to be exponentially small, so it may seem a bit surprising that one is able to prove Proposition 2.4 in this case. Another very interesting case where the conditions of Proposition 2.2 can be verified is the case of a domain with corners. This is the subject of the work [BonDa].

Proposition 2.4. Suppose $\Omega$ satisfies Assumption 1.8. Then

$$\lim_{B\to\infty} \lambda'_{1,+}(B) = \lim_{B\to\infty} \lambda'_{1,-}(B) = \Theta_0. \tag{2.8}$$

In particular, $B \mapsto \lambda_1(B)$ is strictly increasing for large $B$.

² Using the fact that $\lambda_{j_\pm}$ is an analytic choice of the eigenvalues in a neighborhood of $B$.


Proof. This is clear using Proposition 2.2, Corollary A.4 and Remark 2.3.

We finish this subsection by giving the proof of Proposition 1.9.

Proof of Proposition 1.9. Since, by Proposition 2.4, $\lim_{B\to\infty} \lambda'_{1,+}(B) = \lim_{B\to\infty} \lambda'_{1,-}(B) = \Theta_0 > 0$, there exists $B_0 > 0$ such that $B \mapsto \lambda_1(B)$ is strictly increasing from $[B_0, +\infty)$ to $[\lambda_1(B_0), +\infty)$. Furthermore, by continuity, we may choose $B_0$ sufficiently big such that

$$\lambda_1(B) < \lambda_1(B_0), \tag{2.9}$$

for all $B < B_0$. So, using (2.9), the inverse function $\lambda_1^{-1}$ is uniquely defined as a continuous function

$$\lambda_1^{-1} : [\lambda_1(B_0), +\infty) \to [B_0, +\infty).$$

Define $\kappa_0$ by $\kappa_0 = \sqrt{\lambda_1(B_0)}$. Then, for $\kappa > \kappa_0$, the equation

$$\lambda_1(\kappa H) = \kappa^2,$$

has the unique solution

$$H = \frac{\lambda_1^{-1}\big(\kappa^2\big)}{\kappa}.$$
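To see how the inversion $H = \lambda_1^{-1}(\kappa^2)/\kappa$ produces the leading terms of (1.12), one can solve a truncated model equation. The sketch below assumes the two-term model $\lambda_1(B) \approx \Theta_0 B - C_1 k_{\max}\sqrt{B}$; this truncation is an assumption made for illustration (it agrees with the first two terms of (2.12) for the unit disc, where $k_{\max} = 1$), and the error terms refer to the model equation only.

```latex
% Model equation (assumption): \Theta_0 B - C_1 k_{\max} \sqrt{B} = \kappa^2, B = \kappa H.
% Put x = \sqrt{B} (helper variable) and solve the quadratic in x:
\begin{align*}
  x &= \frac{C_1 k_{\max} + \sqrt{C_1^2 k_{\max}^2 + 4\,\Theta_0\,\kappa^2}}{2\,\Theta_0}
     = \frac{\kappa}{\sqrt{\Theta_0}} + \frac{C_1 k_{\max}}{2\,\Theta_0} + O(\kappa^{-1}), \\
  B &= x^2 = \frac{\kappa^2}{\Theta_0} + \frac{C_1 k_{\max}}{\Theta_0^{3/2}}\,\kappa + O(1), \\
  H &= \frac{B}{\kappa} = \frac{\kappa}{\Theta_0} + \frac{C_1}{\Theta_0^{3/2}}\,k_{\max} + O(\kappa^{-1}).
\end{align*}
```

The two explicit terms coincide with those in (1.12); the true remainder depends on the terms of $\lambda_1(B)$ that the model drops.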

2.2. The case of the disc. In the case where $\Omega = B(0, R)$ is a disc, the best available asymptotics ([BaPhTa]) does not give that the hypothesis (2.3) is satisfied. In this subsection we will state a more precise asymptotic estimate in order to settle the question of diamagnetism for the disc. The proof will be given in Appendix C. Remember that the spectral parameters $C_1$, $\Theta_0$, $\xi_0$ were introduced in (1.8) and (1.11).

Theorem 2.5 (Eigenvalue asymptotics for the disc). Suppose that $\Omega$ is the unit disc. Define $\delta(m, B)$, for $m \in \mathbb{Z}$, $B > 0$, by

$$\delta(m, B) = m - \frac{B}{2} - \xi_0\sqrt{B}. \tag{2.10}$$

Then there exist (computable) constants $C_0, \delta_0 \in \mathbb{R}$ such that if

$$\Delta_B = \inf_{m\in\mathbb{Z}} |\delta(m, B) - \delta_0|, \tag{2.11}$$

then

$$\lambda_1(B) = \Theta_0 B - C_1\sqrt{B} + 3C_1\sqrt{\Theta_0}\,\Delta_B^2 + C_0 + O\big(B^{-1/2}\big). \tag{2.12}$$

Remark 2.6. As the proof will show, the constants $C_0$, $\delta_0$ can be expressed in terms of spectral data for the basic operator $h(\xi_0)$ discussed in Subsect. 1.2.

Before we give the proof of Theorem 2.5, we collect the following important consequence.

Proposition 2.7. Let $\Omega$ be the disc. Then the left- and right-hand derivatives $\lambda'_{1,\pm}(B)$ exist and satisfy

$$\lambda'_{1,+}(B) \le \lambda'_{1,-}(B), \qquad \liminf_{B\to+\infty} \lambda'_{1,+}(B) \ge \Theta_0 - \tfrac{3}{2} C_1 |\xi_0| > 0. \tag{2.13}$$

In particular, $B \mapsto \lambda_1(B)$ is strictly increasing for large $B$.


The monotonicity of B → λ1 (B) implies that the local fields are equal. Corollary 2.8. Let be the disc. Then there exists a constant κ0 > 0, such that, if κ > κ0 , then HCloc (κ) = HCloc (κ). 3 3

(2.14)

√

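Proof of Proposition 2.7. Let $g(B) = -C_1 \sqrt{B} + 3 C_1 \sqrt{\Theta_0}\, \Delta_B^2 + C_0$, $\alpha = \Theta_0$. We calculate as in the proof of Proposition 2.2 until we reach (2.7). Notice that $0 \le \Delta_B \le \tfrac12$, for all $B > 0$. Furthermore, consider $B > 1$, $\epsilon > 0$. Let $m_0 \in \mathbb{Z}$ be such that
$$\Delta_{B+\epsilon} = \Big| m_0 - \frac{B+\epsilon}{2} - \xi_0 \sqrt{B+\epsilon} - \delta_0 \Big|.$$
Then, since $-1 < \xi_0 < 0$,
$$\Delta_{B+\epsilon} - \Delta_B \ge \Big| m_0 - \frac{B+\epsilon}{2} - \xi_0\sqrt{B+\epsilon} - \delta_0 \Big| - \Big| m_0 - \frac{B}{2} - \xi_0 \sqrt{B} - \delta_0 \Big| \ge - \Big| -\frac{\epsilon}{2} - \xi_0 \frac{\epsilon}{\sqrt{B+\epsilon} + \sqrt{B}} \Big| \ge -\frac{\epsilon}{2}. \tag{2.15}$$
Therefore,
$$\Delta_{B+\epsilon}^2 - \Delta_B^2 = (\Delta_{B+\epsilon} + \Delta_B)(\Delta_{B+\epsilon} - \Delta_B) \ge -\frac{\epsilon}{2},$$
and we get
$$\liminf_{B \to +\infty} \frac{g(B+\epsilon) - g(B)}{\epsilon} \ge -\tfrac{3}{2} C_1 \sqrt{\Theta_0}.$$
The rest of the proof follows the one of Proposition 2.2 by taking $\epsilon$ to zero (and using (1.9)).

The numerical fact that $\Theta_0 > \tfrac32 C_1 |\xi_0|$ follows from known identities. We give the following short argument. From [FoHe2, Prop. A.3], we get that $3 C_1 |\xi_0| = 1 - 4 I_2$, where the integral $I_2$ (given in (C.18) below) satisfies $I_2 > 0$. In particular, $3 C_1 |\xi_0| < 1$. Since it is known that $\Theta_0 > \tfrac12$, this proves the desired statement. We also state the following approximate numerical values from [Bon1],
$$C_1 \sim 0.254, \qquad |\xi_0| \sim 0.768. \tag{2.16}$$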
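As a quick sanity check of the numerical inequality used above, one can plug the approximate values (2.16) into the two conditions $3C_1|\xi_0| < 1$ and $\Theta_0 > \tfrac12$. The sketch below assumes the standard identity $\Theta_0 = \xi_0^2$, which is not restated in this section; with rounded inputs it only illustrates the inequality and does not prove it.

```python
C1 = 0.254        # approximate value quoted in (2.16)
ABS_XI0 = 0.768   # approximate value of |xi_0| quoted in (2.16)

# Assumption (not restated in this section): Theta_0 = xi_0 ** 2.
THETA0 = ABS_XI0 ** 2

# 3*C1*|xi_0| equals 1 - 4*I_2 in the proof, so it should be below 1.
three_c1_xi0 = 3.0 * C1 * ABS_XI0

# Right-hand side of (2.13): asymptotic lower bound on the derivative.
slope_bound = THETA0 - 1.5 * C1 * ABS_XI0

if __name__ == "__main__":
    print(f"3*C1*|xi0|            = {three_c1_xi0:.4f}")
    print(f"Theta0 (approx.)      = {THETA0:.4f}")
    print(f"Theta0 - 1.5*C1*|xi0| = {slope_bound:.4f}")
```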

On the Third Critical Field in Ginzburg-Landau Theory


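3. Local Critical Fields

3.1. General analysis. In addition to the (global) critical fields $\overline{H}_{C_3}(\kappa)$ and $\underline{H}_{C_3}(\kappa)$, one can also define local fields. These local fields are determined by the values where the normal solution³ $(0, F)$ is a not unstable local minimum of $\mathcal{E}_{\kappa,H}$, i.e.
$$\overline{H}^{loc}_{C_3}(\kappa) = \inf\big\{ H > 0 : \text{for all } H' > H,\ \operatorname{Hess} \mathcal{E}_{\kappa,H'}\big|_{(0,F)} \ge 0 \big\}, \qquad \underline{H}^{loc}_{C_3}(\kappa) = \inf\big\{ H > 0 : \operatorname{Hess} \mathcal{E}_{\kappa,H}\big|_{(0,F)} \ge 0 \big\}. \tag{3.1}$$
Since the Hessian, $\operatorname{Hess} \mathcal{E}_{\kappa,H}\big|_{(0,F)}$, at the normal solution is given by
$$\Big\langle (\phi_1, a_1),\ \operatorname{Hess} \mathcal{E}_{\kappa,H}\big|_{(0,F)} (\phi_2, a_2) \Big\rangle = \int_\Omega \Big[ (-i\nabla - \kappa H F)\phi_1 \cdot (-i\nabla - \kappa H F)\phi_2 - \kappa^2 \phi_1 \phi_2 + \kappa^2 H^2 (\operatorname{curl} a_1)(\operatorname{curl} a_2) \Big]\, dx, \tag{3.2}$$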

we get the equivalent definitions given in (1.18) in the Introduction. Furthermore, we can prove the general comparison between the local and global fields given in Theorem 1.6.

Proof of Theorem 1.6. We first prove (1.19). Suppose $H > \overline{H}_{C_3}(\kappa)$. Then $(0, F)$ is the only minimizer of $\mathcal{E}_{\kappa,H}$. In particular, for all $(\phi, A)$, $\mathcal{E}_{\kappa,H}[\phi, F + A] \ge \mathcal{E}_{\kappa,H}[0, F] = 0$. This implies that $\operatorname{Hess} \mathcal{E}_{\kappa,H}\big|_{(0,F)} \ge 0$. Since $H > \overline{H}_{C_3}(\kappa)$ was arbitrary, we get (1.19).

Next we prove (1.20). Suppose $H < \underline{H}^{loc}_{C_3}(\kappa)$. Then $\lambda_1(\kappa H) < \kappa^2$. Let $\psi$ be a ground state for $\mathcal{H}(\kappa H)$. We use, for $\eta > 0$, the pair $(\eta\psi, F)$ as a trial state in $\mathcal{E}_{\kappa,H}$,
$$\mathcal{E}_{\kappa,H}[\eta\psi, F] = \big(\lambda_1(\kappa H) - \kappa^2\big)\eta^2 \|\psi\|^2_{L^2(\Omega)} + \frac{\kappa^2}{2}\eta^4 \|\psi\|^4_{L^4(\Omega)}.$$
Since $\lambda_1(\kappa H) - \kappa^2 < 0$, we get $\mathcal{E}_{\kappa,H}[\eta\psi, F] < 0$ for $\eta$ sufficiently small (using that $W^{1,2}(\Omega) \subset L^4(\Omega)$). Thus $(0, F)$ is not a minimizer for $\mathcal{E}_{\kappa,H}$. Since $H < \underline{H}^{loc}_{C_3}(\kappa)$ was arbitrary, this proves (1.20) and therefore finishes the proof.

As a corollary of Proposition 1.9 we get the following result.

Theorem 3.1. Suppose that $\Omega$ satisfies Assumption 1.8. Then there exists $\kappa_0 > 0$, such that, for all $\kappa > \kappa_0$,
$$\underline{H}^{loc}_{C_3}(\kappa) = \overline{H}^{loc}_{C_3}(\kappa) = H^{loc}_{C_3}(\kappa).$$

Proof. This is just a combination of Proposition 1.9 with the characterization (1.18). (Recall that $H^{loc}_{C_3}(\kappa)$ was defined in (1.24).)
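The trial-state step in the proof of Theorem 1.6 reduces to a one-variable quartic in $\eta$: a negative quadratic coefficient always wins for small $\eta$. The sketch below makes this explicit. The function names and the sample numbers are hypothetical and serve only to illustrate the formula $\mathcal{E}_{\kappa,H}[\eta\psi,F] = (\lambda_1 - \kappa^2)\eta^2\|\psi\|_2^2 + \tfrac{\kappa^2}{2}\eta^4\|\psi\|_4^4$.

```python
import math


def trial_energy(eta, lam1, kappa, l2_sq, l4_fourth):
    """Energy of the trial pair (eta*psi, F).

    l2_sq     stands for ||psi||_{L^2}^2,
    l4_fourth stands for ||psi||_{L^4}^4.
    """
    return ((lam1 - kappa**2) * eta**2 * l2_sq
            + 0.5 * kappa**2 * eta**4 * l4_fourth)


def best_eta(lam1, kappa, l2_sq, l4_fourth):
    """Minimizer over eta > 0 of the quartic, assuming lam1 < kappa**2."""
    a = (lam1 - kappa**2) * l2_sq      # negative quadratic coefficient
    b = 0.5 * kappa**2 * l4_fourth     # positive quartic coefficient
    return math.sqrt(-a / (2.0 * b))


if __name__ == "__main__":
    # Hypothetical sample data with lam1 slightly below kappa**2.
    lam1, kappa, l2_sq, l4_fourth = 99.0, 10.0, 1.0, 2.0
    eta = best_eta(lam1, kappa, l2_sq, l4_fourth)
    print(eta, trial_energy(eta, lam1, kappa, l2_sq, l4_fourth))
```

At the optimal $\eta$ the energy equals $-a^2/(4b)$ in the notation of the code, which is strictly negative whenever $\lambda_1(\kappa H) < \kappa^2$.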

³ Notice that $(0, F)$ is a solution to the GL-equations (1.3), for all values of $\kappa$, $H$. Thus, $(0, F)$ is always a stationary point of the Ginzburg-Landau functional $\mathcal{E}_{\kappa,H}$.


3.2. Calculating asymptotics. In this section we will calculate the form of the asymptotics of the solution $H(\kappa)$ to $\lambda_1(\kappa H) = \kappa^2$. The calculation is based on the asymptotic expansion of $\lambda_1(B)$ proved in the work [FoHe2] and its extension in Appendix A. Therefore we need to impose Assumption 1.8.

Lemma 3.2. Suppose $\Omega$ satisfies Assumption 1.8. Let $H = H(\kappa)$ be the solution to the equation
$$\lambda_1(\kappa H) = \kappa^2, \tag{3.3}$$
given by Proposition 1.9. Then there exists a sequence $\{\eta_j\}_{j=0}^{\infty} \subset \mathbb{R}$ such that
$$H = \frac{\kappa}{\Theta_0}\Big( 1 + \frac{C_1 k_{\max}}{\sqrt{\Theta_0}\,\kappa} - C_1 \sqrt{\tfrac{3 k_2}{2}}\, \kappa^{-3/2} + \kappa^{-7/4} \sum_{j=0}^{\infty} \eta_j \kappa^{-j/4} \Big), \tag{3.4}$$
in the sense of an asymptotic series as $\kappa \to +\infty$.

Proof. Let $\mu^{(1)}(h)$ be the lowest eigenvalue of the magnetic Neumann Laplacian associated to the following quadratic form on $L^2(\Omega)$:
$$\int_\Omega \big| (-ih\nabla - F) u \big|^2\, dx.$$
From [FoHe2] and/or Appendix A we know that there exists a sequence $\{\zeta_j\}_{j=0}^{\infty} \subset \mathbb{R}$ such that for all $M \in \mathbb{N}$,
$$\mu^{(1)}(h) = \Theta_0 h - C_1 k_{\max} h^{3/2} + C_1 \Theta_0^{1/4} \sqrt{\tfrac{3 k_2}{2}}\, h^{7/4} + h^{15/8} \sum_{j=0}^{M} \zeta_j h^{j/8} + O\big(h^{2 + M/8}\big), \tag{3.5}$$
as $h \to 0_+$. By a simple scaling we see that if $\lambda_1(B)$ is the lowest eigenvalue of the magnetic Neumann Laplacian associated to the form defined in (1.17), then $\lambda_1(B) = B^2 \mu^{(1)}(B^{-1})$, and therefore (as $B \to \infty$),
$$\lambda_1(B) = \Theta_0 B - C_1 k_{\max} B^{1/2} + C_1 \Theta_0^{1/4} \sqrt{\tfrac{3 k_2}{2}}\, B^{1/4} + B^{1/8} \sum_{j=0}^{M} \zeta_j B^{-j/8} + O\big(B^{-M/8}\big). \tag{3.6}$$
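To see where the first two terms of (3.4) come from, one can truncate (3.6) after its second term and solve $\lambda_1(\kappa H) = \kappa^2$ exactly, since the truncated equation is quadratic in $\sqrt{B}$. The sketch below does this and compares with $\kappa/\Theta_0 + C_1 k_{\max}/\Theta_0^{3/2}$. The two-term truncation, the helper names and the sample values are assumptions made for illustration; the dropped $B^{1/4}$ term would change $H$ at order $\kappa^{-1/2}$.

```python
import math

THETA0 = 0.768 ** 2   # assumes Theta_0 = xi_0**2 with |xi_0| ~ 0.768
C1 = 0.254            # approximate value quoted in (2.16)


def lambda1_two_terms(B, kmax):
    """First two terms of (3.6): Theta_0*B - C1*kmax*sqrt(B)."""
    return THETA0 * B - C1 * kmax * math.sqrt(B)


def solve_H_two_terms(kappa, kmax):
    """Solve lambda1_two_terms(kappa*H, kmax) = kappa**2 for H.

    With x = sqrt(B) the equation is Theta_0*x^2 - C1*kmax*x - kappa^2 = 0;
    we take the positive root.
    """
    c = C1 * kmax
    x = (c + math.sqrt(c * c + 4.0 * THETA0 * kappa**2)) / (2.0 * THETA0)
    return x * x / kappa


def H_leading(kappa, kmax):
    """kappa/Theta_0 + C1*kmax/Theta_0^(3/2), the first two terms of (3.4)."""
    return kappa / THETA0 + C1 * kmax / THETA0**1.5


if __name__ == "__main__":
    for kappa in (10.0, 100.0, 1000.0):
        H = solve_H_two_terms(kappa, kmax=1.0)
        print(kappa, H, H - H_leading(kappa, kmax=1.0))
```

For this truncated model the difference between the exact root and the two-term formula decays like $1/\kappa$, in line with the structure of the expansion.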


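We calculate with the Ansatz for $H(\kappa)$ given by (3.4):
$$\begin{aligned}
\lambda_1(\kappa H) \sim{}& \kappa^2 \Big( 1 + \frac{C_1 k_{\max}}{\sqrt{\Theta_0}\,\kappa} - C_1 \sqrt{\tfrac{3k_2}{2}}\, \kappa^{-3/2} + \kappa^{-7/4} \sum_{j=0}^{\infty} \eta_j \kappa^{-j/4} \Big) \\
& - C_1 k_{\max} \frac{\kappa}{\sqrt{\Theta_0}} \Big( 1 + \frac{C_1 k_{\max}}{\sqrt{\Theta_0}\,\kappa} - C_1 \sqrt{\tfrac{3k_2}{2}}\, \kappa^{-3/2} + \kappa^{-7/4} \sum_{j=0}^{\infty} \eta_j \kappa^{-j/4} \Big)^{1/2} \\
& + C_1 \Theta_0^{1/4} \sqrt{\tfrac{3k_2}{2}}\, \frac{\sqrt{\kappa}}{\Theta_0^{1/4}} \Big( 1 + \frac{C_1 k_{\max}}{\sqrt{\Theta_0}\,\kappa} - C_1 \sqrt{\tfrac{3k_2}{2}}\, \kappa^{-3/2} + \kappa^{-7/4} \sum_{j=0}^{\infty} \eta_j \kappa^{-j/4} \Big)^{1/4} \\
& + \sum_{j=0}^{\infty} \zeta_j \frac{\kappa^{(1-j)/4}}{\Theta_0^{(1-j)/8}} \Big( 1 + \frac{C_1 k_{\max}}{\sqrt{\Theta_0}\,\kappa} - C_1 \sqrt{\tfrac{3k_2}{2}}\, \kappa^{-3/2} + \kappa^{-7/4} \sum_{k=0}^{\infty} \eta_k \kappa^{-k/4} \Big)^{(1-j)/8} \\
={}& \kappa^2 + \frac{C_1 k_{\max}}{\sqrt{\Theta_0}}\, \kappa - C_1 \sqrt{\tfrac{3k_2}{2}}\, \kappa^{1/2} + \kappa^{1/4} \sum_{j=0}^{\infty} \eta_j \kappa^{-j/4} \\
& - C_1 k_{\max} \frac{\kappa}{\sqrt{\Theta_0}} \Big( 1 + \frac12 \frac{C_1 k_{\max}}{\sqrt{\Theta_0}\,\kappa} - \frac12 C_1 \sqrt{\tfrac{3k_2}{2}}\, \kappa^{-3/2} + \kappa^{-7/4} \sum_{j=0}^{\infty} f_j^{(1)} \kappa^{-j/4} \Big) \\
& + C_1 \sqrt{\tfrac{3k_2}{2}}\, \sqrt{\kappa}\, \Big( 1 + \frac14 \frac{C_1 k_{\max}}{\sqrt{\Theta_0}\,\kappa} - \frac14 C_1 \sqrt{\tfrac{3k_2}{2}}\, \kappa^{-3/2} + \kappa^{-7/4} \sum_{j=0}^{\infty} f_j^{(2)} \kappa^{-j/4} \Big) \\
& + \sum_{j=0}^{\infty} \zeta_j \frac{\kappa^{(1-j)/4}}{\Theta_0^{(1-j)/8}} \Big( 1 + \frac{1-j}{8} \frac{C_1 k_{\max}}{\sqrt{\Theta_0}\,\kappa} - \frac{1-j}{8} C_1 \sqrt{\tfrac{3k_2}{2}}\, \kappa^{-3/2} + \kappa^{-7/4} \sum_{k=0}^{\infty} f_{j,k} \kappa^{-k/4} \Big).
\end{aligned} \tag{3.7}$$
Here the coefficients $f_j^{(1)}$, $f_j^{(2)}$, $f_{j,k}$ only depend on $(\eta_0, \ldots, \eta_j)$ (not on the $\eta_s$ with $s > j$). Therefore, we find the following structure:
$$\lambda_1(\kappa H) = \kappa^2 + \kappa^{1/4} \sum_{j=0}^{\infty} \kappa^{-j/4} (\eta_j + g_j),$$
where the $g_j$ only depend on the $\eta_s$ with $s < j$. This implies that there exists a solution of the form (3.4) to the identity $\lambda_1(\kappa H) = \kappa^2$ in the sense of asymptotic series. It is elementary to prove by induction that the solution $H(\kappa)$ given by Proposition 1.9 must have the asymptotic expansion given by the formal solution (3.4).

Remark 3.3 (Comparison with Bernoff-Sternberg [BeSt] results). In [BeSt] the first three terms in the expansion of $H_{C_3}$ are calculated formally by construction of a trial state. In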


[BeSt, Formula (3.1)], and using the notation from that paper, the asymptotics
$$H_{C_3}^{BS} = h k + \frac{h}{3 J_0}\, \kappa - \frac{1}{2 J_0} \sqrt{\frac{-2 \kappa_{ss} h}{3}}\, \frac{1}{\sqrt{k}} + O\Big(\frac{1}{k}\Big)$$
is given. To translate to our notations we use the table
$$k^{BS} = \kappa, \quad h^{BS} = \frac{1}{\Theta_0}, \quad \kappa^{BS} = k_{\max}, \quad \kappa_{ss}^{BS} = -k_2, \quad J_0 = \frac{\sqrt{\Theta_0}}{3 C_1}. \tag{3.8}$$
So in our notation, their result is
$$H_{C_3}^{BS} = \frac{\kappa}{\Theta_0} + \frac{C_1 k_{\max}}{\Theta_0^{3/2}} - \frac{C_1}{\Theta_0} \sqrt{\tfrac{3 k_2}{2}}\, \kappa^{-1/2} + O\Big(\frac{1}{\kappa}\Big).$$
This is in almost complete agreement with our result (3.4), except for the fact that they give the next term as being of order $O(\kappa^{-1})$, and we do not exclude the existence of a term of order $O(\kappa^{-3/4})$.

4. Localization

In this section we will give Agmon estimates for the linear problem, both parallel and normal to the boundary. We also recall how these estimates carry over to the non-linear equations (1.3).

4.1. Estimates in the normal direction. First we recall the estimate in the normal direction from [HePa] (see also the work on the linear problem [HeMo2]). Define the magnetic quadratic form by
$$W^{1,2}(\Omega) \ni u \mapsto q_{BA}[u] = \int_\Omega |p_{BA} u|^2\, dx, \tag{4.1}$$
where $A$ is any (possibly $B$-dependent) vector field satisfying the following estimates (for some $C_0 > 0$),
$$\|A\|_{C^2(\Omega)} \le C_0, \qquad \|\operatorname{curl} A - 1\|_{C^1(\Omega)} \le C_0 B^{-1/4}, \qquad \operatorname{curl} A = 1 \ \text{on } \partial\Omega. \tag{4.2}$$

Notice that, by [HePa, Prop. 4.2] (recalled as Theorem 5.1 below), (4.2) is verified for the minimizers of the GL-functional. We will denote the unique (Neumann) self-adjoint operator associated with the quadratic form $q_{BA}$ by $\mathcal{H}_{BA}$.

Theorem 4.1 (Uniform normal Agmon estimates (linear case)). Let $C_0$ be given and let $\Omega$ be a bounded simply-connected domain with smooth boundary. Then there exist $C, \alpha$ such that if $A = A_B$ is a vector field satisfying (4.2) (with the given $C_0$) and if $\phi_{BA}$ is an eigenfunction of $\mathcal{H}_{BA}$, with eigenvalue $\lambda = \lambda(B) \le (1 - C_0^{-1}) B$, then,
$$\int_\Omega e^{\alpha \sqrt{B}\, t(x)} \Big( |\phi_{BA}|^2 + B^{-1} |p_{BA} \phi_{BA}|^2 \Big)\, dx \le C \int_\Omega |\phi_{BA}|^2\, dx, \tag{4.3}$$
for all $B > C$.


The above result can also be applied to obtain similar localization estimates for the (non-linear) Ginzburg-Landau problem. This was carried out in [HePa, Prop. 4.2] and [Pan, Lemma 7.2].

Theorem 4.2 (Uniform normal Agmon estimates (non-linear case)). Let $\delta > 0$ and let $\Omega$ be a bounded simply-connected domain with smooth boundary. Then there exist $\alpha, C > 0$ such that if $(\psi, A)_{\kappa,H}$ are minimizers of the Ginzburg-Landau functional, with $(\kappa, H)$ satisfying that
$$\kappa / H < 1 - \delta, \qquad \kappa > C, \tag{4.4}$$
then
$$\int_\Omega e^{\alpha \sqrt{\kappa H}\, t(x)} \Big( |\psi|^2 + \frac{1}{\kappa H} |p_{\kappa H A} \psi|^2 \Big)\, dx \le C \int_\Omega |\psi|^2\, dx. \tag{4.5}$$

This theorem admits the following basic corollary:

Corollary 4.3. With the assumptions of the theorem, for any $p \ge 2$, there exists a constant $C_p$ such that
$$\|\psi\|_{L^2(\Omega)} \le C_p\, \kappa^{-\frac{p-2}{2p}}\, \|\psi\|_{L^p(\Omega)}. \tag{4.6}$$

Proof. We first observe that the normal Agmon estimate gives the existence of $C$ such that:
$$\|\psi\|^2_{L^2(\Omega)} \le C \int_{\{d(x, \partial\Omega) \le C/\kappa\}} |\psi(x)|^2\, dx.$$
We can then use Hölder to get for any $q \ge 1$,
$$\|\psi\|^2_{L^2(\Omega)} \le \tilde{C} \kappa^{-(1 - \frac1q)} \Big( \int_\Omega |\psi(x)|^{2q}\, dx \Big)^{\frac1q}.$$
Taking $q = \frac{p}{2}$, we get (4.6).
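The exponent bookkeeping in the proof of Corollary 4.3 can be checked with exact rational arithmetic. The sketch below is only illustrative and the function name is hypothetical: it starts from the Hölder exponent $-(1 - 1/q)$ for $\|\psi\|_2^2$ with $q = p/2$, takes square roots, and confirms that the result is $-(p-2)/(2p)$.

```python
from fractions import Fraction


def corollary_exponent(p):
    """Exponent e with ||psi||_2 <= C * kappa**e * ||psi||_p, using q = p/2."""
    q = Fraction(p) / 2
    # Exponent of kappa in the Hoelder bound on ||psi||_2^2.
    squared_exponent = -(1 - 1 / q)
    # Taking square roots on both sides halves the exponent.
    return squared_exponent / 2


if __name__ == "__main__":
    for p in (2, 3, 4, 6, 10):
        print(p, corollary_exponent(p))
```

For $p = 4$ this gives the exponent $-1/4$ that is used later in (6.5).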

4.2. Energy estimates. In this subsection we will give uniform lower bounds on the ground state energies of the magnetic quadratic form $q_{BA}$.

Theorem 4.4. Let $C_0 > 0$ be given and let $\Omega$ be a bounded, simply-connected domain in $\mathbb{R}^2$ with smooth boundary. Let $\gamma \in \mathbb{R}$ satisfy that $\Theta_0 < \gamma < 1$. Then there exist $B_0, \epsilon_0, M > 0$ such that, if $A = A_B$ is a family of vector fields depending on the parameter $B$ and satisfying (4.2) with the given $C_0$, and $U_B(x)$ is given by
$$U_B(x) = \begin{cases} \gamma B, & \text{if } \operatorname{dist}(x, \partial\Omega) > 2 B^{-1/8}, \\ \Theta_0 B - C_1 \tilde{k}(s) B^{1/2} - M B^{1/4}, & \text{if } \operatorname{dist}(x, \partial\Omega) \le 2 B^{-1/8}, \end{cases} \tag{4.7}$$
with $\tilde{k}(s) := k_{\max} - \epsilon_0 K(s)$, then
$$q_{BA}[u] \ge \int_\Omega U_B(x) |u(x)|^2\, dx, \tag{4.8}$$
for all $u \in W^{1,2}(\Omega)$ and all $B > B_0$.


For comparison we include the following result from [HePa, Prop. 3.7].

Theorem 4.5. Let $C_0 > 0$ be given and let $\Omega$ be a bounded simply-connected domain with smooth boundary. Then there exist $B_0, M > 0$ such that if $A = A_B$ is a family of vector fields satisfying (4.2) with the given $C_0$, and $U_B(x)$ is given by
$$U_B(x) = \begin{cases} \big(1 - M B^{-1/6}\big) B, & \text{if } \operatorname{dist}(x, \partial\Omega) > 2 B^{-1/6}, \\ \Theta_0 B - C_1 k(s) B^{1/2} - M B^{1/3}, & \text{if } \operatorname{dist}(x, \partial\Omega) \le 2 B^{-1/6}, \end{cases} \tag{4.9}$$
then
$$q_{BA}[u] \ge \int_\Omega U_B(x) |u(x)|^2\, dx, \tag{4.10}$$
for all $u \in W^{1,2}(\Omega)$ and all $B > B_0$.

Remark 4.6. The advantage of the result in Theorem 4.4 compared to Theorem 4.5 is the improved error estimate ($B^{1/4}$ compared to $B^{1/3}$). However, the disadvantage is that the curvature $k(s)$ has been replaced by the larger function $\tilde{k}(s)$.

In the proof of Theorem 4.4 we will use the following result from [BaPhTa].

Theorem 4.7. Let $\mu^{(1)}(h, b, D(0, R))$ be the ground state energy of the operator in (4.1) in the case where $B = b$, $A = F$ (i.e. $\operatorname{curl} BA = b$ is independent of $x \in \Omega$), $\Omega = D(0, R)$, the disc of radius $R$, and where $p_{BA} = -ih\nabla - BA$. Then there exists $C > 0$ such that, if $b R^2 / h \ge C$, then
$$\mu^{(1)}(h, b, D(0, R)) \ge \Theta_0 b h - C_1 b^{1/2} h^{3/2} / R - C h^2 R^{-2}. \tag{4.11}$$

Remark 4.8.
• Clearly, in the case of a disc, the curvature $k$ is constant, $k = R^{-1}$.
• The technical condition in Theorem 4.4, that $\Omega$ be bounded, is only imposed for the convenience of being able to apply Theorem 4.7. In the general case (for instance for exterior domains) one should instead do and improve an analysis as in [HeMo2, Sect. 10]. However that would carry us too far astray here, so we only state and prove Theorem 4.4 in the case of bounded $\Omega$.

Before giving the proof of Theorem 4.4 we collect the following important consequence.

Corollary 4.9. Let the assumptions be as in Theorem 4.4. Then
$$q_{BA}[u] \ge \|u\|^2 \Big( \Theta_0 B - C_1 k_{\max} B^{1/2} - O\big(B^{1/4}\big) \Big), \tag{4.12}$$
for all $u \in W^{1,2}(\Omega)$ and all $B$ sufficiently big. Here the $O$ only depends on $\Omega$ and on the constant $C_0$ in (4.2).

Proof. Corollary 4.9 clearly follows from Theorem 4.4 upon estimating $U_B(x) \ge \inf_{y \in \Omega} U_B(y)$.

Proof of Theorem 4.4. The proof is a bit long and technical, so we split it into a number of steps.


Reduction and Agmon estimate. Since $\Omega$ is bounded we have $k_{\max} > 0$. Outside a small neighborhood of the points $s$ with $k(s) = k_{\max}$ the result of Theorem 4.5 is stronger than that of Theorem 4.4. So we will assume below that $k(s) > 0$ for all $s$. Consider a slightly modified version of $U_B$,
$$\tilde{U}_B(x) = \begin{cases} \gamma B - M B^{1/4}, & \text{if } \operatorname{dist}(x, \partial\Omega) > 2 B^{-1/8}, \\ \Theta_0 B - C_1 \tilde{k}(s) B^{1/2} - M B^{1/4}, & \text{if } \operatorname{dist}(x, \partial\Omega) \le 2 B^{-1/8}. \end{cases} \tag{4.13}$$
We will prove Theorem 4.4 with $\tilde{U}_B$ instead of $U_B$. Clearly, by changing $\gamma$ a little to absorb the lower order term $M B^{1/4}$, Theorem 4.4 with $U_B$ follows. Let $u = u_{B,A}$ be a normalized ground state of $\mathcal{H}_{BA} - \tilde{U}_B$. The fact of using $\tilde{U}_B$ instead of $U_B$ makes $u$ independent of $M$. Notice, that (for fixed $B$, $M$) it is clear that the spectrum of $\mathcal{H}_{BA} - \tilde{U}_B$ consists of a sequence of eigenvalues whose only accumulation point is $+\infty$. Therefore such a ground state $u$ exists. Let $\lambda = \lambda_{B,A,M}$ be the associated ground state energy. Notice that since $A$ satisfies (4.2), we get, for all $v \in W_0^{1,2}(\Omega)$,
$$q_{BA}[v] \ge B \int_\Omega |v(x)|^2 \operatorname{curl} A\, dx \ge B \Big( 1 + O\big(B^{-1/4}\big) \Big) \|v\|^2. \tag{4.14}$$
Therefore, the ground state $u$ satisfies the Normal Agmon estimates, i.e. the conclusion of Theorem 4.1 (see [HeMo2, (6.25) and (6.26)]), with constants independent of $M$. In particular,
$$\int_\Omega t^N \Big( |u|^2 + B^{-1} |p_{BA} u|^2 \Big)\, dx \le C_N B^{-N/2}, \tag{4.15}$$
with constants $C_N$ independent of $M$. In the remainder of the proof all constants will be independent of $M$.

Localization. We can find a sequence $\{s_{j,B}\}_{j=0}^{N(B)}$ in $\mathbb{R}/|\partial\Omega|$ and a partition of unity $\{\tilde\chi_{j,B}\}_{j=0}^{N(B)}$ on $\mathbb{R}/|\partial\Omega|$ such that $\operatorname{supp} \tilde\chi_{j,B} \cap \operatorname{supp} \tilde\chi_{k,B} = \emptyset$ if $j \notin \{k-1, k, k+1\}$ (with the convention that $s_{N(B)+1,B} = s_{0,B}$, $s_{-1,B} = s_{N(B),B}$). Furthermore, we may impose the conditions, for some constant $C$ and for $B \ge B_0$ sufficiently big:
$$\operatorname{supp} \tilde\chi_{j,B} \subset s_{j,B} + \big[ -B^{-1/8}, B^{-1/8} \big], \qquad \sum_j \tilde\chi_{j,B}^2 = 1, \qquad \sum_j |\nabla \tilde\chi_{j,B}|^2 \le C B^{1/4}. \tag{4.16}$$
We will always choose the $s_{j,B}$ such that $|s_{j,B}| \le |\partial\Omega|/2$. Let $\chi_1, \chi_2$ be a standard partition of unity on $\mathbb{R}$:
$$\chi_1^2 + \chi_2^2 = 1, \qquad \operatorname{supp} \chi_1 \subset (-2, 2), \qquad \chi_1 = 1 \ \text{on a nbhd of } [-1, 1]. \tag{4.17}$$
Let us define
$$\chi_{j,B}(s, t) = \tilde\chi_{j,B}(s)\, \chi_1\big(B^{1/8} t\big), \qquad \theta_{j,B}(x) = \chi_j\big(B^{1/8} t(x)\big), \ \text{for } j = 1, 2. \tag{4.18}$$


We will also consider $\chi_{j,B}$ as a function on $\Omega$ (by passing to boundary coordinates) without changing the notation. We use the standard localization formula combined with (4.14) to estimate the energy of the part localized in the interior of $\Omega$, followed by the (weak) Normal Agmon estimates, (4.15), and get
$$\begin{aligned}
\lambda &\ge \sum_j \big\langle \chi_{j,B} u \,\big|\, (\mathcal{H} - \tilde{U}_B) \chi_{j,B} u \big\rangle + \int_\Omega \big( B \operatorname{curl} A - \tilde{U}_B \big) |\theta_{2,B} u|^2\, dx - C B^{1/4} \int_{\{B^{-1/8} \le t(x) \le 2 B^{-1/8}\}} |u|^2\, dx \\
&\ge \sum_j \big\langle \chi_{j,B} u \,\big|\, (\mathcal{H} - \tilde{U}_B) \chi_{j,B} u \big\rangle + O\big(B^{-\infty}\big).
\end{aligned}$$
Modulo choosing $M$ sufficiently large, it therefore suffices to prove that
$$\sum_j \Big\langle \chi_{j,B} u \,\Big|\, \big( \mathcal{H} - \Theta_0 B + C_1 \tilde{k}(s) B^{1/2} \big) \chi_{j,B} u \Big\rangle \ge -C B^{1/4} \sum_j \int |\chi_{j,B} u|^2\, dx. \tag{4.19}$$
We will write each of the terms in $\langle \cdot \,|\, \cdot \rangle$ in boundary coordinates and compare with the similar term with fixed curvature. Notice that (4.19) is an estimate on sums. The next step will be to reduce the proof of (4.19) to an estimate on individual terms.

Local spectral estimate is sufficient. Let $e$ be the local energy density, written in boundary coordinates $(s, t)$. The expression for $e$ is given explicitly in (4.25) below. We will prove the following estimate on each of the terms in the sum in (4.19):
$$\big\langle \chi_{j,B} u \,\big|\, \mathcal{H} \chi_{j,B} u \big\rangle \ge \Big( \Theta_0 B - C_1 \tilde{k}(s) B^{1/2} - C B^{1/4} \Big) \|\chi_{j,B} u\|^2 - C B^{1/4} \int t^2 e[\chi_{j,B} u](s, t)\, ds\, dt - C B^{9/4} \int t^4 |\chi_{j,B} u|^2\, ds\, dt. \tag{4.20}$$
Before proving (4.20) let us prove that (4.20) implies (4.19) and therefore the theorem. Taking the sum over $j$ in (4.20) we get
$$\sum_j \big\langle \chi_{j,B} u \,\big|\, \mathcal{H} \chi_{j,B} u \big\rangle \ge \Big( \Theta_0 B - C_1 \tilde{k}(s) B^{1/2} - C B^{1/4} \Big) \|\theta_{1,B} u\|^2 - C B^{1/4} \sum_j \int t^2 e[\chi_{j,B} u](s, t)\, ds\, dt - C B^{9/4} \int t^4 |\theta_{1,B} u|^2\, ds\, dt. \tag{4.21}$$
We will use the (weak) normal Agmon estimates, (4.15), to see that the last two terms in (4.21) can be included in the $C B^{1/4}$-term, yielding (4.19). For the last term, this is easily done. Since $\|u\| = 1 = \|\theta_{1,B} u\| + O(B^{-\infty})$, we can estimate
$$B^{9/4} \int t^4 |\theta_{1,B} u|^2\, ds\, dt \le C B^{9/4} \int_\Omega t^4(x) |u(x)|^2\, dx \le C' B^{1/4} \le C'' B^{1/4} \|\theta_{1,B} u\|^2.$$


The second term in (4.21) is only slightly more complicated. We estimate by the Euclidean integral
$$B^{1/4} \sum_j \int t^2 e[\chi_{j,B} u](s, t)\, ds\, dt \le C B^{1/4} \sum_j \int_\Omega t^2(x) |p_{BA} \chi_{j,B} u|^2\, dx.$$
Now, $|p_{BA} \chi_{j,B} u|^2 \le 2 \chi_{j,B}^2 |p_{BA} u|^2 + 2 |\nabla \chi_{j,B}|^2 |u|^2$, and (4.16) together with the Agmon estimates, therefore imply that
$$B^{1/4} \sum_j \int t^2 e[\chi_{j,B} u](s, t)\, ds\, dt \le C B^{1/4} \int_\Omega t^2(x) \Big( |p_{BA} u(x)|^2 + B^{1/4} |u(x)|^2 \Big)\, dx \le C' B^{1/4} \le 2 C' B^{1/4} \|\theta_{1,B} u\|^2. \tag{4.22}$$
Thus, (4.20) implies (4.19), so we only need to prove (4.20) in order to finish the proof of the theorem.

Local spectral estimate. We will now prove the local estimate (4.20). The proof of this estimate goes by comparison with the constant curvature case, Theorem 4.7. Using Lemma B.1 we may assume that the gauge, on each of the sets $\operatorname{supp} \chi_{j,B}$, is chosen such that the expression for $A$ in boundary coordinates is as in (B.8). Let $\tilde{A}_1$ be the non-zero entry on the right hand side of (B.8), $\tilde{A}_1(s, t) = -t \big( 1 - t k(s)/2 \big) + t^2 b(s, t)$. We know from (4.2) and Lemma B.1 that
$$\|b\|_{L^\infty_{loc}} \le C B^{-1/4}. \tag{4.23}$$
Define
$$B_{j,B} := \int e[\chi_{j,B} u](s, t)\, ds\, dt = \big\langle \chi_{j,B} u \,\big|\, \mathcal{H} \chi_{j,B} u \big\rangle, \tag{4.24}$$
with $a(s, t) = 1 - t k(s)$,
$$e[f] := a^{-1} \big| (D_s - B \tilde{A}_1) f \big|^2 + a |D_t f|^2. \tag{4.25}$$
Similarly, we define quantities with fixed curvature $k_{j,B}$,
$$k_{j,B} = k(s_{j,B}), \quad k'_{j,B} = k'(s_{j,B}), \quad a_{j,B} = 1 - t k_{j,B}, \quad \tilde{A}_{1,j,B}(s, t) = -t \Big( 1 - \frac{t\, k_{j,B}}{2} \Big),$$
and
$$A_{j,B} := \int e_{j,B}[\chi_{j,B} u](s, t)\, ds\, dt, \tag{4.26}$$


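with
$$e_{j,B}[f] := a_{j,B}^{-1} \big| (D_s - B \tilde{A}_{1,j,B}) f \big|^2 + a_{j,B} |D_t f|^2. \tag{4.27}$$
We observe that $B_{j,B} = \langle \chi_{j,B} u \,|\, \mathcal{H} \chi_{j,B} u \rangle$ and that $A_{j,B}$ is a similar expression but for a domain with constant curvature $k_{j,B}$, i.e. a disc. We first compare $B_{j,B}$ and $A_{j,B}$, essentially by Taylor expanding. The result of this comparison is (4.34). We clearly have
$$e[\chi_{j,B} u](s, t) = e_{j,B}[\chi_{j,B} u](s, t) + f_1(s, t) + f_2(s, t) - f_3(s, t), \tag{4.28}$$
with
$$\begin{aligned}
f_1 &= \big( a^{-1} - a_{j,B}^{-1} \big) \big| (D_s - B \tilde{A}_1) \chi_{j,B} u \big|^2 + (a - a_{j,B}) |D_t (\chi_{j,B} u)|^2, \\
f_2 &= a_{j,B}^{-1} B^2 \big( \tilde{A}_1 - \tilde{A}_{1,j,B} \big)^2 |\chi_{j,B} u|^2, \\
f_3 &= 2 a_{j,B}^{-1} B\, \Re \Big\{ \big( \tilde{A}_1 - \tilde{A}_{1,j,B} \big) \overline{\chi_{j,B} u}\, (D_s - B \tilde{A}_{1,j,B}) \chi_{j,B} u \Big\}.
\end{aligned}$$
Notice that for $s \in s_{j,B} + \big[ -B^{-1/8}, B^{-1/8} \big]$, we have,
$$|k(s) - k_{j,B}| = |s - s_{j,B}| \cdot \Big| \int_0^1 k'\big( (1 - \ell) s_{j,B} + \ell s \big)\, d\ell \Big| \le C B^{-1/8} \big( |k'_{j,B}| + B^{-1/8} \big). \tag{4.29}$$
Thus,
$$|a - a_{j,B}| = t\, |k(s) - k_{j,B}| \le C B^{-1/8} \big( |k'_{j,B}| + B^{-1/8} \big) t, \qquad \big| a^{-1} - a_{j,B}^{-1} \big| \le C B^{-1/8} \big( |k'_{j,B}| + B^{-1/8} \big) t, \quad \text{for } t < 2 B^{-1/8}. \tag{4.30}$$
We estimate, using (4.30), for any $\eta > 0$,
$$|f_1(s, t)| \le C B^{-1/8} \big( |k'_{j,B}| + B^{-1/8} \big) t\, e[\chi_{j,B} u](s, t) \le C \eta B^{-1/2} \big( |k'_{j,B}| + B^{-1/8} \big)^2 e[\chi_{j,B} u](s, t) + C \eta^{-1} B^{1/4} t^2 e[\chi_{j,B} u](s, t). \tag{4.31}$$
We also estimate $f_2$ and $f_3$ by
$$f_2(s, t) \le C B^{7/4} t^4 |\chi_{j,B} u|^2, \tag{4.32}$$
and
$$\begin{aligned}
|f_3(s, t)| &\le 2 C t^2 B^{7/8} \big( |k'_{j,B}| + B^{-1/8} \big) a_{j,B}^{-1} |\chi_{j,B} u| \big| (D_s - B \tilde{A}_{1,j,B}) \chi_{j,B} u \big| \\
&\le C \eta^{-1} B^{9/4} t^4 |\chi_{j,B} u|^2 + C \eta B^{-1/2} \big( |k'_{j,B}| + B^{-1/8} \big)^2 e_{j,B}[\chi_{j,B} u](s, t).
\end{aligned} \tag{4.33}$$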


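Thus, we get by combining (4.28) with (4.31), (4.32), and (4.33) and integrating,
$$\Big( 1 + C \eta B^{-1/2} \big( |k'_{j,B}| + B^{-1/8} \big)^2 \Big) B_{j,B} \ge \Big( 1 - C \eta B^{-1/2} \big( |k'_{j,B}| + B^{-1/8} \big)^2 \Big) A_{j,B} - C \eta^{-1} B^{1/4} \int t^2 e[\chi_{j,B} u](s, t)\, ds\, dt - C B^{9/4} \big( B^{-1/2} + \eta^{-1} \big) \int t^4 |\chi_{j,B} u|^2\, ds\, dt.$$
This gives (with new constants $C$)
$$B_{j,B} \ge \Big( 1 - C \eta B^{-1/2} \big( |k'_{j,B}|^2 + B^{-1/4} \big) \Big) A_{j,B} - C \eta^{-1} B^{1/4} \int t^2 e[\chi_{j,B} u](s, t)\, ds\, dt - C B^{9/4} \big( B^{-1/2} + \eta^{-1} \big) \int t^4 |\chi_{j,B} u|^2\, ds\, dt. \tag{4.34}$$
From Theorem 4.7 we get the estimate
$$A_{j,B} \ge \Big( \Theta_0 B - C_1 k_{j,B} B^{1/2} - C \Big) \|\chi_{j,B} u\|^2. \tag{4.35}$$
Therefore, with $K(s) := k_{\max} - k(s)$, $K_{j,B} := k_{\max} - k_{j,B}$, and making a Taylor expansion of $k(s) - k_{j,B}$, we get
$$\begin{aligned}
& \Big( 1 - C \eta B^{-1/2} \big( |k'_{j,B}|^2 + B^{-1/4} \big) \Big) A_{j,B} - \Big( \Theta_0 B - C_1 \{ k_{\max} - \epsilon_0 K(s) \} B^{1/2} \Big) \|\chi_{j,B} u\|^2 \\
& \quad \ge \Big\{ C_1 \Big[ K_{j,B} - \epsilon_0 \big( K_{j,B} + C |k'_{j,B}| B^{-1/8} \big) \Big] - C \eta \Theta_0 |k'_{j,B}|^2 \Big\} B^{1/2} \|\chi_{j,B} u\|^2 + O\big(B^{1/4}\big) \|\chi_{j,B} u\|^2.
\end{aligned} \tag{4.36}$$
By definition the function $K(s)$, which can also be identified to a periodic function on $\mathbb{R}$, satisfies, for some $C > 0$,
$$K(s) \ge 0, \qquad K''(s) \le C. \tag{4.37}$$
Using (4.37) we find, for all $s, \sigma \in \mathbb{R}$,
$$0 \le K(\sigma) \le K(s) + K'(s)(\sigma - s) + C (\sigma - s)^2.$$
Upon setting $\sigma = s - \frac{K'(s)}{2C}$ and using $K'(s) = -k'(s)$ we get the inequality
$$|k'(s)|^2 \le C' K(s), \tag{4.38}$$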


with $C' = 4C$, valid for all $s \in \mathbb{R}$. Applying (4.38) to (4.36) we see that for $\epsilon_0, \eta$ sufficiently small we have
$$\Big( 1 - C \eta B^{-1/2} \big( |k'_{j,B}|^2 + B^{-1/4} \big) \Big) A_{j,B} - \Big( \Theta_0 B - C_1 \{ k_{\max} - \epsilon_0 K(s) \} B^{1/2} \Big) \|\chi_{j,B} u\|^2 \ge -C'' B^{1/4} \|\chi_{j,B} u\|^2, \tag{4.39}$$
for some $C'' > 0$. Combining (4.39) and (4.34), for the given choice of $\eta$, yields (4.21). Since (4.21) implies the theorem we have finished the proof.

4.3. Agmon estimates in the tangential direction. Theorem 4.4 can be used to obtain exponential localization estimates in the tangential ($s$-)variable both for the linear and non-linear problems. These estimates are similar to Theorems 4.1 and 4.2 and are given in Theorems 4.10 and 1.3.

Theorem 4.10 (Uniform tangential Agmon estimates). Let $C_0 > 0$ be given and let $\Omega$ be a bounded domain with smooth boundary. Then there exist $C, \alpha > 0$, such that if $(A = A_B)_{B \in [C, +\infty)}$ is a family of vector fields satisfying (4.2) (with the given $C_0$) and $(u_B)_{B \in [C, \infty[}$ is a family of normalized eigenfunctions of $\mathcal{H}_{BA}$ with corresponding eigenvalue $\lambda(B)$ satisfying the bound
$$\lambda = \lambda_{BA} \le \Theta_0 B - C_1 k_{\max} B^{1/2} + C_0 B^{1/4}, \quad \forall B \ge C, \tag{4.40}$$
and if $\chi_1 \in C_0^\infty$ is the function from (4.17), then, for all $B \ge C$,
$$\int_\Omega e^{2 \alpha B^{1/4} K(s)} \chi_1^2\big( B^{1/8} t(x) \big) \Big( |u_B(x)|^2 + B^{-1} \big| (-i\nabla - B A(x)) u_B(x) \big|^2 \Big)\, dx \le C. \tag{4.41}$$

Proof. The proof of Theorem 4.10 is similar to (but easier than) the proof of Theorem 1.3, given in Sect. 5 below, and will therefore be omitted.

Corollary 4.11. With the assumptions of the theorem and Assumption 1.8, for any $p \ge 2$, there exists a constant $C_p$ such that
$$\|\psi\|_{L^2(\Omega)} \le C_p\, \kappa^{-\frac{5(p-2)}{8p}}\, \|\psi\|_{L^p(\Omega)}. \tag{4.42}$$
This can also be extended without additional difficulties to the case when $K$ has isolated zeros of finite order.

4.4. An alternative approach to $\lambda_1(BA)$. In the case where $\lambda_1(B)$ is known to very high precision, it is advantageous to estimate $\lambda_1(BA)$ by first approximating by constant field and then using the knowledge of $\lambda_1(B)$. This is the result in Theorems 4.12 and 4.13 below. Remember that $\lambda_1(BA)$ is the lowest eigenvalue (bottom of the spectrum) of $\mathcal{H}_{BA}$ and that $\lambda_1(B) = \lambda_1(BF)$.


Theorem 4.12. Let $C_0 > 0$ be given. Suppose that $\Omega$ is a smooth, bounded, simply-connected domain in $\mathbb{R}^2$ and that $\Omega$ is not a disc. Then there exist $B_0, \epsilon_0, C > 0$ such that, for all $B \ge B_0$ and if $A$ satisfies (4.2) with the given $C_0$, then
$$\lambda_1(BA) \ge \lambda_1(B) - C \Big( \|\operatorname{curl} A - 1\|_{C^1(\Omega)} \sqrt{B} + e^{-\epsilon_0 \sqrt[4]{B}} \Big). \tag{4.43}$$
If $\Omega$ is a disc, then (4.43) is replaced by
$$\lambda_1(BA) \ge \lambda_1(B) - C \Big( \|\operatorname{curl} A - 1\|_{C^1(\Omega)} \sqrt{B} + 1 \Big). \tag{4.44}$$

Proof. We consider first the case where $\Omega$ is not a disc. Let $\phi_{BA}$ be a normalized ground state of $\mathcal{H}_{BA}$. Since, by Corollary 4.9, $\lambda_1(B) = \Theta_0 B - C_1 k_{\max} B^{1/2} + O(B^{1/4})$, we may assume that
$$\lambda_1(BA) \le \Theta_0 B - C_1 k_{\max} B^{1/2} + C_0 B^{1/4},$$
(if not, there is nothing to prove). Then the normal Agmon estimates (given in Theorem 4.1) and the tangential Agmon estimates (given in Theorem 4.10) give exponential localization estimates on $\phi_{BA}$. Since $\Omega$ is not a disc, there exists a $\sigma_0 \in \mathbb{R}/|\partial\Omega|$ such that
$$k(\sigma_0) = k_{\min} = \min_{s \in \mathbb{R}/|\partial\Omega|} k(s) \ne k_{\max}.$$
Choose $\epsilon > 0$ such that $k(s) \le \frac{k_{\min} + k_{\max}}{2}$ on $|s - \sigma_0| \le \epsilon$. Let $f_1^2 + f_2^2 = 1$ be a partition of unity on $\Omega$ such that
$$f_1 = 1 \ \text{on } \{t \le t_0/2\} \cap \{|s - \sigma_0| \ge \epsilon\}, \qquad \operatorname{supp} f_1 \subseteq \{t \le t_0\} \cap \{|s - \sigma_0| \ge \epsilon/2\}.$$
By the standard localization formula,
$$\lambda_1(BA) = q_{BA}[\phi_{BA}] = q_{BA}[f_1 \phi_{BA}] + q_{BA}[f_2 \phi_{BA}] - \int_\Omega \big( |\nabla f_1|^2 + |\nabla f_2|^2 \big) |\phi_{BA}|^2\, dx.$$
Using the Agmon estimates, we therefore find for some $C, \epsilon_0$,
$$\lambda_1(BA) \ge q_{BA}[f_1 \phi_{BA}] - C e^{-\epsilon_0 \sqrt[4]{B}}. \tag{4.45}$$

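Using Lemma B.1 (in the situation given by (B.8)) we know that we can choose a gauge $\varphi$ on $\operatorname{supp} f_1$ such that
$$\tilde{A} - \tilde{F} - \nabla\varphi = t^2 \beta(s, t),$$
where $\tilde{A}$, $\tilde{F}$ are $A$ and $F$ transformed to boundary coordinates and where $\|\beta(s, t)\|_{L^\infty}$ is controlled by $\|\operatorname{curl} A - 1\|_{C^1}$. Therefore, we can estimate, for all $\rho > 0$,
$$q_{BA}[f_1 \phi_{BA}] \ge (1 - \rho)\, q_{BF}\big[ f_1 e^{iB\varphi} \phi_{BA} \big] - \rho^{-1} B^2 \int \big| t^2 \beta(s, t) f_1 \phi_{BA} \big|^2\, dx. \tag{4.46}$$
Using the normal Agmon estimates, we therefore get from (4.45) and (4.46)
$$\lambda_1(BA) \ge (1 - \rho) \lambda_1(B) \|f_1 \phi_{BA}\|^2 - C \rho^{-1} B\, \|\operatorname{curl} A - 1\|^2_{C^1(\Omega)} - C e^{-\epsilon_0 \sqrt[4]{B}}. \tag{4.47}$$
We finish the proof of (4.43) by applying the normal and tangential Agmon estimates again (to remove the localization $f_1$) and by choosing
$$\rho = \|\operatorname{curl} A - 1\|_{C^1} / \sqrt{B}$$
(using that $\lambda_1(B) \le B$ for large $B$).

When $\Omega$ is a disc, we make the partition of unity $f_1, f_2$ as follows.
$$f_1 = 1 \ \text{on } \{t \le t_0/2\} \cap \Big\{ |s| \le \frac{|\partial\Omega|}{4} \Big\}, \qquad \operatorname{supp} f_1 \subseteq \{t \le t_0\} \cap \Big\{ |s| \le \frac{3 |\partial\Omega|}{4} \Big\}.$$
The localization error $\int \big( |\nabla f_1|^2 + |\nabla f_2|^2 \big) |\phi_{BA}|^2\, dx$ then becomes of unit size, and we get
$$\lambda_1(BA) \ge q_{BA}[f_1 \phi_{BA}] + q_{BA}[f_2 \phi_{BA}] + O(1). \tag{4.48}$$
Both $q_{BA}[f_1 \phi_{BA}]$ and $q_{BA}[f_2 \phi_{BA}]$ are now estimated as above. This finishes the proof.

When $\|\operatorname{curl} A - 1\|_{C^1(\Omega)}$ is very small, the exponential error in Theorem 4.12 is too expensive, therefore we need the following simpler result:

Theorem 4.13. Suppose $\Omega$ is a smooth, bounded, simply-connected domain in $\mathbb{R}^2$. Then there exist $B_0 > 0$ and, for all $p > 2$, $C_p > 0$ such that
$$\lambda_1(BA) \ge \lambda_1(B) - C_p B^{3/2} \|\operatorname{curl} A - 1\|_{L^p(\Omega)}, \tag{4.49}$$
when $B \ge B_0$.

Proof. To prove the estimate (4.49), we write $b = \operatorname{curl} A - 1 \in C_0^1(\Omega)$ and define
$$\Gamma(x) = \frac{1}{2\pi} \int_\Omega (\log |x - y|)\, b(y)\, dy, \qquad a = (-\partial_2 \Gamma, \partial_1 \Gamma).$$
Then $\operatorname{curl} a = \Delta \Gamma = b$ in $\Omega$, and for all $p > 2$,
$$\|a\|_{L^\infty} \le C_p \|b\|_{L^p(\Omega)}. \tag{4.50}$$
Since $\Omega$ is simply connected there exists a choice of gauge $\varphi$ such that $A - F - \nabla\varphi = a$. With $\bar\phi_B = e^{-iB\varphi} \phi_{BA}$, we get for all $0 < \rho$,
$$\lambda_1(BA) = \int_\Omega \big| \big( -i\nabla - B(A - \nabla\varphi) \big) \bar\phi_B \big|^2\, dx \ge (1 - \rho) \int_\Omega \big| (-i\nabla - BF) \bar\phi_B \big|^2\, dx - \rho^{-1} B^2 \int_\Omega |a \bar\phi_B|^2\, dx.$$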

This implies (notice that (4.51) is trivial for $\rho > 1$) by (4.50), for all $\rho > 0$,
$$\lambda_1(BA) \ge (1 - \rho) \lambda_1(B) - C \rho^{-1} B^2 \|b\|^2_{L^p(\Omega)}. \tag{4.51}$$
By choosing $\rho = \|b\|_{L^p(\Omega)} \sqrt{B}$, we get the desired estimate (using that $\lambda_1(B) \le B$ for large $B$).

5. Proofs of Theorem 1.1 and Theorem 1.3

Let us start by recalling the following result from [HePa, Prop. 4.2].

Theorem 5.1. Suppose $(\psi, A) = (\psi_{\kappa,H}, A_{\kappa,H})$ is a sequence of minimizers for the GL-functional. Suppose that the parameters $\kappa, H$ satisfy,
$$\big( \Theta_0^{-1} + g(\kappa) \big) \kappa \le H < H_{C_3}(\kappa), \tag{5.1}$$
for some function $g : \mathbb{R} \to \mathbb{R}$ with $\lim_{\kappa \to \infty} g(\kappa) = 0$. Then there exists a constant $C > 0$, such that for all $\kappa > C$, we have
$$\|\operatorname{curl} A - 1\|_{C^1(\Omega)} \le \frac{C}{\sqrt{\kappa H}} \|\psi\|^2_{L^\infty(\Omega)}, \tag{5.2}$$
$$\|\operatorname{curl} A - 1\|_{C^2(\Omega)} \le C \|\psi\|^2_{L^\infty(\Omega)}. \tag{5.3}$$
In particular, using (1.4),
$$\|\operatorname{curl} A - 1\|_{C^1(\Omega)} \le C (\kappa H)^{-1/2}, \qquad \|\operatorname{curl} A - 1\|_{C^2(\Omega)} \le C. \tag{5.4}$$
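The choice of $\rho$ at the end of the proof of Theorem 4.13 balances the two error terms in (4.51). The sketch below evaluates the total loss $\rho\,\lambda_1(B) + C\rho^{-1}B^2\|b\|^2_{L^p}$ at $\rho = \|b\|_{L^p}\sqrt{B}$ and compares it with $(1 + C)B^{3/2}\|b\|_{L^p}$, the size claimed in (4.49). The constant `C = 1.0` and the sample numbers are hypothetical, chosen only for illustration.

```python
import math


def loss(rho, lam, B, b_norm, C=1.0):
    """Amount subtracted from lam in (4.51): rho*lam + C*B^2*b_norm^2/rho."""
    return rho * lam + C * B**2 * b_norm**2 / rho


def paper_rho(B, b_norm):
    """The choice rho = ||b||_{L^p} * sqrt(B) made in the proof."""
    return b_norm * math.sqrt(B)


if __name__ == "__main__":
    B, b_norm, C = 1.0e4, 1.0e-3, 1.0
    lam = 0.59 * B               # hypothetical eigenvalue with lam <= B
    rho = paper_rho(B, b_norm)
    print(rho, loss(rho, lam, B, b_norm, C), (1.0 + C) * b_norm * B**1.5)
```

With this $\rho$ both terms are of order $B^{3/2}\|b\|_{L^p}$ as soon as $\lambda_1(B) \le B$, which is exactly the remark made in the proof.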

Remark 5.2. In particular, Theorem 5.1 implies that the minimizing vector potential $A_{\kappa,H}$ satisfies (4.2) for $B = \kappa H$. Thus, we can apply the results from Subsect. 4.2 on the magnetic quadratic form $q_{\kappa H A_{\kappa,H}}$ appearing in the Ginzburg-Landau functional.

Proof of Theorem 1.1. Remember that $\mathcal{E}_{\kappa,H}[0, F] = 0$, so minimizers always have non-positive energy. We consider first the case of the disc. Using Theorem 2.5, we see that there exists $C > 0$, such that if $\kappa > C$, and
$$H < \frac{\kappa}{\Theta_0} + \frac{C_1}{\Theta_0^{3/2}} k_{\max} - C/\kappa,$$
then $\lambda_1(\kappa H) < \kappa^2$. Therefore, Theorem 1.6 implies that
$$\underline{H}_{C_3}(\kappa) \ge \underline{H}^{loc}_{C_3}(\kappa) > \frac{\kappa}{\Theta_0} + \frac{C_1}{\Theta_0^{3/2}} k_{\max} - C/\kappa. \tag{5.5}$$
On the other hand, Theorem 4.12, combined with Theorems 2.5 and 5.1, implies the existence of $C' > 0$, such that if $\kappa > C'$, $H$ satisfies
$$\overline{H}_{C_3}(\kappa) \ge H > \frac{\kappa}{\Theta_0} + \frac{C_1}{\Theta_0^{3/2}} k_{\max} + C'/\kappa,$$


and $(\psi, A)$ is a Ginzburg-Landau minimizer with parameters $(\kappa, H)$, then necessarily $\lambda_1(\kappa H A) > \kappa^2$. Since $(\psi, A)$ is a minimizer we therefore have
$$0 \ge \mathcal{E}_{\kappa,H}[\psi, A] \ge \big( \lambda_1(\kappa H A) - \kappa^2 \big) \|\psi\|_2^2 + \frac{\kappa^2}{2} \|\psi\|_4^4 + (\kappa H)^2 \|\operatorname{curl} A - 1\|_2^2.$$
Since all terms on the right hand side are non-negative we can therefore conclude that $(\psi, A) = (0, F)$. This finishes the proof in the case of the disc.

Consider now the case of general $\Omega$. The lower bound in Corollary 4.9 combined with the matching upper bound from [HeMo2] gives that
$$\lambda_1(\kappa H) = \Theta_0 \kappa H - C_1 k_{\max} (\kappa H)^{1/2} + O\big( (\kappa H)^{1/4} \big).$$
From this it is an elementary calculation to prove that there exists $C > 0$ such that
$$H \le \frac{\kappa}{\Theta_0} + \frac{C_1}{\Theta_0^{3/2}} k_{\max} - C \kappa^{-1/2} \quad \Rightarrow \quad \lambda_1(\kappa H) < \kappa^2 - \sqrt{\kappa},$$
for all $\kappa > C$. Therefore, Theorem 1.6 implies that
$$\overline{H}_{C_3} \ge \underline{H}_{C_3} \ge \underline{H}^{loc}_{C_3} \ge \frac{\kappa}{\Theta_0} + \frac{C_1}{\Theta_0^{3/2}} k_{\max} - C \kappa^{-1/2}. \tag{5.6}$$
To prove the opposite inequality, let $(\kappa, H)_n$ be a sequence of parameters (we will suppress the $n$ from the notation for convenience) such that
• $H \le \overline{H}_{C_3}(\kappa)$, when $(\kappa, H)$ is in the sequence.
• there exists a non-trivial GL-minimizer $(\psi, A)$.
• $H \ge \frac{\kappa}{\Theta_0} + \frac{C_1}{\Theta_0^{3/2}} k_{\max} + C \kappa^{-1/2}$, where $C$ is a (big) constant to be chosen below.

Then Theorem 5.1 and Corollary 4.9 imply that, if the constant $C$ is chosen sufficiently big,
$$\lambda_1(\kappa H A) \ge \kappa^2 + \sqrt{\kappa}.$$
But, by the same argument as for the disc, this contradicts the non-triviality of the GL-minimizers and therefore proves that
$$\overline{H}_{C_3} \le \frac{\kappa}{\Theta_0} + \frac{C_1}{\Theta_0^{3/2}} k_{\max} + C \kappa^{-1/2}. \tag{5.7}$$

We will now estimate the tangential size of the superconducting boundary layer when H is very close to—but below—HC3 .


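Proof of Theorem 1.3. The proof is a variant of the proof of a similar result obtained in [HePa]. Since $0 \le H_{C_3}(\kappa) - H = \rho$, we get from Theorem 1.1 that
$$H = \frac{\kappa}{\Theta_0} + \frac{C_1}{\Theta_0^{3/2}} k_{\max} + O\big( \kappa^{-1/2} + \rho \big).$$
But then an elementary calculation gives that
$$\kappa^2 - \Big( \Theta_0 \kappa H - C_1 k_{\max} (\kappa H)^{1/2} - O\big( (\kappa H)^{1/4} \big) \Big) = O(\kappa\rho) + O\big( \sqrt{\kappa} \big). \tag{5.8}$$
With $\chi_1, \chi_2$ being a standard partition of unity and using Eq. (1.3) satisfied by the GL-minimizer $(\psi, A)$, we can calculate
$$\begin{aligned}
\kappa^2 \Big\| \chi_1\big( \kappa^{1/4} t \big) \exp\big( \alpha \kappa^{1/2} K(s) \big) \psi \Big\|^2
&\ge \Big\langle \chi_1^2\big( \kappa^{1/4} t \big) \exp\big( 2\alpha \kappa^{1/2} K(s) \big) \psi \,\Big|\, \mathcal{H}_{\kappa H A} \psi \Big\rangle \\
&= q_{\kappa H A}\Big[ \chi_1\big( \kappa^{1/4} t \big) \exp\big( \alpha \kappa^{1/2} K(s) \big) \psi \Big] - \int \Big| \nabla \Big( \chi_1\big( \kappa^{1/4} t \big) \exp\big( \alpha \kappa^{1/2} K(s) \big) \Big) \Big|^2 |\psi|^2\, dx.
\end{aligned} \tag{5.9}$$
We can estimate the localization error as
$$\int \Big| \nabla \Big( \chi_1\big( \kappa^{1/4} t \big) \exp\big( \alpha \kappa^{1/2} K(s) \big) \Big) \Big|^2 |\psi|^2\, dx \le L_1 + L_2, \tag{5.10}$$
where
$$L_1 := C \alpha^2 \kappa \int |K'(s)|^2 \chi_1^2\big( \kappa^{1/4} t \big) \exp\big( 2\alpha \kappa^{1/2} K(s) \big) |\psi|^2\, dx, \qquad L_2 := C \kappa^{1/2} \int \big| \chi_1'\big( \kappa^{1/4} t \big) \big|^2 \exp\big( 2\alpha \kappa^{1/2} K(s) \big) |\psi|^2\, dx.$$
Using the Agmon estimates in the normal direction, Theorem 4.2, we get
$$L_2 = \|\psi\|^2\, O(\kappa^{-\infty}). \tag{5.11}$$
Now, using Theorems 4.4 and 5.1, we have
$$q_{\kappa H A}\Big[ \chi_1\big( \kappa^{1/4} t \big) \exp\big( \alpha \kappa^{1/2} K(s) \big) \psi \Big] \ge \int \Big( \Theta_0 \kappa H - C_1 k_{\max} (\kappa H)^{1/2} + \epsilon_0 K(s) (\kappa H)^{1/2} - O\big( (\kappa H)^{1/4} \big) \Big) \Big| \chi_1\big( \kappa^{1/4} t \big) \exp\big( \alpha \kappa^{1/2} K(s) \big) \psi \Big|^2\, dx. \tag{5.12}$$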


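Using (5.8), (5.9), and (5.12), we therefore get, remembering that by (4.38), we have $|K'(s)|^2 \le C K(s)$,
$$\begin{aligned}
L_2 &\ge \int \Big( \epsilon_0 K(s) (\kappa H)^{1/2} - C \alpha^2 \kappa |K'(s)|^2 + O\big( \sqrt{\kappa} \big) + O(\kappa\rho) \Big) \Big| \chi_1\big( \kappa^{1/4} t \big) \exp\big( \alpha \kappa^{1/2} K(s) \big) \psi \Big|^2\, dx \\
&\ge \int \Big( \epsilon_0 \big( 1 - \alpha^2 C \big) K(s) (\kappa H)^{1/2} + O\big( \sqrt{\kappa} \big) + O(\kappa\rho) \Big) \Big| \chi_1\big( \kappa^{1/4} t \big) \exp\big( \alpha \kappa^{1/2} K(s) \big) \psi \Big|^2\, dx.
\end{aligned} \tag{5.13}$$
We split the integral on the right hand side in (5.13) as
$$\int = \int_{\{K(s) \ge S(\rho + \kappa^{-1/2})\}} + \int_{\{K(s) < S(\rho + \kappa^{-1/2})\}} =: I_1 + I_2,$$
for some $S > 0$. We will choose $S$ sufficiently large below. By definition of $I_2$, we get
$$|I_2| \le C \big( \kappa\rho + \sqrt{\kappa} \big) e^{2\alpha S (1 + \sqrt{\kappa}\rho)} \|\psi\|^2. \tag{5.14}$$
If $\alpha$ is sufficiently small and $S$ is sufficiently big, we have
$$I_1 \ge \big( \sqrt{\kappa} + \kappa\rho \big) \int_{\{K(s) \ge S(\rho + \kappa^{-1/2})\}} \chi_1^2\big( \kappa^{1/4} t \big) \exp\big( 2\alpha \kappa^{1/2} K(s) \big) |\psi(x)|^2\, dx. \tag{5.15}$$
Combining the estimates (5.13), (5.14), (5.15), and (5.11) we find (for $\alpha$ sufficiently small and $S$ sufficiently big)
$$\int_{\{K(s) \ge S(\rho + \kappa^{-1/2})\}} \chi_1^2\big( \kappa^{1/4} t \big) \exp\big( 2\alpha \kappa^{1/2} K(s) \big) |\psi(x)|^2\, dx \le C e^{C \sqrt{\kappa}\rho} \|\psi\|^2. \tag{5.16}$$
Evidently,
$$\int_{\{K(s) \le S(\rho + \kappa^{-1/2})\}} \chi_1^2\big( \kappa^{1/4} t \big) \exp\big( 2\alpha \kappa^{1/2} K(s) \big) |\psi(x)|^2\, dx \le C e^{C \sqrt{\kappa}\rho} \|\psi\|^2. \tag{5.17}$$
Combining (5.16) and (5.17) finishes the proof of the theorem.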


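6. Local Equals Global for All Domains

In this section we will prove Theorem 1.7 stating that the local and global fields are equal, with no extra hypothesis on the domain $\Omega$. The basic result is the following lemma.

Lemma 6.1. Let $g : \mathbb{R}_+ \to \mathbb{R}_+$ satisfy
$$\frac{g(\kappa)}{\kappa} \to 0,$$
as $\kappa \to \infty$. Then there exists a constant $\kappa_0 > 0$ such that if $H(\kappa)$ is such that $\overline{H}_{C_3}(\kappa) - H(\kappa) \le g(\kappa)$, and $(\psi, A)_{\kappa, H(\kappa)}$ is a nontrivial minimizer of $\mathcal{E}_{\kappa, H(\kappa)}$ with $\kappa \ge \kappa_0$, then
$$\kappa^2 - \lambda_1(\kappa H) > 0. \tag{6.1}$$

Proof. By definition of $\overline{H}_{C_3}(\kappa)$, nontrivial minimizers only exist below $\overline{H}_{C_3}(\kappa)$, so we may assume that $\overline{H}_{C_3}(\kappa) - g(\kappa) \le H(\kappa) \le \overline{H}_{C_3}(\kappa)$. Since $(\psi, A)$ is non-trivial, we get, with $H = H(\kappa)$ that:
$$\kappa^2 \|\psi\|_2^2 > Q_{\kappa H A}[\psi]. \tag{6.2}$$
We define
$$\Delta := \kappa^2 \|\psi\|_2^2 - Q_{\kappa H A}[\psi]. \tag{6.3}$$
Notice, that the GL-equation gives
$$\|\psi\|_4^4 = \frac{\Delta}{\kappa^2}. \tag{6.4}$$
Since $\overline{H}_{C_3}(\kappa) > H > \overline{H}_{C_3}(\kappa) - o(\kappa)$, we are in a situation where Corollary 4.3 can be applied. Therefore, we get by (4.6) with $p = 4$,
$$\|\psi\|_2 \le C \kappa^{-1/4} \|\psi\|_4. \tag{6.5}$$
Coming back to (6.4), we get
$$\|\psi\|_2 \le C \kappa^{-3/4} \Delta^{1/4}. \tag{6.6}$$
We now estimate as in the proof of Theorem 4.13 (with the notation $\Gamma$, $a$ and $b = \operatorname{curl} A - 1$ as in that proof)
$$0 < \Delta \le \big( \kappa^2 - (1 - \rho) \lambda_1(\kappa H) \big) \|\psi\|_2^2 + \rho^{-1} (\kappa H)^2 \int |a \psi|^2\, dx, \tag{6.7}$$
for all $0 < \rho$.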


Notice that, by elliptic estimates, a W 1,2 () ≤ W 2,2 () ≤ Cb2 , so (by the Sobolev estimates in 2-dimensions), a 4 ≤ Cb2 , whereby (κ H )2 a 24 ≤ C(κ H )2 curl A − 122 ≤ C. Here we used that Eκ,H ψ, A ≤ 0 to get the last estimate.

(6.8)

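We now insert (6.8), (6.4), and (6.6) in (6.7),
\[ 0 < \Delta \le \bigl[\kappa^2 - (1 - \rho)\lambda_1(\kappa H)\bigr] \|\psi\|_2^2 + \rho^{-1} (\kappa H)^2 \|\mathbf{a}\|_4^2 \|\psi\|_4^2 \]
\[ \le \bigl[\kappa^2 - \lambda_1(\kappa H)\bigr] \|\psi\|_2^2 + C \rho \lambda_1(\kappa H) \Delta^{1/2} \kappa^{-3/2} + C \rho^{-1} \kappa^{-1} \Delta \sqrt{\Delta}. \tag{6.9} \]
Upon choosing $\rho = \sqrt{\Delta}\, \kappa^{-3/4}$, and using that $\lambda_1(\kappa H) < C \kappa^2$, we find
\[ 0 < \Delta \le \bigl[\kappa^2 - \lambda_1(\kappa H)\bigr] \|\psi\|_2^2 + C \kappa^{-1/4} \Delta. \tag{6.10} \]
When $\kappa$ is so big that $C \kappa^{-1/4} < 1$, we therefore get
\[ 0 < \Delta \bigl(1 - C \kappa^{-1/4}\bigr) \le \bigl[\kappa^2 - \lambda_1(\kappa H)\bigr] \|\psi\|_2^2. \tag{6.11} \]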
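The exponent bookkeeping behind (6.9) to (6.11) can be checked mechanically. The following is a minimal sketch, not part of the original argument: it assumes the reading $\rho = \Delta^{1/2}\kappa^{-3/4}$ and the bound $(\kappa H)^2\|\mathbf{a}\|_4^2 \le C\Delta$ for (6.8), and it tracks only the powers of $\kappa$ and $\Delta$ (constants are ignored). The helper `multiply` and all variable names are illustrative.

```python
from fractions import Fraction as F

def multiply(*monomials):
    """Multiply monomials kappa^a * Delta^b, each stored as the exponent pair (a, b)."""
    return (sum(m[0] for m in monomials), sum(m[1] for m in monomials))

# Assumed reading of the choice of rho: rho = Delta^(1/2) * kappa^(-3/4).
rho = (F(-3, 4), F(1, 2))
rho_inverse = (-rho[0], -rho[1])

lambda1_bound = (F(2), F(0))           # lambda_1(kappa H) < C kappa^2
psi_l2_squared = (F(-3, 2), F(1, 2))   # square of (6.6)
a_l4_squared = (F(0), F(1))            # (6.8), assumed to read <= C Delta
psi_l4_squared = (F(-1), F(1, 2))      # square root of (6.4)

# The two error terms on the right-hand side of (6.9).
first_error = multiply(rho, lambda1_bound, psi_l2_squared)
second_error = multiply(rho_inverse, a_l4_squared, psi_l4_squared)
print(first_error, second_error)
```

Under these assumptions both error terms carry the exponents $(-1/4, 1)$, that is, they are of size $\kappa^{-1/4}\Delta$, which is what allows them to be absorbed into the left-hand side in (6.11).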

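Since $\psi$ cannot vanish identically for a non-trivial minimizer, this shows that $\kappa^2 - \lambda_1(\kappa H) > 0$.

Remark 6.2. The proof gives also (at least if all critical fields are equal) that, for $H = H_{C_3}(\kappa)$, the minimizer should be normal. We indeed get in this case (see (6.11))
\[ \Delta \bigl(1 - C \kappa^{-1/4}\bigr) \le \bigl[\kappa^2 - \lambda_1(\kappa H)\bigr] \|\psi\|_2^2 = 0. \tag{6.12} \]
This implies $\Delta = 0$ and then $\psi = 0$, using for example (6.4).

Proof of Theorem 1.7. By Theorem 1.6 it suffices to prove the inequalities
\[ \overline{H_{C_3}}(\kappa) \le \overline{H_{C_3}^{\mathrm{loc}}}(\kappa), \qquad \underline{H_{C_3}}(\kappa) \le \underline{H_{C_3}^{\mathrm{loc}}}(\kappa), \]
for large values of $\kappa$. Notice that we can reformulate the definitions (1.18) as
\[ \overline{H_{C_3}^{\mathrm{loc}}}(\kappa) = \sup\{ H \ge 0 : \lambda_1(\kappa H) < \kappa^2 \}, \]
\[ \underline{H_{C_3}^{\mathrm{loc}}}(\kappa) = \sup\{ H \ge 0 : \text{for all } H' \le H,\ \lambda_1(\kappa H') < \kappa^2 \}. \tag{6.13} \]
As previously remarked, the asymptotics (1.12) is valid for all 4 different definitions of the critical field. In particular (remember that $\underline{H_{C_3}^{\mathrm{loc}}}(\kappa)$ is the smallest of the four fields and $\overline{H_{C_3}}(\kappa)$ is the biggest), we have
\[ \overline{H_{C_3}}(\kappa) - \underline{H_{C_3}^{\mathrm{loc}}}(\kappa) \le \frac{g(\kappa)}{2}, \]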


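for some strictly positive function $g = o(\kappa)$. By Lemma 6.1 there exists $\kappa_0$ such that if $\overline{H_{C_3}}(\kappa) - g(\kappa) \le H \le \overline{H_{C_3}}(\kappa)$, $\kappa \ge \kappa_0$, and $\mathcal{E}_{\kappa,H}$ has a non-trivial minimizer, then
\[ \kappa^2 \ge \lambda_1(\kappa H). \tag{6.14} \]
Suppose $\kappa \ge \kappa_0$ and $\overline{H_{C_3}}(\kappa) - g(\kappa) \le H < \underline{H_{C_3}}(\kappa)$. By definition of $\underline{H_{C_3}}(\kappa)$ there exists a non-trivial minimizer of $\mathcal{E}_{\kappa,H}$ and therefore (6.14) holds. Since $H$ was arbitrary, we get
\[ \kappa^2 \ge \lambda_1(\kappa H), \quad \text{for all } H \in \bigl[\overline{H_{C_3}}(\kappa) - g(\kappa),\ \underline{H_{C_3}}(\kappa)\bigr). \tag{6.15} \]
Since we know from the definition of $g$ that $\underline{H_{C_3}^{\mathrm{loc}}}(\kappa) \ge \overline{H_{C_3}}(\kappa) - g(\kappa)$, we can conclude by (6.13) that $\underline{H_{C_3}}(\kappa) \le \underline{H_{C_3}^{\mathrm{loc}}}(\kappa)$.

The proof that $\overline{H_{C_3}}(\kappa) \le \overline{H_{C_3}^{\mathrm{loc}}}(\kappa)$ is easier. Suppose $\kappa \ge \kappa_0$ and that $\overline{H_{C_3}}(\kappa) - g(\kappa) \le H < \overline{H_{C_3}}(\kappa)$ is such that $\mathcal{E}_{\kappa,H}$ has a non-trivial minimizer. Then (6.14) holds and we get from (6.13) that $H \le \overline{H_{C_3}^{\mathrm{loc}}}(\kappa)$. Since $H$ was arbitrary this implies that $\overline{H_{C_3}}(\kappa) \le \overline{H_{C_3}^{\mathrm{loc}}}(\kappa)$. This finishes the proof of Theorem 1.7.

7. An Improved Estimate on $\|\psi\|_{L^\infty}$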

From the maximum principle, one gets that minimizers $(\psi, \mathbf{A})$ of the Ginzburg-Landau functional (1.1) satisfy the estimate $\|\psi\|_{L^\infty} \le 1$ independently of the values of $\kappa$, $H$. When $H$ is far from $H_{C_3}$, that is a very useful estimate, but in the region near $H_{C_3}$ this estimate is far from optimal and it is interesting to have a better control of $\|\psi\|_\infty$.

Let us start with a non-rigorous argument. One can heuristically get the optimal behavior from a back-of-the-envelope calculation. Multiplying the GL-equation (1.3) by $\overline{\psi}$ and integrating, one gets
\[ Q_{\kappa H \mathbf{A}}(\psi) - \kappa^2 \|\psi\|_2^2 + \kappa^2 \|\psi\|_4^4 = 0. \tag{7.1} \]
Let
\[ \delta = \kappa^2 - \lambda_1(\kappa H \mathbf{A}) \tag{7.2} \]
be the spectral distance. We observe that it follows from (7.1) that $\delta$ is necessarily $> 0$ if the component $\psi$ of the minimizer is not trivial. One might expect that
\[ Q_{\kappa H \mathbf{A}}(\psi) - \kappa^2 \|\psi\|_2^2 \approx -\delta \|\psi\|_2^2, \qquad \frac{\|\psi\|_4^4}{\|\psi\|_2^2} \approx \|\psi\|_\infty^2. \tag{7.3} \]


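Therefore the GL-equation (7.1) implies that $\|\psi\|_\infty \approx C \sqrt{\delta}\, \kappa^{-1}$. Unfortunately, it is difficult to justify the second estimate in (7.3) rigorously, so we will only obtain a somewhat less accurate estimate.

Proposition 7.1. Let $\Omega$ be a bounded, simply-connected domain with smooth boundary. Suppose that $(\psi, \mathbf{A})_{\kappa,H}$ is a family of minimizers of the GL-functional with $0 < \rho = H_{C_3}(\kappa) - H = o(\kappa)$, as $\kappa \to \infty$. For all $\epsilon > 0$,
\[ \|\psi\|_{L^\infty(\Omega)} \le C \delta^{1/2} \kappa^{-\frac{1}{2} + \epsilon}. \tag{7.4} \]
If $\Omega$ satisfies Assumption 1.8, and $\rho = O(\kappa^{-1/2})$, then (7.4) is improved to
\[ \|\psi\|_{L^\infty(\Omega)} \le C \delta^{1/2} \kappa^{-\frac{5}{8} + \epsilon}. \tag{7.5} \]

The proposition will be a consequence of the following lemma.

Lemma 7.2. Under the assumptions of the proposition, for all $\epsilon_1, \epsilon_2 > 0$ such that $\epsilon_1 \le 1 + \epsilon_2$, there exists a constant $C > 0$, such that if $\delta$ is the corresponding spectral distance, $0 < \delta = \kappa^2 - \lambda_1(\kappa H \mathbf{A})$, then
\[ \lambda \le C \delta^{1/2} \kappa^{\epsilon_1 + \epsilon_2} \mu^{1 + \epsilon_1}, \tag{7.6} \]
where $\lambda = \|\psi\|_\infty$ and $\mu$ is defined by $\lambda \mu = \|\psi\|_2$. Furthermore, if $\Delta_\psi$, $\tilde{\delta}_\psi$ are defined by
\[ \Delta_\psi := \kappa^2 \|\psi\|_2^2 - Q_{\kappa H \mathbf{A}}[\psi], \qquad \tilde{\delta}_\psi := \frac{\Delta_\psi}{\|\psi\|_2^2}, \tag{7.7} \]
then the estimate (7.6) also holds with $\delta$ replaced by $\tilde{\delta}_\psi$.

Proof of Lemma 7.2. Before we start the real proof, we state the basic inequalities that we will use. The estimates
\[ Q_{\kappa H \mathbf{A}}(\psi) \le \kappa^2 \|\psi\|_2^2, \qquad \|\psi\|_4^4 \le \frac{\delta}{\kappa^2} \|\psi\|_2^2, \tag{7.8} \]
are easy consequences of (7.1). Furthermore, from [HePa, Prop. 4.2], we get the inequality
\[ \|\nabla_{\kappa H \mathbf{A}} \psi\|_\infty \le C \sqrt{\kappa H}\, \|\psi\|_\infty. \tag{7.9} \]
By the Sobolev inequality and interpolation, we get that for all $ps > 2$, $0 < s \le 1$,
\[ \lambda \le C \bigl\| |\psi| \bigr\|_{W^{s,p}} \le C \|\psi\|_p^{1-s} \bigl\| \nabla |\psi| \bigr\|_p^{s} + C \|\psi\|_p. \]
We then use the diamagnetic and Hölder's inequalities on the right-hand side:
\[ \lambda \le C \|\psi\|_p^{1-s} \bigl\| (-i\nabla - \kappa H \mathbf{A}) \psi \bigr\|_p^{s} + C \lambda \mu^{2/p}. \]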


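Using (4.6), with $p = \infty$, we get that
\[ \mu \le C \kappa^{-1/2}. \tag{7.10} \]
So for $\kappa$ large enough, we obtain
\[ \lambda \le C \|\psi\|_p^{1-s} \bigl\| (-i\nabla - \kappa H \mathbf{A}) \psi \bigr\|_p^{s}. \]
We now apply Hölder's inequality for each term of the right hand side:
\[ \lambda \le C \Bigl( \lambda^{p-4} \|\psi\|_4^4 \Bigr)^{\frac{1-s}{p}} \Bigl( \bigl\| (-i\nabla - \kappa H \mathbf{A}) \psi \bigr\|_\infty^{p-2} \bigl\| (-i\nabla - \kappa H \mathbf{A}) \psi \bigr\|_2^2 \Bigr)^{\frac{s}{p}}. \]
We use (7.8) and (7.9) to get
\[ \lambda \le C \Bigl( \lambda^{p-2} \mu^2 \frac{\delta}{\kappa^2} \Bigr)^{\frac{1-s}{p}} \bigl( \lambda^{p} \kappa^{p} \mu^2 \bigr)^{\frac{s}{p}} = C \lambda^{1 - 2\frac{1-s}{p}}\, \delta^{\frac{1-s}{p}}\, \mu^{\frac{2}{p}}\, \kappa^{s - 2\frac{1-s}{p}}. \tag{7.11} \]
This implies that
\[ \lambda \le C \delta^{\frac{1}{2}}\, \mu^{\frac{1}{1-s}}\, \kappa^{\frac{ps}{2(1-s)} - 1}. \tag{7.12} \]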

Write $\frac{1}{1-s} = 1 + \epsilon_1$ and $ps = 2 + \frac{2\epsilon_2}{1 + \epsilon_1}$. Then we find (7.6).

To get the version of (7.6) with $\tilde{\delta}_\psi$, we notice that, as a consequence of (7.1), we have
\[ \|\psi\|_4^4 = \frac{\tilde{\delta}_\psi}{\kappa^2} \|\psi\|_2^2. \tag{7.13} \]
Thus the input to the above proof of (7.6), i.e. (7.8) and (7.9), holds with $\delta$ replaced by $\tilde{\delta}_\psi$. Therefore the conclusion (7.6) also holds under the same change.
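The passage from (7.12) to (7.6) is a change of parameters, and it can be verified with exact rational arithmetic. The sketch below is illustrative only: the function name `exponents` and the sample values of $\epsilon_1$, $\epsilon_2$ are assumptions made for this check, and the formulas are the substitution $1/(1-s) = 1 + \epsilon_1$, $ps = 2 + 2\epsilon_2/(1+\epsilon_1)$ as read from the text.

```python
from fractions import Fraction as F

def exponents(eps1, eps2):
    """Return (s, p, exponent of mu, exponent of kappa) appearing in (7.12)."""
    s = eps1 / (1 + eps1)                   # solves 1 / (1 - s) = 1 + eps1
    p = (2 + 2 * eps2 / (1 + eps1)) / s     # solves p * s = 2 + 2 * eps2 / (1 + eps1)
    mu_exponent = 1 / (1 - s)
    kappa_exponent = p * s / (2 * (1 - s)) - 1
    return s, p, mu_exponent, kappa_exponent

eps1, eps2 = F(1, 10), F(1, 20)             # sample values with eps1 <= 1 + eps2
s, p, mu_exponent, kappa_exponent = exponents(eps1, eps2)
print(s, p, mu_exponent, kappa_exponent)

# Inserting mu <= C kappa^(-1/2), see (7.10), gives the power of kappa in (7.4).
final_kappa_exponent = kappa_exponent - mu_exponent / 2
print(final_kappa_exponent)
```

For these sample values the exponent of $\mu$ equals $1 + \epsilon_1$ and the exponent of $\kappa$ equals $\epsilon_1 + \epsilon_2$, as in (7.6), while $p = 23 \ge 4$, so the Hölder step with exponent $p - 4$ is admissible.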

Proof of Proposition 7.1. Applying (7.10) to (7.6), we find (7.4). If $\rho = O(\kappa^{-1/2})$ and under Assumption 1.8, we can also apply the parallel Agmon estimates, Theorem 1.3 and its Corollary 4.11 with $p = 4$, and therefore bound
\[ \mu \le C \kappa^{-5/8}. \tag{7.14} \]
This implies (7.5).

It is also possible to express the estimate on $\|\psi\|_\infty$ in terms of the deviation $\rho$ from the critical field instead of the spectral distance $\delta$. This leads to the following statement:

Proposition 7.3. Let $\Omega$ be a bounded, simply-connected domain with smooth boundary such that the conclusion of Proposition 2.4 holds. Suppose that $(\psi, \mathbf{A})_{\kappa,H}$ is a family of minimizers of the GL-functional with $0 < \rho = H_{C_3}(\kappa) - H = o(\kappa)$, as $\kappa \to \infty$. Then, for all $\epsilon > 0$, there exist $C_\epsilon$ and $\kappa_\epsilon$, s.t., for $\kappa \ge \kappa_\epsilon$,
\[ \|\psi\|_{L^\infty(\Omega)} \le C_\epsilon\, \rho^{1/2} \kappa^{\epsilon}. \tag{7.15} \]
If $\Omega$ satisfies Assumption 1.8, and $\rho = O(\kappa^{-1/2})$, then (7.15) is improved to
\[ \|\psi\|_{L^\infty(\Omega)} \le C_\epsilon\, \rho^{1/2} \kappa^{-\frac{1}{8} + \epsilon}. \tag{7.16} \]


Proof. Since the conclusion of Proposition 2.4 holds, we get that the solution of (1.23) is unique for $\kappa$ large and all the critical fields are equal. We will use the version of Lemma 7.2 with $\tilde{\delta}_\psi$. We may assume that $\rho > 0$ (if not, all minimizers are trivial and $\|\psi\|_\infty = 0$). Then we recall inequality (6.11) from the proof of Lemma 6.1 and get
\[ \Delta_\psi \le C \bigl[\kappa^2 - \lambda_1(\kappa H)\bigr] \|\psi\|_2^2. \tag{7.17} \]
Proposition 2.4 gives a control of the derivative of $\lambda_1(B)$, so we find from (7.17) that
\[ \Delta_\psi \le C \kappa \rho \|\psi\|_2^2. \tag{7.18} \]
Therefore, $\tilde{\delta}_\psi \le C \kappa \rho$, and, proceeding as in the proof of Proposition 7.1, we get the proposition.

Appendix A. General Non-Degenerate Domains

In this appendix we will prove an asymptotics for the low lying eigenvalues of the magnetic Neumann operator on a domain $\Omega$ satisfying Assumption 1.8. This was obtained in [FoHe2] under the additional condition that $N = 1$ in this assumption, i.e. that the curvature has a unique maximum. For convenience of comparison with this work we consider the semi-classical asymptotics, i.e. we consider the operator
\[ K_h = (-ih\nabla - \mathbf{A})^2, \tag{A.1} \]
with Neumann boundary conditions, and will study the asymptotics of the spectrum $\{\mu^{(n)}(h)\}_{n=1}^{\infty}$ of $K_h$ as $h \to 0_+$ (footnote 5). Here $\mathbf{A}$ is a vector potential generating a unit magnetic field, i.e. $\operatorname{curl} \mathbf{A} = 1$. It will be practical to have a globally defined choice of $\mathbf{A}$, so we will work with the special choice $\mathbf{A} = \frac{1}{2}(-x_2, x_1)$.

Footnote 5: Clearly, as in Subsect. 3.2, by scaling this is equivalent to the large $B$ asymptotics of $\operatorname{Spec} \mathcal{H}(B)$.

Let $\delta = \frac{1}{4} \min_{j \ne k} |s_j - s_k|$ and let $\gamma : \mathbb{R}/|\partial\Omega| \to \partial\Omega$ be the standard parametrization of $\partial\Omega$ by arc-length. For $j \in \{1, \ldots, N\}$, let $\Omega^{(j)}$ be a bounded domain with smooth boundary satisfying that

1. There exists a (smooth) parametrization $\gamma^{(j)}$ of $\partial\Omega^{(j)}$ by arc-length such that $\gamma^{(j)}(s) = \gamma(s - s_j)$ for $s \in (-\delta, \delta)$.
2. $\gamma^{(j)}(0)$ is the unique point of maximum curvature of $\partial\Omega^{(j)}$.

In particular, the domains $\Omega^{(j)}$ satisfy Assumption 1.8, with $N = 1$. Define $K_h^{(j)}$ to be the differential operator from (A.1) defined on $\Omega^{(j)}$ with Neumann boundary conditions, and let $\{\mu^{(n,j)}(h)\}_{n=1}^{\infty}$ be the eigenvalues (in non-decreasing order) of $K_h^{(j)}$. From [FoHe2] we get the following description of the $\mu^{(n,j)}(h)$'s.

Theorem A.1. Suppose $\Omega$ satisfies Assumption 1.8. For any $j \in \{1, \ldots, N\}$ and any $n \in \mathbb{N} \setminus \{0\}$, there exists a sequence $\{\zeta_\ell^{(n,j)}\}_{\ell=1}^{\infty} \subset \mathbb{R}$ (which can be calculated recursively to any order) such that $\mu^{(n,j)}(h)$ admits the following asymptotic expansion (for $h \searrow 0$):
\[ \mu^{(n,j)}(h) \sim \Theta_0 h - k_{\max} C_1 h^{3/2} + C_1 \Theta_0^{1/4} \sqrt{\frac{3 k_{2,j}}{2}}\, (2n - 1) h^{7/4} + h^{15/8} \sum_{\ell=0}^{\infty} h^{\ell/8} \zeta_\ell^{(n,j)}. \tag{A.2} \]
Furthermore, the coefficients $\{\zeta_\ell^{(n,j)}\}_{\ell=1}^{\infty}$ are independent of the choice of $\Omega^{(j)}$ satisfying (1) and (2) above.

Let now $\{\tilde{\mu}^{(n)}(h)\}_{n=1}^{\infty}$ be the sequence of the $\{\mu^{(n,j)}(h)\}$, $j = 1, \ldots, N$, $n = 1, \ldots, \infty$, with multiplicity and in non-decreasing order. In this context, it is convenient to consider the operator $K_h^{\mathrm{comb}}$ defined as $K_h^{(1)} \oplus \cdots \oplus K_h^{(N)}$ as an operator on $L^2(\Omega^{(1)}) \oplus \cdots \oplus L^2(\Omega^{(N)})$. Then clearly $\tilde{\mu}^{(n)}(h)$ is simply the $n$-th eigenvalue of $K_h^{\mathrm{comb}}$. The main result of this appendix is the following theorem.
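The statement that the spectrum of a direct sum is the union of the spectra of its summands, counted with multiplicity and re-sorted, can be illustrated in a few lines. This is only a sketch: the function name `combined_spectrum` is hypothetical, and the eigenvalue lists are made-up numbers rather than values of the $\mu^{(n,j)}(h)$.

```python
import heapq
from itertools import islice

def combined_spectrum(spectra, count):
    """First `count` eigenvalues of a direct sum of operators.

    `spectra` holds one non-decreasing eigenvalue list per summand;
    multiplicities are kept because repeated values are merged, not dropped.
    """
    return list(islice(heapq.merge(*spectra), count))

# Made-up spectra for N = 2 summands (not data from the paper).
spectrum_1 = [1.0, 3.0, 5.0]
spectrum_2 = [1.0, 2.0, 6.0]
print(combined_spectrum([spectrum_1, spectrum_2], 4))
```

With these sample lists the first four combined eigenvalues are 1.0, 1.0, 2.0, 3.0, so the value 1.0 appears with multiplicity two, once from each summand.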

Theorem A.2. Suppose that $\Omega$ satisfies Assumption 1.8. With the notation given above, we have for all $n \in \mathbb{N} \setminus \{0\}$,
\[ |\mu^{(n)}(h) - \tilde{\mu}^{(n)}(h)| = O(h^{\infty}). \]

Remark A.3. Actually, the proof gives Theorem A.2 with an exponentially small error instead of the weaker $O(h^{\infty})$ in the statement.

By simple manipulations we convert the small $h$ asymptotics to a large $B$ asymptotics.

Corollary A.4. Suppose $\Omega$ satisfies Assumption 1.8 and let $\lambda_1(B)$ be the smallest eigenvalue of $\mathcal{H}(B)$. Then there exists a sequence $\{\zeta_j\}_{j=0}^{\infty} \subset \mathbb{R}$ such that for all $M > 0$,
\[ \lambda_1(B) = \Theta_0 B - C_1 k_{\max} B^{1/2} + C_1 \Theta_0^{1/4} \sqrt{\frac{3 k_2}{2}}\, B^{1/4} + B^{1/8} \sum_{j=0}^{M} \zeta_j B^{-j/8} + O\bigl(B^{-M/8}\bigr). \tag{A.3} \]

Proof of Theorem A.2. We may choose an $\eta > 0$ such that
\[ B(\gamma(s_j), 2\eta) \cap B(\gamma(s_k), 2\eta) = \emptyset \quad \text{for } j \ne k, \]
\[ B(\gamma(s_j), 2\eta) \cap \Omega = B(\gamma(s_j), 2\eta) \cap \Omega^{(j)}. \tag{A.4} \]
Let $\phi_j$ be a smooth function satisfying
\[ \phi_j(x) = 1 \text{ on } B(\gamma(s_j), \eta), \qquad \operatorname{supp} \phi_j \subset B(\gamma(s_j), 2\eta). \tag{A.5} \]
Let $\psi_h^{(n)}$ be the $n$-th eigenfunction of $K_h$. Furthermore, let $\psi_h^{(n, j_n)}$ be the eigenfunction of $K_h^{(j_n)}$ corresponding to $\tilde{\mu}^{(n)}(h)$; here $j_n$ may depend on $h$. We may consider $\phi_j \psi_h^{(n,j)}$ as a function on $\Omega$ (extended by zero). By the Agmon estimates (for the $\Omega^{(j)}$), we easily get
\[ \bigl\langle \phi_{j_n} \psi_h^{(n, j_n)} \,\big|\, \phi_{j_m} \psi_h^{(m, j_m)} \bigr\rangle = \delta_{m,n} + O(h^{\infty}), \]
\[ \bigl\langle \phi_{j_n} \psi_h^{(n, j_n)} \,\big|\, K_h\, \phi_{j_m} \psi_h^{(m, j_m)} \bigr\rangle = \delta_{m,n}\, \tilde{\mu}^{(n)}(h) + O(h^{\infty}). \tag{A.6} \]
Therefore, the variational characterization of eigenvalues gives
\[ \mu^{(n)}(h) \le \tilde{\mu}^{(n)}(h) + O(h^{\infty}). \tag{A.7} \]
We now prove the opposite inequality. Define $\phi = \sum_{j=1}^{N} \phi_j$. For $\psi \in L^2(\Omega)$, we can naturally identify $\phi\psi$ with an element of $L^2(\Omega^{(1)}) \oplus \cdots \oplus L^2(\Omega^{(N)})$. We will do


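so without changing the notation. Using again the Agmon estimates (this time for $\Omega$), we see that
\[ \bigl\langle \phi \psi_h^{(n)} \,\big|\, \phi \psi_h^{(m)} \bigr\rangle = \delta_{m,n} + O(h^{\infty}), \]
\[ \bigl\langle \phi \psi_h^{(n)} \,\big|\, K_h^{\mathrm{comb}}\, \phi \psi_h^{(m)} \bigr\rangle = \delta_{m,n}\, \tilde{\mu}^{(n)}(h) + O(h^{\infty}). \tag{A.8} \]
Here the inner products on the left hand sides are the natural inner product on $L^2(\Omega^{(1)}) \oplus \cdots \oplus L^2(\Omega^{(N)})$. A second application of the variational principle therefore gives
\[ \tilde{\mu}^{(n)}(h) \le \mu^{(n)}(h) + O(h^{\infty}), \tag{A.9} \]
and finishes the proof.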

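Appendix B. Boundary Coordinates

Let $\Omega$ be a smooth, simply-connected domain in $\mathbb{R}^2$. Let $\gamma : \mathbb{R}/|\partial\Omega| \to \partial\Omega$ be a parametrization of the boundary with $|\gamma'(s)| = 1$ for all $s$. Let $\nu(s)$ be the unit vector, normal to the boundary, pointing inward at the point $\gamma(s)$. We choose the orientation of the parametrization $\gamma$ to be counter-clockwise, so
\[ \det\bigl(\gamma'(s), \nu(s)\bigr) = 1. \]
The curvature $k(s)$ of $\partial\Omega$ at the point $\gamma(s)$ is now defined by
\[ \gamma''(s) = k(s)\nu(s). \]
The map $\Phi$ defined in the introduction,
\[ \Phi : \mathbb{R}/|\partial\Omega| \times (0, t_0) \to \Omega, \qquad (s, t) \mapsto \gamma(s) + t\nu(s), \tag{B.1} \]
is clearly a diffeomorphism, when $t_0$ is sufficiently small, with image
\[ \Phi\bigl(\mathbb{R}/|\partial\Omega| \times (0, t_0)\bigr) = \{x \in \Omega \mid \operatorname{dist}(x, \partial\Omega) < t_0\} =: \Omega_{t_0}. \]
Furthermore, $t(\Phi(s, t)) = t$. If $\mathbf{A}$ is a vector field on $\Omega_{t_0}$ with $B = \operatorname{curl} \mathbf{A}$ we define the associated fields in $(s, t)$-coordinates by
\[ \tilde{A}_1(s, t) = (1 - t k(s))\, \mathbf{A}(\Phi(s, t)) \cdot \gamma'(s), \qquad \tilde{A}_2(s, t) = \mathbf{A}(\Phi(s, t)) \cdot \nu(s), \tag{B.2} \]
\[ \tilde{B}(s, t) = B(\Phi(s, t)). \tag{B.3} \]
Then $\partial_s \tilde{A}_2 - \partial_t \tilde{A}_1 = (1 - t k(s)) \tilde{B}$. Furthermore, for all $u \in W^{1,2}(\Omega_{t_0})$, we have, with $v = u \circ \Phi$,
\[ \int_{\Omega_{t_0}} |(-i\nabla - \mathbf{A}) u|^2 \, dx = \int \Bigl[ (1 - t k(s))^{-2} \bigl|(-i\partial_s - \tilde{A}_1) v\bigr|^2 + \bigl|(-i\partial_t - \tilde{A}_2) v\bigr|^2 \Bigr] (1 - t k(s)) \, ds\, dt, \]
\[ \int_{\Omega_{t_0}} |u(x)|^2 \, dx = \int |v(s, t)|^2 (1 - t k(s)) \, ds\, dt. \tag{B.4} \]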


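Lemma B.1. Suppose $\Omega$ is a bounded, simply connected domain with smooth boundary and let $t_0$ be the constant from (B.1). Then there exists a constant $C > 0$ such that, if $\mathbf{A}$ is a vector potential in $\Omega$ with
\[ \operatorname{curl} \mathbf{A} = 1 \quad \text{on } \partial\Omega, \tag{B.5} \]
and with $\tilde{A}$ defined as in (B.2), then there exists a gauge function $\varphi(s, t)$ on $\mathbb{R}/|\partial\Omega| \times (0, t_0)$ such that
\[ \bar{A}(s, t) = \begin{pmatrix} \bar{A}_1(s, t) \\ \bar{A}_2(s, t) \end{pmatrix} := \tilde{A} - \nabla_{(s,t)} \varphi = \begin{pmatrix} \gamma_0 - t + \frac{t^2 k(s)}{2} + t^2 b(s, t) \\ 0 \end{pmatrix}, \tag{B.6} \]
where
\[ \gamma_0 = \frac{1}{|\partial\Omega|} \int_\Omega \operatorname{curl} \mathbf{A} \, dx, \]
and $b$ satisfies the estimate
\[ \|b\|_{L^\infty(\mathbb{R}/|\partial\Omega| \times (0, t_0))} \le C \|\operatorname{curl} \mathbf{A} - 1\|_{C^1(\Omega_{t_0})}. \tag{B.7} \]
Furthermore, if $[s_0, s_1]$ is a subset of $\mathbb{R}/|\partial\Omega|$ with $s_1 - s_0 < |\partial\Omega|$, then we may choose $\varphi$ on $(s_0, s_1) \times (0, t_0)$ such that
\[ \bar{A}(s, t) = \begin{pmatrix} \bar{A}_1(s, t) \\ \bar{A}_2(s, t) \end{pmatrix} := \tilde{A} - \nabla_{(s,t)} \varphi = \begin{pmatrix} -t + \frac{t^2 k(s)}{2} + t^2 b(s, t) \\ 0 \end{pmatrix}, \tag{B.8} \]
with $b$ still satisfying the estimate (B.7).

Proof. Notice first that
\[ \int_0^{|\partial\Omega|} \tilde{A}_1(s, 0) \, ds = \int_0^{|\partial\Omega|} \mathbf{A} \cdot \gamma'(s) \, ds = \int_\Omega \operatorname{curl} \mathbf{A} \, dx. \]
Let us write
\[ \nu = \operatorname{curl} \mathbf{A} - 1, \qquad \hat{\nu}(s, t) = \nu(\Phi(s, t)), \qquad \tilde{\nu} = \frac{\hat{\nu}}{t}. \]
Then $\|\tilde{\nu}\|_{L^\infty} \le C \|\nu\|_{C^1(\Omega_{t_0})}$ and
\[ \partial_s \tilde{A}_2 - \partial_t \tilde{A}_1 = (1 - t k(s))(1 + t \tilde{\nu}). \]
Define
\[ \varphi(s, t) = \int_0^t \tilde{A}_2(s, t') \, dt' + \int_0^s \tilde{A}_1(s', 0) \, ds' - s \gamma_0. \tag{B.9} \]
Then $\varphi$ is a well-defined continuous function on $\mathbb{R}/|\partial\Omega| \times (0, t_0)$. We pose $\bar{A} = \tilde{A} - \nabla\varphi$ and find
\[ \bar{A}(s, t) = \begin{pmatrix} \bar{A}_1(s, t) \\ \bar{A}_2(s, t) \end{pmatrix} = \begin{pmatrix} \bar{A}_1(s, t) \\ 0 \end{pmatrix}, \]
\[ \partial_t \bar{A}_1(s, t) = -(\partial_s \tilde{A}_2 - \partial_t \tilde{A}_1) = -(1 - t k(s))(1 + t \tilde{\nu}), \qquad \bar{A}_1(s, 0) = \gamma_0. \]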


Therefore,
\[ \bar{A}_1(s, t) = \gamma_0 - t + \frac{t^2 k(s)}{2} - \int_0^t t' (1 - t' k(s)) \tilde{\nu}(s, t') \, dt', \]
and we get (B.6) by applying l'Hôpital's rule to the integral.

In the case where we only consider a part $(s_0, s_1) \times (0, t_0)$ of the ring $\mathbb{R}/|\partial\Omega| \times (0, t_0)$, we have trivial topology and therefore any two vector fields generating the same magnetic field are gauge equivalent. Therefore the constant term, $\gamma_0$, can be omitted. From a more practical point of view, one can see that we can omit the term $s\gamma_0$ in (B.9) since we do not need to ensure the periodicity of the function $\varphi$.

Appendix C. Proof of Theorem 2.5

In this appendix we will give the proof of Theorem 2.5. The proof essentially reduces to a detailed analysis of a certain perturbation of the operator $\mathfrak{h}(\xi_0)$. We will at times invoke certain known identities on matrix elements involving the ground state $u_0$ of $\mathfrak{h}(\xi_0)$. These we will recall here. First of all, the moments
\[ M_n := \int_0^{\infty} (\tau + \xi_0)^n |u_0(\tau)|^2 \, d\tau \]
are known [BeSt, HeMo2]. In particular,
\[ M_0 = 1, \qquad M_1 = 0, \qquad M_2 = \frac{\Theta_0}{2}. \tag{C.1} \]
Since the vanishing of the first moment, $M_1$, means that $u_0 \perp (\tau + \xi_0) u_0$, we find that one can define
\[ \Bigl( -\frac{d^2}{d\tau^2} + (\tau + \xi_0)^2 - \Theta_0 \Bigr)^{-1} (\tau + \xi_0) u_0. \]
In the proof below, the matrix element
\[ I_2 := \Bigl\langle (\tau + \xi_0) u_0 \,\Big|\, \Bigl( -\frac{d^2}{d\tau^2} + (\tau + \xi_0)^2 - \Theta_0 \Bigr)^{-1} (\tau + \xi_0) u_0 \Bigr\rangle \tag{C.2} \]
plays a role. The expression $I_2$ can be calculated from second order perturbation theory, at $\zeta = \xi_0$, on the family
\[ \mathfrak{h}(\zeta) := -\frac{d^2}{d\tau^2} + (\tau + \zeta)^2 \tag{C.3} \]
on $L^2(\mathbb{R}_+, d\tau)$ with Neumann boundary conditions at the origin. The details of this calculation are given in [FoHe2, Appendix A]; here we just recall the result
\[ 1 - 4 I_2 = 3 C_1 \sqrt{\Theta_0}. \tag{C.4} \]


Proof of Theorem 2.5. Let $D(t) = \{x \in \mathbb{R}^2 \mid |x| \le t\}$ be the disc with radius $t$. Let $\tilde{Q}_B$ be the quadratic form
\[ \tilde{Q}_B[u] = \int_{D(1) \setminus D(\frac{1}{2})} \bigl| (-i\nabla - B\mathbf{F}) u \bigr|^2 \, dx, \]
with domain $\{u \in W^{1,2}(D(1) \setminus D(\tfrac{1}{2})) \mid u(x) = 0 \text{ on } |x| = \tfrac{1}{2}\}$. Let $\tilde{\lambda}(B)$ be the lowest eigenvalue of the corresponding self-adjoint operator. Using the Agmon estimates in the normal direction (see Theorem 4.1), we see that
\[ \lambda_1(B) \le \tilde{\lambda}(B) = \lambda_1(B) + O(B^{-\infty}). \tag{C.5} \]
The first inequality in (C.5) is immediate by the variational principle. The second estimate follows by using a cut-off version of the ground state $\psi$ of $\mathcal{H}(B)$ as a trial state in $\tilde{Q}_B$, since $\psi$ decays exponentially away from $\{|x| = 1\}$ by Theorem 4.1.

By changing to boundary coordinates (if $(r, \theta)$ are usual polar coordinates, then $t = 1 - r$, $s = \theta$), the quadratic form $\tilde{Q}_B[u]$ becomes
\[ \tilde{Q}_B[u] = \int_0^{2\pi} ds \int_0^{1/2} dt \, \Bigl[ (1 - t)^{-1} \bigl|(D_s - B\tilde{A}_1) u\bigr|^2 + (1 - t) |D_t u|^2 \Bigr], \]
\[ \|u\|_{L^2}^2 = \int_0^{2\pi} ds \int_0^{1/2} dt \, (1 - t) |u|^2, \qquad \tilde{A}_1 = \frac{1}{2} - t + \frac{t^2}{2}. \tag{C.6} \]
Here we used Lemma B.1, and that $\gamma_0 = \frac{1}{|\partial\Omega|} \int_\Omega \operatorname{curl} \mathbf{F} \, dx = \frac{1}{2}$ for the disc.

Performing the scaling $\tau = \sqrt{B}\, t$ and decomposing in Fourier modes, $u = e^{ims} \phi(t)$, we find
\[ \tilde{\lambda}(B) = B \inf_{m \in \mathbb{Z}} e_{\delta(m,B),B}. \tag{C.7} \]
Here the function $\delta(m, B)$ was defined in (2.10) and $e_{\delta,B}$ is the lowest eigenvalue of the quadratic form $q_{\delta,B}$ on $L^2\bigl((0, \sqrt{B}/2); (1 - \tfrac{\tau}{\sqrt{B}}) d\tau\bigr)$ (with Neumann boundary condition at $0$ and Dirichlet at $\sqrt{B}/2$):
\[ q_{\delta,B}[\phi] = \int_0^{\sqrt{B}/2} \Bigl[ \bigl(1 - \tfrac{\tau}{\sqrt{B}}\bigr)^{-1} \Bigl( (\tau + \xi_0) + B^{-\frac{1}{2}} \bigl(\delta - \tfrac{\tau^2}{2}\bigr) \Bigr)^2 |\phi(\tau)|^2 + \bigl(1 - \tfrac{\tau}{\sqrt{B}}\bigr) |\phi'(\tau)|^2 \Bigr] d\tau. \tag{C.8} \]
We will only consider $\delta$ varying in a fixed bounded set. This is justified since it follows from [FoHe2, Lemma 5.4] that, for all $C > 0$, there exists $D > 0$ such that, if $|\delta| > D$ and $B > D$, then
\[ e_{\delta,B} \ge \Theta_0 - C_1 B^{-\frac{1}{2}} + C B^{-1}. \]
Furthermore, for $\delta$ varying in a fixed bounded set, we know (from the analysis of $\mathfrak{h}$, some of which is recalled in Subsect. 1.2) that there exists a $d > 0$ such that if $B > d^{-1}$, then the spectrum of $q_{\delta,B}$ contained in $(-\infty, \Theta_0 + d)$ consists of exactly one simple eigenvalue.


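The self-adjoint Neumann operator $\mathfrak{h}(\delta, B)$ associated to $q_{\delta,B}$ (on the space $L^2\bigl((0, \sqrt{B}/2); (1 - \tfrac{\tau}{\sqrt{B}}) d\tau\bigr)$) is
\[ \mathfrak{h}(\delta, B) = -\bigl(1 - \tfrac{\tau}{\sqrt{B}}\bigr)^{-1} \frac{d}{d\tau} \bigl(1 - \tfrac{\tau}{\sqrt{B}}\bigr) \frac{d}{d\tau} + \bigl(1 - \tfrac{\tau}{\sqrt{B}}\bigr)^{-2} \Bigl( (\tau + \xi_0) + B^{-\frac{1}{2}} \bigl(\delta - \tfrac{\tau^2}{2}\bigr) \Bigr)^2. \tag{C.9} \]
We will write down an explicit test function for $\mathfrak{h}(\delta, B)$ in (C.14) below, giving $e_{\delta,B}$ up to an error of order $O(B^{-\frac{3}{2}})$ (locally uniformly in $\delta$). We can formally develop $\mathfrak{h}(\delta, B)$ as
\[ \mathfrak{h}(\delta, B) = \mathfrak{h}_0 + B^{-\frac{1}{2}} \mathfrak{h}_1 + B^{-1} \mathfrak{h}_2 + O(B^{-\frac{3}{2}}), \]
with
\[ \mathfrak{h}_0 = -\frac{d^2}{d\tau^2} + (\tau + \xi_0)^2 \ (= \mathfrak{h}(\xi_0)), \]
\[ \mathfrak{h}_1 = \frac{d}{d\tau} + 2(\tau + \xi_0)\bigl(\delta - \tfrac{\tau^2}{2}\bigr) + 2\tau(\tau + \xi_0)^2, \]
\[ \mathfrak{h}_2 = \tau \frac{d}{d\tau} + \bigl(\delta - \tfrac{\tau^2}{2}\bigr)^2 + 4\tau(\tau + \xi_0)\bigl(\delta - \tfrac{\tau^2}{2}\bigr) + 3\tau^2(\tau + \xi_0)^2. \tag{C.10} \]
Let $u_0$ be the known ground state eigenfunction of $\mathfrak{h}_0$ with eigenvalue $\Theta_0$. Here $\mathfrak{h}_0$ is considered as a selfadjoint operator on $L^2(\mathbb{R}_+, d\tau)$ with Neumann boundary condition at $0$. Let $R_0$ be the regularized resolvent
\[ R_0 \phi = \begin{cases} (\mathfrak{h}_0 - \Theta_0)^{-1} \phi, & \phi \perp u_0, \\ 0, & \phi \parallel u_0. \end{cases} \]
Here 'perpendicular' is measured with respect to the usual inner product (no perturbation of the measure) in $L^2(\mathbb{R}_+, d\tau)$. Let $\lambda_1$ and $\lambda_2$ be given by
\[ \lambda_1 := \langle u_0 \mid \mathfrak{h}_1 u_0 \rangle, \qquad \lambda_2 := \lambda_{2,1} + \lambda_{2,2}, \]
\[ \lambda_{2,1} := \langle u_0 \mid \mathfrak{h}_2 u_0 \rangle, \qquad \lambda_{2,2} := \langle u_0 \mid (\mathfrak{h}_1 - \lambda_1) u_1 \rangle. \tag{C.11} \]
Here the inner products are the usual inner products in $L^2(\mathbb{R}_+, d\tau)$. The functions $u_1$, $u_2$ are given as
\[ u_1 = -R_0 (\mathfrak{h}_1 - \lambda_1) u_0, \qquad u_2 = -R_0 \bigl[ (\mathfrak{h}_1 - \lambda_1) u_1 + (\mathfrak{h}_2 - \lambda_2) u_0 \bigr]. \tag{C.12} \]
Notice that (see [FoHe2, Lemma A.5]) $u_0 \in \mathcal{S}(\mathbb{R}_+)$ and that $R_0$ maps $\mathcal{S}(\mathbb{R}_+)$ (continuously) to itself. Therefore, $u_0$, $u_1$, $u_2$ (and their derivatives) are rapidly decreasing functions on $\mathbb{R}_+$. Let $\chi \in C_0^{\infty}(\mathbb{R})$ be a usual cut-off function,
\[ \chi(t) = 1 \ \text{for } |t| \le \tfrac{1}{8}, \qquad \operatorname{supp} \chi \subset [-\tfrac{1}{4}, \tfrac{1}{4}]. \tag{C.13} \]
Define $\chi_B(\tau) = \chi(\tau B^{-\frac{1}{4}})$.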


Our trial state is defined by
\[ \psi := \chi_B \bigl[ u_0 + B^{-\frac{1}{2}} u_1 + B^{-1} u_2 \bigr]. \tag{C.14} \]
A calculation (using in particular the exponential decay of the involved functions) gives that
\[ \Bigl\| \bigl[ \mathfrak{h}(\delta, B) - \bigl( \Theta_0 + \lambda_1 B^{-\frac{1}{2}} + \lambda_2 B^{-1} \bigr) \bigr] \psi \Bigr\|_{L^2([0, \sqrt{B}/2]; (1 - \tau/\sqrt{B}) d\tau)} = O(B^{-\frac{3}{2}}), \]
\[ \|\psi\|_{L^2([0, \sqrt{B}/2]; (1 - \tau/\sqrt{B}) d\tau)} = 1 + O(B^{-\frac{1}{2}}), \tag{C.15} \]
where the constant in $O$ is uniform for $\delta$ in bounded sets. By the spectral theorem we get (uniformly for $\delta$ varying in bounded sets)
\[ \operatorname{dist}\bigl( \Theta_0 + \lambda_1 B^{-\frac{1}{2}} + \lambda_2 B^{-1},\ \operatorname{Spec} \mathfrak{h}(\delta, B) \bigr) = O(B^{-\frac{3}{2}}). \]
Since $\Theta_0$ is the isolated ground state eigenvalue for the unperturbed operator $\mathfrak{h}_0$, we have therefore, by perturbation theory, proved that (uniformly for $\delta$ varying in bounded sets)
\[ e_{\delta,B} = \Theta_0 + \lambda_1 B^{-\frac{1}{2}} + \lambda_2 B^{-1} + O(B^{-\frac{3}{2}}). \tag{C.16} \]
It remains to calculate $\lambda_1$, $\lambda_2$ and, in particular, deduce their dependence on $\delta$. From the known moments, $M_n = \int_0^{\infty} (\tau + \xi_0)^n |u_0(\tau)|^2 \, d\tau$, it is an elementary exercise (which can for instance be found in [FoHe2, Sect. 2]) to calculate $\lambda_1$. The result is
\[ \lambda_1 = -C_1. \tag{C.17} \]
In particular, $\lambda_1$ is independent of $\delta$ (this can be verified at a glance by using that the first moment, $\int (\tau + \xi_0) u_0^2 \, d\tau$, vanishes).

It is much harder to calculate $\lambda_2$ explicitly. However, notice (by writing $\lambda_{2,2} = -\langle u_0 \mid (\mathfrak{h}_1 - \lambda_1) R_0 (\mathfrak{h}_1 - \lambda_1) u_0 \rangle$) that $\lambda_2(\delta)$ is a quadratic polynomial as a function of $\delta$. We find the coefficient of $\delta^2$ to be $1 - 4 I_2$, with
\[ I_2 := \langle u_0 \mid (\tau + \xi_0) R_0 (\tau + \xi_0) u_0 \rangle. \tag{C.18} \]
But this integral was calculated in [FoHe2, Prop. A.3] (we have kept the notation $I_2$ from that paper) as recalled in (C.4). Notice that $3 C_1 \sqrt{\Theta_0} > 0$. Since $\lambda_2$ is quadratic in $\delta$, there exist therefore $\delta_0, C_0 \in \mathbb{R}$ such that
\[ \lambda_2 = 3 C_1 \sqrt{\Theta_0}\, (\delta - \delta_0)^2 + C_0. \]
Remembering (C.5) and (C.7) this finishes the proof of Theorem 2.5.

Acknowledgements. We would like to thank R. Frank for his remarks on a preliminary version which motivated us to improve our previous results concerning equality between local fields and global fields.


References

[Ag] Agmon, S.: Lectures on exponential decay of solutions of second order elliptic equations. Math. Notes, T. 29, Princeton, NJ: Princeton University Press, 1982
[BaPhTa] Bauman, P., Phillips, D., Tang, Q.: Stable nucleation for the Ginzburg-Landau system with an applied magnetic field. Arch. Rat. Mech. Anal. 142, 1–43 (1998)
[BeSt] Bernoff, A., Sternberg, P.: Onset of superconductivity in decreasing fields for general domains. J. Math. Phys. 39, 1272–1284 (1998)
[BoHe] Bolley, C., Helffer, B.: An application of semi-classical analysis to the asymptotic study of the supercooling field of a superconducting material. Ann. Inst. H. Poincaré (Section Physique Théorique) 58(2), 169–233 (1993)
[Bon1] Bonnaillie, V.: Analyse mathématique de la supraconductivité dans un domaine à coins : méthodes semi-classiques et numériques. Thèse de Doctorat, Université Paris 11, 2003
[Bon2] Bonnaillie, V.: On the fundamental state for a Schrödinger operator with magnetic fields in domains with corners. Asymptotic Anal. 41(3–4), 215–258 (2005)
[BonDa] Bonnaillie-Noël, V., Dauge, M.: Asymptotics for the low-lying eigenstates of the Schrödinger operator with magnetic field near a corner. Preprint University of Rennes, 2005
[DaHe] Dauge, M., Helffer, B.: Eigenvalues variation I, Neumann problem for Sturm-Liouville operators. J. Differ. Eqs. 104(2), 243–262 (1993)
[DGP] Du, X., Gunzburger, M.D., Peterson, J.S.: Analysis and approximation of the Ginzburg-Landau model of superconductivity. SIAM Rev. 34(1), 54–81 (1992)
[FoHe1] Fournais, S., Helffer, B.: Energy asymptotics for type II superconductors. Calc. Var. and PDE 24, no. 3, 341–376 (2005)
[FoHe2] Fournais, S., Helffer, B.: Accurate eigenvalue asymptotics for the magnetic Neumann Laplacian. Preprint 2004. To appear in Annales de l’Institut Fourier.
[GiPh] Giorgi, T., Phillips, D.: The breakdown of superconductivity due to strong fields for the Ginzburg-Landau model. SIAM J. Math. Anal. 30, no. 2, 341–359 (1999) (electronic)
[Hel] Helffer, B.: Introduction to the semiclassical analysis for the Schrödinger operator and applications. Springer Lecture Notes in Math. 1336, Berlin Heidelberg New York: Springer-Verlag, 1988
[HeMo1] Helffer, B., Mohamed, A.: Semiclassical analysis for the ground state energy of a Schrödinger operator with magnetic wells. J. Funct. Anal. 138(1), 40–81 (1996)
[HeMo2] Helffer, B., Morame, A.: Magnetic bottles in connection with superconductivity. J. Funct. Anal. 185(2), 604–680 (2001)
[HePa] Helffer, B., Pan, X.: Upper critical field and location of surface nucleation of superconductivity. Ann. Inst. H. Poincaré (Section Analyse non linéaire) 20(1), 145–181 (2003)
[HeSj1] Helffer, B., Sjöstrand, J.: Multiple wells in the semiclassical limit I. Comm. Partial Differ. Eqs. 9(4), 337–408 (1984)
[Kato] Kato, T.: Perturbation theory for linear operators. Berlin: Springer-Verlag, 1976
[LuPa1] Lu, K., Pan, X.-B.: Estimates of the upper critical field for the Ginzburg-Landau equations of superconductivity. Physica D 127, 73–104 (1999)
[LuPa2] Lu, K., Pan, X.-B.: Eigenvalue problems of Ginzburg-Landau operator in bounded domains. J. Math. Phys. 40(6), 2647–2670 (1999)
[LuPa3] Lu, K., Pan, X.-B.: Gauge invariant eigenvalue problems on R^2 and R^2_+. Trans. Amer. Math. Soc. 352(3), 1247–1276 (2000)
[LuPa4] Lu, K., Pan, X.-B.: Surface nucleation of superconductivity in 3-dimension. J. Differ. Eqs. 168(2), 386–452 (2000)
[Pan] Pan, X.-B.: Surface superconductivity in applied magnetic fields above H_{C_3}. Commun. Math. Phys. 228, 327–370 (2002)
[PiFeSt] del Pino, M., Felmer, P.L., Sternberg, P.: Boundary concentration for eigenvalue problems related to the onset of superconductivity. Commun. Math. Phys. 210, 413–446 (2000)
[ReSi] Reed, M., Simon, B.: Methods of modern Mathematical Physics, IV: Analysis of operators. New York: Academic Press, 1978
[S-JSaTh] Saint-James, D., Sarma, G., Thomas, E.J.: Type II Superconductivity. Oxford: Pergamon, 1969
[Si] Simon, B.: Semi-classical analysis of low lying eigenvalues I. Ann. Inst. H. Poincaré (Section Physique Théorique) 38(4), 295–307 (1983)
[St] Sternberg, P.: On the Normal/Superconducting Phase Transition in the Presence of Large Magnetic Fields. In: J. Berger, J. Rubinstein, eds., Connectivity and Superconductivity, Lect. Notes in Physics 63, Berlin Heidelberg New York: Springer-Verlag, 1999, pp. 188–199
[TiTi] Tilley, D.R., Tilley, J.: Superfluidity and superconductivity. 3rd edition. Bristol-Philadelphia: Institute of Physics Publishing, 1990
[Ti] Tinkham, M.: Introduction to Superconductivity. New York: McGraw-Hill Inc., 1975

Communicated by B. Simon

Commun. Math. Phys. 266, 197–210 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0018-0

Communications in

Mathematical Physics

On the Conserved Quantities for the Weak Solutions of the Euler Equations and the Quasi-geostrophic Equations

Dongho Chae

Department of Mathematics, Sungkyunkwan University, Suwon 440-746, Korea. E-mail: [email protected]

Received: 2 August 2005 / Accepted: 28 October 2005
Published online: 22 April 2006 – © Springer-Verlag 2006

Abstract: In this paper we obtain sufficient conditions on the regularity of the weak solutions to guarantee conservation of the energy and the helicity for the incompressible Euler equations. The regularity of the weak solutions is measured in terms of the Triebel-Lizorkin type of norms, $\dot{F}^s_{p,q}$, and the Besov norms, $\dot{B}^s_{p,q}$. In particular, in the Besov space case, our results refine the previous ones due to Constantin-E-Titi (energy) and the author of this paper (helicity), where the regularity is measured by a special class of the Besov space norm $\dot{B}^s_{p,\infty} = \dot{N}^s_p$, which is the Nikolskii space. We also obtain a sufficient regularity condition for the conservation of the $L^p$-norm of the temperature function in the weak solutions of the quasi-geostrophic equation.

1. Introduction and the Main Results

The Euler equations for the homogeneous incompressible fluid flows in $\mathbb{R}^n$, $n = 2, 3$, are
\[ \text{(E)} \quad \begin{cases} \dfrac{\partial v}{\partial t} + (v \cdot \nabla) v = -\nabla p, & (x, t) \in \mathbb{R}^n \times (0, T), \\ \operatorname{div} v = 0, \\ v(x, 0) = v_0(x), \end{cases} \]
where $v = (v_1, \cdots, v_n)$, $v_j = v_j(x, t)$, $j = 1, \cdots, n$, is the velocity of the fluid flows, $p = p(x, t)$ is the scalar pressure, and $v_0$ is the given initial velocity satisfying $\operatorname{div} v_0 = 0$. It is well-known that for smooth solutions of the Euler equations the energy $E(t) = \frac{1}{2} \int_{\mathbb{R}^n} |v(x, t)|^2 \, dx$ is preserved in time. For nonsmooth (weak) solutions it is not at all obvious that we still have energy conservation. Thus, there comes the very interesting question of how much smoothness we need to assume for the solution to have the energy conservation property. Regarding this question L. Onsager conjectured that a Hölder continuous weak solution with the Hölder exponent 1/3 preserves the energy,


and this is sharp. Considering Kolmogorov's scaling argument on the energy correlation in the homogeneous turbulence the exponent 1/3 is natural. The sufficiency part of this conjecture is proved in a positive direction by a simple but very elegant argument by Constantin-E-Titi [5], using the Besov space norm, $\dot{B}^s_{3,\infty}$ with $s > 1/3$ (see below for precise definitions of the function spaces) for the velocity. Remarkably enough Shnirelman [13] later constructed an example of a weak solution of the 3D Euler equations, which does not preserve energy. The problem of finding the optimal regularity condition for a weak solution to have a conservation property can also be considered for the helicity, which is defined by $H(t) = \int_{\mathbb{R}^n} v(x, t) \cdot \omega(x, t) \, dx$, where $\omega = \operatorname{curl} v$ is the vorticity. In particular, the helicity is closely related to the topological invariants, e.g. the knottedness of vortex tubes (see [1] for the details and other significance of the helicity conservation). Thus, in [2] the author of this paper obtained a sufficient regularity condition for the helicity conservation, using the function space $\dot{B}^s_{\frac{9}{5},\infty}$, $s > 1/3$, for the vorticity. One of the purposes of this paper is to refine those results, using the Triebel-Lizorkin type of spaces, $\dot{F}^s_{p,q}$, and the Besov spaces $\dot{B}^s_{p,q}$ with similar values for $s$, $p$, but allowing the full range of values for $q \in [1, \infty]$ (for a precise statement of the results see the theorems below). When we restrict $q = \infty$, our Besov space results for the Euler equations reduce to the previous ones described above. On the other hand, our results for the Triebel-Lizorkin type of space are completely new. We also extend our arguments to consider the $L^p$-norm conservation in the weak solutions of the 2D quasi-geostrophic equations.

By a weak solution of (E) in $\mathbb{R}^n \times (0, T)$ with initial data $v_0$ we mean a vector field $v \in C([0, T); L^2_{\mathrm{loc}}(\mathbb{R}^n))$ satisfying the integral identity:
\[ -\int_0^T \!\! \int_{\mathbb{R}^n} v(x, t) \cdot \frac{\partial \phi(x, t)}{\partial t} \, dx\, dt - \int_{\mathbb{R}^n} v_0(x) \cdot \phi(x, 0) \, dx - \int_0^T \!\! \int_{\mathbb{R}^n} v(x, t) \otimes v(x, t) : \nabla \phi(x, t) \, dx\, dt - \int_0^T \!\! \int_{\mathbb{R}^n} \operatorname{div} \phi(x, t)\, p(x, t) \, dx\, dt = 0, \tag{1.1} \]
\[ \int_0^T \!\! \int_{\mathbb{R}^n} v(x, t) \cdot \nabla \psi(x, t) \, dx\, dt = 0 \tag{1.2} \]
for every vector test function $\phi = (\phi_1, \ldots, \phi_n) \in C_0^{\infty}(\mathbb{R}^n \times [0, T))$, and for every scalar test function $\psi \in C_0^{\infty}(\mathbb{R}^n \times [0, T))$. Here we used the notation $(u \otimes v)_{ij} = u_i v_j$, and $A : B = \sum_{i,j=1}^{n} A_{ij} B_{ij}$ for $n \times n$ matrices $A$ and $B$. In the case when we discuss the helicity conservation of the weak solution we impose further regularity for the vorticity, $\omega(\cdot, t) \in L^{\frac{3}{2}}(\mathbb{R}^3)$ for almost every $t \in [0, T]$, in order to define the helicity for such a weak solution.

In order to state our main theorems we introduce function spaces. Given $0 < s < 1$, $1 \le p \le \infty$, $1 \le q \le \infty$, the function space $\dot{F}^s_{p,q}$ is defined by the seminorm,
\[ \|f\|_{\dot{F}^s_{p,q}} = \begin{cases} \left\| \left( \displaystyle\int_{\mathbb{R}^n} \frac{|f(x) - f(x - y)|^q}{|y|^{n + sq}} \, dy \right)^{\frac{1}{q}} \right\|_{L^p(\mathbb{R}^n, dx)} & \text{if } 1 \le p \le \infty,\ 1 \le q < \infty, \\[2ex] \left\| \operatorname*{ess\,sup}_{|y| > 0} \dfrac{|f(x) - f(x - y)|}{|y|^s} \right\|_{L^p(\mathbb{R}^n, dx)} & \text{if } 1 \le p \le \infty,\ q = \infty. \end{cases} \]


On the other hand, the space $\dot{B}^s_{p,q}$ is defined by the seminorm,
\[ \|f\|_{\dot{B}^s_{p,q}} = \begin{cases} \left( \displaystyle\int_{\mathbb{R}^n} \frac{\|f(\cdot) - f(\cdot - y)\|_{L^p}^q}{|y|^{n + sq}} \, dy \right)^{\frac{1}{q}} & \text{if } 1 \le p \le \infty,\ 1 \le q < \infty, \\[2ex] \operatorname*{ess\,sup}_{|y| > 0} \dfrac{\|f(\cdot) - f(\cdot - y)\|_{L^p}}{|y|^s} & \text{if } 1 \le p \le \infty,\ q = \infty. \end{cases} \]
Observe that, in particular, $\dot{F}^s_{\infty,\infty} = \dot{B}^s_{\infty,\infty} = C^s$, which is the usual Hölder seminormed space. We also note that when $q = \infty$ we have the equivalence, $\dot{B}^s_{p,\infty} = \dot{N}^s_p$, which is the Nikolskii space, used in [5] and [2].

In order to compare these spaces with other more classical function spaces let us introduce the Banach spaces $F^s_{p,q}$, $B^s_{p,q}$ by defining their norms,
\[ \|f\|_{F^s_{p,q}} = \|f\|_{L^p} + \|f\|_{\dot{F}^s_{p,q}}, \qquad \|f\|_{B^s_{p,q}} = \|f\|_{L^p} + \|f\|_{\dot{B}^s_{p,q}}, \]
respectively. We note that for $0 < s < 1$, $2 \le p < \infty$, $q = 2$, $F^s_{p,2} \cong L^p_s(\mathbb{R}^n) = (1 - \Delta)^{-\frac{s}{2}} L^p(\mathbb{R}^n)$, the fractional order Sobolev space (or the Bessel potential space) (see p. 163, [14]). If $\frac{n}{\min\{p, q\}} < s < 1$, $n < p < \infty$ and $n < q \le \infty$, then $F^s_{p,q}$ coincides with the Triebel-Lizorkin space $F^s_{p,q}(\mathbb{R}^n)$ defined by the Littlewood-Paley decomposition (see p. 101, [15]). On the other hand, for a wider range of parameters, $0 < s < 1$, $0 < p \le \infty$, $0 < q \le \infty$, $B^s_{p,q}$ coincides with the Besov space $B^s_{p,q}(\mathbb{R}^n)$ (see p. 110, [15]). We also note the equivalence,
\[ \dot{F}^s_{p,p} = \dot{B}^s_{p,p}, \qquad F^s_{p,p} = B^s_{p,p} \]
for $1 \le p \le \infty$. Hereafter, we use the notation $\dot{X}^s_{p,q}$ (resp. $X^s_{p,q}$) to represent $\dot{F}^s_{p,q}$ (resp. $F^s_{p,q}$) or $\dot{B}^s_{p,q}$ (resp. $B^s_{p,q}$).

Theorem 1.1. Let $s > \frac13$ and $q \in [2,\infty]$ be given. Suppose $v$ is a weak solution of the $n$-dimensional Euler equations with $v \in C([0,T]; L^2(\mathbb{R}^n)) \cap L^3(0,T; \dot X^s_{3,q}(\mathbb{R}^n))$. Then, the energy is preserved in time, namely
\[
\int_{\mathbb{R}^n} |v(x,t)|^2\,dx = \int_{\mathbb{R}^n} |v_0(x)|^2\,dx \tag{1.3}
\]
for all $t \in [0,T)$.

Theorem 1.2. Let $s > \frac13$, $q \in [2,\infty]$, and $r_1 \in [2,\infty]$, $r_2 \in [1,\infty]$ be given, satisfying $2/r_1 + 1/r_2 = 1$. Suppose $v$ is a weak solution of the 3-D Euler equations with $v \in C([0,T]; L^2(\mathbb{R}^3)) \cap L^{r_1}(0,T; \dot X^s_{\frac92,q}(\mathbb{R}^3))$ and $\omega \in L^{r_2}(0,T; \dot X^s_{\frac95,q}(\mathbb{R}^3))$, where the curl operation is in the sense of distribution. Then, the helicity is preserved in time, namely
\[
\int_{\mathbb{R}^3} v(x,t) \cdot \omega(x,t)\,dx = \int_{\mathbb{R}^3} v_0(x) \cdot \omega_0(x)\,dx \tag{1.4}
\]
for all $t \in [0,T)$.

Similarly to [2], as an application of the above theorem we have the following estimate from below of the vorticity by a constant depending on the initial data for the weak solutions of the 3-D Euler equations.


D. Chae

Corollary 1.1. Suppose $v$ is a weak solution of the 3-D Euler equations satisfying the conditions of Theorem 1.2. Then, we have the following estimate:
\[
\|\omega(\cdot,t)\|^2_{L^{3/2}} \ge C H_0, \qquad \forall t \in [0,T), \tag{1.5}
\]
where $H_0 = \int_{\mathbb{R}^3} v_0(x) \cdot \omega_0(x)\,dx$ is the initial helicity, and $C$ is an absolute constant.

Next we are concerned with the $L^p$-norm conservation for the weak solutions of the 2D quasi-geostrophic equation,
\[
(QG) \qquad
\begin{cases}
\dfrac{\partial \theta}{\partial t} + (v \cdot \nabla)\theta = 0,\\[2mm]
v(x,t) = -\nabla^\perp \displaystyle\int_{\mathbb{R}^2} \frac{\theta(y,t)}{|x-y|}\,dy,\\[2mm]
\theta(x,0) = \theta_0(x),
\end{cases}
\]
where $\theta(x,t)$ is a scalar function representing the temperature, $v(x,t)$ is the velocity field of the fluid, and $\nabla^\perp = (-\partial_{x_2}, \partial_{x_1})$. The system (QG) has been of intensive interest recently (see e.g. [4, 6, 16, 7, 8, 3], and references therein), since the equation has a very similar structure to the 3-D Euler equations, and also has direct connections to physical phenomena in atmospheric science. Let $p \in [2,\infty)$. By a weak solution of (QG) in $\mathbb{R}^2 \times (0,T)$ with initial data $\theta_0$ we mean a scalar field $\theta \in C([0,T); L^p(\mathbb{R}^2) \cap L^{\frac{p}{p-1}}(\mathbb{R}^2))$ satisfying the integral identity:
\[
-\int_0^T \int_{\mathbb{R}^2} \theta(x,t) \left( \frac{\partial}{\partial t} + v \cdot \nabla \right) \phi(x,t)\,dx\,dt - \int_{\mathbb{R}^2} \theta_0(x)\phi(x,0)\,dx = 0, \tag{1.6}
\]
\[
v(x,t) = -\nabla^\perp \int_{\mathbb{R}^2} \frac{\theta(y,t)}{|x-y|}\,dy \tag{1.7}
\]

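for every test function $\phi \in C_0^\infty(\mathbb{R}^2 \times [0,T))$, where $\nabla^\perp$ in (1.7) is in the sense of distribution. We note that, contrary to the case of the 3-D Euler equations, there is a global existence result for the weak solutions of (QG) for $p = 2$ ([12]).

Theorem 1.3. Let $s > \frac13$, $p \in [2,\infty)$, $q \in [1,\infty]$, and $r_1 \in [p,\infty]$, $r_2 \in [1,\infty]$ be given, satisfying $p/r_1 + 1/r_2 = 1$. Suppose $\theta$ is a weak solution of (QG) with $\theta \in C([0,T]; L^p(\mathbb{R}^2) \cap L^{\frac{p}{p-1}}(\mathbb{R}^2)) \cap L^{r_1}(0,T; X^s_{p+1,q}(\mathbb{R}^2))$ and $v \in L^{r_2}(0,T; \dot X^s_{p+1,q}(\mathbb{R}^2))$. Then, the $L^p$ norm of $\theta(\cdot,t)$ is preserved,
\[
\|\theta(t)\|_{L^p} = \|\theta_0\|_{L^p} \tag{1.8}
\]
for all $t \in [0,T]$.

2. Proof of the Main Theorems

Let $\varphi(x) \in C_0^\infty(\mathbb{R}^n)$ be the standard mollifier with $\varphi \ge 0$, $\operatorname{supp} \varphi \subset \{x \in \mathbb{R}^n \mid |x| \le 1\}$. Let $\varphi^\varepsilon(x) = \frac{1}{\varepsilon^n}\varphi(\frac{x}{\varepsilon})$. Given $f \in L^1_{loc}(\mathbb{R}^n)$, we denote $f^\varepsilon(x) = (f * \varphi^\varepsilon)(x)$.

Lemma 2.1. Let $k \in \mathbb{N}$, $s \in (0,1)$ and $p, q \in [1,\infty]$. Then, there exist constants $C$ depending on $k, s, q, n$ such that the following inequalities hold:
\[
\|D^k f^\varepsilon\|_{L^p} \le C\varepsilon^{s-k}\|f\|_{\dot X^s_{p,q}}, \tag{2.1}
\]
\[
\|f - f^\varepsilon\|_{L^p} \le C\varepsilon^{s}\|f\|_{\dot X^s_{p,q}}. \tag{2.2}
\]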


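Proof. By integration by parts we deduce
\[
\begin{aligned}
D^k f^\varepsilon(x) &= \int_{\mathbb{R}^n} D_x^k f(x-y)\varphi^\varepsilon(y)\,dy = (-1)^k \int_{\mathbb{R}^n} D_y^k f(x-y)\varphi^\varepsilon(y)\,dy\\
&= \int_{\mathbb{R}^n} f(x-y) D_y^k \varphi^\varepsilon(y)\,dy = \frac{1}{\varepsilon^{n+k}} \int_{\mathbb{R}^n} f(x-y)(D^k\varphi)\Big(\frac{y}{\varepsilon}\Big)\,dy\\
&= \frac{1}{\varepsilon^{n+k}} \int_{\mathbb{R}^n} [f(x-y) - f(x)](D^k\varphi)\Big(\frac{y}{\varepsilon}\Big)\,dy,
\end{aligned}
\]
where we used the fact
\[
\int_{\mathbb{R}^n} f(x)(D^k\varphi)\Big(\frac{y}{\varepsilon}\Big)\,dy = f(x)\int_{\mathbb{R}^n} (D^k\varphi)\Big(\frac{y}{\varepsilon}\Big)\,dy = 0.
\]
Hence,
\[
\begin{aligned}
|D^k f^\varepsilon(x)| &\le \frac{1}{\varepsilon^{n+k}} \int_{\mathbb{R}^n} |f(x-y) - f(x)| \Big|(D^k\varphi)\Big(\frac{y}{\varepsilon}\Big)\Big|\,dy\\
&\le \frac{1}{\varepsilon^{n+k}} \left( \int_{\mathbb{R}^n} \frac{|f(x-y)-f(x)|^q}{|y|^{n+sq}}\,dy \right)^{\frac1q} \left( \int_{\mathbb{R}^n} \Big|D^k\varphi\Big(\frac{y}{\varepsilon}\Big)\Big|^{q'} |y|^{(\frac nq + s)q'}\,dy \right)^{\frac{1}{q'}}\\
&= \varepsilon^{s-k} \left( \int_{\mathbb{R}^n} \frac{|f(x-y)-f(x)|^q}{|y|^{n+sq}}\,dy \right)^{\frac1q} \left( \int_{\mathbb{R}^n} |D^k\varphi(y)|^{q'} |y|^{(\frac nq + s)q'}\,dy \right)^{\frac{1}{q'}}\\
&= C\varepsilon^{s-k} \left( \int_{\mathbb{R}^n} \frac{|f(x-y)-f(x)|^q}{|y|^{n+sq}}\,dy \right)^{\frac1q},
\end{aligned} \tag{2.3}
\]
where $1/q + 1/q' = 1$. Taking the $L^p(dx)$ norm of (2.3), we obtain (2.1) with $\dot X^s_{p,q} = \dot F^s_{p,q}$. In order to have the corresponding inequality for the norm of $\dot B^s_{p,q}$, we use the Minkowski inequality and the Hölder inequality to estimate
\[
\begin{aligned}
\|D^k f^\varepsilon\|_{L^p} &\le \frac{1}{\varepsilon^{n+k}} \left( \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^n} |f(x-y) - f(x)| \Big|(D^k\varphi)\Big(\frac{y}{\varepsilon}\Big)\Big|\,dy \right)^p dx \right)^{\frac1p}\\
&\le \frac{1}{\varepsilon^{n+k}} \int_{\mathbb{R}^n} \|f(\cdot-y)-f(\cdot)\|_{L^p} \Big|D^k\varphi\Big(\frac{y}{\varepsilon}\Big)\Big|\,dy\\
&\le \frac{1}{\varepsilon^{n+k}} \left( \int_{\mathbb{R}^n} \frac{\|f(\cdot-y)-f(\cdot)\|^q_{L^p}}{|y|^{n+sq}}\,dy \right)^{\frac1q} \left( \int_{\mathbb{R}^n} \Big|D^k\varphi\Big(\frac{y}{\varepsilon}\Big)\Big|^{q'} |y|^{(\frac nq + s)q'}\,dy \right)^{\frac{1}{q'}}\\
&= C\varepsilon^{s-k}\|f\|_{\dot B^s_{p,q}},
\end{aligned} \tag{2.4}
\]
where $1/q + 1/q' = 1$. Next, we prove (2.2).
\[
\begin{aligned}
|f(x) - f^\varepsilon(x)| &= \left| \int_{\mathbb{R}^n} [f(x) - f(x-y)]\varphi^\varepsilon(y)\,dy \right| \le \int_{\mathbb{R}^n} |f(x) - f(x-y)|\varphi^\varepsilon(y)\,dy\\
&\le \left( \int_{\mathbb{R}^n} \frac{|f(x)-f(x-y)|^q}{|y|^{n+sq}}\,dy \right)^{\frac1q} \left( \int_{\mathbb{R}^n} |\varphi^\varepsilon(y)|^{q'} |y|^{(\frac nq + s)q'}\,dy \right)^{\frac{1}{q'}}\\
&= C\varepsilon^{s} \left( \int_{\mathbb{R}^n} \frac{|f(x)-f(x-y)|^q}{|y|^{n+sq}}\,dy \right)^{\frac1q},
\end{aligned} \tag{2.5}
\]
where $1/q + 1/q' = 1$. Taking the $L^p(dx)$ norm of (2.5), we obtain (2.2) with $\dot X^s_{p,q} = \dot F^s_{p,q}$. On the other hand, using the Minkowski and the Hölder inequalities again, we have
\[
\begin{aligned}
\|f - f^\varepsilon\|_{L^p} &\le \left( \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^n} |f(x) - f(x-y)|\varphi^\varepsilon(y)\,dy \right)^p dx \right)^{\frac1p}\\
&\le \int_{\mathbb{R}^n} \|f(\cdot) - f(\cdot-y)\|_{L^p}\varphi^\varepsilon(y)\,dy\\
&\le \left( \int_{\mathbb{R}^n} \frac{\|f(\cdot)-f(\cdot-y)\|^q_{L^p}}{|y|^{n+sq}}\,dy \right)^{\frac1q} \left( \int_{\mathbb{R}^n} |\varphi^\varepsilon(y)|^{q'} |y|^{(\frac nq + s)q'}\,dy \right)^{\frac{1}{q'}}\\
&= C\varepsilon^{s}\|f\|_{\dot B^s_{p,q}}.
\end{aligned} \tag{2.6}
\]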
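The estimate (2.2) can be illustrated numerically in its simplest case. The following is a minimal sketch, assuming $n = 1$, $p = q = \infty$ and the test function $f(x) = |x|^s$, whose Hölder seminorm equals 1; the helper names `bump`, `mollify` and `sup_error` are hypothetical and not from the paper. Because the discrete kernel is normalized to unit mass, the bound $|f - f^\varepsilon| \le \varepsilon^s$ holds with constant 1 in this setting.

```python
import math

def bump(z):
    """Unnormalized standard mollifier profile, supported in |z| < 1."""
    return math.exp(-1.0 / (1.0 - z * z)) if abs(z) < 1.0 else 0.0

def mollify(f, x, eps, m=400):
    """Approximate f^eps(x) = int f(x - y) phi^eps(y) dy by the midpoint rule."""
    h = 2.0 * eps / m
    num = 0.0
    den = 0.0
    for i in range(m):
        y = -eps + (i + 0.5) * h
        w = bump(y / eps)
        num += f(x - y) * w
        den += w
    # Dividing by den normalizes the discrete kernel to total mass 1.
    return num / den

def sup_error(f, eps, xs):
    """Largest value of |f - f^eps| over the sample points xs."""
    return max(abs(f(x) - mollify(f, x, eps)) for x in xs)

s = 0.5                                      # Hoelder exponent (assumed)
f = lambda x: abs(x) ** s                    # |f(x) - f(y)| <= |x - y|**s
xs = [i / 50.0 - 1.0 for i in range(101)]    # sample points in [-1, 1]
errors = {eps: sup_error(f, eps, xs) for eps in (0.2, 0.1, 0.05)}

for eps, err in errors.items():
    print(f"eps = {eps:5.2f}   sup|f - f^eps| = {err:.4f}   eps**s = {eps ** s:.4f}")
```

The printed errors stay below $\varepsilon^s$ for each sampled $\varepsilon$, in line with (2.2).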

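Proof of Theorem 1.1. We note the identity,
\[
(u \otimes v)^\varepsilon = u^\varepsilon \otimes v^\varepsilon + \int_{\mathbb{R}^n} \varphi^\varepsilon(y)\,(u(x-y) - u(x)) \otimes (v(x-y) - v(x))\,dy - (u - u^\varepsilon) \otimes (v - v^\varepsilon) \tag{2.7}
\]
for all $u, v \in L^2_{loc}(\mathbb{R}^n)$, which was first observed in [5]. Suppose $v(x,t)$ is a weak solution of (E). Let $\xi(t) \in C_0^\infty([0,T))$. Given $y \in \mathbb{R}^n$, choosing the test functions $\phi(x,t) = \xi(t)(\varphi^\varepsilon(x-y), 0, 0)$, $\xi(t)(0, \varphi^\varepsilon(x-y), 0)$ and $\xi(t)(0, 0, \varphi^\varepsilon(x-y))$ in (1.1), we obtain each component of
\[
\frac{\partial v^\varepsilon}{\partial t} + \operatorname{div}\,(v \otimes v)^\varepsilon = -\nabla p^\varepsilon, \tag{2.8}
\]
and choosing $\psi(x,t) = \xi(t)\varphi^\varepsilon(x-y)$ in (1.2), we derive $\operatorname{div} v^\varepsilon = 0$. We take the $L^2(\mathbb{R}^n)$ inner product of (2.8) with $v^\varepsilon$. Then, integrating by parts, and using the identity (2.7), we obtain
\[
\begin{aligned}
\frac12 \frac{d}{dt} \int_{\mathbb{R}^n} |v^\varepsilon|^2\,dx &= \int_{\mathbb{R}^n} (v \otimes v)^\varepsilon : \nabla v^\varepsilon\,dx - \int_{\mathbb{R}^n} \nabla p^\varepsilon \cdot v^\varepsilon\,dx\\
&= \int_{\mathbb{R}^n} \left[ \int_{\mathbb{R}^n} \varphi^\varepsilon(y)(v(x-y) - v(x)) \otimes (v(x-y) - v(x))\,dy \right] : \nabla v^\varepsilon(x)\,dx\\
&\quad - \int_{\mathbb{R}^n} [(v - v^\varepsilon) \otimes (v - v^\varepsilon)] : \nabla v^\varepsilon\,dx\\
&:= I + II,
\end{aligned} \tag{2.9}
\]
where we used the facts
\[
\int_{\mathbb{R}^n} \nabla p^\varepsilon \cdot v^\varepsilon\,dx = -\int_{\mathbb{R}^n} p^\varepsilon \operatorname{div} v^\varepsilon\,dx = 0,
\]
and
\[
\int_{\mathbb{R}^n} v^\varepsilon \otimes v^\varepsilon : \nabla v^\varepsilon\,dx = \sum_{i,j=1}^n \int_{\mathbb{R}^n} v_i^\varepsilon v_j^\varepsilon \frac{\partial v_j^\varepsilon}{\partial x_i}\,dx = \frac12 \int_{\mathbb{R}^n} (v^\varepsilon \cdot \nabla)|v^\varepsilon|^2\,dx = -\frac12 \int_{\mathbb{R}^n} (\operatorname{div} v^\varepsilon)|v^\varepsilon|^2\,dx = 0.
\]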
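The identity (2.7) is purely algebraic: it holds for any averaging kernel of unit mass, which makes it easy to check on a grid. The following is a minimal sketch for scalar $u$, $v$ on a periodic one-dimensional grid, with Gaussian weights standing in for $\varphi^\varepsilon$; the grid size, the weights and the function names `weights` and `check_identity` are assumptions made for illustration only.

```python
import math
import random

def weights(m):
    """Discrete probability weights standing in for phi^eps on m offsets."""
    raw = [math.exp(-((j - (m - 1) / 2.0) ** 2)) for j in range(m)]
    total = sum(raw)
    return [r / total for r in raw]

def check_identity(u, v, w):
    """Return the largest residual of identity (2.7) over a periodic grid."""
    n, m = len(u), len(w)
    half = m // 2
    worst = 0.0
    for x in range(n):
        # Grid indices of the points x - y for the offsets y = -half, ..., half.
        shifted = [(x - (j - half)) % n for j in range(m)]
        u_eps = sum(w[j] * u[k] for j, k in enumerate(shifted))
        v_eps = sum(w[j] * v[k] for j, k in enumerate(shifted))
        uv_eps = sum(w[j] * u[k] * v[k] for j, k in enumerate(shifted))
        # Discrete analogue of the integral term in (2.7).
        r_eps = sum(w[j] * (u[k] - u[x]) * (v[k] - v[x])
                    for j, k in enumerate(shifted))
        rhs = u_eps * v_eps + r_eps - (u[x] - u_eps) * (v[x] - v_eps)
        worst = max(worst, abs(uv_eps - rhs))
    return worst

random.seed(0)
u = [random.uniform(-1, 1) for _ in range(64)]
v = [random.uniform(-1, 1) for _ in range(64)]
residual = check_identity(u, v, weights(9))
print("largest residual of (2.7):", residual)
```

The residual is at the level of floating-point rounding even for random, non-smooth data, which reflects that (2.7) requires no regularity of $u$ and $v$.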

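We estimate $I$ and $II$ separately:
\[
\begin{aligned}
I &\le \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^n} |\varphi^\varepsilon(y)||v(x-y) - v(x)|^2\,dy \right) |\nabla v^\varepsilon(x)|\,dx\\
&\le \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^n} |\varphi^\varepsilon(y)|^{q'} |y|^{(\frac{2n}{q} + 2s)q'}\,dy \right)^{\frac{1}{q'}} \left( \int_{\mathbb{R}^n} \frac{|v(x-y) - v(x)|^q}{|y|^{n+sq}}\,dy \right)^{\frac2q} |\nabla v^\varepsilon(x)|\,dx\\
&\le C\varepsilon^{2s} \left( \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^n} \frac{|v(x-y) - v(x)|^q}{|y|^{n+sq}}\,dy \right)^{\frac3q} dx \right)^{\frac23} \|\nabla v^\varepsilon\|_{L^3}\\
&\le C\varepsilon^{3s-1}\|v\|^3_{\dot F^s_{3,q}},
\end{aligned} \tag{2.10}
\]
where $2/q + 1/q' = 1$, $q \in [2,\infty]$, and we used (2.1) in the last step. For the estimate in the $\dot B^s_{3,q}$ norm we first use the Fubini theorem, and then use the Hölder inequality to deduce
\[
\begin{aligned}
I &\le \int_{\mathbb{R}^n} |\varphi^\varepsilon(y)| \left( \int_{\mathbb{R}^n} |v(x-y) - v(x)|^2 |\nabla v^\varepsilon(x)|\,dx \right) dy\\
&\le \int_{\mathbb{R}^n} |\varphi^\varepsilon(y)|\,\|v(\cdot-y) - v(\cdot)\|^2_{L^3}\,\|\nabla v^\varepsilon\|_{L^3}\,dy\\
&\le \left( \int_{\mathbb{R}^n} |\varphi^\varepsilon(y)|^{q'} |y|^{(\frac{2n}{q} + 2s)q'}\,dy \right)^{\frac{1}{q'}} \left( \int_{\mathbb{R}^n} \frac{\|v(\cdot-y) - v(\cdot)\|^q_{L^3}}{|y|^{n+sq}}\,dy \right)^{\frac2q} \|\nabla v^\varepsilon\|_{L^3}\\
&\le C\varepsilon^{2s}\|v\|^2_{\dot B^s_{3,q}} \cdot \varepsilon^{s-1}\|v\|_{\dot B^s_{3,q}}\\
&\le C\varepsilon^{3s-1}\|v\|^3_{\dot B^s_{3,q}},
\end{aligned} \tag{2.11}
\]
where we used (2.1) again. We note that the estimate (2.10) has an obvious end point extension for $q = 2$ ($q' = \infty$) and $q = \infty$ ($q' = 1$), although we do not write down those estimates separately. The estimate of $II$ is simpler:
\[
II \le \int_{\mathbb{R}^n} |v(x) - v^\varepsilon(x)|^2 |\nabla v^\varepsilon(x)|\,dx \le \|v - v^\varepsilon\|^2_{L^3}\|\nabla v^\varepsilon\|_{L^3} \le C\varepsilon^{3s-1}\|v\|^3_{\dot X^s_{3,q}}, \tag{2.12}
\]
where we used (2.1) and (2.2) directly. Taking into account the estimates (2.10)–(2.12), and integrating (2.9) over $[0,t] \subset [0,T]$, we have
\[
\left| \|v^\varepsilon(t)\|^2_{L^2} - \|v_0^\varepsilon\|^2_{L^2} \right| \le C\varepsilon^{3s-1} \int_0^T \|v(\tau)\|^3_{\dot X^s_{3,q}}\,d\tau.
\]
For $s > \frac13$, passing $\varepsilon \to 0$, we have $\|v(t)\|_{L^2} = \|v_0\|_{L^2}$ if $v \in L^3(0,T; \dot X^s_{3,q}(\mathbb{R}^n))$.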
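The role of the threshold $s > \frac13$ in the last step is visible in the prefactor $\varepsilon^{3s-1}$ alone. The following small sketch, with the hypothetical helper `decay_factor` and sample values of $s$ chosen only for illustration, tabulates that prefactor on either side of the threshold.

```python
def decay_factor(eps, s):
    """The factor eps**(3*s - 1) multiplying the time integral in the energy bound."""
    return eps ** (3.0 * s - 1.0)

eps_values = [10.0 ** (-k) for k in range(1, 6)]
above = [decay_factor(e, 0.5) for e in eps_values]    # sample s above 1/3
below = [decay_factor(e, 0.25) for e in eps_values]   # sample s below 1/3

for e, a, b in zip(eps_values, above, below):
    print(f"eps = {e:.0e}   s = 0.50: {a:.3e}   s = 0.25: {b:.3e}")
```

For $s = 0.5$ the factor tends to zero with $\varepsilon$, so the bound forces energy conservation in the limit; for $s = 0.25$ it grows, and the estimate gives no information.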


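Proof of Theorem 1.2. Suppose $v(x,t)$ is a weak solution of (E), and $\omega = \operatorname{curl} v$ in the sense of distribution. Let $\xi(t) \in C_0^\infty([0,T))$. Given $y \in \mathbb{R}^3$, choosing the test functions $\phi(x,t) = \xi(t)\operatorname{curl}_x(\varphi^\varepsilon(x-y), 0, 0)$, $\xi(t)\operatorname{curl}_x(0, \varphi^\varepsilon(x-y), 0)$ and $\xi(t)\operatorname{curl}_x(0, 0, \varphi^\varepsilon(x-y))$ in (1.1), and integrating by parts, we obtain the three components of
\[
\frac{\partial \omega^\varepsilon}{\partial t} + \operatorname{div}\,(v \otimes \omega)^\varepsilon - \operatorname{div}\,(\omega \otimes v)^\varepsilon = 0. \tag{2.13}
\]
We compute
\[
\begin{aligned}
\frac{d}{dt} \int_{\mathbb{R}^3} v^\varepsilon \cdot \omega^\varepsilon\,dx &= \int_{\mathbb{R}^3} \frac{\partial v^\varepsilon}{\partial t} \cdot \omega^\varepsilon\,dx + \int_{\mathbb{R}^3} v^\varepsilon \cdot \frac{\partial \omega^\varepsilon}{\partial t}\,dx\\
&= -\int_{\mathbb{R}^3} \operatorname{div}\,(v \otimes v)^\varepsilon \cdot \omega^\varepsilon\,dx - \int_{\mathbb{R}^3} v^\varepsilon \cdot \operatorname{div}\,(v \otimes \omega)^\varepsilon\,dx + \int_{\mathbb{R}^3} v^\varepsilon \cdot \operatorname{div}\,(\omega \otimes v)^\varepsilon\,dx\\
&= I + II + III.
\end{aligned} \tag{2.14}
\]
Integrating by parts, and using the formula (2.7), we derive
\[
\begin{aligned}
I &= \int_{\mathbb{R}^3} (v \otimes v)^\varepsilon : \nabla \omega^\varepsilon\,dx\\
&= \int_{\mathbb{R}^3} v^\varepsilon \otimes v^\varepsilon : \nabla \omega^\varepsilon\,dx + \int_{\mathbb{R}^3} r_\varepsilon(v,v) : \nabla \omega^\varepsilon\,dx - \int_{\mathbb{R}^3} (v - v^\varepsilon) \otimes (v - v^\varepsilon) : \nabla \omega^\varepsilon\,dx,
\end{aligned}
\]
where we set
\[
r_\varepsilon(u,v) = \int_{\mathbb{R}^3} \varphi^\varepsilon(y)\,(u(x-y) - u(x)) \otimes (v(x-y) - v(x))\,dy.
\]
Similarly,
\[
\begin{aligned}
II &= \int_{\mathbb{R}^3} (v \otimes \omega)^\varepsilon : \nabla v^\varepsilon\,dx\\
&= \int_{\mathbb{R}^3} v^\varepsilon \otimes \omega^\varepsilon : \nabla v^\varepsilon\,dx + \int_{\mathbb{R}^3} r_\varepsilon(v,\omega) : \nabla v^\varepsilon\,dx - \int_{\mathbb{R}^3} (v - v^\varepsilon) \otimes (\omega - \omega^\varepsilon) : \nabla v^\varepsilon\,dx,\\
III &= -\int_{\mathbb{R}^3} (\omega \otimes v)^\varepsilon : \nabla v^\varepsilon\,dx\\
&= -\int_{\mathbb{R}^3} \omega^\varepsilon \otimes v^\varepsilon : \nabla v^\varepsilon\,dx - \int_{\mathbb{R}^3} r_\varepsilon(\omega,v) : \nabla v^\varepsilon\,dx + \int_{\mathbb{R}^3} (\omega - \omega^\varepsilon) \otimes (v - v^\varepsilon) : \nabla v^\varepsilon\,dx
\end{aligned}
\]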


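respectively. Since $\operatorname{div} v^\varepsilon = 0$, we have by integration by parts,
\[
\begin{aligned}
\int_{\mathbb{R}^3} v^\varepsilon \otimes v^\varepsilon : \nabla \omega^\varepsilon\,dx &= \sum_{i,j=1}^3 \int_{\mathbb{R}^3} v_i^\varepsilon v_j^\varepsilon \frac{\partial \omega_j^\varepsilon}{\partial x_i}\,dx\\
&= -\sum_{i,j=1}^3 \int_{\mathbb{R}^3} v_i^\varepsilon \frac{\partial v_j^\varepsilon}{\partial x_i}\,\omega_j^\varepsilon\,dx\\
&= -\int_{\mathbb{R}^3} v^\varepsilon \otimes \omega^\varepsilon : \nabla v^\varepsilon\,dx.
\end{aligned} \tag{2.15}
\]
Also, using the fact $\operatorname{div} \omega^\varepsilon = 0$, we have by integration by parts,
\[
\begin{aligned}
\int_{\mathbb{R}^3} \omega^\varepsilon \otimes v^\varepsilon : \nabla v^\varepsilon\,dx &= \sum_{i,j=1}^3 \int_{\mathbb{R}^3} \omega_i^\varepsilon v_j^\varepsilon \frac{\partial v_j^\varepsilon}{\partial x_i}\,dx\\
&= -\sum_{i,j=1}^3 \int_{\mathbb{R}^3} \omega_i^\varepsilon \frac{\partial v_j^\varepsilon}{\partial x_i}\,v_j^\varepsilon\,dx\\
&= -\int_{\mathbb{R}^3} \omega^\varepsilon \otimes v^\varepsilon : \nabla v^\varepsilon\,dx = 0.
\end{aligned} \tag{2.16}
\]
Hence, we find that the sum of the first terms of $I$, $II$ and $III$ cancels out, and after rearrangement of the remaining terms we obtain
\[
\begin{aligned}
I + II + III &= \int_{\mathbb{R}^3} r_\varepsilon(v,v) : \nabla \omega^\varepsilon\,dx + \int_{\mathbb{R}^3} r_\varepsilon(v,\omega) : \nabla v^\varepsilon\,dx - \int_{\mathbb{R}^3} r_\varepsilon(\omega,v) : \nabla v^\varepsilon\,dx\\
&\quad - \int_{\mathbb{R}^3} (v - v^\varepsilon) \otimes (v - v^\varepsilon) : \nabla \omega^\varepsilon\,dx - \int_{\mathbb{R}^3} (v - v^\varepsilon) \otimes (\omega - \omega^\varepsilon) : \nabla v^\varepsilon\,dx\\
&\quad + \int_{\mathbb{R}^3} (\omega - \omega^\varepsilon) \otimes (v - v^\varepsilon) : \nabla v^\varepsilon\,dx\\
&= J_1 + J_2 + J_3 + J_4 + J_5 + J_6.
\end{aligned} \tag{2.17}
\]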

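We estimate (2.17) term by term starting from $J_1$:
\[
\begin{aligned}
|J_1| &= \left| \int_{\mathbb{R}^3} \left( \int_{\mathbb{R}^3} \varphi^\varepsilon(y)(v(x-y) - v(x)) \otimes (v(x-y) - v(x))\,dy \right) : \nabla \omega^\varepsilon(x)\,dx \right|\\
&\le \int_{\mathbb{R}^3} \left( \int_{\mathbb{R}^3} \varphi^\varepsilon(y)|v(x-y) - v(x)|^2\,dy \right) |\nabla \omega^\varepsilon(x)|\,dx\\
&\le \int_{\mathbb{R}^3} \left( \int_{\mathbb{R}^3} |\varphi^\varepsilon(y)|^{q'} |y|^{(\frac6q + 2s)q'}\,dy \right)^{\frac{1}{q'}} \left( \int_{\mathbb{R}^3} \frac{|v(x-y) - v(x)|^q}{|y|^{3+sq}}\,dy \right)^{\frac2q} |\nabla \omega^\varepsilon(x)|\,dx\\
&\le C\varepsilon^{2s} \left( \int_{\mathbb{R}^3} \left( \int_{\mathbb{R}^3} \frac{|v(x-y) - v(x)|^q}{|y|^{3+sq}}\,dy \right)^{\frac{9}{2q}} dx \right)^{\frac49} \|\nabla \omega^\varepsilon\|_{L^{\frac95}}\\
&\le C\varepsilon^{3s-1}\|v\|^2_{\dot F^s_{\frac92,q}}\|\omega\|_{\dot F^s_{\frac95,q}},
\end{aligned} \tag{2.18}
\]
where $2/q + 1/q' = 1$, and we used (2.1) in the last step. For the Besov space norm estimate we use the Fubini theorem and the Hölder inequality as previously:
\[
\begin{aligned}
|J_1| &\le \int_{\mathbb{R}^3} \varphi^\varepsilon(y) \left( \int_{\mathbb{R}^3} |v(x-y) - v(x)|^2 |\nabla \omega^\varepsilon(x)|\,dx \right) dy\\
&\le \int_{\mathbb{R}^3} \varphi^\varepsilon(y)\,\|v(\cdot-y) - v(\cdot)\|^2_{L^{\frac92}}\,\|\nabla \omega^\varepsilon\|_{L^{\frac95}}\,dy\\
&\le \left( \int_{\mathbb{R}^3} |\varphi^\varepsilon(y)|^{q'} |y|^{(\frac6q + 2s)q'}\,dy \right)^{\frac{1}{q'}} \left( \int_{\mathbb{R}^3} \frac{\|v(\cdot-y) - v(\cdot)\|^q_{L^{\frac92}}}{|y|^{3+sq}}\,dy \right)^{\frac2q} \|\nabla \omega^\varepsilon\|_{L^{\frac95}}\\
&\le C\varepsilon^{3s-1}\|v\|^2_{\dot B^s_{\frac92,q}}\|\omega\|_{\dot B^s_{\frac95,q}}.
\end{aligned} \tag{2.19}
\]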

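We estimate $J_2$ as follows:
\[
\begin{aligned}
|J_2| &= \left| \int_{\mathbb{R}^3} \left( \int_{\mathbb{R}^3} \varphi^\varepsilon(y)(v(x-y) - v(x)) \otimes (\omega(x-y) - \omega(x))\,dy \right) : \nabla v^\varepsilon(x)\,dx \right|\\
&\le \int_{\mathbb{R}^3} \left( \int_{\mathbb{R}^3} \varphi^\varepsilon(y)|v(x-y) - v(x)||\omega(x-y) - \omega(x)|\,dy \right) |\nabla v^\varepsilon(x)|\,dx\\
&\le \int_{\mathbb{R}^3} \left( \int_{\mathbb{R}^3} |\varphi^\varepsilon(y)|^{q'} |y|^{(\frac6q + 2s)q'}\,dy \right)^{\frac{1}{q'}} \left( \int_{\mathbb{R}^3} \frac{|v(x-y) - v(x)|^q}{|y|^{3+sq}}\,dy \right)^{\frac1q}\\
&\qquad \times \left( \int_{\mathbb{R}^3} \frac{|\omega(x-y) - \omega(x)|^q}{|y|^{3+sq}}\,dy \right)^{\frac1q} |\nabla v^\varepsilon(x)|\,dx\\
&\le C\varepsilon^{2s} \left( \int_{\mathbb{R}^3} \left( \int_{\mathbb{R}^3} \frac{|v(x-y) - v(x)|^q}{|y|^{3+sq}}\,dy \right)^{\frac{9}{2q}} dx \right)^{\frac29}\\
&\qquad \times \left( \int_{\mathbb{R}^3} \left( \int_{\mathbb{R}^3} \frac{|\omega(x-y) - \omega(x)|^q}{|y|^{3+sq}}\,dy \right)^{\frac{9}{5q}} dx \right)^{\frac59} \|\nabla v^\varepsilon\|_{L^{\frac92}}\\
&\le C\varepsilon^{3s-1}\|v\|^2_{\dot F^s_{\frac92,q}}\|\omega\|_{\dot F^s_{\frac95,q}},
\end{aligned} \tag{2.20}
\]
where $2/q + 1/q' = 1$. Next, we estimate in the Besov norm:
\[
\begin{aligned}
|J_2| &\le \int_{\mathbb{R}^3} \varphi^\varepsilon(y) \left( \int_{\mathbb{R}^3} |v(x-y) - v(x)||\omega(x-y) - \omega(x)||\nabla v^\varepsilon(x)|\,dx \right) dy\\
&\le \int_{\mathbb{R}^3} \varphi^\varepsilon(y)\,\|v(\cdot-y) - v(\cdot)\|_{L^{\frac92}}\,\|\omega(\cdot-y) - \omega(\cdot)\|_{L^{\frac95}}\,\|\nabla v^\varepsilon\|_{L^{\frac92}}\,dy\\
&\le \left( \int_{\mathbb{R}^3} |\varphi^\varepsilon(y)|^{q'} |y|^{(\frac6q + 2s)q'}\,dy \right)^{\frac{1}{q'}} \left( \int_{\mathbb{R}^3} \frac{\|v(\cdot-y) - v(\cdot)\|^q_{L^{\frac92}}}{|y|^{3+sq}}\,dy \right)^{\frac1q}\\
&\qquad \times \left( \int_{\mathbb{R}^3} \frac{\|\omega(\cdot-y) - \omega(\cdot)\|^q_{L^{\frac95}}}{|y|^{3+sq}}\,dy \right)^{\frac1q} \|\nabla v^\varepsilon\|_{L^{\frac92}}\\
&\le C\varepsilon^{3s-1}\|v\|^2_{\dot B^s_{\frac92,q}}\|\omega\|_{\dot B^s_{\frac95,q}}.
\end{aligned} \tag{2.21}
\]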

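The estimate of $J_3$ is similar to that of $J_2$, and we have
\[
|J_3| \le C\varepsilon^{3s-1}\|v\|^2_{\dot X^s_{\frac92,q}}\|\omega\|_{\dot X^s_{\frac95,q}}. \tag{2.22}
\]
We estimate $J_4$ as follows:
\[
\begin{aligned}
|J_4| &= \left| \int_{\mathbb{R}^3} [(v - v^\varepsilon) \otimes (v - v^\varepsilon)] : \nabla \omega^\varepsilon\,dx \right| \le \int_{\mathbb{R}^3} |v - v^\varepsilon|^2 |\nabla \omega^\varepsilon|\,dx\\
&\le \|v - v^\varepsilon\|^2_{L^{\frac92}}\|\nabla \omega^\varepsilon\|_{L^{\frac95}} \le C\varepsilon^{3s-1}\|v\|^2_{\dot X^s_{\frac92,q}}\|\omega\|_{\dot X^s_{\frac95,q}},
\end{aligned} \tag{2.23}
\]
where we used (2.1) and (2.2). Similarly, we estimate $J_5$:
\[
\begin{aligned}
|J_5| &= \left| \int_{\mathbb{R}^3} [(v - v^\varepsilon) \otimes (\omega - \omega^\varepsilon)] : \nabla v^\varepsilon\,dx \right| \le \int_{\mathbb{R}^3} |v - v^\varepsilon||\omega - \omega^\varepsilon||\nabla v^\varepsilon|\,dx\\
&\le \|v - v^\varepsilon\|_{L^{\frac92}}\|\omega - \omega^\varepsilon\|_{L^{\frac95}}\|\nabla v^\varepsilon\|_{L^{\frac92}} \le C\varepsilon^{3s-1}\|v\|^2_{\dot X^s_{\frac92,q}}\|\omega\|_{\dot X^s_{\frac95,q}}.
\end{aligned} \tag{2.24}
\]
The estimate of $J_6$ is similar to that of $J_5$, and we have
\[
|J_6| \le C\varepsilon^{3s-1}\|v\|^2_{\dot X^s_{\frac92,q}}\|\omega\|_{\dot X^s_{\frac95,q}}. \tag{2.25}
\]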

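Taking into account the estimates (2.18)–(2.25), and integrating (2.17) over $[0,t] \subset [0,T]$, we have
\[
\begin{aligned}
\left| \int_{\mathbb{R}^3} v^\varepsilon(x,t) \cdot \omega^\varepsilon(x,t)\,dx - \int_{\mathbb{R}^3} v_0^\varepsilon(x) \cdot \omega_0^\varepsilon(x)\,dx \right| &\le C\varepsilon^{3s-1} \int_0^T \|v(t)\|^2_{\dot X^s_{\frac92,q}}\|\omega(t)\|_{\dot X^s_{\frac95,q}}\,dt\\
&\le C\varepsilon^{3s-1} \left( \int_0^T \|v(t)\|^{r_1}_{\dot X^s_{\frac92,q}}\,dt \right)^{\frac{2}{r_1}} \left( \int_0^T \|\omega(t)\|^{r_2}_{\dot X^s_{\frac95,q}}\,dt \right)^{\frac{1}{r_2}},
\end{aligned}
\]
where $2/r_1 + 1/r_2 = 1$. Passing $\varepsilon \to 0$, we find that
\[
\int_{\mathbb{R}^3} \omega(x,t) \cdot v(x,t)\,dx = \int_{\mathbb{R}^3} \omega_0(x) \cdot v_0(x)\,dx
\]
for $s > \frac13$.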
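The exponents $\frac92$ and $\frac95$ in the helicity estimates are fixed by Hölder's inequality, and the bookkeeping can be checked with exact rational arithmetic. The following is a minimal sketch; the helper names `conjugate_sum` and `time_exponents_ok` are hypothetical, and the sample pairs $(r_1, r_2)$ are chosen only for illustration.

```python
from fractions import Fraction

def conjugate_sum(exponents):
    """Sum of reciprocals of Lebesgue exponents; Hoelder's inequality needs this to be 1."""
    return sum(1 / Fraction(p) for p in exponents)

nine_halves, nine_fifths = Fraction(9, 2), Fraction(9, 5)

# J1 and J4: two factors of v in L^{9/2} against grad(omega^eps) in L^{9/5}.
j1 = conjugate_sum([nine_halves, nine_halves, nine_fifths])
# J2 and J5: v in L^{9/2}, omega in L^{9/5}, grad(v^eps) in L^{9/2}.
j2 = conjugate_sum([nine_halves, nine_fifths, nine_halves])

def time_exponents_ok(r1, r2):
    """Check the time-integrability condition 2/r1 + 1/r2 = 1 of Theorem 1.2."""
    return 2 / Fraction(r1) + 1 / Fraction(r2) == 1

print("J1 exponents sum to", j1)
print("J2 exponents sum to", j2)
print("(r1, r2) = (3, 3) admissible:", time_exponents_ok(3, 3))
print("(r1, r2) = (4, 2) admissible:", time_exponents_ok(4, 2))
print("(r1, r2) = (3, 2) admissible:", time_exponents_ok(3, 2))
```

Both spatial sums equal 1, and the pairs $(3,3)$ and $(4,2)$ satisfy the time condition while $(3,2)$ does not.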


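Proof of Corollary 1.1. We estimate the helicity,
\[
\left| \int_{\mathbb{R}^3} v(x,t) \cdot \omega(x,t)\,dx \right| \le \|v(\cdot,t)\|_{L^3}\|\omega(\cdot,t)\|_{L^{\frac32}} \le C\|\nabla v(\cdot,t)\|_{L^{\frac32}}\|\omega(\cdot,t)\|_{L^{\frac32}} \le C\|\omega(\cdot,t)\|^2_{L^{\frac32}}, \tag{2.26}
\]
where we used the Sobolev inequality and the Calderon-Zygmund inequality. Combining (2.26) with Theorem 1.2, we obtain the desired conclusion.

Proof of Theorem 1.3. Suppose $\theta(x,t)$ is a weak solution of (QG) in the sense of (1.6)–(1.7). Let $\xi(t) \in C_0^\infty([0,T))$. Given $y \in \mathbb{R}^2$, choosing the test function $\phi(x,t) = \xi(t)\varphi^\varepsilon(x-y)$ in (1.6) we obtain
\[
\frac{\partial \theta^\varepsilon}{\partial t} + \operatorname{div}\,(v\theta)^\varepsilon = 0. \tag{2.27}
\]
We take the $L^2(\mathbb{R}^2)$ inner product of (2.27) with $\theta^\varepsilon|\theta^\varepsilon|^{p-2}$. Then, integrating by parts, and using the identity (2.7), we obtain
\[
\begin{aligned}
\frac1p \frac{d}{dt} \int_{\mathbb{R}^2} |\theta^\varepsilon|^p\,dx &= (p-1) \int_{\mathbb{R}^2} (v\theta)^\varepsilon \cdot \nabla \theta^\varepsilon |\theta^\varepsilon|^{p-2}\,dx\\
&= (p-1) \int_{\mathbb{R}^2} \left( \int_{\mathbb{R}^2} \varphi^\varepsilon(y)(v(x-y) - v(x))(\theta(x-y) - \theta(x))\,dy \right) \cdot \nabla \theta^\varepsilon(x)|\theta^\varepsilon(x)|^{p-2}\,dx\\
&\quad - (p-1) \int_{\mathbb{R}^2} [(v - v^\varepsilon)(\theta - \theta^\varepsilon)] \cdot \nabla \theta^\varepsilon(x)|\theta^\varepsilon(x)|^{p-2}\,dx\\
&:= (p-1)[I + II],
\end{aligned} \tag{2.28}
\]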

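where we used the fact
\[
\int_{\mathbb{R}^2} v^\varepsilon \theta^\varepsilon \cdot \nabla \theta^\varepsilon |\theta^\varepsilon|^{p-2}\,dx = \frac1p \int_{\mathbb{R}^2} (v^\varepsilon \cdot \nabla)|\theta^\varepsilon|^p\,dx = -\frac1p \int_{\mathbb{R}^2} (\operatorname{div} v^\varepsilon)|\theta^\varepsilon|^p\,dx = 0.
\]
We estimate $I$ and $II$ separately:
\[
\begin{aligned}
I &\le \int_{\mathbb{R}^2} \left( \int_{\mathbb{R}^2} |\varphi^\varepsilon(y)||v(x-y) - v(x)||\theta(x-y) - \theta(x)|\,dy \right) |\nabla \theta^\varepsilon(x)||\theta^\varepsilon(x)|^{p-2}\,dx\\
&\le \int_{\mathbb{R}^2} \left( \int_{\mathbb{R}^2} |\varphi^\varepsilon(y)|^{q'} |y|^{(\frac4q + 2s)q'}\,dy \right)^{\frac{1}{q'}} \left( \int_{\mathbb{R}^2} \frac{|v(x-y) - v(x)|^q}{|y|^{2+sq}}\,dy \right)^{\frac1q}\\
&\qquad \times \left( \int_{\mathbb{R}^2} \frac{|\theta(x-y) - \theta(x)|^q}{|y|^{2+sq}}\,dy \right)^{\frac1q} |\nabla \theta^\varepsilon(x)||\theta^\varepsilon(x)|^{p-2}\,dx\\
&\le C\varepsilon^{2s} \left( \int_{\mathbb{R}^2} \left( \int_{\mathbb{R}^2} \frac{|v(x-y) - v(x)|^q}{|y|^{2+sq}}\,dy \right)^{\frac{p+1}{q}} dx \right)^{\frac{1}{p+1}}\\
&\qquad \times \left( \int_{\mathbb{R}^2} \left( \int_{\mathbb{R}^2} \frac{|\theta(x-y) - \theta(x)|^q}{|y|^{2+sq}}\,dy \right)^{\frac{p+1}{q}} dx \right)^{\frac{1}{p+1}} \|\nabla \theta^\varepsilon\|_{L^{p+1}}\|\theta^\varepsilon\|^{p-2}_{L^{p+1}}\\
&\le C\varepsilon^{3s-1}\|v\|_{\dot F^s_{p+1,q}}\|\theta\|^2_{\dot F^s_{p+1,q}}\|\theta\|^{p-2}_{L^{p+1}},
\end{aligned} \tag{2.29}
\]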


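where $1/q + 1/q' = 1$, $q \in [1,\infty]$, and we used (2.1) in the last step. The estimate in the Besov space norm is the following.
\[
\begin{aligned}
I &\le \int_{\mathbb{R}^2} |\varphi^\varepsilon(y)| \left( \int_{\mathbb{R}^2} |v(x-y) - v(x)||\theta(x-y) - \theta(x)||\nabla \theta^\varepsilon(x)||\theta^\varepsilon(x)|^{p-2}\,dx \right) dy\\
&\le \int_{\mathbb{R}^2} \varphi^\varepsilon(y)\,\|v(\cdot-y) - v(\cdot)\|_{L^{p+1}}\,\|\theta(\cdot-y) - \theta(\cdot)\|_{L^{p+1}}\,\|\nabla \theta^\varepsilon\|_{L^{p+1}}\,\|\theta^\varepsilon\|^{p-2}_{L^{p+1}}\,dy\\
&\le \left( \int_{\mathbb{R}^2} |\varphi^\varepsilon(y)|^{q'} |y|^{(\frac4q + 2s)q'}\,dy \right)^{\frac{1}{q'}} \left( \int_{\mathbb{R}^2} \frac{\|v(\cdot-y) - v(\cdot)\|^q_{L^{p+1}}}{|y|^{2+sq}}\,dy \right)^{\frac1q}\\
&\qquad \times \left( \int_{\mathbb{R}^2} \frac{\|\theta(\cdot-y) - \theta(\cdot)\|^q_{L^{p+1}}}{|y|^{2+sq}}\,dy \right)^{\frac1q} \|\nabla \theta^\varepsilon\|_{L^{p+1}}\|\theta^\varepsilon\|^{p-2}_{L^{p+1}}\\
&\le C\varepsilon^{3s-1}\|v\|_{\dot B^s_{p+1,q}}\|\theta\|^2_{\dot B^s_{p+1,q}}\|\theta\|^{p-2}_{L^{p+1}}.
\end{aligned} \tag{2.30}
\]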

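Using (2.1) and (2.2) directly, we estimate
\[
\begin{aligned}
II &\le \int_{\mathbb{R}^2} |v(x) - v^\varepsilon(x)||\theta(x) - \theta^\varepsilon(x)||\nabla \theta^\varepsilon(x)||\theta^\varepsilon(x)|^{p-2}\,dx\\
&\le \|v - v^\varepsilon\|_{L^{p+1}}\|\theta - \theta^\varepsilon\|_{L^{p+1}}\|\nabla \theta^\varepsilon\|_{L^{p+1}}\|\theta^\varepsilon\|^{p-2}_{L^{p+1}}\\
&\le C\varepsilon^{3s-1}\|v\|_{\dot X^s_{p+1,q}}\|\theta\|^2_{\dot X^s_{p+1,q}}\|\theta\|^{p-2}_{L^{p+1}}.
\end{aligned} \tag{2.31}
\]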

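Taking into account the estimates (2.29)–(2.31), and integrating (2.28) over $[0,t] \subset [0,T]$, we have
\[
\begin{aligned}
\left| \|\theta^\varepsilon(t)\|^p_{L^p} - \|\theta_0^\varepsilon\|^p_{L^p} \right| &\le C\varepsilon^{3s-1} \int_0^T \|v(\tau)\|_{\dot X^s_{p+1,q}}\|\theta(\tau)\|^2_{\dot X^s_{p+1,q}}\|\theta(\tau)\|^{p-2}_{L^{p+1}}\,d\tau\\
&\le C\varepsilon^{3s-1} \int_0^T \|v(\tau)\|_{\dot X^s_{p+1,q}}\|\theta(\tau)\|^{p}_{X^s_{p+1,q}}\,d\tau\\
&\le C\varepsilon^{3s-1} \left( \int_0^T \|v(\tau)\|^{r_2}_{\dot X^s_{p+1,q}}\,d\tau \right)^{\frac{1}{r_2}} \left( \int_0^T \|\theta(\tau)\|^{r_1}_{X^s_{p+1,q}}\,d\tau \right)^{\frac{p}{r_1}},
\end{aligned}
\]
where $p/r_1 + 1/r_2 = 1$. For $s > \frac13$, passing $\varepsilon \to 0$, we have $\|\theta(t)\|_{L^p} = \|\theta_0\|_{L^p}$, if $v \in L^{r_2}(0,T; \dot X^s_{p+1,q}(\mathbb{R}^2))$ and $\theta \in L^{r_1}(0,T; X^s_{p+1,q}(\mathbb{R}^2))$.

Acknowledgements. This research was supported by KOSEF Grant no. R01-2005-000-10077-0.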

References

1. Arnold, V.I., Khesin, B.A.: Topological Methods in Hydrodynamics. Berlin Heidelberg New York: Springer-Verlag Inc., 1998
2. Chae, D.: Remarks on the Helicity of the 3-D Incompressible Euler Equations. Commun. Math. Phys. 240(3), 501–507 (2003)
3. Chae, D.: The quasi-geostrophic equation in the Triebel-Lizorkin spaces. Nonlinearity 16, 479–495 (2003)
4. Constantin, P.: Geometric Statistics in Turbulence. SIAM Rev. 36, 73–98 (1994)
5. Constantin, P., E, W., Titi, E.: Onsager's Conjecture on the Energy Conservation for Solutions of Euler's Equations. Commun. Math. Phys. 165(1), 207–209 (1994)
6. Constantin, P., Majda, A., Tabak, E.: Formation of strong fronts in the 2-D quasi-geostrophic thermal active scalar. Nonlinearity 7, 1459–1533 (1994)
7. Córdoba, D.: Nonexistence of simple hyperbolic blow-up for the quasi-geostrophic equation. Ann. of Math. 148, 1135–1152 (1998)
8. Córdoba, D., Fefferman, C.: Growth of solutions for QG and 2D Euler equations. J. Amer. Math. Soc. 15(3), 665–670 (2002)
9. Duchon, J., Robert, R.: Inertial Energy Dissipation for Weak Solutions of Incompressible Euler and Navier-Stokes Equations. Nonlinearity 13, 249–255 (2000)
10. Majda, A., Bertozzi, A.: Vorticity and Incompressible Flow. Cambridge: Cambridge Univ. Press, 2002
11. Moffatt, H.K., Tsinober, A.: Helicity in Laminar and Turbulent Flow. Ann. Rev. Fluid Mech. 24, 281–312 (1992)
12. Resnick, S.: Dynamical Problems in Non-linear Advective Partial Differential Equations. Ph.D Thesis, University of Chicago, 1995
13. Shnirelman, A.: Weak Solutions with Decreasing Energy of Incompressible Euler Equations. Commun. Math. Phys. 210, 541–603
14. Stein, E.M.: Singular Integrals and Differentiability Properties of Functions. Princeton, NJ: Princeton Univ. Press, 1970
15. Triebel, H.: Theory of Function Spaces. Boston: Birkhäuser Verlag, 1983
16. Wu, J.: Inviscid limits and regularity estimates for the solutions of the 2-D dissipative Quasi-geostrophic equations. Indiana Univ. Math. J. 46(4), 1113–1124 (1997)

Communicated by P. Constantin

Commun. Math. Phys. 266, 211–238 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0013-5

Communications in

Mathematical Physics

A Counterexample to Dispersive Estimates for Schrödinger Operators in Higher Dimensions

Michael Goldberg¹, Monica Visan²

¹ Department of Mathematics, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD 21218, USA. E-mail: [email protected]
² Department of Mathematics, University of California, Los Angeles, CA 90025-1555, USA. E-mail: [email protected]

Received: 11 August 2005 / Accepted: 4 November 2005
Published online: 22 April 2006 – © Springer-Verlag 2006

Abstract: In dimension $n > 3$ we show the existence of a compactly supported potential in the differentiability class $C^\alpha$, $\alpha < \frac{n-3}{2}$, for which the solutions to the linear Schrödinger equation in $\mathbb{R}^n$,
\[
-i\partial_t u = -\Delta u + Vu, \qquad u(0) = f,
\]
fail to satisfy an evolution estimate of the form
\[
\|u(t,\cdot)\|_\infty \le C|t|^{-\frac n2}\|u(0,\cdot)\|_1.
\]
This contrasts with known results in dimensions $n \le 3$, where a pointwise decay condition on $V$ is generally sufficient to imply dispersive bounds. The obstructions in our example are generated by an expression with scaling law $|t|^{-n+\frac32+\alpha}$, which becomes dominant in the time interval $|t| \ll 1$.

1. Introduction

The evolution operator for the free Schrödinger equation, here denoted by $e^{-it\Delta}$, is subject to a wide variety of estimates. Functional analysis dictates that it must be an isometry on $L^2(\mathbb{R}^n)$ at every fixed time $t$. Representing $e^{-it\Delta}$ as a convolution operator with the kernel $(-4\pi i t)^{-\frac n2} e^{-i|x|^2/(4t)}$ leads to the dispersive bound
\[
\|e^{-it\Delta} f\|_\infty \le (4\pi|t|)^{-\frac n2}\|f\|_1 \tag{1.1}
\]
valid for each $t \ne 0$. Between these two estimates one already has most of the necessary elements to verify more subtle space-time properties of the Schrödinger evolution such as global Strichartz bounds. It is natural to ask whether a perturbed operator $e^{itH}$, $H = -\Delta + V$, can satisfy (up to a constant) the same $L^1 \to L^\infty$ estimate as the free evolution. In general, it


cannot. If $H$ has point spectrum (eigenvalues), the naive dispersive estimate (1.1) fails. Indeed, for any Schwartz function $f$ that has nonzero inner product with an eigenfunction, $\langle e^{itH} f, f \rangle$ does not converge to zero as $t \to \infty$. Therefore, it is a natural endeavour to prove
\[
\|e^{itH} P_{ac}(H) f\|_\infty \le C|t|^{-\frac n2}\|f\|_1, \tag{1.2}
\]
where $P_{ac}(H)$ denotes the projection onto the absolutely continuous spectrum of $H$. (For the potentials discussed here, there is no singular continuous spectrum by the Agmon-Kato-Kuroda Theorem [13, Theorem XIII.33].) It is known that (1.2) can fail for $t$ large in the presence of a zero-energy eigenvalue or resonance. For more details, see [8, Theorem 10.5], [6, Theorem 8.2], and [9, §3]. By assuming that zero is a regular point, that is, neither an eigenvalue nor a resonance of $H$, one can find conditions governing the decay and regularity (but not the size, or signature) of $V$ which are known to be sufficient to imply the dispersive bound (1.2). These are listed below for reference.

• [4] $n = 1$: $(1 + |x|)V \in L^1(\mathbb{R})$,
• [15] $n = 2$: $|V(x)| \le C(1 + |x|)^{-3-\varepsilon}$,
• [2] $n = 3$: $V \in L^{\frac32-\varepsilon}(\mathbb{R}^3) \cap L^{\frac32+\varepsilon}(\mathbb{R}^3)$,
• [9] $n \ge 3$: $\hat V \in L^1$ and $(1 + |x|^2)^{\gamma/2} V(x)$ is a bounded operator on the Sobolev space $H^\nu$ for some $\gamma > n + 4$ and some $\nu > 0$.

For a more thorough discussion of the work on this problem, see the survey [16]. One might extrapolate from the results in dimensions 1, 2, and 3 that a suitable $L^p$-type condition for potentials should be sufficient in every dimension. The main result of this paper shows that this is not true. Denoting by $X$ a space of functions with fixed compact support and bounded derivatives of order $\alpha < \frac{n-3}{2}$, we reach the following contrary conclusion:

Theorem 1.1. Suppose $n > 3$. There cannot exist a bound of the form
\[
\|e^{itH} P_{ac}(H) f\|_\infty \le C(V)|t|^{-\frac n2}\|f\|_1
\]
with $C(V) < \infty$ for every potential $V \in X$, $\|V\|_X \le 1$.

More specifically, the $L^1 \to L^\infty$ mapping norm can be made at least as large as $|t|^{-n+\frac32+\alpha}$ for a sequence of times $t_n$ converging to zero. Due to the short-time nature of the result, the inclusion of a spectral projection $P_{ac}(H)$ is something of a formality, as the presence or absence of bound states is primarily felt when $|t| \ge 1$.

In constructing the counterexamples, we follow the approach of [8], where dispersive estimates are proven in the setting of weighted $L^2(\mathbb{R}^n)$. Specifically, we use Stone's formula to construct the spectral measure of $H$ from its resolvents, which is studied as a multiplicative perturbation of the free resolvent. Passing from weighted $L^2$ to unweighted $L^1$ requires us to consider separately the first few terms of the associated Born series. This argument was worked out in three dimensions by [4], and the initial terms of the Born series (again in three dimensions) are computed explicitly in [14]. The three dimensional analysis of [4] relies heavily on the simple expression of the free resolvent as an integral operator. The free resolvent kernel can be written in terms of elementary functions in all odd dimensions; however, the expressions become


increasingly unwieldy as the dimension increases. In even dimensions, Bessel/Hankel functions are required. The key to avoiding this morass is the introduction of certain symbol classes, $S^{i,j}$, which capture the essential features of the free resolvent. In particular, in dimension $n$, one must integrate by parts approximately $(n + 1)/2$ times to obtain the appropriate power of $t$; this seems quite impossible without such a unifying tool. In dimensions four and higher, the Green's function is rather singular at the origin, specifically, it is not locally square integrable. This necessitates carrying the Born expansion much further than in [4], which adds to the complexity of our proof. The relevant identities and mapping properties for the free resolvent are presented in Sects. 2 and 3.

Our analysis contains certain partial positive results. In Sect. 4 we show that (1.2) is attained by the tail of the Born series, taken after a finite number (depending on the dimension) of initial terms. The question of whether $e^{itH} P_{ac}(H)$ is dispersive therefore reduces to an estimate on the initial terms in the Born series. In Sect. 5 we take advantage of an explicit formula for the contribution of the first nontrivial term to construct a potential whose contribution is bounded below by $|t|^{-n+\frac32+\alpha}$ at certain times $0 < t < 1$. In the limit $t \to 0$, this runs contrary to the desired bound of $|t|^{-\frac n2}$. An elementary argument shows that any finite sum of initial terms must be subject to this lower bound as well. Combined with the dispersive estimate for the tail of the series, this establishes a lower bound for the full Schrödinger evolution with this potential. The Uniform Boundedness Principle is then used to show that the worst possible limiting behaviour can be achieved. Section 6 contains a derivation of the crucial explicit formula.

It should again be emphasized that the non-dispersive phenomenon takes place over extremely short times; moreover, it is a high-energy phenomenon.
Indeed by Theorem B.2.3 of [17], for any bounded compactly supported function $\phi$, the operator $e^{itH}\phi(H)$ maps $L^1$ into $L^\infty$ uniformly in $t$. This is true for very general potentials, in particular those that are bounded. A physical interpretation is that even high-frequency waves travelling with large velocity can be effectively scattered by a non-smooth potential. Depending on the geometry of the potential, the first reflection may generate an unacceptable degree of constructive interference.

For the purposes of our counterexample, "non-smooth" will mean that $V$ is assumed to possess fewer than $\frac{n-3}{2}$ continuous derivatives. Compare this to the smoothness conditions in [9], which are sufficient to imply a dispersive bound. In that paper a potential is only explicitly required to possess derivatives of order $\nu$ for some $\nu > 0$. Indeed, there exist numerous examples of functions satisfying all the hypotheses of [9], yet which we would consider to be non-smooth. On the other hand, the potentials constructed in this paper are differentiable to order $\frac{n-3}{2}$ but the dispersive estimate still fails. This suggests that while a dispersive bound may hold for all sufficiently smooth potentials (with rapid decay at infinity), other criteria besides the number and size of derivatives determine what happens in the absence of such strong regularity.

The additional assumption in [9] is that $\hat V \in L^1$, which is satisfied by any potential in the Sobolev space $H^{\frac n2+\varepsilon}(\mathbb{R}^n)$. Determining which functions of lesser regularity also have integrable Fourier transform is a well known difficult problem. The counterexample constructed here is motivated by a different and explicitly geometric consideration, the focal pattern of reflections caused by an elliptical surface. Strictly speaking, the reflection is caused by a highly oscillatory potential whose level sets are ellipses. When presented in this light, it is clear that some notion of curvature and/or convexity can also


determine whether dispersive estimates remain valid. There is still considerable room between the currently known sufficient conditions and the negative result presented here. We believe this middle ground can be explored via some combination of geometric and Fourier analysis and that these are most likely two sides of the same coin.

2. Notes on the Free Resolvent

We introduce here a class of symbols which will be relevant in the study of the free resolvent, simplifying both the notation and the analysis. For $i, j \in \mathbb{Q}$, we denote by $a_{i,j}$ a symbol belonging to the class $S^{i,j}$, i.e., a symbol that satisfies the following estimates
\[
\left| \frac{\partial^k a_{i,j}(x)}{\partial x^k} \right| \le
\begin{cases}
c_k x^{i-k} & \text{if } 0 < x \le 1,\\
c_k x^{j-k} & \text{if } x > 1
\end{cases}
\qquad \forall k \ge 0.
\]
The calculus of these symbols is quite straightforward: the derivative of a symbol in $S^{i,j}$ is a symbol in $S^{i-1,j-1}$ and the product of a symbol in $S^{i,j}$ with a symbol in $S^{i',j'}$ is a symbol in $S^{i+i',j+j'}$. In particular, the product of a symbol in $S^{i,j}$ with $x^\alpha$ belongs to $S^{i+\alpha,j+\alpha}$.

Now let us consider the resolvent of the free Schrödinger equation, $R_0(z) = (-\Delta - z)^{-1}$. In dimension $n \ge 4$, $R_0(z)$ is given by the kernel:
\[
R_0(z)(x,y) = \frac{i}{4} \left( \frac{z^{\frac12}}{2\pi|x-y|} \right)^{\frac n2 - 1} H^{(1)}_{\frac n2 - 1}(z^{\frac12}|x-y|), \tag{2.1}
\]
where $\operatorname{Im} z^{\frac12} \ge 0$ and $H^{(1)}_{\frac n2 - 1}$ is the first Hankel function. We encode the information contained in the asymptotic expansions of the first Hankel function near the origin and at infinity (see [5]), together with the information provided by the differential equation satisfied by the first Hankel function,
\[
H^{(1)}_{\nu-1}(z) - H^{(1)}_{\nu+1}(z) = 2\frac{d}{dz} H^{(1)}_\nu(z),
\]
into the following formula valid for $\operatorname{Re} \nu > -\frac12$ and $|\arg z| < \pi$,
\[
H^{(1)}_\nu(z) = e^{iz} a_{-\nu,-\frac12}(z).
\]
This together with (2.1) yields a representation for the kernel of the free resolvent in dimension $n \ge 4$ in terms of the aforementioned symbols, that is,
\[
R_0^\pm(\lambda^2)(x,y) = a_{0,\frac{n-3}{2}}(\lambda|x-y|)\,\frac{e^{\pm i\lambda|x-y|}}{|x-y|^{n-2}}, \tag{2.2}
\]
where $R_0^\pm(\lambda^2)$ denote the boundary values $R_0(\lambda^2 \pm i0)$. Let us also point out a similar formula for the imaginary part of the free resolvent,
\[
\operatorname{Im} R_0(\lambda^2)(x,y) = a_{n-2,\frac{n-3}{2}}(\lambda|x-y|)\,\frac{e^{\pm i\lambda|x-y|}}{|x-y|^{n-2}}, \tag{2.3}
\]
A Counterexample to Dispersive Estimates

215

by which we mean that we can write it as the sum of two terms of this type, one with phase eiλ|x−y| and the other with phase e−iλ|x−y| . Indeed, using (for example) the identity λn−2 (− − 1)−1 (λx) = (− − λ2 )−1 (x)

(2.4)

for the kernels of the free resolvents, we can write Im R0 (λ2 )(x, y) = λn−2 (λ|x − y|)

2−n 2

J n−2 (λ|x − y|), 2

where J n−2 denotes the Bessel function. Consulting the asymptotic expansions of the 2 Bessel function near the origin and at infinity (see again [5]) and using the differential equation satisfied by the Bessel function, Jν−1 (z) − Jν+1 (z) = 2

d Jν (z), dz

one easily derives (2.3). The purpose of understanding the free resolvent is that it enables us to study functions of H through to the Stone formula for the spectral measure: ∞ F(H )Pac (H ) f, g = 2 F(λ2 )λE (λ2 ) f, gdλ 0 2 ∞ F(λ2 )λ Im R V (λ2 ) f, gdλ, = π 0 where f, g are any two Schwartz functions, Pac (H ) denotes the projection onto the absolutely continuous spectrum of H , E (λ) denotes the spectral measure associated to H , and R V± (λ2 ) := (H − λ2 ± i0)−1 is the resolvent of the perturbed Schrödinger equation. We have chosen signs so that 2i Im R V (λ2 ) = R V+ (λ2 ) − R V− (λ2 ). In order to compute the kernel of Im R V (λ2 ), we make use of the resolvent identity: R V± (λ2 ) = R0± (λ2 ) − R0± (λ2 )V R V± (λ2 ), which by iteration gives rise to the following finite Born series expansion:

R V± (λ2 )

=

2m+1

R0± (λ2 )[−V R0± (λ2 )]l

(2.5)

l=0

+R0± (λ2 )V [R0± (λ2 )V ]m R V± (λ2 )[V R0± (λ2 )]m V R0± (λ2 ).

(2.6)

Elementary algebra can also be used to solve for R V± (λ2 ) in terms of R0± (λ2 ):

−1 ± 2 R V± (λ2 ) = I + R0± (λ2 )V R0 (λ ) := S ± (λ2 )R0± (λ2 ). For now this identity is only a formal statement, as we have not shown that S ± (λ2 ) =

−1 (I + R0± (λ2 )V exists as a bounded operator on any space. Existence and uniform boundedness of S ± (λ2 ) will be demonstrated in Sect. 4.
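The passage from (2.1) to (2.2) can be traced with the symbol calculus alone. The following is a sketch of that bookkeeping for the "+" case, writing $r = |x-y|$ and $\nu = \frac n2 - 1$, so that $z^{1/2} = \lambda$. The shorthand $r$, $\nu$ and the absorption of the constant $\frac{i}{4}(2\pi)^{-\nu}$ into the symbol are conventions assumed here, in keeping with the use of $a_{i,j}$ for a generic element of $S^{i,j}$.

```latex
\begin{align*}
R_0^{+}(\lambda^2)(x,y)
  &= \frac{i}{4}\Big(\frac{\lambda}{2\pi r}\Big)^{\nu} H^{(1)}_{\nu}(\lambda r)
   = \frac{i}{4}(2\pi)^{-\nu}\, r^{-2\nu}\,(\lambda r)^{\nu}\, e^{i\lambda r}\, a_{-\nu,-\frac12}(\lambda r) \\
  &= \frac{e^{i\lambda r}}{r^{n-2}}\, a_{0,\,\nu-\frac12}(\lambda r)
   = \frac{e^{i\lambda r}}{r^{n-2}}\, a_{0,\frac{n-3}{2}}(\lambda r).
\end{align*}
```

The second line uses $2\nu = n-2$ and the product rule for symbols: $(\lambda r)^{\nu}$ shifts both indices of $a_{-\nu,-\frac12}$ by $\nu$, and $\nu - \frac12 = \frac{n-3}{2}$. The "−" case follows by complex conjugation, since the kernel of $R_0^-(\lambda^2)$ is the conjugate of that of $R_0^+(\lambda^2)$.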


3. Useful Lemmas

In this section we prove a few technical lemmas. We begin with certain results related to the boundedness of the Riesz potentials between various weighted spaces. By Riesz potentials, we mean the operators $I_\alpha : f \mapsto |x|^{\alpha-n} * f$, where $0 < \alpha < n$.

Let $\mathcal{I}_q$ denote the space of compact operators $T$ for which $\|T\|_{\mathcal{I}_q} = [\operatorname{tr}(|T|^q)]^{\frac1q}$ is finite. We recall the following well-known result (see [12, Theorem XI.20]):

Lemma 3.1. Let $f, g \in L^q(\mathbb{R}^n)$, for some $2 \le q < \infty$. Then, $f(x)g(-i\nabla) \in \mathcal{I}_q$ and
\[
\|f(x)g(-i\nabla)\|_{\mathcal{I}_q} \le (2\pi)^{-\frac nq}\|f\|_q\|g\|_q.
\]
Here, $f(x)$ denotes multiplication by $f$ in physical space, while $g(-i\nabla)$ denotes multiplication by $g$ in frequency space.

As a consequence of Lemma 3.1, one can derive results on the boundedness of the Riesz potentials between various weighted spaces. To describe these spaces, we will use the notation
\[
\|f\|_{L^{p,\sigma}} := \|\langle x \rangle^\sigma f\|_{L^p},
\]
where $\langle x \rangle := (1 + |x|^2)^{1/2}$, $1 \le p \le \infty$, and $\sigma \in \mathbb{R}$. Following the notation of Jensen and Kato, we write $B(0,\sigma; 0,-\sigma')$ for the set of bounded operators from $L^{2,\sigma}$ to $L^{2,-\sigma'}$, while $B_0(0,\sigma; 0,-\sigma')$ denotes the set of compact operators from $L^{2,\sigma}$ to $L^{2,-\sigma'}$. Jensen shows (see Lemma 2.3 in [6]) the following result.

Proposition 3.2.
1) If $0 < \alpha < \frac n2$, $\sigma, \sigma' \ge 0$, and $\sigma + \sigma' \ge \alpha$, then $I_\alpha \in B(0,\sigma; 0,-\sigma')$. Moreover, if $\sigma + \sigma' > \alpha$, then $I_\alpha \in B_0(0,\sigma; 0,-\sigma')$.
2) If $\frac n2 \le \alpha < n$, $\sigma, \sigma' > \alpha - \frac n2$, and $\sigma + \sigma' \ge \alpha$, then $I_\alpha \in B(0,\sigma; 0,-\sigma')$. Moreover, if $\sigma + \sigma' > \alpha$, then $I_\alpha \in B_0(0,\sigma; 0,-\sigma')$.

The case $\alpha \ge n$ may appear qualitatively different from the Riesz potentials considered above; however, the mapping bounds between weighted $L^2$ spaces are still valid.

Proposition 3.3. Let $\alpha \ge n$. The convolution operator $I_\alpha : f \mapsto |x|^{\alpha-n} * f$ is an element of $B_0(0,\sigma; 0,-\sigma')$, provided $\sigma, \sigma' > \alpha - \frac n2$.

Proof. As every Hilbert-Schmidt operator is compact, in order to prove the proposition it suffices to show that $I_\alpha$ is a Hilbert-Schmidt operator between $L^{2,\sigma}$ and $L^{2,-\sigma'}$. In turn, this is equivalent to showing the finiteness of the integral
\[
\iint \langle x \rangle^{-2\sigma'} |x-y|^{2(\alpha-n)} \langle y \rangle^{-2\sigma}\,dx\,dy.
\]
Consider the integral with respect to $x$, namely
\[
\int \langle x \rangle^{-2\sigma'} |x-y|^{2(\alpha-n)}\,dx.
\]

A Counterexample to Dispersive Estimates


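If $|y| \le 1$, this is dominated by the integral of $\langle x\rangle^{2(\alpha-\sigma'-n)}$, which is finite because $\sigma' > \alpha - \frac{n}{2}$. Now suppose $|y| > 1$. Over the region where $|x| \le \frac12 |y|$, the factor $|x - y|$ is essentially of size $|y|$, as can be seen from the triangle inequality. Meanwhile, the factor $\langle x\rangle^{-2\sigma'}$ is integrable because $\sigma' > \alpha - \frac{n}{2} \ge \frac{n}{2}$. Consequently, the integral over this region is bounded by $|y|^{2(\alpha-n)}$, i.e.,
\[
\int_{|x| \le \frac12 |y|} \langle x\rangle^{-2\sigma'} |x - y|^{2(\alpha-n)}\,dx \lesssim |y|^{2(\alpha-n)}.
\]
Over the region where $|x - y| \le \frac12 |y|$, the triangle inequality dictates $|x| \sim |y|$. Hence,
\[
\int_{|x-y| \le \frac12 |y|} \langle x\rangle^{-2\sigma'} |x - y|^{2(\alpha-n)}\,dx \lesssim \langle y\rangle^{-2\sigma'} \int_{|x-y| \le \frac12 |y|} |x - y|^{2(\alpha-n)}\,dx \lesssim \langle y\rangle^{-2\sigma'} |y|^{2\alpha-n} \lesssim |y|^{2\alpha-2\sigma'-n}.
\]
Everywhere else in $\mathbb{R}^n$, the two functions $|x|$ and $|x - y|$ are of comparable size. Recalling that $\sigma' > \alpha - \frac{n}{2}$, the integral over this region is then dominated by
\[
\int_{|x| > \frac12 |y|} \langle x\rangle^{-2\sigma'} |x|^{2(\alpha-n)}\,dx \lesssim \int_{|x| > \frac12 |y|} |x|^{2(\alpha-\sigma'-n)}\,dx \lesssim |y|^{2\alpha-2\sigma'-n}.
\]
Therefore, the dominant term for large $y$ comes from the region $|x| \le \frac12 |y|$. To complete the estimate for the Hilbert-Schmidt norm, it remains to bound the integral over the $y$-variable. As $\sigma > \alpha - \frac{n}{2}$, this is dominated by
\[
\int \langle y\rangle^{2(\alpha-n)} \langle y\rangle^{-2\sigma}\,dy \lesssim 1.
\]
This concludes the proof of Proposition 3.3.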

Propositions 3.2 and 3.3 immediately yield some mapping bounds for the free resolvent and its derivatives. Indeed, we have

Corollary 3.4. Let $j$ be any nonnegative integer and suppose $\sigma, \sigma' > j + \frac12$ with $\sigma + \sigma' > j + \frac{n+1}{2}$. Then
\[
\Big\| \Big(\frac{d}{d\lambda}\Big)^j R_0^\pm(\lambda^2) f \Big\|_{L^{2,-\sigma'}} \lesssim \lambda^{-j} \langle\lambda\rangle^{j+\frac{n-3}{2}} \|f\|_{L^{2,\sigma}}.
\]

Proof. Recall that the kernel of $R_0^\pm(\lambda^2)$ is given by $|x|^{2-n} e^{\pm i\lambda|x|} a_{0,\frac{n-3}{2}}(\lambda|x|)$. When a symbol is differentiated, the effect is comparable to dividing by $\lambda$; see Sect. 2 for the calculus of the symbols $a_{i,j}$. Each derivative that falls on the exponential factor increases the power of $|x|$ by one. Based on these possible outcomes, the integral kernel of $(\frac{d}{d\lambda})^j R_0^\pm(\lambda^2)$ must be of the form $\lambda^{-j} |x|^{2-n} e^{\pm i\lambda|x|} a_{0,\frac{n-3}{2}+j}(\lambda|x|)$, which is dominated pointwise by the kernel of $\lambda^{-j} I_2 + \lambda^{\frac{n-3}{2}} I_{\frac{n+1}{2}+j}$. Thus, for the kernels, we have the pointwise inequality
\[
\Big| \Big(\frac{d}{d\lambda}\Big)^j R_0^\pm(\lambda^2) \Big| \lesssim \lambda^{-j} \langle\lambda\rangle^{j+\frac{n-3}{2}} \big(I_2 + I_{\frac{n+1}{2}+j}\big). \tag{3.1}
\]
The claim follows from Propositions 3.2 and 3.3.


The estimate above is based entirely on the size of the integral kernel of $R_0^\pm(\lambda^2)$ and its derivatives and completely ignores the oscillatory nature of these functions. If one takes advantage of this oscillation using Fourier analysis techniques, the result is a much more subtle mapping estimate known as the Limiting Absorption Principle for the free resolvent (see [1, 13, Theorem XIII.33]).

Lemma 3.5. Choose any $\sigma, \sigma' > \frac12$ and $\varepsilon > 0$. Then for all $\lambda \ge 1$,
\[
\|R_0^\pm(\lambda^2) f\|_{L^{2,-\sigma'}} \lesssim \lambda^{-1+\varepsilon} \|f\|_{L^{2,\sigma}}.
\]

Proof (Sketch of Proof). First, one shows that $R_0^\pm(1)$ is a bounded operator from $L^{2,\sigma}$ to $L^{2,-\sigma'}$. One characterization of $R_0^\pm(1)$ is that it multiplies the Fourier transform of $f$ by the distribution $m(\xi) = \frac{c_n}{|\xi|^2 - 1} \pm C_n i\,\delta_0(|\xi|^2 - 1)$. If $f \in L^{2,\sigma}$, then $\hat f \in H^{\sigma}(\mathbb{R}^n)$. As $\sigma > \frac12$, the Trace Theorem (see [1 or 11, Theorem IX.39]) implies that $\hat f$ will restrict to an $L^2$ function on surfaces of codimension 1. The surface of particular interest here is the unit sphere, where $m(\xi)$ becomes singular. After a partition of unity decomposition and smooth changes of variables, each sector of the sphere can be mapped to a subset of the hyperplane $\{\xi_1 = 0\}$. Under the same change of variables, the singular part of $m(\xi)$ takes the form $m(\xi) = \frac{1}{\xi_1} \pm i\,\delta_0(\xi_1)$. This reduces matters to a one-dimensional problem. In $\mathbb{R}$, multiplying the Fourier transform by $\frac{1}{\xi_1}$ or by a delta-function are integration operators which map $L^1$ to $L^\infty$ and consequently also map $L^{2,\sigma}$ to $L^{2,-\sigma'}$ provided $\sigma, \sigma' > \frac12$. The kernel of $R_0^\pm(\lambda^2)$ is simply a dilation of $R_0^\pm(1)$; see (2.4). A straightforward scaling argument shows that
\[
\|R_0^\pm(\lambda^2) f\|_{L^{2,-\sigma'}} \lesssim \lambda^{\sigma+\sigma'-2} \|f\|_{L^{2,\sigma}}
\]
for all $\lambda \ge 1$. Finally, one can use the embeddings $L^{2,\sigma} \subset L^{2,\min(\sigma,\frac{1+\varepsilon}{2})}$ and $L^{2,-\min(\sigma',\frac{1+\varepsilon}{2})} \subset L^{2,-\sigma'}$ to obtain the desired power of decay in $\lambda$.
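The scaling step in the sketch above can be written out as follows. This is a hedged reconstruction: it assumes the dilation relation $R_0^\pm(\lambda^2)(x,y) = \lambda^{n-2} R_0^\pm(1)(\lambda x, \lambda y)$, which is what the reference to (2.4) is taken to mean, and the rescaled function $f_\lambda$ is notation introduced only for this illustration.

```latex
% Assumption: R_0^\pm(\lambda^2)(x,y) = \lambda^{n-2} R_0^\pm(1)(\lambda x,\lambda y), with \lambda \ge 1.
% Notation introduced here: f_\lambda(u) := f(u/\lambda).
\begin{align*}
R_0^\pm(\lambda^2) f(x)
  &= \lambda^{n-2}\int R_0^\pm(1)(\lambda x,\lambda y)\, f(y)\,dy
   = \lambda^{-2}\,\big[R_0^\pm(1) f_\lambda\big](\lambda x),\\
\|g(\lambda\,\cdot)\|_{L^{2,-\sigma'}}
  &\le \lambda^{\sigma'-\frac n2}\,\|g\|_{L^{2,-\sigma'}}
  &&\text{since } \langle u/\lambda\rangle^{-\sigma'} \le \lambda^{\sigma'}\langle u\rangle^{-\sigma'},\\
\|f_\lambda\|_{L^{2,\sigma}}
  &\le \lambda^{\sigma+\frac n2}\,\|f\|_{L^{2,\sigma}}
  &&\text{since } \langle \lambda y\rangle^{\sigma} \le \lambda^{\sigma}\langle y\rangle^{\sigma},\\
\|R_0^\pm(\lambda^2) f\|_{L^{2,-\sigma'}}
  &\lesssim \lambda^{-2}\,\lambda^{\sigma'-\frac n2}\,\lambda^{\sigma+\frac n2}\,\|f\|_{L^{2,\sigma}}
   = \lambda^{\sigma+\sigma'-2}\,\|f\|_{L^{2,\sigma}}.
\end{align*}
% With \sigma = \sigma' = (1+\varepsilon)/2 the exponent is -1+\varepsilon.
```

The last line uses the boundedness of $R_0^\pm(1)$ from $L^{2,\sigma}$ to $L^{2,-\sigma'}$ established in the first part of the sketch.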

Note that Corollary 3.4 and Lemma 3.5 imply that the free resolvent and its derivatives map functions with good decay at infinity to functions with less decay. If this is composed with multiplication by a potential $V(x)$ with sufficient decay at infinity, the resulting operator will be bounded from certain weighted spaces to themselves.

Corollary 3.6. Let $j$ be a nonnegative integer and suppose $|V(x)| \le C\langle x\rangle^{-\beta}$ for some $\beta > \max(\frac{n+1}{2} + j,\, 2j + 1)$. Then for every $j + \frac12 < \sigma < \beta - (j + \frac12)$,
\[
\Big\| \Big(\frac{d}{d\lambda}\Big)^j R_0^\pm(\lambda^2) V f \Big\|_{L^{2,-\sigma}} \lesssim
\begin{cases}
\langle\lambda\rangle^{-1+\varepsilon} \|f\|_{L^{2,-\sigma}}, & \text{if } j = 0,\\
\lambda^{-j} \langle\lambda\rangle^{j+\frac{n-3}{2}} \|f\|_{L^{2,-\sigma}}, & \text{if } j \ge 1.
\end{cases} \tag{3.2}
\]

Remark 3.7. It is possible to mimic the proof of the Limiting Absorption Principle to prove stronger estimates in the cases where $1 \le j < \frac{n-1}{2}$. These are interesting in their own right, but will not be needed here.

As mentioned in the introduction, the kernel of the free resolvent is not locally square integrable, which places it outside the context of the mapping estimates above. However, as the next results demonstrate, the kernel associated to $[V R_0^\pm]^m$ belongs to a weighted $L^2$ space, provided $m$ is big enough and $V$ decays sufficiently rapidly. We start with the following


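Lemma 3.8. Let $\mu$ and $\sigma$ be such that $\mu < n$ and $n < \sigma + \mu$. Then
\[
\int_{\mathbb{R}^n} \frac{dy}{\langle y\rangle^{\sigma} |x - y|^{\mu}} \lesssim
\begin{cases}
\langle x\rangle^{n-\sigma-\mu}, & \sigma < n,\\
\langle x\rangle^{-\mu}, & \sigma > n.
\end{cases}
\]

Proof. We analyze the integral on each of the following three disjoint domains:

Domain 1. $|y| \le \frac{|x|}{2}$. From the triangle inequality we get $|x - y| \sim |x|$; we estimate the contribution of this domain to the integral by
\[
|x|^{-\mu} \int_{|y| \le \frac{|x|}{2}} \langle y\rangle^{-\sigma}\,dy \lesssim |x|^{-\mu} |x|^{n} \langle x\rangle^{-\sigma} \lesssim
\begin{cases}
\langle x\rangle^{n-\sigma-\mu}, & \sigma < n,\\
\langle x\rangle^{-\mu}, & \sigma > n.
\end{cases}
\]

Domain 2. $|x - y| \le \frac{|x|}{2}$. On this domain $|y| \sim |x|$ and we estimate its contribution to the integral by
\[
\int_{|x-y| \le \frac{|x|}{2}} \frac{dy}{\langle x\rangle^{\sigma} |x - y|^{\mu}} = \langle x\rangle^{-\sigma} \int_0^{\frac{|x|}{2}} \frac{r^{n-1}\,dr}{r^{\mu}} \lesssim \langle x\rangle^{n-\sigma-\mu},
\]
where the inequality holds because $\mu < n$.

Domain 3. $|y| > \frac{|x|}{2}$ and $|x - y| > \frac{|x|}{2}$. The triangle inequality yields $|x - y| \sim |y|$ and as $n - \sigma - \mu < 0$, we obtain the estimate
\[
\int_{|y| > \frac{|x|}{2}} \langle y\rangle^{-\sigma} |y|^{-\mu}\,dy \lesssim \langle x\rangle^{n-\sigma-\mu},
\]
by treating $|x| \le 1$ and $|x| > 1$ separately.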

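Proposition 3.9. Suppose $|V(x)| \le C\langle x\rangle^{-\beta}$ for some $\beta > n + 3$. Then for any integer $0 \le j \le \frac{n}{2} + 1$ and any pair $(p, q)$ such that either $1 < p < \frac{2n}{n+3}$ and $\frac1q = \frac1p - \frac2n$, or $p = 1$ and $1 \le q < \frac{n}{n-2}$, we have
\[
\Big\| V \Big(\frac{d}{d\lambda}\Big)^j R_0^\pm(\lambda^2) f \Big\|_{L^{1,\frac32} \cap L^q} \lesssim \lambda^{-j} \langle\lambda\rangle^{j+\frac{n-3}{2}} \|f\|_{L^{1,\frac32} \cap L^p}.
\]

Proof. In view of (3.1), we need only prove estimates for the operator $V I_k$ for certain $2 \le k \le n + \frac32$. The weighted $L^1$ estimate follows from
\[
\sup_{x \in \mathbb{R}^n} \langle x\rangle^{-\frac32} \int_{\mathbb{R}^n} \langle y\rangle^{\frac32-\beta} |x - y|^{k-n}\,dy \lesssim 1,
\]
which is a direct consequence of Lemma 3.8 with $\sigma = \beta - \frac32$ and $\mu = n - k$. We turn now to the smoothing estimate. Consider first the case $p = 1$. Lemma 3.8 with $\sigma = q\beta$ and $\mu = q(n - k)$ implies that for $1 \le q < \frac{n}{n-2}$, we have
\[
\int |V(x)|^q |x - y|^{q(k-n)}\,dx \lesssim \langle y\rangle^{\frac{3q}{2}}.
\]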


Note that the upper bound on $q$ is dictated by $k = 2$. Thus, in the case $p = 1$, the claim follows from Minkowski's inequality:
\[
\Big\| V(x) \int |x - y|^{k-n} |f(y)|\,dy \Big\|_{L^q_x} \lesssim \int |f(y)| \langle y\rangle^{\frac32}\,dy \lesssim \|f\|_{L^{1,\frac32}}.
\]
Lastly, we treat the case $1 < p < \frac{2n}{n+3}$. Note that given $p$, the choice of $q$ is governed by the Hardy-Littlewood-Sobolev inequality for $I_2$. As $V \in L^\infty$, we obtain
\[
\|V I_2(f)\|_{L^q} \lesssim \|f\|_{L^p} \lesssim \|f\|_{L^{1,\frac32} \cap L^p}.
\]
It remains to consider $I_k$ with $k = \frac{n+1}{2} + j$. For $0 \le j < \frac{n-1}{2}$, by the Hardy-Littlewood-Sobolev inequality and the fact that $V \in L^1 \cap L^\infty$, we get
\[
\|V I_{\frac{n+1}{2}+j}(f)\|_{L^q} \lesssim \|V\|_{\frac{2pn}{n-3+2j}} \|I_{\frac{n+1}{2}+j}(f)\|_{\frac{2pn}{n-1-2j}} \lesssim \|f\|_{L^p} \lesssim \|f\|_{L^{1,\frac32} \cap L^p}.
\]
For the remaining values of $j$, i.e., $\frac{n-1}{2} \le j \le \frac{n}{2} + 1$, we use again Lemma 3.8 with $\sigma = q\beta$ and $\mu = q(n - k)$ to obtain
\[
\int |V(x)|^q |x - y|^{q(k-n)}\,dx \lesssim \langle y\rangle^{\frac{3q}{2}}
\]
for $1 \le q < \frac{2n}{n-1}$. For the values of $p$ currently under consideration, $q$ is guaranteed to lie in this range. Another application of Minkowski's inequality yields
\[
\|V I_{\frac{n+1}{2}+j}(f)\|_{L^q} \lesssim \|f\|_{L^{1,\frac32}} \lesssim \|f\|_{L^{1,\frac32} \cap L^p}.
\]
This completes the proof of the proposition.

Proposition 3.10. For any $0 \le j \le \frac{n}{2} + 1$ and $\sigma > j + \frac12$,
\[
\Big\| \Big(\frac{d}{d\lambda}\Big)^j R_0^\pm(\lambda^2) f \Big\|_{L^{2,-\sigma}} \lesssim \lambda^{-j} \langle\lambda\rangle^{j+\frac{n-3}{2}} \|f\|_{L^{1,\frac32} \cap L^2}.
\]

Proof. We use the estimate (3.1) and split the resolvent kernel into two pieces, according to whether $|x - y| < 1$ or $|x - y| \ge 1$. The piece supported away from the diagonal $x = y$ maps $L^1$ into $L^{2,-\sigma}$ because of the bound
\[
\sup_{x \in \mathbb{R}^n} \int_{|x-y| \ge 1} \frac{dy}{|x - y|^{2(n-k)} \langle y\rangle^{2\sigma}} \lesssim 1,
\]
valid for any $k \le \frac{n+1}{2} + j$. The piece supported close to the diagonal $x = y$ is a convolution against an integrable function and hence it maps $L^2$ to itself.

If the map $V R_0^\pm(\lambda^2)$, or one of its derivatives (with respect to $\lambda$), is applied enough times to a locally integrable function with fast decay, the result will be locally in $L^2$. Any subsequent applications of the free resolvent will yield functions in weighted $L^2$ spaces. Each time the Limiting Absorption Principle is invoked, it improves the norm bounds by a factor of $\lambda^{-1+\varepsilon}$ until eventually, some polynomial decay in $\lambda$ is achieved. Our primary estimate of this form is given below.


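Corollary 3.11. Suppose $|V(x)| \le C\langle x\rangle^{-\beta}$ for some $\beta > n + 3$. Let $m_0 > \frac{n^2}{2}$ and $0 \le j \le \frac{n}{2} + 1$. Then
\[
\Big\| \Big(\frac{d}{d\lambda}\Big)^j \big[V R_0^\pm(\lambda^2)\big]^{m_0} f \Big\|_{L^{2,\sigma}} \lesssim \lambda^{-j} \langle\lambda\rangle^{j+1-2n} \|f\|_{L^{1,\frac32}} \tag{3.3}
\]
for any $\sigma < \beta - (\frac{n+3}{2})$.

Proof. The lower bound of $\frac{n^2}{2}$ is not intended to be sharp and was obtained in the following manner: It requires about $\frac{n}{4}$ iterations of $V R_0^\pm(\lambda^2)$ to smooth an integrable function to local $L^2$ behavior (see Proposition 3.9) and one more to reach a weighted $L^2$ space (see Proposition 3.10). Also, $\frac{n}{2} + 1$ powers of $V R_0^\pm(\lambda^2)$ can be lost to derivatives which we bound using Corollary 3.4. For each of these $\frac{3n+8}{4}$ operations, we have established only a crude bound which grows like $\langle\lambda\rangle^{\frac{n-3}{2}}$. According to Lemma 3.5, each time the Limiting Absorption Principle is invoked, this reduces the degree of polynomial growth by $1 - \varepsilon$, so it needs to be done approximately $\frac{(3n+8)(n-3)}{8} + 2n - 1$ times. Setting $m_0 > \frac{n^2}{2}$ is sufficient to obtain (3.3).

We will also need the following mapping properties of $\operatorname{Im} R_0(\lambda^2)$.

Proposition 3.12. Let $0 \le j \le \frac{n}{2} + 1$. Then, for $\sigma > \frac{n+3}{2}$ we have
\[
\Big\| \Big(\frac{d}{d\lambda}\Big)^j \operatorname{Im} R_0(\lambda^2) \Big\|_{L^{2,\sigma} \to L^{2,-\sigma}} \lesssim \lambda^{n-2-j} \langle\lambda\rangle^{\frac32}. \tag{3.4}
\]
Moreover, assuming $|V(x)| \le C\langle x\rangle^{-\beta}$ for some $\beta > n + 3$, we have
\[
\Big\| V \Big(\frac{d}{d\lambda}\Big)^j \operatorname{Im} R_0(\lambda^2) \Big\|_{L^{1,\frac32} \to L^{1,\frac32}} \lesssim \lambda^{n-2-j} \langle\lambda\rangle^{\frac32} \tag{3.5}
\]
while, for $m \ge 2m_0 > n^2$, $\sigma > \frac{n+3}{2}$, and $\beta > 2\sigma$, we have
\[
\Big\| \Big(\frac{d}{d\lambda}\Big)^j \operatorname{Im}\big[V R_0^+(\lambda^2)\big]^m f \Big\|_{L^{2,\sigma}} \lesssim \lambda^{n-2-j} \langle\lambda\rangle^{j+\frac52-2n+\frac{n^2(n-3)}{4}} \|f\|_{L^{1,\frac32}}. \tag{3.6}
\]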

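Proof. From (2.3), we have the following formula for the kernel of $\operatorname{Im} R_0(\lambda^2)$:
\[
\operatorname{Im} R_0(\lambda^2)(x, y) = a_{n-2,\frac{n-3}{2}}(\lambda|x - y|)\, \frac{e^{\pm i\lambda|x-y|}}{|x - y|^{n-2}}.
\]
Derivatives can affect $\operatorname{Im} R_0(\lambda^2)$ in two ways: Whenever a derivative falls on the symbol, this has the effect of reducing the power of $\lambda$ by one. If a derivative falls on the phase, this has the effect of increasing the power of $|x - y|$ by one. Hence, using the calculus of the symbols $a_{i,j}$, we get
\begin{align*}
\Big(\frac{d}{d\lambda}\Big)^j \operatorname{Im} R_0(\lambda^2)(x, y)
&= \sum_{\substack{j_1 + j_2 = j\\ j_1, j_2 \ge 0}} \lambda^{-j_1} a_{n-2,\frac{n-3}{2}}(\lambda|x - y|)\, \frac{e^{\pm i\lambda|x-y|}}{|x - y|^{n-2-j_2}}\\
&= \sum_{\substack{j_1 + j_2 = j\\ j_1, j_2 \ge 0}} \lambda^{n-2-j} a_{j_2,\, j_2-\frac{n-1}{2}}(\lambda|x - y|)\, e^{\pm i\lambda|x-y|}.
\end{align*}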


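Thus,
\[
\Big| \Big(\frac{d}{d\lambda}\Big)^j \operatorname{Im} R_0(\lambda^2)(x, y) \Big| \lesssim \lambda^{n-2-j} \langle \lambda|x - y| \rangle^{j-\frac{n-1}{2}}. \tag{3.7}
\]
The estimate (3.4) follows from (3.7) and
\[
\iint \frac{\langle x\rangle^{-2\sigma} \langle y\rangle^{-2\sigma}}{\langle \lambda|x - y| \rangle^{n-1-2j}}\,dx\,dy \lesssim \langle\lambda\rangle^{3}. \tag{3.8}
\]
For $0 \le j \le \frac{n-1}{2}$, (3.8) follows from the bound $\langle \lambda|x - y| \rangle^{-n+1+2j} \lesssim 1$; the resulting integral is finite whenever $\sigma > \frac{n}{2}$. For $\frac{n-1}{2} < j \le \frac{n}{2} + 1$, we first apply Lemma 3.8 to the integral in the variable $y$ to obtain
\[
\int \frac{\langle y\rangle^{-2\sigma}}{\langle \lambda|x - y| \rangle^{n-1-2j}}\,dy \lesssim \langle\lambda\rangle^{-n+1+2j} \int \frac{\langle y\rangle^{-2\sigma}}{|x - y|^{n-1-2j}}\,dy \lesssim \langle\lambda\rangle^{3} \langle x\rangle^{-n+1+2j}.
\]
The remaining integral in the variable $x$ is finite under our assumptions on $\sigma$. In view of (3.7), the estimate (3.5) follows from
\[
\sup_{x \in \mathbb{R}^n} \langle x\rangle^{-\frac32} \int \frac{dy}{\langle y\rangle^{\beta-\frac32} \langle \lambda|x - y| \rangle^{\frac{n-1}{2}-j}} \lesssim 1. \tag{3.9}
\]
To see (3.9) one considers separately the cases $0 \le j \le \frac{n-1}{2}$ and $\frac{n-1}{2} < j \le \frac{n}{2} + 1$, bounding $\langle \lambda|x - y| \rangle^{-\frac{n-1}{2}+j} \lesssim 1$ in the former case and applying Lemma 3.8 with $\sigma = \beta - \frac32$ and $\mu = \frac{n-1}{2} - j$ in the latter case.

We turn now to (3.6). We rewrite $\operatorname{Im}[V R_0^+(\lambda^2)]^m = [V R_0^+(\lambda^2)]^m - [V R_0^-(\lambda^2)]^m$ using the following algebraic identity:
\[
\prod_{k=0}^{M} A_k^+ - \prod_{k=0}^{M} A_k^- = \sum_{l=0}^{M} \Big( \prod_{k=0}^{l-1} A_k^- \Big) \big( A_l^+ - A_l^- \big) \Big( \prod_{k=l+1}^{M} A_k^+ \Big). \tag{3.10}
\]
Then,
\[
\operatorname{Im}[V R_0^+(\lambda^2)]^m = \sum_{\substack{m_1 + m_2 = m\\ m_1, m_2 \ge 0}} [V R_0^-(\lambda^2)]^{m_1}\, V \operatorname{Im} R_0\, [V R_0^+(\lambda^2)]^{m_2}. \tag{3.11}
\]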
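The telescoping identity (3.10) is purely algebraic and holds for non-commuting factors. A minimal sketch, assuming 2x2 integer matrices as stand-ins for the operators $A_k^\pm$ (the sample matrices and the helper names `prod` and `telescoping_rhs` are hypothetical), checks it exactly:

```python
def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]

def prod(ms):
    """Ordered product A_0 A_1 ... of a list of matrices (empty product = I)."""
    out = IDENTITY
    for m in ms:
        out = mul(out, m)
    return out

def telescoping_rhs(plus, minus):
    """Right-hand side of (3.10): minus-factors, then one difference, then plus-factors."""
    total = ZERO
    for l in range(len(plus)):
        left = prod(minus[:l])
        middle = sub(plus[l], minus[l])
        right = prod(plus[l + 1:])
        total = add(total, mul(mul(left, middle), right))
    return total

# Arbitrary non-commuting sample data (assumptions, M = 3).
A_plus = [[[1, 2], [0, 1]], [[2, 0], [1, 1]], [[0, 1], [1, 3]], [[1, 1], [2, 0]]]
A_minus = [[[1, -2], [0, 1]], [[2, 0], [-1, 1]], [[0, -1], [1, 3]], [[1, -1], [2, 0]]]

lhs = sub(prod(A_plus), prod(A_minus))
rhs = telescoping_rhs(A_plus, A_minus)
```

Each summand carries exactly one difference $A_l^+ - A_l^-$, which in the text becomes the factor $R_0^+(\lambda^2) - R_0^-(\lambda^2)$ responsible for the gain of $\lambda^{n-2}$.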

We treat the cases $m_1 < m_0$ and $m_2 < m_0$ separately. In the first case, use Corollary 3.11 for $[V R_0^+(\lambda^2)]^{m_2}$, (3.4), and Corollary 3.4 for $[V R_0^-(\lambda^2)]^{m_1}$ to derive the claim. In the second case, use the weighted $L^1$ bound in Proposition 3.9 for $[V R_0^+(\lambda^2)]^{m_2}$, (3.5), and Corollary 3.11 for $[V R_0^-(\lambda^2)]^{m_1}$ to obtain (3.6).

We also record the following lemma whose proof is just an exercise in integration by parts:

Lemma 3.13. Given $a \in C_c^\infty(\mathbb{R} \setminus \{0\})$, we have
\[
\Big| \int_{\mathbb{R}} e^{it\lambda^2} \lambda\, a(\lambda)\,d\lambda \Big| \lesssim |t|^{-N} \sum_{s=0}^{N} \Big| \int_{\mathbb{R}} e^{it\lambda^2} \lambda^{s+1-2N} a^{(s)}(\lambda)\,d\lambda \Big|,
\]
for every $N \ge 0$.
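For orientation, the first two cases of Lemma 3.13 can be worked out by hand. The following is a sketch, not the authors' proof; it uses only the identity $e^{it\lambda^2}\lambda = (2it)^{-1}\frac{d}{d\lambda}e^{it\lambda^2}$ and the fact that $a$ vanishes near $\lambda = 0$ and outside a compact set, so that no boundary terms appear and division by $\lambda$ is harmless.

```latex
% N = 1: one integration by parts.
\begin{align*}
\int_{\mathbb{R}} e^{it\lambda^2}\lambda\, a(\lambda)\,d\lambda
  &= \frac{1}{2it}\int_{\mathbb{R}} \Big(\frac{d}{d\lambda}e^{it\lambda^2}\Big) a(\lambda)\,d\lambda
   = -\frac{1}{2it}\int_{\mathbb{R}} e^{it\lambda^2} a'(\lambda)\,d\lambda .
\end{align*}
% N = 2: write a'(\lambda) = \lambda \cdot (a'(\lambda)/\lambda) and repeat.
\begin{align*}
-\frac{1}{2it}\int_{\mathbb{R}} e^{it\lambda^2}\lambda\,\frac{a'(\lambda)}{\lambda}\,d\lambda
  &= \frac{1}{(2it)^2}\int_{\mathbb{R}} e^{it\lambda^2}\,\frac{d}{d\lambda}\Big(\frac{a'(\lambda)}{\lambda}\Big)d\lambda
   = \frac{1}{(2it)^2}\int_{\mathbb{R}} e^{it\lambda^2}\big(\lambda^{-1}a''(\lambda) - \lambda^{-2}a'(\lambda)\big)d\lambda .
\end{align*}
% The powers \lambda^{-1} (s = 2) and \lambda^{-2} (s = 1) match the exponent s + 1 - 2N with N = 2.
```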


4. Dispersive Estimate for the Final Term

In this section we will show that the tail (2.6) of the finite Born series expansion obeys dispersive estimates for any potential $V$ satisfying $|V(x)| \lesssim \langle x\rangle^{-\beta}$, provided we take $\beta$ and $m$ large enough.

Theorem 4.1. Assume that the potential $V$ satisfies $|V(x)| \lesssim \langle x\rangle^{-\beta}$ for some $\beta > \frac{3n+5}{2}$ and that $m > n^2$. Then
\[
\sup_{x, y \in \mathbb{R}^n} \Big| \int_0^\infty e^{it\lambda^2} \lambda\, \operatorname{Im}\Big[ R_0^+(\lambda^2) V [R_0^+(\lambda^2) V]^m S^+(\lambda^2) R_0^+(\lambda^2) [V R_0^+(\lambda^2)]^m V R_0^+(\lambda^2) \Big](x, y)\,d\lambda \Big| \lesssim |t|^{-\frac{n}{2}}. \tag{4.1}
\]

Remark 4.2. The condition $\beta > \frac{3n+5}{2}$ is not intended to be sharp. Since the function we eventually construct as a counterexample has compact support, decay conditions are not a matter of primary concern.

There are numerous oscillatory components in this integral, which suggests the use of stationary phase methods. Although it appears natural to take the critical point to be $\lambda = 0$, this turns out not to be the best choice. Define the functions
\[
G_{\pm,x}(\lambda^2)(\cdot) := e^{\mp i\lambda|x|} R_0^\pm(\lambda^2)(\cdot, x).
\]
The expression in (4.1) can be rewritten as $I^+(t, x, y) - I^-(t, x, y)$, where
\begin{align*}
I^\pm(t, x, y) &:= \int_0^\infty e^{it\lambda^2} e^{\pm i\lambda(|x|+|y|)} \lambda\, \Big\langle S^\pm(\lambda^2) R_0^\pm(\lambda^2) [V R_0^\pm(\lambda^2)]^m V G_{\pm,y}(\lambda^2),\ [V R_0^\mp(\lambda^2)]^m V G_{\mp,x}(\lambda^2) \Big\rangle\,d\lambda\\
&= \int_0^\infty e^{it\lambda^2} e^{\pm i\lambda(|x|+|y|)}\, b^\pm_{x,y}(\lambda^2)\,d\lambda. \tag{4.2}
\end{align*}
It suffices to show that $|I^+(t, x, y) - I^-(t, x, y)| \lesssim |t|^{-\frac{n}{2}}$ uniformly in $x$ and $y$. The first step is to establish some properties (including existence) of the operators $S^\pm(\lambda^2)$. This is the crux of the Limiting Absorption Principle for perturbed resolvents. We sketch the details below.

Proposition 4.3. Suppose $|V(x)| \le C\langle x\rangle^{-\beta}$ for some $\beta > \frac{n+1}{2}$ and also that zero energy is neither an eigenvalue nor a resonance of $H = -\Delta + V$. Then
\[
\sup_{\lambda \ge 0} \|S^\pm(\lambda^2)\|_{L^{2,-\sigma} \to L^{2,-\sigma}} < \infty
\]

for all $\sigma \in (\frac12, \beta - \frac12)$.

Proof. Under our assumptions, (3.1) and Proposition 3.2 imply that $R_0^\pm(\lambda^2) V$ is a compact operator on the space $L^{2,-\sigma}$. The Fredholm alternative then guarantees the existence of $S^\pm(\lambda^2)$ unless there exists a nonzero function $g \in L^{2,-\sigma}$ satisfying $g = -R_0^\pm(\lambda^2) V g$. For $\lambda > 0$, as $g = -R_0^\pm(\lambda^2) V g$ is formally equivalent to $(-\Delta + V) g = \lambda^2 g$, it follows by a theorem of Agmon [1] (see also [12, Sect. XIII.8]) that $g$ is in fact an eigenfunction, that is, $g \in L^2$. As positive embedded eigenvalues do not exist by Kato's theorem (see, for example, [12, Sect. XIII.8]), we must have $g \equiv 0$.


When $\lambda = 0$, the free resolvent $R_0(0)$ is a scalar multiple of $I_2$. Since we are in dimension $n \ge 4$, it is possible to improve the decay of $g$ by a bootstrap argument to obtain $g \in L^{2,-\sigma}$ for all $\sigma > 0$; in dimension $n \ge 5$, it is in fact possible to bootstrap all the way to $g \in L^2$. In other words, zero energy would have to be either an eigenvalue or a resonance of $H$, contradicting our assumptions. Thus, we must have $g \equiv 0$.

To obtain a uniform bound for $S^\pm(\lambda^2)$, note that by Lemma 3.5 we have $\|R_0^\pm(\lambda^2) V\|_{L^{2,-\sigma} \to L^{2,-\sigma}} \lesssim \langle\lambda\rangle^{-1+\varepsilon}$. Thus $I + R_0^\pm(\lambda^2) V$ converges to the identity as $\lambda \to \infty$. Its inverse, $S^\pm(\lambda^2)$, will thus have operator norm less than 2 for all $\lambda > \lambda_0$. On the remaining interval, $\lambda \in [0, \lambda_0]$, observe that the family of operators $R_0^\pm(\lambda^2)$ varies continuously with $\lambda$. By continuity of inverses, $S^\pm(\lambda^2)$ is continuous and bounded on this compact interval.

Derivatives of $S^\pm(\lambda^2)$ can be taken using the identity
\[
\frac{d}{d\lambda} S^\pm(\lambda^2) = -S^\pm(\lambda^2) \Big[ \frac{d}{d\lambda} R_0^\pm(\lambda^2) \Big] V S^\pm(\lambda^2).
\]
From this, Corollary 3.4, and Proposition 4.3, it follows that for $1 \le j \le \frac{n}{2} + 1$,
\[
\Big\| \Big(\frac{d}{d\lambda}\Big)^j S^\pm(\lambda^2) \Big\|_{L^{2,-\sigma} \to L^{2,-\sigma}} \lesssim \lambda^{-j} \langle\lambda\rangle^{j+\frac{n-3}{2}}, \tag{4.3}
\]
provided $\frac12 + j < \sigma < \beta - (\frac12 + j)$ and $\beta > \frac{n+1}{2} + j$. Moreover, it becomes clear that $R_V^\pm(\lambda^2) = S^\pm(\lambda^2) R_0^\pm(\lambda^2)$ and its derivatives have mapping properties comparable to those of the free resolvent.

We now have estimates for every object in (4.2) except for the functions $G_{\pm,y}(\lambda^2)$. These follow from another straightforward computation.

Proposition 4.4. Suppose $|V(x)| \le C\langle x\rangle^{-\beta}$ for some $\beta > \frac{3n+5}{2}$. Then for each $0 \le j \le \frac{n}{2} + 1$,
\[
\Big\| V(\cdot) \Big(\frac{d}{d\lambda}\Big)^j G_{\pm,y}(\lambda^2)(\cdot) \Big\|_{L^{1,\frac32}} \lesssim \frac{\lambda^{-j}}{\langle y\rangle^{n-2}} + \frac{\lambda^{\frac{n-3}{2}-j}}{\langle y\rangle^{\frac{n-1}{2}}} + \frac{\lambda^{\frac{n-3}{2}}}{\langle y\rangle^{\frac{n-1}{2}}}. \tag{4.4}
\]

Proof. Write out the function $G_{\pm,y}(\lambda^2)$ in the form
\[
G_{\pm,y}(\lambda^2)(x) = a_{0,\frac{n-3}{2}}(\lambda|x - y|)\, \frac{e^{\pm i\lambda(|x-y|-|y|)}}{|x - y|^{n-2}}.
\]
Derivatives can affect $G_{\pm,y}$ in one of two ways. Whenever a derivative falls on the symbol, it has the effect of reducing the power of $\lambda$ by one (this property was utilized previously in Sect. 3). When derivatives fall on the exponential factor, the effect is to multiply by $|x - y| - |y|$, which is smaller than $\langle x\rangle$. Thus, for $0 \le j \le \frac{n}{2} + 1$,
\[
\Big(\frac{d}{d\lambda}\Big)^j G_{\pm,y}(\lambda^2)(x) = \sum_{\substack{j_1 + j_2 = j\\ j_1, j_2 \ge 0}} \lambda^{-j_1} a_{0,\frac{n-3}{2}}(\lambda|x - y|)\, \frac{(|x - y| - |y|)^{j_2}}{|x - y|^{n-2}}\, e^{\pm i\lambda(|x-y|-|y|)},
\]


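and hence
\[
\Big| \Big(\frac{d}{d\lambda}\Big)^j G_{\pm,y}(\lambda^2)(x) \Big| \lesssim \sum_{\substack{j_1 + j_2 = j\\ j_1, j_2 \ge 0}} \Big( \lambda^{-j_1} \frac{\langle x\rangle^{j_2}}{|x - y|^{n-2}} + \lambda^{\frac{n-3}{2}-j_1} \frac{\langle x\rangle^{j_2}}{|x - y|^{\frac{n-1}{2}}} \Big).
\]
The result now follows from Lemma 3.8 provided $\beta > \frac{3n+5}{2}$.

Proof of Theorem 4.1. Consider first what happens if $|t| \le 4$. The bounds established in Corollary 3.11 (for $j = 0$), Proposition 4.3, and Proposition 4.4 (for $j = 0$) show that the function $b^\pm_{x,y}(\lambda^2)$ in (4.2) is smaller than $\langle\lambda\rangle^{-2}$ uniformly in $x$ and $y$. This bounds the value of $I^+(t, x, y) - I^-(t, x, y)$ by a constant, which is less than $|t|^{-\frac{n}{2}}$ as desired. For the remainder of the calculation we will assume that $|t| > 4$.

Let $\rho : \mathbb{R} \to \mathbb{R}$ be a smooth even cutoff function which is identically one on the interval $[-1, 1]$ and identically zero outside $[-2, 2]$. Let $b^\pm_{x,y,1}(\lambda^2) := \rho(|t|^{\frac12} \lambda)\, b^\pm_{x,y}(\lambda^2)$ and $b^\pm_{x,y,2} := b^\pm_{x,y} - b^\pm_{x,y,1}$ and define $I_1^\pm(t, x, y)$, $I_2^\pm(t, x, y)$ accordingly. For simplicity, the dependence on $x$ and $y$ will be suppressed whenever possible. We consider the integrals $I_2^\pm(t, x, y)$ first.

Case 1. $|x| + |y| \ge |t|$. At least one of $|x|$, $|y|$ is greater than $\frac{|t|}{2}$; without loss of generality assume it is $|y|$. Then $|y|^{-1} \le 2|t|^{-1} < |t|^{-\frac12}$ and hence $|y|^{-1}$ does not belong to $\operatorname{supp} b_2^\pm$. Moreover, for $\lambda \ge |y|^{-1}$, Proposition 4.4 yields the bound
\[
\Big\| V \Big(\frac{d}{d\lambda}\Big)^j G_{\pm,y}(\lambda^2) \Big\|_{L^{1,\frac32}} \lesssim \frac{\lambda^{\frac{n-3}{2}-j} \langle\lambda\rangle^{j}}{\langle y\rangle^{\frac{n-1}{2}}} \lesssim |t|^{\frac{1-n}{2}} \lambda^{\frac{n-3}{2}-j} \langle\lambda\rangle^{j}. \tag{4.5}
\]
To bound $G_{\pm,x}(\lambda^2)$, we use
\[
\Big\| V \Big(\frac{d}{d\lambda}\Big)^j G_{\pm,x}(\lambda^2) \Big\|_{L^{1,\frac32}} \lesssim \lambda^{-j} \langle\lambda\rangle^{j+\frac{n-3}{2}}. \tag{4.6}
\]
No additional improvement can be gained here, because the size of $|x|$ is unknown. By Corollary 3.4, Corollary 3.11, Proposition 4.3, (4.3), (4.5), and (4.6), we can deduce
\[
|b_2^\pm(\lambda^2)| \lesssim |t|^{\frac{1-n}{2}} \lambda^{\frac{n-1}{2}} \langle\lambda\rangle^{-3n-1} \lesssim |t|^{\frac{1-n}{2}} \langle\lambda\rangle^{-\frac{5n+3}{2}}
\]
and
\[
\Big| \frac{d}{d\lambda} b_2^\pm(\lambda^2) \Big| \lesssim |t|^{\frac{1-n}{2}} \lambda^{\frac{n-3}{2}} \langle\lambda\rangle^{-\frac{5n+3}{2}}.
\]
Applying stationary phase around the critical point $\lambda_0 = \mp\frac{|x|+|y|}{2t}$ and integrating by parts once away from the critical point, it follows that $|I_2^\pm(t)| \lesssim |t|^{-\frac{n}{2}}$.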
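The location of the critical point can be read off directly from the phase of the integrand in (4.2). The short computation below is an added illustration; the symbol $\varphi_\pm$ for the phase is notation introduced here and does not appear in the text.

```latex
% Phase of e^{it\lambda^2} e^{\pm i\lambda(|x|+|y|)} (notation \varphi_\pm introduced for this sketch):
\begin{align*}
\varphi_\pm(\lambda) &= t\lambda^2 \pm \lambda(|x|+|y|), \\
\varphi_\pm'(\lambda) &= 2t\lambda \pm (|x|+|y|) = 0
   \iff \lambda = \lambda_0 := \mp\frac{|x|+|y|}{2t}, \\
\varphi_\pm(\lambda) &= t(\lambda-\lambda_0)^2 - \frac{(|x|+|y|)^2}{4t}.
\end{align*}
% Since \varphi_\pm'' = 2t, a window of width \sim |t|^{-1/2} around \lambda_0
% is the natural scale for the stationary phase contribution.
```

The dyadic comparison of $\lambda_0$ with the cutoff scale $|t|^{-\frac12}$ is what separates the three cases treated in the proof.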

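Case 2. $|t|^{\frac12} \le |x| + |y| < |t|$. Again, assume without loss of generality that $|y| \ge \frac12 |t|^{\frac12}$. Therefore, for $\lambda \in \operatorname{supp} b_2^\pm$ we have $|y| \ge \frac12 |\lambda|^{-1}$, which implies
\[
\Big\| V \Big(\frac{d}{d\lambda}\Big)^j G_{\pm,y}(\lambda^2) \Big\|_{L^{1,\frac32}} \lesssim \frac{\lambda^{\frac{n-3}{2}-j} \langle\lambda\rangle^{j}}{\langle y\rangle^{\frac{n-1}{2}}}.
\]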


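For $G_{\pm,x}(\lambda^2)$ we will use (4.6). The critical point for the phase occurs at $\lambda_0 = \mp\frac{|x|+|y|}{2t}$, which is comparable in size to $\frac{|y|}{|t|}$ and greater than $\frac12 |t|^{-\frac12}$. In the interval $[\lambda_0 - \frac14 |t|^{-\frac12},\ \lambda_0 + \frac14 |t|^{-\frac12}]$ we have the size estimate
\[
|b_2^\pm(\lambda^2)| \sim \Big( \frac{|\lambda_0|}{|y|} \Big)^{\frac{n-1}{2}} \sim |t|^{-\frac{n}{2}+\frac12}.
\]
An application of stationary phase yields the desired bound on this interval. Away from the critical point, the derivatives of $b_2^\pm(\lambda^2)$ obey the following bounds
\[
\Big| \Big(\frac{d}{d\lambda}\Big)^j b_2^\pm(\lambda^2) \Big| \lesssim \frac{\lambda^{\frac{n-1}{2}-j} \langle\lambda\rangle^{j-\frac{5(n+1)}{2}}}{\langle y\rangle^{\frac{n-1}{2}}} \tag{4.7}
\]
for all $0 \le j \le \frac{n}{2} + 1$.

Over the intervals $[|t|^{-\frac12},\ \lambda_0 - \frac14 |t|^{-\frac12}]$ and $[\lambda_0 + \frac14 |t|^{-\frac12},\ 2\lambda_0]$, (4.7) becomes
\[
\Big| \Big(\frac{d}{d\lambda}\Big)^j b_2^\pm(\lambda^2) \Big| \lesssim \frac{\lambda_0^{\frac{n-1}{2}}}{\langle y\rangle^{\frac{n-1}{2}}} \lambda^{-j} \langle\lambda\rangle^{j-\frac{5(n+1)}{2}} \lesssim |t|^{-\frac{n}{2}+\frac12} \lambda^{-j}
\]
for all $0 \le j \le \frac{n}{2} + 1$. As on this region $|\lambda - \lambda_0| \gtrsim |t|^{-\frac12}$, each integration by parts in (4.2) gains us a factor of $|t|^{-\frac12}$. Thus, integrating by parts twice (i.e., taking $j = 2$) and recalling that in this case $\lambda_0 \gtrsim |t|^{-\frac12}$, we obtain the desired dispersive estimate.

Over the interval $[2\lambda_0, 1]$ (where $\lambda - \lambda_0 \ge \frac12 \lambda$), we use (4.7) and the assumption $|y| \ge \frac12 |t|^{\frac12}$ to get
\[
\Big| \Big(\frac{d}{d\lambda}\Big)^j b_2^\pm(\lambda^2) \Big| \lesssim \lambda^{\frac{n-1}{2}-j} |t|^{-\frac{n-1}{4}}
\]
for all $0 \le j \le \frac{n}{2} + 1$. To obtain the desired decay in $t$, it is necessary to integrate by parts at least $\frac{n+1}{4}$ times.

On the interval $[1, \infty]$, (4.7) implies that $b_2^\pm(\lambda^2)$ and its derivatives all decay faster than $\lambda^{-2} \langle y\rangle^{\frac{1-n}{2}}$. Using again the assumption $|y| \ge \frac12 |t|^{\frac12}$ and integrating by parts another $\frac{n+1}{4}$ times, we obtain the desired dispersive estimate.

Case 3. $|x|, |y| < |t|^{\frac12}$. This time, the critical point $\lambda_0 = \mp\frac{|x|+|y|}{2t}$ lies outside the support of $b_2^\pm(\lambda^2)$. Therefore, one could safely integrate by parts; however, the lack of a lower bound for $|x|$ and $|y|$ limits the usefulness of estimates like (4.4) in the regime $\lambda < 1$. Without loss of generality, assume $|y| \ge |x|$. For $\lambda \ge 1$, $b_2^\pm(\lambda^2)$ and its derivatives decay rapidly. Indeed, by Corollary 3.4, Corollary 3.11, Proposition 4.3, (4.3), and Proposition 4.4, for $\lambda \ge 1$ and $0 \le j \le \frac{n}{2} + 1$ we get
\[
\Big| \Big(\frac{d}{d\lambda}\Big)^j b_2^\pm(\lambda^2) \Big| \lesssim \lambda^{-2n-3} \Big( \frac{1}{\langle x\rangle^{n-2}} + \frac{1}{\langle x\rangle^{\frac{n-1}{2}}} \Big) \Big( \frac{1}{\langle y\rangle^{n-2}} + \frac{1}{\langle y\rangle^{\frac{n-1}{2}}} \Big).
\]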


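As the powers of $\langle x\rangle$ and $\langle y\rangle$ in the denominator may not make a meaningful contribution (if $x$, $y$ are small), it is necessary to integrate by parts at least $\frac{n}{2}$ times in order to generate the desired $|t|^{-\frac{n}{2}}$ decay or better.

The regime $\lambda \in [\langle y\rangle^{-1}, 1]$ is similar to the interval $[2\lambda_0, 1]$ in the previous case. Indeed,
\[
\Big\| V \Big(\frac{d}{d\lambda}\Big)^j G_{\pm,y}(\lambda^2) \Big\|_{L^{1,\frac32}} \lesssim \frac{\lambda^{\frac{n-3}{2}-j} \langle\lambda\rangle^{j}}{\langle y\rangle^{\frac{n-1}{2}}} \lesssim \frac{\lambda^{\frac{n-3}{2}-j}}{\langle y\rangle^{\frac{n-1}{2}}}
\]
and
\[
\Big\| V \Big(\frac{d}{d\lambda}\Big)^j G_{\pm,x}(\lambda^2) \Big\|_{L^{1,\frac32}} \lesssim \lambda^{-j} \langle\lambda\rangle^{j+\frac{n-3}{2}} \lesssim \lambda^{-j}
\]
for all $0 \le j \le \frac{n}{2} + 1$. Thus,
\[
\Big| \Big(\frac{d}{d\lambda}\Big)^j b_2^\pm(\lambda^2) \Big| \lesssim \frac{\lambda^{\frac{n-1}{2}-j} \langle\lambda\rangle^{j-\frac{5(n+1)}{2}}}{\langle y\rangle^{\frac{n-1}{2}}} \lesssim \frac{\lambda^{\frac{n-1}{2}-j}}{\langle y\rangle^{\frac{n-1}{2}}}
\]
for all $0 \le j \le \frac{n}{2} + 1$. Integrating by parts $\frac{n}{2} \le N \le \frac{n}{2} + 1$ times is more than enough to create polynomial decay in $\lambda$:
\[
|t|^{-N} \int_{\langle y\rangle^{-1}}^{1} (\lambda - \lambda_0)^{-N}\, \frac{\lambda^{\frac{n-1}{2}-N}}{\langle y\rangle^{\frac{n-1}{2}}}\,d\lambda \lesssim |t|^{-N} \langle y\rangle^{2N-n}.
\]
Recalling that in this case we have $|y| < |t|^{\frac12}$, the resulting bound for this piece is $|t|^{-\frac{n}{2}}$.

For the remaining interval, $[|t|^{-\frac12}, \langle y\rangle^{-1}]$, we exploit instead the cancellation between $R_0^+(\lambda^2)$ and $R_0^-(\lambda^2)$ using the algebraic identity (3.10). We apply (3.10) to $I_2^+(t, x, y) - I_2^-(t, x, y)$, where
\begin{align*}
I_2^\pm(t, x, y) &= \int_0^\infty e^{it\lambda^2} \big[ 1 - \rho(|t|^{\frac12} \lambda) \big] \lambda \Big[ R_0^\pm(\lambda^2) V [R_0^\pm(\lambda^2) V]^m S^\pm(\lambda^2) R_0^\pm(\lambda^2) [V R_0^\pm(\lambda^2)]^m V R_0^\pm(\lambda^2) \Big](x, y)\,d\lambda\\
&= \int_0^\infty e^{it\lambda^2} \big[ 1 - \rho(|t|^{\frac12} \lambda) \big] \lambda \Big\langle \delta_y,\ R_0^\pm(\lambda^2) V [R_0^\pm(\lambda^2) V]^m S^\pm(\lambda^2) R_0^\pm(\lambda^2) [V R_0^\pm(\lambda^2)]^m V R_0^\pm(\lambda^2) \delta_x \Big\rangle\,d\lambda\\
&= \int_0^\infty e^{it\lambda^2} c^\pm_{x,y,2}(\lambda^2)\,d\lambda.
\end{align*}
Each term in the resulting sum contains a factor of $R_0^+(\lambda^2) - R_0^-(\lambda^2)$, an integral operator whose kernel is pointwise dominated by $\lambda^{n-2}$ (see (3.7)). This is even true if the cancellation falls on $S^+(\lambda^2)$ because we can write
\[
S^+(\lambda^2) - S^-(\lambda^2) = -S^-(\lambda^2) \big[ R_0^+(\lambda^2) - R_0^-(\lambda^2) \big] V S^+(\lambda^2).
\]
We will integrate by parts $\frac{n+1}{2}$ times if $n$ is odd and $\frac{n}{2} + 1$ times if $n$ is even. Our analysis relies on the estimates of Proposition 3.12.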


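In place of the weighted $L^1$ estimate (4.4), we use the following two bounds for the two possible initial functions on which the resolvents act. For $0 \le j \le \frac{n}{2} + 1$, we have
\begin{align*}
\Big\| V(\cdot) \Big(\frac{d}{d\lambda}\Big)^j R_0^\pm(\lambda^2)(\cdot, y) \Big\|_{L^{1,\frac32}} &\lesssim \frac{\lambda^{-j}}{\langle y\rangle^{n-2}}, \tag{4.8}\\
\Big\| V(\cdot) \Big(\frac{d}{d\lambda}\Big)^j \big[ \operatorname{Im} R_0(\lambda^2) \big](\cdot, y) \Big\|_{L^{1,\frac32}} &\lesssim \lambda^{n-2-j}. \tag{4.9}
\end{align*}
To see (4.8), we use the pointwise bound
\[
\Big| \Big(\frac{d}{d\lambda}\Big)^j R_0^\pm(\lambda^2)(x, y) \Big| \lesssim \lambda^{-j} I_2 + \lambda^{\frac{n-3}{2}} I_{\frac{n+1}{2}+j},
\]
and apply Lemma 3.8 to obtain
\[
\Big\| V(\cdot) \Big(\frac{d}{d\lambda}\Big)^j R_0^\pm(\lambda^2)(\cdot, y) \Big\|_{L^{1,\frac32}} \lesssim \frac{\lambda^{-j}}{\langle y\rangle^{n-2}} + \frac{\lambda^{\frac{n-3}{2}}}{\langle y\rangle^{\frac{n-1}{2}-j}} \lesssim \frac{\lambda^{-j}}{\langle y\rangle^{n-2}},
\]
where the last inequality holds for $\lambda \le \langle y\rangle^{-1}$. Similarly, to prove (4.9) we use (3.7); applying Lemma 3.8 and treating the cases $0 \le j \le \frac{n-1}{2}$ and $\frac{n-1}{2} < j \le \frac{n}{2} + 1$ separately, we obtain
\[
\Big\| V(\cdot) \Big(\frac{d}{d\lambda}\Big)^j \big[ \operatorname{Im} R_0(\lambda^2) \big](\cdot, y) \Big\|_{L^{1,\frac32}} \lesssim \lambda^{n-2-j} \big( 1 + \lambda^{\frac32} \langle y\rangle^{\frac32} \big) \lesssim \lambda^{n-2-j},
\]
again, for $\lambda \le \langle y\rangle^{-1}$. Using the estimates in Proposition 3.12, (4.8), and (4.9), we get
\[
\Big| \Big(\frac{d}{d\lambda}\Big)^j \big[ c^+_{x,y,2}(\lambda^2) - c^-_{x,y,2}(\lambda^2) \big] \Big| \lesssim \frac{\lambda^{n-1-j}}{\langle y\rangle^{n-2}}.
\]
Thus, an application of Lemma 3.13 with $N = \frac{n+1}{2}$ for $n$ odd, or $N = \frac{n}{2} + 1$ for $n$ even yields the bound
\[
|t|^{-N} \int_{|t|^{-\frac12}}^{\langle y\rangle^{-1}} \frac{\lambda^{n-1-2N}}{\langle y\rangle^{n-2}}\,d\lambda \lesssim |t|^{-\frac{n}{2}}.
\]
In each of the three cases discussed above, the difference $I_2^+(t, x, y) - I_2^-(t, x, y)$ is seen to be smaller than $|t|^{-\frac{n}{2}}$. To complete the proof of the theorem, we need to show
\[
|I_1^+(t, x, y) - I_1^-(t, x, y)| \lesssim |t|^{-\frac{n}{2}} \quad \text{for } |t| > 4.
\]
Here,
\begin{align*}
I_1^\pm(t, x, y) &= \int_0^\infty e^{it\lambda^2} \rho(|t|^{\frac12} \lambda)\, \lambda \Big[ R_0^\pm(\lambda^2) V [R_0^\pm(\lambda^2) V]^m S^\pm(\lambda^2) R_0^\pm(\lambda^2) [V R_0^\pm(\lambda^2)]^m V R_0^\pm(\lambda^2) \Big](x, y)\,d\lambda\\
&= \int_0^\infty e^{it\lambda^2} \rho(|t|^{\frac12} \lambda)\, \lambda \Big\langle \delta_y,\ R_0^\pm(\lambda^2) V [R_0^\pm(\lambda^2) V]^m S^\pm(\lambda^2) R_0^\pm(\lambda^2) [V R_0^\pm(\lambda^2)]^m V R_0^\pm(\lambda^2) \delta_x \Big\rangle\,d\lambda\\
&= \int_0^\infty e^{it\lambda^2} c^\pm_{x,y,1}(\lambda^2)\,d\lambda.
\end{align*}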


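Arguing as in Case 3 above, we see that
\[
\big| c^+_{x,y,1}(\lambda^2) - c^-_{x,y,1}(\lambda^2) \big| \lesssim \lambda^{n-1}.
\]
Thus,
\[
|I_1^+(t, x, y) - I_1^-(t, x, y)| \lesssim \int_0^{|t|^{-\frac12}} \lambda^{n-1}\,d\lambda \lesssim |t|^{-\frac{n}{2}}.
\]
This concludes the proof of Theorem 4.1.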

5. Nondispersive Estimates

5.1. Nondispersive estimate for the term l = 1. To summarize the progress up to this point, we have decomposed the perturbed resolvent $R_V^\pm(\lambda^2)$ into a finite Born series with initial terms given by (2.5) and a tail given by (2.6). In the previous sections, the contribution of the tail was shown to satisfy a dispersive estimate at both high and low energies. The dispersive behavior of the full evolution $e^{itH} P_{ac}(H)$ is therefore dictated by the contribution from the initial terms of the Born series. We show that there are potentials in the class
\[
X = \Big\{ V \in C^{\alpha}(\mathbb{R}^n),\ \alpha < \tfrac{n-3}{2},\ \operatorname{supp} V \subset B(0, 5) \setminus B(0, \tfrac52) \Big\}
\]
that do not yield a dispersive estimate for the term corresponding to $l = 1$ in (2.5). It will follow, via an argument in the next subsection, that the entire expression (2.5) cannot satisfy a dispersive estimate either.

To define the class of potentials more precisely, let $X$ be the completion of the appropriately supported $C^\infty$ functions with respect to the $W^{\alpha,\infty}$-norm,
\[
\|f\|_X := \|(1 + \Delta)^{\alpha/2} f\|_\infty.
\]
Fix the points $x_0, y_0 \in \mathbb{R}^n$ so that $x_0$ is the unit vector in the first coordinate direction and $y_0 = -x_0$. Now let $f^\varepsilon$ and $g^\varepsilon$ be smooth approximations of $f = \delta_{x_0}$ and $g = \delta_{y_0}$ which are supported in $B(x_0, \varepsilon)$ and $B(y_0, \varepsilon)$, respectively, and have unit $L^1$-norm. Define the expression
\begin{align*}
a_1^L(t, \varepsilon, V) &:= t^{\frac{n}{2}} \iiiint e^{it\lambda} \psi_L(\lambda) \big[ R_0^+(\lambda)(x, x_1) V(x_1) R_0^+(\lambda)(x_1, y) - R_0^-(\lambda)(x, x_1) V(x_1) R_0^-(\lambda)(x_1, y) \big] f^\varepsilon(x) g^\varepsilon(y)\,dx\,dx_1\,dy\,d\lambda\\
&= t^{\frac{n}{2}} \iiint I_L(t, |x - x_1|, |y - x_1|)\, V(x_1) f^\varepsilon(x) g^\varepsilon(y)\,dx_1\,dx\,dy,
\end{align*}
where $\psi$ can be any Schwartz function with $\psi(0) = 1$ and $\psi_L(\lambda) = \psi(\lambda/L)$. Fubini's theorem is used to perform the $d\lambda$ integral first, noting that since $f^\varepsilon$, $g^\varepsilon$, and $V$ all have disjoint support, the singularities of $R_0^\pm(\lambda)(x, x_1)$ and $R_0^\pm(\lambda)(x_1, y)$ can be disregarded. If the term corresponding to $l = 1$ in the Born series (2.5) and (2.6) satisfied a dispersive estimate, it would yield the bound
\[
\lim_{L \to \infty} |a_1^L(t, \varepsilon, V)| \le C(V) \|f^\varepsilon\|_1 \|g^\varepsilon\|_1 = C(V). \tag{5.1}
\]


Observe that $a_1^L(t, \varepsilon, V)$ is linear in the last entry and can therefore be viewed as a family of linear maps indexed by the remaining parameters $(L, t, \varepsilon)$. By the Uniform Boundedness Principle, if a dispersive estimate for the $l = 1$ term held for every potential $V \in X$, it would imply the sharper inequality
\[
\sup_{L \ge 1} |a_1^L(t, \varepsilon, V)| \le C \|V\|_X. \tag{5.2}
\]
For $t \ll 1$ this will not be possible, thanks to the asymptotic description of the function $I_L(t, |x - x_1|, |y - x_1|)$ stated below.

Lemma 5.1. Suppose $n \ge 3$ and $0 < t \le 1$. Let $\psi : \mathbb{R} \to \mathbb{R}$ be a Schwartz function with Fourier transform supported in the unit interval and satisfying $\psi(0) = 1$, and $K$ a compact subset of $(0, \infty)$. There exist constants $C_1, C_2 < \infty$ depending on $n$, $\psi$, and $K$ such that
\[
\Big| I_L(t, |x - x_1|, |x_1 - y|) - \frac{i\,(|x - x_1| + |x_1 - y|)^{n-2}}{2(-4\pi i t)^{n-\frac32}\, |x - x_1|^{\frac{n-1}{2}} |x_1 - y|^{\frac{n-1}{2}}}\, e^{-i\frac{(|x-x_1|+|x_1-y|)^2}{4t}} \Big| \le C_1 t^{-(n-\frac52)} \tag{5.3}
\]
for all $L > C_2 t^{-3}$ and $|x - x_1|, |x_1 - y| \in K$. If $t$ is held fixed, then the remainder converges as $L \to \infty$ to a function $G(|x - x_1|, |x_1 - y|, t)$ uniformly over all pairs of distances $|x - x_1|, |x_1 - y| \in K$.

The proof of Lemma 5.1 is technical and is given in Sect. 6 below. An immediate consequence of this lemma is the following

Corollary 5.2. Let $n \ge 3$, $0 < t \le 1$, and $\varepsilon < \frac12$. The following bound is valid for all functions $V \in X$ with $\|V\|_X \le 1$:
\[
\Big| \lim_{L \to \infty} a_1^L(t, \varepsilon, V) - \frac{i\, t^{\frac{3-n}{2}}}{2(-4\pi i)^{n-\frac32}} \iiint \frac{(|x - x_1| + |x_1 - y|)^{n-2}}{|x - x_1|^{\frac{n-1}{2}} |x_1 - y|^{\frac{n-1}{2}}}\, e^{-i\frac{(|x-x_1|+|x_1-y|)^2}{4t}}\, V(x_1) f^\varepsilon(x) g^\varepsilon(y)\,dx_1\,dx\,dy \Big| \le C t^{\frac{5-n}{2}} \|f^\varepsilon\|_1 \|g^\varepsilon\|_1. \tag{5.4}
\]

Proof. If $\varepsilon < \frac12$, then we have $|x - x_1|, |x_1 - y| \in [1, 10]$ for every combination of points with $x \in \operatorname{supp}(f^\varepsilon)$, $y \in \operatorname{supp}(g^\varepsilon)$, $x_1 \in \operatorname{supp}(V)$. Thus the conditions of Lemma 5.1 are satisfied, with the conclusion that $I_L(t, \cdot, \cdot)$ converges uniformly as $L \to \infty$ to a bounded function in $x$, $x_1$, $y$. The result then follows from the dominated convergence theorem and the observation that $\|V\|_1 \le C\|V\|_X \le C$.

If the integral in (5.4) were taken in absolute values, the resulting bound on $a_1^L(t, \varepsilon, V)$ would be of size $|t|^{\frac{3-n}{2}}$. In dimension $n \ge 4$, this contrasts with the desired estimate
\[
\lim_{L \to \infty} |a_1^L(t, \varepsilon, V)| \le C,
\]


which is uniform in t. Furthermore, for a fixed small time t it is not difficult to construct 2 a potential Vt ∈ X which negates the oscillatory factor of e−i(|x−x1 |+|x1 −y|) /(4t) . Let φ be a smooth cutoff which is supported in the interval [6, 8] and F : R → R a nonnegative smooth function which satisfies F(s) = 0 for all s ≤ 0 and F(s) = s for all s ≥ 21 . Given a time 0 < t ≤ 1, define (|x − x | + |x − y |)2 0 1 1 0 . (5.5) Vt (x1 ) = Cn t α φ(|x0 − x1 | + |x1 − y0 |)F cos 4t The constant Cn will be chosen momentarily. It is perhaps unnecessary to modify the cosine function with F; however, the positivity of F does guarantee that zero energy will be neither an eigenvalue nor a resonance of − + Vt . Proposition 5.3. There exists a constant Cn > 0 so that the function Vt deﬁned above satisﬁes Vt X ≤ 1 for all 0 < t ≤ 1. Proof. It is equivalent to show that in the absence of the coefficient Cn , Vt X would be bounded by a finite constant uniformly in t. The support of φ(|x0 − x1 | + |x1 − y0 |) is located within an annular region bounded by the ellipsoids with foci x0 , y0 and major axes of length 6 and 8, respectively. As this region is bounded away from both x0 and y0 , the length sum |x0 − x1 | + |x1 − y0 | is a scalar C ∞ -function of x1 . 1 −y0 | on this domain It follows that any sufficiently smooth function of |x0 −x1 |+|x 4t α −α should have C -norm controlled by (1 + t ). The leading coefficient t α then ensures that the X -norm will be controlled by a uniform constant for all |t| ≤ 1. Finally, multiplication by the fixed smooth cutoff φ(|x0 − x1 | + |x1 − y0 |) only increases the norm by another finite constant. Now it is a simple matter to show that Vt produces a counterexample to (5.2) and hence to (5.1) for 0 < t 1. Proposition 5.4. Suppose n > 3. There exist constants T, C1 , C2 > 0 such that if 0 < t ≤ T and 0 < ε < C1 t, then n−3−2α lim a1L (t, ε, Vt ) ≥ C2 t −( 2 ) . L→∞

Proof. Start with the asymptotic integral formula in (5.4). For any choice of points $x \in \mathrm{supp}(f^{\varepsilon})$, $y \in \mathrm{supp}(g^{\varepsilon})$, $x_1 \in \mathrm{supp}(V_t)$, the expression
$$\frac{(|x-x_1|+|x_1-y|)^{n-2}}{(|x-x_1|\,|x_1-y|)^{(n-1)/2}}$$
is a smooth positive function of size comparable to 1. Consider what happens to the integral over $dx_1$ in the special case when $x = x_0$, $y = y_0$. Then the oscillatory part of $V_t(x_1)$ is synchronized with the real part of $e^{-i(|x-x_1|+|x_1-y|)^2/(4t)}$ so that the real part of the product is always nonnegative and of size approximately 1 on a set of approximately unit measure. The real part of the integral is then bounded below by a positive constant.

For arbitrary $x \in \mathrm{supp}(f^{\varepsilon})$ and $y \in \mathrm{supp}(g^{\varepsilon})$, it is possible to differentiate under the integral sign in either of the variables $x$ or $y$, and each partial derivative is controlled by $t^{-1}$. Thus the lower bound on the real part of the integral remains valid so long as $|x - x_0|, |y - y_0| \ll t$, which is ensured by setting $\varepsilon < C_1 t$.

The definition of $V_t$ also includes a factor of $t^{\alpha}$. When this is substituted into (5.4), the resulting leading coefficient is proportional to $t^{-\frac{n-3-2\alpha}{2}}$. There is also an error term of unknown sign, but with size controlled by $t^{-\frac{n-5-2\alpha}{2}}$. This can be absorbed into the lower bound for any $0 < t \le T$, provided $T$ is chosen sufficiently small.
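The synchronization used in the proof can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the paper does not specify $F$, so the particular smooth function `F` built here (from the standard bump `g`) is a hypothetical choice that merely satisfies the stated requirements $F(s) = 0$ for $s \le 0$ and $F(s) = s$ for $s \ge \frac12$. It checks that $\mathrm{Re}(e^{-i\theta})\,F(\cos\theta) = \cos\theta\,F(\cos\theta)$ is never negative.

```python
import math

def g(x):
    """exp(-1/x) for x > 0, extended by 0; a standard smooth function on the real line."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def F(s):
    """One possible F (an assumption, not the paper's choice):
    nonnegative, smooth, F(s) = 0 for s <= 0 and F(s) = s for s >= 1/2."""
    chi = g(s) / (g(s) + g(0.5 - s))  # smooth step: 0 for s <= 0, 1 for s >= 1/2
    return s * chi

def synchronized_product(theta):
    """Real part of exp(-i*theta) multiplied by F(cos(theta))."""
    return math.cos(theta) * F(math.cos(theta))

# Sample one full period of the phase theta.
samples = [synchronized_product(2 * math.pi * j / 1000) for j in range(1000)]
```

Every entry of `samples` is nonnegative, and the product equals $\cos^2\theta$ wherever $\cos\theta \ge \frac12$, which is the mechanism that prevents cancellation in the integral over $x_1$.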


5.2. Nondispersive estimate for the full evolution.

Theorem 5.5. Suppose $n > 3$. There cannot exist a bound of the form
$$\|e^{itH} P_{ac}(H) f\|_{\infty} \le C(V)\,|t|^{-\frac{n}{2}}\, \|f\|_1$$
with $C(V) < \infty$ for every potential $V \in X$, $\|V\|_X \le 1$.

Proof. Assume the contrary and write $V = \theta W$ with $\|W\|_X \le 1$ and $\theta \in [0, 1]$. By assumption, we would then have the bound
$$|\langle e^{itH} P_{ac}(H) f, g\rangle| = \frac{1}{2\pi} \sup_{L \ge 1} \Big| \int_0^{\infty} e^{it\lambda} \psi_L(\lambda)\, \langle [R_{\theta W}(\lambda + i0) - R_{\theta W}(\lambda - i0)] f, g\rangle \, d\lambda \Big| \le C(W,\theta)\,|t|^{-\frac{n}{2}}\, \|f\|_1 \|g\|_1 \tag{5.6}$$
for $\psi$ as in Lemma 5.1 and for every $f, g \in L^1 \cap L^2$ and, in particular, for the functions $f^{\varepsilon}, g^{\varepsilon}$ defined in Subsect. 5.1.

The finite Born series expansion (2.5) and (2.6) allows us to write the perturbed resolvent $R_{\theta W}(\lambda \pm i0)$ as the sum of a polynomial of degree $2m + 1$ in $\theta$ and a tail. When this is substituted into (5.6) above, along with the functions $f^{\varepsilon}, g^{\varepsilon} \in L^1 \cap L^2$, the tail is shown in Theorem 4.1 to be controlled by $C|t|^{-\frac{n}{2}} \|f^{\varepsilon}\|_1 \|g^{\varepsilon}\|_1$ for some $C$. It follows that the initial terms must obey a similar bound. Write this as
$$\sup_{L \ge 1} |P^L(\theta)| := \sup_{L \ge 1} \Big| \sum_{k=0}^{2m+1} \theta^k a_k^L \Big| \le C(W, \theta),$$
where the coefficients $\{a_k^L\}_{k=0}^{2m+1}$ of the polynomial $P^L$ are defined for each $k \in \{0, 1, \ldots, 2m + 1\}$ and $L < \infty$ by the formula
$$a_k^L(t, \varepsilon, W) = t^{\frac{n}{2}} \int e^{it\lambda} \psi_L(\lambda)\, \big\langle \big[ R_0^+(\lambda) [-W R_0^+(\lambda)]^k - R_0^-(\lambda) [-W R_0^-(\lambda)]^k \big] f^{\varepsilon}, g^{\varepsilon} \big\rangle \, d\lambda.$$

Denote by $\mathcal{V}$ the $(2m + 2)$-dimensional space of all polynomials of degree at most $2m + 1$, and consider the linear maps from $\mathcal{V}$ into $\mathbb{R}^{2m+2}$ defined by
$$P = \sum_{k=0}^{2m+1} a_k \theta^k \mapsto \{a_0, \ldots, a_{2m+1}\}$$
and
$$P = \sum_{k=0}^{2m+1} a_k \theta^k \mapsto \Big\{ P(0), P\big(\tfrac{1}{2m+1}\big), \ldots, P\big(\tfrac{2m+1}{2m+1}\big) \Big\}.$$
Clearly the two maps are bijections and thus one can express each coefficient $a_k$ as a linear combination of the values $P(0), P(\frac{1}{2m+1}), \ldots, P(\frac{2m+1}{2m+1})$. From our assumption that $C(W, \theta) < \infty$ for every $0 \le \theta \le 1$, it follows that each of the expressions $|P^L(0)|, |P^L(\frac{1}{2m+1})|, \ldots, |P^L(\frac{2m+1}{2m+1})|$, as well as their maximum, is bounded uniformly in $L \ge 1$. One concludes that
$$\sup_{L \ge 1} |a_1^L(t, \varepsilon, W)| \le C(W) < \infty$$
for every $W \in X$ with $\|W\|_X \le 1$. This, however, is precisely the same statement as (5.1), which was already shown to be false.
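The step "each coefficient is a linear combination of the sampled values" is ordinary polynomial interpolation at the $2m+2$ nodes $j/(2m+1)$. The following sketch is not taken from the paper; the function names are hypothetical, and Lagrange interpolation in exact rational arithmetic is just one convenient way to realize the inverse of the evaluation map.

```python
from fractions import Fraction

def evaluate(coeffs, theta):
    """Evaluate sum_k coeffs[k] * theta**k by Horner's rule."""
    total = Fraction(0)
    for c in reversed(coeffs):
        total = total * theta + c
    return total

def coefficients_from_samples(values):
    """Recover a_0..a_d of a polynomial of degree <= d (d >= 1)
    from its values at the nodes j/d, j = 0..d."""
    d = len(values) - 1
    nodes = [Fraction(j, d) for j in range(d + 1)]
    coeffs = [Fraction(0)] * (d + 1)
    for j, yj in enumerate(values):
        basis = [Fraction(1)]   # coefficients of prod_{i != j} (theta - x_i)
        denom = Fraction(1)     # prod_{i != j} (x_j - x_i)
        for i, xi in enumerate(nodes):
            if i == j:
                continue
            stepped = [Fraction(0)] * (len(basis) + 1)
            for k, c in enumerate(basis):
                stepped[k] -= c * xi
                stepped[k + 1] += c
            basis = stepped
            denom *= nodes[j] - xi
        for k, c in enumerate(basis):
            coeffs[k] += Fraction(yj) * c / denom
    return coeffs

# Example with m = 1, so degree 2m + 1 = 3 and four nodes 0, 1/3, 2/3, 1.
true_coeffs = [3, -2, 0, 5]
samples = [evaluate(true_coeffs, Fraction(j, 3)) for j in range(4)]
recovered = coefficients_from_samples(samples)
```

Because the recovery map is a fixed linear map that depends only on $m$, a uniform bound on the sampled values $|P^L(j/(2m+1))|$ gives a uniform bound on every coefficient, in particular on $a_1^L$.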


6. Proof of Lemma 5.1

The main ingredients of Lemma 5.1 are a recurrence relation (in $n$) for the resolvent kernels and explicit computations in dimensions 2 and 3. With some abuse of notation, define $R_n^{\pm}(\lambda)$ to be the free resolvent $\lim_{\varepsilon \downarrow 0} (-\Delta - \lambda \pm i\varepsilon)^{-1}$ in $\mathbb{R}^n$. The Stone formula dictates that
$$\frac{1}{2\pi i} \int_0^{\infty} e^{it\lambda}\, \langle [R_n^+(\lambda) - R_n^-(\lambda)] f, g\rangle \, d\lambda = \langle e^{-it\Delta} f, g\rangle = (-4\pi i t)^{-\frac{n}{2}} \iint_{\mathbb{R}^{2n}} e^{\frac{-i|x-y|^2}{4t}} f(x)\, \bar{g}(y) \, dx\, dy$$
for all $t \ne 0$ and $f, g$ (say) Schwartz functions. Recall that the resolvents $R_n(z) = (-\Delta - z)^{-1}$ can be defined for all $z \in \mathbb{C} \setminus \mathbb{R}^+$, and that $R_n^{\pm}(\lambda)$ are the analytic continuations onto the boundary from above and below, respectively. It follows that both $R_n^+(\lambda)$ and $R_n^-(\lambda)$ can be defined for negative values of $\lambda$. Moreover, $[R_n^+(\lambda) - R_n^-(\lambda)] = 0$ for all $\lambda \le 0$. The integral above may therefore be taken over the entire real line.

One further observation is that since $R_n^+(\lambda)$ is a holomorphic family of operators for $\lambda$ in the upper halfplane and is uniformly bounded (as operators on $L^2$, for example) away from the real axis, its inverse Fourier transform must be supported on the halfline $\{t \le 0\}$. Similarly, $R_n^-(\lambda)$, which is holomorphic in the lower halfplane, has inverse Fourier transform supported in $\{t \ge 0\}$. This leads to the conclusion
$$\int_{\mathbb{R}} e^{it\lambda} R_n^-(\lambda, |x|) \, d\lambda = \begin{cases} -2\pi i\, (-4\pi i t)^{-\frac{n}{2}}\, e^{\frac{-i|x|^2}{4t}}, & \text{if } t > 0 \\ 0, & \text{if } t < 0 \end{cases} \tag{6.1}$$
for all $x \in \mathbb{R}^n$. Setting $|x| = r$ in the preceding identity leads to the recurrence relation
$$R_{n+2}^-(\lambda, r) = -\frac{1}{2\pi r} \frac{\partial}{\partial r} R_n^-(\lambda, r). \tag{6.2}$$
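The recurrence can be sanity-checked on the time side: applying $-\frac{1}{2\pi r}\partial_r$ to the right-hand side of (6.1) in dimension $n$ should reproduce the same expression in dimension $n+2$. The sketch below is an illustrative numerical check, not part of the paper's argument; the function names and the finite-difference step `h` are assumptions chosen for the example.

```python
import cmath
import math

def kernel(n, r, t):
    """Right-hand side of (6.1) for t > 0, with |x| = r (principal branch of the power)."""
    return -2j * math.pi * (-4j * math.pi * t) ** (-n / 2) * cmath.exp(-1j * r * r / (4 * t))

def recurrence_rhs(n, r, t, h=1e-5):
    """-(1 / (2*pi*r)) * d/dr of the dimension-n kernel, via a central difference."""
    d_dr = (kernel(n, r + h, t) - kernel(n, r - h, t)) / (2 * h)
    return -d_dr / (2 * math.pi * r)

def relative_gap(n, r, t):
    """Relative difference between the dimension-(n+2) kernel and the recurrence applied in dimension n."""
    lhs = kernel(n + 2, r, t)
    return abs(lhs - recurrence_rhs(n, r, t)) / abs(lhs)

gap_even = relative_gap(2, 1.3, 0.7)   # passes from n = 2 to n = 4
gap_odd = relative_gap(3, 0.8, 0.25)   # passes from n = 3 to n = 5
```

Both gaps are at the level of the finite-difference error, consistent with the exact identity $-\frac{1}{2\pi r}\partial_r e^{-ir^2/(4t)} = (-4\pi i t)^{-1} e^{-ir^2/(4t)}$.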

The recurrence (6.2) also holds for $R_n^+(\lambda, r)$.

6.1. The cases n = 2, 3. It should first be noted that the integral $\int_{\mathbb{R}} e^{it\lambda} R_n^-(\lambda, r) \, d\lambda$ in (6.1) is never absolutely convergent and is properly interpreted as the Fourier transform of a distribution. As such, its behavior at $t = 0$ requires additional clarification.

Lemma 6.1. For any fixed $r > 0$ and $n = 2, 3$, the expression $\int_{\mathbb{R}} e^{it\lambda} R_n^-(\lambda, r) \, d\lambda$ agrees with the distribution $f$ given by
$$(f, \phi) = -2\pi i\, (-4\pi i)^{-\frac{n}{2}} \lim_{a \downarrow 0} \int_a^{\infty} t^{-\frac{n}{2}}\, e^{\frac{-ir^2}{4t}}\, \phi(t) \, dt \tag{6.3}$$
for all Schwartz functions $\phi$.

Proof. Because of analyticity considerations, the identity above must be correct modulo distributions supported on $t = 0$. Let $\phi \in C_c^{\infty}(\mathbb{R})$ have nonvanishing derivatives at $t = 0$ and consider pairings of the form $\langle f, N\phi(N\cdot)\rangle$.


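On one hand, the function $t^{-n/2} e^{-ir^2/(4t)} \chi_{(0,\infty)}$ has a continuous anti-derivative $I(t)$ with asymptotic behavior $I(t) = O(t^{2 - \frac{n}{2}})$ as $t$ approaches zero. Integrating by parts,
$$\lim_{a \downarrow 0} \int_a^{\infty} N\phi(Nt)\, t^{-\frac{n}{2}} e^{\frac{-ir^2}{4t}} \, dt = -N \lim_{a \downarrow 0} \phi(Na) I(a) - \lim_{a \downarrow 0} \int_a^{\infty} N^2 \phi'(Nt) I(t) \, dt = -N^2 \int_0^{\infty} \phi'(Nt) I(t) \, dt = O(N^{\frac{n-2}{2}})$$
in the limit $N \to \infty$.

Meanwhile, the pairing $\langle f, N\phi(N\cdot)\rangle$ is defined by Parseval's identity to be
$$\langle f, N\phi(N(\cdot))\rangle = \int_{\mathbb{R}} R_n^-(\lambda, r)\, \hat{\phi}(\lambda/N) \, d\lambda.$$
For fixed $r > 0$, the resolvent $R_n^-(\lambda, r)$ possesses the asymptotic expansion
$$R_n^-(\lambda, r) = c_1\, r^{\frac{1-n}{2}}\, \lambda^{\frac{n-3}{4}}\, e^{-ir\sqrt{\lambda}} + O(\lambda^{\frac{n-5}{4}})$$
as $\lambda \to \infty$ and is integrable near $\lambda = 0$. Thus, it has a continuous anti-derivative $J(\lambda, r)$ which grows no faster than $O(\lambda^{\frac{n-1}{4}})$. Integrating by parts,
$$\langle f, N\phi(N\cdot)\rangle = -N^{-1} \int_{\mathbb{R}} J(\lambda, r)\, \hat{\phi}'(\lambda/N) \, d\lambda = O(N^{\frac{n-1}{4}}).$$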

As $n = 2, 3$, the difference between the left and right sides of (6.3) grows no faster than $O(N^{\frac12})$ when applied to the test functions $N\phi(Nt)$. It is well-known that any nonzero distribution $g$ supported on $t = 0$ has the form $(g, \phi) = \sum_{k=1}^{M} c_k \phi^{(k)}(0)$, and would therefore grow at least as fast as $O(N)$ when applied to the same family of test functions.

Having established the inverse Fourier transform of $R_n^-(\lambda, r)$ for each $r > 0$, it is possible to calculate the inverse Fourier transform of any product $R_n^-(\lambda, r) R_n^-(\lambda, s)$ by taking convolutions. Given a choice of $r, s, t > 0$,
$$\int_{\mathbb{R}} e^{it\lambda} R_n^-(\lambda, r) R_n^-(\lambda, s) \, d\lambda = \frac{-2\pi}{(-4\pi i)^n} \int_0^t e^{-i\left(\frac{r^2}{4u} + \frac{s^2}{4(t-u)}\right)} \frac{du}{u^{\frac{n}{2}} (t-u)^{\frac{n}{2}}}, \tag{6.4}$$
where the Fourier transform has introduced a normalizing factor of $(2\pi)^{-1}$. To make the complex exponential more manageable, change variables to
$$\frac{v}{t} = \frac{r^2}{u} + \frac{s^2}{t-u} - \frac{r^2 + s^2}{t} = \frac{(t-u)^2 r^2 + u^2 s^2}{u(t-u)t}.$$
The range of possible values for $v$ is $[2rs, \infty)$. Based on the quadratic relationship
$$(r^2 + s^2 + v)u^2 - (2r^2 + v)tu + r^2 t^2 = 0,$$


the variable substitutions for $u$ and $(t - u)$ are given by
$$u = \frac{2r^2}{2r^2 + v \mp \sqrt{v^2 - 4r^2 s^2}}\, t, \qquad t - u = \frac{\frac12 \big(\sqrt{v + 2rs} \mp \sqrt{v - 2rs}\big)^2}{2r^2 + v \mp \sqrt{v^2 - 4r^2 s^2}}\, t.$$
The substitution formula for the differentials is
$$du = \pm t\, \frac{r^2 \big(\sqrt{v + 2rs} \mp \sqrt{v - 2rs}\big)^2}{\big(2r^2 + v \mp \sqrt{v^2 - 4r^2 s^2}\big)^2 \sqrt{v^2 - 4r^2 s^2}} \, dv.$$
Making all appropriate substitutions and correctly accounting for the fact that each value of $v > 2rs$ is attained twice in $u \in (0, t)$, the integral in (6.4) becomes
$$\int_{\mathbb{R}} e^{it\lambda} R_2^-(\lambda, r) R_2^-(\lambda, s) \, d\lambda = \frac{1}{4\pi t}\, e^{-i\frac{r^2 + s^2}{4t}} \int_{2rs}^{\infty} \frac{e^{-i\frac{v}{4t}}}{\sqrt{v^2 - 4r^2 s^2}} \, dv = \frac{1}{4\pi t}\, e^{-i\frac{r^2 + s^2}{4t}}\, H_0^{(1)}\Big(\frac{-rs}{2t}\Big) \tag{6.5}$$
in the case $n = 2$. Here, $H_0^{(1)}$ is the Hankel function introduced in Sect. 2. Some relevant properties of this function are that $H_0^{(1)}(z)$ is analytic in the upper halfplane and decays asymptotically like $\sqrt{\pi i / 2z}\, e^{iz}$ as $z \to \infty$ along any ray.

In the case $n = 3$, the integral in (6.4) becomes
$$\int_{\mathbb{R}} e^{it\lambda} R_3^-(\lambda, r) R_3^-(\lambda, s) \, d\lambda = \frac{-2\pi\, e^{-i\frac{r^2 + s^2}{4t}}}{(-4\pi i)^3 t^2} \int_{2rs}^{\infty} \Bigg[ \frac{2r^2 + v + \sqrt{v^2 - 4r^2 s^2}}{r\big(\sqrt{v + 2rs} + \sqrt{v - 2rs}\big)} + \frac{2r^2 + v - \sqrt{v^2 - 4r^2 s^2}}{r\big(\sqrt{v + 2rs} - \sqrt{v - 2rs}\big)} \Bigg] \frac{e^{-i\frac{v}{4t}}}{\sqrt{v^2 - 4r^2 s^2}} \, dv$$
$$= \frac{-2\pi\, e^{-i\frac{r^2 + s^2}{4t}}}{(-4\pi i)^3 t^2}\, \frac{r + s}{rs} \int_{2rs}^{\infty} \frac{e^{-i\frac{v}{4t}}}{\sqrt{v - 2rs}} \, dv.$$
At this point it remains to calculate the Fourier transform of an inverse square-root function, which yields
$$\int_{2rs}^{\infty} \frac{e^{-i\frac{v}{4t}}}{\sqrt{v - 2rs}} \, dv = e^{-i\frac{rs}{2t}} \sqrt{-4\pi i t}.$$
The final result is
$$\int_{\mathbb{R}} e^{it\lambda} R_3^-(\lambda, r) R_3^-(\lambda, s) \, d\lambda = \frac{1}{2i(-4\pi i t)^{3/2}}\, \frac{r + s}{rs}\, e^{-i\frac{(r+s)^2}{4t}}. \tag{6.6}$$
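The change of variables behind (6.5) and (6.6) can be checked numerically. The sketch below is an illustration only; the helper names and sample values of $r, s, t$ are assumptions. It verifies three facts used above: $v \ge 2rs$ on $(0, t)$, the quadratic relationship between $u$ and $v$, and that each $v > 2rs$ is attained at the two values of $u$ given by the $\mp$ formula.

```python
import math

def v_of_u(u, r, s, t):
    """The new variable v defined by v/t = r^2/u + s^2/(t-u) - (r^2+s^2)/t."""
    return t * (r * r / u + s * s / (t - u) - (r * r + s * s) / t)

def quadratic_residual(u, r, s, t):
    """Left-hand side of (r^2+s^2+v) u^2 - (2 r^2 + v) t u + r^2 t^2 = 0 with v = v(u)."""
    v = v_of_u(u, r, s, t)
    return (r * r + s * s + v) * u * u - (2 * r * r + v) * t * u + r * r * t * t

def u_branches(v, r, s, t):
    """The two preimages u = 2 r^2 t / (2 r^2 + v -/+ sqrt(v^2 - 4 r^2 s^2)) of a value v > 2rs."""
    root = math.sqrt(v * v - 4 * r * r * s * s)
    return tuple(2 * r * r * t / (2 * r * r + v - sign * root) for sign in (1, -1))

r, s, t = 1.2, 0.7, 0.5
grid = [t * j / 100 for j in range(1, 100)]
smallest_v = min(v_of_u(u, r, s, t) for u in grid)
v_at_critical_point = v_of_u(r * t / (r + s), r, s, t)  # the minimum sits at u = r t / (r + s)
largest_residual = max(abs(quadratic_residual(u, r, s, t)) for u in grid)
branches = u_branches(5.0, r, s, t)
```

The minimum value $2rs$ at $u = rt/(r+s)$ is where the two branches meet, which is why the integrand acquires the factor $(v^2 - 4r^2s^2)^{-1/2}$.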


6.2. Dimensions n > 3. The recurrence relation for $R_{n+2}^-(\lambda)$ makes it possible to compute the analogous terms in dimensions $n = 5, 7, \ldots$, by repeatedly applying the differential operator $(4\pi^2 rs)^{-1} \frac{\partial^2}{\partial r \partial s}$ to the three-dimensional result (6.6). For small values of $t$, the leading-order term occurs when all derivatives fall on $e^{-i(r+s)^2/(4t)}$. This leads to the following asymptotic expression as $t \to 0$, which is valid in any odd dimension $n \ge 3$:
$$\int_{\mathbb{R}} e^{it\lambda} R_n^-(\lambda, r) R_n^-(\lambda, s) \, d\lambda = \frac{1}{2i(-4\pi i t)^{n - \frac32}}\, \frac{(r + s)^{n-2}}{(rs)^{\frac{n-1}{2}}}\, e^{-i\frac{(r+s)^2}{4t}} + O\big(t^{-(n - \frac52)}\big). \tag{6.7}$$
The same result is true in even dimensions as well. To see this, recall that $H_0^{(1)}(z) = \big(\frac{\pi i}{2z}\big)^{1/2} e^{iz} \omega(z)$, where the derivatives of $\omega$ satisfy the following bounds as $|z|$ goes to infinity:
$$\lim_{z \to \infty} \omega(z) = 1, \qquad \frac{d^k}{dz^k} \omega(z) = O(|z|^{-k}), \quad k = 1, 2, \ldots.$$
The expression in (6.5) can then be rewritten as
$$\int_{\mathbb{R}} e^{it\lambda} R_2^-(\lambda, r) R_2^-(\lambda, s) \, d\lambda = \frac{1}{2i(-4\pi i\, t\, rs)^{1/2}}\, e^{-i\frac{(r+s)^2}{4t}}\, \omega\Big(-\frac{rs}{2t}\Big).$$
Applying the differential operators $\frac{\partial}{\partial r}$ and $\frac{\partial}{\partial s}$ only increases the degree of the singularity at $t = 0$ when the derivative falls on the term $e^{-i(r+s)^2/(4t)}$. If the derivative falls instead on $\omega(-\frac{rs}{2t})$, one power of $t$ is added to the denominator, but the effect is cancelled by the faster decay of $\frac{d}{dz}\omega(z)$. Consequently, when $(4\pi^2 rs)^{-1} \frac{\partial^2}{\partial r \partial s}$ is applied iteratively to (6.5), the leading-order term results from having all of the derivatives fall on $e^{-i(r+s)^2/(4t)}$. The recurrence relation for $R_{n+2}^-(\lambda)$ then dictates that
$$\int_{\mathbb{R}} e^{it\lambda} R_n^-(\lambda, r) R_n^-(\lambda, s) \, d\lambda = \frac{1}{2i(-4\pi i t)^{n - \frac32}}\, \frac{(r + s)^{n-2}}{(rs)^{\frac{n-1}{2}}}\, e^{-i\frac{(r+s)^2}{4t}} + O\big(t^{-(n - \frac52)}\big)$$
for dimensions $n = 4, 6, \ldots$, as desired. The results of this calculation can be summarized as follows.

Proposition 6.2. Suppose $n \ge 3$ and let $K$ be a compact subset of $(0, \infty)$. There exist constants $C_1, C_2 < \infty$, depending on $n$ and $K$, such that the remainder function
$$G(r, s, t) := \int_{\mathbb{R}} e^{it\lambda} R_n^-(\lambda, r) R_n^-(\lambda, s) \, d\lambda - \frac{1}{2i(-4\pi i t)^{n - \frac32}}\, \frac{(r + s)^{n-2}}{(rs)^{\frac{n-1}{2}}}\, e^{-i\frac{(r+s)^2}{4t}}$$
satisfies the estimates
$$|G(r, s, t)| \le C_1 t^{-(n - \frac52)}, \qquad \Big|\frac{\partial}{\partial t} G(r, s, t)\Big| \le C_2 t^{-(n - \frac12)},$$
uniformly in $r, s \in K$ and $0 < t \le 1$.


Proof. One obtains an exact expression for $G(r, s, t)$ by differentiating the base case $n = 2$ or $n = 3$. Under the assumption $r, s \in K$, every monomial in $r$ and $s$ (including those with fractional and/or negative exponents) can be dominated by a constant. Every expression of the form $t^{-k} \frac{d^k}{dz^k} \omega\big(\frac{-rs}{2t}\big)$ can also be bounded by a constant. Finally, nonnegative powers of $t$ are bounded by 1, since $0 < t \le 1$.

The function $G(r, s, t)$ consists of all the lower-order terms where at least one of the partial derivatives $\frac{\partial}{\partial r}$, $\frac{\partial}{\partial s}$ does not fall on the exponential $e^{-i(r+s)^2/(4t)}$. It follows that each of these terms is $O(t^{-(n - \frac52)})$. If the derivative $\frac{\partial}{\partial t}$ is taken at the end, this can only increase the sharpness of the singularity by a factor of $t^{-2}$.
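As an illustration of how the leading term and the remainder separate, the first step from $n = 3$ to $n = 5$ can be written out explicitly. This worked computation is added for illustration and is not taken from the paper; the abbreviations $E$ and $c_3$ are introduced only here. Write $E = e^{-i(r+s)^2/(4t)}$ and $c_3 = \frac{1}{2i(-4\pi i t)^{3/2}}$, so that (6.6) reads $c_3 \frac{r+s}{rs} E$. Then

```latex
\begin{align*}
\frac{\partial^2}{\partial r\,\partial s}\Big[\frac{r+s}{rs}\,E\Big]
  &= \Big[-\frac{(r+s)^3}{4t^2\,rs}
     + \frac{i\,(r+s)(r^2 - rs + s^2)}{2t\,r^2 s^2}\Big]E, \\
\frac{c_3}{4\pi^2 rs}\,\frac{\partial^2}{\partial r\,\partial s}\Big[\frac{r+s}{rs}\,E\Big]
  &= \frac{1}{2i(-4\pi i t)^{7/2}}\,\frac{(r+s)^3}{(rs)^2}\,E
     + c_3\,\frac{i\,(r+s)(r^2 - rs + s^2)}{8\pi^2 t\,r^3 s^3}\,E .
\end{align*}
```

The first term uses $-\frac{1}{16\pi^2 t^2} = (-4\pi i t)^{-2}$ and is exactly the leading term of (6.7) with $n = 5$. The second term is the remainder $G$ for $n = 5$: it carries $t^{-3/2} \cdot t^{-1} = t^{-5/2} = t^{-(n - \frac52)}$, in agreement with the first estimate of Proposition 6.2.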

To be precise, the proposition above is describing the Fourier transform of a distribution, as the integrand $R_n^-(\lambda, r) R_n^-(\lambda, s)$ experiences growth on the order of $|\lambda|^{\frac{n-3}{2}}$. In Lemma 5.1, the auxiliary function $\psi_L(\lambda)$ is introduced to make the integral absolutely convergent. This has the effect of convolving the distribution $G(r, s, \cdot)$ with the approximate identity $(2\pi)^{-1} \widehat{\psi_L}$.

At a fixed time $0 < t \le 1$, if $L \ge 2t^{-1}$ one can estimate the effect of the convolutions
$$\Big| \big[(2\pi)^{-1} \widehat{\psi_L} * (\cdot)^{-(n - \frac32)} e^{-i\frac{r^2 + s^2}{4(\cdot)}}\big](t) - t^{-(n - \frac32)} e^{-i\frac{r^2 + s^2}{4t}} \Big| \le C_{n,K}\, L^{-1} t^{-(n + \frac12)}$$
and
$$\Big| \big[(2\pi)^{-1} \widehat{\psi_L} * G(r, s, \cdot)\big](t) - G(r, s, t) \Big| \le C_{n,K}\, L^{-1} t^{-(n - \frac12)}$$
by using the Mean Value Theorem and the support property of $\hat{\psi}$. If $L > Ct^{-3}$, these resulting differences are no larger than the initial size estimate for $G(r, s, t)$. Furthermore, at fixed $0 < t \le 1$ they vanish in the limit $L \to \infty$ uniformly over all pairs $r, s \in K$.

Recall the definition of $I_L(t, |x - x_1|, |x_1 - y|)$ in the notation of this section:
$$I_L(t, |x - x_1|, |x_1 - y|) = \int e^{it\lambda} \big[ R_n^+(\lambda, |x - x_1|) R_n^+(\lambda, |x_1 - y|) - R_n^-(\lambda, |x - x_1|) R_n^-(\lambda, |x_1 - y|) \big] \, d\lambda.$$
Under the substitutions $r = |x - x_1|$ and $s = |x_1 - y|$, we have fully characterized the contribution of the term $e^{it\lambda} R_n^-(\lambda, r) R_n^-(\lambda, s)$ to the integral. The inverse Fourier transform of $R_n^+(\lambda, r) R_n^+(\lambda, s)$ is a distribution supported on the half line $\{t \le 0\}$ because of analyticity considerations. After convolution with $\widehat{\psi_L}$, it will be supported in $(-\infty, L^{-1}]$ and therefore vanishes at any $t > 0$ once $L > t^{-1}$. This concludes the proof of Lemma 5.1.

References

1. Agmon, S.: Spectral properties of Schrödinger operators and scattering theory. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 2, no. 2, 151–218 (1975)
2. Goldberg, M.: Dispersive bounds for the three-dimensional Schrödinger equation with almost critical potentials. To appear in Geom. and Funct. Anal.
3. Goldberg, M.: Dispersive estimates for the three-dimensional Schrödinger equation with rough potentials. To appear in Amer. J. Math.
4. Goldberg, M., Schlag, W.: Dispersive estimates for Schrödinger operators in dimensions one and three. Commun. Math. Phys. 251, 157–178 (2004)
5. Gradshteyn, I.S., Ryzhik, I.M.: Table of integrals, series and products. San Diego, New York: Academic Press, 6th edn. 2002
6. Jensen, A.: Spectral properties of Schrödinger operators and time-decay of the wave functions. Results in $L^2(\mathbb{R}^m)$, $m \ge 5$. Duke Math. J. 47, 57–80 (1980)
7. Jensen, A.: Spectral properties of Schrödinger operators and time-decay of the wave functions. Results in $L^2(\mathbb{R}^4)$. J. Math. Anal. Appl. 101, no. 2, 397–422 (1984)
8. Jensen, A., Kato, T.: Spectral properties of Schrödinger operators and time-decay of the wave functions. Duke Math. J. 46, no. 3, 583–611 (1979)
9. Journé, J.-L., Soffer, A., Sogge, C.D.: Decay estimates for Schrödinger operators. Commun. Pure Appl. Math. 44, 573–604 (1991)
10. Rauch, J.: Local decay of scattering solutions to Schrödinger's equation. Commun. Math. Phys. 61, no. 2, 149–168 (1978)
11. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol. II: Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
12. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol. III: Scattering Theory. New York: Academic Press, 1979
13. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol. IV: Analysis of Operators. New York-London: Academic Press, 1978
14. Rodnianski, I., Schlag, W.: Time decay for solutions of Schrödinger's equations with rough and time-dependent potentials. Invent. Math. 155, no. 3, 451–513 (2004)
15. Schlag, W.: Dispersive estimates for Schrödinger operators in two dimensions. Commun. Math. Phys. 257, no. 1, 87–117 (2005)
16. Schlag, W.: Dispersive estimates for Schrödinger operators: a survey. Preprint
17. Simon, B.: Schrödinger Semigroups. Bull. Amer. Math. Soc. 7, 447–526 (1982)
18. Yajima, K.: The $W^{k,p}$-continuity of wave operators for Schrödinger operators. J. Math. Soc. Japan 47, no. 3, 551–581 (1995)

Communicated by B. Simon

Commun. Math. Phys. 266, 239–265 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0014-4


Rotation Sets of Billiards with One Obstacle

Alexander Blokh1, Michał Misiurewicz2, Nándor Simányi1

1 Department of Mathematics, University of Alabama at Birmingham, University Station, Birmingham, AL 35294-2060, USA. E-mail: [email protected]; [email protected]
2 Department of Mathematical Sciences, IUPUI, 402 N. Blackford Street, Indianapolis, IN 46202-3216, USA. E-mail: [email protected]

The first author was partially supported by NSF grant DMS 0456748. The second author was partially supported by NSF grant DMS 0456526. The third author was partially supported by NSF grant DMS 0457168.

Received: 16 August 2005 / Accepted: 10 December 2005
Published online: 14 April 2006 – © Springer-Verlag 2006

Abstract: We investigate the rotation sets of billiards on the m-dimensional torus with one small convex obstacle and in the square with one small convex obstacle. In the first case the displacement function, whose averages we consider, measures the change of the position of a point in the universal covering of the torus (that is, in the Euclidean space); in the second case it measures the rotation around the obstacle. A substantial part of the rotation set has the usual strong properties of rotation sets.

1. Introduction

Traditionally, billiards have been investigated from the point of view of Ergodic Theory. That is, the properties that have been studied were the statistical properties with respect to the natural invariant measure equivalent to the Lebesgue measure. However, it is equally important to investigate the limit behavior of all trajectories and not only of almost all of them. In particular, periodic trajectories (which are of zero measure) are of great interest. A widely used method in this context is to observe that billiards in convex domains are twist maps (see, e.g. [6], Sect. 9.2), so the well developed rotation theory for twist maps (see, e.g. [6], Sect. 9.3) applies to them.

Rotation Theory has been recently developed further and its scope has been significantly widened (see, e.g., Chapter 6 of [1] for a brief overview). This opens possibilities of its application to new classes of billiards. In the general Rotation Theory one considers a dynamical system together with an observable, that is, a function on the phase space with values in a vector space. Then one takes limits of ergodic averages of the observable along longer and longer pieces of trajectories. The rotation set obtained in such a way contains all averages of the observable along periodic orbits, and, by the

Birkhoff Ergodic Theorem, integrals of the observable with respect to all ergodic invariant probability measures. With the natural choice of an observable, information about the appropriate rotation set allows one to describe the behavior of the trajectories of the system (see examples in Chapter 6 of [1]). Exact definitions are given later in the paper. We have to stress again that we consider all trajectories. Indeed, restricting attention to one ergodic measure would result in seeing only one rotation vector. However, rotation vectors of other points, non-typical for the measure, will be missing. Thus, the approach to the billiards should be from the point of view of Topological Dynamics, instead of Ergodic Theory, even though we do consider various invariant measures for which rotation vector can be computed. Observe that the rotation set for a suitably chosen observable is a useful characteristic of the dynamical system. In the simplest case, the observable is the increment in one step (for discrete systems) or the derivative (for systems with continuous time) of another function, called displacement. When the displacement is chosen in a natural way, the results on the rotation set are especially interesting. In this paper we consider two similar classes of billiards, and the observables which we use for them are exactly of that type. One system consists of billiards on an m-dimensional torus with one small convex obstacle which we lift to the universal covering of the torus (that is, to the Euclidean space) and consider the natural displacement there. Those models constitute the rigorous mathematical formulation of the so called Lorentz gas dynamics with periodic configuration of obstacles. They are especially important for physicists doing research in the foundations of nonequilibrium dynamics, since the Lorentz gas serves as a good paradigm for nonequilibrium stationary states, see the nice survey [5]. 
The other system consists of billiards in a square with one small convex obstacle close to the center of the square; here we measure average rotation around a chosen obstacle using the argument as the displacement. We treat both billiards as flows. This is caused by the fact that in the lifting (or unfolding for a billiard in a square) we may have infinite horizon, especially if the obstacle is small. In other words, there are infinite trajectories without reflections, so when considering billiards as maps, we would have to divide by zero. Although this is not so bad by itself (infinity exists), we lose compactness and cannot apply nice general machinery of the rotation theory (see, e.g., [11]). Note that in the case of a billiard in a square we have to deal with trajectories that reflect from the vertices of the square. We can think about such reflection as two infinitesimally close reflections from two adjacent sides. Then it is clear that our trajectory simply comes back along the same line on which it arrived at the vertex, and that this does not destroy the continuity of the flow. The ideas, methods, and results in both cases, the torus and the square, are very similar. However, there are some important differences, and, in spite of its two-dimensionality, in general the square case is more complicated. Therefore we decided to treat the torus case first (in Sects. 2, 3 and 4) and then, when we describe the square case (in Sect. 5), we describe the differences from the torus case, without repeating the whole proofs. We believe that this type of exposition is simpler for a reader than the one that treats both cases simultaneously or the one that produces complicated abstract theorems that are then applied in both cases. In Sect. 6 we get additionally some general results, applicable also to other situations. Let us describe briefly the main results of the paper. The exact definitions will be given later. 
Let us only note that the admissible rotation set is a subset of the full rotation set, about which we can prove much stronger results than about the full rotation


set. Also, a small obstacle does not mean an “arbitrarily small” one. We derive various estimates of the size of the admissible rotation set. In the torus case the estimates that are independent of the dimension are non-trivial because of the behavior of the geometry of $\mathbb{R}^m$ as $m \to \infty$. In both cases we show that the admissible rotation set approximates better and better the full rotation set when the size of the obstacle diminishes. We prove that in both cases, the torus and the square, if the obstacle is small, then the admissible rotation set is convex, rotation vectors of periodic orbits are dense in it, and if $u$ is a vector from its interior, then there exists a trajectory with the rotation vector $u$ (and even an ergodic invariant measure, for which the integral of the velocity is equal to $u$, so that $u$ is the rotation vector of almost every trajectory). The full rotation set is connected, and in the case of the square, is equal to the interval $[-\sqrt{2}/4, \sqrt{2}/4]$. We conjecture that the full rotation set shares the strong properties of the admissible rotation set.

2. Preliminary Results - Torus

Let us consider a billiard on the m-dimensional torus $\mathbb{T}^m = \mathbb{R}^m/\mathbb{Z}^m$ ($m \ge 2$) with one strictly convex (that is, it is convex and its boundary does not contain any straight line segment) obstacle $O$ with a smooth boundary. We do not specify explicitly how large the obstacle is, but let us think about it as a rather small one. When we lift the whole picture to $\mathbb{R}^m$ then we get a family of obstacles $O_k$, where $k \in \mathbb{Z}^m$ and $O_k$ is $O_0$ translated by the vector $k$.

When we speak about a trajectory, we mean a positive (one-sided) billiard trajectory, unless we explicitly say that it is a full (two-sided) one. However, we may mean a trajectory in the phase space, in the configuration space (on the torus), or in the lifting or unfolding (the Euclidean space). It will usually be clear from the context which case we consider.
We will say that the obstacle $O_k$ is between $O_i$ and $O_j$ if it intersects the convex hull of $O_i \cup O_j$ and $k \ne i, j$. For a trajectory $P$, beginning on the boundary of $O$, its type is a sequence $(k_n)_{n=0}^{\infty}$ of elements of $\mathbb{Z}^m$ if the continuous lifting of $P$ to $\mathbb{R}^m$ that starts at the boundary of $O_{k_0}$ reflects consecutively from $O_{k_n}$, $n = 1, 2, \ldots$. In order to make the type unique for a given $P$, we will additionally assume that $k_0 = 0$. Note that except for the case when $P$ at its initial point is tangent to $O$, there are infinitely many reflections, so the type of $P$ is well defined. This follows from the following lemma. In it, we do not count tangency as a reflection.

Lemma 2.1. If a trajectory has one reflection then it has infinitely many reflections.

Proof. Suppose that a trajectory has a reflection, but there are only finitely many of them. Then we can start the trajectory from the last reflection. Its ω-limit set is an affine subtorus of $\mathbb{T}^m$ and the whole (positive) trajectory is contained in this subtorus (and it is dense there). Since we started with a reflection, this subtorus intersects the interior of the obstacle. Since the trajectory is dense in this subtorus, we get a contradiction.

Of course, there may be trajectories without any reflections. In particular, if the obstacle is contained in a ball of radius less than 1/2, there are such trajectories in the direction of the basic unit vectors.

Sometimes we will speak about the type of a piece of a trajectory; then it is a finite sequence. We will also use the term itinerary.

We will call a sequence $(k_n)_{n=0}^{\infty}$ of elements of $\mathbb{Z}^m$ admissible if

1. $k_0 = 0$,
2. for every $n$ we have $k_{n+1} \ne k_n$,
3. for every $n$ there is no obstacle between $O_{k_n}$ and $O_{k_{n+1}}$,
4. for every $n$ the obstacle $O_{k_{n+1}}$ is not between $O_{k_n}$ and $O_{k_{n+2}}$.
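The four conditions can be tested mechanically once a concrete obstacle is fixed. The sketch below is an illustration under a simplifying assumption that is not made in the paper: every obstacle $O_k$ is a closed ball of radius `rho` centered at the lattice point $k$, so that $O_k$ meets the convex hull of $O_i \cup O_j$ exactly when the distance from $k$ to the segment $[i, j]$ is at most `2 * rho`. All function names are hypothetical, lattice points are represented as tuples of integers, and only a finite prefix of a sequence is checked.

```python
import itertools
import math

def dist_point_segment(p, a, b):
    """Euclidean distance from point p to the segment [a, b]."""
    ab = [bc - ac for ac, bc in zip(a, b)]
    ap = [pc - ac for ac, pc in zip(a, p)]
    denom = sum(c * c for c in ab)
    s = 0.0 if denom == 0 else max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = [ac + s * c for ac, c in zip(a, ab)]
    return math.dist(p, closest)

def is_between(k, i, j, rho):
    """O_k is between O_i and O_j (round obstacles of radius rho; an assumption)."""
    if k == i or k == j:
        return False
    return dist_point_segment(k, i, j) <= 2 * rho

def some_obstacle_between(i, j, rho):
    """Search the lattice points near the segment [i, j] for an obstacle between O_i and O_j."""
    margin = math.ceil(2 * rho)
    ranges = [range(min(a, b) - margin, max(a, b) + margin + 1) for a, b in zip(i, j)]
    return any(is_between(k, i, j, rho) for k in itertools.product(*ranges))

def is_admissible(seq, rho):
    """Check conditions 1-4 on a finite sequence of lattice points (tuples of ints)."""
    if not seq or any(c != 0 for c in seq[0]):
        return False                                    # condition 1
    for a, b in zip(seq, seq[1:]):
        if a == b or some_obstacle_between(a, b, rho):  # conditions 2 and 3
            return False
    for a, b, c in zip(seq, seq[1:], seq[2:]):
        if is_between(b, a, c, rho):                    # condition 4
            return False
    return True

rho = 0.3  # smaller than sqrt(2)/4
square_loop = [(0, 0), (1, 0), (1, 1), (0, 1)]
straight_line = [(0, 0), (1, 0), (2, 0)]
```

With `rho = 0.3` the sequence `square_loop` passes all four conditions, while `straight_line` fails condition 4, because $O_{(1,0)}$ lies on the segment joining the centers of $O_{(0,0)}$ and $O_{(2,0)}$.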

Theorem 2.2. For any admissible sequence $(k_n)_{n=0}^{\infty}$ there is a trajectory with type $(k_n)_{n=0}^{\infty}$. If additionally there is $p \in \mathbb{Z}^m$ and a positive integer $q$ such that $k_{n+q} = k_n + p$ for every $n$ then this trajectory can be chosen periodic of discrete period $q$ (that is, after $q$ reflections we come back to the starting point in the phase space). Similarly, for any admissible sequence $(k_n)_{n=-\infty}^{\infty}$ there is a trajectory with type $(k_n)_{n=-\infty}^{\infty}$.

Proof. Fix $n$. For every sequence $A = (x_i)_{i=0}^{n}$ such that $x_i$ belongs to the boundary of $O_{k_i}$ for $i = 0, 1, \ldots, n$, let $\Gamma(A)$ be the curve obtained by joining consecutive points $x_i$ by straight segments (such a curve may intersect interiors of some obstacles). Since the Cartesian product of the boundaries of $O_{k_i}$ is compact and the length of $\Gamma(A)$ depends continuously on $A$, there is an $A$ for which this length is minimal. We claim that in such a case $\Gamma(A)$ is a piece of a trajectory. By (3), the segment $I_i$ joining $x_i$ with $x_{i+1}$ cannot intersect any obstacle except $O_{k_i}$ and $O_{k_{i+1}}$. If it intersects $O_{k_i}$ at more than one point, it intersects its boundary at $x_i$ and at another point $y$. Then replacing $x_i$ by $y$ will make $\Gamma(A)$ shorter, a contradiction. This argument does not work only if $x_{i-1}$, $x_i$ and $x_{i+1}$ are collinear and $x_i$ lies between $x_{i-1}$ and $x_{i+1}$. However, such a situation is excluded by (4). This proves that $I_i$ does not intersect $O_{k_i}$ at more than one point. Similarly, it does not intersect $O_{k_{i+1}}$ at more than one point. Now the known property of curves with minimal lengths guarantees that at every $x_i$, $i = 1, 2, \ldots, n - 1$, the incidence and reflection angles are equal. This proves our claim. For a two-sided sequence $(k_n)_{n=-\infty}^{\infty}$ the argument is very similar.

Now we make this construction for every $n$ and get a sequence $(A_n)_{n=1}^{\infty}$ of pieces of trajectories. We note their initial points in the phase space (points and directions) and choose a convergent subsequence of those. Then the trajectory of this limit point in the phase space will have the prescribed type.

If there is $p \in \mathbb{Z}^m$ and a positive integer $q$ such that $k_{n+q} = k_n + p$ for every $n$, then we consider only the sequence $A = (x_i)_{i=0}^{q-1}$ and repeat the first part of the above proof, adding the segment joining $x_{q-1}$ with $x_0 + p$ to $\Gamma(A)$.

Note that by Corollary 1.2 of [4], if the obstacle is strictly convex then a periodic orbit from the above theorem is unique.

The next lemma essentially expresses the fact that any billiard flow with convex obstacles lacks focal points. It follows from the corollary after Lemma 2 of [9]. The types of trajectory pieces about which we speak in this lemma are not necessarily admissible.

Lemma 2.3. For a given finite sequence $B = (k_n)_{n=0}^{s}$ of elements of $\mathbb{Z}^m$ and points $x_0$, $x_s$ on the boundaries of $O_{k_0}$ and $O_{k_s}$ respectively, there is at most one trajectory piece of type $B$ starting at $x_0$ and ending at $x_s$. The same remains true if we allow the first segment of the trajectory piece to cross $O_{k_0}$ and the last one to cross $O_{k_s}$ (as in Fig. 2.1).

Corollary 2.4. If the trajectory piece from Lemma 2.3 exists and has admissible type, then it is the shortest path of type $B$ starting at $x_0$ and ending at $x_s$.

Proof. Similarly as in the proof of Theorem 2.2, the shortest path of type $B$ from $x_0$ to $x_s$ is a trajectory piece (here we allow the first segment of the trajectory piece to cross


Fig. 2.1. A trajectory piece crossing the first and last obstacles

O k0 and the last one to cross O ks ). By Lemma 2.3, it is equal to the trajectory piece from that lemma. Of course this trajectory piece depends on x 0 and x s . However, its length depends on those two points only up to an additive constant. Denote by c the diameter of O. Lemma 2.5. For every admissible ﬁnite sequence B = (kn )sn=0 of elements of Zm the lengths of trajectory pieces of type B (even if we allow them to cross O k0 and O ks ) differ by at most 2c. The displacements along those trajectory pieces also differ by at most 2c. Proof. Let and be two such trajectory pieces, joining x 0 with x s and y0 with ys respectively, where x 0 , y0 belong to the boundary of O k0 and x s , ys belong to the boundary of O ks . Replace the first segment of by adding to it the segment joining x 0 with y0 , and do similarly with the last segment of . Then we get a path joining y0 with ys of type B. By Corollary 2.4, its length is not smaller than the length of . On the other hand, its length is not larger than the length of plus 2c. Performing the same construction with the roles of and reversed, we conclude that the difference of the lengths of those two paths is not larger than 2c. The second statement of the lemma is obvious. One can look at the definition of an admissible sequence in the following way. Instead ∞ m of a sequence (kn )∞ n=0 of elements of Z we consider the sequence (l n )n=1 , where ∞ ∞ l n = kn − kn−1 . Since k0 = 0, knowing (l n )n=1 we can recover (kn )n=0 . Now, condition (3) can be restated as no obstacle between O0 and Ol n , and condition (4) as the obstacle Ol n not between O0 and Ol n +l n+1 . Let G be the directed graph whose vertices are those j ∈ Zm \ {0} for which there is no obstacle between O0 and O j , and there is an edge (arrow) from j to i if and only if O j is not between O0 and O j + i . 
Then every sequence $(l_n)_{n=1}^{\infty}$ obtained from an admissible sequence is a one-sided infinite path in $G$, and vice versa, each one-sided infinite path in $G$ is a sequence $(l_n)_{n=1}^{\infty}$ obtained from an admissible sequence. Hence, we can speak about paths corresponding to admissible sequences and admissible sequences corresponding to paths.

Lemma 2.6. The set of vertices of $G$ is finite.

Proof. Fix an interior point $x$ of $O_0$. By Lemma 2.1, any ray beginning at $x$ intersects the interior of some $O_k$ with $k \neq 0$. Let $V_k$ be the set of directions (points of the unit sphere) for which the corresponding ray intersects the interior of $O_k$. This set is open, so we get an open cover of a compact unit sphere. It has a finite subcover, so there exists


A. Blokh, M. Misiurewicz, N. Simányi

a constant $M > 0$ such that every ray from $x$ of length $M$ intersects the interior of some $O_k$ with $k \neq 0$. This proves that the set of vertices of $G$ is finite.

Note that in $G$ there is never an edge from a vertex to itself. Moreover, there is a kind of symmetry in $G$. Namely, if $k$ is a vertex then $-k$ is a vertex; there is an edge from $k$ to $-k$; and if there is an edge from $k$ to $j$ then there is an edge from $-j$ to $-k$. The following lemma establishes another symmetry in $G$.

Lemma 2.7. If $k, j \in \mathbb{Z}^m$ and $O_k$ is between $O_0$ and $O_{k+j}$, then $O_j$ is also between $O_0$ and $O_{k+j}$. Thus, if there is an edge in $G$ from $k$ to $j$ then there is an edge from $j$ to $k$.

Proof. The map $f(x) = k + j - x$ defines an isometry of $\mathbb{Z}^m$ and $f(O_0) = O_{k+j}$, $f(O_{k+j}) = O_0$, $f(O_k) = O_j$. This proves the first statement of the lemma. The second statement follows from the first one and from the definition of edges in $G$.

We will say that the obstacle $O$ is small if it is contained in a closed ball of radius smaller than $\sqrt{2}/4$. To simplify the notation, in the rest of the paper, whenever the obstacle is small, we will be using the lifting to $\mathbb{R}^m$ such that the centers of the balls of radii smaller than $\sqrt{2}/4$ containing the obstacles will be at the points of $\mathbb{Z}^m$. Denote by $U$ the set of unit vectors from $\mathbb{Z}^m$ (that is, the ones with one component $\pm 1$ and the rest of components 0), and by $A_m$ the set $\{-1, 0, 1\}^m \setminus \{0\}$ (we use the subscript $m$ by $A$, since this set will be used sometimes when we consider all dimensions at once). In particular, $U \subset A_m$.

Lemma 2.8. Let $O$ be small. If $k, l \in \mathbb{Z}^m \setminus \{0\}$ and $\langle k, l\rangle \le 0$, then $O_k$ is not between $O_0$ and $O_{k+l}$. In particular, if $k$ and $l$ are vertices of $G$ and $\langle k, l\rangle \le 0$, then there are edges in $G$ from $k$ to $l$ and from $l$ to $k$.

Proof. We will use elementary geometry. Consider the triangle with vertices $A = 0$, $B = k$ and $C = k + l$. The angle at the vertex $B$ is at most $\pi/2$, and the lengths of the sides $AB$ and $BC$ are at least 1.
We need to construct a straight line which separates the plane $P$ in which the triangle $ABC$ lies into two half-planes with the first one containing the open disk of radius $\sqrt{2}/4$ centered at $B$ and the second one containing such disks centered at $A$ and $C$. Then the hyperplane of dimension $m - 1$ through this line and perpendicular to the plane $P$ will separate $O_k$ from $O_0$ and $O_{k+l}$. This will prove that there is an edge in $G$ from $k$ to $l$. By Lemma 2.7, there will be also an edge in $G$ from $l$ to $k$.

Let $D$ and $E$ be the points on the sides $BA$ and $BC$ respectively, whose distance from $B$ is $1/2$, and let $L$ be the straight line through $D$ and $E$. Since the angle at the vertex $B$ is at most $\pi/2$, the distance of $B$ from $L$ is at least $\sqrt{2}/4$. Since $|AD| \ge |BD|$ and $|CE| \ge |BE|$, the distances of $A$ and $C$ from $L$ are at least as large as the distance of $B$ from $L$. This completes the proof.

Lemma 2.9. For a billiard on a torus with a small obstacle, all elements of $A_m$ are vertices of $G$.

Proof. Let $u \in A_m$ and $v \in \mathbb{Z}^m \setminus \{0, u\}$. If $v = (v_1, v_2, \dots, v_m)$ then $|\langle v, u\rangle| \le \sum_{i=1}^{m} |v_i| \le \|v\|^2$, so $\langle v, u - v\rangle \le 0$. Therefore, by Lemma 2.8, $v$ is not between $0$ and $u$. This proves that $u$ is a vertex of $G$.

Lemma 2.10. Assume that $O$ is small. Then $G$ is connected, and for every pair of vertices $k, l$ of $G$ there is a path of length at most 3 from $k$ to $l$ in $G$, via elements of $U$.


Proof. By Lemma 2.9, the set of vertices of $G$ contains $U$. Let $k, l$ be vertices of $G$. Then $k, l \neq 0$, so there exist elements $u, v$ of $U$ such that $\langle k, u\rangle \le 0$ and $\langle l, v\rangle \le 0$. By Lemma 2.8, there are edges in $G$ from $k$ to $u$ and from $v$ to $l$. If $u = v$ then $kul$ is a path of length 2 from $k$ to $l$. If $u \neq v$ then $\langle u, v\rangle = 0$, so by Lemma 2.8 there is an edge from $u$ to $v$. Then $kuvl$ is a path of length 3 from $k$ to $l$.

3. Rotation Set - Torus

Now we have enough information to start investigating the rotation set $R$ of our billiard. It consists of limits of the sequences $((y_n - x_n)/t_n)_{n=1}^{\infty}$, where there is a trajectory piece in the lifting from $x_n$ to $y_n$ of length $t_n$, and $t_n$ goes to infinity. Since we have much better control of pieces of trajectories of admissible type, we introduce also the admissible rotation set $AR$, where in the definition we consider only such pieces. Clearly, the admissible rotation set is contained in the rotation set. By the definition, both sets are closed. It is also clear that they are contained in the closed unit ball in $\mathbb{R}^m$, centered at the origin. Due to the time-reversibility, both sets $R$ and $AR$ are centrally symmetric with respect to the origin.

For a given point $p$ in the phase space let us consider the trajectory $t \mapsto T(t)$ in $\mathbb{R}^m$ starting at $p$. We can ask whether the limit of $(T(t) - T(0))/t$, as $t$ goes to infinity, exists. If it does, we will call it the rotation vector of $p$. Clearly, it is the same for every point in the phase space of the full trajectory of $p$, so we can speak of the rotation vector of a trajectory. In particular, every periodic orbit has a rotation vector, and it is equal to $(T(s) - T(0))/s$, where $s$ is the period of the orbit.

Note that if we use the discrete time (the number of reflections) rather than continuous time, we would get all good properties of the admissible rotation set from the description of the admissible sequences via the graph $G$ and the results of [11].
Since we are using continuous time, the situation is more complicated. Nevertheless, Lemma 2.5 allows us to get similar results. For a trajectory piece $T$ we will denote by $|T|$ its length and by $d(T)$ its displacement.

Theorem 3.1. The admissible rotation set of a billiard on a torus with a small obstacle is convex.

Proof. Fix vectors $u, v \in AR$ and a number $t \in (0, 1)$. We want to show that the vector $tu + (1 - t)v$ belongs to $AR$. Fix $\varepsilon > 0$. By the definition, there are finite admissible sequences $A, B$ and trajectory pieces $T, S$ of type $A, B$ respectively, such that

$$\left\| \frac{d(T)}{|T|} - u \right\| < \varepsilon \quad\text{and}\quad \left\| \frac{d(S)}{|S|} - v \right\| < \varepsilon. \tag{3.1}$$

Both $A, B$ can be represented as finite paths in the graph $G$. By Lemma 2.10, there are admissible sequences $C_1, C_2, C_3$ represented in $G$ as paths of length at most 3, via elements of $U$, such that the concatenations of the form

$$D = AC_1 AC_1 \dots AC_1 A C_2 B C_3 B C_3 \dots B C_3 B$$

are admissible. There exists a trajectory piece $Q$ of type $D$. We will estimate its displacement and length. Assume that in $D$ the block $A$ appears $p$ times and the block $B$ appears $q - p$ times. Let $d_A, d_B$ be the total displacements due to the blocks $A, B$ respectively. We get

$$\|d_A - p\,d(T)\| \le 2pc \quad\text{and}\quad \|d_B - (q - p)\,d(S)\| \le 2(q - p)c.$$


The displacement due to each of the blocks $C_1, C_2, C_3$ is at most of norm $2 + 2c$, so the total displacement due to all those blocks is at most of norm $q(2 + 2c)$. If we replace all displacements by the trajectory lengths, we get the same estimates (we use here Lemma 2.5). Thus we get the following estimates:

$$\|d(Q) - \alpha\| \le 4qc + 2q \quad\text{and}\quad \bigl|\,|Q| - \beta\,\bigr| \le 4qc + 2q,$$

where $\alpha = p\,d(T) + (q - p)\,d(S)$ and $\beta = p|T| + (q - p)|S|$. Therefore

$$\left\| \frac{d(Q)}{|Q|} - \frac{\alpha}{\beta} \right\| \le \left\| \frac{d(Q)}{|Q|} - \frac{\alpha}{|Q|} \right\| + \left\| \frac{\alpha}{|Q|} - \frac{\alpha}{\beta} \right\| \le \frac{4qc + 2q}{|Q|} + \|\alpha\| \frac{4qc + 2q}{|Q|\beta} = (4c + 2)\,\frac{q}{|Q|} \left( 1 + \frac{\|\alpha\|}{\beta} \right). \tag{3.2}$$

Set $s = p|T|/\beta$. Then $1 - s = (q - p)|S|/\beta$, so

$$\frac{\alpha}{\beta} = s\,\frac{d(T)}{|T|} + (1 - s)\,\frac{d(S)}{|S|}. \tag{3.3}$$

By (3.1), we get

$$\frac{\|\alpha\|}{\beta} \le \max(\|u\|, \|v\|) + \varepsilon.$$

Moreover,

$$\frac{|Q|}{q} \ge \frac{\beta}{q} - (4c + 2) \ge \min(|T|, |S|) - (4c + 2).$$

Therefore if $|T|$ and $|S|$ are sufficiently large (we may assume this), the right-hand side of (3.2) is less than $\varepsilon$. Together with (3.3), we get

$$\left\| \frac{d(Q)}{|Q|} - \left( s\,\frac{d(T)}{|T|} + (1 - s)\,\frac{d(S)}{|S|} \right) \right\| < \varepsilon.$$

By this inequality and (3.1), it remains to show that by the right choice of $p, q$ we can approximate $t$ by $s$ with an arbitrary accuracy. We can write $s = f(x)$, where $x = p/q$ and

$$f(x) = \frac{|T|x}{|T|x + |S|(1 - x)}.$$

The function $f$ is continuous on $[0, 1]$, takes value 0 at 0 and value 1 at 1. Therefore the image of the set of rational numbers from $(0, 1)$ is dense in $[0, 1]$. This completes the proof.

Theorem 3.2. For a billiard on a torus with a small obstacle, rotation vectors of periodic orbits of admissible type are dense in the admissible rotation set.


Proof. Fix a vector $u \in AR$ and $\varepsilon > 0$. We want to find a periodic orbit of admissible type whose rotation vector is in the $\varepsilon$-neighborhood of $u$. By the definition, there is an admissible sequence $A$ and a trajectory piece $T$ of type $A$ such that

$$\left\| \frac{d(T)}{|T|} - u \right\| < \frac{\varepsilon}{2}. \tag{3.4}$$

Moreover, we can assume that $|T|$ is as large as we need. As in the proof of Theorem 3.1, we treat $A$ as a path in the graph $G$ and find an admissible sequence $C$ represented in $G$ as a path of length at most 3, via elements of $U$, such that the periodic concatenation $D = ACACAC\dots$ is admissible. There exists a periodic orbit of type $D$. Let $Q$ be its piece corresponding to the itinerary $AC$. We will estimate its displacement and length. As in the proof of Theorem 3.1, we get $\|d(Q) - d(T)\| \le 4c + 2$ and $\bigl|\,|Q| - |T|\,\bigr| \le 4c + 2$. Therefore

$$\left\| \frac{d(Q)}{|Q|} - \frac{d(T)}{|T|} \right\| \le \left\| \frac{d(Q)}{|Q|} - \frac{d(T)}{|Q|} \right\| + \left\| \frac{d(T)}{|Q|} - \frac{d(T)}{|T|} \right\| \le \frac{4c + 2}{|T| - (4c + 2)} + \frac{\|d(T)\|}{|T|} \cdot \frac{4c + 2}{|T| - (4c + 2)}.$$

If $|T|$ is sufficiently large then the right-hand side of this inequality is smaller than $\varepsilon/2$. Together with (3.4) we get

$$\left\| \frac{d(Q)}{|Q|} - u \right\| < \varepsilon.$$

This completes the proof.
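The final estimate in the proof of Theorem 3.2 can be evaluated numerically to see how long the piece $T$ has to be. The following is only an illustrative sketch: the function names are hypothetical, and the second function relies on the fact that the displacement never exceeds the length, so that the right-hand side is at most $2(4c+2)/(|T| - (4c+2))$.

```python
def periodic_orbit_error_bound(t_len: float, d_norm: float, c: float) -> float:
    """Right-hand side of the last displayed estimate in the proof of Theorem 3.2.

    t_len  -- |T|, the length of the trajectory piece T
    d_norm -- ||d(T)||, the norm of its displacement (never larger than |T|)
    c      -- the diameter of the obstacle
    """
    gap = 4 * c + 2
    if t_len <= gap:
        raise ValueError("the estimate requires |T| > 4c + 2")
    return gap / (t_len - gap) * (1 + d_norm / t_len)


def sufficient_length(eps: float, c: float) -> float:
    """A threshold for |T| above which the bound is below eps / 2.

    Since ||d(T)|| <= |T|, the bound is at most 2 * gap / (|T| - gap),
    and this is below eps / 2 exactly when |T| > gap * (1 + 4 / eps).
    """
    gap = 4 * c + 2
    return gap * (1 + 4 / eps)


if __name__ == "__main__":
    c, eps = 0.3, 0.01
    t_len = sufficient_length(eps, c) + 1.0
    # Worst case for the bound: displacement as large as the length.
    print(periodic_orbit_error_bound(t_len, t_len, c) < eps / 2)
```

The threshold grows like $1/\varepsilon$, which matches the statement that $|T|$ only has to be "sufficiently large".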

We will refer to closed paths in $G$ as loops.

Remark 3.3. It is clear that in the above theorem we can additionally require that the corresponding loop in the graph $G$ passes through a given vertex.

To get more results, we need a generalization of a lemma from [8] to higher dimensions.

Lemma 3.4. Assume that $0 \in \mathbb{R}^m$ lies in the interior of the convex hull of a set of $m + 1$ vectors $v_0, v_1, \dots, v_m$. For every $K > 0$, if $L$ is large enough then the following property holds. If $x \in \mathbb{R}^m$ and $\|x\| \le L$ then there exist $i \in \{0, 1, \dots, m\}$ and a positive integer $n$ such that $\|x + n v_i\| \le L - K$. Moreover, $\|x + j v_i\| \le L$ for $j = 1, 2, \dots, n - 1$.

Proof. Let us fix $K > 0$. We will consider only $L$ such that $L > K$. Set $M = \max_i \|v_i\|$. For each $x \in \mathbb{R}^m$ with $\|x\| = 1$ let $f(x)$ be the minimum of $\|x + t v_i\|$ over $i = 0, 1, \dots, m$ and $t \ge 0$. By the assumption, $f(x) < 1$. Clearly $f$ is continuous, and therefore there is $\varepsilon > 0$ such that $f(x) \le 1 - \varepsilon$ for every $x$. Thus, for every $y \in \mathbb{R}^m$ there are $i \in \{0, 1, \dots, m\}$ and $s \ge 0$ such that $\|y + s v_i\| \le (1 - \varepsilon)\|y\|$. Let $n$ be the smallest integer larger than $s$. Then $n > 0$, and if $L \ge (M + K)/\varepsilon$ and $\|y\| \le L$ then $\|y + n v_i\| \le (1 - \varepsilon)L + M \le L - K$. The last statement of the lemma follows from the convexity of the balls in $\mathbb{R}^m$.
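Lemma 3.4 is an existence statement, but for concrete vectors the pair $(i, n)$ can be found by a direct search. The sketch below is a hypothetical brute-force illustration (the names `norm` and `find_step` and the cutoff `n_max` are assumptions, not part of the paper); it returns the first pair it finds, checking both conclusions of the lemma.

```python
import math


def norm(vec):
    """Euclidean norm of a vector given as a sequence of floats."""
    return math.sqrt(sum(t * t for t in vec))


def find_step(x, vectors, L, K, n_max=1000):
    """Search for (i, n) with ||x + n*v_i|| <= L - K and
    ||x + j*v_i|| <= L for j = 1, ..., n - 1.

    Returns the first pair found, or None if no pair exists with n <= n_max.
    """
    for i, v in enumerate(vectors):
        for n in range(1, n_max + 1):
            end = [a + n * b for a, b in zip(x, v)]
            if norm(end) > L - K:
                continue
            # Intermediate points; by convexity of the ball these stay within L.
            inside = all(
                norm([a + j * b for a, b in zip(x, v)]) <= L
                for j in range(1, n)
            )
            if inside:
                return i, n
    return None


if __name__ == "__main__":
    # Three planar vectors whose convex hull contains 0 in its interior.
    vs = [(1.0, 0.0), (-1.0, 1.0), (-1.0, -1.0)]
    print(find_step((39.5, 0.0), vs, L=40.0, K=1.0))
```

For these three vectors one can take $\varepsilon = 1 - \sin(67.5^\circ)$ and $M = \sqrt{2}$ in the proof, so $L = 40$ exceeds $(M + K)/\varepsilon$ for $K = 1$ and the search is guaranteed to succeed for every $x$ with $\|x\| \le 40$.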


Now we can follow the methods of [8] and [11]. We assume that our billiard has a small obstacle. For a full trajectory $T$ we will denote by $T(t)$ the point to which we get after time $t$.

Lemma 3.5. If $u$ is a vector from the interior of $AR$, then there exists a full trajectory $T$ of admissible type and a constant $M$ such that

$$\|T(t) - T(0) - tu\| \le M \tag{3.5}$$

for all $t \in \mathbb{R}$.

Proof. Let us think first of positive $t$'s. Since $u$ is in the interior of $AR$, one can choose $m + 1$ vectors $w_0, w_1, \dots, w_m \in AR$ such that $u$ is in the interior of the convex hull of those vectors. Moreover, by Theorem 3.2 and Remark 3.3 we may assume that $w_i$ are rotation vectors of periodic orbits $P_i$ of admissible type, corresponding to loops $A_i$ in $G$ passing through a common vertex $V$. We can also consider those loops as finite paths, ending at $V$ and starting at the next vertex in the loop.

Set $v_i = d(P_i) - |P_i|u = |P_i|w_i - |P_i|u = |P_i|(w_i - u)$. Since $u$ is in the interior of the convex hull of the vectors $w_i$, we get that 0 is in the interior of the convex hull of the vectors $w_i - u$, and therefore 0 is in the interior of the convex hull of the vectors $v_i$.

We will construct our trajectory, or rather a corresponding path in the graph $G$, by induction, using Lemma 3.4. Then we get a corresponding trajectory of admissible type by Theorem 2.2. We start with the empty sequence, which corresponds to the trajectory piece consisting of one point. Then, when a path $B_j$ in $G$ (corresponding to a trajectory piece $Q_j$) is constructed, and it ends at $V$, we look at the vector $x = d(Q_j) - |Q_j|u$ and choose $v_i$ and $n$ according to Lemma 3.4. We extend $B_j$ by adding $n$ repetitions of $A_i$ (corresponding to a trajectory piece that we can call $nP_i$) and obtain $B_{j+1}$ (corresponding to a trajectory piece $Q_{j+1}$). To do all this, we have to define $K$ that is used in Lemma 3.4 and prove that if $\|x\| \le L$ then also $\|d(Q_{j+1}) - |Q_{j+1}|u\| \le L$.

Let us analyze the situation. When we concatenate $Q_j$ and $A_i \dots A_i$ ($n$ times) to get $Q_{j+1}$, by Lemma 2.5 we have

$$\bigl\| \bigl(d(Q_{j+1}) - |Q_{j+1}|u\bigr) - \bigl(d(Q_j) - |Q_j|u\bigr) - \bigl(d(nP_i) - |nP_i|u\bigr) \bigr\| \le 4c(1 + \|u\|).$$

Moreover, $d(nP_i) - |nP_i|u = n(d(P_i) - |P_i|u) = n v_i$. Therefore in Lemma 3.4 we have to take $K = 4c(1 + \|u\|)$ and then we can make the induction step.

In such a way we obtain an infinite path $B$ in $G$. By Theorem 2.2, there exists a billiard trajectory $T$ of type $B$. Note that we did not complete the proof yet, because we got (3.5) (with $M = L$) only for a sequence of times $t = |Q_j|$. We can do better using the last statement of Lemma 3.4. This shows that (3.5) with $M = L + K$ holds for a sequence of times $t$ with the difference of two consecutive terms of this sequence not exceeding $s = \max(|P_0|, |P_1|, \dots, |P_m|) + 4c$. Every time $t'$ can be written as $t + r$ with $t$ being a term of the above sequence (so that (3.5) holds with $M = L + K$) and $r \in [0, s)$. Then

$$\|T(t') - T(0) - t'u\| \le L + K + \|T(t + r) - T(t)\| + r\|u\|.$$

Thus, (3.5) holds for all times with $M = L + K + s + s\|u\|$. The same can be done for negative $t$'s, so we get a full (two-sided) path, and consequently a full trajectory.


Now we are ready to prove the next important theorem. Remember that our phase space is a factor of a compact connected subset of the unit tangent bundle over the torus.

Theorem 3.6. For a billiard on a torus with a small obstacle, if $u$ is a vector from the interior of $AR$, then there exists a compact invariant subset $Y$ of the phase space, such that every trajectory from $Y$ has admissible type and rotation vector $u$.

Proof. Let $Y$ be the closure of the trajectory $T$ from Lemma 3.5, taken in the phase space. If $S$ is a trajectory obtained from $T$ by starting it at time $s$ (that is, $S(t) = T(s + t)$) then by Lemma 3.5 we get $\|S(t) - S(0) - tu\| \le 2M$ for all $t$. By continuity of the flow, this property extends to every trajectory $S$ from $Y$. This proves that every trajectory from $Y$ has rotation vector $u$. Since a trajectory of admissible type has no tangencies to the obstacle (by Condition 4 of the definition of admissible sequences), each finite piece of a trajectory from $Y$ has admissible type. Therefore every trajectory from $Y$ has admissible type.

Remark 3.7. The set $Y$ above can be chosen minimal, and therefore the trajectory from Lemma 3.5 can be chosen recurrent.

As a trivial corollary to Theorem 3.6 we get the following.

Corollary 3.8. For a billiard on a torus with a small obstacle, if $u$ is a vector from the interior of $AR$, then there exists a trajectory of admissible type with rotation vector $u$.

We also get another corollary, which follows from the existence of an ergodic measure on $Y$.

Corollary 3.9. For a billiard on a torus with a small obstacle, if $u$ is a vector from the interior of $AR$, then there exists an ergodic invariant probability measure in the phase space, for which the integral of the velocity is equal to $u$ and almost every trajectory is of admissible type.

This corollary is stronger than Corollary 3.8, because from it and from the Ergodic Theorem it follows that almost every point has rotation vector $u$. The details of the necessary formalism are described in Sect. 6. Of course, in our particular case both results are corollaries to Theorem 3.6, so we know anyway that all points of $Y$ have rotation vector $u$.

4. Admissible Rotation Set is Large

In this section we will investigate how large the admissible rotation set $AR$ is. This of course depends on the size of the obstacle and the dimension of the space. We will measure the size of $AR$ by the radius of the largest ball centered at the origin and contained in $AR$.

We will start with the estimates that depend on the dimension $m$ of the space but not on the size of the obstacle (provided it is small in our meaning). In order to do it, we first identify some elements of $\mathbb{Z}^m$ that are always vertices of $G$. Recall that $A_m = \{-1, 0, 1\}^m \setminus \{0\}$.


Lemma 4.1. If $k \in A_m$ then $(\sqrt{2}/2)(k/\|k\|) \in AR$.

Proof. If $k \in U$ then there is a vector $l \in U$ orthogonal to $k$. Vectors $k + l$ and $k - l$ belong to $A_m$ and one can easily check that there are edges from $k + l$ to $k - l$ and from $k - l$ to $k + l$ in $G$. The periodic path $(k + l)(k - l)(k + l)(k - l)\dots$ in $G$ gives us a periodic orbit $P$ of the billiard. The displacement along $P$ is $2k$ and the period of $P$ is smaller than $\|k + l\| + \|k - l\| = 2\sqrt{2}$, so the rotation vector of $P$ is $tk$, where $t > \sqrt{2}/2$. Since $0 \in AR$ and $AR$ is convex, we get $(\sqrt{2}/2)(k/\|k\|) \in AR$.

Assume now that $k \in A_m$ and $\|k\| > 1$. Then $k = l + u$ for some $l \in A_m$ and $u \in U$ such that $u$ is orthogonal to $l$. By Lemma 2.9, $l$ is a vertex of $G$. By Lemma 2.8 there are edges in $G$ from $l$ to $u$ and from $u$ to $l$. As before, we get a periodic orbit of the billiard (corresponding to the periodic path $lulu\dots$) with the displacement $k$ and period less than $\|l\| + \|u\| = \sqrt{\|k\|^2 - 1} + 1$, so

$$\frac{\|k\|}{\sqrt{\|k\|^2 - 1} + 1} \cdot \frac{k}{\|k\|} \in AR.$$

Since $\|k\| / (\sqrt{\|k\|^2 - 1} + 1) \ge \sqrt{2}/2$, the vector $(\sqrt{2}/2)(k/\|k\|)$ also belongs to $AR$.

By the results of [2], the convex hull of $\{v/\|v\| : v \in A_m\}$ contains the closed ball centered at 0 with radius $2/\sqrt{\ln m + 5}$. From this and Lemma 4.1 we get immediately the following result.

Theorem 4.2. For a billiard on a torus with a small obstacle, the set $AR$ contains the closed ball centered at 0 with radius $\sqrt{2/(\ln m + 5)}$.

Now we proceed to the estimates that are independent of the dimension $m$. This is not as simple as it seems. As we saw above, a straightforward attempt that takes into account only those vectors of $\mathbb{Z}^m$ for which we can show explicitly that they are vertices of $G$ gives estimates that go to 0 as $m \to \infty$. By the results of [2], those estimates cannot be significantly improved. Therefore we have to use another method.

Let us assume first that $O_0$ is the ball centered at 0 of radius $r < \sqrt{2}/4$. We start with a simple lemma.

Lemma 4.3. Assume that $O_l$ is between $O_0$ and $O_k$ and let $\vartheta$ be the angle between the vectors $k$ and $l$. Then

$$\langle k, l\rangle^2 \ge \|k\|^2 \bigl( \|l\|^2 - 4r^2 \bigr) > \|k\|^2 \Bigl( \|l\|^2 - \frac{1}{2} \Bigr) \tag{4.1}$$

and

$$\sin\vartheta \le 2r/\|l\|. \tag{4.2}$$

Proof. If $O_l$ is between $O_0$ and $O_k$ then there is a line parallel to the vector $k$, whose distances from 0 and $l$ are at most $r$. Therefore the distance of $l$ from the line through 0 and $k$ is at most $2r$. The orthogonal projection of $l$ to this line is $(\langle k, l\rangle / \|k\|^2)k$, so

$$\left\| l - \frac{\langle k, l\rangle}{\|k\|^2}\, k \right\|^2 \le 4r^2.$$

The left-hand side of this inequality is equal to

$$\|l\|^2 - \frac{\langle k, l\rangle^2}{\|k\|^2}$$

and $4r^2 < 1/2$, so (4.1) holds. By (4.1), we have

$$\sin^2\vartheta = 1 - \cos^2\vartheta = 1 - \frac{\langle k, l\rangle^2}{\|k\|^2 \|l\|^2} \le 1 - \frac{\|k\|^2 (\|l\|^2 - 4r^2)}{\|k\|^2 \|l\|^2} = \frac{4r^2}{\|l\|^2},$$

so (4.2) holds.

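Clearly, the angle $\vartheta$ above is acute.

The estimate in the next lemma requires extensive use of the fact that the vectors that we are considering have integer components.

Lemma 4.4. Assume that $O_l$ is between $O_0$ and $O_k$ and $\langle k, l\rangle \le \langle k, k - l\rangle$. Then $\|l\| \le \|k\|/2$.

Proof. By Lemma 4.3, (4.1) holds. Since $\langle k, l\rangle \le \langle k, k - l\rangle$, we get $\langle k, l\rangle \le \|k\|^2 - \langle k, l\rangle$, so

$$2\langle k, l\rangle \le \|k\|^2. \tag{4.3}$$

By (4.1) and (4.3) we get

$$\|k\|^2 > 4\|l\|^2 - 2. \tag{4.4}$$

If $\|l\| > \|k\|/2$, then $\|k\|^2 < 4\|l\|^2$. Together with (4.4), since $\|k\|^2$ and $\|l\|^2$ are integers, we get

$$\|k\|^2 = 4\|l\|^2 - 1. \tag{4.5}$$

From (4.3) and (4.5) we get $\langle k, l\rangle \le 2\|l\|^2 - 1/2$. Hence, since $\langle k, l\rangle$ is also an integer, we get $\langle k, l\rangle \le 2\|l\|^2 - 1$. From this, (4.1) and (4.5) we get

$$(2\|l\|^2 - 1)^2 > (4\|l\|^2 - 1)\Bigl( \|l\|^2 - \frac{1}{2} \Bigr) = \Bigl( 2\|l\|^2 - \frac{1}{2} \Bigr)(2\|l\|^2 - 1),$$

a contradiction.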

Let us think about standing at the origin and looking at the sky, where vertices of $G$ are stars. Are there big parts of the sky without a single star? Observe that as the dimension $m$ of the space grows, the angles between the integer vectors tend to become larger. For instance, the angle between the vectors $(1, 0, \dots, 0)$ and $(1, \dots, 1)$ is of order $\pi/2 - 1/\sqrt{m}$. Thus, any given acute angle can be considered relatively small if $m$ is sufficiently large.

Let $\alpha$ be a positive angle. We will say that a set $A \subset \mathbb{Z}^m \setminus \{0\}$ is $\alpha$-dense in the sky if for every $v \in \mathbb{R}^m \setminus \{0\}$ there is $u \in A$ such that the angle between the vectors $v$ and $u$ is at most $\alpha$. Set

$$\eta(r) = \sum_{n=0}^{\infty} \arcsin \frac{r}{2^{n-1}}.$$


Proposition 4.5. The set of vertices of $G$ is $\eta(r)$-dense in the sky.

Proof. Fix a vector $v \in \mathbb{R}^m \setminus \{0\}$ and $\varepsilon > 0$. There exists $k_0 \in \mathbb{Z}^m \setminus \{0\}$ such that the angle between $v$ and $k_0$ is less than $\varepsilon$. Then we define by induction a finite sequence $(k_1, k_2, \dots, k_n)$ of elements of $\mathbb{Z}^m \setminus \{0\}$ such that $O_{k_{i+1}}$ is between $O_0$ and $O_{k_i}$ and $\|k_{i+1}\| \le \|k_i\|/2$. This is possible by Lemmas 2.7 and 4.4. Since $\|k_i\| \ge 1$ for each $i$, this procedure has to terminate at some $k_n$. Then there is no obstacle between $O_0$ and $O_{k_n}$, so $k_n$ is a vertex of $G$. By Lemma 4.3, the angle between $k_i$ and $k_{i+1}$ is at most $\arcsin(2r/\|k_{i+1}\|)$. By our construction, we have $\|k_n\| \ge 1 = 2^0$, $\|k_{n-1}\| \ge 2^1$, $\|k_{n-2}\| \ge 2^2$, etc. Therefore the angle between $k_n$ and $k_0$ is smaller than $\eta(r)$. Hence, the angle between $v$ and $k_n$ is smaller than $\eta(r) + \varepsilon$. Since $\varepsilon$ was arbitrary, this angle is at most $\eta(r)$.

Remark 4.6. Proposition 4.5 was proved under the assumption that $O_0$ is the ball centered at 0 of radius $r < \sqrt{2}/4$. However, making an obstacle smaller results in preservation or even enlargement of $G$. Moreover, we have a freedom in the lifting where to put the origin. Therefore, Proposition 4.5 remains true under a weaker assumption, that $O$ is contained in a closed ball of radius $r < \sqrt{2}/4$.

Let us investigate the properties of $\eta(r)$.

Lemma 4.7. The function $\eta$ is continuous and increasing on $(0, \sqrt{2}/4]$. Moreover, $\eta(r) < \sqrt{2}\,\pi r$. In particular, $\eta(\sqrt{2}/4) < \pi/2$ and

$$\lim_{r \to 0} \eta(r) = 0.$$

Proof. Assume that $0 < r \le \sqrt{2}/4$. Then all numbers whose arcus sine we are taking are from the interval $(0, \sqrt{2}/2]$, so clearly $\eta$ is continuous and increasing. We have also the estimate

$$\frac{x}{\arcsin x} \ge \frac{\sqrt{2}/2}{\pi/4} = \frac{2\sqrt{2}}{\pi}.$$

Moreover, the equality holds only if $x = \sqrt{2}/2$. Thus

$$\eta(r) < \sum_{n=0}^{\infty} \frac{\pi r}{2^n \sqrt{2}} = \sqrt{2}\,\pi r.$$

Therefore $\lim_{r \to 0} \eta(r) = 0$ and

$$\eta(\sqrt{2}/4) < \sqrt{2}\,\pi\,\frac{\sqrt{2}}{4} = \frac{\pi}{2}.$$
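The series defining $\eta(r)$ converges geometrically, so its value and the bound of Lemma 4.7 are easy to check numerically. The following sketch is only an illustration: the function names and the truncation after 60 terms are assumptions, and the second function simply evaluates the expression $(1 - \sqrt{2}/2)\cos\eta(r)$ built from $\eta(r)$.

```python
import math


def eta(r: float, terms: int = 60) -> float:
    """Truncated series eta(r) = sum_{n >= 0} arcsin(r / 2**(n - 1)).

    The n = 0 term is arcsin(2r), so r must not exceed sqrt(2)/4 here
    (this keeps every argument of arcsin inside (0, sqrt(2)/2]).
    """
    if not 0 < r <= math.sqrt(2) / 4:
        raise ValueError("r must lie in (0, sqrt(2)/4]")
    return sum(math.asin(r / 2 ** (n - 1)) for n in range(terms))


def explicit_ball_radius(r: float) -> float:
    """The quantity (1 - sqrt(2)/2) * cos(eta(r))."""
    return (1 - math.sqrt(2) / 2) * math.cos(eta(r))


if __name__ == "__main__":
    for r in (0.01, 0.1, 0.3, math.sqrt(2) / 4):
        # Compare eta(r) with the upper bound sqrt(2) * pi * r of Lemma 4.7.
        print(r, eta(r), math.sqrt(2) * math.pi * r, explicit_ball_radius(r))
```

Because the omitted tail is of order $r \cdot 2^{-59}$, the truncation does not affect the comparison with $\sqrt{2}\,\pi r$ at double precision.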

Now we assume only that $O$ is small. In the next lemma we obtain two estimates of the length of $tk \in AR$ if $k$ is a vertex of $G$. One of those estimates will be useful for all vertices of $G$, the other one for those with large norm. The main idea of the proof is similar to that of the proof of Lemma 4.1.

Lemma 4.8. If $k$ is a vertex of $G$ then the vectors $(1 - \sqrt{2}/2)(k/\|k\|)$ and $\bigl((\|k\| - 1)/(\|k\| + 1)\bigr)(k/\|k\|)$ belong to $AR$.


Proof. Let $k = (x_1, x_2, \dots, x_m)$ be a vertex of $G$. Let $s$ be the number of non-zero components of $k$. Then we may assume that $x_i \neq 0$ if $i \le s$ and $x_i = 0$ if $i > s$. If $s = 1$ then the statement of the lemma follows from Lemma 4.1.

Assume now that $s > 1$. Then for every $i \le s$ there is a vector $v_i \in U$ with only $i$-th component non-zero and $\langle v_i, k\rangle < 0$. By Lemma 2.8 there are edges in $G$ from $k$ to $v_i$ and from $v_i$ to $k$, so the periodic path $k v_i k v_i \dots$ in $G$ gives us a periodic orbit $P_i$ of the billiard. The displacement along $P_i$ is $k + v_i$ and the period of $P_i$ is smaller than $\|k\| + \|v_i\| = \|k\| + 1$, so the rotation vector of $P_i$ is $t_i(k + v_i)$ with $t_i > 1/(\|k\| + 1)$. Therefore the vector $(k + v_i)/(\|k\| + 1)$ belongs to $AR$.

Since the vectors $v_i$ form an orthonormal basis of $\mathbb{R}^s$, we have

$$k = \sum_{i=1}^{s} \langle v_i, k\rangle v_i.$$

Set

$$a = \sum_{i=1}^{s} \langle v_i, k\rangle \quad\text{and}\quad a_i = \frac{\langle v_i, k\rangle}{a}.$$

Then the vector

$$u = \sum_{i=1}^{s} a_i\, \frac{k + v_i}{\|k\| + 1}$$

is a convex combination of elements of $AR$, so $u \in AR$. We have

$$u = \frac{1}{\|k\| + 1} \sum_{i=1}^{s} a_i k + \frac{1}{a(\|k\| + 1)} \sum_{i=1}^{s} \langle v_i, k\rangle v_i = \frac{k}{\|k\| + 1} \Bigl( 1 + \frac{1}{a} \Bigr).$$

For each $i$ we have $\langle v_i, k\rangle \le -1$, so $a \le -s$, and therefore $1 + 1/a \ge (s - 1)/s$. Moreover,

$$\frac{\|k\|}{\|k\| + 1} \ge \frac{\sqrt{s}}{\sqrt{s} + 1}.$$

Since $s \ge 2$, we get

$$\frac{s - 1}{s} \cdot \frac{\sqrt{s}}{\sqrt{s} + 1} = \frac{\sqrt{s} - 1}{\sqrt{s}} \ge \frac{\sqrt{2} - 1}{\sqrt{2}} = 1 - \frac{\sqrt{2}}{2},$$

so the vector $u$ has the direction of $k$ and length at least $1 - \sqrt{2}/2$.

To get the other estimate of the length of $u$, note that $\langle v_i, k\rangle = -|x_i|$, so

$$a = -\sum_{i=1}^{s} |x_i| \le -\|k\|,$$

and hence

$$\|u\| \ge \frac{\|k\|}{\|k\| + 1} \cdot \Bigl( 1 - \frac{1}{\|k\|} \Bigr) = \frac{\|k\| - 1}{\|k\| + 1}.$$
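The convex combination $u$ from the proof of Lemma 4.8 is explicit enough to compute for a concrete integer vector. The sketch below is an illustration only: it performs the arithmetic of the proof and does not check whether $k$ is actually a vertex of $G$, and the function name is hypothetical.

```python
import math


def lemma_4_8_vector(k):
    """Build u = sum_i a_i (k + v_i) / (||k|| + 1) as in the proof of Lemma 4.8.

    k is a tuple of integers with at least two non-zero components.
    For each non-zero component x_i, v_i is the unit coordinate vector with
    <v_i, k> = -|x_i|, and a_i = <v_i, k> / a with a = sum_i <v_i, k>.
    """
    nonzero = [i for i, x in enumerate(k) if x != 0]
    if len(nonzero) < 2:
        raise ValueError("this sketch covers only the case s > 1")
    norm_k = math.sqrt(sum(x * x for x in k))
    a = -sum(abs(k[i]) for i in nonzero)
    u = [0.0] * len(k)
    for i in nonzero:
        a_i = -abs(k[i]) / a
        v_i = [0] * len(k)
        v_i[i] = -1 if k[i] > 0 else 1
        for j in range(len(k)):
            u[j] += a_i * (k[j] + v_i[j]) / (norm_k + 1)
    return u


if __name__ == "__main__":
    k = (3, -1, 0, 2)
    u = lemma_4_8_vector(k)
    norm_k = math.sqrt(sum(x * x for x in k))
    norm_u = math.sqrt(sum(t * t for t in u))
    # Length of u next to the two lower bounds from the lemma.
    print(norm_u, 1 - math.sqrt(2) / 2, (norm_k - 1) / (norm_k + 1))
```

For $k = (1, 1)$ the first bound is attained with equality, which is why a small tolerance is advisable when comparing floating-point values.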


Lemma 4.9. Let $A \subset \mathbb{R}^m$ be a finite set, $\alpha$-dense in the sky for some $\alpha < \pi/2$. Assume that every vector of $A$ has norm $c$. Then the convex hull of $A$ contains a ball of radius $c\cos\alpha$, centered at 0.

Proof. Let $K$ be the convex hull of $A$. Then $K$ is a convex polytope with vertices from $A$. Since $A$ is $\alpha$-dense in the sky and $\alpha < \pi/2$, $K$ is non-degenerate and 0 belongs to its interior. Let $s$ be the radius of the largest ball centered at 0 and contained in $K$. This ball is tangent to some face of $K$ at a point $v$. Then the whole $K$ is contained in the half-space $\{u : \langle u, v\rangle \le \|v\|^2\}$. In particular, for every $u \in A$ we have $\langle u, v\rangle \le \|v\|^2$. Since $A$ is $\alpha$-dense in the sky, there is $u \in A$ such that the angle between $v$ and $u$ is at most $\alpha$. Therefore $\|v\|^2 \ge \langle u, v\rangle \ge \|u\|\|v\|\cos\alpha = c\|v\|\cos\alpha$, so $s = \|v\| \ge c\cos\alpha$.
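In the plane, the bound of Lemma 4.9 is attained by a regular polygon: $n$ equally spaced points on the circle of radius $c$ are $(\pi/n)$-dense in the sky, and the largest centered disk inside their convex hull has radius exactly $c\cos(\pi/n)$. The following sketch (a hypothetical helper, restricted to this planar special case) computes that radius directly from the edges.

```python
import math


def centered_inradius_regular_polygon(n: int, c: float) -> float:
    """Radius of the largest disk centered at 0 inside the convex hull of
    n equally spaced points on the circle of radius c (n >= 3).

    The hull is the regular n-gon, so the radius is the smallest distance
    from the origin to a line through two consecutive vertices.
    """
    if n < 3:
        raise ValueError("need at least three points")
    pts = [
        (c * math.cos(2 * math.pi * j / n), c * math.sin(2 * math.pi * j / n))
        for j in range(n)
    ]
    best = math.inf
    for j in range(n):
        (x1, y1), (x2, y2) = pts[j], pts[(j + 1) % n]
        # Distance from the origin to the line through (x1, y1) and (x2, y2).
        dist = abs(x1 * y2 - x2 * y1) / math.hypot(x2 - x1, y2 - y1)
        best = min(best, dist)
    return best


if __name__ == "__main__":
    n, c = 7, 2.0
    print(centered_inradius_regular_polygon(n, c), c * math.cos(math.pi / n))
```

The two printed numbers agree up to rounding, so the constant $c\cos\alpha$ in the lemma cannot be improved in general.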

Now we can get the first, explicit, estimate of the radius of the largest ball contained in $AR$.

Theorem 4.10. For a billiard on a torus with a small obstacle, assume that $O$ is contained in a closed ball of radius $r < \sqrt{2}/4$. Then the admissible rotation set contains the closed ball of radius $(1 - \sqrt{2}/2)\cos\eta(r)$ centered at 0.

Proof. Set $H = \{(1 - \sqrt{2}/2)(k/\|k\|) : k \text{ is a vertex of } G\}$ and let $K$ be the convex hull of $H$. By Lemma 4.8 and by the convexity of $AR$, we have $K \subset AR$. By Proposition 4.5, $H$ is $\eta(r)$-dense in the sky. Thus, by Lemma 4.9, $K$ contains the closed ball of radius $(1 - \sqrt{2}/2)\cos\eta(r)$ centered at 0.

The second estimate is better for small $r$ (uniformly in $m$), but does not give an explicit formula for the radius of the ball contained in $AR$. We first need two lemmas.

Lemma 4.11. For every integer $N > 1$ there exists an angle $\beta(N) > 0$ (independent of $m$), such that if $u, v \in \mathbb{Z}^m \setminus \{0\}$ are vectors of norm less than $N$ and the angle $\vartheta$ between $u$ and $v$ is positive, then $\vartheta \ge \beta(N)$.

Proof. Under our assumptions, each of the vectors $u, v$ has less than $N^2$ non-zero components. Therefore the angle between $u$ and $v$ is the same as the angle between some vectors $u', v' \in \mathbb{Z}^{2N^2 - 2}$ of the same norms as $u, v$. However, there are only finitely many vectors in $\mathbb{Z}^{2N^2 - 2} \setminus \{0\}$ of norm less than $N$, so the lemma holds with $\beta(N)$ equal to the smallest positive angle between such vectors.

Lemma 4.12. Let $A \subset \mathbb{Z}^m \setminus \{0\}$ be a finite set, $\alpha$-dense in the sky for some $\alpha < \beta(N)/2$, where $\beta(N)$ is as in Lemma 4.11. Then the set of those elements of $A$ which have norm at least $N$ is $2\alpha$-dense in the sky.

Proof. Let $B$ be the set of vectors of $\mathbb{R}^m \setminus \{0\}$ whose angular distance from some vector of $A$ of norm at least $N$ is $2\alpha$ or less. Let $v \in \mathbb{R}^m$ be a non-zero vector. Assume that the angles between $v$ and all vectors of $A$ are non-zero. By the assumptions, there exists $u \in A$ such that the angle $\vartheta$ between $v$ and $u$ is at most $\alpha$. If $\|u\| \ge N$ then $v \in B$. Suppose that $\|u\| < N$. Choose $\varepsilon > 0$ such that $\varepsilon < \beta(N) - 2\alpha$ and $\varepsilon < \vartheta$. We draw a great circle in the sky through $u$ and $v$ and go along it from $u$ through $v$ and beyond it to some $v'$ so that the angle between $u$ and $v'$ is $\alpha + \varepsilon$. Now, there exists $u' \in A$ such that the angle between $v'$ and $u'$ is at most $\alpha$. Then the angle between $u$ and $u'$ is at least $\varepsilon$ and at most $2\alpha + \varepsilon < \beta(N)$, so $\|u'\| \ge N$. The angle between $v$ and $u'$ is at most $2\alpha + \varepsilon - \vartheta < 2\alpha$, so again, $v \in B$. In such a way we have shown that $B$ is dense in $\mathbb{R}^m \setminus \{0\}$. It is also clearly closed in $\mathbb{R}^m \setminus \{0\}$, so it is equal to $\mathbb{R}^m \setminus \{0\}$.

Let $\beta(N)$ be as in Lemma 4.11, and let $N(r)$ be the maximal $N$ such that $\eta(r) < \beta(N)/2$. This definition is correct for sufficiently small $r$, since clearly $\beta(N) \to 0$ as $N \to \infty$, and by Lemma 4.7, $\eta(r) \to 0$ as $r \to 0$. It follows that $N(r) \to \infty$ as $r \to 0$.

Theorem 4.13. For a billiard on a torus with a small obstacle, assume that $O$ is contained in a closed ball of radius $r < \sqrt{2}/4$ for $r$ so small that $N(r)$ is defined. Then the admissible rotation set contains the closed ball of radius $\bigl((N(r) - 1)/(N(r) + 1)\bigr)\cos(2\eta(r))$ centered at 0.

Proof. Since $\eta(r) < \beta(N(r))/2$, by Proposition 4.5, Remark 4.6 and Lemma 4.12, the set of those vertices of $G$ that have norm at least $N(r)$ is $2\eta(r)$-dense in the sky. By Lemma 4.8, if $k$ is such a vertex, $\bigl((\|k\| - 1)/(\|k\| + 1)\bigr)(k/\|k\|) \in AR$. Since $0 \in AR$ and $AR$ is convex, also $\bigl((N(r) - 1)/(N(r) + 1)\bigr)(k/\|k\|) \in AR$. Then by Lemma 4.9, the closed ball of radius $\bigl((N(r) - 1)/(N(r) + 1)\bigr)\cos(2\eta(r))$ centered at 0 is contained in $AR$.

Corollary 4.14. The radius of the largest ball centered at 0 contained in the admissible rotation set goes to 1 uniformly in $m$ as the diameter of the obstacle goes to 0.

We conclude this section with a result showing that even though $AR$ may be large, it is still smaller than $R$.

Theorem 4.15. For a billiard on a torus with a small obstacle, the admissible rotation set is contained in the open unit ball. In particular, $AR \neq R$.

Proof. Since the graph $G$ is finite, there exist positive constants $c_1 < c_2$ such that for every trajectory piece of admissible type the distance between two consecutive reflections is contained in $[c_1, c_2]$.
Moreover, from the definition of an edge in $G$ and from the compactness of the obstacle it follows that there is an angle $\alpha > 0$ such that the direction of a trajectory piece of admissible type changes by at least $\alpha$ at each reflection. Consider the triangle with two sides of length $c_1$ and $c_2$ and the angle $\pi - \alpha$ between them. Let $a$ be the ratio between the length of the third side and $c_1 + c_2$, that is,

$$a = \frac{\sqrt{c_1^2 + c_2^2 + 2c_1 c_2 \cos\alpha}}{c_1 + c_2}.$$

This ratio is less than 1 and it decreases when $\alpha$ or $c_1/c_2$ grows. Therefore, if consecutive reflections for a trajectory piece of admissible type are at times $t_1, t_2, t_3$, then the displacement between the first and the third reflections divided by $t_3 - t_1$ is at most $a$. Thus, every vector from $AR$ has length at most $a$. Clearly, the vector $(1, 0, \dots, 0)$ belongs to $R$, and thus $AR \neq R$.
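The ratio $a$ from the proof of Theorem 4.15 comes from the law of cosines, and its monotonicity can be checked on sample values. This is a minimal sketch with a hypothetical function name; the sample numbers are arbitrary and not taken from any particular billiard.

```python
import math


def speed_bound(c1: float, c2: float, alpha: float) -> float:
    """Ratio of the third side to c1 + c2 in a triangle with sides c1, c2
    and the angle pi - alpha between them (law of cosines)."""
    third_side = math.sqrt(c1 * c1 + c2 * c2 + 2 * c1 * c2 * math.cos(alpha))
    return third_side / (c1 + c2)


if __name__ == "__main__":
    # The ratio equals 1 for alpha = 0 and drops as alpha or c1/c2 grows.
    print(speed_bound(1.0, 2.0, 0.0))
    print(speed_bound(1.0, 2.0, 0.5))
    print(speed_bound(1.0, 2.0, 1.0))
    print(speed_bound(0.5, 2.0, 0.5))
```

Writing $a^2 = 1 - 2c_1 c_2 (1 - \cos\alpha)/(c_1 + c_2)^2$ makes both monotonicity claims visible: the subtracted term grows with $\alpha$ and, for $c_1 \le c_2$, with $c_1/c_2$.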


5. Billiard in the Square Now we consider a billiard in the square S = [−1/2, 1/2]2 with one convex obstacle O with a smooth boundary. The lifting to R2 , considered in Sect. 2 is replaced in this case by the unfolding to R2 . That is, we cover R2 by the copies of S obtained by consecutive symmetries with respect to the lines x = n + 1/2 and y = n + 1/2, n ∈ Z. Thus, the square Sk = S + k (k ∈ Z2 ) with the obstacle O k in it is the square S with O, translated by k, with perhaps an additional symmetry applied. If k = ( p, q), then, if both p, q are even, there is no additional symmetry; if p is even and q odd, we apply symmetry with respect to the line y = q; if p is odd and q is even, we apply symmetry with respect to the line x = p; and if both p, q are odd, we apply central symmetry with respect to the point ( p, q). In this model, trajectories in S with obstacle O unfold to trajectories in R2 with obstacles O k , k ∈ Z2 . The situation in R2 is now the same as in the case of the torus billiard, except that, as we mentioned above, the obstacles are not necessarily the translations of O0 , and, of course, the observable whose averages we take to get the rotation set is completely different. Let us trace which definitions, results and proofs of Sects. 2, 3 and 4 remain the same, and which need modifications. The definitions of between and type remain the same. Lemma 2.1 is still valid, but in its proof we have to look at the trajectory on the torus R2 /(2Z)2 rather than R2 /Z2 . Then the definition of an admissible sequence remains the same. The first part of Theorem 2.2 and its proof remains the same, but in the proof of the part about periodic trajectories we have to be careful. The point x 0 + p from the last paragraph of the proof has to be replaced by a point that after folding (the operation reverse to the unfolding) becomes x 0 . This gives us a periodic orbit in the unfolding that projects (folds) to a periodic orbit in the square. 
Moreover, there may be a slight difference between the square case and the torus case if we want to determine the least discrete period of this orbit (where in the square case, in analogy to the torus case, we count only reflections from the obstacle). In the torus case it is clearly the same as the least period of the type. In the square case this is not necessarily so. For instance, if the obstacle is a disk centered at the origin, the orbit that goes vertically from the highest point of the disk, reflects from the upper side of the square and returns to the highest point of the disk, has discrete period 1 in the above sense. However, its type is periodic of period 2 and in the unfolding it has period 2. Fortunately, such things are irrelevant for the rest of our results.

Lemma 2.3, Corollary 2.4 and their proofs remain the same as in the torus case. The same can be said about the part of Lemma 2.5 that refers to the lengths of trajectory pieces.

The definition of the graph $G$ has to be modified. This is due to the fact that the conditions (3) and (4) cannot be restated in the same way as in the torus case, because now not only translations, but also symmetries are involved (the obstacle need not be symmetric, and the unfolding process involves symmetries about vertical and horizontal lines). In order to eliminate symmetries, we enlarge the number of vertices of $G$ four times. Instead of $O_0$, we look at $\{O_0, O_{(1,0)}, O_{(0,1)}, O_{(1,1)}\}$. For every $k = (p, q) \in \mathbb{Z}^2$ there is $\zeta(k) \in Q$, where $Q = \{(0, 0), (1, 0), (0, 1), (1, 1)\}$, such that $k - \zeta(k)$ has both components even. Then condition (3) can be restated as no obstacle between $O_{\zeta(k_{n-1})}$ and $O_{l_n + \zeta(k_{n-1})}$, and condition (4) as the obstacle $O_{l_n + \zeta(k_{n-1})}$ not between $O_{\zeta(k_{n-1})}$ and $O_{l_n + l_{n+1} + \zeta(k_{n-1})}$. Therefore we can take as the vertices of $G$ the pairs $(i, j)$, where $i \in Q$, $j \in \mathbb{Z}^2$, $i \ne j$, and there is no obstacle between $O_i$ and $O_j$.
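The reduction map $\zeta$ is simply componentwise reduction modulo 2. A minimal sketch, in which the name `zeta` mirrors the paper's symbol and everything else is illustrative:

```python
# The four representatives used to index obstacles up to even translations.
Q = {(0, 0), (1, 0), (0, 1), (1, 1)}


def zeta(k):
    """Return the element of Q with k - zeta(k) having both components even."""
    p, q = k
    # Python's % always returns a result in {0, 1} here, also for negative p, q.
    return (p % 2, q % 2)
```

For example, `zeta((3, -4))` is `(1, 0)`, and $(3, -4) - (1, 0) = (2, -4)$ has both components even.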
There is an edge in $G$ from $(i, j)$ to $(i', j')$ if and only if $O_j$ is not between $O_i$ and $O_{j + j' - i'}$ and $\zeta(j) = i'$. Then,
similarly as in the torus case, there is a one-to-one correspondence between admissible sequences and one-sided infinite paths in $G$, starting at vertices $(0, j)$. This restriction on the starting point of a path in $G$ creates some complication, but for every $i \in Q$ there is a similar correspondence between admissible sequences and one-sided infinite paths in $G$, starting at vertices $(i, j)$. Therefore, if we want to glue finite paths, we may choose an appropriate $i$. Lemma 2.6 and its proof remain unchanged.

The definition of a small obstacle has to be modified slightly. This is due to the fact that, while a torus is homogeneous, so all positions of an obstacle are equivalent, this is not the case for a square. An obstacle placed close to a side of the square will produce a pattern of obstacles in the unfolding which is difficult to control. Therefore we will say that the obstacle $O$ is small if it is contained in a closed ball of radius smaller than $\sqrt{2}/4$, centered at 0. With this definition, Lemmas 2.8 and 2.10 and their proofs remain unchanged.

We have arrived at a point where the situation is completely different from the torus case, namely, we have to define the displacement function. Once we do it, the definitions of the rotation set, admissible rotation set and rotation vector remain the same, except that we will call a rotation vector (since it belongs to $\mathbb{R}$) a rotation number. Since we have to count how many times the trajectory rotates around the obstacle, the simplest way is to choose a point $z$ in the interior of $O$ and set $\varphi(x) = \arg(x - z)/(2\pi)$, where $\arg$ is the complex argument (here we identify $\mathbb{R}^2$ with $\mathbb{C}$). It is not important that the argument is multivalued, since we are interested only in its increment along curves. For any closed curve $\Gamma$ avoiding the interior of $O$, the increment of $\varphi$ is equal to the winding number of $\Gamma$ with respect to $z$. Since the whole interior of $O$ lies in the same component of $\mathbb{C} \setminus \Gamma$, this number does not depend on the choice of $z$.
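For a piecewise linear curve the increment of $\varphi$ can be computed exactly, segment by segment, because a straight segment that avoids $z$ subtends an angle strictly between $-\pi$ and $\pi$. The following is a small sketch under that assumption; the function name `phi_increment` and the representation of a curve as a list of vertices are illustrative choices.

```python
import math


def phi_increment(points, z=(0.0, 0.0)):
    """Increment of phi = arg(x - z) / (2*pi) along the polyline `points`.

    Assumes no segment of the polyline passes through z, so each segment
    changes the argument by an angle in (-pi, pi).
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        ax, ay = x0 - z[0], y0 - z[1]
        bx, by = x1 - z[0], y1 - z[1]
        # Signed angle from (ax, ay) to (bx, by): atan2(cross, dot).
        total += math.atan2(ax * by - ay * bx, ax * bx + ay * by)
    return total / (2 * math.pi)
```

For a closed polyline the returned value is the winding number around `z`: a counterclockwise loop around `z` gives 1, and a loop that does not enclose `z` gives 0, up to rounding.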
If $\Gamma$ is not closed, we can extend it to a closed one, while changing the increment of $\varphi$ by less than 1. Therefore changing $z$ will amount to the change of the increment of $\varphi$ by less than 2. When computing the rotation numbers, we divide the increment of $\varphi$ by the length of the trajectory piece, and this length goes to infinity. Therefore in the limit a different choice of $z$ will give the same result. This proves that the rotation set we get is independent of the choice of $z$.

The proofs of the results of Sect. 3 rely on the second part of Lemma 2.5, which we have not discussed yet. The possibility of the trajectory pieces crossing the obstacles, mentioned in that lemma, was necessary only for the proof of its first part, so we do not have to worry about it now. However, we have to make an additional assumption that $B$ is admissible. This creates no problem either, because this is how we use it later. Since we changed here many details, it makes sense to state exactly what we will be proving.

Lemma 5.1. For every finite admissible sequence $B = (k_n)_{n=0}^{s}$ of elements of $\mathbb{Z}^2$ the displacements $\varphi$ along trajectory pieces of type $B$ differ by less than 2.

Proof. Note that the "folding" map $\pi\colon \mathbb{R}^2 \setminus \bigcup_{k \in \mathbb{Z}^2} O_k \to S \setminus O$ is continuous. Therefore if curves $\Gamma$ and $\gamma$ in $\mathbb{R}^2 \setminus \bigcup_{k \in \mathbb{Z}^2} O_k$ with the common beginning and common end are homotopic then $\pi(\Gamma)$ and $\pi(\gamma)$ are homotopic, so the increments of $\varphi$ along them are the same. If $\Gamma$ and $\Gamma'$ are two trajectory pieces as in the statement of the lemma, then $\Gamma'$ can be extended to $\gamma$ with the same beginning and end as $\Gamma$, with the change of the increment of $\varphi$ along its projection by $\pi$ less than 2 (1 at the beginning, 1 at the end). Thus, it suffices to show that this can be done in such a way that $\Gamma$ and $\gamma$ are homotopic. Therefore we have to analyze what may be the reasons for $\Gamma$ and $\gamma$ not to be homotopic. Extending $\Gamma'$ to $\gamma$ can be done in the right way, so this leaves two possibly bad things that we have to exclude. The first is that when going from $O_{k_i}$ to $O_{k_{i+2}}$ via $O_{k_{i+1}}$,
we pass with $\Gamma$ on one side of $O_{k_{i+1}}$ and with $\gamma$ on the other side of $O_{k_{i+1}}$. The second one is that when going from $O_{k_i}$ to $O_{k_{i+1}}$, we pass with $\Gamma$ on one side of some $O_j$, and with $\gamma$ on the other side of $O_j$. However, the first possibility contradicts condition (4) from the definition of admissibility and the second one contradicts condition (3). This completes the proof.

With Lemma 5.1 replacing Lemma 2.5, the rest of the results of Sect. 3 (except the last two theorems and the corollary) and their proofs remain the same (with obvious minor modifications, for instance due to the fact that constants in Lemmas 2.5 and 5.1 are different). Let us state the main theorems we get in this way.

Theorem 5.2. The admissible rotation set of a billiard in a square with a small obstacle is convex, and consequently, it is a closed interval symmetric with respect to 0.

Theorem 5.3. For a billiard in a square with a small obstacle, rotation numbers of periodic orbits of admissible type are dense in the admissible rotation set.

Theorem 5.4. For a billiard in a square with a small obstacle, if $u$ is a number from the interior of $AR$, then there exists a compact, forward invariant subset $Y$ of the phase space, such that every trajectory from $Y$ has admissible type and rotation number $u$.

Corollary 5.5. For a billiard in a square with a small obstacle, if $u$ is a number from the interior of $AR$, then there exists a trajectory of admissible type with rotation number $u$.

Corollary 5.6. For a billiard in a square with a small obstacle, if $u$ is a number from the interior of $AR$, then there exists an ergodic invariant probability measure in the phase space, for which the integral of the displacement is equal to $u$ and almost every trajectory is of admissible type.

Since the billiard in the square is defined only in dimension 2, most of the results of Sect. 4 do not have counterparts here. However, we can investigate what happens to $AR$ as the size of the obstacle decreases.
Moreover, here the position of the obstacle matters, so the size of the obstacle should be measured by the radius of the smallest ball centered at the origin that contains it. The following theorem should be considered in the context of Theorem 6.9, which states that the full rotation set, $R$, is equal to $[-\sqrt{2}/4, \sqrt{2}/4]$.

Theorem 5.7. For every $\varepsilon > 0$ there exists $\delta > 0$ such that the set $AR$ contains the interval $[-\sqrt{2}/4 + \varepsilon, \sqrt{2}/4 - \varepsilon]$ whenever the obstacle is contained in the disk centered at the origin with diameter less than $\delta$.

Proof. Let us estimate the rotation number of the curve $V$ that in the unfolding is a straight line segment from the origin to the point $(2n + 1, 2n)$. We are counting the displacement as the rotation around the center of the square (as always, as the multiples of $2\pi$). In particular, the displacement for $V$ makes sense, since at its initial and terminal pieces the argument is constant. Compare $V$ to the curve $V'$ that in the unfolding is a straight line segment from $(0, -1/2)$ to $(2n + 1, 2n + 1/2)$. In the square, $V'$ goes from the lower side to the right one, to the upper one, to the left one, to the lower one, etc., and it reflects from each side at its midpoint. Moreover, the distances of the endpoints of $V$ from the corresponding endpoints of $V'$ are 1/2. Therefore, when we deform linearly $V$ to $V'$, we do not cross any point of $\mathbb{Z}^2$. This means that the difference of the displacements along those trajectories in the square is less than 2 (this difference may
occur because they end at different points). The displacement along $V'$ is $n + 1/2$, so the displacement along $V$ is between $n - 3/2$ and $n + 5/2$.
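As a numerical sanity check of this estimate, one can divide the displacement bounds $n - 3/2$ and $n + 5/2$ by the length of $V$, computed directly from its endpoints $(0, 0)$ and $(2n + 1, 2n)$ in the unfolding, and watch both quotients approach $\sqrt{2}/4$. This is only an illustrative sketch; the function name is hypothetical.

```python
import math


def rotation_number_bounds(n):
    """Lower and upper bounds for the rotation number of the curve V.

    Uses the displacement bounds n - 3/2 and n + 5/2 and the Euclidean
    length of the segment from (0, 0) to (2n + 1, 2n).
    """
    length = math.hypot(2 * n + 1, 2 * n)
    return (n - 1.5) / length, (n + 2.5) / length


if __name__ == "__main__":
    target = math.sqrt(2) / 4
    for n in (1, 10, 100, 10_000):
        lo, hi = rotation_number_bounds(n)
        print(n, lo, hi, target)
```

The gap between the two bounds is $4$ divided by the length of $V$, so it shrinks like $1/n$.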

The length of $V$ is between the length of $V'$ minus 1 and the length of $V'$, that is, between $(2n + 1)\sqrt{2} - 1$ and $(2n + 1)\sqrt{2}$. Therefore, as $n$ goes to infinity, the rotation number of $V$ goes to $n/(2n\sqrt{2}) = \sqrt{2}/4$.

If we fix $\varepsilon > 0$ then there is $n$ such that the rotation number of $V$ is larger than $\sqrt{2}/4 - \varepsilon/4$. Then we can choose $\delta > 0$ such that if the obstacle is contained in the disk centered at the origin with diameter less than $\delta$ then $(2n + 1, 2n)$ is a vertex of the graph $G$ and any trajectory piece $T_n$ that in the unfolding is a straight line segment from a point of $O_0$ to a point of $O_{(2n+1,2n)}$ has rotation number differing from the rotation number of $V$ by less than $\varepsilon/4$ (when we deform linearly $V$ to get this trajectory piece, we do not cross any point of $\mathbb{Z}^2$). Hence, this rotation number is larger than $\sqrt{2}/4 - \varepsilon/2$.

Now we construct a periodic orbit of admissible type with the rotation number differing from the rotation number of $T_n$ by less than $\varepsilon/2$. By Lemma 2.10, there is a loop $A_n$ in $G$, passing through $v_n = (2n + 1, 2n)$ and at most 2 other vertices, both from $U$. As $n$ goes to infinity, the ratios of displacements and of lengths of $T_n$ and $A_n$ clearly go to 1. Therefore the ratio of their rotation numbers also goes to 1, and if $n$ is large enough, the difference between them will be smaller than $\varepsilon/2$. This gives us $\delta$ such that if the obstacle is contained in the disk centered at the origin with diameter less than $\delta$ then the set $AR$ contains a number $v > \sqrt{2}/4 - \varepsilon$. Since $AR$ is symmetric with respect to 0, it contains also the number $-v$, and since it is connected, it contains the interval $[-\sqrt{2}/4 + \varepsilon, \sqrt{2}/4 - \varepsilon]$.

Theorem 4.15 also has its counterpart for the billiard in the square.

Theorem 5.8. For a billiard in a square with a small obstacle, the admissible rotation set is contained in the open interval $(-\sqrt{2}/4, \sqrt{2}/4)$. In particular, $AR \ne R$.
Since the proof of this theorem utilizes a construction introduced in the proof of Theorem 6.8, we postpone it until the end of Sect. 6.

6. Results on the Full Rotation Set

In this section we will prove several results on the full rotation set $R$ in both cases, not only about the admissible rotation set. Some of the proofs apply to a much more general situation than billiards, and then we will work under fairly general assumptions.

Let $X$ be a compact metric space and let $\Phi$ be a continuous semiflow on $X$. That is, $\Phi\colon [0, \infty) \times X \to X$ is a continuous map such that $\Phi(0, x) = x$ and $\Phi(s + t, x) = \Phi(t, \Phi(s, x))$ for every $x \in X$, $s, t \in [0, \infty)$. We will often write $\Phi^t(x)$ instead of $\Phi(t, x)$. Let $\xi$ be a time-Lipschitz continuous observable cocycle for $(X, \Phi)$ with values in $\mathbb{R}^m$, that is, a continuous function $\xi\colon [0, \infty) \times X \to \mathbb{R}^m$ such that $\xi(s + t, x) = \xi(s, \Phi^t(x)) + \xi(t, x)$ and $\|\xi(t, x)\| \le Lt$ for some constant $L$ independent of $t$ and $x$. The rotation set $R$ of $(X, \Phi, \xi)$ is the set of all limits

\[ \lim_{n \to \infty} \frac{\xi(t_n, x_n)}{t_n}, \quad \text{where } \lim_{n \to \infty} t_n = \infty. \]

By the definition, $R$ is closed, and is contained in the closed ball in $\mathbb{R}^m$, centered at the origin, of radius $L$. In particular, $R$ is compact.

Theorem 6.1. The rotation set $R$ of a continuous semiflow $\Phi$ on a connected space $X$ with a time-Lipschitz continuous observable cocycle $\xi$ is connected.

Proof. Set

\[ \psi(t, x) = \frac{\xi(t, x)}{t} \]

for $t > 0$ and $x \in X$. Then the function $\psi$ is continuous on the space $(0, \infty) \times X$. For $n \ge 1$, set $K_n = \psi([n, \infty) \times X)$. With this notation, we have

\[ R = \bigcap_{n=1}^{\infty} \overline{K_n}. \]

The set $[n, \infty) \times X$ is connected, so $K_n$ is connected, so $\overline{K_n}$ is connected. Moreover, $K_n$ is contained in the closed ball in $\mathbb{R}^m$, centered at the origin, of radius $L$. Therefore $\overline{K_n}$ is compact. Thus, $(\overline{K_n})_{n=1}^{\infty}$ is a descending sequence of compact connected sets, and so its intersection $R$ is also connected.

In the case of a billiard on a torus that we are considering, the phase space $X$ is the product of the torus minus the interior of the obstacles with the unit sphere in $\mathbb{R}^m$ (velocities). At the boundaries of the obstacles we glue together the pre-collision and post-collision velocity vectors. This space is compact, connected, and our semiflow (even a flow, since we can move backwards in time, too) is continuous. The observable cocycle is the displacement function. Clearly, it is time-Lipschitz with the constant $L = 1$ and continuous. Thus, Theorem 6.1 applies, and the rotation set $R$ is connected.

A similar situation occurs in the square. Here there is one more complication, due to the fact that there are trajectories passing through vertices. The gluing rule at a vertex $q$ of the square is that the phase points $(q, v)$ and $(q, -v)$ must be identified for all relevant velocities $v$. Then the flow is also continuous in this case, so Theorem 6.1 also applies. It means that the rotation set $R$ is a closed interval, symmetric with respect to 0.

When we work with invariant measures, we have to use a slightly different formalism. Namely, the observable cocycle $\xi$ has to be the integral of the observable function $\zeta$ along an orbit piece. That is, $\zeta\colon X \to \mathbb{R}^m$ is a bounded Borel function, integrable along the orbits, and

\[ \xi(t, x) = \int_0^t \zeta(\Phi^s(x))\, ds. \]

Assume that $\Phi$ is a continuous flow. Then, if $\mu$ is a probability measure, invariant and ergodic with respect to $\Phi$, then by the Ergodic Theorem, for $\mu$-almost every point $x \in X$ the rotation vector

\[ \lim_{t \to \infty} \frac{\xi(t, x)}{t} \]

of $x$ exists and is equal to $\int_X \zeta(x)\, d\mu(x)$.

Problems may arise if we want to use weak-* convergence of measures. If $\zeta$ is continuous and $\mu_n$ weak-* converge to $\mu$ then the integrals of $\zeta$ with respect to $\mu_n$ converge to the integral of $\zeta$ with respect to $\mu$. However, for a general $\zeta$ this is not true. Note that in the cases of billiards that we are considering, $\zeta$ is the velocity vector. It has a discontinuity at every point where a reflection occurs (formally speaking, it is even not
well defined at those points; for definiteness we may define it there in any way so that it remains bounded and Borel). However, it is well known that the convergence of integrals still holds if the set of discontinuity points of $\zeta$ has $\mu$-measure zero (as a random reference, we can give [3], Theorem 7.7.10, p. 234). Let us call an observable almost continuous if the set of its discontinuity points has measure zero for every $\Phi$-invariant probability measure. By what we said above, the following lemma holds.

Lemma 6.2. If probability measures $\mu_n$ weak-* converge to a $\Phi$-invariant probability measure $\mu$ and $\zeta$ is almost continuous then

\[ \lim_{n \to \infty} \int_X \zeta(x)\, d\mu_n(x) = \int_X \zeta(x)\, d\mu(x). \]

We have to show that this lemma is relevant for billiards.

Lemma 6.3. Let $(\Phi, X)$ be a billiard flow in the phase space. Then the velocity observable function $\zeta$ is almost continuous.

Proof. The only points of discontinuity of $\zeta$ are on the boundary of the region in which we consider the billiard. Take a small piece $Y$ of this set. Then for small $t \ge 0$ the sets $\Phi(t, Y)$ will be pairwise disjoint (if for $y_1 \in \Phi(t_1, Y)$ and $y_2 \in \Phi(t_2, Y)$ with $t_1 \ne t_2$ the velocity is the same, then the positions are different). However, by the invariance of $\mu$, their measures are the same. Since the parameter $t$ varies in an uncountable set, those measures must be 0.

The following theorem is an analogue of Theorem 2.4 from [7]. Its proof is basically the same as there (except that here we deal with a flow and a discontinuous observable), so we will omit some details. A point of a convex set $A$ is an extreme point of $A$ if it is not an interior point of any straight line segment contained in $A$.

Theorem 6.4. Let $(\Phi, X)$ be a continuous flow and let $\zeta\colon X \to \mathbb{R}^m$ be an almost continuous observable function. Let $R$ be the rotation set of $(\Phi, X, \zeta)$. Then for any extreme vector $u$ of the convex hull of $R$ there is a $\Phi$-invariant ergodic probability measure $\mu$ such that $\int_X \zeta(x)\, d\mu(x) = u$.

Proof. There is a sequence of trajectory pieces such that the average displacements on those pieces converge to $u$. We can find a subsequence of this sequence such that the measures equidistributed on those pieces weakly-* converge to some probability measure $\nu$. This measure is automatically invariant. Therefore, by Lemma 6.2, we can pass to the limit with the integrals of $\zeta$, and we get $\int_X \zeta(x)\, d\nu(x) = u$. We decompose $\nu$ into ergodic components, and since $u$ is an extreme point of the convex hull of $R$, for almost all ergodic components $\mu$ of $\nu$ we have $\int_X \zeta(x)\, d\mu(x) = u$.

By Lemma 6.3, we can apply the above theorem to our billiards.
In particular, by the Ergodic Theorem, for any extreme vector $u$ of the convex hull of $R$ there is a point with the rotation vector $u$.

Now we will look closer at billiards on the torus $\mathbb{T}^m$ with one obstacle (not necessarily small). We know that its rotation set $R$ is contained in the unit ball. It turns out that although (by Corollary 4.14) $R$ can fill up almost the whole unit ball, still it cannot reach the unit sphere $S^{m-1}$ on a big set. Let us start with the following theorem.

Theorem 6.5. For a billiard on the torus $\mathbb{T}^m$ with one obstacle, if $u$ is a rotation vector of norm 1, then there is a full trajectory in $\mathbb{R}^m$ which is a straight line of direction $u$.

Proof. Clearly, such $u$ is an extreme point of the convex hull of $R$. By Theorem 6.4, there is an ergodic measure $\mu$ such that the integral of the velocity with respect to $\mu$ is $u$. Thus, the support of $\mu$ is contained in the set of points of the phase space for which the vector component is $u$. Take $t$ which is smaller than the distance between any two obstacles in the lifting. Then $\mu$-almost all full trajectories of the billiard have direction $u$ at all times $kt$, for any integer $k$. Such a trajectory has to be a straight line with direction $u$.

The following lemma has been proved in [10] as Lemma A.2.2.

Lemma 6.6. For every dimension $m > 1$ and every number $r > 0$ there are finitely many nonzero vectors $x_1, x_2, \dots, x_k \in \mathbb{Z}^m$ such that whenever a straight line $L$ in $\mathbb{R}^m$ is at least at the distance $r$ from $\mathbb{Z}^m$, then $L$ is orthogonal to at least one of the vectors $x_i$. In other words, $L$ is parallel to the orthocomplement (lattice) hyperplane $H_i = (x_i)^{\perp}$.

As an immediate consequence of Theorem 6.5 and Lemma 6.6, we get the following result.

Theorem 6.7. For a billiard on the torus $\mathbb{T}^m$ with one obstacle, the intersection $R \cap S^{m-1}$ is contained in the union of finitely many great hyperspheres of $S^{m-1}$. The hyperplanes defining these great hyperspheres can be taken as in Lemma 6.6.

For the billiard in the square with one obstacle, we can determine the full rotation set much better than for the torus case. In the theorem below we do not need to assume that the obstacle is small or even convex. However, we assume that it is contained in the interior of the square and that its boundary is smooth.

Theorem 6.8. For a billiard with one obstacle in a square, the rotation set is contained in the interval $[-\sqrt{2}/4, \sqrt{2}/4]$.

Proof. Let $Y$ be the square minus the obstacle.
In order to measure the displacement along a trajectory piece $T$, we have to trace how its lifting to the universal covering space of $Y$ behaves. Since $Y$ is homeomorphic to an annulus, this universal covering has a natural structure of a strip in the plane. Without any loss of generality, we may assume that the displacement along $T$ is positive. We divide $Y$ into 4 regions, as in Fig. 6.1. The line dividing regions 1 and 2 is a segment of the lowest horizontal straight line such that the whole obstacle is below it; this segment has only its left endpoint belonging to the obstacle. The other three dividing lines are chosen in the same way after turning the whole picture by 90, 180 and 270 degrees. Note that here we are not much concerned with which region the points of the division lines belong to, so it is a partition modulo its boundaries. In the universal covering of $Y$, our four regions become infinitely many ones, and they are ordered as

\[ \dots, 1_{-1}, 2_{-1}, 3_{-1}, 4_{-1}, 1_0, 2_0, 3_0, 4_0, 1_1, 2_1, 3_1, 4_1, \dots. \tag{6.1} \]

Fig. 6.1. Four regions

Here the main number shows to which region in $Y$ our region in the lifting projects, and the subscript indicates the branch of the argument (as in the definition of the displacement) that we are using. Thus, if in the lifting the trajectory goes from, say, $2_0$ to $1_{23}$, then the displacement is 22, up to an additive constant. This constant does not depend on the trajectory piece we consider, so it disappears when we pass to the limit to determine the rotation set.

Since we assumed that the displacement along $T$ is positive, then in general, the trajectory moves in the order as in (6.1), although of course it can go back and forth. Look at some region, say, $1_n$, which is (in the universal covering) between the region where $T$ begins and the region where $T$ ends. After $T$ leaves $1_n$ for good, it can bounce between the left and right sides several times. As this happens, the $y$-coordinate on the trajectory must grow, so at some point the trajectory will hit the upper side of the square for the first time after it leaves $1_n$ for good (unless it ends before it does this). We denote the time of this collision by $t(1_n)$. Then we use analogous notation for the time of the first hit of the left side after leaving $2_n$ for good, etc.

After the trajectory hits the top side at $t(1_n)$, it moves to the left or right (we mean the horizontal component of the velocity). It cannot move vertically, because then it would return to the region $1_n$. If it is moving to the left, it is still in the region $2_n$, or it just left it, but did not hit the left side of the square yet. Therefore $t(1_n) < t(2_n)$. If it is moving to the right, it is in, or it will return to, the region $2_n$. Therefore also in this case $t(1_n) < t(2_n)$.

In such a way we get an increasing sequence of times when the trajectory $T$ hits the consecutive sides of the square (in the lifting). By joining those consecutive reflection points by segments, we get a piecewise linear curve $\gamma$, which is not longer than $T$, but the displacement along $\gamma$ differs from the displacement along $T$ at most by a constant independent of the choice of $T$. This curve $\gamma$ goes from the right side of the square to the upper one, to the left one, to the lower one, to the right one, etc.
This is exactly the same behavior that is displayed by the trajectory $\Gamma$ in the square without an obstacle that starts at the midpoint of the right side and goes in the direction of $(-1, 1)$. Therefore $\gamma$ and $\Gamma$ pass through the same squares in the unfolding. We terminate $\Gamma$ in that square in the unfolding in which $\gamma$ ends. Since in the unfolding $\Gamma$ is a segment of a straight line, it is shorter than $\gamma$, again up to a constant independent of $\gamma$, and those two curves have the same displacement (as always, up to a constant). This shows that the rotation number of the trajectory piece $T$ is not larger than for the curve $\Gamma$, plus a constant that goes to 0 as the length of $T$ goes to infinity. Since the rotation number of $\Gamma$ (let us think now about the infinite trajectory) is $\sqrt{2}/4$, the limit in the definition of the rotation set cannot exceed this number.

The two trajectories in the square without an obstacle, described in the last paragraph of the above proof, are also trajectories of any square billiard with one small obstacle.
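The value $\sqrt{2}/4$ for this extremal trajectory is easy to confirm numerically. In the empty square the trajectory from the midpoint of the right side in the direction $(-1, 1)$ is the closed diamond through the four midpoints of the sides. The sketch below, with illustrative function names, computes its winding around the center and its length over one period, and checks that the diamond stays at distance $\sqrt{2}/4$ from the center, which is why it avoids any small obstacle.

```python
import math


def winding(points, z=(0.0, 0.0)):
    """Increment of arg(x - z) / (2*pi) along a polyline avoiding z."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        ax, ay, bx, by = x0 - z[0], y0 - z[1], x1 - z[0], y1 - z[1]
        total += math.atan2(ax * by - ay * bx, ax * bx + ay * by)
    return total / (2 * math.pi)


def polyline_length(points):
    """Euclidean length of a polyline given by its vertices."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


# One period of the orbit: right, top, left, bottom midpoints, back to start.
diamond = [(0.5, 0.0), (0.0, 0.5), (-0.5, 0.0), (0.0, -0.5), (0.5, 0.0)]

rotation_number = winding(diamond) / polyline_length(diamond)

# Distance from the center to the nearest point of the diamond
# (the midpoint of each edge, e.g. (0.25, 0.25)).
clearance = math.hypot(0.25, 0.25)
```

Here `rotation_number` and `clearance` both evaluate to $\sqrt{2}/4 \approx 0.3536$ up to rounding: one full turn per period of length $2\sqrt{2}$.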

Therefore in this case the rotation set $R$ contains the interval $[-\sqrt{2}/4, \sqrt{2}/4]$. In such a way we get the following result.

Theorem 6.9. For a billiard with one small obstacle in a square, the rotation set is equal to the interval $[-\sqrt{2}/4, \sqrt{2}/4]$.

Now we present the proof of Theorem 5.8 that has been postponed until now.

Proof of Theorem 5.8. Let us use the construction from the proof of Theorem 6.8 and look at a long trajectory piece $T$ of admissible type. We get an increasing sequence of times when the trajectory $T$ hits the consecutive sides of the square (in the lifting). Construct a partial unfolding of $T$, passing to a neighboring square only at those times. In such a way we get a piecewise linear curve $T'$ which sometimes reflects from the sides of the unfolded square, and sometimes goes through them. Its length is the same as the length of $T$. Moreover, the curve $\gamma$ constructed in the proof of Theorem 6.8 starts and terminates in the same squares as $T'$ and, as we know from that proof, the displacements along $\gamma$ and $T$ differ at most by a constant independent of $T$. It is also clear that the same holds if we replace $\gamma$ by the segment $\Gamma$ (also from the proof of Theorem 6.8). Thus, up to a constant that goes to 0 as the length of $T$ goes to infinity, the rotation number of $T$ is $\sqrt{2}/4$ multiplied by the length of $\Gamma$ and divided by the length of $T$.

By the same reasons as in the proof of Theorem 4.15, there exist positive constants $c_1 < c_2$ and $\alpha > 0$, such that for every trajectory piece of admissible type the distance between two consecutive reflections from obstacles is contained in $[c_1, c_2]$, and the direction of a trajectory piece of admissible type changes by at least $\alpha$ at each reflection. However, we have to take into account that the direction of $T'$ can change also at the reflections from the boundaries of the squares, and then we do not know how the angle changes. Thus, either immediately before or immediately after each reflection from an obstacle there must be a piece of $T'$ where the direction differs from the direction of $\Gamma$ by at least $\alpha/2$. Such a piece has length larger than or equal to the distance from the obstacle to the boundary of the square, which is at least $c_1 = 1/2 - \sqrt{2}/4$. Those pieces are alternating with the pieces of $T'$ of length at most $c_2$ each, where we do not know what happens with the direction. This means that as we follow $T'$ (except the initial piece of the length bounded by $c_2$), we move in the direction that differs from the direction of $\Gamma$ by at least $\alpha/2$ for at least $c_1/(c_1 + c_2)$ of time. Therefore, in the limit as the length of $T$ goes to infinity, the length of $\Gamma$ divided by the length of $T$ is not larger than

\[ b = \frac{c_2}{c_1 + c_2} + \frac{c_1}{c_1 + c_2} \cos\frac{\alpha}{2}. \]

This proves that $AR \subset [-b\sqrt{2}/4, b\sqrt{2}/4]$. Since $b < 1$, this completes the proof.
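To see how the constant $b$ behaves, the following sketch evaluates it for $c_1 = 1/2 - \sqrt{2}/4$ as in the proof. The values of $c_2$ and $\alpha$ used here are hypothetical, chosen only for illustration; the proof asserts their existence but does not compute them.

```python
import math


def length_ratio_bound(c1, c2, alpha):
    """b = c2/(c1+c2) + c1/(c1+c2) * cos(alpha/2).

    A fraction c1/(c1+c2) of the time is spent moving at an angle of at
    least alpha/2 to the reference direction, so the projected length
    there is shortened by a factor of at most cos(alpha/2).
    """
    return c2 / (c1 + c2) + c1 / (c1 + c2) * math.cos(alpha / 2)


c1 = 0.5 - math.sqrt(2) / 4      # lower bound stated in the proof
c2 = 2.0                         # hypothetical upper bound, for illustration
alpha = 0.1                      # hypothetical minimal turning angle (radians)

b = length_ratio_bound(c1, c2, alpha)
admissible_bound = b * math.sqrt(2) / 4   # AR lies in [-bound, bound]
```

Because $\cos(\alpha/2) < 1$ for $0 < \alpha < 2\pi$ and $c_1 > 0$, the value of `b` is strictly less than 1, and it tends to 1 as $\alpha \to 0$ or $c_2 \to \infty$.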

References

1. Alsedà, L., Llibre, J., Misiurewicz, M.: Combinatorial dynamics and entropy in dimension one. Advanced Series in Nonlinear Dynamics, Vol. 5, Second Edition, River Edge, NJ: World Scientific, 2000
2. Bárány, I., Simányi, N.: A note on the size of the largest ball inside a convex polytope. Period. Math. Hungar. 51, 15–18 (2005)
3. Bauer, H.: Probability Theory and Elements of Measure Theory. New York: Holt, Rinehart and Winston, 1972
4. Chernov, N.: Construction of transverse fiberings in multidimensional semidispersed billiards. Funct. Anal. Appl. 16, 270–280 (1982)
5. Dettmann, C. P.: The Lorentz Gas: A Paradigm for Nonequilibrium Stationary States. In: Hard Ball Systems and the Lorentz Gas (ed. D. Szász), Encyclopaedia Math. Sci. 101, Berlin-Heidelberg-New York: Springer Verlag, 2000, pp. 315–365
6. Katok, A., Hasselblatt, B.: Introduction to the Modern Theory of Dynamical Systems. Cambridge: Cambridge University Press, 1995
7. Misiurewicz, M., Ziemian, K.: Rotation sets for maps of tori. J. London Math. Soc. (2) 40, 490–506 (1989)
8. Misiurewicz, M., Ziemian, K.: Rotation sets and ergodic measures for torus homeomorphisms. Fund. Math. 137, 45–52 (1991)
9. Stoyanov, L.: An estimate from above of the number of periodic orbits for semi-dispersed billiards. Commun. Math. Phys. 124, 217–227 (1989)
10. Szász, D.: The K-property of 'Orthogonal' Cylindric Billiards. Commun. Math. Phys. 160, 581–597 (1994)
11. Ziemian, K.: Rotation sets for subshifts of finite type. Fund. Math. 146, 189–201 (1995)

Communicated by G. Gallavotti

Commun. Math. Phys. 266, 267–288 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0015-3

Multipole Radiation in a Collisionless Gas Coupled to Electromagnetism or Scalar Gravitation

S. Bauer1, M. Kunze1, G. Rein2, A. D. Rendall3

1 Fachbereich Mathematik, Universität Duisburg-Essen, 45117 Essen, Germany. E-mail: [email protected]
2 Universität Bayreuth, Fakultät für Mathematik und Physik, 95440 Bayreuth, Germany
3 Max-Planck-Institut für Gravitationsphysik, Am Mühlenberg 1, 14476 Golm, Germany

Received: 29 August 2005 / Accepted: 9 November 2005
Published online: 22 April 2006 – © Springer-Verlag 2006

Abstract: We consider the relativistic Vlasov-Maxwell and Vlasov-Nordström systems which describe large particle ensembles interacting by either electromagnetic fields or a relativistic scalar gravity model. For both systems we derive a radiation formula analogous to the Einstein quadrupole formula in general relativity.

1. Introduction and Main Results

This paper is an investigation of the mathematical properties of certain models for the interaction of matter, described by a kinetic equation, with radiation, described by hyperbolic equations. The first model, the relativistic Vlasov-Maxwell system, plays an important role in plasma physics. The motivation for studying the second model, the Vlasov-Nordström system, comes from the theory of gravitation. On a mathematical level the Vlasov-Maxwell system can also give insights into gravity.

The most precise existing theory of gravitation, general relativity, predicts that certain astrophysical systems, such as colliding black holes or neutron stars, will give rise to gravitational radiation. There is a major international effort under way to detect these gravitational waves [5]. In order to relate the general theory to predictions of what the detectors will see it is necessary to use approximation methods, since the exact theory is too complicated. The mathematical status of these approximations remains unclear although partial results exist. This paper is intended as a contribution to understanding the mathematical structures involved.

Since the solutions of the equations of general relativity are so difficult to analyze rigorously it is useful to start with model problems. One possibility is the scalar theory of gravitation considered here, the Vlasov-Nordström theory [6]. It has already been used as a model problem for numerical relativity in [21].

Supported in parts by DFG priority research program SPP 1095


S. Bauer, M. Kunze, G. Rein, A. D. Rendall

Among the approximation methods used to study gravitational radiation those which are most accessible mathematically are the post-Newtonian approximations. Some information on these has been obtained in [17] and [18]. Results which are analogous to these but go much further have been obtained for the Vlasov-Maxwell and Vlasov-Nordström systems in [4] and [2] respectively. None of these results include radiation explicitly. Here we take a first step in doing so. On the other hand, for the case of finite particle systems interacting with their self-induced fields there are several rigorous results concerning radiation; see [22] for an up-to-date review. Our main results (Theorem 1.4 and Theorem 1.9 below) are relations between the motion of matter and the radiation flux at infinity for the Vlasov-Maxwell and VlasovNordström systems respectively. They are analogues of the Einstein quadrupole formula [23, (4.5.13)] which is a basic tool in computing the flux of gravitational waves from a given source. In the case of the Einstein and Maxwell equations a spherically symmetric system does not radiate. For the Vlasov-Nordström system a spherical system can radiate and the specialization of the general formula to that case is computed. In [21] a difference between the spherically symmetric and the general case was claimed but we have not succeeded in connecting this to our results. The main theorems are obtained under plausible assumptions on the behavior of global solutions of the relevant system (Assumption 1.1 and Assumption 1.6 below). The former can be proved to hold in the case of small data. For the systems we are going to consider the (scalar) energy density e and the (vector) momentum density P are related by the conservation law ∂t e + ∇ · P = 0. Defining the local energy in the ball of radius r > 0 as Er (t) =

$\int_{|x|\le r} e(t,x)\,dx$, this conservation law and the divergence theorem imply that
\[
\frac{d}{dt}\mathcal{E}_r(t) = \int_{|x|\le r}\partial_t e(t,x)\,dx = -\int_{|x|\le r}\nabla\cdot P(t,x)\,dx = -\int_{|x|=r}\bar x\cdot P(t,x)\,d\sigma(x), \tag{1.1}
\]
where $\bar x = x/|x|$ denotes the outer unit normal. More specifically, for the relativistic Vlasov-Maxwell system with two particle species,

\[
e_{\mathrm{RVM}}(t,x) = c^2\int\sqrt{1+c^{-2}p^2}\,(f^+ + f^-)(t,x,p)\,dp + \frac{1}{8\pi}\big(|E(t,x)|^2 + |B(t,x)|^2\big), \tag{1.2}
\]
\[
P_{\mathrm{RVM}}(t,x) = c^2\int p\,(f^+ + f^-)(t,x,p)\,dp + \frac{c}{4\pi}\,E(t,x)\times B(t,x), \tag{1.3}
\]

Multipole Radiation in a Collisionless Gas


whereas for the Vlasov-Nordström system,
\[
e_{\mathrm{VN}}(t,x) = c^2\int\sqrt{1+c^{-2}p^2}\,f(t,x,p)\,dp + \frac{c^2}{8\pi}\big((\partial_t\phi(t,x))^2 + c^2|\nabla\phi(t,x)|^2\big),
\]
\[
P_{\mathrm{VN}}(t,x) = c^2\int p\,f(t,x,p)\,dp - \frac{c^4}{4\pi}\,\partial_t\phi(t,x)\,\nabla\phi(t,x). \tag{1.4}
\]

Our assumptions on the support of the distribution function will be such that the contributions of $\int p\,(f^+ + f^-)\,dp$ to $P_{\mathrm{RVM}}$ and of $\int p\,f\,dp$ to $P_{\mathrm{VN}}$ vanish for $|x| = r$ large. Hence we arrive at
\[
\frac{d}{dt}\mathcal{E}_r^{\mathrm{RVM}}(t) = \frac{c}{4\pi}\int_{|x|=r}\bar x\cdot(B\times E)(t,x)\,d\sigma(x)
\]
for the relativistic Vlasov-Maxwell system, and
\[
\frac{d}{dt}\mathcal{E}_r^{\mathrm{VN}}(t) = \frac{c^4}{4\pi}\int_{|x|=r}\bar x\cdot(\partial_t\phi\,\nabla\phi)(t,x)\,d\sigma(x)
\]
for the Vlasov-Nordström system. The main results of this paper are concerned with the expansion of these energy fluxes for $r, c\to\infty$ and $|t - c^{-1}r|\le \mathrm{const}$. Under suitable assumptions we will prove that, to leading order,
\[
\frac{d}{dt}\mathcal{E}_r^{\mathrm{RVM}}(t) \sim -\frac{2}{3c^3}\,|\partial_t^2 D(u)|^2,
\]
where $u = t - c^{-1}r$ denotes the retarded time and $D(u) = \int x\,\rho_0(u,x)\,dx$ is the dipole moment associated to the Newtonian limit of the relativistic Vlasov-Maxwell system. Similarly,
\[
\frac{d}{dt}\mathcal{E}_r^{\mathrm{VN}}(t) \sim -\frac{1}{4\pi c^5}\int_{|\omega|=1}(\partial_t R(\omega,u))^2\,d\sigma(\omega),
\]
with a more complicated radiation term $R$ associated to the Newtonian limit of the Vlasov-Nordström system. In the spherically symmetric case, $\partial_t R(\omega,u)$ is found to be proportional to $\partial_t\mathcal{E}_{\mathrm{kin}}(u)$, the change of kinetic energy of the Newtonian system. The exact statements are contained in Theorems 1.4 and 1.9 below.

1.1. Dipole radiation in the relativistic Vlasov-Maxwell system.

The relativistic Vlasov-Maxwell system describes a large ensemble of particles which move at possibly relativistic speeds and interact only by the electromagnetic fields which the ensemble creates collectively. Collisions among the particles are assumed to be sufficiently rare to be neglected [13]. In order to see effects due to radiation damping it is necessary that there are at least two species of particles with different charge-to-mass ratios. For the sake of simplicity we assume that there are exactly two species with their masses normalized to unity and their charges normalized to plus and minus unity, respectively. The density of the positively and negatively charged particles in phase space is given by


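the non-negative distribution functions $f^\pm = f^\pm(t,x,p)$, depending on time $t\in\mathbb{R}$, position $x\in\mathbb{R}^3$, and momentum $p\in\mathbb{R}^3$. Their dynamics is governed by the relativistic Vlasov-Maxwell system
\[
\left.\begin{aligned}
&\partial_t f^\pm + \hat p\cdot\nabla_x f^\pm \pm (E + c^{-1}\hat p\times B)\cdot\nabla_p f^\pm = 0,\\
&c\,\nabla\times E = -\partial_t B,\qquad c\,\nabla\times B = \partial_t E + 4\pi j,\\
&\nabla\cdot E = 4\pi\rho,\qquad \nabla\cdot B = 0,\\
&\rho = \int (f^+ - f^-)\,dp,\qquad j = \int \hat p\,(f^+ - f^-)\,dp,
\end{aligned}\right\}\tag{RVMc}
\]
where
\[
\hat p = \gamma p,\qquad \gamma = (1 + c^{-2}p^2)^{-1/2},\qquad p^2 = |p|^2,\qquad\text{and}\qquad \int = \int_{\mathbb{R}^3}. \tag{1.5}
\]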

The electric field $E = E(t,x)\in\mathbb{R}^3$ and the magnetic field $B = B(t,x)\in\mathbb{R}^3$ satisfy the wave equations
\[
(-\partial_t^2 + c^2\Delta)E = 4\pi(c^2\nabla\rho + \partial_t j)\quad\text{and}\quad(-\partial_t^2 + c^2\Delta)B = -4\pi c\,\nabla\times j. \tag{1.6}
\]
In order to determine the radiation of the system at infinity, we have to consider solutions that are isolated from incoming radiation. For the wave equations in (1.6), this means that we need to restrict ourselves to the retarded part of the solutions. Accordingly, (RVMc) is replaced by
\[
\left.\begin{aligned}
&\partial_t f^\pm + \hat p\cdot\nabla_x f^\pm \pm (E + c^{-1}\hat p\times B)\cdot\nabla_p f^\pm = 0,\\
&E(t,x) = -\int(\nabla\rho + c^{-2}\partial_t j)(t - c^{-1}|y-x|,\,y)\,\frac{dy}{|y-x|},\\
&B(t,x) = c^{-1}\int\nabla\times j\,(t - c^{-1}|y-x|,\,y)\,\frac{dy}{|y-x|},\\
&\rho = \int (f^+ - f^-)\,dp,\qquad j = \int \hat p\,(f^+ - f^-)\,dp,
\end{aligned}\right\}\tag{retRVMc}
\]
which we call the retarded relativistic Vlasov-Maxwell system. The motivation for considering this system is as follows. In physics, situations are often important where radiation impinging on the matter from far away has a negligible effect on the dynamics, and it is therefore an appropriate idealization to use the retarded solution of the field equations. When this is done the specification of a solution requires only data for the matter, in contrast to what happens in the usual initial value problem for the corresponding system of equations. We prescribe initial data
\[
f^\pm(0,x,p) = f^{\pm,\circ}(x,p),\qquad x,p\in\mathbb{R}^3, \tag{1.7}
\]
for the densities at $t = 0$; these data do not depend on $c$. However, the corresponding solution $(f^+, f^-, E, B)$ does depend on $c$, but we do not make this dependence explicit in our notation. We refer to Remark 1.5(c) below for the case of initial data varying with $c$. Our standing assumption is that the initial data are non-negative, smooth, and compactly supported,
\[
f^{\pm,\circ}\in C_0^\infty(\mathbb{R}^3\times\mathbb{R}^3),\qquad f^{\pm,\circ}\ge 0, \tag{1.8}
\]
and we fix positive constants $R_0$, $P_0$, $S_0$ such that
\[
f^{\pm,\circ}(x,p) = 0\ \text{for}\ |x|\ge R_0\ \text{or}\ |p|\ge P_0,\qquad\text{and}\qquad \|f^{\pm,\circ}\|_{W^{3,\infty}}\le S_0. \tag{1.9}
\]


Every solution of (retRVMc) satisfies the identity
\[
f^\pm(t,x,p) = f^{\pm,\circ}\big(X^\pm(0,t,x,p),\,P^\pm(0,t,x,p)\big), \tag{1.10}
\]
where $s\mapsto(X^\pm(s,t,x,p),\,P^\pm(s,t,x,p))$ solves the characteristic system
\[
\dot x = \hat p,\qquad \dot p = \pm(E + c^{-1}\hat p\times B), \tag{1.11}
\]
with data $X^\pm(t,t,x,p) = x$ and $P^\pm(t,t,x,p) = p$. Hence $0\le f^\pm(t,x,p)\le\|f^{\pm,\circ}\|_\infty$. In order to derive our results on radiation, we have to assume certain a priori bounds on the corresponding solutions of (retRVMc). In particular, the latter have to exist globally in time.

Assumption 1.1.
(a) For each $c\ge 1$ the system (retRVMc) has a unique solution $f^\pm\in C^2(\mathbb{R}\times\mathbb{R}^3\times\mathbb{R}^3)$, $E\in C^2(\mathbb{R}\times\mathbb{R}^3;\mathbb{R}^3)$, $B\in C^2(\mathbb{R}\times\mathbb{R}^3;\mathbb{R}^3)$, satisfying the initial condition (1.7).
(b) There exists $P_1>0$ such that $f^\pm(t,x,p) = 0$ for $|p|\ge P_1$ and all $c\ge 1$. In particular, $f^\pm(t,x,p) = 0$ for $|x|\ge R_0 + P_1|t|$ by (1.11).
(c) For every $T>0$, $R>0$, and $P>0$ there exists a constant $M_1(T,R,P)>0$ such that
\[
|\partial_t^{\alpha+1}f^\pm(t,x,p)| + |\partial_t^{\alpha}\nabla_x f^\pm(t,x,p)|\le M_1(T,R,P)
\]
for $|t|\le T$, $|x|\le R$, $|p|\le P$, and $\alpha = 0,1$, uniformly in $c\ge 1$.

Note that none of the constants in Assumption 1.1 may depend on $c$. The constants $R_0$, $P_0$, $S_0$, $P_1$, $M_1$ from (1.9) and Assumption 1.1 are considered to be the "basic" ones. Any other constant which appears in an estimate is only allowed to depend on these. Checking the arguments from [7, 8], it can be shown that Assumption 1.1 holds at least for sufficiently "small" initial data $f^{\pm,\circ}$. A more precise investigation of the set of initial data leading to solutions which satisfy Assumption 1.1 is not part of this paper. The main point we want to make here is that whenever Assumption 1.1 is verified, the technique described below can be employed.

We will need estimates relating the solutions of (retRVMc) to the corresponding Newtonian problem obtained in the limit $c\to\infty$. This sort of information usually goes under the name of post-Newtonian approximation; see [19, 4]. For this, one formally expands the solutions in powers of $c^{-1}$ as
\[
\begin{aligned}
f^\pm &= f_0^\pm + c^{-1}f_1^\pm + c^{-2}f_2^\pm + \cdots,\\
E &= E_0 + c^{-1}E_1 + c^{-2}E_2 + \cdots,\\
B &= B_0 + c^{-1}B_1 + c^{-2}B_2 + \cdots,
\end{aligned}
\]
with coefficient functions $f_j^\pm$, $E_j$, and $B_j$ independent of $c$. Moreover, by (1.5),
\[
\hat p = p - (c^{-2}/2)\,p^2\,p + \cdots,\qquad \gamma = 1 - (c^{-2}/2)\,p^2 + \cdots.
\]
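The two truncated expansions of $\gamma$ and $\hat p$ can be checked numerically. The following is a minimal sketch, not part of the paper: the function names and the sample momentum are illustrative assumptions. It evaluates the exact expressions from (1.5) and compares them with the first two terms of each expansion; the remaining error should scale like $c^{-4}$.

```python
import math

def gamma(p, c):
    """Exact gamma = (1 + |p|^2 / c^2)^(-1/2) from (1.5)."""
    p2 = sum(x * x for x in p)
    return 1.0 / math.sqrt(1.0 + p2 / c**2)

def truncation_errors(p, c):
    """Errors of the two-term expansions of gamma and p_hat = gamma * p.

    Returns (|gamma - (1 - p^2/(2c^2))|, |p_hat - (p - p^2 p/(2c^2))|).
    Both names are hypothetical helpers introduced for this illustration.
    """
    p2 = sum(x * x for x in p)
    g = gamma(p, c)
    err_gamma = abs(g - (1.0 - 0.5 * p2 / c**2))
    err_phat = math.sqrt(
        sum((g * x - (x - 0.5 * p2 / c**2 * x)) ** 2 for x in p)
    )
    return err_gamma, err_phat

if __name__ == "__main__":
    p = (0.3, -0.4, 1.2)  # arbitrary sample momentum
    for c in (10.0, 20.0, 40.0):
        print(c, truncation_errors(p, c))
```

Doubling $c$ should reduce both errors by a factor close to 16, which is what an $O(c^{-4})$ remainder predicts.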


These expansions can be substituted into (retRVMc), and comparing coefficients at every order gives a sequence of equations for the coefficients. The Newtonian limit of (retRVMc) is given by the plasma physics case of the Vlasov-Poisson system:
\[
\left.\begin{aligned}
&\partial_t f_0^\pm + p\cdot\nabla_x f_0^\pm \pm E_0\cdot\nabla_p f_0^\pm = 0,\\
&E_0(t,x) = \int\frac{x-y}{|x-y|^3}\,\rho_0(t,y)\,dy,\\
&\rho_0 = \int(f_0^+ - f_0^-)\,dp,\\
&f_0^\pm(0,x,p) = f^{\pm,\circ}(x,p).
\end{aligned}\right\}\tag{VPpl}
\]
The following proposition addresses the well-known solvability properties of (VPpl). Clearly, $(f_0^+, f_0^-, E_0)$ is independent of $c$, and we refer to e.g. [20, 16] for the regularity of the solution.

Proposition 1.2. There are constants $R_2, P_2>0$, and for every $T>0$, $R>0$, and $P>0$, there is a constant $M_2(T,R,P)>0$, with the following properties. For initial data $f^{\pm,\circ}$ as above, there exists a unique global solution $(f_0^\pm, E_0)$ of (VPpl) so that
(a) $f_0^\pm\in C^\infty(\mathbb{R}\times\mathbb{R}^3\times\mathbb{R}^3)$ and $E_0\in C^\infty(\mathbb{R}\times\mathbb{R}^3;\mathbb{R}^3)$,
(b) if $|t|\le 1$, then $f_0^\pm(t,x,p) = 0$ for $|x|\ge R_2$ or $|p|\ge P_2$,
(c) if $|t|\le T$, $|x|\le R$, $|p|\le P$, and $\alpha = 0,1$, then $|\partial_t^\alpha f_0^\pm(t,x,p)| + |\partial_t^\alpha E_0(t,x)|\le M_2(T,R,P)$.

For the approximation of solutions of (retRVMc) by solutions of (VPpl), we state the following result without proof; the result is derived like the analogous one for (RVMc), cf. [19, 4].

Proposition 1.3. Under Assumption 1.1 there exist for every $T>0$, $R>0$, and $P>0$ constants $M_3(T,R,P)>0$ and $M_4(T,R)>0$ with the following property. If $c\ge 2P_1$ and if $(f^\pm, E, B)$ and $(f_0^\pm, E_0)$ denote the global solutions of (retRVMc) and (VPpl) provided by Assumption 1.1 and Proposition 1.2, respectively, with initial data as above, then
(a) $|f^\pm(t,x,p) - f_0^\pm(t,x,p)|\le M_3(T,R,P)\,c^{-2}$ for $|t|\le T$, $|x|\le R$, and $|p|\le P$,
(b) $|E(t,x) - E_0(t,x)|\le M_4(T,R)\,c^{-2}$ for $|t|\le T$ and $|x|\le R$,
(c) $|B(t,x)|\le M_4(T,R)\,c^{-1}$ for $|t|\le T$ and $|x|\le R$.

It is important to note that all the "derived" constants $R_2$, $P_2$, $M_2$, $M_3$, $M_4$ appearing above depend only on the basic constants $R_0$, $P_0$, $S_0$, $P_1$, $M_1$. In order to determine the constants $M_3(T,R,P)$ and $M_4(T,R)$ for given parameters $T>0$, $R>0$, and $P>0$, the constant $M_1(T',R',P')$ from Assumption 1.1 is needed for certain parameters $T'>T$, $R'>R$, $P'>P$. We are now ready to state our first main result.

Theorem 1.4 (Radiation for (retRVMc)). Put $r_* = \max\{2(R_0+P_1),\,R_2\}$ and
\[
\mathcal{M}_{\mathrm{RVM}} = \{(t,r,c) : r\ge 2r_*,\ c\ge 2P_1,\ |t - c^{-1}r|\le 1,\ r\ge c^3\}.
\]
If $(t,r,c)\in\mathcal{M}_{\mathrm{RVM}}$, then with $r = |x|$, $\bar x = x/|x|$, and $u = t - c^{-1}|x|$,
\[
\Big|\bar x\cdot(B\times E)(t,x) + c^{-4}r^{-2}\,|\bar x\times\partial_t^2 D(u)|^2\Big|\le A\,(c^{-5}r^{-2} + c^{-2}r^{-3} + c^{-1}r^{-4}), \tag{1.12}
\]


for a constant $A>0$ depending only on $R_0$, $P_0$, $S_0$, $P_1$, $M_1$. In particular,
\[
\frac{d}{dt}\mathcal{E}_r^{\mathrm{RVM}}(t) = \frac{c}{4\pi}\int_{|x|=r}\bar x\cdot(B\times E)(t,x)\,d\sigma(x) = -\frac{2}{3c^3}\,|\partial_t^2 D(u)|^2 + O(c^{-4} + c^{-1}r^{-1} + r^{-2}) \tag{1.13}
\]
for $(t,r,c)\in\mathcal{M}_{\mathrm{RVM}}$. Here $\mathcal{E}_r^{\mathrm{RVM}}(t) = \int_{|x|\le r}e_{\mathrm{RVM}}(t,x)\,dx$, see (1.2), and $D(u) = \int x\,\rho_0(u,x)\,dx$ denotes the dipole moment associated to the Vlasov-Poisson system (VPpl).

Remark 1.5.
(a) The condition $r\ge c^3$ in $\mathcal{M}_{\mathrm{RVM}}$ is not needed for the proof of (1.12) and (1.13). It just guarantees that $c^{-2}r^{-3}\le c^{-5}r^{-2}$ and $c^{-1}r^{-1}\le c^{-4}$.
(b) The same estimate (1.12) can be derived, possibly with a different constant $A$, if the condition $|u|\le 1$ is replaced by $|u|\le u_0$ for some $u_0>0$.
(c) As long as the constants $R_0$, $P_0$, $S_0$, $P_1$, $M_1$ remain independent of $c$, one can also allow for $c$-dependent initial data $f_c^{\pm,\circ}$, both for (retRVMc) and (VPpl). However, in this case the functions $(f_0^\pm, E_0)$ become $c$-dependent, too. For instance, in the particular case
\[
f_c^{\pm,\circ} = f_0^{\pm,\circ} + c^{-1}f_1^{\pm,\circ} + c^{-2}f_{r,c}^{\pm,\circ},
\]
with $f_0^{\pm,\circ}$, $f_1^{\pm,\circ}$, and $f_{r,c}^{\pm,\circ}$ satisfying suitable bounds (independently of $c$ for $f_{r,c}^{\pm,\circ}$), Theorem 1.4 remains valid if $f_0^\pm$ and $E_0$ are replaced by the approximations $\tilde f_0^\pm + c^{-1}\tilde f_1^\pm$ and $\tilde E_0 + c^{-1}\tilde E_1$, respectively. Here $(\tilde f_0^\pm, \tilde E_0)$ is the solution of (VPpl) for the initial data $f_0^{\pm,\circ}$, and $(\tilde f_1^\pm, \tilde E_1)$ solves the Vlasov-Poisson system linearized about $(\tilde f_0^\pm, \tilde E_0)$, under the initial condition $\tilde f_1^\pm(0) = f_1^{\pm,\circ}$.
(d) In the case of one species only, say $f^{-,\circ} = 0$, there is no dipole radiation, since then $\partial_t^2 D = 0$, cf. (2.13) below.
(e) For spherically symmetric solutions there is again no dipole radiation. In fact, if $\rho_0(t,-x) = \rho_0(t,x)$ for $x\in\mathbb{R}^3$, then $D = 0$ by symmetry.

The proof of Theorem 1.4 is given in Sect. 2.1.

1.2. Monopole radiation in the Vlasov-Nordström system.

If we set all physical constants (except the speed of light $c$) equal to unity, then the Vlasov-Nordström system is given by
\[
\left.\begin{aligned}
&\partial_t f + \hat p\cdot\nabla_x f - \big[(S\phi)\,p + c^2\gamma\,\nabla\phi\big]\cdot\nabla_p f = 4(S\phi)f,\\
&(-\partial_t^2 + c^2\Delta)\phi = 4\pi\mu,\\
&\mu = \int\gamma f\,dp,
\end{aligned}\right\}\tag{VNc}
\]
where we continue to use the notation from (1.5), and where $S = \partial_t + \hat p\cdot\nabla$. The matter distribution is modeled through the nonnegative density function $f = f(t,x,p)$, whereas the scalar function $\phi = \phi(t,x)$ describes the gravitational field. We refer to


[6, 11, 1, 9] for the global existence of smooth solutions to (VNc). In analogy to the passage from (RVMc) to (retRVMc), the solutions of (VNc) that are isolated from incoming radiation are the solutions of the retarded system
\[
\left.\begin{aligned}
&\partial_t f + \hat p\cdot\nabla_x f - \big[(S\phi)\,p + c^2\gamma\,\nabla\phi\big]\cdot\nabla_p f = 4(S\phi)f,\\
&\phi(t,x) = -c^{-2}\int\mu(t - c^{-1}|y-x|,\,y)\,\frac{dy}{|y-x|},\\
&\mu = \int\gamma f\,dp,
\end{aligned}\right\}\tag{retVNc}
\]
which we call the retarded Vlasov-Nordström system. We continue to make the standing hypotheses (1.8) and (1.9) for the initial data $f(0,x,p) = f^\circ(x,p)$ of (retVNc). At this point it should be noted that the "physical" particle density on the mass shell in the metric $e^{2\phi}\,\mathrm{diag}(-1,1,1,1)$ is not $f$ but $e^{-4\phi(t,x)}f(t,x,e^{\phi}p)$. In particular, the density $f$ used in the formulation above is not constant along solutions of the characteristic system
\[
\dot x = \hat p,\qquad \dot p = -(S\phi)\,p - c^2\gamma\,\nabla\phi, \tag{1.14}
\]
but satisfies the relation
\[
f(t,x,p) = f^\circ\big(X(0,t,x,p),\,P(0,t,x,p)\big)\,e^{4\phi(t,x) - 4\phi(0,\,X(0,t,x,p))}, \tag{1.15}
\]
where $s\mapsto(X(s,t,x,p),\,P(s,t,x,p))$ denotes the solution of (1.14) with $X(t,t,x,p) = x$, $P(t,t,x,p) = p$. For technical reasons we prefer to work with the above formulation of the system in terms of the "unphysical" density $f$, and we make the following assumptions on the solutions of (retVNc).

Assumption 1.6.
(a) For each $c\ge 1$ the system (retVNc) has a unique solution $f\in C^2(\mathbb{R}\times\mathbb{R}^3\times\mathbb{R}^3)$, $\phi\in C^2(\mathbb{R}\times\mathbb{R}^3)$, satisfying the initial condition $f(0,x,p) = f^\circ(x,p)$.
(b) There exists $P_1>0$ such that $f(t,x,p) = 0$ for $|p|\ge P_1$ and all $c\ge 1$; by (1.15), (1.14) this implies that $f(t,x,p) = 0$ for $|x|\ge R_0 + P_1|t|$.
(c) For every $T>0$, $R>0$, and $P>0$ there exists a constant $M_1(T,R,P)>0$ such that $|\partial_t^\alpha f(t,x,p)|\le M_1(T,R,P)$ for $|t|\le T$, $|x|\le R$, $|p|\le P$, and $\alpha = 0,1,2$. In addition, for every $T>0$ and $R>0$ there exists a constant $M_1(T,R)>0$ such that $|\phi(t,x)| + |\nabla\phi(t,x)| + |\partial_t\phi(t,x)|\le M_1(T,R)$ for $|t|\le T$ and $|x|\le R$, uniformly in $c\ge 1$.

Again $R_0$, $P_0$, $S_0$, $P_1$, $M_1$ are considered to be the "basic" constants, all other constants being derived from these. We remark that for "small" initial data the existence of global-in-time solutions is shown in [12], where also bounds on the solutions are obtained. It is reasonable to expect that these solutions have the required regularity for smooth initial data, cf. [16], and that on compact time intervals estimates as in Assumption 1.6 (c) can be derived uniformly in $c$. The crucial assumption is the bound on the momentum support in part (b), which needs to be uniform in $c$ as well.


The Newtonian approximation for $c\to\infty$ of (retVNc) is found by means of the formal expansion
\[
\begin{aligned}
f &= f_0 + c^{-1}f_1 + c^{-2}f_2 + \cdots,\\
\phi &= \phi_0 + c^{-1}\phi_1 + c^{-2}\phi_2 + c^{-3}\phi_3 + c^{-4}\phi_4 + \cdots,
\end{aligned}
\]
see [10, 2]. Thereby it is verified that this (lowest order) Newtonian approximation of (retVNc) is given by the gravitational case of the Vlasov-Poisson system
\[
\left.\begin{aligned}
&\partial_t f_0 + p\cdot\nabla_x f_0 - \nabla\phi_2\cdot\nabla_p f_0 = 0,\\
&\phi_2(t,x) = -\int\frac{\rho_0(t,y)}{|x-y|}\,dy,\\
&\rho_0 = \int f_0\,dp,\\
&f_0(0,x,p) = f^\circ(x,p).
\end{aligned}\right\}\tag{VPgr}
\]
The analogue of Proposition 1.2 is valid for (VPgr). Note that $(f_0,\phi_2)$ is independent of $c$.

Proposition 1.7. There are constants $R_2, P_2>0$, and for every $T>0$, $R>0$, and $P>0$, there is a constant $M_2(T,R,P)>0$, with the following properties. For initial data $f^\circ$ as above, there exists a unique global solution $(f_0,\phi_2)$ of (VPgr) so that
(a) $f_0\in C^\infty(\mathbb{R}\times\mathbb{R}^3\times\mathbb{R}^3)$ and $\phi_2\in C^\infty(\mathbb{R}\times\mathbb{R}^3)$,
(b) if $|t|\le 1$, then $f_0(t,x,p) = 0$ for $|x|\ge R_2$ or $|p|\ge P_2$,
(c) if $|t|\le T$, $|x|\le R$, $|p|\le P$, and $\alpha = 0,1,2$, then $|\partial_t^\alpha f_0(t,x,p)| + |\partial_t^{\alpha+1}\phi_2(t,x)| + |\partial_t^\alpha\nabla\phi_2(t,x)|\le M_2(T,R,P)$.

By [3], we also have the following rigorous result concerning the Newtonian limit of (retVNc).

Proposition 1.8. Choose the constants $P_1>0$ and $M_1(T,R,P)>0$ according to Assumption 1.6. Then for every $T>0$, $R>0$, and $P>0$ there are constants $M_3(T,R,P)>0$ and $M_4(T,R)>0$ with the following properties. If $c\ge 2P_1$, let $(f,\phi)$ and $(f_0,\phi_2)$ denote the global solutions of (retVNc) and (VPgr) provided by Assumption 1.6 and Proposition 1.7, respectively, with initial data as above. Then
(a) $|f(t,x,p) - f_0(t,x,p)|\le M_3(T,R,P)\,c^{-2}$ for $|t|\le T$, $|x|\le R$, and $|p|\le P$,
(b) $|\nabla\phi(t,x)|\le M_4(T,R)\,c^{-2}$ for $|t|\le T$ and $|x|\le R$,
(c) $|\partial_t\phi(t,x) - c^{-2}\partial_t\phi_2(t,x)| + |\nabla\phi(t,x) - c^{-2}\nabla\phi_2(t,x)|\le M_4(T,R)\,c^{-4}$ for $|t|\le T$ and $|x|\le R$.

After these preparations we can state our second main result.

Theorem 1.9 (Radiation for (retVNc)). Put $r_* = \max\{2(R_0+P_1),\,R_2\}$ and
\[
\mathcal{M}_{\mathrm{VN}} = \{(t,r,c) : r\ge 2r_*,\ c\ge 2P_1,\ |t - c^{-1}r|\le 1,\ r\ge c^6\}.
\]
If $(t,r,c)\in\mathcal{M}_{\mathrm{VN}}$, then with $r = |x|$, $\bar x = x/|x|$, and $u = t - c^{-1}|x|$,
\[
\Big|\bar x\cdot(\partial_t\phi\,\nabla\phi)(t,x) + c^{-9}r^{-2}\,(\partial_t R(\bar x,u))^2\Big|\le A\,(c^{-10}r^{-2} + c^{-4}r^{-3}), \tag{1.16}
\]


for a constant $A>0$ depending only on $R_0$, $P_0$, $P_1$, $M_1$, $S_0$. In particular,
\[
\frac{d}{dt}\mathcal{E}_r^{\mathrm{VN}}(t) = \frac{c^4}{4\pi}\int_{|x|=r}\bar x\cdot(\partial_t\phi\,\nabla\phi)(t,x)\,d\sigma(x) = -\frac{1}{4\pi c^5}\int_{|\omega|=1}(\partial_t R(\omega,u))^2\,d\sigma(\omega) + O(c^{-6} + r^{-1}) \tag{1.17}
\]
for $(t,r,c)\in\mathcal{M}_{\mathrm{VN}}$. Here $\mathcal{E}_r^{\mathrm{VN}}(t) = \int_{|x|\le r}e_{\mathrm{VN}}(t,x)\,dx$, see (1.4), and
\[
R(\bar x,u) = -\frac{1}{4\pi}\int|\bar x\cdot\nabla\phi_2(u,y)|^2\,dy - \iint(\bar x\cdot p)^2 f_0(u,y,p)\,dp\,dy + 4\,\mathcal{E}_{\mathrm{kin}}(u), \tag{1.18}
\]
where
\[
\mathcal{E}_{\mathrm{kin}}(t) = \frac12\iint p^2 f_0(t,x,p)\,dx\,dp \tag{1.19}
\]
denotes the kinetic energy associated to the Vlasov-Poisson system (VPgr). Defining $\mathcal{E}_{\mathrm{pot}}(t) = -\frac{1}{8\pi}\int|\nabla\phi_2(t,x)|^2\,dx$, the total energy $\mathcal{E}(t) = \mathcal{E}_{\mathrm{kin}}(t) + \mathcal{E}_{\mathrm{pot}}(t)$ is conserved along solutions of (VPgr).

Remark 1.10.
(a) Once again the condition $r\ge c^6$ in $\mathcal{M}_{\mathrm{VN}}$ is not needed for the proof of (1.16). It only has to be included in order that the second error term $O(c^{-4}r^{-3})$ is at least as good as the first one, which is $O(c^{-10}r^{-2})$.
(b) In the sense of Remark 1.5 (b) and (c), one could allow for $|u|\le u_0$ and/or $c$-dependent initial data.

For spherically symmetric solutions, Theorem 1.9 simplifies as follows.

Corollary 1.11 (Radiation for spherically symmetric solutions to (retVNc)). Define $r_* = \max\{2(R_0+P_1),\,R_2\}$ and
\[
\mathcal{M}_{\mathrm{VN}} = \{(t,r,c) : r\ge 2r_*,\ c\ge 2P_1,\ |t - c^{-1}r|\le 1,\ r\ge c^6\}.
\]
If $(t,r,c)\in\mathcal{M}_{\mathrm{VN}}$, then with $r = |x|$ and $u = t - c^{-1}r$,
\[
\Big|(\partial_t\phi\,\partial_r\phi)(t,x) + \frac{64}{9}\,c^{-9}r^{-2}\,(\partial_t\mathcal{E}_{\mathrm{kin}}(u))^2\Big|\le A\,(c^{-10}r^{-2} + c^{-4}r^{-3}),
\]
for a constant $A>0$ depending only on $R_0$, $P_0$, $S_0$, $P_1$, $M_1$. In particular,
\[
\frac{d}{dt}\mathcal{E}_r^{\mathrm{VN}}(t) = \frac{c^4}{4\pi}\int_{|x|=r}(\partial_t\phi\,\partial_r\phi)(t,x)\,d\sigma(x) = -\frac{64}{9c^5}\,(\partial_t\mathcal{E}_{\mathrm{kin}}(u))^2 + O(c^{-6} + r^{-1})
\]
for $(t,r,c)\in\mathcal{M}_{\mathrm{VN}}$.

The proofs of Theorem 1.9 and Corollary 1.11 are carried out in Sect. 2.2.
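The factor 64/9 in Corollary 1.11 can be traced back to (1.18) by a short computation. The following is only a sketch of that bookkeeping, not the proof of Sect. 2.2. It assumes that $\phi_2$ and $f_0$ are spherically symmetric, so that the two $\bar x$-dependent integrals in (1.18) do not depend on the direction $\bar x$ and equal one third of the corresponding isotropic integrals, and it uses $\partial_t\mathcal{E}_{\mathrm{pot}} = -\partial_t\mathcal{E}_{\mathrm{kin}}$.

```latex
\begin{align*}
-\frac{1}{4\pi}\int |\bar x\cdot\nabla\phi_2(u,y)|^2\,dy
  &= -\frac{1}{12\pi}\int |\nabla\phi_2(u,y)|^2\,dy
   = \frac{2}{3}\,\mathcal{E}_{\mathrm{pot}}(u),\\
\iint (\bar x\cdot p)^2 f_0(u,y,p)\,dp\,dy
  &= \frac{1}{3}\iint p^2 f_0(u,y,p)\,dp\,dy
   = \frac{2}{3}\,\mathcal{E}_{\mathrm{kin}}(u),\\
R(\bar x,u)
  &= \frac{2}{3}\,\mathcal{E}_{\mathrm{pot}}(u)
     - \frac{2}{3}\,\mathcal{E}_{\mathrm{kin}}(u)
     + 4\,\mathcal{E}_{\mathrm{kin}}(u),\\
\partial_t R(\bar x,u)
  &= \Big(-\frac{2}{3} - \frac{2}{3} + 4\Big)\,\partial_t\mathcal{E}_{\mathrm{kin}}(u)
   = \frac{8}{3}\,\partial_t\mathcal{E}_{\mathrm{kin}}(u),\\
\frac{1}{4\pi c^5}\int_{|\omega|=1}(\partial_t R(\omega,u))^2\,d\sigma(\omega)
  &= \frac{64}{9c^5}\,(\partial_t\mathcal{E}_{\mathrm{kin}}(u))^2 .
\end{align*}
```

The last line uses that the integrand is constant on the unit sphere, whose area is $4\pi$.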


2. Proofs

2.1. Proof of Theorem 1.4.

To expand $E(t,x)$ and $B(t,x)$ as given by (retRVMc), we recall from Assumption 1.1 (b) that $f^\pm(t,x,p) = 0$ for $|x|\ge R_0 + P_1|t|$. It follows that $\rho(t,x) = 0$ and $j(t,x) = 0$ for $|x|\ge R_0 + P_1|t|$. If $(t,|x|,c)\in\mathcal{M}_{\mathrm{RVM}}$, then $|u| = |t - c^{-1}|x||\le 1$ and $c\ge 2P_1$. Thus if $|y|\ge 2(R_0+P_1)$, then
\[
R_0 + P_1\,|t - c^{-1}|y-x|| = R_0 + P_1\,|u + c^{-1}|x| - c^{-1}|y-x||\le R_0 + P_1(|u| + c^{-1}|y|)\le R_0 + P_1\big(1 + (2P_1)^{-1}|y|\big)\le|y|.
\]
Hence $F(t - c^{-1}|y-x|,\,y) = 0$ for both $F = -(\nabla\rho + c^{-2}\partial_t j)$ and $F = c^{-1}\nabla\times j$. Thus for the $y$-integrals defining $E$ and $B$ in (retRVMc), it is sufficient to extend these over the ball $|y|\le\max\{2(R_0+P_1),\,R_2\} = r_*$. In what follows $g = O(c^{-k}r^{-l})$ denotes a function such that $|g(t,x)|\le A\,c^{-k}r^{-l}$ for all $|x| = r\ge 2r_*$, $c\ge 2P_1$, and $|t - c^{-1}|x||\le 1$, with $A$ only depending on the basic constants. The following lemma states a representation for $E$ and $B$ similar to the Friedlander radiation field; see [15, p. 91/92] and [8].

Lemma 2.1. The fields can be written as
\[
E(t,x) = E^{\mathrm{rad}}(t,x) + O(r^{-2})\quad\text{and}\quad B(t,x) = B^{\mathrm{rad}}(t,x) + O(c^{-1}r^{-2}),
\]
where
\[
E^{\mathrm{rad}}(t,x) = -r^{-1}\int_{|y|\le r_*}(\nabla\rho + c^{-2}\partial_t j)(u + c^{-1}\bar x\cdot y,\,y)\,dy, \tag{2.1}
\]
\[
B^{\mathrm{rad}}(t,x) = c^{-1}r^{-1}\int_{|y|\le r_*}\nabla\times j\,(u + c^{-1}\bar x\cdot y,\,y)\,dy. \tag{2.2}
\]

Proof. Consider $E$ first, and let $F = -(\nabla\rho + c^{-2}\partial_t j)$. According to Assumption 1.1 (c), we have $|F(\ldots)|\le A\,M_1(1+r_*,\,r_*,\,p_*) = O(1)$ for some constant $A>0$, where $p_* = \max\{P_1,P_2\}$ and $(\ldots) = (t - c^{-1}|y-x|,\,y) = (u + c^{-1}|x| - c^{-1}|y-x|,\,y)$. If $|x| = r\ge 2r_*$ and $|y|\le r_*$, then
\[
\frac{|x|}{|y-x|}\le\frac{|x|}{|x|-r_*}\le 2.
\]
It follows that
\[
\frac{1}{|y-x|} = \frac{1}{|x|} + \frac{|x| - |y-x|}{|y-x|\,|x|} = r^{-1} + O(r^{-2})
\]
for all $|y|\le r_*$. Therefore by (retRVMc),
\[
E(t,x) = \int F(\ldots)\,\frac{dy}{|y-x|} = \int_{|y|\le r_*}F(\ldots)\,\frac{dy}{|y-x|} = \int_{|y|\le r_*}F(\ldots)\big(r^{-1} + O(r^{-2})\big)\,dy = r^{-1}\int_{|y|\le r_*}F(\ldots)\,dy + O(r^{-2}).
\]


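Next we note that for $|y|\le r_*$ and $|x| = r\ge 2r_*$,
\[
\begin{aligned}
|x| - |x-y| &= |x| - |x|\sqrt{1 - 2\,\bar x\cdot y/|x| + |y|^2/|x|^2}\\
&= |x| - |x|\Big(1 + \frac12\big(-2\,\bar x\cdot y/|x| + |y|^2/|x|^2\big) + O(r^{-2})\Big)\\
&= \bar x\cdot y + O(r^{-1}).
\end{aligned}\tag{2.3}
\]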
Since |F(. . .) − F(u + c−1 x¯ · y, y)| ≤ ∂t F L ∞ c−1 ||x| − |y − x| − x¯ · y| = O(c−1r −1 ) by Assumption 1.1 (c) and (2.3), we get E = E rad + O(r −2 ). The proof for the magnetic field is analogous, using F = c−1 ∇ × j. Now we need to investigate the relation between E rad and B rad . For this, we recall the continuity equation ∂t ρ + ∇ · j = 0 and calculate ¯ ∇ρ(∗) = ∇ y [ρ(∗)] + c−1 x¯ ∇ y · [ j (∗)] − c−2 (x¯ · ∂t j (∗)) x, ∇ × j (∗) = ∇ y × [ j (∗)] − c−1 x¯ × ∂t j (∗), where (∗) = (u + c−1 x¯ · y, y)

is the argument. This follows just from evaluating the total derivatives. Since |y|≤r∗ dy = dy in (2.1) and (2.2) by Assumption 1.1 (b), integration by parts shows that all ∇ y -terms drop out. Consequently, due to u = t − c−1 r the relations E rad (t, x) = −r −1 [∇ρ(∗)+c−2 ∂t j (∗)] dy |y|≤r∗ −c−2 (x¯ · ∂t j (∗)) x¯ +c−2 ∂t j (∗) dy = −r −1 |y|≤r∗ j (u + c−1 x¯ · y, y)− x¯ · j (u +c−1 x¯ · y, y) x¯ dy, (2.4) = −c−2 r −1 ∂t |y|≤r∗ B rad (t, x) = c−1 r −1 ∇ × j (∗) dy |y|≤r∗ = −c−2 r −1 ∂t x¯ × j (u + c−1 x¯ · y, y)dy |y|≤r∗

are obtained. Note that in particular E rad and B rad are of the same order in c−1 and r −1 , i.e., E rad (t, x) = B rad (t, x) = O(c−2 r −1 ) by Assumption 1.1 (c). Observing x¯ × ¯ dy = [ j (∗) − (x¯ · j (∗))x] |y|≤r∗

|y|≤r∗

x¯ × j (∗) dy,

(2.5)


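differentiation with respect to $t$ yields the important formula
\[
\bar x\times E^{\mathrm{rad}}(t,x) = B^{\mathrm{rad}}(t,x). \tag{2.6}
\]
Also
\[
\bar x\cdot\int_{|y|\le r_*}\big[j(*) - (\bar x\cdot j(*))\,\bar x\big]\,dy = 0,
\]
so that
\[
\bar x\cdot E^{\mathrm{rad}}(t,x) = \bar x\cdot B^{\mathrm{rad}}(t,x) = 0. \tag{2.7}
\]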

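Collecting the results from Lemma 2.1 and (2.5), it follows that
\[
\begin{aligned}
\bar x\cdot(B\times E) &= \bar x\cdot\big([B^{\mathrm{rad}} + O(c^{-1}r^{-2})]\times[E^{\mathrm{rad}} + O(r^{-2})]\big)\\
&= \bar x\cdot(B^{\mathrm{rad}}\times E^{\mathrm{rad}}) + O(c^{-2}r^{-3}) + O(c^{-3}r^{-3}) + O(c^{-1}r^{-4})\\
&= -|\bar x\times E^{\mathrm{rad}}|^2 + O(c^{-2}r^{-3}) + O(c^{-1}r^{-4}),
\end{aligned}\tag{2.8}
\]
since by (2.6) and (2.7),
\[
\bar x\cdot(B^{\mathrm{rad}}\times E^{\mathrm{rad}}) = \bar x\cdot\big([\bar x\times E^{\mathrm{rad}}]\times E^{\mathrm{rad}}\big) = \bar x\cdot\big((\bar x\cdot E^{\mathrm{rad}})\,E^{\mathrm{rad}} - |E^{\mathrm{rad}}|^2\,\bar x\big) = -|E^{\mathrm{rad}}|^2 = -|\bar x\times E^{\mathrm{rad}}|^2.
\]
Equations (2.8) and (2.4) imply
\[
\bar x\cdot(B\times E) = -c^{-4}r^{-2}\,\Big|\bar x\times\partial_t\int_{|y|\le r_*}j(u + c^{-1}\bar x\cdot y,\,y)\,dy\Big|^2 + O(c^{-2}r^{-3}) + O(c^{-1}r^{-4}). \tag{2.9}
\]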

To expand the square as $c\to\infty$, we note that $|p|\ge p_*\ge P_1$ implies $f^\pm(t,y,p) = 0$ for all $t\in\mathbb{R}$ and all $y\in\mathbb{R}^3$ by Assumption 1.1 (b). Therefore we can always replace the average over momentum space $\int dp$ by $\int_{|p|\le p_*}dp$. For $|p|\le p_*$,
\[
\nabla_p\hat p = \gamma\,\mathrm{id}_{\mathbb{R}^3} - c^{-2}\gamma^3\,p\otimes p = \mathrm{id}_{\mathbb{R}^3} + O(c^{-2})
\]
by (1.5). Furthermore, using Assumption 1.1,
\[
\nabla_x(f^+ - f^-)(*) = \nabla_y[(f^+ - f^-)(*)] - c^{-1}\bar x\,\partial_t(f^+ - f^-)(*) = \nabla_y[(f^+ - f^-)(*)] + \mathbf{1}_{\{|y|\le r_*,\,|p|\le p_*\}}\,O(c^{-1}).
\]


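Utilizing this, (retRVMc), and Proposition 1.3, we get, writing $(*,p) = (u + c^{-1}\bar x\cdot y,\,y,\,p)$,
\[
\begin{aligned}
\int_{|y|\le r_*}\partial_t j(*)\,dy &= \int_{|y|\le r_*}\int_{|p|\le p_*}\hat p\,\partial_t(f^+ - f^-)(*,p)\,dp\,dy\\
&= \int_{|y|\le r_*}\int_{|p|\le p_*}\hat p\,\Big[-\hat p\cdot\nabla_x(f^+ - f^-)(*,p) - (E + c^{-1}\hat p\times B)\cdot\nabla_p(f^+ + f^-)(*,p)\Big]\,dp\,dy\\
&= O(c^{-1}) + \int_{|y|\le r_*}\int_{|p|\le p_*}\nabla_p\hat p\,(E + c^{-1}\hat p\times B)\,(f^+ + f^-)(*,p)\,dp\,dy\\
&= O(c^{-1}) + \int_{|y|\le r_*}\int_{|p|\le p_*}\big(\mathrm{id}_{\mathbb{R}^3} + O(c^{-2})\big)\big(E_0 + O(c^{-2})\big)\big(f_0^+ + f_0^- + O(c^{-2})\big)(*,p)\,dp\,dy\\
&= \int_{|y|\le r_*}\int_{|p|\le p_*}E_0\,(f_0^+ + f_0^-)(*,p)\,dp\,dy + O(c^{-1}).
\end{aligned}\tag{2.10}
\]
Also
\[
\big|E_0(f_0^+ + f_0^-)(*,p) - E_0(f_0^+ + f_0^-)(u,y,p)\big|\le\big\|\partial_t\big[E_0(f_0^+ + f_0^-)\big]\big\|_{L^\infty}\,c^{-1}\,|\bar x\cdot y| = O(c^{-1})
\]
by Proposition 1.2 (c). Thus (2.9) and (2.10) yield
\[
\begin{aligned}
\bar x\cdot(B\times E) &= -c^{-4}r^{-2}\,\Big|\bar x\times\int_{|y|\le r_*}\int_{|p|\le p_*}E_0\,(f_0^+ + f_0^-)(u,y,p)\,dp\,dy + O(c^{-1})\Big|^2 + O(c^{-2}r^{-3}) + O(c^{-1}r^{-4})\\
&= -c^{-4}r^{-2}\,\Big|\bar x\times\iint E_0\,(f_0^+ + f_0^-)(u,y,p)\,dp\,dy\Big|^2 + O(c^{-5}r^{-2}) + O(c^{-2}r^{-3}) + O(c^{-1}r^{-4}),
\end{aligned}\tag{2.11}
\]
since by Proposition 1.2 (b), $f_0^\pm(u,y,p) = 0$ for $|y|\ge r_*\ge R_2$ or $|p|\ge p_*\ge P_2$. Defining the dipole moment $D(t) = \int x\,\rho_0(t,x)\,dx$ with $\rho_0$ from (VPpl), we obtain by the Vlasov equation in (VPpl) that
\[
\partial_t D = \iint x\,\partial_t(f_0^+ - f_0^-)\,dp\,dx = -\iint x\,\Big[p\cdot\nabla_x(f_0^+ - f_0^-) + E_0\cdot\nabla_p(f_0^+ + f_0^-)\Big]\,dp\,dx = \iint p\,(f_0^+ - f_0^-)\,dp\,dx
\]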


and
\[
\partial_t^2 D = \iint p\,\partial_t(f_0^+ - f_0^-)\,dp\,dx = -\iint p\,\Big[p\cdot\nabla_x(f_0^+ - f_0^-) + E_0\cdot\nabla_p(f_0^+ + f_0^-)\Big]\,dp\,dx = \iint E_0\,(f_0^+ + f_0^-)\,dp\,dx. \tag{2.12}
\]

Due to (2.11) it follows that
\[
\bar x\cdot(B\times E) = -c^{-4}r^{-2}\,|\bar x\times\partial_t^2 D(u)|^2 + O(c^{-5}r^{-2}) + O(c^{-2}r^{-3}) + O(c^{-1}r^{-4}),
\]
which completes the proof of (1.12). Concerning (1.13), we have
\[
\frac{d}{dt}\mathcal{E}_r^{\mathrm{RVM}}(t) = \frac{c}{4\pi}\int_{|x|=r}\bar x\cdot(B\times E)(t,x)\,d\sigma(x) - c^2\int_{|x|=r}\int(\bar x\cdot p)\,(f^+ + f^-)(t,x,p)\,dp\,d\sigma(x)
\]
by (1.1) and (1.3). For $|x| = r$ and $(t,r,c)\in\mathcal{M}_{\mathrm{RVM}}$, we can estimate
\[
R_0 + P_1|t| = R_0 + P_1|u + c^{-1}r|\le R_0 + P_1 + \frac{r}{2}\le r = |x|,
\]
since $r\ge 2r_*\ge 2(R_0+P_1)$. Therefore $f^\pm(t,x,p) = 0$ by Assumption 1.1 (b), and this yields
\[
\frac{d}{dt}\mathcal{E}_r^{\mathrm{RVM}}(t) = \frac{c}{4\pi}\int_{|x|=r}\bar x\cdot(B\times E)(t,x)\,d\sigma(x),\qquad(t,r,c)\in\mathcal{M}_{\mathrm{RVM}}.
\]
Hence for (1.13) it suffices to use (1.12) and to note that
\[
-\frac{c^{-3}r^{-2}}{4\pi}\int_{|x|=r}|\bar x\times\partial_t^2 D(u)|^2\,d\sigma(x) = -\frac{1}{4\pi c^3}\int_{|\omega|=1}|\omega\times\partial_t^2 D(u)|^2\,d\sigma(\omega) = -\frac{2}{3c^3}\,|\partial_t^2 D(u)|^2
\]
by integration. □
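The last equality rests on the spherical average $\frac{1}{4\pi}\int_{|\omega|=1}|\omega\times a|^2\,d\sigma(\omega) = \frac23|a|^2$ for a fixed vector $a$ (here $a = \partial_t^2 D(u)$). The following sketch checks this with a simple midpoint rule in spherical coordinates; the function name, grid sizes, and the sample vector are illustrative assumptions.

```python
import math

def sphere_average_cross_sq(a, n_theta=100, n_phi=200):
    """Approximate (1/(4 pi)) * integral over the unit sphere of |omega x a|^2.

    Uses the midpoint rule in the polar angle theta and azimuth phi, with
    surface element sin(theta) d(theta) d(phi). Hypothetical helper.
    """
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            w = (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)
            cx = (
                w[1] * a[2] - w[2] * a[1],
                w[2] * a[0] - w[0] * a[2],
                w[0] * a[1] - w[1] * a[0],
            )
            total += (cx[0] ** 2 + cx[1] ** 2 + cx[2] ** 2) * sin_t
    return total * d_theta * d_phi / (4.0 * math.pi)

if __name__ == "__main__":
    a = (1.0, -2.0, 0.5)
    a2 = sum(x * x for x in a)
    print(sphere_average_cross_sq(a), 2.0 / 3.0 * a2)
```

The two printed numbers should agree to within the quadrature error, which is why the prefactor $\frac{1}{4\pi c^3}$ turns into $\frac{2}{3c^3}$ in (1.13).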

Proof of Remark 1.5 (d). If $f_0^-(t=0) = f^{-,\circ} = 0$, then also $f_0^- = 0$ by (1.10). Thus defining
\[
\phi_0(t,x) = \iint|x-y|^{-1}f_0^+(t,y,p)\,dp\,dy = \int|x-y|^{-1}\rho_0(t,y)\,dy,
\]
we get $E_0 = -\nabla\phi_0$ and $\Delta\phi_0 = -4\pi\rho_0$. Consequently, by (2.12),
\[
\partial_t^2 D = \iint E_0\,f_0^+\,dp\,dx = \int E_0\,\rho_0\,dx = \frac{1}{4\pi}\int\nabla\phi_0\,\Delta\phi_0\,dx = 0. \tag{2.13}
\]
Hence there is no dipole radiation in this case.
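The vanishing of the last integral in (2.13) is stated without detail. One standard way to see it, given here as a sketch under the assumption that $|\nabla\phi_0(t,x)| = O(|x|^{-2})$ as $|x|\to\infty$ (which holds for the Newtonian potential of a compactly supported $\rho_0$), is to write the integrand as a divergence and apply the divergence theorem on a ball of radius $\varrho$, a symbol introduced only for this illustration.

```latex
\begin{align*}
\partial_i\phi_0\,\Delta\phi_0
  &= \sum_{j=1}^{3}\partial_j\big(\partial_i\phi_0\,\partial_j\phi_0\big)
     - \frac12\,\partial_i|\nabla\phi_0|^2,\\
\int_{|x|\le\varrho}\partial_i\phi_0\,\Delta\phi_0\,dx
  &= \int_{|x|=\varrho}\Big(\partial_i\phi_0\,(\bar x\cdot\nabla\phi_0)
     - \frac12\,\bar x_i\,|\nabla\phi_0|^2\Big)\,d\sigma(x)
   = O(\varrho^{-2}) \longrightarrow 0
   \quad (\varrho\to\infty).
\end{align*}
```

The boundary integrand is $O(\varrho^{-4})$ while the sphere has area $4\pi\varrho^2$, which gives the stated decay.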


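2.2. Proof of Theorem 1.9.

By (retVNc), $\partial_t\phi$ and $\nabla\phi$ are given by
\[
\partial_t\phi(t,x) = -c^{-2}\int\partial_t\mu(t - c^{-1}|y-x|,\,y)\,\frac{dy}{|y-x|},\qquad
\nabla\phi(t,x) = -c^{-2}\int\nabla\mu(t - c^{-1}|y-x|,\,y)\,\frac{dy}{|y-x|}.
\]
In full analogy to Lemma 2.1, we obtain the following representation.

Lemma 2.2. We can write
\[
\partial_t\phi(t,x) = (\partial_t\phi)^{\mathrm{rad}}(t,x) + O(c^{-2}r^{-2})\quad\text{and}\quad\nabla\phi(t,x) = (\nabla\phi)^{\mathrm{rad}}(t,x) + O(c^{-2}r^{-2}),
\]
where
\[
(\partial_t\phi)^{\mathrm{rad}}(t,x) = -c^{-2}r^{-1}\int_{|y|\le r_*}\partial_t\mu(u + c^{-1}\bar x\cdot y,\,y)\,dy, \tag{2.14}
\]
\[
(\nabla\phi)^{\mathrm{rad}}(t,x) = -c^{-2}r^{-1}\int_{|y|\le r_*}\nabla\mu(u + c^{-1}\bar x\cdot y,\,y)\,dy. \tag{2.15}
\]

Let again $(*) = (u + c^{-1}\bar x\cdot y,\,y)$ denote the argument. Then $\nabla_y[\mu(*)] = c^{-1}\bar x\,\partial_t\mu(*) + \nabla\mu(*)$. Since $\int_{|y|\le r_*}dy = \int dy$ in (2.15), it follows that
\[
\bar x\cdot(\nabla\phi)^{\mathrm{rad}}(t,x) = -c^{-2}r^{-1}\int\bar x\cdot\nabla\mu(*)\,dy = c^{-3}r^{-1}\int\bar x\cdot\bar x\,\partial_t\mu(*)\,dy = -c^{-1}(\partial_t\phi)^{\mathrm{rad}}(t,x).
\]
The same argument shows that $(\nabla\phi)^{\mathrm{rad}} = O(c^{-3}r^{-1})$, and also $(\partial_t\phi)^{\mathrm{rad}} = O(c^{-2}r^{-1})$ by (2.14). Hence we find from Lemma 2.2 that
\[
\bar x\cdot(\partial_t\phi\,\nabla\phi)(t,x) = -c^{-5}r^{-2}\Big(\int\partial_t\mu(*)\,dy\Big)^2 + O(c^{-4}r^{-3}). \tag{2.16}
\]
In order to expand the square we use, following [14], the differential operators
\[
T = c^{-1}\bar x\,\partial_t + \nabla\quad\text{and}\quad S = \partial_t + \hat p\cdot\nabla.
\]
Then $\partial_t = (1 - c^{-1}\hat p\cdot\bar x)^{-1}(S - \hat p\cdot T)$, and $\nabla_y[\mu(*)] = T\mu(*)$ is a total derivative. Hence the corresponding term drops out upon integration with respect to $y$. Observing the relation
\[
\nabla_p\cdot\big[(S\phi)\,p + c^2\gamma\,\nabla\phi\big] = 3(S\phi),
\]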


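the Vlasov equation in (retVNc) yields
\[
\begin{aligned}
\int\partial_t\mu(*)\,dy &= \iint\gamma\,\partial_t f(*,p)\,dp\,dy\\
&= \iint\gamma\,(1 - c^{-1}\hat p\cdot\bar x)^{-1}\Big[\big((S\phi)\,p + c^2\gamma\,\nabla\phi\big)\cdot\nabla_p f + 4(S\phi)f\Big](*,p)\,dp\,dy\\
&= -\iint\nabla_p\big[\gamma\,(1 - c^{-1}\hat p\cdot\bar x)^{-1}\big]\cdot\big[(S\phi)\,p + c^2\gamma\,\nabla\phi\big]\,f(*,p)\,dp\,dy\\
&\quad+ \iint\gamma\,(1 - c^{-1}\hat p\cdot\bar x)^{-1}(S\phi)\,f(*,p)\,dp\,dy,
\end{aligned}\tag{2.17}
\]
where $(*,p) = (u + c^{-1}\bar x\cdot y,\,y,\,p)$. A direct calculation shows that
\[
\nabla_p\big[\gamma\,(1 - c^{-1}\hat p\cdot\bar x)^{-1}\big] = \nabla_p\Big[\big(\sqrt{1 + p^2/c^2} - c^{-1}p\cdot\bar x\big)^{-1}\Big] = \gamma^2(1 - c^{-1}\hat p\cdot\bar x)^{-2}\,(c^{-1}\bar x - c^{-2}\hat p).
\]
If $|p|\ge p_*\ge P_1$, then $f(*,p) = 0$ by Assumption 1.6 (b). Furthermore, if $|u|\le 1$ and $|x|\ge 2r_*$, then $|y|\ge r_*$ enforces $f(*,p) = 0$ as before. Therefore we can replace $\iint dp\,dy$ by $\int_{|y|\le r_*}\int_{|p|\le p_*}dp\,dy$ in the integrals occurring in (2.17). In other words, we may always assume that both $|y|$ and $|p|$ are bounded, with a bound depending only on the basic constants. Accordingly,
\[
\gamma = 1 + O(c^{-2}),\qquad\gamma^2 = 1 + O(c^{-2}),\qquad\hat p = \gamma p = p + O(c^{-2}), \tag{2.18}
\]
\[
(1 - c^{-1}\hat p\cdot\bar x)^{-1} = 1 + O(c^{-1}),\qquad(1 - c^{-1}\hat p\cdot\bar x)^{-2} = 1 + 2c^{-1}p\cdot\bar x + O(c^{-2}). \tag{2.19}
\]
This results in
\[
\nabla_p\big[\gamma\,(1 - c^{-1}\hat p\cdot\bar x)^{-1}\big] = c^{-1}\bar x + c^{-2}\big(2(p\cdot\bar x)\,\bar x - p\big) + O(c^{-3}). \tag{2.20}
\]
Furthermore, since $|u + c^{-1}\bar x\cdot y|\le 1 + r_*$, also
\[
\begin{aligned}
f(*,p) &= f_0(*,p) + O(c^{-2}),\\
(S\phi)(*,p) &= c^{-2}(\tilde S\phi_2)(*,p) + O(c^{-4}) = O(c^{-2}),\\
\nabla\phi(*) &= c^{-2}\nabla\phi_2(*) + O(c^{-4}),
\end{aligned}
\]
by Proposition 1.8 and (2.18), where $\tilde S\phi_2 = \partial_t\phi_2 + p\cdot\nabla\phi_2$. Observe that here the constants $M_3(1+r_*,\,r_*,\,p_*)$ and $M_4(1+r_*,\,r_*)$ enter the bounds on $O(c^{-2})$ and $O(c^{-4})$.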


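Hence from (2.17), (2.18), (2.19), and (2.20) we get
$$\begin{aligned}
\int \partial_t\mu(\ast)\,dy &= -\int_{|y|\le r_\ast}\int_{|p|\le p_\ast}\big[c^{-1}\bar{x} + c^{-2}(2(p\cdot\bar{x})\bar{x} - p) + O(c^{-3})\big]\cdot\big[O(c^{-2}) + (1 + O(c^{-2}))(\nabla\phi_2 + O(c^{-2}))\big]\,\big(f_0(\ast, p) + O(c^{-2})\big)\,dp\,dy \\
&\quad + \int_{|y|\le r_\ast}\int_{|p|\le p_\ast}(1 + O(c^{-2}))(1 + O(c^{-1}))\big(c^{-2}(\tilde{S}\phi_2) + O(c^{-4})\big)\big(f_0(\ast, p) + O(c^{-2})\big)\,dp\,dy \\
&= -c^{-1}\bar{x}\cdot\int_{|y|\le r_\ast}\int_{|p|\le p_\ast}(1 + 2c^{-1}p\cdot\bar{x})\,(\nabla\phi_2\, f_0)(\ast, p)\,dp\,dy \\
&\quad + c^{-2}\int_{|y|\le r_\ast}\int_{|p|\le p_\ast}(\tilde{S}\phi_2 + p\cdot\nabla\phi_2)\, f_0(\ast, p)\,dp\,dy + O(c^{-3}).
\end{aligned} \tag{2.21}$$
Let $\psi$ denote either $\nabla\phi_2$ or $\partial_t\phi_2$. Then by Proposition 1.7 (c),
$$(\psi f_0)(\ast, p) = (\psi f_0)(u + c^{-1}\bar{x}\cdot y, y, p) = (\psi f_0)(u, y, p) + c^{-1}(\bar{x}\cdot y)\,\partial_t(\psi f_0)(u, y, p) + O(c^{-2}) = (\psi f_0)(u, y, p) + O(c^{-1}).$$
Hence (2.21) yields
$$\begin{aligned}
\int \partial_t\mu(\ast)\,dy &= -c^{-1}\bar{x}\cdot\int_{|y|\le r_\ast}\int_{|p|\le p_\ast}\Big[(\nabla\phi_2\, f_0)(u, y, p) + c^{-1}(\bar{x}\cdot y)\,\partial_t(\nabla\phi_2\, f_0)(u, y, p) + O(c^{-2}) + 2c^{-1}(\bar{x}\cdot p)\,\nabla\phi_2\, f_0(u, y, p)\Big]\,dp\,dy \\
&\quad + c^{-2}\int_{|y|\le r_\ast}\int_{|p|\le p_\ast}(\tilde{S}\phi_2 + p\cdot\nabla\phi_2)\, f_0(u, y, p)\,dp\,dy + O(c^{-3}) \\
&= O(c^{-3}) - c^{-1}\int_{|y|\le r_\ast}(\bar{x}\cdot\nabla\phi_2)\,\rho_0(u, y)\,dy - c^{-2}\,\partial_t\int_{|y|\le r_\ast}(\bar{x}\cdot y)(\bar{x}\cdot\nabla\phi_2)\,\rho_0(u, y)\,dy \\
&\quad + c^{-2}\int_{|y|\le r_\ast}\int_{|p|\le p_\ast}\big[\tilde{S}\phi_2 + p\cdot\nabla\phi_2 - 2(\bar{x}\cdot p)(\bar{x}\cdot\nabla\phi_2)\big]\, f_0(u, y, p)\,dp\,dy,
\end{aligned} \tag{2.22}$$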

recalling that $u = t - c^{-1}r$. In view of Proposition 1.7 (b) we may extend all integrals over the whole space again. Now
$$\int\!\!\int \nabla\phi_2(u, y)\, f_0(u, y, p)\,dp\,dy = \int \nabla\phi_2\,\rho_0(u, y)\,dy = 0$$
by Lemma 2.3 below, whence the lowest order term drops out. In addition, Lemma 2.3 also shows that
$$\int\!\!\int (\tilde{S}\phi_2)(u, y)\, f_0(u, y, p)\,dp\,dy = -2\,\partial_t E_{\mathrm{kin}}(u)$$
as well as
$$\int\!\!\int (p\cdot\nabla\phi_2)(u, y)\, f_0(u, y, p)\,dp\,dy = -\partial_t E_{\mathrm{kin}}(u)$$
and
$$\int\!\!\int (\bar{x}\cdot p)(\bar{x}\cdot\nabla\phi_2)(u, y)\, f_0(u, y, p)\,dp\,dy = -\frac{1}{2}\,\partial_t\int\!\!\int (\bar{x}\cdot p)^2 f_0(u, y, p)\,dp\,dy.$$
Finally, we can also write
$$\int (\bar{x}\cdot y)(\bar{x}\cdot\nabla\phi_2)(u, y)\,\rho_0(u, y)\,dy = -E_{\mathrm{pot}}(u) - \frac{1}{4\pi}\int |\bar{x}\cdot\nabla\phi_2(u, y)|^2\,dy$$
by Lemma 2.3 and since $|\bar{x}| = 1$. Using this and $\partial_t E_{\mathrm{pot}} = -\partial_t E_{\mathrm{kin}}$ (see the remarks following (1.19)) in (2.22), and collecting all the terms, it follows that
$$\int \partial_t\mu(\ast)\,dy = -c^{-2}\,\partial_t R(\bar{x}, u) + O(c^{-3}), \tag{2.23}$$
with $R(\bar{x}, u)$ as in (1.18). Inserting (2.23) into (2.16), we see that

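$$\bar{x}\cdot(\partial_t\phi\,\nabla\phi)(t, x) = -c^{-5}r^{-2}\big[-c^{-2}\,\partial_t R(\bar{x}, u) + O(c^{-3})\big]^2 + O(c^{-4}r^{-3}) = -c^{-9}r^{-2}\big(\partial_t R(\bar{x}, u)\big)^2 + O(c^{-10}r^{-2}) + O(c^{-4}r^{-3}).$$
Therefore (1.16) is proved. Concerning (1.17), the fact that
$$\frac{d}{dt}E_r^{\mathrm{VN}}(t) = \frac{c^4}{4\pi}\int_{|x|=r}\bar{x}\cdot(\partial_t\phi\,\nabla\phi)(t, x)\,d\sigma(x)$$
is due to $(t, r, c)\in M_{\mathrm{VN}}$, analogously to the argument in the proof of Theorem 1.4. Hence (1.17) is a direct consequence of (1.16), changing variables as $x = r\omega$. We still need to give the proof of Lemma 2.3.

Lemma 2.3. For the Vlasov-Poisson system (VPgr),
$$\int\!\!\int \nabla\phi_2(t, x)\, f_0(t, x, p)\,dp\,dx = 0,$$
$$\int\!\!\int (\tilde{S}\phi_2)(t, x)\, f_0(t, x, p)\,dp\,dx = -2\,\partial_t E_{\mathrm{kin}}(t),$$
$$\int\!\!\int p\cdot\nabla\phi_2(t, x)\, f_0(t, x, p)\,dp\,dx = -\partial_t E_{\mathrm{kin}}(t),$$
$$\int\!\!\int (\xi\cdot p)(\xi\cdot\nabla\phi_2)(t, x)\, f_0(t, x, p)\,dp\,dx = -\frac{1}{2}\,\partial_t\int\!\!\int (\xi\cdot p)^2 f_0(t, x, p)\,dp\,dx \quad (\xi\in\mathbb{R}^3),$$
$$\int\!\!\int (\xi\cdot x)(\xi\cdot\nabla\phi_2)(t, x)\, f_0(t, x, p)\,dp\,dx = -|\xi|^2 E_{\mathrm{pot}}(t) - \frac{1}{4\pi}\int |\xi\cdot\nabla\phi_2(t, x)|^2\,dx \quad (\xi\in\mathbb{R}^3),$$
where $E_{\mathrm{kin}}(t) = \frac{1}{2}\int\!\!\int p^2 f_0(t, x, p)\,dp\,dx$, see (1.19), and $E_{\mathrm{pot}}(t) = -\frac{1}{8\pi}\int |\nabla\phi_2(t, x)|^2\,dx$.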


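Proof. Firstly, since $\Delta\phi_2 = 4\pi\rho_0$,
$$\int\!\!\int \nabla\phi_2(t, x)\, f_0(t, x, p)\,dp\,dx = \int \nabla\phi_2(t, x)\,\rho_0(t, x)\,dx = \frac{1}{4\pi}\int \nabla\phi_2(t, x)\,\Delta\phi_2(t, x)\,dx = -\frac{1}{8\pi}\int \nabla|\nabla\phi_2(t, x)|^2\,dx = 0.$$
For the remaining assertions we define the mass current density as $j_0 = \int p\, f_0\,dp$. Integration of the Vlasov equation with respect to $p$ implies the continuity equation $\partial_t\rho_0 + \nabla\cdot j_0 = 0$. Hence
$$\int\!\!\int \tilde{S}\phi_2\, f_0\,dp\,dx = \int (\partial_t\phi_2\,\rho_0 + \nabla\phi_2\cdot j_0)\,dx = \int (\partial_t\phi_2\,\rho_0 - \phi_2\,\nabla\cdot j_0)\,dx = \partial_t\int \phi_2\,\rho_0\,dx = 2\,\partial_t E_{\mathrm{pot}}(t) = -2\,\partial_t E_{\mathrm{kin}}(t)$$
by conservation of energy, and
$$\int\!\!\int p\cdot\nabla\phi_2\, f_0\,dp\,dx = \int j_0\cdot\nabla\phi_2\,dx = \int \partial_t\rho_0\,\phi_2\,dx = -\int\!\!\int \frac{\partial_t\rho_0(t, x)\,\rho_0(t, y)}{|x - y|}\,dx\,dy = \partial_t E_{\mathrm{pot}}(t) = -\partial_t E_{\mathrm{kin}}(t).$$
Furthermore, by (VPgr),
$$\partial_t\int\!\!\int (\xi\cdot p)^2 f_0\,dp\,dx = \int\!\!\int (\xi\cdot p)^2\,\nabla_p\cdot(\nabla\phi_2\, f_0)\,dp\,dx = -2\int\!\!\int (\xi\cdot p)(\xi\cdot\nabla\phi_2)\, f_0\,dp\,dx.$$
For the last assertion, using $\Delta\phi_2 = 4\pi\rho_0$,
$$\begin{aligned}
\int (\xi\cdot x)(\xi\cdot\nabla\phi_2)\,\rho_0\,dx &= \frac{1}{4\pi}\sum_{i,j=1}^{3}\int (\xi\cdot x)\,\xi_i\,\partial_i\phi_2\,\partial_j\partial_j\phi_2\,dx \\
&= -\frac{1}{4\pi}\sum_{i,j=1}^{3}\int \big[(\xi\cdot x)\,\xi_i\,\partial_i\partial_j\phi_2 + \xi_j\,\xi_i\,\partial_i\phi_2\big]\,\partial_j\phi_2\,dx \\
&= -\frac{1}{8\pi}\int (\xi\cdot x)\,\xi\cdot\nabla|\nabla\phi_2|^2\,dx - \frac{1}{4\pi}\int |\xi\cdot\nabla\phi_2|^2\,dx \\
&= \frac{|\xi|^2}{8\pi}\int |\nabla\phi_2|^2\,dx - \frac{1}{4\pi}\int |\xi\cdot\nabla\phi_2|^2\,dx \\
&= -|\xi|^2 E_{\mathrm{pot}} - \frac{1}{4\pi}\int |\xi\cdot\nabla\phi_2|^2\,dx.
\end{aligned}$$
This completes the proof of the lemma.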


2.3. Proof of Corollary 1.11. In this section we verify Corollary 1.11 by specializing Theorem 1.9 to spherically symmetric functions. We recall that initial data $f^\circ$ are said to be spherically symmetric if $f^\circ(Ax, Ap) = f^\circ(x, p)$ for any matrix $A\in SO(3)$. Then the solution $(f_0, \phi_2)$ of (VPgr) provided by Proposition 1.7 remains spherically symmetric for all times. Therefore
$$f_0(t, Ax, Ap) = f_0(t, x, p), \quad \rho_0(t, Ax) = \rho_0(t, x), \quad\text{and}\quad \phi_2(t, x) = \phi_2(t, Ax) \tag{2.24}$$
holds for all $A\in SO(3)$. Firstly, this implies $\nabla\phi_2 = \bar{x}\,\partial_r\phi_2$ as well as $|\nabla\phi_2|^2 = |\partial_r\phi_2|^2$, $\partial_r$ denoting the radial derivative. By choosing $A\in SO(3)$ such that $A\bar{x} = e_j$ (the $j$-th unit vector in $\mathbb{R}^3$), (2.24) yields
$$\frac{1}{4\pi}\int |\bar{x}\cdot\nabla\phi_2(u, y)|^2\,dy = \frac{1}{4\pi}\int |\partial_j\phi_2(u, y)|^2\,dy = \frac{1}{4\pi}\int \Big|\frac{y_j}{|y|}\,\partial_r\phi_2(u, y)\Big|^2\,dy = \frac{1}{12\pi}\int |\partial_r\phi_2(u, y)|^2\,dy = -\frac{2}{3}\,E_{\mathrm{pot}}(u).$$
Similarly, (2.24) and $|\bar{x}|^2 = 1$ imply that
$$\int\!\!\int (\bar{x}\cdot p)^2 f_0(u, y, p)\,dp\,dy = \int\!\!\int p_j^2\, f_0(u, y, p)\,dp\,dy = \frac{1}{3}\int\!\!\int p^2 f_0(u, y, p)\,dp\,dy = \frac{2}{3}\,E_{\mathrm{kin}}(u).$$
Therefore by (1.18),
$$R(\bar{x}, u) = -\frac{1}{4\pi}\int |\bar{x}\cdot\nabla\phi_2(u, y)|^2\,dy - \int\!\!\int (\bar{x}\cdot p)^2 f_0(u, y, p)\,dp\,dy + 4\,E_{\mathrm{kin}}(u) = \frac{2}{3}\,E_{\mathrm{pot}}(u) + \frac{10}{3}\,E_{\mathrm{kin}}(u).$$
Since $\partial_t E_{\mathrm{pot}} = -\partial_t E_{\mathrm{kin}}$ by conservation of energy, we get $\partial_t R(\bar{x}, u) = \frac{8}{3}\,\partial_t E_{\mathrm{kin}}(u)$. Hence Corollary 1.11 follows from (1.16) and (1.17).
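The factor 1/3 in the two angular averages and the final coefficient 8/3 can be checked with a few lines of Python. This is a minimal sketch, not part of the proof: the function name is hypothetical, the midpoint rule is an arbitrary quadrature choice, and the only input is that the unit-sphere average of $(\bar{x}\cdot\omega)^2$ equals $\frac{1}{2}\int_0^\pi \cos^2\theta\,\sin\theta\,d\theta$.

```python
import math
from fractions import Fraction

def sphere_average_cos2(m=100000):
    """Average of (x_bar . omega)^2 over the unit sphere, taking x_bar = e_3.

    In polar coordinates this is (1/2) * int_0^pi cos(theta)^2 sin(theta) dtheta,
    evaluated here with the midpoint rule (an illustrative choice).
    """
    h = math.pi / m
    total = 0.0
    for k in range(m):
        theta = (k + 0.5) * h
        total += math.cos(theta) ** 2 * math.sin(theta)
    return 0.5 * total * h

# Coefficients of R in terms of (E_pot, E_kin), following the displayed computation:
#   -(1/4pi) int |x.grad phi_2|^2 dy = +(2/3) E_pot
#   -int int (x.p)^2 f_0 dp dy       = -(2/3) E_kin, and then 4 E_kin is added.
coeff_pot = Fraction(2, 3)
coeff_kin = Fraction(-2, 3) + 4
# d/dt E_pot = -d/dt E_kin, hence d/dt R = (coeff_kin - coeff_pot) * d/dt E_kin.
coeff_dt = coeff_kin - coeff_pot
```

Exact rational arithmetic is used for the bookkeeping so that the coefficients 10/3 and 8/3 are reproduced without rounding.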

References
1. Andréasson, H., Calogero, S., Rein, G.: Global classical solutions to the spherically symmetric Nordström-Vlasov system. Math. Proc. Camb. Phil. Soc. 138, 533–539 (2005)
2. Bauer, S.: Post-Newtonian approximation of the Vlasov-Nordström system. Comm. Partial Differ. Eqs. 30, 957–985 (2005)
3. Bauer, S.: The Vlasov-Nordström system as a geometric singular perturbation problem. Preprint, 2005
4. Bauer, S., Kunze, M.: The Darwin approximation of the relativistic Vlasov-Maxwell system. Ann. H. Poincaré 6, 283–308 (2005)
5. Bradaschia, C. (ed.): Proceedings of the 5th Edoardo Amaldi Conference on Gravitational Waves. In: Classical Quantum Gravity 21, S377–S1263 (2004)
6. Calogero, S.: Spherically symmetric steady states of galactic dynamics in scalar gravity. Classical Quantum Gravity 20, 1729–1742 (2003)
7. Calogero, S.: Global small solutions of the Vlasov-Maxwell system in the absence of incoming radiation. Indiana Univ. Math. J. 53, 1331–1363 (2004)
8. Calogero, S.: Outgoing radiation from an isolated collisionless plasma. Ann. Henri Poincaré 5, 189–201 (2004)
9. Calogero, S.: Global classical solutions to the 3D Nordström-Vlasov system. http://arxiv.org/list/mathph/0507030, 2005
10. Calogero, S., Lee, H.: The non-relativistic limit of the Nordström-Vlasov system. Comm. Math. Sci. 2, 19–34 (2004)
11. Calogero, S., Rein, G.: On classical solutions of the Nordström-Vlasov system. Comm. Partial Differ. Eqs. 28, 1863–1885 (2003)
12. Friedrich, S.: On global classical solutions to the Nordström-Vlasov system. PhD thesis, University of Bayreuth, in preparation
13. Glassey, R.T.: The Cauchy Problem in Kinetic Theory. Philadelphia: SIAM, 1996
14. Glassey, R.T., Strauss, W.: Singularity formation in a collisionless plasma could occur only at high velocities. Arch. Rat. Mech. Anal. 92, 59–90 (1986)
15. Hörmander, L.: Lectures on Nonlinear Hyperbolic Differential Equations. Berlin-New York: Springer, 1997
16. Lindner, A.: $C^k$-Regularität der Lösungen des Vlasov-Poisson-Systems partieller Differentialgleichungen. Diploma thesis, LMU München, 1991
17. Rendall, A.D.: On the definition of post-Newtonian approximations. Proc. Roy. Soc. London Ser. A 438, 341–360 (1992)
18. Rendall, A.D.: The Newtonian limit for asymptotically flat solutions of the Vlasov-Einstein system. Commun. Math. Phys. 163, 89–112 (1994)
19. Schaeffer, J.: The classical limit of the relativistic Vlasov-Maxwell system. Commun. Math. Phys. 104, 403–421 (1986)
20. Schaeffer, J.: Global existence of smooth solutions to the Vlasov-Poisson system in three dimensions. Comm. Partial Differ. Eqs. 16, 1313–1335 (1991)
21. Shapiro, S.L., Teukolsky, S.A.: Scalar gravitation: A laboratory for numerical relativity. Phys. Rev. D 47, 1529–1540 (1993)
22. Spohn, H.: Dynamics of Charged Particles and Their Radiation Field. Cambridge-New York: Cambridge University Press, 2004
23. Straumann, N.: General Relativity and Relativistic Astrophysics. Texts and Monographs in Physics, Berlin-New York: Springer, 1984

Communicated by H. Spohn

Commun. Math. Phys. 266, 289–329 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0054-9

Communications in

Mathematical Physics

Integration by Parts Formula for Regional Fractional Laplacian

Qing-Yang Guan

Institute of Applied Mathematics, Academy of Mathematics and System Science, Chinese Academy of Sciences, Beijing, China 100080. E-mail: [email protected]

Received: 24 March 2005 / Accepted: 11 February 2006
Published online: 22 June 2006 – © Springer-Verlag 2006

Abstract: We obtain the integration by parts formula for the regional fractional Laplacians, which are generators of symmetric α-stable processes on a subset of $\mathbb{R}^n$ ($0 < \alpha < 2$). In this formula, a local operator appears on the boundary, connected with the regional fractional Laplacian on the domain. Hence this formula can be understood as the Green formula for the regional fractional Laplacian. A similar integration by parts formula is also given for more general nonlocal operators. The reflected stable-like processes are studied.

1. Background and the Main Results

The main purpose of this paper is to establish the integration by parts formula for a class of nonlocal operators including the regional fractional Laplacian as the simplest case. Besides their own theoretical interest, these operators are also related to Lévy flights in physics. In recent years there have been many observations and experiments related to Lévy flights (super-diffusion), e.g., Richardson turbulent diffusion, collective slip diffusion on solid surfaces, quantum optics, bulk-surface exchange controlled dynamics in porous glass and even the flight of an albatross (see [13, 16, 17]). High variability, self-similarity and heavy-tailed distributions are some key characteristics of Lévy flights in these experiments. They are also the basic characteristics of a class of jumping Lévy processes, the symmetric α-stable processes ($0 < \alpha < 2$). In contrast to the continuous Brownian motion ($\alpha = 2$), symmetric α-stable processes have infinitely many jumps in any time interval. The large jumps of these processes make their variances infinite for $0 < \alpha < 2$ and their expectations infinite for $0 < \alpha \le 1$. When $\alpha = \frac{3}{2}$, the symmetric α-stable processes also appear in the study of stellar dynamics (see [4]). In this paper we are interested in these processes on a domain.

Research partially supported by NSFC 10501048


We know that the generator of the symmetric α-stable processes is the fractional Laplacian $-(-\Delta)^{\alpha/2}$. As this operator is nonlocal, it cannot be used on a domain automatically. When restricting the integral kernel of the fractional Laplacian to a subset $G$ of $\mathbb{R}^n$, we obtain the nonlocal operator $\Delta_G^{\alpha/2}$ (see Definition 1.1 below). This operator describes a particle jumping from one point $x\in G$ to another point $y\in G$ with intensity proportional to $\frac{1}{|x-y|^{n+\alpha}}$. In contrast to Brownian motion, which can be taken as the limiting model of the random walk in which the test particles are assumed to jump to one of the nearest neighbor sites, we can take the stable process as the limiting model of a random walk in which the test particles are assumed to jump to any other site with power law decay in probability. Recently, these limiting models on a domain, which are called the censored stable processes and the reflected stable processes in Bogdan, Burdzy, Chen [2], have been studied by many authors (besides this, the killed stable processes have been studied in a much larger literature, see, e.g., the references in [2, 10]). In physics, the reflected processes can be used to describe some random flow in a closed domain with free action on the boundary, and they are always connected to the Neumann boundary problem. By [2], starting from a point in $G$, the reflected stable processes will hit the boundary of $G$ in finite time if $1 < \alpha < 2$ and keep living in $G$ forever if $0 < \alpha \le 1$. The heat kernel estimates of the reflected stable processes have been given in Chen and Kumagai [6]. In Guan [9] and Guan, Ma [10, 11], these processes are further investigated through their generators $\Delta_G^{\alpha/2}$, which are called regional fractional Laplacians in [10]. When $G$ is an interval and $1 < \alpha < 2$, an integration by parts formula and a boundary operator of $\Delta_G^{\alpha/2}$ are formulated in [10], see Theorem 1.1 below.

In this paper we will establish the integration by parts formula for $\Delta_G^{\alpha/2}$ in higher dimensions. We also extend this formula to the case $0 < \alpha \le 1$. By this formula the free boundary problem for $\Delta_G^{\alpha/2}$ can be formulated, see Remark 3.2.

To introduce the main results of this paper, we first restate some results in [10]. Let $0 < \alpha < 2$ and let $G$ be an open set of $\mathbb{R}^n$. Denote by $L^1 := L^1\big(G, \frac{dx}{(1+|x|)^{n+\alpha}}\big)$ all the measurable functions $u$ on $G$ such that $\int_G \frac{|u(x)|}{(1+|x|)^{n+\alpha}}\,dx < \infty$. For $u\in L^1$, $x\in G$ and $\varepsilon > 0$, we write
$$\Delta_{G,\varepsilon}^{\alpha/2}u(x) = A(n, -\alpha)\int_{y\in G,\,|y-x|>\varepsilon}\frac{u(y) - u(x)}{|x-y|^{n+\alpha}}\,dy,$$
where
$$A(n, -\alpha) = \frac{|\alpha|\,2^{\alpha-1}\,\Gamma\big(\frac{n+\alpha}{2}\big)}{\pi^{n/2}\,\Gamma\big(1 - \frac{\alpha}{2}\big)}.$$

Definition 1.1. Let $u\in L^1$. The regional fractional Laplacian $\Delta_G^{\alpha/2}$ is defined by the formula
$$\Delta_G^{\alpha/2}u(x) = \lim_{\varepsilon\downarrow 0}\Delta_{G,\varepsilon}^{\alpha/2}u(x), \quad x\in G, \tag{1.1}$$
provided the limit exists.
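To make Definition 1.1 concrete, here is a minimal Python sketch of the truncated operator $\Delta_{G,\varepsilon}^{\alpha/2}$ on an interval $G = (a, b)$ in dimension one. It is an illustration under stated assumptions, not a production scheme: the function names are hypothetical, the constant is the Gamma-function expression for $A(n, -\alpha)$ given above, and the midpoint rule on the two pieces of $\{y\in G : |y - x| > \varepsilon\}$ is a convenience choice.

```python
import math

def A(n, alpha):
    """Normalising constant A(n, -alpha) = |alpha| 2^(alpha-1) Gamma((n+alpha)/2) / (pi^(n/2) Gamma(1-alpha/2))."""
    return (abs(alpha) * 2.0 ** (alpha - 1.0) * math.gamma((n + alpha) / 2.0)
            / (math.pi ** (n / 2.0) * math.gamma(1.0 - alpha / 2.0)))

def regional_laplacian_eps(u, x, alpha, eps, a=0.0, b=1.0, m=20000):
    """Truncated regional fractional Laplacian on G = (a, b), n = 1.

    Approximates A(1, -alpha) * int_{y in G, |y-x|>eps} (u(y)-u(x)) / |x-y|^(1+alpha) dy
    by the midpoint rule on (a, x-eps) and (x+eps, b).
    """
    def piece(lo, hi):
        if hi <= lo:
            return 0.0
        h = (hi - lo) / m
        total = 0.0
        for k in range(m):
            y = lo + (k + 0.5) * h
            total += (u(y) - u(x)) / abs(x - y) ** (1.0 + alpha)
        return total * h
    return A(1, alpha) * (piece(a, x - eps) + piece(x + eps, b))

# Two simple sanity checks on G = (0, 1):
const_value = regional_laplacian_eps(lambda y: 1.0, 0.3, 1.5, 0.01)   # constants are annihilated
linear_mid = regional_laplacian_eps(lambda y: y, 0.5, 1.5, 0.01)      # odd integrand about the midpoint
```

Constants are annihilated because the integrand vanishes identically, and a linear function evaluated at the midpoint of the interval gives zero by symmetry; neither check depends on the limit $\varepsilon\downarrow 0$.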


When $G = \mathbb{R}^n$, $\Delta_{\mathbb{R}^n}^{\alpha/2}$ is the fractional power of the Laplacian $-(-\Delta)^{\alpha/2}$, which can be defined by Fourier transform (see e.g. [14, 18]): $\mathcal{F}\big((-\Delta)^{\alpha/2}u\big)(\xi) = |\xi|^\alpha\,\mathcal{F}(u)(\xi)$. The next definition is the boundary operator of $\Delta_{(0,1)}^{\alpha/2}$ when $1 < \alpha < 2$. Here, the notation $F^\gamma$ in [10] is replaced by $F_1^\gamma$ so as to avoid confusion in this paper.

Definition 1.2. Let $0 < \gamma < 1$, $u\in C^1(0, 1)$. The boundary operator $F_1^\gamma$ is defined by
$$F_1^\gamma u(0) = \lim_{t\downarrow 0}u'(t)\,t^\gamma, \quad F_1^\gamma u(1) = \lim_{t\uparrow 1}u'(t)\,(1-t)^\gamma,$$
provided the limits exist.

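The following result is the integration by parts formula for $\Delta_G^{\alpha/2}$ when $G = (0, 1)$ and $1 < \alpha < 2$.

Theorem 1.1 [Corollary 7.6 in [10]]. Let $1 < \alpha < 2$. Suppose that the function $u$ on $[0, 1]$ is such that $\frac{u(x)}{x^{\alpha-1}}\in C^2[0, 1]$ and $\frac{u(x)}{(1-x)^{\alpha-1}}\in C^2[0, 1]$ (here we take $\frac{a}{0} = 0$ for $a\in\mathbb{R}$). Then $F_1^{2-\alpha}u$ exists and for $v\in C^2[0, 1]$,
$$\int_0^1\!\!\int_0^1\frac{(u(x) - u(y))(v(x) - v(y))}{|x-y|^{1+\alpha}}\,dx\,dy = C_\alpha\, v\,F_1^{2-\alpha}u\,\Big|_0^1 - \int_0^1 v(x)\,\Delta_{(0,1)}^{\alpha/2}u(x)\,dx,$$
where
$$C_\alpha = \frac{A(1, -\alpha)}{\alpha(\alpha-1)}\left(\int_0^1\frac{ds}{s^{2-\alpha}\,|1-s|^{\alpha-1}} - \frac{1}{\alpha-1} + \lim_{\delta\downarrow 0}\int_\delta^1\Big(\frac{1}{s^{2-\alpha}\,|s-\delta|^{\alpha-1}} - \frac{1}{s}\Big)\,ds\right).$$
When $\alpha = \frac{3}{2}$, $C_\alpha = \frac{4}{3}(\pi - 2 + \ln 4)$.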

In [10], we also obtain a formula for the regional fractional Laplacian in the higher-dimensional case, which shows that $\Delta_G^{\alpha/2}$ is symmetric on $C^2(G)$.

Theorem 1.2 [Theorem 3.4 in [10]]. Let $G$ be a bounded Lipschitz open set in $\mathbb{R}^n$. If $0 < \alpha < 2$ and $u, v\in C^2(G)$, then
$$\int_G u(x)\,\Delta_G^{\alpha/2}v(x)\,dx = \int_G v(x)\,\Delta_G^{\alpha/2}u(x)\,dx.$$

Let $G$ be a bounded $C^2$ open set in $\mathbb{R}^n$. It is well known that the integration by parts formula, i.e., the Green formula, for the Laplacian is
$$\int_G \nabla u(x)\cdot\nabla v(x)\,dx = \int_{\partial G}\frac{\partial u}{\partial n}(x)\,v(x)\,m(dx) - \int_G v(x)\,\Delta u(x)\,dx, \tag{1.2}$$
where $u, v\in C^2(G)$, $\frac{\partial u}{\partial n}(x)$ is the outward normal derivative of $u$ on $\partial G$ at the point $x$, and $m(dx)$ is the surface measure on $\partial G$. In Theorem 3.3 of this paper we will prove a formula similar to (1.2) for $1 < \alpha < 2$, which generalizes the results in Theorems 1.1 and 1.2. We will also prove the following theorem for all $\alpha\in(0, 2)$.


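Theorem 1.3. Let $G$ be a bounded $C^2$ open set in $\mathbb{R}^n$. Then the following assertions are true.

(i) Let $1 < \alpha < 2$. If $u, v\in\bigcup_{\beta\ge\alpha}D_\beta(G)$, then
$$\int_G v(x)\,\Delta_G^{\alpha/2}u(x)\,dx - \int_G u(x)\,\Delta_G^{\alpha/2}v(x)\,dx = C_{n,\alpha}\int_{\partial G}v(x)\,F^{2-\alpha}u(x)\,m(dx) - C_{n,\alpha}\int_{\partial G}u(x)\,F^{2-\alpha}v(x)\,m(dx), \tag{1.3}$$
where $C_{n,\alpha}$ is defined by (3.9). The notations $F^{2-\alpha}u$ and $D_\alpha(G)$ are explained in Definition 2.1 and (2.5) respectively. In particular, if $u, v\in\bigcup_{\beta>\alpha}D_\beta(G)$, then
$$\int_G v(x)\,\Delta_G^{\alpha/2}u(x)\,dx - \int_G u(x)\,\Delta_G^{\alpha/2}v(x)\,dx = 0.$$

(ii) Let $0 < \alpha \le 1$. If $u\in\bigcup_{\beta\ge\alpha}D_\beta(G)$ and $v\in C^2(G)$, then
$$\int_G v(x)\,\Delta_G^{\alpha/2}u(x)\,dx - \int_G u(x)\,\Delta_G^{\alpha/2}v(x)\,dx = C_{n,\alpha}\int_{\partial G}v(x)\,F^{2-\alpha}u(x)\,m(dx). \tag{1.4}$$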

By Definition 2.1 the boundary operator $F^{2-\alpha}$ in (1.3) and (1.4) is defined by $F^{2-\alpha}u(x) = -\lim_{t\downarrow 0}u_t'(x + t\,\vec{n}(x))\,t^{2-\alpha}$, where $\vec{n}(x)$ is the inner normal vector of $\partial G$ at the point $x\in\partial G$. This boundary operator describes the power growth of $u$ near the boundary in the normal direction. When $\alpha = 2$, it becomes the outward normal derivative on the boundary.

The last part of this section is a brief outline of the rest of the paper. In Sect. 2 we define the boundary operators $F^\gamma$ for the regional fractional Laplacian and define the space $D_\beta(G)$ for $\beta > 0$, which is bigger than $C^2(G)$ when $0 < \beta < 2$. After that we give some necessary estimates of $\Delta_G^{\alpha/2}u$ for $u\in\bigcup_{\beta\ge\alpha}D_\beta(G)$. In Sect. 3 we prove the integration by parts formula for $\Delta_G^{\alpha/2}$ on $C^2$ open sets in $\mathbb{R}^n$. In analogy with the generalization from the Laplacian to second-order elliptic operators, in Sect. 4 we study more general nonlocal operators. Let $\kappa$ be a symmetric function on $G\times G$. Denote by $\Delta_G^{\alpha/2,\kappa}$ the operator whose integral kernel is proportional to $\frac{\kappa(x,y)}{|x-y|^{n+\alpha}}$. This operator is the formal generator of a stable-like process. In Theorem 4.5 we obtain the integration by parts formula for $\Delta_G^{\alpha/2,\kappa}$. We also give two examples to show that, at least in some special cases, the operators $\Delta_G^{\alpha/2,\kappa}$ discussed in Sect. 4 include the fractional power of an elliptic differential operator with Neumann boundary condition, and we conjecture that this is valid in more general cases. The non-symmetric case of $\kappa$ is considered in Remark 4.1. By Fukushima's decomposition, the heat kernel estimates in [6] and the integration by parts formula for $\Delta_G^{\alpha/2,\kappa}$, Theorems 4.8 and 4.9 give the semi-martingale property of the reflected stable-like processes and the probabilistic interpretation of the boundary operator, respectively. Theorem 4.10 studies the domain of the generator of the reflected stable-like processes.


2. The Boundary Operator and Some Estimates for Regional Fractional Laplacian

First we state some definitions and notations which will be used throughout this paper. An open set $G$ in $\mathbb{R}^n$ is said to be $C^2$ if there exists $r_0 > 0$ such that for each $x\in\partial G$, we can find a $C^2$ function $\Gamma_x : \mathbb{R}^{n-1}\to\mathbb{R}$ and an orthonormal coordinate system $CS_x$ such that
$$G\cap B(x, r_0) = \{y = (y_1, \ldots, y_n) : y_n > \Gamma_x(y_1, \ldots, y_{n-1})\}\cap B(x, r_0). \tag{2.1}$$
Denote

$\langle\cdot,\cdot\rangle$ : the inner product on $\mathbb{R}^n$,
$\mathbb{R}^n_+ = \{x = (x_1, \ldots, x_n) : x_n > 0\}$,
$\rho(x) := \mathrm{dist}(x, \partial G) = \inf\{|y - x| : y\in\partial G\}$, $\forall x\in G$,
$d_G :=$ diameter of $G = \sup\{|y - x| : y, x\in G\}$,
$B(x, \delta) := \{y : |y - x| < \delta\}$, $\forall x\in\mathbb{R}^n$,
$G_\delta := \{y\in G : \rho(y) > \delta\}$, $G^\delta := \{y\in G : 0 < \rho(y) < \delta\}$,
$\vec{n}(x)$ : the inner normal vector of $\partial G$ at the point $x\in\partial G$,
$a\vee b = \max\{a, b\}$, $a\wedge b = \min\{a, b\}$, $\forall a, b\in\mathbb{R}$.

For sets $A, B\subset\mathbb{R}^n$, we denote $A\setminus B = \{x : x\in A, x\notin B\}$ and write $\bar{A}$ for the closure of $A$ in $\mathbb{R}^n$. The $(n-1)$-dimensional surface measure is always denoted by $m(\cdot)$. The constants, such as $a_1, a_2, c, C$, may change from place to place in this paper. By (2.1) we can check that for each $x\in\partial G$ and $y\in G\cap B(x, \frac{r_0}{2})$, it holds that
$$\rho(y) = \inf\{|y - z| : z = (z_1, \ldots, z_n)\in B(x, r_0),\ z_n = \Gamma_x(z_1, \ldots, z_{n-1})\}. \tag{2.2}$$
The proof of the following lemma can be found in the Appendix of [8].

Lemma 2.1. Let $G$ be a bounded $C^2$ open set in $\mathbb{R}^n$. Then, for $\delta > 0$ small enough, there is a unique point $\xi(x)\in\partial G$ satisfying $|x - \xi(x)| = \rho(x)$, $\forall x\in G^\delta$.

Lemma 2.2. Let $G$ be a bounded $C^2$ open set in $\mathbb{R}^n$ and let $\xi$ be the function defined in Lemma 2.1. Then $\xi\in C^1(G^{\delta_0})$ for some $\delta_0\in(0, \frac{r_0}{2})$ and
$$\nabla\rho(x) = \frac{x - \xi(x)}{|x - \xi(x)|}, \quad \forall x\in G^{\delta_0},$$
where $\xi(x)$ is the point described by Lemma 2.1. Moreover, $\rho\in C^2(G^{\delta_0})$.

Proof. The first assertion follows from Lemma 2.1 and Lemma 2.2 in [15]. The second assertion can be found in the Appendix in [8].

Remark 2.1. The distance function $\rho$ does not belong to $C^2(G)$. For example, when $G = B(0, 1)$, $\rho(x) = 1 - |x|$ for $x\in B(0, 1)$ and $\rho$ is not differentiable at the point 0.


Let $G$ be a $C^2$ open set in $\mathbb{R}^n$ and let $\beta > 0$. By Lemma 2.2 we can find $\delta > 0$ depending on $G$ and a function $h\in C^2(G)$ depending on $G$ and $\beta$ such that
$$h(x) = \rho(x)^{\beta-1}, \quad \forall x\in G^\delta, \quad\text{when } \beta\in(0, 1)\cup(1, \infty); \tag{2.3}$$
$$h(x) = \ln\rho(x), \quad \forall x\in G^\delta, \quad\text{when } \beta = 1. \tag{2.4}$$
For $\beta > 0$, define
$$D_\beta(G) = \{u : u(x) = f(x)h(x) + g(x),\ \forall x\in G,\ \text{for some } f, g\in C^2(G)\}. \tag{2.5}$$
Here $D_\beta(G)$ does not depend on the choice of $h$ satisfying (2.3) and (2.4). When $\beta\in(1, \infty)$, we always assume that a function $u\in D_\beta(G)$ is defined on $\bar{G}$ by continuous extension. Notice that $D_2(G) = C^2(G)$ and $D_\beta(G) = C^2(G)$ when $\beta\ge 2$ and $G$ is smooth. When $0 < \beta < 2$ we can prove that the boundary value of $f$ in (2.5) is determined by $u$. In fact we have by Lemma 2.2,
$$f(x) = \frac{1}{\beta-1}\lim_{t\downarrow 0}\frac{du(x + t\,\vec{n}(x))}{dt}\,t^{2-\beta}, \quad \forall x\in\partial G,\ \beta\in(0, 1)\cup(1, 2),$$
$$f(x) = \lim_{t\downarrow 0}\frac{du(x + t\,\vec{n}(x))}{dt}\,t, \quad \forall x\in\partial G,\ \beta = 1. \tag{2.6}$$

Definition 2.1. Let $G$ be a $C^1$ open set in $\mathbb{R}^n$. For $0\le\gamma < 2$, $u\in C^1(G)$ and $x\in\partial G$, define the operator $F^\gamma$ on $\partial G$ by
$$F^\gamma u(x) = -\lim_{t\downarrow 0}u_t'(x + t\,\vec{n}(x))\,t^\gamma,$$
provided that the limit exists.

Remark 2.2. Let $u, f$ be described by (2.5). By (2.6) we have
$$F^{2-\beta}u(x) = (1-\beta)\,f(x), \quad \beta\in(0, 1)\cup(1, 2); \qquad F^{2-\beta}u(x) = -f(x), \quad \beta = 1.$$
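Remark 2.2 can be illustrated numerically on the interval $G = (0, 1)$ at the boundary point 0, where $\rho(x) = x$ nearby and the inner normal is $+1$. The Python sketch below is a hedged illustration: the functions `f` and `g`, the exponent $\beta = 1.5$, the evaluation point `t` and the finite-difference step are all hypothetical choices, and the limit $t\downarrow 0$ is replaced by a single small value of $t$.

```python
def boundary_operator_at_zero(u, gamma, t=1e-6, rel_step=1e-4):
    """Approximate F^gamma u(0) = -lim_{t->0+} u'(t) t^gamma on G = (0, 1).

    The limit is replaced by one small t and u' by a central difference;
    both are illustrative numerical shortcuts, not part of the definition.
    """
    h = rel_step * t
    du = (u(t + h) - u(t - h)) / (2.0 * h)
    return -du * t ** gamma

beta = 1.5
f = lambda x: 1.0 + x                            # hypothetical smooth coefficient
g = lambda x: x * x                              # hypothetical smooth remainder
u = lambda x: f(x) * x ** (beta - 1.0) + g(x)    # u = f * rho^(beta-1) + g with rho(x) = x near 0

approx = boundary_operator_at_zero(u, 2.0 - beta)
expected = (1.0 - beta) * f(0.0)                 # Remark 2.2: F^(2-beta) u = (1 - beta) f on the boundary
```

Here `approx` and `expected` agree up to a term of order `t`, which reflects that only the boundary value of `f` enters the limit.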

Remark 2.3. When $\gamma = 0$, we obtain
$$F^0u(x) = \frac{\partial u}{\partial n}(x), \quad \forall u\in D_2(G),\ x\in\partial G,$$
where $\frac{\partial u}{\partial n}(x)$ is the outward normal derivative of $u$ on the boundary at the point $x$. Hence we may call $F^\gamma$ the scaled outward normal derivative. When $G = (0, 1)$ and $0 < \gamma < 1$, we have $F^\gamma u(0) = -F_1^\gamma u(0)$ and $F^\gamma u(1) = F_1^\gamma u(1)$, where $F_1^\gamma$ is defined by Definition 1.2.

In what follows we give some basic estimates of $\Delta_G^{\alpha/2}u$ for $u\in D_\beta(G)$ with $\beta\ge\alpha$. We need the following product formula for $\Delta_G^{\alpha/2}$.

Lemma 2.3 [Lemma 3.5 in [10]]. Let $G$ be an open set in $\mathbb{R}^n$ and let $u, v$ be functions on $G$. If $\Delta_G^{\alpha/2}u(x)$ and $\Delta_G^{\alpha/2}v(x)$ exist, and $\int_G\frac{|(u(y)-u(x))(v(y)-v(x))|}{|x-y|^{n+\alpha}}\,dy < \infty$, then
$$\Delta_G^{\alpha/2}(uv)(x) = v(x)\,\Delta_G^{\alpha/2}u(x) + u(x)\,\Delta_G^{\alpha/2}v(x) + A(n, -\alpha)\int_G\frac{(u(y)-u(x))(v(y)-v(x))}{|x-y|^{n+\alpha}}\,dy.$$


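Define for $x\in G$,
$$w_\beta(x) = \rho(x)^{\beta-1}, \quad \beta\in(0, 1)\cup(1, \infty); \qquad w_\beta(x) = \ln\rho(x), \quad \beta = 1. \tag{2.7}$$
The next proposition gives an estimate of $\Delta_G^{\alpha/2}w_\alpha$. The inequality (2.9) below can be found in Lemma 6.3 of [2], and here we give another proof. By (5.4) in [2] we have for $G = \mathbb{R}^n_+$,
$$\Delta_{\mathbb{R}^n_+}^{\alpha/2}w_\alpha(x) = 0, \quad \forall x\in\mathbb{R}^n_+,\ \alpha\in(0, 2). \tag{2.8}$$
In what follows $\delta_0$ always denotes the constant specified in Lemma 2.2.

Proposition 2.4. Let $G$ be a bounded $C^2$ open set in $\mathbb{R}^n$ and let $w_\alpha$ be the function defined in (2.7). Then there exist positive numbers $a_1, a_2$ depending on $\alpha$ and $G$ such that
$$|\Delta_G^{\alpha/2}w_\alpha(x)| < a_1|\ln\rho(x)| + a_2, \quad \forall x\in G^{\delta_0}, \quad \alpha\in(1, 2), \tag{2.9}$$
$$|\Delta_G^{\alpha/2}w_\alpha(x)| < a_1|\ln\rho(x)|^2 + a_2, \quad \forall x\in G^{\delta_0}, \quad \alpha = 1, \tag{2.10}$$
$$|\Delta_G^{\alpha/2}w_\alpha(x)| < a_1\,\rho(x)^{\alpha-1}, \quad \forall x\in G^{\delta_0}, \quad \alpha\in(0, 1). \tag{2.11}$$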

α

Proof. As G2 wα is continuous on G δ0 , we only need to consider the case when x = (x1 , . . . , xn ) ∈ G δ0 2 . By Lemma 2.2 there is an unique point x0 ∈ ∂G such that 2

∧δ0

|x − x0 | = ρ(x). By rotating the coordinate system, we can assume that x0 = 0 and x = (0, . . . , 0, ρ(x)). Set Ux0 = {y = (y1 , . . . , yn ) : y ∈ G δ0 , |(y1 , . . . , yn−1 )| < δ0 },

(2.12)

and denote by x0 the defining function of ∂G near x0 (see (2.1)). Since ∂G is C 2 , there exists a constant number k1 > 0 such that |x0 (y1 , . . . , yn−1 )| < k1 |(y1 , . . . , yn−1 )|2 and |yn | < k1 δ0 ,

∀ y ∈ Ux0 . (2.13)

By Lemma 2.2, ∂G δ is tangent to the plane {x = (x1 , . . . , xn ) : xn = δ} at point (0, . . . , 0, δ) for 0 < δ < δ0 . So there exists a constant number k2 > 0 such that |yn − ρ(y)| < k2 |(y1 , . . . , yn−1 )|2 ,

∀ y ∈ Ux0 .

(2.14)

Case 1. 1 < α < 2. Notice that ρ(x) = xn , we have ρ(y)α−1 − ρ(x)α−1 dy n+α |x − y| y∈Ux0 ,|y−x|>ε |yn |α−1 − xnα−1 ρ(y)α−1 − |yn |α−1 ≤ dy + dy . n+α n+α |x − y| |x − y| y∈Ux0 ,|y−x|>ε y∈Ux0 ,|y−x|>ε (2.15) Set A = {y : x0 (y1 , . . . , yn−1 ) < yn < 0 or 0 < yn < x0 (y1 , . . . , yn−1 ), |(y1 , . . . , yn−1 )| < δ0 }.


By x ∈ G δ0 2

∧δ02

, (2.8) and the first inequality in (2.13),

lim

|yn |α−1 − xnα−1 dy n+α ε↓0 |x − y| y∈Ux0 ,|y−x|>ε ||yn |α−1 − xnα−1 | ||yn |α−1 − xnα−1 | ≤ dy + dy δ |x − y|n+α |x − y|n+α A |y−x|> 20 δ0 1 ||yn |α−1 − xnα−1 | ≤ dr I A (y) m(dy) + dy n+α n+1 δ |x − y| 0 |(y1 ,...,yn−1 )|=r |y−x|> 20 |x − y| ≤ (2π )

n

×

k1 (1 + k1α−1 )

δ0

r 2(α−1)

1 ρ(x) 2

α ( r +ρ(x) 2 )

≤ (2π )n k1 (1 + k1α−1 )

1

ρ(x) 2 0

ρ(x)α−1 α ( r +ρ(x) 2 )

dr + (2π )n

dr + (2π )n k1 1 + k1α−1

2 δ0

2α δ0α−1 2 2α + (2π )n k1 (1 + k1α−1 ) + (2π )n . α−1 α−1 δ0

(2.16)

The second last inequality uses the fact |x − y| ≥ (ρ(x) + r )/2 for y ∈ A, which comes from |x − y| ≥ r and |x − y| ≥ ρ(x) for y ∈ A. To estimate the second term in (2.15) we need the following inequality: |bα−1 − a α−1 | ≤ bα−2 |b − a|, b > 0, a > 0, 1 < α < 2.

(2.17)
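Inequality (2.17) follows by dividing by $b^{\alpha-1}$: with $x = a/b > 0$ and $\theta = \alpha - 1\in(0, 1)$ it reads $|1 - x^\theta|\le|1 - x|$. As a quick sanity check, the Python sketch below tests it on a logarithmic grid; the grid, the sample exponents and the small rounding slack are arbitrary illustrative choices.

```python
def holds(a, b, alpha, slack=1e-12):
    """Check |b^(alpha-1) - a^(alpha-1)| <= b^(alpha-2) |b - a| up to rounding slack."""
    lhs = abs(b ** (alpha - 1.0) - a ** (alpha - 1.0))
    rhs = b ** (alpha - 2.0) * abs(b - a)
    return lhs <= rhs * (1.0 + slack) + slack

grid = [10.0 ** (k / 4.0) for k in range(-16, 17)]   # values from 1e-4 to 1e4
alphas = [1.01, 1.25, 1.5, 1.75, 1.99]               # sample exponents in (1, 2)
all_ok = all(holds(a, b, al) for al in alphas for a in grid for b in grid)
```

A grid check is of course no proof; it only confirms that the inequality is used with the factor $b^{\alpha-2}$ (and not $a^{\alpha-2}$) on the right-hand side.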

By (2.14), (2.17) and the second inequality in (2.13),

ρ(y)α−1 − |yn |α−1 dy n+α |x − y|

y∈Ux0 ,|y−x|>ε

≤ ≤

= ≤

k 1 δ0

dr

≤2 ≤

y∈Ux0 ,|y−x|>ε

−k1 δ0 k 1 δ0

yn =r

dr

k2 |yn |α−2 dy |x − y|n−2+α I{y∈Ux0 ,|y−x|>ε}

k2 |r |α−2 m(dy) |x − y|n−2+α k2 |r |α−2 m(dy) |x − y|n−2+α

0 yn =r,|(y1 ,...,yn−1 )|<δ0 α+1 n r α−2 2 (2π ) k2 k1 δ0 dr α−1 |r − ρ(x)|α−1 0 2ρ(x) k 1 δ0 1 1 2α+1 (2π )n k2 dr + 2−α |r − ρ(x)|α−1 α−1 r r − ρ(x) 2ρ(x) 0 2 α+1 n 2 (2π ) k2 1 dr + ln(k δ ) − ln ρ(x) . 1 0 2−α |r − 1|α−1 α−1 0 r

dr (2.18)

Combining (2.15), (2.16) and (2.18), we see that (2.9) is true for some positive numbers a1 and a2 .


Case 2. 0 < α < 1. Denote (y1 , . . . , yn ) = (y1 , . . . , yn−1 , ρ(y)), ∀ y = (y1 , . . . , yn ) ∈ Ux0 . By Lemma 2.2 we can choose δ0 small enough such that there exists a constant number k3 > 0 satisfying ρx n (x) = 1, k3 < ρx n (y) ≤ 1, ∀ y ∈ Ux0 .

(2.19)

Applying (2.14), for ε small enough, we have (Ux0 ) ∩ B(x, ε + k2 ε2 )c ⊆ (Ux0 ∩ B(x, ε)c ) ⊆ (Ux0 ) ∩ B(x, ε − k2 ε2 )c , (2.20) where B(x, ε + k2 ε2 )c = Rn \B(x, ε + k2 ε2 ). In what follows we assume that By (2.20),

y∈Ux0 ,|y−x|>ε

=

ρ(y)α−1 − ρ(x)α−1 dy n+α |x − y|

(Ux0 ∩B(x,ε)c )

ynα−1 − xnα−1 dy −1 n+α −1 |x − (y)| ρxn ( (y))

( ρ (1−1 (y)) − 1)(ynα−1 − xnα−1 ) xn ≤ dy |x − y|n+α (Ux0 )∩B(x,ε+k2 ε2 )c ynα−1 − xnα−1 + dy n+α 2 c |x − y| (Ux0 )∩B(x,ε+k2 ε ) |ynα−1 − xnα−1 | + dy −1 n+α ρ (−1 (y)) B(x,ε+k2 ε2 )∩B(x,ε−k2 ε2 )c |x − (y)| xn ynα−1 − xnα−1 + −1 n+α ρ (−1 (y)) (Ux0 )∩B(x,ε+k2 ε2 )c |x − (y)| xn ynα−1 − xnα−1 dy . − |x − y|n+α ρx n (−1 (y)) When ε <

ρ(x) 3 ,

1 > 16ε. k2

(2.21)

for the third term in (2.21) we have

B(x,ε+k2 ε2 )∩B(x,ε−k2 ε2 )c

≤ ≤ =

|ynα−1 − xnα−1 | dy |x − −1 (y)|n+α ρx n (−1 (y)) 2εα−1

ε n+α k3 B(x,ε+k2 ε2 )∩B(x,ε−k2 ε2 )c ( 2 ) 2 (2π )n 2n+1+α ε+k2 ε −1−n n−1

k3

ε−k2 ε2

(2π )n 22n+1+α k2 . k3

ε

r

dy dr (2.22)


Next we estimate the second term in (2.21). By (2.8) and ρ(x) < δ20 , ynα−1 − xnα−1 dy lim n+α ε↓0 (Ux0 )∩B(x,ε+k2 ε2 )c |x − y| ynα−1 + xnα−1 ynα−1 − xnα−1 = dy + dy n+α n+α 0δ0 |x − y| yn >δ0 |x − y| (2π )n −1 2α (2π )n ≤ δ + ρ(x)α−1 . (2.23) α(α + 1) 0 αδ0α Applying (2.19) and Lemma 2.2, we can find k4 > 0 such that | ρ k4 |y − x| for y ∈ Ux0 . Hence for the first term in (2.21) we have

( ρ

1

xn (

−1 (y))

|x − y|n+α

(Ux0 )∩B(x,ε+k2 ε2 )c

≤

(Ux0 )∩B(x,ε+k2 ε2 )c

− 1)(ynα−1 − xnα−1 )

1

xn (

−1 (y))

− 1| <

dy

k4 |ynα−1 − xnα−1 | dy. |x − y|n−1+α

(2.24)

For the last term in (2.21), by (2.14) and |x − y| < 2(1 + k2 δ0 )|x − −1 (y)| for y ∈ Ux0 , we have ynα−1 −xnα−1 ynα−1 − xnα−1 − dy −1 n+α −1 n+α −1 ρxn ( (y)) |x − y| ρxn ( (y)) (Ux0 )∩B(x,ε+k2 ε2 )c |x − (y)| ≤ 2n(n + α)(2 + 2k2 δ0 )n+1+α k2 |(y1 , . . . , yn−1 )|2 |ynα−1 − xnα−1 | × dy k3 |x − y|n+1+α (Ux0 )∩B(x,ε+k2 ε2 )c k2 |ynα−1 − xnα−1 | ≤ 2n(n + α)(2 + 2k2 δ0 )n+1+α dy. n−1+α (Ux0 )∩B(x,ε+k2 ε2 )c k3 |x − y| We also have (Ux0 )∩B(x,ε+k2 ε2 )c

ρ(x)

≤

dr 0

δ0

(2.25)

|ynα−1 − xnα−1 | dy |x − y|n−1+α

yn =r,|(y1 ,...,yn−1 )|≤δ0

ynα−1 m(dy) |x − y|n−1+α

ρ(x)α−1 m(dy) n−1+α ρ(x) yn =r,|(y1 ,...,yn−1 )|≤δ0 |x − y| 2α+1 (2π )n ρ(x) r α−1 2α+1 (2π )n δ0 ρ(x)α−1 ≤ dr + dr α α (ρ(x) − r )α α 0 ρ(x) (r − ρ(x)) 2α+1 (2π )n δ01−α 1 2α+1 (2π )n 1 ρ(x)α−1 . = dr + (2.26) 1−α (1 − r )α α α(1 − α) 0 r +

dr

Combining (2.21)–(2.26), we obtain (2.11).


Case 3. α = 1. We can follow the arguments in Case 2. When α = 1, the inequalities (2.22) and (2.26) are replaced respectively by | ln yn − ln xn | dy −1 n+1 ρ (−1 (y)) B(x,ε+k2 ε2 )∩B(x,ε−k2 ε2 )c |x − (y)| xn ln 2 ≤ dy ε n+1 k 3 B(x,ε+k2 ε2 )∩B(x,ε−k2 ε2 )c ( 2 ) ≤ (2π )n 22n+1 k2 k3−1 ln 2, and

(Ux0 )∩B(x,ε+k2 ε2 )c

≤

2ρ(x)

dr 0

δ0

(2.27)

| ln yn − ln xn | dy |x − y|n

yn =r,|(y1 ,...,yn−1 )|≤δ0

| ln ρ(x) − ln yn | m(dy) |x − y|n

ln yn − ln ρ(x) m(dy) |x − y|n 2ρ(x) yn =r,|(y1 ,...,yn−1 )|≤δ0 2ρ(x) δ0 | ln ρ(x) − ln r | ln r − ln ρ(x) dr + (2π )n dr ≤ (2π )n |ρ(x) − r | r − ρ(x) 2ρ(x) 0 +

dr

≤ (2π )

n

2

0

≤ (2π )n

2

0

[

δ0

]

ρ(x) ln r ln(k + 2) dr + (2π )n r −1 k k=1 2 δ0 ln r dr + 4(2π )n ln( + 2) , r −1 ρ(x)

δ0 where [ ρ(x) ] is the biggest integer no more than (2.10) as in Case 2.

(2.28)

δ0 ρ(x) . By (2.27) and (2.28) we can prove

Proposition 2.5. Let $G$ be a bounded $C^2$ open set in $\mathbb{R}^n$ and $0 < \alpha < 2$. Then there exist constants $a_1, a_2$ depending on $\alpha$ and $G$ such that
$$|\Delta_G^{\alpha/2}w_\beta(x)| < a_1\,\rho(x)^{\beta-\alpha-1} + a_2, \quad \forall x\in G^{\delta_0}, \quad\text{when } \beta\in(\alpha, \alpha+1), \tag{2.29}$$
$$|\Delta_G^{\alpha/2}w_\beta(x)| < a_1|\ln\rho(x)| + a_2, \quad \forall x\in G^{\delta_0}, \quad\text{when } \beta = \alpha+1, \tag{2.30}$$
$$|\Delta_G^{\alpha/2}w_\beta(x)| < a_1, \quad \forall x\in G^{\delta_0}, \quad\text{when } \beta\in(\alpha+1, \infty). \tag{2.31}$$

Proof. First we assume that $\beta\in(\alpha, \alpha+1)$. By (5.4) in [2] there exists a constant $C_{\alpha,\beta}$ such that
$$\Delta_{\mathbb{R}^n_+}^{\alpha/2}w_\beta(x) = C_{\alpha,\beta}\,\rho(x)^{\beta-\alpha-1}, \quad \forall x = (x_1, \ldots, x_n)\in\mathbb{R}^n_+.$$
Based on this fact we can prove (2.29) by the same method as in Proposition 2.4. Next we assume that $\beta = \alpha+1$. Let $u_\gamma(x) = x_n^\gamma$ for $x\in\mathbb{R}^n_+$ and $\gamma > 0$. We can prove that, for each $\alpha\in(0, 2)$, there exists a constant $k_1$ satisfying
$$|\Delta_{\mathbb{R}^n_+\cap B(0,1)}^{\alpha/2}u_\alpha(x)| \le k_1|\ln x_n|, \quad \forall x\in\mathbb{R}^n_+\cap B(0, \tfrac{1}{2}). \tag{2.32}$$
With the help of (2.32) and applying the same arguments as in Proposition 2.4, we can prove (2.30). Equation (2.31) can be proved similarly with the fact that for each $\beta > \alpha+1$, there exists a constant $k_2$ such that
$$|\Delta_{\mathbb{R}^n_+\cap B(0,1)}^{\alpha/2}u_\beta(x)| \le k_2, \quad \forall x\in\mathbb{R}^n_+\cap B(0, \tfrac{1}{2}),$$
which can be checked directly.

Corollary 2.6. Let $G$ be a bounded $C^2$ open set in $\mathbb{R}^n$ and $0 < \alpha < 2$. Let $f\in C^2(G)$ and let $w_\beta(x)$ be the function defined in (2.7). For $\beta\in[\alpha, \infty)$, set $v_\beta(x) = f(x)\,w_\beta(x)$. Then estimates (2.9)–(2.11) and (2.29)–(2.31) are still true when $w_\alpha(x)$ and $w_\beta(x)$ are replaced by $v_\alpha(x)$ and $v_\beta(x)$ respectively. Furthermore, $\Delta_G^{\alpha/2}v\in L^1(G)$ for $v\in\bigcup_{\beta\ge\alpha}D_\beta(G)$.

Proof. By Proposition 2.3 in [10], for $g\in C^2(G)$ there exist some positive numbers $c_1, c_2$ such that
$$|\Delta_G^{\alpha/2}g(x)| < c_1, \quad \forall x\in G,\ \alpha < 1,$$
$$|\Delta_G^{\alpha/2}g(x)| \le c_1|\ln\rho(x)| + c_2, \quad \forall x\in G,\ \alpha = 1,$$
$$|\Delta_G^{\alpha/2}g(x)| \le c_1\,\rho(x)^{1-\alpha}, \quad \forall x\in G,\ 1 < \alpha < 2. \tag{2.33}$$
By Lemma 2.3, Proposition 2.4, Proposition 2.5 and (2.33) we need only to check that the function
$$\tilde{w}_\beta(x) = \int_G\frac{|(f(x) - f(y))(w_\beta(y) - w_\beta(x))|}{|y-x|^{n+\alpha}}\,dy$$
satisfies all the estimates of $|\Delta_G^{\alpha/2}w_\beta(x)|$ in Proposition 2.4 and Proposition 2.5. We omit the proof as it is similar to the discussion in Case 2 of Proposition 2.4. By these estimates and noticing that $\Delta_G^{\alpha/2}v$ is continuous on $G$, we can prove the last assertion.

At the end of this section we present a lemma which will be used in the next section.

Lemma 2.7. Let $G$ be a $C^1$ open set and let $u, f$ be the functions described by (2.5). Let $\vec{n}(x_0)$ be the inner normal vector of $\partial G$ at the point $x_0\in\partial G$. Then for any unit vector $\vec{r}$ such that $\langle\vec{n}(x_0), \vec{r}\rangle > 0$,
$$\lim_{t\downarrow 0}\frac{du(x_0 + t\,\vec{r})}{dt}\,t^{2-\alpha} = (\alpha-1)\,f(x_0)\,\langle\vec{n}(x_0), \vec{r}\rangle^{\alpha-1}, \quad \alpha\in(0, 1)\cup(1, 2),$$
$$\lim_{t\downarrow 0}\frac{du(x_0 + t\,\vec{r})}{dt}\,t^{2-\alpha} = f(x_0), \quad \alpha = 1. \tag{2.34}$$

Proof. By the boundary condition and $\langle\vec{n}(x_0), \vec{r}\rangle > 0$ we see that $x_0 + t\,\vec{r}\in G$ for $t > 0$ small enough. So $\rho(x_0 + t\,\vec{r})$ is well defined for small $t$. Noticing that the vectors $x_0 + t\,\vec{r} - \xi(x_0 + t\,\vec{r})$ and $\vec{n}(\xi(x_0 + t\,\vec{r}))$ have the same direction, we have by continuity that
$$\lim_{t\downarrow 0}\frac{x_0 + t\,\vec{r} - \xi(x_0 + t\,\vec{r})}{|x_0 + t\,\vec{r} - \xi(x_0 + t\,\vec{r})|} = \vec{n}(x_0).$$
Thus, the conclusion follows from Lemma 2.2.

3. Integration by Parts Formula for Regional Fractional Laplacian

Lemma 3.1. Let $G$ be a bounded $C^2$ open set in $\mathbb{R}^n$, $1 < \alpha < 2$ and $u\in\bigcup_{\beta\ge\alpha}D_\beta(G)$. We have
$$\frac{1}{2}A(n, -\alpha)\int\!\!\int_{G\times G}\frac{(u(x) - u(y))^2}{|x-y|^{n+\alpha}}\,dx\,dy < \infty. \tag{3.1}$$

Proof. We only prove the lemma for u ∈ Dα (G) as the proof of the others is similar. It is easy to check (3.1) for u ∈ C 2 (G). Therefore we need only to prove ( f (x)h(x) − f (y)h(y))2 1 A(n, −α) d xd y < ∞ 2 |x − y|n+α G×G for f, h explained in (2.3) and (2.5). Noticing that ( f (x)h(x) − f (y)h(y))2 d xd y |x − y|n+α G×G ( f (x) − f (y))2 ≤ 2 sup |h(x)| d xd y |x − y|n+α G×G x∈G (h(x) − h(y))2 + 2 sup | f (x)| d xd y, |x − y|n+α G×G x∈G we need only to check (3.1) for u = h. By Theorem 1.2, (h(x) − h(y))2 1 d xd y A(n, −α) 2 |x − y|n+α G δ ×G δ α h(x)G2 δ h(x) d x =− G δ α (h(y) − h(x))h(x) 2 d xd y − h(x) h(x) d x. = G |x − y|n+α G δ ×G δ Gδ

(3.2)

Set M = sup |h(x)|. By Lemma 2.2 there exists N > 0 such that x∈G

|h(x) − h(y)| < Nρ(x)α−2 |x − y|, ∀ x ∈ G δ , y ∈ G δ . Therefore

|(h(y) − h(x))h(x)| d xd y |x − y|n+α |h(y) − h(x)| ≤M d xd y n+α G δ ×G δ |x − y| 1 ≤ MN ρ(x)α−2 dy dx n−1+α G δ G δ |x − y| dG 1 ≤ (2π )n M N ρ(x)α−2 dr dx α G δ δ−ρ(x) r MN ≤ (2π )n ρ(x)α−2 (δ − ρ(x))1−α d x. α − 1 G δ G δ ×G δ

(3.3)


Q.-Y. Guan

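By Lemma 2.2, there exists some constant number $c$ such that
$$\int_{\{x\in G:\,\rho(x)<\delta\}}\rho(x)^{\alpha-2}(\delta-\rho(x))^{1-\alpha}\,dx\le c\int_0^\delta x^{\alpha-2}(\delta-x)^{1-\alpha}\,dx=c\int_0^1x^{\alpha-2}(1-x)^{1-\alpha}\,dx.\tag{3.4}$$
Combining (3.2)–(3.4) and noticing that $\Delta_G^{\alpha/2}h\in L^1(G)$ by Corollary 2.6, we obtain (3.1).

Lemma 3.2. Let $0<\alpha<2$, $T>0$, $u\in C^1(0,T]$ and $v\in C[0,T]$. If $F^{2-\alpha}u(0)$ exists, then
$$\lim_{\delta\downarrow0}A(1,-\alpha)\int_0^\delta\int_\delta^T\frac{(u(y)-u(x))v(x)}{|x-y|^{1+\alpha}}\,dx\,dy=-C_\alpha v(0)F^{2-\alpha}u(0),\tag{3.5}$$
where
$$\begin{aligned}
C_\alpha&=\frac{A(1,-\alpha)}{\alpha(\alpha-1)}\int_0^\infty\frac{|s-1|^{1-\alpha}-(s\vee1)^{1-\alpha}}{s^{2-\alpha}}\,ds, &&\alpha\in(0,1)\cup(1,2),\\
C_\alpha&=A(1,-\alpha)\int_0^\infty\frac{\ln(s\vee1)-\ln|s-1|}{s}\,ds, &&\alpha=1.
\end{aligned}\tag{3.6}$$
In particular, if $F^{2-\beta}u(0)$ exists for some $\beta>\alpha$, then
$$\lim_{\delta\downarrow0}A(n,-\alpha)\int_0^\delta\int_\delta^T\frac{(u(y)-u(x))v(x)}{|x-y|^{1+\alpha}}\,dx\,dy=0.\tag{3.7}$$

Proof. We will prove this lemma in the Appendix.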
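The last integral in (3.4) is a Beta integral, $B(\alpha-1,2-\alpha)$, which is finite exactly when $1<\alpha<2$, and the substitution $x=\delta s$ shows that it does not depend on $\delta$. The following is a minimal numerical sketch of both facts. The function names and the logistic change of variables are our own choices for illustration and do not come from the paper.

```python
import math


def beta_closed_form(alpha: float) -> float:
    """B(alpha-1, 2-alpha) = Gamma(alpha-1) * Gamma(2-alpha), since Gamma(1) = 1."""
    if not 1.0 < alpha < 2.0:
        raise ValueError("the integral converges only for 1 < alpha < 2")
    return math.gamma(alpha - 1.0) * math.gamma(2.0 - alpha)


def beta_by_quadrature(alpha: float, delta: float = 1.0, step: float = 0.01) -> float:
    """Approximate int_0^delta x^(alpha-2) (delta-x)^(1-alpha) dx.

    Uses x = delta * s with s = 1 / (1 + exp(-w)), so that ds = s (1 - s) dw and
    the integrand becomes s^a (1-s)^b on the whole real line, which decays
    exponentially and is well suited to the trapezoid rule.
    """
    a, b = alpha - 1.0, 2.0 - alpha
    half_width = 40.0 / min(a, b)  # tails are below exp(-40) beyond this point
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        w = -half_width + k * step
        log_s = -math.log1p(math.exp(-w))          # log s
        log_one_minus_s = -math.log1p(math.exp(w))  # log (1 - s)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(a * log_s + b * log_one_minus_s)
    # The powers of delta cancel: delta^(alpha-2) * delta^(1-alpha) * delta = 1.
    scale = delta ** (alpha - 2.0) * delta ** (1.0 - alpha) * delta
    return scale * total * step
```

For example, `beta_by_quadrature(1.5, 0.1)` and `beta_by_quadrature(1.5, 1.0)` should agree with `beta_closed_form(1.5)`, which equals $\Gamma(1/2)^2=\pi$.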

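Theorem 3.3. Let $G$ be a bounded $C^2$ open set in $\mathbb{R}^n$, $1<\alpha<2$ and $u,v\in\bigcup_{\beta\ge\alpha}D_\beta(G)$. Then we have
$$\frac12A(n,-\alpha)\int_{G\times G}\frac{(u(x)-u(y))(v(x)-v(y))}{|x-y|^{n+\alpha}}\,dx\,dy=C_{n,\alpha}\int_{\partial G}v(x)F^{2-\alpha}u(x)\,m(dx)-\int_Gv(x)\Delta_G^{\alpha/2}u(x)\,dx,\tag{3.8}$$
where
$$\frac{A(1,-\alpha)}{A(n,-\alpha)}C_{n,\alpha}=C_\alpha\int_{\{|x|=1,\,x_n>0,\,x\in\mathbb{R}^n\}}x_n^\alpha\,m(dx)=\begin{cases}C_\alpha,&n=1,\\[2pt]2C_\alpha\displaystyle\int_0^{\pi/2}\cos^\alpha\theta\,d\theta,&n=2,\\[6pt]C_\alpha\dfrac{2\pi^{\frac{n-1}{2}}}{\Gamma\left(\frac{n-1}{2}\right)}\displaystyle\int_0^{\pi/2}\cos^\alpha\theta\,\sin^{n-2}\theta\,d\theta,&n\ge3.\end{cases}\tag{3.9}$$
The constant number $C_\alpha$ is defined by (3.6). Notations $F^{2-\alpha}u$ and $D_\alpha(G)$ are explained in Definition 2.1 and (2.5) respectively. In particular, if $u,v\in\bigcup_{\beta>\alpha}D_\beta(G)$, then
$$\frac12A(n,-\alpha)\int_{G\times G}\frac{(u(x)-u(y))(v(x)-v(y))}{|x-y|^{n+\alpha}}\,dx\,dy=-\int_Gv(x)\Delta_G^{\alpha/2}u(x)\,dx.$$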
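The surface integral $\int_{\{|x|=1,\,x_n>0\}}x_n^\alpha\,m(dx)$ in (3.9) can be evaluated numerically from the angular integrals listed there. The sketch below does this with a midpoint rule and compares the result with the closed form $\pi^{(n-1)/2}\Gamma(\frac{\alpha+1}{2})/\Gamma(\frac{n+\alpha}{2})$. That closed form is a standard Beta-function evaluation which we add as an assumption for cross-checking; it is not stated in the text. The function names are hypothetical.

```python
import math


def hemisphere_moment_closed_form(n: int, alpha: float) -> float:
    """Closed form of int_{|x|=1, x_n>0} x_n^alpha m(dx) (our cross-check, not from the paper)."""
    return (math.pi ** ((n - 1) / 2.0)
            * math.gamma((alpha + 1.0) / 2.0)
            / math.gamma((n + alpha) / 2.0))


def hemisphere_moment_quadrature(n: int, alpha: float, panels: int = 20000) -> float:
    """Midpoint-rule evaluation of the three cases n = 1, n = 2, n >= 3 in (3.9)."""
    if n == 1:
        return 1.0  # the "hemisphere" is the single point x = 1
    h = (math.pi / 2.0) / panels
    integral = 0.0
    for k in range(panels):
        theta = (k + 0.5) * h
        integral += math.cos(theta) ** alpha * math.sin(theta) ** (n - 2)
    integral *= h
    if n == 2:
        return 2.0 * integral
    # Surface measure of the unit sphere in R^(n-1).
    sphere_measure = 2.0 * math.pi ** ((n - 1) / 2.0) / math.gamma((n - 1) / 2.0)
    return sphere_measure * integral
```

Multiplying either value by $C_\alpha A(n,-\alpha)/A(1,-\alpha)$ would give $C_{n,\alpha}$. The normalising constants $A(\cdot,-\alpha)$ and $C_\alpha$ are defined elsewhere in the paper and are deliberately not reproduced here.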

Proof. In what follows we assume that n ≥ 3. The proof of the other cases is similar. First we assume u, v ∈ Dα (G). We will prove (3.8) in three steps. Step 1. By Theorem 1.2 and Corollary 2.6, 1 (u(x) − u(y))(v(x) − v(y)) A(n, −α) d xd y 2 |x − y|n+α G×G 1 (u(x) − u(y))(v(x) − v(y)) d xd y = lim A(n, −α) δ↓0 2 |x − y|n+α G δ ×G δ α v(x)G2 δ u(x) d x = − lim δ↓0 G δ α u(y) − u(x) dx v(x) dy − lim v(x)G2 u(x) d x = lim A(n, −α) n+α δ↓0 δ↓0 G δ |x − y| Gδ G δ u(y) − u(x) dx v(y) dy = lim A(n, −α) δ↓0 |x − y|n+α Gδ Gδ (v(y) − v(x))(u(y) − u(x)) − lim A(n, −α) d xd y δ↓0 |x − y|n+α G δ G δ α − v(x)G2 u(x) d x G α u(y) − u(x) v(x) d xd y − v(x)G2 u(x) d x. (3.10) = − lim A(n, −α) n+α δ↓0 |x − y| G δ G δ G n−1 In order to calculate the first term in (3.10), we introduce some notations. Let (θi )i=1 ∈R such that 0 ≤ θi ≤ π, i = 1, 2, . . . , n − 2, and 0 ≤ θn−1 < 2π . Set − → θ = (sin θ1 sin θ2 . . . sin θn−1 , . . . , sin θ1 cos θ2 , cos θ1 ). − → As G δ is an open set, for any direction θ and x ∈ G, the set − → − → K xδ, θ = {y : y = x + t θ ∈ G δ f or some t ∈ R} − →

is an one dimensional open set. We can check that K xδ, θ is the union of finite one dimen− → − → sional open intervals. For each open interval U of K xδ, θ , denote a1 = x + t1 θ , a2 = − → − → − → x + t2 θ , where t1 = inf{t : y = x + t θ ∈ U } and t2 = sup{t : y = x + t θ ∈ U }. In what follows a1 and a2 are called the left end point and the right end point of the interval U respectively. It is easy to see that a1 , a2 ∈ ∂G ∪ ∂G δ . So we have four types of open intervals: 1 : a1 ∈ ∂G, a2 ∈ ∂G δ , 3 : a1 ∈ ∂G δ , a2 ∈ ∂G,

2 : a1 ∈ ∂G, a2 ∈ ∂G, 4 : a1 ∈ ∂G δ , a2 ∈ ∂G δ .


According to these four types of intervals, G δ can be divided into four parts.

− →

G iδ, θ =

− →

{(a1 , a2 ) : (a1 , a2 ) is a type i open interval of K xδ, θ },

x∈G

i = 1, 2, 3, 4. Define − → − → 1δ, θ = x ∈ ∂G : x is the left end point of a type 1 open interval of K xδ, θ , − → − → 2δ, θ = x ∈ ∂G δ : x is the left end point of a type 4 open interval of K xδ, θ . (3.11) − →

− →

For x ∈ 1δ, θ ∪ 2δ, θ , denote − →

θ = {(a1 , a2 ) : L δ, x

− →

(a1 , a2 ) is the open interval of K xδ, θ taking x as the left end point}. − →

For x ∈ 1δ, θ , denote − →

Hxδ, θ =

t>0

− → − → − → − → θ z ∈ K xδ, θ : z = x − t θ , z ∈ / G δ, and {y : y = x − s θ , 4

− → s ∈ (0, t)} ∩ 1δ, θ = ∅ .

Write ϕ(θ1 , . . . , θn−2 ) = sinn−2 θ1 sinn−3 θ2 . . . sin θn−2 . We have

u(y) − u(x) v(x) d xd y n+α G δ |x − y| π π v(x) d x dθ1 . . . dθn−2 = A(n, −α)

A(n, −α)

G δ

G δ

0

0

2π

ϕ(θ1 , . . . , θn−2 ) dθn−1

0

− → u(x + r θ ) − u(x) dr − → r 1+α {x+r θ ∈G δ ,r >0} π 2π π dθ1 . . . dθn−2 ϕ(θ1 , . . . , θn−2 ) dθn−1 = A(n, −α)

·

·

0

4 − →

i=1

G iδ, θ

v(x) d x

0

0

− → {x+r θ ∈G δ ,r >0}

− → u(x + r θ ) − u(x) dr. r 1+α

(3.12)


− → As ∂G is C 2 , we can calculate the integral in (3.12) along θ and find that

∞

v(x) d x

− →

θ G δ, 1

0

=

− → 1δ, θ

− → u(x + r θ ) − u(x) → I{x+r − dr θ ∈G δ } r 1+α

− → →

− n (x), θ m(d x)

∞

·

I 0

− → − → θ {x+t θ ∈L δ, } x

− → v(x + t θ ) dt

∞ → I{x+(t+r )− θ ∈G

0

− → − → u(x + (t + r ) θ ) − u(x + t θ ) · dr. r 1+α

δ}

(3.13)

− → − → If x ∈ G iδ, θ (i = 2, 3) and {z : z = x + t θ ∈ G δ f or some t > 0} = ∅, we can check that − → − → {z : z = x + t θ f or some t > 0} ∩ 1δ, θ = ∅.

So we have similarly

3

G iδ, θ

i=2

=

∞

− → v(x) d x

→ I{x+r − θ ∈G

0

− → u(x + r θ ) − u(x) dr δ} r 1+α

− → − → − → n (x), θ m(d x)

1δ, θ

·

∞

I 0

− → − → {x−t θ ∈Hxδ, θ }

− → v(x + t θ ) dt

0

∞ → I{x+r − θ ∈G

− → − → u(x + r θ ) − u(x − t θ ) dr. δ} (t + r )1+α (3.14)

We also have

− →

θ G δ, 4

v(x) d x 0

=

∞

− → 2δ, θ

− → u(x + r θ ) − u(x) → I{x+r − dr θ ∈G δ } r 1+α

− → → v(x) − n (x), − θ m(d x)·

∞

·

I 0

− → − → θ {x+t θ ∈L δ, } x

− → v(x + t θ ) dt

0

∞ → I{x+(t+r )− θ ∈G

− → − → u(x + (t + r ) θ ) − u(x + t θ ) · dr. r 1+α → In (3.15) − n (x) is the inner normal vector of ∂G δ at point x.

δ}

(3.15)


Step 2. In this step we will prove that − → →

− n (x), θ x∈

∞

I

− → 1δ, θ ,

0

− → − → θ {x+t θ ∈L δ, } x

∞ → I{x+(t+r )− θ ∈G δ }

dt 0

− → − → |u(x + (t + r ) θ ) − u(x + t θ )| dr, r 1+α

(3.16) − → − → ∞ ∞ |u(x + r θ ) − u(x − t θ )| − → → − → → I I{x+r − dr,

− n (x), θ − → δ, θ dt θ ∈G } δ {x−t θ ∈Hx } (t + r )1+α 0 0

− →

x ∈ 1δ, θ , − → →

− n (x), − θ

∞

I 0

− → dt − → θ {x+t θ ∈L δ, } x

(3.17) ∞

0

→ I{x+(t+r )− θ ∈G

δ}

− → − → − → |u(x + (t + r ) θ ) − u(x + t θ )| δ, θ × dr, x ∈ (3.18) 2 r 1+α − → → are all uniformly bounded for δ > 0, θ and the corresponding x. The − n (x) in (3.16) − → and (3.17) and the n (x) in (3.18) are the inner normal vector of ∂G and ∂G at point x δ

respectively. When u ∈ C 2 (G), (3.16)–(3.18) are easy to check. Hence we can assume that u = f h (h and f are the functions explained in (2.3) and (2.5) respectively). With f (y)h(y) − f (x)h(x) = ( f (y) − f (x))h(y) + (h(y) − h(x)) f (x) applied to (3.16)–(3.18), the proof can be again reduced to the case u = h(x). By (2.3) this is equivalent to prove (3.16)-(3.18) when h(x) = ρ(x)α−1 . Thus in what follows we can assume that u(x) = ρ(x)α−1 . Before we prove (3.16)–(3.18), we give a lemma for preparation. The proof of this lemma will be given in the Appendix. Lemma 3.4. Let G be a bounded C 2 open set. The following assertions are true: − → − → → (i) − n (x), θ ≥ 0, ∀ x ∈ 1δ, θ . − → (ii) There exists a positive number k1 such that for each unit vector θ ,

t → − → − → − → → n (x), θ , ∀ x ∈ ∂G, 0 < t < k1 − ρ(x + t θ ) ≥ − n (x), θ . 4 (iii) Denote ρδ (y) the distance function between point y and ∂G δ . There exists a positive − → number k2 such that for each unit vector θ , t → − → − → − → → n (x), θ , ∀ x ∈ ∂G δ , 0 < t < k2 − ρδ (x − t θ ) ≥ − n (x), θ . 4 → In this case − n (x) is the inner normal vector of ∂G δ at point x. Next we prove (3.16) by two cases. − → → Case 1. − n (x), θ ≥

1

3δ 2 1 k12

− →

and x ∈ 1δ, θ . Denote − → − → θ t ∗ = sup{t : x + t θ ∈ L δ, x }.

− →

1

θ By (ii) in Lemma 3.4, we see that the length of L δ, is less than 3δ 2 k12 and hence x − → − → → n (x), θ for t ∈ (0, t ∗ ). Therefore by α > 1, ρ(x + t θ ) ≥ 4t −

− → − →

n (x), θ

∞

I

0

− → − → θ {x+t θ ∈L δ, } x

1

∞

dt 0

→ I{x+(t+r )− θ ∈G

δ}

− → − → |ρ(x + (t + r ) θ )α−1 − ρ(x + t θ )α−1 | × dr r 1+α − → t∗ ∞ → ( 4t )α−2 − n (x), θ α−2 − → − → − → ≤ (α − 1) n (x), θ dt I{x+(t+r ) θ ∈G } dr δ rα 0 0 t∗ − → → ≤ 4 − n (x), θ α−1 t α−2 (t ∗ − t)1−α dt 0

1

≤4

t α−2 (1 − t)1−α dt.

(3.19)

0

− → → Case 2. 0 < − n (x), θ <

1

3δ 2 1 k12

− → →

− n (x), θ 0

− →

and x ∈ 1δ, θ . We have

∞

I

− → − → θ {x+t θ ∈L δ, } x

dt 0

∞ → I{x+(t+r )− θ ∈G

δ}

− → − → |ρ(x + (t + r ) θ )α−1 − ρ(x + t θ )α−1 | × dr r 1+α t∗ ∞ − → → → ≤ − n (x), θ dt I{x+(t ∗ +r )− θ ∈G } t ∗ −δ

δ

0

− → − → |ρ(x + (t ∗ + r ) θ )α−1 − ρ(x + t θ )α−1 | × dr (t ∗ − t + r )1+α ∞ t ∗ −δ − → − → → dt I{x+(t ∗ +r )− + n (x), θ θ ∈G } 0

0

δ

− → − → |ρ(x + (t ∗ + r ) θ )α−1 − ρ(x + t θ )α−1 | × dr. (t ∗ − t + r )1+α

(3.20)

− → When t ∗ − δ < t < t ∗ , it is easy to see ρ(x + t θ ) ≥ δ − (t ∗ − t). Therefore − → →

− n (x), θ

t∗ t ∗ −δ t∗

0

≤ (α − 1)

1

≤ 0

t ∗ −δ

∞

dt dt 0

→ I{x+(t ∗ +r )− θ ∈G ∞

− → − → |ρ(x + (t ∗ + r ) θ )α−1 − ρ(x + t θ )α−1 | dr δ} (t ∗ − t + r )1+α

(δ − (t ∗ − t))α−2 dr (t ∗ − t + r )α

t α−2 (1 − t)1−α dt.

(3.21)


For the second term in (3.20), we have − → →

− n (x), θ

t ∗ −δ 0

1

≤

3δ 2

1 2

1

3δ 2 1

k12

0

t ∗ −δ 0

0 t ∗ −δ

→ I{x+(t ∗ +r )− θ ∈G

− → − → |ρ(x + (t ∗ + r ) θ )α−1 − ρ(x + t θ )α−1 | dr δ} (t ∗ − t + r )1+α

∞

dt

k1 ≤

∞

dt

→ I{x+(t ∗ +r )− θ ∈G

δ}

1 dr (t ∗ − t + r )2

1

3δ 2 1 dt ≤ 1 (ln dG − ln δ). ∗ t −t k12

0

(3.22)

Combining (3.20)–(3.22), we obtain the assertion on (3.16). Now we prove the assertion on (3.17). As the boundary of G is C 2 , we can check that there exists a positive number k3 such that − →

− → − → → n (x), θ , f or y ∈ Hxδ, θ , x ∈ 1δ, θ . |y − x| ≥ k3 −

Therefore − → →

− n (x), θ 0

∞

I

− → − → {x−t θ ∈Hxδ, θ }

− → → ≤ − n (x), θ − → → ≤ − n (x), θ

0

→ I{x+r − θ ∈G

∞

I

∞

dt

0

− → − → {x−t θ ∈Hxδ, θ }

∞

I 0

− → − → {x−t θ ∈Hxδ, θ }

dt 0

− → − → |ρ(x + r θ )α−1 − ρ(x − t θ )α−1 | dr δ} (t + r )1+α

∞ → I{x+r − θ ∈G

δ}

1 dr (t + r )2

1 dt t

− → − → → → ≤ − n (x), θ ln dG − ln(k3 − n (x), θ ) , − → → which is bounded for all − n (x), θ . − →

For any x ∈ 2δ, θ , it follows from (iii) in Lemma 3.4 that 1

3δ 2 − → → 0 < − n (x), − θ ≤ 1 , k22

(3.23)

→ where − n (x) is the inner normal direction of ∂G δ at x. With the help of (3.23) we can prove the assertion on (3.18) by the method used in (3.21) and (3.22).

Step 3. In this step we show that the values of the second, third and fourth terms in (3.12) go to zero when $\delta\downarrow0$. We will also prove that the value of the first term in (3.12) goes to
$$-C_{n,\alpha}\int_{\partial G}v(x)F^{2-\alpha}u(x)\,m(dx).\tag{3.24}$$

It is easy to check that the term in (3.17) goes to zero when δ ↓ 0. Hence by (3.14) and dominated convergence theorem,

π

lim δ↓0

0

3

ϕ(θ1 , . . . , θn−2 ) dθn−1

0

− → v(x) d x

− → {x+r θ ∈G δ ,r >0}

G iδ, θ

i=2

2π

dθn−2

0

·

π

dθ1 . . . ,

− → u(x + r θ ) − u(x) dr = 0. r 1+α

Let C1 be the bound of (3.18) and N = sup |v(x)|. By Lemma 2.2 we see that C2 =: x∈G · m(d x) < ∞. Hence by (3.15) and (3.23), sup

0<δ<δ1 ∂G δ

π

lim δ↓0

dθ1 . . .

0

π

2π

dθn−2 0

ϕ(θ1 , . . . , θn−2 ) dθn−1

0

− → |v(x)| d x

θ G δ, 4

− → {x+r θ ∈G δ ,r >0}

≤ C1 N lim δ↓0

− → 4δ, θ

π

m(d x)

− → |u(x + r θ ) − u(x)| dr r 1+α

dθ1 . . .

0

π

dθn−2 0

2π

ϕ(θ1 , . . . , θn−2 ) dθn−1

0

≤ C1 N lim δ↓0

π 0

∂G δ

m(d x)·

π

dθ1 . . .

2π

dθn−2 0

ϕ(θ1 , . . . , θn−2 )I

0

1

− → → {0< − n (x),− θ ≤ 3δ12 }

dθn−1

k12

1

≤ C1 C2 N (2π )n lim δ↓0

3δ 2 1 2

= 0.

(3.25)

k1

Noticing that for fixed x ∈ ∂G, ∞ ∞ → 1 − − → → − → − → − → → − → ,θ { θ : − n (x), θ > 0} ⊆ { θ : x ∈ 1m } ⊆ { θ : − n (x), θ ≥ 0}, n=1 m=n


hence by Lemma 2.7, Lemma 3.2, (3.13) and dominated convergence theorem, we obtain

π

dθ1 . . .

lim δ↓0

0

π

2π

dθn−2 0

ϕ(θ1 , . . . , θn−2 ) dθn−1

0

− → u(x + r θ ) − u(x) · dr − → v(x) d x − → θ r 1+α G δ, {x+r θ ∈G δ ,r >0} 1 2π π π m(d x) dθ . . . dθ ϕ(θ1 , . . . , θn−2 ) dθn−1 = lim − → 1 n−2

δ↓0

1δ, θ

− → → · − n (x), θ

0

0

∞

I 0

− → − → θ {x+t θ ∈L δ, } x

0

− → v(x + t θ ) dt

− → − → u(x + (t + r ) θ ) − u(x + t θ ) → I{x+(t+r )− dr θ ∈G δ } r 1+α 0 (α − 1)Cα = v(x) f (x) m(d x) A(1, −α) ∂G π 2π π − → → − → dθ1 . . . dθn−2 ϕ(θ1 , . . . , θn−2 ) − n (x), θ α I{ − dθ · → n (x), θ >0} n−1

∞

·

0

0

0

π π 2 (α − 1)Cα = v(x) f (x) m(d x) dθ1 . . . dθn−2 A(1, −α) ∂G 0 0 2π · ϕ(θ1 , . . . , θn−2 ) cosα θ1 dθn−1 0 (α − 1)Cα = v(x) f (x) m(d x) xnα m(d x) n A(1, −α) ∂G {|x|=1,xn >0,x∈R } −Cn,α 2−α = v(x)F u(x) m(d x), A(n, −α) ∂G

(3.26)

which proves (3.24). By (3.10), (3.12) and the assertions at the beginning of Step 3 we finally obtain (3.8). When $u,v\in\bigcup_{\beta>\alpha}D_\beta(G)$, with the help of (3.7), we can prove the last conclusion of this theorem in the same way.

When $0<\alpha\le1$, Theorem 3.3 is not true for functions in $D_\alpha(G)$, because we do not have results similar to Lemma 3.1. In fact, if we choose $u_1(x)=\ln x$ and $u_\alpha(x)=x^{\alpha-1}$ for $0<\alpha<1$, we can check that $u_\alpha\in D_\alpha([0,1])$ for $0<\alpha\le1$ and
$$\int_{(0,1)\times(0,1)}\frac{(u_\alpha(x)-u_\alpha(y))^2}{|x-y|^{1+\alpha}}\,dx\,dy=\infty.$$
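One way to see the divergence is a scaling argument, sketched here under our own notation $I(\lambda)$ (which does not appear in the paper), for the integral restricted to $(0,\lambda)\times(0,\lambda)$:

```latex
\begin{align*}
I(\lambda) &:= \int_{(0,\lambda)\times(0,\lambda)}
      \frac{(u_\alpha(x)-u_\alpha(y))^2}{|x-y|^{1+\alpha}}\,dx\,dy ,
      \qquad 0<\lambda\le 1,\\
I(\lambda) &= \lambda^{2(\alpha-1)}\,\lambda^{-(1+\alpha)}\,\lambda^{2}\, I(1)
            = \lambda^{\alpha-1}\, I(1)
      \qquad (0<\alpha<1,\ x\mapsto\lambda x,\ y\mapsto\lambda y),\\
I(\lambda) &= \lambda^{-2}\,\lambda^{2}\, I(1) = I(1)
      \qquad (\alpha=1,\ \text{since } \ln(\lambda x)-\ln(\lambda y)=\ln x-\ln y).
\end{align*}
```

Since the integrand is nonnegative, $I(\lambda)\le I(1)$ for $\lambda<1$. If $I(1)$ were finite and $0<\alpha<1$, then $\lambda^{\alpha-1}>1$ would force $I(1)=0$, which is impossible because $u_\alpha$ is not constant. If $I(1)$ were finite and $\alpha=1$, then $I(1/2)=I(1)$ would force the integral over $(0,1)^2\setminus(0,\tfrac12)^2$ to vanish, which is again impossible. Hence $I(1)=\infty$ in both cases.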

However, in this case we have another type of integration by parts formula, which is formulated in Theorem 1.3.

Proof of Theorem 1.3. Formula (1.3) follows from (3.8) directly. Next we prove (1.4) for $0<\alpha<1$. The case of $\alpha=1$ can be proved in the same way. As in Theorem 3.3, we only prove (1.4) for $u\in D_\alpha(G)$. By Corollary 2.6 and Theorem 1.2,

α

v(x)G2 u(x) d x G α = lim v(x)G2 δ u(x) d x + lim A(n, −α) δ↓0

α 2

= lim δ↓0

δ↓0

Gδ

Gδ

u(x)G δ v(x) d x + lim A(n, −α)

δ↓0

dx Gδ

dx

Gδ

G δ G δ

v(x)

u(y) − u(x) dy |x − y|n+α

v(y)

u(y) − u(x) dy |x − y|n+α

(v(y) − v(x))(u(y) − u(x)) − lim A(n, −α) d xd y δ↓0 |x − y|n+α Gδ Gδ α u(y) − u(x) = u(x)G2 v(x) d x − lim A(n, −α) dx v(x) dy δ↓0 |x − y|n+α G G δ Gδ v(y) − v(x) dx u(x) dy. + lim A(n, −α) δ↓0 |x − y|n+α Gδ Gδ

(3.27)

By similar calculation as in (3.3), we have lim A(n, −α) δ↓0

G δ

dx

u(x) Gδ

v(y) − v(x) dy = 0. |x − y|n+α

(3.28)

Now we follow the proof in Theorem 3.3 to calculate the second term in (3.27). We can use the same arguments as in Theorem 3.3 before Step 2. Next we start with Step 2 and adopt the same notations in Theorem 3.3. 1 − → − → − → → First we assume that − n (x), θ ≥ 3δ12 and x ∈ 1δ, θ . Notice that ρ(x + t θ ) ≥ − → → t − 4 n (x), θ

for t ∈

− → →

− n (x), θ

(0, t ∗ ),

k12

we have

∞

I 0

− → − → θ {x+t θ ∈L δ, } x

dt 0

∞ → I{x+(t+r )− θ ∈G

δ}

− → − → |ρ(x + (t + r ) θ )α−1 − ρ(x + t θ )α−1 | × dr r 1+α − → t∗ ∞ → ( 4t )α−1 − n (x), θ α−1 − → − → − → dt I{x+(t+r ) θ ∈G } dr ≤ n (x), θ δ r 1+α 0 0 ∗ t 41−α − − →

→ n (x), θ α ≤ t α−1 (t ∗ − t)−α dt α 0 41−α 1 α−1 ≤ t (1 − t)−α dt. (3.29) α 0

− →

− →

− →

θ δ, θ θ → (x) to be the point in 1 For x ∈ G δ, we denote ξδ,− such that x ∈ L δ, . By → (x) 1 ξ − θ δ, θ

(3.29) and the same calculation as in (3.24), δ↓0

0

− → θ G δ, 1

π

lim A(n, −α) v(x) d x

π

dθ1 . . . 0

∂G

ϕ(θ1 , . . . , θn−2 ) dθn−1

0

− → {x+r θ ∈G δ ,r >0}

I

= −Cn,α

2π

dθn−2

1 − → 2 → → (x)), θ ≥ 3δ1 { − n (ξδ,− θ k12

v(x)F 2−α u(x) m(d x).

− → → → (x)), θ < Next we assume that 0 < − n (ξδ,− θ

π

δ↓0

− → θ x∈G δ, 1

0

π

dθ1 . . .

lim sup

− → u(x + r θ ) − u(x) dr r 1+α }

2π

dθn−2 0

(3.30)

1

3δ 2 1 k12

. We claim that

ϕ(θ1 , . . . , θn−2 )I

0

− →

I

1

θ − → 2 → G δ, 1 → (x)), θ < 3δ1 } {0< − n (ξδ,− θ k12

× dθn−1 = 0.

(3.31) − →

θ To this end we fix a point x ∈ G δ, 1 . Without loss of generality we assume that ξ(x) (the boundary point defined in Lemma 2.1) is the origin and x = (0, . . . , 0, ρ(x)). 1 − → 2 → → (x)), θ < 3δ1 , we can find a constant number k By Lemma 2.2, if 0 < − n (ξδ,− θ k12

1 2

independent of x such that | cos θ1 | < kδ . From this fact we obtain (3.31). By (3.31),

π

lim A(n, −α) δ↓0

θ G δ, 1

2π

dθn−2 0

− → dx

π

dθ1 . . .

0

·

ϕ(θ1 , . . . , θn−2 ) dθn−1

0

− → {x+r θ ∈G δ ,r >0}

I

1 − → 2 → → (x)), θ < 3δ1 {0< − n (ξδ,− θ k12

− → |ρ(x + r θ )α−1 − ρ(x)α−1 | dr r 1+α }

A(n, −α) δ ≤ lim dt t α−1 (δ − t)−α m(d x) δ↓0 α ρ(x)=t 0 π 2π π dθ1 . . . dθn−2 ϕ(θ1 , . . . , θn−2 ) · 0

·I

− → θ G δ, 1

0

I

0 1

− → 2 → → (x)), θ < 3δ1 } {0< − n (ξδ,− θ k12

dθn−1 = 0,


which leads to

π

lim A(n, −α) δ↓0

·

0

− → v(x) d x

θ G δ, 1

π

dθ1 . . . 0

2π

dθn−2 0

I


ϕ(θ1 , . . . , θn−2 ) dθn−1 − → u(x + r θ ) − u(x) dr = 0. 1 − → 2 r 1+α − → (x)), θ < 3δ1 }

− → → {x+r θ ∈G δ ,r >0} {0< − n (ξδ, θ

k12

(3.32) Similarly we can also prove that π π dθ1 . . . dθn−2 lim δ↓0

0

·

0

4 − →

i=2

G iδ, θ

ϕ(θ1 , . . . , θn−2 ) dθn−1

0

v(x) d x

2π

− → {x+r θ ∈G δ ,r >0}

− → u(x + r θ ) − u(x) dr = 0. r 1+α

(3.33)

Combining (3.27), (3.28), (3.30), (3.32) and (3.33), we obtain (1.4).

Theorem 3.5. Let $G$ be a $C^2$ open set in $\mathbb{R}^n$. Suppose that functions $u,v\in\bigcup_{\beta\ge\alpha}D_\beta(G)$ when $1<\alpha<2$, and $u\in\bigcup_{\beta\ge\alpha}D_\beta(G)$, $v\in C^2(G)$ when $0<\alpha\le1$. If $u,v$ have compact support, then the results in Theorem 3.3 and Theorem 1.3 are also true.

Proof. By straightforward calculation we can prove the same results as in Lemma 3.1 under the assumptions of this theorem. Since the supports of $u,v$ are compact, we can choose a bounded open set $U\subset\mathbb{R}^n$ such that $U\cap G$ is a $C^2$ open set and $G\cap(\mathrm{supp}[u]\cup\mathrm{supp}[v])\subset U$ (here $\mathrm{supp}[u]:=\overline{\{x:u(x)\neq0\}}$). Applying Theorem 3.3 to the operator $\Delta_{U\cap G}^{\alpha/2}$, we get the conclusion.

Remark 3.1. There is no boundary term for the regional fractional Laplacian in Theorem 1.2. When $0<\alpha\le1$, the probabilistic explanation is that the $\alpha$-stable processes do not touch the boundary from inside (the capacity of the boundary is zero, see [2]). By Theorem 1.3, to get the boundary term for the regional fractional Laplacian, the function must change nonlinearly near the boundary. The smaller $\alpha$ is, the stronger this nonlinearity must be. When $\alpha\le1$ the boundary term appears only for unbounded functions. From Theorem 1.3, we can also state the free boundary problem for the regional fractional Laplacian. A simpler form is
$$\begin{cases}-\Delta_G^{\alpha/2}u(x)=f(x),&\forall x\in G,\\F^{2-\alpha}u(x)=g(x),&\forall x\in\partial G,\end{cases}\tag{3.34}$$
where $f$ and $g$ are smooth functions defined on $G$ and $\partial G$ respectively. By (1.3) we have the following restriction for (3.34):
$$\int_Gf(x)\,dx=-C_{n,\alpha}\int_{\partial G}g(x)\,m(dx).$$

In [11] a kind of boundary value problem for the weak regional fractional Laplacian is studied. The boundary condition in [11] can be understood as the weak form of the condition: $F^{2-\alpha}u(x)=0$ on the reflected boundary.
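The restriction on $f$ and $g$ stated for (3.34) can be read off from the integration by parts formula. As a sketch, assume $1<\alpha<2$, that $u$ solves (3.34) and belongs to the class of Theorem 3.3, and take the constant test function $v\equiv1$ in (3.8). The choice of $v$ is ours, made for illustration.

```latex
\begin{align*}
0 &= \frac12 A(n,-\alpha)\int_{G\times G}
      \frac{(u(x)-u(y))\,(1-1)}{|x-y|^{n+\alpha}}\,dx\,dy\\
  &= C_{n,\alpha}\int_{\partial G} F^{2-\alpha}u(x)\,m(dx)
     - \int_G \Delta_G^{\alpha/2}u(x)\,dx\\
  &= C_{n,\alpha}\int_{\partial G} g(x)\,m(dx) + \int_G f(x)\,dx .
\end{align*}
```

Rearranging gives $\int_G f\,dx=-C_{n,\alpha}\int_{\partial G}g\,m(dx)$, so the data of (3.34) cannot be prescribed independently. This plays the role of the compatibility condition in the classical Neumann problem.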


4. Regional Fractional-like Laplacian and Reflected Stable-like Processes

In this section we consider more general nonlocal operators. We also study the connections between these operators and their probabilistic counterparts. For convenience of discussion we assume that $G$ is at least a Lipschitz open set in $\mathbb{R}^n$ in this section. In [6] a class of more general jumping processes on $G$ was considered, of which the Dirichlet form is
$$\begin{aligned}
\mathcal{E}^\kappa\langle u,v\rangle&=\frac12A(n,-\alpha)\int_{G\times G}\frac{\kappa(x,y)(u(x)-u(y))(v(x)-v(y))}{|x-y|^{n+\alpha}}\,dx\,dy,\\
\mathcal{F}^\kappa&=\Big\{u\in L^2(G):\int_{G\times G}\frac{(u(x)-u(y))^2}{|x-y|^{n+\alpha}}\,dx\,dy<\infty\Big\},
\end{aligned}\tag{4.1}$$
where $\kappa(x,y)$ is a positive symmetric function on $G\times G$ and is bounded between two strictly positive numbers. We know that the Dirichlet space defined by (4.1) is equivalent to the $\frac{\alpha}{2}$-order Sobolev space. In [6] the stochastic process related to the Dirichlet form (4.1) is called the reflected stable-like process. The case of $\kappa\equiv1$ is what we discuss in Sect. 2 and Sect. 3. Another typical case of $\kappa$ is determined by the following Dirichlet form:
$$\begin{aligned}
\mathcal{E}^\kappa\langle u,v\rangle&=\int_Gv\,(-A)^{\alpha/2}u\,dx,\qquad u,v\in D((-A)^{\alpha/2}),\\
\mathcal{F}^\kappa&=\Big\{u\in L^2(G):\int_{G\times G}\frac{(u(x)-u(y))^2}{|x-y|^{n+\alpha}}\,dx\,dy<\infty\Big\},
\end{aligned}\tag{4.2}$$
where $A$ is the $L^2$ adjoint operator of a Laplacian with Neumann boundary and $D((-A)^{\alpha/2})$ is the domain of the $\frac{\alpha}{2}$ power of $-A$. In [6], Chen and Kumagai give the following results.

Theorem 4.1 (Theorem 1.1 in [6]). Let $G$ be a Lipschitz open set in $\mathbb{R}^n$. If $\kappa(x,y)$ is a symmetric function on $G\times G$ and is bounded between two strictly positive constants, then there exists a Feller process on $G$ related to the Dirichlet form (4.1). This process has a Hölder continuous density function $p(t,x,y)$. Moreover, there exist constant numbers $b_1,b_2>0$ such that
$$b_1\Big(t^{-n/\alpha}\wedge\frac{t}{|x-y|^{n+\alpha}}\Big)<p(t,x,y)<b_2\Big(t^{-n/\alpha}\wedge\frac{t}{|x-y|^{n+\alpha}}\Big),\qquad\forall x,y\in G,\ 0<t<1.\tag{4.3}$$
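The two-sided bound (4.3) compares $p(t,x,y)$ with the profile $t^{-n/\alpha}\wedge t\,|x-y|^{-(n+\alpha)}$, whose two branches meet at $|x-y|=t^{1/\alpha}$. The short sketch below evaluates that profile. The function names are hypothetical, and the constants $b_1,b_2$ of the theorem are not computed.

```python
def stable_like_profile(t: float, r: float, n: int, alpha: float) -> float:
    """Return t^(-n/alpha) ∧ t / r^(n+alpha), the comparison profile in (4.3), with r = |x - y|."""
    on_diagonal = t ** (-n / alpha)
    if r <= 0.0:
        return on_diagonal  # the off-diagonal branch is infinite at r = 0
    off_diagonal = t / r ** (n + alpha)
    return min(on_diagonal, off_diagonal)


def crossover_radius(t: float, alpha: float) -> float:
    """Radius at which the two branches of the profile coincide."""
    return t ** (1.0 / alpha)
```

For $|x-y|$ below `crossover_radius(t, alpha)` the profile is the on-diagonal value $t^{-n/\alpha}$, and above it the profile decays like $t\,|x-y|^{-(n+\alpha)}$, which is the usual shape of a stable-type transition density.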

In this section we study the operators related to the Dirichlet form (4.1) and study the reflected stable-like processes by means of these operators. For the function $\kappa(x,y)$ on $G\times G$, define
$$\Delta_G^{\alpha/2,\kappa}u(x)=\lim_{\varepsilon\downarrow0}A(n,-\alpha)\int_{y\in G,\,|y-x|>\varepsilon}\frac{\kappa(x,y)(u(y)-u(x))}{|x-y|^{n+\alpha}}\,dy,\qquad\forall x\in G,\tag{4.4}$$
provided the integral and limit exist. We may call $\Delta_G^{\alpha/2,\kappa}$ the regional fractional-like Laplacian. When $\kappa\in C_b^1(G\times G)$ and $u\in C_b^2(G)$,
$$\begin{aligned}
&\lim_{\varepsilon_2\downarrow0}\int_{y\in G,\,\varepsilon_2>|y-x|>\varepsilon_1}\frac{\kappa(x,y)(u(y)-u(x))}{|x-y|^{n+\alpha}}\,dy\\
&\quad=\lim_{\varepsilon_2\downarrow0}\int_{y\in G,\,\varepsilon_2>|y-x|>\varepsilon_1}\frac{\kappa(x,y)(u(y)-u(x)-\langle\nabla u(x),y-x\rangle)}{|x-y|^{n+\alpha}}\,dy\\
&\qquad+\lim_{\varepsilon_2\downarrow0}\int_{y\in G,\,\varepsilon_2>|y-x|>\varepsilon_1}\frac{(\kappa(x,y)-\kappa(x,x))\langle\nabla u(x),y-x\rangle}{|x-y|^{n+\alpha}}\,dy=0.
\end{aligned}\tag{4.5}$$
By (4.5) we see that the limit in (4.4) exists under the conditions above. Moreover, if there are some positive constant $k_1$ and positive function $k_2(x)$ on $G$ such that
$$|\kappa(x,y)|<k_1\quad\text{and}\quad|\kappa(x,y)-\kappa(x,x)|<k_2(x)|x-y|,\qquad\forall(x,y)\in G\times G,\tag{4.6}$$
we can also check that $\Delta_G^{\alpha/2,\kappa}u$ is well defined for $u\in C_b^2(G)$.

The following formula is the product formula for $\Delta_G^{\alpha/2,\kappa}$, which can be checked directly:
$$\Delta_G^{\alpha/2,\kappa}(uv)(x)=v(x)\Delta_G^{\alpha/2,\kappa}u(x)+u(x)\Delta_G^{\alpha/2,\kappa}v(x)+A(n,-\alpha)\int_G\frac{\kappa(x,y)(u(y)-u(x))(v(y)-v(x))}{|x-y|^{n+\alpha}}\,dy,\tag{4.7}$$
provided that each term is well defined.
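The product formula (4.7) rests on the pointwise identity $u(y)v(y)-u(x)v(x)=v(x)(u(y)-u(x))+u(x)(v(y)-v(x))+(u(y)-u(x))(v(y)-v(x))$, integrated against the kernel. The sketch below checks the same identity for a finite-state analogue in which the kernel $A(n,-\alpha)\kappa(x,y)|x-y|^{-(n+\alpha)}\,dy$ is replaced by an arbitrary nonnegative matrix. This discretisation and the function names are our own assumptions for illustration.

```python
from typing import List, Sequence


def weighted_laplacian(kernel: Sequence[Sequence[float]], u: Sequence[float], i: int) -> float:
    """Discrete analogue of (4.4): sum_j kernel[i][j] * (u[j] - u[i])."""
    return sum(kernel[i][j] * (u[j] - u[i]) for j in range(len(u)) if j != i)


def cross_term(kernel: Sequence[Sequence[float]], u: Sequence[float],
               v: Sequence[float], i: int) -> float:
    """Discrete analogue of the last integral in (4.7)."""
    return sum(kernel[i][j] * (u[j] - u[i]) * (v[j] - v[i])
               for j in range(len(u)) if j != i)


def product_formula_gap(kernel: Sequence[Sequence[float]], u: Sequence[float],
                        v: Sequence[float], i: int) -> float:
    """Left side minus right side of the discrete product formula at state i."""
    uv: List[float] = [a * b for a, b in zip(u, v)]
    lhs = weighted_laplacian(kernel, uv, i)
    rhs = (v[i] * weighted_laplacian(kernel, u, i)
           + u[i] * weighted_laplacian(kernel, v, i)
           + cross_term(kernel, u, v, i))
    return lhs - rhs
```

The gap vanishes up to rounding for any kernel, symmetric or not. This is consistent with the remark that (4.7) holds whenever each term is well defined.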

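In order to prove the integration by parts formula for $\Delta_G^{\alpha/2,\kappa}$, we prepare some results below.

Lemma 4.2. Let $\beta_1,\beta_2>0$ and $\kappa_0(x,y)=\beta_1+\beta_2\frac{|x-y|^{n+\alpha}}{|x-\bar y|^{n+\alpha}}$, where $x=(x_1,\dots,x_n)\in\mathbb{R}^n_+$, $y=(y_1,\dots,y_n)\in\mathbb{R}^n_+$ and $\bar y=(y_1,\dots,y_{n-1},-y_n)$. Let $u_\beta(x)=x_n^{\beta-1}$ for $\beta\in(0,1)\cup(1,3)$ and $u_1(x)=\ln x_n$. Then,
$$\Delta_{\mathbb{R}^n_+}^{\alpha/2,\kappa_0}u_\alpha(x)=0,\qquad\forall x\in\mathbb{R}^n_+,\ 0<\alpha<2.\tag{4.8}$$
Moreover, for $\beta\in(0,\alpha+1)$,
$$\Delta_{\mathbb{R}^n_+}^{\alpha/2,\kappa_0}u_\beta(x)=A(n,-\alpha)\frac{\omega_{n-1}}{2}B\Big(\frac{\alpha+1}{2},\frac{n-1}{2}\Big)\big(\beta_1\gamma(\alpha,\beta)+\beta_2\bar\gamma(\alpha,\beta)\big)x_n^{\beta-\alpha-1},\qquad\forall x\in\mathbb{R}^n_+,\tag{4.9}$$
where $\omega_{n-1}$ is the $(n-2)$-dimensional Lebesgue measure of the unit sphere in $\mathbb{R}^{n-1}$ and $B$ is the beta function. When $\beta\neq1$,
$$\gamma(\alpha,\beta)=\int_0^1\frac{(t^{\beta-1}-1)(1-t^{\alpha-\beta})}{(1-t)^{1+\alpha}}\,dt\quad\text{and}\quad\bar\gamma(\alpha,\beta)=\int_0^1\frac{(t^{\beta-1}-1)(1-t^{\alpha-\beta})}{(1+t)^{1+\alpha}}\,dt.$$
When $\beta=1$,
$$\gamma(\alpha,1)=\int_0^1\frac{(1-t^{\alpha-1})\ln t}{(1-t)^{1+\alpha}}\,dt\quad\text{and}\quad\bar\gamma(\alpha,1)=\int_0^1\frac{(1-t^{\alpha-1})\ln t}{(1+t)^{1+\alpha}}\,dt.$$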


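Proof. We can assume that $\beta_1=\beta_2=1$ by linearity. First we prove the one-dimensional case. When $\alpha\in(0,1)\cup(1,2)$, we have by setting $\frac{y}{y+x}=t$,
$$\int_0^\infty\frac{y^{\alpha-1}}{|y+x|^{1+\alpha}}\,dy=\int_0^1\frac{t^{\alpha-1}}{x}\,dt=\frac{1}{\alpha x}.$$
Hence by (2.8),
$$\Delta_{\mathbb{R}_+}^{\alpha/2,\kappa_0}u_\alpha(x)=\Delta_{\mathbb{R}_+}^{\alpha/2}u_\alpha(x)+A(1,-\alpha)\int_0^\infty\frac{y^{\alpha-1}}{|y+x|^{1+\alpha}}\,dy-A(1,-\alpha)\int_0^\infty\frac{x^{\alpha-1}}{|y+x|^{1+\alpha}}\,dy=0.\tag{4.10}$$
When $\alpha=1$, we have
$$\int_0^\infty\frac{\ln y}{|y+x|^2}\,dy=\lim_{\delta\downarrow0}\int_\delta^{1/\delta}\frac{\ln y}{|y+x|^2}\,dy=\lim_{\delta\downarrow0}\Big(\frac{\ln\delta}{\delta+x}+\int_\delta^{1/\delta}\frac{1}{(y+x)y}\,dy\Big)=\frac{\ln x}{x}.$$
Thus by (2.8) again,
$$\Delta_{\mathbb{R}_+}^{\alpha/2,\kappa_0}u_1(x)=\Delta_{\mathbb{R}_+}^{\alpha/2}u_1(x)+A(1,-1)\int_0^\infty\frac{\ln y}{|y+x|^2}\,dy-A(1,-1)\int_0^\infty\frac{\ln x}{|y+x|^2}\,dy=0.\tag{4.11}$$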
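The one-dimensional identity $\int_0^\infty y^{\alpha-1}(y+x)^{-1-\alpha}\,dy=\frac{1}{\alpha x}$, on which (4.10) relies, can be checked numerically. The sketch below uses the substitution $y=xe^{w}$ and the trapezoid rule. The substitution and the function name are our own choices for the check and are not the ones used in the proof.

```python
import math


def reflected_kernel_moment(alpha: float, x: float,
                            half_width: float = 80.0, step: float = 0.01) -> float:
    """Approximate int_0^inf y^(alpha-1) / (y + x)^(1+alpha) dy for 0 < alpha < 2, x > 0.

    With y = x * exp(w) the integrand becomes exp(alpha*w) / (1 + exp(w))^(1+alpha) / x,
    which decays exponentially as w -> +/- infinity.
    """
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        w = -half_width + k * step
        value = math.exp(alpha * w) / (1.0 + math.exp(w)) ** (1.0 + alpha)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * value
    return total * step / x
```

The same value $\frac{1}{\alpha x}$ is obtained for $\int_0^\infty x^{\alpha-1}(y+x)^{-1-\alpha}\,dy$ by direct integration, which is why the two reflected terms in (4.10) cancel.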

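Applying (4.10), (4.11) and the same notations as in Theorem 3.3, we obtain
$$\begin{aligned}
\frac{1}{A(n,-\alpha)}\Delta_{\mathbb{R}^n_+}^{\alpha/2,\kappa_0}u_\alpha(x)
&=\lim_{\varepsilon\downarrow0}\int_{y\in\mathbb{R}^n_+,\,|y-x|>\varepsilon}\frac{u_\alpha(y)-u_\alpha(x)}{|x-y|^{n+\alpha}}\Big(1+\frac{|x-y|^{n+\alpha}}{|x-\bar y|^{n+\alpha}}\Big)dy\\
&=\lim_{\varepsilon\downarrow0}\int_0^\pi d\theta_1\cdots\int_0^\pi d\theta_{n-2}\int_0^{2\pi}\varphi(\theta_1,\dots,\theta_{n-2})\,d\theta_{n-1}\\
&\quad\cdot\int_0^\infty I_{\{|t-\frac{x_n}{\cos\theta_1}|>\varepsilon\}}\cos^{\alpha-1}\theta_1\,\frac{t^{\alpha-1}-\big(\frac{x_n}{\cos\theta_1}\big)^{\alpha-1}}{\big|t-\frac{x_n}{\cos\theta_1}\big|^{1+\alpha}}\Big(1+\frac{\big|\frac{x_n}{\cos\theta_1}-t\big|^{1+\alpha}}{\big|\frac{x_n}{\cos\theta_1}+t\big|^{1+\alpha}}\Big)dt=0,
\end{aligned}$$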

which proves (4.8). The general case (4.9) can be proved by the same discussion as the proof of (5.4) in [2].

Proposition 4.3. Let $G$ be a bounded Lipschitz open set in $\mathbb{R}^n$. Let $0<\alpha<2$ and $u,v\in C^2(G)$. Suppose that $\kappa$ satisfies (4.6) with $k_2(x)<\rho(x)^\beta$ in a neighborhood of $\partial G$ for some constant $\beta>\alpha-3$. Then
$$\frac12A(n,-\alpha)\int_{G\times G}\frac{\kappa(x,y)(u(x)-u(y))(v(x)-v(y))}{|x-y|^{n+\alpha}}\,dx\,dy=-\int_Gv(x)\Delta_G^{\alpha/2,\kappa}u(x)\,dx.\tag{4.12}$$

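Proof. We only prove the case of $1<\alpha<2$; the proof for the other cases is similar. Set $M=\sup_{x,y\in G,\,x\neq y}\frac{|u(y)-u(x)|}{|y-x|}$. By the assumptions in this proposition we have
$$\begin{aligned}
\frac{1}{A(n,-\alpha)}\big|\Delta_G^{\alpha/2,\kappa}u(x)\big|
&\le\Big|\lim_{\varepsilon\downarrow0}\int_{y\in G,\,|y-x|>\varepsilon}\frac{\kappa(x,x)(u(y)-u(x))}{|x-y|^{n+\alpha}}\,dy\Big|+\Big|\lim_{\varepsilon\downarrow0}\int_{y\in G,\,|y-x|>\varepsilon}\frac{(\kappa(x,y)-\kappa(x,x))(u(y)-u(x))}{|x-y|^{n+\alpha}}\,dy\Big|\\
&\le k_1\big|\Delta_G^{\alpha/2}u(x)\big|+\lim_{\varepsilon\downarrow0}\rho(x)^\beta\int_{y\in G,\,\rho(x)>|y-x|>\varepsilon}\frac{|u(y)-u(x)|}{|x-y|^{n+\alpha-1}}\,dy+2k_1\int_{y\in G,\,|y-x|\ge\rho(x)}\frac{|u(y)-u(x)|}{|x-y|^{n+\alpha}}\,dy\\
&\le k_1\big|\Delta_G^{\alpha/2}u(x)\big|+\frac{M(2\pi)^n}{2-\alpha}\rho(x)^{2-\alpha+\beta}+2k_1\frac{M(2\pi)^n}{\alpha-1}\rho(x)^{1-\alpha},
\end{aligned}$$
where $k_1$ is described in (4.6). By this estimate, (2.40) and Lemma 3.2 in [10] we see that $\Delta_G^{\alpha/2,\kappa}u\in L^1(G)$. We omit the rest of the proof as it is similar to the discussions in Theorem 3.3 [10].

Lemma 4.4. Let $0<\alpha<2$ and $T>0$. Suppose that $u\in C^1(0,T]$ and $v\in C[0,T]$. If $F^{2-\alpha}u(0)$ exists, then
$$\lim_{\delta\downarrow0}A(1,-\alpha)\int_0^\delta\int_\delta^T\frac{(u(y)-u(x))v(x)}{|x+y|^{1+\alpha}}\,dx\,dy=-D_\alpha v(0)F^{2-\alpha}u(0),\tag{4.13}$$
where
$$\begin{aligned}
D_\alpha&=\frac{A(1,-\alpha)}{\alpha(\alpha-1)}\int_0^\infty\frac{(s\vee1)^{1-\alpha}-|s+1|^{1-\alpha}}{s^{2-\alpha}}\,ds, &&\alpha\in(0,1)\cup(1,2),\\
D_\alpha&=A(1,-\alpha)\int_0^\infty\frac{\ln|s+1|-\ln(s\vee1)}{s}\,ds, &&\alpha=1.
\end{aligned}\tag{4.14}$$

Proof. We omit the proof as the arguments are similar to the proof of Lemma 3.2 (see Appendix).

For the $C^2$ open set $G$, let $\delta_0$ and $\xi(x)$ be the number and function defined in Lemma 2.2. For $x\in G_{\delta_0}$, set
$$\bar x=2\xi(x)-x.\tag{4.15}$$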

We use these notations in the following theorem.

Theorem 4.5. Let $G$ be a $C^2$ open set in $\mathbb{R}^n$. Assume that $u,v\in C_c(G)\cap\bigcup_{\beta\ge\alpha}D_\beta(G)$ and $1<\alpha<2$. Let $\psi_1,\psi_2$ be symmetric functions in $C^1(G\times G)$ and let $\kappa$ be a symmetric function on $G\times G$ such that
$$\Big|\kappa(x,y)-\psi_1(x,y)-\psi_2(x,y)\Big(\frac{|x-y|^{n+\alpha}}{|x-\bar y|^{n+\alpha}}+\frac{|x-y|^{n+\alpha}}{|\bar x-y|^{n+\alpha}}\Big)\Big|<C|x-y|,\qquad\forall x\in G_{\delta_1},\ y\in G,\tag{4.16}$$
where $C$ is a positive constant and $0<\delta_1<\delta_0$. Then the following equality is true:
$$\frac12A(n,-\alpha)\int_{G\times G}\frac{\kappa(x,y)(u(x)-u(y))(v(x)-v(y))}{|x-y|^{n+\alpha}}\,dx\,dy=\int_{\partial G}D_\kappa(x)v(x)F^{2-\alpha}u(x)\,m(dx)-\int_Gv(x)\Delta_G^{\alpha/2,\kappa}u(x)\,dx,\tag{4.17}$$
where
$$D_\kappa(x)=\frac{A(n,-\alpha)}{A(1,-\alpha)}\big(\psi_1(x,x)C_\alpha+2\psi_2(x,x)D_\alpha\big)\int_{\{y\in\mathbb{R}^n,\,|y|=1,\,y_n>0\}}y_n^\alpha\,m(dy).$$
$C_\alpha$ and $D_\alpha$ are explained in (3.6) and (4.14) respectively.
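As a consistency check on the boundary coefficient, consider the special choice $\psi_1\equiv1$, $\psi_2\equiv0$, $\kappa\equiv1$, for which (4.16) holds trivially. The choice is ours, and the reduction below is a sketch that assumes $C_{n,\alpha}$ is the constant given in (3.9).

```latex
\begin{align*}
D_\kappa(x)
 &= \frac{A(n,-\alpha)}{A(1,-\alpha)}\,\bigl(1\cdot C_\alpha + 2\cdot 0\cdot D_\alpha\bigr)
    \int_{\{y\in\mathbb{R}^n,\ |y|=1,\ y_n>0\}} y_n^{\alpha}\, m(dy)\\
 &= \frac{A(n,-\alpha)}{A(1,-\alpha)}\, C_\alpha
    \int_{\{|y|=1,\ y_n>0\}} y_n^{\alpha}\, m(dy)
  \;=\; C_{n,\alpha}.
\end{align*}
```

With this choice, formula (4.17) reduces to (3.8), so Theorem 4.5 contains Theorem 3.3 as the case $\kappa\equiv1$ (for compactly supported $u,v$).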

Proof. For simplicity we assume that $G$ is bounded and $u\in D_\alpha(G)$. By Lemma 4.2, Proposition 4.3, (4.7) and (4.16), we can prove the same assertions as in Corollary 2.6 and Lemma 3.1 for $\Delta_G^{\alpha/2,\kappa}$ and its related Dirichlet form respectively. With the help of these facts and following the arguments in Theorem 3.3, we can prove that
$$\frac12A(n,-\alpha)\int_{G\times G}\frac{\kappa(x,y)(u(x)-u(y))(v(x)-v(y))}{|x-y|^{n+\alpha}}\,dx\,dy=-\int_Gv(x)\Delta_G^{\alpha/2,\kappa}u(x)\,dx$$
for $\kappa$ such that $|\kappa(x,y)|<C|x-y|$, and that
$$\frac12A(n,-\alpha)\int_{G\times G}\frac{\kappa(x,y)(u(x)-u(y))(v(x)-v(y))}{|x-y|^{n+\alpha}}\,dx\,dy=C_{n,\alpha}\int_{\partial G}\kappa(x,x)v(x)F^{2-\alpha}u(x)\,m(dx)-\int_Gv(x)\Delta_G^{\alpha/2,\kappa}u(x)\,dx$$
for $\kappa\in C^1(G\times G)$. Hence the proof of (4.17) can be reduced to verifying that
$$\begin{aligned}
&\frac12A(n,-\alpha)\int_{G\times G}\frac{\kappa(x,y)(u(x)-u(y))(v(x)-v(y))}{|x-y|^{n+\alpha}}\,dx\,dy\\
&\quad=\frac{D_\alpha A(n,-\alpha)}{A(1,-\alpha)}\int_{\{|x|=1,\,x_n>0,\,x\in\mathbb{R}^n\}}x_n^\alpha\,m(dx)\int_{\partial G}\psi(x,x)v(x)F^{2-\alpha}u(x)\,m(dx)-\int_Gv(x)\Delta_G^{\alpha/2,\kappa}u(x)\,dx
\end{aligned}\tag{4.18}$$

for $\kappa(x,y)=\psi(x,y)\frac{|x-y|^{n+\alpha}}{|x-\bar y|^{n+\alpha}}$ or $\kappa(x,y)=\psi(x,y)\frac{|x-y|^{n+\alpha}}{|\bar x-y|^{n+\alpha}}$. We only prove (4.18) when $\kappa(x,y)=\psi(x,y)\frac{|x-y|^{n+\alpha}}{|x-\bar y|^{n+\alpha}}$, as the proof of the second case is similar. Define functions $\bar u$ on $(G^c)_{\delta_0}$ and $\bar\psi$ on $G_{\delta_0}\times(G^c)_{\delta_0}$ by
$$\bar u(x)=u(\bar x),\quad\forall x\in(G^c)_{\delta_0},\qquad\bar\psi(x,y)=\psi(x,\bar y),\quad\forall(x,y)\in G_{\delta_0}\times(G^c)_{\delta_0},$$
where the relation between $x$ and $\bar x$ is in (4.15). By Lemma 2.2,
$$\lim_{x\to\partial G}\Big|\frac{\partial\bar x}{\partial x}\Big|-1=0,\tag{4.19}$$
where $\big|\frac{\partial\bar x}{\partial x}\big|$ is the Jacobian determinant of $f(x)=\bar x$. For $x\in\partial G$ and unit vector $\vec\theta\in\mathbb{R}^n$, let $M_x^{\delta,\vec\theta}$ be the open interval in the one-dimensional open set $\{y:y=x+t\vec\theta\in G_\delta\text{ for some }t\in\mathbb{R}_+\}$ with $x$ in its closure. With the help of Lemma 4.4 and (4.19), following the arguments and taking the notations in Theorem 3.3, we have

α ψ(x, y)(u(x) − u(y))(v(x) − v(y)) ,κ + v(x)G2 u(x) d xd y n+α |x − y| G×G G u(y) − u(x) ψ(x, y)v(x) d xd y = − lim A(n, −α) δ↓0 |x − y|n+α G δ G δ ∩ G δ 0 u (y) − u(x) ∂ y (x, y)v(x) ψ ( = − lim A(n, −α) − I ) d xd y c c δ↓0 |x − y|n+α ∂ y G δ (G )δ ∩ (G )δ 0 u (y) − u(x) (x, y)v(x) ψ d xd y − lim A(n, −α) c c δ↓0 |x − y|n+α G δ (G )δ ∩ (G )δ 0 π π 2π = − lim A(n, −α) v(x) d x dθ1 . . . dθn−2 ϕ(θ1 , . . . , θn−2 ) dθn−1

1 A(n, −α) 2

δ↓0

G δ

0

0

0

− → u (x + r θ ) − u(x) → (x, x + r − ψ θ ) dr · − → 1+α r {x+r θ ∈(G c )δ ∩ (G c )δ ,r >0} 0 π π 2π = − lim A(n, −α) dθ1 . . . dθn−2 ϕ(θ1 , . . . , θn−2 ) dθn−1 δ↓0 0 0 0 ∞ − → − → → − → ·

− n (x), θ m(d x) · I − → δ, θ v(x + t θ ) dt

∂G ∞

0

{x+t θ ∈Mx

}

∂G ∞

0

{x+t θ ∈Mx

}

− → − → u (x − r θ ) − u(x + t θ ) → − → (x + t − → ψ θ , x − r θ ) dr · I{x−r − 1+α θ ∈(G c )δ ∩ (G c )δ } (t + r ) 0 0 2π π π = − lim A(n, −α) dθ1 . . . dθn−2 ϕ(θ1 , . . . , θn−2 ) dθn−1 δ↓0 0 0 0 ∞ − → − → → − → ·

− n (x), θ m(d x) · I − → δ, θ v(x + t θ ) dt

− → − → u(x + r θ ) − u(x + t θ ) − → − → → · I{x+r − ψ(x + t θ , x + r θ ) dr 1+α θ ∈G δ ∩ G δ } (t + r ) 0 0 (α − 1)Dα A(n, −α) =− ψ(x, x)v(x) f (x) m(d x) A(1, −α) ∂G π 2π π − → → − → dθ1 . . . dθn−2 ϕ(θ1 , . . . θn−2 ) − n (x), θ α I{ − dθ · → n (x), θ >0} n−1

0

0

0


(α − 1)Dα A(n, −α) ynα m(dy) ψ(x, x)v(x) f (x) m(d x) A(1, −α) {y∈Rn ,|y|=1,yn >0} ∂G Dα A(n, −α) = ynα m(dy) ψ(x, x)v(x)F 2−α u(x) m(d x), A(1, −α) {y∈Rn ,|y|=1,yn >0} ∂G

=−

which proves (4.18). We omit the limit arguments in the above equalities as they are similar to the discussions in Theorem 3.3. Remark 4.1. When 0 < α < 2, we can also obtain the integration by parts formula for α

,κ

G2 as a generalization of Theorem 3.5. In the nonsymmetric case, write κ(x, ˆ y) = κ(y, x). Let u, v be functions satisfying the conditions in Theorem 4.5 and let 1 < α < 2. If κ ∈ C 2 (G × G), then, by the similar arguments in the proof of Theorem 4.5, we can prove α α ,κ ,κˆ v(x)G2 u(x) d x − u(x)G2 v(x) d x G G κ(x, x)v(x)F 2−α u(x) m(d x) − Cn,α κ(x, x)u(x)F 2−α v(x) m(d x) = Cn,α ∂G ∂G uvk d x + G

where

is

Cn,α

y∈G,|y−x|>ε

the

constant

in

(3.9)

and

k(x)

κ(x, ˆ y) − κ(x, y) dy . |x − y|n+α

=

lim A(n, −α) ε↓0

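In order to study the generator of the reflected stable-like process, we extend the definition of $\Delta_G^{\alpha/2,\kappa}u$ onto $\overline{G}$, with the notation $\Delta_{\overline{G}}^{\alpha/2,\kappa}u$, provided that (4.4) is well defined on the boundary. When $\kappa\equiv1$, $\Delta_{\overline{G}}^{\alpha/2,\kappa}u$ is denoted by $\Delta_{\overline{G}}^{\alpha/2}u$ in [10].

Lemma 4.6. Let $\psi$ be a bounded measurable function on $[0,1]$ and $v=(\sin\theta_0,0,\dots,0,\cos\theta_0)$ for some $\theta_0\in[0,\frac{\pi}{2}]$. Then for $n\ge2$,

\[
\int_{|z|=1} I_{\mathbb{R}^n_+}\,\psi(z_n)\langle z,v\rangle\, m(dz) = \langle e_n,v\rangle \int_{|z|=1} I_{\mathbb{R}^n_+}\,\psi(z_n)z_n\, m(dz), \qquad (4.20)
\]

where $z=(z_1,\dots,z_n)$ and $e_n=(0,\dots,0,1)$.

Proof. Applying the spherical coordinate system at point zero, we have for $n\ge3$,

\[
\begin{aligned}
&\int_{|z|=1} I_{\mathbb{R}^n_+}\,\psi(z_n)\langle z,v\rangle\, m(dz)\\
&\quad= \int_0^{\pi/2} d\theta_1 \int_0^{\pi}\cdots\int_0^{\pi} d\theta_{n-2} \int_0^{2\pi} \varphi(\theta_1,\dots,\theta_{n-2})\psi(\cos\theta_1)\,
\langle(\sin\theta_1\cdots\sin\theta_{n-1},0,\dots,0,\cos\theta_1),v\rangle\, d\theta_{n-1}\\
&\quad= \int_0^{\pi/2} d\theta_1 \int_0^{\pi}\cdots\int_0^{\pi} d\theta_{n-2} \int_0^{2\pi} \varphi(\theta_1,\dots,\theta_{n-2})\psi(\cos\theta_1)\,
\langle(0,\dots,0,\cos\theta_1),v\rangle\, d\theta_{n-1}\\
&\quad= \langle e_n,v\rangle \int_{|z|=1} I_{\mathbb{R}^n_+}\,\psi(z_n)z_n\, m(dz).
\end{aligned}
\]

The proof for the case of $n=2$ is similar.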
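Identity (4.20) can be sanity-checked numerically. The following is a minimal sketch, not part of the paper, for the special case $n=3$: it integrates both sides over the upper unit hemisphere with a midpoint rule. The function names, the grid sizes and the test function $\psi(s)=s^2$ are illustrative assumptions.

```python
import math

def hemisphere_integral(f, n_theta=200, n_phi=200):
    """Midpoint-rule integral of f over the upper unit hemisphere {|z| = 1, z_3 > 0} in R^3."""
    d_theta = (math.pi / 2) / n_theta
    d_phi = (2 * math.pi) / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta          # polar angle measured from e_3
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            z = (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)
            total += f(z) * sin_t            # surface measure m(dz) = sin(theta) dtheta dphi
    return total * d_theta * d_phi

def both_sides_of_4_20(psi, theta0):
    """Return (LHS, RHS) of (4.20) for n = 3 and v = (sin theta0, 0, cos theta0)."""
    v = (math.sin(theta0), 0.0, math.cos(theta0))
    lhs = hemisphere_integral(
        lambda z: psi(z[2]) * (z[0] * v[0] + z[1] * v[1] + z[2] * v[2]))
    rhs = v[2] * hemisphere_integral(lambda z: psi(z[2]) * z[2])
    return lhs, rhs

# Illustrative choice: psi(s) = s^2, theta0 = pi/3.
lhs, rhs = both_sides_of_4_20(lambda s: s * s, math.pi / 3)
print(lhs, rhs)
```

The tangential part of $v$ integrates to zero because the azimuthal average of $\cos\varphi$ vanishes, which is exactly the step used in the proof; only the $e_n$ component survives.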


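Remark 4.2. By (4.20), for a measurable function $\psi$ such that $\int_0^1\psi(s)\,ds>0$ and any unit vector $v=(v_1,\dots,v_n)$, we have $\int_{|z|=1}\psi(z_n)\langle z,v\rangle I_{\mathbb{R}^n_+}\,m(dz)>0\ (=0)$ when $v_n>0\ (v_n=0)$ respectively. This gives Lemma 5.2 [10].

Theorem 4.7. Let $G$ be a $C^2$ open set in $\mathbb{R}^n$. Let $u$ be a bounded function in $\bigcup_{\beta>\alpha+1}D_\beta(G)$. Assume that the function $\kappa$ satisfies the condition in (4.16) and that, additionally, $\psi_1$ and $\psi_2$ are strictly positive on $\{(x,x):x\in\partial G\}$. When $1\le\alpha<2$, the following assertions are true for $z\in\partial G$:

(i) $\Delta_{\overline{G}}^{\alpha/2,\kappa}u(z)$ exists if and only if $\frac{\partial u}{\partial n}(z)=0$.
(ii) If $\frac{\partial u}{\partial n}=0$ in a relatively open subset of $\partial G$ containing $z$, then $\Delta_{\overline{G}}^{\alpha/2,\kappa}u$ is continuous at $z$.
(iii) If $\frac{\partial u}{\partial n}(z)>0$, then $\lim_{x\in G,\,x\to z}\Delta_G^{\alpha/2,\kappa}u(x)=-\infty$.
(iv) If $\frac{\partial u}{\partial n}(z)<0$, then $\lim_{x\in G,\,x\to z}\Delta_G^{\alpha/2,\kappa}u(x)=\infty$.

Here $\frac{\partial u}{\partial n}$ is the outward normal derivative of $u$ on the boundary. Moreover, when $0<\alpha<1$, $\Delta_{\overline{G}}^{\alpha/2,\kappa}u$ is always well defined and continuous on $\overline{G}$ for $u\in\bigcup_{\beta>\alpha+1}D_\beta(G)$.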

Proof. The notations in Theorem 4.5 will be used throughout the proof of this theorem. We only prove the assertions when 1 < α < 2, n ≥ 3 and G is bounded. Other cases can be proved similarly. The proof for the case of κ ≡ 1 is given in Theorem 5.4 [10]. Below we also assume ψ1 ≡ 0 for the proof of this case is similar to the case of κ ≡ 1. (i) Without loss of generality, we may take a coordinate system C Sz such that z is the origin and that the inward normal vector of ∂G at z is (0, . . . , 0, 1). For function α

,κ

u(x) = f (x)ρ(x)β such that f ∈ C 2 (G) and β > α, the existence of G2 u(z) can be checked directly. Hence by (2.5) we can assume that u ∈ C 2 (G) in the following. We can also assume ψ1 ≡ 0 for the proof of this case is similar to the case of κ ≡ 1. Furthermore, noticing that ||x − y| − |x − y|| < c|x − y|2 for some constant c, we n+α only need to prove the theorem when κ(x, y) = ψ(x, y) |x−y| |x−y|n+α for some symmetric function ψ in C 1 (G × G). By the assumptions of u, ∂G and ψ,

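\[
\begin{aligned}
&\lim_{\varepsilon_1\downarrow0} A(n,-\alpha)\int_{y\in G,\ \varepsilon_2<|y-z|<\varepsilon_1} \frac{\langle\nabla u(z),y-z\rangle}{|z-y|^{n+\alpha}}\,\psi(z,y)\,dy\\
&\quad= \lim_{\varepsilon_1\downarrow0} A(n,-\alpha)\psi(z,z)\int_{y\in\mathbb{R}^n_+,\ \varepsilon_2<|y-z|<\varepsilon_1} \frac{\langle\nabla u(z),y-z\rangle}{|z-y|^{n+\alpha}}\,dy\\
&\quad= \lim_{\varepsilon_1\downarrow0} A(n,-\alpha)\psi(z,z)\int_0^{\pi/2} d\theta_1\int_0^{\pi}\cdots\int_0^{\pi} d\theta_{n-2}\int_0^{2\pi}\varphi(\theta_1,\dots,\theta_{n-2})\,d\theta_{n-1}
\int_{\varepsilon_2}^{\varepsilon_1}\frac{\bigl\langle\nabla u(z),\frac{y-z}{|y-z|}\bigr\rangle}{r^{\alpha}}\,dr\\
&\quad= \lim_{\varepsilon_1\downarrow0} A(n,-\alpha)\psi(z,z)\,\frac{\varepsilon_2^{1-\alpha}-\varepsilon_1^{1-\alpha}}{\alpha-1}\int_{|y|=1} I_{\mathbb{R}^n_+}\langle\nabla u(z),y\rangle\,m(dy) = 0. \qquad (4.21)
\end{aligned}
\]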


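The last equality in (4.21) follows from the condition $\frac{\partial u}{\partial n}(z)=0$ and (4.20). By (4.21),

\[
\begin{aligned}
\lim_{\varepsilon_1\downarrow0}\bigl|\Delta_{G,\varepsilon_1}^{\alpha/2,\kappa}u(z)-\Delta_{G,\varepsilon_2}^{\alpha/2,\kappa}u(z)\bigr|
&= \lim_{\varepsilon_1\downarrow0}\Bigl|A(n,-\alpha)\int_{y\in G,\ \varepsilon_2<|y-z|<\varepsilon_1}\frac{u(y)-u(z)}{|z-y|^{n+\alpha}}\,\psi(z,y)\,dy\Bigr|\\
&= \lim_{\varepsilon_1\downarrow0}\Bigl|A(n,-\alpha)\int_{y\in G,\ \varepsilon_2<|y-z|<\varepsilon_1}\frac{u(y)-u(z)-\langle\nabla u(z),y-z\rangle}{|z-y|^{n+\alpha}}\,\psi(z,y)\,dy\Bigr| = 0. \qquad (4.22)
\end{aligned}
\]

When $\frac{\partial u}{\partial n}(z)\neq0$, applying (4.20) we can also check that

\[
\lim_{\varepsilon_2\downarrow0}\Bigl|A(n,-\alpha)\int_{y\in G,\ \varepsilon_2<|y-z|<\varepsilon_1}\frac{\langle\nabla u(z),y-z\rangle}{|z-y|^{n+\alpha}}\,\psi(z,y)\,dy\Bigr| = \infty,\qquad \forall\,\varepsilon_1>0,
\]

which leads to

\[
\lim_{\varepsilon_2\downarrow0}\bigl|\Delta_{G,\varepsilon_1}^{\alpha/2,\kappa}u(z)-\Delta_{G,\varepsilon_2}^{\alpha/2,\kappa}u(z)\bigr| = \infty,\qquad \forall\,\varepsilon_1>0. \qquad (4.23)
\]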

Combining (4.22) and (4.23), we complete the proof of (i). (ii) Suppose that ∂u ∂n = 0 in a relatively open subset of ∂G containing z. Now we prove α

that

lim x→z,x∈G

α

,κ

,κ

2 u(x) = 2 u(z). Similarly we can also assume that u ∈ C 2 (G) and G

G

only consider the case κ(x, y) = ψ(x, y) |x−y| |x−y|n+α . Let δ0 be the number described in Lemma 2.2. Set n+α

Aεx = {y ∈ G : y ∈ B(x, ε)},

∀x ∈ G δ0 , ε > 0.

By the same arguments as in the proof in (i) we can prove that for some 0 < δ < δ0 , (u(y) − u(x)) A(n, −α) sup ψ(x, y) dy = 0. (4.24) lim ε1 ↓0 x∈B(z,δ)∩∂G |x − y|n+α y∈G,ε2 <|y−x|<ε1 On the other hand, by the continuity of u on G, it is easy to see that for each fixed ε > κ(x, y)(u(y) − u(x)) dy is uniformly continuous for x ∈ G δ0 . 0, A(n, −α) |x − y|n+α y∈G\Aεx α

Therefore, in order to prove

lim x→z,x∈G

lim

sup

ε↓0 x∈G∩B(z,ε)

,κ

α

,κ

G2 u(x) = G2 u(z), we need only to check that

A(n, −α)

y∈Aεx

u(y) − u(x) ψ(x, y) dy = 0. |x − y|n+α

(4.25)

In what follows we assume that x is close enough to z and hence there exists an unique point x0 on the boundary such that |x − x0 | = ρ(x) with ∂u ∂n (x 0 ) = 0. The same notation 2 n u will also be used as an extension of u in Cc (R ). Choose a coordinate system C Sx0 such that x0 is the origin and the inward normal vector of ∂G at x0 is (0, . . . , 0, 1). Applying (4.20) we have


ε

∇u(x0 ), y − x

∇u(x0 ), y dy = dr I{yn >(ε−ρ(x))∨0} m(dy) n+α |x − y| r n+α 0 y∈B(x,ε)∩Rn+ |y|=r = 0.

Hence by the assumptions for u, ∂G and ψ, u(y) − u(x) A(n, −α) sup ψ(x, y) dy lim ε↓0 x∈G∩B(z,ε) |x − y|n+α y∈Aεx u(y) − u(x) A(n, −α) = lim sup ψ(x , x ) dy 0 0 ε↓0 x∈G∩B(z,ε) |x − y|n+α y∈B(x,ε)∩Rn+ A(n, −α) = lim sup ψ(x0 , x0 ) ε↓0 x∈G∩B(z,ε) y∈B(x,ε)∩Rn+ u(y) − u(x) − ∇u(x0 ), y − x × dy |x − y|n+α A(n, −α) = lim sup ψ(x0 , x0 ) ε↓0 x∈G∩B(z,ε) y∈B(x,ε)∩∈Rn+ u(y) − u(x) − ∇u(x), y − x × dy |x − y|n+α = 0, which leads to (4.25). (iii) We make the same assumptions for u and κ as in (ii). Following the notations in (ii), we only need to prove lim

x→z,x∈G

A(n, −α)

δ y∈A x0

ψ(x, y)

u(y) − u(x) dy = −∞. |x − y|n+α

(4.26)

Without loss of generality, we can choose the coordinate system C Sx0 in (ii) such that ∇u(x) = (a1 , 0, . . . , 0, a2 ) for some a1 , a2 ∈ R. By the assumptions on ψ and ∂G, (4.26) is equivalent to lim

x→z,x∈G

A(n, −α)

u(y) − u(x) dy = −∞. n+α y∈B(x,δ0 )∩Rn+ |x − y|

Applying the spherical coordinate system at point x, we have y∈B(x,δ0 )∩Rn+ π 2

= A1 +

0

δ0

· 0

u(y) − u(x) dy |x − y|n+α π dθ1 . . . dθn−2 0

2π

ϕ(θ1 , . . . , θn−2 ) dθn−1

0

∇u(x), (sin θ1 . . . sin θn−1 , 0, . . . , 0, cos θ1 )r −α I{ ρ(x)
0

(4.27)


= A1 +

π 2

dθ1 . . .

δ0

·

2π

dθn−2

0

π 0

ϕ(θ1 , . . . , θn−2 ) dθn−1

0

cos θ1 ∇u(x), en r −α I{ ρ(x)
cos θ1

0

= A1 + A2 + ∇u(x), en π π 2 · dθ1 . . . dθn−2 0

0

2π

ϕ(θ1 , . . . , θn−2 )I{ ρ(x)
1 } (α

δ0

0

cosα θ1 dθn−1 , − 1)ρ(x)α−1 (4.28)

where

u(y) − u(x) − ∇u(x), y − x dy, |x − y|n+α y∈B(x,δ0 )∩Rn+ π π 2π 2 A2 = ∇u(x), en dθ1 . . . dθn−2 ϕ(θ1 , . . . , θn−2 )I{ ρ(x)
0

0

·

π 2

0

π

dθ1 . . .

dθn−2 0

0

2π

ϕ(θ1 , . . . , θn−2 ) dθn−1

0

δ0

1 I ρ(x)
1}

cos θ1 dθn−1 α−1

are all bounded. Similarly, u(x) − u(x) dy n+α n y∈B(x,δ0 )∩R+ |x − y| = B1 + 2ρ(x) ∇u(x), en ·

δ0

0

−δ01−α

0 2π

π

dθn−2

0

2 cosα θ1 dθn−1 , 1 } αρ(x)α−1

ϕ(θ1 , . . . , θn−2 )I{ ρ(x)
(4.29)

where

u(x) − u(x) − ∇u(x), x − x dy, |x − y|n+α y∈B(x,δ0 )∩Rn+ π π 2 B2 = ∇u(x), en dθ1 . . . dθn−2 B1 =

0

2π

· 0

0

−2ρ(x)δ0−α dθn−1 1} α

ϕ(θ1 , . . . , θn−2 )I{ ρ(x)
1 are also all bounded. Combining (4.28), (4.29) and noticing that α−1 > α2 , lim ∇u(x), en < 0 and lim ρ(x)1−α = ∞, we obtain (4.27). (iv) follows from x→z,x∈G

x→∂G

(iii). The last assertion can be checked directly.


In what follows we present some results about the reflected stable-like processes on $\overline{G}$. We omit the proofs of these results since, with the help of the above results, the arguments in [10] are still valid.

Theorem 4.8. Let $G$ be a bounded Lipschitz open set in $\mathbb{R}^n$ and $\kappa$ be a function on $G\times G$ satisfying all the conditions in Theorem 4.1 and Proposition 4.3. Let $(X_t)_{t\ge0}$ be the reflected stable-like process on $\overline{G}$ related to the Dirichlet form $(\mathcal{E}^\kappa,\mathcal{F}^\kappa)$ in (4.2). Then $(X_t)_{t\ge0}$ is a semimartingale and

\[
X_t^k = x_0^k + M_t^k + \int_0^t \Delta_{\overline{G}}^{\alpha/2,\kappa}x_k(X_s)\,ds,\quad \text{a.s. } P_{x_0},\quad k=1,2,\dots,n,\quad \forall\,x_0\in\overline{G}, \qquad (4.30)
\]
\[
X_t=(X_t^k)_{k=1}^n,\qquad x_0=(x_0^k)_{k=1}^n,
\]

where $(M_t^k)_{t\ge0}$ is a square integrable martingale and the Revuz measure of the sharp bracket process is $\bigl(A(n,-\alpha)\int_G\frac{\kappa(z,y)|y^k-z^k|^2}{|z-y|^{n+\alpha}}\,dz\bigr)\,dy$.

The following results give the relation between the boundary operator of $\Delta_G^{\alpha/2,\kappa}$ and the boundary local time of the reflected stable-like process.

Theorem 4.9. Let $G$ be a $C^2$ open set in $\mathbb{R}^n$ and $\kappa$ be a function on $G\times G$ satisfying all the conditions in Theorem 4.1 and Theorem 4.5. Let $(X_t)_{t\ge0}$ be the reflected stable-like process in Theorem 4.8. Suppose that $u\in\bigcup_{\beta\ge\alpha}D_\beta(G)$ with compact support and $1<\alpha<2$, then

\[
u(X_t) = u(x_0) + M_t + \int_0^t \Delta_{\overline{G}}^{\alpha/2,\kappa}u(X_s)\,ds + L_t^1 - L_t^2,\quad \text{a.s. } P_{x_0},\quad \forall\,x_0\in\overline{G}.
\]

In the equality above, $(M_t)_{t\ge0}$ is a square integrable martingale and the Revuz measure of the sharp bracket process is $\bigl(A(n,-\alpha)\int_G\frac{\kappa(z,y)(u(z)-u(y))^2}{|z-y|^{n+\alpha}}\,dz\bigr)\,dy$. $(L_t^1)_{t\ge0}$ and $(L_t^2)_{t\ge0}$ are PCAF (positive continuous additive functional, see [7]) in the strict sense with Revuz measures $(D\kappa(x)F^{2-\alpha}u(x))^-\,m(dx)$ and $(D\kappa(x)F^{2-\alpha}u(x))^+\,m(dx)$ on $\partial G$ respectively, where $a^-=0\vee(-a)$ and $a^+=0\vee a$ for $a\in\mathbb{R}$.

Remark 4.3. As $F^{2-\alpha}u\equiv0$ for $u\in\bigcup_{\beta>\alpha}D_\beta(G)$, we can read from Theorem 4.9 that $(u(X_t))_{t\ge0}$ has local time on the boundary only when $u\in D_\alpha(G)$. By (4.3) we can give the estimate of the resolvent density $r_1(x,y):=\int_0^\infty e^{-t}p(t,x,y)\,dt$ for $(X_t)_{t\ge0}$. In fact we can check that $r_1(x,y)\le\frac{C}{|x-y|^{n-\alpha}}$ for some positive number $C$. This fact is necessary in checking the assumptions in Fukushima's decomposition (see Theorem 5.2.5 [7]), which is the method to prove Theorem 4.9.

Theorem 4.10. Let $G$ be a $C^2$ open set in $\mathbb{R}^n$ and $\kappa$ be a function on $G\times G$ satisfying all the conditions in Theorem 4.1 and Theorem 4.5. Denote by $A_F^{\alpha,\kappa}$ the Feller generator of the reflected stable-like process on $\overline{G}$ stated in Theorem 4.1. Then the following assertions are true:

(i) If $0<\alpha<1$, $u\in C_c^1(\overline{G})$, then $u\in D(A_F^{\alpha,\kappa})$ and $A_F^{\alpha,\kappa}u=\Delta_{\overline{G}}^{\alpha/2,\kappa}u$.
(ii) If $1\le\alpha<2$, $u\in C_c^2(\overline{G})$, then $u\in D(A_F^{\alpha,\kappa})$ if and only if $\frac{\partial u}{\partial n}(x)=0$, $\forall\,x\in\partial G$. When this condition is satisfied, we have $A_F^{\alpha,\kappa}u=\Delta_{\overline{G}}^{\alpha/2,\kappa}u$.


Remark 4.4. Let $p>1$ and denote by $A_p^{\alpha,\kappa}$ the $L^p$ generator of $(X_t)_{t\ge0}$. Under the conditions in Theorem 4.10, we can prove that $C_c^2(\overline{G})\subset D(A_p^{\alpha,\kappa})$ when $0<\alpha<\frac{p+1}{p}$. If $\frac{p+1}{p}\le\alpha<2$, we can show that a function $u$ in $C_c^2(\overline{G})$ belongs to $D(A_p^{\alpha,\kappa})$ if and only if $\frac{\partial u}{\partial n}=0$ on $\partial G$. When $p=2$ and $\alpha=\frac{3}{2}$, these results have been proved by complex interpolation theory for the fractional power of the elliptic differential operator with Neumann boundary. For more details please see [12].

At the end of this section we see two examples. We know that the corresponding Feller process of Dirichlet form (4.2) is the subordinate reflected α-stable process and this process could be constructed by reflected Brownian motion through time transformation (see [12]). Taking $G=\mathbb{R}^n_+:=\{x=(x_1,\dots,x_n):x_n\ge0\}$, we can calculate the function $\kappa$ in Dirichlet form (4.2). In this case the corresponding process can be constructed by $(|Y_t|)_{t\ge0}$, where $(Y_t)_{t\ge0}$ is the symmetric α-stable process on $\mathbb{R}^n$. The following is the Dirichlet form of $(|Y_t|)_{t\ge0}$, which can be checked directly:

\[
\begin{aligned}
\mathcal{E}^\kappa(u,v) &= \frac12 A(n,-\alpha)\iint_{\mathbb{R}^n_+\times\mathbb{R}^n_+}\frac{\bigl(1+\frac{|x-y|^{n+\alpha}}{|x-\bar y|^{n+\alpha}}\bigr)(u(x)-u(y))(v(x)-v(y))}{|x-y|^{n+\alpha}}\,dx\,dy,\\
\mathcal{F}^\kappa &= \Bigl\{u\in L^2(\mathbb{R}^n_+):\iint_{\mathbb{R}^n_+\times\mathbb{R}^n_+}\frac{\bigl(1+\frac{|x-y|^{n+\alpha}}{|x-\bar y|^{n+\alpha}}\bigr)(u(x)-u(y))^2}{|x-y|^{n+\alpha}}\,dx\,dy<\infty\Bigr\},
\end{aligned}
\qquad (4.31)
\]

where $\bar y=(y_1,\dots,y_{n-1},-y_n)$ for $y=(y_1,\dots,y_n)$ and $\kappa(x,y)=1+\frac{|x-y|^{n+\alpha}}{|x-\bar y|^{n+\alpha}}$. Similarly, we can also prove that the Dirichlet form of the subordinate reflected stable process on $[0,1]$ is

\[
\begin{aligned}
\mathcal{E}^\kappa(u,v) &= \frac12 A(1,-\alpha)\iint_{(0,1)^2}\sum_{k=-\infty}^{\infty}\frac{(u(x)-u(y))(v(x)-v(y))}{|x\pm y+2k|^{1+\alpha}}\,dx\,dy,\\
\mathcal{F}^\kappa &= \Bigl\{u\in L^2(0,1):\iint_{(0,1)^2}\frac{(u(x)-u(y))^2}{|x-y|^{1+\alpha}}\,dx\,dy<\infty\Bigr\},
\end{aligned}
\qquad (4.32)
\]

where $\kappa(x,y)=\sum_{k=-\infty}^{\infty}\frac{|x-y|^{1+\alpha}}{|x\pm y+2k|^{1+\alpha}}$. We can check that the functions $\kappa$ in the above two examples satisfy all the conditions in the theorems of this section above.
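The two kernels $\kappa$ in (4.31) and (4.32) are easy to evaluate, which gives a quick way to see their symmetry and size. The sketch below is illustrative only and makes two assumptions that are not stated in the text: the symbol $\pm$ in (4.32) is read as a sum over both signs, and the infinite series is truncated at a hypothetical cutoff `k_max`. The function names are likewise hypothetical.

```python
import math

def kappa_half_space(x, y, alpha):
    """kappa(x, y) = 1 + |x - y|^(n+alpha) / |x - ybar|^(n+alpha) on the half space, as in (4.31).

    ybar is the reflection of y across {x_n = 0}. Assumes x != ybar.
    """
    x, y = tuple(x), tuple(y)
    n = len(x)
    y_bar = y[:-1] + (-y[-1],)
    return 1.0 + (math.dist(x, y) / math.dist(x, y_bar)) ** (n + alpha)

def kappa_interval(x, y, alpha, k_max=1000):
    """Truncated version of kappa in (4.32) for x != y in (0, 1).

    Assumption: '+/-' means that both |x - y + 2k| and |x + y + 2k| are summed.
    """
    numerator = abs(x - y) ** (1 + alpha)
    total = 0.0
    for k in range(-k_max, k_max + 1):
        total += numerator / abs(x - y + 2 * k) ** (1 + alpha)
        total += numerator / abs(x + y + 2 * k) ** (1 + alpha)
    return total

print(kappa_half_space((0.0, 0.0, 1.0), (1.0, 0.0, 2.0), 1.5))
print(kappa_interval(0.3, 0.7, 1.5))
```

In the half space one always has $|x-y|\le|x-\bar y|$, so this $\kappa$ takes values in $(1,2]$. On the interval, the term with $k=0$ and the minus sign equals 1, so the truncated sum exceeds 1.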

Acknowledgements. The author would like to thank Professor Dao-Ming Cao for valuable discussions about the distance function. The author wishes to thank Professor Zhi-Ming Ma for his helpful comments and continuous support. He also thanks the anonymous referee for many helpful suggestions and comments.

5. Appendix

The Proof of Lemma 3.2. The proof partly follows the calculation in Theorem 7.5 [10]. We only check (3.5) when $0<\alpha\le1$. In what follows we assume that $u\in C^1(0,T]$, $v\in C[0,T]$ and $F^{2-\alpha}u(0)$ exists. Without loss of generality we set $F^{2-\alpha}u(0)=-1$ and $T=1$.


Case 1. 0 < α < 1. We have δ 1 (u(y) − u(x))v(x) d xd y |x − y|1+α 0 δ 1 y δ v(x) dx dy u (s)ds = 1+α 0 δ |x − y| x δ 1 δ 1 1 δ v(x) v(x) dx ds u (s) dy + d x ds u (s) dy = 1+α 1+α 0 x δ |x − y| 0 δ s |x − y| δ v(x)u (s) v(x)u (s) 1 δ ds dx − = α 0 |δ − x|α |1 − x|α x 1 1 δ v(x)u (s) v(x)u (s) + ds dx − α 0 |s − x|α |1 − x|α δ s δ 1 δ v(x)u (s) v(x)u (s) 1 1 = ds d x + ds d x + I1 α α α 0 α δ 0 |δ − x| 0 |s − x| δ v(0)u (s) 1 v(0)u (s) ds = − α(1 − α) 0 δ α−1 |δ − s|α−1 1 1 v(0)u (s) v(0)u (s) + ds + I1 + I2 − α(1 − α) δ s α−1 |s − δ|α−1 δ v(0) 1 v(0) ds = − α(1 − α) 0 s 2−α δ α−1 s 2−α |δ − s|α−1 1 1 v(0) v(0) + ds + I1 + I2 + I3 − 2−α α(1 − α) δ s s |s − δ|α−1 1 1 1−α δ s 1 − |1 − s|1−α − |s − 1|1−α v(0) v(0) = ds + ds 2−α α(1 − α) 0 s α(1 − α) 1 s 2−α + I1 + I2 + I3 1 δ (s ∨ 1)1−α − |s − 1|1−α v(0) = ds + I1 + I2 + I3 , (5.1) α(1 − α) 0 s 2−α where δ δ δ 1 v(x)u (s) v(x)u (s) 1 I1 = − dx ds + dx ds , α α α 0 x |1 − x| 0 δ |1 − x| δ s 1 δ (v(x) − v(0))u (s) (v(x) − v(0))u (s) 1 I2 = ds dx + ds dx , α |δ − x|α |s − x|α 0 0 δ 0 δ v(0) 1 v(0) 1 u ds I3 = − (s) − α(1 − α) 0 δ α−1 |δ − s|α−1 s 2−α 1 1 v(0) v(0) 1 + u ds. − (s) − α(1 − α) δ s α−1 |s − δ|α−1 s 2−α Since v is continuous and lim u (t)t 2−α = 1, we have t↓0

lim(v(x) − v(0)) = 0, u (s) − x↓0

1 s 2−α

=o

1 s 2−α

.


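Hence we can prove that $I_1$, $I_2$, $I_3$ are finite and that

\[
\lim_{\delta\downarrow0}I_1=\lim_{\delta\downarrow0}I_2=\lim_{\delta\downarrow0}I_3=0. \qquad (5.2)
\]

Therefore we complete the proof by (5.1) and (5.2).

Case 2. $\alpha=1$. By a calculation similar to that in Case 1, we have

\[
\begin{aligned}
\int_0^{\delta}\!\!\int_{\delta}^{1}\frac{(u(y)-u(x))v(x)}{|x-y|^{2}}\,dy\,dx
&= v(0)\int_0^{\delta}\frac{\ln\delta-\ln(\delta-s)}{s}\,ds + v(0)\int_{\delta}^{1}\frac{\ln s-\ln(s-\delta)}{s}\,ds + o(1)\\
&= v(0)\int_0^{1}\frac{-\ln(1-s)}{s}\,ds + v(0)\int_1^{1/\delta}\frac{\ln s-\ln(s-1)}{s}\,ds + o(1)\\
&= v(0)\int_0^{1/\delta}\frac{\ln(s\vee1)-\ln|s-1|}{s}\,ds + o(1),
\end{aligned}
\qquad (5.3)
\]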

which leads to the conclusion for α = 1. − →

The proof of Lemma 3.4. Let x ∈ 1δ, θ . Without loss of generality we assume that x = 0. On rotating the coordinate system we may suppose that, near the origin, the surface ∂G has the form yn = x (y1 , . . . , yn−1 ) =

n−1

αi yi2 + k(y1 , . . . , yn−1 ),

(5.4)

i=1

where k(y1 , . . . , yn−1 ) is of higher order. Denote en = (0, . . . , 0, 1). − → − → → → If − n (x), θ < 0 (here − n (x) = en ), by (5.4) we see that t θ is under ∂G for small − →

t > 0. This implies that x ∈ / 1δ, θ . Thus the first conclusion is true. − → → In what follows we assume that − n (x), θ > 0. Denote a = max{1, α1 , . . . , αn−1 }. By Lemma 2.2 we can choose δ1 > 0 independent of x such that → (a) |∠(− n (y1 ), y2 − y1 ) −

π π |< , 2 6

∀ y1 , y2 ∈ ∂G ∩ B(x, 2δ1 ),

π , ∀ y ∈ ∂G ∩ B(x, 2δ1 ), 6 (c) ρ(y) = inf{|y − z| : z = {z i }1≤i≤n ∈ B(x, 2δ1 ), z ∈ ∂G}, ∀ y ∈ G ∩ B(x, δ1 ),

→ (b) |∠(− n (y), en )| <

(d) k(y1 , . . . , yn−1 ) ≤ a|(y1 , . . . , yn−1 )|2 ,

when |(y1 , . . . , yn−1 )| < δ1 .

Denote − → t ). (y1t , . . . , ynt ) = t θ , y tn = x (y1t , . . . , yn−1 By (5.4) and (d) we have y tn ≤ 2at 2 for 0 < t < δ1 . Therefore when 0 < t < − → → 1 ∧ 1)(δ1 ∧ 1) − n (x), θ , we have ( 4a t → − → − → → ynt − y tn ≥ t − n (x), θ − 2at 2 ≥ − n (x), θ . 2

(5.5)


Denote $(y_1^t,\dots,y_n^t)$ and $(y_1^t,\dots,\bar y_n^t)$ by $P_1$ and $P_2$ respectively. By (c) there exists a point $P_3$ in $B(x,2\delta_1)$ such that $|\overrightarrow{P_3P_1}|=\rho(P_1)$. By (a) and (b),

\[
\frac{\pi}{3}\le\angle(\overrightarrow{P_3P_1},\overrightarrow{P_3P_2})\le\frac{2\pi}{3},\qquad \angle(\overrightarrow{P_2P_1},\overrightarrow{P_3P_1})<\frac{\pi}{6}.
\]

This implies $|\overrightarrow{P_1P_3}|\ge\frac12|\overrightarrow{P_1P_2}|$. Hence by (5.5) we have $\rho(t\vec\theta)\ge\frac{t}{4}\langle\vec n(x),\vec\theta\rangle$. Now we get (ii) by taking $k_1=(\frac{1}{4a}\wedge1)(\delta_1\wedge1)$.

By Lemma 2.2, $\partial G_\delta$ is a $C^2$ boundary and its second order derivatives are bounded uniformly when $\delta$ is small. With this fact we can prove (iii) in the same way as above.

References

1. Bass, R.F.: Probabilistic Techniques in Analysis. New York: Springer, 1995
2. Bogdan, K., Burdzy, K., Chen, Z.Q.: Censored stable processes. Probab. Theory Relat. Fields 127, 89–152 (2003)
3. Bogdan, K., Byczkowski, T.: Potential theory for the α-stable Schrödinger operator on bounded Lipschitz domains. Studia Math. 133(1), 53–92 (1999)
4. Chandrasekhar, S.: Stochastic problems in Physics and Astronomy. Rev. Mod. Phys. 15, 1–89 (1943)
5. Chen, Z.Q., Kim, P.: Green function estimates for censored stable processes. Probab. Theory Relat. Fields 124, 595–610 (2002)
6. Chen, Z.Q., Kumagai, T.: Heat Kernel Estimates for Stable-like Process on d-Sets. Stochastic Process. Appl. 108, 27–62 (2003)
7. Fukushima, M., Oshima, Y., Takeda, M.: Dirichlet Forms and Symmetric Markov Processes. Berlin: Walter de Gruyter, 1994
8. Gilbarg, D., Trudinger, N.: Elliptic Partial Differential Equations of Second Order. 2nd edition, Berlin-Heidelberg-New York: Springer-Verlag, 1983
9. Guan, Q.Y.: Reflecting α-Symmetric Stable Processes and Power Laplace Operators. Doctor Dissertation, Institute of Applied Mathematics, Academy of Mathematics and System Sciences, CAS, 2003
10. Guan, Q.Y., Ma, Z.M.: The reflected α-symmetric stable processes and regional fractional Laplacian. Probab. Theory Relat. Fields 134(4), 649–694 (2006)
11. Guan, Q.Y., Ma, Z.M.: Boundary value problems for fractional Laplacian. Stoch. and Dyn. 5(3), 385–424 (2005)
12. Jacob, N., Schilling, R.: Some Dirichlet spaces obtained by subordinate reflected diffusions. Rev. Mat. Iberoamericana 15, 59–91 (1999)
13. Klafter, J., Shlesinger, M.F., Zumofen, G.: Beyond Brownian motion. Physics Today 49, 33–39 (1996)
14. Landkof, N.S.: Foundations of Modern Potential Theory. New York: Springer, 1972
15. Li, Y.Y., Nirenberg, L.: The Dirichlet problem for singularly perturbed elliptic equations. Comm. Pure Appl. Math. 51(11–12), 1445–1490 (1998)
16. Metzler, R., Klafter, J.: The Random Walk's Guide to Anomalous Diffusion: A Fractional Dynamics Approach. Phys. Rep. 339, 1–77 (2000)
17. Solomon, T.H., Weeks, E.R., Swinney, H.L.: Observation of anomalous diffusion and Lévy flights in a two-dimensional rotating flow. Phys. Rev. Lett. 71, 3975–3978 (1993)
18. Stein, E.M.: Singular integrals and differentiability properties of functions. Princeton, NJ: Princeton Univ. Press, 1970

Communicated by A. Kupiainen

Commun. Math. Phys. 266, 331–342 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0063-8

Communications in

Mathematical Physics

Spacelike Hypersurfaces with Free Boundary in the Minkowski Space under the Effect of a Timelike Potential

Rafael López

Departamento de Geometría y Topología, Universidad de Granada, 18071 Granada, Spain. E-mail: [email protected]

Received: 19 April 2005 / Accepted: 3 March 2006
Published online: 28 June 2006 – © Springer-Verlag 2006

Abstract: In this paper we consider a variational problem for spacelike hypersurfaces in the $(n+1)$-dimensional Lorentz-Minkowski space $\mathbb{L}^{n+1}$, whose critical points are hypersurfaces supported in a spacelike hyperplane $\Pi$ determined by two facts: the mean curvature is a linear function of the distance to $\Pi$ and the hypersurface makes a constant angle with $\Pi$ along its boundary. We prove that the hypersurface is rotationally symmetric with respect to a straight line orthogonal to $\Pi$ and that each (non-empty) intersection with a hyperplane parallel to $\Pi$ is a round $(n-1)$-sphere. A similar result is proved for hypersurfaces trapped between two parallel hyperplanes.

1. Introduction and Statement of Results

Consider the following variational problem: let $\Pi$ be a spacelike hyperplane in the $(n+1)$-dimensional Lorentz-Minkowski space $\mathbb{L}^{n+1}$ and denote by $\Pi^+$ one of the two halfspaces into which $\Pi$ divides $\mathbb{L}^{n+1}$. Let $M$ be a compact spacelike hypersurface whose boundary $\partial M$ lies on $\Pi$ and whose interior, $\mathrm{int}(M)$, is included in $\Pi^+$. The hyperplane $\Pi$ is called the support hypersurface. Let us denote by $\Omega$ the bounded domain determined by $\partial M$ on $\Pi$. In this setting, we consider all perturbations in such a way that $M$ is adhered to $\Pi$, that is, $\partial M\subset\Pi$, and $\mathrm{int}(M)$ remains in $\Pi^+$. We consider the following energy functional:

\[
E = |M| - \cosh\beta\,|\Omega| + \int_M Y\,dM,
\]

where $|M|$ and $|\Omega|$ denote the $n$-areas of $M$ and $\Omega$ respectively. Here $Y$ is a potential that, up to constants, measures at each point the distance to $\Pi$. We say that $Y$ is a timelike potential associated to $\Pi$ (we shall drop the reference to $\Pi$ if it is well understood in the context). We seek those configurations in a state of equilibrium, that is, when the energy is critical under any perturbation that does not change the volume enclosed by $M\cup\Omega$. According to the principle of virtual work, the equilibrium of the system is achieved if


1. the mean curvature of $M$ is a linear function of the distance to $\Pi$ and
2. the hyperbolic angle $\beta$ with which $M$ and $\Pi$ intersect along $\partial M$ is constant.

See Fig. 1. In such a situation, we shall say that $M$ is a stationary hypersurface. In the absence of a timelike potential $Y$, $M$ is a hypersurface with constant mean curvature. Constant mean curvature hypersurfaces are of interest in different problems in general relativity. We refer to [14, 17, 20] and references therein. Alías and Pastor have proved the following result:

Theorem [4]. Consider a compact spacelike surface $M$ in $\mathbb{L}^3$ with constant mean curvature and supported in a plane $\Pi$. If the hyperbolic angle of contact between $M$ and $\Pi$ is constant along $\partial M$, then $M$ must be an umbilical surface, that is, a planar disc or a hyperbolic cap.

See also [18] for other results in Lorentzian space forms. The main argument used there is the holomorphicity of the Hopf differential in a surface with constant mean curvature. However, as they pointed out there, this method fails when the dimension of the ambient space is bigger than 3. In the present article, we extend the result of Alías and Pastor in two directions. First, we consider arbitrary dimension for the ambient space $\mathbb{L}^{n+1}$; and second, we assume the presence of a timelike potential corresponding to the support hyperplane $\Pi$. Our proof uses the Alexandrov reflection method. This technique was first used by Alexandrov to prove that a closed embedded constant mean curvature surface in Euclidean 3-dimensional space must be a round sphere [1]. The proof uses the hypersurface itself as a barrier of comparison with itself and the Hopf maximum principle for elliptic equations. Here, we prove a more general result:

Theorem 1. Let $M\subset\mathbb{L}^{n+1}$ be a compact embedded spacelike hypersurface supported in a spacelike hyperplane $\Pi$. Assume that

1. $M$ lies in one side of $\Pi$.
2. The mean curvature of $M$ is a function that depends only on the distance to $\Pi$.
3. The hyperbolic angle that $M$ makes with $\Pi$ along $\partial M$ is constant.

Then there is a vertical straight line $L$ orthogonal to $\Pi$ about which $M$ is rotationally symmetric. Moreover, $M$ is topologically an $n$-ball and the intersection of $M$ with a hyperplane orthogonal to $L$ is an $(n-1)$-sphere whose center lies on the axis $L$. In the case that the mean curvature is constant, $M$ is a piece of a hyperbolic hyperplane bounded by a round $(n-1)$-sphere or $M$ is a domain of $\Pi$.

Let us relate Theorem 1 with that of Alías and Pastor. First, we point out that a compact spacelike hypersurface in $\mathbb{L}^{n+1}$ is a graph on some domain of the support plane,

Fig. 1. A stationary hypersurface over a plane $\Pi$. The hyperbolic angle $\beta$ is constant along $\partial M$


and so, it is embedded [4]. On the other hand, the fact that the mean curvature of $M$ is constant corresponds to the absence of a timelike potential in $\mathbb{L}^3$, as we will see in Sect. 2, Eq. (4). Moreover, an easy application of the maximum principle to the constant mean curvature equation implies that the hypersurface has no points on both sides of $\Pi$. Hence $M$ lies on one side of $\Pi$, unless $M$ is included in $\Pi$, in which case $M$ is a planar domain. As a conclusion of Theorem 1, we describe the shape of the critical points of the initial variational problem.

Corollary 1. Let $\Pi$ be a spacelike hyperplane in $\mathbb{L}^{n+1}$. Then any stationary embedded hypersurface $M$ supported in $\Pi$ is a hypersurface of revolution with respect to a straight line orthogonal to $\Pi$. Moreover, each non-empty intersection of $M$ with a hyperplane parallel to $\Pi$ is a round $(n-1)$-sphere.

A similar result was proved by Wente in Euclidean space [21]. Recently, the present author has detailed the size and shape of a stationary surface in $\mathbb{L}^3$ supported in a spacelike plane [16]. This paper is organized as follows. Section 2 is a preparatory introduction where we formulate the variational problem. Next we present the background analysis of the maximum principle, and in Sect. 4 we prove our main result, Theorem 1. In Sect. 5 we extend our result to stationary hypersurfaces trapped between two parallel hyperplanes and, finally, we summarize the results and the conclusions in Sect. 6.

2. Preliminaries

In this section we present the variational problem introduced in the above section. Many of these results appear in the literature and we refer to it for more details. See [4–8]. Let $\mathbb{L}^{n+1}$ denote the $(n+1)$-dimensional Lorentz-Minkowski space, that is, the real vector space $\mathbb{R}^{n+1}$ endowed with the Lorentzian metric $\langle\,,\rangle=dx_1^2+\cdots+dx_n^2-dx_{n+1}^2$, where $x=(x_1,\dots,x_{n+1})$ are the canonical coordinates in $\mathbb{L}^{n+1}$. An immersion $x:M^n\to\mathbb{L}^{n+1}$ of a smooth $n$-manifold $M$ is called spacelike if the induced metric on $M$ is positive definite. Observe that $a=(0,\dots,0,1)$ is a unit timelike vector field globally defined on $\mathbb{L}^{n+1}$, which determines a time-orientation on the space $\mathbb{L}^{n+1}$. This allows us to choose a unit normal vector field $N$ on $M$ which is in the same time-orientation as $a$, and hence $M$ is oriented by $N$. We will refer to $N$ as the future-directed Gauss map of $M$. The spacelike condition imposes topological restrictions on the immersion $x$. For example, there are no closed spacelike hypersurfaces and so any compact spacelike hypersurface has non-empty boundary. If $\Gamma$ is an $(n-1)$-submanifold in $\mathbb{L}^{n+1}$ and $x:M\to\mathbb{L}^{n+1}$ is a spacelike immersion of a compact hypersurface, we say that the boundary of $M$ is $\Gamma$ if the restriction $x:\partial M\to\Gamma$ is a diffeomorphism. For spacelike hypersurfaces, the projection $\pi:\mathbb{L}^{n+1}\to\Pi=\{x_{n+1}=0\}$, $\pi(x)=(x_1,\dots,x_n)$, is a local diffeomorphism between $\mathrm{int}(M)$ and $\pi(\mathrm{int}(M))$. Thus, $\pi$ is an open map and $\pi(\mathrm{int}(M))$ is a domain in $\Pi$. If $M$ is compact, then $\pi:M\to\pi(M)$ is a covering map. Thus, any compact spacelike hypersurface whose boundary is a graph over the boundary of an open region $\Omega\subset\{x_{n+1}=0\}$ is a graph over $\Omega$. From now on, we shall identify a point $p\in M$ with its image by $x$, namely $x(p)$. Consider a compact spacelike hypersurface $M\subset\mathbb{L}^{n+1}$ whose boundary $\partial M$ is on a hyperplane $\Pi$, which must be of spacelike type. Without loss of generality


and after an isometry of the ambient space, we assume that $\Pi=\{x_{n+1}=0\}$ and that $M$ is the graph of a function $u$ on a domain of $\Pi$. Although the boundary $\partial M$ is possibly non-connected, the causal character of $M$ implies the existence of a component of $\partial M$, named $\Gamma_0$, such that $\pi(\mathrm{int}(M))$ is contained in the bounded domain determined by $\Gamma_0$ in $\Pi$. Therefore, $M$ defines an "interior" domain, that is, there exists a bounded region $\Omega\subset\Pi$ such that $M\cup\Omega$ determines in $\mathbb{R}^{n+1}$ a bounded domain $B$, called the "interior" of $M$. For spacelike hypersurfaces of $\mathbb{L}^{n+1}$, the notions of the first and the second fundamental form are defined in the same way as in the Euclidean space. In classical notation, they are given by

\[
\mathrm{I}=\sum_{i,j}^{n} g_{ij}\,dx_i\,dx_j,\qquad \mathrm{II}=\sum_{i,j}^{n} h_{ij}\,dx_i\,dx_j,
\]

where $g_{ij}=\langle\partial_i x,\partial_j x\rangle$ is the metric induced on $M$ by $x$ and $h_{ij}=\langle\partial_i N,\partial_j x\rangle$. Then the mean curvature $H$ of $x$ is

\[
H=\frac1n\,\mathrm{trace}\,[(g_{ij})^{-1}(h_{ij})].
\]

Assume that $M$ is the graph of a smooth function $u=u(x_1,\dots,x_n)$ defined over a domain $\Omega\subset\Pi$. The spacelike condition implies $|\nabla u|<1$, where $\nabla$ is the gradient operator in $\mathbb{R}^n$, and the Gauss map is

\[
N=\frac{(\nabla u,1)}{\sqrt{1-|\nabla u|^2}}.
\]

According to this orientation, the mean curvature $H$ at the point $(x,u(x))$, $x\in\Omega$, satisfies the equation

\[
(1-|\nabla u|^2)\Delta u+\sum_{i,j}^{n}u_iu_ju_{ij}=nH(1-|\nabla u|^2)^{3/2}. \qquad (1)
\]

This equation can alternatively be written in divergence form

\[
\mathrm{div}(Tu)=nH,\qquad Tu=\frac{\nabla u}{\sqrt{1-|\nabla u|^2}}. \qquad (2)
\]
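Equation (2) can be checked numerically on a simple explicit graph. The sketch below is illustrative and not taken from the paper: it approximates $\mathrm{div}(Tu)/n$ by central finite differences and applies it to the hyperbolic cap $u(x)=\sqrt{R^2+|x|^2}$, for which $Tu=x/R$ and hence $H=1/R$. The function names and the step sizes are assumptions chosen for the example.

```python
import math

def T(u, x, h=1e-5):
    """Approximate Tu = grad u / sqrt(1 - |grad u|^2) at x by central differences."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((u(xp) - u(xm)) / (2 * h))
    w = math.sqrt(1.0 - sum(g * g for g in grad))   # needs the spacelike condition |grad u| < 1
    return [g / w for g in grad]

def mean_curvature(u, x, h=1e-4):
    """Approximate H = div(Tu) / n at x, following the divergence form (2)."""
    n = len(x)
    div = 0.0
    for i in range(n):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        div += (T(u, xp)[i] - T(u, xm)[i]) / (2 * h)
    return div / n

R = 2.0
cap = lambda x: math.sqrt(R * R + sum(c * c for c in x))   # hyperbolic cap of "radius" R
print(mean_curvature(cap, [0.3, -0.5]))                    # close to 1 / R
```

For the cap, $\nabla u=x/u$ and $1-|\nabla u|^2=R^2/u^2$, so $Tu=x/R$ is linear and its divergence is $n/R$; the finite-difference value should therefore agree with $1/R$ up to discretization and rounding error.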

We present now the notion of stationary hypersurface in $\mathbb{L}^{n+1}$. Consider a spacelike hyperplane $\Pi$, which divides $\mathbb{L}^{n+1}$ into two halfspaces. Let us orient $\Pi$ by the future-directed unit timelike vector field $N_\Pi$ and consider $\mathbb{L}^{n+1}_+$, the component of $\mathbb{L}^{n+1}\setminus\Pi$ towards which $N_\Pi$ points. Let $x:M\to\mathbb{L}^{n+1}$ be a connected compact hypersurface with boundary $\partial M$, smooth even at $\partial M$, such that $x(\mathrm{int}(M))\subset\mathbb{L}^{n+1}_+$ and $x(\partial M)\subset\Pi$. A variation of $M$ is a differentiable map $X:(-\epsilon,\epsilon)\times M\to\mathbb{L}^{n+1}$ such that $X_t:M\to\mathbb{L}^{n+1}$ defined by $X_t(p)=X(t,p)$, $p\in M$, is an immersion and $X_0=x$. The variation is called admissible if $X_t(\mathrm{int}(M))\subset\mathbb{L}^{n+1}_+$ and $X_t(\partial M)\subset\Pi$ for all $t$. The functionals $A,S:(-\epsilon,\epsilon)\to\mathbb{R}$ defined by

\[
A(t)=\int_M dA_t,\qquad S(t)=\int_{\Omega_t}d\Omega,
\]


measure, respectively, the $n$-area of $M$ with the metric induced by $X_t$ and the $n$-area of $\Omega_t\subset\Pi$, the region in $\Pi$ bounded by $X_t(\partial M)$. Finally, the volume function $V:(-\epsilon,\epsilon)\to\mathbb{R}$ is defined by

\[
V(t)=\int_{[0,t]\times M}X^{*}\,dV,
\]

where $dV$ is the canonical volume element of $\mathbb{L}^{n+1}$. The variation $X$ is said to be volume-preserving if $V(t)=V(0)$ for all $t$. The variational vector field of $X$ is

\[
\xi(p)=\frac{\partial X}{\partial t}(p)\Big|_{t=0}.
\]

If we assume the existence of a potential $Y$, then the resultant variation energy is

\[
Y(t)=\int_M Y\,dA_t.
\]

The energy function $E:(-\epsilon,\epsilon)\to\mathbb{R}$ of the mechanical system is defined by

\[
E(t)=A(t)-\cosh\beta\,S(t)+Y(t), \qquad (3)
\]

where $\beta\in\mathbb{R}$ is a constant. We say that the immersion $x$ is stationary if $E'(0)=0$ for any volume-preserving admissible variation of $x$. One can show that the first variation formula for the energy is:

\[
E'(0)=\int_M(-nH+Y+\lambda)\langle N,\xi\rangle\,dM+\int_{\partial M}\langle\xi,\nu\rangle\,(\cosh\beta+\langle N,N_\Pi\rangle)\,ds,
\]

where $\nu$ is the inner unitary conormal to $\Omega$ along $\partial M$. Thus, we have

Proposition 1. Let $\Pi$ be a spacelike hyperplane in $\mathbb{L}^{n+1}$ and let $M$ be a compact hypersurface. Let us consider $x:M\to\mathbb{L}^{n+1}$ a smooth spacelike immersion such that $x(\mathrm{int}(M))\subset\mathbb{L}^{n+1}_+$ and $x(\partial M)\subset\Pi$. Then $x$ is stationary if and only if

1. The mean curvature $H$ of $x$ satisfies the relation

\[
nH(p)=Y(p)+\lambda,\qquad p\in M, \qquad (4)
\]

where $Y$ is a potential and $\lambda$ is a Lagrange parameter determined by an eventual volume constraint;
2. The hypersurface $M=x(M)$ meets the support hyperplane $\Pi$ with a constant hyperbolic angle $\beta$, and $\cosh\beta=-\langle N,N_\Pi\rangle$ along $\partial M$.

Our interest in this article lies in the case for which $Y$ is a timelike potential associated to $\Pi$. As we have supposed that $\Pi=\{x_{n+1}=0\}$,

\[
Y(p)=\kappa\,x_{n+1}(p), \qquad (5)
\]

for a constant $\kappa$. When $\kappa=0$, $Y=0$ and the mean curvature $H$ of the hypersurface $M$ is constant, with $H=\lambda/n$. On the other hand, when we talk of contact angle, it is implicitly assumed that the boundary regularity of $M$ is enough to ensure that the idea of a normal vector to $M$


at every boundary point makes sense. For this, we will require $M$ to be a sufficiently smooth hypersurface up to the boundary $\partial M$. The contact angle $\beta$ is given by

\[
\cosh\beta=-\langle N,N_\Pi\rangle=\frac{1}{\sqrt{1-|\nabla u|^2}} \qquad (6)
\]

along $\partial M$. The constancy of the angle $\beta$ implies that $|\nabla u|$ is constant along $\partial M$, and consequently, the Euclidean angle between $M$ and $\Pi$ along $\partial M$ is also constant. The constant $\lambda$ is a Lagrange multiplier arising from the volume constraint: since $M$ is a graph on $\Omega$, the volume $V$ enclosed by $M\cup\Omega$ is

\[
V=\int_\Omega u\,d\Omega.
\]

By combining (4), (5), (6) and the divergence theorem in (2), we obtain $\kappa V+\lambda|\Omega|=\cosh\beta\,|\partial\Omega|$ or

\[
\lambda=\frac{\cosh\beta\,|\partial\Omega|-\kappa V}{|\Omega|}.
\]
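The Gauss map of a graph and the contact-angle formula (6) can be illustrated with a few lines of code. This is a minimal sketch, not part of the paper, assuming $\Pi=\{x_{n+1}=0\}$ with $N_\Pi=(0,\dots,0,1)$; the function names are hypothetical. It evaluates the Lorentzian product used in Sect. 2, builds $N$ from a given gradient, and returns $-\langle N,N_\Pi\rangle$.

```python
import math

def lorentz(a, b):
    """Lorentzian product <a, b> = a_1 b_1 + ... + a_n b_n - a_{n+1} b_{n+1}."""
    return sum(s * t for s, t in zip(a[:-1], b[:-1])) - a[-1] * b[-1]

def gauss_map(grad):
    """Future-directed unit normal N = (grad u, 1) / sqrt(1 - |grad u|^2) of a spacelike graph."""
    w = math.sqrt(1.0 - sum(g * g for g in grad))   # requires |grad u| < 1
    return tuple(g / w for g in grad) + (1.0 / w,)

def cosh_contact_angle(grad):
    """cosh(beta) = -<N, N_Pi> with N_Pi = (0, ..., 0, 1), as in (6)."""
    n_pi = (0.0,) * len(grad) + (1.0,)
    return -lorentz(gauss_map(grad), n_pi)

grad = (0.3, 0.4)                                   # illustrative gradient with |grad u| = 0.5
N = gauss_map(grad)
print(lorentz(N, N))                                # N is unit timelike: <N, N> = -1
print(cosh_contact_angle(grad), 1.0 / math.sqrt(1.0 - 0.25))
```

Since the result depends on the gradient only through $|\nabla u|$, a constant hyperbolic angle along $\partial M$ is the same as a constant value of $|\nabla u|$ there, as stated after (6).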

3. The Maximum Principle

We consider $M_1$ and $M_2$ two spacelike graphs in $\mathbf{L}^{n+1}$ defined respectively by two functions $u_i$, $i = 1, 2$. We suppose that both functions are defined in the same domain $\Omega \subset \mathbb{R}^n = \{x_{n+1} = 0\}$. We know that the mean curvature $H_i$ of $M_i$ satisfies
\[
\operatorname{div}(T u_i) = n H_i(x), \qquad |\nabla u_i| < 1
\]
in $\Omega$. Assume that for each $x \in \Omega$, we have the inequality $H_1(x) \le H_2(x)$. The operator $\operatorname{div}(Tu)$ may be written in the form
\[
\operatorname{div}(Tu) = \frac{1}{W}\sum_{i}^{n} u_{ii} + \frac{1}{W^3}\sum_{i,j}^{n} u_i u_j u_{ij}, \qquad W = \sqrt{1 - |\nabla u|^2},
\]
where the subscript $i$ indicates the differentiation with respect to the variable $x_i$. We can write $\operatorname{div}(Tu) = \sum_{i,j}^{n} a_{ij}(x, u, p)\, u_{ij}$ with $p \in \mathbb{R}^n$, $p_i = u_i$, and where
\[
\sum_{i,j}^{n} a_{ij}(x, u, p)\,\xi_i \xi_j = \frac{(1 - |p|^2)|\xi|^2 + \langle \xi, p\rangle^2}{W^3}, \qquad \xi \in \mathbb{R}^n .
\]
As a consequence
\[
0 < \lambda(x, u, p)|\xi|^2 \le \sum_{i,j}^{n} a_{ij}(x, u, p)\,\xi_i \xi_j \le \Lambda(x, u, p)|\xi|^2,
\]
where
\[
\lambda(x, u, p) = \frac{1}{W}, \qquad \Lambda(x, u, p) = \frac{1}{W^3} .
\]
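The two ellipticity constants can be checked directly from the quadratic form. The following is a short sketch, assuming only that $\langle\cdot,\cdot\rangle$ denotes the Euclidean inner product of $\mathbb{R}^n$ in this formula and that $W^2 = 1 - |p|^2$ with $|p| < 1$:

```latex
\begin{align*}
\sum_{i,j} a_{ij}\,\xi_i\xi_j
  &= \frac{(1-|p|^2)|\xi|^2 + \langle \xi, p\rangle^2}{W^3}
   = \frac{|\xi|^2}{W} + \frac{\langle \xi, p\rangle^2}{W^3}, \\
\langle \xi, p\rangle^2 \ge 0
  &\;\Longrightarrow\;
  \sum_{i,j} a_{ij}\,\xi_i\xi_j \ge \frac{1}{W}\,|\xi|^2, \\
\langle \xi, p\rangle^2 \le |p|^2|\xi|^2
  &\;\Longrightarrow\;
  \sum_{i,j} a_{ij}\,\xi_i\xi_j
  \le \frac{(1-|p|^2) + |p|^2}{W^3}\,|\xi|^2 = \frac{1}{W^3}\,|\xi|^2 .
\end{align*}
```

Both bounds degenerate as $|p| \to 1$, which is consistent with uniform ellipticity holding only where $|\nabla u|$ stays bounded away from 1.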


Then the operator is elliptic for $|p| < 1$ and uniformly elliptic for compact domains. Let
\[
\phi(x, p, r) = \operatorname{div}(Tu) = nH(x), \tag{7}
\]
where $r = (r_{ij})$, $r_{ij} = u_{ij}$. Then $\phi$ is a smooth function defined in $\Omega \times D \times \mathbb{R}^{n^2}$ given explicitly by
\[
\phi(x, p, r) = \frac{1}{\sqrt{1 - |p|^2}} \sum_{i,j}^{n} \left(\delta_{ij} + \frac{p_i p_j}{1 - |p|^2}\right) r_{ij},
\]
where $D$ is the open unit disc of $\mathbb{R}^n$. For each $u = u_i$, $i = 1, 2$, we will use the notation $p^i$, $r^i$ and $H_i$. Since $H_1 \le H_2$, a standard argument using the chain rule shows that
\[
0 \le \phi(x, p^2, r^2) - \phi(x, p^1, r^1) = \sum_{i,j}^{n} \left(\int_0^1 \frac{\partial\phi}{\partial r_{ij}}(\theta(t))\, dt\right) w_{ij} + \sum_{j}^{n} \left(\int_0^1 \frac{\partial\phi}{\partial p_j}(\theta(t))\, dt\right) w_j := Lw,
\]
where $w = u_2 - u_1$, $w_i = \partial w/\partial x_i$, $w_{ij} = \partial^2 w/\partial x_i \partial x_j$ and $\theta = \theta(t) = (x, t p^2 + (1 - t) p^1, t r^2 + (1 - t) r^1)$. The right hand side of the above equation defines an elliptic operator $L$ because its principal part satisfies
\[
\left(\int_0^1 \frac{dt}{W(\theta(t))}\right) |\xi|^2 \le \sum_{i,j}^{n} \left(\int_0^1 \frac{\partial\phi}{\partial r_{ij}}(\theta(t))\, dt\right) \xi_i \xi_j \le \max\left\{\frac{1}{W_1^3}, \frac{1}{W_2^3}\right\} |\xi|^2,
\]
with $W_i = \sqrt{1 - |\nabla u_i|^2}$. Since the coefficients $a_{ij}$ are locally bounded, $L$ is locally uniformly elliptic and we are in position to apply the Hopf maximum principle to the difference function $w$, whether in its classical formulation ([11]) or its boundary point version ([12]): see also [10, Ch. 3]. Consequently, we have proved the following result:

Theorem 2 (The touching principle). Let $u$, $v$ be two smooth solutions to the same prescribed mean curvature equation (7) on a domain $\Omega \subset \mathbb{R}^n$. Suppose that $u \le v$ on $\Omega$ and $u(x_0) = v(x_0)$, $x_0 \in \Omega$. Then $u(x) = v(x)$ on $\Omega$. The same holds if $x_0 \in \partial\Omega$ with the extra hypothesis that $\partial u/\partial\nu = \partial v/\partial\nu$ at $x_0$, where $\nu$ is the outward unit normal to $\partial\Omega$.

4. Proof of Theorem 1

The method of proof used in this work is the Alexandrov reflection method, adapted to the present situation. For expository reasons, we will describe it briefly. Such techniques have been used in a variety of situations in differential geometry. See also the so-called “method of moving planes” in the context of the theory of partial differential equations (for example [9, 19]). The proof consists in showing that for each hyperplane $P$ orthogonal to $\Pi$, there exists some hyperplane $P^*$ parallel to $P$ such that $M$ is invariant by the symmetry with respect to $P^*$. Once this is shown for every such hyperplane, $M$ is a hypersurface of revolution whose axis $L$ is the intersection of all hyperplanes $P^*$. In addition, the proof


shows that $M$ is a graph over some domain of $P^*$ in each side. Consequently, each intersection of $M$ with a hyperplane parallel to $\Pi$ is a round $(n-1)$-sphere. For this, we work as follows. Fix a hyperplane $P$ orthogonal to $\Pi$ and consider the foliation of all translated copies of $P$ along a straight line orthogonal to $P$. Then, coming from infinity towards $M$ through such translations, one makes successive symmetries about these hyperplanes and looks for the first possible point of tangential contact with $M$. Then we use the hypersurface $M$ itself as comparison hypersurface, and the touching principle shows that in that new position $P^*$, the hypersurface $M$ is invariant by the symmetry with respect to $P^*$.

After an ambient isometry, we can suppose that the support hyperplane is $\Pi = \{x \in \mathbb{R}^{n+1};\ x_{n+1} = 0\}$ and that the mean curvature $H$ of $M$ depends only on the $x_{n+1}$ coordinate, that is, $H(x) = H(x_{n+1}(x))$ for any $x \in M$. Without loss of generality, we assume that $M$ lies over the hyperplane $\Pi$. Let $\Omega$ be the bounded region in $\Pi$ bounded by $\partial M$ such that $M \cup \Omega$ is a closed embedded hypersurface. Therefore, $M \cup \Omega$ determines two domains in $\mathbb{R}^{n+1}$, namely $A$ and $B$, where we denote, respectively, the non-bounded and the interior domain determined by $M \cup \Omega$ in $\mathbb{R}^{n+1}$. Recall that in our situation, both hyperbolic and Euclidean angles between $M$ and $\Pi$ are constant along $\partial M$.

Let $P$ be a fixed vertical hyperplane far away from $M$, so $P \subset A$. Let $P(t)$ be the 1-parameter family of translated copies of $P$, where we choose the parameter $t$ such that $P(t)$, $t > 0$, is included in the connected component determined by $P$ which contains $M$. Here $t = \operatorname{dist}(P(t), P)$, hence $P(0) = P$. Translating $P$ towards $M$ parallel to itself (say, to the right) one gets a first plane $P(t_1)$ that reaches $M$, that is, $P(t_1) \cap M \ne \emptyset$ but if $t < t_1$ then $P(t) \cap M = \emptyset$. Furthermore, the spacelike character of $M$ implies that $P(t_1)$ touches $M$ only at boundary points.
Now, when we move $P$ a little more to the right from $t = t_1$, until a hyperplane $P(t)$, the (closed) part of $M$ on the left of $P(t)$, which we denote by $M(t)^-$, is a graph (with respect to the horizontal) over a domain in $P(t)$ and no point of $M(t)^-$ has a horizontal tangent hyperplane. We denote by $M(t)^+$ the part of $M$ on the right of $P(t)$. Let $M(t)^*$ be the symmetry of $M(t)^-$ through $P(t)$. We know then that for $\epsilon > 0$ sufficiently small, $M(t)^* \subset B$, $t \in (t_1, t_1 + \epsilon)$. Recall that the symmetry with respect to a vertical hyperplane is an isometry of $\mathbf{L}^{n+1}$, and so, the mean curvature remains invariant by the symmetry. Because the mean curvature of $M$ depends only on the height with respect to $\Pi$, the mean curvature is the same for all points of $M(t)^+$ and $M(t)^*$ at the same height.

We continue now moving $P(t)$ to the right, and reflecting $M(t)^-$ about $P(t)$, successively until one reaches a first point of contact of the reflection of $M$ with $M(t)^+$. Consider the first parallel hyperplane $P(\tau)$ where one of the following conditions fails to hold (see Fig. 2 and 3):

1. $\operatorname{int}(M(\tau)^*) \subset \operatorname{int}(B)$.
2. $M(\tau)^-$ is a graph over a part of $P(\tau)$ and no point of $M(\tau)^-$ has a horizontal tangent hyperplane.

If 1) fails first, $M(\tau)^+$ and $M(\tau)^*$ touch at some interior point $p$ (Fig. 2 (a)), or at a boundary point $p$, with $p \in \partial M \cap \partial M(\tau)^*$ (Fig. 2 (b)). The fact that $M$ lies over $\Pi$ prohibits the possibility that $p \in \partial M$ and $p$ is a reflection of an interior point of $M(\tau)^-$. Thus the tangent hyperplanes of $M(\tau)^+$ and $M(\tau)^*$ agree at $p$ (in the latter case, we use that the hyperbolic angle between $M$ and $\Pi$ along $\partial M$ is constant). In addition, the reflections invert normal vectors and the Gauss maps $N$ of $M(\tau)^*$ and $M(\tau)^+$ at such point $p$ are the same. Then one applies the touching principle to $M(\tau)^+$ and $M(\tau)^*$ at


Fig. 2. The Alexandrov reflection method: (Case 1)

Fig. 3. The Alexandrov reflection method: (Case 2)

the point where they touch to conclude that $M(\tau)^+ = M(\tau)^*$. This means that $P(\tau)$ is a hyperplane of symmetry of $M$. If 2) fails first, then the point $p$ where the tangent hyperplane of $M(\tau)^-$ becomes horizontal lies on $\partial(M(\tau)^-) \subset P(\tau)$ (Fig. 3 (a)), or $p \in \partial M \cap P(\tau)$ (Fig. 3 (b)). In the former possibility one can apply the boundary touching principle to $M(\tau)^*$ and $M(\tau)^+$ to conclude that $P(\tau)$ is a hyperplane of symmetry of $M$; in the second one, the corresponding tangent hyperplanes of $M$ and $M(\tau)^*$ are identical because the hyperbolic angle with the a direction is the same at $p$. Then one applies the maximum principle at a corner point (see details in [19]). Thus, for each vertical hyperplane $P$, some parallel translate of $P$, namely $P^* = P(\tau)$, is a hyperplane of symmetry of $M$, and this proves that $M$ is a hypersurface of revolution.

To finish the proof, we consider the situation of absence of the timelike potential. We know that a stationary hypersurface $M$ with free boundary supported in a spacelike hyperplane must be a hypersurface of revolution. Set $|x| = r$, $x \in \mathbb{R}^n$. After an ambient isometry, we assume that the rotation axis is the $x_{n+1}$-line. Then $M$ is obtained by the rotation of the profile of a function $u : [0, R] \to \mathbb{R}$ with boundary conditions
\[
u(0) = u_0, \qquad u'(0) = 0 .
\]
Equation (2) becomes the ordinary differential equation
\[
\frac{1}{r^{n-1}} \frac{d}{dr}\left(\frac{r^{n-1} u'(r)}{\sqrt{1 - u'(r)^2}}\right) = \kappa\, u(r) + \lambda, \qquad 0 \le r < R. \tag{8}
\]


In the case that we are treating, $\kappa = 0$, the solution corresponds to a constant mean curvature hypersurface, with $H = \lambda/n$. A direct integration of (8) leads to (up to constants)
\[
u(r) = \sqrt{\frac{n^2}{\lambda^2} + r^2} \quad \text{if } \lambda \ne 0, \qquad u(r) = 0 \quad \text{if } \lambda = 0 .
\]
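As a sanity check, the first profile can be substituted back into (8) with $\kappa = 0$. The computation below is only a sketch: the shorthand $a = n/\lambda$ is introduced here for convenience, $\lambda > 0$ is assumed (for $\lambda < 0$ the same computation applies to $-u$), and the additive integration constant is dropped as in the text.

```latex
\begin{align*}
u(r) = \sqrt{a^2 + r^2},\quad a = \frac{n}{\lambda}
  &\;\Longrightarrow\;
  u'(r) = \frac{r}{u(r)}, \qquad 1 - u'(r)^2 = \frac{a^2}{u(r)^2}, \\
\frac{u'(r)}{\sqrt{1 - u'(r)^2}} &= \frac{r}{a}, \\
\frac{1}{r^{n-1}}\frac{d}{dr}\left(\frac{r^{n-1}u'(r)}{\sqrt{1 - u'(r)^2}}\right)
  &= \frac{1}{r^{n-1}}\frac{d}{dr}\left(\frac{r^{n}}{a}\right)
   = \frac{n}{a} = \lambda .
\end{align*}
```

The condition $u'(0) = 0$ holds automatically, and $|u'(r)| = r/\sqrt{a^2 + r^2} < 1$, so the profile is spacelike for every $r$.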

In the first case, $u$ describes a hyperbolic hyperplane of mean curvature $\lambda/n$; in the second one, we obtain that $M$ is a domain of $\Pi$. This completes the proof of Theorem 1.

The Alexandrov reflection method applies in a situation similar to that of Theorem 1, where the condition on the angle is replaced by a certain symmetry of the boundary. The next result generalizes those obtained in [2, 3] and its proof is omitted.

Corollary 2. Let $\Gamma \subset \mathbf{L}^{n+1}$ be a closed $(n-1)$-submanifold included in a spacelike plane $\Pi$ and symmetric with respect to a straight line $L \subset \Pi$. Let $M$ be a spacelike embedded hypersurface spanning $\Gamma$. Assume

1. Each component of $\Gamma \setminus (\Gamma \cap L)$ is a graph on $L$.
2. $M$ lies in one side of $\Pi$.
3. The mean curvature of $M$ is a function that depends only on the distance with respect to $\Pi$.

Then the plane $P$ orthogonal to $\Pi$ with $L \subset P$ is a hyperplane of symmetry of $M$. Moreover, each component of $M \setminus (M \cap P)$ is a graph on $P$. In the particular case that $\Gamma$ is a round $(n-1)$-sphere, $M$ is a hypersurface of revolution and the intersection of $M$ with a hyperplane parallel to $\Pi$ is a round $(n-1)$-sphere.

5. Bridges Between Two Parallel Hyperplanes

The Alexandrov reflection technique can be used in other configurations, for example, hypersurfaces interconnecting a set of spacelike hyperplanes. The setting that we will consider is that of a stationary hypersurface trapped between two parallel spacelike hyperplanes $\Pi_1$ and $\Pi_2$. Usually the hypersurface is called a bridge. In such case, the term S in (3) is the $n$-area of the domains that $\partial M$ bounds in each one of the hyperplanes. Again, in a state of equilibrium, the angle between the normal vector to the bridge and $\Pi_i$ along their lines of contact is constant (and possibly with different values in each hyperplane $\Pi_i$). The Alexandrov reflection method yields again the following

Theorem 3. Let $\Pi_1$ and $\Pi_2$ be two parallel spacelike hyperplanes in $\mathbf{L}^{n+1}$. Consider $M$ a spacelike embedded compact hypersurface included in the slab determined by $\Pi_1 \cup \Pi_2$ and whose boundary $\partial M$ intersects both $\Pi_1$ and $\Pi_2$. Assume that the mean curvature of $M$ depends only on the distance to $\Pi_i$ and the hyperbolic contact angle between $M$ and $\Pi_i$ is constant along $\partial M$ in each one of the two hyperplanes. Then $M$ is rotationally symmetric with respect to a straight line orthogonal to $\Pi_i$. Moreover, each (non-empty) intersection of $M$ with a hyperplane parallel to $\Pi_i$ is a round $(n-1)$-sphere.

Corollary 3. Let $M$ be a stationary embedded hypersurface in $\mathbf{L}^{n+1}$ trapped between two parallel hyperplanes $\Pi_1 \cup \Pi_2$. Then $M$ is a hypersurface of revolution with respect to a straight line orthogonal to the support hyperplanes $\Pi_i$. Moreover, each (non-empty) intersection of $M$ with a hyperplane parallel to $\Pi_i$ is a round $(n-1)$-sphere.


Remark 1. In the case that $M$ has constant mean curvature, $M$ is not necessarily a piece of a hyperbolic hyperplane. The family of constant mean curvature spacelike hypersurfaces bounded by two axial $(n-1)$-spheres in parallel hyperplanes is richer and, according to Eq. (8), the function $u$ is determined by elliptic integrals. See [15].

6. Final Discussions and Conclusions

In the Lorentz-Minkowski space $\mathbf{L}^{n+1}$, we have considered the variational problem of an embedded compact spacelike hypersurface $M$ resting on a spacelike hyperplane $\Pi$. The forces involved in the system are due to the $n$-areas of the hypersurface and of the domain that $M$ bounds in $\Pi$. Furthermore, we assume the existence of a timelike potential determined by $\Pi$. Our interest was the possible shapes of the hypersurface when it reaches an equilibrium: the energy of the system is critical under any perturbation of the hypersurface that maintains its adherence to the plate and the enclosed volume. The so-called Alexandrov reflection method allows us to prove that the hypersurface is rotationally symmetric with respect to a line orthogonal to the support hyperplane. Moreover, the intersection with a hyperplane parallel to $\Pi$ is a round $(n-1)$-sphere. This extends the result proved in [4] both to arbitrary dimension and to a more general mean curvature function of the hypersurface. A similar result has been obtained for bridges between parallel hyperplanes.

Another interesting support hypersurface is a hyperbolic hyperplane. One may conjecture that the only stationary hypersurfaces resting on a hyperbolic hyperplane are pieces of hyperbolic hyperplanes. This is true in $\mathbf{L}^3$ under the assumption of the constancy of the mean curvature [4]. However, we do not know if the same remains true under the effect of a timelike potential. We remark that a hyperbolic plane is also a hypersurface of revolution, which suggests that our conclusions can extend to this situation.

Acknowledgements.
This research is partially supported by MEC-FEDER grant no. MTM2004-00109.

References

1. Alexandrov, A.D.: Uniqueness theorems for surfaces in the large, V. Vestnik Leningrad Univ 13, 5 (1958)
2. Alías, L., López, R., Pastor, J.A.: Compact spacelike surfaces with constant mean curvature in the Lorentz-Minkowski 3-space. Tohoku Math. J. 50, 491 (1998)
3. Alías, L., Pastor, J.A.: Constant mean curvature spacelike hypersurfaces with spherical boundary in the Lorentz-Minkowski space. J. Geom. Phys. 28, 85 (1998)
4. Alías, L., Pastor, J.A.: Spacelike surfaces of constant mean curvature with free boundary in the Minkowski space. Class. Quantum Grav. 16, 1323 (1999)
5. Barbosa, J.L., Oliker, V.: Stable spacelike hypersurfaces with constant mean curvature in Lorentz space. In: Geometry and Global Analysis, Sendai: Tohoku University, 1993, pp. 161-164
6. Barbosa, J.L., Oliker, V.: Spacelike hypersurfaces with constant mean curvature in Lorentz space. Mat. Contemp. 4, 27 (1993)
7. Brill, D., Flaherty, F.: Isolated maximal surfaces in spacetime. Commun. Math. Phys. 50, 157 (1976)
8. Frankel, T.: Applications of Duschek’s formula to cosmology and minimal surfaces. Bull. Am. Math. Soc. 81, 579 (1975)
9. Gidas, B., Ni, W., Nirenberg, L.: Symmetry and related properties via the maximum principle. Commun. Math. Phys. 68, 209 (1979)
10. Gilbarg, D., Trudinger, N.S.: Elliptic Partial Differential Equations of Second Order. Berlin: Springer-Verlag, 1983
11. Hopf, E.: Elementare Bemerkungen über die Lösungen partieller Differentialgleichungen zweiter Ordnung vom elliptischen Typus. Preuss. Akad. Wiss. 19, 147 (1927)
12. Hopf, E.: A remark on linear elliptic differential equations of the second order. Proc. Amer. Math. Soc. 3, 791 (1952)
13. Laplace, P.S.: Traité de mécanique céleste; suppléments au Livre X. Paris: Gauthier-Villars, 1805
14. Lichnerowicz, A.: L’integration des equations de la gravitation relativiste et le problem des n corps. J. Math. Pures Appl. 23, 37 (1944)
15. López, R.: 2004 Surfaces of annulus type with constant mean curvature in Lorentz-Minkowski space. http://arxiv.org/list/math.DG/0501188, 2005
16. López, R.: Stationary liquid drops in Lorentz-Minkowski space. http://arxiv.org/list/math-ph/0501038, 2005
17. Marsden, J.E., Tipler, F.J.: Maximal hypersurfaces and foliations of constant mean curvature in general relativity. Phys. Rep. 66, 109 (1980)
18. Pastor, J.A.: Spacelike hypersurfaces of constant mean curvature with free boundary in Lorentzian space forms. Class. Quantum Grav. 17, 1921 (2000)
19. Serrin, J.: A symmetry problem in potential theory. Arch. Rat. Mech. Anal. 43, 304 (1971)
20. Stumbles, S.: Hypersurfaces of constant mean extrinsic curvature. Ann. Phys. 133, 57 (1980)
21. Wente, H.C.: The symmetry of sessile and pendent drops. Pacific J. Math. 88, 387 (1980)

Communicated by G.W. Gibbons

Commun. Math. Phys. 266, 343–353 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0029-x


Global Classical Solutions to the 3D Nordström-Vlasov System

Simone Calogero

Institutt for Matematiske Fag, NTNU, Alfred Getz’ vei 1, N-7491 Trondheim, Norway

Received: 8 July 2005 / Accepted: 11 January 2006
Published online: 19 May 2006 – © Springer-Verlag 2006

Abstract: The Nordström-Vlasov system describes the kinetic evolution of self-gravitating collisionless matter in the framework of a relativistic scalar theory of gravitation. We prove global existence and uniqueness of classical solutions for the corresponding initial value problem in three dimensions when the initial data for the scalar field are smooth and the initial particle density is smooth with compact support.

1. Introduction

This paper is concerned with the Cauchy problem for the Nordström-Vlasov system. The latter is a Lorentz invariant kinetic model describing the evolution of self-gravitating collisionless matter under the assumption that the gravitational forces are mediated by a scalar field. In a system of Cartesian coordinates $(t, x)$, $t \in \mathbb{R}$, $x \in \mathbb{R}^3$, the Nordström-Vlasov system is given by
\begin{align}
\partial_t^2\phi - \Delta_x\phi &= -\mu, \tag{1.1}\\
\mu(t, x) &= \int f(t, x, p)\, \frac{dp}{\sqrt{1 + |p|^2}}, \tag{1.2}\\
Sf - \left[(S\phi)\,p + (1 + |p|^2)^{-1/2}\,\nabla_x\phi\right]\cdot\nabla_p f &= 4 f\, S\phi. \tag{1.3}
\end{align}
Here $p \in \mathbb{R}^3$ is the momentum variable, $f = f(t, x, p)$ is the particle density in phase-space, $\phi = \phi(t, x)$ is the scalar gravitational field generated by the particles and
\[
S = \partial_t + \hat p\cdot\nabla_x, \qquad \hat p = \frac{p}{\sqrt{1 + |p|^2}};
\]

Current address: Departamento de Matemática para a Ciência e Tecnologia, Campus de Azurém da Universidade do Minho, 4800-058 Guimarães, Portugal.


$S$ is the free-transport operator and $\hat p$ denotes the relativistic velocity of a particle with momentum $p$. Units are chosen such that the mass of each particle, the gravitational constant and the speed of light are equal to unity. A solution $(f, \phi)$ of this system is interpreted as follows: The space-time is a four-dimensional Lorentzian manifold with a conformally flat metric which, in the coordinates $(t, x)$, takes the form
\[
g_{\mu\nu} = e^{2\phi}\,\mathrm{diag}(-1, 1, 1, 1), \qquad \mu, \nu = 0, \dots, 3. \tag{1.4}
\]
The particle density on the mass shell in this metric is $\tilde f = e^{-4\phi} f(t, x, e^{\phi} p)$, but it is more convenient to work with $f$ and $\phi$ as the dynamic variables. The Vlasov equation (1.3) is equivalent to the condition that $\tilde f$ is constant along the geodesic flow of the metric (1.4). The right-hand side of the field equation (1.1) is the trace of the stress energy tensor associated to $\tilde f$ with respect to the background Minkowski metric. We refer to [4] for a derivation of the equations.

Although scalar theories of gravity are not physically correct, they may provide useful simplified models for General Relativity [25]. Moreover, scalar fields play a central role in modern theories of classical and quantum gravity [8]. The physically correct relativistic model for self-gravitating collisionless matter is the Einstein-Vlasov system, which is discussed for instance in [1, 21]. Due to the very complicated structure of the Einstein equations, the evolution problem for the Einstein-Vlasov system remains poorly understood, even in spherical symmetry. In fact, global existence and uniqueness of (asymptotically flat) solutions to the Einstein-Vlasov system has been proved only for small data in spherical symmetry [21]. As opposed to this, the Cauchy problem for the Vlasov-Poisson system, which is the non-relativistic limit of the Einstein-Vlasov system [22, 23], is by now well understood, cf. [16, 18, 20, 24]. In [5] it is shown that Vlasov-Poisson is the non-relativistic limit of the Nordström-Vlasov system as well. However, global existence of classical solutions for the Nordström-Vlasov system and related models, such as the Vlasov-Maxwell system of plasma physics, has not yet been established. In this paper we show that this fundamental question has a positive answer for the Nordström-Vlasov system. Precisely, we shall prove the following

Theorem 1.
Given $f_0 : \mathbb{R}^3 \times \mathbb{R}^3 \to [0, \infty)$ and $\phi_0, \phi_1 : \mathbb{R}^3 \to \mathbb{R}$ satisfying $f_0 \in C_c^1$, $\phi_0 \in C_b^3 \cap H^1$, $\phi_1 \in C_b^2 \cap L^2$, there exists a unique global solution $(f, \phi)$ of the Nordström-Vlasov system in the class
\[
(f, \phi) \in C^1\big([0, \infty) \times \mathbb{R}^3 \times \mathbb{R}^3\big) \times C^2\big([0, \infty) \times \mathbb{R}^3\big),
\]
such that $(f, \phi)|_{t=0} = (f_0, \phi_0)$ and $(\partial_t\phi)|_{t=0} = \phi_1$.

The proof of Theorem 1 relies upon three main tools. A first one is the representation formula for the time derivative of the field given in [6, Prop. 1]. It turns out that estimating $\partial_t\phi$ is enough to obtain global existence. A second important tool is given by the (null cone) energy estimates which derive from the energy identity
\[
\partial_t e + \nabla_x\cdot\mathfrak{p} = 0, \tag{1.5}
\]
where
\[
e(t, x) = \int \sqrt{1 + |p|^2}\, f\, dp + \frac12(\partial_t\phi)^2 + \frac12(\nabla_x\phi)^2, \qquad \mathfrak{p}(t, x) = \int p\, f\, dp - \partial_t\phi\,\nabla_x\phi .
\]


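Upon integration, (1.5) leads to the identities
\begin{align}
\iint \sqrt{1 + |p|^2}\, f\, dp\, dx + \int \left[\frac12(\partial_t\phi)^2 + \frac12(\nabla_x\phi)^2\right] dx &= \text{const.}, \tag{1.6}\\
\int_{|x-y|\le t} (e + \mathfrak{p}\cdot\omega)(t - |x-y|, y)\, dy &= \int_{|x-y|\le t} e(0, y)\, dy, \qquad \omega = \frac{y - x}{|x - y|}. \tag{1.7}
\end{align}
Note also that
\[
e + \mathfrak{p}\cdot\omega = \int \left(\sqrt{1 + |p|^2} + \omega\cdot p\right) f\, dp + \frac12(\omega\wedge\nabla_x\phi)^2 + \frac12(\partial_t\phi - \omega\cdot\nabla_x\phi)^2 .
\]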
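The decomposition of $e + \mathfrak{p}\cdot\omega$ follows from completing a square. The lines below are a sketch of that computation, assuming only $|\omega| = 1$ and writing $\mathfrak{p}$ for the momentum density appearing in (1.5):

```latex
\begin{align*}
e + \mathfrak{p}\cdot\omega
 &= \int \big(\sqrt{1+|p|^2} + \omega\cdot p\big) f\,dp
   + \tfrac12(\partial_t\phi)^2 + \tfrac12|\nabla_x\phi|^2
   - \partial_t\phi\,(\omega\cdot\nabla_x\phi), \\
|\nabla_x\phi|^2
 &= (\omega\cdot\nabla_x\phi)^2 + |\omega\wedge\nabla_x\phi|^2, \\
\tfrac12(\partial_t\phi)^2 - \partial_t\phi\,(\omega\cdot\nabla_x\phi)
   + \tfrac12(\omega\cdot\nabla_x\phi)^2
 &= \tfrac12\big(\partial_t\phi - \omega\cdot\nabla_x\phi\big)^2 .
\end{align*}
```

Each of the three resulting terms is non-negative, since $\sqrt{1+|p|^2} + \omega\cdot p \ge \sqrt{1+|p|^2} - |p| > 0$ and $f \ge 0$, so the null cone identity controls each of them separately.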

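We shall refer to (1.7) as the null cone energy identity, while (1.6) is the usual conservation of total energy. They imply
\begin{align}
\|\partial_t\phi(t)\|_{L^2(\mathbb{R}^3)} + \|\nabla_x\phi(t)\|_{L^2(\mathbb{R}^3)} &\le \text{const.}, \tag{1.8}\\
\|(\omega\wedge\nabla_x\phi)(t, x)\|_{L^2(K_{t,x})} + \|(\partial_t\phi - \omega\cdot\nabla_x\phi)(t, x)\|_{L^2(K_{t,x})} &\le \text{const.}, \tag{1.9}
\end{align}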

where $\mathbb{R}^4 \supset K_{t,x} = \{(t - |x-y|, y) : |x-y| \le t,\ y \in \mathbb{R}^3\}$ is the past light cone with vertex at $(t, x)$ and base on $t = 0$. The energy estimates are used to bound $\partial_t\phi$. For this purpose the representation formula for $\partial_t\phi$ must be rewritten in a proper way to single out the contributions which are bounded by the null cone energy. All terms in the integral representation of $\partial_t\phi$ but one can be estimated using the energy estimates. The remaining term is estimated using the third, and most important, main ingredient in the proof of Theorem 1, which is the $L^2$ version of a lemma due to C. Pallard, see [17, Lemma 1.1]. This crucial lemma establishes an $L^\infty$ bound for the time integral of functions evaluated along the characteristics of the Vlasov equation. As we shall need a slightly different formulation of the result proved in [17], a proof thereof will be given when it is needed.

We remark that prior to the present result, global existence theorems for Nordström-Vlasov have been proved under certain restrictions, such as small data [10], spherical symmetry [2] and for the 2-dimensional system [15]. Global existence of weak solutions is established in [7]. The proofs of these results make use of techniques originally developed to study the Cauchy problem for the Vlasov-Maxwell system, see [3, 9, 11–14, 19]. This suggests that if the analogue of Theorem 1 holds for the Vlasov-Maxwell system, its proof might rely upon arguments similar to the ones presented in this paper. However, since for Vlasov-Maxwell one has to estimate a vector field instead of a scalar field, the proof of global existence in the plasma physics case seems considerably more difficult and requires some additional non-trivial idea.

2. Preliminaries

Throughout the paper we denote by $C(t)$ any continuous non-decreasing positive function of time. If it is a constant, we denote it simply by $C$. The characteristics of the differential operator in the left hand side of (1.3) are the solutions of
\[
\dot x = \hat p, \qquad \dot p = -(S\phi)\,p - (1 + |p|^2)^{-1/2}\,\nabla_x\phi \tag{2.10}
\]


and we denote by $(X, P)(s)$ the characteristic curve satisfying $(X, P)(t) = (x, p)$. Note that $(X, P)(s)$ also depends on $(x, p)$, but this is not reflected in our notation. The function $e^{-4\phi} f$ is constant along the solutions of (2.10). We deduce that
\[
f(t, x, p) = f_0(X(0), P(0))\, \exp\left[4\phi(t, x) - 4\phi_0(X(0))\right], \tag{2.11}
\]
whence
\[
\|e^{-4\phi} f(t)\|_\infty \le C. \tag{2.12}
\]
Let $\phi = \phi_{\mathrm{hom}} + \psi$, where $\psi$ is the solution of (1.1) with zero data and $\phi_{\mathrm{hom}}$ solves $\Box\phi = 0$ with data $(\phi_0, \phi_1)$. Since $f \ge 0$, then $\psi \le 0$. Therefore
\[
\phi(t, x) \le C(t), \tag{2.13}
\]
and
\[
e^{\phi}|\phi| \le C(t). \tag{2.14}
\]
For (2.14) we used that $\sup_{\xi \le 0} e^{\xi}|\xi| \le C$. Note also that $\|f(t)\|_\infty \le C(t)$. In [6] it is proved that the Cauchy problem for the Nordström-Vlasov system has a unique classical solution locally in time. Let $T_{\max}$ be the maximal time of existence and denote
\[
\tilde P(t) = \sup_{0 \le s < t} \{|p| : f(s, x, p) \ne 0, \text{ for some } x \in \mathbb{R}^3\}.
\]
In [6, 7] it is proved that $\tilde P(T_{\max}) < \infty \Rightarrow T_{\max} = \infty$, i.e., the solution could blow up in finite time only if the momentum support of $f$ becomes unbounded. However, for the proof of Theorem 1 it is essential to look at another quantity. Define
\[
P(t) = \sup_{0 \le s < t} \left\{ e^{\phi}\sqrt{1 + |p|^2} : f(s, x, p) \ne 0, \text{ for some } x \in \mathbb{R}^3 \right\},
\]
the maximal particle energy in the support of $f$.

Lemma 1. The assertions $\tilde P(t) < \infty$ and $P(t) < \infty$ are equivalent. In particular $P(T_{\max}) < \infty \Rightarrow T_{\max} = \infty$.

Proof. Since $e^{\phi} \le C(t)$, then we have $P(t) \le C(t)\sqrt{1 + \tilde P(t)^2}$. Moreover, using (2.12)-(2.13),
\[
\mu(t, x) \le \int_{|p| \le e^{-\phi}P(t)} f\, \frac{dp}{\sqrt{1 + |p|^2}} \le C e^{2\phi} P(t)^2 \le C(t) P(t)^2; \tag{2.15}
\]
hence $P(t) < \infty$ implies $\|\mu(t)\|_\infty \le C(t)$ and therefore also $\|\phi(t)\|_\infty \le C(t)$. Thus
\[
P(t) < \infty \;\Rightarrow\; \tilde P(t) \le \sup_{s \in [0, t)} e^{\|\phi(s)\|_\infty} P(t) \le C(t).
\]


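In order to estimate the function $P(t)$ we shall use that, along characteristics,
\[
\frac{d}{ds}\left[e^{2\phi}\left(1 + |p|^2\right)\right] = 2 e^{2\phi}\,\partial_s\phi. \tag{2.16}
\]
The aim is to transform (2.16) into a Grönwall-type inequality by estimating $\partial_s\phi$ in terms of $P(t)$. An estimate like $|\partial_t\phi| \le C(t) P(t)^2 \log P(t)$ would be enough. However we are not able to obtain such a pointwise estimate for $\partial_t\phi$. Rather we have to use the integral version of (2.16), namely
\[
e^{2\phi}\left(1 + |p|^2\right) = e^{2\phi_0(X(0))}\left(1 + |P(0)|^2\right) + 2\int_0^t e^{2\phi}\,\partial_s\phi(s, X(s))\, ds. \tag{2.17}
\]
Eventually the quantity we shall estimate is the time integral in the right hand side of (2.17). For this purpose we need the integral representation formula for $\partial_t\phi$ which was derived in [6]. We recall it here for the sake of reference:
\[
\partial_t\phi(t, x) = (\partial_t\phi)_D + \mathrm{I} + \mathrm{II} + \mathrm{III}, \tag{2.18}
\]
where
\begin{align*}
(\partial_t\phi)_D &= \partial_t\phi_{\mathrm{hom}} - \frac{1}{t}\int_{|x-y|=t}\int \frac{f_0(y, p)}{(1 + \omega\cdot\hat p)\sqrt{1 + |p|^2}}\, dp\, dS_y, \\
\mathrm{I} &= \int_{|x-y|\le t}\int \frac{(\omega + \hat p)\cdot\hat p}{(1 + \omega\cdot\hat p)^2\sqrt{1 + |p|^2}}\, f(t - |x-y|, y, p)\, dp\, \frac{dy}{|x-y|^2}, \\
\mathrm{II} &= -\int_{|x-y|\le t}\int \frac{(\omega + \hat p)^2}{(1 + \omega\cdot\hat p)^2\sqrt{1 + |p|^2}}\, (S\phi)\, f(t - |x-y|, y, p)\, dp\, \frac{dy}{|x-y|}, \\
\mathrm{III} &= -\int_{|x-y|\le t}\int \frac{(\omega + \hat p)\cdot\nabla_x\phi}{(1 + \omega\cdot\hat p)^2 (1 + |p|^2)^{3/2}}\, f(t - |x-y|, y, p)\, dp\, \frac{dy}{|x-y|}.
\end{align*}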

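We rewrite the above representation formula in a new form which is suitable for being estimated in terms of the null cone energy:

Proposition 1. The representation formula (2.18) can be rewritten in the form
\[
\partial_t\phi(t, x) = (\partial_t\phi)_D + \sum_{i=0}^{5} Z_i,
\]
where
\begin{align*}
Z_0 &= -2\int_{|x-y|\le t} (\partial_t\phi)\,\mu\,(t - |x-y|, y)\, \frac{dy}{|x-y|}, \\
Z_1 &= \int_{|x-y|\le t}\int \frac{f(t - |x-y|, y, p)}{\sqrt{1 + |p|^2}\,(1 + \omega\cdot\hat p)}\, dp\, \frac{dy}{|x-y|^2}, \\
Z_2 &= -\int_{|x-y|\le t}\int \frac{f(t - |x-y|, y, p)}{(1 + |p|^2)^{3/2}(1 + \omega\cdot\hat p)^2}\, dp\, \frac{dy}{|x-y|^2}, \\
Z_3 &= 2\int_{|x-y|\le t} (\partial_t\phi - \omega\cdot\nabla_x\phi)\int \frac{(\omega\cdot\hat p)\, f(t - |x-y|, y, p)}{\sqrt{1 + |p|^2}\,(1 + \omega\cdot\hat p)}\, dp\, \frac{dy}{|x-y|}, \\
Z_4 &= \int_{|x-y|\le t} (\partial_t\phi - \omega\cdot\nabla_x\phi)\int \frac{f(t - |x-y|, y, p)}{(1 + |p|^2)^{3/2}(1 + \omega\cdot\hat p)^2}\, dp\, \frac{dy}{|x-y|}, \\
Z_5 &= -2\int_{|x-y|\le t} (\omega\wedge\nabla_x\phi)\cdot\int \frac{(\omega\wedge\hat p)\, f(t - |x-y|, y, p)}{\sqrt{1 + |p|^2}\,(1 + \omega\cdot\hat p)}\, dp\, \frac{dy}{|x-y|}.
\end{align*}

Proof. This proposition is the result of a straightforward calculation which proceeds as follows. In the integral I of (2.18) we write
\[
(\omega + \hat p)\cdot\hat p = (1 + \omega\cdot\hat p) - \left(1 + |p|^2\right)^{-1},
\]
which shows that $\mathrm{I} = Z_1 + Z_2$. In the integrals II and III of (2.18) we decompose $\nabla_x\phi$ into a component parallel to $\omega$ and a component orthogonal to $\omega$, i.e., we write $\nabla_x\phi = (\omega\cdot\nabla_x\phi)\,\omega - \omega\wedge(\omega\wedge\nabla_x\phi)$. It follows that
\[
S\phi = (\partial_t\phi)(1 + \omega\cdot\hat p) - (\partial_t\phi - \omega\cdot\nabla_x\phi)(\omega\cdot\hat p) + (\omega\wedge\nabla_x\phi)\cdot(\omega\wedge\hat p)
\]
and
\[
(\omega + \hat p)\cdot\nabla_x\phi = (\omega\cdot\nabla_x\phi)(1 + \omega\cdot\hat p) + (\omega\wedge\nabla_x\phi)\cdot(\omega\wedge\hat p).
\]
In the integral II we also use
\[
(\omega + \hat p)^2 = 2(1 + \omega\cdot\hat p) - \left(1 + |p|^2\right)^{-1}.
\]
After substituting and summing up the various terms one can easily verify that $\mathrm{II} + \mathrm{III} = Z_0 + Z_3 + Z_4 + Z_5$, which concludes the proof.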
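The algebraic identities used in the proof of Proposition 1 can be verified in a few lines. The sketch below assumes only $|\omega| = 1$ and $\hat p = p/\sqrt{1+|p|^2}$; the last line is the Lagrange identity for the vector product.

```latex
\begin{align*}
|\hat p|^2 &= \frac{|p|^2}{1+|p|^2} = 1 - \frac{1}{1+|p|^2}, \\
(\omega + \hat p)\cdot\hat p &= \omega\cdot\hat p + |\hat p|^2
   = (1+\omega\cdot\hat p) - \frac{1}{1+|p|^2}, \\
(\omega + \hat p)^2 &= 1 + 2\,\omega\cdot\hat p + |\hat p|^2
   = 2(1+\omega\cdot\hat p) - \frac{1}{1+|p|^2}, \\
\hat p\cdot\nabla_x\phi
 &= (\omega\cdot\hat p)(\omega\cdot\nabla_x\phi)
   + (\omega\wedge\hat p)\cdot(\omega\wedge\nabla_x\phi).
\end{align*}
```

Adding $\partial_t\phi$ to the last line gives the stated expression for $S\phi$, and adding $\omega\cdot\nabla_x\phi$ gives the one for $(\omega + \hat p)\cdot\nabla_x\phi$.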

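We conclude this section with

Lemma 2. For $R > 1$, $a, b \ge 0$, denote
\[
B_{a,b}(R) = \int_{|p| \le R} \left(\sqrt{1 + |p|^2} + \omega\cdot p\right)^{-a} \left(1 + |p|^2\right)^{-b} dp.
\]
Then
\begin{align*}
B_{a,b}(R) &\le C R^{(2-2b)}\log R, && \text{if } a = 1,\ b < 1; \\
B_{a,b}(R) &\le C R^{(3-2b-a)}, && \text{if } a < 1,\ b < \frac{3-a}{2}; \\
B_{a,b}(R) &\le C R^{(1+a-2b)}, && \text{if } a > 1,\ b < \frac{1+a}{2}.
\end{align*}

Proof. The proof is by evaluating the integral in polar coordinates.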
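To illustrate the polar-coordinate computation, here is a sketch of the case $a = 1$, $b = 0$, which is the one producing the logarithm. The variables $\rho = |p|$ and $u = \omega\cdot p/|p|$ are notation introduced here, and the final bound is written for $R \ge 2$ so that $\log R$ is bounded away from zero.

```latex
\begin{align*}
B_{1,0}(R)
 &= 2\pi\int_0^R\!\!\int_{-1}^{1}
     \frac{\rho^2\,du\,d\rho}{\sqrt{1+\rho^2}+\rho u}
  = 2\pi\int_0^R \rho\,
     \log\frac{\sqrt{1+\rho^2}+\rho}{\sqrt{1+\rho^2}-\rho}\,d\rho \\
 &= 4\pi\int_0^R \rho\,\log\big(\sqrt{1+\rho^2}+\rho\big)\,d\rho
  \le 2\pi R^2\log\big(\sqrt{1+R^2}+R\big)
  \le C R^2\log R .
\end{align*}
```

The second line uses $(\sqrt{1+\rho^2}+\rho)(\sqrt{1+\rho^2}-\rho) = 1$, so the logarithm of the quotient is twice the logarithm of the numerator.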


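3. Proof of the Main Theorem

Without loss of generality, we can assume $P(t) \ge C$, where $C$ can be chosen arbitrarily large, otherwise we redefine $P(t) \to P(t) + C$. A first pointwise estimate on $\partial_t\phi$ follows by the results of the previous section.

Proposition 2.
\[
|\partial_t\phi(t, x)| \le 2\int_{|x-y|\le t} |\partial_t\phi|\,\mu\,(t - |x-y|, y)\, \frac{dy}{|x-y|} + C(t) P(t)^2 \log P(t).
\]

Proof. From Proposition 1 we have
\[
|\partial_t\phi(t, x)| \le C(t) + 2\int_{|x-y|\le t} |\partial_t\phi|\,\mu\,(t - |x-y|, y)\, \frac{dy}{|x-y|} + \sum_{i=1}^{5} |Z_i|. \tag{3.19}
\]
Let us estimate each integral $Z_i$, for $i = 1, \dots, 5$. Observe that the domain of integration in the variable $p$ can be chosen as $\{|p| \le 1 + e^{-\phi}P(t)\}$ by the definition of $P(t)$.

Estimate for $Z_1$. By (2.12)–(2.14) and Lemma 2,
\begin{align*}
|Z_1| &\le C\int_{|x-y|\le t} \left[e^{4\phi} B_{1,0}\left(1 + e^{-\phi}P(t)\right)\right](t - |x-y|, y)\, \frac{dy}{|x-y|^2} \\
&\le C(t) P(t)^2 \int_{|x-y|\le t} \left[e^{2\phi}\log\left(1 + e^{-\phi}P(t)\right)\right](t - |x-y|, y)\, \frac{dy}{|x-y|^2} \\
&\le C(t) P(t)^2 \int_{|x-y|\le t} \left[e^{2\phi}|\phi|(t - |x-y|, y) + \log P(t)\right] \frac{dy}{|x-y|^2} \\
&\le C(t) P(t)^2 \log P(t).
\end{align*}

Estimate for $Z_2$. Again by (2.12)–(2.14) and Lemma 2,
\begin{align*}
|Z_2| &\le C\int_{|x-y|\le t} \left[e^{4\phi} B_{2,1/2}\left(1 + e^{-\phi}P(t)\right)\right](t - |x-y|, y)\, \frac{dy}{|x-y|^2} \\
&\le C(t) P(t)^2 \int_{|x-y|\le t} e^{2\phi(t - |x-y|, y)}\, \frac{dy}{|x-y|^2} \le C(t) P(t)^2.
\end{align*}

Estimate for $Z_3$. By the Cauchy-Schwarz inequality, (1.9), (2.12)–(2.14) and Lemma 2,
\begin{align*}
|Z_3| &\le C\int_{|x-y|\le t} |\partial_t\phi - \omega\cdot\nabla_x\phi|\left[e^{4\phi} B_{1,0}\left(1 + e^{-\phi}P(t)\right)\right](t - |x-y|, y)\, \frac{dy}{|x-y|} \\
&\le C\left(\int_{|x-y|\le t} |\partial_t\phi - \omega\cdot\nabla_x\phi|^2 (t - |x-y|, y)\, dy\right)^{1/2} \\
&\quad\times \left(\int_{|x-y|\le t} \left[e^{8\phi} B_{1,0}^2\left(1 + e^{-\phi}P(t)\right)\right](t - |x-y|, y)\, \frac{dy}{|x-y|^2}\right)^{1/2} \\
&\le C(t) P(t)^2 \log P(t).
\end{align*}

Estimate for $Z_4$. As before,
\begin{align*}
|Z_4| &\le C\left(\int_{|x-y|\le t} |\partial_t\phi - \omega\cdot\nabla_x\phi|^2 (t - |x-y|, y)\, dy\right)^{1/2} \\
&\quad\times \left(\int_{|x-y|\le t} \left[e^{8\phi} B_{2,1/2}^2\left(1 + e^{-\phi}P(t)\right)\right](t - |x-y|, y)\, \frac{dy}{|x-y|^2}\right)^{1/2} \\
&\le C(t) P(t)^2.
\end{align*}

Estimate for $Z_5$. Again as before,
\begin{align*}
|Z_5| &\le C\left(\int_{|x-y|\le t} |\omega\wedge\nabla_x\phi|^2 (t - |x-y|, y)\, dy\right)^{1/2} \\
&\quad\times \left(\int_{|x-y|\le t} \left[e^{8\phi} B_{1,0}^2\left(1 + e^{-\phi}P(t)\right)\right](t - |x-y|, y)\, \frac{dy}{|x-y|^2}\right)^{1/2} \\
&\le C(t) P(t)^2 \log P(t).
\end{align*}
Substituting the preceding estimates in (3.19) concludes the proof.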

Using Proposition 2 and (2.17) we obtain the integral inequality
\begin{align}
e^{2\phi}\left(1 + |p|^2\right) &\le C + 2\int_0^t e^{2\phi}\,|\partial_s\phi(s, X(s))|\, ds \notag\\
&\le C + C(t)\int_0^t P(s)^2 \log P(s)\, ds + 4 I_0(|\partial_t\phi|\mu, t), \tag{3.20}
\end{align}
where
\[
I_0(g, t) = \int_0^t e^{2\phi(s, X(s))} \int_{|X(s) - y| \le s} g(s - |X(s) - y|, y)\, \frac{dy}{|X(s) - y|}\, ds.
\]
We rewrite $I_0(g, t)$ as
\[
I_0(g, t) = \int_0^t I_0(g, \tau, t)\, d\tau,
\]
where
\[
I_0(g, \tau, t) = \int_\tau^t e^{2\phi(s, X(s))} \int_{|y| = s - \tau} g(\tau, X(s) - y)\, \frac{dS_y}{(s - \tau)}\, ds.
\]
Except for the factor $e^{2\phi(s, X(s))}$, $I_0(g, \tau, t)$ is the integral which is estimated in the proof of [17, Lemma 1.1]. However, since we shall need a slightly different formulation of the estimate proved in [17], we present here a complete proof of the result that we are going to use:

Lemma 3. For all $0 \le \tau \le t$,
\[
I_0(g, \tau, t) \le C(t)\, \frac{\|g(\tau)\|_{L^2}}{\sqrt{t - \tau}} \int_\tau^t \log P(s)\, ds.
\]


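Proof. We first rewrite $I_0$ in spherical coordinates:

\[
I_0(g,\tau,t) = \int_\tau^t\int_0^\pi\int_0^{2\pi} e^{2\phi(s,X(s))}\,g(\tau,X(s)-(s-\tau)\omega)\,(s-\tau)\sin\theta\,d\varphi\,d\theta\,ds,
\]

where $\omega=\omega(\theta,\varphi)=(\sin\theta\cos\varphi,\sin\theta\sin\varphi,\cos\theta)$. Now, in [17, Lemma 2.1] it is shown that the transformation of variables $(s,\theta,\varphi)\to X(s)-(s-\tau)\omega$ is a $C^1$ diffeomorphism with Jacobian

\[
J = (\dot X(s)\cdot\omega-1)(s-\tau)^2\sin\theta = (\widehat{P}(s)\cdot\omega-1)(s-\tau)^2\sin\theta.
\]

Hence, applying the Cauchy-Schwarz inequality,

\[
I_0(g,\tau,t) \le \Big(\int_\tau^t\int_0^\pi\int_0^{2\pi} g^2(\tau,X(s)-(s-\tau)\omega)\,|J|\,d\varphi\,d\theta\,ds\Big)^{1/2}
\times\Big(\int_\tau^t e^{4\phi(s,X(s))}\int_0^\pi\int_0^{2\pi}\frac{\sin\theta}{1-\widehat{P}(s)\cdot\omega}\,d\varphi\,d\theta\,ds\Big)^{1/2}
\]
\[
\le \|g(\tau)\|_{L^2}\Big(\int_\tau^t e^{4\phi(s,X(s))}\int_0^\pi\int_0^{2\pi}\frac{\sin\theta}{1-\widehat{P}(s)\cdot\omega}\,d\varphi\,d\theta\,ds\Big)^{1/2}.
\]

We estimate the angular integral as

\[
\int_0^\pi\int_0^{2\pi}\frac{\sin\theta}{1-\widehat{P}(s)\cdot\omega}\,d\varphi\,d\theta
= 2\pi\int_{-1}^{1}\frac{du}{1-|\widehat{P}(s)|u}
\le C\big(1-\log(1-|\widehat{P}(s)|)\big)
\le C\big(|\phi|+\log P(s)\big).
\]

We finally obtain

\[
I_0(g,\tau,t) \le C(t)\,\|g(\tau)\|_{L^2}\Big(\int_\tau^t e^{4\phi}\big(|\phi|+\log P(s)\big)\,ds\Big)^{1/2}
\le C(t)\,\frac{\|g(\tau)\|_{L^2}}{\sqrt{t-\tau}}\int_\tau^t\log P(s)\,ds,
\]

which concludes the proof of the lemma.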
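The last inequality in the proof of Lemma 3 trades the square root of an integral for the integral itself. The following sketch spells out that step under two assumptions that are not stated in this excerpt and are made here only for illustration: that $e^{4\phi}(|\phi|+\log P(s))\le C(t)\log P(s)$ on $[\tau,t]$ (for instance because $\phi$ is bounded there), and that $\log P(s)\ge 1$, with $\tau<t$.

```latex
% Sketch only. Assumptions (not from the source): \tau < t, \log P(s) \ge 1 on [\tau,t],
% and e^{4\phi}(|\phi| + \log P(s)) \le C(t) \log P(s) on [\tau,t].
\[
L := \int_\tau^t \log P(s)\,ds \;\ge\; t-\tau
\quad\Longrightarrow\quad
L^{1/2} = \frac{L}{L^{1/2}} \;\le\; \frac{L}{\sqrt{t-\tau}},
\]
\[
\Big(\int_\tau^t e^{4\phi}\big(|\phi|+\log P(s)\big)\,ds\Big)^{1/2}
\;\le\; C(t)\,L^{1/2}
\;\le\; \frac{C(t)}{\sqrt{t-\tau}}\int_\tau^t \log P(s)\,ds .
\]
```

The factor $(t-\tau)^{-1/2}$ produced this way is integrable in $\tau$ on $[0,t]$, which is what the subsequent integration over $\tau$ relies on.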

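The proof of Theorem 1 is now almost complete. Observe that, by (2.15) and (1.8),

\[
\|\partial_t\phi\,\mu(\tau)\|_{L^2} \le \|\mu(\tau)\|_\infty\,\|\partial_t\phi(\tau)\|_{L^2} \le C(t)P(\tau)^2, \qquad \tau\le t.
\]

Thus

\[
I_0(|\partial_t\phi|\mu,\tau,t) \le C(t)\,\frac{P(\tau)^2}{\sqrt{t-\tau}}\int_\tau^t\log P(s)\,ds.
\]

Hence the integral $I_0(|\partial_t\phi|\mu,t)$ is bounded by

\[
I_0(|\partial_t\phi|\mu,t) \le C(t)\int_0^t\frac{P(\tau)^2}{\sqrt{t-\tau}}\int_\tau^t\log P(s)\,ds\,d\tau
= C(t)\int_0^t\log P(s)\int_0^s\frac{P(\tau)^2}{\sqrt{t-\tau}}\,d\tau\,ds
\le C(t)\int_0^t P(s)^2\log P(s)\,ds.
\]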
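The chain of estimates above exchanges the order of integration over the triangle $0\le\tau\le s\le t$ and then bounds the inner $\tau$-integral. A minimal sketch of the inner bound follows, assuming that $P$ is nondecreasing (a property not stated in this excerpt and used here only as an illustration of how the last inequality can arise).

```latex
% Sketch only. Assumption (not from the source): P is nondecreasing on [0,t].
\[
\int_0^s \frac{P(\tau)^2}{\sqrt{t-\tau}}\,d\tau
\;\le\; P(s)^2\int_0^s \frac{d\tau}{\sqrt{t-\tau}}
\;=\; 2P(s)^2\big(\sqrt{t}-\sqrt{t-s}\,\big)
\;\le\; 2\sqrt{t}\,P(s)^2, \qquad 0\le s\le t .
\]
```

The factor $2\sqrt{t}$ is then absorbed into the time-dependent constant $C(t)$.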


Finally, going back to (3.20) we obtain the Grönwall inequality

\[
P(t)^2 \le C(t)\Big(1+\int_0^t P(s)^2\log P(s)\,ds\Big),
\]

whence $P(t)\le C(t)$. By Lemma 1, this completes the proof of the main theorem.

Acknowledgements. The author acknowledges support by the European HYKE network (contract HPRN-CT2002-00282) and by the project “PDE and Harmonic Analysis”, sponsored by Research Council of Norway (proj. no. 160192/V30).
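Returning to the last step of the proof of Theorem 1: a logarithmic Grönwall inequality of the kind displayed there yields a finite (double exponential) bound on every bounded time interval. The sketch below is one standard way to see this; the constant $K$, the auxiliary function $G$ and the hypotheses that $P$ is continuous with $P\ge 1$ and that $C(t)\le K$ on $[0,T]$ with $K>1$ are assumptions introduced here for illustration, not taken from the paper.

```latex
% Sketch only. Assumptions (not from the source): P continuous, P \ge 1,
% and C(t) \le K on [0,T] for some constant K > 1.
\[
G(t) := K\Big(1+\int_0^t P(s)^2\log P(s)\,ds\Big),
\qquad P(t)^2 \le G(t), \qquad G(0)=K .
\]
% x \mapsto x\log x is increasing on [1,\infty), and 1 \le P^2 \le G:
\[
G'(t) = K\,P(t)^2\log P(t) = \tfrac{K}{2}\,P(t)^2\log P(t)^2 \;\le\; \tfrac{K}{2}\,G(t)\log G(t).
\]
% Since G \ge K > 1, \log G > 0 and we may divide:
\[
\frac{d}{dt}\,\log\log G(t) \le \frac{K}{2}
\quad\Longrightarrow\quad
\log G(t) \le e^{Kt/2}\log K
\quad\Longrightarrow\quad
P(t) \le K^{\frac12 e^{Kt/2}}, \qquad 0\le t\le T .
\]
```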

References

1. Andréasson, H.: The Einstein-Vlasov System/Kinetic Theory. Living Rev. Relativity 8 (cited on 7 July 2005)
2. Andréasson, H., Calogero, S., Rein, G.: Global classical solutions to the spherically symmetric Nordström-Vlasov system. Math. Proc. Camb. Phil. Soc. 138, 533–539 (2005)
3. Bouchut, F., Golse, F., Pallard, C.: Classical Solutions and the Glassey-Strauss Theorem for the 3D Vlasov-Maxwell System. Arch. Rat. Mech. Anal. 170, 1–15 (2003)
4. Calogero, S.: Spherically symmetric steady states of galactic dynamics in scalar gravity. Class. Quant. Gravity 20, 1729–1741 (2003)
5. Calogero, S., Lee, H.: The non-relativistic limit of the Nordström-Vlasov system. Commun. Math. Sci. 2, 19–34 (2004)
6. Calogero, S., Rein, G.: On classical solutions of the Nordström-Vlasov system. Commun. Partial Diff. Eqs. 28, 1863–1885 (2003)
7. Calogero, S., Rein, G.: Global weak solutions to the Nordström-Vlasov system. J. Diff. Eqs. 204, 323–338 (2004)
8. Damour, T., Esposito-Farese, G.: Tensor-multi-scalar theories of gravitation. Class. Quantum Grav. 9, 2093–2176 (1992)
9. DiPerna, R.J., Lions, P.L.: Global weak solutions of Vlasov-Maxwell systems. Commun. Pure Appl. Math. 52, 729–757 (1989)
10. Friedrich, S.: Globale Existenzaussagen über klassische Lösungen des Vlasov-Nordström-Systems. Bayreuther Math. Schriften, to appear
11. Glassey, R., Strauss, W.: Singularity formation in a collisionless plasma could occur only at high velocities. Arch. Rat. Mech. Anal. 92, 59–90 (1986)
12. Glassey, R., Strauss, W.: Absence of shocks in an initially dilute collisionless plasma. Commun. Math. Phys. 113, 191–208 (1987)
13. Glassey, R., Schaeffer, J.: The “Two and One-Half Dimensional” Relativistic Vlasov Maxwell System. Commun. Math. Phys. 185, 257–284 (1997)
14. Klainerman, S., Staffilani, G.: A new approach to study the Vlasov-Maxwell system. Commun. Pure Appl. Anal. 1, 103–125 (2002)
15. Lee, H.: Global existence of solutions of the Nordström-Vlasov system in two space dimensions. Commun. Partial Diff. Eqs. 30, 663–687 (2005)
16. Lions, P.-L., Perthame, B.: Propagation of moments and regularity for the 3-dimensional Vlasov-Poisson system. Invent. Math. 105, 415–430 (1991)
17. Pallard, C.: On the boundness of the momentum support of solutions to the relativistic Vlasov-Maxwell system. Indiana Univ. Math. J. 54, 1395–1409 (2005)
18. Pfaffelmoser, K.: Global classical solutions of the Vlasov-Poisson system in three dimensions for general initial data. J. Diff. Eqs. 95, 281–303 (1992)
19. Rein, G.: Generic global solutions of the relativistic Vlasov-Maxwell system of plasma physics. Commun. Math. Phys. 135, 41–78 (1990)
20. Rein, G.: Selfgravitating systems in Newtonian theory - the Vlasov-Poisson system. Banach Center Publications 41, Part I, 179–194 (1997)
21. Rein, G., Rendall, A.D.: Global existence of solutions of the spherically symmetric Vlasov-Einstein system with small initial data. Commun. Math. Phys. 150, 561–583 (1992)
22. Rein, G., Rendall, A.D.: The Newtonian limit of the Spherically Symmetric Vlasov-Einstein System. Commun. Math. Phys. 150, 585–591 (1992)
23. Rendall, A.D.: The Newtonian limit for asymptotically flat solutions of the Vlasov-Einstein system. Commun. Math. Phys. 163, 89–112 (1994)


24. Schaeffer, J.: Global existence of smooth solutions to the Vlasov-Poisson system in three dimensions. Commun. Part. Diff. Eqs. 16, 1313–1335 (1991)
25. Shapiro, S.L., Teukolsky, S.A.: Scalar gravitation: A laboratory for numerical relativity. Phys. Rev. D 47, 1529–1540 (1993)

Communicated by P. Constantin

Commun. Math. Phys. 266, 355–399 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0051-z

Communications in

Mathematical Physics

Generalized Diagonal Crossed Products and Smash Products for Quasi-Hopf Algebras. Applications

Daniel Bulacu¹, Florin Panaite², Freddy Van Oystaeyen³

1 Faculty of Mathematics and Informatics, University of Bucharest, Str. Academiei 14, RO-70109, Bucharest 1, Romania. E-mail: [email protected]
2 Institute of Mathematics, Romanian Academy, PO-Box 1-764, RO-014700, Bucharest, Romania. E-mail: [email protected]
3 Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, 2020, Antwerp, Belgium. E-mail: [email protected]

Received: 15 July 2005 / Accepted: 31 March 2006
Published online: 30 June 2006 – © Springer-Verlag 2006

Abstract: In this paper we introduce generalizations of diagonal crossed products, two-sided crossed products and two-sided smash products, for a quasi-Hopf algebra H . The results we obtain may then be applied to H ∗ -Hopf bimodules and generalized Yetter-Drinfeld modules. The generality of our situation entails that the “generating matrix” formalism cannot be used, forcing us to use a different approach. This pays off because as an application we obtain an easy conceptual proof of an important but very technical result of Hausser and Nill concerning iterated two-sided crossed products.

Research partially supported by the EC programme LIEGRITS, RTN 2003, 505078, and by the bilateral projects “Hopf Algebras in Algebra, Topology, Geometry and Physics” and “New techniques in Hopf algebras and graded ring theory” of the Flemish and Romanian Ministries of Research. The first two authors have been also partially supported by the programme CERES of the Romanian Ministry of Education and Research, contract no. 4-147/2004.

1. Introduction

Quasi-bialgebras and quasi-Hopf algebras were introduced by Drinfeld in [11], in connection with the Knizhnik-Zamolodchikov equations, and have since been used in several branches of mathematics and physics, such as conformal field theory [10], knot theory [3], algebraic quantum field theory [17], elliptic quantum groups and the dynamical Yang-Baxter equation [15], etc. More recently, they have been used by Majid and his collaborators (see for instance [1, 4, 20]) as the foundation of an emerging nonassociative geometry, regarded as a further extension of noncommutative geometry, with the “coordinate algebra” allowed to be nonassociative (quasi-Hopf algebras are intimately related to nonassociativity phenomena). Algebraically, quasi-bialgebras and quasi-Hopf algebras arise as very natural (especially from the tensor-categorical point of view) generalizations of bialgebras and Hopf algebras. Let k be a field, H an associative algebra and Δ : H → H ⊗ H and ε : H → k
two algebra morphisms. Roughly speaking, H is a quasi-bialgebra if the category ${}_H\mathcal{M}$ of left H-modules, equipped with the tensor product of vector spaces endowed with the diagonal H-module structure given via Δ, and with unit object k viewed as a left H-module via ε, is a monoidal category (if we impose the associativity constraints to be the trivial ones, we obtain the usual concept of bialgebra). The comultiplication Δ is not coassociative but is quasi-coassociative in the sense that Δ is coassociative up to conjugation by an invertible element Φ ∈ H ⊗ H ⊗ H. Note that the definition of a quasi-bialgebra or quasi-Hopf algebra is not self-dual.

This paper deals with some kinds of crossed products arising in the context of quasi-Hopf algebras. Our starting point was an algebraic question of categorical nature (see below), but as a consequence of the ideas developed in the paper we were led to an easy and conceptual proof of a theorem of Hausser and Nill (arising in a physical context) concerning certain iterated crossed products. We would like to mention that the geometrical and physical relevance of iterated crossed products was also recently pointed out in [14], where it was proved for instance that the noncommutative 2n-planes introduced by Connes and Dubois-Violette in [9] may be written as iterated twisted tensor products of certain commutative algebras.

The crossed products we work with in this paper are based on actions and coactions of quasi-Hopf algebras, as introduced in [5] (module algebras) and [12] ((bi)comodule algebras). Over a finite dimensional Hopf algebra H, speaking about module algebras or comodule algebras is the same thing, since a left (right) H-module algebra is the same as a right (left) $H^*$-comodule algebra.
This does no longer hold over quasi-Hopf algebras, where a comodule algebra is an associative algebra but a module algebra is associative only in a tensor category, so in general being nonassociative as an algebra (for instance, the nonassociative algebra of octonions is such a module algebra over a certain quasiHopf algebra, cf. [2]). This fact leads to the following situation: a concept, construction, result, etc. from the theory of Hopf algebras might admit more different generalizations when passing to quasi-Hopf algebras. Such a situation occurs in this paper. To explain it, we recall some facts from [12]. If H is a finite dimensional quasi-Hopf algebra and A is an H -bicomodule algebra, Hausser and Nill introduced the so-called diagonal crossed products H ∗ A, H ∗ A, A H ∗ and A H ∗ , which are all (isomorphic) associative algebras and have the property that for A = H they are realizations of the quantum double of H , which has been introduced before by Majid in [18] in the form of an implicit Tannaka-Krein reconstruction procedure. Also, if A and B are a right and respectively a left H -comodule algebra, Hausser and Nill introduced an associative algebra structure on A ⊗ H ∗ ⊗ B, denoted by A > H ∗ H ∗ < B (A ⊗ B) H ∗ . Their motivation for introducing these constructions was the need to extend to the quasi-Hopf setting some models of Hopf spin chains and lattice current algebras from algebraic quantum field theory (see the introduction of [12] for details). For this purpose, one of the key results in [12] was that the two-sided crossed products can be iterated, providing thus a local net of associative algebras, with quantum double cosymmetry. Now, if H is a finite dimensional Hopf algebra, the construction A > H ∗ < B may be described equivalently with module algebras instead of comodule algebras, and it becomes a two-sided smash product A# H # B (where A and B are a left, respectively a right H -module algebra), with multiplication given by


(a#h#b)(a #h #b ) = a(h 1 · a )#h 2 h 1 #(b · h 2 )b , for all a, a ∈ A, h, h ∈ H and b, b ∈ B. It is this construction that we first wanted to generalize to quasi-Hopf algebras (where it will be different from the two-sided crossed product of Hausser and Nill). The need for such a construction arose as follows. It was proved in [21] that, for a finite dimensional ∗ H∗ ∗ Hopf algebra H , the category H H ∗ M H ∗ of H -Hopf bimodules is isomorphic to the category of left modules over a two-sided smash product H ∗ #(H ⊗ H op )# H ∗op . We wanted ∗ H∗ a similar result for the category H H ∗ M H ∗ for H a finite dimensional quasi-Hopf algebra, but observed that we could not use the two-sided crossed product of Hausser and Nill, we needed a generalization of the two-sided crossed product from Hopf algebras in the other direction (the one based on module algebras and not on comodule algebras). After constructing this two-sided smash product A# H # B, we wanted to express it as some sort of diagonal crossed product (A ⊗ B) H , and we were led naturally to consider a generalized diagonal crossed product A A, where A is an H -bimodule algebra and A is an H -bicomodule algebra. We describe now more formally the structure of this paper (H will be a fixed quasiHopf algebra or sometimes only a quasi-bialgebra). In Sect. 3 we introduce the left and right generalized diagonal crossed products A δ A and A δ A (which will turn out to be isomorphic), where A is an H -bimodule algebra and A is an associative algebra endowed with a two-sided coaction of H on it, and we prove their associativity. If A is an H -bicomodule algebra, one can construct out of it two two-sided coactions δl and δr , hence we have four generalized diagonal crossed products A A, A A, A A and A A. In Sect. 4 we construct, starting with a bicomodule algebra A, two left H ⊗ H op -comodule algebra structures on A, denoted by A1 and A2 . 
Regarding A as a left H ⊗ H op module algebra, we identify A A and A A with the generalized smash products (in the sense of [7]) A H ∗ < B, replacing H ∗ by A, and call this algebra the generalized two-sided crossed product. Then we construct the two-sided generalized smash product A B, which for A = H is exactly the two-sided smash product A# H # B that we needed. In Sect. 6 we prove the algebra isomorphisms A > A B and A > (A ⊗ B) < B. In Sect. 7 we study the invariance under twisting of our constructions. Starting with Sect. 8 we move to applications. We prove first that both two-sided products (the generalized two-sided crossed product and the two-sided generalized smash product) may be written as some iterated products. Together with the fact that a generalized smash product A B), this allows us to obtain a very easy, conceptual and constructive proof of the theorem of Hausser and Nill concerning iterated two-sided crossed products. As a byproduct of our approach, we obtain also that the iterated products arising in this theorem are actually isomorphic to a two-sided generalized smash product. In Sect. 9 we prove what was our original motivation for this paper, namely that the ∗ H∗ ∗ category H H ∗ M H ∗ of H -Hopf bimodules over a finite dimensional quasi-Hopf algebra H is isomorphic to H ∗ #(H ⊗H op )# H ∗ M. Along the way we obtain some other results of


independent interest, such as the description of left modules over a two-sided smash product. In Sect. 10 we prove that, if (H, A, C) is a so-called Yetter-Drinfeld datum (here, C is an H-bimodule coalgebra) with C finite dimensional, then the category ${}_A\mathcal{YD}(H)^C$ of (generalized) Yetter-Drinfeld modules is isomorphic to the category of left modules over the generalized diagonal crossed product C ∗ A.

Some remarks on techniques are in order. What is characteristic in the approach of Hausser and Nill to their constructions is the systematic use of the so-called “generating matrix” formalism of the St. Petersburg school (the use of δ-implementers, λρ-intertwiners, etc). The replacement of $H^*$ by an arbitrary H-bimodule algebra in our definition of the generalized diagonal crossed products makes the use of this formalism impossible, so most of our proofs are different in spirit from the ones of Hausser and Nill, and often easier (just compare our proof of the theorem concerning iterated two-sided crossed products with the original one in [12]), providing thus also an alternative approach to the constructions of Hausser and Nill. Another alternative approach has been provided by Schauenburg in [22] (using categorical techniques).

2. Preliminaries

In this section we recall some definitions and results and fix notation used throughout the paper.

2.1. Quasi-bialgebras and quasi-Hopf algebras. We work over a field k. All algebras, linear spaces, etc. will be over k; unadorned ⊗ means $\otimes_k$. Following Drinfeld [11], a quasi-bialgebra is a four-tuple (H, Δ, ε, Φ), where H is an associative algebra with unit, Φ is an invertible element in H ⊗ H ⊗ H, and Δ : H → H ⊗ H and ε : H → k are algebra homomorphisms satisfying the identities

\[ (\mathrm{id}\otimes\Delta)(\Delta(h)) = \Phi\,(\Delta\otimes\mathrm{id})(\Delta(h))\,\Phi^{-1}, \tag{2.1} \]
\[ (\mathrm{id}\otimes\varepsilon)(\Delta(h)) = h, \qquad (\varepsilon\otimes\mathrm{id})(\Delta(h)) = h, \tag{2.2} \]

for all h ∈ H, and Φ has to be a normalized 3-cocycle, in the sense that

\[ (1\otimes\Phi)(\mathrm{id}\otimes\Delta\otimes\mathrm{id})(\Phi)(\Phi\otimes 1) = (\mathrm{id}\otimes\mathrm{id}\otimes\Delta)(\Phi)(\Delta\otimes\mathrm{id}\otimes\mathrm{id})(\Phi), \tag{2.3} \]
\[ (\mathrm{id}\otimes\varepsilon\otimes\mathrm{id})(\Phi) = 1\otimes 1. \tag{2.4} \]

The identities (2.2), (2.3) and (2.4) also imply that

\[ (\varepsilon\otimes\mathrm{id}\otimes\mathrm{id})(\Phi) = (\mathrm{id}\otimes\mathrm{id}\otimes\varepsilon)(\Phi) = 1\otimes 1. \tag{2.5} \]

The map Δ is called the coproduct or the comultiplication, ε the counit and Φ the reassociator. As for bialgebras (see [24]) we denote $\Delta(h) = h_1\otimes h_2$, but since Δ is only quasi-coassociative we adopt the further convention (summation understood):

\[ (\Delta\otimes\mathrm{id})(\Delta(h)) = h_{(1,1)}\otimes h_{(1,2)}\otimes h_2, \qquad (\mathrm{id}\otimes\Delta)(\Delta(h)) = h_1\otimes h_{(2,1)}\otimes h_{(2,2)}, \]

for all h ∈ H. We will denote the tensor components of Φ by capital letters, and those of $\Phi^{-1}$ by small letters, namely

\[ \Phi = X^1\otimes X^2\otimes X^3 = T^1\otimes T^2\otimes T^3 = Y^1\otimes Y^2\otimes Y^3 = \cdots, \]
\[ \Phi^{-1} = x^1\otimes x^2\otimes x^3 = t^1\otimes t^2\otimes t^3 = y^1\otimes y^2\otimes y^3 = \cdots. \]
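As a concrete illustration of the cocycle condition (2.3), consider the following standard example, which is not taken from the paper and is offered only as a sketch: $G$ is a finite group, $H=k^G$ is the commutative algebra of $k$-valued functions on $G$ with basis of idempotents $e_g$, and $\omega: G^3\to k^\times$ is a function (all of these names are assumptions of the example).

```latex
% Illustration only (not from the source): H = k^G for a finite group G.
\[
\Delta(e_g)=\sum_{ab=g} e_a\otimes e_b, \qquad \varepsilon(e_g)=\delta_{g,1}, \qquad
\Phi=\sum_{a,b,c\in G}\omega(a,b,c)\,e_a\otimes e_b\otimes e_c .
\]
% Comparing the coefficients of e_a \otimes e_b \otimes e_c \otimes e_d on both sides of (2.3):
\[
\underbrace{\omega(b,c,d)}_{1\otimes\Phi}\;
\underbrace{\omega(a,bc,d)}_{(\mathrm{id}\otimes\Delta\otimes\mathrm{id})(\Phi)}\;
\underbrace{\omega(a,b,c)}_{\Phi\otimes 1}
\;=\;
\underbrace{\omega(a,b,cd)}_{(\mathrm{id}\otimes\mathrm{id}\otimes\Delta)(\Phi)}\;
\underbrace{\omega(ab,c,d)}_{(\Delta\otimes\mathrm{id}\otimes\mathrm{id})(\Phi)} ,
\]
% and (2.4) reads
\[
\omega(a,1,c)=1 \qquad \text{for all } a,c\in G .
\]
```

In this example (2.3) and (2.4) say exactly that $\omega$ is a normalized 3-cocycle on $G$, while (2.1) holds automatically because $H$ is commutative and Δ is coassociative.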


The quasi-bialgebra H is called a quasi-Hopf algebra if there exists an anti-automorphism S of the algebra H and elements α, β ∈ H such that, for all h ∈ H, we have:

\[ S(h_1)\alpha h_2 = \varepsilon(h)\alpha \quad\text{and}\quad h_1\beta S(h_2) = \varepsilon(h)\beta, \tag{2.6} \]
\[ X^1\beta S(X^2)\alpha X^3 = 1 \quad\text{and}\quad S(x^1)\alpha x^2\beta S(x^3) = 1. \tag{2.7} \]

For a quasi-Hopf algebra the antipode is determined uniquely up to a transformation $\alpha\to U\alpha$, $\beta\to\beta U^{-1}$, $S(h)\to US(h)U^{-1}$, where U ∈ H is invertible. The axioms for a quasi-Hopf algebra imply that ε(α)ε(β) = 1, so, by rescaling α and β, we may assume without loss of generality that ε(α) = ε(β) = 1 and ε ◦ S = ε.

Together with a quasi-bialgebra or a quasi-Hopf algebra H = (H, Δ, ε, Φ, S, α, β) we also have $H^{op}$, $H^{cop}$ and $H^{op,cop}$ as quasi-bialgebras (respectively quasi-Hopf algebras), where “op” means opposite multiplication and “cop” means opposite comultiplication. The structures are obtained by putting $\Phi_{op}=\Phi^{-1}$, $\Phi_{cop}=(\Phi^{-1})^{321}$, $\Phi_{op,cop}=\Phi^{321}$, $S_{op}=S_{cop}=(S_{op,cop})^{-1}=S^{-1}$, $\alpha_{op}=S^{-1}(\beta)$, $\beta_{op}=S^{-1}(\alpha)$, $\alpha_{cop}=S^{-1}(\alpha)$, $\beta_{cop}=S^{-1}(\beta)$, $\alpha_{op,cop}=\beta$ and $\beta_{op,cop}=\alpha$.

Next we recall that the definition of a quasi-bialgebra or quasi-Hopf algebra is “twist covariant” in the following sense. An invertible element F ∈ H ⊗ H is called a gauge transformation or twist if (ε ⊗ id)(F) = (id ⊗ ε)(F) = 1. If H is a quasi-bialgebra or a quasi-Hopf algebra and $F=F^1\otimes F^2\in H\otimes H$ is a gauge transformation with inverse $F^{-1}=G^1\otimes G^2$, then we can define a new quasi-bialgebra (respectively quasi-Hopf algebra) $H_F$ by keeping the multiplication, unit, counit (and antipode in the case of a quasi-Hopf algebra) of H and replacing the comultiplication, reassociator and the elements α and β by

\[ \Delta_F(h) = F\Delta(h)F^{-1}, \tag{2.8} \]
\[ \Phi_F = (1\otimes F)(\mathrm{id}\otimes\Delta)(F)\,\Phi\,(\Delta\otimes\mathrm{id})(F^{-1})(F^{-1}\otimes 1), \tag{2.9} \]
\[ \alpha_F = S(G^1)\alpha G^2, \qquad \beta_F = F^1\beta S(F^2). \tag{2.10} \]

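It is known that the antipode of a Hopf algebra is an anti-coalgebra morphism. For a quasi-Hopf algebra, we have the following: there exists a gauge transformation f ∈ H ⊗ H such that

\[ f\Delta(S(h))f^{-1} = (S\otimes S)(\Delta^{cop}(h)), \quad\text{for all } h\in H. \tag{2.11} \]

The element f can be computed explicitly. First set

\[ A^1\otimes A^2\otimes A^3\otimes A^4 = (\Phi\otimes 1)(\Delta\otimes\mathrm{id}\otimes\mathrm{id})(\Phi^{-1}), \]
\[ B^1\otimes B^2\otimes B^3\otimes B^4 = (\Delta\otimes\mathrm{id}\otimes\mathrm{id})(\Phi)(\Phi^{-1}\otimes 1), \]

and then define γ, δ ∈ H ⊗ H by

\[ \gamma = S(A^2)\alpha A^3\otimes S(A^1)\alpha A^4 \quad\text{and}\quad \delta = B^1\beta S(B^4)\otimes B^2\beta S(B^3). \tag{2.12} \]

Then f and $f^{-1}$ are given by the formulae

\[ f = (S\otimes S)(\Delta^{cop}(x^1))\,\gamma\,\Delta(x^2\beta S(x^3)), \tag{2.13} \]
\[ f^{-1} = \Delta(S(x^1)\alpha x^2)\,\delta\,(S\otimes S)(\Delta^{cop}(x^3)). \tag{2.14} \]

Moreover, f satisfies the following relations:

\[ f\Delta(\alpha) = \gamma, \qquad \Delta(\beta)f^{-1} = \delta. \tag{2.15} \]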


Furthermore the corresponding twisted reassociator (see (2.9)) is given by

\[ \Phi_f = (S\otimes S\otimes S)(X^3\otimes X^2\otimes X^1). \tag{2.16} \]

2.2. Smash products. Suppose that (H, Δ, ε, Φ) is a quasi-bialgebra. If U, V, W are left (right) H-modules, define $a_{U,V,W}, \mathbf{a}_{U,V,W} : (U\otimes V)\otimes W\to U\otimes(V\otimes W)$ by

\[ a_{U,V,W}((u\otimes v)\otimes w) = \Phi\cdot(u\otimes(v\otimes w)), \qquad \mathbf{a}_{U,V,W}((u\otimes v)\otimes w) = (u\otimes(v\otimes w))\cdot\Phi^{-1}. \]

The category ${}_H\mathcal{M}$ ($\mathcal{M}_H$) of left (right) H-modules becomes a monoidal category (see [16, 19] for the terminology) with tensor product ⊗ given via Δ, associativity constraints $a_{U,V,W}$ ($\mathbf{a}_{U,V,W}$), unit k as a trivial H-module and the usual left and right unit constraints.

Now, let H be a quasi-bialgebra. We say that a k-vector space A is a left H-module algebra if it is an algebra in the monoidal category ${}_H\mathcal{M}$, that is A has a multiplication and a usual unit $1_A$ satisfying the following conditions:

\[ (aa')a'' = (X^1\cdot a)[(X^2\cdot a')(X^3\cdot a'')], \tag{2.17} \]
\[ h\cdot(aa') = (h_1\cdot a)(h_2\cdot a'), \tag{2.18} \]
\[ h\cdot 1_A = \varepsilon(h)1_A, \tag{2.19} \]

for all a, a', a'' ∈ A and h ∈ H, where h ⊗ a → h · a is the left H-module structure of A. Following [5] we define the smash product A#H as follows: as vector space A#H is A ⊗ H (elements a ⊗ h will be written a#h) with multiplication given by

\[ (a\#h)(a'\#h') = (x^1\cdot a)(x^2h_1\cdot a')\#x^3h_2h', \tag{2.20} \]

for all a, a' ∈ A, h, h' ∈ H. This A#H is an associative algebra with unit $1_A\#1_H$ and it is defined by a universal property (as Heyneman and Sweedler did for Hopf algebras), see [5]. It is easy to see that H is a subalgebra of A#H via h → 1#h, A is a k-subspace of A#H via a → a#1 and the following relations hold:

\[ (a\#h)(1\#h') = a\#hh', \qquad (1\#h)(a\#h') = h_1\cdot a\#h_2h', \tag{2.21} \]

for all a ∈ A, h, h' ∈ H.

For further use we need the notion of right H-module algebra. Let H be a quasi-bialgebra. We say that a k-linear space B is a right H-module algebra if B is an algebra in the monoidal category $\mathcal{M}_H$, i.e. B has a multiplication and a usual unit $1_B$ satisfying the following conditions:

\[ (bb')b'' = (b\cdot x^1)[(b'\cdot x^2)(b''\cdot x^3)], \tag{2.22} \]
\[ (bb')\cdot h = (b\cdot h_1)(b'\cdot h_2), \tag{2.23} \]
\[ 1_B\cdot h = \varepsilon(h)1_B, \tag{2.24} \]

for all b, b', b'' ∈ B and h ∈ H, where b ⊗ h → b · h is the right H-module structure of B. Also, we can define a (right-handed) smash product H#B as follows: as vector space H#B is H ⊗ B (elements h ⊗ b will be written h#b) with multiplication:

\[ (h\#b)(h'\#b') = hh'_1x^1\#(b\cdot h'_2x^2)(b'\cdot x^3), \tag{2.25} \]

for all b, b' ∈ B, h, h' ∈ H. This H#B is an associative algebra with unit $1_H\#1_B$. In fact, one can see that $B^{op}$ becomes a left $H^{op,cop}$-module algebra and under the trivial permutation of tensor factors we have $(B^{op}\#H^{op,cop})^{op} = H\#B$.
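As a consistency check on (2.17) and (2.20), it may help to see what they reduce to when the reassociator is trivial. The sketch below assumes $\Phi = 1\otimes 1\otimes 1$ (so that H is an ordinary bialgebra and Δ is coassociative); this special case is worked out here for illustration and is not part of the text above.

```latex
% Special case \Phi = 1 \otimes 1 \otimes 1 (illustration only).
% (2.17) becomes ordinary associativity of A, and (2.20) becomes the classical smash product:
\[
(aa')a'' = a(a'a''), \qquad (a\#h)(a'\#h') = a(h_1\cdot a')\#h_2h' .
\]
% Associativity of A\#H, using (2.18) and coassociativity of \Delta:
\[
\big[(a\#h)(a'\#h')\big](a''\#h'')
 = a(h_1\cdot a')\big(h_{(2,1)}h'_1\cdot a''\big)\#h_{(2,2)}h'_2h'' ,
\]
\[
(a\#h)\big[(a'\#h')(a''\#h'')\big]
 = a\big(h_{(1,1)}\cdot a'\big)\big(h_{(1,2)}h'_1\cdot a''\big)\#h_2h'_2h'' ,
\]
% and the two right-hand sides agree because
% h_{(1,1)} \otimes h_{(1,2)} \otimes h_2 = h_1 \otimes h_{(2,1)} \otimes h_{(2,2)} when \Phi is trivial.
```

For a nontrivial Φ the two bracketings of Δ differ by conjugation with Φ as in (2.1), and the factors $x^1, x^2, x^3$ in (2.20) are what compensate for this.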


2.3. Comodule algebras and generalized smash products. Recall from [12] the notion of comodule algebra over a quasi-bialgebra. Definition 2.1. Let H be a quasi-bialgebra. A unital associative algebra A is called a right H -comodule algebra if there exist an algebra morphism ρ : A → A ⊗ H and an invertible element ρ ∈ A ⊗ H ⊗ H such that: ρ (ρ ⊗ id)(ρ(a)) = (id ⊗ )(ρ(a))ρ , ∀a ∈ A, (1A ⊗ )(id ⊗ ⊗ id)(ρ )(ρ ⊗ 1 H ) = (id ⊗ id ⊗ )(ρ )(ρ ⊗ id ⊗ id)(ρ ), (id ⊗ ε) ◦ ρ = id, (id ⊗ ε ⊗ id)(ρ ) = (id ⊗ id ⊗ ε)(ρ ) = 1A ⊗ 1 H .

(2.26) (2.27) (2.28) (2.29)

Similarly, a unital associative algebra B is called a left H -comodule algebra if there exist an algebra morphism λ : B → H ⊗B and an invertible element λ ∈ H ⊗ H ⊗B such that the following relations hold: (id ⊗ λ)(λ(b))λ = λ ( ⊗ id)(λ(b)), ∀b ∈ B, (1 H ⊗ λ )(id ⊗ ⊗ id)(λ )( ⊗ 1B) = (id ⊗ id ⊗ λ)(λ )( ⊗ id ⊗ id)(λ ), (ε ⊗ id) ◦ λ = id, (id ⊗ ε ⊗ id)(λ ) = (ε ⊗ id ⊗ id)(λ ) = 1 H ⊗ 1B.

(2.30) (2.31) (2.32) (2.33)

When H is a quasi-bialgebra, particular examples of left and right H -comodule algebras are given by A = B = H and ρ = λ = , ρ = λ = . For a right H -comodule algebra (A, ρ, ρ ) we will denote ρ(a) = a 0 ⊗ a 1 , (ρ ⊗ id)(ρ(a)) = a 0,0 ⊗ a 0,1 ⊗ a 1 , etc. for any a ∈ A. Similarly, for a left H -comodule algebra (B, λ, λ ), if b ∈ B then we will denote λ(b) = b[−1] ⊗ b[0] , (id ⊗ λ)(λ(b)) = b[−1] ⊗ b[0,−1] ⊗ b[0,0] , etc. In analogy with the notation for the reassociator of H , we will write ρ = X˜ ρ1 ⊗ X˜ ρ2 ⊗ X˜ ρ3 = Y˜ρ1 ⊗ Y˜ρ2 ⊗ Y˜ρ3 = · · · , 1 2 3 1 2 3 −1 ρ = x˜ ρ ⊗ x˜ ρ ⊗ x˜ ρ = y˜ρ ⊗ y˜ρ ⊗ y˜ρ = · · · ,

and similarly for the element λ of a left H -comodule algebra B. When there is no danger of confusion we will omit the subscripts ρ or λ for the tensor components of the −1 elements ρ , λ or for the tensor components of the elements −1 ρ , λ . If A is a right H -comodule algebra then we define the elements p˜ ρ , q˜ρ ∈ A ⊗ H as follows: 1 3 2 p˜ ρ = p˜ ρ1 ⊗ p˜ ρ2 = x˜ 1ρ ⊗ x˜ 2ρ β S(x˜ 3ρ ), q˜ρ = q˜ρ1 ⊗ q˜ρ2 = X˜ ρ ⊗ S −1 (α X˜ ρ ) X˜ ρ . (2.34)


By [12, Lemma 9.1], we have the following relations, for all a ∈ A: ρ(a<0> ) p˜ ρ [1A ⊗ S(a<1> )] = p˜ ρ [a ⊗ 1 H ], [1A ⊗

S −1 (a

<1> )]q˜ρ ρ(a<0> ) = [a ⊗ 1 H ]q˜ρ , 1 ρ(q˜ρ ) p˜ ρ [1A ⊗ S(q˜ρ2 )] = 1A ⊗ 1 H , [1A ⊗ S −1 ( p˜ ρ2 )]q˜ρ ρ( p˜ ρ1 ) = 1A ⊗ 1 H ,

(2.35) (2.36) (2.37) (2.38)

ρ (ρ ⊗ id H )( p˜ ρ )( p˜ ρ ⊗ id H ) = (idA ⊗ )(ρ(x˜ 1ρ ) p˜ ρ )(1A ⊗ g 1 S(x˜ 3ρ ) ⊗ g 2 S(x˜ 2ρ )),

(2.39)

(q˜ρ ⊗ 1 H )(ρ ⊗ id H )(q˜ρ )−1 ρ 3 2 1 = [1A ⊗ S −1 ( f 2 X˜ ρ ) ⊗ S −1 ( f 1 X˜ ρ )](idA ⊗ )(q˜ρ ρ( X˜ ρ )), (2.40)

where f = f 1 ⊗ f 2 is the element defined in (2.13) and f −1 = g 1 ⊗ g 2 . Let H be a quasi-bialgebra, A a left H -module algebra and B a left H -comodule algebra. Denote by A
(2.41)

for all a, a ∈ A and b, b ∈ B. By [7], A B the k-vector space A ⊗ B with the newly defined multiplication (a > b)(a > b ) = aa 0 x˜ 1ρ > (b · a 1 x˜ 2ρ )(b · x˜ 3ρ ),

(2.42)

for all a, a ∈ A and b, b ∈ B. It is easy to see that A > B is an associative algebra with unit 1A > 1 B . Of course, if A = H then H > B = H # B as algebras. 2.4. Bimodule algebras and bicomodule algebras. The following definition was introduced in [12] under the name “quasi-commuting pair of H -coactions”. Definition 2.2. Let H be a quasi-bialgebra. By an H -bicomodule algebra A we mean a quintuple (λ, ρ, λ , ρ , λ,ρ ), where λ and ρ are left and right H -coactions on A, respectively, and where λ ∈ H ⊗ H ⊗ A, ρ ∈ A ⊗ H ⊗ H and λ,ρ ∈ H ⊗ A ⊗ H are invertible elements, such that: - (A, λ, λ ) is a left H -comodule algebra; - (A, ρ, ρ ) is a right H -comodule algebra; - the following compatibility relations hold: λ,ρ (λ ⊗ id)(ρ(u)) = (id ⊗ ρ)(λ(u))λ,ρ , ∀u ∈ A, (1 H ⊗ λ,ρ )(id ⊗ λ ⊗ id)(λ,ρ )(λ ⊗ 1 H ) = (id ⊗ id ⊗ ρ)(λ )( ⊗ id ⊗ id)(λ,ρ ), (1 H ⊗ ρ )(id ⊗ ρ ⊗ id)(λ,ρ )(λ,ρ ⊗ 1 H ) = (id ⊗ id ⊗ )(λ,ρ )(λ ⊗ id ⊗ id)(ρ ).

(2.43) (2.44) (2.45)


As pointed out in [12], if A is a bicomodule algebra then, in addition, we have that (id H ⊗ idA ⊗ ε)(λ,ρ ) = 1 H ⊗ 1A , (ε ⊗ idA ⊗ id H )(λ,ρ ) = 1A ⊗ 1 H . (2.46) A first example of a bicomodule algebra is A = H , λ = ρ = and λ = ρ = λ,ρ = . For the left and right comodule algebra structures of A we will use notation as above. For simplicity we denote 1

2

3

λ,ρ = 1 ⊗ 2 ⊗ 3 = 1 ⊗ 2 ⊗ 3 = ⊗ ⊗ , 1 2 3 ˜1 ˜2 ˜3 −1 λ,ρ = θ ⊗ θ ⊗ θ = θ ⊗ θ ⊗ θ = θ ⊗ θ ⊗ θ . 1

2

3

As we mentioned before, if H is a quasi-bialgebra then so is H op , where “op” means the opposite multiplication. The reassociator of H op is op = −1 . Hence H ⊗ H op is a quasi-bialgebra with reassociator H ⊗H op = (X 1 ⊗ x 1 ) ⊗ (X 2 ⊗ x 2 ) ⊗ (X 3 ⊗ x 3 ).

(2.47)

If we identify left H ⊗ H op -modules with H -bimodules, then the category of H -bi modules, H M H , is monoidal, the associativity constraints being given by aU,V,W : (U ⊗ V ) ⊗ W → U ⊗ (V ⊗ W ), a U,V,W ((u ⊗ v) ⊗ w) = · (u ⊗ (v ⊗ w)) · −1 ,

(2.48)

for any U, V, W ∈ H M H and u ∈ U , v ∈ V and w ∈ W . Therefore, we can define algebras in the category of H -bimodules. Such an algebra will be called an H -bimodule algebra. More exactly, a k-vector space A is an H -bimodule algebra if A is an H -bimodule (denote the actions by h ·ϕ and ϕ ·h, for h ∈ H and ϕ ∈ A) which has a multiplication and a usual unit 1A such that for all ϕ, ϕ , ϕ ∈ A and h ∈ H the following relations hold: (ϕϕ )ϕ = (X 1 · ϕ · x 1 )[(X 2 · ϕ · x 2 )(X 3 · ϕ · x 3 )], h · (ϕϕ ) = (h 1 · ϕ)(h 2 · ϕ ), (ϕϕ ) · h = (ϕ · h 1 )(ϕ · h 2 ), h · 1A = ε(h)1A , 1A · h = ε(h)1A .

(2.49) (2.50) (2.51)

Let H be a quasi-bialgebra. Then $H^*$, the linear dual of H, is an H-bimodule via the H-actions

\[ \langle h\rightharpoonup\varphi, h'\rangle = \varphi(h'h), \qquad \langle\varphi\leftharpoonup h, h'\rangle = \varphi(hh'), \tag{2.52} \]

for all $\varphi\in H^*$ and h, h' ∈ H. The convolution $\langle\varphi\psi, h\rangle = \varphi(h_1)\psi(h_2)$, $\varphi,\psi\in H^*$, h ∈ H, is a multiplication on $H^*$; it is not in general associative, but with this multiplication $H^*$ becomes an H-bimodule algebra.
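The claim that the convolution on $H^*$ satisfies the twisted associativity (2.49) can be checked directly from (2.1). The following sketch writes the two actions of (2.52) as $h\rightharpoonup\varphi$ and $\varphi\leftharpoonup h$, i.e. $(h\rightharpoonup\varphi)(h')=\varphi(h'h)$ and $(\varphi\leftharpoonup h)(h')=\varphi(hh')$; the arrow notation and the third functional $\chi$ are labels assumed here for the illustration.

```latex
% Sketch: twisted associativity of the convolution on H^*, for \varphi,\psi,\chi \in H^*, h \in H.
% By (2.1), (\Delta\otimes\mathrm{id})(\Delta(h)) = \Phi^{-1}(\mathrm{id}\otimes\Delta)(\Delta(h))\Phi, i.e.
\[
h_{(1,1)}\otimes h_{(1,2)}\otimes h_2
 = x^1h_1X^1\otimes x^2h_{(2,1)}X^2\otimes x^3h_{(2,2)}X^3 .
\]
% Hence
\[
\langle(\varphi\psi)\chi, h\rangle
 = \varphi(h_{(1,1)})\,\psi(h_{(1,2)})\,\chi(h_2)
 = \varphi(x^1h_1X^1)\,\psi(x^2h_{(2,1)}X^2)\,\chi(x^3h_{(2,2)}X^3)
\]
\[
 = \big\langle (X^1\rightharpoonup\varphi\leftharpoonup x^1)
   \big[(X^2\rightharpoonup\psi\leftharpoonup x^2)(X^3\rightharpoonup\chi\leftharpoonup x^3)\big],\, h\big\rangle ,
\]
% which is the first relation in (2.49) with h\cdot\varphi = h\rightharpoonup\varphi
% and \varphi\cdot h = \varphi\leftharpoonup h.
```

In particular the convolution is associative when Φ is trivial, and in general fails to be so exactly by the conjugation with Φ appearing in (2.1).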


3. Generalized Diagonal Crossed Products In order to define the generalized diagonal crossed products we need the notion of twosided coaction. Let H be a quasi-bialgebra and A a unital associative algebra. Recall from [12] that a two-sided coaction of H on A is a pair (δ, ) where δ : A → H ⊗ A ⊗ H is an algebra map and ∈ H ⊗2 ⊗ A ⊗ H ⊗2 is an invertible element such that the following relations hold: (id H ⊗ δ ⊗ id H )(δ(u)) = ( ⊗ idA ⊗ )(δ(u)), ∀ u ∈ A,

(3.1)

−1

(1 H ⊗ ⊗ 1 H )(id H ⊗ ⊗ idA ⊗ ⊗ id H )()( ⊗ idA ⊗ ) = (id H ⊗ id H ⊗ δ ⊗ id H ⊗ id H )()( ⊗ id H ⊗ idA ⊗ id H ⊗ )(), (3.2) (ε ⊗ idA ⊗ ε) ◦ δ = idA , (id H ⊗ ε ⊗ idA ⊗ ε ⊗ id H )() = (ε ⊗ id H ⊗ idA ⊗ id H ⊗ ε)() = 1 H ⊗ 1A ⊗ 1 H .

(3.3) (3.4)

If H is a quasi-bialgebra then to any H -bicomodule algebra (A, λ, ρ, λ , ρ , λ,ρ ) one can associate (see [12]) two two-sided H -coactions, denoted by (δl , l ) and (δr , r ). More precisely δl = (λ ⊗ id H ) ◦ ρ, (3.5) ⊗2 ⊗2 ⊗2 ) (λ,ρ ⊗ 1 H )(λ ⊗ id H )(−1 l := (id H ⊗ λ ⊗ id H ρ ) [λ ⊗ 1 H ], and

δr = (id H ⊗ ρ) ◦ λ, ⊗2 ⊗2 −1 ⊗ ρ ⊗ id H ) (1 H ⊗ −1 )(id ⊗ ρ)( ) [1⊗2 r = (id H λ λ,ρ H H ⊗ ρ ].

(3.6)

Let H be a quasi-Hopf algebra, A an H -bimodule algebra and (δ, ) a two-sided coaction of H on a unital associative algebra A. Denote δ(u) := u (−1) ⊗ u (0) ⊗ u (1) , for 1

5

all u ∈ A, = 1 ⊗ · · · ⊗ 5 , −1 = ⊗ · · · ⊗ , and then define 1

2

3

4

5

δ = 1δ ⊗ · · · ⊗ 5δ = ⊗ ⊗ ⊗ S −1 ( f 1 ) ⊗ S −1 ( f 2 ), (3.7)

5 1 1 2 2 3 4 5 −1 −1 δ = 1 δ ⊗ · · · ⊗ δ = S ( g ) ⊗ S ( g ) ⊗ ⊗ ⊗ . (3.8)

Here f = f 1 ⊗ f 2 is the twist defined in (2.13) and f −1 = g 1 ⊗ g 2 is its inverse. We denote by A δ A and A δ A the k-vector spaces A ⊗ A and respectively A ⊗ A, furnished with the multiplications given respectively by: (ϕ δ u)(ϕ δ u ) = (1δ · ϕ · 5δ )(2δ u (−1) · ϕ · S −1 (u (1) )4δ ) δ 3δ u (0) u , (u δ ϕ)(u δ ϕ ) 2 −1 4 1 5 = uu (0) 3 δ δ (δ S (u (−1) ) · ϕ · u (1) δ )(δ · ϕ · δ ),

(3.9) (3.10)

for all u, u ∈ A and ϕ, ϕ ∈ A, where we write ϕ δ u and u δ ϕ in place of ϕ ⊗ u and respectively u ⊗ ϕ to distinguish the new algebraic structures, and where


5 δ = 1δ ⊗ · · · ⊗ 5δ and δ = 1 δ ⊗ · · · ⊗ δ are the elements defined by (3.7) and (3.8), respectively. We call A δ A and A δ A the left, and respectively right, generalized diagonal crossed product between A and A. The following (technical) lemma, expressing some relations fulfilled by the elements δ and δ , will be essential in the sequel. It will help us to prove that the generalized diagonal crossed products defined above are associative algebras, and moreover it will allow us to regard an H -bicomodule algebra A, in two ways, as a left H ⊗ H op -comodule algebra. We would like to stress that for these two aims, the explicit formulae for δ and δ are not so important, any other elements satisfying the relations in the lemma (plus some other minor conditions) are equally good, so it would be a natural question to ask whether there exist other such elements.

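Lemma 3.1. Let $H$ be a quasi-Hopf algebra, $\mathbb{A}$ a unital associative algebra and $(\delta, \Psi)$ a two-sided coaction of $H$ on $\mathbb{A}$.

(a) Let $\Omega_\delta = \Omega^1_\delta \otimes \cdots \otimes \Omega^5_\delta = \overline{\Omega}^1_\delta \otimes \cdots \otimes \overline{\Omega}^5_\delta$ be the element defined by (3.7). Then for all $u \in \mathbb{A}$ the following relations hold:

\[ \Omega^1_\delta u_{(-1)} \otimes \Omega^2_\delta u_{(0,-1)} \otimes \Omega^3_\delta u_{(0,0)} \otimes S^{-1}(u_{(0,1)})\Omega^4_\delta \otimes S^{-1}(u_{(1)})\Omega^5_\delta = u_{(-1)_1}\Omega^1_\delta \otimes u_{(-1)_2}\Omega^2_\delta \otimes u_{(0)}\Omega^3_\delta \otimes \Omega^4_\delta S^{-1}(u_{(1)})_2 \otimes \Omega^5_\delta S^{-1}(u_{(1)})_1, \tag{3.11} \]

\[ X^1(\overline{\Omega}^1_\delta)_1\Omega^1_\delta \otimes X^2(\overline{\Omega}^1_\delta)_2\Omega^2_\delta \otimes X^3\overline{\Omega}^2_\delta(\Omega^3_\delta)_{(-1)} \otimes \overline{\Omega}^3_\delta(\Omega^3_\delta)_{(0)} \otimes S^{-1}((\Omega^3_\delta)_{(1)})\overline{\Omega}^4_\delta x^3 \otimes \Omega^4_\delta(\overline{\Omega}^5_\delta)_2 x^2 \otimes \Omega^5_\delta(\overline{\Omega}^5_\delta)_1 x^1 = \overline{\Omega}^1_\delta \otimes (\overline{\Omega}^2_\delta)_1\Omega^1_\delta \otimes (\overline{\Omega}^2_\delta)_2\Omega^2_\delta \otimes \overline{\Omega}^3_\delta\Omega^3_\delta \otimes \Omega^4_\delta(\overline{\Omega}^4_\delta)_2 \otimes \Omega^5_\delta(\overline{\Omega}^4_\delta)_1 \otimes \overline{\Omega}^5_\delta. \tag{3.12} \]

(b) Let $\Omega'_\delta = \Omega'^1_\delta \otimes \cdots \otimes \Omega'^5_\delta = \overline{\Omega}'^1_\delta \otimes \cdots \otimes \overline{\Omega}'^5_\delta$ be the element defined by (3.8). Then for all $u \in \mathbb{A}$ the following relations hold:

\[ \Omega'^1_\delta S^{-1}(u_{(-1)}) \otimes \Omega'^2_\delta S^{-1}(u_{(0,-1)}) \otimes u_{(0,0)}\Omega'^3_\delta \otimes u_{(0,1)}\Omega'^4_\delta \otimes u_{(1)}\Omega'^5_\delta = S^{-1}(u_{(-1)})_2\Omega'^1_\delta \otimes S^{-1}(u_{(-1)})_1\Omega'^2_\delta \otimes \Omega'^3_\delta u_{(0)} \otimes \Omega'^4_\delta u_{(1)_1} \otimes \Omega'^5_\delta u_{(1)_2}, \tag{3.13} \]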

2

2

3

4

1 2 3 4 1 X 3 δ ⊗ X 2 (δ )2 1 δ ⊗ X (δ )1 δ ⊗ δ δ ⊗ δ (δ )1 x 4

5

1

1

2

2 3 1 2 3 −1 ⊗ 5 δ (δ )2 x ⊗ δ x = (δ )1 δ ⊗ (δ )2 δ ⊗ δ S ((δ )(−1) ) 3

4

5

5

3 4 5 ⊗( 3 δ )(0) δ ⊗ (δ )(1) δ ⊗ δ (δ )1 ⊗ δ (δ )2 .

(3.14)

Proof. We will prove only (a), (b) being similar. The relation (3.11) follows easily by applying (3.7), (3.1) and (2.11), the details are left to the reader. We prove now (3.12). We will not perform all the computations, but we will point out the relations that are used at every step. So, we compute: 1

1

2

3

X 1 (δ )1 1δ ⊗ X 2 (δ )2 2δ ⊗ X 3 δ (3δ )(−1) ⊗ δ 3(0) 4

5

5

⊗S −1 ((3δ )(1) )δ x 3 ⊗ 4δ (δ )2 x 2 ⊗ 5δ (δ )1 x 1 1 2 2 3 3 3 (3.7,2.11) 1 1 1 X 1 ⊗ X 2 2 ⊗ X 3 (−1) ⊗ (0) = 5 4 5 5 ⊗S −1 (F 1 f 12 1 )x 2 ⊗ S −1 (F 2 f 22 2 )x 1

4

3

⊗ S −1 ( f 1 (1) )x 3

(3.2) =

1

2

1

2

2

3

3

4

4

⊗ 1 ⊗ 2 ⊗ ⊗ S −1 (S(x 3 ) f 1 X 1 1 ) 4

5

5

⊗S −1 (S(x 2 )F 1 f 12 X 2 2 ) ⊗ S −1 (S(x 1 )F 2 f 22 X 3 ) (2.9,2.16,2.11) =

1

2

1

2

2

3

3

4

4

⊗ 1 ⊗ 2 ⊗ ⊗ S −1 ( f 1 )S −1 (F 1 )2 5

4

5

⊗S −1 ( f 2 )S −1 (F 1 )1 ⊗ S −1 (F 2 ) (3.7) =

1

2

2

3

4

4

5

δ ⊗ (δ )1 1δ ⊗ (δ )2 2δ ⊗ δ 3δ ⊗ 4δ (δ )2 ⊗ 5δ (δ )1 ⊗ δ , 1

5

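as claimed. In this computation we used a second copy of $\Psi^{-1}$, and we denoted by $F^1 \otimes F^2$ another copy of the Drinfeld twist $f$ defined in (2.13).

Suppose now that $\mathbb{A}$ is an $H$-bicomodule algebra and let $(\delta, \Psi) = (\delta_{l/r}, \Psi_{l/r})$ be the two-sided coactions defined by (3.5) and (3.6), respectively. For simplicity we denote $\Omega = \Omega_{\delta_l}$, $\omega = \Omega_{\delta_r}$, $\Omega' = \Omega'_{\delta_l}$ and $\omega' = \Omega'_{\delta_r}$. Concretely, the elements $\Omega, \omega \in H^{\otimes 2} \otimes \mathbb{A} \otimes H^{\otimes 2}$ come out as

\[ \Omega = (\tilde{X}^1_\rho)_{[-1]_1}\tilde{x}^1_\lambda\theta^1 \otimes (\tilde{X}^1_\rho)_{[-1]_2}\tilde{x}^2_\lambda\theta^2_{[-1]} \otimes (\tilde{X}^1_\rho)_{[0]}\tilde{x}^3_\lambda\theta^2_{[0]} \otimes S^{-1}(f^1\tilde{X}^2_\rho\theta^3) \otimes S^{-1}(f^2\tilde{X}^3_\rho), \tag{3.15} \]

\[ \omega = \tilde{x}^1_\lambda \otimes \tilde{x}^2_\lambda\Theta^1 \otimes (\tilde{x}^3_\lambda)_{\langle 0\rangle}\tilde{X}^1_\rho\Theta^2_{\langle 0\rangle} \otimes S^{-1}(f^1(\tilde{x}^3_\lambda)_{\langle 1\rangle_1}\tilde{X}^2_\rho\Theta^2_{\langle 1\rangle}) \otimes S^{-1}(f^2(\tilde{x}^3_\lambda)_{\langle 1\rangle_2}\tilde{X}^3_\rho\Theta^3), \tag{3.16} \]

where $\Phi_\rho = \tilde{X}^1_\rho \otimes \tilde{X}^2_\rho \otimes \tilde{X}^3_\rho$, $\Phi_\lambda^{-1} = \tilde{x}^1_\lambda \otimes \tilde{x}^2_\lambda \otimes \tilde{x}^3_\lambda$, $\Phi_{\lambda,\rho} = \Theta^1 \otimes \Theta^2 \otimes \Theta^3$, $\Phi_{\lambda,\rho}^{-1} = \theta^1 \otimes \theta^2 \otimes \theta^3$ and $f = f^1 \otimes f^2$ is the twist defined in (2.13). For further use we record the fact that the formulae in Lemma 3.1 (a) specialize to $(\delta_{l/r}, \Psi_{l/r})$ as follows (for all $u \in \mathbb{A}$):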

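\[ \Omega^1 u_{\langle 0\rangle_{[-1]}} \otimes \Omega^2 (u_{\langle 0\rangle_{[0]}})_{\langle 0\rangle_{[-1]}} \otimes \Omega^3 (u_{\langle 0\rangle_{[0]}})_{\langle 0\rangle_{[0]}} \otimes S^{-1}((u_{\langle 0\rangle_{[0]}})_{\langle 1\rangle})\Omega^4 \otimes S^{-1}(u_{\langle 1\rangle})\Omega^5 = (u_{\langle 0\rangle_{[-1]}})_1\Omega^1 \otimes (u_{\langle 0\rangle_{[-1]}})_2\Omega^2 \otimes u_{\langle 0\rangle_{[0]}}\Omega^3 \otimes \Omega^4 S^{-1}(u_{\langle 1\rangle})_2 \otimes \Omega^5 S^{-1}(u_{\langle 1\rangle})_1, \tag{3.17} \]

\[ X^1\overline{\Omega}^1_1\Omega^1 \otimes X^2\overline{\Omega}^1_2\Omega^2 \otimes X^3\overline{\Omega}^2\,\Omega^3_{\langle 0\rangle_{[-1]}} \otimes \overline{\Omega}^3\,\Omega^3_{\langle 0\rangle_{[0]}} \otimes S^{-1}(\Omega^3_{\langle 1\rangle})\overline{\Omega}^4 x^3 \otimes \Omega^4\overline{\Omega}^5_2 x^2 \otimes \Omega^5\overline{\Omega}^5_1 x^1 = \overline{\Omega}^1 \otimes \overline{\Omega}^2_1\Omega^1 \otimes \overline{\Omega}^2_2\Omega^2 \otimes \overline{\Omega}^3\Omega^3 \otimes \Omega^4\overline{\Omega}^4_2 \otimes \Omega^5\overline{\Omega}^4_1 \otimes \overline{\Omega}^5, \tag{3.18} \]

and respectively

\[ \omega^1 u_{[-1]} \otimes \omega^2 (u_{[0]_{\langle 0\rangle}})_{[-1]} \otimes \omega^3 (u_{[0]_{\langle 0\rangle}})_{[0]_{\langle 0\rangle}} \otimes S^{-1}((u_{[0]_{\langle 0\rangle}})_{[0]_{\langle 1\rangle}})\omega^4 \otimes S^{-1}(u_{[0]_{\langle 1\rangle}})\omega^5 = (u_{[-1]})_1\omega^1 \otimes (u_{[-1]})_2\omega^2 \otimes u_{[0]_{\langle 0\rangle}}\omega^3 \otimes \omega^4 S^{-1}(u_{[0]_{\langle 1\rangle}})_2 \otimes \omega^5 S^{-1}(u_{[0]_{\langle 1\rangle}})_1, \tag{3.19} \]

\[ \overline{\omega}^1_1\omega^1 \otimes \overline{\omega}^1_2\omega^2 \otimes \overline{\omega}^2\,\omega^3_{[-1]} \otimes \overline{\omega}^3\,\omega^3_{[0]_{\langle 0\rangle}} \otimes S^{-1}(\omega^3_{[0]_{\langle 1\rangle}})\overline{\omega}^4 \otimes \omega^4\overline{\omega}^5_2 \otimes \omega^5\overline{\omega}^5_1 = x^1\overline{\omega}^1 \otimes x^2\overline{\omega}^2_1\omega^1 \otimes x^3\overline{\omega}^2_2\omega^2 \otimes \overline{\omega}^3\omega^3 \otimes \omega^4\overline{\omega}^4_2 X^3 \otimes \omega^5\overline{\omega}^4_1 X^2 \otimes \overline{\omega}^5 X^1, \tag{3.20} \]

where we denoted by $\Omega = \Omega^1 \otimes \cdots \otimes \Omega^5 = \overline{\Omega}^1 \otimes \cdots \otimes \overline{\Omega}^5$ the element defined in (3.15) and by $\omega = \omega^1 \otimes \cdots \otimes \omega^5 = \overline{\omega}^1 \otimes \cdots \otimes \overline{\omega}^5$ the element defined in (3.16). If $(\mathbb{A}, \lambda, \rho, \Phi_\lambda, \Phi_\rho, \Phi_{\lambda,\rho})$ is an $H$-bicomodule algebra then it is not hard to see that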


$\mathbb{A}^{op,cop} := (\mathbb{A}^{op}, \tau_{\mathbb{A},H} \circ \rho, \tau_{H,\mathbb{A}} \circ \lambda, \Phi_\rho^{321}, \Phi_\lambda^{321}, \Phi_{\lambda,\rho}^{321})$ is an $H^{op,cop}$-bicomodule algebra (by $\tau_{X,Y} : X \otimes Y \to Y \otimes X$ we denoted the switch map $x \otimes y \mapsto y \otimes x$). Moreover, in $H^{op,cop}$ the Drinfeld twist (defined for an arbitrary quasi-Hopf algebra in (2.13)) is given by $f_{op,cop} = (f^{-1})^{21} = g^2 \otimes g^1$, where $f$ is the Drinfeld twist of $H$. Now, if we denote by $\Omega_{op,cop}$ and $\omega_{op,cop}$ the elements $\Omega_{\delta_{l/r}}$ corresponding to the $H^{op,cop}$-bicomodule algebra $\mathbb{A}^{op,cop}$, then one can easily check that

= (ωop,cop )54321 and ω = (op,cop )54321 , so we restrict to the study of the elements , ω and their associated constructions. Finally, for this particular situation we denote A δl A = A A, A δr A = A A, A δl A = A A and A δr A = A A, where A is an arbitrary H -bimodule algebra. So the first two constructions are left generalized diagonal crossed products and the last two are right generalized diagonal crossed products. For example, the multiplications in A A and A A are given by (ϕ u)(ϕ u ) = (1 · ϕ · 5 )(2 u 0[−1] · ϕ · S −1 (u 1 )4 ) 3 u 0[0] u ,

(3.21)

(ϕ u)(ϕ u ) = (ω1 · ϕ · ω5 )(ω2 u [−1] · ϕ · S −1 (u [0] 1 )ω4 ) ω3 u [0] 0 u ,

(3.22)

for all ϕ, ϕ ∈ A and u, u ∈ A, where we write ϕ u and ϕ u in place of ϕ ⊗ u to distinguish the new algebraic structures. We are now ready to show that the generalized diagonal crossed products are unital associative algebras. Proposition 3.2. Let H be a quasi-Hopf algebra, A a unital associative algebra and (δ, ) a two-sided coaction of H on A. Consider A δ A and A δ A, the k-vector spaces A ⊗ A and respectively A ⊗ A, endowed with the multiplications defined in (3.9) and (3.10), respectively. Then these products define on A δ A and A δ A two structures of associative algebra with unit 1A δ 1A (respectively 1A δ 1A ), containing A ≡ 1A δ A (respectively A ≡ A δ 1A ) as unital subalgebra. Consequently, if A is an H -bicomodule algebra and A is an H -bimodule algebra then A A, A A, A A and A A are associative algebras containing A as unital subalgebra. Proof. We will give the proof only for A δ A, the one for A δ A being similar (it will use the relations satisfied by δ , instead of the ones satisfied by δ ). For ϕ, ϕ , ϕ ∈ A and u, u , u ∈ A we compute: (ϕ δ u)[(ϕ δ u )(ϕ δ u )] (3.9) = (ϕ

δ u)[(1δ · ϕ · 5δ )(2δ u (−1) · ϕ · S −1 (u (−1) )4δ ) δ 3δ u (0) u ]

1 (3.9,2.50) (δ =

5

2

4

2

· ϕ · δ )[((δ )1 u (−1)1 1δ · ϕ · 5δ S −1 (u (−1) )1 (δ )1 )((δ )2 u (−1)2 4

3

×2δ u (−1) · ϕ · S −1 (u (1) )4δ S −1 (u (1) )2 (δ )2 )] δ δ u (0) 3δ u (0) u 1 (3.11) = (δ

5

2

4

· ϕ · δ )[(δ )1 1δ u (−1) · ϕ · S −1 (u (1) )5δ (δ )1 )

2

4

((δ )2 2δ u (0,−1) u (−1) · ϕ · S −1 (u (0,1) u (1) )4δ (δ )2 )] 3

δ δ 3δ u (0,0) u (0) u 1 5 1 5 (3.12,2.49) [((δ )1 1δ · ϕ · 5δ (δ )1 )((δ )2 2δ u (−1) · ϕ · S −1 (u (1) )4δ (δ )2 )] = 2 4 (δ (3δ )(−1) u (0,−1) u (−1) · ϕ · S −1 ((3δ )(1) u (0,1) u (1) )δ ) 3 δ δ (3δ )(0) u (0,0) u (0) u (2.50,3.9) [(1δ · ϕ · 5δ )(2δ u (−1) · ϕ · S −1 (u (1) )4δ ) δ 3δ u (0) u ](ϕ δ u ) = (3.9) = [(ϕ δ u)(ϕ δ u )](ϕ δ u ).

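The fact that $1_{\mathcal{A}} \bowtie_\delta 1_{\mathbb{A}}$ is the unit follows easily from the (co)unit axioms.

Remark 3.3. In the algebras $\mathcal{A} \bowtie_\delta \mathbb{A}$ and $\mathbb{A} \bowtie_\delta \mathcal{A}$ we have $(\varphi \bowtie_\delta 1_{\mathbb{A}})(1_{\mathcal{A}} \bowtie_\delta u) = \varphi \bowtie_\delta u$ and $(u \bowtie_\delta 1_{\mathcal{A}})(1_{\mathbb{A}} \bowtie_\delta \varphi) = u \bowtie_\delta \varphi$, for all $\varphi \in \mathcal{A}$ and $u \in \mathbb{A}$.

Examples 3.4. 1) As we mentioned before, if $H$ is a quasi-Hopf algebra then $H^*$ is an $H$-bimodule algebra, hence it makes sense to consider the algebras $H^* \bowtie_\delta \mathbb{A}$ and $\mathbb{A} \bowtie_\delta H^*$, which are exactly the left and right diagonal crossed products constructed in [12]. For this reason we called the algebras in Proposition 3.2 the generalized diagonal crossed products.

2) Let $\mathcal{A}$ be a left $H$-module algebra. Then $\mathcal{A}$ becomes an $H$-bimodule algebra, where the right $H$-action is given via $\varepsilon$. In this particular case $\mathcal{A} \bowtie_{\delta_l} H$ and $\mathcal{A} \bowtie_{\delta_r} H$ both coincide with the smash product algebra $\mathcal{A} \# H$. Moreover, if we replace the quasi-Hopf algebra $H$ by an arbitrary $H$-bicomodule algebra $\mathbb{A}$, then $\mathcal{A} \bowtie_{\delta_l} \mathbb{A}$ and $\mathcal{A} \bowtie_{\delta_r} \mathbb{A}$ coincide with the generalized smash product algebra $\mathcal{A} \mathbin{\blacktriangleright\!\!<} \mathbb{A}$.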

(ϕ h)(ϕ h ) = (ω · ϕ · ω )(ω h 1 · ϕ · S 1

5

2

−1

(3.23)

(h (2,2) )ω ) ω h (2,1) h , (3.24) 4

3

for all ϕ, ϕ ∈ A and h, h ∈ H , where = 1 ⊗· · ·⊗5 , ω = ω1 ⊗...⊗ω5 ∈ H ⊗5 are now given by: 1 1 = X (1,1) x 1 y 1 ⊗ X (1,2) x 2 y12 ⊗ X 21 x 3 y22 ⊗ S −1 ( f 1 X 2 y 3 ) ⊗ S −1 ( f 2 X 3 ),

(3.25)

3 3 ω = x 1 ⊗ x 2 Y 1 ⊗ x13 X 1 Y12 ⊗ S −1 ( f 1 x(2,1) X 2 Y22 ) ⊗ S −1 ( f 2 x(2,2) X 3 Y 3 ),

(3.26)

and where f = f 1 ⊗ f 2 is the twist defined in (2.13). 4) Let H be an ordinary Hopf algebra with bijective antipode, A an H -bimodule algebra and A an H -bicomodule algebra in the usual (Hopf) sense. In this case the multiplications of A A and A A coincide, and are given by (ϕ u)(ϕ u ) = ϕ(u {−1} · ϕ · S −1 (u {1} )) u {0} u ,

(3.27)


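for all $\varphi, \varphi' \in \mathcal{A}$ and $u, u' \in \mathbb{A}$, where $u_{\{-1\}} \otimes u_{\{0\}} \otimes u_{\{1\}} := u_{\langle 0\rangle_{[-1]}} \otimes u_{\langle 0\rangle_{[0]}} \otimes u_{\langle 1\rangle} = u_{[-1]} \otimes u_{[0]_{\langle 0\rangle}} \otimes u_{[0]_{\langle 1\rangle}}$. This construction appears in [26], in a slightly different form (namely, with $S$ instead of $S^{-1}$), under the name “generalized twisted smash product” (a particular case, when $\mathbb{A} = H$, was introduced in [25]).

Let $H$ be a quasi-Hopf algebra. For an $H$-bicomodule algebra $\mathbb{A}$ and an $H$-bimodule algebra $\mathcal{A}$ the multiplications of the right generalized diagonal crossed products $\mathbb{A} \bowtie_{\delta_l} \mathcal{A}$ and $\mathbb{A} \bowtie_{\delta_r} \mathcal{A}$ are the following. If $\Omega' = \Omega'^1 \otimes \cdots \otimes \Omega'^5$ and $\omega' = \omega'^1 \otimes \cdots \otimes \omega'^5$, we then have

\[ (u \bowtie_{\delta_l} \varphi)(u' \bowtie_{\delta_l} \varphi') = u u'_{\langle 0\rangle_{[0]}}\Omega'^3 \bowtie_{\delta_l} (\Omega'^2 S^{-1}(u'_{\langle 0\rangle_{[-1]}}) \cdot \varphi \cdot u'_{\langle 1\rangle}\Omega'^4)(\Omega'^1 \cdot \varphi' \cdot \Omega'^5), \tag{3.28} \]

\[ (u \bowtie_{\delta_r} \varphi)(u' \bowtie_{\delta_r} \varphi') = u u'_{[0]_{\langle 0\rangle}}\omega'^3 \bowtie_{\delta_r} (\omega'^2 S^{-1}(u'_{[-1]}) \cdot \varphi \cdot u'_{[0]_{\langle 1\rangle}}\omega'^4)(\omega'^1 \cdot \varphi' \cdot \omega'^5), \tag{3.29} \]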

for all u, u ∈ A and ϕ, ϕ ∈ A. We know from Proposition 3.2 that A A and A A are associative algebras with unit 1A 1A and 1A 1A , respectively, containing A as unital subalgebra. In fact, under the trivial permutation of tensor factors we have that A A ≡ (Aop Aop,cop )op , A A ≡ (Aop Aop,cop )op ,

(3.30)

where the left generalized diagonal crossed products are made over H op,cop . Note that Aop becomes an H op,cop -bimodule algebra via the actions h ·op ϕ ·op h = h · ϕ · h, for all h, h ∈ H and ϕ ∈ A. In the sequel we will restrict to the study of the left generalized diagonal crossed products. Remark 3.5. Let H be a quasi-Hopf algebra and A an H -bicomodule algebra. In [12], Hausser and Nill proved that the two left (right) diagonal crossed products H ∗ A (A H ∗ ) and H ∗ A (A H ∗ ) are isomorphic as algebras, and then that these four diagonal crossed products are isomorphic as algebras. We will prove in the next section that such result is also true for generalized diagonal crossed products, but as a consequence of the fact that the (generalized) diagonal crossed products can be written as some generalized smash products, and of an explicit algebra isomorphism between A A and A A. Remark 3.6. There exists a very general scheme, due to Schauenburg [23], for constructing associative algebras starting with a monoidal category acting on a category of modules, and it is likely that the generalized diagonal crossed products fit into this scheme. However, we have chosen to prove the associativity of A δ A by direct computation, first because Schauenburg’s machinery is itself quite complicated, and second because the difficulty of our proof lies actually only in Lemma 3.1, which is needed anyway in the next section. If H is a finite dimensional quasi-Hopf algebra and A is an H -bicomodule algebra, Hausser and Nill constructed a map from H ∗ to the diagonal crossed product A H ∗ , having the property that A H ∗ is generated as algebra by A and (H ∗ ). Such a map may also be constructed for the generalized diagonal crossed products. We need first the following result.


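Lemma 3.7. Let $H$ be a quasi-Hopf algebra, $\mathcal{A}$ an $H$-bimodule algebra and $\mathbb{A}$ an $H$-bicomodule algebra. Then, for all $\varphi \in \mathcal{A}$, we have

\[ \varphi \bowtie_{\delta_l} 1_{\mathbb{A}} = (1_{\mathcal{A}} \bowtie_{\delta_l} \tilde{q}^1_\rho)\bigl((\tilde{p}^1_\rho)_{[-1]} \cdot \varphi \cdot \tilde{q}^2_\rho S^{-1}(\tilde{p}^2_\rho) \bowtie_{\delta_l} (\tilde{p}^1_\rho)_{[0]}\bigr), \]

where $\tilde{p}_\rho$ and $\tilde{q}_\rho$ are given by (2.34).

Proof. We compute:

\[ (1_{\mathcal{A}} \bowtie_{\delta_l} \tilde{q}^1_\rho)\bigl((\tilde{p}^1_\rho)_{[-1]} \cdot \varphi \cdot \tilde{q}^2_\rho S^{-1}(\tilde{p}^2_\rho) \bowtie_{\delta_l} (\tilde{p}^1_\rho)_{[0]}\bigr) \overset{(3.21)}{=} (\tilde{q}^1_\rho)_{\langle 0\rangle_{[-1]}}(\tilde{p}^1_\rho)_{[-1]} \cdot \varphi \cdot \tilde{q}^2_\rho S^{-1}(\tilde{p}^2_\rho)S^{-1}((\tilde{q}^1_\rho)_{\langle 1\rangle}) \bowtie_{\delta_l} (\tilde{q}^1_\rho)_{\langle 0\rangle_{[0]}}(\tilde{p}^1_\rho)_{[0]} \overset{(2.37)}{=} \varphi \bowtie_{\delta_l} 1_{\mathbb{A}}, \]

which finishes the proof.

Proposition 3.8. Let $H$ be a quasi-Hopf algebra, $\mathcal{A}$ an $H$-bimodule algebra and $\mathbb{A}$ an $H$-bicomodule algebra. Define the map

\[ \Gamma : \mathcal{A} \to \mathcal{A} \bowtie_{\delta_l} \mathbb{A}, \qquad \Gamma(\varphi) = (\tilde{p}^1_\rho)_{[-1]} \cdot \varphi \cdot S^{-1}(\tilde{p}^2_\rho) \bowtie_{\delta_l} (\tilde{p}^1_\rho)_{[0]}, \tag{3.31} \]

for all $\varphi \in \mathcal{A}$. Then $\mathcal{A} \bowtie_{\delta_l} \mathbb{A}$ is generated as algebra by $\mathbb{A}$ and $\Gamma(\mathcal{A})$.

Proof. By the previous lemma it follows that $\varphi \bowtie_{\delta_l} 1_{\mathbb{A}} = (1_{\mathcal{A}} \bowtie_{\delta_l} \tilde{q}^1_\rho)\Gamma(\varphi \cdot \tilde{q}^2_\rho)$, for all $\varphi \in \mathcal{A}$, so for $\varphi \in \mathcal{A}$ and $u \in \mathbb{A}$ we can write $\varphi \bowtie_{\delta_l} u = (1_{\mathcal{A}} \bowtie_{\delta_l} \tilde{q}^1_\rho)\Gamma(\varphi \cdot \tilde{q}^2_\rho)(1_{\mathcal{A}} \bowtie_{\delta_l} u)$, finishing the proof.

We will see other properties of the map $\Gamma$ in subsequent sections. We prove now a sort of associativity property of generalized diagonal crossed products with respect to tensoring by an arbitrary associative algebra.

Proposition 3.9. Let $H$ be a quasi-Hopf algebra, $\mathcal{A}$ an $H$-bimodule algebra, $\mathbb{A}$ an $H$-bicomodule algebra and $C$ an associative algebra. On $\mathbb{A} \otimes C$ we have a (canonical) $H$-bicomodule algebra structure, yielding algebra isomorphisms

\[ \mathcal{A} \bowtie_{\delta_l} (\mathbb{A} \otimes C) \equiv (\mathcal{A} \bowtie_{\delta_l} \mathbb{A}) \otimes C, \tag{3.32} \]

\[ \mathcal{A} \bowtie_{\delta_r} (\mathbb{A} \otimes C) \equiv (\mathcal{A} \bowtie_{\delta_r} \mathbb{A}) \otimes C, \tag{3.33} \]

defined by the trivial identifications.

Proof. The $H$-bicomodule algebra structure on $\mathbb{A} \otimes C$ is given in such a way that everything that happens on $C$ is trivial; for instance, the right $H$-comodule algebra structure is:

\[ \rho_{\mathbb{A} \otimes C} : \mathbb{A} \otimes C \to (\mathbb{A} \otimes C) \otimes H, \quad \rho_{\mathbb{A} \otimes C}(u \otimes c) = (u_{\langle 0\rangle} \otimes c) \otimes u_{\langle 1\rangle}, \quad \forall\, u \in \mathbb{A},\ c \in C, \]

\[ (\Phi_\rho)_{\mathbb{A} \otimes C} \in (\mathbb{A} \otimes C) \otimes H \otimes H, \quad (\Phi_\rho)_{\mathbb{A} \otimes C} = (\tilde{X}^1_\rho \otimes 1_C) \otimes \tilde{X}^2_\rho \otimes \tilde{X}^3_\rho, \]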


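and one can easily check that indeed $\mathbb{A} \otimes C$ becomes an $H$-bicomodule algebra. Also, it is easy to see that the elements $\Omega$ and $\omega$ for $\mathbb{A} \otimes C$ are given by

\[ \Omega_{\mathbb{A} \otimes C} = \Omega^1 \otimes \Omega^2 \otimes (\Omega^3 \otimes 1_C) \otimes \Omega^4 \otimes \Omega^5, \qquad \omega_{\mathbb{A} \otimes C} = \omega^1 \otimes \omega^2 \otimes (\omega^3 \otimes 1_C) \otimes \omega^4 \otimes \omega^5, \]

where $\Omega = \Omega^1 \otimes \cdots \otimes \Omega^5$ and $\omega = \omega^1 \otimes \cdots \otimes \omega^5$ are the ones for $\mathbb{A}$. Using this one obtains that the multiplications in $\mathcal{A} \bowtie_{\delta_l} (\mathbb{A} \otimes C)$ and respectively $\mathcal{A} \bowtie_{\delta_r} (\mathbb{A} \otimes C)$ coincide with those in $(\mathcal{A} \bowtie_{\delta_l} \mathbb{A}) \otimes C$ respectively $(\mathcal{A} \bowtie_{\delta_r} \mathbb{A}) \otimes C$ via the trivial identifications.

4. Generalized Diagonal Crossed Products as Generalized Smash Products

Let $H$ be a quasi-Hopf algebra and $\mathbb{A}$ an $H$-bicomodule algebra. We define two left $H \otimes H^{op}$-coactions on $\mathbb{A}$, as follows: $\lambda_1, \lambda_2 : \mathbb{A} \to (H \otimes H^{op}) \otimes \mathbb{A}$,

\[ \lambda_1(u) = (u_{\langle 0\rangle_{[-1]}} \otimes S^{-1}(u_{\langle 1\rangle})) \otimes u_{\langle 0\rangle_{[0]}} := u_{(-1)} \otimes u_{(0)}, \]

\[ \lambda_2(u) = (u_{[-1]} \otimes S^{-1}(u_{[0]_{\langle 1\rangle}})) \otimes u_{[0]_{\langle 0\rangle}} := u_{(-1)} \otimes u_{(0)}, \]

for all $u \in \mathbb{A}$ (of course, in the Hopf case these two coactions coincide). If we look at the element $\Omega \in H^{\otimes 2} \otimes \mathbb{A} \otimes H^{\otimes 2}$ given by (3.15) and consider the element $(\Omega^1 \otimes \Omega^5) \otimes (\Omega^2 \otimes \Omega^4) \otimes \Omega^3$, then one can check that this element is invertible in $(H \otimes H^{op}) \otimes (H \otimes H^{op}) \otimes \mathbb{A}$, its inverse being given by

\[ (\Theta^1\tilde{X}^1_\lambda(\tilde{x}^1_\rho)_{[-1]_1} \otimes S^{-1}(\tilde{x}^3_\rho g^2)) \otimes (\Theta^2_{[-1]}\tilde{X}^2_\lambda(\tilde{x}^1_\rho)_{[-1]_2} \otimes S^{-1}(\Theta^3\tilde{x}^2_\rho g^1)) \otimes \Theta^2_{[0]}\tilde{X}^3_\lambda(\tilde{x}^1_\rho)_{[0]}, \]

where $f^{-1} = g^1 \otimes g^2$ is the element given by (2.14). We will denote by $\Phi_{\lambda_1} \in (H \otimes H^{op}) \otimes (H \otimes H^{op}) \otimes \mathbb{A}$ this inverse. Similarly, if we look at the element $\omega$ given by (3.16) and consider the element $(\omega^1 \otimes \omega^5) \otimes (\omega^2 \otimes \omega^4) \otimes \omega^3$, then one can check that this element is invertible in $(H \otimes H^{op}) \otimes (H \otimes H^{op}) \otimes \mathbb{A}$, with inverse defined by

\[ (\tilde{Y}^1_\lambda \otimes S^{-1}(\theta^3\tilde{y}^3_\rho(\tilde{Y}^3_\lambda)_{\langle 1\rangle_2} g^2)) \otimes (\theta^1\tilde{Y}^2_\lambda \otimes S^{-1}(\theta^2_{\langle 1\rangle}\tilde{y}^2_\rho(\tilde{Y}^3_\lambda)_{\langle 1\rangle_1} g^1)) \otimes \theta^2_{\langle 0\rangle}\tilde{y}^1_\rho(\tilde{Y}^3_\lambda)_{\langle 0\rangle}. \]

We will denote by $\Phi_{\lambda_2} \in (H \otimes H^{op}) \otimes (H \otimes H^{op}) \otimes \mathbb{A}$ this inverse. The next proposition generalizes the corresponding result obtained for Hopf algebras in [8].

Proposition 4.1. With notation as above, $(\mathbb{A}, \lambda_1, \Phi_{\lambda_1})$ and respectively $(\mathbb{A}, \lambda_2, \Phi_{\lambda_2})$ are left $H \otimes H^{op}$-comodule algebras, denoted by $\mathbb{A}_1$ respectively $\mathbb{A}_2$.

Proof. It is easy to see that $\lambda_1$ and $\lambda_2$ are algebra maps, and also that the conditions (2.32) and (2.33) in the definition of a left comodule algebra are satisfied. Then the conditions (2.30) and (2.31) for $(\mathbb{A}, \lambda_1, \Phi_{\lambda_1})$ (respectively for $(\mathbb{A}, \lambda_2, \Phi_{\lambda_2})$) to be a left $H \otimes H^{op}$-comodule algebra are equivalent to the relations (3.17) and (3.18) fulfilled by $\Omega$ (respectively to the relations (3.19) and (3.20) fulfilled by $\omega$).

We are now able to express the (generalized) diagonal crossed products over $H$ as some generalized smash products over $H \otimes H^{op}$.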
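The role of $H^{op}$ in the second tensor factor can be seen from a short check that $\lambda_1$ is multiplicative. The sketch below uses only that $\rho$ and $\lambda$ are algebra maps and that $S^{-1}$ is an anti-algebra map; the symbol $\cdot_{op}$ for the multiplication of $H^{op}$ is notation introduced here for illustration.

```latex
% \lambda_1(u) = (u_{<0>[-1]} \otimes S^{-1}(u_{<1>})) \otimes u_{<0>[0]};
% a \cdot_{op} b := ba denotes the product of H^{op} (illustrative notation).
\begin{align*}
\lambda_1(uu')
 &= \bigl((uu')_{\langle 0\rangle_{[-1]}}\otimes S^{-1}((uu')_{\langle 1\rangle})\bigr)
    \otimes (uu')_{\langle 0\rangle_{[0]}}\\
 &= \bigl(u_{\langle 0\rangle_{[-1]}}u'_{\langle 0\rangle_{[-1]}}
    \otimes S^{-1}(u'_{\langle 1\rangle})\,S^{-1}(u_{\langle 1\rangle})\bigr)
    \otimes u_{\langle 0\rangle_{[0]}}u'_{\langle 0\rangle_{[0]}}\\
 &= \bigl(u_{\langle 0\rangle_{[-1]}}u'_{\langle 0\rangle_{[-1]}}
    \otimes S^{-1}(u_{\langle 1\rangle})\cdot_{op}S^{-1}(u'_{\langle 1\rangle})\bigr)
    \otimes u_{\langle 0\rangle_{[0]}}u'_{\langle 0\rangle_{[0]}}\\
 &= \lambda_1(u)\,\lambda_1(u').
\end{align*}
```

The same argument applies to $\lambda_2$; with $H$ in place of $H^{op}$ the order of the two $S^{-1}$ factors would be wrong.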


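Proposition 4.2. Let $H$ be a quasi-Hopf algebra, $\mathcal{A}$ an $H$-bimodule algebra and $\mathbb{A}$ an $H$-bicomodule algebra. View $\mathcal{A}$ as a left $H \otimes H^{op}$-module algebra with action $(h \otimes h') \cdot \varphi = h \cdot \varphi \cdot h'$ for all $h, h' \in H$ and $\varphi \in \mathcal{A}$, and consider the two left $H \otimes H^{op}$-comodule algebras $\mathbb{A}_1$ and $\mathbb{A}_2$ obtained from $\mathbb{A}$ as above. Then we have algebra isomorphisms

\[ \mathcal{A} \bowtie_{\delta_l} \mathbb{A} \equiv \mathcal{A} \mathbin{\blacktriangleright\!\!<} \mathbb{A}_1, \qquad \mathcal{A} \bowtie_{\delta_r} \mathbb{A} \equiv \mathcal{A} \mathbin{\blacktriangleright\!\!<} \mathbb{A}_2, \]

given by the trivial identifications.

Proof. We treat $\mathbb{A}_1$, the case of $\mathbb{A}_2$ being similar. Since $\Phi_{\lambda_1}^{-1} = (\Omega^1 \otimes \Omega^5) \otimes (\Omega^2 \otimes \Omega^4) \otimes \Omega^3$, the multiplication of $\mathcal{A} \mathbin{\blacktriangleright\!\!<} \mathbb{A}_1$ is

\[ (\varphi \mathbin{\blacktriangleright\!\!<} u)(\varphi' \mathbin{\blacktriangleright\!\!<} u') = ((\Omega^1 \otimes \Omega^5) \cdot \varphi)((\Omega^2 \otimes \Omega^4)(u_{\langle 0\rangle_{[-1]}} \otimes S^{-1}(u_{\langle 1\rangle})) \cdot \varphi') \mathbin{\blacktriangleright\!\!<} \Omega^3 u_{\langle 0\rangle_{[0]}} u' = (\Omega^1 \cdot \varphi \cdot \Omega^5)(\Omega^2 u_{\langle 0\rangle_{[-1]}} \cdot \varphi' \cdot S^{-1}(u_{\langle 1\rangle})\Omega^4) \mathbin{\blacktriangleright\!\!<} \Omega^3 u_{\langle 0\rangle_{[0]}} u', \]

and via the trivial identification this is exactly the multiplication of $\mathcal{A} \bowtie_{\delta_l} \mathbb{A}$.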

Recall that, for a finite dimensional quasi-Hopf algebra $H$, the quantum double $D(H)$ was first introduced by Majid in [18] by an implicit Tannaka-Krein reconstruction procedure, and more explicit descriptions were obtained afterwards by Hausser and Nill in [12, 13]. Actually, Hausser and Nill provided four explicit realizations of $D(H)$, two built on $H^* \otimes H$ and two on $H \otimes H^*$; all are, as algebras, diagonal crossed products, namely the two realizations built on $H^* \otimes H$ coincide with $H^* \bowtie_{\delta_l} H$ and $H^* \bowtie_{\delta_r} H$ and the two built on $H \otimes H^*$ coincide with $H \bowtie_{\delta_l} H^*$ and $H \bowtie_{\delta_r} H^*$. On the other hand, it was proved in [8] that the Drinfeld double of a finite dimensional Hopf algebra may be written as a generalized smash product. As a corollary to the previous proposition, we obtain a generalization of this result for quasi-Hopf algebras.

Corollary 4.3. If $H$ is a finite dimensional quasi-Hopf algebra, then the quantum double $D(H)$ may be written as a generalized smash product.

Proof. In the previous proposition take $\mathcal{A} = H^*$, $\mathbb{A} = H$ and use the fact that $H^* \bowtie_{\delta_l} H$ and $H^* \bowtie_{\delta_r} H$ are realizations for $D(H)$. Let us also record the fact that the two left $H \otimes H^{op}$-comodule algebra structures on $H$ are defined as follows: $\lambda_1, \lambda_2 : H \to (H \otimes H^{op}) \otimes H$,

\[ \lambda_1(h) = (h_{(1,1)} \otimes S^{-1}(h_2)) \otimes h_{(1,2)}, \qquad \lambda_2(h) = (h_1 \otimes S^{-1}(h_{(2,2)})) \otimes h_{(2,1)}, \]

for all $h \in H$, and $\Phi_{\lambda_1}, \Phi_{\lambda_2} \in (H \otimes H^{op}) \otimes (H \otimes H^{op}) \otimes H$,

\[ \Phi_{\lambda_1} = (Y^1 X^1 x^1_{(1,1)} \otimes S^{-1}(x^3 g^2)) \otimes (Y^2_1 X^2 x^1_{(1,2)} \otimes S^{-1}(Y^3 x^2 g^1)) \otimes Y^2_2 X^3 x^1_2, \]

\[ \Phi_{\lambda_2} = (Y^1 \otimes S^{-1}(x^3 y^3 Y^3_{(2,2)} g^2)) \otimes (x^1 Y^2 \otimes S^{-1}(x^2_2 y^2 Y^3_{(2,1)} g^1)) \otimes x^2_1 y^1 Y^3_1, \]

where $f^{-1} = g^1 \otimes g^2$ is the element given by (2.14).


Let again H be a quasi-Hopf algebra, A an H -bimodule algebra and A an H -bicomodule algebra. We intend to prove that the two generalized left diagonal crossed products A A and A A are isomorphic as algebras, using their description as generalized smash products. First we need a result on generalized smash products. Namely, let H be a quasi-bialgebra, A a left H -module algebra, B a left H -comodule algebra and U ∈ H ⊗ B an invertible element such that (ε ⊗ idB)(U ) = 1B. If we define a map λ : B → H ⊗ B, λ (b) = U λ(b)U −1 , then, by [12], this is a new left H -comodule algebra structure on B, with λ = (1 H ⊗ U )(id H ⊗ λ)(U )λ ( ⊗ idB)(U −1 ), which will be denoted by B (and we will say that B and B are “twist equivalent”). We then may consider the generalized smash products A
(4.1)

2 2 11 1 θ 1 ⊗ S −1 (θ 3 )5 S −1 ( 3 )1 ⊗ 12 2 θ 0 ⊗ S −1 (θ 1 )4 S −1 ( 3 )2 [−1] 2 ⊗ 2 3 θ 0 = ω1 ⊗ ω5 ⊗ ω2 1 ⊗ S −1 ( 3 )ω4 ⊗ ω3 2 . [0]

(4.2)


Proof. The relation (4.1) follows by applying (2.11), (2.45) and (2.44), we leave the details to the reader. We prove now (4.2). We compute: 11 1 θ 1 ⊗ S −1 (θ 3 )5 S −1 ( 3 )1 2 2 )4 S −1 ( 3 ) ⊗ 2 3 θ 2 ⊗ 12 2 θ 0 ⊗ S −1 (θ 1 2

0[0] [−1] (4.1) 1 1 1 = 1 x˜ λ θ

2 ⊗ S −1 ( f 2 X˜ ρ3 3 θ 3 ) ⊗ 12 x˜λ2 θ 0 [−1] 1

2 )⊗ X ˜ ρ1 2 (x˜ 3 ) 0 θ 2 ⊗S −1 ( f 1 X˜ ρ2 2 1 (x˜λ3 ) 1 θ 1

0 λ

0[0] 3

(2.44) 1 ˜ 1 1 = x˜ λ θ

2

˜ 3 θ 3 ) ⊗ x˜ 2 1 ˜ 2 θ2 ⊗ S −1 ( f 2 X˜ ρ3 (x˜λ3 ) 1 3 λ [−1]

0[−1] 1

2 ˜ 2 θ 2 ) ⊗ X˜ ρ1 (x˜ 3 ) 0 0 2 ˜2 ⊗S −1 ( f 1 X˜ ρ2 (x˜λ3 ) 0 1 2 1 λ [0] 1

1

0 [0] 0 θ 0[0] 3

(2.43) 1 = x˜ λ

2

1 ⊗ S −1 ( f 2 X˜ ρ3 (x˜λ3 ) 1 3 ) ⊗ x˜λ2 1

⊗S −1 ( f 1 X˜ ρ2 (x˜λ3 ) 0 1 2 1 ) ⊗ X˜ ρ1 (x˜λ3 ) 0 0 2 0 3

(2.26) 1 = x˜ λ

⊗ S −1 ( f 2 (x˜λ3 ) 12 X˜ ρ3 3 ) ⊗ x˜λ2 1

1

⊗S −1 ( f 1 (x˜λ3 ) 11 X˜ ρ2 2 1 ) ⊗ (x˜λ3 ) 0 X˜ ρ1 2 0 3

(3.16) 1 = ω

2

1

3

2

2

⊗ ω5 ⊗ ω2 ⊗ S −1 ( )ω4 ⊗ ω3 ,

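as required.

Proposition 4.6. Let $H$ be a quasi-Hopf algebra and $\mathbb{A}$ an $H$-bicomodule algebra. Then the left $H \otimes H^{op}$-comodule algebras $\mathbb{A}_1$ and $\mathbb{A}_2$ are twist equivalent. More exactly, for the element $U \in (H \otimes H^{op}) \otimes \mathbb{A}$ given by $U = (\Theta^1 \otimes S^{-1}(\Theta^3)) \otimes \Theta^2$, we have that

\[ \lambda_2(u) = U\lambda_1(u)U^{-1}, \ \forall\, u \in \mathbb{A}, \qquad \Phi_{\lambda_2} = (1 \otimes U)(\mathrm{id} \otimes \lambda_1)(U)\Phi_{\lambda_1}(\Delta \otimes \mathrm{id})(U^{-1}). \]

Proof. The first relation follows immediately from (2.43), and the second is equivalent to the relation (4.2) proved in the previous lemma.

As a consequence of these results and (3.30), we obtain:

Corollary 4.7. Let $H$ be a quasi-Hopf algebra, $\mathcal{A}$ an $H$-bimodule algebra and $\mathbb{A}$ an $H$-bicomodule algebra. Then the two generalized left (right) diagonal crossed products $\mathcal{A} \bowtie_{\delta_l} \mathbb{A}$ and $\mathcal{A} \bowtie_{\delta_r} \mathbb{A}$ ($\mathbb{A} \bowtie_{\delta_l} \mathcal{A}$ and $\mathbb{A} \bowtie_{\delta_r} \mathcal{A}$, respectively) are isomorphic as algebras, and moreover they are equivalent extensions of $\mathbb{A}$.

Remark 4.8. Let $H$ be a quasi-Hopf algebra, $\mathcal{A}$ an $H$-bimodule algebra and $\mathbb{A}$ an $H$-bicomodule algebra with $\Phi_{\lambda,\rho} = 1_H \otimes 1_{\mathbb{A}} \otimes 1_H$. Then, by (2.43) it follows that $(\lambda \otimes \mathrm{id})\rho = (\mathrm{id} \otimes \rho)\lambda$, and by (4.2) it follows that $\Omega = \omega$. So, in this case $\mathcal{A} \bowtie_{\delta_l} \mathbb{A}$ and $\mathcal{A} \bowtie_{\delta_r} \mathbb{A}$ are not only isomorphic, but actually coincide, and $\mathbb{A}_1$ and $\mathbb{A}_2$ also coincide. An example of such an $\mathbb{A}$ is the tensor product $\mathfrak{A} \otimes \mathfrak{B}$, where $\mathfrak{A}$ is a right comodule algebra and $\mathfrak{B}$ is a left comodule algebra, see [12]. We will encounter another example in a subsequent section.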
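The first relation of Proposition 4.6 can be unpacked as follows. This is a sketch under the assumption that (2.43) is the usual bicomodule compatibility $\Phi_{\lambda,\rho}(\lambda \otimes \mathrm{id})(\rho(u)) = (\mathrm{id} \otimes \rho)(\lambda(u))\Phi_{\lambda,\rho}$ with $\Phi_{\lambda,\rho} = \Theta^1 \otimes \Theta^2 \otimes \Theta^3$; the symbol $\cdot_{op}$ for the product of $H^{op}$ is illustrative notation.

```latex
% Assumed form of (2.43), written on components:
\[
\Theta^1 u_{\langle 0\rangle_{[-1]}}\otimes\Theta^2 u_{\langle 0\rangle_{[0]}}
 \otimes\Theta^3 u_{\langle 1\rangle}
 = u_{[-1]}\Theta^1\otimes u_{[0]_{\langle 0\rangle}}\Theta^2
 \otimes u_{[0]_{\langle 1\rangle}}\Theta^3 .
\]
% With U = (\Theta^1 \otimes S^{-1}(\Theta^3)) \otimes \Theta^2 and a \cdot_{op} b := ba:
\begin{align*}
U\lambda_1(u)
 &= \bigl(\Theta^1 u_{\langle 0\rangle_{[-1]}}
    \otimes S^{-1}(\Theta^3)\cdot_{op}S^{-1}(u_{\langle 1\rangle})\bigr)
    \otimes\Theta^2 u_{\langle 0\rangle_{[0]}}\\
 &= \bigl(\Theta^1 u_{\langle 0\rangle_{[-1]}}
    \otimes S^{-1}(\Theta^3 u_{\langle 1\rangle})\bigr)
    \otimes\Theta^2 u_{\langle 0\rangle_{[0]}}\\
 &= \bigl(u_{[-1]}\Theta^1
    \otimes S^{-1}(u_{[0]_{\langle 1\rangle}}\Theta^3)\bigr)
    \otimes u_{[0]_{\langle 0\rangle}}\Theta^2\\
 &= \bigl(u_{[-1]}\Theta^1
    \otimes S^{-1}(u_{[0]_{\langle 1\rangle}})\cdot_{op}S^{-1}(\Theta^3)\bigr)
    \otimes u_{[0]_{\langle 0\rangle}}\Theta^2
  = \lambda_2(u)\,U .
\end{align*}
```

Multiplying on the right by $U^{-1}$ gives $\lambda_2(u) = U\lambda_1(u)U^{-1}$.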


We end this section by showing that the left generalized diagonal crossed products are isomorphic, as algebras, to the right generalized diagonal crossed products. Let H be a quasi-Hopf algebra, A a unital associative algebra and (δ, ) a two-sided coaction of H on A. We associate to (δ, ) the elements pδ , qδ ∈ H ⊗A⊗ H as follows: pδ = pδ1 ⊗ pδ2 ⊗ pδ3 = 2 S −1 ( 1 β) ⊗ 3 ⊗ 4 β S( 5 ), qδ =

qδ1

⊗ qδ2

⊗ qδ3

1

2

3

= S( )α ⊗ ⊗

5 4 S −1 (α ) .

(4.3) (4.4)

By [12] we have the following relations, for all u ∈ A: pδ (1 H ⊗ u ⊗ 1 H ) = δ(u (0) ) pδ [S −1 (u (−1) ) ⊗ 1A ⊗ S(u (1) )],

(4.5)

(1 H ⊗ u ⊗ 1 H )qδ = [S(u (−1) ) ⊗ 1A ⊗ S −1 (u (1) )]qδ δ(u (0) ),

(4.6)

2

1

5

4

[S( ) f 1 ⊗ S( ) f 2 ⊗ 1A ⊗ S −1 (F 2 ) ⊗ S −1 (F 1 )] 3

×( ⊗ idA ⊗ )(qδ δ( )) = [1 H ⊗ qδ ⊗ 1 H ](id H ⊗ δ ⊗ id H )(qδ ), (4.7)

δ(qδ2 ) pδ [S −1 (qδ1 ) ⊗ 1A ⊗ S(qδ3 )] = 1 H ⊗ 1A ⊗ 1 H ,

(4.8)

[S( pδ1 ) ⊗ 1A

(4.9)

⊗

S −1 ( pδ3 )]qδ δ( pδ2 )

= 1 H ⊗ 1A ⊗ 1 H ,

where f = f 1 ⊗ f 2 = F 1 ⊗ F 2 is the Drinfeld twist defined in (2.13). Moreover, the definitions of qδ and of a two-sided coaction imply qδ1 1 ⊗ (qδ2 )(−1) 2 ⊗ (qδ2 )(0) 3 ⊗ (qδ2 )(1) 4 ⊗ qδ3 5 1

2

2

3

4

5

4

= S( )q L1 1 ⊗ q L2 2 ⊗ ⊗ q R1 1 ⊗ S −1 ( )q R2 2 ,

(4.10)

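where $q_L = q^1_L \otimes q^2_L := S(x^1)\alpha x^2 \otimes x^3$ and $q_R = q^1_R \otimes q^2_R := X^1 \otimes S^{-1}(\alpha X^3)X^2$. Finally, we need the formulae

\[ (S(h_1) \otimes 1_H)\, q_L\, \Delta(h_2) = (1_H \otimes h)\, q_L, \tag{4.11} \]

\[ (1_H \otimes S^{-1}(h_2))\, q_R\, \Delta(h_1) = (h \otimes 1_H)\, q_R, \tag{4.12} \]

for all $h \in H$, which have been established in [12].

Proposition 4.9. Let $H$ be a quasi-Hopf algebra, $(\delta, \Psi)$ a two-sided coaction of $H$ on an associative unital algebra $\mathbb{A}$, and $\mathcal{A}$ an $H$-bimodule algebra. Then the map $\vartheta : \mathcal{A} \bowtie_\delta \mathbb{A} \to \mathbb{A} \bowtie_\delta \mathcal{A}$ defined for all $\varphi \in \mathcal{A}$ and $u \in \mathbb{A}$ by

\[ \vartheta(\varphi \bowtie_\delta u) = q^2_\delta u_{(0)} \bowtie_\delta S^{-1}(q^1_\delta u_{(-1)}) \cdot \varphi \cdot q^3_\delta u_{(1)} \]

is an algebra isomorphism. In particular, if $\mathbb{A}$ is an $H$-bicomodule algebra then all four generalized diagonal crossed products $\mathcal{A} \bowtie_{\delta_l} \mathbb{A}$, $\mathcal{A} \bowtie_{\delta_r} \mathbb{A}$, $\mathbb{A} \bowtie_{\delta_l} \mathcal{A}$ and $\mathbb{A} \bowtie_{\delta_r} \mathcal{A}$ are isomorphic as unital algebras.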
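To see what $\vartheta$ does in the simplest situation, the following sketch specializes to an ordinary Hopf algebra with bijective antipode. It assumes that in this case the elements $q_\delta$ and $p_\delta$ are trivial (as (4.3) and (4.4) suggest when $\Psi$, $\alpha$ and $\beta$ are units) and that the two-sided coaction is coassociative, so that $\vartheta(\varphi \bowtie u) = u_{(0)} \bowtie S^{-1}(u_{(-1)}) \cdot \varphi \cdot u_{(1)}$ with candidate inverse $\vartheta^{-1}(u \bowtie \varphi) = u_{(-1)} \cdot \varphi \cdot S^{-1}(u_{(1)}) \bowtie u_{(0)}$.

```latex
% Hopf-case sketch (assumptions: q_\delta = p_\delta = 1\otimes 1\otimes 1,
% coassociative two-sided coaction). Facts used:
%   h_2 S^{-1}(h_1) = \varepsilon(h) 1_H, and
%   u_{(-1)}\otimes u_{(0,-1)}\otimes u_{(0,0)}\otimes u_{(0,1)}\otimes u_{(1)}
%     = u_{(-1)_1}\otimes u_{(-1)_2}\otimes u_{(0)}\otimes u_{(1)_1}\otimes u_{(1)_2}.
\begin{align*}
\vartheta^{-1}\bigl(\vartheta(\varphi\bowtie u)\bigr)
 &= \vartheta^{-1}\bigl(u_{(0)}\bowtie S^{-1}(u_{(-1)})\cdot\varphi\cdot u_{(1)}\bigr)\\
 &= u_{(0,-1)}S^{-1}(u_{(-1)})\cdot\varphi\cdot u_{(1)}S^{-1}(u_{(0,1)})
    \bowtie u_{(0,0)}\\
 &= u_{(-1)_2}S^{-1}(u_{(-1)_1})\cdot\varphi\cdot u_{(1)_2}S^{-1}(u_{(1)_1})
    \bowtie u_{(0)}\\
 &= \varepsilon(u_{(-1)})\,\varepsilon(u_{(1)})\,\varphi\bowtie u_{(0)}
  = \varphi\bowtie u .
\end{align*}
```

In the quasi-Hopf case the same cancellation is carried out by the relations (4.5)–(4.9) for $p_\delta$ and $q_\delta$.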


Proof. We show that ϑ is multiplicative. For any ϕ, ϕ ∈ A and u, u ∈ A we have: ϑ((ϕ δ u)(ϕ δ u )) (3.9,3.7) = (2.50,2.11) =

ϑ(( · ϕ · S −1 ( f 2 ))( u (−1) · ϕ · S −1 ( f 1 u (1) )) δ u (0) u ) 1

5

2

4

qδ2 (0) u (0,0) u (0) δ (S −1 (F 2 (qδ1 )2 (−1)2 u (0,−1)2 u (−1)2 g 2 ) 3

3

3

1

·ϕ · S −1 ( f 2 )(qδ3 )1 (1)1 u (0,1)1 u (1)1 )(S −1 (F 1 (qδ1 )1 (−1)1 u (0,−1)1 u (−1)1 g 1 ) 5

3

3

× u (−1) · ϕ · S −1 ( f 1 u (1) )(qδ3 )2 (1)2 u (0,1)2 u (1)2 2

4

3

(4.7) qδ2 (Q 2δ )(0) 3 u (0,0) u (0) δ (S −1 (qδ1 (Q 2δ )(−1) 2 u (0,−1)2 u (−1)2 g 2 ) · ϕ = ·qδ3 (Q 2δ )(1) 4 u (0,1)1 u (1)1 )(S −1 (Q 1δ 1 u (0,−1)1 u (−1)1 g 1 )u (−1) · ϕ ·S −1 (u (1) )Q 3δ 5 u (0,1)2 u (1)2 ) 3 2 (4.10,3.2) qδ2 u (0) u (0) δ (S −1 (qδ1 q L2 u (−1)(2,2) 2 u (−1)2 g 2 ) = 4 2 1 ·ϕ · qδ3 q R1 u (1)(1,1) 1 u (1)1 )(S −1 (q L1 u (−1)(2,1) 1 u (−1)1 g 1 )u (−1)1 · ϕ 5 4 ·S −1 (u (1)2 )q R2 u (1)(1,2) 2 u (1)2 ) (4.11,4.12,4.10) 2 qδ u (0) (Q 2δ )(0) 3 u (0) δ (S −1 (qδ1 u (−1) (Q 2δ )(−1) 2 u (−1)2 g 2 ) · ϕ = ·qδ3 u (1) (Q 2δ )(1) 4 u (1)1 )(S −1 (Q 1δ 1 u (−1)1 g 1 ) · ϕ · Q 3δ 5 u (1)2 ) (3.1,3.8) qδ2 u (0) (Q 2δ )(0) u (0,0) 3 δ ( 2 S −1 (qδ1 u (−1) (Q 2δ )(−1) u (0,−1) ) · ϕ = ·qδ3 u (1) (Q 2δ )(1) u (0,1) 4 )( 1 S −1 (Q 1δ u (−1) ) · ϕ · Q 3δ u (1) 5 ) (3.10) (qδ2 u (0) δ S −1 (qδ1 u (−1) ) · ϕ · qδ3 u (1) ) = ×(Q 2δ u (0) δ S −1 (Q 1δ u (−1) ) · ϕ · Q 3δ u (1) ) = ϑ(ϕ δ u)ϑ(ϕ δ u ),

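as needed. (We denoted by $Q^1_\delta \otimes Q^2_\delta \otimes Q^3_\delta$ another copy of $q_\delta$ and by $F^1 \otimes F^2$ another copy of $f$.) It is easy to see that the unit and counit properties imply $\vartheta(1_{\mathcal{A}} \bowtie_\delta 1_{\mathbb{A}}) = 1_{\mathbb{A}} \bowtie_\delta 1_{\mathcal{A}}$, so it remains to show that $\vartheta$ is bijective. To this end, define $\vartheta^{-1} : \mathbb{A} \bowtie_\delta \mathcal{A} \to \mathcal{A} \bowtie_\delta \mathbb{A}$ given for all $u \in \mathbb{A}$ and $\varphi \in \mathcal{A}$ by

\[ \vartheta^{-1}(u \bowtie_\delta \varphi) = u_{(-1)} p^1_\delta \cdot \varphi \cdot S^{-1}(u_{(1)} p^3_\delta) \bowtie_\delta u_{(0)} p^2_\delta, \]

where $p_\delta = p^1_\delta \otimes p^2_\delta \otimes p^3_\delta$ is the element defined in (4.3). We claim that $\vartheta$ and $\vartheta^{-1}$ are inverses. Indeed, $\vartheta \circ \vartheta^{-1} = \mathrm{id}_{\mathbb{A} \bowtie_\delta \mathcal{A}}$ because of (4.6) and (4.9), and $\vartheta^{-1} \circ \vartheta = \mathrm{id}_{\mathcal{A} \bowtie_\delta \mathbb{A}}$ because of (4.5) and (4.8) (we leave the verification of the details to the reader).

5. Generalized Two-sided Crossed Product and Two-sided Generalized Smash Product

Let $H$ be a finite dimensional quasi-bialgebra and $(\mathfrak{A}, \rho, \Phi_\rho)$, $(\mathfrak{B}, \lambda, \Phi_\lambda)$ a right and a left $H$-comodule algebra, respectively. As in the case of a bialgebra, the right $H$-coaction $(\rho, \Phi_\rho)$ on $\mathfrak{A}$ induces a left $H^*$-action $\rightharpoonup : H^* \otimes \mathfrak{A} \to \mathfrak{A}$ defined by

\[ \varphi \rightharpoonup a = \varphi(a_{\langle 1\rangle})\, a_{\langle 0\rangle}, \tag{5.1} \]

for all $\varphi \in H^*$ and $a \in \mathfrak{A}$, where $\rho(a) = a_{\langle 0\rangle} \otimes a_{\langle 1\rangle}$ for all $a \in \mathfrak{A}$. Similarly, the left $H$-coaction $(\lambda, \Phi_\lambda)$ on $\mathfrak{B}$ provides a right $H^*$-action $\leftharpoonup : \mathfrak{B} \otimes H^* \to \mathfrak{B}$ defined by

\[ b \leftharpoonup \varphi = \varphi(b_{[-1]})\, b_{[0]}, \tag{5.2} \]

for all $\varphi \in H^*$ and $b \in \mathfrak{B}$, where we now denote $\lambda(b) = b_{[-1]} \otimes b_{[0]}$ for all $b \in \mathfrak{B}$. Following [12, Prop. 11.4 (ii)] we define an algebra structure on the $k$-vector space $\mathfrak{A} \otimes H^* \otimes \mathfrak{B}$. This algebra is denoted by $\mathfrak{A} \mathbin{>\!\!\!\lhd} H^* \mathbin{\rhd\!\!\!<} \mathfrak{B}$ and its multiplication is given by

\[ (a \mathbin{>\!\!\!\lhd} \varphi \mathbin{\rhd\!\!\!<} b)(a' \mathbin{>\!\!\!\lhd} \varphi' \mathbin{\rhd\!\!\!<} b') = a(\varphi_1 \rightharpoonup a')\tilde{x}^1_\rho \mathbin{>\!\!\!\lhd} (\tilde{x}^1_\lambda \rightharpoonup \varphi_2 \leftharpoonup \tilde{x}^2_\rho)(\tilde{x}^2_\lambda \rightharpoonup \varphi'_1 \leftharpoonup \tilde{x}^3_\rho) \mathbin{\rhd\!\!\!<} \tilde{x}^3_\lambda (b \leftharpoonup \varphi'_2) b', \tag{5.3} \]

for all $a, a' \in \mathfrak{A}$, $b, b' \in \mathfrak{B}$ and $\varphi, \varphi' \in H^*$, where we write $a \mathbin{>\!\!\!\lhd} \varphi \mathbin{\rhd\!\!\!<} b$ for $a \otimes \varphi \otimes b$ when viewed as an element of $\mathfrak{A} \mathbin{>\!\!\!\lhd} H^* \mathbin{\rhd\!\!\!<} \mathfrak{B}$. The unit of the algebra $\mathfrak{A} \mathbin{>\!\!\!\lhd} H^* \mathbin{\rhd\!\!\!<} \mathfrak{B}$ is $1_{\mathfrak{A}} \mathbin{>\!\!\!\lhd} \varepsilon \mathbin{\rhd\!\!\!<} 1_{\mathfrak{B}}$. Hausser and Nill [12] called this algebra the two-sided crossed product. They proved that $\mathfrak{A} \otimes \mathfrak{B}$ is an $H$-bicomodule algebra (here $\Phi_{\lambda,\rho}$ is trivial) and the diagonal crossed product $(\mathfrak{A} \otimes \mathfrak{B}) \bowtie_\delta H^*$ is isomorphic, as an algebra, to the two-sided crossed product $\mathfrak{A} \mathbin{>\!\!\!\lhd} H^* \mathbin{\rhd\!\!\!<} \mathfrak{B}$.

This construction admits a slight generalization, as follows. Let $H$ be a quasi-bialgebra, $\mathfrak{A}$ a right $H$-comodule algebra, $\mathfrak{B}$ a left $H$-comodule algebra and $\mathcal{A}$ an $H$-bimodule algebra. On $\mathfrak{A} \otimes \mathcal{A} \otimes \mathfrak{B}$ define a multiplication by

\[ (a \mathbin{>\!\!\!\lhd} \varphi \mathbin{\rhd\!\!\!<} b)(a' \mathbin{>\!\!\!\lhd} \varphi' \mathbin{\rhd\!\!\!<} b') = a a'_{\langle 0\rangle}\tilde{x}^1_\rho \mathbin{>\!\!\!\lhd} (\tilde{x}^1_\lambda \cdot \varphi \cdot a'_{\langle 1\rangle}\tilde{x}^2_\rho)(\tilde{x}^2_\lambda b_{[-1]} \cdot \varphi' \cdot \tilde{x}^3_\rho) \mathbin{\rhd\!\!\!<} \tilde{x}^3_\lambda b_{[0]} b', \tag{5.4} \]

for all $a, a' \in \mathfrak{A}$, $b, b' \in \mathfrak{B}$ and $\varphi, \varphi' \in \mathcal{A}$, where we write $a \mathbin{>\!\!\!\lhd} \varphi \mathbin{\rhd\!\!\!<} b$ for $a \otimes \varphi \otimes b$. This gives an algebra with unit $1_{\mathfrak{A}} \mathbin{>\!\!\!\lhd} 1_{\mathcal{A}} \mathbin{\rhd\!\!\!<} 1_{\mathfrak{B}}$, denoted by $\mathfrak{A} \mathbin{>\!\!\!\lhd} \mathcal{A} \mathbin{\rhd\!\!\!<} \mathfrak{B}$ and called the generalized two-sided crossed product; for $\mathcal{A} = H^*$ it is the two-sided crossed product $\mathfrak{A} \mathbin{>\!\!\!\lhd} H^* \mathbin{\rhd\!\!\!<} \mathfrak{B}$ of Hausser and Nill.

We construct now a different kind of two-sided product, using “dual” objects, that is by replacing comodule algebras by module algebras and the bimodule algebra by a bicomodule algebra.

Proposition 5.1. Let $H$ be a quasi-bialgebra, $A$ a left $H$-module algebra, $B$ a right $H$-module algebra and $\mathbb{A}$ an $H$-bicomodule algebra. If we define on $A \otimes \mathbb{A} \otimes B$ a multiplication by

\[ (a \mathbin{\blacktriangleright\!\!<} u \mathbin{>\!\!\blacktriangleleft} b)(a' \mathbin{\blacktriangleright\!\!<} u' \mathbin{>\!\!\blacktriangleleft} b') = (\tilde{x}^1_\lambda \cdot a)(\tilde{x}^2_\lambda u_{[-1]}\theta^1 \cdot a') \mathbin{\blacktriangleright\!\!<} \tilde{x}^3_\lambda u_{[0]}\theta^2 u'_{\langle 0\rangle}\tilde{x}^1_\rho \mathbin{>\!\!\blacktriangleleft} (b \cdot \theta^3 u'_{\langle 1\rangle}\tilde{x}^2_\rho)(b' \cdot \tilde{x}^3_\rho), \tag{5.5} \]

for all $a, a' \in A$, $u, u' \in \mathbb{A}$ and $b, b' \in B$ (where we write $a \mathbin{\blacktriangleright\!\!<} u \mathbin{>\!\!\blacktriangleleft} b$ for $a \otimes u \otimes b$), and we denote this structure on $A \otimes \mathbb{A} \otimes B$ by $A \mathbin{\blacktriangleright\!\!<} \mathbb{A} \mathbin{>\!\!\blacktriangleleft} B$, then $A \mathbin{\blacktriangleright\!\!<} \mathbb{A} \mathbin{>\!\!\blacktriangleleft} B$ is an associative algebra with unit $1_A \mathbin{\blacktriangleright\!\!<} 1_{\mathbb{A}} \mathbin{>\!\!\blacktriangleleft} 1_B$.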


Proof. For all a, a , a ∈ A, u, u , u ∈ A and b, b , b ∈ B we compute: [(a b)(a b )](a b ) (5.5) =

2 { y˜ 1λ · [(x˜ 1λ · a)(x˜ 2λ u [−1] θ 1 · a )]}[ y˜ 2λ (x˜ 3λ )[−1] u [0,−1] θ[−1]

2 ×u 0[−1] (x˜ 1ρ )[−1] θ · a ]< y˜ 3λ (x˜ 3λ )[0] u [0,0] θ[0] u 0[0] (x˜ 1ρ )[0] θ 1

2

×u 0 y˜ 1ρ > {[(b · θ 3 u 1 x˜ 2ρ )(b · x˜ 3ρ )] · θ u 1 y˜ 2ρ }(b · y˜ 3ρ ) 3

(2.18,2.23,2.17,2.22) [(X 1 ( y˜ 1λ )1 x˜ 1λ · a]{[X 2 ( y˜ 1λ )2 x˜ 2λ u [−1] θ 1 · a ][X 3 y˜ 2λ (x˜ 3λ )[−1] u [0,−1] = 1 2 2 2 ×θ[−1] u 0[−1] (x˜ 1ρ )[−1] θ · a ]}< y˜ 3λ (x˜ 3λ )[0] u [0,0] θ[0] u 0[0] (x˜ 1ρ )[0] θ u 0 y˜ 1ρ 3 3 > [b · θ 3 u 1 x˜ 2ρ θ 1 u 11 ( y˜ 2ρ )1 x 1 ]{[(b · x˜ 3ρ θ 2 u 12 ( y˜ 2ρ )2 x 2 ](b · y˜ 3ρ x 3 )} (2.31) 2 ( y˜ 1λ · a){[( y˜ 2λ )1 x˜ 1λ u [−1] θ 1 · a ][( y˜ 2λ )2 x˜ 2λ u [0,−1] θ[−1] u 0[−1] = 2 ×(x˜ 1ρ )[−1] θ · a ]}< y˜ 3λ x˜ 3λ u [0,0] θ[0] u 0[0] (x˜ 1ρ )[0] θ u 0 y˜ 1ρ 1

2

> [b · θ 3 u 1 x˜ 2ρ θ 1 u 11 ( y˜ 2ρ )1 x 1 ]{[b · x˜ 3ρ θ 2 u 12 ( y˜ 2ρ )2 x 2 ](b · y˜ 3ρ x 3 )} 3

(2.30,2.44,2.18) =

3

( y˜ 1λ · a){ y˜ 2λ u [−1] θ 1 · [(x˜ 1λ · a )(x˜ 2λ 1 u 0[−1] (x˜ 1ρ )[−1] θ · a )]} 1

< y˜ 3λ u [0] θ 2 (x˜ 3λ ) 0 2 u 0[0] (x˜ 1ρ )[0] θ u 0 y˜ 1ρ > [b · θ 3 (x˜ 3λ ) 1 2

× 3 u 1 x˜ 2ρ θ 1 u 11 ( y˜ 2ρ ))1 x 1 ]{[b · x˜ 3ρ θ 2 u 12 ( y˜ 2ρ )2 x 2 ](b · y˜ 3ρ x 3 )} 3

3

(2.43,2.45,2.26) ( y˜ 1λ · a){ y˜ 2λ u [−1] θ 1 = 2 ×(x˜ 3λ ) 0 u [0] 0 θ 0 u 0,0 x˜ 1ρ y˜ 1ρ

· [((x˜ 1λ · a )(x˜ 2λ u [−1] θ · a )]}< y˜ 3λ u [0] θ 2 1

> [b · θ 3 (x˜ 3λ ) 1 u [0] 1 θ 1 2

×u 0,1 x˜ 2ρ ( y˜ 2ρ )1 x 1 ]{[b · θ u 1 x˜ 3ρ ( y˜ 2ρ )2 x 2 ](b · y˜ 3ρ x 3 )} 3

1 (2.27,2.23) ( y˜ 1λ · a){ y˜ 2λ u [−1] θ 1 · [(x˜ 1λ · a )(x˜ 2λ u [−1] θ · a )]}< y˜ 3λ u [0] θ 2 = 2 2 ×(x˜ 3λ u [0] θ u 0 y˜ 1ρ ) 0 x˜ 1ρ > [b · θ 3 (x˜ 3λ u [0] θ u 0 y˜ 1ρ ) 1 x˜ 2ρ ] 3 ×{[(b · θ u 1 y˜ 2ρ )(b · y˜ 3ρ )] · x˜ 3ρ } 1 2 (5.5) (a b)[(x˜ 1λ · a )(x˜ 2λ u [−1] θ · a )<x˜ 3λ u [0] θ u 0 y˜ 1ρ = 3 > (b · θ u 1 y˜ 2ρ )(b · y˜ 3ρ )] (5.5) (a b)[(a b )(a b )]. =

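Finally, by (2.28), (2.29), (2.32), (2.33) and (2.46) it follows that $1_A \mathbin{\blacktriangleright\!\!<} 1_{\mathbb{A}} \mathbin{>\!\!\blacktriangleleft} 1_B$ is the unit of $A \mathbin{\blacktriangleright\!\!<} \mathbb{A} \mathbin{>\!\!\blacktriangleleft} B$.

Remarks 5.2. (i) The generalized two-sided crossed product $\mathfrak{A} \mathbin{>\!\!\!\lhd} \mathcal{A} \mathbin{\rhd\!\!\!<} \mathfrak{B}$ cannot be particularized for $\mathfrak{A} = k$ or $\mathfrak{B} = k$ because, in general, $k$ is not a right or left $H$-comodule algebra (in fact, we can do that if and only if we work with a quasi-Hopf algebra which is a twisted Hopf algebra, i.e. it is of the form $H_F$, where $H$ is an ordinary Hopf algebra and $F \in H \otimes H$ is a twist on $H$). For the algebra $A \mathbin{\blacktriangleright\!\!<} \mathbb{A} \mathbin{>\!\!\blacktriangleleft} B$, we can take $A = k$ or $B = k$. In these cases we obtain the right or left generalized smash products $\mathbb{A} \mathbin{>\!\!\blacktriangleleft} B$ and $A \mathbin{\blacktriangleright\!\!<} \mathbb{A}$, respectively. We call $A \mathbin{\blacktriangleright\!\!<} \mathbb{A} \mathbin{>\!\!\blacktriangleleft} B$ the two-sided generalized smash product. Note that, in the Hopf case, the multiplication of $A \mathbin{\blacktriangleright\!\!<} \mathbb{A} \mathbin{>\!\!\blacktriangleleft} B$ is given by

\[ (a \mathbin{\blacktriangleright\!\!<} u \mathbin{>\!\!\blacktriangleleft} b)(a' \mathbin{\blacktriangleright\!\!<} u' \mathbin{>\!\!\blacktriangleleft} b') = a(u_{[-1]} \cdot a') \mathbin{\blacktriangleright\!\!<} u_{[0]} u'_{\langle 0\rangle} \mathbin{>\!\!\blacktriangleleft} (b \cdot u'_{\langle 1\rangle}) b', \]

for all $a, a' \in A$, $u, u' \in \mathbb{A}$ and $b, b' \in B$.

(ii) Let $\mathbb{A} = H$. In this particular case we will denote the algebra $A \mathbin{\blacktriangleright\!\!<} H \mathbin{>\!\!\blacktriangleleft} B$ by $A \# H \# B$ (the elements will be written $a \# h \# b$, $a \in A$, $h \in H$, $b \in B$) and will call it the two-sided smash product. Our terminology is based on the fact that when we take $A = k$ or $B = k$ the resulting algebra is the right or left smash product algebra. Note that the multiplication of $A \# H \# B$ is defined by

\[ (a \# h \# b)(a' \# h' \# b') = (x^1 \cdot a)(x^2 h_1 y^1 \cdot a') \# x^3 h_2 y^2 h'_1 z^1 \# (b \cdot y^3 h'_2 z^2)(b' \cdot z^3), \]

for all $a, a' \in A$, $h, h' \in H$ and $b, b' \in B$. It follows that the canonical maps $i : A \# H \to A \# H \# B$ and $j : H \# B \to A \# H \# B$, $i(a \# h) = a \# h \# 1_B$ and $j(h \# b) = 1_A \# h \# b$, are algebra morphisms. In the Hopf case the multiplication of the two-sided smash product is defined by

\[ (a \# h \# b)(a' \# h' \# b') = a(h_1 \cdot a') \# h_2 h'_1 \# (b \cdot h'_2) b'. \]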
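A concrete instance of the Hopf-case formula may help. The sketch below takes $H = kG$, the group algebra of a group $G$ (an illustrative choice, not taken from the text), where $\Delta(g) = g \otimes g$ for $g \in G$, so that $A$ carries a left action and $B$ a right action of $G$ by algebra automorphisms.

```latex
% Illustrative example: H = kG, g, g', g'' \in G, \Delta(g) = g \otimes g.
\[
(a\# g\# b)(a'\# g'\# b') = a\,(g\cdot a')\;\#\;gg'\;\#\;(b\cdot g')\,b' .
\]
% Associativity can be checked by hand on three factors:
\begin{align*}
\bigl[(a\# g\# b)(a'\# g'\# b')\bigr](a''\# g''\# b'')
 &= a\,(g\cdot a')\,(gg'\cdot a'')\;\#\;gg'g''\;\#\;
    \bigl(((b\cdot g')\,b')\cdot g''\bigr)\,b''\\
 &= a\,(g\cdot a')\,(gg'\cdot a'')\;\#\;gg'g''\;\#\;
    (b\cdot g'g'')\,(b'\cdot g'')\,b'',\\[4pt]
(a\# g\# b)\bigl[(a'\# g'\# b')(a''\# g''\# b'')\bigr]
 &= a\,\bigl(g\cdot(a'\,(g'\cdot a''))\bigr)\;\#\;gg'g''\;\#\;
    (b\cdot g'g'')\,(b'\cdot g'')\,b''\\
 &= a\,(g\cdot a')\,(gg'\cdot a'')\;\#\;gg'g''\;\#\;
    (b\cdot g'g'')\,(b'\cdot g'')\,b'' .
\end{align*}
```

Both computations use only that each group element acts by an algebra automorphism, i.e. $g \cdot (xy) = (g \cdot x)(g \cdot y)$ and $(xy) \cdot g = (x \cdot g)(y \cdot g)$. Setting $B = k$ or $A = k$ recovers the usual skew group algebra on one side.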

6. Two-Sided Products vs Generalized Diagonal Crossed Products As mentioned before, Hausser and Nill proved that a two-sided crossed product over a quasi-Hopf algebra is isomorphic to a right diagonal crossed product. We prove now that a generalized two-sided crossed product is isomorphic to a left generalized diagonal crossed product. Namely, let H be a quasi-bialgebra, A an H -bimodule algebra, (A, ρ, ρ ) a right H -comodule algebra and (B, λ, λ ) a left H -comodule algebra. Then, by [12], A ⊗ B becomes an H -bicomodule algebra, with the following structure: ρ(a⊗b) = (a 0 ⊗b)⊗a 1 , λ(a⊗b) = b[−1] ⊗(a⊗b[0] ), ρ = ( X˜ ρ1 ⊗1B)⊗ X˜ ρ2 ⊗ X˜ ρ3 , λ = X˜ λ1 ⊗ X˜ λ2 ⊗ (1A ⊗ X˜ λ3 ), λ,ρ = 1 H ⊗ (1A ⊗ 1B) ⊗ 1 H , for all a ∈ A and b ∈ B. Proposition 6.1. If H is a quasi-Hopf algebra and A, A, B are as above, then the generalized two-sided crossed product A > A A ϕ < b) = ϕ · S −1 (a 1 p˜ ρ2 ) (a 0 p˜ ρ1 ⊗ b), for all a ∈ A, b ∈ B and ϕ ∈ A, where p˜ = p˜ ρ1 ⊗ p˜ ρ2 is the element defined in (2.34). We prove that ν is an algebra map. For a, a ∈ A, b, b ∈ B, ϕ, ϕ ∈ A, we compute: ν((a > ϕ < b)(a > ϕ (x˜λ1 · ϕ · a 1 x˜ρ2 )(x˜λ2 b[−1] · ϕ · x˜ρ3 ) < x˜λ3 b[0] b )


= [(x˜λ1 · ϕ · a 1 x˜ρ2 )(x˜λ2 b[−1] · ϕ · x˜ρ3 )] · S −1 (a 1 a 0,1 (x˜ρ1 ) 1 p˜ ρ2 ) (a 0 a 0,0 (x˜ρ1 ) 0 p˜ ρ1 ⊗ x˜λ3 b[0] b ) (2.50,2.11) 1 [x˜λ · ϕ · a 1 x˜ρ2 S −1 (g 2 )S −1 ((x˜ρ1 ) 12 ( p˜ ρ2 )2 )S −1 (a 12 a 0,12 )S −1 ( f 2 )] = × [x˜λ2 b[−1] · ϕ · x˜ρ3 S −1 (g 1 )S −1 ((x˜ρ1 ) 11 ( p˜ ρ2 )1 )S −1 (a 11 a 0,11 )S −1 ( f 1 )]

(a 0 a 0,0 (x˜ρ1 ) 0 p˜ ρ1 ⊗ x˜λ3 b[0] b ) (2.39,2.26) 1 [x˜λ · ϕ · a 1 S −1 ( p˜ ρ2 )S −1 (a 0,1 )S −1 ( f 2 a 12 X˜ ρ3 )] = × [x˜λ2 b[−1] · ϕ · S −1 ( f 1 a 11 X˜ ρ2 a 0,0,1 ( p˜ ρ1 ) 1 P˜ρ2 )] (a 0 X˜ ρ1 a 0,0,0 ( p˜ ρ1 ) 0 P˜ρ1 ⊗ x˜λ3 b[0] b ) (2.26,2.35) 1 [x˜λ · ϕ · S −1 ( f 2 X˜ ρ3 a 1 p˜ ρ2 )] = ×[x˜λ2 b[−1] · ϕ · S −1 ( f 1 X˜ ρ2 a 0,1 ( p˜ ρ1 ) 1 a 1 P˜ρ2 )]

( X˜ ρ1 a 0,0 ( p˜ ρ1 ) 0 a 0 P˜ρ1 ⊗ x˜λ3 b[0] b ), where P˜ρ1 ⊗ P˜ρ2 is another copy of p˜ ρ . On the other hand, we have seen in Remark 4.8 that for the bicomodule algebra A⊗B we have = ω and the multiplications in A (A ⊗ B) and A (A ⊗ B) coincide. One can check that ω ∈ H ⊗2 ⊗ (A ⊗ B) ⊗ H ⊗2 for A ⊗ B is obtained by ω = x˜λ1 ⊗ x˜λ2 ⊗ ( X˜ ρ1 ⊗ x˜λ3 ) ⊗ S −1 ( f 1 X˜ ρ2 ) ⊗ S −1 ( f 2 X˜ ρ3 ),

(6.1)

and then we compute (using the formula for ): ν(a > ϕ < b)ν(a > ϕ < b ) = [ϕ · S −1 (a 1 p˜ ρ2 ) (a 0 p˜ ρ1 ⊗ b)][ϕ · S −1 (a 1 P˜ρ2 ) (a 0 P˜ρ1 ⊗ b )] (3.22,6.1) 1 [x˜λ · ϕ · S −1 ( f 2 X˜ ρ3 a 1 p˜ ρ2 )][x˜λ2 b[−1] = ( X˜ ρ1 a 0,0 ( p˜ ρ1 ) 0 a 0 P˜ρ1 ⊗ x˜λ3 b[0] b ),

· ϕ · S −1 ( f 1 X˜ ρ2 a 0,1 ( p˜ ρ1 ) 1 a 1 P˜ρ2 )]

hence ν is multiplicative. It obviously satisfies ν(1A > 1A < 1B) = 1A (1A⊗1B), hence it is an algebra map. We prove now that ν is bijective. Define ν −1 : A (A ⊗ B) → A > A < B, ν −1 (ϕ (a ⊗ b)) = q˜ρ1 a 0 > ϕ · q˜ρ2 a 1 < b, for all a ∈ A, b ∈ B, ϕ ∈ A, where q˜ρ = q˜ρ1 ⊗ q˜ρ2 is the element defined in (2.34). We claim that ν and ν −1 are inverses. Indeed, νν −1 (ϕ (a ⊗ b)) = ϕ · q˜ρ2 a<1> S −1 ( p˜ ρ2 )S −1 (a 0,1 )S −1 ((q˜ρ1 ) 1 ) ((q˜ρ1 ) 0 a 0,0 p˜ ρ1 ⊗ b) (2.35) = ϕ (2.37) = ϕ

· q˜ρ2 S −1 ( p˜ ρ2 )S −1 ((q˜ρ1 ) 1 ) ((q˜ρ1 ) 0 p˜ ρ1 a ⊗ b) (a ⊗ b),


and ν −1 ν(a > ϕ < b) = q˜ρ1 a 0,0 ( p˜ ρ1 ) 0 > ϕ · S −1 ( p˜ ρ2 )S −1 (a 1 )q˜ρ2 a 0,1 ( p˜ ρ1 ) 1 ϕ · S −1 ( p˜ ρ2 )q˜ρ2 ( p˜ ρ1 ) 1 ϕ < b,

and this finishes the proof. Remark 6.2. Let H , A, A, B be as above and consider the map as in Proposition 3.8, with A taken to be A ⊗ B. Then, due to the particular structure of A, the map : A → A (A ⊗ B) is given by (ϕ) = ϕ · S −1 ( p˜ ρ2 ) ( p˜ ρ1 ⊗ 1B), for all ϕ ∈ A, where p˜ ρ is the one corresponding to A. Then, using this formula and (3.21), one verifies that the isomorphism ν from Proposition 6.1 reduces to ν(a > ϕ < b) = a(ϕ)b, ∀ a ∈ A, b ∈ B, ϕ ∈ A, where we suppressed the embeddings of A and B into A (A ⊗ B) (this generalizes [12], Prop. 11.4). Let H be a quasi-bialgebra, A a left H -module algebra and B a right H -module algebra. Then A ⊗ B becomes an H -bimodule algebra via the H -actions h · (a ⊗ b) · h = h · a ⊗ b · h , ∀a ∈ A, h, h ∈ H , b ∈ B.

(6.2)

Proposition 6.3. Let H be a quasi-Hopf algebra, A a left H -module algebra, B a right H -module algebra and A an H -bicomodule algebra. Then the two-sided generalized smash product A B is isomorphic as an algebra to the generalized diagonal crossed product (A ⊗ B) A. Proof. Define μ : (A ⊗ B) A → A B, μ((a ⊗ b) u) = 1 · a b · S −1 ( 3 )q˜ρ2 2 1 u 1 ,

(6.3)

for all a ∈ A, b ∈ B and u ∈ A, where q˜ρ = q˜ρ1 ⊗ q˜ρ2 is the element defined in (2.34). We will prove that μ is an algebra isomorphism. First, observe that the multiplication of (A ⊗ B) A is defined by ((a ⊗ b) u)((a ⊗ b ) u ) = [(1 · a)(2 u 0[−1] · a ) ⊗ (b · 5 )(b · S −1 (u 1 )4 )] 3 u 0[0] u ,

(6.4)

for all a, a ∈ A, b, b ∈ B and u, u ∈ A. By using (4.1), (2.40), (2.44) and several times (2.36) and (2.26), we obtain that 11 1 ⊗ 12 2 ⊗ q˜ρ1 ( 2 3 ) 0 ⊗ 5 S −1 ( 3 )1 (q˜ρ2 )1 ( 2 3 ) 11 ⊗ 4 S −1 ( 3 )2 ×(q˜ρ2 )2 ( 2 3 ) 12 = x˜ 1λ 1 ⊗ x˜ 2λ 1 2[−1] ⊗ x˜ 3λ q˜ρ1 ( 2 2[0] Q˜ 1ρ 0 ) 0 x˜ 1ρ 1

2

⊗S −1 ( 3 3 )q˜ρ2 ( 2 2[0] Q˜ 1ρ 0 ) 1 x˜ 2ρ ⊗ S −1 ( ) Q˜ 2ρ 1 x˜ 3ρ , 2

3

2

(6.5)


where we denote by Q˜ 1ρ ⊗ Q˜ 2ρ another copy of q˜ρ . On the other hand, by (2.26), (2.43) and (2.36) it follows that 1 2 2 u 0[−1] ⊗ ( Q˜ 1ρ 0 ) 0 x˜ 1ρ u 0[0] 0 u 0 ⊗ ( Q˜ 1ρ 0 ) 1 x˜ 2ρ u 0[0] 1 u 11 1

3 ⊗S −1 ( u

˜ 2ρ 2 1 x˜ 3ρ u 0[0] u 12 = u [−1] 1 ⊗ (u [0] Q˜ 1ρ ) 0 ( 2 u ) 0,0 x˜ 1ρ

1 ) Q

1 2

2 ⊗(u [0] Q˜ 1ρ ) 1 ( u ) 0,1 x˜ 2ρ

⊗ S −1 ( ) Q˜ 2ρ ( u ) 1 x˜ 3ρ , 3

2

(6.6)

for all u, u ∈ A. Finally, using (2.34), (2.45), (2.6) and (2.46), one checks that 1 ⊗ q˜ρ1 2 0 ⊗ S −1 ( 3 )q˜ρ2 2 1 = (q˜ρ1 )[−1] θ 1 ⊗ (q˜ρ1 )[0] θ 2 ⊗ q˜ρ2 θ 3 .

(6.7)

Now, for all a, a ∈ A, u, u ∈ A and b, b ∈ B we compute: μ(((a ⊗ b) u)((a ⊗ b ) u )) (6.4,6.3) =

( 11 1 · a)( 12 2 u 0[−1] · a )
> (b · 5 S −1 ( 3 )1 (q˜ρ2 )1 ( 2 3 ) 11 u 0[0] 1 u 11 ) 1

(b · S −1 (u 1 )4 S −1 ( 3 )2 (q˜ρ2 )2 ( 2 3 ) 12 u 0[0] 1 u 12 ) 2

(6.5) =

(x˜ 1λ 1

1 · a)(x˜ 2λ 1 2[−1] u 0[−1]

·a

2 )<x˜ 3λ q˜ρ1 2 0 2[0] 0 ( Q˜ 1ρ 0 ) 0

2 x˜ 1ρ u 0[0] 0 u 0 > (b · S −1 ( 3 3 )q˜ρ2 2 1 2[0] 1 ( Q˜ 1ρ 0 ) 1 x˜ 2ρ 3 2 u 0[0] 1 u 11 )(b · S −1 ( u 1 ) Q˜ 2ρ 1 x˜ 3ρ u 0[0] 1 u 12 ) 1

(6.6) =

2

(x˜ 1λ 1

1 · a)(x˜ 2λ 1 2[−1] u [−1]

·a

)<x˜ 3λ q˜ρ1 2 0 2[0] 0 (u [0] Q˜ 1ρ ) 0

( u ) 0,0 x˜ 1ρ > (b · S −1 ( 3 3 )q˜ρ2 2 1 2[0] 1 (u [0] Q˜ 1ρ ) 1 ( u ) 0,1 x˜ 2ρ ) 2

2

(b · S −1 ( ) Q˜ 2ρ ( u ) 1 x˜ 3ρ ) 3

(6.7,5.5,2.43) ( 1 = 1

2

· a b · S −1 ( 3 )q˜ρ2 2 1 u 1 )

( · a < Q˜ 1ρ 0 u 0 > b · S −1 ( ) Q˜ 2ρ 1 u 1 )

(6.3) =

2

3

2

μ((a ⊗ b) u)μ((a ⊗ b ) u ),

as claimed. The (co) unit axioms imply μ((1 A ⊗ 1 B ) 1A ) = 1 A <1A > 1 B , so it remains to show that μ is bijective. To this end, define μ−1 : A B → (A⊗B) A, μ−1 (a b) = (θ 1 · a ⊗ b · S −1 (θ 3 u 1 p˜ ρ2 )) θ 2 u 0 p˜ ρ1 ,

(6.8)

for all a ∈ A, u ∈ A and b ∈ B, where p˜ ρ = p˜ ρ1 ⊗ p˜ ρ2 is the element defined in (2.34). We show that μ and μ−1 are inverses. Indeed, μμ−1 (a b) (6.8,6.3) a b, =

> b · S −1 (u 1 p˜ ρ2 )q˜ρ2 u 0,1 ( p˜ ρ1 ) 1


for all a ∈ A, u ∈ A, b ∈ B, and similarly μ−1 μ((a ⊗ b) u) (6.3,6.8) =

[θ 1 1 · a ⊗ b · S −1 ( 3 )q˜ρ2 ( 2 u) 1 S −1 (θ 3 (q˜ρ1 ) 1 ( 2 u) 0,1 ) p˜ ρ2 )] θ 2 (q˜ρ1 ) 0 ( 2 u) 0,0 p˜ ρ1

(2.35,2.37) =

(a ⊗ b) u,

and this finishes our proof. As a consequence of the two propositions, we obtain the following result: Corollary 6.4. Let H be a quasi-Hopf algebra, A a left H -module algebra, B a right H -module algebra, A a right H -comodule algebra and B a left H -comodule algebra. Then we have algebra isomorphisms A<(A ⊗ B) > B (A ⊗ B) (A ⊗ B) A > (A ⊗ B) < B. 7. Invariance Under Twisting In this section we prove that the generalized diagonal crossed products and the two-sided smash products are, in certain senses, invariant under twisting (such a result has also been proved by Hausser and Nill in [12] for their diagonal crossed products, with a different method, and by the authors in [5] for smash products). Let H be a quasi-bialgebra, A a left H -module algebra, A an H -bimodule algebra and F ∈ H ⊗ H a gauge transformation. If we introduce on A another multiplication, by a a = (G 1 · a)(G 2 · a ) for all a, a ∈ A, where F −1 = G 1 ⊗ G 2 , and denote by A F −1 this structure, then, as in [5], one can prove that A F −1 becomes a left H F -module algebra, with the same unit and H -action as for A. If we introduce on A another multiplication, by ϕ ◦ ϕ = (G 1 · ϕ · F 1 )(G 2 · ϕ · F 2 ) for all ϕ, ϕ ∈ A, and denote this by ∗ F A F −1 , then F A F −1 is an H F -bimodule algebra (for instance, if A = H , then F A F −1 ∗ op is just (H F ) ). Moreover, if we regard A as a left H ⊗ H -module algebra and F A F −1 op as a left H F ⊗ H F -module algebra, then F A F −1 coincides with AT −1 , where T is the gauge transformation on H ⊗ H op given by T = (F 1 ⊗ G 1 ) ⊗ (F 2 ⊗ G 2 ), and using the identification H F ⊗ (H F )op ≡ (H ⊗ H op )T . Suppose that we have also a left H -comodule algebra B; then, by [12], on the algebra structure of B one can introduce a left H F -comodule algebra structure (denoted in what −1 −1 −1 follows by B F ) by putting λ F = λ and λF = λ (F −1 ⊗ 1B). Proposition 7.1. With notation as above, we have an algebra isomorphism −1

A
−1

coincides, via the trivial


Similarly, if A is a right H -comodule algebra, by [12] one can introduce on the algebra structure of A a right H F -comodule algebra structure (denoted by F A) by putting F ρ = ρ and F = (1 ⊗ F) . ρ ρ A Also, one can check that if A is an H -bicomodule algebra, the left and right H F -com−1 odule algebras A F and F A actually define the structure of an H F -bicomodule algebra −1 on A, denoted by F A F , which has the same λ,ρ as A. Suppose now that H is a quasi-Hopf algebra. Transforming this H F -bicomodule −1 op algebra F A F , as in a previous section, into the two left H F ⊗ H F -comodule algebras −1 −1 op ( F A F )1 and ( F A F )2 , by using the identification H F ⊗ H F ≡ (H ⊗ H op )T as before and the fact, observed in [12], that the Drinfeld twist f F on H F depends on the one on −1 ) f F −1 , we may obtain algebra isomorphisms H by the formula f F = (S ⊗ S)(F21 −1

−1

−1

−1

( F A F )1 ≡ (A1 )T , ( F A F )2 ≡ (A2 )T , defined by the trivial identifications. As a consequence, using the expressions of the generalized left diagonal crossed products as generalized smash products, we obtain the following result: Proposition 7.2. With notation as before, the algebra isomorphisms A A ≡

F A F −1

F

−1

A F , A A ≡

F A F −1

F

−1

AF ,

are defined by the trivial identifications. Suppose again that H is a quasi-bialgebra, A is a left H -module algebra and F ∈ H ⊗ H is a gauge transformation. Suppose now that we also have a right H -module algebra B. If we introduce on B another multiplication, by b b = (b · F 1 )(b · F 2 ) for all b, b ∈ B, denoting this structure by F B, then F B becomes a right H F -module algebra with the same unit and right H -action as for B. So, we have the following type of invariance under twisting for two-sided smash products: Proposition 7.3. With notation as before, we have an algebra isomorphism ϕ : A# H # B A F −1 # H F # F B, ϕ(a#h#b) = F 1 · a#F 2 hG 1 #b · G 2 , ∀ a ∈ A, h ∈ H, b ∈ B. In particular, by taking B = k or respectively A = k, we have algebra isomorphisms A# H A F −1 # H F , H # B H F # F B. Proof. Follows by a direct computation, similar to the one in [5]. 8. Iterated Products It was proved in [5] that, if H is a quasi-bialgebra and A is a left H -module algebra, then A# H becomes a right H -comodule algebra, with structure: ρ : A# H → (A# H ) ⊗ H, ρ(a#h) = (x 1 · a#x 2 h 1 ) ⊗ x 3 h 2 , ∀a ∈ A, h ∈ H , ρ = (1 A # X 1 ) ⊗ X 2 ⊗ X 3 ∈ (A# H ) ⊗ H ⊗ H.
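For orientation, the comodule algebra structure on $A \# H$ just recalled can be compared with the ordinary bialgebra case. The following is a sketch under the assumption (made here only for illustration) that the reassociator is trivial, $\Phi = 1 \otimes 1 \otimes 1$:

```latex
% Assumption: \Phi = 1 \otimes 1 \otimes 1, so x^1 \otimes x^2 \otimes x^3 = X^1 \otimes X^2 \otimes X^3 = 1 \otimes 1 \otimes 1.
\[
\rho(a \# h) = (a \# h_1) \otimes h_2 , \qquad \Phi_\rho = (1_A \# 1_H) \otimes 1_H \otimes 1_H .
\]
% Coassociativity of \Delta then gives the usual comodule axiom:
\[
(\rho \otimes \mathrm{id})\,\rho(a \# h) = (a \# h_1) \otimes h_2 \otimes h_3 = (\mathrm{id} \otimes \Delta)\,\rho(a \# h) .
\]
```

In the quasi-bialgebra setting the second identity holds only up to conjugation by $\Phi_\rho$, which is why $\Phi_\rho$ is part of the structure.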


Similarly, one can prove that if B is a right H -module algebra, then H # B becomes a left H -comodule algebra, with structure: λ : H # B → H ⊗ (H # B), λ(h#b) = h 1 x 1 ⊗ (h 2 x 2 #b · x 3 ), ∀h ∈ H , b ∈ B, λ = X 1 ⊗ X 2 ⊗ (X 3 #1 B ) ∈ H ⊗ H ⊗ (H # B). In the sequel we need some more general results, that we are stating now (the proof is similar to the one in [5]). Let H be a quasi-bialgebra, A a left H -module algebra and A an H -bicomodule algebra. Then A
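Similarly, let $H$ be a quasi-bialgebra, $B$ a right $H$-module algebra and $\mathbb{A}$ an $H$-bicomodule algebra. Then $\mathbb{A} >\!\blacktriangleleft B$ becomes a left $H$-comodule algebra, with structure defined for all $u \in \mathbb{A}$ and $b \in B$ by:
\[
\lambda : \mathbb{A} >\!\blacktriangleleft B \to H \otimes (\mathbb{A} >\!\blacktriangleleft B), \qquad \lambda(u >\!\blacktriangleleft b) = u_{[-1]} \theta^1 \otimes (u_{[0]} \theta^2 >\!\blacktriangleleft b \cdot \theta^3),
\]
\[
\Phi_\lambda = \tilde{X}^1_\lambda \otimes \tilde{X}^2_\lambda \otimes (\tilde{X}^3_\lambda >\!\blacktriangleleft 1_B) \in H \otimes H \otimes (\mathbb{A} >\!\blacktriangleleft B).
\]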

We are now ready to prove that the two-sided generalized smash product can be written (in two ways) as an iterated generalized smash product. Proposition 8.1. Let H be a quasi-bialgebra, A a left H -module algebra, B a right H -module algebra and A an H -bicomodule algebra. Consider the right and left H comodule algebras A B as above. Then we have algebra isomorphisms A B ≡ (A B, A B ≡ A<(A > B), given by the trivial identifications. In particular, we have A# H # B ≡ (A# H ) > B, A# H # B ≡ A<(H # B). Proof. We will prove the first isomorphism, the second is similar. We compute the multiplication in (A B. For a, a ∈ A, b, b ∈ B and u, u ∈ A we have: ((a b)((a b ) = (a (b · (a (b · θ 3 u 1 x˜ 2ρ )(b · x˜ 3ρ ) = ((x˜ 1λ · a)(x˜ 2λ u [−1] θ 1 · a )<x˜ 3λ u [0] θ 2 u 0 x˜ 1ρ ) > (b · θ 3 u 1 x˜ 2ρ )(b · x˜ 3ρ ). Via the trivial identification, this is exactly the multiplication of A B. Recall from [7] the definition and properties of the so-called quasi-smash product, but in a more general form. Let H be a quasi-bialgebra, A a right H -comodule algebra and A an H -bimodule algebra. Define a multiplication on A ⊗ A by

(a#ϕ)(a #ϕ ) = aa 0 x˜ 1ρ #(ϕ · a 1 x˜ 2ρ )(ϕ · x˜ρ3 ), ∀ a, a ∈ A, ϕ, ϕ ∈ A,

(8.1)


where we write a # ϕ for a⊗ϕ, and denote this structure by A # A. Then A # A becomes a left H -module algebra with unit 1A # 1A and with left H -action h · (a # ϕ) = a # h · ϕ, ∀ a ∈ A, h ∈ H, ϕ ∈ A. Note that for A = H ∗ we obtain the quasi-smash product A # H ∗ from [7]. Also, by taking B a right H -module algebra and A = B as an H -bimodule algebra with trivial left H -action, A # A is exactly the generalized smash product A > B. We need the left-handed version of the above construction too. Namely, if H is a quasi-bialgebra, B a left H -comodule algebra and A an H -bimodule algebra, define a multiplication on A ⊗ B by (ϕ # b)(ϕ # b ) = (x˜ 1λ · ϕ)(x˜ 2λ b[−1] · ϕ ) # x˜ 3λ b[0] b , ∀ ϕ, ϕ ∈ A, b, b ∈ B, (8.2) where we write ϕ # b for ϕ ⊗ b, and denote this structure by A # B. Then A # B becomes a right H -module algebra with unit 1A # 1B and with right H -action, (ϕ # b) · h = ϕ · h # b, ∀ ϕ ∈ A, h ∈ H, b ∈ B. By taking A a left H -module algebra and A = A as an H -bimodule algebra with trivial right H -action, A # B is exactly the generalized smash product A A (A # B), obtained from the trivial identifications. Proof. Follows by direct computations. We now apply the above results. In [12], Hausser and Nill generalized to the setting of quasi-Hopf algebras some models of Hopf spin chains and lattice current algebras. The key result for this was the next theorem, concerning iterated two-sided crossed products (with H finite dimensional and A = H ∗ ). The original proof of this theorem is quite difficult to read, being written in the formalism of universal intertwiners. Using our results, we are now able to obtain for free a conceptual proof of the theorem, together with the explicit form of the structures that appear at (i) and (ii). Theorem 8.3. (Hausser and Nill). Let H be a quasi-bialgebra, A an H -bimodule algebra, A a right H -comodule algebra, B an H -bicomodule algebra and C a left H -comodule algebra. 
Then: (i) A > A A < C admits a left H -comodule algebra structure; (iii) there is an algebra isomorphism (given by the trivial identification) (A > A < B) > A < C ≡ A > A < (B > A < C).


Proof. Writing A > A ϕ < b) = (a > θ 1 · ϕ < θ 2 b 0 ) ⊗ θ 3 b 1 , ∀ a ∈ A, ϕ ∈ A, b ∈ B, 1 2 3 ρ = (1A > 1A < X˜ ρ ) ⊗ X˜ ρ ⊗ X˜ ρ ∈ (A > A < B) ⊗ H ⊗ H.

Similarly, writing B > A < C as B > (A # C), we obtain that this is a left H -comodule algebra, with structure: λ : B > A < C ≡ B > (A # C) → H ⊗ (B > (A # C)) ≡ H ⊗ (B > A < C), λ(b > ϕ < c) = b[−1] θ 1 ⊗ (b[0] θ 2 > ϕ · θ 3 < c), ∀ b ∈ B, ϕ ∈ A, c ∈ C, 1 2 3 λ = X˜ λ ⊗ X˜ λ ⊗ ( X˜ λ > 1A < 1C) ∈ H ⊗ H ⊗ (B > A < C).

To prove (iii), we will use the identifications appearing in our results: (A > A < B) > A < C ≡ ((A # A) A < C ≡ ((A # A) (A # C) ≡ (A # A) (A # C), and A > A < (B > A < C) ≡ A > A < (B > (A # C)) ≡ (A # A)<(B > (A # C)) ≡ (A # A) (A # C). So, we have proved that the two iterated generalized two-sided crossed products that appear in (iii) are both isomorphic as algebras (via the trivial identifications) to the two-sided generalized smash product (A # A) (A # C). Using the same results, we obtain another relation between the generalized two-sided crossed product and the two-sided generalized smash product. More exactly, let H be a quasi-bialgebra, A an H -bimodule algebra, A a left H -module algebra, B a right H -module algebra and A and B two H -bicomodule algebras. As we have seen before, A B) becomes a right (respectively left) H -comodule algebra, so we may consider the generalized two-sided crossed product (A A < (B > B). On the other hand, by the above Theorem of Hausser and Nill, A > A < B becomes a right H -comodule algebra and a left H -comodule algebra, but actually, using the explicit formulae for its structures that we gave, one can prove that it is even an H -bicomodule algebra, with λ,ρ = 1 H ⊗ (1A > 1A < 1B ) ⊗ 1 H , so we may consider the two-sided generalized smash product A<(A > A < B) > B. Proposition 8.4. We have an algebra isomorphism (A A < (B > B) ≡ A<(A > A < B) > B obtained from the trivial identification. In particular, we have (A# H ) > H ∗ < (H # B) ≡ A<(H > H ∗ < H ) > B.


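Proof. This may be proved by computing explicitly the multiplication rules in the two algebras and noting that they coincide. Alternatively, we provide a conceptual proof, by a sequence of identifications using the above results. We compute:
\[
\begin{aligned}
A \blacktriangleright\!< (\mathbb{A} >\!\blacktriangleleft \mathcal{A} \blacktriangleright\!< \mathbb{B}) >\!\blacktriangleleft B
&\equiv A \blacktriangleright\!< ((\mathbb{A} >\!\blacktriangleleft \mathcal{A} \blacktriangleright\!< \mathbb{B}) >\!\blacktriangleleft B) \\
&\equiv A \blacktriangleright\!< (((\mathbb{A} \# \mathcal{A}) \blacktriangleright\!< \mathbb{B}) >\!\blacktriangleleft B) \\
&\equiv A \blacktriangleright\!< ((\mathbb{A} \# \mathcal{A}) \blacktriangleright\!< (\mathbb{B} >\!\blacktriangleleft B)) \\
&\equiv A \blacktriangleright\!< (\mathbb{A} >\!\blacktriangleleft \mathcal{A} \blacktriangleright\!< (\mathbb{B} >\!\blacktriangleleft B)) \\
&\equiv A \blacktriangleright\!< (\mathbb{A} >\!\blacktriangleleft (\mathcal{A} \# (\mathbb{B} >\!\blacktriangleleft B))) \\
&\equiv (A \blacktriangleright\!< \mathbb{A}) >\!\blacktriangleleft (\mathcal{A} \# (\mathbb{B} >\!\blacktriangleleft B)) \\
&\equiv (A \blacktriangleright\!< \mathbb{A}) >\!\blacktriangleleft \mathcal{A} \blacktriangleright\!< (\mathbb{B} >\!\blacktriangleleft B),
\end{aligned}
\]
where the fourth and the fifth identities hold because the left $H$-comodule algebra structures on $(\mathbb{A} >\!\blacktriangleleft \mathcal{A} \blacktriangleright\!< \mathbb{B}) >\!\blacktriangleleft B$, $\mathbb{A} >\!\blacktriangleleft \mathcal{A} \blacktriangleright\!< (\mathbb{B} >\!\blacktriangleleft B)$ and $\mathbb{A} >\!\blacktriangleleft (\mathcal{A} \# (\mathbb{B} >\!\blacktriangleleft B))$ coincide (via the trivial identifications).

9. $H^*$-Hopf Bimodules

Let $H$ be a finite dimensional quasi-bialgebra and $A$ a left $H$-module algebra. Recall from [6] the category $\mathcal{M}^{H^*}_A$, whose objects are vector spaces $M$, such that $M$ is a right $H^*$-comodule (i.e. $M$ is a left $H$-module, with action denoted by $h \otimes m \mapsto h \succ m$) and $A$ acts on $M$ to the right (denote by $m \otimes a \mapsto m \cdot a$ this action) such that $m \cdot 1_A = m$ for all $m \in M$ and the following relations hold:
\[
(m \cdot a) \cdot a' = (X^1 \succ m) \cdot [(X^2 \cdot a)(X^3 \cdot a')], \tag{9.1}
\]
\[
h \succ (m \cdot a) = (h_1 \succ m) \cdot (h_2 \cdot a), \tag{9.2}
\]
for all $a, a' \in A$, $m \in M$, $h \in H$. Similarly, the category ${}_A\mathcal{M}^{H^*}$ consists of vector spaces $M$, such that $M$ is a right $H^*$-comodule and $A$ acts on $M$ to the left (denote by $a \otimes m \mapsto a \cdot m$ this action) such that $1_A \cdot m = m$ for all $m \in M$ and the following relations hold:
\[
a \cdot (a' \cdot m) = [(x^1 \cdot a)(x^2 \cdot a')] \cdot (x^3 \succ m), \tag{9.3}
\]
\[
h \succ (a \cdot m) = (h_1 \cdot a) \cdot (h_2 \succ m), \tag{9.4}
\]
for all $a, a' \in A$, $m \in M$, $h \in H$. From the description of left modules over $A \# H$ in [5], it is clear that ${}_A\mathcal{M}^{H^*} \cong {}_{A \# H}\mathcal{M}$. If $H$ is a quasi-Hopf algebra, by [6] we have an isomorphism of categories $\mathcal{M}^{H^*}_A \cong \mathcal{M}_{A \# H}$. In what follows we need a description of $\mathcal{M}^{H^*}_A$ as a category of left modules over a right smash product.

Proposition 9.1. Let $H$ be a quasi-Hopf algebra and $A$ a left $H$-module algebra. Define on $A$ a new multiplication, by putting
\[
a \,\overline{\circ}\, a' = (g^1 \cdot a')(g^2 \cdot a), \quad \forall\, a, a' \in A, \tag{9.5}
\]
where $f^{-1} = g^1 \otimes g^2$ is given by (2.14), and denote this new structure by $\overline{A}$. Then $\overline{A}$ becomes a right $H$-module algebra, with the same unit as $A$ and right $H$-action given by $a \cdot h = S(h) \cdot a$, for all $a \in A$, $h \in H$.

Proof. A straightforward computation, using (2.11) and (2.16).

Definition 9.2. Let $H$ be a quasi-bialgebra and $B$ a right $H$-module algebra. We say that $M$, a $k$-linear space, is a left $H, B$-module if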


(i) $M$ is a left $H$-module with action denoted by $h \otimes m \mapsto h \succ m$;
(ii) $B$ acts weakly on $M$ from the left, i.e. there exists a $k$-linear map $B \otimes M \to M$, denoted by $b \otimes m \mapsto b \cdot m$, such that $1_B \cdot m = m$ for all $m \in M$;
(iii) the following compatibility conditions hold:
\[
b \cdot (b' \cdot m) = x^1 \succ ([(b \cdot x^2)(b' \cdot x^3)] \cdot m), \tag{9.6}
\]
\[
b \cdot (h \succ m) = h_1 \succ [(b \cdot h_2) \cdot m], \tag{9.7}
\]
for all $b, b' \in B$, $h \in H$, $m \in M$. The category of all left $H, B$-modules, morphisms being the $H$-linear maps that preserve the $B$-action, will be denoted by ${}_{H,B}\mathcal{M}$.

Proposition 9.3. If $H$, $B$ are as above, then the categories ${}_{H,B}\mathcal{M}$ and ${}_{H \# B}\mathcal{M}$ are isomorphic. The isomorphism is given as follows. If $M \in {}_{H \# B}\mathcal{M}$, define $h \succ m = (h \# 1) \cdot m$ and $b \cdot m = (1 \# b) \cdot m$. Conversely, if $M \in {}_{H,B}\mathcal{M}$, define $(h \# b) \cdot m = h \succ (b \cdot m)$.

Proof. Straightforward computation.

Proposition 9.4. If $H$ is a finite dimensional quasi-Hopf algebra and $A$ is a left $H$-module algebra, then $\mathcal{M}^{H^*}_A$ is isomorphic to ${}_{H \# \overline{A}}\mathcal{M}$, where $\overline{A}$ is the right $H$-module algebra constructed in Proposition 9.1. The correspondence is given as follows (we fix $\{e_i\}$ a basis in $H$ with $\{e^i\}$ a dual basis in $H^*$):

• If $M \in {}_{H \# \overline{A}}\mathcal{M}$, then $M$ becomes an object in $\mathcal{M}^{H^*}_A$ with the following structures (we denote by $h \otimes m \mapsto h \succ m$ the left $H$-module structure of $M$ and by $a \otimes m \mapsto a \triangleright m$ the weak left $\overline{A}$-action on $M$ arising from Proposition 9.3):
\[
M \to M \otimes H^*, \quad m \mapsto \sum_{i=1}^{n} e_i \succ m \otimes e^i, \quad \forall\, m \in M,
\]
\[
M \otimes A \to M, \quad m \otimes a \mapsto m \cdot a = q^1 \succ ((S(q^2) \cdot a) \triangleright m),
\]
where $q_R = q^1 \otimes q^2 = X^1 \otimes S^{-1}(\alpha X^3) X^2 \in H \otimes H$ (it is the element $\tilde{q}_\rho$ given by (2.34) corresponding to $\mathfrak{A} = H$).

• Conversely, if $M \in \mathcal{M}^{H^*}_A$, denoting the $H^*$-comodule structure of $M$ by $M \to M \otimes H^*$, $m \mapsto m_{(0)} \otimes m_{(1)}$, and the weak right $A$-action on $M$ by $m \otimes a \mapsto ma$, then $M$ becomes an object in ${}_{H \# \overline{A}}\mathcal{M}$ with the following structures (again via Proposition 9.3): $M$ is a left $H$-module with action $h \succ m = m_{(1)}(h) m_{(0)}$, and the weak left $\overline{A}$-action on $M$ is given by
\[
a \to m = (p^1 \succ m)(p^2 \cdot a), \quad \forall\, a \in A,\ m \in M,
\]
where $p_R = p^1 \otimes p^2 = x^1 \otimes x^2 \beta S(x^3) \in H \otimes H$ (it is the element $\tilde{p}_\rho$ given by (2.34) corresponding to $\mathfrak{A} = H$).

Proof. Assume first that $M \in {}_{H \# \overline{A}}\mathcal{M}$; then we have, by Propositions 9.3 and 9.1:
\[
a \triangleright (a' \triangleright m) = x^1 \succ ([(g^1 S(x^3) \cdot a')(g^2 S(x^2) \cdot a)] \triangleright m), \tag{9.8}
\]
\[
a \triangleright (h \succ m) = h_1 \succ [(S(h_2) \cdot a) \triangleright m], \tag{9.9}
\]


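for all $a, a' \in A$, $h \in H$, $m \in M$. We have to prove that $M \in \mathcal{M}^{H^*}_A$. To prove (9.1), we compute (denoting by $Q^1 \otimes Q^2$ another copy of $q_R$):
\[
\begin{aligned}
(m \cdot a) \cdot a'
&= Q^1 \succ [(S(Q^2) \cdot a') \triangleright (q^1 \succ [(S(q^2) \cdot a) \triangleright m])] \\
&\overset{(9.9)}{=} Q^1 q^1_1 \succ [(S(q^1_2) S(Q^2) \cdot a') \triangleright ((S(q^2) \cdot a) \triangleright m)] \\
&\overset{(9.8)}{=} Q^1 q^1_1 x^1 \succ [((g^1 S(x^3) S(q^2) \cdot a)(g^2 S(x^2) S(Q^2 q^1_2) \cdot a')) \triangleright m] \\
&\overset{(2.40)}{=} q^1 X^1_1 \succ [((g^1 S(X^1_{(2,2)}) S(q^2_2) f^1 X^2 \cdot a)(g^2 S(X^1_{(2,1)}) S(q^2_1) f^2 X^3 \cdot a')) \triangleright m] \\
&\overset{(2.11)}{=} q^1 X^1_1 \succ [((S(q^2 X^1_2)_1 X^2 \cdot a)(S(q^2 X^1_2)_2 X^3 \cdot a')) \triangleright m] \\
&= q^1 X^1_1 \succ [(S(q^2 X^1_2) \cdot ((X^2 \cdot a)(X^3 \cdot a'))) \triangleright m] \\
&= q^1 \succ [X^1_1 \succ [(S(X^1_2) \cdot (S(q^2) \cdot ((X^2 \cdot a)(X^3 \cdot a')))) \triangleright m]] \\
&\overset{(9.9)}{=} q^1 \succ [(S(q^2) \cdot ((X^2 \cdot a)(X^3 \cdot a'))) \triangleright (X^1 \succ m)] \\
&= (X^1 \succ m) \cdot ((X^2 \cdot a)(X^3 \cdot a')), \quad \text{q.e.d.}
\end{aligned}
\]
To prove (9.2), we compute:
\[
\begin{aligned}
(h_1 \succ m) \cdot (h_2 \cdot a)
&= q^1 \succ ((S(q^2) h_2 \cdot a) \triangleright (h_1 \succ m)) \\
&\overset{(9.9)}{=} q^1 h_{(1,1)} \succ ((S(h_{(1,2)}) S(q^2) h_2 \cdot a) \triangleright m) \\
&\overset{(2.36)}{=} h q^1 \succ ((S(q^2) \cdot a) \triangleright m) = h \succ (m \cdot a), \quad \text{q.e.d.}
\end{aligned}
\]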

Obviously $m \cdot 1_A = m$, for all $m \in M$, hence indeed $M \in \mathcal{M}^{H^*}_A$.

Conversely, assume that $M \in \mathcal{M}^{H^*}_A$, that is
\[
(ma)a' = (X^1 \succ m)[(X^2 \cdot a)(X^3 \cdot a')], \tag{9.10}
\]
\[
h \succ (ma) = (h_1 \succ m)(h_2 \cdot a), \tag{9.11}
\]
for all $m \in M$, $a, a' \in A$, $h \in H$, and we have to prove that
\[
a \to (a' \to m) = x^1 \succ ([(g^1 S(x^3) \cdot a')(g^2 S(x^2) \cdot a)] \to m), \tag{9.12}
\]
\[
a \to (h \succ m) = h_1 \succ [(S(h_2) \cdot a) \to m], \tag{9.13}
\]
for all $a, a' \in A$, $h \in H$, $m \in M$. To prove (9.12), we compute (denoting by $P^1 \otimes P^2$ another copy of $p_R$):
\[
\begin{aligned}
a \to (a' \to m)
&= (p^1 \succ [(P^1 \succ m)(P^2 \cdot a')])(p^2 \cdot a) \\
&\overset{(9.11)}{=} [(p^1_1 P^1 \succ m)(p^1_2 P^2 \cdot a')](p^2 \cdot a) \\
&\overset{(9.10)}{=} (X^1 p^1_1 P^1 \succ m)[(X^2 p^1_2 P^2 \cdot a')(X^3 p^2 \cdot a)] \\
&\overset{(2.39)}{=} (x^1_1 p^1 \succ m)[(x^1_{(2,1)} p^2_1 g^1 S(x^3) \cdot a')(x^1_{(2,2)} p^2_2 g^2 S(x^2) \cdot a)] \\
&\overset{(9.11)}{=} x^1 \succ [(p^1 \succ m)[(p^2_1 g^1 S(x^3) \cdot a')(p^2_2 g^2 S(x^2) \cdot a)]] \\
&= x^1 \succ [((g^1 S(x^3) \cdot a')(g^2 S(x^2) \cdot a)) \to m], \quad \text{q.e.d.}
\end{aligned}
\]


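To prove (9.13), we compute:
\[
\begin{aligned}
h_1 \succ [(S(h_2) \cdot a) \to m]
&= h_1 \succ [(p^1 \succ m)(p^2 S(h_2) \cdot a)] \\
&\overset{(9.11)}{=} (h_{(1,1)} p^1 \succ m)(h_{(1,2)} p^2 S(h_2) \cdot a) \\
&\overset{(2.35)}{=} (p^1 h \succ m)(p^2 \cdot a) = a \to (h \succ m), \quad \text{q.e.d.}
\end{aligned}
\]
Obviously $1_A \to m = m$, for all $m \in M$, hence indeed $M \in {}_{H \# \overline{A}}\mathcal{M}$.

In order to prove that $\mathcal{M}^{H^*}_A \cong {}_{H \# \overline{A}}\mathcal{M}$, the only things left to prove are the following:
(1) If $M \in {}_{H \# \overline{A}}\mathcal{M}$, then $a \to m = a \triangleright m$, for all $a \in A$, $m \in M$;
(2) If $M \in \mathcal{M}^{H^*}_A$, then $m \cdot a = ma$, for all $a \in A$, $m \in M$.

To prove (1), we compute:
\[
\begin{aligned}
a \to m
&= (p^1 \succ m) \cdot (p^2 \cdot a) = q^1 \succ [(S(q^2) p^2 \cdot a) \triangleright (p^1 \succ m)] \\
&\overset{(9.9)}{=} q^1 p^1_1 \succ [(S(p^1_2) S(q^2) p^2 \cdot a) \triangleright m] \\
&\overset{(2.38)}{=} a \triangleright m, \quad \text{q.e.d.}
\end{aligned}
\]
To prove (2), we compute:
\[
\begin{aligned}
m \cdot a
&= q^1 \succ [(S(q^2) \cdot a) \to m] = q^1 \succ [(p^1 \succ m)(p^2 S(q^2) \cdot a)] \\
&\overset{(9.11)}{=} (q^1_1 p^1 \succ m)(q^1_2 p^2 S(q^2) \cdot a) \\
&\overset{(2.37)}{=} ma,
\end{aligned}
\]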

and the proof is finished.

We will need the description of left modules over a two-sided smash product.

Definition 9.5. Let $H$ be a quasi-bialgebra, $A$ a left $H$-module algebra and $B$ a right $H$-module algebra. Define the category ${}_{A,H,B}\mathcal{M}$ as follows: an object in this category is a left $H$-module $M$, with action denoted by $h \otimes m \mapsto h \succ m$, and we have left weak actions of $A$ and $B$ on $M$, denoted by $a \otimes m \mapsto a \cdot m$ and $b \otimes m \mapsto b \cdot m$, such that:
(i) $M \in {}_{A \# H}\mathcal{M}$, that is the relations (9.3) and (9.4) hold;
(ii) $M \in {}_{H \# B}\mathcal{M}$, that is the relations (9.6) and (9.7) hold;
(iii) the following compatibility condition holds:
\[
b \cdot (a \cdot m) = (y^1 \cdot a) \cdot [y^2 \succ ((b \cdot y^3) \cdot m)], \tag{9.14}
\]
for all $a \in A$, $b \in B$, $m \in M$. The morphisms in this category are the $H$-linear maps compatible with the two weak actions.

Proposition 9.6. If $H$, $A$, $B$ are as above, then ${}_{A \# H \# B}\mathcal{M} \cong {}_{A,H,B}\mathcal{M}$, the isomorphism being given as follows:
• If $M \in {}_{A \# H \# B}\mathcal{M}$, define $a \cdot m = (a \# 1 \# 1) \cdot m$, $h \succ m = (1 \# h \# 1) \cdot m$, $b \cdot m = (1 \# 1 \# b) \cdot m$.
• Conversely, if $M \in {}_{A,H,B}\mathcal{M}$, define $(a \# h \# b) \cdot m = a \cdot (h \succ (b \cdot m))$.


Proof. Straightforward computation, using the formula for the multiplication in $A \# H \# B$. Let us point out how the condition (9.14) occurs:
\[
\begin{aligned}
b \cdot (a \cdot m) &= (1 \# 1 \# b) \cdot ((a \# 1 \# 1) \cdot m) \\
&= [(1 \# 1 \# b)(a \# 1 \# 1)] \cdot m \\
&= (y^1 \cdot a \,\#\, y^2 \,\#\, b \cdot y^3) \cdot m \\
&= (y^1 \cdot a) \cdot (y^2 \succ ((b \cdot y^3) \cdot m)),
\end{aligned}
\]
which is exactly (9.14).

Let $H$ be a finite dimensional quasi-bialgebra and $A$, $D$ two left $H$-module algebras. It is obvious that ${}_A\mathcal{M}^{H^*}$ coincides with the category of left $A$-modules within the monoidal category ${}_H\mathcal{M}$, and similarly $\mathcal{M}^{H^*}_D$ coincides with the category of right $D$-modules within ${}_H\mathcal{M}$. Hence, we can introduce the following new category:

Definition 9.7. If $H$, $A$, $D$ are as above, define ${}_A\mathcal{M}^{H^*}_D$ as the category of $A$-$D$-bimodules within the monoidal category ${}_H\mathcal{M}$, that is $M \in {}_A\mathcal{M}^{H^*}_D$ if and only if $M \in {}_A\mathcal{M}^{H^*}$, $M \in \mathcal{M}^{H^*}_D$ and the following relation holds:
\[
(a \cdot m) \cdot d = (X^1 \cdot a) \cdot [(X^2 \succ m) \cdot (X^3 \cdot d)], \tag{9.15}
\]
for all $a \in A$, $m \in M$, $d \in D$, where $a \otimes m \mapsto a \cdot m$ and $m \otimes d \mapsto m \cdot d$ are the weak actions.

Proposition 9.8. Let $H$ be a finite dimensional quasi-Hopf algebra and $A$, $D$ two left $H$-module algebras. Then we have an isomorphism of categories ${}_A\mathcal{M}^{H^*}_D \cong {}_{A \# H \# \overline{D}}\mathcal{M}$, where $\overline{D}$ is the right $H$-module algebra as in Proposition 9.1.

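Proof. Since ${}_A\mathcal{M}^{H^*} \cong {}_{A \# H}\mathcal{M}$ and $\mathcal{M}^{H^*}_D \cong {}_{H \# \overline{D}}\mathcal{M}$, the only thing left to prove is that the compatibility (9.14) in ${}_{A,H,\overline{D}}\mathcal{M}$ is equivalent to the compatibility (9.15) in ${}_A\mathcal{M}^{H^*}_D$. Let us first note the following easy consequences of (2.3), (2.6):
\[
X^1 p^1_1 \otimes X^2 p^1_2 \otimes X^3 p^2 = y^1 \otimes y^2_1 p^1 \otimes y^2_2 p^2 S(y^3), \tag{9.16}
\]
\[
q^1_1 y^1 \otimes q^1_2 y^2 \otimes S(q^2 y^3) = X^1 \otimes q^1 X^2_1 \otimes S(q^2 X^2_2) X^3, \tag{9.17}
\]
where $p_R = p^1 \otimes p^2$ and $q_R = q^1 \otimes q^2$ are the elements given by (2.34) for $\mathfrak{A} = H$.

Let now $M \in {}_A\mathcal{M}^{H^*}_D$, with right $D$-action on $M$ denoted by $m \otimes d \mapsto m \cdot d$. Then, by Proposition 9.4, the weak left $\overline{D}$-action on $M$ is given by $d \to m = (p^1 \succ m) \cdot (p^2 \cdot d)$. We check (9.14); we compute:
\[
\begin{aligned}
d \to (a \cdot m)
&= (p^1 \succ (a \cdot m)) \cdot (p^2 \cdot d) \\
&\overset{(9.4)}{=} [(p^1_1 \cdot a) \cdot (p^1_2 \succ m)] \cdot (p^2 \cdot d) \\
&\overset{(9.15)}{=} (X^1 p^1_1 \cdot a) \cdot [(X^2 p^1_2 \succ m) \cdot (X^3 p^2 \cdot d)] \\
&\overset{(9.16)}{=} (y^1 \cdot a) \cdot [(y^2_1 p^1 \succ m) \cdot (y^2_2 p^2 S(y^3) \cdot d)] \\
&\overset{(9.2)}{=} (y^1 \cdot a) \cdot [y^2 \succ ((p^1 \succ m) \cdot (p^2 S(y^3) \cdot d))] \\
&= (y^1 \cdot a) \cdot [y^2 \succ ((S(y^3) \cdot d) \to m)] \\
&= (y^1 \cdot a) \cdot [y^2 \succ ((d \cdot y^3) \to m)], \quad \text{q.e.d.}
\end{aligned}
\]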


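Conversely, assume that $M \in {}_{A \# H \# \overline{D}}\mathcal{M}$, and denote the actions of $A$, $H$, $\overline{D}$ on $M$ by $a \cdot m$, $h \succ m$, $d \cdot m$ respectively. Then, by Proposition 9.4, the right $D$-action on $M$ is given by $m \cdot d = q^1 \succ ((S(q^2) \cdot d) \cdot m)$. To check (9.15), we compute:
\[
\begin{aligned}
(a \cdot m) \cdot d
&= q^1 \succ [(S(q^2) \cdot d) \cdot (a \cdot m)] \\
&\overset{(9.14)}{=} q^1 \succ [(y^1 \cdot a) \cdot (y^2 \succ (((S(q^2) \cdot d) \cdot y^3) \cdot m))] \\
&= q^1 \succ [(y^1 \cdot a) \cdot (y^2 \succ ((S(q^2 y^3) \cdot d) \cdot m))] \\
&\overset{(9.4)}{=} (q^1_1 y^1 \cdot a) \cdot [q^1_2 y^2 \succ ((S(q^2 y^3) \cdot d) \cdot m)] \\
&\overset{(9.17)}{=} (X^1 \cdot a) \cdot [q^1 X^2_1 \succ ((S(q^2 X^2_2) X^3 \cdot d) \cdot m)] \\
&= (X^1 \cdot a) \cdot [q^1 X^2_1 \succ (((S(q^2) X^3 \cdot d) \cdot X^2_2) \cdot m)] \\
&\overset{(9.7)}{=} (X^1 \cdot a) \cdot [q^1 \succ ((S(q^2) X^3 \cdot d) \cdot (X^2 \succ m))] \\
&= (X^1 \cdot a) \cdot [(X^2 \succ m) \cdot (X^3 \cdot d)], \quad \text{q.e.d.}
\end{aligned}
\]
and the proof is finished.

Let $H$ be a finite dimensional quasi-bialgebra and $\mathcal{A}$, $\mathcal{D}$ two $H$-bimodule algebras. Define the category ${}^{H^*}_{\mathcal{A}}\mathcal{M}^{H^*}_{\mathcal{D}}$ as the category of $\mathcal{A}$-$\mathcal{D}$-bimodules within the monoidal category ${}_H\mathcal{M}_H$. By regarding $\mathcal{A}$ and $\mathcal{D}$ as left module algebras over $H \otimes H^{op}$, it is easy to see that ${}^{H^*}_{\mathcal{A}}\mathcal{M}^{H^*}_{\mathcal{D}} \cong {}_{\mathcal{A}}\mathcal{M}^{(H \otimes H^{op})^*}_{\mathcal{D}}$. Hence, as a consequence of Proposition 9.8, we finally obtain:

Theorem 9.9. If $H$ is a finite dimensional quasi-Hopf algebra and $\mathcal{A}$, $\mathcal{D}$ are two $H$-bimodule algebras, then we have an isomorphism of categories ${}^{H^*}_{\mathcal{A}}\mathcal{M}^{H^*}_{\mathcal{D}} \cong {}_{\mathcal{A} \# (H \otimes H^{op}) \# \overline{\mathcal{D}}}\mathcal{M}$. In particular, we have ${}^{H^*}_{H^*}\mathcal{M}^{H^*}_{H^*} \cong {}_{H^* \# (H \otimes H^{op}) \# \overline{H^*}}\mathcal{M}$.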

10. Yetter-Drinfeld Modules as Modules Over a Generalized Diagonal Crossed Product

If $H$ is a quasi-bialgebra, then the category of $(H, H)$-bimodules, ${}_H\mathcal{M}_H$, is monoidal. The associativity constraints are given by (2.48). A coalgebra in the category of $(H, H)$-bimodules will be called an $H$-bimodule coalgebra. More precisely, an $H$-bimodule coalgebra $C$ is an $(H, H)$-bimodule (denote the actions by $h \cdot c$ and $c \cdot h$) with a comultiplication $\underline{\Delta} : C \to C \otimes C$ and a counit $\underline{\varepsilon} : C \to k$ satisfying the following relations, for all $c \in C$ and $h \in H$:
\[
\Phi \cdot (\underline{\Delta} \otimes \mathrm{id})(\underline{\Delta}(c)) \cdot \Phi^{-1} = (\mathrm{id} \otimes \underline{\Delta})(\underline{\Delta}(c)), \tag{10.1}
\]
\[
\underline{\Delta}(h \cdot c) = h_1 \cdot c_1 \otimes h_2 \cdot c_2, \qquad \underline{\Delta}(c \cdot h) = c_1 \cdot h_1 \otimes c_2 \cdot h_2, \tag{10.2}
\]
\[
\underline{\varepsilon}(h \cdot c) = \varepsilon(h)\,\underline{\varepsilon}(c), \qquad \underline{\varepsilon}(c \cdot h) = \underline{\varepsilon}(c)\,\varepsilon(h), \tag{10.3}
\]
where we used the Sweedler-type notation $\underline{\Delta}(c) = c_1 \otimes c_2$. An example of an $H$-bimodule coalgebra is $H$ itself. Our next definition extends the definition of Yetter-Drinfeld modules from [18].

Definition 10.1. Let $H$ be a quasi-bialgebra, $C$ an $H$-bimodule coalgebra and $\mathbb{A}$ an $H$-bicomodule algebra. A left-right Yetter-Drinfeld module is a $k$-vector space $M$ with the following additional structure:


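- $M$ is a left $\mathbb{A}$-module; we write $\cdot$ for the left $\mathbb{A}$-action;
- we have a $k$-linear map $\rho_M : M \to M \otimes C$, $\rho_M(m) = m_{(0)} \otimes m_{(1)}$, called the right $C$-coaction on $M$, such that for all $m \in M$, $\underline{\varepsilon}(m_{(1)}) m_{(0)} = m$ and
\[
(\theta^2 \cdot m_{(0)})_{(0)} \otimes (\theta^2 \cdot m_{(0)})_{(1)} \cdot \theta^1 \otimes \theta^3 \cdot m_{(1)}
= \tilde{x}^1_\rho \cdot (\tilde{x}^3_\lambda \cdot m)_{(0)} \otimes \tilde{x}^2_\rho \cdot (\tilde{x}^3_\lambda \cdot m)_{(1)_1} \cdot \tilde{x}^1_\lambda \otimes \tilde{x}^3_\rho \cdot (\tilde{x}^3_\lambda \cdot m)_{(1)_2} \cdot \tilde{x}^2_\lambda, \tag{10.4}
\]
- the following compatibility relation holds:
\[
u_{\langle 0\rangle} \cdot m_{(0)} \otimes u_{\langle 1\rangle} \cdot m_{(1)} = (u_{[0]} \cdot m)_{(0)} \otimes (u_{[0]} \cdot m)_{(1)} \cdot u_{[-1]}, \tag{10.5}
\]
for all $u \in \mathbb{A}$, $m \in M$. ${}_{\mathbb{A}}\mathcal{YD}(H)^C$ will be the category of left-right Yetter-Drinfeld modules and maps preserving the actions by $\mathbb{A}$ and the coactions by $C$.

Let $H$ be a quasi-bialgebra, $\mathbb{A}$ an $H$-bicomodule algebra and $C$ an $H$-bimodule coalgebra. Let us call the triple $(H, \mathbb{A}, C)$ a Yetter-Drinfeld datum. We note that, for an arbitrary $H$-bimodule coalgebra $C$, the linear dual space of $C$, $C^*$, is an $H$-bimodule algebra. The multiplication of $C^*$ is the convolution, that is $(c^* d^*)(c) = c^*(c_1) d^*(c_2)$, the unit is $\underline{\varepsilon}$ and the left and right $H$-module structures are given by $(h \rightharpoonup c^* \leftharpoonup h')(c) = c^*(h' \cdot c \cdot h)$, for all $h, h' \in H$, $c^*, d^* \in C^*$, $c \in C$. In the rest of this section we establish that if $H$ is a quasi-Hopf algebra and $C$ is finite dimensional then the category ${}_{\mathbb{A}}\mathcal{YD}(H)^C$ is isomorphic to the category of left $C^* \bowtie \mathbb{A}$-modules, ${}_{C^* \bowtie \mathbb{A}}\mathcal{M}$. First some lemmas.

Lemma 10.2. Let $H$ be a quasi-Hopf algebra and $(H, \mathbb{A}, C)$ a Yetter-Drinfeld datum. We have a functor $F : {}_{\mathbb{A}}\mathcal{YD}(H)^C \to {}_{C^* \bowtie \mathbb{A}}\mathcal{M}$, given by $F(M) = M$ as $k$-module, with the $C^* \bowtie \mathbb{A}$-module structure defined by
\[
(c^* \bowtie u) m := \langle c^*, \tilde{q}^2_\rho \cdot (u \cdot m)_{(1)} \rangle\, \tilde{q}^1_\rho \cdot (u \cdot m)_{(0)}, \tag{10.6}
\]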

for all c∗ ∈ C ∗ , u ∈ A and m ∈ M, where q˜ρ = q˜ρ1 ⊗ q˜ρ2 is the element defined in (2.34). F transforms a morphism to itself. Proof. Let Q˜ 1ρ ⊗ Q˜ 2ρ be another copy of q˜ρ . For all c∗ , d ∗ ∈ C ∗ , u, u ∈ A and m ∈ M we compute: [(c∗ u)(d ∗ u )]m (3.21) =

=

[(1 c∗ 5 )(2 u 0[−1] d ∗ S −1 (u 1 )4 ) 3 u 0[0] u ]m

d ∗ , S −1 (u 1 )4 (q˜ρ2 )2 · (3 u 0[0] u · m)(1)2 · 2 u 0[−1]

c∗ , 5 (q˜ρ2 )1 · (3 u 0[0] u · m)(1)1 · 1 q˜ρ1 · (3 u 0[0] u · m)(0)

(3.15) =

2

1

1

2

d ∗ , S −1 ( f 1 X˜ ρ θ 3 u 1 )(q˜ρ2 )2 · (( X˜ ρ )[0] x˜ 3λ θ[0] u 0[0] u · m)(1)2 · ( X˜ ρ )[−1]2 3

1

2 2 ×x˜ 2λ θ[−1] u 0[−1] c∗ , S −1 ( f 2 X˜ ρ )(q˜ρ )1 · (( X˜ ρ )[0] x˜ 3λ θ[0] u 0[0] u · m)(1)1 1

2 ·x˜ 1λ θ 1 q˜ρ1 · (( X˜ ρ )[0] x˜ 3λ θ[0] u 0[0] u · m)(0)

(10.5,2.40) =

2 2

d ∗ , S −1 (θ 3 u 1 ) Q˜ 2ρ x˜ 3ρ · (x˜ 3λ θ[0] u 0[0] u · m)(1)2 · x˜ 2λ θ[−1] u 0[−1] 2

c∗ , q˜ρ2 ( Q˜ 1ρ ) 1 x˜ 2ρ · (x˜ 3λ θ[0] u 0[0] u · m)(1)1 · x˜ 1λ θ 1 2 q˜ρ1 ( Q˜ 1ρ ) 0 x˜ 1ρ · (x˜ 3λ θ[0] u 0[0] u · m)(0)

(10.4) =

3 2 2

d ∗ , S −1 (θ 3 u 1 ) Q˜ 2ρ θ · (θ[0] u 0[0] u · m)(1) · θ[−1] u 0[−1] 2

c∗ , q˜ρ2 ( Q˜ 1ρ ) 1 · [θ · (θ[0] u 0[0] u · m)(0) ](1) · θ θ 1 2

1

2 q˜ρ1 ( Q˜ 1ρ ) 0 · [θ · (θ[0] u 0[0] u · m)(0) ](0) 2

(10.5,2.45) =

3 2 3 2

d ∗ , S −1 (α X˜ ρ θ 3 u 1 ) X˜ ρ θ θ 1 u 0,1 · (u · m)(1) 1 1 2 2 1

c∗ , q˜ρ2 · [( X˜ ρ )[0] θ θ 0 u 0,0 · (u · m)(0) ](1) · ( X˜ ρ )[−1] θ θ 1 1 2 2 q˜ρ1 · [( X˜ ρ )[0] θ θ 0 u 0,0 · (u · m)(0) ](0)

(2.45,2.26) =

3

2

d ∗ , S −1 (αθ23 u 12 X˜ ρ )θ13 u 11 X˜ ρ · (u · m)(1) 1 1

c∗ , q˜ρ2 · [θ 2 u 0 X˜ ρ · (u · m)(0) ](1) · θ 1 q˜ρ1 · [θ 2 u 0 X˜ ρ · (u · m)(0) ](0)

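$$\overset{(2.6,2.34)}{=} \langle c^*, \tilde{q}^2_\rho \cdot [u \tilde{Q}^1_\rho \cdot (u' \cdot m)_{(0)}]_{(1)} \rangle\, \langle d^*, \tilde{Q}^2_\rho \cdot (u' \cdot m)_{(1)} \rangle\, \tilde{q}^1_\rho \cdot [u \tilde{Q}^1_\rho \cdot (u' \cdot m)_{(0)}]_{(0)}$$
$$\overset{(10.6)}{=} \langle d^*, \tilde{Q}^2_\rho \cdot (u' \cdot m)_{(1)} \rangle\, (c^* \bowtie u)[\tilde{Q}^1_\rho \cdot (u' \cdot m)_{(0)}] = (c^* \bowtie u)[(d^* \bowtie u')m],$$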

as needed. It is not hard to see that $(\varepsilon \bowtie 1_A) m = m$ for all $m \in M$, so $M$ is a left $C^* \bowtie A$-module. The fact that a morphism in ${}_A\mathcal{YD}(H)^C$ becomes a morphism in ${}_{C^* \bowtie A}\mathcal{M}$ can be proved more easily; we leave the details to the reader.

Lemma 10.3. Let $H$ be a quasi-Hopf algebra and $(H, A, C)$ a Yetter-Drinfeld datum and assume $C$ is finite dimensional. We have a functor $G : {}_{C^* \bowtie A}\mathcal{M} \to {}_A\mathcal{YD}(H)^C$, given by $G(M) = M$ as $k$-module, with structure maps defined by
$$u \cdot m = (\varepsilon \bowtie u) m, \tag{10.7}$$
$$\rho_M : M \to M \otimes C, \quad \rho_M(m) = \sum_{i=1}^n (c^i \bowtie (\tilde{p}^1_\rho)_{[0]}) m \otimes S^{-1}(\tilde{p}^2_\rho) \cdot c_i \cdot (\tilde{p}^1_\rho)_{[-1]}, \tag{10.8}$$
for $m \in M$ and $u \in A$. Here $\tilde{p}_\rho = \tilde{p}^1_\rho \otimes \tilde{p}^2_\rho$ is the element defined in (2.34), $\{c_i\}_{i=1,\dots,n}$ is a basis of $C$ and $\{c^i\}_{i=1,\dots,n}$ is the corresponding dual basis of $C^*$. $G$ transforms a morphism to itself.

Proof. The most difficult part of the proof is to show that $G(M)$ satisfies the relations (10.4) and (10.5). It is then straightforward to show that a map in ${}_{C^* \bowtie A}\mathcal{M}$ is also a map in ${}_A\mathcal{YD}(H)^C$, and that $G$ is a functor. It is not hard to see that (2.45), (2.6) and (2.46) imply

2

3

2 2 θ θ 1 ⊗ θ θ 0 p˜ ρ1 ⊗ θ θ 1 p˜ ρ2 S(θ 3 ) = ( p˜ ρ1 )[−1] ⊗ ( p˜ ρ1 )[0] ⊗ p˜ ρ2 .

(10.9)


Write p˜ ρ = p˜ ρ1 ⊗ p˜ ρ2 = P˜ρ1 ⊗ P˜ρ2 . For all m ∈ M we compute: (θ 2 · m (0) )(0) ⊗ (θ 2 · m (0) )(1) · θ 1 ⊗ θ 3 · m (1) n = ((ε θ 2 )(ci ( p˜ ρ1 )[0] )m)(0) ⊗ ((ε θ 2 )(ci ( p˜ ρ1 )[0] )m)(1) · θ 1 i=1

⊗θ 3 S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1] (3.21,10.8) =

n

2 (c j ( P˜ρ1 )[0] )(ci (θ 0 p˜ ρ1 )[0] )m ⊗ S −1 ( P˜ρ2 ) · c j · ( P˜ρ1 )[−1] θ 1

i, j=1 2 2 ⊗θ 3 S −1 (θ 1 p˜ ρ2 ) · ci · (θ 0 p˜ ρ1 )[−1] n 1 2 (3.21,3.15) 2 [c j ci ( X˜ ρ )[0] x˜ 3λ (θ ( P˜ρ1 )[0] 0 θ 0 p˜ ρ1 )[0] ]m = i, j=1 1

3 ⊗ S −1 ( f 2 X˜ ρ P˜ρ2 )

2 3

2 p˜ ρ2 ) · ci ·c j · ( X˜ ρ )[−1]1 x˜ 1λ θ ( P˜ρ1 )[−1] θ 1 ⊗ θ 3 S −1 ( f 1 X˜ ρ θ ( P˜ρ1 )[0] 1 θ 1 1

1

2 ·( X˜ ρ )[−1]2 x˜ 2λ (θ ( P˜ρ1 )[0] 0 θ 0 p˜ ρ1 )[−1] (2.43,10.9,2.30) =

2

n

1 3 [c j ci ( X˜ ρ ( P˜ρ1 ) 0 p˜ ρ1 )[0] x˜ 3λ ]m ⊗ S −1 ( f 2 X˜ ρ P˜ρ2 ) · c j

i, j=1 1 2 1 ·( X˜ ρ ( P˜ρ1 ) 0 p˜ ρ1 )[−1]1 x˜ 1λ ⊗ S −1 ( f 1 X˜ ρ ( P˜ρ1 ) 1 p˜ ρ2 ) · ci · ( X˜ ρ ( P˜ρ1 ) 0 p˜ ρ1 )[−1]2 x˜ 2λ n (2.39) [c j ci ((x˜ 1ρ ) 0 p˜ ρ1 )[0] x˜ 3λ ]m ⊗ x˜ 2ρ S −1 ( f 2 ((x˜ 1ρ ) 1 p˜ ρ2 )2 g 2 ) · c j = i, j=1

·((x˜ 1ρ ) 0 p˜ ρ1 )[−1]1 x˜ 1λ ⊗ x˜ 3ρ S −1 ( f 1 ((x˜ 1ρ ) 1 p˜ ρ2 )1 g 1 ) · ci · ((x˜ 1ρ ) 0 p˜ ρ1 )[−1]2 x˜ 2λ (2.11,10.2) =

n

[ci ((x˜ 1ρ ) 0 p˜ ρ1 )[0] x˜ 3λ ]m ⊗ x˜ 2ρ · (S −1 ((x˜ 1ρ ) 1 p˜ ρ2 ) · ci

i=1 1 ·((x˜ ρ ) 0 p˜ ρ1 )[−1] )1 · x˜ 1λ ⊗ x˜ 3ρ · (S −1 ((x˜ 1ρ ) 1 p˜ ρ2 ) · ci · ((x˜ 1ρ ) 0 p˜ ρ1 )[−1] )2 n = [(x˜ 1ρ ) 0[−1] ci S −1 ((x˜ 1ρ ) 1 ) ((x˜ 1ρ ) 0 p˜ ρ1 )[0] x˜ 3λ ]m i=1 2 ⊗x˜ ρ · (S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1] )1 · x˜ 1λ ⊗ x˜ 3ρ · (S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1] )2 n (3.21) [(ε x˜ 1ρ )(ci ( p˜ ρ1 )[0] )(ε x˜ 3λ )]m = i=1 2 ⊗x˜ ρ · (S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1] )1 · x˜ 1λ ⊗ x˜ 3ρ · (S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1] )2 (10.7,10.8) =

θρ1 · (x˜ 3λ · m)(0) ⊗ x˜ 2ρ · (x˜ 3λ · m)(1)1 · x˜ 1λ ⊗ x˜ 3ρ · (x˜ 3λ · m)(1)2 · x˜ 2λ .

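Similarly, we compute:
$$u_{\langle 0\rangle} \cdot m_{(0)} \otimes u_{\langle 1\rangle} \cdot m_{(1)} = \sum_{i=1}^n (\varepsilon \bowtie u_{\langle 0\rangle})(c^i \bowtie (\tilde{p}^1_\rho)_{[0]}) m \otimes u_{\langle 1\rangle} S^{-1}(\tilde{p}^2_\rho) \cdot c_i \cdot (\tilde{p}^1_\rho)_{[-1]}$$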

· x˜ 2λ

· x˜ 2λ

· x˜ 2λ

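$$\overset{(3.21)}{=} \sum_{i=1}^n \big(u_{\langle 0,0\rangle_{[-1]}} \rightharpoonup c^i \leftharpoonup S^{-1}(u_{\langle 0,1\rangle}) \bowtie u_{\langle 0,0\rangle_{[0]}} (\tilde{p}^1_\rho)_{[0]}\big) m \otimes u_{\langle 1\rangle} S^{-1}(\tilde{p}^2_\rho) \cdot c_i \cdot (\tilde{p}^1_\rho)_{[-1]}$$
$$= \sum_{i=1}^n \big(c^i \bowtie (u_{\langle 0,0\rangle} \tilde{p}^1_\rho)_{[0]}\big) m \otimes u_{\langle 1\rangle} S^{-1}(u_{\langle 0,1\rangle} \tilde{p}^2_\rho) \cdot c_i \cdot (u_{\langle 0,0\rangle} \tilde{p}^1_\rho)_{[-1]}$$
$$\overset{(2.35)}{=} \sum_{i=1}^n \big(c^i \bowtie (\tilde{p}^1_\rho u)_{[0]}\big) m \otimes S^{-1}(\tilde{p}^2_\rho) \cdot c_i \cdot (\tilde{p}^1_\rho u)_{[-1]}$$
$$\overset{(3.21)}{=} \sum_{i=1}^n \big(c^i \bowtie (\tilde{p}^1_\rho)_{[0]}\big)(\varepsilon \bowtie u_{[0]}) m \otimes S^{-1}(\tilde{p}^2_\rho) \cdot c_i \cdot (\tilde{p}^1_\rho)_{[-1]} u_{[-1]}$$
$$\overset{(10.8)}{=} (u_{[0]} \cdot m)_{(0)} \otimes (u_{[0]} \cdot m)_{(1)} \cdot u_{[-1]},$$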

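for all $u \in A$ and $m \in M$, and this finishes the proof.

The next result generalizes [13, Prop. 3.12], which is recovered by taking $C = A = H$.

Theorem 10.4. Let $H$ be a quasi-Hopf algebra and $(H, A, C)$ a Yetter-Drinfeld datum, assuming $C$ to be finite dimensional. Then the categories ${}_A\mathcal{YD}(H)^C$ and ${}_{C^* \bowtie A}\mathcal{M}$ are isomorphic.

Proof. We have to verify that the functors $F$ and $G$ defined in Lemmas 10.2 and 10.3 are inverse to each other. Let $M \in {}_A\mathcal{YD}(H)^C$. The structures on $G(F(M))$ (using first Lemma 10.2 and then Lemma 10.3) are denoted by $\cdot'$ and $\rho'_M$. For any $u \in A$ and $m \in M$ we have that
$$u \cdot' m = (\varepsilon \bowtie u) m = \langle \varepsilon, \tilde{q}^2_\rho \cdot (u \cdot m)_{(1)} \rangle\, \tilde{q}^1_\rho \cdot (u \cdot m)_{(0)} = u \cdot m$$
because $\varepsilon(h \cdot c) = \varepsilon(h)\varepsilon(c)$ and $\varepsilon(m_{(1)}) m_{(0)} = m$ for all $h \in H$, $c \in C$, $m \in M$. We now compute for $m \in M$ that
$$\rho'_M(m) = \sum_{i=1}^n (c^i \bowtie (\tilde{p}^1_\rho)_{[0]}) m \otimes S^{-1}(\tilde{p}^2_\rho) \cdot c_i \cdot (\tilde{p}^1_\rho)_{[-1]}$$
$$\overset{(10.6)}{=} \sum_{i=1}^n \langle c^i, \tilde{q}^2_\rho \cdot ((\tilde{p}^1_\rho)_{[0]} \cdot m)_{(1)} \rangle\, \tilde{q}^1_\rho \cdot ((\tilde{p}^1_\rho)_{[0]} \cdot m)_{(0)} \otimes S^{-1}(\tilde{p}^2_\rho) \cdot c_i \cdot (\tilde{p}^1_\rho)_{[-1]}$$
$$\overset{(10.5)}{=} \tilde{q}^1_\rho (\tilde{p}^1_\rho)_{\langle 0\rangle} \cdot m_{(0)} \otimes S^{-1}(\tilde{p}^2_\rho) \tilde{q}^2_\rho (\tilde{p}^1_\rho)_{\langle 1\rangle} \cdot m_{(1)}$$
$$\overset{(2.38)}{=} m_{(0)} \otimes m_{(1)} = \rho_M(m).$$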

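Conversely, take $M \in {}_{C^* \bowtie A}\mathcal{M}$. We want to show that $F(G(M)) = M$. If we denote the left $C^* \bowtie A$-action on $F(G(M))$ by $\to$, then, using Lemmas 10.2 and 10.3 we find, for all $c^* \in C^*$, $u \in A$ and $m \in M$:
$$(c^* \bowtie u) \to m = \langle c^*, \tilde{q}^2_\rho \cdot (u \cdot m)_{(1)} \rangle\, \tilde{q}^1_\rho \cdot (u \cdot m)_{(0)}$$
$$= \sum_{i=1}^n \langle c^*, \tilde{q}^2_\rho S^{-1}(\tilde{p}^2_\rho) \cdot c_i \cdot (\tilde{p}^1_\rho)_{[-1]} \rangle\, (\varepsilon \bowtie \tilde{q}^1_\rho)(c^i \bowtie (\tilde{p}^1_\rho)_{[0]})(\varepsilon \bowtie u) m$$
$$\overset{(3.21)}{=} \sum_{i=1}^n \langle c^*, \tilde{q}^2_\rho S^{-1}((\tilde{q}^1_\rho)_{\langle 1\rangle} \tilde{p}^2_\rho) \cdot c_i \cdot ((\tilde{q}^1_\rho)_{\langle 0\rangle} \tilde{p}^1_\rho)_{[-1]} \rangle\, (c^i \bowtie ((\tilde{q}^1_\rho)_{\langle 0\rangle} \tilde{p}^1_\rho)_{[0]})(\varepsilon \bowtie u) m$$
$$\overset{(2.37,3.21)}{=} (c^* \bowtie 1_A)(\varepsilon \bowtie u) m = (c^* \bowtie u) m,$$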

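and this finishes our proof.

There is a relation between the functor $F$ from Lemma 10.2 and the map $\Gamma$ as in Proposition 3.8.

Proposition 10.5. Let $H$ be a quasi-Hopf algebra, $(H, A, C)$ a Yetter-Drinfeld datum and $M$ an object in ${}_A\mathcal{YD}(H)^C$; consider the map $\Gamma : C^* \to C^* \bowtie A$ as in Proposition 3.8. Then the left $C^* \bowtie A$-module structure on $M$ given in Lemma 10.2 and the map $\Gamma$ are related by the formula
$$\Gamma(c^*) m = \langle c^*, m_{(1)} \rangle\, m_{(0)},$$
for all $c^* \in C^*$ and $m \in M$.

Proof. We compute:
$$\Gamma(c^*) m = \big((\tilde{p}^1_\rho)_{[-1]} \rightharpoonup c^* \leftharpoonup S^{-1}(\tilde{p}^2_\rho) \bowtie (\tilde{p}^1_\rho)_{[0]}\big) m$$
$$= \langle (\tilde{p}^1_\rho)_{[-1]} \rightharpoonup c^* \leftharpoonup S^{-1}(\tilde{p}^2_\rho),\ \tilde{q}^2_\rho \cdot ((\tilde{p}^1_\rho)_{[0]} \cdot m)_{(1)} \rangle\, \tilde{q}^1_\rho \cdot ((\tilde{p}^1_\rho)_{[0]} \cdot m)_{(0)}$$
$$= \langle c^*, S^{-1}(\tilde{p}^2_\rho) \tilde{q}^2_\rho \cdot ((\tilde{p}^1_\rho)_{[0]} \cdot m)_{(1)} \cdot (\tilde{p}^1_\rho)_{[-1]} \rangle\, \tilde{q}^1_\rho \cdot ((\tilde{p}^1_\rho)_{[0]} \cdot m)_{(0)}$$
$$\overset{(10.5)}{=} \langle c^*, S^{-1}(\tilde{p}^2_\rho) \tilde{q}^2_\rho (\tilde{p}^1_\rho)_{\langle 1\rangle} \cdot m_{(1)} \rangle\, \tilde{q}^1_\rho (\tilde{p}^1_\rho)_{\langle 0\rangle} \cdot m_{(0)}$$
$$\overset{(2.38)}{=} \langle c^*, m_{(1)} \rangle\, m_{(0)},$$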

finishing the proof.

References

1. Akrami, S. E., Majid, S.: Braided cyclic cocycles and nonassociative geometry. J. Math. Phys. 45, 3883–3911 (2004)
2. Albuquerque, H., Majid, S.: Quasialgebra structure of the octonions. J. Algebra 220, 188–224 (1999)
3. Altschuler, D., Coste, A.: Quasi-quantum groups, knots, three-manifolds and topological field theory. Commun. Math. Phys. 150, 83–107 (1992)
4. Beggs, E. J., Majid, S.: Quantization by cochain twists and nonassociative differentials. http://arxiv.org/list/math.QA/0506450, 2005
5. Bulacu, D., Panaite, F., Van Oystaeyen, F.: Quasi-Hopf algebra actions and smash products. Comm. Algebra 28, 631–651 (2000)
6. Bulacu, D., Nauwelaerts, E.: Relative Hopf modules for (dual) quasi-Hopf algebras. J. Algebra 229, 632–659 (2000)
7. Bulacu, D., Caenepeel, S.: Two-sided two-cosided Hopf modules and Doi-Hopf modules for quasi-Hopf algebras. J. Algebra 270, 55–95 (2003)
8. Caenepeel, S., Militaru, G., Zhu, S.: Crossed modules and Doi-Hopf modules. Israel J. Math. 100, 221–247 (1997)
9. Connes, A., Dubois-Violette, M.: Noncommutative finite dimensional manifolds I. Spherical manifolds and related examples. Commun. Math. Phys. 230, 539–579 (2002)
10. Dijkgraaf, R., Pasquier, V., Roche, P.: Quasi-Hopf algebras, group cohomology and orbifold models. Nucl. Phys. B Proc. Suppl. 18 B, 60–72 (1990)
11. Drinfeld, V. G.: Quasi-Hopf algebras. Leningrad Math. J. 1, 1419–1457 (1990)
12. Hausser, F., Nill, F.: Diagonal crossed products by duals of quasi-quantum groups. Rev. Math. Phys. 11, 553–629 (1999)
13. Hausser, F., Nill, F.: Doubles of quasi-quantum groups. Commun. Math. Phys. 199, 547–589 (1999)
14. Jara Martínez, P., López Peña, J., Panaite, F., Van Oystaeyen, F.: On iterated twisted tensor products of algebras. http://arxiv.org/list/math.QA/0511280, 2005
15. Jimbo, M., Konno, H., Odake, S., Shiraishi, J.: Quasi-Hopf twistors for elliptic quantum groups. Transform. Groups 4, 303–327 (1999)
16. Kassel, C.: Quantum Groups. Graduate Texts in Mathematics 155, Berlin: Springer Verlag, 1995
17. Mack, G., Schomerus, V.: Action of truncated quantum groups on quasi-quantum planes and a quasi-associative differential geometry and calculus. Commun. Math. Phys. 149, 513–548 (1992)
18. Majid, S.: Quantum double for quasi-Hopf algebras. Lett. Math. Phys. 45, 1–9 (1998)
19. Majid, S.: Foundations of quantum group theory. Cambridge: Cambridge Univ. Press, 1995
20. Majid, S.: Gauge theory on nonassociative spaces. J. Math. Phys. 46, 103519 (2005), 23 pp
21. Panaite, F.: Hopf bimodules are modules over a diagonal crossed product algebra. Comm. Algebra 30, 4049–4058 (2002)
22. Schauenburg, P.: Hopf modules and the double of a quasi-Hopf algebra. Trans. Amer. Math. Soc. 354, 3349–3378 (2002)
23. Schauenburg, P.: Actions of monoidal categories and generalized Hopf smash products. J. Algebra 270, 521–563 (2003)
24. Sweedler, M. E.: Hopf algebras. New York: Benjamin, 1969
25. Wang, S.-H., Li, J.: On twisted smash products for bimodule algebras and the Drinfeld double. Comm. Algebra 26, 2435–2444 (1998)
26. Wang, S.-H.: Doi-Koppinen Hopf bimodules are modules. Comm. Algebra 29, 4671–4682 (2001)

Communicated by A. Connes

Commun. Math. Phys. 266, 401–430 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0017-1


Stability of Planar Stationary Solutions to the Compressible Navier-Stokes Equation on the Half Space

Yoshiyuki Kagei, Shuichi Kawashima

Faculty of Mathematics, Kyushu University, Fukuoka 812-8581, Japan. E-mail: [email protected]; [email protected]

Received: 1 August 2005 / Accepted: 9 December 2005
Published online: 29 April 2006 – © Springer-Verlag 2006

Abstract: Stability of planar stationary solutions to the compressible Navier-Stokes equation on the half space $\mathbb{R}^n_+$ ($n \ge 2$) under outflow boundary condition is investigated. It is shown that the planar stationary solution is stable with respect to small perturbations in $H^s(\mathbb{R}^n_+)$ with $s \ge [n/2] + 1$ and the perturbations decay in $L^\infty$ norm as $t \to \infty$, provided that the magnitude of the stationary solution is sufficiently small. The stability result is proved by the energy method. In the proof an energy functional based on the total energy of the system plays an important role.

1. Introduction

This paper studies large time behavior of solutions to the compressible Navier-Stokes equation on the half space $\mathbb{R}^n_+$ ($n \ge 2$):
$$\begin{aligned} &\partial_t \rho + \operatorname{div}(\rho u) = 0, \\ &\partial_t(\rho u) + \operatorname{div}(\rho u \otimes u) + \nabla p(\rho) = \mu \Delta u + (\mu + \mu') \nabla \operatorname{div} u, \\ &p(\rho) = K \rho^\gamma. \end{aligned} \tag{1.1}$$
Here $\mathbb{R}^n_+ = \{x = (x_1, x');\ x' = (x_2, \dots, x_n) \in \mathbb{R}^{n-1},\ x_1 > 0\}$; $\rho = \rho(x, t)$ and $u = (u^1(x, t), \dots, u^n(x, t))$ denote the unknown density and velocity, respectively; $\mu$, $\mu'$, $K$ and $\gamma$ are constants satisfying $\mu > 0$, $\frac{2}{n}\mu + \mu' \ge 0$, $K > 0$ and $\gamma > 1$.

We consider (1.1) under the initial condition
$$(\rho, u)|_{t=0} = (\rho_0, u_0) \tag{1.2}$$
and the outflow boundary condition on $\{x_1 = 0\}$,
$$u|_{x_1=0} = (u_b^1, 0, \dots, 0), \tag{1.3}$$


where $u_b^1$ is a constant satisfying $u_b^1 < 0$, together with the boundary condition at infinity $x_1 = \infty$,
$$\rho \to \rho_+, \quad u \to (u_+^1, 0, \dots, 0) \quad (x_1 \to \infty), \tag{1.4}$$
where $\rho_+$ and $u_+^1$ are constants satisfying $\rho_+ > 0$.

As is easily imagined, the large time behavior of solutions of (1.1)–(1.4) depends heavily on the values of the boundary data $u_b^1$, $\rho_+$ and $u_+^1$. In this paper we are interested in the situation where (1.1)–(1.4) admits a planar stationary solution, i.e., a stationary solution $(\tilde\rho, \tilde u)$ which depends only on $x_1$ and for which $\tilde u$ has the form $\tilde u = (\tilde u^1(x_1), 0, \dots, 0)$.

Kawashima, Nishibata and Zhu [5] investigated the conditions on $\rho_+$, $u_+^1$ and $u_b^1$ under which planar stationary motions occur. They proved that there exists a planar stationary solution $(\tilde\rho, \tilde u)$ if and only if $u_+^1 < 0$ and the Mach number at infinity $x_1 = \infty$ is greater than or equal to 1. Furthermore, it was shown in [5] that $(\tilde\rho, \tilde u)$ is asymptotically stable with respect to small one-dimensional perturbations, i.e., perturbations of the form $\rho - \tilde\rho = \rho(x_1, t) - \tilde\rho(x_1)$, $u - \tilde u = (u^1(x_1, t) - \tilde u^1(x_1), 0, \dots, 0)$, provided that $|u_+^1 - u_b^1|$ is sufficiently small.

In this paper we show that $(\tilde\rho, \tilde u)$ is stable under multi-dimensional perturbations small in $H^s(\mathbb{R}^n_+)$ and that the perturbations decay in $L^\infty$ norm as $t \to \infty$, provided that $|u_+^1 - u_b^1|$ is sufficiently small. Here $s$ is an integer satisfying $s \ge [n/2] + 1$.

Our stability theorem is proved by showing the local existence of solutions and deriving a suitable a priori estimate. The local existence is proved by applying the local $H^s$-solvability theorem in [4]. We derive our $H^s$ a priori estimate by the energy method. The point in deriving the a priori estimate is to obtain a suitable $L^2$-energy bound. In order to do so we employ the energy functional based on the total energy, which is the same as in the one-dimensional case in [5]; in fact, the energy functional works in the multi-dimensional problem exactly as in [5] to yield the $L^2$-energy bound. This is due to the fact that the stationary solutions do not have any shear components. Once we get the $L^2$-energy bound, we proceed to obtain the estimates for derivatives by the energy method as in [3, 7]. This part is entirely different from the computation in the one-dimensional case: we derive the estimates for tangential and normal derivatives separately, using certain hyperbolic-parabolic aspects of the system and the estimate for the inhomogeneous stationary Stokes problem; the bootstrap argument then yields the desired $H^s$-energy bound. In contrast to [3, 7], however, we do not regard the system as a perturbation from the linearization at infinity $x_1 = \infty$. Instead we keep the principal part of the perturbation equation in its own quasilinear form and apply some commutator estimates. This greatly simplifies the argument.

In this paper we consider the large time behavior of solutions of (1.1)–(1.4) only under the conditions on $\rho_+$, $u_b^1$ and $u_+^1$ for which planar stationary solutions exist. If one of these conditions is violated, complicated phenomena may occur. In fact, Matsumura [6] proposed a classification of all possible time asymptotic states in terms of boundary data for the one-dimensional problem. Some parts of this classification have already been proved rigorously. See [6] and references therein.

This paper is organized as follows. In Sect. 2 we review some properties of planar stationary solutions obtained in [5]. We then state our stability theorem. The proof of the theorem will be given in Sects. 3–5. In Sect. 3 we transform problem (1.1)–(1.4) into the initial boundary value problem for the perturbation. We then discuss the local existence and present our $H^s$ a priori estimate. Section 4 is devoted to the proof of the


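a priori estimate. We finally prove decay of perturbations in $L^\infty$ norm in Sect. 5 based on the a priori estimate.

2. Stability Result

We first consider the one-dimensional stationary problem whose solutions represent planar stationary motions in $\mathbb{R}^n_+$. We look for a smooth stationary solution $(\tilde\rho, \tilde u)$ of (1.1)–(1.4) of the form $\tilde\rho = \tilde\rho(x_1) > 0$ and $\tilde u = (\tilde u^1(x_1), 0, \dots, 0)$. Then the problem for $(\tilde\rho, \tilde u^1)$ is written as
$$\begin{aligned} &(\tilde\rho\, \tilde u^1)_{x_1} = 0 \quad (x_1 > 0), \\ &\big(\tilde\rho\, (\tilde u^1)^2 + p(\tilde\rho)\big)_{x_1} = (2\mu + \mu')\, \tilde u^1_{x_1 x_1} \quad (x_1 > 0), \\ &\tilde u^1|_{x_1 = 0} = u_b^1, \\ &\tilde\rho \to \rho_+, \quad \tilde u^1 \to u_+^1 \quad (x_1 \to \infty), \end{aligned} \tag{2.1}$$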

where the subscript $x_1$ stands for differentiation in $x_1$. Kawashima, Nishibata and Zhu [5] investigated problem (2.1) and gave a necessary and sufficient condition for the existence of solutions. Following [5], we introduce the Mach number at infinity defined by
$$M_+ \equiv \frac{|u_+^1|}{\sqrt{p'(\rho_+)}}.$$
We also set
$$\delta \equiv |u_+^1 - u_b^1|,$$
which measures the strength of the stationary solution.

Proposition 2.1 ([5]). Let $u_+^1 < 0$. Then problem (2.1) has a smooth solution $(\tilde\rho, \tilde u^1)$ if and only if $M_+ \ge 1$ and $w_c u_+^1 > u_b^1$, where $w_c$ is a certain positive number. The solution $(\tilde\rho, \tilde u^1)$ is monotonic; in particular, $\tilde u^1(x_1)$ is monotonically increasing when $M_+ = 1$. Furthermore, $(\tilde\rho, \tilde u^1)$ has the following decay properties as $x_1 \to \infty$.
(i) If $M_+ > 1$, then for any nonnegative integer $k$ there exists a constant $C > 0$ such that
$$\big|\partial_{x_1}^k (\tilde\rho - \rho_+,\ \tilde u^1 - u_+^1)\big| \le C \delta e^{-\sigma x_1}$$
for some positive constant $\sigma$.
(ii) If $M_+ = 1$, then for any nonnegative integer $k$ there exists a constant $C > 0$ such that
$$\big|\partial_{x_1}^k (\tilde\rho - \rho_+,\ \tilde u^1 - u_+^1)\big| \le C \frac{\delta^{k+1}}{(1 + \delta x_1)^{k+1}}.$$
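The threshold $w_c$ in Proposition 2.1 can be explored numerically. The following is a minimal sketch, not taken from the paper: it assumes the pressure law $p(\rho) = K\rho^\gamma$ from (1.1) and uses the function $H(w)$ given in the Remark that follows, locating its zero other than $w = 1$ by bisection. The function names (`mach_number`, `second_zero`, `stationary_solution_exists`) and the sample parameter values are illustrative assumptions.

```python
import math

def mach_number(u_plus, rho_plus, K, gamma):
    """M_+ = |u_+^1| / sqrt(p'(rho_+)) for p(rho) = K * rho**gamma."""
    sound_speed = math.sqrt(K * gamma * rho_plus ** (gamma - 1.0))
    return abs(u_plus) / sound_speed

def H(w, u_plus, rho_plus, K, gamma):
    """H(w) as written in the Remark; only its zeros are used below."""
    return (K * rho_plus ** gamma * (1.0 - w ** (-gamma))
            - rho_plus * u_plus ** 2 * (w - 1.0))

def second_zero(u_plus, rho_plus, K, gamma, tol=1e-12):
    """Zero w_c of H other than w = 1, found by bisection (hypothetical helper)."""
    args = (u_plus, rho_plus, K, gamma)
    M = mach_number(*args)
    if abs(M - 1.0) < 1e-12:
        return 1.0  # sonic case: w = 1 is a double zero
    # H is concave with H(1) = 0; its maximiser w_star satisfies H'(w_star) = 0.
    w_star = M ** (-2.0 / (gamma + 1.0))
    a = b = w_star
    factor = 0.5 if w_star < 1.0 else 2.0
    while H(b, *args) >= 0.0:   # move b away from w = 1 until H(b) < 0
        b *= factor
    while abs(b - a) > tol:     # invariant: H(a) > 0 > H(b)
        mid = 0.5 * (a + b)
        if H(mid, *args) > 0.0:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)

def stationary_solution_exists(u_plus, u_b, rho_plus, K, gamma):
    """Existence criterion of Proposition 2.1: u_+ < 0, M_+ >= 1, w_c * u_+ > u_b."""
    if u_plus >= 0.0 or mach_number(u_plus, rho_plus, K, gamma) < 1.0:
        return False
    return second_zero(u_plus, rho_plus, K, gamma) * u_plus > u_b
```

For instance, with the assumed values $K = \rho_+ = 1$, $\gamma = 1.4$ and $u_+^1 = -2$ the flow at infinity is supersonic and the computed $w_c$ lies below 1, whereas for $u_+^1 = -0.5$ (subsonic) it lies above 1.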


Remark. The constant $w_c$ in Proposition 2.1 is determined in the following way. From (2.1) one can see that $\rho_+/\tilde\rho = \tilde u^1/u_+^1$, and hence, by introducing the new unknown variable $w = \rho_+/\tilde\rho$, problem (2.1) is reduced to
$$(2\mu + \mu')\, u_+^1\, w_{x_1} = H(w) \quad (x_1 > 0), \qquad w(x_1) \to 1 \quad (x_1 \to \infty),$$
where
$$H(w) = K\rho_+^\gamma \big(1 - w^{-\gamma}\big) - \rho_+ (u_+^1)^2 (w - 1).$$
Clearly, $H(1) = 0$, and one can also see that $H(w)$ has another zero, which we denote by $w_c$. Furthermore, it holds that $M_+ \gtrless 1$ if and only if $w_c \lessgtr 1$. See [5] for the details.

Our concern in this paper is to investigate the stability properties of the stationary solution $(\tilde\rho, \tilde u)$ with respect to multi-dimensional perturbations. To state our stability result, we introduce function spaces. We denote by $L^p$ the usual Lebesgue space on $\mathbb{R}^n_+$. The norm of the $L^p$ space is denoted by $\|\cdot\|_p$ and the inner product of $L^2$ is defined by
$$(f, g) = \int_{\mathbb{R}^n_+} f g \, dx, \qquad f, g \in L^2.$$

For a nonnegative integer $m$ we denote by $H^m$ the usual $m$th order $L^2$ Sobolev space on $\mathbb{R}^n_+$ with norm $\|\cdot\|_{H^m}$. The symbol $C_0^m$ stands for the set of all $C^m$ functions which have compact support in $\mathbb{R}^n_+$. We denote by $H_0^1$ the completion of $C_0^1$ in $H^1$, and the dual space of $H_0^1$ is denoted by $H^{-1}$.

For $0 < T \le \infty$ and a nonnegative integer $\sigma$, we define the Banach space $Z^\sigma(T) = X^\sigma(T) \times Y^\sigma(T)^n$, where
$$X^\sigma(T) = \bigcap_{j=0}^{[\sigma/2]} C^j\big([0, T]; H^{\sigma - 2j}\big)$$
and
$$Y^\sigma(T) = X^\sigma(T) \cap \bigcap_{j=0}^{[(\sigma+1)/2]} H^j\big(0, T; \widetilde{H}^{\sigma + 1 - 2j}\big).$$
Here $\widetilde{H}^m = H^m \cap H_0^1$ when $m \ge 1$ and $\widetilde{H}^m = L^2$ when $m = 0$. The norm of $Z^\sigma(T)$ is defined by
$$\|(\phi, \psi)\|_{Z^\sigma(T)} = \|\phi\|_{X^\sigma(T)} + \|\psi\|_{Y^\sigma(T)},$$
where
$$\|\phi\|_{X^\sigma(T)} = \sup_{0 \le t \le T} |[\phi(t)]|_\sigma, \qquad \|\psi\|_{Y^\sigma(T)} = \Big( \|\psi\|_{X^\sigma(T)}^2 + \int_0^T |[\psi(t)]|_{\sigma+1}^2 \, dt \Big)^{1/2}$$
with
$$|[\phi(t)]|_\sigma = \Big( \sum_{j=0}^{[\sigma/2]} \big\|\partial_t^j \phi(t)\big\|_{H^{\sigma - 2j}}^2 \Big)^{1/2}.$$

We simply write $Z^\sigma$, $X^\sigma$ and $Y^\sigma$ when $T = \infty$.

Theorem 2.2. Let $s$ be an integer satisfying $s \ge s_0 \equiv [n/2] + 1$ and let $(\tilde\rho, \tilde u)$ be the solution of (2.1). Then there exists a positive number $\delta_0$ such that if $|u_+^1 - u_b^1| < \delta_0$, then $(\tilde\rho, \tilde u)$ is stable with respect to perturbations small in $H^s(\mathbb{R}^n_+)$ in the following sense: there exist $\varepsilon_0 > 0$ and $C > 0$ such that if the initial perturbation $(\rho(0) - \tilde\rho, u(0) - \tilde u)$ satisfies $\|(\rho(0) - \tilde\rho, u(0) - \tilde u)\|_{H^s} \le \varepsilon_0$ and a suitable compatibility condition, then the perturbation $(\rho(t) - \tilde\rho, u(t) - \tilde u)$ exists in $Z^s$, and it satisfies
$$\|(\rho(t) - \tilde\rho, u(t) - \tilde u)\|_{H^s} \le C \|(\rho(0) - \tilde\rho, u(0) - \tilde u)\|_{H^s}$$
for all $t \ge 0$ and
$$\lim_{t \to \infty} \|\partial_x(\rho(t) - \tilde\rho, u(t) - \tilde u)\|_{H^{s-1}} = 0.$$
In particular,
$$\lim_{t \to \infty} \|(\rho(t) - \tilde\rho, u(t) - \tilde u)\|_\infty = 0.$$

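The stability result in Theorem 2.2 can be proved by combining Proposition 3.1 (local existence) and Proposition 3.2 (a priori estimate). The decay property in $L^\infty$ norm will be proved in Sect. 5.

3. Reformulation of the Problem

Let us rewrite the problem as one for the perturbation. We set $(\phi, \psi) = (\rho - \tilde\rho, u - \tilde u)$. Then problem (1.1)–(1.4) is transformed into
$$\begin{aligned} &\partial_t \phi + u \cdot \nabla \phi + \rho \operatorname{div} \psi = f, \\ &\rho(\partial_t \psi + u \cdot \nabla \psi) + L\psi + p'(\rho) \nabla \phi = g, \\ &\psi|_{x_1 = 0} = 0; \quad (\phi, \psi) \to (0, 0) \quad (x_1 \to \infty), \\ &(\phi, \psi)|_{t=0} = (\phi_0, \psi_0), \end{aligned} \tag{3.1}$$
where
$$\begin{aligned} L\psi &= -\mu \Delta \psi - (\mu + \mu') \nabla \operatorname{div} \psi, \\ f &= f(\phi, \psi) = -\psi \cdot \nabla \tilde\rho - \phi \operatorname{div} \tilde u, \\ g &= g(\phi, \psi) = -(\rho \psi + \phi \tilde u) \cdot \nabla \tilde u - \big(p'(\rho) - p'(\tilde\rho)\big) \nabla \tilde\rho. \end{aligned}$$
The proof of Theorem 2.2 is thus reduced to showing the global existence of a solution $(\phi, \psi)$ of (3.1) in the class $Z^s$, where $s$ is an integer satisfying $s \ge [n/2] + 1$. In this section we will first show the local existence and then give a suitable a priori estimate.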


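3.1. Local existence. Let us first consider the local existence of solutions. The local existence can be proved by applying the result in [4]. In fact, one can easily see that problem (3.1) is a hyperbolic-parabolic system satisfying the assumptions of the local solvability theorem in [4]. To state the local existence of solutions precisely, let us mention the compatibility condition for the initial value $(\phi_0, \psi_0)$. Let $(\phi, \psi)$ be a smooth solution of (3.1). Then $(\partial_t^j \phi, \partial_t^j \psi)$ ($j \ge 1$) is inductively determined by
$$\partial_t^j \phi = -u \cdot \nabla \partial_t^{j-1} \phi - \rho \operatorname{div} \partial_t^{j-1} \psi - \big\{ [\partial_t^{j-1}, u \cdot \nabla] \phi + [\partial_t^{j-1}, \rho \operatorname{div}] \psi \big\} + \partial_t^{j-1}\big(f(\phi, \psi)\big)$$
and
$$\partial_t^j \psi = -\rho^{-1} \big\{ L \partial_t^{j-1} \psi + p'(\rho) \nabla \partial_t^{j-1} \phi \big\} - \rho^{-1} \big\{ [\partial_t^{j-1}, \rho] \partial_t \psi + [\partial_t^{j-1}, p'(\rho) \nabla] \phi \big\} - \rho^{-1} \partial_t^{j-1}(\rho u \cdot \nabla \psi) + \rho^{-1} \partial_t^{j-1}\big(g(\phi, \psi)\big).$$
Here $[C, D] = CD - DC$ is the commutator of $C$ and $D$.

From these relations we see that $(\partial_t^j \phi, \partial_t^j \psi)|_{t=0}$ is inductively given by $(\phi_0, \psi_0)$ in the following way:
$$(\partial_t^j \phi, \partial_t^j \psi)|_{t=0} = (\phi_j, \psi_j),$$
where
$$\phi_j = -u_0 \cdot \nabla \phi_{j-1} - \rho_0 \operatorname{div} \psi_{j-1} - \sum_{\ell=1}^{j-1} \binom{j-1}{\ell} \big\{ \psi_\ell \cdot \nabla \phi_{j-1-\ell} + \phi_\ell \operatorname{div} \psi_{j-1-\ell} \big\} + F_{j-1}(\phi_0, \psi_0; \phi_1, \dots, \phi_{j-1}, \psi_1, \dots, \psi_{j-1}),$$
$$\psi_j = -\rho_0^{-1} \big\{ L \psi_{j-1} + p'(\rho_0) \nabla \phi_{j-1} \big\} - \rho_0^{-1} \sum_{\ell=1}^{j-1} \binom{j-1}{\ell} \big\{ \phi_\ell \psi_{j-\ell} + a_\ell(\phi_0; \phi_1, \dots, \phi_\ell) \nabla \phi_{j-1-\ell} \big\} + \rho_0^{-1} G_{j-1}(\phi_0, \psi_0, \partial_x \psi_0; \phi_1, \dots, \phi_{j-1}, \psi_1, \dots, \psi_{j-1}, \partial_x \psi_1, \dots, \partial_x \psi_{j-1}).$$
Here $\rho_0 = \tilde\rho + \phi_0$; $u_0 = \tilde u + \psi_0$; $a_\ell(\phi_0; \phi_1, \dots, \phi_\ell)$ is a certain polynomial in $\phi_1, \dots, \phi_\ell$; and so on.

The boundary condition $\psi|_{x_1=0} = 0$ in (3.1) implies that $\partial_t^j \psi|_{x_1=0} = 0$, and therefore, we have $\psi_j|_{x_1=0} = 0$.

Assume that $(\phi, \psi)$ is a solution of (3.1) in $Z^s(T)$. Then, from the above observation, we need $(\phi_j, \psi_j) \in H^{s-2j}$ for $j = 0, \dots, [s/2]$, which can be verified by Lemmas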


3.3, 3.4 and 3.10 below, provided that $(\phi_0, \psi_0) \in H^s$ with $s \ge s_0$. Furthermore, it is necessary to require that $(\phi_0, \psi_0)$ satisfies the $\hat{s}$th order compatibility condition:
$$\psi_j \in H_0^1 \quad \text{for } j = 0, 1, \dots, \hat{s} = \Big[\frac{s-1}{2}\Big].$$

Proposition 3.1. Let $s$ be an integer satisfying $s \ge s_0$. Assume that the initial value $(\phi_0, \psi_0)$ satisfies the following conditions:
(a) $(\phi_0, \psi_0) \in H^s$ and $(\phi_0, \psi_0)$ satisfies the $\hat{s}$th order compatibility condition, where $\hat{s} = \big[\frac{s-1}{2}\big]$.
(b) $\inf_x \phi_0(x) \ge -\frac{1}{2} \inf_{x_1} \tilde\rho(x_1)$.
Then there exists a positive number $T_0$ depending on $\|(\phi_0, \psi_0)\|_{H^s}$ and $\inf_{x_1} \tilde\rho(x_1)$ such that problem (3.1) has a unique solution $(\phi, \psi) \in Z^s(T_0)$ satisfying $\phi(x, t) \ge -\frac{3}{4} \inf_{x_1} \tilde\rho(x_1)$ for all $(x, t) \in \mathbb{R}^n_+ \times [0, T_0]$. Furthermore, the inequality
$$\|(\phi, \psi)\|_{Z^s(T_0)}^2 \le C \big\{ 1 + \|(\phi_0, \psi_0)\|_{H^s}^2 \big\}^a \|(\phi_0, \psi_0)\|_{H^s}^2$$
holds for some constants $C > 0$ and $a > 0$ depending only on $s$, $\|(\phi_0, \psi_0)\|_{H^s}$ and $\inf_{x_1} \tilde\rho(x_1)$.

3.2. A priori estimates. The global existence of solutions of (3.1) follows from Proposition 3.1 and the a priori estimate given in Proposition 3.2 below. To state our a priori estimate we introduce some notation. We define $E_\sigma(t)$ and $D_\sigma(t)$ by
$$E_\sigma(t) = \sup_{0 \le \tau \le t} \big( |[\phi(\tau)]|_\sigma^2 + |[\psi(\tau)]|_\sigma^2 \big)^{1/2}$$
and
$$D_\sigma(t) = \begin{cases} \Big( \displaystyle\int_0^t |||D\psi|||_0^2 + \|\phi|_{x_1=0}\|_{L^2(\mathbb{R}^{n-1})}^2 \, d\tau \Big)^{1/2} & \text{for } \sigma = 0, \\[2ex] \Big( \displaystyle\int_0^t |||D\phi|||_{\sigma-1}^2 + |||D\psi|||_\sigma^2 + \|\phi|_{x_1=0}\|_{L^2(\mathbb{R}^{n-1})}^2 \, d\tau \Big)^{1/2} & \text{for } \sigma \ge 1. \end{cases}$$
Here and in what follows
$$|||Dv(t)|||_\sigma = \begin{cases} \|\partial_x v(t)\|_2 & \text{for } \sigma = 0, \\ \big( |[\partial_x v(t)]|_\sigma^2 + |[\partial_t v(t)]|_{\sigma-1}^2 \big)^{1/2} & \text{for } \sigma \ge 1. \end{cases}$$
From now on we fix a $T > 0$, and $(\phi, \psi)$ will denote the solution of (3.1) belonging to $Z^s(T)$. We also write $(\rho, u) = (\tilde\rho + \phi, \tilde u + \psi)$.

Proposition 3.2. There exist constants $K > 0$, $\varepsilon_0 > 0$ and $C > 0$, which are independent of $T > 0$, such that if $E_s(t) < K$ for all $t \in [0, T]$, then
$$E_s(t)^2 + D_s(t)^2 \le C \|(\phi_0, \psi_0)\|_{H^s}^2$$
and
$$\inf_x \phi(x, t) \ge -\frac{1}{2} \inf_{x_1} \tilde\rho(x_1)$$
for all $t \in [0, T]$, provided that $\|(\phi_0, \psi_0)\|_{H^s} < \varepsilon_0$.

The proof of Proposition 3.2 will be given in the next section.


4. Proof of A Priori Estimate

In this section we prove the a priori estimate given in Proposition 3.2. For this purpose we first prepare some auxiliary lemmas.

4.1. Auxiliary lemmas. In the proof of Proposition 3.2 we will frequently use the following lemmas.

Lemma 4.1. Let $2 \le p \le \infty$ and let $j$ and $k$ be integers satisfying
$$0 \le j < k, \qquad k > j + n\Big(\frac{1}{2} - \frac{1}{p}\Big).$$
Then there exists a constant $C > 0$ such that
$$\big\|\partial_x^j f\big\|_p \le C \|f\|_2^{1-a} \big\|\partial_x^k f\big\|_2^a,$$
where $a = \frac{1}{k}\big(j + \frac{n}{2} - \frac{n}{p}\big)$.

Lemma 4.1 can be proved by using the Fourier transform and an extension operator in a standard way. We omit the proof (cf. [1]).

To control the energy supplied by the stationary flow, we will use the following lemma, which is essentially the same as [5, Lemma 3.3].

Lemma 4.2. Let $\tilde w$ denote $\tilde\rho - \rho_+$ or $\tilde u - u_+$.
(i) If $M_+ \ge 1$, then for any integers $k \ge 1$ and $\ell \ge 0$ there exists a positive constant $C$ such that
$$\big\|f \partial_{x_1}^k \tilde w\big\|_{H^\ell} \le C\delta \big\{ \|\partial_{x_1} f\|_{H^{(\ell-1)_+}} + \|f|_{x_1=0}\|_{L^2(\mathbb{R}^{n-1})} \big\} \qquad (f \in H^1 \cap H^\ell).$$
(ii) If $M_+ > 1$, then for any integer $k \ge 0$ there exists a positive constant $C$ such that
$$\big|\big(\partial_{x_1}^k \tilde w\, f, g\big)\big| \le C\delta \big\{ \|\partial_{x_1} f\|_2 + \|f|_{x_1=0}\|_{L^2(\mathbb{R}^{n-1})} \big\} \times \big\{ \|\partial_{x_1} g\|_2 + \|g|_{x_1=0}\|_{L^2(\mathbb{R}^{n-1})} \big\} \qquad (f, g \in H^1).$$
The same estimate also holds for $k \ge 2$ if $M_+ = 1$.

The proof of Lemma 4.2 is similar to that of [5, Lemma 3.3]. We omit the proof.
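The exponent bookkeeping in Lemma 4.1 is easy to get wrong when it is applied repeatedly. The sketch below is an illustration only (the helper names are hypothetical and not from the paper): it evaluates the interpolation exponent $a$ exactly with rational arithmetic, checks the hypotheses of the lemma, and also computes the threshold $s_0 = [n/2] + 1$ from Theorem 2.2. The convention that `p=None` stands for $p = \infty$ is an assumption of this sketch.

```python
from fractions import Fraction

def sobolev_threshold(n):
    """s_0 = [n/2] + 1, the smallest integer s with s > n/2."""
    return n // 2 + 1

def interpolation_exponent(n, j, k, p=None):
    """Exponent a = (j + n/2 - n/p) / k of Lemma 4.1; p=None means p = infinity."""
    inv_p = Fraction(0) if p is None else 1 / Fraction(p)
    if inv_p > Fraction(1, 2):
        raise ValueError("Lemma 4.1 requires 2 <= p <= infinity")
    gap = n * (Fraction(1, 2) - inv_p)
    if not (0 <= j < k and k > j + gap):
        raise ValueError("need 0 <= j < k and k > j + n(1/2 - 1/p)")
    return (j + gap) / k
```

For example, `interpolation_exponent(3, 0, 2)` returns `Fraction(3, 4)`, which corresponds to $\|f\|_\infty \le C \|f\|_2^{1/4} \|\partial_x^2 f\|_2^{3/4}$ in three space dimensions; taking $k = s_0$ always satisfies the hypothesis for $j = 0$, $p = \infty$.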


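Lemma 4.3. (i) Let $1 \le \sigma \le s$. Suppose that $F(x, t, y)$ is a smooth function on $\mathbb{R}^n_+ \times [0, T] \times I$, where $I$ is a compact interval in $\mathbb{R}$. Then for $|\alpha| + 2j = \sigma$, there hold
$$\big\|[\partial_x^\alpha \partial_t^j, F(x, t, f_1)] f_2\big\|_2 \le \begin{cases} C_0 |[f_2]|_{\sigma-1} + C_1 \big(1 + |||Df_1|||_{s-1}^{|\alpha|+j-1}\big) |||Df_1|||_{s-1} |[f_2]|_\sigma, \\ C_0 |[f_2]|_{\sigma-1} + C_1 \big(1 + |||Df_1|||_{s-1}^{|\alpha|+j-1}\big) |||Df_1|||_s |[f_2]|_{\sigma-1}, \end{cases}$$
and
$$\big\|[\partial_x^\alpha \partial_t^j, F(x, t, f_1)] f_2\big\|_{H^{-1}} \le C_0 |[f_2]|_{\sigma-1} + C_1 \big(1 + |||Df_1|||_{s-1}^{|\alpha|+j-1}\big) |||Df_1|||_{s-1} |[f_2]|_{\sigma-1}.$$
Here
$$C_0 = \sum_{\substack{(\beta, k) \le (\alpha, j) \\ (\beta, k) \ne (0, 0)}} \sup_{x, t, y} \big|\partial_x^\beta \partial_t^k F(x, t, y)\big|$$
and
$$C_1 = \sum_{\substack{(\beta, k) \le (\alpha, j) \\ 1 \le \ell \le j + |\alpha|}} \sup_{x, t, y} \big|\partial_x^\beta \partial_t^k \partial_y^\ell F(x, t, y)\big|.$$
(ii) Let $1 \le \sigma \le s$ and let $\tilde w$ denote $\tilde\rho$ or $\tilde u$. Then for $|\alpha| + 2j = \sigma$ there holds
$$\big|\big([\partial_x^\alpha \partial_t^j, \tilde w] f_1, f_2\big)\big| \le C\delta\, |[f_1]|_{\sigma-1} \big\{ \|\partial_x f_2\|_2 + \|f_2|_{x_1=0}\|_{L^2(\mathbb{R}^{n-1})} \big\}.$$

The proof of Lemma 4.3 will be given in the Appendix. A straightforward application of Lemmas 4.2 and 4.3 yields the following estimates for $f$ and $g$ appearing on the right-hand side of (3.1).

Lemma 4.4. Let $0 \le \sigma \le s$. Suppose that (3.2) is satisfied. Then for $|\alpha| + 2j = \sigma$ there hold
$$\big\|\partial_x^\alpha \partial_t^j f\big\|_2 \le C\delta \big\{ |||D\psi|||_{(\sigma-1)_+} + |||D\phi|||_{(\sigma-1)_+} + \|\phi|_{x_1=0}\|_{L^2(\mathbb{R}^{n-1})} \big\}$$
and
$$\big\|\partial_x^\alpha \partial_t^j g\big\|_2 \le C\delta \big\{ |||D\psi|||_{(\sigma-1)_+} + |||D\phi|||_{(\sigma-1)_+} + \|\phi|_{x_1=0}\|_{L^2(\mathbb{R}^{n-1})} \big\}.$$

We will also use the following standard interpolation inequality.

Lemma 4.5. Let $k \ge 1$ and let $1 \le |\alpha| \le k$. Then
$$\|\partial_x^\alpha v(t_2)\|_2^2 \le C \Big\{ \|\partial_x^\alpha v(t_1)\|_2^2 + \int_{t_1}^{t_2} \|\partial_\tau v\|_{H^{|\alpha|-1}} \|\partial_x v\|_{H^{|\alpha|}} \, d\tau \Big\}$$
for all $v \in Y^k(T)$ and $0 \le t_1 \le t_2 \le T$.

Proof. The inequality can be easily proved in the case of the whole space. In the case of the half space one can prove it by using the extension operator, based on the inequality in the whole space case. This completes the proof.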


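4.2. Basic energy estimates. In order to prove the a priori estimate in Proposition 3.2 we will use the following energy estimates for weak solutions of hyperbolic and parabolic equations.

Let $\tilde f \in L^2(0, T; L^2)$. We say that a function $\tilde\phi \in X^0(T)$ satisfies
$$\partial_t \tilde\phi + u \cdot \nabla \tilde\phi = \tilde f \tag{4.1}$$
in the weak sense if
$$-\int_0^T \big(\tilde\phi, \partial_t \varphi + \operatorname{div}(u \varphi)\big) \, dt = \int_0^T (\tilde f, \varphi) \, dt$$
holds for all $\varphi \in C_0^1(\mathbb{R}^n_+ \times (0, T))$. Similarly, for $\tilde g \in L^2(0, T; H^{-1})$ we say that a function $\tilde\psi \in Y^0(T)$ satisfies
$$\rho(\partial_t \tilde\psi + u \cdot \nabla \tilde\psi) + L\tilde\psi = \tilde g \tag{4.2}$$
in the weak sense if
$$-\int_0^T \big(\tilde\psi, \rho(\partial_t \varphi + u \cdot \nabla \varphi)\big) \, dt + \int_0^T \big(L^{1/2}\tilde\psi, L^{1/2}\varphi\big) \, d\tau = \int_0^T \langle \tilde g, \varphi \rangle \, dt$$
holds for all $\varphi \in C_0^1(\mathbb{R}^n_+ \times (0, T))$, where $(L^{1/2}\psi, L^{1/2}\varphi) = \mu(\nabla\psi, \nabla\varphi) + (\mu + \mu')(\operatorname{div}\psi, \operatorname{div}\varphi)$, and $\langle \cdot, \cdot \rangle$ denotes the duality pairing of $H^{-1}$ and $H_0^1$. We here use the fact that $\partial_t \rho + \operatorname{div}(\rho u) = 0$.

Lemma 4.6. (i) Let $\tilde f \in L^2(0, T; L^2)$ and let $\tilde\phi$ satisfy (4.1) in the weak sense. Then $\tilde\phi$ satisfies
$$\|\tilde\phi(t_2)\|_2^2 \le \|\tilde\phi(t_1)\|_2^2 + \int_{t_1}^{t_2} \big\{ (\operatorname{div} u, |\tilde\phi|^2) + 2(\tilde f, \tilde\phi) \big\} \, d\tau$$
for all $0 \le t_1 \le t_2 \le T$.
(ii) Let $\tilde g = \tilde g^{(1)} + \nabla \tilde g^{(2)}$ with $\tilde g^{(1)}, \tilde g^{(2)} \in L^2(0, T; L^2)$ and let $\tilde\psi$ satisfy (4.2) in the weak sense. Then $\tilde\psi$ satisfies
$$\big\|\sqrt{\rho(t_2)}\, \tilde\psi(t_2)\big\|_2^2 + 2\int_{t_1}^{t_2} \big\|L^{1/2}\tilde\psi\big\|_2^2 \, d\tau = \big\|\sqrt{\rho(t_1)}\, \tilde\psi(t_1)\big\|_2^2 + 2\int_{t_1}^{t_2} \langle \tilde g, \tilde\psi \rangle \, d\tau$$
for all $0 \le t_1 \le t_2 \le T$. Here $\langle \tilde g, \tilde\psi \rangle = (\tilde g^{(1)}, \tilde\psi) - (\tilde g^{(2)}, \operatorname{div}\tilde\psi)$.
(iii) Let $\tilde g \in L^2(0, T; L^2)$ and let $\tilde\psi$ satisfy (4.2) in the weak sense. Assume that $\tilde\psi$ belongs to $Y^1(T)$. Then $\tilde\psi$ satisfies
$$\big\|L^{1/2}\tilde\psi(t_2)\big\|_2^2 + 2\int_{t_1}^{t_2} \big\|\sqrt{\rho}\, \partial_\tau \tilde\psi\big\|_2^2 \, d\tau = \big\|L^{1/2}\tilde\psi(t_1)\big\|_2^2 + 2\int_{t_1}^{t_2} \big\{ (\tilde g, \partial_\tau \tilde\psi) - (\rho u \cdot \nabla \tilde\psi, \partial_\tau \tilde\psi) \big\} \, d\tau$$
for all $0 \le t_1 \le t_2 \le T$.

Proof. Formally the inequality in (i) is obtained by taking the $L^2$-inner product of (4.1) with $\tilde\phi$ and integrating by parts, since $u^1|_{x_1=0} = u_b^1 < 0$. Similarly, the identities in (ii) and (iii) are formally obtained by taking the $L^2$-inner product of (4.2) with $\tilde\psi$ and $\partial_t \tilde\psi$, respectively, and integrating by parts. A rigorous proof can be found in [4]. We omit the details.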


4.3. Proof of Proposition 3.2. Proposition 3.2 is a consequence of the propositions that follow. As in the one-dimensional problem studied in [5], the point in the proof of Proposition 3.2 is to derive a suitable $L^2$-energy bound. Due to the fact that the stationary solution has no shear components, one can obtain the $L^2$ bound in the same way as in the one-dimensional case in [5]. We define $M_s(t) \ge 0$ by
$$M_s(t)^2 = (\delta + E_s(t)) \big\{ E_s(t)^2 + D_s(t)^2 \big\}.$$

Proposition 4.7. There exists a constant $K > 0$ such that if
$$E_s(t) \le K \tag{4.3}$$
for all $t \in [0, T]$, then
$$E_0(t)^2 + D_0(t)^2 \le C \big\{ \|(\phi_0, \psi_0)\|_2^2 + M_s(t)^2 \big\},$$
uniformly in $t \in [0, T]$, where $C > 0$ is independent of $T$.

Proof. As in [5] we introduce an energy functional based on the total energy
$$\rho \mathcal{E} = \rho \Big( \frac{1}{2}|u|^2 + \Phi(\rho) \Big), \qquad \Phi(\rho) = \int^\rho \frac{p(\zeta)}{\zeta^2} \, d\zeta.$$
Note that $\Phi(\rho)$ is a strictly convex function of $\frac{1}{\rho}$. We then define
$$\rho \widetilde{\mathcal{E}} = \rho \Big( \frac{1}{2}|\psi|^2 + \Psi(\rho, \tilde\rho) \Big),$$
where
$$\Psi(\rho, \tilde\rho) = \Phi(\rho) - \Phi(\tilde\rho) - \partial_{1/\rho} \Phi(\tilde\rho) \Big( \frac{1}{\rho} - \frac{1}{\tilde\rho} \Big) = \int_{\tilde\rho}^{\rho} \frac{p(\zeta) - p(\tilde\rho)}{\zeta^2} \, d\zeta.$$
As shown in [5], $\rho \Psi(\rho, \tilde\rho)$ is equivalent to $|\rho - \tilde\rho|^2$ for suitably small $|\rho - \tilde\rho|$, and hence, there are positive constants $c_0$ and $c_1$ such that
$$c_0^{-1} |(\phi, \psi)|^2 \le \rho \widetilde{\mathcal{E}} \le c_0 |(\phi, \psi)|^2, \tag{4.4}$$
where $\phi = \rho - \tilde\rho$ with $|\phi| \le c_1$.
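The equivalence behind (4.4) can be checked numerically for the pressure law $p(\rho) = K\rho^\gamma$. The sketch below is illustrative and not part of the paper: it evaluates the relative potential energy $\int_{\tilde\rho}^{\rho} (p(\zeta) - p(\tilde\rho))\zeta^{-2}\,d\zeta$ once in closed form and once by midpoint quadrature, and compares the quotient of $\rho$ times this quantity by $\phi^2$ with its small-perturbation limit $p'(\tilde\rho)/(2\tilde\rho)$. The function names and the default values $K = 1$, $\gamma = 1.4$ are assumptions made for the example.

```python
def pressure(rho, K=1.0, gamma=1.4):
    """p(rho) = K * rho**gamma, as in (1.1)."""
    return K * rho ** gamma

def psi_closed(rho, rho_ref, K=1.0, gamma=1.4):
    """Closed form of the integral of (p(z) - p(rho_ref)) / z**2 from rho_ref to rho."""
    return (K * (rho ** (gamma - 1.0) - rho_ref ** (gamma - 1.0)) / (gamma - 1.0)
            + pressure(rho_ref, K, gamma) * (1.0 / rho - 1.0 / rho_ref))

def psi_quadrature(rho, rho_ref, K=1.0, gamma=1.4, steps=20000):
    """Same integral by the midpoint rule (works for rho below or above rho_ref)."""
    h = (rho - rho_ref) / steps
    total = 0.0
    for i in range(steps):
        zeta = rho_ref + (i + 0.5) * h
        total += (pressure(zeta, K, gamma) - pressure(rho_ref, K, gamma)) / zeta ** 2
    return total * h

def equivalence_ratio(phi, rho_ref, K=1.0, gamma=1.4):
    """rho * Psi(rho, rho_ref) / phi**2 with rho = rho_ref + phi."""
    rho = rho_ref + phi
    return rho * psi_closed(rho, rho_ref, K, gamma) / phi ** 2
```

With the reference density 1 the limiting ratio is $\gamma K / 2$, and for moderate $|\phi|$ the ratio stays bounded above and below by positive constants, which is the content of (4.4) for the density part.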


> 0 such that if E s (t) ≤ K , then Since H s → L ∞ we can find a number K 1 φ(t)∞ ≤ c1 and inf x φ(x, t) ≥ − 2 inf x1 ρ (x1 ) for all t ∈ [0, T ]. A direct calculation shows ρ )) ψ ∂t ρ E + div ρu E + ( p(ρ) − p( = μdiv (∇ψ · ψ) + μ + μ div (ψdiv ψ) −μ|∇ψ|2 − μ + μ (div ψ)2 + R0 , (4.5) where R0 = R0 (x, t) is the function defined by 1 u. R0 = −ρ (ψ · ∇ u ) · ψ − p(ρ) − p( ρ ) − p ( ρ )φ div u − φψ · L ρ Since (φ, ψ) ∈ Z s (T ) and s ≥ s0 , we deduce from (4.5), after integrating by parts, that

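\[
\int_{\mathbb{R}^n_+} \rho\tilde E(x,t)\,dx - \int_{\mathbb{R}^n_+} \rho\tilde E(x,0)\,dx - \int_0^t\!\!\int_{\mathbb{R}^{n-1}} \big(\rho u^1\tilde E\big)\big|_{x_1=0}\,dx'\,d\tau
= -\int_0^t \|L^{1/2}\psi\|_2^2\,d\tau + \int_0^t\!\!\int_{\mathbb{R}^n_+} R_0(x,\tau)\,dx\,d\tau, \tag{4.6}
\]
where
\[
\|L^{1/2}\psi\|_2^2 = \mu\|\nabla\psi\|_2^2 + (\mu+\mu')\|\mathrm{div}\,\psi\|_2^2 .
\]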

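By Lemmas 4.1 and 4.2, if $M_+ > 1$, we have
\[
\int_0^t\!\!\int R_0(x,\tau)\,dx\,d\tau \le C\big\{\delta D_0(t)^2 + E_s(t)D_0(t)^2\big\}. \tag{4.7}
\]
In case $M_+ = 1$, since $\partial_{x_1}\tilde u^1 > 0$ by Proposition 2.1, we have
\[
-\rho(\psi\cdot\nabla\tilde u)\cdot\psi - \big(p(\rho) - p(\tilde\rho) - p'(\tilde\rho)\phi\big)\,\mathrm{div}\,\tilde u
= -\big\{\rho|\psi^1|^2 + p(\rho) - p(\tilde\rho) - p'(\tilde\rho)\phi\big\}\,\partial_{x_1}\tilde u^1 \le 0,
\]
and hence, by Lemma 4.2,
\[
\int_0^t\!\!\int R_0(x,\tau)\,dx\,d\tau \le C\delta D_0(t)^2 . \tag{4.8}
\]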

The desired inequality in Proposition 4.7 now follows from (4.4), (4.6)–(4.8). This completes the proof.

The estimates for derivatives are obtained in principle by the energy method as in [3, 7]. In contrast to [3, 7], we do not regard problem (3.1) as a perturbation from the linearized problem at infinity. We estimate derivatives by differentiating the equations in (3.1) and using the commutator estimates given in Lemma 4.3.

We begin with the estimates for tangential derivatives of $(\phi,\psi)$. The estimates are based on the fact that (3.1) has the structure of a symmetric hyperbolic-parabolic system. In what follows we will denote the tangential derivative $\partial_t^j\partial_{x'}^{\alpha'}$ by $T_{j,\alpha'}$:
\[
T_{j,\alpha'} v = \partial_t^j\partial_{x'}^{\alpha'} v .
\]
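To make the index bookkeeping concrete, the following lists the tangential derivatives of order $\sigma = 2$. It is an illustrative sketch, assuming (as the conditions of the form $2j + |\alpha'| = \sigma$ used below suggest) that $\alpha'$ is a multi-index in the tangential variables $x' = (x_2,\dots,x_n)$ and that one time derivative is counted as two space derivatives.

```latex
% Pairs (j,\alpha') with 2j + |\alpha'| = 2:
%   j = 1,\ \alpha' = 0     :  T_{1,0}\,v = \partial_t v
%   j = 0,\ |\alpha'| = 2   :  T_{0,\alpha'}\,v = \partial_{x_k}\partial_{x_l} v,\quad 2 \le k \le l \le n
\{\,T_{j,\alpha'}v : 2j+|\alpha'| = 2\,\}
  = \{\partial_t v\}\ \cup\ \{\partial_{x_k}\partial_{x_l}v : 2\le k\le l\le n\}.
```

No derivative in the normal variable $x_1$ appears, so these operators preserve the boundary condition at $x_1 = 0$.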


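Proposition 4.8. Let $\sigma$ be an integer satisfying $1 \le \sigma \le s$ and let $j$ and $\alpha'$ satisfy $2j + |\alpha'| = \sigma$. Suppose that (4.3) is satisfied. Then
\[
\|T_{j,\alpha'}\phi(t)\|_2^2 + \|T_{j,\alpha'}\psi(t)\|_2^2 + \int_0^t \|L^{1/2}T_{j,\alpha'}\psi\|_2^2\,d\tau \le C\big\{E_\sigma(0)^2 + M_s(t)^2\big\}.
\]

Proof. Applying $T_{j,\alpha'}$ to (3.1) we have
\[
\partial_t T_{j,\alpha'}\phi + u\cdot\nabla T_{j,\alpha'}\phi + \rho\,\mathrm{div}\,T_{j,\alpha'}\psi = f_{j,\alpha'} \tag{4.9}
\]
and
\[
\rho\big(\partial_t T_{j,\alpha'}\psi + u\cdot\nabla T_{j,\alpha'}\psi\big) + L T_{j,\alpha'}\psi + p'(\rho)\nabla T_{j,\alpha'}\phi = g_{j,\alpha'}, \tag{4.10}
\]
where
\[
f_{j,\alpha'} = T_{j,\alpha'}f - [T_{j,\alpha'}, u\cdot\nabla]\phi - [T_{j,\alpha'}, \rho]\,\mathrm{div}\,\psi
\]
and
\[
g_{j,\alpha'} = T_{j,\alpha'}g - [T_{j,\alpha'}, \rho]\,\partial_t\psi - [T_{j,\alpha'}, \rho u\cdot\nabla]\psi - [T_{j,\alpha'}, p'(\rho)]\nabla\phi .
\]
The estimate in Proposition 4.8 is based on the fact that system (4.9)–(4.10) can be put into a symmetric form. So we transform them into the system
\[
\partial_t\big(a(\rho)T_{j,\alpha'}\phi\big) + u\cdot\nabla\big(a(\rho)T_{j,\alpha'}\phi\big) + \rho\,a(\rho)\,\mathrm{div}\,T_{j,\alpha'}\psi = \tilde f_{j,\alpha'}, \tag{4.11}
\]
\[
\rho\big(\partial_t T_{j,\alpha'}\psi + u\cdot\nabla T_{j,\alpha'}\psi\big) + L T_{j,\alpha'}\psi + \nabla\big(p'(\rho)T_{j,\alpha'}\phi\big) = \tilde g_{j,\alpha'}, \tag{4.12}
\]
where
\[
a(\rho) = \sqrt{p'(\rho)/\rho}, \qquad
\tilde f_{j,\alpha'} = -\rho\,a'(\rho)(\mathrm{div}\,u)T_{j,\alpha'}\phi + a(\rho)f_{j,\alpha'}, \qquad
\tilde g_{j,\alpha'} = p''(\rho)(\nabla\rho)T_{j,\alpha'}\phi + g_{j,\alpha'} .
\]
We now apply Lemma 4.6 (i) to (4.11) and Lemma 4.6 (ii) to (4.12) respectively, and obtain
\[
\big\|\sqrt{p'(\rho)/\rho}\,T_{j,\alpha'}\phi(t)\big\|_2^2 + 2\int_0^t \big(p'(\rho)T_{j,\alpha'}\phi,\ \mathrm{div}\,T_{j,\alpha'}\psi\big)\,d\tau
\le \big\|\sqrt{p'(\rho_0)/\rho_0}\,\partial_{x'}^{\alpha'}\phi_j\big\|_2^2 + R_1(t) \tag{4.13}
\]
and
\[
\big\|\sqrt{\rho}\,T_{j,\alpha'}\psi(t)\big\|_2^2 + 2\int_0^t \|L^{1/2}T_{j,\alpha'}\psi\|_2^2\,d\tau - 2\int_0^t \big(p'(\rho)T_{j,\alpha'}\phi,\ \mathrm{div}\,T_{j,\alpha'}\psi\big)\,d\tau
\le \big\|\sqrt{\rho_0}\,\partial_{x'}^{\alpha'}\psi_j\big\|_2^2 + R_2(t). \tag{4.14}
\]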


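Here
\[
R_1(t) = \int_0^t \Big(\mathrm{div}\,u,\ \big|\sqrt{p'(\rho)/\rho}\,T_{j,\alpha'}\phi\big|^2\Big)\,d\tau + 2\int_0^t \Big(\sqrt{p'(\rho)/\rho}\,\tilde f_{j,\alpha'},\ T_{j,\alpha'}\phi\Big)\,d\tau
\]
and
\[
R_2(t) = 2\int_0^t \big(\tilde g_{j,\alpha'},\ T_{j,\alpha'}\psi\big)\,d\tau .
\]
It then follows from (4.13) and (4.14) that
\[
\big\|\sqrt{p'(\rho)/\rho}\,T_{j,\alpha'}\phi(t)\big\|_2^2 + \big\|\sqrt{\rho}\,T_{j,\alpha'}\psi(t)\big\|_2^2 + 2\int_0^t \|L^{1/2}T_{j,\alpha'}\psi\|_2^2\,d\tau
\le \big\|\sqrt{p'(\rho_0)/\rho_0}\,\partial_{x'}^{\alpha'}\phi_j\big\|_2^2 + \big\|\sqrt{\rho_0}\,\partial_{x'}^{\alpha'}\psi_j\big\|_2^2 + R_1(t) + R_2(t).
\]
Applying now Lemmas 4.1–4.4 to $R_1(t)$ and $R_2(t)$ we obtain the desired inequality in Proposition 4.8. This completes the proof.

We next derive the $H^1$-parabolic estimates for $\psi$.

Proposition 4.9. Let $\sigma$ be an integer satisfying $1 \le \sigma \le s$ and let $j$ and $\alpha'$ satisfy $2j + |\alpha'| = \sigma - 1$. Suppose that (4.3) is satisfied. Then
\[
\|L^{1/2}T_{j,\alpha'}\psi(t)\|_2^2 + \int_0^t \|T_{j+1,\alpha'}\psi\|_2^2\,d\tau
\le C\big\{E_\sigma(0)^2 + \|T_{j,\alpha'}\phi(t)\|_2^2 + M_s(t)^2\big\} + \eta\int_0^t \|\partial_x T_{j,\alpha'}\phi\|_2^2\,d\tau + C_\eta\int_0^t \|L^{1/2}T_{j,\alpha'}\psi\|_2^2\,d\tau
\]
for any $\eta > 0$ with some constant $C_\eta > 0$.

Proof. Let $2j + |\alpha'| = \sigma - 1$. Then we can apply Lemma 4.6 (iii) to (4.10) and obtain
\[
\|L^{1/2}T_{j,\alpha'}\psi(t)\|_2^2 + 2\int_0^t \|\sqrt{\rho}\,\partial_\tau T_{j,\alpha'}\psi\|_2^2\,d\tau + 2\int_0^t \big(p'(\rho)\nabla T_{j,\alpha'}\phi,\ \partial_\tau T_{j,\alpha'}\psi\big)\,d\tau
= \|L^{1/2}\partial_{x'}^{\alpha'}\psi_j\|_2^2 - 2\int_0^t \big(\rho u\cdot\nabla T_{j,\alpha'}\psi,\ \partial_\tau T_{j,\alpha'}\psi\big)\,d\tau + R_3(t), \tag{4.15}
\]
where
\[
R_3(t) = 2\int_0^t \big(g_{j,\alpha'},\ \partial_\tau T_{j,\alpha'}\psi\big)\,d\tau .
\]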


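The second term on the right-hand side of (4.15) is estimated as
\[
\Big|2\int_0^t \big(\rho u\cdot\nabla T_{j,\alpha'}\psi,\ \partial_\tau T_{j,\alpha'}\psi\big)\,d\tau\Big|
\le \int_0^t \|\sqrt{\rho}\,\partial_\tau T_{j,\alpha'}\psi\|_2^2\,d\tau + C\int_0^t \|L^{1/2}T_{j,\alpha'}\psi\|_2^2\,d\tau,
\]
and, by Lemma 4.4, the third term is estimated as $|R_3(t)| \le C M_s(t)^2$. Let us next consider the third term on the left-hand side of (4.15). By integration by parts we have
\[
2\int_0^t \big(p'(\rho)\nabla T_{j,\alpha'}\phi,\ \partial_\tau T_{j,\alpha'}\psi\big)\,d\tau
= 2\Big[\big(p'(\rho)\nabla T_{j,\alpha'}\phi,\ T_{j,\alpha'}\psi\big)\Big]_{\tau=0}^{\tau=t} - 2\int_0^t \big(\partial_\tau\big(p'(\rho)\nabla T_{j,\alpha'}\phi\big),\ T_{j,\alpha'}\psi\big)\,d\tau . \tag{4.16}
\]
The first term on the right-hand side of (4.16) is equal to
\[
-2\Big[\big(T_{j,\alpha'}\phi,\ \mathrm{div}\big(p'(\rho)T_{j,\alpha'}\psi\big)\big)\Big]_{\tau=0}^{\tau=t} .
\]
To estimate the second term on the right-hand side of (4.16), we first observe that $p'(\rho)\nabla T_{j,\alpha'}\phi$ satisfies
\[
\partial_t\big(p'(\rho)\nabla T_{j,\alpha'}\phi\big) + u\cdot\nabla\big(p'(\rho)\nabla T_{j,\alpha'}\phi\big) + \rho p'(\rho)\nabla\mathrm{div}\,T_{j,\alpha'}\psi = h_{j,\alpha'}
\]
in the weak sense, where
\[
h_{j,\alpha'} = -\rho p''(\rho)(\mathrm{div}\,u)\nabla T_{j,\alpha'}\phi + p'(\rho)\big\{\nabla T_{j,\alpha'}f - [\nabla T_{j,\alpha'}, u\cdot\nabla]\phi - [\nabla T_{j,\alpha'}, \rho]\,\mathrm{div}\,\psi\big\}.
\]
It then follows that the second term on the right-hand side of (4.16) is written as
\[
\begin{aligned}
-2\int_0^t \big(\partial_\tau\big(p'(\rho)\nabla T_{j,\alpha'}\phi\big),\ T_{j,\alpha'}\psi\big)\,d\tau
&= 2\int_0^t \big(u\cdot\nabla\big(p'(\rho)\nabla T_{j,\alpha'}\phi\big) + \rho p'(\rho)\nabla\mathrm{div}\,T_{j,\alpha'}\psi - h_{j,\alpha'},\ T_{j,\alpha'}\psi\big)\,d\tau \\
&= -2\int_0^t \big(p'(\rho)\nabla T_{j,\alpha'}\phi,\ \mathrm{div}\big(u\otimes T_{j,\alpha'}\psi\big)\big)\,d\tau \\
&\quad - 2\int_0^t \big(\mathrm{div}\,T_{j,\alpha'}\psi,\ \mathrm{div}\big(\rho p'(\rho)T_{j,\alpha'}\psi\big)\big)\,d\tau - 2\int_0^t \big(h_{j,\alpha'},\ T_{j,\alpha'}\psi\big)\,d\tau .
\end{aligned}
\]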


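We thus obtain
\[
\begin{aligned}
2\int_0^t \big(p'(\rho)\nabla T_{j,\alpha'}\phi,\ \partial_\tau T_{j,\alpha'}\psi\big)\,d\tau
&= -2\Big[\big(T_{j,\alpha'}\phi,\ p'(\rho)\,\mathrm{div}\,T_{j,\alpha'}\psi\big)\Big]_{\tau=0}^{\tau=t}
 - 2\int_0^t \big(p'(\rho)\nabla T_{j,\alpha'}\phi,\ u\,\mathrm{div}\,T_{j,\alpha'}\psi\big)\,d\tau \\
&\quad - 2\int_0^t \big(\mathrm{div}\,T_{j,\alpha'}\psi,\ \rho p'(\rho)\,\mathrm{div}\,T_{j,\alpha'}\psi\big)\,d\tau + R_4(t) \\
&\equiv I_1(t) + I_2(t) + I_3(t) + R_4(t),
\end{aligned} \tag{4.17}
\]
where
\[
\begin{aligned}
R_4(t) &= -2\Big[\big(T_{j,\alpha'}\phi,\ \nabla\big(p'(\rho)\big)\cdot T_{j,\alpha'}\psi\big)\Big]_{\tau=0}^{\tau=t}
 - 2\int_0^t \big(p'(\rho)\nabla T_{j,\alpha'}\phi,\ (\nabla u)T_{j,\alpha'}\psi\big)\,d\tau \\
&\quad - 2\int_0^t \big(\mathrm{div}\,T_{j,\alpha'}\psi,\ \nabla\big(\rho p'(\rho)\big)\cdot T_{j,\alpha'}\psi\big)\,d\tau - 2\int_0^t \big(h_{j,\alpha'},\ T_{j,\alpha'}\psi\big)\,d\tau .
\end{aligned}
\]
The right-hand side of (4.17) is estimated as
\[
|I_1(t)| \le \frac12\|L^{1/2}T_{j,\alpha'}\psi(t)\|_2^2 + C\|T_{j,\alpha'}\phi(t)\|_2^2 + C E_\sigma(0)^2,
\]
\[
|I_2(t)| \le C\int_0^t \|\nabla T_{j,\alpha'}\phi\|_2\,\|\mathrm{div}\,T_{j,\alpha'}\psi\|_2\,d\tau
\le \eta\int_0^t \|\partial_x T_{j,\alpha'}\phi\|_2^2\,d\tau + C_\eta\int_0^t \|L^{1/2}T_{j,\alpha'}\psi\|_2^2\,d\tau
\]
for any $\eta > 0$ with some constant $C_\eta > 0$,
\[
|I_3(t)| \le C\int_0^t \|L^{1/2}T_{j,\alpha'}\psi\|_2^2\,d\tau,
\]
and, by Lemmas 4.1–4.4, $|R_4(t)| \le C M_s(t)^2$. We thus obtain the desired inequality in Proposition 4.9. This completes the proof.

We next derive the dissipative estimates for $x_1$-derivatives of $\phi$, which follow from a hyperbolic-parabolic aspect of (3.1).

Proposition 4.10. Let $\sigma$ be an integer satisfying $1 \le \sigma \le s$ and let $j$, $\alpha'$ and $\ell$ satisfy $2j + |\alpha'| + \ell = \sigma - 1$. Suppose that (4.3) is satisfied. Then
\[
\|T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi(t)\|_2^2 + \int_0^t \|T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\|_2^2\,d\tau
\le C\Big\{E_\sigma(0)^2 + M_s(t)^2 + \int_0^t \Big(\|T_{j+1,\alpha'}\partial_{x_1}^{\ell}\psi\|_2^2 + \|\partial_x T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2^2 + \|\partial_x\partial_{x'}T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2^2\Big)\,d\tau\Big\}.
\]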


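Proof. The first equation of (3.1) is written as
\[
\partial_t\phi + u\cdot\nabla\phi + \rho\big(\partial_{x_1}\psi^1 + \nabla'\cdot\psi'\big) = f .
\]
Here and in what follows we use the notation $\psi' = (\psi^2,\dots,\psi^n)$ and $\nabla' = (\partial_{x_2},\dots,\partial_{x_n})$. It then follows that
\[
\partial_t\big(T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\big) + u\cdot\nabla\big(T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\big) + \rho\big(T_{j,\alpha'}\partial_{x_1}^{\ell+2}\psi^1 + \partial_{x_1}\nabla'\cdot T_{j,\alpha'}\partial_{x_1}^{\ell}\psi'\big) = f_{j,\alpha',\ell+1}, \tag{4.18}
\]
where
\[
f_{j,\alpha',\ell+1} = T_{j,\alpha'}\partial_{x_1}^{\ell+1}f - [T_{j,\alpha'}\partial_{x_1}^{\ell+1}, u\cdot\nabla]\phi - [T_{j,\alpha'}\partial_{x_1}^{\ell+1}, \rho]\,\mathrm{div}\,\psi .
\]
Furthermore, multiplying (4.18) by $1/\rho$ we obtain
\[
\partial_t\Big(\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big) + u\cdot\nabla\Big(\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big) + T_{j,\alpha'}\partial_{x_1}^{\ell+2}\psi^1 + \partial_{x_1}\nabla'\cdot T_{j,\alpha'}\partial_{x_1}^{\ell}\psi' = \tilde f_{j,\alpha',\ell+1}, \tag{4.19}
\]
where
\[
\tilde f_{j,\alpha',\ell+1} = \frac1\rho(\mathrm{div}\,u)\,T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi + \frac1\rho f_{j,\alpha',\ell+1} .
\]
We next observe the first component of the equation for $\psi$ in (3.1):
\[
\rho\big(\partial_t\psi^1 + u\cdot\nabla\psi^1\big) + p'(\rho)\partial_{x_1}\phi - (2\mu+\mu')\partial_{x_1}^2\psi^1 - \mu\Delta'\psi^1 - (\mu+\mu')\partial_{x_1}\nabla'\cdot\psi' = g^1 .
\]
Hereafter we denote $\Delta' = \partial_{x_2}^2 + \cdots + \partial_{x_n}^2$. Applying $T_{j,\alpha'}\partial_{x_1}^{\ell}$ to this equation we have
\[
\rho\big(\partial_t T_{j,\alpha'}\partial_{x_1}^{\ell}\psi^1 + u\cdot\nabla T_{j,\alpha'}\partial_{x_1}^{\ell}\psi^1\big) + p'(\rho)T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi - (2\mu+\mu')T_{j,\alpha'}\partial_{x_1}^{\ell+2}\psi^1
- \mu\Delta' T_{j,\alpha'}\partial_{x_1}^{\ell}\psi^1 - (\mu+\mu')\partial_{x_1}\nabla'\cdot T_{j,\alpha'}\partial_{x_1}^{\ell}\psi' = g^1_{j,\alpha',\ell}, \tag{4.20}
\]
where
\[
g^1_{j,\alpha',\ell} = T_{j,\alpha'}\partial_{x_1}^{\ell}g^1 - [T_{j,\alpha'}\partial_{x_1}^{\ell}, \rho]\,\partial_t\psi^1 - [T_{j,\alpha'}\partial_{x_1}^{\ell}, \rho u\cdot\nabla]\psi^1 - [T_{j,\alpha'}\partial_{x_1}^{\ell}, p'(\rho)]\,\partial_{x_1}\phi .
\]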


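By adding (4.19) and (4.20) $\times\ \frac{1}{2\mu+\mu'}$ we arrive at
\[
\partial_t\Big(\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big) + u\cdot\nabla\Big(\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big) + \frac{\rho p'(\rho)}{2\mu+\mu'}\,\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi
= -\frac{1}{2\mu+\mu'}M\big(T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\big) + \tilde f_{j,\alpha',\ell+1} + \frac{1}{2\mu+\mu'}g^1_{j,\alpha',\ell},
\]
where
\[
M(\psi) = \rho\big(\partial_t\psi^1 + u\cdot\nabla\psi^1\big) - \mu\Delta'\psi^1 + \mu\,\partial_{x_1}\nabla'\cdot\psi' .
\]
Lemma 4.6 (i) then yields
\[
\begin{aligned}
&\Big\|\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi(t)\Big\|_2^2 + 2\int_0^t \Big(\frac{\rho p'(\rho)}{2\mu+\mu'}\,\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi,\ \frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big)\,d\tau \\
&\qquad\le \Big\|\frac{1}{\rho_0}\partial_{x'}^{\alpha'}\partial_{x_1}^{\ell+1}\phi_j\Big\|_2^2 - \frac{2}{2\mu+\mu'}\int_0^t \Big(M\big(T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\big),\ \frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big)\,d\tau + R_5(t),
\end{aligned} \tag{4.21}
\]
where
\[
R_5(t) = \int_0^t \Big(\mathrm{div}\,u,\ \Big|\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big|^2\Big)\,d\tau + 2\int_0^t \Big(\tilde f_{j,\alpha',\ell+1} + \frac{1}{2\mu+\mu'}g^1_{j,\alpha',\ell},\ \frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big)\,d\tau .
\]
The second term on the left-hand side of (4.21) is estimated as
\[
2\int_0^t \Big(\frac{\rho p'(\rho)}{2\mu+\mu'}\,\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi,\ \frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big)\,d\tau \ge c\int_0^t \|T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\|_2^2\,d\tau,
\]
and the second term on the right-hand side of (4.21) is estimated as
\[
\begin{aligned}
&\Big|\frac{2}{2\mu+\mu'}\int_0^t \Big(M\big(T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\big),\ \frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big)\,d\tau\Big| \\
&\qquad\le C\int_0^t \Big(\|\partial_\tau T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2 + \|\partial_x T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2 + \|\partial_x\partial_{x'}T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2\Big)\,\|T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\|_2\,d\tau \\
&\qquad\le \frac c2\int_0^t \|T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\|_2^2\,d\tau + C\int_0^t \Big(\|\partial_\tau T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2^2 + \|\partial_x T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2^2 + \|\partial_x\partial_{x'}T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2^2\Big)\,d\tau,
\end{aligned}
\]
where $c$ and $C$ are some positive constants. Moreover, using Lemmas 4.2–4.4 we obtain $|R_5(t)| \le C M_s(t)^2$, and hence, Proposition 4.10 is proved.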
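The absorption step in the last estimate is an instance of Young's inequality. A minimal sketch follows, with $a$ and $b$ standing for the two nonnegative quantities being multiplied; these names are introduced here only for illustration.

```latex
% For a, b \ge 0 and constants c, C > 0:
C\,a\,b \;\le\; \frac{c}{2}\,b^{2} + \frac{C^{2}}{2c}\,a^{2},
% which is a rearrangement of
0 \;\le\; \frac12\Big(\sqrt{c}\,b - \frac{C}{\sqrt{c}}\,a\Big)^{2}.
% Here b is the L^2 norm of the (\ell+1)-st x_1-derivative of T\phi and a is the sum of the
% three \psi-norms; the term (c/2) b^2, integrated in time, is absorbed by the
% dissipation c \int_0^t b^2 \, d\tau coming from the left-hand side of (4.21).
```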


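In order to obtain the dissipative estimates for higher order $x_1$-derivatives of $\psi$ and tangential derivatives of $\phi$ we prepare estimates for the material derivative of $\phi$. In the following we denote the material derivative of $\phi$ by $\dot\phi$:
\[
\dot\phi \equiv \partial_t\phi + u\cdot\nabla\phi .
\]
We also introduce the semi-norm $|\cdot|_k$ defined by
\[
|v|_k = \Big(\sum_{|\alpha|=k}\|\partial_x^\alpha v\|_2^2\Big)^{1/2}.
\]

Proposition 4.11. Let $\sigma$ be an integer satisfying $1 \le \sigma \le s$ and let $j$, $\alpha'$ and $\ell$ satisfy $2j + |\alpha'| + \ell = \sigma - 1$. Suppose that (4.3) is satisfied. Then
\[
\int_0^t |T_{j,\alpha'}\dot\phi|_{\ell+1}^2\,d\tau
\le C\Big\{M_s(t)^2 + \int_0^t \Big(|T_{j,\alpha'}\partial_{x_1}\phi|_{\ell}^2 + |T_{j+1,\alpha'}\psi|_{\ell}^2 + |\partial_x T_{j,\alpha'}\psi|_{\ell}^2 + |\partial_x\partial_{x'}T_{j,\alpha'}\psi|_{\ell}^2\Big)\,d\tau\Big\}.
\]

Proof. We see from the first equation of (3.1) that $\dot\phi = f - \rho\,\mathrm{div}\,\psi$, and hence,
\[
T_{j,\alpha'+\beta'}\dot\phi = -\rho\,\mathrm{div}\,T_{j,\alpha'+\beta'}\psi + f_{j,\alpha'+\beta'}, \tag{4.22}
\]
where $|\beta'| = \ell + 1$ and
\[
f_{j,\alpha'+\beta'} = T_{j,\alpha'+\beta'}f - [T_{j,\alpha'+\beta'}, \rho]\,\mathrm{div}\,\psi .
\]
We also see from the first equation of (3.1) that $\dot\phi + \rho\big(\partial_{x_1}\psi^1 + \nabla'\cdot\psi'\big) = f$, and hence,
\[
T_{j,\alpha'+\beta'}\partial_{x_1}^{k+1}\dot\phi + \rho\big(T_{j,\alpha'+\beta'}\partial_{x_1}^{k+2}\psi^1 + \partial_{x_1}\nabla'\cdot T_{j,\alpha'+\beta'}\partial_{x_1}^{k}\psi'\big) = f_{j,\alpha'+\beta',k+1}, \tag{4.23}
\]
where
\[
f_{j,\alpha'+\beta',k+1} = T_{j,\alpha'+\beta'}\partial_{x_1}^{k+1}f - [T_{j,\alpha'+\beta'}\partial_{x_1}^{k+1}, \rho]\,\mathrm{div}\,\psi .
\]
Furthermore, Eq. (4.20) with $T_{j,\alpha'}\partial_{x_1}^{\ell}$ replaced by $T_{j,\alpha'+\beta'}\partial_{x_1}^{k}$ gives
\[
\rho\big(\partial_t T_{j,\alpha'+\beta'}\partial_{x_1}^{k}\psi^1 + u\cdot\nabla T_{j,\alpha'+\beta'}\partial_{x_1}^{k}\psi^1\big) + p'(\rho)T_{j,\alpha'+\beta'}\partial_{x_1}^{k+1}\phi - (2\mu+\mu')T_{j,\alpha'+\beta'}\partial_{x_1}^{k+2}\psi^1
- \mu\Delta' T_{j,\alpha'+\beta'}\partial_{x_1}^{k}\psi^1 - (\mu+\mu')\partial_{x_1}\nabla'\cdot T_{j,\alpha'+\beta'}\partial_{x_1}^{k}\psi' = g^1_{j,\alpha'+\beta',k}. \tag{4.24}
\]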


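By adding (4.23) $\times\ \frac1\rho$ and (4.24) $\times\ \frac{1}{2\mu+\mu'}$ we arrive at
\[
\frac1\rho T_{j,\alpha'+\beta'}\partial_{x_1}^{k+1}\dot\phi
= -\frac{p'(\rho)}{2\mu+\mu'}T_{j,\alpha'+\beta'}\partial_{x_1}^{k+1}\phi - \frac{1}{2\mu+\mu'}M\big(T_{j,\alpha'+\beta'}\partial_{x_1}^{k}\psi\big) + \frac1\rho f_{j,\alpha'+\beta',k+1} + \frac{1}{2\mu+\mu'}g^1_{j,\alpha'+\beta',k} \tag{4.25}
\]
with $|\beta'| + k = \ell$. The first term on the right-hand side of (4.22) is estimated as
\[
\|\rho\,\mathrm{div}\,T_{j,\alpha'+\beta'}\psi\|_2 \le C\,|\partial_x\partial_{x'}T_{j,\alpha'}\psi|_{\ell},
\]
while the first two terms on the right-hand side of (4.25) are estimated as
\[
\Big\|\frac{p'(\rho)}{2\mu+\mu'}T_{j,\alpha'+\beta'}\partial_{x_1}^{k+1}\phi\Big\|_2 \le C\,|\partial_{x_1}T_{j,\alpha'}\phi|_{\ell}
\]
and
\[
\Big\|\frac{1}{2\mu+\mu'}M\big(T_{j,\alpha'+\beta'}\partial_{x_1}^{k}\psi\big)\Big\|_2 \le C\big\{|\partial_t T_{j,\alpha'}\psi|_{\ell} + |\partial_x T_{j,\alpha'}\psi|_{\ell} + |\partial_x\partial_{x'}T_{j,\alpha'}\psi|_{\ell}\big\}.
\]
The remaining terms in (4.22) and (4.25) are bounded by $C M_s(t)$ by Lemmas 4.2–4.4, and the desired inequality in Proposition 4.11 is obtained. This completes the proof.

We next apply estimates for the Stokes system to obtain the estimates for higher order derivatives.

Proposition 4.12. Let $\sigma$ be an integer satisfying $1 \le \sigma \le s$ and let $j$, $\alpha'$ and $\ell$ satisfy $2j + |\alpha'| + \ell = \sigma - 1$. Suppose that (4.3) is satisfied. Then
\[
\int_0^t \Big(|T_{j,\alpha'}\psi|_{\ell+2}^2 + |T_{j,\alpha'}\phi|_{\ell+1}^2\Big)\,d\tau
\le C\Big\{M_s(t)^2 + \int_0^t \Big(|T_{j,\alpha'}\dot\phi|_{\ell+1}^2 + |T_{j+1,\alpha'}\psi|_{\ell}^2 + |T_{j,\alpha'}\psi|_{\ell+1}^2\Big)\,d\tau\Big\}.
\]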

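Proof. We apply the following estimate for the Stokes system. Let $(\tilde\phi, \tilde\psi)$ be the solution of the Stokes system
\[
\mathrm{div}\,\tilde\psi = \tilde f \ \ \text{in } \mathbb{R}^n_+, \qquad
-\mu\Delta\tilde\psi + p'(\rho_+)\nabla\tilde\phi = \tilde g \ \ \text{in } \mathbb{R}^n_+, \qquad
\tilde\psi|_{x_1=0} = 0 .
\]
Then for any $k \in \mathbb{Z}$, $k \ge 0$, there exists a constant $C > 0$ such that
\[
|\tilde\psi|_{k+2} + |\tilde\phi|_{k+1} \le C\big\{|\tilde f|_{k+1} + |\tilde g|_k\big\}. \tag{4.26}
\]
See, e.g., [2].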


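By (4.22) we have
\[
\mathrm{div}\,T_{j,\alpha'}\psi = -\frac1\rho T_{j,\alpha'}\dot\phi + \frac1\rho f_{j,\alpha'} \equiv \tilde F_{j,\alpha'} . \tag{4.27}
\]
Moreover, we see from (4.10)
\[
\rho\big(\partial_t T_{j,\alpha'}\psi + u\cdot\nabla T_{j,\alpha'}\psi\big) - \mu\Delta T_{j,\alpha'}\psi - (\mu+\mu')\nabla\mathrm{div}\,T_{j,\alpha'}\psi + p'(\rho)\nabla T_{j,\alpha'}\phi = g_{j,\alpha'} .
\]
By (4.27) we have
\[
\nabla\mathrm{div}\,T_{j,\alpha'}\psi = -\nabla\Big(\frac1\rho T_{j,\alpha'}\dot\phi\Big) + \nabla\Big(\frac1\rho f_{j,\alpha'}\Big),
\]
and hence,
\[
\begin{aligned}
-\mu\Delta T_{j,\alpha'}\psi + p'(\rho_+)\nabla T_{j,\alpha'}\phi
&= -\rho\big(\partial_t T_{j,\alpha'}\psi + u\cdot\nabla T_{j,\alpha'}\psi\big) + (\mu+\mu')\Big\{-\nabla\Big(\frac1\rho T_{j,\alpha'}\dot\phi\Big) + \nabla\Big(\frac1\rho f_{j,\alpha'}\Big)\Big\} \\
&\quad - \big(p'(\rho) - p'(\rho_+)\big)\nabla T_{j,\alpha'}\phi + g_{j,\alpha'} \equiv \tilde G_{j,\alpha'} .
\end{aligned}
\]
We thus conclude that $(T_{j,\alpha'}\phi, T_{j,\alpha'}\psi)$ satisfies the Stokes system. Since
\[
\partial_x^{\ell+1}\tilde F_{j,\alpha'} = -\frac1\rho\partial_x^{\ell+1}T_{j,\alpha'}\dot\phi - \Big[\partial_x^{\ell+1}, \frac1\rho\Big]T_{j,\alpha'}\dot\phi + \partial_x^{\ell+1}\Big(\frac1\rho f_{j,\alpha'}\Big) \tag{4.28}
\]
and
\[
\begin{aligned}
\partial_x^{\ell}\tilde G_{j,\alpha'}
&= -\rho\big(\partial_x^{\ell}\partial_t T_{j,\alpha'}\psi + u\cdot\partial_x^{\ell}\nabla T_{j,\alpha'}\psi\big) - \frac{\mu+\mu'}{\rho}\partial_x^{\ell}\nabla T_{j,\alpha'}\dot\phi \\
&\quad - \big\{[\partial_x^{\ell}, \rho]\,\partial_t T_{j,\alpha'}\psi + [\partial_x^{\ell}, \rho u\cdot\nabla]T_{j,\alpha'}\psi\big\} - (\mu+\mu')\Big[\partial_x^{\ell}\nabla, \frac1\rho\Big]T_{j,\alpha'}\dot\phi \\
&\quad + \partial_x^{\ell}\Big\{(\mu+\mu')\nabla\Big(\frac1\rho f_{j,\alpha'}\Big) - \big(p'(\rho) - p'(\rho_+)\big)\nabla T_{j,\alpha'}\phi + g_{j,\alpha'}\Big\},
\end{aligned}
\]
the desired inequality follows from (4.26) and Lemmas 4.2–4.4. This completes the proof.

The following estimates immediately follow from the first equation of (3.1) and Lemmas 4.2–4.4.

Proposition 4.13. Let $2 \le \sigma \le s$ and let $j$ satisfy $2j \le \sigma - 2$. Suppose that (4.3) is satisfied. Then
\[
\int_0^t \|\partial_\tau^{j+1}\phi\|_{H^{\sigma-2-2j}}^2\,d\tau \le C\Big\{M_s(t)^2 + \int_0^t \Big(|[\partial_x\phi]|_{\sigma-2}^2 + |[\partial_x\psi]|_{\sigma-1}^2\Big)\,d\tau\Big\}.
\]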


We are now in a position to prove Proposition 3.2.

Proof of Proposition 3.2. We assume that (4.3) holds. We will show
\[
E_\sigma(t)^2 + D_\sigma(t)^2 \le C\big\{E_\sigma(0)^2 + M_s(t)^2\big\} \tag{4.29}
\]
for all $0 \le \sigma \le s$, which leads to the conclusion of Proposition 3.2. In fact, from (4.29) with $\sigma = s$ one can easily see that
\[
E_s(t)^2 + D_s(t)^2 \le C\|(\phi_0,\psi_0)\|_{H^s}^2,
\]
provided that $\|(\phi_0,\psi_0)\|_{H^s}$ and $\delta > 0$ are sufficiently small, and thus, Proposition 3.2 is proved.

Let us prove (4.29). We prove it by induction on $\sigma$. By Proposition 4.7, (4.29) clearly holds for $\sigma = 0$. Let $1 \le r \le s$ and suppose that (4.29) holds for all $\sigma \le r - 1$. We shall prove (4.29) for $\sigma = r$. By (4.29) with $\sigma = 0$ and Proposition 4.8 we have
\[
\sum_{2j+|\alpha'|\le r}\Big\{\|T_{j,\alpha'}\phi(t)\|_2^2 + \|T_{j,\alpha'}\psi(t)\|_2^2 + \int_0^t \|L^{1/2}T_{j,\alpha'}\psi\|_2^2\,d\tau\Big\} \le C\big\{E_r(0)^2 + M_s(t)^2\big\}. \tag{4.30}
\]
This, together with Proposition 4.9, implies that
\[
\sum_{2j+|\alpha'|=r-1}\Big\{\|L^{1/2}T_{j,\alpha'}\psi(t)\|_2^2 + \int_0^t \|T_{j+1,\alpha'}\psi\|_2^2\,d\tau\Big\} \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\} \tag{4.31}
\]
for any $\eta > 0$ with some constant $C_\eta > 0$. By (4.30) and (4.31) we have
\[
\sum_{2j+|\alpha'|=r-1}\int_0^t \Big(\|T_{j+1,\alpha'}\psi\|_2^2 + \|\partial_x T_{j,\alpha'}\psi\|_2^2 + \|\partial_x\partial_{x'}T_{j,\alpha'}\psi\|_2^2\Big)\,d\tau \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\}. \tag{4.32}
\]
Using Proposition 4.10 with $\sigma = r$ and $\ell = 0$ we see from (4.32) that
\[
\sum_{2j+|\alpha'|=r-1}\Big\{\|T_{j,\alpha'}\partial_{x_1}\phi(t)\|_2^2 + \int_0^t \|T_{j,\alpha'}\partial_{x_1}\phi\|_2^2\,d\tau\Big\} \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\},
\]
and hence, by Proposition 4.11 with $\sigma = r$ and $\ell = 0$,
\[
\sum_{2j+|\alpha'|=r-1}\int_0^t |T_{j,\alpha'}\dot\phi|_1^2\,d\tau \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\}. \tag{4.33}
\]


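It then follows from (4.32), (4.33) and Proposition 4.12 with $\sigma = r$ and $\ell = 0$ that
\[
\sum_{2j+|\alpha'|=r-1}\int_0^t \Big(|T_{j,\alpha'}\psi|_2^2 + |T_{j,\alpha'}\phi|_1^2\Big)\,d\tau \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\}.
\]
We thus arrive at
\[
\sum_{2j+|\alpha'|=r-1}\Big\{\|T_{j,\alpha'}\phi(t)\|_{H^1}^2 + \|T_{j,\alpha'}\psi(t)\|_{H^1}^2 + \int_0^t \Big(|T_{j,\alpha'}\phi|_1^2 + |T_{j+1,\alpha'}\psi|_0^2 + |T_{j,\alpha'}\psi|_2^2\Big)\,d\tau\Big\}
\le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\}.
\]
In particular, we obtain (4.29) for $\sigma = r = 1$ by taking $\eta > 0$ suitably small. To complete the induction argument for $r \ge 2$, we now show the following inequalities:
\[
\sum_{2j+|\alpha'|=r-1-\ell}\Big\{\|T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi(t)\|_2^2 + \int_0^t \|T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\|_2^2\,d\tau\Big\} \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\}, \tag{4.34}
\]
\[
\sum_{2j+|\alpha'|=r-1-\ell}\int_0^t |T_{j,\alpha'}\dot\phi|_{\ell+1}^2\,d\tau \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\} \tag{4.35}
\]
and
\[
\sum_{2j+|\alpha'|=r-1-\ell}\int_0^t \Big(|T_{j,\alpha'}\psi|_{\ell+2}^2 + |T_{j,\alpha'}\phi|_{\ell+1}^2\Big)\,d\tau \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\} \tag{4.36}
\]
for all $0 \le \ell \le r - 1$. We will prove (4.34)–(4.36) by induction on $\ell$. We have already proved (4.34)–(4.36) for $\ell = 0$. Let $1 \le k \le r - 1$. Assuming that (4.34)–(4.36) hold for all $\ell \le k - 1$, we shall prove (4.34)–(4.36) for $\ell = k$. By the inductive assumption on $\sigma$ we have
\[
\sum_{2j+|\alpha'|=r-1-k}\int_0^t |T_{j,\alpha'}\psi|_{k+1}^2\,d\tau \le C\big\{E_r(0)^2 + M_s(t)^2\big\}, \tag{4.37}
\]
and, by the inductive assumption on $\ell$ and (4.30), we have
\[
\sum_{2j+|\alpha'|=r-1-k}\int_0^t \Big(|T_{j+1,\alpha'}\psi|_k^2 + |\partial_x\partial_{x'}T_{j,\alpha'}\psi|_k^2\Big)\,d\tau \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\}. \tag{4.38}
\]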


It then follows from Proposition 4.10 with $\sigma = r$ and $\ell = k$ that
\[
\sum_{2j+|\alpha'|=r-1-k}\Big\{\|T_{j,\alpha'}\partial_{x_1}^{k+1}\phi(t)\|_2^2 + \int_0^t \|T_{j,\alpha'}\partial_{x_1}^{k+1}\phi\|_2^2\,d\tau\Big\} \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\}.
\]
This proves (4.34) for $\ell = k$, and moreover, we have
\[
\sum_{2j+|\alpha'|=r-1-k}\int_0^t |T_{j,\alpha'}\partial_{x_1}\phi|_k^2\,d\tau \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\}. \tag{4.39}
\]
Applying now Proposition 4.11 with $\sigma = r$ and $\ell = k$ we deduce from (4.37)–(4.39) that
\[
\sum_{2j+|\alpha'|=r-1-k}\int_0^t |T_{j,\alpha'}\dot\phi|_{k+1}^2\,d\tau \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\}. \tag{4.40}
\]
This proves (4.35) for $\ell = k$. In view of (4.37), (4.38) and (4.40) we see from Proposition 4.12 with $\sigma = r$ and $\ell = k$ that (4.36) holds for $\ell = k$. This completes the proof of (4.34)–(4.36).

From the above argument and the inductive assumption on $\sigma$, we conclude that
\[
|[\psi(t)]|_{r-1}^2 + |[\phi(t)]|_r^2 + \int_0^t \Big(|||D\psi|||_r^2 + |[\partial_x\phi]|_{r-1}^2 + \big\|\phi|_{x_1=0}\big\|_{L^2(\mathbb{R}^{n-1})}^2\Big)\,d\tau \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\}. \tag{4.41}
\]
This, together with Proposition 4.13 with $\sigma = r$, gives
\[
\int_0^t |[\partial_\tau\phi]|_{r-2}^2\,d\tau \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\}.
\]
Furthermore, by (4.41) and Lemma 4.5 we have
\[
|[\psi(t)]|_r^2 \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\},
\]
and hence, we arrive at
\[
E_r(t)^2 + D_r(t)^2 \le \eta D_r(t)^2 + C_\eta\big\{E_r(0)^2 + M_s(t)^2\big\}.
\]
By taking $\eta > 0$ suitably small we obtain (4.29) for $\sigma = r$, and the induction argument on $\sigma$ is complete. This completes the proof of Proposition 3.2.
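The two smallness arguments used in this proof can be spelled out as follows. This is a sketch only: it assumes that $M_s(t)^2 = (\delta + E_s(t))(E_s(t)^2 + D_s(t)^2)$, as in the definition preceding Proposition 4.7, and the threshold $C(\delta + K) \le 1/2$ is an illustrative choice rather than a value taken from the paper.

```latex
% (a) Absorbing \eta D_r^2: with \eta = 1/2,
E_r(t)^2 + D_r(t)^2 \le \tfrac12 D_r(t)^2 + C_{1/2}\{E_r(0)^2 + M_s(t)^2\}
\ \Longrightarrow\
E_r(t)^2 + D_r(t)^2 \le 2C_{1/2}\{E_r(0)^2 + M_s(t)^2\}.
% (b) Closing (4.29) at \sigma = s under the a priori bound E_s(t) \le K:
E_s(t)^2 + D_s(t)^2 \le C E_s(0)^2 + C(\delta + K)\{E_s(t)^2 + D_s(t)^2\}
\ \Longrightarrow\
E_s(t)^2 + D_s(t)^2 \le 2C\,E_s(0)^2 \quad\text{if } C(\delta+K)\le \tfrac12 .
```

In (a) the implication uses $E_r^2 + D_r^2 \le 2\,(E_r^2 + \tfrac12 D_r^2)$; in (b) the term multiplied by $C(\delta + K)$ is moved to the left-hand side.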


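5. Decay of Perturbations as $t \to \infty$

We finally prove the decay properties of the perturbation as $t \to \infty$.

Proposition 5.1. Under the assumption of Proposition 3.2,
\[
\|\partial_x\phi(t)\|_{H^{s-1}} + \|\partial_x\psi(t)\|_{H^{s-1}} \to 0 \tag{5.1}
\]
and
\[
\|\phi(t)\|_\infty + \|\psi(t)\|_\infty \to 0 \tag{5.2}
\]
as $t \to \infty$.

Proof. Since
\[
\int_0^\infty \Big(\|\partial_x\phi\|_{H^{s-1}}^2 + \|\partial_x\psi\|_{H^s}^2\Big)\,d\tau < \infty, \tag{5.3}
\]
we find a sequence $\{t_k\}_{k=1}^\infty$ with $t_k \to \infty$ as $k \to \infty$ such that
\[
\|\partial_x\phi(t_k)\|_{H^{s-1}}^2 + \|\partial_x\psi(t_k)\|_{H^s}^2 \to 0 \tag{5.4}
\]
as $k \to \infty$. Applying Lemma 4.5 we have
\[
\|\partial_x\phi(t)\|_{H^{s-2}}^2 + \|\partial_x\psi(t)\|_{H^{s-1}}^2
\le C\Big\{\|\partial_x\phi(t_k)\|_{H^{s-2}}^2 + \|\partial_x\psi(t_k)\|_{H^{s-1}}^2 + \int_{t_k}^t \Big(\|\partial_\tau\phi\|_{H^{s-2}}\|\partial_x\phi\|_{H^{s-1}} + \|\partial_\tau\psi\|_{H^{s-1}}\|\partial_x\psi\|_{H^s}\Big)\,d\tau\Big\} \tag{5.5}
\]
for all $t \ge t_k$ and $k \in \mathbb{N}$. We see from (5.3) and (5.4) that for any $\varepsilon > 0$ there exists a $k$ such that if $t \ge t_k$ then the right-hand side of (5.5) is less than $\varepsilon$, which means
\[
\|\partial_x\phi(t)\|_{H^{s-2}}^2 + \|\partial_x\psi(t)\|_{H^{s-1}}^2 \to 0 \tag{5.6}
\]
as $t \to \infty$. To prove the decay of $\|\partial_x^\alpha\phi(t)\|_2$ for $|\alpha| = s$ we first note that $\partial_x^\alpha\phi \in X^0$ satisfies (4.1) in the weak sense with
\[
\tilde f = -\rho\,\mathrm{div}\,\partial_x^\alpha\psi + \partial_x^\alpha f - [\partial_x^\alpha, u\cdot\nabla]\phi - [\partial_x^\alpha, \rho]\,\mathrm{div}\,\psi .
\]
From the proof of Proposition 3.2 we already know that
\[
\int_0^\infty \|\tilde f\|_2^2\,d\tau < \infty . \tag{5.7}
\]
By Lemma 4.6 (i) we have
\[
\|\partial_x^\alpha\phi(t)\|_2^2 \le \|\partial_x^\alpha\phi(t_k)\|_2^2 + C\int_{t_k}^t \Big(\|\partial_x\psi\|_{H^s}\|\partial_x^\alpha\phi\|_2 + \|\tilde f\|_2\Big)\|\partial_x^\alpha\phi\|_2\,d\tau
\]
for all $t \ge t_k$ and $k \in \mathbb{N}$. This, together with (5.3), (5.4) and (5.7), implies that for any $\varepsilon > 0$ there exists a $k$ such that if $t \ge t_k$ then $\|\partial_x^\alpha\phi(t)\|_2 < \varepsilon$, i.e.,
\[
\|\partial_x^\alpha\phi(t)\|_2 \to 0
\]
as $t \to \infty$ for $|\alpha| = s$. Combining this with (5.6) we conclude the proof of (5.1). Decay property (5.2) now follows from Proposition 3.2, Lemma 4.1 and (5.1). This completes the proof.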
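The mechanism behind the first part of this proof is an elementary fact about nonnegative functions of time. The statement below is a simplified sketch under stronger assumptions than those used in the proof: $g$ is taken to be absolutely continuous with $g' \in L^1(0,\infty)$, and the name $g$ is introduced only for this illustration.

```latex
% Let g \ge 0 on [0,\infty) with \int_0^\infty g\,d\tau < \infty and \int_0^\infty |g'|\,d\tau < \infty.
% Integrability of g gives a sequence t_k \to \infty with g(t_k) \to 0. For t \ge t_k,
g(t) = g(t_k) + \int_{t_k}^{t} g'(\tau)\,d\tau
     \le g(t_k) + \int_{t_k}^{\infty} |g'(\tau)|\,d\tau ,
% and both terms on the right tend to 0 as k \to \infty, hence g(t) \to 0 as t \to \infty.
% In (5.5) the role of g is played by the sum of the squared H^{s-2}-norm of \partial_x\phi
% and the squared H^{s-1}-norm of \partial_x\psi; the products of norms inside the time
% integral control |g'|, and they are integrable on (0,\infty) by the Cauchy-Schwarz inequality.
```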


Appendix

In the Appendix we prove Lemma 4.3. For this purpose we first prepare the following lemma ([3, Lemma A.2]):

Lemma A.1. Let $s$ and $s_k$ $(k = 1,\dots,\ell)$ be nonnegative integers and let $\alpha_k$ $(k = 1,\dots,\ell)$ be multi-indices. Suppose that $s \ge s_0$, $0 \le |\alpha_k| \le s_k \le s + |\alpha_k|$ $(k = 1,\dots,\ell)$ and
\[
s_1 + \cdots + s_\ell \ge (\ell - 1)s + |\alpha_1| + \cdots + |\alpha_\ell| .
\]
Then there exists a constant $C > 0$ such that
\[
\big\|\partial_x^{\alpha_1}f_1\cdots\partial_x^{\alpha_\ell}f_\ell\big\|_2 \le C\prod_{1\le k\le\ell}\|f_k\|_{H^{s_k}} .
\]
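As a sanity check, the simplest instance of Lemma A.1 reduces to a familiar product estimate. The sketch below takes two factors with no derivatives and assumes $s_0 = [n/2] + 1$, so that $H^s \hookrightarrow L^\infty$ for $s \ge s_0$; the value of $s_0$ is fixed elsewhere in the paper and is an assumption here.

```latex
% Two factors, \alpha_1 = \alpha_2 = 0, s_1 = s, s_2 = 0:
%   0 \le |\alpha_k| \le s_k \le s + |\alpha_k|   holds for k = 1, 2,
%   s_1 + s_2 = s \ge (2-1)s + 0                  holds.
\|f_1 f_2\|_2 \le \|f_1\|_\infty \|f_2\|_2
  \le C\,\|f_1\|_{H^{s}}\,\|f_2\|_{H^{0}} = C\,\|f_1\|_{H^{s}}\,\|f_2\|_2 .
```

The general statement distributes the available regularity among several differentiated factors in the same spirit.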

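The proof of Lemma A.1 can be found in the appendix of [3].

Proof of Lemma 4.3. The inequality in (ii) is an immediate consequence of Lemma 4.2. Let us prove the third inequality in (i). We set $z = (x,t)$ and $\nu = (\alpha, j)$. Then
\[
[\partial_x^\alpha\partial_t^j, F(x,t,f_1)]f_2 = [\partial_z^\nu, F(z,f_1)]f_2 = \sum_{0<\gamma\le\nu}\binom{\nu}{\gamma}\partial_z^\gamma\big(F(z,f_1)\big)\,\partial_z^{\nu-\gamma}f_2,
\]
and $\partial_z^\gamma(F(z,f_1))$ is bounded by a linear combination of the following terms:
\[
(\partial_z^\gamma F)(z,f_1) \quad\text{and}\quad \big(\partial_z^{\gamma_0}\partial_y^{\ell}F\big)(z,f_1)\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1
\]
with $0 \le \gamma_0 < \gamma$, $1 \le \ell \le |\gamma| - |\gamma_0|$, $\sum_{m=0}^{\ell}\gamma_m = \gamma$ and $|\gamma_m| \ge 1$ $(m = 1,\dots,\ell)$. Therefore, it suffices to estimate
\[
\big\|(\partial_z^\gamma F)(z,f_1)\,\partial_z^{\nu-\gamma}f_2\big\|_{H^{-1}} \tag{A.1}
\]
and
\[
\Big\|\big(\partial_z^{\gamma_0}\partial_y^{\ell}F\big)(z,f_1)\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\ \partial_z^{\nu-\gamma}f_2\Big\|_{H^{-1}} \tag{A.2}
\]
with $0 \le \gamma_0 < \gamma$, $1 \le \ell \le |\gamma| - |\gamma_0|$, $\sum_{m=0}^{\ell}\gamma_m = \gamma$ and $|\gamma_m| \ge 1$ $(m = 1,\dots,\ell)$. As for (A.1) we have
\[
\big\|(\partial_z^\gamma F)(z,f_1)\,\partial_z^{\nu-\gamma}f_2\big\|_{H^{-1}} \le \big\|(\partial_z^\gamma F)(z,f_1)\,\partial_z^{\nu-\gamma}f_2\big\|_2 \le C_0\big\|\partial_z^{\nu-\gamma}f_2\big\|_2 \le C_0\,|[f_2]|_{\sigma-1} . \tag{A.3}
\]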


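We next consider (A.2). We will show
\[
\Big\|\big(\partial_z^{\gamma_0}\partial_y^{\ell}F\big)(z,f_1)\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\ \partial_z^{\nu-\gamma}f_2\Big\|_{H^{-1}} \le C C_1\,|||Df_1|||_{s-1}\,|[f_2]|_{\sigma-1}, \tag{A.4}
\]
which, together with (A.3), gives the third inequality in (i). We set $\gamma = (\beta, k)$ and $\gamma_m = (\beta_m, k_m)$. For simplicity we assume $|\beta_m| \ge 1$ for $m = 1,\dots,\ell$. The other case can be treated similarly.

We first show inequality (A.4) for $\gamma$ with $1 < |\beta| + 2k < s$. We set $\tilde s = s - |\beta| - 2k$ and $\tilde\sigma = \sigma - 1 - |\alpha - \beta| - 2(j - k)$. Then $\tilde s > 0$ and $\tilde\sigma = |\beta| + 2k - 1 > 0$. Furthermore,
\[
\Big(\frac12 - \frac{\tilde s}{n}\Big) + \Big(\frac12 - \frac{\tilde\sigma}{n}\Big) + \Big(\frac12 - \frac1n\Big) = 1 + \frac12 - \frac sn < 1 .
\]
Therefore, we can find $p_1$, $p_2$ and $p_3$ satisfying
\[
\frac12 - \frac{\tilde s}{n} < \frac{1}{p_1} \le \frac12, \qquad
\frac12 - \frac{\tilde\sigma}{n} \le \frac{1}{p_2} \le \frac12, \qquad
\frac12 - \frac1n < \frac{1}{p_3} \le \frac12
\]
and
\[
\frac{1}{p_1} + \frac{1}{p_2} + \frac{1}{p_3} = 1 .
\]
Applying Lemma 4.1 we see that for any $\varphi \in C_0^\infty$,
\[
\begin{aligned}
\Big|\Big(\big(\partial_z^{\gamma_0}\partial_y^{\ell}F\big)(z,f_1)\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\ \partial_z^{\nu-\gamma}f_2,\ \varphi\Big)\Big|
&\le C_1\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_{p_1}\big\|\partial_z^{\nu-\gamma}f_2\big\|_{p_2}\|\varphi\|_{p_3} \\
&\le C C_1\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_{H^{\tilde s}}\big\|\partial_z^{\nu-\gamma}f_2\big\|_{H^{\tilde\sigma}}\|\varphi\|_{H^1} \\
&\le C C_1\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_{H^{\tilde s}}\,|[f_2]|_{\sigma-1}\,\|\varphi\|_{H^1} .
\end{aligned}
\]
The first factor on the right-hand side of this inequality is estimated as
\[
\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_{H^{\tilde s}}
\le C\sum_{|\lambda_1|+\cdots+|\lambda_\ell|\le\tilde s}\Big\|\prod_{m=1}^{\ell}\partial_x^{\beta_m+\lambda_m}\partial_t^{k_m}f_1\Big\|_2
= C\sum_{|\lambda_1|+\cdots+|\lambda_\ell|\le\tilde s}\Big\|\prod_{m=1}^{\ell}\partial_x^{\beta_m+\lambda_m-1}\partial_t^{k_m}\partial_x f_1\Big\|_2 .
\]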

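Since $\sum_{m=1}^{\ell}(2k_m + |\beta_m|) \le 2k + |\beta|$, we have
\[
\sum_{m=1}^{\ell}\big\{s - 1 - 2k_m - |\beta_m + \lambda_m - 1|\big\} = \ell s - \sum_{m=1}^{\ell}(2k_m + |\beta_m|) - \sum_{m=1}^{\ell}|\lambda_m| \ge \ell s - 2k - |\beta| - \tilde s = (\ell - 1)s .
\]
Lemma A.1 then implies
\[
\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_{H^{\tilde s}} \le C\sum_{|\lambda_1|+\cdots+|\lambda_\ell|\le\tilde s}\ \prod_{m=1}^{\ell}\big\|\partial_t^{k_m}\partial_x f_1\big\|_{H^{s-1-2k_m}} \le C\,|||Df_1|||_{s-1},
\]
and, consequently, we obtain inequality (A.4).

We next show inequality (A.4) for $\gamma$ with $|\beta| + 2k = s$. In this case $\sigma = s$ and $\gamma = \nu$. Since
\[
\Big(\frac12 - \frac{s-1}{n}\Big) + \Big(\frac12 - \frac1n\Big) = \frac12 + \Big(\frac12 - \frac sn\Big) < \frac12,
\]
there exist $p_1$ and $p_2$ satisfying
\[
\frac12 - \frac{s-1}{n} < \frac{1}{p_1} \le \frac12, \qquad \frac12 - \frac1n < \frac{1}{p_2} \le \frac12
\]
and
\[
\frac{1}{p_1} + \frac{1}{p_2} = \frac12 .
\]
By Lemma 4.1 we have
\[
\Big|\Big(\big(\partial_z^{\gamma_0}\partial_y^{\ell}F\big)(z,f_1)\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\ f_2,\ \varphi\Big)\Big|
\le C_1\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_2\|f_2\|_{p_1}\|\varphi\|_{p_2}
\le C C_1\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_2\|f_2\|_{H^{s-1}}\|\varphi\|_{H^1} .
\]
Since
\[
\sum_{m=1}^{\ell}\big\{s - 1 - 2k_m - |\beta_m - 1|\big\} = \ell s - \sum_{m=1}^{\ell}(2k_m + |\beta_m|) \ge \ell s - 2k - |\beta| - \tilde s = (\ell - 1)s,
\]
we deduce from Lemma A.1 that
\[
\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_2 \le C\sum_{|\lambda_1|+\cdots+|\lambda_\ell|\le\tilde s}\ \prod_{m=1}^{\ell}\big\|\partial_t^{k_m}\partial_x f_1\big\|_{H^{s-1-2k_m}} \le C\,|||Df_1|||_{s-1},
\]
and hence, inequality (A.4) also holds in this case.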


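Let us prove inequality (A.4) for $\gamma$ with $|\beta| + 2k = 1$. In this case $|\beta| = 1$ and $k = 0$, and hence, $\gamma_0 = (0,0)$ and $\ell = 1$. Therefore, with the same $p_1$ and $p_2$ as in the previous case, we have
\[
\begin{aligned}
\Big|\Big((\partial_y F)(z,f_1)\,\partial_x^\beta f_1\,\partial_z^{\nu-\gamma}f_2,\ \varphi\Big)\Big|
&\le C_1\big\|\partial_x^\beta f_1\big\|_{p_1}\big\|\partial_z^{\nu-\gamma}f_2\big\|_2\|\varphi\|_{p_2} \\
&\le C C_1\|\partial_x f_1\|_{H^{s-1}}\big\|\partial_t^j f_2\big\|_{H^{\sigma-1-2j}}\|\varphi\|_{H^1} \\
&\le C C_1\,|||Df_1|||_{s-1}\,|[f_2]|_{\sigma-1}\,\|\varphi\|_{H^1},
\end{aligned}
\]
from which inequality (A.4) is obtained.

We next prove the second inequality in (i). In view of (A.3), it suffices to estimate
\[
\Big\|\big(\partial_z^{\gamma_0}\partial_y^{\ell}F\big)(z,f_1)\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\ \partial_z^{\nu-\gamma}f_2\Big\|_2
\]
with $0 \le \gamma_0 < \gamma$, $1 \le \ell \le |\gamma| - |\gamma_0|$, $\sum_{m=0}^{\ell}\gamma_m = \gamma$ and $|\gamma_m| \ge 1$ $(m = 1,\dots,\ell)$. For simplicity we assume $k_m \ge 1$ for $m = 1,\dots,\ell$. The other case can be treated similarly. Since
\[
\begin{aligned}
&\big\{s - 1 - 2(k_1 - 1) - |\beta_1|\big\} + \sum_{m=2}^{\ell}\big\{s - 2 - 2(k_m - 1) - |\beta_m|\big\} + \tilde\sigma \\
&\qquad= \big\{s + 1 - 2k_1 - |\beta_1|\big\} + \sum_{m=2}^{\ell}\big\{s - 2k_m - |\beta_m|\big\} + (2k + |\beta| - 1) \\
&\qquad= \ell s - \sum_{m=1}^{\ell}(2k_m + |\beta_m|) + 2k + |\beta| \ \ge\ \ell s,
\end{aligned}
\]
we apply Lemma A.1 to obtain
\[
\begin{aligned}
\Big\|\big(\partial_z^{\gamma_0}\partial_y^{\ell}F\big)(z,f_1)\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\ \partial_z^{\nu-\gamma}f_2\Big\|_2
&\le C_1\Big\|\partial_x^{\beta_1}\partial_t^{k_1-1}\partial_t f_1\ \prod_{m=2}^{\ell}\partial_x^{\beta_m}\partial_t^{k_m-1}\partial_t f_1\ \partial_z^{\nu-\gamma}f_2\Big\|_2 \\
&\le C C_1\big\|\partial_t^{k_1-1}\partial_t f_1\big\|_{H^{s-1-2(k_1-1)}}\ \prod_{m=2}^{\ell}\big\|\partial_t^{k_m-1}\partial_t f_1\big\|_{H^{s-2-2(k_m-1)}}\ \big\|\partial_z^{\nu-\gamma}f_2\big\|_{H^{\tilde\sigma}} \\
&\le C C_1\,|||Df_1|||_s\,|||Df_1|||_{s-1}\,|[f_2]|_{\sigma-1} .
\end{aligned}
\]
This, together with (A.3), gives the second inequality in (i). The first inequality in (i) can be proved similarly by applying Lemma A.1. We omit the details. This completes the proof.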


References

1. Friedman, A.: Partial Differential Equations. New York: Holt, Rinehart and Winston, 1969
2. Galdi, G. P.: An Introduction to the Mathematical Theory of the Navier-Stokes Equations. Vol. 1, New York: Springer-Verlag, 1994
3. Kagei, Y., Kobayashi, T.: Asymptotic behavior of solutions to the compressible Navier-Stokes equations on the half space. Arch. Rat. Mech. Anal. 177, 231–330 (2005)
4. Kagei, Y., Kawashima, S.: Local solvability of initial boundary value problem for a quasilinear hyperbolic-parabolic system. To appear in J. Hyperbolic Differential Equations
5. Kawashima, S., Nishibata, S., Zhu, P.: Asymptotic Stability of the Stationary Solution to the Compressible Navier-Stokes Equations in the Half Space. Commun. Math. Phys. 240, 483–500 (2003)
6. Matsumura, A.: Inflow and outflow problems in the half space for a one-dimensional isentropic model system of compressible viscous gas. Nonlinear Analysis 47, 4269–4282 (2001)
7. Matsumura, A., Nishida, T.: Initial boundary value problems for the equations of motion of compressible viscous and heat-conductive fluids. Commun. Math. Phys. 89, 445–464 (1983)

Communicated by P. Constantin

Commun. Math. Phys. 266, 431–454 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0032-2

Communications in

Mathematical Physics

The Restricted Kirillov–Reshetikhin Modules for the Current and Twisted Current Algebras

Vyjayanthi Chari¹, Adriano Moura²

¹ Department of Mathematics, University of California, Riverside, CA 92521, USA. E-mail: [email protected]
² UNICAMP - IMECC, Campinas SP - Brazil, 13083-859. E-mail: [email protected]

Received: 3 August 2005 / Accepted: 16 January 2006 Published online: 25 April 2006 – © Springer-Verlag 2006

Abstract: We define a family of graded restricted modules for the polynomial current algebra associated to a simple Lie algebra. We study the graded character of these modules and show that they are the same as the graded characters of certain Demazure modules. In particular, we see that the specialized characters are the same as those of the Kirillov–Reshetikhin modules for quantum affine algebras.

0. Introduction

In this paper we define and study a family of $\mathbf{Z}_+$-graded modules for the polynomial valued current algebra $\mathfrak{g}[t]$ and the twisted current algebra $\mathfrak{g}[t]^\sigma$ associated to a finite-dimensional classical simple Lie algebra $\mathfrak{g}$ and a non-trivial diagram automorphism $\sigma$ of $\mathfrak{g}$. The modules, which we denote by $KR(m\omega_i)$ and $KR^\sigma(m\omega_i)$ respectively, are indexed by pairs $(i,m)$, where $i$ is a node of the Dynkin diagram and $m$ is a non-negative integer, and are given by generators and relations. These modules are indecomposable, but usually reducible, and we describe their Jordan–Hölder series by giving the corresponding graded decomposition as a direct sum of irreducible modules for the underlying finite-dimensional simple Lie algebra. Moreover, we prove that the modules are finite-dimensional and hence restricted, i.e., there exists an integer $n \in \mathbf{Z}_+$ depending only on $\mathfrak{g}$ and $\sigma$ such that $(\mathfrak{g}\otimes t^n)KR(m\omega_i) = 0$.

It turns out that this graded decomposition is exactly the one predicted in [9, Appendix A], [10, Sect. 6] coming from the study of the Bethe Ansatz in solvable lattice models. Our interest in these modules and the motivation for calling them the Kirillov–Reshetikhin modules arises from the fact that when we specialize the grading by setting $t = 1$ (or equivalently putting $q = 1$ in the formulae in [9, 10]) the character of the module is exactly the one predicted in [13, 14] for a family of irreducible finite-dimensional modules for the Yangian of $\mathfrak{g}$. Analogous modules for the untwisted quantum affine algebra associated to $\mathfrak{g}$ are also known to exist with this decomposition [2] and these have been studied from a combinatorial viewpoint in [9, 15–17]. One of the methods used in [2] involves passing to the $q = 1$ limit of the modules for the quantum affine algebra, although the resulting modules are not graded or restricted. The methods used in [2] require a number of complicated results from the representation theory of the untwisted quantum affine algebras which have not been proved in the twisted case.

In this paper we show that the Kirillov–Reshetikhin modules can be studied in the non-quantum case. Thus we define graded restricted analogues and compute their graded characters without resorting to the quantum situation. As a consequence we have a mathematical interpretation of the parameter $q$ which appears in the fermionic formulae in [9, 10]. Then we prove that the abstractly defined modules $KR(m\omega_i)$ have a concrete construction as follows: the $\mathfrak{g}[t]$-module structure of the "fundamental" Kirillov–Reshetikhin modules (in most cases these modules correspond to taking $m = 1$) is described explicitly and then the module $KR(m\omega_i)$ is realized as a canonical submodule of a tensor product of the fundamental modules.

Another motivation for our interest is the connection with Demazure modules, [3, 5–7]. Thus we are able to prove that the Kirillov–Reshetikhin modules for the current algebras are isomorphic as representations of the current algebra to the Demazure modules in multiples of the basic representation of the current algebras. Our methods also work for some nodes of the other exceptional algebras; we explain this together with the reasons for the difficulties for the exceptional algebras in the concluding section of this paper.

VC was partially supported by the NSF grant DMS-0500751.

1. Preliminaries

1.1. Notation. Let $\mathbf{Z}_+$ (resp. $\mathbf{N}$) be the set of non-negative (resp. positive) integers. Given a Lie algebra $\mathfrak{a}$, let $U(\mathfrak{a})$ denote the universal enveloping algebra of $\mathfrak{a}$ and $\mathfrak{a}[t] = \mathfrak{a}\otimes\mathbf{C}[t]$ the polynomial valued current Lie algebra of $\mathfrak{a}$. The Lie algebra $\mathfrak{a}[t]$ and its universal enveloping algebra are $\mathbf{Z}_+$-graded where the grading is given by the powers of $t$. We shall identify $\mathfrak{a}$ with the subalgebra $\mathfrak{a}\otimes 1$ of $\mathfrak{a}[t]$.

1.2. Classical simple Lie algebras. For the rest of the paper $\mathfrak{g}$ denotes a complex finite-dimensional simple Lie algebra of type $A_n$, $B_n$, $C_n$ or $D_n$, $n \ge 1$, and $\mathfrak{h}$ a Cartan subalgebra of $\mathfrak{g}$. Let $I = \{1,\dots,n\}$ and $\{\alpha_i : i \in I\}$ (resp. $\{\omega_i : i \in I\}$) be a set of simple roots (resp. fundamental weights) of $\mathfrak{g}$ with respect to $\mathfrak{h}$, $R^+$ (resp. $Q$, $P$) be the corresponding set of positive roots (resp. root lattice, weight lattice) and let $Q^+$, $P^+$ be the $\mathbf{Z}_+$-span of the simple roots and fundamental weights respectively. It is convenient to set $\omega_0 = 0$. Let $\theta \in R^+$ be the highest root. For $i \in I$, let $\varepsilon_i : Q \to \mathbf{Z}$ be defined by requiring $\eta = \sum_{i\in I}\varepsilon_i(\eta)\alpha_i$, $\eta \in Q$. We assume throughout the paper that the simple roots are indexed as in [1]. Given $\alpha \in R$, let $\mathfrak{g}_\alpha$ be the corresponding root space. Fix non-zero elements $x_\alpha^\pm \in \mathfrak{g}_{\pm\alpha}$, $h_\alpha \in \mathfrak{h}$, such that
\[
[h_\alpha, x_\alpha^\pm] = \pm 2x_\alpha^\pm, \qquad [x_\alpha^+, x_\alpha^-] = h_\alpha .
\]
Set $\mathfrak{n}^\pm = \bigoplus_{\alpha\in R^+}\mathfrak{g}_{\pm\alpha}$. Given any subset $J \subset I$ let $\mathfrak{g}(J)$ be the subalgebra of $\mathfrak{g}$ generated by the elements $x_{\alpha_j}^\pm$, $j \in J$. The sets $R(J)$, $P(J)$ etc. are defined in the obvious way. Let $(\ ,\ )$ be the form on $\mathfrak{h}^*$ induced by the restriction of the Killing form of $\mathfrak{g}$ to $\mathfrak{h}$ normalized so that $(\theta,\theta) = 2$ and set $\check d_j = 2/(\alpha_j,\alpha_j)$.


1.3. Finite–dimensional g–modules. Given λ ∈ P + , let V (λ) be the irreducible finite– dimensional g–module with highest weight vector vλ , i.e., the cyclic module generated by vλ with defining relations: λh α +1 i vλ = 0. n+ vλ = 0, hvλ = λ(h)vλ , xα−i For μ ∈ P, let V (λ)μ = {v ∈ V (λ) : hv = μ(h)v, h ∈ h}. Any finite–dimensional g–module M is isomorphic to a direct sum ⊕λ∈P + V (λ)⊕m λ (M) , m λ (M) ∈ Z+ . Proposition ([18]). Let λ, μ ∈ P + . Then V (λ) ⊗ V (μ) is generated as a g–module by the element vλ ⊗ vμ∗ and relations: h vλ ⊗ vμ∗ = (λ + μ∗ )(h) vλ ⊗ vμ∗ , and

xα+

−μ∗ (h α )+1 λ(h α )+1 vλ ⊗ vμ∗ = xα− vλ ⊗ vμ∗ = 0,

for all α ∈ R + . Here μ∗ is the lowest weight of V (μ) and 0 = vμ∗ ∈ V (μ)μ∗ . 1.4. Graded modules. Given a g–module M, we regard it as a g[t]–module by setting (x ⊗ t r )m = 0 for all m ∈ M, r ∈ N, and denote the resulting graded g[t]–module by ev0 (M). Let V = ⊕s∈Z+ V [s] be a graded representation of g[t] with dim(V [s]) < ∞. Note that each V [s] is a g–module. For any s ∈ Z+ , let V (s) be the g[t]–quotient of V by the submodule ⊕s ≥s V [s ]. Clearly V (s) ∼ = ev0 (V [s]), V (s + 1) and the irreducible constituents of V are just the irreducible constituents of ev0 (V [s]), s ∈ Z+ . 2. The Kirillov–Reshetikhin Modules for g[t] 2.1. The modules K R(mωi ). Definition. Given i ∈ I and m ∈ Z+ , let K R(mωi ) be the g[t]–module generated by an element vi,m with relations, n+ [t]vi,m = 0, hvi,m = mωi (h)vi,m , (h ⊗ t r )vi,m = 0, h ∈ h, r ∈ N, (2.1) and

$$(x^-_{\alpha_i})^{m+1} v_{i,m} = (x^-_{\alpha_i} \otimes t)\, v_{i,m} = x^-_{\alpha_j} v_{i,m} = 0, \qquad j \in I \setminus \{i\}. \tag{2.2}$$

Note that the modules K R(mωi ) are graded modules since the defining relations are graded. The following is trivially checked. Lemma. For all i ∈ I , m ∈ Z+ , the module ev0 (V (mωi )) is a quotient of K R(mωi ). Remark. These modules were defined and studied initially in [2, Sect. 1] (where they were denoted as W (i, m)) as quotients of the finite–dimensional Weyl modules defined in [4]. In this paper however, we shall show directly that the modules K R(mωi ) are finite–dimensional as a consequence of the analysis of their g–module structure.


V. Chari, A. Moura

2.2. The graded character of $KR(m\omega_i)$. We now state the main result of this section. We need some additional notation. For $i \in I$, $m \in \mathbb{Z}_+$, let $P^+(i,m) \subset P^+$ be defined by:

$$P^+(i,m) = P^+(i, \check{d}_i) + P^+(i, m - \check{d}_i), \qquad m > \check{d}_i,$$

where $P^+(i,1) = \{\omega_i\}$ if $\varepsilon_i(\theta) = \check{d}_i$ and in the other cases,

$$P^+(i,1) = \{\omega_i, \omega_{i-2}, \dots, \omega_{\bar{i}}\}, \qquad \mathfrak{g} = D_n,$$
$$P^+(i,1) = \{\omega_i, \omega_{i-2}, \dots, \omega_{\bar{i}}\}, \qquad \mathfrak{g} = B_n,$$
$$P^+(n,2) = \{2\omega_n, \omega_{n-2}, \dots, \omega_{\bar{n}}\}, \qquad \mathfrak{g} = B_n,$$
$$P^+(i,2) = \{2\omega_i, 2\omega_{i-1}, \dots, 2\omega_1, 0\}, \qquad \mathfrak{g} = C_n,$$

where $\bar{i} \in \{0,1\}$ and $\bar{i} = i \bmod 2$. Let $\mu_0, \dots, \mu_k$ be the unique enumeration of the sets $P^+(i, \check{d}_i)$ chosen so that

$$\mu_j - \mu_{j+1} \in R^+, \qquad \mu_j - \mu_{j+2} \in Q^+ \setminus R^+, \qquad 0 \le j \le k-1.$$

Given $m = \check{d}_i m_0 + m_1$ with $0 \le m_1 < \check{d}_i$ and $\mu \in P^+(i,m)$, we can clearly write

$$\mu = m_1 \omega_i + \mu_{j_1} + \dots + \mu_{j_{m_0}},$$

where $j_r \in \{0, 1, \dots, k\}$ for $1 \le r \le m_0$. We say that the expression is reduced if each $j_r$ is minimal with the property that $\mu - \mu_{j_1} - \dots - \mu_{j_r} \in P^+(i, m-r)$. Such an expression is clearly unique and we set

$$|\mu| = \sum_{r=1}^{m_0} j_r.$$

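Theorem. (i) Let $i \in I$, $m, s \in \mathbb{Z}_+$. We have,

$$KR(m\omega_i)[s] \cong \bigoplus_{\{\mu \in P^+(i,m)\,:\,|\mu| = s\}} V(\mu).$$

(ii) Write $m = \check{d}_i m_0 + m_1$, where $0 \le m_1 < \check{d}_i$. The canonical homomorphism of $\mathfrak{g}[t]$–modules

$$KR(m\omega_i) \to KR(m_1\omega_i) \otimes KR(\check{d}_i\omega_i)^{\otimes m_0}$$

mapping $v_{i,m} \mapsto v_{i,m_1} \otimes v_{i,\check{d}_i}^{\otimes m_0}$ is injective.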
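The combinatorics behind part (i) can be made concrete in a small case. The following is only an illustrative sketch, under the assumption that g is of type D_n and 2 ≤ i ≤ n − 2, so that ď_i = 1, μ_j = ω_{i−2j} and P+(i, m) is the m-fold sum of P+(i, 1); the function names are hypothetical and not taken from the paper. Since the listed weights are linearly independent (apart from ω_0 = 0), each μ has a unique expression as a multiset of the μ_j, and |μ| is the sum of the indices.

```python
from collections import Counter
from itertools import combinations_with_replacement


def enumeration_indices(i):
    """Indices [i, i-2, ..., i mod 2], so that mu_j = omega_{indices[j]} (omega_0 = 0)."""
    return list(range(i, i % 2 - 1, -2))


def graded_components(i, m):
    """Map each grade s to the weights mu in P^+(i, m) with |mu| = s.

    Assumes type D_n with 2 <= i <= n - 2 (so d_i = 1). A weight is stored as a
    tuple of (fundamental-weight index, coefficient) pairs; () is the zero weight.
    """
    idx = enumeration_indices(i)
    components = {}
    for js in combinations_with_replacement(range(len(idx)), m):
        coeffs = Counter(idx[j] for j in js)
        coeffs.pop(0, None)  # omega_0 = 0 contributes nothing
        weight = tuple(sorted(coeffs.items(), reverse=True))
        components.setdefault(sum(js), []).append(weight)  # |mu| = j_1 + ... + j_m
    return components


if __name__ == "__main__":
    for grade, weights in sorted(graded_components(4, 2).items()):
        print(grade, weights)
```

For i = 4 and m = 2 this lists one weight in grades 0, 1, 3, 4 and two weights in grade 2, which is the shape of the decomposition the theorem predicts for KR(2ω_4) in this case.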

The rest of the section is devoted to the proof of this result.

2.3. Elementary properties of $KR(m\omega_i)$.

Proposition. (i) We have

$$KR(m\omega_i) = \bigoplus_{\mu \in \mathfrak{h}^*} KR(m\omega_i)_\mu$$

and $KR(m\omega_i)_\mu \ne 0$ only if $\mu \in m\omega_i - Q^+$.


(ii) Regarded as a $\mathfrak{g}$–module, $KR(m\omega_i)$ and $KR(m\omega_i)[s]$, $s \in \mathbb{Z}_+$, are isomorphic to a direct sum of irreducible finite–dimensional representations of $\mathfrak{g}$. (iii) For all $0 \le r \le m$, there exists a canonical homomorphism $KR(m\omega_i) \to KR(r\omega_i) \otimes KR((m-r)\omega_i)$ of graded $\mathfrak{g}[t]$–modules such that $v_{i,m} \mapsto v_{i,r} \otimes v_{i,m-r}$.

Proof. Part (i) follows by a standard application of the PBW theorem. For (ii) it suffices, by standard results, to show that $KR(m\omega_i)$ is a sum of finite–dimensional $\mathfrak{g}$–modules. Note that the defining relations of $KR(m\omega_i)$ imply that $U(\mathfrak{g})v_{i,m} \cong V(m\omega_i)$ as $\mathfrak{g}$–modules. Hence

$$(x_\alpha^-)^{m\omega_i(h_\alpha)+1} v_{i,m} = 0 \tag{2.3}$$

for all $\alpha \in R^+$. For $v \in KR(m\omega_i)_\mu$ we have $U(\mathfrak{g})v = U(\mathfrak{n}^-)U(\mathfrak{n}^+)v$. Part (i) implies that $U(\mathfrak{n}^+)v$ is a finite–dimensional vector space. Further, since the action of $\mathfrak{n}^-$ on $\mathfrak{n}^-[t]$ given by the Lie bracket is locally nilpotent and since $v \in U(\mathfrak{n}^-[t])v_{i,m}$, it follows from (2.3) that $x_\alpha^-$ acts nilpotently on $v$. This proves that $U(\mathfrak{g})v$ is finite–dimensional and part (ii) is established. Part (iii) is clear from the defining relations of the modules.

Corollary. For $i \in I$, $m \in \mathbb{Z}_+$, we have

$$KR(m\omega_i) = \bigoplus_{\mu \in P^+} V(\mu)^{\oplus m_\mu(i,m)}.$$

In particular, $KR(0) \cong \mathbb{C}$.

2.4. An upper bound for $m_\mu(i,m)$. The following result was proved in [2, Theorem 1] under the assumption that $KR(m\omega_i)$ is finite–dimensional. An inspection of the proof shows, however, that the only place this is used is to write $KR(m\omega_i)$ as a direct sum of irreducible $\mathfrak{g}$–modules. But (as we shall see in the twisted case, where we do give a proof of the analogous proposition) this only requires the weaker result proved in Proposition 2.3.

Proposition. As a $\mathfrak{g}$–module we have

$$KR(m\omega_i) \cong \bigoplus_{\mu \in P^+(i,m)} V(\mu)^{\oplus m_\mu},$$

where $m_\mu \in \{0, 1\}$.

The next corollary is immediate since $P^+(i,m)$ is a finite set by definition.

Corollary. For all $i \in I$, the modules $KR(m\omega_i)$ are finite–dimensional.

2.5. Proof of Theorem 2.2: the cases $\varepsilon_i(\theta) = 1$, $m \in \mathbb{Z}_+$ and $\varepsilon_i(\theta) = \check{d}_i$, $m = 1$. In these cases, it follows from the definition that $P^+(i,m) = \{m\omega_i\}$. Since $\mathrm{ev}_0(V(m\omega_i))$ is a $\mathfrak{g}[t]$–module quotient of $KR(m\omega_i)$, part (i) is immediate from Proposition 2.4. Part (ii) of the theorem is now obvious since the canonical inclusion $V(m\omega_i) \to V(\omega_i)^{\otimes m}$ of $\mathfrak{g}$–modules is also an inclusion of the $\mathfrak{g}[t]$–modules $\mathrm{ev}_0(V(m\omega_i)) \to \mathrm{ev}_0(V(\omega_i))^{\otimes m}$. Since $\varepsilon_i(\theta) = 1$ for all $1 \le i \le n$ if $\mathfrak{g}$ is of type $A_n$, the theorem is proved in this case and we assume for the rest of this section that $\mathfrak{g}$ is not of type $A_n$.
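To see which nodes fall under the cases just treated, one can tabulate ε_i(θ) and ď_i. The sketch below is an illustration only: it assumes the Bourbaki numbering of the simple roots (which we take to be the convention of [1]), ranks n ≥ 2 for B_n and C_n and n ≥ 4 for D_n, and the standard expressions for the highest root; the function names are hypothetical.

```python
def highest_root_coeffs(typ, n):
    """Coefficients eps_i(theta), i = 1..n, in the assumed Bourbaki numbering."""
    if typ == "A":
        return [1] * n
    if typ == "B":  # theta = a_1 + 2a_2 + ... + 2a_n
        return [1] + [2] * (n - 1)
    if typ == "C":  # theta = 2a_1 + ... + 2a_{n-1} + a_n
        return [2] * (n - 1) + [1]
    if typ == "D":  # theta = a_1 + 2a_2 + ... + 2a_{n-2} + a_{n-1} + a_n
        return [1] + [2] * (n - 3) + [1, 1]
    raise ValueError(f"unsupported type {typ!r}")


def d_check(typ, n):
    """d_j = 2 / (alpha_j, alpha_j) with the form normalized by (theta, theta) = 2."""
    if typ == "B":  # alpha_n is the only short simple root
        return [1] * (n - 1) + [2]
    if typ == "C":  # alpha_n is the only long simple root
        return [2] * (n - 1) + [1]
    return [1] * n  # simply laced types A and D


def classify_nodes(typ, n):
    """Label each node i by how eps_i(theta) compares with d_i."""
    labels = {}
    for i, (eps, d) in enumerate(zip(highest_root_coeffs(typ, n), d_check(typ, n)), start=1):
        if eps == 1:
            labels[i] = "eps=1"        # P^+(i, m) = {m omega_i} for all m
        elif eps == d:
            labels[i] = "eps=d=2"      # P^+(i, 1) = {omega_i}
        else:
            labels[i] = "eps=2,d=1"    # the case handled in Sect. 2.6
    return labels
```

Under these assumptions, for example, `classify_nodes("D", 6)` marks nodes 2, 3, 4 as the ones with ε_i(θ) = 2 and ď_i = 1, which are exactly the nodes for which P+(i, 1) contains more than one weight.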


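2.6. An explicit construction in the case $\varepsilon_i(\theta) = 2$ and $m = \check{d}_i$. Let $V_s$, $0 \le s \le k$, be $\mathfrak{g}$–modules such that

$$\mathrm{Hom}_{\mathfrak{g}}(\mathfrak{g} \otimes V_s, V_{s+1}) \ne 0, \qquad \mathrm{Hom}_{\mathfrak{g}}(\wedge^2(\mathfrak{g}) \otimes V_s, V_{s+2}) = 0, \qquad 0 \le s \le k-1, \tag{2.4}$$

where we assume that $V_{k+1} = 0$. Fix non–zero elements $p_s \in \mathrm{Hom}_{\mathfrak{g}}(\mathfrak{g} \otimes V_s, V_{s+1})$, $0 \le s \le k-1$, and set $p_k = 0$. It is easily checked that the following formulas extend the canonical $\mathfrak{g}$–module structure to a graded $\mathfrak{g}[t]$–module structure on $V = \bigoplus_{s=0}^{k} V_s$:

$$(x \otimes t)v = p_s(x \otimes v), \qquad (x \otimes t^r)v = 0, \quad r \ge 2,$$

for all $x \in \mathfrak{g}$, $v \in V_s$, $0 \le s \le k$. Clearly, $V[s] \cong_{\mathfrak{g}} V_s$, $0 \le s \le k$. Moreover, if the maps $p_s$, $0 \le s \le k-1$, are all surjective and if $V_0 = U(\mathfrak{g})v_0$, then $V = U(\mathfrak{g}[t])v_0$.

Proposition. Let $i \in I$ be such that $\varepsilon_i(\theta) = 2$ and let $\mu_s \in P^+(i, \check{d}_i)$, $0 \le s \le k$. The modules $V(\mu_s)$, $0 \le s \le k$, satisfy (2.4) and the resulting $\mathfrak{g}[t]$–module is isomorphic to $KR(\check{d}_i\omega_i)$. In particular,

$$KR(\check{d}_i\omega_i)[j] \cong_{\mathfrak{g}} V(\mu_j), \qquad 0 \le j \le k.$$

Proof. Using Proposition 1.3 it is easy to see that

$$\mathrm{Hom}_{\mathfrak{g}}(\mathfrak{g} \otimes V(\mu_s), V(\mu_{s+1})) \cong \mathrm{Hom}_{\mathfrak{g}}(\mathfrak{g}, V(\mu_s) \otimes V(\mu_{s+1})) \ne 0$$

for all $0 \le s \le k-1$. It is not hard to check (see [8, 19]) that as $\mathfrak{g}$–modules, $\wedge^2(\mathfrak{g}) \cong \mathfrak{g} \oplus V(\nu)$, where $\nu = 2\omega_1 + \omega_2$ if $\mathfrak{g}$ is of type $C_n$, $\nu = \omega_1 + 2\omega_3$ if $\mathfrak{g}$ is of type $B_3$, $\nu = \omega_1 + \omega_3 + \omega_4$ if $\mathfrak{g}$ is of type $D_4$, and $\nu = \omega_1 + \omega_3$ otherwise. Since $\mu_s - \mu_{s+2} \notin R$, it follows that $\mathrm{Hom}_{\mathfrak{g}}(\mathfrak{g} \otimes V(\mu_s), V(\mu_{s+2})) = 0$. To prove that $\mathrm{Hom}_{\mathfrak{g}}(V(\nu) \otimes V(\mu_s), V(\mu_{s+2})) = 0$, it suffices to prove that $\mathrm{Hom}_{\mathfrak{g}}(V(\mu_s), V(\nu) \otimes V(\mu_{s+2})) = 0$. Suppose that $0 \ne p \in \mathrm{Hom}_{\mathfrak{g}}(V(\mu_s), V(\nu) \otimes V(\mu_{s+2}))$. A simple computation using the explicit formulas for the fundamental weights in terms of the simple roots (see [11] for instance) shows that

$$\mu_{s+2} + \nu - \mu_s \in Q_J^+, \qquad J = \{1, \dots, s-1\}.$$

Hence $0 \ne p(v_{\mu_s}) \in U(\mathfrak{g}_J)v_\nu \otimes U(\mathfrak{g}_J)v_{\mu_{s+2}}$, which implies that

$$p(U(\mathfrak{g}_J)v_{\mu_s}) \subset U(\mathfrak{g}_J)v_\nu \otimes U(\mathfrak{g}_J)v_{\mu_{s+2}}.$$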


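The formulas given for $\mu_s$ show that $\mu_s(\mathfrak{h}_J) = 0$. This means that $U(\mathfrak{g}_J)v_{\mu_s} \cong \mathbb{C}$, which implies that

$$p(\mathbb{C}) \cong \mathbb{C} \subset U(\mathfrak{g}_J)v_\nu \otimes U(\mathfrak{g}_J)v_{\mu_{s+2}}.$$

But this is impossible since by standard results $U(\mathfrak{g}_J)v_{\mu_{s+2}}$ is not isomorphic to the $\mathfrak{g}_J$–dual of $U(\mathfrak{g}_J)v_\nu$. Hence $p = 0$ and we have proved that the modules $V(\mu_s)$, $0 \le s \le k-1$, satisfy the conditions of (2.4). Let $\widetilde{KR}(\check{d}_i\omega_i)$ be the resulting $\mathfrak{g}[t]$–module. Since the modules $V(\mu_s)$ are all irreducible, it follows moreover that

$$\widetilde{KR}(\check{d}_i\omega_i) = U(\mathfrak{g}[t])v_{\check{d}_i\omega_i}.$$

To see that $\widetilde{KR}(\check{d}_i\omega_i) \cong KR(\check{d}_i\omega_i)$ it suffices, by Proposition 2.4, to prove that $\widetilde{KR}(\check{d}_i\omega_i)$ is a quotient of $KR(\check{d}_i\omega_i)$. Since $\mu_0 = m\omega_i$ and $\mu_0 - \mu_1 \in R^+ \setminus \{\alpha_i\}$, we see that we must have

$$p_0\big((\mathfrak{n}^+ \oplus \mathfrak{h} \oplus \mathbb{C}x^-_{\alpha_i}) \otimes v_{m\omega_i}\big) = 0.$$

This proves that

$$\mathfrak{n}^+[t]v_{m\omega_i} = 0, \qquad hv_{m\omega_i} = (m\omega_i)(h)v_{m\omega_i}, \qquad (h \otimes t^r)v_{m\omega_i} = (x^-_{\alpha_i} \otimes t)v_{m\omega_i} = 0, \quad r \ge 1.$$

Finally, since $\widetilde{KR}(m\omega_i)$ is obviously finite–dimensional, it follows that

$$(x^-_{\alpha_i})^{m\omega_i(h_{\alpha_i})+1} v_{\mu_0} = x^-_{\alpha_j} v_{\mu_0} = 0, \qquad j \in I \setminus \{i\},$$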

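and the proof of the proposition is complete.

Given $\mu \in P^+(i,m)$, $m = \check{d}_i m_0 + m_1$, $0 \le m_1 < \check{d}_i$, with reduced expression $\mu = m_1\omega_i + \mu_{j_1} + \dots + \mu_{j_{m_0}}$, set $s_j = \#\{r : j_r \ge j\}$, $0 \le j \le k$, and

$$x_\mu = \big(x^-_{\mu_{k-1}-\mu_k} \otimes t\big)^{s_k} \cdots \big(x^-_{\mu_0-\mu_1} \otimes t\big)^{s_1} \in U(\mathfrak{n}^-[t]).$$

It is easily seen that

$$x_\mu v_{i,m} \in KR(m\omega_i)[|\mu|] \cap KR(m\omega_i)_\mu.$$

Corollary. Let $i \in I$ be such that $\varepsilon_i(\theta) = 2$. For $0 \le s \le k$ we have

$$x_{\mu_s} v_{i,\check{d}_i} \ne 0, \qquad h(x_{\mu_s} v_{i,\check{d}_i}) = \mu_s(h)(x_{\mu_s} v_{i,\check{d}_i}), \qquad \mathfrak{n}^+ x_{\mu_s} v_{i,\check{d}_i} = 0, \qquad h \in \mathfrak{h},$$

or, equivalently, $x_{\mu_s} v_{\mu_0} = v_{\mu_s}$.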
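The claim that x_μ v_{i,m} lies in grade |μ| comes down to a counting identity: each factor of x_μ has degree one in t, so the degree of x_μ is s_1 + ... + s_k, and this equals j_1 + ... + j_{m_0} because each j_r is counted once for every j with 1 ≤ j ≤ j_r. A minimal sketch of this bookkeeping follows; the helper names are hypothetical.

```python
def s_counts(js, k):
    """Return [s_1, ..., s_k] with s_j = #{r : j_r >= j}, for indices j_r in {0, ..., k}."""
    if any(not 0 <= jr <= k for jr in js):
        raise ValueError("every j_r must lie in {0, ..., k}")
    return [sum(1 for jr in js if jr >= j) for j in range(1, k + 1)]


def degree_of_x_mu(js, k):
    """Degree in t of x_mu: one factor of t for each of the s_1 + ... + s_k generators."""
    return sum(s_counts(js, k))


if __name__ == "__main__":
    js = [0, 1, 1, 3]               # indices of a reduced expression (illustrative values)
    print(s_counts(js, 3))          # [3, 1, 1]
    print(degree_of_x_mu(js, 3))    # 5, which equals sum(js)
```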


Proof. Suppose that $\mu = \mu_s$ for some $1 \le s \le k$. We first prove that there exists $x_s \in \mathfrak{n}^-$ such that

$$p_s(x_s \otimes v_{\mu_s}) = v_{\mu_{s+1}}. \tag{2.5}$$

For this, note that by Proposition 1.3, v = ps (xθ− ⊗ vμs ) = 0. Since V (μs+1 ) is irreducible there exists xs ∈ U(n+ ) such that xs v = vμs+1 , which gives xs ps xθ− ⊗ vμs = ps ad xs xθ− ⊗ vμs = vμs+1 , proving (2.5). Setting xs = ad(xs )xθ− ∈ n− μs −μs+1 it follows that x s is a non–zero scalar multiple of xμ−s −μs+1 for some 0 ≤ s ≤ k and we get − xμs −μs+1 ⊗ t vμs = 0. An obvious induction on s now proves that xμs vi,dˇi = 0, 1 ≤ s ≤ k − 1. The other two statements of the corollary are now immediate. Recall that

K R dˇi ωi (r ) ∼ = K R dˇi ωi / ⊕r ≥r K R(mωi )[r ] .

Let vr be the image of vdˇi ωi in K R(dˇi ωi )(r ). By Corollary 2.6 we see that xμr −1 vr = 0, xμr vr = 0, 0 ≤ r ≤ r ≤ k.

(2.6)

2.7. Proof of Theorem 2.2: the case $\varepsilon_i(\theta) = 2$, $m \ge \check{d}_i$. If $m = \check{d}_i$, then part (i) is the statement of Proposition 2.6 and part (ii) is trivially true. Assume that $m > \check{d}_i$ and write $m = \check{d}_i m_0 + m_1$, $0 \le m_1 < \check{d}_i$. Note that since $m_1 \in \{0, 1\}$ we know by the earlier case that $KR(m_1\omega_i) \cong \mathrm{ev}_0(V(m_1\omega_i))$. Set

$$\widetilde{KR}(m\omega_i) = U(\mathfrak{g}[t])\big(v_{m_1\omega_i} \otimes v_{\check{d}_i\omega_i}^{\otimes m_0}\big) \subset KR(m_1\omega_i) \otimes KR(\check{d}_i\omega_i)^{\otimes m_0}.$$

It is easily checked that $\widetilde{KR}(m\omega_i)$ is a graded quotient of $KR(m\omega_i)$. Using Proposition 2.4 we see that part (i) of the theorem follows if we prove that for all $s \in \mathbb{Z}_+$,

$$\widetilde{KR}(m\omega_i)[s] \cong \bigoplus_{\{\mu \in P^+(i,m)\,:\,|\mu| = s\}} V(\mu). \tag{2.7}$$

In particular this proves that $\widetilde{KR}(m\omega_i) \cong KR(m\omega_i)$ and so proves part (ii) of the theorem as well.


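Let $\mu = m_1\omega_i + \mu_{j_1} + \dots + \mu_{j_{m_0}}$ be a reduced expression for $\mu$, set

$$K = KR(m_1\omega_i) \otimes KR(\check{d}_i\omega_i)(j_1 + 1) \otimes \cdots \otimes KR(\check{d}_i\omega_i)(j_{m_0} + 1),$$

and let

$$\tilde{K} = U(\mathfrak{g}[t])\big(v_{m_1\omega_i} \otimes v_{j_1} \otimes \cdots \otimes v_{j_{m_0}}\big).$$

Clearly $K$ and $\tilde{K}$ are graded quotients of $KR(m_1\omega_i) \otimes KR(\check{d}_i\omega_i)^{\otimes m_0}$ and $\widetilde{KR}(m\omega_i)$ respectively. Let

$$\pi : KR(m_1\omega_i) \otimes KR(\check{d}_i\omega_i)^{\otimes m_0} \to K, \qquad \pi\big(\widetilde{KR}(m\omega_i)\big) = \tilde{K},$$

be the canonical surjective morphism of $\mathfrak{g}[t]$–modules. Using (2.6) and the comultiplication of $U(\mathfrak{g}[t])$ one computes easily that

$$x_\mu\, \pi\big(v_{m_1\omega_i} \otimes v_{\check{d}_i\omega_i}^{\otimes m_0}\big) = v_{m_1\omega_i} \otimes x_{\mu_{j_1}} v_{j_1} \otimes \cdots \otimes x_{\mu_{j_{m_0}}} v_{j_{m_0}} \ne 0. \tag{2.8}$$

Corollary 2.6 implies that

$$\mathfrak{n}^+\big(v_{m_1\omega_i} \otimes x_{\mu_{j_1}} v_{j_1} \otimes \cdots \otimes x_{\mu_{j_{m_0}}} v_{j_{m_0}}\big) = 0,$$

and

$$v_{m_1\omega_i} \otimes x_{\mu_{j_1}} v_{j_1} \otimes \cdots \otimes x_{\mu_{j_{m_0}}} v_{j_{m_0}} \in K[|\mu|] \cap K_\mu.$$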

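It follows immediately that $\tilde{K}[|\mu|] \cong_{\mathfrak{g}} (V(\mu) \oplus N)$ for some $\mathfrak{g}$–submodule $N \subset \tilde{K}$, which proves (2.7).

3. The Twisted Algebras

We use the notation of the previous section freely.

3.1. Preliminaries. Throughout this section we let $\epsilon \in \{0, 1\}$. From now on $\mathfrak{g}$ denotes a Lie algebra of type $A_n$ or $D_n$ and $\sigma : \mathfrak{g} \to \mathfrak{g}$ the non–trivial diagram automorphism of order two. The statements in this subsection can be found in [12]. Thus we have

$$\mathfrak{g} = \mathfrak{g}_0 \oplus \mathfrak{g}_1, \qquad \mathfrak{h} = \mathfrak{h}_0 \oplus \mathfrak{h}_1, \qquad \mathfrak{n}^\pm = \mathfrak{n}_0^\pm \oplus \mathfrak{n}_1^\pm,$$

where

$$\mathfrak{g}_0 = \{x \in \mathfrak{g} : \sigma(x) = x\}, \qquad \mathfrak{g}_1 = \{x \in \mathfrak{g} : \sigma(x) = -x\}.$$

For any subalgebra $\mathfrak{a}$ of $\mathfrak{g}$ with $\sigma(\mathfrak{a}) \subset \mathfrak{a}$ we set $\mathfrak{a}_\epsilon = \mathfrak{a} \cap \mathfrak{g}_\epsilon$ and we have $\mathfrak{a} = \mathfrak{a}_0 \oplus \mathfrak{a}_1$. The subalgebra $\mathfrak{g}_0$ is a simple Lie algebra with Cartan subalgebra $\mathfrak{h}_0$ and we let $I_0$ be the index set for the corresponding set of simple roots numbered as in [1]. Although in this and the following sections it is convenient to use the ambient Lie algebra $\mathfrak{g}$, we do not need any other data associated with it. Thus, all representations, roots, weights, the maps $\varepsilon_i$ and so on will always be those associated with the fixed point algebra $\mathfrak{g}_0$.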


We have g0 is of type Cn if g is of type A2n−1 and of type Bn if g is of type A2n or Dn+1 . Let (R0 )s be the set of short roots of g0 and (R0 ) the set of long roots. The adjoint action of g0 on g makes g1 into an irreducible representation of g0 and we have g1 = h 1

μ ∈ h∗0

(g1 )μ , g1 ∼ = V (φ),

where φ ∈ Q +0 is the highest short root of g0 if g is of type A2n−1 or Dn+1 and twice the highest short root if g is of type A2n . Further, if we set

R1 = μ ∈ h∗0 : (g1 )μ = 0 \{0}, R1+ = R1 ∩ Q +0 , then R1 = (R0 )s if g is of type A2n−1 or Dn+1 and R1 = R0 ∪ {2α : α ∈ (R0 )s } if g is of type A2n . In all cases, dim(g1 )α = 1 for α ∈ R1 . Given α ∈ R1+ , we denote by yα± any non–zero element in (g1 )±α . Note that n± 1 = ⊕α∈R ± (g1 )α . In addition, if α ∈ R0+ \R1+ we set yα± = 0, and similarly we set xα± = 0 if 1

α ∈ R1+ \R0+ . Lemma.

(i) The maps g1 ⊗ g1 → g0 and g0 ⊗ g1 → g1 defined by x ⊗ y → [x, y] are a surjective homomorphism of g0 –modules. (ii) If α ∈ R0 ∩ R1 , there exists h ∈ h1 such that [h, yα− ] = xα− . 3.2. The twisted current algebra. Extend σ to an automorphism σt : g[t] → g[t] by extending linearly the assignment, σt x ⊗ t r = σ (x) ⊗ (−t)r , x ∈ g, r ∈ Z+ . If a ⊂ g is such that σ (a) ⊂ a, let a[t]σ be the set of fixed points of σt . Clearly a[t]σ = a0 ⊗ C[t 2 ] ⊕ a1 ⊗ tC[t 2 ] and g[t]σ = n− [t]σ ⊕ h[t]σ ⊕ n+ [t]σ . 3.3. The modules K R σ (mωi ). Definition. For i ∈ I0 , m ∈ Z+ , let K R σ (mωi ) be the g[t]σ –module generated by an σ with relations, element vi,m σ σ σ σ σ n+ [t]σ vi,m = 0, h 0 vi,m = mωi (h 0 )vi,m , h ⊗ t 2r − vi,m = xα−j vi,m = 0, (3.1) for all h ∈ h , r ∈ Z+ , j = i,

and

$$(x^-_{\alpha_i})^{m+1} v^\sigma_{i,m} = 0, \tag{3.2}$$

$$(x^-_{\alpha_i} \otimes t^2)\, v^\sigma_{i,m} = (y^-_{\alpha_i} \otimes t)\, v^\sigma_{i,m} = 0. \tag{3.3}$$
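The decomposition of the fixed point algebra in Sect. 3.2 amounts to a parity rule: for x with σ(x) = (−1)^ε x, the element x ⊗ t^r is fixed by σ_t exactly when r ≡ ε (mod 2). The following sketch only illustrates this rule; the function names are hypothetical and the parity label `eps` stands for the index of the eigenspace of σ containing x.

```python
def is_fixed(eps, r):
    """True if x ⊗ t^r is fixed by sigma_t, where sigma(x) = (-1)**eps * x.

    sigma_t(x ⊗ t^r) = sigma(x) ⊗ (-t)^r = (-1)**(eps + r) * (x ⊗ t^r).
    """
    if eps not in (0, 1) or r < 0:
        raise ValueError("eps must be 0 or 1 and r must be non-negative")
    return (-1) ** (eps + r) == 1


def fixed_degrees(eps, max_degree):
    """Degrees r <= max_degree in which the eps-eigenspace contributes to a[t]^sigma."""
    return [r for r in range(max_degree + 1) if is_fixed(eps, r)]


if __name__ == "__main__":
    print(fixed_degrees(0, 6))  # even powers of t, matching a_0 ⊗ C[t^2]
    print(fixed_degrees(1, 6))  # odd powers of t, matching a_1 ⊗ t C[t^2]
```

This is why the relations (3.1) to (3.3) only involve even powers of t on elements of the fixed point subalgebra and odd powers on elements of the (−1)-eigenspace.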


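Note that when $\mathfrak{g}$ is of type $A_{2n}$ or if $\alpha_i \in (R_0)_s$ it can be seen, by using Lemma 3.1(ii), that the relation $(x^-_{\alpha_i} \otimes t^2)v^\sigma_{i,m} = 0$ is actually a consequence of $(y^-_{\alpha_i} \otimes t)v^\sigma_{i,m} = 0$. The modules $KR^\sigma(m\omega_i)$ are clearly graded modules for the graded algebra $\mathfrak{g}[t]^\sigma$. Any $\mathfrak{g}_0$–module $V$ can be regarded as a module for $\mathfrak{g}[t]^\sigma$ by setting

$$(x \otimes t^{2r+\epsilon})v = 0, \qquad v \in V,\ x \in \mathfrak{g}_\epsilon,\ r \in \mathbb{Z}_+,\ 2r + \epsilon > 0,$$

and we denote the corresponding module by $\mathrm{ev}^\sigma_0(V)$. The next lemma is easily checked.

Lemma. For all $i \in I_0$ and $m \in \mathbb{Z}_+$ the module $\mathrm{ev}^\sigma_0(V(m\omega_i))$ is a quotient of $KR^\sigma(m\omega_i)$.

3.4. The graded character of $KR^\sigma(m\omega_i)$. For $i \in I_0$ and $m \in \mathbb{N}$, let $P_0^+(i,m)^\sigma$ be the subset of $P_0^+$ defined by

$$P_0^+(i,m)^\sigma = P_0^+(i, d_i^\sigma)^\sigma + P_0^+(i, m - d_i^\sigma)^\sigma,$$

where

$$d_i^\sigma = 1, \quad \mathfrak{g} \ne A_{2n}, \qquad d_i^\sigma = 2, \quad i \ne n, \qquad d_n^\sigma = 4, \quad \mathfrak{g} = A_{2n},$$

and, for $1 \le m \le d_i^\sigma$, $P_0^+(i,m)^\sigma$ is given as follows. If $\mathfrak{g}$ is of type $A_{2n}$, then

$$P_0^+(i,1)^\sigma = \{\omega_i\},$$
$$P_0^+(i,2)^\sigma = \{2\omega_i, 2\omega_{i-1}, \dots, 2\omega_1, 0\}, \quad i \ne n,$$
$$P_0^+(n,2)^\sigma = \{2\omega_n\},$$
$$P_0^+(n,3)^\sigma = \{3\omega_n\},$$
$$P_0^+(n,4)^\sigma = \{4\omega_n, 2\omega_{n-1}, \dots, 2\omega_1, 0\}.$$

If $\mathfrak{g}$ is of type $A_{2n-1}$, then

$$P_0^+(i,1)^\sigma = \{\omega_i, \omega_{i-2}, \dots, \omega_{\bar{i}}\},$$

where $\bar{i} \in \{0, 1\}$ and $\bar{i} = i \bmod 2$. Finally if $\mathfrak{g}$ is of type $D_{n+1}$, then

$$P_0^+(i,1)^\sigma = \{\omega_i, \omega_{i-1}, \dots, 0\}, \quad i \ne n, \qquad P_0^+(n,1)^\sigma = \{\omega_n\}.$$

In particular $P_0^+(i,m)^\sigma$ is finite. Fix an enumeration $\mu_s$, $0 \le s \le k$, of the sets $P_0^+(i,m)^\sigma$, $1 \le m \le d_i^\sigma$, by requiring

$$\mu_s - \mu_{s+1} \in R_1^+, \qquad \mu_s - \mu_{s+2} \notin (R_0^+ \cup R_1^+).$$

For $\mu \in P_0^+(i,m)^\sigma$ define a reduced expression and $|\mu| \in \mathbb{Z}_+$ analogously to the untwisted case. The main result is the following.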


Theorem. Let $i \in I_0$, $m \in \mathbb{Z}_+$. (i) For all $s \in \mathbb{Z}_+$,

$$KR^\sigma(m\omega_i)[s] \cong_{\mathfrak{g}_0} \bigoplus_{\{\mu \in P_0^+(i,m)^\sigma\,:\,|\mu| = s\}} V(\mu).$$

(ii) Write $m = d_i^\sigma m_0 + m_1$, where $0 \le m_1 < d_i^\sigma$. The canonical homomorphism of $\mathfrak{g}[t]^\sigma$–modules

$$KR^\sigma(m\omega_i) \to KR^\sigma(m_1\omega_i) \otimes KR^\sigma(d_i^\sigma\omega_i)^{\otimes m_0}$$

is injective.

The proof of the theorem proceeds as in the untwisted case.

3.5. Elementary properties of $KR^\sigma(m\omega_i)$. The next proposition is the twisted version of Proposition 2.3 and is proved in the same way.

Proposition. (i) We have

$$KR^\sigma(m\omega_i) = \bigoplus_{\mu \in \mathfrak{h}_0^*} KR^\sigma(m\omega_i)_\mu$$

and $KR^\sigma(m\omega_i)_\mu \ne 0$ only if $\mu \in m\omega_i - Q_0^+$. (ii) Regarded as a $\mathfrak{g}_0$–module, $KR^\sigma(m\omega_i)$ and $KR^\sigma(m\omega_i)[s]$, $s \in \mathbb{Z}_+$, are isomorphic to a direct sum of irreducible finite–dimensional representations of $\mathfrak{g}_0$. In particular if $W_0$ is the Weyl group of $\mathfrak{g}_0$, then $KR^\sigma(m\omega_i)_\mu \ne 0$ iff $KR^\sigma(m\omega_i)_{w\mu} \ne 0$ for all $w \in W_0$. (iii) Given $0 \le r \le m$, there exists a canonical homomorphism $KR^\sigma(m\omega_i) \to KR^\sigma(r\omega_i) \otimes KR^\sigma((m-r)\omega_i)$ of graded $\mathfrak{g}[t]^\sigma$–modules such that $v^\sigma_{i,m} \mapsto v^\sigma_{i,r} \otimes v^\sigma_{i,m-r}$.

Corollary. We have

$$KR^\sigma(m\omega_i) \cong \bigoplus_{\mu \in P_0^+} V(\mu)^{\oplus m_\mu(i,m)}.$$

In particular $KR^\sigma(0) \cong \mathbb{C}$.

3.6. An upper bound for $m_\mu(i,m)$.

Proposition. For all $i \in I_0$, $m \in \mathbb{Z}_+$, we have

$$KR^\sigma(m\omega_i) \cong_{\mathfrak{g}_0} \bigoplus_{\mu \in P_0^+(i,m)^\sigma} V(\mu)^{\oplus m_\mu},$$

where m μ ∈ {0, 1}. Corollary. For all i ∈ I0 and m ∈ Z+ , the modules K R σ (mωi ) are finite–dimensional. We postpone the proof of this proposition to the next section.
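As a reading aid for Sects. 3.1 and 3.4, the type of the fixed point algebra and the integers d_i^σ can be tabulated. The sketch below encodes our reading of the text (fixed points of type C_n for A_{2n−1} and of type B_n for A_{2n} and D_{n+1}; d_i^σ = 1 outside type A_{2n}, and d_i^σ = 2 for i ≠ n, d_n^σ = 4 in type A_{2n}); the function names are hypothetical, and for A_2 the answer B_1 should be read as A_1.

```python
def fixed_point_type(typ, rank):
    """Type (letter, n) of the fixed point algebra of the order-two diagram automorphism."""
    if typ == "A" and rank % 2 == 1:   # A_{2n-1} -> C_n
        return ("C", (rank + 1) // 2)
    if typ == "A" and rank >= 2:       # A_{2n} -> B_n
        return ("B", rank // 2)
    if typ == "D" and rank >= 3:       # D_{n+1} -> B_n
        return ("B", rank - 1)
    raise ValueError(f"no order-two diagram automorphism handled for {typ}{rank}")


def d_sigma(typ, rank, i):
    """The integer d_i^sigma for a node i of the fixed point algebra."""
    _, n = fixed_point_type(typ, rank)
    if not 1 <= i <= n:
        raise ValueError("i must be a node of the fixed point algebra")
    if typ == "A" and rank % 2 == 0:   # type A_{2n}
        return 4 if i == n else 2
    return 1


if __name__ == "__main__":
    print(fixed_point_type("A", 5), fixed_point_type("A", 4), fixed_point_type("D", 5))
    print([d_sigma("A", 4, i) for i in (1, 2)])
```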


3.7. An explicit construction of the modules K R σ (mωi ), i ∈ I0 , 1 ≤ m ≤ diσ . The next lemma is standard and can be found in [8, 19]. Let ι : g0 → ∧2 (g1 ) be the g0 –module map such that [, ] · ι = id. Lemma. (a) If g is of type Dn+1 , n ≥ 2, then ∧2 (g1 ) ∼ =g0 ι(g0 ). (b) If g is of type An , n = 3, we have ∧2 (g1 ) ∼ = ι(g) ⊕ V (ν) , where (i) ν = ω1 + ω3 if g is of type A2n−1 , n ≥ 3, (ii) ν = 2ω1 + ω2 if g is of type A2n , n ≥ 3, (iii) ν = 2ω1 + 2ω2 if g is of type A4 , (iv) ν = 6ω1 if g is of type A2 . Assume that Vs , 0 ≤ s ≤ k, are g0 –modules, let ps : g1 ⊗ Vs → Vs+1 , 0 ≤ s ≤ k − 1, be g0 –module maps, and set pk = 0. Set also qs = ps+1 (1 ⊗ ps ) ∈ Homg0 (g1 ⊗ g1 ⊗ Vs , Vs+2 ). Suppose that one of the following two conditions hold: (a) qs ∧2 (g1 ) ⊗ Vs = 0, ∀ 0 ≤ s ≤ k − 2,

(3.4)

(b) for all 0 ≤ s ≤ k − 2, y, z, w ∈ g1 , v ∈ Vs , we have qs (V (ν) ⊗ Vs ) = 0, ps+2 (1 ⊗ ps+1 )(1 ⊗ ps ) (((z ∧ w) ⊗ y − y ⊗ (z ∧ w)) ⊗ v) = 0.

(3.5)

Let $x \in \mathfrak{g}_0$, $y \in \mathfrak{g}_1$, $v \in V_s$. The following formulas define a graded $\mathfrak{g}[t]^\sigma$–module structure on $V = \bigoplus_{s=0}^{k} V_s$:

$$(x \otimes t^2)v = q_s(\iota(x) \otimes v), \qquad (y \otimes t)v = p_s(y \otimes v),$$

and

$$(x \otimes t^{2r})v = (y \otimes t^{2r-1})v = 0,$$

for all $r \ge 2$ and $0 \le s \le k$. Furthermore, $V[s] \cong_{\mathfrak{g}_0} V_s$. Moreover, if the maps $p_s$, $0 \le s \le k-1$, are all surjective and $V_0 = U(\mathfrak{g}_0)v_0$, then the resulting $\mathfrak{g}[t]^\sigma$–module is cyclic on $v_0$.

Proposition. Let $i \in I_0$, $1 \le m \le d_i^\sigma$. The modules $V(\mu_s)$, $0 \le s \le k$, satisfy (3.4) (resp. (3.5)) if $\mathfrak{g}$ is of type $A_n$ (resp. $D_n$). The resulting $\mathfrak{g}[t]^\sigma$–module is isomorphic to $KR^\sigma(m\omega_i)$ and

$$KR^\sigma(m\omega_i)[s] \cong_{\mathfrak{g}_0} V(\mu_s).$$

Proof. If $k = 0$ there is nothing to prove since it follows from Proposition 3.6 and Lemma 3.3 that $KR^\sigma(m\omega_i) \cong \mathrm{ev}^\sigma_0(V(m\omega_i))$. Assume that $k > 0$. Using Proposition 1.3 it is easy to see that $\mathrm{Hom}_{\mathfrak{g}_0}(V(\mu_s) \otimes V(\mu_{s+1}), \mathfrak{g}_1) \ne 0$ and hence

$$\mathrm{Hom}_{\mathfrak{g}_0}(\mathfrak{g}_1 \otimes V(\mu_s), V(\mu_{s+1})) \ne 0.$$


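Suppose first that $\mathfrak{g}$ is of type $A_n$. Equation (3.4) holds if we prove that

$$\mathrm{Hom}_{\mathfrak{g}_0}(\wedge^2(\mathfrak{g}_1) \otimes V_s, V_{s+2}) = 0, \qquad \forall\, 0 \le s \le k-2.$$

Since $\mu_s - \mu_{s+2} \notin R_0$, it is immediate that $\mathrm{Hom}_{\mathfrak{g}_0}(\mathfrak{g}_0 \otimes V(\mu_s), V(\mu_{s+2})) = 0$. To prove that $\mathrm{Hom}_{\mathfrak{g}_0}(V(\nu) \otimes V(\mu_s), V(\mu_{s+2})) = 0$, it suffices to prove that $\mathrm{Hom}_{\mathfrak{g}_0}(V(\mu_s), V(\nu) \otimes V(\mu_{s+2})) = 0$. This is done exactly as in the untwisted case by noting that

$$\mu_{s+2} + \nu - \mu_s = \sum_{j=1}^{s-1} k_j\alpha_j$$

for some $k_j \in \mathbb{Z}_+$.

If $\mathfrak{g}$ is of type $D_{n+1}$, $n \ge 2$, then Lemma 3.7 implies that the first condition in (3.5) is trivially satisfied. If $n = 3$ the second condition is also trivially true since $k \le 3$. If $n > 3$, let $N$ be the $\mathfrak{g}_0$–submodule of $T^3(\mathfrak{g}_1)$ spanned by elements of the form

$$(x \otimes y - y \otimes x) \otimes z - z \otimes (x \otimes y - y \otimes x),$$

with $x, y, z \in \mathfrak{g}_1$. Note that

$$N \cap \wedge^3(\mathfrak{g}_1) = 0, \qquad N \subset \mathfrak{g}_1 \otimes \wedge^2(\mathfrak{g}_1) + \wedge^2(\mathfrak{g}_1) \otimes \mathfrak{g}_1.$$

Now, it is not hard to see that

$$\mathfrak{g}_1 \otimes \wedge^2(\mathfrak{g}_1) \cong_{\mathfrak{g}_0} V(\omega_1 + \omega_2) \oplus V(\omega_3) \oplus V(\omega_1), \qquad n > 3,$$

and that the $\mathfrak{g}_0$–submodule $V(\omega_3)$ occurs in $T^3(\mathfrak{g}_1)$ with multiplicity one. It follows that

$$N \cong V(\omega_1 + \omega_2)^{\oplus r_1} \oplus V(\omega_1)^{\oplus r_2}, \qquad 0 \le r_1, r_2 \le 2.$$

We now prove that

$$\mathrm{Hom}_{\mathfrak{g}_0}(N \otimes V(\mu_s), V(\mu_{s+3})) = 0, \qquad 0 \le s \le k-3,$$

which establishes the second condition in (3.5). Since $(V(\mu_s) \otimes V(\mu_{s+3}))_{\omega_1} = 0$, it follows that $\mathrm{Hom}_{\mathfrak{g}_0}(V(\omega_1) \otimes V(\mu_s), V(\mu_{s+3})) = 0$ and we are left to show that

$$\mathrm{Hom}_{\mathfrak{g}_0}(V(\omega_1 + \omega_2) \otimes V(\mu_s), V(\mu_{s+3})) = 0.$$

For this it suffices to prove that $\mathrm{Hom}_{\mathfrak{g}_0}(V(\mu_s), V(\omega_1 + \omega_2) \otimes V(\mu_{s+3})) = 0$. This is done as usual by noting that $\mu_{s+3} + \omega_1 + \omega_2 - \mu_s$ is in the span of the elements $\alpha_r$, $1 \le r \le s-1$, and by observing that the $(\mathfrak{g}_0)_J$–module $U((\mathfrak{g}_0)_J)v_{\omega_1+\omega_2} \subset V(\omega_1 + \omega_2)$ is not dual to the $(\mathfrak{g}_0)_J$–module $U((\mathfrak{g}_0)_J)v_{\mu_{s+3}} \subset V(\mu_{s+3})$, where $J = \{1, \dots, s-1\}$.

This proves that if we fix non–zero maps $p_s \in \mathrm{Hom}_{\mathfrak{g}_0}(\mathfrak{g}_1 \otimes V(\mu_s), V(\mu_{s+1}))$, then we can construct a graded cyclic $\mathfrak{g}[t]^\sigma$–module

$$V = U(\mathfrak{g}[t]^\sigma)v_{\mu_0} = \bigoplus_{s=0}^{k} V(\mu_s), \qquad V[s] \cong V(\mu_s), \quad 0 \le s \le k.$$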
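As in the untwisted case, to complete the proof it suffices, in view of Proposition 3.6, to prove that $V$ is a quotient of $KR^\sigma(m\omega_i)$. Since $\mu_0 = m\omega_i$ and $\mu_0 - \mu_1 \in R_0^+ \setminus \{\alpha_i\}$ we get

$$p_0\big((\mathfrak{n}_1^+ \oplus \mathfrak{h}_1 \oplus \mathbb{C}y^-_{\alpha_i}) \otimes v_{m\omega_i}\big) = 0,$$

i.e.,

$$(\mathfrak{n}_1^+ \otimes t^{2r-1})v_{m\omega_i} = (\mathfrak{h}_1 \otimes t^{2r-1})v_{m\omega_i} = (y^-_{\alpha_i} \otimes t)v_{m\omega_i} = 0, \qquad r \ge 1.$$

For the same reasons,

$$q_0\big((\mathfrak{n}_0^+ \oplus \mathfrak{h}_0 \oplus \mathbb{C}x^-_{\alpha_i}) \otimes v_{m\omega_i}\big) = 0.$$

Hence we get

$$(\mathfrak{n}_0^+ \otimes t^{2r})v_{m\omega_i} = (\mathfrak{h}_0 \otimes t^{2r})v_{m\omega_i} = (x^-_{\alpha_i} \otimes t^{2r})v^\sigma_{i,m} = 0, \qquad r \ge 1.$$

The relations (3.2) follow since $V$ is obviously finite–dimensional, which completes the proof that $V$ is a quotient of $KR^\sigma(m\omega_i)$.

Given $\mu \in P_0^+(i,m)^\sigma$, $m = d_i^\sigma m_0 + m_1$, $0 \le m_1 < d_i^\sigma$, with reduced expression $\mu = m_1\omega_i + \mu_{j_1} + \dots + \mu_{j_{m_0}}$, set $s_j = \#\{r : j_r \ge j\}$, $0 \le j \le k$, and

$$y_\mu = \big(y^-_{\mu_{k-1}-\mu_k} \otimes t\big)^{s_k} \cdots \big(y^-_{\mu_0-\mu_1} \otimes t\big)^{s_1} \in U(\mathfrak{n}^-[t]^\sigma).$$

It is easily seen that

$$y_\mu v^\sigma_{i,m} \in KR^\sigma(m\omega_i)[|\mu|] \cap KR^\sigma(m\omega_i)_\mu.$$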

The next corollary is proved in the same way as Corollary 2.6 and we omit the details. Corollary. For 0 ≤ s ≤ k we have σ σ σ + σ = μs (h) yμs vi,d yμ vi,d = 0, h ∈ h0 , yμs vi,d σ = 0, h yμs vi,d σ σ , n σ i

i

i

i

or, equivalently, yμs vμ0 = vμs . σ 3.8. The modules K˜R (mωi ) and the completion of the proof of Theorem 3.4. For m = diσ m 0 + m 1 ∈ Z+ , i ∈ I0 , set ⊗m 0 σ σ σ ⊗m 0 ⊂ K R σ (m 1 ωi ) ⊗ K R σ diσ ωi K˜R (mωi ) = U(g[t]σ ) vm ⊗ (v ) . σ d ωi 1 ωi i

The next proposition is proved in exactly the same way as the corresponding result in Sect. 2.7 for the untwisted case, by using Proposition 3.7 and Corollary 3.7, and clearly completes the proof of Theorem 3.4.

Proposition. Let $i \in I_0$, $m \in \mathbb{N}$. Then

$$\widetilde{KR}^\sigma(m\omega_i)[s] \cong_{\mathfrak{g}_0} \bigoplus_{\{\mu \in P_0^+(i,m)^\sigma\,:\,|\mu| = s\}} V(\mu).$$

In particular,

$$\widetilde{KR}^\sigma(m\omega_i) \cong KR^\sigma(m\omega_i)$$

as $\mathfrak{g}[t]^\sigma$–modules.


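4. Proof of Proposition 3.6

We will use the following remark repeatedly without further comment. Let $x \in (\mathfrak{g}_\epsilon)_\alpha$, $\alpha \in R_\epsilon$, and $r \in \mathbb{Z}_+$ be such that $(x \otimes t^{2r+\epsilon})v^\sigma_{i,m} = 0$. Then $(x \otimes t^{2r+2s+\epsilon})v^\sigma_{i,m} = 0$ for all $s \in \mathbb{Z}_+$. To see this, observe that if $h \in \mathfrak{h}_\epsilon$, then by definition we have $(h \otimes t^{2s+\epsilon})v^\sigma_{i,m} = 0$. If $x \in (\mathfrak{g}_\epsilon)_\alpha$ for some $\alpha \in R_\epsilon$, choose $h \in \mathfrak{h}_0$ such that $[h, x] = x$. This gives

$$0 = [h \otimes t^2, x \otimes t^r]\, v^\sigma_{i,m} = (x \otimes t^{r+2})\, v^\sigma_{i,m},$$

thus proving the remark.

4.1. The case $\alpha_i \in R_1$ with $\varepsilon_i(\phi) = 1$. The following lemma establishes the proposition in this case.

Lemma. Suppose that $i \in I_0$ is such that $\alpha_i \in R_1$ and $\varepsilon_i(\phi) = 1$ (in other words $i = n$ if $\mathfrak{g}$ is of type $D_{n+1}$ and $i = 1$ if $\mathfrak{g}$ is of type $A_{2n-1}$). Then

$$KR^\sigma(m\omega_i) \cong \mathrm{ev}^\sigma_0(V(m\omega_i)).$$

Proof. It is straightforward to check that we can write

$$y_\phi^- = [x_\beta^-, y^-_{\alpha_i}], \qquad x_\theta^- = [y_\phi^-, y_\gamma^-], \tag{4.1}$$

for some $\beta \in R_0^+$, $\gamma \in R_1^+$ with $\varepsilon_i(\beta) = 0$, $\varepsilon_i(\gamma) = 1$. It follows from (3.3) and Proposition 3.5 that $(y_\phi^- \otimes t)v^\sigma_{i,m} = 0$. Since for all $\alpha \in R_1^+$ we have $y_\alpha^- \otimes t \in U(\mathfrak{n}^+)(y_\phi^- \otimes t)$, it follows from (3.1) that $(y_\alpha^- \otimes t)v^\sigma_{i,m} = 0$. It follows now from (4.1) that $(x_\theta^- \otimes t^2)v^\sigma_{i,m} = 0$ and hence that $(x_\alpha^- \otimes t^2)v^\sigma_{i,m} = 0$ for all $\alpha \in R_0^+$. This proves that $KR^\sigma(m\omega_i) = U(\mathfrak{n}_0^-)v^\sigma_{i,m}$ and the lemma is proved.

4.2. The subalgebras $\mathfrak{n}^-[t]^\sigma_{\max(i)}$. From now on we assume that $i \in I_0$ is such that either $\alpha_i \notin R_1$ or $\varepsilon_i(\phi) = 2$. Let $\max(i) \in \{1, 2\}$ be the maximum value of the restriction of $\varepsilon_i$ to $R_1^+$, and let $\mathfrak{n}^-[t]^\sigma_{\max(i)}$ be the subalgebra of $\mathfrak{n}^-[t]^\sigma$ generated by

$$\{y_\alpha^- \otimes t : \alpha \in R_1^+,\ \varepsilon_i(\alpha) = \max(i)\}.$$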

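If $\mathfrak{g}$ is of type $A_n$, then it is easy to see that

$$\mathfrak{n}^-[t]^\sigma_{\max(i)} = \bigoplus_{\alpha \in R_1\,:\,\varepsilon_i(\alpha) = \max(i)} \mathbb{C}(y_\alpha^- \otimes t), \tag{4.2}$$

while if $\mathfrak{g}$ is of type $D_{n+1}$, we have

$$\mathfrak{n}^-[t]^\sigma_{\max(i)} = \bigoplus_{\alpha \in R_1\,:\,\varepsilon_i(\alpha) = 1} \mathbb{C}(y_\alpha^- \otimes t) \;\oplus\; \Bigg(\bigoplus_{\alpha \in R_0^+\,:\,\varepsilon_i(\alpha) = 2} \mathbb{C}(x_\alpha^- \otimes t^2)\Bigg). \tag{4.3}$$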


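4.3. Further relations satisfied by $v^\sigma_{i,m}$.

Proposition. (i) Let $\alpha \in R_1^+$. If $\varepsilon_i(\alpha) < \max(i)$ we have

$$(y_\alpha^- \otimes t^{2r+1})v^\sigma_{i,m} = 0 \quad \text{for all } r \in \mathbb{Z}_+,$$

while if $\varepsilon_i(\alpha) = \max(i)$ we have

$$(y_\alpha^- \otimes t^{2r+3})v^\sigma_{i,m} = 0 \quad \text{for all } r \in \mathbb{Z}_+.$$

(ii) If $\mathfrak{g}$ is of type $A_n$, then for all $\alpha \in R_0^+$, $r \in \mathbb{Z}_+$, we have $(x_\alpha^- \otimes t^{2r+2})v^\sigma_{i,m} = 0$. (iii) If $\mathfrak{g}$ is of type $D_{n+1}$, then for all $\alpha \in R_0^+$ and $r \in \mathbb{Z}_+$, we have

$$(x_\alpha^- \otimes t^{2r+2\varepsilon_i(\alpha)})v^\sigma_{i,m} = 0, \qquad i < n.$$

Proof. Let $\alpha \in R_1^+$. Suppose that $\varepsilon_i(\alpha) = 0$. Then $s_\alpha(m\omega_i - \alpha) = m\omega_i + \alpha$ and hence $(KR^\sigma(m\omega_i))_{m\omega_i - \alpha} = 0$. This shows that

$$x_\alpha^- v^\sigma_{i,m} = (y_\alpha^- \otimes t)v^\sigma_{i,m} = 0.$$

If $\varepsilon_i(\alpha) = 1 < \max(i)$, then we can choose $j \in I_0$ such that $\beta = \alpha - \alpha_j \in R_1^+ \cap R_0^+$. If $j = i$, then $\varepsilon_i(\beta) = 0$ and we get by using (3.3) that

$$(y_\alpha^- \otimes t)v^\sigma_{i,m} = [x_\beta^-, y^-_{\alpha_i} \otimes t]\, v^\sigma_{i,m} = 0,$$

and if $j \ne i$, then $\varepsilon_i(\alpha_j) = 0$ and we get

$$(y_\alpha^- \otimes t)v^\sigma_{i,m} = [x^-_{\alpha_j}, y_\beta^- \otimes t]\, v^\sigma_{i,m} = x^-_{\alpha_j}(y_\beta^- \otimes t)\, v^\sigma_{i,m}.$$

Repeating the argument with $\beta$ we eventually get $(y_\alpha^- \otimes t)v^\sigma_{i,m} = 0$, thus proving (i). The proofs of (ii) and (iii) are similar and we omit the details.

Corollary. (i) As vector spaces we have

$$KR^\sigma(m\omega_i) = U(\mathfrak{n}_0^-)\, U(\mathfrak{n}^-[t]^\sigma_{\max(i)})\, v^\sigma_{i,m}. \tag{4.4}$$

In particular $KR^\sigma(m\omega_i)$ is finite–dimensional. (ii) If $\mathfrak{g}$ is of type $A_2$ and $1 \le m \le 3$, then we have $KR^\sigma(m\omega_i) \cong \mathrm{ev}^\sigma_0(V(m\omega_i))$ and Proposition 3.6 is proved in these cases.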


Proof. A straightforward application of the Poincaré–Birkhoff–Witt theorem gives (4.4). Since $\mathfrak{n}_0^- \oplus \mathfrak{n}^-[t]^\sigma_{\max(i)}$ is a finite–dimensional Lie algebra, it follows that for all $\mu \in P$, $\dim(KR^\sigma(m\omega_i)_\mu) < \infty$. Proposition 3.5 now implies that if $KR^\sigma(m\omega_i)$ is infinite–dimensional, there must be infinitely many elements $\nu_r \in P^+$ with $KR^\sigma(m\omega_i)_{\nu_r} \ne 0$ and hence $\nu_r \in P^+ \cap (m\omega_i - Q^+)$. But this is a contradiction since it is well known that the set $P^+ \cap (m\omega_i - Q^+)$ is finite.

If $\mathfrak{g}$ is of type $A_2$ and $m \le 3$, then $i = 1$ and $(m\omega_1 - \phi)(h_{\alpha_1}) < 0$ since $\varepsilon_1(\phi) = 2$. On the other hand, since $\varepsilon_1(\phi - \alpha_1) = 1$, we see that

$$x^+_{\alpha_1}(y_\phi^- \otimes t)v^\sigma_{i,m} = (y^-_{\phi - \alpha_1} \otimes t)v^\sigma_{i,m} = 0.$$

Since $KR^\sigma(m\omega_i)$ is isomorphic to a direct sum of finite–dimensional $\mathfrak{g}_0$–modules this forces $(y_\phi^- \otimes t)v^\sigma_{i,m} = 0$, which proves the corollary.

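4.4. Proof of Proposition 3.6. Recall the enumeration $\mu_0, \dots, \mu_k$ of the sets $P_0^+(i, d_i^\sigma)^\sigma$ and set $\phi_s = \mu_{k-s} - \mu_{k-s+1} \in R_1^+$ for $1 \le s \le k$. Given $\mathbf{r} = (r_1, \dots, r_k) \in \mathbb{Z}_+^k$, set $\eta_{\mathbf{r}} = \sum_{s=1}^{k} r_s\phi_s \in Q^+$ and

$$y_{\mathbf{r}} = (y_{\phi_1} \otimes t)^{r_1} \cdots (y_{\phi_k} \otimes t)^{r_k}.$$

Note that $\eta_{\mathbf{r}} + \eta_{\mathbf{r}'} = \eta_{\mathbf{r}+\mathbf{r}'}$ for all $\mathbf{r}, \mathbf{r}' \in \mathbb{Z}_+^k$. For $1 \le s \le k$, let $e_s \in \mathbb{Z}_+^k$ be the element with one in the $s$th place and zero elsewhere. As usual $\le$ is the partial order on $P_0$ defined by $\mu \le \mu'$ iff $\mu' - \mu \in Q_0^+$.

Proposition. We have

$$\mathfrak{n}_0^+\, y_{\mathbf{r}} v^\sigma_{i,m} \in \sum_{\eta_{\mathbf{r}'} < \eta_{\mathbf{r}}} U(\mathfrak{n}_0^-)\, y_{\mathbf{r}'} v^\sigma_{i,m}, \tag{4.5}$$

$$KR^\sigma(m\omega_i) = \sum_{\mathbf{r} \in \mathbb{Z}_+^k} U(\mathfrak{g}_0)\, y_{\mathbf{r}} v^\sigma_{i,m}. \tag{4.6}$$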

Assuming this proposition we complete the proof of Proposition 3.6 as follows. By Proposition 3.5 we can pick a g0 –module W 0 such that σ ⊕ W 0. K R σ (mωi ) ∼ =g0 U(g0 )vi,m

If W 0 = 0 there is nothing to prove. Otherwise, let prW 0 be the g0 -projection of k K R σ (mωi ) onto W 0 . Using (4.6) we see that there exists r ∈ Z+ such that prW 0 (yr ) = 0.

σ = 0 and such that ηr1 is minimal: i.e., Choose r1 ∈ Zk+ such that vr1 = prW 0 yr1 vi,m σ = 0 for some r ∈ Zk+ , then ηr1 − ηr ∈ if prW 0 yr vi,m / Q +0 . Using (4.5) we get

σ σ n+0 prW 0 yr1 vi,m = prW 0 n+0 yr1 vi,m = 0,


i.e., n+0 vrσ1 = 0. Thus we can write σ ⊕ U(g0 )vrσ1 ⊕ W 1 , K R σ (mωi ) ∼ =g0 U(g0 )vi,m

for some g0 –submodule W 1 . Repeating the argument we find that there exists rs , s ∈ Z+ , such that σ ⊕ K R σ (mωi ) ∼ = U(g0 )vi,m

s≥1

U(g0 )vrσs ,

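with $\mathfrak{n}_0^+ v^\sigma_{\mathbf{r}_s} = 0$. Since

$$v^\sigma_{\mathbf{r}_s} \in KR^\sigma(m\omega_i)_{m\omega_i - \sum_{j=1}^{k} s_j\phi_j},$$

for some $s_j \in \mathbb{Z}_+$ we get

$$m\omega_i - \sum_{j=1}^{k} s_j\phi_j = (m - s_k d_i^\sigma)\omega_i + (s_k - s_{k-1})\mu_1 + \dots + (s_2 - s_1)\mu_{k-1} + s_1\mu_k \in P_0^+.$$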
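The rewriting of the weight mω_i − Σ_j s_j φ_j is a telescoping sum: since φ_j = μ_{k−j} − μ_{k−j+1}, the weight μ_{k−j} appears with coefficient s_j in φ_j and with coefficient −s_{j+1} in φ_{j+1}, and μ_0 = d_i^σ ω_i. The following sketch checks the identity numerically for arbitrary integer vectors standing in for the weights; the function names and the sample data are hypothetical.

```python
def weight_lhs(m, omega, mus, s):
    """m*omega - sum_j s_j * phi_j, with phi_j = mu_{k-j} - mu_{k-j+1}, 1 <= j <= k."""
    k = len(mus) - 1
    out = [m * w for w in omega]
    for j in range(1, k + 1):
        phi_j = [a - b for a, b in zip(mus[k - j], mus[k - j + 1])]
        out = [o - s[j - 1] * p for o, p in zip(out, phi_j)]
    return out


def weight_rhs(m, d, omega, mus, s):
    """(m - s_k d)*omega + (s_k - s_{k-1}) mu_1 + ... + (s_2 - s_1) mu_{k-1} + s_1 mu_k."""
    k = len(mus) - 1
    out = [(m - s[k - 1] * d) * w for w in omega]
    for t in range(1, k):
        coeff = s[k - t] - s[k - t - 1]  # this is s_{k-t+1} - s_{k-t}
        out = [o + coeff * c for o, c in zip(out, mus[t])]
    return [o + s[0] * c for o, c in zip(out, mus[k])]


if __name__ == "__main__":
    # Illustrative data in coordinates (omega_4, omega_2): mu_0 = omega_4, mu_1 = omega_2, mu_2 = 0.
    omega, mus = (1, 0), [(1, 0), (0, 1), (0, 0)]
    print(weight_lhs(5, omega, mus, [1, 3]), weight_rhs(5, 1, omega, mus, [1, 3]))
```

The identity only uses μ_0 = d ω_i (with d standing for d_i^σ), so it holds for any choice of the remaining vectors; dominance of the result is what then forces m ≥ d_i^σ s_k and s_j ≤ s_{j+1}.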

An inspection of the sets P0+ (i, diσ )σ shows that this implies m ≥ diσ sk , s j ≤ s j+1 , 1 ≤ j ≤ k − 1, which in turn implies that the weight of vrσs is in P0+ (i, m)σ . Since rs = rs if s = s , Proposition 3.6 is now proved. It remains to prove Proposition 4.4. 4.4.1. Proof of (4.5). It is useful to recall our convention that if α ∈ R1+ , then yα− denotes an arbitrary non–zero element of the space (g1 )−α . The case when g is of type Dn+1 or A2n . We proceed by induction on n. To see that induc tion begins for the series Dn+1 at n = 2 note that then, n+ [t]σmax(1) = C yα−1 +α2 ⊗ t and hence the result follows from Corollary 4.3(i). For A2n it begins at n = 1 by noting that σ n− 0 [t]max(1) = C(yφ ⊗ t) and ε1 (φ − α1 ) = 1 < max(1) and again using Corollary 4.3. For the inductive step set J = {2, · · · , n} and let j ∈ J . We have σ σ x +j yr vi,m = (yφ1 ⊗ t)r1 x +j yr−r1 e1 vi,m ∈ (yφ1 ⊗ t)r1 r1 × U n− U((n− 0 J yr ⊂ 0 ) J )(yφ1 ⊗ t) yr , r ∈S

r ∈S

where

S = r ∈ Z+k : r1 = 0, ηr < ηr−r1 e1 . The last inclusion is a consequence of the fact that yφ−1 , n− 0 = 0. It is now simple to check that ηr > ηr +r1 e1 .


Suppose now that j = 1 and that g is of type A2n . We have σ = yα− +2 n xα+1 yr vi,m

s=2 αs

1

σ σ ⊗ t yr−e1 vi,m = δi,1 xα−1 yr−e1 +e2 vi,m ,

where δi,1 is one if i = 1 and zero otherwise. If g is of type Dn+1 , we have σ σ σ x1+ yr vi,m = yr−e1 +e2 vi,m + xθ− ⊗ t 2 yr−2e1 vi,m , where we recall that θ = α1 + 2

n

j=2 α j .

(4.7)

Since ηr−e1 +e2 < ηr it remains to prove that

σ σ xθ− ⊗ t 2 yr−2e1 vi,m ∈ U n− 0 yr vi,m . ηr <ηr

Now σ σ σ = xθ− ⊗ t 2 yr−2e1 vi,m + yr−e1 +e2 vi,m . xα−1 yr−2e1 +2e2 vi,m

(4.8)

Since $\eta_{\mathbf{r}} > \eta_{\mathbf{r}-e_1+e_2}$ and $\eta_{\mathbf{r}} > \eta_{\mathbf{r}-2e_1+2e_2}$ the inductive step is proved.

The case when $\mathfrak{g}$ is of type $A_{2n-1}$. We note that induction begins at $n = 2$ by using Corollary 4.3, since in this case we have when $i = 2$ that $\mathfrak{n}^-[t]^\sigma_{\max(2)} = \mathbb{C}(y^-_{\alpha_1+\alpha_2} \otimes t)$. Set $J = \{2, \dots, n\}$. For the inductive step, note that if $i$ is odd, then $\phi_s \in R_J^+$ and by the induction hypothesis it suffices to consider only the case $x^+_{\alpha_1} y_{\mathbf{r}} v^\sigma_{i,m}$. Since

$$[x^+_{\alpha_1}, y_{\mathbf{r}}] = 0,$$

the result is immediate. Assume then that i is even and let j = 2. We have r r σ σ x +j yr vi,m = yφ1 ⊗ t 1 x +j yr−r1 e1 vi,m ∈ yφ1 ⊗ t 1 r1 × U((n− U((n− 0 ) J )yr ⊂ 0 ) J )(yφ1 ⊗ t) yr , r ∈S

r ∈S

where

S = r ∈ Z+k : r1 = 0, ηr < ηr−r1 e1 . If j = 2, we get σ xα+2 yr vi,m = yα− +α 1

2 +2

n

s=3 αs

σ σ ⊗ t yr−e1 vi,m = δi,2 xα−1 +α2 +α3 yr−e1 +e2 vi,m .

This completes the proof of the inductive step.


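4.4.2. Proof of (4.6). Let $\mathfrak{k}_i^+ \subset \mathfrak{n}_0^+$ be the subalgebra spanned by elements $\{x_\alpha^+ : \varepsilon_i(\alpha) = 0\}$. It is easily seen that the subalgebra $\mathfrak{n}^-[t]^\sigma_{\max(i)}$ is a module for $\mathfrak{k}_i^+$ via the adjoint action, i.e.,

$$[\mathfrak{k}_i^+, \mathfrak{n}^-[t]^\sigma_{\max(i)}] \subset \mathfrak{n}^-[t]^\sigma_{\max(i)}.$$

This action defines a $\mathfrak{k}_i^+$–module structure on $U(\mathfrak{n}^-[t]^\sigma_{\max(i)})$ and we let $\rho : \mathfrak{k}_i^+ \to \mathrm{End}\, U(\mathfrak{n}^-[t]^\sigma_{\max(i)})$ denote the corresponding homomorphism of Lie algebras.

Lemma. We have

$$U(\mathfrak{n}^-[t]^\sigma_{\max(i)}) = \sum_{\mathbf{r} \in \mathbb{Z}_+^k} \rho\big(U(\mathfrak{k}_i^+)\big)\, y_{\mathbf{r}}.$$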

Assuming the lemma we complete the proof of (4.6) as follows. Let g ∈ U(n− [t]σmax(i) ) and write g = r∈Z+ ρ(xr )yr for some xr ∈ U ki+ . Then, k

σ = gvi,m

σ ρ(xr )yr vi,m =

r∈Zk+

σ xr yr vi,m ∈

r∈Zk+

σ U(g0 )yr vi,m ,

r∈Zk+

σ = 0. An application of Corollary where the second equality uses the fact that xr vi,m 4.3(i) now gives (4.6). The proof of Lemma 4.4.2 is straightforward and we give a proof when g is of type Dn+1 . The other cases are similar. If i = 1, there is nothing to prove and we assume from now on that 1 < i < n. We proceed by induction on n, with induction beginning at n = 2 by the preceding comment. Let J = {2, · · · , n}, by the induction hypothesis we have U n− [t]σmax(i)∩n− [t]σ = ρ U ki+ ∩ n+J yr , J

r∈Zi+ :r1 =0

and since yφ−1 , n− [t]σ = yφ−1 , n+J = 0 we get

yφ1 ⊗ t

r1 ∈Z+

r1

U n− [t]σmax(i)∩n− [t]σ = ρ U ki+ ∩ n+J yr . J

Thus the lemma is established if we prove that r yφ1 ⊗ t 1 U n− [t]σmax(i)∩n− [t]σ = U n− [t]σmax(i) . ρ U ki+ r1 ∈Z+

J

For 1 ≤ j ≤ i − 1, set

θ j = α1 + · · · + α j + 2 α j+1 + · · · + αn ,

and

(4.9)

r∈Zi+

s1 si−1 , s ∈ Zi−1 xs = xθ−1 ⊗ t 2 · · · xθ−i−1 ⊗ t 2 + .

(4.10)


By the PBW theorem we have, U n− [t]σmax(i) = s∈Zi−1 + ,r ∈Z+

r xs yφ−1 ⊗ t U n− [t]σmax(i)∩n− [t]σ , J

and hence (4.10) follows if we prove that for all s ∈ Zi−1 + and r ∈ Z+ we have r xs yφ−1 ⊗ t U n− [t]σmax(i)∩n− [t]σ ∈ ρ U ki+ J r1 − σ yφ1 ⊗ t U n [t]max(i)∩n− [t]σ . ×

(4.11)

J

r1 ∈Z+

To prove (4.11) let r1 ∈ Z+ , g ∈ U n− [t]σmax(i)∩n− [t]σ . For 2 ≤ 2s1 ≤ r1 we get J

s1 p s ρ xα+1 1 (yφ1 ⊗ t)r1 g = xθ−1 ⊗ t 2 (yφ1 ⊗ t)r1 −s1 − p (yφ2 ⊗ t)s1 − p g, p=0

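which proves (4.11) for all $s$ with $s_j = 0$ for $j > 1$. Assume that (4.11) is established for all $s$ such that $s_j = 0$ for all $j \ge r$. To prove that it holds for $s$ with $s_j = 0$ for $j > r > 1$, suppose first that $s_r = 1$. Then,

$$\rho(x^+_{\alpha_r})\, x_{s-e_r+e_{r-1}} (y_{\phi_1} \otimes t)^{r_1} g = x_s (y_{\phi_1} \otimes t)^{r_1} g + x_{s-e_r+e_{r-1}} (y_{\phi_1} \otimes t)^{r_1} \rho(x^+_{\alpha_r}) g,$$

and hence we get by using (4.9) that (4.11) holds for $x_s (y_{\phi_1} \otimes t)^{r_1} g$ when $s_r = 1$. An obvious induction on $s_r$ gives the result in general.

5. Demazure Modules and Concluding Remarks

5.1. The relation between the modules $KR(m\omega_i)$ and Demazure modules.

5.1.1. The affine Kac–Moody algebras. Let $\mathbb{C}[t, t^{-1}]$ be the ring of Laurent polynomials in an indeterminate $t$. Let $\hat{\mathfrak g}$ be the affine Lie algebra defined by

$$\hat{\mathfrak g} = \mathfrak g \otimes \mathbb{C}[t, t^{-1}] \oplus \mathbb{C}c \oplus \mathbb{C}d,$$

where $c$ is central and

$$[x \otimes t^s, y \otimes t^k] = [x, y] \otimes t^{s+k} + s\delta_{s+k,0}\langle x, y\rangle c, \qquad [d, x \otimes t^s] = s\, x \otimes t^s,$$

for all $x, y \in \mathfrak g$, $s, k \in \mathbb{Z}$ and where $\langle\,,\rangle$ is the Killing form of $\mathfrak g$. Set $\hat{\mathfrak h} = \mathfrak h \oplus \mathbb{C}c \oplus \mathbb{C}d$, and regard $\mathfrak h^*$ as a subspace of $\hat{\mathfrak h}^*$ by setting $\lambda(c) = \lambda(d) = 0$ for $\lambda \in \mathfrak h^*$. For $0 \le i, j \le n$ let $\Lambda_i \in \hat{\mathfrak h}^*$ be defined by $\Lambda_i(h_j) = \delta_{i,j}$, and let $\widehat{P}^+ \subset \hat{\mathfrak h}^*$ be the non-negative integer linear span of $\Lambda_i$, $0 \le i \le n$. Define $\delta \in (\hat{\mathfrak h})^*$ by $\delta|_{\mathfrak h} = 0$, $\delta(d) = 1$, $\delta(c) = 0$. The elements $\alpha_i \in (\hat{\mathfrak h})^*$, $0 \le i \le n$, where $\alpha_0 = \delta - \theta$, are a set of simple roots for $\hat{\mathfrak g}$ with respect to $\hat{\mathfrak h}$. Let $\widehat{Q}^+$ be the subset of $\widehat{P}^+$ spanned by the elements $\alpha_i$, $0 \le i \le n$. Let $\widehat{W}$ be the extended affine Weyl group and $(\,,)$ be the $\widehat{W}$-invariant form on $\hat{\mathfrak h}^*$ obtained by requiring $(\Lambda_i, \alpha_j) = \delta_{ij}$ for all $0 \le i, j \le n$ and $(\delta, \alpha_i) = 0$ for all $0 \le i \le n$.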


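5.1.2. The highest weight integrable modules. Given $\Lambda = \sum_{i=0}^{n} m_i \Lambda_i \in \widehat{P}^+$, let $L(\Lambda)$ be the $\hat{\mathfrak g}$-module generated by an element $v_\Lambda$ with relations:

$$\mathfrak g t[t] v_\Lambda = 0, \quad (\mathfrak n^+ \otimes 1) v_\Lambda = 0, \quad (h \otimes 1) v_\Lambda = \Lambda(h) v_\Lambda,$$

$$(x^-_{\alpha_i} \otimes 1)^{m_i+1} v_\Lambda = 0, \quad (x^+_\theta \otimes t^{-1})^{m_0+1} v_\Lambda = 0,$$

for $1 \le i \le n$. This module is known to be irreducible and integrable (see [12]). The next proposition can also be found in [12].

Proposition. Let $\Lambda \in \widehat{P}^+$.
(i) We have

$$L(\Lambda) = \bigoplus_{\Lambda' \in \widehat{P}} L(\Lambda)_{\Lambda'}, \quad \text{where } L(\Lambda)_{\Lambda'} = \{ v \in L(\Lambda) : hv = \Lambda'(h)v,\ h \in \hat{\mathfrak h} \}.$$

Moreover, $\dim(L(\Lambda)_{\Lambda'}) < \infty$, $\dim(L(\Lambda)_{\Lambda}) = 1$, and $L(\Lambda)_{\Lambda'} = 0$ if $\Lambda - \Lambda' \notin \widehat{Q}^+$.
(ii) The set

$$\mathrm{wt}(L(\Lambda)) = \{ \Lambda' \in \widehat{P} : L(\Lambda)_{\Lambda'} \ne 0 \} \subset \Lambda - \widehat{Q}^+$$

is preserved by $\widehat{W}$, and $\dim(L(\Lambda)_{w\Lambda'}) = \dim(L(\Lambda)_{\Lambda'})$ for all $w \in \widehat{W}$, $\Lambda' \in \widehat{P}$. In particular, $\dim(L(\Lambda)_{w\Lambda}) = 1$ for all $w \in \widehat{W}$.

5.1.3. The Demazure modules. From now on we fix $\Lambda = m\Lambda_0$ for some $m \in \mathbb{Z}_+$ and we also assume for simplicity that $\mathfrak g$ is of type $A_n$ or $D_n$. Given $w \in \widehat{W}$, let $v_{w\Lambda}$ be a non-zero element in $L(\Lambda)_{w\Lambda}$ and let

$$D(w) = U(\mathfrak g[t]) v_{w\Lambda} \subset L(\Lambda).$$

It is clear from Proposition 5.1.2 that $D(w)$ is finite-dimensional for all $w \in \widehat{W}$.

Proposition. Let $\Lambda = m\Lambda_0 \in \widehat{P}^+$ for some $m > 0$ and $w \in \widehat{W}$ be such that $w\Lambda|_{\mathfrak h} = m\omega_i$ for some $1 \le i \le n$. Then, $D(w)$ is isomorphic to $KR(m\omega_i)$.

Proof. To prove the proposition we first show that $v_{w\Lambda}$ satisfies the defining relations of $KR(m\omega_i)$. It was shown in [3] that the element $v_{w\Lambda}$ satisfies the relations

$$\mathfrak n^+[t] v_{w\Lambda} = 0, \quad (h \otimes t^r) v_{w\Lambda} = 0, \quad h \in \mathfrak h,\ r \in \mathbb{N}.$$

The proof given in [3, Prop. 1.4.3] was for the special case when $\mathfrak g$ is of type $A_n$, but works for any simple Lie algebra. Since $D(w)$ is finite-dimensional the only relation that remains to be checked is that $(x^-_{\alpha_i} \otimes t) v_{w\Lambda} = 0$. If not, then $L(\Lambda)_{w\Lambda - \alpha_i + \delta} \ne 0$ and hence we find that $L(\Lambda)_{\Lambda - w^{-1}\alpha_i + \delta} \ne 0$. Now $(w\Lambda, \alpha_i) = m = (m\Lambda_0, w^{-1}\alpha_i)$ implies that $w^{-1}\alpha_i = \alpha + \delta$ for some $\alpha \in R$ and hence $L(\Lambda)_{\Lambda - \alpha} \ne 0$. This forces $\alpha \in R^+$ and $L(\Lambda)_{s_\alpha(\Lambda - \alpha)} \ne 0$. Since $\Lambda = m\Lambda_0$, we find that $s_\alpha(\Lambda - \alpha) = \Lambda + \alpha$ and hence $L(\Lambda)_{\Lambda + \alpha} \ne 0$, which is a contradiction. This proves that $D(w)$ is a quotient of $KR(m\omega_i)$. The fact that it is an isomorphism of $\mathfrak g[t]$-modules is immediate from [7, Theorem 1].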


5.2. The case of exceptional Lie algebras. For the exceptional Lie algebras the definition and elementary properties of the modules $KR(m\omega_i)$ and their twisted analogs are exactly the same as for the classical Lie algebras. Moreover, the $\mathfrak g$-module decompositions of the modules $KR(m\omega_i)$ (resp. $KR^\sigma(m\omega_i)$) can be shown to coincide with the conjectural decompositions given in [9, 10, 14] as long as $i$ is such that the maximum value of $\varepsilon_i(\alpha) \le \check{d}_i$ for all $\alpha \in R^+$. In the other cases, the main difficulty lies in proving Proposition 2.4 (resp. Proposition 3.6). One reason for this is that the multiplicity of the irreducible representations occurring in a given graded component can be bigger than one. The construction of the fundamental Kirillov–Reshetikhin modules is also much more complicated since the number of irreducible components is large; in the case of $E_8$ and the trivalent node, for instance, the total number of components is conjectured to be three hundred and sixty eight with twenty four non-isomorphic isotypical components. In fact in this case, a general conjecture for the structure of the non-fundamental modules is not available and the combinatorics seems formidable.

References

1. Bourbaki, N.: Groupes et algèbres de Lie. Chapitres 4, 5, 6. Paris: Hermann, 1968
2. Chari, V.: On the fermionic formula and the Kirillov–Reshetikhin conjecture. Int. Math. Res. Not. 12, 629–654 (2001)
3. Chari, V., Loktev, S.: Weyl, Fusion and Demazure modules for the current algebra of $sl_{r+1}$. http://arxiv.org/list/math.QA/0502165, 2005
4. Chari, V., Pressley, A.: Weyl modules for classical and quantum affine algebras. Represent. Theory 5, 191–223 (2001)
5. Feigin, B., Loktev, S.: On Generalized Kostka Polynomials and the Quantum Verlinde Rule. In: Differential topology, infinite-dimensional Lie algebras, and applications, Amer. Math. Soc. Transl. Ser. 2, Vol. 194, Providence, RI: Amer. Math. Soc., 1999, pp. 61–79
6. Feigin, B., Loktev, S., Kirillov, A.N.: Combinatorics and Geometry of Higher Level Weyl Modules. http://arxiv.org/list/math.QA/0503315, 2005
7. Fourier, G., Littelmann, P.: Tensor product structure of affine Demazure modules and limit constructions. http://arxiv.org/list/math.RT/0412432, 2004
8. Fulton, W., Harris, J.: Representation Theory - A first course. GTM 129, Berlin-Heidelberg-New York: Springer-Verlag, 1991
9. Hatayama, G., Kuniba, A., Okado, M., Takagi, T., Yamada, Y.: Remarks on the Fermionic Formula. Contemp. Math. 248, 243–291 (1999)
10. Hatayama, G., Kuniba, A., Okado, M., Takagi, T., Tsuboi, Z.: Paths, Crystals and Fermionic Formulae. Prog. Math. Phys. 23, 205–272 (2002)
11. Humphreys, J.: Introduction to Lie Algebras and Representation Theory. GTM 9, Berlin-Heidelberg-New York: Springer-Verlag, 1972
12. Kac, V.: Infinite dimensional Lie algebras. Cambridge: Cambridge Univ. Press, 1985
13. Kirillov, A.N., Reshetikhin, N.: Representations of Yangians and multiplicities of occurrence of the irreducible components of the tensor product of simple Lie algebras. J. Sov. Math. 52, no. 5, 393–403 (1981)
14. Kleber, M.: Combinatorial structure of finite-dimensional representations of Yangians: the simply-laced case. Int. Math. Res. Not. 7, no. 4, 187–201 (1997)
15. Naito, S., Sagaki, D.: Path Model for a Level Zero Extremal Weight Module over a Quantum Affine Algebra. http://arxiv.org/list/math.QA/0210450, 2002
16. Naito, S., Sagaki, D.: Construction of perfect crystals conjecturally corresponding to Kirillov–Reshetikhin modules over twisted quantum affine algebras. http://arxiv.org/list/math.QA/0503287, 2005
17. Schilling, A., Sternberg, P.: Finite-Dimensional Crystals $B^{2,s}$ for Quantum Affine Algebras of type $D_n^{(1)}$. http://arxiv.org/list/math.QA/0408113, 2004, to appear in J. Alg. Combin.
18. Parthasarathy, K., Rao, R., Varadarajan, V.: Representations of complex semi-simple Lie groups and Lie Algebras. Ann. Math. 85, 383–429 (1967)
19. Varadarajan, V.: Lie Groups, Lie Algebras, and Their Representations. GTM 102, Berlin-Heidelberg-New York: Springer-Verlag, 1974

Communicated by L. Takhtajan

Commun. Math. Phys. 266, 455–470 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0031-3

Communications in

Mathematical Physics

Multifractal Structure of Two-Dimensional Horseshoes

Luis Barreira, Claudia Valls

Departamento de Matemática, Instituto Superior Técnico, 1049-001 Lisboa, Portugal. E-mail: [email protected]; [email protected]

Received: 24 August 2005 / Accepted: 7 February 2006
Published online: 25 April 2006 – © Springer-Verlag 2006

Abstract: We give a complete description of the dimension spectra of Birkhoff averages on a hyperbolic set of a surface diffeomorphism. The main novelty is that we are able to consider simultaneously Birkhoff averages into the future and into the past, i.e., both for positive and negative time. We emphasize that the description of these spectra is not a consequence of the available results in the case of Birkhoff averages simply into the future (or into the past). The main difficulty is that although the local product structure provided by the intersection of stable and unstable manifolds is bi-Lipschitz equivalent to a product, the level sets of the Birkhoff averages are never compact (this causes their box dimension to be strictly larger than their Hausdorff dimension), and thus the product of level sets may have a dimension that need not be the sum of the dimensions of these sets. Instead we construct explicitly noninvariant measures concentrated on each product of level sets with the appropriate pointwise dimension. We also consider the higher-dimensional case of more than one Birkhoff average, as well as the case of ratios of Birkhoff averages.

1. Introduction

1.1. Motivation. Our main concern in this paper is the multifractal analysis of dynamical systems. This theory can be considered a subfield of the dimension theory of dynamical systems, and it essentially studies the complexity of the level sets of invariant local quantities obtained from a dynamical system. In particular one can consider Birkhoff averages, Lyapunov exponents, pointwise dimensions, and local entropies. We emphasize that these functions are usually only measurable and thus their level sets are rarely manifolds. Hence, in order to measure the complexity of these sets it is appropriate to use quantities such as the topological entropy or the Hausdorff dimension.
We refer to the book [6] for the state-of-the-art of the theory of multifractal analysis in 1997, and to the survey [1] for later developments. Our main objective is to give a complete description of the dimension spectra of Birkhoff averages on a locally maximal hyperbolic set of a surface diffeomorphism, taking into account simultaneously Birkhoff averages into the future and into the past. More precisely, the spectra that we consider are obtained by computing the Hausdorff dimension of the level sets of Birkhoff averages of a given function both for positive and negative time.

Supported by the Center for Mathematical Analysis, Geometry, and Dynamical Systems, and through Fundação para a Ciência e a Tecnologia by Programs POCTI/FEDER, POSI, and POCI 2010/Fundo Social Europeu, and the grant SFRH/BPD/14404/2003.

1.2. A model case: the Smale horseshoe. In order to briefly describe our results and to explain why they are nontrivial, we consider here the very particular case of the (linear) Smale horseshoe $\Lambda = C \times C$, given by the product of two standard middle-third Cantor sets. We emphasize that our results are new even in this very special case. Let $f\colon \Lambda \to \Lambda$ be the dynamics on the horseshoe, here assumed to be expanding in the vertical direction and contracting in the horizontal direction. Given continuous functions $\varphi, \psi\colon \Lambda \to \mathbb{R}$ we consider the level sets of Birkhoff averages given for each $\alpha, \beta \in \mathbb{R}$ by

$$K_{\alpha\beta} = \Bigl\{ x \in \Lambda : \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \varphi(f^k x) = \alpha \ \text{and}\ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \psi(f^{-k} x) = \beta \Bigr\}.$$

The associated dimension spectrum is defined by $D(\alpha, \beta) = \dim_H K_{\alpha\beta}$, where $\dim_H$ denotes the Hausdorff dimension. Again, the novelty with respect to many other spectra studied in the multifractal analysis of dynamical systems is that we consider simultaneously Birkhoff averages into the future and into the past. We now explain why we cannot obtain a description of the spectrum $D$ from the known results in multifractal analysis. Let $P$ and $Q$ be the orthogonal projections onto the horizontal and vertical axes (note that in the present case of the Smale horseshoe these are respectively the unstable and stable holonomies). From the exponential behavior of $f$ along the stable and unstable manifolds we can show that (see Sect. 3.2)

$$P(K_{\alpha\beta}) \times C = \Bigl\{ x \in \Lambda : \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \psi(f^{-k} x) = \beta \Bigr\},$$

$$C \times Q(K_{\alpha\beta}) = \Bigl\{ x \in \Lambda : \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \varphi(f^{k} x) = \alpha \Bigr\}.$$

Incidentally, notice that the projection $P(K_{\alpha\beta})$ does not depend on $\alpha$ (and thus on the function $\varphi$), and that the projection $Q(K_{\alpha\beta})$ does not depend on $\beta$ (and thus on the function $\psi$). Therefore

$$K_{\alpha\beta} = [P(K_{\alpha\beta}) \times C] \cap [C \times Q(K_{\alpha\beta})] = P(K_{\alpha\beta}) \times Q(K_{\alpha\beta}). \tag{1}$$

This shows that each level set $K_{\alpha\beta}$ is a product of level sets of Birkhoff averages either only into the future or only into the past. A priori it could seem that the identity in (1) would allow us to obtain a description of the spectrum $D$ from the known results for $P(K_{\alpha\beta})$ and $Q(K_{\alpha\beta})$. The problem is that in general the Hausdorff dimension of a product $A \times B$ need not be the sum of the Hausdorff dimensions of $A$ and $B$, unless for example $\dim_H A = \overline{\dim}_B A$ or $\dim_H B = \overline{\dim}_B B$, where $\overline{\dim}_B$ denotes the upper box dimension. It happens that as a consequence of the theory of multifractal analysis (see the discussion at the end of Sect. 3.4), if the functions $\varphi$ and $\psi$ are not cohomologous to constants, then

$$\dim_H P(K_{\alpha\beta}) < \overline{\dim}_B P(K_{\alpha\beta}) \quad\text{and}\quad \dim_H Q(K_{\alpha\beta}) < \overline{\dim}_B Q(K_{\alpha\beta})$$

for all except one value of $\alpha$ and one value of $\beta$. Thus, even though it follows immediately from (1) that

$$D(\alpha, \beta) \ge \dim_H P(K_{\alpha\beta}) + \dim_H Q(K_{\alpha\beta}), \tag{2}$$

a priori this inequality could be strict. Our main objective is to show that indeed (2) becomes an identity for every α and β (see Sect. 3.4). A consequence of the theory of multifractal analysis is then the analyticity of the spectrum (see Sect. 3.5). We also consider the higher-dimensional case of more than one function, as well as the case of ratios of Birkhoff averages.
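As a concrete illustration of what the identity in (2) predicts, the following minimal sketch evaluates the spectrum in a toy case that is an assumption of this example and not taken from the text: $\varphi$ and $\psi$ are taken to be indicator functions of one of the two strips of the linear horseshoe, so that the Birkhoff averages are digit frequencies of the forward and backward itineraries. Under that assumption the classical frequency formula for the middle-third Cantor set gives $\dim_H Q(K_{\alpha\beta}) = H(\alpha)/\log 3$ and $\dim_H P(K_{\alpha\beta}) = H(\beta)/\log 3$, with $H$ the Shannon entropy. The function names are hypothetical.

```python
import math

LOG3 = math.log(3.0)


def entropy(p):
    """Shannon entropy H(p) = -p log p - (1-p) log(1-p), with H(0) = H(1) = 0."""
    if p < 0.0 or p > 1.0:
        raise ValueError("frequency must lie in [0, 1]")
    if p == 0.0 or p == 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def cantor_level_dim(p):
    """Hausdorff dimension of the points of the middle-third Cantor set
    whose coding symbol 1 has asymptotic frequency p (toy assumption)."""
    return entropy(p) / LOG3


def horseshoe_spectrum(alpha, beta):
    """Value of D(alpha, beta) in the toy case, assuming (2) is an identity."""
    return cantor_level_dim(alpha) + cantor_level_dim(beta)
```

In this sketch the maximum of the spectrum is attained at $\alpha = \beta = 1/2$, where it equals $2\log 2/\log 3$, the Hausdorff dimension of $C \times C$.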

1.3. Method of proof. We now briefly explain how we overcome the above difficulty. Our approach to establish the equality in (2) is to construct explicitly measures that are concentrated on each level set K αβ and have the right pointwise dimension, although they are always noninvariant (see Sect. 3.3). These measures are nevertheless constructed by combining invariant measures obtained from the theory of multifractal analysis. We note that we also require the so-called diametric regularity of the involved measures (see (16) below for the definition). This property is crucial in some approaches to the multifractal analysis of a dynamical system: it is used to ensure that the multifractal analysis at the level of symbolic dynamics can be transferred to the multifractal analysis on the manifold (see [6] for details). For example, all equilibrium measures of a Hölder continuous function on a locally maximal hyperbolic set are diametrically regular. Even though the measures that we construct are never equilibrium measures (we recall that they are never invariant), as already mentioned their “building blocks” come from multifractal analysis (although not always from the same dynamics) and this allows us to use the above property.

2. Hyperbolic Sets and Hausdorff Dimension

Let $f\colon M \to M$ be a $C^{1+\varepsilon}$ diffeomorphism on a surface $M$, for some $\varepsilon \in (0, 1]$, and let $\Lambda \subset M$ be a compact smooth hyperbolic set for $f$. This means that $\Lambda$ is an $f$-invariant set (i.e., $f(\Lambda) = \Lambda$), and that there exist a continuous splitting of the tangent bundle $T_\Lambda M = E^s \oplus E^u$, and constants $c > 0$ and $\lambda \in (0, 1)$ such that for every $x \in \Lambda$ we have:

1. $d_x f(E^s(x)) = E^s(fx)$ and $d_x f(E^u(x)) = E^u(fx)$;
2. $\|d_x f^n v\| \le c\lambda^n \|v\|$ whenever $v \in E^s(x)$ and $n \in \mathbb{N}$;
3. $\|d_x f^{-n} v\| \le c\lambda^n \|v\|$ whenever $v \in E^u(x)$ and $n \in \mathbb{N}$.

We will always assume that $\Lambda$ is locally maximal, i.e., that there exists an open neighborhood $U$ of $\Lambda$ such that $\Lambda = \bigcap_{n\in\mathbb{Z}} f^n U$, that the stable and unstable distributions $E^s$ and $E^u$ have dimension one, and that $f$ is topologically mixing on $\Lambda$. Let $\Lambda$ be a hyperbolic set for $f$, and denote by $d$ the distance on $M$. For each $x \in \Lambda$ there exist local stable and unstable manifolds $V^s(x)$ and $V^u(x)$ containing $x$ such that:

1. $T_x V^s(x) = E^s(x)$ and $T_x V^u(x) = E^u(x)$;
2. $f(V^s(x)) \subset V^s(fx)$ and $f^{-1}(V^u(x)) \subset V^u(f^{-1}x)$;
3. there exist constants $c > 0$ and $\lambda \in (0, 1)$ (independent of $x$) such that for each $n \in \mathbb{N}$ we have

$$d(f^n y, f^n x) \le c\lambda^n d(y, x) \ \text{whenever } y \in V^s(x), \qquad d(f^{-n} y, f^{-n} x) \le c\lambda^n d(y, x) \ \text{whenever } y \in V^u(x). \tag{3}$$

Given $x \in \Lambda$ we also consider the global stable and unstable manifolds of $x$, defined by

$$W^s(x) = \bigcup_{n\in\mathbb{N}} f^{-n} V^s(f^n x) \quad\text{and}\quad W^u(x) = \bigcup_{n\in\mathbb{N}} f^{n} V^u(f^{-n} x).$$

Let now $t_s$ and $t_u$ be the unique real numbers such that

$$P(t_s \log \|df|E^s\|) = P(t_u \log \|df^{-1}|E^u\|) = 0,$$

where $P$ denotes the topological pressure with respect to $f$ on $\Lambda$ (see for example [3] for the definition). It was shown in [4] that

$$\dim_H(\Lambda \cap V^s(x)) = t_s \quad\text{and}\quad \dim_H(\Lambda \cap V^u(x)) = t_u \tag{4}$$

for every $x \in \Lambda$, and it was shown in [5] that

$$\dim_H(\Lambda \cap V^s(x)) = \overline{\dim}_B(\Lambda \cap V^s(x)), \qquad \dim_H(\Lambda \cap V^u(x)) = \overline{\dim}_B(\Lambda \cap V^u(x)) \tag{5}$$

for every $x \in \Lambda$. Since in the present situation the stable and unstable holonomies are Lipschitz, we have

$$\dim_H \Lambda = \dim_H[(\Lambda \cap V^s(x)) \times (\Lambda \cap V^u(x))] = \dim_H(\Lambda \cap V^s(x)) + \dim_H(\Lambda \cap V^u(x)) = t_s + t_u, \tag{6}$$

using (5) in the second identity (we recall that if $\dim_H A = \overline{\dim}_B A$, then $\dim_H(A \times B) = \dim_H A + \dim_H B$ for any set $B$).
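The pressure equations defining $t_s$ and $t_u$ can be solved numerically in simple cases. The sketch below is an illustration under assumptions not made in the text: the dynamics is coded by a full shift and the contraction rate is piecewise constant, depending only on the first symbol, so that the pressure of $t \log(\text{rate})$ reduces to $\log \sum_i \text{rate}_i^t$. The function names and the bisection bracket $[0, 1]$ are choices of this example.

```python
import math


def pressure_locally_constant(t, rates):
    """Pressure of t*log(rate) on a full shift when the rate depends only
    on the first symbol: log(sum_i rate_i**t)."""
    return math.log(sum(r ** t for r in rates))


def bowen_root(rates, lo=0.0, hi=1.0, tol=1e-12):
    """Solve pressure(t) = 0 by bisection; the pressure is strictly
    decreasing in t when every rate lies in (0, 1)."""
    if pressure_locally_constant(lo, rates) < 0 or pressure_locally_constant(hi, rates) > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pressure_locally_constant(mid, rates) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Linear horseshoe built from middle-third Cantor sets: two branches, rate 1/3.
t_s = bowen_root([1.0 / 3.0, 1.0 / 3.0])
t_u = bowen_root([1.0 / 3.0, 1.0 / 3.0])
dim_lambda = t_s + t_u  # formula (6)
```

For the linear horseshoe both roots equal $\log 2/\log 3$, so the sum in (6) recovers the dimension of $C \times C$.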

3. Dimension Spectra We consider in this section the dimension spectra of Birkhoff averages on a locally maximal hyperbolic set of a surface diffeomorphism f : M → M.


3.1. Dimension spectra. We denote by $C^\delta(\Lambda)$ the space of Hölder continuous functions $\varphi\colon \Lambda \to \mathbb{R}$ with a given Hölder exponent $\delta \in (0, 1]$. We fix $\kappa \in \mathbb{N}$, and we consider two pairs of functions $(\Phi^+, \Psi^+)$ and $(\Phi^-, \Psi^-)$ in the space $H(\Lambda) := C^\delta(\Lambda)^\kappa \times C^\delta(\Lambda)^\kappa$. The symbols $+$ and $-$ correspond respectively to the future and to the past: more precisely, we shall consider the Birkhoff averages of the functions $(\Phi^+, \Psi^+)$ into the future and the Birkhoff averages of the functions $(\Phi^-, \Psi^-)$ into the past. A similar notation will be used throughout the remaining sections in other quantities. Write $\Phi^\pm = (\varphi_1^\pm, \ldots, \varphi_\kappa^\pm)$ and $\Psi^\pm = (\psi_1^\pm, \ldots, \psi_\kappa^\pm)$. We will always assume that $\psi_i^\pm > 0$ for $i = 1, \ldots, \kappa$ (and for simplicity we shall simply write $\Psi^\pm > 0$). Given a vector $\alpha = (\alpha_1, \ldots, \alpha_\kappa) \in \mathbb{R}^\kappa$ we consider the level sets

$$K_\alpha^+ = \bigcap_{i=1}^{\kappa} \Bigl\{ x \in \Lambda : \lim_{n\to\infty} \frac{\sum_{k=0}^{n} \varphi_i^+(f^k x)}{\sum_{k=0}^{n} \psi_i^+(f^k x)} = \alpha_i \Bigr\},$$

$$K_\alpha^- = \bigcap_{i=1}^{\kappa} \Bigl\{ x \in \Lambda : \lim_{n\to\infty} \frac{\sum_{k=0}^{n} \varphi_i^-(f^{-k} x)}{\sum_{k=0}^{n} \psi_i^-(f^{-k} x)} = \alpha_i \Bigr\}.$$

We define the associated dimension spectrum $D\colon \mathbb{R}^\kappa \times \mathbb{R}^\kappa \to \mathbb{R}$ by

$$D(\alpha, \beta) = \dim_H(K_\alpha^+ \cap K_\beta^-).$$

3.2. Formulas for the dimension of the level sets. We first consider separately each of the level sets $K_\alpha^+$ and $K_\alpha^-$.

Theorem 1. Let $\Lambda$ be a compact locally maximal hyperbolic set for a $C^{1+\varepsilon}$ diffeomorphism on a smooth surface. Given pairs of functions $(\Phi^\pm, \Psi^\pm) \in H(\Lambda)$ with $\Psi^\pm > 0$, for each $\alpha \in \mathbb{R}^\kappa$ and $x^\pm \in K_\alpha^\pm$ we have

$$\Lambda \cap W^s(x^+) \subset K_\alpha^+, \qquad \Lambda \cap W^u(x^-) \subset K_\alpha^-, \tag{7}$$

and

$$\dim_H K_\alpha^+ = \dim_H(K_\alpha^+ \cap V^u(x^+)) + t_s, \qquad \dim_H K_\alpha^- = \dim_H(K_\alpha^- \cap V^s(x^-)) + t_u.$$

Proof. Let $a, b\colon \Lambda \to \mathbb{R}$ be continuous functions with $b > 0$. It follows from (3) and the uniform continuity of $a$ and $b$ on $\Lambda$ that for each $x \in \Lambda$ and $\delta > 0$, given $n \in \mathbb{N}$ sufficiently large we have $|a(f^m y) - a(f^m x)| < \delta$ and $|b(f^m y) - b(f^m x)| < \delta$ for every $y \in V^s(x)$ and $m > n$. Therefore,

$$\begin{aligned}
&\left| \frac{\sum_{k=0}^{m} a(f^k y)}{\sum_{k=0}^{m} b(f^k y)} - \frac{\sum_{k=0}^{m} a(f^k x)}{\sum_{k=0}^{m} b(f^k x)} \right| \\
&\quad\le \frac{\sum_{k=0}^{m} |a(f^k y) - a(f^k x)|}{\sum_{k=0}^{m} b(f^k y)} + \left| \frac{\sum_{k=0}^{m} a(f^k x)}{\sum_{k=0}^{m} b(f^k y)} - \frac{\sum_{k=0}^{m} a(f^k x)}{\sum_{k=0}^{m} b(f^k x)} \right| \\
&\quad\le \frac{n\|a\|_\infty + (m-n+1)\delta}{(m+1)\inf b} + (m+1)\|a\|_\infty \frac{n\|b\|_\infty + (m-n+1)\delta}{(m+1)^2(\inf b)^2} \\
&\quad\to \frac{\delta}{\inf b} + \frac{\|a\|_\infty \delta}{(\inf b)^2}
\end{aligned} \tag{8}$$

as $m \to \infty$. Assume now that there exists $\beta \in \mathbb{R}$ such that

$$\lim_{m\to\infty} \frac{\sum_{k=0}^{m} a(f^k x)}{\sum_{k=0}^{m} b(f^k x)} = \beta.$$

Then, taking $\delta \to 0$ in (8) we conclude that

$$\lim_{m\to\infty} \frac{\sum_{k=0}^{m} a(f^k y)}{\sum_{k=0}^{m} b(f^k y)} = \beta \quad\text{for every } y \in V^s(x).$$

This implies that $\Lambda \cap V^s(x) \subset K_\alpha^+$ for every $x \in K_\alpha^+$. Furthermore, since the set $K_\alpha^+$ is $f$-invariant we conclude that $\Lambda \cap f^{-n} V^s(f^n x) \subset K_\alpha^+$ whenever $x \in K_\alpha^+$ for every $n \in \mathbb{N}$, and thus $\Lambda \cap W^s(x) \subset K_\alpha^+$. Similar arguments establish the analogous statement in (7) between $K_\alpha^-$ and the global unstable manifolds.

Let now $V_\varepsilon^s(x) \subset W^s(x)$ and $V_\varepsilon^u(x) \subset W^u(x)$ be the segments of size $\varepsilon$ of the stable and unstable manifolds, with respect to the distances induced by $d$ respectively on $W^s(x)$ and $W^u(x)$. Since $E^s$ and $E^u$ have codimension one, the stable and unstable holonomies are Lipschitz. Hence, by the uniform transversality of the stable and unstable manifolds, given $x \in \Lambda$ and a sufficiently small $\varepsilon > 0$, the map

$$(\Lambda \cap V_\varepsilon^s(x)) \times (\Lambda \cap V_\varepsilon^u(x)) \ni (y, z) \mapsto [y, z] := V_\varepsilon^u(y) \cap V_\varepsilon^s(z)$$

is a Lipschitz homeomorphism with Lipschitz inverse. This ensures that in the set $K_\alpha^+$ the open neighborhood

$$\Lambda \cap \bigcup_{y \in K_\alpha^+ \cap V_\varepsilon^u(x)} V_\varepsilon^s(y)$$

of a point $x \in K_\alpha^+$ (with respect to the induced topology on $\Lambda$, in view of (7)) is taken onto the product $(K_\alpha^+ \cap V_\varepsilon^u(x)) \times (\Lambda \cap V_\varepsilon^s(x))$ by a Lipschitz map with Lipschitz inverse. Therefore,

$$\dim_H K_\alpha^+ = \dim_H(K_\alpha^+ \cap V_\varepsilon^u(x)) + t_s = \dim_H(K_\alpha^+ \cap V^u(x)) + t_s,$$

in view of (4) and (5). We obtain the corresponding identity for $K_\alpha^-$ in an analogous manner. This completes the proof of the theorem.
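The first step of the proof, namely that the ratio of Birkhoff sums has the same limit along two orbits that approach each other exponentially fast, can be checked numerically. The sketch below is only an illustration: the observable values along the orbit of $x$ (`k % 3` and `1 + k % 2`), the perturbation sizes `0.9` and `0.5`, and the rate `lam` are made-up stand-ins for $a(f^k x)$, $b(f^k x)$ and the contraction in (3).

```python
def ratio_gap(m, lam=0.5):
    """|ratio along y - ratio along x| for Birkhoff sums up to time m, when
    the values along y differ from those along x by terms of size lam**k."""
    a_x = [k % 3 for k in range(m + 1)]          # hypothetical values a(f^k x)
    b_x = [1 + (k % 2) for k in range(m + 1)]    # hypothetical values b(f^k x) > 0
    a_y = [a + 0.9 * lam ** k for k, a in enumerate(a_x)]
    b_y = [b + 0.5 * lam ** k for k, b in enumerate(b_x)]
    return abs(sum(a_y) / sum(b_y) - sum(a_x) / sum(b_x))


# The gap shrinks as the averaging time grows, as predicted by (8).
gaps = [ratio_gap(m) for m in (10, 100, 1000, 10000)]
```

In this toy computation the gap decays roughly like $1/m$, which is faster than the estimate (8) requires since the perturbations here are summable.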


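3.3. Existence of full measures. We now consider the dimension spectrum. We denote by $\mathcal{M}$ the family of $f$-invariant Borel probability measures on $\Lambda$, and we define the functions $\mathcal{P}^\pm\colon \mathcal{M} \to \mathbb{R}^\kappa$ by

$$\mathcal{P}^\pm(\mu) = \left( \frac{\int_\Lambda \varphi_1^\pm\, d\mu}{\int_\Lambda \psi_1^\pm\, d\mu}, \ldots, \frac{\int_\Lambda \varphi_\kappa^\pm\, d\mu}{\int_\Lambda \psi_\kappa^\pm\, d\mu} \right).$$

We note that $\mathcal{M}$ is compact and connected, and since $\mathcal{P}^\pm$ is continuous, the set $\mathcal{P}^\pm(\mathcal{M})$ is also compact and connected. We denote by $B(x, r)$ the ball of radius $r$ centered at $x$.

Theorem 2. Let $\Lambda$ be a compact locally maximal hyperbolic set for a $C^{1+\varepsilon}$ diffeomorphism on a smooth surface. Given pairs of functions $(\Phi^\pm, \Psi^\pm) \in H(\Lambda)$ with $\Psi^\pm > 0$, if $\alpha \in \operatorname{int} \mathcal{P}^+(\mathcal{M})$ and $\beta \in \operatorname{int} \mathcal{P}^-(\mathcal{M})$, then there exists a probability measure $\nu$ on $\Lambda$ such that $\nu(K_\alpha^+ \cap K_\beta^-) = 1$, with

$$\lim_{r\to 0} \frac{\log \nu(B(x, r))}{\log r} = \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda \tag{9}$$

for $\nu$-almost every $x \in \Lambda$, and

$$\limsup_{r\to 0} \frac{\log \nu(B(x, r))}{\log r} \le \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda \tag{10}$$

for every $x \in K_\alpha^+ \cap K_\beta^-$.

Proof. Consider a Markov partition of $\Lambda$, and the associated two-sided topological Markov chain $\sigma\colon \Sigma_A \to \Sigma_A$ with transfer matrix $A$. We also consider the coding map $\chi\colon \Sigma_A \to \Lambda$ obtained from the Markov partition. We denote by $\Sigma_A^+$ and $\Sigma_A^-$ respectively the sets of right-sided and left-sided infinite sequences obtained from $\Sigma_A$. We consider the topological Markov chains $\sigma^+\colon \Sigma_A^+ \to \Sigma_A^+$ and $\sigma^-\colon \Sigma_A^- \to \Sigma_A^-$ defined by

$$\sigma^+(i_0 i_1 \cdots) = (i_1 i_2 \cdots) \quad\text{and}\quad \sigma^-(\cdots i_{-1} i_0) = (\cdots i_{-2} i_{-1}).$$

Take now $x \in \Lambda$, and choose $\omega \in \Sigma_A$ such that $\chi(\omega) = x$. Let $R(x)$ be a rectangle of the Markov partition that contains $x$, and let $\pi^+\colon \Sigma_A \to \Sigma_A^+$ and $\pi^-\colon \Sigma_A \to \Sigma_A^-$ be the projections defined respectively by

$$\pi^+(\cdots i_{-1} i_0 i_1 \cdots) = (i_0 i_1 \cdots) \quad\text{and}\quad \pi^-(\cdots i_{-1} i_0 i_1 \cdots) = (\cdots i_{-1} i_0).$$

For each sequence $\omega' \in \Sigma_A$ we have

$$\chi(\omega') \in V^u(x) \cap R(x) \ \text{whenever } \pi^-\omega' = \pi^-\omega, \qquad \chi(\omega') \in V^s(x) \cap R(x) \ \text{whenever } \pi^+\omega' = \pi^+\omega.$$

Thus, if $\omega = (\cdots i_{-1} i_0 i_1 \cdots)$, then via the coding map $\chi$ the set $V^u(x) \cap R(x)$ can be identified with the cylinder

$$C_{i_0}^+ = \{ (j_0 j_1 \cdots) \in \Sigma_A^+ : j_0 = i_0 \} \subset \Sigma_A^+, \tag{11}$$

and the set $V^s(x) \cap R(x)$ can be identified with the cylinder

$$C_{i_0}^- = \{ (\cdots j_{-1} j_0) \in \Sigma_A^- : j_0 = i_0 \} \subset \Sigma_A^-. \tag{12}$$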


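We use a construction described in [3, Lemma 1.6] which, given a Hölder continuous function $\varphi\colon \Sigma_A \to \mathbb{R}$, allows one to obtain Hölder continuous functions $\varphi^u$ and $\varphi^s$ cohomologous to $\varphi$, depending respectively only on the symbolic future and on the symbolic past. We formulate this statement explicitly for the Hölder continuous functions $\varphi_i^\pm$, $\psi_i^\pm$, $\log \|df|E^u\|$, and $\log \|df^{-1}|E^s\|$ (recall that $f$ is of class $C^{1+\varepsilon}$, and that the stable and unstable distributions are always Hölder continuous on the base point).

Lemma 1 (see [3, Lemma 1.6]). For each $i = 1, \ldots, \kappa$ there exist Hölder continuous functions $\varphi_i^u, \psi_i^u, d^u\colon \Sigma_A^+ \to \mathbb{R}$ and $\varphi_i^s, \psi_i^s, d^s\colon \Sigma_A^- \to \mathbb{R}$, and continuous functions $g_i^\pm, h_i^\pm, \rho^\pm\colon \Sigma_A \to \mathbb{R}$ such that

$$\varphi_i^+ \circ \chi = \varphi_i^u \circ \pi^+ + g_i^+ - g_i^+ \circ \sigma, \qquad \psi_i^+ \circ \chi = \psi_i^u \circ \pi^+ + h_i^+ - h_i^+ \circ \sigma,$$

$$\log \|df|E^u\| \circ \chi = d^u \circ \pi^+ + \rho^+ - \rho^+ \circ \sigma,$$

and

$$\varphi_i^- \circ \chi = \varphi_i^s \circ \pi^- + g_i^- - g_i^- \circ \sigma, \qquad \psi_i^- \circ \chi = \psi_i^s \circ \pi^- + h_i^- - h_i^- \circ \sigma,$$

$$\log \|df^{-1}|E^s\| \circ \chi = d^s \circ \pi^- + \rho^- - \rho^- \circ \sigma.$$

We now initiate the process of construction of the measure $\nu$. Set

$$d^+ = \dim_H K_\alpha^+ - t_s \quad\text{and}\quad d^- = \dim_H K_\beta^- - t_u.$$

Note that by (6) we have

$$d^+ + d^- = \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda. \tag{13}$$

Set now

$$\Phi^u = (\varphi_1^u, \ldots, \varphi_\kappa^u), \quad \Psi^u = (\psi_1^u, \ldots, \psi_\kappa^u), \quad \Phi^s = (\varphi_1^s, \ldots, \varphi_\kappa^s), \quad \Psi^s = (\psi_1^s, \ldots, \psi_\kappa^s).$$

Given $q^\pm \in \mathbb{R}^\kappa$, we define (Hölder continuous) functions $a^u\colon \Sigma_A^+ \to \mathbb{R}$ and $b^s\colon \Sigma_A^- \to \mathbb{R}$ by

$$a^u = \langle q^+, \Phi^u - \alpha * \Psi^u \rangle - d^+ d^u, \qquad b^s = \langle q^-, \Phi^s - \beta * \Psi^s \rangle - d^- d^s, \tag{14}$$

where $\langle\cdot,\cdot\rangle$ denotes the standard inner product in $\mathbb{R}^\kappa$ and

$$\alpha * (\varphi_1, \ldots, \varphi_\kappa) = (\alpha_1\varphi_1, \ldots, \alpha_\kappa\varphi_\kappa).$$

Since $f$ is topologically mixing on $\Lambda$ (and hence the same happens with $f^{-1}$), there exist a unique equilibrium measure $\mu^u$ of $a^u$ on $\Sigma_A^+$ (with respect to $\sigma^+$), and a unique equilibrium measure $\mu^s$ of $b^s$ on $\Sigma_A^-$ (with respect to $\sigma^-$). Note that both $\mu^u$ and $\mu^s$ are Gibbs measures. Since $\alpha \in \operatorname{int} \mathcal{P}^+(\mathcal{M})$ and $\beta \in \operatorname{int} \mathcal{P}^-(\mathcal{M})$, the following statement is an immediate consequence of Theorem 8 in [2] (see also the discussion after that theorem).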


Lemma 2. There exist vectors $q^\pm \in \mathbb{R}^\kappa$ such that the corresponding measures $\mu^u$ and $\mu^s$ satisfy $P_{\sigma^+}(a^u) = P_{\sigma^-}(b^s) = 0$ and

$$\int_{\Sigma_A^+} \Phi^u\, d\mu^u = \alpha * \int_{\Sigma_A^+} \Psi^u\, d\mu^u, \qquad \int_{\Sigma_A^-} \Phi^s\, d\mu^s = \beta * \int_{\Sigma_A^-} \Psi^s\, d\mu^s. \tag{15}$$

We also define measures $\nu^u$ and $\nu^s$ on the rectangle $R(x) \subset \Lambda$ by

$$\nu^u = \mu^u \circ \pi^+ \circ \chi^{-1} \quad\text{and}\quad \nu^s = \mu^s \circ \pi^- \circ \chi^{-1},$$

using the vectors $q^\pm$ in Lemma 2. We finally define the measure $\nu$ on $R(x)$ by $\nu = \nu^u \times \nu^s$. Note that since $\mu^u$ and $\mu^s$ are Gibbs measures, we have (see (11) and (12))

$$\nu(R(x)) = \mu^u(C_{i_0}^+)\, \mu^s(C_{i_0}^-) > 0.$$

Lemma 3. There exist constants $\gamma > 1$ and $C > 0$ such that for every $x \in \Lambda$ and $r > 0$ we have

$$\nu(B(x, \gamma r)) \le C \nu(B(x, r)). \tag{16}$$

Proof of the lemma. Consider the Hölder continuous functions $a, b\colon \Lambda \to \mathbb{R}$ defined by

$$a = \langle q^+, \Phi^+ - \alpha * \Psi^+ \rangle - d^+ \log \|df|E^u\|, \qquad b = \langle q^-, \Phi^- - \beta * \Psi^- \rangle + d^- \log \|df|E^s\|.$$

Note that $\nu^u$ is the unique equilibrium measure of $a$ with respect to $f$, and that $\nu^s$ is the unique equilibrium measure of $b$ with respect to $f^{-1}$. Being equilibrium measures of Hölder continuous functions, each of them has the property in (16) (this is the so-called diametric regularity property; see Propositions 21.4 and 24.1 in [6]), and thus the same happens with their product $\nu$.

We now start to establish the statements in the theorem.

Lemma 4. We have

$$\liminf_{r\to 0} \frac{\log \nu(B(x, r))}{\log r} \ge \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda$$

for $\nu$-almost every $x \in \Lambda$.

Proof of the lemma. Using the variational principle for the topological pressure of the functions in (14) it follows from Lemma 2 that

$$\frac{h_{\mu^u}(\sigma^+)}{\int_{\Sigma_A^+} d^u\, d\mu^u} = d^+ \quad\text{and}\quad \frac{h_{\mu^s}(\sigma^-)}{\int_{\Sigma_A^-} d^s\, d\mu^s} = d^-, \tag{17}$$

where $h_\mu$ denotes the Kolmogorov–Sinai entropy with respect to the measure $\mu$. By the Shannon–McMillan–Breiman theorem and the Birkhoff ergodic theorem, it follows


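from (17) that for every $\varepsilon > 0$, and for $\mu^u$-almost every $\omega^+ \in C_{i_0}^+$ and $\mu^s$-almost every $\omega^- \in C_{i_0}^-$, there exists $s(\omega) \in \mathbb{N}$ such that for each $n, m > s(\omega)$ we have

$$d^+ - \varepsilon < -\frac{\log \mu^u(C^+_{i_0 \cdots i_n})}{\sum_{k=0}^{n} d^u((\sigma^+)^k \omega^+)} < d^+ + \varepsilon,$$

and

$$d^- - \varepsilon < -\frac{\log \mu^s(C^-_{i_{-m} \cdots i_0})}{\sum_{k=0}^{m} d^s((\sigma^-)^k \omega^-)} < d^- + \varepsilon.$$

Given $r > 0$ sufficiently small we now choose $n = n(\omega, r)$ and $m = m(\omega, r)$ such that

$$-\sum_{k=0}^{n} d^u((\sigma^+)^k \omega^+) > \log r, \qquad -\sum_{k=0}^{n+1} d^u((\sigma^+)^k \omega^+) \le \log r, \tag{18}$$

and

$$-\sum_{k=0}^{m} d^s((\sigma^-)^k \omega^-) > \log r, \qquad -\sum_{k=0}^{m+1} d^s((\sigma^-)^k \omega^-) \le \log r. \tag{19}$$

It follows from the construction of the Markov partitions (recall that the stable and unstable distributions have dimension one; see [6] for details) that there exists a constant $\rho > 1$ independent of $x = \chi(\omega)$ and $r$ such that

$$B(y, r/\rho) \cap \Lambda \subset \chi(C_{i_{-m} \cdots i_n}) \subset B(x, \rho r) \tag{20}$$

for some point $y \in \chi(C_{i_{-m} \cdots i_n})$, with $\omega = (\cdots i_{-1} i_0 i_1 \cdots)$. Furthermore, by Lemma 3 there exists a constant $c > 0$ independent of $x$ and $r$ such that $\nu(B(y, 2\rho r)) \le c\,\nu(B(y, r/\rho))$. It follows from (20) that

$$\begin{aligned}
\nu(B(x, r)) &\le \nu(B(y, 2\rho r)) \le c\,\nu(B(y, r/\rho)) \\
&\le c\,\nu(\chi(C_{i_{-m} \cdots i_n})) = c\,\mu^u(C^+_{i_0 \cdots i_n})\, \mu^s(C^-_{i_{-m} \cdots i_0}) \\
&< c \exp\Bigl[(-d^+ + \varepsilon) \sum_{k=0}^{n} d^u((\sigma^+)^k \omega^+)\Bigr] \times \exp\Bigl[(-d^- + \varepsilon) \sum_{k=0}^{m} d^s((\sigma^-)^k \omega^-)\Bigr] \\
&\le c \exp\bigl[(\log r + \|d^u\|_\infty)(d^+ - \varepsilon) + (\log r + \|d^s\|_\infty)(d^- - \varepsilon)\bigr],
\end{aligned}$$

and hence

$$\liminf_{r\to 0} \frac{\log \nu(B(x, r))}{\log r} \ge d^+ + d^- - 2\varepsilon,$$

for $\nu$-almost every point $x \in \Lambda$. In view of (13), the arbitrariness of $\varepsilon$ implies the desired result.

Let now $\Sigma_{\alpha\beta} \subset \Sigma_A$ be the set of points $\omega \in \Sigma_A$ such that for $i = 1, \ldots, \kappa$ we have

$$\lim_{n\to\infty} \frac{\sum_{k=0}^{n} \varphi_i^u((\sigma^+)^k \omega^+)}{\sum_{k=0}^{n} \psi_i^u((\sigma^+)^k \omega^+)} = \alpha_i, \qquad \lim_{n\to\infty} \frac{\sum_{k=0}^{n} \varphi_i^s((\sigma^-)^k \omega^-)}{\sum_{k=0}^{n} \psi_i^s((\sigma^-)^k \omega^-)} = \beta_i.$$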


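Lemma 5. The inequality in (10) holds for every $x \in \chi(\Sigma_{\alpha\beta})$.

Proof of the lemma. Given $\varepsilon > 0$ and $\omega \in \Sigma_{\alpha\beta}$ there exists $r(\omega) \in \mathbb{N}$ such that for every $n > r(\omega)$ we have

$$\Bigl| \sum_{k=0}^{n} \bigl\langle q^+, (\Phi^u - \alpha * \Psi^u)((\sigma^+)^k \omega^+) \bigr\rangle \Bigr| < \varepsilon n \|\langle q^+, \Psi^u \rangle\|_\infty, \tag{21}$$

and

$$\Bigl| \sum_{k=0}^{n} \bigl\langle q^-, (\Phi^s - \beta * \Psi^s)((\sigma^-)^k \omega^-) \bigr\rangle \Bigr| < \varepsilon n \|\langle q^-, \Psi^s \rangle\|_\infty. \tag{22}$$

Since $\mu^u$ and $\mu^s$ are Gibbs measures, in view of (15) there exists a constant $D > 0$ such that for every $\omega^+ \in C_{i_0}^+$, $\omega^- \in C_{i_0}^-$, and $n, m \in \mathbb{N}$ we have

$$D^{-1} < \frac{\mu^u(C^+_{i_0 \cdots i_n})}{\exp \sum_{k=0}^{n} a^u((\sigma^+)^k \omega^+)} < D,$$

and

$$D^{-1} < \frac{\mu^s(C^-_{i_{-m} \cdots i_0})}{\exp \sum_{k=0}^{m} b^s((\sigma^-)^k \omega^-)} < D.$$

Combining these inequalities with (21)–(22) we find that

$$\mu^u(C^+_{i_0 \cdots i_n}) > D^{-1} \exp\Bigl[-d^+ \sum_{k=0}^{n} d^u((\sigma^+)^k \omega^+) - \varepsilon n \|\langle q^+, \Psi^u \rangle\|_\infty\Bigr], \tag{23}$$

and

$$\mu^s(C^-_{i_{-m} \cdots i_0}) > D^{-1} \exp\Bigl[-d^- \sum_{k=0}^{m} d^s((\sigma^-)^k \omega^-) - \varepsilon m \|\langle q^-, \Psi^s \rangle\|_\infty\Bigr]. \tag{24}$$

As in the proof of Lemma 4, it follows from the construction of the Markov partitions that there exists a constant $\rho > 0$ independent of $x = \chi(\omega)$ and $r$ such that $B(x, \rho r) \supset \chi(C_{i_{-m} \cdots i_n})$, with $n = n(\omega, r)$ and $m = m(\omega, r)$ as in (18)–(19). Given $\varepsilon > 0$ and $\omega \in \Sigma_{\alpha\beta}$ we now take $r > 0$ sufficiently small so that $n(\omega, r) > r(\omega)$ and $m(\omega, r) > r(\omega)$ (the uniform hyperbolicity of $f$ on $\Lambda$ ensures that this is always possible). Then, combining (23)–(24) with (18)–(19) we obtain

$$\begin{aligned}
\nu(B(x, \rho r)) &\ge \nu(\chi(C_{i_{-m} \cdots i_n})) = \mu^u(C^+_{i_0 \cdots i_n})\, \mu^s(C^-_{i_{-m} \cdots i_0}) \\
&\ge D^{-2} r^{d^+ + d^-} \exp\bigl(-\varepsilon n \|\langle q^+, \Psi^u \rangle\|_\infty - \varepsilon m \|\langle q^-, \Psi^s \rangle\|_\infty\bigr),
\end{aligned}$$

for all sufficiently small $r > 0$. Note that by (18)–(19) we have

$$-n \inf d^u > \log r, \qquad -m \inf d^s > \log r.$$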


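Therefore, for every $x = \chi(\omega)$ with $\omega \in \Sigma_{\alpha\beta}$ we obtain

$$\limsup_{r\to 0} \frac{\log \nu(B(x, r))}{\log r} \le d^+ + d^- + \varepsilon \left( \frac{\|\langle q^+, \Psi^u \rangle\|_\infty}{\inf d^u} + \frac{\|\langle q^-, \Psi^s \rangle\|_\infty}{\inf d^s} \right).$$

Since $\varepsilon$ is arbitrary, for every $x \in \chi(\Sigma_{\alpha\beta})$ we have

$$\limsup_{r\to 0} \frac{\log \nu(B(x, r))}{\log r} \le d^+ + d^-.$$

In view of (13) this gives the desired statement.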

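Lemma 6. We have $\chi(\Sigma_{\alpha\beta}) = K_\alpha^+ \cap K_\beta^-$.

Proof of the lemma. It follows from Lemma 1 that

\[ \varphi_i^+(f^k(\chi(\omega))) = \varphi_i^+(\chi(\sigma^k\omega)) = \varphi_i^u(\pi^+(\sigma^k\omega)) + g_i^+(\sigma^k\omega) - g_i^+(\sigma^{k+1}\omega) = \varphi_i^u((\sigma^+)^k\omega^+) + g_i^+(\sigma^k\omega) - g_i^+(\sigma^{k+1}\omega), \]

with analogous identities for the functions $\psi_i^+$, $\varphi_i^-$, and $\psi_i^-$. Therefore, since the sums telescope,

\[ \frac{\sum_{k=0}^{n-1}\varphi_i^+(f^k(\chi(\omega)))}{\sum_{k=0}^{n-1}\psi_i^+(f^k(\chi(\omega)))} = \frac{\sum_{k=0}^{n-1}\varphi_i^u((\sigma^+)^k\omega^+) + g_i^+(\omega) - g_i^+(\sigma^n\omega)}{\sum_{k=0}^{n-1}\psi_i^u((\sigma^+)^k\omega^+) + h_i^+(\omega) - h_i^+(\sigma^n\omega)}, \tag{25} \]

and

\[ \frac{\sum_{k=0}^{n-1}\varphi_i^-(f^{-k}(\chi(\omega)))}{\sum_{k=0}^{n-1}\psi_i^-(f^{-k}(\chi(\omega)))} = \frac{\sum_{k=0}^{n-1}\varphi_i^s((\sigma^-)^k\omega^-) + g_i^-(\omega) - g_i^-(\sigma^n\omega)}{\sum_{k=0}^{n-1}\psi_i^s((\sigma^-)^k\omega^-) + h_i^-(\omega) - h_i^-(\sigma^n\omega)}. \]

We now observe that

\[ \sum_{k=0}^{n-1}\psi_i^u((\sigma^+)^k\omega^+) \ge n \inf\psi_i^+ - 2\|h_i^+\|_\infty, \qquad \sum_{k=0}^{n-1}\psi_i^s((\sigma^-)^k\omega^-) \ge n \inf\psi_i^- - 2\|h_i^-\|_\infty. \]

Since $\psi_i^\pm > 0$ for $i = 1, \ldots, \kappa$, these inequalities ensure that the limits

\[ \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^+(f^k(\chi(\omega)))}{\sum_{k=0}^{n-1}\psi_i^+(f^k(\chi(\omega)))}, \qquad \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^-(f^{-k}(\chi(\omega)))}{\sum_{k=0}^{n-1}\psi_i^-(f^{-k}(\chi(\omega)))} \]

exist if and only if the limits

\[ \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^u((\sigma^+)^k\omega^+)}{\sum_{k=0}^{n-1}\psi_i^u((\sigma^+)^k\omega^+)}, \qquad \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^s((\sigma^-)^k\omega^-)}{\sum_{k=0}^{n-1}\psi_i^s((\sigma^-)^k\omega^-)} \tag{26} \]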

Multifractal Structure of Two-Dimensional Horseshoes

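exist, in which case we have

\[ \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^+(f^k(\chi(\omega)))}{\sum_{k=0}^{n-1}\psi_i^+(f^k(\chi(\omega)))} = \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^u((\sigma^+)^k\omega^+)}{\sum_{k=0}^{n-1}\psi_i^u((\sigma^+)^k\omega^+)} \]

and

\[ \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^-(f^{-k}(\chi(\omega)))}{\sum_{k=0}^{n-1}\psi_i^-(f^{-k}(\chi(\omega)))} = \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^s((\sigma^-)^k\omega^-)}{\sum_{k=0}^{n-1}\psi_i^s((\sigma^-)^k\omega^-)}. \]

In particular $\omega \in \Sigma_{\alpha\beta}$ if and only if $\chi(\omega) \in K_\alpha^+ \cap K_\beta^-$. This shows that $\chi(\Sigma_{\alpha\beta}) = K_\alpha^+ \cap K_\beta^-$. Combining the above lemmas we readily obtain the statement in the theorem.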
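The mechanism behind Lemma 6 is that adding a coboundary to numerator and denominator changes each Birkhoff sum only by a bounded boundary term, so the ratio of the sums has the same limit. The following is a minimal numerical sketch of that telescoping effect on a toy system. The circle rotation and the functions `phi`, `psi`, `g`, `h` are illustrative assumptions and are not the maps or functions of the paper.

```python
import math


def birkhoff_ratio_gap(n, x0=0.1, step=(math.sqrt(5) - 1) / 2):
    """Gap between a ratio of Birkhoff sums and its coboundary-shifted version.

    Toy setting (an assumption, not the horseshoe of the text): the orbit is
    x_{k+1} = x_k + step (mod 1), and the shifted sums differ from the plain
    ones by the telescoped terms g(x_0) - g(x_n) and h(x_0) - h(x_n).
    """
    phi = lambda x: 1.0 + 0.5 * math.cos(2 * math.pi * x)
    psi = lambda x: 2.0 + math.sin(2 * math.pi * x)  # strictly positive
    g = lambda x: math.sin(2 * math.pi * x)          # bounded transfer function
    h = lambda x: math.cos(2 * math.pi * x)          # bounded transfer function

    x = x0
    sum_phi = 0.0
    sum_psi = 0.0
    for _ in range(n):
        sum_phi += phi(x)
        sum_psi += psi(x)
        x = (x + step) % 1.0
    # Here x equals x_n, the endpoint of the telescoping sum.
    plain = sum_phi / sum_psi
    shifted = (sum_phi + g(x0) - g(x)) / (sum_psi + h(x0) - h(x))
    return abs(plain - shifted)


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, birkhoff_ratio_gap(n))
```

Since the denominator grows at least linearly in n while the boundary terms stay bounded by 2, the gap decays like 1/n, which is the content of the lower bounds on the sums of the denominators used in the proof.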

We call each measure $\nu$ with the properties in Theorem 2 a full measure for the set $K_\alpha^+ \cap K_\beta^-$. We note that the particular measure $\nu$ constructed in the proof of Theorem 2 is never $f$-invariant (since $\mu^u$ is $\sigma^+$-invariant while $\mu^s$ is $\sigma^-$-invariant).

3.4. Formula for the spectrum. We now use the former results to obtain a formula for the spectrum $D$.

Theorem 3. Let $\Lambda$ be a compact locally maximal hyperbolic set for a $C^{1+\varepsilon}$ diffeomorphism on a smooth surface. Given pairs of functions $(\Phi^\pm, \Psi^\pm) \in H(\Lambda)$ with $\Psi^\pm > 0$, if $\alpha \in \operatorname{int} P^+(M)$ and $\beta \in \operatorname{int} P^-(M)$, then the set $K_\alpha^+ \cap K_\beta^-$ is dense in $\Lambda$ and

\[ D(\alpha,\beta) = \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda. \tag{27} \]

Proof. It follows easily from the construction of the functions $\Phi^u$, $\Psi^u$, $\Phi^s$, and $\Psi^s$ that the sets $K_\alpha^+$ and $K_\beta^-$ are dense in $\Lambda$ (we note that by Theorem 2 they are nonempty). Namely, by Lemma 1 (see also (25) and (26)) we know that the ratios of Birkhoff averages of these functions only depend on the symbolic past (in the case of $K_\alpha^+$) or on the symbolic future (in the case of $K_\beta^-$). The density follows immediately from the fact that

\[ \overline{\bigcup_{k\in\mathbb{N}} (\sigma^+)^{-k}\omega^+} = \Sigma_A^+ \quad\text{and}\quad \overline{\bigcup_{k\in\mathbb{N}} (\sigma^-)^{-k}\omega^-} = \Sigma_A^- \]

for any points $\omega^+ \in \Sigma_A^+$ and $\omega^- \in \Sigma_A^-$. Let now $\nu$ be the measure constructed in Theorem 2. By (9) we have (see for example Theorem 7.1 in [6])

\[ \dim_H \nu = \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda, \]

where $\dim_H \nu := \inf\{\dim_H Z : \nu(Z) = 1\}$ is the Hausdorff dimension of the measure $\nu$. Since $\nu(K_\alpha^+ \cap K_\beta^-) = 1$ we obtain

\[ \dim_H (K_\alpha^+ \cap K_\beta^-) \ge \dim_H \nu = \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda. \tag{28} \]

For the reverse inequality, we note that it follows readily from (10) that $\dim_H (K_\alpha^+ \cap K_\beta^-) \le \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda$ (see for example Theorem 7.2 in [6]).

An alternative proof of the inequality in (28), which does not use the measure $\nu$, is the following. We first note that given $x \in K_\alpha^+ \cap K_\beta^-$ and a sufficiently small open neighborhood $U$ of $x$, we have

\[ K_\alpha^+ \cap U = \bigcup_{y \in K_\alpha^+ \cap U} (V^s(y) \cap U) \quad\text{and}\quad K_\beta^- \cap U = \bigcup_{z \in K_\beta^- \cap U} (V^u(z) \cap U). \]

Therefore,

\[ K_\alpha^+ \cap K_\beta^- \cap U = \bigcup_{y \in K_\alpha^+ \cap U} V^s(y) \;\cap \bigcup_{z \in K_\beta^- \cap U} V^u(z) \;\cap\; U. \]

In a similar manner to that in the proof of Theorem 1, since the stable and unstable holonomies are Lipschitz, this identity ensures that in the neighborhood $U$ the set $K_\alpha^+ \cap K_\beta^-$ is taken by a Lipschitz map with Lipschitz inverse onto the product $(K_\alpha^+ \cap V^u(x)) \times (K_\beta^- \cap V^s(x))$. Therefore,

\[ \dim_H (K_\alpha^+ \cap K_\beta^-) \ge \dim_H (K_\alpha^+ \cap V^u(x)) + \dim_H (K_\beta^- \cap V^s(x)) = \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda. \tag{29} \]

Unfortunately this approach does not provide an upper bound for $D(\alpha, \beta)$. In order to explain the difficulty, we first observe that

\[ \overline{K_\alpha^+} = \Lambda \quad\text{and}\quad \overline{K_\beta^-} = \Lambda, \tag{30} \]

whenever the level sets $K_\alpha^+$ and $K_\beta^-$ are nonempty. This follows from the corresponding statement at the level of symbolic dynamics. Simply note that $W^s(x)$ (respectively $W^u(x)$) is the set of points having eventually the same symbolic future (respectively the same symbolic past) as the point $x$, and thus have an arbitrary symbolic past (respectively symbolic future). The identities in (30) are an immediate consequence of this observation. It is the noncompactness of the sets $K_\alpha^+$ and $K_\beta^-$ that causes the difficulty. More precisely, it follows from (30) that $\overline{K_\alpha^+ \cap V^u(x)} = \Lambda \cap V^u(x)$ and $\overline{K_\beta^- \cap V^s(x)} = \Lambda \cap V^s(x)$, and thus

\[ \dim_B (K_\alpha^+ \cap V^u(x)) = \dim_B (\Lambda \cap V^u(x)) = \dim_H (\Lambda \cap V^u(x)), \]

\[ \dim_B (K_\beta^- \cap V^s(x)) = \dim_B (\Lambda \cap V^s(x)) = \dim_H (\Lambda \cap V^s(x)). \]

It follows from the general inequality $\dim_H (X \cap Y) \le \dim_H X + \dim_B Y$ that the best simple-minded upper estimate for $D(\alpha, \beta)$ is

\[ D(\alpha,\beta) \le \min\bigl\{ \dim_H (K_\alpha^+ \cap V^u(x)) + \dim_H (\Lambda \cap V^s(x)),\ \dim_H (\Lambda \cap V^u(x)) + \dim_H (K_\beta^- \cap V^s(x)) \bigr\} = \min\{\dim_H K_\alpha^+, \dim_H K_\beta^-\}. \tag{31} \]

But this inequality also follows trivially from the inclusions $K_\alpha^+ \cap K_\beta^- \subset K_\alpha^+, K_\beta^-$. On the other hand, it follows from the theory of multifractal analysis (see [2]), provided that some cohomology relations are excluded, that there exist uncountably many values of $\alpha$ and $\beta$ such that $\dim_H K_\alpha^+ < \dim_H \Lambda$ and $\dim_H K_\beta^- < \dim_H \Lambda$. For these values of $\alpha$ and $\beta$ the upper bound in (31) is strictly larger than the lower bound in (29).

3.5. Conditional variational principle. We describe here a conditional variational principle for the spectrum $D$. We recall that $h_\mu(f)$ denotes the Kolmogorov–Sinai entropy of an $f$-invariant measure $\mu$ on $\Lambda$.

Theorem 4. Let $\Lambda$ be a compact locally maximal hyperbolic set for a $C^{1+\varepsilon}$ diffeomorphism on a smooth surface. Given pairs of functions $(\Phi^\pm, \Psi^\pm) \in H(\Lambda)$ with $\Psi^\pm > 0$, the following properties hold:

1. if $\alpha \in \operatorname{int} P^+(M)$ and $\beta \in \operatorname{int} P^-(M)$, then

\[ D(\alpha,\beta) = \max\left\{ \frac{h_\mu(f)}{-\int_\Lambda \log\|df|E^s\|\,d\mu} : \mu \in M \text{ and } P^+(\mu) = \alpha \right\} + \max\left\{ \frac{h_\mu(f)}{\int_\Lambda \log\|df|E^u\|\,d\mu} : \mu \in M \text{ and } P^-(\mu) = \beta \right\}; \]

2. the spectrum $D$ is analytic on $\operatorname{int} P^+(M) \times \operatorname{int} P^-(M)$.

Proof. In view of Theorem 3 (see (27)), this is an immediate consequence of Theorems 8 and 13 in [2].

Let us consider as an illustration the particular case when the Birkhoff averages are obtained from the Lyapunov exponents. Let $\mu \in M$. By Birkhoff's ergodic theorem, for $\mu$-almost every $x \in \Lambda$ there exist the limits

\[ \lambda_s^+(x) = \lim_{n\to+\infty} \frac{1}{n}\log\|d_x f^n|E^s\|, \qquad \lambda_u^+(x) = \lim_{n\to+\infty} \frac{1}{n}\log\|d_x f^n|E^u\|, \]

and

\[ \lambda_s^-(x) = \lim_{n\to-\infty} \frac{1}{|n|}\log\|d_x f^n|E^s\|, \qquad \lambda_u^-(x) = \lim_{n\to-\infty} \frac{1}{|n|}\log\|d_x f^n|E^u\|. \]

These are respectively the values of the forward and backward Lyapunov exponents of $f$ at the point $x$. Set now

\[ P^\pm(\mu) = \left( \int_\Lambda \lambda_s^\pm\,d\mu,\ \int_\Lambda \lambda_u^\pm\,d\mu \right). \]

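It also follows from Birkhoff's ergodic theorem that $P(\mu) := P^+(\mu) = P^-(\mu)$ for every measure $\mu \in M$. By Theorem 4, for each $\alpha = (\alpha_1, \alpha_2)$, $\beta = (\beta_1, \beta_2) \in \operatorname{int} P(\mu)$ we have

\[ D(\alpha,\beta) = \max\left\{ \frac{h_\mu(f)}{-\int_\Lambda \lambda_s^+\,d\mu} : P(\mu) = \alpha \right\} + \max\left\{ \frac{h_\mu(f)}{-\int_\Lambda \lambda_u^-\,d\mu} : P(\mu) = \beta \right\} = -\frac{1}{\alpha_1}\max\{h_\mu(f) : P(\mu) = \alpha\} - \frac{1}{\beta_2}\max\{h_\mu(f) : P(\mu) = \beta\}. \]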

References

1. Barreira, L.: Hyperbolicity and recurrence in dynamical systems: a survey of recent results. Resenhas IME-USP 5, 171–230 (2002)
2. Barreira, L., Saussol, B., Schmeling, J.: Higher-dimensional multifractal analysis. J. Math. Pures Appl. 81, 67–91 (2002)
3. Bowen, R.: Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms. Lect. Notes in Math. 470, Berlin Heidelberg-New York: Springer, 1975
4. McCluskey, H., Manning, A.: Hausdorff dimension for horseshoes. Ergodic Theory Dynam. Systems 3, 251–260 (1983)
5. Palis, J., Viana, M.: On the continuity of Hausdorff dimension and limit capacity for horseshoes. In: Bamón, R., Labarca, R., Palis, J. (eds.), Dynamical Systems (Valparaiso, 1986), Lect. Notes in Math. 1331, Berlin Heidelberg-New York: Springer, 1988, pp. 150–160
6. Pesin, Ya.: Dimension Theory in Dynamical Systems. Contemporary Views and Applications, Chicago Lectures in Mathematics, University of Chicago Press, 1997

Communicated by P. Sarnak

Commun. Math. Phys. 266, 471–497 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0047-8

Communications in

Mathematical Physics

Conservative Solutions to a Nonlinear Variational Wave Equation

Alberto Bressan, Yuxi Zheng

Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA.

Received: 30 August 2005 / Accepted: 13 February 2006
Published online: 25 May 2006 – © Springer-Verlag 2006

Abstract: We establish the existence of a conservative weak solution to the Cauchy problem for the nonlinear variational wave equation $u_{tt} - c(u)(c(u)u_x)_x = 0$, for initial data of finite energy. Here $c(\cdot)$ is any smooth function with uniformly positive bounded values.

1. Introduction

We are interested in the Cauchy problem

\[ u_{tt} - c(u)\bigl(c(u)u_x\bigr)_x = 0, \tag{1.1} \]

with initial data

\[ u(0,x) = u_0(x), \qquad u_t(0,x) = u_1(x). \tag{1.2} \]

Throughout the following, we assume that $c : \mathbb{R} \to \mathbb{R}_+$ is a smooth, bounded, uniformly positive function. Even for smooth initial data, it is well known that the solution can lose regularity in finite time ([12]). It is thus of interest to study whether the solution can be extended beyond the time when a singularity appears. This is indeed the main concern of the present paper. In ([5]) we considered the related equation

\[ u_t + f(u)_x = \frac{1}{2}\int_0^x f''(u)\,u_x^2\,dx \tag{1.3} \]

and constructed a semigroup of solutions, depending continuously on the initial data. Here we establish similar results for the nonlinear wave equation (1.1). By introducing new sets of dependent and independent variables, we show that the solution to the Cauchy problem can be obtained as the fixed point of a contractive transformation. Our main result can be stated as follows.

Theorem 1. Let $c : \mathbb{R} \to [\kappa^{-1}, \kappa]$ be a smooth function, for some $\kappa > 1$. Assume that the initial data $u_0$ in (1.2) is absolutely continuous, and that $(u_0)_x \in L^2$, $u_1 \in L^2$. Then the Cauchy problem (1.1)–(1.2) admits a weak solution $u = u(t, x)$, defined for all $(t, x) \in \mathbb{R} \times \mathbb{R}$. In the $t$-$x$ plane, the function $u$ is locally Hölder continuous with exponent 1/2. This solution $t \mapsto u(t, \cdot)$ is continuously differentiable as a map with values in $L^p_{loc}$, for all $1 \le p < 2$. Moreover, it is Lipschitz continuous w.r.t. the $L^2$ distance, i.e.

\[ \|u(t,\cdot) - u(s,\cdot)\|_{L^2} \le L\,|t - s| \tag{1.4} \]

for all $t, s \in \mathbb{R}$. Equation (1.1) is satisfied in integral sense, i.e.

\[ \iint \bigl[\phi_t u_t - (c(u)\phi)_x\, c(u)\, u_x\bigr]\,dx\,dt = 0 \tag{1.5} \]

for all test functions $\phi \in C_c^1$. Concerning the initial conditions, the first equality in (1.2) is satisfied pointwise, while the second holds in $L^p_{loc}$ for $p \in [1, 2[$.

Our constructive procedure yields solutions which depend continuously on the initial data. Moreover, the "energy"

\[ E(t) \doteq \frac{1}{2}\int \bigl[u_t^2(t,x) + c^2(u(t,x))\,u_x^2(t,x)\bigr]\,dx \tag{1.6} \]

remains uniformly bounded. More precisely, one has

Theorem 2. A family of weak solutions to the Cauchy problem (1.1)–(1.2) can be constructed with the following additional properties. For every $t \in \mathbb{R}$ one has

\[ E(t) \le E_0 \doteq \frac{1}{2}\int \bigl[u_1^2(x) + c^2(u_0(x))\,(u_0)_x^2(x)\bigr]\,dx. \tag{1.7} \]

Moreover, let a sequence of initial conditions satisfy

\[ \|(u_0^n)_x - (u_0)_x\|_{L^2} \to 0, \qquad \|u_1^n - u_1\|_{L^2} \to 0, \]

and $u_0^n \to u_0$ uniformly on compact sets, as $n \to \infty$. Then one has the convergence of the corresponding solutions $u^n \to u$, uniformly on bounded subsets of the $t$-$x$ plane.

It appears in (1.7) that the total energy of our solutions may decrease in time. Yet, we emphasize that our solutions are conservative, in the following sense.

Theorem 3. There exists a continuous family $\{\mu_t;\ t \in \mathbb{R}\}$ of positive Radon measures on the real line with the following properties.
(i) At every time $t$, one has $\mu_t(\mathbb{R}) = E_0$.
(ii) For each $t$, the absolutely continuous part of $\mu_t$ has density $\frac{1}{2}(u_t^2 + c^2 u_x^2)$ w.r.t. the Lebesgue measure.
(iii) For almost every $t \in \mathbb{R}$, the singular part of $\mu_t$ is concentrated on the set where $c'(u) = 0$.

In other words, the total energy represented by the measure $\mu$ is conserved in time. Occasionally, some of this energy is concentrated on a set of measure zero. At the times $\tau$ when this happens, $\mu_\tau$ has a non-trivial singular part and $E(\tau) < E_0$. The condition (iii) puts some restrictions on the set of such times $\tau$. In particular, if $c'(u) \neq 0$ for all $u$, then this set has measure zero. We point out that what we construct is a continuous semigroup of solutions. Uniqueness within a class of conservative solutions is at this point only a conjecture.

The paper is organized as follows. In the next two subsections we briefly discuss the physical motivations for the equation and recall some known results on its solutions. In Sect. 2 we introduce a new set of independent and dependent variables, and derive some identities valid for smooth solutions. We formulate a set of equations in the new variables which is equivalent to (1.1). In the new variables all singularities disappear: smooth initial data lead to globally smooth solutions. In Sect. 3 we use a contractive transformation in a Banach space with a suitable weighted norm to show that there is a unique solution to the set of equations in the new variables, depending continuously on the data $u_0$, $u_1$. Going back to the original variables $u$, $t$, $x$, in Sect. 4 we establish the Hölder continuity of these solutions $u = u(t, x)$, and show that the integral equation (1.5) is satisfied. Moreover, in Sect. 5, we study the conservativeness of the solutions, and establish the energy inequality and the Lipschitz continuity of the map $t \mapsto u(t, \cdot)$. This already yields a proof of Theorem 2. In Sect. 6 we study the continuity of the maps $t \mapsto u_x(t, \cdot)$, $t \mapsto u_t(t, \cdot)$, completing the proof of Theorem 1. The proof of Theorem 3 is given in Sect. 7.

1.1. Physical background of the equation. Equation (1.1) has several physical origins. In the context of nematic liquid crystals, it arises as follows.
The mean orientation of the long molecules in a nematic liquid crystal is described by a director field of unit vectors, $n \in S^2$, the unit sphere. Associated with the director field $n$, there is the well-known Oseen-Franck potential energy density $W$ given by

\[ W(n, \nabla n) = \alpha\,|n \times (\nabla \times n)|^2 + \beta\,(\nabla \cdot n)^2 + \gamma\,(n \cdot \nabla \times n)^2. \tag{1.8} \]

The positive constants $\alpha$, $\beta$, and $\gamma$ are elastic constants of the liquid crystal. For the special case $\alpha = \beta = \gamma$, the potential energy density reduces to $W(n, \nabla n) = \alpha\,|\nabla n|^2$, which is the potential energy density used in harmonic maps into the sphere $S^2$. There are many studies on the constrained elliptic system of equations for $n$ derived through variational principles from the potential (1.8), and on the parabolic flow associated with it, see [3, 9, 10, 16, 22, 36] and references therein. In the regime in which inertia effects dominate viscosity, however, the propagation of the orientation waves in the director field may then be modeled by the least action principle (Saxton [29])

\[ \frac{\delta}{\delta u}\iint \bigl\{\partial_t n \cdot \partial_t n - W(n, \nabla n)\bigr\}\,dx\,dt = 0, \qquad n \cdot n = 1. \tag{1.9} \]

In the special case $\alpha = \beta = \gamma$, this variational principle (1.9) yields the equation for harmonic wave maps from (1+3)-dimensional Minkowski space into the two sphere, see [8, 31, 32] for example. For planar deformations depending on a single space variable $x$, the director field has the special form

\[ n = \cos u(x,t)\,e_x + \sin u(x,t)\,e_y, \]

where the dependent variable $u \in \mathbb{R}^1$ measures the angle of the director field to the $x$-direction, and $e_x$ and $e_y$ are the coordinate vectors in the $x$ and $y$ directions, respectively. In this case, the variational principle (1.9) reduces to (1.1) with the wave speed $c$ given specifically by

\[ c^2(u) = \alpha\cos^2 u + \beta\sin^2 u. \tag{1.10} \]
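As an illustration of (1.10), the sketch below evaluates the wave speed and its derivative for given elastic constants. The function names and the sample values of `alpha` and `beta` are assumptions made here for illustration. It shows that $c$ stays between $\sqrt{\min(\alpha,\beta)}$ and $\sqrt{\max(\alpha,\beta)}$, and that $c'(u)$ changes sign whenever $\alpha \neq \beta$.

```python
import math


def wave_speed(u, alpha, beta):
    """c(u) from c^2(u) = alpha*cos^2(u) + beta*sin^2(u), with alpha, beta > 0."""
    return math.sqrt(alpha * math.cos(u) ** 2 + beta * math.sin(u) ** 2)


def wave_speed_derivative(u, alpha, beta):
    """c'(u), obtained by differentiating c^2: 2 c c' = 2 (beta - alpha) sin u cos u."""
    return (beta - alpha) * math.sin(u) * math.cos(u) / wave_speed(u, alpha, beta)


if __name__ == "__main__":
    alpha, beta = 1.0, 4.0  # hypothetical elastic constants
    for u in (0.0, 0.7, math.pi / 2, 2.0):
        print(u, wave_speed(u, alpha, beta), wave_speed_derivative(u, alpha, beta))
```

With these sample constants the speed ranges over [1, 2], so the hypothesis $c : \mathbb{R} \to [\kappa^{-1}, \kappa]$ of Theorem 1 holds with $\kappa = 2$.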

Equation (1.1) has interesting connections with long waves on a dipole chain in the continuum limit ([13], Zorski and Infeld [45], and Grundland and Infeld [14]), and in classical field theories and general relativity ([13]). We refer the interested reader to the article [13] for these connections. Equation (1.1) compares interestingly with other well-known equations, e.g.

\[ \partial_t^2 u - \partial_x\bigl[p(\partial_x u)\bigr] = 0, \tag{1.11} \]

where $p(\cdot)$ is a given function, considered by Lax [25], Klainerman and Majda [24], and Liu [28]. A second related equation is

\[ \partial_t^2 u - c^2(u)\,\Delta u = 0 \tag{1.12} \]

considered by Lindblad [27], who established the global existence of smooth solutions of (1.12) with smooth, small, and spherically symmetric initial data in $\mathbb{R}^3$, where the large-time decay of solutions in high space dimensions is crucial. The multi-dimensional generalization of Eq. (1.1),

\[ \partial_t^2 u - c(u)\,\nabla\cdot\bigl(c(u)\nabla u\bigr) = 0, \tag{1.13} \]

contains a lower order term proportional to $c\,c'\,|\nabla u|^2$, which (1.12) lacks. This lower order term is responsible for the blow-up in the derivatives of $u$. Finally, we note that Eq. (1.1) also looks related to the perturbed wave equation

\[ \partial_t^2 u - \Delta u + f(u, \nabla u, \nabla\nabla u) = 0, \tag{1.14} \]

where $f(u, \nabla u, \nabla\nabla u)$ satisfies an appropriate convexity condition (for example, $f = u^p$ or $f = a(\partial_t u)^2 + b|\nabla u|^2$) or some nullity condition. Blow-up for (1.14) with a convexity condition has been studied extensively, see [2, 11, 15, 20, 21, 26, 30, 33, 34] and Strauss [35] for more references. Global existence and uniqueness of solutions to (1.14) with a nullity condition depend on the nullity structure and large time decay of solutions of the linear wave equation in higher dimensions (see Klainerman and Machedon [23] and references therein). Therefore (1.1) with the dependence of $c(u)$ on $u$ and the possibility of sign changes in $c'(u)$ is familiar yet truly different.

Equation (1.1) has interesting asymptotic uni-directional wave equations. Hunter and Saxton ([17]) derived the asymptotic equations

\[ (u_t + u^n u_x)_x = \frac{1}{2}\,n\,u^{n-1}(u_x)^2 \tag{1.15} \]

for (1.1) via weakly nonlinear geometric optics. We mention that the $x$-derivative of Eq. (1.15) appears in the high-frequency limit of the variational principle for the Camassa-Holm equation ([1, 6, 7]), which arises in the theory of shallow water waves. A construction of global solutions to the Camassa-Holm equations, based on a similar variable transformation as in the present paper, will appear in [4].

1.2. Known results. In [18], Hunter and Zheng established the global existence of weak solutions to (1.15) ($n = 1$) with initial data of bounded variation. It has also been shown that the dissipative solutions are limits of vanishing viscosity. Equation (1.15) ($n = 1$) is also shown to be completely integrable ([19]). In [37–44], Ping Zhang and Zheng study the global existence, uniqueness, and regularity of the weak solutions to (1.15) ($n = 1, 2$) with $L^2$ initial data, and special cases of (1.1). The study of the asymptotic equation has been very beneficial for both the blow-up result [12] and the current global existence result for the wave equation (1.1).

2. Variable Transformations

We start by deriving some identities valid for smooth solutions. Consider the variables

\[ R \doteq u_t + c(u)u_x, \qquad S \doteq u_t - c(u)u_x, \tag{2.1} \]

so that

\[ u_t = \frac{R+S}{2}, \qquad u_x = \frac{R-S}{2c}. \tag{2.2} \]

By (1.1), the variables $R$, $S$ satisfy

\[ R_t - cR_x = \frac{c'}{4c}(R^2 - S^2), \qquad S_t + cS_x = \frac{c'}{4c}(S^2 - R^2). \tag{2.3} \]

Multiplying the first equation in (2.3) by $R$ and the second one by $S$, we obtain balance laws for $R^2$ and $S^2$, namely

\[ (R^2)_t - (cR^2)_x = \frac{c'}{2c}(R^2S - RS^2), \qquad (S^2)_t + (cS^2)_x = -\frac{c'}{2c}(R^2S - RS^2). \tag{2.4} \]

As a consequence, the following quantities are conserved:

\[ E \doteq \frac{1}{2}\bigl(u_t^2 + c^2u_x^2\bigr) = \frac{R^2 + S^2}{4}, \qquad M \doteq -u_tu_x = \frac{S^2 - R^2}{4c}. \tag{2.5} \]

Indeed we have

\[ E_t + (c^2M)_x = 0, \qquad M_t + E_x = 0. \tag{2.6} \]
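The algebraic identities in (2.5) follow directly from the definitions of $R$ and $S$ in (2.1). A minimal numerical sketch, with function names and sample values chosen here purely for illustration:

```python
def riemann_variables(u_t, u_x, c):
    """R = u_t + c u_x and S = u_t - c u_x, as in (2.1)."""
    return u_t + c * u_x, u_t - c * u_x


def energy_and_momentum(R, S, c):
    """Energy density E = (R^2 + S^2)/4 and momentum density M = (S^2 - R^2)/(4c)."""
    return (R * R + S * S) / 4.0, (S * S - R * R) / (4.0 * c)


if __name__ == "__main__":
    u_t, u_x, c = 0.3, -1.2, 1.7  # arbitrary sample values
    R, S = riemann_variables(u_t, u_x, c)
    E, M = energy_and_momentum(R, S, c)
    # Compare with the expressions in the original variables.
    print(E, 0.5 * (u_t ** 2 + c ** 2 * u_x ** 2))
    print(M, -u_t * u_x)
```

The two printed pairs agree up to rounding, since $R^2 + S^2 = 2u_t^2 + 2c^2u_x^2$ and $S^2 - R^2 = -4c\,u_tu_x$.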

One can think of $R^2/4$ as the energy density of backward moving waves, and $S^2/4$ as the energy density of forward moving waves. We observe that, if $R$, $S$ satisfy (2.3) and $u$ satisfies (2.1b), then the quantity

\[ F \doteq R - S - 2cu_x \tag{2.7} \]

provides solutions to the linear homogeneous equation

\[ F_t - cF_x = \frac{c'}{2c}(R + S + 2cu_x)F. \tag{2.8} \]

In particular, if $F \equiv 0$ at time $t = 0$, the same holds for all $t > 0$. Similarly, if $R$, $S$ satisfy (2.3) and $u$ satisfies (2.1a), then the quantity

\[ G \doteq R + S - 2u_t \]

provides solutions to the linear homogeneous equation

\[ G_t + cG_x = \frac{c'}{2c}(R + S - 2cu_x)G. \]

In particular, if $G \equiv 0$ at time $t = 0$, the same holds for all $t > 0$. We thus have

Proposition 1. Any smooth solution of (1.1) provides a solution to (2.1)–(2.3). Conversely, any smooth solution of (2.1b) and (2.3) (or (2.1a) and (2.3)) which satisfies (2.2b) (or (2.2a)) at time $t = 0$ provides a solution to (1.1).

The main difficulty in the analysis of (1.1) is the possible breakdown of regularity of solutions. Indeed, even for smooth initial data, the quantities $u_x$, $u_t$ can blow up in finite time. This is clear from Eqs. (2.3), where the right hand side grows quadratically; see ([12]) for handling change of signs of $c'$ and interaction between $R$ and $S$. To deal with possibly unbounded values of $R$, $S$, it is convenient to introduce a new set of dependent variables:

\[ w \doteq 2\arctan R, \qquad z \doteq 2\arctan S, \]

so that

\[ R = \tan\frac{w}{2}, \qquad S = \tan\frac{z}{2}. \tag{2.9} \]

Using (2.3), we obtain the equations

\[ w_t - c\,w_x = \frac{2}{1+R^2}(R_t - cR_x) = \frac{c'}{2c}\,\frac{R^2 - S^2}{1+R^2}, \tag{2.10} \]

\[ z_t + c\,z_x = \frac{2}{1+S^2}(S_t + cS_x) = \frac{c'}{2c}\,\frac{S^2 - R^2}{1+S^2}. \tag{2.11} \]
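The point of the substitution (2.9) is that arbitrarily large values of $R$ and $S$ are mapped into a bounded range of angles. A small sketch of the map and its inverse, with helper names that are assumptions made here:

```python
import math


def to_angle(R):
    """w = 2 arctan R; maps the real line into the open interval (-pi, pi)."""
    return 2.0 * math.atan(R)


def from_angle(w):
    """Inverse map R = tan(w/2), valid for w strictly between -pi and pi."""
    return math.tan(0.5 * w)


if __name__ == "__main__":
    for R in (-5.0, 0.0, 3.0, 1.0e6):
        w = to_angle(R)
        # 1/(1+R^2) = cos^2(w/2) is the factor appearing in (2.10).
        print(R, w, 1.0 / (1.0 + R * R), math.cos(0.5 * w) ** 2)
```

A blow-up $R \to \pm\infty$ corresponds to $w \to \pm\pi$, where the factor $1/(1+R^2) = \cos^2(w/2)$ simply vanishes.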

To reduce the equation to a semilinear one, it is convenient to perform a further change of independent variables (Fig. 1). Consider the equations for the forward and backward characteristics:

\[ \dot{x}^+ = c(u), \qquad \dot{x}^- = -c(u). \tag{2.12} \]

The characteristics passing through the point $(t, x)$ will be denoted by

\[ s \mapsto x^+(s, t, x), \qquad s \mapsto x^-(s, t, x), \]

respectively. As coordinates $(X, Y)$ of a point $(t, x)$ we shall use the quantities

\[ X \doteq \int_0^{x^-(0,t,x)} \bigl(1 + R^2(0,x)\bigr)\,dx, \qquad Y \doteq \int_{x^+(0,t,x)}^0 \bigl(1 + S^2(0,x)\bigr)\,dx. \tag{2.13} \]

Of course this implies

\[ X_t - c(u)X_x = 0, \qquad Y_t + c(u)Y_x = 0, \tag{2.14} \]

Fig. 1. Characteristic curves

\[ (X_x)_t - (c\,X_x)_x = 0, \qquad (Y_x)_t + (c\,Y_x)_x = 0. \tag{2.15} \]

Notice that

\[ X_x(t,x) = \lim_{h\to 0}\frac{1}{h}\int_{x^-(0,t,x)}^{x^-(0,t,x+h)} \bigl(1 + R^2(0,x)\bigr)\,dx, \qquad Y_x(t,x) = \lim_{h\to 0}\frac{1}{h}\int_{x^+(0,t,x+h)}^{x^+(0,t,x)} \bigl(1 + S^2(0,x)\bigr)\,dx. \]

For any smooth function $f$, using (2.14) one finds

\[ \begin{aligned} f_t + c f_x &= f_X X_t + f_Y Y_t + c f_X X_x + c f_Y Y_x = (X_t + cX_x) f_X = 2cX_x f_X, \\ f_t - c f_x &= f_X X_t + f_Y Y_t - c f_X X_x - c f_Y Y_x = (Y_t - cY_x) f_Y = -2cY_x f_Y. \end{aligned} \tag{2.16} \]

We now introduce the further variables

\[ p \doteq \frac{1+R^2}{X_x}, \qquad q \doteq \frac{1+S^2}{-Y_x}. \tag{2.17} \]

Notice that the above definitions imply

\[ (X_x)^{-1} = \frac{p}{1+R^2} = p\cos^2\frac{w}{2}, \qquad (-Y_x)^{-1} = \frac{q}{1+S^2} = q\cos^2\frac{z}{2}. \tag{2.18} \]

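From (2.10)–(2.11), using (2.16)–(2.18), we obtain

\[ 2c\,\frac{1+S^2}{q}\,w_Y = \frac{c'}{2c}\,\frac{R^2 - S^2}{1+R^2}, \qquad 2c\,\frac{1+R^2}{p}\,z_X = \frac{c'}{2c}\,\frac{S^2 - R^2}{1+S^2}. \]

Therefore

\[ \begin{aligned} w_Y &= \frac{c'}{4c^2}\,\frac{R^2 - S^2}{1+R^2}\,\frac{q}{1+S^2} = \frac{c'}{4c^2}\Bigl(\sin^2\frac{w}{2}\cos^2\frac{z}{2} - \sin^2\frac{z}{2}\cos^2\frac{w}{2}\Bigr)\,q, \\ z_X &= \frac{c'}{4c^2}\,\frac{S^2 - R^2}{1+S^2}\,\frac{p}{1+R^2} = \frac{c'}{4c^2}\Bigl(\sin^2\frac{z}{2}\cos^2\frac{w}{2} - \sin^2\frac{w}{2}\cos^2\frac{z}{2}\Bigr)\,p. \end{aligned} \tag{2.19} \]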

Using trigonometric formulas, the above expressions can be further simplified as

\[ w_Y = \frac{c'}{8c^2}(\cos z - \cos w)\,q, \qquad z_X = \frac{c'}{8c^2}(\cos w - \cos z)\,p. \]
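The simplification rests on two half-angle identities linking the rational expressions in $R$, $S$ to the angles $w = 2\arctan R$, $z = 2\arctan S$. The sketch below checks them numerically. The function name is an assumption made here, and this is only a spot check at sample points, not a proof.

```python
import math


def identity_residuals(R, S):
    """Residuals of the two half-angle identities, with w = 2 atan R, z = 2 atan S.

    (R^2 - S^2) / ((1+R^2)(1+S^2)) = (cos z - cos w) / 2
    S/(1+S^2) - R/(1+R^2)          = (sin z - sin w) / 2
    """
    w, z = 2.0 * math.atan(R), 2.0 * math.atan(S)
    first = (R * R - S * S) / ((1 + R * R) * (1 + S * S)) - 0.5 * (math.cos(z) - math.cos(w))
    second = S / (1 + S * S) - R / (1 + R * R) - 0.5 * (math.sin(z) - math.sin(w))
    return first, second


if __name__ == "__main__":
    for R, S in [(0.5, -2.0), (3.0, 3.0), (-7.0, 0.1), (1.0e6, 2.0)]:
        print(R, S, identity_residuals(R, S))
```

The first identity turns the factor $\frac{c'}{4c^2}$ in (2.19) into $\frac{c'}{8c^2}$; the second plays the same role for the equations satisfied by $p$ and $q$.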

Concerning the quantities $p$, $q$, we observe that

\[ c_x = c'\,u_x = c'\,\frac{R - S}{2c}. \tag{2.20} \]

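Using again (2.18) and (2.15) we compute

\[ \begin{aligned} p_t - c\,p_x &= (X_x)^{-1}\,2R(R_t - cR_x) - (X_x)^{-2}\bigl[(X_x)_t - c(X_x)_x\bigr](1+R^2) \\ &= (X_x)^{-1}\,2R\,\frac{c'}{4c}(R^2 - S^2) - (X_x)^{-2}\bigl[c_x X_x\bigr](1+R^2) \\ &= \frac{p}{1+R^2}\,2R(R^2 - S^2)\,\frac{c'}{4c} - \frac{p}{1+R^2}\,\frac{c'}{2c}(R - S)(1+R^2) \\ &= \frac{c'}{2c}\,\frac{p}{1+R^2}\bigl[S(1+R^2) - R(1+S^2)\bigr], \end{aligned} \]

\[ \begin{aligned} q_t + c\,q_x &= (-Y_x)^{-1}\,2S(S_t + cS_x) - (-Y_x)^{-2}\bigl[(-Y_x)_t + c(-Y_x)_x\bigr](1+S^2) \\ &= (-Y_x)^{-1}\,2S\,\frac{c'}{4c}(S^2 - R^2) - (-Y_x)^{-2}\bigl[c_x (Y_x)\bigr](1+S^2) \\ &= \frac{q}{1+S^2}\,2S(S^2 - R^2)\,\frac{c'}{4c} - \frac{q}{1+S^2}\,\frac{c'}{2c}(S - R)(1+S^2) \\ &= \frac{c'}{2c}\,\frac{q}{1+S^2}\bigl[R(1+S^2) - S(1+R^2)\bigr]. \end{aligned} \]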

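In turn, this yields

\[ \begin{aligned} p_Y &= (p_t - c\,p_x)\,\frac{1}{2c\,(-Y_x)} = (p_t - c\,p_x)\,\frac{1}{2c}\,\frac{q}{1+S^2} \\ &= \frac{c'}{4c^2}\,\frac{S(1+R^2) - R(1+S^2)}{(1+R^2)(1+S^2)}\,pq = \frac{c'}{4c^2}\Bigl[\frac{S}{1+S^2} - \frac{R}{1+R^2}\Bigr]pq \\ &= \frac{c'}{4c^2}\Bigl[\tan\frac{z}{2}\cos^2\frac{z}{2} - \tan\frac{w}{2}\cos^2\frac{w}{2}\Bigr]pq = \frac{c'}{4c^2}\,\frac{\sin z - \sin w}{2}\,pq, \end{aligned} \tag{2.21} \]

\[ \begin{aligned} q_X &= (q_t + c\,q_x)\,\frac{1}{2c\,(X_x)} = (q_t + c\,q_x)\,\frac{1}{2c}\,\frac{p}{1+R^2} \\ &= \frac{c'}{4c^2}\,\frac{R(1+S^2) - S(1+R^2)}{(1+S^2)(1+R^2)}\,pq = \frac{c'}{4c^2}\Bigl[\frac{R}{1+R^2} - \frac{S}{1+S^2}\Bigr]pq \\ &= \frac{c'}{4c^2}\Bigl[\tan\frac{w}{2}\cos^2\frac{w}{2} - \tan\frac{z}{2}\cos^2\frac{z}{2}\Bigr]pq = \frac{c'}{4c^2}\,\frac{\sin w - \sin z}{2}\,pq. \end{aligned} \tag{2.22} \]

Finally, by (2.16) we have

\[ \begin{aligned} u_X &= (u_t + cu_x)\,\frac{1}{2c}\,\frac{p}{1+R^2} = \frac{1}{2c}\,\frac{R}{1+R^2}\,p = \frac{1}{2c}\tan\frac{w}{2}\cos^2\frac{w}{2}\,p, \\ u_Y &= (u_t - cu_x)\,\frac{1}{2c}\,\frac{q}{1+S^2} = \frac{1}{2c}\,\frac{S}{1+S^2}\,q = \frac{1}{2c}\tan\frac{z}{2}\cos^2\frac{z}{2}\,q. \end{aligned} \tag{2.23} \]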

Starting with the nonlinear equation (1.1), using $X$, $Y$ as independent variables we thus obtain a semilinear hyperbolic system with smooth coefficients for the variables $u$, $w$, $z$, $p$, $q$. Using some trigonometric identities, the set of Eqs. (2.19), (2.21)–(2.22) and (2.23) can be rewritten as

\[ w_Y = \frac{c'}{8c^2}(\cos z - \cos w)\,q, \qquad z_X = \frac{c'}{8c^2}(\cos w - \cos z)\,p, \tag{2.24} \]

\[ p_Y = \frac{c'}{8c^2}\bigl[\sin z - \sin w\bigr]\,pq, \qquad q_X = \frac{c'}{8c^2}\bigl[\sin w - \sin z\bigr]\,pq, \tag{2.25} \]

\[ u_X = \frac{\sin w}{4c}\,p, \qquad u_Y = \frac{\sin z}{4c}\,q. \tag{2.26} \]

Remark 1. The function $u$ can be determined by using either one of the equations in (2.26). One can easily check that the two equations are compatible, namely

\[ \begin{aligned} u_{XY} &= -\frac{\sin w}{4c^2}\,c'\,u_Y\,p + \frac{\cos w}{4c}\,w_Y\,p + \frac{\sin w}{4c}\,p_Y \\ &= \frac{c'}{32\,c^3}\bigl[-2\sin w\sin z + \cos w\cos z - \cos^2 w - \sin^2 w + \sin w\sin z\bigr]\,pq \\ &= \frac{c'}{32\,c^3}\bigl[\cos(w + z) - 1\bigr]\,pq \\ &= u_{YX}. \end{aligned} \tag{2.27} \]
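The compatibility claimed in Remark 1 can be spot-checked numerically: compute $u_{XY}$ by differentiating the first equation in (2.26) with respect to $Y$, compute $u_{YX}$ by differentiating the second with respect to $X$, and substitute (2.24)–(2.25) in both. The sketch below does this at sample points. The function name and argument names are assumptions made here, with `dc` standing for $c'(u)$.

```python
import math


def mixed_partials(w, z, p, q, c, dc):
    """Return (u_XY, u_YX) computed from the semilinear system (2.24)-(2.26).

    c is the wave speed c(u) > 0 and dc its derivative c'(u) at the same point.
    """
    k = dc / (8.0 * c * c)
    w_Y = k * (math.cos(z) - math.cos(w)) * q
    z_X = k * (math.cos(w) - math.cos(z)) * p
    p_Y = k * (math.sin(z) - math.sin(w)) * p * q
    q_X = k * (math.sin(w) - math.sin(z)) * p * q
    u_X = math.sin(w) * p / (4.0 * c)
    u_Y = math.sin(z) * q / (4.0 * c)
    # Chain rule: d/dY (1/c(u)) = -c'(u) u_Y / c^2, and similarly in X.
    u_XY = (-dc * u_Y * math.sin(w) * p / (4.0 * c * c)
            + math.cos(w) * w_Y * p / (4.0 * c)
            + math.sin(w) * p_Y / (4.0 * c))
    u_YX = (-dc * u_X * math.sin(z) * q / (4.0 * c * c)
            + math.cos(z) * z_X * q / (4.0 * c)
            + math.sin(z) * q_X / (4.0 * c))
    return u_XY, u_YX


if __name__ == "__main__":
    print(mixed_partials(0.7, -1.3, 1.5, 0.8, 1.2, 0.4))
    print(mixed_partials(2.9, 0.2, 0.3, 2.0, 0.9, -1.1))
```

Both components agree up to rounding at every sample point, as they must, since the common value is symmetric under exchanging the roles of $(w, p)$ and $(z, q)$.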

Remark 2. We observe that the new system is invariant under translation by $2\pi$ in $w$ and $z$. Actually, it would be more precise to work with the variables $w^\dagger \doteq e^{iw}$ and $z^\dagger \doteq e^{iz}$. However, for simplicity we shall use the variables $w$, $z$, keeping in mind that they range on the unit circle $[-\pi, \pi]$ with endpoints identified.

The system (2.24)–(2.26) must now be supplemented by non-characteristic boundary conditions, corresponding to (1.2). For this purpose, we observe that $u_0$, $u_1$ determine the initial values of the functions $R$, $S$ at time $t = 0$. The line $t = 0$ corresponds to a curve $\gamma$ in the $(X, Y)$ plane, say

\[ Y = \varphi(X), \qquad X \in \mathbb{R}, \]

where $Y \doteq \varphi(X)$ if and only if

\[ X = \int_0^x \bigl(1 + R^2(0,x)\bigr)\,dx, \qquad Y = -\int_0^x \bigl(1 + S^2(0,x)\bigr)\,dx \qquad \text{for some } x \in \mathbb{R}. \]

We can use the variable $x$ as a parameter along the curve $\gamma$. The assumptions $u_0 \in H^1$, $u_1 \in L^2$ imply $R, S \in L^2$; to fix the ideas, let

\[ E_0 \doteq \frac{1}{4}\int \bigl[R^2(0,x) + S^2(0,x)\bigr]\,dx < \infty. \tag{2.28} \]

The two functions

\[ X(x) \doteq \int_0^x \bigl(1 + R^2(0,x)\bigr)\,dx, \qquad Y(x) \doteq \int_x^0 \bigl(1 + S^2(0,x)\bigr)\,dx \]

are well defined and absolutely continuous. Clearly, $X$ is strictly increasing while $Y$ is strictly decreasing. Therefore, the map $X \mapsto \varphi(X)$ is continuous and strictly decreasing. From (2.28) it follows that

\[ |X + \varphi(X)| \le 4E_0. \tag{2.29} \]

As $(t, x)$ ranges over the domain $[0, \infty[\,\times\,\mathbb{R}$, the corresponding variables $(X, Y)$ range over the set

\[ \Omega^+ \doteq \{(X, Y);\ Y \ge \varphi(X)\}. \tag{2.30} \]

Along the curve

\[ \gamma \doteq \{(X, Y);\ Y = \varphi(X)\} \subset \mathbb{R}^2 \]

parametrized by $x \mapsto (X(x), Y(x))$, we can thus assign the boundary data $(\bar w, \bar z, \bar p, \bar q, \bar u) \in L^\infty$ defined by

\[ \bar w = 2\arctan R(0,x), \quad \bar z = 2\arctan S(0,x), \quad \bar p \equiv 1, \quad \bar q \equiv 1, \quad \bar u = u_0(x). \tag{2.31} \]

We observe that the identity

\[ F = \tan\frac{\bar w}{2} - \tan\frac{\bar z}{2} - 2c(\bar u)\,\bar u_x = 0 \tag{2.32} \]

is identically satisfied along $\gamma$. A similar identity holds for $G$.
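To make the boundary curve $\gamma$ concrete, the following sketch tabulates $X(x)$ and $Y(x)$ on a grid by the trapezoidal rule, for sample initial profiles, and compares $|X + Y|$ with $4E_0$ as in (2.29). The profiles `R0`, `S0`, the truncation to a finite interval, and the grid parameters are all assumptions made here for illustration.

```python
import math


def boundary_curve(R0, S0, half_width=10.0, n=2000):
    """Tabulate X(x), Y(x) on [-half_width, half_width] and the energy E0.

    X(x) is the integral of 1 + R0^2 from 0 to x, Y(x) the integral of
    1 + S0^2 from x to 0, and E0 = (1/4) * integral of (R0^2 + S0^2),
    all approximated by the trapezoidal rule on a uniform grid (n even).
    """
    h = 2.0 * half_width / n
    xs = [-half_width + i * h for i in range(n + 1)]
    mid = n // 2  # grid index of x = 0
    fX = [1.0 + R0(x) ** 2 for x in xs]
    fY = [1.0 + S0(x) ** 2 for x in xs]
    X = [0.0] * (n + 1)
    Y = [0.0] * (n + 1)
    for i in range(mid, n):        # march to the right of x = 0
        X[i + 1] = X[i] + 0.5 * h * (fX[i] + fX[i + 1])
        Y[i + 1] = Y[i] - 0.5 * h * (fY[i] + fY[i + 1])
    for i in range(mid, 0, -1):    # march to the left of x = 0
        X[i - 1] = X[i] - 0.5 * h * (fX[i] + fX[i - 1])
        Y[i - 1] = Y[i] + 0.5 * h * (fY[i] + fY[i - 1])
    dens = [R0(x) ** 2 + S0(x) ** 2 for x in xs]
    E0 = 0.25 * sum(0.5 * h * (dens[i] + dens[i + 1]) for i in range(n))
    return X, Y, E0


if __name__ == "__main__":
    R0 = lambda x: math.exp(-x * x)
    S0 = lambda x: x * math.exp(-0.5 * x * x)
    X, Y, E0 = boundary_curve(R0, S0)
    print(max(abs(a + b) for a, b in zip(X, Y)), 4.0 * E0)
```

The bound holds because $X(x) + Y(x) = \int_0^x (R^2 - S^2)\,dx$, whose absolute value is at most $\int (R^2 + S^2)\,dx = 4E_0$.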

3. Construction of Integral Solutions

The aim of this section is to prove a global existence theorem for the system (2.24)–(2.26), describing the nonlinear wave equation in our transformed variables.

Theorem 4. Let the assumptions in Theorem 1 hold. Then the corresponding problem (2.24)–(2.26) with boundary data (2.31) has a unique solution, defined for all $(X, Y) \in \mathbb{R}^2$.

In the following, we shall construct the solution on the domain $\Omega^+$ where $Y \ge \varphi(X)$. On the complementary set $\Omega^-$ where $Y < \varphi(X)$, the solution can be constructed in an entirely similar way. Observing that all Eqs. (2.24)–(2.26) have a locally Lipschitz continuous right hand side, the construction of a local solution as fixed point of a suitable integral transformation is straightforward. To make sure that this solution is actually defined on the whole domain $\Omega^+$, one must establish a priori bounds, showing that $p$, $q$ remain bounded on bounded sets. This is not immediately obvious from Eqs. (2.25), because the right hand sides have quadratic growth. The basic estimate can be derived as follows. Assume

\[ C_0 \doteq \sup_{u\in\mathbb{R}} \left|\frac{c'(u)}{4c^2(u)}\right| < \infty. \tag{3.1} \]

From (2.25) we obtain the identity

\[ q_X + p_Y = 0. \]

In turn, this implies that the differential form $p\,dX - q\,dY$ has zero integral along every closed curve contained in $\Omega^+$. In particular, for every $(X, Y) \in \Omega^+$, consider the closed curve $\Sigma$ (see Fig. 2) consisting of:

Fig. 2. The closed curve

– the vertical segment joining $(X, \varphi(X))$ with $(X, Y)$,
– the horizontal segment joining $(X, Y)$ with $(\varphi^{-1}(Y), Y)$,
– the portion of boundary $\gamma = \{Y = \varphi(X)\}$ joining $(\varphi^{-1}(Y), Y)$ with $(X, \varphi(X))$.

Integrating along $\Sigma$, recalling that $p = q = 1$ along $\gamma$ and then using (2.29), we obtain

\[ \int_{\varphi^{-1}(Y)}^{X} p(X', Y)\,dX' + \int_{\varphi(X)}^{Y} q(X, Y')\,dY' = X - \varphi^{-1}(Y) + Y - \varphi(X) \le 2\bigl(|X| + |Y| + 4E_0\bigr). \tag{3.2} \]

Using (3.1)–(3.2) in (2.25), since $p, q > 0$ we obtain the a priori bounds

\[ \begin{aligned} p(X, Y) &= \exp\left\{\int_{\varphi(X)}^{Y} \frac{c'(u)}{8c^2(u)}\bigl[\sin z - \sin w\bigr]\,q(X, Y')\,dY'\right\} \\ &\le \exp\left\{C_0\int_{\varphi(X)}^{Y} q(X, Y')\,dY'\right\} \\ &\le \exp\bigl\{2C_0\bigl(|X| + |Y| + 4E_0\bigr)\bigr\}. \end{aligned} \tag{3.3} \]

Similarly,

\[ q(X, Y) \le \exp\bigl\{2C_0\bigl(|X| + |Y| + 4E_0\bigr)\bigr\}. \tag{3.4} \]

Relying on (3.3)–(3.4), we now show that, on bounded sets in the $X$-$Y$ plane, the solution of (2.24)–(2.26) with boundary conditions (2.31) can be obtained as the fixed point of a contractive transformation. For any given $r > 0$, consider the bounded domain

\[ \Omega_r \doteq \{(X, Y);\ Y \ge \varphi(X),\ X \le r,\ Y \le r\}. \]

Introduce the space of functions

\[ \Lambda_r \doteq \Bigl\{ f : \Omega_r \to \mathbb{R};\ \|f\|_* \doteq \operatorname*{ess\,sup}_{(X,Y)\in\Omega_r} e^{-\kappa(X+Y)}\,|f(X, Y)| < \infty \Bigr\}, \]

where $\kappa$ is a suitably large constant, to be determined later. For $w, z, p, q, u \in \Lambda_r$, consider the transformation $T(w, z, p, q, u) = (\tilde w, \tilde z, \tilde p, \tilde q, \tilde u)$ defined by

\[ \begin{aligned} \tilde w(X, Y) &= \bar w(X, \varphi(X)) + \int_{\varphi(X)}^{Y} \frac{c'(u)}{8c^2(u)}(\cos z - \cos w)\,q\,dY, \\ \tilde z(X, Y) &= \bar z(\varphi^{-1}(Y), Y) + \int_{\varphi^{-1}(Y)}^{X} \frac{c'(u)}{8c^2(u)}(\cos w - \cos z)\,p\,dX, \end{aligned} \tag{3.5} \]

\[ \begin{aligned} \tilde p(X, Y) &= 1 + \int_{\varphi(X)}^{Y} \frac{c'(u)}{8c^2(u)}\bigl[\sin z - \sin w\bigr]\,\hat p\,\hat q\,dY, \\ \tilde q(X, Y) &= 1 + \int_{\varphi^{-1}(Y)}^{X} \frac{c'(u)}{8c^2(u)}\bigl[\sin w - \sin z\bigr]\,\hat p\,\hat q\,dX, \end{aligned} \tag{3.6} \]

\[ \tilde u(X, Y) = \bar u(X, \varphi(X)) + \int_{\varphi(X)}^{Y} \frac{\sin z}{4c}\,q\,dY. \tag{3.7} \]

In (3.6), the quantities $\hat p$, $\hat q$ are defined as

\[ \hat p \doteq \min\bigl\{p,\ 2e^{2C_0(|X|+|Y|+4E_0)}\bigr\}, \qquad \hat q \doteq \min\bigl\{q,\ 2e^{2C_0(|X|+|Y|+4E_0)}\bigr\}. \tag{3.8} \]

Notice that $\hat p = p$, $\hat q = q$ as long as the a priori estimates (3.3)–(3.4) are satisfied. Moreover, if in Eqs. (2.24)–(2.26) the variables $p$, $q$ are replaced with $\hat p$, $\hat q$, then the right hand sides become uniformly Lipschitz continuous on bounded sets in the $X$-$Y$ plane. A straightforward computation now shows that the map $T$ is a strict contraction on the space $\Lambda_r$, provided that the constant $\kappa$ is chosen sufficiently big (depending on the function $c$ and on $r$). Obviously, if $r' > r$, then the solution of (3.5)–(3.7) on $\Omega_{r'}$ also provides the solution to the same equations on $\Omega_r$, when restricted to this smaller domain. Letting $r \to \infty$, in the limit we thus obtain a unique solution $(w, z, p, q, u)$ of (3.5)–(3.7), defined on the whole domain $\Omega^+$.

To prove that these functions satisfy Eqs. (2.24)–(2.26), we claim that $\hat p = p$, $\hat q = q$ at every point $(X, Y) \in \Omega^+$. The proof is by contradiction. If our claim does not hold, since the maps $Y \mapsto p(X, Y)$, $X \mapsto q(X, Y)$ are continuous, we can find some point $(X^*, Y^*) \in \Omega^+$ such that

\[ p(X, Y) \le 2e^{2C_0(|X|+|Y|+4E_0)}, \qquad q(X, Y) \le 2e^{2C_0(|X|+|Y|+4E_0)} \tag{3.9} \]

for all $(X, Y) \in \Omega^* \doteq \Omega^+ \cap \{X \le X^*,\ Y \le Y^*\}$, but either $p(X^*, Y^*) \ge \frac{3}{2}e^{2C_0(|X^*|+|Y^*|+4E_0)}$ or $q(X^*, Y^*) \ge \frac{3}{2}e^{2C_0(|X^*|+|Y^*|+4E_0)}$. By (3.9), we still have $\hat p = p$, $\hat q = q$ restricted to $\Omega^*$, hence the Eqs. (2.24)–(2.26) and the a priori bounds (3.3)–(3.4) remain valid. In particular, these imply

\[ p(X^*, Y^*) \le e^{2C_0(|X^*|+|Y^*|+4E_0)}, \qquad q(X^*, Y^*) \le e^{2C_0(|X^*|+|Y^*|+4E_0)}, \]

reaching a contradiction.
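The idea of obtaining a solution as the fixed point of an integral transformation can be illustrated on a much simpler problem. The sketch below is only a toy analog and not the transformation $T$ of the text: it iterates the Picard map $y \mapsto 1 + \int_0^t L\,y\,ds$ for the scalar linear equation $y' = Ly$, $y(0) = 1$, on a uniform grid with trapezoidal quadrature. The function name, the rate `rate` (playing the role of a Lipschitz constant), and the grid sizes are assumptions made here.

```python
import math


def picard_solve(rate, t_end, steps=1000, iterations=40):
    """Iterate y <- 1 + integral_0^t rate * y ds on a uniform grid.

    The integral is a cumulative trapezoidal sum, so the iteration converges
    to the trapezoidal-rule approximation of exp(rate * t).
    """
    h = t_end / steps
    y = [1.0] * (steps + 1)  # initial guess: the constant function 1
    for _ in range(iterations):
        new = [1.0]
        acc = 0.0
        for i in range(steps):
            acc += 0.5 * h * rate * (y[i] + y[i + 1])
            new.append(1.0 + acc)
        y = new
    return y


if __name__ == "__main__":
    y = picard_solve(rate=2.0, t_end=1.0)
    print(y[-1], math.exp(2.0))
```

In this toy setting the Picard map is a strict contraction in the weighted norm $\sup_t e^{-\kappa t}|y(t)|$ as soon as $\kappa > L$, which mirrors the role of the weight $e^{-\kappa(X+Y)}$ in the norm $\|\cdot\|_*$: a large $\kappa$ compensates for the size of the Lipschitz constant on the domain.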

Conservative Solutions to a Nonlinear Variational Wave Equation


Remark 3. In the solution constructed above, the variables w, z may well grow outside the initial range $]-\pi, \pi[$. This happens precisely when the quantities R, S become unbounded, i.e. when singularities arise.

For future reference, we state a useful consequence of the above construction.

Corollary 1. If the initial data $u_0$, $u_1$ are smooth, then the solution (u, p, q, w, z) of (2.24)–(2.26), (2.31) is a smooth function of the variables (X, Y). Moreover, assume that a sequence of smooth functions $(u_0^m, u_1^m)_{m\ge 1}$ satisfies

\[
u_0^m \to u_0, \qquad (u_0^m)_x \to (u_0)_x, \qquad u_1^m \to u_1
\]

uniformly on compact subsets of $\mathbb{R}$. Then one has the convergence of the corresponding solutions:

\[
(u^m, p^m, q^m, w^m, z^m) \to (u, p, q, w, z)
\]

uniformly on bounded subsets of the X-Y plane.

We also remark that Eqs. (2.24)–(2.26) imply the conservation laws

\[
q_X + p_Y = 0, \qquad \Bigl(\frac{q}{c}\Bigr)_X - \Bigl(\frac{p}{c}\Bigr)_Y = 0.
\tag{3.10}
\]
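The conservation laws (3.10) are pointwise algebraic consequences of the semilinear system, so they can be sanity-checked numerically. The sketch below is only an illustration: it takes the derivative formulas read off from the integral equations (3.5)–(3.7), together with $u_X = p \sin w/(4c)$ (the counterpart of the $u_Y$ formula in (3.7)), treats $c(u)$ and $c'(u)$ as plain numbers `c` and `dc`, and evaluates both residuals at random points. The function names are hypothetical and not part of the paper.

```python
import math
import random


def rhs(w, z, p, q, c, dc):
    """Pointwise derivatives p_Y, q_X, u_X, u_Y of the semilinear system.

    c stands for the value c(u) > 0 and dc for c'(u); both are plain
    numbers here because the check is purely algebraic.
    """
    k = dc / (8.0 * c * c)
    p_Y = k * (math.sin(z) - math.sin(w)) * p * q
    q_X = k * (math.sin(w) - math.sin(z)) * p * q
    u_X = math.sin(w) * p / (4.0 * c)
    u_Y = math.sin(z) * q / (4.0 * c)
    return p_Y, q_X, u_X, u_Y


def conservation_residuals(w, z, p, q, c, dc):
    """Return (q_X + p_Y, (q/c)_X - (p/c)_Y); both should vanish."""
    p_Y, q_X, u_X, u_Y = rhs(w, z, p, q, c, dc)
    first = q_X + p_Y
    # Chain rule: (q/c)_X = q_X / c - q c'(u) u_X / c^2, and similarly in Y.
    q_over_c_X = q_X / c - q * dc * u_X / (c * c)
    p_over_c_Y = p_Y / c - p * dc * u_Y / (c * c)
    return first, q_over_c_X - p_over_c_Y


if __name__ == "__main__":
    rng = random.Random(0)
    worst = 0.0
    for _ in range(1000):
        args = (rng.uniform(-3, 3), rng.uniform(-3, 3),
                rng.uniform(0.1, 5), rng.uniform(0.1, 5),
                rng.uniform(0.5, 2), rng.uniform(-2, 2))
        worst = max(worst, *map(abs, conservation_residuals(*args)))
    print("largest residual:", worst)
```

Both residuals are zero up to rounding error, which is all the check is meant to show; it says nothing about existence or regularity of the solution itself.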

4. Weak Solutions, in the Original Variables

By expressing the solution u(X, Y) in terms of the original variables (t, x), we shall recover a solution of the Cauchy problem (1.1)–(1.2). This will provide a proof of Theorem 1.

As a preliminary, we examine the regularity of the solution (u, w, z, p, q) constructed in the previous section. Since the initial data $(u_0)_x$ and $u_1$ are only assumed to be in $L^2$, the functions w, z, p, q may well be discontinuous. More precisely, on bounded subsets of the X-Y plane, Eqs. (2.24)–(2.26) imply the following:

– The functions w, p are Lipschitz continuous w.r.t. Y, measurable w.r.t. X.
– The functions z, q are Lipschitz continuous w.r.t. X, measurable w.r.t. Y.
– The function u is Lipschitz continuous w.r.t. both X and Y.

The map $(X, Y) \mapsto (t, x)$ can be constructed as follows. Setting f = x, then f = t in the two equations at (2.16), we find

\[
c = 2cX_x\,x_X, \qquad 1 = 2cX_x\,t_X, \qquad -c = -2cY_x\,x_Y, \qquad 1 = -2cY_x\,t_Y,
\]

respectively. Therefore, using (2.18) we obtain

\[
\begin{cases}
x_X = \dfrac{1}{2X_x} = \dfrac{(1+\cos w)\,p}{4}, \\[3mm]
x_Y = \dfrac{1}{2Y_x} = -\dfrac{(1+\cos z)\,q}{4},
\end{cases}
\tag{4.1}
\]

\[
\begin{cases}
t_X = \dfrac{1}{2cX_x} = \dfrac{(1+\cos w)\,p}{4c}, \\[3mm]
t_Y = \dfrac{1}{-2cY_x} = \dfrac{(1+\cos z)\,q}{4c}.
\end{cases}
\tag{4.2}
\]

For future reference, we write here the partial derivatives of the inverse mapping, valid at points where $w, z \neq -\pi$,


A. Bressan, Y. Zheng

\[
\begin{cases}
X_x = \dfrac{2}{(1+\cos w)\,p}, \\[3mm]
Y_x = -\dfrac{2}{(1+\cos z)\,q},
\end{cases}
\qquad
\begin{cases}
X_t = \dfrac{2c}{(1+\cos w)\,p}, \\[3mm]
Y_t = \dfrac{2c}{(1+\cos z)\,q}.
\end{cases}
\tag{4.3}
\]
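As a quick consistency check, the matrix of partial derivatives in (4.3) should be the inverse of the Jacobian of $(X, Y) \mapsto (x, t)$ given by (4.1)–(4.2). The following sketch multiplies the two 2×2 matrices at a sample point away from $w, z = -\pi$. The helper names are illustrative only, and `c` is a plain positive number standing for $c(u)$.

```python
import math


def forward_jacobian(w, z, p, q, c):
    """Rows (x, t), columns (X, Y), from (4.1)-(4.2)."""
    a = (1.0 + math.cos(w)) * p
    b = (1.0 + math.cos(z)) * q
    return [[a / 4.0, -b / 4.0],
            [a / (4.0 * c), b / (4.0 * c)]]


def inverse_jacobian(w, z, p, q, c):
    """Rows (X, Y), columns (x, t), from (4.3); needs w, z != -pi."""
    a = (1.0 + math.cos(w)) * p
    b = (1.0 + math.cos(z)) * q
    return [[2.0 / a, 2.0 * c / a],
            [-2.0 / b, 2.0 * c / b]]


def matmul2(m, n):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    point = (0.3, -1.1, 1.7, 0.9, 2.0)
    print(matmul2(inverse_jacobian(*point), forward_jacobian(*point)))
```

The product is the identity matrix up to rounding. As $w$ or $z$ approaches $-\pi$ the factor $1 + \cos w$ or $1 + \cos z$ tends to zero and the entries of (4.3) blow up, which is why those points are excluded.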

We can now recover the function x = x(X, Y) by integrating one of the equations in (4.1). Moreover, we can compute t = t(X, Y) by integrating one of the equations in (4.2). A straightforward calculation shows that the two equations in (4.1) are equivalent: differentiating the first w.r.t. Y or the second w.r.t. X one obtains the same expression,

\[
x_{XY} = \frac{(1+\cos w)\,p_Y}{4} - \frac{p \sin w\; w_Y}{4}
= \frac{c'\,pq}{32c^2}\,\bigl[\sin z - \sin w + \sin(z-w)\bigr] = x_{YX}.
\]

Similarly, the equivalence of the two equations in (4.2) is checked by

\[
\begin{aligned}
t_{XY} - t_{YX} &= \Bigl(\frac{x_X}{c}\Bigr)_Y + \Bigl(\frac{x_Y}{c}\Bigr)_X
= \frac{2}{c}\,x_{XY} - \frac{x_X}{c^2}\,c'\,u_Y - \frac{x_Y}{c^2}\,c'\,u_X \\
&= \frac{c'\,pq}{16c^3}\,\bigl[\sin z - \sin w + \sin(z-w)\bigr]
- \frac{c'\,pq}{16c^3}\,\bigl[(1+\cos w)\sin z - (1+\cos z)\sin w\bigr] = 0.
\end{aligned}
\]

In order to define u as a function of the original variables t, x, we should formally invert the map $(X, Y) \mapsto (t, x)$ and write $u(t, x) = u\bigl(X(t, x), Y(t, x)\bigr)$. The fact that the above map may not be one-to-one does not cause any real difficulty. Indeed, given $(t^*, x^*)$, we can choose an arbitrary point $(X^*, Y^*)$ such that $t(X^*, Y^*) = t^*$, $x(X^*, Y^*) = x^*$, and define $u(t^*, x^*) = u(X^*, Y^*)$. To prove that the values of u do not depend on the choice of $(X^*, Y^*)$, we proceed as follows. Assume that there are two distinct points such that $t(X_1, Y_1) = t(X_2, Y_2) = t^*$, $x(X_1, Y_1) = x(X_2, Y_2) = x^*$. We consider two cases:

Case 1. $X_1 \le X_2$, $Y_1 \le Y_2$. Consider the set

\[
\Gamma_{x^*} \doteq \bigl\{(X, Y)\,;\ x(X, Y) \le x^*\bigr\}
\]

and call $\partial\Gamma_{x^*}$ its boundary. By (4.1), x is increasing with X and decreasing with Y. Hence, this boundary can be represented as the graph of a Lipschitz continuous function: $X - Y = \phi(X + Y)$. We now construct the Lipschitz continuous curve γ (Fig. 3a) consisting of

– a horizontal segment joining $(X_1, Y_1)$ with a point $A = (X_A, Y_A)$ on $\partial\Gamma_{x^*}$, with $Y_A = Y_1$,
– a portion of the boundary $\partial\Gamma_{x^*}$,
– a vertical segment joining $(X_2, Y_2)$ to a point $B = (X_B, Y_B)$ on $\partial\Gamma_{x^*}$, with $X_B = X_2$.

Fig. 3. Paths of integration

We can obtain a Lipschitz continuous parametrization of the curve $\gamma : [\xi_1, \xi_2] \to \mathbb{R}^2$ in terms of the parameter $\xi = X + Y$. Observe that the map $(X, Y) \mapsto (t, x)$ is constant along γ. By (4.1)–(4.2) this implies $(1 + \cos w)X_\xi = (1 + \cos z)Y_\xi = 0$, hence $\sin w \cdot X_\xi = \sin z \cdot Y_\xi = 0$. We now compute

\[
u(X_2, Y_2) - u(X_1, Y_1) = \int_\gamma u_X\,dX + u_Y\,dY
= \int_{\xi_1}^{\xi_2} \Bigl(\frac{p \sin w}{4c}\,X_\xi + \frac{q \sin z}{4c}\,Y_\xi\Bigr)\,d\xi = 0,
\]

proving our claim.

Case 2. $X_1 \le X_2$, $Y_1 \ge Y_2$. In this case, we consider the set

\[
\Gamma_{t^*} \doteq \bigl\{(X, Y)\,;\ t(X, Y) \le t^*\bigr\},
\]

and construct a curve γ connecting $(X_1, Y_1)$ with $(X_2, Y_2)$ as in Fig. 3b. Details are entirely similar to Case 1.

We now prove that the function $u(t, x) = u\bigl(X(t, x), Y(t, x)\bigr)$ thus obtained is Hölder continuous on bounded sets. Toward this goal, consider any characteristic curve, say $t \mapsto x^+(t)$, with $\dot x^+ = c(u)$. By construction, this is parametrized by the function $X \mapsto \bigl(t(X, Y), x(X, Y)\bigr)$, for some fixed Y. Recalling (2.16), (2.14), (2.18) and (2.26), we compute

\[
\begin{aligned}
\int_0^\tau \bigl[u_t + c(u)u_x\bigr]^2\,dt
&= \int_{X_0}^{X_\tau} (2cX_x\,u_X)^2\,(2X_t)^{-1}\,dX \\
&= \int_{X_0}^{X_\tau} \Bigl(\frac{2c}{p\cos^2\frac w2}\cdot\frac{p}{4c}\,2\sin\frac w2\cos\frac w2\Bigr)^2 \Bigl(\frac{2c}{p\cos^2\frac w2}\Bigr)^{-1} dX \\
&\le \int_{X_0}^{X_\tau} \frac{p}{2c}\,dX \le C_\tau,
\end{aligned}
\tag{4.4}
\]

for some constant $C_\tau$ depending only on τ. Similarly, integrating along any backward characteristics $t \mapsto x^-(t)$ we obtain

\[
\int_0^\tau \bigl[u_t - c(u)u_x\bigr]^2\,dt \le C_\tau.
\tag{4.5}
\]

Since the speed of characteristics is ±c(u), and c(u) is uniformly positive and bounded, the bounds (4.4)–(4.5) imply that the function u = u(t, x) is Hölder continuous with exponent 1/2. In turn, this implies that all characteristic curves are $C^1$ with Hölder continuous derivative. Still from (4.4)–(4.5) it follows that the functions R, S at (2.1) are square integrable on bounded subsets of the t-x plane.

Finally, we prove that the function u provides a weak solution to the nonlinear wave equation (1.1). According to (1.5), we need to show that

\[
\begin{aligned}
0 &= \iint \Bigl\{\phi_t\bigl[(u_t + cu_x) + (u_t - cu_x)\bigr] - \bigl(c(u)\phi\bigr)_x\bigl[(u_t + cu_x) - (u_t - cu_x)\bigr]\Bigr\}\,dx\,dt \\
&= \iint \bigl[\phi_t - (c\phi)_x\bigr](u_t + cu_x)\,dx\,dt + \iint \bigl[\phi_t + (c\phi)_x\bigr](u_t - cu_x)\,dx\,dt \\
&= \iint \bigl[\phi_t - (c\phi)_x\bigr]R\,dx\,dt + \iint \bigl[\phi_t + (c\phi)_x\bigr]S\,dx\,dt.
\end{aligned}
\tag{4.6}
\]

By (2.16), this is equivalent to

\[
\iint \Bigl\{-2cY_x\,\phi_Y\,R + 2cX_x\,\phi_X\,S + c'\,(u_X X_x + u_Y Y_x)\,\phi\,(S - R)\Bigr\}\,dx\,dt = 0.
\tag{4.7}
\]

It will be convenient to express the double integral in (4.7) in terms of the variables X, Y. We notice that, by (2.18) and (2.14),

\[
dx\,dt = \frac{pq}{2c\,(1+R^2)(1+S^2)}\,dX\,dY.
\]

Using (2.26) and the identities

\[
\begin{cases}
\dfrac{1}{1+R^2} = \cos^2\dfrac w2 = \dfrac{1+\cos w}{2}, \\[3mm]
\dfrac{1}{1+S^2} = \cos^2\dfrac z2 = \dfrac{1+\cos z}{2},
\end{cases}
\qquad
\begin{cases}
\dfrac{R}{1+R^2} = \dfrac{\sin w}{2}, \\[3mm]
\dfrac{S}{1+S^2} = \dfrac{\sin z}{2},
\end{cases}
\tag{4.8}
\]

the double integral in (4.7) can thus be written as

\[
\begin{aligned}
&\iint \Bigl\{2c\,\frac{1+S^2}{q}\,\phi_Y R + 2c\,\frac{1+R^2}{p}\,\phi_X S
+ c'\Bigl(\frac{\sin w}{4c}\,p\,\frac{1+R^2}{p} - \frac{\sin z}{4c}\,q\,\frac{1+S^2}{q}\Bigr)\phi\,(S-R)\Bigr\}
\,\frac{pq}{2c\,(1+R^2)(1+S^2)}\,dX\,dY \\
&\quad= \iint \Bigl\{\frac{R}{1+R^2}\,p\,\phi_Y + \frac{S}{1+S^2}\,q\,\phi_X
+ \frac{c'\,pq}{8c^2}\Bigl(\frac{\sin w}{1+S^2} - \frac{\sin z}{1+R^2}\Bigr)\phi\,(S-R)\Bigr\}\,dX\,dY \\
&\quad= \iint \Bigl\{\frac{p\sin w}{2}\,\phi_Y + \frac{q\sin z}{2}\,\phi_X
+ \frac{c'\,pq}{8c^2}\Bigl[\sin w\sin z - \sin w\cos^2\frac z2\tan\frac w2 - \sin z\cos^2\frac w2\tan\frac z2\Bigr]\phi\Bigr\}\,dX\,dY \\
&\quad= \iint \Bigl\{\frac{p\sin w}{2}\,\phi_Y + \frac{q\sin z}{2}\,\phi_X
+ \frac{c'\,pq}{8c^2}\,\bigl[\cos(w-z) - 1\bigr]\phi\Bigr\}\,dX\,dY.
\end{aligned}
\tag{4.9}
\]

Recalling (2.30), one finds

\[
\begin{aligned}
\Bigl(\frac{p\sin w}{2}\Bigr)_Y + \Bigl(\frac{q\sin z}{2}\Bigr)_X
&= (2c\,u_X)_Y + (2c\,u_Y)_X = 4c'\,u_X u_Y + 4c\,u_{XY} \\
&= \frac{c'\,pq}{4c^2}\,\sin w\sin z + \frac{c'\,pq}{8c^2}\,\bigl[\cos(w+z) - 1\bigr] \\
&= \frac{c'\,pq}{8c^2}\,\bigl[\cos(w-z) - 1\bigr].
\end{aligned}
\tag{4.10}
\]

Together, (4.9) and (4.10) imply (4.7) and hence (4.6). This establishes the integral equation (1.5) for every test function $\phi \in C^1_c$.
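The trigonometric steps in (4.8)–(4.9) are easy to get wrong by a sign, so a small numerical check may be useful. The sketch below assumes $R = \tan(w/2)$ and $S = \tan(z/2)$, as implied by the identities (4.8), and compares the bracket appearing in (4.9) with the elementary expression $\cos w\cos z + \sin w\sin z - 1$. The function names are illustrative and the check is not a substitute for the derivation.

```python
import math


def half_angle_identities(w):
    """Return the two pairs compared in (4.8) for R = tan(w/2)."""
    r = math.tan(w / 2.0)
    return (1.0 / (1.0 + r * r), (1.0 + math.cos(w)) / 2.0,
            r / (1.0 + r * r), math.sin(w) / 2.0)


def bracket(w, z):
    """sin w sin z - sin w cos^2(z/2) tan(w/2) - sin z cos^2(w/2) tan(z/2)."""
    return (math.sin(w) * math.sin(z)
            - math.sin(w) * math.cos(z / 2.0) ** 2 * math.tan(w / 2.0)
            - math.sin(z) * math.cos(w / 2.0) ** 2 * math.tan(z / 2.0))


def simplified(w, z):
    """Candidate closed form of the bracket: cos w cos z + sin w sin z - 1."""
    return math.cos(w) * math.cos(z) + math.sin(w) * math.sin(z) - 1.0


if __name__ == "__main__":
    for w, z in [(0.7, -2.1), (1.9, 0.4), (-2.8, 2.5)]:
        print(w, z, bracket(w, z), simplified(w, z))
```

The two columns printed for each pair agree up to rounding. Values of w or z equal to $\pm\pi$ are avoided because $\tan(w/2)$ is undefined there.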

5. Conserved Quantities

From the conservation laws (3.10) it follows that the 1-forms $p\,dX - q\,dY$ and $\frac pc\,dX + \frac qc\,dY$ are closed, hence their integrals along any closed curve in the X-Y plane vanish. From the conservation laws at (2.6), it follows that the 1-forms

\[
E\,dx - (c^2M)\,dt, \qquad M\,dx - E\,dt
\tag{5.1}
\]

are also closed. There is a simple correspondence. In fact

\[
E\,dx - (c^2M)\,dt = \frac p4\,dX - \frac q4\,dY - \frac12\,dx, \qquad
-M\,dx + E\,dt = \frac{p}{4c}\,dX + \frac{q}{4c}\,dY - \frac12\,dt.
\]

Recalling (4.1)–(4.2), the forms $E\,dx - (c^2M)\,dt$ and $-M\,dx + E\,dt$ can be written in terms of the X-Y coordinates as

\[
\frac{(1-\cos w)\,p}{8}\,dX - \frac{(1-\cos z)\,q}{8}\,dY,
\tag{5.2}
\]

\[
\frac{(1-\cos w)\,p}{8c}\,dX + \frac{(1-\cos z)\,q}{8c}\,dY,
\tag{5.3}
\]

respectively. Using (2.24)–(2.26), one easily checks that these forms are indeed closed:

\[
\Bigl(\frac{(1-\cos w)\,p}{8}\Bigr)_Y
= \frac{c'\,pq}{64c^2}\,\bigl[\sin z\,(1-\cos w) - \sin w\,(1-\cos z)\bigr]
= -\Bigl(\frac{(1-\cos z)\,q}{8}\Bigr)_X,
\tag{5.4}
\]

\[
\Bigl(\frac{(1-\cos w)\,p}{8c}\Bigr)_Y
= \frac{c'\,pq}{64c^3}\,\bigl[\sin(w+z) - (\sin w + \sin z)\bigr]
= \Bigl(\frac{(1-\cos z)\,q}{8c}\Bigr)_X.
\]

In addition, we have the 1-forms

\[
dx = \frac{(1+\cos w)\,p}{4}\,dX - \frac{(1+\cos z)\,q}{4}\,dY,
\tag{5.5}
\]

\[
dt = \frac{(1+\cos w)\,p}{4c}\,dX + \frac{(1+\cos z)\,q}{4c}\,dY,
\tag{5.6}
\]

which are obviously closed.
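The closedness of the energy forms can also be checked pointwise. The sketch below assumes the derivative formulas of the semilinear system in the form used throughout ($w_Y$, $z_X$, $p_Y$, $q_X$ with the common factor $c'/(8c^2)$, and $u_X = p\sin w/(4c)$, $u_Y = q\sin z/(4c)$), applies the product and quotient rules by hand, and compares with the closed-form expression stated in (5.4). Here `c` and `dc` are plain numbers standing for $c(u)$ and $c'(u)$, and all names are hypothetical.

```python
import math


def system_rhs(w, z, p, q, c, dc):
    """Pointwise derivatives of the semilinear system (assumed form)."""
    k = dc / (8.0 * c * c)
    return {
        "w_Y": k * (math.cos(z) - math.cos(w)) * q,
        "z_X": k * (math.cos(w) - math.cos(z)) * p,
        "p_Y": k * (math.sin(z) - math.sin(w)) * p * q,
        "q_X": k * (math.sin(w) - math.sin(z)) * p * q,
        "u_X": math.sin(w) * p / (4.0 * c),
        "u_Y": math.sin(z) * q / (4.0 * c),
    }


def closedness_residuals(w, z, p, q, c, dc):
    """Three residuals that should all vanish if the forms are closed."""
    d = system_rhs(w, z, p, q, c, dc)
    # Y-derivative of (1 - cos w) p / 8 and X-derivative of (1 - cos z) q / 8.
    a_y = (math.sin(w) * d["w_Y"] * p + (1.0 - math.cos(w)) * d["p_Y"]) / 8.0
    b_x = (math.sin(z) * d["z_X"] * q + (1.0 - math.cos(z)) * d["q_X"]) / 8.0
    # Closed-form expression stated in (5.4).
    stated = dc * p * q / (64.0 * c * c) * (
        math.sin(z) * (1.0 - math.cos(w)) - math.sin(w) * (1.0 - math.cos(z)))
    # Same coefficients divided by c(u); the quotient rule brings in u_Y, u_X.
    ac_y = a_y / c - (1.0 - math.cos(w)) * p * dc * d["u_Y"] / (8.0 * c * c)
    bc_x = b_x / c - (1.0 - math.cos(z)) * q * dc * d["u_X"] / (8.0 * c * c)
    return a_y - stated, a_y + b_x, ac_y - bc_x


if __name__ == "__main__":
    print(closedness_residuals(0.9, -1.4, 1.3, 2.2, 1.5, -0.7))
```

The first residual tests the middle expression in (5.4), the second the closedness of the form with denominators 8, and the third the closedness of the form with denominators 8c.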

The solutions u = u(X, Y) constructed in Sect. 3 are conservative, in the sense that the integral of the form (5.3) along every Lipschitz continuous, closed curve in the X-Y plane is zero.

To prove the inequality (1.7), fix any τ > 0. The case τ < 0 is identical. For a given r > 0 arbitrarily large, define the set (Fig. 4)

\[
\Gamma \doteq \bigl\{(X, Y)\,;\ 0 \le t(X, Y) \le \tau,\ X \le r,\ Y \le r\bigr\}.
\tag{5.7}
\]

By construction, the map $(X, Y) \mapsto (t, x)$ will act as follows:

\[
A \mapsto (\tau, a), \qquad B \mapsto (\tau, b), \qquad C \mapsto (0, c), \qquad D \mapsto (0, d),
\]

for some a < b and d < c. Integrating the 1-form (5.3) along the boundary of Γ we obtain

\[
\begin{aligned}
\int_{AB} \frac{(1-\cos w)\,p}{8}\,dX - \frac{(1-\cos z)\,q}{8}\,dY
&= \int_{DC} \frac{(1-\cos w)\,p}{8}\,dX - \frac{(1-\cos z)\,q}{8}\,dY \\
&\quad - \int_{DA} \frac{(1-\cos w)\,p}{8}\,dX - \int_{CB} \frac{(1-\cos z)\,q}{8}\,dY \\
&\le \int_{DC} \frac{(1-\cos w)\,p}{8}\,dX - \frac{(1-\cos z)\,q}{8}\,dY \\
&= \int_d^c \frac12\Bigl[u_t^2(0, x) + c^2\bigl(u(0, x)\bigr)\,u_x^2(0, x)\Bigr]\,dx.
\end{aligned}
\tag{5.8}
\]

Fig. 4.

On the other hand, using (5.5) we compute

\[
\begin{aligned}
\int_a^b \frac12\Bigl[u_t^2(\tau, x) + c^2\bigl(u(\tau, x)\bigr)\,u_x^2(\tau, x)\Bigr]\,dx
&= \int_{AB \cap \{\cos w \neq -1\}} \frac{(1-\cos w)\,p}{8}\,dX
- \int_{AB \cap \{\cos z \neq -1\}} \frac{(1-\cos z)\,q}{8}\,dY \\
&\le E_0.
\end{aligned}
\tag{5.9}
\]

Notice that the last relation in (5.8) is satisfied as an equality, because at time t = 0, along the curve $\gamma_0$ the variables w, z never assume the value −π. Letting $r \to +\infty$ in (5.7), one has $a \to -\infty$, $b \to +\infty$. Therefore (5.8) and (5.9) together imply $E(t) \le E_0$, proving (1.7).

We now prove the Lipschitz continuity of the map $t \mapsto u(t, \cdot)$ in the $L^2$ distance. For this purpose, for any fixed time τ, we let $\mu_\tau = \mu_\tau^- + \mu_\tau^+$ be the positive measure on the real line defined as follows. In the smooth case,

\[
\mu_\tau^-\bigl(]a, b[\bigr) = \frac14\int_a^b R^2(\tau, x)\,dx, \qquad
\mu_\tau^+\bigl(]a, b[\bigr) = \frac14\int_a^b S^2(\tau, x)\,dx.
\tag{5.10}
\]

To define $\mu_\tau^\pm$ in the general case, let $\gamma_\tau$ be the boundary of the set

\[
\Gamma_\tau \doteq \bigl\{(X, Y)\,;\ t(X, Y) \le \tau\bigr\}.
\tag{5.11}
\]

Given any open interval $]a, b[$, let $A = (X_A, Y_A)$ and $B = (X_B, Y_B)$ be the points on $\gamma_\tau$ such that

\[
\begin{aligned}
x(A) &= a, & X_P - Y_P &\le X_A - Y_A \ \text{ for every point } P \in \gamma_\tau \text{ with } x(P) \le a, \\
x(B) &= b, & X_P - Y_P &\ge X_B - Y_B \ \text{ for every point } P \in \gamma_\tau \text{ with } x(P) \ge b.
\end{aligned}
\]

Then

\[
\mu_\tau\bigl(]a, b[\bigr) = \mu_\tau^-\bigl(]a, b[\bigr) + \mu_\tau^+\bigl(]a, b[\bigr),
\tag{5.12}
\]

where

\[
\mu_\tau^-\bigl(]a, b[\bigr) \doteq \int_{AB} \frac{(1-\cos w)\,p}{8}\,dX, \qquad
\mu_\tau^+\bigl(]a, b[\bigr) \doteq -\int_{AB} \frac{(1-\cos z)\,q}{8}\,dY.
\tag{5.13}
\]

Recalling the discussion at (5.1)–(5.3), it is clear that $\mu_\tau^-$, $\mu_\tau^+$ are bounded, positive measures, and $\mu_\tau(\mathbb{R}) = E_0$, for all τ. Moreover, by (5.10) and (2.5),

\[
\int_a^b c^2 u_x^2\,dx \le \int_a^b \frac12\,(R^2 + S^2)\,dx \le 2\mu_\tau\bigl(]a, b[\bigr).
\]

For any a < b, this yields the estimate

\[
|u(\tau, b) - u(\tau, a)|^2 \le |b - a| \int_a^b u_x^2(\tau, y)\,dy \le 2\kappa^2\,|b - a|\,\mu_\tau\bigl(]a, b[\bigr).
\tag{5.14}
\]
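The first inequality in (5.14) is the Cauchy–Schwarz inequality applied to $u(\tau, b) - u(\tau, a) = \int_a^b u_x\,dy$. A discrete version can be checked directly: with difference quotients on a uniform grid, the squared increment is bounded by $|b - a|$ times the Riemann sum of the squared slopes. The sketch below is only an illustration with a hypothetical sample profile; it does not use the solution constructed in the paper.

```python
import math


def increment_vs_energy(u, a, b, n=1000):
    """Return (|u(b) - u(a)|^2, |b - a| * Riemann sum of u_x^2) for a < b.

    The derivative is replaced by difference quotients on a uniform grid,
    so the comparison is the Cauchy-Schwarz inequality for the increments.
    """
    h = (b - a) / n
    nodes = [a + i * h for i in range(n + 1)]
    slopes = [(u(nodes[i + 1]) - u(nodes[i])) / h for i in range(n)]
    lhs = (u(nodes[-1]) - u(nodes[0])) ** 2
    rhs = abs(b - a) * sum(s * s for s in slopes) * h
    return lhs, rhs


if __name__ == "__main__":
    print(increment_vs_energy(math.sin, 0.0, 2.0))
    # A linear profile is the equality case of Cauchy-Schwarz.
    print(increment_vs_energy(lambda y: 3.0 * y, 0.0, 2.0))
```

For a linear profile the two numbers coincide up to rounding, which shows that the factor $|b - a|$ in (5.14) cannot be improved in general.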

Next, for a given h > 0, $y \in \mathbb{R}$, we seek an estimate on the distance $|u(\tau + h, y) - u(\tau, y)|$. As in Fig. 5, let $\gamma_{\tau+h}$ be the boundary of the set $\Gamma_{\tau+h}$, as in (5.11). Let $P = (P_X, P_Y)$ be the point on $\gamma_\tau$ such that

\[
x(P) = y, \qquad X_{P'} - Y_{P'} \le X_P - Y_P \ \text{ for every point } P' \in \gamma_\tau \text{ with } x(P') \le y.
\]

Similarly, let $Q = (Q_X, Q_Y)$ be the point on $\gamma_{\tau+h}$ such that

\[
x(Q) = y, \qquad X_{Q'} - Y_{Q'} \le X_Q - Y_Q \ \text{ for every point } Q' \in \gamma_{\tau+h} \text{ with } x(Q') \le y.
\]

Notice that $X_P \le X_Q$ and $Y_P \le Y_Q$. Let $P^+ = (X^+, Y^+)$ be a point on $\gamma_\tau$ with $X^+ = X_Q$, and let $P^- = (X^-, Y^-)$ be a point on $\gamma_\tau$ with $Y^- = Y_Q$. Notice that $x(P^+) \in\, ]y, y + \kappa h[$, because the point $\bigl(\tau, x(P^+)\bigr)$ lies on some characteristic curve with speed $-c(u) > -\kappa$, passing through the point $(\tau + h, y)$. Similarly, $x(P^-) \in\, ]y - \kappa h, y[$. Recalling that the forms in (5.3) and (5.6) are closed, we obtain the estimate

\[
\begin{aligned}
|u(Q) - u(P^+)| &\le \int_{Y^+}^{Y_Q} \bigl|u_Y(X_Q, Y)\bigr|\,dY
= \int_{Y^+}^{Y_Q} \Bigl|\frac{\sin z}{4c}\,q\Bigr|\,dY \\
&= \int_{Y^+}^{Y_Q} \Bigl(\frac{(1+\cos z)\,q}{4c}\Bigr)^{1/2}\Bigl(\frac{(1-\cos z)\,q}{4}\Bigr)^{1/2} dY \\
&\le \Bigl(\int_{Y^+}^{Y_Q} \frac{(1+\cos z)\,q}{4c}\,dY\Bigr)^{1/2} \cdot \Bigl(\int_{Y^+}^{Y_Q} \frac{(1-\cos z)\,q}{4}\,dY\Bigr)^{1/2} \\
&\le \Bigl(\int_{P^-}^{P^+} \frac{(1-\cos w)\,p}{4}\,dX - \frac{(1-\cos z)\,q}{4}\,dY\Bigr)^{1/2} \cdot h^{1/2}.
\end{aligned}
\tag{5.15}
\]

The last term in (5.15) contains the integral of the 1-form at (5.3), along the curve $\gamma_\tau$, between $P^-$ and $P^+$. Recalling the definition (5.12)–(5.13) and the estimate (5.14), we obtain the bound

\[
\begin{aligned}
|u(\tau + h, x) - u(\tau, x)|^2 &\le 2\,|u(Q) - u(P^+)|^2 + 2\,|u(P^+) - u(P)|^2 \\
&\le 4h \cdot \mu_\tau\bigl(]x - \kappa h, x + \kappa h[\bigr) + 4\kappa^2 \cdot (\kappa h) \cdot \mu_\tau\bigl(]x, x + \kappa h[\bigr).
\end{aligned}
\tag{5.16}
\]

Fig. 5. Proving Lipschitz continuity

Therefore, for any h > 0,

\[
\begin{aligned}
\|u(\tau + h, \cdot) - u(\tau, \cdot)\|_{L^2}
&= \Bigl(\int |u(\tau + h, x) - u(\tau, x)|^2\,dx\Bigr)^{1/2} \\
&\le \Bigl(\int 4(1 + \kappa^3)\,h \cdot \mu_\tau\bigl(]x - \kappa h, x + \kappa h[\bigr)\,dx\Bigr)^{1/2} \\
&= \bigl(4(\kappa^3 + 1)\,h^2\,\mu_\tau(\mathbb{R})\bigr)^{1/2}
= h \cdot \bigl[4(\kappa^3 + 1)\,E_0\bigr]^{1/2}.
\end{aligned}
\tag{5.17}
\]

This proves the uniform Lipschitz continuity of the map $t \mapsto u(t, \cdot)$, stated at (1.4).

6. Regularity of Trajectories

In this section we prove the continuity of the functions $t \mapsto u_t(t, \cdot)$ and $t \mapsto u_x(t, \cdot)$, as functions with values in $L^p$. This will complete the proof of Theorem 1.

We first consider the case where the initial data $(u_0)_x$, $u_1$ are smooth with compact support. In this case, the solution u = u(X, Y) remains smooth on the entire X-Y plane. Fix a time τ and let $\gamma_\tau$ be the boundary of the set $\Gamma_\tau$, as in (5.11). We claim that

\[
\frac{d}{dt}\,u(t, \cdot)\Big|_{t=\tau} = u_t(\tau, \cdot),
\tag{6.1}
\]

where, by (2.14), (2.18) and (2.26),

\[
\begin{aligned}
u_t(\tau, x) \doteq u_X X_t + u_Y Y_t
&= \frac{p\sin w}{4c}\cdot\frac{2c}{p\,(1+\cos w)} + \frac{q\sin z}{4c}\cdot\frac{2c}{q\,(1+\cos z)} \\
&= \frac{\sin w}{2(1+\cos w)} + \frac{\sin z}{2(1+\cos z)}.
\end{aligned}
\tag{6.2}
\]

Notice that (6.2) defines the values of $u_t(\tau, \cdot)$ at almost every point $x \in \mathbb{R}$, i.e. at all points outside the support of the singular part of the measure $\mu_\tau$ defined at (5.12). By the inequality (1.7), recalling that $c(u) \ge \kappa^{-1}$, we obtain

\[
\int_{\mathbb{R}} |u_t(\tau, x)|^2\,dx \le \kappa^2 E(\tau) \le \kappa^2 E_0.
\tag{6.3}
\]

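To prove (6.1), let any ε > 0 be given. There exist finitely many disjoint intervals $[a_i, b_i] \subset \mathbb{R}$, i = 1, ..., N, with the following property. Call $A_i$, $B_i$ the points on $\gamma_\tau$ such that $x(A_i) = a_i$, $x(B_i) = b_i$. Then one has

\[
\min\bigl\{1 + \cos w(P)\,,\ 1 + \cos z(P)\bigr\} < 2\varepsilon
\tag{6.4}
\]

at every point P on $\gamma_\tau$ contained in one of the arcs $A_iB_i$, while

\[
1 + \cos w(P) > \varepsilon, \qquad 1 + \cos z(P) > \varepsilon,
\tag{6.5}
\]

for every point P along $\gamma_\tau$, not contained in any of the arcs $A_iB_i$. Call $J \doteq \cup_{1\le i\le N}[a_i, b_i]$, $J' = \mathbb{R} \setminus J$, and notice that, as a function of the original variables, u = u(t, x) is smooth in a neighborhood of the set $\{\tau\} \times J'$. Using Minkowski's inequality and the differentiability of u on $J'$, we can write

\[
\begin{aligned}
&\lim_{h\to 0} \frac1h \Bigl(\int_{\mathbb{R}} \bigl|u(\tau + h, x) - u(\tau, x) - h\,u_t(\tau, x)\bigr|^p\,dx\Bigr)^{1/p} \\
&\qquad\le \lim_{h\to 0} \frac1h \Bigl(\int_J \bigl|u(\tau + h, x) - u(\tau, x)\bigr|^p\,dx\Bigr)^{1/p} + \Bigl(\int_J \bigl|u_t(\tau, x)\bigr|^p\,dx\Bigr)^{1/p}.
\end{aligned}
\tag{6.6}
\]

We now provide an estimate on the measure of the "bad" set J:

\[
\begin{aligned}
\mathrm{meas}(J) = \int_J dx
&= \sum_i \int_{A_iB_i} \frac{(1+\cos w)\,p}{4}\,dX - \frac{(1+\cos z)\,q}{4}\,dY \\
&\le 2\varepsilon \sum_i \int_{A_iB_i} \frac{(1-\cos w)\,p}{4}\,dX - \frac{(1-\cos z)\,q}{4}\,dY \\
&\le 2\varepsilon \int_{\gamma_\tau} \frac{(1-\cos w)\,p}{4}\,dX - \frac{(1-\cos z)\,q}{4}\,dY \le 2\varepsilon E_0.
\end{aligned}
\tag{6.7}
\]

Now choose q = 2/(2 − p) so that $\frac p2 + \frac1q = 1$. Using Hölder's inequality with conjugate exponents 2/p and q, and recalling (5.17), we obtain

\[
\begin{aligned}
\int_J \bigl|u(\tau + h, x) - u(\tau, x)\bigr|^p\,dx
&\le \mathrm{meas}(J)^{1/q} \cdot \Bigl(\int_J \bigl|u(\tau + h, x) - u(\tau, x)\bigr|^2\,dx\Bigr)^{p/2} \\
&\le \bigl[2\varepsilon E_0\bigr]^{1/q} \cdot \Bigl(\|u(\tau + h, \cdot) - u(\tau, \cdot)\|_{L^2}^2\Bigr)^{p/2} \\
&\le \bigl[2\varepsilon E_0\bigr]^{1/q} \cdot \Bigl(h^2\,\bigl[4(\kappa^3 + 1)\,E_0\bigr]\Bigr)^{p/2}.
\end{aligned}
\]

Therefore,

\[
\limsup_{h\to 0} \frac1h \Bigl(\int_J \bigl|u(\tau + h, x) - u(\tau, x)\bigr|^p\,dx\Bigr)^{1/p}
\le \bigl[2\varepsilon E_0\bigr]^{1/pq} \cdot \bigl[4(\kappa^3 + 1)\,E_0\bigr]^{1/2}.
\tag{6.8}
\]

In a similar way we estimate

\[
\begin{aligned}
\int_J \bigl|u_t(\tau, x)\bigr|^p\,dx &\le \bigl[\mathrm{meas}(J)\bigr]^{1/q} \cdot \Bigl(\int_J \bigl|u_t(\tau, x)\bigr|^2\,dx\Bigr)^{p/2}, \\
\Bigl(\int_J \bigl|u_t(\tau, x)\bigr|^p\,dx\Bigr)^{1/p} &\le \mathrm{meas}(J)^{1/pq} \cdot \bigl[\kappa^2 E_0\bigr]^{1/2}.
\end{aligned}
\tag{6.9}
\]

Since ε > 0 is arbitrary, from (6.6), (6.8) and (6.9) we conclude

\[
\lim_{h\to 0} \frac1h \Bigl(\int_{\mathbb{R}} \bigl|u(\tau + h, x) - u(\tau, x) - h\,u_t(\tau, x)\bigr|^p\,dx\Bigr)^{1/p} = 0.
\tag{6.10}
\]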
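The choice of exponents in the Hölder step can be illustrated on sampled data. The sketch below assumes a function sampled on cells of equal width covering the set J, and compares the Riemann sum of $|f|^p$ with $\mathrm{meas}(J)^{1/q}$ times the $(p/2)$-th power of the Riemann sum of $f^2$, where q = 2/(2 − p). The sample values and function names are hypothetical.

```python
import random


def conjugate_exponent(p):
    """Exponent q = 2 / (2 - p), conjugate to 2 / p, for 1 <= p < 2."""
    if not 1.0 <= p < 2.0:
        raise ValueError("p must satisfy 1 <= p < 2")
    return 2.0 / (2.0 - p)


def holder_sides(values, dx, p):
    """Riemann-sum versions of both sides of the Hölder bound on a set J.

    `values` samples a function f on cells of width `dx` covering J.
    Returns (integral of |f|^p, meas(J)^(1/q) * (integral of f^2)^(p/2)).
    """
    q = conjugate_exponent(p)
    measure = dx * len(values)
    lhs = sum(abs(v) ** p for v in values) * dx
    rhs = measure ** (1.0 / q) * (sum(v * v for v in values) * dx) ** (p / 2.0)
    return lhs, rhs


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.gauss(0.0, 1.0) for _ in range(500)]
    for p in (1.0, 1.5, 1.9):
        print(p, conjugate_exponent(p), holder_sides(sample, 0.01, p))
```

The point of the estimate in the text is that the factor $\mathrm{meas}(J)^{1/q}$ can be made small by shrinking J, while the $L^2$ factor stays bounded; this is why the argument is restricted to p < 2.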

The proof of continuity of the map $t \mapsto u_t$ is similar. Fix ε > 0. Consider the intervals $[a_i, b_i]$ as before. Since u is smooth on a neighborhood of $\{\tau\} \times J'$, it suffices to estimate

\[
\begin{aligned}
\limsup_{h\to 0} \int \bigl|u_t(\tau + h, x) - u_t(\tau, x)\bigr|^p\,dx
&\le \limsup_{h\to 0} \int_J \bigl|u_t(\tau + h, x) - u_t(\tau, x)\bigr|^p\,dx \\
&\le \limsup_{h\to 0}\ \bigl[\mathrm{meas}(J)\bigr]^{1/q} \cdot \Bigl(\int_J \bigl|u_t(\tau + h, x) - u_t(\tau, x)\bigr|^2\,dx\Bigr)^{p/2} \\
&\le \limsup_{h\to 0}\ \bigl[2\varepsilon E_0\bigr]^{1/q} \cdot \Bigl(\|u_t(\tau + h, \cdot)\|_{L^2} + \|u_t(\tau, \cdot)\|_{L^2}\Bigr)^p \\
&\le \bigl[2\varepsilon E_0\bigr]^{1/q}\,\bigl[4E_0\bigr]^p.
\end{aligned}
\]

Since ε > 0 is arbitrary, this proves continuity.

To extend the result to general initial data, such that $(u_0)_x, u_1 \in L^2$, we consider a sequence of smooth initial data, with $(u_0^\nu)_x, u_1^\nu \in C_c^\infty$, with $u_0^\nu \to u_0$ uniformly, $(u_0^\nu)_x \to (u_0)_x$ almost everywhere and in $L^2$, $u_1^\nu \to u_1$ almost everywhere and in $L^2$.

The continuity of the function $t \mapsto u_x(t, \cdot)$ as a map with values in $L^p$, 1 ≤ p < 2, is proved in an entirely similar way.

7. Energy Conservation

This section is devoted to the proof of Theorem 3, stating that, in some sense, the total energy of the solution remains constant in time. A key tool in our analysis is the wave interaction potential, defined as

\[
\Lambda(t) \doteq (\mu_t^- \otimes \mu_t^+)\bigl\{(x, y)\,;\ x > y\bigr\}.
\tag{7.1}
\]

We recall that $\mu_t^\pm$ are the positive measures defined at (5.13). Notice that, if $\mu_t^-$, $\mu_t^+$ are absolutely continuous w.r.t. Lebesgue measure, so that (5.10) holds, then (7.1) is equivalent to

\[
\Lambda(t) \doteq \frac14 \iint_{x>y} R^2(t, x)\,S^2(t, y)\,dx\,dy.
\]

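Lemma 1. The map $t \mapsto \Lambda(t)$ has locally bounded variation. Indeed, there exists a one-sided Lipschitz constant $L_0$ such that

\[
\Lambda(t) - \Lambda(s) \le L_0 \cdot (t - s), \qquad t > s > 0.
\tag{7.2}
\]

To prove the lemma, we first give a formal argument, valid when the solution u = u(t, x) remains smooth. We first notice that (2.4) implies

\[
\begin{aligned}
\frac{d}{dt}\bigl(4\Lambda(t)\bigr)
&\le -\int 2c\,R^2S^2\,dx + \int (R^2 + S^2)\,dx \cdot \int \Bigl|\frac{c'}{2c}\Bigr|\,\bigl|R^2S - RS^2\bigr|\,dx \\
&\le -2\kappa^{-1}\int R^2S^2\,dx + 4E_0\,\Bigl\|\frac{c'}{2c}\Bigr\|_{L^\infty} \int \bigl|R^2S - RS^2\bigr|\,dx,
\end{aligned}
\]

where $\kappa^{-1}$ is a lower bound for c(u). For each ε > 0 we have $|R| \le \varepsilon^{-1/2} + \varepsilon^{1/2}R^2$. Choosing ε > 0 such that

\[
\kappa^{-1} > 4E_0\,\Bigl\|\frac{c'}{2c}\Bigr\|_{L^\infty} \cdot 2\sqrt{\varepsilon},
\]

we thus obtain

\[
\frac{d}{dt}\bigl(4\Lambda(t)\bigr) \le -\kappa^{-1}\int R^2S^2\,dx + \frac{16E_0^2}{\sqrt{\varepsilon}}\,\Bigl\|\frac{c'}{2c}\Bigr\|_{L^\infty}.
\]

This yields the $L^1$ estimate

\[
\int_0^\tau\!\!\int \bigl(|R^2S| + |RS^2|\bigr)\,dx\,dt = O(1)\cdot\bigl[\Lambda(0) + E_0^2\,\tau\bigr] = O(1)\cdot(1 + \tau)\,E_0^2,
\]

where O(1) denotes a quantity whose absolute value admits a uniform bound, depending only on the function c = c(u) and not on the particular solution under consideration. In particular, the map $t \mapsto \Lambda(t)$ has bounded variation on any bounded interval. It can be discontinuous, with downward jumps.

To achieve a rigorous proof of Lemma 1, we need to reproduce the above argument in terms of the variables X, Y. As a preliminary, we observe that for every ε > 0 there exists a constant $\kappa_\varepsilon$ such that

\[
\bigl|\sin z\,(1-\cos w) - \sin w\,(1-\cos z)\bigr|
\le \kappa_\varepsilon \cdot \Bigl(\tan^2\frac w2 + \tan^2\frac z2\Bigr)(1+\cos w)(1+\cos z) + \varepsilon\,(1-\cos w)(1-\cos z)
\tag{7.3}
\]

for every pair of angles w, z.

Now fix 0 ≤ s < t. Consider the sets $\Gamma_s$, $\Gamma_t$ as in (5.11) and define $\Gamma_{st} \doteq \Gamma_t \setminus \Gamma_s$. Observing that

\[
dx\,dt = \frac{pq}{8c}\,(1+\cos w)(1+\cos z)\,dX\,dY,
\]

we can now write

\[
\begin{aligned}
\int_s^t\!\!\int_{-\infty}^{\infty} \frac{R^2 + S^2}{4}\,dx\,dt &= (t - s)\,E_0 \\
&= \iint_{\Gamma_{st}} \frac14\Bigl(\tan^2\frac w2 + \tan^2\frac z2\Bigr)\cdot\frac{pq}{8c}\,(1+\cos w)(1+\cos z)\,dX\,dY.
\end{aligned}
\tag{7.4}
\]

The first identity holds only for smooth solutions, but the second one is always valid. Recalling (5.4) and (5.13), and then using (7.3)–(7.4), we obtain

\[
\begin{aligned}
\Lambda(t) - \Lambda(s)
&\le -\iint_{\Gamma_{st}} \frac{1-\cos w}{8}\,p \cdot \frac{1-\cos z}{8}\,q\,dX\,dY \\
&\quad + E_0 \cdot \iint_{\Gamma_{st}} \frac{|c'|}{64c^2}\,pq\,\bigl|\sin z\,(1-\cos w) - \sin w\,(1-\cos z)\bigr|\,dX\,dY \\
&\le -\frac{1}{64}\iint_{\Gamma_{st}} (1-\cos w)(1-\cos z)\,pq\,dX\,dY \\
&\quad + E_0 \cdot \iint_{\Gamma_{st}} \frac{|c'|}{64c^2}\,pq\,\Bigl[\kappa_\varepsilon \cdot \Bigl(\tan^2\frac w2 + \tan^2\frac z2\Bigr)(1+\cos w)(1+\cos z) + \varepsilon\,(1-\cos w)(1-\cos z)\Bigr]\,dX\,dY \\
&\le \kappa\,(t - s),
\end{aligned}
\]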

for a suitable constant κ. This proves the lemma.

To prove Theorem 3, consider the three sets

\[
\begin{aligned}
\Omega_1 &\doteq \bigl\{(X, Y)\,;\ w(X, Y) = -\pi,\ z(X, Y) \neq -\pi,\ c'\bigl(u(X, Y)\bigr) \neq 0\bigr\}, \\
\Omega_2 &\doteq \bigl\{(X, Y)\,;\ z(X, Y) = -\pi,\ w(X, Y) \neq -\pi,\ c'\bigl(u(X, Y)\bigr) \neq 0\bigr\}, \\
\Omega_3 &\doteq \bigl\{(X, Y)\,;\ z(X, Y) = -\pi,\ w(X, Y) = -\pi,\ c'\bigl(u(X, Y)\bigr) \neq 0\bigr\}.
\end{aligned}
\]

From Eqs. (2.24), it follows that

\[
\mathrm{meas}(\Omega_1) = \mathrm{meas}(\Omega_2) = 0.
\tag{7.5}
\]

Indeed, $w_Y \neq 0$ on $\Omega_1$ and $z_X \neq 0$ on $\Omega_2$. Let $\Omega_3^*$ be the set of Lebesgue points of $\Omega_3$. We now show that

\[
\mathrm{meas}\bigl\{t(X, Y)\,;\ (X, Y) \in \Omega_3^*\bigr\} = 0.
\tag{7.6}
\]

To prove (7.6), fix any $P^* = (X^*, Y^*) \in \Omega_3^*$ and let $\tau = t(P^*)$. We claim that

\[
\limsup_{h,k\to 0+} \frac{\Lambda(\tau - h) - \Lambda(\tau + k)}{h + k} = +\infty.
\tag{7.7}
\]

By assumption, for any ε > 0 arbitrarily small we can find δ > 0 with the following property. For any square Q centered at $P^*$ with side of length ℓ < δ, there exists a vertical segment σ and a horizontal segment σ′, as in Fig. 6, such that

\[
\mathrm{meas}(\Omega_3 \cap \sigma) \ge (1 - \varepsilon)\,\ell, \qquad \mathrm{meas}(\Omega_3 \cap \sigma') \ge (1 - \varepsilon)\,\ell.
\tag{7.8}
\]

Call

\[
t^+ \doteq \max\bigl\{t(X, Y)\,;\ (X, Y) \in \sigma \cup \sigma'\bigr\}, \qquad
t^- \doteq \min\bigl\{t(X, Y)\,;\ (X, Y) \in \sigma \cup \sigma'\bigr\}.
\]

Fig. 6.

Notice that, by (4.2),

\[
t^+ - t^- \le \int_{\sigma'} \frac{(1+\cos w)\,p}{4c}\,dX + \int_{\sigma} \frac{(1+\cos z)\,q}{4c}\,dY \le c_0 \cdot (\varepsilon\ell)^2.
\tag{7.9}
\]

Indeed, the integrand functions are Lipschitz continuous. Moreover, they vanish outside a set of measure εℓ. On the other hand,

\[
\Lambda(t^-) - \Lambda(t^+) \ge c_1\,(1 - \varepsilon)^2\,\ell^2 - c_2\,(t^+ - t^-)
\tag{7.10}
\]

for some constant $c_1 > 0$. Since ε > 0 was arbitrary, this implies (7.7). Recalling that the map $t \mapsto \Lambda$ has bounded variation, from (7.7) it follows (7.6).

We now observe that the singular part of $\mu_\tau$ is nontrivial only if the set

\[
\bigl\{P \in \gamma_\tau\,;\ w(P) = -\pi \ \text{ or } \ z(P) = -\pi\bigr\}
\]

has positive 1-dimensional measure. By the previous analysis, restricted to the region where $c' \neq 0$, this can happen only for a set of times having zero measure.

Acknowledgement. Alberto Bressan was supported by the Italian M.I.U.R., within the research project #2002017219, while Yuxi Zheng has been partially supported by grants NSF DMS 0305497 and 0305114.

Communicated by P. Constantin

Commun. Math. Phys. 266, 499–545 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0036-y

Communications in

Mathematical Physics

The Random Average Process and Random Walk in a Space-Time Random Environment in One Dimension

Márton Balázs 1, Firas Rassoul-Agha 2, Timo Seppäläinen 1

1 Mathematics Department, University of Wisconsin-Madison, Van Vleck Hall, Madison, WI 53706, USA. E-mail: [email protected]; [email protected]

2 Mathematical Biosciences Institute, Ohio State University, 231 West 18th Avenue, Columbus, OH 43210, USA. E-mail: [email protected]

Received: 7 September 2005 / Accepted: 12 January 2006
Published online: 9 May 2006 – © Springer-Verlag 2006

Abstract: We study space-time fluctuations around a characteristic line for a one-dimensional interacting system known as the random average process. The state of this system is a real-valued function on the integers. New values of the function are created by averaging previous values with random weights. The fluctuations analyzed occur on the scale n 1/4 , where n is the ratio of macroscopic and microscopic scales in the system. The limits of the fluctuations are described by a family of Gaussian processes. In cases of known product-form invariant distributions, this limit is a two-parameter process whose time marginals are fractional Brownian motions with Hurst parameter 1/4. Along the way we study the limits of quenched mean processes for a random walk in a space-time random environment. These limits also happen at scale n 1/4 and are described by certain Gaussian processes that we identify. In particular, when we look at a backward quenched mean process, the limit process is the solution of a stochastic heat equation. 1. Introduction Fluctuations for asymmetric interacting systems. An asymmetric interacting system is a random process στ = {στ (k) : k ∈ K} of many components στ (k) that influence each others’ evolution. Asymmetry means here that the components have an average drift in some spatial direction. Such processes are called interacting particle systems because often these components can be thought of as particles. To orient the reader, let us first think of a single random walk {X τ : τ = 0, 1, 2, . . . } that evolves by itself. For random walk we scale both space and time by n because on this scale we see the long-term velocity: n −1 X nt → tv as n → ∞, where v = E X 1 . The random walk is diffusive which means that its fluctuations occur on the scale n 1/2 , as M. Balázs was partially supported by Hungarian Scientific Research Fund (OTKA) grant T037685.

T. Seppäläinen was partially supported by National Science Foundation grant DMS-0402231.

500

M. Balázs, F. Rassoul-Agha, T. Seppäläinen

revealed by the classical central limit theorem: n −1/2 (X nt − ntv) converges weakly to a Gaussian distribution. The Gaussian limit is universal here because it arises regardless of the choice of step distribution for the random walk, as long as a square-integrability hypothesis is satisfied. For asymmetric interacting systems we typically also scale time and space by the same factor n, and this is known as Euler scaling. However, in certain classes of onedimensional asymmetric interacting systems the random evolution produces fluctuations of smaller order than the natural diffusive scale. Two types of such phenomena have been discovered. (i) In Hammersley’s process, in asymmetric exclusion, and in some other closely related systems, dynamical fluctuations occur on the scale n 1/3 . Currently known rigorous results suggest that the Tracy-Widom distributions from random matrix theory are the universal limits of these n 1/3 fluctuations. The seminal works in this context are by Baik, Deift and Johansson [3] on Hammersley’s process and by Johansson [19] on the exclusion process. We should point out though that [3] does not explicitly discuss Hammersley’s process, but instead the maximal number of planar Poisson points on an increasing path in a rectangle. One can intrepret the results in [3] as fluctuation results for Hammersley’s process with a special initial configuration. The connection between the increasing path model and Hammersley’s process goes back to Hammersley’s paper [18]. It was first utilized by Aldous and Diaconis [1] (who also named the process), and then further in the papers [26, 28]. (ii) The second type has fluctuations of the order n 1/4 and limits described by a family of self-similar Gaussian processes that includes fractional Brownian motion with Hurst parameter 41 . This result was first proved for a system of independent random walks [30]. 
One of the main results of the current paper shows that the $n^{1/4}$ fluctuations also appear in a family of interacting systems called random average processes in one dimension. The same family of limiting Gaussian processes appears here too, suggesting that these limits are universal for some class of interacting systems. The random average processes (RAP) studied in the present paper describe a random real-valued function on the integers whose values evolve by jumping to random convex combinations of values in a finite neighborhood. It could be thought of as a caricature model for an interface between two phases on the plane, hence we call the state a height function. RAP is related to the so-called linear systems discussed in Chapter IX of Liggett’s monograph [22]. RAP was introduced by Ferrari and Fontes [14] who studied the fluctuations from initial linear slopes. In particular, they discovered that the height over the origin satisfies a central limit theorem in the time scale $t^{1/4}$. The Ferrari-Fontes results suggested RAP to us as a fruitful place to investigate whether the $n^{1/4}$ fluctuation picture discovered in [30] for independent walks had any claim to universality.

There are two ways to see the lower order dynamical fluctuations. (1) One can take deterministic initial conditions so that only dynamical randomness is present. (2) Even if the initial state is random with central limit scale fluctuations, one can find the lower order fluctuations by looking at the evolution of the process along a characteristic curve. Articles [3] and [19] studied the evolutions of special deterministic initial states of Hammersley’s process and the exclusion process. Recently Ferrari and Spohn [15] have extended this analysis to the fluctuations across a characteristic in a stationary exclusion process. The general nonequilibrium hydrodynamic limit situation is still out of reach for these models. [30] contains a tail bound for Hammersley’s process that suggests $n^{1/3}$ scaling also in the nonequilibrium situation, including along a shock which can be regarded as a “generalized” characteristic.

Our results for the random average process are for the general hydrodynamic limit setting. The initial increments of the random height function are assumed independent and subject to some moment bounds. Their means and variances must vary sufficiently regularly to satisfy a Hölder condition. Deterministic initial increments qualify here as a special case of independent.

The classification of the systems mentioned above (Hammersley, exclusion, independent walks, RAP) into $n^{1/3}$ and $n^{1/4}$ fluctuations coincides with their classification according to type of macroscopic equation. Independent particles and RAP are macroscopically governed by linear first-order partial differential equations $u_t + b u_x = 0$. In contrast, macroscopic evolutions of Hammersley’s process and the exclusion process obey genuinely nonlinear Hamilton-Jacobi equations $u_t + f(u_x) = 0$ that create shocks. Suppose we start off one of these systems so that the initial state fluctuates on the $n^{1/2}$ spatial scale, for example in a stationary distribution. Then the fluctuations of the entire system on the $n^{1/2}$ scale simply consist of initial fluctuations transported along the deterministic characteristics of the macroscopic equation. This is a consequence of the lower order of dynamical fluctuations. When the macroscopic equation is linear this is the whole picture of diffusive fluctuations. In the nonlinear case the behavior at the shocks (where characteristics merge) also needs to be resolved. This has been done for the exclusion process [25] and for Hammersley’s process [29].

Random walk in a space-time random environment. Analysis of the random average process utilizes a dual description in terms of backward random walks in a space-time random environment.
Investigation of the fluctuations of RAP leads to a study of fluctuations of these random walks, both quenched invariance principles for the walk itself and limits for the quenched mean process. The quenched invariance principles have been reported elsewhere [24]. The results for the quenched mean process are included in the present paper because they are intimately connected to the random average process results. We look at two types of processes of quenched means. We call them forward and backward. In the forward case the initial point of the walk is fixed, and the walk runs for a specified amount of time on the space-time lattice. In the backward case the initial point moves along a characteristic, and the walk runs until it reaches the horizontal axis. Furthermore, in both cases we let the starting point vary horizontally (spatially), and so we have a space-time process. In both cases we describe a limiting Gaussian process, when space is scaled by $n^{1/2}$, time by $n$, and the magnitude of the fluctuations by $n^{1/4}$. In particular, in the backward case we find a limit process that solves the stochastic heat equation.

There are two earlier papers on the quenched mean of this random walk in a space-time random environment. These previous results were proved under assumptions of small enough noise and finitely many possible values for the random probabilities. Bernabei [5] showed that the centered quenched mean, normalized by its own standard deviation, converges to a normal variable. Then separately he showed that this standard deviation is bounded above and below on the order $n^{1/4}$. Bernabei has results also in dimension 2, and also for the quenched covariance of the walk. Boldrighini and Pellegrinotti [6] also proved a normal limit in the scale $n^{1/4}$ for what they term the “correction” caused by the random environment on the mean of a test function.

Finite-dimensional versus process-level convergence. Our main results all state that the finite-dimensional distributions of a process of interest converge to the finite-dimensional distributions of a certain Gaussian process specified by its covariance function. We have not proved process-level tightness, except in the case of forward quenched means for the random walks where we compute a bound on the sixth moment of the process increment.

Further relevant literature. It is not clear what exactly are the systems “closely related” to Hammersley’s process or exclusion process, alluded to in the beginning of the Introduction, that share the $n^{1/3}$ fluctuations and Tracy-Widom limits. The processes for which rigorous proofs exist all have an underlying representation in terms of a last-passage percolation model. Another such example is “oriented digital boiling” studied by Gravner, Tracy and Widom [16]. (This model was studied earlier in [27] and [20] under different names.)

Fluctuations of the current were initially studied from the perspective of a moving observer traveling with a general speed. The fluctuations are diffusive, and the limiting variance is a function of the speed of the observer. The special nature of the characteristic speed manifests itself in the vanishing of the limiting variance on this diffusive scale. The early paper of Ferrari and Fontes [13] treated the asymmetric exclusion process. Their work was extended by Balázs [4] to a class of deposition models that includes the much-studied zero range process and a generalization called the bricklayers’ process.

Work on the fluctuations of Hammersley’s process and the exclusion process has connections to several parts of mathematics. Overviews of some of these links appear in papers [2, 10, 17]. General treatments of large scale behavior of interacting random systems can be found in [9, 21–23, 32, 33].

Organization of the paper.
We begin with the description of the random average process and the limit theorem for it in Sect. 2. Section 3 describes the random walk in a space-time random environment and the limit theorems for quenched mean processes. The proofs begin with Sect. 4 that lays out some preliminary facts on random walks. Sections 5 and 6 prove the fluctuation results for random walk, and the final Sect. 7 proves the limit theorem for RAP. The reader only interested in the random walk can read Sect. 3 and the proofs for the random walk limits independently of the rest of the paper, except for certain definitions and a hypothesis which have been labeled. The RAP results can be read independently of the random walk, but their proofs depend on the random walk results.

Notation. We summarize here some notation and conventions for quick reference. The set of natural numbers is $\mathbb{N} = \{1, 2, 3, \dots\}$, while $\mathbb{Z}_+ = \{0, 1, 2, 3, \dots\}$ and $\mathbb{R}_+ = [0, \infty)$. On the two-dimensional integer lattice $\mathbb{Z}^2$ standard basis vectors are $e_1 = (1, 0)$ and $e_2 = (0, 1)$. The $e_2$-direction represents time.

We need several different probability measures and corresponding expectation operators. $\mathbb{P}$ (with expectation $\mathbb{E}$) is the probability measure on the space $\Omega$ of environments $\omega$. $\mathbb{P}$ is an i.i.d. product measure across the coordinates indexed by the space-time lattice $\mathbb{Z}^2$. $\mathbf{P}$ (with expectation $\mathbf{E}$) is the probability measure of the initial state of the random average process. $\mathbf{E}^\omega$ is used to emphasize that an expectation over initial states is taken with a fixed environment $\omega$. Jointly the environment and initial state are independent, so the joint measure is the product $\mathbb{P} \otimes \mathbf{P}$. $P^\omega$ (with expectation $E^\omega$) is the quenched path measure of the random walks in environment $\omega$. The annealed measure for the walks is $P = \int P^\omega \, \mathbb{P}(d\omega)$. Additionally, we use $\mathrm{P}$ and $\mathrm{E}$ for generic probability measures and expectations for processes that are not part of this specific set-up, such as Brownian motions and limiting Gaussian processes.

The environments $\omega \in \Omega$ are configurations $\omega = (\omega_{x,\tau} : (x, \tau) \in \mathbb{Z}^2)$ of vectors indexed by the space-time lattice $\mathbb{Z}^2$. Each element $\omega_{x,\tau}$ is a probability vector of length $2M + 1$, denoted also by $u_\tau(x) = \omega_{x,\tau}$, and in terms of coordinates $u_\tau(x) = (u_\tau(x, y) : -M \le y \le M)$. The environment at a fixed time value $\tau$ is $\bar\omega_\tau = (\omega_{x,\tau} : x \in \mathbb{Z})$. Translations on $\Omega$ are defined by $(T_{x,\tau}\omega)_{y,s} = \omega_{x+y,\tau+s}$.

$\lfloor x \rfloor = \max\{n \in \mathbb{Z} : n \le x\}$ is the lower integer part of a real $x$. Throughout, $C$ denotes a constant whose exact value is immaterial and can change from line to line. The density and cumulative distribution function of the centered Gaussian distribution with variance $\sigma^2$ are denoted by $\varphi_{\sigma^2}(x)$ and $\Phi_{\sigma^2}(x)$. $\{B(t) : t \ge 0\}$ is one-dimensional standard Brownian motion, in other words the Gaussian process with covariance $\mathrm{E}\,B(s)B(t) = s \wedge t$.

2. The Random Average Process

The state of the random average process (RAP) is a height function $\sigma : \mathbb{Z} \to \mathbb{R}$. It can also be thought of as a sequence $\sigma = (\sigma(i) : i \in \mathbb{Z}) \in \mathbb{R}^{\mathbb{Z}}$, where $\sigma(i)$ is the height of an interface above site $i$. The state evolves in discrete time according to the following rule. At each time point $\tau = 1, 2, 3, \dots$ and at each site $k \in \mathbb{Z}$, a random probability vector $u_\tau(k) = (u_\tau(k, j) : -M \le j \le M)$ of length $2M + 1$ is drawn. Given the state $\sigma_{\tau-1} = (\sigma_{\tau-1}(i) : i \in \mathbb{Z})$ at time $\tau - 1$, the height value at site $k$ is then updated to

\[ \sigma_\tau(k) = \sum_{j : |j| \le M} u_\tau(k, j)\, \sigma_{\tau-1}(k + j). \tag{2.1} \]

This update is performed independently at each site $k$ to form the state $\sigma_\tau = (\sigma_\tau(k) : k \in \mathbb{Z})$ at time $\tau$. The same step is repeated at the next time $\tau + 1$ with new independent draws of the probability vectors. So, given an initial state $\sigma_0$, the process $\sigma_\tau$ is constructed with a collection $\{u_\tau(k) : \tau \in \mathbb{N}, k \in \mathbb{Z}\}$ of independent and identically distributed random vectors. These random vectors are defined on a probability space $(\Omega, \mathfrak{S}, \mathbb{P})$. If $\sigma_0$ is also random with distribution $\mathbf{P}$, then $\sigma_0$ and the vectors $\{u_\tau(k)\}$ are independent, in other words the joint distribution is $\mathbb{P} \otimes \mathbf{P}$. We write $u^\omega_\tau(k)$ to make explicit the dependence on $\omega \in \Omega$. $\mathbb{E}$ will denote expectation under the measure $\mathbb{P}$. $M$ is the range and is a fixed finite parameter of the model. $\mathbb{P}$-almost surely each random vector $u_\tau(k)$ satisfies

\[ 0 \le u_\tau(k, j) \le 1 \ \text{ for all } -M \le j \le M, \quad\text{and}\quad \sum_{j=-M}^{M} u_\tau(k, j) = 1. \]
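The update rule (2.1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the law of the weight vectors (normalized uniform variables) and the periodic ring of sites are hypothetical choices made so that the example is finite and runnable; the paper itself works on all of $\mathbb{Z}$ and does not prescribe a particular law.

```python
import random


def random_weights(M, rng):
    """Draw one probability vector of length 2M+1.

    Hypothetical law for illustration only: normalized uniform variables.
    """
    raw = [rng.random() for _ in range(2 * M + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def rap_step(sigma, M, rng):
    """Apply one update (2.1) at every site of a ring of len(sigma) sites.

    The periodic wrap-around is an assumption that keeps the state finite.
    """
    L = len(sigma)
    new_sigma = []
    for k in range(L):
        u = random_weights(M, rng)  # u[j + M] plays the role of u_tau(k, j)
        new_sigma.append(
            sum(u[j + M] * sigma[(k + j) % L] for j in range(-M, M + 1))
        )
    return new_sigma


rng = random.Random(0)
sigma0 = [rng.gauss(0.0, 1.0) for _ in range(50)]
sigma1 = rap_step(sigma0, M=2, rng=rng)
```

Because each new height is a convex combination of old heights, a constant height function is left unchanged and the new heights stay between the old minimum and maximum.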

It is often convenient to allow values $u_\tau(k, j)$ for all $j$. Then automatically $u_\tau(k, j) = 0$ for $|j| > M$. Let $p(0, j) = \mathbb{E}\,u_0(0, j)$ denote the averaged probabilities. Throughout the paper we make two fundamental assumptions.

(i) First, there is no integer $h > 1$ such that, for some $x \in \mathbb{Z}$,

\[ \sum_{k \in \mathbb{Z}} p(0, x + kh) = 1. \]

This is also expressed by saying that the span of the random walk with jump probabilities $p(0, j)$ is 1 [11, p. 129]. It follows that the group generated by $\{x \in \mathbb{Z} : p(0, x) > 0\}$ is all of $\mathbb{Z}$, in other words this walk is aperiodic in Spitzer’s terminology [31].

(ii) Second, we assume that

\[ \mathbb{P}\{\max_j u_0(0, j) < 1\} > 0. \tag{2.2} \]

If this assumption fails, then $\mathbb{P}$-almost surely for each $(k, \tau)$ there exists $j = j(k, \tau)$ such that $u_\tau(k, j) = 1$. No averaging happens, but instead $\sigma_\tau(k)$ adopts the value $\sigma_{\tau-1}(k + j)$. The behavior is then different from that described by our results.

No further hypotheses are required of the distribution $\mathbb{P}$ on the probability vectors. Deterministic weights $u^\omega_\tau(k, j) \equiv p(0, j)$ are also admissible, in which case (2.2) requires $\max_j p(0, j) < 1$.

In addition to the height process $\sigma_\tau$ we also consider the increment process $\eta_\tau = (\eta_\tau(i) : i \in \mathbb{Z})$ defined by $\eta_\tau(i) = \sigma_\tau(i) - \sigma_\tau(i - 1)$. From (2.1) one can deduce a similar linear equation for the evolution of the increment process. However, the weights are not necessarily nonnegative, and even if they are, they do not necessarily sum to one.

Next we define several constants that appear in the results.

\[ D(\omega) = \sum_{x \in \mathbb{Z}} x\, u^\omega_0(0, x) \tag{2.3} \]

is the drift at the origin. Its mean is $V = \mathbb{E}(D)$ and variance

\[ \sigma_D^2 = \mathbb{E}[(D - V)^2]. \tag{2.4} \]

A variance under averaged probabilities is computed by

\[ \sigma_a^2 = \sum_{x \in \mathbb{Z}} (x - V)^2\, p(0, x). \tag{2.5} \]

Define random and averaged characteristic functions by

\[ \phi^\omega(t) = \sum_{x \in \mathbb{Z}} u^\omega_0(0, x)\, e^{itx} \quad\text{and}\quad \phi_a(t) = \mathbb{E}\phi^\omega(t) = \sum_{x \in \mathbb{Z}} p(0, x)\, e^{itx}, \tag{2.6} \]

and then further

\[ \lambda(t) = \mathbb{E}[\,|\phi^\omega(t)|^2\,] \quad\text{and}\quad \bar\lambda(t) = |\phi_a(t)|^2. \tag{2.7} \]

Finally, define a positive constant $\beta$ by

\[ \beta = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1 - \lambda(t)}{1 - \bar\lambda(t)}\, dt. \tag{2.8} \]
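As an illustration of definitions (2.3) to (2.8), the following Python sketch evaluates the span, $V$, $\sigma_D^2$, $\sigma_a^2$ and $\beta$ for an environment law with finitely many possible weight vectors. The representation of the law as a list of `(probability, {x: weight})` pairs, the function names, and the midpoint quadrature are all assumptions made for this example; they are not taken from the paper.

```python
import cmath
import math
from functools import reduce


def averaged_probabilities(law):
    """p(0, x) = E u_0(0, x) for a law given as [(prob, {x: weight}), ...]."""
    p = {}
    for prob, u in law:
        for x, w in u.items():
            p[x] = p.get(x, 0.0) + prob * w
    return p


def span(p):
    """gcd of the differences of support points; assumption (i) asks for 1."""
    support = sorted(x for x, w in p.items() if w > 0)
    return reduce(math.gcd, (x - support[0] for x in support), 0)


def drift_constants(law):
    """Return (V, sigma_D^2, sigma_a^2) from (2.3), (2.4) and (2.5)."""
    p = averaged_probabilities(law)
    drifts = [(prob, sum(x * w for x, w in u.items())) for prob, u in law]
    V = sum(prob * d for prob, d in drifts)
    var_D = sum(prob * (d - V) ** 2 for prob, d in drifts)
    var_a = sum((x - V) ** 2 * w for x, w in p.items())
    return V, var_D, var_a


def char_fn(u, t):
    """Characteristic function sum_x u(x) e^{itx} of one weight vector."""
    return sum(w * cmath.exp(1j * t * x) for x, w in u.items())


def beta(law, cells=2000):
    """Midpoint-rule approximation of (2.8).

    An even number of cells keeps every midpoint away from t = 0.
    """
    p = averaged_probabilities(law)
    h = 2 * math.pi / cells
    total = 0.0
    for k in range(cells):
        t = -math.pi + (k + 0.5) * h
        lam = sum(prob * abs(char_fn(u, t)) ** 2 for prob, u in law)
        lam_bar = abs(char_fn(p, t)) ** 2
        total += (1 - lam) / (1 - lam_bar) * h
    return total / (2 * math.pi)


# Hypothetical two-valued environment on the steps {-1, 0}.
law = [(0.5, {-1: 0.2, 0: 0.8}), (0.5, {-1: 0.6, 0: 0.4})]
V, var_D, var_a = drift_constants(law)
beta_value = beta(law)
```

For weights supported on $\{-1, 0\}$ the integrand in (2.8) is constant in $t$, so the quadrature reproduces the closed form $\beta = \sigma_a^{-2}\,\mathbb{E}[u_0(0,0)u_0(0,-1)]$ stated later in Proposition 2.2(a).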

The assumption of span 1 implies that $|\phi_a(t)| = 1$ only at multiples of $2\pi$. Hence the integrand above is positive at $t \ne 0$. Separately one can check that the integrand has a finite limit as $t \to 0$. Thus $\beta$ is well-defined and finite. In Sect. 4 we can give these constants, especially $\beta$, more probabilistic meaning from the perspective of the underlying random walk in random environment.

For the limit theorems we consider a sequence $\sigma^n_\tau$ of the random average processes, indexed by $n \in \mathbb{N} = \{1, 2, 3, \dots\}$. Initially we set $\sigma^n_0(0) = 0$. For each $n$ we assume that the initial increments $\{\eta^n_0(i) : i \in \mathbb{Z}\}$ are independent random variables, with

\[ \mathbf{E}[\eta^n_0(i)] = \varrho(i/n) \quad\text{and}\quad \mathrm{Var}[\eta^n_0(i)] = v(i/n). \tag{2.9} \]

The functions $\varrho$ and $v$ that appear above are assumed to be uniformly bounded functions on $\mathbb{R}$ and to satisfy this local Hölder continuity: for each compact interval $[a, b] \subseteq \mathbb{R}$ there exist $C = C(a, b) < \infty$ and $\gamma = \gamma(a, b) > 1/2$ such that

\[ |\varrho(x) - \varrho(y)| + |v(x) - v(y)| \le C\,|x - y|^\gamma \quad\text{for } x, y \in [a, b]. \tag{2.10} \]

The function $v$ must be nonnegative, but the sign of $\varrho$ is not restricted. Both functions are allowed to vanish. In particular, our hypotheses permit deterministic initial heights, in which case $v$ vanishes identically. The distribution on initial heights and increments described above is denoted by $\mathbf{P}$. We make this uniform moment hypothesis on the increments: there exists $\alpha > 0$ such that

\[ \sup_{n \in \mathbb{N},\, i \in \mathbb{Z}} \mathbf{E}\,|\eta^n_0(i)|^{2+\alpha} < \infty. \tag{2.11} \]

We assume that the processes $\sigma^n_\tau$ are all defined on the same probability space. The environments $\omega$ that drive the dynamics are independent of the initial states $\{\sigma^n_0\}$, so the joint distribution of $(\omega, \{\sigma^n_0\})$ is $\mathbb{P} \otimes \mathbf{P}$. When computing an expectation under a fixed $\omega$ we write $\mathbf{E}^\omega$.

On the larger space and time scale the height function is simply rigidly translated at speed $b = -V$, and the same is also true of the central limit fluctuations of the initial height function. Precisely speaking, define a function $U$ on $\mathbb{R}$ by $U(0) = 0$ and $U'(x) = \varrho(x)$. Let $(x, t) \in \mathbb{R} \times \mathbb{R}_+$. The assumptions made thus far imply that both

\[ n^{-1} \sigma^n_{\lfloor nt \rfloor}(\lfloor nx \rfloor) \longrightarrow U(x - bt) \tag{2.12} \]

and

\[ \frac{\sigma^n_{\lfloor nt \rfloor}(\lfloor nx \rfloor) - nU(x - bt)}{\sqrt{n}} - \frac{\sigma^n_0(\lfloor nx \rfloor - \lfloor nbt \rfloor) - nU(x - bt)}{\sqrt{n}} \longrightarrow 0 \tag{2.13} \]

in probability, as $n \to \infty$. (We will not give a proof. This follows from easier versions of the estimates in the paper.) Limit (2.12) is the “hydrodynamic limit” of the process. The large scale evolution of the height process is thus governed by the linear transport equation

\[ w_t + b w_x = 0. \]

This equation is uniquely solved by $w(x, t) = U(x - bt)$ given the initial function $w(x, 0) = U(x)$. The lines $x(t) = x + bt$ are the characteristics of this equation, the curves along which the equation carries information. Limit (2.13) says that fluctuations on the diffusive scale do not include any randomness from the evolution, only a translation of initial fluctuations along characteristics.

We find interesting height fluctuations along a macroscopic characteristic line $x(t) = \bar y + bt$, and around such a line on the microscopic spatial scale $\sqrt{n}$. The magnitude of these fluctuations is of the order $n^{1/4}$, so we study the process

\[ z_n(t, r) = n^{-1/4} \bigl\{ \sigma^n_{\lfloor nt \rfloor}(\lfloor n\bar y \rfloor + \lfloor r\sqrt{n} \rfloor + \lfloor ntb \rfloor) - \sigma^n_0(\lfloor n\bar y \rfloor + \lfloor r\sqrt{n} \rfloor) \bigr\}, \]

indexed by $(t, r) \in \mathbb{R}_+ \times \mathbb{R}$, for a fixed $\bar y \in \mathbb{R}$. In terms of the increment process $\eta^n_\tau$, $z_n(t, 0)$ is the net flow from right to left across the discrete characteristic $\lfloor n\bar y \rfloor + \lfloor nsb \rfloor$, during the time interval $0 \le s \le t$.

Next we describe the limit of $z_n$. Recall the constants defined in (2.4), (2.5), and (2.8). Combine them into a new constant

\[ \kappa = \frac{\sigma_D^2}{\beta \sigma_a^2}. \tag{2.14} \]

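Let $\{B(t) : t \ge 0\}$ be one-dimensional standard Brownian motion. Define two functions $\Gamma_q$ and $\Gamma_0$ on $(\mathbb{R}_+ \times \mathbb{R}) \times (\mathbb{R}_+ \times \mathbb{R})$:

\[ \Gamma_q((s, q), (t, r)) = \frac{\kappa}{2} \int_{\sigma_a^2 |t - s|}^{\sigma_a^2 (t + s)} \frac{1}{\sqrt{2\pi v}} \exp\Bigl\{ -\frac{1}{2v} (q - r)^2 \Bigr\}\, dv \tag{2.15} \]

and

\[ \begin{aligned} \Gamma_0((s, q), (t, r)) ={}& \int_{q \vee r}^{\infty} \mathrm{P}[\sigma_a B(s) > x - q]\, \mathrm{P}[\sigma_a B(t) > x - r]\, dx \\ &- \mathbf{1}\{r > q\} \int_{q}^{r} \mathrm{P}[\sigma_a B(s) > x - q]\, \mathrm{P}[\sigma_a B(t) \le x - r]\, dx \\ &- \mathbf{1}\{q > r\} \int_{r}^{q} \mathrm{P}[\sigma_a B(s) \le x - q]\, \mathrm{P}[\sigma_a B(t) > x - r]\, dx \\ &+ \int_{-\infty}^{q \wedge r} \mathrm{P}[\sigma_a B(s) \le x - q]\, \mathrm{P}[\sigma_a B(t) \le x - r]\, dx. \end{aligned} \tag{2.16} \]

The boundary values are such that $\Gamma_q((s, q), (t, r)) = \Gamma_0((s, q), (t, r)) = 0$ if either $s = 0$ or $t = 0$. We will see later that $\Gamma_q$ is the limiting covariance of the backward quenched mean process of a related random walk in random environment. $\Gamma_0$ is the covariance for fluctuations contributed by the initial increments of the random average process. (Hence the subscripts $q$ for quenched and 0 for initial time. The subscript on $\Gamma_q$ has nothing to do with the argument $(s, q)$.)

The integral expressions above are the form in which $\Gamma_q$ and $\Gamma_0$ appear in the proofs. For $\Gamma_q$ the key point is the limit (5.19) which is evaluated earlier in (4.5). $\Gamma_0$ arises in Proposition 7.1.

Here are alternative succinct representations for $\Gamma_q$ and $\Gamma_0$. Denote the centered Gaussian density with variance $\sigma^2$ by

\[ \varphi_{\sigma^2}(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Bigl\{ -\frac{1}{2\sigma^2} x^2 \Bigr\} \tag{2.17} \]

and its distribution function by $\Phi_{\sigma^2}(x) = \int_{-\infty}^{x} \varphi_{\sigma^2}(y)\, dy$. Then define

\[ \Psi_{\sigma^2}(x) = \sigma^2 \varphi_{\sigma^2}(x) - x\,(1 - \Phi_{\sigma^2}(x)), \]

which is an antiderivative of $\Phi_{\sigma^2}(x) - 1$. In these terms,

\[ \Gamma_q((s, q), (t, r)) = \kappa \Psi_{\sigma_a^2 (t + s)}(|q - r|) - \kappa \Psi_{\sigma_a^2 |t - s|}(|q - r|) \]

and

\[ \Gamma_0((s, q), (t, r)) = \Psi_{\sigma_a^2 s}(|q - r|) + \Psi_{\sigma_a^2 t}(|q - r|) - \Psi_{\sigma_a^2 (t + s)}(|q - r|). \]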
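The agreement between the integral forms (2.15), (2.16) and the succinct $\Psi$-representations can be checked numerically. The sketch below is an illustration only: the function names, the midpoint quadrature, the truncation of the infinite integrals at 12 standard deviations, and the parameter values are all assumptions of this example, and it requires $s, t > 0$.

```python
import math


def phi(var, x):
    """Centered Gaussian density with variance var, as in (2.17)."""
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)


def Phi(var, x):
    """Centered Gaussian distribution function with variance var."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2 * var)))


def Psi(var, x):
    """Psi_{var}(x) = var * phi(x) - x * (1 - Phi(x)); degenerate at var = 0."""
    if var == 0:
        return max(-x, 0.0)
    return var * phi(var, x) - x * (1 - Phi(var, x))


def midpoint(f, a, b, cells=4000):
    """Midpoint-rule quadrature of f over [a, b]; returns 0 if b <= a."""
    if b <= a:
        return 0.0
    h = (b - a) / cells
    return h * sum(f(a + (k + 0.5) * h) for k in range(cells))


def gamma_q_closed(s, q, t, r, kappa, var_a):
    d = abs(q - r)
    return kappa * (Psi(var_a * (t + s), d) - Psi(var_a * abs(t - s), d))


def gamma_q_integral(s, q, t, r, kappa, var_a):
    """Right-hand side of (2.15)."""
    d = q - r
    return 0.5 * kappa * midpoint(
        lambda v: phi(v, d), var_a * abs(t - s), var_a * (t + s)
    )


def gamma_0_closed(s, q, t, r, var_a):
    d = abs(q - r)
    return (Psi(var_a * s, d) + Psi(var_a * t, d)
            - Psi(var_a * (t + s), d))


def gamma_0_integrals(s, q, t, r, var_a):
    """Right-hand side of (2.16), with the infinite ranges truncated."""
    vs, vt = var_a * s, var_a * t
    reach = 12 * math.sqrt(max(vs, vt))
    lo, hi = min(q, r), max(q, r)
    i1 = midpoint(lambda x: (1 - Phi(vs, x - q)) * (1 - Phi(vt, x - r)),
                  hi, hi + reach)
    i2 = midpoint(lambda x: (1 - Phi(vs, x - q)) * Phi(vt, x - r), q, r)
    i3 = midpoint(lambda x: Phi(vs, x - q) * (1 - Phi(vt, x - r)), r, q)
    i4 = midpoint(lambda x: Phi(vs, x - q) * Phi(vt, x - r),
                  lo - reach, lo)
    return i1 - i2 - i3 + i4
```

The helper `midpoint` returns zero on an empty range, which plays the role of the indicators $\mathbf{1}\{r > q\}$ and $\mathbf{1}\{q > r\}$ in (2.16).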

Theorem 2.1. Assume (2.2) and that the averaged probabilities $p(0, j) = \mathbb{E}\,u^\omega_0(0, j)$ have lattice span 1. Let $\varrho$ and $v$ be two uniformly bounded functions on $\mathbb{R}$ that satisfy the local Hölder condition (2.10). For each $n$, let $\sigma^n_\tau$ be a random average process normalized by $\sigma^n_0(0) = 0$ and whose initial increments $\{\eta^n_0(i) : i \in \mathbb{Z}\}$ are independent and satisfy (2.9) and (2.11). Assume the environments $\omega$ independent of the initial heights $\{\sigma^n_0 : n \in \mathbb{N}\}$. Fix $\bar y \in \mathbb{R}$. Under the above assumptions the finite-dimensional distributions of the process $\{z_n(t, r) : (t, r) \in \mathbb{R}_+ \times \mathbb{R}\}$ converge weakly as $n \to \infty$ to the finite-dimensional distributions of the mean zero Gaussian process $\{z(t, r) : (t, r) \in \mathbb{R}_+ \times \mathbb{R}\}$ specified by the covariance

\[ \mathrm{E}\,z(s, q)z(t, r) = \varrho(\bar y)^2\, \Gamma_q((s, q), (t, r)) + v(\bar y)\, \Gamma_0((s, q), (t, r)). \tag{2.18} \]

The statement means that, given space-time points $(t_1, r_1), \dots, (t_k, r_k)$, the $\mathbb{R}^k$-valued random vector $(z_n(t_1, r_1), \dots, z_n(t_k, r_k))$ converges in distribution to the random vector $(z(t_1, r_1), \dots, z(t_k, r_k))$ as $n \to \infty$. The theorem is also valid in cases where one source of randomness has been turned off: if initial increments around $n\bar y$ are deterministic then $v(\bar y) = 0$, while if $D(\omega) \equiv V$ then $\sigma_D^2 = 0$. The case $\sigma_D^2 = 0$ contains as a special case the one with deterministic weights $u^\omega_\tau(k, j) \equiv p(0, j)$.

If we consider only temporal correlations with a fixed $r$, the formula for the covariance is as follows:

\[ \mathrm{E}\,z(s, r)z(t, r) = \frac{\kappa \sigma_a}{\sqrt{2\pi}}\, \varrho(\bar y)^2 \bigl( \sqrt{s + t} - \sqrt{t - s} \bigr) + \frac{\sigma_a}{\sqrt{2\pi}}\, v(\bar y) \bigl( \sqrt{s} + \sqrt{t} - \sqrt{s + t} \bigr) \quad\text{for } s < t. \tag{2.19} \]

Remark 2.1. The covariances are central to our proofs but they do not illuminate the behavior of the process $z$. Here is a stochastic integral representation of the Gaussian process with covariance (2.18):

\[ z(t, r) = \varrho(\bar y)\, \sigma_a \sqrt{\kappa} \iint_{[0,t] \times \mathbb{R}} \varphi_{\sigma_a^2 (t - s)}(r - x)\, dW(s, x) + \sqrt{v(\bar y)} \int_{\mathbb{R}} \mathrm{sign}(x - r)\, \Phi_{\sigma_a^2 t}(-|x - r|)\, dB(x). \tag{2.20} \]

Above $W$ is a two-parameter Brownian motion defined on $\mathbb{R}_+ \times \mathbb{R}$, $B$ is a one-parameter Brownian motion defined on $\mathbb{R}$, and $W$ and $B$ are independent of each other. The first integral represents the space-time noise created by the dynamics, and the second integral represents the initial noise propagated by the evolution. The equality in (2.20) is equality in distribution of processes. It can be verified by checking that the Gaussian process defined by the sum of the integrals has the covariance (2.18).

One can readily see the second integral in (2.20) arise as a sum in the proof. It is the limit of $Y^n(t, r)$ defined below Eq. (7.1).

One can also check that the right-hand side of (2.20) is a weak solution of a stochastic heat equation with two independent sources of noise:

\[ z_t = \tfrac{1}{2} \sigma_a^2 z_{rr} + \varrho(\bar y)\, \sigma_a \sqrt{\kappa}\, \dot W + \tfrac{1}{2} \sqrt{v(\bar y)}\, \sigma_a^2 B'', \qquad z(0, r) \equiv 0. \tag{2.21} \]

$\dot W$ is space-time white noise generated by the dynamics and $B''$ the second derivative of the one-dimensional Brownian motion that represents initial noise. This equation has to be interpreted in a weak sense through integration against smooth compactly supported test functions. We make a related remark below in Sect. 3.2 for limit processes of quenched means of space-time RWRE.

The simplest RAP dynamics averages only two neighboring height values. By translating the indices, we can assume that $p(0, -1) + p(0, 0) = 1$. In this case the evolution of increments is given by the equation

\[ \eta_\tau(k) = u_\tau(k, 0)\, \eta_{\tau-1}(k) + u_\tau(k - 1, -1)\, \eta_{\tau-1}(k - 1). \tag{2.22} \]
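Equation (2.22) follows from (2.1) by subtracting the updates at sites $k$ and $k - 1$ and using $u_\tau(k, 0) = 1 - u_\tau(k, -1)$. A minimal Python sketch of this consistency check is given below; the ring of finitely many sites (with Python's negative indexing providing the wrap-around) and the function names are assumptions of the example.

```python
import random


def height_step(sigma, u_left):
    """Two-point case of (2.1) on a ring; u_left[k] stands for u_tau(k, -1)."""
    L = len(sigma)
    return [(1 - u_left[k]) * sigma[k] + u_left[k] * sigma[k - 1]
            for k in range(L)]


def increments(sigma):
    """eta(k) = sigma(k) - sigma(k - 1), with periodic wrap-around at k = 0."""
    return [sigma[k] - sigma[k - 1] for k in range(len(sigma))]


def increment_step(eta, u_left):
    """Right-hand side of (2.22): work kept at k plus work arriving from k - 1."""
    L = len(eta)
    return [(1 - u_left[k]) * eta[k] + u_left[k - 1] * eta[k - 1]
            for k in range(L)]


rng = random.Random(7)
sigma = [rng.uniform(-1.0, 1.0) for _ in range(12)]
u_left = [rng.random() for _ in range(12)]

from_heights = increments(height_step(sigma, u_left))
from_increments = increment_step(increments(sigma), u_left)
```

Summing (2.22) over $k$ shows that the total of the increments is conserved by one step, since the fraction leaving site $k - 1$ is exactly the fraction arriving at site $k$.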

There is a queueing interpretation of sorts for this evolution. Suppose $\eta_{\tau-1}(k)$ denotes the amount of work that remains at station $k$ at the end of cycle $\tau - 1$. Then during cycle $\tau$, the fraction $u_\tau(k, -1)$ of this work is completed and moves on to station $k + 1$, while the remaining fraction $u_\tau(k, 0)$ stays at station $k$ for further processing.

In this case we can explicitly evaluate the constant $\beta$ in terms of the other quantities. In a particular stationary situation we can also identify the temporal marginal of $z$ in (2.19) as a familiar process. (A probability distribution $\mu$ on the space $\mathbb{R}^{\mathbb{Z}}$ is an invariant distribution for the increment process if it is the case that when $\eta_0$ has $\mu$ distribution, so does $\eta_\tau$ for all times $\tau \in \mathbb{Z}_+$.)

Proposition 2.2. Assume $p(0, -1) + p(0, 0) = 1$.

(a) Then

\[ \beta = \frac{1}{\sigma_a^2}\, \mathbb{E}[u_0(0, 0)\, u_0(0, -1)]. \tag{2.23} \]

(b) Suppose further that the increment process $\eta_\tau$ possesses an invariant distribution $\mu$ in which the variables $\{\eta(i) : i \in \mathbb{Z}\}$ are i.i.d. with common mean $\varrho = E^\mu[\eta(i)]$ and variance $v = E^\mu[\eta(i)^2] - \varrho^2$. Then $v = \kappa \varrho^2$. Suppose that in Theorem 2.1 each $\eta^n_\tau = \eta_\tau$ is a stationary process with marginal $\mu$. Then the limit process $z$ has covariance

\[ \mathrm{E}\,z(s, q)z(t, r) = \kappa \varrho^2 \bigl[ \Psi_{\sigma_a^2 s}(|q - r|) + \Psi_{\sigma_a^2 t}(|q - r|) - \Psi_{\sigma_a^2 |t - s|}(|q - r|) \bigr]. \tag{2.24} \]

In particular, for a fixed $r$ the process $\{z(t, r) : t \in \mathbb{R}_+\}$ has covariance

\[ \mathrm{E}\,z(s, r)z(t, r) = \frac{\sigma_a \kappa \varrho^2}{\sqrt{2\pi}} \bigl( \sqrt{s} + \sqrt{t} - \sqrt{|t - s|} \bigr). \tag{2.25} \]

In other words, process $z(\cdot, r)$ is fractional Brownian motion with Hurst parameter 1/4.
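The algebra behind Proposition 2.2 can be sanity-checked in Python. The sketch below makes several assumptions that are not part of the proposition: it takes $u_0(0, -1)$ to be Beta$(j, m - j)$ distributed with integer parameters (a hypothetical choice), uses (2.23) and (2.14) to compute $\kappa$ exactly with rational arithmetic, and then checks that (2.19) with $v = \kappa \varrho^2$ collapses to (2.25), which is a multiple of the standard fractional Brownian motion covariance $\tfrac12(s^{2H} + t^{2H} - |t - s|^{2H})$ with $H = 1/4$.

```python
import math
from fractions import Fraction


def kappa_two_point(j, m):
    """kappa of (2.14) when U = u_0(0,-1) ~ Beta(j, m-j) and u_0(0,0) = 1 - U."""
    mean = Fraction(j, m)                        # E[U]
    second = Fraction(j * (j + 1), m * (m + 1))  # E[U^2]
    var_D = second - mean ** 2                   # D = -U, so sigma_D^2 = Var(U)
    var_a = mean * (1 - mean)                    # sigma_a^2 for steps in {-1, 0}
    beta = (mean - second) / var_a               # (2.23): E[U(1-U)] / sigma_a^2
    return var_D / (beta * var_a)


def temporal_cov_general(s, t, kappa, rho, v, sigma_a):
    """Formula (2.19), valid for s < t."""
    c = sigma_a / math.sqrt(2 * math.pi)
    return (kappa * c * rho ** 2 * (math.sqrt(s + t) - math.sqrt(t - s))
            + c * v * (math.sqrt(s) + math.sqrt(t) - math.sqrt(s + t)))


def temporal_cov_stationary(s, t, kappa, rho, sigma_a):
    """Formula (2.25)."""
    c = sigma_a * kappa * rho ** 2 / math.sqrt(2 * math.pi)
    return c * (math.sqrt(s) + math.sqrt(t) - math.sqrt(abs(t - s)))


def fbm_cov(s, t, hurst):
    """Covariance of standard fractional Brownian motion."""
    h2 = 2 * hurst
    return 0.5 * (s ** h2 + t ** h2 - abs(t - s) ** h2)


kappa = kappa_two_point(2, 5)
```

With these beta weights $\kappa$ comes out as $1/m$, which matches the ratio $v/\varrho^2 = (m/\lambda^2)/(m/\lambda)^2$ of a gamma distribution with shape $m$, in line with part (b).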

To rephrase the connection (2.24)–(2.25), the process $\{z(t, r)\}$ in (2.24) is a certain two-parameter process whose marginals along the first parameter direction are fractional Brownian motions.

Ferrari and Fontes [14] showed that given any slope $\rho$, the process $\eta_\tau$ started from deterministic increments $\eta_0(x) = \rho x$ converges weakly to an invariant distribution. But as is typical for interacting systems, there is little information about the invariant distributions in the general case. The next example gives a family of processes and i.i.d. invariant distributions to show that part (b) of Proposition 2.2 is not vacuous. Presently we are not aware of other explicitly known invariant distributions for RAP.

Example 2.1. Fix integer parameters $m > j > 0$. Let $\{u_\tau(k, -1) : \tau \in \mathbb{N}, k \in \mathbb{Z}\}$ be i.i.d. beta-distributed random variables with density

\[ h(u) = \frac{(m - 1)!}{(j - 1)!\,(m - j - 1)!}\, u^{j-1} (1 - u)^{m-j-1} \]

on $(0, 1)$. Set $u_\tau(k, 0) = 1 - u_\tau(k, -1)$. Consider the evolution defined by (2.22) with these weights. Then a family of invariant distributions for the increment process $\eta_\tau = (\eta_\tau(k) : k \in \mathbb{Z})$ is obtained by letting the variables $\{\eta(k)\}$ be i.i.d. gamma distributed with common density

\[ f(x) = \frac{1}{(m - 1)!}\, \lambda e^{-\lambda x} (\lambda x)^{m-1} \tag{2.26} \]

on $\mathbb{R}_+$. The family of invariant distributions is parametrized by $0 < \lambda < \infty$. Under this distribution $\mathrm{E}[\eta(k)] = m/\lambda$ and $\mathrm{Var}[\eta(k)] = m/\lambda^2$.

One motivation for the present work was to investigate whether the limits found in [30] for fluctuations along a characteristic for independent walks are instances of some universal behavior. The present results are in agreement with those obtained for independent walks. The common scaling is $n^{1/4}$. In that paper only the case $r = 0$ of Theorem 2.1 was studied. For both independent walks and RAP the limit $z(\cdot, 0)$ is a mean-zero Gaussian process with covariance of the type

\[ \mathrm{E}\,z(s, 0)z(t, 0) = c_1 \bigl( \sqrt{s + t} - \sqrt{t - s} \bigr) + c_2 \bigl( \sqrt{s} + \sqrt{t} - \sqrt{s + t} \bigr), \]

where $c_1$ is determined by the mean increment and $c_2$ by the variance of the increment locally around the initial point of the characteristic. Furthermore, as in Proposition 2.2(b), for independent walks the limit process specializes to fractional Brownian motion if the increment process is stationary.

These and other related results suggest several avenues of inquiry. In the introduction we contrasted this picture of $n^{1/4}$ fluctuations and fractional Brownian motion limits with the $n^{1/3}$ fluctuations and Tracy-Widom limits found in exclusion and Hammersley processes. Obviously more classes of processes should be investigated to understand better the demarcation between these two types. Also, there might be further classes with different limits.

Above we assumed independent increments at time zero. It would be of interest to see if relaxing this assumption leads to a change in the second part of the covariance (2.18). [The first part comes from the random walks in the dual description and would not be affected by the initial conditions.] However, without knowledge of some explicit invariant distributions it is not clear what types of initial increment processes $\{\eta_0(k)\}$ are worth considering. Unfortunately finding explicit invariant distributions for interacting systems seems often a matter of good fortune.

We conclude this section with the dual description of RAP which leads us to study random walks in a space-time random environment. Given $\omega$, let $\{X^{i,\tau}_s : s \in \mathbb{Z}_+\}$ denote a random walk on $\mathbb{Z}$ that starts at $X^{i,\tau}_0 = i$, and whose transition probabilities are given by

\[ P^\omega(X^{i,\tau}_{s+1} = y \mid X^{i,\tau}_s = x) = u^\omega_{\tau-s}(x, y - x). \tag{2.27} \]

$P^\omega$ is the path measure of the walk $X^{i,\tau}_s$, with expectation denoted by $E^\omega$. Comparison of (2.1) and (2.27) gives

\[ \sigma_\tau(i) = \sum_j P^\omega(X^{i,\tau}_1 = j \mid X^{i,\tau}_0 = i)\, \sigma_{\tau-1}(j) = E^\omega\bigl[ \sigma_{\tau-1}(X^{i,\tau}_1) \bigr]. \tag{2.28} \]

Iteration and the Markov property of the walks $X^{i,\tau}_s$ then lead to

\[ \sigma_\tau(i) = E^\omega\bigl[ \sigma_0(X^{i,\tau}_\tau) \bigr]. \tag{2.29} \]
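The duality (2.29) is a statement about products of stochastic matrices, so it can be checked exactly on a small example. The Python sketch below compares heights obtained by iterating (2.1) with the quenched expectation of $\sigma_0$ under the backward walk (2.27). The ring geometry, the normalized-uniform weight law, and all function names are assumptions of this illustration.

```python
import random


def random_environment(T, L, M, rng):
    """env[tau][x] is a probability vector u_tau(x, .) for tau = 1..T.

    Hypothetical law: normalized uniform variables at each site of a ring.
    """
    env = {}
    for tau in range(1, T + 1):
        row = []
        for _ in range(L):
            raw = [rng.random() for _ in range(2 * M + 1)]
            total = sum(raw)
            row.append([w / total for w in raw])
        env[tau] = row
    return env


def heights_by_iteration(sigma0, env, T, M):
    """Run the update (2.1) for times 1..T in a fixed environment."""
    sigma = list(sigma0)
    L = len(sigma)
    for tau in range(1, T + 1):
        sigma = [
            sum(env[tau][k][j + M] * sigma[(k + j) % L]
                for j in range(-M, M + 1))
            for k in range(L)
        ]
    return sigma


def backward_walk_law(i, env, T, M, L):
    """Quenched law of X_T^{i,T}; step s uses the weights of time T - s (2.27)."""
    law = [0.0] * L
    law[i] = 1.0
    for s in range(T):
        u = env[T - s]
        new_law = [0.0] * L
        for x in range(L):
            if law[x] > 0.0:
                for j in range(-M, M + 1):
                    new_law[(x + j) % L] += law[x] * u[x][j + M]
        law = new_law
    return law


rng = random.Random(3)
L, M, T = 15, 2, 6
env = random_environment(T, L, M, rng)
sigma0 = [rng.gauss(0.0, 1.0) for _ in range(L)]
sigma_T = heights_by_iteration(sigma0, env, T, M)
dual = [
    sum(p * h for p, h in zip(backward_walk_law(i, env, T, M, L), sigma0))
    for i in range(L)
]
```

The two lists `sigma_T` and `dual` agree up to rounding, which is the finite-volume analogue of (2.29) for every starting site $i$.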

Note that the initial height function $\sigma_0$ is a constant under the expectation $E^\omega$.

Let us add another coordinate to keep track of time and write $\bar X^{i,\tau}_s = (X^{i,\tau}_s, \tau - s)$ for $s \ge 0$. Then $\bar X^{i,\tau}_s$ is a random walk on the planar lattice $\mathbb{Z}^2$ that always moves down one step in the $e_2$-direction, and if its current position is $(x, n)$, the $e_1$-coordinate of its next position is $x + y$ with probability $u_n(x, y)$. We shall call it the backward random walk in a random environment. In the next section we discuss this walk and its forward counterpart.

3. Random Walk in a Space-Time Random Environment

3.1. Definition of the model. We consider here a particular random walk in random environment (RWRE). The walk evolves on the planar integer lattice $\mathbb{Z}^2$, which we think of as space-time: the first component represents one-dimensional discrete space, and the second represents discrete time. We denote by $e_2$ the unit vector in the time-direction. The walks will not be random in the $e_2$-direction, but only in the spatial $e_1$-direction.

We consider forward walks $\bar Z^{i,\tau}_m$ and backward walks $\bar X^{i,\tau}_m$. The subscript $m \in \mathbb{Z}_+$ is the time parameter of the walk and superscripts are initial points:

\[ \bar Z^{i,\tau}_0 = \bar X^{i,\tau}_0 = (i, \tau) \in \mathbb{Z}^2. \tag{3.1} \]

The forward walks move deterministically up in time, while the backward walks move deterministically down in time:

\[ \bar Z^{i,\tau}_m = (Z^{i,\tau}_m, \tau + m) \quad\text{and}\quad \bar X^{i,\tau}_m = (X^{i,\tau}_m, \tau - m) \quad\text{for } m \ge 0. \]

Since the time components of the walks are deterministic, only the spatial components $Z^{i,\tau}_m$ and $X^{i,\tau}_m$ are really relevant. We impose a finite range on the steps of the walks: there is a fixed constant $M$ such that

\[ \bigl| Z^{i,\tau}_{m+1} - Z^{i,\tau}_m \bigr| \le M \quad\text{and}\quad \bigl| X^{i,\tau}_{m+1} - X^{i,\tau}_m \bigr| \le M. \tag{3.2} \]

A note of advance justification for the setting: The backward walks are the ones relevant to the random average process. Distributions of forward and backward walks are obvious mappings of each other. However, we will be interested in the quenched mean processes of the walks as we vary the final time for the forward walk or the initial space-time point for the backward walk. The results for the forward walk form an interesting point of comparison to the backward walk, even though they will not be used to analyze the random average process.

An environment is a configuration of probability vectors $\omega = (u_\tau(x) : (x, \tau) \in \mathbb{Z}^2)$, where each vector $u_\tau(x) = (u_\tau(x, y) : -M \le y \le M)$ satisfies

\[ 0 \le u_\tau(x, y) \le 1 \ \text{ for all } -M \le y \le M, \quad\text{and}\quad \sum_{y=-M}^{M} u_\tau(x, y) = 1. \]

An environment $\omega$ is a sample point of the probability space $(\Omega, \mathfrak{S}, \mathbb{P})$. The sample space is the product space $\Omega = \mathcal{P}^{\mathbb{Z}^2}$, where $\mathcal{P}$ is the space of probability vectors of length $2M + 1$, and $\mathfrak{S}$ is the product $\sigma$-field on $\Omega$ induced by the Borel sets on $\mathcal{P}$. Throughout, we assume that $\mathbb{P}$ is a product probability measure on $\Omega$ such that the vectors $\{u_\tau(x) : (x, \tau) \in \mathbb{Z}^2\}$ are independent and identically distributed. Expectation under $\mathbb{P}$ is denoted by $\mathbb{E}$. When for notational convenience we wish to think of $u_\tau(x)$ as an infinite vector, then $u_\tau(x, y) = 0$ for $|y| > M$. We write $u^\omega_\tau(x, y)$ to make explicit the environment $\omega$, and also $\omega_{x,\tau} = u_\tau(x)$ for the environment at space-time point $(x, \tau)$.

Fix an environment $\omega$ and an initial point $(i, \tau)$. The forward and backward walks $\bar Z^{i,\tau}_m$ and $\bar X^{i,\tau}_m$ ($m \ge 0$) are defined as canonical $\mathbb{Z}^2$-valued Markov chains on their path spaces under the measure $P^\omega$ determined by the conditions

\[ P^\omega\{\bar Z^{i,\tau}_0 = (i, \tau)\} = 1, \qquad P^\omega\{\bar Z^{i,\tau}_{s+1} = (y, \tau + s + 1) \mid \bar Z^{i,\tau}_s = (x, \tau + s)\} = u_{\tau+s}(x, y - x) \]

for the forward walk, and by

\[ P^\omega\{\bar X^{i,\tau}_0 = (i, \tau)\} = 1, \qquad P^\omega\{\bar X^{i,\tau}_{s+1} = (y, \tau - s - 1) \mid \bar X^{i,\tau}_s = (x, \tau - s)\} = u_{\tau-s}(x, y - x) \]

for the backward walk. By dropping the time components $\tau$, $\tau \pm s$ and $\tau \pm s \pm 1$ from the equations we get the corresponding properties for the spatial walks $Z^{i,\tau}_s$ and $X^{i,\tau}_s$. When we consider many walks under a common environment $\omega$, it will be notationally convenient to attach the initial point $(i, \tau)$ to the walk and only the environment $\omega$ to the measure $P^\omega$.

$P^\omega$ is called the quenched distribution, and expectation under $P^\omega$ is denoted by $E^\omega$. The annealed distribution and expectation are $P(\cdot) = \mathbb{E}P^\omega(\cdot)$ and $E(\cdot) = \mathbb{E}E^\omega(\cdot)$. Under $P$ both $X^{i,\tau}_m$ and $Z^{i,\tau}_m$ are ordinary homogeneous random walks on $\mathbb{Z}$ with jump probabilities $p(i, i + j) = p(0, j) = \mathbb{E}\,u_0(0, j)$. These walks satisfy the law of large numbers with velocity

\[ V = \sum_{j \in \mathbb{Z}} p(0, j)\, j. \tag{3.3} \]

As for RAP, we also use the notation b = −V .
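The definitions above can be made concrete with a short Python sketch. It samples an i.i.d. environment on a finite window and computes the quenched law of a spatial backward walk exactly by forward propagation of probabilities. The sampling law (normalized exponential weights), the finite window standing in for $\mathbb{Z}$, and the function names `sample_environment`, `backward_law` and `quenched_mean` are assumptions made for this illustration only; they are not taken from the paper.

```python
import random

def sample_environment(width, height, M, rng):
    """Sample i.i.d. probability vectors u_tau(x) of length 2M+1.

    Returns env[tau][x] = list of weights indexed by (step + M).
    Space is the finite window 0..width-1 (an assumption of this sketch),
    time levels are 1..height.
    """
    env = {}
    for tau in range(1, height + 1):
        level = []
        for _ in range(width):
            w = [rng.expovariate(1.0) for _ in range(2 * M + 1)]
            total = sum(w)
            level.append([v / total for v in w])
        env[tau] = level
    return env

def backward_law(env, start_x, tau, steps, M):
    """Quenched law of the spatial backward walk started at (start_x, tau).

    At step s the walk sits on level tau - s and jumps with the
    vector u_{tau-s}(x, .), as in the defining conditions of the walk.
    """
    law = {start_x: 1.0}
    for s in range(steps):
        nxt = {}
        for x, p in law.items():
            u = env[tau - s][x]
            for y in range(-M, M + 1):
                nxt[x + y] = nxt.get(x + y, 0.0) + p * u[y + M]
        law = nxt
    return law

def quenched_mean(law):
    """E^omega of the walk position under the given quenched law."""
    return sum(x * p for x, p in law.items())

if __name__ == "__main__":
    rng = random.Random(0)
    M, n = 1, 20
    # window wide enough that the walk cannot leave it in n steps
    env = sample_environment(2 * M * n + 1, n, M, rng)
    law = backward_law(env, M * n, n, n, M)
    print("quenched mean displacement:", quenched_mean(law) - M * n)
```

Rerunning with different seeds shows how the quenched mean fluctuates from one environment to another, which is the quantity studied in the next subsection.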


M. Balázs, F. Rassoul-Agha, T. Seppäläinen

3.2. Limits for quenched mean processes. We start by stating the quenched invariance principle for the space-time RWRE. $\{B(t) : t \ge 0\}$ denotes standard one-dimensional Brownian motion. $D_{\mathbb{R}}[0,\infty)$ is the space of real-valued cadlag functions on $[0,\infty)$ with the standard Skorohod metric [12]. Recall the definition (2.5) of the variance $\sigma_a^2$ of the annealed walk, and assumption (2.2) that guarantees that the quenched walk has stochastic noise.

Theorem 3.1 [24]. Assume (2.2). We have these bounds on the variance of the quenched mean: there exist constants $C_1$, $C_2$ such that for all $n$,
\[ C_1 n^{1/2} \le \mathbb{E}\bigl[(E^\omega(X_n^{0,0}) - nV)^2\bigr] \le C_2 n^{1/2}. \tag{3.4} \]
For $\mathbb{P}$-almost every $\omega$, under $P^\omega$ the process $n^{-1/2}(X_{\lfloor nt\rfloor}^{0,0} - ntV)$ converges weakly to the process $B(\sigma_a^2 t)$ on the path space $D_{\mathbb{R}}[0,\infty)$ as $n \to \infty$.

Quite obviously, $X_n^{0,0}$ and $Z_n^{0,0}$ are interchangeable in the above theorem. Bounds (3.4) suggest the possibility of a weak limit for the quenched mean on the scale $n^{1/4}$. Such results are the main point of this section. For $t \ge 0$, $r \in \mathbb{R}$ we define scaled, centered quenched mean processes
\[ a_n(t,r) = n^{-1/4}\bigl\{ E^\omega\bigl(Z_{\lfloor nt\rfloor}^{\lfloor r\sqrt{n}\rfloor,\,0}\bigr) - \lfloor r\sqrt{n}\rfloor - \lfloor nt\rfloor V \bigr\} \tag{3.5} \]
for the forward walks, and
\[ y_n(t,r) = n^{-1/4}\bigl\{ E^\omega\bigl(X_{\lfloor nt\rfloor}^{\lfloor ntb\rfloor + \lfloor r\sqrt{n}\rfloor,\,\lfloor nt\rfloor}\bigr) - \lfloor r\sqrt{n}\rfloor \bigr\} \tag{3.6} \]
for the backward walks. In words, the process $a_n$ follows forward walks from level $0$ to level $\lfloor nt\rfloor$ and records centered quenched means. Process $y_n$ follows backward walks from level $\lfloor nt\rfloor$ down to level $0$ and records the centered quenched mean of the point it hits at level $0$. The initial points of the backward walks are translated by the negative of the mean drift $\lfloor ntb\rfloor$. This way the temporal processes $a_n(\cdot,r)$ and $y_n(\cdot,r)$ obtained by fixing $r$ are meaningful processes. Random variable $y_n(t,r)$ is not exactly centered, for
\[ \mathbb{E}\,y_n(t,r) = n^{-1/4}\bigl(\lfloor ntb\rfloor - \lfloor nt\rfloor b\bigr). \tag{3.7} \]
Of course this makes no difference to the limit.

Next we describe the Gaussian limiting processes. Recall the constant $\kappa$ defined in (2.14) and the function $\Gamma_q$ defined in (2.15). Let $\{a(t,r) : (t,r) \in \mathbb{R}_+ \times \mathbb{R}\}$ and $\{y(t,r) : (t,r) \in \mathbb{R}_+ \times \mathbb{R}\}$ be the mean zero Gaussian processes with covariances
\[ E\,a(s,q)a(t,r) = \Gamma_q\bigl((s\wedge t, q), (s\wedge t, r)\bigr) \]
and
\[ E\,y(s,q)y(t,r) = \Gamma_q\bigl((s,q),(t,r)\bigr) \]
for $s,t \ge 0$ and $q,r \in \mathbb{R}$. When one argument is fixed, the random function $r \mapsto y(t,r)$ is denoted by $y(t,\cdot)$ and $t \mapsto y(t,r)$ by $y(\cdot,r)$. From the covariances it follows that at a fixed time level $t$ the spatial processes $a(t,\cdot)$ and $y(t,\cdot)$ are equal in distribution. We record basic properties of these processes.


Lemma 3.1. The process $\{y(t,r)\}$ has a version with continuous paths as functions of $(t,r)$. Furthermore, it has the following Markovian structure in time. Given $0 = t_0 < t_1 < \cdots < t_n$, let $\{\tilde y(t_i - t_{i-1}, \cdot) : 1 \le i \le n\}$ be independent random functions such that $\tilde y(t_i - t_{i-1}, \cdot)$ has the distribution of $y(t_i - t_{i-1}, \cdot)$ for $i = 1, \dots, n$. Define $y^*(t_1, r) = \tilde y(t_1, r)$ for $r \in \mathbb{R}$, and then inductively for $i = 2, \dots, n$ and $r \in \mathbb{R}$,
\[ y^*(t_i, r) = \int_{\mathbb{R}} \varphi_{\sigma_a^2(t_i - t_{i-1})}(u)\, y^*(t_{i-1}, r+u)\,du + \tilde y(t_i - t_{i-1}, r). \tag{3.8} \]
Then the joint distribution of the random functions $\{y^*(t_i,\cdot) : 1 \le i \le n\}$ is the same as that of $\{y(t_i,\cdot) : 1 \le i \le n\}$ from the original process.

Sketch of proof. Consider $(s,q)$ and $(t,r)$ varying in a compact set. From the covariance comes the estimate
\[ E\bigl[(y(s,q) - y(t,r))^2\bigr] \le C\bigl(|s-t|^{1/2} + |q-r|\bigr) \tag{3.9} \]
from which, since the integrand is Gaussian,
\[ E\bigl[(y(s,q) - y(t,r))^{10}\bigr] \le C\bigl(|s-t|^{1/2} + |q-r|\bigr)^5 \le C\,\|(s,q) - (t,r)\|^{5/2}. \tag{3.10} \]
Kolmogorov's criterion implies the existence of a continuous version.

For the second statement use (3.8) to express a linear combination $\sum_{i=1}^n \theta_i y^*(t_i, r_i)$ in the form
\[ \sum_{i=1}^n \theta_i\, y^*(t_i, r_i) = \sum_{i=1}^n \int_{\mathbb{R}} \tilde y(t_i - t_{i-1}, x)\,\lambda_i(dx), \]
where the signed measures $\lambda_i$ are linear combinations of Gaussian distributions. Use this representation to compute the variance of the linear combination on the left-hand side (it is mean zero Gaussian). Observe that this variance equals
\[ \sum_{i,j} \theta_i\theta_j\, \Gamma_q\bigl((t_i,r_i),(t_j,r_j)\bigr). \]

Lemma 3.2. The process $\{a(t,r)\}$ has a version with continuous paths as functions of $(t,r)$. Furthermore, it has independent increments in time. A more precise statement follows. Given $0 = t_0 < t_1 < \cdots < t_n$, let $\{\tilde a(t_i - t_{i-1}, \cdot) : 1 \le i \le n\}$ be independent random functions such that $\tilde a(t_i - t_{i-1}, \cdot)$ has the distribution of $a(t_i - t_{i-1}, \cdot)$ for $i = 1, \dots, n$. Define $a^*(t_1, r) = \tilde a(t_1, r)$ for $r \in \mathbb{R}$, and then inductively for $i = 2, \dots, n$ and $r \in \mathbb{R}$,
\[ a^*(t_i, r) = a^*(t_{i-1}, r) + \int_{\mathbb{R}} \varphi_{\sigma_a^2 t_{i-1}}(u)\, \tilde a(t_i - t_{i-1}, r+u)\,du. \tag{3.11} \]

Then the joint distribution of the random functions {a ∗ (ti , ·) : 1 ≤ i ≤ n} is the same as that of {a(ti , ·) : 1 ≤ i ≤ n} from the original process. The proof of the lemma above is similar to the previous one so we omit it.


Remark 3.1. Processes $y$ and $a$ have representations in terms of stochastic integrals. As in Remark 2.1 let $W$ be a two-parameter Brownian motion on $\mathbb{R}_+ \times \mathbb{R}$. In more technical terms, $W$ is the orthogonal Gaussian martingale measure on $\mathbb{R}_+ \times \mathbb{R}$ with covariance $E\,W([0,s]\times A)\,W([0,t]\times B) = (s\wedge t)\,\mathrm{Leb}(A\cap B)$ for $s,t \in \mathbb{R}_+$ and bounded Borel sets $A, B \subseteq \mathbb{R}$. Then
\[ y(t,r) = \sigma_a\sqrt{\kappa} \iint_{[0,t]\times\mathbb{R}} \varphi_{\sigma_a^2(t-s)}(r-z)\,dW(s,z) \tag{3.12} \]
while
\[ a(t,r) = \sigma_a\sqrt{\kappa} \iint_{[0,t]\times\mathbb{R}} \varphi_{\sigma_a^2 s}(r-z)\,dW(s,z). \tag{3.13} \]
By the equations above we mean equality in distribution of processes. They can be verified by a comparison of covariances, as the integrals on the right are also Gaussian processes. Formula (3.12) implies that process $\{y(t,r)\}$ is a weak solution of the stochastic heat equation
\[ y_t = \tfrac12\sigma_a^2\, y_{rr} + \sigma_a\sqrt{\kappa}\,\dot W, \qquad y(0,r) \equiv 0, \tag{3.14} \]
where $\dot W$ is white noise. (See [34].) These observations are not used elsewhere in the paper.

Next we record the limits for the quenched mean processes. The four theorems that follow require assumption (2.2) of stochastic noise and the assumption that the annealed probabilities $p(0,j) = \mathbb{E}u_0^\omega(0,j)$ have span 1. This next theorem is the one needed for Theorem 2.1 for RAP.

Theorem 3.2. The finite dimensional distributions of processes $y_n(t,r)$ converge to those of $y(t,r)$ as $n \to \infty$. More precisely, for any finite set of points $\{(t_j, r_j) : 1 \le j \le k\}$ in $\mathbb{R}_+ \times \mathbb{R}$, the vector $(y_n(t_j,r_j) : 1 \le j \le k)$ converges weakly in $\mathbb{R}^k$ to the vector $(y(t_j,r_j) : 1 \le j \le k)$.

Observe that property (3.8) is easy to understand from the limit. It reflects the Markovian property
\[ E^\omega(X_\tau^{x,\tau}) = \sum_y P^\omega(X_{\tau-s}^{x,\tau} = y)\, E^\omega(X_s^{y,s}) \quad\text{for } s < \tau, \]
and the “homogenization” of the coefficients which converge to Gaussian probabilities by the quenched central limit theorem.

Let us restrict the backward quenched mean process to a single characteristic to observe the outcome. This is the source of the first term in the temporal correlations (2.19) for RAP. The next statement needs no proof, for it is just a particular case of the limit in Theorem 3.2.

Corollary 3.3. Fix $r \in \mathbb{R}$. As $n \to \infty$, the finite dimensional distributions of the process $\{y_n(t,r) : t \ge 0\}$ converge to those of the mean zero Gaussian process $\{y(t) : t \ge 0\}$ with covariance
\[ E\,y(s)y(t) = \frac{\kappa\sigma_a}{\sqrt{2\pi}}\bigl(\sqrt{t+s} - \sqrt{t-s}\bigr) \qquad (s < t). \]
Then the same for the forward processes.
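As a numerical cross-check of the covariance in Corollary 3.3, the following Python sketch compares the closed form with a quadrature of the integral that the representation (3.12) suggests. The reduction used here is an assumption of the sketch and is not spelled out in the text: by the Itô isometry and the convolution property of Gaussian kernels, $E\,y(s,r)y(t,r) = \sigma_a^2\kappa\int_0^s \varphi_{\sigma_a^2(t+s-2u)}(0)\,du$. The parameter values are arbitrary.

```python
import math

def gaussian_density_at_zero(variance):
    """Value at 0 of the centered Gaussian density with the given variance."""
    return 1.0 / math.sqrt(2.0 * math.pi * variance)

def covariance_by_quadrature(s, t, sigma_a, kappa, n=20000):
    """Midpoint rule for sigma_a^2 * kappa * int_0^s phi_{sigma_a^2 (t+s-2u)}(0) du."""
    h = s / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += gaussian_density_at_zero(sigma_a ** 2 * (t + s - 2.0 * u))
    return sigma_a ** 2 * kappa * total * h

def covariance_closed_form(s, t, sigma_a, kappa):
    """Closed form from Corollary 3.3, valid for s < t."""
    return kappa * sigma_a / math.sqrt(2.0 * math.pi) * (
        math.sqrt(t + s) - math.sqrt(t - s)
    )

if __name__ == "__main__":
    args = (1.0, 2.0, 1.3, 0.7)  # s, t, sigma_a, kappa (illustrative values)
    print(covariance_by_quadrature(*args), covariance_closed_form(*args))
```

The two numbers agree to quadrature accuracy, which is consistent with the factor $\kappa\sigma_a/\sqrt{2\pi}$ in the corollary.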


Theorem 3.4. The finite dimensional distributions of processes $a_n$ converge to those of $a$ as $n \to \infty$. More precisely, for any finite set of points $\{(t_j,r_j) : 1 \le j \le k\}$ in $\mathbb{R}_+ \times \mathbb{R}$, the vector $(a_n(t_j,r_j) : 1 \le j \le k)$ converges weakly in $\mathbb{R}^k$ to the vector $(a(t_j,r_j) : 1 \le j \le k)$.

When we specialize to a temporal process we also verify path-level tightness and hence get weak convergence of the entire process. When $r = q$ in (2.16) we get
\[ \Gamma_q\bigl((s\wedge t, r),(s\wedge t, r)\bigr) = c_a\sqrt{s\wedge t} \]
with $c_a = \sigma_D^2/(\beta\sqrt{\pi\sigma_a^2})$. Since $s\wedge t$ is the covariance of standard Brownian motion $B(\cdot)$, we get the following limit.

Corollary 3.5. Fix $r \in \mathbb{R}$. As $n \to \infty$, the process $\{a_n(t,r) : t \ge 0\}$ converges weakly to $\{B(c_a\sqrt{t}) : t \ge 0\}$ on the path space $D_{\mathbb{R}}[0,\infty)$.

4. Random Walk Preliminaries

In this section we collect some auxiliary results for random walks. The basic assumptions, (2.2) and span 1 for the $p(0,j) = \mathbb{E}u_0(0,j)$ walk, are in force throughout the remainder of the paper. Recall the drift in the $e_1$ direction at the origin defined by
\[ D(\omega) = \sum_{x\in\mathbb{Z}} x\,u_0^\omega(0,x), \]
with mean $V = -b = \mathbb{E}(D)$. Define the centered drift by $g(\omega) = D(\omega) - V = E^\omega(X_1^{0,0} - V)$. The variance is $\sigma_D^2 = \mathbb{E}[g^2]$. The variance of the i.i.d. annealed walk in the $e_1$ direction is
\[ \sigma_a^2 = \sum_{x\in\mathbb{Z}} (x - V)^2\, \mathbb{E}u_0^\omega(0,x). \]
These variances are connected by $\sigma_a^2 = \sigma_D^2 + E[(X_1^{0,0} - D)^2]$.

Let $X_n$ and $\widetilde X_n$ be two independent walks in a common environment $\omega$, and $Y_n = X_n - \widetilde X_n$. In the annealed sense $Y_n$ is a Markov chain on $\mathbb{Z}$ with transition probabilities
\[ q(0,y) = \sum_{z\in\mathbb{Z}} \mathbb{E}[u_0(0,z)\,u_0(0,z+y)] \qquad (y \in \mathbb{Z}), \]
\[ q(x,y) = \sum_{z\in\mathbb{Z}} p(0,z)\,p(0,z+y-x) \qquad (x \ne 0,\ y \in \mathbb{Z}). \]
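To illustrate these definitions, here is a small Python sketch that computes the annealed kernel $p$, the perturbed kernel $q(0,\cdot)$, the homogeneous kernel $\sum_z p(0,z)p(0,z+y)$ that governs $Y_n$ away from the origin, and the two variances, all in exact rational arithmetic. The two-point environment law `ENV_LAW` and every function name are hypothetical choices for this illustration, not objects from the paper.

```python
from fractions import Fraction as F

# Hypothetical environment law: (probability, {step: weight}) pairs.
ENV_LAW = [
    (F(1, 2), {0: F(3, 4), 1: F(1, 4)}),
    (F(1, 2), {0: F(1, 4), 1: F(3, 4)}),
]

def annealed_kernel(law):
    """p(0, j) = E u_0(0, j)."""
    p = {}
    for prob, u in law:
        for step, w in u.items():
            p[step] = p.get(step, F(0)) + prob * w
    return p

def q_at_origin(law):
    """q(0, y) = sum_z E[u_0(0, z) u_0(0, z + y)]."""
    q = {}
    for prob, u in law:
        for z, wz in u.items():
            for z2, wz2 in u.items():
                q[z2 - z] = q.get(z2 - z, F(0)) + prob * wz * wz2
    return q

def q_homogeneous(p):
    """sum_z p(0, z) p(0, z + y): the kernel of Y away from the origin."""
    qb = {}
    for z, pz in p.items():
        for z2, pz2 in p.items():
            qb[z2 - z] = qb.get(z2 - z, F(0)) + pz * pz2
    return qb

def mean(kernel):
    return sum(y * w for y, w in kernel.items())

def variance(kernel):
    m = mean(kernel)
    return sum((y - m) ** 2 * w for y, w in kernel.items())

def drift_variance(law):
    """sigma_D^2 = E[(D - V)^2] with D the quenched one-step mean."""
    V = mean(annealed_kernel(law))
    return sum(prob * (mean(u) - V) ** 2 for prob, u in law)

if __name__ == "__main__":
    p = annealed_kernel(ENV_LAW)
    print("sigma_a^2 =", variance(p), " sigma_D^2 =", drift_variance(ENV_LAW))
    print("q(0, .)   =", q_at_origin(ENV_LAW))
    print("homogeneous kernel =", q_homogeneous(p))
```

In this example the two kernels differ only through the weights at the origin, which is the perturbation discussed next.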


$Y_n$ can be thought of as a symmetric random walk on $\mathbb{Z}$ whose transition has been perturbed at the origin. The corresponding homogeneous, unperturbed transition probabilities are
\[ \bar q(x,y) = \bar q(0,y-x) = \sum_{z\in\mathbb{Z}} p(0,z)\,p(0,z+y-x) \qquad (x,y \in \mathbb{Z}). \]
The $\bar q$-walk has variance $2\sigma_a^2$ and span 1, as can be deduced from the definition and the hypothesis that the $p$-walk has span 1. Since the $\bar q$-walk is symmetric, its range must be a subgroup of $\mathbb{Z}$. Then span 1 implies that it is irreducible. The $\bar q$-walk is recurrent by the Chung-Fuchs theorem. Elementary arguments extend irreducibility and recurrence from $\bar q$ to the $q$-chain because away from the origin the two walks are the same. Note that assumption (2.2) is required here because the $q$-walk is absorbed at the origin iff (2.2) fails. Note that the functions defined in (2.7) are the characteristic functions of these transitions:
\[ \lambda(t) = \sum_x q(0,x)\,e^{itx} \quad\text{and}\quad \bar\lambda(t) = \sum_x \bar q(0,x)\,e^{itx}. \]
Multistep transitions are denoted by $q^k(x,y)$ and $\bar q^k(x,y)$, defined as usual by $\bar q^0(x,y) = 1\{x=y\}$, $\bar q^1(x,y) = \bar q(x,y)$, and
\[ \bar q^k(x,y) = \sum_{x_1,\dots,x_{k-1}\in\mathbb{Z}} \bar q(x,x_1)\,\bar q(x_1,x_2)\cdots\bar q(x_{k-1},y) \qquad (k \ge 2), \]
and similarly for $q^k$.

Green functions for the $\bar q$ and $q$-walks are
\[ \bar G_n(x,y) = \sum_{k=0}^{n} \bar q^k(x,y) \quad\text{and}\quad G_n(x,y) = \sum_{k=0}^{n} q^k(x,y). \]
$\bar G_n$ is symmetric but $G_n$ not necessarily. The potential kernel $\bar a$ of the $\bar q$-walk is defined by
\[ \bar a(x) = \lim_{n\to\infty} \bigl\{ \bar G_n(0,0) - \bar G_n(x,0) \bigr\}. \tag{4.1} \]
It satisfies $\bar a(0) = 0$, the equations
\[ \bar a(x) = \sum_{y\in\mathbb{Z}} \bar q(x,y)\,\bar a(y) \ \text{ for } x \ne 0, \quad\text{and}\quad \sum_{y\in\mathbb{Z}} \bar q(0,y)\,\bar a(y) = 1, \tag{4.2} \]
and the limit
\[ \lim_{x\to\pm\infty} \frac{\bar a(x)}{|x|} = \frac{1}{2\sigma_a^2}. \tag{4.3} \]
These facts can be found in Sects. 28 and 29 of Spitzer’s monograph [31].

Example 4.1. If for some $k \in \mathbb{Z}$, $p(0,k) + p(0,k+1) = 1$, so that $\bar q(0,x) = 0$ for $x \notin \{-1,0,1\}$, then $\bar a(x) = |x|/(2\sigma_a^2)$.


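Define the constant
\[ \beta = \sum_{x\in\mathbb{Z}} q(0,x)\,\bar a(x). \tag{4.4} \]
To see that this definition agrees with (2.8), observe that the above equality leads to
\[ \beta = \lim_{n\to\infty} \Bigl\{ \sum_{k=0}^{n} \bar q^k(0,0) - \sum_{k=0}^{n}\sum_x q(0,x)\,\bar q^k(x,0) \Bigr\}. \]
Think of the last sum over $x$ as $P[Y_1 + \bar Y_k = 0]$, where $\bar Y_k$ is the $\bar q$-walk, and $Y_1$ and $\bar Y_k$ are independent. Since $Y_1 + \bar Y_k$ has characteristic function $\lambda(t)\bar\lambda^k(t)$, we get
\[ \beta = \lim_{n\to\infty} \frac{1}{2\pi}\int_{-\pi}^{\pi} (1 - \lambda(t)) \sum_{k=0}^{n} \bar\lambda^k(t)\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{1 - \lambda(t)}{1 - \bar\lambda(t)}\,dt. \]
Ferrari and Fontes [14] begin their development by showing that
\[ \beta = \lim_{s\nearrow 1} \frac{\bar\zeta(s)}{\zeta(s)}, \]
where $\zeta$ and $\bar\zeta$ are the generating functions
\[ \zeta(s) = \sum_{k=0}^{\infty} q^k(0,0)\,s^k \quad\text{and}\quad \bar\zeta(s) = \sum_{k=0}^{\infty} \bar q^k(0,0)\,s^k. \]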
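The two expressions for $\beta$ can be compared numerically. The Python sketch below does this for a hypothetical nearest-neighbor case in which Example 4.1 applies, so that $\bar a(x) = |x|/(2\sigma_a^2)$; the kernel values `q0`, `qbar0` and `sigma_a2` are illustrative assumptions (they correspond to steps in $\{0,1\}$ with a two-point environment law), not data from the paper.

```python
import math

# Hypothetical kernels on {-1, 0, 1}: q(0, .) is the perturbed kernel at the
# origin, qbar0 the homogeneous kernel; sigma_a2 is the annealed variance.
q0 = {-1: 3 / 16, 0: 10 / 16, 1: 3 / 16}
qbar0 = {-1: 1 / 4, 0: 1 / 2, 1: 1 / 4}
sigma_a2 = 1 / 4

def char_fn(kernel, t):
    """Characteristic function; real because the kernels are symmetric."""
    return sum(w * math.cos(t * x) for x, w in kernel.items())

def beta_from_potential_kernel():
    """Definition (4.4) with the potential kernel of Example 4.1."""
    return sum(w * abs(x) / (2 * sigma_a2) for x, w in q0.items())

def beta_from_integral(n=2000):
    """Midpoint rule for (1/2pi) * int (1 - lambda) / (1 - lambda_bar) dt.

    With n even the midpoints never hit t = 0, where the ratio is 0/0.
    """
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = -math.pi + (i + 0.5) * h
        total += (1 - char_fn(q0, t)) / (1 - char_fn(qbar0, t))
    return total * h / (2 * math.pi)

if __name__ == "__main__":
    print(beta_from_potential_kernel(), beta_from_integral())
```

In this special case the integrand is constant, because both numerator and denominator are multiples of $1 - \cos t$, so the agreement is exact up to rounding. For kernels with longer range the integral formula remains usable while the closed form for $\bar a$ does not.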

Our development bypasses the generating functions. We begin with the asymptotics of the Green functions. This is the key to all our results, both for RWRE and RAP. As already pointed out, without assumption (2.2) the result would be completely wrong because the $q$-walk absorbs at 0, while a span $h > 1$ would appear in this limit as an extra factor.

Lemma 4.1. Let $x \in \mathbb{R}$, and let $x_n$ be any sequence of integers such that $x_n - n^{1/2}x$ stays bounded. Then
\[ \lim_{n\to\infty} n^{-1/2}\, G_n(x_n, 0) = \frac{1}{2\beta\sigma_a^2} \int_0^{2\sigma_a^2} \frac{1}{\sqrt{2\pi v}} \exp\Bigl\{-\frac{x^2}{2v}\Bigr\}\,dv. \tag{4.5} \]

Proof. For the homogeneous $\bar q$-walk the local limit theorem [11, Sect. 2.5] implies that
\[ \lim_{n\to\infty} n^{-1/2}\, \bar G_n(0, x_n) = \frac{1}{2\sigma_a^2} \int_0^{2\sigma_a^2} \frac{1}{\sqrt{2\pi v}} \exp\Bigl\{-\frac{x^2}{2v}\Bigr\}\,dv \tag{4.6} \]
and by symmetry the same limit is true for $n^{-1/2}\bar G_n(x_n, 0)$. In particular,
\[ \lim_{n\to\infty} n^{-1/2}\, \bar G_n(0,0) = \frac{1}{\sqrt{\pi\sigma_a^2}}. \tag{4.7} \]
Next we show
\[ \lim_{n\to\infty} n^{-1/2}\, G_n(0,0) = \frac{1}{\beta\sqrt{\pi\sigma_a^2}}. \tag{4.8} \]


Using (4.2), $\bar a(0) = 0$, and $\bar q(x,y) = q(x,y)$ for $x \ne 0$ we develop
\[
\begin{aligned}
\sum_{x\in\mathbb{Z}} q^m(0,x)\,\bar a(x) &= \sum_{x\ne 0} q^m(0,x)\,\bar a(x) = \sum_{x\ne 0,\,y\in\mathbb{Z}} q^m(0,x)\,\bar q(x,y)\,\bar a(y) \\
&= \sum_{x\ne 0,\,y\in\mathbb{Z}} q^m(0,x)\,q(x,y)\,\bar a(y) \\
&= \sum_{y\in\mathbb{Z}} q^{m+1}(0,y)\,\bar a(y) - q^m(0,0)\sum_{y\in\mathbb{Z}} q(0,y)\,\bar a(y).
\end{aligned}
\]
Identify $\beta$ in the last sum above and sum over $m = 0, 1, \dots, n-1$ to get
\[ \bigl(1 + q(0,0) + \cdots + q^{n-1}(0,0)\bigr)\,\beta = \sum_{x\in\mathbb{Z}} q^n(0,x)\,\bar a(x). \]
Write this in the form
\[ n^{-1/2}\, G_{n-1}(0,0)\,\beta = n^{-1/2}\, E_0\bigl[\bar a(Y_n)\bigr]. \]
Recall that $Y_n = X_n - \widetilde X_n$, where $X_n$ and $\widetilde X_n$ are two independent walks in the same environment. Thus by Theorem 3.1 $n^{-1/2}Y_n$ converges weakly to a centered Gaussian with variance $2\sigma_a^2$. Under the annealed measure the walks $X_n$ and $\widetilde X_n$ are ordinary i.i.d. walks with bounded steps, hence there is enough uniform integrability to conclude that $n^{-1/2}E_0|Y_n| \to 2\sqrt{\sigma_a^2/\pi}$. By (4.3) and straightforward estimation,
\[ n^{-1/2}\, E_0\bigl[\bar a(Y_n)\bigr] \to \frac{1}{\sqrt{\sigma_a^2\pi}}. \]
This proves (4.8). From (4.7)–(4.8) we take the conclusion
\[ \lim_{n\to\infty} \frac{1}{\sqrt{n}}\,\bigl|\beta G_n(0,0) - \bar G_n(0,0)\bigr| = 0. \tag{4.9} \]
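The telescoping identity $\beta\,G_{n-1}(0,0) = E_0[\bar a(Y_n)]$ used in this step can be verified exactly on a small example. The Python sketch below assumes a hypothetical nearest-neighbor chain with jump rate `C` at the origin and `D` elsewhere, for which Example 4.1 gives $\bar a(x) = |x|/(2D)$ and hence $\beta = C/D$; the numerical values are illustrative only.

```python
from fractions import Fraction as F

C = F(3, 16)   # hypothetical q(0, +1) = q(0, -1): perturbed rate at the origin
D = F(1, 4)    # hypothetical q(x, x +/- 1) for x != 0; equals sigma_a^2 here
BETA = C / D   # beta from (4.4) with a_bar(x) = |x| / (2 D)

def step(dist):
    """Apply one step of the q-chain to a distribution {site: mass}."""
    out = {}
    for x, m in dist.items():
        rate = C if x == 0 else D
        for y, w in ((x - 1, rate), (x, 1 - 2 * rate), (x + 1, rate)):
            out[y] = out.get(y, F(0)) + m * w
    return out

def check_identity(n):
    """Return (beta * G_{n-1}(0,0), E_0[a_bar(Y_n)]) as exact fractions."""
    dist = {0: F(1)}
    green = F(0)                      # accumulates sum_{m<n} q^m(0,0)
    for _ in range(n):
        green += dist.get(0, F(0))
        dist = step(dist)
    expected_a = sum(m * abs(x) / (2 * D) for x, m in dist.items())
    return BETA * green, expected_a

if __name__ == "__main__":
    for n in (1, 5, 25):
        lhs, rhs = check_identity(n)
        print(n, lhs == rhs, float(lhs) / n ** 0.5)
```

The printed ratio $n^{-1/2}\beta G_{n-1}(0,0)$ can be compared by eye with the limit $1/\sqrt{\pi\sigma_a^2}$ from the argument above; exact equality of the two sides holds for every $n$.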

Let $f^0(z,0) = 1\{z=0\}$ and for $k \ge 1$ let
\[ f^k(z,0) = 1\{z\ne 0\} \sum_{z_1\ne 0,\dots,z_{k-1}\ne 0} q(z,z_1)\,q(z_1,z_2)\cdots q(z_{k-1},0). \]
This is the probability that the first visit to the origin occurs at time $k$, including a possible first visit at time 0. Note that this quantity is the same for the $q$ and $\bar q$ walks. Now bound
\[ \sup_{z\in\mathbb{Z}} \Bigl| \frac{\beta}{\sqrt{n}}\,G_n(z,0) - \frac{1}{\sqrt{n}}\,\bar G_n(z,0) \Bigr| \le \sup_{z\in\mathbb{Z}} \frac{1}{\sqrt{n}} \sum_{k=0}^{n} f^k(z,0)\,\bigl|\beta G_{n-k}(0,0) - \bar G_{n-k}(0,0)\bigr|. \]
To see that the last line vanishes as $n \to \infty$, by (4.9) choose $n_0$ so that $|\beta G_{n-k}(0,0) - \bar G_{n-k}(0,0)| \le \varepsilon\sqrt{n-k}$ for $k \le n - n_0$, while trivially $|\beta G_{n-k}(0,0) - \bar G_{n-k}(0,0)| \le Cn_0$ for $n - n_0 < k \le n$. The conclusion (4.5) now follows from this and (4.6).


Lemma 4.2.
\[ \sup_{n\ge 1}\,\sup_{x\in\mathbb{Z}} \bigl| G_n(x,0) - G_n(x+1,0) \bigr| < \infty. \]

Proof. Let $T_y = \inf\{n \ge 1 : Y_n = y\}$ denote the first hitting time of the point $y$. Then
\[
\begin{aligned}
G_n(x,0) &= E_x\Bigl[\sum_{k=0}^{n} 1\{Y_k = 0\}\Bigr] = E_x\Bigl[\sum_{k=0}^{T_y\wedge n} 1\{Y_k = 0\}\Bigr] + E_x\Bigl[\sum_{k=T_y\wedge n+1}^{n} 1\{Y_k = 0\}\Bigr] \\
&\le E_x\Bigl[\sum_{k=0}^{T_y} 1\{Y_k = 0\}\Bigr] + G_n(y,0).
\end{aligned}
\]
In an irreducible Markov chain the expectation $E_x\bigl[\sum_{k=0}^{T_y} 1\{Y_k = 0\}\bigr]$ is finite for any given states $x, y$ [8, Theorem 3 in Sect. I.9]. Since this is independent of $n$, the inequalities above show that
\[ \sup_n\,\sup_{-a\le x\le a} \bigl| G_n(x,0) - G_n(x+1,0) \bigr| < \infty \tag{4.10} \]
for any fixed $a$. Fix a positive integer $a$ larger than the range of the jump kernels $q(x,y)$ and $\bar q(x,y)$. Consider $x > a$. Let $\sigma = \inf\{n \ge 1 : Y_n \le a-1\}$ and $\tau = \inf\{n \ge 1 : Y_n \le a\}$. Since the $q$-walks starting at $x$ and $x+1$ obey the translation-invariant kernel $\bar q$ until they hit the origin, $P_x[Y_\sigma = y, \sigma = n] = P_{x+1}[Y_\tau = y+1, \tau = n]$. (Any path that starts at $x$ and enters $[0, a-1]$ at $y$ can be translated by 1 to a path that starts at $x+1$ and enters $[0,a]$ at $y+1$, without changing its probability.) Consequently
\[ G_n(x,0) - G_n(x+1,0) = \sum_{k=1}^{n}\sum_{y=0}^{a-1} P_x[Y_\sigma = y, \sigma = k]\,\bigl( G_{n-k}(y,0) - G_{n-k}(y+1,0) \bigr). \]
Together with (4.10) this shows that the quantity in the statement of the lemma is uniformly bounded over $x \ge 0$. The same argument works for $x \le 0$.

One can also derive the limit
\[ \lim_{n\to\infty} \bigl\{ G_n(0,0) - G_n(x,0) \bigr\} = \beta^{-1}\bar a(x) \]
but we have no need for this.

Lastly, a moderate deviation bound for the space-time RWRE with bounded steps. Let $X_s^{i,\tau}$ be the spatial backward walk defined in Sect. 3 with the bound (3.2) on the steps. Let $\widetilde X_s^{i,\tau} = X_s^{i,\tau} - i - Vs$ be the centered walk.

Lemma 4.3. For $m, n \in \mathbb{N}$, let $(i(m,n), \tau(m,n)) \in \mathbb{Z}^2$, $v(n) \ge 1$, and let $s(n) \to \infty$ be a sequence of positive integers. Let $\alpha$, $\gamma$ and $c$ be positive reals. Assume
\[ \sum_{n=1}^{\infty} v(n)\,s(n)^\alpha \exp\{-c\,s(n)^\gamma\} < \infty. \]


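Then for $\mathbb{P}$-almost every $\omega$,
\[ \lim_{n\to\infty}\ \max_{1\le m\le v(n)} s(n)^\alpha\, P^\omega\Bigl\{ \max_{1\le k\le s(n)} \widetilde X_k^{i(m,n),\tau(m,n)} \ge c\,s(n)^{\frac12+\gamma} \Bigr\} = 0. \tag{4.11} \]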

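Proof. Fix $\varepsilon > 0$. By Markov’s inequality and translation-invariance,
\[
\begin{aligned}
&\mathbb{P}\Bigl\{ \omega : \max_{1\le m\le v(n)} s(n)^\alpha\, P^\omega\Bigl\{ \max_{1\le k\le s(n)} \widetilde X_k^{i(m,n),\tau(m,n)} \ge c\,s(n)^{\frac12+\gamma} \Bigr\} \ge \varepsilon \Bigr\} \\
&\qquad \le \varepsilon^{-1} s(n)^\alpha\, v(n)\, P\Bigl\{ \max_{1\le k\le s(n)} \widetilde X_k^{0,0} \ge c\,s(n)^{\frac12+\gamma} \Bigr\}.
\end{aligned}
\]
Under the annealed measure $P$, $\widetilde X_k^{0,0}$ is an ordinary homogeneous mean zero random walk with bounded steps. It has a finite moment generating function $\phi(\lambda) = \log E(\exp\{\lambda \widetilde X_1^{0,0}\})$ that satisfies $\phi(\lambda) = O(\lambda^2)$ for small $\lambda$. Apply Doob’s inequality to the martingale $M_k = \exp(\lambda \widetilde X_k^{0,0} - k\phi(\lambda))$, note that $\phi(\lambda) \ge 0$, and choose a constant $a_1$ such that $\phi(\lambda) \le a_1\lambda^2$ for small $\lambda$. This gives
\[
\begin{aligned}
P\Bigl\{ \max_{1\le k\le s(n)} \widetilde X_k^{0,0} \ge c\,s(n)^{\frac12+\gamma} \Bigr\}
&\le P\Bigl\{ \max_{1\le k\le s(n)} M_k \ge \exp\bigl( c\lambda s(n)^{\frac12+\gamma} - s(n)\phi(\lambda) \bigr) \Bigr\} \\
&\le \exp\bigl( -c\lambda s(n)^{\frac12+\gamma} + a_1 s(n)\lambda^2 \bigr) = e^{a_1}\cdot\exp\{-c\,s(n)^\gamma\},
\end{aligned}
\]
where we took $\lambda = s(n)^{-1/2}$. The conclusion of the lemma now follows from the hypothesis and Borel-Cantelli.

5. Proofs for Backward Walks in a Random Environment

Here are two further notational conventions used in the proofs. The environment configuration at a fixed time level is denoted by $\bar\omega_n = \{\omega_{x,n} : x \in \mathbb{Z}\}$. Translations on $\Omega$ are defined by $(T_{x,n}\omega)_{y,k} = \omega_{x+y,n+k}$.

5.1. Proof of Theorem 3.2. This proof proceeds in two stages. First in Lemma 5.1 convergence is proved for finite-dimensional distributions at a fixed $t$-level. In the second stage the convergence is extended to multiple $t$-levels via the natural Markovian property that we express in terms of $y_n$ next. Abbreviate $X_k^{n,t,r} = X_k^{\lfloor ntb\rfloor + \lfloor r\sqrt{n}\rfloor,\,\lfloor nt\rfloor}$. Then for $0 \le s < t$,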

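\[
\begin{aligned}
y_n(t,r) &= n^{-1/4}\bigl\{ E^\omega(X_{\lfloor nt\rfloor}^{n,t,r}) - \lfloor r\sqrt{n}\rfloor \bigr\} \\
&= \sum_{z\in\mathbb{Z}} P^\omega\bigl\{ X_{\lfloor nt\rfloor - \lfloor ns\rfloor}^{n,t,r} = \lfloor nsb\rfloor + z \bigr\}\, n^{-1/4}\bigl\{ E^\omega(X_{\lfloor ns\rfloor}^{\lfloor nsb\rfloor + z,\,\lfloor ns\rfloor}) - z \bigr\} \\
&\quad + \sum_{z\in\mathbb{Z}} P^\omega\bigl\{ X_{\lfloor nt\rfloor - \lfloor ns\rfloor}^{n,t,r} = \lfloor nsb\rfloor + z \bigr\}\, n^{-1/4}\bigl\{ z - \lfloor r\sqrt{n}\rfloor \bigr\} \\
&= \sum_{z\in\mathbb{Z}} P^\omega\bigl\{ X_{\lfloor nt\rfloor - \lfloor ns\rfloor}^{n,t,r} = \lfloor nsb\rfloor + z \bigr\}\, n^{-1/4}\bigl\{ E^\omega(X_{\lfloor ns\rfloor}^{\lfloor nsb\rfloor + z,\,\lfloor ns\rfloor}) - z \bigr\} \\
&\quad + n^{-1/4}\bigl\{ E^\omega(X_{\lfloor nt\rfloor - \lfloor ns\rfloor}^{n,t,r}) - \lfloor nsb\rfloor - \lfloor r\sqrt{n}\rfloor \bigr\} \\
&= \sum_{z\in\mathbb{Z}} P^\omega\bigl\{ X_{\lfloor nt\rfloor - \lfloor ns\rfloor}^{n,t,r} = \lfloor nsb\rfloor + z \bigr\}\, n^{-1/4}\bigl\{ E^\omega(X_{\lfloor ns\rfloor}^{\lfloor nsb\rfloor + z,\,\lfloor ns\rfloor}) - z \bigr\} &\text{(5.1)} \\
&\quad + y_n(u_n, r)\circ T_{\lfloor ntb\rfloor - \lfloor nbu_n\rfloor,\,\lfloor nt\rfloor - \lfloor nu_n\rfloor} + n^{-1/4}\bigl( \lfloor ntb\rfloor - \lfloor nsb\rfloor - \lfloor nbu_n\rfloor \bigr), &\text{(5.2)}
\end{aligned}
\]
where we defined $u_n = n^{-1}(\lfloor nt\rfloor - \lfloor ns\rfloor)$ so that $\lfloor nu_n\rfloor = \lfloor nt\rfloor - \lfloor ns\rfloor$. $T_{x,m}$ denotes the translation of the random environment that makes $(x,m)$ the new space-time origin, in other words $(T_{x,m}\omega)_{y,n} = \omega_{x+y,m+n}$.

The key to making use of the decomposition of $y_n(t,r)$ given on lines (5.1) and (5.2) is that the quenched expectations
\[ E^\omega\bigl(X_{\lfloor ns\rfloor}^{\lfloor nsb\rfloor + z,\,\lfloor ns\rfloor}\bigr) \quad\text{and}\quad y_n(u_n, r)\circ T_{\lfloor ntb\rfloor - \lfloor nbu_n\rfloor,\,\lfloor nt\rfloor - \lfloor nu_n\rfloor} \]
are independent because they are functions of environments $\bar\omega_m$ on disjoint sets of levels $m$, while the coefficients $P^\omega\{ X_{\lfloor nt\rfloor - \lfloor ns\rfloor}^{n,t,r} = \lfloor nsb\rfloor + z \}$ on line (5.1) converge (in probability) to Gaussian probabilities by the quenched CLT as $n \to \infty$. In the limit this decomposition becomes (3.8).

Because of the little technicality of matching $\lfloor nt\rfloor - \lfloor ns\rfloor$ with $\lfloor n(t-s)\rfloor$ we state the next lemma for a sequence $t_n \to t$ instead of a fixed $t$.

Lemma 5.1. Fix $t > 0$, and finitely many reals $r_1 < r_2 < \dots < r_N$. Let $t_n$ be a sequence of positive reals such that $t_n \to t$. Then as $n \to \infty$ the $\mathbb{R}^N$-valued vector $(y_n(t_n,r_1), \dots, y_n(t_n,r_N))$ converges weakly to a mean zero Gaussian vector with covariance matrix $\{\Gamma_q((t,r_i),(t,r_j)) : 1 \le i,j \le N\}$ with $\Gamma_q$ as defined in (2.15).

The proof of Lemma 5.1 is technical (martingale CLT and random walk estimates), so we postpone it and proceed with the main development.

Proof of Theorem 3.2. The argument is inductive on the number $M$ of time points in the finite-dimensional distribution. The induction assumption is that
\[ [y_n(t_i,r_j) : 1\le i\le M,\ 1\le j\le N] \to [y(t_i,r_j) : 1\le i\le M,\ 1\le j\le N] \ \text{ weakly on } \mathbb{R}^{MN} \tag{5.3} \]
for any $M$ time points $0 \le t_1 < t_2 < \cdots < t_M$ and for any reals $r_1, \dots, r_N$ for any finite $N$.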

The case $M = 1$ comes from Lemma 5.1. To handle the case $M+1$, let $0 \le t_1 < t_2 < \cdots < t_{M+1}$, and fix an arbitrary $(M+1)N$-vector $[\theta_{i,j}]$. By the Cramér-Wold device, it suffices to show the weak convergence of the linear combination
\[ \sum_{\substack{1\le i\le M+1\\ 1\le j\le N}} \theta_{i,j}\,y_n(t_i,r_j) = \sum_{\substack{1\le i\le M\\ 1\le j\le N}} \theta_{i,j}\,y_n(t_i,r_j) + \sum_{1\le j\le N} \theta_{M+1,j}\,y_n(t_{M+1},r_j), \tag{5.4} \]
where we separated out the $(M+1)$-term to be manipulated. The argument will use (5.1)–(5.2) to replace the values at $t_{M+1}$ with values at $t_M$ plus terms independent of the rest. For Borel sets $B \subseteq \mathbb{R}$ define the probability measure
\[ p_{n,j}^\omega(B) = P^\omega\Bigl\{ X_{\lfloor nt_{M+1}\rfloor - \lfloor nt_M\rfloor}^{\lfloor nt_{M+1}b\rfloor + \lfloor r_j\sqrt{n}\rfloor,\,\lfloor nt_{M+1}\rfloor} - \lfloor nt_M b\rfloor \in B \Bigr\}. \]


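Apply the decomposition (5.1)–(5.2), with $s_n = n^{-1}(\lfloor nt_{M+1}\rfloor - \lfloor nt_M\rfloor)$ and
\[ \tilde y_n(s_n, r_j) = y_n(s_n, r_j)\circ T_{\lfloor nt_{M+1}b\rfloor - \lfloor ns_n b\rfloor,\,\lfloor nt_{M+1}\rfloor - \lfloor ns_n\rfloor}, \]
to get
\[ y_n(t_{M+1}, r_j) = \sum_{z\in\mathbb{Z}} p_{n,j}^\omega(z)\, n^{-1/4}\bigl\{ E^\omega\bigl(X_{\lfloor nt_M\rfloor}^{\lfloor nt_M b\rfloor + z,\,\lfloor nt_M\rfloor}\bigr) - z \bigr\} + \tilde y_n(s_n, r_j) + O(n^{-1/4}). \tag{5.5} \]
The $O(n^{-1/4})$ term above is $n^{-1/4}(\lfloor nt_{M+1}b\rfloor - \lfloor nt_M b\rfloor - \lfloor ns_n b\rfloor)$, a deterministic quantity. Next we reorganize the sum in (5.5) to take advantage of Lemma 5.1. Given $a > 0$, define a partition of $[-a,a]$ by
\[ -a = u_0 < u_1 < \cdots < u_L = a \]
with mesh $\Delta = \max_\ell\{u_{\ell+1} - u_\ell\}$. For integers $z$ such that $-a\sqrt{n} < z \le a\sqrt{n}$, let $u(z)$ denote the value $u_\ell$ such that $u_\ell\sqrt{n} < z \le u_{\ell+1}\sqrt{n}$. For $1 \le j \le N$ define an error term by
\[
\begin{aligned}
R_{n,j}(a) &= n^{-1/4} \sum_{z=\lfloor -a\sqrt{n}\rfloor+1}^{\lfloor a\sqrt{n}\rfloor} p_{n,j}^\omega(z) \Bigl( \bigl\{ E^\omega\bigl(X_{\lfloor nt_M\rfloor}^{\lfloor nt_M b\rfloor + z,\,\lfloor nt_M\rfloor}\bigr) - z \bigr\} \\
&\qquad\qquad - \bigl\{ E^\omega\bigl(X_{\lfloor nt_M\rfloor}^{\lfloor nt_M b\rfloor + \lfloor u(z)\sqrt{n}\rfloor,\,\lfloor nt_M\rfloor}\bigr) - \lfloor u(z)\sqrt{n}\rfloor \bigr\} \Bigr) &\text{(5.6)} \\
&\quad + n^{-1/4} \sum_{z\le\lfloor -a\sqrt{n}\rfloor,\ z>\lfloor a\sqrt{n}\rfloor} p_{n,j}^\omega(z)\,\bigl\{ E^\omega\bigl(X_{\lfloor nt_M\rfloor}^{\lfloor nt_M b\rfloor + z,\,\lfloor nt_M\rfloor}\bigr) - z \bigr\}. &\text{(5.7)}
\end{aligned}
\]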

With this we can rewrite (5.5) as
\[ y_n(t_{M+1}, r_j) = \sum_{\ell=0}^{L-1} p_{n,j}^\omega(u_\ell n^{1/2}, u_{\ell+1}n^{1/2}]\; y_n(t_M, u_\ell) + \tilde y_n(s_n, r_j) + R_{n,j}(a) + O(n^{-1/4}). \tag{5.8} \]
Let $\gamma$ denote a normal distribution on $\mathbb{R}$ with mean zero and variance $\sigma_a^2(t_{M+1} - t_M)$. According to the quenched CLT Theorem 3.1,
\[ p_{n,j}^\omega(u_\ell n^{1/2}, u_{\ell+1}n^{1/2}] \to \gamma(u_\ell - r_j, u_{\ell+1} - r_j] \quad\text{in } \mathbb{P}\text{-probability as } n\to\infty. \tag{5.9} \]
In view of (5.4) and (5.8), we can write
\[ \sum_{\substack{1\le i\le M+1\\ 1\le j\le N}} \theta_{i,j}\,y_n(t_i,r_j) = \sum_{\substack{1\le i\le M\\ 1\le k\le K}} \rho_{n,i,k}^\omega\, y_n(t_i, v_k) + \sum_{1\le j\le N} \theta_{M+1,j}\,\tilde y_n(s_n, r_j) + R_n(a) + O(n^{-1/4}). \tag{5.10} \]
Above the spatial points $\{v_k\}$ are a relabeling of $\{r_j, u_\ell\}$, and the $\omega$-dependent coefficients $\rho_{n,i,k}^\omega$ contain constants $\theta_{i,j}$, probabilities $p_{n,j}^\omega(u_\ell n^{1/2}, u_{\ell+1}n^{1/2}]$, and zeroes. The constant limits $\rho_{n,i,k}^\omega \to \rho_{i,k}$ exist in $\mathbb{P}$-probability as $n \to \infty$. The error in (5.10) is $R_n(a) = \sum_j \theta_{M+1,j}R_{n,j}(a)$.


The variables $\tilde y_n(s_n, r_j)$ are functions of the environments $\{\bar\omega_m : \lfloor nt_{M+1}\rfloor \ge m > \lfloor nt_M\rfloor\}$ and hence independent of $y_n(t_i, v_k)$ for $1 \le i \le M$, which are functions of $\{\bar\omega_m : \lfloor nt_M\rfloor \ge m > 0\}$. On a probability space on which the limit process $\{y(t,r)\}$ has been defined, let $\tilde y(t_{M+1} - t_M, \cdot)$ be a random function distributed like $y(t_{M+1} - t_M, \cdot)$ but independent of $\{y(t,r)\}$. Let $f$ be a bounded Lipschitz continuous function on $\mathbb{R}$, with Lipschitz constant $C_f$. The goal is to show that the top line (5.11) below vanishes as $n \to \infty$. Add and subtract terms to decompose (5.11) into three differences:
\[
\begin{aligned}
&\mathbb{E} f\Bigl( \sum_{\substack{1\le i\le M+1\\ 1\le j\le N}} \theta_{i,j}\,y_n(t_i,r_j) \Bigr) - E f\Bigl( \sum_{\substack{1\le i\le M+1\\ 1\le j\le N}} \theta_{i,j}\,y(t_i,r_j) \Bigr) &\text{(5.11)} \\
&= \Bigl\{ \mathbb{E} f\Bigl( \sum_{\substack{1\le i\le M+1\\ 1\le j\le N}} \theta_{i,j}\,y_n(t_i,r_j) \Bigr) - \mathbb{E} f\Bigl( \sum_{\substack{1\le i\le M\\ 1\le k\le K}} \rho_{n,i,k}^\omega\,y_n(t_i,v_k) + \sum_{1\le j\le N} \theta_{M+1,j}\,\tilde y_n(s_n,r_j) \Bigr) \Bigr\} &\text{(5.12)} \\
&\quad + \Bigl\{ \mathbb{E} f\Bigl( \sum_{\substack{1\le i\le M\\ 1\le k\le K}} \rho_{n,i,k}^\omega\,y_n(t_i,v_k) + \sum_{1\le j\le N} \theta_{M+1,j}\,\tilde y_n(s_n,r_j) \Bigr) \\
&\qquad\quad - E f\Bigl( \sum_{\substack{1\le i\le M\\ 1\le k\le K}} \rho_{i,k}\,y(t_i,v_k) + \sum_{1\le j\le N} \theta_{M+1,j}\,\tilde y(t_{M+1}-t_M,r_j) \Bigr) \Bigr\} &\text{(5.13)} \\
&\quad + \Bigl\{ E f\Bigl( \sum_{\substack{1\le i\le M\\ 1\le k\le K}} \rho_{i,k}\,y(t_i,v_k) + \sum_{1\le j\le N} \theta_{M+1,j}\,\tilde y(t_{M+1}-t_M,r_j) \Bigr) - E f\Bigl( \sum_{\substack{1\le i\le M+1\\ 1\le j\le N}} \theta_{i,j}\,y(t_i,r_j) \Bigr) \Bigr\}. &\text{(5.14)}
\end{aligned}
\]
The remainder of the proof consists in treating the three differences of expectations (5.12)–(5.14). By the Lipschitz assumption and (5.10), the difference (5.12) is bounded by $C_f\,\mathbb{E}|R_n(a)| + O(n^{-1/4})$. We need to bound $R_n(a)$. Recall that $\gamma$ is an $N(0, \sigma_a^2(t_{M+1} - t_M))$-distribution.


Lemma 5.2. There exist constants $C_1$ and $a_0$ such that, if $a > a_0$, then for any partition $\{u_\ell\}$ of $[-a,a]$ with mesh $\Delta$, and for any $1 \le j \le N$,
\[ \limsup_{n\to\infty} \mathbb{E}|R_{n,j}(a)| \le C_1\bigl( \sqrt{\Delta} + \gamma(-\infty,-a/2) + \gamma(a/2,\infty) \bigr). \]

We postpone the proof of Lemma 5.2. From this lemma, given $\varepsilon > 0$, we can choose first $a$ large enough and then $\Delta$ small enough so that
\[ \limsup_{n\to\infty}\ [\,\text{difference (5.12)}\,] \le \varepsilon/2. \]
Difference (5.13) vanishes as $n \to \infty$, due to the induction assumption (5.3), the limits $\rho_{n,i,k}^\omega \to \rho_{i,k}$ in probability, and the next lemma. Notice that we are not trying to invoke the induction assumption (5.3) for $M+1$ time points $\{t_1, \dots, t_M, s_n\}$. Instead, the induction assumption is applied to the first sum inside $f$ in (5.13). To the second sum apply Lemma 5.1, noting that $s_n \to t_{M+1} - t_M$. The two sums are independent of each other, as already observed after (5.10), so they converge jointly. This point is made precise in the next lemma.

Lemma 5.3. Fix a positive integer $k$. For each $n$, let $V_n = (V_n^1, \dots, V_n^k)$, $X_n = (X_n^1, \dots, X_n^k)$, and $\zeta_n$ be random variables on a common probability space. Assume that $X_n$ and $\zeta_n$ are independent of each other for each $n$. Let $v$ be a constant $k$-vector, $X$ another random $k$-vector, and $\zeta$ a random variable. Assume the weak limits $V_n \to v$, $X_n \to X$, and $\zeta_n \to \zeta$ hold marginally. Then we have the weak limit
\[ V_n\cdot X_n + \zeta_n \to v\cdot X + \zeta, \]
where the $X$ and $\zeta$ on the right are independent.

To prove this lemma, write
\[ V_n\cdot X_n + \zeta_n = (V_n - v)\cdot X_n + v\cdot X_n + \zeta_n \]
and note that since $V_n \to v$ in probability, tightness of $\{X_n\}$ implies that $(V_n - v)\cdot X_n \to 0$ in probability. As mentioned, it applies to show that
\[ \lim_{n\to\infty}\ [\,\text{difference (5.13)}\,] = 0. \]

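It remains to examine the difference (5.14). From a consideration of how the coefficients $\rho_{n,i,k}^\omega$ in (5.10) arise and from the limit (5.9),
\[
\begin{aligned}
&\sum_{\substack{1\le i\le M\\ 1\le k\le K}} \rho_{i,k}\,y(t_i,v_k) + \sum_{1\le j\le N} \theta_{M+1,j}\,\tilde y(t_{M+1}-t_M, r_j) = \sum_{\substack{1\le i\le M\\ 1\le j\le N}} \theta_{i,j}\,y(t_i,r_j) \\
&\qquad + \sum_{1\le j\le N} \theta_{M+1,j} \Bigl( \sum_{\ell=0}^{L-1} \gamma(u_\ell - r_j, u_{\ell+1} - r_j]\; y(t_M, u_\ell) + \tilde y(t_{M+1}-t_M, r_j) \Bigr).
\end{aligned}
\]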


The first sum after the equality sign matches all but the $(i = M+1)$-terms in the last sum in (5.14). By virtue of the Markov property in (3.8) we can represent the variables $y(t_{M+1}, r_j)$ in the last sum in (5.14) by
\[ y(t_{M+1}, r_j) = \int_{\mathbb{R}} \varphi_{\sigma_a^2(t_{M+1}-t_M)}(u - r_j)\, y(t_M, u)\,du + \tilde y(t_{M+1}-t_M, r_j). \]
Then by the Lipschitz property of $f$ it suffices to show that, for each $1 \le j \le N$, the expectation
\[ E\,\Bigl| \int_{\mathbb{R}} \varphi_{\sigma_a^2(t_{M+1}-t_M)}(u - r_j)\, y(t_M, u)\,du - \sum_{\ell=0}^{L-1} \gamma(u_\ell - r_j, u_{\ell+1} - r_j]\; y(t_M, u_\ell) \Bigr| \]
can be made small by choice of $a > 0$ and the partition $\{u_\ell\}$. This follows from the moment bounds (3.9) on the increments of the $y$-process and we omit the details. We have shown that if $a$ is large enough and then $\Delta$ small enough,
\[ \limsup_{n\to\infty}\ [\,\text{difference (5.14)}\,] \le \varepsilon/2. \]
To summarize, given bounded Lipschitz $f$ and $\varepsilon > 0$, by choosing $a > 0$ large enough and the partition $\{u_\ell\}$ of $[-a,a]$ fine enough,
\[ \limsup_{n\to\infty}\ \Bigl| \mathbb{E} f\Bigl( \sum_{\substack{1\le i\le M+1\\ 1\le j\le N}} \theta_{i,j}\,y_n(t_i,r_j) \Bigr) - E f\Bigl( \sum_{\substack{1\le i\le M+1\\ 1\le j\le N}} \theta_{i,j}\,y(t_i,r_j) \Bigr) \Bigr| \le \varepsilon. \]

This completes the proof of the induction step and thereby the proof of Theorem 3.2. It remains to verify the lemmas that were used along the way.

Proof of Lemma 5.2. We begin with a calculation. Here it is convenient to use the space-time walk $\bar X_k^{x,m} = (X_k^{x,m}, m-k)$. First observe that
\[
\begin{aligned}
E^\omega(X_n^{x,m}) - x - nV &= \sum_{k=0}^{n-1} E^\omega\bigl[ X_{k+1}^{x,m} - X_k^{x,m} - V \bigr] \\
&= \sum_{k=0}^{n-1} E^\omega\Bigl[ E^{T_{\bar X_k^{x,m}}\omega}\bigl( X_1^{0,0} - V \bigr) \Bigr] \\
&= \sum_{k=0}^{n-1} E^\omega\, g\bigl( T_{\bar X_k^{x,m}}\omega \bigr).
\end{aligned}
\tag{5.15}
\]
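Identity (5.15) says that the centered quenched mean is the sum of the expected local centered drifts seen along the walk. A minimal Python sketch of a numerical check follows; the sampling law for the environment, the finite spatial window, and the names `sample_level`, `local_drift` and `check_drift_decomposition` are assumptions made for illustration, and the value of `V` passed in plays no role in the equality because $nV$ is subtracted on both sides.

```python
import random

def sample_level(width, M, rng):
    """One time level of i.i.d. probability vectors on steps -M..M."""
    level = []
    for _ in range(width):
        w = [rng.random() + 1e-9 for _ in range(2 * M + 1)]
        total = sum(w)
        level.append([v / total for v in w])
    return level

def local_drift(u, M):
    """D at one space-time site: sum over steps y of y * u(y)."""
    return sum((j - M) * w for j, w in enumerate(u))

def check_drift_decomposition(n, M, V, seed=0):
    """Return both sides of (5.15) for a backward walk of n steps."""
    rng = random.Random(seed)
    width = 2 * M * n + 1          # wide enough that the walk stays inside
    x0 = M * n
    env = [sample_level(width, M, rng) for _ in range(n)]  # env[k]: level m - k
    law = {x0: 1.0}
    drift_sum = 0.0
    for k in range(n):
        nxt = {}
        for x, p in law.items():
            u = env[k][x]
            drift_sum += p * (local_drift(u, M) - V)   # E^omega g(T_{X_k} omega)
            for j, w in enumerate(u):
                nxt[x + j - M] = nxt.get(x + j - M, 0.0) + p * w
        law = nxt
    mean = sum(x * p for x, p in law.items())
    return mean - x0 - n * V, drift_sum

if __name__ == "__main__":
    print(check_drift_decomposition(n=10, M=2, V=0.0))
```

Both returned numbers coincide up to floating-point error for any seed, which mirrors the step-by-step conditioning used in the derivation.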


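From this, for $x, y \in \mathbb{Z}$,
\[
\begin{aligned}
&\mathbb{E}\Bigl[ \bigl( \{E^\omega(X_n^{x,n}) - x\} - \{E^\omega(X_n^{y,n}) - y\} \bigr)^2 \Bigr] \\
&= \mathbb{E} \sum_{k=0}^{n-1} \Bigl( E^\omega\bigl[ g(T_{\bar X_k^{x,n}}\omega) - g(T_{\bar X_k^{y,n}}\omega) \bigr] \Bigr)^2 \\
&\quad + 2 \sum_{0\le k<\ell<n} \mathbb{E}\Bigl\{ E^\omega\bigl[ g(T_{\bar X_k^{x,n}}\omega) - g(T_{\bar X_k^{y,n}}\omega) \bigr]\; E^\omega\bigl[ g(T_{\bar X_\ell^{x,n}}\omega) - g(T_{\bar X_\ell^{y,n}}\omega) \bigr] \Bigr\} \\
&\qquad\text{(the cross terms for } k < \ell \text{ vanish)} \\
&= \mathbb{E} \sum_{k=0}^{n-1} \Bigl( \sum_{z,w\in\mathbb{Z}^2} P^\omega\{\bar X_k^{x,n} = z\}\, P^\omega\{\bar X_k^{y,n} = w\}\, \bigl( g(T_z\omega) - g(T_w\omega) \bigr) \Bigr)^2 \\
&= \mathbb{E} \sum_{k=0}^{n-1} \sum_{z,w,u,v\in\mathbb{Z}^2} P^\omega\{\bar X_k^{x,n} = z\}\, P^\omega\{\bar X_k^{y,n} = w\}\, P^\omega\{\bar X_k^{x,n} = u\}\, P^\omega\{\bar X_k^{y,n} = v\} \\
&\qquad\times \bigl( g(T_z\omega)g(T_u\omega) - g(T_w\omega)g(T_u\omega) - g(T_z\omega)g(T_v\omega) + g(T_w\omega)g(T_v\omega) \bigr) \\
&\qquad\text{(by independence } \mathbb{E}\,g(T_z\omega)g(T_u\omega) = \sigma_D^2\,1\{z=u\}\text{)} \\
&= \sigma_D^2 \sum_{k=0}^{n-1} \bigl( P\{X_k^{x,n} = \widetilde X_k^{x,n}\} - 2P\{X_k^{x,n} = \widetilde X_k^{y,n}\} + P\{X_k^{y,n} = \widetilde X_k^{y,n}\} \bigr) \\
&= 2\sigma_D^2 \sum_{k=0}^{n-1} \bigl( P_0\{Y_k = 0\} - P_{x-y}\{Y_k = 0\} \bigr) \\
&= 2\sigma_D^2 \bigl( G_{n-1}(0,0) - G_{n-1}(x-y,0) \bigr).
\end{aligned}
\]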

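On the last three lines above, as elsewhere in the paper, we used these conventions: $X_k$ and $\widetilde X_k$ denote walks that are independent in a common environment $\omega$, $Y_k = X_k - \widetilde X_k$ is the difference walk, and $G_n(x,y)$ the Green function of $Y_k$. By Lemma 4.2 we get the inequality
\[ \mathbb{E}\Bigl[ \bigl( \{E^\omega(X_n^{x,n}) - x\} - \{E^\omega(X_n^{y,n}) - y\} \bigr)^2 \Bigr] \le C\,|x-y| \tag{5.16} \]
valid for all $n$ and all $x, y \in \mathbb{Z}$. Turning to $R_{n,j}(a)$ defined in (5.6)–(5.7), and utilizing independence,
\[
\begin{aligned}
\mathbb{E}|R_{n,j}(a)| &\le n^{-1/4} \sum_{z=\lfloor -a\sqrt{n}\rfloor+1}^{\lfloor a\sqrt{n}\rfloor} \mathbb{E}[p_{n,j}^\omega(z)] \Bigl( \mathbb{E}\Bigl[ \Bigl( \bigl\{ E^\omega\bigl(X_{\lfloor nt_M\rfloor}^{\lfloor nt_M b\rfloor + z,\,\lfloor nt_M\rfloor}\bigr) - z \bigr\} \\
&\qquad\qquad - \bigl\{ E^\omega\bigl(X_{\lfloor nt_M\rfloor}^{\lfloor nt_M b\rfloor + \lfloor u(z)\sqrt{n}\rfloor,\,\lfloor nt_M\rfloor}\bigr) - \lfloor u(z)\sqrt{n}\rfloor \bigr\} \Bigr)^2 \Bigr] \Bigr)^{1/2} \\
&\quad + n^{-1/4} \sum_{z\le\lfloor -a\sqrt{n}\rfloor,\ z>\lfloor a\sqrt{n}\rfloor} \mathbb{E}[p_{n,j}^\omega(z)] \Bigl( \mathbb{E}\Bigl[ \Bigl( E^\omega\bigl(X_{\lfloor nt_M\rfloor}^{\lfloor nt_M b\rfloor + z,\,\lfloor nt_M\rfloor}\bigr) - E\bigl(X_{\lfloor nt_M\rfloor}^{\lfloor nt_M b\rfloor + z,\,\lfloor nt_M\rfloor}\bigr) \Bigr)^2 \Bigr] \Bigr)^{1/2} \\
&\quad + n^{-1/4} \sum_{z\le\lfloor -a\sqrt{n}\rfloor,\ z>\lfloor a\sqrt{n}\rfloor} \mathbb{E}[p_{n,j}^\omega(z)] \cdot \Bigl| E\bigl(X_{\lfloor nt_M\rfloor}^{\lfloor nt_M b\rfloor + z,\,\lfloor nt_M\rfloor}\bigr) - z \Bigr| \\
&\le C n^{-1/4} \max_{-a\sqrt{n} < z \le a\sqrt{n}} \bigl| z - \lfloor u(z)\sqrt{n}\rfloor \bigr|^{1/2} + C\, P\Bigl\{ \Bigl| X_{\lfloor nt_{M+1}\rfloor - \lfloor nt_M\rfloor}^{\lfloor nt_{M+1}b\rfloor + \lfloor r_j\sqrt{n}\rfloor,\,\lfloor nt_{M+1}\rfloor} - \lfloor nt_M b\rfloor \Bigr| \ge a\sqrt{n} \Bigr\} + C n^{-1/4}.
\end{aligned}
\]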

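For the last inequality above we used (5.16), bound (3.4) on the variance of the quenched mean, and then
\[ \Bigl| E\bigl(X_{\lfloor nt_M\rfloor}^{\lfloor nt_M b\rfloor + z,\,\lfloor nt_M\rfloor}\bigr) - z \Bigr| = \bigl| \lfloor nt_M b\rfloor + \lfloor nt_M\rfloor V \bigr| = \bigl| \lfloor nt_M b\rfloor - \lfloor nt_M\rfloor b \bigr| = O(1). \]
By the choice of $u(z)$, and by the central limit theorem if $a > 2|r_j|$, the limit of the bound on $\mathbb{E}|R_{n,j}(a)|$ as $n \to \infty$ is $C(\sqrt{\Delta} + \gamma(-\infty,-a/2) + \gamma(a/2,\infty))$. This completes the proof of Lemma 5.2.

Proof of Lemma 5.1. We drop the subscript from $t_n$ and write simply $t$. For the main part of the proof the only relevant property is that $nt_n = O(n)$. We point this out after the preliminaries. We show convergence of the linear combination $\sum_{i=1}^{N} \theta_i\,y_n(t,r_i)$ for an arbitrary but fixed $N$-vector $\theta = (\theta_1, \dots, \theta_N)$. This in turn will come from a martingale central limit theorem. For this proof abbreviate $X_k^i = X_k^{\lfloor ntb\rfloor + \lfloor r_i\sqrt{n}\rfloor,\,\lfloor nt\rfloor}$. For $1 \le k \le \lfloor nt\rfloor$ define
\[ z_{n,k} = n^{-1/4} \sum_{i=1}^{N} \theta_i\, E^\omega\, g\bigl( T_{\bar X_{k-1}^i}\omega \bigr) \]
so that by (5.15)
\[ \sum_{k=1}^{\lfloor nt\rfloor} z_{n,k} = \sum_{i=1}^{N} \theta_i\,y_n(t,r_i) + O(n^{-1/4}). \]
The error is deterministic and comes from the discrepancy (3.7) in the centering. It vanishes in the limit and so can be ignored.

A probability of the type $P^\omega(X_{k-1}^i = y)$ is a function of the environments $\{\bar\omega_j : \lfloor nt\rfloor - k + 2 \le j \le \lfloor nt\rfloor\}$ while $g(T_{y,s}\omega)$ is a function of $\bar\omega_s$. For a fixed $n$, $\{z_{n,k} : 1 \le k \le \lfloor nt\rfloor\}$ are martingale differences with respect to the filtration
\[ \mathcal{U}_{n,k} = \sigma\bigl\{ \bar\omega_j : \lfloor nt\rfloor - k + 1 \le j \le \lfloor nt\rfloor \bigr\} \qquad (1 \le k \le \lfloor nt\rfloor) \]
with $\mathcal{U}_{n,0}$ equal to the trivial $\sigma$-algebra. The goal is to show that $\sum_{k=1}^{\lfloor nt\rfloor} z_{n,k}$ converges to a centered Gaussian with variance $\sum_{1\le i,j\le N} \theta_i\theta_j\,\Gamma_q((t,r_i),(t,r_j))$. By the Lindeberg-Feller Theorem for martingales, it suffices to check that
\[ \sum_{k=1}^{\lfloor nt\rfloor} \mathbb{E}\bigl[ z_{n,k}^2 \bigm| \mathcal{U}_{n,k-1} \bigr] \longrightarrow \sum_{1\le i,j\le N} \theta_i\theta_j\,\Gamma_q\bigl((t,r_i),(t,r_j)\bigr) \tag{5.17} \]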


and nt 2 E z n,k 1{z n,k ≥ ε} Un,k−1 −→ 0

(5.18)

k=1

in probability, as $n\to\infty$, for every $\varepsilon>0$. Condition (5.18) is trivially satisfied because $|z_{n,k}|\le Cn^{-1/4}$ by the boundedness of $g$. The main part of the proof consists of checking (5.17). This argument is a generalization of the proof of [14, Theorem 4.1] where it was done for a nearest-neighbor walk. We follow their reasoning for the first part of the proof. Since $\sigma_D^2 = E[g^2]$ and since conditioning $z_{n,k}^2$ on $U_{n,k-1}$ entails integrating out the environments $\bar\omega_{\lfloor nt\rfloor-k+1}$, one can derive
$$\sum_{k=1}^{\lfloor nt\rfloor} E\bigl[z_{n,k}^2 \,\big|\, U_{n,k-1}\bigr] = \sigma_D^2\sum_{1\le i,j\le N}\theta_i\theta_j\,n^{-1/2}\sum_{k=0}^{\lfloor nt\rfloor-1}P^\omega(X^i_k=\tilde X^j_k),$$
where $X^i_k$ and $\tilde X^j_k$ are two walks independent under the common environment $\omega$, started at $(\lfloor ntb\rfloor+\lfloor r_i\sqrt n\rfloor,\lfloor nt\rfloor)$ and $(\lfloor ntb\rfloor+\lfloor r_j\sqrt n\rfloor,\lfloor nt\rfloor)$. By (4.5),
$$\sigma_D^2\,n^{-1/2}\sum_{k=0}^{\lfloor nt\rfloor-1}P(X^i_k=\tilde X^j_k)\longrightarrow\Gamma_q((t,r_i),(t,r_j)). \tag{5.19}$$

This limit holds if instead of a fixed $t$ on the left we have a sequence $t_n\to t$. Consequently we will have proved (5.17) if we show, for each fixed pair $(i,j)$, that
$$n^{-1/2}\sum_{k=0}^{\lfloor nt\rfloor-1}\Bigl(P^\omega\{X^i_k=\tilde X^j_k\} - P\{X^i_k=\tilde X^j_k\}\Bigr)\longrightarrow 0 \tag{5.20}$$

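in $\mathbb P$-probability. For the above statement the behavior of $t$ is immaterial as long as it stays bounded as $n\to\infty$. Rewrite the expression in (5.20) as
$$\begin{aligned}
n^{-1/2}&\sum_{k=0}^{\lfloor nt\rfloor-1}\Bigl(P\{X^i_k=\tilde X^j_k \mid U_{n,k}\} - P\{X^i_k=\tilde X^j_k \mid U_{n,0}\}\Bigr)\\
&= n^{-1/2}\sum_{k=0}^{\lfloor nt\rfloor-1}\sum_{\ell=0}^{k-1}\Bigl(P\{X^i_k=\tilde X^j_k \mid U_{n,\ell+1}\} - P\{X^i_k=\tilde X^j_k \mid U_{n,\ell}\}\Bigr)\\
&= n^{-1/2}\sum_{\ell=0}^{\lfloor nt\rfloor-1}\sum_{k=\ell+1}^{\lfloor nt\rfloor-1}\Bigl(P\{X^i_k=\tilde X^j_k \mid U_{n,\ell+1}\} - P\{X^i_k=\tilde X^j_k \mid U_{n,\ell}\}\Bigr)\\
&\equiv n^{-1/2}\sum_{\ell=0}^{\lfloor nt\rfloor-1}R_\ell,
\end{aligned}$$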


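where the last line defines $R_\ell$. Check that $E R_\ell R_m = 0$ for $\ell\ne m$. Thus it is convenient to verify our goal (5.20) by checking $L^2$ convergence, in other words by showing
$$n^{-1}\sum_{\ell=0}^{\lfloor nt\rfloor-1}E[R_\ell^2] = n^{-1}\sum_{\ell=0}^{\lfloor nt\rfloor-1}E\Bigl[\Bigl\{\sum_{k=\ell+1}^{\lfloor nt\rfloor-1}\bigl(P\{X^i_k=\tilde X^j_k \mid U_{n,\ell+1}\} - P\{X^i_k=\tilde X^j_k \mid U_{n,\ell}\}\bigr)\Bigr\}^2\Bigr]\longrightarrow 0. \tag{5.21}$$
For the moment we work on a single term inside the braces in (5.21), for a fixed pair $k>\ell$. Write $Y_m = X^i_m - \tilde X^j_m$ for the difference walk. By the Markov property of the walks [recall (2.27)] we can write
$$P\{X^i_k=\tilde X^j_k \mid U_{n,\ell+1}\} = \sum_{x,\tilde x,y,\tilde y\in\mathbb Z}P^\omega\{X^i_\ell=x,\tilde X^j_\ell=\tilde x\}\,u^\omega_{\lfloor nt\rfloor-\ell}(x,y-x)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,\tilde y-\tilde x)\,P(Y_k=0 \mid Y_{\ell+1}=y-\tilde y)$$
and similarly for the other conditional probability
$$P\{X^i_k=\tilde X^j_k \mid U_{n,\ell}\} = \sum_{x,\tilde x,y,\tilde y\in\mathbb Z}P^\omega\{X^i_\ell=x,\tilde X^j_\ell=\tilde x\}\,E\bigl[u^\omega_{\lfloor nt\rfloor-\ell}(x,y-x)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,\tilde y-\tilde x)\bigr]\,P(Y_k=0 \mid Y_{\ell+1}=y-\tilde y).$$
Introduce the transition probability $q(x,y)$ of the $Y$-walk. Combine the above decompositions to express the $(k,\ell)$ term inside the braces in (5.21) as
$$\begin{aligned}
P\{X^i_k&=\tilde X^j_k \mid U_{n,\ell+1}\} - P\{X^i_k=\tilde X^j_k \mid U_{n,\ell}\}\\
&= \sum_{x,\tilde x,y,\tilde y\in\mathbb Z}P^\omega\{X^i_\ell=x,\tilde X^j_\ell=\tilde x\}\,q^{k-\ell-1}(y-\tilde y,0)\\
&\qquad\times\Bigl(u^\omega_{\lfloor nt\rfloor-\ell}(x,y-x)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,\tilde y-\tilde x) - E\bigl[u^\omega_{\lfloor nt\rfloor-\ell}(x,y-x)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,\tilde y-\tilde x)\bigr]\Bigr)\\
&= \sum_{x,\tilde x}P^\omega\{X^i_\ell=x,\tilde X^j_\ell=\tilde x\}\sum_{\substack{z,w:\ -M\le w\le M\\ -M\le w-z\le M}}q^{k-\ell-1}(x-\tilde x+z,0)\\
&\qquad\times\Bigl(u^\omega_{\lfloor nt\rfloor-\ell}(x,w)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,w-z) - E\bigl[u^\omega_{\lfloor nt\rfloor-\ell}(x,w)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,w-z)\bigr]\Bigr).
\end{aligned} \tag{5.22}$$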

The last sum above uses the finite range $M$ of the jump probabilities. Introduce the quantities
$$\rho_\omega(x,x+m) = \sum_{y:\,y\le m}u^\omega_{\lfloor nt\rfloor-\ell}(x,y) = \sum_{y=-M}^{m}u^\omega_{\lfloor nt\rfloor-\ell}(x,y)$$


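and
$$\zeta_\omega(x,\tilde x,z,w) = \rho_\omega(x,x+w)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,w-z) - E\bigl[\rho_\omega(x,x+w)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,w-z)\bigr].$$
Fix $(x,\tilde x)$, consider the sum over $z$ and $w$ on line (5.22), and continue with a “summation by parts” step:
$$\begin{aligned}
&\sum_{\substack{z,w:\ -M\le w\le M\\ -M\le w-z\le M}}q^{k-\ell-1}(x-\tilde x+z,0)\Bigl(u^\omega_{\lfloor nt\rfloor-\ell}(x,w)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,w-z) - E\bigl[u^\omega_{\lfloor nt\rfloor-\ell}(x,w)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,w-z)\bigr]\Bigr)\\
&\quad= \sum_{\substack{z,w:\ -M\le w\le M\\ -M\le w-z\le M}}q^{k-\ell-1}(x-\tilde x+z,0)\bigl(\zeta_\omega(x,\tilde x,z,w) - \zeta_\omega(x,\tilde x,z-1,w-1)\bigr)\\
&\quad= \sum_{\substack{z,w:\ -M\le w\le M\\ -M\le w-z\le M}}\bigl(q^{k-\ell-1}(x-\tilde x+z,0) - q^{k-\ell-1}(x-\tilde x+z+1,0)\bigr)\zeta_\omega(x,\tilde x,z,w)\\
&\qquad+ \sum_{z=0}^{2M}q^{k-\ell-1}(x-\tilde x+z+1,0)\,\zeta_\omega(x,\tilde x,z,M)\\
&\qquad- \sum_{z=-2M-1}^{-1}q^{k-\ell-1}(x-\tilde x+z+1,0)\,\zeta_\omega(x,\tilde x,z,-M-1).
\end{aligned}$$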

By definition of the range $M$, the last sum above vanishes because $\zeta_\omega(x,\tilde x,z,-M-1)=0$. Take this into consideration, substitute the last form above into (5.22) and sum over $k=\ell+1,\dots,\lfloor nt\rfloor-1$. Define the quantity
$$A_{\ell,n}(x) = \sum_{k=\ell+1}^{\lfloor nt\rfloor-1}\bigl(q^{k-\ell-1}(x,0) - q^{k-\ell-1}(x+1,0)\bigr). \tag{5.23}$$
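The index bookkeeping in the summation-by-parts step above can be checked numerically. The following is a minimal sketch and not part of the original argument: the function name `check_summation_by_parts`, the random kernels `u1` and `u2` (standing in for the jump probabilities at the sites $x$ and $\tilde x$) and the arbitrary weights `q[z]` (standing in for $q^{k-\ell-1}(x-\tilde x+z,0)$) are illustrative assumptions. It tests the identity for the uncentered products only; the centered version follows by linearity of the expectation.

```python
import random


def check_summation_by_parts(M, seed=0):
    """Compare both sides of the summation-by-parts step for range M.

    Returns (lhs, rhs, lower_boundary_sum). All inputs are random and
    purely illustrative; only the index bookkeeping is being tested.
    """
    rng = random.Random(seed)

    def random_kernel():
        # A probability vector on {-M, ..., M}: one row of jump probabilities.
        weights = [rng.random() for _ in range(2 * M + 1)]
        total = sum(weights)
        return {y: weights[y + M] / total for y in range(-M, M + 1)}

    u1 = random_kernel()  # plays the role of u(x, .)
    u2 = random_kernel()  # plays the role of u(x~, .)
    # Arbitrary weights q[z] standing in for q^{k-l-1}(x - x~ + z, 0).
    q = {z: rng.random() for z in range(-2 * M - 1, 2 * M + 2)}

    def rho1(w):
        # rho(x, x + w) = sum_{y=-M}^{w} u(x, y); an empty sum when w = -M-1.
        return sum(u1[y] for y in range(-M, w + 1))

    def zeta(z, w):
        # Uncentered analogue of zeta(x, x~, z, w).
        return rho1(w) * u2.get(w - z, 0.0)

    # Index set {(z, w): -M <= w <= M, -M <= w - z <= M}.
    domain = [(z, w) for w in range(-M, M + 1) for z in range(w - M, w + M + 1)]

    lhs = sum(q[z] * u1[w] * u2[w - z] for z, w in domain)
    main = sum((q[z] - q[z + 1]) * zeta(z, w) for z, w in domain)
    upper = sum(q[z + 1] * zeta(z, M) for z in range(0, 2 * M + 1))
    lower = sum(q[z + 1] * zeta(z, -M - 1) for z in range(-2 * M - 1, 0))
    return lhs, main + upper - lower, lower


if __name__ == "__main__":
    lhs, rhs, lower = check_summation_by_parts(M=3)
    print(abs(lhs - rhs) < 1e-12, lower == 0)
```

The third returned value is the boundary sum at $w=-M-1$; it is zero because the partial sum defining $\rho$ is empty there, which is the reason given in the text for dropping that term.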

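Then the expression in braces in (5.21) is represented as
$$\begin{align}
R_\ell &= \sum_{k=\ell+1}^{\lfloor nt\rfloor-1}\Bigl(P\{X^i_k=\tilde X^j_k \mid U_{n,\ell+1}\} - P\{X^i_k=\tilde X^j_k \mid U_{n,\ell}\}\Bigr)\notag\\
&= \sum_{x,\tilde x}P^\omega\{X^i_\ell=x,\tilde X^j_\ell=\tilde x\}\sum_{\substack{z,w:\ -M\le w\le M\\ -M\le w-z\le M}}A_{\ell,n}(x-\tilde x+z)\,\zeta_\omega(x,\tilde x,z,w)\tag{5.24}\\
&\quad+ \sum_{x,\tilde x}P^\omega\{X^i_\ell=x,\tilde X^j_\ell=\tilde x\}\sum_{z=0}^{2M}\sum_{k=\ell+1}^{\lfloor nt\rfloor-1}q^{k-\ell-1}(x-\tilde x+z+1,0)\,\zeta_\omega(x,\tilde x,z,M)\tag{5.25}\\
&\equiv R_{\ell,1}+R_{\ell,2},\notag
\end{align}$$
where $R_{\ell,1}$ and $R_{\ell,2}$ denote the sums on lines (5.24) and (5.25).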


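Recall from (5.21) that our goal was to show that $n^{-1}\sum_{\ell=0}^{\lfloor nt\rfloor-1}E R_\ell^2\to 0$ as $n\to\infty$. We show this separately for $R_{\ell,1}$ and $R_{\ell,2}$. As a function of $\omega$, $\zeta_\omega(\cdots)$ is a function of $\bar\omega_{\lfloor nt\rfloor-\ell}$ and hence independent of the probabilities on line (5.24). Thus we get
$$\begin{aligned}
E[R_{\ell,1}^2] = {}&\sum_{x,\tilde x,x',\tilde x'}E\Bigl[P^\omega\{X^i_\ell=x,\tilde X^j_\ell=\tilde x\}\,P^\omega\{X^i_\ell=x',\tilde X^j_\ell=\tilde x'\}\Bigr]\\
&\times\sum_{\substack{-M\le w\le M\\ -M\le w-z\le M}}\ \sum_{\substack{-M\le w'\le M\\ -M\le w'-z'\le M}}A_{\ell,n}(x-\tilde x+z)\,A_{\ell,n}(x'-\tilde x'+z')\\
&\times E\bigl[\zeta_\omega(x,\tilde x,z,w)\,\zeta_\omega(x',\tilde x',z',w')\bigr].
\end{aligned} \tag{5.26}$$
Lemma 4.2 implies that $A_{\ell,n}(x)$ is uniformly bounded over $(\ell,n,x)$. Random variable $\zeta_\omega(x,\tilde x,z,w)$ is mean zero and a function of the environments $\{\omega_{x,\lfloor nt\rfloor-\ell},\omega_{\tilde x,\lfloor nt\rfloor-\ell}\}$. Consequently the last expectation on line (5.26) vanishes unless $\{x,\tilde x\}\cap\{x',\tilde x'\}\ne\emptyset$. The sums over $z,w,z',w'$ contribute a constant because of their bounded range. Taking all these into consideration, we obtain the bound
$$E[R_{\ell,1}^2]\le C\Bigl(P\{X^i_\ell=\tilde X^i_\ell\} + P\{X^i_\ell=\tilde X^j_\ell\} + P\{X^j_\ell=\tilde X^j_\ell\}\Bigr). \tag{5.27}$$
By (4.5) we get the bound
$$n^{-1}\sum_{\ell=0}^{\lfloor nt\rfloor-1}E[R_{\ell,1}^2]\le Cn^{-1/2} \tag{5.28}$$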

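which vanishes as $n\to\infty$. For the remaining sum $R_{\ell,2}$ observe first that
$$\zeta_\omega(x,\tilde x,z,M) = u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,M-z) - E\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,M-z). \tag{5.29}$$
Summed over $0\le z\le 2M$ this vanishes, so we can start by rewriting as follows:
$$\begin{aligned}
R_{\ell,2} &= \sum_{x,\tilde x}P^\omega\{X^i_\ell=x,\tilde X^j_\ell=\tilde x\}\sum_{z=0}^{2M}\sum_{k=\ell+1}^{\lfloor nt\rfloor-1}\bigl(q^{k-\ell-1}(x-\tilde x+z+1,0) - q^{k-\ell-1}(x-\tilde x,0)\bigr)\zeta_\omega(x,\tilde x,z,M)\\
&= -\sum_{x,\tilde x}P^\omega\{X^i_\ell=x,\tilde X^j_\ell=\tilde x\}\sum_{z=0}^{2M}\sum_{m=0}^{z}A_{\ell,n}(x-\tilde x+m)\,\zeta_\omega(x,\tilde x,z,M)\\
&= -\sum_{x,\tilde x}P^\omega\{X^i_\ell=x,\tilde X^j_\ell=\tilde x\}\sum_{m=0}^{2M}A_{\ell,n}(x-\tilde x+m)\,\bar\rho_\omega(\tilde x,\tilde x+M-m),
\end{aligned}$$
where we abbreviated on the last line $\bar\rho_\omega(\tilde x,\tilde x+M-m) = \rho_\omega(\tilde x,\tilde x+M-m) - E\rho_\omega(\tilde x,\tilde x+M-m)$.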


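Square the last representation for $R_{\ell,2}$, take $E$-expectation, and note that
$$E\bigl[\bar\rho_\omega(\tilde x,\tilde x+M-m)\,\bar\rho_\omega(\tilde x',\tilde x'+M-m')\bigr] = 0$$
unless $\tilde x=\tilde x'$. Thus the reasoning applied to $R_{\ell,1}$ can be repeated, and we conclude that also $n^{-1}\sum_{\ell=0}^{\lfloor nt\rfloor-1}E R_{\ell,2}^2\to 0$. To summarize, we have verified (5.21), thereby (5.20) and condition (5.17) for the martingale CLT. This completes the proof of Lemma 5.1.

6. Proofs for Forward Walks in a Random Environment

6.1. Proof of Theorem 3.4. The proof of Theorem 3.4 is organized in the same way as the proof of Theorem 3.2 so we restrict ourselves to a few remarks. The Markov property reads now ($0\le s<t$, $r\in\mathbb R$):
$$\begin{aligned}
a_n(t,r) = a_n(s,r) &+ \sum_{y\in\mathbb Z}P^\omega\bigl\{Z^{\lfloor r\sqrt n\rfloor,0}_{\lfloor ns\rfloor} = \lfloor r\sqrt n\rfloor+\lfloor nsV\rfloor+y\bigr\}\\
&\qquad\times n^{-1/4}\Bigl\{E^\omega\bigl(Z^{\lfloor r\sqrt n\rfloor+\lfloor nsV\rfloor+y,\,\lfloor ns\rfloor}_{\lfloor nt\rfloor-\lfloor ns\rfloor}\bigr) - \lfloor r\sqrt n\rfloor - y - \lfloor nt\rfloor V\Bigr\}\\
&+ n^{-1/4}\bigl(\lfloor ns\rfloor V - \lfloor nsV\rfloor\bigr).
\end{aligned}$$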

This serves as the basis for the inductive proof along time levels, exactly as done in the argument following (5.3). Lemma 5.1 about the convergence at a fixed $t$-level applies to $a_n(t,\cdot)$ exactly as worded. This follows from noting that, up to a trivial difference from integer parts, the processes $a_n(t,\cdot)$ and $y_n(t,\cdot)$ are the same. Precisely, if $S$ denotes the $\mathbb P$-preserving transformation on $\Omega$ defined by $(S\omega)_{x,\tau} = \omega_{-\lfloor ntb\rfloor+x,\,\lfloor nt\rfloor-\tau}$, then
$$E^{S\omega}\bigl(X^{\lfloor ntb\rfloor+\lfloor r\sqrt n\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr) - \lfloor r\sqrt n\rfloor = E^\omega\bigl(Z^{\lfloor r\sqrt n\rfloor,0}_{\lfloor nt\rfloor}\bigr) - \lfloor r\sqrt n\rfloor + \lfloor ntb\rfloor.$$
The errors in the inductive argument are treated with the same arguments as used in Lemma 5.2 to treat $R_{n,j}(a)$.

6.2. Proof of Corollary 3.5. We start with a moment bound that will give tightness of the processes.

Lemma 6.1. There exists a constant $0<C<\infty$ such that, for all $n\in\mathbb N$,
$$E\bigl[(E^\omega(Z^{0,0}_n) - nV)^6\bigr]\le Cn^{3/2}.$$

Proof. From
$$E\Bigl[\bigl(E^\omega g(T_{\bar Z^{x,0}_n}\omega) - E^\omega g(T_{\bar Z^{0,0}_n}\omega)\bigr)^2\Bigr] = 2\sigma_D^2\bigl(P[Y^0_n=0] - P[Y^x_n=0]\bigr)$$
we get
$$P[Y^x_n=0]\le P[Y^0_n=0]\qquad\text{for all } n\ge 0 \text{ and } x\in\mathbb Z. \tag{6.1}$$
Abbreviate $\bar Z_n = \bar Z^{0,0}_n$ for this proof. $E^\omega(Z_n) - nV$ is a mean-zero martingale with increments $E^\omega g(T_{\bar Z_k}\omega)$ relative to the filtration $H_n = \sigma\{\bar\omega_k : 0\le k<n\}$. By the Burkholder-Davis-Gundy inequality [7],
$$E\bigl[(E^\omega(Z_n) - nV)^6\bigr]\le C\,E\Bigl[\Bigl(\sum_{k=0}^{n-1}\bigl(E^\omega g(T_{\bar Z_k}\omega)\bigr)^2\Bigr)^3\Bigr].$$

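Expanding the cube yields four sums
$$\begin{aligned}
&C\sum_{0\le k<n}E\Bigl[\bigl(E^\omega g(T_{\bar Z_k}\omega)\bigr)^6\Bigr]
+ C\sum_{0\le k_1<k_2<n}E\Bigl[\bigl(E^\omega g(T_{\bar Z_{k_1}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_2}}\omega)\bigr)^4\Bigr]\\
&\quad+ C\sum_{0\le k_1<k_2<n}E\Bigl[\bigl(E^\omega g(T_{\bar Z_{k_1}}\omega)\bigr)^4\bigl(E^\omega g(T_{\bar Z_{k_2}}\omega)\bigr)^2\Bigr]\\
&\quad+ C\sum_{0\le k_1<k_2<k_3<n}E\Bigl[\bigl(E^\omega g(T_{\bar Z_{k_1}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_2}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_3}}\omega)\bigr)^2\Bigr]
\end{aligned}$$
with a constant $C$ that bounds the number of arrangements of each type. Replacing some $g$-factors with constant upper bounds simplifies the quantity to this:
$$\begin{aligned}
&C\sum_{0\le k<n}E\Bigl[\bigl(E^\omega g(T_{\bar Z_k}\omega)\bigr)^2\Bigr]
+ C\sum_{0\le k_1<k_2<n}E\Bigl[\bigl(E^\omega g(T_{\bar Z_{k_1}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_2}}\omega)\bigr)^2\Bigr]\\
&\quad+ C\sum_{0\le k_1<k_2<k_3<n}E\Bigl[\bigl(E^\omega g(T_{\bar Z_{k_1}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_2}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_3}}\omega)\bigr)^2\Bigr].
\end{aligned}$$
The expression above is bounded by $C(n^{1/2}+n+n^{3/2})$. We show the argument for the last sum of triple products. (Same reasoning applies to the first two sums.) It utilizes repeatedly independence, $E\,g(T_u\omega)g(T_v\omega) = \sigma_D^2\mathbf 1\{u=v\}$ for $u,v\in\mathbb Z^2$, and (6.1). Fix $0\le k_1<k_2<k_3<n$. Let $\bar Z'_k$ denote an independent copy of the walk $\bar Z_k$ in the same environment $\omega$:
$$\begin{aligned}
E\Bigl[&\bigl(E^\omega g(T_{\bar Z_{k_1}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_2}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_3}}\omega)\bigr)^2\Bigr]\\
&= E\Bigl[\bigl(E^\omega g(T_{\bar Z_{k_1}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_2}}\omega)\bigr)^2\sum_{u,v\in\mathbb Z^2}P^\omega\{\bar Z_{k_3}=u\}P^\omega\{\bar Z'_{k_3}=v\}\,E\bigl[g(T_u\omega)g(T_v\omega)\bigr]\Bigr]\\
&= C\,E\Bigl[\bigl(E^\omega g(T_{\bar Z_{k_1}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_2}}\omega)\bigr)^2P^\omega\{\bar Z_{k_3}=\bar Z'_{k_3}\}\Bigr]\\
&= C\,E\Bigl[\bigl(E^\omega g(T_{\bar Z_{k_1}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_2}}\omega)\bigr)^2\sum_{u,v\in\mathbb Z^2}P^\omega\{\bar Z_{k_2+1}=u,\bar Z'_{k_2+1}=v\}\,E P^\omega\{\bar Z^u_{k_3-k_2-1}=\bar Z^v_{k_3-k_2-1}\}\Bigr]\\
&\qquad\text{(walks $\bar Z^u_k$ and $\bar Z^v_k$ are independent under a common $\omega$)}\\
&\le C\,E\Bigl[\bigl(E^\omega g(T_{\bar Z_{k_1}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_2}}\omega)\bigr)^2\sum_{u,v\in\mathbb Z^2}P^\omega\{\bar Z_{k_2+1}=u,\bar Z'_{k_2+1}=v\}\Bigr]P(Y^0_{k_3-k_2-1}=0)\\
&= C\,E\Bigl[\bigl(E^\omega g(T_{\bar Z_{k_1}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_2}}\omega)\bigr)^2\Bigr]P(Y^0_{k_3-k_2-1}=0).
\end{aligned}$$
Now repeat the same step, and ultimately arrive at
$$\begin{aligned}
\sum_{0\le k_1<k_2<k_3<n}&E\Bigl[\bigl(E^\omega g(T_{\bar Z_{k_1}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_2}}\omega)\bigr)^2\bigl(E^\omega g(T_{\bar Z_{k_3}}\omega)\bigr)^2\Bigr]\\
&\le C\sum_{0\le k_1<k_2<k_3<n}P(Y^0_{k_1}=0)\,P(Y^0_{k_2-k_1-1}=0)\,P(Y^0_{k_3-k_2-1}=0)\le Cn^{3/2}.
\end{aligned}$$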

By Theorem 8.8 in [12, Chap. 3],
$$E\Bigl[\,\bigl|a_n(t+h,r)-a_n(t,r)\bigr|^3\,\bigl|a_n(t,r)-a_n(t-h,r)\bigr|^3\Bigr]\le Ch^{3/2}$$
is sufficient for tightness of the processes $\{a_n(t,r) : t\ge 0\}$. The left-hand side above is bounded by
$$E\bigl[(a_n(t+h,r)-a_n(t,r))^6\bigr] + E\bigl[(a_n(t,r)-a_n(t-h,r))^6\bigr].$$
Note that if $h<1/(2n)$ then $(a_n(t+h,r)-a_n(t,r))(a_n(t,r)-a_n(t-h,r)) = 0$ due to the discrete time of the unscaled walks, while if $h\ge 1/(2n)$ then $\lfloor n(t+h)\rfloor-\lfloor nt\rfloor\le 3nh$. Putting these points together shows that tightness will follow from the next moment bound.

Lemma 6.2. There exists a constant $0<C<\infty$ such that, for all $0\le m<n\in\mathbb N$,
$$E\Bigl[\bigl(\{E^\omega(Z^{0,0}_n)-nV\} - \{E^\omega(Z^{0,0}_m)-mV\}\bigr)^6\Bigr]\le C(n-m)^{3/2}.$$

Proof. The claim reduces to Lemma 6.1 by restarting the walks at time $m$.

Convergence of finite-dimensional distributions in Corollary 3.5 follows from Theorem 3.4. The limiting process $\bar a(\cdot) = \lim a_n(\cdot,r)$ is identified by its covariance $E\,\bar a(s)\bar a(t) = \Gamma_q((s\wedge t,r),(s\wedge t,r))$. This completes the proof of Corollary 3.5.

7. Proofs for the Random Average Process

This section requires Theorem 3.2 from the space-time RWRE section.

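7.1. Separation of effects. As the form of the limiting process in Theorem 2.1 suggests, we can separate the fluctuations that come from the initial configuration from those created by the dynamics. The quenched means of the RWRE represent the latter. We start with the appropriate decomposition. Abbreviate
$$x_{n,r} = x(n,r) = \lfloor n\bar y\rfloor + \lfloor r\sqrt n\rfloor.$$
Recall that we are considering $\bar y\in\mathbb R$ fixed, while $(t,r)\in\mathbb R_+\times\mathbb R$ is variable and serves as the index for the process,
$$\begin{aligned}
\sigma^n_{\lfloor nt\rfloor}(x_{n,r}+\lfloor ntb\rfloor) - \sigma^n_0(x_{n,r})
&= E^\omega\Bigl[\sigma^n_0\bigl(X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr) - \sigma^n_0(x_{n,r})\Bigr]\\
&= E^\omega\Biggl[\mathbf 1\bigl\{X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}>x(n,r)\bigr\}\sum_{i=x(n,r)+1}^{X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}}\eta^n_0(i)\\
&\qquad\quad- \mathbf 1\bigl\{X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}<x(n,r)\bigr\}\sum_{i=X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}+1}^{x(n,r)}\eta^n_0(i)\Biggr]\\
&= \sum_{i>x(n,r)}P^\omega\bigl\{i\le X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr\}\cdot\eta^n_0(i) - \sum_{i\le x(n,r)}P^\omega\bigl\{i>X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr\}\cdot\eta^n_0(i).
\end{aligned}$$
Recalling the means $\varrho(i/n) = E\eta^n_0(i)$ we write this as
$$\sigma^n_{\lfloor nt\rfloor}(x_{n,r}+\lfloor ntb\rfloor) - \sigma^n_0(x_{n,r}) = Y^n(t,r) + H^n(t,r), \tag{7.1}$$
where
$$Y^n(t,r) = \sum_{i\in\mathbb Z}\bigl(\eta^n_0(i)-\varrho(i/n)\bigr)\Bigl(\mathbf 1\{i>x_{n,r}\}P^\omega\bigl\{i\le X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr\} - \mathbf 1\{i\le x_{n,r}\}P^\omega\bigl\{i>X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr\}\Bigr)$$
and
$$H^n(t,r) = \sum_{i\in\mathbb Z}\varrho(i/n)\Bigl(\mathbf 1\{i>x_{n,r}\}P^\omega\bigl\{i\le X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr\} - \mathbf 1\{i\le x_{n,r}\}P^\omega\bigl\{i>X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr\}\Bigr).$$
The plan of the proof of Theorem 2.1 is summarized in the next lemma. In the pages that follow we then show the finite-dimensional weak convergence $n^{-1/4}H^n\to H$, and the finite-dimensional weak convergence $n^{-1/4}Y^n\to Y$ for a fixed $\omega$. This last statement is actually not proved quite in the strength just stated, but the spirit is correct. The distributional limit $n^{-1/4}Y^n\to Y$ comes from the centered initial increments $\eta^n_0(i)-\varrho(i/n)$, while a homogenization effect takes place for the coefficients $P^\omega\{i\le X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\}$ which converge to limiting deterministic Gaussian probabilities. Since the initial height functions $\sigma^n_0$ and the random environments $\omega$ that drive the dynamics are independent, we also get convergence $n^{-1/4}(Y^n+H^n)\to Y+H$ with independent terms $Y$ and $H$. This is exactly the statement of Theorem 2.1.

Lemma 7.1. Let $(\Omega_0,\mathcal F_0,P_0)$ be a probability space on which are defined independent random variables $\eta$ and $\omega$ with values in some abstract measurable spaces. The marginal laws are $\mathbb P$ for $\omega$ and $\mathbf P$ for $\eta$, and $\mathbf P_\omega = \delta_\omega\otimes\mathbf P$ is the conditional probability distribution of $(\omega,\eta)$, given $\omega$. Let $\mathbf H_n(\omega)$ and $\mathbf Y_n(\omega,\eta)$ be $\mathbb R^N$-valued measurable functions of $(\omega,\eta)$. Make assumptions (i)–(ii) below.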


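(i) There exists an $\mathbb R^N$-valued random vector $\mathbf H$ such that $\mathbf H_n(\omega)$ converges weakly to $\mathbf H$.
(ii) There exists an $\mathbb R^N$-valued random vector $\mathbf Y$ such that, for all $\theta\in\mathbb R^N$, $\mathbf E_\omega[e^{i\theta\cdot\mathbf Y_n}]\to E(e^{i\theta\cdot\mathbf Y})$ in $\mathbb P$-probability as $n\to\infty$.

Then $\mathbf H_n+\mathbf Y_n$ converges weakly to $\mathbf H+\mathbf Y$, where $\mathbf H$ and $\mathbf Y$ are independent.

Proof. Let $\theta,\lambda$ be arbitrary vectors in $\mathbb R^N$. Then
$$\begin{aligned}
\Bigl|\mathbb E\,\mathbf E_\omega\bigl[e^{i\lambda\cdot\mathbf H_n+i\theta\cdot\mathbf Y_n}\bigr] - E\bigl[e^{i\lambda\cdot\mathbf H}\bigr]E\bigl[e^{i\theta\cdot\mathbf Y}\bigr]\Bigr|
&\le\Bigl|\mathbb E\Bigl[e^{i\lambda\cdot\mathbf H_n}\bigl(\mathbf E_\omega e^{i\theta\cdot\mathbf Y_n} - Ee^{i\theta\cdot\mathbf Y}\bigr)\Bigr]\Bigr| + \Bigl|\bigl(\mathbb Ee^{i\lambda\cdot\mathbf H_n} - Ee^{i\lambda\cdot\mathbf H}\bigr)Ee^{i\theta\cdot\mathbf Y}\Bigr|\\
&\le\mathbb E\Bigl|\mathbf E_\omega e^{i\theta\cdot\mathbf Y_n} - Ee^{i\theta\cdot\mathbf Y}\Bigr| + \Bigl|\mathbb Ee^{i\lambda\cdot\mathbf H_n} - Ee^{i\lambda\cdot\mathbf H}\Bigr|.
\end{aligned}$$
By assumption (i), the second term above goes to 0. By assumption (ii), the integrand in the first term goes to 0 in $\mathbb P$-probability. Therefore by bounded convergence the first term goes to 0 as $n\to\infty$.

Turning to the work itself, we check first that $H^n(t,r)$ can be replaced with a quenched RWRE mean. Then the convergence $H^n\to H$ follows from the RWRE results.

Lemma 7.2. For any $S,T<\infty$ and for $\mathbb P$-almost every $\omega$,
$$\lim_{n\to\infty}\ \sup_{\substack{0\le t\le T\\ -S\le r\le S}} n^{-1/4}\Bigl|H^n(t,r) - \varrho(\bar y)\cdot E^\omega\bigl(X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor} - x_{n,r}\bigr)\Bigr| = 0.$$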

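Proof. Decompose $H^n(t,r) = H_1^n(t,r) - H_2^n(t,r)$, where
$$H_1^n(t,r) = \sum_{i>x(n,r)}P^\omega\bigl\{i\le X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr\}\cdot\varrho(i/n),\qquad
H_2^n(t,r) = \sum_{i\le x(n,r)}P^\omega\bigl\{i>X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr\}\cdot\varrho(i/n).$$
Working with $H_1^n(t,r)$, we separate out the negligible error,
$$\begin{aligned}
H_1^n(t,r) &= \varrho(\bar y)\sum_{i>x(n,r)}P^\omega\bigl\{i\le X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr\} + \sum_{i>x(n,r)}P^\omega\bigl\{i\le X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr\}\cdot\bigl(\varrho(i/n)-\varrho(\bar y)\bigr)\\
&= \varrho(\bar y)\cdot E^\omega\Bigl[\bigl(X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor} - x_{n,r}\bigr)^+\Bigr] + R_1(t,r)
\end{aligned}$$
with
$$R_1(t,r) = \sum_{m=1}^{\infty}P^\omega\bigl\{x_{n,r}+m\le X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr\}\cdot\Bigl(\varrho\Bigl(\frac{x_{n,r}}{n}+\frac mn\Bigr) - \varrho(\bar y)\Bigr).$$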


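Fix a small positive number $\delta<\tfrac12$, and use the boundedness of probabilities and the function $\varrho$:
$$|R_1(t,r)|\le\sum_{m=1}^{\lfloor n^{1/2+\delta}\rfloor}\Bigl|\varrho\Bigl(\frac{x_{n,r}}{n}+\frac mn\Bigr) - \varrho(\bar y)\Bigr| + C\cdot\sum_{m=\lfloor n^{1/2+\delta}\rfloor+1}^{\infty}P^\omega\bigl\{x_{n,r}+m\le X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr\}. \tag{7.2}$$
By the local Hölder-continuity of $\varrho$ with exponent $\gamma>\tfrac12$, the first sum is $o(n^{1/4})$ if $\delta>0$ is small enough. Since $X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_0 = x_{n,r}+\lfloor ntb\rfloor$ and by time $\lfloor nt\rfloor$ the walk has displaced by at most $M\lfloor nt\rfloor$, there are at most $O(n)$ nonzero terms in the second sum in (7.2). Consequently this sum is at most
$$Cn\cdot P^\omega\bigl\{X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor} - x_{n,r}\ge n^{1/2+\delta}\bigr\}.$$
By Lemma 4.3 the last line vanishes uniformly over $t\in[0,T]$ and $r\in[-S,S]$ as $n\to\infty$, for $\mathbb P$-almost every $\omega$. We have shown
$$\lim_{n\to\infty}\ \sup_{\substack{0\le t\le T\\ -S\le r\le S}} n^{-1/4}\Bigl|H_1^n(t,r) - \varrho(\bar y)\cdot E^\omega\Bigl[\bigl(X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor} - x_{n,r}\bigr)^+\Bigr]\Bigr| = 0\qquad\mathbb P\text{-a.s.}$$
Similarly one shows
$$\lim_{n\to\infty}\ \sup_{\substack{0\le t\le T\\ -S\le r\le S}} n^{-1/4}\Bigl|H_2^n(t,r) - \varrho(\bar y)\cdot E^\omega\Bigl[\bigl(X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor} - x_{n,r}\bigr)^-\Bigr]\Bigr| = 0\qquad\mathbb P\text{-a.s.}$$
The conclusion follows from the combination of these two.

For a fixed $n$ and $\bar y$, the process $E^\omega\bigl(X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor} - x_{n,r}\bigr)$ has the same distribution as the process $y_n(t,r)$ defined in (3.6). A combination of Lemma 7.2 and Theorem 3.2 implies that the finite-dimensional distributions of the processes $n^{-1/4}H^n$ converge weakly, as $n\to\infty$, to the finite-dimensional distributions of the mean-zero Gaussian process $H$ with covariance
$$E\,H(s,q)H(t,r) = \varrho(\bar y)^2\,\Gamma_q((s,q),(t,r)). \tag{7.3}$$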

7.2. Finite-dimensional convergence of $Y^n$. Next we turn to convergence of the finite-dimensional distributions of process $Y^n$ in (7.1). Recall that $B(t)$ is standard Brownian motion, and $\sigma_a^2 = E[(X^{0,0}_1 - V)^2]$ is the variance of the annealed walk. Recall the definition
$$\begin{aligned}
\Gamma_0((s,q),(t,r)) = {}&\int_{q\vee r}^{\infty}P[\sigma_aB(s)>x-q]\,P[\sigma_aB(t)>x-r]\,dx\\
&- \Bigl\{\mathbf 1\{r>q\}\int_q^r P[\sigma_aB(s)>x-q]\,P[\sigma_aB(t)\le x-r]\,dx\\
&\qquad+ \mathbf 1\{q>r\}\int_r^q P[\sigma_aB(s)\le x-q]\,P[\sigma_aB(t)>x-r]\,dx\Bigr\}\\
&+ \int_{-\infty}^{q\wedge r}P[\sigma_aB(s)\le x-q]\,P[\sigma_aB(t)\le x-r]\,dx.
\end{aligned}$$
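As a numerical illustration of this definition, the sketch below approximates $\Gamma_0$ by a midpoint rule. It is only a sanity check under stated assumptions, not part of the paper: the names `tail`, `midpoint`, `gamma0`, the truncation parameter `cutoff` and the default $\sigma_a=1$ are hypothetical choices, and both mixed terms are subtracted, matching the signs of the mixed sums in the expansion (7.8)–(7.10).

```python
import math


def tail(sigma_a, s, y):
    """P[sigma_a * B(s) > y] for a standard Brownian motion B and s > 0."""
    return 0.5 * math.erfc(y / (sigma_a * math.sqrt(2.0 * s)))


def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))


def gamma0(s, q, t, r, sigma_a=1.0, cutoff=10.0):
    """Numerical approximation of Gamma_0((s, q), (t, r)) for s, t > 0."""
    # The infinite ranges are truncated `cutoff` standard deviations out.
    span = cutoff * sigma_a * math.sqrt(max(s, t))

    def up_s(x):
        return tail(sigma_a, s, x - q)  # P[sigma_a B(s) > x - q]

    def up_t(x):
        return tail(sigma_a, t, x - r)  # P[sigma_a B(t) > x - r]

    total = midpoint(lambda x: up_s(x) * up_t(x), max(q, r), max(q, r) + span)
    total += midpoint(lambda x: (1 - up_s(x)) * (1 - up_t(x)),
                      min(q, r) - span, min(q, r))
    if r > q:
        total -= midpoint(lambda x: up_s(x) * (1 - up_t(x)), q, r)
    elif q > r:
        total -= midpoint(lambda x: (1 - up_s(x)) * up_t(x), r, q)
    return total


if __name__ == "__main__":
    print(gamma0(1.0, 0.3, 2.0, -0.5), gamma0(2.0, -0.5, 1.0, 0.3))
```

With these signs the function is symmetric under exchanging $(s,q)$ and $(t,r)$, as a covariance must be. On the diagonal $(s,q)=(t,r)$ the two remaining integrals can be evaluated in closed form by an integration by parts, giving $\sigma_a\sqrt{s}\,(\sqrt2-1)/\sqrt\pi$, which the sketch reproduces to within the quadrature error.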


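Recall from (2.9) that $v(\bar y)$ is the variance of the increments around $\lfloor n\bar y\rfloor$. Let $\{Y(t,r) : t\ge 0, r\in\mathbb R\}$ be a real-valued mean-zero Gaussian process with covariance
$$E\,Y(s,q)Y(t,r) = v(\bar y)\,\Gamma_0((s,q),(t,r)). \tag{7.4}$$
Fix $N$ and space-time points $(t_1,r_1),\dots,(t_N,r_N)\in\mathbb R_+\times\mathbb R$. Define vectors
$$\mathbf Y_n = n^{-1/4}\bigl(Y^n(t_1,r_1),\dots,Y^n(t_N,r_N)\bigr)\quad\text{and}\quad\mathbf Y = \bigl(Y(t_1,r_1),\dots,Y(t_N,r_N)\bigr).$$
This section is devoted to the proof of the next proposition, after which we finish the proof of Theorem 2.1.

Proposition 7.1. For any vector $\theta = (\theta_1,\dots,\theta_N)\in\mathbb R^N$, $\mathbf E_\omega(e^{i\theta\cdot\mathbf Y_n})\to E(e^{i\theta\cdot\mathbf Y})$ in $\mathbb P$-probability as $n\to\infty$.

Proof. Let $G$ be a centered Gaussian variable with variance
$$S = v(\bar y)\sum_{k,l=1}^{N}\theta_k\theta_l\,\Gamma_0((t_k,r_k),(t_l,r_l))$$
and so $\theta\cdot\mathbf Y$ is distributed like $G$. We will show that
$$\mathbf E_\omega(e^{i\theta\cdot\mathbf Y_n})\to E(e^{iG})\qquad\text{in }\mathbb P\text{-probability.} \tag{7.5}$$
Recalling the definition of $Y^n(t,r)$, introduce some notation:
$$\zeta_n^\omega(i,t,r) = \mathbf 1\{i>x_{n,r}\}P^\omega\bigl\{i\le X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr\} - \mathbf 1\{i\le x_{n,r}\}P^\omega\bigl\{i>X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\bigr\}$$
so that
$$Y^n(t,r) = \sum_{i\in\mathbb Z}\bigl(\eta^n_0(i)-\varrho(i/n)\bigr)\zeta_n^\omega(i,t,r).$$
Then put
$$\nu_n^\omega(i) = \sum_{k=1}^{N}\theta_k\,\zeta_n^\omega(i,t_k,r_k)\quad\text{and}\quad U_n(i) = n^{-1/4}\bigl(\eta^n_0(i)-\varrho(i/n)\bigr)\nu_n^\omega(i).$$
Consequently
$$\theta\cdot\mathbf Y_n = \sum_{i\in\mathbb Z}U_n(i).$$
To separate out the relevant terms let $\delta>0$ be small and define
$$W_n = \sum_{i=\lfloor n\bar y\rfloor-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n\bar y\rfloor+\lfloor n^{1/2+\delta}\rfloor}U_n(i).$$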


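For fixed $\omega$ and $n$, under the measure $\mathbf P_\omega$ the variables $\{U_n(i)\}$ are constant multiples of centered increments $\eta^n_0(i)-\varrho(i/n)$ and hence independent and mean zero. Recall also that second moments of centered increments $\eta^n_0(i)-\varrho(i/n)$ are uniformly bounded. Thus the terms left out of $W_n$ satisfy
$$\mathbf E_\omega\bigl[(W_n-\theta\cdot\mathbf Y_n)^2\bigr]\le Cn^{-1/2}\sum_{i:\,|i-\lfloor n\bar y\rfloor|>n^{1/2+\delta}}\nu_n^\omega(i)^2,$$
and we wish to show that this upper bound vanishes for $\mathbb P$-almost every $\omega$ as $n\to\infty$. Using the definition of $\nu_n^\omega(i)$, bounding the sum on the right reduces to bounding sums of the two types
$$n^{-1/2}\sum_{i:\,|i-\lfloor n\bar y\rfloor|>n^{1/2+\delta}}\mathbf 1\{i>x(n,r_k)\}\Bigl(P^\omega\bigl\{i\le X^{x(n,r_k)+\lfloor nt_kb\rfloor,\,\lfloor nt_k\rfloor}_{\lfloor nt_k\rfloor}\bigr\}\Bigr)^2$$
and
$$n^{-1/2}\sum_{i:\,|i-\lfloor n\bar y\rfloor|>n^{1/2+\delta}}\mathbf 1\{i\le x(n,r_k)\}\Bigl(P^\omega\bigl\{i>X^{x(n,r_k)+\lfloor nt_kb\rfloor,\,\lfloor nt_k\rfloor}_{\lfloor nt_k\rfloor}\bigr\}\Bigr)^2.$$
For large enough $n$ the points $x(n,r_k)$ lie within $\tfrac12n^{1/2+\delta}$ of $\lfloor n\bar y\rfloor$, and then the previous sums are bounded by the sums
$$n^{-1/2}\sum_{i\ge x(n,r_k)+(1/2)n^{1/2+\delta}}\Bigl(P^\omega\bigl\{i\le X^{x(n,r_k)+\lfloor nt_kb\rfloor,\,\lfloor nt_k\rfloor}_{\lfloor nt_k\rfloor}\bigr\}\Bigr)^2$$
and
$$n^{-1/2}\sum_{i\le x(n,r_k)-(1/2)n^{1/2+\delta}}\Bigl(P^\omega\bigl\{i>X^{x(n,r_k)+\lfloor nt_kb\rfloor,\,\lfloor nt_k\rfloor}_{\lfloor nt_k\rfloor}\bigr\}\Bigr)^2.$$
These vanish for $\mathbb P$-almost every $\omega$ as $n\to\infty$ by Lemma 4.3, in a manner similar to the second sum in (7.2). Thus $\mathbf E_\omega(W_n-\theta\cdot\mathbf Y_n)^2\to 0$ and our goal (7.5) has simplified to
$$\mathbf E_\omega(e^{iW_n})\to E(e^{iG})\qquad\text{in }\mathbb P\text{-probability.} \tag{7.6}$$
We use the Lindeberg-Feller theorem to formulate conditions for a central limit theorem for $W_n$ under a fixed $\omega$. For Lindeberg-Feller we need to check two conditions:
$$\text{(LF-i)}\qquad S^n(\omega)\equiv\sum_{i=\lfloor n\bar y\rfloor-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n\bar y\rfloor+\lfloor n^{1/2+\delta}\rfloor}\mathbf E_\omega\bigl[U_n(i)^2\bigr]\underset{n\to\infty}{\longrightarrow}S,$$
$$\text{(LF-ii)}\qquad\sum_{i=\lfloor n\bar y\rfloor-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n\bar y\rfloor+\lfloor n^{1/2+\delta}\rfloor}\mathbf E_\omega\bigl[U_n(i)^2\cdot\mathbf 1\{|U_n(i)|>\varepsilon\}\bigr]\underset{n\to\infty}{\longrightarrow}0\quad\text{for all }\varepsilon>0.$$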


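To see that (LF-ii) holds, pick conjugate exponents $p,q>1$ ($1/p+1/q=1$):
$$\begin{aligned}
\mathbf E_\omega\bigl[U_n(i)^2\cdot\mathbf 1\{U_n(i)^2>\varepsilon^2\}\bigr]
&\le\bigl(\mathbf E_\omega|U_n(i)|^{2p}\bigr)^{1/p}\bigl(\mathbf P_\omega\{U_n(i)^2>\varepsilon^2\}\bigr)^{1/q}\\
&\le\varepsilon^{-2/q}\bigl(\mathbf E_\omega|U_n(i)|^{2p}\bigr)^{1/p}\bigl(\mathbf E_\omega U_n(i)^2\bigr)^{1/q}\\
&\le Cn^{-1/2-1/(2q)}.
\end{aligned}$$
In the last step we used the bound $|U_n(i)|\le Cn^{-1/4}|\eta^n_0(i)-\varrho(i/n)|$, boundedness of $\varrho$, and we took $p$ close enough to 1 to apply assumption (2.11). Condition (LF-ii) follows if $\delta<1/(2q)$. We turn to condition (LF-i):
$$\begin{aligned}
S^n(\omega) &= \sum_{i=\lfloor n\bar y\rfloor-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n\bar y\rfloor+\lfloor n^{1/2+\delta}\rfloor}\mathbf E_\omega\bigl[U_n(i)^2\bigr] = \sum_{i=\lfloor n\bar y\rfloor-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n\bar y\rfloor+\lfloor n^{1/2+\delta}\rfloor}n^{-1/2}v(i/n)[\nu_n^\omega(i)]^2\\
&= \sum_{i=\lfloor n\bar y\rfloor-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n\bar y\rfloor+\lfloor n^{1/2+\delta}\rfloor}n^{-1/2}[v(i/n)-v(\bar y)]\,[\nu_n^\omega(i)]^2 + \sum_{i=\lfloor n\bar y\rfloor-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n\bar y\rfloor+\lfloor n^{1/2+\delta}\rfloor}n^{-1/2}v(\bar y)[\nu_n^\omega(i)]^2.
\end{aligned}$$
Due to the local Hölder-property (2.10) of $v$, the first sum on the last line is bounded above by
$$C(\bar y)\,n^{1/2+\delta}\,n^{-1/2}\bigl(n^{-1/2+\delta}\bigr)^{\gamma} = C(\bar y)\,n^{\delta(1+\gamma)-\gamma/2}\to 0$$
for sufficiently small $\delta$. Denote the remaining relevant part by $\tilde S^n(\omega)$, given by
$$\begin{aligned}
\tilde S^n(\omega) &= \sum_{i=\lfloor n\bar y\rfloor-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n\bar y\rfloor+\lfloor n^{1/2+\delta}\rfloor}n^{-1/2}v(\bar y)[\nu_n^\omega(i)]^2
= v(\bar y)\,n^{-1/2}\sum_{m=-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n^{1/2+\delta}\rfloor}\bigl(\nu_n^\omega(m+\lfloor n\bar y\rfloor)\bigr)^2\\
&= v(\bar y)\sum_{k,l=1}^{N}\theta_k\theta_l\,n^{-1/2}\sum_{m=-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n^{1/2+\delta}\rfloor}\zeta_n^\omega(\lfloor n\bar y\rfloor+m,t_k,r_k)\,\zeta_n^\omega(\lfloor n\bar y\rfloor+m,t_l,r_l).
\end{aligned} \tag{7.7}$$
Consider for the moment a particular $(k,l)$ term in the first sum on line (7.7). Rename $(s,q) = (t_k,r_k)$ and $(t,r) = (t_l,r_l)$. Expanding the product of the $\zeta_n^\omega$-factors gives three sums: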

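$$\begin{align}
n^{-1/2}&\sum_{m=-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n^{1/2+\delta}\rfloor}\zeta_n^\omega(\lfloor n\bar y\rfloor+m,s,q)\,\zeta_n^\omega(\lfloor n\bar y\rfloor+m,t,r)\notag\\
&= n^{-1/2}\sum_{m=-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n^{1/2+\delta}\rfloor}\mathbf 1\{m>\lfloor q\sqrt n\rfloor\}\mathbf 1\{m>\lfloor r\sqrt n\rfloor\}\,P^\omega\bigl\{X^{x(n,q)+\lfloor nsb\rfloor,\,\lfloor ns\rfloor}_{\lfloor ns\rfloor}\ge\lfloor n\bar y\rfloor+m\bigr\}\notag\\
&\qquad\qquad\times P^\omega\bigl\{X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\ge\lfloor n\bar y\rfloor+m\bigr\}\tag{7.8}\\
&\quad- n^{-1/2}\sum_{m=-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n^{1/2+\delta}\rfloor}\Bigl(\mathbf 1\{m>\lfloor q\sqrt n\rfloor\}\mathbf 1\{m\le\lfloor r\sqrt n\rfloor\}\,P^\omega\bigl\{X^{x(n,q)+\lfloor nsb\rfloor,\,\lfloor ns\rfloor}_{\lfloor ns\rfloor}\ge\lfloor n\bar y\rfloor+m\bigr\}\notag\\
&\qquad\qquad\times P^\omega\bigl\{X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}<\lfloor n\bar y\rfloor+m\bigr\}\notag\\
&\qquad\quad+ \mathbf 1\{m\le\lfloor q\sqrt n\rfloor\}\mathbf 1\{m>\lfloor r\sqrt n\rfloor\}\,P^\omega\bigl\{X^{x(n,q)+\lfloor nsb\rfloor,\,\lfloor ns\rfloor}_{\lfloor ns\rfloor}<\lfloor n\bar y\rfloor+m\bigr\}\notag\\
&\qquad\qquad\times P^\omega\bigl\{X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\ge\lfloor n\bar y\rfloor+m\bigr\}\Bigr)\tag{7.9}\\
&\quad+ n^{-1/2}\sum_{m=-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n^{1/2+\delta}\rfloor}\mathbf 1\{m\le\lfloor q\sqrt n\rfloor\}\mathbf 1\{m\le\lfloor r\sqrt n\rfloor\}\,P^\omega\bigl\{X^{x(n,q)+\lfloor nsb\rfloor,\,\lfloor ns\rfloor}_{\lfloor ns\rfloor}<\lfloor n\bar y\rfloor+m\bigr\}\notag\\
&\qquad\qquad\times P^\omega\bigl\{X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}<\lfloor n\bar y\rfloor+m\bigr\}.\tag{7.10}
\end{align}$$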

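Each of these three sums (7.8)–(7.10) converges to a corresponding integral in $\mathbb P$-probability, due to the quenched CLT Theorem 3.1. To see the correct limit, just note that
$$P^\omega\bigl\{X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}<\lfloor n\bar y\rfloor+m\bigr\} = P^\omega\bigl\{X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor} - X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_0<-\lfloor ntb\rfloor+m-\lfloor r\sqrt n\rfloor\bigr\},$$
and recall that $-b = V$ is the average speed of the walks. We give technical details of the argument for the first sum in the next lemma.

Lemma 7.3. As $n\to\infty$, the sum in (7.8) converges in $\mathbb P$-probability to
$$\int_{q\vee r}^{\infty}P[\sigma_aB(s)>x-q]\,P[\sigma_aB(t)>x-r]\,dx.$$

Proof of Lemma 7.3. With
$$f_n^\omega(x) = P^\omega\bigl\{X^{x(n,q)+\lfloor nsb\rfloor,\,\lfloor ns\rfloor}_{\lfloor ns\rfloor}\ge\lfloor n\bar y\rfloor+x\sqrt n\bigr\}\times P^\omega\bigl\{X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\ge\lfloor n\bar y\rfloor+x\sqrt n\bigr\}$$
and
$$I_n^\omega = \int_{q\vee r}^{n^\delta}f_n^\omega(x)\,dx,$$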


the sum in (7.8) equals $I_n^\omega+O(n^{-1/2})$. By the quenched invariance principle Theorem 3.1, for any fixed $x$, $f_n^\omega(x)$ converges in $\mathbb P$-probability to
$$f(x) = P[\sigma_aB(s)\ge x-q]\,P[\sigma_aB(t)\ge x-r].$$
We cannot claim this convergence $\mathbb P$-almost surely because the walks $X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}$ change as $n$ changes. But by a textbook characterization of convergence in probability, for a fixed $x$ each subsequence $n(j)$ has a further subsequence $n(j_\ell)$ such that
$$\mathbb P\bigl\{\omega : f^\omega_{n(j_\ell)}(x)\underset{\ell\to\infty}{\longrightarrow}f(x)\bigr\} = 1.$$
By the diagonal trick, one can find one subsequence for all $x\in\mathbb Q$ and thus
$$\forall\{n(j)\},\ \exists\{j_\ell\} :\ \mathbb P\bigl\{\omega : \forall x\in\mathbb Q : f^\omega_{n(j_\ell)}(x)\to f(x)\bigr\} = 1.$$
Since $f_n^\omega$ and $f$ are nonnegative and nonincreasing, and $f$ is continuous and decreases to 0, the convergence works for all $x$ and is uniform on $[q\vee r,\infty)$. That is,
$$\forall\{n(j)\},\ \exists\{j_\ell\} :\ \mathbb P\bigl\{\omega : \bigl\|f^\omega_{n(j_\ell)}-f\bigr\|_{L^\infty[q\vee r,\infty)}\to 0\bigr\} = 1.$$
It remains to make the step to the convergence of the integral $I_n^\omega$ to $\int_{q\vee r}^{\infty}f(x)\,dx$. Define now
$$J_n^\omega(A) = \int_{q\vee r}^{A}f_n^\omega(x)\,dx.$$
Then, for any $A<\infty$,
$$\forall\{n(j)\},\ \exists\{j_\ell\} :\ \mathbb P\Bigl\{\omega : J^\omega_{n(j_\ell)}(A)\to\int_{q\vee r}^{A}f(x)\,dx\Bigr\} = 1.$$
In other words, $J_n^\omega(A)$ converges to $\int_{q\vee r}^{A}f(x)\,dx$ in $\mathbb P$-probability. Thus, for each $0<A<\infty$, there is an integer $m(A)$ such that for all $n\ge m(A)$,
$$\mathbb P\Bigl\{\omega : \Bigl|J_n^\omega(A)-\int_{q\vee r}^{A}f(x)\,dx\Bigr|>A^{-1}\Bigr\}<A^{-1}.$$
Pick $A_n\nearrow\infty$ such that $m(A_n)\le n$. Under the annealed measure $P$, $X^{0,0}_n$ is a homogeneous mean zero random walk with variance $O(n)$. Consequently


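$$\begin{aligned}
E\bigl[\,|I_n^\omega-J_n^\omega(A_n)|\,\bigr] &\le\int_{A_n\wedge n^\delta}^{\infty}E[f_n^\omega(x)]\,dx\\
&\le\int_{A_n\wedge n^\delta}^{\infty}P\bigl\{X^{x(n,r)+\lfloor ntb\rfloor,\,\lfloor nt\rfloor}_{\lfloor nt\rfloor}\ge x(n,r)-\lfloor r\sqrt n\rfloor+x\sqrt n\bigr\}\,dx\underset{n\to\infty}{\longrightarrow}0.
\end{aligned}$$
Combine this with
$$\mathbb P\Bigl\{\omega : \Bigl|J_n^\omega(A_n)-\int_{q\vee r}^{A_n}f(x)\,dx\Bigr|>A_n^{-1}\Bigr\}<A_n^{-1}.$$
Since $\int_{q\vee r}^{A_n}f(x)\,dx$ converges to $\int_{q\vee r}^{\infty}f(x)\,dx$, we have shown that $I_n^\omega$ converges to this same integral in $\mathbb P$-probability. This completes the proof of Lemma 7.3.

We return to the main development, the proof of Proposition 7.1. Apply the argument of the lemma to the three sums (7.8)–(7.10) to conclude the following limit in $\mathbb P$-probability:
$$\begin{aligned}
\lim_{n\to\infty}n^{-1/2}&\sum_{m=-\lfloor n^{1/2+\delta}\rfloor}^{\lfloor n^{1/2+\delta}\rfloor}\zeta_n^\omega(\lfloor n\bar y\rfloor+m,s,q)\,\zeta_n^\omega(\lfloor n\bar y\rfloor+m,t,r)\\
&= \int_{q\vee r}^{\infty}P[\sigma_aB(s)>x-q]\,P[\sigma_aB(t)>x-r]\,dx\\
&\quad- \Bigl\{\mathbf 1\{r>q\}\int_q^r P[\sigma_aB(s)>x-q]\,P[\sigma_aB(t)\le x-r]\,dx\\
&\qquad\quad+ \mathbf 1\{q>r\}\int_r^q P[\sigma_aB(s)\le x-q]\,P[\sigma_aB(t)>x-r]\,dx\Bigr\}\\
&\quad+ \int_{-\infty}^{q\wedge r}P[\sigma_aB(s)\le x-q]\,P[\sigma_aB(t)\le x-r]\,dx\\
&= \Gamma_0((s,q),(t,r)).
\end{aligned}$$
Return to condition (LF-i) of the Lindeberg-Feller theorem and the definition (7.7) of $\tilde S^n(\omega)$. Since $S^n(\omega)-\tilde S^n(\omega)\to 0$ as pointed out above (7.7), we have shown that $S^n\to S$ in $\mathbb P$-probability. Consequently
$$\forall\{n(j)\},\ \exists\{j_\ell\} :\ \mathbb P\bigl\{\omega : S^{n(j_\ell)}(\omega)\to S\bigr\} = 1.$$
This can be rephrased as: given any subsequence $\{n(j)\}$, there exists a further subsequence $\{n(j_\ell)\}$ along which conditions (LF-i) and (LF-ii) of the Lindeberg-Feller theorem are satisfied for the array
$$\bigl\{U_{n(j_\ell)}(i) : \lfloor n(j_\ell)\bar y\rfloor-n(j_\ell)^{1/2+\delta}\le i\le\lfloor n(j_\ell)\bar y\rfloor+n(j_\ell)^{1/2+\delta},\ \ell\ge 1\bigr\}$$
under the measure $\mathbf P_\omega$ for $\mathbb P$-a.e. $\omega$. This implies that
$$\forall\{n(j)\},\ \exists\{j_\ell\} :\ \mathbb P\bigl\{\omega : \mathbf E_\omega(e^{iW_{n(j_\ell)}})\to E(e^{iG})\bigr\} = 1.$$
But the last statement characterizes convergence $\mathbf E_\omega(e^{iW_n})\to E(e^{iG})$ in $\mathbb P$-probability. As we already showed above that $W_n-\theta\cdot\mathbf Y_n\to 0$ in $\mathbf P_\omega$-probability $\mathbb P$-almost surely, this completes the proof of Proposition 7.1.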


7.3. Proofs of Theorem 2.1 and Proposition 2.2.

Proof of Theorem 2.1. The decomposition (7.1) gives $z_n = n^{-1/4}(Y^n+H^n)$. The paragraph that follows Lemma 7.2 and Proposition 7.1 verify the hypotheses of Lemma 7.1 for $H^n$ and $Y^n$. Thus we have the limit $z_n\to z\equiv Y+H$ in the sense of convergence of finite-dimensional distributions. Since $Y$ and $H$ are mutually independent mean-zero Gaussian processes, their covariances in (7.3) and (7.4) can be added to give (2.18).

Proof of Proposition 2.2. The value (2.23) for $\beta$ can be computed from (2.8), or from the probabilistic characterization (4.4) of $\beta$ via Example 4.1. If we let $u$ denote a random variable distributed like $u_0(0,-1)$, then we get
$$\beta = \frac{Eu-E(u^2)}{Eu-(Eu)^2}\quad\text{and}\quad\kappa = \frac{E(u^2)-(Eu)^2}{Eu-E(u^2)}.$$
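The value of $\kappa$ can be cross-checked against the variance identity that the proof derives next from the evolution step (2.22). The sketch below does this in exact rational arithmetic. It is illustrative only: the function names `kappa_from_moments` and `variance_after_step` are hypothetical, and the choice of $u$ uniform on $(0,1)$, with $Eu=1/2$ and $E(u^2)=1/3$, is an assumed example that does not come from the paper.

```python
from fractions import Fraction


def kappa_from_moments(m1, m2):
    """kappa = (E(u^2) - (Eu)^2) / (Eu - E(u^2)), with m1 = Eu and m2 = E(u^2)."""
    return (m2 - m1 ** 2) / (m1 - m2)


def variance_after_step(v, rho, m1, m2):
    """Right-hand side of the identity v = v(1 - 2Eu + 2E(u^2)) + 2 rho^2 Var(u)."""
    return v * (1 - 2 * m1 + 2 * m2) + 2 * rho ** 2 * (m2 - m1 ** 2)


if __name__ == "__main__":
    # Assumed example: u uniform on (0, 1), so Eu = 1/2 and E(u^2) = 1/3.
    m1, m2 = Fraction(1, 2), Fraction(1, 3)
    rho = Fraction(3)
    kappa = kappa_from_moments(m1, m2)
    v = kappa * rho ** 2
    # v = kappa * rho^2 should be a fixed point of the variance update.
    print(kappa, variance_after_step(v, rho, m1, m2) == v)
```

Using `Fraction` keeps the fixed-point check exact, so the equality test is not affected by rounding.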

With obvious notational simplifications, the evolution step (2.22) can be rewritten as
$$\eta'(k)-\rho = (1-u_k)(\eta(k)-\rho) + u_{k-1}(\eta(k-1)-\rho) + (u_{k-1}-u_k)\rho.$$
Square both sides, take expectations, use the independence of all variables $\{\eta(k-1),\eta(k),u_k,u_{k-1}\}$ on the right, and use the requirement that $\eta'(k)$ have the same variance $v$ as $\eta(k)$ and $\eta(k-1)$. The result is the identity
$$v = v\bigl(1-2Eu+2E(u^2)\bigr) + 2\rho^2\bigl(E(u^2)-(Eu)^2\bigr)$$
from which follows $v = \kappa\rho^2$. The rest of part (b) is a straightforward specialization of (2.18).

Acknowledgements. The authors thank P. Ferrari and L. R. Fontes for comments on article [14] and J. Swanson for helpful discussions.

References

1. Aldous, D., Diaconis, P.: Hammersley’s interacting particle process and longest increasing subsequences. Probab. Theory Related Fields 103(2), 199–213 (1995)
2. Aldous, D., Diaconis, P.: Longest increasing subsequences: from patience sorting to the Baik-Deift-Johansson theorem. Bull. Amer. Math. Soc. (N.S.) 36(4), 413–432 (1999)
3. Baik, J., Deift, P., Johansson, K.: On the distribution of the length of the longest increasing subsequence of random permutations. J. Amer. Math. Soc. 12(4), 1119–1178 (1999)
4. Balázs, M.: Growth fluctuations in a class of deposition models. Ann. Inst. H. Poincaré Probab. Statist. 39(4), 639–685 (2003)
5. Bernabei, M.S.: Anomalous behaviour for the random corrections to the cumulants of random walks in fluctuating random media. Probab. Theory Related Fields 119(3), 410–432 (2001)
6. Boldrighini, C., Pellegrinotti, A.: T^{−1/4}-noise for random walks in dynamic environment on Z. Mosc. Math. J. 1(3), 365–380, 470–471 (2001)
7. Burkholder, D.L.: Distribution function inequalities for martingales. Ann. Probab. 1, 19–42 (1973)
8. Chung, K.L.: Markov chains with stationary transition probabilities. In: Die Grundlehren der mathematischen Wissenschaften, Band 104. New York: Springer-Verlag, Second edition, 1967
9. De Masi, A., Presutti, E.: Mathematical methods for hydrodynamic limits, Volume 1501 of Lecture Notes in Mathematics. Berlin: Springer-Verlag, 1991
10. Deift, P.: Integrable systems and combinatorial theory. Notices Amer. Math. Soc. 47(6), 631–640 (2000)
11. Durrett, R.: Probability: theory and examples. Duxbury Advanced Series. Belmont, CA: Brooks/Cole–Thomson, Third edition, 2004
12. Ethier, S.N., Kurtz, T.G.: Markov processes. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. New York: John Wiley & Sons Inc., 1986
13. Ferrari, P.A., Fontes, L.R.G.: Current fluctuations for the asymmetric simple exclusion process. Ann. Probab. 22(2), 820–832 (1994)
14. Ferrari, P.A., Fontes, L.R.G.: Fluctuations of a surface submitted to a random average process. Electron. J. Probab. 3, no. 6, 34 pp., (1998) (electronic)
15. Ferrari, P.L., Spohn, H.: Scaling Limit for the Space-Time Covariance of the Stationary Totally Asymmetric Simple Exclusion Process. Commun. Math. Phys. (2006) (in press)
16. Gravner, J., Tracy, C.A., Widom, H.: Limit theorems for height fluctuations in a class of discrete space and time growth models. J. Stat. Phys. 102(5–6), 1085–1132 (2001)
17. Groeneboom, P.: Hydrodynamical methods for analyzing longest increasing subsequences. J. Comput. Appl. Math. 142(1), 83–105 (2002)
18. Hammersley, J.M.: A few seedlings of research. In: Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability (Univ. California, Berkeley, Calif., 1970/1971), Vol. I: Theory of statistics, Berkeley, Univ. California Press, 1972, pp. 345–394
19. Johansson, K.: Shape fluctuations and random matrices. Commun. Math. Phys. 209(2), 437–476 (2000)
20. Johansson, K.: Discrete orthogonal polynomial ensembles and the Plancherel measure. Ann. Math. (2) 153(1), 259–296 (2001)
21. Kipnis, C., Landim, C.: Scaling limits of interacting particle systems. Volume 320 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], Berlin: Springer-Verlag, 1999
22. Liggett, T.M.: Interacting particle systems. Volume 276 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], New York: Springer-Verlag, 1985
23. Liggett, T.M.: Stochastic interacting systems: contact, voter and exclusion processes. Volume 324 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], Berlin: Springer-Verlag, 1999
24. Rassoul-Agha, F., Seppäläinen, T.: An almost sure invariance principle for random walks in a space-time random environment. Probab. Th. Rel. Fields 133, no. 3, 299–314 (2005)
25. Rezakhanlou, F.: A central limit theorem for the asymmetric simple exclusion process. Ann. Inst. H. Poincaré Probab. Statist. 38(4), 437–464 (2002)
26. Seppäläinen, T.: A microscopic model for the Burgers equation and longest increasing subsequences. Electron. J. Probab. 1, no. 5, approx. 51 pp., (1996) (electronic)
27. Seppäläinen, T.: Exact limiting shape for a simplified model of first-passage percolation on the plane. Ann. Probab. 26(3), 1232–1250 (1998)
28. Seppäläinen, T.: Large deviations for increasing sequences on the plane. Probab. Th. Rel. Fields 112(2), 221–244 (1998)
29. Seppäläinen, T.: Diffusive fluctuations for one-dimensional totally asymmetric interacting random dynamics. Commun. Math. Phys. 229(1), 141–182 (2002)
30. Seppäläinen, T.: Second-order fluctuations and current across characteristic for a one-dimensional growth model of independent random walks. Ann. Probab. 33(2), 759–797 (2005)
31. Spitzer, F.: Principles of random walk. New York: Springer-Verlag, 1976
32. Spohn, H.: Large scale dynamics of interacting particles. Berlin: Springer-Verlag, 1991
33. Varadhan, S.R.S.: Lectures on hydrodynamic scaling. In: Hydrodynamic limits and related topics (Toronto, ON, 1998), Volume 27 of Fields Inst. Commun., Providence, RI: Amer. Math. Soc., 2000, pp. 3–40
34. Walsh, J.B.: An introduction to stochastic partial differential equations. In: École d’été de probabilités de Saint-Flour, XIV—1984, Volume 1180 of Lecture Notes in Math., Berlin: Springer, 1986, pp. 265–439

Communicated by H. Spohn

Commun. Math. Phys. 266, 547–569 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0061-x


Incompressible and Compressible Limits of Coupled Systems of Nonlinear Schrödinger Equations

Tai-Chia Lin (1,2), Ping Zhang (3)

1 National Taiwan University, Department of Mathematics, No. 1, Sec. 4, Roosevelt Road, Taipei, Taiwan 106. E-mail: [email protected]
2 National Center of Theoretical Sciences, National Tsing Hua University, Hsinchu, Taiwan
3 Academy of Mathematics & Systems Science, CAS, Beijing 100080, P.R. China. E-mail: [email protected]

Received: 12 September 2005 / Accepted: 12 February 2006
Published online: 29 June 2006 – © Springer-Verlag 2006

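Abstract: Recently, coupled systems of nonlinear Schrödinger equations have been used extensively to describe a double condensate, i.e. a binary mixture of Bose–Einstein condensates. In a double condensate, an interface and shock waves may occur due to large intraspecies and interspecies scattering lengths. To understand the dynamics of an interface and to establish the existence of shock waves in a double condensate, we study the incompressible and the compressible limits, respectively, of two coupled systems of nonlinear Schrödinger equations. The main idea of our arguments is to define an “H-functional”, similar to a Lyapunov functional, which can control the propagation of densities and linear momenta. This idea differs from the approach that uses the standard Wigner transform to investigate the incompressible and the compressible limits of a single nonlinear Schrödinger equation.

1. Introduction

Here we study the incompressible and the compressible (semiclassical) limits, respectively, of two coupled systems of nonlinear Schrödinger equations, called Gross–Pitaevskii equations, given by
\[
\begin{cases}
i\varepsilon^{\frac12-\alpha}\,\partial_t\psi_1^\varepsilon = -\dfrac{\varepsilon}{2}\Delta\psi_1^\varepsilon + \dfrac{1}{\varepsilon}\big(|\psi_1^\varepsilon|^2-1\big)\psi_1^\varepsilon + \dfrac{\beta}{\varepsilon}|\psi_2^\varepsilon|^2\psi_1^\varepsilon,\\[2mm]
i\varepsilon^{\frac12-\alpha}\,\partial_t\psi_2^\varepsilon = -\dfrac{\varepsilon}{2}\Delta\psi_2^\varepsilon + \dfrac{1}{\varepsilon}\big(|\psi_2^\varepsilon|^2-1\big)\psi_2^\varepsilon + \dfrac{\beta}{\varepsilon}|\psi_1^\varepsilon|^2\psi_2^\varepsilon,
\end{cases}
\qquad x\in\Omega,\ t>0, \tag{1.1}
\]
and
\[
\begin{cases}
i\varepsilon^{\frac12}\,\partial_t\psi_1^\varepsilon = -\dfrac{\varepsilon}{2}\Delta\psi_1^\varepsilon + |\psi_1^\varepsilon|^2\psi_1^\varepsilon + \gamma|\psi_2^\varepsilon|^2\psi_1^\varepsilon,\\[2mm]
i\varepsilon^{\frac12}\,\partial_t\psi_2^\varepsilon = -\dfrac{\varepsilon}{2}\Delta\psi_2^\varepsilon + |\psi_2^\varepsilon|^2\psi_2^\varepsilon + \gamma|\psi_1^\varepsilon|^2\psi_2^\varepsilon,
\end{cases}
\qquad x\in\Omega,\ t>0, \tag{1.2}
\]
where $\psi_j^\varepsilon=\psi_j^\varepsilon(x,t)\in\mathbb{C}$, $j=1,2$, $0<\varepsilon\ll1$ is a small parameter, and $0<\alpha<\frac16$, $\beta>1$ and $\gamma\ge1$ are constants. Hereafter, the domain $\Omega$ is a bounded smooth domain in $\mathbb{R}^2$, and the associated boundary conditions are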


(B) Neumann boundary conditions, i.e.
\[
\frac{\partial\psi_j^\varepsilon}{\partial\vec n}\Big|_{\partial\Omega}=0 \quad\text{for } t>0,\ j=1,2.
\]

The systems (1.1) and (1.2) may come from the following system:
\[
\begin{cases}
i\hbar\,\partial_t\psi_1 = -\dfrac{\hbar^2}{2m}\Delta\psi_1 + U_{11}|\psi_1|^2\psi_1 + U_{12}|\psi_2|^2\psi_1,\\[2mm]
i\hbar\,\partial_t\psi_2 = -\dfrac{\hbar^2}{2m}\Delta\psi_2 + U_{22}|\psi_2|^2\psi_2 + U_{12}|\psi_1|^2\psi_2,
\end{cases}
\qquad x\in\Omega,\ t>0, \tag{1.3}
\]
which models a binary mixture of Bose–Einstein condensates, called a double condensate, without the effect of trap potentials. To investigate the superfluidity of a double condensate, trap potentials should be switched off so that the condensates may expand freely. Physically, $\hbar$ is the Planck constant divided by $2\pi$, $m$ is the atom mass, $U_{jj}\sim N a_{jj}$, $j=1,2$, and $U_{12}\sim N a_{12}$, where $a_{jj}$ is the intraspecies scattering length of the $j$-th hyperfine state, $j=1,2$, $a_{12}$ is the interspecies scattering length, and $N$ is the number of condensate atoms. The system (1.3) can be transformed into the system (1.2) by setting $U_{11}=U_{22}=\varepsilon^{-1}\frac{\hbar^2}{m}$, $U_{12}=\gamma\varepsilon^{-1}\frac{\hbar^2}{m}$, and a suitable time scaling. On the other hand, let $\Psi_j=e^{i\mu_j t/\hbar}\psi_j$, $j=1,2$, where $\mu_j\sim U_{jj}$ is the chemical potential of the corresponding component. Then the system (1.3) becomes
\[
\begin{cases}
i\hbar\,\partial_t\Psi_1 = -\dfrac{\hbar^2}{2m}\Delta\Psi_1 - \mu_1\Psi_1 + U_{11}|\Psi_1|^2\Psi_1 + U_{12}|\Psi_2|^2\Psi_1,\\[2mm]
i\hbar\,\partial_t\Psi_2 = -\dfrac{\hbar^2}{2m}\Delta\Psi_2 - \mu_2\Psi_2 + U_{22}|\Psi_2|^2\Psi_2 + U_{12}|\Psi_1|^2\Psi_2,
\end{cases}
\qquad x\in\Omega,\ t>0, \tag{1.4}
\]
which can be transformed into the system (1.1) by setting $U_{jj}=\mu_j=\varepsilon^{-2}\frac{\hbar^2}{m}$, $j=1,2$, $U_{12}=\beta\varepsilon^{-2}\frac{\hbar^2}{m}$, and another proper time scaling.

Bose–Einstein condensates are composed of ultracold dilute Bose gases which may be influenced by boundary conditions. The conventional boundary conditions of the condensates are zero Dirichlet boundary conditions (cf. [8]) if the domain $\Omega$ is regarded as a region where a double condensate dwells. However, by [6], zero Dirichlet boundary conditions may enhance the specific heat more than Neumann boundary conditions do. This may result in more thermal fluctuations, which may hinder the expansion of the condensates. To reduce the thermal effect of boundary conditions, we may use Neumann boundary conditions, i.e. (B), instead of zero Dirichlet boundary conditions.

Shock waves may occur in a single Bose–Einstein condensate having a large initial inhomogeneity of density and a strongly repulsive interaction between atoms. Since the sound velocity in Bose–Einstein condensates is proportional to the square root of the density (cf. [18]), the inhomogeneity of density may result in severe compression and wave breaking. Such wave breaking may cause shock waves, as in classical compressible gas dynamics, if the quantum pressure is negligible at the initial stages of evolution and the hydrodynamical approach holds. By rapidly increasing the s-wave scattering length using a Feshbach resonance, one may ignore the quantum pressure and obtain shock waves in a single Bose–Einstein condensate (cf. [17]). For a double condensate, it would be natural to believe that the quantum pressure of each component can be ignored, the hydrodynamical approach holds, and shock waves in a double condensate may occur if the scattering lengths $a_{ij}$ (i.e. the $U_{ij}$) go to infinity. Consequently, we may set $U_{11}=U_{22}=\varepsilon^{-1}\frac{\hbar^2}{m}$, $U_{12}=\gamma\varepsilon^{-1}\frac{\hbar^2}{m}$ and $\varepsilon\to0+$ in the system (1.3) to get the system (1.2), and study the compressible limit of the system (1.2) to find the compressibility of a double condensate and establish the existence of shock waves in a double condensate.
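As a plausibility check on the scaling just described, the following sketch carries out the reduction of (1.3) to (1.2) for the first component. The rescaled time variable $s$ is introduced here only for illustration: the text speaks merely of "a suitable time scaling", so the specific choice below is an assumption, not the authors' stated normalization.

```latex
% Sketch (hypothetical time variable s): from (1.3) to (1.2), first component.
% Insert U_{11} = \varepsilon^{-1}\hbar^2/m and U_{12} = \gamma\varepsilon^{-1}\hbar^2/m
% into (1.3) and multiply the equation by \varepsilon m/\hbar^2:
\begin{align*}
i\,\frac{\varepsilon m}{\hbar}\,\partial_t\psi_1
  &= -\frac{\varepsilon}{2}\Delta\psi_1 + |\psi_1|^2\psi_1 + \gamma|\psi_2|^2\psi_1 .
\end{align*}
% Assumed rescaling: t = (\varepsilon^{1/2} m/\hbar)\, s, hence
% \partial_t = \hbar/(\varepsilon^{1/2} m)\,\partial_s and
% (\varepsilon m/\hbar)\cdot\hbar/(\varepsilon^{1/2} m) = \varepsilon^{1/2}:
\begin{align*}
i\,\varepsilon^{\frac12}\,\partial_s\psi_1
  &= -\frac{\varepsilon}{2}\Delta\psi_1 + |\psi_1|^2\psi_1 + \gamma|\psi_2|^2\psi_1 ,
\end{align*}
% which has the form of the first equation of (1.2).
```

The second component is handled in the same way with the indices 1 and 2 exchanged.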
An interface, called a domain wall, may be formed between the two components of a double condensate when they repel each other strongly. As $U_{12}>$


$\sqrt{U_{11}U_{22}}$ and $U_{jj}>0$, $j=1,2$, spontaneous symmetry breaking occurs, and the two components are immiscible and separated in space by an interface; this is called phase separation ([1, 5, 9, 24]). For the existence of an interface in a double condensate, we may set $U_{jj}=\mu_j=\varepsilon^{-2}\frac{\hbar^2}{m}$, $j=1,2$, $U_{12}=\beta\varepsilon^{-2}\frac{\hbar^2}{m}$ and $\varepsilon\to0+$ in the system (1.4), and transform it into the system (1.1). Actually, such an interface may have motion affected by the superfluid current of a double condensate. To understand the dynamics of an interface in a double condensate, we study the incompressible limit of the system (1.1) and set the constant $\beta>1$ for spontaneous symmetry breaking.

The time scale of the system (1.2) is $\varepsilon^{\frac12}$, which is standard for the compressible (semiclassical) limit of nonlinear Schrödinger equations (cf. [7, 13, 25]). However, the time scale of the system (1.1) is $\varepsilon^{\frac12-\alpha}$, which is unconventional for the incompressible limit problem (cf. [19]). Since the interface energy is not negligible, we need such a time scale to deal with the effect of interface energy. One may decompose such a time scale as $\varepsilon^{\frac12}$ and $\varepsilon^{-\alpha}$, where $\varepsilon^{\frac12}$ is for the compressible limit to get compressible fluid equations, and $\varepsilon^{-\alpha}$ is for the incompressible limit of these compressible fluid equations. To explain this, we may set $(\psi_1^\varepsilon,\psi_2^\varepsilon)$, $\psi_k^\varepsilon=\sqrt{\rho_k^\varepsilon}\,e^{iS_k^\varepsilon/\varepsilon^{1/2}}$, $k=1,2$, as a solution of the system (1.1) and formally derive compressible fluid equations with a large parameter $\varepsilon^{-\alpha}$ for the time scale. This limit process corresponds to the zero Mach limit of compressible Euler equations to incompressible Euler equations (see [15], (2.69) on p. 52).

For initial conditions of (1.1), we set $\psi_k^\varepsilon|_{t=0}=\Phi_k^\varepsilon\in H^4(\Omega;\mathbb{C})$, $k=1,2$, satisfying as $\varepsilon\downarrow0$,
\[
E^\varepsilon(\Phi_1^\varepsilon,\Phi_2^\varepsilon)\overset{\mathrm{def}}{=}\int_\Omega\frac{\varepsilon}{2}\sum_{k=1}^2|\nabla\Phi_k^\varepsilon|^2\,dx+\int_\Omega\frac{1}{2\varepsilon}\Big[\Big(\sum_{k=1}^2\phi_k^\varepsilon\Big)-1\Big]^2+\frac{\beta-1}{\varepsilon}\phi_1^\varepsilon\phi_2^\varepsilon\,dx=O(\varepsilon^{-2\alpha}), \tag{1.5}
\]
\[
H_0^\varepsilon\overset{\mathrm{def}}{=}\int_\Omega\frac{\varepsilon}{2}\sum_{k=1}^2\big|(\nabla-i\varepsilon^{-\frac12-\alpha}v_0)\Phi_k^\varepsilon\big|^2\,dx+\int_\Omega\frac{1}{2\varepsilon}\Big[\Big(\sum_{k=1}^2\phi_k^\varepsilon\Big)-1\Big]^2+\frac{\beta-1}{\varepsilon}\phi_1^\varepsilon\phi_2^\varepsilon\,dx=O(1), \tag{1.6}
\]

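where $\phi_k^\varepsilon\overset{\mathrm{def}}{=}|\Phi_k^\varepsilon|^2$, $k=1,2$, and $O(1)$ denotes a bounded quantity. Here $v_0\in H^3(\Omega;\mathbb{R}^2)$ satisfies $\mathrm{div}\,v_0=0$ and $v_0\cdot\vec n|_{\partial\Omega}=0$. To match the Neumann boundary conditions (B), we assume that the $\Phi_k^\varepsilon$ are compatible with the Neumann boundary conditions (B) in some appropriate sense.

We may give an example for such $\Phi_k^\varepsilon$ as follows: Let $\Phi_k^\varepsilon=\sqrt{\phi_k^\varepsilon}\,e^{iS_k/\varepsilon^{\frac12+\alpha}}$, $k=1,2$, where $\nabla S_k=v_0$ in $\Omega_k$, $\phi_k^\varepsilon(x)=1$ if $x\in\Omega_k$, $\mathrm{dist}(x,\Gamma_0)\ge\varepsilon$; $\phi_k^\varepsilon(x)=0$ if $x\in\Gamma_0$; $\phi_k^\varepsilon(x)\in(0,1)$ with $|\nabla\sqrt{\phi_k^\varepsilon}(x)|=O(\frac1\varepsilon)$ if $x\in\Omega_k$ and $\mathrm{dist}(x,\Gamma_0)<\varepsilon$. Here $\Omega_1$ and $\Omega_2$ are two segregated smooth domains with a common boundary $\Gamma_0$, which is a bounded smooth curve acting as an interface separating the whole domain $\Omega$ into the two components $\Omega_1$ and $\Omega_2$. Then it is obvious that (1.5) and (1.6) hold.

Now we state the main theorem for incompressible limits of the $\psi_k^\varepsilon$ as follows:

Theorem 1. Let $(\psi_1^\varepsilon,\psi_2^\varepsilon)$ be the solution of the system (1.1) with the Neumann boundary conditions (B) and initial data $(\Phi_1^\varepsilon,\Phi_2^\varepsilon)\in H^4(\Omega;\mathbb{C}^2)$ satisfying (1.5) and (1.6), where $\Omega$ is a bounded smooth domain in $\mathbb{R}^2$. Let $\rho_k^\varepsilon\overset{\mathrm{def}}{=}|\psi_k^\varepsilon|^2$ and $J_k^\varepsilon\overset{\mathrm{def}}{=}\varepsilon^{\frac12+\alpha}\,\mathrm{Im}(\overline{\psi_k^\varepsilon}\nabla\psi_k^\varepsilon)$ for $k=1,2$. Then for any $T>0$, there exists $C=C(T,\|v_0\|_{H^3(\Omega)})>0$ such that
\[
\int_\Omega\frac{\varepsilon}{2}\sum_{k=1}^2\big|\nabla\sqrt{\rho_k^\varepsilon}\big|^2\,dx+\int_\Omega\frac{1}{2\varepsilon}\Big[\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-1\Big]^2+\frac{\beta-1}{\varepsilon}\rho_1^\varepsilon\rho_2^\varepsilon\,dx\le C, \tag{1.7}
\]
and
\[
\sum_{k=1}^2\int_\Omega\Big|\frac{1}{\sqrt{\rho_k^\varepsilon}}J_k^\varepsilon-\sqrt{\rho_k^\varepsilon}\,v\Big|^2\,dx\le C\varepsilon^{2\alpha}\quad\text{as }\varepsilon\downarrow0, \tag{1.8}
\]
for $t\in[0,T]$, where $v\in C([0,\infty);H^3(\Omega))$ is the unique solution of the incompressible Euler equations given by
\[
\begin{cases}
\partial_t v+(v\cdot\nabla)v+\nabla p=0 & \text{for } x\in\Omega,\ t>0,\\
\mathrm{div}\,v=0 & \text{for } x\in\Omega,\ t>0,\\
v|_{t=0}=v_0 & \text{for } x\in\Omega,
\end{cases} \tag{1.9}
\]
together with the slip boundary condition $v\cdot\vec n|_{\partial\Omega}=0$.

Remark 1.1. (1) Let $v_0\in H^3(\Omega)$ satisfy $\mathrm{div}\,v_0=0$ and some appropriate compatibility conditions with the slip boundary conditions. Then one can prove the global well-posedness of (1.9) in the class $C([0,\infty);H^3(\Omega))$ (cf. [14] for more details). One may check [13] to see why the Neumann boundary condition for (1.1) corresponds to the slip boundary condition for (1.9).
(2) Due to $\beta>1$, (1.7) gives $\lim_{\varepsilon\downarrow0}\int_\Omega\rho_1^\varepsilon\rho_2^\varepsilon\,dx=0$, which signifies the strongly repulsive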

interaction of the $\psi_k^\varepsilon$, and results in an interface separating the supports of the $\rho_k^\varepsilon$ as $\varepsilon$ goes to zero.
(3) From energy conservation (see Sect. 2) and (1.5), $E^\varepsilon(\psi_1^\varepsilon,\psi_2^\varepsilon)=E^\varepsilon(\Phi_1^\varepsilon,\Phi_2^\varepsilon)=O(\varepsilon^{-2\alpha})$ for $t\ge0$. However, this is not enough to derive the inequality (1.7), which controls the part of the energy coming from the interface. Hence we need to define a function, called the “H-functional”, given by
\[
H^\varepsilon(t)\overset{\mathrm{def}}{=}\int_\Omega\frac{\varepsilon}{2}\sum_{k=1}^2\big|(\nabla-i\varepsilon^{-\frac12-\alpha}v)\psi_k^\varepsilon\big|^2+\frac{1}{2\varepsilon}\Big[\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-1\Big]^2+\frac{\beta-1}{\varepsilon}\rho_1^\varepsilon\rho_2^\varepsilon\,dx, \tag{1.10}
\]
for $t\ge0$, where $v$ and the $\rho_k^\varepsilon$ are defined in Theorem 1. Such a functional, like a Lyapunov function, may control the propagation of densities and linear momenta of the $\psi_k^\varepsilon$. Similar approaches are also used in [19] for the incompressible limit of a single Schrödinger–Poisson system in the periodic case, and in [13] for the semiclassical limit of Gross–Pitaevskii equations in the exterior domain. The new difficulties here lie in the coupling of two nonlinear Schrödinger equations, and in the new scaling in (1.1) needed to deal with the interface problem, which differs from previously known scalings. Notice that here we do not use the standard Wigner transform approach (cf. [25]) either, as it might lead to more complicated situations in the study of the system (1.1).


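In Sect. 3, we shall prove
\[
H^\varepsilon(t)\le e^{C\|\nabla v\|_{L^\infty([0,T]\times\Omega)}T}\big(H_0^\varepsilon+C\varepsilon^{\frac12-3\alpha}\big)\quad\text{for } T>0 \text{ and } t\in[0,T]. \tag{1.11}
\]
Furthermore, Theorem 1 may imply

Corollary 1. Under the same assumptions as Theorem 1, for any $T>0$, $k=1,2$,
\[
\|J_k^\varepsilon-\rho_k^\varepsilon v\|_{L^{4/3}(\Omega)}\le C_0\,\varepsilon^\alpha, \tag{1.12}
\]
where $C_0$ is a positive constant depending on $T$, $v_0$ and $|\Omega|$.

Generally, the interface dynamics of a double condensate may depend on the associated superfluid currents. By Corollary 1, the superfluid currents of a double condensate are governed by the incompressible Euler equation. Thus it would be expected that the motion of an interface can be described by a particle trajectory equation given by
\[
\begin{cases}
\dfrac{dX}{dt}=v(X(t),t), & t>0,\\[1mm]
X(0)=w\in\Gamma_0,
\end{cases} \tag{1.13}
\]
where $v$ is the solution of (1.9) and $\Gamma_0$ is the interface of the initial data $\Phi_k^\varepsilon$.

For initial data of (1.2), we set $\psi_k^\varepsilon|_{t=0}=F_k^\varepsilon\in H^4(\Omega;\mathbb{C})$, $k=1,2$, satisfying as $\varepsilon\downarrow0$,
\[
\tilde E^\varepsilon(F_1^\varepsilon,F_2^\varepsilon)\overset{\mathrm{def}}{=}\int_\Omega\frac{\varepsilon}{2}\sum_{k=1}^2|\nabla F_k^\varepsilon|^2\,dx+\int_\Omega\frac12\Big(\sum_{k=1}^2|f_k^\varepsilon|^2\Big)+\gamma f_1^\varepsilon f_2^\varepsilon\,dx=O(1), \tag{1.14}
\]
\[
\tilde H_0^\varepsilon\overset{\mathrm{def}}{=}\int_\Omega\frac{\varepsilon}{2}\sum_{k=1}^2\big|(\nabla-i\varepsilon^{-\frac12}u_0)F_k^\varepsilon\big|^2\,dx+\int_\Omega\Big[\frac12\Big|\Big(\sum_{k=1}^2 f_k^\varepsilon\Big)-\rho_0\Big|^2+(\gamma-1)f_1^\varepsilon f_2^\varepsilon\Big]dx=o_\varepsilon(1), \tag{1.15}
\]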

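where $f_k^\varepsilon=|F_k^\varepsilon|^2$, $k=1,2$, and $o_\varepsilon(1)$ is a small quantity tending to zero as $\varepsilon$ goes to zero. Here $0<\rho_0\in H^3(\Omega;\mathbb{R})$ and $u_0\in H^3(\Omega;\mathbb{R}^2)$ satisfies $u_0\cdot\vec n|_{\partial\Omega}=0$. We also assume that the $F_k^\varepsilon$ are compatible with the Neumann boundary conditions (B).

We may give an example for such $F_k^\varepsilon$ as follows: Let $F_k^\varepsilon=\sqrt{f_k}\,e^{i\theta_k/\varepsilon^{1/2}}$, $k=1,2$, where $f_k\in C_0^\infty(\Omega;\mathbb{R})$ with support in $\Omega_k$, $\nabla\theta_k=u_0$ in $\Omega_k$, and the $\Omega_k$ are two disjoint bounded smooth domains. Then it is trivial that (1.14) and (1.15) hold.

A theorem for compressible limits of the $\psi_k^\varepsilon$ is given by

Theorem 2. Let $(\psi_1^\varepsilon,\psi_2^\varepsilon)$ be the solution of the system (1.2) with the Neumann boundary conditions (B) and initial data $(F_1^\varepsilon,F_2^\varepsilon)\in H^4(\Omega;\mathbb{C}^2)$ satisfying (1.14) and (1.15), where $\Omega$ is a bounded smooth domain in $\mathbb{R}^2$. Let $\rho_k^\varepsilon\overset{\mathrm{def}}{=}|\psi_k^\varepsilon|^2$ and $J_k^\varepsilon\overset{\mathrm{def}}{=}\varepsilon^{\frac12}\,\mathrm{Im}(\overline{\psi_k^\varepsilon}\nabla\psi_k^\varepsilon)$ for $k=1,2$. Then there exists $T_*>0$ such that
\[
\|(\rho_1^\varepsilon+\rho_2^\varepsilon-\rho)(\cdot,t)\|_{L^2(\Omega)}\to0, \tag{1.16}
\]
and
\[
\|(J_1^\varepsilon+J_2^\varepsilon-\rho u)(\cdot,t)\|_{L^{4/3}(\Omega)}\to0\quad\text{for } t\in[0,T_*)\text{ as }\varepsilon\downarrow0, \tag{1.17}
\]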


where $(\rho,u)\in C([0,T_*);H^3(\Omega))$ is the unique solution of the compressible Euler equations given by
\[
\begin{cases}
\partial_t\rho+\mathrm{div}(\rho u)=0,\\
\partial_t u+(u\cdot\nabla)u+\nabla\rho=0 & \text{for } x\in\Omega,\ t\in(0,T_*),\\
\rho|_{t=0}=\rho_0(x),\quad u|_{t=0}=u_0(x) & \text{for } x\in\Omega,
\end{cases} \tag{1.18}
\]
together with the slip boundary condition $u\cdot\vec n|_{\partial\Omega}=0$. Here $T_*$ is the maximal time of existence of the regular solution of (1.18).

Remark 1.2. (1) Under the assumptions that $0<\rho_0\in H^3(\Omega;\mathbb{R})$ and $u_0\in H^3(\Omega;\mathbb{R}^2)$, Beirao [2, 3] proved the local well-posedness of (1.18) under appropriate compatibility conditions for $(\rho_0(x),u_0(x))$.
(2) The assumption of space dimension two guarantees that (1.2) has a unique global solution for smooth enough initial data (cf. [4]). A similar comment applies to Theorem 1 as well. When the spatial dimension is greater than two, the global well-posedness of the initial-boundary value problem (1.2) is still open. However, when $\Omega=\mathbb{R}^d$ and $d=2,3$, to guarantee the local well-posedness of (1.18), one may need assumptions like $F_k^\varepsilon\to e^{i\theta/\varepsilon}$ at infinity so that $\rho_0(x)\ge c>0$ in the whole space. Then, as in [13], one can prove the global well-posedness of (1.2), and therefore one may obtain a result similar to Theorem 2.
(3) It is remarkable that the solution of (1.18) is also a solution of the isentropic compressible Euler equations given by
\[
\begin{cases}
\partial_t\rho+\mathrm{div}(\rho u)=0,\\
\partial_t(\rho u)+\mathrm{div}(\rho u\otimes u)+\frac12\nabla\rho^2=0 & \text{for } x\in\Omega,\ t\in(0,T_*),
\end{cases} \tag{1.19}
\]
which may generically form a shock wave in finite time (cf. [21]). Hence Theorem 2 shows the compressibility of a double condensate, which may result in the appearance of shock waves.

The main difficulty of Theorem 2 is the strong competition between $\rho_1^\varepsilon$ and $\rho_2^\varepsilon$, so the hydrodynamical approach may not hold for $\rho_1^\varepsilon$ and $\rho_2^\varepsilon$ individually. To overcome this difficulty, we add $\rho_1^\varepsilon$ and $\rho_2^\varepsilon$ together and regard $\rho_1^\varepsilon+\rho_2^\varepsilon$ as the total density of the classical compressible gas when $\varepsilon$ tends to zero.

The rest of this paper is organized as follows: In Sect. 2, we introduce conservation laws of mass, energy and linear momentum as our basic tools to prove Theorems 1 and 2. Then we give proofs of Theorems 1 and 2 in Sects. 3 and 4, respectively. For the global existence of solutions of the systems (1.1) and (1.2), one may refer to Sect. 5, the Appendix, for details.

2. Conservation Laws

For single nonlinear Schrödinger equations (NLS), conservation laws, especially the modified Madelung fluid dynamic equations (e.g. (2.6), (2.7), (2.29), (2.30)), are very useful for investigating vortex dynamics (cf. [11, 12]), blow-up (collapse) of waves (cf. [23]) and semiclassical limits (cf. [13, 19, 25]). Due to the special form of a single NLS, one may derive three conservation laws: conservation of energy, mass and linear momentum. However, for general coupled systems of NLS, conservation laws would be more complicated and difficult to use. Since the systems (1.1) and (1.2) have specific coupling terms which provide some symmetry properties, the associated conservation laws have forms suitable for studying such systems. In this section, we want


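to prove conservation of energy, mass and linear momentum for the systems (1.1) and (1.2), respectively. Let us first notice that, given $\psi_k^\varepsilon(0,x)\in H^4(\Omega)$ for $k=1,2$, it follows from the result in the Appendix that both (1.1) and (1.2) have a unique solution in the class $C([0,\infty);H^2(\Omega))$. Then it is standard (cf. [13] for the exterior domain case) to improve $\psi_k^\varepsilon$, $k=1,2$, to the class $C([0,\infty);H^4(\Omega))$. Now for the system (1.1), we define its energy, mass density and linear momentum density as follows:
\[
e^\varepsilon\overset{\mathrm{def}}{=}E^\varepsilon(\psi_1^\varepsilon,\psi_2^\varepsilon), \tag{2.1}
\]
\[
\rho_k^\varepsilon\overset{\mathrm{def}}{=}|\psi_k^\varepsilon|^2,\quad k=1,2, \tag{2.2}
\]
\[
J_k^\varepsilon\overset{\mathrm{def}}{=}(J_{k,1}^\varepsilon,J_{k,2}^\varepsilon), \tag{2.3}
\]
\[
J_{k,j}^\varepsilon\overset{\mathrm{def}}{=}\varepsilon^{\frac12+\alpha}\,\mathrm{Im}(\overline{\psi_k^\varepsilon}\,\partial_j\psi_k^\varepsilon)\quad\text{for } j,k=1,2, \tag{2.4}
\]
where $\partial_j\equiv\partial_{x_j}$, $\overline{\psi_k^\varepsilon}$ is the complex conjugate of $\psi_k^\varepsilon$, and $E^\varepsilon(\cdot,\cdot)$ is defined in (1.5). Then we have

Lemma 1. (i) Conservation of energy:
\[
\frac{d}{dt}e^\varepsilon(t)=0,\quad\forall\,t>0, \tag{2.5}
\]
(ii) Conservation of mass:
\[
\partial_t\rho_k^\varepsilon+\mathrm{div}\,J_k^\varepsilon=0\quad\text{for } x\in\Omega,\ t>0,\ k=1,2, \tag{2.6}
\]
(iii) Conservation of linear momentum:
\[
\varepsilon^{-2\alpha}\partial_t\sum_{k=1}^2J_{k,j}^\varepsilon+\frac{\varepsilon}{4}\sum_{k,l=1}^2\partial_l\big[4\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})-\partial_l\partial_j(|\psi_k^\varepsilon|^2)\big]+\frac{1}{2\varepsilon}\partial_j\sum_{k=1}^2(\rho_k^\varepsilon)^2+\frac{\beta}{\varepsilon}\partial_j(\rho_1^\varepsilon\rho_2^\varepsilon)=0 \tag{2.7}
\]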

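for $x\in\Omega$, $t>0$, $j=1,2$.

Proof. We may multiply the equation of $\psi_1^\varepsilon$ in (1.1) by $\partial_t\overline{\psi_1^\varepsilon}$ (the complex conjugate of $\partial_t\psi_1^\varepsilon$) and integrate over $\Omega$. Then it is easy to check that
\[
i\varepsilon^{\frac12-\alpha}\int_\Omega|\partial_t\psi_1^\varepsilon|^2\,dx=\frac{\varepsilon}{2}\sum_{j=1}^2\int_\Omega\partial_j\psi_1^\varepsilon\,\partial_t\partial_j\overline{\psi_1^\varepsilon}\,dx+\frac1\varepsilon\int_\Omega(|\psi_1^\varepsilon|^2-1)\psi_1^\varepsilon\,\partial_t\overline{\psi_1^\varepsilon}\,dx+\frac\beta\varepsilon\int_\Omega|\psi_2^\varepsilon|^2\psi_1^\varepsilon\,\partial_t\overline{\psi_1^\varepsilon}\,dx. \tag{2.8}
\]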


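Here we have used the Neumann boundary conditions (B) and integration by parts. Taking the complex conjugate of (2.8), we have
\[
-i\varepsilon^{\frac12-\alpha}\int_\Omega|\partial_t\psi_1^\varepsilon|^2\,dx=\frac{\varepsilon}{2}\sum_{j=1}^2\int_\Omega\partial_j\overline{\psi_1^\varepsilon}\,\partial_t\partial_j\psi_1^\varepsilon\,dx+\frac1\varepsilon\int_\Omega(|\psi_1^\varepsilon|^2-1)\overline{\psi_1^\varepsilon}\,\partial_t\psi_1^\varepsilon\,dx+\frac\beta\varepsilon\int_\Omega|\psi_2^\varepsilon|^2\overline{\psi_1^\varepsilon}\,\partial_t\psi_1^\varepsilon\,dx. \tag{2.9}
\]
Hence by adding (2.8) and (2.9) together,
\[
\frac{\varepsilon}{2}\frac{d}{dt}\int_\Omega|\nabla\psi_1^\varepsilon|^2\,dx+\frac1\varepsilon\int_\Omega(|\psi_1^\varepsilon|^2+|\psi_2^\varepsilon|^2-1)\,\partial_t|\psi_1^\varepsilon|^2\,dx+\frac{\beta-1}{\varepsilon}\int_\Omega|\psi_2^\varepsilon|^2\,\partial_t|\psi_1^\varepsilon|^2\,dx=0. \tag{2.10}
\]
Similarly, we may use the equation of $\psi_2^\varepsilon$ in (1.1) to obtain
\[
\frac{\varepsilon}{2}\frac{d}{dt}\int_\Omega|\nabla\psi_2^\varepsilon|^2\,dx+\frac1\varepsilon\int_\Omega(|\psi_1^\varepsilon|^2+|\psi_2^\varepsilon|^2-1)\,\partial_t|\psi_2^\varepsilon|^2\,dx+\frac{\beta-1}{\varepsilon}\int_\Omega|\psi_1^\varepsilon|^2\,\partial_t|\psi_2^\varepsilon|^2\,dx=0. \tag{2.11}
\]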
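The summation of (2.10) and (2.11) can be written out as follows. This is only a sketch of the omitted step, using the notation $\rho_k^\varepsilon=|\psi_k^\varepsilon|^2$ of (2.2) and the energy of (1.5) and (2.1); superscripts $\varepsilon$ are dropped for readability.

```latex
% Sketch: (2.10) + (2.11) gives d e^\varepsilon / dt = 0.
% Uses \partial_t(\rho_1\rho_2) = \rho_2\partial_t\rho_1 + \rho_1\partial_t\rho_2.
\begin{align*}
0 &= \frac{\varepsilon}{2}\frac{d}{dt}\int_\Omega \sum_{k=1}^2 |\nabla\psi_k|^2\,dx
   + \frac{1}{\varepsilon}\int_\Omega (\rho_1+\rho_2-1)\,\partial_t(\rho_1+\rho_2)\,dx
   + \frac{\beta-1}{\varepsilon}\int_\Omega \big(\rho_2\,\partial_t\rho_1+\rho_1\,\partial_t\rho_2\big)\,dx \\
  &= \frac{d}{dt}\int_\Omega \Big\{ \frac{\varepsilon}{2}\sum_{k=1}^2 |\nabla\psi_k|^2
   + \frac{1}{2\varepsilon}\big[(\rho_1+\rho_2)-1\big]^2
   + \frac{\beta-1}{\varepsilon}\,\rho_1\rho_2 \Big\}\,dx
   \;=\; \frac{d}{dt}\,e^\varepsilon(t).
\end{align*}
```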

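Therefore by adding (2.10) and (2.11), we may complete the proof of (2.5). For the proof of (2.6), we may multiply the equation of $\psi_1^\varepsilon$ in (1.1) by $\overline{\psi_1^\varepsilon}$ (the complex conjugate of $\psi_1^\varepsilon$). Then
\[
i\varepsilon^{\frac12-\alpha}(\partial_t\psi_1^\varepsilon)\overline{\psi_1^\varepsilon}=-\frac\varepsilon2(\Delta\psi_1^\varepsilon)\overline{\psi_1^\varepsilon}+\frac1\varepsilon(|\psi_1^\varepsilon|^2-1)|\psi_1^\varepsilon|^2+\frac\beta\varepsilon|\psi_2^\varepsilon|^2|\psi_1^\varepsilon|^2. \tag{2.12}
\]
Taking the complex conjugate of (2.12), we have
\[
i\varepsilon^{\frac12-\alpha}(\partial_t\overline{\psi_1^\varepsilon})\psi_1^\varepsilon=\frac\varepsilon2(\Delta\overline{\psi_1^\varepsilon})\psi_1^\varepsilon-\frac1\varepsilon(|\psi_1^\varepsilon|^2-1)|\psi_1^\varepsilon|^2-\frac\beta\varepsilon|\psi_1^\varepsilon|^2|\psi_2^\varepsilon|^2. \tag{2.13}
\]
Adding (2.12) and (2.13), we may complete the proof of (2.6) for $k=1$; a similar proof gives (2.6) for $k=2$. Now we prove (2.7) as follows: Multiply the conjugate equation of $\psi_1^\varepsilon$ in (1.1) by $\partial_j\psi_1^\varepsilon$, and then
\[
i\varepsilon^{\frac12-\alpha}\,\partial_t\overline{\psi_1^\varepsilon}\,\partial_j\psi_1^\varepsilon=\frac\varepsilon2(\Delta\overline{\psi_1^\varepsilon})\partial_j\psi_1^\varepsilon-\frac1\varepsilon(|\psi_1^\varepsilon|^2-1)\overline{\psi_1^\varepsilon}\,\partial_j\psi_1^\varepsilon-\frac\beta\varepsilon|\psi_2^\varepsilon|^2\overline{\psi_1^\varepsilon}\,\partial_j\psi_1^\varepsilon. \tag{2.14}
\]
On the other hand, apply $\partial_j$ to the equation of $\psi_1^\varepsilon$ in (1.1), and multiply the resulting equation by $\overline{\psi_1^\varepsilon}$. Then we have
\[
i\varepsilon^{\frac12-\alpha}(\partial_t\partial_j\psi_1^\varepsilon)\overline{\psi_1^\varepsilon}=-\frac\varepsilon2(\partial_j\Delta\psi_1^\varepsilon)\overline{\psi_1^\varepsilon}+\frac1\varepsilon|\psi_1^\varepsilon|^2\partial_j(|\psi_1^\varepsilon|^2-1)+\frac1\varepsilon(|\psi_1^\varepsilon|^2-1)\overline{\psi_1^\varepsilon}\,\partial_j\psi_1^\varepsilon+\frac\beta\varepsilon|\psi_1^\varepsilon|^2\partial_j(|\psi_2^\varepsilon|^2)+\frac\beta\varepsilon|\psi_2^\varepsilon|^2\overline{\psi_1^\varepsilon}\,\partial_j\psi_1^\varepsilon. \tag{2.15}
\]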


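One may add (2.14) and (2.15) together, and take the real part of the resulting equation to get
\[
\varepsilon^{-2\alpha}\partial_tJ_{1,j}^\varepsilon+\frac\varepsilon2\,\mathrm{Re}\big[(\Delta\overline{\psi_1^\varepsilon})\partial_j\psi_1^\varepsilon-(\partial_j\Delta\psi_1^\varepsilon)\overline{\psi_1^\varepsilon}\big]+\frac{1}{2\varepsilon}\partial_j(\rho_1^\varepsilon)^2+\frac\beta\varepsilon\rho_1^\varepsilon\,\partial_j\rho_2^\varepsilon=0. \tag{2.16}
\]
To obtain (2.7), we need to prove

Claim 1.
\[
\mathrm{Re}\big[(\Delta\overline{\psi_1^\varepsilon})\partial_j\psi_1^\varepsilon-(\partial_j\Delta\psi_1^\varepsilon)\overline{\psi_1^\varepsilon}\big]=\frac12\sum_{l=1}^2\partial_l\big[4\,\mathrm{Re}(\partial_l\psi_1^\varepsilon\,\partial_j\overline{\psi_1^\varepsilon})-\partial_l\partial_j(|\psi_1^\varepsilon|^2)\big].
\]
Proof. It is obvious that
\[
\mathrm{Re}\big[(\Delta\overline{\psi_1^\varepsilon})\partial_j\psi_1^\varepsilon-(\partial_j\Delta\psi_1^\varepsilon)\overline{\psi_1^\varepsilon}\big]=\frac12\Big[\Big(\sum_{l=1}^2\partial_l^2\overline{\psi_1^\varepsilon}\Big)\partial_j\psi_1^\varepsilon+\Big(\sum_{l=1}^2\partial_l^2\psi_1^\varepsilon\Big)\partial_j\overline{\psi_1^\varepsilon}-\Big(\sum_{l=1}^2\partial_l^2\partial_j\psi_1^\varepsilon\Big)\overline{\psi_1^\varepsilon}-\Big(\sum_{l=1}^2\partial_l^2\partial_j\overline{\psi_1^\varepsilon}\Big)\psi_1^\varepsilon\Big]. \tag{2.17}
\]
Besides,
\[
\Big(\sum_{l=1}^2\partial_l^2\overline{\psi_1^\varepsilon}\Big)\partial_j\psi_1^\varepsilon=\sum_{l=1}^2\partial_l(\partial_l\overline{\psi_1^\varepsilon}\,\partial_j\psi_1^\varepsilon)-\sum_{l=1}^2(\partial_l\overline{\psi_1^\varepsilon})\,\partial_l\partial_j\psi_1^\varepsilon \tag{2.18}
\]
and
\[
\Big(\sum_{l=1}^2\partial_l^2\partial_j\psi_1^\varepsilon\Big)\overline{\psi_1^\varepsilon}=\sum_{l=1}^2\partial_l(\overline{\psi_1^\varepsilon}\,\partial_l\partial_j\psi_1^\varepsilon)-\sum_{l=1}^2(\partial_l\overline{\psi_1^\varepsilon})\,\partial_l\partial_j\psi_1^\varepsilon. \tag{2.19}
\]
Put (2.18) and (2.19), together with their complex conjugates, into (2.17), and then
\[
\mathrm{Re}\big[(\Delta\overline{\psi_1^\varepsilon})\partial_j\psi_1^\varepsilon-(\partial_j\Delta\psi_1^\varepsilon)\overline{\psi_1^\varepsilon}\big]=\frac12\sum_{l=1}^2\partial_l\big(\partial_l\overline{\psi_1^\varepsilon}\,\partial_j\psi_1^\varepsilon+\partial_l\psi_1^\varepsilon\,\partial_j\overline{\psi_1^\varepsilon}-\overline{\psi_1^\varepsilon}\,\partial_l\partial_j\psi_1^\varepsilon-\psi_1^\varepsilon\,\partial_l\partial_j\overline{\psi_1^\varepsilon}\big). \tag{2.20}
\]
On the other hand,
\[
\partial_l\partial_j(|\psi_1^\varepsilon|^2)=\partial_l\partial_j(\psi_1^\varepsilon\overline{\psi_1^\varepsilon})=(\overline{\psi_1^\varepsilon}\,\partial_l\partial_j\psi_1^\varepsilon+\psi_1^\varepsilon\,\partial_l\partial_j\overline{\psi_1^\varepsilon})+(\partial_j\overline{\psi_1^\varepsilon}\,\partial_l\psi_1^\varepsilon+\partial_l\overline{\psi_1^\varepsilon}\,\partial_j\psi_1^\varepsilon). \tag{2.21}
\]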
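The way (2.20) and (2.21) combine into Claim 1 can be spelled out as below. This is a sketch of the omitted algebra, writing $\psi$ for $\psi_1^\varepsilon$ and $\bar\psi$ for its complex conjugate; no ingredient beyond (2.20) and (2.21) is used.

```latex
% Sketch: from (2.20) and (2.21) to Claim 1 (\psi stands for \psi_1^\varepsilon).
\begin{align*}
\partial_l\bar\psi\,\partial_j\psi + \partial_l\psi\,\partial_j\bar\psi
  &= 2\,\mathrm{Re}(\partial_l\psi\,\partial_j\bar\psi), \\
\bar\psi\,\partial_l\partial_j\psi + \psi\,\partial_l\partial_j\bar\psi
  &= \partial_l\partial_j(|\psi|^2) - 2\,\mathrm{Re}(\partial_l\psi\,\partial_j\bar\psi)
  \qquad\text{(by (2.21))}.
\end{align*}
% Substituting both lines into the right-hand side of (2.20):
\begin{align*}
\mathrm{Re}\big[(\Delta\bar\psi)\,\partial_j\psi - (\partial_j\Delta\psi)\,\bar\psi\big]
  &= \frac12\sum_{l=1}^2 \partial_l\Big[\,2\,\mathrm{Re}(\partial_l\psi\,\partial_j\bar\psi)
     - \big(\partial_l\partial_j(|\psi|^2) - 2\,\mathrm{Re}(\partial_l\psi\,\partial_j\bar\psi)\big)\Big] \\
  &= \frac12\sum_{l=1}^2 \partial_l\big[\,4\,\mathrm{Re}(\partial_l\psi\,\partial_j\bar\psi)
     - \partial_l\partial_j(|\psi|^2)\big].
\end{align*}
```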

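Consequently, by (2.20) and (2.21), we may complete the proof of Claim 1. By (2.16) and Claim 1, we obtain
\[
\varepsilon^{-2\alpha}\partial_tJ_{1,j}^\varepsilon+\frac\varepsilon4\sum_{l=1}^2\partial_l\big[4\,\mathrm{Re}(\partial_l\psi_1^\varepsilon\,\partial_j\overline{\psi_1^\varepsilon})-\partial_l\partial_j(|\psi_1^\varepsilon|^2)\big]+\frac{1}{2\varepsilon}\partial_j(\rho_1^\varepsilon)^2+\frac\beta\varepsilon\rho_1^\varepsilon\,\partial_j\rho_2^\varepsilon=0. \tag{2.22}
\]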


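Similarly, we may use the equation of $\psi_2^\varepsilon$ in (1.1) to derive
\[
\varepsilon^{-2\alpha}\partial_tJ_{2,j}^\varepsilon+\frac\varepsilon4\sum_{l=1}^2\partial_l\big[4\,\mathrm{Re}(\partial_l\psi_2^\varepsilon\,\partial_j\overline{\psi_2^\varepsilon})-\partial_l\partial_j(|\psi_2^\varepsilon|^2)\big]+\frac{1}{2\varepsilon}\partial_j(\rho_2^\varepsilon)^2+\frac\beta\varepsilon\rho_2^\varepsilon\,\partial_j\rho_1^\varepsilon=0. \tag{2.23}
\]
Therefore by adding (2.22) and (2.23), we obtain (2.7) and complete the proof of Lemma 1.

Due to the difference between (1.1) and (1.2), we may define another energy, mass density and linear momentum density for the system (1.2) as follows:
\[
\tilde e^\varepsilon\overset{\mathrm{def}}{=}\tilde E^\varepsilon(\psi_1^\varepsilon,\psi_2^\varepsilon), \tag{2.24}
\]
\[
\rho_k^\varepsilon\overset{\mathrm{def}}{=}|\psi_k^\varepsilon|^2,\quad k=1,2, \tag{2.25}
\]
\[
J_k^\varepsilon\overset{\mathrm{def}}{=}(J_{k,1}^\varepsilon,J_{k,2}^\varepsilon), \tag{2.26}
\]
\[
J_{k,j}^\varepsilon\overset{\mathrm{def}}{=}\varepsilon^{\frac12}\,\mathrm{Im}(\overline{\psi_k^\varepsilon}\,\partial_j\psi_k^\varepsilon)\quad\text{for } j,k=1,2, \tag{2.27}
\]
where $\tilde E^\varepsilon(\cdot,\cdot)$ is defined in (1.14), and $(\psi_1^\varepsilon,\psi_2^\varepsilon)$ is the solution of (1.2) with the Neumann boundary conditions (B). Then, corresponding to Lemma 1, we may show

Lemma 2. Under the assumptions of Theorem 2, there hold
(i) Conservation of energy:
\[
\frac{d}{dt}\tilde e^\varepsilon(t)=0,\quad\forall\,t>0, \tag{2.28}
\]
(ii) Conservation of mass:
\[
\partial_t\rho_k^\varepsilon+\mathrm{div}\,J_k^\varepsilon=0\quad\text{for } x\in\Omega,\ t>0,\ k=1,2, \tag{2.29}
\]
(iii) Conservation of linear momentum:
\[
\partial_t\sum_{k=1}^2J_{k,j}^\varepsilon+\frac\varepsilon4\sum_{k,l=1}^2\partial_l\big[4\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})-\partial_l\partial_j(|\psi_k^\varepsilon|^2)\big]+\frac12\partial_j\sum_{k=1}^2(\rho_k^\varepsilon)^2+\gamma\,\partial_j(\rho_1^\varepsilon\rho_2^\varepsilon)=0 \tag{2.30}
\]
for $x\in\Omega$, $t>0$, $j=1,2$.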


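3. Proof of Theorem 1

In this section, we want to prove both Theorem 1 and Corollary 1. For the proof of Theorem 1, we need to define a crucial function by
\[
H^\varepsilon(t)\overset{\mathrm{def}}{=}\int_\Omega\frac{\varepsilon}{2}\sum_{k=1}^2\big|(\nabla-i\varepsilon^{-\frac12-\alpha}v)\psi_k^\varepsilon\big|^2\,dx+\int_\Omega\frac{1}{2\varepsilon}\Big[\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-1\Big]^2+\frac{\beta-1}{\varepsilon}\rho_1^\varepsilon\rho_2^\varepsilon\,dx, \tag{3.1}
\]
for $t\ge0$, where $v$ and the $\rho_k^\varepsilon$ are defined in (1.9) and (2.2), respectively. Then it is easy to check that
\[
H^\varepsilon(t)=\frac\varepsilon2\sum_{k=1}^2\int_\Omega\big|\nabla\sqrt{\rho_k^\varepsilon}\big|^2\,dx+\frac{\varepsilon^{-2\alpha}}{2}\sum_{k=1}^2\int_\Omega\Big|\frac{1}{\sqrt{\rho_k^\varepsilon}}J_k^\varepsilon-\sqrt{\rho_k^\varepsilon}\,v\Big|^2\,dx+\int_\Omega\frac{1}{2\varepsilon}\Big[\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-1\Big]^2+\frac{\beta-1}{\varepsilon}\rho_1^\varepsilon\rho_2^\varepsilon\,dx, \tag{3.2}
\]
where the $J_k^\varepsilon$ are defined in (2.3) and (2.4). Moreover, by (1.5) and (2.1), we may rewrite (3.1) as
\[
H^\varepsilon(t)=e^\varepsilon(t)-\varepsilon^{-2\alpha}\int_\Omega v\cdot\sum_{k=1}^2J_k^\varepsilon\,dx+\frac{\varepsilon^{-2\alpha}}{2}\int_\Omega|v|^2\sum_{k=1}^2\rho_k^\varepsilon\,dx. \tag{3.3}
\]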
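The identities (3.2) and (3.3) rest on a pointwise expansion of the covariant gradient term. The sketch below records that computation for one component; the abbreviation $a=\varepsilon^{-\frac12-\alpha}$ is introduced here only for brevity, and the polar identity in the second line is used on the set where $\rho>0$ (an assumption of this sketch).

```latex
% Sketch: pointwise expansion behind (3.2) and (3.3), for \psi = \psi_k^\varepsilon,
% \rho = |\psi|^2, J = \varepsilon^{1/2+\alpha}\mathrm{Im}(\bar\psi\nabla\psi),
% with the shorthand a = \varepsilon^{-1/2-\alpha}, so \mathrm{Im}(\bar\psi\nabla\psi) = aJ.
\begin{align*}
|(\nabla - i a v)\psi|^2
  &= |\nabla\psi|^2 - 2a\, v\cdot\mathrm{Im}(\bar\psi\nabla\psi) + a^2 |v|^2 \rho
   = |\nabla\psi|^2 - 2a^2\, v\cdot J + a^2 |v|^2\rho, \\
|\nabla\psi|^2
  &= |\nabla\sqrt{\rho}|^2 + \frac{|\mathrm{Im}(\bar\psi\nabla\psi)|^2}{\rho}
   = |\nabla\sqrt{\rho}|^2 + a^2\frac{|J|^2}{\rho}
   \qquad (\rho>0).
\end{align*}
% Multiplying by \varepsilon/2 and using \varepsilon a^2 = \varepsilon^{-2\alpha}:
\begin{align*}
\frac{\varepsilon}{2}|(\nabla - i a v)\psi|^2
  &= \frac{\varepsilon}{2}|\nabla\psi|^2 - \varepsilon^{-2\alpha}\, v\cdot J
     + \frac{\varepsilon^{-2\alpha}}{2}|v|^2\rho
  && \text{(summed over $k$ and integrated: (3.3))} \\
  &= \frac{\varepsilon}{2}|\nabla\sqrt{\rho}|^2
     + \frac{\varepsilon^{-2\alpha}}{2}\Big|\frac{1}{\sqrt{\rho}}J - \sqrt{\rho}\,v\Big|^2
  && \text{(summed over $k$ and integrated: (3.2))}.
\end{align*}
```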

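Now we compute $\frac{d}{dt}H^\varepsilon(t)$ by using the conservation laws, i.e. Lemma 1. By conservation of energy (2.5) and (3.3),
\[
\frac{d}{dt}H^\varepsilon(t)=-\varepsilon^{-2\alpha}\frac{d}{dt}\int_\Omega v\cdot\sum_{k=1}^2J_k^\varepsilon\,dx+\frac{\varepsilon^{-2\alpha}}{2}\frac{d}{dt}\int_\Omega|v|^2\sum_{k=1}^2\rho_k^\varepsilon\,dx. \tag{3.4}
\]
By conservation of mass (2.6),
\begin{align*}
\frac12\frac{d}{dt}\int_\Omega|v|^2\sum_{k=1}^2\rho_k^\varepsilon\,dx
&=\int_\Omega(v\cdot\partial_tv)\sum_{k=1}^2\rho_k^\varepsilon\,dx+\frac12\int_\Omega|v|^2\sum_{k=1}^2\partial_t\rho_k^\varepsilon\,dx\\
&=\int_\Omega(v\cdot\partial_tv)\sum_{k=1}^2\rho_k^\varepsilon\,dx-\frac12\int_\Omega|v|^2\sum_{k=1}^2\mathrm{div}\,J_k^\varepsilon\,dx\quad\text{(by (2.6))}\\
&=\int_\Omega(v\cdot\partial_tv)\sum_{k=1}^2\rho_k^\varepsilon\,dx+\frac12\int_\Omega\nabla|v|^2\cdot\sum_{k=1}^2J_k^\varepsilon\,dx,
\end{align*}
i.e.
\[
\frac12\frac{d}{dt}\int_\Omega|v|^2\sum_{k=1}^2\rho_k^\varepsilon\,dx=\int_\Omega(v\cdot\partial_tv)\sum_{k=1}^2\rho_k^\varepsilon\,dx+\frac12\int_\Omega\nabla|v|^2\cdot\sum_{k=1}^2J_k^\varepsilon\,dx. \tag{3.5}
\]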


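Here we have used integration by parts and the Neumann boundary conditions (B) to eliminate the integral along the boundary $\partial\Omega$. By conservation of linear momentum (2.7),
\[
-\varepsilon^{-2\alpha}\int_\Omega v\cdot\sum_{k=1}^2\partial_tJ_k^\varepsilon\,dx=\sum_{j,k=1}^2\int_\Omega v_j\Big\{\frac\varepsilon4\sum_{l=1}^2\partial_l\big[4\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})-\partial_l\partial_j(|\psi_k^\varepsilon|^2)\big]+\frac{1}{2\varepsilon}\partial_j(\rho_k^\varepsilon)^2\Big\}dx+\sum_{j=1}^2\int_\Omega\frac\beta\varepsilon v_j\,\partial_j(\rho_1^\varepsilon\rho_2^\varepsilon)\,dx. \tag{3.6}
\]
Using integration by parts and the fact that $v$ is divergence free,
\[
\sum_{j,k=1}^2\int_\Omega v_j\sum_{l=1}^2\partial_l\big[\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})\big]dx=-\sum_{j,k,l=1}^2\int_\Omega(\partial_lv_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})\,dx, \tag{3.7}
\]
\[
-\sum_{j,k,l=1}^2\int_\Omega v_j\,\partial_l^2\partial_j(|\psi_k^\varepsilon|^2)\,dx=\sum_{k,l=1}^2\int_\Omega(\mathrm{div}\,v)\,\partial_l^2(|\psi_k^\varepsilon|^2)\,dx=0, \tag{3.8}
\]
and
\[
\sum_{j=1}^2\int_\Omega v_j\Big[\frac{1}{2\varepsilon}\partial_j\Big(\sum_{k=1}^2(\rho_k^\varepsilon)^2\Big)+\frac\beta\varepsilon\partial_j(\rho_1^\varepsilon\rho_2^\varepsilon)\Big]dx=-\int_\Omega\Big[\frac{1}{2\varepsilon}\sum_{k=1}^2(\rho_k^\varepsilon)^2+\frac\beta\varepsilon\rho_1^\varepsilon\rho_2^\varepsilon\Big]\mathrm{div}\,v\,dx=0. \tag{3.9}
\]
Consequently, by (3.6)–(3.9),
\[
-\varepsilon^{-2\alpha}\int_\Omega v\cdot\sum_{k=1}^2\partial_tJ_k^\varepsilon\,dx=-\varepsilon\sum_{j,k,l=1}^2\int_\Omega(\partial_lv_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})\,dx. \tag{3.10}
\]
On the other hand, (3.4), (3.5) and (3.10) may imply
\[
\frac{d}{dt}H^\varepsilon(t)=-\varepsilon\sum_{j,k,l=1}^2\int_\Omega(\partial_lv_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})\,dx+\varepsilon^{-2\alpha}\sum_{k=1}^2\int_\Omega\Big[-(J_k^\varepsilon\cdot\partial_tv)+(v\cdot\partial_tv)\rho_k^\varepsilon+\frac12J_k^\varepsilon\cdot\nabla|v|^2\Big]dx. \tag{3.11}
\]
Put the equation of $v$ (see (1.9)) into (3.11). Then
\begin{align*}
\frac{d}{dt}H^\varepsilon(t)={}&-\varepsilon\sum_{j,k,l=1}^2\int_\Omega(\partial_lv_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})\,dx\\
&+\varepsilon^{-2\alpha}\sum_{k=1}^2\int_\Omega\Big\{(J_k^\varepsilon-\rho_k^\varepsilon v)\cdot[(v\cdot\nabla)v]+\frac12J_k^\varepsilon\cdot\nabla|v|^2\Big\}dx\\
&+\varepsilon^{-2\alpha}\sum_{k=1}^2\int_\Omega(J_k^\varepsilon-\rho_k^\varepsilon v)\cdot\nabla p\,dx. \tag{3.12}
\end{align*}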


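To deal with the right side of (3.12), we need

Claim 2.
\begin{align*}
&-\sum_{j,l=1}^2(\partial_lv_j)\,\mathrm{Re}\Big[(\partial_j-i\varepsilon^{-\frac12-\alpha}v_j)\psi_k^\varepsilon\;\overline{(\partial_l-i\varepsilon^{-\frac12-\alpha}v_l)\psi_k^\varepsilon}\Big]\\
&\qquad=-\sum_{j,l=1}^2(\partial_lv_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})+\varepsilon^{-1-2\alpha}\Big\{(J_k^\varepsilon-\rho_k^\varepsilon v)\cdot[(v\cdot\nabla)v]+\frac12J_k^\varepsilon\cdot\nabla|v|^2\Big\}
\end{align*}
for $k=1,2$.

Proof. The proof of Claim 2 is a simple algebraic calculation as follows:
\begin{align*}
&\mathrm{Re}\Big[(\partial_j-i\varepsilon^{-\frac12-\alpha}v_j)\psi_k^\varepsilon\;\overline{(\partial_l-i\varepsilon^{-\frac12-\alpha}v_l)\psi_k^\varepsilon}\Big]\\
&\quad=\frac12\Big\{\big[\partial_j\psi_k^\varepsilon\,\partial_l\overline{\psi_k^\varepsilon}-i\varepsilon^{-\frac12-\alpha}v_j\psi_k^\varepsilon\,\partial_l\overline{\psi_k^\varepsilon}+i\varepsilon^{-\frac12-\alpha}v_l\overline{\psi_k^\varepsilon}\,\partial_j\psi_k^\varepsilon+\varepsilon^{-1-2\alpha}v_jv_l|\psi_k^\varepsilon|^2\big]\\
&\qquad\quad+\big[\partial_j\overline{\psi_k^\varepsilon}\,\partial_l\psi_k^\varepsilon+i\varepsilon^{-\frac12-\alpha}v_j\overline{\psi_k^\varepsilon}\,\partial_l\psi_k^\varepsilon-i\varepsilon^{-\frac12-\alpha}v_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon}+\varepsilon^{-1-2\alpha}v_jv_l|\psi_k^\varepsilon|^2\big]\Big\}\\
&\quad=\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})-\frac i2\varepsilon^{-\frac12-\alpha}(\psi_k^\varepsilon\,\partial_l\overline{\psi_k^\varepsilon}-\overline{\psi_k^\varepsilon}\,\partial_l\psi_k^\varepsilon)v_j+\frac i2\varepsilon^{-\frac12-\alpha}(\overline{\psi_k^\varepsilon}\,\partial_j\psi_k^\varepsilon-\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})v_l+\varepsilon^{-1-2\alpha}v_jv_l|\psi_k^\varepsilon|^2\\
&\quad=\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})-\varepsilon^{-1-2\alpha}(v_jJ_{k,l}^\varepsilon+v_lJ_{k,j}^\varepsilon-v_jv_l\rho_k^\varepsilon)\quad\text{(by (2.2), (2.4))}
\end{align*}
for $j,k,l=1,2$. Then, taking the sum over $j,l=1,2$, we obtain
\begin{align*}
&-\sum_{j,l=1}^2(\partial_lv_j)\,\mathrm{Re}\Big[(\partial_j-i\varepsilon^{-\frac12-\alpha}v_j)\psi_k^\varepsilon\;\overline{(\partial_l-i\varepsilon^{-\frac12-\alpha}v_l)\psi_k^\varepsilon}\Big]\\
&\quad=-\sum_{j,l=1}^2(\partial_lv_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})+\varepsilon^{-1-2\alpha}\sum_{j,l=1}^2(\partial_lv_j)\big(v_jJ_{k,l}^\varepsilon+v_l(J_{k,j}^\varepsilon-v_j\rho_k^\varepsilon)\big)\\
&\quad=-\sum_{j,l=1}^2(\partial_lv_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})+\varepsilon^{-1-2\alpha}\Big\{(J_k^\varepsilon-\rho_k^\varepsilon v)\cdot[(v\cdot\nabla)v]+\frac12J_k^\varepsilon\cdot\nabla|v|^2\Big\},
\end{align*}
and we may complete the proof of Claim 2.

From (3.12) and Claim 2, we obtain
\[
\frac{d}{dt}H^\varepsilon(t)=-\varepsilon\sum_{j,k,l=1}^2\int_\Omega(\partial_lv_j)\,\mathrm{Re}\Big[(\partial_j-i\varepsilon^{-\frac12-\alpha}v_j)\psi_k^\varepsilon\;\overline{(\partial_l-i\varepsilon^{-\frac12-\alpha}v_l)\psi_k^\varepsilon}\Big]dx+\varepsilon^{-2\alpha}\sum_{k=1}^2\int_\Omega(J_k^\varepsilon-\rho_k^\varepsilon v)\cdot\nabla p\,dx. \tag{3.13}
\]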
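The first integral on the right side of (3.13) is the term that produces the factor $C\|\nabla v\|_{L^\infty(\Omega)}H^\varepsilon(t)$ in the Gronwall argument. A minimal sketch of that bound follows; the shorthand $A_k=(\nabla-i\varepsilon^{-\frac12-\alpha}v)\psi_k^\varepsilon$ and the explicit constant $2$ are choices made here for illustration, and $|\nabla v|$ denotes the Frobenius norm of the matrix $(\partial_lv_j)$.

```latex
% Sketch: bounding the first term on the right side of (3.13).
% Shorthand (assumed here): A_k = (\nabla - i\varepsilon^{-1/2-\alpha} v)\psi_k^\varepsilon,
% with components A_{k,j}, j = 1,2.
\begin{align*}
\Big|\sum_{j,l=1}^2 (\partial_l v_j)\,\mathrm{Re}\big(A_{k,j}\,\overline{A_{k,l}}\big)\Big|
  &\le \Big(\sum_{j,l=1}^2 |\partial_l v_j|^2\Big)^{1/2}
       \Big(\sum_{j,l=1}^2 |A_{k,j}|^2 |A_{k,l}|^2\Big)^{1/2}
   = |\nabla v|\,|A_k|^2
   \quad\text{(Cauchy--Schwarz)}, \\
\Big|\varepsilon \sum_{j,k,l=1}^2 \int_\Omega (\partial_l v_j)\,
      \mathrm{Re}\big(A_{k,j}\,\overline{A_{k,l}}\big)\,dx\Big|
  &\le 2\,\|\nabla v\|_{L^\infty(\Omega)}\;
       \frac{\varepsilon}{2}\sum_{k=1}^2\int_\Omega |A_k|^2\,dx
   \;\le\; 2\,\|\nabla v\|_{L^\infty(\Omega)}\,H^\varepsilon(t).
\end{align*}
% The last inequality uses that the remaining two terms in (3.1) are nonnegative,
% since \beta > 1 and \rho_k^\varepsilon \ge 0.
```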


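Now we want to estimate the second integral on the right side of (3.13). By (1.5) and energy conservation (2.5), $e^\varepsilon(t)=e^\varepsilon(0)=E^\varepsilon(\Phi_1^\varepsilon,\Phi_2^\varepsilon)=O(\varepsilon^{-2\alpha})$, $\forall\,t>0$. This may imply
\[
\int_\Omega\Big[\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-1\Big]^2dx=O(\varepsilon^{1-2\alpha}),\quad\forall\,t>0. \tag{3.14}
\]
Hence by (3.14),
\begin{align*}
\sum_{k=1}^2\int_\Omega(v\cdot\nabla p)\rho_k^\varepsilon\,dx
&=\int_\Omega\Big[\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-1\Big](v\cdot\nabla p)\,dx+\int_\Omega(v\cdot\nabla p)\,dx\\
&=\int_\Omega\Big[\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-1\Big](v\cdot\nabla p)\,dx=O(\varepsilon^{\frac12-\alpha})\,\|v\cdot\nabla p\|_{L^2(\Omega)},
\end{align*}
i.e.
\[
\sum_{k=1}^2\int_\Omega(v\cdot\nabla p)\rho_k^\varepsilon\,dx=O(\varepsilon^{\frac12-\alpha})\,\|v\cdot\nabla p\|_{L^2(\Omega)}. \tag{3.15}
\]
Here we have used integration by parts, $v\cdot\vec n|_{\partial\Omega}=0$, the fact that $v$ is divergence free, and the Hölder inequality. Furthermore, by (2.6), (3.14) and the Hölder inequality,
\begin{align*}
\sum_{k=1}^2\int_\Omega J_k^\varepsilon\cdot\nabla p\,dx
&=-\sum_{k=1}^2\int_\Omega p\,\mathrm{div}\,J_k^\varepsilon\,dx\quad\text{(integration by parts and using (B))}\\
&=\sum_{k=1}^2\int_\Omega p\,\partial_t\rho_k^\varepsilon\,dx\quad\text{(by (2.6))}\\
&=\frac{d}{dt}\int_\Omega\Big[\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-1\Big]p\,dx-\int_\Omega\Big[\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-1\Big]\partial_tp\,dx\\
&=\frac{d}{dt}\int_\Omega\Big[\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-1\Big]p\,dx+O(\varepsilon^{\frac12-\alpha})\,\|\partial_tp\|_{L^2(\Omega)}\quad\text{(by (3.14) and the Hölder inequality)},
\end{align*}
i.e.
\[
\sum_{k=1}^2\int_\Omega J_k^\varepsilon\cdot\nabla p\,dx=\frac{d}{dt}\int_\Omega\Big[\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-1\Big]p\,dx+O(\varepsilon^{\frac12-\alpha})\,\|\partial_tp\|_{L^2(\Omega)}. \tag{3.16}
\]
Thus by (3.13), (3.15) and (3.16), we obtain
\[
\frac{d}{dt}\Big\{H^\varepsilon(t)-\varepsilon^{-2\alpha}\int_\Omega\Big[\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-1\Big]p\,dx\Big\}\le C\|\nabla v\|_{L^\infty(\Omega)}H^\varepsilon(t)+O(\varepsilon^{\frac12-3\alpha})\big(\|v\cdot\nabla p\|_{L^2(\Omega)}+\|\partial_tp\|_{L^2(\Omega)}\big). \tag{3.17}
\]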


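While again by (3.14) and the Hölder inequality,
\[
\varepsilon^{-2\alpha}\int_\Omega\Big[\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-1\Big]p\,dx=O(\varepsilon^{\frac12-3\alpha})\,\|p\|_{L^2(\Omega)}. \tag{3.18}
\]
Therefore by (3.17), (3.18) and the Gronwall inequality,
\[
H^\varepsilon(t)\le e^{C\|\nabla v\|_{L^\infty([0,T]\times\Omega)}T}\Big[H^\varepsilon(0)+O(\varepsilon^{\frac12-3\alpha})\big(\|p\|_{L^\infty([0,T];L^2(\Omega))}+\|v\cdot\nabla p\|_{L^\infty([0,T];L^2(\Omega))}+\|\partial_tp\|_{L^\infty([0,T];L^2(\Omega))}\big)\Big]\quad\text{for } 0<t<T. \tag{3.19}
\]
On the other hand, taking the divergence of the first equation of (1.9) gives
\[
p=\sum_{i,j=1}^2\partial_i\partial_j(-\Delta)^{-1}(v_iv_j),
\]
but from (1) of Remark 1.1, $v(t,x)\in L^\infty([0,T];H^3(\Omega))$; therefore, by the properties of the Riesz transform [22], $\|p\|_{L^\infty([0,T];L^2(\Omega))}\le C\|v\otimes v\|_{L^\infty([0,T];L^2(\Omega))}\le C\|v\|^2_{L^\infty([0,T];H^2(\Omega))}$. Similarly, $\|\nabla p\|_{L^\infty([0,T];L^2(\Omega))}\le C\|v\|^2_{L^\infty([0,T];H^3(\Omega))}$, and using the first equation of (1.9), we have a similar estimate for $\|\partial_tv\|_{L^\infty([0,T];L^2(\Omega))}$. As a consequence, $\|v\cdot\nabla p\|_{L^\infty([0,T];L^2(\Omega))}+\|\partial_tp\|_{L^\infty([0,T];L^2(\Omega))}\le C\|v\|^3_{L^\infty([0,T];H^3(\Omega))}$. By summing the above inequalities together with (3.19) and using the fact that $0<\alpha<\frac16$, we may complete the proof of (1.11) and Theorem 1.

Now we prove Corollary 1 as follows: Using (1.8) and the Hölder inequality, we have
\begin{align*}
\|J_k^\varepsilon-\rho_k^\varepsilon v\|_{L^{4/3}(\Omega)}
&=\Big\|\sqrt{\rho_k^\varepsilon}\Big(\frac{1}{\sqrt{\rho_k^\varepsilon}}J_k^\varepsilon-\sqrt{\rho_k^\varepsilon}\,v\Big)\Big\|_{L^{4/3}(\Omega)}\\
&\le\big\|\sqrt{\rho_k^\varepsilon}\big\|_{L^4(\Omega)}\Big\|\frac{1}{\sqrt{\rho_k^\varepsilon}}J_k^\varepsilon-\sqrt{\rho_k^\varepsilon}\,v\Big\|_{L^2(\Omega)}\quad\text{(by the Hölder inequality)}\\
&=O(\varepsilon^\alpha)\,\big\|\sqrt{\rho_k^\varepsilon}\big\|_{L^4(\Omega)}\quad\text{(by (1.8))}. \tag{3.20}
\end{align*}
By (1.7), it is obvious that
\begin{align*}
\big\|\sqrt{\rho_k^\varepsilon}\big\|_{L^4(\Omega)}=\|\rho_k^\varepsilon\|^{1/2}_{L^2(\Omega)}
&\le\Big\|\sum_{k=1}^2\rho_k^\varepsilon\Big\|^{1/2}_{L^2(\Omega)}\quad(\because\ \rho_k^\varepsilon\ge0)\\
&\le\Big[\Big\|\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-1\Big\|_{L^2(\Omega)}+|\Omega|^{1/2}\Big]^{1/2}\quad\text{(by the triangle inequality)}\\
&=\big[|\Omega|^{1/2}+O(\sqrt\varepsilon)\big]^{1/2}\quad\text{(by (1.7))},
\end{align*}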


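i.e.
\[
\big\|\sqrt{\rho_k^\varepsilon}\big\|_{L^4(\Omega)}\le2|\Omega|^{1/4}\quad\text{for } k=1,2, \tag{3.21}
\]
for $\varepsilon$ sufficiently small. We may combine (3.20) and (3.21) to obtain (1.12), which completes the proof of Corollary 1.

4. Proof of Theorem 2

In this section, we want to prove Theorem 2. Basically, the idea of the proof of Theorem 2 is similar to that of Theorem 1. Corresponding to (3.1) in the proof of Theorem 1, we may define another functional by
\[
\tilde H^\varepsilon(t)\overset{\mathrm{def}}{=}\int_\Omega\frac\varepsilon2\sum_{k=1}^2\big|(\nabla-i\varepsilon^{-\frac12}u)\psi_k^\varepsilon\big|^2\,dx+\int_\Omega\frac12\Big|\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-\rho\Big|^2+(\gamma-1)\rho_1^\varepsilon\rho_2^\varepsilon\,dx \tag{4.1}
\]
for $t\ge0$, where $(\rho,u)$ and the $\rho_k^\varepsilon$ are defined in (1.18) and (2.25), respectively. Then it is obvious that
\[
\tilde H^\varepsilon(t)=\frac\varepsilon2\sum_{k=1}^2\int_\Omega\big|\nabla\sqrt{\rho_k^\varepsilon}\big|^2\,dx+\frac12\sum_{k=1}^2\int_\Omega\Big|\frac{1}{\sqrt{\rho_k^\varepsilon}}J_k^\varepsilon-\sqrt{\rho_k^\varepsilon}\,u\Big|^2\,dx+\int_\Omega\frac12\Big|\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)-\rho\Big|^2+(\gamma-1)\rho_1^\varepsilon\rho_2^\varepsilon\,dx, \tag{4.2}
\]
where the $J_k^\varepsilon$ are defined in (2.26) and (2.27). Furthermore, by (1.14) and (2.24), we may transform (4.1) into
\[
\tilde H^\varepsilon(t)=\tilde e^\varepsilon(t)-\int_\Omega u\cdot\sum_{k=1}^2J_k^\varepsilon\,dx+\frac12\int_\Omega|u|^2\sum_{k=1}^2\rho_k^\varepsilon\,dx+\int_\Omega\Big[-\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)\rho+\frac12\rho^2\Big]dx. \tag{4.3}
\]
We may use the conservation laws, i.e. Lemma 2, to calculate $\frac{d}{dt}\tilde H^\varepsilon(t)$. By conservation of energy (2.28) and (4.3), we obtain
\[
\frac{d}{dt}\tilde H^\varepsilon(t)=-\frac{d}{dt}\int_\Omega u\cdot\sum_{k=1}^2J_k^\varepsilon\,dx+\frac12\frac{d}{dt}\int_\Omega|u|^2\sum_{k=1}^2\rho_k^\varepsilon\,dx+\frac{d}{dt}\int_\Omega\Big[-\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)\rho+\frac12\rho^2\Big]dx. \tag{4.4}
\]
By conservation of mass (2.29), as for (3.5), we have
\[
\frac12\frac{d}{dt}\int_\Omega|u|^2\sum_{k=1}^2\rho_k^\varepsilon\,dx=\int_\Omega(u\cdot\partial_tu)\sum_{k=1}^2\rho_k^\varepsilon\,dx+\frac12\int_\Omega\nabla|u|^2\cdot\sum_{k=1}^2J_k^\varepsilon\,dx. \tag{4.5}
\]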


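Besides,
\begin{align*}
\frac{d}{dt}\int_\Omega\Big[-\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)\rho+\frac12\rho^2\Big]dx
&=\int_\Omega\Big[-\Big(\sum_{k=1}^2\partial_t\rho_k^\varepsilon\Big)\rho-\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)\partial_t\rho+\rho\,\partial_t\rho\Big]dx\\
&=\int_\Omega\Big[\Big(\sum_{k=1}^2\mathrm{div}\,J_k^\varepsilon\Big)\rho-\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)\partial_t\rho+\rho\,\partial_t\rho\Big]dx\\
&=\int_\Omega\Big[-\sum_{k=1}^2(J_k^\varepsilon\cdot\nabla\rho)-\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)\partial_t\rho+\rho\,\partial_t\rho\Big]dx,
\end{align*}
i.e.
\[
\frac{d}{dt}\int_\Omega\Big[-\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)\rho+\frac12\rho^2\Big]dx=\int_\Omega\Big[-\sum_{k=1}^2(J_k^\varepsilon\cdot\nabla\rho)-\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)\partial_t\rho+\rho\,\partial_t\rho\Big]dx. \tag{4.6}
\]
By conservation of linear momentum (2.30), using the same trick as in (3.6)–(3.9), we obtain
\[
-\int_\Omega u\cdot\sum_{k=1}^2\partial_tJ_k^\varepsilon\,dx=-\frac\varepsilon4\sum_{k=1}^2\int_\Omega\Big[\sum_{j,l=1}^24\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})\,\partial_lu_j+(\nabla\rho_k^\varepsilon\cdot\nabla\mathrm{div}\,u)\Big]dx-\int_\Omega\Big[\frac12\sum_{k=1}^2(\rho_k^\varepsilon)^2+\gamma\rho_1^\varepsilon\rho_2^\varepsilon\Big]\mathrm{div}\,u\,dx. \tag{4.7}
\]
Notice that due to compressibility, the velocity $u$ is no longer divergence free. This accounts for the difference between (3.10) and (4.7). Hence by summing up (4.4)–(4.7), we find
\begin{align*}
\frac{d}{dt}\tilde H^\varepsilon(t)={}&-\int_\Omega\partial_tu\cdot\sum_{k=1}^2J_k^\varepsilon\,dx-\frac\varepsilon4\sum_{k=1}^2\int_\Omega\Big[\sum_{j,l=1}^24\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\overline{\psi_k^\varepsilon})\,\partial_lu_j+(\nabla\rho_k^\varepsilon\cdot\nabla\mathrm{div}\,u)\Big]dx\\
&-\int_\Omega\Big[\frac12\sum_{k=1}^2(\rho_k^\varepsilon)^2+\gamma\rho_1^\varepsilon\rho_2^\varepsilon\Big]\mathrm{div}\,u\,dx+\int_\Omega(u\cdot\partial_tu)\sum_{k=1}^2\rho_k^\varepsilon\,dx+\frac12\int_\Omega\nabla|u|^2\cdot\sum_{k=1}^2J_k^\varepsilon\,dx\\
&+\int_\Omega\Big[-\sum_{k=1}^2(J_k^\varepsilon\cdot\nabla\rho)-\Big(\sum_{k=1}^2\rho_k^\varepsilon\Big)\partial_t\rho+\rho\,\partial_t\rho\Big]dx. \tag{4.8}
\end{align*}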


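Plugging the system (1.18) into (4.8), we get
\[
\begin{aligned}
\frac{d}{dt}H^\varepsilon(t) ={}& -\frac{\varepsilon}{4}\sum_{k=1}^{2}\int_\Omega\Big[\sum_{j,l=1}^{2}4\,\mathrm{Re}\big(\partial_l\psi_k^\varepsilon\,\overline{\partial_j\psi_k^\varepsilon}\big)\,\partial_l u_j + (\nabla\rho_k^\varepsilon\cdot\nabla\operatorname{div}u)\Big]dx\\
& + \sum_{k=1}^{2}\int_\Omega\Big\{(J_k^\varepsilon-\rho_k^\varepsilon u)\cdot[(u\cdot\nabla)u] + \frac12 J_k^\varepsilon\cdot\nabla|u|^2\Big\}dx\\
& + \sum_{k=1}^{2}\int_\Omega (J_k^\varepsilon-\rho_k^\varepsilon u)\cdot\nabla\rho\,dx
- \int_\Omega\Big[\frac12\sum_{k=1}^{2}(\rho_k^\varepsilon)^2 + \gamma\rho_1^\varepsilon\rho_2^\varepsilon\Big]\operatorname{div}u\,dx\\
& + \int_\Omega\Big\{-\sum_{k=1}^{2}(J_k^\varepsilon\cdot\nabla\rho) + \Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-\rho\Big]\operatorname{div}(\rho u)\Big\}dx.
\end{aligned}
\tag{4.9}
\]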

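By (2.25)–(2.27) and the same method as Claim 2 in Sect. 3, we obtain
\[
\begin{aligned}
&-\sum_{j,l=1}^{2}(\partial_l u_j)\,\mathrm{Re}\Big[(\partial_j - i\varepsilon^{-1/2}u_j)\psi_k^\varepsilon\;\overline{(\partial_l - i\varepsilon^{-1/2}u_l)\psi_k^\varepsilon}\Big]\\
&\qquad = -\sum_{j,l=1}^{2}(\partial_l u_j)\,\mathrm{Re}\big(\partial_l\psi_k^\varepsilon\,\overline{\partial_j\psi_k^\varepsilon}\big)
+ \varepsilon^{-1}\Big\{(J_k^\varepsilon-\rho_k^\varepsilon u)\cdot[(u\cdot\nabla)u] + \frac12 J_k^\varepsilon\cdot\nabla|u|^2\Big\}.
\end{aligned}
\tag{4.10}
\]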

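Thus (4.9) and (4.10) imply
\[
\begin{aligned}
\frac{d}{dt}H^\varepsilon(t) ={}& -\varepsilon\sum_{j,k,l=1}^{2}\int_\Omega(\partial_l u_j)\,\mathrm{Re}\Big[(\partial_j - i\varepsilon^{-1/2}u_j)\psi_k^\varepsilon\;\overline{(\partial_l - i\varepsilon^{-1/2}u_l)\psi_k^\varepsilon}\Big]dx
- \frac{\varepsilon}{4}\sum_{k=1}^{2}\int_\Omega(\nabla\rho_k^\varepsilon\cdot\nabla\operatorname{div}u)\,dx\\
& - \int_\Omega\Big[\frac12\sum_{k=1}^{2}(\rho_k^\varepsilon)^2 - \Big(\rho\sum_{k=1}^{2}\rho_k^\varepsilon\Big) + \gamma\rho_1^\varepsilon\rho_2^\varepsilon\Big]\operatorname{div}u\,dx
- \int_\Omega\rho\operatorname{div}(\rho u)\,dx.
\end{aligned}
\tag{4.11}
\]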

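Since $\rho\operatorname{div}(\rho u) = (\rho\nabla\rho)\cdot u + \rho^2\operatorname{div}u = \frac12(\nabla\rho^2)\cdot u + \rho^2\operatorname{div}u$, integrating by parts and using the slip boundary condition $u\cdot\vec{n}\,|_{\partial\Omega} = 0$, we find
\[
-\int_\Omega\rho\operatorname{div}(\rho u)\,dx = -\frac12\int_\Omega(\nabla\rho^2)\cdot u\,dx - \int_\Omega\rho^2\operatorname{div}u\,dx = -\frac12\int_\Omega\rho^2\operatorname{div}u\,dx.
\tag{4.12}
\]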


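Combining (4.11) with (4.12), we arrive at
\[
\begin{aligned}
\frac{d}{dt}H^\varepsilon(t) ={}& -\varepsilon\sum_{j,k,l=1}^{2}\int_\Omega(\partial_l u_j)\,\mathrm{Re}\Big[(\partial_j - i\varepsilon^{-1/2}u_j)\psi_k^\varepsilon\;\overline{(\partial_l - i\varepsilon^{-1/2}u_l)\psi_k^\varepsilon}\Big]dx
- \frac{\varepsilon}{4}\sum_{k=1}^{2}\int_\Omega(\nabla\rho_k^\varepsilon\cdot\nabla\operatorname{div}u)\,dx\\
& - \frac12\int_\Omega\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-\rho\Big]^2\operatorname{div}u\,dx
- (\gamma-1)\int_\Omega\rho_1^\varepsilon\rho_2^\varepsilon\operatorname{div}u\,dx.
\end{aligned}
\tag{4.13}
\]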
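The passage from (4.11) and (4.12) to (4.13) rests on a pointwise algebraic identity for the coefficient of $\operatorname{div}u$. The following is a short sketch of that step, written with the superscript $\varepsilon$ suppressed; the term $\frac12\rho^2$ is the contribution of (4.12).

```latex
% Coefficient of div u after inserting (4.12) into (4.11); superscripts \varepsilon suppressed.
\begin{align*}
\frac12\sum_{k=1}^{2}\rho_k^2 - \rho\sum_{k=1}^{2}\rho_k + \gamma\rho_1\rho_2 + \frac12\rho^2
 &= \frac12(\rho_1+\rho_2)^2 - \rho(\rho_1+\rho_2) + \frac12\rho^2 + (\gamma-1)\rho_1\rho_2 \\
 &= \frac12\bigl[(\rho_1+\rho_2)-\rho\bigr]^2 + (\gamma-1)\rho_1\rho_2 .
\end{align*}
```

The first equality uses $\frac12(\rho_1^2+\rho_2^2) = \frac12(\rho_1+\rho_2)^2 - \rho_1\rho_2$; multiplying by $-\operatorname{div}u$ and integrating gives the last two terms of (4.13).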

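By (1.14) and (2.28), $e^\varepsilon(t) = e^\varepsilon(0) = O(1)$ for $t \in [0, T_*)$. Consequently,
\[
\|\nabla\psi_k^\varepsilon\|_{L^2(\Omega)} = O\Big(\frac{1}{\sqrt{\varepsilon}}\Big),\qquad \|\psi_k^\varepsilon\|_{L^4(\Omega)} = O(1)\qquad\text{for } t\in[0,T_*),\ k=1,2.
\tag{4.14}
\]
Then by (2.25), (4.14) and the Hölder inequality,
\[
\Big|\frac{\varepsilon}{4}\sum_{k=1}^{2}\int_\Omega(\nabla\rho_k^\varepsilon\cdot\nabla\operatorname{div}u)\,dx\Big|
\le \frac{\varepsilon}{2}\sum_{k=1}^{2}\|\nabla\psi_k^\varepsilon\|_{L^2(\Omega)}\,\|\psi_k^\varepsilon\|_{L^4(\Omega)}\,\|\nabla\operatorname{div}u\|_{L^4(\Omega)}
\le C\sqrt{\varepsilon}\,\|u\|_{H^3(\Omega)}.
\tag{4.15}
\]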

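Hence by (4.1), (4.13) and (4.15), we obtain
\[
\frac{d}{dt}H^\varepsilon(t) \le C\Big(\|\nabla u(t)\|_{L^\infty(\Omega)}\,H^\varepsilon(t) + \sqrt{\varepsilon}\,\|u(t)\|_{H^3(\Omega)}\Big)\qquad\text{for } t\in[0,T_*).
\tag{4.16}
\]
From (1.15), $H^\varepsilon(0) \to 0$ as $\varepsilon \downarrow 0$. Thus by (1) of Remark 1.2, (4.16) together with the Gronwall inequality implies
\[
H^\varepsilon(t) \to 0\qquad\text{for } t\in[0,T_*)\quad\text{as } \varepsilon\downarrow 0.
\tag{4.17}
\]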
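The Gronwall step can be made explicit as follows. This is only a sketch: it assumes that $\|\nabla u(s)\|_{L^\infty(\Omega)}$ and $\|u(s)\|_{H^3(\Omega)}$ are integrable on $[0,t]$ for each fixed $t < T_*$, which is how Remark 1.2 appears to be used here, and the abbreviations $a(s)$ and $b(s)$ are introduced only for this illustration.

```latex
% Integral form of Gronwall's inequality applied to (4.16).
% a(s) := C \|\nabla u(s)\|_{L^\infty(\Omega)},  b(s) := C \|u(s)\|_{H^3(\Omega)}  (illustrative notation).
\begin{align*}
\frac{d}{dt}\Bigl[e^{-\int_0^t a(s)\,ds}\,H^\varepsilon(t)\Bigr]
  &\le \sqrt{\varepsilon}\;b(t)\,e^{-\int_0^t a(s)\,ds} \le \sqrt{\varepsilon}\;b(t),\\
H^\varepsilon(t)
  &\le e^{\int_0^t a(s)\,ds}\Bigl[H^\varepsilon(0) + \sqrt{\varepsilon}\int_0^t b(s)\,ds\Bigr]
  \;\longrightarrow\; 0 \quad\text{as } \varepsilon\downarrow 0 .
\end{align*}
```

Both terms in the bracket vanish in the limit: the first by (1.15), the second because of the factor $\sqrt{\varepsilon}$.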

Therefore by (4.2), (4.14), (4.17) and the Hölder inequality, we obtain (1.16) and (1.17), which completes the proof of Theorem 2.

5. Appendix

In this section, we study global existence for the systems (1.1) and (1.2) with the Neumann boundary conditions (B). Basically, we follow the ideas of Brezis and Gallouet (cf. [4]). We first prove a local existence theorem, and then use conservation laws and a crucial inequality (see (5.12)), which holds only when the domain has two spatial dimensions, to prove the global existence theorem. For the local existence theorem, we need the following semigroup theorem (cf. [20]):


Theorem 3. Let $H$ be a Hilbert space and $A : D(A) \subset H \to H$ be an m-accretive linear operator. Assume $F$ is a mapping from $D(A)$ into itself which is Lipschitz on every bounded set of $D(A)$. Then for every $u_0 \in D(A)$, there exists a unique solution $u$ of the equation
\[
\frac{du}{dt} + Au = Fu,\qquad u(0) = u_0,
\]
defined for $t \in [0, T_{\max})$ such that $u \in C^1([0,T_{\max}); H) \cap C([0,T_{\max}); D(A))$, with the additional property that either $T_{\max} = \infty$, or $T_{\max} < \infty$ and $\lim_{t\uparrow T_{\max}}\big(\|u(t)\| + \|Au(t)\|\big) = \infty$.

For simplicity, we may set $\varepsilon = 1$. Then both (1.1) and (1.2) have the following form:
\[
\begin{cases}
i\partial_t\psi_1 = -\frac12\Delta\psi_1 - a\psi_1 + b|\psi_1|^2\psi_1 + c|\psi_2|^2\psi_1,\\[2pt]
i\partial_t\psi_2 = -\frac12\Delta\psi_2 - a\psi_2 + b|\psi_2|^2\psi_2 + c|\psi_1|^2\psi_2,
\end{cases}
\tag{5.1}
\]
where $a \ge 0$ and $b, c > 0$ are constants. To apply Theorem 3 to (5.1) with the Neumann boundary conditions (B), we may define
\[
H = \{(\psi_1,\psi_2)^T : \psi_j \in L^2(\Omega;\mathbb{C}),\ j=1,2\},\qquad
A\begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix} = \begin{pmatrix}\frac{i}{2}(\Delta-2I)\psi_1\\[2pt] \frac{i}{2}(\Delta-2I)\psi_2\end{pmatrix}\quad\text{for } (\psi_1,\psi_2)^T\in D(A),
\]
where $D(A) = \{(\psi_1,\psi_2)^T : \psi_j \in H^2(\Omega;\mathbb{C}),\ \frac{\partial\psi_j}{\partial n}\big|_{\partial\Omega} = 0,\ j=1,2\}$. Besides, we set the initial data of (5.1) as $(\psi_1,\psi_2)^T|_{t=0} = (\psi_{1,0},\psi_{2,0})^T \in D(A)$, and
\[
F\begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix} = \begin{pmatrix} i(a+1)\psi_1 - ib|\psi_1|^2\psi_1 - ic|\psi_2|^2\psi_1\\[2pt] i(a+1)\psi_2 - ib|\psi_2|^2\psi_2 - ic|\psi_1|^2\psi_2\end{pmatrix}\quad\text{for } \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}\in D(A).
\]
Then it is easy to check that $F$ maps $D(A)$ into itself and is Lipschitz on every bounded set of $D(A)$. Moreover, the operator $A : D(A) \subset H \to H$ is an m-accretive linear operator (cf. [10]). Hence by Theorem 3, we obtain the local existence of (5.1) with the Neumann boundary conditions (B). Now we want to show the global existence of (5.1) with the Neumann boundary conditions (B), i.e. $T_{\max} = \infty$ and $\|\psi_j(t)\|_{H^2(\Omega)}$, $j = 1, 2$, remain bounded on every finite time interval. Hereafter, $(\psi_1,\psi_2)$ denotes the solution of (5.1) with the Neumann boundary conditions (B). By energy conservation of (5.1), we have
\[
\sum_{j=1}^{2}\int_\Omega|\nabla\psi_j|^2\,dx \le C_0\qquad\text{for } t\in[0,T_{\max}),
\tag{5.2}
\]
where $C_0$ is a positive constant independent of $t$. Moreover, by conservation of mass on (5.1),
\[
\sum_{j=1}^{2}\int_\Omega|\psi_j|^2\,dx \le C_1\qquad\text{for } t\in[0,T_{\max}),
\tag{5.3}
\]


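where $C_1$ is a positive constant independent of $t$. Hence (5.2) and (5.3) imply
\[
\sum_{j=1}^{2}\|\psi_j(t)\|^2_{H^1(\Omega)} \le C_2\qquad\text{for } t\in[0,T_{\max}),
\tag{5.4}
\]
where $C_2 = C_0 + C_1 > 0$ is independent of $t$. We now denote by $S(t)$ the $L^2$ isometry group generated by $-\frac{i}{2}(\Delta - 2I)$. Then by (5.1), we obtain
\[
\begin{cases}
\psi_1(t) = S(t)\psi_{1,0} + i\int_0^t S(t-s)\big[(a+1)\psi_1 + b|\psi_1|^2\psi_1 + c|\psi_2|^2\psi_1\big](s)\,ds,\\[2pt]
\psi_2(t) = S(t)\psi_{2,0} + i\int_0^t S(t-s)\big[(a+1)\psi_2 + b|\psi_2|^2\psi_2 + c|\psi_1|^2\psi_2\big](s)\,ds,
\end{cases}
\tag{5.5}
\]
where the $\psi_{j,0}$ are the initial data of the $\psi_j$. Hence by (5.5),
\[
A\begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}(t) = S(t)A\begin{pmatrix}\psi_{1,0}\\ \psi_{2,0}\end{pmatrix}
+ i\int_0^t S(t-s)\,A\begin{pmatrix}(a+1)\psi_1 + b|\psi_1|^2\psi_1 + c|\psi_2|^2\psi_1\\ (a+1)\psi_2 + b|\psi_2|^2\psi_2 + c|\psi_1|^2\psi_2\end{pmatrix}(s)\,ds
\]
and
\[
\Big\|A\begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}(t)\Big\|_{L^2(\Omega)} \le \Big\|A\begin{pmatrix}\psi_{1,0}\\ \psi_{2,0}\end{pmatrix}\Big\|_{L^2(\Omega)}
+ C\int_0^t\Big\|\begin{pmatrix}(a+1)\psi_1 + b|\psi_1|^2\psi_1 + c|\psi_2|^2\psi_1\\ (a+1)\psi_2 + b|\psi_2|^2\psi_2 + c|\psi_1|^2\psi_2\end{pmatrix}(s)\Big\|_{H^2(\Omega)}ds.
\tag{5.6}
\]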

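To estimate the integral on the right side of (5.6), we need the following lemma:

Lemma 3.
\[
\begin{aligned}
\||v|^2u\|_{H^2(\Omega)} &\le C\,\|v\|_{L^\infty(\Omega)}\max\{\|u\|_{L^\infty},\|v\|_{L^\infty}\}\max\{\|u\|_{H^2(\Omega)},\|v\|_{H^2(\Omega)}\},\\
\||u|^2v\|_{H^2(\Omega)} &\le C\,\|u\|_{L^\infty(\Omega)}\max\{\|u\|_{L^\infty},\|v\|_{L^\infty}\}\max\{\|u\|_{H^2(\Omega)},\|v\|_{H^2(\Omega)}\}
\end{aligned}
\]
for $u, v \in H^2(\Omega)$, where $C$ is a positive constant independent of $u$ and $v$.

Proof of Lemma 3. Let $D$ be any first order differential operator. For $u, v \in H^2(\Omega)$, we have
\[
|D^2(|v|^2u)| \le C\big(|v|^2|D^2u| + |v\,Du\,Dv| + |Dv|^2|u| + |u\,v\,D^2v|\big)
\]
and so
\[
\||v|^2u\|_{H^2(\Omega)} \le C\Big(\|v\|^2_{L^\infty(\Omega)}\|u\|_{H^2(\Omega)} + \|v\|_{L^\infty(\Omega)}\|u\|_{W^{1,4}(\Omega)}\|v\|_{W^{1,4}(\Omega)}
+ \|u\|_{L^\infty(\Omega)}\|v\|^2_{W^{1,4}(\Omega)} + \|u\|_{L^\infty(\Omega)}\|v\|_{L^\infty(\Omega)}\|v\|_{H^2(\Omega)}\Big).
\tag{5.7}
\]
Hereafter, for notational convenience, $C$ denotes a generic constant, which may differ from line to line. By the Gagliardo-Nirenberg inequality (cf. [16], p. 125),
\[
\|u\|_{W^{1,4}(\Omega)} \le C\,\|u\|^{1/2}_{L^\infty(\Omega)}\|u\|^{1/2}_{H^2(\Omega)},
\tag{5.8}
\]
\[
\|v\|_{W^{1,4}(\Omega)} \le C\,\|v\|^{1/2}_{L^\infty(\Omega)}\|v\|^{1/2}_{H^2(\Omega)},
\tag{5.9}
\]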


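for $u, v \in H^2(\Omega)$. Combining (5.7)–(5.9), we obtain
\[
\begin{aligned}
\||v|^2u\|_{H^2(\Omega)} &\le C\Big(\|v\|^2_{L^\infty(\Omega)}\|u\|_{H^2(\Omega)} + \|v\|^{3/2}_{L^\infty(\Omega)}\|u\|^{1/2}_{L^\infty(\Omega)}\|v\|^{1/2}_{H^2(\Omega)}\|u\|^{1/2}_{H^2(\Omega)} + \|u\|_{L^\infty(\Omega)}\|v\|_{L^\infty(\Omega)}\|v\|_{H^2(\Omega)}\Big)\\
&\le C\,\|v\|_{L^\infty(\Omega)}\Big(\|v\|^{1/2}_{L^\infty(\Omega)}\|u\|^{1/2}_{H^2(\Omega)} + \|u\|^{1/2}_{L^\infty(\Omega)}\|v\|^{1/2}_{H^2(\Omega)}\Big)^2\\
&\le C\,\|v\|_{L^\infty(\Omega)}\max\{\|u\|_{L^\infty(\Omega)},\|v\|_{L^\infty(\Omega)}\}\max\{\|u\|_{H^2(\Omega)},\|v\|_{H^2(\Omega)}\}.
\end{aligned}
\tag{5.10}
\]
Similarly, we may interchange $u$ and $v$ to get
\[
\||u|^2v\|_{H^2(\Omega)} \le C\,\|u\|_{L^\infty(\Omega)}\max\{\|u\|_{L^\infty(\Omega)},\|v\|_{L^\infty(\Omega)}\}\max\{\|u\|_{H^2(\Omega)},\|v\|_{H^2(\Omega)}\}.
\]
This completes the proof of Lemma 3.

By (5.6) and Lemma 3,
\[
\sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)} \le C + C\int_0^t\Big(\sum_{j=1}^{2}\|\psi_j(s)\|_{H^2(\Omega)}\Big)\Big(\sum_{j=1}^{2}\|\psi_j(s)\|_{L^\infty(\Omega)}\Big)^2ds
\tag{5.11}
\]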

for $t \in [0, T_{\max})$, where $C$ may depend on $a$, $b$, $c$ and the $\psi_{j,0}$. From [4], a crucial inequality is given by
\[
\|u\|_{L^\infty(\Omega)} \le C\Big(1 + \sqrt{\log\big(1 + \|u\|_{H^2(\Omega)}\big)}\Big)\qquad\text{for } u \in H^2(\Omega) \text{ with } \|u\|_{H^1(\Omega)} \le 1,
\tag{5.12}
\]
provided that the domain has two spatial dimensions. Hence by (5.11) and (5.12), we obtain
\[
\sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)} \le C + C\int_0^t\Big(\sum_{j=1}^{2}\|\psi_j(s)\|_{H^2(\Omega)}\Big)\Big[1 + \log\Big(1 + \sum_{j=1}^{2}\|\psi_j(s)\|_{H^2(\Omega)}\Big)\Big]ds.
\tag{5.13}
\]
We denote by $G(t)$ the right hand side of (5.13); thus
\[
G'(t) = C\Big(\sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)}\Big)\Big[1 + \log\Big(1 + \sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)}\Big)\Big] \le C\,G(t)\big[1 + \log(1 + G(t))\big]\quad\text{(by (5.13))}.
\]
Consequently,
\[
\frac{d}{dt}\log\big[1 + \log(1 + G(t))\big] \le C,
\]
and we may find an estimate for $\sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)}$ of the form
\[
\sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)} \le e^{\alpha e^{\beta t}}
\]
for some constants $\alpha$ and $\beta$. Therefore $\sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)}$ remains bounded on every finite time interval and $T_{\max} = \infty$.
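The integration leading to the double-exponential bound can be sketched as follows. The explicit choices $\alpha = 1 + \log(1 + G(0))$ and $\beta = C$ are one admissible choice made here for illustration, not values stated in the text; note that $G(0) = C > 0$ by the definition of $G$.

```latex
% From G'(t) <= C G(t) [1 + log(1 + G(t))] to a bound of the form exp(alpha * exp(beta t)).
\begin{align*}
\frac{d}{dt}\log\bigl[1+\log(1+G(t))\bigr]
  &= \frac{G'(t)}{(1+G(t))\,\bigl[1+\log(1+G(t))\bigr]}
   \le \frac{C\,G(t)}{1+G(t)} \le C,\\
1+\log(1+G(t)) &\le \bigl[1+\log(1+G(0))\bigr]\,e^{Ct},\\
\sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)} \le G(t)
  &\le \exp\bigl(\alpha\,e^{\beta t}\bigr),
  \qquad \alpha = 1+\log(1+G(0)),\quad \beta = C .
\end{align*}
```

The right-hand side is finite for every finite $t$, which is what rules out the blow-up alternative in Theorem 3.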

Acknowledgements. T. C. Lin is partially supported by NCTS and NSC of Taiwan under Grant NSC94-2115M-002-019. P. Zhang is partially supported by NSF of China under Grant 10525101 and 10421101, and the innovation grant from Chinese Academy of Sciences.

References
1. Ao, P., Chui, S.T.: Binary Bose-Einstein condensate mixtures in weakly and strongly segregated phases. Phys. Rev. A 58, 4836–4840 (1998)
2. Beirao da Veiga, H.: On the barotropic motion of compressible perfect fluids. Ann. Sc. Norm. Sup. Pisa 8, 417–451 (1981)
3. Beirao da Veiga, H.: Data dependence in the mathematical theory of compressible inviscid fluids. Arch. Rat. Mech. Anal. 119, 109–127 (1992)
4. Brezis, H., Gallouet, T.: Nonlinear Schrödinger evolution equations. Nonlinear Analysis, TMA 4, 677–681 (1980)
5. Esry, B.D., Greene, C.H.: Spontaneous spatial symmetry breaking in two-component Bose-Einstein condensates. Phys. Rev. A 59, 1457–1460 (1999)
6. Hasan, Z.R., Goble, D.F.: Effect of boundary conditions on finite Bose-Einstein assemblies. Phys. Rev. A 10(2), 618–624 (1974)
7. Grenier, E.: Semiclassical limit of the nonlinear Schrödinger equation in small time. Proc. Amer. Math. Soc. 126, 523–530 (1998)
8. Ginzburg, V.L., Pitaevskii, L.P.: Zh. Eksperim. Theor. Fys. 34, 1240 (1958) [Sov. Phys. JETP 7, 858 (1958)]
9. Hall, D.S., Matthews, M.R., Ensher, J.R., Wieman, C.E., Cornell, E.A.: Dynamics of component separation in a binary mixture of Bose-Einstein condensates. Phys. Rev. Lett. 81, 1539–1542 (1998)
10. Kato, T.: Perturbation theory for linear operators. Berlin: Springer, 1980
11. Lin, F.H., Lin, T.C.: Vortices in two-dimensional Bose-Einstein condensates. In: Geometry and nonlinear partial differential equations, AMS/IP Stud. Adv. Math. 29, Providence, RI: Amer. Math. Soc., 2002, pp. 87–114
12. Lin, F.H., Xin, J.X.: On the incompressible fluid limit and the vortex motion law of the nonlinear Schrödinger equation. Commun. Math. Phys. 200(2), 249–274 (1999)
13. Lin, F.H., Zhang, P.: Semiclassical limit of the Gross-Pitaevskii equation in an exterior domain. Arch. Rat. Mech. Anal. 179, 79–107 (2005)
14. Lions, P.L.: Mathematical Topics in Fluid Mechanics, Vol. 1, Incompressible Models. Lecture series in mathematics and its applications, V. 3, Oxford: Clarendon Press, 1996
15. Majda, A.: Compressible fluid flow and systems of conservation laws in several space variables. New York: Springer, 1984
16. Nirenberg, L.: On elliptic partial differential equations. Ann. Sci. Norm. Sup. Pisa 13, 115–162 (1959)
17. Pérez-García, V.M., Konotop, V.V., Brazhnyi, V.A.: Feshbach resonance induced shock waves in Bose-Einstein condensates. Phys. Rev. Lett. 92, 220403(1–4) (2004)
18. Pitaevskii, L., Stringari, S.: Bose-Einstein condensation. Oxford: Oxford Univ. Press, 2003
19. Puel, M.: Convergence of the Schrödinger–Poisson system to the incompressible Euler equations. Comm. Partial Diff. Eqs. 27, 2311–2331 (2002)
20. Segal, I.: Nonlinear semi-groups. Ann. Math. 78, 339–364 (1963)
21. Sideris, T.C.: Formation of singularities in three-dimensional compressible fluids. Commun. Math. Phys. 101, 475–485 (1985)
22. Stein, E.M.: Singular Integrals and Differentiability Properties of Functions. Princeton, NJ: Princeton University Press, 1970
23. Sulem, C., Sulem, P.L.: The nonlinear Schrödinger equation: self-focusing and wave collapse. New York: Springer, 1999
24. Timmermans, E.: Phase separation of Bose-Einstein condensates. Phys. Rev. Lett. 81, 5718–5721 (1998)
25. Zhang, P.: Wigner measure and the semiclassical limit of Schrödinger–Poisson equations. SIAM J. Math. Anal. 34, 700–718 (2003)

Communicated by A. Kupiainen

Commun. Math. Phys. 266, 571–576 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0019-z

Communications in

Mathematical Physics

A Generalization of Hawking’s Black Hole Topology Theorem to Higher Dimensions Gregory J. Galloway1 , Richard Schoen2 1 Department of Mathematics, University of Miami, Coral Gables, FL 33124, USA.

E-mail: [email protected]

2 Department of Mathematics, Stanford University, Stanford, CA 94305, USA

Received: 28 September 2005 / Accepted: 10 November 2005 Published online: 9 June 2006 – © Springer-Verlag 2006

Abstract: Hawking’s theorem on the topology of black holes asserts that cross sections of the event horizon in 4-dimensional asymptotically flat stationary black hole spacetimes obeying the dominant energy condition are topologically 2-spheres. This conclusion extends to outer apparent horizons in spacetimes that are not necessarily stationary. In this paper we obtain a natural generalization of Hawking’s results to higher dimensions by showing that cross sections of the event horizon (in the stationary case) and outer apparent horizons (in the general case) are of positive Yamabe type, i.e., admit metrics of positive scalar curvature. This implies many well-known restrictions on the topology, and is consistent with recent examples of five dimensional stationary black hole spacetimes with horizon topology S 2 × S 1 . The proof is inspired by previous work of Schoen and Yau on the existence of solutions to the Jang equation (but does not make direct use of that equation). 1. Introduction A basic result in the theory of black holes is Hawking’s theorem [11, 13] on the topology of black holes, which asserts that cross sections of the event horizon in 4-dimensional asymptotically flat stationary black hole spacetimes obeying the dominant energy condition are spherical (i.e., topologically S 2 ). The proof is a beautiful variational argument, showing that if a cross section has genus ≥ 1 then it can be deformed along a null hypersurface to an outer trapped surface outside of the event horizon, which is forbidden by standard results on black holes [13].1 In [12], Hawking showed that his black hole topology result extends, by a similar argument, to outer apparent horizons in black hole spacetimes that are not necessarily stationary. (A related result had been shown by Gibbons [8] in the time-symmetric case.) Since Hawking’s arguments rely on the Gauss-Bonnet theorem, these results do not directly extend to higher dimensions. 
1 Actually the torus T 2 arises as a borderline case in Hawking’s argument, but can occur only under special circumstances.


Given the current interest in higher dimensional black holes, it is of interest to determine which properties of black holes in four spacetime dimensions extend to higher dimensions. In this note we obtain a natural generalization of Hawking’s theorem on the topology of black holes to higher dimensions. The conclusion in higher dimensions is not that the horizon topology is spherical; that would be too strong, as evidenced by the striking example of Emparan and Reall [7] of a stationary vacuum black hole spacetime in five dimensions with horizon topology S 2 × S 1 . The natural conclusion in higher dimensions is that cross sections of the event horizon (in the stationary case), and outer apparent horizons (in the general case) are of positive Yamabe type, i.e. admit metrics of positive scalar curvature. As noted in [6], in the time symmetric case this follows from the minimal surface methodology of Schoen and Yau [18] in their treatment of manifolds of positive scalar curvature. The main point of the present paper is to show that this conclusion remains valid without any condition on the extrinsic curvature of space. That such a result might be expected to hold is suggested by work in [19, Sect. 4], which implies that the apparent horizons corresponding to the blow-up of solutions of the Jang equation, as described in [19], are of positive Yamabe type. We emphasize, however, that we do not need to make use of the Jang equation here.2 Much is now known about the topological obstructions to the existence of metrics of positive scalar curvature in higher dimensions. While the first major result along these lines is the famous theorem of Lichnerowicz [16] concerning the vanishing of the Aˆ genus, a key advance in our understanding was made in the late 70’s and early 80’s by Schoen and Yau [17, 18], and Gromov and Lawson [9, 10]. A brief review of these results, relevant to the topology of black holes, was considered in [6]. 
We shall recall the situation in five spacetime dimensions in the next section, after the statement of our main result.

2. The Main Result

Let $V^n$ be an $n$-dimensional, $n \ge 3$, spacelike hypersurface in a spacetime $(M^{n+1}, g)$. Let $\Sigma^{n-1}$ be a closed hypersurface in $V^n$, and assume that $\Sigma^{n-1}$ separates $V^n$ into an “inside” and an “outside”. Let $N$ be the outward unit normal to $\Sigma^{n-1}$ in $V^n$, and let $U$ be the future directed unit normal to $V^n$ in $M^{n+1}$. Then $K = U + N$ is an outward null normal field to $\Sigma^{n-1}$, unique up to scaling. The null second fundamental form of $\Sigma$ with respect to $K$ is, for each $p \in \Sigma$, the bilinear form defined by
\[
\chi : T_p\Sigma \times T_p\Sigma \to \mathbb{R},\qquad \chi(X, Y) = \langle\nabla_X K, Y\rangle,
\tag{2.1}
\]
where $\langle\,,\rangle = g$ and $\nabla$ is the Levi-Civita connection of $M^{n+1}$. Then the null expansion of $\Sigma$ is defined as $\theta = \operatorname{tr}\chi = h^{AB}\chi_{AB} = \operatorname{div} K$, where $h$ is the induced metric on $\Sigma$. We shall say $\Sigma^{n-1}$ is an outer apparent horizon in $V^n$ provided (i) $\Sigma$ is marginally outer trapped, i.e., $\theta = 0$, and (ii) there are no outer trapped surfaces outside of $\Sigma$. The latter means that there is no $(n-1)$-surface $\Sigma'$ contained in the region of $V^n$ outside of $\Sigma$ which is homologous to $\Sigma$ and which has negative expansion $\theta < 0$ with respect to its outer null normal (relative to $\Sigma$). Heuristically, $\Sigma$ is the “outer limit” of outer trapped surfaces in $V$.

2 In any case, the parametric estimates of [19] which are used to construct solutions of the Jang equation asymptotic to vertical cylinders over apparent horizons are generally true only in low dimensions.


Finally, a spacetime $(M^{n+1}, g)$ satisfying the Einstein equations
\[
R_{ab} - \frac12 R\,g_{ab} = T_{ab}
\tag{2.2}
\]
is said to obey the dominant energy condition provided the energy-momentum tensor $T$ satisfies $T(X, Y) = T_{ab}X^aY^b \ge 0$ for all future pointing causal vectors $X, Y$. We are now ready to state the main theorem.

Theorem 2.1. Let $(M^{n+1}, g)$, $n \ge 3$, be a spacetime satisfying the dominant energy condition. If $\Sigma^{n-1}$ is an outer apparent horizon in $V^n$ then $\Sigma^{n-1}$ is of positive Yamabe type, unless $\Sigma^{n-1}$ is Ricci flat (flat if $n = 3, 4$) in the induced metric, and both $\chi$ and $T(U, K) = T_{ab}U^aK^b$ vanish on $\Sigma$.

Thus, except under special circumstances, $\Sigma^{n-1}$ is of positive Yamabe type. As noted in the introduction, this implies various restrictions on the topology of $\Sigma$. Let us focus on the case $\dim M = 5$, and hence $\dim\Sigma = 3$, and assume, by taking a double cover if necessary, that $\Sigma$ is orientable. Then by well-known results of Schoen-Yau [18] and Gromov-Lawson [10], topologically, $\Sigma$ must be a finite connected sum of spherical spaces (homotopy 3-spheres, perhaps with identifications) and $S^2 \times S^1$'s. Indeed, by the prime decomposition theorem, $\Sigma$ can be expressed as a connected sum of spherical spaces, $S^2 \times S^1$'s, and $K(\pi, 1)$ manifolds (manifolds whose universal covers are contractible). But as $\Sigma$ admits a metric of positive scalar curvature, it cannot have any $K(\pi, 1)$'s in its prime decomposition. Thus, the basic horizon topologies in $\dim M = 5$ are $S^3$ and $S^2 \times S^1$, both of which are realized by nontrivial black hole spacetimes. Under stringent geometric assumptions on the horizon, a related conclusion is arrived at in [14].

Proof of the theorem. We consider normal variations of $\Sigma$ in $V$, i.e., variations $t \to \Sigma_t$ of $\Sigma = \Sigma_0$, $-\varepsilon < t < \varepsilon$, with variation vector field $V = \frac{\partial}{\partial t}\big|_{t=0} = \phi N$, $\phi \in C^\infty(\Sigma)$. Let $\theta(t)$ denote the null expansion of $\Sigma_t$ with respect to $K_t = U + N_t$, where $N_t$ is the outer unit normal field to $\Sigma_t$ in $V$. A computation shows [6, 3],
\[
\frac{\partial\theta}{\partial t}\Big|_{t=0} = -\Delta\phi + 2\langle X, \nabla\phi\rangle + \big(Q + \operatorname{div}X - |X|^2\big)\phi,
\tag{2.3}
\]
where
\[
Q = \frac12 S - T(U, K) - \frac12|\chi|^2,
\tag{2.4}
\]
$S$ is the scalar curvature of $\Sigma$, $X$ is the vector field on $\Sigma$ defined by $X = \tan(\nabla_N U)$, and $\langle\,,\rangle$ now denotes the induced metric $h$ on $\Sigma$. Introducing as in [3] the operator $L = -\Delta + 2\langle X, \nabla(\,\cdot\,)\rangle + \big(Q + \operatorname{div}X - |X|^2\big)$, Eq. (2.3) may be expressed as
\[
\frac{\partial\theta}{\partial t}\Big|_{t=0} = L(\phi).
\tag{2.5}
\]
$L$ is the stability operator associated with variations in the null expansion $\theta$. In the time symmetric case the vector field $X$ vanishes, and $L$ reduces to the classical stability operator of minimal surface theory, as expected [6].


As discussed in [3], although $L$ is not in general self adjoint, its principal eigenvalue $\lambda_1$ is real, and one can choose a principal eigenfunction $\phi$ which is strictly positive, $\phi > 0$. Using the eigenfunction $\phi$ to define our variation, we have from (2.5),
\[
\frac{\partial\theta}{\partial t}\Big|_{t=0} = \lambda_1\phi.
\tag{2.6}
\]
The eigenvalue $\lambda_1$ cannot be negative, for otherwise (2.6) would imply that $\frac{\partial\theta}{\partial t} < 0$ on $\Sigma$. Since $\theta = 0$ on $\Sigma$, this would mean that for $t > 0$ sufficiently small, $\Sigma_t$ would be outer trapped, contrary to our assumptions. Hence $\lambda_1 \ge 0$, and we conclude for the variation determined by the positive eigenfunction $\phi$ that $\frac{\partial\theta}{\partial t}\big|_{t=0} \ge 0$. By completing the square on the right hand side of Eq. (2.3), this implies that the following inequality holds:
\[
-\Delta\phi + (Q + \operatorname{div}X)\,\phi + \phi|\nabla\ln\phi|^2 - \phi|X - \nabla\ln\phi|^2 \ge 0.
\tag{2.7}
\]
Setting $u = \ln\phi$, we obtain
\[
-\Delta u + Q + \operatorname{div}X - |X - \nabla u|^2 \ge 0.
\tag{2.8}
\]
As a side remark, note that substituting the expression for Q into (2.8) and integrating gives that the total scalar curvature of is nonnegative, and in fact is positive, except under special circumstances. In four spacetime dimensions one may then apply the Gauss-Bonnet theorem to recover Hawking’s theorem; in fact this is essentially Hawking’s original argument. However, in higher dimensions the positivity of the total scalar curvature, in and of itself, does not provide any topological information. To proceed with the higher dimensional case, we first absorb the Laplacian term

u = div (∇u) in (2.8) into the divergence term to obtain, Q + div (X − ∇u) − |X − ∇u|2 ≥ 0 .

(2.9)

Setting Y = X − ∇u, we arrive at the inequality, −Q + |Y |2 ≤ div Y .

(2.10)

Given any ψ ∈ C ∞ (), we multiply through by ψ 2 and derive, −ψ 2 Q + ψ 2 |Y |2 ≤ ψ 2 div Y = div ψ 2 Y − 2ψ∇ψ, Y ≤ div ψ 2 Y + 2|ψ||∇ψ||Y | ≤ div ψ 2 Y + |∇ψ|2 + ψ 2 |Y |2 . Integrating the above inequality yields, |∇ψ|2 + Qψ 2 ≥ 0 for all ψ ∈ C ∞ () ,

where Q is given in (2.4).

(2.11)

(2.12)
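For completeness, the integration step can be sketched as follows; it uses only that $\Sigma$ is closed, so that the divergence term integrates to zero, together with the elementary inequality $2ab \le a^2 + b^2$ already used in the last line of (2.11).

```latex
% From the pointwise inequality (2.11) to the integral inequality (2.12).
\begin{align*}
\int_\Sigma \operatorname{div}(\psi^2 Y) &= 0
  \qquad\text{(divergence theorem on the closed manifold $\Sigma$)},\\
\int_\Sigma \bigl(-\psi^2 Q + \psi^2|Y|^2\bigr)
  &\le \int_\Sigma \bigl(|\nabla\psi|^2 + \psi^2|Y|^2\bigr)
  \;\Longrightarrow\;
  \int_\Sigma \bigl(|\nabla\psi|^2 + Q\,\psi^2\bigr) \ge 0 .
\end{align*}
```

The terms $\psi^2|Y|^2$ cancel, so the vector field $Y$, and with it the non-self-adjoint part of $L$, drops out of the final inequality.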


At this point rather standard arguments become applicable [19, 6]. Consider the eigenvalue problem
\[
-\Delta\psi + Q\psi = \mu\psi.
\tag{2.13}
\]
Inequality (2.12) implies that the first eigenvalue $\mu_1$ of (2.13) is nonnegative, $\mu_1 \ge 0$. Let $f \in C^\infty(\Sigma)$ be an associated eigenfunction; $f$ can be chosen to be strictly positive. Now consider $\Sigma$ in the conformally related metric $\tilde h = f^{2/(n-2)}h$. The scalar curvature $\tilde S$ of $\Sigma$ in the metric $\tilde h$ is given by
\[
\begin{aligned}
\tilde S &= f^{-n/(n-2)}\Big(-2\Delta f + Sf + \frac{n-1}{n-2}\frac{|\nabla f|^2}{f}\Big)\\
&= f^{-2/(n-2)}\Big(2\mu_1 + 2T(U, K) + |\chi|^2 + \frac{n-1}{n-2}\frac{|\nabla f|^2}{f^2}\Big),
\end{aligned}
\tag{2.14}
\]
where, for the second equation, we have used (2.13), with $\psi = f$, and (2.4). Since, by the dominant energy condition, $T(U, K) \ge 0$, Eq. (2.14) implies that $\tilde S \ge 0$. If $\tilde S > 0$ at some point, then by well known results [15] one can conformally change $\tilde h$ to a metric of strictly positive scalar curvature, and the theorem follows. If $\tilde S$ vanishes identically then, by Eq. (2.14), $\mu_1 = 0$, $T(U, K) \equiv 0$, $\chi \equiv 0$ and $f$ is constant. Eq. (2.13), with $\psi = f$, and Eq. (2.4) then imply that $S \equiv 0$. By a result of Bourguignon (see [15]), it follows that $\Sigma$ carries a metric of positive scalar curvature unless it is Ricci flat. The theorem now follows.

Concluding Remarks

1. Let $\Sigma^{n-1}$ be a closed 2-sided hypersurface in the spacelike hypersurface $V^n \subset M^{n+1}$. Then there exists a neighborhood $W$ of $\Sigma^{n-1}$ in $V^n$ such that $\Sigma$ separates $W$ into an “inside” and an “outside”. Suppose $\Sigma$ is marginally outer trapped, i.e., $\theta = 0$ with respect to the outer null normal to $\Sigma$. Following the terminology introduced in [3], we say that $\Sigma$ is stably outermost (respectively, strictly stably outermost) provided the principal eigenvalue $\lambda_1$ of the stability operator $L$ introduced in (2.5) satisfies $\lambda_1 \ge 0$ (resp., $\lambda_1 > 0$). It is clear from the proof that the conclusion of Theorem 2.1 remains valid for marginally outer trapped surfaces that are stably outermost. Moreover the conclusion that $\Sigma$ is positive Yamabe holds without any caveat if $\Sigma$ is strictly stably outermost. To see this, note that Eq. (2.6) then implies that there exists $\varepsilon > 0$ such that $\frac{\partial\theta}{\partial t}\big|_{t=0} \ge \varepsilon$. Tracing through the proof using this inequality shows that (2.12) holds with $Q$ replaced by $Q - \varepsilon$. Then the parenthetical expression in Eq. (2.14) will include a $+\varepsilon$ term, and so $\tilde S$ will be strictly positive.

2. Theorem 2.1 applies, in particular, to the marginally trapped surfaces $S_R$ of a dynamical horizon $H$ (see [2] for definitions). Indeed, by the maximum principle for marginally trapped surfaces [1], there can be no outer trapped surfaces in $H$ outside of any $S_R$. Alternatively, it is easily checked that each $S_R$ is stably outermost in the sense described in the previous paragraph.

3. As discussed in [6], the exceptional case in Theorem 2.1 can in effect be eliminated in the time symmetric case. In this case $V^n$ becomes a manifold of nonnegative scalar curvature, and $\Sigma^{n-1}$ is minimal. By the results in [5, 4], if $\Sigma$ is locally outer area minimizing and does not carry a metric of positive scalar curvature then an outer neighborhood of $\Sigma$ in $V$ splits isometrically as a product $[0, \varepsilon) \times \Sigma$. In physical terms, this means that there would be marginally outer trapped surfaces outside of $\Sigma$,


which, by a slight strengthening of our definition of ‘outer apparent horizon’, could not occur. (In fact, marginally outer trapped surfaces cannot occur outside the event horizon.) Under mild physical assumptions, but with $\dim M \le 8$, one can show that $\Sigma$ is locally outer area minimizing; see [6, Theorem 3] for further discussion. Finally, in the asymptotically flat, but not necessarily time symmetric case, it is possible to perturb the initial data to make the dominant energy inequality strict, see [19, p. 240]. Hence, the exceptional case is unstable in this sense.

Note added in proof: We are now able to eliminate the exceptional case in the general non-time symmetric setting under conditions analogous to those referred to in remark 3 above, e.g. assuming a mild asymptotic condition and assuming there are no outer trapped or marginally outer trapped surfaces outside of $\Sigma$. Details will appear in a forthcoming paper.

Acknowledgements. This work was supported in part by NSF grants DMS-0405906 (GJG) and DMS-0104163 (RS). The work was initiated at the Isaac Newton Institute in Cambridge, England during the Fall 2005 Program on Global Problems in Mathematical Relativity, organized by P. Chruściel, H. Friedrich and P. Tod. The authors would like to thank the Newton Institute for its support.

References
1. Ashtekar, A., Galloway, G.J.: Uniqueness theorems for dynamical horizons. Adv. Theor. Math. Phys. 8, 1–30 (2005)
2. Ashtekar, A., Krishnan, B.: Dynamical horizons and their properties. Phys. Rev. D 68, 261101 (2003)
3. Andersson, L., Mars, M., Simon, W.: Local existence of dynamical and trapping horizons. Phys. Rev. Lett. 95, 111102 (2005)
4. Cai, M.: Volume minimizing hypersurfaces in manifolds of nonnegative scalar curvature. In: Minimal Surfaces, Geometric Analysis, and Symplectic Geometry, Advanced Studies in Pure Mathematics, eds. Fukaya, K., Nishikawa, S., Spruck, J., 34, 1–7 (2002)
5. Cai, M., Galloway, G.J.: Rigidity of area minimizing tori in 3-manifolds of nonnegative scalar curvature. Commun. Anal. Geom. 8, 565–573 (2000)
6. Cai, M., Galloway, G.J.: On the topology and area of higher dimensional black holes. Class. Quant. Grav. 18, 2707–2718 (2001)
7. Emparan, R., Reall, H.S.: A rotating black ring in five dimensions. Phys. Rev. Lett. 88, 101101 (2002)
8. Gibbons, G.W.: The time symmetric initial value problem for black holes. Commun. Math. Phys. 27, 87–102 (1972)
9. Gromov, M., Lawson, B.: Spin and scalar curvature in the presence of the fundamental group. Ann. of Math. 111, 209–230 (1980)
10. Gromov, M., Lawson, B.: Positive scalar curvature and the Dirac operator on complete Riemannian manifolds. Publ. Math. IHES 58, 83–196 (1983)
11. Hawking, S.W.: Black holes in general relativity. Commun. Math. Phys. 25, 152–166 (1972)
12. Hawking, S.W.: The event horizon. In: ‘Black Holes, Les Houches lectures’ (1972), edited by C. DeWitt, B. S. DeWitt, Amsterdam: North Holland, 1972
13. Hawking, S.W., Ellis, G.F.R.: The large scale structure of space-time. Cambridge: Cambridge University Press, 1973
14. Helfgott, C., Oz, Y., Yanay, Y.: On the topology of black hole event horizons in higher dimensions. JHEP 02, 024 (2006)
15. Kazdan, J., Warner, F.: Prescribing curvatures. Proc. Symp. in Pure Math. 27, 309–319 (1975)
16. Lichnerowicz, A.: Spineurs harmoniques. C. R. Acad. Sci. Paris, Sér. A-B 257, 7–9 (1963)
17. Schoen, R., Yau, S.T.: Existence of incompressible minimal surfaces and the topology of three dimensional manifolds of non-negative scalar curvature. Ann. of Math. 110, 127–142 (1979)
18. Schoen, R., Yau, S.T.: On the structure of manifolds with positive scalar curvature. Manuscripta Math. 28, 159–183 (1979)
19. Schoen, R., Yau, S.T.: Proof of the positive mass theorem. II. Commun. Math. Phys. 79, 231–260 (1981)

Communicated by G.W. Gibbons

Commun. Math. Phys. 266, 577–594 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0043-z

Communications in

Mathematical Physics

Geometric Quantization, Parallel Transport and the Fourier Transform William D. Kirwin , Siye Wu Department of Mathematics, University of Colorado, Boulder, CO 80309-0395, USA Received: 4 October 2004 / Accepted: 27 February 2006 Published online: 7 July 2006 – © Springer-Verlag 2006

Current address: Department of Mathematics, University of Notre Dame, Notre Dame, IN 46556, USA

Abstract: In quantum mechanics, the momentum space and position space wave functions are related by the Fourier transform. We investigate how the Fourier transform arises in the context of geometric quantization. We consider a Hilbert space bundle $\mathcal{H}$ over the space $\mathcal{J}$ of compatible complex structures on a symplectic vector space. This bundle is equipped with a projectively flat connection. We show that parallel transport along a geodesic in the bundle $\mathcal{H} \to \mathcal{J}$ is a rescaled orthogonal projection or Bogoliubov transformation. We then construct the kernel for the integral parallel transport operator. Finally, by extending geodesics to the boundary (for which the metaplectic correction is essential), we obtain the Segal-Bargmann and Fourier transforms as parallel transport in suitable limits.

1. Introduction

In quantum mechanics, the position and momentum space wave functions are related by the Fourier transform. In this paper we examine how this relationship arises in the context of geometric quantization. We provide a geometric interpretation of the Fourier transform as parallel transport in a vector bundle of infinite rank. In fact, this consideration leads to a smoothly parametrized family of transforms which includes the Fourier transform, the Segal-Bargmann transform, and the Bogoliubov transform. Quantization of a symplectic manifold $(M, \omega)$ requires a Hermitian line bundle $\ell \to M$ with a compatible connection such that the curvature is $\omega/\sqrt{-1}$. $\ell$ is called a pre-quantum line bundle and it exists if and only if the de Rham class $[\frac{\omega}{2\pi}]$ is integral. The pre-quantum Hilbert space $\mathcal{H}_0$ consists of sections of $\ell$ which are square-integrable with respect to the Liouville volume form on $M$. As is well-known, $\mathcal{H}_0$ is too large for the purpose of quantization. The additional structure we need is an almost complex structure $J$ compatible with $\omega$. The space $\mathcal{J}$ of such $J$ is connected and contractible. Each


$J \in \mathcal{J}$ defines a quantum Hilbert space $\mathcal{H}_J$ of (square-integrable) $J$-holomorphic sections of $\ell$. They form a vector bundle $\mathcal{H}$ of Hilbert spaces over $\mathcal{J}$, provided there is no jump of $\dim\mathcal{H}_J$ as $J$ varies. To compare $\mathcal{H}_J$ for different $J$, we need a connection on $\mathcal{H} \to \mathcal{J}$. Given $J, J' \in \mathcal{J}$ and a path connecting them, parallel transport in $\mathcal{H}$ is a unitary operator from $\mathcal{H}_J$ to $\mathcal{H}_{J'}$. If the connection is projectively flat, the holonomy is $U(1)$, and parallel transports along different paths from $J$ to $J'$ differ by at most a phase. Since a quantum state is actually represented by a ray in the Hilbert space, the “physics” obtained is thus independent of the choice of $J$. Unfortunately such a connection does not exist in general [5]. We will henceforth restrict our attention to a symplectic vector space $(V, \omega)$. We also restrict $\mathcal{J}$ to the space of linear complex structures on $V$ compatible with $\omega$. In this case, a projectively flat connection on $\mathcal{H} \to \mathcal{J}$ is constructed in [1], as a finite dimensional model for studying Chern-Simons gauge theory.

In this paper, we study parallel transport in the bundle $\mathcal{H}$ along the geodesics in $\mathcal{J}$, when the symplectic manifold is a vector space. The space $\mathcal{J}$ can be identified with the Siegel upper-half space, which has a natural Kähler metric. We show that parallel transport along a geodesic in the Hilbert space bundle is a rescaled orthogonal projection. Hence parallel transport agrees with the Bogoliubov transformation in [14, 15] and the intertwining operators in [10] and [8]. Part of the boundary of $\mathcal{J}$ (as a bounded domain) consists of real Lagrangian subspaces $L$ of $(V, \omega)$. Each $L$ is a real polarization and also defines a quantum Hilbert space. By extending geodesics to the boundary (for which the metaplectic correction is essential), we obtain the Segal-Bargmann and Fourier transforms as parallel transport in suitable limits.

The rest of the paper is organized as follows. In Sect. 2, we recall the identification of $\mathcal{J}$ with the Siegel upper-half space and describe the connection and the resulting geometry of the bundle of quantum Hilbert spaces over $\mathcal{J}$. We also incorporate the metaplectic correction. In Sect. 3, we study parallel transport in $\mathcal{H}$ along geodesics in $\mathcal{J}$. The condition of parallel transport is a partial differential equation. By the $Sp(V, \omega)$ symmetry, it suffices to consider a special class of geodesics so that the equation can be solved explicitly. We then show that the parallel transport is actually a rescaled orthogonal projection known as the Bogoliubov transformation. Hence the integral kernel for the equation of parallel transport is the Bergman reproducing kernel, up to a positive factor. We then show that by extending a geodesic in one direction to infinity, the parallel transport becomes the Segal-Bargmann transform. Extending both ends of a geodesic to infinity, the parallel transport converges to the Fourier transform. Since a real Lagrangian space is on the boundary of $\mathcal{J}$, the quantum Hilbert space associated to it is not inside the bundle $\mathcal{H} \to \mathcal{J}$. We show that with the metaplectic correction, the limit holds in the sense of almost everywhere convergence as sections over $V$. Other ways to formulate the limit are also established.

Finally, we would like to mention some recent related work. Let $K$ be a Lie group of compact type, that is, $K$ is locally isomorphic to a compact Lie group. The cotangent bundle $T^*K$ is naturally symplectic and, being diffeomorphic to the complexification $K_{\mathbb{C}}$, has a compatible complex structure. In [6], Hall constructed a generalized Segal-Bargmann transform between the vertically polarized and Kähler polarized quantum Hilbert spaces. The pairing is a unitary operator and a rescaled projection, as in [14, 15] for the flat case. In [4], the authors study parallel transport in the quantum Hilbert space bundle over a 1-parameter family of Kähler polarizations on $T^*K$. As in the flat case [1], the parallel transport equation is given by a holomorphic version of the heat operator,


which also appeared in [6]. It would be interesting to explore the projective flatness of the quantum Hilbert space bundle over a larger class of Kähler polarizations on $T^*K$.

2. Geometry of the Hilbert Space Bundle

2.1. Complex polarizations and the metaplectic correction. Let $V$ be a real vector space of dimension $2n$ equipped with a constant symplectic form $\omega$ (i.e., a nondegenerate, closed 2-form). There exist linear coordinates $\{x^i, y^i\}_{i=1,\dots,n}$, or ${}^t x = (x^1 \cdots x^n)$, ${}^t y = (y^1 \cdots y^n)$, on $V$ such that

$$\omega = \sum_{i=1}^n dx^i \wedge dy^i = {}^t dx \wedge dy.$$

A complex structure $J \in \mathrm{End}(V)$ is compatible with the symplectic form $\omega$ if $\omega(J\,\cdot\,, J\,\cdot\,) = \omega(\,\cdot\,,\,\cdot\,)$ and $\omega(\,\cdot\,, J\,\cdot\,) > 0$. Given such a $J$, the complexification of $V$ decomposes as $V^{\mathbb{C}} = V_J^{(1,0)} \oplus V_J^{(0,1)}$, where $V_J^{(1,0)} = \{X \in V^{\mathbb{C}} \mid JX = \sqrt{-1}\,X\}$ and $V_J^{(0,1)} = \overline{V_J^{(1,0)}}$. Let $\mathcal{J}$ be the set of all compatible complex structures on $V$. $\mathcal{J}$ can be identified, as follows, with the Siegel upper half-space

$$\mathbb{H}_n = \{\Omega \in M_n(\mathbb{C}) \mid {}^t\Omega = \Omega,\ \mathrm{Im}\,\Omega > 0\} \subset \mathbb{C}^{\frac12 n(n+1)}.$$

We associate a compatible complex structure $J_\Omega \in \mathcal{J}$ to a point $\Omega \in \mathbb{H}_n$ so that $V_{J_\Omega}^{(1,0)}$ is spanned by the columns of $\binom{\Omega}{I_n}$. Equivalently, the complex structure can be written in terms of $\Omega = \Omega_1 + \sqrt{-1}\,\Omega_2$ as

$$J_\Omega = \begin{pmatrix} \Omega_1\Omega_2^{-1} & -\Omega_2 - \Omega_1\Omega_2^{-1}\Omega_1 \\ \Omega_2^{-1} & -\Omega_2^{-1}\Omega_1 \end{pmatrix}.$$

Thus $\mathcal{J}$ is identified with the positive Lagrangian Grassmannian. Real Lagrangian subspaces correspond to certain points on the boundary of $\mathbb{H}_n$. For any $\Omega \in \mathbb{H}_n$, we choose the corresponding holomorphic coordinates on $V$ as

$$z_\Omega = (2\Omega_2)^{-\frac12}(x - \bar\Omega y). \tag{2.1}$$

The matrix factor $(2\Omega_2)^{-\frac12}$ is chosen so that the symplectic form is $\omega = \sqrt{-1}\;{}^t dz_\Omega \wedge d\bar z_\Omega$. We will drop the subscript $\Omega$ when there is no danger of confusion.

There is a pre-quantum line bundle over $V$ with a connection whose curvature is $\frac{\omega}{\sqrt{-1}}$. We use the symplectic potential $\tau = \frac12 \sum_{i=1}^n (x^i\,dy^i - y^i\,dx^i)$ to trivialize this bundle. That is, the covariant derivative of a section $s$ of the bundle along $X$ is

$$\nabla_X s = X(s) - \sqrt{-1}\,(\iota_X \tau)\,s,$$


if $s$ is identified with a function on $V$. The pre-quantum Hilbert space $\mathcal{H}_0$ consists of square-integrable sections of the pre-quantum line bundle with respect to the Liouville volume form $\varepsilon_\omega = \frac{\omega^n}{(2\pi)^n\, n!}$. Polarized sections are those which are holomorphic, i.e., $\nabla_{\bar z} s = 0$. Using the complex coordinates (2.1), the covariant derivatives are

$$\nabla_z = \frac{\partial}{\partial z} - \frac12 \bar z, \qquad \nabla_{\bar z} = \frac{\partial}{\partial \bar z} + \frac12 z. \tag{2.2}$$

Hence, a polarized section $\psi \in \mathcal{H}_J$ can be written as

$$\psi = \phi(z)\, e^{-\frac12 |z|^2}$$

for some entire function $\phi$. Let $\mathcal{H}_J \subset \mathcal{H}_0$ denote the space of square integrable polarized sections with respect to the complex structure $J$. This is the quantum Hilbert space. We then have a quantum Hilbert space bundle $\mathcal{H} \to \mathcal{J}$ with fiber $\mathcal{H}_J$ over $J$. There is an Hermitian structure on this bundle given by

$$\langle \psi_1, \psi_2 \rangle = \int_V \bar\psi_1 \psi_2\, \varepsilon_\omega \tag{2.3}$$

for $\psi_1, \psi_2 \in \mathcal{H}_J$. Here and below, when $J$ is parameterized by $\Omega \in \mathbb{H}_n$, the subscript $J$ can be replaced by $\Omega$. For example, we write $\mathcal{H}_\Omega = \mathcal{H}_{J_\Omega}$.

Since $\mathcal{J}$ is the positive Lagrangian Grassmannian, there is a natural canonical bundle $\mathcal{L} \to \mathcal{J}$ with fiber $V_J^{(1,0)}$ over $J$. Let $\mathcal{K} \to \mathcal{J}$ be the dual determinant bundle with fiber $\mathcal{K}_J = \wedge^n (V_J^{(1,0)})^*$. Since $\mathcal{J}$ is contractible, there is a unique (up to equivalence) square root bundle $\sqrt{\mathcal{K}} \to \mathcal{J}$ such that $\sqrt{\mathcal{K}} \otimes \sqrt{\mathcal{K}} = \mathcal{K}$. This square root bundle is known as the bundle of half-forms. We define the corrected quantum Hilbert space bundle $\hat{\mathcal{H}} \to \mathcal{J}$ as $\hat{\mathcal{H}} = \mathcal{H} \otimes \sqrt{\mathcal{K}}$. The fiber $\hat{\mathcal{H}}_J = \mathcal{H}_J \otimes \sqrt{\mathcal{K}_J}$ is called the corrected quantum Hilbert space. Including the bundle of half-forms is known as the metaplectic correction.

2.2. Symplectic and metaplectic group actions. Given a vector space $V$ with a symplectic form $\omega$, the symplectic group $Sp(V, \omega)$ is the set of linear transformations on $V$ preserving $\omega$. Upon choosing a set of linear symplectic coordinates $\{x^i, y^i\}_{i=1,\dots,n}$, the group $Sp(V, \omega)$ is isomorphic to

$$Sp(2n, \mathbb{R}) = \left\{ \begin{pmatrix} A & B \\ C & D \end{pmatrix} \;\middle|\; {}^tAC = {}^tCA,\ {}^tBD = {}^tDB,\ {}^tAD - {}^tCB = I_n \right\}.$$

The group $Sp(V, \omega)$ acts on the set $\mathcal{J}$ of compatible complex structures by $g : J \mapsto gJg^{-1}$. The corresponding action on positive complex Lagrangian subspaces is $g : V_J^{(1,0)} \mapsto gV_J^{(1,0)} = V_{gJg^{-1}}^{(1,0)}$. Identifying $\mathcal{J}$ with the Siegel upper half-space $\mathbb{H}_n$, the action of $Sp(V, \omega)$ on $\mathcal{J}$ becomes the fractional linear transformation on $\mathbb{H}_n$, i.e.,

$$g = \begin{pmatrix} A & B \\ C & D \end{pmatrix} : \Omega \mapsto g \cdot \Omega = (A\Omega + B)(C\Omega + D)^{-1}. \tag{2.4}$$

The following results, which will be used in the sequel, can be verified by straightforward calculations.


Lemma 2.1. Suppose g =

=

¯ − √ 2 −1

A

B C D


∈ Sp(2n, R) and = g · 0 , = g · 0 ∈ Hn . Put

. Then

1. ¯ = t(C + D)−1 (0 − ¯ 0 )(C0 + D)−1 . − 0

(2.5)

In particular, Im = t(C0 + D)−1 Im0 (C0 + D)−1 . 2. −1 ¯ −1 ¯ ¯ (−1 2 − )2 = ( − ) ( − ) ¯ 0 )−1 ( ¯0− ¯ 0 )(C0 + D)−1 ; = (C0 + D)(0 − −1 2 (2

− −1 )

(2.6)

¯ )−1 = ( − )( −

¯ 0 )−1t(C + D). = t(C0 + D)−1 (0 − 0 )(0 − 0

(2.7)
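As a numerical sanity check of the kind of "straightforward calculation" behind Lemma 2.1, the following minimal sketch treats only the scalar case n = 1. It assumes the first identity reads, for n = 1, g·Ω′ − conj(g·Ω) = (Ω₀′ − conj(Ω₀)) / ((c·conj(Ω₀) + d)(c·Ω₀′ + d)), together with its consequence Im(g·Ω₀) = Im Ω₀ / |cΩ₀ + d|². The helper name `mobius` and the sample values are illustrative choices, not taken from the paper.

```python
def mobius(g, w):
    """Fractional linear action (2.4) for n = 1, with g = (a, b, c, d), ad - bc = 1."""
    a, b, c, d = g
    return (a * w + b) / (c * w + d)


# An element of Sp(2, R) = SL(2, R): ad - bc = 2*2 - 1*3 = 1.
g = (2.0, 1.0, 3.0, 2.0)
a, b, c, d = g

# Two sample points of the upper half-plane (hypothetical test values).
omega0 = complex(0.3, 1.7)
omega0p = complex(-1.1, 0.4)

omega = mobius(g, omega0)
omegap = mobius(g, omega0p)

# Scalar form of the difference identity.
lhs = omegap - omega.conjugate()
rhs = (omega0p - omega0.conjugate()) / (
    (c * omega0.conjugate() + d) * (c * omega0p + d)
)

# Consequence for the imaginary part (take omega0p = omega0 in the identity).
im_lhs = omega.imag
im_rhs = omega0.imag / abs(c * omega0 + d) ** 2
```

Both residuals `abs(lhs - rhs)` and `abs(im_lhs - im_rhs)` should vanish up to rounding, and `omega.imag` stays positive, reflecting that the action preserves the upper half-plane.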

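The action of $Sp(V, \omega)$ on $V$ lifts to the pre-quantum line bundle preserving the connection. Consequently, the group $Sp(V, \omega)$ acts on the pre-quantum Hilbert space $\mathcal{H}_0$. In fact, since the symplectic potential $\tau$ is preserved by $Sp(2n, \mathbb{R})$, under the corresponding trivialization of the line bundle as $V \times \mathbb{C}$, the action of $g \in Sp(2n, \mathbb{R})$ is $g \cdot (v, \zeta) = (gv, \zeta)$, $v \in V \cong \mathbb{R}^{2n}$, $\zeta \in \mathbb{C}$, and that on $s \in \mathcal{H}_0 \cong L^2(V) \otimes \mathbb{C}$ is $(g \cdot s)(v) = s(g^{-1}v)$, $v \in V$.

The action of $Sp(V, \omega)$ lifts to the Hilbert space bundle $\mathcal{H} \to \mathcal{J}$ covering the action on $\mathcal{J}$. Since $Sp(V, \omega)$ preserves the connection on the pre-quantum line bundle, the action $g : \mathcal{H}_J \to \mathcal{H}_{gJg^{-1}}$ is a unitary isomorphism for any $g \in Sp(V, \omega)$.

The symplectic group $Sp(V, \omega)$ also acts on the vector bundle $\mathcal{L} \to \mathcal{J}$ and hence on the line bundle $\mathcal{K} \to \mathcal{J}$. In fact the choice of coordinates (2.1) provides a global unitary section $\Omega \mapsto d^n z_\Omega$ of $\mathcal{K}$. The transformation of the complex coordinates

$$g = \begin{pmatrix} A & B \\ C & D \end{pmatrix} : z_\Omega \mapsto (g^{-1})^* z_\Omega = \Omega_2^{-\frac12}\; {}^t(C\bar\Omega + D)\, (g\cdot\Omega)_2^{\frac12}\, z_{g\cdot\Omega} = \Omega_2^{\frac12}\, (C\Omega + D)^{-1}\, (g\cdot\Omega)_2^{-\frac12}\, z_{g\cdot\Omega}, \tag{2.8}$$

where $(g \cdot \Omega)_2 = \mathrm{Im}(g \cdot \Omega)$, is unitary, and so is that of the section $d^n z_\Omega$,

$$g = \begin{pmatrix} A & B \\ C & D \end{pmatrix} : d^n z_\Omega \mapsto (g^{-1})^* d^n z_\Omega = \frac{\det(C\bar\Omega + D)}{|\det(C\Omega + D)|}\, d^n z_{g\cdot\Omega}. \tag{2.9}$$

This action does not lift to $\sqrt{\mathcal{K}}$, but the double covering group of $Sp(V, \omega)$ does act on $\sqrt{\mathcal{K}}$. Since $Sp(V, \omega)$ is connected and $\pi_1(Sp(V, \omega)) \cong \mathbb{Z}$, there is a unique (up to isomorphism) connected double covering group $Mp(V, \omega)$, called the metaplectic group. The double covering group of $Sp(2n, \mathbb{R})$ is denoted by $Mp(2n, \mathbb{R})$. We have the following well-known result (see for example [9]):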


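Proposition 2.2. $Mp(V, \omega)$ is isomorphic to the group whose elements are pairs $(\sigma, g)$, where $\sigma$ is a bundle isomorphism of $\sqrt{\mathcal{K}} \to \mathcal{J}$ covering the action of $g \in Sp(V, \omega)$ on $\mathcal{J}$. That is, we have a commutative diagram

$$\begin{array}{ccc} \sqrt{\mathcal{K}} & \xrightarrow{\ \sigma\ } & \sqrt{\mathcal{K}} \\ \downarrow & & \downarrow \\ \mathcal{J} & \xrightarrow{\ g\ } & \mathcal{J} \end{array}$$

Consequently, the metaplectic group $Mp(V, \omega)$ acts on $\sqrt{\mathcal{K}}$ and hence on the corrected quantum Hilbert space bundle $\hat{\mathcal{H}} = \mathcal{H} \otimes \sqrt{\mathcal{K}}$. Given $g = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in Sp(2n, \mathbb{R})$, the action of a lifted element in $Mp(2n, \mathbb{R})$ is

$$\psi \otimes \sqrt{d^n z_\Omega} \in \hat{\mathcal{H}}_\Omega \ \mapsto\ \frac{\det(C\bar\Omega + D)^{\frac12}}{|\det(C\Omega + D)|^{\frac12}}\; \psi \circ g^{-1} \otimes \sqrt{d^n z_{g\cdot\Omega}} \in \hat{\mathcal{H}}_{g\cdot\Omega}, \tag{2.10}$$

where the square root $\det(C\bar\Omega + D)^{\frac12}$ depends on the lift of $g$ to $Mp(2n, \mathbb{R})$.

2.3. Projectively flat and flat connections. First, we describe a projectively flat connection on $\mathcal{H} \to \mathcal{J}$ [1]. Combining this and the connection on $\sqrt{\mathcal{K}} \to \mathcal{J}$, we obtain a flat connection on $\hat{\mathcal{H}} \to \mathcal{J}$ [15, §10.2].

Since $\mathcal{H} \to \mathcal{J}$ is a subbundle of the product bundle $\mathcal{J} \times \mathcal{H}_0 \to \mathcal{J}$, the trivial connection on the latter projects to a connection on $\mathcal{H}$. This connection is [1]

$$\nabla^{\mathcal{H}} = \delta + \frac14 (\delta J\, \omega^{-1})^{ij}\, \nabla_{z^i} \nabla_{z^j}, \tag{2.11}$$

where $\delta$ is the exterior differential on $\mathcal{J}$. The second term is a 1-form on $\mathcal{J}$ valued in the set of skew-adjoint operators on $\mathcal{H}_J$. Let $P_J = \frac12(1 - \sqrt{-1}\,J) : V^{\mathbb{C}} \to V_J^{(1,0)}$ be the projection with respect to the Hermitian form on $V^{\mathbb{C}}$ defined by $\omega$ and $J$. Then the curvature of the above connection is [1]

$$F^{\mathcal{H}} = -\frac18\, \mathrm{Tr}(P_J\, \delta J \wedge \delta J\, P_J)\; \mathrm{id}_{\mathcal{H}_J}. \tag{2.12}$$

So the connection is projectively flat [1]. Henceforth we omit the identity operator.

The connection described above blows up at the boundary of $\mathcal{J}$. We will be interested in extending geodesics in $\mathcal{J}$ to the boundary. In order to parallel transport along the extended geodesics in the next section, we must employ the metaplectic correction. The product bundle $V^{\mathbb{C}} \times \mathcal{J} \to \mathcal{J}$ has an Hermitian structure defined by $\omega$ and $J$. So a connection on the sub-bundle $\mathcal{L} \to \mathcal{J}$ is given by the orthogonal projection of the trivial connection. Its curvature is

$$F^{\mathcal{L}} = P_J\, \delta(P_J\, \delta P_J) = P_J\, \delta P_J \wedge \delta P_J\, P_J = -\frac14\, P_J\, \delta J \wedge \delta J\, P_J. \tag{2.13}$$

Proposition 2.3 ([15, §10.2]). The induced connection on the corrected quantum Hilbert space bundle $\hat{\mathcal{H}} \to \mathcal{J}$ is flat.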

Proof. The curvature of the connection on $\sqrt{\mathcal{K}}$ is $F^{\sqrt{\mathcal{K}}} = -\frac12\, \mathrm{Tr}\, F^{\mathcal{L}}$. So by (2.12) and (2.13),

$$F^{\mathcal{H}} + F^{\sqrt{\mathcal{K}}} = 0.$$

The result was proved in [15, §10.2] using cocycles.

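The identification $\mathcal{J} = \mathbb{H}_n$ provides $\mathcal{J}$ with a convenient set of coordinates. Using the variation of $J$,

$$\delta J = \frac{\sqrt{-1}}{2} \begin{pmatrix} \bar\Omega \\ I_n \end{pmatrix} \Omega_2^{-1}\, \delta\Omega\, \Omega_2^{-1}\, (I_n, -\bar\Omega) - \frac{\sqrt{-1}}{2} \begin{pmatrix} \Omega \\ I_n \end{pmatrix} \Omega_2^{-1}\, \delta\bar\Omega\, \Omega_2^{-1}\, (I_n, -\Omega),$$

the connection (2.11) becomes

$$\nabla^{\mathcal{H}} = \delta - \frac{\sqrt{-1}}{4}\; {}^t\nabla_z\, \Omega_2^{-\frac12}\, \delta\bar\Omega\, \Omega_2^{-\frac12}\, \nabla_z. \tag{2.14}$$

The curvature (2.12) is

$$F^{\mathcal{H}} = \frac18\, \mathrm{Tr}(\Omega_2^{-1}\, \delta\Omega \wedge \Omega_2^{-1}\, \delta\bar\Omega). \tag{2.15}$$

The latter is proportional to the standard Kähler form on $\mathbb{H}_n$. On the other hand, using the (unitary) global section $\sqrt{d^n z}$, the connection on $\sqrt{\mathcal{K}}$ is given by the 1-form (for any $n \ge 1$)

$$A^{\sqrt{\mathcal{K}}} = \frac{\sqrt{-1}}{4}\, \mathrm{Tr}(\Omega_2^{-1}\, \delta\Omega_1). \tag{2.16}$$

Its curvature is the negative of (2.15).

The Hilbert space $\mathcal{H}_J$ is the Fock space of a harmonic oscillator with Hamiltonian $H = |z|^2$. In the case $n = 1$, the parameter is $\tau = \tau_1 + \sqrt{-1}\,\tau_2$ in the upper half-plane. A unitary basis for $\mathcal{H}_J$ is $\{|k\rangle = \frac{z^k}{\sqrt{k!}}\, e^{-\frac12|z|^2}\}_{k \in \mathbb{N}}$. The vector $|0\rangle$ is the vacuum state and $|k\rangle$ ($k \ge 1$) are the excited states. Such a basis provides a global unitary frame for the bundle $\mathcal{H}$. Each $|k\rangle$, regarded as a function of $\tau$ valued in $\mathcal{H}_0$, has the exterior derivative

$$\delta|k\rangle = \frac{\sqrt{-1}}{4\tau_2} \Big[ \big(k|k\rangle - \sqrt{(k+1)(k+2)}\,|k+2\rangle\big)\, \delta\tau + \big(k|k\rangle - 2\bar z\sqrt{k}\,|k-1\rangle + \bar z^2|k\rangle\big)\, \delta\bar\tau \Big].$$

The connection is given by an infinite skew-Hermitian matrix valued 1-form

$$A^{\mathcal{H}}_{kl} = \langle k|\delta|l\rangle = \frac{\sqrt{-1}}{4\tau_2} \Big[ \big(k\,\delta_{kl} - \sqrt{k(k-1)}\,\delta_{k,l+2}\big)\, \delta\tau + \big(l\,\delta_{kl} - \sqrt{l(l-1)}\,\delta_{k+2,l}\big)\, \delta\bar\tau \Big], \tag{2.17}$$

while the matrix of the curvature 2-form is, as expected,

$$\langle k|F^{\mathcal{H}}|l\rangle = \frac{\delta\tau \wedge \delta\bar\tau}{8\tau_2^2}\, \delta_{kl}. \tag{2.18}$$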


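3. Parallel Transport Along the Geodesics

3.1. Solutions to the equation of parallel transport. The Siegel upper half-space $\mathbb{H}_n$ has a non-positively curved Kähler metric

$$ds^2 = \mathrm{Tr}(\Omega_2^{-1}\, \delta\Omega\; \Omega_2^{-1}\, \delta\bar\Omega),$$

which is invariant under the action of $Sp(2n, \mathbb{R})$. We study parallel transport in the bundles $\mathcal{H}$ and $\hat{\mathcal{H}}$ along the geodesics in $\mathbb{H}_n$. Let $\Omega, \Omega' \in \mathbb{H}_n$ represent $J, J' \in \mathcal{J}$, respectively. Parallel transport in the bundle $\mathcal{H}$ along the unique geodesic from $\Omega$ to $\Omega'$ is a unitary operator, and so is that in $\hat{\mathcal{H}}$. We denote them by $U_{J'J} = U_{\Omega'\Omega} : \mathcal{H}_J \to \mathcal{H}_{J'}$ and $\hat U_{J'J} = \hat U_{\Omega'\Omega} : \hat{\mathcal{H}}_J \to \hat{\mathcal{H}}_{J'}$, respectively. The generating function for the basis of $\mathcal{H}_\Omega$ is a coherent state or a principal vector [2]

$$c_\alpha(z_\Omega) = \exp\!\big({}^t\bar\alpha\, z_\Omega - \tfrac12 |z_\Omega|^2\big), \tag{3.1}$$

where $\alpha \in \mathbb{C}^n$. We wish to find $U_{\Omega'\Omega}\, c_\alpha \in \mathcal{H}_{\Omega'}$ and its metaplectic correction.

For any diagonal matrix $\Lambda = \mathrm{diag}[\lambda_1, \dots, \lambda_n] \ge 0$, the curve $\gamma_\Lambda : \mathbb{R} \to \mathbb{H}_n$ defined by $\gamma_\Lambda(t) = \sqrt{-1}\, e^{2t\Lambda}$ is a geodesic in $\mathbb{H}_n$.

Lemma 3.1 ([11]). For any geodesic $\gamma : \mathbb{R} \to \mathbb{H}_n$, there exist $g \in Sp(2n, \mathbb{R})$ and a diagonal matrix $\Lambda \ge 0$ such that $\gamma = g \cdot \gamma_\Lambda$.

We first study parallel transport along the geodesic $\gamma_\Lambda$; the latter determines a one-parameter family of complex structures $J_t$, whose complex coordinates are $z_t = \frac{1}{\sqrt2}\big(e^{-t\Lambda}x + \sqrt{-1}\, e^{t\Lambda}y\big)$. The equation of parallel transport of a family of polarized sections $\psi_t \in \Gamma(\gamma_\Lambda^*\mathcal{H})$ is

$$\big(\partial_t - \tfrac12\; {}^t\nabla_{z_t}\, \Lambda\, \nabla_{z_t}\big)\psi_t = 0. \tag{3.2}$$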
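The coordinates along the model geodesic can be checked numerically in the scalar case. The sketch below assumes n = 1 with a single eigenvalue `lam` (a hypothetical name for λ₁), and verifies two things: that the general coordinate (2.1) evaluated at Ω = √−1 e^{2tλ} reproduces z_t, and that dz_t/dt = −λ z̄_t, which is the λ-dependent form of the relation dz/dt = −z̄ used for λ₁ = 1 in the proof of Proposition 3.2.

```python
import math


def z_coord(omega, x, y):
    """Holomorphic coordinate (2.1) for n = 1: (2 Im omega)^(-1/2) (x - conj(omega) y)."""
    return (x - omega.conjugate() * y) / math.sqrt(2.0 * omega.imag)


def z_geodesic(t, lam, x, y):
    """Coordinate z_t along gamma(t) = i e^{2 t lam}, as stated before (3.2)."""
    return (math.exp(-t * lam) * x + 1j * math.exp(t * lam) * y) / math.sqrt(2.0)


# Hypothetical sample values.
lam, t, x, y = 0.7, 0.4, 1.3, -0.8

# (2.1) evaluated on the geodesic should agree with z_t.
omega_t = 1j * math.exp(2.0 * t * lam)
coord_gap = abs(z_coord(omega_t, x, y) - z_geodesic(t, lam, x, y))

# Central finite difference for dz_t/dt, compared with -lam * conj(z_t).
h = 1e-6
dz_dt = (z_geodesic(t + h, lam, x, y) - z_geodesic(t - h, lam, x, y)) / (2.0 * h)
deriv_gap = abs(dz_dt + lam * z_geodesic(t, lam, x, y).conjugate())
```

Here `coord_gap` is zero up to rounding, and `deriv_gap` is limited only by the finite-difference step.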

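Proposition 3.2. The parallel transport of $c_\alpha(z_0)$ along $\gamma_\Lambda$ from $\Omega_0 = \sqrt{-1}\, I_n$ to $\Omega_t = \sqrt{-1}\, e^{2t\Lambda}$ is given by

$$(U_{\Omega_t\Omega_0}\, c_\alpha)(z) = (\det \operatorname{sech} t\Lambda)^{\frac12} \exp\left\{ \frac12\; {}^t\!\begin{pmatrix} \bar\alpha \\ z \end{pmatrix} \begin{pmatrix} \tanh t\Lambda & \operatorname{sech} t\Lambda \\ \operatorname{sech} t\Lambda & -\tanh t\Lambda \end{pmatrix} \begin{pmatrix} \bar\alpha \\ z \end{pmatrix} - \frac12 |z|^2 \right\}. \tag{3.3}$$

Proof. Since the connection 1-form $-\frac12\, {}^t\nabla_z \Lambda \nabla_z$ is a sum of diagonal terms, we can assume $n = 1$; the general case is similar. We can also set $\lambda_1 = 1$ by a rescaling of $t$. Let $\psi_t$ be the parallel transport of $c_\alpha$. Write $z = z_t$ and $\psi_t = \phi(t, z)\, e^{-\frac12|z|^2}$. Then $\phi(t, z)$ is an entire function in $z$ (for each $t$) satisfying

$$\left( \frac{\partial}{\partial t} - \frac12 \frac{\partial^2}{\partial z^2} + \frac12 z^2 \right) \phi(t, z) = 0, \qquad \phi(0, z) = e^{\bar\alpha z}.$$

Here we have used (2.2) and $\frac{d}{dt} z = -\bar z$. Set $\phi(t, z) = e^{f(t,z)}$. Then $f(t, z)$ satisfies

$$f_t - \tfrac12 f_{zz} - \tfrac12 f_z^2 + \tfrac12 z^2 = 0, \qquad f(0, z) = \bar\alpha z.$$

If we look for a solution of the form $f(t, z) = \frac12 p(t) z^2 + q(t) z + \frac12 r(t)$, then $p, q, r$ satisfy a set of ordinary differential equations

$$p_t = p^2 - 1, \qquad q_t = pq, \qquad r_t = p + q^2$$

with the initial conditions $p(0) = r(0) = 0$, $q(0) = \bar\alpha$. The solutions are $p(t) = -\tanh t$, $q(t) = \bar\alpha \operatorname{sech} t$, $r(t) = \ln \operatorname{sech} t + \bar\alpha^2 \tanh t$, and hence

$$\phi(t, z) = \sqrt{\operatorname{sech} t}\; \exp\!\big( \bar\alpha z \operatorname{sech} t + \tfrac12 (\bar\alpha^2 - z^2) \tanh t \big).$$

The result follows from the uniqueness of parallel transport.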
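The closed-form solutions in the proof above can be confirmed with a short finite-difference check. The sketch below is only a numerical illustration for n = 1, λ₁ = 1; the variable `a` stands for ᾱ, and the sample value and the helper `ddt` are hypothetical choices, not part of the paper.

```python
import math


def p(t):
    return -math.tanh(t)


def q(t, a):
    return a / math.cosh(t)                      # a * sech(t)


def r(t, a):
    return math.log(1.0 / math.cosh(t)) + a * a * math.tanh(t)


def ddt(f, t, h=1e-5):
    """Central finite-difference approximation of df/dt."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


a = complex(0.6, -0.9)   # plays the role of conj(alpha)

residuals = []
for t in (-1.5, -0.2, 0.0, 0.8, 2.0):
    res_p = ddt(p, t) - (p(t) ** 2 - 1.0)                              # p_t = p^2 - 1
    res_q = ddt(lambda s: q(s, a), t) - p(t) * q(t, a)                 # q_t = p q
    res_r = ddt(lambda s: r(s, a), t) - (p(t) + q(t, a) ** 2)          # r_t = p + q^2
    residuals.append(max(abs(res_p), abs(res_q), abs(res_r)))

# Initial conditions p(0) = r(0) = 0, q(0) = conj(alpha).
initial = (p(0.0), q(0.0, a), r(0.0, a))
```

All entries of `residuals` are at the level of the finite-difference error, and `initial` reproduces the stated initial data, consistent with the expression for φ(t, z).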

Proposition 3.2 enables us to calculate the parallel transport of any basis vector in $\mathcal{H}_{J_0}$. In particular, the parallel transport of the vacuum is no longer the vacuum in a new polarization; it is a linear combination of states with an even number of excitations. We list the parallel transport of a few states with small excitation numbers in the case $n = 1$, $\lambda_1 = 1$:

$$\begin{pmatrix} 1 \\ z_0 \\ z_0^2 \\ \vdots \end{pmatrix} e^{-\frac12 |z_0|^2} \ \mapsto\ \sqrt{\operatorname{sech} t}\, \begin{pmatrix} 1 \\ z_t \operatorname{sech} t \\ z_t^2 \operatorname{sech}^2 t + \tanh t \\ \vdots \end{pmatrix} e^{-\frac12 z_t^2 \tanh t - \frac12 |z_t|^2}. \tag{3.4}$$

Next we study parallel transport in the half-form bundle $\sqrt{\mathcal{K}}$. As noted earlier, the complex coordinates corresponding to the point $\gamma_\Lambda(t) \in \mathbb{H}_n$ are $z_t = \frac{1}{\sqrt2}\big(e^{-t\Lambda}x + \sqrt{-1}\, e^{t\Lambda}y\big)$. As $t$ varies, the complex coordinates change by $\frac{d}{dt} z_t = -\Lambda \bar z_t$, whose projection to $V^{(1,0)}_{\gamma_\Lambda(t)}$ is 0. Consequently, $dz_t$ is a parallel section of $\gamma_\Lambda^*\mathcal{L}^*$ and $\sqrt{d^n z_t}$ is a parallel section of $\gamma_\Lambda^*\sqrt{\mathcal{K}}$. The latter is also a consequence of (2.16). Hence the parallel transport of $c_\alpha \otimes \sqrt{d^n z_0}$ is $U_{\Omega_t\Omega_0}\, c_\alpha \otimes \sqrt{d^n z_t}$.

We now turn to parallel transport along a general geodesic.

Theorem 3.3. Let $\Omega, \Omega' \in \mathbb{H}_n$ and let $\gamma$ be the unique geodesic such that $\gamma(0) = \Omega$ and $\gamma(1) = \Omega'$. Then

1. The parallel transport of $c_\alpha(z_\Omega)$ along $\gamma$ is

(U cα )(z ) =

1

(det 2 ) 4 (det 2 ) 4 1

| det | 2  1 1 t −1 2 2 1 −

I α ¯ n 2 2  × exp 1 1 2 z 2 2 2 −1 2 



1 2 2 −1 2  α¯ − 1 1 z 2 In − 2 2 −1 2 1 2

 1 2 |z | . 2 (3.5)

2. The parallel transport of

√ n d z along γ is

(det ) 2 1

1

| det | 2

| det | 2 1

dn z

3. The parallel transport of cα (z ) ⊗ (3.6).

=

1

(det ) 2

d n z .

(3.6)

√ n d z along γ is the tensor product of (3.5) and


B and be given by Lemma 3.1 such that γ = g · γ . Then Proof. Let g = CA D √ √ = g · 0 , = g · 0 ∈ Hn , where 0 = −1 In and 0 = −1 e2 In . 1. We first map cα (z ) = exp(tαz ¯ − 21 |z |2 ) in H by g −1 to cα0 (z 0 ) = exp(tα¯ 0 z 0 − 1 2 2 2 2 |z 0 | ) in H0 . By the unitarity of (2.8), we have |z 0 | = |z | and 1

−1

α0 = t(C0 + D) 22 α = (C0 + D)−1 2 2 α. The parallel transport of cα0 (z 0 ) in H0 along γ is (U0 0 cα0 )(z 0 ) in H0 given by (3.3). Since the connection is invariant under Sp(2n, R), the action of g on the latter is ) in H . Here (U cα )(z 1

−1 2 z 0 = e− t(C0 + D) 2 2 z = e (C0 + D)

− 21 z .

Using these identities and Lemma 2.1, we get 1

1

1

det(cosh t) = det 0 0 det(Im0 )− 2 = | det |(det 2 )− 2 (det 2 )− 2 , 1

1

t α¯ 0 sech t z 0 = tα ¯ 22 (C0 + D) −1 (C0 + D)2 2 z

t

0

=

1 2

0

1

2 α ¯ 2 −1 2 z ,

t

1

−1

¯ 0 )−1 ( ¯0− ¯ 0 )(C0 + D)−1 2 α¯ α¯ 0 tanh t α¯ 0 = tα ¯ 22 (C0 + D)(0 − 2

t

1

1

2 = tα(I ¯ n − 22 −1 ¯ 2 )α,

and −tz 0 tanh t z 0 = tz 2

− 21 t

1

¯ 0 )−1t(C + D)2 2 z (C0 + D)−1 (0 − 0 )(0 − 0 1

1

2 −1 2 = tz (In − 2 2 )z .

From these identities and from (3.3), the result follows. √ √ 2. Since the connection on K is invariant under M p(2n, R), we again map d n z by g −1 to 0 , parallel transport it along γ to 0 , and map the result to by g. By (2.9), the phase accumulated in these steps is 1 det(C0 + D) det (C0 + D) 2 , |det(C0 + D) det(C0 + D)| which is equal to those in (3.6) by taking the determinant of (2.5).

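3.2. Projections, Bogoliubov transformations and the integral kernel of parallel transport. Given any two compatible complex structures $J, J' \in \mathcal{J}$, there is an orthogonal projection $P_{J'J} : \mathcal{H}_J \to \mathcal{H}_{J'}$ inside the pre-quantum Hilbert space $\mathcal{H}_0$. In [14] and [15, §9.9], Woodhouse showed that up to a scalar multiplication, this projection is a unitary operator, the Bogoliubov transformation. On the other hand, parallel transport in the bundle $\mathcal{H} \to \mathcal{J}$ along the geodesic from $J$ to $J'$ defines a manifestly unitary operator from $\mathcal{H}_J$ to $\mathcal{H}_{J'}$. The rescaled projection of the vacuum state calculated in [15, §9.9] coincides with (3.3) when $\alpha = 0$. We show that this is true for all states.

Theorem 3.4. For any $J, J' \in \mathcal{J}$, the parallel transport $U_{J'J}$ in $\mathcal{H}$ along the (unique) geodesic from $J$ to $J'$ is the map $\alpha(J', J)\, P_{J'J} : \mathcal{H}_J \to \mathcal{H}_{J'}$, where

$$\alpha(J', J) = \left( \det \frac{J + J'}{2} \right)^{\frac14} = \left( \det \sqrt{-1}\,(P_{J'} - \bar P_J) \right)^{\frac14}. \tag{3.7}$$

Proof. The quantity $\alpha(J', J)$ defined in (3.7) is invariant under $Sp(V, \omega)$. Since all the steps are equivariant under $Sp(V, \omega)$, it suffices to consider the parallel transport along a geodesic of the form $\gamma_\Lambda$. Let $J_t$ be the complex structure corresponding to $\gamma_\Lambda(t)$. We want to show that there exists a function $\alpha(t)$ with $\alpha(0) = 1$ such that for any $\psi \in \mathcal{H}_{J_0}$, $\psi_t \in \mathcal{H}_{J_t}$, if $\{\psi_t\}$ is parallel along $\gamma_\Lambda$, then

$$\langle \psi, \psi_0 \rangle = \alpha(t)\, \langle \psi, \psi_t \rangle$$

for any $t \in \mathbb{R}$. This is equivalent to the condition that the right hand side has vanishing derivative with respect to $t$. Again, without loss of generality, we prove the case of $n = 1$, $\lambda_1 = 1$. Since $z_t = \frac{1}{\sqrt2}\big(e^{-t}x + \sqrt{-1}\, e^{t}y\big)$, we have

$$\nabla_{z_t} = \operatorname{sech} t\; \nabla_{z_0} + \tanh t\; \nabla_{\bar z_t}.$$

We also note that the formal adjoint of $\nabla_{z_0}$ is $-\nabla_{\bar z_0}$. So

$$\begin{aligned} \frac{d}{dt}\big(\alpha(t)\langle \psi, \psi_t\rangle\big) &= \alpha'(t)\langle \psi, \psi_t\rangle + \tfrac12 \alpha(t)\langle \psi, \nabla_{z_t}^2 \psi_t\rangle \\ &= \alpha'(t)\langle \psi, \psi_t\rangle + \tfrac12 \alpha(t)\langle \psi, (\operatorname{sech} t\, \nabla_{z_0} + \tanh t\, \nabla_{\bar z_t})\nabla_{z_t}\psi_t\rangle \\ &= \alpha'(t)\langle \psi, \psi_t\rangle + \tfrac12 \alpha(t) \operatorname{sech} t\, \langle -\nabla_{\bar z_0}\psi, \nabla_{z_t}\psi_t\rangle \\ &\quad + \tfrac12 \alpha(t) \tanh t\, \langle \psi, (-\sqrt{-1}\,\omega(\partial_{\bar z_t}, \partial_{z_t}) + \nabla_{z_t}\nabla_{\bar z_t})\psi_t\rangle \\ &= \big(\alpha'(t) - \tfrac12 \alpha(t) \tanh t\big)\langle \psi, \psi_t\rangle, \end{aligned}$$

which vanishes if we choose $\alpha(t) = \sqrt{\cosh t}$. It is easy to verify $\alpha(J_t, J_0) = \alpha(t)$ using $J_t = \begin{pmatrix} 0 & -e^{2t} \\ e^{-2t} & 0 \end{pmatrix}$. The second equality in (3.7) is because $\frac{J + J'}{2} = \sqrt{-1}\,(P_{J'} - \bar P_J)$.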
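The two scalar facts used at the end of this proof are easy to confirm numerically for n = 1, λ₁ = 1. The sketch below checks that α(t) = √(cosh t) satisfies α′ = ½ α tanh t with α(0) = 1, and that the determinant expression in (3.7), evaluated with J₀ and the matrix J_t quoted in the proof, gives the same value. Function names and sample points are illustrative assumptions.

```python
import math


def alpha(t):
    """Rescaling factor alpha(t) = sqrt(cosh t) from the proof."""
    return math.sqrt(math.cosh(t))


def alpha_from_det(t):
    """(det((J_0 + J_t)/2))^(1/4) for n = 1, with J_t = [[0, -e^{2t}], [e^{-2t}, 0]]."""
    m01 = -(1.0 + math.exp(2.0 * t)) / 2.0    # upper-right entry of (J_0 + J_t)/2
    m10 = (1.0 + math.exp(-2.0 * t)) / 2.0    # lower-left entry of (J_0 + J_t)/2
    det = 0.0 * 0.0 - m01 * m10               # diagonal entries are zero
    return det ** 0.25


h = 1e-5
checks = []
for t in (-2.0, -0.5, 0.0, 0.3, 1.7):
    # Residual of alpha' - (1/2) alpha tanh t, via a central difference.
    ode_res = (alpha(t + h) - alpha(t - h)) / (2.0 * h) - 0.5 * alpha(t) * math.tanh(t)
    det_gap = alpha(t) - alpha_from_det(t)
    checks.append((abs(ode_res), abs(det_gap)))
```

The determinant equals cosh² t, so its fourth root is √(cosh t), in agreement with the choice made in the proof.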

Corollary 3.5. Parallel transport in $\mathcal{H} \to \mathcal{J}$ along the geodesic in $\mathbb{H}_n$ from $J$ to $J'$ coincides with the Bogoliubov transformation from $\mathcal{H}_J$ to $\mathcal{H}_{J'}$.

Proof. The operator $\alpha(J', J)\, P_{J'J}$, including the scalar factor (3.7), coincides with the formula of the Bogoliubov transformation in [14] and [15, §9.9].

There is a more direct explanation of the above result. Given a complex structure $J$ and the corresponding holomorphic coordinates (2.1) on $V$, $\mathcal{H}_J$ is the Fock space of the creation and annihilation operators $a_J^\dagger = z$, $a_J = \nabla_z + \bar z$. As the complex structure changes along a geodesic, so do $a_J^\dagger$ and $a_J$. In fact, when $n = 1$ and along $\gamma(t) = \sqrt{-1}\, e^{2t}$, the parallel transports of $a_0$ and $a_0^\dagger$ are

$$\cosh t\; a_t + \sinh t\; a_t^\dagger \quad \text{and} \quad \sinh t\; a_t + \cosh t\; a_t^\dagger,$$

respectively, where $a_t = a_{J_t}$ and $a_t^\dagger = a_{J_t}^\dagger$. This deformation of the creation and annihilation operators, or the concept of the vacuum and excitations, is the physics origin of the Bogoliubov transformation.

For any $J \in \mathcal{J}$ represented by $\Omega$, $\mathcal{H}_J$ can be identified with the space of analytic functions on $(V, J)$ with the measure $e^{-|z_\Omega|^2}\, \varepsilon_\omega(z_\Omega)$. The orthogonal projection onto $\mathcal{H}_\Omega$ is given by the Bergman kernel. So we can express parallel transport by an integral kernel operator.

Proposition 3.6. For any $\Omega, \Omega' \in \mathbb{H}_n$, the parallel transport $U_{\Omega'\Omega}$ in $\mathcal{H}$ along the (unique) geodesic from $\Omega$ to $\Omega'$ is

1

φ(z ) e− 2 |z | →

| det | 2

2

1 4

1

e− 2 |z |

1 (det 2 ) 4

2

(det 2 ) 1 t 2 1 2 × e z z¯ − 2 |z | − 2 |z | φ(z ) εω (z ).

(3.8)

V

Proof. The projection onto H is given by the Bergman kernel t

1

e z z¯ − 2 |z | Using the facts J 1 1

1

=

2 − 1 |z |2 2

.

√ √ −1 1 and (1, −)J = − −1 (1, −), we get

¯ J +J − 2 1 −

¯ 1

√ ¯ 0 − = −1 ¯ . 0 −

Taking the determinant, we get det

J + J | det |2 , = 2 det 2 det 2

from which the scalar factor in (3.8) follows.

Up to a phase, (3.8) agrees with the unitary intertwining operator from H to H in [10, 8]. √ √ A pairing can be defined on the half forms d n z and d n z even though they come from different complex structures.1 A simple calculation yields d n z , d n z =

1

(det ) 2 1

1

(det 2 ) 4 (det 2 ) 4

.

(3.9)

Since both the scalar factor in (3.8) and the phase in (3.6) are absorbed in (3.9), we have recovered 1 We recall that the pairing of d n z n and d z is determined by (−1)

d n z , d n z εω and that of

n(n−1) d z¯ ∧dz √ 2 (2π −1 )n

√ n √ d z and d n z is d n z , d n z = d n z , d n z .

=

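Corollary 3.7 ([15, §10.2]). Parallel transport from $\Omega$ to $\Omega'$ under the flat connection in $\hat{\mathcal{H}} = \mathcal{H} \otimes \sqrt{\mathcal{K}}$ is given by

$$\hat U_{\Omega'\Omega} : \psi \otimes \sqrt{d^n z_\Omega} \in \hat{\mathcal{H}}_\Omega \ \mapsto\ \big\langle \sqrt{d^n z_{\Omega'}}, \sqrt{d^n z_\Omega} \big\rangle\; P_{\Omega'\Omega}\psi \otimes \sqrt{d^n z_{\Omega'}} \in \hat{\mathcal{H}}_{\Omega'}. \tag{3.10}$$

Alternatively, this map can be described by a pairing between $\hat{\mathcal{H}}_\Omega$ and $\hat{\mathcal{H}}_{\Omega'}$,

$$\big\langle \psi' \otimes \sqrt{d^n z_{\Omega'}},\ \psi \otimes \sqrt{d^n z_\Omega} \big\rangle = \langle \psi', \psi \rangle\, \big\langle \sqrt{d^n z_{\Omega'}}, \sqrt{d^n z_\Omega} \big\rangle, \tag{3.11}$$

where $\langle \psi', \psi \rangle$ is the inner product of $\psi \in \mathcal{H}_\Omega$ and $\psi' \in \mathcal{H}_{\Omega'}$ in $\mathcal{H}_0$. When $\Omega = \Omega'$, the above pairing is the inner product (2.3) in $\hat{\mathcal{H}}_\Omega$.

We remark that if $\phi(z_\Omega)\, e^{-\frac12 |z_\Omega|^2}$ in (3.8) is $c_\alpha(z_\Omega)$, the integration yields the same result as (3.5). This gives another integral kernel of parallel transport. The existence of two different kernels is because $\phi(z_\Omega)$ is restricted to be holomorphic.

Theorem 3.8. Under the assumptions of Proposition 3.6, the map $U_{\Omega'\Omega}$ sends $\phi(z_\Omega)\, e^{-\frac12 |z_\Omega|^2}$ to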

1

(det 2 ) 4 (det 2 ) 4

1

e− 2 |z |

1 2

2

|det |   1 1 t 2 1 In − 22 −1 z ¯ 2   × φ(z ) exp 1 1 2 z 2 V 2 2 −1 2 

1



1

2 22 −1 2 1

1

2 In − 2 2 −1 2



− |z |2  εω (z ). 1

z¯ z

(3.12)

1

Proof. Since φ(z )e− 2 |z | is in H , we have an estimate |φ(w)| ≤ C e 2 |w| , where C is its norm [2]. By the reproducing property of the Bergman kernel, 2

1

φ(z ) e− 2 |z | = 2

2

φ(w) e−|w| cw (z ) εω (w), 2

V

which we substitute in (3.8). The integrand satisfies 1 2

1 1 1 2 2 t¯ 2 t 2

2 1 − 2 |z | + z z¯ − 2 |z |

≤ C e− 2 |w−z | − 2 |z −z | . e− 2 |z | φ(w) e−|w| − wz

Hence the double integral in w and z is absolutely convergent. Exchanging the order of the integration and performing the integral in z , we get

φ(w) e−|w| (U cw )(z ) εω (w), 2

V

which is (3.12) after relabeling the integration variable w as z .


3.3. Segal-Bargmann and Fourier transforms as parallel transport. The set $S$ of real Lagrangian subspaces in $(V, \omega)$ can be identified with the Shilov boundary of $\mathcal{J}$. For any $L \in S$, there is an Hermitian form on the space of sections of the pre-quantum line bundle that are covariantly constant along $L$, obtained by choosing a translation invariant measure on $V/L$. The subspace $\mathcal{H}_L$ of such sections that are $L^2$-integrable on $V/L$ is independent of the choice of the measure. The bundles $\mathcal{K}$ and $\sqrt{\mathcal{K}}$ extend to $S$; the fiber of $\mathcal{K}$ over $L \in S$ is $\mathcal{K}_L = (\wedge^n (V/L)^*)^{\mathbb{C}}$, where $(V/L)^*$ is identified as the subspace of $V^*$ that annihilates $L$. Let $\hat{\mathcal{H}}_L = \mathcal{H}_L \otimes \sqrt{\mathcal{K}_L}$ for any $L \in S$; this is the quantum Hilbert space (with the metaplectic correction) associated to the real polarization $L$. The action of $Sp(V, \omega)$ on $S$ lifts to that of $Mp(V, \omega)$ on the bundles $\sqrt{\mathcal{K}}$ and $\hat{\mathcal{H}}$. There is a canonical Hermitian form on $\hat{\mathcal{H}}_L$. Given $\psi_1, \psi_2 \in \mathcal{H}_L$ and $\sqrt{\nu} \in \sqrt{\mathcal{K}_L}$ ($L \in S$), we have

$$\big\langle \psi_1 \otimes \sqrt{\nu},\ \psi_2 \otimes \sqrt{\nu} \big\rangle = \int_{V/L} \bar\psi_1 \psi_2\, \frac{|\nu|}{(2\pi)^{\frac n2}}, \tag{3.13}$$

√ where |ν| is a density on V /L ∼ by ν. = Rn determined √ The M p(V, ω)-invariant pairing √on K √ → J between different √ √fibers also extends. Pairings are defined between K J , K L and between K L , K L for J ∈ J and L , L ∈ S such that L and L are transverse. For example, if L − = {x = 0}, L + = {y = 0} ∈ S in the symplectic coordinates (x, y) and ∈ Hn , we have √ √ √ 1 n , d n y, d n x = −1 2 . (3.14) d n z , d n x = det (22 )− 2 √ −1 For any J ∈ J corresponding to ∈ Hn , let R J be the subspace of ψ ∈ H J such that |ψ(z )| ≤

C (1 + |z |2 )n+α

ˆ J = RJ ⊗ for some C ≥ 0 and α > 0; such a ψ is L 1 on V . Let R pairing √ √ √ √ ¯ εω ψψ ψ ⊗ ν, ψ ⊗ ν = ν, ν

√ K J . There is a

(3.15)

V

√ √ ˆ J and ψ ⊗ ν ∈ H ˆ L ; the integral in (3.15) is absolutely between any ψ ⊗ ν ∈ R ˆL → H ˆ J is unitary and intertwines convergent. The corresponding operator Bˆ J L : H with the M p(V, ω)-action [2, 8, 15]. If L = L − and if J is parameterized by ∈ Hn , the operator and its inverse are, respectively, Bˆ L :φ(x) e

√

−1 t 2 xy

⊗

√

1

d n x −→

(det 22 ) 4 (det

√ −1

1

)2

1

2

e− 2 |z |

V /L −

φ(x)

   1 −1 (2 ) 21 (2 ) 21 √ −1 In − (22 ) 2 √ 2 2 −1 −1  z   1 z     × exp  x  2 x  1 −1 (2 ) 2 −1 √ √ − 2 −1 −1 # n |d x| , × d n z (3.16) n ⊗ (2π ) 2 

t

1

and, if φ(z ) e− 2 |z | is in R , 2

1 √ −1 t (det 22 ) 4 2 x y n 2 d z −→ e φ(z ) e−|z | 1 V (det √−1 ) 2    −1 −1  1 1 1 t 2 2 2 √ √ − (2 ) (2 ) (2 ) I 2 2 2  z¯   1 z¯  n −1 × exp −1−1 −1    1 x 2 x √ (22 ) 2 − √−1 −1 √ (3.17) × εω (z ) ⊗ d n x . √ When = −1 In , they are the usual Segal-Bargmann transform and its inverse [2]. For any pair of Lagrangian subspaces L , L ∈ S that are transverse, there exists a FouˆL →H ˆ L that intertwines with the action of M p(V, ω) rier transform operator Fˆ L L : H [7]. In particular, we have 2 −1 : φ(z ) e− 2 |z | ⊗ Bˆ L 1

√ √ −1 t Fˆ L + L − : φ(x) e 2 x y ⊗ d n x √ t √ n φ(x) e −1 x y

−→ −1 2

V /L −

|d n x|

n (2π ) 2

e−

√

−1 t 2 x y

⊗

d n y,

(3.18)

˜ ). Strictly speakwhere the integral in the parentheses is the usual Fourier transform φ(y √ ˆ L such that |ψ| is L 1 on ing, (3.18) is valid only on the dense subspace of ψ ⊗ ν ∈ H ˆ V /L; the operator then extends continuously to H L . Proposition 3.9. 1. Let J ∈ J and let L , L ∈ S be transverse to each other. Then for ˆ J and ψˆ ∈ R ˆ L, any ψˆ ∈ R ˆ lim Uˆ J J ψˆ = Bˆ −1 J L ψ,

J →L

lim Bˆ J L ψˆ = Fˆ L L ψˆ ;

J →L

(3.19)

here the limit is pointwise in V as J → L or L from inside J. 2. For any J, J ∈ J and L , L , L ∈ S that are mutually transverse, we have ˆ ˆ ˆ ˆ Bˆ J L = Uˆ J J ◦ Bˆ J L , Fˆ L L = Bˆ −1 J L ◦ B J L , FL L = FL L ◦ FL L .

(3.20)

Proof. 1. Let J, J be parameterized by , ∈ Hn . Without loss of generality, assume ˆ ) = φ(z ) e− 12 |z |2 ⊗ L = L − . Then the limit J → L is → 0 with 2 > 0. If ψ(z √ n ˆ ) is the tensor product of (3.12) and (3.6). As → 0, d z , then (Uˆ ψ)(z # √ 1 n 4 (det 22 ) d z → d n x and the integrand in (3.12) goes to that in (3.17). Since the latter is absolutely integrable, the limit commutes with the integration and thus the 1 first limit in (3.19) follows. We remark here that#the scalar factor (det 22 ) 4 that goes

. The proof of the second limit to zero in the limit is absorbed by the half-form d n z is similar. ˆ → J is flat and since Uˆ J J is the parallel 2. Since the connection on the bundle H ˆJ → R ˆ J and transport from J to J , we have Uˆ J J ◦ Uˆ J J = Uˆ J J . Using Uˆ J J : R −1 ˆ ˆ ˆ ˆ taking J → L, we get Bˆ −1 ◦ U = B on R , and hence on H . The proof of the J J J J JL JL other two identities are similar.


We thus proved that, as $J' \to L \in S$, parallel transport of $\hat\psi \in \hat{\mathcal{R}}_J$ from $J$ to $J'$ goes to $\hat B_{JL}^{-1}\hat\psi$. Since the latter is not $L^2$ on $V$ and its norm is defined instead by (3.13), it is not obvious why the "operator" $\lim_{J' \to L} \hat U_{J'J}$ is continuous on $\hat{\mathcal{H}}_J$ or why its image is contained in $\hat{\mathcal{H}}_L$. We now take the limit $J' \to L$ as $J'$ follows the path of a geodesic.

Lemma 3.10.
1. Let $\Lambda \ge 0$ be a diagonal matrix and $\gamma = g \cdot \gamma_\Lambda$ a geodesic in $\mathcal{J}$. Then $\lim_{t \to \pm\infty} \gamma(t)$ are real Lagrangian subspaces if and only if $\Lambda > 0$.
2. For any $J \in \mathcal{J}$ and $L \in S$, there is a geodesic $\gamma$ in $\mathcal{J}$ such that $\gamma(0) = J$, $\lim_{t \to -\infty} \gamma(t) = L$.
3. A pair of real Lagrangian subspaces $L, L'$ are transverse if and only if there is a geodesic $\gamma$ in $\mathcal{J}$ such that $\lim_{t \to -\infty} \gamma(t) = L$, $\lim_{t \to +\infty} \gamma(t) = L'$.

Proof. 1. Using the identification of $\mathcal{J}$ and $\mathbb{H}_n$, $\gamma_\Lambda(-\infty) = 0$ and $\gamma_\Lambda(+\infty) = +\sqrt{-1}\,\infty\, I_n$ if and only if $\Lambda > 0$, in which case they are the real Lagrangian subspaces $L_-$ and $L_+$, respectively. The result follows from the transitivity of the $Sp(V, \omega)$ action on $S$.

2. Without loss of generality, assume $J$ is represented by $\Omega = \sqrt{-1}\, I_n$. Then for any diagonal $\Lambda > 0$, $\gamma_\Lambda(0) = J$ and $\lim_{t \to -\infty} \gamma_\Lambda(t) = L_-$. The isotropy subgroup of $J$ in $Sp(V, \omega)$ is isomorphic to $U(n)$ and acts transitively on $S$. Hence the result.

3. Let $\gamma = g \cdot \gamma_\Lambda$ ($\Lambda > 0$) be the geodesic such that the limits hold. Then $L = gL_-$ and $L' = gL_+$. $L, L'$ are transverse since $L_-, L_+$ are. Conversely, if $L, L'$ are transverse, then there exists $g \in Sp(V, \omega)$ such that $L = gL_-$, $L' = gL_+$. The geodesic $\gamma = g \cdot \gamma_\Lambda$ for any $\Lambda > 0$ satisfies the requirement.

Proposition 3.11. Let $\gamma$ be a geodesic in $\mathcal{J}$ such that $\gamma(0) = J$ and $\gamma(-\infty) = L$, $\gamma(+\infty) = L' \in S$. Then for any $\hat\psi \in \hat{\mathcal{H}}_J$, we have

$$\lim_{t \to -\infty} \hat U_{\gamma(t)J}\, \hat\psi = \hat B_{JL}^{-1}\hat\psi, \qquad \lim_{t \to +\infty} \hat U_{\gamma(t)J}\, \hat\psi = \hat F_{L'L} \lim_{t \to -\infty} \hat U_{\gamma(t)J}\, \hat\psi \tag{3.21}$$

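almost everywhere on $V$.

Proof. Without loss of generality, we assume $\gamma = \gamma_\Lambda$ ($\Lambda > 0$). Then $J$ is given by $\Omega = \sqrt{-1}\, I_n$ and $L = L_-$, $L' = L_+$, while at $\gamma(t)$, $z_t = \frac{1}{\sqrt2}\big(e^{-t\Lambda}x + \sqrt{-1}\, e^{t\Lambda}y\big)$. Let $\pi_\pm : V \to V/L_\pm$ be the projections. Let $(\hat B_{JL}^{-1}\hat\psi)(x, y) = \phi(x)\, e^{\frac{\sqrt{-1}}{2}\, {}^t x y} \otimes \sqrt{d^n x}$. Using (3.19) and (3.16), we get

$$\begin{aligned} (\hat U_{\gamma(t)J}\hat\psi)(x', y') &= (\hat B_{\gamma(t)L_-}\hat B_{JL_-}^{-1}\hat\psi)(x', y') \\ &= \det e^{-t\Lambda} \left( \int_{V/L_-} \phi(x)\, e^{-\frac12\, {}^t(x - x')\, e^{-2t\Lambda} (x - x') + \sqrt{-1}\; {}^t(x - x')\, y'}\, \frac{|d^n x|}{(2\pi)^{\frac n2}} \right) \\ &\qquad \times\, e^{\frac{\sqrt{-1}}{2}\, {}^t x' y'} \otimes (\det \sqrt2\, e^{t\Lambda})^{\frac12} \sqrt{d^n z_t} \qquad (3.22) \\ &= \left( \int_{V/L_-} \phi(x)\, e^{-\frac12\, {}^t(x - x')\, e^{-2t\Lambda} (x - x') + \sqrt{-1}\; {}^t x\, y'}\, \frac{|d^n x|}{(2\pi)^{\frac n2}} \right) \\ &\qquad \times\, e^{-\frac{\sqrt{-1}}{2}\, {}^t x' y'} \otimes (\det \sqrt2\, e^{-t\Lambda})^{\frac12} \sqrt{d^n z_t}. \qquad (3.23) \end{aligned}$$

As $t \to -\infty$, (3.22) goes to $\phi(x')\, e^{\frac{\sqrt{-1}}{2}\, {}^t x' y'} \otimes \sqrt{d^n x}$ pointwise on $\pi_-^{-1}(E_\phi)$, where $E_\phi$ is the Lebesgue set of $\phi$ (see for example [12, Theorem I.1.25] or [3, Theorem 8.62]).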


√ 1 Again, the scalar factor (det 2et ) 2 that vanishes in the limit is absorbed by d n z t . As √ n −1 n ˜ 2 t → +∞, (3.23) goes to −1 φ(y ) ⊗ d y pointwise on π+ (E φ˜ ) (see for example [3, Theorem 8.31(c)]); this also follows from the t → −∞ limit by making an Sp(V, ω) transformation that fixes J and exchanges L + and L − . It is well known that the Lebesgue set of an L 2 function is the complement of a measure-zero subset (see for example [3, Theorem 3.20] or [12, pp. 12-13]). ˆ L (L ∈ S) are defined up to a set of meaWe remark that since the elements in H sure-zero, the limits in (3.21) are the best possible results for pointwise convergence. The integral in (3.22), being the convolution of φ and the heat kernel, goes to φ(x ) when t → −∞ as tempered distributions on V /L − (see for example [13, Prop. 3.5.1] or ˜ ) when [3, Cor. 8.46]). In the same sense, the integral in (3.23) goes to φ(y t → +∞. Hence the limits in (3.21) hold as tempered distributions on V , with the given trivialization of . Finally, we consider the limit in L 2 -spaces. S is part of the topological boundary of ˆ L (L ∈ S) J as a bounded domain. We define a topology on the disjoint union E of all H ˆ ˆ and the total space of H → J. There is a bijection from E to (J S) × H J0 if we fix ˆ J (J ∈ J) and H ˆ L (L ∈ S) to H ˆ J are Uˆ J J and Bˆ −1 , any J0 ∈ J. The maps from H 0 0 J0 L ˆJ. respectively. The space E thus inherits the product topology on (J S) × H 0
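The heat-kernel limit invoked in the proof of Proposition 3.11 can be illustrated numerically in one dimension. The sketch below assumes n = 1 and λ₁ = 1, so that the integral in (3.22) is a Gaussian average of width e^t (with normalization e^{−t}/√(2π)) multiplied by the phase e^{√−1 (x − x′) y′}. The test function `phi`, the sample point, and the quadrature parameters are hypothetical choices made only for this illustration.

```python
import cmath
import math


def phi(x):
    """A smooth, integrable test function (hypothetical choice)."""
    return 1.0 / (1.0 + x * x)


def smoothed(xp, yp, t, n_steps=20001, half_width=10.0):
    """Trapezoid-rule value of
    e^{-t} * integral of phi(x) exp(-(x - xp)^2 e^{-2t} / 2 + i (x - xp) yp) dx / sqrt(2 pi).
    """
    sigma = math.exp(t)
    lo = xp - half_width * sigma            # the Gaussian is negligible beyond 10 sigma
    dx = 2.0 * half_width * sigma / (n_steps - 1)
    total = 0.0 + 0.0j
    for k in range(n_steps):
        x = lo + k * dx
        weight = 0.5 if k in (0, n_steps - 1) else 1.0
        gauss = math.exp(-0.5 * ((x - xp) / sigma) ** 2)
        phase = cmath.exp(1j * (x - xp) * yp)
        total += weight * phi(x) * gauss * phase
    return total * dx / (sigma * math.sqrt(2.0 * math.pi))


xp, yp = 0.4, 1.5
# Distance from phi(xp) as t decreases; it should shrink roughly like e^{2t}.
errors = [abs(smoothed(xp, yp, t) - phi(xp)) for t in (0.0, -2.0, -4.0, -6.0)]
```

For a continuous φ every point is a Lebesgue point, so the list `errors` decreases toward zero as t → −∞, which is the pointwise statement used for (3.22).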

Corollary 3.12. Let $J \in \mathcal J$ and let $L, L' \in S$ be a transverse pair. Then for any $\hat\psi \in \hat H_J$, in the above topology on $E$, we have
\[ \lim_{J' \to L} \hat U_{J'J}\hat\psi = \hat B^{-1}_{JL}\hat\psi, \qquad \lim_{J' \to L'} \hat U_{J'J}\hat\psi = \hat F_{L'L} \lim_{J' \to L} \hat U_{J'J}\hat\psi. \]

Proof. The limits follow directly from (3.20).

Acknowledgements. We are grateful to Arlan Ramsay for many helpful conversations and suggestions – in particular regarding the limits of the parallel transport operator. We would also like to thank Wicharn Lewkeeratiyutkul for bringing to our attention the paper [4].

References

1. Axelrod, S., Della Pietra, S., Witten, E.: Geometric quantization of Chern-Simons gauge theory. J. Diff. Geom. 33, 787–902 (1991)
2. Bargmann, V.: On a Hilbert space of analytic functions and an associated integral transform. Comm. Pure Appl. Math. 14, 187–214 (1961)
3. Folland, G.B.: Real analysis. Modern techniques and their applications. New York: John Wiley & Sons, 1984
4. Florentino, C., Matias, P., Mourao, J., Nunes, J.P.: Geometric quantization, complex structures and the coherent state transform. J. Funct. Anal. 221, 303–322 (2005)
5. Ginzburg, V.L., Montgomery, R.: Geometric quantization and no-go theorems. In: Grabowski, J., Urbański, P. (eds.) Poisson geometry, Warsaw, 1998, Banach Center Publ. 51, Warsaw: Polish Acad. Sci., 2000, pp. 69–77
6. Hall, B.C.: Geometric quantization and the generalized Segal-Bargmann transform for Lie groups of compact type. Commun. Math. Phys. 226, 233–268 (2002)
7. Lion, G., Vergne, M.: The Weil representation, Maslov index and theta series. Prog. in Math. 6, Boston, MA: Birkhäuser, 1980, Part I
8. Magneron, B.: Spineurs symplectiques purs et indice de Maslov de plan Lagrangiens positifs. J. Funct. Anal. 59, 90–122 (1984)
9. Robinson, P.L., Rawnsley, J.H.: The metaplectic representation, Mpc structures and geometric quantization. Mem. Amer. Math. Soc. Vol. 81, No. 410, Providence, RI: Amer. Math. Soc., 1989


10. Satake, I.: On unitary representations of a certain group extension (in Japanese). Sugaku 21, 241–253 (1969); Fock representations and theta-functions. In: Ahlfors, L.V. et al. (eds.) Advances in the theory of Riemann surfaces. Princeton, NJ: Princeton Univ. Press, 1971, pp. 393–405
11. Siegel, C.L.: Symplectic geometry. Amer. J. Math. 65, 1–86 (1943)
12. Stein, E.M., Weiss, G.: Introduction to Fourier analysis on Euclidean spaces. Princeton, NJ: Princeton Univ. Press, 1971
13. Taylor, M.E.: Partial differential equations I. Basic theory. New York: Springer-Verlag, 1996
14. Woodhouse, N.M.J.: Geometric quantization and the Bogoliubov transformation. Proc. Royal Soc. London A 378, 119–139 (1981)
15. Woodhouse, N.M.J.: Geometric Quantization (2nd ed.). New York: Oxford Univ. Press, 1992

Communicated by A. Connes

Commun. Math. Phys. 266, 595–629 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0052-y

Communications in

Mathematical Physics

The Equations of Magnetohydrodynamics: On the Interaction Between Matter and Radiation in the Evolution of Gaseous Stars

Bernard Ducomet1, Eduard Feireisl2,3

1 Département de Physique Théorique et Appliquée, CEA/DAM Ile de France, BP 12, 91680 Bruyères-le-Châtel, France
2 Department of Global Analysis, Technical University of Muenchen, Boltzmannstr. 3, 85747 Garching b. Muenchen, Germany
3 Mathematical Institute AS CR, Žitná 25, 115 67 Praha 1, Czech Republic. E-mail: [email protected]

Received: 18 February 2005 / Accepted: 28 March 2006 Published online: 6 July 2006 – © Springer-Verlag 2006

Abstract: We prove existence of global-in-time weak solutions to the equations of magnetohydrodynamics, specifically, the Navier-Stokes-Fourier system describing the evolution of a compressible, viscous, and heat conducting fluid coupled with the Maxwell equations governing the behaviour of the magnetic field. The result applies to any finite energy data posed on a bounded spatial domain in $R^3$, supplemented with conservative boundary conditions.

The work was supported by Grant A1019302 of GA AV CR.

1. Physical Background-Basic Equations and Main Assumptions

1.1. Introduction. In a number of situations, stars may be considered as compressible fluids (see [36, 40]), in which the matter behaves either as a perfect or as a completely degenerate gas. Moreover, their dynamics is very often shaped and controlled by intense magnetic fields coupled to self-gravitation and high temperature radiation effects (see, for instance, [8, 9, 15], among others). A striking example of such a coupling between magnetic and thermomechanical degrees of freedom is observed in the so-called solar flares [41] (eruption phenomena in the coronal region of the Sun). During this spectacular event, a violent brightening is produced in the solar atmosphere, where a huge amount of energy ($\sim 10^{25}$ joules) is released in a matter of a few minutes, in association with a large coronal mass ejection. Magnetic reconnection is thought to be the mechanism responsible for this conversion of magnetic energy into heat and fluid motion.

In our treatment, we consider a mathematical model derived from the classical principles of continuum mechanics and electrodynamics, where the field equations, the constitutive relations, as well as other physically grounded hypotheses are chosen on the basis of the following assumptions:


[A1] The material in question is a compressible, viscous, thermally and electrically conducting fluid, occupying a bounded domain $\Omega$ in the physical space $R^3$. The physical system is energetically isolated.

[A2] The motion of the fluid is driven by two dominating body forces, namely, the self-gravitation and the Lorentz force imposed on the fluid by the magnetic field (see Chap. 2 in [8], Chap. 9 in [9], [26]).

[A3] The fluid is a perfect mixture of a finite number of species (gases), among which at least one constituent (the electron gas, for instance) behaves as a Fermi gas in the degenerate area of high densities and/or low temperatures (see Chap. 16 in [15], [34]).

[A4] The motion is an entropy producing (dissipative) process, where the viscous stress, the heat flux, and the induced electric current are linear functions of the affinities: the fluid velocity gradient, the temperature gradient, and the magnetic field, respectively. The transport coefficients in these fluxes depend effectively on the temperature (see [33]).

[A5] The pressure as well as the heat conductivity are augmented through the effect of high temperature radiation, assumed to be at thermal equilibrium with the fluid (see Chap. 2 in [8], Chap. 16 in [15]).

Many of the recent theoretical studies in continuum fluid mechanics addressing the problem of global-in-time solutions are concerned with isothermal or isentropic fluid flows (see [27, 39]). The present paper can be considered as a part of the research programme originated in [18], the aim of which is to develop a rigorous and, at the same time, physically relevant theory of general viscous fluids in the full thermodynamical setting. The central idea behind Hypotheses [A1]–[A5] is to identify a class of constitutive assumptions on the fluid in extreme regimes, therefore providing suitable a priori estimates relevant to the corresponding system of partial differential equations.
When dealing with dissipative, or entropy producing, processes, it is common to impose the second law of thermodynamics as the central principle. The central idea advocated in many recent studies postulates that the state variables change in a way that maximizes the rate of entropy production. From the mathematical viewpoint, the latter could be a non-negative measure, singular or absolutely continuous with respect to the standard Lebesgue measure, depending on the smoothness of the flow. Introducing the total entropy balance as one of the main field equations represents a crucial aspect of the mathematical theory to be developed below.

Another important feature employed in the present study is the regularizing effect of radiation already observed in [13]. Indeed, one of the main theoretical problems to be dealt with is represented by a possible presence of cavities, inducing uncontrollable large amplitude time oscillations of extensive quantities such as the internal energy or entropy. In the present setting, however, these quantities are supplemented with a radiation component yielding the necessary a priori estimates on the time-derivative. Last but not least, the present theory leans on the commutator estimates developed for the case of parameter-dependent transport coefficients in [19].

1.2. Radiation theory. In the following section, a purely phenomenological, or macroscopic, description is adopted, where the radiation is treated as a continuous field, and both the wave (classical) and photonic (quantum) aspects are taken into account. In the quantum picture, in agreement with Assumption [A5], the total pressure $p$ in the fluid is augmented, due to the presence of the photon gas, by a radiation component $p_R$ related to the absolute temperature $\vartheta$ through the Stefan-Boltzmann law
\[ p_R = \frac{a}{3}\vartheta^4, \quad \text{with a constant } a > 0 \tag{1.1} \]


(see, for instance, Chap. 15 in [15]). Furthermore, the specific internal energy of the fluid must be supplemented with a term
\[ e_R = e_R(\varrho, \vartheta) = \frac{a}{\varrho}\vartheta^4, \quad \text{equivalently,} \quad p_R = \frac{1}{3}\varrho e_R, \tag{1.2} \]
where $\varrho$ is the fluid density. Similarly, the heat conductivity of the fluid is enhanced by a radiation component
\[ q_R = -\kappa_R \vartheta^3 \nabla_x \vartheta, \quad \text{with a constant } \kappa_R > 0 \tag{1.3} \]
(see [5, 29], among others).

A different kind of interaction, described by the theory of magnetohydrodynamics, produces the so-called “collective effects” resulting from the macroscopic interaction between the motion of a conducting fluid and the electromagnetic field governed by Faraday’s law,
\[ \partial_t B + \mathrm{curl}_x E = 0, \quad \mathrm{div}_x B = 0. \tag{1.4} \]
Here, the magnetic induction vector $B$ is related to the electric field $E$ and the macroscopic fluid velocity $u$ via Ohm’s law
\[ J = \sigma (E + u \times B), \tag{1.5} \]
where $J$ is the electric current, and $\sigma$ the electrical conductivity of the fluid. Furthermore, neglecting the displacement current in the Ampère-Maxwell equation governing the electric field, we obtain Ampère’s law,
\[ \mu J = \mathrm{curl}_x B, \quad \mu > 0, \tag{1.6} \]
where the constant $\mu$ stands for the permeability of free space. Accordingly, Eq. (1.4) can be written in the form
\[ \partial_t B + \mathrm{curl}_x (B \times u) + \mathrm{curl}_x (\lambda\, \mathrm{curl}_x B) = 0, \tag{1.7} \]
where $\lambda = (\mu\sigma)^{-1}$ is termed the magnetic diffusivity of the fluid (cf. Chap. 9 in [9], Chap. 4 in [26], among others). Equation (1.7) must be supplemented with suitable boundary conditions in order to obtain, at least formally, a mathematically well-posed problem. Here, in conformity with Hypothesis [A1], we suppose that $\partial\Omega$ is a perfect conductor giving rise to the boundary conditions
\[ B \cdot n|_{\partial\Omega} = E \times n|_{\partial\Omega} = 0, \tag{1.8} \]
where $n$ stands for the (outer) normal vector.

It is worth noting that both mechanisms, even though being of the same origin, act simultaneously but rather independently, as the radiation pressure is attributed to photons of very high energy while the collective effects of the electromagnetic field may be important under conditions of high density and/or extremely low temperature (energy) (see Chap. 2 in [8]).
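The radiation laws (1.1), (1.2) and the definition of the magnetic diffusivity in (1.7) can be evaluated directly. The following is only a minimal sketch: the numerical value of the radiation constant a and the sample density and temperature are assumptions made for illustration (the text requires only a > 0), and the function names are hypothetical.

```python
# Radiation constant a = 4 * sigma_SB / c in SI units (J m^-3 K^-4).
# Assumed physical value; the paper itself only requires a > 0.
RADIATION_CONSTANT = 7.5657e-16


def radiation_pressure(theta, a=RADIATION_CONSTANT):
    """Stefan-Boltzmann law (1.1): p_R = (a / 3) * theta**4."""
    return a / 3.0 * theta**4


def radiation_energy(rho, theta, a=RADIATION_CONSTANT):
    """Specific radiation energy (1.2): e_R = (a / rho) * theta**4."""
    return a / rho * theta**4


def magnetic_diffusivity(mu, sigma):
    """Magnetic diffusivity lambda = 1 / (mu * sigma) appearing in (1.7)."""
    return 1.0 / (mu * sigma)


# Hypothetical sample state: density in kg/m^3, temperature in K.
rho, theta = 1.5e5, 1.5e7
p_R = radiation_pressure(theta)
e_R = radiation_energy(rho, theta)

# The two numbers printed below agree, illustrating p_R = (1/3) * rho * e_R.
print(p_R, rho * e_R / 3.0)
```

Since p_R grows like the fourth power of the temperature, the radiation component dominates at high temperatures, which is the regime in which it supplies the additional estimates mentioned in Sect. 1.1.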


1.3. Motion. In accordance with the basic principles of continuum mechanics, the fluid motion is described by the Navier-Stokes system of equations, a mathematical formulation of the mass conservation
\[ \partial_t \varrho + \mathrm{div}_x(\varrho u) = 0, \tag{1.9} \]
and the momentum balance
\[ \partial_t(\varrho u) + \mathrm{div}_x(\varrho u \otimes u) = \mathrm{div}_x T + \varrho \nabla_x \Psi + J \times B, \tag{1.10} \]
with the Cauchy stress tensor $T$. Here, in agreement with Assumption [A2], the body forces are represented by the gravitational force $\varrho \nabla_x \Psi$, where the potential $\Psi$ obeys Poisson’s equation,
\[ -\Delta \Psi = G\varrho, \quad \text{with a constant } G > 0, \tag{1.11} \]
and the Lorentz force imposed by the magnetic field,
\[ J \times B = \mathrm{div}_x \Big( \frac{1}{\mu} B \otimes B - \frac{1}{2\mu} |B|^2 I \Big). \tag{1.12} \]
Furthermore, as stated in Assumption [A4], the Cauchy stress tensor $T$ is given by Stokes’ law $T = S - pI$, where $p$ is the pressure, and the symbol $S$ stands for the viscous stress tensor,
\[ S = \nu \Big( \nabla_x u + \nabla_x u^t - \frac{2}{3} \mathrm{div}_x u\, I \Big) + \eta\, \mathrm{div}_x u\, I, \tag{1.13} \]
with the shear viscosity coefficient $\nu$, and the bulk viscosity coefficient $\eta$. Finally, in accordance with [A1], we impose the no-slip boundary conditions for the velocity field:
\[ u|_{\partial\Omega} = 0. \tag{1.14} \]
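The structure of the viscous stress (1.13) can be made concrete for a single velocity gradient. The sketch below is an illustration only: the sample matrices and the values of ν and η are hypothetical, and the helper names are not taken from the paper. It evaluates $S$ and the dissipation density $S : \nabla_x u$, which is non-negative for non-negative viscosities and vanishes for a rigid rotation.

```python
def viscous_stress(grad_u, nu, eta):
    """Viscous stress tensor (1.13); grad_u[i][j] holds d u_i / d x_j."""
    div_u = sum(grad_u[i][i] for i in range(3))
    stress = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            # Traceless symmetric part of the velocity gradient.
            shear = grad_u[i][j] + grad_u[j][i] - (2.0 / 3.0) * div_u * delta
            stress[i][j] = nu * shear + eta * div_u * delta
    return stress


def dissipation(grad_u, nu, eta):
    """Dissipation density S : grad_u (full contraction of the two tensors)."""
    stress = viscous_stress(grad_u, nu, eta)
    return sum(stress[i][j] * grad_u[i][j] for i in range(3) for j in range(3))


# Hypothetical sample data, not taken from the paper.
sample_gradient = [[1.0, 2.0, 0.0], [0.0, -1.0, 0.5], [3.0, 0.0, 2.0]]
rigid_rotation = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

print(dissipation(sample_gradient, nu=0.7, eta=0.3))  # non-negative
print(dissipation(rigid_rotation, nu=0.7, eta=0.3))   # a rigid rotation does not dissipate
```

The non-negativity reflects the identity $S : \nabla_x u = \frac{\nu}{2} |\nabla_x u + \nabla_x u^t - \frac{2}{3} \mathrm{div}_x u\, I|^2 + \eta |\mathrm{div}_x u|^2$, which holds because the first bracket in (1.13) is symmetric and traceless.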

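1.4. Thermodynamics, entropy production. By virtue of the first law of thermodynamics, the energy of the system must be a conserved quantity, more specifically,
\[ \partial_t \Big( \frac{1}{2} \varrho |u|^2 + \varrho e + \frac{1}{2\mu} |B|^2 \Big) + \mathrm{div}_x \Big( \big( \frac{1}{2} \varrho |u|^2 + \varrho e + p \big) u + E \times B - S u \Big) + \mathrm{div}_x q = \varrho \nabla_x \Psi \cdot u, \tag{1.15} \]
where $e$ is the specific internal energy, and $q$ denotes the heat flux. In particular, assuming
\[ q \cdot n|_{\partial\Omega} = 0 \tag{1.16} \]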


(cf. Hypothesis [A1]), we deduce that the total energy of the system is a constant of motion:
\[ \frac{d}{dt} \int_\Omega \Big( \frac{1}{2} \varrho |u|^2 + \varrho e + \frac{1}{2\mu} |B|^2 - \frac{1}{2} \varrho \Psi \Big) dx = 0. \tag{1.17} \]
Note that the gravitational potential $\Psi$ is determined by Eq. (1.11) considered on the whole space $R^3$, the density $\varrho$ being extended to be zero outside $\Omega$. Consequently,
\[ \int_\Omega \varrho \nabla_x \Psi \cdot u \, dx = \frac{d}{dt}\, \frac{1}{2} \int_\Omega \varrho \Psi \, dx. \]
However, in the variational formulation used in the present paper, it is more convenient to replace (1.15) by the entropy balance
\[ \partial_t (\varrho s) + \mathrm{div}_x (\varrho s u) + \mathrm{div}_x \Big( \frac{q}{\vartheta} \Big) = r \tag{1.18} \]
with the specific entropy $s$, and the entropy production rate $r$, for which the second law of thermodynamics requires $r \ge 0$. The specific entropy $s$, the specific internal energy $e$, and the pressure $p$ are interrelated through the thermodynamics equation
\[ \vartheta\, Ds = De + p\, D\Big( \frac{1}{\varrho} \Big), \tag{1.19} \]
in particular, the quantity $(1/\vartheta)(De + p\, D(1/\varrho))$ must be a perfect gradient, which amounts to certain compatibility conditions imposed on $e$ and $p$ called Maxwell’s relationship.

If the motion is smooth, it is easy to see that the sum of the kinetic and magnetic energy satisfies
\[ \partial_t \Big( \frac{1}{2} \varrho |u|^2 + \frac{1}{2\mu} |B|^2 \Big) + \mathrm{div}_x \Big( \big( \frac{1}{2} \varrho |u|^2 + p \big) u + E \times B - S u \Big) = \varrho \nabla_x \Psi \cdot u + p\, \mathrm{div}_x u - S : \nabla_x u - \frac{\lambda}{\mu} |\mathrm{curl}_x B|^2, \tag{1.20} \]
where the last two terms are responsible for the irreversible transfer of the mechanical and magnetic energy into heat. Indeed, with (1.19) at hand, it is a routine matter to check that
\[ r = \frac{1}{\vartheta} \Big( S : \nabla_x u + \frac{\lambda}{\mu} |\mathrm{curl}_x B|^2 - \frac{q \cdot \nabla_x \vartheta}{\vartheta} \Big). \tag{1.21} \]
If the motion is not (known to be) smooth, however, the validity of (1.20) is no longer guaranteed, in particular, the dissipation rate of the mechanical energy may exceed the value $S : \nabla_x u$ (see, for instance, [12, 16] to illuminate this interesting and still largely open question). Accordingly, we replace Eq. (1.18) by the inequality
\[ \partial_t (\varrho s) + \mathrm{div}_x (\varrho s u) + \mathrm{div}_x \Big( \frac{q}{\vartheta} \Big) \ge \frac{1}{\vartheta} \Big( S : \nabla_x u + \frac{\lambda}{\mu} |\mathrm{curl}_x B|^2 - \frac{q \cdot \nabla_x \vartheta}{\vartheta} \Big) \tag{1.22} \]


to be satisfied together with the total energy balance (1.17). Note that such a formulation is (i) consistent with the second law of thermodynamics, and (ii) equivalent to (1.15) provided the motion is smooth. Indeed, using (1.19) one can show that the internal energy production rate equals ϑr + pdivx u, where r is the entropy production evaluated through (1.21). On the other hand, if the motion is smooth, the mechanical and magnetic energy dissipation is given by (1.20), which is compatible with (1.17) only if r satisfies (1.21).
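The sign of the entropy production rate in (1.21) can be illustrated pointwise. The sketch below assumes that the heat flux obeys Fourier's law $q = -\kappa \nabla_x \vartheta$ with a single effective conductivity $\kappa = \kappa_F + \kappa_R \vartheta^3$ (cf. (1.3)); the function names and the sample numbers are hypothetical and serve only as an illustration.

```python
def total_conductivity(kappa_f, kappa_r, theta):
    """Effective conductivity kappa_F + kappa_R * theta**3, so that q = -kappa * grad(theta)."""
    return kappa_f + kappa_r * theta**3


def entropy_production(theta, viscous_dissipation, lam, mu, curl_b_sq, kappa, grad_theta_sq):
    """Entropy production rate r from (1.21) under Fourier's law q = -kappa * grad(theta).

    viscous_dissipation stands for S : grad(u), curl_b_sq for |curl B|^2 and
    grad_theta_sq for |grad(theta)|^2, all evaluated at one point.
    """
    q_dot_grad_theta = -kappa * grad_theta_sq
    return (viscous_dissipation + lam / mu * curl_b_sq - q_dot_grad_theta / theta) / theta


# Hypothetical pointwise values, not taken from the paper.
r = entropy_production(theta=2.0, viscous_dissipation=1.0, lam=0.5, mu=1.0,
                       curl_b_sq=4.0, kappa=3.0, grad_theta_sq=2.0)
print(r)
```

With a positive temperature, non-negative viscous dissipation and positive λ, μ, κ, each of the three contributions is non-negative, so r ≥ 0 as required by the second law of thermodynamics.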

1.5. State equation. In conformity with [A5], the pressure $p$ takes the form
\[ p(\varrho, \vartheta) = p_F(\varrho, \vartheta) + p_R(\vartheta), \tag{1.23} \]
where the radiation component is given by (1.1). Similarly, the specific internal energy reads
\[ e(\varrho, \vartheta) = e_F(\varrho, \vartheta) + e_R(\varrho, \vartheta), \tag{1.24} \]
with $e_R$ determined by (1.2). Furthermore, in accordance with Assumption [A3], we suppose
\[ p_F(\varrho, \vartheta) = \frac{2}{3} \varrho\, e_F(\varrho, \vartheta), \tag{1.25} \]
a universal relation deduced by the methods of statistical physics and valid, in the continuum limit, for any system of non-interacting (non-relativistic) particles (see Chap. 10 in [8], Chap. 15 in [15], or [34]). The following hypothesis expresses the physical principle of convexity of the free energy:
\[ \frac{\partial p_F(\varrho, \vartheta)}{\partial \varrho} > 0, \quad \frac{\partial e_F(\varrho, \vartheta)}{\partial \vartheta} > 0 \quad \text{for all } \varrho, \vartheta > 0 \tag{1.26} \]
(cf. [23]). Finally, we suppose
\[ \lim_{\varrho \to 0+} p_F(\varrho, \vartheta) = 0, \quad \lim_{\varrho \to 0+} \frac{\partial p_F(\varrho, \vartheta)}{\partial \varrho} > 0, \quad \text{for any fixed } \vartheta > 0, \tag{1.27} \]
and
\[ \liminf_{\vartheta \to 0+} e_F(\varrho, \vartheta) > 0, \quad \limsup_{\vartheta \to 0+} \frac{\partial e_F(\varrho, \vartheta)}{\partial \vartheta} < \infty \quad \text{for any } \varrho > 0. \tag{1.28} \]
Note that (1.27) is in full agreement with the classical Boyle’s law that applies in the nondegenerate area of high temperatures and low densities. On the other hand, the physical meaning of $\partial e_F / \partial \vartheta = c_v$ is the specific heat at constant volume, and condition (1.28) characterizes a Fermi gas in the (degenerate) regime of large densities and/or extremely low temperatures (cf. Chap. 9 in [30], and Chap. 15 in [15]). Note that these conditions also agree with the asymptotic models derived in [3]. The specific entropy $s$ is determined, up to an additive constant, by the thermodynamics relation (1.19).


As shown below, the only admissible form of $p_F$ reads $p_F(\varrho, \vartheta) = \vartheta^{5/2} P_F(\varrho / \vartheta^{3/2})$, where the function $P_F$ behaves as $P_F(z) \approx z^{5/3}$ for large values of $z$. Thus, at least asymptotically, the pressure $p_F$ coincides with a “γ-law” $p_F \approx \varrho^\gamma$, $\gamma = 5/3$, reminiscent of the isentropic state equation for a perfect monoatomic gas. From the mathematical viewpoint, the value of $\gamma$ plays the role of a “critical” parameter (see [27] or [18] for relevant discussion).

1.6. Fluxes and transport coefficients. In accordance with the second law of thermodynamics, the viscosity coefficients $\nu$ and $\eta$ are non-negative quantities. In addition, we shall suppose
\[ \nu = \nu(\vartheta, B) > 0, \quad \eta = \eta(\vartheta, B) > 0, \tag{1.29} \]
where both the shear viscosity $\nu$ and the bulk viscosity $\eta$ satisfy some technical but physically relevant coercivity conditions specified below. Similarly, the heat flux $q$ obeys Fourier’s law:
\[ q = q_R + q_F, \tag{1.30} \]
with the “radiation” heat flux $q_R$ given by (1.3), and
\[ q_F = -\kappa_F(\varrho, \vartheta, B) \nabla_x \vartheta, \quad \kappa_F > 0. \tag{1.31} \]
Finally, the magnetic diffusivity $\lambda$ satisfies
\[ \lambda = \lambda(\varrho, \vartheta, B) > 0. \tag{1.32} \]
For the sake of simplicity, but not without certain physical background (see, for instance, Chap. 7 in [24]), we shall assume that all the transport coefficients $\nu$, $\eta$, $\kappa_F$, and $\lambda$ admit a common temperature scaling, namely
\[ c_1 (1 + \vartheta^\alpha) \le \kappa_F(\varrho, \vartheta, B),\ \nu(\vartheta, B),\ \eta(\vartheta, B),\ \lambda(\varrho, \vartheta, B) \le c_2 (1 + \vartheta^\alpha), \quad c_1 > 0, \tag{1.33} \]
with $\alpha \ge 1$ to be specified below. Note that, in order to comply with the principle of material frame indifference, the transport coefficients depend in fact only on $|B|$.

The reader will have noticed that we have deliberately avoided the case when the viscosity coefficients depend effectively on the density. Although physically relevant, such a situation presents, up to now, insurmountable mathematical difficulties. Note that, in the absence of the magnetic field, transport coefficients depending only on the temperature are physically relevant at least in the case of gases. In particular, for the so-called hard-sphere model, one has $\nu(\vartheta) \approx \sqrt{\vartheta}$ (see, for instance, [2]). The effect of the magnetic field is much more complicated than in the “toy” situation considered in this paper, namely the viscous stress becomes anisotropic, depending effectively on the direction of $B$ (see Sect. 19.44 in [6]).

2. Variational Formulation

The approach pursued in this paper is based on the concept of variational solutions, for which the underlying field equations are expressed in terms of integral identities rather than the partial differential equations discussed in Sect. 1.


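2.1. Mass conservation. The principle of mass conservation is expressed through a family of integral identities
\[ \int_0^T \int_\Omega \big( \varrho\, \partial_t \varphi + \varrho u \cdot \nabla_x \varphi \big)\, dx\, dt + \int_\Omega \varrho_0\, \varphi(0, \cdot)\, dx = 0 \tag{2.1} \]
to be satisfied for any test function $\varphi \in D([0, T) \times R^3)$. Here, the function $\varrho_0$ characterizes the prescribed initial distribution of the density. Note that our choice of the test functions already reflects the boundary conditions (1.14). Moreover, we tacitly assume both $\varrho$ and the flux $\varrho u$ to be locally integrable on $[0, T) \times \Omega$ so that (2.1) makes sense. If, in addition, $\varrho$ belongs to the space $L^\infty(0, T; L^\gamma(\Omega))$ for a certain $\gamma > 1$, then $\varrho \in C([0, T]; L^\gamma_{weak}(\Omega))$, in particular, $\varrho(t) \to \varrho_0$ weakly in $L^\gamma(\Omega)$ as $t \to 0$, and the total mass
\[ M_0 = \int_\Omega \varrho(t)\, dx = \int_\Omega \varrho_0\, dx \tag{2.2} \]
is a constant of motion. Finally, if $\gamma \ge 2$, one can use the regularization technique developed by DiPerna and P.-L. Lions [11] in order to show that $\varrho$, $u$ satisfy the renormalized equation
\[ \int_0^T \int_\Omega \Big( b(\varrho)\, \partial_t \varphi + b(\varrho) u \cdot \nabla_x \varphi + \big( b(\varrho) - b'(\varrho) \varrho \big) \mathrm{div}_x u\, \varphi \Big)\, dx\, dt + \int_\Omega b(\varrho_0)\, \varphi(0, \cdot)\, dx = 0 \tag{2.3} \]
for any $\varphi \in D([0, T) \times R^3)$, and for any continuously differentiable function $b$ whose derivative vanishes for large arguments. At the same time, it can be shown that any renormalized solution $\varrho$ belongs to the class $C([0, T]; L^1(\Omega))$ (see [11]).

2.2. Momentum balance. The variational formulation of (1.10) reads
\[ \int_0^T \int_\Omega \Big( (\varrho u) \cdot \partial_t \varphi + (\varrho u \otimes u) : \nabla_x \varphi + p\, \mathrm{div}_x \varphi \Big)\, dx\, dt = \int_0^T \int_\Omega \Big( S : \nabla_x \varphi - \varrho \nabla_x \Psi \cdot \varphi - (J \times B) \cdot \varphi \Big)\, dx\, dt - \int_\Omega (\varrho u)_0 \cdot \varphi(0, \cdot)\, dx \tag{2.4} \]
to be satisfied for any vector field $\varphi \in D([0, T) \times \Omega; R^3)$. Similarly to Sect. 2.1, we suppose the quantities $\varrho u$, $\varrho u \otimes u$, $p$, $S$, $\varrho \nabla_x \Psi$, and $J \times B$ to be at least locally integrable on the set $[0, T) \times \Omega$. Here, the gravitational potential $\Psi$ is given by Eq. (1.11) considered on the whole physical space $R^3$, where $\varrho$ was extended to be zero outside $\Omega$. Equivalently,
\[ \Psi = G (-\Delta)^{-1} [1_\Omega \varrho], \tag{2.5} \]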

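with
\[ (-\Delta)^{-1}[v](x) = \mathcal{F}^{-1}_{\xi \to x} \Big[ \frac{1}{|\xi|^2}\, \mathcal{F}_{x \to \xi}[v] \Big], \]
where $\mathcal{F}$ stands for the Fourier transform. The initial value of the momentum $(\varrho u)_0$ should be prescribed in such a way that
\[ (\varrho u)_0 = 0 \ \text{a.a. on the set } \{ \varrho_0 = 0 \}. \tag{2.6} \]
Finally, the satisfaction of the no-slip boundary conditions (1.14) can be rephrased as
\[ u \in L^2(0, T; W_0^{1,2}(\Omega; R^3)). \tag{2.7} \]
Here, of course, our choice of the function space has been motivated by the anticipated a priori bounds resulting from the entropy balance specified below.

2.3. Entropy production. In accordance with (1.22), the weak form of the entropy production reads
\[ \int_0^T \int_\Omega \Big( \varrho s\, \partial_t \varphi + \varrho s u \cdot \nabla_x \varphi + \frac{q}{\vartheta} \cdot \nabla_x \varphi \Big)\, dx\, dt \le \int_0^T \int_\Omega \frac{1}{\vartheta} \Big( \frac{q \cdot \nabla_x \vartheta}{\vartheta} - S : \nabla_x u - \frac{\lambda}{\mu} |\mathrm{curl}_x B|^2 \Big) \varphi\, dx\, dt - \int_\Omega (\varrho s)_0\, \varphi(0, \cdot)\, dx, \tag{2.8} \]
for any test function $\varphi \in D([0, T) \times R^3)$, $\varphi \ge 0$. Note that our choice of the test function space agrees with the boundary conditions (1.14), (1.16). The presence of $\vartheta$ in the denominator indicates that this quantity must be positive on a set of full measure for (2.8) to make sense. In particular, terms like $\nabla_x \vartheta / \vartheta$ will be interpreted as $\nabla_x \log(\vartheta)$ in the spirit of the following result (Lemma 5.3 in [13]).

Lemma 2.1. Let $\Omega \subset R^N$ be a bounded Lipschitz domain. Furthermore, let $\varrho$ be a non-negative function such that
\[ 0 < M \le \int_\Omega \varrho\, dx, \quad \int_\Omega \varrho^\gamma\, dx \le K, \quad \text{with } \gamma > \frac{2N}{N + 2}, \]
and let $\vartheta \in W^{1,2}(\Omega)$. Then the following statements are equivalent:
• The function $\vartheta$ is strictly positive a.a. on $\Omega$, $\varrho |\log(\vartheta)| \in L^1(\Omega)$, and $\nabla_x \vartheta / \vartheta \in L^2(\Omega; R^N)$.
• The function $\log(\vartheta)$ belongs to the Sobolev space $W^{1,2}(\Omega)$.
Moreover, if this is the case, then
\[ \nabla_x \log(\vartheta) = \frac{\nabla_x \vartheta}{\vartheta} \ \text{a.a. on } \Omega, \]
and there is a constant $c = c(M, K)$ such that
\[ \| \log(\vartheta) \|_{L^2(\Omega)} \le c(M, K) \Big( \| \varrho \log(\vartheta) \|_{L^1(\Omega)} + \| \nabla_x \vartheta / \vartheta \|_{L^2(\Omega; R^N)} \Big). \]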


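2.4. Maxwell’s equations. Equation (1.7) is replaced by
\[ \int_0^T \int_\Omega \Big( B \cdot \partial_t \varphi - (B \times u + \lambda\, \mathrm{curl}_x B) \cdot \mathrm{curl}_x \varphi \Big)\, dx\, dt + \int_\Omega B_0 \cdot \varphi(0, \cdot)\, dx = 0, \tag{2.9} \]
to be satisfied for any vector field $\varphi \in D([0, T) \times R^3)$. Here, in accordance with the boundary conditions (1.8), (1.14), one has to take
\[ B_0 \in L^2(\Omega), \quad \mathrm{div}_x B_0 = 0 \ \text{in } D'(\Omega), \quad B_0 \cdot n|_{\partial\Omega} = 0. \tag{2.10} \]
By virtue of Theorem 1.4 in [38], $B_0$ belongs to the closure of all solenoidal functions from $D(\Omega)$ with respect to the $L^2$-norm. Anticipating, in view of (1.17), (1.22),
\[ B \in L^\infty(0, T; L^2(\Omega; R^3)), \quad \mathrm{curl}_x B \in L^2(0, T; L^2(\Omega; R^3)), \]
we can deduce from (2.9) that
\[ \mathrm{div}_x B(t) = 0 \ \text{in } D'(\Omega), \quad B(t) \cdot n|_{\partial\Omega} = 0 \quad \text{for a.a. } t \in (0, T), \]
in full agreement with (1.4), (1.8). In particular, using Theorem 6.1, Chap. VII in [14], we conclude
\[ B \in L^2(0, T; W^{1,2}(\Omega; R^3)), \quad \mathrm{div}_x B(t) = 0, \quad B \cdot n|_{\partial\Omega} = 0 \quad \text{for a.a. } t \in (0, T). \tag{2.11} \]

2.5. Total energy conservation. In accordance with (1.17), we shall assume the total energy to be a constant of motion, specifically,
\[ E(t) = \int_\Omega \Big( \frac{1}{2} \varrho(t) |u(t)|^2 + (\varrho e)(t) + \frac{1}{2\mu} |B(t)|^2 - \frac{1}{2} \varrho(t) \Psi(t) \Big)\, dx = E_0 \quad \text{for a.a. } t \in (0, T), \tag{2.12} \]
where
\[ E_0 = \int_\Omega \Big( \frac{1}{2 \varrho_0} |(\varrho u)_0|^2 + (\varrho e)_0 + \frac{1}{2\mu} |B_0|^2 - \frac{1}{2} \varrho_0 \Psi_0 \Big)\, dx. \tag{2.13} \]
Exactly as in (2.5), we have set $\Psi_0 = G (-\Delta)^{-1} [1_\Omega \varrho_0]$, and, in addition to (2.6), we require
\[ \frac{1}{\sqrt{\varrho_0}} (\varrho u)_0 \in L^2(\Omega; R^3). \tag{2.14} \]
Finally, it is clear that the initial values of the entropy $(\varrho s)_0$ and the internal energy $(\varrho e)_0$ should be chosen consistently with the constitutive relations determined through (1.19). A rather obvious possibility consists in fixing the initial temperature $\vartheta_0$ and setting
\[ (\varrho s)_0 = \varrho_0\, s(\varrho_0, \vartheta_0), \quad (\varrho e)_0 = \varrho_0\, e(\varrho_0, \vartheta_0). \tag{2.15} \]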


3. Main Results

3.1. The main existence theorem. Having introduced the concept of variational solutions, we are in a position to state the main result of this paper.

Theorem 3.1. Let $\Omega \subset R^3$ be a bounded domain with boundary of class $C^{2+\delta}$, $\delta > 0$. Assume that the thermodynamic functions $p = p(\varrho, \vartheta)$, $e = e(\varrho, \vartheta)$, $s = s(\varrho, \vartheta)$ are interrelated through (1.19), where $p$, $e$ can be decomposed as in (1.23), (1.24), with the components $p_F$, $e_F$ satisfying (1.25). Moreover, let $p_F(\varrho, \vartheta)$, $e_F(\varrho, \vartheta)$ be continuously differentiable functions of positive arguments $\varrho$, $\vartheta$ satisfying (1.26)–(1.28). Furthermore, suppose that the transport coefficients $\nu = \nu(\vartheta, B)$, $\eta = \eta(\vartheta, B)$, $\kappa_F = \kappa_F(\varrho, \vartheta, B)$, and $\lambda = \lambda(\varrho, \vartheta, B)$ are continuously differentiable functions of their arguments obeying (1.29)–(1.33), with
\[ 1 \le \alpha < \frac{65}{27}. \tag{3.1} \]
Finally, let the initial data $\varrho_0$, $(\varrho u)_0$, $\vartheta_0$, $B_0$ be given so that
\[ \varrho_0 \in L^{5/3}(\Omega), \quad (\varrho u)_0 \in L^1(\Omega; R^3), \quad \vartheta_0 \in L^\infty(R^3), \quad B_0 \in L^2(\Omega; R^3), \tag{3.2} \]
\[ \varrho_0 \ge 0, \quad \vartheta_0 > 0, \tag{3.3} \]
\[ (\varrho s)_0 = \varrho_0\, s(\varrho_0, \vartheta_0), \quad \frac{1}{\varrho_0} |(\varrho u)_0|^2,\ (\varrho e)_0 = \varrho_0\, e(\varrho_0, \vartheta_0) \in L^1(\Omega), \tag{3.4} \]
and
\[ \mathrm{div}_x B_0 = 0 \ \text{in } D'(\Omega), \quad B_0 \cdot n|_{\partial\Omega} = 0. \tag{3.5} \]
Then problem (2.1–2.10) possesses at least one variational solution $\varrho$, $u$, $\vartheta$, $B$ in the sense of Sect. 2 on an arbitrary time interval $(0, T)$.

Remarks.
(i) Note that, given the level of regularity of the variational solutions, the boundary condition (1.14) as well as the first one in (1.8) are satisfied in the sense of traces (cf. (2.7), (2.11)), while the relation $E \times n|_{\partial\Omega} = 0$, expressed in terms of $J$ via (1.5), holds in a weak sense through the choice of test functions in the integral identity (2.9) (cf. [14, Chap. VII, Part 4]).
(ii) Hypothesis (3.1) is the only technical restriction required by the mathematical theory. Note, however, that such a kind of functional dependence on $\vartheta$ was rigorously justified in [10] by means of the asymptotic analysis of certain kinetic-fluid models. The idea to impose the same temperature scaling on all transport coefficients was inspired by Chap. 7 in [24].


3.2. Principal difficulties. There seem to be only a few rigorous results available concerning the existence of global-in-time solutions for the full Navier-Stokes-Fourier system with arbitrarily large data (see [18, 25]). The principal difficulties of the present problem may be characterized as follows:
• The transport coefficients are effective functions of $\vartheta$, $B$ that admit the uniform “temperature scaling” expressed through (1.33), with the common exponent $\alpha$ satisfying hypothesis (3.1). In particular, only very poor a priori estimates based on boundedness of the entropy production are available, at least in comparison with the previous theory developed for the Navier-Stokes-Fourier system in [20]. Moreover, there are no growth restrictions imposed on the derivatives of the transport coefficients.
• The method introduced in [18] that is based on boundedness of the so-called oscillations defect measure is not applicable in a direct manner due to the lack of suitable a priori estimates.
• The estimates based on the entropy production balance become rather delicate as $s = s(\varrho, \vartheta)$ may be singular for vanishing arguments.

3.3. Methods. As already mentioned above, the main stumbling block of the mathematical theory to be developed in the present paper is the lack of a priori estimates. Accordingly, the methods of weak convergence based on the theory of the parametrized (Young) measures represent the main tool (see the monograph [32], among others). If not stated otherwise, “weak” in the present context means “in the sense of integral averages”, that is, in the sense of the weak topology on the Lebesgue space $L^1$. Accordingly, we shall denote by $\overline{b(z)}$ a weak limit of any sequence of composed functions $\{ b(z_n) \}_{n \ge 1}$. More precisely,
\[ \int_{R^M} \overline{b(y, z)}\, \varphi(y)\, dy = \int_{R^M} \varphi(y) \int_{R^K} b(y, z)\, d\Lambda_y(z)\, dy, \]
where $\Lambda_y(z)$ is a parametrized measure associated to a sequence $\{ z_n \}_{n \ge 1}$ of vector-valued functions, $z_n : R^M \to R^K$ (see Chap. 1 in [32]).

Assume there is a sequence of approximate solutions resulting from a suitable regularization process. Our starting point will be to establish the relation
\[ \overline{ \Big( p(\varrho, \vartheta) - \big( \tfrac{4}{3} \nu + \eta \big) \mathrm{div}_x u \Big) b(\varrho) } = \overline{p(\varrho, \vartheta)}\ \overline{b(\varrho)} - \Big( \tfrac{4}{3} \nu(\vartheta, |B|) + \eta(\vartheta, |B|) \Big) \mathrm{div}_x u\ \overline{b(\varrho)} \tag{3.6} \]
for any bounded function $b$. The quantity $p - (\tfrac{4}{3} \nu + \eta) \mathrm{div}_x u$ is usually termed the effective viscous pressure, and relation (3.6) was first proved by P.-L. Lions [27] for the barotropic Navier-Stokes system, where $p = p(\varrho)$, and the viscosity coefficients $\nu$ and $\eta$ are constant. The same result for general temperature dependent viscosity coefficients was obtained in [19] with the help of certain commutator estimates in the spirit of [7]. In the present setting, the approach developed in [19] has to be further modified in order to accommodate the dependence of $\nu$ and $\eta$ on the magnetic field $B$.

Similarly to [27, 35], the propagation of density oscillations is suitably described by the renormalized continuity equation (see [11])
\[ \partial_t b(\varrho) + \mathrm{div}_x (b(\varrho) u) + \big( b'(\varrho) \varrho - b(\varrho) \big) \mathrm{div}_x u = 0, \tag{3.7} \]


and its “weak” counterpart
\[ \partial_t \overline{b(\varrho)} + \mathrm{div}_x \big( \overline{b(\varrho)}\, u \big) + \overline{ \big( b'(\varrho) \varrho - b(\varrho) \big) \mathrm{div}_x u } = 0. \tag{3.8} \]
In order to establish (3.7), however, one has to show first boundedness of the oscillations defect measure
\[ \mathrm{osc}_{\gamma + 1} [\varrho_n \to \varrho](Q) = \sup_{k \ge 1} \limsup_{n \to \infty} \int_Q |T_k(\varrho_n) - T_k(\varrho)|^{\gamma + 1}\, dx\, dt < \infty, \tag{3.9} \]
with $T_k(\varrho) = \min\{ \varrho, k \}$, for any bounded $Q \subset R^4$ and a certain $\gamma > 1$ (see Chap. 6 in [18]). However, because of rather poor estimates resulting from (1.33), (2.8), relation (3.9) has to be replaced by a weaker “weighted” estimate
\[ \sup_{k \ge 1} \limsup_{n \to \infty} \int_Q (1 + \vartheta_n)^{-\beta} |T_k(\varrho_n) - T_k(\varrho)|^{\gamma + 1}\, dx\, dt < \infty, \tag{3.10} \]
for suitable $\beta > 0$, $\gamma > 1$; whence the theory developed in [18] must be modified.

Finally, similarly to [20], one has to recover strong convergence of the sequence $\{ \vartheta_n \}_{n \ge 1}$ of (approximate) temperatures knowing that the spatial gradients of $\vartheta_n$ are uniformly square integrable, and, in addition,
\[ \overline{\varrho s(\varrho, \vartheta) \vartheta} = \overline{\varrho s(\varrho, \vartheta)}\, \vartheta, \tag{3.11} \]
where (3.11) can be deduced from the entropy inequality (2.8) by means of a variant of the celebrated Aubin-Lions lemma. To this end, we use a result which is essentially due to Ball [1] (see also Theorem 6.2 in [32]), namely the possibility to characterize the weak limits of compositions with Caratheodory functions in terms of the associated parametrized (Young) measure.

3.4. Structure of the paper. The arrangement of the paper is as follows. After a preliminary section devoted to the basic structural properties of the thermodynamic functions, we introduce a three level approximation scheme adapted from [18]. After a short discussion, the proof of Theorem 3.1 is then reduced to a weak stability problem to be dealt with in the rest of the paper (see Sect. 5). Section 6 is devoted to uniform bounds on the sequence of approximate solutions. Here, the most delicate part is the proof of positivity of the absolute temperature presented in Part 6.3. The “easy” part of the limit passage is carried out in Sect. 7. With the help of the uniform bounds established in Sect. 6, one can handle the convective terms in the field equations by means of a simple compactness argument. Section 8 is devoted to the proof of strong convergence of the sequence of approximate temperatures. To this end, the entropy inequality is used together with the theory of parametrized (Young) measures discussed in Sect. 3.3. In Sect. 9 it is shown that the sequence of approximate densities converges strongly in the Lebesgue space $L^1((0, T) \times \Omega)$. Obviously, this is the most delicate point of the proof as no uniform estimates are available on the derivatives. Here, the main novelty is introducing weighted estimates of the so-called oscillations defect measure in order to show that the limit densities satisfy the renormalized continuity equation. The proof of Theorem 3.1 is completed in Sect. 10.
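The notions of weak limit and of the oscillations defect measure (3.9) used in Sect. 3.3 can be illustrated on a one-dimensional toy sequence. The sketch below is an illustration under stated assumptions, not part of the method of the paper: the oscillating density $\varrho_n$ on $(0, 1)$, the midpoint quadrature, and all function names are hypothetical, and a single large $n$ stands in for the limit $n \to \infty$ (for this periodic sequence the averages over the unit interval do not depend on $n$).

```python
import math


def rho_n(x, n):
    """Oscillating density: 1.5 on one half of each period, 0.5 on the other half."""
    return 1.5 if math.sin(2.0 * math.pi * n * x) >= 0.0 else 0.5


def mean_over_unit_interval(f, cells):
    """Midpoint rule on (0, 1); used here as a stand-in for the integral average."""
    return sum(f((i + 0.5) / cells) for i in range(cells)) / cells


def truncation(value, k):
    """Cut-off function T_k(value) = min{value, k}."""
    return min(value, k)


def oscillation_integral(n, k, gamma, weak_limit=1.0):
    """Integral of |T_k(rho_n) - T_k(rho)|**(gamma + 1) over (0, 1)."""
    cells = 64 * n  # grid aligned with the period, so no midpoint hits a jump
    return mean_over_unit_interval(
        lambda x: abs(truncation(rho_n(x, n), k) - truncation(weak_limit, k)) ** (gamma + 1.0),
        cells,
    )


n, gamma = 8, 5.0 / 3.0
weak_limit_of_rho = mean_over_unit_interval(lambda x: rho_n(x, n), 64 * n)
weak_limit_of_square = mean_over_unit_interval(lambda x: rho_n(x, n) ** 2, 64 * n)

# The average of rho_n**2 differs from the square of the average of rho_n.
print(weak_limit_of_rho, weak_limit_of_square)
# Supremum over a few cut-off levels k of the oscillation integral.
print(max(oscillation_integral(n, k, gamma) for k in range(1, 6)))
```

In this example the averages of $\varrho_n$ and of $\varrho_n^2$ are not related by squaring, which is exactly the phenomenon recorded by the bar notation $\overline{b(\varrho)} \ne b(\varrho)$, while the oscillation integrals stay bounded uniformly in $k$ because the sequence itself is bounded.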


B. Ducomet, E. Feireisl

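4. Preliminaries

As we can check by direct computation, relations (1.19), (1.25) are compatible if and only if there exists a function $P_F \in C^1(0,\infty)$ such that

\[ p_F(\varrho,\vartheta) = \vartheta^{5/2}\, P_F\Big(\frac{\varrho}{\vartheta^{3/2}}\Big). \tag{4.1} \]

Consequently,

\[ \frac{\partial e_F(\varrho,\vartheta)}{\partial\vartheta} = \frac{3}{2}\,\frac{1}{Y}\Big(\frac{5}{3}P_F(Y) - P_F'(Y)\,Y\Big), \quad Y = \frac{\varrho}{\vartheta^{3/2}}; \]

whence, by virtue of hypothesis (1.26),

\[ P_F'(z) > 0, \quad \frac{5}{3}P_F(z) - P_F'(z)\,z > 0 \ \text{ for any } z > 0, \tag{4.2} \]

where the latter inequality yields

\[ \Big(\frac{P_F(z)}{z^{5/3}}\Big)' < 0 \ \text{ for all } z > 0, \tag{4.3} \]

in particular,

\[ \frac{P_F(z)}{z^{5/3}} \to p_\infty > 0 \ \text{ for } z \to \infty. \tag{4.4} \]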
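The passage from (4.2) to (4.3) is an application of the quotient rule. The following short computation is offered only as a reader's check and is not part of the original argument; it uses nothing beyond the function $P_F$ and the second inequality in (4.2).

```latex
\[
\frac{d}{dz}\left(\frac{P_F(z)}{z^{5/3}}\right)
  = \frac{P_F'(z)\,z^{5/3} - \tfrac{5}{3}\,z^{2/3}\,P_F(z)}{z^{10/3}}
  = -\,\frac{\tfrac{5}{3}\,P_F(z) - P_F'(z)\,z}{z^{8/3}}
  < 0 \quad \text{for all } z > 0 .
\]
```

Being positive and decreasing, the quotient $P_F(z)/z^{5/3}$ has a limit as $z \to \infty$, which is the quantity $p_\infty$ appearing in (4.4).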

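Note that positivity of the limit $p_\infty$ follows from (1.25), (1.28). Moreover, in accordance with (1.27), (1.28), we have $P_F \in C^1[0,\infty)$,

\[ 0 < \liminf_{z\to 0+} \frac{1}{z}\Big(\frac{5}{3}P_F(z) - P_F'(z)\,z\Big) \le \limsup_{z\to 0+} \frac{1}{z}\Big(\frac{5}{3}P_F(z) - P_F'(z)\,z\Big) < \infty, \tag{4.5} \]

\[ \limsup_{z\to\infty} \frac{1}{z}\Big(\frac{5}{3}P_F(z) - P_F'(z)\,z\Big) < \infty, \tag{4.6} \]

and

\[ \lim_{z\to 0+} P_F(z) = 0, \quad \lim_{z\to 0+} P_F'(z) > 0, \quad \lim_{z\to\infty} \frac{P_F'(z)}{z^{2/3}} = \frac{5}{3}\,p_\infty > 0. \tag{4.7} \]

Now it follows from (4.7) that

\[ P_F'(z) \ge c\,z^{2/3} \ \text{ for all } z > 0, \text{ and a certain } c > 0; \]

therefore there exists $p_c > 0$ such that the mapping

\[ \varrho \mapsto p_F(\varrho,\vartheta) - p_c\,\varrho^{5/3} \ \text{ is a non-decreasing function of } \varrho \text{ for any fixed } \vartheta > 0. \tag{4.8} \]

In addition, with the other thermodynamic quantities determined through Eq. (1.19), we have

\[ s(\varrho,\vartheta) = s_F(\varrho,\vartheta) + s_R(\varrho,\vartheta), \tag{4.9} \]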

Equations of Magnetohydrodynamics


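where

\[ s_F(\varrho,\vartheta) = S_F\Big(\frac{\varrho}{\vartheta^{3/2}}\Big), \ \text{ with } \ S_F'(z) = -\frac{3}{2}\,\frac{\frac{5}{3}P_F(z) - P_F'(z)\,z}{z^2}, \tag{4.10} \]

and

\[ s_R(\varrho,\vartheta) = \frac{4}{3}\,a\,\frac{\vartheta^3}{\varrho}. \tag{4.11} \]

Note that, by virtue of (4.2), (4.5), $S_F$ is a decreasing function such that

\[ \lim_{z\to 0+} z\,S_F'(z) = -\frac{2}{5}\,P_F'(0) < 0; \]

whence, normalizing by $S_F(1) = 0$, we get

\[ -c_1 \log(z) \le S_F(z) \le -c_2 \log(z), \ c_1 > 0, \ \text{ for all } 0 < z \le 1, \tag{4.12} \]

and

\[ 0 \ge S_F(z) \ge -c_3 \log(z) \ \text{ for all } z \ge 1. \tag{4.13} \]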
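The monotonicity statement (4.8) can be checked by differentiating the scaling form (4.1) with respect to the density. The sketch below writes $\varrho$ for the density and $Y = \varrho/\vartheta^{3/2}$, and assumes the lower bound $P_F'(z) \ge c\,z^{2/3}$ stated just before (4.8); the value $p_c = 3c/5$ is one admissible choice made here for illustration, not a constant taken from the paper.

```latex
\[
\frac{\partial p_F}{\partial \varrho}(\varrho,\vartheta)
  = \vartheta^{5/2}\,P_F'(Y)\,\frac{1}{\vartheta^{3/2}}
  = \vartheta\,P_F'(Y)
  \;\ge\; c\,\vartheta\,Y^{2/3}
  = c\,\varrho^{2/3}
  = \frac{\partial}{\partial \varrho}\Big(\tfrac{3c}{5}\,\varrho^{5/3}\Big),
  \qquad Y = \frac{\varrho}{\vartheta^{3/2}} .
\]
```

Hence $\varrho \mapsto p_F(\varrho,\vartheta) - \tfrac{3c}{5}\varrho^{5/3}$ has a non-negative derivative for every fixed $\vartheta > 0$.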

5. Approximation Scheme

5.1. A regularized problem. The approximation scheme used in the present paper is essentially that of Chap. 7 in [18], supplemented with the necessary modifications introduced in [13]. For the reader’s convenience, the additional terms are put into {}. The continuity equation (1.9) is replaced by its “artificial viscosity” approximation

\[ \partial_t \varrho + \mathrm{div}_x(\varrho u) = \{\varepsilon\Delta\varrho\}, \ \varepsilon > 0, \tag{5.1} \]

to be satisfied on $(0,T)\times\Omega$, and supplemented by the homogeneous Neumann boundary conditions

\[ \nabla_x \varrho \cdot n|_{\partial\Omega} = 0. \tag{5.2} \]

The initial distribution of the approximate densities is given through

\[ \varrho(0,\cdot) = \varrho_{0,\delta}, \tag{5.3} \]

where

\[ \varrho_{0,\delta} \in C^1(\overline{\Omega}), \quad \nabla_x \varrho_{0,\delta}\cdot n|_{\partial\Omega} = 0, \quad \inf_{x\in\Omega} \varrho_{0,\delta}(x) > 0, \tag{5.4} \]

with a positive parameter δ > 0. The functions $\varrho_{0,\delta}$ are chosen in such a way that

\[ \varrho_{0,\delta} \to \varrho_0 \ \text{ in } L^{5/3}(\Omega), \quad |\{\varrho_{0,\delta} < \varrho_0\}| \to 0 \ \text{ for } \delta \to 0 \tag{5.5} \]

(cf. Sect. 4 in [13]). Here, of course, the choice of the “critical” exponent γ = 5/3 is intimately related to estimate (4.7) established above.


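A regularized momentum equation reads

\[ \partial_t(\varrho u) + \mathrm{div}_x(\varrho u\otimes u) + \nabla_x p + \{\delta\nabla_x\varrho^{\Gamma} + \varepsilon\nabla_x u\,\nabla_x\varrho\} = \mathrm{div}_x S + \varrho\nabla_x\Psi + J\times B \tag{5.6} \]

in $(0,T)\times\Omega$, with the quantities $J\times B$, $S$, and $\Psi$ determined by (1.6), (1.13), and (2.5), respectively. Furthermore, in accordance with (1.14), the approximate velocity field satisfies the homogeneous Dirichlet boundary conditions

\[ u|_{\partial\Omega} = 0. \tag{5.7} \]

Similarly to Sect. 4 in [13], we prescribe the initial conditions

\[ (\varrho u)(0,\cdot) = (\varrho u)_{0,\delta}, \tag{5.8} \]

where

\[ (\varrho u)_{0,\delta} = \begin{cases} (\varrho u)_0 & \text{provided } \varrho_{0,\delta} \ge \varrho_0, \\ 0 & \text{otherwise.} \end{cases} \tag{5.9} \]

The role of the “artificial pressure” term $\delta\varrho^{\Gamma}$ in (5.6) is to provide additional estimates on the approximate densities in order to facilitate the limit passage ε → 0 (cf. Sect. 7 in [18]). To this end, one has to take $\Gamma$ large enough, say, $\Gamma > 8$, and to re-parametrize the initial distribution of the approximate densities so that

\[ \delta\int_\Omega \varrho_{0,\delta}^{\Gamma}\,dx \to 0 \ \text{ for } \delta \to 0. \tag{5.10} \]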

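As a next step, pursuing the strategy of [13], we replace the entropy equation (1.18) by the (modified) internal energy balance

\[ \partial_t\big(\varrho(e + \{\delta\vartheta\})\big) + \mathrm{div}_x\big(\varrho(e + \{\delta\vartheta\})u\big) - \mathrm{div}_x\big((\kappa_F + \kappa_R\vartheta^3 + \{\delta\vartheta^{\Gamma}\})\nabla_x\vartheta\big) = S:\nabla_x u - p\,\mathrm{div}_x u + \frac{\lambda}{\mu}|\mathrm{curl}_x B|^2 + \{\varepsilon\delta\Gamma|\nabla_x\varrho|^2\varrho^{\Gamma-2}\} \tag{5.11} \]

to be satisfied in $(0,T)\times\Omega$, together with no-flux boundary conditions

\[ \nabla_x\vartheta\cdot n|_{\partial\Omega} = 0. \tag{5.12} \]

The initial conditions read

\[ \varrho(e + \delta\vartheta)(0,\cdot) = \varrho_{0,\delta}\big(e(\varrho_{0,\delta},\vartheta_{0,\delta}) + \delta\vartheta_{0,\delta}\big), \tag{5.13} \]

where the (approximate) temperature distribution satisfies

\[ \vartheta_{0,\delta} \in C^1(\overline{\Omega}), \quad \nabla_x\vartheta_{0,\delta}\cdot n|_{\partial\Omega} = 0, \quad \inf_{x\in\Omega}\vartheta_{0,\delta}(x) > 0, \tag{5.14} \]

and

\[ \left. \begin{array}{l} \vartheta_{0,\delta} \to \vartheta_0 \ \text{ in } L^p(\Omega) \text{ for any } p \ge 1, \\[4pt] \delta\displaystyle\int_\Omega \varrho_{0,\delta}\log(\vartheta_{0,\delta})\,dx \to 0, \\[4pt] \displaystyle\int_\Omega \varrho_{0,\delta}\,s(\varrho_{0,\delta},\vartheta_{0,\delta})\,dx \to \int_\Omega \varrho_0\,s(\varrho_0,\vartheta_0)\,dx \end{array} \right\} \ \text{ as } \delta \to 0, \tag{5.15} \]

\[ \int_\Omega \varrho_{0,\delta}\,e(\varrho_{0,\delta},\vartheta_{0,\delta})\,dx < c \ \text{ uniformly for } \delta > 0. \tag{5.16} \]

Finally, the magnetic induction vector B obeys the (unperturbed) equations

\[ \partial_t B + \mathrm{curl}_x(B\times u) + \mathrm{curl}_x(\lambda\,\mathrm{curl}_x B) = 0, \quad \mathrm{div}_x B = 0 \tag{5.17} \]

in $(0,T)\times\Omega$, supplemented with the initial condition

\[ B(0,\cdot) = B_{0,\delta}, \tag{5.18} \]

where, by virtue of Theorem 1.4 in [38], one can take

\[ B_{0,\delta} \in \mathcal{D}(\Omega; R^3), \quad \mathrm{div}_x B_{0,\delta} = 0, \tag{5.19} \]

\[ B_{0,\delta} \to B_0 \ \text{ in } L^2(\Omega; R^3) \text{ for } \delta \to 0. \tag{5.20} \]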

5.2. The overall strategy of the proof of Theorem 3.1. For given positive parameters ε, δ, and $\Gamma > 8$, the proof of Theorem 3.1 consists of the following steps:

Step 1. Solving problem (5.1–5.20) for fixed ε > 0, δ > 0.
Step 2. Passing to the limit for ε → 0.
Step 3. Letting δ → 0.

To begin with, the goal proposed in Step 1 can be achieved by means of a simple fixed point argument exactly as in Sect. 5 in [13] (see also Chap. 7 in [18]). More specifically, Eq. (5.6) is solved in terms of the velocity u with the help of the Faedo-Galerkin method, where $\varrho$, $\Psi$, $\vartheta$, and $B$ are computed successively from (5.1), (2.5), (5.11), and (5.17) as functions of u. In addition, it is easy to check that the corresponding approximate solutions satisfy the energy balance

\[ \frac{d}{dt}\int_\Omega \Big(\frac{1}{2}\varrho|u|^2 + \varrho e + \frac{1}{2\mu}|B|^2 + \frac{\delta}{\Gamma-1}\varrho^{\Gamma} + \delta\varrho\vartheta\Big)\,dx = \int_\Omega \varrho\nabla_x\Psi\cdot u\,dx \ \text{ in } \mathcal{D}'(0,T), \tag{5.21} \]

\[ \lim_{t\to 0+}\int_\Omega \Big(\frac{1}{2}\varrho|u|^2 + \varrho e + \frac{1}{2\mu}|B|^2 + \frac{\delta}{\Gamma-1}\varrho^{\Gamma} + \delta\varrho\vartheta\Big)\,dx = \int_\Omega \Big(\frac{1}{2}\frac{|(\varrho u)_{0,\delta}|^2}{\varrho_{0,\delta}} + \varrho_{0,\delta}\,e(\varrho_{0,\delta},\vartheta_{0,\delta}) + \frac{1}{2\mu}|B_{0,\delta}|^2 + \frac{\delta}{\Gamma-1}\varrho_{0,\delta}^{\Gamma} + \delta\varrho_{0,\delta}\vartheta_{0,\delta}\Big)\,dx \tag{5.22} \]

(cf. formula (5.3) in [13]). The technical parts of Steps 2 and 3 are rather similar. As explained in Chap. 7 in [18], the only reason for splitting this step into the ε- and δ-parts is the refined density estimates based on the multipliers $\nabla_x(-\Delta)^{-1}[\varrho^{\beta}]$, where one has to take β = 1 when the artificial viscosity term $\varepsilon\Delta\varrho$ is present, while uniform (independent of δ) estimates require β to be a small positive number (see also Sect. 6 in [13]). For this reason, we focus in this paper only on Step 3; in other words, our task will be to establish the weak sequential stability (compactness) property for the solution set of the approximate problem specified below.
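The appearance of the artificial pressure in the energy balance (5.21) can be traced by a formal computation on smooth approximate solutions. The sketch below writes $\varrho$ for the density and $\Gamma$ for the artificial-pressure exponent (notational assumptions), and uses only the regularized continuity equation (5.1) with its Neumann condition (5.2) and the Dirichlet condition (5.7); it is an illustration, not a substitute for the argument in [13] and [18].

```latex
% Multiply (5.1) by \Gamma\varrho^{\Gamma-1} (renormalization with b(\varrho)=\varrho^{\Gamma}):
\[
\partial_t \varrho^{\Gamma} + \mathrm{div}_x(\varrho^{\Gamma}u)
  + (\Gamma-1)\,\varrho^{\Gamma}\,\mathrm{div}_x u
  = \varepsilon\,\Gamma\,\varrho^{\Gamma-1}\,\Delta\varrho .
\]
% Integrate over \Omega, using u|_{\partial\Omega}=0 and \nabla_x\varrho\cdot n|_{\partial\Omega}=0:
\[
\frac{d}{dt}\int_\Omega \varrho^{\Gamma}\,dx
  + (\Gamma-1)\int_\Omega \varrho^{\Gamma}\,\mathrm{div}_x u\,dx
  = -\,\varepsilon\,\Gamma(\Gamma-1)\int_\Omega \varrho^{\Gamma-2}\,|\nabla_x\varrho|^2\,dx .
\]
% Consequently, testing the artificial pressure term of (5.6) against u gives
\[
\int_\Omega \delta\,\nabla_x\varrho^{\Gamma}\cdot u\,dx
  = -\,\delta\int_\Omega \varrho^{\Gamma}\,\mathrm{div}_x u\,dx
  = \frac{d}{dt}\int_\Omega \frac{\delta}{\Gamma-1}\,\varrho^{\Gamma}\,dx
  + \varepsilon\,\delta\,\Gamma\int_\Omega \varrho^{\Gamma-2}\,|\nabla_x\varrho|^2\,dx .
\]
```

The first term on the right is the contribution $\frac{\delta}{\Gamma-1}\varrho^{\Gamma}$ to the total energy, while the second is a dissipative term of the same form as the bracketed source added on the right-hand side of the modified internal energy balance (5.11).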


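5.3. The weak sequential stability problem. The density $\varrho_\delta \ge 0$ and the velocity $u_\delta$ satisfy the integral identity

\[ \int_0^T\!\!\int_{R^3} \big(\varrho_\delta\,\partial_t\varphi + \varrho_\delta u_\delta\cdot\nabla_x\varphi\big)\,dx\,dt + \int_\Omega \varrho_{0,\delta}\,\varphi(0,\cdot)\,dx = 0 \tag{5.23} \]

for any test function $\varphi \in \mathcal{D}([0,T)\times R^3)$. In addition,

\[ u_\delta(t) \in W_0^{1,2}(\Omega; R^3) \ \text{ for a.a. } t \in (0,T). \tag{5.24} \]

The momentum equation

\[ \int_0^T\!\!\int_\Omega \big(\varrho_\delta u_\delta\cdot\partial_t\varphi + (\varrho_\delta u_\delta\otimes u_\delta):\nabla_x\varphi + p_\delta\,\mathrm{div}_x\varphi + \{\delta\varrho_\delta^{\Gamma}\}\,\mathrm{div}_x\varphi\big)\,dx\,dt = \int_0^T\!\!\int_\Omega \big(S_\delta:\nabla_x\varphi - \varrho_\delta\nabla_x\Psi_\delta\cdot\varphi - (J_\delta\times B_\delta)\cdot\varphi\big)\,dx\,dt - \int_\Omega (\varrho u)_{0,\delta}\cdot\varphi(0,\cdot)\,dx \tag{5.25} \]

holds for any $\varphi \in \mathcal{D}([0,T)\times\Omega; R^3)$, with

\[ \Psi_\delta = G(-\Delta)^{-1}[1_\Omega\varrho_\delta], \tag{5.26} \]

where $p_\delta = p(\varrho_\delta,\vartheta_\delta)$, and $S_\delta$, $J_\delta$ are determined in terms of $u_\delta$, $\vartheta_\delta$, and $B_\delta$ through the constitutive relations (1.6), (1.13). The entropy production inequality reads

\[ \int_0^T\!\!\int_\Omega \Big((\varrho_\delta s_\delta + \{\delta\varrho_\delta\log(\vartheta_\delta)\})\,\partial_t\varphi + (\varrho_\delta s_\delta u_\delta + \{\delta\varrho_\delta\log(\vartheta_\delta)u_\delta\})\cdot\nabla_x\varphi\Big)\,dx\,dt + \int_0^T\!\!\int_\Omega \Big(\frac{q_\delta}{\vartheta_\delta} - \{\delta\vartheta_\delta^{\Gamma-1}\nabla_x\vartheta_\delta\}\Big)\cdot\nabla_x\varphi\,dx\,dt \]
\[ \le \int_0^T\!\!\int_\Omega \frac{1}{\vartheta_\delta}\Big(\frac{q_\delta\cdot\nabla_x\vartheta_\delta}{\vartheta_\delta} - S_\delta:\nabla_x u_\delta - \frac{\lambda_\delta}{\mu}|\mathrm{curl}_x B_\delta|^2 - \{\delta\vartheta_\delta^{\Gamma-1}|\nabla_x\vartheta_\delta|^2\}\Big)\varphi\,dx\,dt - \int_\Omega \big(\varrho_{0,\delta}\,s(\varrho_{0,\delta},\vartheta_{0,\delta}) + \{\delta\varrho_{0,\delta}\log(\vartheta_{0,\delta})\}\big)\varphi(0,\cdot)\,dx \tag{5.27} \]

for any $\varphi \in \mathcal{D}([0,T)\times R^3)$, $\varphi \ge 0$. Here, $\vartheta_\delta$ is assumed to be positive a.a. on the set $(0,T)\times\Omega$, $s_\delta = s(\varrho_\delta,\vartheta_\delta)$, $\lambda_\delta = \lambda(\varrho_\delta,\vartheta_\delta)$, and $q_\delta$ is a function of $\varrho_\delta$, $\vartheta_\delta$ given by (1.30). The magnetic induction vector $B_\delta$ satisfies

\[ \int_0^T\!\!\int_\Omega \big(B_\delta\cdot\partial_t\varphi - (B_\delta\times u_\delta + \lambda_\delta\,\mathrm{curl}_x B_\delta)\cdot\mathrm{curl}_x\varphi\big)\,dx\,dt + \int_\Omega B_{0,\delta}\cdot\varphi(0,\cdot)\,dx = 0 \tag{5.28} \]

for any vector field $\varphi \in \mathcal{D}([0,T)\times R^3; R^3)$.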


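Finally, the (total) energy equality

\[ \int_\Omega \Big(\frac{1}{2}\varrho_\delta|u_\delta|^2 + \varrho_\delta\,e(\varrho_\delta,\vartheta_\delta) + \frac{1}{2\mu}|B_\delta|^2 - \frac{1}{2}\varrho_\delta\Psi_\delta + \frac{\delta}{\Gamma-1}\varrho_\delta^{\Gamma} + \delta\varrho_\delta\vartheta_\delta\Big)(t)\,dx \]
\[ = \int_\Omega \Big(\frac{1}{2\varrho_{0,\delta}}|(\varrho u)_{0,\delta}|^2 + \varrho_{0,\delta}\,e(\varrho_{0,\delta},\vartheta_{0,\delta}) + \frac{1}{2\mu}|B_{0,\delta}|^2 - \frac{G}{2}\varrho_{0,\delta}(-\Delta)^{-1}[1_\Omega\varrho_{0,\delta}]\Big)\,dx + \int_\Omega \Big(\frac{\delta}{\Gamma-1}\varrho_{0,\delta}^{\Gamma} + \delta\varrho_{0,\delta}\vartheta_{0,\delta}\Big)\,dx \tag{5.29} \]

holds for a.a. $t \in (0,T)$. The weak sequential stability problem to be addressed in the remaining part of this paper consists in showing that one can pass to the limit

\[ \varrho_\delta \to \varrho, \quad u_\delta \to u, \quad \vartheta_\delta \to \vartheta, \quad B_\delta \to B \quad \text{as } \delta \to 0 \]

in a suitable topology, where the limit quantity $\{\varrho, u, \vartheta, B\}$ is a variational solution of problem (2.1–2.15), the existence of which is claimed in Theorem 3.1.

6. Uniform Bounds

Our first goal is to identify the uniform bounds imposed on the sequences $\{\varrho_\delta\}_{\delta>0}$, $\{u_\delta\}_{\delta>0}$, $\{\vartheta_\delta\}_{\delta>0}$, and $\{B_\delta\}_{\delta>0}$ through the total energy balance (5.29), the dissipation inequality (5.27), as well as other relations resulting from (5.23–5.29).

6.1. Total mass conservation. As $\varrho_\delta$, $\varrho_\delta u_\delta$ are locally integrable in $[0,T)\times\overline{\Omega}$, it follows easily from (5.23) that the total mass is a constant of motion, specifically,

\[ \int_\Omega \varrho_\delta(t)\,dx = \int_\Omega \varrho_{0,\delta}\,dx = M_0 \ \text{ for a.a. } t \in (0,T). \tag{6.1} \]

In particular, as $\varrho_\delta \ge 0$ and (5.5) holds, we get

\[ \{\varrho_\delta\}_{\delta>0} \ \text{ bounded in } L^\infty(0,T; L^1(\Omega)). \tag{6.2} \]

6.2. The gravitational potential. The classical elliptic estimates applied to (2.5), together with (6.1), give rise to

\[ \int_\Omega \varrho_\delta\Psi_\delta\,dx \le \|\varrho_\delta\|_{L^{5/3}(\Omega)}\,\|\Psi_\delta\|_{L^{5/2}(\Omega)} \le c\,\|\varrho_\delta\|_{L^{5/3}(\Omega)}\,\|\varrho_\delta\|_{L^1(\Omega)} = cM_0\,\|\varrho_\delta\|_{L^{5/3}(\Omega)}. \tag{6.3} \]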


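6.3. Energy estimates. On the other hand, by virtue of (1.25), (4.7),

\[ \int_\Omega \varrho_\delta\,e(\varrho_\delta,\vartheta_\delta)\,dx \ge \frac{3}{2}\int_\Omega p_F(\varrho_\delta,\vartheta_\delta)\,dx \ge \frac{3}{2}\,p_\infty\int_\Omega \varrho_\delta^{5/3}\,dx; \]

whence the total energy balance (5.29), estimate (6.3), together with the bounds on the initial data (5.5), (5.9), (5.10), (5.16), and (5.20), yield a family of energy estimates:

\[ \{\varrho_\delta\}_{\delta>0} \ \text{ bounded in } L^\infty(0,T; L^{5/3}(\Omega)), \tag{6.4} \]

\[ \{\vartheta_\delta\}_{\delta>0} \ \text{ bounded in } L^\infty(0,T; L^4(\Omega)), \tag{6.5} \]

\[ \{B_\delta\}_{\delta>0} \ \text{ bounded in } L^\infty(0,T; L^2(\Omega; R^3)), \tag{6.6} \]

\[ \{\varrho_\delta|u_\delta|^2\}_{\delta>0}, \ \{\varrho_\delta\,e(\varrho_\delta,\vartheta_\delta)\}_{\delta>0} \ \text{ bounded in } L^\infty(0,T; L^1(\Omega)), \tag{6.7} \]

and

\[ \delta\int_\Omega \varrho_\delta^{\Gamma}\,dx \le c \ \text{ uniformly with respect to } \delta > 0. \tag{6.8} \]

In particular, by virtue of Hölder’s inequality, (6.4) together with (6.7) imply

\[ \{(\varrho u)_\delta\}_{\delta>0} \ \text{ bounded in } L^\infty(0,T; L^{5/4}(\Omega; R^3)). \tag{6.9} \]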
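The exponent 5/4 in (6.9) comes from splitting the momentum as $\sqrt{\varrho_\delta}\cdot\sqrt{\varrho_\delta}\,u_\delta$ and applying Hölder's inequality. A short worked version, written with $\varrho_\delta$ for the approximate density and intended only as a check of the exponents:

```latex
\[
\|\varrho_\delta u_\delta\|_{L^{5/4}(\Omega)}
  \le \|\sqrt{\varrho_\delta}\|_{L^{10/3}(\Omega)}\,
      \|\sqrt{\varrho_\delta}\,u_\delta\|_{L^{2}(\Omega)}
  = \|\varrho_\delta\|_{L^{5/3}(\Omega)}^{1/2}
    \Big(\int_\Omega \varrho_\delta\,|u_\delta|^2\,dx\Big)^{1/2},
  \qquad \frac{3}{10} + \frac{1}{2} = \frac{4}{5} .
\]
```

Both factors on the right are bounded uniformly in time by (6.4) and (6.7), respectively.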

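6.4. Dissipation estimates. The following, relatively strong, estimates result from the entropy production inequality (5.27), where one is allowed to take a spatially homogeneous test function ϕ such that ϕ(0, ·) = 1. Taking (5.15), (5.16) into account we obtain

\[ \int_0^\tau\!\!\int_\Omega \frac{1}{\vartheta_\delta}\Big(S_\delta:\nabla_x u_\delta - \frac{q_\delta\cdot\nabla_x\vartheta_\delta}{\vartheta_\delta} + \frac{\lambda_\delta}{\mu}|\mathrm{curl}_x B_\delta|^2 + \delta\vartheta_\delta^{\Gamma-1}|\nabla_x\vartheta_\delta|^2\Big)\,dx\,dt \le c_1 + \int_\Omega \big(\varrho_\delta(\tau)\,s(\varrho_\delta(\tau),\vartheta_\delta(\tau)) + \delta\varrho_\delta(\tau)\log(\vartheta_\delta(\tau))\big)\,dx \]
\[ \le c_2 + \int_\Omega \varrho_\delta(\tau)\,s(\varrho_\delta(\tau),\vartheta_\delta(\tau))\,dx \ \text{ for a.a. } \tau \in (0,T), \tag{6.10} \]

where the last inequality follows from (6.4), (6.5).

Furthermore, the rightmost integral can be estimated with the help of (4.12), (4.13):

\[ \int_\Omega \varrho_\delta\,s(\varrho_\delta,\vartheta_\delta)\,dx = \int_\Omega \Big(\frac{4}{3}a\vartheta_\delta^3 + \varrho_\delta\,S_F\Big(\frac{\varrho_\delta}{\vartheta_\delta^{3/2}}\Big)\Big)\,dx \le c_1 - c_2\int_{\{\varrho_\delta\le\vartheta_\delta^{3/2}\}} \varrho_\delta\Big(\log(\varrho_\delta) - \frac{3}{2}\log(\vartheta_\delta)\Big)\,dx \]
\[ \le c_3 + c_4\int_{\{\varrho_\delta\le\vartheta_\delta^{3/2}\}} \varrho_\delta\log(\vartheta_\delta)\,dx \le c_5 + c_6\int_\Omega \varrho_\delta\vartheta_\delta\,dx \le c_7, \tag{6.11} \]

where we have used (6.4), (6.5). Thus the far left integral in (6.10) is bounded independently of δ, and we conclude, making use of hypotheses (1.29–1.33), that

\[ \{(1+\vartheta_\delta)^{\frac{\alpha-1}{2}}\langle\nabla_x u_\delta\rangle\}_{\delta>0} \ \text{ is bounded in } L^2(0,T; L^2(\Omega; R^{3\times 3})), \quad \{(1+\vartheta_\delta)^{\frac{\alpha-1}{2}}\,\mathrm{div}_x u_\delta\}_{\delta>0} \ \text{ is bounded in } L^2(0,T; L^2(\Omega)), \tag{6.12} \]

\[ \{\nabla_x\log(\vartheta_\delta)\}_{\delta>0}, \ \{\nabla_x\vartheta_\delta^{3/2}\}_{\delta>0} \ \text{ are bounded in } L^2(0,T; L^2(\Omega; R^3)), \tag{6.13} \]

\[ \{(1+\vartheta_\delta)^{\frac{\alpha-1}{2}}\,\mathrm{curl}_x B_\delta\}_{\delta>0} \ \text{ is bounded in } L^2(0,T; L^2(\Omega; R^3)), \tag{6.14} \]

and

\[ \{\sqrt{\delta}\,\nabla_x\vartheta_\delta^{\Gamma/2}\}_{\delta>0} \ \text{ is bounded in } L^2(0,T; L^2(\Omega; R^3)), \tag{6.15} \]

where we have denoted

\[ \langle Q\rangle = \frac{1}{2}(Q + Q^T) - \frac{1}{3}\,\mathrm{trace}(Q)\,\mathbb{I} \]

the traceless component of the symmetric part of a tensor Q. Since the velocity field $u_\delta$ vanishes on the boundary in the sense of (5.24), estimate (6.12) yields, in particular,

\[ \{u_\delta\}_{\delta>0} \ \text{ bounded in } L^2(0,T; W_0^{1,2}(\Omega; R^3)). \tag{6.16} \]

Similarly, (6.13) together with (6.5) give rise to

\[ \{\vartheta_\delta^{3/2}\}_{\delta>0} \ \text{ bounded in } L^2(0,T; W^{1,2}(\Omega)). \tag{6.17} \]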

6.5. Positivity of the absolute temperature. In agreement with the physical background and as required in the variational formulation introduced in Sect. 2, the absolute temperature must be positive a.a. on $(0,T)\times\Omega$. To this end, we make use of the uniform $L^2$-estimates of $\nabla_x\log(\vartheta_\delta)$ established in (6.13) along with the following version of Poincaré’s inequality:

Lemma 6.1. Let $\Omega \subset R^N$ be a bounded Lipschitz domain, and ω ≥ 1 be a given constant. Furthermore, assume $O \subset \Omega$ is a measurable set such that |O| ≥ o > 0. Then

\[ \|v\|_{W^{1,2}(\Omega)} \le c(o,\Omega,\omega)\Big(\|\nabla_x v\|_{L^2(\Omega; R^N)} + \Big(\int_O |v|^{\frac{1}{\omega}}\,dx\Big)^{\omega}\Big). \]

In order to apply Lemma 6.1, we show first a uniform bound

\[ 0 < \underline{T} \le \int_\Omega \vartheta_\delta(t)\,dx \ \text{ for a.a. } t \in (0,T), \tag{6.18} \]

with a constant $\underline{T}$ independent of δ.

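Indeed, as an immediate consequence of (5.27), we get

\[ \int_\Omega \big(\varrho_\delta\,s(\varrho_\delta,\vartheta_\delta)(t) + \delta\varrho_\delta\log(\vartheta_\delta)(t)\big)\,dx \ge \int_\Omega \big(\varrho_{0,\delta}\,s(\varrho_{0,\delta},\vartheta_{0,\delta}) + \delta\varrho_{0,\delta}\log(\vartheta_{0,\delta})\big)\,dx, \]

where, by virtue of (5.5), (5.15),

\[ \delta\int_\Omega \varrho_{0,\delta}\log(\vartheta_{0,\delta})\,dx \to 0, \quad \int_\Omega \varrho_{0,\delta}\,s(\varrho_{0,\delta},\vartheta_{0,\delta})\,dx \to \int_\Omega \varrho_0\,s(\varrho_0,\vartheta_0)\,dx. \]

On the other hand, (6.4), (6.5) yield

\[ \delta\int_\Omega \varrho_\delta\log(\vartheta_\delta)(t)\,dx \le \delta\int_\Omega \varrho_\delta(t)\vartheta_\delta(t)\,dx \to 0 \ \text{ for a.a. } t \in (0,T), \]

while, in accordance with (4.9),

\[ \int_\Omega \varrho_\delta\,s(\varrho_\delta,\vartheta_\delta)(t)\,dx = \int_\Omega \Big(\frac{4}{3}a\vartheta_\delta^3(t) + \varrho_\delta\,S_F\Big(\frac{\varrho_\delta}{\vartheta_\delta^{3/2}}\Big)(t)\Big)\,dx. \]

Assuming, by contradiction, the opposite of (6.18), we could extract a sequence $\{\vartheta_\delta(t_\delta)\}_{\delta>0}$ such that

\[ \vartheta_\delta(t_\delta) \to 0 \ \text{ weakly in } L^4(\Omega), \text{ and strongly in } L^p(\Omega) \text{ for any } 1 \le p < 4, \tag{6.19} \]

\[ \int_\Omega \Big(\frac{4}{3}a\vartheta_0^3 + \varrho_0\,S_F\Big(\frac{\varrho_0}{\vartheta_0^{3/2}}\Big)\Big)\,dx \le \liminf_{\delta\to 0+}\int_\Omega \varrho_\delta(t_\delta)\,S_F\Big(\frac{\varrho_\delta}{\vartheta_\delta^{3/2}}\Big)(t_\delta)\,dx, \]

where

\[ \varrho_\delta(t_\delta) \to \varrho(t) \ \text{ weakly in } L^{5/3}(\Omega), \quad \int_\Omega \varrho_\delta(t_\delta)\,dx = \int_\Omega \varrho_{0,\delta}\,dx. \tag{6.20} \]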


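Now, for any fixed K > 1, we can write

\[ \int_\Omega \varrho_\delta(t_\delta)\,S_F\Big(\frac{\varrho_\delta}{\vartheta_\delta^{3/2}}\Big)(t_\delta)\,dx \le \int_{\{\varrho_\delta(t_\delta)\le\vartheta_\delta^{3/2}(t_\delta)\}} \varrho_\delta(t_\delta)\,S_F\Big(\frac{\varrho_\delta}{\vartheta_\delta^{3/2}}\Big)(t_\delta)\,dx + \int_{\{\varrho_\delta(t_\delta)\ge K\vartheta_\delta^{3/2}(t_\delta)\}} \varrho_\delta(t_\delta)\,S_F\Big(\frac{\varrho_\delta}{\vartheta_\delta^{3/2}}\Big)(t_\delta)\,dx \]
\[ \le c\int_{\{\varrho_\delta(t_\delta)\le\vartheta_\delta^{3/2}(t_\delta)\}} \varrho_\delta(t_\delta)\Big(1 + \frac{\vartheta_\delta^{3/2}}{\varrho_\delta}(t_\delta)\Big)\,dx + S_F(K)\int_\Omega \varrho_{0,\delta}\,dx \le 2c\int_\Omega \vartheta_\delta^{3/2}(t_\delta)\,dx + S_F(K)\int_\Omega \varrho_{0,\delta}\,dx. \tag{6.21} \]

Thus, combining (6.19–6.21), we conclude

\[ \int_\Omega \Big(\frac{4}{3}a\vartheta_0^3 + \varrho_0\,S_F\Big(\frac{\varrho_0}{\vartheta_0^{3/2}}\Big)\Big)\,dx \le S_F(K)\int_\Omega \varrho_0\,dx \ \text{ for any } K > 1, \]

which is clearly impossible as $S_F$ is strictly decreasing with $\lim_{K\to\infty} S_F(K) < 0$, and $\vartheta_0$ is positive (non-zero) on Ω. Thus we have established (6.18). Finally, seeing that

\[ \underline{T} - \varepsilon|\Omega| \le \int_{\{\vartheta_\delta(t)>\varepsilon\}} \vartheta_\delta(t)\,dx \le |\{\vartheta_\delta(t) > \varepsilon\}|^{3/4}\,\|\vartheta_\delta(t)\|_{L^4(\Omega)} \ \text{ for any } \varepsilon > 0, \]

we infer, making use of (6.5), that there exist ε > 0 and o such that $|\{\vartheta_\delta(t) > \varepsilon\}| > o > 0$ for a.a. $t \in (0,T)$ uniformly for δ > 0. In other words, taking (6.13) into account, one can apply Lemma 6.1 to $\log(\vartheta_\delta)$ in order to obtain the desired estimate

\[ \{\log(\vartheta_\delta)\}_{\delta>0} \ \text{ bounded in } L^2(0,T; W^{1,2}(\Omega)). \tag{6.22} \]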
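The lower bound on the measure of the set where the temperature exceeds ε can be made explicit. In the sketch below, $\underline{T}$ stands for the positive lower bound on the spatial integral of $\vartheta_\delta(t)$ from (6.18) (the underline is a notational assumption, used to keep it apart from the final time $T$), and the exponent 3/4 is the Hölder conjugate of the $L^4$ bound (6.5).

```latex
\[
\underline{T} \;\le\; \int_\Omega \vartheta_\delta(t)\,dx
  = \int_{\{\vartheta_\delta(t)\le\varepsilon\}} \vartheta_\delta(t)\,dx
  + \int_{\{\vartheta_\delta(t)>\varepsilon\}} \vartheta_\delta(t)\,dx
  \;\le\; \varepsilon\,|\Omega|
  + |\{\vartheta_\delta(t)>\varepsilon\}|^{3/4}\,\|\vartheta_\delta(t)\|_{L^4(\Omega)},
\]
\[
\text{so that}\qquad
|\{\vartheta_\delta(t)>\varepsilon\}|
  \;\ge\; \left(\frac{\underline{T}-\varepsilon\,|\Omega|}{\|\vartheta_\delta(t)\|_{L^4(\Omega)}}\right)^{4/3}
  \qquad\text{whenever } \varepsilon\,|\Omega| < \underline{T} .
\]
```

Choosing, for instance, $\varepsilon = \underline{T}/(2|\Omega|)$ and using the uniform $L^4$ bound gives a value of $o$ that does not depend on δ or $t$.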

6.6. Estimates of the magnetic field. As already pointed out in Sect. 2.4, satisfaction of the integral identity (5.28) gives rise to

\[ \mathrm{div}_x B_\delta(t) = 0 \ \text{ in } \mathcal{D}'(\Omega), \quad B_\delta\cdot n|_{\partial\Omega} = 0, \tag{6.23} \]

which, together with estimates (6.6), (6.14), and Theorem 6.1 in [14], yields

\[ \{B_\delta\}_{\delta>0} \ \text{ bounded in } L^2(0,T; W^{1,2}(\Omega; R^3)). \tag{6.24} \]


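6.7. Pressure estimates. More refined density estimates can be obtained through “computing” the pressure in the momentum equation (5.25). In order to do this, we start with the simple observation that estimates (6.4), (6.16) imply that the sequences $\{\varrho_\delta u_\delta\}_{\delta>0}$, $\{\varrho_\delta u_\delta\otimes u_\delta\}_{\delta>0}$, and $\{\varrho_\delta\nabla_x\Psi_\delta\}_{\delta>0}$, where $\Psi_\delta$ is determined by (5.26), are bounded in the Lebesgue space $L^p((0,T)\times\Omega)$ for a certain p > 1. Furthermore, one gets

\[ S_\delta = \sqrt{\vartheta_\delta\,\nu(\vartheta_\delta)}\,\sqrt{\frac{\nu(\vartheta_\delta)}{\vartheta_\delta}}\,\langle\nabla_x u_\delta\rangle + \sqrt{\vartheta_\delta\,\eta(\vartheta_\delta)}\,\sqrt{\frac{\eta(\vartheta_\delta)}{\vartheta_\delta}}\,\mathrm{div}_x u_\delta\,\mathbb{I}, \tag{6.25} \]

where, by virtue of (6.5), (6.12), (6.17), and hypothesis (3.1), the expression on the right-hand side is bounded in $L^p((0,T)\times\Omega)$ for a certain p > 1. Finally, we have the Lorentz force $J_\delta\times B_\delta$ determined through (1.6); whence (6.6), (6.24) combined with a simple interpolation argument yield

\[ \{J_\delta\times B_\delta\}_{\delta>0} \ \text{ bounded in } L^p((0,T)\times\Omega) \text{ for a certain } p > 1. \tag{6.26} \]

At this stage, repeating step by step the proof of the main result in [21], we are allowed to use the quantities

\[ \varphi(t,x) = \psi(t)\,\mathcal{B}[\varrho_\delta^{\omega}], \quad \psi \in \mathcal{D}(0,T), \]

for a sufficiently small parameter ω > 0, as test functions in (5.25), where $\mathcal{B}[v]$ is a suitable branch of solutions to the boundary value problem

\[ \mathrm{div}_x\mathcal{B}[v] = v - \frac{1}{|\Omega|}\int_\Omega v\,dx, \quad \mathcal{B}|_{\partial\Omega} = 0. \tag{6.27} \]

The construction of the operator $\mathcal{B}$, described in detail in [22] (see also Lemma 3.17 in [31]), is based on an integral representation formula due to Bogovskii [4]. The resulting estimate reads

\[ \int_0^T\!\!\int_\Omega \big(p(\varrho_\delta,\vartheta_\delta)\,\varrho_\delta^{\omega} + \delta\varrho_\delta^{\Gamma+\omega}\big)\,dx\,dt < c, \ \text{ with } c \text{ independent of } \delta, \tag{6.28} \]

in particular,

\[ \{p(\varrho_\delta,\vartheta_\delta)\}_{\delta>0} \ \text{ is bounded in } L^p((0,T)\times\Omega) \text{ for a certain } p > 1, \tag{6.29} \]

and

\[ \{\varrho_\delta^{\frac{5}{3}+\omega}\}_{\delta>0} \ \text{ is bounded in } L^1((0,T)\times\Omega). \tag{6.30} \]

Note that an alternative way to obtain these estimates was proposed in [28].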


7. Sequential Stability of the Field Equations

7.1. The continuity equation. With the estimates established in the preceding section, it is quite easy to pass to the limit for δ → 0 in (5.23). Indeed, passing to subsequences if necessary, we deduce from (6.4), (6.9), (6.16), and the fact that $\varrho_\delta$, $u_\delta$ satisfy the integral identity (5.23):

\[ \varrho_\delta \to \varrho \ \text{ in } C([0,T]; L^{5/3}_{weak}(\Omega)), \tag{7.1} \]

\[ u_\delta \to u \ \text{ weakly in } L^2(0,T; W_0^{1,2}(\Omega; R^3)), \tag{7.2} \]

\[ \varrho_\delta u_\delta \to \varrho u \ \text{ weakly-(*) in } L^\infty(0,T; L^{5/4}(\Omega; R^3)), \tag{7.3} \]

where the limit quantities satisfy (2.1).

7.2. The momentum equation. Using the estimates obtained in Sect. 6.7 together with (5.25) we have

\[ \varrho_\delta u_\delta \to \varrho u \ \text{ in } C([0,T]; L^{5/4}(\Omega; R^3)), \tag{7.4} \]

\[ \varrho_\delta u_\delta\otimes u_\delta \to \varrho u\otimes u \ \text{ weakly in } L^2(0,T; L^{30/29}(\Omega; R^{3\times 3}_{sym})), \tag{7.5} \]

where we have used the embedding $W_0^{1,2}(\Omega) \to L^6(\Omega)$. Furthermore, in accordance with (6.28), (6.29),

\[ p(\varrho_\delta,\vartheta_\delta) \to \overline{p(\varrho,\vartheta)} \ \text{ weakly in } L^p((0,T)\times\Omega) \tag{7.6} \]

and

\[ \delta\varrho_\delta^{\Gamma} \to 0 \ \text{ in } L^p((0,T)\times\Omega) \tag{7.7} \]

for a certain p > 1. Here, in agreement with Sect. 3.3, T p(, ϑ)ϕ dx dt 0

=

T 0

ϕ

R2

p(, ϑ) dt,x (, ϑ) dx dt, ϕ ∈ D((0, T ) × ),

where t,x (, ϑ) is a parametrized (Young) measure associated to the (vector valued) sequence {δ , ϑδ }δ>0 . Similarly, one can use (5.28) together with estimates (6.6), (6.24) to deduce Bδ → B weakly in L 2 (0, T ; W 1,2 (; R 3 )) and strongly in L 2 ((0, T ) × ; R 3 ), (7.8) and, consequently, Jδ × Bδ =

1 1 curlx Bδ × Bδ → curlx B × B = J × B weakly in L p ((0, T ) × ; R 3 ) µ µ

for a certain p > 1.


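Thus the limit quantities satisfy an “averaged” momentum equation

\[ \int_0^T\!\!\int_\Omega \big((\varrho u)\cdot\partial_t\varphi + (\varrho u\otimes u):\nabla_x\varphi + \overline{p(\varrho,\vartheta)}\,\mathrm{div}_x\varphi\big)\,dx\,dt = \int_0^T\!\!\int_\Omega \big(\overline{S}:\nabla_x\varphi - \varrho\nabla_x\Psi\cdot\varphi - (J\times B)\cdot\varphi\big)\,dx\,dt - \int_\Omega (\varrho u)_0\cdot\varphi(0,\cdot)\,dx \tag{7.9} \]

for any vector field $\varphi \in \mathcal{D}([0,T)\times\Omega; R^3)$, where the gravitational potential $\Psi$ is given by (2.5). Note that, as a direct consequence of (7.1) and the standard elliptic theory,

\[ \Delta^{-1}[1_\Omega\varrho_\delta] \to \Delta^{-1}[1_\Omega\varrho] \ \text{ in } C([0,T]\times\overline{\Omega}). \tag{7.10} \]

The symbol $\overline{S}$ denotes a weak limit in $L^p(0,T; L^p(\Omega; R^{3\times 3}_{sym}))$, p > 1, of the approximate viscosity tensors $S_\delta$ specified in (6.25). Clearly, relation (7.9) will coincide with the (variational) momentum equation (2.4) as soon as we show strong (pointwise) convergence of the sequences $\{\varrho_\delta\}_{\delta>0}$ and $\{\vartheta_\delta\}_{\delta>0}$. This issue will be addressed in the subsequent two sections.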

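8. Entropy Inequality and Strong Convergence of the Temperature

8.1. Entropy inequality and time oscillations. In order to extract a piece of information on the time oscillations of the sequence $\{\vartheta_\delta\}_{\delta>0}$, we shall use the (rather poor) estimates on $\partial_t(\varrho_\delta\,s(\varrho_\delta,\vartheta_\delta))$ provided by the approximate entropy balance (5.27). To begin with, it follows from (4.12), (4.13) that

\[ |\varrho_\delta\,s(\varrho_\delta,\vartheta_\delta)| \le c\big(\vartheta_\delta^3 + \varrho_\delta|\log(\varrho_\delta)| + \varrho_\delta|\log(\vartheta_\delta)|\big); \]

therefore, by virtue of the uniform estimates (6.4), (6.5), (6.9), and (6.22), we can assume

\[ \varrho_\delta\,s(\varrho_\delta,\vartheta_\delta) \to \overline{\varrho s(\varrho,\vartheta)} \ \text{ weakly in } L^p((0,T)\times\Omega), \tag{8.1} \]

\[ \varrho_\delta\,s(\varrho_\delta,\vartheta_\delta)\,u_\delta \to \overline{\varrho s(\varrho,\vartheta)}\,u \ \text{ weakly in } L^p((0,T)\times\Omega; R^3) \tag{8.2} \]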

for a certain p > 1. Similarly, one can estimate the entropy flux q + δϑδ−1 ∇x ϑδ ≤ c |∇x log(ϑδ )| + ϑδ2 |∇x ϑδ | + δϑδ−1 |∇x ϑδ | , ϑδ where ϑδ2 |∇x ϑδ | =

3 √ 2 23 2√ s (1−s) 2 ϑδ |∇x ϑδ2 |, δϑδ−1 ∇x ϑδ = δ |∇x ϑδ2 | δϑδ 2 ϑδ . 3

Choosing the parameter 0 < s < 1 small enough so that (1 − s)

= 1, 2

we have, by virtue of Hölder’s inequality,

δϑδ−1 ∇x ϑδ L p (;R 3 ) ≤ c

√

δ∇x ϑδ2 L 2 (;R 3 ) ϑδ L 4 ()

√

s

δϑδ 2 L 6 () .

(8.3)


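Thus we can use estimates (6.5), (6.15) together with the imbedding $W^{1,2}(\Omega) \to L^6(\Omega)$ in order to infer

\[ \delta\vartheta_\delta^{\Gamma-1}\nabla_x\vartheta_\delta \to 0 \ \text{ in } L^p((0,T)\times\Omega; R^3) \text{ for a certain } p > 1. \tag{8.4} \]

Moreover, using similar arguments one can also show that

\[ \Big\{\frac{q_\delta}{\vartheta_\delta}\Big\}_{\delta>0} \ \text{ is bounded in } L^p((0,T)\times\Omega; R^3), \text{ for a certain } p > 1. \tag{8.5} \]

On the other hand, in accordance with (6.13),

\[ b(\vartheta_\delta) \to \overline{b(\vartheta)} \ \text{ weakly in } L^q((0,T)\times\Omega), \text{ and weakly in } L^2(0,T; W^{1,2}(\Omega)) \tag{8.6} \]

for any finite q ≥ 1 provided both $b$ and $b'$ are uniformly bounded. Now, as a straightforward consequence of the entropy balance (5.27), we have

\[ \mathrm{Div}_{t,x}\Big[\varrho_\delta\,s(\varrho_\delta,\vartheta_\delta) + \delta\varrho_\delta\log(\vartheta_\delta), \ \big(\varrho_\delta\,s(\varrho_\delta,\vartheta_\delta) + \delta\varrho_\delta\log(\vartheta_\delta)\big)u_\delta + \frac{q_\delta}{\vartheta_\delta} - \delta\vartheta_\delta^{\Gamma-1}\nabla_x\vartheta_\delta\Big] \ge 0 \ \text{ in } \mathcal{D}'((0,T)\times\Omega), \]

while (8.6) yields

\[ \mathrm{Curl}_{t,x}\big[b(\vartheta_\delta), 0, 0, 0\big] \ \text{ bounded in } L^2((0,T)\times\Omega; R^{4\times 4}). \]

Thus a direct application of the celebrated Div-Curl lemma (see, for instance, [37]) gives rise to the relation

\[ \overline{\varrho s(\varrho,\vartheta)\,b(\vartheta)} = \overline{\varrho s(\varrho,\vartheta)}\ \overline{b(\vartheta)} \tag{8.7} \]

for any b as in (8.6). Moreover, seeing that the sequence $\{\varrho_\delta\,s(\varrho_\delta,\vartheta_\delta)\,\vartheta_\delta\}_{\delta>0}$ is bounded in $L^p((0,T)\times\Omega)$, we deduce from (8.7) a more concise statement

\[ \overline{\varrho s(\varrho,\vartheta)\,\vartheta} = \overline{\varrho s(\varrho,\vartheta)}\,\vartheta. \tag{8.8} \]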

8.2. Parametrized measures, monotonicity, and pointwise convergence of the temperature. Our goal is to show that (8.8) necessarily implies strong (pointwise) convergence of the sequence $\{\vartheta_\delta\}_{\delta>0}$. First of all, let us remark that the approximate solutions solve the renormalized continuity equation

\[ \partial_t b(\varrho_\delta) + \mathrm{div}_x\big(b(\varrho_\delta)u_\delta\big) + \big(b'(\varrho_\delta)\varrho_\delta - b(\varrho_\delta)\big)\,\mathrm{div}_x u_\delta = 0 \ \text{ in } \mathcal{D}'((0,T)\times R^3), \tag{8.9} \]

provided $\varrho_\delta$, $u_\delta$ are extended to be zero outside Ω, and for any continuously differentiable function b whose derivative vanishes for large arguments. The functions $\varrho_\delta$ being square-integrable because of the presence of the artificial pressure in the energy equality, Eq. (8.9) follows from (5.23) via the regularization technique of DiPerna and P.-L. Lions [11].


Now it follows from (8.9) that p

b(δ ) → b() in C([0, T ]; L weak ()) for any finite p > 1 and bounded b. (8.10) Relation (8.10) combined with (8.6) yields g()h(ϑ) = g() h(ϑ), or, in terms of the corresponding parametrized measures t,x (, ϑ) = t,x () ⊗ t,x (ϑ)

(8.11)

(cf. Sect. 3.3). Relation (8.11) says that oscillations (if any) in the sequences {δ }δ>0 and {ϑδ }δ>0 are “orthogonal”, that means, the parametrized measure associated to {δ , ϑδ }δ>0 can be written as a tensor product of the parametrized measures generated by {δ }δ>0 and {ϑδ }δ>0 . Consider a function H = H (t, x, r, z) defined for t ∈ (0, T ), x ∈ , and (r, z) ∈ R 2 through formula H (t, x, r, z) = r s(r, z) − s(r, ϑ(t, x) z − ϑ(t, x) . Clearly, H is a Caratheodory function, more specifically, H (t, x, ·, ·) is continuous for a.a (t, x) ∈ (0, T ) × , and H (·, ·, r, z) is measurable for any (r, z) ∈ R 2 . Moreover, as both s R and s F are increasing functions of the absolute temperature, we have 4 H (t, x, r, z) ≥ a z 3 − ϑ 3 (t, x) z − ϑ(t, x) ≥ 0. (8.12) 3 At this stage, we use a crucial observation proved rigorously in Theorem 6.2 in [32], namely that weak limits of Caratheodory functions can be characterized in terms of the associated parametrized measure. Accordingly, we obtain T lim ϕ(t, x)H (t, x, δ , ϑδ ) dx dt δ→0 0

=

T

T

T

0

= 0

ϕ(t, x)

− 0

=

T

T

0

= 0

ϕ(t, x)

R2

ϕ(t, x)

ϕ(t, x)

R2

H (t, x, , ϑ) dt,x (, ϑ) dx dt s(, ϑ)(ϑ − ϑ(t, x))dt,x (, ϑ) dx dt

R2

R2

s(, ϑ(t, x))(ϑ − ϑ(t, x))dt,x (, ϑ) dx dt

s(, ϑ)(ϑ − ϑ(t, x))dt,x (, ϑ) dx dt

ϕ s(, ϑ)ϑ − s(, ϑ)ϑ dx dt = 0 for any ϕ ∈ D((0, T ) × ,

where we have used (8.8) to get the last equality together with (8.11) in order to observe that s(, ϑ(t, x))(ϑ − ϑ(t, x)) dt,x (, ϑ) R2 = s(, ϑ(t, x)) dt,x () (ϑ − ϑ(t, x)) dt,x (ϑ) = 0. R

R


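In particular, we deduce from (8.12) that

\[ \overline{\vartheta^3\,\vartheta} = \overline{\vartheta^3}\,\vartheta, \]

which is equivalent to the desired result

\[ \vartheta_\delta \to \vartheta \ \text{ in } L^4((0,T)\times\Omega). \tag{8.13} \]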

9. Pointwise Convergence of Densities

9.1. The effective viscous pressure. The problem of pointwise (strong) convergence of the sequence $\{\varrho_\delta\}_{\delta>0}$ represents one of the most delicate points of the present theory. Let us start with the celebrated and nowadays well-established result of P.-L. Lions [27] on the effective viscous pressure. In the present setting, it can be concisely stated in terms of the parametrized measures as follows:

\[ \psi\,\overline{p(\varrho,\vartheta)\,b(\varrho)} - \psi\,\overline{p(\varrho,\vartheta)}\ \overline{b(\varrho)} = \overline{\mathcal{R}:[\psi S]\,b(\varrho)} - \overline{\mathcal{R}:[\psi S]}\ \overline{b(\varrho)} \tag{9.1} \]

for any $\psi \in \mathcal{D}(\Omega)$, and any bounded continuous function b, where $\mathcal{R} = \mathcal{R}_{i,j}$ is a pseudodifferential operator defined by means of the Fourier transform:

\[ \mathcal{R}_{i,j}[v] = \partial_{x_i}\Delta^{-1}\partial_{x_j}v = \mathcal{F}^{-1}_{\xi\to x}\Big[\frac{\xi_i\xi_j}{|\xi|^2}\,\mathcal{F}_{x\to\xi}[v]\Big]. \tag{9.2} \]

Note that (9.1) is independent of the specific form of the constitutive relations for p, S and requires only satisfaction of the momentum equation (5.25) together with the renormalized continuity equation (8.9) for $\varrho_\delta$, $u_\delta$. For a detailed proof of (9.1), the reader may consult Chap. 6, Formula (6.17) in [18].

9.2. Commutator estimates. If the viscosity coefficients ν and η were constant, we would have

\[ \mathcal{R}:[S] = \Big(\frac{4}{3}\nu + \eta\Big)\mathrm{div}_x u, \]

and (9.1) would become immediately

\[ \overline{p(\varrho,\vartheta)\,b(\varrho)} - \overline{p(\varrho,\vartheta)}\ \overline{b(\varrho)} = \Big(\frac{4}{3}\nu + \eta\Big)\overline{\mathrm{div}_x u\ b(\varrho)} - \Big(\frac{4}{3}\nu + \eta\Big)\mathrm{div}_x u\ \overline{b(\varrho)}. \tag{9.3} \]

The quantity $p - (\frac{4}{3}\nu + \eta)\,\mathrm{div}_x u$ is usually called the effective viscous pressure, and (9.3) coincides with the original equation discovered in [27]. In order to establish the same relation for variable viscosity coefficients, we use a slightly modified version of the approach proposed in [19]. To this end, we shall write

\[ \mathcal{R}:[\psi S] = \psi\Big(\frac{4}{3}\nu + \eta\Big)\mathrm{div}_x u + \Big\{\mathcal{R}:[\psi S] - \psi\Big(\frac{4}{3}\nu + \eta\Big)\mathrm{div}_x u\Big\}, \]

where the quantity in the curly brackets is a commutator fitting in the framework of the theory developed by Coifman and Meyer [7].


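In particular, by virtue of Lemma 4.2 in [19], the quantity

\[ \mathcal{R}:\Big[\psi\big(\nu(\vartheta_\delta,B_\delta)\langle\nabla_x u_\delta\rangle + \eta(\vartheta_\delta,B_\delta)\,\mathrm{div}_x u_\delta\,\mathbb{I}\big)\Big] - \psi\Big(\frac{4}{3}\nu(\vartheta_\delta,B_\delta) + \eta(\vartheta_\delta,B_\delta)\Big)\mathrm{div}_x u_\delta \]

is bounded in the space $L^2(0,T; W^{\omega,p}(\Omega))$ for suitable 0 < ω < 1, p > 1 in terms of the bounds established in (6.16), (6.17), and (6.24), provided the viscosity coefficients are globally Lipschitz functions of their arguments. If this is the case, one can use (8.10) in order to deduce the desired relation

\[ \overline{\mathcal{R}:[\psi S]\,b(\varrho)} - \overline{\mathcal{R}:[\psi S]}\ \overline{b(\varrho)} = \overline{\psi\Big(\frac{4}{3}\nu + \eta\Big)\mathrm{div}_x u\ b(\varrho)} - \overline{\psi\Big(\frac{4}{3}\nu + \eta\Big)\mathrm{div}_x u}\ \overline{b(\varrho)} \]
\[ = \psi\Big(\frac{4}{3}\nu + \eta\Big)\overline{\mathrm{div}_x u\ b(\varrho)} - \psi\Big(\frac{4}{3}\nu + \eta\Big)\mathrm{div}_x u\ \overline{b(\varrho)}, \]

where the last equality is a direct consequence of the pointwise convergence proved in (7.8), (8.13).

If ν and η are only continuously differentiable as required by the hypotheses of Theorem 3.1, one can write

\[ \left. \begin{array}{l} \nu(\vartheta,B) = Y(\vartheta,B)\,\nu(\vartheta,B) + (1 - Y(\vartheta,B))\,\nu(\vartheta,B), \\[4pt] \eta(\vartheta,B) = Y(\vartheta,B)\,\eta(\vartheta,B) + (1 - Y(\vartheta,B))\,\eta(\vartheta,B), \end{array} \right\} \tag{9.4} \]

where $Y \in \mathcal{D}(R^2)$ is a suitable function. Now we have

\[ \nu(\vartheta_\delta,B_\delta)\langle\nabla_x u_\delta\rangle + \eta(\vartheta_\delta,B_\delta)\,\mathrm{div}_x u_\delta\,\mathbb{I} = Y(\vartheta_\delta,B_\delta)\,\nu(\vartheta_\delta,B_\delta)\langle\nabla_x u_\delta\rangle + Y(\vartheta_\delta,B_\delta)\,\eta(\vartheta_\delta,B_\delta)\,\mathrm{div}_x u_\delta\,\mathbb{I} \]
\[ + \Big\{(1 - Y(\vartheta_\delta,B_\delta))\,\nu(\vartheta_\delta,B_\delta)\langle\nabla_x u_\delta\rangle + (1 - Y(\vartheta_\delta,B_\delta))\,\eta(\vartheta_\delta,B_\delta)\,\mathrm{div}_x u_\delta\,\mathbb{I}\Big\}, \]

where the expression in the curly brackets can be made arbitrarily small in the norm of $L^p((0,T)\times\Omega)$, with a certain p > 1, by a suitable choice of Y (see estimates (6.5), (6.12), (6.17), and formula (6.25)). Thus relation (9.3) holds for any ν and η satisfying the hypotheses of Theorem 3.1.

9.3. The oscillations defect measure. The most suitable tool for describing possible oscillations in the sequence $\{\varrho_\delta\}_{\delta>0}$ is the renormalized continuity equation (2.3) together with its counterpart resulting from letting δ → 0 in (8.9). Although we have already shown in Sect. 7.1 that the limit quantities $\varrho$, $u$ satisfy the continuity equation (2.1), the validity of its renormalization (2.3) is not obvious because the sequence $\{\varrho_\delta\}_{\delta>0}$ is not known to be uniformly square integrable and the machinery of [11] does not work.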
In order to solve this problem, a concept of oscillations defect measure was introduced in [17]. To be more specific, we set

\[ \mathrm{osc}_q[\varrho_\delta \to \varrho](Q) = \sup_{k\ge 1}\Big(\limsup_{\delta\to 0}\int_Q |T_k(\varrho_\delta) - T_k(\varrho)|^q\,dx\,dt\Big), \tag{9.5} \]

where $T_k$ are the cut-off functions, $T_k(z) = \mathrm{sgn}(z)\min\{|z|, k\}$. As shown in [17], the limit functions $\varrho$, $u$ solve the renormalized equation (2.3) provided

• $\varrho_\delta$, $u_\delta$ satisfy (8.9);
• $\{u_\delta\}_{\delta>0}$ is bounded in $L^2(0,T; W^{1,2}(\Omega))$;
• $\mathrm{osc}_q[\varrho_\delta \to \varrho]((0,T)\times\Omega) < \infty$ for a certain q > 2. (9.6)

Accordingly, in order to establish (2.3), it is enough to show (9.6).

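9.4. Weighted estimates of the oscillations defect measure. In order to show (9.6), we use a new method based on weighted estimates, where the corresponding weight function depends on the absolute temperature. Taking $b = T_k$ in (9.3) we get

\[ \overline{p(\varrho,\vartheta)\,T_k(\varrho)} - \overline{p(\varrho,\vartheta)}\ \overline{T_k(\varrho)} = \Big(\frac{4}{3}\nu(\vartheta,B) + \eta(\vartheta,B)\Big)\overline{\mathrm{div}_x u\ T_k(\varrho)} - \Big(\frac{4}{3}\nu(\vartheta,B) + \eta(\vartheta,B)\Big)\mathrm{div}_x u\ \overline{T_k(\varrho)}. \tag{9.7} \]

As observed in (4.8), there is a positive constant $p_c$ such that $p_F(\varrho,\vartheta) - p_c\,\varrho^{5/3}$ is a non-decreasing function of $\varrho$ for any ϑ. Accordingly, we have

\[ \overline{p(\varrho,\vartheta)\,T_k(\varrho)} - \overline{p(\varrho,\vartheta)}\ \overline{T_k(\varrho)} \ge p_c\Big(\overline{\varrho^{5/3}\,T_k(\varrho)} - \overline{\varrho^{5/3}}\ \overline{T_k(\varrho)}\Big). \tag{9.8} \]

Now, let us choose a weight function

\[ w \in C^1[0,\infty), \quad w(\vartheta) > 0 \ \text{ for } \vartheta \ge 0, \quad w(\vartheta) = \vartheta^{-\frac{1+\alpha}{2}} \ \text{ for } \vartheta \ge 1, \tag{9.9} \]

where α is the exponent appearing in hypothesis (1.33). Multiplying (9.7) by w(ϑ) and using (9.8) we obtain

\[ w(\vartheta)\Big(\overline{\varrho^{5/3}\,T_k(\varrho)} - \overline{\varrho^{5/3}}\ \overline{T_k(\varrho)}\Big) \le \frac{w(\vartheta)}{p_c}\Big(\frac{4}{3}\nu(\vartheta,B) + \eta(\vartheta,B)\Big)\overline{\mathrm{div}_x u\ T_k(\varrho)} - \frac{w(\vartheta)}{p_c}\Big(\frac{4}{3}\nu(\vartheta,B) + \eta(\vartheta,B)\Big)\mathrm{div}_x u\ \overline{T_k(\varrho)}, \tag{9.10} \]

where the right-hand side is a weak limit of

\[ \frac{w(\vartheta_\delta)}{p_c}\Big(\frac{4}{3}\nu(\vartheta_\delta,B_\delta) + \eta(\vartheta_\delta,B_\delta)\Big)\mathrm{div}_x u_\delta\,\big(T_k(\varrho_\delta) - T_k(\varrho)\big) + \frac{w(\vartheta)}{p_c}\Big(\frac{4}{3}\nu(\vartheta,B) + \eta(\vartheta,B)\Big)\mathrm{div}_x u\,\big(T_k(\varrho) - \overline{T_k(\varrho)}\big). \]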


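Consequently, employing the uniform weighted estimates (6.12), together with hypothesis (1.13) and the growth restriction on w specified in (9.9), we conclude

\[ \int_0^T\!\!\int_\Omega \Big[\frac{w(\vartheta)}{p_c}\Big(\frac{4}{3}\nu(\vartheta,B) + \eta(\vartheta,B)\Big)\overline{\mathrm{div}_x u\ T_k(\varrho)} - \frac{w(\vartheta)}{p_c}\Big(\frac{4}{3}\nu(\vartheta,B) + \eta(\vartheta,B)\Big)\mathrm{div}_x u\ \overline{T_k(\varrho)}\Big]\,dx\,dt \le c\,\liminf_{\delta\to 0}\|T_k(\varrho_\delta) - T_k(\varrho)\|_{L^2((0,T)\times\Omega)}, \tag{9.11} \]

with c independent of k. On the other hand, it was shown in Chap. 6 in [18] that the left-hand side of (9.10) can be bounded below as

\[ \int_0^T\!\!\int_\Omega w(\vartheta)\Big(\overline{\varrho^{5/3}\,T_k(\varrho)} - \overline{\varrho^{5/3}}\ \overline{T_k(\varrho)}\Big)\,dx\,dt \ge \limsup_{\delta\to 0}\int_0^T\!\!\int_\Omega w(\vartheta)\,|T_k(\varrho_\delta) - T_k(\varrho)|^{8/3}\,dx\,dt; \tag{9.12} \]

whence (9.10), together with (9.11), (9.12), give rise to the weighted estimate

\[ \limsup_{\delta\to 0}\int_0^T\!\!\int_\Omega w(\vartheta)\,|T_k(\varrho_\delta) - T_k(\varrho)|^{8/3}\,dx\,dt \le c\,\liminf_{\delta\to 0}\|T_k(\varrho_\delta) - T_k(\varrho)\|_{L^2((0,T)\times\Omega)}. \tag{9.13} \]

Now, taking

\[ q > 2, \quad \omega = \frac{3q}{8}\,\frac{1+\alpha}{2}, \]

we can use Hölder’s inequality to obtain

\[ \int_0^T\!\!\int_\Omega |T_k(\varrho_\delta) - T_k(\varrho)|^q\,dx\,dt = \int_0^T\!\!\int_\Omega |T_k(\varrho_\delta) - T_k(\varrho)|^q\,(1+\vartheta)^{-\omega}(1+\vartheta)^{\omega}\,dx\,dt \]
\[ \le c\Big(\int_0^T\!\!\int_\Omega |T_k(\varrho_\delta) - T_k(\varrho)|^{8/3}\,(1+\vartheta)^{-\frac{1+\alpha}{2}}\,dx\,dt + \int_0^T\!\!\int_\Omega (1+\vartheta)^{\frac{3q(1+\alpha)}{2(8-3q)}}\,dx\,dt\Big). \tag{9.14} \]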
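The exponents in (9.14) can be checked pointwise with Young's inequality. In the sketch below the abbreviations $a = |T_k(\varrho_\delta) - T_k(\varrho)|$ and $\theta = 1 + \vartheta$ are introduced only for this computation, and the restriction $2 < q < 8/3$ is an assumption needed for the two exponents to be conjugate.

```latex
% Conjugate exponents m = 8/(3q) and m' = 8/(8-3q), valid for 2 < q < 8/3:
\[
a^{q} = \big(a^{q}\theta^{-\omega}\big)\,\theta^{\omega}
  \;\le\; \frac{3q}{8}\,\big(a^{q}\theta^{-\omega}\big)^{\frac{8}{3q}}
  + \frac{8-3q}{8}\,\theta^{\frac{8\omega}{8-3q}}
  = \frac{3q}{8}\,a^{8/3}\,\theta^{-\frac{8\omega}{3q}}
  + \frac{8-3q}{8}\,\theta^{\frac{8\omega}{8-3q}} .
\]
% With \omega = \frac{3q}{8}\cdot\frac{1+\alpha}{2} the two powers of \theta become
\[
\frac{8\omega}{3q} = \frac{1+\alpha}{2},
\qquad
\frac{8\omega}{8-3q} = \frac{3q\,(1+\alpha)}{2\,(8-3q)} .
\]
```

Integrating over $(0,T)\times\Omega$ reproduces the two integrals on the right-hand side of (9.14), with a constant that depends only on $q$.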

On the other hand, in accordance with (6.5), (6.17),

\[ \vartheta \in L^\infty(0,T; L^4(\Omega)) \cap L^3(0,T; L^9(\Omega)), \]

and a simple interpolation argument yields

\[ \vartheta \in L^r((0,T)\times\Omega), \ \text{ with } r = \frac{46}{9}. \]

Consequently, one can find q > 2 such that

\[ \frac{3q(1+\alpha)}{2(8-3q)} = r = \frac{46}{9} \]

provided α complies with hypothesis (3.1). Thus relations (9.13), (9.14) give rise to the desired estimate

\[ \limsup_{\delta\to 0}\int_0^T\!\!\int_\Omega w(\vartheta)\,|T_k(\varrho_\delta) - T_k(\varrho)|^q\,dx\,dt \le c\Big(1 + \liminf_{\delta\to 0}\|T_k(\varrho_\delta) - T_k(\varrho)\|_{L^2((0,T)\times\Omega)}\Big) \]

yielding (9.6).

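9.5. Propagation of oscillations and strong convergence. Having proved (9.6) we have the renormalized continuity equation (2.3) satisfied by the limit functions $\varrho$, $u$, in particular,

\[ \partial_t L_k(\varrho) + \mathrm{div}_x\big(L_k(\varrho)u\big) + T_k(\varrho)\,\mathrm{div}_x u = 0 \ \text{ in } \mathcal{D}'((0,T)\times R^3), \tag{9.15} \]

provided $\varrho$, $u$ have been extended to be zero outside Ω, where

\[ L_k(\varrho) = \varrho\int_1^{\varrho}\frac{T_k(z)}{z^2}\,dz. \]

On the other hand, one can let δ → 0 in (8.9) to obtain

\[ \partial_t\overline{L_k(\varrho)} + \mathrm{div}_x\big(\overline{L_k(\varrho)}\,u\big) + \overline{T_k(\varrho)\,\mathrm{div}_x u} = 0 \ \text{ in } \mathcal{D}'((0,T)\times R^3). \tag{9.16} \]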
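Equation (9.15) is the renormalized continuity equation with the particular choice $b = L_k$. Assuming the general form $\partial_t b(\varrho) + \mathrm{div}_x(b(\varrho)u) + (b'(\varrho)\varrho - b(\varrho))\,\mathrm{div}_x u = 0$ and writing $\varrho$ for the density, the following check shows why the cut-off $T_k$ appears as the coefficient of the divergence.

```latex
\[
L_k(\varrho) = \varrho \int_1^{\varrho} \frac{T_k(z)}{z^2}\,dz
\quad\Longrightarrow\quad
L_k'(\varrho) = \int_1^{\varrho} \frac{T_k(z)}{z^2}\,dz + \frac{T_k(\varrho)}{\varrho},
\qquad
L_k'(\varrho)\,\varrho - L_k(\varrho) = T_k(\varrho) .
\]
```

The same computation applied to the approximate densities, followed by the limit passage δ → 0, accounts for the weak-limit term in (9.16).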

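Now, following step by step Chap. 6 in [18], we take the difference of (9.15), (9.16) and use (9.7), (9.8) to deduce

\[ \int_\Omega \big(\overline{L_k(\varrho)} - L_k(\varrho)\big)(\tau)\,dx \le \int_0^\tau\!\!\int_\Omega \mathrm{div}_x u\,\big(T_k(\varrho) - \overline{T_k(\varrho)}\big)\,dx\,dt \ \text{ for any } \tau \in [0,T], \]

where the right-hand side vanishes for k → ∞ due to (9.6). Consequently,

\[ \overline{\varrho\log(\varrho)} = \varrho\log(\varrho) \ \text{ in } (0,T)\times\Omega, \]

a relation equivalent to strong convergence of $\{\varrho_\delta\}_{\delta>0}$, that is,

\[ \varrho_\delta \to \varrho \ \text{ in } L^1((0,T)\times\Omega). \tag{9.17} \]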


10. Conclusion

Our ultimate goal is to complete the proof of Theorem 3.1. Note that we have already shown that $\varrho$, $u$ satisfy the continuity equation (2.1) as well as its renormalized version (2.3). Moreover, it is easy to see that the “averaged” momentum equation (7.9) coincides in fact with (2.4) in view of the strong convergence results established in (7.8), (8.13), and (9.17). By the same token, one can pass to the limit in the energy equality (5.29) in order to obtain (2.12). Note that the regularizing δ-dependent terms on the left-hand side disappear by virtue of the estimates (6.4), (6.5), and (6.28).

Similarly, one can handle Maxwell’s equation (5.28). Here, the only thing to observe is that the terms $\lambda(\varrho_\delta,\vartheta_\delta,B_\delta)\,\mathrm{curl}_x B_\delta$ are bounded in $L^p((0,T)\times\Omega)$, for a certain p > 1, uniformly with respect to δ. Indeed, such a bound can be obtained exactly as in (6.25).

To conclude, we have to deal with the entropy inequality (5.27). First of all, the extra terms on the left-hand side tend to zero because of (6.4), (6.5), (6.22), and (8.4). Furthermore, it is standard to pass to the limit in the production rate keeping the correct sense of the inequality, as all terms are convex with respect to the spatial gradients of u, ϑ, and B. Finally, the limit in the “logarithmic” terms can be carried out thanks to estimate (6.22) and the following result (see Lemma 5.4 in [13]):

Lemma 10.1. Let $\Omega \subset R^N$ be a bounded Lipschitz domain. Assume that

\[ \vartheta_\delta \to \vartheta \ \text{ in } L^2((0,T)\times\Omega) \ \text{ and } \ \log(\vartheta_\delta) \to \overline{\log(\vartheta)} \ \text{ weakly in } L^2((0,T)\times\Omega). \]

Then ϑ is positive a.a. on $(0,T)\times\Omega$ and $\log(\vartheta) = \overline{\log(\vartheta)}$.

References

1. Ball, J.M.: A version of the fundamental theorem for Young measures. In Lect. Notes in Physics 344, Berlin-Heidelberg-New York: Springer-Verlag, 1989, pp. 207–215
2. Becker, E.: Gasdynamik. Stuttgart: Teubner-Verlag, 1966
3. Besse, C., Claudel, J., Degond, P., Deluzet, F., Gallice, G., Tesserias, C.: A model hierarchy for ionospheric plasma modeling. Math. Models Meth. Appl. Sci. 14, 393–415 (2004)
4. Bogovskii, M. E.: Solution of some vector analysis problems connected with operators div and grad (in Russian). Trudy Sem. S.L. Sobolev 80(1), 5–40 (1980)
5. Buet, C., Després, B.: Asymptotic analysis of fluid models for the coupling of radiation and hydrodynamics. J. Quant. Spect. and Rad. Trans. 85(3–4), 385–418 (2004)
6. Chapman, S., Cowling, T. G.: Mathematical theory of non-uniform gases. Cambridge: Cambridge Univ. Press, 1990
7. Coifman, R., Meyer, Y.: On commutators of singular integrals and bilinear singular integrals. Trans. Amer. Math. Soc. 212, 315–331 (1975)
8. Cox, J.P., Giuli, R.T.: Principles of stellar structure, I.,II. New York: Gordon and Breach, 1968
9. Davidson, P. A.: Turbulence: An introduction for scientists and engineers. Oxford: Oxford University Press, 2004
10. Degond, P., Lemou, M.: On the viscosity and thermal conduction of fluids with multivalued internal energy. Euro. J. Mech. B- Fluids 20, 303–327 (2001)
11. DiPerna, R.J., Lions, P.-L.: Ordinary differential equations, transport theory and Sobolev spaces. Invent. Math. 98, 511–547 (1989)
12. Duchon, J., Robert, R.: Inertial energy dissipation for weak solutions of incompressible Euler and Navier-Stokes equations. Nonlinearity 13, 249–255 (2000)
13. Ducomet, B., Feireisl, E.: On the dynamics of gaseous stars. Arch. Rational Mech. Anal. 174, 221–266 (2004)


14. Duvaut, G., Lions, J.-L.: Inequalities in mechanics and physics. Heidelberg: Springer-Verlag, 1976
15. Eliezer, S., Ghatak, A., Hora, H.: An introduction to equations of states, theory and applications. Cambridge: Cambridge University Press, 1986
16. Eyink, G.L.: Local 4/5 law and energy dissipation anomaly in turbulence. Nonlinearity 16, 137–145 (2003)
17. Feireisl, E.: On compactness of solutions to the compressible isentropic Navier-Stokes equations when the density is not square integrable. Comment. Math. Univ. Carolinae 42(1), 83–98 (2001)
18. Feireisl, E.: Dynamics of viscous compressible fluids. Oxford: Oxford University Press, 2003
19. Feireisl, E.: On the motion of a viscous, compressible, and heat conducting fluid. Indiana Univ. Math. J. 53, 1707–1740 (2004)
20. Feireisl, E.: Stability of flows of real monoatomic gases. Commun. Partial Differ. Eqs. 31, 325–348 (2006)
21. Feireisl, E., Petzeltová, H.: On integrability up to the boundary of the weak solutions of the Navier-Stokes equations of compressible flow. Commun. Partial Differ. Eqs. 25(3–4), 755–767 (2000)
22. Galdi, G.P.: An introduction to the mathematical theory of the Navier-Stokes equations, I. New York: Springer-Verlag, 1994
23. Gallavotti, G.: Statistical mechanics: A short treatise. Heidelberg: Springer-Verlag, 1999
24. Giovangigli, V.: Multicomponent flow modeling. Basel: Birkhäuser, 1999
25. Hoff, D., Jenssen, H.K.: Symmetric nonbarotropic flows with large data and forces. Arch. Rational Mech. Anal. 173, 297–343 (2004)
26. Jeffrey, A.N., Taniuti, T.: Non-linear wave propagation. New York: Academic Press, 1964
27. Lions, P.-L.: Mathematical topics in fluid dynamics, Vol. 2, Compressible models. Oxford: Oxford Science Publication, 1998
28. Lions, P.-L.: Bornes sur la densité pour les équations de Navier-Stokes compressible isentropiques avec conditions aux limites de Dirichlet. C.R. Acad. Sci. Paris, Sér I. 328, 659–662 (1999)
29. Mihalas, B., Weibel-Mihalas, B.: Foundations of radiation hydrodynamics. Dover: Dover Publications 1984
30. Müller, I., Ruggeri, T.: Rational extended thermodynamics. Springer Tracts in Natural Philosophy 37, Heidelberg: Springer-Verlag, 1998
31. Novotný, A., Straškraba, I.: Introduction to the theory of compressible flow. Oxford: Oxford University Press, 2004
32. Pedregal, P.: Parametrized measures and variational principles. Basel: Birkhäuser, 1997
33. Rajagopal, K.R., Srinivasa, A.R.: On thermodynamical restrictions of continua. Proc. Royal Soc. London A 460, 631–651 (2004)
34. Ruggeri, T., Trovato, M.: Hyperbolicity in extended thermodynamics of Fermi and Bose gases. Continuum Mech. Thermodyn. 16(6), 551–576 (2004)
35. Serre, D.: Variation de grande amplitude pour la densité d’un fluide visqueux compressible. Physica D 48, 113–128 (1991)
36. Shore, S.N.: An introduction to astrophysical hydrodynamics. New York: Academic Press, 1992
37. Tartar, L.: Compensated compactness and applications to partial differential equations. Nonlinear Anal. and Mech., Heriot-Watt Sympos., L.J. Knopps editor, Research Notes in Math. 39, Boston: Pitman, 1975, pp. 136–211
38. Temam, R.: Navier-Stokes equations. Amsterdam: North-Holland, 1977
39. Vaigant, V.A., Kazhikhov, A.V.: On the existence of global solutions to two-dimensional Navier-Stokes equations of a compressible viscous fluid (in Russian). Sibirskij Mat. Z. 36(6), 1283–1316 (1995)
40. Zahn, J.P., Zinn-Justin, J.: Astrophysical fluid dynamics, Les Houches, XLVII. Amsterdam: Elsevier, 1993
41. Zirin, H.: Astrophysics of the sun. Cambridge: Cambridge University Press, 1988

Communicated by P. Constantin

Commun. Math. Phys. 266, 631–645 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0058-5

Communications in

Mathematical Physics

A Stochastic Perturbation of Inviscid Flows Gautam Iyer Department of Mathematics, University of Chicago, Chicago, Illinois 60637, USA. E-mail: [email protected] Received: 4 May 2005 / Accepted: 12 February 2006 Published online: 18 July 2006 – © Springer-Verlag 2006

Abstract: We prove existence and regularity of the stochastic flows used in the stochastic Lagrangian formulation of the incompressible Navier-Stokes equations (with periodic boundary conditions), and consequently obtain a $C^{k,\alpha}$ local existence result for the Navier-Stokes equations. Our estimates are independent of viscosity, allowing us to consider the inviscid limit. We show that as $\nu \to 0$, solutions of the stochastic Lagrangian formulation (with periodic boundary conditions) converge to solutions of the Euler equations at the rate of $O(\sqrt{\nu t})$.

1. Introduction

Consider an incompressible inviscid fluid with velocity field $u$ in the absence of external forcing. The evolution of the velocity field is governed by the Euler [3] equations

$$\partial_t u + (u \cdot \nabla) u + \nabla p = 0, \tag{1.1}$$
$$\nabla \cdot u = 0. \tag{1.2}$$

Viscosity introduces a diffusive term in the Euler equations and Eq. (1.1) becomes

$$\partial_t u + (u \cdot \nabla) u - \nu \Delta u + \nabla p = 0. \tag{1.3}$$

The Kolmogorov backward and Feynman-Kac formulae [11] show that any linear, diffusive, second order PDE can be obtained by averaging out a stochastic perturbation of an ODE. The theory for non-linear PDEs is not as well developed. We are interested in interpreting the Navier-Stokes equations as the average of a suitable stochastic perturbation of the Euler equations. Many interesting non-linear PDEs have been interpreted as averages of stochastic processes, the simplest example being the Kolmogorov reaction diffusion equation [14]. In two dimensions the same is possible for the Navier-Stokes equations as the vorticity satisfies a standard Fokker-Planck equation. This combined with the Biot-Savart law


led to the random vortex method [16] and has been used and studied extensively. In three dimensions the problem is a little harder as the vorticity equation is no longer of Fokker-Planck type, and the non-linearity causes trouble. In [15] Le Jan and Sznitman used a backward in time branching process in Fourier space to express the Navier-Stokes equations as the expected value of a stochastic process. This approach led to a new existence theorem; it was later generalized [1], and physical space analogues were developed. An approach more along the lines of this paper was developed by Busnello, Flandoli and Romito [2] who considered ‘noisy’ flow paths, and used Girsanov transformations to recover the velocity field. They obtained the 3-dimensional Navier-Stokes equations in this form, and generalized their method to work for a general class of second order parabolic equations. A different technique was used by Gomes [9] to express the diffusive Lagrangian [5] as the expected value minimizer of a suitable functional. Finally we mention that similar systems have been considered by Jourdain et al. in [10].

Our approach$^1$ is to introduce a Brownian drift into the active vector formulation [4] of the Euler equations. Peter Constantin and the author showed [7] that this provides a physically meaningful, explicit stochastic representation of the Navier-Stokes equations. While long time dynamics of the system we consider are presently unknown, we hope that techniques used here will lead to control of the growth of certain quantities with non-zero probability. For example, we would like to find an exponential bound for $\nabla X$ which holds with non-zero probability. Finding an almost sure bound of this form will lead to global existence.

In this paper, we consider the flow given by the stochastic differential equation

$$dX = u\,dt + \sqrt{2\nu}\,dB \tag{1.4}$$

with initial data

$$X(a, 0) = a. \tag{1.5}$$

Here $\nu > 0$ represents the viscosity, and $B$ represents a 3-dimensional Wiener process (we use the letter $B$ to avoid confusion with the Weber operator). We recover the velocity field from $X$ by

$$A = X^{-1}, \tag{1.6}$$
$$u = \mathbf{E}\,\mathbf{P}\bigl[(\nabla^t A)(u_0 \circ A)\bigr], \tag{1.7}$$

where $\mathbf{E}$ denotes the expected value with respect to the Wiener measure, $\mathbf{P}$ denotes the Leray-Hodge projection [3] on divergence free vector fields, and $u_0$ is the deterministic initial data. We clarify that by $X^{-1}$ in Eq. (1.6) we mean the spatial inverse of $X$. We impose periodic boundary conditions, though all theorems proved here will also work if we work with the domain $\mathbb{R}^3$ and impose a decay at infinity condition instead.

The motivation for considering the above system arises from the fact that in the absence of viscosity, the system (1.4)–(1.7) reduces to

$$\partial_t A + (u \cdot \nabla) A = 0, \tag{1.8}$$
$$A(x, 0) = x, \tag{1.9}$$
$$u = \mathbf{P}\bigl[(\nabla^t A)(u_0 \circ A)\bigr]. \tag{1.10}$$
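As an informal illustration of the flow (1.4)–(1.5), the following is a minimal Euler–Maruyama sketch in Python. It is not the construction used in this paper: the velocity field is supplied by the caller rather than recovered from (1.6)–(1.7), and the names `simulate_flow` and `shear`, the step count and the sample field are all assumptions made for the example. The one structural feature it does reflect is that a single Wiener process $B$, independent of the label $a$, drives every particle.

```python
import numpy as np

def simulate_flow(u, a, nu, t_final, n_steps, rng):
    """Euler-Maruyama sketch for dX = u(X, t) dt + sqrt(2 nu) dB with X(a, 0) = a.

    u   : callable (positions (n, 3), time) -> velocities (n, 3); a hypothetical,
          user-supplied field (in the paper u is determined by (1.6)-(1.7))
    a   : array of initial positions, shape (n, 3)
    rng : numpy random Generator
    Every particle is driven by the same Brownian path B, which does not depend on a.
    """
    dt = t_final / n_steps
    x = np.array(a, dtype=float)
    for step in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt), size=3)  # one 3-d increment shared by all particles
        x = x + u(x, step * dt) * dt + np.sqrt(2.0 * nu) * dB
    return x

def shear(x, t):
    """Illustrative divergence-free field u = (sin(2 pi y), 0, 0) on the unit cube."""
    v = np.zeros_like(x)
    v[:, 0] = np.sin(2.0 * np.pi * x[:, 1])
    return v

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.random((5, 3))  # five particle labels a in [0, 1]^3
    print(simulate_flow(shear, a, nu=0.01, t_final=0.5, n_steps=200, rng=rng))
```

With `nu=0` the noise term vanishes and the loop reduces to a forward Euler step for the deterministic flow map underlying (1.8)–(1.10).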

1 In the original version of this paper, our intention was to propose this as a physically meaningful model for the Navier-Stokes equations. We presented a proof that the solution of the system considered here differs from the solution of the Navier-Stokes equations by $O(t^{3/2})$. Six months after submission of the original version of this paper, Peter Constantin and the author [7] discovered that the equations considered here are exactly equivalent to the Navier-Stokes equations.


Peter Constantin proved [4] that $u$ is a solution of the (deterministic) system (1.8)–(1.10) if and only if $u$ is a solution of the incompressible Euler equations (1.1)–(1.2) with initial data $u_0$. Thus the system (1.4)–(1.7) can be thought of as superimposing the Wiener process on the flow map, intuitively representing Brownian motion of fluid particles. Physically, the Brownian particle interaction is regarded as the source of viscosity, and the equivalence of (1.4)–(1.7) and the Navier-Stokes equations proved in [7] confirms this. We remark that Eq. (1.7) provides an explicit formula for $u$ in terms of the map $X$.

In this paper, we provide a self contained proof of a $C^{k,\alpha}$ local existence theorem for the stochastic system (1.4)–(1.7). The proof in [7] showing equivalence between Navier-Stokes and (1.4)–(1.7) relies crucially on spatial regularity of solutions as stated in Theorem 2.1. We remark that the stochastic representation of Busnello, Flandoli and Romito does not admit a self contained existence proof as we have here.

The estimates and the existence time can be chosen independent of the viscosity, thus enabling us to consider the vanishing viscosity limit. We show that as $\nu \to 0$, the solution of (1.4)–(1.7) converges to the solution of the Euler equations at the rate of $O(\sqrt{\nu t})$. We remark that the limit $\nu \to 0$ is not well understood in bounded domains using classical methods. We hope that this stochastic formulation (when extended to bounded domains) will give us a better handle on computing this limit.

In the next section, we establish our notational convention, and describe precisely the results we prove in this paper. In Sect. 3 we prove bounds on the Weber operator, which are essential to all proofs presented in this paper. In Sect. 4 we prove local existence for (1.4)–(1.7) and the vanishing viscosity limit. Finally, in Sect. 5, we digress and present an alternate proof of local existence for the Navier-Stokes equations using the diffusive Lagrangian formulation [5].

2. Notational Convention and Description of Results

In this section we describe the main results we prove. We begin by establishing our notational convention. We let $\mathcal{I}$ denote the cube $[0, L]^3$ with side of length $L$. We define the Hölder norms and semi-norms on $\mathcal{I}$ by

$$|u|_\alpha = \sup_{x, y \in \mathcal{I}} L^\alpha \frac{|u(x) - u(y)|}{|x - y|^\alpha},$$
$$\|u\|_{C^k} = \sum_{|m| \le k} L^{|m|} \sup_{\mathcal{I}} |D^m u|,$$
$$\|u\|_{k,\alpha} = \|u\|_{C^k} + \sum_{|m| = k} L^k |D^m u|_\alpha,$$

where $D^m$ denotes the derivative with respect to the multi-index $m$. We let $C^k$ denote the space of all $k$-times continuously differentiable spatially periodic functions on $\mathcal{I}$, and $C^{k,\alpha}$ denote the space of all spatially periodic $k + \alpha$ Hölder continuous functions. The spaces $C^k$ and $C^{k,\alpha}$ are endowed with the norms $\|\cdot\|_{C^k}$ and $\|\cdot\|_{k,\alpha}$ respectively. We use $I$ to denote the identity function on $\mathbb{R}^3$ or $\mathcal{I}$ (depending on the context), and use $\mathbb{I}$ to denote the identity matrix.

The first theorem we prove addresses local (in time) existence for the system (1.4)–(1.7):

Theorem 2.1. Let $k \ge 1$ and $u_0 \in C^{k+1,\alpha}$ be divergence free. There exists a time $T = T(k, \alpha, L, \|u_0\|_{k+1,\alpha})$, but independent of viscosity, and a pair of functions $\lambda, u \in C([0, T], C^{k+1,\alpha})$ such that $u$ and $X = I + \lambda$ satisfy the system (1.4)–(1.7). Further $\exists\, U = U(k, \alpha, L, \|u_0\|_{k+1,\alpha})$ such that $t \in [0, T] \implies \|u(t)\|_{k+1,\alpha} \le U$.

We prove this theorem in Sect. 4. Our proof will also give a local existence result for the Euler equations, or any stochastic perturbation similar to the one considered here. We remark that the estimates required for this theorem along with Constantin’s diffusive Lagrangian formulation [5] also give us local existence for the Navier-Stokes equations. In Sect. 5, we digress and present this proof.

We remark that Theorem 2.1 is still true when $k = 0$. The only modification we need to make to our proof is to the inequalities in Lemma 4.1, which we do not carry out here.

Since our estimates and local existence time are independent of viscosity, we can address the question of convergence in the limit $\nu \to 0$.

Proposition 2.1. Let $u_0 \in C^{k+1,\alpha}$ be divergence free, and $U$, $T$ be as in Theorem 2.1. For each $\nu > 0$ we let $u_\nu$ be the solution of the system (1.4)–(1.7) on the time interval $[0, T]$. Making $T$ smaller if necessary, let $u$ be the solution to the Euler equations (1.1)–(1.2) with initial data $u_0$ defined on the time interval $[0, T]$. Then there exists a constant $c = c(k, \alpha, U, L)$ such that for all $t \in [0, T]$ we have

$$\|u(t) - u_\nu(t)\|_{k,\alpha} \le \frac{cU}{L} \sqrt{\nu t}.$$

At present we are unable to extend the above proposition to domains with boundaries. In this case, possible detachment of the boundary layer creates analytical obstructions to understanding the inviscid limit. We present a proof of Proposition 2.1 at the end of Sect. 4, and are presently working on extending it to work for domains with boundaries.

3. The Weber Operator and Bounds

In this section we define and obtain estimates for the Weber operator which will be central to all subsequent results.

Definition 3.1. We define the Weber operator $\mathbf{W} : C^{k,\alpha} \times C^{k+1,\alpha} \to C^{k,\alpha}$ by

$$\mathbf{W}(v, \ell) = \mathbf{P}\bigl[(\mathbb{I} + \nabla^t \ell)\, v\bigr],$$

where $\mathbf{P}$ is the Leray-Hodge projection [3] onto divergence free vector fields.
Remark 3.1. The range of $\mathbf{W}$ is $C^{k,\alpha}$ because multiplication by a $C^{k,\alpha}$ function is a bounded operation on $C^{k,\alpha}$, and $\mathbf{P}$ is a classical Calderon-Zygmund singular integral operator [17] which is bounded on Hölder spaces.

Remark. In the whole space, or with periodic boundary conditions, the Leray-Hodge projection commutes with derivatives. This is not true for arbitrary domains [6].

Formally it seems that $\mathbf{W}(v, \ell)$ should have one less derivative than $\ell$. However we prove below that $\mathbf{W}(v, \ell)$ has as many derivatives as $\ell$. The reason is that when we differentiate $\mathbf{W}(v, \ell)$, we can use ‘integration by parts’ to express the right hand side only in terms of first order derivatives.

Lemma 3.1 (Integration by parts). If $u, v \in C^{1,\alpha}$ then

$$\mathbf{P}\bigl[(\nabla^t u)\, v\bigr] = -\mathbf{P}\bigl[(\nabla^t v)\, u\bigr].$$


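Proof. This follows immediately from the identity

$$(\nabla^t u)\, v + (\nabla^t v)\, u = \nabla (u \cdot v)$$

and the fact that $\mathbf{P}$ vanishes on gradients.

Corollary 3.1. If $k \ge 1$ and $v, \ell \in C^{k,\alpha}$ then $\mathbf{W}(v, \ell) \in C^{k,\alpha}$ and

$$\|\mathbf{W}(v, \ell)\|_{k,\alpha} \le c \bigl(1 + \|\nabla \ell\|_{k-1,\alpha}\bigr) \|v\|_{k,\alpha}.$$

Proof. Notice first that $\mathbf{W}(v, \ell) \in C^{k-1,\alpha}$ by Remark 3.1. Now

$$\partial_i \mathbf{W}(v, \ell) = \mathbf{P}\bigl[(\nabla^t \partial_i \ell)\, v + (\mathbb{I} + \nabla^t \ell)\, \partial_i v\bigr] = \mathbf{P}\bigl[-(\nabla^t v)\, \partial_i \ell + (\mathbb{I} + \nabla^t \ell)\, \partial_i v\bigr].$$

The right hand side has only first order derivatives of $\ell$ and $v$, hence $\nabla \mathbf{W}(v, \ell) \in C^{k-1,\alpha}$ and the corollary follows.

Proposition 3.1. If $k \ge 1$ and $\ell_1, \ell_2 \in C^{k,\alpha}$ and $v_1, v_2 \in C^{k,\alpha}$ are such that

$$\|\nabla \ell_i\|_{k-1,\alpha} \le d \quad\text{and}\quad \|v_i\|_{k,\alpha} \le U$$

for $i = 1, 2$, then there exists $c = c(k, d, \alpha)$ such that

$$\|\mathbf{W}(v_1, \ell_1) - \mathbf{W}(v_2, \ell_2)\|_{k,\alpha} \le c \left( \frac{U}{L} \|\ell_1 - \ell_2\|_{k,\alpha} + \|v_1 - v_2\|_{k,\alpha} \right). \tag{3.1}$$

If $k = 0$, the inequality (3.1) still holds provided we assume

$$\|\nabla \ell_i\|_{\alpha} \le d \quad\text{and}\quad \|v_i\|_{1,\alpha} \le U$$

for $i = 1, 2$.

Proof of Proposition 3.1. The main idea in the proof is to use ‘integration by parts’ to avoid the loss of derivative. By definition of $\mathbf{W}$ we have

$$\begin{aligned}
\mathbf{W}(v_1, \ell_1) - \mathbf{W}(v_2, \ell_2)
&= \mathbf{P}\bigl[(\mathbb{I} + \nabla^t \ell_1) v_1 - (\mathbb{I} + \nabla^t \ell_2) v_2\bigr] \\
&= \mathbf{P}\bigl[(\mathbb{I} + \nabla^t \ell_1)(v_1 - v_2) + \nabla^t(\ell_1 - \ell_2)\, v_2\bigr] \\
&= \mathbf{P}\bigl[(\mathbb{I} + \nabla^t \ell_1)(v_1 - v_2) - (\nabla^t v_2)(\ell_1 - \ell_2)\bigr].
\end{aligned} \tag{3.2}$$

Further, differentiating we have

$$\begin{aligned}
\partial_i \bigl[\mathbf{W}(v_1, \ell_1) - \mathbf{W}(v_2, \ell_2)\bigr]
&= \partial_i \mathbf{P}\bigl[(\mathbb{I} + \nabla^t \ell_1)(v_1 - v_2) - (\nabla^t v_2)(\ell_1 - \ell_2)\bigr] \\
&= \mathbf{P}\bigl[(\nabla^t \partial_i \ell_1)(v_1 - v_2) + (\mathbb{I} + \nabla^t \ell_1)\, \partial_i (v_1 - v_2) \\
&\qquad - (\nabla^t \partial_i v_2)(\ell_1 - \ell_2) - (\nabla^t v_2)\, \partial_i (\ell_1 - \ell_2)\bigr] \\
&= \mathbf{P}\bigl[-\nabla^t (v_1 - v_2)\, \partial_i \ell_1 + (\mathbb{I} + \nabla^t \ell_1)\, \partial_i (v_1 - v_2) \\
&\qquad + \nabla^t (\ell_1 - \ell_2)\, \partial_i v_2 - (\nabla^t v_2)\, \partial_i (\ell_1 - \ell_2)\bigr].
\end{aligned} \tag{3.3}$$

Note that we used Lemma 3.1 to ensure that the right hand sides of (3.2) and (3.3) have only first order derivatives of $\ell$ and $v$. Thus taking the $C^{k-1,\alpha}$ norms of Eqs. (3.2) and (3.3), and using the fact that multiplication by a $C^{k,\alpha}$ function and $\mathbf{P}$ are bounded on $C^{k,\alpha}$, the proposition follows.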
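The ‘integration by parts’ identity of Lemma 3.1 can be checked numerically on a periodic grid. The sketch below is only an illustration under stated assumptions: it realizes the Leray-Hodge projection as the Fourier multiplier that removes the component of each mode along its wavevector, computes derivatives spectrally, and uses two arbitrarily chosen trigonometric test fields. The helper names `wavevectors`, `leray` and `grad_t_times` are hypothetical and do not come from the paper.

```python
import numpy as np

def wavevectors(n, L):
    """Angular wavenumbers k = (k_1, k_2, k_3) on an n^3 grid of the cube [0, L]^3."""
    k1 = 2.0 * np.pi * np.fft.fftfreq(n, d=L / n)
    return np.stack(np.meshgrid(k1, k1, k1, indexing="ij"))  # shape (3, n, n, n)

def leray(f, L):
    """Leray-Hodge projection of a periodic field f of shape (3, n, n, n)."""
    k = wavevectors(f.shape[1], L)
    k2 = np.sum(k * k, axis=0)
    k2[0, 0, 0] = 1.0  # the mean mode has k = 0, so nothing is subtracted there
    fh = np.fft.fftn(f, axes=(1, 2, 3))
    fh = fh - k * (np.sum(k * fh, axis=0) / k2)  # remove the component along k
    return np.real(np.fft.ifftn(fh, axes=(1, 2, 3)))

def grad_t_times(u, v, L):
    """(grad^t u) v, whose i-th component is sum_j (d_i u_j) v_j."""
    k = wavevectors(u.shape[1], L)
    uh = np.fft.fftn(u, axes=(1, 2, 3))
    # du[i, j] = d_i u_j, computed with spectral derivatives
    du = np.real(np.fft.ifftn(1j * k[:, None] * uh[None, :], axes=(2, 3, 4)))
    return np.einsum("ij...,j...->i...", du, v)

n, L = 16, 1.0
g = L * np.arange(n) / n
x, y, z = np.meshgrid(g, g, g, indexing="ij")
w = 2.0 * np.pi / L
u = np.stack([np.sin(w * y), np.cos(w * z), np.sin(w * x)])  # arbitrary test field
v = np.stack([np.cos(w * z), np.sin(w * x), np.cos(w * y)])  # arbitrary test field

lhs = leray(grad_t_times(u, v, L), L)
rhs = -leray(grad_t_times(v, u, L), L)
print("max |P[(grad^t u) v]|:", np.abs(lhs).max())
print("max |P[(grad^t u) v] + P[(grad^t v) u]|:", np.abs(lhs - rhs).max())
```

The first printed number should be of order one while the second should sit at round-off level, which is the discrete counterpart of the statement that $\mathbf{P}$ vanishes on the gradient $\nabla(u \cdot v)$.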


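4. Local Existence for the Stochastic Formulation

In this section we prove local in time $C^{k,\alpha}$ existence for the stochastic system (1.4)–(1.7) as stated in Theorem 2.1. We conclude by proving Proposition 2.1, showing how the stochastic system (1.4)–(1.7) behaves as $\nu \to 0$. We begin with a few preliminary results.

Lemma 4.1. If $k \ge 1$, then there exists a constant $c = c(k, \alpha)$ such that

$$\|f \circ g\|_{k,\alpha} \le c\, \|f\|_{k,\alpha} \bigl(1 + \|\nabla g\|_{k-1,\alpha}\bigr)^{k+\alpha},$$
$$\|f \circ g_1 - f \circ g_2\|_{k,\alpha} \le c \bigl(1 + \|\nabla g_1\|_{k-1,\alpha} + \|\nabla g_2\|_{k-1,\alpha}\bigr)^{k+1} \|\nabla f\|_{k,\alpha}\, \|g_1 - g_2\|_{k,\alpha}$$

and

$$\|f_1 \circ g_1 - f_2 \circ g_2\|_{k,\alpha} \le c \bigl(1 + \|\nabla g_1\|_{k-1,\alpha} + \|\nabla g_2\|_{k-1,\alpha}\bigr)^{k+1} \Bigl[ \|f_1 - f_2\|_{k,\alpha} + \min\bigl\{\|\nabla f_1\|_{k,\alpha}, \|\nabla f_2\|_{k,\alpha}\bigr\}\, \|g_1 - g_2\|_{k,\alpha} \Bigr].$$

The proof of Lemma 4.1 is elementary and not presented here. We subsequently use the above lemma repeatedly without reference or proof.

Lemma 4.2. Let $X_1, X_2 \in C^{k+1,\alpha}$ be such that

$$\|\nabla X_1 - \mathbb{I}\|_{k,\alpha} \le d < 1 \quad\text{and}\quad \|\nabla X_2 - \mathbb{I}\|_{k,\alpha} \le d < 1.$$

Let $A_1$ and $A_2$ be the inverse of $X_1$ and $X_2$ respectively. Then there exists a constant $c = c(k, \alpha, d)$ such that

$$\|A_1 - A_2\|_{k,\alpha} \le c\, \|X_1 - X_2\|_{k,\alpha}.$$

Proof. Let $c = c(k, \alpha, d)$ be a constant that changes from line to line (we use this convention implicitly throughout this paper). Note first $\nabla A = (\nabla X)^{-1} \circ A$, and hence by Lemma 5.1,

$$\|\nabla A\|_{C^0} \le \|(\nabla X)^{-1}\|_{C^0} \le c.$$

Now using Lemma 5.1 to bound $\|(\nabla X)^{-1}\|_{\alpha}$ we have

$$\|\nabla A\|_{\alpha} = \|(\nabla X)^{-1} \circ A\|_{\alpha} \le \|(\nabla X)^{-1}\|_{\alpha} \bigl(1 + \|\nabla A\|_{C^0}\bigr) \le c.$$

When $k \ge 1$, we again bound $\|(\nabla X)^{-1}\|_{k,\alpha}$ by Lemma 5.1. Taking the $C^{k,\alpha}$ norm of $(\nabla X)^{-1} \circ A$ we have

$$\|\nabla A\|_{k,\alpha} \le \|(\nabla X)^{-1}\|_{k,\alpha} \bigl(1 + \|\nabla A\|_{k-1,\alpha}\bigr)^{k}.$$

So by induction we can bound $\|\nabla A\|_{k,\alpha}$ by a constant $c = c(k, \alpha, d)$. The lemma now follows immediately from the identity

$$A_1 - A_2 = (A_1 \circ X_2 - I) \circ A_2 = (A_1 \circ X_2 - A_1 \circ X_1) \circ A_2$$

and Lemma 4.1.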
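For completeness, the two identities used in the proof of Lemma 4.2 can be verified directly from the fact that each $A_i$ is the spatial inverse of $X_i$. The following is a routine check written out as a sketch; it is not part of the original argument, and it uses only the chain rule and the relations $X \circ A = I$ and $A_1 \circ X_1 = I$.

```latex
% Differentiating X \circ A = I with the chain rule
% (the right-hand side is the identity matrix):
\[
  \bigl((\nabla X)\circ A\bigr)\,\nabla A = \nabla (X \circ A) = \nabla I
  \quad\Longrightarrow\quad
  \nabla A = (\nabla X)^{-1}\circ A .
\]
% Using A_1 \circ X_1 = I and X_2 \circ A_2 = I:
\[
  (A_1\circ X_2 - A_1\circ X_1)\circ A_2
  = (A_1\circ X_2 - I)\circ A_2
  = A_1\circ (X_2\circ A_2) - A_2
  = A_1 - A_2 .
\]
```

The second identity is what lets Lemma 4.1 convert a bound on $X_1 - X_2$ into a bound on $A_1 - A_2$.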


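Lemma 4.3. Let $u \in C([0, T], C^{k+1,\alpha})$ and $X$ satisfy the SDE (1.4) with initial data (1.5). Let $\lambda = X - I$ and $U = \sup_t \|u(t)\|_{k+1,\alpha}$. Then there exists a constant $c = c(k, \alpha, \|u\|_{k+1,\alpha})$ such that for short time

$$\|\nabla \lambda(t)\|_{k,\alpha} \le \frac{cUt}{L}\, e^{cUt/L} \quad\text{and}\quad \|\nabla \ell(t)\|_{k,\alpha} \le \frac{cUt}{L}\, e^{cUt/L}.$$

Proof. From Eq. (1.4) we have

$$X(x, t) = x + \int_0^t u(X(x, s), s)\, ds + \sqrt{2\nu}\, B_t$$
$$\implies \nabla X(t) = \mathbb{I} + \int_0^t (\nabla u) \circ X \cdot \nabla X. \tag{4.1}$$

Taking the $C^0$ norm of Eq. (4.1) and using Gronwall’s Lemma we have

$$\|\nabla \lambda(t)\|_{C^0} = \|\nabla X(t) - \mathbb{I}\|_{C^0} \le e^{Ut/L} - 1.$$

Now taking the $C^{k,\alpha}$ norm in Eq. (4.1) we have

$$\|\nabla \lambda(t)\|_{k,\alpha} \le c \int_0^t \|\nabla u\|_{k,\alpha} \bigl(1 + \|\nabla \lambda\|_{k-1,\alpha}\bigr)^{k} \bigl(1 + \|\nabla \lambda\|_{k,\alpha}\bigr).$$

The bound for $\|\nabla \lambda\|_{k,\alpha}$ now follows from the previous two inequalities, induction and Gronwall’s Lemma. The bound for $\|\nabla \ell\|_{k,\alpha}$ then follows from Lemma 4.2.

We draw attention to the fact that the above argument can only bound $\nabla \lambda$, and not $\lambda$. Fortunately, our results only rely on a bound of $\nabla \lambda$.

Lemma 4.4. Let $u, \bar{u} \in C([0, T], C^{k+1,\alpha})$ be such that

$$\sup_{0 \le t \le T} \|u(t)\|_{k+1,\alpha} \le U \quad\text{and}\quad \sup_{0 \le t \le T} \|\bar{u}(t)\|_{k+1,\alpha} \le U.$$

Let $X, \bar{X}$ be solutions of the SDE (1.4)–(1.5) with drift $u$ and $\bar{u}$ respectively, and let $A$ and $\bar{A}$ be the spatial inverse of $X$ and $\bar{X}$ respectively. Then there exists $c = c(k, \alpha, U)$ and a time $T = T(k, \alpha, U)$ such that

$$\|X(t) - \bar{X}(t)\|_{k,\alpha} \le c\, e^{cUt/L} \int_0^t \|u - \bar{u}\|_{k,\alpha}, \tag{4.2}$$
$$\|A(t) - \bar{A}(t)\|_{k,\alpha} \le c\, e^{cUt/L} \int_0^t \|u - \bar{u}\|_{k,\alpha} \tag{4.3}$$

for all $0 \le t \le T$.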


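Proof. We first use Lemma 4.3 to bound $\|\nabla X - \mathbb{I}\|_{k,\alpha}$ and $\|\nabla \bar{X} - \mathbb{I}\|_{k,\alpha}$ for short time $T$. Now

$$X(t) - \bar{X}(t) = \int_0^t \bigl(u \circ X - \bar{u} \circ \bar{X}\bigr)$$
$$\implies \|X(t) - \bar{X}(t)\|_{k,\alpha} \le \int_0^t \|u \circ X - \bar{u} \circ \bar{X}\|_{k,\alpha} \le c \int_0^t \left( \|u - \bar{u}\|_{k,\alpha} + \frac{U}{L} \|X - \bar{X}\|_{k,\alpha} \right),$$

and inequality (4.2) follows by applying Gronwall’s Lemma. Inequality (4.3) follows immediately from (4.2) and Lemma 4.2.

We now provide the proof of Theorem 2.1. We reproduce the statement here for convenience.

Theorem 2.1. Let $k \ge 1$ and $u_0 \in C^{k+1,\alpha}$ be divergence free. There exists a time $T = T(k, \alpha, L, \|u_0\|_{k+1,\alpha})$, but independent of viscosity, and a pair of functions $\lambda, u \in C([0, T], C^{k+1,\alpha})$ such that $u$ and $X = I + \lambda$ satisfy the system (1.4)–(1.7). Further $\exists\, U = U(k, \alpha, L, \|u_0\|_{k+1,\alpha})$ such that $t \in [0, T] \implies \|u(t)\|_{k+1,\alpha} \le U$.

Proof. Let $U$ be a large constant, and $T$ a small time, both of which will be specified later. Define $\mathcal{U}$ and $\mathcal{L}$ by

$$\mathcal{U} = \bigl\{ u \in C([0, T], C^{k+1,\alpha}) \;\big|\; \|u(t)\|_{k+1,\alpha} \le U,\ \nabla \cdot u = 0 \text{ and } u(0) = u_0 \bigr\}$$

and

$$\mathcal{L} = \bigl\{ \ell \in C([0, T], C^{k+1,\alpha}) \;\big|\; \|\nabla \ell(t)\|_{k,\alpha} \le \tfrac{1}{2}\ \forall t \in [0, T] \text{ and } \ell(\cdot, 0) = 0 \bigr\}.$$

We clarify that the functions $u$ and $\ell$ are required to be spatially $C^{k+1,\alpha}$, and need only be continuous in time. Now given $u \in \mathcal{U}$ we define $X_u$ to be the solution of Eq. (1.4) with initial data (1.5) and $\lambda_u = X_u - I$ to be the Eulerian displacement. We define $A_u$ by Eq. (1.6) and let $\ell_u = A_u - I$ be the Lagrangian displacement. Finally we define $W : \mathcal{U} \to \mathcal{U}$ by

$$W(u) = \mathbf{E}\, \mathbf{W}(u_0 \circ A_u, \ell_u).$$

We aim to show that $W : \mathcal{U} \to \mathcal{U}$ is Lipschitz in the weaker norm

$$\|u\|_{\mathcal{U}} = \sup_{0 \le t \le T} \|u(t)\|_{k,\alpha}$$

and when $T$ is small enough, we will show that $W$ is a contraction mapping. Let $c$ be a constant that changes from line to line. By Corollary 3.1 we have

$$\|W(u)\|_{k+1,\alpha} \le c\, \mathbf{E} \bigl[ \bigl(1 + \|\nabla \ell_u\|_{k,\alpha}\bigr) \|u_0 \circ A_u\|_{k+1,\alpha} \bigr] \le c\, \|u_0\|_{k+1,\alpha} \sup_{\Omega} \bigl(1 + \|\nabla \ell_u\|_{k,\alpha}\bigr)^{k+2}. \tag{4.4}$$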


Here $\Omega$ is the probability space on which our processes are defined. We remark that Lemma 4.3 gives us a bound on $\|\nabla \ell_u\|_{k,\alpha}$. A bound on $\mathbf{E} \|\nabla \ell_u\|_{k,\alpha}$ instead would not have been enough.

Now we choose $U = c \bigl(\tfrac{3}{2}\bigr)^{k+2} \|u_0\|_{k+1,\alpha}$, and then apply Lemma 4.3 to choose $T$ small enough to ensure $\ell_u, \lambda_u \in \mathcal{L}$. Now inequality (4.4) ensures that $W(u) \in \mathcal{U}$.

Now if $u, \bar{u} \in \mathcal{U}$, Lemma 4.4 guarantees

$$\|\ell_u(t) - \ell_{\bar{u}}(t)\|_{k,\alpha} \le c\, e^{cUt/L} \int_0^t \|u - \bar{u}\|_{k,\alpha}.$$

Thus applying Proposition 3.1 we have

$$\begin{aligned}
\|W(u)(t) - W(\bar{u})(t)\|_{k,\alpha}
&\le c \left( \frac{U}{L} \|\ell_u(t) - \ell_{\bar{u}}(t)\|_{k,\alpha} + \|u_0 \circ A_u(t) - u_0 \circ A_{\bar{u}}(t)\|_{k,\alpha} \right) \\
&\le \frac{cU}{L} \|\ell_u(t) - \ell_{\bar{u}}(t)\|_{k,\alpha} \\
&\le \frac{cU}{L}\, e^{cUt/L} \int_0^t \|u - \bar{u}\|_{k,\alpha}.
\end{aligned}$$

So choosing $T = T(k, \alpha, L, U)$ small enough we can ensure $W$ is a contraction. The existence of a fixed point of $W$ now follows by successive iteration. We define $u_{n+1} = W(u_n)$. The sequence $(u_n)$ converges strongly with respect to the $C^{k,\alpha}$ norm. Since $\mathcal{U}$ is closed and convex, and the sequence $(u_n)$ is uniformly bounded in the $C^{k+1,\alpha}$ norm, it must have a weak limit $u \in \mathcal{U}$. Finally since $W$ is continuous with respect to the weaker $C^{k,\alpha}$ norm, the limit must be a fixed point of $W$, and hence a solution to the system (1.4)–(1.7).

We conclude by proving the vanishing viscosity behavior stated in Proposition 2.1. We reproduce the statement here for convenience.

Proposition 2.1. Let $u_0 \in C^{k+1,\alpha}$ be divergence free, and $U$, $T$ be as in Theorem 2.1. For each $\nu > 0$ we let $u_\nu$ be the solution of the system (1.4)–(1.7) on the time interval $[0, T]$. Making $T$ smaller if necessary, let $u$ be the solution to the Euler equations (1.1)–(1.2) with initial data $u_0$ defined on the time interval $[0, T]$. Then there exists a constant $c = c(k, \alpha, U, L)$ such that for all $t \in [0, T]$ we have

$$\|u(t) - u_\nu(t)\|_{k,\alpha} \le \frac{cU}{L} \sqrt{\nu t}.$$

Proof. We use a subscript of $\nu$ to denote quantities associated to the solution of the viscous problem (1.4)–(1.7), and unsubscripted letters to denote the corresponding quantities associated to the solution of the Eulerian-Lagrangian formulation of the Euler equations (1.8)–(1.10). We use the same notation as in the proof of Theorem 2.1.

Now from the proof of Theorem 2.1 we know that for short time $\ell_\nu, \ell \in \mathcal{L}$. Using Lemma 4.2 and making $T$ smaller if necessary, we can ensure $\lambda_\nu, \lambda \in \mathcal{L}$. We begin by estimating $\mathbf{E} \|\lambda_\nu - \lambda\|_{k,\alpha}$:

$$\lambda_\nu(t) - \lambda(t) = \int_0^t \bigl[u_\nu \circ X_\nu - u \circ X\bigr] + \sqrt{2\nu}\, B_t$$
$$\implies \|\lambda_\nu(t) - \lambda(t)\|_{k,\alpha} \le c \left( \int_0^t \left[ \|u_\nu - u\|_{k,\alpha} + \frac{U}{L} \|\lambda_\nu - \lambda\|_{k,\alpha} \right] + \sqrt{\nu}\, |B_t| \right),$$

and so by Gronwall’s lemma

$$\|\lambda_\nu(t) - \lambda(t)\|_{k,\alpha} \le c \left( \sqrt{\nu}\, |B_t| + \int_0^t \|u_\nu - u\|_{k,\alpha} \right) e^{cUt/L}.$$

Using Lemma 4.2 and taking expected values gives

$$\mathbf{E} \|\ell_\nu(t) - \ell(t)\|_{k,\alpha} \le c \left( \sqrt{\nu t} + \int_0^t \|u_\nu - u\|_{k,\alpha} \right) e^{cUt/L}. \tag{4.5}$$
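The $\sqrt{\nu t}$ term in (4.5) arises when the expectation of $\sqrt{\nu}\,|B_t|$ is taken. One standard way to see this, spelled out here as a sketch with the numerical factor absorbed into $c$, is the Cauchy-Schwarz inequality applied to the 3-dimensional Wiener process $B$, each of whose components has variance $t$:

```latex
\[
  \mathbf{E}\,|B_t| \le \bigl(\mathbf{E}\,|B_t|^2\bigr)^{1/2} = \sqrt{3t},
  \qquad\text{so}\qquad
  \mathbf{E}\,\sqrt{\nu}\,|B_t| \le \sqrt{3}\,\sqrt{\nu t}.
\]
```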

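To estimate the difference $u_\nu - u$, we use (1.7) and (1.10) to obtain

$$u_\nu - u = \mathbf{E}\, \mathbf{W}(u_0 \circ A_\nu, \ell_\nu) - \mathbf{W}(u_0 \circ A, \ell)$$
$$\implies \|u_\nu - u\|_{k,\alpha} \le c\, \mathbf{E} \left[ \frac{U}{L} \|\ell_\nu - \ell\|_{k,\alpha} + \|u_0 \circ A_\nu - u_0 \circ A\|_{k,\alpha} \right]$$
$$\implies \|u_\nu(t) - u(t)\|_{k,\alpha} \le \frac{cU}{L}\, \mathbf{E} \|\ell_\nu - \ell\|_{k,\alpha} \le \frac{cU}{L}\, e^{cUt/L} \left( \sqrt{\nu t} + \int_0^t \|u_\nu - u\|_{k,\alpha} \right),$$

and the proposition follows from Gronwall’s lemma.

5. Local Existence for the Navier-Stokes Equations

Proposition 3.1, along with Peter Constantin’s diffusive Lagrangian formulation [5], immediately gives us a local in time $C^{k,\alpha}$ existence and uniqueness result for the Navier-Stokes equations using classical PDE methods. We conclude this paper by presenting the proof in this section.

Definition 5.1. Let $k \ge 2$ and $T > 0$. We define $\mathcal{L}^{k,\alpha}_T$ by

$$\mathcal{L}^{k,\alpha}_T = \bigl\{ \ell \in C^{k,\alpha}(\mathcal{I} \times [0, T], \mathcal{I}) \;\big|\; \|\nabla \ell(t)\|_{k-1,\alpha} \le \tfrac{1}{2}\ \forall t \in [0, T] \text{ and } \ell(\cdot, 0) = 0 \bigr\}.$$

Given $\ell \in \mathcal{L}^{k,\alpha}_T$, and $u \in C^{k,\alpha}(\mathcal{I} \times [0, T], \mathcal{I})$ divergence free, we define the virtual velocity $v = v_{u,\ell}$ to be the unique solution of the linear parabolic equation

$$\partial_t v_\beta + (u \cdot \nabla) v_\beta - \nu \Delta v_\beta = 2\nu\, C^i_{j,\beta}\, \partial_j v_i \tag{5.1}$$

with initial data

$$v(x, 0) = u(x, 0), \tag{5.2}$$

where

$$C^p_{j,i} = \bigl((\mathbb{I} + \nabla \ell)^{-1}\bigr)_{ki}\, \partial_k \partial_j \ell_p \tag{5.3}$$

are the commutator coefficients. Finally we define the operator $\mathcal{W} : C^{k,\alpha}(\mathcal{I} \times [0, T]) \times \mathcal{L}^{k,\alpha}_T \to C^{k,\alpha}(\mathcal{I} \times [0, T])$ by

$$\mathcal{W}(u, \ell) = \mathbf{W}(v_{u,\ell}, \ell). \tag{5.4}$$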

Remark. We clarify that by $\ell \in C^{k,\alpha}(\mathcal{I} \times [0, T], \mathcal{I})$ we only impose a $C^{k,\alpha}$ spatial regularity restriction. We do not assume anything about time regularity. This will be the case for the remainder of this section.

Remark. Observe that $\|\nabla \ell\|_{k-1,\alpha} \le \tfrac{1}{2}$ guarantees that the matrix $\mathbb{I} + \nabla^t \ell$ in Eq. (5.3) is invertible. Further note that all coefficients in Eq. (5.1) are of class $C^{k,\alpha}$ and hence by parabolic regularity [12], $v \in C^{k,\alpha}$.

Lemma 5.1. Let $X$ be a Banach algebra. If $x \in X$ is such that $\|x\| \le \rho < 1$ then $1 + x$ is invertible and $\|(1 + x)^{-1}\| \le \frac{1}{1 - \rho}$. Further if in addition $\|y\| \le \rho$ then

$$\bigl\|(1 + x)^{-1} - (1 + y)^{-1}\bigr\| \le \frac{1}{(1 - \rho)^2}\, \|x - y\|.$$

Proof. The first part of the lemma follows immediately from the identity $(1 + x)^{-1} = \sum_n (-x)^n$. The second part follows from the first part and the identity

$$(1 + x)^{-1} - (1 + y)^{-1} = (1 + x)^{-1} (y - x) (1 + y)^{-1}.$$

We generally use Lemma 5.1 when $X$ is the space of $C^{k,\alpha}$ periodic matrices. We finally prove that the Weber operator $\mathcal{W}$ is Lipschitz, which will quickly give us the existence theorem.

Proposition 5.1. If $\ell, \bar{\ell} \in \mathcal{L}^{k,\alpha}_T$ and $u, \bar{u} \in C^{k,\alpha}$ are such that

$$\sup_{0 \le t \le T} \|u(t)\|_{k,\alpha} \le U \quad\text{and}\quad \sup_{0 \le t \le T} \|\bar{u}(t)\|_{k,\alpha} \le U,$$

then there exists $c = c(k, \alpha, L, \nu, U)$ and $T = T(k, \alpha, L, \nu, U)$ such that

$$\|\mathcal{W}(u, \ell)(t) - \mathcal{W}(\bar{u}, \bar{\ell})(t)\|_{k,\alpha} \le c\, \|u(0) - \bar{u}(0)\|_{k,\alpha} + \frac{cU}{L} \left[ \left(1 + \frac{\nu t}{L^2}\right) \|\ell(t) - \bar{\ell}(t)\|_{k,\alpha} + t\, \|u(t) - \bar{u}(t)\|_{k-2,\alpha} \right]$$

for all $0 \le t \le T$.


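Proof. Let $v$ and $\bar{v}$ be the virtual velocities associated to $u, \ell$ and $\bar{u}, \bar{\ell}$ respectively. Let $C$ and $\bar{C}$ be the commutator coefficients associated to $\ell$ and $\bar{\ell}$ respectively. Since Eq. (5.1) is a linear parabolic equation with $C^{k,\alpha}$ coefficients, standard regularity theory [12] ensures that there exists $T = T(\nu, U)$ such that

$$\sup_{0 \le t \le T} \|v(t)\|_{k,\alpha} \le 2U \quad\text{and}\quad \sup_{0 \le t \le T} \|\bar{v}(t)\|_{k,\alpha} \le 2U.$$

Hence by Proposition 3.1 we have

$$\|\mathcal{W}(u, \ell) - \mathcal{W}(\bar{u}, \bar{\ell})\|_{k,\alpha} = \|\mathbf{W}(v, \ell) - \mathbf{W}(\bar{v}, \bar{\ell})\|_{k,\alpha} \le c \left( \frac{U}{L} \|\ell - \bar{\ell}\|_{k,\alpha} + \|v - \bar{v}\|_{k,\alpha} \right). \tag{5.5}$$

Now let $\tilde{v} = v - \bar{v}$. The evolution equation of $\tilde{v}$ is given by

$$\partial_t \tilde{v}_\beta + (u \cdot \nabla) \tilde{v}_\beta - \nu \Delta \tilde{v}_\beta - 2\nu\, C^i_{j,\beta}\, \partial_j \tilde{v}_i = 2\nu \bigl(C^i_{j,\beta} - \bar{C}^i_{j,\beta}\bigr) \partial_j \bar{v}_i + ((\bar{u} - u) \cdot \nabla) \bar{v}_\beta$$

with initial data $\tilde{v}(x, 0) = u(x, 0) - \bar{u}(x, 0)$. We estimate the $C^{k-2,\alpha}$ norm of the right hand side. Let $c$ be some constant which changes from line to line. By definition,

$$\begin{aligned}
\bar{C}^k_j - C^k_j
&= (\mathbb{I} + \nabla^t \bar{\ell})^{-1} \nabla \partial_j \bar{\ell}_k - (\mathbb{I} + \nabla^t \ell)^{-1} \nabla \partial_j \ell_k \\
&= \bigl[(\mathbb{I} + \nabla^t \bar{\ell})^{-1} - (\mathbb{I} + \nabla^t \ell)^{-1}\bigr] \nabla \partial_j \bar{\ell}_k + (\mathbb{I} + \nabla^t \ell)^{-1} \bigl[\nabla \partial_j \bar{\ell}_k - \nabla \partial_j \ell_k\bigr].
\end{aligned}$$

Note that by Lemma 5.1 we can bound $\|(\mathbb{I} + \nabla^t \ell)^{-1}\|_{k-1,\alpha}$ and $\|(\mathbb{I} + \nabla^t \bar{\ell})^{-1}\|_{k-1,\alpha}$. Further, by Lemma 5.1 again we have

$$\bigl\|(\mathbb{I} + \nabla^t \bar{\ell})^{-1} - (\mathbb{I} + \nabla^t \ell)^{-1}\bigr\|_{k-1,\alpha} \le c\, \|\nabla \ell - \nabla \bar{\ell}\|_{k-1,\alpha}.$$

Combining these estimates we have

$$\|C - \bar{C}\|_{k-2,\alpha} \le \frac{c}{L}\, \|\nabla \ell - \nabla \bar{\ell}\|_{k-1,\alpha}.$$

Finally note

$$\|((\bar{u} - u) \cdot \nabla) \bar{v}\|_{k-2,\alpha} \le \frac{cU}{L}\, \|u - \bar{u}\|_{k-2,\alpha}.$$

Thus by parabolic regularity [12],

$$\|\tilde{v}(t)\|_{k,\alpha} \le \frac{cUt}{L} \left( \frac{\nu}{L} \|\nabla \ell(t) - \nabla \bar{\ell}(t)\|_{k-1,\alpha} + \|u(t) - \bar{u}(t)\|_{k-2,\alpha} \right) + \|u(0) - \bar{u}(0)\|_{k,\alpha}, \tag{5.6}$$

and substituting Eq. (5.6) in (5.5), the proposition follows.

Theorem 5.1. Let $k \ge 2$ and $u_0 \in C^{k,\alpha}(\mathcal{I}, \mathcal{I})$ be divergence free. Then there exists $T = T(k, \alpha, L, \nu, \|u_0\|_{k,\alpha})$ and $u \in C^{k,\alpha}(\mathcal{I} \times [0, T], \mathcal{I})$ which is a solution of the Navier-Stokes equations with initial data $u_0$.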


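Proof. Let $U > \|u_0\|_{k,\alpha}$. We define the set $\mathcal{U}$ by

$$\mathcal{U} = \bigl\{ u \in C([0, T], C^{k,\alpha}) \;\big|\; \|u(t)\|_{k,\alpha} \le U,\ \nabla \cdot u = 0, \text{ and } u(0) = u_0 \bigr\}.$$

Given $u \in \mathcal{U}$, let $\ell_u$ be the unique solution of the equation

$$\partial_t \ell_u + (u \cdot \nabla) \ell_u - \nu \Delta \ell_u + u = 0$$

with initial data $\ell_u(x, 0) = 0$. Our aim is to produce $u \in \mathcal{U}$ such that $u = \mathcal{W}(u, \ell_u)$, which from [5] we know must be a solution to the Navier-Stokes equations.

We define the map $W$ by $W(u) = \mathcal{W}(u, \ell_u)$. If $\mathcal{U}$ is endowed with the strong norm

$$\|u\|_{\mathcal{U}} = \sup_{0 \le t \le T} \|u(t)\|_{k,\alpha},$$

we will show as before that for sufficiently small $T$, $W$ maps $\mathcal{U}$ into itself. Finally we will show that $W$ is a contraction under a weaker norm, producing the desired fixed point.

First note that by parabolic regularity [12], we have

$$\|\ell_u(t)\|_{k,\alpha} \le cUt.$$

The constant $c$ of course depends on $U$, but we retain the $U$ on the right for dimensional correctness. Thus choosing $T$ small will guarantee $\ell_u \in \mathcal{L}^{k,\alpha}_T$. Let $v = v_{u,\ell_u}$ be the virtual velocity defined by Eq. (5.1), with initial data $u_0$. Standard parabolic estimates [12] (and the fact that $\ell_u \in \mathcal{L}^{k,\alpha}_T$) show

$$\|v(t) - u_0\|_{k,\alpha} \le \frac{cU^2}{L}\, t.$$

Now by definition,

$$\begin{aligned}
W(u) &= \mathbf{P}\bigl[(\mathbb{I} + \nabla^t \ell_u)\, v\bigr] \\
&= \mathbf{P}\bigl[(\mathbb{I} + \nabla^t \ell_u)(v - u_0) + (\mathbb{I} + \nabla^t \ell_u)\, u_0\bigr] \\
&= \mathbf{P}[u_0] + \mathbf{P}\bigl[(\mathbb{I} + \nabla^t \ell_u)(v - u_0)\bigr] + \mathbf{P}\bigl[(\nabla^t \ell_u)\, u_0\bigr].
\end{aligned}$$

Since $u_0$ is divergence free, $\mathbf{P}(u_0) = u_0$. Using Corollary 3.1 and the preceding two estimates for $\ell_u$ and $v - u_0$, we obtain

$$\|W(u)(t)\|_{k,\alpha} \le \|u_0\|_{k,\alpha} + c\, \frac{U^2}{L}\, t.$$

Thus choosing $T < \frac{L}{cU^2} \bigl(U - \|u_0\|_{k,\alpha}\bigr)$, we can ensure that $W$ maps $\mathcal{U}$ into itself.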


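To see that $W$ has a fixed point, let $u, \bar{u} \in \mathcal{U}$ and define $\tilde{\ell} = \ell_u - \ell_{\bar{u}}$. The evolution of $\tilde{\ell}$ is governed by

$$\partial_t \tilde{\ell} + (u \cdot \nabla) \tilde{\ell} - \nu \Delta \tilde{\ell} = ((\bar{u} - u) \cdot \nabla)\, \ell_{\bar{u}} + \bar{u} - u$$

and parabolic regularity [12] immediately gives

$$\|\tilde{\ell}(t)\|_{k,\alpha} \le ct\, \|u(t) - \bar{u}(t)\|_{k-2,\alpha}.$$

Combining this with Proposition 5.1, we have

$$\sup_{0 \le t \le T} \|W(u)(t) - W(\bar{u})(t)\|_{k,\alpha} \le \frac{cUT}{L} \sup_{0 \le t \le T} \|u(t) - \bar{u}(t)\|_{k-2,\alpha}.$$

Thus if $T$ is chosen to be smaller than $\frac{L}{cU}$, then $W : \mathcal{U} \to \mathcal{U}$ is a contraction mapping and has a unique fixed point, concluding the proof.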

Remark. The above estimates along with the active vector formulation of the Euler equations [4] can be used to prove a $C^{k,\alpha}$ local existence and uniqueness theorem for the Euler equations. Since a similar proof of this result can be found in the original paper [4] by P. Constantin, we do not present it here.

Acknowledgement. I would like to thank Peter Constantin for his encouragement, support and many helpful discussions. I would also like to thank Hongjie Dong and Tu Nguyen for carefully reading this paper, and pointing out an error in the original proof of Theorem 2.1.

References

1. Bhattacharya, R.N., Chen, L., Dobson, S., Guenther, R.B., Orum, C., Ossiander, M., Thomann, E., Waymire, E.C.: Majorizing kernels and stochastic cascades with applications to incompressible Navier-Stokes equations. Trans. Amer. Math. Soc. 355(12), 5003–5040 (2003) (electronic)
2. Busnello, B., Flandoli, F., Romito, M.: A probabilistic representation for the vorticity of a 3D viscous fluid and for general systems of parabolic equations. http://arxiv.org/list/math.PR/0306075, 2003
3. Chorin, A., Marsden, J.: A Mathematical Introduction to Fluid Mechanics. Berlin Heidelberg New York: Springer, 2000
4. Constantin, P.: An Eulerian-Lagrangian approach for incompressible fluids: local theory. J. Amer. Math. Soc. 14(2), 263–278 (2001) (electronic)
5. Constantin, P.: An Eulerian-Lagrangian Approach to the Navier-Stokes equations. Commun. Math. Phys. 216(3), 663–686 (2001)
6. Constantin, P., Foias, C.: Navier-Stokes Equations. Chicago, IL: University of Chicago Press, 1988
7. Constantin, P., Iyer, G.: A stochastic Lagrangian representation of the 3-dimensional incompressible Navier-Stokes equations. http://arxiv.org/list/math.PR/0511067, 2005
8. Friedman, A.: Stochastic Differential Equations and Applications, Volume 1. London-New York-San Diego: Academic Press, 1975
9. Gomes, D.A.: A variational formulation for the Navier-Stokes equation. Commun. Math. Phys. 257, 227–234 (2005)
10. Jourdain, B., Le Bris, C., Lelièvre, T.: Coupling PDEs and SDEs: the Illustrative Example of the Multiscale Simulation of Viscoelastic Flows. Lecture Notes in Computational Science and Engineering 44, Berlin Heidelberg New York: Springer, 2005, pp. 151–170
11. Karatzas, I., Shreve, S.: Brownian Motion and Stochastic Calculus. Graduate Texts in Mathematics 113, New York: Springer, 1991
12. Krylov, N.V.: Lectures on Elliptic and Parabolic Equations in Hölder Spaces. Graduate Studies in Mathematics 12, Providence, RI: Amer. Math. Soc., 1996
13. Kunita, H.: Stochastic flows and stochastic differential equations. Cambridge Studies in Advanced Mathematics 24, Cambridge: Cambridge University Press, 1997
14. Le Gall, J.: Spatial Branching Processes, Random Snakes and Partial Differential Equations. Lectures in Mathematics, Basel-Boston: Birkhäuser, 1999
15. Le Jan, Y., Sznitman, A.S.: Stochastic cascades and 3-dimensional Navier-Stokes equations. Probab. Theory Related Fields 109(3), 343–366 (1997)
16. Majda, A., Bertozzi, A.: Vorticity and Incompressible Flow. Cambridge: Cambridge University Press, 2002
17. Stein, E.: Singular Integrals and Differentiability Properties of Functions. Princeton, NJ: Princeton University Press, 1970

Communicated by A. Kupiainen

Commun. Math. Phys. 266, 647–663 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0056-7


On Monopoles and Domain Walls

Amihay Hanany¹, David Tong²

¹ Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. E-mail: [email protected]

² Department of Applied Mathematics and Theoretical Physics, University of Cambridge, CB3 0WA, UK. E-mail: [email protected]

Received: 4 August 2005 / Accepted: 20 March 2006
Published online: 18 July 2006 – © Springer-Verlag 2006

Abstract: The purpose of this paper is to describe a relationship between maximally supersymmetric domain walls and magnetic monopoles. We show that the moduli space of domain walls in non-abelian gauge theories with $N$ flavors is isomorphic to a complex, middle dimensional, submanifold of the moduli space of $U(N)$ magnetic monopoles. This submanifold is defined by the fixed point set of a circle action rotating the monopoles in the plane. To derive this result we present a D-brane construction of domain walls, yielding a description of their dynamics in terms of truncated Nahm equations. The physical explanation for the relationship lies in the fact that domain walls, in the guise of kinks on a vortex string, correspond to magnetic monopoles confined by the Meissner effect.

Contents

1. Introduction
2. Domain Walls
   2.1 Classification of Domain Walls
   2.2 The moduli space of domain walls: Some examples
   2.3 The ordering of domain walls
3. Monopoles
   3.1 The relationship between $M_{\vec g}$ and $W_{\vec g}$
   3.2 D-branes and Nahm's equations
4. D-Branes and Domain Walls
   4.1 Domain wall dynamics
   4.2 The ordering of domain walls revisited

1. Introduction

Domain walls in gauge theories with eight supercharges have rather special properties. These walls were first studied by Abraham and Townsend [1], who showed that in two dimensions, where domain walls are known as kinks, they exhibit dyonic behaviour reminiscent of magnetic monopoles. Further similarities between kinks and magnetic monopoles, at both the classical and quantum level, were uncovered in [2]. The physical explanation for this relationship was presented in [3], where new BPS solutions were described corresponding to magnetic monopoles in a phase with fully broken gauge symmetry. The Meissner effect ensures that monopoles are confined: the magnetic flux is unable to propagate through the vacuum and leaves the monopole in two collimated tubes. From the perspective of the flux tube, the monopole appears as a kink. The idea of describing confined monopoles as kinks in $Z_N$ strings occurred previously in [4]. The relationship between the confined magnetic monopoles and the kink was further explored in [5–7] and related systems were studied in [8–13].

In this paper we use D-brane techniques to study the moduli space of multiple domain walls. This allows us to develop a description of the domain wall dynamics in terms of a linearized Nahm equation, providing a direct relationship to the dynamics of monopoles. Specifically, we show that the moduli space of domain walls, which we denote as $W_{\vec g}$, is isomorphic to a middle dimensional submanifold of the moduli space of magnetic monopoles $M_{\vec g}$. This submanifold describes magnetic monopoles lying along a line, and can be described as the fixed point of an $S^1$ action $\hat k$, rotating the monopoles in a plane,
$$W_{\vec g} \cong M_{\vec g}\big|_{\hat k = 0}. \tag{1.1}$$
The correspondence captures the topology and asymptotic metric of the domain wall moduli space $W_{\vec g}$. It does not extend to the full metric on $W_{\vec g}$. Nevertheless, as we shall explain, it does correctly capture the most important feature of domain walls: their ordering along the line. The relationship (1.1) plays companion to the result of [14], where the moduli space of vortices was shown to be a middle dimensional submanifold of the moduli space of instantons. Indeed, upon dimensional reduction, the self-dual instanton equations become the monopole equations, while the vortex equations descend to the domain wall equations.

We start in the following section by describing the domain walls in question, together with a review of their moduli spaces. We pay particular attention to the crudest physical feature of domain walls, namely the rules governing their spatial ordering along the line. Section 3 contains a brief review of magnetic monopoles in higher rank gauge groups, primarily in order to fix notation, allowing us to elaborate on the relationship (1.1). We also describe the Nahm construction of the monopole moduli space as it arises from D-branes. The meat of the paper is in Sect. 4. We present a D-brane embedding of domain wall solitons which gives a description of their dynamics in terms of a linear Nahm equation. This equation is somewhat trivial, with the content hidden in various boundary conditions. We show how these boundary conditions capture the prescribed ordering of domain walls.

2. Domain Walls

In this paper we will study a class of BPS domain wall solutions occurring in maximally supersymmetric theories with multiple, isolated vacua. The Lagrangian for these models includes a $U(k)$ gauge field $A_\mu$, a real adjoint scalar $\sigma$ and $N$ fundamental scalars $q_i$, each with real mass $m_i$,


$$L = \mathrm{Tr}\left[\frac{1}{4e^2}F^{\mu\nu}F_{\mu\nu} + \frac{1}{2e^2}|D_\mu\sigma|^2 + \frac{e^2}{2}\left(q_i\otimes q_i^\dagger - v^2\right)^2\right] + \sum_{i=1}^{N}\left(|D_\mu q_i|^2 + q_i^\dagger(\sigma - m_i)^2 q_i\right),$$
where there is an implicit sum over the flavor index $i$ in the adjoint valued term $q_i\otimes q_i^\dagger$. This Lagrangian can be embedded in a theory with 8 supercharges in any spacetime dimension $1 \le d \le 5$ (e.g. $\mathcal{N} = 2$ SQCD in $d = 3+1$). Such theories include further scalar fields which can be shown to vanish on the domain wall solutions¹. The fermions do contribute zero modes but will not be important here.

When the Higgs expectation value $v^2$ is non-vanishing, and the masses $m_i$ are distinct ($m_i \neq m_j$ for $i \neq j$), the theory has a set of isolated vacua. Each vacuum is labelled by a set of $k$ distinct elements, chosen from a possible $N$,
$$\Xi = \{\xi(a) : \xi(a) \neq \xi(b) \ \text{for}\ a \neq b\}. \tag{2.1}$$
Here $a = 1, \ldots, k$ runs over the color index, while $\xi(a) \in \{1, \ldots, N\}$. Up to a gauge transformation, the vacuum associated to this set is given by
$$\sigma = \mathrm{diag}(m_{\xi(1)}, \ldots, m_{\xi(k)}), \qquad q^a_{\ i} = v\,\delta^a_{\ i=\xi(a)}. \tag{2.2}$$
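The labelling (2.1) identifies each vacuum with a $k$-element subset of $\{1, \ldots, N\}$, of which there are $\binom{N}{k}$. As an illustration only, the following minimal Python sketch enumerates these labels and reads off the diagonal entries of $\sigma$ from (2.2). The function names `vacua` and `vacuum_sigma` and the sample masses are our own choices for the example, not part of the construction.

```python
from itertools import combinations
from math import comb


def vacua(N, k):
    """All vacuum labels: sorted k-tuples of distinct flavors drawn from 1..N."""
    return list(combinations(range(1, N + 1), k))


def vacuum_sigma(vacuum, masses):
    """Diagonal entries of sigma in a given vacuum, cf. (2.2).

    masses[i - 1] holds the (hypothetical) real mass m_i.
    """
    return [masses[i - 1] for i in vacuum]


if __name__ == "__main__":
    N, k = 4, 2
    labels = vacua(N, k)
    # The number of labels agrees with the binomial coefficient.
    assert len(labels) == comb(N, k)
    for label in labels:
        print(label, vacuum_sigma(label, [4.0, 3.0, 2.0, 1.0]))
```

For $N < k$ the list is empty, in line with the absence of supersymmetric vacua in that case.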

¹ If we promote the scalar field $\sigma$ and the masses $m_i$ to complex variables, then the theories admit an interesting array of domain wall junctions [15] and dyonic walls [16].

For $N < k$ there are no supersymmetric vacua; for $N \ge k$, the number of vacua is
$$N_{\rm vac} = \binom{N}{k} = \frac{N!}{k!\,(N-k)!}. \tag{2.3}$$
Each of these vacua is isolated and exhibits a mass gap. There are $k^2$ non-BPS massive gauge bosons and quarks with masses $m_\gamma^2 \sim e^2v^2 + |m_{\xi(a)} - m_{\xi(b)}|^2$ and $k(N-k)$ BPS massive quark fields with masses $m_q \sim |m_{\xi(a)} - m_i|$ (with $i \notin \Xi$).

For vanishing masses $m_i = 0$ the theory enjoys an $SU(N)$ flavor symmetry, rotating the $q_i$. When distinct masses are turned on this is broken explicitly to the Cartan subalgebra $U(1)^{N-1}$. Meanwhile, the $U(k)$ gauge group is broken spontaneously in the vacuum by the expectation values (2.2).

The existence of multiple, gapped, isolated vacua is sufficient to guarantee the existence of co-dimension one domain walls (otherwise known as kinks). These walls are BPS objects, satisfying first order Bogomol'nyi equations which can be derived in the usual manner by completing the square. We first choose a flat connection $F_{\mu\nu} = 0$, and allow the fields to depend only on a single coordinate, say $x^3$. Then the Hamiltonian is given by
$$\begin{aligned}
H &= \mathrm{Tr}\left[\frac{1}{2e^2}|D_3\sigma|^2 + \frac{e^2}{2}\left(q_i\otimes q_i^\dagger - v^2\right)^2\right] + \sum_{i=1}^{N}\left(|D_3 q_i|^2 + q_i^\dagger(\sigma - m_i)^2 q_i\right)\\
&= \frac{1}{2e^2}\,\mathrm{Tr}\left(D_3\sigma - e^2\left(q_i\otimes q_i^\dagger - v^2\right)\right)^2 + \sum_{i=1}^{N}\left|D_3 q_i - (\sigma - m_i)q_i\right|^2\\
&\quad + \mathrm{Tr}\left[D_3\sigma\left(q_i\otimes q_i^\dagger - v^2\right)\right] + \sum_{i=1}^{N}\left(q_i^\dagger(\sigma - m_i)D_3 q_i + D_3 q_i^\dagger(\sigma - m_i)q_i\right)\\
&\ge -\partial_3\left(v^2\,\mathrm{Tr}\,\sigma\right).
\end{aligned} \tag{2.4}$$

Our domain wall interpolates between a vacuum $\Xi_-$ at $x^3 = -\infty$, as determined by a set (2.1), and a distinct vacuum $\Xi_+$ at $x^3 = +\infty$. The minus signs above have been chosen under the assumption that $\Delta m > 0$, where $\Delta m = \sum_{i\in\Xi_-} m_i - \sum_{i\in\Xi_+} m_i$, so that a BPS domain wall satisfies the Bogomol'nyi equations,
$$D_3\sigma = e^2\left(q_i\otimes q_i^\dagger - v^2\right), \qquad D_3 q_i = (\sigma - m_i)q_i \tag{2.5}$$
and has tension given by $T = v^2\Delta m$. Analytic solutions to these equations can be found in the $e^2 \to \infty$ limit [17–19], which give smooth approximations to the solution at large, but finite $e^2$ [20].

2.1. Classification of Domain Walls. Domain walls in field theories are classified by the choice of vacuum $\Xi_-$ and $\Xi_+$ at left and right infinity. However, our theory contains an exponentially large number of vacua (2.3) and one may hope that there is a coarser, less unwieldy, classification which captures certain relevant properties of a given domain wall. Such a classification was offered in [21]. Firstly define the $N$-vector $\vec m = (m_1, \ldots, m_N)$. The tension of the BPS domain wall can then be written as
$$T_{\vec g} = v^2\Delta m \equiv v^2\,\vec m\cdot\vec g, \tag{2.6}$$
where the $N$-vector $\vec g \in \Lambda_R(su(N))$, the root lattice of $su(N)$. Note that there do not exist domain wall solutions for all $\vec g \in \Lambda_R(su(N))$; the only admissible vectors are of the form $\vec g = (p_1, \ldots, p_N)$ with $p_i = 0$ or $\pm 1$. Note also that a choice of $\vec g$ does not specify a unique choice of vacua $\Xi_-$ and $\Xi_+$ at left and right infinity. Nor, in fact, does it specify a unique domain wall moduli space $W_{\vec g}$. Nevertheless, domain walls in sectors with the same $\vec g$ share common traits.

The dimension of the moduli space of domain wall solutions was computed in [21] using an index theorem, following earlier results in [22, 19]. To describe the dimension of the moduli space, it is useful to decompose $\vec g$ in terms of simple roots² $\vec\alpha_i$,
$$\vec g = \sum_{i=1}^{N-1} n_i\,\vec\alpha_i, \qquad n_i \in \mathbb{Z}. \tag{2.7}$$

² The basis of simple roots is fixed by the requirement that $\vec m\cdot\vec\alpha_i > 0$ for each $i$. A unique basis is defined in this way if $\vec m$ lies in a positive Weyl chamber, which occurs whenever the masses are distinct so that $SU(N) \to U(1)^{N-1}$. If we choose the ordering $m_1 > m_2 > \cdots > m_N$ we have simple roots $\vec\alpha_1 = (1, -1, 0, \ldots, 0)$ and $\vec\alpha_2 = (0, 1, -1, 0, \ldots, 0)$ through to $\vec\alpha_{N-1} = (0, \ldots, 1, -1)$.

The index theorem of [21] reveals that domain wall solutions to (2.5) exist only if $n_i \ge 0$ for all $i$. If this holds, the number of zero modes of a solution is given by
$$\dim W_{\vec g} = 2\sum_{i=1}^{N-1} n_i, \tag{2.8}$$


where $W_{\vec g}$ denotes the moduli space of any set of domain walls with charge $\vec g$. This result has a simple physical interpretation. There exist $N - 1$ types of "elementary" domain walls corresponding to $\vec g = \vec\alpha_i$, the simple roots. Each of these has just two collective coordinates corresponding to a position in the $x^3$ direction and a phase. As first explained in [1], the phase coordinate is a Goldstone mode arising because the domain wall configuration breaks the $U(1)^{N-1}$ flavor symmetry as we review below. In general, a domain wall labelled by $\vec g$ can be thought to be constructed from $\sum_i n_i$ elementary domain walls, each with its own position and phase collective coordinate. Let us now turn to some examples.

2.2. The moduli space of domain walls: Some examples.

Example 1. $\vec g = \vec\alpha_1$. The simplest system admitting a domain wall is the abelian $k = 1$ theory with $N = 2$ charged scalars $q_1$ and $q_2$. The $N_{\rm vac} = 2$ vacua of the theory are given by $\sigma = m_i$ and $|q_j|^2 = v^2\delta_{ij}$ for $i = 1, 2$. There is a single domain wall in this theory with $\vec g = \vec\alpha_1$ interpolating between these two vacua. Under the $U(1)_F$ flavor symmetry, $q_1$ has charge $+1$ and $q_2$ has charge $-1$. In each of the vacua, the $U(1)_F$ symmetry coincides with the $U(1)$ gauge action but, in the core of the domain wall, both $q_1$ and $q_2$ are non-vanishing, and $U(1)_F$ acts non-trivially. The resulting Goldstone mode is the phase collective coordinate. The moduli space of the domain wall is simply
$$W_{\vec g = \vec\alpha_1} \cong \mathbb{R}\times S^1, \tag{2.9}$$
where the $\mathbb{R}$ factor describes the center of mass of the domain wall and the $S^1$ the phase. One can show that the $S^1$ has radius $2\pi v^2/T_{\vec g} = 2\pi/(m_1 - m_2)$.

Example 2. $\vec g = \vec\alpha_1 + \vec\alpha_2$. The simplest system admitting multiple domain walls is the abelian $k = 1$ theory with $N = 3$ charged scalars. There are now three vacua, given by $\sigma = m_i$ and $|q_j|^2 = v^2\delta_{ij}$. In each a $U(1)_{F_1}\times U(1)_{F_2}$ flavor symmetry is unbroken, under which the scalars have charge
$$U(1)_{F_1}\times U(1)_{F_2}: \quad \begin{cases} q_1: (+1, 0)\\ q_2: (-1, 1)\\ q_3: (0, -1) \end{cases} \tag{2.10}$$
The first elementary domain wall $\vec g = \vec\alpha_1$ interpolates between vacuum 1 and vacuum 2, breaking $U(1)_{F_1}$ along the way. The second elementary domain wall interpolates between vacuum 2 and vacuum 3, breaking $U(1)_{F_2}$ along the way. Of interest here is the domain wall $\vec g = \vec\alpha_1 + \vec\alpha_2$ interpolating between vacuum 1 and vacuum 3. It can be thought of as a composite of two domain walls. The moduli space for these two domain walls was studied in [17, 23] and is of the form
$$W_{\vec g = \vec\alpha_1 + \vec\alpha_2} \cong \mathbb{R}\times\frac{\mathbb{R}\times\tilde W_{\vec\alpha_1 + \vec\alpha_2}}{\mathbb{Z}}. \tag{2.11}$$
The first factor of $\mathbb{R}$ corresponds to the center of mass of the two domain walls; the second factor corresponds to the combined phase associated to the two domain walls. When the ratio of tensions of the two elementary domain walls $T_{\vec\alpha_1}/T_{\vec\alpha_2}$ is rational, the ratio of the periods of the two phases is similarly rational and the second $\mathbb{R}$ factor collapses to $S^1$, while the quotient $\mathbb{Z}$ reduces to a finite group. The relative moduli space $\tilde W_{\vec\alpha_1 + \vec\alpha_2}$ corresponds to the separation and relative phases of the two domain walls. Importantly, and unlike other solitons of higher co-dimension, the domain walls must obey a strict ordering on the $x^3$ line: the $\vec g = \vec\alpha_1$ domain wall must always be to the left of the $\vec g = \vec\alpha_2$ domain wall. The separation between the walls is therefore the half-line $\mathbb{R}^+$. It was shown in [17] that the relative phase is fibered over $\mathbb{R}^+$ to give rise to a smooth cylinder, with the tip corresponding to coincident domain walls. The resulting moduli space is shown in Fig. 1.

Note that the moduli space (2.11) is toric, inheriting two isometries from the $U(1)_{F_1}\times U(1)_{F_2}$ symmetry. In an abelian gauge theory with arbitrary number of flavors $N$, the domain wall charge is always of the form $\vec g = \sum_i n_i\vec\alpha_i$ with $n_i = 0, 1$, and the moduli space is always toric, meaning that half of the dimensions correspond to $U(1)$ isometries.

Example 3. $\vec g = \vec\alpha_1 + 2\vec\alpha_2 + \vec\alpha_3$. In non-abelian theories, the domain wall moduli spaces are no longer toric. The simplest such theory has a $U(2)$ gauge group with $N = 4$ fundamental scalars. The 6 vacua, and 15 different domain walls, of this theory were detailed in [21]. Under the $U(1)_F^3$ flavor symmetry, the fundamental scalars transform as
$$U(1)_{F_1}\times U(1)_{F_2}\times U(1)_{F_3}: \quad \begin{cases} q_1: (+1, 0, 0)\\ q_2: (-1, 1, 0)\\ q_3: (0, -1, 1)\\ q_4: (0, 0, -1) \end{cases} \tag{2.12}$$
With this convention, the elementary domain wall $\vec g = \vec\alpha_i$ picks up its phase from the action of the $U(1)_{F_i}$ flavor symmetry. Here we concentrate on the domain wall system with the maximal number of zero modes, which arises from the choice of vacua $\Xi_- = (1, 2)$ and $\Xi_+ = (3, 4)$ so that $\vec g = \vec\alpha_1 + 2\vec\alpha_2 + \vec\alpha_3$. This system can be separated into four constituent domain walls. As explained in [19, 21], the ordering of domain walls is no longer strictly fixed in this example. The two outer elementary domain walls, on the far left and far right, are each of the type $\vec g = \vec\alpha_2$. However, the relative positions of the middle two domain walls, $\vec g = \vec\alpha_1$ and $\vec\alpha_3$, are not ordered and they may pass through each other. Unlike the situation for abelian gauge theories, the 8 dimensional domain wall moduli space for this example is no longer toric; $W_{\vec g}$ inherits only three $U(1)$ isometries from (2.12). Physically this means that the two phases associated to the $\vec\alpha_2$ domain walls are not both Goldstone modes and they may interact as the domain walls approach. This behaviour is familiar from the study of the Atiyah-Hitchin metric describing the dynamics of two monopoles in $SU(2)$ gauge theory; we shall make the analogy more precise in the following.
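The charges quoted in these examples follow mechanically from the pair of vacua. As a hedged illustration, the sketch below builds $\vec g$ from $\Xi_-$ and $\Xi_+$ (so that $\vec m\cdot\vec g = \Delta m$), extracts the coefficients $n_i$ of (2.7) as partial sums of the $p_i$, and evaluates (2.6) and (2.8). It assumes the ordering $m_1 > \cdots > m_N$, for which $\vec\alpha_i = (0, \ldots, 1, -1, \ldots, 0)$; all function names are ours and purely illustrative.

```python
def charge(xi_minus, xi_plus, N):
    """Charge vector g = (p_1..p_N): +1 for flavors in Xi_-, -1 for those in Xi_+."""
    g = [0] * N
    for i in xi_minus:
        g[i - 1] += 1
    for i in xi_plus:
        g[i - 1] -= 1
    return g


def simple_root_coefficients(g):
    """Coefficients n_i in g = sum_i n_i alpha_i, with alpha_i = e_i - e_{i+1}.

    Since alpha_i = e_i - e_{i+1}, n_i is the partial sum p_1 + ... + p_i.
    """
    if sum(g) != 0:
        raise ValueError("g must lie in the su(N) root lattice (entries sum to zero)")
    coefficients, running = [], 0
    for p in g[:-1]:
        running += p
        coefficients.append(running)
    return coefficients


def wall_moduli_dimension(n):
    """dim W_g = 2 * sum_i n_i, valid only when every n_i >= 0, cf. (2.8)."""
    if any(x < 0 for x in n):
        raise ValueError("no BPS domain wall in this sector: some n_i < 0")
    return 2 * sum(n)


def tension(masses, g, v_squared):
    """T_g = v^2 * (m . g), cf. (2.6)."""
    return v_squared * sum(m * p for m, p in zip(masses, g))


if __name__ == "__main__":
    g = charge({1, 2}, {3, 4}, 4)        # Example 3
    n = simple_root_coefficients(g)
    print(g, n, wall_moduli_dimension(n))
```

For Example 3 this gives $n = (1, 2, 1)$ and dimension 8, and for Example 2 it gives $n = (1, 1)$ and dimension 4, matching the text.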

Fig. 1. The relative moduli space $\tilde W_{\vec\alpha_1 + \vec\alpha_2}$ of two domain walls is a cigar

2.3. The ordering of domain walls. As we stressed in the above examples, in contrast to other solitons domain walls must obey some ordering on the line. This will be an important ingredient when we come to extract domain wall data from the linearized Nahm's equations in Sect. 4. Here we linger to review this ordering.

The ordering of the elementary domain walls in non-abelian theories was studied in detail in [19]. One can derive the result by considering the possible sequences of vacua as we move over each domain wall. For example, we could consider the "maximal domain wall", interpolating between $\Xi_- = \{1, 2, \ldots, k\}$ and $\Xi_+ = \{N-k+1, \ldots, N\}$. From the left, the first elementary domain wall that we come across must be $\vec g = \vec\alpha_k$, corresponding to $\Xi_- = (1, 2, \ldots, k-1, k) \to (1, 2, \ldots, k-1, k+1)$. The next elementary domain wall may be either $\vec\alpha_{k-1}$ or $\vec\alpha_{k+1}$. These two walls are free to pass through each other, but cannot move further to the left than the $\vec\alpha_k$ wall. And so on. Iterating this procedure, one finds that two neighbouring elementary domain walls $\vec\alpha_i$ and $\vec\alpha_j$ may pass through each other whenever $\vec\alpha_i\cdot\vec\alpha_j = 0$, but otherwise have a fixed ordering on the line.

The net result of this analysis is summarized in Fig. 2. The $x^3$ positions of the domain walls are shown on the vertical axis; the position on the horizontal axis denotes the type of elementary domain wall, starting on the left with $\vec\alpha_1$ and ending on the right with $\vec\alpha_{N-1}$. In summary, we see that for $n_i = n_{i+1} - 1$, the $\vec\alpha_i$ domain walls are trapped between the $\vec\alpha_{i+1}$ domain walls. The reverse holds when $n_i = n_{i+1} + 1$. Finally, when $n_i = n_{i+1}$ the positions of the $\vec\alpha_i$ domain walls are interlaced with those of the $\vec\alpha_{i+1}$ domain walls. The last $\vec\alpha_{i+1}$ domain wall is unconstrained by the $\vec\alpha_i$ walls in its travel in the positive $x^3$ direction, although it may be trapped in turn by a $\vec\alpha_{i+2}$ wall. While we have discussed the maximal domain wall above, other sectors can be reached either by removing some of the outer domain walls to infinity, or by taking non-interacting products of such subsets.
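The argument via sequences of vacua can be made concrete with a small enumeration. The sketch below is only a combinatorial illustration under one stated assumption: crossing an elementary $\vec\alpha_i$ wall from left to right replaces the label $i$ in the vacuum by $i + 1$, which requires $i$ to be present and $i + 1$ absent. It lists every left-to-right sequence of elementary walls between two vacua, and also evaluates $\vec\alpha_i\cdot\vec\alpha_j$. The helper names are hypothetical.

```python
def simple_root_dot(i, j):
    """Inner product of su(N) simple roots alpha_i = e_i - e_{i+1}."""
    if i == j:
        return 2
    return -1 if abs(i - j) == 1 else 0


def elementary_moves(vacuum, N):
    """Yield (i, next_vacuum) for each alpha_i wall that can follow this vacuum."""
    labels = set(vacuum)
    for i in sorted(labels):
        if i < N and (i + 1) not in labels:
            yield i, frozenset((labels - {i}) | {i + 1})


def wall_orderings(xi_minus, xi_plus, N):
    """All left-to-right sequences of elementary wall types from Xi_- to Xi_+."""
    target = frozenset(xi_plus)
    found = []

    def walk(vacuum, path):
        if vacuum == target:
            found.append(tuple(path))
            return
        # Every move raises the sum of the labels, so the recursion terminates.
        for i, nxt in elementary_moves(vacuum, N):
            walk(nxt, path + [i])

    walk(frozenset(xi_minus), [])
    return found


if __name__ == "__main__":
    # Maximal wall for k = 2, N = 4: alpha_2 is outermost on both sides,
    # while alpha_1 and alpha_3 appear in either order in the middle.
    print(wall_orderings({1, 2}, {3, 4}, 4))
    print(simple_root_dot(1, 3))  # orthogonal roots: these walls may pass
```

Running the same enumeration for $\Xi_- = \{1, 3\}$, $\Xi_+ = \{3, 4\}$ and for $\Xi_- = \{1, 2\}$, $\Xi_+ = \{2, 4\}$ gives two sectors with the same charge $\vec\alpha_1 + \vec\alpha_2 + \vec\alpha_3$ but with the $\vec\alpha_2$ wall on opposite sides of the other two.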
It is important to note that labelling a topological sector $\vec g$ does not necessarily determine the ordering of domain walls³. We shall show that domain walls with the same $\vec g$, but different orderings, descend from different submanifolds of the same monopole moduli space.

³ An example: the $\vec g = \vec\alpha_1 + \vec\alpha_2 + \vec\alpha_3$ domain wall. In the abelian theory with $k = 1$ and $N = 4$, the ordering is $\vec\alpha_1 < \vec\alpha_2 < \vec\alpha_3$. However, in the non-abelian theory with $k = 2$ and $N = 4$, the ordering is $\vec\alpha_1, \vec\alpha_3 < \vec\alpha_2$.

Fig. 2. The ordering for the maximal domain wall. The $x^3$ spatial direction is shown horizontally. The position in the vertical direction denotes the type of domain wall. Domain walls of neighbouring types have their positions interlaced

3. Monopoles

The main goal of this paper is to show how the moduli space of domain walls introduced in the previous section is isomorphic to a submanifold of a related monopole moduli space. In this section we review several relevant aspects of these monopole moduli spaces. It will turn out that the domain walls of Sect. 2 are related to monopoles in an $SU(N)$ gauge theory. Note that the flavor group from Sect. 2 has been promoted to a gauge group; we shall see the reason behind this in Sect. 4. The Bogomol'nyi monopole equations are
$$B_\mu = D_\mu\phi, \tag{3.1}$$
where $B_\mu$, $\mu = 1, 2, 3$, is the $SU(N)$ magnetic field and $\phi$ is an adjoint valued real scalar field. The monopoles exist only if $\phi$ takes a vacuum expectation value,
$$\langle\phi\rangle = \mathrm{diag}(m_1, \ldots, m_N), \tag{3.2}$$
where we take $m_i \neq m_j$ for $i \neq j$, ensuring breaking to the maximal torus, $SU(N) \to U(1)^{N-1}$. It is no coincidence that we have denoted the vacuum expectation values by $m_i$, the same notation used for the masses in Sect. 2; it is for this choice of vacuum that the correspondence holds. (Specifically, the masses of the kinks will coincide with the masses of monopoles, ensuring that the asymptotic metrics on $W_{\vec g}$ and $M_{\vec g}$ also coincide.) As described long ago by Goddard, Nuyts and Olive [24], the allowed magnetic charges under each unbroken $U(1)^{N-1}$ are specified by a root vector⁴ of $su(N)$, $\vec g = (p_1, \ldots, p_N)$. It is customary to decompose this in terms of simple roots $\vec\alpha_i$,
$$\vec g = \sum_{i=1}^{N-1} n_i\,\vec\alpha_i, \qquad n_i \in \mathbb{Z}. \tag{3.3}$$

⁴ We ignore the factor of 2 difference between roots and co-roots. For simply laced groups, such as $SU(N)$, it can be absorbed into convention.

Once again, the notation is identical to that used for domain walls (2.7) for good reason. Solutions to the monopole equations (3.1) exist for all values of $n_i \ge 0$. This is in contrast to domain walls where, as we have seen, configurations only exist in a finite number of sectors defined by $p_i = 0$ or $p_i = \pm 1$. The mass of the magnetic monopole is $M_{\rm mono} = (2\pi/e^2)\,\vec m\cdot\vec g$. The monopole moduli space $M_{\vec g}$ is the space of solutions to (3.1) in a fixed topological sector $\vec g$. The dimension of this space, equal to the number of zero modes of a given solution, was computed by E. Weinberg in [25] using Callias' version of the index theorem. The result is
$$\dim(M_{\vec g}) = 4\sum_{i=1}^{N-1} n_i, \tag{3.4}$$
which is to be compared with (2.8).

3.1. The relationship between $M_{\vec g}$ and $W_{\vec g}$. We are now in a position to describe the relationship between the moduli space of domain walls $W_{\vec g}$ and the moduli space of magnetic monopoles $M_{\vec g}$. We will show that $W_{\vec g}$ is a complex, middle dimensional, submanifold of $M_{\vec g}$, defined by the fixed point set of the action rotating the monopoles in a plane, together with a suitable gauge action. To do this, we first need to describe the symmetries of $M_{\vec g}$.

The monopole moduli space $M_{\vec g}$ admits a natural, smooth, hyperKähler metric [26, 27]. For generic $\vec g$ this metric enjoys $(N-1)$ tri-holomorphic isometries arising from the action of the $U(1)^{N-1}$ abelian gauge group. Further, the metric has an $SU(2)_R$ symmetry, arising from rotations of the monopoles in $\mathbb{R}^3$, which acts on the three complex structures of $M_{\vec g}$. In other words, any $U(1)_R \subset SU(2)_R$ is a holomorphic isometry, preserving a single complex structure while revolving the remaining two. Let us choose $U(1)_R$ to rotate the monopoles in the $(x^2$-$x^3)$ plane. In what follows we will be interested in a specific holomorphic $\hat U(1)$ action which acts simultaneously by a $U(1)_R$ rotation and a linear combination of the gauge rotations $U(1)^{N-1}$ (to be specified presently). We denote the Killing vector on $M_{\vec g}$ associated to $\hat U(1)$ as $\hat k$. We claim
$$W_{\vec g} \cong M_{\vec g}\big|_{\hat k = 0}. \tag{3.5}$$
This result holds at the level of topology and asymptotic metric of the spaces. The manifold $W_{\vec g}$ inherits a metric from $M_{\vec g}$ by this reduction: it does not coincide with the domain wall metric in the interior of $W_{\vec g}$. (For example, corrections to the asymptotic metric on $W_{\vec g}$ are exponentially suppressed while those of $M_{\vec g}$ have power law behaviour.) It would be interesting to examine if $W_{\vec g}$ inherits the correct Kähler class and/or complex structure from $M_{\vec g}$. We defer a derivation of (3.5) to the following section, but first present some simple examples.

Example 1. $\vec g = \vec\alpha_1$. Monopoles in $SU(2)$ gauge theories are labelled by a single topological charge $\vec g = n_1\vec\alpha_1$. For a single monopole ($n_1 = 1$) the moduli space is simply
$$M_{\vec g = \vec\alpha_1} \cong \mathbb{R}^3\times S^1, \tag{3.6}$$
where the $\mathbb{R}^3$ factor denotes the position of the monopole, while the $S^1$ arises from global gauge transformations under the surviving $U(1)$. The radius of the $S^1$ is $2\pi/(m_1 - m_2)$. In this case the $\hat U(1)$ action coincides with the rotation $U(1)_R$ in the $(x^2$-$x^3)$ plane and we have trivially
$$W_{\vec\alpha_1} \cong \mathbb{R}\times S^1 \cong M_{\vec\alpha_1}\big|_{\hat k = 0}. \tag{3.7}$$
The similarity between the domain wall and monopole moduli spaces for a single soliton was noted by Abraham and Townsend [1]. In both cases, motion in the $S^1$ factor gives rise to dyonic solitons. Note that monopole moduli spaces for charges $\vec g = n_1\vec\alpha_1$ exist for all $n_1 \in \mathbb{Z}^+$. For example, the $n_1 = 2$ monopole moduli space is home to the famous Atiyah-Hitchin metric [27]. However, there is no domain wall moduli space with this charge in the class of theories we discuss in Sect. 2.

Example 2. $\vec g = \vec\alpha_1 + \vec\alpha_2$. Our second example is the $\vec g = \vec\alpha_1 + \vec\alpha_2$ monopole in $SU(3)$ gauge theories (sometimes referred to as the (1, 1) monopole). The moduli space was determined in [28–30] to be of the form
$$M_{\vec g = \vec\alpha_1 + \vec\alpha_2} \cong \mathbb{R}^3\times\frac{\mathbb{R}\times\tilde M_{\vec\alpha_1 + \vec\alpha_2}}{\mathbb{Z}}, \tag{3.8}$$
where the relative moduli space $\tilde M_{\vec\alpha_1 + \vec\alpha_2}$ is the Euclidean Taub-NUT space, endowed with the metric
$$ds^2 = \left(1 + \frac{1}{r}\right)\left(dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2\right) + \left(1 + \frac{1}{r}\right)^{-1}\left(d\psi + \cos\theta\,d\phi\right)^2. \tag{3.9}$$
Here $r$, $\theta$ and $\phi$ are the familiar spherical polar coordinates. The coordinate $\psi$ arises from $U(1)$ gauge transformations. The manifold has an $SU(2)_R\times U(1)$ isometry, of which only a $U(1)_R\times U(1)$ is manifest in the above coordinates. The holomorphic $U(1)_R$ isometry acts by rotating the two monopoles: $\phi \to \phi + c$. The tri-holomorphic $U(1)$ isometry changes the relative phase of the monopoles: $\psi \to \psi + c$. Both of these actions have a unique fixed point at $r = 0$, the "nut" of Taub-NUT. However, the combined action with Killing vector $\partial_\psi + \partial_\phi$ has a fixed point along the half-line $\theta = \pi$, with $\psi$ fibered over this line to produce the cigar shown in Fig. 1. This is the relative moduli space $\tilde W_{\vec\alpha_1 + \vec\alpha_2}$.

Similar calculations hold for monopoles of charge $\vec g = \sum_{i=1}^{N-1}\vec\alpha_i$, whose dynamics is described by a class of toric hyperKähler metrics, known as the Lee-Weinberg-Yi metrics [31]. Once again, a suitable $S^1$ action on these spaces can be identified such that the fixed points localize on $W_{\vec g}$, the moduli space of domain walls in $U(1)$ gauge theories with $N$ charged scalars.

Example 3. $\vec g = \vec\alpha_1 + 2\vec\alpha_2 + \vec\alpha_3$. As described in the previous section, the simplest domain wall charge $\vec g = \sum_i n_i\vec\alpha_i$ with some $n_i > 1$ occurs for $\vec g = \vec\alpha_1 + 2\vec\alpha_2 + \vec\alpha_3$, and corresponds to a monopole in an $SU(4)$ gauge theory. No explicit expression for the metric on this monopole moduli space is known although, given the results of [32], such a computation may be feasible. Without an explicit expression for the metric in this, and more complicated examples, we need a more powerful method to describe the moduli space. This is provided by the Nahm construction, which we now review.

3.2. D-branes and Nahm's equations. The moduli space of magnetic monopoles is isomorphic to the moduli space of Nahm data. Here we review the Nahm construction of the monopole moduli space [33] and, in particular, the embedding within the framework of D-branes due to Diaconescu [34].
This will be useful to compare to the domain walls of the next section.

In the D-brane construction, Nahm's equations arise as the low-energy description of D-strings suspended between D3-branes [34]. The $SU(N)$ Yang-Mills theory lives on the worldvolume of $N$ D3-branes separated in, say, the $x_6$ direction, with the $i$th D3-brane placed at position $(x_6)_i = m_i$ in accord with the adjoint expectation value (3.2). The monopole of charge $\vec g = \sum_i n_i\vec\alpha_i$ corresponds to suspending $n_i$ D-strings between the $i$th and $(i+1)$th D3-brane. This configuration is shown in Fig. 3. The motion of the D-strings in each segment $m_i \le x_6 \le m_{i+1}$ is governed by four hermitian $n_i\times n_i$ matrices, $X_1$, $X_2$, $X_3$ and $A_6$, subject to the covariant version of Nahm's equations,
$$\frac{dX_\mu}{dx_6} - i[A_6, X_\mu] - \frac{i}{2}\epsilon_{\mu\nu\rho}[X_\nu, X_\rho] = 0, \qquad m_i \le x_6 \le m_{i+1}, \tag{3.10}$$
modulo $U(n_i)$ gauge transformations acting on the interval $m_i \le x_6 \le m_{i+1}$, and vanishing at the boundaries. The $X_\mu$ form the triplet representation of the $SU(2)_R$ symmetry which rotates monopoles in $\mathbb{R}^3$. The $U(1)^{N-1}$ surviving gauge transformations act on the Nahm data by constant shifts of the $(N-1)$ "Wilson lines" $A_6 \to A_6 + c\,\mathbb{1}_{n_i}$.

Fig. 3. The $\vec g = 3\vec\alpha_1 + 2\vec\alpha_2 + 3\vec\alpha_3$ monopole as D-strings (D1) stretched between D3-branes

The interactions between neighbouring segments depend on the relative size of the matrices. There are three possibilities, given by [35]:

i) $n_i = n_{i+1}$: In this case the $U(n_i)$ gauge symmetry is extended to the interval $m_i \le x_6 \le m_{i+2}$ and an impurity is added to the right-hand side of Nahm's equations, which now read
$$\frac{dX_\mu}{dx_6} - i[A_6, X_\mu] - \frac{i}{2}\epsilon_{\mu\nu\rho}[X_\nu, X_\rho] = \omega_\alpha\,\sigma_\mu^{\alpha\beta}\,\omega_\beta^\dagger\,\delta(x_6 - m_{i+1}). \tag{3.11}$$
Here $\sigma_\mu$ are the Pauli matrices. The impurity degrees of freedom lie in the complex 2-vector $\omega_\alpha = (\psi, \tilde\psi^\dagger)$, which is a doublet under the $SU(2)_R$ symmetry. Both $\psi$ and $\tilde\psi^\dagger$ are themselves complex $n_i$-vectors, transforming in the fundamental representation of the $U(n_i)$ gauge group. The combination $\omega_\alpha\sigma_\mu^{\alpha\beta}\omega_\beta^\dagger$ is thus an $n_i\times n_i$ matrix, transforming in the adjoint representation of the gauge group. The $\omega_\alpha$ fields can be thought of as a hypermultiplet arising from D1-D3 strings [36–38].

ii) $n_i = n_{i+1} - 1$: In this case $X_\mu \to (X_\mu)_-$, a set of three $n_i\times n_i$ matrices, as $x_6 \to (m_i)_-$ from the left. To the right of $m_i$, the $X_\mu$ are $(n_i + 1)\times(n_i + 1)$ matrices which must obey
$$X_\mu \to \begin{pmatrix} y_\mu & a_\mu^\dagger\\ a_\mu & (X_\mu)_- \end{pmatrix} \quad \text{as}\ x_6 \to (m_i)_+, \tag{3.12}$$
where $y_\mu \in \mathbb{R}$ and each $a_\mu$ is a complex $n_i$-vector.

iii) $n_i \le n_{i+1} - 2$: Once again we take $X_\mu \to (X_\mu)_-$ as $x_6 \to (m_i)_-$ but, from the other side, the matrices $X_\mu$ now have a simple pole at the boundary,
$$X_\mu \to \begin{pmatrix} J_\mu/(x_6 - m_i) + Y_\mu & 0\\ 0 & (X_\mu)_- \end{pmatrix} \quad \text{as}\ x_6 \to (m_i)_+, \tag{3.13}$$
where $J_\mu$ is the irreducible $(n_{i+1} - n_i)\times(n_{i+1} - n_i)$ representation of $su(2)$, and $Y_\mu$ are now constant $(n_{i+1} - n_i)\times(n_{i+1} - n_i)$ matrices.

Case 2 above is usually described as a subset of Case 3 (with the one-dimensional irreducible $su(2)$ representation given by $J_\mu = 0$). Here we have listed Case 2 separately since when we come to describe a similar construction for domain walls, only Case 1 and 2 above will appear. The conditions for $n_i < n_{i+1}$ were derived in [39] by starting with the impurity data (3.11) and taking several monopoles to infinity. Obviously, for $n_i > n_{i+1}$, one imposes the same boundary conditions described above, only flipped in the $x_6$ direction.

The space of solutions to Nahm's equations, subject to the boundary conditions detailed above, is isomorphic to the monopole moduli space $M_{\vec g}$. Moreover, there exists a natural hyperKähler metric on the solutions to Nahm's equations which can be shown to coincide with the Manton metric on the monopole moduli space. For the $\vec g = \vec\alpha_1 + \vec\alpha_2$ monopole, the metric on the associated Nahm data was computed in [28] and shown to give rise to the Euclidean Taub-NUT metric (3.9). For the $\vec g = \sum_i\vec\alpha_i$ monopoles, the corresponding computation was performed in [40], resulting in the Lee-Weinberg-Yi metrics [31].


4. D-Branes and Domain Walls

In this section we would like to realize the domain walls that we described in Sect. 2 on the worldvolume of D-branes, mimicking Diaconescu’s construction for monopoles. From the resulting D-brane set-up we shall read off the world-volume dynamics of the domain walls to find that they are described by a truncated version of Nahm’s equations (3.10). Nahm’s equations have also arisen as a description of domain walls in $\mathcal{N} = 1$ theories [41], although the relationship, if any, with the current work is unclear. Domain walls of the type described in Sect. 2 were previously embedded in D-branes in [42, 43] and several properties of the solitons were extracted (see in particular the latter reference). However, the worldvolume dynamics of the walls is difficult to determine in these set-ups and the relationship to magnetic monopoles is obscured.

We start by constructing the theory with eight supercharges on the worldvolume of D-branes [36]. For definiteness we choose to build the $\mathcal{N} = 2$, $d = 3 + 1$ theory in IIA string theory although, by T-duality, we could equivalently work with any spacetime dimension$^5$. The construction is well known and is drawn in Fig. 4. We suspend $k$ D4-branes between two NS5-branes, and insert a further $N$ D6-branes to play the role of the fundamental hypermultiplets. The worldvolume dimensions of the branes are

NS5 : 012345,
D4 : 01236,
D6 : 0123789.

The gauge coupling $e^2$ and the Higgs vev $v^2$ are encoded in the separation of the NS5-branes in the $x_6$ and $x_9$ directions respectively, while the masses $m_i$ are determined by the positions of the D6-branes in the $x_4$ direction (we choose the D6-branes to be coincident in the $x_5$ direction, corresponding to choosing all masses to be real),

$$\frac{1}{e^2} \sim \left.\frac{\Delta x_6}{l_s g_s}\right|_{NS5}, \qquad v^2 \sim \left.\frac{\Delta x_9}{l_s^3 g_s}\right|_{NS5}, \qquad m_i \sim -\left.\frac{x_4}{l_s^2}\right|_{D6_i}. \tag{4.1}$$

After turning on the Higgs vev $v^2$, the D4-branes must split on the D6-branes in order to preserve supersymmetry.
The S-rule [36] ensures that each D6-brane may play host to only a single D4-brane. In this manner a vacuum of the theory is chosen by picking $k$ out of the $N$ D6-branes on which the D4-branes end, in agreement with Eq. (2.1). The domain walls correspond to a configuration of D4-branes which start life at $x_3 = -\infty$ in a vacuum configuration $\Xi_-$, and end up at $x_3 = +\infty$ in a distinct vacuum $\Xi_+$. As is clear from Fig. 4, as D4-brane walls interpolate in $x_1$, they must also move in both the $x_4$ direction and the $x_9$ direction [45]. The NS-branes and D6-branes are linked, meaning that a D4-brane is either created or destroyed as they pass the NS5-branes in the $x_6$ direction [36]. In the domain wall background, which of these possibilities occurs differs if we move the D6-branes to the left or right since the D4-brane charge varies from one end of the domain wall to the other.

5 In fact, as explained in [44], the overall $U(1) \subset U(k)$ is decoupled in the IIA brane set-up after lifting to M-theory. This effect will not concern us here.

Fig. 4. The D-brane set-up for the U(1) gauge theory with N = 3 flavors. The vacuum is shown on the left; the elementary domain wall $g = \alpha_1$ on the right. (Labels in the figure: D6, NS5, D4; axes $x_6$, $x_9$, $x_4$.)

As it stands, it is difficult to read off the dynamics of the D4-branes in Fig. 4. However, we can make progress by taking the $e^2 \to \infty$ limit, in which the two NS5-branes become coincident in the $x_6$ direction. After rotating our viewpoint, the system of branes now looks like the ladder configuration shown in Fig. 5 (note that we have also rotated the branes relative to Fig. 4, so the horizontal is the $x_4$ direction). We are left with a series of D4-branes, now with worldvolume 02349, stretched between $N$ D6-branes, while simultaneously sandwiched between two NS5-branes. Following these manoeuvres, one finds that the domain wall $g = \sum_i n_i \alpha_i$ results in $n_i$ D4-branes stretched between the $i$th and $(i+1)$th D6-branes (counting from the top, since we have chosen the ordering $m_i > m_{i+1}$).

Fig. 5. The D-brane set-up for the U(2) gauge theory with N = 6 flavors. The maximal $g = \alpha_1 + \alpha_5 + 2(\alpha_2 + \alpha_3 + \alpha_4)$ domain wall is shown. (Labels in the figure: D6, NS5, D4, 2×D4; axes $x_9$, $x_4$; D6-brane positions $m_1, \dots, m_6$.)

It may be worth describing how the domain wall charges arise directly in the set-up of Fig. 5. We start in a chosen vacuum $\Xi_-$, denoted by placing $k$ pairs of white dots on $N$ distinct D6-branes, as shown in the figure. A domain wall arises every time a pair of dots proceeds to another D6-brane, dragging a D4-brane behind it like clingwrap. The S-rule translates to the fact that two pairs of dots may not simultaneously lie on the same D6-brane. The final vacuum $\Xi_+$ is denoted by the black dots in the figure and the domain wall charges $n_i$ are given by the number of times a D4-brane has been pulled between the $i$th and $(i+1)$th D6-branes.

4.1. Domain wall dynamics. We are now in a position to read off the dynamics of the domain walls. In the absence of the NS5-branes, the D4-branes would stretch to infinity in the $x_9$ direction, and the resulting D-brane set-up in Fig. 5 is T-dual to the monopoles in Fig. 3. The presence of the NS5-branes projects out half the degrees of freedom of the monopoles, leaving a simple linear set of equations. In each segment $m_i \le x_4 \le m_{i+1}$ the domain walls are described by two $n_i \times n_i$ matrices $X_3$ and $A_4$ satisfying

$$\frac{dX_3}{dx_4} - i[A_4, X_3] = 0 \tag{4.2}$$


modulo $U(n_i)$ gauge transformations acting on the interval $m_i \le x_4 \le m_{i+1}$, and vanishing at the boundaries. As in the case of monopoles, the interactions between neighbouring segments depend on the relative size of the matrices. The two possibilities are:

i) $n_i = n_{i+1}$: Again, the $U(n_i)$ gauge symmetry is extended to the interval $m_i \le x_4 \le m_{i+2}$ and an impurity is added to the right-hand-side of Nahm’s equations, which now read

$$\frac{dX_3}{dx_4} - i[A_4, X_3] = \pm\psi\psi^\dagger\, \delta(x_4 - m_{i+1}), \tag{4.3}$$

where the impurity degree of freedom $\psi$ transforms in the fundamental representation of the $U(n_i)$ gauge group, ensuring the combination $\psi\psi^\dagger$ is an $n_i \times n_i$ matrix transforming, like $X_3$, in the adjoint representation. These $\psi$ degrees of freedom are chiral multiplets which survive the NS5-brane projection. We shall see shortly that the choice of $\pm$ sign will dictate the relative ordering of the domain walls along the $x_3$ direction.

ii) $n_i = n_{i+1} - 1$: In this case $X_3 \to (X_3)_-$, an $n_i \times n_i$ matrix, as $x_4 \to (m_i)_-$ from the left. To the right of $m_i$, $X_3$ is an $(n_i + 1) \times (n_i + 1)$ matrix obeying

$$X_3 \to \begin{pmatrix} y & a^\dagger \\ a & (X_3)_- \end{pmatrix} \quad \text{as } x_4 \to (m_i)_+, \tag{4.4}$$

where $y \in \mathbb{R}$ and $a$ is a complex $n_i$-vector. The obvious analog of this boundary condition holds when $n_i = n_{i+1} + 1$.

These boundary conditions obviously descend from the original Nahm boundary conditions for monopoles. Just as the space of Nahm data is isomorphic to the moduli space of magnetic monopoles, we conjecture that the moduli space of linearized Nahm data described above is isomorphic to the moduli space of domain walls. We shall shortly show that it indeed captures the most relevant aspect of domain walls: their ordering.

In fact, the linearized Nahm equations (4.2) are rather trivial to solve. We first employ the $\prod_i U(n_i)$ gauge transformations to make $A_4(x_4)$ a constant in each interval $m_i \le x_4 \le m_{i+1}$. This can be achieved by first diagonalizing $A_4$, and subsequently acting with the $U(1)^{n_i}$ transformation $A_4 \to A_4 - \partial_4 \alpha$ where, in each segment $m_i \le x_4 \le m_{i+1}$, $\alpha$ is given by

$$\alpha(x_4) = \int_{m_i}^{x_4} A_4(x_4')\, dx_4' \;-\; \frac{m_i - x_4}{m_i - m_{i+1}} \int_{m_i}^{m_{i+1}} A_4(x_4')\, dx_4', \tag{4.5}$$

which has the property that $\alpha(m_i) = \alpha(m_{i+1}) = 0$. Further gauge transformations with non-zero winding on the interval ensure that $A_4$ is periodic, with each eigenvalue lying in $A_4 \in [0, 2\pi/(m_i - m_{i+1}))$. These $N - 1$ “Wilson lines” will play the role of the phases associated to the domain wall system. Note that when $n_i = n_{i+1}$, the above choice of gauge leaves a residual $U(n_i)$ gauge symmetry acting only on the chiral impurity $\psi$. In this gauge we can now easily integrate (4.2) in each interval,

$$X_3(x_4) = e^{iA_4 x_4}\, \hat X_3\, e^{-iA_4 x_4}, \tag{4.6}$$

where the eigenvalues of $X_3$ are independent of $x_4$ in each interval. We identify these $n_i$ eigenvalues with the positions of the $n_i$ $\alpha_i$ elementary domain walls.
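As a quick numerical sanity check of the solution (4.6), the following sketch builds the conjugated matrix for a constant hermitian $A_4$ and confirms that it satisfies (4.2) and that its eigenvalues do not depend on $x_4$. The helper names (`random_hermitian`, `solve_linearized_nahm`), the matrix size, the sample point and the random seed are illustrative assumptions and are not part of the construction in the text.

```python
import numpy as np


def random_hermitian(n, rng):
    """Return a random n x n hermitian matrix (illustrative test data)."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2


def solve_linearized_nahm(A4, X_hat, x4):
    """X_3(x4) = exp(i A4 x4) X_hat exp(-i A4 x4) for a constant hermitian A4."""
    w, V = np.linalg.eigh(A4)
    # exp(i A4 x4) assembled from the spectral decomposition of A4
    U = (V * np.exp(1j * w * x4)) @ V.conj().T
    return U @ X_hat @ U.conj().T


rng = np.random.default_rng(0)
A4 = random_hermitian(3, rng)
X_hat = random_hermitian(3, rng)

x4 = 0.3
X_at_x = solve_linearized_nahm(A4, X_hat, x4)

# The spectrum (the wall positions) should not depend on x4.
eigs_ref = np.linalg.eigvalsh(X_hat)
eigs_at_x = np.linalg.eigvalsh(X_at_x)

# Central finite difference for dX_3/dx_4, compared against i[A4, X_3].
h = 1e-6
dX = (solve_linearized_nahm(A4, X_hat, x4 + h)
      - solve_linearized_nahm(A4, X_hat, x4 - h)) / (2 * h)
residual = dX - 1j * (A4 @ X_at_x - X_at_x @ A4)
ode_residual = float(np.max(np.abs(residual)))
```

Here `eigs_at_x` should agree with `eigs_ref` up to rounding, and `ode_residual` should be small compared with the entries of the matrices, reflecting that conjugation by a unitary matrix cannot move eigenvalues.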


We are now in a position to derive the linearized Nahm equations (4.2) from the original Nahm equations (3.10) in terms of a fixed point set of a $\hat U(1)$ action. Consider first the action of the $U(1)_R \subset SU(2)_R$ isometry on the Nahm data, which rotates $X_1$ and $X_2$ while leaving $X_3$ fixed. This rotation also acts on the impurity $\omega = (\psi, \tilde\psi^\dagger)$ by $(\psi, \tilde\psi) \to e^{i\alpha}(\psi, \tilde\psi)$. To retain half of the impurities for the domain wall equations (4.3), we need to compensate for this transformation with the residual $U(1) \subset U(n_i)$ transformation acting on the appropriate impurity $\omega$ by $\omega \to e^{i\beta}\omega$. By choosing $\beta = \pm\alpha$ we can pick a $\hat U(1)$ action which leaves either the $\psi$ or the $\tilde\psi$ impurity invariant. Which we choose to save is correlated with the choice of sign in (4.3) which, in turn, dictates the ordering of neighbouring domain walls as we shall now demonstrate.

To summarize, we have shown that the description of domain wall dynamics (4.2) arises from the fixed point of a $\hat U(1)$ on the original Nahm equations (3.10). This action descends to a $\hat U(1)$ isometry on the monopole moduli space $\mathcal{M}_g$, the fixed points of which coincide with the domain wall moduli space $\mathcal{W}_g$. A physical explanation for this correspondence follows along the lines of [3]: in theories in the Higgs phase, confined magnetic monopoles with charge $g$ exist, emitting $k$ multiple vortex strings. When these vortex strings coincide, the worldvolume theory is of the form described in Sect. 2 [14] and the monopoles appear as charge $g$ kinks.

4.2. The ordering of domain walls revisited. As explained in Sect. 2, in contrast to monopoles, domain walls must satisfy a specific ordering on the $x_3$ line. We will now show that this ordering is encoded in the boundary conditions described above. Suppose first that $n_i = n_{i+1}$. The positions of the $\alpha_i$ domain walls are given by the eigenvalues of $X_3$ restricted to the interval $m_i \le x_4 \le m_{i+1}$. Let us denote this matrix as $X_3^{(i)}$ and the eigenvalues as $\lambda_m^{(i)}$, where $m = 1, \dots, n_i$. The impurity (4.3) relates the two sets of eigenvalues by the jumping condition

$$X_3^{(i+1)} = X_3^{(i)} + \psi\psi^\dagger, \tag{4.7}$$

where we have chosen the positive sign for definiteness. However, from the discussion in Sect. 2 (see, in particular, Fig. 3) we know that the domain walls cannot have arbitrary position but must be interlaced,

$$\lambda_1^{(i)} \le \lambda_1^{(i+1)} \le \lambda_2^{(i)} \le \cdots \le \lambda_{n_i - 1}^{(i+1)} \le \lambda_{n_i}^{(i)} \le \lambda_{n_i}^{(i+1)}. \tag{4.8}$$
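The interlacing (4.8) implied by the jumping condition (4.7) can also be observed numerically. The sketch below adds a rank-one positive term $\psi\psi^\dagger$ to random hermitian matrices and tests the chain of inequalities; the function names (`jump`, `interlaces`), the tolerance, the matrix sizes and the random data are assumptions made purely for illustration.

```python
import numpy as np


def jump(X, psi):
    """Apply the jumping condition (4.7): X -> X + psi psi^dagger."""
    return X + np.outer(psi, psi.conj())


def interlaces(lam_before, lam_after, tol=1e-10):
    """Check lam_before[0] <= lam_after[0] <= lam_before[1] <= ... <= lam_after[-1]."""
    lo = np.sort(np.asarray(lam_before, dtype=float))
    hi = np.sort(np.asarray(lam_after, dtype=float))
    each_moves_up = np.all(lo <= hi + tol)            # lambda_m^(i) <= lambda_m^(i+1)
    stays_below_next = np.all(hi[:-1] <= lo[1:] + tol)  # lambda_m^(i+1) <= lambda_{m+1}^(i)
    return bool(each_moves_up and stays_below_next)


rng = np.random.default_rng(1)
results = []
for n in (2, 3, 5, 8):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    X = (m + m.conj().T) / 2                      # random hermitian X_3^(i)
    psi = rng.normal(size=n) + 1j * rng.normal(size=n)
    lam_i = np.linalg.eigvalsh(X)
    lam_next = np.linalg.eigvalsh(jump(X, psi))
    results.append(interlaces(lam_i, lam_next))

all_interlaced = all(results)
```

Choosing the negative sign in (4.3) corresponds to calling `interlaces` with its two arguments swapped, which is one way to see that the sign fixes the relative ordering of the walls.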

We will now show that the ordering of domain walls (4.8) follows from the impurity jumping condition (4.7). To see this, consider firstly the situation in which $\psi^\dagger\psi \ll \lambda_m^{(i)}$ so that the matrix $\psi\psi^\dagger$ may be treated as a small perturbation of $X_3^{(i)}$. The positivity of $\psi\psi^\dagger$ ensures that each $\lambda_m^{(i+1)} \ge \lambda_m^{(i)}$. Moreover, it is simple to show that the $\lambda_m^{(i+1)}$ increase monotonically with $\psi^\dagger\psi$. This leaves us to consider the other extreme, in which $\psi^\dagger\psi \to \infty$. In this limit $\psi$ becomes one of the eigenvectors of $X_3^{(i+1)}$ with eigenvalue $\lambda_{n_i}^{(i+1)} = \psi^\dagger\psi \to \infty$, which reflects the fact that this limit corresponds to the situation in which the last domain wall is taken to infinity. What we want to show is that the remaining $(n_i - 1)$ $\alpha_{i+1}$ domain walls are trapped between the $n_i$ $\alpha_i$ domain walls as depicted in Fig. 3. Define the $n_i \times n_i$ projection operator

$$P = 1 - \hat\psi\hat\psi^\dagger, \tag{4.9}$$


where $\hat\psi = \psi/\sqrt{\psi^\dagger\psi}$. The positions of the remaining $(n_i - 1)$ $\alpha_{i+1}$ domain walls are given by the (non-zero) eigenvalues of $P X_3^{(i)} P$. We must show that, given a rank $n$ hermitian matrix $X$, the eigenvalues of $PXP$ are trapped between the eigenvalues of $X$. This elementary property of hermitian matrices can be seen as follows:

$$\det(PXP - \mu) = \det(XP - \mu) = \det(X - \mu - X\hat\psi\hat\psi^\dagger) = \det(X - \mu)\, \det\!\big(1 - (X - \mu)^{-1} X \hat\psi\hat\psi^\dagger\big).$$

Since $\hat\psi\hat\psi^\dagger$ is rank one, we can write this as

$$\det(PXP - \mu) = \det(X - \mu)\,\big[1 - \mathrm{Tr}\big((X - \mu)^{-1} X \hat\psi\hat\psi^\dagger\big)\big] = -\mu\, \det(X - \mu)\, \mathrm{Tr}\big((X - \mu)^{-1} \hat\psi\hat\psi^\dagger\big) = -\mu \left[\prod_{m=1}^{n} (\lambda_m - \mu)\right] \left[\sum_{m=1}^{n} \frac{|\hat\psi_m|^2}{\lambda_m - \mu}\right], \tag{4.10}$$

where $\hat\psi_m$ is the $m$th component of the vector $\hat\psi$ in the eigenbasis of $X$. We learn that $PXP$ has one zero eigenvalue while, if the eigenvalues $\lambda_m$ of $X$ are distinct, then the eigenvalues of $PXP$ lie at the roots of the function

$$R(\mu) = \sum_{m=1}^{n} \frac{|\hat\psi_m|^2}{\lambda_m - \mu}. \tag{4.11}$$
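The trapping of the eigenvalues of $PXP$ between those of $X$, and their identification with the roots of $R(\mu)$ in (4.11), can be illustrated on a small example. The sketch below uses a diagonal $X$ with eigenvalues 1, 2, 3, 4 and $\psi \propto (1, 1, 1, 1)$; this data and the helper names (`projected_eigenvalues`, `R`) are assumptions chosen for illustration only. The zero eigenvalue along $\psi$ is removed by compressing $X$ to the orthogonal complement of $\psi$.

```python
import numpy as np


def projected_eigenvalues(X, psi):
    """Eigenvalues of P X P restricted to the complement of psi, with P = 1 - psi_hat psi_hat^dagger."""
    psi_hat = psi / np.linalg.norm(psi)
    # Complete psi_hat to an orthonormal basis; columns 1.. span its complement.
    Q, _ = np.linalg.qr(psi_hat.reshape(-1, 1), mode="complete")
    basis = Q[:, 1:]
    return np.linalg.eigvalsh(basis.conj().T @ X @ basis)


def R(mu, X, psi):
    """R(mu) = sum_m |psi_hat_m|^2 / (lambda_m - mu), components taken in the eigenbasis of X."""
    psi_hat = psi / np.linalg.norm(psi)
    lam, V = np.linalg.eigh(X)
    weights = np.abs(V.conj().T @ psi_hat) ** 2
    return float(np.sum(weights / (lam - mu)))


X = np.diag([1.0, 2.0, 3.0, 4.0])
psi = np.ones(4)

lam = np.linalg.eigvalsh(X)
mu = projected_eigenvalues(X, psi)

# Each projected eigenvalue should sit strictly between consecutive eigenvalues of X.
trapped = bool(np.all(lam[:-1] < mu) and np.all(mu < lam[1:]))
# Each projected eigenvalue should be a root of R.
max_abs_R = max(abs(R(m, X, psi)) for m in mu)
```

For this symmetric choice of data the middle root sits at $\mu = 2.5$ by symmetry, which gives a simple hand check of the numerical output.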

The roots of $R(\mu)$ indeed lie between the eigenvalues $\lambda_m$: between two consecutive eigenvalues $R$ is continuous and strictly increasing, running from $-\infty$ to $+\infty$, so it has exactly one root in each such interval. This completes the proof that the impurities (4.3) capture the correct ordering of the domain walls. The same argument shows that the boundary condition (4.4) gives rise to the correct ordering of domain walls when $n_{i+1} = n_i + 1$, with the $\alpha_i$ domain walls interlaced between the $\alpha_{i+1}$ domain walls. Indeed, it is not hard to show that (4.4) arises from (4.3) in the limit that one of the domain walls is taken to infinity.

Acknowledgements. AH is supported in part by the CTP and LNS of MIT, DOE contract #DE-FC0294ER40818, NSF grant PHY-00-96515, the BSF American-Israeli Bi-national Science Foundation and a DOE OJI Award. DT is supported by the Royal Society.

References 1. Abraham, E.R.C., Townsend, P.K.: Q kinks. Phys. Lett. B 291, 85 (1992); More on Q kinks: A (1+1)dimensional analog of dyons. Phys. Lett. B 295, 225 (1992) 2. Dorey, N.: The BPS spectra of two-dimensional supersymmetric gauge theories with twisted mass terms. JHEP 9811, 005 (1998); Dorey, N., Hollowood, T.J., Tong, D.: The BPS spectra of gauge theories in two and four dimensions. JHEP 9905, 006 (1999) 3. Tong, D.: Monopoles in the Higgs phase. Phys. Rev. D 69, 065003 (2004) 4. Hindmarsh, M., Kibble, T.W.B.: Beads On Strings. Phys. Rev. Lett. 55, 2398 (1985) 5. Shifman, M., Yung, A.: Non-Abelian string junctions as confined monopoles. Phys. Rev. D 70, 045004 (2004) 6. Hanany, A., Tong, D.: Vortex strings and four-dimensional gauge dynamics. JHEP 0404, 066 (2004) 7. Auzzi, R., Bolognesi, S., Evslin, J.: Monopoles can be confined by 0, 1 or 2 vortices. JHEP 0502, 046 (2005) 8. Kneipp, M.A.C.: Color superconductivity, Z(N) flux tubes and monopole confinement in deformed N = 2* super Yang-Mills theories. Phys. Rev. D 69, 045007 (2004) 9. Auzzi, R., Bolognesi, S., Evslin, J., Konishi, K.: Nonabelian monopoles and the vortices that confine them. Nucl. Phys. B 686, 119 (2004)


10. Markov, V., Marshakov, A., Yung, A.: Non-Abelian vortices in N = 1* gauge theory. Nucl. Phys. B 709, 267 (2005) 11. Gorsky, A., Shifman, M., Yung, A.: Non-Abelian Meissner effect in Yang-Mills theories at weak coupling. Phys. Rev. D 71, 045010 (2005) 12. Mironov, A., Morozov, A., Tomaras, T.N.: On the need for phenomenological theory of P-vortices or does spaghetti confinement pattern admit condensed-matter analogies? J. Exp. Theor. Phys. 101, 331–340 (2005) 13. Bolognesi, S., Evslin, J.: Stable vs unstable vortices in SQCD. JHEP 0603, 023 (2006) 14. Hanany, A., Tong, D.: Vortices, Instantons and Branes. JHEP 0307, 037 (2003) 15. Eto, M., Isozumi, Y., Nitta, M., Ohashi, K., Sakai, N.: Webs of walls. Phys. Rev. D 72, 085004 (2005) 16. Lee, K., Yee, H.U.: New BPS Objects in N = 2 Supersymmetric Gauge Theories. Phys. Rev. D 72, 065623 (2005); Eto, M., Isozumi, Y., Nitta, M., Ohashi, K.: 1/2, 1/4 and 1/8 BPS Equations in SUSY Yang-Mills-Higgs Systems: Field Theoretical Brane Configurations. http://arxiv.org/list/hep-th/0506257, 2005 17. Tong, D.: The moduli space of BPS domain walls. Phys. Rev. D 66, 025013 (2002) 18. Isozumi, Y., Nitta, M., Ohashi, K., Sakai, N.: Construction of non-Abelian walls and their complete moduli space. Phys. Rev. Lett. 93, 161601 (2004); All exact solutions of a 1/4 BPS equation. Phys. Rev. D 71, 065018 (2005) 19. Isozumi, Y., Nitta, M., Ohashi, K., Sakai, N.: Non-Abelian walls in supersymmetric gauge theories. Phys. Rev. D 70, 125014 (2004) 20. Isozumi, Y., Ohashi, K., Sakai, N.: Exact wall solutions in 5-dimensional SUSY QED at finite coupling. JHEP 0311, 060 (2003); Sakai, N., Yang, Y.: Moduli space of BPS walls in supersymmetric gauge theories. http://arxiv.org/list/hep-th/0505136, 2005 21. Sakai, N., Tong, D.: Monopoles, Vortices, Domain Walls and D-Branes: The Rules of Interaction. JHEP 0503, 019 (2005) 22. Lee, K.S.M.: An index theorem for domain walls in supersymmetric gauge theories. Phys. Rev. D 67, 045009 (2003) 23.
Tong, D.: Mirror mirror on the wall: On two-dimensional black holes and Liouville theory. JHEP 0304, 031 (2003) 24. Goddard, P., Nuyts, J., Olive, D.I.: Gauge Theories And Magnetic Charge. Nucl. Phys. B 125, 1 (1977) 25. Weinberg, E.J.: Fundamental Monopoles And Multi - Monopole Solutions For Arbitrary Simple Gauge Groups. Nucl. Phys. B 167, 500 (1980) 26. Manton, N.S.: A Remark On The Scattering Of BPS Monopoles. Phys. Lett. B 110, 54 (1982) 27. Atiyah, M., Hitchin, N.: The Geometry and Dynamics of Magnetic Monopoles. Princeton, NJ: Princeton University Press, 1988 28. Connell, S.A.: The Dynamics of the SU (3) Charge (1, 1) Magnetic Monopole. University of South Australia Preprint, 1995 29. Gauntlett, J.P., Lowe, D.A.: Dyons and S-Duality in N=4 Supersymmetric Gauge Theory. Nucl. Phys. B 472, 194 (1996) 30. Lee, K., Weinberg, E., Yi, P.: Electromagnetic Duality and SU (3) Monopoles. Phys. Lett. B 376, 97 (1996) 31. Lee, K., Weinberg, E., Yi, P.: The Moduli Space of Many BPS Monopoles for Arbitrary Gauge Groups. Phys. Rev. D 54, 1633 (1996) 32. Houghton, C., Irwin, P.W., Mountain, A.J.: Two monopoles of one type and one of another. JHEP 9904, 029 (1999) 33. Nahm, W.: A Simple Formalism For The BPS Monopole. Phys. Lett. B 90, 413 (1980) 34. Diaconescu, D.E.: D-branes, monopoles and Nahm equations. Nucl. Phys. B 503, 220 (1997) 35. Hurtubise, J., Murray, M.K.: On The Construction Of Monopoles For The Classical Groups. Commun. Math. Phys. 122, 35 (1989) 36. Hanany, A., Witten, E.: Type IIB superstrings, BPS monopoles, and three-dimensional gauge dynamics. Nucl. Phys. B 492, 152 (1997) 37. Kapustin, A., Sethi, S.: The Higgs branch of impurity theories. Adv. Theor. Math. Phys. 2, 571 (1998) 38. Tsimpis, D.: Nahm equations and boundary conditions. Phys. Lett. B 433, 287 (1998) 39. Chen, X.G., Weinberg, E.J.: ADHMN boundary conditions from removing monopoles. Phys. Rev. D 67, 065020 (2003) 40. Murray, M.K.: A note on the (1, 1,…, 1) monopole metric. J. Geom. Phys. 
23, 31 (1997) 41. Bachas, C., Hoppe, J., Pioline, B.: Nahm equations, N = 1 domain walls, and D-strings in Ad S5 × S 5 . JHEP 0107, 041 (2001) 42. Lambert, N.D., Tong, D.: Kinky D-strings. Nucl. Phys. B 569, 606 (2000) 43. Eto, M., Isozumi, Y., Nitta, M., Ohashi, K., Ohta, K., Sakai, N.: D-brane construction for non-Abelian walls. Phys. Rev. D 71, 125006 (2005) 44. Witten, E.: Solutions of four-dimensional field theories via M-theory. Nucl. Phys. B 500, 3 (1997) 45. Hanany, A., Hori, K.: Branes and N = 2 theories in two dimensions. Nucl. Phys. B 513, 119 (1998) Communicated by N.A. Nekrasov

Commun. Math. Phys. 266, 665–697 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0053-x


A Microscopic Derivation of the Quantum Mechanical Formal Scattering Cross Section

D. Dürr1, S. Goldstein2, T. Moser1, N. Zanghì3

1 Mathematisches Institut der Universität München, Theresienstr. 39, 80333 München, Germany. E-mail: [email protected]; [email protected]

2 Department of Mathematics, Rutgers University, New Brunswick, NJ 08903, USA. E-mail: [email protected]

3 Dipartimento di Fisica, Università di Genova, Sezione INFN Genova, Via Dodecaneso 33, 16146 Genova, Italy. E-mail: [email protected]

Received: 1 September 2005 / Accepted: 14 March 2006 Published online: 11 July 2006 – © Springer-Verlag 2006

Abstract: We prove that the empirical distribution of crossings of a “detector” surface by scattered particles converges in appropriate limits to the scattering cross section computed by stationary scattering theory. Our result, which is based on Bohmian mechanics and the flux-across-surfaces theorem, is the first derivation of the cross section starting from first microscopic principles.

1. Introduction

The central quantity in a scattering experiment is the empirical cross section, which reflects the number of particles that are scattered in a given solid angle per unit time. In this paper we shall derive the theoretical prediction for the cross section starting from a microscopic model describing a realistic scattering situation. We confine ourselves to the case of potential scattering of a nonrelativistic (spinless) quantum particle and leave the many-particle case for future research. This paper is in fact a technical elaboration and continuation of our article “Scattering theory from microscopic first principles” [9].

The common approaches to the foundations of scattering theory take for granted that “an experimentalist generally prepares a state … at t → −∞, and then measures what this state looks like at t → +∞” (cf. [25], p. 113), meaning that the asymptotic expressions are “all there is,” as if they are not the asymptotic expressions of some other formula, however complicated, describing the scattering situation as it really is, namely happening at finite distances and at finite times. Thus a truly microscopic derivation starting from first principles must provide firstly a formula for the empirical cross section, which by the law of large numbers approximates its expectation value, and which is computed from the underlying theory. Secondly, that formula should apply to the realistic finite-times and finite-distances situation, from which eventually the usual Born formula should emerge by taking appropriate limits.$^1$

1 For a detailed discussion of the scattering regime see [8].


We shall present a Bohmian analysis of the scattering cross section. With a particle trajectory we can ask, for example, whether or not that trajectory eventually crosses a distant spherical surface and, if it does, when and where it first crosses that surface. Similarly, for a beam of particles we can ask for the number of particles in the beam that first cross the surface in a given solid angle $\Sigma$. From a Bohmian perspective it appears reasonable to identify this number with detection events in a scattering experiment. We thus model in this paper the measured cross section using the number $N(\Sigma)$ of first crossings of $\Sigma$. This will of course depend on many parameters encoding the experimental setup, e.g. the distances $R$ and $L$ of the detector and the particle source from the scattering center, the details of the beam including its profile $A$ and the wave functions of the particles in the beam, as well as on the length of the time interval $\tau$ during which the particles are emitted. We shall show in this paper that when these parameters are suitably scaled, $N(\Sigma)/\tau$ is well approximated by the usual Born formula for the scattering cross section in terms of the $T$-matrix, i.e.,

$$\lim \frac{N(\Sigma)}{\tau} = 16\pi^4 \int_\Sigma |T(k_0\boldsymbol{\omega}, \mathbf{k}_0)|^2\, d\Omega, \tag{1}$$

where $\mathbf{k}_0$ is the initial momentum of the particles.

The paper is organized as follows: We collect first some mathematical notions and facts as well as recent results of scattering theory. In Sect. 3 we define the relevant random variables associated with the surface-crossings of a single particle and relate their distribution to the quantum probability current density. In Sect. 4 we model the beam by a suitable point process and in Sect. 5 we define $N(\Sigma)$ in terms of this point process. A precise description of the limit procedure will be presented in Sect. 6. Our main results, Theorems 1 and 2, are stated in Sect. 7 and are proven in Sect. 8.

2. The Mathematical Framework of Potential Scattering

We list those results of scattering theory (e.g. [2, 7, 11, 14, 16, 18–20, 22]) which are essential for the proof of Theorem 1 and Theorem 2 in Sect. 8. We use the usual description of a nonrelativistic spinless one-particle system by the Hamiltonian $H$ (we use natural units $\hbar = m = 1$),

$$H := -\frac{1}{2}\Delta + V(\mathbf{x}) =: H_0 + V(\mathbf{x}),$$

with the real-valued potential $V \in (V)_n$, defined as follows:

Definition 1. $V$ is in $(V)_n$, $n = 2, 3, 4, \dots$, if
(i) $V \in L^2(\mathbb{R}^3)$,
(ii) $V$ is locally Hölder continuous except, perhaps, at a finite number of singularities,
(iii) there exist positive numbers $\delta$, $C$, $R_0$ such that $|V(\mathbf{x})| \le C\langle x\rangle^{-n-\delta}$ for $x \ge R_0$,

where $\langle\cdot\rangle := (1 + (\cdot)^2)^{1/2}$.
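To make condition (iii) of Definition 1 concrete, the following sketch evaluates the weight $\langle x\rangle$ and tests the decay bound on a finite set of sample radii for a radially symmetric potential. The Yukawa-type potential $e^{-r}/r$, the constants, the sampling grid and the function names (`bracket`, `satisfies_decay_bound`) are assumptions chosen for illustration; a check on finitely many radii is of course only a plausibility test, not a proof of membership in $(V)_n$.

```python
import numpy as np


def bracket(r):
    """The weight <r> = (1 + r^2)^(1/2) from Definition 1."""
    return np.sqrt(1.0 + np.asarray(r, dtype=float) ** 2)


def satisfies_decay_bound(V_radial, n, delta, C, R0, radii):
    """Test |V(r)| <= C <r>^(-n - delta) on the sample radii with r >= R0."""
    r = np.asarray(radii, dtype=float)
    r = r[r >= R0]
    return bool(np.all(np.abs(V_radial(r)) <= C * bracket(r) ** (-(n + delta))))


def yukawa(r):
    """Illustrative radial potential exp(-r)/r (singular only at the origin)."""
    return np.exp(-r) / r


radii = np.linspace(1.0, 50.0, 2000)
ok_with_large_C = satisfies_decay_bound(yukawa, n=4, delta=0.5, C=4.0, R0=1.0, radii=radii)
ok_with_small_C = satisfies_decay_bound(yukawa, n=4, delta=0.5, C=2.0, R0=1.0, radii=radii)
```

Because the exponential eventually beats any power, a suitable constant $C$ exists for every $n$; the second call shows that the bound can fail when $C$ is chosen too small for the given $R_0$.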


Under these conditions (see e.g. [16]) $H$ is self-adjoint on the domain $D(H) = D(H_0) = \{ f \in L^2(\mathbb{R}^3) : \int |k^2 \hat f(\mathbf{k})|^2\, d^3k < \infty \}$ ($k = |\mathbf{k}|$), where $\hat f := \mathcal{F} f$ is the Fourier transform

$$\hat f(\mathbf{k}) := (2\pi)^{-3/2} \int e^{-i\mathbf{k}\cdot\mathbf{x}} f(\mathbf{x})\, d^3x. \tag{2}$$

Let $U(t) = e^{-iHt}$. Since $H$ is self-adjoint on the domain $D(H)$, $U(t)$ is a strongly continuous one-parameter unitary group on $L^2(\mathbb{R}^3)$. Let $\phi \in D(H)$. Then $\phi_t \equiv U(t)\phi \in D(H)$ and satisfies the Schrödinger equation

$$i\frac{\partial}{\partial t}\phi_t(\mathbf{x}) = H\phi_t.$$

In a typical scattering experiment the scattered particles move almost freely far away from the scattering center. “Far away” in position space can also be phrased as “long before” and “long after” the scattering event takes place. So for the “scattering states” $\psi$ there are asymptotes $\psi_{\mathrm{in}}$, $\psi_{\mathrm{out}}$ defined by

$$\lim_{t\to-\infty} \big\| e^{-iH_0 t}\psi_{\mathrm{in}}(\mathbf{x}) - e^{-iHt}\psi(\mathbf{x}) \big\| = 0, \qquad \lim_{t\to\infty} \big\| e^{-iH_0 t}\psi_{\mathrm{out}}(\mathbf{x}) - e^{-iHt}\psi(\mathbf{x}) \big\| = 0. \tag{3}$$

From this it is natural to define the wave operators $\Omega_\pm : L^2(\mathbb{R}^3) \to \mathrm{Ran}(\Omega_\pm)$ by the strong limits

$$\Omega_\pm := \operatorname*{s-lim}_{t\to\pm\infty} e^{iHt} e^{-iH_0 t}. \tag{4}$$

These wave operators map the incoming and outgoing asymptotes to their corresponding scattering states. Ikebe [14] proved that for a potential $V \in (V)_n$ the wave operators exist and have the range $\mathrm{Ran}(\Omega_\pm) = \mathcal{H}_{\mathrm{cont}}(H) = \mathcal{H}_{\mathrm{a.c.}}(H)$. (This property is called asymptotic completeness.) Hence, the scattering states consist of states with absolutely continuous spectrum and the singular continuous spectrum of $H$ is empty. In addition Ikebe [14] showed that the Hamiltonian has no positive eigenvalues. Then we have for every $\psi \in \mathcal{H}_{\mathrm{a.c.}}(H)$ asymptotes $\psi_{\mathrm{in}}, \psi_{\mathrm{out}} \in L^2(\mathbb{R}^3)$ with

$$\Omega_-\psi_{\mathrm{in}} = \psi = \Omega_+\psi_{\mathrm{out}}. \tag{5}$$

On $D(H_0)$ the wave operators satisfy the so-called intertwining property $H\Omega_\pm = \Omega_\pm H_0$, while on $\mathcal{H}_{\mathrm{a.c.}}(H) \cap D(H)$ we have that

$$H_0\,\Omega_\pm^{-1} = \Omega_\pm^{-1} H.$$

The scattering operator $S : L^2(\mathbb{R}^3) \to L^2(\mathbb{R}^3)$ is given by

$$S := \Omega_+^{-1}\,\Omega_-, \tag{6}$$


while using the identity I , the T -operator is given by T := S − I.

(7)

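If the system is asymptotically complete, the ranges of the wave operators are equal and thus $S$ is unitary. Since the wave operator maps a scattering state onto its asymptotic state, the scattering operator maps the incoming asymptote $\psi_{\mathrm{in}}$ onto the corresponding out state $\psi_{\mathrm{out}}$. The formula for the $T$-matrix, which holds in the $L^2$-sense, is given by (see e.g. Theorem XI.42 in [19])

$$\widehat{Tg}(\mathbf{k}) = -2\pi i \int_{k'=k} T(\mathbf{k}, \mathbf{k}')\, \hat g(\mathbf{k}')\, k'\, d\Omega', \tag{8}$$

for $g \in \mathcal{S}(\mathbb{R}^3)$ (Schwartz space) such that $\hat g$ has support in a spherical shell.$^2$ $T(\mathbf{k}, \mathbf{k}')$ is given by (see e.g. [19]):

$$T(\mathbf{k}, \mathbf{k}') = (2\pi)^{-3} \int e^{-i\mathbf{k}\cdot\mathbf{x}}\, V(\mathbf{x})\, \varphi_-(\mathbf{x}, \mathbf{k}')\, d^3x, \tag{9}$$

where $\varphi_-$ (as well as $\varphi_+$) are eigenfunctions of $H$ defined by Proposition 1 below. Since the eigenfunctions $\varphi_\pm$ are bounded and continuous (cf. Proposition 2), we can conclude that $T(\mathbf{k}, \mathbf{k}')$ is bounded and continuous on $\mathbb{R}^3 \times \mathbb{R}^3$, if the potential is in $(V)_3$. Then the formula (8) can be proved for $g \in \mathcal{S}(\mathbb{R}^3)$ without any restriction on the momentum support by the same method as in [19].

2 In [19] Equation (8) was proven outside an “exceptional set”. For our class of potentials the “exceptional set” is empty. The additional factor $\frac{1}{2}$ in [19] comes from the different definition of $H_0$.

We will need the time evolution of a state $\psi \in \mathcal{H}_{\mathrm{a.c.}}(H)$ with the Hamiltonian $H$. Its diagonalization on $\mathcal{H}_{\mathrm{a.c.}}(H)$ is given by the eigenfunctions $\varphi_\pm$:

$$\Big(-\frac{1}{2}\Delta + V(\mathbf{x})\Big)\varphi_\pm(\mathbf{x}, \mathbf{k}) = \frac{k^2}{2}\,\varphi_\pm(\mathbf{x}, \mathbf{k}). \tag{10}$$

Inverting $(-\frac{1}{2}\Delta - \frac{k^2}{2})$ one obtains the Lippmann-Schwinger equation. We recall the main parts of a result on this due to Ikebe in [14] which is collected in the present form in [22].

Proposition 1. Let $V \in (V)_2$. Then for any $\mathbf{k} \in \mathbb{R}^3 \setminus \{0\}$ there are unique solutions $\varphi_\pm(\cdot, \mathbf{k}) : \mathbb{R}^3 \to \mathbb{C}$ of the Lippmann-Schwinger equations

$$\varphi_\pm(\mathbf{x}, \mathbf{k}) = e^{i\mathbf{k}\cdot\mathbf{x}} - \frac{1}{2\pi} \int \frac{e^{\mp ik|\mathbf{x} - \mathbf{x}'|}}{|\mathbf{x} - \mathbf{x}'|}\, V(\mathbf{x}')\, \varphi_\pm(\mathbf{x}', \mathbf{k})\, d^3x', \tag{11}$$

which satisfy the boundary conditions $\lim_{|\mathbf{x}|\to\infty} (\varphi_\pm(\mathbf{x}, \mathbf{k}) - e^{i\mathbf{k}\cdot\mathbf{x}}) = 0$, which are also classical solutions of the stationary Schrödinger equation (10), and are such that:

(i) For any $f \in L^2(\mathbb{R}^3)$ the generalized Fourier transforms$^3$

$$(\mathcal{F}_\pm f)(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}}\ \mathrm{l.i.m.}\!\int \varphi_\pm^*(\mathbf{x}, \mathbf{k})\, f(\mathbf{x})\, d^3x$$

exist in $L^2(\mathbb{R}^3)$.

(ii) $\mathrm{Ran}(\mathcal{F}_\pm) = L^2(\mathbb{R}^3)$. Moreover $\mathcal{F}_\pm : \mathcal{H}_{\mathrm{a.c.}}(H) \to L^2(\mathbb{R}^3)$ are unitary and the inverses of these unitaries are given by

$$(\mathcal{F}_\pm^{-1} f)(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\ \mathrm{l.i.m.}\!\int \varphi_\pm(\mathbf{x}, \mathbf{k})\, f(\mathbf{k})\, d^3k.$$

(iii) For any $f \in L^2(\mathbb{R}^3)$ the relations $\Omega_\pm f = \mathcal{F}_\pm^{-1}\mathcal{F} f$ hold, where $\mathcal{F}$ is the ordinary Fourier transform given by (2).

(iv) For any $f \in D(H) \cap \mathcal{H}_{\mathrm{a.c.}}(H)$ we have:

$$Hf(\mathbf{x}) = \mathcal{F}_\pm^{-1}\,\frac{k^2}{2}\,\mathcal{F}_\pm f(\mathbf{x}),$$

and therefore for any $f \in \mathcal{H}_{\mathrm{a.c.}}(H)$,

$$e^{-iHt} f(\mathbf{x}) = \mathcal{F}_\pm^{-1}\, e^{-i\frac{k^2}{2}t}\, \mathcal{F}_\pm f(\mathbf{x}).$$

3 $\mathrm{l.i.m.}\!\int$ is a shorthand notation for $\operatorname{s-lim}_{R\to\infty} \int_{B_R}$, where s-lim denotes the limit in the $L^2$-norm and $B_R$ a ball with radius $R$ around the origin.

In order to apply stationary phase methods we will need estimates on the derivatives of the generalized eigenfunctions:

Proposition 2. Let $V \in (V)_n$ for some $n \ge 3$. Then:

(i) $\varphi_\pm(\mathbf{x}, \cdot) \in C^{n-2}(\mathbb{R}^3 \setminus \{0\})$ for all $\mathbf{x} \in \mathbb{R}^3$ and the partial derivatives$^4$ $\partial_{\mathbf{k}}^\alpha \varphi_\pm(\mathbf{x}, \mathbf{k})$, $|\alpha| \le n - 2$, are continuous with respect to $\mathbf{x}$ and $\mathbf{k}$.

If, in addition, zero is neither an eigenvalue nor a resonance of $H$, then

(ii) $\sup_{\mathbf{x}\in\mathbb{R}^3,\ \mathbf{k}\in\mathbb{R}^3} |\varphi_\pm(\mathbf{x}, \mathbf{k})| < \infty$,

for any $\alpha$ with $|\alpha| \le n - 2$ there is a $c_\alpha < \infty$ such that

(iii) $\sup_{\mathbf{k}\in\mathbb{R}^3\setminus\{0\}} |\kappa^{|\alpha|-1}\, \partial_{\mathbf{k}}^\alpha \varphi_\pm(\mathbf{x}, \mathbf{k})| < c_\alpha \langle x\rangle^{|\alpha|}$, with $\kappa := \frac{k}{\langle k\rangle}$,

and for any $l \in \{1, \dots, n - 2\}$ there is a $c_l < \infty$ such that

(iv) $\sup_{\mathbf{k}\in\mathbb{R}^3\setminus\{0\}} \big|\frac{\partial^l}{\partial k^l} \varphi_\pm(\mathbf{x}, \mathbf{k})\big| < c_l \langle x\rangle^l$, where $\frac{\partial}{\partial k}$ is the radial partial derivative in $\mathbf{k}$-space.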

Remark 1. This proposition, except the assertion (iii), was proved in [22], Theorem 3.1. Assertion (iii) repairs a false statement in Theorem 3.1 which did not include the necessary $\kappa^{|\alpha|-1}$ factor, which we have in (iii). For $|\alpha| = 1$, which was the important case in that paper, there is however no difference. We have commented on the proof of this corrected version in [11].

Remark 2. Zero is a resonance of $H$ if there exists a solution $f$ of $Hf = 0$ such that $\langle x\rangle^{-\gamma} f \in L^2(\mathbb{R}^3)$ for any $\gamma > \frac{1}{2}$ but not for $\gamma = 0$.$^5$ The appearance of a zero eigenvalue or resonance can be regarded as an exceptional event: For a Hamiltonian $H = H_0 + cV$, $c \in \mathbb{R}$, this can only happen for $c$ in a discrete subset of $\mathbb{R}$, see [1], p. 20 and [15], p. 589.

4 We use the usual multi-index notation: $\alpha = (\alpha_1, \alpha_2, \alpha_3)$, $\alpha_i \in \mathbb{N}_0$, $\partial_{\mathbf{k}}^\alpha f(\mathbf{k}) := \partial_{k_1}^{\alpha_1}\partial_{k_2}^{\alpha_2}\partial_{k_3}^{\alpha_3} f(\mathbf{k})$ and $|\alpha| := \alpha_1 + \alpha_2 + \alpha_3$.

5 There are various definitions, see e.g. [26], p. 552, [1], p. 20 and [15], p. 584.

As a simple consequence of Proposition 2 we obtain


D. Dürr, S. Goldstein, T. Moser, N. Zanghì

Corollary 1. Let $V \in (V)_3$ and let zero be neither an eigenvalue nor a resonance of $H$. Then the $T$-matrix defined by (9) is a bounded and continuous function on $\mathbb{R}^3 \times \mathbb{R}^3$. Moreover, if $V \in (V)_n$ for some $n \ge 3$, we have for all multi-indices $\alpha$ with $|\alpha| \le n-3$ a constant $c_\alpha > 0$ such that
$$\sup_{k' \in \mathbb{R}^3,\, k \in \mathbb{R}^3 \setminus \{0\}} \kappa^{|\alpha|-1} |\partial_k^\alpha T(k', k)| \le c_\alpha. \tag{12}$$

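With the regularity of the generalized eigenfunctions one can prove the flux-across-surfaces theorem. The quantum probability current density (= quantum flux density) is given by
$$j^{\psi_t}(x) := -\frac{i}{2} \left( \psi_t^*(x) \nabla \psi_t(x) - \psi_t(x) \nabla \psi_t^*(x) \right). \tag{13}$$
For $\psi_t(x)$ a solution of the Schrödinger equation we have the identity
$$\frac{\partial |\psi_t(x)|^2}{\partial t} + \operatorname{div} j^{\psi_t}(x) = 0,$$
which has the form of a continuity equation. The flux-across-surfaces theorem can be naturally proven for the following class of wave functions (in the following definition we have the Fourier transform of $\psi_{out}$, $\widehat{\psi}_{out}(k) = (2\pi)^{-3/2} \int \varphi_+^*(x, k) \psi(x)\, d^3x$ (cf. Proposition 1), in mind):

Definition 2. A function $f : \mathbb{R}^3 \setminus \{0\} \to \mathbb{C}$ is in $\mathcal{G}^+$ if there is a constant $C \in \mathbb{R}_+$ with:
$$|f(k)| \le C \langle k \rangle^{-15},$$
$$|\partial_k^\alpha f(k)| \le C \langle k \rangle^{-6}, \quad |\alpha| = 1,$$
$$|\kappa\, \partial_k^\alpha f(k)| \le C \langle k \rangle^{-5}, \quad |\alpha| = 2, \quad \kappa = \frac{k}{\langle k \rangle},$$
$$\left| \frac{\partial^2}{\partial k^2} f(k) \right| \le C \langle k \rangle^{-3}.$$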

With this definition we have

Proposition 3. (Flux-across-surfaces theorem (FAST)). Suppose $V \in (V)_4$ and that zero is neither a resonance nor an eigenvalue of $H$. Suppose $\widehat{\psi}_{out}(k) \in \mathcal{G}^+$ and let $\psi = \Omega_+ \psi_{out}$. Then $\psi_t(x) = e^{-iHt} \psi(x)$ is continuously differentiable except at the singularities of $V$, for any measurable set $\Sigma \subseteq S^2$ and any $T \in \mathbb{R}$, $j^{\psi_t}(x) \cdot d\sigma\, dt$ is absolutely integrable on $R\Sigma \times [T, \infty)$ for $R$ sufficiently large and
$$\lim_{R \to \infty} \int_T^\infty \int_{R\Sigma} j^{\psi_t}(x) \cdot d\sigma\, dt = \lim_{R \to \infty} \int_T^\infty \int_{R\Sigma} \left| j^{\psi_t}(x) \cdot d\sigma \right| dt = \int_{C_\Sigma} |\widehat{\psi}_{out}(k)|^2\, d^3k, \tag{14}$$
where $R\Sigma := \{x \in \mathbb{R}^3 : x = R\omega,\ \omega \in \Sigma\}$, $C_\Sigma := \{k \in \mathbb{R}^3 : \frac{k}{|k|} \in \Sigma\}$ is the cone given by $\Sigma$ and $d\sigma$ is the outward-directed surface element on $RS^2$.

The proof can be found in [11]. The FAST plays a crucial role in the proof of our main results, Theorem 1 and Theorem 2. Its importance for scattering theory was first pointed out in [6].

A Microscopic Derivation of the Quantum Mechanical Formal Scattering Cross Section


3. The Quantum Flux, Crossing Statistics and Bohmian Mechanics

In Bohmian mechanics, see [5], the particle has a position $Q_t$ that evolves via the equations
$$\frac{d}{dt} Q_t = v^{\psi_t}(Q_t) = \operatorname{Im} \frac{\nabla \psi_t}{\psi_t}(Q_t), \qquad i \frac{\partial}{\partial t} \psi_t(x) = H \psi_t(x). \tag{15}$$

According to the quantum equilibrium hypothesis ([10], Born's law), the positions of particles in an ensemble of particles each having wave function $\psi$ are always $|\psi|^2$-distributed. Note that if $Q_0$ is $|\psi_0|^2$-distributed then $Q_t$ is $|\psi_t|^2$-distributed. Under two assumptions we have the $|\psi_0|^2$-almost-sure existence and uniqueness of the Bohmian dynamics:

A 1. The initial wave function $\psi_0$ is normalized, $\|\psi_0\| = 1$, and $\psi_0 \in C^\infty(H) = \bigcap_{n=1}^\infty D(H^n)$.

A 2. The potential $V$ is in $(V)_2$ and $C^\infty$ except, perhaps, at a finite number of singularities.

(See Berndl et al. [4], Theorem 3.1 and Corollary 3.2 for the proof, as well as Theorem 3 and Corollary 4 in [23]. The conditions in [4, 23] are much more general. In our context, however, we have to restrict to the case where $V \in (V)_2$.) Hence, depending on the initial position $q_0 \in \Omega_0$, where $\Omega_0$ is the set of "good" points, the particle has the trajectory $Q_t^\psi(q_0)$. On the set of "good" points, $\psi_0(x)$ is different from zero and is differentiable. The complement $\mathbb{R}^3 \setminus \Omega_0$ of $\Omega_0$ has measure 0 (with respect to $|\psi_0|^2$).

Given a trajectory $Q_t^\psi(q_0)$, $q_0 \in \Omega_0$, we can define the number of crossings in a natural way. For the surface $R\Sigma \subset RS^2$ with unit normal vector $n(x) = \frac{x}{|x|}$, $x \in R\Sigma$, we define $N_+^\psi(R\Sigma)$ on $\Omega_0$ by:
$$N_+^\psi(R\Sigma)(q_0) := \left| \left\{ t \ge 0 \,\middle|\, Q_t^\psi(q_0) \in R\Sigma \text{ and } \dot{Q}_t^\psi(q_0) \cdot n\!\left(Q_t^\psi(q_0)\right) > 0 \right\} \right|, \tag{16}$$
the number of crossings of the trajectory $Q_t^\psi(q_0)$ through $R\Sigma$ in the direction of the orientation in the time interval $[0, \infty)$ ("problematical crossings" where the velocity is "orthogonal" to the orientation of $R\Sigma$ have measure zero and need not concern us, see [3], pp. 28–34). If $N_+^\psi(R\Sigma)(q_0) \ge 1$, we can define $t_{exit}^{R\Sigma}$ as the time when the particle crosses the surface $R\Sigma$ in the positive direction for the first time:
$$t_{exit}^{R\Sigma}(q_0) := \min \left\{ t \ge 0 \,\middle|\, Q_t^\psi(q_0) \in R\Sigma \text{ and } \dot{Q}_t^\psi(q_0) \cdot n\!\left(Q_t^\psi(q_0)\right) > 0 \right\}. \tag{17}$$
In the case that the particle does not cross the surface in the positive direction, we set
$$t_{exit}^{R\Sigma}(q_0) := \infty, \quad \text{if } N_+^\psi(R\Sigma)(q_0) = 0. \tag{18}$$

Analogously to (16) we have $N_-^\psi(R\Sigma)$, the number of crossings in the opposite direction. For convenience we define $N_+^\psi(R\Sigma)$ and $N_-^\psi(R\Sigma)$ on the whole of $\mathbb{R}^3$ by setting $N_+^\psi(R\Sigma) = N_-^\psi(R\Sigma) = 0$ for all $q_0 \in \mathbb{R}^3 \setminus \Omega_0$. Then we can define the number of signed crossings on $\mathbb{R}^3$ by
$$N_{sig}^\psi(R\Sigma) := N_+^\psi(R\Sigma) - N_-^\psi(R\Sigma). \tag{19}$$
The total number of crossings defined on $\mathbb{R}^3$ is then
$$N_{tot}^\psi(R\Sigma) := N_+^\psi(R\Sigma) + N_-^\psi(R\Sigma). \tag{20}$$
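The bookkeeping in (16), (19) and (20) can be illustrated with a small sketch. It is not taken from the paper: it assumes a hypothetical trajectory given only through its sampled radii $|Q_t|$, takes the surface to be the full sphere of radius $R$, and counts a crossing whenever two consecutive samples lie on opposite sides of the sphere. The function name `count_crossings` and the sample data are illustrative assumptions.

```python
def count_crossings(radii, R):
    """Count outward (+) and inward (-) crossings of the sphere |x| = R
    for a trajectory known only through its sampled radii |Q_t|."""
    n_plus = n_minus = 0
    for r_prev, r_next in zip(radii, radii[1:]):
        if r_prev < R <= r_next:
            n_plus += 1      # crossing in the direction of the outward normal
        elif r_next < R <= r_prev:
            n_minus += 1     # crossing against the outward normal
    return n_plus, n_minus

# Hypothetical radial motion: the particle leaves, briefly turns back, then escapes.
radii = [0.5, 0.8, 1.2, 0.9, 1.1, 1.6, 2.5]
n_plus, n_minus = count_crossings(radii, R=1.0)
n_sig = n_plus - n_minus     # signed crossings, as in (19)
n_tot = n_plus + n_minus     # total crossings, as in (20)
print(n_plus, n_minus, n_sig, n_tot)
```

In this example the number of inward crossings equals half the difference between the total and the signed number of crossings, which follows directly from (19) and (20).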

These quantities are random variables on the space $\mathbb{R}^3$ of initial conditions, see [3], Lemma 4.2. The expectation values of $N_{sig}^\psi(R\Sigma)$ and $N_{tot}^\psi(R\Sigma)$ are given by flux integrals and are finite, see Proposition 4 below. This means that $N_{sig}^\psi(R\Sigma)$ and $N_{tot}^\psi(R\Sigma)$ are almost surely finite.

Before we give a precise statement we argue heuristically for the connection between the quantum flux and the expectation values. For a particle to cross an infinitesimal surface $d\sigma := n\, d\sigma$ in a time interval $[t, t + dt)$, it must be at time $t$ in the appropriate cylinder of size $|v^{\psi_t}(x) \cdot d\sigma\, dt|$. The probability is therefore $|\psi_t(x)|^2 |v^{\psi_t}(x) \cdot d\sigma\, dt| = |j^{\psi_t}(x) \cdot d\sigma|\, dt$. Because the intervals are infinitesimal, we have $N_{sig}^\psi(dt, d\sigma) \in \{-1, 0, 1\}$,⁶ where the sign will be the same as that of $j \cdot d\sigma$. Therefore $E(N_{sig}^\psi(dt, d\sigma)) = j^{\psi_t}(x) \cdot d\sigma\, dt$ and integration over $R\Sigma$ and $[0, \infty)$ yields (21). The precise statement is:

Proposition 4. Let A1 and A2 be satisfied. In addition suppose that the conditions of Proposition 3 are satisfied. Then for sufficiently large $R$ the expectation values of $N_{sig}^\psi(R\Sigma)$ and $N_{tot}^\psi(R\Sigma)$ are finite and
$$E(N_{sig}^\psi(R\Sigma)) = \int_0^\infty \int_{R\Sigma} j^{\psi_t}(x) \cdot d\sigma\, dt, \tag{21}$$
$$E(N_{tot}^\psi(R\Sigma)) = \int_0^\infty \int_{R\Sigma} |j^{\psi_t}(x) \cdot d\sigma|\, dt. \tag{22}$$

The proof of Proposition 4 can be found in [3], pp. 34–37, and under slightly different conditions in [24]. The results in the references hold under more general conditions on the surfaces.

Consider now a scattering situation where we want to calculate the number of first crossings. The detector corresponds to the surface $R\Sigma := \{x \in \mathbb{R}^3 : x = R\omega,\ \omega \in \Sigma \subset S^2\} \subset RS^2$. Then we define $N_{det}^\psi([0, \infty), R, \Sigma)$ to be equal to one if the particle with the wave function $\psi_0 = \psi$ is "detected" in $[0, \infty)$ and zero otherwise. More precisely,
$$N_{det}^\psi(R, \Sigma) : \mathbb{R}^3 \to \{0, 1\},$$
$$N_{det}^\psi(R, \Sigma)(q_0) := \begin{cases} 1, & \text{if } |q_0| \le R,\ t_{exit}^{RS^2} < \infty \text{ and } Q^\psi_{t_{exit}^{RS^2}}(q_0) \in R\Sigma, \\ 0, & \text{otherwise.} \end{cases} \tag{23}$$

6 $N_{sig}^\psi(dt, d\sigma)$ is the number of signed crossings in the time interval $[t, t + dt)$ through the surface $d\sigma$.

The definition is motivated by the idea that particles are detected when they cross the boundary $RS^2$ for the first time. Using the fact that $RS^2$ is closed we can estimate
$$\left| N_{det}^\psi(R, \Sigma) - N_{sig}^\psi(R\Sigma) \right| \le N_-^\psi(RS^2),$$
so that by the triangle inequality
$$\left| E(N_{det}^\psi(R, \Sigma)) - E(N_{sig}^\psi(R\Sigma)) \right| \le E(N_-^\psi(RS^2)). \tag{24}$$
With (19), (20) and Proposition 4 we obtain for the right-hand side of (24),
$$E(N_-^\psi(RS^2)) = \frac{1}{2} E\left( N_{tot}^\psi(RS^2) - N_{sig}^\psi(RS^2) \right) = \frac{1}{2} \int_0^\infty \int_{RS^2} \left( |j^{\psi_t}(x) \cdot d\sigma| - j^{\psi_t}(x) \cdot d\sigma \right) dt. \tag{25}$$

If $j^{\psi_t}(x) \cdot d\sigma \ge 0$ for all $d\sigma \in RS^2$ and $t > 0$ then we have by (24) and (25) that $E(N_{sig}^\psi(R\Sigma)) = E(N_{det}^\psi(R, \Sigma))$. In general $j^{\psi_t}(x) \cdot d\sigma$ does not have to be positive, but the flux-across-surfaces theorem (Proposition 3) ensures that the flux is asymptotically outwards. Thus we can estimate the difference between $E(N_{sig}^\psi(R\Sigma))$ and $E(N_{det}^\psi(R, \Sigma))$ for all $\psi$ which satisfy the flux-across-surfaces theorem using (24) and (25),
$$\left| E(N_{sig}^\psi(R\Sigma)) - E(N_{det}^\psi(R, \Sigma)) \right| \le \frac{1}{2} \int_0^\infty \int_{RS^2} \left( |j^{\psi_t}(x) \cdot d\sigma| - j^{\psi_t}(x) \cdot d\sigma \right) dt \xrightarrow[R \to \infty]{} 0. \tag{26}$$
In particular under the hypotheses of Proposition 3 and the general assumptions A1 and A2 we obtain asymptotic equality between the expectation values $E(N_{det}^\psi(R, \Sigma))$ and $E(N_{sig}^\psi(R\Sigma))$.

4. A Model for the Beam

In a scattering situation a beam of particles is scattered off a target. We now wish to focus on the beam. We take the beam to be produced by a particle source located in the plane $Y_L$ perpendicular to the $x_3$-axis: $Y_L := \{-L e_3 + a \mid a \perp e_3\}$, $L > 0$. The particles are created with wave functions $\psi \in \mathcal{H}_{a.c.}$ translated to the plane $Y_L$. Calling $\psi_y$ the translation of $\psi$ by $y$, the "centers" of the translated wave functions, with which we are concerned, are located at $y = y_1 e_1 + y_2 e_2 - L e_3 \in Y_L$ and are uniformly distributed in a bounded region $A \subset Y_L$ with area $|A|$. We call $A$ the beam profile. The momentum distribution of the wave function is concentrated around the momentum $k_0 e_3$.


Remark 3. This model of a beam, in which the particles have random impact parameters and are scattered off a single target "particle," is equivalent to the more realistic description of the scattering situation, in which all the target particles are randomly distributed (e.g., in a foil) and the incoming particles have the very same impact parameter, provided coherent and multiple-scattering effects are neglected (see e.g. [17], p. 214).

The translated wave function $\psi_y$ of a wave function $\psi \in \mathcal{H}_{a.c.}$ will not in general be in $\mathcal{H}_{a.c.}$, but can have a part in $\mathcal{H}_{p.p.}$. This is problematical for the application of our general results (see Sect. 9). To avoid this difficulty, we assume:

A 3. The Hamiltonian $H = -\frac{1}{2}\Delta + V$ has no bound states, i.e. $\mathcal{H}_{p.p.} = \{0\}$. Then $\psi_y \in \mathcal{H}_{a.c.}$, $\forall y \in \mathbb{R}^3$.

We specify now more precisely the model for the beam, which has been already mentioned in [9]. The particles are created with wave functions $\psi$ at random times $t \in \mathbb{R}_+$, and the wave function of a particle is shifted randomly by the uniformly distributed "impact parameter" $y \in A$, the "center" of the wave function at the moment of emission. In Bohmian mechanics the initial position $q \in \mathbb{R}^3$ of the particle determines its trajectory. The initial position is $|\psi_y|^2$-distributed. We shall not need many stochastic details about the beam. The reader may think of a Poisson point process with points in $\Lambda = \mathbb{R}_+ \times A \times \mathbb{R}^3$, with a point $\lambda = (t, y, q) \in \Lambda$ representing a particle with wave function
$$\psi_y(x) \equiv \psi(x - y), \quad y \in A, \tag{27}$$
emitted at the time $t \in \mathbb{R}_+$ and with initial position $q \in \mathbb{R}^3$. We shall consider a general point process $(\boldsymbol{\Lambda}, \mathcal{F}, P)$ built on $(\Lambda, \mathcal{B}(\Lambda), \mu)$, where $\boldsymbol{\lambda} \in \boldsymbol{\Lambda}$ represents a configuration of countably many points in $\Lambda$, i.e. $\boldsymbol{\lambda} = \{\lambda\}$, $\lambda \in \Lambda$, $\boldsymbol{\lambda}$ countable. For the number of points
$$\chi_B(\boldsymbol{\lambda}) \equiv \sum_{\lambda \in \boldsymbol{\lambda}} \chi_B(\lambda)$$
in a set $B \in \mathcal{B}(\Lambda)$, where $\chi_B$ is the indicator function of the set $B$, we have that
$$E\, \chi_B = \mu(B), \tag{28}$$
where the intensity measure $\mu$ on $\mathcal{B}(\Lambda)$ is given by
$$d\mu = |\psi(x - y)|^2 \chi_A(y)\, dt\, d^2y\, d^3x. \tag{29}$$

Remark 4. For a Poisson process we would have, in addition to (28), that
$$P(\chi_B = k) = \exp(-\mu(B)) \frac{\mu(B)^k}{k!} \tag{30}$$
as well as the independence of $\chi_A$ and $\chi_B$, for $A \cap B = \emptyset$, $A, B \in \mathcal{B}(\Lambda)$.
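As a rough numerical illustration of (28) and (29) in the Poisson case, the following sketch samples emission times and impact parameters and compares a point count with the intensity measure. It is only a sketch under stated assumptions: the beam profile is taken to be a hypothetical rectangle rather than a general region, the initial position $q$ is integrated out (which is possible because $\psi$ is normalized), and the helper name `sample_beam_points` is invented for this example.

```python
import random

def sample_beam_points(tau, width, height, rng):
    """Sample (t, y1, y2) from a Poisson process of unit intensity on
    [0, tau) x A, with A = [0, width] x [0, height] a hypothetical beam profile."""
    rate = width * height              # emissions per unit time equal |A|
    points = []
    t = rng.expovariate(rate)          # exponential waiting times give Poisson arrivals
    while t < tau:
        points.append((t, rng.uniform(0, width), rng.uniform(0, height)))
        t += rng.expovariate(rate)
    return points

rng = random.Random(1)
tau = 5000.0
points = sample_beam_points(tau, width=2.0, height=1.0, rng=rng)

# B = [0, tau) x C x R^3 with C = [0, 0.5] x [0, 1], so mu(B) = |C| * tau.
count_in_B = sum(1 for (_, y1, _) in points if y1 <= 0.5)
mu_B = 0.5 * tau
print(count_in_B, mu_B)
```

For large `tau` the count stays close to `mu_B`, with fluctuations of the order of the square root of `mu_B`, as expected from the Poisson law (30).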


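We shall assume that the point process is ergodic in the following sense: For any $B \in \mathcal{B}(\Lambda)$ let
$$B(\tau) := \{(t, y, q) \in B \mid t \in [0, \tau)\}. \tag{31}$$
Then for any $\epsilon > 0$,
$$\lim_{\tau \to \infty} P\left( \left| \frac{\chi_{B(\tau)}}{\tau} - E\left( \frac{\chi_{B(\tau)}}{\tau} \right) \right| \ge \epsilon \right) = 0, \tag{32}$$
with $E\, \chi_{B(\tau)}$ given by (28).

Remark 5. Because of the independence property (cf. Remark 4), (32) holds for the case of a Poisson process.

Remark 6. The point process has unit density in the following sense: Let $C \subset A$, $\tau > 0$ and $B := [0, \tau) \times C \times \mathbb{R}^3$ be given. Then with (32) for any $\epsilon > 0$,
$$\lim_{\tau \to \infty} P\left( \left| \frac{\chi_B}{|C|\tau} - E\left( \frac{\chi_B}{|C|\tau} \right) \right| \ge \epsilon \right) = 0, \tag{33}$$
and
$$E\left( \frac{\chi_{B(\tau)}}{|C|\tau} \right) = \frac{1}{|C|\tau} \mu(B) = 1. \tag{34}$$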

5. The Definition of the Scattering Cross Section

We shall now start to define $N(\tau, R, A, L, \psi, \Sigma)$, the number of detected particles. To simplify the notation we do not always indicate the dependence of $N$ on $A$, $L$ and $\psi$. Sometimes we will also suppress the dependence on $R$ and $\Sigma$. We define first $N_{det}(\tau, R, \Sigma)$ for a single particle corresponding to $\lambda = (t, y, q)$ by
$$N_{det}(\tau, R, \psi, \Sigma) : \Lambda \to \{0, 1\}, \qquad N_{det}(\tau, R, \psi, \Sigma)(\lambda) := \chi_{[0,\tau)}(t)\, N_{det}^{\psi_y}(R, \Sigma)(q), \tag{35}$$
where $N_{det}^{\psi_y}(R, \Sigma)(q)$ is defined by (23). The characteristic function ensures that no particle is counted which is emitted after the time $\tau$. Note that $\psi_y$ must satisfy condition A1 (Sect. 3) to ensure that $N_{det}^{\psi_y}(R, \Sigma)(q)$ is well defined. Then
$$N(\tau, R, A, L, \psi, \Sigma) : \boldsymbol{\Lambda} \to \mathbb{N}_0, \qquad N(\tau, R, A, L, \psi, \Sigma)(\boldsymbol{\lambda}) = \sum_{\lambda \in \boldsymbol{\lambda}} N_{det}(\tau, R, \psi, \Sigma)(\lambda). \tag{36}$$


The empirical scattering cross section $\sigma_{emp}(\Sigma)$ for the solid angle $\Sigma$ is the random variable⁷
$$\sigma_{emp}(\Sigma) := \frac{N(\tau, R, A, L, \psi, \Sigma)}{\tau}, \tag{37}$$
which by the law of large numbers (for the Poisson case and by the ergodicity assumption (32) for the general case) should approximate for large $\tau$ in $P$-probability its corresponding $P$-expectation value. The expected value of (37) is then the theoretically predicted cross section. This theoretically predicted cross section involves a very complicated formula which is not very explicit, cf. (47) and Remark 7. It depends of course on the detection directions $\Sigma$, the potential $V$ and the approximate momentum $k_0$ of the particles in the beam, but depends also on the other details of the experimental setup such as $R$, $A$, $L$ and the detailed specification of $\psi$. By taking the scaling limit described in the next section, we shall arrive at (1), which does not depend on these additional details.

6. The Scaling of the Parameters

According to the usual asymptotic picture of scattering theory where the particles are prepared long before and are detected long after the scattering event has occurred, the preparation and detection should be far away from the scattering center. That means the limits $R \to \infty$ and $L \to \infty$ have to be taken. However, increasing $L$ has the (undesirable) effect of an increased spreading of the beam, which reduces the beam intensity in the scattering region. To maintain the beam intensity in the scattering region we must widen the beam profile $A$ as $L \to \infty$. The idealization of an incoming plane wave corresponds to particles with a narrow distribution in momentum space, i.e., to a limit in which the Fourier transform of the initial wave function becomes more and more concentrated around a fixed initial wave vector $k_0$. For a detailed discussion of the scattering regime see [8].

The limits for the parameters $L$, $A$, and $\psi$ will be combined by simultaneously scaling them using a small parameter $\epsilon$: We introduce $L_\epsilon$, $A_\epsilon$ and $\psi_\epsilon$, whose precise dependence on $\epsilon$ will be given below, and consider the cross section corresponding to (37), depending on $\epsilon$, $R$, $\tau$,
$$\sigma_{emp}^\epsilon(\Sigma) = \frac{N(\tau, R, A_\epsilon, L_\epsilon, \psi_\epsilon, \Sigma)}{\tau}, \tag{38}$$
to which the limit $\epsilon \to 0$ is to be applied. However, the limit $R \to \infty$ is taken before we take $\epsilon \to 0$; this is because we must have that the diameter of the beam profile $A_\epsilon$ is much smaller than $R$, since otherwise unscattered particles will often contribute to what should be the cross section for scattered particles. For convenience, we first take the limit $\tau \to \infty$, required for the stabilization of the empirical cross section produced by the law of large numbers. We are thus led to consider a limit for the cross section of the form
$$\sigma(\Sigma) = \lim_{\epsilon \to 0} \lim_{R \to \infty} \lim_{\tau \to \infty} \sigma_{emp}^\epsilon(\Sigma). \tag{39}$$

7 We shall ignore the dimension factor [unit area · unit time] which comes from the normalization of (37) by the unit density $\frac{1}{[\text{unit area} \cdot \text{unit time}]}$ of the underlying point process, cf. Remark 6. One can also normalize by the beam density, i.e. with the number of detected particles (by a detector in the beam with a surface perpendicular to the beam axis) per unit time and unit area, in front of the target. In the scattering regime, i.e. if the parameters are suitably scaled (cf. Section 6), the beam will have unit density in front of the target. We shall not elaborate on this further in this paper, see however [8].


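The precise definition of $L_\epsilon$, $A_\epsilon$ and $\psi_\epsilon$, used in our main results, is the following:
$$\psi_\epsilon(x) = \epsilon^{3/2} e^{i k_0 \cdot x} \psi(\epsilon x), \tag{40}$$
with the Fourier transform
$$\widehat{\psi}_\epsilon(k) = \epsilon^{-3/2}\, \widehat{\psi}\!\left( \frac{k - k_0}{\epsilon} \right). \tag{41}$$
The particle source is located on $Y_{L_\epsilon}$, with
$$L_\epsilon = \frac{L}{\epsilon^l}, \quad l > 2. \tag{42}$$
For the beam profile $A_\epsilon \subset Y_{L_\epsilon}$ we take the circular region
$$A_\epsilon = \left\{ x \in \mathbb{R}^3 \,\middle|\, \sqrt{x_1^2 + x_2^2} < \frac{D_\epsilon}{2} \text{ and } x_3 = -L_\epsilon \right\} \tag{43}$$
with the beam diameter $D_\epsilon$ given by
$$D_\epsilon = \frac{D}{\epsilon^d}, \quad d > 2l - 3. \tag{44}$$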
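A quick numerical sanity check of the scaling (40) is sketched below: the prefactor $\epsilon^{3/2}$ is exactly what keeps $\psi_\epsilon$ normalized. The sketch assumes a Gaussian $\psi(x) = \pi^{-3/4} e^{-|x|^2/2}$ as a hypothetical example of a normalized Schwartz function; the function names and the cutoff radius are choices made for this illustration and are not part of the paper.

```python
import math

def psi_eps_density(r, eps):
    """|psi_eps(x)|^2 at radius r = |x| for psi(x) = pi^(-3/4) exp(-|x|^2 / 2),
    with psi_eps(x) = eps^(3/2) exp(i k0.x) psi(eps x); the phase drops out."""
    return eps**3 * math.pi**-1.5 * math.exp(-(eps * r) ** 2)

def norm_squared(eps, n=20000):
    """Midpoint-rule approximation of the integral of |psi_eps|^2 over R^3,
    written as a radial integral with weight 4 pi r^2."""
    r_max = 10.0 / eps               # the Gaussian is negligible beyond this radius
    dr = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += 4.0 * math.pi * r * r * psi_eps_density(r, eps) * dr
    return total

for eps in (1.0, 0.1, 0.01):
    print(eps, norm_squared(eps))
```

The norm stays at one for every $\epsilon$ while the wave function spreads over a region of size $1/\epsilon$ in position space. As an example of admissible exponents in (42) and (44), $l = 2.5$ and $d = 2.2$ satisfy $l > 2$ and $d > 2l - 3 = 2$.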

(One might be inclined to consider a scattering experiment in which the diameter of the beam is much smaller than the distance of the particle source from the scattering center. Indeed, if $2 < l < 3$, $d < l$ is consistent with (44). Hence, such a scenario is covered by our results.)

7. The Scattering Cross Section Theorem

We can now formulate our main results. Our basic assumptions are that $V \in (V)_5$ (Definition 1), A2 (Sect. 3), A3 (no bound states, Sect. 4) and that for all $\epsilon$ small enough $\psi_{\epsilon,y}$ is "good" for all $y \in A_\epsilon$ in the sense that it satisfies A1 (Sect. 3) as well as the condition for the FAST (Prop. 3). Moreover, we need to assume that the potential has no zero energy resonances. However, instead of invoking the implicit condition on $\psi$ that the $\psi_{\epsilon,y}$ are "good," we impose stronger but more explicit conditions on $\psi$, namely that $\psi \in C_0^\infty(\mathbb{R}^3)$ (Theorem 2) or $\psi \in \mathcal{S}$ (Theorem 1), with corresponding additional conditions on the potential (Definitions 4 and 3, respectively).

Definition 3. $V$ is in $\mathcal{V}$ if
(i) the Hamiltonian $H = -\frac{1}{2}\Delta + V$ has no bound states, i.e. $\mathcal{H}_{p.p.} = \{0\}$,
(ii) the Hamiltonian $H = -\frac{1}{2}\Delta + V$ has no zero energy resonances,
(iii) $V$ is a $C^\infty$-function on $\mathbb{R}^3$,
(iv) $V$ and its derivatives of all orders are uniformly bounded in $x$: For all multi-indices $\alpha$ there exists an $M_\alpha < \infty$ such that $|\partial_x^\alpha V(x)| < M_\alpha$ for all $x \in \mathbb{R}^3$,
(v) there exist positive numbers $\delta$ and $C$ such that $|V(x)| \le C \langle x \rangle^{-5-\delta}$ for all $x \in \mathbb{R}^3$.


Theorem 1. Let $\psi$ be a normalized vector in $\mathcal{S}(\mathbb{R}^3)$ and suppose that $V$ is in $\mathcal{V}$. Furthermore, suppose that the point process $(\boldsymbol{\Lambda}, \mathcal{F}, P)$ satisfies (28), (29) and the ergodic assumption (32). Let $\mathbf{k}_0 \parallel e_3$ with $k_0 > 0$ and suppose that $\mathbf{k}_0 \notin C_\Sigma$. Then $\sigma_{emp}^\epsilon$ is well defined and (recalling (1))
$$\sigma_{emp}^\epsilon(\Sigma) = \frac{N(\tau, R, A_\epsilon, L_\epsilon, \psi_\epsilon, \Sigma)}{\tau} \xrightarrow[\epsilon \to 0,\, R \to \infty,\, \tau \to \infty]{P} \sigma(\Sigma) = \int_\Sigma \sigma^{diff}(\omega)\, d\Omega, \tag{45}$$
where $\sigma^{diff}(\omega) = 16\pi^4 |T(k_0 \omega, \mathbf{k}_0)|^2$ and $\xrightarrow{P}$ denotes convergence in probability.

Definition 4. $V$ is in $\mathcal{V}'$ if
(i) the Hamiltonian $H = -\frac{1}{2}\Delta + V$ has no bound states, i.e. $\mathcal{H}_{p.p.} = \{0\}$,
(ii) the Hamiltonian $H = -\frac{1}{2}\Delta + V$ has no zero energy resonances,
(iii) $V$ is in $(V)_5$,
(iv) $V$ is $C^\infty$ except, perhaps, at a finite number of singularities.

Under these conditions we obtain

Theorem 2. Let $\psi$ be a normalized vector in $C_0^\infty(\mathbb{R}^3)$ and let $V$ be in $\mathcal{V}'$. Furthermore, suppose that the point process $(\boldsymbol{\Lambda}, \mathcal{F}, P)$ satisfies (28), (29) and the ergodic assumption (32). Let $\mathbf{k}_0 \parallel e_3$ with $k_0 > 0$ and suppose that $\mathbf{k}_0 \notin C_\Sigma$. Then $\sigma_{emp}^\epsilon$ is well defined and (45) of Theorem 1 holds.

8. Proof of Theorem 1 and Theorem 2

During the proof in this section and in the appendix, $0 < c < \infty$ will denote a constant whose value can change during a calculation, even within the same equation or inequality.

If either $V \in \mathcal{V}$ and $\psi \in \mathcal{S}(\mathbb{R}^3)$ or $V \in \mathcal{V}'$ and $\psi \in C_0^\infty$, then (if $\psi$ is normalized) the $\psi_{\epsilon,y}$ are "good" for all $y \in A_\epsilon$ for all $\epsilon$ small enough. That the $\psi_{\epsilon,y}$ satisfy the conditions for the FAST follows from Lemma 1 below, and that they satisfy A1 is easily seen: For the case $V \in \mathcal{V}$ and $\psi \in \mathcal{S}(\mathbb{R}^3)$ the conclusion follows from a simple computation, and if $V \in \mathcal{V}'$ and $\psi \in C_0^\infty$ it suffices to observe that by choosing $\epsilon$ small enough the wave function $\psi_{\epsilon,y}$ has, for all $y \in A_\epsilon$, no overlap with the singularities of the potential. $N$ is thus well defined by (36), and we can take the first limit in (45) using the following

Proposition 5. Suppose that $\psi_y$ satisfies A1 for all $y \in A$ and that the potential satisfies A2. Furthermore, suppose that the point process $(\boldsymbol{\Lambda}, \mathcal{F}, P)$ satisfies (28), (29) and the ergodic assumption (32). Then the number of detected particles $N(\tau)$ obeys the law of large numbers, i.e. for all $\delta > 0$,
$$\lim_{\tau \to \infty} P\left( \left| \frac{N(\tau, \Sigma)}{\tau} - \gamma \right| \ge \delta \right) = 0, \tag{46}$$
where
$$\gamma = \int_A E\left( N_{det}^{\psi_y}(\Sigma) \right) d^2y. \tag{47}$$


Remark 7. $\gamma = \gamma(\Sigma)$ is in fact the cross section which would be measured in an experiment. The remaining limits in (45) applied to $\gamma$ yield the cross section $\sigma(\Sigma)$. If the basic point process is a Poisson process with $[0, \tau) = \mathbb{R}_+$, the times of detection in $\Sigma$ form a Poisson process with intensity $\gamma$. Moreover, in the scattering regime, the detailed detection events, involving times and directions, form a Poisson process on $\mathbb{R}_+ \times S^2$ with intensity $\sigma^{diff}(\omega)$.

Proof. By the definition (36) of $N$ we have that
$$N(\tau)(\boldsymbol{\lambda}) = \chi_{B(\tau)}(\boldsymbol{\lambda}) = \sum_{\lambda \in \boldsymbol{\lambda}} \chi_{B(\tau)}(\lambda), \tag{48}$$
with $B(\tau)$ given by
$$B(\tau) = \{(t, y, q) \in \Lambda \mid N_{det}(\tau, \Sigma)(t, y, q) = 1\}. \tag{49}$$
It thus follows from (28) and (29) that
$$E\, N(\tau) = \mu(B(\tau)) = \int \chi_{[0,\tau)}(t)\, N_{det}^{\psi_y}(\Sigma)(q)\, d\mu = \tau \int_A E\left( N_{det}^{\psi_y}(\Sigma) \right) d^2y = \tau \gamma. \tag{50}$$
The proposition follows from the ergodicity assumption (32).

It is not easy to calculate the expectation value $\gamma$ (cf. (47)) directly. However, as we shall show below, using the FAST we can approximate (47) by
$$\int_{A_\epsilon} E\left( N_{sig}^{\psi_{\epsilon,y}}(R\Sigma) \right) d^2y, \tag{51}$$
where the integrand of (51) is given by an integral over the flux (cf. (21)), an expression that we can more easily handle. We will show in Lemma 2 below that $E\left( N_{sig}^{\psi_{\epsilon,y}}(R\Sigma) \right)$ is absolutely integrable over $A_\epsilon$. We introduce now a class of scattering states $\mathcal{G}$ for which we can show that the corresponding asymptotes are in the set $\mathcal{G}^+$, i.e. that they satisfy the FAST.

Definition 5. A function $f : \mathbb{R}^3 \to \mathbb{C}$ is in $\mathcal{G}^0$ if⁸
$$f \in \mathcal{H}_{a.c.}(H) \cap C^8(H),$$
$$\langle x \rangle^2 H^n f(x) \in L^2(\mathbb{R}^3), \quad n \in \{0, 1, 2, \dots, 8\},$$
$$\langle x \rangle^4 H^n f(x) \in L^2(\mathbb{R}^3), \quad n \in \{0, 1, 2, 3\}.$$
Then
$$\mathcal{G} := \bigcup_{t \in \mathbb{R}} e^{-iHt} \mathcal{G}^0.$$

We state now the important lemma that ensures that the $\psi_{\epsilon,y}$ satisfy the FAST.

8 $C^8(H) := \bigcap_{n=1}^{8} D(H^n)$


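Lemma 1. Suppose $V \in (V)_4$ and that zero is neither a resonance nor an eigenvalue of $H$. Then
$$\psi(x) \in \mathcal{G} \ \Rightarrow\ \widehat{\psi}_{out}(k) = F(\Omega_+^{-1} \psi)(k) \in \mathcal{G}^+.$$
The proof is adapted from [12] and can be found in the appendix.

Remark 8. For other mapping properties between $\psi$ and $\psi_{out}$, which are not applicable in our case, see [26].

For $\psi \in \mathcal{S}$ and $V \in \mathcal{V}$ or $\psi \in C_0^\infty(\mathbb{R}^3)$ and $V \in \mathcal{V}'$ we have that $\psi_{\epsilon,y} \in C^\infty(H)$ for all $y \in A_\epsilon$ and $\epsilon$ small enough. By (i) in the definition of $\mathcal{V}$ or $\mathcal{V}'$ (Definition 3 or 4) there are no bound states. Hence $\psi_{\epsilon,y} \in \mathcal{H}_{a.c.}(H) \cap C^8(H)$, and one easily sees that $\psi_{\epsilon,y} \in \mathcal{G}$. Thus by Lemma 1 and Proposition 3 the $\psi_{\epsilon,y}$ satisfy the FAST for all $y \in A_\epsilon$ and $\epsilon$ small enough.

We now show that $E\left( N_{sig}^{\psi_{\epsilon,y}}(R\Sigma) \right)$ is absolutely integrable over $A_\epsilon$.

Lemma 2. Suppose that $\psi \in \mathcal{S}$ and $V \in \mathcal{V}$ or that $\psi \in C_0^\infty(\mathbb{R}^3)$ and $V \in \mathcal{V}'$. Then there exist $M$ and $R_0 > 0$ such that for $\epsilon$ small enough
$$\int_0^\infty \int_{RS^2} \left| j^{\psi_{\epsilon,y,t}}(x) \cdot d\sigma \right| dt < M, \quad \forall y \in A_\epsilon,\ \forall R > R_0. \tag{52}$$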

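For the proof see the Appendix. From now on we assume that $R > R_0$.

By Lemma 1, Proposition 3, Proposition 4 and Lemma 2 we see that (51) is a well defined expression. Moreover, by (26) the difference between $E\left( N_{det}^{\psi_{\epsilon,y}}(R, \Sigma) \right)$ and $E\left( N_{sig}^{\psi_{\epsilon,y}}(R\Sigma) \right)$ vanishes in the limit $R \to \infty$, and using Lemma 2 we easily see by the dominated convergence theorem that the same conclusion holds for the integrals themselves. Thus, by Proposition 5, the limit $\sigma(\Sigma)$ in Theorem 1 is given by
$$\begin{aligned} \sigma(\Sigma) = \lim_{\epsilon \to 0} \lim_{R \to \infty} \gamma &= \lim_{\epsilon \to 0} \lim_{R \to \infty} \int_{A_\epsilon} E\left( N_{det}^{\psi_{\epsilon,y}}(R, \Sigma) \right) d^2y \\ &= \lim_{\epsilon \to 0} \int_{A_\epsilon} \lim_{R \to \infty} E\left( N_{sig}^{\psi_{\epsilon,y}}(R\Sigma) \right) d^2y \\ &= \lim_{\epsilon \to 0} \int_{A_\epsilon} \lim_{R \to \infty} \int_0^\infty \int_{R\Sigma} j^{\psi_{\epsilon,y,t}}(x) \cdot d\sigma\, dt\, d^2y. \end{aligned} \tag{53}$$
Using Lemma 1 and Proposition 3 we get instead of (53),
$$\sigma(\Sigma) = \lim_{\epsilon \to 0} \int_{C_\Sigma} \int_{A_\epsilon} \left| \widehat{\Omega_+^{-1} \psi_{\epsilon,y}}(k) \right|^2 d^2y\, d^3k = \lim_{\epsilon \to 0} \int_{C_\Sigma} \int_{A_\epsilon} \left| \widehat{S \Omega_-^{-1} \psi_{\epsilon,y}}(k) \right|^2 d^2y\, d^3k. \tag{54}$$
The formula for $S = T + I$ is given by (8) and (9). To exploit this formula we write instead of (54):
$$\sigma(\Sigma) = \lim_{\epsilon \to 0} \int_{C_\Sigma} \int_{A_\epsilon} \left| F\left( S(\Omega_-^{-1} \psi_{\epsilon,y} - \psi_{\epsilon,y}) + T \psi_{\epsilon,y} + \psi_{\epsilon,y} \right)(k) \right|^2 d^2y\, d^3k. \tag{55}$$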

By the triangle inequality we see that (55) yields
$$\sigma(\Sigma) = \lim_{\epsilon \to 0} \int_{C_\Sigma} \int_{A_\epsilon} \left| F(T \psi_{\epsilon,y})(k) \right|^2 d^2y\, d^3k, \tag{56}$$
provided
$$\lim_{\epsilon \to 0} \int_{A_\epsilon} \left\| \Omega_-^{-1} \psi_{\epsilon,y} - \psi_{\epsilon,y} \right\|^2 d^2y = 0 \tag{57}$$
and
$$\lim_{\epsilon \to 0} \int_{C_\Sigma} \int_{A_\epsilon} \left| \widehat{\psi}_{\epsilon,y}(k) \right|^2 d^2y\, d^3k = 0. \tag{58}$$

Remark 9. In [9] the "sufficient condition" for proceeding from (54) to (56) was insufficient.

We will establish now (57) and (58). We start with (58). Suppose that $\Sigma$ is such that $k_0 \notin C_\Sigma$. With (41) we have then that

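$$\int_{C_\Sigma} \int_{A_\epsilon} \left| \widehat{\psi}_{\epsilon,y}(k) \right|^2 d^2y\, d^3k = \epsilon^{-3} \int_{C_\Sigma} \int_{A_\epsilon} \left| \widehat{\psi}\!\left( \frac{k - k_0}{\epsilon} \right) \right|^2 d^2y\, d^3k = \int_{\frac{1}{\epsilon}(C_\Sigma - k_0)} \int_{A_\epsilon} |\widehat{\psi}(k)|^2\, d^2y\, d^3k. \tag{59}$$
Since $k_0 \notin C_\Sigma$ there exists a $\delta > 0$ such that
$$|k - k_0| > \delta \quad \text{for all } k \in C_\Sigma. \tag{60}$$
Using that $\widehat{\psi} \in \mathcal{S}(\mathbb{R}^3)$ (we will use that $|\widehat{\psi}(k)| \le c \langle k \rangle^{-(d+2)}$), the last integral in (59) can be estimated by
$$\int_{\frac{1}{\epsilon}(C_\Sigma - k_0)} \int_{A_\epsilon} |\widehat{\psi}(k)|^2\, d^2y\, d^3k \le \int_{|k| > \frac{\delta}{\epsilon}} \int_{A_\epsilon} |\widehat{\psi}(k)|^2\, d^2y\, d^3k \le \frac{c}{\epsilon^{2d}} \int_{|k| > \frac{\delta}{\epsilon}} \frac{1}{\langle k \rangle^{2d+4}}\, d^3k \le c\, \epsilon, \tag{61}$$
from which (58) follows. Since $\Omega_-$ is a partial isometry, (57) is equivalent to
$$\lim_{\epsilon \to 0} \int_{A_\epsilon} \left\| \Omega_- \psi_{\epsilon,y} - \psi_{\epsilon,y} \right\|^2 d^2y = 0, \tag{62}$$
which is the content of the following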


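Lemma 3. Let zero be neither an eigenvalue nor a resonance of $H$ and suppose that $V \in (V)_5$. Let $\psi \in \mathcal{S}(\mathbb{R}^3)$ and let $k_0 > 0$. Then
$$\lim_{\epsilon \to 0} \int_{A_\epsilon} \left\| \Omega_- \psi_{\epsilon,y} - \psi_{\epsilon,y} \right\|^2 d^2y = 0. \tag{63}$$

Remark 10. Under the additional condition that $\operatorname{supp} \widehat{\psi} \subset P_{e_3}^\alpha$ for some $\alpha \in (0, \frac{\pi}{2})$, where $P_{e_3}^\alpha := \{k \in \mathbb{R}^3 : k \cdot e_3 > |k| \cos\alpha\}$, $0 < \alpha < \frac{\pi}{2}$ (this is a convenient condition, see e.g. [2], Lemma 7.17), one can prove in a manner similar to the way we prove Lemma 3 that the following holds:
$$\lim_{L \to \infty} \int_{Y_L} \left\| \Omega_- \psi_y - \psi_y \right\|^2 d^2y = 0. \tag{64}$$
It is well known that the integrand in (64) tends to zero for large $y$ (see e.g. [2], Corollary 8.17, [19], Theorem XI.33, and [21], Theorem 2.20).

Proof of Lemma 3. We have that
$$\left\| \Omega_- \psi_{\epsilon,y} - \psi_{\epsilon,y} \right\|^2 = 1 - (\psi_{\epsilon,y}, \Omega_- \psi_{\epsilon,y}) + \text{c.c.} \tag{65}$$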

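Since $\Omega_- \psi = F_-^{-1} \widehat{\psi}(k)$ for any $\psi \in L^2(\mathbb{R}^3)$ (Proposition 1, (iii)) we obtain for the r.h.s. of (65):
$$1 - \int (\psi_{\epsilon,y})^*(x)\, (2\pi)^{-3/2} \int \widehat{\psi}_{\epsilon,y}(k)\, \varphi_-(x, k)\, d^3k\, d^3x + \text{c.c.} \tag{66}$$
Writing
$$\varphi_-(x, k) = e^{i k \cdot x} - \eta_-(x, k), \tag{67}$$
and since $\|\psi_{\epsilon,y}\|^2 = 1$, we then find that
$$\left\| \Omega_- \psi_{\epsilon,y} - \psi_{\epsilon,y} \right\|^2 = \int (\psi_{\epsilon,y})^*(x)\, (2\pi)^{-3/2} \int \widehat{\psi}_{\epsilon,y}(k)\, \eta_-(x, k)\, d^3k\, d^3x + \text{c.c.} \tag{68}$$
We shall divide the $k$-integration into two parts with the help of smooth ($C^\infty$) mollifiers $0 \le f_1(k) \le 1$ and $0 \le f_2(k) \le 1$ satisfying
$$f_1(k) = \begin{cases} 1, & \text{for } |k - k_0| < \frac{k_0}{3}, \\ 0, & \text{for } |k - k_0| \ge \frac{k_0}{2}, \end{cases} \qquad f_2(k) := 1 - f_1(k). \tag{69}$$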
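The text only requires that smooth cutoffs with the properties (69) exist. One standard way to build them, sketched here as an illustration and not as the authors' construction, glues the two plateaus with the classical $e^{-1/s}$ transition function. The function names `smooth_step`, `f1` and `f2` and the radial form of the cutoff are assumptions made for this example.

```python
import math

def smooth_step(t):
    """C-infinity function equal to 0 for t <= 0 and 1 for t >= 1."""
    def g(s):
        return math.exp(-1.0 / s) if s > 0 else 0.0
    return g(t) / (g(t) + g(1.0 - t))

def f1(k, k0_vec):
    """Radial cutoff: 1 for |k - k0| < k0/3 and 0 for |k - k0| >= k0/2."""
    k0 = math.sqrt(sum(c * c for c in k0_vec))
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(k, k0_vec)))
    # Map dist in [k0/3, k0/2] linearly onto t in [0, 1].
    t = (dist - k0 / 3.0) / (k0 / 2.0 - k0 / 3.0)
    return 1.0 - smooth_step(t)

def f2(k, k0_vec):
    """Complementary cutoff, so that f1 + f2 = 1 everywhere."""
    return 1.0 - f1(k, k0_vec)

k0_vec = (0.0, 0.0, 1.0)
for k in [(0.0, 0.0, 1.0), (0.0, 0.0, 1.4), (0.0, 0.0, 2.0)]:
    print(k, f1(k, k0_vec), f2(k, k0_vec))
```

With this choice $f_1$ isolates the momenta close to $k_0$, where the scaled wave function concentrates, while $f_2$ collects the tails.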

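Using (69) we obtain for (68)
$$\begin{aligned} \left\| \Omega_- \psi_{\epsilon,y} - \psi_{\epsilon,y} \right\|^2 &= \int (\psi_{\epsilon,y})^*(x)\, (2\pi)^{-3/2} \int \widehat{\psi}_{\epsilon,y}(k) (f_1 + f_2)(k)\, \eta_-(x, k)\, d^3k\, d^3x + \text{c.c.} \\ &= \int (\psi_{\epsilon,y})^*(x)\, (2\pi)^{-3/2} \int \widehat{\psi}_{\epsilon,y}(k) f_1(k)\, \eta_-(x, k)\, d^3k\, d^3x \\ &\quad + \int (\psi_{\epsilon,y})^*(x)\, (2\pi)^{-3/2} \int \widehat{\psi}_{\epsilon,y}(k) f_2(k)\, \eta_-(x, k)\, d^3k\, d^3x + \text{c.c.} \\ &=: I_1 + I_2 + \text{c.c.} \le 2|I_1| + 2|I_2|. \end{aligned} \tag{70}$$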


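Observing that $\psi \in \mathcal{S}(\mathbb{R}^3)$ we estimate $|I_2|$ by using that for any $n > 0$, $|\widehat{\psi}(k)| \le \frac{c_n}{k^n}$ and that $|\eta_-(x, k)| \le 1 + |\varphi_-(x, k)| \le c$ (Proposition 2 (ii)) as well as (40), (41) and (69):
$$\begin{aligned} |I_2| &\le c \int |\psi(\epsilon(x - y))|\, (2\pi)^{-3/2} \int \left| \widehat{\psi}\!\left( \frac{k - k_0}{\epsilon} \right) \right| f_2(k)\, d^3k\, d^3x \\ &\le \frac{c}{\epsilon^3} \int_{|k - k_0| \ge \frac{k_0}{3}} \left| \widehat{\psi}\!\left( \frac{k - k_0}{\epsilon} \right) \right| d^3k \le c \int_{|k| \ge \frac{k_0}{3\epsilon}} |\widehat{\psi}(k)|\, d^3k \le c \int_{|k| \ge \frac{k_0}{3\epsilon}} \frac{1}{k^n}\, d^3k = c\, \epsilon^{n-3}, \end{aligned} \tag{71}$$
if $n \ge 4$. Lemma 3 concerns the integration of $I_1$ and $I_2$ over $A_\epsilon$. With (71) we obtain that
$$\int_{A_\epsilon} |I_2|\, d^2y \le c\, \epsilon^{n-3-2d}, \tag{72}$$
which tends to zero if we choose $n$ large enough. We are left with showing that
$$\lim_{\epsilon \to 0} \int_{A_\epsilon} |I_1|\, d^2y = 0, \tag{73}$$
and for this it suffices to prove that
$$\lim_{\epsilon \to 0} \int_{Y_{L_\epsilon}} |I_1|\, d^2y = 0. \tag{74}$$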

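Recalling the Lippmann-Schwinger equation (11), i.e. that
$$\eta_-(x, k) = \frac{1}{2\pi} \int \frac{e^{ik|x - x'|}}{|x - x'|} V(x')\, \varphi_-(x', k)\, d^3x',$$
we find that
$$I_1 = \frac{1}{(2\pi)^{5/2}} \int (\psi_{\epsilon,y})^*(x) \int \widehat{\psi}_{\epsilon,y}(k) f_1(k) \int \frac{e^{ik|x - x'|}}{|x - x'|} V(x')\, \varphi_-(x', k)\, d^3x'\, d^3k\, d^3x. \tag{75}$$
Since the integrand in (75) is absolutely integrable over $x$, $x'$, $k$ (because $\psi \in \mathcal{S}(\mathbb{R}^3)$, $V \in (V)_5$; cf. Proposition 2 (ii)) we are free to interchange these integrations and more generally change integration variables as convenient. Using $(\psi_{\epsilon,y})^*(x) = (\psi_\epsilon)^*(x - y)$, $\widehat{\psi}_{\epsilon,y}(k) = \widehat{\psi}_\epsilon(k) e^{-i k \cdot y}$ we obtain that
$$I_1 = \frac{1}{(2\pi)^{5/2}} \int_{\mathbb{R}^3} (\psi_\epsilon)^*(x - y) \int_{\mathbb{R}^3} \widehat{\psi}_\epsilon(k) f_1(k) \int_{\mathbb{R}^3} \frac{e^{ik|x - x'| - i k \cdot y}}{|x - x'|} V(x')\, \varphi_-(x', k)\, d^3x'\, d^3k\, d^3x. \tag{76}$$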


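Making the change of variables $x \to x - y$ and using $y = (y_1, y_2, -L_\epsilon)$ we obtain
$$I_1 = \frac{1}{(2\pi)^{5/2}} \int_{\mathbb{R}^3} (\psi_\epsilon)^*(x) \int_{\mathbb{R}^3} \widehat{\psi}_\epsilon(k) f_1(k) \int_{\mathbb{R}^3} \frac{e^{ik|y + x - x'| - i k_1 y_1 - i k_2 y_2 + i k_3 L_\epsilon}}{|y + x - x'|} V(x')\, \varphi_-(x', k)\, d^3x'\, d^3k\, d^3x. \tag{77}$$
Introducing as shorthand notation (no change of variables) $\tilde{y} = y + x - x'$, $a := x - x'$, $b_3 := -L_\epsilon + a_3$ and letting $(r, \theta)$ be the polar coordinates for $(\tilde{y}_1, \tilde{y}_2)$, with $e_r$ the corresponding radial unit vector ($\perp e_3$), this becomes
$$\begin{aligned} I_1 &= \frac{1}{(2\pi)^{5/2}} \int_{\mathbb{R}^3} (\psi_\epsilon)^*(x) \int_{\mathbb{R}^3} \widehat{\psi}_\epsilon(k) f_1(k) \int_{\mathbb{R}^3} \frac{e^{ik\sqrt{\tilde{y}_1^2 + \tilde{y}_2^2 + (-L_\epsilon + a_3)^2} - i k_1 \tilde{y}_1 - i k_2 \tilde{y}_2 + i k_3 L_\epsilon}}{|\tilde{y}|}\, e^{i k_1 a_1 + i k_2 a_2}\, V(x')\, \varphi_-(x', k)\, d^3x'\, d^3k\, d^3x \\ &= \frac{1}{(2\pi)^{5/2}} \int_{\mathbb{R}^3} (\psi_\epsilon)^*(x) \int_{\mathbb{R}^3} \widehat{\psi}_\epsilon(k) f_1(k) \int_{\mathbb{R}^3} \frac{e^{ik\sqrt{r^2 + b_3^2} - i k \sin\vartheta\, r \cos\beta + i k \cos\vartheta\, L_\epsilon}}{\sqrt{r^2 + b_3^2}}\, e^{i k_1 a_1 + i k_2 a_2}\, V(x')\, \varphi_-(x', k)\, d^3x'\, d^3k\, d^3x, \end{aligned} \tag{78}$$
with $k \sin\vartheta = |k_p| = \sqrt{k_1^2 + k_2^2}$, $k_3 = k \cos\vartheta$, where $\vartheta$ ($0 \le \vartheta \le \pi$) is the angle between $k$ and $e_3$ and $\beta$ is the angle between $k_p = (k_1, k_2, 0)$ and $e_r$. Moreover, there is an angle $0 < \alpha < \frac{\pi}{2}$ such that $\vartheta \le \alpha$, i.e.
$$\cos\alpha \le \cos\vartheta \le 1, \quad 0 \le \sin\vartheta \le \sin\alpha, \quad 0 < \alpha < \frac{\pi}{2} \tag{79}$$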

for all $k$'s in the support of $f_1$ (cf. (69)).

We introduce now spherical coordinates $(k, \omega)$ for $k$ as integration variables and do first the integration over $k$ (note that $\beta$ is not $k$-dependent). Since $\widehat{\psi}_\epsilon \in \mathcal{S}(\mathbb{R}^3)$, $f_1$ is smooth and $\frac{\partial}{\partial k} \varphi_-(x', k)$ is uniformly bounded in $k$ (Proposition 2 (iv)), we can do two integrations by parts with respect to $k$ and obtain that
$$I_1 = -\frac{1}{(2\pi)^{5/2}} \int_{\mathbb{R}^3} (\psi_\epsilon)^*(x) \int_{\mathbb{R}^3} V(x') \int_{S^2} \int_0^\infty \frac{e^{ik\lambda}}{\sqrt{r^2 + b_3^2}\, \lambda^2} \cdot \frac{\partial^2}{\partial k^2} \left( \widehat{\psi}_\epsilon(k) f_1(k)\, \varphi_-(x', k)\, e^{i k_1 a_1 + i k_2 a_2}\, k^2 \right) dk\, d\Omega\, d^3x'\, d^3x, \tag{80}$$
where
$$\lambda := r \left( \sqrt{1 + \frac{b_3^2}{r^2}} - \sin\vartheta \cos\beta \right) + \cos\vartheta\, L_\epsilon. \tag{81}$$

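To estimate the derivatives of the functions $f_1(k)\varphi_-(x', k)$ we use Proposition 2, (iv) and the smoothness of $f_1(k)$. We introduce a multi-index notation $i := (i_1, i_2, i_3, i_4)$, $i_m \in \mathbb{N}_0$, $|i| := i_1 + i_2 + i_3 + i_4$, and $j := (j_1, j_2, j_3)$ analogously. With $k_l = \kappa_l k$, $\kappa_l \in [-1, 1]$, $l = 1, 2$, we obtain that
$$\begin{aligned} \left| \frac{\partial^2}{\partial k^2} \left( f_1(k)\varphi_-(x', k)\, \widehat{\psi}_\epsilon(k)\, k^2 e^{i k_1 a_1 + i k_2 a_2} \right) \right| &\le 2 \sum_{|i|=2} \left| \frac{\partial^{i_1}}{\partial k^{i_1}} \left( f_1(k)\varphi_-(x', k) \right) \right| \left| \frac{\partial^{i_2}}{\partial k^{i_2}} \left( \widehat{\psi}_\epsilon(k) k^2 \right) \right| \left| \frac{\partial^{i_3}}{\partial k^{i_3}} e^{i \kappa_1 k a_1} \right| \left| \frac{\partial^{i_4}}{\partial k^{i_4}} e^{i \kappa_2 k a_2} \right| \\ &\le c \sum_{|i|=2} (1 + |x'|)^{i_1} \left| \frac{\partial^{i_2}}{\partial k^{i_2}} \left( \widehat{\psi}_\epsilon(k) k^2 \right) \right| |\kappa_1 a_1|^{i_3} |\kappa_2 a_2|^{i_4} \\ &\le c \sum_{|i|=2} (1 + |x'|)^{i_1} \left| \frac{\partial^{i_2}}{\partial k^{i_2}} \left( \widehat{\psi}_\epsilon(k) k^2 \right) \right| |a|^{i_3} |a|^{i_4} \\ &\le c \sum_{|i|=2} (1 + |x'|)^{i_1} \left| \frac{\partial^{i_2}}{\partial k^{i_2}} \left( \widehat{\psi}_\epsilon(k) k^2 \right) \right| |x - x'|^{i_3 + i_4} \\ &\le c \sum_{|j|=2} (1 + |x'|)^{j_1} \left| \frac{\partial^{j_2}}{\partial k^{j_2}} \left( \widehat{\psi}_\epsilon(k) k^2 \right) \right| |x - x'|^{j_3}. \end{aligned} \tag{82}$$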

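With (79) we may assume that $\lambda$ in (81) is bounded below,
$$\lambda \ge r(1 - \sin\alpha) + L_\epsilon \cos\alpha \ge \lambda_{min} := \eta (r + L_\epsilon), \tag{83}$$
with $\eta := \min((1 - \sin\alpha), \cos\alpha) > 0$. Using (83) and (82) in (80) we obtain that
$$M := \int_{Y_{L_\epsilon}} |I_1|\, d^2y \le c \sum_{|j|=2} \int_{\mathbb{R}^2} \int_{\mathbb{R}^3} |\psi_\epsilon(x)| \int_{\mathbb{R}^3} |V(x')| \int_{S^2} \int_0^\infty \frac{1}{\sqrt{r^2 + b_3^2}\, \lambda_{min}^2} \left| \partial_k^{j_2} \left( \widehat{\psi}_\epsilon(k) k^2 \right) \right| |x - x'|^{j_3} (1 + |x'|)^{j_1}\, dk\, d\Omega\, d^3x'\, d^3x\, d^2y. \tag{84}$$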

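Since the integrand of the right-hand side of (84) is positive, we may perform the change of integration variables $(y_1, y_2) \to (\tilde{y}_1, \tilde{y}_2) \to (r, \theta)$, as well as freely interchange the order of integrations. With (83) we then obtain that
$$\begin{aligned} M &\le c \sum_{|j|=2} \int_{\mathbb{R}^3} |\psi_\epsilon(x)| \int_{\mathbb{R}^3} |V(x')| \int_{S^2} \int_0^\infty \int_0^\infty \int_0^{2\pi} \frac{1}{\sqrt{r^2 + b_3^2}\, \lambda_{min}^2} \left| \partial_k^{j_2} \left( \widehat{\psi}_\epsilon(k) k^2 \right) \right| |x - x'|^{j_3} (1 + |x'|)^{j_1}\, r\, d\theta\, dr\, dk\, d\Omega\, d^3x'\, d^3x \\ &\le c \sum_{|j|=2} \int_{\mathbb{R}^3} |\psi_\epsilon(x)| \int_{\mathbb{R}^3} |V(x')| \int_{S^2} \int_0^\infty \int_0^\infty \frac{1}{\eta^2 (r + L_\epsilon)^2} \left| \partial_k^{j_2} \left( \widehat{\psi}_\epsilon(k) k^2 \right) \right| |x - x'|^{j_3} (1 + |x'|)^{j_1}\, dr\, dk\, d\Omega\, d^3x'\, d^3x \\ &= \frac{c}{\eta^2 L_\epsilon} \sum_{|j|=2} \int_{\mathbb{R}^3} |\psi_\epsilon(x)| \int_{\mathbb{R}^3} |V(x')| \int_{S^2} \int_0^\infty \left| \partial_k^{j_2} \left( \widehat{\psi}_\epsilon(k) k^2 \right) \right| |x - x'|^{j_3} (1 + |x'|)^{j_1}\, dk\, d\Omega\, d^3x'\, d^3x. \end{aligned} \tag{85}$$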

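Using that $|x - x'|^{j_3} \le 2(|x|^{j_3} + |x'|^{j_3})$ for $j_3 = 1, 2$ we obtain that
$$M \le \frac{c}{L_\epsilon} \sum_{|j|=2} \int_{\mathbb{R}^3} |\psi_\epsilon(x)| (1 + |x|)^{j_3} \int_{\mathbb{R}^3} \left| \partial_k^{j_2} \left( \widehat{\psi}_\epsilon(k) k^2 \right) \right| \int_{\mathbb{R}^3} |V(x')| (1 + |x'|)^{j_1 + j_3}\, d^3x'\, dk\, d\Omega\, d^3x. \tag{86}$$
Since $V \in (V)_5$ (so that $V \in L^2(\mathbb{R}^3)$ and $|V(x)| \le C \langle x \rangle^{-5-\delta}$, $\delta > 0$, for $|x| > R_0$) and $j_1 + j_3 \le 2$ the $x'$ integration is finite and we obtain (by dividing the integration region for $x'$ into two parts, $|x'| > R_0$ and $|x'| \le R_0$)
$$M \le \frac{c}{L_\epsilon} \sum_{j_2 + j_3 \le 2} \int_{\mathbb{R}^3} |\psi_\epsilon(x)| (1 + |x|)^{j_3} \int_{\mathbb{R}^3} \left| \partial_k^{j_2} \left( \widehat{\psi}_\epsilon(k) k^2 \right) \right| dk\, d\Omega\, d^3x. \tag{87}$$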

Using (40), (41) and that ψ ∈ S(R3 ) one finds by simple calculation that c 1 |ψ (x)|x j3 d 3 x ≤ 3 j 2 3

(88)

R3

and

3 1 j2 ∂k ψ (k)k 2 dkd ≤ c 2 j2 .

R3

(89)

A Microscopic Derivation of the Quantum Mechanical Formal Scattering Cross Section


Since $j_2 + j_3 \le 2$ we see with (88), (89) and (42) that for $M$ in (87) we have for small $\varepsilon$ the bound

\[ M \le \frac{c}{\varepsilon^2 L} = c\, \varepsilon^{l-2}. \tag{90} \]

Since $l > 2$, this completes the proof of (63).
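The power counting behind (88) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the Gaussian $\psi(x) = e^{-|x|^2/2}$ and the scaling $\psi_\varepsilon(x) = \varepsilon^{3/2}\psi(\varepsilon x)$ (the form we assume for the definitions (40), (41), which are not reproduced in this section), and the function names are hypothetical. It estimates the exponent of $\varepsilon$ in $\int |\psi_\varepsilon(x)|\, x^{j_3}\, d^3x$, which should come out as $-(3/2 + j_3)$.

```python
import math

def scaled_moment(eps, j, n=20_000):
    """Midpoint-rule value of the integral of |psi_eps(x)| |x|^j over R^3.

    Assumed (illustrative) choices: psi(x) = exp(-|x|^2 / 2) and
    psi_eps(x) = eps**1.5 * psi(eps * x).
    """
    r_max = 12.0 / eps          # the Gaussian tail beyond this radius is negligible
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        # radial integrand: |psi_eps(r)| * r^j * r^2 (the r^2 comes from d^3x)
        total += eps ** 1.5 * math.exp(-((eps * r) ** 2) / 2.0) * r ** (j + 2)
    return 4.0 * math.pi * total * h

def fitted_exponent(j, e1=0.1, e2=0.05):
    """Slope of log(moment) against log(eps) between two small values of eps."""
    return math.log(scaled_moment(e1, j) / scaled_moment(e2, j)) / math.log(e1 / e2)

if __name__ == "__main__":
    for j in (0, 1, 2):
        print(j, fitted_exponent(j))
```

Combined with the analogous count $\varepsilon^{3/2 - j_2}$ in (89), the product of the two integrals scales like $\varepsilon^{-(j_2 + j_3)}$, which is at worst $\varepsilon^{-2}$ and gives the factor $1/\varepsilon^2$ in (90).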

We can now proceed with the evaluation of (56). With (8) we obtain for (56) σ () = lim |T ψ y (k)|2 d 2 yd 3 k →0 C A

2 2 −i k · y e T (k, k )ψ (k )k d(k ) d 2 yd 3 k =lim 4π →0 C A k =k 2 2 −i(k1 y1 +k2 y2 −k3 L ) e T (k, k )ψ (k )k d(k )dy1 dy2 d 3 k, =lim 4π →0 C y p < D 2

k =k

(91) where y p := (y1 , y2 ). We insert again the identity f 1 + f 2 ≡ 1 and obtain for σ ()

lim 4π 2

→0

C y p < D 2

2 −i(k1 y1 +k2 y2 −k3 L ) e T (k, k )ψ (k )( f 1 (k ) + f 2 (k ))k d(k ) k =k

×dy1 dy2 d 3 k.

(92)

Multiplying out we get four terms. The main term is

lim 4π 2

→0

C y p < D 2

2 −i(k1 y1 +k2 y2 −k3 L ) e T (k, k )ψ (k ) f 1 (k )k d(k ) dy1 dy2 d 3 k. k =k

(93) Before we evaluate (93) we show that the three other terms are zero. Noting that T (k, k ) is bounded (Corollary 1) and that ψ ∈ S(R3 ) we obtain that c −i(k1 y1 +k2 y2 −k3 L ) e T (k, k )ψ (k ) f i (k )k d(k ) ≤ 3 k, i = 1, 2, 2 k =k c k − k0 −i(k1 y1 +k2 y2 −k3 L ) e T (k, k )ψ (k ) f 2 (k )k d(k ) ≤ 3 k ψ 2 k =k

× f 2 (k )d(k ).

k =k

(94)


Using (94), the difference between (93) and (92) is no greater than c 3

k − k0 2 2 3 ψ f 2 (k )k d(k )d yd k

C y p < D k =k 2

≤ ≤

c 3+2d

k − k0 2 3 ψ f 2 (k )k d k

R3

c 3+2d

| k − k0 |≥

k0 3

k − k0 2 3 ψ k d k .

(95)

Using that $|\hat\psi(k)| \le c_n / k^n$ for any $6 \le n \in \mathbb{N}$, we see that the right-hand side in (95) is bounded by $c\, \varepsilon^{n-3-2d}$, which tends to zero as $\varepsilon \to 0$ for sufficiently large $n$. Thus the three other terms are zero. Since, as we shall show,

lim 4π 2

→0

C y p ≥ D 2

2 −i(k1 y1 +k2 y2 −k3 L ) e T (k, k )ψ (k ) f 1 (k )k d(k ) k =k

×dy1 dy2 d 3 k = 0,

(96)

we may extend the y-integration in (93) to all of R2 , so that 2 2 −i(k1 y1 +k2 y2 −k3 L ) σ () = lim 4π e T (k, k )ψ (k ) f 1 (k )k d(k ) →0 2 C R

k =k

×dy1 dy2 d 3 k.

(97)

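Before establishing (96) we compute (97) with the help of the following

Lemma 4. Let $0 < \alpha < \frac{\pi}{2}$ and $\delta > 0$ be given. Suppose that $\phi : \mathbb{R}^3 \to \mathbb{C}$ is a function with support in the sector $P^{\alpha}_{e_3} := \{k \in \mathbb{R}^3 : k \cdot e_3 > k \cos\alpha\}$ such that $\int_{k=\delta} |\phi(k)|^2\, d\Omega(k) < \infty$. Then

\[ \int_{\mathbb{R}^2} \left| \frac{1}{2\pi} \int_{k=\delta} e^{-i k \cdot y} \phi(k)\, d\Omega(k) \right|^2 d^2y = \int_{k=\delta} \frac{1}{k k_3}\, |\phi(k)|^2\, d\Omega(k). \tag{98} \]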

Remark 11. This lemma is proved in [2], Lemma 7.17. The integration over the impact parameter is crucial for the derivation and is a standard ingredient in the derivation of the scattering cross section.


Because of Corollary 1, T (k, k ) is bounded on R3 ×R3 and continuous on R3 ×R3 \ (k) f 1 (k) has support in Peϑ2 with 0 < ϑ2 < π . (k) ∈ S(R3 ) and ψ {0}. Moreover, ψ 3 2 Hence, by Lemma 4, (97) becomes 1 T (k, k )2 ψ (k )2 f 1 (k )2 σ () = lim 16π 4 d(k )d 3 k →0 cos ϑ C k =k 1 T (k ω, k )2 ψ (k )2 f 1 (k )2 = lim 16π 4 d 3 k d, (99) →0 cos ϑ R3

where k3 = k cos ϑ . Because supp f 1 (k) ⊂ Peϑ32 with 0 < ϑ2 < π2 , there exists a δ > 0 2 such that δ < cos ϑ . Hence the integral in (99) is finite (it is ≤2 c ψ 3 ). Thus, since 2 clearly |ψ (k)| → δ(k − k0 ) (in the sense that lim |ψ (k)| g(k)d k = g(k0 ) for →0

any bounded continuous function $g$), and since $T(k'\omega, k')$, $f_1(k')$ and $\frac{1}{\cos\vartheta'}$ are bounded and continuous as functions of $k'$, we may conclude that

\[ \sigma(\Sigma) = 16\pi^4 \int_{\Sigma} |T(k_0\omega, k_0)|^2\, d\Omega. \tag{100} \]

The proof of Theorem 1 and Theorem 2 will thus be complete once we establish (96). Changing variables, (96) follows from 2 y1 y2 1 −i(k1 d +k2 d −k3 L ) 3 e T(k, k ) ψ (k ) f (k )k d(k ) lim 1 dy1 dy2 d k = 0. 2d →0 3 D R yp≥ 2

k =k

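(101) Equation (101) is the content of

Lemma 5. Let $V \in (V)_5$, $\psi \in S(\mathbb{R}^3)$ and suppose that $k_0 > 0$. Let $l > 2$, $d > 2l - 3$ and let $M$ be given by (to simplify the notation we interchange $k$ and $k'$)

\[ M = M(y_1, y_2, k', \varepsilon) := \int_{k=k'} e^{-i\left(k_1 \frac{y_1}{\varepsilon^d} + k_2 \frac{y_2}{\varepsilon^d} - k_3 L\right)}\, T(k', k)\, \hat\psi_\varepsilon(k)\, f_1(k)\, k\, d\Omega(k). \tag{102} \]

Then for any $D > 0$,

\[ \lim_{\varepsilon \to 0} \int_{\mathbb{R}^3} \int_{y_p \ge D} \frac{1}{\varepsilon^{2d}}\, |M|^2\, dy_1\, dy_2\, d^3k' = 0. \tag{103} \]

Proof. We will establish the following inequality (104) giving a bound on $M$: There exists a $c < \infty$ such that

\[ |M|^2 \le c\, \chi_{\left[\frac{k_0}{2}, \frac{3}{2}k_0\right]}(k')\, \frac{\varepsilon^{4d+5-4l}}{y_p^4} \left( \frac{1}{1 + \frac{|k' - k_0|}{\varepsilon}} \right)^2. \tag{104} \]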


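Assuming (104) we show now that (103) follows. Using (104), the integral in (103) is dominated by

\[
\begin{aligned}
c\, \varepsilon^{2d+5-4l} \int_{\frac{k_0}{2} \le k' \le \frac{3}{2}k_0} \int_{y_p \ge D} \frac{1}{y_p^4} \left( \frac{1}{1 + \frac{|k' - k_0|}{\varepsilon}} \right)^2 d^2y\, d^3k'
&\le c\, \varepsilon^{2d+5-4l} \int_{-\infty}^{\infty} \frac{dk'}{\left(1 + \frac{|k' - k_0|}{\varepsilon}\right)^2} \\
&= c\, \varepsilon^{2d+6-4l} \int_{-\infty}^{\infty} \frac{dk'}{(1 + |k'|)^2} = c\, \varepsilon^{2d+6-4l}.
\end{aligned}
\tag{105}
\]

Since $d > 2l - 3$ there is a $\delta > 0$ such that $d = 2l - 3 + \delta$. Then (105) is of order $\varepsilon^{2\delta}$ and (103) follows. It thus remains to establish (104). Changing variables in (102) from $\omega$ to $k_1, k_2$ we obtain, with the Jacobian determinant $\frac{1}{k' k_3}$, with $k_3 = k_3(k_1, k_2) = \sqrt{k'^2 - k_1^2 - k_2^2}$ and $k_+ = (k_1, k_2, k_3(k_1, k_2))$,

\[
\begin{aligned}
M &= \int_{k_1^2 + k_2^2 \le k'^2} e^{-i\left(k_1 \frac{y_1}{\varepsilon^d} + k_2 \frac{y_2}{\varepsilon^d} - k_3 L\right)}\, T(k', k_+)\, \hat\psi_\varepsilon(k_+)\, f_1(k_+)\, k'\, \frac{1}{k' k_3}\, dk_1\, dk_2 \\
&= \frac{1}{\varepsilon^{3/2}} \int_{k_1^2 + k_2^2 \le k'^2} e^{-i\left(k_1 \frac{y_1}{\varepsilon^d} + k_2 \frac{y_2}{\varepsilon^d}\right)}\, T(k', k_+)\, \hat\psi\!\left( \frac{k_+ - k_0}{\varepsilon} \right) e^{i k_3 L}\, \frac{f_1(k_+)}{k_3}\, dk_1\, dk_2 \\
&=: \frac{1}{\varepsilon^{3/2}} \int_{k_1^2 + k_2^2 \le k'^2} e^{-i\left(k_1 \frac{y_1}{\varepsilon^d} + k_2 \frac{y_2}{\varepsilon^d}\right)}\, g(k_1, k_2, k', \varepsilon)\, dk_1\, dk_2.
\end{aligned}
\tag{106}
\]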

Performing two integration by parts with respect to k p := (k1 , k2 ), we obtain (using the fact that f 1 (k+ ) and its derivatives vanish on the boundary of the region of integration) that y y yp 1 d −i(k1 d1 +k2 d2 ) |M| = 3 ∇k p e · 2 f 1 (k+ )g(k1 , k2 , k , )dk1 dk2 y p 2 k p ≤k y y 1 d −i(k1 d1 +k2 d2 ) y p = 3 e · ∇ g(k , k , k , )dk dk 1 2 1 2 k p y 2p 2 k p ≤k y y yp yp 1 2d −i(k1 d1 +k2 d2 ) = 3 ∇k p e · 2 2 · ∇ k p g(k1 , k2 , k , )dk1 dk2 y y p p 2 k p ≤k


y y y y 1 2 1 2d p −i(k1 d +k2 d ) p = 3 e · ∇ k p 2 · ∇ k p g(k1 , k2 , k , )dk1 dk2 2 y y p p 2 k p ≤k ≤

1 2d 3 2 2 yp

2 ∂k ∂k g(k1 , k2 , k , ) dk1 dk2 . i j kp

≤k

(107)

i, j=1

We estimate now the derivatives of g on the support of f 1 . Note first that on supp f 1 k3 > k0 /2. Using Corollary 1 we have for i, j = 1, 2 that ∂k T (k , k+ ) ≤ c, (108) sup |T (k , k+ )| ≤ c, sup i k ∈R3 , k+ ∈supp f 1

sup

k ∈R3 , k+ ∈supp f 1

k ∈R3 , k+ ∈supp f 1

∂k ∂k T (k , k+ ) ≤ c. i j

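To estimate the wave function $\hat\psi\!\left( \frac{\mathbf{k}_+ - \mathbf{k}_0}{\varepsilon} \right)$ and its derivatives we introduce the following notation, in which boldface denotes a vector and lightface its modulus:

\[ P_k := \frac{1}{1 + \frac{|k - k_0|}{\varepsilon}}, \qquad P_{\mathbf{k}} := \frac{1}{1 + \frac{|\mathbf{k} - \mathbf{k}_0|}{\varepsilon}}. \tag{109} \]

Since $|k - k_0| \le |\mathbf{k} - \mathbf{k}_0|$, clearly

\[ P_{\mathbf{k}} \le P_k. \tag{110} \]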

and its derivatives decay faster than the reciprocal of any polynoSince ψ ∈ S(R3 ), ψ mial, we can find for k+ ∈ supp f 1 and for n ∈ N suitable constants such that k+ − k0 k+ − k0 c n k+ − k0 c n n ψ ≤ c Pk+ , ∂ki ψ ≤ Pk+ , ∂ki ∂k j ψ ≤ 2 Pk+ . (111)

The derivatives of the third factor e−ik3 L of g can be estimated on supp f 1 as follows: |ki | −ik3 L ≤ L |ki |. e ≤ 1, ∂ki e−ik3 L ≤ L |k3 |

(112)

Since |ki |Pk+ ≤ , we obtain using (111) with n = j + 1 and (42) that

k − k c + 0 j j j ∂k e−ik3 L ψ ≤ cL |ki |Pk+ Pk+ ≤ cL Pk+ = l−1 Pk+ , j arbitrary. i (113) With a similar calculation we find that

k − k c + 0 j ∂k ∂k e−ik3 L ψ ≤ 2l−2 Pk+ , j arbitrary, i j

(114)


k+ − k0 . Clearly and analogous estimates for terms which contains derivatives of ψ we have that f 1 (k+ ) ≤ c, sup ∂k f 1 (k+ ) ≤ c, sup ∂k ∂k f 1 (k+ ) ≤ c, i, j = 1, 2. sup i i j k3 k3 k3 k+ ∈supp f 1 k+ ∈supp f 1 k+ ∈supp f 1 (115) Combining (108), (111)–(115) and using that 2l − 2 > 2 since l > 2 we obtain for all k ∈ R3 and any n ∈ N that ∂k ∂k g(k1 , k2 , k , ) ≤ i j

c Pn , 2l−2 k+

(116)

for all (k1 , k2 ) such that k+ ∈ supp f 1 . Reintroducing the original integration variable ω we then have that |M| ≤

c 2d−2l+ 1 2 y 2p

χ{ f1 >0} Pkn k k3 d(k)

k=k

1 c ≤ 2 2d−2l+ 2 χ k0 3 (k ) yp 2 , 2 k0

Pkn d(k).

k=k ,| k− k0 |<

(117)

k0 2

Choosing n = 4 in (117) and splitting Pk4 into Pk4 = Pk1 Pk3 ≤ Pk1 Pk3 ,

(118)

we obtain that |M| ≤

c 2d+ 1 −2l

2 χ k0 3 (k )Pk1 y 2p 2 , 2 k0

Pk3 d(k). k=k ,| k− k0 |<

(119)

k0 2

Moreover, it is easy to see that

Pk d(k) ≤ c 3

k k=k ,| k− k0 |< 20

R2

1 1+

kp 3

dk1 dk2 ≤ c 2 .

(120)

Thus |M| ≤

c 2d+ 5 −2l

2 χ k0 3 (k )Pk1 y 2p 2 , 2 k0

and (104) follows. This completes the proof of Lemma 5.

(121)


9. Summary and Outlook

The purpose of this paper has been to rigorously derive the standard formula for the scattering cross section starting from a microscopic model of a scattering experiment. While the use of Bohmian mechanics is crucial for our result, we would like to stress that major parts of our proof are vital even from an orthodox point of view. These parts concern in particular the replacement of the incoming asymptote by its scattering state (cf. Lemma 3 and Remark 10) and the flux-across-surfaces theorem in a formulation which depends only on the smoothness of the scattering state (cf. Proposition 3, Lemma 1 and [11]). Several problems have been left for future work, which we shall mention here.

• Bound states: Our assumption A3 arises from the problem that in general the translation of the initial wave function by the impact parameter y (which is needed for the averaging over the beam profile) will produce wave functions which have a component in the bound states. One would then have to show that asymptotically the crossing statistics are induced by the "relevant part" of the wave function, namely $P\psi$, where $P$ is the projection onto the absolutely continuous subspace $H_{a.c.}(H)$ and is given by $P := \Omega_- \Omega_-^*$. Note that by using Lemma 3 one can also show that

\[ \lim_{L \to \infty} \int_{Y_L} \| P\psi_y - \psi_y \|^2\, d^2y = 0, \tag{122} \]

i.e., that the bound state component is small in an $L^2$-sense. This is however not directly applicable.

• It would of course be desirable to derive the crossing statistics for many particles guided in general by an entangled wave function, both for the noninteracting case and eventually even for interacting particles [13].

• We are currently working [8] on a detailed formulation of the conditions characterizing the scattering regime, which turns out to be surprisingly intricate. What we have shown here is that the simplest limiting procedure that brings the experimental arrangement into the scattering regime yields the standard formula of formal scattering theory. This formula should of course hold much more generally (more or less for all limits corresponding to the scattering regime), but establishing that this is so remains a formidable challenge.

Acknowledgements. The work of S. Goldstein was supported in part by NSF Grant DMS-0504504. The work of T. Moser was supported by the DFG (DU 120/10). The work of N. Zanghì was supported by INFN.


10. Appendix

Proof of Lemma 1. Let $\psi \in G$. Then there is a $\chi \in G^0$ and a $t \in \mathbb{R}$ such that $\psi = e^{-iHt}\chi$. Using the intertwining property (6) we obtain

\[ \psi_{out} = \Omega_+^{-1}\psi = \Omega_+^{-1} e^{-iHt}\chi = e^{-iH_0 t}\, \Omega_+^{-1}\chi = e^{-iH_0 t}\chi_{out}. \tag{123} \]

Since $G^+$ is invariant under time shifts it suffices to show that $\hat\chi_{out}(k)$ is in $G^+$. Since $\langle x \rangle^2 H^n\chi(x) \in L^2(\mathbb{R}^3)$, $0 \le n \le 8$, and $\langle x \rangle^4 H^n\chi(x) \in L^2(\mathbb{R}^3)$, $0 \le n \le 3$, we have

\[ H^n\chi(x) \in L^1(\mathbb{R}^3) \cap L^2(\mathbb{R}^3),\ 0 \le n \le 8, \qquad x^j H^n\chi(x) \in L^1(\mathbb{R}^3) \cap L^2(\mathbb{R}^3),\ 0 \le n \le 3,\ j \in \{1, 2\}. \tag{124} \]

Using Proposition 1 (ii), (iii) we have for $f \in L^2(\mathbb{R}^3)$:

\[ F_+ \Omega_+ f = F f, \tag{125} \]

and hence for $\chi = \Omega_+ \chi_{out}$ we have that

\[ \hat\chi_{out}(k) = F_+\chi(k) = (2\pi)^{-3/2} \int \varphi_+^*(x, k)\, \chi(x)\, d^3x. \tag{126} \]

Using the intertwining property (6) we thus have:

\[ \frac{k^2}{2}\, \hat\chi_{out}(k) = \widehat{H_0 \chi_{out}}(k) = F(H_0 \Omega_+^{-1}\chi)(k) = F(\Omega_+^{-1} H\chi)(k) = F_+(H\chi)(k) = (2\pi)^{-3/2} \int \varphi_+^*(x, k)\, (H\chi)(x)\, d^3x. \tag{127} \]

Similarly, applying $H_0^n$ to $\chi_{out}$ ($0 \le n \le 8$) we obtain

\[ \frac{k^{2n}}{2^n}\, \hat\chi_{out}(k) = (2\pi)^{-3/2} \int \varphi_+^*(x, k)\, (H^n\chi)(x)\, d^3x. \tag{128} \]

Since the generalized eigenfunctions are bounded (Proposition 2 (ii)) and $H^n\chi \in L^1(\mathbb{R}^3)$, $0 \le n \le 8$, we obtain

\[ |\hat\chi_{out}(k)| \le c(1 + k)^{-16} \le c(1 + k)^{-15}. \tag{129} \]

Because of Proposition 2 (iii) and (124) we can differentiate (126) with respect to ki and get ∗ 3 3 ∂k χ = (2π )− 23 (k) ϕ (x, k) χ (x)d x ∂ out k i i + ≤ c, ∀k ∈ R \ {0}. (130) Differentiating (128) with n = 3 with respect to ki we obtain 3 ki k 6 ∂k i χ out (k) = 8(2π )− 2 out (k) . (131) ∂ki ϕ+∗ (x, k) (H 3 χ )(x)d 3 x − 6k 5 χ k


Again the right-hand side is bounded because of Lemma 2 (iii), (124) and (129). Hence, we obtain with (130): −6 ∂k χ , ∀k ∈ R3 \ {0}. (132) i out (k) ≤ c(1 + k) Using Proposition 2 (iii) and (126) we may control κ times a second derivative of χ out (k), obtaining − 23 ∗ 3 3 κ∂k ∂k χ (2π ) = (k) ∂ ϕ (x, k) χ (x)d x κ∂ k j ki + j i out ≤ c, ∀k ∈ R \ {0}. (133) For the last inequality we have also used (124) with j = 2 and n = 0. Similarly, using (131) we obtain 6 − 23 k κ∂k j ∂ki χ out (k) = 8(2π ) κ∂k j ∂ki ϕ+∗ (x, k) (H 3 χ )(x)d 3 x k j ki ki κ χout (k) − 6k 5 κ∂k j χ out (k) k k k kδi j k − ki k j kj out (k)κ − 6k 5 κ∂ki χ out (k), − 6k 5 χ k3 k − 30k 4

(134)

with right-hand side that is bounded because of Proposition 2 (iii), (124), (129) and (132). Hence, using (133), α −6 κ∂ χ ≤ c(1 + k)−5 , |α| = 2, ∀k ∈ R3 \ {0}. (135) k out (k) ≤ c(1 + k) Equation (132) implies also that |∂k χ out (k)| ≤ c(1 + k)−6 , ∀k ∈ R3 \ {0}.

(136)

Similarly, twice differentiating (126) with respect to k we obtain that 2 out (k) ≤ c, ∀k ∈ R3 \ {0}, ∂k χ

(137)

and then twice differentiating (128) for n = 2 with respect to k we obtain 2 out (k) ≤ c(1 + k)−4 ≤ c(1 + k)−3 , ∀k ∈ R3 \ {0}, ∂k χ

(138)

using Proposition 2 (iv), (124), (129), (136) and (137). With (129), (132), (135) and (138) we see that χ out (k) ∈ G + . Proof of Lemma 2. In the proof of Proposition 3 in [11] the absolute value of the flux integrated over time and the surface RS 2 with R > R0 (with some R0 > 0 depending on the potential) is shown to be bounded (uniformly in R) by linear combinations of out (k) and its derivatives, namely integrals over expressions correintegrals involving ψ sponding to the left hand side of the inequalities in Definition 2. Thus these bounds are out (k) ∈ G + . To bound the integrated flux uniformly for all ψ y , y ∈ A (and finite if ψ

small enough and fixed), F ψ y,out (k) = F −1 + ψ y (k) (note that ψ y ∈ Ha.c. (H ), for all y ∈ A , cf. (i) in Definition 3 or 4) must be bounded as in Definition 2 with


constants uniform in y ∈ A . These constants depend, according to the proof of Lemma 1, on the norms of H n ψ y 1 , 0 ≤ n ≤ 8 and x j H n ψ y 1 , 0 ≤ n ≤ 3,

j ∈ {1, 2}.

(139)

We will show that for small enough there exists a constant C > 0 such that |H n ψ y (x)| ≤ C(1 + x)−6 , 0 ≤ n ≤ 8, ∀ y ∈ A .

(140)

Thus the norms in (139) are bounded uniformly in y ∈ A and Lemma 2 follows. It remains to establish (140). We start with n = 0. Since ψ ∈ S(R3 ) and y ∈ A , A compact, we obtain 3

|ψ y (x)| = 2 |ψ((x − y))| ≤ c(1 + |x − y|)−6 ≤ c(1 + x)−6 , ∀ y ∈ A . (141) For n = 1 we have with ψ y ≡ T y ψ (T y is the translation operator) and [T y , H0 ]− = 0, 3

|H ψ y (x)| = |(H0 + V )T y ψ (x)| = |T y H0 ψ (x)| + 2 |V (x)ψ((x − y))|. Using now |V (x)| < M < ∞ for V ∈ V or

sup

x ∈supp ψ y

(142)

|V (x)| < M < ∞ for ψ ∈

C0∞ (R3 ), V ∈ V , y ∈ A and small enough, we obtain together with (141), |H ψ y (x)| ≤ |T y H0 ψ (x)| + c(1 + x)−6 .

(143)

Since ψ ∈ S(R3 ) we have that also H0 ψ ∈ S(R3 ) so that analogously to (141), there is the bound |T y H0 ψ (x)| ≤ c(1 + x)−6 , ∀ y ∈ A .

(144)

Equations (143) and (144) yield (140) for n = 1. Analogously, we obtain (140) for 2 ≤ n ≤ 8 by using the fact that ψ ∈ S(R3 ) and |∂ xα V (x)| < M < ∞, ∀ |α| ≤ 14, if V ∈ V or sup |∂ xα V (x)| < M < ∞, ∀ |α| ≤ 14, for all y ∈ A and small enough if ψ

x ∈supp ψ y ∈ C0∞ (R3 )

and V ∈ V .

References

1. Albeverio, S., Gesztesy, F., Høegh-Krohn, R., Holden, H.: Solvable Models in Quantum Mechanics. Berlin Heidelberg New York: Springer, 1988
2. Amrein, W.O., Jauch, J.M., Sinha, K.B.: Scattering Theory in Quantum Mechanics. London: W. A. Benjamin, Inc., 1977
3. Berndl, K.: Zur Existenz der Dynamik in Bohmschen Systemen. Ph.D. thesis, Ludwig-Maximilians-Universität München, 1994
4. Berndl, K., Dürr, D., Goldstein, S., Peruzzi, G., Zanghì, N.: On the global existence of Bohmian mechanics. Commun. Math. Phys. 173(3), 647–673 (1995)
5. Bohm, D.: A suggested interpretation of the quantum theory in terms of "hidden" variables I, II. Phys. Rev. 85, 166–179, 180–193 (1952)
6. Combes, J.-M., Newton, R.G., Shtokhamer, R.: Scattering into cones and flux across surfaces. Phys. Rev. D 11(2), 366–372 (1975)
7. Dürr, D.: Bohmsche Mechanik als Grundlage der Quantenmechanik. Berlin Heidelberg New York: Springer, 2001
8. Dürr, D., Goldstein, S., Moser, T., Zanghì, N.: What does quantum scattering theory physically describe? In preparation
9. Dürr, D., Goldstein, S., Teufel, S., Zanghì, N.: Scattering theory from microscopic first principles. Physica A 279, 416–431 (2000)
10. Dürr, D., Goldstein, S., Zanghì, N.: Quantum Equilibrium and the Origin of Absolute Uncertainty. J. Stat. Phys. 67, 843–907 (1992)
11. Dürr, D., Moser, T., Pickl, P.: The Flux-Across-Surfaces Theorem under conditions on the scattering state. J. Phys. A: Math. Gen. 39, 163–183 (2006)
12. Dürr, D., Pickl, P.: Flux-across-surfaces theorem for a Dirac particle. J. Math. Phys. 44(2), 423–465 (2003)
13. Dürr, D., Teufel, S.: On the exit statistics theorem of many particle quantum scattering. In: Blanchard, P., Dell'Antonio, G.F. (eds.) Multiscale Methods in Quantum Mechanics: Theory and Experiment. Boston: Birkhäuser, 2003
14. Ikebe, T.: Eigenfunction expansion associated with the Schrödinger operators and their applications to scattering theory. Arch. Rat. Mech. Anal. 5, 1–34 (1960)
15. Jensen, A., Kato, T.: Spectral properties of Schrödinger operators and time-decay of the wave functions. Duke Math. J. 46(3), 583–611 (1979)
16. Kato, T.: Fundamental Properties of Hamiltonian Operators of Schrödinger Type. Trans. Amer. Math. Soc. 70(1), 195–211 (1951)
17. Newton, R.G.: Scattering Theory of Waves and Particles. Second Edition, Berlin Heidelberg New York: Springer, 1982
18. Pearson, D.B.: Quantum Scattering and Spectral Theory. San Diego: Academic Press, 1988
19. Reed, M., Simon, B.: Methods of Modern Mathematical Physics III: Scattering Theory. San Diego: Academic Press, 1979
20. Reed, M., Simon, B.: Methods of Modern Mathematical Physics I: Functional Analysis. Revised and enlarged ed., San Diego: Academic Press, 1980
21. Teufel, S.: The flux-across-surfaces theorem and its implications for scattering theory. Ph.D. thesis, Ludwig-Maximilians-Universität München, 1999
22. Teufel, S., Dürr, D., Münch-Berndl, K.: The flux-across-surfaces theorem for short range potentials and wave functions without energy cutoffs. J. Math. Phys. 40(4), 1901–1922 (1999)
23. Teufel, S., Tumulka, R.: A Simple Proof for Global Existence of Bohmian Trajectories. Commun. Math. Phys. 258(2), 349–365 (2005)
24. Tumulka, R.: Closed 3-Forms and Random Worldlines. Ph.D. thesis, Ludwig-Maximilians-Universität München, 2001
25. Weinberg, S.: Quantum Theory of Fields. Volume I: Foundations. Cambridge: Cambridge University Press, 1996
26. Yajima, K.: The $W^{k,p}$-continuity of wave operators for Schrödinger operators. J. Math. Soc. Japan 47(3), 551–581 (1995)

Communicated by A. Kupiainen

Commun. Math. Phys. 266, 699–714 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0046-9

Communications in

Mathematical Physics

Homogenization of Ornstein-Uhlenbeck Process in Random Environment

Gaël Benabou

Ceremade, UMR CNRS 7534, Université Paris IX - Dauphine, Place du Maréchal de Lattre de Tassigny, 75775 Paris Cedex 16, France. E-mail: [email protected]

Received: 1 September 2005 / Accepted: 20 March 2006
Published online: 13 July 2006 – © Springer-Verlag 2006

Abstract: We consider a tracer particle moving in a random environment. The velocity of the tracer is modelled by an Ornstein-Uhlenbeck process which takes into account inertia and friction. The medium results in a possibly unbounded random potential. We prove an invariance principle for this kind of motion. The method used is generalized in order to obtain a central limit theorem for a large class of processes, the most interesting application being a tagged particle in a medium of infinitely many Ornstein-Uhlenbeck particles.

1. Introduction

An aqueous suspension of particles in equilibrium at temperature $T = \beta^{-1}$ can be modeled by a system of interacting Ornstein-Uhlenbeck particles, i.e. by the following system of stochastic differential equations:

\[ dx_i(t) = v_i(t)\,dt, \qquad m\,dv_i(t) = -\gamma v_i(t)\,dt - \sum_{j \ne i} \nabla U(x_i - x_j)\,dt + \sqrt{2\gamma\beta}\;dw_i(t), \tag{1} \]

where the $w_i$'s are independent Brownian motions. The mass of each particle is $m$, $U$ is a two-body interaction potential between particles, and $\gamma$ is the strength of the friction resulting from the fluid. In the overdamped case ($m\gamma^{-1}$ small), (1) is approximated by the model of interacting Brownian motions,

\[ dx_i(t) = -\frac{1}{\gamma} \sum_{j \ne i} \nabla U(x_i - x_j)\,dt + \sqrt{\frac{2\beta}{\gamma}}\;dw_i(t). \tag{2} \]


Both models are of diffusive type, and one issue is their macroscopic behaviour on a diffusive space-time scale. It has been proved (see [7, 12]) that both these models have the same macroscopic bulk diffusion. In this article, we consider the problem of self-diffusion in equilibrium, i.e. the diffusion of one tagged particle. In the overdamped case (2), an invariance principle for its motion has been proved in [6]. In the inertial case (1), however, the question remained an open problem. The purpose of this paper is the proof of a central limit theorem for the tagged particle motion (1). The comparison with interacting Brownian motions is discussed in a recent article by the author ([14]), in which it is proved that the self-diffusion matrix is strictly smaller in the inertial case. Moreover, the self-diffusion matrix converges to the one for the non-inertial case when the damping goes to infinity.

We tag one of the particles of the system (1) and denote by $x(t)$ its position at time $t$. We prove the following central limit theorem for the trajectory of this particle.

Theorem 1. In equilibrium, the process $\varepsilon x(\varepsilon^{-2} t)$ converges weakly, in the limit $\varepsilon \to 0$, to a Brownian motion with deterministic diffusion matrix which is defined in Proposition 4.1.

First we prove the central limit theorem in the case where the tagged particle is moving in a frozen random environment. Then, we extend this result to a larger class of dynamics, so that Theorem 1 can be seen as a particular case.

Let us introduce formally the frozen environment process. Let $(X, F, \mu)$ be a probability space on which a group of measure preserving transformations $G = \{\tau_x, x \in \mathbb{R}^d\}$ acts ergodically. The action is also assumed to be stochastically continuous. Under this hypothesis, we can define the infinitesimal generator $D$ of $G$, which is a closed unbounded operator over $L^2(\mu)$. Let $\tilde V(x, \eta)$ be a real-valued random potential given by a stationary random field on $\mathbb{R}^d \times X$, i.e. there exists $V$ such that $\tilde V(x, \eta) = V(\tau_{-x}\eta)$. We suppose $V \in L^2(\mu)$ and $DV \in L^2(\mu)$. $\tilde\gamma(x, \eta)$ is also a stationary random field, which satisfies $\tilde\gamma(x, \eta) = \gamma(\tau_{-x}\eta)$, $\inf_X \gamma = \gamma_* > 0$ and $\sup_X \gamma < \infty$. Throughout the article, the canonical Euclidean norm on $\mathbb{R}^d$ is denoted by $|\cdot|$, and $\cdot$ denotes the associated scalar product. For every $\eta \in X$, we consider on $\mathbb{R}^d$ the following system of stochastic differential equations

\[ dx(t) = v(t)\,dt, \qquad dv(t) = -\tilde\gamma(\eta, x(t))\, v(t)\,dt - \nabla_x \tilde V(\eta, x(t))\,dt + \sqrt{2\tilde\gamma(\eta, x(t))\beta}\;dw(t), \tag{3} \]

with initial conditions

\[ x(0) = 0, \qquad v(0) \sim G(dv) = \frac{e^{-|v|^2/(2\beta)}}{(2\pi\beta)^{d/2}}\,dv, \]

where $w$ is a standard Brownian motion on $\mathbb{R}^d$, and $\beta$ is the inverse temperature of the medium. We assume here the existence of the dynamics (in the situation described in [2], this existence can be proved, due to the boundedness of $\nabla_x \tilde V$), and we want to prove that it satisfies an invariance principle.

The usual way to treat this kind of problem is to introduce the environment process over $X \times \mathbb{R}^d$, say $(\eta(t), v(t)) = (\tau_{-x(t)}\eta, v(t))$. This is an autonomous Markov process with formal stationary measure $d\pi(\eta, v) = Z^{-1} e^{-V(\eta)/\beta} e^{-|v|^2/(2\beta)}\, d\mu(\eta)\, dv$, where $Z$ is a normalizing constant.
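The dynamics (3) can be explored numerically. The following is a minimal sketch, not the construction used in the paper: it works in dimension $d = 1$, takes a constant friction $\gamma$, replaces the random environment by the hypothetical frozen potential $V(x) = \sin x$, and uses a plain Euler-Maruyama step. All function and parameter names are illustrative. It returns the time average of $v(t)^2$, which under the stationary measure $\pi$ should be close to $\beta$, the variance of $G(dv)$.

```python
import math
import random

def simulate_velocity_variance(beta=1.0, gamma=1.0, dt=0.01, steps=200_000, seed=0):
    """Euler-Maruyama discretisation of (3) in d = 1 with constant friction.

    Assumptions made for illustration only: the frozen environment is the
    periodic potential V(x) = sin(x), so grad V(x) = cos(x).
    Returns (time average of v^2, final position x).
    """
    rng = random.Random(seed)
    grad_v = math.cos
    x = 0.0                                   # x(0) = 0
    v = rng.gauss(0.0, math.sqrt(beta))       # v(0) ~ G(dv), variance beta
    noise = math.sqrt(2.0 * gamma * beta * dt)
    acc = 0.0
    for _ in range(steps):
        x_new = x + v * dt
        v += -gamma * v * dt - grad_v(x) * dt + noise * rng.gauss(0.0, 1.0)
        x = x_new
        acc += v * v
    return acc / steps, x

if __name__ == "__main__":
    mean_v2, x_final = simulate_velocity_variance()
    print("time-averaged v^2:", mean_v2, "final x:", x_final)
```

Rescaling many such trajectories as $\varepsilon x(\varepsilon^{-2} t)$ and recording the variance at a fixed $t$ would give a rough numerical estimate of the effective diffusion constant; the sketch stops short of that to keep the run time small.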


In [5], Kipnis and Varadhan develop a scheme for proving central limit theorems for any stochastic dynamics with a reversible invariant measure. The problem of the tagged particle in a system of interacting particles with a reversible measure can also be treated with this method, cf. [5] for the symmetric simple exclusion, and [6] for interacting Brownian particles (2). Moreover, an extension of these results can be proved for nonreversible dynamics with sector condition, or weak sector condition (cf. [9, 10]; see also [11] for a review). This method is based on a martingale approximation and uses certain estimates on the resolvent.

The problems we investigate in this paper do not satisfy any sector condition. The generator of $(\eta(t), v(t))$ is not symmetric with respect to the invariant measure $d\pi$ and is also degenerate in the positional variables. However, the system (3) has already been studied in [2] in the case of constant $\gamma$. In their article, Papanicolaou and Varadhan assume $V$ to be bounded along with its derivatives. This assumption makes it impossible to use their method for the problem of a tagged particle in a system of interacting Ornstein-Uhlenbeck particles (1), since there is no bound on the number of particles interacting with the tracer particle.

The method of Papanicolaou and Varadhan is inspired by the methods used in [5]. The authors study the resolvent equation corresponding to the generator of the environment process. Since this generator is degenerate, the solution of the resolvent equation lacks regularity in the space variable. This causes problems when trying to prove the usual Kipnis and Varadhan conditions for central limit theorems, see [5] and Proposition 2.4 in the present article. Papanicolaou and Varadhan managed to gain enough regularity in the space variable for the solution of the resolvent equation by integrating it over the velocities. But this regularity is only obtained under the condition of a bounded force. Clearly, this method does not fit in Kipnis and Varadhan's scheme, which only relies on the time reversal symmetry of the system.

We propose here a way to avoid the hypothesis of boundedness of the force, motivated by the physical interpretation of the model. The main idea of our proof is the use of another kind of symmetry of the problem, which, in some sense, "replaces" the time reversibility. If one considers the time-reversed process $(x(T - t), v(T - t))$, it is very easy to see that, in the stationary measure $\pi$, it has the same distribution as the process $(-x(t), v(t))$. We exploit this symmetry by considering that the involution $(\eta, v) \to (\eta, -v)$ only affects the antisymmetric part of the generator. We try to generalize this approach by presenting a more abstract and general version of the result which extends to a larger class of dynamics which have the same kind of symmetry. This extension allows us to treat the problem of the tagged particle in the system of interacting Ornstein-Uhlenbeck particles (1).

1.1. Results. The first result of this article is the following

Theorem 2. In equilibrium, under the hypotheses $V \in L^2(\mu)$, $e^{-V/\beta} \in L^1(\mu)$, $DV \in L^2(\pi)$, and hypotheses H1 to H3 stated below, the process $\varepsilon x(\varepsilon^{-2} t)$ defined by (3) converges weakly in $\pi$-probability as $\varepsilon \to 0$ to a Brownian motion whose diffusion matrix is the only symmetric matrix $\Sigma_{OU}$ defined for all $l \in \mathbb{R}^d$ by $l \cdot \Sigma_{OU}\, l = \beta \langle |\xi_l|^2 \rangle$, where $\xi_l$ depends linearly on $l$ and is defined in Sect. 2, Proposition 2.4. If we suppose moreover $e^{V/\beta} \in L^1(\mu)$ then this diffusion is non-degenerate, i.e. there exists $\alpha > 0$ such that $\Sigma_{OU} > \alpha\, \mathrm{Id}_{\mathbb{R}^d}$.

The second result presented in this paper is an extension of the previous one to a larger set of dynamics. See Theorems 3 and 4 for the exact statements. Theorem 1 is a direct consequence of these.


2. Homogenization in a Frozen Environment

2.1. Preliminaries. The group of transformations $G = \{\tau_x, x \in \mathbb{R}^d\}$ acting on the probability space $(X, F, \mu)$ is supposed to be commutative, measure preserving and ergodic, i.e.

1. $\forall (x, y) \in (\mathbb{R}^d)^2$, $\tau_{x+y} = \tau_x \tau_y = \tau_y \tau_x$;
2. $\forall x \in \mathbb{R}^d$, $\forall A \in F$, $\mu(\tau_x A) = \mu(A)$;
3. for $A \in F$, if $\forall x \in \mathbb{R}^d$, $A = \tau_x A$, then $\mu(A) = 0$ or $\mu(A) = 1$.

We assume that the associated group of operators $\{T_x, x \in \mathbb{R}^d\}$, defined on $L^2(\mu)$ by $T_x f : \eta \to f(\tau_{-x}\eta)$, is stochastically continuous:

\[ \forall \delta > 0,\ \forall f \in L^2(\mu), \qquad \lim_{h \to 0} \mu\big( |T_h f(\eta) - f(\eta)| \ge \delta \big) = 0, \]

which implies that $\{T_x\}$ is a strongly continuous unitary group of operators on $L^2(\mu)$. The infinitesimal generator of $\{T_x\}$ is defined for a suitable $f$ by

\[ D f(\eta) = \nabla_x (T_x f)(\eta) \big|_{x=0}. \]

$D$ is a closed unbounded operator, whose domain $D(D)$ is dense in $L^2(\mu)$. Let $\tilde V$ be a random stationary potential defined on $\mathbb{R}^d \times X$, i.e. $\tilde V(x, \eta) = T_x V(\eta)$. $V$ is taken in $L^2(\mu)$ and we suppose that

\[ V \in D(D), \qquad \int_X |DV(\eta)|^2\, \mu(d\eta) < \infty, \tag{4} \]

and that there is a positive $\beta_0$ such that

\[ \int_X e^{-V(\eta)/\beta_0}\, \mu(d\eta) < \infty. \tag{5} \]

Finally, notice that $\nabla_x \tilde V(x, \eta) = DV(\tau_{-x}\eta)$. $\tilde\gamma$ is also a random stationary field with representation $\gamma$. We suppose

\[ \inf_X \gamma = \gamma_* > 0. \tag{6} \]
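A concrete toy instance of this abstract setting may help fix the notation. The sketch below is an illustration under assumptions that are not in the paper: $X = [0, 1)$ with Lebesgue measure, $d = 1$, shifts $\tau_x \eta = \eta + x \bmod 1$, and the hypothetical potential $V(\eta) = \cos(2\pi\eta)$. With these choices $Df = -f'$, and the code checks numerically the group property, the relation $\nabla_x \tilde V(x, \eta) = DV(\tau_{-x}\eta)$, and the finiteness of the integral in (4).

```python
import math

def tau(x, eta):
    """Shift on the circle X = [0, 1): an assumed toy choice of tau_x."""
    return (eta + x) % 1.0

def V(eta):
    """Hypothetical potential on X."""
    return math.cos(2.0 * math.pi * eta)

def V_tilde(x, eta):
    """Stationary field V~(x, eta) = V(tau_{-x} eta)."""
    return V(tau(-x, eta))

def D(f, eta, h=1e-6):
    """Central-difference approximation of D f(eta) = d/dx f(tau_{-x} eta) at x = 0."""
    return (f(tau(-h, eta)) - f(tau(h, eta))) / (2.0 * h)

def grad_x_V_tilde(x, eta, h=1e-6):
    """Central-difference approximation of the x-derivative of V~(x, eta)."""
    return (V_tilde(x + h, eta) - V_tilde(x - h, eta)) / (2.0 * h)

def mean_sq_DV(n=10_000):
    """Midpoint-rule value of the integral in (4) for Lebesgue measure on [0, 1)."""
    return sum(D(V, (i + 0.5) / n) ** 2 for i in range(n)) / n

if __name__ == "__main__":
    print(grad_x_V_tilde(0.3, 0.1), D(V, tau(-0.3, 0.1)))
    print(mean_sq_DV(), 2.0 * math.pi ** 2)
```

In this toy case the environment is periodic rather than genuinely random, so it only illustrates the algebra of $\tau_x$, $T_x$ and $D$, not the probabilistic content of the hypotheses.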

We also suppose for convenience $\|\gamma\|_\infty < \infty$, so that the existence of solutions of (3) is ensured under good hypotheses on $V$, see below. This condition could certainly be weakened.

2.2. The Ornstein-Uhlenbeck Process. With the same notations, we now consider on $(\mathbb{R}^d)^2$ the diffusion $(x(t), v(t))$ solution of Eq. (3) with $0 < \beta \le \beta_0$. We assume good enough hypotheses on $V$ such that the existence of the solution $(x(t), v(t))$ of (3) is almost surely ensured for all positive $t$. For instance, one can suppose the boundedness of $DV$, or the usual global Lipschitz conditions in $x$ for $\nabla_x \tilde V(x, \eta)$. In fact, the main purpose of this article being the study of (30), the existence of the dynamics for this model is proved under very general conditions in [12]. Then, the generator of $(x(t), v(t))$ is given by

\[ L^\eta = \tilde\gamma(x, \eta)\,(\beta \Delta_v - v \cdot \nabla_v) + v \cdot \nabla_x - \nabla_x \tilde V(x, \eta) \cdot \nabla_v. \]


We associate to $(x(t), v(t))$ the Markov process $(\eta(t), v(t))$ defined on $X \times \mathbb{R}^d$, where $\eta(t)$ is the environment as seen by an observer "sitting on the particle". We define $\eta(t)$ on $X$ by
$$\eta(t) = \tau_{-x(t)}\eta, \qquad \eta(0) = \eta.$$
We make the following hypotheses on the semi-group of $(\eta(t), v(t))$. Notice that the following assertions can be proved under certain hypotheses on $V$, including boundedness of $DV$, the semi-group having enough regularity in this case. The main issue in general is the existence of a core of regular enough functions for $L$.

H 1. $(\eta(t), v(t))$ is a Markov process on $X \times \mathbb{R}^d$ whose semi-group is given by
$$P^t f(\eta, v) = E_{\eta,v}[f(\eta(t), v(t))] = \int_{\mathbb{R}^d} P^\eta(t, 0, dv, dy)\, f(\tau_{-y}\eta, v),$$
where $P^t$ is defined on $L^\infty(X)$ and $P^\eta(t, x, v, \cdot)$ is the transition probability of $(x(t), v(t))$.

H 2. The generator $L$ of $(\eta(t), v(t))$ is given by an extension of
$$\gamma(\eta)(\beta\Delta_v - v\cdot\nabla_v) + v\cdot D - DV(\eta)\cdot\nabla_v \qquad (7)$$
and we denote by $S$ and $A$ the corresponding extensions of
$$\gamma(\eta)(\beta\Delta_v - v\cdot\nabla_v) \quad\text{and}\quad v\cdot D - DV(\eta)\cdot\nabla_v. \qquad (8)$$

H 3. The probability measure
$$d\pi = \frac{e^{-V(\eta)/\beta}\,\mu(d\eta)\,G(dv)}{\int_X e^{-V(\eta)/\beta}\,\mu(d\eta)}$$
is stationary and ergodic for $P^t$. Moreover $S$ and $A$ are respectively symmetric and antisymmetric with respect to $\pi$, and $S$ is self-adjoint. $P^t$ is therefore strongly continuous.

In the following, $\langle\cdot\rangle$ denotes the expectation with respect to $d\pi$, and $\langle\cdot,\cdot\rangle$ the associated scalar product on $L^2(\pi)$. The next lemma is the key to all the computations done in this article.

Lemma 2.1. For all $\phi, \psi$ in $D(D)$,
(a) $\int_X D\phi(\eta)\,\mu(d\eta) = 0$,
(b) $\int_X D\phi(\eta)\,\psi(\eta)\,\mu(d\eta) = -\int_X D\psi(\eta)\,\phi(\eta)\,\mu(d\eta)$,
(c) $\langle\phi, DV\rangle = \beta\langle D\phi\rangle$,
(d) $\langle v\phi\rangle = \beta\langle\nabla_v\phi\rangle$.

♦ The proof of this lemma is a simple computation. Let us notice that (c) is a consequence of the chain rule $D e^{V} = DV\, e^{V}$ and of (b). ♦
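Identity (d) is a Gaussian integration by parts in the velocity variable. As a sanity check, the following Python sketch verifies it numerically in one dimension. It assumes that $G$ is the centred Gaussian law with variance $\beta$ (which is consistent with (d), but is stated here as an assumption) and uses the test function $\phi(v) = \sin v$; the helper names are hypothetical.

```python
import math


def gaussian_expectation(func, beta, half_width=12.0, n_points=20001):
    """Trapezoidal approximation of E[func(v)] for v ~ N(0, beta)."""
    sigma = math.sqrt(beta)
    a, b = -half_width * sigma, half_width * sigma
    h = (b - a) / (n_points - 1)
    norm = 1.0 / math.sqrt(2.0 * math.pi * beta)
    total = 0.0
    for k in range(n_points):
        v = a + k * h
        # End points carry half weight in the trapezoidal rule.
        weight = 0.5 if k in (0, n_points - 1) else 1.0
        total += weight * func(v) * norm * math.exp(-v * v / (2.0 * beta))
    return total * h


def check_identity_d(beta):
    """Return (<v phi>, beta <phi'>) for the test function phi = sin."""
    lhs = gaussian_expectation(lambda v: v * math.sin(v), beta)
    rhs = beta * gaussian_expectation(math.cos, beta)
    return lhs, rhs
```

For this choice of $\phi$ both sides equal $\beta e^{-\beta/2}$, so the two returned numbers should agree up to quadrature error.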


G. Benabou

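2.3. The resolvent equation.

2.3.1. Introduction of useful functional spaces. We introduce the Dirichlet form $\mathcal{E}(\phi, \psi) = -\langle S\phi, \psi\rangle$ associated to $L$ as the closure of the following one, given for any $\phi$ and $\psi$ smooth in $v$ by
$$\mathcal{E}(\phi, \psi) = -\langle S\phi, \psi\rangle = \beta\langle\gamma\nabla_v\phi\cdot\nabla_v\psi\rangle.$$
Let $D(\mathcal{E})$ be its domain. We define $H_1$ as the completion of $\{\phi \in D(\mathcal{E}),\ \|\phi\|_1^2 = \beta^{-1}\mathcal{E}(\phi, \phi) < \infty\}$ for the norm $\|\cdot\|_1$ defined as above. We also denote by $\tilde H_1 = H_1 \cap L^2(\pi)$ the space endowed with the norm
$$\|\cdot\|_{\tilde H_1}^2 = \|\cdot\|_{L^2}^2 + \gamma_*\beta\,\|\cdot\|_1^2.$$
Notice that they are both Hilbert spaces. We also define the dual space $\tilde H_{-1}$ of $\tilde H_1$ as the completion of the set of all the functions $\psi$ from $\tilde H_1$ such that
$$\|\psi\|_{\tilde H_{-1}}^2 = \sup_{\phi\in\tilde H_1}\left(2\langle\psi, \phi\rangle - \|\phi\|_{\tilde H_1}^2\right) < \infty$$
with respect to the norm $\|\cdot\|_{\tilde H_{-1}}$. The action of $\tilde H_{-1}$ is denoted by ${}_{\tilde H_{-1}}\langle\cdot,\cdot\rangle_{\tilde H_1}$. We define $H_{-1}$ in the same way.

2.3.2. Existence and uniqueness of solutions for the resolvent equation. Let us consider the $H_{-1}$-weak resolvent equation for all $\lambda > 0$,
$$\lambda h_\lambda - L h_\lambda = l\cdot v, \qquad (9)$$
where $l$ is fixed in $\mathbb{R}^d$, in the sense that we are looking for $h_\lambda$ such that for any $\phi \in H_1$,
$$\lambda\langle h_\lambda, \phi\rangle - {}_{\tilde H_{-1}}\langle L h_\lambda, \phi\rangle_{\tilde H_1} = \langle l\cdot v\,\phi\rangle.$$
Besides the tightness of the process and the ergodicity of the stationary measure, Kipnis and Varadhan's theory needs some information (given below by Proposition 2.4) on the solution to the resolvent equation. Our purpose is now to prove the statements of Proposition 2.4. The next proposition states the following very usual result in homogenization theory: the existence and the uniqueness of the solution $h_\lambda$ of (9) is ensured by $l\cdot v \in H_{-1}$.

Proposition 2.2. Equation (9) has a unique solution $h_\lambda$ in $\tilde H_1$, and $A h_\lambda$ and $S h_\lambda$ are in $\tilde H_{-1}$. Moreover
$$\lambda\langle h_\lambda^2\rangle \le \frac{\beta}{\gamma_*}|l|^2 \quad\text{and}\quad \langle\gamma|\nabla_v h_\lambda|^2\rangle \le \frac{|l|^2}{\gamma_*}, \qquad (10)$$
and
$$\lambda\langle h_\lambda^2\rangle + \beta\langle\gamma|\nabla_v h_\lambda|^2\rangle \le \beta\langle l\cdot\nabla_v h_\lambda\rangle. \qquad (11)$$

♦ A proof of these results is given in [2], Lemma 2.1. ♦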


2.3.3. Asymptotic behaviour of the solution. Let us briefly recall here the method developed in [2] by Papanicolaou and Varadhan to prove a central limit theorem for the solution of (3). They manage to write $h_\lambda(\eta, v) = b_\lambda(\eta, v) + c_\lambda(\eta)$, where the family $(b_\lambda)$ is uniformly bounded in $L^2(\pi)$ with respect to $\lambda$, along with the family $(Dc_\lambda)$. Using this decomposition, they are able to prove that for all $\lambda > 0$, $\mu > 0$, the quantity $\langle A h_\lambda, h_\mu\rangle$ goes to 0 when $\lambda$ and $\mu$ go successively to 0. This result is sufficient to prove Proposition 2.4. Unfortunately, such a decomposition for $h_\lambda$ can only be obtained using the hypothesis of the boundedness of $DV$. We propose here another very simple way to decompose $h_\lambda$, which leads to the same result without any further hypothesis. From now on, $f_\lambda$ and $g_\lambda$ denote respectively the even and the odd part of $h_\lambda$ with respect to $v$, i.e. for all $(\eta, v)$,
$$f_\lambda(\eta, v) = \frac{h_\lambda(\eta, v) + h_\lambda(\eta, -v)}{2}, \qquad g_\lambda(\eta, v) = \frac{h_\lambda(\eta, v) - h_\lambda(\eta, -v)}{2}. \qquad (12)$$
The following lemma holds.

Lemma 2.3. The family $(g_\lambda)$ is uniformly bounded in $L^2(\pi)$,
$$\langle g_\lambda^2\rangle \le \frac{\beta}{\gamma_*^2}|l|^2. \qquad (13)$$

♦ The operator $S$ has a spectral gap over $\mathbb{R}^d$ larger than $\gamma_*$, because the velocities have a Gaussian distribution. Then, we may apply Poincaré's inequality, which gives
$$\int_{\mathbb{R}^d} g_\lambda(\eta, v)^2\, G(dv) \le -\frac{1}{\gamma_*}\int_{\mathbb{R}^d} g_\lambda(\eta, v)\, S g_\lambda(\eta, v)\, G(dv) = \beta\,\frac{\gamma(\eta)}{\gamma_*}\int_{\mathbb{R}^d} |\nabla_v g_\lambda(\eta, v)|^2\, G(dv),$$
and then due to (10)
$$\langle g_\lambda^2\rangle \le \frac{\beta}{\gamma_*}\langle\gamma|\nabla_v g_\lambda|^2\rangle \le \frac{\beta}{\gamma_*}\langle\gamma|\nabla_v h_\lambda|^2\rangle \le \frac{\beta}{\gamma_*^2}|l|^2. \ ♦$$

The following proposition proves the basic hypotheses of the Kipnis and Varadhan method to obtain central limit theorems ([5]).

Proposition 2.4. We have
$$\lim_{\lambda\to 0}\lambda\langle h_\lambda^2\rangle = 0. \qquad (14)$$
Furthermore, there exists $\xi$, $|\xi| \in L^2(\pi)$, so that
$$\lim_{\lambda\to 0}\left\|\sqrt{\gamma}\,\nabla_v h_\lambda - \xi\right\|_{L^2(\pi)} = 0. \qquad (15)$$


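♦ Using (12), (9) gives
$$\lambda f_\lambda - S f_\lambda - A g_\lambda = 0 \qquad (16)$$
and
$$\lambda g_\lambda - S g_\lambda - A f_\lambda = l\cdot v. \qquad (17)$$
Let $\mu$ be any positive real number. Multiplying (17) by $g_\mu$ and integrating it with respect to $\pi$, we obtain
$$\lambda\langle g_\lambda, g_\mu\rangle + \beta\langle\gamma\nabla_v g_\lambda\cdot\nabla_v g_\mu\rangle - {}_{\tilde H_{-1}}\langle A f_\lambda, g_\mu\rangle_{\tilde H_1} = \beta\langle l\cdot\nabla_v g_\mu\rangle.$$
Using the same regularisation as used in the proof of Proposition 2.2 (see [2] for details), one can easily prove that
$${}_{\tilde H_{-1}}\langle A f_\lambda, g_\mu\rangle_{\tilde H_1} = -{}_{\tilde H_{-1}}\langle A g_\mu, f_\lambda\rangle_{\tilde H_1},$$
and then using (16), we obtain
$$\lambda\langle g_\lambda, g_\mu\rangle + \mu\langle f_\lambda, f_\mu\rangle + \beta\langle\gamma\nabla_v g_\lambda\cdot\nabla_v g_\mu\rangle + \beta\langle\gamma\nabla_v f_\lambda\cdot\nabla_v f_\mu\rangle$$
$$= \lambda\langle g_\lambda, g_\mu\rangle + \mu\langle f_\lambda, f_\mu\rangle + \beta\langle\gamma\nabla_v h_\lambda\cdot\nabla_v h_\mu\rangle$$
$$= \beta\langle l\cdot\nabla_v g_\mu\rangle = \beta\langle l\cdot\nabla_v h_\mu\rangle. \qquad (18)$$
Due to (10), the family $(\sqrt{\gamma}\,\nabla_v h_\lambda)$ is weakly compact in $L^2(\pi)$ when $\lambda$ goes to 0. Let $\xi$ be a weak limiting point of this family. We also denote by $g_0$ an $L^2$-weak limiting point of the family $(g_\lambda)$. We consider a subsequence $\lambda_n$ such that $\sqrt{\gamma}\,\nabla_v h_{\lambda_n}$ converges weakly to $\xi$ and $(g_{\lambda_n})$ converges weakly to $g_0$. Then (18) gives
$$\lambda_n\langle g_{\lambda_n}, g_{\lambda_p}\rangle + \lambda_p\langle f_{\lambda_n}, f_{\lambda_p}\rangle + \beta\langle\gamma\nabla_v h_{\lambda_n}\cdot\nabla_v h_{\lambda_p}\rangle = \beta\langle l\cdot\nabla_v h_{\lambda_p}\rangle. \qquad (19)$$
Letting $p$ and then $n$ go to infinity, we obtain
$$\langle|\xi|^2\rangle = \langle l\cdot\xi\rangle. \qquad (20)$$
Equation (11) gives for all $\lambda > 0$,
$$\lambda\langle h_\lambda^2\rangle + \beta\langle\gamma|\nabla_v h_\lambda|^2\rangle \le \beta\langle l\cdot\nabla_v h_\lambda\rangle \qquad (21)$$
and then
$$\limsup\,\langle\gamma|\nabla_v h_{\lambda_n}|^2\rangle \le \langle l\cdot\xi\rangle.$$
Then, due to the weak convergence of $(\sqrt{\gamma}\,\nabla_v h_{\lambda_n})$ to $\xi$, we have
$$\langle l\cdot\xi\rangle = \langle|\xi|^2\rangle \le \liminf\,\langle|\sqrt{\gamma}\,\nabla_v h_{\lambda_n}|^2\rangle \le \limsup\,\langle|\sqrt{\gamma}\,\nabla_v h_{\lambda_n}|^2\rangle \le \langle l\cdot\xi\rangle \qquad (22)$$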


and $(\sqrt{\gamma}\,\nabla_v h_{\lambda_n})$ converges strongly to $\xi$. Now (21) implies (14) for the subsequence $\lambda_n$. The last thing that needs to be proved now is the uniqueness of this limit. Let us suppose that there exists another limiting point $\xi'$. Let $\sqrt{\gamma}\,\nabla_v h_{\lambda_n}$ and $\sqrt{\gamma}\,\nabla_v h_{\mu_p}$ be two subsequences which converge respectively to $\xi$ and $\xi'$. Writing Eq. (19) with $\lambda = \lambda_n$, $\mu = \mu_p$, and letting $n$ and $p$ go to infinity, we obtain
$$\langle\xi\cdot\xi'\rangle = \langle l\cdot\xi'\rangle = \langle|\xi'|^2\rangle.$$
Exchanging the roles of $\lambda_n$ and $\mu_p$, it is obvious that we also have
$$\langle\xi\cdot\xi'\rangle = \langle l\cdot\xi\rangle = \langle|\xi|^2\rangle = \langle|\xi'|^2\rangle.$$
And then
$$\xi = \xi', \qquad (23)$$
and the limit is unique. The weakly compact family $(\sqrt{\gamma}\,\nabla_v h_\lambda)$ has a unique possible limiting point $\xi$, hence it converges weakly to this point. The same argument as above proves that this convergence is strong. The proof of the proposition is therefore complete. ♦

2.4. Tightness of the process. In order to prove the continuity of the limit process in the central limit theorem 2, we need the following compactness result:

Proposition 2.5. The following inequality holds for all $T \ge 0$:
$$E\left[\sup_{0\le s\le t\le T}|x(t) - x(s)|^2\right] \le 8T\,\frac{\beta}{\gamma_*}.$$

♦ One can rewrite
$$x(t) - x(s) = \int_s^t v(u)\,du.$$
The proposition is then a direct consequence of Proposition 2.1.1 of [11], which gives
$$E\left[\sup_{0\le s\le t\le T}\left|\int_s^t v(u)\,du\right|^2\right] \le 8T\,\|v\|_{\tilde H_{-1}}^2. \ ♦$$

2.5. Central limit theorem. Now, the homogenization theorem 2 can be proved in the same way as in [5]. Using the resolvent equation (9), we have
$$\varepsilon\, l\cdot x(\varepsilon^{-2}t) = \varepsilon\int_0^{\varepsilon^{-2}t} l\cdot v(s)\,ds = \varepsilon M(\varepsilon^{-2}t) + \varepsilon\left[h_{\varepsilon^2}(\eta, v(0)) - h_{\varepsilon^2}(\eta(\varepsilon^{-2}t), v(\varepsilon^{-2}t))\right] + \varepsilon^3\int_0^{\varepsilon^{-2}t} h_{\varepsilon^2}(\eta(s), v(s))\,ds,$$


where $M(t)$ is the martingale
$$M(t) = -\int_0^t \sqrt{2\beta\,\gamma(\eta(s))}\;\nabla_v h_{\varepsilon^2}(\eta(s), v(s))\cdot dw_s.$$
Then due to (14), we can prove that
$$E_\pi\left[\left(\varepsilon\, l\cdot x(\varepsilon^{-2}t) - \varepsilon M(\varepsilon^{-2}t)\right)^2\right] \to 0$$
when $\varepsilon$ goes to 0, and due to (15) and to the ergodicity of $\pi$, we can prove a central limit theorem for the martingale $M(t)$, whose quadratic variation is given by
$$\langle M\rangle_t = 2\beta\int_0^t \gamma(\eta(s))\,|\nabla_v h_{\varepsilon^2}(\eta(s), v(s))|^2\,ds.$$
Notice that these computations are correct if we assume the existence of a core of regular enough functions for $L$, as we considered here a weak version of the resolvent equation. One can also refer to [8], pp. 58–59, for a proof of the non-degeneracy of the limit process.

3. Central Limit Theorems for Non-Reversible Markov Processes

We propose in this section to extend the previous result to a more general set of dynamics. Let $(\Omega, \mathcal{F}, \pi)$ be a Polish probability space. The expectation with respect to $\pi$ is denoted by $\langle\cdot\rangle$. Let $\eta(t)$ be a Markov process with state space $\Omega$.

3.1. General setup. We suppose that there exists a one-to-one, $\pi$-preserving map on $\Omega$, which we denote for any $\eta \in \Omega$ by $\eta^\perp$, satisfying $(\eta^\perp)^\perp = \eta$. We define on $L^2(\pi)$ the canonical involution $f^\perp(\eta) = f(\eta^\perp)$ for all $f \in L^2(\pi)$, $\eta \in \Omega$. Let $\mathcal{F}_0$ be a sub-$\sigma$-algebra of $\mathcal{F}$. We denote the conditional expectation with respect to $\mathcal{F}_0$ by $\langle\cdot\,|\,\mathcal{F}_0\rangle$. Let $C_0$ be the subspace of $L^2(\pi)$ consisting of all $\mathcal{F}_0$-measurable functions. We suppose that for any $f \in C_0$, $f^\perp = f$. Let $P^t$ be the semi-group of operators over $L^2(\pi)$ corresponding to $\eta(t)$. $P^t$ is supposed to have the following properties.

P 1. $\pi$ is a stationary ergodic measure for $P^t$; thus, $P^t$ is a strongly continuous contraction semi-group over $L^2(\pi)$. We can define its infinitesimal generator $L$ as a closed unbounded operator over $L^2(\pi)$ whose domain $D(L)$ is dense in $L^2(\pi)$, and the adjoint $L^*$ of $L$, which is also the generator of a strongly continuous semi-group.

P 2. $D(L) \cap D(L^*)$ contains a core for $L$ and $L^*$. Then we can define $S$ and $A$ as being respectively the symmetric and the antisymmetric part of $L$ with respect to $\pi$, and $S$ is therefore self-adjoint. We suppose also $D(L)^\perp = D(L)$, $D(S)^\perp = D(S)$, $D(A)^\perp = D(A)$;

P 3. $S$ satisfies Poincaré's inequality, i.e. there exists $G > 0$ such that for all $f \in D(\mathcal{E})$ satisfying $\langle f\,|\,\mathcal{F}_0\rangle = 0$, we have $\langle f^2\,|\,\mathcal{F}_0\rangle \le G\,\langle -Sf, f\,|\,\mathcal{F}_0\rangle$;

P 4. For all $f \in D(S)$, $(Sf)^\perp = S(f^\perp)$, and for all $f \in D(A)$, $(Af)^\perp = -A(f^\perp)$.


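We introduce the space $H_1$ consisting of the completion of $\{\phi \in D(\mathcal{E}),\ \mathcal{E}(\phi, \phi) < \infty\}$ with respect to the norm $\|\cdot\|_1 = \sqrt{\mathcal{E}(\phi, \phi)}$. We also introduce $\tilde H_1 = H_1 \cap L^2(\pi)$ endowed with the norm $\|\cdot\|_{\tilde H_1}^2 = \|\cdot\|_1^2 + \|\cdot\|_{L^2(\pi)}^2$, and the dual spaces $H_{-1}$ and $\tilde H_{-1}$ of these. Let $\psi \in L^1(\pi) \cap H_{-1}$. We consider the weak resolvent equation
$$(\lambda - L)h_\lambda = \psi.$$
As in the case treated in Sect. 2, the following proposition holds.

Proposition 3.1. There exist sequences $(S_n)$ and $(A_n)$ of bounded operators over $L^2(\pi)$ and a sequence $(h_\lambda^{(n)})$ in $\tilde H_1$ such that
a. $h_\lambda^{(n)}$ converges weakly in $\tilde H_1$ to $h_\lambda$;
b. The operators $S_n$ and $A_n$ are respectively self-adjoint and anti-self-adjoint with respect to $\pi$;
c. $S_n h_\lambda^{(n)}$ and $A_n h_\lambda^{(n)}$ converge weakly in $\tilde H_{-1}$; their limits are denoted respectively $S h_\lambda$ and $A h_\lambda$;
d. For all $n \in \mathbb{N}$, $(\lambda - S_n - A_n)h_\lambda^{(n)} = \psi$.
Moreover, $h_\lambda \in D(\sqrt{-S})$ and the following inequality is satisfied:
$$\lambda\langle h_\lambda^2\rangle + \mathcal{E}(h_\lambda, h_\lambda) \le \|\psi\|_{H_{-1}}\,\mathcal{E}(h_\lambda, h_\lambda)^{1/2}. \qquad (24)$$

♦ We introduce for all $\theta > 0$ the following approximation of the operators $S$ and $A$, inspired by the Yosida approximation ([3], p. 12, Lemma 2.4):
$$S_\theta = S(I - \theta S)^{-1},$$
$$A_\theta = A(I - \theta^2 A^2)^{-1} = \tfrac{1}{2}\left[A(I - \theta A)^{-1} - \left(-A(I + \theta A)^{-1}\right)\right],$$
where $I$ is the identity of $L^2(\pi)$. $S_\theta$ and $A_\theta$ are bounded operators over $L^2(\pi)$, and the $S_\theta$ (resp. the $A_\theta$) are obviously self-adjoint (resp. anti-self-adjoint) with respect to $\pi$. We can then introduce for all positive $\lambda$,
$$h_{\lambda,\theta} = (\lambda - S_\theta - A_\theta)^{-1}\psi. \qquad (25)$$
Due to the Hille-Yosida theorem ([3], p. 10), we have $\|(I - \theta S)^{-1}\| \le 1$, and it is then easy to check that $0 \le -S \le -S_\theta$. Using (25), we have
$$\lambda\langle h_{\lambda,\theta}^2\rangle + \left\langle\left(\sqrt{-S_\theta}\,h_{\lambda,\theta}\right)^2\right\rangle = \langle\psi h_{\lambda,\theta}\rangle \le \|\psi\|_{H_{-1}}\left\langle\left(\sqrt{-S}\,h_{\lambda,\theta}\right)^2\right\rangle^{1/2}. \qquad (26)$$
As $-S \le -S_\theta$, we have
$$\lambda\langle h_{\lambda,\theta}^2\rangle + \left\langle\left(\sqrt{-S}\,h_{\lambda,\theta}\right)^2\right\rangle \le \|\psi\|_{H_{-1}}\left\langle\left(\sqrt{-S}\,h_{\lambda,\theta}\right)^2\right\rangle^{1/2},$$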


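which implies
$$\begin{cases}
\lambda\langle h_{\lambda,\theta}^2\rangle \le \|\psi\|_{H_{-1}}^2, \\
\left\langle\left(\sqrt{-S}\,h_{\lambda,\theta}\right)^2\right\rangle \le \|\psi\|_{H_{-1}}^2, \\
\left\langle\left(\sqrt{-S_\theta}\,h_{\lambda,\theta}\right)^2\right\rangle \le \|\psi\|_{H_{-1}}^2.
\end{cases} \qquad (27)$$
It only remains to extract subsequences $h_\lambda^{(n)}$ and $S_n h_\lambda^{(n)}$ from $(h_{\lambda,\theta})$ and $(S_\theta h_{\lambda,\theta})$ which converge weakly respectively in $\tilde H_1$ and $\tilde H_{-1}$. As we have $A_n h_\lambda^{(n)} = (\lambda - S_n)h_\lambda^{(n)} - \psi$, the proof of the proposition is complete. ♦

3.2. The main statements. The following theorem holds:

Theorem 3. Under the previous assumptions P1 to P4, for all $\psi \in L^1(\pi) \cap H_{-1}$ such that $\psi^\perp = -\psi$, we have
$$\lambda\langle h_\lambda^2\rangle \to 0 \ \text{ as } \lambda \to 0, \qquad \exists\,\xi \in L^2(\pi),\ \left\langle\left(\sqrt{-S}\,h_\lambda - \xi\right)^2\right\rangle \to 0 \ \text{ as } \lambda \to 0. \qquad (28)$$

As a direct consequence of this theorem and of the ergodicity of $\pi$, we have the following result:

Theorem 4. The process $\varepsilon\int_0^{\varepsilon^{-2}t}\psi(\eta(s))\,ds$ converges weakly as $\varepsilon$ goes to 0 to a Brownian motion with diffusion coefficient $\langle\xi^2\rangle$. The tightness of the process is ensured by the same argument as in Proposition 2.5.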
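The diffusive scaling in Theorem 4 can be illustrated on the simplest example of Sect. 2: a one-dimensional free Ornstein-Uhlenbeck velocity (no potential, constant friction $\gamma$), started from its stationary Gaussian law of variance $\beta$, with $\psi = v$. These simplifications, and the function names below, are assumptions made for this sketch only. In that case the velocity covariance is $\beta e^{-\gamma|s-u|}$, so the variance of $x(T) = \int_0^T v(s)\,ds$ is explicit and the rescaled variance can be compared with its limit $2\beta t/\gamma$.

```python
import math


def scaled_displacement_variance(beta, gamma, t, eps):
    """Variance of eps * x(t / eps**2) for the stationary free OU velocity.

    Uses Var x(T) = (2 beta / gamma**2) * (gamma T - 1 + exp(-gamma T)),
    obtained by integrating the covariance beta * exp(-gamma |s - u|)
    over the square [0, T] x [0, T].
    """
    big_t = t / eps**2
    var_x = (2.0 * beta / gamma**2) * (gamma * big_t - 1.0 + math.exp(-gamma * big_t))
    return eps**2 * var_x


def limiting_variance(beta, gamma, t):
    """Variance 2 beta t / gamma of the limiting Brownian motion in this toy case."""
    return 2.0 * beta * t / gamma


if __name__ == "__main__":
    for eps in (1.0, 0.3, 0.1, 0.03):
        print(eps, scaled_displacement_variance(1.0, 2.0, 1.0, eps),
              limiting_variance(1.0, 2.0, 1.0))
```

The printed values increase towards the limiting variance as $\varepsilon$ decreases. The sketch only checks second moments in a solvable special case; it does not replace the martingale argument used in the text.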

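3.3. Proofs. Let us first prove Theorem 3. Due to (24),
$$\lambda\langle h_\lambda^2\rangle \le \|\psi\|_{\tilde H_{-1}}^2 \quad\text{and}\quad \mathcal{E}(h_\lambda, h_\lambda) \le \|\psi\|_{H_{-1}}^2.$$
Following the scheme of the proof of Proposition 2.4, we introduce the symmetric and antisymmetric parts of $h_\lambda = f_\lambda + g_\lambda$ with respect to $\perp$. As $\perp$ preserves $\pi$, we have
$$\langle g_\lambda\,|\,\mathcal{F}_0\rangle = \tfrac{1}{2}\left\langle h_\lambda - h_\lambda^\perp\,\middle|\,\mathcal{F}_0\right\rangle = \tfrac{1}{2}\left\langle h_\lambda^\perp - h_\lambda\,\middle|\,\mathcal{F}_0\right\rangle = -\langle g_\lambda\,|\,\mathcal{F}_0\rangle = 0, \qquad (29)$$
and thanks to P3 $\langle g_\lambda^2\rangle \le G\,\mathcal{E}(g_\lambda, g_\lambda)$. But the same argument as in (29) together with P4 proves that $\mathcal{E}(f_\lambda, g_\lambda) = 0$. Then $\mathcal{E}(h_\lambda, h_\lambda) = \mathcal{E}(f_\lambda, f_\lambda) + \mathcal{E}(g_\lambda, g_\lambda)$. Consequently $\langle g_\lambda^2\rangle \le G\,\|\psi\|_{H_{-1}}^2$, so $(g_\lambda)$ is uniformly bounded in $L^2(\pi)$. The same argument as in the proof of Proposition 2.4 can now be applied here without any changes, and Theorem 3 is proved. Theorem 4 is now a straightforward application of Theorem 3 and of the ergodicity of $\pi$, just as Theorem 2 was in the previous section.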


4. Interacting Ornstein-Uhlenbeck Particles

Let us now consider an infinite system of particles, each of them moving according to an Ornstein-Uhlenbeck process. They interact through a two-body potential. We want to follow the motion of a special particle, which we tag. The tagged particle evolves in a random environment which is not frozen any more and depends on the motion of the tagged particle. A model of this kind has already been studied for non-massive particles ([6]). The probability space $X$ is now the space of locally finite configurations of particles in $\mathbb{R}^d$ with velocities in $\mathbb{R}^d$, i.e.
$$X = \{\eta = \{x_i, v_i\}_{i\in I_0} \subset \mathbb{R}^d,\ \forall B \in \mathcal{B}(\mathbb{R}^d),\ \eta_x \cap B \text{ is finite}\},$$
where $I_0$ is a countable set of indices, $\mathcal{B}(\mathbb{R}^d)$ is the set of bounded subsets of $\mathbb{R}^d$ and $\eta_x = \{x_i\}_{i\in I_0}$ for all $\eta = \{x_i, v_i\}_{i\in I_0} \subset \mathbb{R}^d$. We endow $X$ with the weakest topology such that the map $\phi : \eta \mapsto \sum_{i\in I_0} h(x_i, v_i)$ is continuous for any continuous compactly supported $h$ mapping $(\mathbb{R}^d)^2$ to $\mathbb{R}$. $\mathcal{F}$ is the corresponding Borel $\sigma$-algebra. The transformation group $G = \{\tau_x, x \in \mathbb{R}^d\}$ is the group of space shifts $\tau_x\eta = \{x_i + x, v_i\}$. We define the gradient of a suitable $f$ with respect to $x_i \in \eta_x$ by
$$\nabla_{x_i} f(\eta)\cdot l = \lim_{\delta\to 0}\delta^{-1}\left(f([\eta\setminus\{x_i\}]\cup\{x_i + \delta l\}) - f(\eta)\right)$$
for all $l \in \mathbb{R}^d$, and we similarly define its gradient $\nabla_{v_i} f$ with respect to a velocity $v_i \in \eta$. Finally, we define the formal operator
$$D = \sum_i \nabla_{x_i}.$$
Let $U$ be a twice continuously differentiable, compactly supported map on $\mathbb{R}^d$. We suppose that $U$ is superstable, i.e. for all bounded $\Lambda$ in $\mathbb{R}^d$, there exist $C_1 > 0$ and $C_2 \ge 0$ such that for all $(x_1, \ldots, x_n) \in \Lambda^n$,
$$\sum_{i\ne j} U(x_i - x_j) \ge -C_2 n + C_1 n^2 |\Lambda|^{-1},$$
where $|\Lambda|$ is the volume of $\Lambda$. We also suppose that $U$ is even. The measure $\mu$ on $\mathcal{F}$ is one of the ergodic Gibbs measures associated to the formal Hamiltonian
$$H_0(\eta) = \frac{1}{2}\sum_{i\ne j} U(x_i - x_j) + \frac{1}{2}\sum_i |v_i|^2$$
with fixed temperature $\beta$ and fugacity $z$. The existence of this measure is ensured by the stability of $U$. We now study the system defined by
$$\begin{cases}
dx_i(t) = v_i(t)\,dt, \\
dv_i(t) = -\gamma_i v_i(t)\,dt - \sum_{j\ne i}\nabla U(x_i - x_j)\,dt + \sqrt{2\gamma_i\beta}\,dw_i(t),
\end{cases} \qquad (30)$$


which is a slight extension of (1) with $\gamma$ depending on $i$, and the mass of the particles taken equal to 1. The $\gamma_i$ are positive constants satisfying
$$\inf_i \gamma_i = \gamma_* > 0. \qquad (31)$$
The existence of these dynamics is proved in [12]. The generator of this process is
$$L_{IOU} = \sum_{i\in I_0}\left[\gamma_i(\beta\Delta_{v_i} - v_i\cdot\nabla_{v_i}) + v_i\cdot\nabla_{x_i} - \sum_{j\ne i}\nabla U(x_i - x_j)\cdot\nabla_{v_i}\right].$$
Now, we consider one of these particles, of index $i_0$, whose position and velocity are denoted respectively by $x$ and $v$, and we tag it. We obtain the following system:
$$\begin{cases}
dx(t) = v(t)\,dt, \\
dv(t) = -\gamma v(t)\,dt - \sum_{j\in I}\nabla U(x - x_j)\,dt + \sqrt{2\gamma\beta}\,dw_0, \\
dx_i(t) = v_i(t)\,dt, \\
dv_i(t) = -\gamma_i v_i(t)\,dt - \sum_{j\ne i}\nabla U(x_i - x_j)\,dt - \nabla U(x_i - x)\,dt + \sqrt{2\gamma_i\beta}\,dw_i(t),
\end{cases}$$
where $I = I_0\setminus\{i_0\}$. As in the previous section, we now consider the environment $\omega(t) = \tau_{-x(t)}\eta(t) = \{y_i(t), v_i(t)\}$. The system becomes
$$\begin{cases}
dx(t) = v(t)\,dt, \\
dv(t) = -\gamma v(t)\,dt + \sum_{j\in I}\nabla U(y_j)\,dt + \sqrt{2\gamma\beta}\,dw_0, \\
dy_i(t) = (v_i(t) - v(t))\,dt, \\
dv_i(t) = -\gamma_i v_i(t)\,dt - \sum_{j\ne i}\nabla U(y_i - y_j)\,dt - \nabla U(y_i)\,dt + \sqrt{2\gamma_i\beta}\,dw_i(t).
\end{cases} \qquad (32)$$
We introduce the Palm measure $\pi$, whose density with respect to $\mu$ is
$$\frac{d\pi}{d\mu} = Z^{-1}\exp\left(-\sum_i\frac{U(x_i)}{\beta} - \frac{|v|^2}{2\beta}\right).$$
Introducing the new formal Hamiltonian
$$H(\eta, v) = H_0 + \sum_i U(x_i) + \frac{|v|^2}{2},$$
the generator of $(\omega(t), v(t))$ is given by an extension of $L_{IOU} = S_{IOU} + A_{IOU}$ with
$$S_{IOU} = \gamma(\beta\Delta_v - v\cdot\nabla_v) + \sum_{i\in I}\gamma_i(\beta\Delta_{v_i} - v_i\cdot\nabla_{v_i}),$$
$$A_{IOU} = -v\cdot D + DH\cdot\nabla_v + \sum_{i\in I}\left(v_i\cdot\nabla_{y_i} - \nabla_{y_i}H\cdot\nabla_{v_i}\right), \qquad (33)$$


where $S_{IOU}$ and $A_{IOU}$ are respectively symmetric and antisymmetric with respect to the invariant measure $\pi$ defined above. The Dirichlet form associated to this generator is given for any $\phi \in D(L)$ by
$$\langle L_{IOU}\phi, \phi\rangle = -\gamma\beta\langle|\nabla_v\phi|^2\rangle - \beta\sum_i\gamma_i\langle|\nabla_{v_i}\phi|^2\rangle.$$
Some problems may arise at this point. The existence of the tagged particle process and of the environment process is not obvious, due to measurability problems. The strong continuity of the semi-group corresponding to the environment process, and the stationarity of $\pi$ for this semi-group, are difficult technical results which, as far as the author knows, have not been shown either. It seems that Ferrari's scheme for proving the stationarity (see [4]) could be used here, if we assume the existence of a core of regular local functions for $L_{IOU}$, allowing us to commute the semi-group and the generator. These results, however interesting, are not the subject of the present article. We simply assume them in the following. Notice that, in the case of non-massive particles ([6]), these technical points are not treated either.

We introduce the resolvent equation for all $l \in \mathbb{R}^d$,
$$(\lambda - L_{IOU})u_\lambda = l\cdot v.$$
Due to the study done in Sect. 3, we can prove the existence of $u_\lambda$ in $\tilde H_1$, and of $S_{IOU}u_\lambda$ and $A_{IOU}u_\lambda$ in $\tilde H_{-1}$. The next results are now direct consequences of Theorems 3 and 4.

Proposition 4.1. We have
$$\lim_{\lambda\to 0}\lambda\langle u_\lambda^2\rangle = 0. \qquad (34)$$
Furthermore, there exists $\xi$ in $L^2$ such that
$$\lim_{\lambda\to 0}\langle|\nabla_v u_\lambda - \xi|^2\rangle = 0 \qquad (35)$$
and there exists a family $(\xi_i)_{i\in I}$ in $L^2$ such that for all $i \in I$,
$$\lim_{\lambda\to 0}\langle|\nabla_{v_i}u_\lambda - \xi_i|^2\rangle = 0 \quad\text{and}\quad \sum_{i\in I}\langle|\xi_i|^2\rangle < \infty.$$

♦ All we have to do is to prove that the hypotheses of Theorem 3 are satisfied. Let us introduce on the configuration space $X$ the involution $\eta = (x_i, v_i)_{i\in I} \mapsto \eta^\perp = (x_i, -v_i)_{i\in I}$. $\mathcal{F}_0$ is the sub-$\sigma$-algebra of $\mathcal{F}$ generated by the functions which do not depend on the velocities. Hypotheses P1 and P4 are very easy to check, as the Palm measure is a product measure with respect to the positions and the velocities, and as it is even in the velocities. Moreover, it is an infinite-dimensional Gaussian measure in the velocities, so


that $S_{IOU}$ has a positive spectral gap $\gamma_*$, where $\gamma_*$ has been defined in (31). Hypothesis P3 is then satisfied. ♦

Now, Theorem 1 is a particular case of Theorem 4. In this case, the non-degeneracy of the macroscopic diffusion cannot be proved. It certainly depends on the temperature of the system.

Acknowledgements. The author wishes to express his thanks to Professor J. Fritz for his extremely valuable advice, which helped him to greatly improve this article. He also wants to thank his PhD thesis director Stefano Olla, who made him work on this problem and helped him solve it.

References

1. Papanicolaou, G., Varadhan, S. R. S.: Boundary Value Problems with Rapidly Oscillating Random Coefficients. In Colloquia Mathematica Societatis János Bolyai, 27. Random Fields, Esztergom (Hungary), (1979), pp. 835–873
2. Papanicolaou, G., Varadhan, S. R. S.: Ornstein-Uhlenbeck Processes in Random Potential. Commun. Pure Appl. Math. 38, 819–834 (1985)
3. Ethier, S. N., Kurtz, T. G.: Markov Processes. New York: John Wiley (1986)
4. Ferrari, P. A.: The Simple Exclusion Process as seen from a Tagged Particle. Ann. Prob. 14, 1277–1290 (1986)
5. Kipnis, C., Varadhan, S. R. S.: Central Limit Theorem for Additive Functionals of Reversible Markov Process and Applications to Simple Exclusions. Commun. Math. Phys. 104, 1–19 (1986)
6. De Masi, A., Ferrari, P., Goldstein, S., Wick, W. D.: An Invariance Principle for Reversible Markov Processes. Applications to Random Motions in Random Environments. J. Stat. Phys. 55, 3/4, 787–855 (1989)
7. Olla, S., Varadhan, S. R. S.: Scaling Limit for Interacting Ornstein-Uhlenbeck Processes. Commun. Math. Phys. 135, 335–378 (1991)
8. Olla, S.: Homogenization of Diffusion Processes in Random Fields. In Publications de l'Ecole Doctorale de l'Ecole Polytechnique, Palaiseau, (1994) (Available at http://www.ceremade.dauphine.fr/∼olla/lho.ps)
9. Varadhan, S. R. S.: Self Diffusion of a Tagged Particle in Equilibrium for Asymmetric Mean Zero Random Walks with Simple Exclusion. Ann. Inst. H. Poincaré (Probabilités) 31, 273–285 (1996)
10. Sethuraman, S., Varadhan, S. R. S., Yau, H. T.: Diffusive Limit of a Tagged Particle in Asymmetric Exclusion Process. Commun. Pure Appl. Math. 53, 972–1006 (2000)
11. Olla, S.: Central Limit Theorems for Tagged Particles and for Diffusions in Random Environment. Notes of the course given at "États de la recherche : Milieux Aléatoires", CIRM 23–25 November 2000. In: Milieux Aléatoires (F. Comets, E. Pardoux, ed.), Panorama et Synthèses 12, 2001, pp. 75–100. Available at http://www.ceremade.dauphine.fr/∼olla/cirmrev.pdf
12. Olla, S., Trémoulet, C.: Equilibrium Fluctuations for Interacting Ornstein-Uhlenbeck particles. Commun. Math. Phys. 233, 463–491 (2003)
13. Freidlin, M.: Some Remarks on the Smoluchowski-Kramers Approximation. J. Stat. Phys. 117, 3/4, 617–634 (2004)
14. Benabou, G.: Comparison between the homogenization of Ornstein-Uhlenbeck and Brownian processes. Preprint, submitted to Stoch. Proc. Appl.

Communicated by H. Spohn

Commun. Math. Phys. 266, 715–733 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0048-7

Communications in

Mathematical Physics

Stability Conditions on a Non-Compact Calabi-Yau Threefold

Tom Bridgeland

School of Mathematics, University of Sheffield, Hicks Building, Hounsfield Road, Sheffield, S3 7RH, UK. E-mail: [email protected]

Received: 2 September 2005 / Accepted: 10 February 2006
Published online: 30 May 2006 – © Springer-Verlag 2006

Abstract: We study the space of stability conditions Stab(X ) on the non-compact Calabi-Yau threefold X which is the total space of the canonical bundle of P2 . We give a combinatorial description of an open subset of Stab(X ) and state a conjecture relating Stab(X ) to the Frobenius manifold obtained from the quantum cohomology of P2 . We give some evidence from mirror symmetry for this conjecture.

1. Introduction The space of stability conditions Stab(X ) on a variety X was introduced in [5] as a mathematical framework for understanding Douglas’s notion of π -stability for D-branes in string theory [13]. This paper is concerned with the case when X = OP2 (−3) is the total space of the canonical line bundle of P2 . This non-compact Calabi-Yau threefold provides an amenable but interesting example on which to test the general theory, and many features of the spectrum of D-branes on X have already been studied in the physics literature (see for example [11, 12, 19]). So far, we are unable to give a complete description of Stab(X ). However, using the results of [7], we define an open subset Stab0 (X ) ⊂ Stab(X ) which is a disjoint union of regions indexed by the elements of an affine braid group. The combinatorics of these regions leads us to conjecture a precise connection between Stab0 (X ) and the Frobenius manifold defined by the quantum cohomology of P2 . Our main aim is to assemble some convincing evidence for this conjecture and to discuss some of its consequences. The existence of deep connections between quantum cohomology and derived categories has been known for some time. In particular, following observations of Cecotti and Vafa [10] and Zaslow [29], Dubrovin conjectured [15] that the derived category of a Fano variety Y has a full exceptional collection (E 0 , E 1 , . . . , E n−1 ) if and only if the quantum cohomology of Y is generically semisimple, and that in this case the Stokes matrix Si j of the corresponding Frobenius manifold should coincide with the


Gram matrix χ (E i , E j ) for the Euler form of D(Y ). This statement has been verified for projective spaces [21, 28]. It was pointed out by Bondal and Kontsevich that a heuristic explanation for Dubrovin’s conjecture can be given using mirror symmetry. The mirror of a Fano variety with a full exceptional collection is expected to be an affine variety Yˇ , together with a holomorphic function f : Yˇ → C with isolated singularities. The Frobenius manifold arising from the quantum cohomology of Y should coincide with the Frobenius manifold of Saito type defined on the universal unfolding space of f . The Stokes matrix is then the intersection form evaluated on a distinguished basis of vanishing cycles. Under Kontsevich’s homological mirror proposal [25] the intersection form is identified with the Euler form on D(Y ), and the vanishing cycles, which are discs, correspond to exceptional objects in D(Y ). The conjecture stated below suggests that it may be possible to use spaces of stability conditions to give a more direct link between derived categories and quantum cohomology. To make this work one should somehow define the structure of a Frobenius manifold on the space of stability conditions which in some small patch recovers the usual quantum cohomology picture. At present however, the author has no clear ideas as to how this could be done. The other general conclusion one can draw from the example studied in this paper is that the space of stability conditions Stab(X ) is not an analogue of the stringy Kähler moduli space, but rather some extended version of it. The picture seems to be that the space Stab(X ) is a global version of the Frobenius manifold defined by big quantum cohomology, and the stringy Kähler moduli space is a submanifold which near the large volume limit is defined by the small quantum cohomology locus. In the rest of the introduction we shall describe our results in more detail. 
Missing definitions and proofs are hopefully covered in the main body of the paper.

1.1. A stability condition [5] on a triangulated category $\mathcal{D}$ consists of a full abelian subcategory $\mathcal{A} \subset \mathcal{D}$ called the heart, together with a group homomorphism $Z : K(\mathcal{D}) \to \mathbb{C}$ called the central charge, with the compatibility property that for every nonzero object $E \in \mathcal{A}$ one has
$$Z(E) \in H = \{z \in \mathbb{C} : z = r\exp(i\pi\phi) \text{ with } r > 0 \text{ and } 0 < \phi \le 1\}.$$
One insists further that $\mathcal{A} \subset \mathcal{D}$ is the heart of a bounded t-structure on $\mathcal{D}$, and that the map $Z$ has the Harder–Narasimhan property. The set of all stability conditions on $\mathcal{D}$ satisfying an extra condition called local-finiteness forms a complex manifold $\mathrm{Stab}(\mathcal{D})$. Forgetting the heart $\mathcal{A} \subset \mathcal{D}$ and remembering the central charge gives a map
$$\mathcal{Z} : \mathrm{Stab}(\mathcal{D}) \to \mathrm{Hom}_{\mathbb{Z}}(K(\mathcal{D}), \mathbb{C}).$$
In this paper we shall consider the case when $\mathcal{D}$ is the subcategory of the bounded derived category of coherent sheaves on $X = \mathcal{O}_{\mathbb{P}^2}(-3)$ consisting of complexes whose cohomology sheaves are supported on the zero section $\mathbb{P}^2 \subset X$. In that case the Grothendieck group $K(\mathcal{D})$ is a free abelian group of rank three. Our main result is


Theorem 1.1. There is a connected open subset $\mathrm{Stab}^0(X) \subset \mathrm{Stab}(X)$ which can be written as a disjoint union of regions
$$\mathrm{Stab}^0(X) = \bigsqcup_{g\in G} D(g),$$
where $G$ is the affine braid group with presentation
$$G = \langle\tau_0, \tau_1, \tau_2 \mid \tau_i\tau_j\tau_i = \tau_j\tau_i\tau_j \text{ for all } i, j\rangle.$$
Each region $D(g)$ is mapped isomorphically by $\mathcal{Z}$ onto a locally-closed subset of the three-dimensional vector space $\mathrm{Hom}_{\mathbb{Z}}(K(\mathcal{D}), \mathbb{C})$, and the closures of two regions $D(g_1)$ and $D(g_2)$ intersect in $\mathrm{Stab}^0(X)$ precisely if $g_1 g_2^{-1} = \tau_i^{\pm 1}$ for some $i \in \{0, 1, 2\}$.

The stability conditions in a given region $D(g)$ all have the same heart $\mathcal{A}(g) \subset \mathcal{D}$. Each of these categories $\mathcal{A}(g)$ is equivalent to a category of nilpotent representations of a quiver with relations of the form

[Figure: a quiver on three vertices arranged in a triangle, with edges labelled a, b, c.]

where the positive integers $a, b, c$ labelling the graph represent the number of arrows in the quiver joining the corresponding vertices. In fact the triples $(a, b, c)$ which come up are precisely the positive integer solutions to the Markov equation
$$a^2 + b^2 + c^2 = abc.$$
We denote by $S_0(g), S_1(g), S_2(g)$ the three simple objects of $\mathcal{A}(g)$ corresponding to the three one-dimensional representations of the quiver. In the case when $g = e$ is the identity we simply write $S_i = S_i(e)$. The objects $S_i(g)$ are spherical objects of $\mathcal{D}$ in the sense of Seidel and Thomas [27]. As such they define autoequivalences $\Phi_{S_i(g)} \in \mathrm{Aut}\,\mathcal{D}$. These descend to give automorphisms
$$\phi_{S_i(g)} \in \mathrm{Aut}\,K(\mathcal{D}), \quad i = 0, 1, 2,$$
which with respect to the fixed basis of $K(\mathcal{D})$ defined by the classes of the objects $S_i$ are given by a triple of matrices $P_0(g), P_1(g), P_2(g) \in \mathrm{SL}(3, \mathbb{Z})$. It turns out that exactly the same system of matrices comes up in the study of the quantum cohomology of $\mathbb{P}^2$.

1.2. Dubrovin showed that the semisimple Frobenius structure arising from the quantum cohomology of $\mathbb{P}^2$ can be analytically continued to give a Frobenius structure on a dense open subset $M$ of the universal cover of the configuration space


$$C_3(\mathbb{C}) = \{(u_0, u_1, u_2) \in \mathbb{C}^3 : i \ne j \implies u_i \ne u_j\}/\mathrm{Sym}_3.$$
Note that in some small ball on $M$ the corresponding prepotential encodes the geometric data of the Gromov–Witten invariants of $\mathbb{P}^2$, but away from this patch there is no such direct interpretation. Thus, just like the space of stability conditions, $M$ is a non-perturbative object, not depending on any choice of large volume limit. Given a point $m \in M$ we denote by $\{u_0(m), u_1(m), u_2(m)\}$ the corresponding unordered triple of points in $\mathbb{C}$, and set
$$C_m = \mathbb{C}\setminus\{u_0(m), u_1(m), u_2(m)\}.$$
Let $W$ denote the space
$$W = \{(m, z) \in M \times \mathbb{C} : z \in C_m\}$$
with its projection $p : W \to M$. Using the Frobenius structure Dubrovin defined a series of flat, holomorphic connections $\check\nabla^{(s)}$ on the pullback of the tangent bundle $p^*(T_M)$. These connections are called the second structure connections. Connections of this type were first introduced by K. Saito in the theory of primitive forms for unfolding spaces. We shall be interested only in the case $s = \tfrac{1}{2}$.

For each $m \in M$ the connection $\check\nabla = \check\nabla^{(1/2)}$ restricts to give a holomorphic connection $\check\nabla_m$ on a trivial rank three bundle over $C_m$. Dubrovin showed that this family of connections is isomonodromic. Define another configuration space
$$C_3(\mathbb{C}^*) = \{(u_0, u_1, u_2) \in (\mathbb{C}^*)^3 : i \ne j \implies u_i \ne u_j\}/\mathrm{Sym}_3$$
and let $\tilde C_3(\mathbb{C}^*)$ be its universal cover. Define
$$M^0 = \{m \in M : 0 \in C_m\}$$
and let $\tilde M^0$ be its inverse image in $\tilde C_3(\mathbb{C}^*)$ under the natural map $\tilde C_3(\mathbb{C}^*) \to \tilde C_3(\mathbb{C})$. We can choose a base-point $m \in M^0$ such that $\{u_0(m), u_1(m), u_2(m)\}$ are the three roots of unity. Let $(\gamma_0, \gamma_1, \gamma_2)$ denote the following basis of $\pi_1(C_m, 0)$:

[Figure: the three loops γ0, γ1, γ2 in C_m based at 0.]

Let m ∈ U ⊂ M 0 be a small simply-connected neighbourhood of m. For each point m ∈ U there is a chosen basis of π1 (Cm , 0) obtained by deforming the loops γi , which we also denote (γ0 , γ1 , γ2 ). Let V be the space of flat sections of ∇ˇ m near the origin 0 ∈ C. Using the connection ∇ˇ we can identify V with the space of flat sections of ∇ˇ m˜ near 0 ∈ C for all points m˜ ∈ M˜ 0 . As we explain in Sect. 2.3, the group G is a subgroup of π1 (C3 (C∗ )), and


hence acts on C˜ 3 (C∗ ). Taking the monodromy of the connection ∇ˇ m˜ around the loops γi for m˜ ∈ g(U ) ∩ M˜ 0 we obtain linear automorphisms αi (g) ∈ Aut(V ) for i = 0, 1, 2. The following result relates these to the transformations φ Si (g) of the last section. Theorem 1.2. There is a triple of ﬂat sections (φ0 , φ1 , φ2 ) of the second structure connection ∇ˇ such that for all g ∈ G the monodromy transformations αi (g) act by the matrices Pi (g) deﬁned in the last section. This condition ﬁxes the triple (φ0 , φ1 , φ2 ) uniquely up to a scalar multiple. This theorem is a simple recasting of some results of Dubrovin, and boils down to two previously observed coincidences. The first is the fact mentioned in the introduction that the Stokes matrix Si j for the quantum cohomology of P2 coincides with the Gram matrix χ (E i , E j ) for the Euler form on K(P2 ) with respect to a basis coming from an exceptional triple of vector bundles (E 0 , E 1 , E 2 ). The second is that this coincidence is compatible with the braid group actions on these matrices arising on the one hand from the analytic continuation of the Frobenius manifold [16, Theorem 4.6], and on the other from the action of mutations on exceptional triples discovered by Bondal, Gorodentsev and Rudakov [2, 20]. In fact the connection ∇ˇ corresponds to the Gauss–Manin connection on the universal unfolding space of the singularity mirror to the space X . It might perhaps be easier to understand the connection in this geometric way. But part of the point of this paper is to try to avoid passing to the mirror. 1.3. We now describe a conjecture relating the quantum cohomology of P2 to the space of stability conditions on X . The noncompactness of X makes this relationship slightly more complicated than might be expected. In particular, the Euler form χ (−, −) on K(D) is degenerate, with a one-dimensional kernel generated by the class of a skyscraper sheaf [Ox ] ∈ K(D) for x ∈ P2 ⊂ X . 
In terms of the basis defined by the spherical objects $S_i = S_i(e)$ one has $[\mathcal{O}_x] = [S_0] + [S_1] + [S_2]$. Since this class is somehow special, and in particular is preserved by all autoequivalences of $\mathcal{D}$, it makes sense to define a space of normalised stability conditions by
\[ \operatorname{Stab}^0_n(X) = \{\sigma = (Z, \mathcal{P}) \in \operatorname{Stab}^0(X) : Z(\mathcal{O}_x) = i\}. \]
This is a connected submanifold of $\operatorname{Stab}^0(X)$. Define an affine space
\[ \mathbb{A}^2 = \{(z_0, z_1, z_2) \in \mathbb{C}^3 : z_0 + z_1 + z_2 = i\}. \]
In co-ordinate form the map $\mathcal{Z}$ gives a local isomorphism $\mathcal{Z} : \operatorname{Stab}^0_n(X) \longrightarrow \mathbb{A}^2$ obtained by sending a stability condition to the triple $(Z(S_0), Z(S_1), Z(S_2))$.
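The degeneracy of the Euler form can be checked by hand or by machine. The following is a minimal sketch, assuming the Ext dimensions $(a, b, c) = (3, 3, 3)$ computed in Section 2.2 and the sign convention that a class in $\operatorname{Hom}^1$ contributes $-1$ to $\chi$, with the remaining entries fixed by Serre duality on a Calabi-Yau threefold. The names `CHI` and `rank` are illustrative only.

```python
from fractions import Fraction

# CHI[i][j] = chi(S_i, S_j) on the basis ([S0], [S1], [S2]).
# Assumption: Hom^1(S0,S1), Hom^1(S1,S2), Hom^1(S2,S0) all have dimension 3,
# and Serre duality makes the form antisymmetric.
CHI = [
    [0, -3, 3],
    [3, 0, -3],
    [-3, 3, 0],
]

def rank(rows):
    """Rank of an integer matrix by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

kernel_dim = 3 - rank(CHI)
# chi(S_i, O_x) for the class [O_x] = [S0] + [S1] + [S2].
pairing_with_point = [sum(row) for row in CHI]

print(kernel_dim, pairing_with_point)
```

Under these assumptions the kernel is one-dimensional and the class $(1, 1, 1)$ pairs to zero with every $[S_i]$, in agreement with the statement above.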


On the quantum cohomology side, the flat sections $(\phi_0, \phi_1, \phi_2)$ of Theorem 1.2 do not form a basis, and in fact satisfy $\phi_0 + \phi_1 + \phi_2 = 0$. Pulling back the connection $\check\nabla$ via the embedding $M^0 \hookrightarrow W$ defined by $p \mapsto (p, 0)$ we obtain a flat connection on the tangent bundle $T_{M^0}$. Taking co-ordinates whose gradients are the sections $(\phi_0, \phi_1, \phi_2)$ and rescaling appropriately, one obtains a holomorphic map
\[ \mathcal{W} : \tilde M^0 \longrightarrow \mathbb{A}^2. \]
This map is invariant under the free $\mathbb{C}$ action on $\tilde M^0 \subset \tilde C_3(\mathbb{C}^*)$ which lifts the $\mathbb{C}^*$ action on $C_3(\mathbb{C}^*)$ obtained by simultaneously rescaling the points $(u_0, u_1, u_2)$. The quotient $\tilde C_3(\mathbb{C}^*)/\mathbb{C}$ is the universal covering space of
\[ \{[u_0, u_1, u_2] \in \mathbb{P}^2 : u_i \neq 0 \text{ and } u_i \neq u_j\}. \]
The quotient $\tilde M^0/\mathbb{C}$ is therefore a dense open subset. We call the induced map $\mathcal{W} : \tilde M^0/\mathbb{C} \longrightarrow \mathbb{A}^2$ the homogeneous twisted period map. It is a local isomorphism. We can now state our conjecture.

Conjecture 1.3. There is a commuting diagram
\[
\begin{array}{ccc}
\operatorname{Stab}^0_n(X) & \xrightarrow{\;F\;} & \tilde M^0/\mathbb{C} \\
\downarrow{\scriptstyle \mathcal{Z}} & & \downarrow{\scriptstyle \mathcal{W}} \\
\mathbb{A}^2 & = & \mathbb{A}^2
\end{array}
\]

Moreover $F$ is an isomorphism onto a dense open subset.

Proving this conjecture would require a more detailed understanding of the geometry of the homogeneous twisted period map. In particular, it would be necessary to find an open subset of $\tilde M^0/\mathbb{C}$ which was mapped isomorphically by $\mathcal{W}$ onto the subset
\[ \{(z_0, z_1, z_2) \in \mathbb{A}^2 : \operatorname{Im}(z_i) > 0\}, \]
which is the image of the interior of the region $D(e)$ under the map $\mathcal{Z}$.

1.4. Here we describe two pieces of evidence for Conjecture 1.3. First consider the submanifold $D \subset C_3(\mathbb{C})$ defined parametrically by taking the unordered triple of points $u_i = -1 + z^{1/3}$ for some $z \in \mathbb{C} \setminus \{0, 1\}$. The inverse image of $D$ in the universal cover $\tilde C_3(\mathbb{C})$ is contained in the open subspace $M$. The submanifold $D$ (or its inverse image in $M$) is the small quantum cohomology locus; in the standard flat co-ordinates it is given by $(t_0, t_1, t_2) = (-1, e^z, 0)$. Dubrovin showed [18, Prop. 5.13] that on this locus the homogeneous twisted period map satisfies the differential equation
\[ \Big[\theta_z^3 - z\big(\theta_z + \tfrac{1}{3}\big)\big(\theta_z + \tfrac{2}{3}\big)\theta_z\Big]\mathcal{W} = 0, \qquad \theta_z \equiv z\frac{d}{dz}. \]


This is the Picard–Fuchs equation for the periods of the mirror of $X$, and is thus precisely the equation satisfied by the central charge on the stringy Kähler moduli space [1, 11].

A second piece of evidence for Conjecture 1.3 is that if we go down a dimension to the case $X = \mathcal{O}_{\mathbb{P}^1}(-2)$ the corresponding statement is known to be at least nearly true. In that case the space $M$ is the universal cover of
\[ C_2(\mathbb{C}) = \{(u_0, u_1) \in \mathbb{C}^2 : u_0 \neq u_1\}, \]
so that $\tilde M^0/\mathbb{C}$ is the universal cover of $\mathbb{C} \setminus \{0, 1\}$ with co-ordinate $\lambda = u_1/u_0$. In this case we must take the second structure connection with $s = 0$ (in general, for a projective space of dimension $d$ we should take $s = (d - 1)/2$). Thus the homogeneous twisted period map in this case is just the homogeneous part of the standard period map for the quantum cohomology of $\mathbb{P}^1$. This was computed by Dubrovin. Identifying the affine space $\mathbb{A}^1 = \{(z_0, z_1) : z_0 + z_1 = i\}$ with $\mathbb{C}$ via the map $(z_0, z_1) \mapsto z_0$, the equation [14, G.20] implies that the homogeneous period map is
\[ \mathcal{W}(\lambda) = \frac{1}{\pi}\cos^{-1}\Big(\frac{1 + \lambda}{1 - \lambda}\Big). \]
On the other hand the space $\operatorname{Stab}(X)$ was studied in [8]. The corresponding open subset $\operatorname{Stab}^0(X) \subset \operatorname{Stab}(X)$ is actually a connected component, and the corresponding space $\operatorname{Stab}^0_n(X)$ is a covering space of $\mathbb{C} \setminus \mathbb{Z}$. This gives the following result.

Theorem 1.4. In the case $X = \mathcal{O}_{\mathbb{P}^1}(-2)$ there is a commuting diagram
\[
\begin{array}{ccc}
\operatorname{Stab}^0_n(X) & \xleftarrow{\;H\;} & \widetilde{\mathbb{C} \setminus \{0, 1\}} \\
\downarrow{\scriptstyle \mathcal{Z}} & & \downarrow{\scriptstyle \mathcal{W}} \\
\mathbb{C} \setminus \mathbb{Z} & = & \mathbb{C} \setminus \mathbb{Z}
\end{array}
\]

in which all the maps are covering maps. In fact one expects Stab0n (X ) to be simply-connected so that H is actually an isomorphism.

2. Stability Conditions on X

In this section we justify the claims about $\operatorname{Stab}(X)$ made in the introduction. In particular we prove Theorem 1.1. We start by summarising some of the necessary definitions. More details can be found in [5, 7].


2.1. Stability conditions and tilting. Let $\mathcal{D}$ be a triangulated category. Recall that a bounded t-structure on $\mathcal{D}$ determines and is determined by its heart, which is an abelian subcategory $\mathcal{A} \subset \mathcal{D}$. One has an identification of Grothendieck groups $K(\mathcal{D}) = K(\mathcal{A})$. A stability function on an abelian category $\mathcal{A}$ is defined to be a group homomorphism $Z : K(\mathcal{A}) \to \mathbb{C}$ such that
\[ 0 \neq E \in \mathcal{A} \implies Z(E) \in \mathbb{R}_{>0} \exp(i\pi\phi(E)) \text{ with } 0 < \phi(E) \le 1. \]
The real number $\phi(E) \in (0, 1]$ is called the phase of the object $E$. A nonzero object $E \in \mathcal{A}$ is said to be semistable with respect to $Z$ if every subobject $0 \neq A \subset E$ satisfies $\phi(A) \le \phi(E)$. The stability function $Z$ is said to have the Harder–Narasimhan property if every nonzero object $E \in \mathcal{A}$ has a finite filtration
\[ 0 = E_0 \subset E_1 \subset \cdots \subset E_{n-1} \subset E_n = E \]
whose factors $F_j = E_j/E_{j-1}$ are semistable objects of $\mathcal{A}$ with
\[ \phi(F_1) > \phi(F_2) > \cdots > \phi(F_n). \]
A simple sufficient condition for the existence of Harder–Narasimhan filtrations was given in [5, Prop. 2.4]. In particular the Harder–Narasimhan property always holds when $\mathcal{A}$ has finite length. The definition of a stability condition appears in [5]. For our purposes the following equivalent definition will be more useful, see [5, Prop. 5.3].

Definition 2.1. A stability condition on $\mathcal{D}$ consists of a bounded t-structure on $\mathcal{D}$ and a stability function on its heart which has the Harder–Narasimhan property.

The induced map $Z : K(\mathcal{D}) \to \mathbb{C}$ is called the central charge of the stability condition. It was shown in [5] that the set of stability conditions on $\mathcal{D}$ satisfying an additional condition called local-finiteness form the points of a complex manifold $\operatorname{Stab}(\mathcal{D})$. In general this manifold will be infinite-dimensional, but in the cases we consider in this paper $K(\mathcal{D})$ is of finite rank, and it follows that $\operatorname{Stab}(\mathcal{D})$ has finite dimension.

To construct t-structures we use the method of tilting introduced by Happel, Reiten and Smalø [22], based on earlier work of Brenner and Butler [4].
Suppose $\mathcal{A} \subset \mathcal{D}$ is the heart of a bounded t-structure and is a finite length abelian category. Note that the t-structure is completely determined by the set of simple objects of $\mathcal{A}$; indeed $\mathcal{A}$ is the smallest extension-closed subcategory of $\mathcal{D}$ containing this set of objects. Given a simple object $S \in \mathcal{A}$ define $\langle S \rangle \subset \mathcal{A}$ to be the full subcategory consisting of objects $E \in \mathcal{A}$ all of whose simple factors are isomorphic to $S$. One can either view $\langle S \rangle$ as the torsion part of a torsion theory on $\mathcal{A}$, in which case the torsion-free part is
\[ \mathcal{F} = \{E \in \mathcal{A} : \operatorname{Hom}_{\mathcal{A}}(S, E) = 0\}, \]
or as the torsion-free part, in which case the torsion part is
\[ \mathcal{T} = \{E \in \mathcal{A} : \operatorname{Hom}_{\mathcal{A}}(E, S) = 0\}. \]
The corresponding tilted subcategories are defined to be
\[ L_S \mathcal{A} = \{E \in \mathcal{D} : H^i(E) = 0 \text{ for } i \notin \{0, 1\},\ H^0(E) \in \mathcal{F} \text{ and } H^1(E) \in \langle S \rangle\}, \]
\[ R_S \mathcal{A} = \{E \in \mathcal{D} : H^i(E) = 0 \text{ for } i \notin \{-1, 0\},\ H^{-1}(E) \in \langle S \rangle \text{ and } H^0(E) \in \mathcal{T}\}. \]
They are the hearts of new bounded t-structures on $\mathcal{D}$.


2.2. Quivery subcategories. Let $X = \mathcal{O}_{\mathbb{P}^2}(-3)$ with its projection $\pi : X \to \mathbb{P}^2$. Let $\mathcal{D}$ denote the full subcategory of the bounded derived category of coherent sheaves on $X$ consisting of complexes supported on the zero-section $\mathbb{P}^2 \subset X$. Let $\operatorname{Stab}(X)$ denote the space of locally-finite stability conditions on $\mathcal{D}$. Let $(E_0, E_1, E_2)$ be an exceptional collection of vector bundles on $\mathbb{P}^2$. Any exceptional collection in $D^b \operatorname{Coh}(\mathbb{P}^2)$ is of this form up to shifts. It was proved in [7] that there is an equivalence of categories
\[ \operatorname{Hom}^{\bullet}\Big(\bigoplus_{i=0}^{2} \pi^* E_i, -\Big) : D^b \operatorname{Coh}(X) \longrightarrow D^b \operatorname{Mod}(B), \]
where $\operatorname{Mod}(B)$ is the category of finite-dimensional right modules for the algebra
\[ B = \operatorname{End}_X\Big(\bigoplus_{i=0}^{2} \pi^* E_i\Big). \]
The algebra $B$ can be described as the path algebra of a quiver with relations taking the form

[Figure: quiver with three vertices and arrows labelled a, b and c.]

Pulling back the standard t-structure on $D^b \operatorname{Mod}(B)$ gives a bounded t-structure on $\mathcal{D}$ whose heart is equivalent to the category of nilpotent modules of $B$. The abelian subcategories $\mathcal{A} \subset \mathcal{D}$ obtained in this way are called exceptional. An abelian subcategory of $\mathcal{D}$ is called quivery if it is of the form $\Phi(\mathcal{A})$ for some exceptional subcategory $\mathcal{A} \subset \mathcal{D}$ and some autoequivalence $\Phi \in \operatorname{Aut}(\mathcal{D})$. Any quivery subcategory $\mathcal{A} \subset \mathcal{D}$ is equivalent to a category of nilpotent modules of an algebra of the above form. As such it has three simple objects $\{S_0, S_1, S_2\}$ corresponding to the three one-dimensional representations of the quiver. These objects $S_i$ are spherical in the sense of Seidel and Thomas and thus give rise to autoequivalences $\Phi_{S_i} \in \operatorname{Aut}(\mathcal{D})$. Note that the three simples $S_i$ completely determine the corresponding quivery subcategory $\mathcal{A} \subset \mathcal{D}$. The Ext groups between them can be read off from the quiver
\[ \operatorname{Hom}^1_{\mathcal{D}}(S_0, S_1) = \mathbb{C}^a, \quad \operatorname{Hom}^1_{\mathcal{D}}(S_1, S_2) = \mathbb{C}^b, \quad \operatorname{Hom}^1_{\mathcal{D}}(S_2, S_0) = \mathbb{C}^c, \]
with the other $\operatorname{Hom}^1$ groups being zero. Serre duality then determines the other groups. Take $\mathcal{A}$ to be the exceptional subcategory of $\mathcal{D}$ corresponding to the exceptional collection $(\mathcal{O}, \mathcal{O}(1), \mathcal{O}(2))$ on $\mathbb{P}^2$. Its simples are
\[ S_0 = i_* \mathcal{O}, \quad S_1 = i_* \Omega^1(1)[1], \quad S_2 = i_* \mathcal{O}(-1)[2], \]
where $i : \mathbb{P}^2 \hookrightarrow X$ is the inclusion of the zero-section, and $\Omega$ denotes the cotangent bundle of $\mathbb{P}^2$. We have $(a, b, c) = (3, 3, 3)$.

Let us compute the automorphisms $\phi_{S_i}$ of $K(\mathcal{D})$ induced by the autoequivalences $\Phi_{S_i}$. The twist functor $\Phi_S$ is defined by the triangle
\[ \operatorname{Hom}^{\bullet}_{\mathcal{D}}(S, E) \otimes S \longrightarrow E \longrightarrow \Phi_S(E), \]
so that, at the level of K-theory,
\[ \phi_S([E]) = [E] - \chi(S, E)[S]. \]
If we write $P_i$ for the matrix representing the transformation $\phi_{S_i}$ with respect to the basis $([S_0], [S_1], [S_2])$ of $K(\mathcal{D})$ then
\[
P_0 = \begin{pmatrix} 1 & 3 & -3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
P_1 = \begin{pmatrix} 1 & 0 & 0 \\ -3 & 1 & 3 \\ 0 & 0 & 1 \end{pmatrix}, \quad
P_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & -3 & 1 \end{pmatrix}.
\]

2.3. Braid group action. It was shown in [7] that if one tilts a quivery subcategory $\mathcal{A} \subset \mathcal{D}$ at one of its simples one obtains another quivery subcategory. To describe this process in more detail we need to define a certain braid group which acts on triples of spherical objects. The three-string annular braid group $CB_3$ is the fundamental group of the configuration space of three unordered points in $\mathbb{C}^*$. It is generated by three elements $\tau_i$ indexed by the cyclic group $i \in \mathbb{Z}_3$ together with a single element $r$, subject to the relations
\[ r \tau_i r^{-1} = \tau_{i+1} \quad \text{for all } i \in \mathbb{Z}_3, \]
\[ \tau_i \tau_j \tau_i = \tau_j \tau_i \tau_j \quad \text{for all } i, j \in \mathbb{Z}_3. \]
For a proof of the validity of this presentation see [24]. If we take the base point to be defined by the three roots of unity, then the elements $\tau_1$ and $r$ correspond to the loops obtained by moving the points as follows:

[Figure: the motions of the three points representing the braids τ1 and r.]
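The matrices $P_0, P_1, P_2$ of Section 2.2 follow mechanically from the K-theory formula $\phi_S([E]) = [E] - \chi(S, E)[S]$. The sketch below reproduces them, assuming that a class in $\operatorname{Hom}^1$ contributes $-1$ to the Euler form and that Serre duality on the Calabi-Yau threefold makes $\chi$ antisymmetric. The helper name `twist_matrix` is hypothetical.

```python
# Euler form on the basis ([S0], [S1], [S2]); CHI[i][j] = chi(S_i, S_j).
# Assumption: chi(S0,S1) = -a, chi(S1,S2) = -b, chi(S2,S0) = -c, and the
# form is antisymmetric by Serre duality.
A = B = C = 3
CHI = [
    [0, -A, C],
    [A, 0, -B],
    [-C, B, 0],
]

def twist_matrix(i, chi=CHI):
    """Matrix of phi_{S_i}: [E] -> [E] - chi(S_i, E) [S_i].

    Column j holds the coordinates of phi_{S_i}([S_j]), so only row i
    differs from the identity matrix.
    """
    n = len(chi)
    return [
        [(1 if r == c else 0) - (chi[i][c] if r == i else 0) for c in range(n)]
        for r in range(n)
    ]

P0, P1, P2 = (twist_matrix(i) for i in range(3))

for name, matrix in (("P0", P0), ("P1", P1), ("P2", P2)):
    print(name, matrix)
```

Each matrix differs from the identity in a single row, which reflects the fact that a twist only adds multiples of the one class $[S_i]$.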

We write $G \subset CB_3$ for the subgroup generated by the three braids $\tau_0, \tau_1, \tau_2$. Define a spherical triple in $\mathcal{D}$ to be a triple of spherical objects $(S_0, S_1, S_2)$ of $\mathcal{D}$. The group $CB_3$ acts on the set of spherical triples in $\mathcal{D}$ by the formulae
\[ \tau_1(S_0, S_1, S_2) = (S_1[-1], \Phi_{S_1}(S_0), S_2), \qquad r(S_0, S_1, S_2) = (S_2, S_0, S_1). \]
The following result allows one to completely understand the process of tilting for quivery subcategories of $\mathcal{D}$.


Proposition 2.2. Let $\mathcal{A} \subset \mathcal{D}$ be a quivery subcategory with simples $(S_0, S_1, S_2)$. Then for each $i = 0, 1, 2$ the three simples of the tilted quivery subcategory $L_{S_i}(\mathcal{A})$ are given by the spherical triple $\tau_i(S_0, S_1, S_2)$.

For each $g \in G$ we then have a quivery subcategory $\mathcal{A}(g) \subset \mathcal{D}$ obtained by repeatedly tilting starting at $\mathcal{A}$. Its three simples are given by the spherical triple
\[ (S_0(g), S_1(g), S_2(g)) = g(S_0, S_1, S_2). \]
Note that the three simples of an arbitrary quivery subcategory have no well-defined ordering, but the above definition gives a chosen order for the simples of the quivery subcategories $\mathcal{A}(g)$. Let $P_i(g) \in SL(3, \mathbb{Z})$ be the matrix representing the automorphism of $K(\mathcal{D})$ induced by the twist functor $\Phi_{S_i(g)}$ with respect to the fixed basis $([S_0], [S_1], [S_2])$. The formulae defining the action of the braid group on spherical triples show that this system of matrices has the following transformation laws:
\[
\begin{aligned}
P_0(\tau_1 g) &= P_1(g), & P_1(\tau_1 g) &= P_1(g) P_0(g) P_1(g)^{-1}, & P_2(\tau_1 g) &= P_2(g), \\
P_0(r g) &= P_2(g), & P_1(r g) &= P_0(g), & P_2(r g) &= P_1(g).
\end{aligned}
\]
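The transformation laws under $\tau_1$ can be checked at $g = e$ directly from the action on spherical triples. The following sketch assumes the Euler form for $(a, b, c) = (3, 3, 3)$ with the sign convention that a class in $\operatorname{Hom}^1$ contributes $-1$, and uses the fact that a shift by $[-1]$ negates a class in $K(\mathcal{D})$. The helper names are illustrative.

```python
# Euler form chi(S_i, S_j) on the basis ([S0], [S1], [S2]); an assumption
# based on (a, b, c) = (3, 3, 3) and antisymmetry from Serre duality.
CHI = [[0, -3, 3], [3, 0, -3], [-3, 3, 0]]
N = 3

def matmul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(N)) for c in range(N)]
            for r in range(N)]

def apply(X, v):
    return [sum(X[r][k] * v[k] for k in range(N)) for r in range(N)]

def twist(v):
    """Matrix of [E] -> [E] - chi(v, E) v for a class v in K(D)."""
    w = [sum(v[k] * CHI[k][c] for k in range(N)) for c in range(N)]  # chi(v, -)
    return [[(1 if r == c else 0) - v[r] * w[c] for c in range(N)]
            for r in range(N)]

def inverse(P):
    """Inverse of a twist matrix: (P - I)^2 = 0 since chi(v, v) = 0."""
    return [[(2 if r == c else 0) - P[r][c] for c in range(N)] for r in range(N)]

e0, e1, e2 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
P0, P1, P2 = twist(e0), twist(e1), twist(e2)

# Classes of tau_1(S0, S1, S2) = (S1[-1], Phi_{S1}(S0), S2).
t0 = [-x for x in e1]
t1 = apply(P1, e0)
t2 = e2

law0 = twist(t0) == P1
law1 = twist(t1) == matmul(matmul(P1, P0), inverse(P1))
law2 = twist(t2) == P2
print(law0, law1, law2)
```

The same functions can be iterated along any word in the generators to tabulate $P_i(g)$, since each step only needs the classes of the current triple.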

Introduce a graph $\Gamma(\mathcal{D})$ whose vertices are the quivery subcategories of $\mathcal{D}$, and in which two subcategories are joined by an edge if they differ by a tilt at a simple object. It was shown in [7] that distinct elements $g \in G$ define distinct subcategories $\mathcal{A}(g) \subset \mathcal{D}$. It follows that each connected component of $\Gamma(\mathcal{D})$ is just the Cayley graph of $G$ with respect to the generators $\tau_0, \tau_1, \tau_2$.

2.4. Stability conditions on X. Given an element $g \in G$ let $\mathcal{A}(g) \subset \mathcal{D}$ be the corresponding quivery subcategory. The class of any nonzero object $E \in \mathcal{A}(g)$ is a strictly positive linear combination:
\[ [E] = \sum_i n_i [S_i(g)] \quad \text{with } n_0, n_1, n_2 \ge 0 \text{ not all zero}. \]
It follows that to define a stability condition on $\mathcal{D}$ we can just choose three complex numbers $z_i$ in the strict upper half-plane
\[ H = \{z \in \mathbb{C} : z = r \exp(i\pi\phi) \text{ with } r > 0 \text{ and } 0 < \phi \le 1\} \]
and set $Z(S_i(g)) = z_i$. The Harder–Narasimhan property is automatically satisfied because $\mathcal{A}(g)$ has finite length. We shall denote the corresponding stability condition by $\sigma(g, z_0, z_1, z_2)$.

Lemma 2.3. If $\sigma = \sigma(g, z_0, z_1, z_2)$ is a stability condition on $\mathcal{D}$ of the sort defined above, and $E \in \mathcal{D}$ is stable in $\sigma$, then there is an open subset $U \subset \operatorname{Stab}(\mathcal{D})$ containing $\sigma$ such that $E$ is stable for all stability conditions in $U$.

Proof. This follows from the arguments of [6, Sect. 8]. It is enough to check that the set of classes $\gamma \in K(\mathcal{D})$ such that there is an object $F \in \mathcal{D}$ with class $[F] = \gamma$ such that $m_\sigma(F) \le m_\sigma(E)$ is finite. This is easy to see because the heart of $\sigma$ has finite length.
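The construction of $\sigma(g, z_0, z_1, z_2)$ in Section 2.4 amounts to extending $Z$ linearly from the simples. The sketch below illustrates this, with sample values of $z_i$ chosen purely for illustration; the function names `phase` and `central_charge` are hypothetical.

```python
import cmath
import math

def phase(z):
    """Return phi in (0, 1] with z = r * exp(i*pi*phi), r > 0.

    Raises ValueError if z does not lie in the region H of the text.
    """
    if z == 0:
        raise ValueError("zero has no phase")
    phi = cmath.phase(z) / math.pi
    if not 0 < phi <= 1:
        raise ValueError(f"{z} does not lie in H")
    return phi

def central_charge(n, charges):
    """Z(E) = sum_i n_i z_i for an object of class sum_i n_i [S_i(g)]."""
    return sum(complex(ni) * zi for ni, zi in zip(n, charges))

# Illustrative choice of Z(S_i(g)) in H; the first value lies on the
# negative real axis, which H allows (phase exactly 1).
charges = (complex(-1, 0), complex(0, 1), complex(1, 1))

for n in [(1, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]:
    z = central_charge(n, charges)
    print(n, z, phase(z))
```

Because $H$ is closed under addition and under multiplication by positive integers, every nonzero class with non-negative coefficients lands in $H$ again, which is why the choice of three numbers in $H$ suffices to define a stability function.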


To each element $g \in G$ there is an associated set of stability conditions
\[ D(g) = \{\sigma(g, z_0, z_1, z_2) : (z_0, z_1, z_2) \in H^3 \text{ with at most one } z_i \in \mathbb{R}\} \subset \operatorname{Stab}(X). \]
By definition these subsets of $\operatorname{Stab}(X)$ are disjoint since they correspond to stability conditions with different hearts.

Proposition 2.4. There is an open subset
\[ \operatorname{Stab}^0(X) = \bigcup_{g \in G} D(g) \subset \operatorname{Stab}(X). \]
If $g_1, g_2 \in G$ then the closures of the regions $D(g_i)$ intersect in $\operatorname{Stab}^0(X)$ precisely if $g_1 = \tau_i^{\pm 1} g_2$ for some $i \in \{0, 1, 2\}$.

Proof. Suppose a point $\sigma = \sigma(g, z_0, z_1, z_2)$ lies in $D(g)$. We must show that there is an open neighbourhood of $\sigma$ contained in the subset $\operatorname{Stab}^0(X)$. The simple objects $S_i = S_i(g) \in \mathcal{A}(g)$ are stable in $\sigma$. They remain stable in a small open neighbourhood $U$ of $\sigma$ in $\operatorname{Stab}(X)$. We repeatedly use the easily proved fact that if $\mathcal{A}, \mathcal{A}' \subset \mathcal{D}$ are hearts of bounded t-structures and $\mathcal{A} \subset \mathcal{A}'$ then $\mathcal{A} = \mathcal{A}'$.

Suppose first that $\operatorname{Im}(z_i) > 0$ for each $i$. Shrinking $U$ we can assume each $S_i$ has phase in the interval $(0, 1)$ for all stability conditions $(Z, \mathcal{P})$ of $U$. Since $\mathcal{A}(g)$ is the smallest extension-closed subcategory of $\mathcal{D}$ containing the $S_i$ it follows that $\mathcal{A}(g)$ is contained in the heart $\mathcal{P}((0, 1])$ of all stability conditions in $U$. This implies that $\mathcal{P}((0, 1]) = \mathcal{A}(g)$ and so $U$ is contained in $D(g)$.

Suppose now that one of the $z_i$, without loss of generality $z_0$, lies on the real axis, so that $\sigma$ lies on the boundary of $D(g)$. Thus $z_0 \in \mathbb{R}_{<0}$, and $\operatorname{Im}(z_i) > 0$ for $i = 1, 2$. Shrinking $U$ we can assume that $\operatorname{Re} Z(S_0) < 0$ and $\operatorname{Im} Z(S_i) > 0$ for $i = 1, 2$ for all stability conditions $(Z, \mathcal{P})$ of $U$. The object $S' = \Phi_{S_0}(S_2) \in \mathcal{D}$ lies in $\mathcal{A}(g)$, and is in fact a universal extension
\[ 0 \longrightarrow S_2 \longrightarrow S' \longrightarrow S_0^{\oplus a} \longrightarrow 0, \]
where $a = \dim \operatorname{Hom}^1_{\mathcal{D}}(S_0, S_2)$. Since $\operatorname{Hom}_{\mathcal{D}}(S_0, S') = 0$ the object $S'$ lies in $\mathcal{P}((0, 1))$ and shrinking $U$ we can assume that this is the case for all stability conditions $(Z, \mathcal{P})$ of $U$. We split $U$ into the two pieces
\[ U_+ = \{\operatorname{Im} Z(S_0) \ge 0\} \quad \text{and} \quad U_- = \{\operatorname{Im} Z(S_0) < 0\}. \]
The argument above shows that $U_+ \subset D(g)$. On the other hand, for any stability condition $(Z, \mathcal{P})$ in $U_-$ the object $S_0$ is stable with phase in the interval $(1, 3/2)$. Thus the heart $\mathcal{P}((0, 1])$ contains the objects $S_0[-1]$, $S'$ and $S_1$. Since these are the simples of the finite length category $\mathcal{A}(\tau_0 g)$ it follows that $U_- \subset D(\tau_0 g)$.

3. Quantum Cohomology and the Period Map

In this section we describe some of Dubrovin’s results concerning the twisted period map of the quantum cohomology of $\mathbb{P}^2$.


3.1. Frobenius manifolds. The notion of a Frobenius manifold was first introduced by Dubrovin, although similar structures arising in singularity theory were studied earlier by K. Saito. A Frobenius manifold is a complex manifold $M$ with a flat metric $g$ and a commutative multiplication
\[ \circ : T_M \otimes T_M \longrightarrow T_M \]
on its tangent bundle, satisfying the compatibility condition $g(X \circ Y, Z) = g(X, Y \circ Z)$. One requires that locally on $M$ there exists a holomorphic function $\Phi$ called the prepotential such that
\[ g(X \circ Y, Z) = XYZ(\Phi) \]
for all flat vector fields $X, Y, Z$. Finally, one also assumes the existence of a flat identity vector field $e$, and an Euler vector field $E$ satisfying
\[ \operatorname{Lie}_E(\circ) = \circ, \qquad \operatorname{Lie}_E(g) = (2 - d) \cdot g \]
for some constant $d$ called the charge of the Frobenius manifold.

Given a smooth projective variety $Y$ of dimension $d$ one can put the structure of a Frobenius manifold of charge $d$ on an open subset of the vector space $H^*(Y, \mathbb{C})$. The metric is the constant metric given by the Poincaré pairing, and the prepotential is defined by an infinite series whose coefficients are the genus zero Gromov–Witten invariants, which naively speaking count rational curves in $Y$. The condition that the resulting algebras be associative translates into the statement that $\Phi$ satisfies the WDVV equations. In turn, these equations boil down to certain relations between Gromov–Witten classes arising from the structure of the cohomology ring of the moduli space of pointed rational curves. The Gromov–Witten invariants are only non-vanishing in certain degrees, which gives the existence of the Euler vector field. The resulting Frobenius manifold is called the quantum cohomology of $Y$.

Actually, for a general projective variety $Y$ (even a Calabi-Yau threefold) it is not known whether the series defining the prepotential has nonzero radius of convergence, so one has to work over a formal coefficient ring.
But we shall only be interested in the case when $Y$ is a projective space and here it is known that there are no convergence problems. For example, in the case $Y = \mathbb{P}^2$ we can take co-ordinates $t = t_0 + t_1 \omega + t_2 \omega^2$, where $\omega \in H^2(X, \mathbb{C})$ is the class of a line, and the function $\Phi$ is then defined by the series
\[ \Phi(t) = \frac{1}{2}\big(t_0^2 t_2 + t_0 t_1^2\big) + \sum_{k \ge 1} \frac{n_k}{(3k - 1)!}\, t_2^{3k-1} e^{k t_1}, \]
where $n_k$ is the number of curves of degree $k$ on $\mathbb{P}^2$ passing through $3k - 1$ generic points. The Euler vector field is
\[ E = t_0 \frac{\partial}{\partial t_0} + 3 \frac{\partial}{\partial t_1} - t_2 \frac{\partial}{\partial t_2}. \]
See [16, Lect. 1] for more details.
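For this prepotential the WDVV equations are known to reduce to Kontsevich's recursion for the numbers $n_k$. The recursion is not stated in the text above; it is quoted here as a standard fact, and the function name `n` is chosen to match the notation of the series. A minimal sketch:

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def n(d):
    """Number of degree-d rational plane curves through 3d - 1 generic points.

    Uses Kontsevich's recursion (quoted as a known consequence of WDVV,
    not derived in the text), starting from n(1) = 1: one line through
    two points.
    """
    if d < 1:
        raise ValueError("degree must be a positive integer")
    if d == 1:
        return 1
    total = 0
    for d1 in range(1, d):
        d2 = d - d1
        total += n(d1) * n(d2) * (
            d1 * d1 * d2 * d2 * comb(3 * d - 4, 3 * d1 - 2)
            - d1 ** 3 * d2 * comb(3 * d - 4, 3 * d1 - 1)
        )
    return total

print([n(d) for d in range(1, 5)])  # [1, 1, 12, 620]
```

The first values recover the classical counts: one line through two points, one conic through five points, and twelve rational cubics through eight points.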


3.2. Tame Frobenius manifolds. Let $M$ be a Frobenius manifold. Multiplication by the Euler vector field $E$ defines a section $U \in \operatorname{End}(T_M)$. A point $m \in M$ is called tame if the endomorphism $U$ has distinct eigenvalues. The set of tame points of $M$ forms an open (possibly empty) subset of $M$. A Frobenius manifold will be called tame if all its points are tame. Let
\[ C_n(\mathbb{C}) = \{(u_0, \ldots, u_{n-1}) \in \mathbb{C}^n : i \neq j \implies u_i \neq u_j\} / \operatorname{Sym}_n \]
be the configuration space of $n$ unordered points in $\mathbb{C}$. Dubrovin showed that if $M$ is a tame Frobenius manifold the map $M \to C_n(\mathbb{C})$ defined by the eigenvalues of $U$ is a regular covering of an open subset of $C_n(\mathbb{C})$. This means that locally one can use the functions $u_i$ as co-ordinates on $M$. In terms of these canonical co-ordinates the product structure is
\[ \frac{\partial}{\partial u_i} \cdot \frac{\partial}{\partial u_j} = \delta_{ij} \frac{\partial}{\partial u_i}, \]
and the Euler field takes the simple form
\[ E = \sum_i u_i \frac{\partial}{\partial u_i}. \]
The non-trivial data of the Frobenius structure on $M$ is entirely contained in the dependence of the metric on the canonical co-ordinates.

Given a Frobenius manifold $M$ it is natural to ask whether it is possible to analytically continue the prepotential to obtain a larger Frobenius manifold $M'$ such that $M$ can be identified with an open subset of $M'$. Dubrovin showed how to do this for tame Frobenius manifolds.

Theorem 3.1 [16, Theorem 4.7]. Given a tame Frobenius manifold $M$ of dimension $n$, there is a regular covering space $\tilde C_n(\mathbb{C}) \to C_n(\mathbb{C})$, a divisor $B \subset \tilde C_n(\mathbb{C})$, and a tame Frobenius structure on $M' = \tilde C_n(\mathbb{C}) \setminus B$ such that there is an open inclusion of Frobenius manifolds $M \hookrightarrow M'$.

Let $M$ be a Frobenius manifold. Define a subset of $M \times \mathbb{C}$,
\[ W = \{(p, z) \in M \times \mathbb{C} : \det(U - z\mathbf{1}) \neq 0\}, \]
and let $p : W \to M$ be the projection map. For each parameter $s \in \mathbb{C}$ one can define a flat, holomorphic connection $\check\nabla = \check\nabla^{(s)}$ on the bundle $p^*(T_M)$, by the following formulae:
\[ \check\nabla_X Y = \nabla_X Y - (\nabla E + c\mathbf{1})(U - z\mathbf{1})^{-1}(X \circ Y), \]
\[ \check\nabla_{\partial/\partial z} Y = \nabla_{\partial/\partial z} Y + (\nabla E + c\mathbf{1})(U - z\mathbf{1})^{-1}(Y). \]
Here $\nabla$ is the Levi–Civita connection corresponding to the flat metric $g$ on $M$, $\nabla E$ is the endomorphism of $T_M$ defined by $X \mapsto \nabla_X E$, and
\[ c = s + \frac{d - 1}{2}. \]
We shall be interested in the case when $c = d - 1$.


Assume now that $M$ is a tame Frobenius manifold with its canonical co-ordinates $u_0, \ldots, u_{n-1}$. The space $W$ takes the form
\[ W = \{(m, z) \in M \times \mathbb{C} : z \neq u_i \text{ for } 0 \le i \le n - 1\}. \]
For each $m \in M$ the connection $\check\nabla$ restricts to give a holomorphic connection $\check\nabla_m$ on a trivial bundle over the space $\mathbb{C}_m = \mathbb{C} \setminus \{u_0, \ldots, u_{n-1}\}$. Dubrovin showed that these connections $\check\nabla_m$ vary isomonodromically. We briefly explain this condition. Choose a point $m \in M$ and a loop $\gamma_m$ in $\mathbb{C}_m$ based at some point $z \in \mathbb{C}_m$. Let $H$ be the space of flat sections of $\check\nabla_m$ near $z \in \mathbb{C}_m$. Monodromy around the loop $\gamma_m$ defines a linear transformation $\alpha_m \in \operatorname{Aut}(H)$. For points $m' \in M$ in a small neighbourhood of $m$ the connection $\check\nabla$ allows us to identify $H$ with the space of flat sections of the connection $\check\nabla_{m'}$ near $z \in \mathbb{C}_{m'}$. Moreover we can continuously deform the loop $\gamma_m$ to give a loop $\gamma_{m'}$ in $\mathbb{C}_{m'}$ based at $z$, and hence obtain a transformation $\alpha_{m'} \in \operatorname{Aut}(H)$. The isomonodromy condition is the statement that the transformations $\alpha_{m'}$ of $H$ obtained in this way are constant.

3.3. Quantum cohomology of $\mathbb{P}^2$. Let us now consider the Frobenius manifold defined by the quantum cohomology of $\mathbb{P}^2$. It is known that the subset of tame points of the resulting Frobenius manifold is non-empty, so we can apply Dubrovin’s result Theorem 3.1 to obtain a tame Frobenius manifold structure on a dense open subset
\[ M = \tilde C_3(\mathbb{C}) \setminus B, \]
where $\tilde C_3(\mathbb{C})$ is the universal cover. Let $\check\nabla = \check\nabla^{(1/2)}$ be the second structure connection with parameter $s = 1/2$. The following result of Dubrovin’s computes its monodromy.

Theorem 3.2 (Dubrovin). There is a point $m \in M$ with canonical co-ordinates $(u_0, u_1, u_2)$ the three roots of unity. Let $\gamma_0, \gamma_1, \gamma_2$ be the following loops $\gamma_i$ in $\mathbb{C}_m$ based at $0 \in \mathbb{C}$:

[Figure: the loops γ0, γ1, γ2 in C_m based at 0.]

There is a triple $(\phi_0, \phi_1, \phi_2)$ of flat sections of the connection $\check\nabla_m$ in a neighbourhood of $0 \in \mathbb{C}$, such that the monodromy transformations $P_i$ corresponding to the loops $\gamma_i$ act on the triple $(\phi_0, \phi_1, \phi_2)$ by the matrices
\[
\begin{pmatrix} 1 & 3 & -3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 0 & 0 \\ -3 & 1 & 3 \\ 0 & 0 & 1 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & -3 & 1 \end{pmatrix}.
\]
Moreover this triple $(\phi_0, \phi_1, \phi_2)$ is unique up to multiplication by an overall scalar factor.


Proof. The existence of a triple of flat sections with the above monodromy properties follows from general work of Dubrovin on monodromy of twisted period maps [18, Lemma 4.10, 4.12], together with Dubrovin’s computation of the Stokes matrix of the quantum cohomology of $\mathbb{P}^2$ [16, Example 4.4]. Uniqueness is easily checked.

The discriminant of the Frobenius manifold $M$ is the submanifold
\[ \Delta = \{m \in M : u_i(m) = 0 \text{ for some } i\}. \]
Write $M^0 = M \setminus \Delta$ for its complement and let $\tilde M^0$ be its inverse image under the natural map $\tilde C_3(\mathbb{C}^*) \to \tilde C_3(\mathbb{C})$. Let us take the point $m \in M^0$ of Theorem 3.2 as a base-point, and choose a small simply-connected neighbourhood $m \in U \subset M^0$. For each point $m' \in U$ we have a well-defined choice of loops $\gamma_i$ in $\mathbb{C}_{m'}$ based at $0$ obtained by deforming the loops $\gamma_i$ of Theorem 3.2.

The group $G$ is a subgroup of the fundamental group $\pi_1(C_3(\mathbb{C}^*))$ and therefore acts by covering transformations on $\tilde C_3(\mathbb{C}^*)$. Thus to each element $g \in G$ we can associate a corresponding open subset
\[ U_g = g(U) \cap \tilde M^0 \subset \tilde M^0. \]
Using the connection $\check\nabla$ we can continue the triple of sections of Theorem 3.2 to obtain a standard triple of flat sections of $\check\nabla_m$ in a neighbourhood of $0 \in \mathbb{C}$ for all $m \in \tilde M^0$. This is well-defined despite the fact that $\tilde M^0$ may not be simply-connected because of the isomonodromy property and the uniqueness statement in Theorem 3.2. In particular, for each $g \in G$ we obtain a standard triple of flat sections of $\check\nabla_m$ near $0 \in \mathbb{C}$ for all points $m \in U_g$. Taking monodromy of the connection $\check\nabla_m$ around the loops $\gamma_i$ with respect to this standard triple gives three matrices $P_0(g), P_1(g), P_2(g)$.

To calculate these matrices we use the isomonodromy property. For example, consider a path in $\tilde M^0$ from $m$ to $\tau_1(m)$. If we move the loops $\gamma_i$ continuously with the $u_i$ then at the point $\tau_1(m)$ we will obtain the following basis $(\gamma_0', \gamma_1', \gamma_2')$ of $\pi_1(\mathbb{C}_m, 0)$:

[Figure: the deformed loops γ′0, γ′1, γ′2.]

Clearly
\[ \gamma_0' = \gamma_0^{-1} \gamma_1 \gamma_0, \qquad \gamma_1' = \gamma_0, \qquad \gamma_2' = \gamma_2. \]
By the isomonodromy property, the monodromy of the standard triple of sections at $\tau_1(m)$ around the loops $\gamma_i'$ will be the same as the monodromy of the triple of sections at $m$ around the loops $\gamma_i$ and is therefore described by the matrices $P_i(e)$ of Theorem 3.2. But the matrices $P_i(\tau_1)$ describe the monodromy of the same sections around the loops $\gamma_i$. Arguing in this way we see that the matrices $P_i(g)$ have the transformation properties
\[
\begin{aligned}
P_0(\tau_1 g) &= P_1(g), & P_1(\tau_1 g) &= P_1(g) P_0(g) P_1(g)^{-1}, & P_2(\tau_1 g) &= P_2(g), \\
P_0(r g) &= P_2(g), & P_1(r g) &= P_0(g), & P_2(r g) &= P_1(g).
\end{aligned}
\]

These are exactly the same transformation properties satisfied by the linear maps $\phi_{S_i(g)}$. Since the matrices $P_i(e)$ of Theorem 3.2 coincide with the matrices of $\phi_{S_i(e)}$ with respect to the basis $\{S_0(e), S_1(e), S_2(e)\}$ we obtain Theorem 1.2.

3.4. Twisted period map. There is an embedding $M^0 \hookrightarrow W$ obtained by sending a point $m$ to $(m, 0)$. Pulling back the flat connection $\check\nabla$ we obtain a flat connection on the tangent bundle $T_{M^0}$ which we also denote $\check\nabla$. We define flat co-ordinates $\mathcal{W}_i$ whose gradients with respect to the flat metric on $M$ are the flat sections $(\phi_0, \phi_1, \phi_2)$ of Theorem 3.2. Putting them together gives a holomorphic map
\[ \mathcal{W} : \tilde M^0 \longrightarrow \mathbb{C}^3 \]
uniquely defined up to scalar multiples. There is a free action of $\mathbb{C}$ on $\tilde C_3(\mathbb{C}^*)$ lifting the $\mathbb{C}^*$ action which simultaneously rescales the co-ordinates $(u_0, u_1, u_2)$ on $C_3(\mathbb{C}^*)$. Let
\[ \mathbb{A}^2 = \{(z_0, z_1, z_2) \in \mathbb{C}^3 : z_0 + z_1 + z_2 = i\} \]
be the affine space defined in the introduction. Then

Proposition 3.3. There is a unique scalar multiple of the map $\mathcal{W}$ which descends to give a local isomorphism
\[ \mathcal{W} : \tilde M^0/\mathbb{C} \longrightarrow \mathbb{A}^2. \]

Proof. First we show that the only possible linear relation between the solutions $\phi_i$ of Theorem 3.2 is $\phi_0 + \phi_1 + \phi_2 = 0$. Indeed, any such relation must be monodromy invariant, and $(1, 1, 1)$ is the unique vector (up to multiples) preserved by the three given matrices. Secondly we show that this relation does indeed hold. Otherwise $(\phi_0, \phi_1, \phi_2)$ define a basis of solutions and the map $\mathcal{W}$ is a local isomorphism. Let
\[ E = \sum_i u_i \frac{\partial}{\partial u_i} \]
be the Euler vector field. Dubrovin showed that all components $\mathcal{W}_i$ of $\mathcal{W}$ satisfy $\operatorname{Lie}_E(\mathcal{W}_i) = \text{constant}$. We cannot have $\operatorname{Lie}_E(\mathcal{W}) = 0$ since this would contradict the statement that $\mathcal{W}$ is a local isomorphism. Thus there is a two-dimensional subspace of solutions satisfying $\operatorname{Lie}_E(\mathcal{W}_i) = 0$. But this subspace would have to be monodromy invariant, and there are no such two-dimensional subspaces. This gives a contradiction, so the relation holds, and rescaling we can assume that $\mathcal{W}_0 + \mathcal{W}_1 + \mathcal{W}_2 = i$. Now it follows that the only two-dimensional, monodromy invariant subspace of solutions is that generated by the $\phi_i$, so that $\operatorname{Lie}_E(\mathcal{W}) = 0$ and the result follows.

Acknowledgements. The problem of describing $\operatorname{Stab}(\mathcal{O}_{\mathbb{P}^2}(-3))$ was originally conceived as a joint project with Alastair King, and the basic picture described in Theorem 1.1 was worked out jointly with him. It’s a pleasure to thank Phil Boalch who first got me interested in the connections with Stokes matrices and quantum cohomology. Several other people have been extremely helpful in explaining various things about Frobenius manifolds; let me thank here B. Dubrovin, C. Hertling and M. Mazzocco.

References 1. Aspinwall, P., Greene, B., Morrison, D.: Measuring small distances in N = 2 sigma models. Nucl. Phys. B 420, no. 1–2, 184–242 (1994) 2. Bondal, A.: Representations of associative algebras and coherent sheaves. (Russian) Izv. Akad. Nauk SSSR Ser. Mat. 53, no. 1, 25–44 (1989); translation in Math. USSR-Izv. 34, no. 1, 23–42 (1990) 3. Bondal, A., Polishchuk, A.: Homological properties of associative algebras: the method of helices. (Russian) Izv. Ross. Akad. Nauk Ser. Mat. 57, no. 2, 3–50 (1993); translation in Russian Acad. Sci. Izv. Math. 42, no. 2, 219–260 (1994) 4. Brenner, S., Butler, M.: Generalizations of the Bernstein-Gelfand-Ponomarev reflection functors. Representation theory, II (Proc. Second Internat. Conf., Carleton Univ., Ottawa, Ont., 1979), Lecture Notes in Math. 832, Berlin-New York: Springer, 1980, pp. 103–169 5. Bridgeland, T.: Stability conditions on triangulated categories. http://arxiv.org/list/math.AG/0212237, 2002, to appear in Ann. of Maths 6. Bridgeland, T.: Stability conditions on K3 surfaces. http://arxiv.org/list/math.AG/0307164, 2003 7. Bridgeland, T.: T-structures on some local Calabi-Yau varieties. J. Alg. 289, 453–483 (2005) 8. Bridgeland, T.: Stability conditions and Kleinian singularities. http://arxiv.org/list/math.AG/0508257, 2005 9. Bridgeland, T., King, A., Reid, M.: The McKay correspondence as an equivalence of derived categories. J. Amer. Math. Soc. 14, no. 3, 535–554 (2001) 10. Cecotti, S., Vafa, C.: On classification of N = 2 supersymmetric theories. Commun. Math. Phys. 158, no. 3, 569–644 (1993) 11. Diaconescu, D.-E., Gomis, J.: Fractional branes and boundary states in orbifold theories. J. High Energy Phys. 2000, no. 10, Paper 1, 44 pp. 12. Douglas, M., Fiol, B., Römelsberger, C.: The spectrum of BPS branes on a noncompact Calabi-Yau. JHEP 0509, 057 (2005) 13. Douglas, M.: Dirichlet branes, homological mirror symmetry, and stability. In: Proceedings of the International Congress of Mathematicians, Vol. 
III (Beijing, 2002), Beijing: Higher Ed. Press, 2002, pp. 395–408 14. Dubrovin, B.: Geometry of 2D topological field theories. In: Integrable systems and quantum groups (Montecatini Terme, 1993), Lecture Notes in Math. 1620, Berlin: Springer, 1996, pp. 120–348 15. Dubrovin, B.: Geometry and analytic theory of Frobenius manifolds. In: Proceedings of the International Congress of Mathematicians, Vol. II (Berlin, 1998) Doc. Math. 1998 16. Dubrovin, B.: Painlevé transcendents in two-dimensional topological field theory. The Painlevé property, CRM Ser. Math. Phys., New York: Springer, 1999, pp. 287–412 17. Dubrovin, B., Mazzocco, M.: Monodromy of certain Painlevé-VI transcendents and reflection groups. Invent. Math. 141, no. 1, 55–147 (2000) 18. Dubrovin, B.: On almost duality for Frobenius manifolds. In: Geometry, topology, and mathematical physics, Amer. Math. Soc. Transl. Ser. 2, 212, Providence, RI: Amer. Math. Soc., 2004, pp. 75–132 19. Feng, B., Hanany, A., He, Y., Iqbal, A.: Quiver theories, soliton spectra and Picard-Lefschetz transformations. J. High Energy Phys. 2003, no. 2, 056, 33 pp. 20. Gorodentsev, A., Rudakov, A.: Exceptional vector bundles on projective spaces. Duke Math. J. 54, no. 1, 115–130 (1987) 21. Guzzetti, D.: Stokes matrices and monodromy of the quantum cohomology of projective spaces, Commun. Math. Phys. 207, no. 2, 341–383 (1999)


22. Happel, D., Reiten, I., Smalø, S.: Tilting in abelian categories and quasitilted algebras. Mem. Amer. Math. Soc. 120, no. 575 (1996) 23. Hertling, C.: Frobenius manifolds and moduli spaces for singularities. Cambridge Tracts in Mathematics 151, Cambridge: Cambridge University Press, 2002 24. Kent IV, R., Peifer, D.: A geometric and algebraic description of annular braid groups. Internat. J. Algebra Comput. 12, 85–97 (2002) 25. Kontsevich, M.: Homological algebra of mirror symmetry. Proceedings of the International Congress of Mathematicians. Vol. 1, 2, (Zürich, 1994), Basel: Birkhäuser, 1995, pp. 120–139 26. Manin, Yu.: Frobenius manifolds, quantum cohomology, and moduli spaces. American Mathematical Society Colloquium Publications, 47. Providence, RI: Amer. Math. Soc., 1999 27. Seidel, P., Thomas, R.: Braid group actions on derived categories of coherent sheaves. Duke Math. J. 108, no. 1, 37–108 (2001) 28. Tanabé, S.: Invariant of the hypergeometric group associated to the quantum cohomology of the projective space. Bull. Sci. Math. 128, no. 10, 811–827 (2004) 29. Zaslow, E.: Solitons and helices: the search for a math-physics bridge. Commun. Math. Phys. 175, no. 2, 337–375 (1996) Communicated by M.R. Douglas

Commun. Math. Phys. 266, 735–775 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0037-x


Grafting and Poisson Structure in (2+1)-Gravity with Vanishing Cosmological Constant

C. Meusburger

Perimeter Institute for Theoretical Physics, 31 Caroline Street North, Waterloo, Ontario N2L 2Y5, Canada. E-mail: [email protected]

Received: 19 September 2005 / Accepted: 20 January 2006
Published online: 4 May 2006 – © Springer-Verlag 2006

Abstract: We relate the geometrical construction of (2+1)-spacetimes via grafting to phase space and Poisson structure in the Chern-Simons formulation of (2+1)-dimensional gravity with vanishing cosmological constant on manifolds of topology $\mathbb{R} \times S_g$, where $S_g$ is an orientable two-surface of genus $g > 1$. We show how grafting along simple closed geodesics $\lambda$ is implemented in the Chern-Simons formalism and derive explicit expressions for its action on the holonomies of general closed curves on $S_g$. We prove that this action is generated via the Poisson bracket by a gauge invariant observable associated to the holonomy of $\lambda$. We deduce a symmetry relation between the Poisson brackets of observables associated to the Lorentz and translational components of the holonomies of general closed curves on $S_g$ and discuss its physical interpretation. Finally, we relate the action of grafting on the phase space to the action of Dehn twists and show that grafting can be viewed as a Dehn twist with a formal parameter $\theta$ satisfying $\theta^2 = 0$.

1. Introduction

(2+1)-dimensional gravity is of physical interest as a toy model for the (3+1)-dimensional case. It is used as a testing ground which allows one to investigate conceptual questions arising in the quantisation of gravity without being hindered by the technical complexity in higher dimensions. One of these questions is the problem of "quantising geometry" or, more concretely, the problem of recovering geometrical objects with a clear physical interpretation from the gauge theory-like formulations used as a starting point for quantisation. In (2+1) dimensions, the relation between Einstein’s theory of gravity and gauge theory is more direct than in higher dimensional cases, since the theory takes the form of a Chern-Simons gauge theory. Depending on the value of the cosmological constant, vacuum solutions of Einstein’s equations of motion are flat or of constant curvature.
The theory has only a finite number of physical degrees of freedom arising from the matter content and topology of the spacetime. This absence of local gravitational degrees of freedom manifests itself mathematically in the possibility to formulate the theory as a Chern-Simons gauge theory [1, 2] where the gauge group is the (2+1)-dimensional Poincaré group $P_3^\uparrow$, the group $SO(3,1) \cong SL(2,\mathbb{C})/\mathbb{Z}_2$ or $SO(2,2) \cong (SL(2,\mathbb{R}) \times SL(2,\mathbb{R}))/\mathbb{Z}_2$, respectively, for cosmological constant $\Lambda = 0$, $\Lambda > 0$ and $\Lambda < 0$.

The main advantage of the Chern-Simons formulation of (2+1)-dimensional gravity is that it allows one to apply gauge theoretical concepts and methods which give rise to an efficient description of phase space and Poisson structure. As Einstein’s equations of motion take the form of a flatness condition on the gauge field, physical states can be characterised by holonomies, and conjugation invariant functions of the holonomies form a complete set of physical observables. Starting with the work of Nelson and Regge [3–7], Martin [8] and of Ashtekar, Husain, Rovelli, Samuel and Smolin [9], the description of (2+1)-dimensional gravity in terms of holonomies and the associated gauge invariant observables has proven useful in clarifying the structure of its classical phase space as well as in quantisation. An overview of different approaches and results is given in [10].

The disadvantage of this approach is that it makes it difficult to recover the geometrical picture of a spacetime manifold and thereby complicates the physical interpretation of the theory. Except for cases where the holonomies take a particularly simple form such as static spacetimes and the torus universe, it is in general not obvious how the description of the phase space in terms of holonomies and associated gauge invariant observables gives rise to a Lorentz metric on a spacetime manifold. The first to address this problem for general spacetimes was Mess [11], who showed how the geometry of (2+1)-dimensional spacetimes can be reconstructed from a set of holonomies.
More recent results on this problem are obtained in the papers by Benedetti and Guadagnini [12] and by Benedetti and Bonsante [13], which are going to be our main references. They describe the construction of evolving spacetimes from static ones via the geometrical procedure of grafting, which, essentially, consists of inserting small annuli along certain geodesics of the spacetime. As they establish a unified picture for all values of the cosmological constant and show how this change of geometry affects the holonomies, they clarify the relation between holonomies and spacetime geometry considerably.

However, despite these results, the problem of relating spacetime geometry and the description of phase space and Poisson structure in terms of holonomies has not yet been fully solved. The missing link is the role of the Poisson structure. A complete understanding of the gauge invariant observables must include a physical interpretation of the transformations on phase space they generate via the Poisson bracket. Conversely, to interpret the geometrical construction of evolving (2+1)-spacetimes via grafting as a physical transformation, one needs to determine how it affects phase space and Poisson structure.

This paper addresses these questions for (2+1)-gravity with vanishing cosmological constant on manifolds of topology $\mathbb{R} \times S_g$, where $S_g$ is an orientable two-surface of genus $g > 1$. It relates the construction of evolving (2+1)-spacetimes via grafting along simple, closed curves to the description of the phase space in terms of holonomies and the associated gauge invariant observables. The main results can be stated as follows.

1. We show how grafting along a closed, simple geodesic is implemented in the Chern-Simons formulation of (2+1)-dimensional gravity. Using the parametrisation of the phase space in terms of holonomies given in [14, 15], we deduce explicit expressions for the action of grafting on the holonomies of general curves on $S_g$ and investigate its properties as a transformation on phase space.


2. We derive the Hamiltonian that generates this grafting transformation via the Poisson bracket. This Hamiltonian is one of the two basic gauge invariant observables associated to a closed curve on $S_g$ and obtained from the Lorentz component of its holonomy.

3. We demonstrate that there is a symmetry relation between the transformation of the observables associated to a curve $\eta$ under grafting along $\lambda$ and the transformation of the corresponding observables for $\lambda$ under grafting along $\eta$. Infinitesimally, this relation takes the form of a general identity for the Poisson brackets of certain observables associated to the two curves.

4. We show that the action of grafting in our description of the phase space is closely related to the action of (infinitesimal) Dehn twists investigated in an earlier paper [16]. Essentially, grafting can be viewed as a Dehn twist with a formal parameter $\theta$ satisfying $\theta^2 = 0$.

The paper is structured as follows. In Sect. 2 we introduce the relevant definitions and notation, present some background on the (2+1)-dimensional Poincaré group and on hyperbolic geometry and summarise the description of grafted (2+1)-spacetimes in [12, 13] for the case of grafting along multicurves.

In Sect. 3, we briefly review the Hamiltonian version of the Chern-Simons formulation of (2+1)-dimensional gravity. We discuss the role of holonomies and summarise the relevant results of [14, 15], in which phase space and Poisson structure are characterised by a symplectic potential on the manifold $(P_3^\uparrow)^{2g}$ with different copies of $P_3^\uparrow$ standing for the holonomies of a set of generators of the fundamental group $\pi_1(S_g)$.

Section 4 discusses the implementation of grafting along closed, simple geodesics in the Chern-Simons formalism. We show how the geometrical procedure of grafting in [12, 13] gives rise to a transformation on the extended phase space $(P_3^\uparrow)^{2g}$ and derive formulas for its action on the holonomies of general elements of the fundamental group $\pi_1(S_g)$.

Section 5 establishes the relation of grafting and Poisson structure. After expressing the symplectic potential on the extended phase space $(P_3^\uparrow)^{2g}$ in terms of variables adapted to the grafting transformations, we show that these transformations are generated by gauge invariant Hamiltonians and therefore act as Poisson isomorphisms. We deduce a general symmetry relation between the Poisson brackets of certain observables associated to general closed curves on $S_g$.

In Sect. 6, we explore the link between grafting and Dehn twists. We review the results concerning Dehn twists derived in [16] and introduce a graphical procedure which allows one to determine the action of grafting on the holonomies of general closed curves on $S_g$. By means of this procedure, we then demonstrate that there is a close relation between the action of grafting and (infinitesimal) Dehn twists. In Sect. 7 we illustrate the general results from Sects. 4 to 6 by applying them to a concrete example. Section 8 contains a summary of our results and concluding remarks.

2. Grafted (2+1) Spacetimes with Vanishing Cosmological Constant: The Geometrical Viewpoint

2.1. The (2+1)-dimensional Poincaré group. Throughout the paper we use Einstein’s summation convention. Indices are raised and lowered with the three-dimensional Minkowski metric $\eta = \mathrm{diag}(1, -1, -1)$, and $\boldsymbol{x} \cdot \boldsymbol{y}$ stands for $\eta(\boldsymbol{x}, \boldsymbol{y})$.

In the following $L_3^\uparrow$ and $P_3^\uparrow = L_3^\uparrow \ltimes \mathbb{R}^3$ denote, respectively, the (2+1)-dimensional proper orthochronous Lorentz and Poincaré group. We identify $\mathbb{R}^3$ and the Lie algebra $so(2,1) = \mathrm{Lie}\, L_3^\uparrow$ as vector spaces. The action of $L_3^\uparrow$ on $\mathbb{R}^3$ in its matrix representation then agrees with its action on $so(2,1)$ via the adjoint action
\[
  \boldsymbol{p} = (p^0, p^1, p^2) \cong p^a J_a, \qquad
  \mathrm{Ad}(u)\boldsymbol{p} = p^a\, u J_a u^{-1} = u^b{}_a\, p^a J_b, \tag{2.1}
\]
where $J_a$, $a = 0, 1, 2$, are the generators of $so(2,1)$. For notational consistency with earlier papers [14–16] considering more general gauge groups we will use the notation $\mathrm{Ad}(u)\boldsymbol{p}$ throughout the paper and often do not distinguish notationally between elements of $so(2,1)$ and associated vectors in $\mathbb{R}^3$. With the parametrisation
\[
  (u, \boldsymbol{a}) = (u, -\mathrm{Ad}(u)\boldsymbol{j}) \in P_3^\uparrow, \qquad
  u \in L_3^\uparrow,\ \boldsymbol{a}, \boldsymbol{j} \in \mathbb{R}^3, \tag{2.2}
\]
the group multiplication in $P_3^\uparrow$ is then given by
\[
  (u_1, \boldsymbol{a}_1) \cdot (u_2, \boldsymbol{a}_2)
  = (u_1 \cdot u_2,\ \boldsymbol{a}_1 + \mathrm{Ad}(u_1)\boldsymbol{a}_2)
  = \bigl(u_1 \cdot u_2,\ -\mathrm{Ad}(u_1 u_2)\bigl(\boldsymbol{j}_2 + \mathrm{Ad}(u_2^{-1})\boldsymbol{j}_1\bigr)\bigr). \tag{2.3}
\]
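The two forms of the translation part in (2.3) can be compared numerically. The following is a minimal sketch, not part of the original text: the helper names (`boost`, `rotation`, `from_j`, `deviation`) and the sample values are chosen here for illustration, and Lorentz elements are assumed to be given as 3x3 matrices acting on $\mathbb{R}^3$.

```python
import math

ETA = (1.0, -1.0, -1.0)  # diagonal of the Minkowski metric diag(1, -1, -1)


def matmul(x, y):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def act(u, x):
    """Ad(u) x: the Lorentz matrix u acting on the vector x, cf. (2.1)."""
    return [sum(u[i][k] * x[k] for k in range(3)) for i in range(3)]


def boost(chi):
    """Boost in the 0-1 plane with rapidity chi (illustrative helper)."""
    c, s = math.cosh(chi), math.sinh(chi)
    return [[c, s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation(phi):
    """Rotation in the 1-2 plane by the angle phi (illustrative helper)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def inverse(u):
    """Inverse of a Lorentz matrix, u^{-1} = eta u^T eta."""
    return [[ETA[i] * u[j][i] * ETA[j] for j in range(3)] for i in range(3)]


def multiply(g1, g2):
    """(u1, a1) . (u2, a2) = (u1 u2, a1 + Ad(u1) a2), first form of (2.3)."""
    (u1, a1), (u2, a2) = g1, g2
    shifted = act(u1, a2)
    return matmul(u1, u2), [a1[i] + shifted[i] for i in range(3)]


def from_j(u, j):
    """Parametrisation (2.2): the element (u, -Ad(u) j)."""
    return u, [-c for c in act(u, j)]


def deviation(u1, j1, u2, j2):
    """Largest difference between the two translation parts in (2.3)."""
    _, lhs = multiply(from_j(u1, j1), from_j(u2, j2))
    back = act(inverse(u2), j1)
    _, rhs = from_j(matmul(u1, u2), [j2[i] + back[i] for i in range(3)])
    return max(abs(lhs[i] - rhs[i]) for i in range(3))


u1 = matmul(boost(0.7), rotation(0.3))
u2 = matmul(rotation(1.1), boost(-0.4))
print(deviation(u1, [0.2, -1.0, 0.5], u2, [1.5, 0.3, -0.8]))
```

The printed deviation should be at the level of floating-point rounding, reflecting that $\boldsymbol{j}$ composes as $\boldsymbol{j}_2 + \mathrm{Ad}(u_2^{-1})\boldsymbol{j}_1$.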

The Lie algebra of $P_3^\uparrow$ is $\mathrm{Lie}\, P_3^\uparrow = iso(2,1)$. Denoting by $J_a$, $a = 0, 1, 2$, the generators of $so(2,1)$, by $P_a$, $a = 0, 1, 2$, the generators of the translations, and choosing the convention $\epsilon_{012} = 1$ for the epsilon tensor, we have the Lie bracket
\[
  [P_a, P_b] = 0, \qquad [J_a, J_b] = \epsilon_{abc} J^c, \qquad [J_a, P_b] = \epsilon_{abc} P^c, \tag{2.4}
\]
and a non-degenerate, Ad-invariant bilinear form $\langle\,,\rangle$ on $iso(2,1)$ is given by
\[
  \langle J_a, P^b \rangle = \delta_a^b, \qquad \langle J_a, J_b \rangle = \langle P^a, P^b \rangle = 0. \tag{2.5}
\]

We represent the generators of $so(2,1)$ by the matrices
\[
  (J_a)_{bc} = -\epsilon_{abc} \tag{2.6}
\]
and denote by $\exp : so(2,1) \to L_3^\uparrow$, $p^a J_a \mapsto e^{p^a J_a}$, the exponential map for $L_3^\uparrow$. As this map is surjective, see for example [17, 18], elements of $L_3^\uparrow$ can be parametrised in terms of a vector $\boldsymbol{p} \in \mathbb{R}^3$ with $p^0 \geq 0$ as $u = e^{-p^a J_a}$. Using expression (2.6) for the generators of $so(2,1)$ and setting
\[
  \hat{\boldsymbol{p}} = \frac{1}{m}\, \boldsymbol{p} \qquad \text{for } m^2 := |\boldsymbol{p}^2| \neq 0, \tag{2.7}
\]
we find
\[
  u_{ab} =
  \begin{cases}
    \hat p_a \hat p_b + \cos m\, (\eta_{ab} - \hat p_a \hat p_b) + \sin m\, \epsilon_{abc}\, \hat p^c & p^a p_a = m^2 > 0, \\
    \eta_{ab} + \epsilon_{abc}\, p^c + \tfrac{1}{2}\, p_a p_b & p^a p_a = 0, \\
    -\hat p_a \hat p_b + \cosh m\, (\eta_{ab} + \hat p_a \hat p_b) + \sinh m\, \epsilon_{abc}\, \hat p^c & p^a p_a = -m^2 < 0.
  \end{cases} \tag{2.8}
\]
Elements $u = e^{-p^a J_a} \in L_3^\uparrow$ are called elliptic, parabolic and hyperbolic, respectively, if $\boldsymbol{p}^2 > 0$, $\boldsymbol{p}^2 = 0$ and $\boldsymbol{p}^2 < 0$. Note that the exponential map is not injective, since $e^{2\pi n J_0} = 1$ for $n \in \mathbb{Z}$. However, in this paper we will be concerned with hyperbolic elements, for which the parametrisation (2.8) in terms of a spacelike vector $\boldsymbol{p} = (p^0, p^1, p^2)$ with $p^0 \geq 0$ is unique.

A convenient way of describing the Lie algebra $iso(2,1)$ and the group $P_3^\uparrow$ has been introduced in [8]. It relies on a formal parameter $\theta$ with $\theta^2 = 0$ analogous to the one occurring in supersymmetry. With the definition
\[
  (P_a)_{bc} = \theta\, (J_a)_{bc} = -\theta\, \epsilon_{abc}, \tag{2.9}
\]

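it follows that the commutator of the matrices $P_a$, $J_a$, $a = 0, 1, 2$, is the Lie bracket (2.4) of the (2+1)-dimensional Poincaré algebra. Definition (2.9) also allows one to parametrise elements of the group $P_3^\uparrow$. Identifying
\[
  (u, \boldsymbol{a}) \cong \bigl(1 + \theta a^b J_b\bigr)\, u, \tag{2.10}
\]
one obtains the multiplication law
\[
\begin{aligned}
  \bigl(1 + \theta a_1^b J_b\bigr) u_1 \cdot \bigl(1 + \theta a_2^c J_c\bigr) u_2
  &= u_1 u_2 + \theta a_1^b J_b\, u_1 u_2 + \theta\, u_1 a_2^b J_b\, u_2 + \theta^2 a_1^b J_b\, u_1 a_2^c J_c\, u_2 \\
  &= \bigl(1 + \theta \bigl(a_1^b J_b + u_1 a_2^b J_b u_1^{-1}\bigr)\bigr)\, u_1 u_2,
\end{aligned} \tag{2.11}
\]
and with the identification (2.1) of $so(2,1)$ and $\mathbb{R}^3$ one recovers the group multiplication law (2.3). Furthermore, the introduction of the parameter $\theta$ makes it possible to express the exponential map $\exp : iso(2,1) \to P_3^\uparrow$ in terms of the exponential map $\exp : so(2,1) \to L_3^\uparrow$ for the (2+1)-dimensional Lorentz group by setting
\[
  e^{p^a J_a + k^a P_a} = e^{(p^a + \theta k^a) J_a}
  = \sum_{n=0}^{\infty} \frac{(p^a J_a)^n}{n!}
  + \theta \sum_{n=0}^{\infty} \sum_{m=0}^{n-1} \frac{(p^a J_a)^m\, k^b J_b\, (p^c J_c)^{n-m-1}}{n!}. \tag{2.12}
\]
To link the parametrisation of elements of $P_3^\uparrow$ in terms of the exponential map with the parametrisation (2.2)
\[
  (u, -\mathrm{Ad}(u)\boldsymbol{j}) = e^{-(p^a + \theta k^a) J_a}, \qquad
  u \in L_3^\uparrow,\ \boldsymbol{j} \in \mathbb{R}^3,\ (p^a + \theta k^a) J_a \in iso(2,1), \tag{2.13}
\]
one uses the identity
\[
  \bigl[(p^a J_a)^n, k^b J_b\bigr] = \sum_{m=1}^{n} \binom{n}{m}\, \mathrm{ad}^m_{p^a J_a}(k^b J_b) \cdot (p^c J_c)^{n-m} \tag{2.14}
\]
in (2.12) and finds that the elements $u \in L_3^\uparrow$, $\boldsymbol{j} \in \mathbb{R}^3$ are given by
\[
  u = e^{-p^a J_a}, \qquad \boldsymbol{j} = T(\boldsymbol{p})\boldsymbol{k} \qquad \text{with } T(\boldsymbol{p}) : \mathbb{R}^3 \to \mathbb{R}^3, \tag{2.15}
\]
\[
  T(\boldsymbol{p})^{ab} k_b J_a = \sum_{n=0}^{\infty} \frac{\mathrm{ad}^n_{p^b J_b}(k^a J_a)}{(n+1)!}
  = k^a J_a + \tfrac{1}{2} \bigl[p^b J_b, k^a J_a\bigr] + \tfrac{1}{6} \bigl[p^c J_c, \bigl[p^b J_b, k^a J_a\bigr]\bigr] + \cdots. \tag{2.16}
\]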
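The hyperbolic case of (2.8) and the relation $\boldsymbol{j} = T(\boldsymbol{p})\boldsymbol{k}$ of (2.15) can be checked numerically with truncated power series. The sketch below is an illustration rather than part of the original derivation. It assumes that the matrices act on vectors with mixed indices, $(J_a)^b{}_c = -\eta^{bd}\epsilon_{adc}$, so that $u_{ab} = \eta_{ac}\, u^c{}_b$, and it stores $U + \theta V$ as the pair `(U, V)`; all function names and sample vectors are chosen here.

```python
import math

ETA = (1.0, -1.0, -1.0)  # diagonal of the Minkowski metric diag(1, -1, -1)


def eps(a, b, c):
    """Levi-Civita symbol on the indices 0, 1, 2 with eps(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) // 2


def contract(p):
    """Matrix of p^a J_a, assuming (J_a)^b_c = -eta^{bd} eps_{adc}, cf. (2.6)."""
    return [[-ETA[b] * sum(p[a] * eps(a, b, c) for a in range(3))
             for c in range(3)] for b in range(3)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def combine(x, y, s):
    """Entrywise x + s * y."""
    return [[x[i][j] + s * y[i][j] for j in range(3)] for i in range(3)]


def neg(x):
    return [[-e for e in row] for row in x]


def dual_exp(x, k, terms=40):
    """exp(x + theta k) = U + theta V for theta^2 = 0, by the series (2.12)."""
    zero = [[0.0] * 3 for _ in range(3)]
    a = [[float(i == j) for j in range(3)] for i in range(3)]  # real part of M^n
    b = zero                                                   # theta part of M^n
    u, v, fact = a, zero, 1.0
    for n in range(1, terms):
        a, b = matmul(a, x), combine(matmul(a, k), matmul(b, x), 1.0)
        fact *= n
        u, v = combine(u, a, 1.0 / fact), combine(v, b, 1.0 / fact)
    return u, v


def t_map(p, k, terms=40):
    """Matrix of (T(p) k)^a J_a from the series (2.16)."""
    big_p, term = contract(p), contract(k)
    total, fact = term, 1.0
    for n in range(1, terms):
        term = combine(matmul(big_p, term), matmul(term, big_p), -1.0)  # [P, term]
        fact *= n + 1
        total = combine(total, term, 1.0 / fact)
    return total


def closed_form(p):
    """Right-hand side of (2.8) for spacelike p (hyperbolic case), indices lowered."""
    m = math.sqrt(-sum(ETA[a] * p[a] * p[a] for a in range(3)))
    up = [c / m for c in p]
    low = [ETA[a] * up[a] for a in range(3)]
    return [[-low[a] * low[b]
             + math.cosh(m) * (ETA[a] * (a == b) + low[a] * low[b])
             + math.sinh(m) * sum(eps(a, b, c) * up[c] for c in range(3))
             for b in range(3)] for a in range(3)]


def max_abs_diff(x, y):
    return max(abs(x[i][j] - y[i][j]) for i in range(3) for j in range(3))


p, k = (0.3, 1.0, 0.5), (0.2, -0.4, 0.7)   # p is spacelike: p.p = -1.16
u, v = dual_exp(neg(contract(p)), neg(contract(k)))
u_lower = [[ETA[a] * u[a][b] for b in range(3)] for a in range(3)]
err_closed = max_abs_diff(u_lower, closed_form(p))        # series versus (2.8)
err_t = max_abs_diff(v, neg(matmul(u, t_map(p, k))))      # V = -u (T(p)k)^a J_a
print(err_closed, err_t)
```

Both printed numbers should be at rounding level. The second comparison uses that $(1 + \theta a^b J_b)u$ with $\boldsymbol{a} = -\mathrm{Ad}(u)\boldsymbol{j}$ has $\theta$-part $-u\, j^b J_b$, by (2.1) and (2.10).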


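Note that the linear map $T(\boldsymbol{p})$ is the same as the one considered in [14, 15], where its properties are discussed in more detail. In particular, it is shown that $T(\boldsymbol{p})$ is bijective, maps $\boldsymbol{p}$ to itself and satisfies $\mathrm{Ad}(e^{-p^a J_a})\, T(\boldsymbol{p}) = T(-\boldsymbol{p})$. Its inverse $T^{-1}(\boldsymbol{p}) : \mathbb{R}^3 \to \mathbb{R}^3$ plays an important role in the parametrisation of the right- and left-invariant vector fields $J_a^L$, $J_a^R$ on $L_3^\uparrow$. For any $F \in C^\infty(L_3^\uparrow)$, we have
\[
\begin{aligned}
  J_a^L F\bigl(e^{-p^b J_b}\bigr) &= \frac{d}{dt}\Big|_{t=0} F\bigl(e^{-t J_a} e^{-p^b J_b}\bigr) = T^{-1}(\boldsymbol{p})_{ab}\, \frac{\partial F}{\partial p_b}, \\
  J_a^R F\bigl(e^{-p^b J_b}\bigr) &= \frac{d}{dt}\Big|_{t=0} F\bigl(e^{-p^b J_b} e^{t J_a}\bigr) = -\mathrm{Ad}\bigl(e^{p^b J_b}\bigr)_a{}^{c}\, T^{-1}(\boldsymbol{p})_{cb}\, \frac{\partial F}{\partial p_b} = -T^{-1}(-\boldsymbol{p})_{ab}\, \frac{\partial F}{\partial p_b}.
\end{aligned} \tag{2.17}
\]

2.2. Hyperbolic geometry. In this subsection we summarise some facts from hyperbolic geometry, mostly following the presentation in [19]. For a more specialised treatment focusing on Fuchsian groups see also [20]. In the following we denote by $H_T \subset \mathbb{R}^3$ the hyperboloids of curvature $-\frac{1}{T}$ with the metric induced by the (2+1)-dimensional Minkowski metric
\[
  H_T = \bigl\{ \boldsymbol{x} \in \mathbb{R}^3 \mid \boldsymbol{x}^2 = T^2,\ x^0 > 0 \bigr\} \tag{2.18}
\]
and realise hyperbolic space $\mathbb{H}^2$ as the hyperboloid $\mathbb{H}^2 = H_1$. The tangent plane in a point $\boldsymbol{p} \in H_T$ is given by
\[
  T_{\boldsymbol{p}} H_T = \boldsymbol{p}^\perp = \bigl\{ \boldsymbol{x} \in \mathbb{R}^3 \mid \boldsymbol{x} \cdot \boldsymbol{p} = 0 \bigr\}, \tag{2.19}
\]
and geodesics on $H_T$ are of the form
\[
  c_{\boldsymbol{p},\boldsymbol{q}}(t) = \boldsymbol{p} \cosh t + \boldsymbol{q} \sinh t \qquad \text{with } \boldsymbol{p}^2 = T^2,\ \boldsymbol{q}^2 = -T^2,\ \boldsymbol{p} \cdot \boldsymbol{q} = 0. \tag{2.20}
\]
They are given as the intersection of $H_T$ with planes through the origin, which can be characterised in terms of their unit (Minkowski) normal vectors
\[
  c_{\boldsymbol{p},\boldsymbol{q}} = H_T \cap \boldsymbol{n}_{\boldsymbol{p},\boldsymbol{q}}^\perp \qquad \text{with} \qquad
  \boldsymbol{n}_{\boldsymbol{p},\boldsymbol{q}} = \frac{1}{T^2}\, \boldsymbol{p} \times \boldsymbol{q} \in T_{c_{\boldsymbol{p},\boldsymbol{q}}(t)} H_T \quad \forall t \in \mathbb{R}. \tag{2.21}
\]
The isometry group of the hyperboloids $H_T$ is the (2+1)-dimensional proper orthochronous Lorentz group $L_3^\uparrow$. The subgroup stabilising a given geodesic maps the associated plane to itself and is generated by the plane’s normal vector. More precisely, for a geodesic $c_{\boldsymbol{p},\boldsymbol{q}}$ parametrised as in (2.20) and with associated normal vector $\boldsymbol{n}_{\boldsymbol{p},\boldsymbol{q}}$ as in (2.21), one has
\[
  \mathrm{Ad}\bigl(e^{\alpha\, n^a_{\boldsymbol{p},\boldsymbol{q}} J_a}\bigr)\, c_{\boldsymbol{p},\boldsymbol{q}}(t) = \cosh(t + \alpha)\, \boldsymbol{p} + \sinh(t + \alpha)\, \boldsymbol{q}, \qquad \alpha \in \mathbb{R}. \tag{2.22}
\]
The uniformization theorem implies that any compact, oriented two-manifold of genus $g > 1$ with a metric of constant negative curvature is given as a quotient $S_\Gamma = H_T / \Gamma$ of a hyperboloid $H_T$ by the action of a cocompact Fuchsian group with $2g$ hyperbolic generators
\[
  \Gamma = \bigl\langle v_{A_1}, v_{B_1}, \ldots, v_{A_g}, v_{B_g} ;\ [v_{B_g}, v_{A_g}^{-1}] \cdots [v_{B_1}, v_{A_1}^{-1}] = 1 \bigr\rangle \subset L_3^\uparrow. \tag{2.23}
\]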
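The geodesic data (2.20), (2.21) and the stabiliser relation (2.22) lend themselves to a quick numerical check. The sketch below is illustrative and rests on two assumptions that the text does not spell out: the Minkowski cross product is taken to be $(\boldsymbol{x} \times \boldsymbol{y})^a = \eta^{ad} \epsilon_{dbc}\, x^b y^c$, and the generators act with mixed indices $(J_a)^b{}_c = -\eta^{bd} \epsilon_{adc}$. Function names and sample values are chosen here.

```python
import math

ETA = (1.0, -1.0, -1.0)  # diagonal of the Minkowski metric diag(1, -1, -1)


def eps(a, b, c):
    """Levi-Civita symbol on the indices 0, 1, 2 with eps(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) // 2


def dot(x, y):
    """Minkowski product x . y = eta(x, y)."""
    return sum(ETA[a] * x[a] * y[a] for a in range(3))


def cross(x, y):
    """Assumed convention: (x × y)^a = eta^{ad} eps_{dbc} x^b y^c."""
    return [ETA[a] * sum(eps(a, b, c) * x[b] * y[c]
                         for b in range(3) for c in range(3))
            for a in range(3)]


def geodesic(p, q, t):
    """c_{p,q}(t) = p cosh t + q sinh t, cf. (2.20)."""
    return [p[a] * math.cosh(t) + q[a] * math.sinh(t) for a in range(3)]


def stabiliser(n, alpha, terms=40):
    """exp(alpha n^a J_a) as a 3x3 matrix, by its power series."""
    x = [[-alpha * ETA[b] * sum(n[a] * eps(a, b, c) for a in range(3))
          for c in range(3)] for b in range(3)]
    result = [[float(i == j) for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[sum(term[i][l] * x[l][j] for l in range(3)) / k
                 for j in range(3)] for i in range(3)]
        result = [[result[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return result


# A geodesic on H_T with T = 2: p.p = T^2, q.q = -T^2, p.q = 0.
T, s, phi = 2.0, 0.4, 0.9
p = [T * math.cosh(s), T * math.sinh(s), 0.0]
q = [T * math.cos(phi) * math.sinh(s), T * math.cos(phi) * math.cosh(s),
     T * math.sin(phi)]
n = [c / T ** 2 for c in cross(p, q)]          # unit normal of (2.21)

alpha, t = 0.6, -0.3
u = stabiliser(n, alpha)
moved = [sum(u[i][k] * geodesic(p, q, t)[k] for k in range(3)) for i in range(3)]
shifted = geodesic(p, q, t + alpha)
error = max(abs(moved[i] - shifted[i]) for i in range(3))
print(dot(n, n), error)
```

With these conventions $\boldsymbol{n}^2 = -1$ and the stabiliser translates the geodesic parameter by $\alpha$, as stated in (2.22). The opposite orientation of the cross product would reverse the sign of the shift.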


The group $\Gamma$ is isomorphic to the fundamental group $\pi_1(S_\Gamma)$, and its action on the hyperboloid $H_T$ agrees with the action of $\pi_1(S_\Gamma)$ via deck transformations. Via its action on $H_T$, it induces a tesselation of $H_T$ by its fundamental regions which are geodesic arc $4g$-gons. In particular, there exists a geodesic arc $4g$-gon $P_\Gamma^T$ in the tesselation of $H_T$, in the following referred to as a fundamental polygon, such that each of the generators of $\Gamma$ and their inverses map the polygon to one of its $4g$ neighbours. If one labels the sides of the polygon by $a_1, a_1', \ldots, b_1, b_1', \ldots, a_g, a_g', b_g, b_g'$ as in Fig. 5, it follows that the generators of $\Gamma$ map side $x \in \{a_1, \ldots, b_g\}$ of the polygon $P_\Gamma^T$ into $x' \in \{a_1', \ldots, b_g'\}$,
\[
  \mathrm{Ad}(v_{A_i}) : a_i \to a_i', \qquad \mathrm{Ad}(v_{B_i}) : b_i \to b_i'. \tag{2.24}
\]
For a general polygon $P_\Gamma'$ in the tesselation related to $P_\Gamma^T$ via $P_\Gamma' = \mathrm{Ad}(v) P_\Gamma^T$, $v \in \Gamma$, the elements of $\Gamma$ mapping this polygon into its $4g$ neighbours are given by $v v_{A_1}^{\pm 1} v^{-1}, \ldots, v v_{B_g}^{\pm 1} v^{-1}$. Geodesics on $S_\Gamma$ are the images of geodesics on $H_T$ under the projection $\Pi_\Gamma^T : H_T \to S_\Gamma$. In particular, a geodesic $c_{\boldsymbol{p},\boldsymbol{q}}$ on $H_T$ gives rise to a closed geodesic on $S_\Gamma$ if and only if there exists a nontrivial element $\tilde{v} \in \Gamma$, the geodesic’s holonomy, which maps $c_{\boldsymbol{p},\boldsymbol{q}}$ to itself. From (2.22) it then follows that the group element $\tilde{v} \in \Gamma$ is obtained by exponentiating a multiple of the geodesic’s normal vector
\[
  \exists\, \alpha \in \mathbb{R}^+ : \qquad \tilde{v} = e^{\alpha\, n^a_{\boldsymbol{p},\boldsymbol{q}} J_a}. \tag{2.25}
\]

Closed geodesics on $S_\Gamma$ are therefore in one to one correspondence with elements of the group $\Gamma$ and hence with elements of the fundamental group $\pi_1(S_\Gamma)$. In the following we will often not distinguish notationally between an element of the fundamental group $\pi_1(S_\Gamma)$ and a closed geodesic or a general closed curve on $S_\Gamma$ representing this element.

2.3. Grafting. Grafting along simple geodesics was first investigated in the context of complex projective structures and Teichmüller theory [21–23]. Following the work of Thurston [24, 25] who considered general geodesic laminations, the topic has attracted much interest in mathematics, for historical remarks see for instance [26]. The role of geodesic laminations in (2+1)-dimensional gravity was first explored by Mess [11] who investigated the construction of (2+1)-dimensional spacetimes from a set of holonomies. More recent work on grafting in the context of (2+1)-dimensional gravity are the papers by Benedetti and Guadagnini [12] and by Benedetti and Bonsante [13]. As we investigate a rather specific situation, namely grafting along closed, simple geodesics in (2+1)-dimensional gravity with vanishing cosmological constant $\Lambda$, we limit our presentation to a summary of the grafting procedure described in [12, 13] for the case of $\Lambda = 0$ and multicurves. For a more general treatment and a discussion of the relation between this grafting procedure and grafting on the space of complex projective structures, we refer the reader to [12, 13].

Given a cocompact Fuchsian group $\Gamma$ with $2g$ generators, there is a well-known procedure for constructing a flat (2+1)-dimensional spacetime of genus $g$ associated to this group, see for example [10]. One foliates the interior of the forward lightcone with the tip at the origin by hyperboloids $H_T$. The cocompact Fuchsian group $\Gamma$ acts on each hyperboloid $H_T$ and induces a tesselation of $H_T$ by geodesic arc $4g$-gons which are mapped into each other by the elements of $\Gamma$. The associated spacetime of genus $g$ is then obtained by identifying on each hyperboloid the points related by the action of $\Gamma$. It is shown in [10] that the $P_3^\uparrow$-valued holonomies of all curves in the resulting spacetime have vanishing translational components and lie in the subgroup $L_3^\uparrow \subset P_3^\uparrow$. However, these spacetimes are of limited physical interest because they are static [10].

Grafting along measured geodesic laminations is a procedure which allows one to construct non-static or genuinely evolving (2+1)-spacetimes associated to a Fuchsian group $\Gamma$. In the following we consider measured geodesic laminations which are weighted multicurves, i.e. countable or finite sets
\[
  G_I = \bigl\{ (c_i, w_i) \mid i \in I \bigr\} \tag{2.26}
\]
of closed, simple non-intersecting geodesics $c_i$ on the associated two-surface $S_\Gamma = \mathbb{H}^2 / \Gamma$, each equipped with a positive number, the weight $w_i > 0$. Geometrically, grafting amounts to cutting the surface $S_\Gamma$ along each geodesic $c_i$ and inserting a strip of width $w_i$ as indicated in Fig. 1. Equivalently, the construction can be described in the universal cover $\mathbb{H}^2$. By lifting each geodesic $c_i$ to a geodesic on $\mathbb{H}^2$ (also denoted $c_i$) and acting on it with the Fuchsian group $\Gamma$, one obtains a $\Gamma$-invariant multicurve on $\mathbb{H}^2$,
\[
  G_I = \bigl\{ (\mathrm{Ad}(v) c_i, w_i) \mid i \in I,\ v \in \Gamma \bigr\}. \tag{2.27}
\]
One then cuts the hyperboloid $\mathbb{H}^2$ along each geodesic $c_i$ in $G_I$, shifts the resulting pieces in the direction of the geodesics’ normal vectors and glues in a strip of width $w_i$ as shown in Fig. 2. The cocompact Fuchsian group $\Gamma$ acts on the resulting surface in such a way that it identifies the images of the points related by the canonical action of $\Gamma$ on $\mathbb{H}^2$, and the associated grafted genus $g$ surface is obtained by taking the quotient with respect to this action of $\Gamma$.

In the construction of flat (2+1)-spacetimes of topology $\mathbb{R} \times S_g$ via grafting, the grafting procedure is performed for each value of the time parametrising $\mathbb{R}$. As in the construction of static spacetimes, one foliates the interior of the forward lightcone by hyperboloids $H_T$. By cutting and inserting strips along the lifted geodesics on each hyperboloid $H_T$, one assigns to each cocompact Fuchsian group $\Gamma$ and each multicurve on $S_\Gamma = \mathbb{H}^2 / \Gamma$ a regular domain $U \subset \mathbb{R}^3$. The cocompact Fuchsian group $\Gamma$ acts on the domain $U$, and the grafted spacetime of topology $\mathbb{R} \times S_g$ is obtained by identifying the points in $U$ related by this action of $\Gamma$.

To give a mathematically precise definition, we follow the presentation in [13]. We consider a multicurve on $S_\Gamma = \mathbb{H}^2 / \Gamma$ as in (2.26) together with its lift to a $\Gamma$-invariant weighted multicurve on $\mathbb{H}^2$ as in (2.27) and parametrise its geodesics as in (2.20),
\[
  G_I = \bigl\{ (c_{\boldsymbol{p}_i, \boldsymbol{q}_i}, w_i) \mid i \in I \bigr\}. \tag{2.28}
\]

Fig. 1. Grafting along a closed simple geodesic c with weight w on a genus 2 surface


Fig. 2. Grafting along a geodesic with weight w in hyperbolic space

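Furthermore, we choose a basepoint $\boldsymbol{x}_0 \in \mathbb{H}^2 - \bigcup_{i \in I} c_{\boldsymbol{p}_i, \boldsymbol{q}_i}$ that does not lie on any of the geodesics. For each point $\boldsymbol{x} \in \mathbb{H}^2 - \bigcup_{i \in I} c_{\boldsymbol{p}_i, \boldsymbol{q}_i}$ outside the geodesics, we choose an arc $a_{\boldsymbol{x}}$ connecting $\boldsymbol{x}_0$ and $\boldsymbol{x}$, pointing towards $\boldsymbol{x}$ and transverse to each of the geodesics it intersects. We then define a map $\rho : \mathbb{H}^2 - \bigcup_{i \in I} c_{\boldsymbol{p}_i, \boldsymbol{q}_i} \to \mathbb{R}^3$ by associating to each intersection point of $a_{\boldsymbol{x}}$ with one of the geodesics the unit normal vector of the geodesic pointing towards $\boldsymbol{x}$ and multiplied with the weight
\[
  \rho(\boldsymbol{x}) = \sum_{i \in I :\ a_{\boldsymbol{x}} \cap c_{\boldsymbol{p}_i, \boldsymbol{q}_i} \neq \emptyset} w_i\, \epsilon_{i,\boldsymbol{x}}\, \boldsymbol{n}_{\boldsymbol{p}_i, \boldsymbol{q}_i}
  \qquad \text{for } \boldsymbol{x} \notin \bigcup_{i \in I} c_{\boldsymbol{p}_i, \boldsymbol{q}_i}, \tag{2.29}
\]
where $\epsilon_{i,\boldsymbol{x}} \in \{\pm 1\}$ is the oriented intersection number of $a_{\boldsymbol{x}}$ and $c_{\boldsymbol{p}_i, \boldsymbol{q}_i}$ with the convention $\epsilon_{i,\boldsymbol{x}} = 1$ if $c_{\boldsymbol{p}_i, \boldsymbol{q}_i}$ crosses $a_{\boldsymbol{x}}$ from the left to the right in the direction of $a_{\boldsymbol{x}}$ and ensures that $\epsilon_{i,\boldsymbol{x}}\, w_i\, \boldsymbol{n}_{\boldsymbol{p}_i, \boldsymbol{q}_i}$ points towards $\boldsymbol{x}$. Similarly, for each point $\boldsymbol{x} \in c_{\boldsymbol{p}_j, \boldsymbol{q}_j}$ that lies on one of the geodesics, we consider a geodesic ray $r_{\boldsymbol{x}}$ starting in $\boldsymbol{x}_0$ and through $\boldsymbol{x}$, transversal to the geodesics at each intersection point, and set
\[
\begin{aligned}
  \rho_-(\boldsymbol{x}) &= \sum_{i \in I - \{j\} :\ r_{\boldsymbol{x}} \cap c_{\boldsymbol{p}_i, \boldsymbol{q}_i} \neq \emptyset} w_i\, \epsilon_{i,\boldsymbol{x}}\, \boldsymbol{n}_{\boldsymbol{p}_i, \boldsymbol{q}_i}, \\
  \rho_+(\boldsymbol{x}) &= w_j\, \epsilon_{j,\boldsymbol{x}}\, \boldsymbol{n}_{\boldsymbol{p}_j, \boldsymbol{q}_j} + \sum_{i \in I - \{j\} :\ r_{\boldsymbol{x}} \cap c_{\boldsymbol{p}_i, \boldsymbol{q}_i} \neq \emptyset} w_i\, \epsilon_{i,\boldsymbol{x}}\, \boldsymbol{n}_{\boldsymbol{p}_i, \boldsymbol{q}_i}.
\end{aligned} \tag{2.30}
\]
On each hyperboloid $H_T$, we now shift the points outside of the geodesics according to
\[
  T\boldsymbol{x} \mapsto T\boldsymbol{x} + \rho(\boldsymbol{x}), \qquad \boldsymbol{x} \in \mathbb{H}^2 - \bigcup_{i \in I} c_{\boldsymbol{p}_i, \boldsymbol{q}_i}, \tag{2.31}
\]
and replace each geodesic by a strip
\[
  T\boldsymbol{x} \mapsto T\boldsymbol{x} + t \rho_+(\boldsymbol{x}) + (1 - t) \rho_-(\boldsymbol{x}), \qquad \boldsymbol{x} \in \bigcup_{i \in I} c_{\boldsymbol{p}_i, \boldsymbol{q}_i} \subset \mathbb{H}^2,\ t \in [0, 1]. \tag{2.32}
\]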
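The shift map (2.29) and the displacement (2.31) can be sketched for a finite family of pairwise disjoint geodesics. This is a simplified illustration, not the construction of [13]: it assumes that each geodesic is given by a unit spacelike normal $\boldsymbol{n}$ with $\boldsymbol{n}^2 = -1$ as the set $\{\boldsymbol{y} \in \mathbb{H}^2 \mid \boldsymbol{y} \cdot \boldsymbol{n} = 0\}$, and that the arc $a_{\boldsymbol{x}}$ is the geodesic segment from $\boldsymbol{x}_0$ to $\boldsymbol{x}$, which crosses a geodesic exactly when the two endpoints lie on opposite sides of it. The names `rho` and `graft` and the sample data are chosen here.

```python
import math

ETA = (1.0, -1.0, -1.0)  # diagonal of the Minkowski metric diag(1, -1, -1)


def dot(x, y):
    """Minkowski product x . y = eta(x, y)."""
    return sum(ETA[a] * x[a] * y[a] for a in range(3))


def rho(x, x0, multicurve):
    """Shift vector of (2.29) for a point x that lies on none of the geodesics.

    multicurve is a list of pairs (n, w): n is a unit spacelike normal
    (n.n = -1) of the geodesic {y in H^2 : y.n = 0}, w > 0 is its weight.
    """
    shift = [0.0, 0.0, 0.0]
    for n, w in multicurve:
        side_x, side_0 = dot(x, n), dot(x0, n)
        if side_x * side_0 < 0:                  # the geodesic separates x0 and x
            # Moving off the geodesic along +n lowers y.n below zero, so the
            # normal pointing towards x is +n if x.n < 0 and -n otherwise.
            sign = 1.0 if side_x < 0 else -1.0
            shift = [shift[a] + w * sign * n[a] for a in range(3)]
    return shift


def graft(x, x0, multicurve, T):
    """Image T x + rho(x) of the point T x on H_T, cf. (2.31)."""
    shift = rho(x, x0, multicurve)
    return [T * x[a] + shift[a] for a in range(3)]


# Two disjoint geodesics: x^1 = 0 and x^1 = x^0 tanh(1), with weights 0.5 and 2.
n1, w1 = [0.0, 1.0, 0.0], 0.5
n2, w2 = [math.sinh(1.0), math.cosh(1.0), 0.0], 2.0
curves = [(n1, w1), (n2, w2)]

x0 = [math.cosh(1.0), -math.sinh(1.0), 0.0]      # basepoint, left of both
x_mid = [math.cosh(0.5), math.sinh(0.5), 0.0]    # between the two geodesics
x_far = [math.cosh(2.0), math.sinh(2.0), 0.0]    # beyond both geodesics

print(rho(x0, x0, curves), rho(x_mid, x0, curves), rho(x_far, x0, curves))
```

In this example the shift vanishes on the basepoint's side, equals $w_1 \boldsymbol{n}_1$ between the geodesics and $w_1 \boldsymbol{n}_1 + w_2 \boldsymbol{n}_2$ beyond both, so each crossing contributes one weighted normal, as in (2.29).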


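From the definitions (2.29), (2.30) of the maps $\rho$, $\rho_\pm$ we see that for each value of $T$, this corresponds to the grafting procedure for hyperbolic space described above. The regular domain $U \subset \mathbb{R}^3$ associated to the multicurve $G_I$ is the image of the forward lightcone under this procedure [13]:
\[
\begin{aligned}
  U &= \bigcup_{T \in \mathbb{R}^+} U_T, \\
  U_T &= \underbrace{\Bigl\{ T\boldsymbol{x} + \rho(\boldsymbol{x}) \Bigm| \boldsymbol{x} \notin \bigcup_{i \in I} c_{\boldsymbol{p}_i, \boldsymbol{q}_i} \Bigr\}}_{=:\, U_T^0}
  \ \cup\ \underbrace{\Bigl\{ T\boldsymbol{x} + t \rho_+(\boldsymbol{x}) + (1 - t) \rho_-(\boldsymbol{x}) \Bigm| \boldsymbol{x} \in \bigcup_{i \in I} c_{\boldsymbol{p}_i, \boldsymbol{q}_i},\ t \in [0, 1] \Bigr\}}_{=:\, U_T^{G_I}},
\end{aligned} \tag{2.33}
\]
where the two-dimensional surfaces $U_T$ are the images of the hyperboloids $H_T$, given as a union of shifted pieces $U_T^0$ of hyperboloids and of strips $U_T^{G_I}$. In particular, the tip of the lightcone is mapped to the initial singularity $U_0$ of the regular domain $U$,
\[
  U_0 = \Bigl\{ \rho(\boldsymbol{x}) \Bigm| \boldsymbol{x} \notin \bigcup_{i \in I} c_{\boldsymbol{p}_i, \boldsymbol{q}_i} \Bigr\}
  \ \cup\ \Bigl\{ t \rho_+(\boldsymbol{x}) + (1 - t) \rho_-(\boldsymbol{x}) \Bigm| \boldsymbol{x} \in \bigcup_{i \in I} c_{\boldsymbol{p}_i, \boldsymbol{q}_i},\ t \in [0, 1] \Bigr\}, \tag{2.34}
\]
which is a graph (more precisely, a real simplicial tree) with each vertex corresponding to the area between two geodesics or between a geodesic and infinity and edges given by $w_i\, \epsilon_{i,\boldsymbol{x}}\, \boldsymbol{n}_{\boldsymbol{p}_i, \boldsymbol{q}_i}$. It is shown in [12] that the parameter $T$ defines a cosmological time function $T_c : U \to \mathbb{R}^+$,
\[
  T_c\bigl(T\boldsymbol{x} + \rho(\boldsymbol{x})\bigr) = T, \qquad
  T_c\bigl(T\boldsymbol{x} + t \rho_+(\boldsymbol{x}) + (1 - t) \rho_-(\boldsymbol{x})\bigr) = T, \tag{2.35}
\]
and that the surfaces $U_T$ in (2.33) are surfaces of constant geodesic distance to the initial singularity $U_0$.

The genus $g$ spacetime associated to the cocompact Fuchsian group $\Gamma$ and the $\Gamma$-invariant multicurve $G_I$ is then obtained by identifying on each surface $U_T$ the images of the points on $H_T$ which are related by the canonical action of $\Gamma$. This is implemented by defining another action of the group $\Gamma$ on $U$. It is shown in [13] that for $\Gamma$-invariant multicurves $G_I$ on $\mathbb{H}^2$ the map
\[
  f_{G_I} : \Gamma \to P_3^\uparrow, \qquad f_{G_I}(v) = \bigl(v, \rho(\mathrm{Ad}(v)\boldsymbol{x}_0)\bigr) \tag{2.36}
\]
defines a group homomorphism which leaves each surface $U_T$ invariant, acts on $U$ freely and properly discontinuously and satisfies
\[
  N\bigl(f_{G_I}(v)\, \boldsymbol{y}\bigr) = \mathrm{Ad}(v) N(\boldsymbol{y}), \tag{2.37}
\]
where $N : U \to \mathbb{H}^2$ is the map that associates to each point in $U$ the corresponding point in $\mathbb{H}^2$,
\[
  N\bigl(T\boldsymbol{x} + \rho(\boldsymbol{x})\bigr) = \boldsymbol{x}, \qquad
  N\bigl(T\boldsymbol{x} + t \rho_+(\boldsymbol{x}) + (1 - t) \rho_-(\boldsymbol{x})\bigr) = \boldsymbol{x}. \tag{2.38}
\]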
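The homomorphism property of (2.36) can be made plausible by a short computation. This is a sketch and not the argument of [13]. It assumes that the shift accumulated along a transverse arc is additive under concatenation of arcs and that it is equivariant under $\Gamma$ because the multicurve is $\Gamma$-invariant. The symbol $\rho_{\boldsymbol{y}}$, denoting the map (2.29) computed with basepoint $\boldsymbol{y}$ instead of $\boldsymbol{x}_0$ (so that $\rho = \rho_{\boldsymbol{x}_0}$), is notation introduced here for illustration only.

```latex
\begin{aligned}
  f_{G_I}(v_1) \cdot f_{G_I}(v_2)
  &= \bigl(v_1 v_2,\ \rho(\mathrm{Ad}(v_1)\boldsymbol{x}_0) + \mathrm{Ad}(v_1)\,\rho(\mathrm{Ad}(v_2)\boldsymbol{x}_0)\bigr)
     && \text{by (2.3)} \\
  &= \bigl(v_1 v_2,\ \rho_{\boldsymbol{x}_0}(\mathrm{Ad}(v_1)\boldsymbol{x}_0)
       + \rho_{\mathrm{Ad}(v_1)\boldsymbol{x}_0}(\mathrm{Ad}(v_1 v_2)\boldsymbol{x}_0)\bigr)
     && \text{assumed equivariance: } \mathrm{Ad}(v)\rho_{\boldsymbol{y}}(\boldsymbol{x}) = \rho_{\mathrm{Ad}(v)\boldsymbol{y}}(\mathrm{Ad}(v)\boldsymbol{x}) \\
  &= \bigl(v_1 v_2,\ \rho_{\boldsymbol{x}_0}(\mathrm{Ad}(v_1 v_2)\boldsymbol{x}_0)\bigr)
     && \text{assumed additivity along } \boldsymbol{x}_0 \to \mathrm{Ad}(v_1)\boldsymbol{x}_0 \to \mathrm{Ad}(v_1 v_2)\boldsymbol{x}_0 \\
  &= f_{G_I}(v_1 v_2).
\end{aligned}
```

The points $\mathrm{Ad}(v)\boldsymbol{x}_0$ lie off the multicurve because $\boldsymbol{x}_0$ does and the multicurve is $\Gamma$-invariant, so every shift in this computation is defined.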


The flat (2+1)-spacetime of genus $g$ associated to the group $\Gamma$ and the multicurve $G_I$ is defined as the quotient of $U$ by the action of $\Gamma$ via $f_{G_I}$. Using the identity (2.37), we find that this amounts to identifying points $\boldsymbol{y}, \boldsymbol{y}' \in \bigcup_{T \in \mathbb{R}^+} U_T^0$ according to
\[
  \boldsymbol{y} \sim \boldsymbol{y}' \quad \Leftrightarrow \quad
  \exists\, v \in \Gamma :\ N(\boldsymbol{y}) = \mathrm{Ad}(v) N(\boldsymbol{y}'),\quad T_c(\boldsymbol{y}) = T_c(\boldsymbol{y}'), \tag{2.39}
\]
where $T_c : U \to \mathbb{R}^+$ is the cosmological time (2.35), and for points $\boldsymbol{y}, \boldsymbol{y}' \in \bigcup_{T \in \mathbb{R}^+} U_T^{G_I}$ parametrised as in (2.33), we have the additional condition $t = t'$. Hence, two points $\boldsymbol{y}, \boldsymbol{y}' \in \bigcup_{T \in \mathbb{R}^+} U_T$ are identified if and only if they lie on the same surface $U_T$ and the corresponding points on $\mathbb{H}^2$ are identified by the canonical action of $\Gamma$ on $\mathbb{H}^2$.

The function $f_{G_I}$ defines the $P_3^\uparrow$-valued holonomies of the resulting spacetime. Via the identification $\Gamma \cong \pi_1(S_g)$ it assigns to each element of the fundamental group $\pi_1(S_g)$ an element of the group $P_3^\uparrow$, whose Lorentz component is the associated element of the Fuchsian group $\Gamma$. However, in contrast to the static spacetimes considered above, it is clear from (2.36) that in grafted (2+1)-spacetimes there exist elements of the fundamental group whose holonomies have a nontrivial translational component.

3. Phase Space and Poisson Structure in the Chern-Simons Formulation of (2+1)-Dimensional Gravity

3.1. The Chern-Simons formulation of (2+1)-dimensional gravity. The formulation of (2+1)-dimensional gravity as a Chern-Simons gauge theory is derived from Cartan’s description, in which Einstein’s theory of gravity is formulated in terms of a dreibein of one-forms $e_a$, $a = 0, 1, 2$, and spin connection one-forms $\omega^a$, $a = 0, 1, 2$, on a spacetime manifold $M$. The dreibein defines a Lorentz metric $g$ on $M$ via
\[
  g = \eta_{ab}\, e^a \otimes e^b, \tag{3.1}
\]
and the one-forms $\omega^a$ are the coefficients of the spin connection
\[
  \omega = \omega^a J_a. \tag{3.2}
\]
To formulate the theory as a Chern-Simons gauge theory, one combines dreibein and spin connection into the Cartan connection [27] or Chern-Simons gauge field
\[
  A = \omega^a J_a + e_a P^a, \tag{3.3}
\]
an $iso(2,1)$-valued one-form whose curvature
\[
  F = T_a P^a + F_\omega^a J_a \tag{3.4}
\]
combines the curvature and the torsion of the spin connection
\[
  F_\omega^a = d\omega^a + \tfrac{1}{2}\, \epsilon^a{}_{bc}\, \omega^b \wedge \omega^c, \qquad
  T_a = de_a + \epsilon_{abc}\, \omega^b \wedge e^c. \tag{3.5}
\]
This allows one to express Einstein’s equations of motion, the requirements of flatness and vanishing torsion, as a flatness condition on the Chern-Simons gauge field
\[
  F = 0. \tag{3.6}
\]
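The decomposition (3.4), (3.5) follows from the Lie bracket (2.4). As a worked step, assuming the usual convention $F = dA + A \wedge A$ with $A \wedge A = \tfrac{1}{2}[A \wedge A]$ for a Lie algebra valued one-form (the text states this convention only for the spatial curvature), one finds:

```latex
\begin{aligned}
  A \wedge A
  &= \tfrac{1}{2}\, \omega^a \wedge \omega^b\, [J_a, J_b]
   + \tfrac{1}{2}\, \bigl( \omega^a \wedge e^b\, [J_a, P_b] + e^a \wedge \omega^b\, [P_a, J_b] \bigr)
   + \tfrac{1}{2}\, e^a \wedge e^b\, [P_a, P_b] \\
  &= \tfrac{1}{2}\, \epsilon_{abc}\, \omega^a \wedge \omega^b\, J^c
   + \epsilon_{abc}\, \omega^a \wedge e^b\, P^c, \\[4pt]
  F = dA + A \wedge A
  &= \bigl( d\omega^c + \tfrac{1}{2}\, \epsilon^{c}{}_{ab}\, \omega^a \wedge \omega^b \bigr) J_c
   + \bigl( de_c + \epsilon_{cab}\, \omega^a \wedge e^b \bigr) P^c
   = F_\omega^c J_c + T_c P^c .
\end{aligned}
```

The two mixed terms are equal because $[P_a, J_b] = -\epsilon_{bac} P^c = \epsilon_{abc} P^c$ and $e^a \wedge \omega^b = -\omega^b \wedge e^a$, and the last term vanishes since translations commute. Setting $F = 0$ therefore imposes $F_\omega^a = 0$ and $T_a = 0$ simultaneously.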


Note, however, that in order to define a metric $g$ of signature $(1,-1,-1)$ via (3.1), the dreibein $e$ has to be non-degenerate, while no such condition is imposed in the corresponding Chern-Simons gauge theory. It is argued in [28] for the case of spacetimes containing particles that this leads to differences in the global structure of the phase spaces of the two theories. A further subtlety concerning the phase space in Einstein's formulation and the Chern-Simons formulation of (2+1)-dimensional gravity arises from the presence of large gauge transformations. It has been shown by Witten [2] that infinitesimal gauge transformations are on-shell equivalent to infinitesimal diffeomorphisms in Einstein's formulation of the theory. This equivalence does not hold for large gauge transformations, which are not infinitesimally generated and arise in Chern-Simons theory with non-simply connected gauge groups such as the group $P_3^\uparrow$. Nevertheless, configurations related by such large gauge transformations are identified in the Chern-Simons formulation of (2+1)-dimensional gravity, potentially causing further differences in the global structure of the two phase spaces. However, as we are mainly concerned with the local properties of the phase space, we will not address these issues any further in this paper.

In the following we consider spacetimes of topology $M\approx\mathbb{R}\times S_g$, where $S_g$ is an orientable two-surface of genus $g>1$. On such spacetimes, it is possible to give a Hamiltonian formulation of the theory. One introduces coordinates $x^0,x^1,x^2$ on $\mathbb{R}\times S_g$ such that $x^0$ parametrises $\mathbb{R}$ and splits the gauge field according to

$$A = A_0\,dx^0 + A_S, \tag{3.7}$$

where $A_S$ is a gauge field on $S_g$ and $A_0:\mathbb{R}\times S_g\to P_3^\uparrow$ a function with values in the (2+1)-dimensional Poincaré group. The Chern-Simons action on $M$ then takes the form

$$S[A_S,A_0] = \int_{\mathbb{R}}dx^0\int_{S_g}\tfrac{1}{2}\langle\partial_0A_S\wedge A_S\rangle+\langle A_0,F_S\rangle, \tag{3.8}$$

where $\langle\,,\rangle$ denotes the bilinear form (2.5) on $iso(2,1)$, $F_S$ is the curvature of the spatial gauge field $A_S$,

$$F_S = d_SA_S + A_S\wedge A_S,$$

and $d_S$ denotes differentiation on the surface $S_g$. The function $A_0$ plays the role of a Lagrange multiplier, and varying it leads to the flatness constraint

$$F_S = 0, \tag{3.9}$$

while variation of $A_S$ yields the evolution equation

$$\partial_0A_S = d_SA_0 + [A_S,A_0]. \tag{3.10}$$

The action (3.8) is invariant under gauge transformations

$$A_0\to\gamma A_0\gamma^{-1}+\gamma\partial_0\gamma^{-1},\qquad A_S\to\gamma A_S\gamma^{-1}+\gamma d_S\gamma^{-1}\qquad\text{with }\gamma:\mathbb{R}\times S_g\to P_3^\uparrow, \tag{3.11}$$

and the phase space of the theory is the moduli space $\mathcal{M}_g$ of flat $P_3^\uparrow$-connections $A_S$ modulo gauge transformations on the spatial surface $S_g$.

Grafting and Poisson Structure in (2+1)-Gravity


3.2. Holonomies in the Chern-Simons formalism. Although the moduli space $\mathcal{M}_g$ of flat $H$-connections is defined as a quotient of the infinite dimensional space of flat $H$-connections on $S_g$, it is of finite dimension $\dim\mathcal{M}_g = 2\dim H\,(g-1)$. In the Chern-Simons formulation of (2+1)-dimensional gravity, we have $\dim\mathcal{M}_g = 12(g-1)$, and the finite dimensionality of $\mathcal{M}_g$ reflects the fact that the theory has no local gravitational degrees of freedom. From the geometrical viewpoint this fact can be summarised in the statement that every flat (2+1)-spacetime is locally Minkowski space. The corresponding statement in the Chern-Simons formalism is that, due to its flatness, a gauge field solving the equations of motion can be trivialised or written as pure gauge on any simply connected region $R\subset\mathbb{R}\times S_g$,

$$A = \gamma\,d\gamma^{-1} = \bigl(v^{-1}dv,\ \mathrm{Ad}(v^{-1})d\mathbf{x}\bigr)\qquad\text{with }\gamma^{-1} = (v,\mathbf{x}):R\to P_3^\uparrow. \tag{3.12}$$

The dreibein on $R$ is then given by $e^a = \mathrm{Ad}(v^{-1})^{ab}dx_b$ and from (3.1) it follows that the restriction of the metric $g$ to $R$ takes the form

$$g_{ab}\,dx^adx^b = (dx^0)^2-(dx^1)^2-(dx^2)^2. \tag{3.13}$$
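The step from the trivialising function to (3.12) and (3.13) can be sketched as follows. This is only an illustration: it assumes that the group law (2.3) is the semidirect product $(u_1,\mathbf{a}_1)(u_2,\mathbf{a}_2) = (u_1u_2,\ \mathbf{a}_1+\mathrm{Ad}(u_1)\mathbf{a}_2)$, which is the form consistent with the overlap condition (3.16), and that (3.1) reads $g = \eta_{ab}e^ae^b$ with the Minkowski metric $\eta$.

```latex
\begin{align*}
\gamma &= (v,\mathbf{x})^{-1} = \bigl(v^{-1},\,-\mathrm{Ad}(v^{-1})\mathbf{x}\bigr),\\
\gamma\cdot(v+dv,\ \mathbf{x}+d\mathbf{x})
 &= \bigl(v^{-1}(v+dv),\ -\mathrm{Ad}(v^{-1})\mathbf{x}+\mathrm{Ad}(v^{-1})(\mathbf{x}+d\mathbf{x})\bigr)
  = \bigl(1+v^{-1}dv,\ \mathrm{Ad}(v^{-1})d\mathbf{x}\bigr),\\
\eta_{ab}\,e^ae^b
 &= \eta\bigl(\mathrm{Ad}(v^{-1})d\mathbf{x},\ \mathrm{Ad}(v^{-1})d\mathbf{x}\bigr)
  = \eta(d\mathbf{x},d\mathbf{x}).
\end{align*}
```

The first-order part of the second line is the Lie algebra element in (3.12), and the last line uses only the Lorentz invariance of $\eta$, which is why the Lorentz part $v$ drops out of the metric.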

Hence, the translational part of the trivialising function $\gamma^{-1}$ defines an embedding of the region $R$ into Minkowski space, and the function $\mathbf{x}(x^0,\cdot)$ gives the embedding of the surfaces of constant time parameter $x^0$. A maximal simply connected region is obtained by cutting the spatial surface $S_g$ along a set of generators of the fundamental group $\pi_1(S_g)$ as in Fig. 3, which is the approach pursued by Alekseev and Malkin [29]. The fundamental group of a genus $g$ surface $S_g$ is generated by two loops $a_i,b_i$, $i=1,\ldots,g$, around each handle, subject to a single defining relation

Fig. 3. Cutting the surface Sg along the generators of π1 (Sg )


C. Meusburger

$$\pi_1(S_g) = \bigl\langle a_1,b_1,\ldots,a_g,b_g;\ [b_g,a_g^{-1}]\cdots[b_1,a_1^{-1}] = 1\bigr\rangle, \tag{3.14}$$

where $[b_i,a_i^{-1}] = b_i\circ a_i^{-1}\circ b_i^{-1}\circ a_i$. In the following we will work with a fixed set of generators and with a fixed basepoint $p_0$ as shown in Fig. 4. Cutting the surface along each generator of the fundamental group results in a $4g$-gon $P_g$ as pictured in Fig. 5. In order to define a gauge field $A_S$ on $S_g$, the function $\gamma^{-1}:P_g\to P_3^\uparrow$ must satisfy an overlap condition relating its values on the two sides corresponding to each generator of the fundamental group. For any $y\in\{a_1,b_1,\ldots,a_g,b_g\}$, one must have [29]

$$A_S|_{y'} = \gamma d_S\gamma^{-1}|_{y'} = \gamma d_S\gamma^{-1}|_{y} = A_S|_{y}, \tag{3.15}$$

which is equivalent to the existence of a constant Poincaré element $N_Y = (v_Y,\mathbf{x}_Y)$ such that $\gamma^{-1}|_{y'} = N_Y\gamma^{-1}|_{y}$ or, equivalently,

$$v|_{y'} = v_Y\,v|_{y},\qquad \mathbf{x}|_{y'} = \mathrm{Ad}(v_Y)\mathbf{x}|_{y}+\mathbf{x}_Y. \tag{3.16}$$

Note that the information about the physical state is encoded entirely in the Poincaré elements $N_X$, $X\in\{A_1,\ldots,B_g\}$, since transformations of the form $\gamma\to\tilde\gamma\gamma$ with $\tilde\gamma:S_g\to P_3^\uparrow$ are gauge. Conversely, to determine the Poincaré elements $N_X$ for a given gauge field, it is not necessary to know the trivialising function $\gamma$ but only the embedding of the sides of the polygon $P_g$, which defines them uniquely via (3.16). We will now relate these Poincaré elements to the holonomies of our set of generators of the fundamental group $\pi_1(S_g)$. In the Chern-Simons formalism, the holonomy of a curve $c:[0,1]\to S_g$ is given by

$$H_c = \gamma(c(1))\,\gamma^{-1}(c(0)), \tag{3.17}$$

where $\gamma$ is the trivialising function for the spatial gauge field $A_S$ on a simply connected region in $S_g$ containing $c$. By taking the polygon $P_g$ as our simply connected region and labelling its sides as in Fig. 5, we find that the holonomies $A_i,B_i$ associated to the curves $a_i,b_i$, $i=1,\ldots,g$, are given by [29]

$$\begin{aligned}
A_i &= \gamma(p_{4i-3})\gamma(p_{4i-4})^{-1} = \gamma(p_{4i-2})\gamma(p_{4i-1})^{-1},\\
B_i &= \gamma(p_{4i-3})\gamma(p_{4i-2})^{-1} = \gamma(p_{4i})\gamma(p_{4i-1})^{-1}.
\end{aligned} \tag{3.18}$$

Fig. 4. Generators and dual generators of the fundamental group $\pi_1(S_g)$

Fig. 5. The polygon $P_g$

From the defining relation of the fundamental group, it follows that they satisfy the relation

$$(u_\infty,\ -\mathrm{Ad}(u_\infty)\mathbf{j}_\infty) := [B_g,A_g^{-1}]\cdots[B_1,A_1^{-1}]\approx(1,0). \tag{3.19}$$

Using the overlap condition (3.15), we can express the value of the trivialising function $\gamma$ at the corners of the polygon $P_g$ in terms of its value at $p_0$ and find

$$\begin{aligned}
\gamma^{-1}(p_{4i}) &= N_{H_i}N_{H_{i-1}}\cdots N_{H_1}\gamma^{-1}(p_0) = \gamma^{-1}(p_0)H_1^{-1}\cdots H_{i-1}^{-1}H_i^{-1},\\
\gamma^{-1}(p_{4i+1}) &= N_{A_{i+1}}^{-1}N_{B_{i+1}}^{-1}N_{A_{i+1}}N_{H_i}\cdots N_{H_1}\gamma^{-1}(p_0) = \gamma^{-1}(p_0)H_1^{-1}\cdots H_{i-1}^{-1}H_i^{-1}A_{i+1}^{-1},\\
\gamma^{-1}(p_{4i+2}) &= N_{B_{i+1}}^{-1}N_{A_{i+1}}N_{H_i}\cdots N_{H_1}\gamma^{-1}(p_0) = \gamma^{-1}(p_0)H_1^{-1}\cdots H_{i-1}^{-1}H_i^{-1}A_{i+1}^{-1}B_{i+1},\\
\gamma^{-1}(p_{4i+3}) &= N_{A_{i+1}}N_{H_i}\cdots N_{H_1}\gamma^{-1}(p_0) = \gamma^{-1}(p_0)H_1^{-1}\cdots H_{i-1}^{-1}H_i^{-1}A_{i+1}^{-1}B_{i+1}A_{i+1},
\end{aligned} \tag{3.20}$$

where

$$H_i = (u_{H_i},\ -\mathrm{Ad}(u_{H_i})\mathbf{j}_{H_i}) = [B_i,A_i^{-1}],\qquad N_{H_i} = (v_{H_i},\mathbf{x}_{H_i}) = [N_{B_i},N_{A_i}^{-1}]. \tag{3.21}$$

Equation (3.20) allows us to express the holonomies $A_i,B_i$ in terms of $N_{A_i},N_{B_i}$,

$$\begin{aligned}
A_i &= \gamma(p_0)N_{H_1}^{-1}\cdots N_{H_{i-1}}^{-1}N_{H_i}^{-1}\cdot N_{B_i}\cdot N_{H_{i-1}}N_{H_{i-2}}\cdots N_{H_1}\gamma^{-1}(p_0),\\
B_i &= \gamma(p_0)N_{H_1}^{-1}\cdots N_{H_{i-1}}^{-1}N_{H_i}^{-1}\cdot N_{A_i}\cdot N_{H_{i-1}}N_{H_{i-2}}\cdots N_{H_1}\gamma^{-1}(p_0),
\end{aligned} \tag{3.22}$$

and by inverting these expressions we obtain

$$\begin{aligned}
N_{A_i} &= \gamma^{-1}(p_0)H_1^{-1}\cdots H_i^{-1}B_iH_{i-1}\cdots H_1\gamma(p_0),\\
N_{B_i} &= \gamma^{-1}(p_0)H_1^{-1}\cdots H_i^{-1}A_iH_{i-1}\cdots H_1\gamma(p_0).
\end{aligned} \tag{3.23}$$

Note that expression (3.23) agrees exactly with (3.22) if we exchange $A_i\leftrightarrow N_{A_i}$, $B_i\leftrightarrow N_{B_i}$ and $\gamma^{-1}(p_0)\leftrightarrow\gamma(p_0)$. In particular, up to simultaneous conjugation with $\gamma^{-1}(p_0)$, the Poincaré elements $N_{A_i},N_{B_i}$ are the holonomies along another set of generators of $\pi_1(S_g)$ pictured in Fig. 4 and given in terms of the generators $a_i,b_i$ by

$$n_{a_i} = h_1^{-1}\circ\ldots\circ h_i^{-1}\circ b_i\circ h_{i-1}\circ\ldots\circ h_1,\qquad n_{b_i} = h_1^{-1}\circ\ldots\circ h_i^{-1}\circ a_i\circ h_{i-1}\circ\ldots\circ h_1, \tag{3.24}$$

where $h_i := [b_i,a_i^{-1}]$. From Fig. 4, we see that this set of generators is dual to the set of generators $a_1,b_1,\ldots,a_g,b_g$ in the sense that $n_{a_i}$ and $n_{b_i}$, respectively, intersect only $a_i$ and $b_i$, in a single point.
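The bookkeeping in (3.18) to (3.23) can be checked numerically for the first handle. The sketch below is only an illustration under stated assumptions: it takes the group law (2.3) to be $(u_1,\mathbf{a}_1)(u_2,\mathbf{a}_2) = (u_1u_2,\ \mathbf{a}_1+\mathrm{Ad}(u_1)\mathbf{a}_2)$, represents Lorentz elements by $3\times3$ matrices acting on vectors, and uses arbitrary sample values for $\gamma^{-1}(p_0)$, $N_{A_1}$ and $N_{B_1}$. All function names are hypothetical helpers, not notation from the paper.

```python
import math

ETA = (1.0, -1.0, -1.0)  # diagonal of the Minkowski metric, signature (1, -1, -1)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, x):
    return [sum(a[i][k] * x[k] for k in range(3)) for i in range(3)]

def lorentz_inverse(u):
    # For a Lorentz matrix, u^{-1} = eta u^T eta.
    return [[ETA[i] * u[j][i] * ETA[j] for j in range(3)] for i in range(3)]

def boost(phi):
    c, s = math.cosh(phi), math.sinh(phi)
    return [[c, s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def mul(g, h):
    # Assumed group law: (u1, a1)(u2, a2) = (u1 u2, a1 + Ad(u1) a2).
    (u1, a1), (u2, a2) = g, h
    shifted = matvec(u1, a2)
    return (matmul(u1, u2), [a1[i] + shifted[i] for i in range(3)])

def inv(g):
    u, a = g
    ui = lorentz_inverse(u)
    return (ui, [-c for c in matvec(ui, a)])

def prod(*gs):
    out = gs[0]
    for g in gs[1:]:
        out = mul(out, g)
    return out

def close(g, h, tol=1e-9):
    flat_g = [c for row in g[0] for c in row] + list(g[1])
    flat_h = [c for row in h[0] for c in row] + list(h[1])
    return all(abs(x - y) < tol for x, y in zip(flat_g, flat_h))

def check_first_handle(g0, NA, NB):
    """g0 plays the role of gamma^{-1}(p0); NA, NB of N_{A_1}, N_{B_1}."""
    NH = prod(NB, inv(NA), inv(NB), NA)  # N_{H_1} = [N_B, N_A^{-1}]
    # Corner values gamma^{-1}(p_k), k = 0..4, from the N-form of (3.20).
    c = [g0,
         prod(inv(NA), inv(NB), NA, g0),
         prod(inv(NB), NA, g0),
         prod(NA, g0),
         prod(NH, g0)]
    # Holonomies computed from both copies of each side, as in (3.18).
    A, A_alt = mul(inv(c[1]), c[0]), mul(inv(c[2]), c[3])
    B, B_alt = mul(inv(c[1]), c[2]), mul(inv(c[4]), c[3])
    H = prod(B, inv(A), inv(B), A)  # H_1 = [B, A^{-1}]
    return {
        "sides_agree": close(A, A_alt) and close(B, B_alt),
        "eq_3_22": close(A, prod(inv(g0), inv(NH), NB, g0))
                   and close(B, prod(inv(g0), inv(NH), NA, g0)),
        "eq_3_23": close(NA, prod(g0, inv(H), B, inv(g0)))
                   and close(NB, prod(g0, inv(H), A, inv(g0))),
        "corner_p4": close(c[4], mul(g0, inv(H))),
    }

# Arbitrary sample data (illustrative only).
g0 = (boost(0.2), [0.1, 0.4, -0.3])
NA = (matmul(boost(0.7), rotation(0.3)), [0.2, -1.0, 0.5])
NB = (matmul(rotation(1.1), boost(-0.4)), [1.5, 0.3, -0.8])
checks = check_first_handle(g0, NA, NB)
print(checks)
```

The check only uses the group structure, so it does not depend on the particular sample elements; it confirms that the two copies of each side give the same holonomy and that (3.22) and (3.23) are inverse to each other for $i=1$.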

3.3. Phase space and Poisson structure. The description of Chern-Simons theory with gauge group $H$ on manifolds $\mathbb{R}\times S_g$ in terms of the holonomies along a set of generators of the fundamental group $\pi_1(S_g)$ provides an efficient parametrisation of its phase space $\mathcal{M}_g$. While the formulation in terms of Chern-Simons gauge fields exhibits an infinite number of redundant or gauge degrees of freedom, the characterisation in terms of the holonomies allows one to describe the moduli space $\mathcal{M}_g$ as a quotient of a finite dimensional space. It is given as

$$\mathcal{M}_g = \bigl\{(A_1,B_1,\ldots,A_g,B_g)\in H^{2g}\ \big|\ [B_g,A_g^{-1}]\cdots[B_1,A_1^{-1}] = 1\bigr\}/H, \tag{3.25}$$

where the quotient stands for simultaneous conjugation of all group elements $A_i,B_i\in H$ by elements of the gauge group $H$. Hence, the physical observables of the theory are functions on $H^{2g}$ that are invariant under simultaneous conjugation with $H$ or conjugation invariant functions of the holonomies associated to elements of $\pi_1(S_g)$. In the case of the gauge group $P_3^\uparrow$, these observables were first investigated in [3, 9]; for the case of a disc with punctures representing massive, spinning particles, see also the work of Martin [8], who identifies a complete set of generating observables and determines their Poisson brackets. In our notation, the two basic observables associated to a general curve $\eta\in\pi_1(S_g)$ with holonomy $H_\eta = (u_\eta,\ -\mathrm{Ad}(u_\eta)\mathbf{j}_\eta)$, $u_\eta = e^{-p_\eta^aJ_a}$, are given by

$$m_\eta^2 := -\mathbf{p}_\eta^2,\qquad m_\eta s_\eta := \mathbf{p}_\eta\cdot\mathbf{j}_\eta, \tag{3.26}$$

and it follows directly from the group multiplication law (2.3) that they are invariant under conjugation of the holonomies. Furthermore, for a loop $\eta$ around a puncture representing a massive, spinning particle, $m_\eta$ and $s_\eta$ have the physical interpretation of, respectively, mass and spin of the particle. In the following we will therefore refer to these observables as mass and spin of the curve $\eta$. Although it is possible to determine the canonical Poisson brackets of these observables [3, 9], the resulting expressions are nonlinear and rather complicated.

The main advantage of the description of the phase space $\mathcal{M}_g$ as the quotient (3.25) is that it results in a much simpler description of the Poisson structure on $\mathcal{M}_g$. Although the canonical Poisson structure on the space of Chern-Simons gauge fields does not induce a Poisson structure on the space of holonomies, it is possible to describe the symplectic structure on $\mathcal{M}_g$ in terms of an auxiliary Poisson structure on the manifold $H^{2g}$. The construction is due to Fock and Rosly [30] and was developed further by Alekseev, Grosse and Schomerus [31, 32] for the case of Chern-Simons theory with compact, semisimple gauge groups. A formulation from the symplectic viewpoint has been derived independently in [29]. In [14], this description is adapted to the universal cover of the (2+1)-dimensional Poincaré group and in [15] to gauge groups of the form $G\ltimes\mathfrak{g}^*$, where $G$ is a finite dimensional, connected, simply connected, unimodular Lie group, $\mathfrak{g}^*$ the dual of its Lie algebra and $G$ acts on $\mathfrak{g}^*$ in the coadjoint representation. It is shown in [14, 15] that in this case, the Poisson structure can be formulated in terms of a symplectic potential. Although the gauge group $P_3^\uparrow = L_3^\uparrow\ltimes so(2,1)^*$ is not simply connected, the results of [14, 15] can nevertheless be applied to this case$^1$ and are summarised in the following theorem.

Theorem 3.1 ([15]). Consider the Poisson manifold $((P_3^\uparrow)^{2g},\Theta)$ with group elements $(A_1,B_1,\ldots,A_g,B_g)\in(P_3^\uparrow)^{2g}$ parametrised according to

$$A_i = (u_{A_i},\ -\mathrm{Ad}(u_{A_i})\mathbf{j}_{A_i}),\qquad B_i = (u_{B_i},\ -\mathrm{Ad}(u_{B_i})\mathbf{j}_{B_i}),\qquad i=1,\ldots,g, \tag{3.27}$$

and the Poisson structure given by the symplectic form $\Omega = \delta\Theta$, where

$$\begin{aligned}
\Theta ={}& \sum_{i=1}^g\bigl\langle\mathbf{j}_{A_i},\ \delta(u_{H_{i-1}}\cdots u_{H_1})\,(u_{H_{i-1}}\cdots u_{H_1})^{-1}\bigr\rangle\\
&-\bigl\langle\mathbf{j}_{A_i},\ \delta(u_{A_i}^{-1}u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1})\,(u_{A_i}^{-1}u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1})^{-1}\bigr\rangle\\
&+\sum_{i=1}^g\bigl\langle\mathbf{j}_{B_i},\ \delta(u_{A_i}^{-1}u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1})\,(u_{A_i}^{-1}u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1})^{-1}\bigr\rangle\\
&-\bigl\langle\mathbf{j}_{B_i},\ \delta(u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1})\,(u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1})^{-1}\bigr\rangle,\qquad u_{H_i} = [u_{B_i},u_{A_i}^{-1}],
\end{aligned} \tag{3.28}$$

and $\delta$ denotes the exterior derivative on $(P_3^\uparrow)^{2g}$. Then, the symplectic structure on the moduli space

$$\mathcal{M}_g = \bigl\{(A_1,B_1,\ldots,A_g,B_g)\in(P_3^\uparrow)^{2g}\ \big|\ [B_g,A_g^{-1}]\cdots[B_1,A_1^{-1}] = 1\bigr\}/P_3^\uparrow \tag{3.29}$$

is obtained from the symplectic form $\Omega = \delta\Theta$ on $(P_3^\uparrow)^{2g}$ by imposing the constraint (3.19) and dividing by the associated gauge transformations which act on the group elements $A_i,B_i$ by simultaneous conjugation with $P_3^\uparrow$.

$^1$ The assumptions of simply-connectedness and unimodularity in [14, 15] are motivated by the absence of large gauge transformations and by technical simplifications in the quantisation of the theory but play no role in the classical results needed in this paper.

4. Grafting in the Chern-Simons Formalism: The Transformation of the Holonomies

In this section we relate the geometrical description of grafted (2+1)-spacetimes to their description in the Chern-Simons formalism. We derive explicit expressions for the transformation of the holonomies $A_i,B_i,N_{A_i},N_{B_i}$ of our set of generators $a_i,b_i\in\pi_1(S_g)$ and their duals $n_{A_i},n_{B_i}\in\pi_1(S_g)$ under the grafting operation. We start by considering the static spacetime associated to the cocompact Fuchsian group $\Gamma$. In this case, we identify the time parameter $x^0$ in the splitting (3.7) of the gauge field with the parameter $T$ characterising the hyperboloids $H_T$. After cutting the spatial surface $S_g$ along our set of generators $a_i,b_i\in\pi_1(S_g)$, we obtain the $4g$-gon $P_g$ in Fig. 5 on which the gauge field can be trivialised by a function

$$\gamma_{st}^{-1} = (v_{st},\mathbf{x}_{st}):\mathbb{R}_0^+\times P_g\to P_3^\uparrow \tag{4.1}$$

as in (3.12). For fixed $T$, the translational part $\mathbf{x}_{st}(T,\cdot):P_g\to P_\Gamma^T$ of $\gamma_{st}^{-1}$ maps the polygon $P_g$ to the polygon $P_\Gamma^T\subset H_T$ defined by the Fuchsian group $\Gamma$, such that the images of sides and corners of $P_g$ are the corresponding sides and corners of $P_\Gamma^T$. By choosing coordinates on $P_g$, it is in principle possible to give an explicit expression for the trivialising function $\gamma_{st}^{-1}:\mathbb{R}_0^+\times P_g\to P_3^\uparrow$. However, in order to determine the holonomies $A_i,B_i$ and $N_{A_i},N_{B_i}$, it is sufficient to know the embedding of the sides and corners of $P_g$. As the two sides of the polygon $P_\Gamma^T$ corresponding to each generator $a_i,b_i\in\pi_1(S_g)$ are mapped into each other by the generators of $\Gamma$ according to (2.24), the overlap condition (3.16) for the trivialising function $\gamma_{st}^{-1}$ becomes

$$v_{st}(T,\cdot)|_{y'} = v_Y\,v_{st}(T,\cdot)|_{y},\qquad \mathbf{x}_{st}(T,\cdot)|_{y'} = \mathrm{Ad}(v_Y)\,\mathbf{x}_{st}(T,\cdot)|_{y}, \tag{4.2}$$

where $y\in\{a_1,\ldots,b_g\}$, $Y\in\{A_1,\ldots,B_g\}$ and $v_Y$ denotes the associated generator of $\Gamma$. The holonomies $N_{A_i},N_{B_i}$ are therefore given by

$$N_X = (v_X,0),\qquad X\in\{A_1,\ldots,B_g\}. \tag{4.3}$$

Their translational components vanish, and the same holds for the holonomies $A_i,B_i$ up to conjugation with the Poincaré element $\gamma(p_0)$ associated to the basepoint.

We now consider the (2+1)-spacetimes obtained from the static spacetime associated to $\Gamma$ via grafting along a closed, simple geodesic $\lambda\in\pi_1(S_\Gamma)$ on $S_\Gamma$ with weight $w$. As discussed in Sect. 2.2, this geodesic lifts to a $\Gamma$-invariant multicurve on $\mathbb{H}^2$,

$$G = \bigl\{(\mathrm{Ad}(v)c_{p,q},w)\ \big|\ v\in\Gamma\bigr\}, \tag{4.4}$$

where $c_{p,q}$ is the lift of $\lambda$, parametrised as in (2.20) with $p\in P_\Gamma^1$. As the geodesic $c_{p,q}$ is the lift of a simple closed geodesic on $S_\Gamma$, there exists a nontrivial element $\tilde v = e^{\alpha n_{p,q}^aJ_a}\in\Gamma$ with $\alpha\in\mathbb{R}^+$, the holonomy of $\lambda$ defined up to conjugation, that maps the geodesic $c_{p,q}$ to itself. More precisely, the geodesic $c_{p,q}$ traverses a sequence of polygons

$$P_1 = P_\Gamma^1,\quad P_2 = \mathrm{Ad}(v_r)P_\Gamma^1,\quad P_3 = \mathrm{Ad}(v_{r-1}v_r)P_\Gamma^1,\ \ldots,\ P_{r+1} = \mathrm{Ad}(v_1\cdots v_r)P_\Gamma^1 = \mathrm{Ad}(\tilde v)P_\Gamma^1 \tag{4.5}$$

mapped into each other by group elements $v_i\in\Gamma$, until it reaches a point $p' = \mathrm{Ad}(v_1\cdots v_r)p = \mathrm{Ad}(\tilde v)p\in P_{r+1}$ identified with $p$. In particular, this implies that the geodesics in (4.4) do not have intersection points with the corners of the polygons in the tessellation of $H_T$. In the following we therefore take the corner $\mathbf{x}_{st}(T,p_0)$ as our basepoint $\mathbf{x}_0$ and parametrise

$$\gamma_{st}^{-1}(T,p_0) = (v_0,\mathbf{x}_0). \tag{4.6}$$

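As each generator of the Fuchsian group $\Gamma$ maps the polygon $P_\Gamma^T$ into one of its neighbours, we can express the group elements $v_i\in\Gamma$ in terms of the generators and their inverses as

$$v_k = v_{X_r}^{\alpha_r}\cdots v_{X_{k+1}}^{\alpha_{k+1}}\,v_{X_k}^{\alpha_k}\,v_{X_{k+1}}^{-\alpha_{k+1}}\cdots v_{X_r}^{-\alpha_r},\qquad \tilde v = e^{\alpha n_{p,q}^aJ_a} = v_1\cdots v_r = v_{X_r}^{\alpha_r}\cdots v_{X_1}^{\alpha_1}, \tag{4.7}$$

with $v_{X_i}\in\{v_{A_1},\ldots,v_{B_g}\}$, $\alpha_i=\pm1$. To determine the map $\rho$ in (2.29) for the grafting along the multicurve (4.4), we note that a general geodesic $\mathrm{Ad}(v)c_{p,q} = c_{\mathrm{Ad}(v)p,\mathrm{Ad}(v)q}$, $v\in\Gamma$, is mapped to itself by the element $v\tilde vv^{-1}\in\Gamma$ and has the unit normal vector $\mathbf{n}_{\mathrm{Ad}(v)p,\mathrm{Ad}(v)q} = \mathrm{Ad}(v)\mathbf{n}_{p,q}$. The map $\rho$ in (2.29) is therefore given by

$$\rho(\mathbf{x}) = w\sum_{v\in\Gamma:\ a_{\mathbf{x}}\cap c_{\mathrm{Ad}(v)p,\mathrm{Ad}(v)q}\neq\emptyset}\epsilon_{v,\mathbf{x}}\,\mathrm{Ad}(v)\mathbf{n}_{p,q}\qquad\text{for }\mathbf{x}\notin\bigcup_{v\in\Gamma}c_{\mathrm{Ad}(v)p,\mathrm{Ad}(v)q}, \tag{4.8}$$

and (2.30) implies for points $\mathbf{x}\in c_{\mathrm{Ad}(v_x)p,\mathrm{Ad}(v_x)q}$, $v_x\in\Gamma$, on one of the geodesics

$$\rho_-(\mathbf{x}) = w\sum_{v\in\Gamma-\{v_x\}:\ r_{\mathbf{x}}\cap c_{\mathrm{Ad}(v)p,\mathrm{Ad}(v)q}\neq\emptyset}\epsilon_{v,\mathbf{x}}\,\mathrm{Ad}(v)\mathbf{n}_{p,q},\qquad \rho_+(\mathbf{x}) = \rho_-(\mathbf{x})+w\,\epsilon_{v_x,\mathbf{x}}\,\mathrm{Ad}(v_x)\mathbf{n}_{p,q}. \tag{4.9}$$

We are now ready to determine the transformation of the holonomies $A_i,B_i$ and $N_{A_i},N_{B_i}$ under grafting along $\lambda$. We identify the time parameter $x^0$ in (3.7) with the cosmological time $T$ of the regular domain (2.33) associated to the multicurve (4.4). For fixed $T$, the translational part of the trivialising function $\gamma^{-1} = (v,\mathbf{x}):\mathbb{R}_0^+\times P_g\to P_3^\uparrow$ maps the polygon $P_g$ to the image $P_{\Gamma,G}^T\subset U_T$ of $P_\Gamma^T$ under the grafting operation

$$\begin{aligned}
\mathbf{x}(T,\cdot)&:P_g\to P_{\Gamma,G}^T,\\
P_{\Gamma,G}^T &= \Bigl\{T\mathbf{x}+\rho(\mathbf{x})\ \Big|\ \mathbf{x}\in P_\Gamma^1-\bigcup_{v\in\Gamma}c_{\mathrm{Ad}(v)p,\mathrm{Ad}(v)q}\Bigr\}\\
&\quad\cup\Bigl\{T\mathbf{x}+t\rho_+(\mathbf{x})+(1-t)\rho_-(\mathbf{x})\ \Big|\ \mathbf{x}\in P_\Gamma^1\cap\bigcup_{v\in\Gamma}c_{\mathrm{Ad}(v)p,\mathrm{Ad}(v)q},\ t\in[0,1]\Bigr\}\subset U_T.
\end{aligned} \tag{4.10}$$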


Again, we do not need an explicit expression for the embedding function $\gamma^{-1}$ but can determine the holonomies $A_i,B_i$ and $N_{A_i},N_{B_i}$ from the embedding of the sides of the polygon $P_g$. For this, we consider a side $y\in\{a_1,b_1,\ldots,a_g,b_g\}$ of the polygon $P_g$ and denote by $q_i^Y$, $q_f^Y$, respectively, its starting and endpoint. In the case of the static spacetime associated to $\Gamma$, the holonomy $Y_{st}$ along $y$ with respect to the basepoint $p_0$ is given by

$$Y_{st} = \gamma_{st}(T,q_f^Y)\,\gamma_{st}^{-1}(T,q_i^Y). \tag{4.11}$$

Since the geodesics in (4.4) do not intersect the corners of the polygon, the embedding of starting and endpoint of $y$ in the resulting regular domain is

$$\mathbf{x}(T,q_i^Y) = \mathbf{x}_{st}(T,q_i^Y)+\rho(q_i^Y),\qquad \mathbf{x}(T,q_f^Y) = \mathbf{x}_{st}(T,q_f^Y)+\rho(q_f^Y), \tag{4.12}$$

where here and in the following $\rho(q)$, $q\in P_g$, stands for $\rho(\mathbf{x}_{st}(1,q))$. This implies

$$\gamma^{-1}(T,q_{i,f}^Y) = \bigl(v_{st}(T,q_{i,f}^Y),\ \mathbf{x}_{st}(T,q_{i,f}^Y)+\rho(q_{i,f}^Y)\bigr) = \bigl(1,\rho(q_{i,f}^Y)\bigr)\cdot\gamma_{st}^{-1}(T,q_{i,f}^Y), \tag{4.13}$$

and the holonomy $Y$ becomes

$$Y = Y_{st}\cdot\Bigl(1,\ -\mathrm{Ad}\bigl(v_{st}^{-1}(T,q_i^Y)\bigr)\bigl(\rho(q_f^Y)-\rho(q_i^Y)\bigr)\Bigr).$$

From expression (4.8) for the map $\rho$ we deduce

$$\rho(q_f^Y)-\rho(q_i^Y) = w\sum_{v\in\Gamma:\ y\cap c_{\mathrm{Ad}(v)p,\mathrm{Ad}(v)q}\neq\emptyset}\epsilon_{v,y}\,\mathrm{Ad}(v)\mathbf{n}_{p,q}, \tag{4.14}$$

where $\epsilon_{v,y}$ is the oriented intersection number of $c_{\mathrm{Ad}(v)p,\mathrm{Ad}(v)q}$ and $y$, taken to be positive if $c_{\mathrm{Ad}(v)p,\mathrm{Ad}(v)q}$ crosses $y$ from the left to the right in the direction of $y$. In order to determine the transformations of the holonomies $A_i,B_i$, we therefore need to determine the intersection points of the multicurve (4.4) with the sides of the polygon $P_\Gamma^1$ and the oriented intersection numbers $\epsilon_{v,y}$.

As the geodesic $c_{p,q}$ intersects the side $\mathrm{Ad}(v_{r-k+2}\cdots v_r)x\subset P_k$ of the polygon $P_k$ if and only if the geodesic $\mathrm{Ad}(v_r^{-1}\cdots v_{r-k+2}^{-1})c_{p,q} = \mathrm{Ad}(v_{X_{r-k+1}}^{\alpha_{r-k+1}}\cdots v_{X_1}^{\alpha_1})c_{p,q}$ intersects the side $x\subset P_\Gamma^1$, the geodesics in (4.4) which have intersection points with the sides of $P_\Gamma^1$ are

$$c_1 = c_{p,q},\quad c_2 = \mathrm{Ad}(v_{X_1}^{\alpha_1})c_{p,q},\quad c_3 = \mathrm{Ad}(v_{X_2}^{\alpha_2}v_{X_1}^{\alpha_1})c_{p,q},\ \ldots,\ c_r = \mathrm{Ad}(v_{X_{r-1}}^{\alpha_{r-1}}\cdots v_{X_1}^{\alpha_1})c_{p,q}. \tag{4.15}$$

The intersections of the multicurve (4.4) with a given side $y\subset P_\Gamma^1$ are in one-to-one correspondence with factors $v_{X_k}^{\alpha_k}$, $X_k=Y$, in (4.7), and the geodesic in (4.15) intersecting $y$ is $c_k$ if $\alpha_k=1$ and $c_{k+1}$ if $\alpha_k=-1$. Similarly, intersections with the side $y'$ are also in one-to-one correspondence with factors $v_{X_k}^{\alpha_k}$, $X_k=Y$, but the intersection takes place with $c_k$ for $\alpha_k=-1$ and with $c_{k+1}$ for $\alpha_k=1$. Taking into account the orientation of the sides $a_i,b_i,a_i',b_i'$ in the polygon $P_g$, see Fig. 5, we find that intersections with sides $a_i$ and $a_i'$ have positive intersection number for $\alpha_k=1$ and negative intersection number for $\alpha_k=-1$, while the intersection numbers for sides $b_i$, $b_i'$ are positive and negative, respectively, for $\alpha_k=-1$ and $\alpha_k=1$. Hence, we find that the transformation of the holonomy $Y = (u_Y,\ -\mathrm{Ad}(u_Y)\mathbf{j}_Y)$ under grafting along $\lambda$ is given by

$$u_Y\to u_Y,\qquad \mathbf{j}_Y\to\mathbf{j}_Y+\epsilon_Y\,w\,\mathrm{Ad}\bigl(v_{st}^{-1}(q_i^Y)\bigr)\Bigl[\sum_{i:X_i=Y,\alpha_i=1}\mathrm{Ad}(v_{X_{i-1}}^{\alpha_{i-1}}\cdots v_{X_1}^{\alpha_1})\mathbf{n}_{p,q}-\sum_{i:X_i=Y,\alpha_i=-1}\mathrm{Ad}(v_{X_i}^{\alpha_i}\cdots v_{X_1}^{\alpha_1})\mathbf{n}_{p,q}\Bigr], \tag{4.16}$$

where the overall sign $\epsilon_Y$ is positive for $Y=A_i$ and negative for $Y=B_i$. Note that (4.16) is invariant under conjugation of the group element $\tilde v = v_{X_r}^{\alpha_r}\cdots v_{X_1}^{\alpha_1}\in\Gamma$ associated to the geodesic $c_{p,q}$ with elements of $\Gamma$. Although such a conjugation can give rise to additional intersection points, the identity $\mathrm{Ad}(v_{X_r}^{\alpha_r}\cdots v_{X_1}^{\alpha_1})\mathbf{n}_{p,q} = \mathbf{n}_{p,q}$ implies that their contributions to the transformation (4.20) cancel. Hence, the transformation (4.16) depends only on the geodesic $\lambda\in\pi_1(S_\Gamma)$ and not on the choice of the lift $c_{p,q}$. To deduce the transformation of the holonomies $A_i,B_i$, we determine the corresponding starting and endpoints from Fig. 5. For $Y=A_i$, starting and endpoint are given by $q_i^{A_i} = p_{4(i-1)}$, $q_f^{A_i} = p_{4i-3}$, for $Y=B_i$ by $q_i^{B_i} = p_{4i-2}$, $q_f^{B_i} = p_{4i-3}$, and (3.20) implies

$$v_{st}^{-1}(q_i^{A_i}) = v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1},\qquad v_{st}^{-1}(q_i^{B_i}) = v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{A_i}^{-1}v_{B_i}. \tag{4.17}$$

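Taking into account the oriented intersection numbers, we find that the holonomies $A_i,B_i$ transform under grafting along $\lambda$ according to

$$\begin{aligned}
\mathbf{j}_{A_i}&\to\mathbf{j}_{A_i}+w\,\mathrm{Ad}\bigl(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}\bigr)\Bigl[\sum_{k:X_k=A_i,\alpha_k=1}\mathrm{Ad}(v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1})\mathbf{n}_{p,q}-\sum_{k:X_k=A_i,\alpha_k=-1}\mathrm{Ad}(v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1})\mathbf{n}_{p,q}\Bigr],\\
\mathbf{j}_{B_i}&\to\mathbf{j}_{B_i}-w\,\mathrm{Ad}\bigl(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{A_i}^{-1}v_{B_i}\bigr)\Bigl[\sum_{k:X_k=B_i,\alpha_k=1}\mathrm{Ad}(v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1})\mathbf{n}_{p,q}-\sum_{k:X_k=B_i,\alpha_k=-1}\mathrm{Ad}(v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1})\mathbf{n}_{p,q}\Bigr].
\end{aligned} \tag{4.18}$$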
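The passage from (4.13) to the shift of $\mathbf{j}_Y$ can be spelled out in a few lines. The sketch below assumes the group law $(u_1,\mathbf{a}_1)(u_2,\mathbf{a}_2) = (u_1u_2,\ \mathbf{a}_1+\mathrm{Ad}(u_1)\mathbf{a}_2)$ for (2.3) and writes $\Delta\rho := \rho(q_f^Y)-\rho(q_i^Y)$ and $v := v_{st}(T,q_i^Y)$ as shorthand introduced here for illustration.

```latex
\begin{align*}
Y &= \gamma(T,q_f^Y)\,\gamma^{-1}(T,q_i^Y)
   = \gamma_{st}(T,q_f^Y)\,\bigl(1,-\rho(q_f^Y)\bigr)\bigl(1,\rho(q_i^Y)\bigr)\,\gamma_{st}^{-1}(T,q_i^Y)\\
  &= Y_{st}\cdot\gamma_{st}(T,q_i^Y)\,(1,-\Delta\rho)\,\gamma_{st}^{-1}(T,q_i^Y)
   = Y_{st}\cdot\bigl(1,\,-\mathrm{Ad}(v^{-1})\Delta\rho\bigr),\\
(u_Y,\,-\mathrm{Ad}(u_Y)\mathbf{j}_Y)\cdot(1,\mathbf{a})
  &= \bigl(u_Y,\,-\mathrm{Ad}(u_Y)(\mathbf{j}_Y-\mathbf{a})\bigr)
  \quad\Longrightarrow\quad
  \mathbf{j}_Y\to\mathbf{j}_Y+\mathrm{Ad}(v^{-1})\Delta\rho .
\end{align*}
```

Inserting (4.14) for $\Delta\rho$ and (4.17) for $v^{-1}$ then gives the two lines of (4.18), with the relative sign between $A_i$ and $B_i$ coming from the orientation of the sides.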

Equivalently, we could have determined the transformation of the holonomies from the sides $a_i',b_i'$. In this case, $q_i^Y = p_{4i-1}$ for both $y=a_i',b_i'$ and therefore $v_{st}^{-1}(q_i^Y) = v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{A_i}^{-1} = v_{st}^{-1}(q_i^{A_i})\,v_{A_i}^{-1} = v_{st}^{-1}(q_i^{B_i})\,v_{B_i}^{-1}$, which together with the remark before (4.16) yields the same result.

With the interpretation of the holonomies $A_i,B_i$ as the different factors in the product $(P_3^\uparrow)^{2g}$, (4.18) defines a map $Gr_{w\lambda}:(P_3^\uparrow)^{2g}\to(P_3^\uparrow)^{2g}$ which leaves the submanifold $(L_3^\uparrow)^{2g}\subset(P_3^\uparrow)^{2g}$ invariant. The transformation of the holonomy of a general curve $\eta = y_s^{\beta_s}\circ\ldots\circ y_1^{\beta_1}\in\pi_1(p_0,S_g)$ under $Gr_{w\lambda}$ is then obtained by writing the curve as a product in the generators $A_i,B_i$. Parametrising the associated holonomy as $H_\eta = (u_\eta,\ -\mathrm{Ad}(u_\eta)\mathbf{j}_\eta)$ as in (2.2), we find that the vector $\mathbf{j}_\eta$ is given by

$$\mathbf{j}_\eta = \sum_{i=1,\ \beta_i=1}^{s}\mathrm{Ad}\bigl(u_{Y_1}^{-\beta_1}\cdots u_{Y_{i-1}}^{-\beta_{i-1}}\bigr)\mathbf{j}_{Y_i}-\sum_{i=1,\ \beta_i=-1}^{s}\mathrm{Ad}\bigl(u_{Y_1}^{-\beta_1}\cdots u_{Y_i}^{-\beta_i}\bigr)\mathbf{j}_{Y_i}, \tag{4.19}$$

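and using (4.18), we obtain the following theorem.

Theorem 4.1. For $\eta = y_s^{\beta_s}\circ\ldots\circ y_1^{\beta_1}\in\pi_1(S_g)$ with $y_i\in\{a_1,\ldots,b_g\}$, $\beta_i\in\{\pm1\}$, the transformation of the associated holonomy under grafting along $\lambda$ is given by

$$\begin{aligned}
Gr_{w\lambda}:\ u_\eta&\to u_\eta,\\
\mathbf{j}_\eta&\to\mathbf{j}_\eta+w\sum_{i=1}^g\Bigl[\sum_{Y_j=A_i,\beta_j=1}\mathrm{Ad}\bigl(u_{Y_1}^{-\beta_1}\cdots u_{Y_{j-1}}^{-\beta_{j-1}}\bigr)-\sum_{Y_j=A_i,\beta_j=-1}\mathrm{Ad}\bigl(u_{Y_1}^{-\beta_1}\cdots u_{Y_j}^{-\beta_j}\bigr)\Bigr]\\
&\qquad\cdot\mathrm{Ad}\bigl(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}\bigr)\Bigl[\sum_{k:X_k=A_i,\alpha_k=1}\mathrm{Ad}(v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1})\mathbf{n}_{p,q}-\sum_{k:X_k=A_i,\alpha_k=-1}\mathrm{Ad}(v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1})\mathbf{n}_{p,q}\Bigr]\\
&\quad-w\sum_{i=1}^g\Bigl[\sum_{Y_j=B_i,\beta_j=1}\mathrm{Ad}\bigl(u_{Y_1}^{-\beta_1}\cdots u_{Y_{j-1}}^{-\beta_{j-1}}\bigr)-\sum_{Y_j=B_i,\beta_j=-1}\mathrm{Ad}\bigl(u_{Y_1}^{-\beta_1}\cdots u_{Y_j}^{-\beta_j}\bigr)\Bigr]\\
&\qquad\cdot\mathrm{Ad}\bigl(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{A_i}^{-1}v_{B_i}\bigr)\Bigl[\sum_{k:X_k=B_i,\alpha_k=1}\mathrm{Ad}(v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1})\mathbf{n}_{p,q}-\sum_{k:X_k=B_i,\alpha_k=-1}\mathrm{Ad}(v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1})\mathbf{n}_{p,q}\Bigr].
\end{aligned} \tag{4.20}$$

Although the formula for the transformation of $\mathbf{j}_\eta$ appears rather complicated, one can give a heuristic interpretation of the various factors in (4.20). For this recall that, up to conjugation, the group element $\tilde v\in\Gamma$ gives the holonomy of the geodesic $\lambda$ and consider the associated element $\lambda = n_{X_r}^{\alpha_r}\circ\ldots\circ n_{X_1}^{\alpha_1}\in\pi_1(p_0,S_g)$ of the fundamental group based at $p_0$. The holonomy along this element is

$$H_\lambda = (u_\lambda,\ -\mathrm{Ad}(u_\lambda)\mathbf{j}_\lambda) = \gamma(p_0)N_{X_r}^{\alpha_r}\cdots N_{X_1}^{\alpha_1}\gamma^{-1}(p_0),\qquad u_\lambda = e^{-p_\lambda^aJ_a}, \tag{4.21}$$

and from (4.7) it follows that the unit vector $\mathbf{n}_{p,q}$ is given by

$$\mathbf{n}_{p,q} = -\mathrm{Ad}(v_0)\hat{\mathbf{p}}_\lambda = -\frac{1}{m_\lambda}\mathrm{Ad}(v_0)\mathbf{p}_\lambda. \tag{4.22}$$

Hence, the terms $\mathrm{Ad}(v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1})\mathbf{n}_{p,q}$ and $\mathrm{Ad}(v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1})\mathbf{n}_{p,q}$ in (4.20) can be viewed as the parallel transport along $\lambda$ of the vector $\hat{\mathbf{p}}_\lambda$ from the starting point of $\lambda$ to the intersection point with the sides $a_i,b_i$ of the polygon $P_g$. The terms $\mathrm{Ad}(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1})$ and $\mathrm{Ad}(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{A_i}^{-1}v_{B_i})$ transport the vector from the point $p_0$ to the starting point of, respectively, sides $a_i$ and $b_i$ of $P_g$. Finally, the terms $\mathrm{Ad}(u_{Y_1}^{-\beta_1}\cdots u_{Y_{j-1}}^{-\beta_{j-1}})$ and $\mathrm{Ad}(u_{Y_1}^{-\beta_1}\cdots u_{Y_j}^{-\beta_j})$ describe the parallel transport along the curve $\eta$ from its intersection point with $\lambda$ to its starting point $p_0$. We will give a more detailed and precise interpretation of this formula in Sect. 6, where we discuss the link between grafting and Dehn twists.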
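As an illustration of how (4.18) and (4.22) combine, consider the simplest case, sketched here under the assumption that the grafting geodesic is represented by the dual generator $n_{a_1}$, so that (4.7) has a single factor. Empty products of group elements are read as the identity.

```latex
\begin{align*}
\tilde v &= v_{A_1},\qquad r=1,\quad X_1=A_1,\quad \alpha_1=1,\\
\mathbf{j}_{A_1} &\to \mathbf{j}_{A_1}+w\,\mathrm{Ad}(v_0^{-1})\,\mathbf{n}_{p,q}
   = \mathbf{j}_{A_1}-w\,\hat{\mathbf{p}}_\lambda,\\
\mathbf{j}_{A_i} &\to \mathbf{j}_{A_i}\quad(i\geq2),\qquad
\mathbf{j}_{B_i}\to\mathbf{j}_{B_i}\quad(i\geq1).
\end{align*}
```

Only the holonomy $A_1$ is affected, in line with the statement after (3.24) that $n_{a_1}$ intersects only $a_1$, and its vector $\mathbf{j}_{A_1}$ is shifted along the unit momentum $\hat{\mathbf{p}}_\lambda$ by the grafting weight $w$.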


5. Grafting and Poisson Structure

In this section, we give explicit expressions for Hamiltonians on the Poisson manifold $((P_3^\uparrow)^{2g},\Theta)$ which generate the transformation (4.20) of the holonomies under grafting via the Poisson bracket. As we have seen in Sect. 4 that the grafting operation is most easily described by parametrising one of the holonomies in question in terms of $A_i,B_i$ and the other one in terms of $N_{A_i},N_{B_i}$, the first step is to derive an expression for the symplectic potential (3.28) involving the components of both $A_i,B_i$ and $N_{A_i},N_{B_i}$. We then prove that the transformation (4.20) of the holonomies under grafting along a geodesic $\lambda\in\pi_1(S_g)$ with weight $w$ is generated by $wm_\lambda$, where $m_\lambda$ is the mass of $\lambda$ defined as in (3.26). Finally, we use this result to investigate the properties of the grafting transformation $Gr_{w\lambda}:(P_3^\uparrow)^{2g}\to(P_3^\uparrow)^{2g}$ and prove a relation between the Poisson brackets of mass and spin for general elements $\lambda,\eta\in\pi_1(S_g)$.

5.1. The Poisson structure in terms of the dual generators. In order to derive an expression for the symplectic potential (3.28) in terms of both $A_i,B_i$ and $N_{A_i},N_{B_i}$, we need to express the Lorentz and translational components of the holonomies $A_i,B_i$ and $N_{A_i},N_{B_i}$ in terms of each other via (3.22) and (3.23). For the Lorentz components, we can simply replace $A_i,B_i$ with $u_{A_i},u_{B_i}$, $N_{A_i},N_{B_i}$ with $v_{A_i},v_{B_i}$ and $\gamma^{-1}(p_0)$ with $v_0$ in (3.22), (3.23) and obtain

$$\begin{aligned}
v_{A_i} &= v_0\,u_{H_1}^{-1}\cdots u_{H_i}^{-1}\cdot u_{B_i}\cdot u_{H_{i-1}}\cdots u_{H_1}\,v_0^{-1},\\
v_{B_i} &= v_0\,u_{H_1}^{-1}\cdots u_{H_i}^{-1}\cdot u_{A_i}\cdot u_{H_{i-1}}\cdots u_{H_1}\,v_0^{-1},\\
u_{A_i} &= v_0^{-1}v_{H_1}^{-1}\cdots v_{H_i}^{-1}\cdot v_{B_i}\cdot v_{H_{i-1}}\cdots v_{H_1}\,v_0,\\
u_{B_i} &= v_0^{-1}v_{H_1}^{-1}\cdots v_{H_i}^{-1}\cdot v_{A_i}\cdot v_{H_{i-1}}\cdots v_{H_1}\,v_0,
\end{aligned} \tag{5.1}$$

where $u_{H_i} = [u_{B_i},u_{A_i}^{-1}]$, $v_{H_i} = [v_{B_i},v_{A_i}^{-1}]$. The corresponding expressions for the translational components require some computation. Inserting the parametrisation of the holonomies $A_i,B_i$ into (3.23) and using (5.1), we find

$$\begin{aligned}
\mathbf{x}_{A_i} ={}& \bigl(1-\mathrm{Ad}(v_{A_i})\bigr)\Bigl[\mathbf{x}_0+\sum_{k=1}^{i-1}\bigl((1-\mathrm{Ad}(v_{A_k}))\mathbf{l}_{A_k}+(1-\mathrm{Ad}(v_{B_k}))\mathbf{l}_{B_k}\bigr)\Bigr]\\
&+\mathrm{Ad}(v_{B_i})\mathbf{l}_{B_i}+(1-\mathrm{Ad}(v_{A_i}))\mathbf{l}_{A_i}+(1-\mathrm{Ad}(v_{B_i}))\mathbf{l}_{B_i},\\
\mathbf{x}_{B_i} ={}& \bigl(1-\mathrm{Ad}(v_{B_i})\bigr)\Bigl[\mathbf{x}_0+\sum_{k=1}^{i-1}\bigl((1-\mathrm{Ad}(v_{A_k}))\mathbf{l}_{A_k}+(1-\mathrm{Ad}(v_{B_k}))\mathbf{l}_{B_k}\bigr)\Bigr]\\
&-\mathrm{Ad}(v_{B_i})\mathbf{l}_{A_i}+(1-\mathrm{Ad}(v_{A_i}))\mathbf{l}_{A_i}+(1-\mathrm{Ad}(v_{B_i}))\mathbf{l}_{B_i},
\end{aligned} \tag{5.2}$$

with

$$\begin{aligned}
\mathbf{l}_{A_i} &= \mathrm{Ad}(v_{H_{i-1}}\cdots v_{H_1}v_0)\,\mathbf{j}_{A_i} = \mathrm{Ad}(v_0u_{H_1}^{-1}\cdots u_{H_{i-1}}^{-1}u_{A_i}^{-1})\cdot\mathrm{Ad}(u_{A_i})\,\mathbf{j}_{A_i},\\
\mathbf{l}_{B_i} &= -\mathrm{Ad}(v_{B_i}^{-1}v_{A_i}v_{H_{i-1}}\cdots v_{H_1}v_0)\,\mathbf{j}_{B_i} = -\mathrm{Ad}(v_0u_{H_1}^{-1}\cdots u_{H_{i-1}}^{-1}u_{A_i}^{-1})\cdot\mathrm{Ad}(u_{B_i})\,\mathbf{j}_{B_i},
\end{aligned} \tag{5.3}$$


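and an analogous calculation for (3.22) yields

$$\begin{aligned}
\mathbf{j}_{A_i} ={}& -\bigl(1-\mathrm{Ad}(u_{A_i}^{-1})\bigr)\Bigl[\mathrm{Ad}(v_0^{-1})\mathbf{x}_0+\sum_{k=1}^{i-1}\bigl((1-\mathrm{Ad}(u_{A_k}))\mathbf{f}_{A_k}+(1-\mathrm{Ad}(u_{B_k}))\mathbf{f}_{B_k}\bigr)\Bigr]\\
&+\mathrm{Ad}(u_{A_i}^{-1})\bigl[(1-\mathrm{Ad}(u_{A_i}))\mathbf{f}_{A_i}+(1-\mathrm{Ad}(u_{B_i}))\mathbf{f}_{B_i}+\mathrm{Ad}(u_{B_i})\mathbf{f}_{B_i}\bigr],\\
\mathbf{j}_{B_i} ={}& -\bigl(1-\mathrm{Ad}(u_{B_i}^{-1})\bigr)\Bigl[\mathrm{Ad}(v_0^{-1})\mathbf{x}_0+\sum_{k=1}^{i-1}\bigl((1-\mathrm{Ad}(u_{A_k}))\mathbf{f}_{A_k}+(1-\mathrm{Ad}(u_{B_k}))\mathbf{f}_{B_k}\bigr)\Bigr]\\
&+\mathrm{Ad}(u_{B_i}^{-1})\bigl[(1-\mathrm{Ad}(u_{A_i}))\mathbf{f}_{A_i}+(1-\mathrm{Ad}(u_{B_i}))\mathbf{f}_{B_i}-\mathrm{Ad}(u_{B_i})\mathbf{f}_{A_i}\bigr],
\end{aligned} \tag{5.4}$$

with

$$\begin{aligned}
\mathbf{f}_{A_i} &= \mathrm{Ad}(u_{A_i}^{-1}u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1}v_0^{-1})\,\mathbf{x}_{A_i} = \mathrm{Ad}(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{A_i}^{-1})\,\mathbf{x}_{A_i},\\
\mathbf{f}_{B_i} &= -\mathrm{Ad}(u_{A_i}^{-1}u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1}v_0^{-1})\,\mathbf{x}_{B_i} = -\mathrm{Ad}(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{A_i}^{-1})\,\mathbf{x}_{B_i}.
\end{aligned} \tag{5.5}$$

Note that the variables $\mathbf{f}_{A_i},\mathbf{f}_{B_i},\mathbf{l}_{A_i},\mathbf{l}_{B_i}$ have a clear geometrical interpretation. From Fig. 5 and Eq. (3.20) we see that $\mathbf{f}_{A_i},\mathbf{f}_{B_i}$ can be viewed as the parallel transport of $\mathbf{x}_{A_i},\mathbf{x}_{B_i}$ from $p_0$ to the point $p_{4i-1}$, which is the starting point of the sides $a_i'$ and $b_i'$ in the polygon $P_g$. Equivalently, we can interpret them as the parallel transport of $\mathrm{Ad}(v_{A_i}^{-1})\mathbf{x}_{A_i}$ to $p_{4i-4}$ and of $\mathrm{Ad}(v_{B_i}^{-1})\mathbf{x}_{B_i}$ to $p_{4i-2}$, the starting points of sides $a_i$ and $b_i$. Similarly, the variables $\mathbf{l}_{A_i}$ represent the parallel transport of $\mathbf{j}_{A_i}$ from the starting point $p_{4i-4}$ of side $a_i$, or, equivalently, of $\mathrm{Ad}(u_{A_i})\mathbf{j}_{A_i}$ from its endpoint $p_{4i-3}$ to $p_0$, while $\mathbf{l}_{B_i}$ corresponds to the parallel transport of $\mathbf{j}_{B_i}$ from $p_{4i-2}$ to $p_0$ or of $\mathrm{Ad}(u_{B_i})\mathbf{j}_{B_i}$ from $p_{4i-3}$ to $p_0$.

Using expressions (5.1) to (5.5), we can now express the symplectic potential (3.28) on $(P_3^\uparrow)^{2g}$ in various combinations of the Lorentz and translational components of holonomies and dual holonomies.

Theorem 5.1. 1. In terms of the variables introduced in (5.1) to (5.5), the symplectic potential (3.28) is given by

$$\Theta = \sum_{i=1}^g\bigl\langle\mathbf{l}_{A_i},\ v_{A_i}^{-1}\delta v_{A_i}\bigr\rangle+\bigl\langle\mathbf{l}_{B_i},\ v_{B_i}^{-1}\delta v_{B_i}\bigr\rangle \tag{5.6}$$

$$\phantom{\Theta} = \sum_{i=1}^g\bigl\langle\mathbf{f}_{A_i},\ u_{A_i}^{-1}\delta u_{A_i}\bigr\rangle+\bigl\langle\mathbf{f}_{B_i},\ u_{B_i}^{-1}\delta u_{B_i}\bigr\rangle+\bigl\langle\mathbf{j}_\infty+\mathrm{Ad}(v_0^{-1})\mathbf{x}_0,\ u_\infty^{-1}\delta u_\infty\bigr\rangle. \tag{5.7}$$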

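2. After a gauge transformation which acts on the holonomies $N_{A_i},N_{B_i}$ by simultaneous conjugation with the Poincaré element $(1,\ -\mathrm{Ad}(v_0)\mathbf{j}_\infty-\mathbf{x}_0)$,

$$\begin{aligned}
N_Y&\to\tilde N_Y = (v_Y,\tilde{\mathbf{x}}_Y) = \bigl(v_Y,\ \mathbf{x}_Y-(1-\mathrm{Ad}(v_Y))(\mathrm{Ad}(v_0)\mathbf{j}_\infty+\mathbf{x}_0)\bigr),\qquad Y\in\{A_1,\ldots,B_g\},\\
\mathbf{f}_{A_i}&\to\tilde{\mathbf{f}}_{A_i} = \mathrm{Ad}(u_{A_i}^{-1}u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1}v_0^{-1})\,\tilde{\mathbf{x}}_{A_i},\\
\mathbf{f}_{B_i}&\to\tilde{\mathbf{f}}_{B_i} = -\mathrm{Ad}(u_{A_i}^{-1}u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1}v_0^{-1})\,\tilde{\mathbf{x}}_{B_i},
\end{aligned} \tag{5.8}$$

the symplectic potential (3.28) takes the form

$$\Theta = \sum_{i=1}^g\bigl\langle\tilde{\mathbf{f}}_{A_i},\ u_{A_i}^{-1}\delta u_{A_i}\bigr\rangle+\bigl\langle\tilde{\mathbf{f}}_{B_i},\ u_{B_i}^{-1}\delta u_{B_i}\bigr\rangle \tag{5.9}$$

$$\begin{aligned}
\phantom{\Theta} ={}& \sum_{i=1}^g\bigl\langle\mathrm{Ad}(v_{A_i}^{-1})\tilde{\mathbf{x}}_{A_i},\ \delta(v_{H_{i-1}}\cdots v_{H_1})\,(v_{H_{i-1}}\cdots v_{H_1})^{-1}\bigr\rangle\\
&-\bigl\langle\mathrm{Ad}(v_{A_i}^{-1})\tilde{\mathbf{x}}_{A_i},\ \delta(v_{A_i}^{-1}v_{B_i}^{-1}v_{A_i}v_{H_{i-1}}\cdots v_{H_1})\,(v_{A_i}^{-1}v_{B_i}^{-1}v_{A_i}v_{H_{i-1}}\cdots v_{H_1})^{-1}\bigr\rangle\\
&+\sum_{i=1}^g\bigl\langle\mathrm{Ad}(v_{B_i}^{-1})\tilde{\mathbf{x}}_{B_i},\ \delta(v_{A_i}^{-1}v_{B_i}^{-1}v_{A_i}v_{H_{i-1}}\cdots v_{H_1})\,(v_{A_i}^{-1}v_{B_i}^{-1}v_{A_i}v_{H_{i-1}}\cdots v_{H_1})^{-1}\bigr\rangle\\
&-\bigl\langle\mathrm{Ad}(v_{B_i}^{-1})\tilde{\mathbf{x}}_{B_i},\ \delta(v_{B_i}^{-1}v_{A_i}v_{H_{i-1}}\cdots v_{H_1})\,(v_{B_i}^{-1}v_{A_i}v_{H_{i-1}}\cdots v_{H_1})^{-1}\bigr\rangle.
\end{aligned} \tag{5.10}$$

Proof. 1. The proof is a straightforward but rather lengthy computation. To prove (5.6) we express the products of the Lorentz components $u_{A_i},u_{B_i}$ in (3.28) as products of $v_{A_i},v_{B_i}$,

$$\begin{aligned}
u_{H_{i-1}}\cdots u_{H_1} &= v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_0,\\
u_{A_i}^{-1}u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1} &= v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{A_i}^{-1}v_0,\\
u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1} &= v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{A_i}^{-1}v_{B_i}v_0,
\end{aligned} \tag{5.11}$$

and simplify the resulting products via the identity

$$\delta(ab)\,(ab)^{-1} = \delta a\,a^{-1}+\mathrm{Ad}(a)\,\delta b\,b^{-1}. \tag{5.12}$$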
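For matrix-valued $a,b$, the identity (5.12) is the Leibniz rule followed by a conjugation; a one-line check, with $\mathrm{Ad}(a)X = aXa^{-1}$ on Lie algebra valued forms, reads:

```latex
\begin{align*}
\delta(ab)\,(ab)^{-1}
 = (\delta a\,b+a\,\delta b)\,b^{-1}a^{-1}
 = \delta a\,a^{-1}+a\,(\delta b\,b^{-1})\,a^{-1}
 = \delta a\,a^{-1}+\mathrm{Ad}(a)\,\delta b\,b^{-1}.
\end{align*}
```

Applied repeatedly, it splits the variation of each product in (3.28) into contributions of the individual factors, which is how the products in (5.11) are reduced.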

Taking into account that the embedding of the basepoint is not varied, $\delta v_0 = 0$, and using the Ad-invariance of the pairing $\langle\,,\rangle$ together with (5.3) we then obtain (5.6). To prove (5.7), we insert expression (5.4) for the variables $\mathbf{j}_{A_i},\mathbf{j}_{B_i}$ in terms of $\mathbf{f}_{A_i},\mathbf{f}_{B_i}$ into (3.28) and isolate the terms containing $\mathbf{f}_{A_i},\mathbf{f}_{B_i}$. We then express the components of the constraint (3.19) in terms of Lorentz and translational components of the holonomies $A_i,B_i$ and $N_{A_i},N_{B_i}$ according to

$$u_\infty = u_{H_g}\cdots u_{H_1} = v_0^{-1}v_{H_1}^{-1}\cdots v_{H_g}^{-1}v_0, \tag{5.13}$$

$$\begin{aligned}
\mathbf{j}_\infty &= \mathrm{Ad}(v_0^{-1})\sum_{i=1}^g(1-\mathrm{Ad}(v_{A_i}))\mathbf{l}_{A_i}+(1-\mathrm{Ad}(v_{B_i}))\mathbf{l}_{B_i}\\
&= -(1-\mathrm{Ad}(u_\infty^{-1}))\mathrm{Ad}(v_0^{-1})\mathbf{x}_0+\mathrm{Ad}(u_\infty^{-1})\sum_{i=1}^g(1-\mathrm{Ad}(u_{A_i}))\mathbf{f}_{A_i}+(1-\mathrm{Ad}(u_{B_i}))\mathbf{f}_{B_i}.
\end{aligned} \tag{5.14}$$

Making use repeatedly of the identity (5.12) and of the second identity in (5.14) we obtain (5.7).

2. Equation (5.7) can be transformed into (5.9), (5.10) as follows. We first derive an expression for the term $u_\infty^{-1}\delta u_\infty$ in terms of the Lorentz components $u_{A_i},u_{B_i}$ from (5.13),

$$u_\infty^{-1}\delta u_\infty = \sum_{i=1}^g\mathrm{Ad}(u_{H_1}^{-1}\cdots u_{H_{i-1}}^{-1})\Bigl[\bigl(1-\mathrm{Ad}(u_{A_i}^{-1}u_{B_i}u_{A_i})\bigr)u_{A_i}^{-1}\delta u_{A_i}+\bigl(\mathrm{Ad}(u_{A_i}^{-1}u_{B_i}u_{A_i})-\mathrm{Ad}(u_{A_i}^{-1}u_{B_i})\bigr)u_{B_i}^{-1}\delta u_{B_i}\Bigr]. \tag{5.15}$$
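Formula (5.15) can be traced back to two elementary rules for the left-invariant form. The sketch below introduces the shorthand $\theta(a) := a^{-1}\delta a$ (a notation chosen here for illustration) and drops the handle index, writing $u_A,u_B,u_H$ for $u_{A_i},u_{B_i},u_{H_i} = u_Bu_A^{-1}u_B^{-1}u_A$.

```latex
\begin{align*}
\theta(ab) &= \mathrm{Ad}(b^{-1})\,\theta(a)+\theta(b),\qquad
\theta(a^{-1}) = -\mathrm{Ad}(a)\,\theta(a),\\
\theta\bigl(u_A^{-1}u_B^{-1}u_A\bigr)
 &= \bigl(1-\mathrm{Ad}(u_A^{-1}u_Bu_A)\bigr)\theta(u_A)-\mathrm{Ad}(u_A^{-1}u_B)\,\theta(u_B),\\
\theta(u_H) &= \mathrm{Ad}(u_A^{-1}u_Bu_A)\,\theta(u_B)+\theta\bigl(u_A^{-1}u_B^{-1}u_A\bigr)\\
 &= \bigl(1-\mathrm{Ad}(u_A^{-1}u_Bu_A)\bigr)\theta(u_A)
   +\bigl(\mathrm{Ad}(u_A^{-1}u_Bu_A)-\mathrm{Ad}(u_A^{-1}u_B)\bigr)\theta(u_B),\\
\theta(u_{H_g}\cdots u_{H_1}) &= \sum_{i=1}^g\mathrm{Ad}\bigl(u_{H_1}^{-1}\cdots u_{H_{i-1}}^{-1}\bigr)\,\theta(u_{H_i}).
\end{align*}
```

Combining the last two lines reproduces the bracket in (5.15) handle by handle.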

Expressing $f_{A_i}, f_{B_i}$ in (5.6) in terms of $\tilde f_{A_i}, \tilde f_{B_i}$ and isolating the terms containing $j_\infty+\mathrm{Ad}(v_0^{-1})x_0$ yields (5.9). Finally, we express the Lorentz components $u_{A_i}, u_{B_i}$ in (5.9) as products in $v_{A_i}, v_{B_i}$ via (5.1). After applying (5.8) and again making use of (5.12) we obtain (5.10).

Thus, we find that the symplectic potential takes a particularly simple form when the components of the holonomies $A_i, B_i$ are paired with those of $N_{A_i}, N_{B_i}$. Note also that, up to the term $\langle j_\infty+\mathrm{Ad}(v_0^{-1})x_0,\ u_\infty^{-1}\delta u_\infty\rangle$, which involves the components of the constraint (3.19) and can be eliminated by performing the gauge transformation to the variables $\tilde f_{A_i}, \tilde f_{B_i}$, the resulting expressions for the symplectic potential are symmetric under the exchange $l_{A_i}, l_{B_i}\leftrightarrow f_{A_i}, f_{B_i}$, $v_{A_i}, v_{B_i}\leftrightarrow u_{A_i}, u_{B_i}$, which corresponds to exchanging $A_i, B_i\leftrightarrow N_{A_i}, N_{B_i}$ and $\gamma^{-1}(p_0)\leftrightarrow\gamma(p_0)$. Similarly, expression (5.10) for the symplectic potential agrees with (3.28), if we take into account the difference in the parametrisation of the group elements $A_i, B_i$ and $N_{A_i}, N_{B_i}$ and exchange $j_{A_i}\leftrightarrow\mathrm{Ad}(v_{A_i})\tilde x_{A_i}$, $j_{B_i}\leftrightarrow\mathrm{Ad}(v_{B_i})\tilde x_{B_i}$. Hence, up to the gauge transformation (5.8), the symplectic potential takes the same form when expressed in terms of the holonomies $A_i, B_i$ and in terms of $N_{A_i}, N_{B_i}$, as could be anticipated from the symmetry in expressions (3.22), (3.23).

It follows from formula (5.6) for the symplectic potential that the only nontrivial Poisson brackets of the variables $l_{A_i}, l_{B_i}$ and $v_{A_i}, v_{B_i}$ are given by
\[
\{l_X^a, l_X^b\} = -\epsilon^{ab}{}_{c}\, l_X^c, \qquad \{l_X^a, v_X\} = -v_X J_a, \qquad X\in\{A_1,\ldots,B_g\}. \tag{5.16}
\]

We can therefore identify the variables $l_X^a$ with the left-invariant vector fields defined as in (2.17) and acting on the copy of $L_3^\uparrow$ associated to $v_X$,
\[
\{l_X^a, F\}(v_{A_1},\ldots,v_{B_g}) = -J^{R_X}_a F(v_{A_1},\ldots,v_{B_g}) = -\frac{d}{dt}\Big|_{t=0} F(v_{A_1},\ldots,v_X e^{tJ_a},\ldots,v_{B_g}) \tag{5.17}
\]
for $F\in C^\infty((L_3^\uparrow)^{2g})$, $X\in\{A_1,\ldots,B_g\}$. The same holds for the Poisson brackets of $\tilde f_{A_i}, \tilde f_{B_i}$ with $u_{A_i}, u_{B_i}$,
\[
\{\tilde f_X^a, F\}(u_{A_1},\ldots,u_{B_g}) = -J^{R_X}_a F(u_{A_1},\ldots,u_{B_g}) = -\frac{d}{dt}\Big|_{t=0} F(u_{A_1},\ldots,u_X e^{tJ_a},\ldots,u_{B_g}). \tag{5.18}
\]


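5.2. Hamiltonians for grafting. We can now use the results from Sect. 5.1 to show that the mass $m_\lambda$ of a closed, simple curve $\lambda\in\pi_1(S_g)$ generates the transformation of the holonomies under grafting along $\lambda$.

Theorem 5.2. Consider a simple, closed curve $\lambda = n_{x_r}^{\alpha_r}\circ\ldots\circ n_{x_1}^{\alpha_1}\in\pi_1(S_g)$ and a general closed curve $\eta = y_s^{\beta_s}\circ\ldots\circ y_1^{\beta_1}\in\pi_1(S_g)$ with holonomies $H_\lambda$ and $H_\eta$, parametrised in terms of $A_i, B_i$ and $N_{A_i}, N_{B_i}$ as
\[
H_\lambda = \bigl(u_\lambda, -\mathrm{Ad}(u_\lambda)j_\lambda\bigr) = \gamma(p_0)\,N_{X_r}^{\alpha_r}\cdots N_{X_1}^{\alpha_1}\,\gamma(p_0)^{-1}, \qquad
H_\eta = \bigl(u_\eta, -\mathrm{Ad}(u_\eta)j_\eta\bigr) = Y_s^{\beta_s}\cdots Y_1^{\beta_1}, \tag{5.19}
\]
where $X_i, Y_j\in\{A_1,\ldots,B_g\}$ and $\alpha_i,\beta_j\in\{\pm1\}$. Then, the transformation (4.20) of the holonomy $H_\eta$ under grafting along $\lambda$ is generated by the mass $m_\lambda$,
\[
\{w\,m_\lambda, F\} = -\frac{d}{dt}\Big|_{t=0} F\circ \mathrm{Gr}_{tw\lambda}, \qquad F\in C^\infty((P_3^\uparrow)^{2g}), \tag{5.20}
\]
where $\mathrm{Gr}_{tw\lambda}: (P_3^\uparrow)^{2g}\to(P_3^\uparrow)^{2g}$ is given by (4.18), (4.20).

Proof. To prove the theorem, we calculate the Poisson bracket of $p_\lambda^2 = -m_\lambda^2$ with $j_{A_i}, j_{B_i}$. From expression (5.16) for the Poisson bracket we have
\begin{align}
\{l_{A_i}^a, u_\lambda\} &= -\sum_{X_k=A_i,\,\alpha_k=1} u_\lambda\cdot\mathrm{Ad}\bigl(v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}v_0\bigr)^{ba}J_b + \sum_{X_k=A_i,\,\alpha_k=-1} u_\lambda\cdot\mathrm{Ad}\bigl(v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1}v_0\bigr)^{ba}J_b, \tag{5.21}\\
\{l_{B_i}^a, u_\lambda\} &= -\sum_{X_k=B_i,\,\alpha_k=1} u_\lambda\cdot\mathrm{Ad}\bigl(v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}v_0\bigr)^{ba}J_b + \sum_{X_k=B_i,\,\alpha_k=-1} u_\lambda\cdot\mathrm{Ad}\bigl(v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1}v_0\bigr)^{ba}J_b. \tag{5.22}
\end{align}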

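Applying the formula (2.17) for the left-invariant vector fields on $L_3^\uparrow$ to $F = p_\lambda^2$ yields
\begin{align}
\{l_{A_i}, p_\lambda^2\} &= 2\sum_{X_k=A_i,\,\alpha_k=1}\mathrm{Ad}\bigl(v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}v_0\bigr)p_\lambda - 2\sum_{X_k=A_i,\,\alpha_k=-1}\mathrm{Ad}\bigl(v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1}v_0\bigr)p_\lambda, \nonumber\\
\{l_{B_i}, p_\lambda^2\} &= 2\sum_{X_k=B_i,\,\alpha_k=1}\mathrm{Ad}\bigl(v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}v_0\bigr)p_\lambda - 2\sum_{X_k=B_i,\,\alpha_k=-1}\mathrm{Ad}\bigl(v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1}v_0\bigr)p_\lambda, \tag{5.23}
\end{align}
where the expressions involving vectors are to be understood componentwise. With expression (5.3) relating $l_{A_i}, l_{B_i}$ to $j_{A_i}, j_{B_i}$ and setting $\hat p_\lambda = \frac{1}{m_\lambda}p_\lambda$, $p_\lambda^2 = -m_\lambda^2$, we obtain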


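\begin{align}
\{m_\lambda, j_{A_i}\} &= \mathrm{Ad}\bigl(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}\bigr)\Bigl[\sum_{X_k=A_i,\,\alpha_k=1}\mathrm{Ad}\bigl(v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}v_0\bigr)\hat p_\lambda - \sum_{X_k=A_i,\,\alpha_k=-1}\mathrm{Ad}\bigl(v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1}v_0\bigr)\hat p_\lambda\Bigr], \nonumber\\
\{m_\lambda, j_{B_i}\} &= -\mathrm{Ad}\bigl(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{A_i}^{-1}v_{B_i}\bigr)\Bigl[\sum_{X_k=B_i,\,\alpha_k=1}\mathrm{Ad}\bigl(v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}v_0\bigr)\hat p_\lambda - \sum_{X_k=B_i,\,\alpha_k=-1}\mathrm{Ad}\bigl(v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1}v_0\bigr)\hat p_\lambda\Bigr]. \tag{5.24}
\end{align}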
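The step from (5.23) to (5.24) passes from $p_\lambda^2$ to the mass $m_\lambda$. A short sketch of that conversion, using only the relations $p_\lambda^2=-m_\lambda^2$ and $\hat p_\lambda=p_\lambda/m_\lambda$ stated above together with the Leibniz rule for the Poisson bracket; the relation (5.3) between $l_X$ and $j_X$ is not reproduced in this section and is therefore left out:

```latex
% Leibniz rule applied to p_\lambda^2 = -m_\lambda^2, for any function F:
\[
\{p_\lambda^2, F\} = -2\,m_\lambda\,\{m_\lambda, F\}
\quad\Longrightarrow\quad
\{m_\lambda, F\} = -\frac{1}{2m_\lambda}\,\{p_\lambda^2, F\}.
\]
% Applied to F = l_{A_i}, with antisymmetry of the bracket and (5.23):
\[
\{m_\lambda, l_{A_i}\} = \frac{1}{2m_\lambda}\,\{l_{A_i}, p_\lambda^2\}
 = \sum_{X_k=A_i,\,\alpha_k=1}\mathrm{Ad}\bigl(v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}v_0\bigr)\hat p_\lambda
 - \sum_{X_k=A_i,\,\alpha_k=-1}\mathrm{Ad}\bigl(v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1}v_0\bigr)\hat p_\lambda .
\]
```

Comparing with (5.24), the additional prefactor $\mathrm{Ad}(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1})$ is presumably what the relation (5.3) contributes.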

Using expression (4.19) for the variable $j_\eta$ as a linear combination of $j_{A_i}, j_{B_i}$ and taking into account the relation (4.22) between the vector $\hat p_\lambda$ and the vector $n_{p,q}$ in (4.20), we find agreement with (4.20) up to a sign, which proves (5.20).

Hence, we find that the transformation of the holonomies under grafting along a closed, simple geodesic $\lambda$ on S is generated by the mass $m_\lambda$. Note, however, that the transformation generated by the mass $m_\lambda$ is defined for general closed curves $\lambda\in\pi_1(S_g)$ and as a map $(P_3^\uparrow)^{2g}\to(P_3^\uparrow)^{2g}$. In contrast, the grafting procedure defined in [12, 13], whose action on the holonomies is given in Sect. 4, is defined for simple, closed curves and acts on static spacetimes for which the translational components of the dual holonomies $N_{A_i}, N_{B_i}$ vanish and their Lorentz components are the generators of a cocompact Fuchsian group $\Gamma$. In this sense, the transformation $\mathrm{Gr}_\lambda: (P_3^\uparrow)^{2g}\to(P_3^\uparrow)^{2g}$ generated by the mass $m_\lambda$ can be viewed as an extension of the grafting procedure in [12, 13] to non-simple curves and to the whole Poisson manifold $(P_3^\uparrow)^{2g}$.

The fact that the transformation of the holonomies under grafting is generated via the Poisson bracket allows us to deduce some properties of this transformation which would be much less apparent from the general formula (4.20).

Corollary 5.3.
1. The action of grafting leaves the constraint (3.19) invariant and commutes with the associated gauge transformation by simultaneous conjugation,
\[
\{u_\infty, m_\lambda\} = \{j_\infty^a, m_\lambda\} = 0. \tag{5.25}
\]
2. The grafting transformations $\mathrm{Gr}_{w_i\lambda_i}$ for different closed curves $\lambda_i\in\pi_1(S_g)$ with weights $w_i\in\mathbb{R}^+$ commute and satisfy
\[
\Bigl\{\sum_{i=1}^{n}w_i m_{\lambda_i}, F\Bigr\} = -\frac{d}{dt}\Big|_{t=0}F\circ\mathrm{Gr}_{tw_n\lambda_n}\circ\ldots\circ\mathrm{Gr}_{tw_1\lambda_1}, \qquad F\in C^\infty((P_3^\uparrow)^{2g}). \tag{5.26}
\]
3. The grafting maps $\mathrm{Gr}_{w\lambda}$ act on the Poisson manifold $(P_3^\uparrow)^{2g}$ via Poisson isomorphisms,
\[
\{F\circ\mathrm{Gr}_{w\lambda}, G\circ\mathrm{Gr}_{w\lambda}\} = \{F,G\}\circ\mathrm{Gr}_{w\lambda}, \qquad F,G\in C^\infty((P_3^\uparrow)^{2g}). \tag{5.27}
\]


Proof. That the components of the constraint (3.19) Poisson commute with the mass $m_\lambda$ follows from the fact that $m_\lambda$ is an observable of the theory, but can also be checked by direct calculation. It is shown in [14] that the components $j_\infty^a$ act on the Lorentz components $u_{A_i}, u_{B_i}$ by simultaneous conjugation with $L_3^\uparrow$, which leaves all masses $m_\lambda$ invariant. To prove the second statement, we recall that all Lorentz components $u_{A_i}, u_{B_i}$ Poisson commute, which together with (4.20) and (5.20) implies the commutativity of grafting. Differentiating then yields (5.26). The third statement follows directly from the fact that the grafting transformation is generated via the Poisson bracket, by a standard argument making use of the Jacobi identity. In our case, the fact that the Lorentz components $u_{A_i}, u_{B_i}$ Poisson commute allows one to write
\[
\{F\circ\mathrm{Gr}_{w\lambda}, G\circ\mathrm{Gr}_{w\lambda}\} = \sum_{X,Y\in\{A_1,\ldots,B_g\}}\frac{\partial F}{\partial j_X^a}\frac{\partial G}{\partial j_Y^b}\Bigl(\{j_X^a, j_Y^b\} - w\{\{m_\lambda, j_X^a\}, j_Y^b\} - w\{j_X^a, \{m_\lambda, j_Y^b\}\}\Bigr),
\]
and, using the Jacobi identity for the last two brackets, one obtains (5.27),
\[
\{F\circ\mathrm{Gr}_{w\lambda}, G\circ\mathrm{Gr}_{w\lambda}\} = \sum_{X,Y\in\{A_1,\ldots,B_g\}}\frac{\partial F}{\partial j_X^a}\frac{\partial G}{\partial j_Y^b}\Bigl(\{j_X^a, j_Y^b\} - w\{m_\lambda, \{j_X^a, j_Y^b\}\}\Bigr) = \{F,G\}\circ\mathrm{Gr}_{w\lambda}.
\]
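The Jacobi-identity step used at the end of this proof can be spelled out for the coordinate functions. This is only a sketch, assuming the standard form $\{f,\{g,h\}\}+\{g,\{h,f\}\}+\{h,\{f,g\}\}=0$ of the identity and antisymmetry of the bracket:

```latex
% Jacobi identity with f = m_\lambda, g = j_X^a, h = j_Y^b:
\[
\{m_\lambda,\{j_X^a,j_Y^b\}\}+\{j_X^a,\{j_Y^b,m_\lambda\}\}+\{j_Y^b,\{m_\lambda,j_X^a\}\}=0 .
\]
% Antisymmetry rewrites the last two terms as the brackets appearing in the proof:
\[
\{j_X^a,\{j_Y^b,m_\lambda\}\} = -\{j_X^a,\{m_\lambda,j_Y^b\}\},\qquad
\{j_Y^b,\{m_\lambda,j_X^a\}\} = -\{\{m_\lambda,j_X^a\},j_Y^b\},
\]
% so that the two terms of order w combine into a single bracket with m_\lambda:
\[
\{\{m_\lambda,j_X^a\},j_Y^b\}+\{j_X^a,\{m_\lambda,j_Y^b\}\}=\{m_\lambda,\{j_X^a,j_Y^b\}\}.
\]
```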

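After deriving the Hamiltonians that generate the transformation of the holonomies under grafting along a closed, simple curve $\lambda\in\pi_1(S_g)$, we will now demonstrate that Theorem 5.2 gives rise to a general symmetry relation between the Poisson brackets of mass and spin associated to general closed curves $\lambda,\eta\in\pi_1(S_g)$.

Theorem 5.4. The Poisson brackets of mass and spin for $\lambda,\eta\in\pi_1(S_g)$ satisfy the relation
\[
\{p_\eta\cdot j_\eta, p_\lambda^2\} = \{p_\eta^2, p_\lambda\cdot j_\lambda\}, \qquad \{m_\eta, s_\lambda\} = \{s_\eta, m_\lambda\}. \tag{5.28}
\]

Proof. To prove (5.28), we consider curves $\lambda,\eta\in\pi_1(S_g)$ with holonomies $H_\lambda, H_\eta$ parametrised as in (5.19). From (5.23) it follows that the Poisson bracket of $p_\eta\cdot j_\eta$ and $p_\lambda^2$ is given by
\begin{align}
\{p_\eta\cdot j_\eta, p_\lambda^2\} ={}& 2\sum_{i=1}^{g}\Bigl[\sum_{Y_k=A_i,\,\beta_k=1}\mathrm{Ad}\bigl(u_{Y_{k-1}}^{\beta_{k-1}}\cdots u_{Y_1}^{\beta_1}\bigr)p_\eta - \sum_{Y_k=A_i,\,\beta_k=-1}\mathrm{Ad}\bigl(u_{Y_k}^{\beta_k}\cdots u_{Y_1}^{\beta_1}\bigr)p_\eta\Bigr]\cdot \nonumber\\
&\quad\cdot\Bigl[\sum_{X_k=A_i,\,\alpha_k=1}\mathrm{Ad}\bigl(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}v_0\bigr)p_\lambda - \sum_{X_k=A_i,\,\alpha_k=-1}\mathrm{Ad}\bigl(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1}v_0\bigr)p_\lambda\Bigr] \nonumber\\
&-2\sum_{i=1}^{g}\Bigl[\sum_{Y_k=B_i,\,\beta_k=1}\mathrm{Ad}\bigl(u_{Y_{k-1}}^{\beta_{k-1}}\cdots u_{Y_1}^{\beta_1}\bigr)p_\eta - \sum_{Y_k=B_i,\,\beta_k=-1}\mathrm{Ad}\bigl(u_{Y_k}^{\beta_k}\cdots u_{Y_1}^{\beta_1}\bigr)p_\eta\Bigr]\cdot \nonumber\\
&\quad\cdot\Bigl[\sum_{X_k=B_i,\,\alpha_k=1}\mathrm{Ad}\bigl(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{A_i}^{-1}v_{B_i}v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}v_0\bigr)p_\lambda - \sum_{X_k=B_i,\,\alpha_k=-1}\mathrm{Ad}\bigl(v_0^{-1}v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{A_i}^{-1}v_{B_i}v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1}v_0\bigr)p_\lambda\Bigr]. \tag{5.29}
\end{align}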

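To compute the Poisson bracket $\{p_\eta^2, p_\lambda\cdot j_\lambda\}$, we express the translational component of the holonomy $H_\lambda$ in terms of the holonomies $N_{A_i}, N_{B_i}$,
\[
j_\lambda = -\sum_{k:\,\alpha_k=1}\mathrm{Ad}\bigl(v_0^{-1}v_{X_1}^{-\alpha_1}\cdots v_{X_{k-1}}^{-\alpha_{k-1}}v_{X_k}^{-1}\bigr)x_{X_k} + \sum_{k:\,\alpha_k=-1}\mathrm{Ad}\bigl(v_0^{-1}v_{X_1}^{-\alpha_1}\cdots v_{X_k}^{-\alpha_k}v_{X_k}^{-1}\bigr)x_{X_k}. \tag{5.30}
\]
As simultaneous conjugation of all holonomies with a general Poincaré valued function on $(P_3^\uparrow)^{2g}$ leaves $p_\lambda\cdot j_\lambda$ invariant, we can replace $x_{A_i}, x_{B_i}$ by $\tilde x_{A_i}, \tilde x_{B_i}$ in expression (5.30). Using expression (5.18) for the Poisson bracket of $\tilde f_{A_i}, \tilde f_{B_i}$ with $u_{A_i}, u_{B_i}$, Eq. (5.8) relating $\tilde f_{A_i}, \tilde f_{B_i}$ and $\tilde x_{A_i}, \tilde x_{B_i}$, and expression (2.17) for the action of the left-invariant vector fields on $L_3^\uparrow$, we find that the Poisson bracket of $\tilde x_{A_i}, \tilde x_{B_i}$ with $p_\eta^2$ is given by
\begin{align*}
\{\tilde x_{A_i}, p_\eta^2\} &= 2\,\mathrm{Ad}\bigl(v_{A_i}v_{H_{i-1}}\cdots v_{H_1}v_0\bigr)\Bigl[\sum_{Y_k=A_i,\,\beta_k=1}\mathrm{Ad}\bigl(u_{Y_{k-1}}^{\beta_{k-1}}\cdots u_{Y_1}^{\beta_1}\bigr)p_\eta - \sum_{Y_k=A_i,\,\beta_k=-1}\mathrm{Ad}\bigl(u_{Y_k}^{\beta_k}\cdots u_{Y_1}^{\beta_1}\bigr)p_\eta\Bigr],\\
\{\tilde x_{B_i}, p_\eta^2\} &= -2\,\mathrm{Ad}\bigl(v_{A_i}v_{H_{i-1}}\cdots v_{H_1}v_0\bigr)\Bigl[\sum_{Y_k=B_i,\,\beta_k=1}\mathrm{Ad}\bigl(u_{Y_{k-1}}^{\beta_{k-1}}\cdots u_{Y_1}^{\beta_1}\bigr)p_\eta - \sum_{Y_k=B_i,\,\beta_k=-1}\mathrm{Ad}\bigl(u_{Y_k}^{\beta_k}\cdots u_{Y_1}^{\beta_1}\bigr)p_\eta\Bigr].
\end{align*}
Replacing $x_{A_i}\to\tilde x_{A_i}$, $x_{B_i}\to\tilde x_{B_i}$ in expression (5.30) for $j_\lambda$ then yields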

\begin{align}
\{p_\eta^2, j_\lambda\} ={}& 2\sum_{i=1}^{g}\Bigl[\sum_{X_k=A_i,\,\alpha_k=1}\mathrm{Ad}\bigl(v_0^{-1}v_{X_1}^{-\alpha_1}\cdots v_{X_{k-1}}^{-\alpha_{k-1}}\bigr) - \sum_{X_k=A_i,\,\alpha_k=-1}\mathrm{Ad}\bigl(v_0^{-1}v_{X_1}^{-\alpha_1}\cdots v_{X_k}^{-\alpha_k}\bigr)\Bigr]\cdot \nonumber\\
&\quad\cdot\mathrm{Ad}\bigl(v_{H_{i-1}}\cdots v_{H_1}v_0\bigr)\Bigl[\sum_{Y_k=A_i,\,\beta_k=1}\mathrm{Ad}\bigl(u_{Y_{k-1}}^{\beta_{k-1}}\cdots u_{Y_1}^{\beta_1}\bigr)p_\eta - \sum_{Y_k=A_i,\,\beta_k=-1}\mathrm{Ad}\bigl(u_{Y_k}^{\beta_k}\cdots u_{Y_1}^{\beta_1}\bigr)p_\eta\Bigr] \nonumber\\
&-2\sum_{i=1}^{g}\Bigl[\sum_{X_k=B_i,\,\alpha_k=1}\mathrm{Ad}\bigl(v_0^{-1}v_{X_1}^{-\alpha_1}\cdots v_{X_{k-1}}^{-\alpha_{k-1}}\bigr) - \sum_{X_k=B_i,\,\alpha_k=-1}\mathrm{Ad}\bigl(v_0^{-1}v_{X_1}^{-\alpha_1}\cdots v_{X_k}^{-\alpha_k}\bigr)\Bigr]\cdot \nonumber\\
&\quad\cdot\mathrm{Ad}\bigl(v_{B_i}^{-1}v_{A_i}v_{H_{i-1}}\cdots v_{H_1}v_0\bigr)\Bigl[\sum_{Y_k=B_i,\,\beta_k=1}\mathrm{Ad}\bigl(u_{Y_{k-1}}^{\beta_{k-1}}\cdots u_{Y_1}^{\beta_1}\bigr)p_\eta - \sum_{Y_k=B_i,\,\beta_k=-1}\mathrm{Ad}\bigl(u_{Y_k}^{\beta_k}\cdots u_{Y_1}^{\beta_1}\bigr)p_\eta\Bigr], \tag{5.31}
\end{align}
and multiplication with $p_\lambda$ gives (5.29).

The geometrical implications of Theorem 5.4 are that the change of the spin $s_\eta$ of a closed, simple curve $\eta\in\pi_1(S_g)$ under grafting along another closed, simple curve $\lambda\in\pi_1(S_g)$ is the same as the change of the spin $s_\lambda$ under grafting along $\eta$. Furthermore, it is shown in [16] (for a summary of the results see Sect. 6) that the product $m_\lambda s_\lambda$ of mass and spin of a closed, simple curve $\lambda\in\pi_1(S_g)$ is the Hamiltonian which generates an infinitesimal Dehn twist around $\lambda$. Thus, Theorem 5.4 implies that the transformation of the mass $m_\eta$ under an infinitesimal Dehn twist around $\lambda\in\pi_1(S_g)$ agrees with the transformation of the spin $s_\eta$ under infinitesimal grafting along $\lambda$. We will clarify this connection further in the next section, where we discuss the relation between grafting and Dehn twists.

6. Grafting and Dehn Twists

In this section, we show that there is a link between the transformation of the holonomies under grafting and under Dehn twists along a general closed, simple curve $\lambda\in\pi_1(S_g)$. The transformation of the holonomies under Dehn twists is investigated in [16] for Chern-Simons theory on a manifold of topology $\mathbb{R}\times S_{g,n}$, where $S_{g,n}$ is a general orientable two-surface of genus g with n punctures. The gauge groups considered in [16] are of the form $G\ltimes\mathfrak{g}^*$, where G is a finite dimensional, connected, simply connected and unimodular Lie group, $\mathfrak{g}^*$ the dual of its Lie algebra, and G acts on $\mathfrak{g}^*$ in the coadjoint representation. The assumption of simply-connectedness in [16] gives rise to technical simplifications in the quantised theory but does not affect the classical results. Hence, the reasoning and results in [16] apply to the case of gauge group $P_3^\uparrow$ and can be summarised as follows.

Theorem 6.1 ([16]).
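For any simple, closed curve $\lambda\in\pi_1(S_g)$ with holonomy $H_\lambda = (u_\lambda, -\mathrm{Ad}(u_\lambda)j_\lambda)$, $u_\lambda = e^{-p_\lambda^aJ_a}$, the product of the associated mass and spin $p_\lambda\cdot j_\lambda = m_\lambda s_\lambda$ generates an infinitesimal Dehn twist around $\lambda$ via the Poisson bracket defined by (3.28),
\[
\{m_\lambda s_\lambda, F\} = \frac{d}{dt}\Big|_{t=0}F\circ D_{t\lambda}, \qquad F\in C^\infty((P_3^\uparrow)^{2g}), \tag{6.1}
\]
where $D_{t\lambda}: (P_3^\uparrow)^{2g}\to(P_3^\uparrow)^{2g}$ agrees with the action $D_\lambda: (P_3^\uparrow)^{2g}\to(P_3^\uparrow)^{2g}$ of the Dehn twist around $\lambda$ for t = 1. The transformation $D_{t\lambda}$ acts on the Poisson manifold $(P_3^\uparrow)^{2g}$ via Poisson isomorphisms,
\[
\{F\circ D_{t\lambda}, G\circ D_{t\lambda}\} = \{F,G\}\circ D_{t\lambda}, \qquad F,G\in C^\infty((P_3^\uparrow)^{2g}). \tag{6.2}
\]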

As in the definition of the grafting map $\mathrm{Gr}_{w\lambda}: (P_3^\uparrow)^{2g}\to(P_3^\uparrow)^{2g}$, the different copies of $P_3^\uparrow$ in Theorem 6.1 stand for the holonomies $A_i, B_i$. However, unlike our derivation of the grafting map, the derivation in [16] does not make use of the dual generators $n_{a_i}, n_{b_i}$ but is formulated entirely in terms of the holonomies $A_i, B_i$. The action $D_{t\lambda}: (P_3^\uparrow)^{2g}\to(P_3^\uparrow)^{2g}$ of (infinitesimal) Dehn twists on the holonomies is determined graphically. As this graphical procedure will play an important role in relating Dehn twists and grafting, we present it here in a slightly different and more detailed version than in [16].

We consider simple curves $\lambda,\eta\in\pi_1(S_g)$ parametrised in terms of the generators $a_i, b_i\in\pi_1(S_g)$ as $\lambda = z_t^{\delta_t}\circ\ldots\circ z_1^{\delta_1}$, $\eta = y_s^{\beta_s}\circ\ldots\circ y_1^{\beta_1}$ with $z_i, y_j\in\{a_1,\ldots,b_g\}$, $\beta_j,\delta_k\in\{\pm1\}$ and associated holonomies
\begin{align}
H_\lambda &= Z_t^{\delta_t}\cdots Z_1^{\delta_1} = e^{-(p_\lambda^a+\theta k_\lambda^a)J_a} = \bigl(u_\lambda, -\mathrm{Ad}(u_\lambda)j_\lambda\bigr), \nonumber\\
H_\eta &= Y_s^{\beta_s}\cdots Y_1^{\beta_1} = e^{-(p_\eta^a+\theta k_\eta^a)J_a} = \bigl(u_\eta, -\mathrm{Ad}(u_\eta)j_\eta\bigr). \tag{6.3}
\end{align}

To determine the action of the transformation generated by $m_\lambda s_\lambda$ on the holonomy $H_\eta$, we consider the surface $S_g - D$ obtained from $S_g$ by removing a disc$^2$ $D$. We represent the generators $a_i, b_i\in\pi_1(S_g)$ by curves as in Fig. 4, but instead of a basepoint, we draw a line on which the starting points $s_{a_i}, s_{b_i}$ and endpoints $t_{a_i}, t_{b_i}$ are ordered$^3$ (from right to left) according to
\[
s_{a_1} < s_{b_1} < t_{a_1} < t_{b_1} < s_{a_2} < s_{b_2} < t_{a_2} < t_{b_2} < \ldots < s_{a_g} < s_{b_g} < t_{a_g} < t_{b_g}. \tag{6.4}
\]
The curves representing the generators $a_i, b_i\in\pi_1(S_g)$ start and end in, respectively, $s_{a_i}, s_{b_i}$ and $t_{a_i}, t_{b_i}$, and their inverses in $s_{a_i^{-1}} = t_{a_i}$, $s_{b_i^{-1}} = t_{b_i}$ and $t_{a_i^{-1}} = s_{a_i}$, $t_{b_i^{-1}} = s_{b_i}$.

To derive the transformation of the holonomy $H_\eta$ under an (infinitesimal) Dehn twist along $\lambda$, we draw two such lines, one corresponding to $\eta$, one to $\lambda$, such that the line for $\eta$ is tangent to the disc, while the one for $\lambda$ is displaced slightly away from the disc. We then decompose the curves representing $\eta$ and $\lambda$ graphically into the curves representing the generators $a_i, b_i$ and their inverses, with ordered starting and end points on the corresponding lines, and into segments parallel to the lines which connect the starting and endpoints of different factors, see Figs. 6, 7, 9, and 10. The curves representing $a_i^{\pm1}, b_i^{\pm1}$ and the segments connecting their starting and endpoints are drawn in such a way that there is a minimal number of intersection points and such that all intersection points occur on the lines connecting different starting and endpoints of generators $a_i, b_i$ in the decomposition of $\lambda$, as shown in Figs. 6, 7, 9, and 10.

An intersection point $q_i$ is said to occur between the factors $z_i^{\delta_i}$ and $z_{i+1}^{\delta_{i+1}}$ on $\lambda$ if it lies on the straight line connecting $t_{z_i^{\delta_i}}$ and $s_{z_{i+1}^{\delta_{i+1}}}$, where $z_{t+1}^{\delta_{t+1}} = z_1^{\delta_1}$. Similarly, an intersection point occurs between the factors $y_i^{\beta_i}$ and $y_{i+1}^{\beta_{i+1}}$ on $\eta$ if it lies on $y_{i+1}^{\beta_{i+1}}$ near the starting point $s_{y_{i+1}^{\beta_{i+1}}}$ or on $y_i^{\beta_i}$ near the endpoint $t_{y_i^{\beta_i}}$.

$^2$ The reason for the removal of the disc is that we work on an extended phase space where the constraint (3.19) arising from the defining relation of the fundamental group is not imposed. It is shown in [16] that this implies that instead of the mapping class group $\mathrm{Map}(S_g)$, it is the mapping class group $\mathrm{Map}(S_g - D)$ that acts on the Poisson manifold $(P_3^\uparrow)^{2g}$.

$^3$ This ordering corresponds to an ordering of the edges at each vertex needed to define the Poisson structure in the formalism developed by Fock and Rosly [30].

Fig. 6. The decomposition of $n_{a_i} = h_1^{-1}\circ\ldots\circ h_{i-1}^{-1}\circ a_i^{-1}\circ b_i\circ a_i\circ h_{i-1}\circ\ldots\circ h_1$ (full line) and its intersection with $a_i$ (dashed line); segments in the decomposition of $n_{a_i}$ that do not intersect any generator $a_j, b_j\in\pi_1(S_g)$ are omitted

Fig. 7. The decomposition of $n_{b_i} = h_1^{-1}\circ\ldots\circ h_{i-1}^{-1}\circ a_i^{-1}\circ b_i\circ a_i\circ b_i^{-1}\circ a_i\circ h_{i-1}\circ\ldots\circ h_1$ (full line) and its intersection with $b_i$ (dashed line); segments in the decomposition of $n_{b_i}$ that do not intersect any generator $a_j, b_j\in\pi_1(S_g)$ are omitted

Fig. 8. The intersection of the geodesics $c_i$ with the polygon P 1

Fig. 9. The decomposition of $h_i$ (full line) and its intersection points with $a_i, b_i$ (dashed lines)

Fig. 10. The intersection points of $h_i$ (full line) and its intersection with $a_i, b_i$ (dashed lines), simplified representation without horizontal segments that do not contain intersection points

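Let now $\lambda,\eta\in\pi_1(S_g)$ have intersection points $q_1,\ldots,q_n$ such that $q_i$ occurs between $z_{k_i}^{\delta_{k_i}}$ and $z_{k_i+1}^{\delta_{k_i+1}}$ on $\lambda$ and between $y_{j_i}^{\beta_{j_i}}$ and $y_{j_i+1}^{\beta_{j_i+1}}$ on $\eta$ with $j_1\le j_2\le\ldots\le j_n$. We denote by $\epsilon_i = \epsilon_i(\lambda,\eta)$ the oriented intersection number in $q_i$ with the convention $\epsilon_i = 1$ if $\lambda$ crosses $\eta$ from the left to the right in the direction of $\eta$. It is shown in [16] that, with these conventions, the action of an infinitesimal Dehn twist $D_{t\lambda}: (P_3^\uparrow)^{2g}\to(P_3^\uparrow)^{2g}$ is given by inserting the Poincaré element
\[
\bigl(Z_{k_i}^{\delta_{k_i}}Z_{k_i-1}^{\delta_{k_i-1}}\cdots Z_1^{\delta_1}\bigr)H_\lambda^{\epsilon_i t}\bigl(Z_{k_i}^{\delta_{k_i}}Z_{k_i-1}^{\delta_{k_i-1}}\cdots Z_1^{\delta_1}\bigr)^{-1} = \bigl(Z_{k_i}^{\delta_{k_i}}\cdots Z_1^{\delta_1}\bigr)e^{-\epsilon_i t(p_\lambda^a+\theta k_\lambda^a)J_a}\bigl(Z_{k_i}^{\delta_{k_i}}\cdots Z_1^{\delta_1}\bigr)^{-1} \tag{6.5}
\]
between the factors $Y_{j_i}^{\beta_{j_i}}$ and $Y_{j_i+1}^{\beta_{j_i+1}}$,
\begin{align}
D_{t\lambda}: H_\eta\to{}& Y_s^{\beta_s}\cdots Y_{j_n+1}^{\beta_{j_n+1}}\cdot\bigl(Z_{k_n}^{\delta_{k_n}}\cdots Z_1^{\delta_1}\bigr)e^{-\epsilon_n t(p_\lambda^a+\theta k_\lambda^a)J_a}\bigl(Z_{k_n}^{\delta_{k_n}}\cdots Z_1^{\delta_1}\bigr)^{-1}\cdot Y_{j_n}^{\beta_{j_n}}\cdots Y_{j_{n-1}+1}^{\beta_{j_{n-1}+1}}\cdot \nonumber\\
&\cdot\bigl(Z_{k_{n-1}}^{\delta_{k_{n-1}}}\cdots Z_1^{\delta_1}\bigr)e^{-\epsilon_{n-1}t(p_\lambda^a+\theta k_\lambda^a)J_a}\bigl(Z_{k_{n-1}}^{\delta_{k_{n-1}}}\cdots Z_1^{\delta_1}\bigr)^{-1}Y_{j_{n-1}}^{\beta_{j_{n-1}}}\cdots Y_{j_{n-2}+1}^{\beta_{j_{n-2}+1}}\,(\;)\cdots(\;)\cdot \nonumber\\
&\cdot Y_{j_2}^{\beta_{j_2}}\cdots Y_{j_1+1}^{\beta_{j_1+1}}\cdot\bigl(Z_{k_1}^{\delta_{k_1}}\cdots Z_1^{\delta_1}\bigr)e^{-\epsilon_1t(p_\lambda^a+\theta k_\lambda^a)J_a}\bigl(Z_{k_1}^{\delta_{k_1}}\cdots Z_1^{\delta_1}\bigr)^{-1}Y_{j_1}^{\beta_{j_1}}\cdots Y_1^{\beta_1}. \tag{6.6}
\end{align}
We now define an analogous transformation $\widetilde{\mathrm{Gr}}_{t\lambda}$ that acts on the holonomy $H_\eta$ by inserting at each intersection point the vector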


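\[
\bigl(Z_{k_i}^{\delta_{k_i}}\cdots Z_1^{\delta_1}\bigr)H_\lambda^{\theta\epsilon_i t}\bigl(Z_{k_i}^{\delta_{k_i}}\cdots Z_1^{\delta_1}\bigr)^{-1} = \bigl(Z_{k_i}^{\delta_{k_i}}\cdots Z_1^{\delta_1}\bigr)e^{-\theta\epsilon_i t\,p_\lambda^aJ_a}\bigl(Z_{k_i}^{\delta_{k_i}}\cdots Z_1^{\delta_1}\bigr)^{-1}\in\mathbb{R}^3\subset P_3^\uparrow \tag{6.7}
\]
instead of the Poincaré element (6.5) in the definition of the Dehn twist,
\begin{align}
\widetilde{\mathrm{Gr}}_{t\lambda}: H_\eta\to{}& Y_s^{\beta_s}\cdots Y_{j_n+1}^{\beta_{j_n+1}}\cdot\bigl(Z_{k_n}^{\delta_{k_n}}\cdots Z_1^{\delta_1}\bigr)e^{-\theta\epsilon_n t(p_\lambda^a+\theta k_\lambda^a)J_a}\bigl(Z_{k_n}^{\delta_{k_n}}\cdots Z_1^{\delta_1}\bigr)^{-1}\cdot Y_{j_n}^{\beta_{j_n}}\cdots Y_{j_{n-1}+1}^{\beta_{j_{n-1}+1}}\cdot \nonumber\\
&\cdot\bigl(Z_{k_{n-1}}^{\delta_{k_{n-1}}}\cdots Z_1^{\delta_1}\bigr)e^{-\theta\epsilon_{n-1}t(p_\lambda^a+\theta k_\lambda^a)J_a}\bigl(Z_{k_{n-1}}^{\delta_{k_{n-1}}}\cdots Z_1^{\delta_1}\bigr)^{-1}Y_{j_{n-1}}^{\beta_{j_{n-1}}}\cdots Y_{j_{n-2}+1}^{\beta_{j_{n-2}+1}}\,(\;)\cdots(\;)\cdot \nonumber\\
&\cdot Y_{j_2}^{\beta_{j_2}}\cdots Y_{j_1+1}^{\beta_{j_1+1}}\cdot\bigl(Z_{k_1}^{\delta_{k_1}}\cdots Z_1^{\delta_1}\bigr)e^{-\theta\epsilon_1t(p_\lambda^a+\theta k_\lambda^a)J_a}\bigl(Z_{k_1}^{\delta_{k_1}}\cdots Z_1^{\delta_1}\bigr)^{-1}Y_{j_1}^{\beta_{j_1}}\cdots Y_1^{\beta_1}. \tag{6.8}
\end{align}
From the parametrisation (6.3) we see directly that this transformation leaves the Lorentz component of $H_\eta$ invariant and acts on the vector $j_\eta$ according to
\[
\widetilde{\mathrm{Gr}}_{t\lambda}: j_\eta\to j_\eta + t\sum_{i=1}^{n}\epsilon_i\,\mathrm{Ad}\bigl(u_{Y_1}^{-\beta_1}\cdots u_{Y_{j_i}}^{-\beta_{j_i}}\bigr)\mathrm{Ad}\bigl(u_{Z_{k_i}}^{\delta_{k_i}}\cdots u_{Z_1}^{\delta_1}\bigr)p_\lambda. \tag{6.9}
\]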
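Why the element inserted in (6.7) is a pure translation, while the one in (6.5) is a general Poincaré element, follows from $\theta^2=0$. A brief sketch, with $\epsilon_i$ the oriented intersection number at $q_i$ and $H_\lambda=e^{-(p_\lambda^a+\theta k_\lambda^a)J_a}$ as in (6.3); it assumes that powers of $H_\lambda$ are defined through this exponential form and that elements of the form $1+\theta a^bJ_b$ represent translations, as the identification $(1,t\,p_\lambda)=e^{t\theta p_\lambda^aJ_a}$ in (7.5) suggests:

```latex
% Raising H_\lambda to the power \theta\epsilon_i t and using \theta^2 = 0:
\[
H_\lambda^{\theta\epsilon_i t}
 = e^{-\theta\epsilon_i t\,(p_\lambda^a+\theta k_\lambda^a)J_a}
 = e^{-\theta\epsilon_i t\,p_\lambda^aJ_a}
 = 1-\theta\,\epsilon_i t\,p_\lambda^aJ_a .
\]
% The k_\lambda term drops out because it is multiplied by \theta^2, and the
% exponential series stops after the linear term for the same reason.
% The Dehn-twist insertion in (6.5) keeps both components:
\[
H_\lambda^{\epsilon_i t} = e^{-\epsilon_i t\,(p_\lambda^a+\theta k_\lambda^a)J_a}.
\]
```

In this form the vector $k_\lambda$ plays no role in (6.8), which is consistent with (6.9) depending on $p_\lambda$ only.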

We will now demonstrate that, up to a factor $m_\lambda$, the transformation (6.8) is the same as the transformation (4.20) of $H_\eta$ under grafting along $\lambda$. For this, we express $\lambda$ as a product in the dual generators $n_{a_i}, n_{b_i}$,
\[
\lambda = z_t^{\delta_t}\circ\ldots\circ z_1^{\delta_1} = n_{x_r}^{\alpha_r}\cdot\ldots\cdot n_{x_1}^{\alpha_1}, \qquad x_i\in\{a_1,\ldots,b_g\},\ \alpha_i\in\{\pm1\}. \tag{6.10}
\]
From expression (3.24) of $n_{a_i}, n_{b_i}$ in terms of $a_i, b_i$, it follows that the curves on $S_g$ representing $n_{a_i}, n_{b_i}$ both start and end in $s_{a_1}$. Hence, by representing the curve $\lambda$ as a product of $n_{a_i}, n_{b_i}$, we find that in contrast to the graphical representation in terms of $a_i$ and $b_i$, there are no intersection points on straight segments connecting the starting and endpoints of different factors. All intersection points of $\lambda$ and $\eta$ occur within the curves representing the factors $n_{a_i}^{\pm1}$ and $n_{b_i}^{\pm1}$ in (6.10), which reflects the fact that the generators $n_{a_i}, n_{b_i}$ are dual to the generators $a_i, b_i$. To show that transformation (6.8) agrees with the transformation (4.20) of the holonomy $H_\eta$ under grafting along $\lambda$, it is therefore sufficient to examine the intersection points of $n_{a_i}$ with $a_i$ and of $n_{b_i}$ with $b_i$.

Expressing the generators $n_{a_i}, n_{b_i}$ as products in the generators $a_i, b_i$ via (3.24) and applying the graphical prescription defined above, we find that the intersection of $a_i$ and $n_{a_i}$ occurs between $a_i\circ h_{i-1}\circ\ldots\circ h_1$ and $h_1^{-1}\circ\ldots\circ h_{i-1}^{-1}\circ a_i^{-1}\circ b_i$ on $n_{a_i}$ and after $a_i$, and has negative intersection number, see Fig. 6. Figure 7 shows that the intersection of $b_i$ and $n_{b_i}$ occurs before $b_i$ and between $b_i^{-1}\circ a_i\circ h_{i-1}\circ\ldots\circ h_1$ and $h_1^{-1}\circ\ldots\circ h_{i-1}^{-1}\circ a_i^{-1}\circ b_i\circ a_i$ on $n_{b_i}$, also with negative intersection number. The intersections of $a_i$ and $n_{a_i}^{-1}$ therefore lie between $b_i^{-1}\circ a_i\circ h_{i-1}\circ\ldots\circ h_1$ and $h_1^{-1}\circ\ldots\circ h_{i-1}^{-1}\circ a_i^{-1}$, and those of $b_i$ with $n_{b_i}^{-1}$ between $a_i^{-1}\circ b_i^{-1}\circ a_i\circ h_{i-1}\circ\ldots\circ h_1$ and $h_1^{-1}\circ\ldots\circ h_{i-1}^{-1}\circ a_i^{-1}\circ b_i$, both with positive intersection number. By evaluating the general expression (6.9) for the curves $\eta = a_i$, $\eta = b_i$, we find

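\begin{align}
j_{A_i}\to{}& j_{A_i} - \mathrm{Ad}(u_{A_i}^{-1})\sum_{X_k=A_i,\,\alpha_k=1}\mathrm{Ad}\bigl(u_{A_i}u_{H_{i-1}}\cdots u_{H_1}v_0^{-1}v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}v_0\bigr)p_\lambda \nonumber\\
& + \mathrm{Ad}(u_{A_i}^{-1})\sum_{X_k=A_i,\,\alpha_k=-1}\mathrm{Ad}\bigl(u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1}v_0^{-1}v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}v_0\bigr)p_\lambda, \tag{6.11}\\
j_{B_i}\to{}& j_{B_i} + \sum_{X_k=B_i,\,\alpha_k=1}\mathrm{Ad}\bigl(u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1}v_0^{-1}v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}v_0\bigr)p_\lambda \nonumber\\
& - \sum_{X_k=B_i,\,\alpha_k=-1}\mathrm{Ad}\bigl(u_{A_i}^{-1}u_{B_i}^{-1}u_{A_i}u_{H_{i-1}}\cdots u_{H_1}v_0^{-1}v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}v_0\bigr)p_\lambda, \tag{6.12}
\end{align}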

and with identities (4.22), (5.11) we recover (4.18), up to a factor $m_\lambda$. The transformation of a general curve $\eta\in\pi_1(S_g)$ is then given by decomposing it into the generators $a_i, b_i$, and we obtain the following theorem.

Theorem 6.2. Formulated in terms of the holonomies $A_i, B_i$, the grafting map $\mathrm{Gr}_{w\lambda}: (P_3^\uparrow)^{2g}\to(P_3^\uparrow)^{2g}$ defined by (4.18) takes the form
\[
\mathrm{Gr}_{wm_\lambda\lambda} = \widetilde{\mathrm{Gr}}_{w\lambda} = D_{\theta w\lambda}, \tag{6.13}
\]
with $D_{w\lambda}$, $\widetilde{\mathrm{Gr}}_{w\lambda}$ given by (6.6), (6.8). In particular, the Poisson bracket between $m_\lambda$ and $s_\eta$ or, equivalently, $s_\lambda$ and $m_\eta$ is given by
\[
\{m_\lambda, s_\eta\} = \{s_\lambda, m_\eta\} = -\sum_{i=1}^{n}\epsilon_i(\lambda,\eta)\,\mathrm{Ad}\bigl(u_{Z_{k_i}}^{\alpha_{k_i}}\cdots u_{Z_1}^{\alpha_1}\bigr)\hat p_\lambda\cdot\mathrm{Ad}\bigl(u_{Y_{l_i}}^{\beta_{l_i}}\cdots u_{Y_1}^{\beta_1}\bigr)\hat p_\eta. \tag{6.14}
\]

Hence, we have found a rather close relation between the action of infinitesimal Dehn twists and grafting along a closed, simple curve $\lambda\in\pi_1(S_g)$. The infinitesimal Dehn twist along $\lambda$ is generated by the observable $m_\lambda s_\lambda$ and acts on the holonomy of another curve $\eta$ by inserting at each intersection point $q_i$ the Poincaré element $\bigl(Z_{k_i}^{\alpha_{k_i}}\cdots Z_1^{\alpha_1}\bigr)H_\lambda^{\epsilon_i t}\bigl(Z_{k_i}^{\alpha_{k_i}}\cdots Z_1^{\alpha_1}\bigr)^{-1}$. Grafting along $\lambda$ is generated by the observable $m_\lambda^2$ and inserts at each intersection point the element $\bigl(Z_{k_i}^{\alpha_{k_i}}\cdots Z_1^{\alpha_1}\bigr)H_\lambda^{\theta\epsilon_i t}\bigl(Z_{k_i}^{\alpha_{k_i}}\cdots Z_1^{\alpha_1}\bigr)^{-1}\in\mathbb{R}^3\subset P_3^\uparrow$. The formal parameter $\theta$ satisfying $\theta^2 = 0$ therefore allows us to view grafting along $\lambda$ with weight w as an infinitesimal Dehn twist with parameter $\theta w$.

7. Example: Grafting and Dehn twists along $\lambda = h_i = [b_i, a_i^{-1}]$

To illustrate the general results of this paper with a concrete example, we consider grafting and Dehn twists along the curve $\lambda = h_i = [b_i, a_i^{-1}]\in\pi_1(p_0, S_g)$. We start by determining the transformation of the holonomies under grafting along $\lambda$ as described in Sect. 4. From (3.23) it follows that the associated element of the cocompact Fuchsian group $\Gamma$ is given by
\[
v = v_0u_{H_i}v_0^{-1} = v_{H_1}^{-1}\cdots v_{H_{i-1}}^{-1}v_{H_i}^{-1}v_{H_{i-1}}\cdots v_{H_1}. \tag{7.1}
\]


As we have shown in Sect. 4 that conjugation with elements of $\Gamma$ does not affect the grafting, we can instead consider the curve
\[
\tilde\lambda = n_{h_i}^{-1} = [n_{a_i}^{-1}, n_{b_i}]
\]
with associated group element
\[
\tilde v = v_{H_i}^{-1} = [v_{A_i}^{-1}, v_{B_i}].
\]

We denote by c˜ p,q the lift of the closed, simple geodesic λ˜ to a geodesic in H2 with p ∈ P 1 and with unit normal vector

a n˜ p,q = −Ad v Hi−1 · · · v H1 v0 pˆ λ , (7.2) e− pλ Ja = u Bi , u −1 Ai . From Fig. 8, we find that the geodesics in the associated -invariant multicurve on H2 that intersect the polygon P 1 ⊂ H2 are given by c1 = c˜ p,q ,

c2 = Ad(v −1 Bi )˜c p,q ,

−1 c4 = Ad(v Bi v −1 Ai v Bi )˜c p,q ,

c3 = Ad(v Ai v −1 Bi )˜c p,q ,

˜ p,q , c5 = Ad([v −1 Ai , v Bi ])˜c p,q = c

(7.3)

and all intersection points lie on sides ai , ai , bi , bi . The side ai of the polygon intersects c2 , c5 = c1 with, respectively, positive and negative intersection number, while bi intersects c2 , c3 , also with, respectively, positive and negative intersection number. Hence, using formula (4.18) and expression (7.2), we find that the transformation of the holonomies Ai , Bi along the generators ai , bi ∈ π1 (Sg ) is given by −1 · · ·v j Ai → j Ai −tAd v0−1 v −1 H1 Hi−1 (n5 − n2 ) −1 1−Ad v −1 n˜ p,q = j Ai −tAd v0−1 v −1 H1· · ·v Hi−1 Bi pˆ λ , = j Ai + t 1 − Ad u −1 Ai −1 −1 j Bi → j Bi +tAd v0−1 v −1 H1· · ·v Hi−1 v Ai v Bi (n3−n2 ) −1 −1 −1 1−Ad v n˜ p,q · · ·v v = j Bi −tAd v0−1 v −1 H1 Hi−1 Ai Hi pˆ λ , = j Bi + t 1 − Ad u −1 (7.4) Bi while all other holonomies transform trivially. The transformation of the holonomy along a general curve η ∈ π1 (Sg ) is obtained by writing the associated vector j η as a linear combination of j Ai , j Bi as in (4.19). Expression (5.24) implies that the mass m λ has non-trivial Poisson brackets only with the variables j Ai , j Bi , −1 ˜ p,q−Ad(v Hi )n˜ cp,q m λ , j Ai = Ad v0−1 v −1 Ad(v −1 H1 · · · v Hi−1 Bi )n = − 1−Ad(u −1 ) pˆ λ , Ai −1 −1 ˜ p,q−Ad(v −1 ˜ p,q m λ , j Bi = −Ad v0−1 v −1 Ad(v Ai v −1 H1 · · · v Hi−1 v Ai v Bi Bi )n Bi )n = − 1−Ad(u −1 ) pˆ λ . Bi


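Grafting along $\lambda$ therefore acts on the holonomies $A_i, B_i$ according to
\begin{align}
\mathrm{Gr}_{tm_\lambda\lambda}: A_i &\to (1, t\,p_\lambda)A_i(1, -t\,p_\lambda) = e^{t\theta p_\lambda^aJ_a}A_i\,e^{-t\theta p_\lambda^aJ_a} = H_\lambda^{-\theta t}A_iH_\lambda^{\theta t}, \nonumber\\
B_i &\to (1, t\,p_\lambda)B_i(1, -t\,p_\lambda) = e^{t\theta p_\lambda^aJ_a}B_i\,e^{-t\theta p_\lambda^aJ_a} = H_\lambda^{-\theta t}B_iH_\lambda^{\theta t}. \tag{7.5}
\end{align}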
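The conjugation in (7.5) can be made explicit on the components of a holonomy. The following sketch assumes the usual semidirect-product multiplication law $(u_1,a_1)(u_2,a_2)=(u_1u_2,\ a_1+\mathrm{Ad}(u_1)a_2)$ for $P_3^\uparrow$, which is not restated in this section, together with the parametrisation $A_i=(u_{A_i},-\mathrm{Ad}(u_{A_i})j_{A_i})$ used for all holonomies; $j_{A_i}'$ is a label introduced here for the transformed vector.

```latex
% Conjugating a general element (u,a) by the pure translation (1, t p_\lambda):
\[
(1,t\,p_\lambda)\,(u,a)\,(1,-t\,p_\lambda)
 = \bigl(u,\; a + t\,(1-\mathrm{Ad}(u))\,p_\lambda\bigr).
\]
% With a = -\mathrm{Ad}(u_{A_i})\,j_{A_i} the Lorentz part is unchanged and
\[
-\mathrm{Ad}(u_{A_i})\,j_{A_i}' = -\mathrm{Ad}(u_{A_i})\,j_{A_i} + t\,\bigl(1-\mathrm{Ad}(u_{A_i})\bigr)\,p_\lambda
\quad\Longrightarrow\quad
j_{A_i}' = j_{A_i} + t\,\bigl(1-\mathrm{Ad}(u_{A_i}^{-1})\bigr)\,p_\lambda ,
\]
% and likewise for B_i with u_{A_i} replaced by u_{B_i}.
```

Since $p_\lambda = m_\lambda\hat p_\lambda$, this has the same form as the last expressions in (7.4), with the weight rescaled by $m_\lambda$.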

To determine the action of an (infinitesimal) Dehn twist along $\lambda$, we apply the graphical procedure of Sect. 6 as depicted in Figs. 9 and 10. We find that both $a_i$ and $b_i$ intersect $\lambda$ twice, once at their starting points with positive intersection number and once at their endpoints with negative intersection number. All intersections take place on the segment linking $t_{b_i}$ with $s_{a_i}$ on $\lambda$. Hence, the action of an infinitesimal Dehn twist along $\lambda$ on the holonomies $A_i, B_i$ is given by
\begin{align*}
D_{t\lambda}: A_i &\to e^{t(p_\lambda^a+\theta k_\lambda^a)J_a}A_i\,e^{-t(p_\lambda^a+\theta k_\lambda^a)J_a} = H_\lambda^{-t}A_iH_\lambda^{t},\\
B_i &\to e^{t(p_\lambda^a+\theta k_\lambda^a)J_a}B_i\,e^{-t(p_\lambda^a+\theta k_\lambda^a)J_a} = H_\lambda^{-t}B_iH_\lambda^{t},
\end{align*}
where $H_\lambda = [B_i, A_i^{-1}] = e^{-(p_\lambda^a+\theta k_\lambda^a)J_a}$, and we obtain the relation between grafting and Dehn twists in Theorem 6.2: $\mathrm{Gr}_{tm_\lambda\lambda} = D_{\theta t\lambda}$.

8. Concluding Remarks

In this paper we related the geometrical construction of evolving (2+1)-spacetimes via grafting to phase space and Poisson structure in the Chern-Simons formulation of (2+1)-dimensional gravity. We demonstrated how grafting along closed, simple geodesics $\lambda$ is implemented in the Chern-Simons formalism and showed how it gives rise to a transformation on an extended phase space realised as the Poisson manifold $(P_3^\uparrow)^{2g}$. We derived explicit expressions for the action of this transformation on the holonomies of general elements of the fundamental group and proved that it leaves Poisson structure and constraints invariant. Furthermore, we showed that this transformation is generated via the Poisson bracket by a gauge invariant Hamiltonian, the mass $m_\lambda$, and deduced the symmetry relation $\{m_\lambda, s_\eta\} = \{s_\lambda, m_\eta\}$ between the Poisson brackets of mass and spin of general closed curves $\lambda, \eta$. We related the action of grafting on the extended phase space to the action of Dehn twists investigated in [16] and showed that grafting can essentially be viewed as a Dehn twist with a formal parameter $\theta$ satisfying $\theta^2 = 0$.

Together with the results concerning Dehn twists in [16], the results of this paper give rise to a rather concrete understanding of the relation between spacetime geometry and the description of the phase space in terms of holonomies. There are two basic transformations associated to a simple, closed curve $\lambda$ that alter the geometry of (2+1)-spacetimes: grafting and infinitesimal Dehn twists. These transformations are generated via the Poisson bracket by the two basic gauge invariant observables associated to this curve, its mass $m_\lambda$ and the product $m_\lambda s_\lambda$ of its mass and spin, and act on the phase space via Poisson isomorphisms or canonical transformations. This sheds some light on the physical interpretation of these observables.

In analogy to the situation in classical mechanics where momenta generate translations and angular momenta rotations, the two basic observables associated to a simple, closed curve in a (2+1)-dimensional spacetime generate infinitesimal changes in geometry. The grafting operation, generated by its mass, cuts the surface along the curve and translates the two sides of this cut against each other. The infinitesimal Dehn twist, generated by the product of its mass and spin, cuts the surface along the curve and infinitesimally rotates the two sides of the cut with respect to each other.


It would be interesting to investigate the relation between grafting and Poisson structure for other values of the cosmological constant and to see if similar results hold in these cases. In particular, it would be desirable to understand if and how the Wick rotation derived in [13], which relates the grafting procedure for different values of the cosmological constant, manifests itself on the phase space. Although the semidirect product structure of the (2+1)-dimensional Poincaré group gives rise to many simplifications, Fock and Rosly's description of the phase space [30] can also be applied to the Chern-Simons formulation of (2+1)-dimensional gravity with cosmological constant $\Lambda > 0$ and $\Lambda < 0$. For the case of the gauge group $SL(2,\mathbb{C})$ this has been achieved in [33, 34]. Although the resulting description of the Poisson structure is technically more involved than the one for the group $P_3^\uparrow$, it seems in principle possible to investigate transformations generated by the physical observables and to relate them to the corresponding grafting transformations in [13].

Acknowledgements. I thank Laurent Freidel, who showed interest in the transformation generated by the mass observables and suggested that it might be related to grafting. Some of my knowledge on grafting was acquired in discussions with him. Furthermore, I thank Bernd Schroers for useful discussions, answering many of my questions and for proofreading this paper.

References 1. Achucarro, A., Townsend, P.: A Chern–Simons action for three-dimensional anti-de Sitter supergravity theories. Phys. Lett. B 180, 85–100 (1986) 2. Witten, E.: 2+1 dimensional gravity as an exactly soluble system. Nucl. Phys. B 311, 46–78 (1988), Nucl. Phys. B 339, 516–32 (1988) 3. Nelson, J.E., Regge, T.: Homotopy groups and (2+1)-dimensional quantum gravity. Nucl. Phys. B 328, 190–202 (1989) 4. Nelson, J.E., Regge, T.: (2+1) Gravity for genus > 1. Commun .Math. Phys. 141, 211–23 (1991) 5. Nelson, J.E., Regge, T.: (2+1) Gravity for higher genus. Class Quant Grav. 9, 187–96 (1992) 6. Nelson, J.E., Regge, T.: The mapping class group for genus 2. Int. J. Mod. Phys. B6, 1847–1856 (1992) 7. Nelson, J.E., Regge, T.: Invariants of 2+1 quantum gravity. Commun. Math. Phys. 155, 561–568 (1993) 8. Martin, S. P.: Observables in 2+1 dimensional gravity. Nucl. Phys. B 327, 178–204 (1989) 9. Ashtekar, A., Husain, V., Rovelli, C., Samuel, J., Smolin, L.: (2+1) quantum gravity as a toy model for the (3+1) theory. Class. Quant. Grav. 6, L185–L193 (1989) 10. Carlip, S.: Quantum gravity in 2+1 dimensions. Cambridge: Cambridge University Press, 1998 11. Mess, G.: Lorentz spacetimes of constant curvature. preprint IHES/M/90/28, Avril 1990 12. Benedetti, R., Guadgnini, E.: Cosmological time in (2+1)-gravity. Nucl. Phys. B 613, 330–352 (2001) 13. Benedetti, R., Bonsante, F.: Wick rotations in 3D gravity: ML(H2 ) spacetimes. http://arxiv.org/list/ math.DG/0412470, 2004 14. Meusburger, C., Schroers, B.J.: Poisson structure and symmetry in the Chern-Simons formulation of (2+1)-dimensional gravity. Class. Quant. Grav.20, 2193–2234 (2003) 15. Meusburger, C., Schroers, B.J.: The quantisation of Poisson structures arising in Chern-Simons theory with gauge group G g∗ . Adv. Theor. Math. Phys. 7, 1003–1043 (2004) 16. Meusburger, C., Schroers, B.J.: Mapping class group actions in Chern-Simons theory with gauge group G g∗ . Nucl. Phys. B 706, 569-597 (2005) 17. Grigore, D. 
R.: The projective unitary irreducible representations of the Poincaré group in 1+2 dimensions. J. Math. Phys. 37, 460–473 (1996) 18. Mund, J., Schrader, R.: Hilbert spaces for Nonrelativistic and Relativistic "Free" Plektons (Particles with Braid Group Statistics). In: Albeverio, S., Figari, R., Orlandi, E., Teta, A. (eds.) Proceeding of the Conference "Advances in Dynamical Systems and Quantum Physics", Capri, Italy, 19-22 May, 1993. Singapore: World Scientific, 1995 19. Benedetti, R., Petronio, C.: Lectures on Hyperbolic Geometry. Berlin-Heidelberg: Springer Verlag, 1992 20. Katok, S.: Fuchsian Groups. Chicago: The University of Chicago Press, 1992 21. Goldman, W.M.: Projective structures with Fuchsian holonomy. J. Diff. Geom. 25, 297–326 (1987) 22. Hejhal, D.A.: Monodromy groups and linearly polymorphic functions. Acta. Math. 135, 1–55 (1975) 23. Maskit, B.: On a class of Kleinian groups. Ann. Acad. Sci. Fenn. Ser. A 442, 1–8 (1969) 24. Thurston, W.P.: Geometry and Topology of Three-Manifolds. Lecture notes, Princeton University, 1979


25. Thurston, W.P.: Earthquakes in two-dimensional hyperbolic geometry. In: Epstein, D.B. (ed.), Low dimensional topology and Kleinian groups. Cambridge: Cambridge University Press, 1987, pp. 91–112 26. McMullen, C.: Complex Earthquakes and Teichmüller theory. J. Amer. Math. Soc. 11, 283–320 (1998) 27. Sharpe, R. W.: Differential Geometry. New York: Springer Verlag, 1996 28. Matschull, H.-J.: On the relation between (2+1) Einstein gravity and Chern-Simons Theory. Class. Quant. Grav. 16, 2599–609 (1999) 29. Alekseev, A. Y., Malkin, A. Z.: Symplectic structure of the moduli space of flat connections on a Riemann surface. Commun. Math. Phys. 169, 99–119 (1995) 30. Fock, V. V., Rosly, A. A.: Poisson structures on moduli of flat connections on Riemann surfaces and r -matrices. Am. Math. Soc. Transl. 191, 67–86 (1999) 31. Alekseev, A. Y., Grosse, H., Schomerus, V.: Combinatorial quantization of the Hamiltonian Chern-Simons Theory. Commun. Math. Phys. 172, 317–58 (1995) 32. Alekseev, A. Y., Grosse, H., Schomerus, V.: Combinatorial quantization of the Hamiltonian Chern-Simons Theory II. Commun. Math. Phys. 174, 561–604 (1995) 33. Buffenoir, E., Roche, P.: Harmonic analysis on the quantum Lorentz group. Commun. Math. Phys. 207, 499-555 (1999) 34. Buffenoir, E., Noui, K., Roche, P.: Hamiltonian Quantization of Chern-Simons theory with S L(2, C) Group. Class. Quant. Grav. 19, 4953-5016 (2002) Communicated by G.W. Gibbons

Commun. Math. Phys. 266, 777–795 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0038-9

Communications in

Mathematical Physics

Mott Transition in Lattice Boson Models

R. Fernández1, J. Fröhlich2, D. Ueltschi3

1 Laboratoire de Mathématiques Raphaël Salem, UMR 6085 CNRS-Université de Rouen, Avenue de l’Université, BP.12, 76821 Saint Etienne du Rouvray, France. E-mail: [email protected]

2 Institut für Theoretische Physik, Eidgenössische Technische Hochschule, 8093 Zürich, Switzerland. E-mail: [email protected]

3 Department of Mathematics, University of Arizona, Tucson, AZ 85721, USA. E-mail: [email protected]

Received: 29 September 2005 / Accepted: 24 January 2006
Published online: 9 May 2006 – © Springer-Verlag 2006

Abstract: We use mathematically rigorous perturbation theory to study the transition between the Mott insulator and the conjectured Bose-Einstein condensate in a hard-core Bose-Hubbard model. The critical line is established to lowest order in the tunneling amplitude.

Collaboration supported in part by the Swiss National Science Foundation under grant 2-77344-03.

1. Introduction

Initially introduced in 1989 [9], the Bose-Hubbard model has been the object of much recent work. It represents a simple lattice model of itinerant bosons which interact locally. This model turns out to describe fairly well recent experiments with bosonic atoms in optical lattices [12, 15]. Its low-temperature phase diagram has been uncovered in several studies, both analytical (see e.g. [9, 10, 8]) and numerical [2, 19]. When parameters such as the chemical potential or the tunneling amplitude are varied, the Bose-Hubbard model exhibits a phase transition from a Mott insulating phase to a Bose-Einstein condensate. Figure 2, below, depicts its ground state phase diagram.

In this paper, we investigate the phase diagram of this model in a mathematically rigorous way. We focus on the situation with a small tunneling amplitude, $t$, and a small chemical potential, $\mu$. We construct the critical line between Mott and non-Mott behavior to lowest order in the ratio $t/\mu$. More precisely, we prove the existence of domains with and without Mott insulator. These domains are separated by a comparatively thin stretch; the domain without Mott insulator is widely believed to be a Bose condensate. Our results establish in particular the occurrence of a “quantum phase transition” in the ground state.

Over the years several analytical methods have been developed that are useful for the study of models such as the Bose-Hubbard model. They include a general theory of classical lattice systems with quantum perturbations [3, 5, 6, 16]. These methods can be used to establish the existence of Mott phases for small $t$, but they only apply to domains of parameters far from the transition lines. The Bose-Hubbard model on the complete graph can be studied rather explicitly and its phase diagram is similar to the one of the finite-dimensional model [4]. Results using reflection positivity are mentioned below and only apply to the hard-core model. A related model with an extra chessboard potential was studied in [1] (see also [17]).

The Bose-Hubbard model is defined as follows. Let $\Lambda \subset \mathbb{Z}^d$ be a finite cube of volume $|\Lambda|$. We introduce the bosonic Fock space

$$\mathcal{F}_\Lambda = \bigoplus_{N \geq 0} \mathcal{H}_{\Lambda,N}, \tag{1.1}$$

where $\mathcal{H}_{\Lambda,N}$ is the Hilbert space of symmetric complex functions on $\Lambda^N$. Creation and annihilation operators for a boson at site $x \in \Lambda$ are denoted by $c_x^\dagger$ and $c_x$, respectively. The Hamiltonian of the Bose-Hubbard model is given by

$$H = -t \sum_{\substack{x,y \in \Lambda \\ |x-y| = 1}} c_x^\dagger c_y + \tfrac12 U \sum_{x \in \Lambda} c_x^\dagger c_x \bigl( c_x^\dagger c_x - 1 \bigr). \tag{1.2}$$

The first term in the Hamiltonian represents the kinetic energy; the hopping parameter $t$ is chosen to be positive. The second term is an on-site interaction potential (assuming each particle interacts with all other particles at the same site). The interaction is proportional to the number of pairs of particles; the interaction parameter $U$ is positive, and this corresponds to repulsive interactions. In our construction of the equilibrium state, we work in the grand-canonical ensemble. This amounts to adding a term $-\mu N$ to the Hamiltonian, where $N = \sum_x c_x^\dagger c_x$ is the number operator, and $\mu$ is the chemical potential.

The limit $U \to \infty$ describes the hard-core Bose gas where each site can be occupied by at most one particle. This model is equivalent to the $xy$ model with spin $\frac12$ in a magnetic field proportional to $\mu$. Spontaneous magnetization in the spin model corresponds to Bose-Einstein condensation in the boson model. The presence of a Bose condensate has been rigorously established for $\mu = 0$ (the line of hole-particle symmetry). See [7] for a proof valid at low temperatures in three dimensions, and [14] for an analysis of the ground state in two dimensions. The proofs exploit reflection positivity and infrared bounds, a method that was originally introduced for the classical Heisenberg model in [11]. At present, there are no rigorous results about the presence of a condensate for $\mu \neq 0$, or for finite $U$.

The ground state phase diagram of the hard-core Bose gas is depicted in Fig. 1 and reveals three regions: a phase with empty sites, a phase with Bose-Einstein condensation in dimension greater than or equal to two, and a phase with full occupation. Particle-hole symmetry implies that the phase diagram is symmetric around the axis $\mu = 0$. The critical value of the hopping parameter in the ground state of the hard-core (hc) Bose gas is

$$t_c^{\rm hc}(\mu) = \frac{|\mu|}{2d}. \tag{1.3}$$

This follows by observing that the cost of adding one particle in a state of vanishing density (where interactions are negligible) is $-\mu - 2dt$. For $\mu < -2dt$ the empty configuration minimizes the energy, while for $\mu > -2dt$ a state with sufficiently low, but positive density has negative energy.

Fig. 1. Zero temperature phase diagram for the hard-core Bose gas. Bose-Einstein condensation is proved on the line $\mu = 0$, for any $t > 0$. Our perturbation methods provide a quantitative description of the Mott insulator phases with density 0 and 1

The Mott phases of the hard-core Bose gas at zero temperature are stable because of the absence of “quantum fluctuations”: the ground state is just the empty or the full configuration.

The hard-core model is an excellent approximation to the general Bose-Hubbard model when $t$ is small and $\mu$ is sufficiently small. A first insight into the ground state phase diagram of the general Bose-Hubbard model is obtained by restricting the Hamiltonian to low energy configurations. Namely, for $-\infty < \mu \lesssim \frac12 U$, low energy states have 0 or 1 particle per site. The restricted model is the hard-core Bose gas. Next, for $\frac12 U \lesssim \mu \lesssim \frac32 U$, states of lowest energy have 1 or 2 particles per site. The restricted model is again a hard-core Bose gas, but with effective hopping equal to $2t$. We can define projections onto subspaces of low energy states for all $\mu$; corresponding restricted models yield the following approximation for the critical hopping parameter:

$$t_c^{\rm approx}(\mu) = \begin{cases} \dfrac{|\mu|}{2d} & \text{if } -\infty < \mu < \frac12 U, \\[2mm] \dfrac{|\mu - kU|}{2d(k+1)} & \text{if } (k - \frac12) U < \mu < (k + \frac12) U, \; k \geq 1, \end{cases} \tag{1.4}$$

(thin lines in Fig. 2). The true critical line $t_c(\mu)$ agrees with $t_c^{\rm approx}(\mu)$ up to corrections due to quantum fluctuations. We expect that

$$t_c(\mu) = t_c^{\rm approx}(\mu) \Bigl( 1 + O\bigl(\tfrac{t}{U}\bigr) \Bigr). \tag{1.5}$$
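The piecewise formula (1.4) can be evaluated directly. The following is a minimal sketch, not part of the original argument; the function name `tc_approx` and the convention chosen at the lobe boundaries $\mu = (k \pm \frac12) U$ (which (1.4) leaves open) are assumptions made for illustration.

```python
def tc_approx(mu: float, U: float, d: int) -> float:
    """Approximate critical hopping of Eq. (1.4) for chemical potential mu.

    Hypothetical helper: the boundary points mu = (k +/- 1/2) U are not
    covered by (1.4); here they are assigned to the lobe on their right.
    """
    if U <= 0 or d < 1:
        raise ValueError("need U > 0 and d >= 1")
    if mu < 0.5 * U:
        # Hard-core regime with 0 or 1 particle per site.
        return abs(mu) / (2 * d)
    # Lobe index k >= 1 with (k - 1/2) U <= mu < (k + 1/2) U.
    k = int(mu / U + 0.5)
    return abs(mu - k * U) / (2 * d * (k + 1))


if __name__ == "__main__":
    for mu in (-1.0, 1.0, 3.0, 5.0, 9.0):
        print(mu, tc_approx(mu, U=4.0, d=2))
```

The value vanishes at $\mu = kU$, which corresponds to the points where neighbouring lobes of Fig. 2 touch the $\mu$ axis.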

In order to state our first result, we recall that the pressure $p(\beta, \mu)$ is defined by

$$p(\beta, \mu) = \lim_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \operatorname{Tr}_{\mathcal{F}_\Lambda} e^{-\beta(H - \mu N)}. \tag{1.6}$$

Here the limit is taken over a sequence of boxes of increasing size; standard arguments ensure its existence. Its derivative with respect to the chemical potential is the density; i.e.,

$$\rho(\beta, \mu) = \frac{1}{\beta} \frac{\partial}{\partial \mu} p(\beta, \mu). \tag{1.7}$$

The zero density phase is simpler to analyze because of the absence of quantum fluctuations. The following theorem holds uniformly in $U$, and therefore also applies to the hard-core model.

Fig. 2. Zero temperature phase diagram for the Bose-Hubbard model. Lobes are incompressible phases with integer densities. Thin lines represent the approximate critical line defined in (1.4)

Theorem 1.1 (Zero density phase). For $\mu < -2dt$, there exists $\beta_0$ such that if $\beta > \beta_0$, we have that
(a) the pressure is real analytic in $\beta, \mu$;
(b) $\rho(\beta, \mu) < e^{-a\beta}$.
Here, $a > 0$ depends on $t, \mu, d$, but it is uniform in $\beta$ and $U$.

This theorem is proven in Sect. 2.

The transition lines between the Mott phases of density $\rho \geq 1$ and the Bose-Einstein condensate are much harder to study because of the presence of quantum fluctuations. We consider a simplified model with a generalized hard-core condition that prevents more than two bosons from occupying a given site. The Hamiltonian is still given by (1.2), but it acts on the Hilbert space spanned by the configurations $\{0, 1, 2\}^\Lambda$. The phase diagram of this model is depicted in Fig. 3. This model is the simplest one exhibiting a phase with quantum fluctuations. Notice that, in the limit $U \to \infty$, this model coincides with the usual hard-core model.

The zero-density phase and the $\rho = 2$ phase are characterized by Theorem 1.1. The transition line of the $\rho = 1$ phase is more complicated. The following theorem shows that it is equal to $\mu/2d$ to first order in $t/U$, as in the hard-core model.

Theorem 1.2 (Mott phase $\rho = 1$ in generalized hard-core model). Assume that $0 < \mu < \frac{U}{4}$ and $t < \frac{\mu}{2d} - {\rm const}\, \frac{t^2}{U}$ (with ${\rm const} \leq 2^{11} d$). Then there exist $\beta_0$ and $a > 0$ (depending on $d, t, \mu, U$) such that if $\beta > \beta_0$, we have that
(a) the pressure is real analytic in $\beta, \mu$;
(b) $|\rho(\beta, \mu) - 1| < e^{-a\beta}$.

The critical line is expected to be close to $t = \frac{\mu}{2d}$ for small $t, \mu$, so that our condition agrees to first order in $t$. While we do not state and prove it explicitly, a similar claim holds around $4dt = U - \mu$. Indeed, the $\rho = 1$ phase prevails for $t \lesssim \frac{U - \mu}{4d}$ for small $t$.

The “quantum Pirogov-Sinai theory” of references [3, 5] applies here and allows one to establish the existence of a Mott insulator for low $t$. Proving that the domain extends almost to the line $t = \frac{\mu}{2d}$ requires additional arguments, however; Theorem 1.2 is proved in Sect. 3.
The generalized hard-core condition considerably simplifies the proof. Indeed, it allows for a cute and convenient representation of the grand-canonical partition function in terms of a gas of non-overlapping oriented space-time loops, see Fig. 4 in Sect. 3. The result is nevertheless expected to hold for the regular Bose-Hubbard model as well.

Fig. 3. Zero temperature phase diagram for the Bose-Hubbard model with the generalized hard-core condition

While we cannot establish the presence of a Bose-Einstein condensate, we can prove the absence of Mott insulating phases away from the critical lines, by establishing bounds on the density of the system.

Theorem 1.3 (Absence of Mott phases).
(a) For $t > -\frac{\mu}{2d}$ and for any $\Lambda$ large enough, the density of the ground state is bounded below by a strictly positive constant that depends on $t, \mu$ but not on $\Lambda$. This applies to the model with or without hard-core condition.
(b) Consider the model with generalized hard cores. For $t > \frac{\mu}{2d} + C t \bigl( \frac{t}{U} \bigr)^{\frac{2}{d+2}}$ and for any $\Lambda$ large enough, the density of the ground state is less than a constant that is strictly less than 1; it depends on $t, \mu$ but not on $\Lambda$.

This theorem is proved in Sect. 4. It is shown that $C \leq \frac{d+2}{2d} \bigl( 2^{10} d \pi^d \bigr)^{\frac{2}{d+2}}$.

Quantum fluctuations have some influence on the phase diagram, and a detailed discussion is necessary. “Quantum fluctuations” are fluctuations in the ground state around the constant configuration with $k$ bosons at each site, for some $k$ depending on $\mu$ and $t$. They are present in Mott phases for $k \geq 1$, while the ground state for $\mu < -2dt$ is simply the empty configuration. Quantum fluctuations are not present in effective hard-core models where each site is allowed either $k$ or $k + 1$ bosons. Their presence lowers the energy of both Mott and Bose condensate states. The key question is which phase benefits most from them. In other words, writing the critical hopping parameter as

$$t_c(\mu) = t_c^{\rm approx}(\mu) \Bigl( 1 + a\bigl(\tfrac{t}{U}\bigr) \Bigr), \tag{1.8}$$

the question is about the sign of $a(\frac{t}{U})$, for small $\frac{t}{U}$. The study in [10], based on expansion methods (no attempt at a rigorous control of convergence is made), suggests a rather surprising answer: the sign of $a(\frac{t}{U})$ depends on the dimension! Namely, the quantum fluctuations favor Mott phases for $d = 1$, and they favor the Bose condensate for $d \geq 2$. We expect that this question can be rigorously settled by combining the partial diagonalization method of [6] with our expansions in Sect. 3.

2. Low-Density Expansions

In this section, we present a Feynman-Kac expansion of the partition function adapted to the study of quantum states that are perturbations of the zero density phase. In this situation, quantum effects are reduced to a minimum, amounting basically to the combinatorics related to particle indistinguishability. Nevertheless, the resulting cluster expansion must deal with two difficult points: arbitrarily large numbers of bosons and closeness to the transition line. Both difficulties are resolved by estimating the entropy of space-time trajectories in a way inspired by Kennedy’s study of the Heisenberg model [13], the present situation being actually simpler.

The grand-canonical partition function of the Bose-Hubbard model is given by

$$Z(\beta, \Lambda, \mu) = \operatorname{Tr} e^{-\beta(H - \mu N)}, \tag{2.1}$$

where the trace is taken over the bosonic Fock space. A standard Feynman-Kac expansion yields an expression for $Z$ in terms of “space-time trajectories”, i.e. continuous-time nearest-neighbor random walks. More precisely,

$$Z(\beta, \Lambda, \mu) = \sum_{N \geq 0} \frac{e^{\beta\mu N}}{N!} \sum_{x_1,\dots,x_N \in \Lambda} \sum_{\pi \in S_N} \int d\nu^{\beta}_{x_1 x_{\pi(1)}}(\theta_1) \dots \int d\nu^{\beta}_{x_N x_{\pi(N)}}(\theta_N) \exp\Bigl\{ -U \sum_{1 \leq i < j \leq N} \int_0^\beta \delta_{\theta_i(\tau), \theta_j(\tau)} \, d\tau \Bigr\}. \tag{2.2}$$

Here, $\theta$ denotes a space-time trajectory, i.e. $\theta$ is a map $[0, \beta] \to \Lambda$ that is constant except for finitely many “jumps” at times $0 < \tau_1 < \dots < \tau_m < \beta$, and $|\theta(\tau_j-) - \theta(\tau_j+)| = 1$.

The “measure” $\nu^\beta_{xy}$ on trajectories starting at $x$ and ending at $y$ introduced in Eq. (2.2) is a shortcut for the following operation. If $f$ is a function on trajectories, then

$$\int d\nu^\beta_{xy}(\theta) f(\theta) = \sum_{m \geq 0} t^m \sum_{\substack{x_1,\dots,x_{m-1} \\ |x_j - x_{j-1}| = 1}} \int_{0 < \tau_1 < \dots < \tau_m < \beta} d\tau_1 \dots d\tau_m \, f(\theta). \tag{2.3}$$

The second sum is over nearest-neighbor sites such that $|x_1 - x| = |x_{m-1} - y| = 1$. The trajectory $\theta$ on the right side of (2.3) is given by

$$\theta(\tau) = x_j \quad \text{for } \tau \in [\tau_j, \tau_{j+1}),$$

where $(x_0, \tau_0) = (x, 0)$ and $(x_m, \tau_{m+1}) = (y, \beta)$.

The underlying trace operation constrains the ensemble of trajectories to satisfy a periodicity condition in the “$\beta$-direction”. The initial and final particle configurations must be identical, modulo particle indistinguishability. This explains the sum over permutations of $N$ elements, $\pi \in S_N$, on the right side of (2.2).

We shall rewrite the expression (2.2) for the partition function in a form that fits into the framework of cluster expansions. The main result of cluster expansions is summarized in the appendix, and it is enough for our purpose. Trajectories are correlated because of (i) the interactions in the exponential factors of (2.2) which penalize intersections, and (ii) the permutations linking initial and final sites of different trajectories. The cluster expansion is designed to handle the former factors, but we need to deal first with the latter issue so as to fall into the required framework. To this end, we concatenate each original trajectory with the one starting at its final site, so as to obtain a single closed trajectory that wraps several times around the $\beta$ axis. Hence, instead of open trajectories $[0, \beta] \to \Lambda$, we consider ensembles of closed trajectories $\theta_i : [0, \ell_i \beta] \to \Lambda$, with $\ell_i$ being their winding number. Each such closed trajectory corresponds to a cycle of length $\ell_i$ of the permutation $\pi$ determined by the endpoints of the component open trajectories. For each cycle, the sum over $\ell_i$ sites in $\Lambda$ and the integrals over the $\ell_i$ chained open trajectories can be written as a sum over a single site $x_i$, followed by an integral over closed trajectories with $\theta_i(0) = \theta_i(\ell_i \beta) = x_i$. Recalling that there are $\frac{N!}{k! \prod_{i=1}^k \ell_i}$ permutations with $k$ cycles of lengths $\ell_1, \dots, \ell_k$, we obtain the following expansion of the partition function in terms of closed trajectories instead of particles:

$$Z(\beta, \Lambda, \mu) = \sum_{k \geq 0} \frac{1}{k!} \sum_{x_1,\dots,x_k \in \Lambda} \sum_{\ell_1,\dots,\ell_k \geq 1} \int d\nu^{\ell_1 \beta}_{x_1 x_1}(\theta_1) \dots \int d\nu^{\ell_k \beta}_{x_k x_k}(\theta_k) \prod_{i=1}^k w(\theta_i) \prod_{1 \leq i < j \leq k} \bigl( 1 - \zeta_U(\theta_i, \theta_j) \bigr). \tag{2.4}$$
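The combinatorial factor used to pass from (2.2) to (2.4) can be checked by brute force for small $N$. The sketch below is only an illustration under one stated reading of the text: the count $\frac{N!}{k! \prod_i \ell_i}$ is interpreted as a weight attached to each ordered tuple $(\ell_1, \dots, \ell_k)$ with $\ell_1 + \dots + \ell_k = N$, so that summing it over all such tuples gives the number of permutations of $N$ elements with exactly $k$ cycles. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple of images of 0..N-1."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles


def compositions(n, k):
    """Ordered k-tuples of integers >= 1 that sum to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(1, n - k + 2):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def count_by_formula(n, k):
    """Sum of N! / (k! * l_1 * ... * l_k) over ordered cycle lengths."""
    total = Fraction(0)
    for lengths in compositions(n, k):
        product = 1
        for length in lengths:
            product *= length
        total += Fraction(factorial(n), factorial(k) * product)
    return total


def count_by_enumeration(n, k):
    """Number of permutations of n elements with exactly k cycles."""
    return sum(1 for p in permutations(range(n)) if cycle_count(p) == k)


if __name__ == "__main__":
    for n in range(1, 6):
        for k in range(1, n + 1):
            print(n, k, count_by_formula(n, k), count_by_enumeration(n, k))
```

Exact rational arithmetic is used because individual terms such as $3!/(2! \cdot 2)$ are not integers; only the sum over ordered tuples is.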

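Let $\ell(\theta)$ denote the winding number of the trajectory $\theta : [0, \ell(\theta)\beta] \to \Lambda$. Its weight is defined by

$$w(\theta) = \frac{1}{\ell(\theta)} \, e^{\beta\mu\ell(\theta)} \exp\bigl\{ -U W(\theta) \bigr\}. \tag{2.5}$$

Here, $W(\theta)$ measures the self-intersection of $\theta$, that is,

$$W(\theta) = \sum_{0 \leq i < j \leq \ell(\theta) - 1} \int_0^\beta \delta_{\theta(i\beta + \tau), \theta(j\beta + \tau)} \, d\tau. \tag{2.6}$$

It will suffice to use the bound $w(\theta) \leq \frac{1}{\ell(\theta)} e^{\beta\mu\ell(\theta)}$. Finally, interactions between trajectories $\theta$ and $\theta'$ are given by

$$\zeta_U(\theta, \theta') = 1 - \exp\bigl\{ -U W(\theta, \theta') \bigr\}. \tag{2.7}$$

Here, $W(\theta, \theta')$ measures the overlap between trajectories $\theta$ and $\theta'$,

$$W(\theta, \theta') = \sum_{i=0}^{\ell(\theta) - 1} \sum_{j=0}^{\ell(\theta') - 1} \int_0^\beta \delta_{\theta(i\beta + \tau), \theta'(j\beta + \tau)} \, d\tau. \tag{2.8}$$

Expression (2.4) is suited for an application of Theorem A.1. We show that the weights $w(\theta)$ are small in the sense that they satisfy the “Kotecký-Preiss criterion” (A.4).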

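Proposition 2.1. For each closed trajectory $\theta$ let $j(\theta)$ denote the number of jumps of $\theta$. Then, there exist constants $a, b > 0$ such that

$$\sum_{x \in \Lambda} \sum_{\ell' \geq 1} \frac{e^{\beta\mu\ell'}}{\ell'} \int d\nu^{\ell'\beta}_{xx}(\theta') \, e^{a j(\theta') + \beta b \ell'} \, \zeta_U(\theta, \theta') \leq a j(\theta) + \beta b \ell(\theta).$$

Proof. Since $\zeta_U(\theta, \theta')$ is increasing in $U$, it is enough to prove that, for any trajectory $\theta$,

$$\sum_{x} \sum_{\ell' \geq 1} \frac{e^{\beta(\mu + b)\ell'}}{\ell'} \int d\nu^{\ell'\beta}_{xx}(\theta') \, e^{a j(\theta')} \, \zeta_\infty(\theta, \theta') \leq a j(\theta) + \beta b \ell(\theta), \tag{2.9}$$

with

$$\zeta_\infty(\theta, \theta') = \chi\bigl[ \theta(i\beta + \tau) = \theta'(k\beta + \tau) \text{ for some } 0 \leq i \leq \ell(\theta) - 1, \; 0 \leq k \leq \ell(\theta') - 1, \; 0 < \tau < \beta \bigr].$$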

Here, $\chi[\cdot]$ denotes the characteristic function of the event in brackets. A trajectory $\theta'$ intersects $\theta$ if a jump of $\theta$ intersects a vertical line of $\theta'$, or if a jump of $\theta'$ intersects a vertical line of $\theta$ (or both). Let $\nu_0^\beta$ denote the measure on trajectories $[0, \beta] \to \mathbb{Z}^d$, starting at $x = 0$ and with a jump at $\tau = 0$. Integration with respect to $\nu_0^\beta$ can be defined similarly as in (2.3); formally, we can also write

$$\int d\nu_0^\beta(\theta) f(\theta) = \int d\nu_{00}^\beta(\theta) \, \delta(\tau_1) f(\theta), \tag{2.10}$$

where $\nu_{00}^\beta$ is as in (2.3) (with $x = y = 0$), and where the Dirac function $\delta(\tau_1)$ forces the first jump to occur at $\tau_1 = 0$. We get an upper bound by neglecting the restriction that trajectories need to remain in $\Lambda$. The left side of (2.9) is then bounded by

$$j(\theta) \sum_{\ell' \geq 1} e^{\beta(\mu + b)\ell'} \int d\nu_{00}^{\ell'\beta}(\theta') \, e^{a j(\theta')} + \beta \ell(\theta) \sum_{\ell' \geq 1} e^{\beta(\mu + b)\ell'} \int d\nu_0^{\ell'\beta}(\theta') \, e^{a j(\theta')}. \tag{2.11}$$

The first term accounts for trajectories $\theta'$ intersecting jumps of $\theta$; the second term accounts for trajectories $\theta'$ involving a jump that intersects a vertical line of $\theta$. We integrate over all trajectories $[0, \beta] \to \mathbb{Z}^d$ that start at $x = 0$, without requiring them to stay in $\Lambda$. Each trajectory $\theta'$ in the last integral of (2.11) can be decomposed into the jump from a neighbor $z$ into 0, which contributes a factor $t e^a$ (see definition (2.3)), plus a trajectory from $z$ to 0. As 0 has $2d$ neighbors we see that

$$\int d\nu_0^\beta(\theta') \, e^{a j(\theta')} \leq 2dt e^a \int d\nu_{z0}^\beta(\theta') \, e^{a j(\theta')}. \tag{2.12}$$

Furthermore, the definition (2.3) implies that for every $x, y$,

$$\int d\nu_{xy}^\beta(\theta') \, e^{a j(\theta')} = \sum_{m \geq 0} (t e^a)^m \sum_{\substack{x_1,\dots,x_{m-1} \\ |x_j - x_{j-1}| = 1 \\ |x_1 - x| = |y - x_{m-1}| = 1}} \int_{0 < \tau_1 < \dots < \tau_m < \beta} d\tau_1 \dots d\tau_m \leq e^{t e^a 2d \beta}. \tag{2.13}$$
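The bound (2.13) combines two elementary facts: there are at most $(2d)^m$ nearest-neighbor paths with $m$ jumps, and the ordered jump times fill a simplex of volume $\beta^m / m!$. As a hedged numerical illustration (not taken from the paper), the sketch below evaluates the left side exactly in the special case $d = 1$, $x = y = 0$, where the number of closed $m$-step walks is a central binomial coefficient, and compares it with $e^{2dte^a\beta}$. The function names and the truncation order `m_max` are assumptions.

```python
from math import comb, exp, factorial


def closed_walks_1d(m: int) -> int:
    """Number of m-step nearest-neighbour walks on Z from 0 back to 0."""
    return comb(m, m // 2) if m % 2 == 0 else 0


def trajectory_integral_1d(t: float, a: float, beta: float, m_max: int = 80) -> float:
    """Truncated series sum_m (t e^a)^m * (#closed walks) * beta^m / m!  (d = 1)."""
    s = t * exp(a)
    return sum(
        s**m * closed_walks_1d(m) * beta**m / factorial(m)
        for m in range(m_max + 1)
    )


def entropy_bound(t: float, a: float, beta: float, d: int = 1) -> float:
    """Right side of (2.13): exp(2 d t e^a beta)."""
    return exp(2 * d * t * exp(a) * beta)


if __name__ == "__main__":
    t, a, beta = 0.3, 0.1, 2.0
    print(trajectory_integral_1d(t, a, beta), "<=", entropy_bound(t, a, beta))
```

The gap between the two numbers reflects that (2.13) counts all $(2d)^m$ paths rather than only those ending at $y$.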

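From (2.9), (2.11), (2.12) and (2.13) we conclude that

$$\sum_{x} \sum_{\ell' \geq 1} \frac{e^{\beta(\mu + b)\ell'}}{\ell'} \int d\nu^{\ell'\beta}_{xx}(\theta') \, e^{a j(\theta')} \, \zeta_\infty(\theta, \theta') \leq \bigl[ j(\theta) + 2dt e^a \beta \ell(\theta) \bigr] \sum_{\ell' \geq 1} \exp\bigl\{ \beta \ell' [\mu + 2dt e^a + b] \bigr\}. \tag{2.14}$$

As $\mu + 2dt < 0$, we can choose $a$ and $b$ such that $\mu + 2dt e^a + b < 0$. Then (2.9) holds for $\beta$ large enough.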

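Proof of Theorem 1.1. Recall expression (1.6) for the pressure. Proposition 2.1 establishes the convergence of cluster expansions, as stated in Theorem A.1. With $\varphi$ denoting the usual combinatorial function of cluster expansions, see (A.2), the partition function has the absolutely convergent expression

$$Z(\beta, \Lambda, \mu) = \exp\Bigl\{ \sum_{m \geq 1} \sum_{x_1,\dots,x_m \in \Lambda} \sum_{\ell_1,\dots,\ell_m \geq 1} \int d\nu^{\ell_1\beta}_{x_1 x_1}(\theta_1) \dots \int d\nu^{\ell_m\beta}_{x_m x_m}(\theta_m) \, \varphi(\theta_1, \dots, \theta_m) \prod_{j=1}^m w(\theta_j) \Bigr\}. \tag{2.15}$$

Taking the logarithm and dividing by the volume, standard arguments show that boundary terms vanish in the thermodynamic limit, and we obtain

$$p(\beta, \mu) = \sum_{m \geq 1} \sum_{x_2,\dots,x_m \in \mathbb{Z}^d} \sum_{\ell_1,\dots,\ell_m \geq 1} \int d\nu^{\ell_1\beta}_{00}(\theta_1) \int d\nu^{\ell_2\beta}_{x_2 x_2}(\theta_2) \dots \int d\nu^{\ell_m\beta}_{x_m x_m}(\theta_m) \, \varphi(\theta_1, \dots, \theta_m) \prod_{j=1}^m w(\theta_j). \tag{2.16}$$

Integrals can be viewed as functions of $\beta, \mu$, indexed by $m$, $(x_i)$, and $(\ell_i)$. They are real analytic in the domain $\{(\beta, \mu) : \beta > \beta_0(\mu)\}$. Their sum is absolutely convergent and Vitali’s convergence theorem implies that $p(\beta, \mu)$ is analytic.

Recall that the density is given by the derivative of the pressure with respect to the chemical potential; see (1.7) for the precise definition. The analyticity implied by the expansion allows for term-by-term differentiation. We can check that

$$\rho(\beta, \mu) = \frac{1}{\beta} \sum_{m \geq 1} m \sum_{\ell_1 \geq 1} \int d\nu^{\ell_1\beta}_{00}(\theta_1) \, \frac{\partial w(\theta_1)}{\partial \mu} \sum_{x_2,\dots,x_m \in \mathbb{Z}^d} \sum_{\ell_2,\dots,\ell_m \geq 1} \int d\nu^{\ell_2\beta}_{x_2 x_2}(\theta_2) \dots \int d\nu^{\ell_m\beta}_{x_m x_m}(\theta_m) \, \varphi(\theta_1, \dots, \theta_m) \prod_{j=2}^m w(\theta_j). \tag{2.17}$$

Note that $\frac{\partial}{\partial\mu} w(\theta) = \beta \ell(\theta) w(\theta)$, as follows from definition (2.5) of the weight of trajectories. By (A.5), we have the bound

$$\rho(\beta, \mu) \leq \beta \sum_{\ell_1 \geq 1} \ell_1 \int d\nu^{\ell_1\beta}_{00}(\theta) \, w(\theta) \, e^{a(\theta)}. \tag{2.18}$$

There exists $\varepsilon > 0$ such that $\mu + 2dt e^a + b + \varepsilon < 0$. Using (2.13), we get

$$\rho(\beta, \mu) \leq \beta e^{-\varepsilon\beta} \sum_{\ell_1 \geq 1} \ell_1 \, e^{\beta \ell_1 [\mu + 2dt e^a + b + \varepsilon]}. \tag{2.19}$$

Then $\rho(\beta, \mu) \leq e^{-\varepsilon\beta}$ for $\beta$ large enough, and this completes the proof of Theorem 1.1.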

3. Space-Time Loop Representation

The study of the transition line for the Mott phase with unit density requires the analysis of perturbations of the “vacuum” formed by one particle at each site. This involves the control of full-fledged quantum fluctuations. We turn, then, to a more general expansion setting previously employed to study spin and fermionic systems [3, 5]. This setting shares some similarities with that of Sect. 2, but it also differs from it in significant ways. We use the same symbols $\nu, w, \zeta, \ell, \theta$, but we caution the reader that they are defined in slightly different ways.

Besides the quantum-fluctuation issue, bosonic systems present the additional complication of the unboundedness of occupation numbers. In the present paper we wish to leave this second issue aside. We consider, thus, the model with a generalized hard-core condition that ensures that configurations have at most two bosons at each site.

Recall definition (2.1) of the grand-canonical partition function. It is convenient to write

$$H - \mu N = V + T, \tag{3.1}$$

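where $V$ denotes the diagonal terms (i.e., interactions and chemical potential terms) in the basis of occupation numbers in position space, and $T$ denotes the hopping terms. We will consider $T$ to be a perturbation of $V$. Our expansion is based on Duhamel’s formula,

$$e^{-\beta(V + T)} = e^{-\beta V} + \int_0^\beta d\tau \, e^{-\tau V} (-T) \, e^{-(\beta - \tau)(V + T)}, \tag{3.2}$$

which we can iterate to obtain

$$e^{-\beta(V + T)} = \sum_{m \geq 0} \int_{0 < \tau_1 < \dots < \tau_m < \beta} d\tau_1 \dots d\tau_m \, e^{-\tau_1 V} (-T) \, e^{-(\tau_2 - \tau_1) V} \dots (-T) \, e^{-(\beta - \tau_m) V}. \tag{3.3}$$

Then

$$Z(\beta, \Lambda, \mu) = \operatorname{Tr} e^{-\beta(V + T)} = \sum_{m \geq 0} t^m \int_{0 < \tau_1 < \dots < \tau_m < \beta} d\tau_1 \dots d\tau_m \sum_{(x_1, y_1), \dots, (x_m, y_m)} \operatorname{Tr} e^{-\tau_1 V} c^\dagger_{x_1} c_{y_1} \, e^{-(\tau_2 - \tau_1) V} \dots c^\dagger_{x_m} c_{y_m} \, e^{-(\beta - \tau_m) V}. \tag{3.4}$$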
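Duhamel's formula (3.2) is an exact operator identity, so it can be checked numerically on small matrices. The sketch below is an illustration only: the $2 \times 2$ matrices stand for a single boson on two sites (an assumed toy example, not the model of Theorem 1.2), the matrix exponential is a plain Taylor series, and the $\tau$ integral is approximated with Simpson's rule. All helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(A, B, c=1.0):
    """Entrywise A + c * B."""
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(A, c):
    """Entrywise c * A."""
    return [[c * a for a in row] for row in A]


def expm(A, terms=40):
    """Matrix exponential by Taylor series (adequate for small norms)."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(matmul(term, A), 1.0 / k)  # term = A^k / k!
        result = add(result, term)
    return result


def duhamel_rhs(V, T, beta, steps=200):
    """Right side of (3.2); the tau integral uses Simpson's rule (steps even)."""
    H = add(V, T)

    def integrand(tau):
        left = expm(scale(V, -tau))
        right = expm(scale(H, -(beta - tau)))
        return matmul(matmul(left, scale(T, -1.0)), right)

    h = beta / steps
    n = len(V)
    total = [[0.0] * n for _ in range(n)]
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total = add(total, integrand(i * h), weight * h / 3.0)
    return add(expm(scale(V, -beta)), total)


def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


if __name__ == "__main__":
    V = [[0.5, 0.0], [0.0, 1.2]]    # diagonal part (assumed values)
    T = [[0.0, -0.3], [-0.3, 0.0]]  # hopping part with t = 0.3
    beta = 1.5
    lhs = expm(scale(add(V, T), -beta))
    print(max_abs_diff(lhs, duhamel_rhs(V, T, beta)))
```

Iterating the same identity inside the integrand generates the series (3.3) term by term.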

We denote by $n = (n_x)_{x \in \Lambda}$, $n_x \in \mathbb{N}$, a “classical configuration” that represents the state where $n_x$ bosons are located at site $x$, and by $|n\rangle$ the corresponding normalized vector. Inserting projector decompositions $1 = \sum_{n_i} |n_i\rangle\langle n_i|$, the trace can be written as

$$\operatorname{Tr} e^{-\tau_1 V} c^\dagger_{x_1} c_{y_1} \, e^{-(\tau_2 - \tau_1) V} \dots c^\dagger_{x_m} c_{y_m} \, e^{-(\beta - \tau_m) V} = \sum_{n_0, n_1, \dots, n_m} \langle n_0 | e^{-\tau_1 V} c^\dagger_{x_1} c_{y_1} | n_1 \rangle \langle n_1 | e^{-(\tau_2 - \tau_1) V} \dots c^\dagger_{x_m} c_{y_m} | n_m \rangle \times \langle n_m | e^{-(\beta - \tau_m) V} | n_0 \rangle. \tag{3.5}$$

As the operator $V$ is diagonal in the basis $|n\rangle$, this decomposition allows us to rewrite the expansion (3.4) in the form

$$Z(\beta, \Lambda, \mu) = \int d\nu(\boldsymbol{n}) \, w(\boldsymbol{n}), \tag{3.6}$$

where
(i) $\boldsymbol{n}$ is a space-time quantum configuration, namely an assignment of a configuration $n(\tau)$, for each $0 < \tau < \beta$, such that
– $\boldsymbol{n}$ is constant in $\tau$, except at finitely many times $\tau_1 < \dots < \tau_m$, with $m$ even.
– At each $\tau_i$, a “jump” occurs, i.e. there are nearest-neighbor sites $(x_i, y_i)$ such that

$$n_x(\tau_i+) = \begin{cases} n_x(\tau_i-) + 1 & \text{if } x = x_i, \\ n_x(\tau_i-) - 1 & \text{if } x = y_i, \\ n_x(\tau_i-) & \text{otherwise.} \end{cases} \tag{3.7}$$

– $\boldsymbol{n}$ is periodic in the $\tau$ direction: $n(\beta) = n(0)$.
(ii) $w(\boldsymbol{n})$ are positive weights defined by

$$w(\boldsymbol{n}) = \exp\Bigl\{ -\int_0^\beta V(n(\tau)) \, d\tau \Bigr\} \prod_{i=1}^m t \sqrt{n_{x_i}(\tau_i+) \, n_{y_i}(\tau_i-)}, \tag{3.8}$$

with the short-hand notation $V(n) \equiv \langle n | V | n \rangle$.
(iii) Integration with respect to the “measure” $\nu$ on quantum configurations stands for a sum over configurations at time 0, a sum over $m$, integrals over jumping times, and sums over locations of jumps.

The expansion just obtained is rather general. It is convenient to interpret it in terms of random geometrical objects in a model-dependent fashion. For the case of interest here, we follow the “excitations”, namely the sites where the occupation number is different from the vacuum value 1. We therefore embed the “space time” $\Lambda \times [0, \beta]$ in the cylinder $\mathbb{R}^d \times S^1$ (with periodic boundary conditions in the time direction) and decompose the trajectories of the excitations in connected components. In this way, a quantum configuration $\boldsymbol{n}$ can be represented as a set of non-intersecting loops (with winding numbers $n = 0, \pm 1, \pm 2, \dots$) in this cylinder. The representation is defined by the following rules:
• The constant configuration $\boldsymbol{n}$ with $n_x(\tau) = 1$, for all $x \in \Lambda$ and $0 \leq \tau \leq \beta$, has no loops.
• A jump of a boson from $y_i$ to $x_i$ at time $\tau_i$ (see (3.7)) is represented by a horizontal arrow from $(y_i, \tau_i)$ to $(x_i, \tau_i)$.
• The points $(x, \tau)$ with $n_x(\tau) \neq 1$ are represented by vertical segments. These segments point upwards if $n_x(\tau) = 2$, and downwards if $n_x(\tau) = 0$.

Loops are illustrated in Fig. 4. Similar representations have been used in various contexts, e.g. in a study of the Falicov-Kimball model [18].

Given a loop $\gamma$, we introduce the number of jumps $j(\gamma)$ (always an even number, possibly zero); the length $\ell_0(\gamma)$ of all vertical segments pointing downwards; the length $\ell_2(\gamma)$ of all vertical segments pointing upwards; $\ell(\gamma) = \ell_0(\gamma) + \ell_2(\gamma)$; and the winding number $z(\gamma)$. Notice that $\ell_2(\gamma) - \ell_0(\gamma) = \beta z(\gamma)$. A loop $\gamma$ defines a unique quantum configuration $\boldsymbol{n}^\gamma$. We define the weight of a loop as

$$w(\gamma) = t^{j(\gamma)} \prod_{i=1}^{j(\gamma)} \sqrt{n^\gamma_{y_i}(\tau_i-) \, n^\gamma_{x_i}(\tau_i+)} \; e^{-\ell_0(\gamma)\mu} \, e^{-\ell_2(\gamma)(U - \mu)}. \tag{3.9}$$

Note that we have subtracted the classical energy of the background configuration with one boson at each site. The weight $w(\gamma)$ thus only depends on excitation energies. These definitions allow us to rewrite the partition function (3.6) in terms of loops and their weights instead of space-time configurations. Unlike the trajectories of Sect. 2, the loops here have only a hard-core interaction due to the requirement of non-intersection. Furthermore, if $\Gamma = \{\gamma_1, \dots, \gamma_m\}$ is a set of disjoint loops, we have the important property that the weight of the corresponding quantum configuration $\boldsymbol{n}_\Gamma$ factorizes,

$$w(\boldsymbol{n}_\Gamma) = e^{\beta\mu|\Lambda|} \prod_{i=1}^m w(\gamma_i). \tag{3.10}$$
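The exponents in (3.9) and the prefactor in (3.10) follow from the diagonal part of (1.2) with the term $-\mu N$ added. As a small worked check, assuming the on-site energy $V(n) = \frac12 U n(n-1) - \mu n$ read off from (1.2), the sketch below computes the energy of an empty or doubly occupied site relative to the background with one boson per site. The function names are hypothetical.

```python
def on_site_energy(n: int, U: float, mu: float) -> float:
    """Diagonal energy of n bosons on one site: (U/2) n (n - 1) - mu n."""
    return 0.5 * U * n * (n - 1) - mu * n


def excitation_energy(n: int, U: float, mu: float) -> float:
    """Energy per unit time of occupation n relative to the background n = 1."""
    return on_site_energy(n, U, mu) - on_site_energy(1, U, mu)


if __name__ == "__main__":
    U, mu = 4.0, 0.5
    print("background per site:", on_site_energy(1, U, mu))   # equals -mu
    print("hole (n = 0):", excitation_energy(0, U, mu))       # equals mu
    print("double (n = 2):", excitation_energy(2, U, mu))     # equals U - mu
```

A downward segment (empty site) therefore costs $\mu$ per unit length and an upward segment (double occupation) costs $U - \mu$, while the background energy $-\mu$ per site over time $\beta$ accounts for the factor $e^{\beta\mu|\Lambda|}$.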

We define a measure on loops, also denoted $\nu$, and we rewrite the partition function as

$$Z(\beta, \Lambda, \mu) = e^{\beta\mu|\Lambda|} \sum_{m \geq 0} \frac{1}{m!} \int d\nu(\gamma_1) \dots d\nu(\gamma_m) \prod_{i=1}^m w(\gamma_i) \prod_{1 \leq i < j \leq m} \bigl( 1 - \zeta(\gamma_i, \gamma_j) \bigr). \tag{3.11}$$

Here, the term corresponding to $m = 0$ is set to $e^{\beta\mu|\Lambda|}$, and the function $\zeta(\gamma, \gamma')$ equals 1 if the loops $\gamma$ and $\gamma'$ intersect (more precisely, if some of their vertical segments intersect), and equals 0 if the loops have disjoint support.

Fig. 4. Illustration for the gas of space-time loops. There are three loops with respective winding numbers 1, 0, and −1

The expression (3.11) for the partition function is an adequate starting point for the method of cluster expansions. We prove that the weights are small so as to satisfy the “Kotecký-Preiss criterion”, Eq. (A.4). We can then appeal to Theorem A.1 to conclude that the cluster expansion converges.

Proposition 3.1. Under the hypotheses of Theorem 1.2, we have that, for any loop $\gamma$,

$$\int d\nu(\gamma') \, w(\gamma') \, \zeta(\gamma, \gamma') \, e^{a(\gamma')} \leq a(\gamma)$$

with

$$a(\gamma) = 2^{14} d \, \frac{t^2}{U^2} \, j(\gamma) + 2^{12} d \, \frac{t^2}{U} \, \ell(\gamma). \tag{3.12}$$

Its proof relies on the bounds stated in the following lemma. Let us partition the set of loops into $L = L^{(0+)} \cup L^{(-)}$, where $L^{(0+)}$ (resp. $L^{(-)}$) is the set of loops with nonnegative (resp. negative) winding numbers. For each site $z$ we introduce the measures $\nu_z$ on loops that make a jump at time $\tau = 0$ involving $z$. Further, we let $L_z$, $L_z^{(0+)}$, and $L_z^{(-)}$ denote the sets of loops that contain $(z, 0)$.

Lemma 3.2. Under the hypotheses of Theorem 1.2, for any site $z$,
(a) $\int_{L^{(0+)}} d\nu_z(\gamma) \, w(\gamma) \, e^{a(\gamma)} \leq 2^{11} d \, \frac{t^2}{U}$,
(b) $\int_{L_z^{(0+)}} d\nu(\gamma) \, w(\gamma) \, e^{a(\gamma)} \leq e^{-\beta(U - \mu - 2^{12} d \frac{t^2}{U})} + 2^{13} d \, \frac{t^2}{U^2}$,
(c) $\int_{L^{(-)}} d\nu_z(\gamma) \, w(\gamma) \, e^{a(\gamma)} \leq 4dt \, e^{-\beta(\mu - 2dt - 2^{12} d^2 \frac{t^2}{U})}$,
(d) $\int_{L_z^{(-)}} d\nu(\gamma) \, w(\gamma) \, e^{a(\gamma)} \leq e^{-\beta(\mu - 2dt - 2^{12} d^2 \frac{t^2}{U})}$.

Proof of Proposition 3.1. Suppose that the loops $\gamma$ and $\gamma'$ intersect, i.e. $\zeta(\gamma, \gamma') = 1$. Then either a jump of $\gamma'$ intersects a vertical line of $\gamma$, or a jump of $\gamma$ intersects a vertical line of $\gamma'$ (both may happen at the same time). The first situation is analyzed using the measures $\nu_z$, and the second situation involves the sets $L_z^{(0+)}$ and $L_z^{(-)}$. More precisely, we have that

$$\int d\nu(\gamma') \, w(\gamma') \, \zeta(\gamma, \gamma') \, e^{a(\gamma')} \leq \ell(\gamma) \sup_z \int_L d\nu_z(\gamma') \, w(\gamma') \, e^{a(\gamma')} + j(\gamma) \sup_z \int_{L_z} d\nu(\gamma') \, w(\gamma') \, e^{a(\gamma')}. \tag{3.13}$$

Using the estimates in Lemma 3.2, the right side is seen to be smaller than $a(\gamma)$, provided $\beta$ is large enough.

Proof of Lemma 3.2, (a) and (b). Loops of L(0+) have large energy cost, so crude entropy estimates are enough. Since 2 (γ ) 0 (γ ) for any loop γ ∈ L(0+) , we have that µ0 (γ ) + (U − µ)2 (γ ) 21 U (γ ). Then 2

t (γ ) µ0 (γ ) + (U − µ)2 (γ ) − 212 d U

1 4 U (γ ).

(3.14)

Further, we can check that e2

14 dt 2 /U 2

< 2.

(3.15)

790

R. Fernández, J. Fröhlich, D. Ueltschi

From these observations and (3.9), we obtain that   e−β(U −µ−212 d tU2 ) if j = 0 a(γ ) w(γ )e 1  (4t) j (γ ) e− 4 U (γ ) if j 2.

(3.16)

A loop with $j(\gamma) = 2n$ is characterized by a sequence of jump times $0 \le \tau_1 < \tau_2 < \dots < \tau_{2n}$. At each such time the trajectory can choose among at most $2d$ neighbors to jump to and 2 directions of time to proceed after the jump. The last jump is determined by the fact that $\gamma$ must be a loop, so there is no factor $2d$ (but both time directions are possible). The measure $\nu_z$ involves only loops with two jumps or more. From the last bound in (3.16) we obtain

\[
\int_{\mathcal{L}^{(0+)}} d\nu_z(\gamma)\, w(\gamma)\, e^{a(\gamma)}
\le \sum_{n \ge 1} 2 \cdot 2^{2n} (2d)^{2n-1} (4t)^{2n} \Bigl[ \int_0^\infty d\tau\, e^{-\frac{1}{4} U \tau} \Bigr]^{2n-1}
= \frac{2^{10} d\, t^2/U}{1 - \bigl( \frac{2^6 d t}{U} \bigr)^2}
\le 2^{11} d\, \frac{t^2}{U}. \tag{3.17}
\]
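As a sanity check on the geometric series in (3.17), the following minimal sketch sums the series term by term and compares the result with the closed form. The function names and the sample parameters in the usage note are illustrative assumptions, not part of the paper; the only input taken from the text is the shape of the series and the value $4/U$ of the time integral.

```python
def series_3_17(d, t, U, terms=60):
    """Partial sum of the series in (3.17).

    The n-th term is 2 * 2^(2n) * (2d)^(2n-1) * (4t)^(2n) * I^(2n-1),
    where I = 4/U is the integral of exp(-U*tau/4) over tau in [0, infinity).
    The factors are grouped to avoid floating-point overflow.
    """
    integral = 4.0 / U
    ratio = 2.0 * (2.0 * d) * (4.0 * t) * integral  # equals 2^6 * d * t / U
    total = 0.0
    for n in range(1, terms + 1):
        total += 2.0 * ratio ** (2 * n) / ((2.0 * d) * integral)
    return total


def closed_form_3_17(d, t, U):
    """Closed form of the series: 2^10 d t^2/U divided by 1 - (2^6 d t/U)^2."""
    return (2 ** 10 * d * t ** 2 / U) / (1.0 - (2 ** 6 * d * t / U) ** 2)
```

For instance, with the assumed values `d, t, U = 3, 1.0, 1000.0` the two functions agree to rounding error, and the result stays below $2^{11} d\, t^2/U$ because $(2^6 d t/U)^2 \le 1/2$ in that regime.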

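Part (b) of the lemma follows from (3.16) and from considerations similar to (3.17). Namely,

\[
\int_{\mathcal{L}_z^{(0+)}} d\nu(\gamma)\, w(\gamma)\, e^{a(\gamma)}
\le e^{-\beta(U - \mu - 2^{12} d\, t^2/U)}
+ \sum_{n \ge 1} 2 \cdot 2^{2n} (2d)^{2n-1} (4t)^{2n} \Bigl[ \int_0^\infty d\tau\, e^{-\frac{1}{4} U \tau} \Bigr]^{2n}. \tag{3.18}
\]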

The first term in the right side represents loops without jumps. The right side is less than the upper bound in Lemma 3.2 (b).

Proof of Lemma 3.2, (c) and (d). Loops of $\mathcal{L}^{(-)}$ have small energy cost when parameters are close to the transition line. Estimates are needed that are more subtle than for loops of $\mathcal{L}^{(0+)}$. The situation is similar to that of Sect. 2, but a problem needs to be solved: loops, unlike trajectories, can backtrack in time. Our strategy is to first “renormalize” a loop $\gamma \in \mathcal{L}^{(-)}$ by identifying a trajectory $\theta = \theta(\gamma)$ that moves only downwards, but with arbitrarily long jumps. Contributions of backtracking can be controlled by estimates similar to those above. The entropy of these trajectories can be expressed using an appropriate hopping matrix, and we obtain sharp enough bounds.

We start with (d). Given a loop $\gamma \in \mathcal{L}_0^{(-)}$, we start at $(x, \tau) = (z, \beta)$ and move downwards along $\gamma$. When reaching the end of a vertical segment (because of the presence of a nearest-neighbor jump), we ignore possible backtracking and directly jump to the next downwards vertical segment in the loop, at constant time. See the dotted lines in Fig. 4. We obtain a trajectory, since the motion is downwards only, punctuated by long-range hoppings with which we must cope. Behind a hop from $x$ to $y$ there is a backtracking excursion between these sites. Its contribution to the total weight of the original loop (times $e^{a(\gamma)}$) is given by the “hopping matrix” component

\[
\sigma_{xy} = \int_{x \to y} d\nu_x(\gamma)\, w(\gamma)\, e^{a(\gamma)}, \tag{3.19}
\]

Mott Transition in Lattice Boson Models


where the integral is over loops that are open, have nonnegative winding number, start with a jump at $(x, 0)$, and end at $(y, 0)$. Each trajectory so constructed is characterized by a sequence of hopping times $0 = \tau_1 < \dots < \tau_{2m} \le \beta$ and a sequence of not-necessarily neighboring sites $x = x_0, x_1, \dots, x_m = x$ which are the successive hopping endpoints. Its weights are determined by factors exponentially decreasing with $\ell_0$ for each vertical segment and hopping matrix entries for each jump. In this way we obtain

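\[
\int_{\mathcal{L}_z^{(-)}} d\nu(\gamma)\, w(\gamma)\, e^{a(\gamma)}
\le e^{-\beta(\mu - 2^{12} d\, t^2/U)} \sum_{m \ge 0} \frac{\beta^m}{m!}
\sum_{\substack{x_1, \dots, x_{m-1} \\ x_0 = x_m = z}} \prod_{i=1}^{m} \sigma_{x_i x_{i-1}}
\le e^{-\beta(\mu - 2^{12} d\, t^2/U)}\, e^{\beta \sum_{x \ne 0} \sigma_{0x}}. \tag{3.20}
\]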

The overall exponential factor comes from the fact that $\ell_0 \ge \beta$ because the winding number of the loops is not zero. The factor $\beta^m/m!$ follows by integrating all choices of hopping times. To conclude, we must bound the sum of the matrix elements of $\sigma$. The contribution of open loops that consist in just one jump is $2dt\, e^{2^{14} d\, t^2/U^2}$. Other open loops involve two jumps or more. Each jump has $2d$ possible directions. There are two possible time directions after each jump, except for the first and last ones. We need to integrate over time occurrence for each jump except the first one. We obtain

\[
\sum_{x \ne 0} \sigma_{0x}
\le 2dt\, e^{2^{14} d\, t^2/U^2} + \sum_{m \ge 2} (2d)^m\, 2^{m-2} (4t)^m \Bigl[ \int_0^\infty e^{-\frac{1}{4} U \tau}\, d\tau \Bigr]^{m-1}
\le 2dt\, e^{2^{14} d\, t^2/U^2} + 2^9 d^2\, \frac{t^2}{U}. \tag{3.21}
\]

We used (3.15). Inserting into (3.20) we obtain Lemma 3.2 (d). The bound of part (c) is similar, with an extra factor $2dt\, e^{2^{14} d\, t^2/U^2} \le 4dt$ for the additional first jump.

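Proof of Theorem 1.2. This proof is similar to the one of Theorem 1.1. We use cluster expansions, in order to get a convergent expansion for the pressure, and prove analyticity by using Vitali’s theorem. The density has an expansion reminiscent of (2.17), namely

\[
\rho(\beta, \mu) = 1 + \int_{\mathcal{L}_0} d\nu(\gamma_1)\, \frac{\partial w(\gamma_1)}{\partial \mu}
\sum_{m \ge 1} m \int d\nu(\gamma_2) \dots d\nu(\gamma_m)\, \varphi(\gamma_1, \dots, \gamma_m) \prod_{i=2}^{m} w(\gamma_i). \tag{3.22}
\]

The combinatorial function $\varphi$ is given by (A.2). From (3.9),

\[
\frac{\partial w(\gamma)}{\partial \mu} = [\ell_2(\gamma) - \ell_0(\gamma)]\, w(\gamma). \tag{3.23}
\]

Again using Eq. (A.5), we find the bound

\[
|\rho(\beta, \mu) - 1| \le \int_{\mathcal{L}_0} d\nu(\gamma)\, \bigl| \ell_2(\gamma) - \ell_0(\gamma) \bigr|\, w(\gamma)\, e^{a(\gamma)}. \tag{3.24}
\]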


Only loops with nonzero winding number contribute. Going over the proof of Lemma 3.2 (b) and (d) with $a(\gamma) \to a(\gamma) + \log \ell(\gamma)$, we can check that the right side of the equation above is less than $e^{-\varepsilon \beta}$ whenever $\mu - 2dt - 2^{12} d^2\, \frac{t^2}{U} - \varepsilon > 0$ and $\beta$ is large enough.

4. Density Bounds

Proof of Theorem 1.3, (a). The Bose-Hubbard Hamiltonian preserves the total number of particles, so that the density can be fixed. We denote by $e_0(\rho)$ the ground state energy per site in the subspace of density $\rho$. Neglecting repulsive interactions can only decrease the ground state energy; the minimum kinetic energy of a single boson is $-2dt$. It follows that $e_0(\rho) \ge (-\mu - 2dt)\rho$ for all $U \ge 0$.

We find an upper bound for $e_0(\rho)$ by using a variational argument. It is well-known that the symmetric ground state is also the absolute ground state, so that we can consider a non-symmetric trial function. We decompose $\Lambda$ into boxes of size $\ell = \lfloor \rho^{-1/d} \rfloor$. We consider the trial function $\otimes_{j=1}^{N} \varphi_j$, where $\varphi_j$ is supported in the $j$-th box only and minimizes the kinetic energy. As is well-known, $\varphi_j$ is the ground state of the Dirichlet problem in the box, and the corresponding eigenvalue is $-2dt \cos \frac{\pi}{\ell + 1}$. Since $\ell + 1 \ge \rho^{-1/d}$ and $\cos x \ge 1 - \frac{x^2}{2}$, this eigenvalue is less than $-2dt + dt\, (\pi \rho^{1/d})^2$. This implies that

\[
e_0(\rho) \le b(\rho) \equiv (-\mu - 2dt)\rho + \pi^2 d t\, \rho^{1 + \frac{2}{d}}. \tag{4.1}
\]

The minimum of $b(\rho)$ is reached for $\rho^{2/d} = \frac{\mu + 2dt}{\pi^2 t (d+2)}$. The minimum value is

\[
c = -\frac{2}{(\pi^2 t)^{d/2}} \Bigl( \frac{\mu + 2dt}{d + 2} \Bigr)^{1 + \frac{d}{2}}. \tag{4.2}
\]

By inspecting Fig. 5 we find that the ground state density is necessarily larger than

\[
a = \frac{2}{d + 2} \Bigl( \frac{\mu + 2dt}{\pi^2 t (d + 2)} \Bigr)^{d/2}. \tag{4.3}
\]
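The minimization leading to (4.2) and (4.3) can be checked numerically. The sketch below is a hedged illustration: the function names and the sample values of $\mu$, $t$, $d$ in the usage note are assumptions chosen so that $\mu + 2dt > 0$; the formulas themselves are transcribed from (4.1) to (4.3).

```python
import math


def b(rho, mu, t, d):
    """Variational upper bound b(rho) from (4.1)."""
    return (-mu - 2 * d * t) * rho + math.pi ** 2 * d * t * rho ** (1 + 2 / d)


def rho_star(mu, t, d):
    """Density at which b is minimal: rho^(2/d) = (mu + 2dt) / (pi^2 t (d+2))."""
    return ((mu + 2 * d * t) / (math.pi ** 2 * t * (d + 2))) ** (d / 2)


def c_min(mu, t, d):
    """Minimum value c of b, as in (4.2)."""
    return -2 / (math.pi ** 2 * t) ** (d / 2) * ((mu + 2 * d * t) / (d + 2)) ** (1 + d / 2)


def a_density(mu, t, d):
    """Density a from (4.3), where the line (-mu - 2dt) * rho reaches the level c."""
    return 2 / (d + 2) * ((mu + 2 * d * t) / (math.pi ** 2 * t * (d + 2))) ** (d / 2)
```

With the assumed values `mu, t, d = -5.5, 1.0, 3`, evaluating `b(rho_star(mu, t, d), mu, t, d)` reproduces `c_min(mu, t, d)`, and `(-mu - 2*d*t) * a_density(mu, t, d)` gives the same number, which is the geometric statement read off from Fig. 5.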

Proof of Theorem 1.3, (b). The strategy is the same as for part (a), although quantum fluctuations bring extra complications. The variational argument leading to the upper bound for $e_0(\rho)$ can be modified by replacing particles with holes, so as to yield

\[
e_0(\rho) \le \tilde{b}(\rho) \equiv -\mu + (\mu - 2dt)(1 - \rho) + \pi^2 d t\, (1 - \rho)^{1 + \frac{2}{d}}. \tag{4.4}
\]

The lower bound is trickier. We fix the density $\rho$ and work in the Hilbert space $\mathcal{H}_{\Lambda, N}$ with $N = \rho |\Lambda|$. We have that

\[
e_0(\rho) = -\lim_{\Lambda \nearrow \mathbb{Z}^d} \lim_{\beta \to \infty} \frac{1}{\beta |\Lambda|} \log \mathrm{Tr}_{\mathcal{H}_{\Lambda, N}}\, e^{-\beta(H - \mu N)}. \tag{4.5}
\]

We can use the loop representation of Sect. 3 for the trace to obtain an expression similar to (3.11); the difference is that we require the sum of winding numbers of all loops to be equal to the negative of the number of holes $M = |\Lambda| - N$. The weights of loops with strictly positive winding numbers decay exponentially as $e^{-\beta(U - \mu)}$, so they do not contribute in the limit $\beta \to \infty$. We obtain an upper bound for $Z(\beta, \Lambda, \mu)$ (and therefore a lower bound for $e_0(\rho)$) by neglecting the non-intersecting conditions between loops. Further, we replace the loops $\gamma$ with negative winding numbers by trajectories $\theta$ as in the proof of Lemma 3.2 (c), (d). We then obtain the lower bound

\[
e_0(\rho) \ge -\mu \rho
- \lim_{\Lambda \nearrow \mathbb{Z}^d} \lim_{\beta \to \infty} \frac{1}{\beta |\Lambda|} \log \mathrm{Tr}_{\mathcal{H}_{\Lambda, M}}\, e^{-\beta \tilde{T}}
- \lim_{\Lambda \nearrow \mathbb{Z}^d} \lim_{\beta \to \infty} \frac{1}{\beta |\Lambda|} \int_{\mathcal{L}^{(0)}} d\nu(\gamma)\, w(\gamma). \tag{4.6}
\]

[Fig. 5. Upper and lower bounds for the ground state energy per site, $e_0(\rho)$: the curve $b(\rho)$, with minimum value $c$, and the line $(-\mu - 2dt)\rho$, plotted against $\rho$. The density that minimizes $e(\rho)$ necessarily satisfies $\rho \ge a$.]

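Here, $\tilde{T}$ denotes the multibody kinetic operator

\[
\tilde{T} = \sum_{x, y} \sigma(x - y)\, c_x^{\dagger} c_y, \tag{4.7}
\]

and $\sigma(x)$ is given in (3.19). Then, by (3.21),

\[
\lim_{\beta \to \infty} \frac{1}{\beta} \log \mathrm{Tr}_{\mathcal{H}_{\Lambda, M}}\, e^{-\beta \tilde{T}} \le \Bigl( 2dt + 2^{10} d^2\, \frac{t^2}{U} \Bigr) M. \tag{4.8}
\]

The contribution of nonwinding loops is bounded using Lemma 3.2 (a),

\[
\frac{1}{\beta |\Lambda|} \int_{\mathcal{L}^{(0)}} d\nu(\gamma)\, w(\gamma) \le \int_{\mathcal{L}^{(0)}} d\nu_0(\gamma)\, w(\gamma) \le 2^{11} d\, \frac{t^2}{U}. \tag{4.9}
\]

We have shown that

\[
e_0(\rho) \ge -\mu - \Bigl( 2dt - \mu + 2^{10} d^2\, \frac{t^2}{U} \Bigr)(1 - \rho) - 2^{11} d\, \frac{t^2}{U}. \tag{4.10}
\]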

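From here on we proceed as before. The minimum of $\tilde{b}(\rho)$ is $-\mu - \frac{2}{(\pi^2 t)^{d/2}} \bigl( \frac{2dt - \mu}{d + 2} \bigr)^{1 + d/2}$. The ground state density then satisfies

\[
1 - \rho \ge \frac{\frac{2}{(\pi^2 t)^{d/2}} \bigl( \frac{2dt - \mu}{d + 2} \bigr)^{1 + d/2} - 2^{11} d\, \frac{t^2}{U}}{2dt - \mu + 2^{10} d^2\, \frac{t^2}{U}}. \tag{4.11}
\]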

One finds the condition of Theorem 1.3 by requiring that the numerator be strictly positive.


Appendix A. Cluster Expansions

This appendix contains the main theorem of [20] for the convergence of cluster expansions. It allows for an uncountable set of “polymers”, so that it applies here. Let $(\mathbb{A}, \mathcal{A}, \mu)$ be a measure space with $\mu$ a complex measure. We suppose that $|\mu|(\mathbb{A}) < \infty$, where $|\mu|$ is the total variation (absolute value) of $\mu$. Let $\zeta$ be a complex measurable symmetric function on $\mathbb{A} \times \mathbb{A}$. Let $Z$ be the partition function:

\[
Z = \sum_{n \ge 0} \frac{1}{n!} \int d\mu(A_1) \dots d\mu(A_n) \prod_{1 \le i < j \le n} \bigl( 1 - \zeta(A_i, A_j) \bigr). \tag{A.1}
\]

The term $n = 0$ of the sum is understood to be 1. We denote by $\mathcal{G}_n$ the set of all (unoriented) graphs with $n$ vertices, and $\mathcal{C}_n \subset \mathcal{G}_n$ the set of connected graphs of $n$ vertices. We introduce the following combinatorial function on finite sequences $(A_1, \dots, A_n)$ in $\mathbb{A}$:

\[
\varphi(A_1, \dots, A_n) =
\begin{cases}
1 & \text{if } n = 1, \\
\frac{1}{n!} \sum_{G \in \mathcal{C}_n} \prod_{(i,j) \in G} [-\zeta(A_i, A_j)] & \text{if } n \ge 2.
\end{cases} \tag{A.2}
\]

The product is over edges of $G$. A sequence $(A_1, \dots, A_n)$ is a cluster if the graph with $n$ vertices and an edge between $i$ and $j$ whenever $\zeta(A_i, A_j) \ne 0$, is connected.

Convergence of the cluster expansion is guaranteed provided the terms in (A.1) are small in a suitable sense. First, we assume that

\[
|1 - \zeta(A, A')| \le 1 \tag{A.3}
\]

for all $A, A' \in \mathbb{A}$. Second, we need that the “Kotecký-Preiss criterion” holds true. Namely, we suppose that there exists a nonnegative function $a$ on $\mathbb{A}$ such that for all $A \in \mathbb{A}$,

\[
\int d|\mu|(A')\, |\zeta(A, A')|\, e^{a(A')} \le a(A). \tag{A.4}
\]

The cluster expansion allows one to express the logarithm of the partition function as a sum (or an integral) over clusters.

Theorem A.1 (Cluster expansion). Assume that $\int d|\mu|(A)\, e^{a(A)} < \infty$, and that (A.3) and (A.4) hold true. Then we have

\[
Z = \exp \Bigl\{ \sum_{n \ge 1} \int d\mu(A_1) \dots d\mu(A_n)\, \varphi(A_1, \dots, A_n) \Bigr\}.
\]

Combined sum and integrals converge absolutely. Furthermore, we have for all $A_1 \in \mathbb{A}$,

\[
1 + \sum_{n \ge 2} n \int d|\mu|(A_2) \dots d|\mu|(A_n)\, |\varphi(A_1, \dots, A_n)| \le e^{a(A_1)}. \tag{A.5}
\]

We refer to [20] for the proof of this theorem, and for further statements about correlation functions.
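To make the combinatorial function (A.2) concrete, the following sketch evaluates it by brute force in the simplest illustrative case, where $\zeta = 1$ for every pair of arguments. This special case and the function names are assumptions made for the example, not something the appendix singles out. For a one-point polymer set with weight $\mu$, formula (A.1) then gives $Z = 1 + \mu$, so the cluster expansion must reproduce the Taylor series of $\log(1 + \mu)$, i.e. $\varphi = (-1)^{n-1}/n$.

```python
from itertools import combinations
from math import factorial


def is_connected(n, edges):
    """Return True if the graph on vertices 0..n-1 with the given edges is connected."""
    adjacency = {v: set() for v in range(n)}
    for i, j in edges:
        adjacency[i].add(j)
        adjacency[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in adjacency[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def hard_core_phi(n):
    """Evaluate (A.2) when zeta(A_i, A_j) = 1 for all pairs (illustrative case).

    Each connected graph G contributes (-1)^(number of edges of G).
    """
    if n == 1:
        return 1.0
    all_edges = list(combinations(range(n), 2))
    total = 0
    for k in range(len(all_edges) + 1):
        for edges in combinations(all_edges, k):
            if is_connected(n, edges):
                total += (-1) ** k
    return total / factorial(n)
```

Summing `hard_core_phi(n) * mu**n` over small `n` approximates `log(1 + mu)` for small `|mu|`, which is the content of Theorem A.1 in this toy setting.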


References

1. Aizenman, M., Lieb, E.H., Seiringer, R., Solovej, J.P., Yngvason, J.: Bose-Einstein quantum phase transition in an optical lattice model. Phys. Rev. A 70, 023612 (2004); see also cond-mat/0412034; see also http://arxiv.org/list/cond-mat/0412034, 2004
2. Batrouni, G.G., Assaad, F.F., Scalettar, R.T., Denteneer, P.J.H.: Dynamic response of trapped ultracold bosons on optical lattices. Phys. Rev. A 72, 031601(R) (2005)
3. Borgs, C., Kotecký, R., Ueltschi, D.: Low temperature phase diagrams for quantum perturbations of classical spin systems. Commun. Math. Phys. 181, 409–446 (1996)
4. Bru, J.-B., Dorlas, T.C.: Exact solution of the infinite-range-hopping Bose-Hubbard model. J. Stat. Phys. 113, 177–196 (2003)
5. Datta, N., Fernández, R., Fröhlich, J.: Low-temperature phase diagrams of quantum lattice systems. I. Stability for quantum perturbations of classical systems with finitely-many ground states. J. Stat. Phys. 84, 455–534 (1996)
6. Datta, N., Fernández, R., Fröhlich, J., Rey-Bellet, L.: Low-temperature phase diagrams of quantum lattice systems. II. Convergent perturbation expansions and stability in systems with infinite degeneracy. Helv. Phys. Acta 69, 752–820 (1996)
7. Dyson, F.J., Lieb, E.H., Simon, B.: Phase transitions in quantum spin systems with isotropic and nonisotropic interactions. J. Stat. Phys. 18, 335–383 (1978)
8. Elstner, N., Monien, H.: Dynamics and thermodynamics of the Bose-Hubbard model. Phys. Rev. B 59, 12184–12187 (1999)
9. Fisher, M.P.A., Weichman, P.B., Grinstein, G., Fisher, D.S.: Boson localization and the superfluid-insulator transition. Phys. Rev. B 40, 546–570 (1989)
10. Freericks, J.K., Monien, H.: Strong-coupling expansions for the pure and disordered Bose-Hubbard model. Phys. Rev. B 53, 2691–2700 (1996)
11. Fröhlich, J., Simon, B., Spencer, T.: Infrared bounds, phase transitions and continuous symmetry breaking. Commun. Math. Phys. 50, 79–95 (1976)
12. Greiner, M., Mandel, O., Esslinger, T., Hänsch, T.W., Bloch, I.: Quantum phase transition from a superfluid to a Mott insulator in a gas of ultracold atoms. Nature 415, 39–44 (2002)
13. Kennedy, T.: Long range order in the anisotropic quantum ferromagnetic Heisenberg model. Commun. Math. Phys. 100, 447–462 (1985)
14. Kennedy, T., Lieb, E.H., Shastry, B.S.: The X-Y model has long-range order for all spins and all dimensions greater than one. Phys. Rev. Lett. 61, 2582–2584 (1988)
15. Köhl, M., Moritz, H., Stöferle, T., Schori, C., Esslinger, T.: Superfluid to Mott insulator transition in one, two, and three dimensions. J. Low Temp. Phys. 138, 635 (2005)
16. Kotecký, R., Ueltschi, D.: Effective interactions due to quantum fluctuations. Commun. Math. Phys. 206, 289–335 (1999)
17. Lieb, E.H., Seiringer, R., Solovej, J.P., Yngvason, J.: The mathematics of the Bose gas and its condensation. Oberwolfach Seminars, Basel: Birkhäuser, 2005
18. Messager, A., Miracle-Solé, S.: Low temperature states in the Falicov-Kimball model. Rev. Math. Phys. 8, 271–99 (1996)
19. Schmid, G., Todo, S., Troyer, M., Dorneich, A.: Finite-temperature phase diagram of hard-core bosons in two dimensions. Phys. Rev. Lett. 88, 167208 (2002)
20. Ueltschi, D.: Cluster expansions and correlation functions. Moscow Math. J. 4, 511–522 (2004)

Communicated by M. Aizenman

Commun. Math. Phys. 266, 797–818 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0020-6

Communications in

Mathematical Physics

Upper Bounds to the Ground State Energies of the One- and Two-Component Charged Bose Gases

Jan Philip Solovej

Institute for Mathematical Sciences, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen, Denmark. E-mail: [email protected]

Received: 30 September 2005 / Accepted: 2 December 2005
Published online: 5 May 2006 – © by J.P. Solovej 2006

Abstract: We prove upper bounds on the ground state energies of the one- and two-component charged Bose gases. The upper bound for the one-component gas agrees with the high density asymptotic formula proposed by L. Foldy in 1961. The upper bound for the two-component gas agrees in the large particle number limit with the asymptotic formula conjectured by F. Dyson in 1967. Matching asymptotic lower bounds for these systems were proved in references [10] and [11]. The formulas of Foldy and Dyson, which are based on Bogolubov’s pairing theory, have thus been validated.

1. Introduction and Main Results

In 1961 L. Foldy [7] used Bogolubov’s 1947 pairing theory [4] for Bose systems to give a heuristic calculation of the ground state energy of a one-component charged Bose gas in the high density limit. The one-component Bose gas is a system of Bose particles all of the same charge moving in the presence of a fixed uniform background of the opposite charge. In 1967 F. Dyson [6] considered the two-component Bose gas with two species of bosons with opposite charges. Motivated by Foldy’s calculation, Dyson was able to prove a rigorous upper bound on the ground state energy. A famous consequence of Dyson’s upper bound is that charged bosonic matter is not stable: the ground state energy is superlinear in the number of particles. Dyson, moreover, conjectured an exact asymptotic form of the ground state energy in the limit of a large number of particles.

© 2006 by the author. This article may be reproduced in its entirety for non-commercial purposes.

Work partially supported by NSF grant DMS-0111298, by EU grant HPRN-CT-2002-00277, by

MaPhySto – A Network in Mathematical Physics and Stochastics, funded by The Danish National Research Foundation, and by grants from the Danish research council. Most of this work was done while the author was visiting the School of Mathematics, Institute for Advanced Study, Princeton.


In [10] it was proved that Foldy’s calculation is indeed correct as a leading asymptotic lower bound for the ground state energy of the one-component charged Bose gas in the high density limit. In [11] it was similarly proved that Dyson’s conjectured expression is correct as an asymptotic lower bound for the ground state energy of the two-component charged Bose gas in the limit of a large number of particles. The aim of the present paper is to prove the corresponding upper bounds, thus validating both Foldy’s one-component and Dyson’s two-component formulas.

It should be mentioned that Foldy’s calculation may be viewed as a trial state calculation and may thus be turned into a rigorous upper bound. Foldy, however, uses periodic boundary conditions, and a periodic version of the Coulomb potential. It is not known whether this formulation has the same thermodynamic limit as the formulation given below.

The one-component Bose gas is a system of $N$ particles all of the same charge $+1$, say, constrained to a box $\Lambda = [0, L]^3 \subset \mathbb{R}^3$, in which there is a uniform background charge of density $\rho$. The Hamiltonian for the one-component charged Bose gas is thus

\[
H_N^{(1)} = \sum_{i=1}^{N} \Bigl( -\tfrac{1}{2} \Delta_i - V(x_i) \Bigr) + \sum_{1 \le i < j \le N} |x_i - x_j|^{-1} + C, \tag{1}
\]

where

\[
V(x) = \rho \int_{\Lambda} |x - y|^{-1}\, dy, \qquad C = \frac{\rho^2}{2} \iint_{\Lambda \times \Lambda} |x - y|^{-1}\, dx\, dy.
\]

We use Dirichlet boundary conditions. It is known from the work of Lieb and Narnhofer [9] that the ground state energy $E^{(1)}(N)$ of $H_N^{(1)}$ has a thermodynamic limit if we restrict to a neutral system

\[
e(\rho) = \lim_{\substack{N \to \infty \\ L^3 = N/\rho}} \frac{E^{(1)}(N)}{L^3}.
\]

It is however also shown in [9] that one will get the same thermodynamic energy by minimizing over all particle numbers, i.e.,

\[
e(\rho) = \lim_{L \to \infty} \inf_{N} \frac{E^{(1)}(N)}{L^3}.
\]

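Theorem 1.1 (Foldy’s formula). The ground state energy $e(\rho)$ of the one-component charged Bose gas satisfies the asymptotics

\[
\lim_{\rho \to \infty} \rho^{-5/4} e(\rho) = -I_0, \tag{2}
\]

where

\[
I_0 = (2/\pi)^{3/4} \int_0^\infty \Bigl( 1 + x^4 - x^2 \bigl( x^4 + 2 \bigr)^{1/2} \Bigr)\, dx = \frac{4^{5/4}\, \Gamma(3/4)}{5 \pi^{1/4}\, \Gamma(5/4)}. \tag{3}
\]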


The two-component Bose gas is described by the Hamiltonian

\[
H_N^{(2)} = \sum_{i=1}^{N} -\tfrac{1}{2} \Delta_i + \sum_{1 \le i < j \le N} \frac{e_i e_j}{|x_i - x_j|}
\]

acting on the Hilbert space $L^2(\mathbb{R}^3 \times \{1, -1\})$, where the variable $(x_i, e_i) \in \mathbb{R}^3 \times \{1, -1\}$ gives the position and charge of particle $i$. The term "two-component" refers to the fact that the charge of each particle can be either positive or negative. Thus the gas has a positive and a negative component. One would not normally consider the charges as variables, but rather fix them to have given values. If we did that, the Hamiltonian would not be fully symmetric in all $N$ variables, but only in the variables for the positively charged particles and negatively charged particles separately. Clearly, the charge variables commute with the Hamiltonian, and the bottom of the spectrum (the ground state energy) $E^{(2)}(N)$ of $H_N^{(2)}$ will therefore be achieved for a fixed combination of charges (rather than a superposition).

Theorem 1.2 (Dyson’s formula). The ground state energy $E^{(2)}(N)$ of the two-component charged Bose gas satisfies the asymptotics

\[
\lim_{N \to \infty} N^{-7/5} E^{(2)}(N) = -A,
\]

where $A$ is the positive constant determined by the variational principle

\[
-A = \inf \Bigl\{ \tfrac{1}{2} \int_{\mathbb{R}^3} |\nabla \Phi|^2 - I_0 \int_{\mathbb{R}^3} \Phi^{5/2} \;\Big|\; 0 \le \Phi,\ \int_{\mathbb{R}^3} \Phi^2 = 1 \Bigr\} \tag{4}
\]

with I0 again given by (3). In [6] Dyson proves that E (2) (N ) ≤ −C N 7/5 , but with a constant different from A. He conjectures that the correct value is given as above. That the exponent 7/5 is, indeed, correct was first proved in 1988 by Conlon, Lieb, and Yau in [5], where they show a lower bound −C N 7/5 , but still not with the correct constant. They also proved that 5/4 is the correct exponent in Foldy’s formula. The asymptotic lower bounds in Theorems 1.1 and 1.2 were proved in [10] and [11] respectively. The main results of the following paper are the asymptotic upper bounds. In Sect. 2 we give a general construction of bosonic trial states on the bosonic Fock space over a general Hilbert space. The trial states will be built from coherent states and squeezed states. The trial states are essentially the ones dictated by Bogolubov theory. These trial states are the bosonic equivalent of the fermionic states in Hartree-Fock theory or rather to their extension including the Bardeen-Cooper-Schrieffer states (see [1]). In the same way as fermionic systems may be approximated by the semi-classical Thomas-Fermi theory we will also use a semi-classical type approximation to the Bogolubov trial states. In Sect. 3 we use the general trial state method to give an upper bound on the ground state energy for the two-component gas, but in a grand canonical setting where we do not fix the total number of particles. In Sect 3.1 we show how to get an upper bound for fixed particle number and thus prove Theorem 1.2. In Sect. 4 we use the general trial state method to give an upper bound on the ground state energy for the one-component gas and prove Theorem 1.1.


A key ingredient in the proofs is a semiclassical construction where we represent operators as phase-space integrals with coherent states symbols and use the Berezin-Lieb inequalities. We need an operator version of the inequality. This is discussed in Appendix A.

2. The Abstract Trial State Construction

Our goal in this section is to construct trial states on the bosonic Fock space $\mathcal{F} = \mathcal{F}(\mathcal{H}_1) = \bigoplus_{N=0}^{\infty} \mathcal{H}_N$, over some Hilbert space $\mathcal{H}_1$, i.e., $\mathcal{H}_N = \bigotimes^{N}_{\mathrm{Sym}} \mathcal{H}_1$ and $\mathcal{H}_0 = \mathbb{C}$. We will be using the language of bosonic creation and annihilation operators as a convenient tool for the bookkeeping. We denote by $|0\rangle$ the vacuum vector in $\mathcal{F}$. If $T$ is an operator on $\mathcal{H}_1$ and $W$ is an operator on $\mathcal{H}_1 \otimes \mathcal{H}_1$, which is symmetric under interchange of the tensor factors, we may lift (also referred to as second quantize) these operators to $\mathcal{F}$ as

\[
\bigoplus_{N=1}^{\infty} \sum_{i=1}^{N} T_i \quad \text{and} \quad \bigoplus_{N=2}^{\infty} \sum_{1 \le i < j \le N} W_{ij}.
\]

Here $T_i$ refers to the operator $T$ acting on the $i$-th factor in the tensor product and $W_{ij}$ refers to $W$ acting on the $i$-th and $j$-th factors. If $u_\alpha$, $\alpha = 1, \dots$ is an orthonormal basis for $\mathcal{H}_1$ we can express these operators using creation and annihilation operators as

\[
\bigoplus_{N=1}^{\infty} \sum_{i=1}^{N} T_i = \sum_{\alpha, \beta} (u_\alpha, T u_\beta)\, a(u_\alpha)^* a(u_\beta) \tag{5}
\]

and

\[
\bigoplus_{N=0}^{\infty} \sum_{1 \le i < j \le N} W_{ij} = \frac{1}{2} \sum_{\alpha \beta \mu \nu} (u_\alpha \otimes u_\beta, W u_\mu \otimes u_\nu)\, a(u_\alpha)^* a(u_\beta)^* a(u_\nu) a(u_\mu). \tag{6}
\]

Of special interest is the number operator (the second quantization of the identity)

\[
\mathcal{N} = \bigoplus_{N=0}^{\infty} N.
\]

If $\phi \in \mathcal{H}_1$ is a not necessarily normalized vector we define the corresponding coherent state as the normalized vector in Fock space

\[
|\phi\rangle_C = \exp\bigl( -\|\phi\|^2/2 + a(\phi)^* \bigr) |0\rangle = e^{-\|\phi\|^2/2} \sum_{n=0}^{\infty} \frac{(a(\phi)^*)^n}{n!} |0\rangle, \tag{7}
\]

and for a normalized $\psi \in \mathcal{H}_1$ we define the squeezed state depending on $\lambda \in \mathbb{C}$ with $|\lambda| < 1$,

\[
|\lambda; \psi\rangle_S = (1 - |\lambda|^2)^{1/4} \exp\bigl( -(\lambda/2)\, a(\psi)^* a(\psi)^* \bigr) |0\rangle
= (1 - |\lambda|^2)^{1/4} \sum_{n=0}^{\infty} \frac{(-\lambda/2)^n}{n!} (a(\psi)^*)^{2n} |0\rangle. \tag{8}
\]


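It is straightforward to check that these states are normalized. Up to an overall phase $|\phi\rangle_C$ and $|\lambda; \psi\rangle_S$ are characterized by

\[
(a(\phi) - \|\phi\|^2)\, |\phi\rangle_C = 0 \quad \text{and} \quad (a(\psi) + \lambda a(\psi)^*)\, |\lambda; \psi\rangle_S = 0. \tag{9}
\]

We immediately see that

\[
{}_C\langle \phi | (a(\phi)^*)^m a(\phi)^k | \phi \rangle_C = \|\phi\|^{2(m+k)}. \tag{10}
\]

For the squeezed state we get

\[
\begin{aligned}
{}_S\langle \lambda; \psi | (a(\psi)^*)^j a(\psi)^{j+2k} | \lambda; \psi \rangle_S
&= (1 - |\lambda|^2)^{1/2} \sum_{n=0}^{\infty} \frac{(2n + 2k)!}{((n + k)!)^2}\, (2n - j + 1)(2n - j + 2) \cdots (2n) \\
&\qquad \times (n + k)(n + k - 1) \cdots (n + 1)\, (|\lambda|/2)^{2n} (-\lambda/2)^k \\
&= (1 - |\lambda|^2)^{1/2}\, |\lambda|^j (-\lambda)^k\, \frac{d^j}{d|\lambda|^j} \Bigl( |\lambda|^{-1} \frac{d}{d|\lambda|} \Bigr)^k (1 - |\lambda|^2)^{-1/2}.
\end{aligned} \tag{11}
\]

Moreover, the expectation in the state $|\lambda; \psi\rangle_S$ of a product of an odd number of the operators $a(\psi)^*$ or $a(\psi)$ vanishes. For the expectation of the particle number we find

\[
{}_C\langle \phi | a(\phi/\|\phi\|)^* a(\phi/\|\phi\|) | \phi \rangle_C = \|\phi\|^2
\quad \text{and} \quad
{}_S\langle \lambda; \psi | a(\psi)^* a(\psi) | \lambda; \psi \rangle_S = \frac{|\lambda|^2}{1 - |\lambda|^2}.
\]

We point out that the variation in the particle number is very different in the coherent state and in the squeezed state:

\[
{}_C\langle \phi | (a(\phi/\|\phi\|)^* a(\phi/\|\phi\|))^2 | \phi \rangle_C - {}_C\langle \phi | a(\phi/\|\phi\|)^* a(\phi/\|\phi\|) | \phi \rangle_C^2 = \|\phi\|^2, \tag{12}
\]

\[
{}_S\langle \lambda; \psi | (a(\psi)^* a(\psi))^2 | \lambda; \psi \rangle_S - {}_S\langle \lambda; \psi | a(\psi)^* a(\psi) | \lambda; \psi \rangle_S^2 = \frac{2 |\lambda|^2}{(1 - |\lambda|^2)^2}. \tag{13}
\]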
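The particle-number statistics of the two kinds of states can be checked numerically from their number distributions. In the sketch below, which is an illustration and not part of the paper, the coherent state is taken to have a Poisson number distribution with mean $\|\phi\|^2$, and the squeezed state is taken to occupy only even numbers $2n$ with probability $(1 - r^2)^{1/2} (r/2)^{2n} (2n)!/(n!)^2$, $r = |\lambda|$, as follows from the series in (8) together with $\|(a(\psi)^*)^{2n}|0\rangle\|^2 = (2n)!$. The function names and truncation parameters are assumptions.

```python
import math


def coherent_number_stats(mean_n, kmax=400):
    """Mean and variance of the particle number in a coherent state (Poisson law)."""
    p = math.exp(-mean_n)  # probability of zero particles
    mean = second = 0.0
    for k in range(kmax):
        mean += k * p
        second += k * k * p
        p *= mean_n / (k + 1)  # Poisson recurrence p_{k+1} = p_k * m / (k+1)
    return mean, second - mean ** 2


def squeezed_number_stats(r, nmax=4000):
    """Norm, mean and variance of the particle number in a squeezed state, r = |lambda| < 1."""
    p = math.sqrt(1.0 - r * r)  # probability of zero particles
    norm = mean = second = 0.0
    for n in range(nmax):
        k = 2 * n  # only even particle numbers are occupied
        norm += p
        mean += k * p
        second += k * k * p
        p *= r * r * (2 * n + 1) / (2 * (n + 1))  # ratio p_{2n+2} / p_{2n}
    return norm, mean, second - mean ** 2
```

For example, `squeezed_number_stats(0.5)` gives a mean close to $r^2/(1 - r^2) = 1/3$ and a variance close to $2r^2/(1 - r^2)^2 = 8/9$, so the standard deviation exceeds the mean, while `coherent_number_stats(2.5)` returns mean and variance both close to 2.5.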

Thus in the coherent state the standard deviation of the particle number is the square root of the expectation itself, whereas for the squeezed state the standard deviation of the particle number is, in fact, greater than the expectation itself. For this reason the squeezed states are not appropriate for describing Bose condensates with a macroscopic and sharply defined occupation number in a specific one-particle state. To describe condensates we will use coherent states. We will here define a variational principle corresponding to the Bogolubov theory of Bose gases. We shall do this by characterizing the set of variational trial states (see also Robinson [12]). The Bogolubov variational theory is very similar to the Hartree-Fock theory for Fermi gases. More precisely, it is similar to the generalized Hartree-Fock theory which includes the Bardeen-Cooper-Schrieffer (BCS) trial states. In generalized Hartree-Fock theory (see [1]) the class of trial states is defined to be the quasi-free states on a fermionic Fock space. For the ground state (zero temperature) theory we may restrict to pure quasi-free states.


To describe the variational states of Bogolubov theory we will again start from (normalized) quasi-free pure states. Such a state may be characterized as follows. If $\Psi \in \mathcal{F}(\mathcal{H}_1)$ is a normalized quasi-free pure state there exists an orthonormal family $\psi_1, \dots$ of $\mathcal{H}_1$ and a sequence of numbers $0 < \lambda_1, \dots < 1$ with $\sum_{\alpha=1}^{\infty} \lambda_\alpha^2 < \infty$ such that

\[
\Psi = \prod_{\alpha=1}^{\infty} (1 - \lambda_\alpha^2)^{1/4} \exp\Bigl( -\frac{\lambda_\alpha}{2}\, a(\psi_\alpha)^* a(\psi_\alpha)^* \Bigr) |0\rangle. \tag{14}
\]

A straightforward but lengthy calculation from (11) shows that the quasi-free state satisfies

\[
(\Psi, a_1 a_2 a_3 a_4 \Psi) = (\Psi, a_1 a_2 \Psi)(\Psi, a_3 a_4 \Psi) + (\Psi, a_1 a_4 \Psi)(\Psi, a_2 a_3 \Psi) + (\Psi, a_1 a_3 \Psi)(\Psi, a_2 a_4 \Psi) \tag{15}
\]

and from the definition of the state we have for all integers $m \ge 1$,

\[
(\Psi, a_1 \cdots a_{2m-1} \Psi) = 0. \tag{16}
\]

In (15) and (16), $a_j$, $j = 1, 2, \dots$ refer to any creation or annihilation operators. The relation (15) is the case $m = 2$ of the more general rule

\[
(\Psi, a_1 \cdots a_{2m} \Psi) = \sum_{\pi \in P_{2m}} (\Psi, a_{\pi(1)} a_{\pi(2)} \Psi) \cdots (\Psi, a_{\pi(2m-1)} a_{\pi(2m)} \Psi), \tag{17}
\]

where $P_{2m}$ is the set of pairing permutations

\[
P_{2m} = \{ \pi \in S_{2m} \mid \pi(2j - 1) < \pi(2j + 1),\ j = 1, \dots, m - 1;\ \ \pi(2j - 1) < \pi(2j),\ j = 1, \dots, m \}. \tag{18}
\]
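The set of pairing permutations in (18) is small enough to enumerate directly, which makes the link between (17) and (15) explicit. The sketch below is illustrative only (the function name is an assumption); it filters all permutations of $\{1, \dots, 2m\}$ by the two ordering conditions in (18).

```python
from itertools import permutations


def pairing_permutations(m):
    """List the permutations pi of {1, ..., 2m} that satisfy the conditions in (18).

    pi is stored as a tuple with pi[k-1] = pi(k).
    """
    result = []
    for pi in permutations(range(1, 2 * m + 1)):
        # pi(2j-1) < pi(2j+1) for j = 1, ..., m-1: first entries of the pairs increase
        firsts_increase = all(pi[2 * j - 2] < pi[2 * j] for j in range(1, m))
        # pi(2j-1) < pi(2j) for j = 1, ..., m: each pair is ordered
        pairs_ordered = all(pi[2 * j - 2] < pi[2 * j - 1] for j in range(1, m + 1))
        if firsts_increase and pairs_ordered:
            result.append(pi)
    return result
```

For `m = 2` this yields the three permutations `(1, 2, 3, 4)`, `(1, 3, 2, 4)` and `(1, 4, 2, 3)`, i.e. the pairings (12)(34), (13)(24) and (14)(23) that appear on the right side of (15). For general `m` the count is $(2m - 1)!!$.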

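We shall here use this only in the case (15) when $m = 2$. The one-particle density matrix of the quasi-free state $\Psi$ is the operator $\gamma_1$ defined on the one-body space $\mathcal{H}_1$ by $(g, \gamma_1 f)_{\mathcal{H}_1} = (\Psi, a(f)^* a(g) \Psi)_{\mathcal{F}}$, where $f, g \in \mathcal{H}_1$. From (11),

\[
\gamma_1 = \sum_{\alpha=1}^{\infty} \frac{\lambda_\alpha^2}{1 - \lambda_\alpha^2}\, |\psi_\alpha\rangle \langle \psi_\alpha|. \tag{19}
\]

Note, in particular, that the one-particle density matrix is a positive semi-definite trace class operator with

\[
\mathrm{Tr}\, \gamma_1 = (\Psi, \mathcal{N} \Psi) = \sum_{\alpha=1}^{\infty} \frac{\lambda_\alpha^2}{1 - \lambda_\alpha^2} < \infty.
\]

Connected to the quasi-free pure state $\Psi$ we also have the symmetric bilinear form $\xi_1$ on $\mathcal{H}_1$ given by $\xi_1(f, g) = (\Psi, a(f)^* a(g)^* \Psi)_{\mathcal{F}}$. We find, again from (11), that

\[
\xi_1(f, g) = \sum_{\alpha=1}^{\infty} \frac{-\lambda_\alpha}{1 - \lambda_\alpha^2}\, (\psi_\alpha, f)(\psi_\alpha, g). \tag{20}
\]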


We may identify $\xi_1$ with a linear map $\xi_1 : \mathcal{H}_1 \to \mathcal{H}_1^*$, from the one-body space $\mathcal{H}_1$ to its dual space $\mathcal{H}_1^*$. We then have the relations

\[
\xi_1^* \xi_1 = \gamma_1 (\gamma_1 + 1), \qquad \xi_1 \gamma_1 = \gamma_1 \xi_1, \tag{21}
\]

where we have also identified $\gamma_1$ in the natural way with a map from $\mathcal{H}_1^*$ to itself. If we introduce the operator $\Gamma : \mathcal{H}_1 \oplus \mathcal{H}_1^* \to \mathcal{H}_1 \oplus \mathcal{H}_1^*$ defined using matrix notation as

\[
\Gamma = \begin{pmatrix} \gamma_1 & \xi_1^* \\ \xi_1 & 1 + \gamma_1 \end{pmatrix},
\]

we may rewrite the condition (21) as

\[
\Gamma \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \Gamma = \Gamma.
\]

We may refer to an operator satisfying this condition as a symplectic projection. In the fermionic case the corresponding operator is simply a projection. Note that the operator $\Gamma$ may also be described by

\[
\bigl( |f_1\rangle \oplus \langle g_1|,\ \Gamma\, |f_2\rangle \oplus \langle g_2| \bigr)_{\mathcal{H}_1 \oplus \mathcal{H}_1^*}
= \bigl( \Psi, \bigl( a(f_2)^* + a(g_2) \bigr) \bigl( a(f_1) + a(g_1)^* \bigr) \Psi \bigr)_{\mathcal{F}(\mathcal{H}_1)},
\]

where we have used the Dirac bra and ket notation to denote elements of $\mathcal{H}_1$ and $\mathcal{H}_1^*$ respectively.

Given a positive definite trace class operator $\gamma_1$ and a symmetric bilinear form $\xi_1$ satisfying (21) we may find a unique quasi-free pure state $\Psi$ such that $\gamma_1$ is the corresponding one-particle density matrix and $\xi_1$ the corresponding bilinear form. To see this one simply has to show that there exists an orthonormal family $\psi_1, \dots$ and a sequence of positive numbers $\lambda_1, \dots$ such that (19) and (20) hold. This is a fairly simple exercise in linear algebra. The choice of $\xi_1$ is equivalent to a particular choice of eigenbasis for $\gamma_1$. If $\gamma_1$ has real eigenfunctions (in some representation) there is a particular $\xi_1$ corresponding to this choice of basis. We shall use this in our construction of states in the next sections.

Consider as an example $\gamma_1$ being a real translation invariant operator on the Hilbert space $L^2(\mathbb{R}^n / 2\pi \mathbb{Z}^n)$ of square integrable functions on the torus. The real eigenfunctions come in degenerate pairs of the form $\cos(px)$ and $\sin(px)$, $p \in \mathbb{Z}^n$. The associated quasi-free state will in the exponent have terms of the form $a(\cos(px))^* a(\cos(px))^* + a(\sin(px))^* a(\sin(px))^* = a(e^{ipx})^* a(e^{-ipx})^*$. This corresponds to a pairing of states with opposite momenta, as is the usual case in the Bogolubov pair theory.

The Bogolubov variational states are not just quasi-free states as defined above. In fact, quasi-free states, being built out of squeezed states, are not well suited for describing condensates (see the discussion after (12) and (13)). We introduce condensates by appropriate unitary transformations of quasi-free states, as we shall now describe. Given $\phi \in \mathcal{H}_1$ we have a unitary map $U_\phi$ on the Fock space $\mathcal{F}(\mathcal{H}_1)$ which satisfies $U_\phi^* a(f) U_\phi = a(f) + (f, \phi)$. This unitary is unique up to an overall complex phase, which we may fix by noting that we can add the requirement that the unitary maps the vacuum state to a coherent state, $U_\phi |0\rangle = |\phi\rangle_C$.
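The symplectic-projection condition discussed above can be checked by hand for a single pair state, where all operators reduce to numbers. The sketch below is a hedged illustration under two assumptions: the block operator has the form $\Gamma = \bigl(\begin{smallmatrix} \gamma_1 & \xi_1^* \\ \xi_1 & 1 + \gamma_1 \end{smallmatrix}\bigr)$, and the condition reads $\Gamma S \Gamma = \Gamma$ with $S = \mathrm{diag}(-1, 1)$. The scalar values of $\gamma_1$ and $\xi_1$ are taken from (19) and (20) for one real $\lambda$; the function names are hypothetical.

```python
def gamma_block(lam):
    """2x2 matrix [[gamma, xi], [xi, 1 + gamma]] for a single mode with real 0 < lam < 1."""
    g = lam ** 2 / (1 - lam ** 2)  # eigenvalue of gamma_1, cf. (19)
    x = -lam / (1 - lam ** 2)      # coefficient of xi_1, cf. (20)
    return [[g, x], [x, 1 + g]]


def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def symplectic_defect(lam):
    """Largest entry of |Gamma S Gamma - Gamma| with S = diag(-1, 1)."""
    G = gamma_block(lam)
    S = [[-1.0, 0.0], [0.0, 1.0]]
    GSG = matmul2(matmul2(G, S), G)
    return max(abs(GSG[i][j] - G[i][j]) for i in range(2) for j in range(2))
```

Calling `symplectic_defect(lam)` for any `0 < lam < 1` returns a value at the level of rounding error, which reflects the scalar identity $\xi_1^2 = \gamma_1 (\gamma_1 + 1)$ from (21).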


From the first identity in (9) it is clear that $U_\phi$ satisfies this up to a phase. The Bogolubov variational states are constructed from a quasi-free state $\Psi$ and a vector $\phi \in \mathcal{H}_1$ as $\Psi_\phi = U_\phi \Psi$. From the above discussion we see that a Bogolubov state may be described as follows.

Definition 2.1 (Bogolubov variational states). A Bogolubov state on the bosonic Fock space $\mathcal{F}(\mathcal{H}_1)$ is given by

\[
\Psi_{\phi, \gamma_1, \xi_1} = \prod_{\alpha=1}^{\infty} (1 - \lambda_\alpha^2)^{1/4} \exp\Bigl( -\frac{\lambda_\alpha}{2} \bigl( a(\psi_\alpha)^* - (\phi, \psi_\alpha) \bigr) \bigl( a(\psi_\alpha)^* - (\phi, \psi_\alpha) \bigr) \Bigr) |\phi\rangle_C, \tag{22}
\]

where $\phi \in \mathcal{H}_1$ and $\psi_1, \psi_2, \dots$ is an orthonormal family in $\mathcal{H}_1$ and $0 < \lambda_1, \lambda_2, \dots < 1$ satisfy $\sum_{\alpha=1}^{\infty} \lambda_\alpha^2 = 1$. We call $\phi$ the condensate vector and $\psi_1, \psi_2, \dots$ the pair states.

There is a one-to-one correspondence between Bogolubov states and triples $(\phi, \gamma_1, \xi_1)$ consisting of a vector $\phi \in \mathcal{H}_1$, a positive trace class operator $\gamma_1$ on $\mathcal{H}_1$ and a bilinear form $\xi_1$ on $\mathcal{H}_1 \times \mathcal{H}_1$ satisfying (21). The correspondence is given by (19) and (20). We find for the one-particle density matrix of the Bogolubov state $\Psi_{\phi, \gamma_1, \xi_1}$ that

\[
\begin{aligned}
\bigl( \Psi_{\phi, \gamma_1, \xi_1}, a(u)^* a(v) \Psi_{\phi, \gamma_1, \xi_1} \bigr)_{\mathcal{F}(\mathcal{H}_1)}
&= \bigl( \Psi_{0, \gamma_1, \xi_1}, (a(u)^* + (\phi, u))(a(v) + (v, \phi)) \Psi_{0, \gamma_1, \xi_1} \bigr)_{\mathcal{F}(\mathcal{H}_1)} \\
&= (v, \gamma_1 u) + (v, \phi)(\phi, u)
\end{aligned} \tag{23}
\]

and likewise for the two-particle density matrix using (15),

\[
\begin{aligned}
\bigl( \Psi_{\phi, \gamma_1, \xi_1}, a(u_1)^* a(u_2)^* a(v_2) a(v_1) \Psi_{\phi, \gamma_1, \xi_1} \bigr)_{\mathcal{F}(\mathcal{H}_1)}
&= (v_1, \phi)(v_2, \phi)(\phi, u_1)(\phi, u_2) \\
&\quad + \xi_1(u_1, u_2)(v_1, \phi)(v_2, \phi) + \xi_1(v_1, v_2)(\phi, u_1)(\phi, u_2) \\
&\quad + (v_2, \gamma_1 u_1)(v_1, \phi)(\phi, u_2) + (v_1, \gamma_1 u_2)(v_2, \phi)(\phi, u_1) \\
&\quad + (v_2, \gamma_1 u_2)(v_1, \phi)(\phi, u_1) + (v_1, \gamma_1 u_1)(v_2, \phi)(\phi, u_2) \\
&\quad + (v_1, \gamma_1 u_1)(v_2, \gamma_1 u_2) + (v_1, \gamma_1 u_2)(v_2, \gamma_1 u_1) + \xi_1(v_1, v_2)\, \xi_1(u_1, u_2).
\end{aligned} \tag{24}
\]

The above trial states are motivated by the Bogolubov approximation for Bose condensed systems. The states $\phi$ represent the condensate, whereas the states $\psi_\alpha$, $\alpha = 1, \dots$ represent the pair states. A key ingredient in the Bogolubov approximation is the c-number substitution, i.e., the replacement of the operator $a(\phi)$ by the number $\|\phi\|^2$. This replacement will give the correct value for expectations of normal ordered products in the Bogolubov states if we have the additional assumption that $\gamma_1 \phi = 0$ (see (10)). In Sect. 3 we will choose a Bogolubov state satisfying this assumption, but in Sect. 4 the Bogolubov state that we choose will not satisfy the assumption.

It is not the aim here to study the general properties of the Bogolubov variational problem, i.e., the minimization of the expectation of many-body Hamiltonians restricted to Bogolubov states. We will instead proceed to the specific examples of the one-component and two-component charged Bose gas. Here we shall not characterize the exact Bogolubov minimizer, but instead give the semiclassical approximations to these states which give the leading order asymptotics in Theorems 1.1 and 1.2.

The Hamiltonians that we are interested in are particle number conserving, i.e., commute with particle number, and the reader may wonder why we do not define a class


of particle conserving, i.e., canonical trial states rather than the grand canonical states above. As in the fermionic BCS theory it is very complicated to write a canonical trial state. The calculations are greatly simplified in the grand canonical setting. Simple minded trial states with a fixed number of particles in the condensate will not give the correct approximation, since the important virtual pair creation will be lost.

3. The Two-Component Charged Bose Gas

We consider the two-component Bose gas described by the Hamiltonian
\[
H^{(2)} = \bigoplus_{N=0}^{\infty} H_N^{(2)}, \qquad
H_N^{(2)} = \sum_{i=1}^{N} -\tfrac12 \Delta_i + \sum_{1\le i<j\le N} \frac{e_i e_j}{|x_i - x_j|},
\]
acting on the Fock space $\mathcal F(L^2(\mathbb R^3\times\{1,-1\}))$, where the variable $(x_i, e_i) \in \mathbb R^3\times\{1,-1\}$ gives the position and charge of particle $i$. Our goal here is first to construct a grand canonical normalized trial function $\Psi \in \mathcal F(L^2(\mathbb R^3\times\{1,-1\}))$ with particle numbers concentrated sharply around the average value $\langle N\rangle = (\Psi, N\Psi)$ and such that
\[
\langle H^{(2)}\rangle = (\Psi, H^{(2)}\Psi) \le -A\langle N\rangle^{7/5} + o(\langle N\rangle^{7/5})
\tag{25}
\]

for large $\langle N\rangle$. We have denoted the expectation in the state $\Psi$ by $\langle A\rangle = (\Psi, A\Psi)$. From this the proof of Dyson’s formula Theorem 1.2 (i.e., the fact that we can achieve this estimate with a trial function of fixed particle number) will follow fairly easily (see Sect. 3.1). To construct the trial state we use the method from the previous section. We begin with a normalized minimizer $\Phi$ for the variational problem (4). Using spherically symmetric decreasing rearrangements it is not difficult to see that a minimizer exists and that it may be chosen positive and spherically symmetric decreasing. Moreover, from the Euler-Lagrange equation it is exponentially decreasing and smooth. It is, however, not essential that we can find an exact minimizer with these properties. As we shall see, we could as well have chosen an approximate minimizer, which is smooth and compactly supported. Let $n > 0$ and define the normalized function
\[
\phi_0(x) = n^{3/10}\,\Phi(n^{1/5}x).
\tag{26}
\]
We define a normalized state $\Psi_n \in \mathcal F$ as in (22) with the condensate vector on $L^2(\mathbb R^3\times\{-1,1\})$ given by
\[
\phi(x,e) = \sqrt{\frac{n}{2}}\,\phi_0(x)
\]
and the operator $\gamma_1$ on $L^2(\mathbb R^3\times\{-1,1\})$ defined by the integral kernel
\[
\gamma_1(x,e;y,e') = \tfrac12\,\gamma(x,y)\,e e',
\]


where γ is a positive semi-definite trace class operator having real eigenfunctions. We shall make an explicit choice for γ below (see (39)). We write the spectral decomposition of γ as
\[
\gamma = \sum_{\alpha=1}^{\infty} \frac{\lambda_\alpha^2}{1-\lambda_\alpha^2}\,|\psi_\alpha\rangle\langle\psi_\alpha|,
\tag{27}
\]

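where $\psi_\alpha$, $\alpha = 1, \ldots$ is a real orthonormal basis and $0 \le \lambda_\alpha < 1$ for $\alpha = 1, \ldots$. Observe that on the space $L^2(\mathbb R^3\times\{1,-1\})$ we have $\|\phi\|^2 = n$ and $\gamma_1\phi = 0$. Denoting $\psi_{\alpha\pm}(x,e) = \psi_\alpha(x)\delta_{\pm1,e}$, $\alpha = 1, \ldots$ we may write the trial state $\Psi_n$ as
\[
\Psi_n = \prod_{\alpha=1}^{\infty}(1-\lambda_\alpha^2)^{1/4}
\exp\Big(-\sum_{\alpha=1}^{\infty}\sum_{e,e'=\pm}\frac{\lambda_\alpha}{4}\,e e'\,a^*_{\alpha e}a^*_{\alpha e'} + a^*(\phi) - \frac n2\Big)|0\rangle,
\tag{28}
\]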

where $a^*_{\alpha e} = a(\psi_{\alpha e})^*$, for $\alpha = 1, \ldots$. As discussed in the previous section choosing $n$ and any γ with real eigenfunctions uniquely specifies a state $\Psi_n$ of the form above (possible degenerate eigenvalues will not cause ambiguities). Instead of specifying the individual eigenfunctions $\psi_\alpha$ and parameters $\lambda_\alpha$, $\alpha = 1, \ldots$ we will simply choose the operator γ. The state $\Psi_n$ should be compared to Dyson’s trial state in [6]. The main difference is that whereas we use a coherent state construction for the condensate, Dyson used squeezed states for this as well. Put differently, Dyson’s trial state corresponds to an exponential of a purely quadratic expression in creation operators without any linear terms. As we explained in the previous section the consequence of using the linear term in the exponent is that the variation in the number of particles occupying the state $\phi_0$ is much smaller than for a quadratic term. From (23) we find for the expected number of particles in the state $\Psi_n$,
\[
\langle N\rangle = \sum_{\alpha=1}^{\infty}\sum_{e=\pm}\langle a^*_{\alpha e}a_{\alpha e}\rangle = n + \mathrm{Tr}\,\gamma,
\tag{29}
\]

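and for the kinetic energy expectation
\[
\Big\langle \Psi_n, \bigoplus_{N=0}^{\infty}\sum_{i=1}^{N} -\tfrac12\Delta_i\,\Psi_n\Big\rangle
= \frac n2\int|\nabla\phi_0|^2 + \mathrm{Tr}\big(-\tfrac12\Delta\gamma\big)
= \frac{n^{7/5}}{2}\int|\nabla\Phi|^2 + \mathrm{Tr}\big(-\tfrac12\Delta\gamma\big).
\tag{30}
\]
From (6) we get that
\[
\Big\langle \Psi_n, \bigoplus_{N=0}^{\infty}\sum_{1\le i<j\le N}\frac{e_i e_j}{|x_i-x_j|}\,\Psi_n\Big\rangle
= \frac12\sum_{\alpha,\beta,\mu,\nu=1}^{\infty}\sum_{e,e'=\pm} e e'\,w_{\alpha\beta\nu\mu}\,
\langle a^*_{\alpha e}a^*_{\beta e'}a_{\mu e'}a_{\nu e}\rangle,
\tag{31}
\]
where
\[
w_{\alpha\beta\nu\mu} = \iint \psi_\alpha(x)\psi_\beta(y)\,|x-y|^{-1}\,\psi_\nu(x)\psi_\mu(y)\,dx\,dy.
\tag{32}
\]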

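(Since the Coulomb energy is an unbounded operator one may worry about the convergence of the expansion in (31). This problem is easily circumvented by introducing a convergence factor into $|x|^{-1}$, e.g., $|x|^{-1}(1-\exp(-t|x|))$. The expectation on the left of (31) converges as $t \to \infty$ by the Monotone Convergence Theorem, since for fixed values of the charges each term is monotone in $t$. We may do all calculations and estimates for finite $t$ and at the end let $t \to \infty$. We will here ignore this slight complication.) Using the notation of Sect. 2 we have
\[
(\psi_{\beta e'}, \gamma_1\psi_{\alpha e}) = \frac{e e'}{2}\,\frac{\lambda_\alpha^2}{1-\lambda_\alpha^2}\,\delta_{\alpha\beta},
\qquad
\xi_1(\psi_{\beta e'}, \psi_{\alpha e}) = -\frac{e e'}{2}\,\frac{\lambda_\alpha}{1-\lambda_\alpha^2}\,\delta_{\alpha\beta},
\tag{33}
\]
and thus from (24),
\[
\begin{aligned}
\langle a^*_{\alpha e}a^*_{\beta e'}a_{\mu e'}a_{\nu e}\rangle
={}& \frac{n^2}{4}(\phi_0,\psi_\alpha)(\phi_0,\psi_\beta)(\psi_\mu,\phi_0)(\psi_\nu,\phi_0) \\
&- n\frac{e e'}{4}\Big[\delta_{\alpha\beta}\frac{\lambda_\alpha}{1-\lambda_\alpha^2}(\psi_\mu,\phi_0)(\psi_\nu,\phi_0)
+ \delta_{\mu\nu}\frac{\lambda_\mu}{1-\lambda_\mu^2}(\phi_0,\psi_\alpha)(\phi_0,\psi_\beta)\Big] \\
&+ n\frac{e e'}{4}\Big[\delta_{\alpha\mu}\frac{\lambda_\alpha^2}{1-\lambda_\alpha^2}(\phi_0,\psi_\beta)(\psi_\nu,\phi_0)
+ \delta_{\beta\nu}\frac{\lambda_\beta^2}{1-\lambda_\beta^2}(\phi_0,\psi_\alpha)(\psi_\mu,\phi_0)\Big] \\
&+ \frac n4\,\delta_{\beta\mu}\frac{\lambda_\beta^2}{1-\lambda_\beta^2}(\phi_0,\psi_\alpha)(\psi_\nu,\phi_0)
+ \frac n4\,\delta_{\alpha\nu}\frac{\lambda_\alpha^2}{1-\lambda_\alpha^2}(\phi_0,\psi_\beta)(\psi_\mu,\phi_0) \\
&+ \frac{\delta_{\alpha\nu}\delta_{\beta\mu}}{4}\frac{\lambda_\alpha^2}{1-\lambda_\alpha^2}\frac{\lambda_\beta^2}{1-\lambda_\beta^2}
+ \frac{\delta_{\alpha\mu}\delta_{\beta\nu}}{4}\frac{\lambda_\alpha^2}{1-\lambda_\alpha^2}\frac{\lambda_\beta^2}{1-\lambda_\beta^2}
+ \frac{\delta_{\alpha\beta}\delta_{\mu\nu}}{4}\frac{\lambda_\alpha}{1-\lambda_\alpha^2}\frac{\lambda_\mu}{1-\lambda_\mu^2}.
\end{aligned}
\tag{34}
\]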

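We therefore arrive at
\[
\Big\langle \Psi_n, \bigoplus_{N=0}^{\infty}\sum_{1\le i<j\le N}\frac{e_i e_j}{|x_i-x_j|}\,\Psi_n\Big\rangle
= \sum_{\alpha=1}^{\infty}\sum_{\mu,\nu=1}^{\infty} w_{\alpha\alpha\mu\nu}(\psi_\nu,\phi_0)(\psi_\mu,\phi_0)\,n
\Big(\frac{\lambda_\alpha^2}{1-\lambda_\alpha^2} - \frac{\lambda_\alpha}{1-\lambda_\alpha^2}\Big),
\]
where we have used that $\phi_0$ and $\psi_\alpha$, $\alpha = 1, \ldots$ are real. From the expression for $w_{\alpha\alpha\mu\nu}$ we see that we may write this as
\[
\Big\langle \Psi_n, \bigoplus_{N=0}^{\infty}\sum_{1\le i<j\le N}\frac{e_i e_j}{|x_i-x_j|}\,\Psi_n\Big\rangle
= n\,\mathrm{Tr}\Big(K\big(\gamma - \sqrt{\gamma(\gamma+1)}\big)\Big),
\tag{35}
\]
where $K$ is the operator on $L^2(\mathbb R^3)$ with integral kernel
\[
K(x,y) = \phi_0(x)\,|x-y|^{-1}\,\phi_0(y).
\tag{36}
\]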
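The passage from (34) to (35) goes through the sums over the charges $e, e'$. The following worked step is a sketch reconstructed from (31), (33) and (34) only; the shorthand $\gamma_\alpha = \lambda_\alpha^2/(1-\lambda_\alpha^2)$ for the eigenvalues of γ in (27) is introduced here for convenience and is not notation from the text.

```latex
% Charge sums over e, e' in {+1, -1}:
\sum_{e,e'=\pm1} e e' = 0, \qquad \sum_{e,e'=\pm1} (e e')^2 = 4 .

% In (31) every term of (34) is multiplied by e e'. Terms of (34) that do not
% depend on e e' (the n^2/4 term, the two n/4 terms and the three terms without
% a factor n) therefore sum to zero. The four terms carrying one factor e e'
% survive with weight 4, and by the symmetry of w each bracket contributes twice:
\frac12\cdot 4\cdot\frac n4\cdot 2
\sum_{\alpha,\mu,\nu} w_{\alpha\alpha\mu\nu}(\psi_\nu,\phi_0)(\psi_\mu,\phi_0)
\Big(\frac{\lambda_\alpha^2}{1-\lambda_\alpha^2}-\frac{\lambda_\alpha}{1-\lambda_\alpha^2}\Big).

% Identification with the trace in (35):
\gamma_\alpha(\gamma_\alpha+1) = \frac{\lambda_\alpha^2}{(1-\lambda_\alpha^2)^2}
\quad\Longrightarrow\quad
\sqrt{\gamma_\alpha(\gamma_\alpha+1)} = \frac{\lambda_\alpha}{1-\lambda_\alpha^2},
\qquad
\sum_{\mu,\nu} w_{\alpha\alpha\mu\nu}(\psi_\nu,\phi_0)(\psi_\mu,\phi_0) = (\psi_\alpha, K\psi_\alpha).
```

The last identity uses that the $\psi_\mu$ form a complete real orthonormal basis, so that $\sum_\mu \psi_\mu(x)(\psi_\mu,\phi_0) = \phi_0(x)$.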


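Putting together (30) and (35) we arrive at
\[
\langle H^{(2)}\rangle = \frac{n^{7/5}}{2}\int|\nabla\Phi|^2 + \mathrm{Tr}\big(-\tfrac12\Delta\gamma\big)
+ n\,\mathrm{Tr}\Big(K\big(\gamma - \sqrt{\gamma(\gamma+1)}\big)\Big).
\tag{37}
\]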

Our next goal is to construct the operator γ. Here we shall use the method of coherent states symbols. Let $\chi(x) = \pi^{-3/2}\exp(-x^2)$ such that $\int\chi(x)^2\,dx = 1$. Let $0 < \ell$ be a parameter which we shall specify below as a function of $n$ such that $n^{-2/5} \ll \ell \ll n^{-1/5}$. Denote
\[
\chi_\ell(x) = \ell^{-3/2}\chi(x/\ell) \quad\text{and let}\quad \theta_{u,p}(x) = \exp(ipx)\,\chi_\ell(x-u).
\tag{38}
\]
We then define γ to be the operator
\[
\gamma = (2\pi)^{-3}\iint_{\mathbb R^3\times\mathbb R^3} f(u,p)\,|\theta_{u,p}\rangle\langle\theta_{u,p}|\,du\,dp,
\tag{39}
\]
where
\[
f(u,p) = g\Big(\frac{p}{(8\pi n\phi_0(u)^2)^{1/4}}\Big),
\quad\text{where}\quad
g(p) = \frac12\Big(\frac{p^4+1}{p^2(p^4+2)^{1/2}} - 1\Big).
\tag{40}
\]
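To get a feeling for the profile $g$ in (40), it can help to evaluate it numerically. The sketch below is illustrative only: the function names and the argument `local_density` (standing for $n\phi_0(u)^2$) are hypothetical, and the limiting values $g(p)\,p^2 \to 1/(2\sqrt2)$ for small $|p|$ and $g(p)\,p^8 \to 1/4$ for large $|p|$ come from an elementary expansion of (40), not from a statement in the text.

```python
import math


def g(p):
    """Profile g(p) from (40), for a momentum magnitude p > 0."""
    p2 = p * p
    p4 = p2 * p2
    return 0.5 * ((p4 + 1.0) / (p2 * math.sqrt(p4 + 2.0)) - 1.0)


def f(local_density, p):
    """Symbol f(u, p) from (40); local_density plays the role of n * phi_0(u)**2.

    Only |p| enters, which is why f(u, p) = f(u, -p).
    """
    scale = (8.0 * math.pi * local_density) ** 0.25
    return g(abs(p) / scale)


if __name__ == "__main__":
    # Print g together with the rescaled values that expose its two regimes.
    for p in (0.01, 0.1, 1.0, 5.0):
        print(p, g(p), g(p) * p ** 2, g(p) * p ** 8)
```

The $p^{-2}$ behaviour at small momenta is integrable in three dimensions, whereas $g(g+1)$ behaves like $p^{-4}$ there and is not; this is the reason a cutoff version of $g$ is introduced in Sect. 3.1.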

We see that $f(u,p) \ge 0$ and hence γ is a positive semi-definite operator and since $f(u,p) = f(u,-p)$ all eigenfunctions of γ may be chosen real. That this is an appropriate choice for the function $f$ will be seen at the end of our calculation (see (48)). Moreover,
\[
\begin{aligned}
\mathrm{Tr}\,\gamma &= (2\pi)^{-3}\iint f(u,p)\,du\,dp
= 2^{-3/4}\pi^{-9/4}\,n^{3/4}\int_{\mathbb R^3}\phi_0(u)^{3/2}\,du\int_{\mathbb R^3}g(p)\,dp \\
&= 2^{-3/4}\pi^{-9/4}\,n^{3/5}\int_{\mathbb R^3}\Phi(u)^{3/2}\,du\int_{\mathbb R^3}g(p)\,dp.
\end{aligned}
\tag{41}
\]
Thus γ is a trace class operator. Hence we have all the requirements needed in order for γ to define a state $\Psi_n$. Moreover, we see from (29) that for large $n$,
\[
\langle N\rangle = n + O(n^{3/5}).
\tag{42}
\]

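We turn now to the calculation of the expectation of the kinetic energy,
\[
\begin{aligned}
\mathrm{Tr}(-\Delta\gamma) &= (2\pi)^{-3}\iint\Big(\int|\nabla\theta_{u,p}(x)|^2\,dx\Big) f(u,p)\,du\,dp \\
&= (2\pi)^{-3}\iint p^2 f(u,p)\,du\,dp + (2\pi)^{-3}\int(\nabla\chi_\ell)^2\iint f(u,p)\,du\,dp \\
&\le (2\pi)^{-3}\iint p^2 f(u,p)\,du\,dp + C(n^{2/5}\ell)^{-2}n^{7/5} \\
&= 2^{3/4}\pi^{-7/4}\,n^{7/5}\int_{\mathbb R^3}\Phi(u)^{5/2}\,du\int_{\mathbb R^3}p^2 g(p)\,dp + C(n^{2/5}\ell)^{-2}n^{7/5},
\end{aligned}
\tag{43}
\]
where in the second to last inequality we have used the definition (26) of $\phi_0$.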


The next step in calculating the energy expectation in the state $\Psi_n$ is to calculate (or rather estimate) $\mathrm{Tr}(K(\sqrt{\gamma(\gamma+1)} - \gamma))$. In order to do this we shall use the operator version of the Berezin-Lieb inequality given in (76) in Theorem A.1 in Appendix A. We will use it for the operator concave function $\xi(t) = \sqrt{t(t+1)} - t$ (see the discussion at the end of Appendix A) and the map $\omega \to |\omega\rangle$ being $(u,p) \to |\theta_{u,p}\rangle$. We have
\[
(2\pi)^{-3}\iint|\theta_{u,p}\rangle\langle\theta_{u,p}|\,du\,dp = I.
\]
Since $K$ is a positive operator we conclude from Theorem A.1 that
\[
\mathrm{Tr}\Big(K\big(\sqrt{\gamma(\gamma+1)} - \gamma\big)\Big)
\ge (2\pi)^{-3}\iint\Big(\sqrt{f(u,p)(f(u,p)+1)} - f(u,p)\Big)\langle\theta_{u,p}|K|\theta_{u,p}\rangle\,du\,dp.
\tag{44}
\]
Since $|x-y|^{-1}$ is a positive definite kernel we have for $\delta > 0$,
\[
\begin{aligned}
\langle\theta_{u,p}|K|\theta_{u,p}\rangle
&= \iint e^{ipx}\chi_\ell(x-u)\phi_0(x)\,|x-y|^{-1}\,e^{-ipy}\chi_\ell(y-u)\phi_0(y)\,dx\,dy \\
&\ge (1-C\delta)\,\phi_0(u)^2\iint e^{ipx}\chi_\ell(x-u)\,|x-y|^{-1}\,e^{-ipy}\chi_\ell(y-u)\,dx\,dy - C\delta^{-1}(n^{2/5}\ell)^4 n^{-3/5} \\
&\ge \phi_0(u)^2\iint e^{ipx}\chi_\ell(x)\,|x-y|^{-1}\,e^{-ipy}\chi_\ell(y)\,dx\,dy - C\delta(n^{2/5}\ell)^2 n^{-1/5} - C\delta^{-1}(n^{2/5}\ell)^4 n^{-3/5} \\
&\ge \phi_0(u)^2\int j(q)\frac{4\pi}{|p-q|^2}\,dq - C(n^{2/5}\ell)^3 n^{-2/5},
\end{aligned}
\tag{45}
\]

where $j(q) = (2\pi)^{-3}|\hat\chi_\ell(q)|^2 = \ell^3\pi^{-3}e^{-\ell^2q^2}$ (with the convention $\hat f(p) = \int e^{ipx}f(x)\,dx$ for the Fourier transform). In the last inequality we have chosen $\delta = (n^{2/5}\ell)\,n^{-1/5}$ and in the first inequality we have used that $|\phi_0(x)-\phi_0(u)| \le Cn^{1/2}|x-u|$ and hence
\[
\iint\chi_\ell(x-u)\,|\phi_0(x)-\phi_0(u)|\,|x-y|^{-1}\,\chi_\ell(y-u)\,|\phi_0(y)-\phi_0(u)|\,dx\,dy \le C(n^{2/5}\ell)^4 n^{-3/5}.
\]
We have that $\int j(q)\,dq = 1$. We will use the estimate
\[
\begin{aligned}
\big|\,|p|^{-2} - j*|p|^{-2}\big|
&\le |p|^{-2}\int j(q)\frac{|q|}{|p-q|}\,dq + |p|^{-1}\int j(q)\frac{|q|}{|p-q|^2}\,dq \\
&\le \sup_q\big(j(q)|q|^{7/2}\big)\Big(|p|^{-2}\int|q|^{-5/2}|p-q|^{-1}\,dq + |p|^{-1}\int|q|^{-5/2}|p-q|^{-2}\,dq\Big) \\
&\le C|p|^{-5/2}\sup_q\big(j(q)|q|^{7/2}\big).
\end{aligned}
\tag{46}
\]


For our explicit choice of $j$ we get $\big|\,|p|^{-2} - j*|p|^{-2}\big| \le C\ell^{-1/2}|p|^{-5/2}$. From (44), (45) and estimate (41) we find that
\[
\begin{aligned}
\mathrm{Tr}\Big(K\big(\sqrt{\gamma(\gamma+1)} - \gamma\big)\Big)
&\ge 2(2\pi)^{-2}\iint\Big(\sqrt{f(u,p)(f(u,p)+1)} - f(u,p)\Big)\phi_0(u)^2\,j*|p|^{-2}\,du\,dp - C(n^{2/5}\ell)^3 n^{1/5} \\
&\ge 2^{-1/4}\pi^{-7/4}\,n^{2/5}\iint\Big(\sqrt{g(p)(g(p)+1)} - g(p)\Big)\Phi(u)^{5/2}|p|^{-2}\,du\,dp \\
&\quad - C(n^{2/5}\ell)^{-1/2}n^{2/5} - C(n^{2/5}\ell)^3 n^{1/5},
\end{aligned}
\tag{47}
\]
where we have also used that $\iint\big(\sqrt{f(u,p)(f(u,p)+1)} - f(u,p)\big)\,du\,dp \le Cn^{3/5}$ (as in (41)). If we now insert the above estimate and (43) into (37) we arrive at

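\[
\begin{aligned}
\langle H^{(2)}\rangle \le{}& n^{7/5}\Big(\frac12\int_{\mathbb R^3}|\nabla\Phi(u)|^2\,du
+ 2^{-1/4}\pi^{-7/4}\int_{\mathbb R^3}\Phi(u)^{5/2}\,du \\
&\qquad\times\int_{\mathbb R^3}\Big[p^2 g(p) - |p|^{-2}\Big(\sqrt{g(p)(g(p)+1)} - g(p)\Big)\Big]dp\Big) \\
&+ Cn^{7/5}\Big((n^{2/5}\ell)^3 n^{-1/5} + (n^{2/5}\ell)^{-1/2}\Big).
\end{aligned}
\tag{48}
\]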

The function $g$ in (40) was chosen precisely so as to optimize the above expression. If we insert the expression for $g$ it is easily seen that the term in the large parenthesis above is
\[
\frac12\int_{\mathbb R^3}|\nabla\Phi(u)|^2\,du - I_0\int_{\mathbb R^3}\Phi(u)^{5/2}\,du.
\]
If we choose $\Phi$ to be an exact minimizer then this expression is $-A$ (recall that $A$ and $I_0$ were defined in Theorem 1.2). From the estimate in (48) we see that if we choose $\ell$ as a function of $n$ such that $n^{2/5}\ell = n^{2/35}$ then
\[
\langle H^{(2)}\rangle \le -An^{7/5}(1 - Cn^{-1/35}).
\tag{49}
\]
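Two claims in this step lend themselves to a quick sanity check: that $g$ from (40) minimizes the momentum integrand of (48) pointwise, and that the scaling $n^{2/5}\ell = n^{2/35}$ balances the two error terms at order $n^{-1/35}$. The sketch below is only a numerical illustration under these readings; the helper names are hypothetical, $\ell$ denotes the length parameter of the coherent states, and the grid search is a crude stand-in for the calculus argument.

```python
import math
from fractions import Fraction


def g(p):
    """Profile g(p) from (40), for p > 0."""
    p4 = p ** 4
    return 0.5 * ((p4 + 1.0) / (p * p * math.sqrt(p4 + 2.0)) - 1.0)


def h(t, p):
    """Integrand of (48) at momentum p, with the value g(p) replaced by a trial t >= 0."""
    return p * p * t - (math.sqrt(t * (t + 1.0)) - t) / (p * p)


def dh_dt(t, p):
    """Derivative of h with respect to t (valid for t > 0)."""
    return p * p + 1.0 / (p * p) - (2.0 * t + 1.0) / (2.0 * p * p * math.sqrt(t * (t + 1.0)))


def grid_minimizer(p, t_max=5.0, steps=50000):
    """Brute-force minimizer of t -> h(t, p) on a uniform grid in [0, t_max]."""
    best_t, best_val = 0.0, h(0.0, p)
    for k in range(1, steps + 1):
        t = t_max * k / steps
        val = h(t, p)
        if val < best_val:
            best_t, best_val = t, val
    return best_t


def error_exponents(x):
    """Powers of n in the two relative error terms of (48) when n^(2/5) * ell = n^x."""
    return 3 * x - Fraction(1, 5), -x / 2


if __name__ == "__main__":
    for p in (0.5, 1.0, 2.0):
        print(p, g(p), grid_minimizer(p), dh_dt(g(p), p))
    print(error_exponents(Fraction(2, 35)))
```

Since $t \mapsto -\sqrt{t(t+1)}$ is convex, $h(\cdot, p)$ is convex and the stationary point is the global minimizer, so the vanishing derivative at $t = g(p)$ is the essential check.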

Because of the estimate (42) this means that we have found a state satisfying (25). We could instead have chosen $\Phi$ to be a smooth compactly supported approximate minimizer to the variational problem (4). We would then for any $\varepsilon > 0$ have proved that $\lim_{n\to\infty} n^{-7/5}\langle H^{(2)}\rangle \le -A + \varepsilon$, which of course implies (25).

3.1. An upper bound for fixed particle number. In this section we shall prove the upper bound in Theorem 1.2 on the energy $E^{(2)}(N)$ corresponding to a fixed particle number $N$. Let $\Psi_{\varepsilon,n}$ for $n, \varepsilon > 0$ denote the state constructed in the previous section, but with the function $g$ in (40) replaced by the function $g_\varepsilon$, which is equal to $g$ for $|p| > \varepsilon$ and is zero otherwise. We will again denote the expectation of any operator $A$ in the state $\Psi_{\varepsilon,n}$ by $\langle A\rangle$. It then follows from the construction in the previous section that
\[
\lim_{n\to\infty} n^{-7/5}\langle H^{(2)}\rangle \le -A_\varepsilon,
\tag{50}
\]
where $A_\varepsilon \to A$ as $\varepsilon \to 0$.


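Let $\Psi^{(m)}_{\varepsilon,n}$ denote the projection of the state $\Psi_{\varepsilon,n}$ onto the subspace corresponding to particle number $m = 0, 1, \ldots$. We then have
\[
\langle N^2\rangle = \sum_{m=0}^{\infty} m^2\,\big\|\Psi^{(m)}_{\varepsilon,n}\big\|^2
= \Big\langle\Big(\sum_{e=\pm}\sum_{\alpha=1}^{\infty}a^*_{\alpha e}a_{\alpha e}\Big)^2\Big\rangle.
\]
Hence from (29) and (34),
\[
\langle N^2\rangle - \langle N\rangle^2
= \sum_{\alpha,\alpha'=1}^{\infty}\sum_{e,e'=\pm}\Big(\langle a^*_{\alpha e}a_{\alpha e}a^*_{\alpha' e'}a_{\alpha' e'}\rangle
- \langle a^*_{\alpha e}a_{\alpha e}\rangle\langle a^*_{\alpha' e'}a_{\alpha' e'}\rangle\Big)
= n + 2\,\mathrm{Tr}\,\gamma_\varepsilon(\gamma_\varepsilon+1),
\]
where $\gamma_\varepsilon$ is given as in (39), but with $f$ replaced by $f_\varepsilon$, which is expressed in terms of $g_\varepsilon$ instead of $g$. Thus using (75) in Theorem A.1 (or (76) for that matter) in the convex case, we see that
\[
\langle N^2\rangle - \langle N\rangle^2 \le n + 2(2\pi)^{-3}\iint f_\varepsilon(u,p)(f_\varepsilon(u,p)+1)\,du\,dp \le n + C_\varepsilon n^{3/5}.
\]
Here $C_\varepsilon > 0$ is a constant depending on $\varepsilon$ and such that $C_\varepsilon \to \infty$ as $\varepsilon \to 0$. It is at this point that it is necessary to replace $g$ with $g_\varepsilon$, since otherwise the above integral is not convergent. For any $M > 0$ we have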

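\[
\begin{aligned}
\sum_{|m-\langle N\rangle|>M} m^{7/5}\,\big\|\Psi^{(m)}_{\varepsilon,n}\big\|^2
&\le M^{-3/5}\sum_{m=0}^{\infty} m^{7/5}\,|m-\langle N\rangle|^{3/5}\,\big\|\Psi^{(m)}_{\varepsilon,n}\big\|^2 \\
&\le M^{-3/5}\langle N^2\rangle^{7/10}\big\langle(N-\langle N\rangle)^2\big\rangle^{3/10} \\
&= M^{-3/5}\langle N^2\rangle^{7/10}\big(\langle N^2\rangle - \langle N\rangle^2\big)^{3/10}
\le C_\varepsilon M^{-3/5}n^{17/10}.
\end{aligned}
\tag{51}
\]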
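The chain (51) combines a Chebyshev-type step with Hölder's inequality for the exponents $10/7$ and $10/3$. The sketch below checks both steps on a toy particle-number distribution; the truncated Poisson weights are an assumption made purely for illustration and are not the actual weights $\|\Psi^{(m)}_{\varepsilon,n}\|^2$. It also records the exponent bookkeeping obtained when $M$ is taken proportional to $N^{3/5}$ and $n \le N$, as in the application that follows.

```python
import math
from fractions import Fraction


def poisson_weights(mean, m_max):
    """Normalized Poisson weights on m = 0..m_max (illustrative stand-in distribution)."""
    weights = [math.exp(-mean)]
    for m in range(1, m_max + 1):
        weights.append(weights[-1] * mean / m)
    total = sum(weights)
    return [w / total for w in weights]


def tail_chain(weights, M):
    """Return the three successive quantities appearing in (51) for cutoff M."""
    mean = sum(m * w for m, w in enumerate(weights))
    second = sum(m * m * w for m, w in enumerate(weights))
    variance = sum((m - mean) ** 2 * w for m, w in enumerate(weights))
    tail = sum(m ** 1.4 * w for m, w in enumerate(weights) if abs(m - mean) > M)
    weighted = M ** -0.6 * sum(
        m ** 1.4 * abs(m - mean) ** 0.6 * w for m, w in enumerate(weights)
    )
    holder = M ** -0.6 * second ** 0.7 * variance ** 0.3
    return tail, weighted, holder


def final_exponent():
    """Power of N in M^(-3/5) * n^(17/10) when M ~ N^(3/5) and n ~ N."""
    return Fraction(-3, 5) * Fraction(3, 5) + Fraction(17, 10)


if __name__ == "__main__":
    w = poisson_weights(40.0, 200)
    for M in (1.0, 5.0, 10.0, 20.0):
        print(M, tail_chain(w, M))
    print(final_exponent())
```

The value returned by `final_exponent()` equals $7/5 - 3/50$, which is the power of $N$ in the error term of the final bound.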

Given a positive integer $N$, we choose $n = N - C_0N^{3/5}$. Then if $C_0 > 0$ is chosen appropriately we have according to (29) and (42) that the expected particle number satisfies
\[
N - C_1N^{3/5} \le \langle N\rangle \le N - C_2N^{3/5},
\]
for some $C_1, C_2 > 0$. Since $M \to E(M)$ is a non-increasing and non-positive function (adding particles will always lower the energy, since one may construct a trial state with the extra particles placed arbitrarily far away from the original particles) we have that
\[
\begin{aligned}
E^{(2)}(N) &\le \sum_{m\le N} E^{(2)}(m)\,\big\|\Psi^{(m)}_{\varepsilon,n}\big\|^2
\le \sum_{m=0}^{\infty} E^{(2)}(m)\,\big\|\Psi^{(m)}_{\varepsilon,n}\big\|^2
- \sum_{m>\langle N\rangle+C_2N^{3/5}} E^{(2)}(m)\,\big\|\Psi^{(m)}_{\varepsilon,n}\big\|^2 \\
&\le \langle H^{(2)}\rangle + \sum_{m>\langle N\rangle+C_2N^{3/5}} Cm^{7/5}\,\big\|\Psi^{(m)}_{\varepsilon,n}\big\|^2
\le \langle H^{(2)}\rangle + C_\varepsilon N^{7/5-3/50},
\end{aligned}
\]


where we have used the lower bound $E^{(2)}(m) \ge -Cm^{7/5}$ (see [5] or [11]) and the estimate (51). Thus we finally get the upper bound in Theorem 1.2,
\[
\limsup_{N\to\infty} N^{-7/5}E^{(2)}(N) \le \lim_{\varepsilon\to0}\limsup_{n\to\infty} n^{-7/5}\Big(\langle H^{(2)}\rangle + C_\varepsilon N^{7/5-3/50}\Big) = -A,
\]
according to (50).

4. The One-Component Charged Bose Gas

Since the thermodynamic ground state energy $e(\rho)$ of the one-component charged Bose gas may be calculated by minimizing over all particle numbers we may again consider the grand canonical ensemble. Thus we are looking for an upper bound to the ground state energy of the Hamiltonian $H^{(1)} = \bigoplus_{N=0}^{\infty}H^{(1)}_N$ acting on the Bosonic Fock space $\mathcal F(L^2(\Lambda))$. To construct a grand canonical trial function we begin by choosing a real normalized function $\phi_0 \in L^2(\Lambda)$. Let $\eta \in C_0^1(0,L)$ be a non-negative function compactly supported in $(0,L)$ and such that $\int_0^\infty\eta(t)^2\,dt = 1$. Moreover, assume that $\eta(t)$ is a constant for $t \in [r, L-r]$ for some $0 < r < L/4$ to be chosen below. We will write this constant as $(\rho/n)^{1/6}$, for some $n > 0$. In fact, we shall choose $r$ independently of $L$ (for large $L$). We also assume that $\eta(t) \le (\rho/n)^{1/6}$. We then define
\[
\phi_0(x,y,z) = \eta(x)\eta(y)\eta(z).
\tag{52}
\]
Thus $\phi_0$ is equal to a constant $\sqrt{\rho/n}$ on the cube $[r,L-r]^3$ and $0 \le \phi_0(x) \le \sqrt{\rho/n}$ for all $x \in \Lambda$. Since η is normalized so is $\phi_0$ and $\rho(L-2r)^3 \le n \le \rho L^3$. Thus the constant $n$ is almost the number of particles required to have a neutral system. We have
\[
|\eta(t)| \le CL^{-1/2} \quad\text{and}\quad |\phi_0(x)| \le CL^{-3/2}
\tag{53}
\]

and we may assume that the derivatives satisfy
\[
|\eta'(t)| \le Cr^{-1}L^{-1/2} \quad\text{and hence}\quad |\nabla\phi_0(x)| \le Cr^{-1}L^{-3/2}.
\tag{54}
\]
In particular, we have
\[
\int|\nabla\phi_0(x)|^2\,dx \le C(rL)^{-1}.
\tag{55}
\]
Observe that we also have that
\[
\iint(n\phi_0(x)^2-\rho)\,|x-y|^{-1}\,(n\phi_0(y)^2-\rho)\,dx\,dy \le C\rho^2L^3r^2.
\tag{56}
\]
We choose our grand canonical trial function $\Psi_n$ as in (22). The condensate vector is
\[
\phi = z_0\phi_0,
\tag{57}
\]
where the parameter $z_0 > 0$ will be chosen below. The operator $\gamma_1 = \gamma$ (we omit the subscript 1 because we shall use a subscript ε below with a different meaning) will be chosen to be a positive semi-definite trace class operator with real eigenfunctions. The eigenfunctions (corresponding to non-zero eigenvalues) should satisfy Dirichlet


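boundary conditions on the boundary of $\Lambda$. Let $\psi_\alpha$, $\alpha = 1, \ldots$ be an orthonormal basis of real eigenfunctions for γ. We use the notation $a^*_\alpha = a^*(\psi_\alpha)$. As usual we denote the expectation of an operator $A$ in the state $\Psi_n$ by $\langle A\rangle$. As in (30) we see from (5) and (23),
\[
\Big\langle\Psi_n, \bigoplus_{N=0}^{\infty}\sum_{i=1}^{N}-\tfrac12\Delta_i\,\Psi_n\Big\rangle
= \frac{z_0^2}{2}\int|\nabla\phi_0|^2 + \mathrm{Tr}\big(-\tfrac12\Delta\gamma\big)
\le Cz_0^2(rL)^{-1} + \mathrm{Tr}\big(-\tfrac12\Delta\gamma\big),
\tag{58}
\]
where in the last inequality we have used (55). We likewise get
\[
\Big\langle\Psi_n, \bigoplus_{N=0}^{\infty}\sum_{i=1}^{N}V(x_i)\,\Psi_n\Big\rangle
= \int V(x)\phi(x)^2\,dx + \int V(x)\rho_\gamma(x)\,dx
= -\rho\iint_{\Lambda\times\Lambda}\frac{z_0^2\phi_0(y)^2+\rho_\gamma(y)}{|x-y|}\,dx\,dy,
\tag{59}
\]
where $\rho_\gamma(x) = \gamma(x,x)$ is the density of the operator γ. From (6) we have (as in (31)), with $w_{\alpha\beta\nu\mu}$ given exactly as in (32),
\[
\Big\langle\Psi_n, \bigoplus_{N=0}^{\infty}\sum_{1\le i<j\le N}|x_i-x_j|^{-1}\,\Psi_n\Big\rangle
= \frac12\sum_{\alpha,\beta,\mu,\nu=1}^{\infty}w_{\alpha\beta\nu\mu}\langle a^*_\alpha a^*_\beta a_\mu a_\nu\rangle.
\]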

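We then obtain from (24) that
\[
\begin{aligned}
\Big\langle\Psi_n, \bigoplus_{N=0}^{\infty}\sum_{1\le i<j\le N}|x_i-x_j|^{-1}\,\Psi_n\Big\rangle
={}& \frac{z_0^4}{2}\iint_{\Lambda\times\Lambda}\phi_0(x)^2|x-y|^{-1}\phi_0(y)^2\,dx\,dy
+ z_0^2\,\mathrm{Tr}\Big(K\big(\gamma-\sqrt{\gamma(\gamma+1)}\big)\Big) \\
&+ z_0^2\iint_{\Lambda\times\Lambda}\phi_0(x)^2|x-y|^{-1}\rho_\gamma(y)\,dx\,dy
+ \frac12\iint_{\Lambda\times\Lambda}\frac{|\gamma(x,y)|^2}{|x-y|}\,dx\,dy \\
&+ \frac12\iint_{\Lambda\times\Lambda}\frac{|\sqrt{\gamma(\gamma+1)}(x,y)|^2}{|x-y|}\,dx\,dy
+ \frac12\iint_{\Lambda\times\Lambda}\rho_\gamma(x)|x-y|^{-1}\rho_\gamma(y)\,dx\,dy,
\end{aligned}
\tag{60}
\]
where the operator $K$ is given as in (36). Putting together (58), (59), and (60) we arrive at
\[
\begin{aligned}
\langle H^{(1)}\rangle \le{}& Cz_0^2(rL)^{-1}
+ \frac12\iint\frac{|\gamma(x,y)|^2}{|x-y|}\,dx\,dy
+ \frac12\iint\frac{|\sqrt{\gamma(\gamma+1)}(x,y)|^2}{|x-y|}\,dx\,dy \\
&+ \frac12\iint_{\Lambda\times\Lambda}\big(\rho-\rho_\gamma(x)-z_0^2\phi_0(x)^2\big)|x-y|^{-1}\big(\rho-\rho_\gamma(y)-z_0^2\phi_0(y)^2\big)\,dx\,dy \\
&+ \mathrm{Tr}\big(-\tfrac12\Delta\gamma\big) + z_0^2\,\mathrm{Tr}\Big(K\big(\gamma-\sqrt{\gamma(\gamma+1)}\big)\Big).
\end{aligned}
\tag{61}
\]
We now choose
\[
\gamma = \gamma_\varepsilon = (2\pi)^{-3}\int_{\mathbb R^3}g_\varepsilon\Big(\frac{p}{(8\pi\rho)^{1/4}}\Big)|\theta_p\rangle\langle\theta_p|\,dp,
\tag{62}
\]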


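where the function $g_\varepsilon(p) = 0$ for $|p| \le \varepsilon$ and $g_\varepsilon(p) = g(p)$ for $|p| > \varepsilon$, where $g$ is defined in (40), and
\[
\theta_p(x) = \sqrt{n\rho^{-1}}\,\exp(ipx)\,\phi_0(x).
\tag{63}
\]
Recall that $n\rho^{-1}\phi_0(x)^2 \le 1$ and is equal to 1 on most of $\Lambda$. We see that the map $p \to |\theta_p\rangle$ satisfies the requirements of the map $\omega \to |\omega\rangle$ in Theorem A.1 with measure $d\mu(\omega) = (2\pi)^{-3}dp$. That $\gamma_\varepsilon$ satisfies the necessary requirements follows as before. It is clear that the eigenfunctions of $\gamma_\varepsilon$ with non-zero eigenvalues have compact support in $(0,L)^3$. We calculate the density of $\gamma_\varepsilon$,
\[
\begin{aligned}
\rho_{\gamma_\varepsilon}(x) &= (2\pi)^{-3}\int_{\mathbb R^3}g_\varepsilon\Big(\frac{p}{(8\pi\rho)^{1/4}}\Big)|\theta_p(x)|^2\,dp
= (2\pi)^{-3}n\rho^{-1}\phi_0(x)^2\int_{\mathbb R^3}g_\varepsilon\Big(\frac{p}{(8\pi\rho)^{1/4}}\Big)dp \\
&= n\rho^{-1/4}2^{-3/4}\pi^{-9/4}\phi_0(x)^2\int g_\varepsilon(p)\,dp.
\end{aligned}
\tag{64}
\]
We finally choose $z_0 > 0$ such that
\[
z_0^2 = n\Big(1 - 2^{-3/4}\rho^{-1/4}\pi^{-9/4}\int g_\varepsilon(p)\,dp\Big)
\tag{65}
\]
(for ρ large enough). Then $z_0^2\phi_0(x)^2 + \rho_{\gamma_\varepsilon}(x) = n\phi_0(x)^2$. It follows from (56) and the fact that $\phi_0(x)^2 \le \rho/n$ that
\[
\iint_{\Lambda\times\Lambda}\big(\rho-\rho_{\gamma_\varepsilon}(x)-z_0^2\phi_0(x)^2\big)|x-y|^{-1}\big(\rho-\rho_{\gamma_\varepsilon}(y)-z_0^2\phi_0(y)^2\big)\,dx\,dy \le C\rho^2L^3r^2.
\]
To estimate the second term in (61) we will use Hardy’s inequality $\int|\nabla u(x)|^2\,dx \ge \frac14\int\frac{|u(x)|^2}{|x|^2}\,dx$ as follows:
\[
\begin{aligned}
\iint\frac{|\gamma_\varepsilon(x,y)|^2}{|x-y|}\,dx\,dy
&\le \Big(\iint|\gamma_\varepsilon(x,y)|^2\,dx\,dy\Big)^{1/2}\Big(\iint\frac{|\gamma_\varepsilon(x,y)|^2}{|x-y|^2}\,dx\,dy\Big)^{1/2} \\
&\le 2\Big(\iint|\gamma_\varepsilon(x,y)|^2\,dx\,dy\Big)^{1/2}\Big(\iint|\nabla_x\gamma_\varepsilon(x,y)|^2\,dx\,dy\Big)^{1/2} \\
&= 2\big(\mathrm{Tr}\,\gamma_\varepsilon^2\big)^{1/2}\big(\mathrm{Tr}(-\Delta\gamma_\varepsilon^2)\big)^{1/2}.
\end{aligned}
\tag{66}
\]
Since $x \to x^2$ is operator convex we may estimate these terms using the Berezin-Lieb inequality (76) in the convex case, but we may alternatively simply use the norm bound $\|\gamma_\varepsilon\| \le C\varepsilon^{-2}$. Hence
\[
\iint\frac{|\gamma_\varepsilon(x,y)|^2}{|x-y|}\,dx\,dy
\le C\varepsilon^{-2}\Big(\int\rho_{\gamma_\varepsilon}(x)\,dx\Big)^{1/2}\big(\mathrm{Tr}(-\Delta\gamma_\varepsilon)\big)^{1/2}
\le C\varepsilon^{-2}\rho L^3,
\tag{67}
\]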


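where we have used (64), $n \le \rho L^3$ and the fact which we shall prove below in (68), that $\mathrm{Tr}(-\Delta\gamma_\varepsilon) \le C\rho^{5/4}L^3$ (recall that we will choose $r$ independently of $L$). The third term in (61), which compared to the second term has $\gamma_\varepsilon$ replaced by $\sqrt{\gamma_\varepsilon(\gamma_\varepsilon+1)}$, is estimated in exactly the same way and with the same bound as the second term. We are now left with calculating the last two terms in (61). For the kinetic energy of $\gamma_\varepsilon$ we have as in (43),
\[
\begin{aligned}
\mathrm{Tr}(-\Delta\gamma_\varepsilon) &\le (2\pi)^{-3}\frac n\rho\int_{\mathbb R^3}g_\varepsilon\Big(\frac{p}{(8\pi\rho)^{1/4}}\Big)\Big(p^2+\int|\nabla\phi_0(x)|^2\,dx\Big)dp \\
&\le 2^{3/4}\pi^{-7/4}\rho^{5/4}L^3\int_{\mathbb R^3}p^2g_\varepsilon(p)\,dp + C\rho^{3/4}L^3(rL)^{-1/2},
\end{aligned}
\tag{68}
\]
where we have used (55) and $n \le \rho L^3$. For the last term in (61) we again, as in (44), appeal to the operator version (76) of the Berezin-Lieb inequalities. We arrive at
\[
\mathrm{Tr}\Big(K\big(\gamma_\varepsilon-\sqrt{\gamma_\varepsilon(\gamma_\varepsilon+1)}\big)\Big)
\le (2\pi)^{-3}\int_{\mathbb R^3}\Big(f_\varepsilon(p)-\sqrt{f_\varepsilon(p)(f_\varepsilon(p)+1)}\Big)\langle\theta_p|K\theta_p\rangle\,dp,
\tag{69}
\]
where $f_\varepsilon(p) = g_\varepsilon\big(p(8\pi\rho)^{-1/4}\big)$. We have as in (45)
\[
\langle\theta_p|K|\theta_p\rangle = 4\pi\,J*|p|^{-2},
\tag{70}
\]
where $J(p) = (2\pi)^{-3}n\rho^{-1}|\widehat{\phi_0^2}(p)|^2$. The special form (52) implies that $J(p_1,p_2,p_3) = j(p_1)j(p_2)j(p_3)$, where $j(\tau) = (2\pi)^{-1}n^{1/3}\rho^{-1/3}|\widehat{\eta^2}(\tau)|^2$. Since $\int j(\tau)\,d\tau = n^{1/3}\rho^{-1/3}\int\eta(t)^4\,dt$, $\int\eta^2 = 1$, and $0 \le \eta(t)^2 \le n^{-1/3}\rho^{1/3}$ and equal to this constant on $[r,L-r]$ we have that $1-2r/L \le \int j(\tau)\,d\tau \le 1$. This implies in particular that
\[
(1-2r/L)^3 \le \int J(p)\,dp \le 1.
\tag{71}
\]
By (53) and (54) and the support property of η we have $|\widehat{\eta^2}(\tau)| \le |\tau|^{-1}\int|(\eta^2)'(t)|\,dt \le C(|\tau|L)^{-1}$. Thus $j(\tau) \le CL(|\tau|L)^{-2}$. Hence
\[
\int_{|q|>L^{-1/2}}J(q)\,dq \le 3\int_{|\tau|>(3L)^{-1/2}}j(\tau)\,d\tau \le CL^{-1/2}.
\tag{72}
\]
For $|p| > \varepsilon(8\pi\rho)^{1/4}$ and $|q| \le L^{-1/2}$ we have $|p-q| \le (1+C\rho^{-1/4}\varepsilon^{-1}L^{-1/2})|p|$ and hence from (71) and (72),
\[
\begin{aligned}
J*|p|^{-2} &\ge (1+C\rho^{-1/4}\varepsilon^{-1}L^{-1/2})^{-2}|p|^{-2}\int_{|q|<L^{-1/2}}J(q)\,dq \\
&\ge (1+C\rho^{-1/4}\varepsilon^{-1}L^{-1/2})^{-2}\big((1-2r/L)^3 - CL^{-1/2}\big)|p|^{-2} \\
&\ge \big(1-C(\rho^{-1/4}\varepsilon^{-1}L^{-1/2}+rL^{-1}+L^{-1/2})\big)|p|^{-2}.
\end{aligned}
\tag{73}
\]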


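Inserting this into (70) and then into (69) we arrive at
\[
\begin{aligned}
\mathrm{Tr}\Big(K\big(\gamma_\varepsilon-\sqrt{\gamma_\varepsilon(\gamma_\varepsilon+1)}\big)\Big)
\le{}& 2^{-1/4}\rho^{1/4}\pi^{-7/4}\int\Big(g_\varepsilon(p)-\sqrt{g_\varepsilon(p)(g_\varepsilon(p)+1)}\Big)|p|^{-2}\,dp \\
&+ C\big(\varepsilon^{-1}L^{-1/2}+\rho^{1/4}rL^{-1}+\rho^{1/4}L^{-1/2}\big).
\end{aligned}
\tag{74}
\]
If we now insert the above estimate, (65), (66), (67), (68), and the same estimate for $\gamma_\varepsilon$ replaced by $\sqrt{\gamma_\varepsilon(\gamma_\varepsilon+1)}$ into (61) we see that
\[
\limsup_{L\to\infty}L^{-3}\langle H^{(1)}\rangle
\le \rho^{5/4}2^{-1/4}\pi^{-7/4}\int\Big(|p|^2g_\varepsilon(p)+g_\varepsilon(p)|p|^{-2}-\sqrt{g_\varepsilon(p)(g_\varepsilon(p)+1)}\,|p|^{-2}\Big)dp
+ C\rho(1+\rho r^2+\varepsilon^{-2}).
\]
Here we may actually let $r \to 0$ (which really means that we could have chosen $r$ as a negative power of $L$). If we recall the behavior of $g(p)$ for small $|p|$ from (40) we find that the error in replacing $g_\varepsilon$ by $g$ is of order $\rho^{5/4}\varepsilon$. Thus by choosing $\varepsilon = \rho^{-1/12}$ we obtain the final result
\[
e(\rho) \le \limsup_{L\to\infty}L^{-3}\langle H^{(1)}\rangle \le -I_0\rho^{5/4}(1-C\rho^{-1/12}).
\]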

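A. The Berezin-Lieb Inequality

In this appendix we shall prove variants of the Berezin-Lieb inequalities [2, 8].

Theorem A.1 (Berezin-Lieb inequalities). Let $\mathcal H$ be a Hilbert space and $\Omega$ a measure space with a (positive) measure µ such that there exists a map $\omega \to |\omega\rangle \in \mathcal H$, satisfying $\int_\Omega|\omega\rangle\langle\omega|\,d\mu(\omega) \le I$ as operators. Assume $\xi : \mathbb R_+\cup\{0\} \to \mathbb R$ is a concave function with $\xi(0) \ge 0$. Then for any non-negative function $f$ on $\Omega$ satisfying $\int f(\omega)\langle\omega|\omega\rangle\,d\mu(\omega) < \infty$ we have the Berezin-Lieb inequality
\[
\mathrm{Tr}_{\mathcal H}\,\xi\Big(\int f(\omega)|\omega\rangle\langle\omega|\,d\mu(\omega)\Big) \ge \int\xi(f(\omega))\langle\omega|\omega\rangle\,d\mu(\omega).
\tag{75}
\]
If moreover ξ is operator concave (still satisfying $\xi(0) \ge 0$) the inequality holds as an operator inequality
\[
\xi\Big(\int f(\omega)|\omega\rangle\langle\omega|\,d\mu(\omega)\Big) \ge \int\xi(f(\omega))|\omega\rangle\langle\omega|\,d\mu(\omega).
\tag{76}
\]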
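A finite-dimensional toy case can make (75) concrete. The sketch below is an illustration under assumptions of our own choosing: the Hilbert space is $\mathbb R^2$, the family $|\omega\rangle$ consists of three unit vectors at angles 0°, 60° and 120°, each with weight 2/3 (so that the weighted projections sum to the identity), and all function names are hypothetical. It evaluates both sides of (75) for the concave function $\xi(t) = \sqrt{t(t+1)} - t$ used in Sect. 3, and for the convex function $t(t+1)$, for which the inequality is reversed.

```python
import math


def xi_concave(t):
    """Concave function with xi(0) = 0, as used with (76) in Sect. 3."""
    return math.sqrt(t * (t + 1.0)) - t


def xi_convex(t):
    """Convex function t(t+1); the Berezin-Lieb inequality is reversed for it."""
    return t * (t + 1.0)


def frame():
    """Three unit vectors in R^2 with weight 2/3 each; they resolve the identity."""
    return [((math.cos(k * math.pi / 3.0), math.sin(k * math.pi / 3.0)), 2.0 / 3.0)
            for k in range(3)]


def both_sides(f_values, xi):
    """Return (Tr xi(sum_k w_k f_k |v_k><v_k|), sum_k w_k xi(f_k)) for the toy frame."""
    a = b = c = 0.0  # entries of the symmetric 2x2 matrix [[a, b], [b, c]]
    rhs = 0.0
    for ((x, y), w), fk in zip(frame(), f_values):
        a += w * fk * x * x
        b += w * fk * x * y
        c += w * fk * y * y
        rhs += w * xi(fk)  # <v_k|v_k> = 1 for unit vectors
    mid = 0.5 * (a + c)
    rad = math.hypot(0.5 * (a - c), b)
    lam_hi = mid + rad
    lam_lo = max(mid - rad, 0.0)  # guard against rounding below zero
    return xi(lam_hi) + xi(lam_lo), rhs


if __name__ == "__main__":
    for f_values in ((1.0, 0.0, 0.0), (0.2, 3.0, 1.5), (1.0, 1.0, 1.0)):
        print(f_values, both_sides(f_values, xi_concave), both_sides(f_values, xi_convex))
```

When all values of $f$ coincide the operator is a multiple of the identity and both sides agree, which is a useful consistency check on the weights.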


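Proof. We first note that $\int f(\omega)|\omega\rangle\langle\omega|\,d\mu(\omega)$ is a positive semi-definite trace class operator. Let $u_1, u_2, \ldots$ be an orthonormal basis of eigenvectors for this operator. Then
\[
\begin{aligned}
\mathrm{Tr}_{\mathcal H}\,\xi\Big(\int f(\omega)|\omega\rangle\langle\omega|\,d\mu(\omega)\Big)
&= \sum_{i=1}^{\infty}\xi\Big(\int f(\omega)|\langle\omega|u_i\rangle|^2\,d\mu(\omega)\Big) \\
&\ge \sum_{i=1}^{\infty}\int|\langle\omega|u_i\rangle|^2\,d\mu(\omega)\;
\xi\Big(\Big(\int|\langle\omega|u_i\rangle|^2\,d\mu(\omega)\Big)^{-1}\int f(\omega)|\langle\omega|u_i\rangle|^2\,d\mu(\omega)\Big),
\end{aligned}
\]
where we have used that $\int|\langle\omega|u_i\rangle|^2\,d\mu(\omega) \le 1$ and that since ξ is concave with $\xi(0) \ge 0$ we have $\xi(at) \ge a\xi(t)$ for all $t \ge 0$ and $0 < a < 1$. If we now use Jensen’s inequality we arrive at
\[
\mathrm{Tr}_{\mathcal H}\,\xi\Big(\int f(\omega)|\omega\rangle\langle\omega|\,d\mu(\omega)\Big)
\ge \sum_{i=1}^{\infty}\int\xi(f(\omega))|\langle\omega|u_i\rangle|^2\,d\mu(\omega)
= \int\xi(f(\omega))\langle\omega|\omega\rangle\,d\mu(\omega).
\]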

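We turn to the case when ξ is operator concave. Define the operator $U : \mathcal H \to L^2(\Omega, d\mu)$ by $(U\phi)(\omega) = \langle\omega|\phi\rangle$. Then $U^*h = \int h(\omega)|\omega\rangle\,d\mu(\omega)$. Thus if $B$ is the multiplication operator on $L^2(\Omega, d\mu)$ given by $Bh(\omega) = f(\omega)h(\omega)$ we have $U^*BU = \int f(\omega)|\omega\rangle\langle\omega|\,d\mu(\omega)$. In particular, we have the operator inequalities $0 \le U^*U \le I$. Using that $(1-UU^*)^{1/2}U = U(1-U^*U)^{1/2}$ it is straightforward to check that the following operators on $\mathcal H\oplus L^2(\Omega, d\mu)$ (written in matrix notation) are unitary:
\[
\mathcal U = \begin{pmatrix}(I-U^*U)^{1/2} & -U^* \\ U & (I-UU^*)^{1/2}\end{pmatrix},
\qquad
\mathcal V = \begin{pmatrix}(I-U^*U)^{1/2} & U^* \\ U & -(I-UU^*)^{1/2}\end{pmatrix}.
\]
Moreover we have that
\[
\frac12\,\mathcal U^*\begin{pmatrix}0 & 0 \\ 0 & B\end{pmatrix}\mathcal U
+ \frac12\,\mathcal V^*\begin{pmatrix}0 & 0 \\ 0 & B\end{pmatrix}\mathcal V
= \begin{pmatrix}U^*BU & 0 \\ 0 & (1-UU^*)^{1/2}B(1-UU^*)^{1/2}\end{pmatrix}.
\]
Since ξ is operator concave and $\mathcal U$ and $\mathcal V$ are unitary we find that
\[
\begin{aligned}
\begin{pmatrix}\xi(U^*BU) & 0 \\ 0 & \xi\big((1-UU^*)^{1/2}B(1-UU^*)^{1/2}\big)\end{pmatrix}
&\ge \frac12\,\mathcal U^*\begin{pmatrix}0 & 0 \\ 0 & \xi(B)\end{pmatrix}\mathcal U
+ \frac12\,\mathcal V^*\begin{pmatrix}0 & 0 \\ 0 & \xi(B)\end{pmatrix}\mathcal V \\
&= \begin{pmatrix}U^*\xi(B)U & 0 \\ 0 & (1-UU^*)^{1/2}\xi(B)(1-UU^*)^{1/2}\end{pmatrix}.
\end{aligned}
\]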
818

J.P. Solovej

In particular, this gives $\xi(U^*BU) \ge U^*\xi(B)U$, which is precisely the operator Berezin-Lieb inequality (76).

In order to determine whether a given function is operator concave we may use Nevanlinna’s Theorem (see [3] Theorems V.4.11 and V.4.14 and Eq. (V.49)). According to this a real function ξ defined on the positive real axis with an analytic extension to $\mathbb C\setminus\{x\in\mathbb R \mid x \le 0\}$, which maps the upper half plane into itself, has a representation of the form
\[
\xi(t) = \alpha + \beta t + \int_0^\infty\Big(\frac{\lambda}{\lambda^2+1} - \frac{1}{\lambda+t}\Big)d\nu(\lambda),
\]
where $\beta \ge 0$ and where ν is a positive measure satisfying $\int_0^\infty\frac{1}{1+\lambda^2}\,d\nu(\lambda) < \infty$. Since $t \to -(t+\lambda)^{-1}$ is operator concave the same is true for functions with the above integral representation. As a special case we see that the function $\xi(t) = \sqrt{t(t+1)}$, which is analytic away from the segment $[-1,0]$, is operator concave.

Acknowledgements. I would like to thank Elliott Lieb, Kumar Raman, and Robert Seiringer for valuable discussions.

References
1. Bach, V., Lieb, E.H., Solovej, J.P.: Generalized Hartree-Fock theory and the Hubbard model. J. Stat. Phys. 76, 3–90 (1994)
2. Berezin, F.A.: Izv. Akad. Nauk, ser. mat. 36(No. 5) (1972); English translation: USSR Izv. 6(No. 5) (1972); Berezin, F.A.: General concept of quantization. Commun. Math. Phys. 40, 153–174 (1975)
3. Bhatia, R.: Matrix Analysis, Graduate Texts in Mathematics, Vol. 169. New York: Springer-Verlag, 1997
4. Bogolubov, N.N.: J. Phys. (U.S.S.R.) 11, 23 (1947); Bogolubov, N.N., Zubarev, D.N.: Sov. Phys.-JETP 1, 83 (1955)
5. Conlon, J., Lieb, E.H., Yau, H.-T.: The N^{7/5} Law for Charged Bosons. Commun. Math. Phys. 116, 417–448 (1988)
6. Dyson, F.J.: Ground State Energy of a Finite System of Charged Particles. J. Math. Phys. 8, 1538–1545 (1967)
7. Foldy, L.L.: Charged Boson Gas. Phys. Rev. 124, 649–651 (1961); Errata ibid 125, 2208 (1962)
8. Lieb, E.H.: The classical limit of quantum spin systems. Commun. Math. Phys. 31, 327–340 (1973)
9. Lieb, E.H., Narnhofer, H.: The Thermodynamic Limit for Jellium. J. Stat. Phys. 12, 291–310 (1975); Errata 14, 465 (1976)
10. Lieb, E.H., Solovej, J.P.: Ground State Energy of the One-Component Charged Bose Gas. Commun. Math. Phys. 217, 127–163 (2001). Errata 225, 219–221 (2002)
11. Lieb, E.H., Solovej, J.P.: Ground State Energy of the Two-Component Charged Bose Gas. Commun. Math. Phys. 252, 485–534 (2004)
12. Robinson, D.W.: The ground state of the Bose gas. Commun. Math. Phys. 1, 159–174 (1965)

Communicated by M. Aizenman

Commun. Math. Phys. 266, 819–862 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0045-x

Communications in

Mathematical Physics

Continuum Limit of the Volterra Model, Separation of Variables and Non-Standard Realizations of the Virasoro Poisson Bracket O. Babelon Laboratoire de Physique Théorique et Hautes Energies (LPTHE), Unité Mixte de Recherche (UMR 7589), Université Pierre et Marie Curie-Paris6; CNRS; Université Denis Diderot-Paris7, Tour 24-25, 5ème étage, Boite 126, 4 place Jussieu, 75252 Paris Cedex 05, France. E-mail: [email protected] Received: 24 October 2005 / Accepted: 7 March 2006 Published online: 15 June 2006 – © Springer-Verlag 2006

Abstract: The classical Volterra model, equipped with the Faddeev-Takhtajan Poisson bracket provides a lattice version of the Virasoro algebra. The Volterra model being integrable, we can express the dynamical variables in terms of the so-called separated variables. Taking the continuum limit of these formulae, we obtain the Virasoro generators written as determinants of infinite matrices, the elements of which are constructed with a set of points lying on an infinite genus Riemann surface. The coordinates of these points are separated variables for an infinite set of Poisson commuting quantities including L 0 . The scaling limit of the eigenvector can also be calculated explicitly, so that the associated Schroedinger equation is in fact exactly solvable. 1. Introduction The relation between integrable systems and conformal field theory has long been recognized [1, 2]. Although the emphasis has been put rightfully on Baxter Q operator and therefore on Sklyanin’s separated variables [3], to the best of our knowledge there are no explicit expressions of the Virasoro generators in terms of these variables. We make here a first step in this direction by considering the classical version of this problem. Our strategy will be to start with the Volterra model on the lattice [4, 6] equipped with the Faddeev-Takhtajan [7, 8] Poisson bracket. Since the Volterra model is integrable, we can rewrite everything in terms of separated variables. Now, the FaddeevTakhtajan bracket goes directly to the Virasoro Poisson bracket in the continuum limit, and therefore by taking this limit in the separated variables formulae we will obtain the Virasoro generators expressed in terms of separated variables. This leads to the following rather new type of formula for the Virasoro generators: u(x) = L n e2inπ x = p02 + 2∂x2 log det (x) + (L 0 − p02 )δ(x). Here p0 is the zero mode and Poisson commutes with everything, the term (L 0 − p02 )δ(x)


will be explained later and the formula for L 0 is given in Eq. (66). The infinite matrix (x) reads (k, m ∈ {1, · · · , ∞}): km (x) =

Wk (x)∂x E m (x) − ∂x Wk (x)E m (x) , 0≤x ≤1 Z k2 − m 2 π 2

(1)

with Wk (x) =

sin Z k (1 − x) sin Z k x + µk , Zk Zk

E m (x) = 2mπ sin mπ x.

(2)

The above formula for u(x) is valid on the interval 0 ≤ x ≤ 1, and should be extended outside this interval by periodicity (in particular the δ(x) term is a Dirac comb). The result of this paper is that if the variables $Z_k$, $\mu_k$ have Poisson bracket (see footnote 1)
\[
\{Z_k, Z_{k'}\} = 0, \qquad \{Z_k, \mu_{k'}\} = 2(Z_k - p_0^2Z_k^{-1})\mu_k\delta_{kk'}, \qquad \{\mu_k, \mu_{k'}\} = 0
\tag{3}
\]
then u(x) does satisfy the Virasoro Poisson bracket:
\[
\{u(x), u(y)\} = 4(u(x)+u(y))\,\delta'(x-y) + 2\delta'''(x-y).
\tag{4}
\]

Morever, the variables Z k , µk are separated variables for an infinite set of higher commuting quantities, including L 0 . Since the separated variables are also the ones which solve the classical inverse problem, the Schroedinger equation with the potential u(x) (−∂x2 − u(x))ψ(x, ) = 2 ψ(x, )) is exactly solvable, meaning that we have explicit formulae for both the potential u(x) and a basis of solutions ψ(x, ). Constructing the linear combination which is quasi periodic (the so called Bloch waves) introduces an infinite genus Riemann surface. The coefficients in the expression of this curve define a complete set of Poisson commuting Hamiltonians including L 0 . The separated variables are points on this curve. The paper is organized as follows. In the first three sections we recall some known facts about the Volterra model on the lattice. In particular we recall the formulae expressing the dynamical degrees of freedom in terms of the separated variables. In Sect. 5 we compute the continuum (scaling) limit of the spectral curve. The result is Eq. (41). We then show that the Hamiltonians Hm in this formula are in involution. Moreover we show that the scaling limit of the dynamical divisor still belongs to that curve, and hence define separated variables for these Hamiltonians. In Sects. 6, 7 and 8, we compute the scaling limit of the eigenvector of the Lax matrix at each point of the spectral curve. The result is rather simple and is given in Eq. (50). We then show that the obtained expression does satisfy a second order Schroedinger equation and we compute its potential u(x). Finally, we construct the two quasi periodic solutions of that equation, the Bloch waves, and recover in this way exactly the same spectral curve as the one obtained in Sect. 5. In Sect. 9, we give conditions under which the determinants of the infinite matrices that appeared in the previous sections exist. We then perform a few checks in a certain perturbative scheme. In Sect. 
10 we prepare the ground for the serious calculations coming next.

Z k2 − p02 , the Poisson bracket becomes a standard quadratic bracket {k , µk } = 2k µk . However p0 will then enter the formula for (x) and, in this work, we prefer to keep that formula simple at the expense of a slightly more complicated Poisson bracket. 1 Notice that if we redefine = k


In Sects. 11 and 12 we prove that the potential $u(x)$ does satisfy the Virasoro Poisson bracket. An essential use is made of certain quartic relations, proven very much like the Hirota–Sato bilinear identities. These identities should be considered as generalizations for $\tau$-functions of the quartic relations on Riemann's Theta functions.

2. The Volterra Model

In this and the following two sections we recall some well known facts about the Volterra model. The Volterra model, as an integrable system, was introduced in [4]. It is a restricted version of the Toda lattice. We consider a periodic lattice with $N+1$ sites, and on each lattice site we attach a dynamical variable $a_i$ on which we impose the Faddeev-Takhtajan Poisson bracket [7]:
\[ \{a_i, a_j\} = a_i a_j \left[ (4 - a_i - a_j)(\delta_{i,j+1} - \delta_{j,i+1}) - a_{j+1}\delta_{i,j+2} + a_{i+1}\delta_{j,i+2} \right]. \tag{5} \]
This bracket$^2$ is interesting because taking the continuum limit as
\[ a_i \simeq 1 + \Delta^2 u(x), \qquad \Delta = \frac{1}{N+1}, \]
it becomes the Virasoro Poisson bracket Eq. (4). For precisely this reason, and in this perspective, the lattice model has been extensively studied both at the classical level [7, 8] and at the quantum level [9–11, 13, 12]. The present paper is one more contribution to this series of works. The Lax matrix for the Volterra model is defined by:
\[
L(\mu) = \begin{pmatrix}
0 & \sqrt{a_1} & 0 & \cdots & \mu^{-1}\sqrt{a_{N+1}} \\
\sqrt{a_1} & 0 & \sqrt{a_2} & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \sqrt{a_{i-1}} & 0 & \sqrt{a_i} & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & \sqrt{a_{N-1}} & 0 & \sqrt{a_N} \\
\mu\sqrt{a_{N+1}} & \cdots & 0 & \sqrt{a_N} & 0
\end{pmatrix}. \tag{6}
\]
It is well known that $\mathrm{Tr}\,L^n(\mu)$ are in involution with respect to the Poisson bracket Eq. (5). Hence we have an integrable system on the lattice whose continuum limit is directly related to conformal field theory. The spectral curve is defined as usual:
\[ \Gamma : \quad \det(L(\mu) - \lambda) = 0. \tag{7} \]

Expanding the determinant we see that it is of the form:
\[ \mu + \mu^{-1} - t(\lambda) = 0, \tag{8} \]
where $t(\lambda)$ is a polynomial of degree $N+1$,
\[ t(\lambda) = A^{-1}\lambda^{N+1} - A^{-1}\Big(\sum_i a_i\Big)\lambda^{N-1} + \cdots, \tag{9} \]

2 In terms of Toda Hamiltonian structures, it is a linear combination of restrictions of the second and fourth Poisson brackets.


O. Babelon

where $A = \sqrt{a_1 a_2 \cdots a_{N+1}}$.
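The structure of Eqs. (8) and (9) can be checked numerically on a small lattice. The following is only a sketch under stated assumptions: the helper names `lax_matrix`, `det` and `curve_polynomial` are hypothetical, the matrix is taken as written in Eq. (6), and the sign convention $\det(\lambda - L(\mu)) = A\,t(\lambda) - A(\mu + \mu^{-1})$ is the one that follows from expanding that determinant for $N + 1 \geq 3$.

```python
import math

def lax_matrix(a, mu):
    """Periodic Lax matrix of Eq. (6) for a = [a_1, ..., a_{N+1}] (needs len(a) >= 3)."""
    size = len(a)
    root = [math.sqrt(x) for x in a]
    L = [[0j] * size for _ in range(size)]
    for i in range(size - 1):              # sqrt(a_i) couples neighbouring sites
        L[i][i + 1] = root[i]
        L[i + 1][i] = root[i]
    L[0][size - 1] = root[-1] / mu         # corner entries carry mu^{-1} and mu
    L[size - 1][0] = root[-1] * mu
    return L

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1 + 0j
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) == 0:
            return 0j
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result

def curve_polynomial(a, lam, mu):
    """A * t(lam), recovered as det(lam - L(mu)) + A * (mu + 1/mu)."""
    size = len(a)
    L = lax_matrix(a, mu)
    shifted = [[(lam if i == j else 0) - L[i][j] for j in range(size)]
               for i in range(size)]
    A = math.sqrt(math.prod(a))
    return det(shifted) + A * (mu + 1 / mu)
```

If the expansion is right, `curve_polynomial` should return the same value for any nonzero `mu` at fixed `lam`, and for $N$ even it should be odd in `lam`.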

Assuming $N = 2n$ even, $t(\lambda)$ is an odd polynomial, $t(-\lambda) = -t(\lambda)$, and has exactly $n+1$ independent coefficients. However, in that case, there is one Casimir function $K = t(2)$: $\{t(2), a_i\} = 0$, $\forall i$. The dimension of phase space is $N = 2n$ and we have exactly $n$ commuting quantities. The genus of the curve is $g = N$. At each point $(\lambda,\mu)$ of the spectral curve, we can attach an eigenvector $\Psi(\lambda,\mu) = (\psi_i(\lambda,\mu))$, $i = 1,\ldots,N+1$, corresponding to the eigenvalue $\lambda$ of $L(\mu)$. Explicitly, the equation $(L(\mu) - \lambda)\Psi = 0$ reads
\[
\begin{aligned}
\sqrt{a_1}\,\psi_2 + \mu^{-1}\sqrt{a_{N+1}}\,\psi_{N+1} &= \lambda\psi_1, \\
\sqrt{a_{i-1}}\,\psi_{i-1} + \sqrt{a_i}\,\psi_{i+1} &= \lambda\psi_i, \\
\mu\sqrt{a_{N+1}}\,\psi_1 + \sqrt{a_N}\,\psi_N &= \lambda\psi_{N+1}.
\end{aligned} \tag{10}
\]
We extend the definition of the coefficients $a_i$ by periodicity $a_{i+N+1} = a_i$, and introduce a second order difference operator $D$:
\[ (D\Psi)_i \equiv \sqrt{a_{i-1}}\,\psi_{i-1} + \sqrt{a_i}\,\psi_{i+1}. \]
This operator is a discrete version of a Schroedinger operator with periodic potential. Equations (10) are then equivalent to:
\[ (D\Psi)_i = \lambda\psi_i, \quad \text{with} \quad \psi_{i+N+1} = \mu\psi_i. \tag{11} \]
Therefore, the eigenvector is a Bloch wave for the difference operator $D$ with a Bloch momentum $\mu$. In the continuum limit, Eq. (11) becomes the Schroedinger equation
\[ (-\partial_x^2 - u(x))\,\psi(x) = \Lambda^2\psi(x), \qquad \lambda \simeq 2 - \Delta^2\Lambda^2. \]

3. The Free Case

Since in the continuum limit $a_i \to 1$, it is useful to first recall some formulae in the trivial case $a_i = 1$. They will be generalized to the full case in the next section. To introduce the zero mode from the start, we consider the slightly more general case $a_i = a$:
\[
\begin{aligned}
\sqrt{a}\,(\psi_2 + \mu^{-1}\psi_{N+1}) &= \lambda\psi_1, \\
\sqrt{a}\,(\psi_{i-1} + \psi_{i+1}) &= \lambda\psi_i, \\
\sqrt{a}\,(\mu\psi_1 + \psi_N) &= \lambda\psi_{N+1}.
\end{aligned}
\]
The solution of the bulk equations is $\psi_i = \alpha x_+^i + \beta x_-^i$, where $x_\pm$ are solutions of the equation
\[ x^2 - zx + 1 = 0, \qquad x_\pm(\lambda) = \frac{1}{2}\left(z \pm \sqrt{z^2 - 4}\right), \qquad z = \frac{\lambda}{\sqrt{a}}. \]


Imposing the two boundary equations, we get
\[
\begin{aligned}
(x_+^{N+1} - \mu)\,\alpha + (x_-^{N+1} - \mu)\,\beta &= 0, \\
(x_+^{N+1} - \mu)\,x_+\,\alpha + (x_-^{N+1} - \mu)\,x_-\,\beta &= 0.
\end{aligned}
\]
The compatibility of this system yields the spectral curve
\[ \mu + \mu^{-1} = x_+^{N+1} + x_-^{N+1} \equiv t(\lambda). \tag{12} \]
We now impose that the curve passes through the point $(\lambda = 2, \mu = \mu_0)$, where $\mu_0$ is related to the value of the Casimir function by
\[ K = \mu_0 + \mu_0^{-1}. \tag{13} \]
Setting
\[ x_\pm(2) = \frac{1}{\sqrt{a}} \pm \sqrt{\frac{1}{a} - 1} = e^{\pm\alpha}, \qquad \mu_0 = e^{i p_0}, \]
Eq. (12) gives $\alpha = i\frac{p_0}{N+1}$. Hence the constant $a$ is related to the value of the zero mode $p_0$ by:
\[ \sqrt{a} = \frac{1}{\cos\frac{p_0}{N+1}}. \]
The components of the eigenvector, properly normalized, are meromorphic functions on the spectral curve Eq. (12). Choosing the normalization $\psi_{N+1} = \mu$, they read
\[ \psi_i(\lambda,\mu) = \frac{P_{N+1-i}(z) + \mu P_i(z)}{P_{N+1}(z)}, \qquad z = \frac{\lambda}{\sqrt{a}}. \tag{14} \]
We have introduced the polynomials of degree $j-1$:
\[ P_j(z) = \frac{x_+^j - x_-^j}{x_+ - x_-}, \qquad P_j(z) = z^{j-1} + O(z^{j-3}). \tag{15} \]
The first few polynomials are
\[ P_0 = 0, \quad P_1 = 1, \quad P_2 = z, \quad P_3 = z^2 - 1, \quad P_4 = z^3 - 2z. \]
They are essentially the Tchebitchev polynomials of the second kind. As we will see, Eq. (14) is the general form of the meromorphic function $\psi_i(\lambda,\mu)$ even when $a_i \neq a$. In particular, in order to take the continuum limit, the poles of the eigenvector$^3$ will have to be close to the roots of the equation $P_{N+1}(\lambda/\sqrt{a}) = 0$, that is
\[ \lambda_k^{(0)} = 2\sqrt{a}\,\cos\frac{Z_k^{(0)}}{N+1}, \qquad Z_k^{(0)} = k\pi, \quad k = 1,\ldots,N. \tag{16} \]

3 In fact in this simple case the eigenvector has no poles at finite distance because they are compensated by zeroes in the numerator. This degeneracy is lifted as soon as the ai are not all equal.
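The free-case formulae can be tested numerically. The sketch below is an illustration, not the author's code: the function names are hypothetical, and it assumes the three-term recurrence $P_{j+1} = zP_j - P_{j-1}$, which follows from $x_\pm$ being the roots of $x^2 - zx + 1 = 0$. It builds the components of Eq. (14) and evaluates the residual of the last equation in (10) for $a_i = a$.

```python
import cmath
import math

def P(j, z):
    """P_j(z) of Eq. (15), from P_{j+1} = z P_j - P_{j-1}, P_0 = 0, P_1 = 1."""
    prev, cur = 0, 1
    if j == 0:
        return prev
    for _ in range(j - 1):
        prev, cur = cur, z * cur - prev
    return cur

def free_eigenvector(N, z, mu):
    """Components psi_1 .. psi_{N+1} of Eq. (14); the last one equals mu."""
    return [(P(N + 1 - i, z) + mu * P(i, z)) / P(N + 1, z)
            for i in range(1, N + 2)]

def boundary_residual(N, z, mu):
    """mu*psi_1 + psi_N - z*psi_{N+1}; it vanishes when (z, mu) lies on the curve (12)."""
    psi = free_eigenvector(N, z, mu)
    return mu * psi[0] + psi[N - 1] - z * psi[N]

def mu_on_curve(N, z):
    """One root of mu + 1/mu = P_{N+2}(z) - P_N(z), which equals x_+^{N+1} + x_-^{N+1}."""
    t = P(N + 2, z) - P(N, z)
    return (t + cmath.sqrt(t * t - 4)) / 2
```

For a generic `z`, `boundary_residual(N, z, mu_on_curve(N, z))` should vanish up to rounding, while for a generic `mu` off the curve it does not.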

For these special values $\lambda_k^{(0)}$, we have $x_\pm = e^{\pm i\frac{k\pi}{N+1}}$ and Eq. (12) gives $\mu_k^{(0)} = (-1)^k$. The set of points $(\lambda_k^{(0)}, \mu_k^{(0)})$, $k = 1,\cdots,N$, will be called the free configuration and will play an important role below. It is simple to take the continuum limit in this free case. We set
\[ \lambda \simeq 2 - \Delta^2\Lambda^2, \qquad z \simeq 2 - \Delta^2 Z^2, \qquad \psi_{i\pm 1} = \psi(x \pm \Delta), \qquad \Delta = \frac{1}{N+1}, \]
where we have introduced the variable
\[ Z = \sqrt{\Lambda^2 + p_0^2}. \tag{17} \]
The eigenvector equation becomes the Schroedinger equation
\[ -\psi''(x) - p_0^2\,\psi(x) = \Lambda^2\psi(x), \qquad x = j\Delta. \tag{18} \]
We also have $x_\pm = 1 \pm i\Delta Z$ and the equation of the spectral curve reads:
\[ \mu + \mu^{-1} = (1 + i\Delta Z)^{1/\Delta} + (1 - i\Delta Z)^{1/\Delta}. \]
In the limit $\Delta \to 0$, it becomes
\[ \mu + \mu^{-1} = 2\cos Z. \tag{19} \]
Similarly, the eigenvector becomes a Baker-Akhiezer function:
\[ \psi(x) = \frac{\sin Z(1 - x) + \mu\sin Zx}{\sin Z}. \tag{20} \]
When $\mu = e^{\pm iZ}$ this reduces to $\psi(x) = e^{\pm iZx}$ as it should be. Notice that when $\mu$ is kept as a free parameter, the above formula gives two independent solutions of Eq. (18), but when $\mu$ belongs to the spectral curve Eq. (19) one has $\psi(x+1) = \mu\psi(x)$. Equation (20) presents the two Bloch waves as a single function on the hyperelliptic spectral curve Eq. (19).

Another example, important to us, will be the Dirac comb,
\[ [-\partial_x^2 - H_0\,\delta(x)]\,\psi(x) = \Lambda^2\psi(x), \qquad \delta(x+1) = \delta(x). \]
On each interval $x_j = j < x < x_{j+1} = j+1$, one has
\[ \psi(x) = \alpha_j e^{i\Lambda x} + \beta_j e^{-i\Lambda x}, \qquad x_j < x < x_{j+1}. \]
The Bloch condition $\psi(x + x_j) = \mu^j\psi(x)$, $0 < x < 1$, gives $\alpha_j = (\mu e^{-i\Lambda})^j\alpha_0$, $\beta_j = (\mu e^{i\Lambda})^j\beta_0$. The continuity of $\psi(x)$ gives
\[ \frac{\alpha_0}{\mu - e^{-i\Lambda}} = -\frac{\beta_0}{\mu - e^{i\Lambda}}, \]
while the gap equation on the first derivative, $-\psi'(x_j + 0) + \psi'(x_j - 0) - H_0\psi(x_j) = 0$, gives the spectral curve
\[ \mu + \mu^{-1} = 2\cos\Lambda - H_0\,\frac{\sin\Lambda}{\Lambda}. \tag{21} \]
The Bloch wave itself is $\psi(x) = \mu^j\psi_{vac}(x - x_j)$, $x_j < x < x_{j+1}$, where $\psi_{vac}(x)$ is given by Eq. (20) (with $p_0 = 0$) but $\Lambda$, $\mu$ now belonging to the curve Eq. (21).
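The Dirac comb curve, Eq. (21), can be cross-checked with a one-period transfer matrix. This is a minimal sketch under an assumption not spelled out in the text: the Bloch multipliers $\mu$, $\mu^{-1}$ are taken to be the eigenvalues of the matrix that propagates $(\psi, \psi')$ across one cell, so that $\mu + \mu^{-1}$ is its trace. The function name `monodromy_trace` is hypothetical.

```python
import math

def monodromy_trace(Lam, H0):
    """Trace of the one-period transfer matrix for -psi'' - H0*delta(x)*psi = Lam^2 psi."""
    c, s = math.cos(Lam), math.sin(Lam)
    # Free propagation of the pair (psi, psi') across a cell of unit length.
    free = [[c, s / Lam], [-Lam * s, c]]
    # Crossing the delta function: psi is continuous, psi' jumps by -H0 * psi.
    kick = [[1.0, 0.0], [-H0, 1.0]]
    m = [[sum(kick[i][k] * free[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    return m[0][0] + m[1][1]
```

The trace should reproduce the right-hand side of Eq. (21), and for `H0 = 0` it reduces to the free value $2\cos\Lambda$.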


4. Separated Variables

In this section, we generalize the previous analysis to the case $a_i \neq a$ and express the dynamical variables of the Volterra model in terms of the separated variables. Equivalent formulae were already obtained a long time ago in [4, 5]. A quantum version of this construction for the closed Toda chain can be found in [19]. We have to reconstruct the eigenvectors of $L(\mu)$. Let us set $\Psi = (\psi_i)$, $i = 1,\ldots,N+1$. We normalize the last component $\psi_{N+1} = \mu$. Notice that due to Eq. (8), $\mu$ does not vanish for finite $\lambda$. The components $\psi_i$ are meromorphic functions on the spectral curve and are uniquely characterized by their poles and behavior at infinity which we now describe. We will call $P^+$ $(\lambda = \infty, \mu = \infty)$ and $P^-$ $(\lambda = \infty, \mu = 0)$ the two points above $\lambda = \infty$. In the neighbourhood of $P^\pm$, the local parameter is $\lambda^{-1}$ and we have by direct expansion of Eq. (8):
\[ P^+ : \quad \mu = A^{-1}\lambda^{N+1}\left(1 + O(\lambda^{-2})\right), \tag{22} \]
\[ P^- : \quad \mu = A\,\lambda^{-N-1}\left(1 + O(\lambda^{-2})\right). \tag{23} \]
At the points $P^+$ and $P^-$, the eigenvector $\Psi(P)$ behaves as:
\[ \psi_i(P) = \frac{1}{\sqrt{a_{N+1}a_1a_2\cdots a_{i-1}}}\,\lambda^i\left(1 + O(\lambda^{-2})\right), \qquad P \sim P^+, \tag{24} \]
\[ \psi_i(P) = \sqrt{a_{N+1}a_1a_2\cdots a_{i-1}}\;\lambda^{-i}\left(1 + O(\lambda^{-2})\right), \qquad P \sim P^-. \tag{25} \]

This is easily deduced by inspection of Eq. (10). From the general results of the classical inverse scattering theory, we expect $g + (N+1) - 1 = 2N$ poles for the eigenvector (see e.g. [6, 15]). From Eq. (24), we see that we have a fixed pole of order $N$ at $P^+$ (on the component $\psi_N$), and there remain $g = N$ poles at finite distance, the so called dynamical poles. But we notice the symmetry property $\psi_i(-\lambda,-\mu) = (-1)^i\psi_i(\lambda,\mu)$ so that the dynamical poles come in pairs $\lambda_{N+1-k} = -\lambda_k$, $\mu_{N+1-k} = -\mu_k$ and only $(\lambda_k,\mu_k)$, $k = 1\ldots n$, are independent parameters. Everything can be expressed in terms of these $2n = N$ quantities $(\lambda_k,\mu_k)$, $k = 1\ldots n$. In fact, they can be viewed as coordinates on (an open set of) phase space. First, the commuting Hamiltonians are easy to reconstruct. Indeed the spectral curve is determined by requiring that it passes through the points $(\lambda_k,\mu_k)$, $k = 1\cdots n$, and through the point $(2,\mu_0)$, where $\mu_0$ is related to the Casimir function as in Eq. (13). The equation of the curve itself can be written as a determinant
\[
\det\begin{pmatrix}
\lambda & \lambda^3 & \cdots & \lambda^{N+1} & \mu + \mu^{-1} \\
2 & 2^3 & \cdots & 2^{N+1} & \mu_0 + \mu_0^{-1} \\
\lambda_1 & \lambda_1^3 & \cdots & \lambda_1^{N+1} & \mu_1 + \mu_1^{-1} \\
\vdots & \vdots & & \vdots & \vdots \\
\lambda_n & \lambda_n^3 & \cdots & \lambda_n^{N+1} & \mu_n + \mu_n^{-1}
\end{pmatrix} = 0. \tag{26}
\]
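Equation (26) says that $t(\lambda)$ is the odd polynomial of degree $N+1$ interpolating the values $\mu + \mu^{-1}$ at the $n+1$ prescribed points. A hedged sketch of that interpolation, restricted to real points for simplicity (the helper names `solve_linear`, `curve_from_points` and `t_of` are hypothetical and not taken from the paper), could read:

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in matrix[i]] + [float(rhs[i])] for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(n):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def curve_from_points(points):
    """Coefficients c_0..c_n of t(lam) = sum_j c_j * lam**(2j+1), fixed by
    t(lam_k) = mu_k + 1/mu_k at each point (lam_k, mu_k), with (2, mu_0) included."""
    size = len(points)
    rows = [[lam ** (2 * j + 1) for j in range(size)] for lam, _ in points]
    rhs = [mu + 1 / mu for _, mu in points]
    return solve_linear(rows, rhs)

def t_of(coeffs, lam):
    """Evaluate the odd polynomial with the given coefficients."""
    return sum(c * lam ** (2 * j + 1) for j, c in enumerate(coeffs))
```

With $n = 2$, passing the three points $(2,\mu_0)$, $(\lambda_1,\mu_1)$, $(\lambda_2,\mu_2)$ returns three coefficients, and the reconstructed $t(\lambda)$ takes the value $\mu_k + \mu_k^{-1}$ at each input point.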


Expanding over the first row, we obtain a curve of the form Eq. (8), and we can read directly the Hamiltonians as the coefficients of $t(\lambda)$. They appear as functions of the $(\lambda_k,\mu_k)$ and can be shown to Poisson commute (see [16–18] for a proof and for the quantum generalization of this fact). Equations (24,25) and the data of the $N$ dynamical poles also determine the functions $\psi_i$ uniquely. Being meromorphic functions on a hyperelliptic curve, we can write quite generally
\[ \psi_i = \frac{Q^{(i)}(\lambda) + \mu R^{(i)}(\lambda)}{\prod_{k=1}^n(\lambda^2 - \lambda_k^2)}, \tag{27} \]
where $Q^{(i)}$ and $R^{(i)}$ are polynomials such that
\[ Q^{(i)}(-\lambda) = (-1)^i Q^{(i)}(\lambda), \qquad R^{(i)}(-\lambda) = (-1)^{i+1}R^{(i)}(\lambda). \]
Above $\lambda_k$, we have two points on the curve: $(\lambda_k,\mu_k)$ and $(\lambda_k,\mu_k^{-1})$. We want the poles to be at $(\lambda_k,\mu_k)$ only so that the numerator in Eq. (27) should vanish at the points $(\lambda_k,\mu_k^{-1})$. This gives $n$ conditions
\[ Q^{(i)}(\lambda_k) + \mu_k^{-1}R^{(i)}(\lambda_k) = 0, \qquad k = 1\ldots n. \tag{28} \]
To have a pole of order $i$ at $P^+$ and a zero of order $i$ at $P^-$ we must choose degree $Q^{(i)} = N - i$, degree $R^{(i)} = i - 1$. Hence, these two polynomials depend altogether on $n+1$ coefficients which are determined by imposing the $n$ conditions Eq. (28) and requiring that the normalization coefficients are inverse to each other at $P^\pm$ as in Eqs. (24,25). It is convenient to use the basis of polynomials $P_j(\lambda)$ given by Eq. (15). We will write the formulae for $\psi_i$ in the case $i$ odd, the case $i$ even is similar. The polynomial $Q^{(i)}(\lambda)$ can be expanded over
\[ Q^{(i)}(\lambda) : \quad P_2(\lambda),\; P_4(\lambda),\; \cdots,\; P_{N+1-i}(\lambda) \]
and the polynomial $R^{(i)}(\lambda)$ can be expanded over
\[ R^{(i)}(\lambda) : \quad P_1(\lambda),\; P_3(\lambda),\; \cdots,\; P_i(\lambda). \]
Solving the linear system Eq. (28), the eigenvector can be written as
\[
\psi_i = \frac{K_i}{\prod_k(\lambda^2 - \lambda_k^2)}\,
\det\begin{pmatrix}
\mu P_1(\lambda) & \cdots & \mu P_i(\lambda) & -P_{N+1-i}(\lambda) & \cdots & -P_2(\lambda) \\
P_1(\lambda_1) & \cdots & P_i(\lambda_1) & -\mu_1P_{N+1-i}(\lambda_1) & \cdots & -\mu_1P_2(\lambda_1) \\
\vdots & & \vdots & \vdots & & \vdots \\
P_1(\lambda_k) & \cdots & P_i(\lambda_k) & -\mu_kP_{N+1-i}(\lambda_k) & \cdots & -\mu_kP_2(\lambda_k) \\
\vdots & & \vdots & \vdots & & \vdots \\
P_1(\lambda_n) & \cdots & P_i(\lambda_n) & -\mu_nP_{N+1-i}(\lambda_n) & \cdots & -\mu_nP_2(\lambda_n)
\end{pmatrix}, \tag{29}
\]


where K i are constants independent of λ, µ. Defining 

P1 (λ1 )  ..  .  i =  P1 (λk )  .  . . P1 (λn )

··· .. .

··· .. .

···

Pi−2 (λ1 ) .. .

Pi−2 (λk ) .. .

Pi−2 (λn )

−µ1 PN +1−i (λ1 ) .. .

−µk PN +1−i (λk ) .. .

··· .. .

−µn PN +1−i (λn )

··· .. .

···

 −µ1 P2 (λ1 ) ..  .   −µk P2 (λk )  (30)  ..  . −µn P2 (λn )

we can compute the leading terms in Eq. (29) when λ → ∞. At P (+) the leading term comes from µPi (λ), while at P (−) it comes from PN +1−i (λ), ψi (−1)

i−1 2

A−1 K i det i λi ,

ψi (−1)

i−1 2

K i det i+2 λ−i ,

P +, P −.

Imposing that the two coefficients of λi and λ−i are inverse to each other, we get K i2 =

A . det i det i+2

Comparing with Eqs. (24, 25), we finally obtain ai =

det i det i+3 det N det 3 , aN = A, a N +1 = A. det i+1 det i+2 det N +2 det 1

(31)

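Here $A^{-1}$ is the coefficient of $\lambda^{N+1}$ in $t(\lambda)$, Eq. (9), computed from Eq. (26). We impose the Poisson bracket on the variables $\lambda_k$, $\mu_k$,
\[ \{\lambda_k,\lambda_{k'}\} = 0, \qquad \{\lambda_k,\mu_{k'}\} = -\frac{1}{2}\,\delta_{kk'}\left(4\lambda_k - \lambda_k^3\right)\mu_k, \qquad \{\mu_k,\mu_{k'}\} = 0. \tag{32} \]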

One can then check that the Hamiltonians defined by Eq. (26) are all in involution (this is a general result), and that the $a_i$ defined above do satisfy the Faddeev-Takhtajan Poisson bracket. The fact that the expressions for $a_N$, $a_{N+1}$ are different from the ones in the bulk is due to the choice of normalization of the eigenvector. However, the Poisson bracket of the $a_i$ is periodic. All this can be proved using techniques similar to the ones in [19].

5. Continuum Limit of the Spectral Curve

We now take the continuum limit of the spectral curve Eq. (26). The result is Eq. (41). We set
\[ \lambda = \sqrt{a}\,z, \qquad \lambda_k = \sqrt{a}\,z_k, \qquad \frac{2}{\sqrt{a}} = 2\cos\frac{p_0}{N+1} = z_0. \]
From these, the scaled variables $\Lambda$, $Z$, $Z_k$ are defined by
\[ \lambda = 2\cos\frac{\Lambda}{N+1}, \qquad z = 2\cos\frac{Z}{N+1}, \qquad z_k = 2\cos\frac{Z_k}{N+1}. \tag{33} \]


Notice that we have $Z = \sqrt{\Lambda^2 + p_0^2}$. In the following, we will use the term "perturbation theory" when the points $(Z_k,\mu_k)$ are small deviations from the free configuration Eq. (16). The formulae we will write will make sense in this perturbative setting. This however does not exclude the possibility of having a finite number of points with large deviations. We will also be interested in the deviation from the zero mode configuration. That is, we make the substitution $\sqrt{a_i} \to \sqrt{a}\sqrt{\tilde a_i}$ everywhere on the lattice. Alternatively this amounts to using the variable $z = \lambda/\sqrt{a}$. Using the basis of polynomials $P_j(z)$ defined in Eq. (15) instead of the $z^j$, we can write the spectral curve as (it has the right form and passes through the right points)
\[
\det\begin{pmatrix}
\mu + \mu^{-1} & P_{N+2}(z) & \cdots & P_4(z) & P_2(z) \\
\mu_0 + \mu_0^{-1} & P_{N+2}(z_0) & \cdots & P_4(z_0) & P_2(z_0) \\
\mu_1 + \mu_1^{-1} & P_{N+2}(z_1) & \cdots & P_4(z_1) & P_2(z_1) \\
\vdots & \vdots & & \vdots & \vdots \\
\mu_n + \mu_n^{-1} & P_{N+2}(z_n) & \cdots & P_4(z_n) & P_2(z_n)
\end{pmatrix} = 0.
\]
Without changing the determinant, we can subtract from the first column the linear combination of the next two columns: $P_{N+2}(z_k) - P_N(z_k) = 2\cos Z_k$. The first column becomes
\[ \begin{pmatrix} \mu + \mu^{-1} - 2\cos Z \\ \gamma_0 \\ \gamma_1 \\ \vdots \\ \gamma_n \end{pmatrix}, \]
where we have set
\[ \gamma_k = \mu_k + \mu_k^{-1} - 2\cos Z_k. \tag{34} \]
Notice that $\gamma_0 = 0$. The reason for this subtraction is that for the free configuration we also have $\gamma_k^{(0)} = 0$, so that the spectral curve becomes simply $\mu + \mu^{-1} = 2\cos Z$ as it should be. The subtraction gives sense to the spectral curve in perturbation theory. Expanding the determinant over the first row, we can write
\[ \mu + \mu^{-1} - 2\cos Z = \sum_{j=1}^{n+1} H_{2j}\,P_{2j}(z). \]
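The subtraction used above rests on the identity $P_{N+2}(z) - P_N(z) = 2\cos Z$ for $z = 2\cos\frac{Z}{N+1}$, and on $\gamma_k^{(0)} = 0$ for the free configuration. Both are easy to confirm numerically. The following is a small sketch with hypothetical helper names, using the recurrence $P_{j+1} = zP_j - P_{j-1}$ that follows from Eq. (15):

```python
import math

def P(j, z):
    """P_j(z) of Eq. (15), from P_{j+1} = z P_j - P_{j-1}, P_0 = 0, P_1 = 1."""
    prev, cur = 0.0, 1.0
    if j == 0:
        return prev
    for _ in range(j - 1):
        prev, cur = cur, z * cur - prev
    return cur

def subtraction_defect(N, Z):
    """P_{N+2}(z) - P_N(z) - 2 cos(Z) at z = 2 cos(Z / (N + 1)); expected to vanish."""
    z = 2 * math.cos(Z / (N + 1))
    return P(N + 2, z) - P(N, z) - 2 * math.cos(Z)

def gamma(Z, mu):
    """gamma = mu + 1/mu - 2 cos(Z), as in Eq. (34)."""
    return mu + 1 / mu - 2 * math.cos(Z)
```

Evaluating `gamma(k * math.pi, (-1) ** k)` gives zero up to rounding, which is the statement that the free configuration sits on the curve $\mu + \mu^{-1} = 2\cos Z$.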

The $H_{2j}$ are given by $H = \mathcal{N}^{-1}V$, where we have defined
\[
H = \begin{pmatrix} H_{N+2} \\ H_N \\ \vdots \\ H_2 \end{pmatrix}, \qquad
\mathcal{N} = \begin{pmatrix}
P_{N+2}(z_0) & \cdots & P_4(z_0) & P_2(z_0) \\
P_{N+2}(z_1) & \cdots & P_4(z_1) & P_2(z_1) \\
\vdots & & \vdots & \vdots \\
P_{N+2}(z_n) & \cdots & P_4(z_n) & P_2(z_n)
\end{pmatrix}, \qquad
V = \begin{pmatrix} \gamma_0 \\ \gamma_1 \\ \vdots \\ \gamma_n \end{pmatrix}.
\]
We will need to treat separately the first row and column in the matrix $\mathcal{N}$. Let us write it as
\[ \mathcal{N} = \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \]
where
\[
A = P_{N+2}(z_0), \qquad
B = \begin{pmatrix} P_N(z_0) & \cdots & P_4(z_0) & P_2(z_0) \end{pmatrix},
\]
\[
C = \begin{pmatrix} P_{N+2}(z_1) \\ \vdots \\ P_{N+2}(z_n) \end{pmatrix}, \qquad
D = \begin{pmatrix}
P_N(z_1) & \cdots & P_4(z_1) & P_2(z_1) \\
\vdots & & \vdots & \vdots \\
P_N(z_n) & \cdots & P_4(z_n) & P_2(z_n)
\end{pmatrix}.
\]

To zero-th order in perturbation theory, we denote $\mathcal{N} = \mathcal{N}^{(0)}$ and similarly $A^{(0)}$, $B^{(0)}$, $C^{(0)}$, $D^{(0)}$. To take the continuum limit we have to consider the matrix $\mathcal{N}\mathcal{N}^{(0)-1}$.

Lemma 1. In the continuum limit, we have
\[
\mathcal{N}\mathcal{N}^{(0)-1} = \begin{pmatrix}
1 & 0 \\
\dfrac{\sin Z_k}{Z_k}\,\dfrac{p_0}{\sin p_0} &
\dfrac{\sin Z_k}{Z_k}\left(\dfrac{1}{Z_k^2 - m^2\pi^2} - \dfrac{1}{p_0^2 - m^2\pi^2}\right)2(-1)^m m^2\pi^2
\end{pmatrix}, \qquad k, m = 1,\ldots,\infty. \tag{35}
\]

Proof. Since $P_{N+2}(z) = zP_{N+1}(z) - P_N(z)$ and $P_{N+1}(z_k^{(0)}) = 0$, we have
\[ C_k^{(0)} = -D_{k1}^{(0)} \;\Longrightarrow\; (D^{(0)-1}C^{(0)})_k = -\delta_{k1}, \]
so that
\[
\mathcal{N}^{(0)-1} = \frac{1}{A + B_1}\begin{pmatrix}
1 & -BD^{(0)-1} \\
F & (A + B_1)D^{(0)-1} - F\otimes BD^{(0)-1}
\end{pmatrix},
\]
where $F$ is the column vector with components $F_k = \delta_{k,1}$, $k = 1,\ldots,n$. It follows that
\[
\mathcal{N}\mathcal{N}^{(0)-1} = \begin{pmatrix}
1 & 0 \\
\dfrac{1}{A + B_1}(C + DF) & DD^{(0)-1} - \dfrac{1}{A + B_1}(C + DF)\otimes BD^{(0)-1}
\end{pmatrix}.
\]
Noticing that $A + B_1 = P_{N+2}(z_0) + P_N(z_0)$, $(C + DF)_k = P_{N+2}(z_k) + P_N(z_k)$, we get in the continuum limit
\[ \frac{1}{A + B_1}(C + DF)_k \to \frac{\sin Z_k}{Z_k}\,\frac{p_0}{\sin p_0}. \]
The main trick to proceed is an explicit formula for the inverse of the matrix $D^{(0)}$. It is not difficult to check that
\[ (D^{(0)-1})_{jk} = \frac{4}{N+1}\,\sin^2\frac{k\pi}{N+1}\;P_{2j}(z_k^{(0)}). \]
With this, we find
\[
\begin{aligned}
(BD^{(0)-1})_m &= \sum_{j=1}^n \frac{4}{N+1}\,\sin^2\frac{m\pi}{N+1}\;P_{2j}(z_0)\,P_{2j}(z_m^{(0)}) \\
&= \frac{4}{N+1}\,\frac{\sin^2\frac{m\pi}{N+1}}{\sin\frac{p_0}{N+1}\,\sin\frac{m\pi}{N+1}}\,\sum_{j=1}^n \sin\frac{2jp_0}{N+1}\,\sin\frac{2jm\pi}{N+1}
\;\to\; 2\,\frac{m\pi}{p_0}\int_0^1 dx\,\sin p_0x\,\sin m\pi x,
\end{aligned}
\]
and the last integral is easily evaluated with the result
\[ (BD^{(0)-1})_m \to (-1)^m\,\frac{2m^2\pi^2\,\sin p_0}{p_0\,(p_0^2 - m^2\pi^2)}. \]
Similarly we compute
\[ (DD^{(0)-1})_{km} = \frac{\sin Z_k}{Z_k}\,\frac{(-1)^m\,2m^2\pi^2}{Z_k^2 - m^2\pi^2}. \]
Gathering all this we get Eq. (35).
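The explicit inverse of $D^{(0)}$ quoted in the proof can be confirmed numerically for small $n$. The sketch below is an illustration with hypothetical names; it labels the columns of $D^{(0)}$ by the index $j$ of $P_{2j}$ (the ordering of columns does not matter for the check, as long as the rows of the inverse are labelled the same way) and assumes the recurrence $P_{j+1} = zP_j - P_{j-1}$.

```python
import math

def P(j, z):
    """P_j(z) of Eq. (15), from P_{j+1} = z P_j - P_{j-1}, P_0 = 0, P_1 = 1."""
    prev, cur = 0.0, 1.0
    if j == 0:
        return prev
    for _ in range(j - 1):
        prev, cur = cur, z * cur - prev
    return cur

def free_D_and_inverse(n):
    """Return D^(0) and the claimed inverse for N = 2n, on the free configuration."""
    N = 2 * n
    step = math.pi / (N + 1)
    z_free = [2 * math.cos(k * step) for k in range(1, n + 1)]
    # D[k][j] = P_{2j}(z_k^(0)), rows k = 1..n, columns j = 1..n
    D = [[P(2 * j, z_free[k]) for j in range(1, n + 1)] for k in range(n)]
    # D_inv[j][k] = 4/(N+1) * sin^2(k*pi/(N+1)) * P_{2j}(z_k^(0))
    D_inv = [[4 / (N + 1) * math.sin((k + 1) * step) ** 2 * P(2 * j, z_free[k])
              for k in range(n)] for j in range(1, n + 1)]
    return D, D_inv

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]
```

Multiplying the two returned matrices in either order should give the identity up to rounding; the underlying reason is the discrete orthogonality of $\sin\frac{2jk\pi}{N+1}$ over $j = 1,\ldots,n$ when $N + 1 = 2n + 1$ is odd.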

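We now introduce the important infinite matrix
\[ M_{km} = \frac{1}{Z_k^2 - m^2\pi^2}, \qquad k, m = 1,\ldots,\infty \tag{36} \]
and the important vector $|\eta\rangle$
\[ |\eta\rangle = M^{-1}\begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}. \tag{37} \]
With these notations we can compute the inverse of the matrix $\mathcal{N}\mathcal{N}^{(0)-1}$:

Lemma 2.
\[
(\mathcal{N}\mathcal{N}^{(0)-1})^{-1} = \begin{pmatrix}
1 & 0 \\
\dfrac{p_0}{\sin p_0}\,\dfrac{1}{1 - \langle\chi(p_0)|\eta\rangle}\,\dfrac{(-1)^{m+1}}{2m^2\pi^2}\,\eta_m &
\dfrac{(-1)^m}{2m^2\pi^2}\left[\left(1 + \dfrac{|\eta\rangle\langle\chi(p_0)|}{1 - \langle\chi(p_0)|\eta\rangle}\right)M^{-1}\right]_{mk}\dfrac{Z_k}{\sin Z_k}
\end{pmatrix},
\]
where we have defined the vector
\[ \langle\chi(p_0)|_m = \frac{1}{p_0^2 - \pi^2m^2}. \]

Proof. With the above notations we can write
\[
\mathcal{N}\mathcal{N}^{(0)-1} = \begin{pmatrix}
1 & 0 \\
\dfrac{\sin Z_k}{Z_k}\,\dfrac{p_0}{\sin p_0} &
\dfrac{\sin Z_k}{Z_k}\,M_{kl}\left(\delta_{lm} - \eta_l\,\chi_m(p_0)\right)2(-1)^m m^2\pi^2
\end{pmatrix}.
\]
Letting
\[
\mathcal{N}\mathcal{N}^{(0)-1} = \begin{pmatrix} 1 & 0 \\ Y & X \end{pmatrix}
\;\Longrightarrow\;
(\mathcal{N}\mathcal{N}^{(0)-1})^{-1} = \begin{pmatrix} 1 & 0 \\ -X^{-1}Y & X^{-1} \end{pmatrix},
\]
we find
\[
(X^{-1})_{mk} = \frac{(-1)^m}{2m^2\pi^2}\left[\left(1 + \frac{|\eta\rangle\langle\chi(p_0)|}{1 - \langle\chi(p_0)|\eta\rangle}\right)M^{-1}\right]_{mk}\frac{Z_k}{\sin Z_k} \tag{38}
\]
and
\[
(-X^{-1}Y)_m = \frac{p_0}{\sin p_0}\,\frac{1}{1 - \langle\chi(p_0)|\eta\rangle}\,\frac{(-1)^{m+1}}{2m^2\pi^2}\,\eta_m.
\]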
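The only non-trivial step in this inversion is the rank-one update $(1 - |\eta\rangle\langle\chi|)^{-1} = 1 + |\eta\rangle\langle\chi|/(1 - \langle\chi|\eta\rangle)$, valid whenever $\langle\chi|\eta\rangle \neq 1$. A finite-dimensional sketch of that identity, with hypothetical function names and arbitrary test vectors standing in for $|\eta\rangle$ and $\langle\chi(p_0)|$, is:

```python
def rank_one(eta, chi):
    """The matrix 1 - |eta><chi| for finite real vectors eta and chi."""
    n = len(eta)
    return [[(1.0 if i == j else 0.0) - eta[i] * chi[j] for j in range(n)]
            for i in range(n)]

def rank_one_inverse(eta, chi):
    """The matrix 1 + |eta><chi| / (1 - <chi|eta>), the claimed inverse of rank_one."""
    n = len(eta)
    overlap = sum(c * e for c, e in zip(chi, eta))
    return [[(1.0 if i == j else 0.0) + eta[i] * chi[j] / (1 - overlap)
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]
```

In the paper the vectors are infinite and the sums are assumed to converge; the finite check only illustrates the algebra behind Eq. (38).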

Let us return to the formula for the spectral curve. We assume that the conditions explained in Sect. 9 are satisfied so that the infinite sums we will manipulate are convergent. Denote η(Z ) = 1 −

∞ m=1

ηm Z 2 − m2π 2

(39)

Zk γk . sin Z k

(40)

and | m =

k

−1 Mmk

These quantities enter the expression of the continuum limit of the spectral curve. Proposition 1. In the continuum limit, the equation of the spectral curve becomes: Hm sin Z −1 , (41) µ + µ = 2 cos Z + −H0 + Z Z 2 − m2π 2 m where the conserved quantities Hm can be taken as Hm m 1 1 χ ( p0 )| ηm , H0 = Hm = m + = . 2 2 2 2 η( p0 ) η( p0 ) m p0 − π 2 m 2 m p0 − π m (42) Proof. We have µ + µ−1 − 2 cos Z =

n

PN +2−2i (z)H N +2−2i =

i=0

n

PN +2−2i (z)(N −1 V )i .

i=0

We insert 1 = N (0)−1 N (0) into the above expression: µ + µ−1 − 2 cos Z = PN +2−2i (z)(N (0)−1 )ik (N (0) )k j (N −1 V ) j .

(43)

i, j,k

Hence, we need to compute PN +2−2i (z)(N (0)−1 )im i

PN +2 (z) + PN (z) PN +2 (z) + PN (z) (0)−1 (0)−1 ,− BD = + PN +2−2i (z)(D )im PN +2 (z 0 ) + PN (z 0 ) PN +2 (z 0 ) + PN (z 0 ) i


whose limit N → ∞ is easy to take i

PN +2−2i (z)(N (0)−1 )im →

sin Z Z

p0 , 2(−1)m m 2 π 2 × sin p0

1 1 − 2 2 2 2 Z −m π p0 − m 2 π 2

.

We can now take the limit N → ∞ in Eq. (43) sin Z µ + µ−1 − 2 cos Z = Z ∞ p0 1 1 m 2 2 × 2(−1) m π − 2 H0 + Hm , sin p0 Z 2 − m2π 2 p0 − m 2 π 2 m=1 where

    0  γ0  γ1   γ1    m = (N N (0)−1 )−1  .  =  H   , −1  ..   X  ...   γn

γn

where X −1 is given in Eq. (38) and we remembered that γ0 = 0. Since 1 − χ ( p0 )|η = η( p0 ) the equation of the spectral curve finally becomes −1

µ+µ

1 sin Z 1 1 − 2 cos Z = − 2 . m + × ηm χ( p0 )| Z Z 2 − m2π 2 η( p0 ) p0 − m 2 π 2 m

Another useful expression of this result is: sin Z η(Z ) m m µ + µ−1 = 2 cos Z + − . Z Z 2 − m2π 2 η( p0 ) p02 − m 2 π 2 m

(44)

The next proposition performs a few consistency checks. ±1 Proposition 2. The points (Z = p0 , µ±1 0 ), and (Z = Z k , µk ), all belong to the curve Eq. (44).

Proof. When Z = p0 , we find µ + µ−1 = 2 cos p0 , hence the curve passes through the point = 0, µ±1 0 , as it should be. When Z = Z k , recalling that η(Z k ) = 0, we find 1 Zl sin Z k M −1 (µl + µl−1 − 2 cos Z l ) Z k m Z k2 − m 2 π 2 ml sin Z l sin Z k −1 Z l Mkm Mml (µl + µl−1 − 2 cos Z l ) = µk + µ−1 = 2 cos Z k + k . Zk m sin Z l

µ + µ−1 = 2 cos Z k +

Hence the curve passes through the points Z k , µ±1 k .


We now show that the Hm in Eq. (41) all Poisson commute. We need the following result: Lemma 3. One has { n , m } = 0, { n , ηm } = { m , ηn }, {ηn , ηm } = 0.

(45) (46) (47)

Proof. Recall the definitions Eqs. (37,40) of ηm and m , −1 −1 ηm = Mmk |1k , m = Mmk |γ˜ k , γ˜k =

Zk γk , sin Z k

−1 where Mmk is the inverse of the matrix defined in Eq. (36). The relation Eq. (47) is obvious because the ηm depend only on the Z k . Consider the second relation Eq. (46): −1 −1 −1 −1 1k , Mml γ˜l } = Mm,l {Mnk , γ˜l }1k {ηn , m } = {Mnk −1 −1 −1 −1 = −Mm,l Mn,k {Mk p , γ˜l }M −1 pk |1k = −Mm,l Mn,l {Mlp , γ˜l }η p ,

where in the last step we used that {Mk p , γ˜l } = 0 if k = l. The result is obviously symmetric in m and n. Finally the first statement, Eq. (45), is simple. One has −1 −1 −1 −1 −1 γ˜k , Mnl γ˜l } = −Mmr Mnl {Mr s , γ˜l } − {Mls , γ˜r } Msk γ˜k { m , n } = {Mmk but because of the structure of M, we have {Mr s , γ˜l } = 0 if r = l and for r = l the term in the square bracket obviously vanishes. This is a special case of a general theorem [17, 18]. We are now ready to prove Proposition 3. The quantities H0 , Hm , Poisson commute {H0 , Hn } = 0, {Hn , Hm } = 0. Proof. Using Eq. (42), one has {Hn , Hm } = ηn ({C, m } + C{C, ηm }) − ηm ({C, n } + C{C, ηn }), where we denoted C= One has { m , C} =

l 1 1 χ ( p0 )| = . 2 η( p0 ) η( p0 ) p0 − π 2 l 2 l

{ m , ηl } 1 1 {ηm , l } C , {ηm , C} = , 2 2 2 η( p0 ) η( p0 ) p0 − π l p02 − π 2 l 2 l l

hence {C, m } + C{C, ηm } = −

{ m , ηl } + {ηm , l } 1 C = 0. η( p0 ) p02 − π 2 l 2 l

All this means that (Z k , µk ) are separated coordinates for the Hamiltonians Hm .


6. Continuum Limit of the Eigenvector Having found the continuum limit of the spectral curve, we now consider the limit of the eigenvector. Again, the continuum limit can be computed, the result being Eq. (50). As seen from Eq. (29), the eigenvector can be written as (for i odd) √ A det Ni ψi = 2 , √ (z − z 2j ) det i det i+2 where



µPi (z) + PN +1−i (z)  Pi (z 1 ) + µ1 PN +1−i (z 1 )  ..  .  Ni =   Pi (z k ) + µ j PN +1−i (z k )  ..  . Pi (z n ) + µn PN +1−i (z n )

µP1 (z) P1 (z 1 ) .. . P1 (z k ) .. .

 · · · −P2 (z) · · · −µ1 P2 (z 1 )   .. ..  . .  . −µk PN +1−i (z k ) · · · −µk P2 (z k )    .. .. ..  . . .

· · · −PN +1−i (z) · · · −µ1 PN +1−i (z 1 ) .. .. . . ··· .. .

P1 (z n ) · · · −µn PN +1−i (z n ) · · · −µn P2 (z n )

(48) Compared to Eq. (29), we have subtracted the i th column from the first one for the same reason as in the previous section. Also we have used the variable z, z k instead of λ, λk . Let us decompose the matrix Ni in blocs particularizing the first row and first column: Ui Vi , Ni = Wi i where Ui ≡ µPi (z) + PN +1−i (z), (Wi )k ≡ µk Pi (z k ) + PN +1−i (z k ), (Vi ) j = µP j (z)θ (i − j) − PN +1− j (z)θ ( j − i), i, j odd. To order zero in perturbation, we have −(−1)k P j (z k(0) ) = PN +1− j (z k(0) ) so that Vi Ui (0) , Ni = 0 i(0) (0)

where i is the matrix Eq. (30) evaluated on the free configuration. It is in fact independent of i and we will denote it by (0) . The appearance of zero in the lower left corner was the reason for the subtraction in Eq. (48) and makes things better behaved in perturbation. (0) The matrix Ni being bloc triangular we can compute its inverse: −1 −Ui−1 Vi (0)−1 Ui (0)−1 = Ni 0 (0)−1 so that (0)−1

Ni Ni

=

1

Ui−1 Wi

0

i (0)−1 − Ui−1 Wi ⊗ Vi (0)−1

.


Returning to the formula for ψi , we multiply all the matrices by (0)−1 . The factors det (0)−1 cancel between the numerator and denominator. We arrive at

√ (0)−1 − U −1 W ⊗ V (0)−1 det i i i i A Ui ψi = 2

. 2 k z − zk (0)−1 (0)−1 det i+2 det i We want to take the scaling limit of this expression. Again, the main trick is an explicit formula for the inverse of (0) . It is not difficult to check that kπ 4 (0) sin2 P j (z k ). ((0)−1 ) jk = N +1 N +1 Let us compute i (0)−1 . Using the parametrization Eq. (33) we find (recall that i is assumed to be odd)  i−2 mπ  sin jmπ 4 j Zk N +1 sin sin (i (0)−1 )km = Z k  N + 1 sin N +1 N +1 N +1 j=1,odd  N −1 jmπ  (N + 1 − j)Z k sin −µk sin . N +1 N + 1 j=i,odd

Defining (x) as the scaling limit of i (0)−1 , we find (there is a factor 1/2 because the sum is over j odd only) x 1 mπ ((x))km = 2 dy sin Z k y sin mπ y − µk dy sin Z k (1 − y) sin mπ y . Zk 0 x Similarly, we define U (x, , µ) and Wk (x) by Ui = (N + 1)U (x, , µ), (Wi )k = (N + 1)Wk (x). We find U (x, , µ) = µ

sin x Z sin(1 − x)Z + , Z Z

Z=

2 + p02

and Wk (x) =

sin Z k (1 − x) sin Z k x + µk . Zk Zk

Finally, we have (again there is a factor 1/2 because the sum is over j odd only) (Vi (0)−1 )m = Vm (x, Z , µ), where

x

Vm (x, , µ) = 2mπ 0

sin y Z sin mπ y − dy µ Z

1 x

sin(1 − y)Z sin mπ y . Z

(49)

Putting everything together, we arrive at (up to a factor4 independent of x) 4 In this factor we will include in particular

1 2 2 k (1−Z /Z k )

which produces the poles at Z 2 = Z k2 . This is

important for the analyticity properties of ψ(x) but plays little role for the considerations of this paper.


Proposition 4. ψ(x, , µ) = U (x, , µ) − V (x, , µ)|−1 (x)|W (x),

(50)

where we denoted by V (x, , µ)| the row vector with components Vm (x, , µ) and by |W (x) the column vector with components Wk (x). It is easy to show that the infinite sums involved in this formula converge under the conditions Eq. (65) of Sect. 9. Equation (50) is the generalization of Eq. (20). Here, the and µ dependence is entirely contained in the function U (x, , µ) and the vector V (x, , µ). For the moment they are free complex parameters. We now want to specialize to = 0, µ0 = e±i p0 ., We have U (x, , µ)|0,e±i p0 =

sin p0 ±i p0 x e , p0

m(±) (x, p0 ), Vm (x, , µ)|0,e±i p0 = U (x, , µ)|0,e±i p0 V where m(±) (x, p0 ) = −mπ V

$

% eimπ x e−imπ x . + mπ ± p0 mπ ∓ p0

Hence, up to a constant (±) (x, p0 )|−1 (x)|W (x) . ψ (±) (x, p0 ) = e±i p0 x 1 − V These are the primary fields of CFT. Their logarithmic derivatives are the free fields of the Coulomb gas representation. Notice that we have two such fields playing a completely symmetrical role: we go from one to the other by changing p0 → − p0 . This circumstance was recognized and used with great profit in [14]. The separated variables make this symmetry explicit and built in. 7. Schroedinger Equation Having found a formula for the wave function ψ(x, , µ), the next question is to find the potential in the Schroedinger equation that ψ(x, , µ) is expected to satisfy. At this point it is simpler to forget the lattice model and work directly with Eq. (50). Let us denote by |E(x) the vector with components E m (x) = 2mπ sin mπ x. Calculating explicitly the integrals in Eq. (49), we find (’ denotes the derivative with respect to x) Vm (x, , µ) =

U (x, , µ)E m (x) − U (x, , µ)E m (x) , Z 2 − m2π 2

Z 2 = 2 + p02 , (51)

and similarly for (x), we find km (x) =

Wk (x)E m (x) − Wk (x)E m (x) Z k2 − m 2 π 2

.

(52)


Notice the important formulae V (x, , µ)| = U (x, , µ)E(x)|, (x) = |W (x)E(x)|. The derivative of the matrix (x) is a rank one projector. The matrix (x) has a form familiar in the theory of integrable systems, and we know that it leads to non linear differential equations. Indeed, let us define the vector |(x) by |(x) = −1 (x)|W (x)

(53)

and let K be the diagonal matrix K mm = mπ δmm . Proposition 5. The vector |(x) satisﬁes the set of coupled non linear second order differential equations | + 2(|E) | + K 2 | = 0, where |E(x) =

(54)

m (x)E m (x).

m

Proof. By derivation, and using the formula for , we get |W = |, |W = | + | = |E|W + | , |W = |E|W + | + (|E + |E)|W . But we also have Z 2 − K 2 = |W E | − |W E|, where we defined the diagonal matrix Z Zkk = Z k δkk . Applying this identity to |, we get Z 2 | − K 2 | = E ||W − E||W . But Z 2 | = Z 2 |W = −|W , and therefore |W = E||W − E ||W − K 2 |. Comparing these two expressions of |W , we find | + (|E + |E)|W = −E ||W − K 2 |. Multiplying by −1 yields Eq. (54).

We are now ready to find the Schroedinger equation satisfied by ψ.

(55)


Proposition 6. The function ψ(x, , µ) deﬁned by Eq. (50) satisﬁes the linear second order differential equation −ψ (x, , µ) − [ p02 + 2(E|) ] ψ(x, , µ) = 2 ψ(x, , µ).

(56)

Proof. We have ψ(x, , µ) = U (x, , µ) − V (x, , µ)|−1 |W (x) = U (x, , µ) − V (x, , µ)|(x). Using the formula for V (x, , µ)|, we get 1 1 ψ(x, , µ) = U 1 − E | 2 | + U E| 2 |. Z − K2 Z − K2 Next, remembering that U (x, , µ) = −Z 2 U (x, , µ), |E (x) = −K 2 |E(x), we obtain 1 1 ψ (x, , µ) = −U E| + E | 2 | + U 1 + E| 2 | Z − K2 Z − K2 and

1 | ψ (x, , µ) = −U Z + E| + (E|) + E | 2 Z − K2 1 | . +U −E| + E| 2 Z − K2

2

Using now the equation for | , we get Eq. (56).

The potential T (x) = 2(E|) can also be written directly in terms of . In fact, we have ∂x2 log det = ∂x Tr −1 = ∂x E|−1 W = ∂x E|, hence T (x) = 2∂x E| = 2∂x2 log det (x). The Schroedinger equation therefore also reads ψ (x, , µ) + [ p02 + 2 ∂x2 log det ] ψ(x, , µ) = −2 ψ(x, , µ).

(57)

In this formula both the potential and the function ψ(x, , µ) are known. The potential therefore belongs to the class of exactly solvable potentials. It is strongly reminiscent of the formula for finite zones potentials [20–23]. It can probably also be obtained by an infinite sequence of Darboux transformations [24]. The parameter µ which enters the function U (x, , µ) and the vector V (x, , µ)| was, up to now, a free parameter. Eq. (50) therefore provides two linearly independent solutions of Eq. (56). We now introduce the spectral curve by imposing the quasiperiodicity of ψ(x, , µ).


8. Bloch Waves and Spectral Curve So far, ψ(x, , µ) was defined on the interval [0, 1]. We extend its definition by imposing ψ(x + 1, , µ) = µψ(x, , µ). This extension is continuous as we now show. Proposition 7. ψ(1, , µ) − µψ(0, , µ) = 0. Proof. This follows immediately from −1 m Wk (1) = µ−1 U (1, , µ) = µU (0, , µ). k Wk (0), km (1) = µk km (0)(−1)

It is worth computing explicitly ψ(0, , µ). In terms of the matrix M introduced in Eq. (36), we have km (0) = Wk (0)Mkm E m (0), U (0, , µ) = Vm (0, , µ) = U (0, , µ) It follows that sin Z ψ(0, , µ) = Z

1−

sin Z , Z

1 E (0). Z 2 − m2π 2 m

ηm 2 Z − m2π 2

m

=

sin Z η(Z ), Z

(58)

where ηm and η(Z ) are defined in Eqs. (37, 39). Notice that when Z 2 = Z k2 , we have ψ(0, , µ) = 0, by definition of η(Z ). We now turn to the derivative of ψ(x, , µ). Lemma 4. One has ψ (1, , µ) − µψ (0, , µ) = µ (, µ), (, µ) = µ + µ−1 − 2 cos Z −

∞ sin Z E m (0)m (0) − E m (1)m (1) . Z Z 2 − m2π 2 m=1

Proof. We have U (1, , µ) = µ

sin Z sin Z , U (0, , µ) = , Z Z

and U (1, , µ) = µ cos Z − 1, U (0, , µ) = µ − cos Z . Using E k (1) = E k (0) = 0, we get 1 | (1) + U (1, , µ), Z2 − K 2 1 ψ (0, , µ) = −U (0, , µ)E (0)| 2 | (0) + U (0, , µ). Z − K2 From this the result follows. ψ (1, , µ) = −U (1, , µ)E (1)|

(59)


At this point it is tempting to identify the spectral curve as (, µ) = 0. However, this cannot be correct because the point = 0, µ = µ0 does not belong to it. We have to change the Schroedinger equation. The only possible modification is at the edges. We consider therefore the equation ψ (x, , µ) + p02 + 2E + H0 δ(x) ψ(x, , µ) = −2 ψ(x, , µ). (60) The bulk formula for ψ(x, , µ) does not change. The continuity of ψ(x, , µ) at x = 1 still holds, but the derivative now has a discontinuity

1+

d x ψ + H0 ψ(1) = 0.

1−

Using ψ(1, , µ) = µ η(Z )

sin Z Z

the Bloch condition becomes ψ (1, , µ) − µψ (0, , µ) − H0 µ η(Z ) = 0, that is −1

µ+µ

sin Z = 2 cos Z + Z

m

m , −H0 η(Z ) , Z 2 − m2π 2

where we have set m = E m (0)m (0) − E m (1)m (1).

(61)

We now determine the coefficient H0 by requiring that the curve passes through the points p0 , µ±1 0 . We find H0 =

m 1 . η( p0 ) m p02 − m 2 π 2

Hence the curve takes the form −1

µ+µ

sin Z = 2 cos Z + Z

m

m m η(Z ) − Z 2 − m2π 2 η(Z 0 ) p02 − m 2 π 2

.

In order to compare with Eq. (44), we must compute m in Eq. (61). We have Wk (0) = µk

sin Z k sin Z k , Wk (1) = , Wk (0) = 1 − µk cos Z k , Zk Zk Wk (1) = cos Z k − µk

Classical Volterra Model and a Lattice Version of Virasoro Algebra

841

and km (0) = km (1) =

Wk (0)E m (0) = Wk (0)Mkm E m (0), Z k2 − π 2 m 2 Wk (1)E m (1) = Wk (1)Mkm E m (1), Z k2 − π 2 m 2

where M is the matrix introduced in Eq. (36). Since | (0) = −1 (0)|W (0) and | (1) = −1 (1)|W (1), one has E m (0)m (0) =

−1 Mmk (µ−1 k − cos Z k )

k

E m (1)m (1) =

−1 Mmk (cos Z k − µk )

k

Zk , sin Z k

Zk sin Z k

hence m = E m (0)m (0) − E m (1)m (1) =

−1 Mmk

k

Zk γk , sin Z k

where we recall that γk = µk + µ−1 k − 2 cos Z k . This is exactly Eq. (40) and shows that we have recovered precisely the spectral curve Eq. (44). Finally, let us close this section by proving the Proposition 8. The function ψ(x, , µ) has no pole at the point Z = Z k , µ−1 k . Proof. Here we restore the factor

1 k (1−Z

2 /Z 2 ) k

,

−1 U (x, , µ)| Z =Z k ,µ−1 = µ−1 k Wk (x), =⇒ Vk (x, , µ)| Z =Z k ,µ−1 = µk kk (x), k

k

ψ(x, , µ)| Z =Z k ,µ−1 k

1 −1 −1 (x), (x)Wk (x) = regular. × W (x) − µ k kk k k k 2 2 k (1 − Z /Z k )

This shows that the same property for the eigenvector on the lattice has been preserved when taking the continuum limit.

9. Perturbation Theory

In the previous sections, we have manipulated determinants of infinite matrices quite freely. It is necessary now to investigate the conditions for the existence of the determinant det (x). We recall the free configuration Eq. (16). In the scaled variables it reads

$$Z_k^{(0)} = k\pi, \qquad \mu_k^{(0)} = (-1)^k.$$

By construction, when $(Z_k, \mu_k) = (Z_k^{(0)}, \mu_k^{(0)})$, we have (x) = Id, so that det (x) = 1. Clearly for det (x) to exist, we have to assume $(Z_k, \mu_k) \to (Z_k^{(0)}, \mu_k^{(0)})$ when $k \to \infty$. Hence we set

$$Z_k = k\pi + \delta Z_k, \qquad \mu_k = (-1)^k (1 + \delta\mu_k). \tag{62}$$

It is not difficult to see that to leading order in (δ Z k , δµk ), we have Wk (x) =

1 (δ Z k cos kπ x − δµk sin kπ x), kπ

and this implies k,m (x) =

2mπ δ Z k (m cos kπ x cos mπ x + k sin kπ x sin mπ x) k(Z k2 − π 2 m 2 ) (63) +δµk (−m sin kπ x cos mπ x + k cos kπ x sin mπ x) .
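As a quick numerical illustration of Eq. (63), the sketch below evaluates the leading-order matrix element for given k, m and x, with $Z_k = k\pi + \delta Z_k$ as in Eq. (62). The function name `matrix_element_leading` and the sample values are assumptions made for illustration only. On the diagonal m = k the value should approach 1 as $\delta Z_k \to 0$, and off the diagonal it should be of the order of the perturbation.

```python
import math


def matrix_element_leading(k, m, x, dZ, dmu):
    """Leading-order matrix element of Eq. (63), with Z_k = k*pi + dZ.

    The name and signature are hypothetical; only the formula comes from the text.
    """
    Zk = k * math.pi + dZ
    prefactor = 2.0 * m * math.pi / (k * (Zk ** 2 - (math.pi * m) ** 2))
    ck, sk = math.cos(k * math.pi * x), math.sin(k * math.pi * x)
    cm, sm = math.cos(m * math.pi * x), math.sin(m * math.pi * x)
    return prefactor * (
        dZ * (m * ck * cm + k * sk * sm)
        + dmu * (-m * sk * cm + k * ck * sm)
    )


if __name__ == "__main__":
    # Diagonal entry: close to 1 for a small perturbation.
    print(matrix_element_leading(3, 3, 0.3, 1e-6, 2e-6))
    # Off-diagonal entry: of the order of the perturbation.
    print(matrix_element_leading(3, 5, 0.3, 1e-6, 2e-6))
```

With the exact $Z_k$ inserted in the denominator, the diagonal entry equals $1/(1 + \delta Z_k/(2k\pi))$, which is independent of x and of $\delta\mu_k$.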

Notice that when m = k this formula gives that to leading order k,k (x) = 1, as it should be. A first consequence of these formulae is that if δ Z k = 0, δµk = 0, beyond a certain index k = kmax , then for k > kmax we have Wk (x) = 0, k,k (x) = 1, and k,m (x) = 0, ∀m = k. As a result only the first block of size kmax × kmax of the matrix (x) plays a role and all the constructions of the previous sections reduce to finite size matrices and vectors. If however we want to retain an infinite number of modes in order to keep the field theoretical character of the model, one has to say something about the rate at which δ Z k and δµk tend to zero when k → ∞. Disregarding a finite number of possibly large δ Z k , δµk which play no role in these convergence questions, we may assume that (x) is given by Eq. (63). As we have seen, it is of the form (x), (x) = Id + (x) is small. In fact, bounding the trigonometric functions by 1, we have where m k,m (x)| ≤ c (|δ Z k | + |δµk |), m = k. | k|k − m| Since |k − m| ≥ 1 when k = m, we may write as well m k,m (x)| ≤ c (|δ Z k | + |δµk |), m = k. | k It is not difficult to see that this formula is valid also for m = k (we have to adapt the constant c). It follows that (x)) log det (x) = Tr log(1 + ∞ ∞ n n 1 c n (x)| ≤ Tr| ≤ |δ Z k | + |δµk | . n n n=1

n=1

(64)

k

Hence a sufficient condition for the existence of the determinant is that the series $\sum_k (|\delta Z_k| + |\delta\mu_k|)$ converges. This is achieved if

$$|\delta Z_k| + |\delta\mu_k| < \frac{c}{k^{1+\epsilon}}, \qquad \epsilon > 0. \tag{65}$$
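The role of condition (65) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the helper names `partial_sum`, `series_upper_bound` and `log_series` are hypothetical, and the bound $c(1 + 1/\epsilon)$ is the standard integral-test estimate, not a formula from the paper. It shows that the sum over k stays bounded, and that the series $\sum_n x^n/n$ appearing in Eq. (64) converges to $-\log(1-x)$ once $x < 1$.

```python
import math


def partial_sum(c, eps, n_terms):
    """Partial sum of c / k**(1 + eps) for k = 1..n_terms."""
    return sum(c / k ** (1.0 + eps) for k in range(1, n_terms + 1))


def series_upper_bound(c, eps):
    """Integral-test bound: sum_{k>=1} c / k**(1+eps) <= c * (1 + 1/eps)."""
    return c * (1.0 + 1.0 / eps)


def log_series(x, n_terms=200):
    """Partial sum of sum_{n>=1} x**n / n, which tends to -log(1 - x) for 0 <= x < 1."""
    return sum(x ** n / n for n in range(1, n_terms + 1))


if __name__ == "__main__":
    print(partial_sum(1.0, 0.5, 100000), series_upper_bound(1.0, 0.5))
    print(log_series(0.5), math.log(2.0))
```

In this picture, adjusting the constant c amounts to making the bounded sum over k small enough that the argument x of the logarithmic series is below 1.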


One can then adjust the constant c such that the series in Eq. (64) converges. The condition Eq. (65) ensures that log det (x) exists. To build the potential u(x) however, this function has to be twice differentiable in x and this may require stronger conditions on . Now that we have found an expression for the potential u(x) in terms of a countable set of variables Z k , µk , we would like to check the Virasoro Poisson bracket directly. Recall that Tn e2iπ x . u(x) = p02 + T (x) + H0 δ(x), T (x) = 2∂x E| = n

Notice first that T (x) has no Fourier component T0 : 1 1 T0 = d x T (x) = 2 d x(E|) = 2( E(1)|(1) − E(0)|(0) ) = 0, 0

0

where we used that $E_m(0) = E_m(1) = 0$. The Fourier expansion of the potential u(x) reads

$$u(x) = \sum_n L_n e^{2i\pi n x} = p_0^2 + \sum_n (T_n + H_0)\, e^{2i\pi n x}.$$

We must therefore identify L 0 = p02 + H0 = p02 +

m 1 2 η(Z 0 ) m p0 − m 2 π 2

(66)

and

$$L_n = T_n + H_0 \ \Longrightarrow\ T_n = L_n - L_0 + p_0^2, \qquad n \neq 0.$$

If u(x) has Poisson bracket Eq. (4), the algebra of the $L_n$ reads

$$\{L_n, L_m\} = 8i\pi(n-m)L_{n+m} - 16i\pi^3 n^3 \delta_{n+m,0}.$$

The Poisson algebra for the $T_n$ is then closed:

$$\{T_n, T_m\} = 8i\pi(n-m)T_{n+m} - 8i\pi n T_n + 8i\pi m T_m - 16i(\pi^3 n^3 - \pi p_0^2 n)\delta_{n+m,0}$$

or, in a form that will be useful later,

$$\{T(x), T(y)\} = 2\delta'''(x-y) + 4\big(2p_0^2 + T(x) + T(y)\big)\delta'(x-y) - 4T'(x)\delta(y) + 4T'(y)\delta(x). \tag{67}$$

In this section, we consider the situation where all the variables $(Z_k, \mu_k)$ are close to the free configuration as in Eq. (62), and we perform a perturbation theory in $\delta Z_k$, $\delta\mu_k$. We have seen that to lowest order (0) (x) = Id by construction. So, we can write the expansion (x) = Id + (1) (x) + (2) (x) + · · · , |W (x) = |W (1) (x) + |W (2) (x) + · · · , where we have taken into account that |W (0) (x) = 0. It follows that |(x) = |(1) (x) + |(2) (x) + · · · ,


where |(1) = |W (1) , |(2) = |W (2) − (1) |W (1) , . . . . To lowest order, we find easily T (1) (x) = 2E|(1) = 4

kπ(δ Z k cos 2kπ x − δµk sin 2kπ x).

k

This shows in particular that $\delta Z_k$ and $\delta\mu_k$ are just the Fourier components of the potential in this first approximation. We see here clearly that for $T^{(1)}(x)$ to exist as a function (and not just as a distribution), we need $\epsilon > 1$ in Eq. (65). The Poisson bracket Eq. (3) becomes to leading order

$$\{\delta Z_k, \delta\mu_k\} = 2k\pi\left(1 - \frac{p_0^2}{k^2\pi^2}\right).$$

To define modes independent of the zero mode $p_0$ we introduce

$$a_k = \alpha_k(\delta\mu_k - i\,\delta Z_k), \qquad a_k^\dagger = \bar\alpha_k(\delta\mu_k + i\,\delta Z_k),$$

where the coefficients $\alpha_k$, $\bar\alpha_k$ satisfy (footnote 5)

$$\alpha_k\bar\alpha_k = \frac{1}{4\pi}\,\frac{k^2\pi^2}{p_0^2 - k^2\pi^2}.$$

With this choice, one has $\{a_k, a_k^\dagger\} = ik$. We can then rewrite

$$T^{(1)}(x) = 2i\sum_k k\pi\left(\frac{a_k}{\alpha_k}\,e^{2ik\pi x} - \frac{a_k^\dagger}{\bar\alpha_k}\,e^{-2ik\pi x}\right).$$
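The normalisation of $\alpha_k\bar\alpha_k$ can be checked directly from the leading-order bracket. The short computation below is a sketch that uses only the definitions of $a_k$, $a_k^\dagger$ and the antisymmetry of the bracket, and it assumes that $\alpha_k$, $\bar\alpha_k$ may be treated as constants at this order.

```latex
\begin{aligned}
\{a_k, a_k^\dagger\}
 &= \alpha_k\bar\alpha_k\,
    \{\delta\mu_k - i\,\delta Z_k,\ \delta\mu_k + i\,\delta Z_k\}
  = -2i\,\alpha_k\bar\alpha_k\,\{\delta Z_k, \delta\mu_k\} \\
 &= -2i\cdot\frac{1}{4\pi}\,\frac{k^2\pi^2}{p_0^2 - k^2\pi^2}
    \cdot 2k\pi\,\frac{k^2\pi^2 - p_0^2}{k^2\pi^2}
  = ik .
\end{aligned}
```

The factor $p_0^2 - k^2\pi^2$ cancels between the bracket and the normalisation, which is why the resulting modes no longer depend on the zero mode.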

It is now straightforward to compute the Poisson bracket

$$\{T^{(1)}(x), T^{(1)}(y)\} = 2\delta'''(x-y) + 8p_0^2\,\delta'(x-y).$$

This is the correct result for the Virasoro Poisson bracket in this approximation. Notice that the term $\delta'''(x-y)$ is exact already at this level. Higher order terms cannot contribute to it. Next, we look at the conserved quantities. The leading terms in the expansions of $\eta_m$ and $H_m$ are easy to find:

$$\eta_m \simeq 2m\pi\,\delta Z_m, \qquad H_m \simeq 2m^2\pi^2(\delta\mu_m^2 + \delta Z_m^2).$$

To see it, consider the defining relations of $\eta_m$,

$$\sum_m \frac{\eta_m}{Z_k^2 - \pi^2 m^2} = 1, \qquad \forall k.$$

5 It is known [14] that the poles at $p_0^2 = k^2\pi^2$ are classical remnants of the zeroes of the Kac determinant.


When $Z_k$ is given by Eq. (62), the dominant term in the above sum is m = k. The equation becomes $\eta_k/(2k\pi\,\delta Z_k) = 1$. The same argument starting with the equation of the spectral curve, Eq. (41), taken at the point $(Z_k, \mu_k)$ which belongs to it, yields the formula for $H_m$. Remark that the condition Eq. (65) ensures that the sums in the definition of the function $\eta(Z)$, Eq. (39), or in the definition of the spectral curve, Eq. (41), are convergent. Written with the oscillators $a_m$, $a_m^\dagger$, we find

$$H_m = 8\pi(p_0^2 - \pi^2 m^2)\,a_m^\dagger a_m, \qquad H_0 = 8\pi\sum_m a_m^\dagger a_m.$$

It is clear that the $H_m$ are in involution at this order. As we see, in first approximation, the dynamical system reduces to a set of decoupled harmonic oscillators. The generator $L_0$ is given by

$$L_0 = p_0^2 + 8\pi\sum_m a_m^\dagger a_m.$$

It is easy to verify that {L 0 , T (1) (x)}0 = −4∂x T (1) (x). These perturbative arguments are good indications that u(x) indeed satisfies the Virasoro Poisson bracket. Clearly we will not go very far in perturbation and we now look for a more formal proof of this fact. For that purpose, we need some preparation. 10. Some Identities Before computing Poisson brackets to check the Virasoro algebra, we collect a number of useful identities. We start with a formula for the inverse matrix −1 (x). It has the same form as (x). Proposition 9. Let us deﬁne F| = E|−1 .

(68)

Then, we can write −1 mk =

m Fk − m Fk Z k2 − π 2 m 2

.

(69)

Proof. Multiplying Eq. (55) on both sides by −1 , we get −1 Z 2 − K 2 −1 = −1 |W E |−1 − −1 |W E|−1 and so −1 mk =

(−1 |W )m (E|−1 )k − (−1 |W )m (E |−1 )k . π 2 m 2 − Z k2

But −1 |W = | + E||, E |−1 = F | + E|F|. Plugging into the above formula, we obtain Eq. (69).


Proposition 10. The vector F| satisﬁes a set of differential equations, F | + 2(E|) F| + F|Z 2 = 0.

(70)

Proof. The proof is the same as for |. From this we easily deduce (−1 ) = −|F|, which can also be proved using the similar property of . Let us define Ak (x) =

E m (x)m (x) , Z k2 − m 2 π 2 m

Ck (x) =

E m (x)m (x) , 2 2 2 2 m (Z k − m π )

Bk (x) =

E (x)m (x) m , 2 2 2 Z k −m π m

Dk (x) =

E (x)m (x) m . 2 2 2 2 (Z k −m π ) m

Proposition 11. We have the identity (1 − Bk )Wk + Ak Wk = 0.

(71)

Proof. This is just a rewriting of |W = | using Eq. (52) for (x).

Proposition 12. The following two identities hold (1 − Bk + Ak )Fk − Ak Fk = 0, (Bk

+

Z k2 Ak )Fk

+ (1 −

Bk )Fk

= 0.

(72) (73)

Proof. The first identity is a rewriting of F| = E|−1 using Eq. (69) for −1 (x). The second identity is just a rewriting of F | = E |−1 − E|F|. The above two identities form a linear system for Fk and Fk . Its compatibility implies the following: Proposition 13. (1 − Bk )2 + Ak Bk − Ak Bk + Ak + Z k2 A2k = 0.

(74)

k = Wk (1 − Bk ) + Wk Ak . F

(75)

Let us define

An important consequence of Eqs. (72,73) is k (x) are proportional, Proposition 14. The functions Fk (x) and F Fk (x) = −

ζk Fk (x). µk γk

(76)

The proportionality coefﬁcient is written in this speciﬁc way for later convenience. The quantity γk is the one deﬁned in Eq. (34).


Proof. Let us compute the Wronskian

k ) = Fk Wk (1 − Bk ) − Wk Bk + Wk Ak + Wk Ak W r (Fk , F

−Fk Wk (1 − Bk ) + Wk Ak

= −2k Wk (1 − Bk + Ak )Fk − Ak Fk

−Wk (Bk + Z k2 Ak )Fk + (1 − Bk )Fk = 0. Proposition 15. The following quantities are constants independent of x: ηm = m E m − m E m − < E > m E m 1 (m p − m p )(E m E p − E m E p ). − π 2 ( p2 − m 2 )

(77)

p =m

The quantities $\eta_m$ defined in Eq. (77) are in fact the same as the ones introduced in Eq. (37). Proof. To prove the first statement, just take the derivative with respect to x and use Eq. (54). To prove the second statement, use the $\eta_m$ defined in Eq. (77) to rewrite Eq. (74) as

$$\sum_m \frac{\eta_m}{Z_k^2 - m^2\pi^2} = 1, \qquad \forall k, \tag{78}$$

which is exactly the same as Eq. (37).

A straightforward consequence is the “trace” formula that will be useful later: E − E + E2 = − ηm .

(79)

m

We now compute the coefficients ζk appearing in Eq. (76). Proposition 16. The coefﬁcients ζk in Eq. (76) are determined by the set of equations ζk = 1, ∀ m. (80) 2 − m2π 2 Z k k These equations are dual to Eq. (78). Proof. Start with F|W = E|−1 |W = E|. Using Eq. (71,75,76 ), we have Fk Wk = − hence F|W =

k

ζk A k =

ζk (−Wk2 + Wk Wk )Ak = ζk Ak , µk γk

m

k

ζk Z k2 − m 2 π 2

E m m = E| =

Since this has to hold for all x, the only possibility is Eq. (80).

(81)

m

E m m .


We collect below a few more identities of the type of Eq. (81) that will be important later. Proposition 17. Fk Wk = ζk Ak , Fk Wk = −ζk (1 − Bk ), Fk Wk = ζk (1 − Bk + Ak ),

(82) (83) (84)

Fk Wk = ζk (Bk + Z k2 Ak ).

(85)

Next, we relate the $\eta_m$ and $\zeta_k$.

Proposition 18. The following relation holds:

$$\sum_m \frac{\eta_m}{(Z_k^2 - m^2\pi^2)^2} = \frac{1}{\zeta_k}.$$

Proof. We start from

km −1 mk = δkk .

m

When k = k this gives Wk Fk Dk − Wk Fk (Dk − Ak + Z k2 Ck ) − Wk Fk Ck + Wk Fk (Ck − Dk ) = 1, or using Eqs. (82–85), ζk 2Dk (1 − Bk ) + Dk Ak − Dk Ak + A2k − 2Z k2 Ck Ak − Ck Bk − Ck (1 − Bk ) = 1. Expanding this formula using the partial fraction identity, valid for $p \neq m$,

$$\frac{1}{(Z_k^2 - \pi^2 m^2)^2\,(Z_k^2 - \pi^2 p^2)} = -\frac{1}{\pi^2(p^2 - m^2)}\,\frac{1}{(Z_k^2 - \pi^2 m^2)^2} - \frac{1}{\pi^4(p^2 - m^2)^2}\left[\frac{1}{Z_k^2 - \pi^2 m^2} - \frac{1}{Z_k^2 - \pi^2 p^2}\right],$$

we get the result with ηm represented by Eq. (77).
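The partial fraction identity used in this proof is purely algebraic, so it can be checked with exact rational arithmetic. In the sketch below, `a` stands for $Z_k^2 - \pi^2 m^2$ and `b` for $Z_k^2 - \pi^2 p^2$, so that $a - b = \pi^2(p^2 - m^2)$; the function names and the sample values are assumptions chosen for illustration.

```python
from fractions import Fraction


def lhs(a, b):
    """1 / (a^2 b), with a = Z_k^2 - pi^2 m^2 and b = Z_k^2 - pi^2 p^2."""
    return 1 / (a * a * b)


def rhs(a, b):
    """Right-hand side of the identity, written with d = a - b = pi^2 (p^2 - m^2)."""
    d = a - b
    return -1 / (d * a * a) - (1 / (d * d)) * (1 / a - 1 / b)


if __name__ == "__main__":
    samples = [
        (Fraction(7, 3), Fraction(-5, 2)),
        (Fraction(1, 9), Fraction(13, 4)),
        (Fraction(-2, 7), Fraction(3, 11)),
    ]
    for a, b in samples:
        print(a, b, lhs(a, b) == rhs(a, b))
```

Working with `Fraction` avoids any rounding, so equality can be tested exactly rather than up to a tolerance.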

An immediate and important consequence is an expansion of the function $\eta(Z)$ near $Z^2 = Z_k^2$,

$$\eta(Z) = (Z^2 - Z_k^2)\,\frac{1}{\zeta_k} + \cdots.$$
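The relations between $\eta_m$ and $\zeta_k$ (Eqs. (78), (80) and Proposition 18) can be illustrated on a finite truncation. The sketch below is an illustration under explicit assumptions, not the paper's construction: it keeps only N modes, uses generic rational numbers `z[k]` in place of $Z_k^2$ and `y[m]` in place of $m^2\pi^2$, and takes $\eta(Z)$ to be the rational function $\prod_k (Z^2 - z_k)/\prod_m (Z^2 - y_m)$, whose residues give $\eta_m$ and whose derivative at $z_k$ gives $1/\zeta_k$. The function name `residue_coefficients` is hypothetical.

```python
from fractions import Fraction


def residue_coefficients(z, y):
    """Return (eta, zeta) for a finite truncation with len(z) == len(y).

    eta(s)  = prod_k (s - z[k]) / prod_m (s - y[m]) = 1 - sum_m eta[m] / (s - y[m])
    zeta[k] = 1 / eta'(z[k])
    """
    n = len(z)
    eta, zeta = [], []
    for m in range(n):
        num, den = Fraction(1), Fraction(1)
        for k in range(n):
            num *= y[m] - z[k]
        for j in range(n):
            if j != m:
                den *= y[m] - y[j]
        eta.append(-num / den)
    for k in range(n):
        num, den = Fraction(1), Fraction(1)
        for m in range(n):
            num *= z[k] - y[m]
        for j in range(n):
            if j != k:
                den *= z[k] - z[j]
        zeta.append(num / den)
    return eta, zeta


if __name__ == "__main__":
    z = [Fraction(11, 10), Fraction(41, 10), Fraction(44, 5)]
    y = [Fraction(1), Fraction(4), Fraction(9)]
    eta, zeta = residue_coefficients(z, y)
    for k in range(3):
        # Analogue of Eq. (78) and of Proposition 18 for this truncation.
        print(sum(eta[m] / (z[k] - y[m]) for m in range(3)),
              sum(eta[m] / (z[k] - y[m]) ** 2 for m in range(3)) * zeta[k])
```

In this truncated setting the three statements are just the vanishing of $\eta$ at $z_k$, the vanishing of $1/\eta$ at $y_m$, and the value of $\eta'(z_k)$, which is the content of the expansion $\eta(Z) = (Z^2 - Z_k^2)/\zeta_k + \cdots$.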

Returning to the formula for ψ(x, , µ), we can write it as ψ(x, , µ) =

µ − ei Z ∗ µ − e−i Z w(x, Z ) − w (x, Z ), 2i Z 2i Z

(86)


where we define 1 iZ w(x, Z ) = ei Z x 1 − E | 2 | + E| | , Z − K2 Z2 − K 2 1 iZ | − E| | . w ∗ (x, Z ) = e−i Z x 1 − E | 2 Z − K2 Z2 − K 2

(87) (88)

The functions $w(x, Z)$ and $w^*(x, Z)$ are defined on the Riemann sphere with a puncture at ∞. One can compute the Wronskian

$$w'\,w^* - w^{*\prime}\,w = 2iZ\,\eta(Z).$$

This Wronskian vanishes precisely when $Z^2 = Z_k^2$. Hence, when $Z = Z_k$, $w(x, Z_k)$ becomes proportional to $w^*(x, Z_k)$. Indeed we have

$$w(x, Z_k) = e^{iZ_k x}(1 - B_k + iZ_k A_k), \qquad w^*(x, Z_k) = e^{-iZ_k x}(1 - B_k - iZ_k A_k).$$

Then Eq. (71) can be rewritten as

$$w(x, Z_k) = \frac{\alpha_k^{(+)}}{\alpha_k^{(-)}}\,w^*(x, Z_k), \tag{89}$$

where

$$\alpha_k^{(\pm)} = 1 - \mu_k e^{\pm iZ_k}, \qquad \alpha_k^{(+)}\alpha_k^{(-)} = \mu_k\gamma_k. \tag{90}$$
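The product formula in Eq. (90) follows from expanding $(1 - \mu e^{iZ})(1 - \mu e^{-iZ})$ and using $\gamma_k = \mu_k + \mu_k^{-1} - 2\cos Z_k$. A minimal numerical sketch, with hypothetical helper names `alpha` and `gamma` and arbitrary complex sample values, is:

```python
import cmath


def alpha(mu, Z, sign):
    """alpha^(+/-) = 1 - mu * exp(+/- i Z); sign is +1 or -1."""
    return 1 - mu * cmath.exp(sign * 1j * Z)


def gamma(mu, Z):
    """gamma = mu + 1/mu - 2 cos Z."""
    return mu + 1 / mu - 2 * cmath.cos(Z)


if __name__ == "__main__":
    mu, Z = 0.8 - 0.3j, 2.1 + 0.4j
    print(alpha(mu, Z, +1) * alpha(mu, Z, -1))
    print(mu * gamma(mu, Z))
```

The two printed values should agree up to rounding, for any nonzero `mu`.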

k (x) introduced in Eq. (75) is solution of Eq. (70). It is not difficult to see The function F that the other solution of this equation is k . k = Ak Wk + 2Z k (Wk Dk − Wk Ck ) + x F G In terms of w(x, Z ), w∗ (x, Z ) defined in Eqs. (87,88), we have k = 1 (α (−) w| Z k + α (+) w ∗ | Z k ) = α (−) w| Z k = α (+) w ∗ | Z k , F k k k 2 k i k = − (α (−) ∂ Z w| Z k − α (+) ∂ Z w ∗ | Z k ). G k 2 k

(91) (92)

This will play an important role below.

11. Virasoro Algebra

We are now ready to compute the Poisson bracket {T (x), T (y)}. The result is precisely the algebra of the $T_n = L_n - L_0$, Eq. (67).
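For orientation, the algebra of the $T_n$ can be recovered from that of the $L_n$ in a few lines. The sketch below assumes that $p_0^2$ Poisson-commutes with the $L_n$ (it is treated as a constant here) and writes $c = L_0 - p_0^2$, so that $L_k = T_k + c$ for $k \neq 0$.

```latex
\begin{aligned}
\{T_n, T_m\}
 &= \{L_n, L_m\} - \{L_n, L_0\} - \{L_0, L_m\} \\
 &= 8i\pi(n-m)L_{n+m} - 16i\pi^3 n^3\delta_{n+m,0}
    - 8i\pi n L_n + 8i\pi m L_m , \qquad n, m \neq 0 . \\[4pt]
n+m \neq 0:\quad
 & 8i\pi\,c\,\big[(n-m) - n + m\big] = 0
   \ \Longrightarrow\
   \{T_n, T_m\} = 8i\pi(n-m)T_{n+m} - 8i\pi n T_n + 8i\pi m T_m . \\[4pt]
n+m = 0:\quad
 & 8i\pi(2n)(c + p_0^2) - 8i\pi n\,c - 8i\pi n\,c = 16i\pi p_0^2 n
   \ \Longrightarrow\
   \{T_n, T_{-n}\} = - 8i\pi n T_n + 8i\pi m T_m
   - 16i(\pi^3 n^3 - \pi p_0^2 n) .
\end{aligned}
```

In the second case $m = -n$ and $T_0 = 0$, so the term $8i\pi(n-m)T_{n+m}$ drops out, in agreement with the closed algebra quoted before Eq. (67).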


Proposition 19. Let T (x) = 2∂x E(x)|(x), and X (x), Y (x) be two arbitrary test functions. Then we have

$$\left\{\int_0^1 dx\,X(x)T(x),\ \int_0^1 dy\,Y(y)T(y)\right\} = -\int_0^1 dx\,(X'''Y - XY''') - 4\int_0^1 dx\,(X'Y - XY')(p_0^2 + T) + 4\int_0^1 dx\,X(x)\delta(x)\int_0^1 dy\,Y(y)T'(y) - 4\int_0^1 dx\,X(x)T'(x)\int_0^1 dy\,Y(y)\delta(y).$$

(93) The proof is rather long and we will split it into several lemmas. Since | = −1 |W , we have {|1 , |2 } = −1 −1 1 2 × {1 , 2 }|1 |2 −{1 , W2 }|1 −{W1 , 2 }|2 + {W1 , W2 } , where the indices 1, 2 refer to the customary tensor notation. In this expression, all Poisson brackets can be computed explicitly. Using the fact that rows with different indices in and W Poisson commute and using only Eq. (71), we arrive at p02 sin Z k −1 −1 {m (x), n (y)} = −2 mk (x)nk (y) 1− 2 Z k γk Zk k k (x)G k (y) − F k (y)G k (x) , × F (94) where γk are defined in Eq. (34). Multiplying by E m (x)E n (y) and remembering that E|−1 = F| we get ζ 2 sin Z k p02 k {E(x)|(x), E(y)|(y)} = 2 1− 2 , Z k γk3 µ2k Zk k

(95) × Ak (y)Bk (x) − Ak (x)Bk (y) where k2 (x), Bk (x) = F k (x)G k (x). Ak (x) = F Using Eqs. (87,88), we can write Ak (x) = µk γk w(x, Z )w ∗ (x, Z )| Z =Z k , µk γk ∗ (w (x, Z )∂ Z w(x, Z ) − w(x, Z )∂ Z w ∗ (x, Z ))| Z =Z k . Bk (x) = 2i The strategy to evaluate the right hand side of Eq. (95) is to rewrite it as a sum over the residues of certain poles of a function on the Riemann Z -sphere. This sum can then be transformed as a sum over the residues of the other poles (there will be none in our case)


plus an integral over a small circle at infinity surrounding an essential singularity. This last integral can then be evaluated using the known asymptotics of the function. Let us define

$$a(Z) = \frac{Z}{2i\sin Z}\,e^{-iZ}, \qquad b(Z) = -\frac{Z}{2i\sin Z}\,\cos Z, \qquad c(Z) = \frac{Z}{2i\sin Z}\,e^{iZ}, \tag{96}$$

and introduce the functions 1 = w(x, Z )w ∗ (y, Z ) − w ∗ (x, Z )w(y, Z ), 2 = a(Z )w(x, Z )w(y, Z ) +b(Z )(w(x, Z )w∗ (y, Z ) + w ∗ (x, Z )w(y, Z )) +c(Z )w ∗ (x, Z )w ∗ (y, Z ). These functions defined on the Riemann Z -sphere have poles at the points ± mπ and have an essential singularity at infinity. Let = 1 2 and recall the definition of η(Z ) Eq. (39). Lemma 5.

ζ 2 sin Z k p02 p02 k Res 2 1− 2 1 − 2 = −2 η (Z ) Z Z k γk3 µ2k Zk ±Z k k

× Ak (y)Bk (x) − Ak (x)Bk (y) .

(97)

Proof. The factor η2 (Z ) introduces double poles at ±Z k because η(Z k ) = 0. However using Eq. (89), we see immediately that 1 |±Z k = 0, so that the poles are in fact simple. Remembering Eq. (86), we have ζ2 p02 p02 k ∂ Res 2 | + ∂ | 1 − 1− 2 = Z Z Z −Z k k . η (Z ) Z 4Z k2 Z k2 ±Z k k

We need to compute ∂ Z |±Z k = ∂ Z 1 |±Z k 2 |±Z k . Evaluating at Z k gives αk+ αk− 2i Ak (y)Bk (x) − Ak (x)Bk (y) , ∂ Z | Z k = 2 2 a(Z k ) − + 2b(Z k ) + c(Z k ) + αk µk γk αk while using w(−Z k ) = w ∗ (Z k ), ∂ Z w|−Z k =−∂ Z w ∗ | Z k , we also have αk+ αk− 2i ∂ Z |−Z k = 2 2 c(−Z k ) − + 2b(−Z k ) + a(−Z k ) + αk µk γk αk

× Ak (y)Bk (x) − Ak (x)Bk (y) . The result follows from the identities: αk+ α− Z k sin Z k a(Z k ) − + 2b(Z k ) + c(Z k ) k+ = 2i , α γk αk k c(−Z k )

αk+

αk−

+ 2b(−Z k ) + a(−Z k )

αk− Z k sin Z k = 2i . αk+ γk


Next we have to examine the poles at ±mπ in the expression p02 1− 2 . η2 (Z ) Z We rewrite 1 and 2 as 1 = (w(x) − w ∗ (x))w ∗ (y) − (w(y) − w ∗ (y))w ∗ (x), Z 2 = × (w ∗ (x) − w(x))(ei Z w ∗ (y) − e−i Z w(y)) 4i sin Z

+(w ∗ (y) − w(y))(ei Z w ∗ (x) − e−i Z w(x)) .
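The rewriting of the two functions above is an algebraic identity in the four values $w(x)$, $w^*(x)$, $w(y)$, $w^*(y)$ and in Z, with a, b, c as in Eq. (96). The sketch below checks it numerically for arbitrary complex inputs; the function names and the sample values are assumptions made for illustration, and no property of the actual functions w, w* is used.

```python
import cmath


def coeffs(Z):
    """a(Z), b(Z), c(Z) of Eq. (96)."""
    pref = Z / (2j * cmath.sin(Z))
    return pref * cmath.exp(-1j * Z), -pref * cmath.cos(Z), pref * cmath.exp(1j * Z)


def first_expanded(wx, wsx, wy, wsy):
    return wx * wsy - wsx * wy


def first_rewritten(wx, wsx, wy, wsy):
    return (wx - wsx) * wsy - (wy - wsy) * wsx


def second_expanded(Z, wx, wsx, wy, wsy):
    a, b, c = coeffs(Z)
    return a * wx * wy + b * (wx * wsy + wsx * wy) + c * wsx * wsy


def second_rewritten(Z, wx, wsx, wy, wsy):
    ep, em = cmath.exp(1j * Z), cmath.exp(-1j * Z)
    pref = Z / (4j * cmath.sin(Z))
    return pref * ((wsx - wx) * (ep * wsy - em * wy)
                   + (wsy - wy) * (ep * wsx - em * wx))


if __name__ == "__main__":
    Z = 0.7 + 0.2j
    vals = (1.3 - 0.4j, -0.2 + 0.9j, 0.5 + 0.5j, 2.0 - 1.1j)
    print(first_expanded(*vals) - first_rewritten(*vals))
    print(second_expanded(Z, *vals) - second_rewritten(Z, *vals))
```

The rewritten forms make it visible that both functions vanish whenever $w = w^*$ at x or at y, which is what the pole analysis that follows relies on.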

Recalling the formula Eqs. (87, 88) for w(x, Z ) and w∗ (x, Z ), we see that when Z = 0, we have w = w∗ so that 1 = 0(Z ) and 2 = 0(Z 2 ). Hence we have no pole at Z = 0. When Z = ±π m + , w=

1 ∗(−1) 1 (−1) (0) ∗(0) w±m + w±m + · · · , w ∗ = w±m + w±m + ···

with (−1)

∗(−1)

w±m (x) = w±m (x) = ∓π mm (x). Because the two leading terms are the same, both w∗ (x, Z ) − w(x, Z ) and ei Z w ∗ (x, Z ) − e−i Z w(x, Z ) are regular. So 1 and 2 both behaves like 1/. Since 1/η2 (Z ) behaves like 2 , the whole thing is in fact regular. We come to the conclusion that everything happens at infinity. We want to compute 1 1 1 1 d x X (x)T (x), dyY (y)T (y) = 4 d x X (x) dyY (y) dZ C∞ 0 0 0 0 p02 1 ∂x ∂ y . × 1− 2 (98) Z η2 (Z ) Let w (x, Z ) = η−1/2 (Z ) w(x, Z ), w ∗ (x, Z ) = η−1/2 (Z ) w ∗ (x, Z ). The wronskian of w (x, Z ) and w ∗ (x, Z ) is 2i Z and therefore these functions coincide with the Baker–Akhiezer functions which are usually introduced in the pseudo-differential approach to the KdV hierarchy (see e.g. [15]). At Z ∞, we have ω(x) ω (x) + ω2 (x) + w (x, Z ) = ei Z x 1 − + · · · , iZ 2(i Z )2 ω(x) ω (x) + ω2 (x) + w ∗ (x, Z ) = e−i Z x 1 + + · · · , iZ 2(i Z )2 where we have set ω(x) = E(x).


Equation (79) is needed to verify this formula. Using these asymptotic forms, we find

∂x w 2 (x, Z ) =

1

An (x)(i Z )

n

e2i Z x ,

n=−∞

∗2

∂x w (x, Z ) =

1

(−1) An (x)(i Z ) n

n

e−2i Z x ,

n=−∞ −1 ∂x w (x, Z ) w ∗ (x, Z ) = C2n (x)(i Z )2n , n=−∞

where A1 = 2,

8 A−1 = 4ω2 (x), A−2 = − ω3 − 2 3 x 4 4 = ω + 4ω ω2 , C−2 (x) = ω (x). 3

A0 = −4ω(x), A−3

x

ω2 ,

Consider the term proportional to b(Z ) in Eq. (98): p02 dZ b(Z ) 1 − 2 2 (x, Z )∂ y w ∗2 (y, Z ). 4 d xd y(X (x)Y (y) − X (y)Y (x))∂x w 2iπ Z C∞ Lemma 6. Let us deﬁne I p (z) = C∞

dZ b(Z )(i Z ) p e2i Z z . 2iπ

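We have

$$I_{-1}(z) = -\tfrac{1}{2}\,\delta(z), \quad I_0(z) = -\tfrac{1}{4}\,\delta'(z), \quad I_1(z) = -\tfrac{1}{8}\,\delta''(z), \quad I_2(z) = -\tfrac{1}{16}\,\delta'''(z),$$

$$I_{-n}(z) = -\frac{2^{n-3}}{(n-2)!}\,\epsilon(z)\,z^{n-2}, \qquad n \geq 2,$$

where $\epsilon(z)$ denotes the sign function.

Proof. It is clear that $\partial_z I_p(z) = 2I_{p+1}(z)$, hence we can determine all the $I_p(z)$ recursively. For $p \geq 0$ this is done by successively differentiating $I_{-1}(z)$, which is easy to calculate,

$$I_{-1}(z) = \oint_{C_\infty}\frac{dZ}{2i\pi}\,b(Z)\,\frac{1}{iZ}\,e^{2iZz} = \frac{1}{2}\oint_{C_\infty}\frac{dZ}{2i\pi}\,\frac{\cos Z}{\sin Z}\,e^{2iZz} = -\frac{1}{2}\sum_{n\in\mathbb{Z}} e^{2in\pi z} = -\frac{1}{2}\,\delta(z).$$

For $p \leq -2$ we have to successively integrate $I_{-1}(z)$. For this we need boundary conditions which are provided by

$$\oint_{C_\infty}\frac{dZ}{2i\pi}\,b(Z)\,Z^{-p} = 0, \qquad p \geq 2. \tag{99}$$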


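This is because

$$\sum_{p=2}^{\infty}\zeta^p\oint_{C_\infty}\frac{dZ}{2i\pi}\,b(Z)\,Z^{-p} = -\frac{\zeta}{2i}\sum_{p=2}^{\infty}\oint_{C_\infty}\frac{dZ}{2i\pi}\,\frac{\cos Z}{\sin Z}\,\zeta^{p-1}Z^{-p+1} = -\frac{1}{2i}\oint_{C_\infty}\frac{dZ}{2i\pi}\,\frac{\cos Z}{\sin Z}\,\frac{\zeta^2}{Z-\zeta} = \frac{\zeta^2}{2i}\left(\cot\zeta - \frac{1}{\zeta} - \sum_{n=1}^{\infty}\frac{2\zeta}{\zeta^2 - n^2\pi^2}\right) = 0.$$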
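The last equality rests on the standard partial fraction expansion of the cotangent, $\cot\zeta = 1/\zeta + \sum_{n\geq 1} 2\zeta/(\zeta^2 - n^2\pi^2)$. A minimal numerical sketch of this expansion is given below; the function name `cot_partial_fractions`, the sample point and the truncation order are assumptions for illustration.

```python
import math


def cot_partial_fractions(zeta, n_terms):
    """Truncated expansion 1/zeta + sum_{n=1}^{n_terms} 2 zeta / (zeta^2 - n^2 pi^2)."""
    total = 1.0 / zeta
    for n in range(1, n_terms + 1):
        total += 2.0 * zeta / (zeta ** 2 - (n * math.pi) ** 2)
    return total


if __name__ == "__main__":
    zeta = 0.7
    exact = math.cos(zeta) / math.sin(zeta)
    print(exact, cot_partial_fractions(zeta, 200000))
```

The truncated sum converges slowly, with an error of order $2\zeta/(\pi^2 N)$ after N terms, so a large N is needed for a close match.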

Denoting

F p (x, y) =

(−1)m An (x)Am (y),

m+n= p

we get

1

4

dx 0

1

dy (X (x)Y (y) − X (y)Y (x))

0

p=2

F p (x, y)(I p (x − y) + p02 I p−2 (x − y)).

−∞

In this expression, we separate the terms with a δ(x − y) function or its derivative which will lead to local terms (L b ), and the non-local terms (N L b ) which are proportional to (x − y) : L b = 4 d xd y(X (x)Y (y) − X (y)Y (x))

× F2 I2 + F1 I1 + (F0 + p02 F2 )I0 + (F−1 + p02 F1 )I−1 , ∞ N L b = 4 d xd y(X (x)Y (y) − X (y)Y (x)) (F−2−n + p02 F−n )I−n−2 . (100) n=0

We have F2 (x, y) = −4, F0 (x, y) = −8 (ω(x) − ω(y))2 ,

F1 (x, y) = 8 (ω(x) − ω(y)) , F−1 (x, y) =

16 (ω(x) − ω(y))3 + 4 3

x

ω2 .

y

The local terms are 1 1 1 δ (x − y) + p02 δ (x − y) Lb = 4 dx dy(X (x)Y (y) − X (y)Y (x)) 4 0 0 1 − (ω(x)−ω(y))δ (x − y) − 2(ω(x) − ω(y))2 δ (x − y) − F−1 (x, y)δ(x − y) . 2 The last two terms obviously vanish and what remains is 1 d x −(X Y − X Y ) − 4(X Y − X Y )( p02 + 2ω ) . Lb = 0


Lemma 7. The non local term Eq. (100) is identically zero. Proof. The non local term reads 1 1 dx dy(X (x)Y (y) − X (y)Y (x))(x − y) N L b = −2 0 0 '∞ ( 2n 2 n F−2−n (x, y) + p0 F−n (x, y) × (x − y) . (n)! n=0

2 (x, Z )∂ w ∗2 The first sum is just the coefficient of (i Z )−2 in the formal expansion of ∂ w 0 (y, Z ) while the second sum is the coefficient of (i Z ) . Hence we have 1 1 N Lb = 2 dx dy(X (x)Y (y) − X (y)Y (x))(x − y) 0 0 dZ (Z − p02 Z −1 ) ∂ w 2 (x, Z )∂ w ∗2 (y, Z ). × 2iπ C∞ The above expression is zero in the following sense. Let us write dZ (Z − p02 Z −1 ) ∂ w 2 (x, Z )∂ w ∗2 (y, Z ) C∞ 2iπ ∞ (y − x)i dZ = (Z − p02 Z −1 ) ∂ w 2 (x, Z )∂ i+1 w ∗2 (x, Z ). i! C∞ 2iπ i=1

We will show that all the integrals around C∞ in the right-hand side are identically zero. Since the function w (x, Z ) satisfies the Schroedinger equation, its square w 2 (x, Z ) satisfies a third order differential equation D w 2 (x, Z ) = −4Z 2 ∂ w 2 (x, Z ), D = ∂ 3 + 8ω ∂ + 4ω . Let us introduce a pseudo differential operator such that w 2 (x, Z ) = e2i Z x . Then D = ∂∂ 2 −1 . Since D is anti self-adjoint, we also have D = −D∗ = ∗−1 ∂ 2 ∗ ∂. It follows that w ∗2 (x, Z ) which is solution of (−D∗ ) w ∗2 (x, Z ) = −4Z 2 ∂ w ∗2 (x, Z ) can be written as w ∗2 (x, Z ) = ∂ −1 ∗−1 ∂e−2i Z x .


Hence ∂w 2 = ∂e2i Z x , ∂ i+1 w ∗2 = ∂ i ∗−1 ∂e−2i Z x . Finally, we have to compute dZ 2 (x, Z ))∂ i+1 w ∗2 (x, Z ) (Z − p02 Z −1 ) (∂ w 2iπ C∞ dZ = (Z − p02 Z −1 ) (∂e2i Z x ) × ∂ i ∗−1 ∂e−2i Z x 2iπ C∞ dZ −1 (∂e2i Z x ) × ∂ i ∗−1 (∂ 2 + 4 p02 )e−2i Z x . = 2i C∞ 2iπ We recall the formula (see e.g. [15]) dZ (Dei Z x )(Fe−i Z x ) = Res∂ (D F ∗ ), C∞ 2iπ where Res∂ is Adler’s residue [25]. So our expression is equal to Res∂ (∂(∂ 2 + 4 p02 )−1 ∂ i ) = Res∂ ((D + 4 p02 ∂)∂ i ) = 0, because (D + 4 p02 ∂)∂ i is a differential operator.

Consider next the term proportional to a(Z ) in Eq. (98), 1 1 dx dy X (x)Y (y) − X (y)Y (x) 4 0 0

p02 dZ a(Z ) 1 − 2 ∂x w (y, Z ) w ∗ (y, Z ) . × 2 (x, Z )∂ y w Z C∞ 2iπ Lemma 8. Let us deﬁne J p (x) = C∞

dZ a(Z )(i Z ) p e2i Z x , 2iπ

p = −1, −2, . . . .

One has J−1 (x) =

x p−2 1 δ(x), J− p (x) = 2 p−2 ((x) − 1), 2 ( p − 2)!

p ≥ 2.

Proof. One has ∂x J p (x) = 2J p+1 (x). The calculation of J−1 (x) is easy. Next, we need boundary conditions to determine the other J p by integration ∞ d Z e−i Z 1 ζ2 (iζ )n J−n (0) = 2i C∞ 2iπ sin Z Z − ζ n=2 ' ( $ −iζ % e ζ2 ζ2 1 e−iζ ζ2 + − + cot ζ = . = − = 2i sin ζ ζ − nπ 2i sin ζ 2 n∈Z


It follows that all the non-local terms containing J p (x) for p ≤ −2 vanish when 0 < x < 1. The a(Z ) term is

1

La = 4

1

dx 0

0

dy X (x)Y (y) − X (y)Y (x) A1 (x)C−2 (y)J−1 (x)

or La = 4

1

1

d x X (x)δ(x)

0

dyY (y)ω (y) − 4

0

1

1

d xY (x)δ(x)

0

dy X (y)ω (y).

0

Finally, the term in c(Z ) is just equal to the a(Z ) one and double it. Putting everything together, we arrive at Eq. (93). In the course of this proof, we have shown the identities dZ (Z − p02 Z −1 ) (∂ w 2 (x, Z ))∂ i+1 w ∗2 (x, Z ) = 0, ∀i ≥ 0. 2iπ C∞ These are quartic identities on the coefficients of w (x, Z ). Of course, we also have the quadratic relations of Hirota and Sato that were interpreted by Sato as Plücker relations defining the infinite Grassmannian, allowing to give a precise definition of τ -functions. The above relations are quartic relations on τ -functions analogous to the quartic relations on Riemann Theta functions. 12. Poisson Bracket {L 0 , u(x)} In the previous section, we have obtained the Poisson bracket for the generators Tn = L n − L 0 . We now have to reintroduce L 0 and check that it has the correct Poisson brackets. The candidate for L 0 was given in Eq. (66). Let us recall it: L 0 = p02 +

k 1 , 2 η( p0 ) p0 − k 2 π 2 k

where k = E k (0)k (0) − E k (1)k (1)) and η(Z ) = 1 −

m

E (0)m (0) ηm m = 1 − , 2 − m2π 2 Z 2 − m2π 2 Z m

where we have used Eq. (77), evaluated at x = 0, to express ηm . Proposition 20. We have the following Poisson bracket:

1

L 0, 0

dyY (y)u(y) = −4

1 0

dyY (y) (L 0 − p02 )δ (y) + T (y) .

This shows that {L 0 , ·} acts on u(y) as ∂ y , as it should be.

(101)


Again, the proof is long and we will split it into several lemmas. We need to compute {m (x), T (y)}. and {m (x), T (y)} for x = 0 and x = 1. Multiplying Eq. (94) by E n (y) and remembering that E|−1 = F| and k (x), we get Fk (x) = − γkζµk k F {m (x), E(y)(y)} = −2m (x)

ζk2 (1 − p02 Z k−2 )

sin Z k Zk

k

µ2k γk3 (Z k2 − π 2 m 2 )

k (x) F k (x) F k (y)G k (y) − F k2 (y) F k (x)G k (x) × F +2m (x)

ζk2 (1 − p02 Z k−2 )

sin Z k

µ2k γk3 (Z k2 − π 2 m 2 )

Zk

k

k2 (x) F k (y)G k (y) − F k2 (y) F k (x)G k (x) . × F

(102)

As before, this can be expressed in terms of 1 and 2 . We find {m (x), E(y)(y)} = − Res±Z k k

(1 − p02 Z −2 ) 1 (m (x)∂x 2−m (x)2 ). η2 (Z )(Z 2 − π 2 m 2 )

This formula is the starting point to begin the computation of {η( p0 ), E(y)(y)}. Setting x = 0 in Eq. (102), the term proportional to m (x) vanishes because Fk (0) = 0. We are left with (x = 0, but we keep it for a while) {m (x), E(y)(y)} = −m (x)

Res±Z k

k

Multiplying by −

2 m ( p0

(0) Em 2 ( p0 −m 2 π 2 )

(1 − p02 Z −2 ) 1 ∂x 2 . (103) η2 (Z )(Z 2 − π 2 m 2 )

and summing over m, in the right hand side appears the sum

E m (0)m (0) ηm =− 2 2 2 2 2 2 2 2 2 2 2 − m π )(Z k − π m ) m ( p0 − π m )(Z k − π m ) ηm ηm −1 − 2 = 2 Z k − p02 m p02 − m 2 π 2 Zk − π 2m2 =

where we used that

&

ηm m Z 2 −π 2 m 2 k

1 η( p0 ), Z k2 − p02

= 1, ∀k. The factor 1/(Z k2 − p02 ) cancels with the

factor (1 − p02 Z k−2 ) in Eq. (103) and we are left with {η( p0 ), E(y)} = −η( p0 )

Res±Z k

k

Finally η( p0 ),

1 0

dyY (y)T (y) = 2η( p0 ) 0

1

1 1 ∂x 2 . η2 (Z )Z 2

dyY (y) C∞

dZ 1 2 ), (104) 1 ∂x ∂ y ( 2iπ Z 2


where we used the asymptotics expressions for w and w ∗ inside 1 and 2 , hence 2 removing the η (Z ) factor. The last step consists in evaluating the integral around C∞ . The result is: Lemma 9. η( p0 ),

1

1

dyY (y)T (y) = 4η( p0 )

0

dyY (y)δ (y).

(105)

0

Proof. Let us consider the integral over C∞ in Eq. (104). The term containing b(Z ) can be written as d Z b(Z ) Lb = 2 C∞ 2iπ Z × ( w (x, Z ) w ∗ (x, Z ) − w (x, Z ) w ∗ (x, Z ))∂ y ( w (y, Z ) w ∗ (y, Z )) + w (x, Z ) w (x, Z )∂ y w ∗2 (y, Z ) − w ∗ (x, Z ) w ∗ (x, Z )∂ y w 2 (y, Z ) . On the first line, we recognize the wronskian of w (x, Z ) and w ∗ (x, Z ), which is just ∗ −2 equal to −2i Z . Since ∂ y ( w (y, Z ) w (y, Z )) = Z S2 (y) + · · · this term vanishes by Eq. (99). Hence d Z b(Z ) 1 2 ∗2 ∗2 2 Lb = ∂ w (x, Z )∂ w (y, Z ) − ∂ w (x, Z )∂ w (y, Z ) (106) x y x y 2 C∞ 2iπ Z 2 ∞ =− F− p (x, y)I− p−2 (x − y) p=−2

1 = δ (y − x) + (x − y) 2

C∞

dZ 1 ∂x w 2 (x, Z )∂ y w ∗2 (y, Z ). 2iπ Z

(107)

The (x − y) term is zero because d Z 1 i+1 2 d Z 1 i+1 2i Z x ∂ w (∂ e (x, Z )∂ w ∗2 (x, Z ) = )(∗−1 ∂e−2i Z x ) 2iπ Z 2iπ Z C∞ C ∞ d Z i+1 2i Z x (∂ e = )(∗−1 e−2i Z x ) 2iπ C∞ = Res∂ (∂ i+1 · −1 ) = 0.

Next, we consider the a(Z ) term. It reads d Z a(Z ) L a = ∂y ( w(x, Z ) w ∗ (y, Z ) − w ∗ (x, Z ) w (y, Z )) w (x, Z ) w (y, Z ) 2iπ Z2 C∞ d Z a(Z ) 1 ∂x w = 2 (x, Z )∂ y ( w (y, Z ) w ∗ (y, Z )) 2 2 C∞ 2iπ Z 1 ∗ w (x, Z ) w (x, Z ) + w ∗ (x, Z ) w (x, Z ))∂ y w 2 (y, Z ) − ( 2 1 ∗ w (x, Z ) w (x, Z ) − w ∗ (x, Z ) w (x, Z ))∂ y w 2 (y, Z ) . − ( 2


Again, the last term is the wronskian and so d Z a(Z ) 1 La = w 2 (y, Z ) + ∂x w 2 (x, Z )∂ y ( w(y, Z ) w ∗ (y, Z )) −i Z ∂ y ( 2 2 C∞ 2iπ Z 1 − ∂x ( w ∗ (x, Z ) w (x, Z ))∂ y ( w 2 (y, Z )) . 2 The first term is ∞ dZ A− p (y)J− p−1 (y) a(Z )(i Z )− p−1 A− p (y)e2i Z y = C∞ 2iπ p=−1

=

1 δ (y) − ω(y)δ(y) + ((y) − 1) 2 ∞ (2y) p−1 . × A− p (y) ( p − 1)! p=1

The last sum is zero because 0 < y < 1. The δ(y) term vanishes because ω(0) = 0. The second term is 1 d Z a(Z ) 1 A− p (x)C−q (y)(i Z )− p−q e2i Z x = − ((x) − 1) 2 C∞ 2iπ Z 2 2

× A1 (x)C−2 (y)(2x) + · · · . (108) This vanishes when x = 0. The third term is 1 d Z a(Z ) 1 − A− p (y)C−q (x)(i Z )− p−q e2i Z y = ((y) − 1) 2 C∞ 2iπ Z 2 2

× A1 (y)C−2 (x)(2y) + · · · and this vanishes when 0 < y < 1. Finally, it is easy to see that the c(Z ) term is equal to the a(Z ) one. Putting everything together, we get Eq. (105). The last result we need is the following: Lemma 10. 1 E (0) (0) − E (1) (1) 1 m m m m , dyY (y)T (y) = −4η( p ) dyY (y)T (y). 0 2 − π 2m2 p 0 0 0 m (109) Proof. Taking the derivative with respect to x of Eq. (102) and remembering that Fk (0) = Fk (1) = 0, the remaining terms are (there is a cancellation in the m (x) term) {m (x), E(y)(y)} = −2m (x)

sin Z k k

Zk

ζk2 (1 − p02 Z k−2 ) µ2k γk3 (Z k2 − π 2 m 2 )

k (x) F k (y)G k (y) − F k2 (y) F k (x)G k (x) , k (x) F × ∂x F


where it is understood that x = 0 or x = 1. By exactly the same argument as before 1 E (x) (x) 1 m m , dyY (y)T (y) = 2η( p ) dyY (y)∂ y ∂x 0 p02 − π 2 m 2 0 0 m 1 dZ 1 ∂x 2 . × 2 C∞ 2iπ Z η(Z ) Hence, we just have to take the derivative with respect to x of the previous result, before setting x = 0 or x = 1. At x = 0 we get 1 2η( p0 ) dyY (y) − δ (y) + 4ω (y) . 0

The δ (y) term comes from Eq. (107) while the second term comes from Eq. (108) doubled by the c(Z ) contribution. At x = 1 only the periodic δ (y) remains. Taking the difference we obtain Eq. (109). Putting everything together, we arrive at Eq. (101), and this finishes our proof that u(x) does satisfy the Virasoro Poisson bracket.

13. Conclusion

We have succeeded in taking the continuum limit in the formulae expressing the dynamical variables of the Volterra model in terms of the separated variables. This yields exactly solvable potentials and formulae for the Virasoro generators of a rather unusual type. Still, we were able to check that they have the correct Poisson brackets. Of course the most interesting thing now is to try to quantize this approach. As a first step, a semiclassical analysis along the lines of [26] should be very enlightening. The full quantum theory however may hold some surprises. The bracket Eq. (3) being in fact an ordinary quadratic bracket, it is natural to quantize it with Weyl type commutation relations. This opens up the possibility of phenomena such as those advocated in [9, 27].

Acknowledgements. This work was supported in part by the European Network ENIGMA, Contract number: MRTN-CT-2004-5652.

References

1. Gervais, J.L.: Transport matrices associated with the Virasoro algebra. Phys. Lett. B160, 279 (1985)
2. Bazhanov, V., Lukyanov, S., Zamolodchikov, A.: Integrable Structure of Conformal Field Theory, Quantum KdV Theory and Thermodynamic Bethe Ansatz. Commun. Math. Phys. 177, 381–398 (1996); Integrable Structure of Conformal Field Theory II. Q-operator and DDV equation. Commun. Math. Phys. 190, 247–278 (1997); Integrable Structure of Conformal Field Theory III. The Yang-Baxter Relation. Commun. Math. Phys. 200, 297–324 (1999)
3. Sklyanin, E.K.: The quantum Toda chain. Lect. Notes in Phys. 226, Berlin-Heidelberg-New York: Springer, 1985, pp. 196–233; Separation of variables. Prog. Theor. Phys. (suppl.) 185, 35 (1995)
4. Kac, M., van Moerbeke, P.: Some probabilistic aspect of scattering theory. Proceedings of the Conference on functional integration and its applications (Cumberland Lodge, London, 1974). Oxford: Clarendon Press, 1975, pp. 87–96; On some periodic Toda lattices. Proc. Nat. Acad. Sci. USA 72 (4), 1627–1629 (1975); A complete solution of the periodic Toda problem. Proc. Nat. Acad. Sci. USA 72 (8), 2879–2880 (1975)
5. van Moerbeke, P.: The spectrum of Jacobi matrices. Invent. Math. 37, 45–81 (1976)

6. Dubrovin, B.A., Krichever, I.M., Novikov, S.P.: Integrable Systems I. Encyclopedia of Mathematical Sciences, Dynamical systems IV. Berlin-Heidelberg-New York: Springer, 1990, pp. 173–281
7. Faddeev, L.D., Takhtajan, L.: Liouville model on the lattice. Springer Lecture Notes in Physics 246, Berlin-Heidelberg-New York: Springer, 1986, p. 66
8. Volkov, A.: A Hamiltonian interpretation of the Volterra model. Zapiski Nauch. Semin. LOMI 150, 17 (1986); Liouville theory and sh-Gordon model on the lattice. Zapiski Nauch. Semin. LOMI 151, 24 (1987); Miura transformation on the lattice. Theor. Math. Phys. 74, 96 (1988)
9. Babelon, O.: Exchange formula and lattice deformation of the Virasoro algebra. Phys. Lett. B238, 234 (1990)
10. Volkov, A.Yu.: Quantum Volterra Model. Phys. Lett. A167, 345 (1992); Noncommutative Hypergeometry. http://arxiv.org/list/math.QA/0312084, 2003
11. Faddeev, L.D., Volkov, A.Yu.: Abelian current algebra and the Virasoro algebra on the lattice. Phys. Lett. B315, 311–318 (1993)
12. Faddeev, L., Volkov, A.Yu.: Shift Operator for Nonabelian Lattice Current Algebra. Publ. Res. Inst. Math. Sci. Kyoto 40, 1113–1125 (2004)
13. Faddeev, L.D., Kashaev, R.M., Volkov, A.Yu.: Strongly coupled quantum discrete Liouville theory. I: Algebraic approach and duality. Commun. Math. Phys. 219, 199–219 (2001)
14. Gervais, J.L., Neveu, A.: Novel triangle relation and absence of tachyon in Liouville string field theory. Nucl. Phys. B238, 125 (1984); Oscillator representations of the two-dimensional conformal algebra. Commun. Math. Phys. 100, 15 (1985)
15. Babelon, O., Bernard, D., Talon, M.: Introduction to Classical Integrable Systems. Cambridge: Cambridge University Press, 2003
16. Atkinson, F.V.: Multiparameter spectral theory. Bull. Amer. Math. Soc. 74, 1–27 (1968); Multiparameter Eigenvalue Problems. New York: Academic, 1972
17. Enriquez, B., Rubtsov, V.: Commuting families in skew fields and quantization of Beauville’s fibration. http://arxiv.org/list/math.AG/0112276, 2001
18. Babelon, O., Talon, M.: Riemann surfaces, separation of variables and classical and quantum integrability. Phys. Lett. A312, 71–77 (2003)
19. Babelon, O.: On the Quantum Inverse Problem for the Closed Toda Chain. J. Phys. A: Math. Gen. 37, 303–316 (2004)
20. Novikov, S.P.: The periodic problem for the Korteweg-de Vries equation. Funkt. Anal. i Ego Pril. 8, 54–66 (1974); Translation in Funct. Anal. Jan. 1975, pp. 236–246
21. Dubrovin, B., Novikov, S.P.: Periodic and conditionally periodic analogues of multisoliton solutions of the Korteweg-de Vries equation. Dokl. Akad. Nauk. USSR 6, 2131–2144 (1974)
22. Its, A., Matveev, V.: On Hill operators with finitely many lacunae. Funkt. Anal. i Ego Pril. 9 (1975)
23. McKean, H.P., van Moerbeke, P.: The Spectrum of Hill’s Equation. Invent. Math. 30, 217–274 (1975)
24. Matveev, V.B., Salle, M.A.: Darboux Transformations and Solitons. Berlin-Heidelberg-New York: Springer-Verlag, 1990
25. Adler, M.: On a trace functional for formal pseudo-differential operators and the symplectic structure of the Korteweg-de-Vries equations. Invent. Math. 50, 219 (1979)
26. Smirnov, F.A.: Quasi-classical Study of Form Factors in Finite Volume. http://arxiv.org/list/hep-th/9802132, 1998; Dual Baxter equations and quantization of Affine Jacobian. http://arxiv.org/list/math-ph/0001032, 2000
27. Faddeev, L.D.: Discrete Heisenberg-Weyl Group and Modular Group. Lett. Math. Phys. 34, 249–254 (1995); Modular Double of Quantum Group. http://arxiv.org/list/math.QA/9912078, 1999

Communicated by L. Takhtajan
