Communications in Mathematical Physics - Volume 225

Commun. Math. Phys. 225, 1 – 32 (2002) Communications in Mathematical Physics © Springer-Verlag 2002 On Convergence ...

Author: M. Aizenman (Chief Editor)



On Convergence to Equilibrium Distribution, I. The Klein–Gordon Equation with Mixing

T. V. Dudnikova$^{1}$, A. I. Komech$^{2}$, E. A. Kopylova$^{3}$, Yu. M. Suhov$^{4}$

1 Mathematics Department, Elektrostal Polytechnical Institute, Elektrostal, 144000 Russia. E-mail: [email protected]

2 Mechanics and Mathematics Department, Moscow State University, Moscow, 119899 Russia. E-mail: [email protected]

3 Physics and Applied Mathematics Department, Vladimir State University, Vladimir, Russia. E-mail: [email protected]

4 Statistical Laboratory, Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge, UK. E-mail: [email protected]

Received: 4 January 2001 / Accepted: 2 July 2001

Dedicated to M. I. Vishik on the occasion of his 80th anniversary

Supported partly by research grants of DFG (436 RUS 113/615/0-1) and RFBR (01-01-04002)
Supported partly by the Institute of Physics and Mathematics of Michoacan in Morelia, the Max-Planck Institute for the Mathematics in Sciences (Leipzig) and by research grant of DFG (436 RUS 113/615/0-1)
Supported partly by research grant of RFBR (01-01-04002)

Abstract: Consider the Klein–Gordon equation (KGE) in $\mathbb{R}^n$, $n \ge 2$, with constant or variable coefficients. We study the distribution $\mu_t$ of the random solution at time $t \in \mathbb{R}$. We assume that the initial probability measure $\mu_0$ has zero mean, a translation-invariant covariance, and a finite mean energy density. We also assume that $\mu_0$ satisfies a Rosenblatt- or Ibragimov–Linnik-type mixing condition. The main result is the convergence of $\mu_t$ to a Gaussian probability measure as $t \to \infty$, which gives a Central Limit Theorem for the KGE. The proof for the case of constant coefficients is based on an analysis of long-time asymptotics of the solution in the Fourier representation and Bernstein's "room-corridor" argument. The case of variable coefficients is treated by using an "averaged" version of the scattering theory for infinite energy solutions, based on Vainberg's results on local energy decay.

1. Introduction

The aim of this paper is to underline a special role of equilibrium distributions in statistical mechanics of systems governed by hyperbolic partial differential equations (for parabolic equations see [6, 27]). Important examples arise when one discusses the role of a canonical Gibbs distribution (CGD) in the Planck theory of spectral density of the black-body emission and in the Einstein–Debye quantum theory of solid state (see, e.g. [31]). [The word "canonical" is used in this paper to emphasize the fact that the probability distribution under consideration is formally related to the "Hamiltonian", or the energy functional, of the corresponding equation by the Gibbs exponential formula. Owing to the linearity of our equations, there are plenty of other first integrals which lead to other stationary measures.]

Historically, the emission law was established at a heuristic level by Kirchhoff in 1859 (see [34]) and stated formally by Planck in 1900 (see [25]). The law concerns the correspondence between the temperature and the colour of an emitting body (e.g., burning carbon, or an incandescent wire in an electric bulb). Furthermore, it provides fundamental information on the interaction between the Maxwell field and "matter". Planck's formula specifies a "radiation intensity" $I_T(\omega)$ of the electromagnetic field at a fixed temperature $T > 0$, as a function of the frequency $\omega > 0$. It is convenient to treat $I_T(\cdot)$ as the spectral correlation function of a stationary random process. Then if $g_T$ denotes an equilibrium distribution of this process, the Kirchhoff–Planck law suggests the long-time convergence

$$\mu_t \rightharpoondown g_T, \quad t \to \infty. \tag{1.1}$$

Here $\mu_t$ is the distribution at time $t$ of a nonstationary random solution. The resulting equilibrium temperature $T$ is determined by the initial distribution $\mu_0$. Convergence to equilibrium (1.1) is also expected in a system of Maxwell's equations coupled to an equation of evolution of "matter". For example, both (1.1) and the Kirchhoff–Planck law should hold for the coupled Maxwell–Dirac equations [5], or for their second-quantised modifications. However, the rigorous proof here is still an open problem.

Previously, the convergence of type (1.1) to a CGD $g_T$ has been established for an ideal gas with infinitely many particles by Sinai (see, e.g., [7]). Similar results were later obtained for other infinite-dimensional systems (see [2, 10] and a survey [9]). For nonlinear wave problems, the first such result has been established by Jaksic and Pillet in [18]: they consider a system of a classical particle coupled to a wave field in a smooth nonlocal fashion. For all these models, the CGD $g_T$ is well-defined, although the convergence is highly non-trivial. On the other hand, for a local coupling such as in the Maxwell–Dirac equations, the problem of "ultraviolet divergence" arises: the CGDs cannot be defined directly as the local energy is formally infinite almost surely. This is a serious technical difficulty that suggests that, to begin with, one should analyse convergence to non-canonical stationary measures $\mu_\infty$, with finite mean local energy:

$$\mu_t \rightharpoondown \mu_\infty, \quad t \to \infty. \tag{1.2}$$

In fact, most of the above-mentioned papers establish the convergence to both CGDs and non-canonical stationary measures, by using the same methods. In our situation, the aforementioned ultraviolet divergence makes the difference between (1.1) and (1.2). In this paper we prove convergence (1.2) for the Klein–Gordon equation (KGE) in $\mathbb{R}^n$, $n \ge 2$:

$$\begin{cases} \ddot u(x,t) = \sum_{j=1}^{n} (\partial_j - iA_j(x))^2 u(x,t) - m^2 u(x,t), \quad x \in \mathbb{R}^n, \\ u|_{t=0} = u_0(x), \quad \dot u|_{t=0} = v_0(x). \end{cases} \tag{1.3}$$

Here $\partial_j \equiv \partial/\partial x_j$, $x \in \mathbb{R}^n$, $t \in \mathbb{R}$, $m > 0$ is a fixed constant and $(A_1(x), \dots, A_n(x))$ a vector potential of an external magnetic field; we assume that the functions $A_j(x)$ vanish outside a bounded domain. The solution $u(x,t)$ is considered as a complex-valued classical function.

It is important to identify a natural property of the initial measure $\mu_0$ guaranteeing convergence (1.2). We follow an idea of Dobrushin and Suhov [10] and use a "space"-mixing condition of Rosenblatt- or Ibragimov–Linnik-type. Such a condition is natural from a physical point of view. It replaces a "quasiergodic hypothesis" and allows us to avoid introducing a "thermostat" with a prescribed time-behaviour. Similar conditions have been used in [2, 3, 33, 32]. In this paper, mixing is defined and applied in the context of the KGE.

Thus we prove convergence (1.2) for a class of initial measures $\mu_0$ on a classical function space, with a finite mean local energy and satisfying a mixing condition. The limiting measure $\mu_\infty$ is stationary and turns out to be a Gaussian probability measure (GPM). Hence, this result is a form of the Central Limit Theorem for the KGE.

Another important question we discuss below is the relation of the limiting measure $\mu_\infty$ to the CGD $g_T$. The (formal) Klein–Gordon Hamiltonian is given by a quadratic form and so the CGDs $g_T$ are also GPMs, albeit generalised (i.e. living in generalised function spaces). As our limiting measures $\mu_\infty$ are "classical" GPMs, they do not include CGDs. However, in the case of constant coefficients, a CGD can be obtained as a limit of measures $\mu_\infty$ as the "correlation radius" $r$ figuring in the mixing conditions imposed on $\mu_0$ tends to zero. More precisely, we assume that for a fixed $T > 0$,

$$\frac{1}{2} E\Big[ v_0(x) v_0(y) + \nabla u_0(x) \cdot \nabla u_0(y) + m^2 u_0(x) u_0(y) \Big] \to T\delta(x-y), \quad r \to 0, \tag{1.4}$$

where $E$ denotes the expectation. Then the covariance functions (CFs) of the corresponding limit GPM $\mu_\infty$ converge to the covariance functions of the CGD $g_T$. In turn, this implies the convergence

$$\mu_t \rightharpoondown \mu_\infty \sim g_T, \quad r \to 0. \tag{1.5}$$

See Sect. 4. It should be noted that the existence of a “massive” (in a sense, infinite-dimensional) set of the limiting measures µ∞ that are different from CGD’s is related to the fact that KGE (1.3) is degenerate and admits infinitely many “additive” first integrals. Like the Klein–Gordon Hamiltonian, these integrals are quadratic forms; hence they generate GPMs via Gibbs exponential formulas. Convergence (1.2) has been obtained in [19–21] for translation-invariant initial measures µ0 . However, the original proofs were too long and used a specific apparatus of Bessel’s functions applicable exclusively in the case of the KGE. They have not been published in detail because of the lack of a unifying argument that could show the limits of the method and its forthcoming developments. To clarify the mechanism behind the results, one needed some new and robust ideas. The current work provides a modern approach applicable to a wide class of linear hyperbolic equations with a nondegenerate “dispersion relation”, see Eq. (7.20) below. We also weaken considerably the mixing condition on measure µ0 . Moreover, our approach yields much shorter proofs and is applicable to non-translation invariant initial measures. The last fact is important in relation to the two-temperature problem [3,12,33] and the hydrodynamic limit [8]. Such progress became possible in large part owing to the systematic use of a Fourier transform (FT) and a duality argument of Lemma 7.1. [The importance of the Fourier transform was demonstrated in earlier works [3, 32, 33].] Similar results, for the wave equation (WE) in Rn with odd n ≥ 3, are established in [11] which develops the results [26]. The KGE shares some common features with the WE (which is formally obtained by setting m = 0 in (1.3)), and the exposition in [20, 21] followed the structure of the earlier work [26]. On the other hand, the KGE and WE also have serious differences, see below. 
It is worth mentioning that possible extensions of our methods include, on the one hand, Dirac's and other relativistic-invariant linear hyperbolic equations and, on the other hand, harmonic lattices, as well as "coupled" systems of both types. We intend to return to these problems elsewhere.

We now pass to a detailed description of the results. Formal definitions and statements are given in Sect. 2. Set $Y(t) = (Y^0(t), Y^1(t)) \equiv (u(\cdot,t), \dot u(\cdot,t))$, $Y_0 = (Y_0^0, Y_0^1) \equiv (u_0, v_0)$. Then (1.3) takes the form of an evolution equation

$$\dot Y(t) = \mathcal{A} Y(t), \quad t \in \mathbb{R}; \qquad Y(0) = Y_0. \tag{1.6}$$

Here,

$$\mathcal{A} = \begin{pmatrix} 0 & 1 \\ A & 0 \end{pmatrix}, \tag{1.7}$$

where $A = \sum_{j=1}^{n} (\partial_j - iA_j(x))^2 - m^2$. We assume that the initial datum $Y_0$ is a random element of a complex functional space $\mathcal{H}$ corresponding to states with a finite local energy, see Definition 2.1 below. The distribution of $Y_0$ is a probability measure $\mu_0$ of mean zero satisfying some additional assumptions, see Conditions S1–S3 below. Given $t \in \mathbb{R}$, denote by $\mu_t$ the measure that gives the distribution of $Y(t)$, the random solution to (1.6). We study the asymptotics of $\mu_t$ as $t \to \pm\infty$.

We identify $\mathbb{C} \equiv \mathbb{R}^2$ and denote by $\otimes$ the tensor product of real vectors. The CFs of the initial measure are supposed to be translation-invariant:

$$Q_0^{ij}(x,y) := E\big( Y_0^i(x) \otimes Y_0^j(y) \big) = q_0^{ij}(x-y), \quad x, y \in \mathbb{R}^n, \quad i, j = 0, 1 \tag{1.8}$$

(in fact our methods require a weaker assumption, but to simplify the exposition, we will not discuss it here). We also assume that the initial mean energy density is finite:

$$e_0 := E\big[ |v_0(x)|^2 + |\nabla u_0(x)|^2 + m^2 |u_0(x)|^2 \big] = q_0^{11}(0) - \Delta q_0^{00}(0) + m^2 q_0^{00}(0) < \infty, \quad x \in \mathbb{R}^n. \tag{1.9}$$

Finally, we assume that measure $\mu_0$ satisfies a mixing condition of a Rosenblatt- or Ibragimov–Linnik type, which means that

$$Y_0(x) \ \text{and} \ Y_0(y) \ \text{are asymptotically independent as} \ |x-y| \to \infty. \tag{1.10}$$

As was said before, our main result gives the (weak) convergence (1.2) of $\mu_t$ to a limiting measure $\mu_\infty$ which is a stationary GPM on $\mathcal{H}$. A similar convergence holds for $t \to -\infty$. Explicit formulas are then given for the CFs of $\mu_\infty$.

The strategy of the proof is as follows. First, we prove (1.2) for the equation with constant coefficients ($A_k(x) \equiv 0$), in three steps.

I. We check that the family of measures $\mu_t$, $t \ge 0$, is weakly compact.

II. We check that the CFs converge to a limit: for $i, j = 0, 1$,

$$Q_t^{ij}(x,y) = \int Y^i(x) \otimes Y^j(y)\, \mu_t(dY) \to Q_\infty^{ij}(x,y), \quad t \to \infty. \tag{1.11}$$

III. Finally, we check that the characteristic functionals converge to a Gaussian one:

$$\hat\mu_t(\Psi) := \int \exp\{ i \langle Y, \Psi \rangle \}\, \mu_t(dY) \to \exp\Big\{ -\frac{1}{2} Q_\infty(\Psi, \Psi) \Big\}, \quad t \to \infty. \tag{1.12}$$

Here $\Psi$ is an arbitrary element of the dual space and $Q_\infty$ the quadratic form with the integral kernel $(Q_\infty^{ij}(x,y))_{i,j=0,1}$; $\langle Y, \Psi \rangle$ denotes the scalar product in a real Hilbert space $L^2(\mathbb{R}^n) \otimes \mathbb{R}^n$.

Property I follows from the Prokhorov Theorem by a method used in [37]. First, we prove a uniform bound for the mean local energy in $\mu_t$, using the conservation of mean energy density. The conditions of the Prokhorov Theorem are then checked by using Sobolev's embedding Theorem in conjunction with Chebyshev's inequality. Next, we deduce Property II from an analysis of oscillatory integrals arising in the FT. An important role is attributed to Proposition 6.1 reflecting the properties of the CFs in the FT deduced from the mixing condition.

On the other hand, the FT approach alone is not sufficient for proving Property III even in the case of constant coefficients. The reason is that a function of infinite energy corresponds to a singular generalised function in the FT, and the exact interpretation of the mixing condition (1.10) for such generalised functions is unclear. We deduce Property III from a representation of the solution in terms of the initial data in coordinate space. This is a modification of the approach adopted in [19–21]. It allows us to combine the mixing condition with the fact that waves in the coordinate space disperse to infinity. This leads to a representation of the solution as a sum of weakly dependent random variables. Then (1.12) follows from a Central Limit Theorem (CLT) under a Lindeberg-type condition. Checking such a condition is an important part of the proof.

It is useful to discuss the dispersive mechanism that is behind (1.12) and compare the KGE ($m > 0$) and WE ($m = 0$). Take, for simplicity, $n = 3$ and $u_0 \equiv 0$.
The solution to (1.3) (with $A_k(x) \equiv 0$) is given by

$$u(x,t) = \int \mathcal{E}(x-y, t)\, v_0(y)\, dy, \quad t > 0, \tag{1.13}$$

where $\mathcal{E}$ is the "retarded" fundamental solution

$$\mathcal{E}(x,t) = \frac{1}{4\pi t}\,\delta(|x| - t) - \frac{m\,\theta(t - |x|)}{4\pi}\, \frac{J_1\big(m\sqrt{t^2 - x^2}\big)}{\sqrt{t^2 - x^2}}, \tag{1.14}$$

and $J_1$ is the Bessel function of the first order. For $m = 0$ the function $\mathcal{E}(\cdot, t)$ is supported by the sphere $|x| = t$ of area $\sim t^2$, and (1.13) becomes the Kirchhoff formula

$$u(x,t) = \frac{1}{4\pi t} \int_{|x-y|=t} v_0(y)\, dS(y), \tag{1.15}$$

which manifests the dispersion of waves in the 3D space. Dividing the sphere $\{ y \in \mathbb{R}^3 : |x-y| = t \}$ into $N \sim t^2$ "rooms" of a fixed width $d \gg 1$, we rewrite (1.15) as

$$u(x,t) \sim \frac{\sum_{k=1}^{N} r_k}{\sqrt{N}}, \tag{1.16}$$

where $r_k$ are nearly independent owing to the mixing condition. Then (1.2) follows by the well-known Bernstein "room-corridor" arguments. For $m > 0$ the function $\mathcal{E}(\cdot, t)$ is supported by the ball $|x| \le t$, which means the absence of a strong Huygens' principle for the KGE. The volume of the ball is $\sim t^3$, hence rewriting (1.13) in the form (1.16) would need asymptotics of the type

$$\mathcal{E}(x,t) = O(t^{-3/2}), \quad |x| \le t \tag{1.17}$$

as $t \to \infty$. As $J_1(r) \sim \cos(r - 3\pi/4)/\sqrt{r}$, asymptotics (1.17) only holds in the region $|x| \le vt$ with $v < 1$. For instance,

$$\mathcal{E}\big|_{|x| = vt} \sim \frac{\cos(m\gamma t - 3\pi/4)}{(\gamma t)^{3/2}},$$

where $\gamma = \sqrt{1 - v^2}$. However, the degree of the decay is different near the light cone $|x| = t$ corresponding to $v = 1$ and $\gamma = 0$. For example, for a fixed $r > 0$,

$$\mathcal{E}\big|_{|x| = t - r} \sim \frac{\cos\big(m\sqrt{2rt} - 3\pi/4\big)}{(2rt)^{3/4}} = O(t^{-3/4}), \tag{1.18}$$

where $r = t - |x|$ is the "distance" from the light cone. This illustrates that an application of Bernstein's method in the case of the KGE requires a new idea.

The key observation is that the asymptotics (1.18) displays oscillations $\sim \cos\big(m\sqrt{2rt}\big)$ of $\mathcal{E}$ near the light cone as $t \to \infty$. The solution becomes an oscillatory integral, and one is able to compensate the weak decay $\sim t^{-3/4}$ by a partial integration with Bessel functions, by a method following an argument from [23, Appendix B]. Such an approach was used in [21] and was accompanied by tedious computations in a combined "coordinate-momentum" representation. The approach adopted in this paper allows us to avoid this part of the argument. An important role is played by a duality argument of Lemma 7.1 leading to an analysis of an oscillatory integral with a phase function (= "dispersion relation") with a nondegenerate Hessian, see (7.20).

Simple examples show that the convergence may fail when the mixing condition does not hold. For instance, take $u_0(x) \equiv \pm 1$ and $v_0(x) \equiv 0$ with probability $p_\pm = 0.5$. Then the mean value is zero and (1.9) holds, but (1.10) does not. The solution is $u(x,t) \equiv \pm\cos(mt)$ a.s., hence $\mu_t$ is periodic in time, and (1.2) fails.

Finally, a comment on the case of variable coefficients $A_k(x)$. In this case explicit formulas for the solution are unavailable. Here we construct a scattering theory for solutions of infinite global energy. This version of the scattering theory allows us to reduce the proof of (1.2) to the case of constant coefficients (this strategy is similar to [4, 11, 12]).
In particular, in [11] one establishes, in the case of a WE, a long-time asymptotics

$$U(t) Y_0 = \Theta U_0(t) Y_0 + \rho(t) Y_0, \quad t > 0. \tag{1.19}$$

Here $U(t)$ is the dynamical group of the WE with variable coefficients, $U_0(t)$ corresponds to the "free" equation with constant coefficients, and $\Theta$ is a "scattering operator". In this paper, instead of (1.19), we use a dual representation:

$$U'(t)\Psi = U_0'(t) W \Psi + r(t)\Psi, \quad t \ge 0. \tag{1.20}$$

Here $U'(t)$ is a "formal adjoint" to the dynamical group of Eq. (1.3), while $U_0'(t)$ corresponds to the "free" equation, with $A_k(x) \equiv 0$. The remainder $r(t)$ is small in mean:

$$E |\langle Y_0, r(t)\Psi \rangle|^2 \to 0, \quad t \to \infty. \tag{1.21}$$

This version of scattering theory is essentially based on Vainberg's bounds for the local energy decay (see [35, 36]).

Remark 1.1. (i) In [11] we deduce asymptotics (1.19) from its dual counterpart (1.20). In this paper we do not analyse connections between (1.20) and (1.19). (ii) It is useful to comment on the difference between two versions of scattering theory produced for the WE and KGE. In the first theory, the remainders $\rho(t)$ and $r(t)$ are small a.s., while in the second theory, developed in this paper, $r(t)$ is small in mean (see (1.21)). Such a difference is related to a slow (power) decay of solutions to the KGE.

The main result of the paper is stated in Sect. 2 (see Theorem A). Sections 3–8 deal with the case of constant coefficients: the main statement is given in Sect. 3 (see Theorem B), the relation to CGDs is discussed in Sect. 4, the compactness (Property I) is established in Sect. 5, convergence (1.11) in Sect. 6, and convergence (1.12) in Sects. 7, 8. In Sect. 9 we check the Lindeberg condition needed for convergence to a Gaussian limit. In Sect. 10 we discuss the infinite energy version of the scattering theory, and in Sect. 11 convergence (1.2). In Appendix A we collect FT-type calculations. Appendix B is concerned with a formula on generalised GPMs on Sobolev spaces.
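As a numerical aside to the dispersive discussion around (1.14)–(1.18), the following sketch compares $J_1$ with its leading large-argument term and tabulates the regular part of the kernel in (1.14) both well inside the light cone and at a fixed distance from it. It is only an illustration under stated assumptions: the function names are hypothetical, $m = 1$ is an arbitrary choice, $J_1$ is computed from Bessel's integral by a midpoint rule rather than by any method used in the paper, and the constant $\sqrt{2/\pi}$, which the text suppresses in $J_1(r) \sim \cos(r - 3\pi/4)/\sqrt{r}$, is kept explicitly.

```python
import math

def bessel_j1(x, steps=4000):
    """J_1(x) from Bessel's integral (1/pi) * int_0^pi cos(tau - x*sin(tau)) dtau,
    approximated with the composite midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * h
        total += math.cos(tau - x * math.sin(tau))
    return total * h / math.pi

def j1_leading_term(x):
    """Leading large-argument term sqrt(2/(pi*x)) * cos(x - 3*pi/4)."""
    return math.sqrt(2.0 / (math.pi * x)) * math.cos(x - 0.75 * math.pi)

def kernel_inside_cone(x_abs, t, m=1.0):
    """Regular (non-delta) part of the kernel in (1.14) for |x| < t, n = 3."""
    s = math.sqrt(t * t - x_abs * x_abs)
    return -m / (4.0 * math.pi) * bessel_j1(m * s) / s

def kernel_leading_term(x_abs, t, m=1.0):
    """Same quantity with J_1 replaced by its leading asymptotic term."""
    s = math.sqrt(t * t - x_abs * x_abs)
    return -m / (4.0 * math.pi) * j1_leading_term(m * s) / s

if __name__ == "__main__":
    # Rescaled values stay bounded: t^(3/2) inside the cone, t^(3/4) near it.
    for t in (100.0, 400.0, 1600.0):
        inner = kernel_inside_cone(0.5 * t, t)   # |x| = v*t with v = 0.5
        near = kernel_inside_cone(t - 1.0, t)    # |x| = t - r with r = 1
        print(t, inner * t ** 1.5, near * t ** 0.75)
```

The rescaled columns oscillate without growing, which is the numerical counterpart of the two decay rates $t^{-3/2}$ and $t^{-3/4}$ discussed above.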

2. Main Results

2.1. Notation. We assume that functions $A_j(x)$ in (1.3) satisfy the following conditions:

E1. $A_j(x)$ are real $C^\infty$-functions.

E2. $A_j(x) = 0$ for $|x| > R_0$, where $R_0 < \infty$.

E3. $\dfrac{\partial A_1}{\partial x_2} \equiv \dfrac{\partial A_2}{\partial x_1}$ if $n = 2$.

Assume that the initial state $Y_0$ belongs to the phase space $\mathcal{H}$ defined below.

Definition 2.1. $\mathcal{H} \equiv H^1_{\rm loc}(\mathbb{R}^n) \oplus H^0_{\rm loc}(\mathbb{R}^n)$ is the Fréchet space of pairs $Y(x) \equiv (u(x), v(x))$ of complex functions $u(x)$, $v(x)$, endowed with local energy seminorms

$$\| Y \|_R^2 = \int_{|x| < R} \big( |v(x)|^2 + |\nabla u(x)|^2 + m^2 |u(x)|^2 \big)\, dx < \infty, \quad \forall R > 0. \tag{2.1}$$

Proposition 2.2 follows from [22, Thms. V.3.1, V.3.2] as the speed of propagation for Eq. (1.3) is finite.

Proposition 2.2. (i) For any $Y_0 \in \mathcal{H}$ there exists a unique (generalised) solution $Y(t) \in C(\mathbb{R}, \mathcal{H})$ to (1.6). (ii) For any $t \in \mathbb{R}$ the operator $U(t) : Y_0 \mapsto Y(t)$ is continuous in $\mathcal{H}$.


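Let us choose a function $\zeta(x) \in C_0^\infty(\mathbb{R}^n)$ with $\zeta(0) \ne 0$. Denote by $H^s_{\rm loc}(\mathbb{R}^n)$, $s \in \mathbb{R}$, the local Sobolev spaces, i.e. the Fréchet spaces of distributions $u \in \mathcal{D}'(\mathbb{R}^n)$ with finite seminorms

$$\| u \|_{s,R} := \| \Lambda^s (\zeta(x/R) u) \|_{L^2(\mathbb{R}^n)}, \tag{2.2}$$

where $\Lambda^s v := F^{-1}_{k \to x}\big( \langle k \rangle^s \hat v(k) \big)$, $\langle k \rangle := \sqrt{|k|^2 + 1}$, and $\hat v := F v$ is the FT of a tempered distribution $v$. For $\psi \in \mathcal{D}$ define $F\psi(k) = \int e^{ik \cdot x} \psi(x)\, dx$.

Definition 2.3. For $s \in \mathbb{R}$ denote $\mathcal{H}^s \equiv H^{1+s}_{\rm loc}(\mathbb{R}^n) \oplus H^s_{\rm loc}(\mathbb{R}^n)$.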

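Using standard techniques of pseudodifferential operators (see, e.g. [16]) and Sobolev's Theorem, it is possible to prove that $\mathcal{H}^0 = \mathcal{H} \subset \mathcal{H}^{-\varepsilon}$ for every $\varepsilon > 0$, and the embedding is compact.

2.2. Random solution. Convergence to equilibrium. Let $(\Omega, \Sigma, P)$ be a probability space with expectation $E$ and $\mathcal{B}(\mathcal{H})$ denote the Borel $\sigma$-algebra in $\mathcal{H}$. We assume that $Y_0 = Y_0(\omega, \cdot)$ in (1.6) is a measurable random function with values in $(\mathcal{H}, \mathcal{B}(\mathcal{H}))$. In other words, $(\omega, x) \mapsto Y_0(\omega, x)$ is a measurable map $\Omega \times \mathbb{R}^n \to \mathbb{C}^2$ with respect to the (completed) $\sigma$-algebras $\Sigma \times \mathcal{B}(\mathbb{R}^n)$ and $\mathcal{B}(\mathbb{C}^2)$. Then, owing to Proposition 2.2, $Y(t) = U(t) Y_0$ is again a measurable random function with values in $(\mathcal{H}, \mathcal{B}(\mathcal{H}))$. We denote by $\mu_0(dY_0)$ a probability measure on $\mathcal{H}$ giving the distribution of $Y_0$. Without loss of generality, we assume $(\Omega, \Sigma, P) = (\mathcal{H}, \mathcal{B}(\mathcal{H}), \mu_0)$ and $Y_0(\omega, x) = \omega(x)$ for $\mu_0(d\omega) \times dx$-almost all $(\omega, x) \in \mathcal{H} \times \mathbb{R}^n$.

Definition 2.4. $\mu_t$ is a probability measure on $\mathcal{H}$ which gives the distribution of $Y(t)$:

$$\mu_t(B) = \mu_0(U(-t) B), \quad B \in \mathcal{B}(\mathcal{H}), \quad t \in \mathbb{R}. \tag{2.3}$$

Our main goal is to derive the weak convergence of the measures $\mu_t$ in the Fréchet space $\mathcal{H}^{-\varepsilon}$ for each $\varepsilon > 0$,

$$\mu_t \overset{\mathcal{H}^{-\varepsilon}}{\rightharpoondown} \mu_\infty, \quad t \to \infty, \tag{2.4}$$

where $\mu_\infty$ is a limiting measure on the space $\mathcal{H}$. This means the convergence

$$\int f(Y)\, \mu_t(dY) \to \int f(Y)\, \mu_\infty(dY), \quad t \to \infty, \tag{2.5}$$

for any bounded continuous functional $f$ on $\mathcal{H}^{-\varepsilon}$. Recall that we identify $\mathbb{C} \equiv \mathbb{R}^2$ and $\otimes$ stands for the tensor product of real vectors. Denote $M^2 = \mathbb{R}^2 \otimes \mathbb{R}^2$.

Definition 2.5. The CFs of the measure $\mu_t$ are defined by

$$Q_t^{ij}(x,y) \equiv E\big( Y^i(x,t) \otimes Y^j(y,t) \big), \quad i, j = 0, 1, \quad \text{for almost all } (x,y) \in \mathbb{R}^n \times \mathbb{R}^n, \tag{2.6}$$

assuming that the expectations in the RHS are finite.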

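We set $\mathcal{D} = D \oplus D$, and $\langle Y, \Psi \rangle = \langle Y^0, \Psi^0 \rangle + \langle Y^1, \Psi^1 \rangle$ for $Y = (Y^0, Y^1) \in \mathcal{H}$ and $\Psi = (\Psi^0, \Psi^1) \in \mathcal{D}$. For a probability measure $\mu$ on $\mathcal{H}$, denote by $\hat\mu$ the characteristic functional (FT)

$$\hat\mu(\Psi) \equiv \int \exp(i \langle Y, \Psi \rangle)\, \mu(dY), \quad \Psi \in \mathcal{D}.$$

A probability measure $\mu$ is called a GPM (of mean zero) if its characteristic functional has the form

$$\hat\mu(\Psi) = \exp\Big\{ -\frac{1}{2} Q(\Psi, \Psi) \Big\}, \quad \Psi \in \mathcal{D},$$

where $Q$ is a real nonnegative quadratic form in $\mathcal{D}$. A measure $\mu$ is called translation-invariant if

$$\mu(T_h B) = \mu(B), \quad B \in \mathcal{B}(\mathcal{H}), \quad h \in \mathbb{R}^n,$$

where $T_h Y(x) = Y(x - h)$, $x \in \mathbb{R}^n$.

2.3. Mixing condition. Let $O(r)$ denote the set of all pairs of open bounded subsets $\mathcal{A}, \mathcal{B} \subset \mathbb{R}^n$ at distance ${\rm dist}(\mathcal{A}, \mathcal{B}) \ge r$ and $\sigma(\mathcal{A})$ the $\sigma$-algebra in $\mathcal{H}$ generated by the linear functionals $Y \mapsto \langle Y, \Psi \rangle$, where $\Psi \in \mathcal{D}$ with ${\rm supp}\, \Psi \subset \mathcal{A}$. Define the Ibragimov–Linnik mixing coefficient of a probability measure $\mu_0$ on $\mathcal{H}$ by (cf. [17, Def. 17.2.2])

$$\varphi(r) \equiv \sup_{(\mathcal{A}, \mathcal{B}) \in O(r)} \ \sup_{\substack{A \in \sigma(\mathcal{A}),\ B \in \sigma(\mathcal{B}) \\ \mu_0(B) > 0}} \frac{|\mu_0(A \cap B) - \mu_0(A)\mu_0(B)|}{\mu_0(B)}. \tag{2.7}$$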

Definition 2.6. The measure $\mu_0$ satisfies the strong, uniform Ibragimov–Linnik mixing condition if

$$\varphi(r) \to 0, \quad r \to \infty. \tag{2.8}$$

Below, we specify the rate of decay of ϕ (see Condition S3).

2.4. Main assumptions and results. We assume that measure $\mu_0$ has the following properties S0–S3:

S0. $\mu_0$ has zero expectation value,

$$E Y_0(x) \equiv 0, \quad x \in \mathbb{R}^n. \tag{2.9}$$

S1. $\mu_0$ has translation-invariant CFs, i.e. Eq. (1.8) holds for almost all $x, y \in \mathbb{R}^n$.

S2. $\mu_0$ has a finite mean energy density, i.e. Eq. (1.9) holds.

S3. $\mu_0$ satisfies the strong uniform Ibragimov–Linnik mixing condition, with

$$\overline{\varphi} \equiv \int_0^\infty r^{n-1} \varphi^{1/2}(r)\, dr < \infty. \tag{2.10}$$
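To get a feel for how fast $\varphi$ must decay for (2.10) to hold, one can try a purely hypothetical power-law rate $\varphi(r) = (1+r)^{-p}$; this rate is an assumption made here for illustration and is not taken from the paper. The integral in (2.10) then becomes $\int_0^\infty r^{n-1}(1+r)^{-p/2}\,dr$, which is the Beta function $B(n, p/2 - n)$ when $p > 2n$ and diverges otherwise. The sketch below (function names are illustrative) evaluates the closed form and cross-checks it numerically after the substitution $r = s/(1-s)$.

```python
import math

def s3_integral_closed_form(n, p):
    """Integral of r^(n-1) * phi(r)^(1/2) over (0, inf) for the hypothetical
    rate phi(r) = (1 + r)^(-p); equals the Beta function B(n, p/2 - n)."""
    q = p / 2.0
    if q <= n:
        # The integrand decays like r^(n-1-q), which is not integrable at infinity.
        return math.inf
    return math.gamma(n) * math.gamma(q - n) / math.gamma(q)

def s3_integral_numeric(n, p, steps=20000):
    """Same integral by the midpoint rule after substituting r = s/(1-s),
    which maps (0, inf) to (0, 1). Meaningful only when p/2 > n."""
    q = p / 2.0
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        # r^(n-1) (1+r)^(-q) dr becomes s^(n-1) (1-s)^(q-n-1) ds.
        total += s ** (n - 1) * (1.0 - s) ** (q - n - 1.0)
    return total * h

if __name__ == "__main__":
    for n, p in ((2, 6.0), (3, 10.0), (2, 4.0)):
        print(n, p, s3_integral_closed_form(n, p))
```

For this family the threshold is $p > 2n$, so the decay required of $\varphi$ under this assumption grows with the dimension.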

Define, for almost all $x, y \in \mathbb{R}^n$, the matrix $Q_\infty(x,y) \equiv \big( Q_\infty^{ij}(x,y) \big)_{i,j=0,1}$ by

$$Q_\infty(x,y) \equiv \frac{1}{2} \begin{pmatrix} (q_0^{00} + P * q_0^{11})(x-y) & (q_0^{01} - q_0^{10})(x-y) \\ (q_0^{10} - q_0^{01})(x-y) & (q_0^{11} - (\Delta - m^2) q_0^{00})(x-y) \end{pmatrix}. \tag{2.11}$$

Here $P(z)$ is the fundamental solution for the operator $-\Delta + m^2$, and $*$ stands for the convolution of generalized functions. We show below that $q_0^{11} \in L^2(\mathbb{R}^n)$ (see (6.1)). Then the convolution $P * q_0^{11}$ in (2.11) also belongs to $L^2(\mathbb{R}^n)$. Let $H = L^2(\mathbb{R}^n) \oplus H^1(\mathbb{R}^n)$ denote the space of complex valued functions $\Psi = (\Psi^0, \Psi^1)$ with a finite norm

$$\| \Psi \|_H^2 = \int_{\mathbb{R}^n} \big( |\Psi^0(x)|^2 + |\nabla \Psi^1(x)|^2 + |\Psi^1(x)|^2 \big)\, dx < \infty. \tag{2.12}$$

Denote by $Q_\infty$ a real quadratic form in $H$ defined by

$$Q_\infty(\Psi, \Psi) = \sum_{i,j=0,1} \int_{\mathbb{R}^n \times \mathbb{R}^n} \big\langle Q_\infty^{ij}(x,y) \Psi^i(x), \Psi^j(y) \big\rangle\, dx\, dy, \tag{2.13}$$

where $\langle \cdot, \cdot \rangle$ stands for the real scalar product in $\mathbb{C}^2 \equiv \mathbb{R}^4$. The form $Q_\infty$ is continuous in $H$ as the functions $Q_\infty^{ij}(x,y)$ are bounded.

Theorem A. Let $n \ge 2$, $m > 0$, and assume that E1–E3, S0–S3 hold. Then

(i) The convergence in (2.4) holds for any $\varepsilon > 0$.

(ii) The limiting measure $\mu_\infty$ is a GPM on $\mathcal{H}$.

(iii) The characteristic functional of $\mu_\infty$ has the form

$$\hat\mu_\infty(\Psi) = \exp\Big\{ -\frac{1}{2} Q_\infty(W\Psi, W\Psi) \Big\}, \quad \Psi \in \mathcal{D},$$

where $W : \mathcal{D} \to H$ is a linear continuous operator.

2.5. Remarks on conditions on the initial measure. (i) The (rather strong) form of mixing in Definition 2.6 is motivated by two facts: (a) it greatly simplifies the forthcoming arguments, (b) it allows us to produce an "optimal" (slowest) decay of $\varphi$ indicating natural limits of Bernstein's room-corridor method. Condition (2.7) can be easily verified for GPMs with finite-range dependence and their images under "local" maps $\mathcal{H} \to \mathcal{H}$. See the examples in Sect. 2.6 below.

(ii) The uniform Rosenblatt mixing condition [30] also suffices, together with a higher power $> 2$ in the bound (1.9): there exists $\delta > 0$ such that

$$E\big[ |v_0(x)|^{2+\delta} + |\nabla u_0(x)|^{2+\delta} + m^2 |u_0(x)|^{2+\delta} \big] < \infty. \tag{1.4$'$}$$

Then (2.10) requires a modification:

$$\int_0^\infty r^{n-1} \alpha^p(r)\, dr < \infty, \quad \text{where } p = \min\Big( \frac{\delta}{2+\delta}, \frac{1}{2} \Big), \tag{2.10$'$}$$

where $\alpha(r)$ is the Rosenblatt mixing coefficient defined as in (2.7) but without $\mu_0(B)$ in the denominator. The statements of Theorem A and their proofs remain essentially unchanged, only Lemma 8.2 requires a suitable modification [17].

2.6. Examples of initial measures with mixing condition.

2.6.1. Gaussian measures. In this section we construct initial GPMs $\mu_0$ satisfying S0–S3. Let $\mu_0$ be a GPM on $\mathcal{H}$ with the characteristic functional

$$\hat\mu_0(\Psi) \equiv E \exp(i \langle Y, \Psi \rangle) = \exp\Big\{ -\frac{1}{2} Q_0(\Psi, \Psi) \Big\}, \quad \Psi \in \mathcal{D}. \tag{2.14}$$

Here $Q_0$ is a real nonnegative quadratic form with an integral kernel $(Q_0^{ij}(x,y))_{i,j=0,1}$. Let

$$Q_0^{ij}(x,y) \equiv q_0^{ij}(x-y), \tag{2.15}$$

for any $i, j$, where the function $q_0^{ij} \in C^2(\mathbb{R}^n) \otimes M^2$ has compact support. Then S0, S1 and S2 are satisfied; S3 holds with $\varphi(r) \equiv 0$ for $r \ge r_0$ if $q_0^{ij}(z) \equiv 0$ for $|z| \ge r_0$. For a given matrix function $q_0^{ij}(z)$ such a measure exists on space $\mathcal{H}$ iff the corresponding FT is a nonnegative matrix-valued measure: $\hat q_0^{ij}(k) \ge 0$, $k \in \mathbb{R}^n$, [15, Thm V.5.1]. For example, all these conditions hold if $\hat q_0^{ij}(k) = D_i \delta^{ij} f(k_1) \cdots f(k_n)$ with $D_i \ge 0$ and

$$f(z) = \frac{2\big(1 - \cos(r_0 z/\sqrt{n})\big)}{z^2}, \quad z \in \mathbb{R}.$$

2.6.2. Non-Gaussian measures. Now choose a pair of odd functions $f^0, f^1 \in C^1(\mathbb{R})$, with bounded first derivatives. Define $\mu_0^*$ as the distribution of the random function $(f^0(Y^0(x)), f^1(Y^1(x)))$, where $(Y^0, Y^1)$ is a random function with a Gaussian distribution $\mu_0$ from the previous example. Then S0–S3 hold for $\mu_0^*$ with a mixing coefficient $\varphi^*(r) \equiv 0$ for $r \ge r_0$. Measure $\mu_0^*$ is not Gaussian if $D_i > 0$ and the functions $f^i$ are bounded and nonconstant.

3. Equations with Constant Coefficients

In Sects. 3–9 we assume that coefficients $A_k(x) \equiv 0$. Problem (1.3) then becomes

$$\begin{cases} \ddot u(x,t) = \Delta u(x,t) - m^2 u(x,t), \quad t \in \mathbb{R}, \\ u|_{t=0} = u_0(x), \quad \dot u|_{t=0} = v_0(x). \end{cases} \tag{3.1}$$

As in (1.6), we rewrite (3.1) in the form

$$\dot Y(t) = \mathcal{A}_0 Y(t), \quad t \in \mathbb{R}; \qquad Y(0) = Y_0. \tag{3.2}$$

Here we denote

$$\mathcal{A}_0 = \begin{pmatrix} 0 & 1 \\ A_0 & 0 \end{pmatrix}, \tag{3.3}$$

where $A_0 = \Delta - m^2$. Denote by $U_0(t)$, $t \in \mathbb{R}$, the dynamical group for problem (3.2), then $Y(t) = U_0(t) Y_0$. The following proposition is well-known and is proved by a standard integration by parts.

Proposition 3.1. Let $Y_0 = (u_0, v_0) \in \mathcal{H}$, and let $Y(\cdot, t) = (u(\cdot, t), \dot u(\cdot, t)) \in C(\mathbb{R}, \mathcal{H})$ be the solution to (3.1). Then the following energy bound holds: for $R > 0$ and $t \in \mathbb{R}$,

$$\int_{|x| < R} \big( |\dot u(x,t)|^2 + |\nabla u(x,t)|^2 + m^2 |u(x,t)|^2 \big)\, dx \le \int_{|x| < R + |t|} \big( |v_0(x)|^2 + |\nabla u_0(x)|^2 + m^2 |u_0(x)|^2 \big)\, dx. \tag{3.4}$$

Set $\mu_t(B) = \mu_0(U_0(-t) B)$, $B \in \mathcal{B}(\mathcal{H})$, $t \in \mathbb{R}$. Then our main result for problem (3.2) is

Theorem B. Let $n \ge 1$, $m > 0$, and Conditions S0–S3 hold. Then the conclusions of Theorem A hold with $W = I$, and the limiting measure $\mu_\infty$ is translation-invariant.

Theorem B can be deduced from Propositions 3.2 and 3.3 below, by the same arguments as in [37, Thm XII.5.2].

Proposition 3.2. The family of measures $\{ \mu_t, t \in \mathbb{R} \}$ is weakly compact in $\mathcal{H}^{-\varepsilon}$ with any $\varepsilon > 0$, and the bounds hold:

$$\sup_{t \ge 0} E \| U_0(t) Y_0 \|_R^2 < \infty, \quad R > 0. \tag{3.5}$$

Proposition 3.3. For every $\Psi \in \mathcal{D}$,

$$\hat\mu_t(\Psi) \equiv \int \exp(i \langle Y, \Psi \rangle)\, \mu_t(dY) \to \exp\Big\{ -\frac{1}{2} Q_\infty(\Psi, \Psi) \Big\}, \quad t \to \infty. \tag{3.6}$$
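For the constant-coefficient problem (3.1), taking the Fourier transform in $x$ turns the equation into an independent oscillator $\ddot{\hat u}(k,t) = -\omega(k)^2 \hat u(k,t)$ for each $k$, with $\omega(k) = \sqrt{|k|^2 + m^2}$. The sketch below writes out the standard closed-form solution of this ODE and checks that the mode energy is conserved. It is meant only as an illustration: the function names are hypothetical, and the formula is the textbook solution of the oscillator equation, not a transcription of formulas (12.2), (12.3) of Appendix A, which are not reproduced in this section.

```python
import math

def omega(k_abs, m):
    """Dispersion relation of the free Klein-Gordon equation."""
    return math.sqrt(k_abs * k_abs + m * m)

def evolve_mode(u0_hat, v0_hat, k_abs, m, t):
    """Exact evolution over time t of one Fourier mode (u_hat, d/dt u_hat)."""
    w = omega(k_abs, m)
    c, s = math.cos(w * t), math.sin(w * t)
    u_hat = c * u0_hat + (s / w) * v0_hat
    v_hat = -w * s * u0_hat + c * v0_hat
    return u_hat, v_hat

def mode_energy(u_hat, v_hat, k_abs, m):
    """Energy of the mode: |v_hat|^2 + (|k|^2 + m^2) * |u_hat|^2."""
    w = omega(k_abs, m)
    return abs(v_hat) ** 2 + (w * abs(u_hat)) ** 2

if __name__ == "__main__":
    u0, v0, k, m = 1.0 + 2.0j, -0.5 + 0.3j, 1.7, 1.0
    for t in (0.0, 1.0, 10.0):
        u, v = evolve_mode(u0, v0, k, m, t)
        print(t, mode_energy(u, v, k, m))
```

For $k = 0$, $\hat u_0 = \pm 1$, $\hat v_0 = 0$ the formula reduces to $\pm\cos(mt)$, which is the non-mixing example from the Introduction.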

Propositions 3.2 and 3.3 are proved in Sects. 5 and 7–9, respectively. We will use repeatedly the FT (12.2) and (12.3) from Appendix A.

4. Relation to CGDs

In this section we discuss how our results are related to CGDs. We restrict consideration to the case of Eq. (1.3) with constant coefficients and to the translation-invariant isotropic case. The CGD $g_T$ with the absolute temperature $T \ge 0$ is defined formally by

$$g_T(du \times dv) = \frac{1}{Z}\, e^{-H/T} \prod_x du(x)\, dv(x), \tag{4.1}$$

where $H := \dfrac{1}{2} \displaystyle\int \big( |v(x)|^2 + |\nabla u(x)|^2 + m^2 |u(x)|^2 \big)\, dx$, and $Z$ is a normalisation constant. To make the definition rigorous, let us introduce a scale of weighted Sobolev spaces $H^{s,\alpha}(\mathbb{R}^n)$ with arbitrary $s, \alpha \in \mathbb{R}$. We use notation (2.2).

Definition 4.1. (i) $H^{s,\alpha}(\mathbb{R}^n)$ is the complex Hilbert space of the distributions $w \in S'(\mathbb{R}^n)$ with the finite norm

$$\| w \|_{s,\alpha} \equiv \| \langle x \rangle^\alpha \Lambda^s w \|_{L^2(\mathbb{R}^n)} < \infty. \tag{4.2}$$

(ii) $\mathcal{H}^{s,\alpha}$ is the Hilbert space of the pairs $Y = (u, v) \in H^{1+s,\alpha}(\mathbb{R}^n) \oplus H^{s,\alpha}(\mathbb{R}^n)$ with the norm

$$||| Y |||_{s,\alpha} \equiv \| u \|_{1+s,\alpha} + \| v \|_{s,\alpha}. \tag{4.3}$$

Note that $\mathcal{H}^{s,\alpha} \subset \mathcal{H}^{s',\alpha'}$ if $s' < s$ and $\alpha' < \alpha$, and this embedding is compact. These facts follow by standard methods of pseudodifferential operators and Sobolev's Theorem (see, e.g. [16]). Now we can define the CGDs rigorously: $g_T$ is a GPM on a space $\mathcal{H}^{s,\alpha}$, $s, \alpha < -n/2$, with the CFs

$$g_T^{00}(x-y) = T P(x-y), \quad g_T^{11}(x-y) = T\delta(x-y), \quad g_T^{01}(x-y) = g_T^{10}(x-y) = 0. \tag{4.4}$$

By Minlos Theorem [15, Thm. V.5.1], such a measure exists on $\mathcal{H}^{s,\alpha}$ with $s, \alpha < -n/2$ as, formally (see Appendix B),

$$\int ||| Y |||_{s,\alpha}^2\, g_T(dY) < \infty. \tag{4.5}$$

Measure $g_T$ is stationary for the KGE, as its CFs are stationary; the last fact follows from formulas (12.6), (12.2). Also, $g_T$ is translation invariant, so S1 holds. Condition S2 fails since the "mean energy density" $g_T^{11}(0) - \Delta g_T^{00}(0) + m^2 g_T^{00}(0)$ is infinite; this gives an "ultraviolet divergence". Mixing condition S3 holds due to an exponential decay of $P(z)$. The convergence of type (1.1) holds for initial measures $\mu_0$ that are absolutely continuous with respect to the CGD $g_T$, and the limit measure coincides with $g_T$. This mixing property (and even the K-property) can be proved by using well-known methods developed for Gaussian processes [7], and we do not discuss it here.

Remark. Assumption S2 implies that $\mu_0(\mathcal{H}) = 1$ and hence $\mu_\infty(\mathcal{H}) = 1$. This excludes the case of a limiting CGD as it is a generalised GPM not supported by $\mathcal{H}$. However, it is possible to extend our results to a class of generalised initial measures converging to CGDs. For the case of constant coefficients such an extension could be done by smearing the initial generalised field as the dynamics commutes with the averaging (cf. [12]). For variable coefficients such an extension requires further work.

To demonstrate the special role of the CGDs we consider a family of initial GPMs $\mu_{0,r}$, $r \in (0, 1]$, satisfying S0–S3, with the radius of correlation $r$. More precisely, suppose that the corresponding CFs $q_{0,r}^{ij}$ have the following properties G0–G3:

G0. $q_{0,r}^{01}(z) = q_{0,r}^{01}(-z)$, $z \in \mathbb{R}^n$.

G1. $q_{0,r}^{11}(z) - \Delta q_{0,r}^{00}(z) + m^2 q_{0,r}^{00}(z) = 0$, $|z| \ge r$.

14

T. V. Dudnikova, A. I. Komech, E. A. Kopylova, Yu. M. Suhov

G2. For some T > 0, G3. sup

r∈(0,1]

1 2

11 00 00 q0,r (z) − q0,r (z) + m2 q0,r (z) dz → T , r → 0.

11 00 00 |q0,r (z)|+|q0,r (z)|+m2 |q0,r (z)| dz <∞.

Note that G0 means a symmetry relation $E u_0(x)v_0(y) = E u_0(y)v_0(x)$ that holds for an isotropic measure where the CFs depend only on $|x-y|$. Examples of such a family will be provided later. Properties G0–G3 imply conditions S0–S3 for the initial measures $\mu_{0,r}$. Therefore, Theorem B implies the convergence $\mu_{t,r} \overset{\mathcal{H}^{-\varepsilon}}{\rightharpoondown} \mu_{\infty,r}$, $t \to \infty$, of type (2.4). The following proposition means that the limiting measure $\mu_{\infty,r}$ is close to CGD $g_T$ on the Sobolev space of distributions $\mathcal{H}^{s,\alpha}$ with $s, \alpha < -n/2$.

Proposition 4.2. Let Conditions G0–G3 hold. Then corresponding limiting measures $\mu_{\infty,r}$ are concentrated on any space $\mathcal{H}^{s,\alpha}$ with $s, \alpha < -n/2$ and weakly converge to CGD $g_T$ on the space $\mathcal{H}^{s,\alpha}$:

$$\mu_{\infty,r} \overset{\mathcal{H}^{s,\alpha}}{\rightharpoondown} g_T, \quad r \to 0. \tag{4.6}$$

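Proof. The convergence follows by the same arguments as in [37] from two facts (cf. Propositions 3.2, 3.3): for any $s', \alpha'$ with $s < s' < -n/2$ and $\alpha < \alpha' < -n/2$,

(I) $\displaystyle \sup_{r\in(0,1]} \int |||Y|||^2_{s',\alpha'}\, \mu_{\infty,r}(dY) < \infty$.

(II) For $\Psi \in \mathcal{D}$, $Q_{\infty,r}(\Psi,\Psi) \to G_T(\Psi,\Psi)$, $r \to 0$,

where $Q_{\infty,r}$ is the quadratic form with the integral kernel $\big(q^{ij}_{\infty,r}(x-y)\big)$, and $G_T$ corresponds to $\big(g_T^{ij}(x-y)\big)$. It is important that the embedding $\mathcal{H}^{s',\alpha'} \subset \mathcal{H}^{s,\alpha}$ is compact. Property (I) can be checked with the help of formula (13.3) and by using the Parseval identity:

$$\int |||Y|||^2_{s,\alpha}\, \mu_{\infty,r}(dY) = \frac{C(\alpha)}{(2\pi)^n} \int \big[ \langle k\rangle^{2s}\, \mathrm{tr}\, \hat q^{11}_{\infty,r}(k) + \langle k\rangle^{2(1+s)}\, \mathrm{tr}\, \hat q^{00}_{\infty,r}(k) \big]\, dk = C(\alpha) \int \big[ f_1(z)\, \mathrm{tr}\, q^{11}_{\infty,r}(z) + f_2(z)(-\Delta + m^2)\, \mathrm{tr}\, q^{00}_{\infty,r}(z) \big]\, dz,$$

where

$$f_1(z) = \frac{1}{(2\pi)^n} \int e^{-ikz} \langle k\rangle^{2s}\, dk \quad \text{and} \quad f_2(z) = \frac{1}{(2\pi)^n} \int e^{-ikz} \frac{\langle k\rangle^{2(1+s)}}{k^2 + m^2}\, dk.$$

More precisely, Property (I) follows from G3 as both functions $f_j(z)$ are bounded and continuous for $s < -n/2$. Furthermore, G0 and (2.11) imply that $q^{01}_{\infty,r} = q^{10}_{\infty,r} = 0$, hence

$$Q_{\infty,r}(\Psi,\Psi) = \int \langle q^{00}_{\infty,r}(x-y)\Psi^0(x), \Psi^0(y)\rangle\, dx\,dy + \int \langle q^{11}_{\infty,r}(x-y)\Psi^1(x), \Psi^1(y)\rangle\, dx\,dy. \tag{4.7}$$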

G1 and G2 together imply that

$$q^{11}_{\infty,r}(x-y) \to T\delta(x-y), \quad r \to 0.$$

Then (2.11) implies

$$q^{00}_{\infty,r} = P * q^{11}_{\infty,r} \to TP, \quad r \to 0.$$

Therefore, Property (II) follows from (4.7): the justification follows easily in the FT space. The convergence of the covariance (II) provides the convergence of the measures (4.6) as all the measures are Gaussian. □

Example. Consider an initial measure $\mu_0$ constructed in Example in Sect. 2.6.1. It satisfies Assumptions S0–S3 and G0. Furthermore, $q_0^{00}(z) = q_0^{11}(z) = 0$, $|z| \ge 1$, if we choose $r_0 = 1$. Denote by $Y_0(x) = (Y_0^0(x), Y_0^1(x))$ a random function with distribution $\mu_0$. Denote by $\mu_{0,r}$, $r > 0$, the distribution of the random function $Y_{0,r}(x) = (r^{1-\nu} Y_0^0(r^{-1}x),\ r^{-\nu} Y_0^1(r^{-1}x))$, where $\nu = n/2$. The corresponding CFs are $q^{ij}_{0,r}(z) = r^{2-n-i-j} q_0^{ij}(r^{-1}z)$. Then all Conditions G0–G3 hold with $T := \frac14 \int \mathrm{tr}\, q_0^{11}(z)\, dz$.

5. Compactness of the Family of Measures $\mu_t$

This section gives the proof of bound (3.5). Proposition 3.2 will follow then with the help of the Prokhorov Theorem [37, Lemma II.3.1] as in the proof of [37, Thm. XII.5.2]. It is important that the embedding $\mathcal{H} \subset \mathcal{H}^{-\varepsilon}$ is compact, by virtue of Sobolev’s Theorem, if $\varepsilon > 0$. Set:

$$e_t \equiv E\big[ |\dot u(x,t)|^2 + |\nabla u(x,t)|^2 + m^2 |u(x,t)|^2 \big], \quad x \in \mathbb{R}^n. \tag{5.1}$$

The CFs of measure $\mu_t$ are translation invariant due to condition S1. Hence, taking expectation in (3.4), we get by S2,

$$e_t |B_R| \le e_0 |B_{R+t}| < \infty. \tag{5.2}$$

Here $B_R$ is the ball $|x| \le R$ in $\mathbb{R}^n$, and $|B_R|$ is its volume. Taking $R \to \infty$ we derive from (5.2) that $e_t \le e_0$: in fact, the reversibility implies then $e_t = e_0$ (the mean energy density conservation). Hence, taking the expectation in (1.5), we get (3.5): $E\|U_0(t)Y_0\|^2_R = e_0 |B_R| < \infty$.

Corollary 5.1. Bound (3.5) implies the convergence of the integrals in (2.6).

6. Convergence of the Covariance Functions

In this section we check the convergence of the CFs of measures $\mu_t$ with the help of the FT. This convergence is used in Sect. 8.

6.1. Mixing in terms of the spectral density. The next proposition gives the mixing property in terms of the FT $\hat q_0^{ij}$ of the initial CFs $q_0^{ij}$. Assumption S2 implies that $q_0^{ij}(z)$ is a measurable bounded function. Therefore, it belongs to the Schwartz space of tempered distributions as well as its FT.

Proposition 6.1. Let the assumptions of Theorem B hold. Then $\hat q_0^{ij} \in L^1(\mathbb{R}^n) \otimes M^2$, $\forall i, j$.

Proof. Step 1. First, let us prove that

$$\partial^\gamma q_0^{ij}(z) \in L^p(\mathbb{R}^n) \otimes M^2, \quad p \ge 1, \quad |\gamma| \le 2 - i - j. \tag{6.1}$$

Conditions S0, S2 and S3 imply, by [17, Lemma 17.2.3] (see Lemma 8.2 (i) below), that

$$|\partial^\gamma q_0^{ij}(z)| \le C e_0\, \varphi^{1/2}(|z|), \quad z \in \mathbb{R}^n. \tag{6.2}$$

The mixing coefficient $\varphi$ is bounded, hence (6.2) and (2.10) imply (6.1):

$$\int_{\mathbb{R}^n} |\partial^\gamma q_0^{ij}(z)|^p\, dz \le C e_0^p \int_{\mathbb{R}^n} \varphi^{p/2}(|z|)\, dz \le C_1 e_0^p \int_0^\infty r^{n-1} \varphi^{1/2}(r)\, dr < \infty. \tag{6.3}$$

Step 2. By Bochner’s theorem, $\hat q_0 \equiv (\hat q_0^{ij}(k))\,dk$ is a complex positive-definite matrix-valued measure on $\mathbb{R}^n$, and S2 implies that the total measure $\hat q_0(\mathbb{R}^n)$ is finite. On the other hand, (6.1) with $p = 2$ implies that $\hat q_0^{ij} \in L^2(\mathbb{R}^n) \otimes M^2$. □

6.2. Proof of convergence of covariance functions. Formulas (12.3), (12.2) and Proposition 6.1 imply, for example,

$$q_t^{00}(x-y) := E\big(u(x,t) \otimes u(y,t)\big) = \frac{1}{(2\pi)^n} \int e^{-ik(x-y)} \Big[ \frac{1 + \cos 2\omega t}{2}\, \hat q_0^{00}(k) + \frac{\sin 2\omega t}{2\omega} \big(\hat q_0^{01}(k) + \hat q_0^{10}(k)\big) + \frac{1 - \cos 2\omega t}{2\omega^2}\, \hat q_0^{11}(k) \Big]\, dk, \tag{6.4}$$

where the integral converges and defines a continuous function determined for all $x, y \in \mathbb{R}^n$. Similar integrals give a convenient modification for all functions $q_t^{ij}(x-y)$, which we will work with.

Proposition 6.2. Covariance functions $q_t^{ij}(z)$, $i, j = 0, 1$, converge for all $z \in \mathbb{R}^n$:

$$q_t^{ij}(z) \to q_\infty^{ij}(z), \quad t \to \infty, \tag{6.5}$$

where functions $q_\infty^{ij}(z)$ are defined in (2.11).

Proof. Equation (6.4) and Proposition 6.1 imply

$$q_t^{00}(z) \to \frac12 \big[ q_0^{00}(z) + P * q_0^{11}(z) \big] = q_\infty^{00}(z), \quad t \to \infty, \tag{6.6}$$

as the oscillatory integrals tend to zero by the Lebesgue–Riemann Lemma. For other $i, j$ the proof is similar. □

Note that $P(z) \in L^1(\mathbb{R}^n)$. Therefore, (6.1) with $p = 1$ and explicit formulas (2.11) imply the following

Corollary 6.3. Functions $q_\infty^{ij}$ belong to $L^1(\mathbb{R}^n) \otimes M^2$, $i, j = 0, 1$.

Remark 6.4. A similar argument in the FT representation implies compactness in Proposition 3.2. We provided an independent proof of the compactness in Sect. 5 to show the relation with energy conservation.
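The decay of the oscillatory terms used in the proof of Proposition 6.2 can be illustrated numerically. The following is a minimal sketch in one space dimension, assuming $m = 1$ and a hypothetical Gaussian density $e^{-k^2}$ in place of $\hat q_0^{ij}$ (neither choice comes from the paper); it evaluates $\int \cos(2\omega(k)t)\, e^{-k^2}\, dk$ by the trapezoid rule for a few values of $t$.

```python
import math

def omega(k, m=1.0):
    """Klein-Gordon dispersion relation omega(k) = sqrt(k^2 + m^2)."""
    return math.sqrt(k * k + m * m)

def oscillatory_integral(t, m=1.0, half_width=8.0, steps=40000):
    """Trapezoid approximation of the integral of cos(2*omega(k)*t) * exp(-k^2)
    over [-half_width, half_width]; the Gaussian weight is an illustrative assumption."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        k = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.cos(2.0 * omega(k, m) * t) * math.exp(-k * k)
    return total * h

# The integral at t = 0 is the full mass of the density; for large t it shrinks.
values = {t: oscillatory_integral(t) for t in (0.0, 10.0, 100.0)}
```

In this toy setting the value at $t = 0$ is the total mass of the density, while the values at larger $t$ are much smaller in modulus, in line with the Lebesgue–Riemann Lemma.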

7. Bernstein’s Argument for the Klein–Gordon Equation

In this and the subsequent section we develop a version of Bernstein’s “room-corridor” method. We use the standard integral representation for solutions, divide the domain of integration into “rooms” and “corridors” and evaluate their contribution. As a result, the value $\langle U_0(t)Y_0, \Psi\rangle$ for $\Psi \in \mathcal{D}$ is represented as the sum of weakly dependent random variables. We evaluate the variances of these random variables, which will be important in the next section.

First, we evaluate $\langle Y(t), \Psi\rangle$ in (3.6) by using duality arguments. For $t \in \mathbb{R}$, introduce “formal adjoint” operators $U_0'(t)$, $U'(t)$ from space $\mathcal{D}$ to a suitable space of distributions. For example,

$$\langle Y, U_0'(t)\Psi\rangle = \langle U_0(t)Y, \Psi\rangle, \quad \Psi \in \mathcal{D}, \quad Y \in \mathcal{H}. \tag{7.1}$$

Denote $K(\cdot,t) = U_0'(t)\Psi$. Then (7.1) can be rewritten as

$$\langle Y(t), \Psi\rangle = \langle Y_0, K(\cdot,t)\rangle, \quad t \in \mathbb{R}. \tag{7.2}$$

The adjoint groups admit a convenient description. Lemma 7.1 below shows that the action of groups $U_0'(t)$, $U'(t)$ coincides, respectively, with the action of $U_0(t)$, $U(t)$, up to the order of the components. In particular, $U_0'(t)$, $U'(t)$ are continuous groups of operators $\mathcal{D} \to \mathcal{D}$.

Lemma 7.1. For $\Psi = (\Psi^0, \Psi^1) \in \mathcal{D}$,

$$U_0'(t)\Psi = (\dot\phi(\cdot,t), \phi(\cdot,t)), \quad U'(t)\Psi = (\dot\psi(\cdot,t), \psi(\cdot,t)), \tag{7.3}$$

where $\phi(x,t)$ is the solution of Eq. (3.1) with the initial data $(u_0, v_0) = (\Psi^1, \Psi^0)$ and $\psi(x,t)$ is the solution of Eq. (1.3) with the initial data $(u_0, v_0) = (\Psi^1, \Psi^0)$.

Proof. Differentiating (7.1) in $t$ with $Y, \Psi \in \mathcal{D}$, we obtain

$$\langle Y, \dot U_0'(t)\Psi\rangle = \langle \dot U_0(t)Y, \Psi\rangle. \tag{7.4}$$

Group $U_0(t)$ has the generator (3.3). The generator of $U_0'(t)$ is the conjugate operator

$$\mathcal{A}_0' = \begin{pmatrix} 0 & A_0 \\ 1 & 0 \end{pmatrix}. \tag{7.5}$$

Hence, Eq. (7.3) holds with $\ddot\phi = A_0\phi$. For group $U'(t)$ the proof is similar. □

Next we introduce a “room-corridor” partition of $\mathbb{R}^n$. Given $t > 0$, choose $d \equiv d_t \ge 1$ and $\rho \equiv \rho_t > 0$. Asymptotic relations between $t$, $d_t$ and $\rho_t$ are specified below. Set $h = d + \rho$ and

$$a^j = jh, \quad b^j = a^j + d, \quad j \in \mathbb{Z}. \tag{7.6}$$

We call the slabs $R_t^j = \{x \in \mathbb{R}^n : a^j \le x^n \le b^j\}$ “rooms” and $C_t^j = \{x \in \mathbb{R}^n : b^j \le x^n \le a^{j+1}\}$ “corridors”. Here $x = (x^1, \dots, x^n)$, $d$ is the width of a room, and $\rho$ that of a corridor. Denote by $\chi_r$ the indicator of the interval $[0, d]$ and by $\chi_c$ that of $[d, h]$, so that $\sum_{j\in\mathbb{Z}} (\chi_r(s - jh) + \chi_c(s - jh)) = 1$ for (almost all) $s \in \mathbb{R}$. The following decomposition holds:

$$\langle Y_0, K(\cdot,t)\rangle = \sum_{j\in\mathbb{Z}} \big( \langle Y_0, \chi_r^j K(\cdot,t)\rangle + \langle Y_0, \chi_c^j K(\cdot,t)\rangle \big), \tag{7.7}$$

where $\chi_r^j := \chi_r(x^n - jh)$ and $\chi_c^j := \chi_c(x^n - jh)$. Consider random variables $r_t^j$, $c_t^j$, where

$$r_t^j = \langle Y_0, \chi_r^j K(\cdot,t)\rangle, \quad c_t^j = \langle Y_0, \chi_c^j K(\cdot,t)\rangle, \quad j \in \mathbb{Z}. \tag{7.8}$$

Then (7.7) and (7.2) imply

$$\langle U_0(t)Y_0, \Psi\rangle = \sum_{j\in\mathbb{Z}} (r_t^j + c_t^j). \tag{7.9}$$
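The bookkeeping behind the room-corridor partition (7.6) and the splitting (7.7) can be sketched in a few lines. This is only an illustration of the indexing along the coordinate $x^n$: the function names are hypothetical, and half-open intervals are a convention chosen here so that every coordinate receives exactly one label (the paper's closed intervals overlap only on a set of measure zero).

```python
import math

def room_corridor_index(s, d, rho):
    """Locate the coordinate s in the partition (7.6).

    Returns (j, kind): room j is [j*h, j*h + d) and corridor j is
    [j*h + d, (j+1)*h), where h = d + rho.
    """
    h = d + rho
    j = math.floor(s / h)
    offset = s - j * h
    kind = "room" if offset < d else "corridor"
    return j, kind

def split_sum(samples, weights, d, rho):
    """Split sum(weights) into per-room and per-corridor partial sums keyed by j,
    mimicking the decomposition (7.7) for a finite weighted point set."""
    rooms, corridors = {}, {}
    for s, w in zip(samples, weights):
        j, kind = room_corridor_index(s, d, rho)
        bucket = rooms if kind == "room" else corridors
        bucket[j] = bucket.get(j, 0.0) + w
    return rooms, corridors

rooms, corridors = split_sum([0.5, 3.5, 4.5, -0.5], [1.0, 2.0, 3.0, 4.0], d=3.0, rho=1.0)
```

The partial sums over rooms and corridors add up to the total, which is the discrete analogue of the identity $\sum_j (\chi_r^j + \chi_c^j) = 1$.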

The series in (7.9) is indeed a finite sum. In fact, (7.5) and (12.1) imply that in the FT representation, $\dot{\hat K}(k,t) = \hat{\mathcal{A}}_0'(k)\hat K(k,t)$ and $\hat K(k,t) = \hat{\mathcal{G}}_t'(k)\hat\Psi(k)$. Therefore,

$$K(x,t) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} e^{-ikx}\, \hat{\mathcal{G}}_t'(k)\hat\Psi(k)\, dk. \tag{7.10}$$

This can be rewritten as a convolution

$$K(\cdot,t) = \mathcal{R}_t * \Psi, \tag{7.11}$$

where $\mathcal{R}_t = F^{-1}\hat{\mathcal{G}}_t'$. The support $\mathrm{supp}\, \Psi \subset B_r$ with an $r > 0$. Then the convolution representation (7.11) implies that the support of the function $K$ at $t > 0$ is a subset of an “inflated future cone”

$$\mathrm{supp}\, K \subset \{(x,t) \in \mathbb{R}^n \times \mathbb{R}_+ : |x| \le t + r\}, \tag{7.12}$$

as $\mathcal{R}_t(x)$ is supported by the “future cone” $|x| \le t$. The last fact follows from general formulas (see [13, (II.4.5.12)]), or from the Paley–Wiener Theorem (see, e.g. [13, Thm. II.2.5.1]), as $\hat{\mathcal{R}}_t(k)$ is an entire function of $k \in \mathbb{C}^n$ satisfying suitable bounds. Finally, (7.8) implies that

$$r_t^j = c_t^j = 0 \quad \text{for} \quad jh + t < -r \quad \text{or} \quad jh - t > r. \tag{7.13}$$

Therefore, series (7.9) becomes a sum

$$\langle U_0(t)Y_0, \Psi\rangle = \sum_{j=-N_t}^{N_t} (r_t^j + c_t^j), \quad N_t \sim \frac{t}{h}, \tag{7.14}$$

as $h \ge 1$.

Lemma 7.2. Let $n \ge 1$, $m > 0$, and S0–S3 hold. The following bounds hold for $t > 1$:

$$E|r_t^j|^2 \le C(\Psi)\, d_t/t, \quad E|c_t^j|^2 \le C(\Psi)\, \rho_t/t, \quad j \in \mathbb{Z}. \tag{7.15}$$

Proof. We discuss the first bound in (7.15) only; the second is done in a similar way.

Step 1. Rewrite the left-hand side as the integral of CFs. Definition (7.8) and Corollary 5.1 imply by Fubini’s Theorem that

$$E|r_t^j|^2 = \langle \chi_r^j(x^n)\chi_r^j(y^n)\, q_0(x-y),\ K(x,t) \otimes K(y,t) \rangle. \tag{7.16}$$

The following bound holds true (cf. [29, Thm. XI.17 (b)]):

$$\sup_{x\in\mathbb{R}^n} |K(x,t)| = O(t^{-n/2}), \quad t \to \infty. \tag{7.17}$$

In fact, (7.10) and (12.2) imply that $K$ can be written as the sum

$$K(x,t) = \frac{1}{(2\pi)^n} \sum_{\pm} \int_{\mathbb{R}^n} e^{-i(kx \mp \omega t)}\, a^{\pm}(\omega)\hat\Psi(k)\, dk, \tag{7.18}$$

where $a^{\pm}(\omega)$ is a matrix whose entries are linear functions of $\omega$ or $1/\omega$. Let us prove the asymptotics (7.17) along each ray $x = vt + x_0$ with $|v| \le 1$; then it holds uniformly in $x \in \mathbb{R}^n$ owing to (7.12). We have by (7.18),

$$K(vt + x_0, t) = \frac{1}{(2\pi)^n} \sum_{\pm} \int_{\mathbb{R}^n} e^{-i(kv \mp \omega)t - ikx_0}\, a^{\pm}(\omega)\hat\Psi(k)\, dk. \tag{7.19}$$

This is a sum of oscillatory integrals with the phase functions $\phi_{\pm}(k) = kv \pm \omega(k)$. Each function has two stationary points, solutions to the equation $v = \pm\nabla\omega(k)$ if $|v| < 1$, and has none if $|v| \ge 1$. The phase functions are nondegenerate, i.e.

$$\det\Big( \frac{\partial^2 \phi_{\pm}(k)}{\partial k_i\, \partial k_j} \Big)_{i,j=1}^{n} \ne 0, \quad k \in \mathbb{R}^n. \tag{7.20}$$

At last, $\hat\Psi(k)$ is smooth and decays rapidly at infinity. Therefore, $K(vt + x_0, t) = O(t^{-n/2})$ according to the standard method of stationary phase, [14].

Step 2. According to (7.12) and (7.17), Eq. (7.16) implies that

$$E|r_t^j|^2 \le C t^{-n} \int_{|x|\le t+r} \chi_r^j(x^n)\, \|q_0(x-y)\|\, dx\, dy = C t^{-n} \int_{|x|\le t+r} \chi_r^j(x^n)\, dx \int_{\mathbb{R}^n} \|q_0(z)\|\, dz, \tag{7.21}$$

where $\|q_0(z)\|$ stands for the norm of a matrix $\big(q_0^{ij}(z)\big)$. Therefore, (7.15) follows as $\|q_0(\cdot)\| \in L^1(\mathbb{R}^n)$ by (6.1). □

8. Convergence of Characteristic Functionals

In this section we complete the proof of Proposition 3.3. We use a version of the CLT developed by Ibragimov and Linnik. If $Q_\infty(\Psi,\Psi) = 0$, Proposition 3.3 is obvious. Thus, we may assume that for a given $\Psi \in \mathcal{D}$,

$$Q_\infty(\Psi,\Psi) \ne 0. \tag{8.1}$$

Choose $0 < \delta < 1$ and

$$\rho_t \sim t^{1-\delta}, \quad d_t \sim \frac{t}{\ln t}, \quad t \to \infty. \tag{8.2}$$

Lemma 8.1. The following limit holds true:

$$N_t \Big[ \varphi(\rho_t) + \Big(\frac{\rho_t}{t}\Big)^{1/2} \Big] + N_t^2 \Big[ \varphi^{1/2}(\rho_t) + \frac{\rho_t}{t} \Big] \to 0, \quad t \to \infty. \tag{8.3}$$

Proof. Function $\varphi(r)$ is nonincreasing, hence by (2.10),

$$r^n \varphi^{1/2}(r) = n \int_0^r s^{n-1} \varphi^{1/2}(r)\, ds \le n \int_0^r s^{n-1} \varphi^{1/2}(s)\, ds \le C_\varphi < \infty. \tag{8.4}$$

Then Eq. (8.3) follows as (8.2) and (7.14) imply that $N_t \sim \ln t$. □
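The interplay of the scales in Lemma 8.1 can be checked numerically on a toy example. The sketch below assumes $n = 2$, $\delta = 1/2$, exact equalities $\rho_t = t^{1-\delta}$, $d_t = t/\ln t$, $N_t = t/(d_t + \rho_t)$ in place of the asymptotic relations, and a hypothetical mixing coefficient $\varphi(r) = (1+r)^{-(2n+1)}$, which is nonincreasing and satisfies the integrability condition used in (8.4). None of these specific choices is taken from the paper.

```python
import math

def lemma_8_1_expression(t, delta=0.5, n=2):
    """Left-hand side of (8.3) for the illustrative choices described above."""
    rho = t ** (1.0 - delta)          # corridor width rho_t
    d = t / math.log(t)               # room width d_t
    n_t = t / (d + rho)               # number of rooms, N_t ~ t / h
    phi = (1.0 + rho) ** (-(2 * n + 1))  # hypothetical mixing coefficient at rho_t
    return n_t * (phi + math.sqrt(rho / t)) + n_t ** 2 * (math.sqrt(phi) + rho / t)

# Evaluate along a rapidly growing sequence of times; the decay is only
# logarithmically corrected power-like, so large t is needed to see it.
samples = [lemma_8_1_expression(t) for t in (1e2, 1e6, 1e12)]
```

With these choices the dominant contribution for large $t$ is $N_t(\rho_t/t)^{1/2} \sim t^{-\delta/2}\ln t$, so the convergence in (8.3) is slow but visible.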

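By the triangle inequality,

$$|\hat\mu_t(\Psi) - \hat\mu_\infty(\Psi)| \le \Big| E\exp\{i\langle U_0(t)Y_0, \Psi\rangle\} - E\exp\Big\{i\sum\nolimits_t r_t^j\Big\} \Big| + \Big| \exp\Big\{-\frac12 \sum\nolimits_t E|r_t^j|^2\Big\} - \exp\Big\{-\frac12 Q_\infty(\Psi,\Psi)\Big\} \Big| + \Big| E\exp\Big\{i\sum\nolimits_t r_t^j\Big\} - \exp\Big\{-\frac12 \sum\nolimits_t E|r_t^j|^2\Big\} \Big| \equiv I_1 + I_2 + I_3, \tag{8.5}$$

where the sum $\sum_t$ stands for $\sum_{j=-N_t}^{N_t}$. We are going to show that all summands $I_1$, $I_2$, $I_3$ tend to zero as $t \to \infty$.

Step (i). Equation (7.14) implies

$$I_1 = \Big| E\exp\Big\{i\sum\nolimits_t r_t^j\Big\} \Big( \exp\Big\{i\sum\nolimits_t c_t^j\Big\} - 1 \Big) \Big| \le \sum\nolimits_t E|c_t^j| \le \sum\nolimits_t \big(E|c_t^j|^2\big)^{1/2}. \tag{8.6}$$

From (8.6), (7.15) and (8.3) we obtain that

$$I_1 \le C N_t (\rho_t/t)^{1/2} \to 0, \quad t \to \infty. \tag{8.7}$$

Step (ii). By the triangle inequality,

$$I_2 \le \frac12 \Big| \sum\nolimits_t E|r_t^j|^2 - Q_\infty(\Psi,\Psi) \Big| \le \frac12 |Q_t(\Psi,\Psi) - Q_\infty(\Psi,\Psi)| + \frac12 \Big| E\Big(\sum\nolimits_t r_t^j\Big)^2 - \sum\nolimits_t E|r_t^j|^2 \Big| + \frac12 \Big| E\Big(\sum\nolimits_t r_t^j\Big)^2 - Q_t(\Psi,\Psi) \Big| \equiv I_{21} + I_{22} + I_{23}, \tag{8.8}$$

where $Q_t$ is a quadratic form with the integral kernel $\big(Q_t^{ij}(x,y)\big)$. Equation (6.5) implies that $I_{21} \to 0$. As to $I_{22}$, we first have that

$$I_{22} \le \sum_{j<l} E|r_t^j r_t^l|. \tag{8.9}$$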
The next lemma is a corollary of [17, Lemma 17.2.3].

Lemma 8.2. Let $\xi$ be a complex random variable measurable with respect to the σ-algebra $\sigma(A)$, $\eta$ with respect to the σ-algebra $\sigma(B)$, and the distance $\mathrm{dist}(A, B) \ge r > 0$.
(i) Let $(E|\xi|^2)^{1/2} \le a$, $(E|\eta|^2)^{1/2} \le b$. Then $|E\xi\eta - E\xi\, E\eta| \le C ab\, \varphi^{1/2}(r)$.
(ii) Let $|\xi| \le a$, $|\eta| \le b$ a.s. Then $|E\xi\eta - E\xi\, E\eta| \le C ab\, \varphi(r)$.

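We apply Lemma 8.2 to deduce that $I_{22} \to 0$ as $t \to \infty$. Note that $r_t^j = \langle Y_0(x), \chi_r^j(x^n)(\mathcal{R}_t * \Psi)\rangle$ is measurable with respect to the σ-algebra $\sigma(R_t^j)$. The distance between the different rooms $R_t^j$ is greater than or equal to $\rho_t$ according to (7.6). Then (8.9) and S1, S3 imply, together with Lemma 8.2 (i), that

$$I_{22} \le C N_t^2\, \varphi^{1/2}(\rho_t), \tag{8.10}$$

which goes to 0 as $t \to \infty$ because of (7.15) and (8.3). Finally, it remains to check that $I_{23} \to 0$, $t \to \infty$. By the Cauchy–Schwarz inequality,

$$I_{23} \le \Big| E\Big(\sum\nolimits_t r_t^j\Big)^2 - E\Big(\sum\nolimits_t r_t^j + \sum\nolimits_t c_t^j\Big)^2 \Big| \le C N_t \sum\nolimits_t E|c_t^j|^2 + C \Big( E\Big(\sum\nolimits_t r_t^j\Big)^2 \Big)^{1/2} \Big( N_t \sum\nolimits_t E|c_t^j|^2 \Big)^{1/2}. \tag{8.11}$$

Then (7.15), (8.9) and (8.10) imply

$$E\Big(\sum\nolimits_t r_t^j\Big)^2 \le \sum\nolimits_t E|r_t^j|^2 + 2\sum_{j<l} E|r_t^j r_t^l| \le C N_t d_t/t + C_1 N_t\, \varphi^{1/2}(\rho_t) \le C_2 < \infty.$$

Now (7.15), (8.11) and (8.3) yield

$$I_{23} \le C_1 N_t^2 \rho_t/t + C_2 N_t (\rho_t/t)^{1/2} \to 0, \quad t \to \infty. \tag{8.12}$$

So, all terms $I_{21}$, $I_{22}$, $I_{23}$ in (8.8) tend to zero. Then (8.8) implies that

$$I_2 \le \frac12 \Big| \sum\nolimits_t E|r_t^j|^2 - Q_\infty(\Psi,\Psi) \Big| \to 0, \quad t \to \infty. \tag{8.13}$$

Step (iii). It remains to verify that

$$I_3 = \Big| E\exp\Big\{i\sum\nolimits_t r_t^j\Big\} - \exp\Big\{-\frac12 \sum\nolimits_t E|r_t^j|^2\Big\} \Big| \to 0, \quad t \to \infty. \tag{8.14}$$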

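Using Lemma 8.2 (ii) yields:

$$\Big| E\exp\Big\{i\sum_{-N_t}^{N_t} r_t^j\Big\} - \prod_{-N_t}^{N_t} E\exp\{i r_t^j\} \Big| \le \Big| E\exp\{i r_t^{-N_t}\}\exp\Big\{i\sum_{-N_t+1}^{N_t} r_t^j\Big\} - E\exp\{i r_t^{-N_t}\}\, E\exp\Big\{i\sum_{-N_t+1}^{N_t} r_t^j\Big\} \Big| + \Big| E\exp\{i r_t^{-N_t}\}\, E\exp\Big\{i\sum_{-N_t+1}^{N_t} r_t^j\Big\} - \prod_{-N_t}^{N_t} E\exp\{i r_t^j\} \Big| \le C\varphi(\rho_t) + \Big| E\exp\Big\{i\sum_{-N_t+1}^{N_t} r_t^j\Big\} - \prod_{-N_t+1}^{N_t} E\exp\{i r_t^j\} \Big|.$$

We then apply Lemma 8.2 (ii) recursively and get, according to Lemma 8.1,

$$\Big| E\exp\Big\{i\sum\nolimits_t r_t^j\Big\} - \prod_{-N_t}^{N_t} E\exp\{i r_t^j\} \Big| \le C N_t\, \varphi(\rho_t) \to 0, \quad t \to \infty. \tag{8.15}$$

It remains to check that

$$\Big| \prod_{-N_t}^{N_t} E\exp\{i r_t^j\} - \exp\Big\{-\frac12 \sum\nolimits_t E|r_t^j|^2\Big\} \Big| \to 0, \quad t \to \infty. \tag{8.16}$$

According to the standard statement of the CLT (see, e.g. [24, Thm. 4.7]), it suffices to verify the Lindeberg condition: $\forall \varepsilon > 0$,

$$\frac{1}{\sigma_t} \sum\nolimits_t E_{\varepsilon\sqrt{\sigma_t}} |r_t^j|^2 \to 0, \quad t \to \infty. \tag{8.17}$$

Here $\sigma_t \equiv \sum_t E|r_t^j|^2$, and $E_\delta f \equiv E X_\delta f$, where $X_\delta$ is the indicator of the event $|f| > \delta^2$. Note that (8.13) and (8.1) imply that $\sigma_t \to Q_\infty(\Psi,\Psi) \ne 0$, $t \to \infty$. Hence it remains to verify that $\forall \varepsilon > 0$,

$$\sum\nolimits_t E_\varepsilon |r_t^j|^2 \to 0, \quad t \to \infty. \tag{8.18}$$

We check Eq. (8.18) in Sect. 9. This will complete the proof of Proposition 3.3. □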

9. The Lindeberg Condition

The proof of (8.18) can be reduced to the case when for some $\Lambda \ge 0$ we have, almost surely, that

$$|u_0(x)| + |v_0(x)| \le \Lambda < \infty, \quad x \in \mathbb{R}^n. \tag{9.1}$$

Then the proof of (8.18) is reduced to the convergence

$$\sum\nolimits_t E|r_t^j|^4 \to 0, \quad t \to \infty \tag{9.2}$$

by using Chebyshev’s inequality. The general case can be covered by standard cutoff arguments by taking into account that the bound (7.15) for $E|r_t^j|^2$ depends only on $e_0$ and $\varphi$. The last fact is obvious from (7.21) and (6.3) with $p = 1$ and $\gamma = 0$. We deduce (9.2) from

Theorem 9.1. Let the conditions of Theorem B hold and assume that (9.1) is fulfilled. Then for any $\Psi \in \mathcal{D}$ there exists a constant $C(\Psi)$ such that

$$E|r_t^j|^4 \le C(\Psi)\Lambda^4 d_t^2/t^2, \quad t > 1. \tag{9.3}$$

Proof. Step 1. Given four points $x_1, x_2, x_3, x_4 \in \mathbb{R}^n$, set:

$$M_0^{(4)}(x_1, \dots, x_4) = E\big(Y_0(x_1) \otimes \dots \otimes Y_0(x_4)\big).$$

Then, similarly to (7.16), Eqs. (9.1) and (7.8) imply by the Fubini Theorem that

$$E|r_t^j|^4 = \langle \chi_r^j(x_1^n) \dots \chi_r^j(x_4^n)\, M_0^{(4)}(x_1, \dots, x_4),\ K(x_1,t) \otimes \dots \otimes K(x_4,t) \rangle. \tag{9.4}$$

Let us analyse the domain of the integration $(\mathbb{R}^n)^4$ in the RHS of (9.4). We partition $(\mathbb{R}^n)^4$ into three parts, $W_2$, $W_3$ and $W_4$:

$$(\mathbb{R}^n)^4 = \bigcup_{i=2}^{4} W_i, \quad W_i = \{\bar x = (x_1, x_2, x_3, x_4) \in (\mathbb{R}^n)^4 : |x_1 - x_i| = \max_{p=2,3,4} |x_1 - x_p|\}. \tag{9.5}$$

Furthermore, given $\bar x = (x_1, x_2, x_3, x_4) \in W_i$, divide $\mathbb{R}^n$ into three parts $S_j$, $j = 1, 2, 3$: $\mathbb{R}^n = S_1 \cup S_2 \cup S_3$, by two hyperplanes orthogonal to the segment $[x_1, x_i]$ and partitioning it into three equal segments, where $x_1 \in S_1$ and $x_i \in S_3$. Denote by $x_p$, $x_q$ the two remaining points with $p, q \ne 1, i$. Set: $A_i = \{\bar x \in W_i : x_p \in S_1, x_q \in S_3\}$, $B_i = \{\bar x \in W_i : x_p, x_q \notin S_1\}$ and $C_i = \{\bar x \in W_i : x_p, x_q \notin S_3\}$, $i = 2, 3, 4$. Then $W_i = A_i \cup B_i \cup C_i$. Define the function $m_0^{(4)}(\bar x)$, $\bar x \in (\mathbb{R}^n)^4$, in the following way:

$$m_0^{(4)}(\bar x)\big|_{W_i} = \begin{cases} M_0^{(4)}(\bar x) - q_0(x_1 - x_p) \otimes q_0(x_i - x_q), & \bar x \in A_i, \\ M_0^{(4)}(\bar x), & \bar x \in B_i \cup C_i. \end{cases} \tag{9.6}$$

This determines $m_0^{(4)}(\bar x)$ correctly for almost all quadruples $\bar x$. Note that

$$\langle \chi_r^j(x_1^n) \dots \chi_r^j(x_4^n)\, q_0(x_1 - x_p) \otimes q_0(x_i - x_q),\ K(x_1,t) \otimes \dots \otimes K(x_4,t) \rangle = \langle \chi_r^j(x_1^n)\chi_r^j(x_p^n)\, q_0(x_1 - x_p),\ K(x_1,t) \otimes K(x_p,t) \rangle\, \langle \chi_r^j(x_i^n)\chi_r^j(x_q^n)\, q_0(x_i - x_q),\ K(x_i,t) \otimes K(x_q,t) \rangle.$$

Each factor here is bounded by $C(\Psi)\, d_t/t$. Similarly to (7.15), this can be deduced from an expression of type (7.16) for the factors. Therefore, the proof of (9.3) reduces to the proof of the bound

$$I_t := |\langle \chi_r^j(x_1^n) \dots \chi_r^j(x_4^n)\, m_0^{(4)}(x_1, \dots, x_4),\ K(x_1,t) \otimes \dots \otimes K(x_4,t) \rangle| \le C(\Psi)\Lambda^4 d_t^2/t^2, \quad t > 1. \tag{9.7}$$
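The geometric case analysis behind (9.5) and (9.6) can be made concrete with a small classifier. This is a sketch under stated assumptions: $B_i$ and $C_i$ are read as "neither remaining point lies in $S_1$" and "neither remaining point lies in $S_3$", which is the reading under which $W_i = A_i \cup B_i \cup C_i$; ties in (9.5) are broken by taking the first maximiser; a quadruple belonging to both $B_i$ and $C_i$ is reported as B. The function names are hypothetical.

```python
import math

def region(x, x1, xi):
    """Return 1, 2 or 3: the slab S_1, S_2 or S_3 containing x, obtained by
    projecting x onto the segment [x1, xi] and cutting it into equal thirds."""
    direction = [b - a for a, b in zip(x1, xi)]
    length_sq = sum(c * c for c in direction)
    tau = sum((p - a) * c for p, a, c in zip(x, x1, direction)) / length_sq
    if tau < 1.0 / 3.0:
        return 1
    if tau > 2.0 / 3.0:
        return 3
    return 2

def classify_quadruple(points):
    """Given four points (tuples of equal length), return (i, label) where
    points lie in W_i (i in {2, 3, 4}) and label is 'A', 'B' or 'C'."""
    x1 = points[0]
    dists = [math.dist(x1, p) for p in points[1:]]
    far = max(range(3), key=lambda idx: dists[idx])  # first maximiser of |x1 - xp|
    if dists[far] == 0.0:
        raise ValueError("all four points coincide; the thirds are undefined")
    xi = points[far + 1]
    others = [points[idx + 1] for idx in range(3) if idx != far]
    regions = sorted(region(x, x1, xi) for x in others)
    if regions == [1, 3]:
        label = "A"   # one remaining point near x1, the other near xi
    elif 1 not in regions:
        label = "B"   # x1 is separated from the other three points
    else:
        label = "C"   # xi is separated from the other three points
    return far + 2, label

example = classify_quadruple([(0.0,), (3.0,), (0.5,), (2.5,)])
```

In each case one group of points is at distance at least $|x_1 - x_i|/3$ from the rest, which is what makes a mixing bound of the type of Lemma 8.2 (ii) applicable.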

Step 2. Similarly to (7.21), Eq. (7.17) implies

$$I_t \le C(\Psi)\, t^{-2n} \int_{(B_t^r)^4} \chi_r^j(x_1^n) \dots \chi_r^j(x_4^n)\, |m_0^{(4)}(x_1, \dots, x_4)|\, dx_1\, dx_2\, dx_3\, dx_4, \tag{9.8}$$

where $B_t^r$ is the ball $\{x \in \mathbb{R}^n : |x| \le t + r\}$. We estimate $m_0^{(4)}$ using Lemma 8.2 (ii).

Lemma 9.2. For each $i = 2, 3, 4$ and almost all $\bar x \in W_i$ the following bound holds:

$$|m_0^{(4)}(x_1, \dots, x_4)| \le C\Lambda^4 \varphi(|x_1 - x_i|/3). \tag{9.9}$$

Proof. For $\bar x \in A_i$ we apply Lemma 8.2 (ii) to $\mathbb{C}^2 \otimes \mathbb{C}^2 \equiv \mathbb{R}^4 \otimes \mathbb{R}^4$-valued random variables $\xi = Y_0(x_1) \otimes Y_0(x_p)$ and $\eta = Y_0(x_i) \otimes Y_0(x_q)$. Then (9.1) implies the bound for almost all $\bar x \in A_i$,

$$|m_0^{(4)}(\bar x)| \le C\Lambda^4 \varphi(|x_1 - x_i|/3). \tag{9.10}$$

For $\bar x \in B_i$, we apply Lemma 8.2 (ii) to $\xi = Y_0(x_1)$ and $\eta = Y_0(x_p) \otimes Y_0(x_q) \otimes Y_0(x_i)$. Then S0 implies a similar bound for almost all $\bar x \in B_i$,

$$|m_0^{(4)}(\bar x)| = \big| M_0^{(4)}(\bar x) - E Y_0(x_1) \otimes E\big(Y_0(x_p) \otimes Y_0(x_q) \otimes Y_0(x_i)\big) \big| \le C\Lambda^4 \varphi(|x_1 - x_i|/3), \tag{9.11}$$

and the same for almost all $\bar x \in C_i$. □

Step 3. It remains to prove the following bounds for each $i = 2, 3, 4$:

$$V_i(t) := \int_{(B_t^r)^4} \chi_r^j(x_1^n) \dots \chi_r^j(x_4^n)\, X_i(\bar x)\, \varphi(|x_1 - x_i|/3)\, dx_1\, dx_2\, dx_3\, dx_4 \le C d_t^2\, t^{2n-2}, \tag{9.12}$$

where $X_i$ is an indicator of the set $W_i$. In fact, this integral does not depend on $i$, hence set $i = 2$ in the integrand:

$$V_i(t) \le C \int_{(B_t^r)^2} \chi_r^j(x_1^n)\, \varphi(|x_1 - x_2|/3) \Big[ \int_{B_t^r} \chi_r^j(x_3^n) \Big[ \int_{B_t^r} X_2(\bar x)\, dx_4 \Big] dx_3 \Big] dx_1\, dx_2. \tag{9.13}$$

Now a key observation is that the inner integral in $dx_4$ is $O(|x_1 - x_2|^n)$ as $X_2(\bar x) = 0$ for $|x_4 - x_1| > |x_1 - x_2|$. This implies

$$V_i(t) \le C \int_{B_t^r} \chi_r^j(x_1^n) \Big[ \int_{B_t^r} \varphi(|x_1 - x_2|/3)\, |x_1 - x_2|^n\, dx_2 \Big] dx_1 \int_{B_t^r} \chi_r^j(x_3^n)\, dx_3. \tag{9.14}$$

The inner integral in $dx_2$ is bounded as

$$\int_{B_t^r} \varphi(|x_1 - x_2|/3)\, |x_1 - x_2|^n\, dx_2 \le C(n) \int_0^{2(t+r)} r^{2n-1} \varphi(r/3)\, dr \le C_1(n) \sup_{r\in[0,2(t+r)]} r^n \varphi^{1/2}(r/3) \int_0^{2(t+r)} r^{n-1} \varphi^{1/2}(r/3)\, dr, \tag{9.15}$$

where the “sup” and the last integral are bounded by (8.4) and (2.10), respectively. Therefore, (9.12) follows from (9.14). This completes the proof of Theorem 9.1. □

Proof of convergence (9.2). As $d_t \le h \sim t/N_t$, bound (9.3) implies

$$\sum\nolimits_t E|r_t^j|^4 \le \frac{C\Lambda^4 d_t^2 N_t}{t^2} \le \frac{C_1 \Lambda^4}{N_t} \to 0, \quad N_t \to \infty.$$

□

10. The Scattering Theory for Infinite Energy Solutions

In this section we develop a version of the scattering theory to deduce Theorem A from Theorem B. The main step is to establish an asymptotics of type (1.20) for adjoint groups by using results of Vainberg [35]. Consider operators $U'(t)$, $U_0'(t)$ in the complex space $H = L^2(\mathbb{R}^n) \oplus H^1(\mathbb{R}^n)$ (see (2.12)). The energy conservation for the KGE implies the following corollary:

Corollary 10.1. There exists a constant $C > 0$ such that $\forall \Psi \in H$:

$$\|U_0'(t)\Psi\|_H \le C\|\Psi\|_H, \quad \|U'(t)\Psi\|_H \le C\|\Psi\|_H, \quad t \in \mathbb{R}. \tag{10.1}$$

Lemma 10.3 below develops earlier results [35, Thms. 3, 4, 5]. Consider a family of finite seminorms in $H$,

$$\|\Psi\|^2_{(R)} = \int_{|x|\le R} \big( |\Psi_0(x)|^2 + |\Psi_1(x)|^2 + |\nabla\Psi_1(x)|^2 \big)\, dx, \quad R > 0.$$

Denote by $H_{(R)}$ the subspace of functions from $H$ with a support in the ball $B_R$.

Definition 10.2. $H_c$ denotes the space $\cup_{R>0} H_{(R)}$ endowed with the following convergence: a sequence $\Psi_n$ converges to $\Psi$ in $H_c$ iff $\exists R > 0$ such that all $\Psi_n \in H_{(R)}$, and $\Psi_n$ converge to $\Psi$ in the norm $\|\cdot\|_{(R)}$.

Below, we speak of continuity of maps from $H_c$ in the sense of sequential continuity. Given $t \ge 0$, denote

$$\varepsilon(t) = \begin{cases} (t+1)^{-3/2}, & n \ge 3, \\ (t+1)^{-1} \ln^{-2}(t+2), & n = 2. \end{cases} \tag{10.2}$$

Lemma 10.3. Let Assumptions E1–E3 hold, and $n \ge 2$. Then for any $R, R_0 > 0$ there exists a constant $C = C(R, R_0)$ such that for $\Psi \in H_{(R)}$,

$$\|U'(t)\Psi\|_{(R_0)} \le C\varepsilon(t)\|\Psi\|_{(R)}, \quad t \ge 0. \tag{10.3}$$

This lemma has been proved in [21] by using Conditions E1–E3 and a method developed in [36]. For the proof, the contour of integration in the $k$-plane from [35] had to be curved logarithmically at infinity as in [36], but should not be chosen parallel to the real axis. The main result of this section is Theorem 10.4 below. Given $t \ge 0$, set

$$\varepsilon_1(t) = \begin{cases} (t+1)^{-1/2}, & n \ge 3, \\ \ln^{-1}(t+2), & n = 2. \end{cases} \tag{10.4}$$

Theorem 10.4. Let Assumptions E1–E3 and S0–S3 hold, and $n \ge 2$. Then there exist linear continuous operators $W, r(t) : H_c \to H$ such that for $\Psi \in H_c$,

$$U'(t)\Psi = U_0'(t)W\Psi + r(t)\Psi, \quad t \ge 0, \tag{10.5}$$

and the following bounds hold $\forall R > 0$ and $\Psi \in H_{(R)}$:

$$\|r(t)\Psi\|_H \le C(R)\varepsilon_1(t)\|\Psi\|_{(R)}, \quad t \ge 0, \tag{10.6}$$

$$E|\langle Y_0, r(t)\Psi\rangle|^2 \le C(R)\varepsilon_1^2(t)\|\Psi\|^2_{(R)}, \quad t \ge 0. \tag{10.7}$$

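Proof. We apply the standard Cook method: see, e.g., [29, Thm. XI.4]. Fix $\Psi \in H_{(R)}$ and define $W\Psi$, formally, as

$$W\Psi = \lim_{t\to\infty} U_0'(-t)U'(t)\Psi = \Psi + \int_0^\infty \frac{d}{dt}\, U_0'(-t)U'(t)\Psi\, dt.$$

We have to prove the convergence of the integral in norm in space $H$. First, observe that

$$\frac{d}{dt}\, U_0'(t)\Psi = \mathcal{A}_0' U_0'(t)\Psi, \quad \frac{d}{dt}\, U'(t)\Psi = \mathcal{A}' U'(t)\Psi,$$

where $\mathcal{A}_0'$ and $\mathcal{A}'$ are the generators of groups $U_0'(t)$, $U'(t)$, respectively. Similarly to (7.5), we have

$$\mathcal{A}' = \begin{pmatrix} 0 & A \\ 1 & 0 \end{pmatrix}, \tag{10.8}$$

where $A = \sum_{j=1}^n (\partial_j - iA_j)^2 - m^2$. Therefore,

$$\frac{d}{dt}\, U_0'(-t)U'(t)\Psi = U_0'(-t)(\mathcal{A}' - \mathcal{A}_0')U'(t)\Psi. \tag{10.9}$$

Now (10.8) and (7.5) imply

$$\mathcal{A}' - \mathcal{A}_0' = \begin{pmatrix} 0 & L \\ 0 & 0 \end{pmatrix}.$$

Furthermore, E2 implies that $L = \sum_{j=1}^n (\partial_j - iA_j)^2 - \Delta$ is a first order partial differential operator with the coefficients vanishing for $|x| \ge R_0$. Thus, (10.1) and (10.3) imply that

$$\|U_0'(-t)(\mathcal{A}' - \mathcal{A}_0')U'(t)\Psi\|_H \le C\|(\mathcal{A}' - \mathcal{A}_0')U'(t)\Psi\|_H = C\big\|\big((\mathcal{A}' - \mathcal{A}_0')U'(t)\Psi\big)^0\big\|_{L^2(B_{R_0})} \le C_1 \big\|\big(U'(t)\Psi\big)^1\big\|_{H^1(B_{R_0})} \le C(R)\varepsilon(t)\|\Psi\|_{(R)}, \quad t \ge 0. \tag{10.10}$$

Hence (10.9) implies

$$\int_s^\infty \Big\| \frac{d}{dt}\, U_0'(-t)U'(t)\Psi \Big\|_H dt \le C(R)\varepsilon_1(s)\|\Psi\|_{(R)}, \quad s \ge 0. \tag{10.11}$$

Therefore, (10.5) and (10.6) follow by (10.1). It remains to prove (10.7). First, similarly to (7.16),

$$E\langle Y_0, r(t)\Psi\rangle^2 = \langle q_0(x-y),\ r(t)\Psi(x) \otimes r(t)\Psi(y) \rangle. \tag{10.12}$$

Therefore, the Schur Lemma implies (similarly to (7.21))

$$E\langle Y_0, r(t)\Psi\rangle^2 \le \|q_0\|_{L^1}\, \|r(t)\Psi\|_{L^2}\, \|r(t)\Psi\|_{L^2}, \tag{10.13}$$

where the norms $\|\cdot\|_{L^p}$ have an obvious meaning. Finally, (10.6) implies for $\Psi \in H_{(R)}$,

$$\|r(t)\Psi\|_{L^2} \le C\|r(t)\Psi\|_H \le C(R)\varepsilon_1(t)\|\Psi\|_{(R)}. \tag{10.14}$$

Therefore, (10.7) follows from (10.13) since $\|q_0\|_{L^1} < \infty$ by (6.1). □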

11. Convergence to Equilibrium for Variable Coefficients

The assertion of Theorem A follows from two propositions below:

Proposition 11.1. The family of the measures $\{\mu_t, t \in \mathbb{R}\}$ is weakly compact in $\mathcal{H}^{-\varepsilon}$, $\forall \varepsilon > 0$.

Proposition 11.2. For any $\Psi \in \mathcal{D}$,

$$\hat\mu_t(\Psi) \equiv \int \exp(i\langle Y, \Psi\rangle)\, \mu_t(dY) \to \exp\Big\{-\frac12 Q_\infty(W\Psi, W\Psi)\Big\}, \quad t \to \infty. \tag{11.1}$$

We deduce these propositions from Propositions 3.2 and 3.3, respectively, with the help of Theorem 10.4.

Proof of Proposition 11.1. Similarly to Proposition 3.2, Proposition 11.1 follows from the bounds

$$\sup_{t\ge 0} E\|U(t)Y_0\|_R < \infty, \quad R > 0. \tag{11.2}$$

For the proof, write the solution to (1.3) in the form

$$u(x,t) = v(x,t) + w(x,t). \tag{11.3}$$

Here $v(x,t)$ is the solution to (3.1), and $w(x,t)$ is the solution to the following Cauchy problem:

$$\begin{cases} \ddot w(x,t) = \sum_{k=1}^n (\partial_k - iA_k(x))^2 w(x,t) - m^2 w(x,t) - \sum_{k=1}^n 2iA_k(x)\partial_k v(x,t) - \sum_{k=1}^n \big(i\partial_k A_k(x) + A_k^2(x)\big) v(x,t), \\ w|_{t=0} = 0, \quad \dot w|_{t=0} = 0, \quad x \in \mathbb{R}^n. \end{cases} \tag{11.4}$$

Then (11.3) implies

$$E\|U(t)Y_0\|_R \le E\|U_0(t)Y_0\|_R + E\|(w(\cdot,t), \dot w(\cdot,t))\|_R. \tag{11.5}$$

By Proposition 3.1 we have

$$\sup_{t\ge 0} E\|U_0(t)Y_0\|_R < \infty. \tag{11.6}$$

It remains to estimate the second term in the right-hand side of (11.5). The Duhamel representation for the solution to (11.4) gives

$$(w, \dot w) = \int_0^t U(t-s)(0, \psi(\cdot,s))\, ds, \tag{11.7}$$

where $\psi(x,s) = -2i\sum_{k=1}^n A_k(x)\partial_k v(x,s) - \sum_{k=1}^n \big(i\partial_k A_k(x) + A_k^2(x)\big) v(x,s)$. Assumption E2 implies that $\mathrm{supp}\, \psi(\cdot,s) \subset B_{R_0}$. Moreover,

$$\|(0, \psi(\cdot,s))\|_{R_0} \le C\|v(\cdot,s)\|_{H^1(B_{R_0})} \le C\|U_0(s)Y_0\|_{R_0}. \tag{11.8}$$

The decay estimates of type (10.3) hold for the group $U(t)$, as well as for $U'(t)$, as both groups correspond to the same equation by Lemma 7.1. Hence, we have from (11.8),

$$\|U(t-s)(0, \psi(\cdot,s))\|_R \le C(R)\varepsilon(t-s)\|(0, \psi(\cdot,s))\|_{R_0} \le C_1(R)\varepsilon(t-s)\|U_0(s)Y_0\|_{R_0}, \tag{11.9}$$

where $\varepsilon(\cdot)$ is defined in (10.2). Therefore, (11.7) and (11.6) imply

$$E\|(w(\cdot,t), \dot w(\cdot,t))\|_R \le C(R) \int_0^t \varepsilon(t-s)\, E\|U_0(s)Y_0\|_{R_0}\, ds \le C_2(R) < \infty, \quad t \ge 0. \tag{11.10}$$

Then (11.6) and (11.5) imply (11.2). □

Proof of Proposition 11.2. Equations (10.5) and (10.7) imply by Cauchy–Schwarz,

$$|E\exp i\langle U(t)Y_0, \Psi\rangle - E\exp i\langle Y_0, U_0'(t)W\Psi\rangle| \le E|\langle Y_0, r(t)\Psi\rangle| \le \big(E|\langle Y_0, r(t)\Psi\rangle|^2\big)^{1/2} \to 0, \quad t \to \infty.$$

It remains to prove that

$$E\exp i\langle Y_0, U_0'(t)W\Psi\rangle \to \exp\Big\{-\frac12 Q_\infty(W\Psi, W\Psi)\Big\}, \quad t \to \infty. \tag{11.11}$$

This does not follow directly from Proposition 3.3 since generally, $W\Psi \notin \mathcal{D}$. We approximate $W\Psi$ by functions from $\mathcal{D}$. $W\Psi \in H$, and $\mathcal{D}$ is dense in $H$. Hence, for any $\epsilon > 0$ there exists $K \in \mathcal{D}$ such that

$$\|W\Psi - K\|_H \le \epsilon. \tag{11.12}$$

Therefore, we can derive (11.11) by the triangle inequality

$$\Big| E\exp i\langle Y_0, U_0'(t)W\Psi\rangle - \exp\Big\{-\frac12 Q_\infty(W\Psi, W\Psi)\Big\} \Big| \le |E\exp i\langle Y_0, U_0'(t)W\Psi\rangle - E\exp i\langle Y_0, U_0'(t)K\rangle| + \Big| E\exp i\langle U_0(t)Y_0, K\rangle - \exp\Big\{-\frac12 Q_\infty(K, K)\Big\} \Big| + \Big| \exp\Big\{-\frac12 Q_\infty(K, K)\Big\} - \exp\Big\{-\frac12 Q_\infty(W\Psi, W\Psi)\Big\} \Big|. \tag{11.13}$$

Applying Cauchy–Schwarz, we get, similarly to (10.12)–(10.14), that

$$E|\langle Y_0, U_0'(t)(W\Psi - K)\rangle| \le \big(E|\langle Y_0, U_0'(t)(W\Psi - K)\rangle|^2\big)^{1/2} \le C\|U_0'(t)(W\Psi - K)\|_H.$$

Hence, (10.1) and (11.12) imply

$$E|\langle Y_0, U_0'(t)(W\Psi - K)\rangle| \le C\epsilon, \quad t \ge 0. \tag{11.14}$$

Now we can estimate each term on the right-hand side of (11.13). The first term is $O(\epsilon)$ uniformly in $t > 0$ by (11.14). The second term converges to zero as $t \to \infty$ by Proposition 3.3 since $K \in \mathcal{D}$. Finally, the third term is $O(\epsilon)$ owing to (11.12) and the continuity of the quadratic form $Q_\infty(\Psi,\Psi)$ in $L^2(\mathbb{R}^n) \otimes \mathbb{C}^2$. The continuity follows from the Schur Lemma since the integral kernels $q_\infty^{ij}(z) \in L^1(\mathbb{R}^n) \otimes M^2$ by Corollary 6.3. Now the convergence in (11.11) follows since $\epsilon > 0$ is arbitrary. □

12. Appendix A. Fourier Transform Calculations

Consider the covariance functions of the solutions to the system (3.2). Let $F : w \mapsto \hat w$ denote the FT of a tempered distribution $w \in S'(\mathbb{R}^n)$ (see, e.g. [13]). We also use this notation for vector- and matrix-valued functions.

12.1. Dynamics in the FT space. In the FT representation, the system (3.2) becomes $\dot{\hat Y}(k,t) = \hat{\mathcal{A}}_0(k)\hat Y(k,t)$, hence

$$\hat Y(k,t) = \hat{\mathcal{G}}_t(k)\hat Y_0(k), \quad \hat{\mathcal{G}}_t(k) = \exp(\hat{\mathcal{A}}_0(k)t). \tag{12.1}$$

Here we denote

$$\hat{\mathcal{A}}_0(k) = \begin{pmatrix} 0 & 1 \\ -|k|^2 - m^2 & 0 \end{pmatrix}, \quad \hat{\mathcal{G}}_t(k) = \begin{pmatrix} \cos\omega t & \dfrac{\sin\omega t}{\omega} \\ -\omega\sin\omega t & \cos\omega t \end{pmatrix}, \tag{12.2}$$

where $\omega = \omega(k) = \sqrt{|k|^2 + m^2}$.

12.2. Covariance matrices in the FT space.

Lemma 12.1. In the sense of matrix-valued distributions,

$$q_t(x-y) := E\big(Y(x,t) \otimes Y(y,t)\big) = F^{-1}_{k\to x-y}\big[ \hat{\mathcal{G}}_t(k)\hat q_0(k)\hat{\mathcal{G}}_t'(k) \big], \quad t \in \mathbb{R}. \tag{12.3}$$

Proof. Translation invariance (1.8) implies

$$E\big(Y_0(x) \otimes_{\mathbb{C}} Y_0(y)\big) = C_0^+(x-y), \quad E\big(Y_0(x) \otimes_{\mathbb{C}} \overline{Y_0(y)}\big) = C_0^-(x-y), \tag{12.4}$$

where $\otimes_{\mathbb{C}}$ stands for the tensor product of complex vectors. Therefore,

$$E\big(\hat Y_0(k) \otimes_{\mathbb{C}} \hat Y_0(k')\big) = F_{x\to k}F_{y\to k'}\, C_0^+(x-y) = (2\pi)^n \delta(k+k')\hat C_0^+(k),$$
$$E\big(\hat Y_0(k) \otimes_{\mathbb{C}} \overline{\hat Y_0(k')}\big) = F_{x\to k}F_{y\to -k'}\, C_0^-(x-y) = (2\pi)^n \delta(k-k')\hat C_0^-(k). \tag{12.5}$$

Now (12.1) and (12.2) give in matrix notation that

$$E\big(\hat Y(k,t) \otimes_{\mathbb{C}} \hat Y(k',t)\big) = (2\pi)^n \delta(k+k')\, \hat{\mathcal{G}}_t(k)\hat C_0^+(k)\hat{\mathcal{G}}_t'(k),$$
$$E\big(\hat Y(k,t) \otimes_{\mathbb{C}} \overline{\hat Y(k',t)}\big) = (2\pi)^n \delta(k-k')\, \hat{\mathcal{G}}_t(k)\hat C_0^-(k)\hat{\mathcal{G}}_t'(k). \tag{12.6}$$

Therefore, by the inverse FT formula we get

$$E\big(Y(x,t) \otimes_{\mathbb{C}} Y(y,t)\big) = F^{-1}_{k\to x-y}\, \hat{\mathcal{G}}_t(k)\hat C_0^+(k)\hat{\mathcal{G}}_t'(k),$$
$$E\big(Y(x,t) \otimes_{\mathbb{C}} \overline{Y(y,t)}\big) = F^{-1}_{k\to x-y}\, \hat{\mathcal{G}}_t(k)\hat C_0^-(k)\hat{\mathcal{G}}_t'(k). \tag{12.7}$$

Then (12.3) follows by linearity. □
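The closed form (12.2) for the propagator can be checked against the exponential in (12.1) with a few lines of code. The sketch below treats $|k|$ as a scalar, takes $m = 1$ as an illustrative value, and compares the trigonometric matrix with a truncated power series for $\exp(\hat{\mathcal{A}}_0(k)t)$; all function names are hypothetical helpers.

```python
import math

def generator(k, m=1.0):
    """Matrix of the generator in (12.2) for a wave number of modulus k."""
    return [[0.0, 1.0], [-(k * k + m * m), 0.0]]

def propagator(t, k, m=1.0):
    """Closed-form matrix of the propagator in (12.2)."""
    w = math.sqrt(k * k + m * m)
    c, s = math.cos(w * t), math.sin(w * t)
    return [[c, s / w], [-w * s, c]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][l] * b[l][j] for l in range(2)) for j in range(2)] for i in range(2)]

def max_abs_diff(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def exp_series(a, t, terms=60):
    """Truncated power series sum_{n < terms} (a t)^n / n! for a 2x2 matrix a."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        product = matmul(term, a)
        term = [[product[i][j] * t / n for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

k, t, s = 1.3, 0.7, 0.4
series_error = max_abs_diff(exp_series(generator(k), t), propagator(t, k))
group_error = max_abs_diff(propagator(t + s, k), matmul(propagator(t, k), propagator(s, k)))
```

Both errors are at the level of rounding, which reflects the identity $\hat{\mathcal{G}}_t = \exp(\hat{\mathcal{A}}_0 t)$ and the group property $\hat{\mathcal{G}}_{t+s} = \hat{\mathcal{G}}_t\hat{\mathcal{G}}_s$.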

13. Appendix B. Measures in Sobolev’s Spaces

Here we formally verify the bound (4.5) for $s, \alpha < -n/2$. Definition (4.2) implies for $u \in H^{s,\alpha}$,

$$\|u\|^2_{s,\alpha} = \frac{1}{(2\pi)^{2n}} \int \langle x\rangle^{2\alpha} e^{-ix(k-k')} \langle k\rangle^s \langle k'\rangle^s\, \hat u(k)\overline{\hat u(k')}\, dk\, dk'\, dx. \tag{13.1}$$

Let $\mu(du)$ be a translation-invariant measure in $H^{s,\alpha}$ with a CF $Q(x,y) = q(x-y)$. Similarly to (12.5), (12.4), we get

$$\int \hat u(k)\overline{\hat u(k')}\, \mu(du) = (2\pi)^n \delta(k-k')\, \mathrm{tr}\, \hat q(k). \tag{13.2}$$

Then, integrating (13.1) with respect to the measure $\mu(du)$, we get the formula

$$\int \|u\|^2_{s,\alpha}\, \mu(du) = \frac{1}{(2\pi)^n} \int \langle x\rangle^{2\alpha}\, dx \int \langle k\rangle^{2s}\, \mathrm{tr}\, \hat q(k)\, dk. \tag{13.3}$$

Applying it to $\hat q(k) = T$ with $\alpha, s < -n/2$ and to $\hat q(k) = T(k^2 + m^2)^{-1}$ with $1 + s$ instead of $s$, we get (4.5).

Acknowledgements. The authors thank V. I. Arnold, A. Bensoussan, I. A. Ibragimov, H. P. McKean, J. Lebowitz, A. I. Shnirelman, H. Spohn, B. R. Vainberg and M. I. Vishik for fruitful discussions and remarks.

References

1. Billingsley, P.: Convergence of Probability Measures. New York, London, Sydney, Toronto: John Wiley, 1968
2. Boldrighini, C., Dobrushin, R.L., Sukhov, Yu.M.: Time asymptotics for some degenerate models of evolution of systems with an infinite number of particles. Technical Report, University of Camerino, 1980
3. Boldrighini, C., Pellegrinotti, A., Triolo, L.: Convergence to stationary states for infinite harmonic systems. J. Stat. Phys. 30, 123–155 (1983)
4. Botvich, D.D., Malyshev, V.A.: Unitary equivalence of temperature dynamics for ideal and locally perturbed Fermi-gas. Commun. Math. Phys. 91, no. 4, 301–312 (1983)
5. Bournaveas, N.: Local existence for the Maxwell–Dirac equations in three space dimensions. Comm. Partial Diff. Equs. 21, no. 5–6, 693–720 (1996)
6. Bulinskii, A.V., Molchanov, S.A.: Asymptotic Gaussian property of the solution of the Burgers equation with random initial data. Theory Probab. Appl. 36, no. 2, 217–236 (1991)
7. Cornfeld, I.P., Fomin, S.V., Sinai, Ya.G.: Ergodic Theory. New York–Berlin: Springer, 1982
8. Dobrushin, R.L., Pellegrinotti, A., Suhov, Yu.M.: One-dimensional harmonic lattice caricature of hydrodynamics: A higher correction. J. Stat. Phys. 61, no. 1/2, 387–402 (1990)
9. Dobrushin, R.L., Sinai, Ya.G., Sukhov, Yu.M.: Dynamical systems of statistical mechanics. In: Dynamical Systems, Ergodic Theory and Applications, Encyclopaedia of Mathematical Sciences, V. 100. Berlin: Springer, 2000, pp. 384–431
10. Dobrushin, R.L., Suhov, Yu.M.: On the problem of the mathematical foundation of the Gibbs postulate in classical statistical mechanics. In: Mathematical Problems in Theoretical Physics, Lecture Notes in Physics, V. 80. Berlin: Springer-Verlag, 1978, pp. 325–340
11. Dudnikova, T.V., Komech, A.I., Ratanov, N.E., Suhov, Yu.M.: On convergence to equilibrium distribution. II. Wave equations with mixing. Submitted to J. Stat. Phys.
12. Dudnikova, T.V., Komech, A.I., Spohn, H.: On convergence to statistic equilibrium in two-temperature problem for wave equation with mixing. Preprint Max-Planck Institute for Mathematics in the Sciences, N. 26. Leipzig, 2000 (http://www.mis.mpg.de)
13. Egorov, Yu.V., Komech, A.I., Shubin, M.A.: Elements of the Modern Theory of Partial Differential Equations. Berlin: Springer, 1999
14. Fedoryuk, M.V.: The stationary phase method and pseudodifferential operators. Russ. Math. Surveys 26, no. 1, 65–115 (1971)
15. Gikhman, I.I., Skorokhod, A.V.: The Theory of Stochastic Processes, Vol. I. Berlin: Springer, 1974
16. Hörmander, L.: The Analysis of Linear Partial Differential Operators III: Pseudo-Differential Operators. Berlin–Heidelberg–New York: Springer-Verlag, 1985
17. Ibragimov, I.A., Linnik, Yu.V.: Independent and Stationary Sequences of Random Variables. Groningen: Wolters-Noordhoff, 1971
18. Jaksic, V., Pillet, C.-A.: Ergodic properties of classical dissipative systems. I. Acta Math. 181, no. 2, 245–282 (1998)
19. Komech, A.I.: Stabilisation of statistics in wave and Klein–Gordon equations with mixing. Scattering theory for solutions of infinite energy. Rend. Sem. Mat. Fis. Milano 65, 9–22 (1995)
20. Kopylova, E.A.: Stabilization of statistical solutions of the Klein–Gordon equation. Mosc. Univ. Math. Bull. 41, no. 2, 72–75 (1986)
21. Kopylova, E.A.: Stabilisation of Statistical Solutions of Klein–Gordon Equations. PhD Thesis, Moscow State University, 1986
22. Mikhailov, V.P.: Partial Differential Equations. Moscow: Mir, 1978
23. Morawetz, C.S., Strauss, W.A.: Decay and scattering of solutions of a nonlinear relativistic wave equation. Comm. Pure Appl. Math. 25, 1–31 (1972)
24. Petrov, V.V.: Limit Theorems of Probability Theory. Oxford: Clarendon Press, 1995
25. Planck, M.: The Theory of Heat Radiation. New York: Dover Publications, 1959
26. Ratanov, N.E.: Stabilisation of statistic solutions of second order hyperbolic equations. Russian Mathematical Surveys 39, no. 1, 179–180 (1984)
27. Ratanov, N.E., Shuhov, A.G., Suhov, Yu.M.: Stabilisation of the statistical solution of the parabolic equation. Acta Appl. Math. 22, no. 1, 103–115 (1991)
28. Reed, M., Simon, B.: Methods of Modern Mathematical Physics II: Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
29. Reed, M., Simon, B.: Methods of Modern Mathematical Physics III: Scattering Theory. New York: Academic Press, 1979
30. Rosenblatt, M.A.: A central limit theorem and a strong mixing condition. Proc. Nat. Acad. Sci. U.S.A. 42, no. 1, 43–47 (1956)
31. Seitz, F.: The Modern Theory of Solids. New York: McGraw-Hill, 1940
32. Shuhov, A.G., Suhov, Yu.M.: Ergodic properties of groups of the Bogoliubov transformations of CAR C∗-algebras. Ann. Phys. 175, 231–266 (1987)
33. Spohn, H., Lebowitz, J.L.: Stationary non equilibrium states of infinite harmonic systems. Commun. Math. Phys. 54, 97–120 (1977)
34. Sommerfeld, A.: Thermodynamics and Statistical Mechanics. New York: Academic Press, 1956
35. Vainberg, B.R.: Behaviour for large time of solutions of the Klein–Gordon equation. Trans. Moscow Math. Soc. 30, 139–158 (1976)
36. Vainberg, B.R.: Asymptotic Methods in Equations of Mathematical Physics. New York–London–Paris: Gordon and Breach, 1989
37. Vishik, M.I., Fursikov, A.V.: Mathematical Problems of Statistical Hydromechanics. Dordrecht: Kluwer Academic Publishers, 1988

Communicated by H. Spohn

Commun. Math. Phys. 225, 33 – 66 (2002)


© Springer-Verlag 2002

Nonassociative Star Product Deformations for D-Brane World-Volumes in Curved Backgrounds

Lorenzo Cornalba$^1$, Ricardo Schiappa$^2$

1 Laboratoire de Physique Théorique, École Normale Supérieure, 75231 Paris Cedex 05, France. E-mail: [email protected]
2 Department of Physics, Harvard University, Cambridge, MA 02138, USA. E-mail: [email protected]

Received: 22 March 2001 / Accepted: 13 July 2001

Abstract: We investigate the deformation of D-brane world-volumes in curved backgrounds. We calculate the leading corrections to the boundary conformal field theory involving the background fields, and in particular we study the correlation functions of the resulting system. This allows us to obtain the world-volume deformation, identifying the open string metric and the noncommutative deformation parameter. The picture that unfolds is the following: when the gauge invariant combination $\omega = B + F$ is constant one obtains the standard Moyal deformation of the brane world-volume. Similarly, when $d\omega = 0$ one obtains the noncommutative Kontsevich deformation, physically corresponding to a curved brane in a flat background. When the background is curved, $H = d\omega \neq 0$, we find that the relevant algebraic structure is still based on the Kontsevich expansion, which now defines a nonassociative star product with an $A_\infty$ homotopy associative algebraic structure. We then recover, within this formalism, some known results of Matrix theory in curved backgrounds. In particular, we show how the effective action obtained in this framework describes, as expected, the dielectric effect of D-branes. The polarized branes are interpreted as a soliton, associated to the condensation of the brane gauge field.

Contents

1. Introduction and Summary
2. Open Strings in Parallelizable Backgrounds
3. Perturbation Theory
4. Computation of n-Point Functions
5. Nonassociative Deformations of World-volumes
6. Corrections Involving the Metric Tensor
7. Tachyons and Matrix Models in Curved Backgrounds
8. Future Perspectives
A. Dilogarithm Identities
B. Computation of the Function S(x)


1. Introduction and Summary

Noncommutative quantum field theoretic limits of string theory have received considerable attention in the recent literature, and have been studied in a variety of papers (see, e.g., [1–6] and references therein). The attention is focused on a specific scaling limit, where the effects of large magnetic backgrounds are translated into Moyal noncommutative deformations of the D-brane world-volume algebra of functions. The open string physics is therefore captured within a quantum field theory (which is renormalizable, despite appearances [7, 8]). A common point to most previous investigations is that the background (sigma model) fields are taken to be constant and that, as a consequence, the target space is flat.

One may then ask the natural question of what happens if the background is curved, i.e., if the background fields are no longer constant. This question received some attention in a couple of recent papers [9–12], but there is no general answer to it (other papers of interest with some relation to this subject are, e.g., [13–16]). Our goal in this work is to address this problem in the context of a simple model with weakly curved backgrounds, which can be on one side connected to the known flat background framework, and on the other hand can be related to formal results of brane physics in WZW models, which can be analyzed exactly with conformal field theory techniques [10, 11, 17].

More concretely, the aim of this paper is to understand how the presence of a nontrivial background field affects the world-volume deformation of a D-brane. It is known that, in the presence of a constant background B-field, the physics can be exactly described either by a sigma model approach [18–25], or alternatively, by translating the background B-field into a noncommutative Moyal deformation of the brane world-volume algebra of functions [3, 5]. The constant field situation represents a particular choice of background and one can ask what happens in more complicated situations. One thing to keep in mind is that (as for the Born–Infeld action [26–28]) the gauge covariant combination to consider is not $B$ alone, but $B + F$, which we shall denote by $\omega \equiv B + F$ in the following. One may then consider three cases of increasing complexity: the case of constant $\omega$, the case where $d\omega = 0$ but $\omega$ is not constant, and the most general case where $d\omega \neq 0$ and we have NS–NS three form flux (as $d\omega = dB + dF = H$) and a curved background. The analysis leads to the following complete picture.

The first case, corresponding to constant $\omega$, has been extensively studied in the literature where one obtains a noncommutative Moyal deformation of the brane world-volume [3–6]. The physics is by now very well understood, corresponding to a flat brane embedded in a flat background space.

The second case, when $\omega$ is not constant but $d\omega = 0$, has also been studied in the literature, though to a much lesser extent. This gives the so-called Cattaneo–Felder model of [29]. One therefore obtains the natural extension of the Moyal deformation to the case of varying symplectic form, corresponding to the noncommutative Kontsevich star product deformation of the brane world-volume algebra of functions [30]. This situation corresponds to the embedding of a curved brane in a flat background space. These configurations have also been studied from the point of view of BPS membranes in Matrix theory, where the varying $F$-field physically corresponds to a varying density of zero-branes over a curved membrane [31, 32].

Finally, the general case where $d\omega \neq 0$ is the main subject of this paper. One no longer has a symplectic form and apparently no obvious definition of a star product – which usually comes from a given Poisson structure on the world-volume of the D-brane. In this general situation, we will find that the world-volume algebra of functions is deformed to an algebra which is not only noncommutative, but also nonassociative. One interesting point we shall uncover is that this nonassociative star product can still be defined using Kontsevich’s formula [30]. Therefore, the nonassociativity can be traced, thanks to Kontsevich’s formality formulae, to the Schouten–Nijenhuis bracket of $\omega^{-1}$ with itself, which is proportional to the NS–NS field strength $d\omega = H$ [30, 29]. These nonassociative algebras have the structure of an $A_\infty$ homotopy associative algebra (see, e.g., [33–35]); such algebras have previously received some attention in the string field theory literature since they are the natural algebras that appear in general open–closed string field theories [36, 37].

Our approach in this paper will rely on a perturbative calculation of n-point functions on the disk, using the background field method applied to open string theory [18, 19, 22, 23]. The background fields are expanded in Taylor series, and the derivative terms that appear are treated as new interactions, which we treat in a perturbative expansion. This allows us to obtain the open string parameters, metric $G$ and deformation $\theta$, generalizing the results in [23, 3, 5]. It also allows us to identify the star product deformations, as described in the previous paragraph.

We begin, in Sect. 2, by describing the specific closed string backgrounds which we shall consider in this paper. These will be the class of parallelizable manifolds, exact background solutions for closed string theory [20]. Then, in Sect. 3, we shall describe in detail the perturbation theory on the disk for open strings in these curved backgrounds, i.e., we will study the new interaction vertices due to the curvature terms. In particular, we present the general methods that we then use in Sect. 4 for the calculation of n-point functions on the disk, with particular emphasis on the conformal properties of these disk correlators. These correlators also yield the open string parameters and the nonassociative Kontsevich star product. Section 5 includes a brief summary of the different situations and the different world-volume deformations and star products, which can be read directly by the reader who wishes to skip the calculations in the preceding sections. It also describes in some detail the concept of a nonassociative star product deformation, which could be a topic of great interest for future research.

Most of the previous treatment is done in a particular $\alpha' \to 0$ scaling limit [5], where the closed string metric, $g$, scales to zero. In Sect. 6 we move away from this limit and compute corrections to the previous results which explicitly depend on the closed string metric. These calculations yield the formulas relating open and closed string parameters. It is interesting to observe that the final answer is a simple generalization of the flat background results of [23, 3, 5].

In Sect. 7 we make contact with previous results and in particular we describe, within our formalism, the dielectric effect of D-branes [38] in these curved backgrounds. Indeed, these solutions describing polarization of lower dimensional branes, obtained first in [38] and then further studied in different situations involving D-branes and fundamental strings in R–R or NS–NS backgrounds by, e.g., [39–45], are now reinterpreted, dually, as an instability of the space filling brane, which condenses to a lower dimensional brane. This is accomplished by first studying the relation between the partition function – the correlators we computed in the earlier sections – and the effective action. Once this connection is made (using boundary string field theory arguments), we obtain the usual matrix action in the presence of an $H$-field, and we can then use the previous results on the subject.

Finally, we discuss in the concluding sections how further studies of these nonassociative geometries could lead to a proper definition of Matrix theory [46–50] in a general curved background. These nonassociative geometries could provide the proper framework to generalize the arguments in [51–53] and the weak field calculations of [54, 55] in order to build the matrix theory action in a general curved target space.


2. Open Strings in Parallelizable Backgrounds

The physics of a string propagating in a curved background is conveniently described in terms of a nonlinear sigma model. In the presence of a background metric $g_{ab}(x)$ and NS–NS 2-form field $B_{ab}(x)$ the action which governs the motion of the string is given by [18–20, 22, 23],

$$S = \frac{1}{4\pi\alpha'} \int_\Sigma g_{ab}(X)\, dX^a \wedge *dX^b + \frac{i}{4\pi\alpha'} \int_\Sigma B_{ab}(X)\, dX^a \wedge dX^b, \tag{1}$$

where $\Sigma$ is the string world-volume. Moreover, when considering open strings one can include boundary interactions on $\partial\Sigma$. In the sequel, we will mainly focus on the coupling to the $U(1)$ gauge field $A_a(x)$, given by

$$S_B = i \int_{\partial\Sigma} A_a(X)\, dX^a.$$

In this paper we will consider only the physics at weak string coupling, and we will consequently assume $\Sigma$ to have the topology of a disk. Other background fields (such as the dilaton) will not play a role in our subsequent analysis. We shall mainly address maximal branes, though our results are completely general. Also, from now on, we will work in units such that $2\pi\alpha' = 1$.

The action (1) is written in a generic coordinate system $x^a$ in spacetime. On the other hand, in order to use (1) to compute correlators in perturbation theory, it is natural to follow the standard techniques of the background field method and use coordinates $x^a$ which are Riemann normal coordinates at the origin – i.e. defined using geodesic paths in target space which start at $x^a = 0$ [18, 20]. We recall that the main advantage of this choice is that the Taylor series expansion of any tensor around $x^a = 0$ is explicitly given in terms of covariant tensors evaluated at the origin. In particular one has, up to quadratic order in the coordinates,

$$g_{ab}(x) = g_{ab} - \frac{1}{3} R_{acbd}\, x^c x^d + \cdots. \tag{2}$$

Let us now consider the expansion of the NS–NS 2-form field, by first recalling that we have some gauge freedom in the definition of $B_{ab}(x)$. In fact, the transformations $B \to B + d\Lambda$, $A \to A - \Lambda$ leave the total action $S + S_B$ invariant, and we can use this freedom to impose the following (radial) gauge$^1$:

$$x^a B_{ab}(x) = x^a B_{ab}(0).$$

One can explicitly solve the above equation in terms of the NS–NS three-form field strength $H = dB$, and obtain

$$B_{ab}(x) = B_{ab} + x^c \int_0^1 s^2\, H_{abc}(sx)\, ds.$$

1 Given a generic field $B_{ab}(x)$, we can consider the gauge transformation parameter $\Lambda_a(x)$ given by $\Lambda_a(x) = x^b \int_0^1 s\, B_{ab}(sx)\, ds$. It is then a simple computation to see that the combination $\partial_a \Lambda_b - \partial_b \Lambda_a$ equals $-B_{ab}(x) + x^c \int_0^1 s^2 H_{abc}(sx)\, ds$.


Therefore, the normal coordinate expansion for the field $B_{ab}$ is explicitly given by

$$B_{ab}(x) = B_{ab} + \frac{1}{3} H_{abc}\, x^c + \frac{1}{4} \nabla_d H_{abc}\, x^c x^d + \cdots. \tag{3}$$
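As a quick consistency check of the coefficients in (3), one can insert the Taylor expansion of $H_{abc}(sx)$ into the radial-gauge solution for $B_{ab}(x)$ and integrate term by term. The sketch below assumes, as is standard for Riemann normal coordinates at the origin, that the first ordinary derivative of a tensor at $x = 0$ may be replaced by its covariant derivative.

```latex
\begin{align*}
B_{ab}(x) &= B_{ab} + x^c \int_0^1 s^2\, H_{abc}(sx)\, ds \\
          &= B_{ab} + x^c \int_0^1 s^2 \left[ H_{abc} + s\, x^d\, \partial_d H_{abc} + \cdots \right] ds \\
          &= B_{ab} + \left( \int_0^1 s^2\, ds \right) H_{abc}\, x^c
             + \left( \int_0^1 s^3\, ds \right) \partial_d H_{abc}\, x^c x^d + \cdots \\
          &= B_{ab} + \frac{1}{3} H_{abc}\, x^c + \frac{1}{4} \nabla_d H_{abc}\, x^c x^d + \cdots .
\end{align*}
```

The factors $1/3$ and $1/4$ are therefore just the moments $\int_0^1 s^2\, ds$ and $\int_0^1 s^3\, ds$ of the radial integral.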

Using the expressions (2) and (3), one can expand (1) about the classical constant background $\partial X^a = 0$ and obtain

$$S = S_0 + S_1 + \cdots, \tag{4}$$

where $S_n$ contains $n + 2$ powers of the coordinate fields $X^a$ and where, in particular,

$$S_0 = \frac{1}{2} \int_\Sigma g_{ab}\, dX^a \wedge *dX^b + \frac{i}{2} \int_\Sigma B_{ab}\, dX^a \wedge dX^b, \qquad S_1 = \frac{i}{6} \int_\Sigma H_{abc}\, X^a\, dX^b \wedge dX^c. \tag{5}$$

In this paper, we will be primarily interested in the effects of the term $S_1$, which describes a small curved deviation from the flat closed string background. Let us elaborate more on this point. To leading order in $\alpha'$, the beta function equations which describe consistent closed string backgrounds read [18, 20]:

$$R_{ab} = \frac{1}{4} H_{acd} H_b{}^{cd}, \qquad \nabla^a H_{abc} = 0. \tag{6}$$

If we work to first order in $H$, one may then neglect the presence of curvature coming from the metric and only consider the effects of $H$ coming from (5). We can actually make these arguments more systematic if we consider a general class of conjectured solutions to the beta function equations, called parallelizable manifolds [20]. These configurations are characterized by the following properties. First of all, the tensor $H_{abc}$ is covariantly constant, $\nabla_a H_{bcd} = 0$. Moreover, if we consider the generalized connection $\Gamma + \frac{1}{2} H$, then the corresponding curvature tensor,

$$\mathcal{R}_{abcd} = R_{abcd} + \frac{1}{2} \nabla_a H_{bcd} - \frac{1}{2} \nabla_b H_{acd} + \frac{1}{4} H_{ade} H_{bc}{}^{e} - \frac{1}{4} H_{ace} H_{bd}{}^{e},$$

must vanish. Using the fact that $R_{a[bcd]} = 0$, one can easily show that the field $H_{abc}$ must satisfy a Jacobi identity, in the sense that $H_{abe} H_{cd}{}^{e} + \mathrm{cyclic}_{abc} = 0$. These facts then imply

$$R_{abcd} = \frac{1}{4} H_{abe} H_{cd}{}^{e},$$

and therefore (6). Moreover, at a more fundamental level, it was explicitly shown that when the target is parallelizable, the string sigma model is ultra-violet finite to two loops, with vanishing beta functions [20]. It was moreover suggested that this holds true to higher orders for the superstring, and one thus has a consistent solution of closed string theory [20]. In the parallelizable situation the expansion (4) drastically simplifies. In the sequel we shall only need the explicit forms of $S_0$ and $S_1$ given above. On the other hand, in


order to extend the results of this paper to higher order in $H$, one needs the expressions of $S_n$ for $n \geq 2$. We include, for completeness, the first of these terms explicitly given by:

$$S_2 = -\frac{1}{24} H_{abe} H_{cd}{}^{e} \int_\Sigma X^a X^c\, dX^b \wedge *dX^d.$$

3. Perturbation Theory

In the last section we have reviewed the general form of the sigma model action which describes open string dynamics in curved backgrounds. From now on we shall only consider backgrounds which are weakly curved. More precisely we will work, for the rest of the paper, to leading order in the background field $H$, and consequently we shall focus our analysis on the action $S_0 + S_1 + S_B$. If we denote with $F = dA$ the $U(1)$ field strength, and with $\omega$ the symplectic structure

$$\omega_{ab}(x) = B_{ab} + F_{ab}(x), \tag{7}$$

then the relevant action is given by

$$\frac{1}{2} \int_\Sigma g_{ab}\, dX^a \wedge *dX^b + i \int_\Sigma \omega + \frac{i}{6} \int_\Sigma H_{abc}\, X^a\, dX^b \wedge dX^c. \tag{8}$$

Before we start the detailed discussion of the perturbation theory for the action (8), and in order to set the stage and motivate the subsequent results, let us begin by recalling some known facts which are valid in the flat space limit of $H_{abc} = 0$. On one side, the conventional approach to open string physics starts by considering the simple free action $\frac{1}{2} \int_\Sigma g_{ab}\, dX^a \wedge *dX^b$, or even the full free action $S_0$. One then analyzes the physics of boundary interactions by considering the coupling $S_B$ to the $U(1)$ gauge field, $A$, and one treats (following, for example, the approaches in [19, 22, 23]) the interactions perturbatively in $F = dA$. In this scheme the basic interaction vertex with $n$ external legs involves $n - 2$ derivatives of $F$, and the perturbation theory quickly becomes unmanageable as soon as one considers rapidly varying gauge fields. It was noted, on the other hand, in [29] that, if one considers the simple topological action $i \int_\Sigma \omega$ (that is, one looks at (8) in the limit $g_{ab}, H_{abc} \to 0$), then the resulting path integral drastically simplifies. In fact, if one considers the n-point function of $n$ generic functions $f_1(x), \ldots, f_n(x)$, placed cyclically on the boundary $\partial\Sigma$ of the string world-volume, one obtains the simple result (independently of the moduli of the insertion points) [29, 5]:

$$\langle f_1 \cdots f_n \rangle = \int V(\omega)\, dx\, (f_1 \,\#\, \cdots \,\#\, f_n). \tag{9}$$

In the above, $\#$ is the associative Kontsevich star product$^2$ with respect to the Poisson structure $\alpha = \omega^{-1}$,

$$f \,\#\, g = f \cdot g + \frac{i}{2}\, \alpha^{ab}\, \partial_a f\, \partial_b g + \cdots, \tag{10}$$

2 The terms hidden behind the dots $\cdots$ in (10) are given by explicit diagrammatic expressions, as explained in [30], valid for any bi-vector field $\alpha^{ab}(x)$ in terms of the functions $f$, $g$, the tensor $\alpha^{ab}$ and their derivatives. If $\alpha^{-1}$ is closed, then the corresponding product is associative.


and $V(\omega) = \sqrt{\det \omega}\, (1 + \cdots)$ is a volume form$^3$ such that $\int V(\omega)\, dx$ acts as a trace for the product $\#$. The basic point we would like to stress is that the product (10) contains derivatives of $\alpha$ (and therefore of $F$) to all orders, and is therefore valid for arbitrary gauge field configurations. This means that the perturbation theory in $A_a$ becomes tractable to all orders when $g_{ab} \to 0$, and is conveniently described in terms of the algebraic operation $\#$.

We shall see in this paper that, when one introduces the perturbation $S_1$ but still considers the limit $g_{ab} \to 0$, then one can still re-sum the perturbation theory to all orders in $A_a$. We will see that the relevant algebraic structure is still given by a Kontsevich product of the general form (10), but now with $\omega$ replaced in a natural way by the gauge invariant combination:

$$\tilde\omega_{ab}(x) = \omega_{ab}(x) + \frac{1}{3} H_{abc}\, x^c = \tilde B_{ab}(x) + F_{ab}(x),$$

and with $\alpha$ replaced by $\tilde\alpha = \tilde\omega^{-1}$. In order to clearly distinguish the two cases, we shall denote this second product (relative to $\tilde\alpha$) with $\bullet$, given by the usual Kontsevich expansion,

$$f \bullet g = f \cdot g + \frac{i}{2}\, \tilde\alpha^{ab}\, \partial_a f\, \partial_b g + \cdots.$$

The two-form $\tilde\omega$ is not closed and correspondingly the product $\bullet$ is now nonassociative. We will discuss later how the nonassociativity is controlled by the field strength $H = d\tilde\omega$. The n-point functions are again given by an equation similar to (9), with $\#$ replaced by $\bullet$. On the other hand, expressions like $f_1 \bullet \cdots \bullet f_n$ are ambiguous, due to the nonassociativity of the product, and one needs to insert parentheses to precisely define their meaning. This can be done in various ways, and this fact is reflected in the dependence of n-point functions on the $n - 3$ conformal moduli of the insertion points on the boundary $\partial\Sigma$. The n-point functions will then be interpolations, parameterized by $n - 3$ moduli, between the various possible positions of the parentheses in the expression $f_1 \bullet \cdots \bullet f_n$.

From now until Sect. 4.5 we will concentrate on the simplest case of $F = 0$ or $\omega_{ab}(x) = B_{ab}$. We thus neglect the boundary interaction $S_B$ and concentrate on the action $S_0 + S_1$. The generalization to the case (7) will be comparatively simple (as for the $d\tilde\omega = 0$ case) and is left to Sect. 4.5, which also summarizes the results in the general context. We now turn to a systematic discussion of the perturbation theory for the action $S_0 + S_1$.

3.1. The Free Theory.

Let us first recall some facts about the unperturbed action $S_0$. Since $S_0$ is invariant under translations $X^a \to X^a + c^a$, the field $X^a$ can be split into a constant zero mode $x^a$ and a fluctuating quantum field $\zeta^a$,

$$X^a = x^a + \zeta^a. \tag{11}$$

Path integrals with the free action $S_0$ are then explicitly given by a path integral over the quantum field $\zeta^a$ and an ordinary integral over the zero-mode $x^a$ as [19]:

$$\int [dX]\, e^{-S_0(X)} \to \int dx \int [d\zeta]\, e^{-S_0(\zeta)}.$$

3 For more details on $V(\omega)$ we refer the reader to [56].


The integral in $[d\zeta]$ is gaussian and is determined once one obtains the two-point function for the fluctuating field $\zeta$. From now on, and unless otherwise specified, we will parameterize the disk with the complex upper-half plane $H_+$. As discussed in [22, 3, 5], the two-point function can be more conveniently written if one introduces the open string metric $G_{ab}$ and noncommutativity tensor $\theta^{ab}$ as given by

$$\frac{1}{G} + \theta = \frac{1}{g + B}.$$

It then has the general form,

$$\langle \zeta^a(z)\, \zeta^b(w) \rangle = \frac{i}{\pi}\, \theta^{ab} A(z, w) - \frac{1}{\pi}\, G^{ab} B(z, w) + \frac{1}{2\pi}\, g^{ab} C(z, w), \tag{12}$$

where

$$A(z, w) = \frac{1}{2i} \ln \frac{\bar w - z}{\bar z - w}, \qquad B(z, w) = \ln |z - \bar w|, \qquad C(z, w) = \ln \left| \frac{z - \bar w}{z - w} \right|.$$
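The relation $\frac{1}{G} + \theta = \frac{1}{g + B}$ simply says that $G^{ab}$ and $\theta^{ab}$ are the symmetric and antisymmetric parts of the matrix $(g + B)^{-1}$. A minimal sketch for a two-dimensional target is given below; the function names and the sample values of $g$ and $B$ are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def open_string_parameters(g, B):
    """Split (g + B)^{-1} into its symmetric part G^{ab} and antisymmetric part theta^{ab}.

    g is the (symmetric) closed string metric, B the (antisymmetric) two-form,
    both given as 2x2 nested lists. Returns (G_inv, theta).
    """
    total = [[Fraction(g[i][j]) + Fraction(B[i][j]) for j in range(2)] for i in range(2)]
    inv = inverse_2x2(total)
    G_inv = [[(inv[i][j] + inv[j][i]) / 2 for j in range(2)] for i in range(2)]
    theta = [[(inv[i][j] - inv[j][i]) / 2 for j in range(2)] for i in range(2)]
    return G_inv, theta


# Hypothetical example: small metric g = eps * identity, constant B with B_12 = 2.
eps = Fraction(1, 1000)
G_inv_small, theta_small = open_string_parameters([[eps, 0], [0, eps]], [[0, 2], [-2, 0]])
```

For small `eps` the entries of `G_inv_small` are of order `eps` while `theta_small` approaches the inverse of `B`, which is the behaviour used in the $g_{ab} \to 0$ limit.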

In the sequel, we shall only need to consider the propagator (12) when one point (say $w$) is placed at the boundary $\partial\Sigma$ of the string world-sheet. In this case $w = \bar w$ and $C(z, w) = 0$. Also, in the case $w = \bar w$, the coefficients $A(z, w)$ and $B(z, w)$ have a simple geometrical interpretation. $A$ measures the angle between the line $z$–$w$ and the vertical line passing through $w$, and $B$ gives the logarithm of the distance between $z$ and $w$.

We now consider the limit $g_{ab} \to 0$. In this limit, the effective open string metric $G_{ab}$ becomes large and therefore the term in (12) proportional to $G^{ab}$ becomes irrelevant. Also, one has in this limit, that $\theta = B^{-1} = \alpha(x)$. In this case the propagator (12) reduces to

$$\langle \zeta^a(z)\, \zeta^b(w) \rangle = \frac{i}{\pi}\, \theta^{ab} A(z, w),$$

and the computation of path integrals becomes simple. As we discussed in the previous subsection, if one considers $n$ functions $f_1, \ldots, f_n$, positioned at ordered points $\tau_1 < \cdots < \tau_n$ on the boundary $\partial\Sigma$ of the string world-sheet, then the path integral

$$\int [dX]\, e^{-S_0(X)}\, f_1(X(\tau_1)) \cdots f_n(X(\tau_n)) \tag{13}$$

can be evaluated [3, 5] with the result,

$$\int V(B)\, dx\, (f_1 \,\#\, \cdots \,\#\, f_n).$$

Since $\omega(x) = B$ is constant, the product $\#$ is the usual Moyal star product and $V(B) = \sqrt{\det B}$.

A word on notation. From now on we will omit the explicit reference to the volume form in the integrals. We shall therefore use the following short-hand notation:

$$\int V(\omega)\, dx \cdots \;\to\; \int \cdots.$$
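For constant $\theta$ the Moyal product can be evaluated exactly on polynomials, since the series (10) terminates. The following sketch does this for two coordinates $(x, y)$ with $\theta^{xy} = -\theta^{yx} = \theta$; the dictionary representation of polynomials and all function names are illustrative assumptions, and the formula used is the standard exponential form of the Moyal product.

```python
from math import comb, factorial

# A polynomial in (x, y) is a dict {(p, q): coeff}, meaning sum of coeff * x**p * y**q.


def deriv(f, nx, ny):
    """Apply (d/dx)^nx (d/dy)^ny to the polynomial f."""
    out = {}
    for (p, q), c in f.items():
        if p < nx or q < ny:
            continue  # this monomial is annihilated
        for k in range(nx):
            c = c * (p - k)
        for k in range(ny):
            c = c * (q - k)
        key = (p - nx, q - ny)
        out[key] = out.get(key, 0) + c
    return out


def times(f, g):
    """Ordinary commutative product of two polynomials."""
    out = {}
    for (p1, q1), c1 in f.items():
        for (p2, q2), c2 in g.items():
            key = (p1 + p2, q1 + q2)
            out[key] = out.get(key, 0) + c1 * c2
    return out


def degree(f):
    """Total degree of f (0 for the empty polynomial)."""
    return max((p + q for (p, q) in f), default=0)


def moyal(f, g, theta):
    """Moyal product f # g = sum_n (i/2)^n / n! * P^n(f, g),
    with P = theta * (d_x (x) d_y - d_y (x) d_x)."""
    out = {}
    for n in range(min(degree(f), degree(g)) + 1):
        pref = (1j * theta / 2) ** n / factorial(n)
        for k in range(n + 1):
            # k-th term in the binomial expansion of P^n
            term = times(deriv(f, n - k, k), deriv(g, k, n - k))
            weight = pref * comb(n, k) * (-1) ** k
            for key, c in term.items():
                out[key] = out.get(key, 0) + weight * c
    return {key: c for key, c in out.items() if abs(c) > 1e-14}
```

With `x = {(1, 0): 1}` and `y = {(0, 1): 1}`, the difference `moyal(x, y, theta)` minus `moyal(y, x, theta)` is the constant $i\theta$, and products of three polynomials do not depend on the placement of parentheses, in line with the associativity of $\#$.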


3.2. The Interaction.

Let us now consider the effects of the perturbation $S_1$. Corresponding to the split (11), the effect of $S_1$ is to introduce two bulk graphs:

$$\mathcal{V} = -\frac{i}{6} \int_\Sigma H_{abc}\, x^c\, d\zeta^a \wedge d\zeta^b, \tag{14}$$

$$\mathcal{W} = -\frac{i}{6} \int_\Sigma H_{abc}\, \zeta^a\, d\zeta^b \wedge d\zeta^c. \tag{15}$$

We will then consider the following path integral:

$$\int [dX]\, e^{-S_0(X) - S_1(X)}\, f_1(X(\tau_1)) \cdots f_n(X(\tau_n)) \simeq \int [dX]\, e^{-S_0(X)}\, [1 + \mathcal{V} + \mathcal{W}]\, f_1(X(\tau_1)) \cdots f_n(X(\tau_n)). \tag{16}$$

In order to analyze the effects of $\mathcal{V}$ and $\mathcal{W}$, let us first introduce some notation and discuss some useful simple results. Consider a generic point $z \in H_+$, and consider the path integral:

$$\int [dX]\, e^{-S_0(X)}\, \zeta^a(z)\, f_1(X(\tau_1)) \cdots f_n(X(\tau_n)).$$

If we introduce the short-hand notation, $A(z, \tau_i) = A_i$, for the angle between the line $z$–$\tau_i$ and the vertical through $\tau_i$, then the result of the above path integral is simply given by

$$\frac{i}{\pi} \sum_{i=1}^{n} A_i\, \theta^{aa'} \int f_1 \,\#\, \cdots \,\#\, \partial_{a'} f_i \,\#\, \cdots \,\#\, f_n.$$

The above result is easy to understand once one considers the expansion of the functions $f_i(X) = f_i(x + \zeta)$ in Taylor series in powers of $\zeta$. The contraction of the field $\zeta^a(z)$ with a field $\zeta^{a'}(\tau_i)$ coming from the Taylor expansion of the function $f_i$ gives a factor of $\frac{i}{\pi} A_i\, \theta^{aa'}$. We are then left with a path integral of the form (13), where the function $f_i$ has been replaced with its derivative $\partial_{a'} f_i$. More generally, when a free field $\zeta^a(z)$ is contracted with one of the boundary functions it acts as a differentiation:

$$\zeta^a(z) \to \frac{i}{\pi}\, A\, \theta^{aa'}\, \partial_{a'}. \tag{17}$$
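The angle function entering these contractions is easy to evaluate numerically. The sketch below assumes the form $A(z, \tau) = \frac{1}{2i} \ln \frac{\tau - z}{\bar z - \tau}$ for real $\tau$ and $z$ in the upper half-plane, and compares it with the signed angle between the segment from $\tau$ to $z$ and the vertical through $\tau$; the function names are illustrative.

```python
import cmath
import math


def angle_A(z, tau):
    """A(z, tau) = (1/2i) ln[(tau - z) / (conj(z) - tau)] for real tau, Im z > 0."""
    w = complex(tau)
    # The argument of the logarithm has unit modulus, so the result is real.
    return (cmath.log((w - z) / (z.conjugate() - w)) / 2j).real


def angle_from_vertical(z, tau):
    """Signed angle between the segment tau -> z and the vertical line through tau."""
    return math.atan2(z.imag, z.real - tau) - math.pi / 2
```

For a fixed interior point `z` and boundary points `tau_i < tau_j`, both functions give values with $-\pi/2 < A_i < A_j < \pi/2$, so the pair $(A_i, A_j)$ ranges over a triangle of area $\pi^2/2$ as `z` varies over the upper half-plane.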

With this result, we can now consider the effects of the perturbation vertices $\mathcal{V}$ and $\mathcal{W}$ in the path integral (16). Let us start with the analysis of $\mathcal{V}$. Choose any two indices $i < j$ and consider the term where the two $\zeta$’s in $\mathcal{V}$ differentiate the two functions $f_i$ and $f_j$ (in the sense just described above). If $\zeta^a$ differentiates $f_i$ and $\zeta^b$ differentiates $f_j$ one then gets

$$\frac{i}{6\pi^2}\, H_{abc}\, \theta^{aa'} \theta^{bb'} \int_\Sigma dA_i \wedge dA_j \int x^c\, \cdots \,\#\, \partial_{a'} f_i \,\#\, \cdots \,\#\, \partial_{b'} f_j \,\#\, \cdots.$$

The integral over $\Sigma$ can be evaluated by noting that the upper-half plane $H_+$ corresponds to the simplex $-\frac{\pi}{2} < A_i < A_j < \frac{\pi}{2}$ in the $A_i$–$A_j$ plane. Therefore the integral


$\int_\Sigma dA_i \wedge dA_j$ is equal to $\frac{1}{2}\pi^2$. Moreover, if we instead let $\zeta^a$ differentiate $f_j$ and $\zeta^b$ differentiate $f_i$ we obtain, using the antisymmetry of $H_{abc}$, the same result as above. Summing the two contributions, and summing over all possible pairs $i < j$, one then obtains

$$\sum_{i<j} \mathcal{V}_{ij},$$

where, for $i < j$, we have defined

$$\mathcal{V}_{ij} = \frac{i}{6}\, H_{abc}\, \theta^{aa'} \theta^{bb'} \int x^c \,\#\, f_1 \,\#\, \cdots \,\#\, \partial_{a'} f_i \,\#\, \cdots \,\#\, \partial_{b'} f_j \,\#\, \cdots \,\#\, f_n.$$

In the above equation we have used the fact that (for the Moyal product) $\int f \cdot g = \int f \,\#\, g$, in order to rewrite everything in terms of $\#$ products, including the multiplication by the coordinate function $x^c$. To conclude the analysis of the effect of the two-vertex $\mathcal{V}$, one must also consider the term coming from the contraction of the two $\zeta$’s in $\mathcal{V}$ among themselves. This term will require some care, since we must regularize the contraction of two fields at coincident points. On the other hand, the general structure of the contribution can be obtained with little effort by recalling that the two indices $a$ and $b$ in (14) are contracted with the antisymmetric tensor $H_{abc}$. This implies that the contribution in question must have the form

$$\mathcal{V} = N\, H_{abc}\, \theta^{bc} \int x^a \,\#\, (f_1 \,\#\, \cdots \,\#\, f_n),$$

where $N$ is an unknown constant which will later be determined to be $1/3$.

We now move to the analysis of the contributions coming from the three-graph $\mathcal{W}$. First, given three indices $i < j < k$, let us define

$$\mathcal{W}_{ijk} = -\frac{1}{12}\, H_{abc}\, \theta^{aa'} \theta^{bb'} \theta^{cc'} \int f_1 \,\#\, \cdots \,\#\, \partial_{a'} f_i \,\#\, \cdots \,\#\, \partial_{b'} f_j \,\#\, \cdots \,\#\, \partial_{c'} f_k \,\#\, \cdots \,\#\, f_n.$$

It is then easy to check, using the general result (17), that the contribution from the three-vertex which comes from the contraction of the fields $\zeta$’s in $\mathcal{W}$ with the functions $f_i$, $f_j$, $f_k$ is given by

$$\sum_{i<j<k} S(\tau_i, \tau_j, \tau_k)\, \mathcal{W}_{ijk},$$

where the function $S$ is

$$S(\tau_i, \tau_j, \tau_k) = \frac{2}{\pi^3} \int_\Sigma A_i\, dA_j \wedge dA_k \pm \mathrm{permutations}_{ijk} = \frac{4}{\pi^3} \int_\Sigma A_i\, dA_j \wedge dA_k + \mathrm{cyclic}_{ijk}.$$

Other combinations, which involve contractions of the $\zeta$’s amongst themselves, yield a vanishing contribution to the result of the three-vertex $\mathcal{W}$.

Let us analyze the function $S$ in more detail. As we saw, it depends on three ordered points $\tau_i < \tau_j < \tau_k$ on the boundary $\partial\Sigma$. On the other hand, since it is written explicitly in terms of integrals of angle functions $A$, it is actually invariant under translations $\tau \to \tau + c$ and scalings $\tau \to \lambda\tau$, i.e., under the subgroup of the modular group $SL(2, \mathbb{R})$ which leaves invariant the point at infinity. Therefore, by sending $\tau_i$ to 0 and


$\tau_k$ to 1, it becomes clear that $S$ actually only depends on a single parameter ranging between 0 and 1. Explicitly, one has:

$$S(\tau_i, \tau_j, \tau_k) = S\!\left( \frac{\tau_{ji}}{\tau_{ki}} \right),$$

where, from now on, we use the notation $\tau_{ji} = \tau_j - \tau_i$. As we explicitly show in the appendix, the function $S(x)$ can be computed exactly. It is a monotonically decreasing function defined on $[0, 1]$ ranging from 1 to $-1$. It satisfies $S(1 - x) = -S(x)$ and is explicitly given by

$$S(x) = 1 - 2 L(x).$$

The function $L(x)$ is the so called normalized Rogers dilogarithm [57], defined in terms of the usual dilogarithm $\mathrm{Li}_2(x) = \sum_{n=1}^{\infty} \frac{x^n}{n^2}$ as:

$$L(x) = \frac{6}{\pi^2} \left[ \mathrm{Li}_2(x) + \frac{1}{2} \ln(x) \ln(1 - x) \right].$$

We then conclude that the contribution coming from the three-vertex $\mathcal{W}$ is given by

$$\sum_{i<j<k} S\!\left( \frac{\tau_{ji}}{\tau_{ki}} \right) \mathcal{W}_{ijk}.$$
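The stated properties of $S(x)$ can be checked numerically from the definition of the normalized Rogers dilogarithm. The sketch below is one possible implementation under stated assumptions: it sums the defining series of $\mathrm{Li}_2$ for $x \le 1/2$ and uses Euler's reflection formula $\mathrm{Li}_2(x) + \mathrm{Li}_2(1 - x) = \pi^2/6 - \ln(x)\ln(1 - x)$ for larger $x$; the function names and the truncation at 200 terms are illustrative choices.

```python
import math


def li2(x):
    """Dilogarithm Li_2(x) for 0 <= x <= 1."""
    if x == 0.0:
        return 0.0
    if x == 1.0:
        return math.pi ** 2 / 6
    if x > 0.5:
        # Euler reflection formula, so that the series is only summed for x <= 1/2
        return math.pi ** 2 / 6 - math.log(x) * math.log(1 - x) - li2(1 - x)
    return sum(x ** n / n ** 2 for n in range(1, 200))


def rogers_L(x):
    """Normalized Rogers dilogarithm, with L(0) = 0 and L(1) = 1."""
    if x == 0.0 or x == 1.0:
        return float(x)
    return (6 / math.pi ** 2) * (li2(x) + 0.5 * math.log(x) * math.log(1 - x))


def S_of_x(x):
    """S(x) = 1 - 2 L(x) on [0, 1]."""
    return 1 - 2 * rogers_L(x)
```

Evaluating `S_of_x` on a grid in $[0, 1]$ reproduces the endpoint values $S(0) = 1$ and $S(1) = -1$, the antisymmetry $S(1 - x) = -S(x)$ (so that $S(1/2) = 0$), and the monotonic decrease.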
4. Computation of n-Point Functions

We now use the general results derived in the previous section in order to analyze the conformal properties of the n-point functions (16). Let us recall that we are still working in the simple case of constant symplectic structure $\omega_{ab}(x) = B_{ab}$, so that $\tilde\omega_{ab} = B_{ab} + \frac{1}{3} H_{abc} x^c$. The generalization to arbitrary symplectic structure $\omega_{ab}(x) = B_{ab} + F_{ab}(x)$ is left to Sect. 4.5. In order to simplify the expressions in this section, we introduce the following shorthand notation:

$$K^{abc} = \theta^{aa'} \theta^{bb'} \theta^{cc'} H_{a'b'c'}, \qquad y_a = B_{ab}\, x^b .$$

4.1. 2-point function. We shall first analyze the two-point function in some detail, since the manipulations for the higher point functions will be similar. One considers two functions $f_1$ and $f_2$, placed at points $\tau_1$ and $\tau_2$ on the real line, with $\tau_1 < \tau_2$. A simple computation, using the general results of the previous section, shows that the two-point function is explicitly given by:

$$\int \left[ f_1 \star f_2 + \left( -\frac{i}{6} K^{abc}\, y_c \star \partial_a f_1 \star \partial_b f_2 + N B_{bc} K^{abc}\, y_a \star f_1 \star f_2 \right) \right] . \tag{18}$$


L. Cornalba, R. Schiappa

The above expression does not depend on the explicit values of $\tau_1$, $\tau_2$, but depends only on the order of the points $\tau_i$ on the real line. On the other hand, since two points on the boundary of a disk have no (conformal) moduli, the two-point function must be a symmetric bilinear of $f_1$, $f_2$. The first term in (18) is clearly symmetric. Let us then concentrate on the second term, by rewriting it with $f_1$ and $f_2$ interchanged. This gives, after a small rearranging,

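$$\int \left[ \frac{i}{6} K^{abc}\, \partial_a f_1 \star y_c \star \partial_b f_2 + N B_{bc} K^{abc}\, f_1 \star y_a \star f_2 \right] . \tag{19}$$

Using that $K^{abc}\, \partial_a f_1 \star y_c = K^{abc}\, y_c \star \partial_a f_1$, and differentiating by parts, we see that the difference between (19) and the second term of (18) reads

$$\int \left[ \frac{i}{3} B_{bc} K^{abc}\, \partial_a f_1 \star f_2 - N B_{bc} K^{abc}\, [y_a, f_1] \star f_2 \right] .$$

The above is then vanishing if one has

$$N = \frac{1}{3} .$$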
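The last step can be spelled out explicitly. The following is a sketch under two assumptions about conventions that are not stated at this point in the text: $\theta^{ab} B_{bc} = \delta^a_c$, and $[x^a, f]_\star = i \theta^{ab} \partial_b f$ (which is exact for the Moyal product, since only the first-order term contributes when one factor is a coordinate function).

$$[y_a, f_1]_\star = B_{ad}\, [x^d, f_1]_\star = i\, B_{ad}\, \theta^{de}\, \partial_e f_1 = i\, \partial_a f_1 ,$$

so that the difference becomes

$$i \left( \frac{1}{3} - N \right) B_{bc} K^{abc} \int \partial_a f_1 \star f_2 ,$$

which vanishes for arbitrary $f_1$, $f_2$ only if $N = 1/3$.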

With this value the two-point function is conformally invariant and is a simple symmetric bilinear of the functions $f_1$, $f_2$. Let us denote the n-point function by $P_n$. A little computation shows, using the identity $\int f \star g = \int f g$, that the two-point function (18) is given by the explicitly symmetric expression

$$P_2(f_1, f_2) = \int f_1 f_2 \left( 1 + \frac{1}{3} H_{abc}\, x^a \theta^{bc} \right) . \tag{20}$$

4.2. 3-point function. Let $\tau_1 < \tau_2 < \tau_3$ be three ordered points on the real line, and let us consider the three-point function of three functions $f_1$, $f_2$, $f_3$. One now has a contribution from the three-vertex $\mathcal{W}$, but it vanishes since

$$\int K^{abc}\, \partial_a f_1 \star \partial_b f_2 \star \partial_c f_3 = 0 .$$

The only contribution then comes from the two-vertex $\mathcal{V}$, and it is given explicitly by (using the value $1/3$ for $N$)

$$\begin{aligned} P_3(f_1, f_2, f_3) = \int \Big[ & f_1 \star f_2 \star f_3 + \frac{1}{3} B_{bc} K^{abc}\, y_a \star f_1 \star f_2 \star f_3 \\ & - \frac{i}{6} K^{abc}\, y_c \star \left( \partial_a f_1 \star \partial_b f_2 \star f_3 + \partial_a f_1 \star f_2 \star \partial_b f_3 + f_1 \star \partial_a f_2 \star \partial_b f_3 \right) \Big] . \end{aligned} \tag{21}$$

As for the two-point function, the above expression does not depend on the explicit values of the $\tau_i$'s, but only on their order on the real line. On the other hand, since three points on the disk have no moduli (as in the two-point function case), the above expression should actually be invariant under cyclic permutations of the three functions,


and in particular under the replacement $f_1, f_2, f_3 \to f_2, f_3, f_1$. One must then show that (21) is equal to:

$$\begin{aligned} \int \Big[ & f_2 \star f_3 \star f_1 + \frac{1}{3} B_{bc} K^{abc}\, f_1 \star y_a \star f_2 \star f_3 \\ & - \frac{i}{6} K^{abc} \left( -\partial_a f_1 \star y_c \star \partial_b f_2 \star f_3 - \partial_a f_1 \star y_c \star f_2 \star \partial_b f_3 + f_1 \star y_c \star \partial_a f_2 \star \partial_b f_3 \right) \Big] . \end{aligned}$$

In the second line of the above expression, we are free to move the function $y_c$ all the way to the left, since we are contracting with the totally antisymmetric object $K^{abc}$. Given this fact, it is simple to show that the above expression is identical to (21), thus proving that also the three-point function is invariant under conformal transformations, and is therefore a cyclic trilinear of its inputs. Note that the same value for $N$ makes both $P_2$ and $P_3$ invariant.

4.3. 4-point function. Let us now consider the four-point function. As usual we choose four ordered points $\tau_1 < \cdots < \tau_4$ on the real line and four functions $f_1, \ldots, f_4$. Following the general results in the previous sections, the result of the path integral (16) breaks into three parts. First we have the unperturbed result, given by

$$\int f_1 \star \cdots \star f_4 .$$

The above is independent of the positions of the $\tau$'s, and is conformally (actually topologically) invariant by itself, since it is a cyclic multilinear function of the $f$'s. Second, we have the term coming from the two-vertex $\mathcal{V}$, given by (in the notation of Sect. 3.2)

$$\mathcal{V}(f_1, \ldots, f_4) = V + \sum_{i<j} V_{ij} . \tag{22}$$

Finally we have, for the first time, a non-vanishing contribution to the path integral coming from the three-vertex $\mathcal{W}$, which is given explicitly by

$$\frac{1}{12} \int K^{abc} \left[ S(\tau_1, \tau_2, \tau_3)\, \partial_a f_1 \star \partial_b f_2 \star \partial_c f_3 \star f_4 + \cdots \right] , \tag{23}$$

where $\cdots$ stands for three more terms which are weighted with the corresponding factor $S(\tau_i, \tau_j, \tau_k)$, and with the derivatives $\partial_a$, $\partial_b$ and $\partial_c$ acting on all possible groups of three functions, as explained in Sect. 3.2. Note that all the terms in (23) are actually the same after integration by parts (for example $\int K^{abc}\, \partial_a f_1 \star \partial_b f_2 \star f_3 \star \partial_c f_4 = -\int K^{abc}\, \partial_a f_1 \star \partial_b f_2 \star \partial_c f_3 \star f_4$, and so on), so that the above equation can be rewritten as

$$\frac{1}{12}\, \kappa(\tau_i) \int K^{abc}\, f_1 \star \partial_a f_2 \star \partial_b f_3 \star \partial_c f_4 , \tag{24}$$

where the coefficient $\kappa$ is given by

$$\kappa(\tau_i) = -S(\tau_1, \tau_2, \tau_3) + S(\tau_1, \tau_2, \tau_4) - S(\tau_1, \tau_3, \tau_4) + S(\tau_2, \tau_3, \tau_4) .$$

Let us now discuss the conformal invariance of the above four-point function. Start by considering a general $SL(2, \mathbb{R})$ transformation which preserves the order of the points


$\tau_1, \ldots, \tau_4$ on the real line. In this case the term (22) is invariant by itself, since it depends only on the order of the insertion points and not their specific positions. It must then be true that (24) is also invariant, and this will be the case if the coefficient $\kappa(\tau_i)$ itself is unchanged under the $SL(2, \mathbb{R})$ transformation. We first recall that four points on the real line have a unique invariant modulus $m$, with $0 < m < 1$, which can be taken to be the position of point 2 once one maps $\tau_1, \tau_3, \tau_4$ to $0, 1, +\infty$. Using the standard notation $\tau_{ij} = \tau_i - \tau_j$, the modulus $m$ can also be invariantly described by the cross-ratio

$$m = \frac{\tau_{43}\, \tau_{21}}{\tau_{42}\, \tau_{31}} .$$

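Let us now rewrite $\kappa$ in terms of Rogers dilogarithms (see the appendix)

$$\frac{1}{2} \kappa = L\!\left( \frac{\tau_{21}}{\tau_{31}} \right) - L\!\left( \frac{\tau_{21}}{\tau_{41}} \right) + L\!\left( \frac{\tau_{31}}{\tau_{41}} \right) - L\!\left( \frac{\tau_{32}}{\tau_{42}} \right) .$$

If we use the general identity (49), from the appendix, with $x = \tau_{21}/\tau_{31}$ and $y = \tau_{31}/\tau_{41}$, one quickly discovers that

$$\kappa(\tau_i) = 2L(m) ,$$

thus showing that the expression (23) is conformally invariant, for an order-preserving $SL(2, \mathbb{R})$ transformation. One now needs to show that the full four-point function is invariant under order-changing conformal transformations. We will actually be done once we have considered the following special case. Start with the following configuration of points: $\tau_1 = 0$, $\tau_2 = m$, $\tau_3 = 1$ and $\tau_4 = +\infty$. The $K$-dependent part of the four-point function is given by

$$\mathcal{V}(f_1, f_2, f_3, f_4) + \frac{1}{6} L(m) \int K^{abc}\, f_1 \star \partial_a f_2 \star \partial_b f_3 \star \partial_c f_4 .$$

Let us now move the point $\tau_4$ from $+\infty$ to $-\infty$. In this case, the path integral gives

$$\begin{aligned} & \mathcal{V}(f_4, f_1, f_2, f_3) + \frac{1}{6} L(1 - m) \int K^{abc}\, f_4 \star \partial_a f_1 \star \partial_b f_2 \star \partial_c f_3 \\ & \quad = \mathcal{V}(f_4, f_1, f_2, f_3) + \frac{1}{6} \left( L(m) - 1 \right) \int K^{abc}\, f_1 \star \partial_a f_2 \star \partial_b f_3 \star \partial_c f_4 . \end{aligned}$$

One then needs to prove that

$$\mathcal{V}(f_4, f_1, f_2, f_3) - \mathcal{V}(f_1, f_2, f_3, f_4) = \frac{1}{6} \int K^{abc}\, f_1 \star \partial_a f_2 \star \partial_b f_3 \star \partial_c f_4 .$$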

We see that the dependence on the modulus m has dropped out and this must be the case since the LHS of the above equation depends only on the order of the points, not on their positions. The above equation is a special case of a more general formula which we shall prove in the next section, where we consider the conformal invariance of n-point functions.
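The identity $\kappa(\tau_i) = 2L(m)$ used in this subsection can be checked numerically for randomly chosen ordered points. The sketch below is illustrative only: it assumes $S(\tau_i, \tau_j, \tau_k) = 1 - 2L(\tau_{ji}/\tau_{ki})$ with the normalized Rogers dilogarithm defined in Sect. 3, and all function names (`li2`, `rogers_L`, `kappa`, `cross_ratio`, `max_deviation`) are hypothetical helpers introduced here.

```python
import math
import random


def li2(x):
    """Dilogarithm on [0, 1] via power series plus Euler reflection."""
    if x == 0.0:
        return 0.0
    if x == 1.0:
        return math.pi ** 2 / 6
    if x > 0.5:
        return math.pi ** 2 / 6 - math.log(x) * math.log(1 - x) - li2(1 - x)
    total, power = 0.0, 1.0
    for n in range(1, 200):
        power *= x
        total += power / (n * n)
    return total


def rogers_L(x):
    """Normalized Rogers dilogarithm for 0 < x < 1."""
    return 6 / math.pi ** 2 * (li2(x) + 0.5 * math.log(x) * math.log(1 - x))


def S3(ti, tj, tk):
    """S(tau_i, tau_j, tau_k) = 1 - 2 L(tau_ji / tau_ki) for ti < tj < tk."""
    return 1 - 2 * rogers_L((tj - ti) / (tk - ti))


def kappa(t1, t2, t3, t4):
    return -S3(t1, t2, t3) + S3(t1, t2, t4) - S3(t1, t3, t4) + S3(t2, t3, t4)


def cross_ratio(t1, t2, t3, t4):
    """m = tau_43 tau_21 / (tau_42 tau_31)."""
    return (t4 - t3) * (t2 - t1) / ((t4 - t2) * (t3 - t1))


def max_deviation(trials=500, seed=0):
    """Largest |kappa - 2 L(m)| over random ordered four-point configurations."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        t, points = rng.uniform(-3.0, 3.0), []
        for _ in range(4):
            t += rng.uniform(0.1, 2.0)  # strictly increasing, well separated
            points.append(t)
        worst = max(worst, abs(kappa(*points) - 2 * rogers_L(cross_ratio(*points))))
    return worst
```

For instance, the configuration $(0, 1, 2, 4)$ has $m = 1/3$, and both sides evaluate to $2L(1/3)$.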


4.4. General n-point functions. We finally turn our analysis to the n-point functions by considering the path integral with $n$ functions $f_1, \ldots, f_n$ inserted on the real line at points $\tau_1 < \cdots < \tau_n$ which are ordered from left to right. The unperturbed result is just $\int f_1 \star \cdots \star f_n$, which is invariant under all diffeomorphisms of the disk. The $H_{abc}$-dependent terms in the path integral divide as always into an expression coming from the two-vertex,

$$\mathcal{V}(f_1, \ldots, f_n) = V + \sum_{i<j} V_{ij} , \tag{25}$$

and a part coming from the three-vertex,

$$\sum_{i<j<k} S_{ijk}\, W_{ijk} , \qquad S_{ijk} = S(\tau_i, \tau_j, \tau_k) .$$

We have defined the symbols $W_{ijk}$ and $S_{ijk}$ for $1 \le i < j < k \le n$, but one can extend the definition to all indices $i, j, k$ by demanding that both $W_{ijk}$ and $S_{ijk}$ be totally antisymmetric tensors. Then the last contribution to the path integral is just

$$\frac{1}{6} \sum_{i,j,k} S_{ijk}\, W_{ijk} . \tag{26}$$

The terms $W_{ijk}$ are not linearly independent since one can show, differentiating by parts, that

$$\sum_k W_{ijk} = 0 .$$

This implies that$^4$ the number of independent coefficients $W_{ijk}$ is $\binom{n-1}{3}$, and that there is a totally antisymmetric tensor, $W_{ijkl}$, such that $W_{ijk} = \sum_l W_{ijkl}$. Concretely, one can choose

$$W_{ijkl} = \frac{1}{n} \left( W_{ijk} - W_{ijl} + W_{ikl} - W_{jkl} \right) . \tag{27}$$
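The choice (27) can be tested on random data. The following sketch is only an illustration: it manufactures a tensor $W_{ijk}$ obeying $\sum_k W_{ijk} = 0$ by contracting a random totally antisymmetric four-index tensor, and then checks that (27) returns a four-index tensor whose contraction reproduces $W_{ijk}$. Indices run over `0..n-1`, and all helper names are hypothetical.

```python
import itertools
import random


def permutation_sign(indices):
    """Sign of the permutation that sorts a tuple of distinct integers."""
    inversions = sum(
        1
        for a in range(len(indices))
        for b in range(a + 1, len(indices))
        if indices[a] > indices[b]
    )
    return -1 if inversions % 2 else 1


def random_antisymmetric(n, rank, rng):
    """Random totally antisymmetric tensor, stored as a dict over index tuples."""
    tensor = {idx: 0.0 for idx in itertools.product(range(n), repeat=rank)}
    for combo in itertools.combinations(range(n), rank):
        value = rng.uniform(-1.0, 1.0)
        for perm in itertools.permutations(combo):
            tensor[perm] = permutation_sign(perm) * value
    return tensor


def contract_last(tensor, n, rank):
    """Sum a rank-`rank` tensor over its last index."""
    return {
        idx: sum(tensor[idx + (j,)] for j in range(n))
        for idx in itertools.product(range(n), repeat=rank - 1)
    }


def lift_eq27(w3, n):
    """Four-index tensor built from W_ijk as in Eq. (27)."""
    return {
        (i, j, k, l): (w3[i, j, k] - w3[i, j, l] + w3[i, k, l] - w3[j, k, l]) / n
        for i, j, k, l in itertools.product(range(n), repeat=4)
    }


n = 5
rng = random.Random(1)
# Contracting an antisymmetric 4-tensor guarantees sum_k W_ijk = 0.
w3 = contract_last(random_antisymmetric(n, 4, rng), n, 4)
w4 = lift_eq27(w3, n)
w3_back = contract_last(w4, n, 4)

max_constraint = max(abs(v) for v in contract_last(w3, n, 3).values())
max_mismatch = max(abs(w3_back[idx] - w3[idx]) for idx in w3)
```

Note that the lifted tensor need not coincide with the random four-index tensor used to generate the data; only its contraction is fixed.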

Therefore Eq. (26) can be written as

$$\frac{1}{6} \sum_{i,j,k,l} S_{ijk}\, W_{ijkl} = \sum_{i<j<k<l} W_{ijkl} \left( S_{ijk} - S_{ijl} + S_{ikl} - S_{jkl} \right) .$$

As we have already seen in Sect. 4.3 (see also the appendix), the properties of the Rogers dilogarithmic function imply that, for $i < j < k < l$,

$$S_{ijk} - S_{ijl} + S_{ikl} - S_{jkl} = -2L\!\left( \frac{\tau_{lk}\, \tau_{ji}}{\tau_{lj}\, \tau_{ki}} \right) .$$

$^4$ These facts follow from the following (trivial) cohomology computation. Let $C^k$ be the space of totally antisymmetric tensors with $k$ indices, and let $\delta_{k+1} : C^{k+1} \to C^k$ be defined by $(\delta T)_{i_1 \cdots i_k} = \sum_j T_{i_1 \cdots i_k j}$. Then $\delta_k \delta_{k+1} = 0$. It is easy to show that the corresponding cohomology is trivial (see Eq. (27)). Therefore if $\delta_3 B = 0$, it must be that $B = \delta_4 \cdots$. Moreover, one has that $\dim \ker \delta_{k+1} = \dim C^{k+1} - \dim \operatorname{Im} \delta_{k+1} = \dim C^{k+1} - \dim \ker \delta_k$, so that $\dim \ker \delta_3 = \dim C^3 - \dim C^2 + \dim C^1 - \dim C^0$.
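The dimension count in the footnote can be confirmed by brute force for small $n$. The sketch below, which is an illustration and not part of the original argument, sets up the linear constraints $\sum_k W_{ijk} = 0$ on the $\binom{n}{3}$ components $W_{abc}$ with $a < b < c$, computes their rank exactly with rational arithmetic, and compares the number of free components with $\binom{n-1}{3}$. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def constraint_matrix(n):
    """One row per pair i < j, encoding sum_k W_ijk = 0 on components W_abc, a < b < c."""
    triples = list(combinations(range(n), 3))
    column = {t: c for c, t in enumerate(triples)}
    rows = []
    for i, j in combinations(range(n), 2):
        row = [Fraction(0)] * len(triples)
        for k in range(n):
            if k in (i, j):
                continue
            idx = (i, j, k)
            inversions = sum(
                1 for a in range(3) for b in range(a + 1, 3) if idx[a] > idx[b]
            )
            # W_ijk equals +/- the sorted component, by total antisymmetry.
            row[column[tuple(sorted(idx))]] += (-1) ** inversions
        rows.append(row)
    return rows


def rank(rows):
    """Rank by exact Gauss-Jordan elimination over the rationals."""
    rows = [r[:] for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        pivot_value = rows[rk][c]
        rows[rk] = [v / pivot_value for v in rows[rk]]
        for r in range(len(rows)):
            if r != rk and rows[r][c] != 0:
                factor = rows[r][c]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def independent_components(n):
    """Number of W_ijk components left free by the constraints."""
    return comb(n, 3) - rank(constraint_matrix(n))
```

The alternating sum quoted in the footnote gives the same number, since $\binom{n}{3} - \binom{n}{2} + \binom{n}{1} - \binom{n}{0} = \binom{n-1}{3}$.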


Therefore the final result,

$$-2 \sum_{i<j<k<l} W_{ijkl}\, L\!\left( \frac{\tau_{lk}\, \tau_{ji}}{\tau_{lj}\, \tau_{ki}} \right) , \tag{28}$$
is written as a function only of the cross-ratios and is therefore conformally invariant. As in the case of the four-point function, the above reasoning is valid as long as the conformal transformation preserves the order of the points on the real line. In order to complete the proof of conformal invariance one must also consider the behavior of the full path integral as we pass one point from $+\infty$ to $-\infty$. Let us then consider the simple setup with $\tau_1, \ldots, \tau_{n-1}$ at fixed positions, and $\tau_n \to +\infty$. Then the sum (26) breaks into two parts:

$$\sum_{i<j<k<n} S_{ijk}\, W_{ijk} + \sum_{i<j<n} S_{ijn}\, W_{ijn} = \sum_{i<j<k<n} S_{ijk}\, W_{ijk} + \sum_{i<j<n} W_{ijn} ,$$

where we have used the fact that $S_{ijn} = S(0) = 1$. Let us now "move $\tau_n$ across infinity", so that $\tau_n \to -\infty$. The first term in the above expression is invariant, since it does not contain the point $n$, and the function $f_n$ in $W_{ijk}$ is not differentiated. The only change is in the term with $W_{ijn}$. As we move the point $\tau_n$ from $+\infty$ to $-\infty$, the coefficients $W_{ijn}$ are multiplied not with $S(0) = 1$ but with $S(1) = -1$, so that the total expression changes by

$$2 \sum_{i<j<n} W_{ijn} .$$
The above term is purely topological, i.e., it does not depend on the explicit position of the points $\tau_1, \ldots, \tau_{n-1}$, and it must be canceled by the variation of expression (25) as we change the ordering of the functions. More precisely, one must have that:

$$\mathcal{V}(f_n, f_1, \ldots, f_{n-1}) - \mathcal{V}(f_1, \ldots, f_n) = 2 \sum_{i<j<n} W_{ijn} .$$

To prove the above statement let us denote with $\tilde V$ and $\tilde V_{ij}$ the quantities corresponding to $V$ and $V_{ij}$, with the functions $f_1, \ldots, f_n$ permuted to $f_n, f_1, \ldots, f_{n-1}$, so that $\mathcal{V}(f_n, f_1, \ldots, f_{n-1}) = \tilde V + \sum_{i<j} \tilde V_{ij}$. It is easy to show that

$$\tilde V_{ij} = 2 W_{i-1,j-1,n} + V_{i-1,j-1} \quad (1 < i < j) , \qquad \tilde V_{1j} = -V_{j-1,n} \quad (1 < j) .$$

Also, since $\tilde V - V = \frac{i}{3} B_{bc} K^{abc} \int f_1 \star \cdots \star f_{n-1} \star \partial_a f_n$, one can show that

$$\tilde V - V = 2 \sum_{j<n} V_{jn} .$$

Putting everything together, one finally obtains

$$\begin{aligned} \mathcal{V}(f_n, f_1, \ldots, f_{n-1}) - \mathcal{V}(f_1, \ldots, f_n) & = \tilde V - V + \sum_{1<j} \tilde V_{1j} + \sum_{1<i<j} \tilde V_{ij} - \sum_{i<j<n} V_{ij} - \sum_{j<n} V_{jn} \\ & = 2 \sum_{i<j<n} W_{ijn} , \end{aligned}$$

as was to be shown.
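The final bookkeeping step is pure index arithmetic, so it can be checked with arbitrary numbers standing in for $V_{ij}$ and $W_{ijn}$. The sketch below assumes exactly the three relations quoted in the text (for $\tilde V - V$, $\tilde V_{1j}$ and $\tilde V_{ij}$) and verifies only that they combine to $2 \sum_{i<j<n} W_{ijn}$; it says nothing about the relations themselves. The function name and the dictionary layout are hypothetical.

```python
import itertools
import random


def shifted_minus_original(n, V2, Wn):
    """V(f_n, f_1, ..., f_{n-1}) - V(f_1, ..., f_n), assembled from the quoted relations.

    V2[i, j] = V_ij for 1 <= i < j <= n;  Wn[i, j] = W_ijn for 1 <= i < j < n.
    """
    # tilde V - V = 2 sum_{j<n} V_jn
    constant_term = 2 * sum(V2[j, n] for j in range(1, n))
    # tilde V_1j = -V_{j-1,n} for 1 < j
    first_row = sum(-V2[j - 1, n] for j in range(2, n + 1))
    # tilde V_ij = 2 W_{i-1,j-1,n} + V_{i-1,j-1} for 1 < i < j
    rest = sum(
        2 * Wn[i - 1, j - 1] + V2[i - 1, j - 1]
        for i, j in itertools.combinations(range(2, n + 1), 2)
    )
    return constant_term + first_row + rest - sum(V2.values())


n = 6
rng = random.Random(2)
V2 = {pair: rng.uniform(-1, 1) for pair in itertools.combinations(range(1, n + 1), 2)}
Wn = {pair: rng.uniform(-1, 1) for pair in itertools.combinations(range(1, n), 2)}
difference = shifted_minus_original(n, V2, Wn)
expected = 2 * sum(Wn.values())
```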

4.5. Including the boundary interaction $S_B$. In this section we are going to extend the results of the previous section by including the effects of the boundary interaction $S_B$ in the computation of the n-point functions (16). We have not checked with path integral computations all the details of what follows, but the extension is quite natural. We will leave for future work a detailed path integral analysis of the results of this section. It is natural in this context to change notation and to represent, as usual (see, e.g., [56]), functions as operators and $\star$ products as operator multiplication. Finally, integrals will be denoted by traces $\mathrm{Tr}$. Therefore, we shall shift notation for functions as follows

$$x^a \to X^a , \qquad f_i \to F_i ,$$

and for traces as

$$\int V(\omega)\, dx \to \mathrm{Tr} .$$

One then has the simple correspondences: $\theta^{aa'} \partial_{a'} f \to -i [X^a, F]$, $\theta^{ab} \to -i [X^a, X^b]$. This allows us to rewrite the expressions for $V$, $V_{ij}$ and $W_{ijk}$ in operator notation as

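$$V = -\frac{2i}{3} H_{abc}\, \mathrm{Tr} \left( X^a X^b X^c F_1 \cdots F_n \right) ,$$

$$V_{ij} = -\frac{i}{6} H_{abc}\, \mathrm{Tr} \left( X^c F_1 \cdots [X^a, F_i] \cdots [X^b, F_j] \cdots F_n \right) ,$$

and

$$W_{ijk} = -\frac{i}{12} H_{abc}\, \mathrm{Tr} \left( F_1 \cdots [X^a, F_i] \cdots [X^b, F_j] \cdots [X^c, F_k] \cdots F_n \right) .$$

We now consider the general case of $\omega_{ab}(x) = B_{ab} + F_{ab}(x)$. The expressions above are still well-defined and are the natural generalizations of the $\omega_{ab}(x) = B_{ab}$ expressions previously derived. On the other hand, for general $\omega$, we have that:

$$\sum_k W_{ijk} = W_{ij} ,$$

where, for $i < j$,

$$\begin{aligned} W_{ij} = {} & \frac{i}{24} H_{abc}\, \mathrm{Tr} \left( F_1 \cdots [[X^a, X^b], F_i] \cdots [X^c, F_j] \cdots F_n \right) \\ & - \frac{i}{24} H_{abc}\, \mathrm{Tr} \left( F_1 \cdots [X^a, F_i] \cdots [[X^b, X^c], F_j] \cdots F_n \right) . \end{aligned}$$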

Note that, when $[X^a, X^b] = i\theta^{ab}$ is constant, $W_{ij}$ vanishes. In order to get a conformally invariant expression, one is then forced to replace

$$W_{ijk} \to \widetilde W_{ijk} = W_{ijk} - \frac{1}{n} \left( W_{ij} - W_{ik} + W_{jk} \right) .$$

It is then clear that $\sum_k \widetilde W_{ijk} = 0$, so that the expression (using the notation of the previous sections)

$$\mathcal{W} = \sum_{i<j<k} S_{ijk}\, \widetilde W_{ijk}$$


is invariant under conformal transformations which do not change the order of the insertion points on the real line. In the case analyzed in the previous section, the term above (coming from the three-vertex) was supplemented with the term coming from the two-vertex,

$$\mathcal{V}(F_1, \ldots, F_n) = V + \sum_{i<j} V_{ij} .$$

We recall that the above expression is important in the case when $\tau_n$ "goes around $\infty$". In particular, when $[X^a, X^b]$ is constant, we have that

$$\mathcal{V}(F_n, F_1, \ldots, F_{n-1}) - \mathcal{V}(F_1, \ldots, F_n) = 2 \sum_{i<j<n} W_{ijn} .$$

Define

$$v(F_1, \ldots, F_n) = \frac{2}{n} \sum_{i<j<k} (i + j + k)\, \widetilde W_{ijk} .$$

A small computation shows that

$$\begin{aligned} v(F_n, F_1, \ldots, F_{n-1}) & = \frac{2}{n} \sum_{i<j<k<n} (i + j + k + 3)\, \widetilde W_{ijk} + \frac{2}{n} \sum_{i<j<n} (i + j + 3)\, \widetilde W_{ijn} \\ & = v(F_1, \ldots, F_n) - 2 \sum_{i<j<n} \widetilde W_{ijn} , \end{aligned}$$

where we have used the fact that $\sum_{i<j<k} \widetilde W_{ijk} = 0$. One can then consider the combination:

$$\mathcal{V}(F_1, \ldots, F_n) + v(F_1, \ldots, F_n) . \tag{29}$$
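The behavior of a weighted sum of this type under the cyclic shift $(F_1, \ldots, F_n) \to (F_n, F_1, \ldots, F_{n-1})$ can be checked on random data. The sketch below is an illustration under stated assumptions: `w3` is any totally antisymmetric tensor obeying $\sum_k w_{ijk} = 0$ (generated here by contracting a random antisymmetric four-index tensor), the weighted sum is taken to be $\frac{2}{n} \sum_{i<j<k} (i + j + k)\, w_{ijk}$ with 1-based positions, and the shift is implemented by relabeling positions. All names are hypothetical.

```python
import itertools
import random


def permutation_sign(indices):
    """Sign of the permutation that sorts a tuple of distinct integers."""
    inversions = sum(
        1
        for a in range(len(indices))
        for b in range(a + 1, len(indices))
        if indices[a] > indices[b]
    )
    return -1 if inversions % 2 else 1


def constrained_w3(n, rng):
    """Antisymmetric w3 on labels 1..n with sum_k w3[i, j, k] = 0."""
    labels = range(1, n + 1)
    w4 = {idx: 0.0 for idx in itertools.product(labels, repeat=4)}
    for combo in itertools.combinations(labels, 4):
        value = rng.uniform(-1.0, 1.0)
        for perm in itertools.permutations(combo):
            w4[perm] = permutation_sign(perm) * value
    return {
        idx: sum(w4[idx + (l,)] for l in labels)
        for idx in itertools.product(labels, repeat=3)
    }


def weighted_sum(w3, n, label_at):
    """(2/n) sum over positions i<j<k of (i+j+k) w3 at the labels sitting there."""
    return 2.0 / n * sum(
        (i + j + k) * w3[label_at(i), label_at(j), label_at(k)]
        for i, j, k in itertools.combinations(range(1, n + 1), 3)
    )


n = 6
w3 = constrained_w3(n, random.Random(3))
v_original = weighted_sum(w3, n, lambda p: p)
# After the shift, position 1 holds F_n and position p > 1 holds F_{p-1}.
v_shifted = weighted_sum(w3, n, lambda p: n if p == 1 else p - 1)
boundary_term = sum(w3[i, j, n] for i, j in itertools.combinations(range(1, n), 2))
```

The difference `v_shifted - v_original` then equals `-2 * boundary_term` up to rounding, which is the shift property quoted in the text.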

The previous discussion implies that expression (29), in the case of constant $[X^a, X^b] = i\theta^{ab}$, is a cyclic function in the arguments $F_1, \ldots, F_n$. In general, though, the above need not be cyclic. We can nonetheless construct the correct generalization, $\mathbf{V}$, of $\mathcal{V}(F_1, \ldots, F_n)$ by cyclically symmetrizing. In particular, if we define

$$\mathbf{V}(F_1, \ldots, F_n) = \frac{1}{n} \left[ \mathcal{V}(F_1, \ldots, F_n) + v(F_1, \ldots, F_n) + \text{cyclic}_{1 \cdots n} \right] - v(F_1, \ldots, F_n) ,$$

then this satisfies

$$\mathbf{V}(F_n, F_1, \ldots, F_{n-1}) - \mathbf{V}(F_1, \ldots, F_n) = 2 \sum_{i<j<n} \widetilde W_{ijn}$$

and, following the same arguments as in Sect. 4.4, we have restored conformal invariance. Therefore the final result for the n-point function is given by:

$$\mathbf{V} + \sum_{i<j<k} S_{ijk}\, \widetilde W_{ijk} .$$

5. Nonassociative Deformations of World-volumes

We are now in a position to show the importance of the Kontsevich product $\bullet$ in the above construction. In this section we shall first discuss in some detail Kontsevich products defined starting from various different bi-vector fields (in Sect. 5.1), and then see how one can reinterpret, in this framework, the results of the last section (in Sects. 5.2 and 5.3). Let us start the discussion by considering the simplest case when $\omega = B + F$ is constant. We are then considering the standard Moyal product deformation of the brane world-volume, which is described in [3, 5]. Physically, it corresponds to the embedding of a flat brane in a flat background space. The relevant product is the Moyal star product, given by the formula

$$(f \star g)(x) = \left. e^{\frac{i}{2} \theta^{ij} \partial^x_i \partial^y_j} f(x)\, g(y) \right|_{x=y} . \tag{30}$$
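On plane waves the Moyal formula reduces to a phase: for $e_k(x) = e^{i k \cdot x}$ one finds $e_k \star e_q = e^{-\frac{i}{2} \theta^{ij} k_i q_j}\, e_{k+q}$. The sketch below uses this to illustrate that the product is noncommutative but associative. Representing a plane wave as a `(coefficient, momentum)` pair, and the function name `star`, are choices made here for illustration only.

```python
import cmath


def star(f, g, theta):
    """Moyal product of two plane waves c * exp(i k.x), each stored as (c, k)."""
    (cf, k), (cg, q) = f, g
    pairing = sum(
        theta[i][j] * k[i] * q[j] for i in range(len(k)) for j in range(len(q))
    )
    # exp((i/2) theta^{ij} d_i d_j) acts on exp(i k.x) exp(i q.y) as exp(-(i/2) theta(k, q)).
    phase = cmath.exp(-0.5j * pairing)
    return (cf * cg * phase, tuple(a + b for a, b in zip(k, q)))


theta = [[0.0, 0.7], [-0.7, 0.0]]  # constant antisymmetric theta^{ij} in two dimensions
f = (1.0, (1.0, 0.0))
g = (1.0, (0.0, 2.0))
h = (1.0, (-0.5, 1.5))

left = star(star(f, g, theta), h, theta)    # (f * g) * h
right = star(f, star(g, h, theta), theta)   # f * (g * h)
fg = star(f, g, theta)
gf = star(g, f, theta)
```

Here `left` and `right` agree, while `fg` and `gf` differ by the phase $e^{-i \theta^{ij} k_i q_j}$.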

The open string parameters can be written in terms of the closed string parameters with the formulas (43), where $\theta^{ab} = -i [x^a, x^b]_\star$. In the zero slope limit [5], correlators are computed according to:

$$\left\langle \prod_{i=1}^{n} f_i(X(\tau_i)) \right\rangle = \int \sqrt{\det \omega}\; d^{p+1}x\; f_1 \star \cdots \star f_n .$$

Now let us consider the case when $\omega(x)$ is no longer constant, but $d\omega = 0$. Then, $\omega$ still defines a symplectic structure on the brane world-volume. Physically, this corresponds to embeddings of a curved brane in a flat background space, as can be most easily seen from the Matrix theory point of view (for example this is described, in the context of holomorphic curves in flat space, in [31, 32]). Recall, in fact, that the $F$ field represents the zero-brane density on a two-brane, such that

$$N = \frac{1}{2\pi} \int_S F$$

is the total number of zero-branes. For static solutions $F$ is proportional to the area element, and is therefore no longer constant with respect to the Euclidean coordinates of the flat background. The zero-brane density varies along the two-brane, which in turn effectively amounts to building a curved M2-brane in the flat space background. From the $\sigma$-model point of view, the case of $d\omega = 0$ is very similar to the constant one (after all, all symplectic structures are locally related by a coordinate change), but now the Moyal star product is replaced by Kontsevich's formula [30], as shown in detail in [29]. Then the star product is (we denote the noncommutative parameter by $\alpha^{ab}(x)$ here),

$$\begin{aligned} f \star g = {} & f g + \frac{i}{2} \alpha^{ab}\, \partial_a f\, \partial_b g - \frac{1}{8} \alpha^{ac} \alpha^{bd}\, \partial_a \partial_b f\, \partial_c \partial_d g \\ & - \frac{1}{12} \alpha^{ad} \partial_d \alpha^{bc} \left( \partial_a \partial_b f\, \partial_c g - \partial_b f\, \partial_a \partial_c g \right) + O(\alpha^3) , \end{aligned} \tag{31}$$

while open string parameters are still given by the same formulas. Finally, correlators are computed in the $\alpha' \to 0$ limit as follows:

$$\left\langle \prod_{i=1}^{n} f_i(X(\tau_i)) \right\rangle = \int V(\omega)\, d^{p+1}x \left( f_1 \star \cdots \star f_n \right) .$$


The situation we analyze in detail in this paper is when $H = d\omega \neq 0$. The target is then no longer flat and one is thus embedding a curved brane in a curved background. At first sight it seems that, since one no longer has a symplectic manifold (and therefore a Poisson structure), one can no longer identify the correct algebraic structure, if any, which controls the deformation in this case. The result we have obtained is that this is not the case. As we will explain at length in this section, the Kontsevich formula is still relevant in the description of the physics. Indeed, we find that the deformation is still given by the Kontsevich star product expansion, as written in coordinates (we shall now denote the star product by $\bullet$ and the inverse two-form by $\tilde\alpha$ in order to distinguish the two cases),

$$\begin{aligned} f \bullet g = {} & f g + \frac{i}{2} \tilde\alpha^{ab}\, \partial_a f\, \partial_b g - \frac{1}{8} \tilde\alpha^{ac} \tilde\alpha^{bd}\, \partial_a \partial_b f\, \partial_c \partial_d g \\ & - \frac{1}{12} \tilde\alpha^{ad} \partial_d \tilde\alpha^{bc} \left( \partial_a \partial_b f\, \partial_c g - \partial_b f\, \partial_a \partial_c g \right) + O(\tilde\alpha^3) . \end{aligned} \tag{32}$$

The difference now is that the star product is no longer associative. Therefore, when in curved backgrounds, the brane world-volume is deformed not only through a noncommutative parameter ($\tilde\alpha = \tilde\omega^{-1}$), but also through a nonassociative parameter which, as we shall see, is essentially $H = d\tilde\omega$. Again, open string parameters are given by the same formulas as in the Moyal case (as will be later shown in Sect. 6). As we have seen in the previous section, correlators in the topological limit $g_{ab} \to 0$ require detailed analysis. The results can again be written, as we shall show in this section, in terms of $\bullet$ and the general formula will still be

$$\left\langle \prod_{i=1}^{n} f_i(X(\tau_i)) \right\rangle \sim \int V(\tilde\omega)\, d^{p+1}x \left( f_1 \bullet \cdots \bullet f_n \right) .$$

On the other hand, due to the non-associativity of $\bullet$, one has to define precisely what one means by the RHS of the above equation, which now depends explicitly on the moduli of the insertion points $\tau_i$.

5.1. Nonassociative star products. We shall now study the properties of the nonassociative Kontsevich star product $\bullet$. In the last part of this section we will work in the $g_{ab} \to 0$ limit, so that $\alpha = \omega^{-1}$, $\tilde\alpha = \tilde\omega^{-1}$. Also, unless explicitly needed, we shall drop the tildes. Let us start by considering the associativity properties of the Kontsevich expansion (31) or (32). To this end one needs to compute, given three generic functions $f$, $g$, $h$, the difference $(f \star g) \star h - f \star (g \star h)$. Using the expansions (31), (32), it is not difficult to show that:

$$(f \star g) \star h - f \star (g \star h) = \frac{1}{6} \left( \alpha^{i\ell} \partial_\ell \alpha^{jk} + \alpha^{j\ell} \partial_\ell \alpha^{ki} + \alpha^{k\ell} \partial_\ell \alpha^{ij} \right) \partial_i f\, \partial_j g\, \partial_k h + O(\alpha^3) . \tag{33}$$

If $\alpha$ is a Poisson structure, i.e., satisfies

$$\alpha^{i\ell} \partial_\ell \alpha^{jk} + \alpha^{j\ell} \partial_\ell \alpha^{ki} + \alpha^{k\ell} \partial_\ell \alpha^{ij} = 0 ,$$


then the associated product is associative (in fact to all orders). Note that, when $\alpha$ is invertible, the above equation is equivalent to $d\alpha^{-1} = d\omega = 0$, so that the expansion (31) defines an associative product. Now consider the product (32). In this case one has both a noncommutative deformation, with parameter $\tilde\alpha$, and a nonassociative deformation with parameter $H = d\tilde\omega$. To better understand this point let us re-write expression (33) in terms of the 3-form field $H$. Indeed, using that $\partial_k \tilde\alpha^{ij} = \tilde\alpha^{ia} \tilde\alpha^{jb} \partial_k \tilde\omega_{ab}$, one can rewrite (33) as

$$(f \bullet g) \bullet h - f \bullet (g \bullet h) = \frac{1}{6} \tilde\alpha^{ia} \tilde\alpha^{jb} \tilde\alpha^{kc} H_{abc}\, \partial_i f\, \partial_j g\, \partial_k h + \cdots . \tag{34}$$

Precisely because we have $H \neq 0$, the star product (32) is not associative. We then have two products, $\star$ and $\bullet$, given by the Kontsevich expansion in terms of $\omega_{ab}(x)$ and $\tilde\omega_{ab}(x) = \omega_{ab}(x) + \frac{1}{3} H_{abc} x^c$, respectively. We wish to explicitly relate the product $\bullet$ to the associative product $\star$. First, it is clear that

$$\tilde\alpha^{ij} = \alpha^{ij} + \frac{1}{3} \alpha^{ai} \alpha^{bj} H_{abc}\, x^c + \cdots .$$
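The expansion of $\tilde\alpha$ follows from inverting $\tilde\omega$ to first order in $H$. As a short sketch, assuming the convention $\alpha^{ia} \omega_{ab} = \delta^i_b$ and writing $\delta\omega_{ab} = \frac{1}{3} H_{abc} x^c$ (a shorthand introduced here):

$$\tilde\alpha = (\omega + \delta\omega)^{-1} = \alpha - \alpha\, \delta\omega\, \alpha + O(\delta\omega^2) ,$$

so that, in components and using the antisymmetry $\alpha^{ia} = -\alpha^{ai}$,

$$\tilde\alpha^{ij} = \alpha^{ij} - \alpha^{ia}\, \delta\omega_{ab}\, \alpha^{bj} + \cdots = \alpha^{ij} + \frac{1}{3} \alpha^{ai} \alpha^{bj} H_{abc}\, x^c + \cdots .$$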

Therefore one has $f \bullet g = f \star g + \frac{i}{6} \alpha^{ai} \alpha^{bj} H_{abc}\, x^c\, \partial_i f\, \partial_j g + \cdots$. Recalling that $[x^a, f]_\star = i \alpha^{ai} \partial_i f + \cdots$, it is not hard to show

$$f \bullet g = f \star g - \frac{i}{12} H_{abc} \left\{ x^c , [x^a, f]_\star \star [x^b, g]_\star \right\}_\star .$$

One can check the correctness of the above formula by expanding Eq. (32) to order $\alpha^2$. It is moreover convenient, as in Sect. 4.5, to move to operator notation for the associative product $\star$. Therefore the above expression for the product $\bullet$ can be compactly written as

$$F \bullet G = F G - \frac{i}{12} H_{abc} \left\{ X^c , [X^a, F] [X^b, G] \right\} . \tag{35}$$

It is then simple to show that

$$(F \bullet G) \bullet H - F \bullet (G \bullet H) = -\frac{i}{6} H_{abc} [X^a, F] [X^b, G] [X^c, H] ,$$

which is the generalization of (34) to all orders in $\alpha$. Let us now take the functions $f$, $g$ and $h$ to be the local coordinate functions $x^i$ in $\mathbb{R}^n$. By direct use of the nonassociative Kontsevich formula (32) one obtains,

$$x^i \bullet x^j = x^i x^j + \frac{i}{2} \tilde\alpha^{ij}(x) , \tag{36}$$

and from (34),

$$\left( x^i \bullet x^j \right) \bullet x^k - x^i \bullet \left( x^j \bullet x^k \right) = \frac{1}{6} \tilde\alpha^{ia} \tilde\alpha^{jb} \tilde\alpha^{kc} H_{abc} . \tag{37}$$

Calculating the star bracket commutator (making use of (36)) one obtains the noncommutative algebra, $[x^i, x^j]_\bullet = i \tilde\alpha^{ij}(x)$, which is a very similar result to the standard Kontsevich deformation. On the other hand, in order to compute the Jacobi expression, one uses (37) to obtain:

$$[x^i, [x^j, x^k]_\bullet]_\bullet + [x^j, [x^k, x^i]_\bullet]_\bullet + [x^k, [x^i, x^j]_\bullet]_\bullet = -\tilde\alpha^{ia} \tilde\alpha^{jb} \tilde\alpha^{kc} H_{abc} ,$$

which is a violation of the Jacobi identity.
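The step from the associator (37) to the Jacobi expression relies on a purely algebraic fact: for any bilinear product, the Jacobiator of the commutator equals minus the signed sum of associators over all orderings. When the associator is totally antisymmetric this gives a factor of $-6$, which turns the $1/6$ in (37) into the $-1$ above. The sketch below checks the general identity on a random nonassociative product defined by structure constants; it is an illustration with hypothetical helper names, not a model of the Kontsevich product itself.

```python
import itertools
import random


def make_product(dim, rng):
    """Random bilinear (generally nonassociative) product on R^dim."""
    c = [
        [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
        for _ in range(dim)
    ]

    def product(a, b):
        return [
            sum(c[i][j][k] * a[i] * b[j] for i in range(dim) for j in range(dim))
            for k in range(dim)
        ]

    return product


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def associator(p, a, b, c):
    """(a b) c - a (b c)."""
    return sub(p(p(a, b), c), p(a, p(b, c)))


def commutator(p, a, b):
    return sub(p(a, b), p(b, a))


def jacobiator(p, a, b, c):
    """[a,[b,c]] + [b,[c,a]] + [c,[a,b]]."""
    terms = [
        commutator(p, a, commutator(p, b, c)),
        commutator(p, b, commutator(p, c, a)),
        commutator(p, c, commutator(p, a, b)),
    ]
    return add(add(terms[0], terms[1]), terms[2])


def signed_associator_sum(p, a, b, c):
    """Sum over orderings of sign(ordering) * associator."""
    elements = [a, b, c]
    total = [0.0] * len(a)
    for perm in itertools.permutations(range(3)):
        inversions = sum(
            1 for x in range(3) for y in range(x + 1, 3) if perm[x] > perm[y]
        )
        sign = -1 if inversions % 2 else 1
        term = associator(p, *(elements[i] for i in perm))
        total = [t + sign * v for t, v in zip(total, term)]
    return total


rng = random.Random(4)
dim = 4
product = make_product(dim, rng)
a, b, c = ([rng.uniform(-1, 1) for _ in range(dim)] for _ in range(3))
lhs = jacobiator(product, a, b, c)
rhs = [-v for v in signed_associator_sum(product, a, b, c)]
max_error = max(abs(x - y) for x, y in zip(lhs, rhs))
```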


5.2. Operator product expansions and factorization. In the previous subsection we have reviewed the basic properties of the nonassociative Kontsevich product $\bullet$. We may now use the general results of Sect. 4.5 in order to show the relevance of the algebraic operation $\bullet$ in the computation of n-point functions. Let us then first summarize the results of Sect. 4. We have constructed n-point functions $P_n[F_1, \ldots, F_n]$ which depend uniquely on the $n - 3$ conformal moduli of the insertion points $\tau_i$ of the functions $F_i$. In particular, the one-point function $P_1$, which we shall call $P$ in the sequel, is a generalization of the trace, $\mathrm{Tr}$, and is given by:

$$P_1[F] = P[F] = \mathrm{Tr}(F) - \frac{2i}{3} H_{abc}\, \mathrm{Tr} \left( X^a X^b X^c F \right) .$$

Now consider a general n-point function for functions $F_i$ at points $\tau_i$. Let us scale the insertion points $\tau_i \to \varepsilon \tau_i$ for $\varepsilon \to 0$. On one hand the result of the n-point function does not change, since it is invariant under $SL(2, \mathbb{R})$ transformations. On the other hand we can use an OPE argument to conclude that there must exist a function, $O_n[F_1, \ldots, F_n](\tau_i)$, such that

$$P_n[F_1, \ldots, F_n](\tau_i) = P \left[ O_n[F_1, \ldots, F_n](\tau_i) \right] . \tag{38}$$

The operations $O_n[F_1, \ldots, F_n](\tau_i)$ are, informally, untraced versions of the $P_n$'s, and are invariant under the subgroup of $SL(2, \mathbb{R})$ which leaves the point at $\infty$ invariant, i.e., translations and rescalings. They will depend on $n - 2$ moduli. In particular one can now see the relevance of the $\bullet$ product, which is nothing but the operation $O_2$. More precisely, one can check that

$$O_1[F](\tau) = F , \qquad O_2[F, G](\tau_1, \tau_2) = F \bullet G ,$$

where, for the second expression, it is simple to use its explicit expansion, (35), and insert it in (38) in order to check that one does get the right result, $P_2[F, G]$, as derived in Sect. 4.5. With a little more work, and using the results on the n-point functions of Sect. 4 and the facts on the $\bullet$ product of the previous subsection, one can likewise obtain,

$$O_3[F, G, H](\tau_i) = L(1 - m)\, (F \bullet G) \bullet H + L(m)\, F \bullet (G \bullet H) , \qquad m = \frac{\tau_{21}}{\tau_{31}} .$$

In particular $O_3$, which depends on a single modulus, is explicitly written in terms of the product $\bullet$ and interpolates between the two possible positionings of the parentheses. More generally, the operations $O_n$ will depend on $n - 2$ moduli and will interpolate between the various possible ways of taking products of $n$ functions with the $\bullet$ product. OPE arguments can be used, in the general case, to compute n-point functions at the boundary of the moduli space of the insertion points $\tau_i$. Again consider an n-point function with functions $F_i$ at points $\tau_i$. Let a subset of the points, say $\tau_1, \ldots, \tau_m$, converge to zero via a common rescaling $\tau_i \to \varepsilon \tau_i$, $i = 1, \ldots, m$. Then one can use an OPE argument to show that (the indices $i$ and $j$ indicate the two sets $1, \ldots, m$, and $m + 1, \ldots, n$, respectively)

$$\lim_{\varepsilon \to 0} P_n[F_1, \ldots, F_n](\varepsilon \tau_i, \tau_j) = P_{n-m+1} \left[ O_m[F_1, \ldots, F_m](\tau_i), F_{m+1}, \ldots, F_n \right](0, \tau_j) . \tag{39}$$


For example, one can show that

$$P[(F \bullet G) \bullet H] = P[F \bullet (G \bullet H)] .$$

This follows from applying (39) and recalling the fact that the three-point function $P_3[F, G, H](\tau_1, \tau_2, \tau_3)$ is independent of the moduli. If one considers the two limits $\tau_2 \to \tau_1$ and $\tau_2 \to \tau_3$, and uses factorization with $O_2 \sim \bullet$, one quickly arrives at the above result.

5.3. The homotopy associative algebraic structure. We have seen that the operations $O_n$ define, as a function of the modular parameters, a structure which extends that of an associative algebra. In fact, the failure of the product $\bullet \sim O_2$ to be associative is measured by $O_3$, which now interpolates (thanks to the modular parameter $0 < m < 1$) between the two possible "placements" of the parentheses. In a very crude sense, the nonassociativity at each order is controlled by higher order terms. These types of structures have appeared in the literature on string theory, starting from the use in string field theory of the Batalin–Vilkovisky formalism to quantize gauge theories which do not close off-shell [34–37]. These are the $A_\infty$ homotopy associative algebras, where the failure of the associativity property is controlled by a third order term, and similarly at higher orders [33]. Let us formalize these concepts a bit further, and show that the structure $C^\infty(M)$ and $O_n[F_1, \ldots, F_n](\tau)$ is actually that of an $A_\infty$ space$^5$. The idea of an $A_\infty$ space is the same as that of an $A_\infty$ algebra, only the definition of homotopy is changed (one uses a map instead of a differential). So we first follow the original work of Stasheff [33] and recall the definition of homotopy associativity. The intuitive notion is the following. A space $X$ and a multiplication $m : X \times X \to X$ is a homotopy associative space if the maps $m(1 \otimes m)$ and $m(m \otimes 1)$ are homotopic as maps $X \times X \times X \to X$.
If we are given three functions, $F_1$, $F_2$ and $F_3$, with the nonassociative product $\bullet$ there are two distinct ways to insert parentheses in the natural application $C^\infty(M) \times C^\infty(M) \times C^\infty(M) \to C^\infty(M)$, i.e., the standard options $(F_1 \bullet F_2) \bullet F_3$ and $F_1 \bullet (F_2 \bullet F_3)$. But a quick reminder of the previous section also tells us that there is a homotopy, $O_3[F_1, F_2, F_3](m) : [0, 1] \times C^\infty(M)^{\times 3} \to C^\infty(M)$, between these two seemingly distinct ways to associate brackets under the $\bullet$ product. In order to realize that there are stronger conditions of "associativity modulo homotopy" than the previous one, let us proceed by analyzing the situation with four functions, $F_1, \ldots, F_4$. There are now five distinct ways to insert parentheses for the nonassociative product,

$$(F_1 \bullet F_2) \bullet (F_3 \bullet F_4), \quad ((F_1 \bullet F_2) \bullet F_3) \bullet F_4, \quad F_1 \bullet (F_2 \bullet (F_3 \bullet F_4)), \quad (F_1 \bullet (F_2 \bullet F_3)) \bullet F_4, \quad F_1 \bullet ((F_2 \bullet F_3) \bullet F_4),$$

which can actually be pictorially written at the vertices of a pentagon. The point is now that while the $O_3$ homotopy naturally yields homotopies that run between the vertices, it is not necessarily true that one can extend the homotopy to the interior of the pentagon. If one cannot extend the homotopy to this situation, the algebraic structure of the product is denoted $A_3$. If, on the other hand, one can extend the homotopy to the interior of the whole pentagon, the algebraic structure is denoted $A_4$. As we go further along this way

$^5$ Here we take $M$ to be the brane world-volume.


one is led to consider higher polyhedra, and if one can always extend homotopies to the interior of these polyhedra, then the algebraic structure is $A_\infty$ homotopy associative [33]. Let us illustrate these concepts by explicitly writing down the $O_4[F_1, \ldots, F_4](x, y)$ operation. One can compute it to be

$$\begin{aligned} O_4[F_1, \ldots, F_4](x, y) = {} & L\!\left( \left( 1 - \frac{x}{y} \right) \left( 1 - \frac{1 - y}{1 - x} \right) \right) (F_1 \bullet F_2) \bullet (F_3 \bullet F_4) \\ & + L\!\left( \left( 1 - \frac{x}{y} \right) \frac{1 - y}{1 - x} \right) ((F_1 \bullet F_2) \bullet F_3) \bullet F_4 \\ & + L\!\left( \frac{x}{y} \left( 1 - \frac{1 - y}{1 - x} \right) \right) F_1 \bullet (F_2 \bullet (F_3 \bullet F_4)) \\ & + L\!\left( \frac{x}{y} (1 - y) \right) (F_1 \bullet (F_2 \bullet F_3)) \bullet F_4 \\ & + L\!\left( x\, \frac{1 - y}{1 - x} \right) F_1 \bullet ((F_2 \bullet F_3) \bullet F_4) , \end{aligned}$$

$$x = \frac{\tau_{21}}{\tau_{41}} , \qquad y = \frac{\tau_{31}}{\tau_{41}} ,$$

where $0 < x < y < 1$. At first sight one would say that $\{x, y\}$ take values in a triangle. However, a glance at the expression above also tells us that while one of the vertices of this triangle, $\{x, y\} = \{0, 1\}$, is perfectly regular, the other two, $\{x, y\} = \{0, 0\}$ and $\{x, y\} = \{1, 1\}$, are actually singular. Each of these singular points can actually be resolved into two distinct limits, once we scale $x$ and $y$ in the two possible different ways. For instance, the limit where $\{x, y\} \to \{0, 0\}$ can be approached with both $x$ and $y$ scaling as $\varepsilon \to 0$, with $x/y \to 1$, or with $x$ scaling as $\varepsilon^2$ and $y$ as $\varepsilon$, with $x/y \to 0$. A similar situation occurs for the limit where $\{x, y\} \to \{1, 1\}$. So, the resolution of the singular vertices of the triangle actually produces the expected pentagon. Once this is realized, it is simple to see that $O_4$ plays the role of the $A_4$ homotopy, $O_4[F_1, \ldots, F_4](x, y) : \mathcal{P} \times C^\infty(M)^{\times 4} \to C^\infty(M)$, where $\mathcal{P}$ is the pentagon spanned by $x$ and $y$. Observe that, as explained in the previous section, $P[O_4] = P_4$. It is then very natural to conjecture that general n-point functions will thus produce the necessary homotopies in order that $\left( C^\infty(M), \bullet, \{O_n\}_{n=2}^{\infty} \right)$ is an $A_\infty$ homotopy associative algebra (and where $P_n = P[O_n]$). The homotopies $O_n[F_1, \ldots, F_n](\tau)$ also induce the necessary homotopies to create an $L_\infty$ commutator homotopy Lie algebra, and to create homotopy differential operators (which may contain non-trivial topological information for the tensor bundles of $M$). Indeed, the commutator algebra, $[x^i, x^j]_\bullet$, is an $L_\infty$ homotopy Lie algebra: using the basic homotopy, $O_3[F, G, H](m)$, one can define a "composite" homotopy between zero and $-\frac{i}{6} H_{abc} [X^a, F] [X^b, G] [X^c, H]$, the term that violates Jacobi's identity in the $\bullet$ commutator algebra. With this homotopy, Jacobi's identity will be satisfied up to homotopy, and one thus obtains an $L_\infty$ homotopy Lie algebra. In order to build a differential structure (and thus, gauge theory) one still needs a covariant derivative in the sense that $\nabla(F \bullet G) = \nabla F \bullet G + F \bullet \nabla G$. While this may not seem to be a viable course of action, one can use homotopy to impose the Leibnitz rule: the derivative operation $\nabla_X F \sim [X, F]$ will satisfy the Leibnitz rule up to homotopy. Again using the basic homotopy one can define a composite homotopy between $[X, F \bullet G]$ and


[X, F] • G + F • [X, G], so that the commutator [X, F] becomes a homotopy derivative for the • product.

6. Corrections Involving the Metric Tensor

Up to now we have studied the topological limit $g_{ab} \to 0$ in great detail. In these last sections we shall discuss corrections to the above results, when one includes a nonvanishing closed string metric $g_{ab}$ in the calculations. This will allow us to identify the open string effective parameters, the metric $G^{ab}$ and the noncommutative parameter $\theta^{ab}$, in terms of the closed string parameters $g_{ab}$ and $B_{ab}$. Recall that the two-point function of the fluctuating field ζ is

$$\langle \zeta^a(z)\,\zeta^b(\tau)\rangle = \frac{i}{\pi}\,\theta^{ab} A(z,\tau) - \frac{1}{\pi}\,G^{ab} B(z,\tau),$$

where

$$A(z,\tau) = \frac{1}{2i}\ln\frac{\tau-\bar z}{z-\tau}, \qquad B(z,\tau) = \ln|z-\tau|,$$

and where we have placed the point τ at the world-sheet boundary ∂Σ. In the previous sections we have worked only with the term in A(z, τ). Here, we will evaluate the two-point function $P_2(x^a, x^b)$ including the contribution arising from the term in B(z, τ). The only diagrammatic contribution still comes from the two-graph V, (14), so that one can easily compute the relevant Feynman diagrams. First, observe that the contribution of the B(z, τ) term to the volume form V(ω) is proportional to $H_{abc}G^{ab}$ and therefore vanishes due to the antisymmetry of $H_{abc}$. Thus, the volume form is unchanged. Schematically, the propagator looks like A − B. In the previous sections we computed the correction to the two-point function going like $A^2$, with the result

$$\frac{i}{3\pi}\,\theta^{ia}\theta^{jb}H_{abc}x^c\,A(\tau_1,\tau_2),$$

therefore yielding a correction to the noncommutative θ parameter as

$$\theta^{ij} \to \theta^{ij} + \frac{1}{3}\,\theta^{ia}\theta^{jb}H_{abc}x^c. \tag{40}$$

To this result we now add the correction to the two-point function which goes as $B^2$,

$$-\frac{i}{3\pi}\,G^{ia}G^{jb}H_{abc}x^c\,A(\tau_1,\tau_2).$$

These two results produce the full correction to the noncommutative parameter,

$$\theta^{ij} \to \theta^{ij} - \frac{1}{3}\,G^{ia}G^{jb}H_{abc}x^c + \frac{1}{3}\,\theta^{ia}\theta^{jb}H_{abc}x^c. \tag{41}$$

Finally, there are also mixed corrections going as AB (and also the “symmetric” BA). They are

$$-\frac{1}{3\pi}\,\theta^{ia}G^{jb}H_{abc}x^c\,B(\tau_1,\tau_2),$$

plus the symmetric contribution in i and j. These contributions yield the correction to the effective open string metric,

$$G^{ij} \to G^{ij} - \frac{1}{3}\,\theta^{ia}G^{jb}H_{abc}x^c + \frac{1}{3}\,G^{ia}\theta^{jb}H_{abc}x^c. \tag{42}$$

All these results can be nicely combined with the flat-space formulas which connect closed string and open string parameters [3, 5], to yield formulas of the same form in this curved background scenario.

6.1. Open string parameters. Let us first recall how, in the flat case, one relates open and closed string parameters [3, 5]:

$$\frac{1}{G} = \frac{1}{g+\omega}\,g\,\frac{1}{g-\omega}, \qquad \theta = -\frac{1}{g+\omega}\,\omega\,\frac{1}{g-\omega}. \tag{43}$$

Here, ω = B + F is constant, $G^{ij}$ is the metric effectively seen by the open strings and $\theta^{ij}$ is the noncommutativity parameter on the brane world-volume (we never consider noncommutativity along the time direction), $[x^i, x^j] = i\theta^{ij}$. Let us now consider the above formulae (43) with ω replaced with the curved background expression,

$$\omega_{ab}(x) = \omega_{ab} + \frac{1}{3}H_{abc}x^c,$$

and let us expand (43) to first order in H. Denoting by $\tilde G$ and $\tilde\theta$ the “curved” open string parameters, one obtains:

$$\frac{1}{\tilde G} + \tilde\theta = \frac{1}{g+\omega+\frac{1}{3}Hx} \simeq \frac{1}{g+\omega} - \frac{1}{g+\omega}\,\frac{1}{3}Hx\,\frac{1}{g+\omega}$$

$$= \left(\frac{1}{G} - \frac{1}{G}\,\frac{1}{3}Hx\,\theta - \theta\,\frac{1}{3}Hx\,\frac{1}{G}\right) + \left(\theta - \frac{1}{G}\,\frac{1}{3}Hx\,\frac{1}{G} - \theta\,\frac{1}{3}Hx\,\theta\right).$$

It is clear that we have just obtained the previous results (42) and (41). Therefore, formulas (43) are still valid in the curved background situation, but now the fields are taken to be varying fields rather than constant fields. In other words, formulas (43) are still valid for the weakly varying non-closed gauge invariant two-form ω. In particular, in the zero slope α′ → 0 limit of [5], the effective open string parameters are given by

$$\frac{1}{G} = -\frac{1}{\omega}\,g\,\frac{1}{\omega}, \qquad \theta = \frac{1}{\omega}. \tag{44}$$
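The relations (43) and their zero slope limit (44) can be spot-checked numerically. The following is a minimal sketch for 2×2 matrices, assuming a symmetric g and an antisymmetric ω; the helper names (`mat_mul`, `mat_add`, `mat_inv`, `open_string_parameters`) are hypothetical and chosen only for this illustration. It checks that 1/G + θ = 1/(g + ω), which is the identity used in the first-order expansion above.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_add(a, b, s=1.0):
    """Return a + s*b for 2x2 matrices."""
    return [[a[i][j] + s * b[i][j] for j in range(2)] for i in range(2)]


def mat_inv(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def open_string_parameters(g, w):
    """Evaluate (43): returns (1/G, theta) for closed string data (g, omega)."""
    plus = mat_inv(mat_add(g, w))          # 1/(g + omega)
    minus = mat_inv(mat_add(g, w, -1.0))   # 1/(g - omega)
    g_open_inv = mat_mul(mat_mul(plus, g), minus)
    theta = [[-x for x in row] for row in mat_mul(mat_mul(plus, w), minus)]
    return g_open_inv, theta


# Illustrative (made-up) closed string data: symmetric g, antisymmetric omega.
g = [[1.0, 0.2], [0.2, 2.0]]
w = [[0.0, 0.5], [-0.5, 0.0]]
g_open_inv, theta = open_string_parameters(g, w)
```

In this sketch `g_open_inv` comes out symmetric and `theta` antisymmetric, and rescaling `g` by a small factor drives `theta` towards the inverse of `w`, in line with (44).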


7. Tachyons and Matrix Models in Curved Backgrounds

In this section we analyze in some detail the zero-point function, i.e., the partition function, and connect the discussion of this paper, in this simple case, to some known results in Matrix models. Let us start by considering the Born–Infeld action in the presence of a weak background field H. We shall be brief, since the arguments which follow are very well known. One starts by expanding the determinant in order to obtain

$$S = \int \sqrt{\det\left(\delta_{ab} + \frac{1}{3}H_{abc}x^c + F_{ab}\right)} \simeq \int \left(1 + \frac{1}{4}F^2 + \frac{1}{6}H_{abc}x^c F_{ab}\right).$$

We can then use the canonical correspondences $F_{ab} \to i[X^a, X^b]$ and $\int \to \mathrm{Tr}$ in order to rewrite the RHS above as

$$S \simeq \mathrm{Tr}\left(1 - \frac{1}{4}g_{ac}g_{bd}[X^a,X^b][X^c,X^d] + \frac{i}{3}H_{abc}X^aX^bX^c\right) \simeq \mathrm{Tr}\left(1 + \frac{i}{3}H_{abc}X^aX^bX^c\right) + O(g^2). \tag{45}$$

Note that the above action has been found by [10, 11] in the context of studies of branes in WZW models at large level k (that is, at small $H_{abc} \sim k^{-1/2}$) and by Myers in [38] in the context of studies of polarization of lower-dimensional branes in the presence of R–R background fields. In the sequel we will just look at the terms in the above equation which are independent of $g_{ab}$, and concentrate on the linear terms in $H_{abc}$. A special case of the results of this paper is the zero-point function, or partition function,

$$Z = P[1] = \mathrm{Tr}\left(1 - \frac{2i}{3}H_{abc}X^aX^bX^c\right). \tag{46}$$

At first sight, there is an incompatibility between Eqs. (45) and (46), since one expects that Z ∼ S. We have, on the other hand, a difference

$$S - Z \sim H_{abc}\,\mathrm{Tr}\left(X^aX^bX^c\right). \tag{47}$$

Recall though, see e.g. [58–60], that the partition function and the action need not be equal and are expected to differ by a renormalization group beta function contribution. More precisely,

$$S = \left(1 + \beta\,\frac{\partial}{\partial g}\right) Z.$$

We will show that the difference (47) is nothing but this extra term, thus resolving the apparent contradiction. Recall that the coefficient −2i/3 in Eq. (46) was fixed in Sect. 4 in order to obtain conformally invariant results. This corresponded to an ill-defined vacuum graph (which contributes to the volume form) which is linearly divergent. In this paper we have chosen a specific regularization scheme which preserves the conformal invariance of the results. This scheme, however, does not correspond to the usual minimal subtraction scheme, as we will show in a moment, and contributes a finite part to the tachyon beta function, thus explaining the difference (47).


Let us be more specific. Let us consider the more general boundary interaction $S_B$, including the tachyon field,

$$S_B = \int d\tau \left[\frac{1}{2\pi}\,T_B(X) + iA_a(X)\,\dot X^a\right].$$

In the above, $T_B$ is the bare tachyon field, given by $T_B = T + D_T$, where $D_T$ are the tachyon counterterms. Now let us consider the vacuum graph in question and let us regularize it following, e.g., the prescription of [60]. The graph then contributes (including the tachyon counterterm):

$$-D_T - \frac{i}{6}\,H_{abc}x^c \int d\tau\,\zeta^a\dot\zeta^b. \tag{48}$$

Working on the disk and regularizing [60] the result, one finds that:

$$\int d\tau\,\zeta^a\dot\zeta^b = 2i\theta^{ab}\,\frac{e^{-2\varepsilon}}{1-e^{-2\varepsilon}} = i\theta^{ab}\left(\frac{1}{\varepsilon} - 1 + O(\varepsilon)\right).$$

Therefore one obtains that Eq. (48) yields a result of

$$-D_T - \frac{i}{3}\,H_{abc}X^aX^bX^c\left(\frac{1}{\varepsilon} - 1\right),$$

where we have used that $\theta^{ab} = -i[X^a, X^b]$. The usual prescription is the one of minimal subtraction, i.e., the counterterm just cancels the pole, leaving a finite result of $\frac{i}{3}H_{abc}X^aX^bX^c$. We choose, on the other hand, a different renormalization prescription dictated by conformal invariance, which gives as a total contribution $-\frac{2i}{3}H_{abc}X^aX^bX^c$. This implies that the tachyon counterterm must be

$$D_T = -i\left(\frac{1}{3\varepsilon} - 1\right)H_{abc}X^aX^bX^c,$$

and that the corresponding beta function is

$$\beta_T = -T - H_{abc}X^aX^bX^c.$$

Following the methods of [58–60], this contribution to the beta function implies that the total action, including the tachyon potential, is given by

$$\mathrm{Tr}\left[\left(1 + T + \frac{i}{3}H_{abc}X^aX^bX^c\right)e^{-T}\right],$$

which is extremized at $T = -\frac{i}{3}H_{abc}X^aX^bX^c$. The value of the action at the extremum is then

$$\mathrm{Tr}\left(1 + \frac{i}{3}H_{abc}X^aX^bX^c\right),$$

thus showing that the difference between the partition function and the action is compensated by a condensation of the tachyon.
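Two steps of the argument above lend themselves to a quick numerical check: the small-ε expansion of the regularized loop integral, and the location and value of the extremum of the tachyon potential. The sketch below is only an illustration under simplifying assumptions: the operator $\frac{i}{3}H_{abc}X^aX^bX^c$ is replaced by a small real number `c`, and the function names are hypothetical.

```python
import math


def regulated_loop(eps):
    """The factor 2 e^{-2 eps} / (1 - e^{-2 eps}) multiplying i*theta^{ab}."""
    return 2.0 * math.exp(-2.0 * eps) / (-math.expm1(-2.0 * eps))


def potential(T, c):
    """Scalar stand-in for (1 + T + c) e^{-T}; c replaces (i/3) H X X X."""
    return (1.0 + T + c) * math.exp(-T)


def potential_slope(T, c, h=1e-6):
    """Central-difference estimate of d(potential)/dT."""
    return (potential(T + h, c) - potential(T - h, c)) / (2.0 * h)


eps = 1e-3
# Remainder after subtracting the pole and the finite part 1/eps - 1.
remainder = regulated_loop(eps) - (1.0 / eps - 1.0)

c = 0.01
slope_at_extremum = potential_slope(-c, c)   # should vanish at T = -c
value_at_extremum = potential(-c, c)         # equals e^{c} = 1 + c + O(c^2)
```

The remainder scales linearly with `eps`, consistent with the O(ε) term, and to first order in `c` the extremal value is 1 + c, the scalar analogue of the trace quoted above.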


7.1. 11-dimensional language and the 3-form field. In order to be complete, one still needs to relate the previous Matrix theory action, which is written in the 10-dimensional type IIA language, to the full M-theory 11-dimensional language. Using the 11-dimensional light-cone notation and the previous operator form of the action, we have actually built a Matrix theory action in a weakly curved background. One can use the ideas from [49, 50], and their application to curved backgrounds [61], to make precise the relation between the Matrix theory and the D-brane Born–Infeld actions. We shall now briefly look at these issues, with particular attention to the 11-dimensional 3-form field. Let us start by considering M-theory with a background metric $g_{IJ}$, in a frame with a compact coordinate $X^-$ of size R, which is light-like in the flat space limit $g_{IJ} \to \eta_{IJ}$. This theory can be described as the limit of a family of space-like compactified theories [49, 50]. Define the theory $\tilde M$ with background metric $\tilde g_{IJ}$ in a frame with a space-like compact direction $X^{10}$ of size $\tilde R$. The DLCQ limit of the original theory, M, can be found by boosting the theory $\tilde M$ in the $X^{10}$ direction with boost parameter

$$\gamma = \frac{1}{\sqrt{1-\beta^2}} = \sqrt{1 + \frac{1}{2}\left(\frac{R}{\tilde R}\right)^2},$$

and then taking the limit $\tilde R \to 0$. The 3-form field $\tilde A_{IJK}$ in the theory $\tilde M$ is related to that of the original theory M by the same Lorentz transformation as above. Moreover, the theory $\tilde M$ on a small space-like circle of radius $\tilde R$ is equivalent to type IIA string theory with background form fields given by:

$$\tilde A_{IJK}\,d\tilde X^I \wedge d\tilde X^J \wedge d\tilde X^K = C^{D2}_{\mu\nu\rho}\,dX^\mu \wedge dX^\nu \wedge dX^\rho + B_{\mu\nu}\,dX^\mu \wedge dX^\nu \wedge dX^{10}.$$

The configurations of interest in this paper carry no D0 or D2-brane charge. One will thus be left with a $B_{\mu\nu}$ background form field which will give rise to the following M-theory background 3-form field (here we take α ≡ γ(1 − β)):

$$A_{+-i} = 0, \qquad A_{ijk} = 0, \qquad A_{+ij} = \frac{1}{\sqrt{2}\,\alpha}\,B_{ij}, \qquad A_{-ij} = -\frac{\alpha}{\sqrt{2}}\,B_{ij},$$

where one should recall that we have only the space–space B-field turned on. It is interesting to observe that, at the M-theory level, the nonassociative Kontsevich deformation obtained is associated to the $A_{+ij}$ and $A_{-ij}$ components of the 11-dimensional field $A_{IJK}$ (see also [62] for the standard Moyal deformation). One is therefore led to speculate that, from a purely M-theoretic point of view, one should be able to construct deformations associated to 3-index tensor structures, which should reduce to Kontsevich type deformations for configurations considered in this paper. This is an interesting avenue to explore, as it may also aid in understanding brane world-volume deformations associated to non-zero varying R–R fields (and R–R field strengths). One thing one can say is that 3-index tensor structures will probably be naturally associated to 3-component products or 3-brackets in the sense that one can write, given some (constant) tensor $C^{ijk}$,

$$\{f, g, h\}(x) = \left. e^{\frac{i}{2}\,C^{ijk}\,\partial^x_i\,\partial^y_j\,\partial^z_k}\, f(x)\,g(y)\,h(z)\right|_{x=y=z}.$$

Structures involving 3-brackets have been previously discussed in the context of covariant Matrix theory actions [63]. It would be interesting to further explore these ideas.


8. Future Perspectives

In this paper we have shown how to use open string perturbation theory in order to describe brane physics in weakly curved backgrounds. The method described allows one to translate the properties of the curved background, which traditionally show up as sigma model couplings, into a given deformation of the algebra of functions on the brane world-volume, which depends in general on the specific closed string background considered. In particular, the presence of an NS–NS field strength H induces a nonassociative deformation of the algebra of functions. Our choice of background is of the type $R + \frac{1}{4}H^2 = 0$, and it would be interesting to further develop the disk perturbation theory in order to investigate the properties of the star product deformation to higher order. It would also be interesting to study tachyon condensation in such a background. Given that one can compute correlators using star product prescriptions (as thoroughly explained in this work), one can then use standard boundary string field techniques in order to compute the minima of the tachyon potential in this background. Another interesting point is to further study the • product. Defining gauge theory on these “nonassociative manifolds” is not straightforward. Also, given the discussion about Matrix theory in curved backgrounds, it seems clear that understanding the geometry of these “nonassociative manifolds”, much like there is a geometrical understanding of Kontsevich’s noncommutative manifolds [64–66], would be needed in order to fully understand Matrix theory in any given background. More pragmatically, we were able to map functions to matrices because we wrote everything in terms of the # product, which is associative. A question that immediately arises is whether there is a “matrix” formulation of the theory which can be written exclusively in terms of the • product.
Answering this question could be of great interest for the goal of defining Matrix theory in general curved backgrounds. This definitely requires a full understanding of the role of homotopy associative algebras. We hope to address some of these questions in the near future.

A. Dilogarithm Identities

Let us recall that the standard dilogarithmic function is defined by the following expression:

$$\mathrm{Li}_2(x) = -\int_0^x \frac{\ln(1-s)}{s}\,ds = \sum_{n=1}^{\infty}\frac{1}{n^2}\,x^n.$$

In particular $\mathrm{Li}_2(0) = 0$ and $\mathrm{Li}_2(1) = \zeta(2) = \frac{1}{6}\pi^2$. A related function, more useful for our purposes, is the Rogers normalized dilogarithm [57] L(x), defined for 0 ≤ x ≤ 1 by

$$L(x) = \frac{6}{\pi^2}\left[\mathrm{Li}_2(x) + \frac{1}{2}\ln(x)\ln(1-x)\right].$$

The Rogers dilogarithm is monotonically increasing on [0, 1] and, at the end points of the interval, is given by L(0) = 0, L(1) = 1. The function L satisfies a fundamental property, which is crucial in our computations, namely, given x, y ∈ [0, 1] the following holds:

$$L(x) + L(y) = L(xy) + L\left(\frac{x(1-y)}{1-xy}\right) + L\left(\frac{y(1-x)}{1-xy}\right). \tag{49}$$


An important special case of the above equation is Euler’s identity, L(x) + L(1 − x) = 1.

B. Computation of the Function S(x)

Let us start the computation of S(x) by analyzing an auxiliary function, which we shall denote by s(x), and which will be defined for all x ∈ R. Let us consider three points 0, 1 and x on the real line and let us denote, given a point p in the upper-half plane, with α, β and γ the angles formed by the lines p–0, p–1 and p–x with the vertical line. A little plane geometry shows that:

$$\tan\gamma = (1-x)\tan\alpha + x\tan\beta. \tag{50}$$

The function s will then be given by:

$$s(x) = \frac{4}{\pi^3}\int \gamma\; d\alpha\wedge d\beta.$$

From the geometric construction it is clear that s(1 − x) = −s(x). We also recall that the upper-half plane corresponds to the simplex $-\frac{\pi}{2} \le \alpha \le \beta \le \frac{\pi}{2}$ in the α–β plane. This fact can be used to compute special values of s,

$$s(+\infty) = -s(-\infty) = 1, \qquad s(1) = -s(0) = \frac{1}{3}, \tag{51}$$

and s(1/2) = 0. It is clear that, from the definition of S, one has for x ∈ [0, 1],

$$S(x) = s\left(\frac{1}{x}\right) - s(x) + s\left(\frac{-x}{1-x}\right). \tag{52}$$

Consider the derivative $\frac{d}{dx}s(x)$. From (50) we deduce that:

$$\frac{d}{dx}\gamma = \frac{\tan\beta - \tan\alpha}{1 + \left((1-x)\tan\alpha + x\tan\beta\right)^2},$$

so that,

$$\frac{d}{dx}s = \frac{4}{\pi^3}\int \frac{d\gamma}{dx}\; d\alpha\wedge d\beta = \frac{4}{\pi^3}\int_{-\infty<z<w<\infty} \frac{dz\,dw}{(1+z^2)(1+w^2)}\;\frac{w-z}{1 + \left((1-x)z + xw\right)^2},$$

where we have defined z = tan α, w = tan β. The above integral can be evaluated with the result:

$$\frac{d}{dx}s = -\frac{1}{\pi^2}\left[\frac{\ln\left(x^2\right)}{1-x} + \frac{\ln\left((1-x)^2\right)}{x}\right].$$

The above equation can be easily integrated by noting that $\frac{d}{dx}\mathrm{Li}_2(x) = -\frac{\ln(1-x)}{x}$. Using the boundary values (51) one obtains the following expression for s,

$$s = -\frac{1}{3} + \frac{4}{\pi^2}\left[\mathrm{Li}_2(x) + \frac{1}{2}\ln(-x)\ln(1-x)\right], \qquad (x < 0)$$

$$s = \frac{2}{\pi^2}\left[-\mathrm{Li}_2(1-x) + \mathrm{Li}_2(x)\right], \qquad (0 < x < 1)$$

$$s = \frac{1}{3} - \frac{4}{\pi^2}\left[\mathrm{Li}_2(1-x) + \frac{1}{2}\ln(x)\ln(x-1)\right]. \qquad (x > 1)$$

We finally use Eq. (52) and the fact that $\mathrm{Li}_2(1 - 1/x) = -\mathrm{Li}_2(1-x) - \frac{1}{2}(\ln x)^2$ to show that

$$S(x) = \frac{6}{\pi^2}\left(-\mathrm{Li}_2(x) + \mathrm{Li}_2(1-x)\right) = -L(x) + L(1-x) = 1 - 2L(x).$$
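The dilogarithm identities of Appendix A, and the closed form for S(x) obtained here, can be spot-checked numerically. The following is a minimal sketch, assuming arguments in [0, 1]; the function names `li2`, `rogers_L` and `five_term_gap` are hypothetical and introduced only for this illustration. The series is summed directly for x ≤ 1/2, and the standard reflection formula Li2(x) + Li2(1 − x) = π²/6 − ln(x) ln(1 − x) is used otherwise.

```python
import math


def li2(x, terms=200):
    """Dilogarithm Li2(x) for 0 <= x <= 1 (power series plus reflection)."""
    if x == 1.0:
        return math.pi ** 2 / 6.0
    if x > 0.5:
        # Reflection keeps the series argument at or below 1/2.
        return (math.pi ** 2 / 6.0
                - math.log(x) * math.log(1.0 - x)
                - li2(1.0 - x, terms))
    return sum(x ** n / n ** 2 for n in range(1, terms + 1))


def rogers_L(x):
    """Rogers normalized dilogarithm, with L(0) = 0 and L(1) = 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return (6.0 / math.pi ** 2) * (
        li2(x) + 0.5 * math.log(x) * math.log(1.0 - x))


def five_term_gap(x, y):
    """Left-hand side minus right-hand side of the five-term identity (49)."""
    d = 1.0 - x * y
    return (rogers_L(x) + rogers_L(y) - rogers_L(x * y)
            - rogers_L(x * (1.0 - y) / d) - rogers_L(y * (1.0 - x) / d))


def S(x):
    """S(x) = (6/pi^2) (Li2(1 - x) - Li2(x)) for 0 < x < 1."""
    return (6.0 / math.pi ** 2) * (li2(1.0 - x) - li2(x))
```

For sample points in the open unit interval, `five_term_gap` vanishes to rounding accuracy, `rogers_L(x) + rogers_L(1 - x)` equals 1, and `S(x)` agrees with `1 - 2 * rogers_L(x)`.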

Acknowledgements. We would like to thank P. Fonseca, G. Granja, J. Mourão, C. Nappi, J. Nunes, C. Nuñez and B. Pioline for helpful comments and/or discussions. LC is funded by a European Postdoctoral Institute fellowship. RS is supported in part by funds provided by the Fundação para a Ciência e Tecnologia, under the grant Praxis XXI BPD-17225/98 (Portugal).

References

1. Connes, A., Douglas, M.R. and Schwarz, A.: Noncommutative Geometry and Matrix Theory: Compactification on Tori. JHEP 9802, 003 (1998), hep-th/9711162
2. Cheung, Y.-K.E. and Krogh, M.: Noncommutative Geometry from D0-branes in a Background B-field. Nucl. Phys. B528, 185 (1998), hep-th/9803031
3. Schomerus, V.: D-branes and Deformation Quantization. JHEP 9906, 030 (1999), hep-th/9903205
4. Cornalba, L. and Schiappa, R.: Matrix Theory Star Products from the Born–Infeld Action. Adv. Theor. Math. Phys. (in press) hep-th/9907211
5. Seiberg, N. and Witten, E.: String Theory and Noncommutative Geometry. JHEP 9909, 032 (1999), hep-th/9908142
6. Cornalba, L.: D-Brane Physics and Noncommutative Yang–Mills Theory. Adv. Theor. Math. Phys. (in press) hep-th/9909081
7. Minwalla, S., Van Raamsdonk, M. and Seiberg, N.: Noncommutative Perturbative Dynamics. hep-th/9912072
8. Matusis, A., Susskind, L. and Toumbas, N.: The IR/UV Connection in the Non-commutative Gauge Theories. hep-th/0002075
9. Anazawa, M.: D0-branes in an H-field Background and Noncommutative Geometry. Nucl. Phys. B569, 680 (2000), hep-th/9905055
10. Alekseev, A.Y., Recknagel, A. and Schomerus, V.: Non-commutative World-volume Geometries: Branes on SU(2) and Fuzzy Spheres. JHEP 9909, 023 (1999), hep-th/9908040
11. Alekseev, A.Y., Recknagel, A. and Schomerus, V.: Brane Dynamics in Background Fluxes and Noncommutative Geometry. JHEP 0005, 010 (2000), hep-th/0003187
12. Ho, P.-M. and Yeh, Y.-T.: Noncommutative D-brane in Non-constant NS–NS B-field Background. hep-th/0005159
13. Chu, C-S., Ho, P.-M. and Kao, Y.-C.: Worldvolume Uncertainty Relations for D-branes. Phys. Rev. D60, 126003 (1999), hep-th/9904133
14. Ydri, B.: Noncommutative Geometry as a Regulator. Phys. Rev. D63, 025004 (2001), hep-th/0003232
15. Sahakian, V.: Transcribing Spacetime Data into Matrices. hep-th/0010237
16. Dasgupta, K. and Yin, Z.: Non-Abelian Geometry. hep-th/0011034
17. Bachas, C., Douglas, M. and Schweigert, C.: Flux Stabilization of D-branes. JHEP 0005, 048 (2000), hep-th/0003037


18. Alvarez-Gaumé, L., Freedman, D.Z. and Mukhi, S.: The Background Field Method and the Ultraviolet Structure of the Supersymmetric Nonlinear σ-Model. Annals of Phys. 134, 85 (1981)
19. Fradkin, E.S. and Tseytlin, A.A.: Non-Linear Electrodynamics from Quantized Strings. Phys. Lett. B163, 123 (1985)
20. Braaten, E., Curtright, T.L. and Zachos, C.K.: Torsion and Geometrostasis in Nonlinear Sigma Models. Nucl. Phys. B260, 630 (1985)
21. Dorn, H. and Otto, H.-J.: Open Bosonic Strings in General Background Fields. Z. Phys. C32, 599 (1986)
22. Abouelsaood, A., Callan, C.G., Nappi, C.R. and Yost, S.A.: Open Strings in Background Gauge Fields. Nucl. Phys. B280, 599 (1987)
23. Callan, C.G., Lovelace, C., Nappi, C.R. and Yost, S.A.: String Loop Corrections to Beta Functions. Nucl. Phys. B288, 525 (1987)
24. Tseytlin, A.A.: Renormalization of Möbius Infinities and Partition Function Representation for the String Effective Action. Phys. Lett. B202, 81 (1988)
25. Laidlaw, M.: Noncommutative Geometry from String Theory: Annulus Corrections. hep-th/0009068
26. Leigh, R.: Dirac–Born–Infeld Action from Dirichlet Sigma Model. Mod. Phys. Lett. A4, 2767 (1989)
27. Polchinski, J.: Dirichlet-Branes and Ramond–Ramond Charges. Phys. Rev. Lett. 75, 4724 (1995), hep-th/9510017
28. Witten, E.: Bound States of Strings and p-Branes. Nucl. Phys. B460, 335 (1996), hep-th/9510135
29. Cattaneo, A.S. and Felder, G.: A Path Integral Approach to the Kontsevich Quantization Formula. math.QA/9902090
30. Kontsevich, M.: Deformation Quantization of Poisson Manifolds I. q-alg/9709040
31. Cornalba, L. and Taylor IV, W.: Holomorphic Curves from Matrices. Nucl. Phys. B536, 513 (1998), hep-th/9807060
32. Cornalba, L.: Matrix Representations of Holomorphic Curves in T^4. hep-th/9812184
33. Stasheff, J.: H-Spaces from a Homotopy Point of View. Lect. Notes in Math. 161, Springer-Verlag, 1970
34. Lada, T. and Stasheff, J.: Introduction to sh Lie Algebras for Physicists. Int. Jour. Theor. Phys. 32, 1087 (1993), hep-th/9209099
35. Lada, T. and Markl, M.: Strongly Homotopy Lie Algebras. hep-th/9406095
36. Zwiebach, B.: Oriented Open–Closed String Theory Revisited. Ann. Phys. 267, 193 (1998), hep-th/9705241
37. Gaberdiel, M.R. and Zwiebach, B.: Tensor Construction of Open String Theories I: Foundations. Nucl. Phys. B505, 569 (1997), hep-th/9705038
38. Myers, R.C.: Dielectric Branes. JHEP 9912, 022 (1999), hep-th/9910053
39. Constable, N.R., Myers, R.C. and Tafjord, O.: The Noncommutative Bion Core. Phys. Rev. D61, 106009 (2000), hep-th/9911136
40. Schiappa, R.: Matrix Strings in Weakly Curved Background Fields. hep-th/0005145
41. Millar, K., Taylor, W. and Van Raamsdonk, M.: D-particle Polarizations with Multiple Moments of Higher-dimensional Branes. hep-th/0007157
42. Jevicki, A., Mihailescu, M. and Ramgoolam, S.: Hidden Classical Symmetry in Quantum Spaces at Roots of Unity: From q-sphere to Fuzzy Sphere. hep-th/0008186
43. Das, S.R., Trivedi, S.P. and Vaidya, S.: Magnetic Moments of Branes and Giant Gravitons. hep-th/0008203
44. Klusoň, J.: D-Branes from N Non-BPS D0-Branes. hep-th/0009189
45. Lozano, Y.: Noncommutative Branes from M-Theory. hep-th/0012137
46. Witten, E.: String Theory Dynamics in Various Dimensions. Nucl. Phys. B443, 85 (1995), hep-th/9503124
47. Banks, T., Fischler, W., Shenker, S. and Susskind, L.: M-Theory as a Matrix Model: A Conjecture. Phys. Rev. D55, 5112 (1997), hep-th/9610043
48. de Wit, B., Hoppe, J. and Nicolai, H.: On the Quantum Mechanics of Supermembranes. Nucl. Phys. B305, 545 (1988)
49. Sen, A.: D0-Branes on T^n and Matrix Theory. Adv. Theor. Math. Phys. 2, 51 (1998), hep-th/9709220
50. Seiberg, N.: Why is the Matrix Model Correct? Phys. Rev. Lett. 79, 3577 (1997), hep-th/9710009
51. Douglas, M.R.: D-branes and Matrix Theory in Curved Space. Nucl. Phys. Proc. Suppl. 68, 381 (1998), hep-th/9707228
52. Douglas, M.: D-Branes in Curved Space. Adv. Theor. Math. Phys. 1, 198 (1998), hep-th/9703056
53. Douglas, M., Kato, A. and Ooguri, H.: D-Brane Actions on Kähler Manifolds. Adv. Theor. Math. Phys. 1, 237 (1998), hep-th/9708012
54. Taylor IV, W. and Van Raamsdonk, M.: Supergravity Currents and Linearized Interactions for Matrix Theory Configurations with Fermionic Backgrounds. JHEP 9904, 013 (1999), hep-th/9812239
55. Dasgupta, A., Nicolai, H. and Plefka, J.: Vertex Operators for the Supermembrane. hep-th/0003280
56. Cornalba, L.: On the General Structure of the Nonabelian Born–Infeld Action. hep-th/0006018
57. Kirillov, A.N.: Dilogarithm Identities. Prog. Theor. Phys. Suppl. 118, 61 (1995), hep-th/9408113
58. Witten, E.: Some Computations in Background Independent Off-Shell String Theory. Phys. Rev. D47, 3405 (1993), hep-th/9210065


59. Shatashvili, S.L.: Comment on the Background Independent Open String Theory. Phys. Lett. B311, 83 (1993), hep-th/9303143
60. Tseytlin, A.A.: Sigma Model Approach to String Theory Effective Actions With Tachyons. hep-th/0011033
61. Taylor IV, W. and van Raamsdonk, M.: Multiple D0-Branes in Weakly Curved Backgrounds. Nucl. Phys. B558, 63 (1999), hep-th/9904095
62. Chu, C-S., Ho, P.-M. and Li, M.: Matrix Theory in a Constant C Field Background. Nucl. Phys. B574, 275 (2000), hep-th/9911153
63. Minic, D.: M-theory and Deformation Quantization. hep-th/9909022
64. Bayen, F., Flato, M., Fronsdal, C., Lichnerowicz, A. and Sternheimer, D.: Deformation Theory and Quantization: Deformations of Symplectic Structures, I & II. Annals of Phys. 111, 61 (1978)
65. Fedosov, B.: A Simple Geometrical Construction of Deformation Quantization. Jour. Diff. Geo. 40, 213 (1994)
66. Jurco, B., Schupp, P. and Wess, J.: Noncommutative Gauge Theory for Poisson Manifolds. Nucl. Phys. B584, 784 (2000), hep-th/0005005

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 225, 67 – 89 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

Quantum Teleportation and Beam Splitting

Karl-Heinz Fichtner1, Masanori Ohya2

1 Friedrich-Schiller-Universität Jena, Fakultät für Mathematik und Informatik, Institut für Angewandte Mathematik, 07740 Jena, Germany. E-mail: [email protected]
2 Department of Information Sciences, Science University of Tokyo, Noda City, Chiba 278-8510, Japan. E-mail: [email protected]

Received: 1 February 2001 / Accepted: 19 July 2001

Abstract: Following the previous paper, in which quantum teleportation is rigorously discussed with coherent entangled states given by beam splittings, we further discuss two types of models, the perfect teleportation model and the non-perfect teleportation model, in a general scheme. Then the difference among several models, i.e., the perfect models and the non-perfect models, is studied. Our teleportation models are constructed by means of coherent states in some Fock space with counting measures, so that our model can be treated in the frame of usual optical communication.

1. Introduction

Following the previous paper [12], we further discuss non-perfect teleportation. The notion of non-perfect teleportation is introduced in [12] to construct a handy (i.e., physically more realizable) teleportation, although its mathematics becomes a little more complicated. For the completeness of the present paper, we quickly review the meaning of teleportation and some basic facts of Fock space in this section. Then we discuss perfect teleportation in a very general (more general than that given in [12]) scheme with our previous results, and we state the main theorems obtained in [12] for non-perfect teleportation, both in Sect. 2. The main results of this paper are presented in Sect. 3, where we discuss the difference among three models, i.e., the perfect model, the non-perfect one given in [12] and that discussed in the present paper. The proofs of the main results are given in Sect. 4.

1.1. Quantum teleportation. The study of quantum teleportation was started in paper [3], whose scheme can be mathematically expressed in the following steps [11, 12]:

Step 0: A girl named Alice has an unknown quantum state ρ on an (N-dimensional) Hilbert space $\mathcal{H}_1$ and she is asked to teleport it to a boy named Bob.

Step 1: For this purpose, we need two other Hilbert spaces $\mathcal{H}_2$ and $\mathcal{H}_3$; $\mathcal{H}_2$ is attached to Alice and $\mathcal{H}_3$ is attached to Bob. Prearrange a so-called entangled state σ on $\mathcal{H}_2 \otimes \mathcal{H}_3$ having certain correlations and prepare an ensemble of the combined system in the state ρ ⊗ σ on $\mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_3$.

Step 2: One then fixes a family of mutually orthogonal projections $(F_{nm})_{n,m=1}^{N}$ on the Hilbert space $\mathcal{H}_1 \otimes \mathcal{H}_2$ corresponding to an observable

$$F := \sum_{n,m} z_{nm}\,F_{nm},$$

and for a fixed pair of indices n, m, Alice performs a first kind incomplete measurement, involving only the $\mathcal{H}_1 \otimes \mathcal{H}_2$ part of the system in the state ρ ⊗ σ, which filters the value $z_{nm}$, that is, after measurement on the given ensemble ρ ⊗ σ of identically prepared systems, only those where F shows the value $z_{nm}$ are allowed to pass. According to the von Neumann rule, after Alice’s measurement, the state becomes

$$\rho^{(123)}_{nm} := \frac{(F_{nm}\otimes 1)\,\rho\otimes\sigma\,(F_{nm}\otimes 1)}{\mathrm{tr}_{123}\,(F_{nm}\otimes 1)\,\rho\otimes\sigma\,(F_{nm}\otimes 1)},$$

where $\mathrm{tr}_{123}$ is the full trace on the Hilbert space $\mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_3$.

Step 3: Bob is informed which measurement was done by Alice. This is equivalent to transmitting the information that the eigenvalue $z_{nm}$ was detected. This information is transmitted from Alice to Bob without disturbance and by means of classical tools.

Step 4: Making only partial measurements on the third part of the system in the state $\rho^{(123)}_{nm}$ means that Bob will control a state $\Lambda_{nm}(\rho)$ on $\mathcal{H}_3$ given by the partial trace on $\mathcal{H}_1 \otimes \mathcal{H}_2$ of the state $\rho^{(123)}_{nm}$ (after Alice’s measurement),

$$\Lambda_{nm}(\rho) = \mathrm{tr}_{12}\,\rho^{(123)}_{nm} = \mathrm{tr}_{12}\,\frac{(F_{nm}\otimes 1)\,\rho\otimes\sigma\,(F_{nm}\otimes 1)}{\mathrm{tr}_{123}\,(F_{nm}\otimes 1)\,\rho\otimes\sigma\,(F_{nm}\otimes 1)}.$$

Thus the whole teleportation scheme given by the family $(F_{nm})$ and the entangled state σ can be characterized by the family $(\Lambda_{nm})$ of channels from the set of states on $\mathcal{H}_1$ into the set of states on $\mathcal{H}_3$ and the family $(p_{nm})$ given by

$$p_{nm}(\rho) := \mathrm{tr}_{123}\,(F_{nm}\otimes 1)\,\rho\otimes\sigma\,(F_{nm}\otimes 1)$$

of the probabilities that Alice’s measurement according to the observable F will show the value $z_{nm}$. The teleportation scheme works perfectly with respect to a certain class S of states ρ on $\mathcal{H}_1$ if the following conditions are fulfilled:

(E1) For each n, m there exists a unitary operator $v_{nm} : \mathcal{H}_1 \to \mathcal{H}_3$ such that $\Lambda_{nm}(\rho) = v_{nm}\,\rho\,v_{nm}^{*}$ (ρ ∈ S),

(E2) $\sum_{nm} p_{nm}(\rho) = 1$ (ρ ∈ S).
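Steps 0 to 4 and the conditions (E1), (E2) can be made concrete in the simplest finite-dimensional case. The sketch below is an illustration only, not the Fock-space model of this paper: it assumes N = 2, a pure input state ρ = |ψ⟩⟨ψ|, a maximally entangled vector Ω on the second and third spaces, and rank-one projections $F_{nm}$ onto the four Bell vectors. The names `OMEGA`, `BELL`, `KEYS`, `teleport` and `key_fidelity` are hypothetical.

```python
import math

SQ = 1.0 / math.sqrt(2.0)

# Entangled vector Omega on H2 (x) H3, stored as OMEGA[j][k]; sigma = |Omega><Omega|.
OMEGA = [[SQ, 0.0], [0.0, SQ]]

# Bell vectors xi_nm on H1 (x) H2, stored as xi[i][j]; F_nm = |xi_nm><xi_nm|.
BELL = {
    (0, 0): [[SQ, 0.0], [0.0, SQ]],
    (0, 1): [[SQ, 0.0], [0.0, -SQ]],
    (1, 0): [[0.0, SQ], [SQ, 0.0]],
    (1, 1): [[0.0, SQ], [-SQ, 0.0]],
}

# Unitary keys v_nm : H1 -> H3 matching the Bell vectors above.
KEYS = {
    (0, 0): [[1, 0], [0, 1]],
    (0, 1): [[1, 0], [0, -1]],
    (1, 0): [[0, 1], [1, 0]],
    (1, 1): [[0, -1], [1, 0]],
}


def teleport(psi, nm):
    """Return (p_nm, Bob's normalized state vector) for the input vector psi."""
    xi = BELL[nm]
    # Partial inner product <xi_nm|_{12} (psi (x) Omega): unnormalized vector in H3.
    chi = [
        sum(complex(xi[i][j]).conjugate() * psi[i] * OMEGA[j][k]
            for i in range(2) for j in range(2))
        for k in range(2)
    ]
    p = sum(abs(c) ** 2 for c in chi)
    return p, [c / math.sqrt(p) for c in chi]


def key_fidelity(psi, nm):
    """|<v_nm psi, bob>|^2; equals 1 when (E1) holds for this outcome."""
    _, bob = teleport(psi, nm)
    v = KEYS[nm]
    target = [sum(v[k][i] * psi[i] for i in range(2)) for k in range(2)]
    overlap = sum(complex(t).conjugate() * b for t, b in zip(target, bob))
    return abs(overlap) ** 2


psi = [0.6, 0.8j]  # an arbitrary normalized input vector
probabilities = {nm: teleport(psi, nm)[0] for nm in BELL}
```

In this toy case each outcome occurs with probability 1/4, so the probabilities sum to one as in (E2), and `key_fidelity` returns 1 for every outcome, which is the pure-state form of (E1).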


(E1) means that Bob can reconstruct the original state ρ by unitary keys $\{v_{nm}\}$ provided to him. (E2) means that Bob will succeed in finding a proper key with certainty. Such a teleportation process can be classified into two cases [1], i.e., weak teleportation and general teleportation, in which the solutions of the teleportation in each case and the conditions of the uniqueness of the unitary key were discussed. The solution of the weak teleportation is a triple $\{\sigma^{(23)}, F^{(12)}, U\}$ such that $\Lambda^{*}\rho^{(1)} = U^{*}\rho^{(1)}U$ holds for any state $\rho^{(1)} \in S(\mathcal{H}_1)$. Once a weak solution of a teleportation problem is given, we can construct the general solution for all n, m above [1]. In [12], we considered a teleportation model where the entangled state σ is given by the splitting of a superposition of certain coherent states, although this model doesn’t work perfectly, that is, neither (E2) nor (E1) holds. In the same paper, we estimated the difference between the perfect teleportation and this non-perfect teleportation by adding a further step in the teleportation scheme:

Step 5: Bob will perform a measurement on his part of the system according to the projection

$$F_{+} := 1 - |\exp(0)\rangle\langle\exp(0)|,$$

where $|\exp(0)\rangle\langle\exp(0)|$ denotes the vacuum state (the coherent state with density 0).

Then our new teleportation channels (we denote them again by $\Lambda_{nm}$) have the form

$$\Lambda_{nm}(\rho) := \mathrm{tr}_{12}\,\frac{(F_{nm}\otimes F_{+})\,\rho\otimes\sigma\,(F_{nm}\otimes F_{+})}{\mathrm{tr}_{123}\,(F_{nm}\otimes F_{+})\,\rho\otimes\sigma\,(F_{nm}\otimes F_{+})}$$

and the corresponding probabilities are

$$p_{nm}(\rho) := \mathrm{tr}_{123}\,(F_{nm}\otimes F_{+})\,\rho\otimes\sigma\,(F_{nm}\otimes F_{+}).$$

For this teleportation scheme, (E1) is fulfilled but (E2) is not, which we review in the next section.

1.2. Basic notions and notations. We collect some basic facts concerning the (symmetric) Fock space in a way adapted to the language of counting measures. For details we refer to [6–8, 2, 9]. Let G be an arbitrary complete separable metric space. Further, let µ be a locally finite diffuse measure on G, i.e. µ(B) < +∞ for bounded measurable subsets of G and µ({x}) = 0 for all singletons x ∈ G. We denote the set of all finite counting measures on G by M = M(G). Since ϕ ∈ M n can be written in the form ϕ = δxj for some n = 0, 1, 2, . . . and xj ∈ G with the j =1

Dirac measure δx corresponding to x ∈ G, the elements of M can be interpreted as finite

70

K.-H. Fichtner, M. Ohya

(symmetric) point configurations in G. We equip M with its canonical σ –algebra W (cf. [6, 7]) and we consider the σ –finite measure F by setting   n 1 F (Y ) := XY (O) + XY  δxj  µn (d[x1 , . . . , xn ])(Y ∈ W), n! n≥1

j =1

Gn

where $X_Y$ denotes the indicator function of a set Y and O represents the empty configuration, i.e., O(G) = 0. Since µ was assumed to be diffuse one easily checks that F is concentrated on the set of simple configurations (i.e., without multiple points) $\hat M := \{\varphi\in M;\ \varphi(\{x\})\le 1 \text{ for all } x\in G\}$. $\mathcal M = \mathcal M(G) := L^2(M, W, F)$ is called the (symmetric) Fock space over G. In [6] it was proved that $\mathcal M$ and the Boson Fock space $\Gamma(L^2(G))$ in the usual definition are isomorphic. For each $\Phi\in\mathcal M$ with $\Phi\neq 0$ we denote by $|\Phi\rangle$ the corresponding normalized vector $|\Phi\rangle := \Phi/\|\Phi\|$. Further, $|\Phi\rangle\langle\Phi|$ denotes the corresponding one-dimensional projection describing a pure state given by the normalized vector $|\Phi\rangle$. Now, for each n ≥ 1 let $\mathcal M^{\otimes n}$ be the n-fold tensor product of the Hilbert space $\mathcal M$, which can be identified with $L^2(M^n, F^n)$. For a given function g : G → C the function exp(g) : M → C defined by

$$\exp(g)(\varphi) := \begin{cases} 1 & \text{if } \varphi = O,\\[2pt] \prod_{x\in G,\ \varphi(\{x\})>0} g(x) & \text{otherwise}\end{cases}$$

is called an exponential vector generated by g. Observe that $\exp(g)\in\mathcal M$ if and only if $g\in L^2(G)$, and one has in this case $\|\exp(g)\|^2 = e^{\|g\|^2}$ and $|\exp(g)\rangle = e^{-\frac12\|g\|^2}\exp(g)$. The projection $|\exp(g)\rangle\langle\exp(g)|$ is called the coherent state corresponding to $g\in L^2(G)$. In the special case g := 0 we get the vacuum state $|\exp(0)\rangle = X_{\{O\}}$. The linear span of the exponential vectors of $\mathcal M$ is dense in $\mathcal M$, so that bounded operators and certain unbounded operators can be characterized by their actions on exponential vectors. The operator $D : \mathrm{dom}(D)\to\mathcal M^{\otimes 2}$ given on a dense domain $\mathrm{dom}(D)\subset\mathcal M$ containing the exponential vectors from $\mathcal M$ by

$$D\psi(\varphi_1,\varphi_2) := \psi(\varphi_1+\varphi_2) \qquad (\psi\in\mathrm{dom}(D),\ \varphi_1,\varphi_2\in M)$$

is called the compound Hida-Malliavin derivative. On exponential vectors exp(g) with $g\in L^2(G)$, one gets immediately

$$D\exp(g) = \exp(g)\otimes\exp(g). \tag{1}$$


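The operator $S : \mathrm{dom}(S)\to\mathcal M$ given on a dense domain $\mathrm{dom}(S)\subset\mathcal M^{\otimes 2}$ containing tensor products of exponential vectors by

$$S\Phi(\varphi) := \sum_{\tilde\varphi\le\varphi}\Phi(\tilde\varphi,\ \varphi-\tilde\varphi) \qquad (\Phi\in\mathrm{dom}(S),\ \varphi\in M)$$

is called the compound Skorohod integral. One gets

$$\langle D\psi, \Phi\rangle_{\mathcal M^{\otimes 2}} = \langle\psi, S\Phi\rangle_{\mathcal M} \qquad (\psi\in\mathrm{dom}(D),\ \Phi\in\mathrm{dom}(S)), \tag{2}$$

$$S(\exp(g)\otimes\exp(h)) = \exp(g+h) \qquad (g,h\in L^2(G)). \tag{3}$$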

For more details we refer to [10]. Let T be a linear operator on $L^2(G)$ with $\|T\|\le 1$. Then the operator $\Gamma(T)$, called the second quantization of T, is the (uniquely determined) bounded operator on $\mathcal M$ fulfilling

$$\Gamma(T)\exp(g) = \exp(Tg) \qquad (g\in L^2(G)).$$

Clearly, it holds

$$\Gamma(T_1T_2) = \Gamma(T_1)\Gamma(T_2), \qquad \Gamma(T^*) = \Gamma(T)^*. \tag{4}$$

It follows that $\Gamma(T)$ is a unitary operator on $\mathcal M$ if T is a unitary operator on $L^2(G)$. In [12] we proved the following lemma.

Lemma 1.1. Let $K_1, K_2$ be linear operators on $L^2(G)$ with the property

$$K_1^*K_1 + K_2^*K_2 = 1. \tag{5}$$

Then there exists exactly one isometry $\nu_{K_1,K_2}$ from $\mathcal M$ to $\mathcal M^{\otimes 2} = \mathcal M\otimes\mathcal M$ with

$$\nu_{K_1,K_2}\exp(g) = \exp(K_1g)\otimes\exp(K_2g) \qquad (g\in L^2(G)). \tag{6}$$

Further it holds

$$\nu_{K_1,K_2} = (\Gamma(K_1)\otimes\Gamma(K_2))D \tag{7}$$

(at least on dom(D), but one has the unique extension). The adjoint $\nu^*_{K_1,K_2}$ of $\nu_{K_1,K_2}$ is characterized by

$$\nu^*_{K_1,K_2}(\exp(h)\otimes\exp(g)) = \exp(K_1^*h + K_2^*g) \qquad (g,h\in L^2(G)) \tag{8}$$

and it holds

$$\nu^*_{K_1,K_2} = S(\Gamma(K_1^*)\otimes\Gamma(K_2^*)). \tag{9}$$


From $K_1, K_2$ we get a transition expectation $\xi_{K_1K_2} : \mathcal M\otimes\mathcal M\to\mathcal M$, using $\nu_{K_1,K_2}$, and the lifting $\xi^*_{K_1K_2}$ may be interpreted as a certain splitting (cf. [2]). The property (5) implies

$$\|K_1g\|^2 + \|K_2g\|^2 = \|g\|^2 \qquad (g\in L^2(G)). \tag{10}$$

Let U, V be unitary operators on $L^2(G)$. If operators $K_1, K_2$ satisfy (5), then the pair $\hat K_1 = UK_1$, $\hat K_2 = VK_2$ fulfills (5). Here we explain the fundamental scheme of beam splitting [8]. We define an isometric operator $V_{\alpha,\beta}$ for coherent vectors such that $V_{\alpha,\beta}|\exp(g)\rangle = |\exp(\alpha g)\rangle\otimes|\exp(\beta g)\rangle$ with $|\alpha|^2+|\beta|^2 = 1$. This beam splitting is a useful mathematical expression for optical communication and quantum measurements [2]. As one example, take $\alpha=\beta=\frac{1}{\sqrt2}$ in the above formula and let $K_1 = K_2$ be the following operator of multiplication on $L^2(G)$:

$$K_1g = \frac{1}{\sqrt2}\,g = K_2g \qquad (g\in L^2(G)). \tag{11}$$

In this case we put $\nu := \nu_{K_1,K_2}$, and we obtain

$$\nu\exp(g) = \exp\Big(\frac{1}{\sqrt2}g\Big)\otimes\exp\Big(\frac{1}{\sqrt2}g\Big) \qquad (g\in L^2(G)).$$
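The isometry property of the beam splitting can be checked numerically on exponential vectors. The following is a minimal sketch, assuming that $L^2(G)$ is replaced by the finite-dimensional stand-in $\mathbb C^n$ and using only the inner product formula $\langle\exp(f_1),\exp(f_2)\rangle = e^{\langle f_1,f_2\rangle}$ for exponential vectors; the function names are illustrative and not taken from [8] or [12].

```python
import cmath


def inner(g, h):
    """<g, h> on C^n, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(g, h))


def exp_inner(g, h):
    """Inner product of exponential vectors: <exp(g), exp(h)> = e^{<g, h>}."""
    return cmath.exp(inner(g, h))


def split_inner(g, h, alpha, beta):
    """<exp(alpha g) (x) exp(beta g), exp(alpha h) (x) exp(beta h)>.

    On a tensor product the inner product factorises into the product
    of the inner products of the two legs.
    """
    ag = [alpha * x for x in g]
    ah = [alpha * x for x in h]
    bg = [beta * x for x in g]
    bh = [beta * x for x in h]
    return exp_inner(ag, ah) * exp_inner(bg, bh)


if __name__ == "__main__":
    g = [0.3 + 0.1j, -0.2j, 0.5]
    h = [0.1, 0.4 - 0.3j, -0.2 + 0.2j]
    s = 2 ** -0.5
    # Both numbers agree because |alpha|^2 + |beta|^2 = 1.
    print(exp_inner(g, h), split_inner(g, h, s, s))
```

Since $|\alpha|^2+|\beta|^2 = 1$, the exponents add up to $\langle g,h\rangle$, which is why the map preserves inner products on the span of the exponential vectors.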

Another example is given by taking $K_1$ and $K_2$ as the projections onto the corresponding subspaces $H_1, H_2$ of the orthogonal sum $L^2(G) = H_1\oplus H_2$. In [12] we used the first example in order to describe a teleportation model where Bob performs his experiments on the same ensemble of the systems as Alice. Further we used a special case of the second example in order to describe a teleportation model where Bob and Alice are spatially separated (cf. Sect. 5 of [12]).

2. Previous Results on Teleportation

Let us review some results obtained in [12]. We fix an ONS $\{g_1,\dots,g_N\}\subseteq L^2(G)$, operators $K_1, K_2$ on $L^2(G)$ with (5), a unitary operator T on $L^2(G)$, and d > 0. We assume

$$TK_1g_k = K_2g_k \qquad (k=1,\dots,N), \tag{12}$$

$$\langle K_1g_k, K_1g_j\rangle = 0 \qquad (k\neq j;\ k,j=1,\dots,N). \tag{13}$$

Using (10), (12) and the unitarity of T we get

$$\|K_1g_k\|^2 = \|K_2g_k\|^2 = \frac12. \tag{14}$$


From (12) and (13) we get

$$\langle K_2g_k, K_2g_j\rangle = 0 \qquad (k\neq j;\ k,j=1,\dots,N). \tag{15}$$

The state that Alice is asked to teleport is of the type

$$\rho = \sum_{s=1}^{N}\lambda_s|\Phi_s\rangle\langle\Phi_s|, \tag{16}$$

where

$$|\Phi_s\rangle = \sum_{j=1}^{N}c_{sj}\,|\exp(aK_1g_j)-\exp(0)\rangle, \qquad \sum_j|c_{sj}|^2 = 1;\ s=1,\dots,N, \tag{17}$$

and $a=\sqrt d$. One easily checks that $(|\exp(aK_1g_j)-\exp(0)\rangle)_{j=1}^N$ and $(|\exp(aK_2g_j)-\exp(0)\rangle)_{j=1}^N$ are ONS in $\mathcal M$. The set $\{\Phi_s;\ s=1,\dots,N\}$ spans the N-dimensional Hilbert space $H_1$ of input states teleported by Alice. Although we could include the vacuum state $|\exp(0)\rangle$ in the definition of $H_1$, here we take the N-dimensional Hilbert space $H_1$ as above for computational simplicity. In order to achieve that $(|\Phi_s\rangle)_{s=1}^N$ is still an ONS in $\mathcal M$ we assume

$$\sum_{j=1}^{N}\bar c_{sj}c_{kj} = 0 \qquad (s\neq k;\ s,k=1,\dots,N). \tag{18}$$

Denote $c_s = [c_{s1},\dots,c_{sN}]\in\mathbb C^N$; then $(c_s)_{s=1}^N$ is a CONS in $\mathbb C^N$. Let $(b_n)_{n=1}^N$ be a sequence in $\mathbb C^N$, $b_n = [b_{n1},\dots,b_{nN}]$, with the properties

$$|b_{nk}| = 1 \qquad (n,k=1,\dots,N), \tag{19}$$

$$\langle b_n, b_j\rangle = 0 \qquad (n\neq j;\ n,j=1,\dots,N). \tag{20}$$

Now, for each m, n (= 1, . . . , N), we have unitary operators $U_m, B_n$ on $\mathcal M$ given by

$$B_n|\exp(aK_1g_j)-\exp(0)\rangle = b_{nj}\,|\exp(aK_1g_j)-\exp(0)\rangle \qquad (j=1,\dots,N), \tag{21}$$

$$U_m|\exp(aK_1g_j)-\exp(0)\rangle = |\exp(aK_1g_{j\oplus m})-\exp(0)\rangle \qquad (j=1,\dots,N), \tag{22}$$

where $j\oplus m := j+m \pmod N$.


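2.1. A perfect teleportation. Alice's measurements are performed with the projections

$$F_{nm} = |\xi_{nm}\rangle\langle\xi_{nm}| \qquad (n,m=1,\dots,N) \tag{23}$$

given by

$$|\xi_{nm}\rangle = \frac{1}{\sqrt N}\sum_{j=1}^{N}b_{nj}\,|\exp(aK_1g_j)-\exp(0)\rangle\otimes|\exp(aK_1g_{j\oplus m})-\exp(0)\rangle. \tag{24}$$

One easily checks that $(|\xi_{nm}\rangle)_{n,m=1}^N$ is an ONS in $\mathcal M^{\otimes 2}$. Further, the state vector $|\xi\rangle$ of the entangled state $\sigma = |\xi\rangle\langle\xi|$ is given by

$$|\xi\rangle = \frac{1}{\sqrt N}\sum_{k}|\exp(aK_1g_k)-\exp(0)\rangle\otimes|\exp(aK_2g_k)-\exp(0)\rangle. \tag{25}$$

In [12] we proved the following theorem.

Theorem 2.1. For each n, m = 1, . . . , N, define a channel $\Lambda_{nm}$ by

$$\Lambda_{nm}(\rho) := \mathrm{tr}_{12}\,\frac{(F_{nm}\otimes 1)(\rho\otimes\sigma)(F_{nm}\otimes 1)}{\mathrm{tr}_{123}\,(F_{nm}\otimes 1)(\rho\otimes\sigma)(F_{nm}\otimes 1)} \qquad (\rho \text{ normal state on } \mathcal M). \tag{26}$$

Then we have for all states ρ on $\mathcal M$ with (16) and (17),

$$\Lambda_{nm}(\rho) = \big(\Gamma(T)U_mB_n^*\big)\,\rho\,\big(\Gamma(T)U_mB_n^*\big)^*. \tag{27}$$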

Remark 2.2. Using the operators $B_n$, $U_m$, $\Gamma(T)$, the projections $F_{nm}$ are given by unitary transformations of the entangled state σ:

$$F_{nm} = \big(B_n\otimes U_m\Gamma(T^*)\big)\,\sigma\,\big(B_n\otimes U_m\Gamma(T^*)\big)^*, \tag{28}$$

or $|\xi_{nm}\rangle = \big(B_n\otimes U_m\Gamma(T^*)\big)|\xi\rangle$.

If Alice performs a measurement according to the selfadjoint operator

$$F = \sum_{n,m=1}^{N}z_{nm}F_{nm}$$

with $\{z_{nm}\,|\,n,m=1,\dots,N\}\subseteq\mathbb R-\{0\}$, then she will obtain the value $z_{nm}$ with probability $1/N^2$. The sum over all these probabilities is 1, so that the teleportation model works perfectly. Before stating some fundamental results of [12] for the non-perfect case, we note that our perfect teleportation can obviously be treated in general finite-dimensional Hilbert spaces $H_k$ (k = 1, 2, 3) in the same way as the usual one [2]. Moreover, our teleportation scheme can be generalized a bit by introducing the entangled state $\sigma_{12}$ on $H_1\otimes H_2$, defining the projections $\{F_{nm}\}$ by the unitary operators $B_n, U_m$. We here discuss the perfect teleportation on general Hilbert spaces $H_k$ (k = 1, 2, 3). Let $\{\xi_j^k;\ j=1,\dots,N\}$ be a CONS of the Hilbert space $H_k$ (k = 1, 2, 3). Define the entangled states $\sigma_{12}$ and $\sigma_{23}$ on $H_1\otimes H_2$ and $H_2\otimes H_3$, respectively, by

$$\sigma_{12} = |\xi_{12}\rangle\langle\xi_{12}|, \qquad \sigma_{23} = |\xi_{23}\rangle\langle\xi_{23}|$$

with $\xi_{12} := \frac{1}{\sqrt N}\sum_{j=1}^N\xi_j^1\otimes\xi_j^2$ and $\xi_{23} := \frac{1}{\sqrt N}\sum_{j=1}^N\xi_j^2\otimes\xi_j^3$. By a sequence $\{b_n = [b_{n1},\dots,b_{nN}];\ n=1,\dots,N\}$ in $\mathbb C^N$ with the properties (19) and (20), we define the unitary operators $B_n$ and $U_m$ by $B_n\xi_j^1 := b_{nj}\xi_j^1$ (n, j = 1, . . . , N) and $U_m\xi_j^2 := \xi^2_{j\oplus m}$ (m, j = 1, . . . , N) with $j\oplus m := j+m \pmod N$. Then the set $\{F_{nm};\ n,m=1,\dots,N\}$ of the projections of Alice is given by

$$F_{nm} = (B_n\otimes U_m)\,\sigma_{12}\,(B_n\otimes U_m)^*,$$

and the teleportation channels $\{\Lambda_{nm};\ n,m=1,\dots,N\}$ are defined as

$$\Lambda_{nm}(\rho) := \mathrm{tr}_{12}\,\frac{(F_{nm}\otimes 1)(\rho\otimes\sigma_{23})(F_{nm}\otimes 1)}{\mathrm{tr}_{123}\,(F_{nm}\otimes 1)(\rho\otimes\sigma_{23})(F_{nm}\otimes 1)}.$$

Finally the unitary keys $\{W_{nm};\ n,m=1,\dots,N\}$ of Bob are given as $W_{nm}\xi_j^1 = b_{nj}\,\xi^3_{j\oplus m}$ (n, m = 1, . . . , N), by which we obtain the perfect teleportation

$$\Lambda_{nm}(\rho) = W_{nm}\,\rho\,W_{nm}^*.$$

The above perfect teleportation is unique in the sense of unitary equivalence.
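The finite-dimensional scheme above can be simulated directly for a pure input state. The sketch below is an illustration under stated assumptions, not the construction of [12]: indices run from 0 to N−1 instead of 1 to N, the phases are chosen as the Fourier phases $b_{nj} = e^{2\pi i nj/N}$ (one admissible choice satisfying (19) and (20)), and the inner product is taken conjugate-linear in the first argument. With that convention the phase appearing on Bob's side is $\bar b_{nj}$, in line with the factor $B_n^*$ in (27). All function names are hypothetical.

```python
import cmath
import math


def phases(N):
    """Fourier phases b[n][j] = exp(2 pi i n j / N): |b_nj| = 1, <b_n, b_k> = 0 for n != k."""
    return [[cmath.exp(2j * math.pi * n * j / N) for j in range(N)] for n in range(N)]


def bob_state(psi, n, m, b):
    """Unnormalised vector on H3 after Alice finds outcome (n, m).

    Contracting <xi_nm| on H1 (x) H2 with psi (x) xi_23 leaves
    conj(b_nj) * psi_j / N on the basis vector xi^3_{j (+) m}.
    """
    N = len(psi)
    out = [0j] * N
    for j in range(N):
        out[(j + m) % N] += b[n][j].conjugate() * psi[j] / N
    return out


def undo_key(phi, n, m, b):
    """Bob's correction: shift the index back by m and remove the phase."""
    N = len(phi)
    return [b[n][j] * phi[(j + m) % N] for j in range(N)]


def teleport(psi, n, m):
    """Return (probability of outcome (n, m), state recovered by Bob)."""
    b = phases(len(psi))
    phi = bob_state(psi, n, m, b)
    prob = sum(abs(x) ** 2 for x in phi)
    norm = math.sqrt(prob)
    return prob, undo_key([x / norm for x in phi], n, m, b)


if __name__ == "__main__":
    psi = [0.6, 0.8j, 0.0]
    for n in range(3):
        for m in range(3):
            print(n, m, teleport(psi, n, m))
```

Each of the $N^2$ outcomes occurs with probability $1/N^2$ and Bob recovers the input exactly, which is the finite-dimensional content of conditions (E1) and (E2).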

2.2. A non-perfect teleportation. We will review a non-perfect teleportation model in which the probability of teleporting the state from Alice to Bob is less than 1 and depends on the density parameter d (which may be the energy of the beams) of the coherent vector. When $d = a^2$ tends to infinity, this probability tends to 1. Thus the model can be considered as asymptotically perfect. Take the normalized vector

$$|\eta\rangle := \frac{\gamma}{\sqrt N}\sum_{k=1}^{N}|\exp(ag_k)\rangle, \qquad \text{with } \gamma := \Big(\frac{1}{1+(N-1)e^{-d}}\Big)^{\frac12} = \Big(\frac{1}{1+(N-1)e^{-a^2}}\Big)^{\frac12}, \tag{29}$$

and we replace in (26) the entangled state σ by

$$\tilde\sigma := |\tilde\xi\rangle\langle\tilde\xi|, \qquad \tilde\xi := \nu_{K_1,K_2}(\eta) = \frac{\gamma}{\sqrt N}\sum_{k=1}^{N}|\exp(aK_1g_k)\rangle\otimes|\exp(aK_2g_k)\rangle. \tag{30}$$


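Then for each n, m = 1, . . . , N, we get the following channels on any normal state ρ on $\mathcal M$:

$$\tilde\Lambda_{nm}(\rho) := \mathrm{tr}_{12}\,\frac{(F_{nm}\otimes 1)(\rho\otimes\tilde\sigma)(F_{nm}\otimes 1)}{\mathrm{tr}_{123}\,(F_{nm}\otimes 1)(\rho\otimes\tilde\sigma)(F_{nm}\otimes 1)}, \tag{31}$$

$$\Theta_{nm}(\rho) := \mathrm{tr}_{12}\,\frac{(F_{nm}\otimes F_+)(\rho\otimes\tilde\sigma)(F_{nm}\otimes F_+)}{\mathrm{tr}_{123}\,(F_{nm}\otimes F_+)(\rho\otimes\tilde\sigma)(F_{nm}\otimes F_+)}, \tag{32}$$

where $F_+ = 1-|\exp(0)\rangle\langle\exp(0)|$, i.e., $F_+$ is the projection onto the space $\mathcal M_+$ of configurations having no vacuum part, $\mathcal M_+ := \{\psi\in\mathcal M;\ |\exp(0)\rangle\langle\exp(0)|\psi = 0\}$. One easily checks that

$$\Theta_{nm}(\rho) = \frac{F_+\tilde\Lambda_{nm}(\rho)F_+}{\mathrm{tr}\big(F_+\tilde\Lambda_{nm}(\rho)F_+\big)}, \tag{33}$$

that is, after receiving the state $\tilde\Lambda_{nm}(\rho)$ from Alice, Bob has to omit the vacuum. From Theorem 2.1 it follows that for all ρ with (16) and (17),

$$\Lambda_{nm}(\rho) = \frac{F_+\Lambda_{nm}(\rho)F_+}{\mathrm{tr}\big(F_+\Lambda_{nm}(\rho)F_+\big)}.$$

This is not true if we replace $\Lambda_{nm}$ by $\tilde\Lambda_{nm}$; namely, in general it does not hold that $\Theta_{nm}(\rho) = \tilde\Lambda_{nm}(\rho)$. In [12] we proved the following theorem.

Theorem 2.3. For all states ρ on $\mathcal M$ with (16) and (17) and each pair n, m (= 1, . . . , N), we have

$$\Theta_{nm}(\rho) = \big(\Gamma(T)U_mB_n^*\big)\,\rho\,\big(\Gamma(T)U_mB_n^*\big)^* \quad\text{or}\quad \Theta_{nm}(\rho) = \Lambda_{nm}(\rho) \tag{34}$$

and

$$\sum_{n,m}p_{nm}(\rho) = \sum_{n,m}\mathrm{tr}_{123}\,(F_{nm}\otimes F_+)(\rho\otimes\tilde\sigma)(F_{nm}\otimes F_+) = \frac{\big(1-e^{-\frac d2}\big)^2}{1+(N-1)e^{-d}}. \tag{35}$$

That is, the model works only asymptotically perfectly in the sense of condition (E2). In other words, the model works perfectly in the limit of high density (or energy) of the considered beams.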
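The dependence of the total success probability on the density d can be tabulated with a few lines of code. The sketch below simply evaluates the normalization constant γ of (29) and the right-hand side of (35); the function names are illustrative, and the norm check relies on the overlap $e^{-d}$ between distinct normalized coherent vectors $|\exp(ag_k)\rangle$, which follows from the orthonormality of the $g_k$.

```python
import math


def gamma_sq(N, d):
    """gamma^2 = 1 / (1 + (N - 1) e^{-d}), cf. (29)."""
    return 1.0 / (1.0 + (N - 1) * math.exp(-d))


def eta_norm_sq(N, d):
    """Squared norm of eta = gamma/sqrt(N) * sum_k |exp(a g_k)>.

    The Gram matrix of the N normalized coherent vectors has 1 on the
    diagonal and e^{-d} off the diagonal, so its entries sum to
    N + N (N - 1) e^{-d}.
    """
    gram_sum = N + N * (N - 1) * math.exp(-d)
    return gamma_sq(N, d) / N * gram_sum


def total_success_probability(N, d):
    """Right-hand side of (35): (1 - e^{-d/2})^2 / (1 + (N - 1) e^{-d})."""
    return (1.0 - math.exp(-d / 2.0)) ** 2 * gamma_sq(N, d)


if __name__ == "__main__":
    for d in (1.0, 5.0, 10.0, 20.0, 40.0):
        print(d, total_success_probability(4, d))
```

The printed values increase towards 1 as d grows, which is the sense in which the model is asymptotically perfect with respect to (E2).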


3. Main Results

The tools of the teleportation model considered in Sect. 2.1 are the entangled state σ and the family of projections $(F_{nm})_{n,m=1}^N$. In order to have a more handy model, in Sect. 2.2 we have replaced the entangled state σ by another entangled state $\tilde\sigma$ given by the splitting of a superposition of certain coherent states (30). In addition, we are going to replace the projectors $F_{nm}$ by projectors $\tilde F_{nm}$ defined as follows:

$$\tilde F_{nm} := \big(B_n\otimes U_m\Gamma(T)^*\big)\,\tilde\sigma\,\big(B_n\otimes U_m\Gamma(T)^*\big)^*. \tag{36}$$

In order to make this definition precise we assume, in addition to (22), that

$$U_m\exp(0) = \exp(0) \qquad (m=1,\dots,N).$$

Together with (22) that implies

$$U_m|\exp(aK_1g_j)\rangle = |\exp(aK_1g_{j\oplus m})\rangle \qquad (m,j=1,\dots,N). \tag{37}$$

Formally we have the same relation between $\tilde\sigma$ and $\tilde F_{nm}$ as between σ and $F_{nm}$ (cf. Remark 2.2). Further, for each pair n, m = 1, . . . , N we define channels on normal states on $\mathcal M$ by

$$\tilde\Theta_{nm}(\rho) := \mathrm{tr}_{12}\,\frac{(\tilde F_{nm}\otimes F_+)(\rho\otimes\tilde\sigma)(\tilde F_{nm}\otimes F_+)}{\tilde p_{nm}(\rho)}, \tag{38}$$

where

$$\tilde p_{nm}(\rho) := \mathrm{tr}_{123}\,(\tilde F_{nm}\otimes F_+)(\rho\otimes\tilde\sigma)(\tilde F_{nm}\otimes F_+) \tag{39}$$

(cf. (33) and (34)). In Sect. 4, we will prove the following theorem.

Theorem 3.1. For each state ρ on $\mathcal M$ with (16) and (17), each pair n, m (= 1, . . . , N) and each bounded operator A on $\mathcal M$ it holds

$$\big|\mathrm{tr}\big(\tilde\Theta_{nm}(\rho)A\big) - \mathrm{tr}\big(\Lambda_{nm}(\rho)A\big)\big| \le \frac{2e^{-\frac d2}}{\big(1-e^{-\frac d2}\big)^2}\,\big(N^2+N\sqrt N+N\big)\,\|A\|, \tag{40}$$

$$\Big|\tilde p_{nm}(\rho) - \frac{1}{N^2}\Big| \le e^{-\frac d2}\Big(\frac{14}{N^2}+2+\frac{2}{\sqrt N}\Big). \tag{41}$$

From Theorem 2.1 and $e^{-\frac d2}\to 0$ (d → +∞), Theorem 3.1 means that our modified teleportation model works asymptotically perfectly (the case of high density or energy) in the sense of conditions (E1) and (E2). In order to obtain a deeper understanding of the whole procedure we are going to discuss another representation of the projectors $\tilde F_{nm}$ and of the channels $\tilde\Theta_{nm}$. The starting point is again the normalized vector $|\eta\rangle$ given by (29). From (14) we obtain

$$\|\mathcal O_{\sqrt2}K_1g_k\|^2 = \|g_k\|^2, \tag{42}$$


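where $\mathcal O_f$ denotes the operator of multiplication corresponding to the number (or function) f,

$$\mathcal O_f\psi := f\psi \qquad (\psi\in L^2(G)). \tag{43}$$

Furthermore (13) implies

$$\langle\mathcal O_fK_1g_k, \mathcal O_fK_1g_j\rangle = 0 \qquad (k\neq j). \tag{44}$$

From (42) and (44) it follows that we have a normalized vector $|\tilde\eta\rangle$ given by

$$|\tilde\eta\rangle := \Gamma\big(\mathcal O_{\sqrt2}K_1\big)|\eta\rangle = \frac{\gamma}{\sqrt N}\sum_{k=1}^{N}|\exp(a\sqrt2K_1g_k)\rangle. \tag{45}$$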

Remark 3.2. In the case of the example given by (11) of Sect. 1.2, we have $|\tilde\eta\rangle = |\eta\rangle$.

Now let V be the unitary operator on $\mathcal M\otimes\mathcal M$ characterized by

$$V(\exp(f_1)\otimes\exp(f_2)) = \exp\Big(\frac{1}{\sqrt2}(f_1-f_2)\Big)\otimes\exp\Big(\frac{1}{\sqrt2}(f_1+f_2)\Big) \qquad (f_1,f_2\in L^2(G)). \tag{46}$$

One easily checks

$$V^*(\exp(f_1)\otimes\exp(f_2)) = \exp\Big(\frac{1}{\sqrt2}(f_1+f_2)\Big)\otimes\exp\Big(\frac{1}{\sqrt2}(f_2-f_1)\Big) \qquad (f_1,f_2\in L^2(G)). \tag{47}$$
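That (47) inverts (46) can be seen on the level of the generating functions alone. The following sketch, with $L^2(G)$ replaced by $\mathbb C^n$ as an assumption made purely for illustration, applies the two parameter maps in succession and also checks that the map in (46) preserves $\|f_1\|^2+\|f_2\|^2$, which is what keeps $\|\exp(f_1)\otimes\exp(f_2)\|$ unchanged. The function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def v_params(f1, f2):
    """Parameters of V(exp(f1) (x) exp(f2)) as in (46)."""
    return ([(x - y) / SQRT2 for x, y in zip(f1, f2)],
            [(x + y) / SQRT2 for x, y in zip(f1, f2)])


def v_adjoint_params(f1, f2):
    """Parameters of V*(exp(f1) (x) exp(f2)) as in (47)."""
    return ([(x + y) / SQRT2 for x, y in zip(f1, f2)],
            [(y - x) / SQRT2 for x, y in zip(f1, f2)])


def norm_sq(f):
    """Squared Euclidean norm on C^n."""
    return sum(abs(x) ** 2 for x in f)


if __name__ == "__main__":
    f1 = [1.0 + 2.0j, -0.5]
    f2 = [0.25j, 3.0]
    u, v = v_params(f1, f2)
    # Applying the adjoint parameters returns (f1, f2) up to rounding.
    print(v_adjoint_params(u, v))
```

In particular $V(\exp(a\sqrt2K_1g_k)\otimes\exp(0))$ has both legs equal to $\exp(aK_1g_k)$, which is the splitting used in the representations of $\tilde\xi$ that follow.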

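Remark 3.3. V describes a certain exchange procedure of particles (or energy) between two systems or beams (cf. [13]).

Now, using (12), (30), (45), (46) and (47) one gets

$$\tilde\xi = \nu_{K_1,K_2}(\eta) = (1\otimes\Gamma(T))V^*\big(|\exp(0)\rangle\otimes|\tilde\eta\rangle\big), \tag{48}$$

$$\tilde\xi = (1\otimes\Gamma(T))V\big(|\tilde\eta\rangle\otimes|\exp(0)\rangle\big), \tag{49}$$

and it follows

$$\tilde\sigma = |\tilde\xi\rangle\langle\tilde\xi| = (1\otimes\Gamma(T))V^*\big(|\exp(0)\rangle\langle\exp(0)|\otimes|\tilde\eta\rangle\langle\tilde\eta|\big)\big((1\otimes\Gamma(T))V^*\big)^*, \tag{50}$$

$$\tilde\sigma = (1\otimes\Gamma(T))V\big(|\tilde\eta\rangle\langle\tilde\eta|\otimes|\exp(0)\rangle\langle\exp(0)|\big)\big((1\otimes\Gamma(T))V\big)^*. \tag{51}$$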


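From the definition of $\tilde F_{nm}$ (36) and (50) it follows

$$\tilde F_{nm} = (B_n\otimes U_m)V^*\big(|\exp(0)\rangle\langle\exp(0)|\otimes|\tilde\eta\rangle\langle\tilde\eta|\big)\big((B_n\otimes U_m)V^*\big)^*. \tag{52}$$

Using (51) and (52) we obtain

$$(\tilde F_{nm}\otimes F_+)(\rho\otimes\tilde\sigma)(\tilde F_{nm}\otimes F_+) = (X_{nm}\otimes 1)\,W_{nm}\big(\rho\otimes|\tilde\eta\rangle\langle\tilde\eta|\otimes|\exp(0)\rangle\langle\exp(0)|\big)W_{nm}^*\,(X_{nm}\otimes 1)^* \qquad (n,m=1,\dots,N), \tag{53}$$

where

$$X_{nm} := (B_n\otimes U_m)V^*, \qquad W_{nm} := \big(|\exp(0)\rangle\langle\exp(0)|\otimes|\tilde\eta\rangle\langle\tilde\eta|\otimes F_+\big)(V\otimes 1)\big(B_n^*\otimes U_m^*\otimes\Gamma(T)\big)(1\otimes V) \qquad (n,m=1,\dots,N). \tag{54}$$

$X_{nm}$ and consequently $X_{nm}\otimes 1$ are unitary operators. For that reason we get from (53),

$$\mathrm{tr}_{123}\,(\tilde F_{nm}\otimes F_+)(\rho\otimes\tilde\sigma)(\tilde F_{nm}\otimes F_+) = \mathrm{tr}_{123}\,W_{nm}\big(\rho\otimes|\tilde\eta\rangle\langle\tilde\eta|\otimes|\exp(0)\rangle\langle\exp(0)|\big)W_{nm}^* \qquad (n,m=1,\dots,N) \tag{55}$$

and

$$\mathrm{tr}_{12}\,(\tilde F_{nm}\otimes F_+)(\rho\otimes\tilde\sigma)(\tilde F_{nm}\otimes F_+) = \mathrm{tr}_{12}\,W_{nm}\big(\rho\otimes|\tilde\eta\rangle\langle\tilde\eta|\otimes|\exp(0)\rangle\langle\exp(0)|\big)W_{nm}^*. \tag{56}$$

Now from (38), (39), (55) and (56) it follows

$$\tilde p_{nm}(\rho) = \mathrm{tr}_{123}\,W_{nm}\big(\rho\otimes|\tilde\eta\rangle\langle\tilde\eta|\otimes|\exp(0)\rangle\langle\exp(0)|\big)W_{nm}^*, \tag{57}$$

$$\tilde\Theta_{nm}(\rho) = \mathrm{tr}_{12}\,\frac{W_{nm}\big(\rho\otimes|\tilde\eta\rangle\langle\tilde\eta|\otimes|\exp(0)\rangle\langle\exp(0)|\big)W_{nm}^*}{\mathrm{tr}_{123}\,W_{nm}\big(\rho\otimes|\tilde\eta\rangle\langle\tilde\eta|\otimes|\exp(0)\rangle\langle\exp(0)|\big)W_{nm}^*}. \tag{58}$$

According to (57), (58) and (54), the procedure of the special teleportation model can be expressed in the following steps: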


Step 0 – Initial state $s_{in}(\rho) = \rho\otimes|\tilde\eta\rangle\langle\tilde\eta|\otimes|\exp(0)\rangle\langle\exp(0)|$, where ρ is the unknown state Alice wants to teleport and $|\exp(0)\rangle\langle\exp(0)|$ is the vacuum state, Bob's state at the beginning.

Step 1 – Transformation according to $1\otimes V$, that means: splitting of the state $|\tilde\eta\rangle\langle\tilde\eta|$.

Step 2 – Transformation according to $B_n^*\otimes U_m^*\otimes\Gamma(T)$.

Step 3 – Transformation according to $V\otimes 1$: exchange of particles (or energy) between the first and the second part of the system.

Step 4 – Measurement according to $|\exp(0)\rangle\langle\exp(0)|\otimes|\tilde\eta\rangle\langle\tilde\eta|\otimes F_+$, checking: is the first part in the vacuum? Is the second part reconstructed? Is there no vacuum in the third part?

Final state:

$$s_{fin}(\rho) = \frac{W_{nm}(s_{in}(\rho))W_{nm}^*}{\mathrm{tr}_{123}\,W_{nm}(s_{in}(\rho))W_{nm}^*}.$$

Now from (57), (58) we get $\tilde\Theta_{nm}(\rho) = \mathrm{tr}_{12}\,s_{fin}(\rho)$. Thus Theorem 3.1 means that in the case of high density (or energy) d we have approximately (ρ with (16) and (17))

$$\mathrm{tr}_{12}\,s_{fin}(\rho) = \big(\Gamma(T)U_mB_n^*\big)\,\rho\,\big(\Gamma(T)U_mB_n^*\big)^*.$$

The proof of Theorem 3.1 shows that we have even more, namely it holds (approximately)

$$s_{fin}(\rho) = |\exp(0)\rangle\langle\exp(0)|\otimes|\tilde\eta\rangle\langle\tilde\eta|\otimes\big(\Gamma(T)U_mB_n^*\big)\,\rho\,\big(\Gamma(T)U_mB_n^*\big)^*. \tag{59}$$

Adding to our scheme the following step:

Step 5 – Transformation according to $1\otimes 1\otimes\big(\Gamma(T)U_mB_n^*\big)^*$ (that means Bob uses the key provided to him),

$s_{fin}(\rho)$ will change into the new final state $|\exp(0)\rangle\langle\exp(0)|\otimes|\tilde\eta\rangle\langle\tilde\eta|\otimes\rho$. Summarizing, one can describe the effect of the procedure (for large d!) as follows: At the beginning Alice has (i.e., can control) a state ρ, and Bob has the vacuum state (i.e., can control nothing). After the procedure Bob has the state ρ and Alice has the vacuum. Furthermore the teleportation mechanism is ready for the next teleportation (i.e., $|\tilde\eta\rangle\langle\tilde\eta|$ is reproduced in the course of teleportation).


We have considered three different models (cf. Sects. 2.1, 2.2, 2.3). Each of them is a special case of a more general concept we are going to describe in the following: Let H1 , H2 be N–dimensional subspaces of M+ such that (T ) maps H1 onto H2 , and H1 is invariant with respect to the unitary transformations Bn , Um (n, m = 1, . . . , N ). Further let σ1 , σ2 be projections of the type σk = |ξk ξk |,

ξk ∈ (H1 ⊕ M0 ) ⊗ (H2 ⊗ M0 )

(k = 1, 2),

where M0 is the orthogonal complement of M+ , e.g., M0 is the one-dimensional subspace of M spanned by the vacuum vector |exp(0) . 1 ,σ2 from Now for each n, m = 1, . . . , N and each pair σ1 , σ2 we define a channel >σnm the set of all normal states ρ on H1 into the set of all normal states on M+ ,

1 σ2 (ρ) := tr >σnm 12

σ1 σ1 ⊗ F+ (ρ ⊗ σ2 ) Fnm ⊗ F+ Fnm σ1 , σ1 tr 123 Fnm ⊗ F+ (ρ ⊗ σ2 ) Fnm ⊗ F+

where ∗ σ1 Fnm := Bn ⊗ Um (T ∗ ) σ1 Bn ⊗ Um (T ∗ ) . In this paper we have considered the situation where H1 is spanned by the ONS (|exp(aK1 gk ) − exp(0) )N k=1 and H2 is spanned by the ONS (|exp(aK2 gk ) − exp(0) )N k=1 . Further the model discussed in Sect. 2.1 corresponds to the special case σ1 = σ2 = σ , e.g. σ nm = >σnm

(n, m = 1, . . . , N )

(perfect in the sense of conditions (E1) and (E2)). The model discussed in Sect. 2.2 corresponds to the special case σ1 = σ = σ2 = σ˜ , e.g. σ˜ 7nm = >σnm

(perfect in the sense of (E1), and only asymptotically perfect in the sense of (E2)). Finally the model from this section corresponds to the special case σ1 = σ2 = σ˜ , e.g. σ˜ σ˜ ˜ nm = >nm 7

(non-perfect, neither (E2) nor (E1) hold, but asymptotically perfect in the sense of both conditions).


4. Proof of Theorem 3.1

From (14) we get

$$\|\exp(aK_sg_j)-\exp(0)\|^2 = e^{\frac{a^2}{2}}-1 \qquad (s=1,2;\ j=1,\dots,N), \tag{60}$$

$$\|\exp(aK_sg_j)\|^2 = e^{\frac{a^2}{2}} \qquad (s=1,2;\ j=1,\dots,N). \tag{61}$$
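The two norms can be checked numerically in a one-mode toy model. The sketch below assumes, purely for illustration, that the exponential vector of a complex number f is represented by its truncated coefficient sequence $f^n/\sqrt{n!}$ in a single-mode Fock space; this is a simplified stand-in for the counting-measure realization of Sect. 1.2, and the cutoff and function names are assumptions.

```python
import math


def exp_vector(f, cutoff=60):
    """Coefficients f^n / sqrt(n!) for n = 0, ..., cutoff - 1 (one-mode model)."""
    vec = []
    term = 1.0 + 0.0j
    for n in range(cutoff):
        vec.append(term)
        term = term * f / math.sqrt(n + 1)  # next coefficient f^(n+1)/sqrt((n+1)!)
    return vec


def norm_sq(vec):
    """Squared l^2 norm of a coefficient sequence."""
    return sum(abs(x) ** 2 for x in vec)


def difference(u, v):
    """Componentwise difference u - v."""
    return [x - y for x, y in zip(u, v)]


if __name__ == "__main__":
    a = 1.5
    f = a / math.sqrt(2.0)  # plays the role of a K_s g_j, with |f|^2 = a^2 / 2
    e_f, e_0 = exp_vector(f), exp_vector(0.0)
    print(norm_sq(e_f), math.exp(a * a / 2))                      # cf. (61)
    print(norm_sq(difference(e_f, e_0)), math.exp(a * a / 2) - 1)  # cf. (60)
```

The value in (60) is one less than the value in (61) because $\exp(0)$ is a unit vector whose inner product with every exponential vector equals 1.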

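Using (46), (60) and (61) one easily checks

$$V\big(|\exp(aK_1g_j)-\exp(0)\rangle\otimes|\exp(aK_1g_k)\rangle\big) = \big(e^{\frac{a^2}{2}}-1\big)^{-\frac12}e^{-\frac{a^2}{4}}\Big[\exp\Big(\frac{a}{\sqrt2}K_1(g_j-g_k)\Big)\otimes\exp\Big(\frac{a}{\sqrt2}K_1(g_j+g_k)\Big) - \exp\Big(-\frac{a}{\sqrt2}K_1g_k\Big)\otimes\exp\Big(\frac{a}{\sqrt2}K_1g_k\Big)\Big] \qquad (k,j=1,\dots,N). \tag{62}$$

Lemma 4.1. Put for j, k = 1, . . . , N,

$$\alpha_{jk} := \big\langle|\exp(0)\rangle\otimes|\tilde\eta\rangle,\ V\big(|\exp(aK_1g_j)-\exp(0)\rangle\otimes|\exp(aK_1g_k)\rangle\big)\big\rangle.$$

Then it holds for j, k = 1, . . . , N,

$$\alpha_{jk} = \big(1-e^{-\frac{a^2}{2}}\big)^{\frac12}e^{-\frac{a^2}{2}}\frac{\gamma}{\sqrt N} \qquad (k\neq j), \tag{63}$$

$$\alpha_{jj} = \big(1-e^{-\frac{a^2}{2}}\big)^{\frac12}\frac{\gamma}{\sqrt N}. \tag{64}$$

Proof. We have

$$\langle\exp(0),\exp(f)\rangle = 1 \qquad (f\in L^2(G)). \tag{65}$$

Using (62), (65), and (45) we get for j, k = 1, . . . , N,

$$\alpha_{jk} = \big(e^{\frac{a^2}{2}}-1\big)^{-\frac12}e^{-\frac{a^2}{4}}\frac{\gamma}{\sqrt N}\sum_{s=1}^{N}\Big[\Big\langle|\exp(\sqrt2aK_1g_s)\rangle,\ \exp\Big(\frac{a}{\sqrt2}K_1(g_j+g_k)\Big)\Big\rangle - \Big\langle|\exp(\sqrt2aK_1g_s)\rangle,\ \exp\Big(\frac{a}{\sqrt2}K_1g_k\Big)\Big\rangle\Big]. \tag{66}$$

We have

$$\langle\exp(f_1),\exp(f_2)\rangle = e^{\langle f_1,f_2\rangle} \qquad (f_1,f_2\in L^2(G)). \tag{67}$$

Using (13) and (67) we obtain

$$0 = \Big\langle\exp(\sqrt2aK_1g_s),\ \exp\Big(\frac{a}{\sqrt2}K_1(g_j+g_k)\Big)\Big\rangle - \Big\langle\exp(\sqrt2aK_1g_s),\ \exp\Big(\frac{a}{\sqrt2}K_1g_k\Big)\Big\rangle \qquad (s\neq j). \tag{68}$$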


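From (61), (66), (67), and (68) it follows

$$\alpha_{jk} = \big(e^{\frac{a^2}{2}}-1\big)^{-\frac12}e^{-\frac{a^2}{4}}e^{-\frac{a^2}{2}}\frac{\gamma}{\sqrt N}\Big(e^{a^2\langle K_1g_j,\,K_1(g_j+g_k)\rangle} - e^{a^2\langle K_1g_j,\,K_1g_k\rangle}\Big). \tag{69}$$

Now (13) and (14) imply

$$\langle K_1g_j, K_1g_k\rangle = \frac12\delta_{jk}. \tag{70}$$

For that reason (63) and (64) follow from (69). □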

In the following we fix a pair n, m ∈ {1, . . . , N}.

Remark 4.2. Without loss of generality we can assume

$$B_n = 1, \tag{71}$$

which we can explain as follows: Using (57)–(59) and (54) we obtain in the case (71),

$$\tilde\Theta_{km}(\rho) = \tilde\Theta_{nm}\big(B_k^*\rho B_k\big) \qquad (k=1,\dots,N),$$

$$\tilde p_{km}(\rho) = \tilde p_{nm}\big(B_k^*\rho B_k\big) \qquad (k=1,\dots,N).$$

On the other hand from Theorem 2.1 it follows that in the case (71) for all states ρ with (16) and (17) it holds

$$\Lambda_{km}(\rho) = \Lambda_{nm}\big(B_k^*\rho B_k\big) \qquad (k=1,\dots,N).$$

Finally it is easy to show that $B_k^*\rho B_k$ fulfills (16) and (17) if the state ρ fulfills (16) and (17). For those reasons Theorem 3.1 would be proved if we could prove (40) and (41) on the assumption that we have (71).

Now from (30), (49), and (37) we get

$$\big(U_m^*\otimes\Gamma(T)\big)V\big(|\tilde\eta\rangle\otimes|\exp(0)\rangle\big) = \frac{\gamma}{\sqrt N}\sum_{k=1}^{N}|\exp(aK_1g_k)\rangle\otimes|\exp(aK_2g_{k\oplus m})\rangle. \tag{72}$$

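Lemma 4.3. Put for s = 1, . . . , N

$$\beta_s := \big(\big(|\exp(0)\rangle\langle\exp(0)|\otimes|\tilde\eta\rangle\langle\tilde\eta|\big)V\otimes 1\big)\big(1\otimes U_m^*\otimes\Gamma(T)\big)(1\otimes V)\big(|\Phi_s\rangle\otimes|\tilde\eta\rangle\otimes|\exp(0)\rangle\big) \qquad (s=1,\dots,N). \tag{73}$$

Then it holds

$$\beta_s = \frac{\gamma^2}{N}\big(1-e^{-\frac{a^2}{2}}\big)^{\frac12}\,|\exp(0)\rangle\otimes|\tilde\eta\rangle\otimes\Big[\big(1-e^{-\frac{a^2}{2}}\big)\sum_{j=1}^{N}c_{sj}|\exp(aK_2g_{j\oplus m})\rangle + e^{-\frac{a^2}{2}}\Big(\sum_{j=1}^{N}c_{sj}\Big)\sum_{k=1}^{N}|\exp(aK_2g_k)\rangle\Big]. \tag{74}$$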


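Proof. From (17), (72), and (73) we get

$$\beta_s = \sum_{j=1}^{N}\sum_{k=1}^{N}c_{sj}\frac{\gamma}{\sqrt N}\Big(\big(|\exp(0)\rangle\langle\exp(0)|\otimes|\tilde\eta\rangle\langle\tilde\eta|\big)V\big(|\exp(aK_1g_j)-\exp(0)\rangle\otimes|\exp(aK_1g_k)\rangle\big)\Big)\otimes|\exp(aK_2g_{k\oplus m})\rangle. \tag{75}$$

Further we have

$$\big(|\exp(0)\rangle\langle\exp(0)|\otimes|\tilde\eta\rangle\langle\tilde\eta|\big)V\big(|\exp(aK_1g_j)-\exp(0)\rangle\otimes|\exp(aK_1g_k)\rangle\big) = \big\langle|\exp(0)\rangle\otimes|\tilde\eta\rangle,\ V\big(|\exp(aK_1g_j)-\exp(0)\rangle\otimes|\exp(aK_1g_k)\rangle\big)\big\rangle\ |\exp(0)\rangle\otimes|\tilde\eta\rangle \qquad (j,k=1,\dots,N). \tag{76}$$

Using Lemma 4.1, (75), and (76) we obtain

$$\beta_s = \frac{\gamma^2}{N}\big(1-e^{-\frac{a^2}{2}}\big)^{\frac12}\,|\exp(0)\rangle\otimes|\tilde\eta\rangle\otimes\sum_{j=1}^{N}c_{sj}|\exp(aK_2g_{j\oplus m})\rangle + \frac{\gamma^2}{N}\big(1-e^{-\frac{a^2}{2}}\big)^{\frac12}e^{-\frac{a^2}{2}}\,|\exp(0)\rangle\otimes|\tilde\eta\rangle\otimes\sum_{j}\sum_{k\neq j}c_{sj}|\exp(aK_2g_{k\oplus m})\rangle.$$

That implies (74). □

Now we put

$$|\Phi_0\rangle := \frac{1}{\sqrt N}\sum_{j=1}^{N}|\exp(aK_1g_j)-\exp(0)\rangle. \tag{77}$$

Since $F_+ = 1-|\exp(0)\rangle\langle\exp(0)|$, one easily checks

$$F_+|\exp(aK_rg_k)\rangle = \big(1-e^{-\frac{a^2}{2}}\big)^{\frac12}|\exp(aK_rg_k)-\exp(0)\rangle \qquad (r=1,2;\ k=1,\dots,N). \tag{78}$$

Using (77) and (78) we obtain

$$F_+\sum_{k=1}^{N}|\exp(aK_2g_k)\rangle = \big(1-e^{-\frac{a^2}{2}}\big)^{\frac12}\sqrt N\,U_m\Gamma(T)|\Phi_0\rangle = \big(1-e^{-\frac{a^2}{2}}\big)^{\frac12}\sqrt N\,\Gamma(T)U_m|\Phi_0\rangle. \tag{79}$$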


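Using the same arguments we get

$$F_+\Big(\sum_{j=1}^{N}c_{sj}|\exp(aK_2g_{j\oplus m})\rangle\Big) = \big(1-e^{-\frac{a^2}{2}}\big)^{\frac12}U_m\Gamma(T)|\Phi_s\rangle \qquad (s=1,\dots,N) \tag{80}$$

$$= \big(1-e^{-\frac{a^2}{2}}\big)^{\frac12}\Gamma(T)U_m|\Phi_s\rangle. \tag{81}$$

Finally we have

$$\big(|\exp(0)\rangle\langle\exp(0)|\otimes|\tilde\eta\rangle\langle\tilde\eta|\otimes F_+\big)(V\otimes 1) = (1\otimes 1\otimes F_+)\big(\big(|\exp(0)\rangle\langle\exp(0)|\otimes|\tilde\eta\rangle\langle\tilde\eta|\big)V\otimes 1\big). \tag{82}$$

Using (54), (71), (79), (80), and Lemma 4.3 one easily checks the following equality:

$$W_{nm}\big(|\Phi_s\rangle\otimes|\tilde\eta\rangle\otimes|\exp(0)\rangle\big) = \frac{\gamma^2}{N}\big(1-e^{-\frac{a^2}{2}}\big)\,|\exp(0)\rangle\otimes|\tilde\eta\rangle\otimes\Gamma(T)U_mB_n^*\Big[\big(1-e^{-\frac{a^2}{2}}\big)|\Phi_s\rangle + e^{-\frac{a^2}{2}}\Big(\sum_jc_{sj}\Big)\sqrt N\,|\Phi_0\rangle\Big] \qquad (s=1,\dots,N). \tag{83}$$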

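For that reason we have the following lemma.

Lemma 4.4. For each bounded operator A on $\mathcal M$ and s = 1, . . . , N it holds

$$\vartheta_s(A) := \big\langle W_{nm}\big(|\Phi_s\rangle\otimes|\tilde\eta\rangle\otimes|\exp(0)\rangle\big),\ (1\otimes 1\otimes A)W_{nm}\big(|\Phi_s\rangle\otimes|\tilde\eta\rangle\otimes|\exp(0)\rangle\big)\big\rangle$$

$$= \Big(\frac{\gamma^2}{N}\Big)^2\big(1-e^{-\frac{a^2}{2}}\big)^2\Big[\big(1-e^{-\frac{a^2}{2}}\big)^2\big\langle\Gamma(T)U_mB_n^*|\Phi_s\rangle,\ A\,\Gamma(T)U_mB_n^*|\Phi_s\rangle\big\rangle$$

$$\quad + e^{-\frac{a^2}{2}}\big(1-e^{-\frac{a^2}{2}}\big)\Big(\sum_jc_{sj}\Big)\sqrt N\,\big\langle\Gamma(T)U_mB_n^*|\Phi_s\rangle,\ A\,\Gamma(T)U_mB_n^*|\Phi_0\rangle\big\rangle$$

$$\quad + e^{-\frac{a^2}{2}}\big(1-e^{-\frac{a^2}{2}}\big)\overline{\Big(\sum_jc_{sj}\Big)}\sqrt N\,\big\langle\Gamma(T)U_mB_n^*|\Phi_0\rangle,\ A\,\Gamma(T)U_mB_n^*|\Phi_s\rangle\big\rangle$$

$$\quad + e^{-a^2}\Big|\sum_jc_{sj}\Big|^2N\,\big\langle\Gamma(T)U_mB_n^*|\Phi_0\rangle,\ A\,\Gamma(T)U_mB_n^*|\Phi_0\rangle\big\rangle\Big].$$

Now from (16) we get

$$\rho\otimes|\tilde\eta\rangle\langle\tilde\eta|\otimes|\exp(0)\rangle\langle\exp(0)| = \sum_{s=1}^{N}\lambda_s\,|\Phi_s\otimes\tilde\eta\otimes\exp(0)\rangle\langle\Phi_s\otimes\tilde\eta\otimes\exp(0)|. \tag{84}$$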


N On the other hand (|?s ⊗ η˜ ⊗ exp(0) )N s=1 is an ONS because (?s )s=1 is an ONS. For that reason from (57,58), (84), and Lemma 4.4 with A = 1 it follows

2

2 2 2 N a2 a2 γ 2 p˜ nm (ρ) = 1 − e− 2 1 − e− 2 + N e−a λs | csj |2 N

√ a2 + N e− 2

1 − e−

N 2

a 2

s=1

λs

s=1

N

s=1

csj ?s , ?0 + csj ?s , ?0

.

(85)

j =1

N As |exp(aK1 gj ) − exp(0) j =1 is an ONS we can calculate easily N 1

|?s , |?0 = √ csk . N k=1

For that reason from (85) follows

2 2 N N 2 2 a2 2 γ − a2 1−e 1 − e− 2 + λs | csj |2 p˜ nm (ρ) = N s=1 j =1

2 2 √ a a 2 N e−a + 2 N e− 2 1 − e− 2 . Further we have

s

λs = 1 and

N

|csj |2 2 |csk |2 |c | c ≤ c + ≤ N. ≤ sj sj sk 2 2 j =1

j

k

j

k

Using (86), (87) and the definition of γ (cf. (29)) we can estimate p˜ nm (ρ) − 1 2 N γ 2 2 2 2 2 2 − a2 − a2 = 1 − e 1 − e N 2 √ − a2 a2 1 −a 2 − − 4 + λs csj N e + 2 Ne 2 1 − e 2 γ s j a2 4 1 ≤ 2 1 − e− 2 N √ a2 2 a2 a2 2 + 1 − e− 2 λs | csj |2 N e−a + 2 N e− 2 1 − e− 2 s

j

2 − 1 + (N − 1)e √ a2 4 a2 1 2 2 ≤ 2 1 − e− 2 − 1 + (N − 1)e−a + e− 2 N N + 2 N N

2 2 √ 1 − a2 2 − a2 ≤ 2 e N (N + 2 N ) . 14 + N + e N

That implies (41).

(86)

−a 2

(87)


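Lemma 4.5. We use the notation $\vartheta_s(A)$ from Lemma 4.4. Then for each bounded operator A on $\mathcal M$ and s = 1, . . . , N it holds

$$Z_s(A) := \Big|\frac{\vartheta_s(A)}{\tilde p_{nm}(\rho)} - \big\langle\Gamma(T)U_mB_n^*|\Phi_s\rangle,\ A\,\Gamma(T)U_mB_n^*|\Phi_s\rangle\big\rangle\Big| \le \frac{2e^{-\frac{a^2}{2}}}{\big(1-e^{-\frac{a^2}{2}}\big)^2}\,\big(N^2+N\sqrt N+N\big)\,\|A\|.$$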

Proof. Using Lemma 4.4 and the estimation (T )Um B ∗ |?k , A(T )Um B ∗ |?r ≤ A n n

(k, r = 0, . . . , N ),

we get 2 γ2 a2 2 1 − e− 2 Zs (A) ≤ A (p˜ nm (ρ))−1 − 1 N  2 2

2 2 2 √ 4 a a a γ + A 1 − e− 2 N csj (p˜ nm (ρ))−1 2e− 2 1 − e− 2 N j  2 + e−a N | csj |2  . j

Because of (86) it follows  a2 2 − 2  1−e 

Z ≤ A  −1 √ a2 2 a2 a2  2 − 2 − 2 2 −a 1 − e− 2 + λs | csj | N e +2 Ne 1−e s j

 √ a2 a2 2 2e− 2 1 − e− 2 N | csj | + e−a N | csj |2  j j 

+ .  √ a2 2 a2 a2  2 + λs | csj |2 N e−a + 2 N e− 2 1 − e− 2 1 − e− 2 s

j

Using (87) we get 2 2 − a2 1−e − 1

2

√ − a2 a2 a2 2 − − 2 −a 1−e 2 + λs | csj | N e +2 Ne 2 1−e 2 s j ≤

e−

a2 2

1 − e−

√ 2 N + 2N N a2 2 2


and 2e−

1 − e− e

≤

a2 2

a2 2

2

2 − a2

1−e

2 − a2

+

1 − e− s

λs |

a2 2

j

√

N|

j

csj | + e−a N | 2

j

csj |2

√ a2 a2 2 csj |2 N e−a + 2 N e− 2 1 − e− 2

2 2 2N + N

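That proves Lemma 4.5. □

We have the representation (84) of $\rho\otimes|\tilde\eta\rangle\langle\tilde\eta|\otimes|\exp(0)\rangle\langle\exp(0)|$ as a mixture of orthogonal projections. Thus from (56) and (57), (58) we get, with the notation $\vartheta_s(A)$ from Lemma 4.4,

$$\mathrm{tr}\big(\tilde\Theta_{nm}(\rho)A\big) = \sum_s\lambda_s\,\vartheta_s(A)\,\big(\tilde p_{nm}(\rho)\big)^{-1}.$$

On the other hand from Theorem 2.1 it follows that

$$\mathrm{tr}\big(\Lambda_{nm}(\rho)A\big) = \sum_s\lambda_s\,\big\langle\Gamma(T)U_mB_n^*|\Phi_s\rangle,\ A\,\Gamma(T)U_mB_n^*|\Phi_s\rangle\big\rangle.$$

Consequently we have, with the notation $Z_s(A)$ from Lemma 4.5,

$$\big|\mathrm{tr}\big(\tilde\Theta_{nm}(\rho)A\big) - \mathrm{tr}\big(\Lambda_{nm}(\rho)A\big)\big| \le \sum_s\lambda_sZ_s(A).$$

For that reason (40) follows from Lemma 4.5 and $\sum_s\lambda_s = 1$. That completes the proof of Theorem 3.1.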
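To get a feeling for how large the density d has to be before the estimates of Theorem 3.1 become informative, the right-hand sides of (40) and (41) can be evaluated numerically. The sketch below only evaluates the two bounds as they are stated in (40) and (41); the function names and the sample values of N and d are illustrative assumptions.

```python
import math


def channel_bound(N, d, op_norm=1.0):
    """Right-hand side of (40): 2 e^{-d/2} / (1 - e^{-d/2})^2 * (N^2 + N sqrt(N) + N) * ||A||."""
    x = math.exp(-d / 2.0)
    return 2.0 * x / (1.0 - x) ** 2 * (N ** 2 + N * math.sqrt(N) + N) * op_norm


def probability_bound(N, d):
    """Right-hand side of (41): e^{-d/2} * (14 / N^2 + 2 + 2 / sqrt(N))."""
    return math.exp(-d / 2.0) * (14.0 / N ** 2 + 2.0 + 2.0 / math.sqrt(N))


if __name__ == "__main__":
    N = 4
    for d in (5.0, 10.0, 20.0, 40.0):
        # (41) is informative once its bound drops below 1 / N^2.
        print(d, channel_bound(N, d), probability_bound(N, d), 1.0 / N ** 2)
```

Both bounds decay like $e^{-d/2}$, but the prefactor in (40) grows like $N^2$, so larger systems need a correspondingly larger density before the bound falls below a prescribed tolerance.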

References

1. Accardi, L. and Ohya, M.: Teleportation of general quantum states. Volterra Center preprint, 1998
2. Accardi, L., Ohya, M.: Compound channels, transition expectations and liftings. Applied Mathematics & Optimization 39, 33–59 (1999)
3. Bennett, C.H., Brassard, G., Crépeau, C., Jozsa, R., Peres, A. and Wootters, W.: Teleporting an unknown quantum state via dual classical and Einstein–Podolsky–Rosen channels. Phys. Rev. Lett. 70, 1895–1899 (1993)
4. Bennett, C.H., Brassard, G., Popescu, S., Schumacher, B., Smolin, J.A., Wootters, W.K.: Purification of noisy entanglement and faithful teleportation via noisy channels. Phys. Rev. Lett. 76, 722–725 (1996)
5. Ekert, A.K.: Quantum cryptography based on Bell's theorem. Phys. Rev. Lett. 67, 661–663 (1991)
6. Fichtner, K.-H. and Freudenberg, W.: Point processes and the position distribution of infinite boson systems. J. Stat. Phys. 47, 959–978 (1987)
7. Fichtner, K.-H. and Freudenberg, W.: Characterization of states of infinite Boson systems I. – On the construction of states. Commun. Math. Phys. 137, 315–357 (1991)
8. Fichtner, K.-H., Freudenberg, W. and Liebscher, V.: Time evolution and invariance of Boson systems given by beam splittings. Infinite Dim. Anal. Quantum Prob. and Related Topics I, 511–533 (1998)
9. Lindsay, J.M.: Quantum and Noncausal Stochastic Calculus. Prob. Th. Rel. Fields 97, 65–80 (1993)
10. Fichtner, K.-H. and Winkler, G.: Generalized brownian motion, point processes and stochastic calculus for random fields. Math. Nachr. 161, 291–307 (1993)
11. Inoue, K., Ohya, M. and Suyari, H.: Characterization of quantum teleportation processes by nonlinear quantum mutual entropy. Physica D 120, 117–124 (1998)
12. Fichtner, K.-H. and Ohya, M.: Quantum Teleportation with Entangled States given by Beam Splittings. Commun. Math. Phys. 222, 229–247 (2001)
13. Fichtner, K.-H., Freudenberg, W. and Liebscher, V.: On Exchange Mechanisms for Bosons. Submitted to Infinite Dim. Anal., Quantum Prob. and Rel. Topics

Communicated by H. Araki

Commun. Math. Phys. 225, 91 – 119 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

Rates of Convergence in the CLT for Two-Dimensional Dispersive Billiards Françoise Pène Departement de Mathématiques, Université de Bretagne Occidentale, 6, avenue Victor le Gorgeu, 29285 Brest Cedex, France. E-mail: [email protected] Received: 23 April 2001 / Accepted: 3 August 2001

Abstract: We consider billiards in the two-dimensional torus with convex obstacles. Central Limit Theorems have been established for regular functions for the billiard transformation in [2], [1] and [14]. We are interested here in the problem of the rate of convergence. In this paper, we establish a rate in $O\big(n^{-\frac12+\alpha}\big)$ (for any α > 0) for the billiard transformation, by adapting the proof of [7, 6, 8]. In our proof, we use a strong decorrelation result obtained by the method developed in [14] for the study of general hyperbolic systems. Moreover, we establish a rate of convergence in $O\big(t^{-\frac16}\big)$ in the Central Limit Theorem for the billiard flow.

Contents
1. Two-Dimensional Dispersive Billiards with Finite Horizon
2. Main Results
3. Rate of Convergence for the Billiard Transformation: Proof
4. Rate of Convergence for the Billiard Flow: Proof
A. Construction of L.-S. Young's Tower: Recalls
B. Proof of the Strong Decorrelation Property

1. Two-Dimensional Dispersive Billiards with Finite Horizon

1.1. Description of the model. Stochastic properties (ergodicity, K property, CLT, exponential rate of decorrelation) of the system we consider in this paper have been studied in [13, 4, 2, 1, 14, 10] and in many other articles. We consider a compact subset Q of $\mathbb T^2$ (with connected interior), the complement of which is a finite union of strictly convex open sets (the closures of which are pairwise disjoint). Two examples of such domains Q are drawn (in white) in the following picture. We suppose that the boundary


∂Q of Q is $C^3$ with nowhere vanishing curvature κ. For any q ∈ ∂Q, we denote by n(q) the unit normal vector to ∂Q at q, oriented towards the inside of Q. We are interested in the behavior of a point particle moving in Q with unit speed and with elastic reflections off ∂Q.

[Figure: two examples of domains Q.]

The set of configurations is the set $Q_1$ given by

$$Q_1 := T^1\mathring Q\,\cup\,M, \quad\text{with } M := \{x=(q,v)\in T^1Q : q\in\partial Q,\ \langle n(q),v\rangle\ge 0\},$$

where $T^1A$ denotes the unit tangent bundle to A. The billiard flow in Q is the flow $(Y_t)_t$ on $Q_1$ given by $Y_t(q,v) = (q_t,v_t)$, where $(q_t,v_t)$ is the position-velocity pair at time t of a particle that was at the position q with the velocity v at time 0. This flow preserves the normalised Lebesgue measure µ on $Q_1$. It is a classical result that this flow can be represented by the suspension flow over the dynamical system (M, ν, T) (corresponding to the reflection times) defined by the function $\tau^+ : M\to\,]0;+\infty[$ (cf. Sect. 2.2) where:
• M is the set defined previously, corresponding to all possible configurations of a particle at the time just after a reflection;
• ν is the Borel probability measure on M proportional to the measure given by cos(ϕ) dr dϕ, where ϕ is the angular measure in $[-\frac\pi2;\frac\pi2]$ of the angle between n(q) and v and where r is the curvilinear abscissa of q on the connected component of ∂Q to which it belongs;
• T is the transformation that maps a configuration x = (q, v) ∈ M of a particle, at the time after a reflection, to the configuration T(x) = (q′, v′) of this particle at the time after the next reflection;

[Figure: a configuration x and its image T(x).]

• $\tau^+(q,v)$ is the distance covered by a particle at position q ∈ ∂Q with velocity v until the next reflection off ∂Q: $\tau^+(q,v) := \min\{t>0 : q+tv\in\partial Q\}$.

It is a classical result that the probability measure ν is T-invariant. The dynamical system (M, ν, T) is called the billiard system in the domain Q. The billiard system (M, ν, T) is supposed to have finite horizon, i.e. the function $\tau^+$ is uniformly bounded.
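For a circular obstacle the free flight time in the definition of $\tau^+$ reduces to solving a quadratic equation. The following sketch computes the first positive hitting time of a unit-speed ray with a single disc in the plane; it is an illustration under simplifying assumptions (one disc, no torus identification), and the function name and tolerance are hypothetical. On the torus one would take the minimum over all obstacles and their periodic copies.

```python
import math


def free_flight_to_disc(q, v, center, radius, eps=1e-12):
    """Smallest t > eps with |q + t v - center| = radius, or None if the ray misses.

    The velocity v is assumed to have unit length, so the hitting
    condition is t^2 + 2 b t + c = 0 with b = <q - center, v> and
    c = |q - center|^2 - radius^2.
    """
    dx, dy = q[0] - center[0], q[1] - center[1]
    b = dx * v[0] + dy * v[1]
    c = dx * dx + dy * dy - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None  # the line through q with direction v does not meet the disc
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):  # the two roots in increasing order
        if t > eps:
            return t
    return None  # both intersections lie behind the particle


if __name__ == "__main__":
    print(free_flight_to_disc((-3.0, 0.0), (1.0, 0.0), (0.0, 0.0), 1.0))
    print(free_flight_to_disc((-3.0, 0.0), (0.0, 1.0), (0.0, 0.0), 1.0))
```

The finite horizon assumption says that, in the periodic setting, such a hitting time always exists and is bounded uniformly in (q, v).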

Billiards


In the first picture, only the second example has finite horizon. Let us define the set $R_0$ of unit vectors tangent to $\partial Q$: $R_0 := \{(q,v) \in M : \langle n(q), v\rangle = 0\}$. For any integer $k$, we write $R_k := T^k(R_0)$. For any $-\infty \le k \le l \le +\infty$, we denote $R_{k,l} := \bigcup_{j=k}^{l} R_j$. We recall that, for any integer $k$, the set $R_k$ is a finite union of $C^1$-regular curves. The study of the billiard system $(M,\nu,T)$ is complicated by the existence of singularities for $T$ corresponding to points in $R_{-1}$ (cf. picture).

But it is well known that, for any integer $k \ge 1$, $T^k$ defines a $C^1$-diffeomorphism from $M \setminus R_{-k,0}$ onto $M \setminus R_{0,k}$ (cf. [9], [10]).

1.2. A strong decorrelation property. In [14], L.-S. Young established a CLT and an exponential rate of decorrelation for Hölder functions for the billiard system $(M,\nu,T)$. We can extend her results to a more general class of functions $(H_{\eta,m})$ (appearing in another form in Sect. 10 of [3]). Let $\eta \in ]0;1]$ be a real number and $m \ge 0$ be an integer. We denote by $H_{\eta,m}$ the set of bounded functions $\phi : M \to \mathbb{C}$ for which the following quantity is finite:
\[ C_\phi = C_\phi^{(\eta,m)} := \sup_{C \in \mathcal{C}_m}\ \sup_{x,y \in C,\ x \ne y} \frac{|\phi(x) - \phi(y)|}{\left(\max(d(x,y), \dots, d(T^m(x),T^m(y)))\right)^{\eta}}, \]
where $\mathcal{C}_m$ is the set of the connected components of $M \setminus R_{-m,0}$ and where $d$ is the metric defined on each connected component of $M$ by $d((q,v),(q',v')) = \sqrt{|r-r'|^2 + |\varphi-\varphi'|^2}$ if $r$ (resp. $r'$) is the curvilinear abscissa of $q$ (resp. $q'$) and if $\varphi$ (resp. $\varphi'$) is the angular measure in $[-\frac{\pi}{2};\frac{\pi}{2}]$ of the angle between $n(q)$ (resp. $n(q')$) and $v$ (resp. $v'$). The space $H_{\eta,m}$ is endowed with the norm $\|\cdot\|_{(\eta,m)}$ given by
\[ \|\phi\|_{(\eta,m)} = \|\phi\|_\infty + C_\phi^{(\eta,m)}. \]

This space can be understood as the space of functions that are $\eta$-Hölder in $m$ future coordinates.

Example. If $\phi : M \to \mathbb{R}$ is Hölder of order $\eta$ on each connected component of $M$, then $\phi \circ T^m$ may fail to be Hölder but belongs to $H_{\eta,m}$. The function $\tau^+$ is in $H_{1,1}$.

The following observations about these spaces shall be useful for our purpose.

Proposition 1.1. Let $\eta \in ]0;1]$ be a real number, $m_0 \ge 0$ be an integer and $\phi$ and $\psi$ be two functions in $H_{\eta,m_0}$. Then we have the following properties:
1. The product $\phi\cdot\psi$ is in $H_{\eta,m_0}$ and we have: $\|\phi\cdot\psi\|_{(\eta,m_0)} \le \|\phi\|_{(\eta,m_0)} \|\psi\|_{(\eta,m_0)}$.
2. For any integer $m_1$ such that $m_1 \ge m_0$, $\phi$ is in $H_{\eta,m_1}$ and we have: $C_\phi^{(\eta,m_1)} \le C_\phi^{(\eta,m_0)}$, so $\|\phi\|_{(\eta,m_1)} \le \|\phi\|_{(\eta,m_0)}$.
3. For any integer $m \ge 0$, $\phi \circ T^m$ is in $H_{\eta,m_0+m}$ and we have: $C_{\phi\circ T^m}^{(\eta,m_0+m)} \le C_\phi^{(\eta,m_0)}$, so $\|\phi \circ T^m\|_{(\eta,m_0+m)} \le \|\phi\|_{(\eta,m_0)}$.
4. If $\phi$ is a real-valued function, then $e^{i\phi}$ is in $H_{\eta,m_0}$ and we have: $C_{e^{i\phi}}^{(\eta,m_0)} \le C_\phi^{(\eta,m_0)}$, so $\|e^{i\phi}\|_{(\eta,m_0)} \le 1 + C_\phi^{(\eta,m_0)}$.

Now, we fix a real number $\eta \in ]0;1]$ and an integer $m_0 \ge 0$. We recall the following result, which is a consequence of the method developed by L.-S. Young in [14]. It is proven in Appendix B of this paper and appeared in a weaker form in [10]. It is obtained by a careful rewriting of the proof of the exponential rate of decorrelation given in [14].

Proposition 1.2. Let $r > 1$ be a real number. There exist two constants $C_r > 0$ and $\delta_r \in ]0;1[$ such that for any integers $m_1 \ge 0$ and $m_2 \ge 0$, any functions $\phi$ and $\psi$ in $H_{\eta,m_1}$ and in $H_{\eta,m_2}$ respectively and any integer $n \ge 0$, we have
\[ \left|\mathrm{Cov}_\nu\left(\phi, \psi\circ T^n\right)\right| \le C_r \|\phi\|_{(\eta,m_1)} \|\psi\|_{(\eta,m_2)}\, \delta_r^{\,n - r m_1}. \tag{1} \]

We mention that analogous results should be obtainable in the same way for any hyperbolic system to which the L.-S. Young method can be applied (the calculations done in Sect. I.4 of [14] should be modified as in the proof of Theorem B.1 of this paper). We point out that Proposition 1.2 is much stronger than the exponential rate of decorrelation. In particular, according to Proposition 1.1 and by a classical combinatorial argument, it gives the following result:

Corollary 1.3 (cf. Proposition 4.2.3.1 in [10]). Let $f$ be a real-valued function in $H_{\eta,m_0}$, $\nu$-centered. Then, for any real number $q \ge 1$, we have:
\[ \sup_{n\ge1} \left\| \frac{\sum_{k=0}^{n-1} f\circ T^k}{\sqrt{n}} \right\|_q < +\infty. \]

Moreover, Proposition 1.2 implies a result of multiple decorrelation giving the CLT and playing a major role in the proofs of [11] (cf. also Chapter 8 of [10]). We can notice that, in the following, we shall only use the fact that inequality (1) holds for some r > 1.


2. Main Results

2.1. A rate of convergence for the billiard transformation. For each function $f : M \to \mathbb{R}$ and each integer $n \ge 1$, we shall write
\[ S_n(f) := \sum_{k=0}^{n-1} f\circ T^k. \]
Let $f : M \to \mathbb{R}$ be a $\nu$-centered function in $H_{\eta,m_0}$. We know that the following limit exists:
\[ \sigma^2 = \sigma^2(f) := \lim_{n\to+\infty} E_\nu\left[\left(\frac{S_n(f)}{\sqrt{n}}\right)^2\right] \]
and that we have:
\[ \sigma^2(f) = \sum_{k\in\mathbb{Z}} E_\nu\left[f\cdot f\circ T^k\right] = E_\nu[f^2] + 2\sum_{k\ge1} E_\nu\left[f\cdot f\circ T^k\right]. \tag{2} \]
In the following, we shall suppose that the function $f$ is not a coboundary in $L^2$, i.e. that we have $\sigma > 0$. Then, according to the foregoing, we already know that the sequence $\left(\frac{S_n(f)}{\sqrt{n}}\right)_{n\ge1}$ of random variables converges in distribution to a random variable with Gaussian distribution $N(0,\sigma^2)$, i.e. that the following quantity converges to 0 as $n$ goes to infinity:
\[ \Delta_n(f) := \sup_{x\in\mathbb{R}} \left| \nu\left(\frac{S_n(f)}{\sqrt{n}} \le x\right) - \int_{-\infty}^{x} \frac{e^{-\frac{t^2}{2\sigma^2}}}{\sigma\sqrt{2\pi}}\,dt \right|. \]
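The quantity just defined is a sup-distance between a distribution function and a Gaussian one. As a rough illustration only (this is an empirical analogue and not the paper's method), the sketch below computes that distance when the law of $S_n(f)/\sqrt{n}$ is replaced by the empirical distribution of a finite sample; the function names and the sample itself are assumptions.

```python
import math

def gaussian_cdf(x, sigma):
    """Distribution function of N(0, sigma^2) at x."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def kolmogorov_distance(samples, sigma):
    """sup_x |F_emp(x) - Phi_sigma(x)| for the empirical distribution of `samples`.

    The supremum is attained at a sample point, either as the value of the
    empirical distribution function there or as its left limit.
    """
    xs = sorted(samples)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        g = gaussian_cdf(x, sigma)
        worst = max(worst, abs((i + 1) / n - g), abs(i / n - g))
    return worst
```

With many independent draws of the initial configuration, `kolmogorov_distance` applied to the normalised sums would give a noisy numerical proxy for the distance above; the sampling error of such an experiment is itself of order one over the square root of the sample size, so it cannot resolve rates finer than that.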

The question we are interested in is the rate of convergence to 0 of this quantity. The aim of this paper is to prove the following result.

Theorem 2.1. Let $f$ be a real-valued function in $H_{\eta,m_0}$, $\nu$-centered and such that $\sigma(f) > 0$. For any real number $\alpha > 0$, we have
\[ \Delta_n(f) = O_{n\to+\infty}\left(n^{-\frac12+\alpha}\right). \]
To prove this, we shall establish the following result and conclude with the use of a classical inequality. For any real number $t$ and any integer $n \ge 1$, we write:
\[ h_n(f,t) = \left| E_\nu\left[e^{\frac{itS_n(f)}{\sqrt{n}}}\right] - e^{-\frac{\sigma^2t^2}{2}} \right|. \]
Proposition 2.2. Let $f$ be a real-valued function in $H_{\eta,m_0}$, $\nu$-centered, such that $\sigma(f) = 1$ and let $\alpha \in ]0;\frac12[$ be a real number. For any integer $p \ge 0$, there exists a real number $L_p = L_{p,\alpha} > 0$ such that for any integer $n \ge 1$ and any real number $t \in ]0; n^{\frac12-\alpha}[$, we have
\[ h_n(f,t) \le L_p \frac{t^p}{n^{p(\frac12-\alpha)}} + a_{n,p}(t), \]
where the functions $a_{n,p} = a_{n,p,\alpha}$ satisfy $a_{n,p}(t) \ge 0$ and
\[ \int_0^{n^{\frac12-\alpha}} \frac{a_{n,p}(t)}{t}\,dt = O_{n\to+\infty}\left(\frac{1}{n^{\frac12-\alpha}}\right). \]

The proofs of these results follow the schemes of the proofs of [7, 6, 8].

2.2. A rate of convergence for the billiard flow. Let $(\mathcal{N},\lambda,(S_t)_t)$ be the suspension flow defined over the dynamical system $(M,\nu,T)$ (corresponding to the reflection times) by the function $\tau^+ : M \to ]0;+\infty[$, i.e. $(\mathcal{N},\lambda,(S_t)_t)$ is defined as follows:
• $\mathcal{N}$ is the set $\mathcal{N} := \{(\omega,s) : \omega \in M,\ s \in [0;\tau^+(\omega)[\}$ (the set $\mathcal{N}$ is a fundamental domain of the quotient space of $M \times \mathbb{R}$ by the equivalence relation generated by $(\omega,s) \sim (T(\omega), s - \tau^+(\omega))$);
• the probability measure $\lambda$ is the normalised Lebesgue measure on $\mathcal{N}$, i.e. $\lambda$ is given by
\[ \int_{\mathcal{N}} f(y)\,d\lambda(y) = \frac{1}{\int_M \tau^+(\omega')\,d\nu(\omega')} \int_M \int_0^{\tau^+(\omega)} f(\omega,s)\,ds\,d\nu(\omega); \]
• $S_t$ corresponds to $S_t(\omega,s) = (\omega,s+t)$, where we identify $(\omega,\tau^+(\omega))$ with $(T(\omega),0)$.

In the following, we identify the billiard flow $(Q_1,\mu,(Y_t)_t)$ with $(\mathcal{N},\lambda,(S_t)_t)$ by $\pi : \mathcal{N} \to Q_1$ with $\pi((q,v),s) := (q+sv,v)$. Let a real number $\eta \in ]0;1]$ be given. Let us consider a $\mu$-centered Hölder function $f : T^1Q \to \mathbb{R}$ of order $\eta$. Then, we define the function $F : M \to \mathbb{R}$ as follows:
\[ F(\omega) := \int_0^{\tau^+(\omega)} f(\omega,s)\,ds. \]
Direct calculations ensure that $F$ is in $H_{\eta,1}$, is $\nu$-centered and that the following limit exists:
\[ \tilde\sigma^2(f) = \lim_{t\to+\infty} E_\mu\left[\left(\frac{1}{\sqrt{t}}\int_0^t f\circ Y_s(\cdot)\,ds\right)^2\right] \]
and satisfies:
\[ \tilde\sigma^2(f) = \frac{\sigma^2(F)}{\int_M \tau^+(\omega')\,d\nu(\omega')}. \]
Theorem 2.3. Let $f : T^1Q \to \mathbb{R}$ be a Hölder function of order $\eta$, $\mu$-centered and such that $\tilde\sigma^2(f) > 0$. Then, we have:
\[ \sup_{x\in\mathbb{R}} \left| \mu\left(\frac{1}{\sqrt{t}}\int_0^t f\circ Y_s(\cdot)\,ds \le x\right) - P(X \le x) \right| = O\left(t^{-\frac16}\right), \]
where $X$ is a centered Gaussian variable such that $E[X^2] = \tilde\sigma^2(f)$. We do not claim that this rate is optimal.


3. Rate of Convergence for the Billiard Transformation: Proof

Proof of Proposition 2.2. Let $f$ be a real-valued function in $H_{\eta,m_0}$, $\nu$-centered and such that $\sigma^2(f) = 1$ and let $\alpha \in ]0;\frac12[$ be a real number. We shall prove, by induction on $p$, that, for any integer $p \ge 0$, the following property is satisfied:

Property $(H_p)$. There exists a constant $L_p > 0$ such that for any integer $n \ge 1$ and any real number $t \in ]0;n^{\frac12-\alpha}[$, we have
\[ h_n(f,t) \le L_p \frac{t^p}{n^{p(\frac12-\alpha)}} + a_{n,p}(t), \]
where the functions $a_{n,p}$ satisfy $a_{n,p}(t) \ge 0$,
\[ A_p := \sup_{n\ge1,\ t\in]0;n^{\frac12-\alpha}[} n^{\frac12-\alpha}\, a_{n,p}(t) < +\infty \quad\text{and}\quad \int_0^{n^{\frac12-\alpha}} \frac{a_{n,p}(t)}{t}\,dt = O_{n\to+\infty}\left(\frac{1}{n^{\frac12-\alpha}}\right). \]

Since we have $h_n(f,t) \le 2$, property $(H_0)$ is clearly satisfied with $L_0 = 2$ and with $a_{n,0}(t) = 0$. Now, we consider an integer $p \ge 0$; we suppose that property $(H_p)$ is satisfied and prove that property $(H_{p+1})$ is then also satisfied. The function $a_{n,p+1}$ shall be given by:
\[ a_{n,p+1}(t) = \min\Big(2, \sum_{j=1}^{6} a^{(j)}_{n,p+1}(t)\Big), \]
where the $a^{(j)}_{n,p+1}$'s shall appear in the following calculations and shall satisfy:
\[ \limsup_{n\to+\infty}\ n^{\frac12-\alpha} \int_0^{n^{\frac12-\alpha}} \frac{a^{(j)}_{n,p+1}(t)}{t}\,dt < +\infty \quad\text{and}\quad \limsup_{n\to+\infty}\ \sup_{t\in]0;n^{\frac12-\alpha}[} n^{\frac12-\alpha}\, a^{(j)}_{n,p+1}(t) < +\infty. \]

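In the following, $n$ shall be any integer and $t$ any real number satisfying $n \ge 1$ and $t \in ]0;n^{\frac12-\alpha}[$. The constants implicit in the notation $O$ shall only depend on $p$, $\alpha$ and $f$: for example, the notation $g_{n,t} = O(h_{n,t})$ shall mean that there exists a constant $C > 0$ such that for any integer $n \ge 1$ and any real number $t \in ]0;n^{\frac12-\alpha}[$, we have $|g_{n,t}| \le C\, h_{n,t}$.

1. Control of $a^{(1)}_{n,p+1}(t) := \left| e^{-\frac{t^2}{2}} - \left(1 - \frac{t^2}{2n}\right)^n \right|$. We have:
\[ a^{(1)}_{n,p+1}(t) \le n\, e^{-\frac{t^2}{2}\left(1-\frac1n\right)} \frac{t^4}{8n^2} = e^{-\frac{t^2}{2}\left(1-\frac1n\right)} \frac{t^4}{8n}. \tag{3} \]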
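Inequality (3) can be sanity-checked numerically. The sketch below is an illustration only: the value of $\alpha$, the list of values of $n$ and the grid in $t$ are arbitrary assumptions, and a small floating-point tolerance is added to the comparison. The power $(1 - t^2/(2n))^n$ is evaluated through `log1p` to limit rounding error.

```python
import math

def lhs(t, n):
    """|exp(-t^2/2) - (1 - t^2/(2n))^n|, valid when t^2 < 2n."""
    power = math.exp(n * math.log1p(-t * t / (2.0 * n)))
    return abs(math.exp(-t * t / 2.0) - power)

def rhs(t, n):
    """exp(-(t^2/2)(1 - 1/n)) * t^4 / (8n), the right-hand side of (3)."""
    return math.exp(-(t * t / 2.0) * (1.0 - 1.0 / n)) * t ** 4 / (8.0 * n)

def check_inequality_3(alpha=0.1, n_values=(1, 2, 5, 10, 100, 1000), steps=200):
    """Test (3) on a grid of t in ]0; n^(1/2 - alpha)[ for each n."""
    for n in n_values:
        t_max = n ** (0.5 - alpha)
        for k in range(1, steps):
            t = t_max * k / steps
            if lhs(t, n) > rhs(t, n) + 1e-12:   # tolerance for rounding
                return False
    return True
```

A grid check of this kind cannot replace the elementary argument (the bound $|a^n - b^n| \le n \max(|a|,|b|)^{n-1}|a-b|$ applied to $a = e^{-t^2/(2n)}$ and $b = 1 - t^2/(2n)$), but it is a quick way to catch a misplaced exponent when transcribing the formula.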


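2. Now, we shall get a bound for the following quantity:
\[ D_n(t) := \left| E_\nu\left[e^{\frac{itS_n(f)}{\sqrt{n}}}\right] - \left(1 - \frac{t^2}{2n}\right)^n \right|. \]
We first notice that we have:
\[ D_n(t) \le \sum_{l=0}^{n-1} \left(1 - \frac{t^2}{2n}\right)^l \left| E_\nu\left[ Y\circ T^l \cdot e^{\frac{itS_{n-(l+1)}(f)}{\sqrt{n}}}\circ T^{l+1} \right] \right|, \tag{4} \]
with
\[ Y := e^{\frac{itf}{\sqrt{n}}} - \left(1 - \frac{t^2}{2n}\right). \]
According to Proposition 1.1 and to the fact that we have $\left| e^{\frac{itf}{\sqrt{n}}} - 1 \right| \le \frac{t|f|}{\sqrt{n}}$, the function $Y$ is in $H_{\eta,m_0}$ and we have $\|Y\|_{(\eta,m_0)} \le \frac{t^2}{2n} + \frac{t}{\sqrt{n}} \|f\|_{(\eta,m_0)} = O\left(\frac{t}{\sqrt{n}}\right)$.

3. We fix a real number $r_0 > 1$. We define integers $a_1 = a_1(n), \dots, a_{p+3} = a_{p+3}(n)$ as follows:
\[ a_1 := \left\lceil \frac{-\ln(n)}{\ln \delta_{r_0}} \right\rceil \quad\text{and}\quad a_j := \left\lceil (r_0 - 1)\left(a_1 + \dots + a_{j-1}\right) + \frac{\ln\left(n^{-\frac52}\right)}{\ln \delta_{r_0}} \right\rceil, \]
for any integer $j = 2, \dots, p+3$, where $\delta_{r_0}$ is the constant appearing in Proposition 1.2 for $r = r_0$. We can notice that there exists some constant $\kappa > 0$ such that, for any $n \ge 1$ and any $j = 1, \dots, p+3$, we have $a_j < \frac{\kappa n^\alpha}{p+3}$ and, therefore, $a_1 + \dots + a_{p+3} < \kappa n^\alpha$.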

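4. Control of
\[ a^{(2)}_{n,p+1}(t) := \sum_{l=n-\kappa n^\alpha}^{n-1} \left(1 - \frac{t^2}{2n}\right)^l \left| E_\nu\left[ Y \cdot e^{\frac{itS_{n-l-1}(f)}{\sqrt{n}}}\circ T \right] \right|. \]
We have:
\[ a^{(2)}_{n,p+1}(t) = O\left( n^\alpha \frac{t}{\sqrt{n}}\, e^{-\frac{t^2(n-\kappa n^\alpha-1)}{2n}} \right) = O\left( \frac{t}{n^{\frac12-\alpha}}\, e^{-\frac{t^2}{2}\left(1 - \frac{\kappa}{n^{1-\alpha}} - \frac1n\right)} \right), \tag{5} \]
since we consider only real numbers $t$ satisfying $t \in ]0;n^{\frac12-\alpha}[$.

5. We shall estimate the following quantity:
\[ \sum_{l=0}^{n-\kappa n^\alpha-1} \left(1 - \frac{t^2}{2n}\right)^l \left| E_\nu\left[ Y \cdot e^{\frac{itS_{n-(l+1)}(f)}{\sqrt{n}}}\circ T \right] \right|. \]
The study of this quantity constitutes the major part of our proof. Let us write:
\[ J_l(n,t) := E_\nu\left[ Y \cdot e^{\frac{itS_{n-(l+1)}(f)}{\sqrt{n}}}\circ T \right]. \]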


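6. Let $l$ be any integer in $\{0, \dots, n - \kappa n^\alpha - 1\}$. Then, we have $n - (l+1) > a_1 + \dots + a_{p+3}$. We use the following decomposition:
\[ S_{n-(l+1)}(f) = \Big( \sum_{j=1}^{p+3} S_{a_j}(f)\circ T^{a_1+\dots+a_{j-1}} \Big) + S_{M_{n,l}}(f)\circ T^{a_1+\dots+a_{p+3}}, \]
with $M_{n,l} := n - (l+1) - (a_1 + \dots + a_{p+3})$. Thus, we have
\[ J_l(n,t) = E_\nu\Big[ Y \Big( \prod_{j=1}^{p+3} e^{\frac{itS_{a_j}(f)}{\sqrt{n}}}\circ T^{1+a_1+\dots+a_{j-1}} \Big)\, e^{\frac{itS_{M_{n,l}}(f)}{\sqrt{n}}}\circ T^{1+a_1+\dots+a_{p+3}} \Big]. \]
Let us denote by $I_l(n,t)$ the following quantity:
\[ E_\nu\Big[ Y\, e^{\frac{itS_{a_1}(f)}{\sqrt{n}}}\circ T\, \Big( \prod_{j=2}^{p+3} \Big( e^{\frac{itS_{a_j}(f)}{\sqrt{n}}} - 1 \Big)\circ T^{1+a_1+\dots+a_{j-1}} \Big)\, e^{\frac{itS_{M_{n,l}}(f)}{\sqrt{n}}}\circ T^{1+a_1+\dots+a_{p+3}} \Big]. \]
We have:
\[ |I_l(n,t)| \le D_1 \frac{t^{p+3}}{\sqrt{n}\cdot n^{\frac{1-\alpha}{2}(p+2)}}, \tag{6} \]
for some constant $D_1 > 0$ independent of $(n,t,l)$. Indeed, we have:
\[ |I_l(n,t)| \le \|Y\|_\infty \prod_{j=2}^{p+3} \left\| e^{\frac{itS_{a_j}(f)}{\sqrt{n}}} - 1 \right\|_{p+2} \le O\Big( \frac{t}{\sqrt{n}} \prod_{j=2}^{p+3} \frac{t\sqrt{a_j}}{\sqrt{n}} \Big), \]
writing $\left| e^{\frac{itS_{a_j}(f)}{\sqrt{n}}} - 1 \right| \le \frac{t\,|S_{a_j}(f)|}{\sqrt{n}}$ and according to Corollary 1.3 (we recall that we consider only real numbers $t$ satisfying $t \in ]0;n^{\frac12-\alpha}[$ and that we have $a_j < \kappa n^\alpha$). The main part of our proof is devoted to estimating the error term $I_l(n,t) - J_l(n,t)$. We notice that this error term is given by the following formula:
\[ \sum_{\varepsilon=(\varepsilon_1,\dots,\varepsilon_{p+3})} E_\nu\Big[ Y \Big( \prod_{j=1}^{p+3} \varepsilon_j\circ T^{1+a_1+\dots+a_{j-1}} \Big)\, e^{\frac{itS_{M_{n,l}}(f)}{\sqrt{n}}}\circ T^{1+a_1+\dots+a_{p+3}} \Big], \]
where the sum is taken over the $\varepsilon = (\varepsilon_1, \dots, \varepsilon_{p+3})$ with $\varepsilon_j \in \left\{ -1;\ e^{\frac{itS_{a_j}(f)}{\sqrt{n}}} \right\}$, not all equal to $e^{\frac{itS_{a_j}(f)}{\sqrt{n}}}$ and such that $\varepsilon_1 = e^{\frac{itS_{a_1}(f)}{\sqrt{n}}}$. Let us consider such a vector $\varepsilon = (\varepsilon_1, \dots, \varepsilon_{p+3})$. We define the integer $j_0 := \max\{j \ge 2 : \varepsilon_j = -1\}$. We have:
\[ Y \Big( \prod_{j=1}^{p+3} \varepsilon_j\circ T^{1+a_1+\dots+a_{j-1}} \Big)\, e^{\frac{itS_{M_{n,l}}(f)}{\sqrt{n}}}\circ T^{1+a_1+\dots+a_{p+3}} = -D_\varepsilon(n,t)\cdot E_{\varepsilon,l}(n,t)\circ T^{1+a_1+\dots+a_{j_0}}, \]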

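with
\[ D_\varepsilon(n,t) := Y \cdot \prod_{j=1}^{j_0-1} \varepsilon_j\circ T^{1+a_1+\dots+a_{j-1}} \tag{7} \]
and
\[ E_{\varepsilon,l}(n,t) := \Big( \prod_{j=j_0+1}^{p+3} \varepsilon_j\circ T^{a_{j_0+1}+\dots+a_{j-1}} \Big)\, e^{\frac{itS_{M_{n,l}}(f)}{\sqrt{n}}}\circ T^{a_{j_0+1}+\dots+a_{p+3}}. \]
According to our choice of $j_0$, for any $j = j_0+1, \dots, p+3$, we have $\varepsilon_j = e^{\frac{itS_{a_j}(f)}{\sqrt{n}}}$. Consequently, we have
\[ E_{\varepsilon,l}(n,t) = e^{\frac{it}{\sqrt{n}} S_{n-(l+1)-(a_1+\dots+a_{j_0})}(f)}. \]
• First step: Control of $\mathrm{Cov}\left( D_\varepsilon(n,t),\ E_{\varepsilon,l}(n,t)\circ T^{1+a_1+\dots+a_{j_0}} \right)$. Using the control of the rate of decorrelation given in Proposition 1.2, we show that we have:
\[ \left| \mathrm{Cov}\left( D_\varepsilon(n,t),\ E_{\varepsilon,l}(n,t)\circ T^{1+a_1+\dots+a_{j_0}} \right) \right| \le D_2 \frac{t}{n^2}, \tag{8} \]
for some constant $D_2 > 0$ independent of $(n,t,\varepsilon)$. Indeed, according to Proposition 1.1, $D_\varepsilon(n,t)$ is in $H_{\eta,a_1+\dots+a_{j_0-1}+m_0}$ and we have:
\[ \|D_\varepsilon(n,t)\|_{(\eta,a_1+\dots+a_{j_0-1}+m_0)} \le \|Y\|_{(\eta,m_0)} \prod_{j=1}^{j_0-1} \|\varepsilon_j\|_{(\eta,a_j+m_0-1)} \le O\Big(\frac{t}{\sqrt{n}}\Big) \prod_{j=1}^{j_0-1} \Big( 1 + \frac{t\, C_f^{(\eta,m_0)} a_j}{\sqrt{n}} \Big) \le O\Big(\frac{t}{\sqrt{n}}\Big) \]
(we recall that we consider only the real numbers $t$ satisfying $t \in ]0;n^{\frac12-\alpha}[$). In the same way, we can show that $E_{\varepsilon,l}(n,t)$ is in $H_{\eta,N_{n,l}}$, with $N_{n,l} := n - (l+1) - (a_1 + \dots + a_{j_0}) + m_0 - 1$ and that we have:
\[ \|E_{\varepsilon,l}(n,t)\|_{(\eta,N_{n,l})} \le 1 + \frac{t\, C_f^{(\eta,m_0)}\, n}{\sqrt{n}} = O(n). \]
According to Proposition 1.2, we get:
\[ \left| \mathrm{Cov}\left( D_\varepsilon(n,t),\ E_{\varepsilon,l}(n,t)\circ T^{1+a_1+\dots+a_{j_0}} \right) \right| \le O\left(t\sqrt{n}\right)\, \delta_{r_0}^{\,1+\sum_{j=1}^{j_0} a_j - r_0\left(m_0 + \sum_{j=1}^{j_0-1} a_j\right)}. \]
According to our choice for the $a_j$'s, we have:
\[ \delta_{r_0}^{\,1+\sum_{j=1}^{j_0} a_j - r_0\left(m_0 + \sum_{j=1}^{j_0-1} a_j\right)} = O\Big( \frac{1}{n^{\frac52}} \Big). \]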


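• Second step: Control of the mean of $D_\varepsilon(n,t)$. We shall now show that we have:
\[ |E_\nu[D_\varepsilon(n,t)]| \le D_3 \Big( \frac{t^3}{\sqrt{n}\cdot n^{1-\alpha}} + \frac{t^2}{n^2} \Big), \tag{9} \]
for some $D_3 > 0$ independent of $(n,t,\varepsilon)$. Let us denote by $J$ the following set:
\[ J := \Big\{ j = 1, \dots, j_0-1 : \varepsilon_j = e^{\frac{itS_{a_j}(f)}{\sqrt{n}}} \Big\}. \]
We recall that 1 is in $J$. By replacing, in formula (7), $\varepsilon_j$ by $\varepsilon'_j := 1 + \frac{itS_{a_j}(f)}{\sqrt{n}}$, for each $j \in J$, we introduce an error in $O\Big( \frac{t}{\sqrt{n}} \frac{t^2}{n^{1-\alpha}} \Big)$ (uniformly in $(n,t,\varepsilon)$), since we have $|\varepsilon_j - \varepsilon'_j| \le \frac{t^2 \left(S_{a_j}(f)\right)^2}{2n}$ and according to Corollary 1.3. By replacing $Y$ by $Y' := \frac{itf}{\sqrt{n}} + \frac{t^2}{2n}\left(1 - f^2\right)$, we make an error in $O\Big( \frac{t^3}{n\sqrt{n}} \Big)$ (uniformly in $(n,t,\varepsilon)$). For any $j \in J$, we write $\varepsilon_{j,0} := 1$ and $\varepsilon_{j,1} := \frac{itS_{a_j}(f)}{\sqrt{n}}$. We shall control the following quantities:
\[ Z_{\varepsilon,q}(n,t) := E_\nu\Big[ Y' \prod_{j\in J} \varepsilon_{j,q_j}\circ T^{1+a_1+\dots+a_{j-1}} \Big], \]
with $q = (q_j)_{j\in J} \in \{0,1\}^J$.

(a) If we have $\sum_{j\in J} q_j = 0$, then we have
\[ Z_{\varepsilon,q}(n,t) = E_\nu[Y'] = \frac{t^2}{2n}\left(1 - E_\nu[f^2]\right). \]
(b) If we have $\sum_{j\in J} q_j = 1$ and $q_1 = 1$, then, according to Corollary 1.3, we have:
\[ Z_{\varepsilon,q}(n,t) = E_\nu\Big[ \frac{itf}{\sqrt{n}} \cdot \frac{it}{\sqrt{n}} \sum_{k=1}^{a_1} f\circ T^k \Big] + O\Big( \frac{t^3}{n^{\frac{3-\alpha}{2}}} \Big) = -\frac{t^2}{n} E_\nu\Big[ f \cdot \sum_{k=1}^{a_1} f\circ T^k \Big] + O\Big( \frac{t^3}{\sqrt{n}\, n^{1-\alpha}} \Big). \]
According to Proposition 1.2, to formula (2) (we recall that we have supposed $\sigma^2(f) = 1$) and to our choice for $a_1$, we have:
\[ -\frac{t^2}{n} \sum_{k=1}^{a_1} E_\nu\left[f\cdot f\circ T^k\right] = -\frac{t^2}{n} \Big( \sum_{k\ge1} E_\nu\left[f\cdot f\circ T^k\right] + O\left(\delta_{r_0}^{a_1}\right) \Big) = -\frac{t^2}{n} \Big( \frac{1 - E_\nu[f^2]}{2} + O\left(\delta_{r_0}^{a_1}\right) \Big) = -\frac{t^2}{n}\, \frac{1 - E_\nu[f^2]}{2} + O\Big( \frac{t^2}{n} \frac{1}{n} \Big). \]
(c) If we have $\sum_{j\in J} q_j = 1$ and $q_j = 1$ (for some $j \in J \setminus \{1\}$), then, according to Corollary 1.3, we have:
\[ Z_{\varepsilon,q}(n,t) = E_\nu\Big[ \frac{itf}{\sqrt{n}} \cdot \frac{it}{\sqrt{n}} \sum_{k=1+a_1+\dots+a_{j-1}}^{a_1+\dots+a_j} f\circ T^k \Big] + O\Big( \frac{t^3}{n^{\frac{3-\alpha}{2}}} \Big) = O\Big( \frac{t^2}{n}\, \delta_{r_0}^{\,a_1+\dots+a_{j-1}} \Big) + O\Big( \frac{t^3}{\sqrt{n}\, n^{1-\alpha}} \Big) = O\Big( \frac{t^2}{n^2} \Big) + O\Big( \frac{t^3}{\sqrt{n}\, n^{1-\alpha}} \Big). \]
(d) If $\sum_{j\in J} q_j \ge 2$, then we have:
\[ Z_{\varepsilon,q}(n,t) = O\Big( \frac{t^3}{\sqrt{n}\, n^{1-\alpha}} \Big). \]
Indeed, we have $\|Y'\|_\infty = O\Big(\frac{t}{\sqrt{n}}\Big)$ and, according to Corollary 1.3,
\[ \Big\| \prod_{j\in J} \varepsilon_{j,q_j}\circ T^{1+a_1+\dots+a_{j-1}} \Big\|_1 = O\Big( \Big( \frac{t}{n^{\frac{1-\alpha}{2}}} \Big)^{\sum_{j\in J} q_j} \Big) = O\Big( \frac{t^2}{n^{1-\alpha}} \Big). \]
Therefore, we have:
\[ |E_\nu[D_\varepsilon(n,t)]| \le O\Big( \frac{t^3}{\sqrt{n}\cdot n^{1-\alpha}} \Big) + \Big| \sum_{q\in\{0,1\}^J} Z_{\varepsilon,q}(n,t) \Big| \le O\Big( \frac{t^3}{\sqrt{n}\cdot n^{1-\alpha}} \Big) + O\Big( \frac{t^2}{n^2} \Big). \]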

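• Third step: Control of the mean of $E_{\varepsilon,l}(n,t)$. We recall that we have:
\[ E_{\varepsilon,l}(n,t) = e^{\frac{it}{\sqrt{n}} S_{n-(l+1)-(a_1+\dots+a_{j_0})}(f)}. \]
Let us write $n' = n'_{n,l,\varepsilon} := n - (l+1) - (a_1 + \dots + a_{j_0})$ and $t' = t'_{n,l,\varepsilon} := t\sqrt{\frac{n'}{n}}$. We can notice that $n'$ and $t'$ satisfy $n' \ge 1$ and $t' \in ]0;n'^{\frac12-\alpha}[$. Therefore, according to the induction hypothesis $(H_p)$ applied to $(n',t')$, we have:
\[ \left| E_\nu\left[E_{\varepsilon,l}(n,t)\right] \right| \le e^{-\frac{t^2}{2}\frac{n'}{n}} + L_p \frac{t'^p}{n'^{p(\frac12-\alpha)}} + a_{n',p}(t') \tag{10} \]
\[ \le e^{-\frac{t^2}{2}}\, e^{\frac{t^2}{2}\left(\frac{l+1}{n} + \frac{\kappa}{n^{1-\alpha}}\right)} + L_p \frac{t^p}{n^{p(\frac12-\alpha)}} + a_{n',p}(t'). \tag{11} \]
On the other hand, we notice that we have
\[ \left| E_\nu\left[E_{\varepsilon,l}(n,t)\right] \right| \le 1. \tag{12} \]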


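7. According to inequalities (6), (8), (9), (11) and (12), the quantity $\sum_{l=0}^{n-\kappa n^\alpha-1} \left(1 - \frac{t^2}{2n}\right)^l |J_l(n,t)|$ is less than
\[ \sum_{l=0}^{n-\kappa n^\alpha-1} \Big(1 - \frac{t^2}{2n}\Big)^l \Big[ D_1 \frac{t^{p+3}}{\sqrt{n}\cdot n^{\frac{1-\alpha}{2}(p+2)}} + 2^{p+2} D_2 \frac{t}{n^2} + 2^{p+2} D_3 \frac{t^2}{n^2} + \sum_\varepsilon D_3 \frac{t^3}{\sqrt{n}\cdot n^{1-\alpha}} \Big( e^{-\frac{t^2}{2}}\, e^{\frac{t^2}{2}\left(\frac{l+1}{n} + \frac{\kappa}{n^{1-\alpha}}\right)} + L_p \frac{t^p}{n^{p(\frac12-\alpha)}} + a_{n',p}(t') \Big) \Big], \tag{13} \]
where the sum $\sum_\varepsilon$ is taken over the $\varepsilon = (\varepsilon_1, \dots, \varepsilon_{p+3})$ with $\varepsilon_j \in \left\{ -1;\ e^{\frac{itS_{a_j}(f)}{\sqrt{n}}} \right\}$, not all equal to $e^{\frac{itS_{a_j}(f)}{\sqrt{n}}}$ and with $\varepsilon_1 = e^{\frac{itS_{a_1}(f)}{\sqrt{n}}}$ (we recall that, by definition, $t'$ and $n'$ depend not only on $t$ and $n$ but also on $l$ and $\varepsilon$). We can notice that we have:
\[ \sum_{l=0}^{n-1} \Big(1 - \frac{t^2}{2n}\Big)^l \le \min\Big(n, \frac{2n}{t^2}\Big) \quad\text{and}\quad \sum_{l=0}^{n-1} \Big(1 - \frac{t^2}{2n}\Big)^l e^{\frac{t^2 l}{2n}} \le n. \]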

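In the following, we control each term of formula (13).

(a) Let us write: $b^{(1)}_{n,p+1}(t) := \sum_{l=0}^{n-1} \left(1 - \frac{t^2}{2n}\right)^l D_1 \frac{t^{p+3}}{\sqrt{n}\cdot n^{\frac{1-\alpha}{2}(p+2)}}$. We have:
\[ b^{(1)}_{n,p+1}(t) \le \frac{2n}{t^2}\, D_1 \frac{t^{p+3}}{\sqrt{n}\cdot n^{\frac{1-\alpha}{2}(p+2)}} = O\Big( \frac{t^{p+1}}{n^{(p+1)(\frac12-\alpha)}} \Big). \tag{14} \]
(b) Let us write: $a^{(3)}_{n,p+1}(t) := \sum_{l=0}^{n-1} \left(1 - \frac{t^2}{2n}\right)^l 2^{p+2} D_2 \frac{t}{n^2}$. We have:
\[ a^{(3)}_{n,p+1}(t) \le n\cdot 2^{p+2} D_2 \frac{t}{n^2} = O\Big( \frac{t}{n} \Big). \tag{15} \]
(c) Let us write: $a^{(4)}_{n,p+1}(t) := \sum_{l=0}^{n-1} \left(1 - \frac{t^2}{2n}\right)^l 2^{p+2} D_3 \frac{t^2}{n^2}$. We have:
\[ a^{(4)}_{n,p+1}(t) \le 2n \min\Big(1, \frac{1}{t^2}\Big)\, 2^{p+2} D_3 \frac{t^2}{n^2} = O\left( \min\left(1, t^2\right) n^{-1} \right). \tag{16} \]
(d) Let us write: $a^{(5)}_{n,p+1}(t) := \sum_{l=0}^{n-\kappa n^\alpha-1} \sum_\varepsilon \left(1 - \frac{t^2}{2n}\right)^l D_3 \frac{t^3}{\sqrt{n}\cdot n^{1-\alpha}}\, e^{-\frac{t^2}{2}}\, e^{\frac{t^2}{2}\frac{(l+1)+\kappa n^\alpha}{n}}$. We have:
\[ a^{(5)}_{n,p+1}(t) \le 2^{p+2} D_3\, e^{-\frac{t^2}{2}\left(1 - \frac1n - \frac{\kappa}{n^{1-\alpha}}\right)} \frac{t^3}{n^{\frac12-\alpha}}. \tag{17} \]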

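(e) Let us write: $b^{(2)}_{n,p+1}(t) := \sum_{l=0}^{n-1} \sum_\varepsilon \left(1 - \frac{t^2}{2n}\right)^l D_3 \frac{t^3}{\sqrt{n}\cdot n^{1-\alpha}}\, L_p \frac{t^p}{n^{p(\frac12-\alpha)}}$. We have:
\[ b^{(2)}_{n,p+1}(t) \le 2^{p+2}\cdot 2 D_3 L_p \frac{t^{p+1}}{n^{(p+1)(\frac12-\alpha)}}. \tag{18} \]
(f) Let us write: $a^{(6)}_{n,p+1}(t) := \sum_{l=0}^{n-\kappa n^\alpha-1} \sum_\varepsilon \left(1 - \frac{t^2}{2n}\right)^l D_3 \frac{t^3}{\sqrt{n}\cdot n^{1-\alpha}}\, a_{n',p}(t')$. We show that we have:
\[ a^{(6)}_{n,p+1}(t) = O\Big( \frac{t}{n^{1-2\alpha}} \Big) + O\Big( e^{-\frac{t^2}{2}\left(\frac12 - \frac{\kappa}{n^{1-\alpha}} - \frac2n\right)} \frac{t^3}{n^{\frac12-\alpha}} \Big). \tag{19} \]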

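Let a vector $\varepsilon = (\varepsilon_1, \dots, \varepsilon_{p+3})$ be given. We notice that if we have $l \le \lfloor \frac{n}{2} \rfloor - \kappa n^\alpha - 1$, then we have $n' = n - (l+1) - (a_1 + \dots + a_{j_0}) \ge \frac{n}{2}$. From this, we get:
\[ \sum_{l=0}^{\lfloor\frac n2\rfloor-\kappa n^\alpha-1} \Big(1 - \frac{t^2}{2n}\Big)^l \frac{t^3}{\sqrt{n}\cdot n^{1-\alpha}}\, a_{n',p}(t') \le \sum_{l=0}^{\lfloor\frac n2\rfloor-\kappa n^\alpha-1} \Big(1 - \frac{t^2}{2n}\Big)^l \frac{t^3}{\sqrt{n}\cdot n^{1-\alpha}}\, \frac{A_p}{n'^{\frac12-\alpha}} \le \frac{2n}{t^2}\, \frac{t^3}{\sqrt{n}\cdot n^{1-\alpha}}\, \frac{A_p\, 2^{\frac12-\alpha}}{n^{\frac12-\alpha}} = O\Big( \frac{t}{n^{1-2\alpha}} \Big), \]
according to the induction hypothesis $(H_p)$. On the other hand, we have:
\[ \sum_{l=\lfloor\frac n2\rfloor-\kappa n^\alpha}^{n-\kappa n^\alpha-1} \Big(1 - \frac{t^2}{2n}\Big)^l \frac{t^3}{\sqrt{n}\cdot n^{1-\alpha}}\, a_{n',p}(t') \le \sum_{l=\lfloor\frac n2\rfloor-\kappa n^\alpha}^{n-\kappa n^\alpha-1} e^{-\frac{t^2 l}{2n}} \frac{t^3}{\sqrt{n}\cdot n^{1-\alpha}}\, A_p \le \Big( \frac{n}{2} + 1 \Big)\, e^{-\frac{t^2}{2}\left(\frac12 - \frac{\kappa}{n^{1-\alpha}} - \frac2n\right)} \frac{t^3}{\sqrt{n}\cdot n^{1-\alpha}}\, A_p \le O\Big( e^{-\frac{t^2}{2}\left(\frac12 - \frac{\kappa}{n^{1-\alpha}} - \frac2n\right)} \frac{t^3}{n^{\frac12-\alpha}} \Big). \]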

8. Consequently, we have $h_n(f,t) \le a_{n,p+1}(t) + b_{n,p+1}(t)$, with $b_{n,p+1}(t) := \sum_{j=1}^{2} b^{(j)}_{n,p+1}(t)$ and $a_{n,p+1}(t) := \min\Big(2, \sum_{j=1}^{6} a^{(j)}_{n,p+1}(t)\Big)$. We have shown that if property $(H_p)$ is satisfied, then property $(H_{p+1})$ is also satisfied. We conclude that property $(H_p)$ is true for any integer $p \ge 0$.


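Conclusion (end of the proof of Theorem 2.1). We show how Theorem 2.1 can be deduced from Proposition 2.2. Let $f$ be a real-valued function in $H_{\eta,m_0}$, $\nu$-centered with $\sigma^2(f) > 0$. Without any loss of generality, we suppose that we have $\sigma^2(f) = 1$. According to an inequality given by C. Esseen in [5], for any real number $U > 0$, we have
\[ \Delta_n(f) \le \frac{2}{\pi} \int_0^U \frac{h_n(f,t)}{t}\,dt + \frac{24}{\pi\sqrt{2\pi}}\, \frac{1}{U}. \]
Let $\alpha \in ]0;\frac14[$ be a real number and $p \ge 2$ be an integer. Let us fix $\omega := \alpha + \frac{1}{2p}$. According to Proposition 2.2, we have:
\[ \int_0^{n^{\frac12-\omega}} \frac{h_n(f,t)}{t}\,dt \le \frac{L_p}{p} \Big[ \frac{t^p}{n^{p(\frac12-\alpha)}} \Big]_0^{n^{\frac12-\omega}} + O_{n\to+\infty}\Big( \frac{1}{n^{\frac12-\alpha}} \Big) = O_{n\to+\infty}\Big( \frac{1}{n^{p(\omega-\alpha)}} \Big) + O_{n\to+\infty}\Big( \frac{1}{n^{\frac12-\alpha}} \Big). \]
Therefore, taking $U = n^{\frac12-\omega}$, we get:
\[ \Delta_n(f) = O_{n\to+\infty}\Big( \frac{1}{n^{\frac12-\alpha-\frac{1}{2p}}} \Big) + O_{n\to+\infty}\Big( \frac{1}{n^{p(\omega-\alpha)}} \Big) + O_{n\to+\infty}\Big( \frac{1}{n^{\frac12-\alpha}} \Big), \]
for any real number $\alpha \in ]0;\frac14[$ and any integer $p \ge 2$. Since $p(\omega-\alpha) = \frac12$, the first term dominates, and Theorem 2.1 follows because $\alpha$ and $\frac{1}{2p}$ can be taken arbitrarily small.

4. Rate of Convergence for the Billiard Flow: Proof

In this section, we prove Theorem 2.3. We denote $\bar\tau := \int_M \tau^+(\omega)\,d\nu(\omega)$. Let $f : T^1Q \to \mathbb{R}$ be a Hölder function of order $\eta$, $\mu$-centered and such that $\tilde\sigma^2(f) > 0$. We have to study the random variables $X_t : Q_1 \to \mathbb{R}$ given by:
\[ X_t := \frac{1}{\sqrt{t}} \int_0^t f\circ Y_s(\cdot)\,ds. \]
We shall denote by $X$ a centered Gaussian random variable defined on a probability space $(B,P)$ such that $E[X^2] = \tilde\sigma^2(f)$. In this proof, we identify the billiard flow $(Q_1,\mu,(Y_t)_t)$ with the suspension flow $(\mathcal{N},\lambda,(S_t)_t)$. Let us define the function $F : M \to \mathbb{R}$ by $F(\omega) := \int_0^{\tau^+(\omega)} f(\omega,s)\,ds$. As $F$ is in $H_{\eta,1}$ and as we have $\tilde\sigma^2(f) = \frac{\sigma^2(F)}{\bar\tau}$, according to Theorem 2.1, for any real number $\varepsilon > 0$, we have:
\[ \sup_{x\in\mathbb{R}} \Big| \nu\Big( \frac{1}{\sqrt{\bar\tau\cdot n}} \sum_{k=0}^{n-1} F\circ T^k(\cdot) \le x \Big) - P(X \le x) \Big| = O\left( n^{-\frac12+\varepsilon} \right). \]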


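1. Let $W_t : M \to \mathbb{R}$ be the random variable given by $W_t := \frac{1}{\sqrt{t}} \sum_{k=0}^{\lfloor t/\bar\tau \rfloor - 1} F\circ T^k(\cdot)$. We have:
\[ \sup_{x\in\mathbb{R}} |\nu(W_t \le x) - P(X \le x)| = O\left( t^{-\frac12+\varepsilon} \right), \]
for any real number $\varepsilon > 0$. Indeed, for any real number $t \ge 2\bar\tau$, we have:
\[ \Big\| \Big( \frac{1}{\sqrt{t}} - \frac{1}{\sqrt{\bar\tau \lfloor t/\bar\tau \rfloor}} \Big) \sum_{k=0}^{\lfloor t/\bar\tau \rfloor - 1} F\circ T^k \Big\|_\infty \le \frac{\sqrt{2}\,\bar\tau}{t^{\frac32}}\, \frac{t}{\bar\tau}\, \|F\|_\infty = \frac{\sqrt{2}\,\|F\|_\infty}{\sqrt{t}}. \]
It follows that
\[ \nu\Big( \frac{\sum_{k=0}^{\lfloor t/\bar\tau \rfloor - 1} F\circ T^k}{\sqrt{\bar\tau \lfloor t/\bar\tau \rfloor}} \le x - \frac{\sqrt{2}\,\|F\|_\infty}{\sqrt{t}} \Big) \le \nu(W_t \le x) \le \nu\Big( \frac{\sum_{k=0}^{\lfloor t/\bar\tau \rfloor - 1} F\circ T^k}{\sqrt{\bar\tau \lfloor t/\bar\tau \rfloor}} \le x + \frac{\sqrt{2}\,\|F\|_\infty}{\sqrt{t}} \Big). \]
We get:
\[ \sup_{x\in\mathbb{R}} |\nu(W_t \le x) - P(X \le x)| \le O\left( t^{-\frac12+\varepsilon} \right) + \sup_{x\in\mathbb{R}} P\Big( |X - x| \le \frac{\sqrt{2}\,\|F\|_\infty}{\sqrt{t}} \Big), \]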

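for any real number $\varepsilon > 0$.

2. Let $Z_t : M \to \mathbb{R}$ be the random variable defined by $Z_t := \frac{1}{\sqrt{t}} \sum_{k=0}^{n(t,\cdot)-1} F\circ T^k$, where $n(t,\omega)$ denotes the number of reflections off $\partial Q$ between instants 0 and $t$, for a particle having the configuration $\omega$ at time 0:
\[ n(t,\omega) := \max\Big\{ n \ge 0 : \sum_{k=0}^{n-1} \tau^+\left(T^k(\omega)\right) \le t \Big\}. \]
In points 2 to 5 of this proof, we prove that we have:
\[ \sup_{x\in\mathbb{R}} |\nu(Z_t \le x) - P(X \le x)| \le O\left( t^{-\frac14+\varepsilon} \right), \]
for any real number $\varepsilon > 0$. First, we notice that we have:
\[ Z_t - W_t = \frac{1}{\sqrt{t}} \sum_{k=\min(\lfloor t/\bar\tau \rfloor,\, n(t,\cdot))}^{\max(\lfloor t/\bar\tau \rfloor,\, n(t,\cdot))-1} F\circ T^k. \]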
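The reflection count $n(t,\omega)$ and the variable $Z_t$ are straightforward to evaluate once the successive free paths along an orbit are known. The sketch below assumes they are supplied as finite lists (`free_paths[k]` standing for $\tau^+(T^k(\omega))$ and `f_values[k]` for $F(T^k(\omega))$), long enough to cover the time $t$; the function names are hypothetical.

```python
import math
from itertools import accumulate

def reflection_count(t, free_paths):
    """max{n >= 0 : free_paths[0] + ... + free_paths[n-1] <= t}.

    Assumes the list is long enough for its total to exceed t; otherwise
    the length of the list is returned.
    """
    count = 0
    for partial_sum in accumulate(free_paths):
        if partial_sum > t:
            break
        count += 1
    return count

def z_t(t, free_paths, f_values):
    """(1/sqrt(t)) * sum of f_values[k] over the first n(t, omega) reflections."""
    n = reflection_count(t, free_paths)
    return sum(f_values[:n]) / math.sqrt(t)
```

Comparing `reflection_count(t, free_paths)` with `math.floor(t / tau_bar)` on simulated orbits, for an assumed mean free path `tau_bar`, gives a direct view of the fluctuation that Lemma 4.1 controls.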

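3. Repeating calculations already done in [10, pp. 81 and 142], we prove the following result:

Lemma 4.1. Let a real number $L \ge 2$ be given. There exists a constant $C_L > 0$ such that, for any real numbers $t > 1$ and $K \ge 1$, we have:
\[ \nu\Big( \Big\{ \Big| n(t,\cdot) - \frac{t}{\bar\tau} \Big| \ge K\sqrt{t} \Big\} \Big) \le C_L K^{-L}. \]
Proof. Let us consider any real numbers $t > 1$ and $K \ge 2$. We have
\[ \nu\Big( \Big\{ \Big| n(t,\cdot) - \frac{t}{\bar\tau} \Big| \ge K\sqrt{t} \Big\} \Big) \le \nu\Big( \Big\{ n(t,\cdot) \ge \frac{t}{\bar\tau} + K\sqrt{t} \Big\} \Big) + \nu\Big( \Big\{ n(t,\cdot) \le \frac{t}{\bar\tau} - K\sqrt{t} \Big\} \Big) \]
\[ \le \nu\Big( \sum_{i=0}^{\lceil \frac{t}{\bar\tau} + K\sqrt{t} \rceil - 1} \tau^+\circ T^i \le t \Big) + \nu\Big( \sum_{i=0}^{\lfloor \frac{t}{\bar\tau} - K\sqrt{t} \rfloor} \tau^+\circ T^i > t \Big) \]
\[ \le \nu\Big( \sum_{i=0}^{\lceil \frac{t}{\bar\tau} + K\sqrt{t} \rceil - 1} \left( \tau^+\circ T^i - \bar\tau \right) \le -\bar\tau K\sqrt{t} \Big) + \nu\Big( \sum_{i=0}^{\lfloor \frac{t}{\bar\tau} - K\sqrt{t} \rfloor} \left( \tau^+\circ T^i - \bar\tau \right) \ge \bar\tau\left( K\sqrt{t} - 1 \right) \Big) \]
\[ \le 2 \sup_{n\ge1} \Big\| \frac{1}{\sqrt{n}} \sum_{i=0}^{n-1} \left( \tau^+\circ T^i - \bar\tau \right) \Big\|^{2L}_{L^{2L}(M,\nu)} \frac{\left( t/\bar\tau + K\sqrt{t} + 1 \right)^{L}}{\left( \bar\tau (K\sqrt{t} - 1) \right)^{2L}} \]
\[ \le 2 \sup_{n\ge1} \Big\| \frac{1}{\sqrt{n}} \sum_{i=0}^{n-1} \left( \tau^+\circ T^i - \bar\tau \right) \Big\|^{2L}_{L^{2L}(M,\nu)} \frac{\left( Kt/\bar\tau + Kt + Kt \right)^{L}}{\left( \bar\tau \frac{K\sqrt{t}}{2} \right)^{2L}} = O\left( K^{-L} \right), \]
according to Corollary 1.3.

4. For any real numbers $p > 2$ and $L \ge 2$, there exists a constant $a_{p,L}$ such that, for any real numbers $t > 1$, $\alpha > 0$ and $\beta > 0$, we have
\[ \nu(\{|Z_t - W_t| > \alpha\}) \le a_{p,L} \Big( \frac{t^{p\left(-\frac14+\frac\beta2\right)}}{\alpha^p} + t^{-L\beta} \Big). \]
Indeed, we have: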

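\[ \nu(\{|Z_t - W_t| > \alpha\}) \le \nu\Big( \Big\{ |Z_t - W_t| > \alpha \text{ and } \Big| n(t,\cdot) - \frac{t}{\bar\tau} \Big| \le t^{\frac12+\beta} \Big\} \Big) + \nu\Big( \Big\{ \Big| n(t,\cdot) - \frac{t}{\bar\tau} \Big| > t^{\frac12+\beta} \Big\} \Big). \]
According to Lemma 4.1, we have:
\[ \nu\Big( \Big\{ \Big| n(t,\cdot) - \frac{t}{\bar\tau} \Big| > t^{\frac12+\beta} \Big\} \Big) \le C_L\, t^{-L\beta}. \]
Moreover, according to Corollary 1.3 and to Theorem B of [12], there exists a constant $K_p > 0$ such that, for any integer $N \ge 0$, we have:
\[ E_\nu\Big[ \sup_{n=1,\dots,N} \Big| \sum_{k=0}^{n-1} F\circ T^k \Big|^p \Big] + E_\nu\Big[ \sup_{n=1,\dots,N} \Big| \sum_{k=0}^{n-1} F\circ T^{-k} \Big|^p \Big] \le K_p N^{\frac p2}. \]
Therefore, we have:
\[ \nu\Big( \Big\{ |Z_t - W_t| > \alpha \text{ and } \Big| n(t,\cdot) - \frac{t}{\bar\tau} \Big| \le t^{\frac12+\beta} \Big\} \Big) \le \frac{E_\nu\Big[ |Z_t - W_t|^p\, 1_{\left\{ |n(t,\cdot) - \frac{t}{\bar\tau}| \le t^{\frac12+\beta} \right\}} \Big]}{\alpha^p} \le O\Big( \frac{t^{p\left(\frac14+\frac\beta2\right)}}{\alpha^p\, t^{\frac p2}} \Big). \]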

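5. We get:
\[ \nu(|W_t - Z_t| > \alpha) \le O\Big( \frac{t^{p\left(-\frac14+\frac\beta2\right)}}{\alpha^p} + t^{-L\beta} \Big), \]
for any real numbers $p > 2$ and $L \ge 2$. From this, it follows that
\[ \sup_{x\in\mathbb{R}} |\nu(Z_t \le x) - P(X \le x)| = O\left( t^{-\frac14+\varepsilon} \right), \]
for any real number $\varepsilon > 0$. Indeed, we have
\[ \nu(Z_t \le x) - P(X \le x) \le \nu(W_t \le x+\alpha) + \nu(W_t - Z_t > \alpha) - P(X \le x) \le \nu(W_t \le x+\alpha) - P(X \le x+\alpha) + \nu(W_t - Z_t > \alpha) + P(x \le X \le x+\alpha) \]
and
\[ P(X \le x) - \nu(Z_t \le x) \le P(X \le x) - \nu(W_t \le x-\alpha) + \nu(Z_t - W_t > \alpha) \le P(X \le x-\alpha) - \nu(W_t \le x-\alpha) + \nu(Z_t - W_t > \alpha) + P(x-\alpha \le X \le x). \]
Therefore, according to the foregoing, for any real numbers $\varepsilon > 0$, $p > 2$ and $L \ge 2$, we have:
\[ \sup_{x\in\mathbb{R}} |\nu(Z_t \le x) - P(X \le x)| \le O\left( t^{-\frac12+\varepsilon} \right) + \nu(|W_t - Z_t| > \alpha) + \sup_{x\in\mathbb{R}} P(x-\alpha \le X \le x+\alpha) \le O\left( t^{-\frac12+\varepsilon} \right) + O\Big( \frac{t^{p\left(-\frac14+\frac\beta2\right)}}{\alpha^p} + t^{-L\beta} \Big) + O(\alpha). \]
Taking $\beta := \frac{1}{2L}$ and $\alpha := t^{-\gamma}$, with $\gamma > 0$, we get:
\[ \sup_{x\in\mathbb{R}} |\nu(Z_t \le x) - P(X \le x)| \le O\left( t^{p\left(-\frac14+\gamma+\frac{1}{4L}\right)} + t^{-\frac12+\varepsilon} + t^{-\gamma} \right). \]
To conclude, we shall take $p$, $L$ large and $\gamma$ close to $\frac14$. For any real numbers $L \ge 3$ and $p \ge 4$, we take $\gamma := \frac14 - \frac{1}{2p} - \frac{1}{4L}$. We get:
\[ \sup_{x\in\mathbb{R}} |\nu(Z_t \le x) - P(X \le x)| \le O\left( t^{-\frac12+\varepsilon} + t^{-\frac12} + t^{-\frac14+\frac{1}{2p}+\frac{1}{4L}} \right). \]
As this is true for any $L \ge 3$ and $p \ge 4$, we have shown that we have:
\[ \sup_{x\in\mathbb{R}} |\nu(Z_t \le x) - P(X \le x)| \le O\left( t^{-\frac14+\varepsilon} \right), \]
for any real number $\varepsilon > 0$.

6. We notice that, for any $\omega \in M$ and any real number $t > 0$, we have:
\[ |Z_t(\omega) - X_t(\omega,0)| = \frac{1}{\sqrt{t}} \Big| \int_{\sum_{k=0}^{n(t,\omega)-1} \tau^+(T^k(\omega))}^{t} f(Y_s(\omega,0))\,ds \Big| \le \frac{\max_M \tau^+\, \|f\|_\infty}{\sqrt{t}}. \]
As we did in point 1 of this proof, we deduce from the foregoing that we have:
\[ \sup_{x\in\mathbb{R}} |\nu(X_t(\cdot,0) \le x) - P(X \le x)| \le O\left( t^{-\frac14+\varepsilon} \right), \]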

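for any real number $\varepsilon > 0$.

7. Now we consider the random variable $X'_t : Q_1 \to \mathbb{R}$ given by $X'_t(\omega,s) := X_t(\omega,0)$. We show that we have:
\[ \sup_{x\in\mathbb{R}} \left| \mu\left(X'_t \le x\right) - P(X \le x) \right| \le O\left( t^{-\frac16} \right). \]
Let a real number $x$ be fixed. For any integer $m \ge 1$, we have:
\[ \left| \mu\left(X'_t \le x\right) - \nu(X_t(\cdot,0) \le x) \right| = \Big| \int_M \frac{\tau^+(\omega)}{\bar\tau}\, 1_{\{X_t(\omega,0)\le x\}}\,d\nu(\omega) - \int_M 1_{\{X_t(\omega,0)\le x\}}\,d\nu(\omega) \Big| \]
\[ = \Big| \frac1m \sum_{k=0}^{m-1} \int_M \Big( \frac{\tau^+(T^k(\omega))}{\bar\tau}\, 1_{\{X_t(T^k(\omega),0)\le x\}} - 1_{\{X_t(\omega,0)\le x\}} \Big)\,d\nu(\omega) \Big| \]
\[ \le \Big| \frac1m \sum_{k=0}^{m-1} \int_M \frac{\tau^+(T^k(\omega))}{\bar\tau} \Big( 1_{\{X_t(T^k(\omega),0)\le x\}} - 1_{\{X_t(\omega,0)\le x\}} \Big)\,d\nu(\omega) \Big| + \Big\| \frac1m \sum_{k=0}^{m-1} \Big( \frac{\tau^+\circ T^k}{\bar\tau} - 1 \Big) \Big\|_{L^1(M,\nu)} \]
\[ \le \frac{\max_M \tau^+}{\bar\tau\cdot m} \sum_{k=0}^{m-1} \int_M \Big( 1_{\{X_t(T^k(\omega),0)\le x\le X_t(\omega,0)\}} + 1_{\{X_t(\omega,0)\le x\le X_t(T^k(\omega),0)\}} \Big)\,d\nu(\omega) + O\Big( \frac{1}{\sqrt{m}} \Big). \]
As we have $|X_t(\cdot,0) - X_t(T^k(\cdot),0)| \le \frac{2k \max_M \tau^+\, \|f\|_\infty}{\sqrt{t}}$, we get:
\[ \left| \mu\left(X'_t \le x\right) - \nu(X_t(\cdot,0) \le x) \right| \le \frac{\max_M \tau^+}{\bar\tau}\, \nu\Big( \Big\{ |X_t(\cdot,0) - x| \le \frac{2m \max_M \tau^+\, \|f\|_\infty}{\sqrt{t}} \Big\} \Big) + O\Big( \frac{1}{\sqrt{m}} \Big) \]
\[ \le \frac{\max_M \tau^+}{\bar\tau}\, P\Big( |X - x| \le \frac{2m \max_M \tau^+\, \|f\|_\infty}{\sqrt{t}} \Big) + O\Big( \frac{1}{\sqrt{m}} \Big) + O\left( t^{-\frac14+\varepsilon} \right) \le O\Big( \frac{1}{\sqrt{m}} \Big) + O\left( t^{-\frac14+\varepsilon} \right) + O\Big( \frac{m}{\sqrt{t}} \Big), \]
for any real number $\varepsilon > 0$. By taking $m := \lfloor t^{\frac13} \rfloor$, we get:
\[ \sup_{x\in\mathbb{R}} \left| \mu\left(X'_t \le x\right) - \nu(X_t(\cdot,0) \le x) \right| \le O\left( t^{-\frac16} \right). \]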
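The exponent bookkeeping in points 5 and 7 can be checked with exact rational arithmetic. This is only an illustrative sketch: the function names are hypothetical, and the powers of $t$ are tracked while all multiplicative constants are ignored.

```python
from fractions import Fraction

def point5_exponents(p, L):
    """Exponents of t in the three error terms of point 5, with
    beta = 1/(2L), alpha = t^(-gamma) and gamma = 1/4 - 1/(2p) - 1/(4L)."""
    beta = Fraction(1, 2 * L)
    gamma = Fraction(1, 4) - Fraction(1, 2 * p) - Fraction(1, 4 * L)
    moment = p * (Fraction(-1, 4) + beta / 2 + gamma)  # t^{p(-1/4 + beta/2)} / alpha^p
    tail = -L * beta                                   # t^{-L beta}
    window = -gamma                                    # O(alpha) = t^{-gamma}
    return moment, tail, window

def point7_exponent(a):
    """Worst exponent of t among m^{-1/2} and m / sqrt(t) when m = t^a."""
    return max(-a / 2, a - Fraction(1, 2))

# Balancing the two terms of point 7 over a few candidate values of a.
candidates = [Fraction(k, 12) for k in range(1, 6)]
best = min(candidates, key=point7_exponent)
```

With $p = 4$ and $L = 3$ the first two exponents of point 5 both equal $-\frac12$ while the third is $-\frac{1}{24}$, which tends to $-\frac14$ as $p$ and $L$ grow; in point 7 the balance is reached at $a = \frac13$, giving the exponent $-\frac16$ of Theorem 2.3.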

8. To conclude, we notice that, for any $(\omega,s) \in \mathcal{N}$ and any real number $t > 0$, we have:
\[ |X_t(\omega,s) - X_t(\omega,0)| \le \frac{2 \max_M \tau^+\, \|f\|_\infty}{\sqrt{t}}. \]
From this and the foregoing, we conclude that we have:
\[ \sup_{x\in\mathbb{R}} |\mu(X_t \le x) - P(X \le x)| \le O\left( t^{-\frac16} \right). \]

A. Construction of L.-S. Young's Tower: Recalls

We see how Proposition 1.2 can be proved using the method developed by L.-S. Young in [14] for general hyperbolic systems. Let a real number $\eta \in ]0;1[$ be fixed.

Stable and unstable curves. Hyperbolic properties of $(M,\nu,T)$ (existence and absolute continuity of stable and unstable foliations) are used in L.-S. Young's construction. We recall here some well-known results about stable and unstable curves for $(M,\nu,T)$.

Definition. We call a curve of $M$ a curve $\gamma$ contained in a connected component of $M$ and which is $C^1$ for the parametrisation by $(r,\varphi)$. For such a curve $\gamma$, we write $l(\gamma) := \int_\gamma \sqrt{dr^2 + d\varphi^2}$. We call a stable curve (resp. unstable curve) a curve $\gamma^s$ (resp. $\gamma^u$) contained in $M \setminus R_{-\infty,0}$ (resp. in $M \setminus R_{0,+\infty}$) and satisfying
\[ \lim_{n\to+\infty} l(T^n(\gamma^s)) = 0 \quad (\text{resp. } \lim_{n\to+\infty} l(T^{-n}(\gamma^u)) = 0). \]


We recall the following results:

Proposition A.1. There exists an exactly $T$-invariant subset of $M$ of full $\nu$-measure, each point $x$ of which is contained in a unique maximal stable curve, written $\gamma^s(x)$, and in a unique maximal unstable curve, written $\gamma^u(x)$.

Proposition A.2. There exist two real numbers $\alpha \in ]0;1[$ and $C > 0$ such that, for any stable curve $\gamma^s$, any unstable curve $\gamma^u$ and any integer $n \ge 0$, we have $l(T^n(\gamma^s)) \le C\alpha^n$ and $l(T^{-n}(\gamma^u)) \le C\alpha^n$. Moreover, the intersection of a stable curve with an unstable curve contains at most one point.

Following L.-S. Young, we can construct an extension $(\tilde M_d, \tilde\nu_d, \tilde T_d)$ of $(M,\nu,T^d)$ (for some integer $d \ge 1$) and a factor $(\hat M_d, \hat\nu_d, \hat T_d)$ of $(\tilde M_d, \tilde\nu_d, \tilde T_d)$ for which the transfer operator has "good" spectral properties on some functional space. The idea of the proof of the strong decorrelation property given in Proposition 1.2 is to prove first a result analogous to Proposition 1.2 for $(M,\nu,T^d)$ using these constructions. We shall establish these results after briefly recalling the construction of these dynamical systems and stressing the properties that shall be useful for our purpose. We recall the notions of extension and factor.

Definition. Let $(B_0,\mu_0,\theta_0)$ and $(B_1,\mu_1,\theta_1)$ be two dynamical systems. The system $(B_1,\mu_1,\theta_1)$ is said to be an extension of $(B_0,\mu_0,\theta_0)$ by the map $\pi : B_1 \to B_0$ if:
• the map $\pi$ is measurable;
• $\mu_0$ is the image measure of $\mu_1$ by $\pi$, i.e. $\mu_0(A) = \mu_1(\pi^{-1}(A))$ for any measurable subset $A$ of $B_0$;
• we have: $\pi\circ\theta_1 = \theta_0\circ\pi$.
We also say that $(B_0,\mu_0,\theta_0)$ is the factor of $(B_1,\mu_1,\theta_1)$ by $\pi$.

An extension of $(M,\nu,T)$.

Definition. We call a rectangle of $M$ a measurable subset $A$ of $M$ of the following form:
\[ A = \Big( \bigcup_{\gamma^s\in I^s_A} \gamma^s \Big) \cap \Big( \bigcup_{\gamma^u\in I^u_A} \gamma^u \Big), \]
where $I^s_A$ is a family of stable curves and $I^u_A$ a family of unstable curves and such that $\gamma^s \cap \gamma^u \ne \emptyset$, for any $(\gamma^s,\gamma^u) \in I^s_A \times I^u_A$. Let a rectangle $A$ of $M$ be given. We call an s-sub-rectangle of $A$ a rectangle $B$ of the following form:
\[ B = \Big( \bigcup_{\gamma^s\in I^s_B} \gamma^s \Big) \cap \Big( \bigcup_{\gamma^u\in I^u_A} \gamma^u \Big), \]
with $I^s_B$ contained in $I^s_A$. We call a u-sub-rectangle of $A$ a rectangle $C$ of the following form:
\[ C = \Big( \bigcup_{\gamma^s\in I^s_A} \gamma^s \Big) \cap \Big( \bigcup_{\gamma^u\in I^u_C} \gamma^u \Big), \]
with $I^u_C$ contained in $I^u_A$.

In [14], L.-S. Young gives the construction of a rectangle $K = \Big( \bigcup_{\gamma^s\in I^s} \gamma^s \Big) \cap \Big( \bigcup_{\gamma^u\in I^u} \gamma^u \Big)$ contained in $M$ (where $I^s$ is a family of stable curves contained in $M \setminus R_1$ and $I^u$ a family of unstable curves contained in $M \setminus R_{-1}$) endowed with a return time $R(\cdot)$ in $K$ under the action of $T$ and of a (countable) $\nu$-essential partition $\{K_i\}_{i\ge0}$ of $K$ in s-sub-rectangles satisfying (in particular) the following:
• $R$ is equal to a constant $r_i$ on each $K_i$;
• For any $x \in K$, we have: $T^{R(x)}(\gamma^s(x)) \subseteq \gamma^s(T^{R(x)}(x))$ and $T^{R(x)}(\gamma^u(x)) \supseteq \gamma^u(T^{R(x)}(x))$;
• For any $i \ge 0$, $T^{r_i}(K_i)$ is a u-sub-rectangle of $K$;
• $K_i$ is contained in a connected component of $M \setminus R_{-r_i,0}$.

[Figure: the rectangle (labelled $\Lambda$ in the picture), a sub-rectangle (labelled $\Lambda_i$) and its image under $T^{R_i}$, drawn with stable curves $\gamma^s$ and unstable curves $\gamma^u$.]

Then, she constructs a Borel probability measure $\tilde\mu$ on $K$, $T^{R(\cdot)}$-invariant, such that $E_{\tilde\mu}[R] < +\infty$ and such that the "discrete-time suspension system" $(\tilde M_1, \tilde\nu_1, \tilde T_1)$ over $(K, \tilde\mu, T^{R(\cdot)})$ defined by the function $R(\cdot)$ as follows is an extension of $(M,\nu,T)$ (by $\tilde\pi_1 : \tilde M_1 \to M$ given by $\tilde\pi_1(x,l) = T^l(x)$):
• $\tilde M_1 := \{(x,l) : x \in K,\ 0 \le l \le R(x)-1\}$;
• $\tilde T_1(x,l) = (x,l+1)$ if $l < R(x)-1$ and $\tilde T_1(x,l) = (T^{R(x)}(x), 0)$ if $l = R(x)-1$;
• $\tilde\nu_1\Big( \bigcup_{l\ge0} A_l \times \{l\} \Big) = \frac{\sum_{l\ge0} \tilde\mu(A_l)}{E_{\tilde\mu}[R(\cdot)]}$, where, for each $l$, $A_l$ is a measurable subset of $\{R > l\}$.
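The discrete-time suspension described above can be illustrated on a toy example. The sketch below is not the billiard construction: the base map is a rotation of six points, the base set is $\{0, 3\}$ with constant return time 3, and all names are hypothetical. It only shows how the tower map climbs one level at a time and returns to level 0 through the return map, and how the projection $(x,l) \mapsto T^l(x)$ intertwines the two dynamics.

```python
def tower_map(point, return_time, return_map):
    """One step of the tower map: go up one level, or return to level 0."""
    x, level = point
    if level < return_time(x) - 1:
        return (x, level + 1)
    return (return_map(x), 0)

def tower_projection(point, base_map):
    """Projection (x, l) -> T^l(x) from the tower to the original space."""
    x, level = point
    for _ in range(level):
        x = base_map(x)
    return x

# Toy data (assumptions for illustration only).
def base_map(x):
    return (x + 1) % 6          # T: rotation on {0, ..., 5}

def return_time(x):
    return 3                    # R: constant return time to {0, 3}

def return_map(x):
    return (x + 3) % 6          # T^R restricted to {0, 3}

tower_points = [(x, level) for x in (0, 3) for level in range(3)]
```

On this example the semi-conjugacy `tower_projection(tower_map(pt, ...), base_map) == base_map(tower_projection(pt, base_map))` holds at every point of `tower_points`, which is the relation $\tilde\pi_1\circ\tilde T_1 = T\circ\tilde\pi_1$ required of an extension.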


A partition. We define $i_l : \{x \in K : R(x) > l\} \to L_l$ by $i_l(x) = (x,l)$. L.-S. Young gives the construction of a partition $D = \{L_{l,j};\ l \ge 0,\ j = 1, \dots, j_l\}$, where $\{L_{l,j}\}_j$ is a finite partition of the $l$-th level $L_l := \{(x,l') \in \tilde M_1;\ l' = l\}$ satisfying the following properties:

Properties A.3.
1. $j_0 = 1$ and $L_{0,1} = L_0 = K \times \{0\}$;
2. each $i_l^{-1}(L_{l,j})$ is an s-sub-rectangle of $K$, union of $K_i$;
3. For any $l \ge 0$, $\{i_{l+1}^{-1}(L_{l+1,j});\ j = 1, \dots, j_{l+1}\}$ is a partition of $\{R > l+1\}$ finer than the one induced by $\{i_l^{-1}(L_{l,j});\ j = 1, \dots, j_l\}$;
4. For any $x, y$ in $i_l^{-1}(L_{l,j})$ and in a same unstable curve, there exists an unstable curve containing $x$ and $y$ and contained in $M \setminus R_{-l,0}$;
5. If $\tilde T_1^{-1}(L_0) \cap L_{l,j} \ne \emptyset$, then there exists an integer $i \ge 0$ such that $\tilde T_1^{-1}(L_0) \cap L_{l,j} = K_i \times \{r_i - 1\}$.

For any $X, Y \in \tilde M_1$, we define the separation time $s(X,Y)$ between $X$ and $Y$ as follows:
\[ s(X,Y) := \max\left\{ n \ge 0 : \tilde T_1^n(Y) \in D\left(\tilde T_1^n(X)\right) \right\}. \]
The following fact shall be useful in our proof.

Fact A.4. Let $n \ge 0$ be an integer. Let $X$ and $Y$ be two points in $\tilde M_1$ such that $s(X,Y) \ge n$. Then, the intersection point $z$ of the curves $\gamma^s(\tilde\pi_1(X))$ and $\gamma^u(\tilde\pi_1(Y))$ exists. Moreover, $T^n(z)$ and $T^n(\tilde\pi_1(Y))$ are both contained in the same unstable curve.

Let us write $d := \gcd(r_i)$.

An extension of $(M,\nu,T^d)$. We can show that the dynamical system $(\tilde M_d, \tilde\nu_d, \tilde T_d)$, defined as follows, is an extension of $(M,\nu,T^d)$ by $\tilde\pi_d := \tilde\pi_{1|\tilde M_d}$:
• $\tilde M_d := \bigcup_{l\ge0} L_{ld}$;
• $\tilde\mu_d := (\tilde\nu_1)_{|\tilde M_d}$ and $\tilde\nu_d := d\cdot\tilde\mu_d$ is the probability measure proportional to $\tilde\mu_d$;
• $\tilde T_d := \left(\tilde T_1^d\right)_{|\tilde M_d}$.

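A factor with a quasicompact transfer operator. We consider the factor $(\hat M_d, \hat\nu_d, \hat T_d)$ of $(\tilde M_d, \tilde\nu_d, \tilde T_d)$ given by the canonical projection $\hat\pi_d : \tilde M_d \to \hat M_d$, where $\hat M_d$ is the set of the $\mathcal R_d$-classes of $\tilde M_d$, for the binary relation $\mathcal R_d$ defined on $\tilde M_d$ by: $(x,l)\,\mathcal R_d\,(x',l') \Leftrightarrow l = l'$ and $x, x'$ are in a same $\gamma^s \in I^s$. L.-S. Young defines a natural measure $\hat m$ on $\hat M_d$ such that $\hat\nu_d$ is absolutely continuous relatively to $\hat m$ and such that the density $\hat\rho := \frac{d\hat\nu_d}{d\hat m}$ satisfies:
• $c_0^{-1} \le \hat\rho \le c_0$, for some real number $c_0 > 1$;
• $|\hat\rho(\hat x) - \hat\rho(\hat y)| \le c_1\,\alpha_0^{\hat s(\hat x,\hat y)}\,\hat\rho(\hat x)$, for some real numbers $c_1 > 0$ and $\alpha_0 \in\, ]0;1[$;
with $\hat s(\hat\pi_d(x), \hat\pi_d(y)) := s(x,y)$.

We shall write $\hat L_{ld} := \hat\pi_d(L_{ld})$ and $\hat L_{ld,j} := \hat\pi_d(L_{ld,j})$. Let us fix $\alpha_1 := \max(\alpha, \alpha_0)$. For any real numbers $\beta \in\, ]0;1[$ and $\varepsilon > 0$, we define the functional space $V_{(\beta,\varepsilon)}$ as follows:
\[ V_{(\beta,\varepsilon)} := \big\{ \hat f : \hat M_d \to \mathbb C \text{ measurable},\ \|\hat f\|_{V_{(\beta,\varepsilon)}} < +\infty \big\}, \]
where $\|\hat f\|_{V_{(\beta,\varepsilon)}} := \|\hat f\|_{(\beta,\varepsilon,\infty)} + \|\hat f\|_{(\beta,\varepsilon,h)}$, with
\[ \|\hat f\|_{(\beta,\varepsilon,\infty)} := \sup_{l \ge 0} \big\|\hat f_{|\hat L_{ld}}\big\|_\infty\, e^{-ld.\varepsilon}, \]
\[ \|\hat f\|_{(\beta,\varepsilon,h)} := \sup_{l \ge 0;\ j = 1,\dots,j_{ld}}\ \sup_{\hat x, \hat y \in \hat L_{ld,j}} \frac{|\hat f(\hat x) - \hat f(\hat y)|}{\beta^{\lfloor \hat s(\hat x,\hat y)/d \rfloor}}\, e^{-ld.\varepsilon}. \]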

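The transfer operator associated to $\hat T_d$ relative to $\hat m$ shall be denoted by $P$. Then we have $P\hat\rho = \hat\rho$. L.-S. Young shows that we can find two real numbers $\beta \in\, ]\alpha_1;1[$ and $\varepsilon_0 > 0$ such that, for any real number $\varepsilon \in\, ]0;\varepsilon_0]$, the three following points hold:
• There exists a real number $C_0 > 0$ satisfying $\|\cdot\|_{L^2(\hat\nu_d)} \le C_0 \|\cdot\|_{V_{(\beta,\varepsilon)}}$;
• There exist two real numbers $\tau_1 \in\, ]0;1[$ and $C_1 > 0$ such that, for any integer $n \ge 0$ and for any $\hat f \in V_{(\beta,\varepsilon)}$ satisfying $\int_{\hat M_d} \hat f\, d\hat m = 0$, we have $\|P^n \hat f\|_{V_{(\beta,\varepsilon)}} \le C_1 \tau_1^{\,n} \|\hat f\|_{V_{(\beta,\varepsilon)}}$;
• We have $P(\hat f)(\hat x) = \sum_{\hat z : \hat T_d(\hat z) = \hat x} \xi(\hat z)\hat f(\hat z)$, with $\Big|\log\frac{\xi(\hat x)}{\xi(\hat y)}\Big| \le C_2\,\alpha_1^{\hat s(\hat x,\hat y) - 1}$, for any $\hat x$ and $\hat y$ in the same $\hat L_{l,j}$.
In the following, we shall consider $(\beta,\varepsilon)$ satisfying these properties.

B. Proof of the Strong Decorrelation Property

An exponential rate of decorrelation for $(M, \nu, T^d)$.

Theorem B.1. Let $\kappa \in\, ]0;\frac12[$ be a real number. There exists a constant $L_{\eta,\kappa} > 0$ such that, for any integers $\tilde m_1, \tilde m_2$, any functions $\varphi$ and $\psi$ in $H_{\eta,\tilde m_1.d}$ and in $H_{\eta,\tilde m_2.d}$ respectively and any integer $n \ge 0$, we have
\[ \big|\mathrm{Cov}_\nu\big(\varphi, \psi \circ T^{n.d}\big)\big| \le L_{\eta,\kappa}\big(\|\varphi\|_\infty C_\psi + C_\varphi \|\psi\|_\infty + \|\varphi\|_\infty.\|\psi\|_\infty\big)\,\tau_0^{\,n - \frac{\tilde m_1}{1-2\kappa}}, \]
with $\tau_0 := \max\big(\alpha^{\kappa\eta.d}, \tau_1^{\,1-2\kappa}\big)$.

Before establishing this result, we give the idea of its proof. We shall suppose $n(1-2\kappa) \ge \tilde m_1$ and shall show how the study of $\mathrm{Cov}_\nu\big(\varphi, \psi \circ T^{n.d}\big)$ leads us, after approximations, to the study of a quantity of the form $E_{\hat m}\big[\hat f.\hat g \circ \hat T_d^{\,n}\big]$, where $\hat f$ and $\hat g$ are two bounded functions defined on $\hat M_d$ such that $P^{n_1}\hat f$ is $\hat m$-centered and is in $V_{(\beta,\varepsilon)}$ with $n_1 = 2\lfloor \kappa n \rfloor + \tilde m_1$. Therefore, we shall get:
\[ \big|E_{\hat m}\big[\hat f.\hat g \circ \hat T_d^{\,n}\big]\big| = \big|E_{\hat m}\big[P^n(\hat f).\hat g\big]\big| \le \|\hat g\|_\infty \|P^n(\hat f)\|_1 \le \|\hat g\|_\infty C_0 \|P^n(\hat f)\|_{V_{(\beta,\varepsilon)}} \le \|\hat g\|_\infty C_0 C_1 \tau_1^{\,n-n_1} \|P^{n_1}(\hat f)\|_{V_{(\beta,\varepsilon)}}. \]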
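The duality $E_{\hat m}[\hat f.\hat g\circ\hat T_d^{\,n}] = E_{\hat m}[P^n(\hat f).\hat g]$ used in this sketch of proof can be checked on a finite model. The example below is only an illustration under stated assumptions: the state space, the map `T`, the reference weights `m` (not normalised) and the observables `f`, `g` are hypothetical, and the weight $\xi(z) = m(z)/m(T(z))$ is the one forced by duality on a finite set.

```python
from fractions import Fraction

def transfer_operator(f, T, m):
    """(P f)(x) = sum over z with T(z) = x of xi(z) f(z), with xi(z) = m(z) / m(T(z))."""
    Pf = {x: Fraction(0) for x in m}
    for z in m:
        x = T[z]
        Pf[x] += m[z] / m[x] * f[z]
    return Pf

def integral(h, m):
    """Integral of h against the reference weights m."""
    return sum(h[x] * m[x] for x in m)

def compose(g, T, n):
    """Return g o T^n as a dictionary."""
    out = {}
    for x in T:
        y = x
        for _ in range(n):
            y = T[y]
        out[x] = g[y]
    return out

# Hypothetical toy data.
T = {0: 1, 1: 2, 2: 0, 3: 0}
m = {0: Fraction(2), 1: Fraction(1), 2: Fraction(3), 3: Fraction(4)}
f = {0: Fraction(1), 1: Fraction(-2), 2: Fraction(5), 3: Fraction(7)}
g = {0: Fraction(3), 1: Fraction(0), 2: Fraction(-1), 3: Fraction(2)}

n = 3
Pn_f = f
for _ in range(n):
    Pn_f = transfer_operator(Pn_f, T, m)

g_after_n = compose(g, T, n)
lhs = integral({x: f[x] * g_after_n[x] for x in m}, m)
rhs = integral({x: Pn_f[x] * g[x] for x in m}, m)
```

Exact rational arithmetic is used so that both sides can be compared with `==` rather than up to rounding.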

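Proof. Let $\tilde m_1$, $\tilde m_2$, $\kappa$, $\varphi$ and $\psi$ be as in the statement of the theorem. If $n(1-2\kappa) < \tilde m_1$, then we have
\[ \big|\mathrm{Cov}_\nu(\varphi, \psi \circ T^{n.d})\big| \le \|\varphi\|_\infty.\|\psi\|_\infty \le \|\varphi\|_\infty.\|\psi\|_\infty\, \tau_0^{-\frac{\tilde m_1}{1-2\kappa}}\,\tau_0^{\,n}. \]
In the following, we shall suppose $n(1-2\kappa) \ge \tilde m_1$. We denote $k = k_n := \lfloor \kappa n \rfloor$. We have $n \ge 2\kappa n + \tilde m_1 \ge 2k + \tilde m_1$. Therefore
\[ \mathrm{Cov}_\nu(\varphi, \psi \circ T^{n.d}) = \mathrm{Cov}_{\tilde\nu_d}\big(\tilde\varphi \circ \tilde T_d^{\,k}, (\tilde\psi \circ \tilde T_d^{\,k}) \circ \tilde T_d^{\,n}\big) \]
with $\tilde\varphi := \varphi \circ \tilde\pi_d$ and $\tilde\psi := \psi \circ \tilde\pi_d$. So, we have $\big|\mathrm{Cov}_\nu(\varphi, \psi \circ T^{n.d})\big| \le A_n + B_n + C_n$, with $A_n$, $B_n$ and $C_n$ defined as follows: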

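1. We write
\[ A_n := \Big|\mathrm{Cov}_{\tilde\nu_d}\big(\tilde\varphi \circ \tilde T_d^{\,k}, (\tilde\psi \circ \tilde T_d^{\,k}) \circ \tilde T_d^{\,n}\big) - \mathrm{Cov}_{\tilde\nu_d}\big(\tilde\varphi \circ \tilde T_d^{\,k}, \hat\psi_k \circ \tilde T_d^{\,n}\big)\Big|, \]
where $\hat\psi_k(x)$ is the infimum of $\tilde\psi \circ \tilde T_d^{\,k} = \psi \circ T^{kd} \circ \tilde\pi_d$ on the atom of $\mathcal M_{2k+\tilde m_2}$ containing $x$, where we have written
\[ \mathcal M_{2k+\tilde m_2} := \Big(\bigvee_{i=0}^{(2k+\tilde m_2)d} \tilde T_1^{-i}\mathcal D\Big)_{|\tilde M_d}. \]
We shall use the regularity of $\psi$ to get an upper bound for $A_n$. We recall that, by hypothesis, $x \mapsto \psi(x)$ is Hölder of order $\eta$ in $(x, T(x), \dots, T^{\tilde m_2.d}(x))$. Moreover, we shall see that each atom of $T^{kd+j}\big(\tilde\pi_d(\mathcal M_{2k+\tilde m_2})\big)$ (for $j = 0, \dots, \tilde m_2 d$) is contained in a connected component of $M \setminus R_{-s_j,0}$ and has a diameter less than $2C\alpha_1^{\,kd}$, with $s_j := (2k+\tilde m_2)d - (kd+j) \ge kd$. Indeed, let $Y_1$ and $Y_2$ be two points in the same atom of $\mathcal M_{2k+\tilde m_2}$. Then, we have $s(Y_1, Y_2) \ge 2k + \tilde m_2$. Therefore, according to Fact A.4, the intersection point $y_3$ of $\gamma^u(\tilde\pi_d(Y_1))$ with $\gamma^s(\tilde\pi_d(Y_2))$ exists. Since $y_3$ and $\tilde\pi_d(Y_2)$ are both contained in the same stable curve and according to Proposition A.2, we have
\[ d\big(T^{kd+j}(y_3), T^{kd+j}(\tilde\pi_d(Y_2))\big) \le C\alpha^{kd+j} \le C\alpha^{kd}. \]
Moreover, according to Fact A.4, $T^{(2k+\tilde m_2)d}(y_3)$ and $T^{(2k+\tilde m_2)d}(\tilde\pi_1(Y_1))$ are both contained in the same unstable curve. So we have:
\[ d\big(T^{kd+j}(y_3), T^{kd+j}(\tilde\pi_d(Y_1))\big) \le C\alpha^{s_j} \le C\alpha^{kd}. \]
As $\psi$ is in $H_{\eta,\tilde m_2 d}$, according to the foregoing, we have
\[ \big\|\tilde\psi \circ \tilde T_d^{\,k} - \hat\psi_k\big\|_\infty \le C_\psi \big(2C\alpha^{kd}\big)^\eta \le K_\eta C_\psi \tau_0^{\,n}, \]
with $K_\eta := (2C)^\eta \alpha^{-\eta d}$. We get
\[ A_n \le \|\varphi\|_\infty \big\|\tilde\psi \circ \tilde T_d^{\,k} - \hat\psi_k\big\|_\infty \le K_\eta \|\varphi\|_\infty C_\psi \tau_0^{\,n}. \]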

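2. We write
\[ B_n := \Big|\mathrm{Cov}_{\tilde\nu_d}\big(\tilde\varphi \circ \tilde T_d^{\,k}, \hat\psi_k \circ \tilde T_d^{\,n}\big) - \mathrm{Cov}_{\tilde\nu_d}\big(\hat\varphi_k, \hat\psi_k \circ \tilde T_d^{\,n}\big)\Big|, \]
where $\hat\varphi_k(x)$ is the infimum of $\tilde\varphi \circ \tilde T_d^{\,k}$ on the atom of $\mathcal M_{2k+\tilde m_1} := \Big(\bigvee_{i=0}^{(2k+\tilde m_1)d} \tilde T_1^{-i}\mathcal D\Big)_{|\tilde M_d}$ containing $x$. As previously, we can show that we have $B_n \le K_\eta C_\varphi \|\psi\|_\infty \tau_0^{\,n}$.

3. We shall now give an upper bound for the following quantity:
\[ C_n := \Big|\mathrm{Cov}_{\tilde\nu_d}\big(\hat\varphi_k, \hat\psi_k \circ \tilde T_d^{\,n}\big)\Big| = \Big|\int_{\tilde M_d} \hat\psi_k \circ \tilde T_d^{\,n}.\hat\varphi_k\, d\tilde\nu_d - \int_{\tilde M_d} \hat\varphi_k\, d\tilde\nu_d . \int_{\tilde M_d} \hat\psi_k\, d\tilde\nu_d\Big| \]
\[ = \Big|\int_{\hat M_d} \hat\psi_k \circ \hat T_d^{\,n}.\hat\varphi_k\, d\hat\nu_d - \int_{\hat M_d} \hat\varphi_k\, d\hat\nu_d . \int_{\hat M_d} \hat\psi_k\, d\hat\nu_d\Big|, \]
where we also denote by $\hat\varphi_k$ the map $\hat\varphi_k \circ \hat\pi_d^{-1}$,
\[ \le \Big|\int_{\hat M_d} \hat\psi_k . P^n(\hat\varphi_k\hat\rho)\, d\hat m - \int_{\hat M_d} \hat\varphi_k\, d\hat\nu_d . \int_{\hat M_d} \hat\psi_k\, d\hat\nu_d\Big| \]
\[ \le \Big|\int_{\hat M_d} \hat\psi_k . \Big(P^n(\hat\varphi_k\hat\rho) - \Big(\int_{\hat M_d} \hat\varphi_k\hat\rho\, d\hat m\Big)\hat\rho\Big)\, d\hat m\Big| \]
\[ \le \|\psi\|_\infty C_0 \Big\|P^n(\hat\varphi_k\hat\rho) - \Big(\int_{\hat M_d} \hat\varphi_k\hat\rho\, d\hat m\Big)\hat\rho\Big\|_{V_{(\beta,\varepsilon)}} \]
\[ \le \|\psi\|_\infty C_0 C_1 \tau_1^{\,n-(2k+\tilde m_1)} \Big\|P^{2k+\tilde m_1}\Big(\Big(\hat\varphi_k - \int_{\hat M_d} \hat\varphi_k\hat\rho\, d\hat m\Big)\hat\rho\Big)\Big\|_{V_{(\beta,\varepsilon)}}, \]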

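since $\big(\hat\varphi_k - \int_{\hat M_d} \hat\varphi_k\hat\rho\, d\hat m\big)\hat\rho$ is $\hat m$-centered and we shall see that $P^{2k+\tilde m_1}(\hat\varphi_k\hat\rho)$ is in $V_{(\beta,\varepsilon)}$. Let $l \ge 0$ and $j = 1, \dots, j_{ld}$ be two integers. We denote by $\mathcal A_{ld,j}$ the set of atoms $A$ of $\hat{\mathcal M}_{2k+\tilde m_1} := \hat\pi_d\big(\mathcal M_{2k+\tilde m_1}\big)$ such that $\hat T_d^{\,2k+\tilde m_1}(A) \subseteq \hat L_{ld,j}$. Let $A$ be an atom of $\mathcal A_{ld,j}$. Then, the map $\hat T_d^{\,2k+\tilde m_1}$ defines a one-to-one map from $A$ onto $\hat L_{ld,j}$. Indeed, Point 5 of Properties A.3, the fact that each $K_i$ is a s-sub-rectangle and that $T^{r_i}(K_i)$ is a u-sub-rectangle ensure that $\hat T_d$ defines a one-to-one map from each $\hat B$ (in $\hat\pi_d\big(\bigvee_{j'=0}^{d} \tilde T_1^{-j'}\mathcal D\big)$ and such that $\hat T_d(\hat B) \subseteq \hat L_0$) onto $\hat L_0$. We denote by $\hat T_{(A,d)}^{-(2k+\tilde m_1)}$ the inverse map of $\hat T_d^{\,2k+\tilde m_1}$ restricted to $A$. We notice that, for any $\hat x \in \hat L_{ld,j}$, we have
\[ P^{2k+\tilde m_1}(\hat f)(\hat x) = \sum_{A \in \mathcal A_{ld,j}} \xi_A(\hat x)\, \hat f\big(\hat T_{(A,d)}^{-(2k+\tilde m_1)}(\hat x)\big), \]
where $\xi_A(\hat x) := \prod_{i=0}^{2k+\tilde m_1-1} \xi\big(\hat T_d^{\,i}\big(\hat T_{(A,d)}^{-(2k+\tilde m_1)}(\hat x)\big)\big)$. Since we have $P(\hat\rho) = \hat\rho$, $\xi_A \ge 0$ and $\hat\rho \ge 0$, we have
\[ \big\|P^{2k+\tilde m_1}(\hat\varphi_k\hat\rho)\big\|_{(\beta,\varepsilon,\infty)} = \sup_{l,j} \big\|P^{2k+\tilde m_1}(\hat\varphi_k\hat\rho)_{|\hat L_{ld,j}}\big\|_\infty e^{-ld\varepsilon} \le \sup_{l,j} \|\varphi\|_\infty \big\|P^{2k+\tilde m_1}(\hat\rho)_{|\hat L_{ld,j}}\big\|_\infty e^{-ld\varepsilon} \le c_0 \|\varphi\|_\infty. \]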

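According to the foregoing, for any $\hat x, \hat y \in \hat L_{l,j}$, we have
\[ \Big|\log\frac{\xi_A(\hat x)}{\xi_A(\hat y)}\Big| \le \frac{C_2\,\alpha_1^{\hat s(\hat x,\hat y)}}{1-\alpha_1}, \]
and so $|\xi_A(\hat y) - \xi_A(\hat x)| \le C'_1\, \xi_A(\hat x)\, \alpha_1^{\hat s(\hat x,\hat y)}$, for some constant $C'_1 > 0$ independent of $n$ and of $A$. We denote by $c_{A,ld,j}$ the constant to which $\hat\varphi_k$ is equal on $A$ and $\hat\rho^{(A)}_{2k+\tilde m_1,ld,j} := \hat\rho \circ \hat T_{(A,d)}^{-(2k+\tilde m_1)}$. Writing $\hat\rho^{(A)}$ for $\hat\rho^{(A)}_{2k+\tilde m_1,ld,j}$ and taking all suprema over $l, j$ and over $\hat x, \hat y \in \hat L_{ld,j}$, we get
\[ \big\|P^{2k+\tilde m_1}(\hat\varphi_k\hat\rho)\big\|_{(\beta,\varepsilon,h)} = \sup_{l,j}\ \sup_{\hat x,\hat y} \frac{\big|P^{2k+\tilde m_1}(\hat\varphi_k\hat\rho)(\hat x) - P^{2k+\tilde m_1}(\hat\varphi_k\hat\rho)(\hat y)\big|}{\beta^{\lfloor \hat s(\hat x,\hat y)/d \rfloor}}\, e^{-ld\varepsilon} \]
\[ \le \sup_{l,j}\ \sup_{\hat x,\hat y} \sum_{A \in \mathcal A_{ld,j}} |c_{A,ld,j}| . \frac{\big|\xi_A(\hat x)\hat\rho^{(A)}(\hat x) - \xi_A(\hat y)\hat\rho^{(A)}(\hat y)\big|}{\beta^{\lfloor \hat s(\hat x,\hat y)/d \rfloor}}\, e^{-ld\varepsilon} \]
\[ \le \sup_{l,j}\ \sup_{\hat x,\hat y} \|\varphi\|_\infty \sum_{A \in \mathcal A_{ld,j}} \frac{\big|\xi_A(\hat x)\hat\rho^{(A)}(\hat x) - \xi_A(\hat y)\hat\rho^{(A)}(\hat y)\big|}{\beta^{\lfloor \hat s(\hat x,\hat y)/d \rfloor}}\, e^{-ld\varepsilon} \]
\[ \le (C'_1 + c_1 + c_1 C'_1) \sup_{l,j}\ \sup_{\hat x,\hat y} \|\varphi\|_\infty \frac{\alpha_1^{\hat s(\hat x,\hat y)}}{\beta^{\lfloor \hat s(\hat x,\hat y)/d \rfloor}} \sum_{A \in \mathcal A_{ld,j}} \xi_A(\hat x)\hat\rho^{(A)}(\hat x)\, e^{-ld\varepsilon} \]
\[ \le (C'_1 + c_1 + c_1 C'_1) \sup_{l,j}\ \sup_{\hat x,\hat y} \|\varphi\|_\infty \frac{\alpha_1^{\hat s(\hat x,\hat y)}\hat\rho(\hat x)}{\beta^{\lfloor \hat s(\hat x,\hat y)/d \rfloor}}\, e^{-ld\varepsilon} \]
\[ \le (C'_1 + c_1 + c_1 C'_1) \sup_{l,j}\ \sup_{\hat x,\hat y} \|\varphi\|_\infty \frac{\alpha_1^{\hat s(\hat x,\hat y)}}{\alpha_1^{\hat s(\hat x,\hat y)}}\,\hat\rho(\hat x)\, e^{-ld\varepsilon} \]
\[ \le (C'_1 + c_1 + c_1 C'_1) \|\varphi\|_\infty c_0. \]
In particular, we have shown that $P^{2k+\tilde m_1}(\hat\varphi_k\hat\rho)$ is in $V_{(\beta,\varepsilon)}$. According to the foregoing, $|\mathrm{Cov}_\nu(\varphi, \psi \circ T^{nd})|$ is less than:
\[ K_\eta\big(\|\varphi\|_\infty C_\psi + C_\varphi\|\psi\|_\infty\big)\tau_0^{\,n} + C_0 C_1 (C'_1 + c_1 + c_1 C'_1) c_0\, \tau_1^{(1-2\kappa)n - \tilde m_1} \|\varphi\|_\infty \|\psi\|_\infty \]
\[ \le K_\eta\big(\|\varphi\|_\infty C_\psi + C_\varphi\|\psi\|_\infty\big)\tau_0^{\,n} + C_0 C_1 (C'_1 + c_1 + c_1 C'_1) c_0\, \tau_0^{\,n - \frac{\tilde m_1}{1-2\kappa}} \|\varphi\|_\infty \|\psi\|_\infty. \qquad\square \]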


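End of the proof of Proposition 1.2. Proposition 1.2 is a direct consequence of the following result (with $r = \frac{1}{1-2\kappa}$).

Corollary B.2. Let $m_1, m_2 \ge 0$ be two integers and $\kappa \in\, ]0;1/2[$ be a real number. For any functions $\varphi$ and $\psi$ in $H_{\eta,m_1}$ and in $H_{\eta,m_2}$ respectively and any integer $n \ge 0$, we have
\[ \big|\mathrm{Cov}_\nu\big(\varphi, \psi \circ T^n\big)\big| \le C_{\eta,\kappa}\big(\|\varphi\|_\infty C_\psi + C_\varphi\|\psi\|_\infty + \|\varphi\|_\infty.\|\psi\|_\infty\big)\,{\tau'_0}^{\,n - \frac{m_1}{1-2\kappa}}, \]
with $\tau'_0 := \max\big(\alpha^{\kappa\eta}, \tau_1^{\frac{1-2\kappa}{d}}\big) = \tau_0^{1/d}$ and $C_{\eta,\kappa} := L_{\eta,\kappa}\,{\tau'_0}^{-\frac{d}{1-2\kappa} - d + 1}$.

Proof. For any integer $l = 0, \dots, d-1$, we apply the foregoing to the couple $(\varphi, \psi \circ T^l)$. Indeed, $\varphi$ is in $H_{\eta,m_1}$ so in $H_{\eta,\lceil m_1/d \rceil d}$ with $C_\varphi^{(\eta,\lceil m_1/d \rceil d)} \le C_\varphi^{(\eta,m_1)}$. On the other hand, $\psi \circ T^l$ is in $H_{\eta,\lceil (m_2+l)/d \rceil d}$, with $C_{\psi \circ T^l}^{(\eta,\lceil (m_2+l)/d \rceil d)} \le C_\psi^{(\eta,m_2)}$. Consequently, writing $\Sigma := \|\varphi\|_\infty C_\psi + C_\varphi\|\psi\|_\infty + \|\varphi\|_\infty\|\psi\|_\infty$, for any integer $k \ge 0$, we have:
\[ \big|\mathrm{Cov}_\nu\big(\varphi, \psi \circ T^{kd+l}\big)\big| = \big|\mathrm{Cov}_\nu\big(\varphi, (\psi \circ T^l) \circ T^{kd}\big)\big| \le L_{\eta,\kappa}\, \Sigma\, \tau_0^{\,k - \frac{\lceil m_1/d \rceil}{1-2\kappa}} \]
\[ \le L_{\eta,\kappa}\, \Sigma\, {\tau'_0}^{\,kd - \frac{\lceil m_1/d \rceil d}{1-2\kappa}} \le L_{\eta,\kappa}\, \Sigma\, {\tau'_0}^{\,kd - \frac{m_1+d}{1-2\kappa}} \le L_{\eta,\kappa}\, \Sigma\, {\tau'_0}^{\,kd+l - \frac{m_1+d}{1-2\kappa} - d + 1}. \qquad\square \]

Acknowledgements. Thanks are due to Jean-Pierre Conze for all the stimulating discussions we have had and for his advice on the writing of this paper. I would also like to thank Christophe Jan for the explanations he patiently gave me about the method he has developed.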

References
1. Bunimovich, L.A., Chernov, N.I. and Sinai, Ya.G.: Statistical properties of two-dimensional hyperbolic billiards. Russ. Math. Survey 46, (4), 47–106 (1991)
2. Bunimovich, L.A. and Sinai, Ya.G.: Statistical properties of Lorentz gas with periodic configuration of scatterers. Commun. Math. Phys. 78, 479–497 (1981)
3. Chernov, N.I.: Decay of Correlations and Dispersing Billiards. J. of Stat. Phys. 94, 513–556 (1999)
4. Chernov, N.I. and Sinai, Ya.G.: Ergodic properties of certain systems of two-dimensional discs and three-dimensional balls. Russ. Math. Survey 42, (3), 181–207 (1987)
5. Esseen, C.: Fourier analysis of distribution functions. A mathematical study of the Laplace-Gaussian law. Acta Math. 77, 1–125 (1945)
6. Jan, C.: Vitesse de convergence dans le TCL pour des chaînes de Markov et certains processus associés à des systèmes dynamiques. C. R. Acad. Sci. Paris Sér. I Math. 331, (5), 395–398 (2000)
7. Jan, C.: Vitesse de convergence dans le TCL pour des processus associés à des systèmes dynamiques et aux produits de matrices aléatoires. Ph. D. thesis, University of Rennes 1, France, 2001
8. Jan, C.: Rates of convergence for some processes under mixing conditions and application to random matrix products. Preprint, 2001
9. Katok, A., Strelcyn, J.-M., Ledrappier, F. and Przytycki, F.: Invariant manifolds, entropy and billiards: smooth maps with singularities. Lect. Notes in Math. 1222. Berlin–Heidelberg–New York: Springer, 1986
10. Pène, F.: Applications des propriétés stochastiques des systèmes dynamiques de type hyperbolique: ergodicité du billard dispersif dans le plan, moyennisation d'équations différentielles perturbées par un flot ergodique. Ph. D. thesis, University of Rennes I, France, 2000
11. Pène, F.: Averaging method for differential equations perturbed by dynamical systems. Preprint, 2001
12. Serfling, R.J.: Moment inequalities for the maximum cumulative sum. Ann. Math. Stat. 41, 1227–1234 (1970)
13. Sinai, Ya.G.: Dynamical systems with elastic reflections. Russ. Math. Survey 25, (1), 137–189 (1970)
14. Young, L.-S.: Statistical properties of dynamical systems with some hyperbolicity. Ann. of Math. 147, 585–650 (1998)

Communicated by G. Gallavotti

Commun. Math. Phys. 225, 121 – 130 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

On Quantization of Quadratic Poisson Structures

D. Manchon1, M. Masmoudi2, A. Roux2

1 Institut Elie Cartan, CNRS, BP 239, 54506 Vandoeuvre Cedex, France. E-mail: [email protected]
2 Université de Metz, Département de Mathématiques, île du Saulcy, 57045 Metz Cedex 01, France. E-mail: [email protected]; [email protected]

Received: 31 May 2001 / Accepted: 17 August 2001

Abstract: Any classical r-matrix on the Lie algebra of linear operators on a real vector space $V$ gives rise to a quadratic Poisson structure on $V$ which admits a deformation quantization stemming from the construction of V. Drinfel'd [Dr], [Gr]. We exhibit in this article an example of quadratic Poisson structure which does not arise this way.

1. Introduction

Let $V$ be a finite-dimensional real vector space. The linear action of the Lie group $Gl(V)$ on $V$ induces by differentiation a Lie algebra isomorphism $J$ from $\mathfrak g = \mathfrak{gl}(V)$ to the Lie algebra of linear vector fields on $V$. Given a basis $(e_1, \dots, e_n)$ and then identifying $\mathfrak{gl}(V)$ with the Lie algebra of real $n \times n$ matrices, the isomorphism is given by:
\[ J(E_{ij}) = x_i \partial_j, \]
where $E_{ij}$ is the matrix with entries all vanishing except one equal to 1 on the $i$th row and $j$th column.

There is a unique way to extend the Lie bracket of $\mathfrak g$ to a graded Lie bracket, called the Schouten bracket, on the shifted exterior algebra $\Lambda(\mathfrak g)[1]$ in a way compatible with the exterior product. The shift means that elements of $\Lambda^k(\mathfrak g)$ are of degree $k-1$, and then the Schouten bracket maps $\Lambda^k(\mathfrak g) \times \Lambda^l(\mathfrak g)$ to $\Lambda^{k+l-1}(\mathfrak g)$. The exterior algebra $\Lambda(\mathfrak g)$ then inherits a structure of Gerstenhaber algebra (cf. for example [V], Introduction).

The space $T_{\mathrm{poly}}(V)$ of polyvector fields on $V$ is also endowed with a Gerstenhaber algebra structure, with Schouten bracket extending the Lie bracket of vector fields [V]. The subalgebra (for the exterior product) generated by linear vector fields is a Gerstenhaber subalgebra $\widetilde\Lambda(V)$ of $T_{\mathrm{poly}}(V)$. The isomorphism $J$ extends to a surjective Gerstenhaber algebra morphism:
\[ J^\bullet : \Lambda^\bullet(\mathfrak g) \longrightarrow \widetilde\Lambda^\bullet(V). \]


The map $J^k$ has nontrivial kernel for $k \ge 2$ as long as $V$ has dimension $\ge 2$. For example we have:
\[ J^2(E_{ij} \wedge E_{kj}) = 0. \]
A classical r-matrix on $\mathfrak g$ is by definition an element $r$ of $\mathfrak g \wedge \mathfrak g$ such that $[r,r] = 0$. According to the discussion above the bivector field $J^2(r)$ then defines a quadratic Poisson structure on $V$. A natural question then arises: can one recover in this way any quadratic Poisson structure on $V$? It was claimed true in [BR] but Z.-J. Liu and P. Xu discovered that the authors' argument was not correct [LX]. They brought up a positive answer in the two-dimensional case ([LX, Prop. 2.1]): namely the general quadratic Poisson structure $(a x_1^2 + 2b x_1 x_2 + c x_2^2)\,\partial_1 \wedge \partial_2$ is equal to $J^2(r)$ with:
\[ r = \begin{pmatrix} b & -a \\ c & -b \end{pmatrix} \wedge \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]
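The two-dimensional formula can be checked pointwise. The sketch below assumes the conventions $J(A) = \sum_{i,j} A_{ij}\,x_i\partial_j$ (matrix read row by row) and $J^2(X \wedge Y) = J(X) \wedge J(Y)$; the helper names are hypothetical.

```python
import random

def J(A, x):
    """Components of the linear vector field sum_ij A[i][j] x_i d_j at the point x."""
    n = len(x)
    return [sum(A[i][j] * x[i] for i in range(n)) for j in range(n)]

def wedge_12(X, Y):
    """Coefficient of d_1 ^ d_2 in X ^ Y for two planar vector fields."""
    return X[0] * Y[1] - X[1] * Y[0]

def quadratic_coefficient(a, b, c, x):
    """Coefficient a x1^2 + 2 b x1 x2 + c x2^2 of the general quadratic structure."""
    return a * x[0] ** 2 + 2 * b * x[0] * x[1] + c * x[1] ** 2

def check_two_dimensional_case(trials=200, seed=0):
    """Compare both sides on random integer data (exact arithmetic)."""
    rng = random.Random(seed)
    identity = [[1, 0], [0, 1]]
    for _ in range(trials):
        a, b, c = (rng.randint(-9, 9) for _ in range(3))
        x = [rng.randint(-9, 9), rng.randint(-9, 9)]
        A = [[b, -a], [c, -b]]
        if wedge_12(J(A, x), J(identity, x)) != quadratic_coefficient(a, b, c, x):
            return False
    return True

two_dim_ok = check_two_dimensional_case()
```

Since both sides are polynomials of degree 2 in $x$ and linear in $(a,b,c)$, agreement on many integer points is strong evidence for the identity, though it is not a symbolic proof.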

We give here a negative answer to this question in general: after a somewhat lengthy but elementary computation we show in Sect. 3 that the bivector field $(x_1^2 + x_2 x_3)\,\partial_2 \wedge \partial_3$ on $\mathbb R^3$ is a counterexample to this conjecture: it lies outside the image of the set of r-matrices by $J^2$. We recall in Sect. 2 the construction by V. G. Drinfel'd of a translation-invariant deformation quantization on any Lie group $G$ once given a classical r-matrix on the Lie algebra $\mathfrak g$ [Dr,T]. The problem reduces to the case when $r$ is non-degenerate, and the deformation quantization is then obtained by suitable restriction and transportation of the Baker–Campbell–Hausdorff deformation quantization ([Ka]) of the dual $\tilde{\mathfrak g}^*$ of the central extension $\tilde{\mathfrak g}$ of $\mathfrak g$ defined by $r$. The construction works moreover for any Kontsevich-type star product [ABM] on $\tilde{\mathfrak g}^*$. For $\mathfrak g = \mathfrak{gl}(V)$, such a star product on $\tilde{\mathfrak g}^*$ gives almost immediately through this construction a deformation quantization of the quadratic Poisson structure $J^2(r)$ on $V$. Deformation quantization of some particular quadratic Poisson structures has been considered by several people, namely Omori, Maeda and Yoshioka [OMY, Prop. 4.7]. Explicit computations for all quadratic Poisson structures in dimension 3 (then including our counterexample as well) have been performed by El Galiou and Tihami [ET], by a case-by-case method based on the classification of Dufour and Haraki [DH]. Let us recall that the existence of a deformation quantization for any Poisson manifold is a direct consequence of M. Kontsevich's formality theorem.

2. Quantization of Poisson Structures Coming from r-Matrices

Let $\mathfrak g$ be a Lie algebra, and let $r \in \mathfrak g \wedge \mathfrak g$ be a classical r-matrix. It defines an antisymmetric operator $r : \mathfrak g^* \longrightarrow \mathfrak g$. The classical Yang–Baxter equation $[r,r] = 0$ is equivalent to:
\[ \langle \xi, [r(\eta), r(\zeta)] \rangle + \langle \eta, [r(\zeta), r(\xi)] \rangle + \langle \zeta, [r(\xi), r(\eta)] \rangle = 0 \qquad (*) \]
for any $\xi, \eta, \zeta \in \mathfrak g^*$. The r-matrix $r$ defines a left translation-invariant Poisson structure on any Lie group $G$ with Lie algebra $\mathfrak g$.


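2.1. A central extension. We can firstly suppose $r$ nondegenerate, i.e. that $r$ is invertible with inverse $\omega$, where $\omega$ belongs to $\mathfrak g^* \wedge \mathfrak g^*$. The classical Yang–Baxter equation is in this case equivalent to:
\[ \langle \omega X, [Y,Z] \rangle + \langle \omega Y, [Z,X] \rangle + \langle \omega Z, [X,Y] \rangle = 0 \]
for any $X, Y, Z \in \mathfrak g$, i.e. is equivalent to the fact that $\omega$ is a 2-cocycle with values in the trivial representation. Let $\tilde{\mathfrak g}$ be the central extension of $\mathfrak g$ by this cocycle, defined by $\tilde{\mathfrak g} = \mathfrak g \oplus \mathbb R$ with a bracket:
\[ [X + \alpha, Y + \beta] = [X,Y] + \langle \omega, X \wedge Y \rangle. \]
The cocycle condition on $\omega$ is equivalent to the Jacobi identity for this bracket. Let $X_0 = (0,1) \in \tilde{\mathfrak g}$, and let $H$ be the hyperplane in $\tilde{\mathfrak g}^*$ defined by:
\[ H = \{\xi \in \tilde{\mathfrak g}^* \,/\, \langle \xi, X_0 \rangle = 1\}. \]
It is the symplectic leaf through the point $\xi_0$ defined by $\langle \xi_0, \mathfrak g \rangle = 0$ and $\langle \xi_0, X_0 \rangle = 1$. It is then the coadjoint orbit $\mathrm{Ad}^* \tilde G.\xi_0$ for any Lie group $\tilde G$ with Lie algebra $\tilde{\mathfrak g}$.

2.2. Kontsevich star products (after [ABM]). The linear Poisson manifold $\tilde{\mathfrak g}^*$ admits a whole bunch of equivalent $\mathrm{Ad}^* \tilde G$-invariant deformation quantizations which can be built from the enveloping algebra $U(\tilde{\mathfrak g})$, for example the Baker–Campbell–Hausdorff quantization or the Kontsevich quantization [K, ABM, Ka, Di]. The Baker–Campbell–Hausdorff quantization is given by the following integral formula [ABM]:
\[ (u \,\#_{BCH}\, v)(\xi) = \int_{\tilde{\mathfrak g} \times \tilde{\mathfrak g}} \mathcal F^{-1}u(x)\, \mathcal F^{-1}v(y)\, e^{-i\langle \xi,\, x \,._\hbar\, y \rangle}\, dx\, dy, \]
where the inverse Fourier transform is given by:
\[ \mathcal F^{-1}u(x) = (2\pi)^{-n} \int_{\tilde{\mathfrak g}^*} u(\eta)\, e^{i\langle x, \eta \rangle}\, d\eta, \]
and $x \,._\hbar\, y$ stands for the Baker–Campbell–Hausdorff expansion:
\[ x \,._\hbar\, y = x + y + \frac{\hbar}{2}[x,y] + \frac{\hbar^2}{12}\big([x,[x,y]] + [y,[y,x]]\big) + \cdots. \]
The Lebesgue measure $d\eta$ on $\tilde{\mathfrak g}^*$ is normalized so that it is the dual measure of the Lebesgue measure $dx$ on $\tilde{\mathfrak g}$.

The quantizations we can consider here are the ones called "Kontsevich star products" in [ABM]. They are all equivalent to the BCH quantization. The equivalence is a formal series of differential operators with constant coefficients on $\tilde{\mathfrak g}^*$, precisely given by a formal series of $\tilde G$-invariant polynomials on $\tilde{\mathfrak g}$ of the following form:
\[ F(x) = 1 + \sum_{k \ge 1} \hbar^{2k} \sum_{c \ge 1}\ \sum_{(s_1, \dots, s_c) \in S^c_{2k}} a_{s_1, \dots, s_c}\, \mathrm{Tr}(\mathrm{ad}\, x)^{s_1} \cdots \mathrm{Tr}(\mathrm{ad}\, x)^{s_c}, \]
where $S^c_{2k}$ stands for those $(s_1, \dots, s_c)$ in $\mathbb N^c$ such that $s_1 + \cdots + s_c = 2k$, $s_1 \le s_2 \le \cdots \le s_c$ and $s_j \ne 1$. The star product obtained this way admits the following integral form:
\[ (u \# v)(\xi) = \int_{\tilde{\mathfrak g} \times \tilde{\mathfrak g}} \frac{F(-ix)\,F(-iy)}{F\big(-i(x \,._\hbar\, y)\big)}\, e^{-i\langle \xi,\, x \,._\hbar\, y \rangle}\, \mathcal F^{-1}u(x)\, \mathcal F^{-1}v(y)\, dx\, dy. \]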
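The Baker–Campbell–Hausdorff expansion quoted above can be tested exactly on a nilpotent example. The sketch below is an illustration only: it assumes the normalisation $e^{\hbar x}e^{\hbar y} = e^{\hbar (x \,._\hbar\, y)}$ and works in the Lie algebra of strictly upper triangular $3 \times 3$ matrices (a choice made here, not taken from the paper), where all double brackets vanish and the exponential series is finite.

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(*matrices):
    n = len(matrices[0])
    return [[sum(M[i][j] for M in matrices) for j in range(n)] for i in range(n)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    return add(matmul(A, B), scale(-1, matmul(B, A)))

def exp_nilpotent(A):
    """exp(A) = I + A + A^2 / 2, exact whenever A^3 = 0."""
    n = len(A)
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    return add(identity, A, scale(Fraction(1, 2), matmul(A, A)))

def bch(x, y, h):
    """BCH expansion truncated after the terms of order h^2."""
    first = scale(h / 2, bracket(x, y))
    second = scale(h * h / 12, add(bracket(x, bracket(x, y)), bracket(y, bracket(y, x))))
    return add(x, y, first, second)

def heisenberg(a, b, c):
    """Strictly upper triangular 3x3 matrix with entries a, b (off-diagonal) and c (corner)."""
    zero = Fraction(0)
    return [[zero, Fraction(a), Fraction(c)], [zero, zero, Fraction(b)], [zero, zero, zero]]

x = heisenberg(1, 2, 3)
y = heisenberg(-4, 5, 6)
h = Fraction(1, 3)

lhs = matmul(exp_nilpotent(scale(h, x)), exp_nilpotent(scale(h, y)))
rhs = exp_nilpotent(scale(h, bch(x, y, h)))
double_bracket = bracket(x, bracket(x, y))
```

In this algebra the series stops after the $\frac{\hbar}{2}[x,y]$ term, so the comparison is exact rather than up to higher-order corrections.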


2.3. Quantization of left-invariant Poisson structures. It is easy to derive from the fact that $X_0$ is central that any of the deformation quantizations defined above does define by restriction a deformation quantization of $H$. Let $G$ be the subgroup of $\tilde G$ with Lie algebra $\mathfrak g$. We clearly have:
\[ \mathrm{Ad}^* G.\xi_0 = \mathrm{Ad}^* \tilde G.\xi_0 = H. \]
It is moreover easy to check that the stabilizer of $\xi_0$ in $\tilde G$ is the one-dimensional subgroup with Lie algebra generated by $X_0$. It is a simple consequence of the nondegeneracy of the alternate bilinear form $\omega$. The dimension of $G$ is equal to the dimension of $H$. The map:
\[ \varphi : G \longrightarrow H, \qquad g \longmapsto \mathrm{Ad}^* g.\xi_0 \]
is then a local $G$-equivariant diffeomorphism near the identity (with left translation on the left-hand side and coadjoint action on the right-hand side). We can then transport any deformation quantization of $H$ and get a left translation-invariant deformation quantization of a neighbourhood of the identity in $G$. It extends by translation invariance to the whole group $G$, as well as to any Lie group locally isomorphic to $G$. The deformation quantization on $G$ can be written:
\[ u \# v = \sum_{k \ge 0} \hbar^k C_k(u,v), \]
where the $C_k$'s are left-invariant bidifferential operators on $G$. There exists then an element $F = \sum_k \hbar^k F_k$ in $\big(U(\mathfrak g) \otimes U(\mathfrak g)\big)[[\hbar]]$ such that:
\[ u \# v(g) = F(u \otimes v)(g,g). \qquad (**) \]

Let us now fix a basis $x_1, \dots, x_n$ of $\mathfrak g$, and consider elements of $U(\mathfrak g)$ as polynomials $F(x) = F(x_1, \dots, x_n)$ of the $n$ noncommuting variables $x_1, \dots, x_n$, which satisfy the relations:
\[ x_i x_j - x_j x_i = [x_i, x_j] = \sum_k c_{ij}^k x_k. \]
Introducing a second identical set of noncommuting variables $y = (y_1, \dots, y_n)$ commuting with the $x_j$'s, we can write any element $A \in U(\mathfrak g) \otimes U(\mathfrak g)$ as $A(x,y)$. The element $F$ defined above can then be written $F(x,y)$ as a formal series with coefficients $F_k(x,y)$.

Proposition 2.1. The formal series $F = F(x,y) \in \big(U(\mathfrak g) \otimes U(\mathfrak g)\big)[[\hbar]]$ above verifies:
(1) $F_0(x,y) = 1$.
(2) $F_1(x,y) = \frac12 \sum_{i,j} r_{ij}\, x_i y_j$.
(3) $F_k(x,0) = F_k(0,y) = 0$ for $k \ge 1$.
(4) $F(x+y,z)F(x,y) = F(x,y+z)F(y,z)$.
Conversely any $F(x,y)$ endowed with those 4 properties defines by formula $(**)$ a left translation-invariant deformation quantization of $G$.
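Property (4) is easiest to see in the commutative case. The following sketch assumes an abelian Lie algebra, where the variables can be evaluated at numbers, and the hypothetical choice $F(x,y) = \exp\big(\frac{\hbar}{2}\sum_{i,j} r_{ij}x_iy_j\big)$; it is not the general construction of the proposition, only a sanity check of the cocycle identity in that special case.

```python
import math
from fractions import Fraction

def pairing(r, x, y):
    """Bilinear form sum_ij r[i][j] x_i y_j."""
    n = len(x)
    return sum(r[i][j] * x[i] * y[j] for i in range(n) for j in range(n))

def log_F(r, x, y, h):
    """Exponent of the assumed twist F(x, y) = exp(h/2 * r(x, y))."""
    return h / 2 * pairing(r, x, y)

def F(r, x, y, h):
    return math.exp(log_F(r, x, y, h))

def vadd(u, v):
    return [a + b for a, b in zip(u, v)]

# Hypothetical antisymmetric matrix and sample points.
r = [[0, 2, -1], [-2, 0, 3], [1, -3, 0]]
h = Fraction(1, 10)
x = [Fraction(1), Fraction(-2), Fraction(3)]
y = [Fraction(4), Fraction(0), Fraction(-1)]
z = [Fraction(-2), Fraction(5), Fraction(2)]
zero = [Fraction(0)] * 3

# Property (4), compared at the level of exponents (exact) and of values (floating point).
left_exponent = log_F(r, vadd(x, y), z, h) + log_F(r, x, y, h)
right_exponent = log_F(r, x, vadd(y, z), h) + log_F(r, y, z, h)
left_value = F(r, vadd(x, y), z, h) * F(r, x, y, h)
right_value = F(r, x, vadd(y, z), h) * F(r, y, z, h)
```

Here the identity reduces to bilinearity of the exponent; in the noncommutative case the same relation is a genuine constraint on $F$.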


Proof. It is well-known: see for example [Dr,T]. The first condition comes from the fact that $C_0(u,v)$ is the ordinary product $uv$. The second property comes from the expression of the left-invariant Poisson bracket on $G$ defined from the r-matrix; the third property expresses the fact that $1 \# u = u \# 1 = u$, and the last property is an expression of the associativity of the star product $\#$. Let us elaborate a bit on that last point: any element $X_j$ of the basis corresponds to the polynomial expression $G(x) = x_j$. The Leibniz rule:
\[ X_j.(\varphi\psi) = (X_j.\varphi)\psi + \varphi.(X_j.\psi) \]
can be written as:
\[ G(x) \circ m = m \circ G(x+y), \]
where $m : C^\infty(G \times G) \to C^\infty(G)$ stands for multiplication, here the restriction to the diagonal. The formula above extends to any polynomial expression $G$ representing any element of the enveloping algebra. We have then:
\[ (u \# v) \# w = \big(m \circ F(u \otimes v)\big) \# w = m \circ F\big((m \circ F(u \otimes v)) \otimes w\big) \]
\[ = m \circ F \circ (m \otimes I) \circ (F \otimes I)(u \otimes v \otimes w) \]
\[ = m \circ F(x,z) \circ (m \otimes I) \circ F(x,y)(u \otimes v \otimes w) \]
\[ = m \circ (m \otimes I) \circ F(x+y,z)F(x,y)(u \otimes v \otimes w). \]
Similarly we have:
\[ u \# (v \# w) = m \circ (I \otimes m) \circ F(x,y+z)F(y,z)(u \otimes v \otimes w). \]
The associativity condition for the product $\#$ is then equivalent to Property (4) of the proposition.

Let us now look at the case when $r$ is degenerate. Then the image $\mathfrak g_0$ of $r$ is a subspace strictly contained in $\mathfrak g$. By skew-symmetry $\mathfrak g_0$ is also the orthogonal of the kernel of $r$, and the classical Yang–Baxter equation $[r,r] = 0$ ensures thanks to $(*)$ that $\mathfrak g_0$ is a Lie subalgebra of $\mathfrak g$. We get this way a nondegenerate $r_0 \in \mathfrak g_0 \wedge \mathfrak g_0$ such that $[r_0, r_0] = 0$. Applying the procedure above we get an $F = \sum_k \hbar^k F_k$ in $\big(U(\mathfrak g_0) \otimes U(\mathfrak g_0)\big)[[\hbar]]$ which can be seen as an element of $\big(U(\mathfrak g) \otimes U(\mathfrak g)\big)[[\hbar]]$.

2.4. A class of easily quantizable Poisson structures. Let $G$ be a Lie group with Lie algebra $\mathfrak g$. Let $F(x,y)$ be a formal series in $\big(U(\mathfrak g) \otimes U(\mathfrak g)\big)[[\hbar]]$ satisfying Properties (1)–(4) of Proposition 2.1 (for example one constructed from an r-matrix along the lines above). Let $M$ be any differentiable manifold endowed with an action of $G$. The differentiation of this action induces a Lie algebra morphism from $\mathfrak g$ to the vector fields on $M$, which extends to an algebra morphism from $U(\mathfrak g)$ to the algebra of differential operators on $M$. Similarly it induces an algebra morphism from $U(\mathfrak g) \otimes U(\mathfrak g)$ to the algebra of differential operators on $M \times M$. The formal series of bidifferential operators defined by the formula:
\[ * = m \circ F(x,y) \]
(where $m : C^\infty(M \times M) \to C^\infty(M)$ stands for ordinary multiplication of functions on $M$) then defines a star product on $M$, the associated Poisson bivector being defined by $F_1(x,y) - F_1(y,x)$. The proof of this fact is similar to that of Proposition 2.1. It is


easily seen that if $F(x,y)$ comes from a classical r-matrix $r \in \mathfrak g \wedge \mathfrak g$ then the Poisson structure on $M$ is $J^2(r)$, where $J^\bullet$ is the Gerstenhaber algebra morphism from $\Lambda(\mathfrak g)$ to multivector fields on $M$ extending the action of $\mathfrak g$.

We will be interested in the sequel in the following particular situation: the manifold $M$ is a vector space $V$, the action of $G$ is linear, and there is a classical r-matrix $r$ on $\mathfrak g$. We can as in the introduction view $J$ as a Lie algebra morphism from $\mathfrak g$ to the space of linear vector fields on $V$, and extend $J$ to a morphism $J^\bullet$ of Gerstenhaber algebras from $\Lambda(\mathfrak g)$ to $\widetilde\Lambda(V)$. In particular $J^2(r)$ defines a quadratic Poisson structure on $V$, and the formula just above gives a quantization of this particular quadratic Poisson structure.

3. Quadratic Poisson Structures and r-Matrices

3.1. Some definitions. We keep the notations of the introduction. The Gerstenhaber algebra $\widetilde\Lambda(V)$ can be written as:
\[ \widetilde\Lambda(V) = \bigoplus_{n \ge 0} S^n(V) \otimes \Lambda^n(V)\,[1] = \bigoplus_{n \ge 0} \widetilde\Lambda^n(V)[1]. \]
A quadratic Poisson structure on $V$ can be defined as a bivector field $\Pi$ in $\widetilde\Lambda^2(V)$ such that:
\[ [\Pi, \Pi] = 0. \]
Let $\Pi$ be an element of $\widetilde\Lambda^2(V)$, and let $r$ be an element of $\Lambda^2(\mathfrak g)$ such that $J^2(r) = \Pi$. It is then obvious that $[\Pi,\Pi] = 0$ if and only if $J^3([r,r]) = 0$. If $n \ge 2$ then $J^2$ and $J^3$ have nontrivial kernels. Precisely we have
\[ \dim \ker J^2 = \frac{n^2(n^2-1)}{4} \qquad\text{and}\qquad \dim \ker J^3 = \frac{n^2(n^2-1)(5n^2-8)}{36}. \]
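These two dimension formulas can be recovered by a rank count. The sketch below assumes that $J^k$ is surjective onto the space of $k$-vector fields with homogeneous polynomial coefficients of degree $k$, whose dimension is taken to be $\binom{n+k-1}{k}\binom{n}{k}$, while $\dim \Lambda^k(\mathfrak{gl}_n) = \binom{n^2}{k}$.

```python
from math import comb

def dim_kernel(n, k):
    """dim ker J^k = dim Lambda^k(gl_n) - dim of degree-k homogeneous k-vector fields."""
    source = comb(n * n, k)
    target = comb(n + k - 1, k) * comb(n, k)
    return source - target

def closed_form_J2(n):
    return n**2 * (n**2 - 1) // 4

def closed_form_J3(n):
    return n**2 * (n**2 - 1) * (5 * n**2 - 8) // 36

formulas_agree = all(
    dim_kernel(n, 2) == closed_form_J2(n) and dim_kernel(n, 3) == closed_form_J3(n)
    for n in range(2, 12)
)
```

For $n = 3$ the count gives $\dim \Lambda^2(\mathfrak g) = 36$, $\dim \Lambda^3(\mathfrak g) = 84$ and an 18-dimensional kernel for $J^2$, which matches the numbers of coefficients, equations and free unknowns appearing in the counterexample of Sect. 3.2.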

3.2. A counterexample in dimension 3. With the notations of Sect. 1, an element of $\mathfrak g \wedge \mathfrak g$ can be written as:
\[ r = \sum_{i,j,k,l=1}^n r_{ik}^{jl}\, E_{ij} \wedge E_{kl}. \]
We shall need for further calculations the following result:

Proposition 3.1. Let $r = \sum_{i,j,k,l=1}^n r_{ik}^{jl}\, E_{ij} \wedge E_{kl}$ be an element of $\mathfrak g \wedge \mathfrak g$. Then $[r,r] = 0$ if and only if for any $i, j, k, l, m, p \in \{1, \dots, n\}$ such that $(i,j) < (k,l) < (m,p)$ according to lexicographical order we have:
\[ \sum_{d=1}^n \Big( r_{ik}^{dl} r_{dm}^{jp} - r_{mk}^{dl} r_{di}^{pj} + r_{km}^{dp} r_{di}^{lj} - r_{im}^{dp} r_{dk}^{jl} + r_{mi}^{dj} r_{dk}^{pl} - r_{ki}^{dj} r_{dm}^{lp} \Big) = 0. \]

Proof. This proposition is a direct consequence of the formula:
\[ [E_{ij}, E_{kl}] = \delta_{jk} E_{il} - \delta_{li} E_{kj} \]
and the following lemma.


Lemma 3.1. Let $\mathfrak h$ be a finite-dimensional Lie algebra and let $X_1, \dots, X_N$ be a basis of $\mathfrak h$. If $r = \sum_{I,J=1}^N r^{IJ} X_I \wedge X_J$ is an element of $\mathfrak h \wedge \mathfrak h$ ($r^{IJ} = -r^{JI}$) then
\[ [r,r] = 4 \sum_{I,J,K,L=1}^N r^{IJ} r^{KL}\, [X_I, X_K] \wedge X_J \wedge X_L. \]
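The commutation relation for matrix units used in the proof of Proposition 3.1 can be confirmed by brute force. This is a small sketch with hypothetical helper names, using 1-based indices as in the text.

```python
from itertools import product

def unit(i, j, n):
    """Matrix unit E_ij of size n (1-based indices): a single 1 in row i, column j."""
    return [[1 if (a, b) == (i, j) else 0 for b in range(1, n + 1)] for a in range(1, n + 1)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[a][c] * B[c][b] for c in range(n)) for b in range(n)] for a in range(n)]

def combine(c1, A, c2, B):
    """Entrywise linear combination c1 * A + c2 * B."""
    n = len(A)
    return [[c1 * A[a][b] + c2 * B[a][b] for b in range(n)] for a in range(n)]

def commutator(A, B):
    return combine(1, matmul(A, B), -1, matmul(B, A))

def relation_holds(n):
    """Check [E_ij, E_kl] = delta_jk E_il - delta_li E_kj for all index choices."""
    for i, j, k, l in product(range(1, n + 1), repeat=4):
        lhs = commutator(unit(i, j, n), unit(k, l, n))
        rhs = combine(int(j == k), unit(i, l, n), -int(l == i), unit(k, j, n))
        if lhs != rhs:
            return False
    return True

relation_ok = relation_holds(3)
```

The case $n = 3$ is the one needed for the counterexample; the same function can be called with other sizes.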

Remark. We can directly show Proposition 3.1 using relation $(*)$ of the beginning of Sect. 2 applied to elements of the dual basis of $X_1, \dots, X_N$.

Proposition 3.2. The Poisson structure on $\mathbb R^3$ given by $\Pi = (x_1^2 + \alpha x_2 x_3)\,\partial_2 \wedge \partial_3$ with $\alpha \ne 0$ is not the image of a classical r-matrix by $J^2$.

Proof. An element $r = \sum_{i,j,k,l=1}^n r_{ik}^{jl}\, E_{ij} \wedge E_{kl}$ of $\mathfrak g \wedge \mathfrak g$ is parametrized by 36 coefficients $r_{ik}^{jl}$. It has image $\Pi$ if and only if the following 18 equations are satisfied:
\[ r_{11}^{12} = r_{11}^{13} = r_{22}^{12} = r_{22}^{13} = r_{22}^{23} = r_{33}^{12} = r_{33}^{13} = r_{33}^{23} = 0, \]
\[ r_{12}^{12} = r_{12}^{21}, \quad r_{12}^{13} = r_{12}^{31}, \quad r_{13}^{12} = r_{13}^{21}, \quad r_{13}^{13} = r_{13}^{31}, \quad r_{12}^{23} = r_{12}^{32}, \quad r_{13}^{23} = r_{13}^{32}, \]
\[ r_{23}^{12} = r_{23}^{21}, \quad r_{23}^{13} = r_{23}^{31}, \quad r_{23}^{32} = r_{23}^{23} - \alpha, \quad r_{11}^{23} = 1. \]
To lighten writing we rename the 18 remaining unknowns as follows:
\[ r_{12}^{11} = a, \quad r_{12}^{12} = b, \quad r_{12}^{13} = c, \quad r_{13}^{11} = d, \quad r_{13}^{12} = e, \quad r_{13}^{13} = f, \]
\[ r_{12}^{22} = g, \quad r_{12}^{23} = h, \quad r_{13}^{22} = i, \quad r_{13}^{23} = j, \quad r_{12}^{33} = k, \quad r_{13}^{33} = l, \]
\[ r_{23}^{11} = m, \quad r_{23}^{12} = n, \quad r_{23}^{13} = p, \quad r_{23}^{22} = q, \quad r_{23}^{23} = r, \quad r_{23}^{33} = s. \]
The unknown $r$ must not be confused with the classical r-matrix as a whole; the context will not lead to any confusion. We have then:
\[ r_{11}^{23} = 1, \quad r_{12}^{21} = b, \quad r_{13}^{21} = e, \quad r_{12}^{31} = c, \quad r_{12}^{32} = h, \]
\[ r_{13}^{31} = f, \quad r_{13}^{32} = j, \quad r_{23}^{21} = n, \quad r_{23}^{31} = p, \quad r_{23}^{32} = r - \alpha, \]
and the other $r_{ik}^{jl}$ are equal to 0. If $r$ is such an element the equation $[r,r] = 0$ develops according to Proposition 3.1 into a system of 84 equations involving our 18 unknowns $a, b, \dots, s$, given by the vanishing of the 84 coefficients of elements of the basis $E_{ij} \wedge E_{kl} \wedge E_{mn}$ of $\mathfrak g \wedge \mathfrak g \wedge \mathfrak g$. The 84 equations reduce to 66, thanks to the fact that we already have $[\Pi,\Pi] = 0$. But we shall only consider 20 of them, which will be sufficient for exhibiting the counterexample:


Let us order the $E_{ij}$'s lexicographically from first to 9th, rename them accordingly ($A_1 = E_{11}$, $A_2 = E_{12}$, ..., $A_9 = E_{33}$), and label by $(x,y,z)$ the equation obtained by the vanishing of the coefficient of $A_x \wedge A_y \wedge A_z$. We shall consider precisely the following equations:

(1, 2, 5)  $ci - eh + n = 0$,
(1, 2, 9)  $eh - ci + d = 0$,
(1, 3, 5)  $cj - ek - a = 0$,
(1, 3, 7)  $ce + f^2 - ja - ld + m = 0$,
(1, 3, 9)  $ek - cj + p = 0$,
(1, 4, 5)  $mh - nc = 0$,
(1, 4, 6)  $mk - pc = 0$,
(1, 5, 6)  $\alpha c + nk - ph = 0$,
(1, 7, 8)  $en - im = 0$,
(1, 7, 9)  $ep - jm = 0$,
(1, 8, 9)  $\alpha e + ip - jn = 0$,
(2, 3, 9)  $-3f + r + ik - jh = 0$,
(2, 4, 5)  $ag - b^2 + nh - qc = 0$,
(2, 5, 6)  $2\alpha h + bh - rh + qk - cg = 0$,
(2, 8, 9)  $ej + ir + \alpha i - jq - fi = 0$,
(3, 8, 9)  $is + 2\alpha j + el - fj - jr = 0$,
(4, 5, 7)  $pn - rm - bm + na = 0$,
(5, 6, 9)  $-nk - rs + hp + s(r - \alpha) = 0$,
(5, 8, 9)  $nj - ip + \alpha q = 0$,
(6, 8, 9)  $ln - pj + qs - (r - \alpha)^2 = 0$.

Consider the following two sums:

(1, 2, 5) + (1, 2, 9):  $n + d = 0$,
(1, 3, 5) + (1, 3, 9):  $p - a = 0$.

Hence $n = -d$ and $p = a$. We will discuss the four cases $a = d = 0$; $a = 0$ and $d \ne 0$; $a \ne 0$ and $d = 0$; $a \ne 0$ and $d \ne 0$.

First case: $a = d = 0$. Then looking successively at the following equations we get:
(5, 8, 9) ⇒ $q = 0$, (6, 8, 9) ⇒ $r = \alpha$, (1, 5, 6) ⇒ $c = 0$,
(1, 8, 9) ⇒ $e = 0$, (2, 4, 5) ⇒ $b = 0$, (2, 5, 6) ⇒ $h = 0$,
(4, 5, 7) ⇒ $m = 0$, (5, 6, 9) ⇒ $s = 0$, (1, 3, 7) ⇒ $f = 0$,
(3, 8, 9) ⇒ $j = 0$, (2, 8, 9) ⇒ $i = 0$, (2, 3, 9) ⇒ $\alpha = 0$,
hence a contradiction to the hypothesis $\alpha \ne 0$.

Second case: $a = 0$ and $d \ne 0$ (hence $n \ne 0$). (1, 4, 6) ⇒ $mk = 0$.
First subcase: $m = 0$. Then: (1, 4, 5) ⇒ $c = 0$, (1, 7, 8) ⇒ $e = 0$, (1, 2, 5) ⇒ $n = 0$, hence a contradiction.
Second subcase: $m \ne 0$, hence $k = 0$. (1, 5, 6) ⇒ $c = 0$, (1, 4, 5) ⇒ $h = 0$, (1, 2, 5) ⇒ $n = 0$, hence a contradiction again.

Third case: $a \ne 0$ and $d = 0$ (hence $p \ne 0$). (1, 4, 5) ⇒ $mh = 0$.
First subcase: $m = 0$. Then: (1, 4, 6) ⇒ $c = 0$, (1, 7, 9) ⇒ $e = 0$, (1, 3, 5) ⇒ $a = 0$, hence a contradiction.
Second subcase: $m \ne 0$, hence $h = 0$. (1, 5, 6) ⇒ $c = 0$, (1, 4, 6) ⇒ $k = 0$, (1, 3, 5) ⇒ $a = 0$, hence a contradiction again.

Fourth case: $a \ne 0$ and $d \ne 0$.
First subcase: $m = 0$. (1, 4, 5) ⇒ $c = 0$, (1, 7, 8) ⇒ $e = 0$, (1, 3, 5) ⇒ $a = 0$, contradiction.
Second subcase: $m \ne 0$. (1, 4, 6) ⇒ $k = \frac{a}{m}c$, (1, 7, 9) ⇒ $j = \frac{a}{m}e$, (1, 3, 5) ⇒ $a = 0$, contradiction.

This proves Proposition 3.2.
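Two steps of this case analysis can be replayed mechanically: the two sums giving $n = -d$ and $p = a$, and the last subcase, where substituting $k = \frac{a}{m}c$ and $j = \frac{a}{m}e$ into equation (1, 3, 5) leaves $-a$. The sketch below encodes only the four equations involved, as they are listed above; the dictionary layout and the sample values are hypothetical.

```python
import random
from fractions import Fraction

NAMES = "abcdefghijklmnpqrs"  # the 18 unknowns of the proof

# Left-hand sides of four of the twenty equations, as functions of a dict of values.
EQUATIONS = {
    (1, 2, 5): lambda v: v["c"] * v["i"] - v["e"] * v["h"] + v["n"],
    (1, 2, 9): lambda v: v["e"] * v["h"] - v["c"] * v["i"] + v["d"],
    (1, 3, 5): lambda v: v["c"] * v["j"] - v["e"] * v["k"] - v["a"],
    (1, 3, 9): lambda v: v["e"] * v["k"] - v["c"] * v["j"] + v["p"],
}

def sums_reduce_as_claimed(trials=200, seed=0):
    """(1,2,5)+(1,2,9) equals n + d and (1,3,5)+(1,3,9) equals p - a identically."""
    rng = random.Random(seed)
    for _ in range(trials):
        v = {name: rng.randint(-9, 9) for name in NAMES}
        if EQUATIONS[(1, 2, 5)](v) + EQUATIONS[(1, 2, 9)](v) != v["n"] + v["d"]:
            return False
        if EQUATIONS[(1, 3, 5)](v) + EQUATIONS[(1, 3, 9)](v) != v["p"] - v["a"]:
            return False
    return True

sums_ok = sums_reduce_as_claimed()

# Fourth case, second subcase: m != 0, k = (a/m) c and j = (a/m) e.
v = {name: Fraction(0) for name in NAMES}
v.update(a=Fraction(3), c=Fraction(5), e=Fraction(-2), m=Fraction(7))
v["k"] = v["a"] / v["m"] * v["c"]
v["j"] = v["a"] / v["m"] * v["e"]
residual = EQUATIONS[(1, 3, 5)](v)
```

The residual equals $-a$ whatever the sample values, so equation (1, 3, 5) forces $a = 0$ in that subcase, as stated.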

3.3. Cartan-type quadratic Poisson structures. Recall from [DH] that the curl of a Poisson structure $\Pi = \sum_{i,j} \Pi_{ij}\, \partial_i \wedge \partial_j$ is defined by:
\[ \mathrm{rot}\, \Pi = \sum_{i,j} \partial_j \Pi_{ij} . \partial_i. \]
It is a linear vector field (and hence can be viewed as an $n \times n$ matrix) when $\Pi$ is quadratic. A quadratic Poisson structure is of Cartan type if it can be written for some choice of coordinates as:
\[ \Pi = \sum_{i,j=1}^n c_{ij}\, x_i x_j\, \partial_i \wedge \partial_j \]
with $c_{ji} = -c_{ij}$. J. P. Dufour and A. Haraki proved the following result:

Theorem 3.1 (Dufour–Haraki). Any quadratic Poisson structure the curl of which has eigenvalues $\lambda_i$ such that $\lambda_i + \lambda_j \ne \lambda_r + \lambda_s$ for any $(i,j,r,s)$ with $r \ne s$ and $\{i,j\} \ne \{r,s\}$ is of Cartan type.

Such a Cartan-type Poisson structure is the image by $J^2$ of a classical r-matrix, namely:
\[ r = \sum_{i,j} c_{ij}\, E_{ii} \wedge E_{jj}. \]
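For a Cartan-type structure the curl can be computed by hand: with $\Pi_{ij} = c_{ij}x_ix_j$ and $c_{ii} = 0$ one gets $\partial_j\Pi_{ij} = c_{ij}x_i$, so under the convention above $\mathrm{rot}\,\Pi = \sum_i \big(\sum_j c_{ij}\big) x_i\partial_i$ is diagonal with eigenvalues $\lambda_i = \sum_j c_{ij}$. The following sketch, with hypothetical sample matrices, evaluates these eigenvalues and tests the non-resonance condition of the theorem.

```python
from itertools import product

def curl_eigenvalues(c):
    """Eigenvalues lambda_i = sum_j c[i][j] of the curl of a Cartan-type structure."""
    return [sum(row) for row in c]

def is_nonresonant(lam):
    """True when lam_i + lam_j != lam_r + lam_s for all r != s and {i, j} != {r, s}."""
    n = len(lam)
    for i, j, r, s in product(range(n), repeat=4):
        if r != s and {i, j} != {r, s} and lam[i] + lam[j] == lam[r] + lam[s]:
            return False
    return True

# Hypothetical antisymmetric coefficient matrices.
c_generic = [[0, 1, 3], [-1, 0, 7], [-3, -7, 0]]
c_resonant = [[0, 1, 1], [-1, 0, 1], [-1, -1, 0]]

lam_generic = curl_eigenvalues(c_generic)
lam_resonant = curl_eigenvalues(c_resonant)
```

The second matrix gives eigenvalues $(2, 0, -2)$, for which $\lambda_1 + \lambda_3 = 2\lambda_2$, so the hypothesis of the theorem fails even though the structure is of Cartan type by construction; the condition is sufficient, not necessary.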

Acknowledgements. The authors thank Maxim Kontsevich and Daniel Sternheimer for useful comments and precisions.


References
[ABM] Arnal, D., Ben Amar, N., Masmoudi, M.: Cohomology of good graphs and Kontsevich linear star products. Lett. Math. Phys. 48, 291–306 (1999)
[BR] Bhaskara, K.H., Rama, K.: Quadratic Poisson structures. J. Math. Phys. 32, 2319–2322 (1991)
[CFT] Cattaneo, A.S., Felder, G., Tomassini, L.: From local to global deformation quantization of Poisson manifolds. math.QA/0012228
[Di] Dito, G.: Kontsevich star-product on the dual of a Lie algebra. Lett. Math. Phys. 48, 307–322 (1999)
[Dr] Drinfel'd, V.G.: On constant, quasiclassical solutions of the quantum Yang–Baxter equation. Soviet Math. Dokl. 28, No. 3, 667–671 (1983)
[DH] Dufour, J.-P., Haraki, A.: Rotationnels et structures de Poisson quadratiques. C. R. Acad. Sci. 312, 137–140 (1991)
[ET] El Galiou, M., Tihami, Q.: Star-Product of a quadratic Poisson structure. Tokyo J. Math. 19, No. 2, 475–498 (1996)
[Gr] Grabowski, J.: Abstract Jacobi and Poisson structures. Quantization and star-products. J. Geom. Phys. 9, 45–73 (1992)
[K] Kontsevich, M.: Deformation quantization of Poisson manifolds I. math.QA/9709040
[Ka] Kathotia, V.: Kontsevich's universal formula for deformation quantization and the Campbell–Baker–Hausdorff formula. math.QA/9811174
[LW] Lu, J.-H., Weinstein, A.: Poisson Lie groups, dressing transformations and Bruhat decompositions. J. Diff. Geom. 31, 501–526 (1990)
[LX] Liu, Z.-J., Xu, P.: On quadratic Poisson structures. Lett. Math. Phys. 26, 33–42 (1992)
[OMY] Omori, H., Maeda, Y., Yoshioka, A.: Deformation quantizations of Poisson algebras. Contemp. Math. 179, 213–240 (1994)
[T] Takhtajan, L.A.: Lectures on Quantum groups. Nankai Lect. Notes in Math. Phys. 69–197 (1990)
[V] Voronov, A.A.: Homotopy Gerstenhaber algebras. math.QA/9908040

Communicated by A. Connes

Commun. Math. Phys. 225, 131 – 170 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

Riemannian Geometry of Quantum Groups and Finite Groups with Nonuniversal Differentials

Shahn Majid

School of Mathematical Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK

Received: 22 June 2000 / Accepted: 26 August 2001

Abstract: We construct noncommutative “Riemannian manifold” structures on dual quasitriangular Hopf algebras such as $\mathbb{C}_q[SU_2]$ with its standard bicovariant differential calculus, using the quantum frame bundle approach introduced previously. The metric is provided by the braided-Killing form on the braided-Lie algebra on the tangent space and the n-bein by the Maurer–Cartan form. We also apply the theory to finite sets and in particular to finite group function algebras $\mathbb{C}[G]$ with differential calculi and Killing forms determined by a conjugacy class. The case of the permutation group $\mathbb{C}[S_3]$ is worked out in full detail and a unique torsion free and cotorsion free or “Levi–Civita” connection is obtained with noncommutative Ricci curvature essentially proportional to the metric (an Einstein space). We also construct Dirac operators $D\!\!\!\!/$ in the metric background, including on finite groups such as $S_3$. In the process we clarify the construction of connections from gauge fields with nonuniversal calculi on quantum principal bundles of tensor product form.

1. Introduction

Noncommutative geometry has been proposed for many years as a natural generalisation of geometry to include quantum effects. Particularly important should be “Riemannian” geometry and moreover (in our opinion) quantum groups or Hopf algebras should play a central role [1] just as Lie groups do in the classical case. With such motivation, a systematic formalism of a quantum groups-based approach to “quantum manifolds” and “quantum Riemannian manifolds” on (possibly noncommutative) algebras was already introduced a few years ago in [2]. We used the notion of quantum principal bundles (with quantum group fibre) and connections in [3], to define “frame bundle”, “spin connection”, “vielbeins”, etc. The paper studied both the classical limit and, at the other extreme, the case of the universal differential calculus (which is formally defined on any algebra).
We now follow up [2] with a detailed application of this formalism to uncover a rich noncommutative Riemannian geometry both of quantum groups and finite groups equipped with general differential structures. That q-deformation quantum groups should have a q-deformed Riemannian geometry is hardly surprising, but that we can achieve it, proving as we do in Sect. 4 that all standard q-deformations of simple Lie groups are quantum Riemannian manifolds, is a good test of our theory. More surprising perhaps is that finite groups have as rich a Riemannian geometry as Lie groups. It is well known that their bicovariant differential structures are defined by conjugacy classes (this is immediate from [4]), but we now take this much further in Sect. 5 to a braided-Lie algebra of invariant vector fields, Levi–Civita spin connections, Ricci tensor, etc. fully analogous to the Lie case.

The formulation of Ricci tensors also makes clear that we are in a position now to do gravitational physics in this noncommutative setting. In the finite group case functional integration over moduli spaces of metrics, etc., becomes finite-dimensional integration. In contrast to lattice approximation the finite spacing is not an “error” but simply a noncommutative modification of the geometry which remains exact and hence valid even for a finite number of points. Meanwhile in the q-deformed case infinities can be expected to be at least partly regularised as poles at $q = 1$. It may also be [6] that spin-network quantum gravity in the presence of a cosmological constant should lead specifically to a q-deformation of conventional Riemannian geometry. Another application of Hopf algebras to Planck scale physics is the observable-state duality introduced in [1] and this has been related recently to T-duality in σ-models on groups [7]. Also, the first systematic predictions for astronomical data (for gamma-rays of cosmological origin) coming out of models with noncommutative spacetime coordinates have emerged [8] with measurable effects even if the noncommutativity is of Planck scale order.
In another direction, noncommutative tori such as studied by Connes, Rieffel and others have emerged as relevant to string theory [9]. Although we will not attempt such applications here, we do put on the table a general approach to such models that can be fully computed and which is (as we show) adequate to include the rich geometry of quantum groups and finite groups as basic building blocks, while in no way limited to them.

From a mathematical point of view our constructive “bottom up” approach, in which we build up the layers of geometry more or less up to (in the present paper) the construction of Dirac operators, provides a useful complement to the powerful “top down” approach of Connes [5], in particular, coming out of K-theory and cyclic cohomology. There one starts with a spectral triple or “axiomatic Dirac operator” on an algebra as implicitly defining the noncommutative geometry. It appears that reconciling these two approaches should be rather important to a full development of both, and this provides a second motivation for the work. Sect. 5 contains, for example, a first result comparing the constructive approach with the Connes approach in the case of Dirac operators built up on finite groups. A physical application of such an understanding would be in the Connes–Lott approach to the standard model [10], where a discrete Dirac operator encodes the fermion mass matrix. A geometrical way to build up such a $D\!\!\!\!/$ would translate directly into a prediction for this. A first step in this direction is in [11].

An outline of the paper is the following. We recall briefly in Sect. 2 the global theory from [2], with general differential calculi on the base $M$, fibre $H$ and “total space” algebra $P$ of the (frame) bundle. The new results begin in Sect. 3 where we specialise to the “local” theory (the parallelizable case) where $P = M \otimes H$. Most of the work in this section goes into constructing a suitable nonuniversal differential structure $\Omega^1(P)$ and showing that local data such as $V$-bein and “gauge field” indeed provide a global bundle with soldering form and global connection. This situation is unusual in that the global theory is known but until now the trivial bundle theory has not been constructed as a case of this (other than with the universal differential structure). What we achieve in this way is a theory that works at the level of a general algebra $M$ equipped with a suitable parallelizable differential structure and associated framing, which is roughly the level of generality that we are used to in quantum theory by the time one has added ∗-structures and Hilbert spaces (we do not do this here since we have enough to do at the algebraic level). It is therefore also the level of generality appropriate to a definitive “quantum Riemannian geometry”. Note that a quantum group here is not an essential input and one could in principle use a more general “coalgebra bundle” [12]. The quantum-mechanical meaning of coalgebra bundles is discussed in [13], which also announces the present results.

In Sect. 4 we apply this theory to the case where the base $M$ is itself a quantum group. The main result is the construction of Riemannian metrics for general differential calculi from Ad-invariant bilinear forms on the underlying braided-Lie algebra [14], which we apply to standard quantum groups such as $\mathbb{C}_q[SU_2]$. For completeness we also consider the other extreme of usual enveloping algebras $U(\mathfrak{g})$ as noncommutative “flat” spacetimes.

Finally in Sect. 5 we specialise our theory to finite sets and, in particular, to finite groups. The main results are in Sect. 5.3 where we compute everything for the concrete example of the permutation group $S_3$ with its order 3 conjugacy class. We are able to explicitly solve the torsion-free and metric-compatibility (or “cotorsion-free”) equations for the “braided Killing form” metric and obtain a unique “Levi–Civita” spin connection. We also compute the Ricci tensor and find that $S_3$ is essentially an Einstein space, and we compute the natural Dirac operator.
The contribution of the gravitational spin connection to this is absolutely essential for a charge conjugation operator or symmetric distribution of eigenvalues about zero and we consider this a good test of the consistency of our constructive approach.

Let us note that following [2] there have been one or two other constructive attempts at noncommutative Riemannian geometry for finite sets and finite groups, see e.g. [16, 15]. The first of these (as well as some earlier works on “Levi–Civita connections” on q-deformed quantum groups and homogeneous spaces, such as [17]) takes a linear connection $\nabla$ point of view and not a frame bundle and spin connection one (which is essential for us to arrive at a Dirac operator constructively). Meanwhile [15], while speaking of “vielbeins” and “spin connections”, does not actually provide any form of “metric compatibility” between them and hence cannot be considered as a theory of gravity at all. Moreover, there is not any actual noncommutative geometry of the total space and fibre leading for example to any kind of “Lie algebra” in which the spin connection should take its values. These are some of the difficult problems solved in our approach. Moreover, even if one were interested only in finite groups (say), it is important that our constructions are not ad-hoc to that case but “functorial” in the sense of being embedded in a single theory that works for general algebras and with other limits including classical and q-deformed ones.

Preliminaries. We use the usual notations for Hopf algebras as in [18], over a general ground field $k$. Thus $\Delta : H \to H \otimes H$ is the coproduct on the algebra $H$, $\epsilon : H \to k$ the counit and $S : H \to H$ the antipode, which we assume to be invertible. The right adjoint coaction of $H$ on itself is $\mathrm{Ad}_R(h) = h_{(2)} \otimes (Sh_{(1)})h_{(3)}$ in the numerical notation for the output of repeated coproducts (summation understood).
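As a small illustration of the right adjoint coaction in the finite group setting treated later, the following worked case may help. It assumes the standard Hopf algebra structure on the functions on a finite group $G$ with basis of delta-functions $\delta_g$; this structure is not spelled out in the text at this point, so the formulas in the comment are assumptions of the sketch.

```latex
% Assumed (standard) structure on functions on a finite group G:
%   \Delta\delta_g = \sum_{ab=g}\delta_a\otimes\delta_b, \quad
%   S\delta_g = \delta_{g^{-1}}, \quad \delta_a\delta_b = \delta_{a,b}\,\delta_a .
\begin{align*}
(\Delta \otimes \mathrm{id})\Delta\,\delta_g
  &= \sum_{abc=g} \delta_a \otimes \delta_b \otimes \delta_c, \\
\mathrm{Ad}_R(\delta_g)
  &= \sum_{abc=g} \delta_b \otimes (S\delta_a)\,\delta_c
   = \sum_{abc=g} \delta_b \otimes \delta_{a^{-1}}\,\delta_c \\
  &= \sum_{a \in G} \delta_{a^{-1} g a} \otimes \delta_{a^{-1}}
   = \sum_{h \in G} \delta_{h g h^{-1}} \otimes \delta_h .
\end{align*}
```

Only the terms with $c = a^{-1}$ survive, so $\mathrm{Ad}_R$ moves $\delta_g$ within the conjugacy class of $g$. This is in line with the remark in the Introduction that bicovariant calculi on finite groups are defined by conjugacy classes.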
Next, on any algebra $M$ there is a universal differential calculus with 1-forms $\Omega^1 M$ given by the kernel of the product map $M \otimes M \to M$ and $dm = 1 \otimes m - m \otimes 1$. General or “nonuniversal” $\Omega^1(M)$ are quotients of this by an $M$-$M$-bimodule $N_M$. Also the universal calculus extends to an entire exterior algebra with $d^2 = 0$ and a general higher order calculus is a quotient of that by a differential graded ideal [5]. Equivalently one can build up the calculus order by order. Thus $\Omega^1(M)$ has a maximal prolongation by Leibniz and $d^2 = 0$, and $\Omega^2(M)$ can then be specified as a quotient of the degree 2 part of that, etc.

In the case of a Hopf algebra $H$ one can construct [4] the bicovariant $\Omega^1(H)$ equivalently in terms of crossed modules $\Omega_0 \in \mathcal{M}^H_H$, where $H$ acts and coacts on $\Omega_0$ from the right in a compatible manner. Then $\Omega^1(H) = H \otimes \Omega_0$ with the tensor product (co)action from the right and the regular (co)action of $H$ from the left via its (co)product. The universal calculus in this case corresponds to $\Omega_0 = \ker\epsilon$ and a general calculus is a quotient of this by a right ideal $Q_H$ which is invariant under the right adjoint coaction. Equally well we can write $\Omega^1(H) = \bar\Omega_0 \otimes H$, where $\bar\Omega_0 \in {}^H_H\mathcal{M}$, etc. There is a canonical higher order exterior algebra characterised by $d^2 = 0$ and the additional relations defined by quotienting by the kernel of $\mathrm{id} - \Psi$, where
$$ \Psi(v \otimes_H w) = w \otimes_H v, \qquad v \in \Omega_0, \quad w \in \bar\Omega_0. \tag{1} $$
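Returning briefly to the universal calculus on an algebra $M$ recalled at the start of these preliminaries, the following short check (a sketch using only the definition $dm = 1 \otimes m - m \otimes 1$ and the bimodule structure of $M \otimes M$) confirms that $dm$ lies in the kernel of the product, that $d$ obeys the Leibniz rule, and that every element of the kernel is a sum of terms $m\,dn$.

```latex
% m, n, m_i, n_i \in M; "\cdot" denotes the product map M \otimes M \to M.
\begin{align*}
\cdot\,(dm) &= 1\cdot m - m\cdot 1 = 0
  \quad\Longrightarrow\quad dm \in \Omega^1 M, \\
(dm)\,n + m\,(dn) &= (1 \otimes mn - m \otimes n) + (m \otimes n - mn \otimes 1)
  = 1 \otimes mn - mn \otimes 1 = d(mn), \\
m_i\,dn_i &= m_i \otimes n_i - m_i n_i \otimes 1 = m_i \otimes n_i
  \qquad \text{whenever } m_i n_i = 0 \text{ (summed over } i\text{)}.
\end{align*}
```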

A quantum principal bundle [3] over an algebra $M$ with universal calculus is $(P, H, \Delta_R)$, where $P$ is an algebra, $H$ a Hopf algebra, $\Delta_R : P \to P \otimes H$ a right coaction and algebra map, with
$$ M = P^H = \{p \in P \mid \Delta_R(p) = p \otimes 1\} \subset P, \tag{2} $$
and $P$ is flat as an $M$-bimodule, and the sequence
$$ 0 \to P(\Omega^1 M)P \to \Omega^1 P \xrightarrow{\ \mathrm{ver}\ } P \otimes \ker\epsilon \tag{3} $$
is exact, where $\mathrm{ver}(p \otimes p') = p\,\Delta_R(p')$. This is equivalent to a “Hopf-Galois” extension in the theory of Hopf algebras, e.g. [19], while arising in this “differential” form in [3]. For a general calculus $\Omega^1(M)$, a bundle means $(P, H, \Delta_R)$ as before and also a choice of calculus $\Omega^1(P)$ and $\Omega^1(H)$ with the former right-covariant in the sense $\Delta_R N_P \subset N_P \otimes H$ and
$$ N_M = N_P \cap \Omega^1 M \subset \Omega^1 P, \qquad \mathrm{ver}(N_P) = P \otimes Q_H. \tag{4} $$
The first condition here states that we recover
$$ \Omega^1(M) = \{m\,d_P n \mid m, n \in M\} \subset \Omega^1(P), \tag{5} $$
as a restriction, while the second ensures exactness
$$ 0 \to P\,\Omega^1(M)\,P \to \Omega^1(P) \xrightarrow{\ \mathrm{ver}_{N_P}\ } P \otimes \Omega_0 \tag{6} $$
by the induced map $\mathrm{ver}_{N_P}$. This is equivalent to the formulation in [3], as explained in [20].
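A quick consistency check of the exactness condition (3) is the simplest case $P = H$, $M = k$ with the coaction given by the coproduct, where the left-hand term of the sequence vanishes and ver should therefore be an isomorphism. The sketch below uses only the Hopf algebra axioms; the explicit inverse is written out here for illustration and is not taken from the text.

```latex
% P = H, M = k, coaction = coproduct.
% h \otimes g \in \Omega^1 H means hg = 0 (sum understood); v \in \ker\epsilon.
\begin{align*}
\mathrm{ver}(h \otimes g) &= h g_{(1)} \otimes g_{(2)}, &
(\mathrm{id} \otimes \epsilon)\,\mathrm{ver}(h \otimes g) &= hg = 0, \\
\mathrm{ver}^{-1}(h \otimes v) &= h\,Sv_{(1)} \otimes v_{(2)}, &
h\,(Sv_{(1)})\,v_{(2)} &= h\,\epsilon(v) = 0, \\
\mathrm{ver}\big(h\,Sv_{(1)} \otimes v_{(2)}\big)
  &= h\,(Sv_{(1)})\,v_{(2)} \otimes v_{(3)} = h \otimes v, &
\mathrm{ver}^{-1}\big(h g_{(1)} \otimes g_{(2)}\big)
  &= h\,g_{(1)}\,Sg_{(2)} \otimes g_{(3)} = h \otimes g .
\end{align*}
```

The first row shows that ver lands in $H \otimes \ker\epsilon$, the second that the proposed inverse lands in the universal 1-forms, and the third that the two maps are mutually inverse. The expression $Sv_{(1)} \otimes v_{(2)}$ is of the same Maurer–Cartan type as the one used for the trivial reference connection in (20).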

2. Framings and Riemannian Geometry with Nonuniversal Calculi

Here we briefly recall from [2] how the basic definitions of quantum group gauge theory can be extended to frame bundles, torsion, metric, etc., with new emphasis on the case of general differential calculus that will concern us. This is the noncommutative geometrical picture used in the paper. First of all, if $V$ is a right $H$-comodule we define
$$ E = (P \otimes V)^H, \qquad E^* = \hom^H(V, P) \tag{7} $$
to be “associated” bundles. They are dual in the sense that composition and multiplication in $P$ gives a pairing $E \otimes_M E^* \to M$ of $M$-bimodules (or every element of $E^*$ induces a left $M$-module map $E \to M$). This is the same as for the universal calculus. We further assume natural flatness properties so that $(P\,\Omega^1(M))^H = \Omega^1(M)$, etc. We will see these in detail for tensor bundles.

Definition 2.1. A frame resolution of $(M, \Omega^1(M))$ is a quantum bundle $(P, H, \Delta_R, \Omega^1(P), \Omega^1(H))$ over it as above, a right $H$-comodule $V$ and an equivariant $\theta : V \to P\,\Omega^1(M)$ such that the induced left $M$-module map by applying $\theta$ and multiplying in $P$ is an isomorphism $s_\theta : E \cong \Omega^1(M)$.

This expresses the cotangent bundle as an associated bundle to a principal bundle, which is the role of framing. The choice of $H$ is far from unique, however, and need not be any kind of analogue of $GL_n$. Once framed, vector fields are $\Omega^{-1}(M) \cong E^*$ and similarly for their powers. We call this also a “framing isomorphism” induced by $\theta$. We then define a quantum metric as an isomorphism $E \cong E^*$, i.e. we require nondegeneracy but do not necessarily impose any symmetry (which would be unnatural in the noncommutative theory). When $V$ is finite-dimensional note that $V^*$ is a left $H$-comodule automatically and we can view $E^*$ as given by the same construction as for $E$ but with a left-right reversal and $V$ replaced by $V^*$. We define $\bar H = H^{\mathrm{op}}$ (with the opposite product) and $\bar P = P$ as an algebra but with the left coaction
$$ \Delta_L p \equiv p^{\bar{(0)}} \otimes p^{\bar{(1)}} = S^{-1} p^{\bar{(2)}} \otimes p^{\bar{(1)}} \tag{8} $$
in terms of the original right coaction. Then we have a left-handed bundle and a metric is equivalent to a coframing with this bundle and $V^*$, $\theta^* : V^* \to \Omega^1(M)P$ giving an isomorphism $E^* \cong \Omega^1(M)$ as right $M$-modules. This is the “self-dual” generalisation of Riemannian geometry as the existence of a framing and coframing at the same time. The corresponding metric is
$$ g = \sum_a \theta^*(f^a) \otimes_P \theta(e_a) \in \Omega^1(M) \otimes_M \Omega^1(M), \tag{9} $$
where $\{e_a\}$ is a basis of $V$ and $\{f^a\}$ is a dual basis. Or to avoid explicitly dualising $V$ we can of course work with $\theta^* \in \Omega^1(M)P \otimes V$ and the metric as the composition with $\theta$ and $\otimes_M$, etc.

Finally, a connection on a quantum principal bundle is an equivariant complement of $P\,\Omega^1(M)\,P \subset \Omega^1(P)$. In concrete terms this is equivalent to a connection form, which is an equivariant map
$$ \omega : \Omega_0 \to \Omega^1(P), \qquad \mathrm{ver}_{N_P} \circ \omega = 1 \otimes \mathrm{id}, \tag{10} $$

where we recall that $\Omega_0$ is a right comodule by the adjoint coaction (as part of the crossed module structure). The associated projection $\Pi_\omega = \cdot_P(\mathrm{id} \otimes \omega)\mathrm{ver}_{N_P}$ defines a covariant derivative
$$ D_\omega : E \to \Omega^1(M) \otimes_M E, \qquad D_\omega = (\mathrm{id} - \Pi_\omega) \circ d \otimes \mathrm{id} \tag{11} $$
provided $(\mathrm{id} - \Pi_\omega) \circ dP \subset \Omega^1(M)P$, in which case one says that $\omega$ is strong. It is clear that a (strong) connection $\omega_U$ on the bundle with universal calculus such that $\omega_U(Q_H) \subset N_P$ induces one on the bundle with general calculus. In the presence of a framing, we define:

Definition 2.2. Associated to strong $\omega$ is the covariant derivative $\nabla_\omega : \Omega^1(M) \to \Omega^1(M) \otimes_M \Omega^1(M)$ according to the framing isomorphism $s_\theta$, namely $\nabla_\omega = (\mathrm{id} \otimes s_\theta) \circ D_\omega \circ s_\theta^{-1}$.

Both $D_\omega$ and hence $\nabla_\omega$ behave in the expected way with respect to left-multiplication by $M$. One can then proceed to identify other geometrical objects in terms of $\omega, \theta$. Thus, torsion
$$ T : \Omega^1(M) \to \Omega^2(M) \tag{12} $$
corresponds under framing isomorphisms to $\bar D_\omega \wedge \theta : V \to P\,\Omega^2(M)$ (here we need a left-handed version of the bundle as explained in [2]). Specifically, we apply this in the same manner as the construction of $s_\theta$ to give a map $E \to \Omega^2(M)$ which becomes $T$ as stated under $s_\theta$. In this self-dual formulation it is natural to ask also that the “cotorsion” vanishes. This is the torsion of $\omega$ with respect to the coframing, i.e. $D_\omega \wedge \theta^* \in \Omega^2(M) \otimes_M E$ which we view via $s_\theta$ as
$$ \Gamma \in \Omega^2(M) \otimes_M \Omega^1(M). \tag{13} $$
Its vanishing is a generalisation of “metric compatibility” as explained in [2]. Note that the vanishing torsion and cotorsion require us to specify $\Omega^2(M)$ suitably. We look at this in detail for trivial bundles in the next section. Similarly, the Riemann curvature is
$$ R : \Omega^1(M) \to \Omega^2(M) \otimes_M \Omega^1(M) \tag{14} $$
as a left $M$-module map corresponding to the curvature of $\omega$. With some mild additional structure we can also define the Ricci tensor by a contraction. The most explicit, which we will adopt, is to apply a lift $i : \Omega^2(M) \to \Omega^1(M) \otimes_M \Omega^1(M)$ and take a trace as an $M$-module map with values in the remaining $\Omega^1(M) \otimes_M \Omega^1(M)$. One could also view this as associated to an interior product or a Hodge ∗-operation. Let us also note that once $\Omega^2(M)$ is specified one could impose a “symmetry” condition on the metric if desired, as in the kernel of the wedge product
$$ \wedge(g) = 0. \tag{15} $$

Finally, we discuss some general aspects in this context of “Dirac operator”. Most of the definition is straightforward; we define a spinor as $\psi \in \mathcal{S} = (P \otimes W)^H$, the associated bundle to some other representation of $H$. Since $H$ is not required to be anything like $SO_n$ but can be a more general framing, it is not necessary to speak here of double covers or lifting; we simply frame by the more suitable quantum group to begin with. Then $D_\omega\psi \in \Omega^1(M) \otimes_M \mathcal{S}$ maps over under the framing to $E \otimes_M \mathcal{S}$. The missing data to define an operator $D\!\!\!\!/ : \mathcal{S} \to \mathcal{S}$ with reasonable properties under scalar multiplication of spinors is therefore a left $M$-module map
$$ \gamma : E \otimes_M \mathcal{S} \to \mathcal{S}. \tag{16} $$

Classically, this would be induced by a map $\gamma : V \otimes W \to W$ with equivariance and “Clifford algebra” properties with respect to the metric. Note also that in place of an “inner product” on $\mathcal{S}$ it is natural in our self-dual formulation to have instead an adjoint spinor space $\mathcal{S}^* = \hom^H(W, P)$ and $D\!\!\!\!/$ defined on this similarly with $\gamma^*$. We do not attempt here a full formulation but will look at some of these issues for trivial bundles and quantum groups.

3. Parallelizable Riemannian Structures on Algebras

In this section we apply the formalism above to obtain a general class of quantum Riemannian manifold structures on algebras $M$ for which the quantum frame bundle has the tensor product form $P = M \otimes H$, i.e. the parallelizable case. Other trivialisations can change this form, i.e. we work in what we call the tensor product gauge. Our main result is the construction of $\Omega^1(P)$ such that the global theory above is induced from a “local” theory where global connections correspond to gauge fields $A : \Omega_0 \to \Omega^1(M)$ and soldering forms to $V$-beins $e : V \to \Omega^1(M)$. The choice of $\Omega^1(P)$ is far from obvious; for example $N_P$ generated as a $P$-bimodule by $N_M, N_H$ as suggested in [20] would not allow these correspondences to proceed.

Proposition 3.1. On $P = M \otimes H$ with $\Omega^1(M)$, $\Omega^1(H)$ given, we take $\Delta_R$, $\Omega^1(P)$ defined by
$$ \Delta_R = \mathrm{id} \otimes \Delta, \qquad N_P = N_M \otimes H \otimes H + M \otimes M \otimes N_H + \Omega^1 M \otimes \Omega^1 H, $$
where we identify $P \otimes P = M \otimes M \otimes H \otimes H$. Then $(P, \Omega^1(P), \Delta_R)$ is a quantum principal bundle with nonuniversal calculus over $M, \Omega^1(M)$. Moreover, we may identify the $H$-comodules
$$ \Omega^1(M)P = P\,\Omega^1(M)\,P = P\,\Omega^1(M) = \Omega^1(M) \otimes H. $$

Proof. The coaction $\Delta_R$ is only on the $H \otimes H$ part and each component of $N_P$ is clearly invariant under this. Hence $\Delta_R(N_P) \subset N_P \otimes H$. Also
$$ \mathrm{ver} = \cdot_M \otimes \mathrm{ver}_H, \qquad \mathrm{ver}(m_i \otimes n_i \otimes h_i \otimes g_i) = m_i n_i \otimes h_i g_{i(1)} \otimes g_{i(2)} $$
for $m_i n_i \otimes h_i g_i = 0$ has $\mathrm{ver}(\Omega^1 M \otimes H \otimes H) = 0$ and hence $\mathrm{ver}(N_P) = P \otimes Q_H$ as required (here $\mathrm{ver}_H$ corresponds to $H$ as a bundle over $k$). Next we note that for any algebras $M, H$,
$$ M \otimes M = M \otimes 1 \oplus \Omega^1 M = 1 \otimes M \oplus \Omega^1 M, \qquad H \otimes H = H \otimes 1 \oplus \Omega^1 H = 1 \otimes H \oplus \Omega^1 H $$
by identifying $m \otimes n = mn \otimes 1 + m\,dn$ or $m \otimes n = 1 \otimes mn - (dm)n$ for the two cases (as follows from $dn = 1 \otimes n - n \otimes 1$) and similarly for $H \otimes H$. Hence (making choices, i.e. not canonically) we can write
$$ N_P = N_M \otimes 1 \otimes H \oplus M \otimes 1 \otimes N_H \oplus \Omega^1 M \otimes \Omega^1 H $$

as a vector space. From this it is clear that $N_P \cap \Omega^1 M \otimes 1 \otimes 1 = N_M \otimes 1 \otimes 1$ as required. Hence we have a quantum principal bundle. Also from a similar decomposition we identify
$$ N_P \cap P\,\Omega^1 M = N_M \otimes H \otimes 1, \qquad N_P \cap (\Omega^1 M)P = N_M \otimes 1 \otimes H, $$
and hence we can identify
$$ \Omega^1(M)P = \Omega^1(M) \otimes 1 \otimes H \quad\text{and}\quad P\,\Omega^1(M) = \Omega^1(M) \otimes H \otimes 1. $$
Finally,
$$ N_P \cap P(\Omega^1 M)P = N_M \otimes 1 \otimes H \oplus \Omega^1 M \otimes \Omega^1 H = N_M \otimes H \otimes 1 \oplus \Omega^1 M \otimes \Omega^1 H $$
so that we can identify $P\,\Omega^1(M)\,P$ with either $\Omega^1(M)P$ or $P\,\Omega^1(M)$. When the context is clear we therefore omit the $\otimes 1$ and identify all three with $\Omega^1(M) \otimes H$. It remains to verify that these identifications are $\Delta_R$-covariant, in particular that of $P\,\Omega^1(M)\,P$. We need for this that the identifications $H \otimes H \cong 1 \otimes H \oplus \Omega^1 H$, etc., are equivariant under the tensor product of the coaction in each factor up to an error in $\Omega^1 H$. In particular the projection to $1 \otimes H$ by multiplication is covariant just because $\Delta$ is an algebra homomorphism.

As a justification for this calculus note that classically the three spaces $P\,\Omega^1(M)$, $\Omega^1(M)P$ and $P\,\Omega^1(M)\,P$ coincide, which we have arranged also here. It means that all connections are automatically strong, etc., as in the classical theory. Also, $\Omega^1(P)$ has the right size. Thus, for any (say) finite-dimensional algebra $M$ define
$$ \dim(\Omega^1(M)) = \dim(M) - 1 - \frac{\dim(N_M)}{\dim(M)}, \tag{17} $$
which is the dimension over $M$ in the free case. Then for the above $\Omega^1(P)$ we have
$$ \dim(\Omega^1(P)) = \dim(\Omega^1(M)) + \dim(\Omega^1(H)). \tag{18} $$
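The additivity (18) can be checked by a direct count. The sketch below assumes finite-dimensional $M$ and $H$ and uses the vector space decomposition of $N_P$ from the proof of Proposition 3.1, together with the fact that the universal 1-forms on an algebra of dimension $m$ have vector space dimension $m(m-1)$; the abbreviations $m, h, n_M, n_H$ are introduced here only for this computation.

```latex
% Abbreviations (ours): m = \dim M, h = \dim H, n_M = \dim N_M, n_H = \dim N_H; \dim P = mh.
% N_P = N_M \otimes 1 \otimes H \;\oplus\; M \otimes 1 \otimes N_H \;\oplus\; \Omega^1 M \otimes \Omega^1 H.
\begin{align*}
\dim N_P &= n_M\,h + m\,n_H + m(m-1)\,h(h-1), \\
\frac{\dim N_P}{\dim P} &= \frac{n_M}{m} + \frac{n_H}{h} + (m-1)(h-1), \\
\dim(\Omega^1(P)) &= mh - 1 - \frac{n_M}{m} - \frac{n_H}{h} - (m-1)(h-1) \\
  &= \Big(m - 1 - \frac{n_M}{m}\Big) + \Big(h - 1 - \frac{n_H}{h}\Big)
   = \dim(\Omega^1(M)) + \dim(\Omega^1(H)).
\end{align*}
```

The cross term $\Omega^1 M \otimes \Omega^1 H$ in $N_P$ is exactly what removes the mixed contribution $(m-1)(h-1)$, which is one way to see why it has to be included in the definition of $N_P$.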

Next we consider framings and coframings with the above $\Omega^1(P)$ understood. As for the universal calculus in [2] we define to this end a “$V$-bein” and “$V$-cobein” as linear maps
$$ e : V \to \Omega^1(M), \qquad e^* : V^* \to \Omega^1(M) \tag{19} $$
such that there are induced isomorphisms
$$ s_e : M \otimes V \cong \Omega^1(M), \quad s_e(m \otimes v) = m\,e(v), \qquad s_{e^*} : V^* \otimes M \cong \Omega^1(M), \quad s_{e^*}(w \otimes m) = e^*(w)\,m. $$

Proposition 3.2. A framing and coframing of $M$ with $P = M \otimes H$ are equivalent to $(V, e, e^*)$, where $V$ is a right $H$-comodule and $e, e^*$ are a $V$-bein and $V$-cobein in the sense above. The (co)frame resolutions and quantum metric are
$$ \theta(v) = e(v^{\bar{(1)}}) \otimes v^{\bar{(2)}} \otimes 1, \qquad \theta^*(w) = e^*(w^{\bar{(1)}}) \otimes 1 \otimes w^{\bar{(2)}}, \qquad g = \sum_a e^*(f^a) \otimes_M e(e_a). $$

Proof. Note first that $(H \otimes V)^H \cong V$ by $\epsilon$ in one direction and conversely by $v \mapsto S^{-1}v^{\bar{(2)}} \otimes v^{\bar{(1)}}$, hence $E \cong M \otimes V$. Likewise $\hom^H(V, H) \cong V^*$ by composing with $\epsilon$ in one direction and $w \mapsto \phi(w)$, $\phi(w)(v) = \langle w, v^{\bar{(1)}}\rangle v^{\bar{(2)}}$, hence $E^* \cong V^* \otimes M$. This is part of the standard analysis for associated bundles in the trivial case [3]. Given $e, e^*$ we define respectively $\theta, \theta^*$ as stated and verify they are equivariant. Thus
$$ \theta(v^{\bar{(1)}}) \otimes v^{\bar{(2)}} = e(v^{\bar{(1)}\bar{(1)}}) \otimes v^{\bar{(1)}\bar{(2)}} \otimes v^{\bar{(2)}} \otimes 1 = \Delta_R\theta(v) $$
as $V$ is a right comodule. Similarly for $\theta^*$, where $V^*$ is a right $H$-comodule by $\langle v, w^{\bar{(1)}}\rangle w^{\bar{(2)}} = \langle v^{\bar{(1)}}, w\rangle S^{-1}v^{\bar{(2)}}$ as usual (i.e. the adjoint of the left $H$-comodule structure on $V$ corresponding in the manner of (8) to the right comodule structure on $V$). Finally the induced
$$ s_\theta(m \otimes h \otimes v) = (m \otimes h)\,e(v^{\bar{(1)}}) \otimes v^{\bar{(2)}} \otimes 1 = m\,e(v^{\bar{(1)}}) \otimes h v^{\bar{(2)}} \otimes 1 $$
under the above identification becomes
$$ m \otimes v \mapsto s_\theta(m \otimes S^{-1}v^{\bar{(2)}} \otimes v^{\bar{(1)}}) = m\,e(v^{\bar{(1)}\bar{(1)}}) \otimes (S^{-1}v^{\bar{(2)}})v^{\bar{(1)}\bar{(2)}} \otimes 1 = m\,e(v) \otimes 1 \otimes 1, $$
i.e. reduces to $s_e$. Likewise $s_{\theta^*}$ reduces to $s_{e^*}$. Hence we obtain framings and coframings respectively from $e, e^*$. Conversely any equivariant $\theta, \theta^*$ must have this form by similar arguments as for $E, E^*$. Given these, the general formula for the metric then reduces to the one shown on using invariance of $f^a \otimes e_a$.

In fact the computation here is the same as for the universal calculus and works for any reasonable calculus on $P$, where $\Omega^1(M)P \cong \Omega^1(M) \otimes 1 \otimes H$, etc. For our particular $\Omega^1(P)$ we can suppress the $\otimes 1$ in the formulae for $\theta, \theta^*$. Next, for the principal bundle $P = M \otimes H$ a trivial reference connection is provided by
$$ \omega_0(v) = 1 \otimes 1 \otimes \pi_{N_H}(S\tilde v_{(1)} \otimes \tilde v_{(2)}), \tag{20} $$
where $\tilde v \in \ker\epsilon$ is any lift of $v \in \Omega_0$ and $\pi_{N_H}$ the projection to $\Omega^1(H)$ (the Maurer–Cartan form of $H$ viewed in $\Omega^1(P)$). Here we view $\Omega^1(H) \subset \Omega^1(P)$ by the same arguments as for $\Omega^1(M)$ (their situation is symmetric). Any other connection then corresponds to the addition of an Ad-equivariant form in the kernel of $\mathrm{ver}_{N_P}$, i.e. $\omega - \omega_0 : \Omega_0 \to P\,\Omega^1(M)\,P$. For our choice of $\Omega^1(P)$ the target here can be identified with $\Omega^1(M)P$.

Theorem 3.3. A connection on $\Omega^1(P)$ is equivalent to a linear map or “gauge field” $A : \Omega_0 \to \Omega^1(M)$. The resulting connection and corresponding projection are
$$ \omega(v) = \omega_0(v) + \pi_{N_P}\big((S\tilde v_{(1)}) \cdot A(\pi_{\Omega_0}\tilde v_{(2)}) \cdot \tilde v_{(3)}\big), $$
$$ (\mathrm{id} - \Pi_\omega)(m_i \otimes n_i \otimes h_i \otimes g_i) = -m_i n_i A(\pi_{\Omega_0} g_{i(1)}) \otimes 1 \otimes h_i g_{i(2)} + m_i\,dn_i \otimes 1 \otimes h_i g_i \in \Omega^1(M)P $$
in a manifestly strong form. Here $\tilde v$, $m_i \otimes n_i \otimes h_i \otimes g_i$ are representatives in $\ker\epsilon$ and $\Omega^1 P$ respectively and $\pi_{\Omega_0}$ denotes the canonical projection to $\Omega_0$, etc.

Proof. For any $H$-comodule $V$ we identify equivariant maps $V \to \Omega^1(M)P$ with linear maps $V \to \Omega^1(M)$ by the same construction as above for $V \to P$. Thus $A : V \to \Omega^1(M)$ corresponds to $\tilde A(v) = A(v^{\bar{(1)}}) \otimes 1 \otimes v^{\bar{(2)}}$ and conversely every $\omega$ has this form. In particular we take $V = \Omega_0$ and the right adjoint coaction given by projecting down that on $\ker\epsilon$. Thus
$$ \tilde A(v) = A(\pi_{\Omega_0}\tilde v_{(2)}) \otimes 1 \otimes (S\tilde v_{(1)})\tilde v_{(3)}. $$
When we identify $\Omega^1(M)P$ with $P\,\Omega^1(M)\,P$ we obtain the form for $\omega - \omega_0$ shown. Note that
$$ \pi_{N_P}\big(A(\pi_{\Omega_0}\tilde v_{(2)}) \otimes S\tilde v_{(1)} \otimes \tilde v_{(3)}\big) = \pi_{N_P}\big(A(\pi_{\Omega_0}\tilde v_{(2)}) \otimes 1 \otimes (S\tilde v_{(1)})\tilde v_{(3)}\big) $$
so that the left-hand side is manifestly well-defined. Here the difference between the expressions is in $\Omega^1(M) \otimes \Omega^1 H$ and hence killed by the form of $N_P$. Conversely it is clear that $\omega - \omega_0$ is necessarily of this form as explained. Finally, given such a connection, we have from the form of $\mathrm{ver}_{N_P}$ the corresponding projector
$$ \Pi_\omega(m_i \otimes n_i \otimes h_i \otimes g_i) = \pi_{N_P}\big(m_i n_i A(\pi_{\Omega_0} g_{i(3)}) \otimes h_i g_{i(1)} \otimes (Sg_{i(2)})g_{i(4)}\big) + m_i n_i \otimes 1 \otimes \pi_{N_H}(h_i \otimes g_i) $$
for any representative $m_i \otimes n_i \otimes h_i \otimes g_i \in \Omega^1 P$. Under $\pi_{N_P}$ we can move the $h_i g_{i(1)}$ to the second factor and cancel using the antipode axioms. We also write $m_i n_i \otimes 1 = m_i \otimes n_i - m_i\,dn_i$ and $m_i\,dn_i \otimes h_i \otimes g_i = m_i\,dn_i \otimes 1 \otimes h_i g_i$ under $\pi_{N_P}$. In this form we have no further quotient and drop the $\pi_{N_P}$ as shown. Note that if $h_i \otimes g_i \in N_H$ then $h_i g_{i(1)} \otimes g_{i(2)} \in H \otimes Q_H$, but since $Q_H$ is Ad-invariant we have
$$ h_i g_{i(1)} \otimes g_{i(3)} \otimes (Sg_{i(2)})g_{i(4)} \in H \otimes Q_H \otimes H. $$
Multiplying the two copies of $H$ we conclude that $h_i g_{i(2)} \otimes g_{i(1)} \in H \otimes Q_H$ also. Therefore $\mathrm{id} - \Pi_\omega$ is well-defined.

Note that we do not consider here the question of gauge transformations themselves, which is much more subtle for nonuniversal calculi even when the bundle is trivial: we simply show that all connections in our “tensor product gauge” have the above form. Basically, a gauge transformation changes the description of the bundle to a cocycle cross product as explained in [21], which in turn changes the description of the calculus (this is a quantum effect in that one does not have this cocycle classically). Other trivialisations and correspondingly the formulae in other gauges can in principle be computed via a bundle automorphism if one wants formulae for “gauge theory” but the tensor product form of the bundle $P$ will also transform.

Proposition 3.4. Given a gauge field on $M$ as above and $V$ any right $H$-comodule, the vector spaces $E = M \otimes V$ and $E^* = \mathrm{Lin}(V, M)$ acquire covariant derivatives
$$ D_A : E^* \to \Omega^1(M) \otimes_M E^*, \qquad (D_A\sigma)(v) = d\sigma(v) - \sigma(v^{\bar{(1)}}) \cdot A(\tilde\pi_{\Omega_0} v^{\bar{(2)}}), $$
$$ D_A : E \to \Omega^1(M) \otimes_M E, \qquad D_A\psi = (d \otimes \mathrm{id})\psi - \psi_i A(\tilde\pi_{\Omega_0}\psi^{i\bar{(0)}}) \otimes \psi^{i\bar{(1)}}, $$
where $\psi = \psi_i \otimes \psi^i \in M \otimes V$ is a notation and $\tilde\pi_{\Omega_0}$ denotes projection to $\ker\epsilon$ followed by $\pi_{\Omega_0}$.
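A simple sanity check on the formulae of Proposition 3.4 is the case where $V$ carries the trivial coaction, for which the gauge field should drop out. The sketch below writes $\tilde\pi$ for the projection of the proposition (subscript suppressed) and assumes, as in Proposition 3.7, that the projection to the kernel of the counit is $h \mapsto h - \epsilon(h)$.

```latex
% Assumption: trivial right coaction on V, v^{\bar{(1)}} \otimes v^{\bar{(2)}} = v \otimes 1,
% so the associated left coaction (8) is \psi^{i\bar{(0)}} \otimes \psi^{i\bar{(1)}} = 1 \otimes \psi^i.
\begin{align*}
\tilde\pi(1) &= \pi\big(1 - \epsilon(1)\big) = \pi(0) = 0, \\
(D_A\sigma)(v) &= d\sigma(v) - \sigma(v)\cdot A\big(\tilde\pi(1)\big) = d\sigma(v), \\
D_A\psi &= (d \otimes \mathrm{id})\psi - \psi_i\,A\big(\tilde\pi(1)\big) \otimes \psi^i
  = (d \otimes \mathrm{id})\psi .
\end{align*}
```

So on sections of a bundle associated to the trivial comodule the covariant derivative reduces to $d$, as one expects from the classical theory.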

Proof. Given $\sigma \in E^*$ we view it as $\Sigma \in \hom^H(V, P)$ as usual by $\Sigma(v) = \sigma(v^{\bar{(1)}}) \otimes v^{\bar{(2)}}$. Then
$$ (\mathrm{id} - \Pi_\omega)(d\Sigma(v)) = (\mathrm{id} - \Pi_\omega)\big(1 \otimes \sigma(v^{\bar{(1)}}) \otimes 1 \otimes v^{\bar{(2)}} - \sigma(v^{\bar{(1)}}) \otimes 1 \otimes v^{\bar{(2)}} \otimes 1\big) $$
$$ = d\sigma(v^{\bar{(1)}}) \otimes 1 \otimes v^{\bar{(2)}} - \sigma(v^{\bar{(1)}})A(\tilde\pi_{\Omega_0} v^{\bar{(2)}}{}_{(1)}) \otimes 1 \otimes v^{\bar{(2)}}{}_{(2)}. $$
However, this equivariant map $V \to \Omega^1(M)P$ is in the image of the identification (as in the proposition above) with $\mathrm{Lin}(V, \Omega^1(M)) = \Omega^1(M) \otimes_M E^*$ of $D_A\sigma$ as stated. Similarly for $D_A\psi$. One may verify directly that both maps are well-defined.

These formulae are characterised not by gauge covariance but by the global constructions of the previous section specialised to the case of a tensor product bundle. They are the basic local formulae of quantum group gauge theory with nonuniversal calculus in the tensor product gauge. Now we suppose the existence of $V$-(co)beins or framings and coframings as explained above. Then $D_A$ induces $\nabla_A$ etc. under the framing isomorphisms:

Corollary 3.5. The covariant derivative $\nabla_A : \Omega^1(M) \to \Omega^1(M) \otimes_M \Omega^1(M)$ is given by
$$ \nabla_A = d\,s_e^{-1}{}_i \otimes_M e(s_e^{-1\,i}) - s_e^{-1}{}_i \cdot A(\tilde\pi_{\Omega_0} s_e^{-1\,i\,\bar{(0)}}) \otimes_M e(s_e^{-1\,i\,\bar{(1)}}), $$
where $s_e^{-1}{}_i \otimes s_e^{-1\,i}$ denotes the output of $s_e^{-1}$ and we use the projected right adjoint coaction viewed as a left coaction as in (8). If we write $\alpha = \alpha^a \cdot e(e_a)$ for all $\alpha \in \Omega^1(M)$, then this is
$$ \nabla_A\alpha = d\alpha^a \otimes_M e(e_a) - \alpha^a A(\tilde\pi_{\Omega_0} e_a{}^{\bar{(0)}}) \otimes_M e(e_a{}^{\bar{(1)}}). $$

Similarly for trivial bundles we can look at the construction of $\gamma$. Here $\mathcal{S}$ can be identified with $\mathcal{S} = M \otimes W$ as a left $M$-module as explained above for any associated bundle.

Corollary 3.6. For $P = M \otimes H$ and given $s_e$ and a right-comodule $W$, suitable $\gamma$ in (16) are provided by linear maps $\gamma : V \to \mathrm{End}(W)$. The corresponding Dirac operator $\mathcal{S} \to \mathcal{S}$ is given on $\psi = \psi_i \otimes \psi^i \in M \otimes W$ by
$$ D\!\!\!\!/\,\psi = \partial^a\psi_i \otimes \gamma_a(\psi^i) - \psi_i A^a(\tilde\pi_{\Omega_0}\psi^{i\bar{(0)}}) \otimes \gamma_a(\psi^{i\bar{(1)}}), \qquad s_e^{-1} \circ d = \partial^a \otimes e_a, \qquad \gamma_a = \gamma(e_a), $$
where $A^a$ are the components of $A$ as above.

Proof. Since $\gamma$ is a left $M$-module map and defined on $(M \otimes V) \otimes_M (M \otimes W) \cong M \otimes V \otimes W$, it is determined by $\gamma((1 \otimes v) \otimes_M (1 \otimes w)) \equiv \gamma(1 \otimes v \otimes w) \equiv \gamma(v)(w) \in M \otimes W$, say. It is natural to assume here that $\gamma(v)(w) \in W$ itself. Note that the right $M$-module structure on $M \otimes V$ is not the obvious one (it is the one corresponding to that of $\Omega^1(M)$ via $s_e$) but becomes irrelevant after we absorb $\otimes_M M$. We then compute $D\!\!\!\!/$ by the above formulae for $D_A$ on $\mathcal{S}$ and the left $M$-module isomorphism $s_e^{-1}$ as before (with the notations stated) to map $d\psi$ and $A$ over to $M \otimes V$, thereby obtaining an element of $M \otimes V \otimes W$. We then apply $\gamma$ to $V$ and evaluate its output in $\mathrm{End}(W)$ on the other (spinor) component of $\psi$.

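We note that the operators $\partial^a$ in these expressions are not derivations but characterised by
$$ \partial^a(mn) = m(\partial^a n) + (\partial^b m)\rho_b{}^a(n); \tag{21} $$
where we write the “generalised braiding” or entwining operator induced by $s_e$ as
$$ \Psi_e : V \otimes M \to M \otimes V, \qquad \Psi_e(e_a \otimes m) = s_e^{-1}(e(e_a)m) = \rho_a{}^b(m) \otimes e_b \tag{22} $$
for operators $\rho_a{}^b$ on $M$. They evidently obey $\rho_a{}^b(1) = \delta_a{}^b$ and $\rho_a{}^b(mn) = \rho_a{}^c(m)\rho_c{}^b(n)$ as an expression of the right module structure of $\Omega^1(M)$. Indeed, (21) follows on writing $dm = (\partial^a m)e(e_a)$, applying the Leibniz rule $d(mn) = (dm)n + m\,dn$ and using $e(e_a)n = \rho_a{}^b(n)e(e_b)$ from (22). In this notation,
$$ [D\!\!\!\!/, m] = (\partial^a m)\rho_a{}^b \otimes \gamma(e_b) \tag{23} $$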

if one wants to compare this approach with that of Connes [5]. From this it is clear that if γ : V → End(W ) is injective then ker πD/ = NM , where πD/ (mdn) = m[D / , n]. Hence these approaches correspond to the same differential calculus at degree 1. At higher degree Connes proposes to quotient the universal exterior algebra by the differential ideal generated from repeated commutators with D / . At degree 2 the requirement that we recover a given choice of 2 (M) is a quadratic constraint on the linear maps γ appearing in Corollary 3.6. Another aspect to the “correct” choice of γ would be to demand that it is H -equivariant as an analogue of the idea that the gamma-matrices generate a representation of the spin group. We will look at these constraints in detail in the settings of Sects. 4 and 5. We require similar properties as in Proposition 3.1 for 2 (M) and 2 (P ) needed for the global picture of curvature, torsion and cotorsion. Namely, we require

2 (M) ⊂ 2 (P ),

2 (M)P ⊂ 2 (P ),

(24)

etc. in the obvious way by $\otimes 1$ (as above for 1-forms). For example $\Omega^1(P)$ itself determines a “maximal prolongation” to higher forms consisting of $\Omega^1(P)\otimes_P\Omega^1(P)$ modulo the additional relations implied by extending $d:\Omega^1(P)\to\Omega^2(P)$ with a graded Leibniz rule and $d^2=0$, and a short computation shows that this works. A general choice will be a bimodule quotient of this. Similarly for higher degree. We may then proceed to make calculations along exactly the same lines as for 1-forms above. Specifically, it is clear that $\mathrm{Lin}(V,\Omega^2(M))$ corresponds in the same manner as before to equivariant maps $V\to\Omega^2(M)P$, etc. One has therefore

\[ D_A:\mathrm{Lin}(V,\Omega^n(M))\to\mathrm{Lin}(V,\Omega^{n+1}(M)),\qquad D_A\wedge\sigma(v) = d\sigma(v) + (-1)^{n+1}\sigma(v^{\bar{(1)}})\wedge A(\tilde\pi_{\Omega_0}v^{\bar{(2)}}), \]

etc. Here $D_A$ is $d$ on $P$ followed by $(\mathrm{id}-\Pi_\omega)$ in each copy of $\Omega^1(P)$. The proof is just as for the universal calculus in [3] followed by the required projections. See also [21].

Proposition 3.7. For all $\sigma\in\mathrm{Lin}(V,M)$, $D_A\wedge D_A\wedge\sigma(v) = -\sigma(v^{\bar{(1)}})F_A(\pi_\epsilon v^{\bar{(2)}})$, where

\[ F_A:\ker\epsilon\to\Omega^2(M),\qquad F_A(v) = dA(\tilde\pi_{\Omega_0}v) + A(\tilde\pi_{\Omega_0}v_{(1)})\wedge A(\tilde\pi_{\Omega_0}v_{(2)}) \]

and $\pi_\epsilon(h)=h-\epsilon(h)$. We say that $A$ is “regular” if $F$ descends to $\Omega_0\to\Omega^2(M)$, i.e. if

\[ A(\tilde\pi_{\Omega_0}q_{(1)})\wedge A(\tilde\pi_{\Omega_0}q_{(2)}) = 0,\qquad \forall q\in Q_H. \]

Proof. We apply the above formulae for $D_A$ and compute exactly as for the universal calculus. As in the usual computation, iteration of the coaction produces a coproduct and the well-defined formula for $D_A\wedge D_A\wedge\sigma(v)$ as stated. We omit details since they are the same as the universal case in [3]. See also [21]. The map $A\circ\pi_{\Omega_0}$ plays the role of $A:\ker\epsilon\to\Omega^1M$ in the universal calculation and all expressions are finally projected to the relevant differentials. In doing this one only knows that $F_A:\ker\epsilon\to\Omega^2(M)$ as stated.

It is not such a problem if $A$ is not regular. Classically it would mean that $F_A$ was not Lie algebra valued but valued in the enveloping algebra. Such a condition depends very much on the form of $A$ and of the calculi $\Omega^2(M)$ and $\Omega^1(H)$. One could view it as some kind of “differentiability” condition on $A$. Next we clarify the geometric meaning of our objects. $\nabla\wedge$ denotes applying the covariant derivative $\nabla$ and then projecting to $\Omega^2(M)$.

Corollary 3.8. The curvature $R:\Omega^1(M)\to\Omega^2(M)\otimes_M\Omega^1(M)$ for a regular connection obeys $R=((\mathrm{id}\wedge\nabla)-(d\otimes\mathrm{id}))\circ\nabla$. The torsion $T:\Omega^1(M)\to\Omega^2(M)$ and cotorsion $\Gamma\in\Omega^2(M)\otimes_M\Omega^1(M)$ corresponding to

\[ \bar D_A\wedge e(v) = de(v) + A(\tilde\pi_{\Omega_0}v^{\bar{(0)}})\wedge e(v^{\bar{(1)}}),\qquad D_A\wedge e^*(w) = de^*(w) + e^*(w^{\bar{(1)}})\wedge A(\tilde\pi_{\Omega_0}w^{\bar{(2)}}) \]

respectively (assuming a $V$-cobein in the second case) are

\[ \nabla\wedge = d - T,\qquad \Gamma = (\nabla\wedge\mathrm{id}-\mathrm{id}\wedge\nabla)g + (T\otimes\mathrm{id})g. \]

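Proof. These results follow from the general theory outlined in Sect. 2 specialised to the bundle $P=M\otimes H$ along the lines already given. However, for trivial bundles one may give a direct self-contained proof as well. For the curvature the notation $(\mathrm{id}\wedge\nabla)$ means to act in the second tensor factor of $\Omega^1(M)\otimes_M\Omega^1(M)$ and then project the first two of the resulting three factors to $\Omega^2(M)$. From the definition of $\nabla$ we have on a 1-form $\alpha$,

\begin{align*}
R\alpha &= ((\mathrm{id}\wedge\nabla)-(d\otimes\mathrm{id}))\big(d\alpha^a\otimes_M e(e_a) - \alpha^a A(\tilde\pi_{\Omega_0}e_a{}^{\bar{(0)}})\otimes_M e(e_a{}^{\bar{(1)}})\big)\\
&= d\alpha^a\wedge\big(-A(\tilde\pi_{\Omega_0}e_a{}^{\bar{(0)}})\otimes_M e(e_a{}^{\bar{(1)}})\big) + \alpha^a A(\tilde\pi_{\Omega_0}e_a{}^{\bar{(0)}})\wedge A(\tilde\pi_{\Omega_0}e_a{}^{\bar{(1)}\bar{(0)}})\otimes_M e(e_a{}^{\bar{(1)}\bar{(1)}})\\
&\quad + d\big(\alpha^a A(\tilde\pi_{\Omega_0}e_a{}^{\bar{(0)}})\big)\otimes_M e(e_a{}^{\bar{(1)}})\\
&= \alpha^a F_A(\tilde\pi_{\Omega_0}e_a{}^{\bar{(0)}})\otimes_M e(e_a{}^{\bar{(1)}})
\end{align*}

using the Leibniz rule and the left comodule property. This also gives the way to compute the action of $R$ from $F_A$. For torsion we project the definition of $\nabla$ down to $\Omega^2(M)$, so that

\begin{align*}
\nabla\wedge\alpha &= (d\alpha^a)\wedge e(e_a) - \alpha^a A(\tilde\pi_{\Omega_0}e_a{}^{\bar{(0)}})\wedge e(e_a{}^{\bar{(1)}})\\
&= d\alpha - \alpha^a de(e_a) - \alpha^a A(\tilde\pi_{\Omega_0}e_a{}^{\bar{(0)}})\wedge e(e_a{}^{\bar{(1)}}) = d\alpha - \alpha^a\bar D_A\wedge e(e_a)
\end{align*}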

by the Leibniz rule in $\Omega^2(M)$. This also makes it clear how $T$ can be efficiently determined from $\bar D_Ae$. For the cotorsion we use the metric to similarly relate it to $D_Ae^*$, namely

\begin{align*}
\Gamma = D_A\wedge e^*(f^a)\otimes_M e(e_a) &= de^*(f^a)\otimes_M e(e_a) + e^*(f^{a\bar{(1)}})\wedge A(\tilde\pi_{\Omega_0}f^{a\bar{(2)}})\otimes_M e(e_a)\\
&= de^*(f^a)\otimes_M e(e_a) + e^*(f^a)\wedge A(\tilde\pi_{\Omega_0}e_a{}^{\bar{(0)}})\otimes_M e(e_a{}^{\bar{(1)}})\\
&= de^*(f^a)\otimes_M e(e_a) - e^*(f^a)\wedge\nabla e(e_a),
\end{align*}

where we use that the right coaction on $V^*$ is adjoint to the left one on $V$ (obtained as in (8)). Specifically, it means that $f^{a\bar{(1)}}\otimes e_a\otimes f^{a\bar{(2)}} = f^a\otimes e_a{}^{\bar{(1)}}\otimes S^{-1}e_a{}^{\bar{(2)}}$ for the relation between the two coactions. Finally, we use the characterisation of torsion already obtained.

The corollary shows in particular one of the key ideas in our approach [2]; the vanishing of cotorsion (or rather the difference between the torsion and the cotorsion) is a skew-symmetrized version of the “Levi–Civita” condition of metric compatibility. From the Riemann tensor above, it is clear that if we are given a bimodule map $i:\Omega^2(M)\to\Omega^1(M)\otimes_M\Omega^1(M)$ (preferably splitting the surjection $\wedge$ but not necessarily) we have a well-defined Ricci tensor

\[ \mathrm{Ricci} = \langle i(R)(e(e_a)), f^a\rangle = i(F_A(\tilde\pi_{\Omega_0}e_a{}^{\bar{(0)}}))^{ab}\,e(e_b)\otimes_M e(e_a{}^{\bar{(1)}}), \tag{25} \]

where $i(F_A)=i(F_A)^{ab}e(e_a)\otimes_M e(e_b)$ defines its components. The first trace expression is with the pairing applied to the first component of $i(R)$ with all coefficients taken to the left in the $V$-bein basis and $\langle me(e_a),f^b\rangle = m\delta_a{}^b$. It is independent of the basis of $V$. One may go further and similarly contract to the scalar curvature.

Finally, let us note that we are taking a view in which the underlying variables are a $V$-bein for the framing and, given this, an independent $V$-cobein $e^*$ for the metric. If we fix a specific reference choice of that, e.g. $e^*_{\rm ref}(f^b)=e(e_a)\eta^{ab}$ for some fixed equivariant isomorphism $\eta:V^*\cong V$, then any other $V$-cobein has the form $e^*(f^b)=e^*_{\rm ref}(f^a)g_a{}^b$ for some $g\in GL(n,M)$, where $n=\dim(V)$. Then (summations understood)

\[ g = e^*_{\rm ref}(f^a)g_a{}^b\otimes_M e(e_b) = e(e_a)g^{ab}\otimes_M e(e_b);\qquad g^{ac}=\eta^{ab}g_b{}^c. \tag{26} \]
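
The component relation in (26) follows by direct substitution. As a short check (a sketch in the notation above; the summation label $c$ and the Kronecker delta $\delta_a{}^b$ are introduced here only for illustration):

```latex
% Substitute e^*_{ref}(f^a) = e(e_c)\eta^{ca} into g = e^*_{ref}(f^a) g_a{}^b \otimes_M e(e_b):
\begin{align*}
g &= e^*_{\rm ref}(f^a)\,g_a{}^b\otimes_M e(e_b)
   = e(e_c)\,\eta^{ca}g_a{}^b\otimes_M e(e_b)
   = e(e_c)\,g^{cb}\otimes_M e(e_b),
\qquad g^{cb}=\eta^{ca}g_a{}^b .
\end{align*}
% Special case g_a{}^b = \delta_a{}^b: then g^{cb} = \eta^{cb}, i.e. one recovers the reference metric.
```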

This completes our treatment of parallelizable quantum Riemannian manifold structures on general algebras M, which can be expected to be the minimum level of generality for comparison with quantum theory. The rest of the paper is devoted to constructing examples of this including quantum groups, finite sets and finite groups. One could in principle also apply it to specific quantum systems as well as to discrete algebras such as quaternions in the setting of [10]. 4. Riemannian Geometry of Quantum Groups In this section we construct quantum Riemannian geometries where M is a quantum group. This covers both finite groups and Lie groups (in an algebraic form) as well as their q-deformations. In fact Hopf algebras have been used historically to unify Lie theory and finite group theory and we do the same here by working with general Hopf

algebras. The main result follows in Sect. 4.1 with the construction of natural metrics on the standard $C_q[G]$ from a braided-Killing form on the braided-Lie algebra tangent to the fibres of the frame bundle. For framing we take the same quantum group $H=M$. The classical meaning of this is explained in [2], with the same bicovariant differential calculi on $M$ and $H$. These are determined by ideals $Q_M=Q_H$ as usual. Here $V=\Omega_0=\ker\epsilon/Q_H$ has a right coaction $\mathrm{Ad}_R$ and is the dual of the braided-Lie algebra in the fibre direction. We begin by checking the various conditions needed to establish a framing or quantum manifold structure in the sense of Sect. 3. In effect we are able for the first time properly to interpret the well-known “Maurer–Cartan” form in [4] in a geometrical manner. It also provides an actual connection (generally with torsion).

Lemma 4.1. For $P=M\otimes H$ and $M$ a Hopf algebra, if $\Omega^1(M)$ is bicovariant then so is $\Omega^1(P)$,

\[ Q_P = Q_M\otimes H + M\otimes Q_H + \ker\epsilon_M\otimes\ker\epsilon_H,\qquad \Omega_{0P} = \Omega_{0M}\otimes 1\oplus 1\otimes\Omega_{0H} \]

and the exterior algebras $\Omega^\cdot(P)$, $\Omega^\cdot(M)$ obey (24). In the case $M=H$ the Maurer–Cartan form

\[ e:\Omega_0\to\Omega^1(H),\qquad e(v)=\pi_{N_H}(S\tilde v_{(1)}\otimes\tilde v_{(2)}) \]

for any representative $\tilde v$ of $v\in\Omega_0$ provides a framing as well as a zero curvature gauge field $A=e:\Omega_0\to\Omega^1(H)$.

Proof. For the differential calculus, it is evident that $\Delta_{M\otimes H}(m\otimes h)=m_{(1)}\otimes h_{(1)}\otimes m_{(2)}\otimes h_{(2)}$ is a left or right coaction on $M\otimes H$ and that $N_P$ is bicovariant just because $N_M$ and $N_H$ are. The map $\mathrm{ver}_{M\otimes H}$ (not to be confused with that of the bundle) easily computes as an isomorphism $N_P\cong M\otimes Q_M\otimes H\otimes H + M\otimes M\otimes H\otimes Q_H + M\otimes\ker\epsilon_M\otimes H\otimes\ker\epsilon_H = M\otimes H\otimes Q_P$ under the usual identification of the vector spaces. Note also that we have $Q_P=Q_M\otimes 1\oplus 1\otimes Q_H\oplus\ker\epsilon_M\otimes\ker\epsilon_H$ as right $\mathrm{Ad}_{M\otimes H}$-comodules. We then apply the Woronowicz construction for $\Omega^\cdot(P)$, $\Omega^\cdot(M)$. Here the additional relations on $\Omega^1(P)\otimes_P\Omega^1(P)$ are defined by the kernel of $\mathrm{id}-\Psi$, where the braiding $\Psi$ is determined by the usual flip on left and right invariant forms on $P$. But these are just the images of those either from $M$ or from $H$. Next, that $e$ provides an $\Omega_0$-bein and hence a framing is precisely the geometric meaning of the isomorphism $\Omega^1(H)\cong H\otimes\Omega_0$, namely with inverse being $s_e$ for the Maurer–Cartan form. Regularity of $A$ is also immediate since $e$ is known to obey the well-known “Maurer–Cartan equation”

\[ de(v) + e(\tilde\pi_{\Omega_0}\tilde v_{(1)})\wedge e(\tilde\pi_{\Omega_0}\tilde v_{(2)}) = 0. \tag{27} \]

(This in turn is immediate by working in the universal calculus, where $e(\tilde v)=S\tilde v_{(1)}\otimes\tilde v_{(2)}-1\otimes 1\,\epsilon(\tilde v)=S\tilde v_{(1)}d\tilde v_{(2)}$.) From the Maurer–Cartan equation it follows that if we view $A=e$ as a gauge field then it is regular and has zero curvature.
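
The last two claims can be spelled out in one line each. The following is only a sketch; it assumes that the projection written $\tilde\pi_{\Omega_0}$ here is the one used in Proposition 3.7, with $\Omega_0=\ker\epsilon/Q_H$, and that (27) holds for every representative $\tilde v$, as stated in Lemma 4.1:

```latex
% Zero curvature: put A = e in F_A from Proposition 3.7 and use (27), with v the class of \tilde v.
\[ F_e(\tilde v) = de(\tilde\pi_{\Omega_0}\tilde v)
   + e(\tilde\pi_{\Omega_0}\tilde v_{(1)})\wedge e(\tilde\pi_{\Omega_0}\tilde v_{(2)}) = 0,
   \qquad \tilde v\in\ker\epsilon . \]
% Regularity: q \in Q_H represents the class v = 0, so de(v) = 0 and (27) reduces to
\[ e(\tilde\pi_{\Omega_0}q_{(1)})\wedge e(\tilde\pi_{\Omega_0}q_{(2)}) = 0,
   \qquad \forall q\in Q_H . \]
```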

The operators $\rho_a{}^b$ in (21) for this framing are those of right translation according to

\[ e(v)g = g_{(1)}e(vg_{(2)}),\qquad \forall v\in\Omega_0,\ g\in H. \tag{28} \]

There is also a right-handed framing defined by $\bar e(v)=\pi_{N_H}(\tilde v_{(1)}\otimes S\tilde v_{(2)})$ and related by

\[ e(v) = \bar e(S\pi_{\Omega_0}\tilde v_{(2)})(S\tilde v_{(1)})\tilde v_{(3)}. \tag{29} \]

Hence the braiding in the definition of $\Omega^\cdot(H)$ can be written in the crossed module form

\[ \Psi(e(v)\otimes_H e(w)) = e(\pi_{\Omega_0}\tilde w_{(2)})\otimes_H e(v(S\tilde w_{(1)})\tilde w_{(3)}) \tag{30} \]

rather than the more standard form with $e,\bar e$ as in (1). We clearly have a natural “lift”

\[ i = \mathrm{id}-\Psi:\Omega^2(H)\to\Omega^1(H)\otimes_H\Omega^1(H), \tag{31} \]

since $\Omega^2(H)$ is by definition $\Omega^1(H)\otimes_H\Omega^1(H)$ modulo $\ker(\mathrm{id}-\Psi)$ and hence isomorphic to the image of $\mathrm{id}-\Psi$. On the other hand, $\Psi$ does not generally obey $\Psi^2=\mathrm{id}$ and as a result this map does not generally split $\wedge$, i.e. $i\circ\wedge$ is not a projection. Therefore one can use this $i$ to define the Ricci tensor and interior products, etc., but it is not necessarily the best choice. The torsion tensor corresponds from Sect. 3 to

\[ \bar D_A\wedge e(v) = de(v) + A(\tilde\pi_{\Omega_0}((S^{-1}\tilde v_{(3)})\tilde v_{(1)}))\wedge e(\pi_{\Omega_0}\tilde v_{(2)}) \tag{32} \]

since the coaction on $\Omega_0$ to be used is the right adjoint one converted to a left coaction by (8). We do not solve this in general (this would appear to require further data) but it is worth noting that classically $A=\frac12 e$ is a torsion free connection, and also cotorsion free for the Killing metric for any classical compact Lie group [2]. The latter is an example of an important class of quantum metrics where the $\Omega_0$-cobein $e^*:\Omega_0^*\to\Omega^1(H)$ is defined by a nondegenerate Ad-invariant bilinear form. Such an element corresponds to an Ad-invariant element $\eta=\eta^{(1)}\otimes\eta^{(2)}\in\Omega_0\otimes\Omega_0$, nondegenerate as a map $\eta:\Omega_0^*\to\Omega_0$ by evaluation against the second component.

Proposition 4.2. Any nondegenerate Ad-invariant $\eta\in\Omega_0\otimes\Omega_0$ defines a coframing $e^*=e\circ\eta$. The corresponding metric $g=e(\eta^{(1)})\otimes_H e(\eta^{(2)})$ is symmetric in the sense $\wedge(g)=0$ iff $\eta=\eta^{(2)}\otimes S^2\eta^{(1)}$. Its cotorsion in terms of $e$ is given by $D_A\wedge e(v)=de(v)+e(\pi_{\Omega_0}\tilde v_{(2)})\wedge A(\tilde\pi_{\Omega_0}((S\tilde v_{(1)})\tilde v_{(3)}))$.

Proof. For the framing the only delicate part is to check that $\eta:\Omega_0^*\to\Omega_0$ is equivariant, where $\Omega_0^*$ has the right coaction adjoint to the left coaction on $\Omega_0$ given as in (8) by $S^{-1}$, i.e. that $\eta^{(1)}\otimes\eta^{(2)}{}_{(2)}\otimes S^{-1}((S\eta^{(2)}{}_{(1)})\eta^{(2)}{}_{(3)}) = \eta^{(1)}{}_{(2)}\otimes\eta^{(2)}\otimes(S\eta^{(1)}{}_{(1)})\eta^{(1)}{}_{(3)}$, using Hopf algebra methods [18]. Next, the condition that $e(\eta^{(1)})\wedge e(\eta^{(2)})=0$ is that $(e\otimes_H e)\circ\eta$ is in the kernel of $(\mathrm{id}-\Psi)$ where $\Psi$ is as above. The corresponding $\Psi$ on $\Omega_0\otimes\Omega_0$ computes as $\Psi(\eta^{(1)}\otimes\eta^{(2)}) = \eta^{(2)}{}_{(2)}\otimes\eta^{(1)}(S\eta^{(2)}{}_{(1)})\eta^{(2)}{}_{(3)} = \eta^{(2)}\otimes S^2\eta^{(1)}$ in view of the equivariance of $\eta$. Finally, cotorsion from Sect. 3 corresponds to

\[ D_A\wedge e^* = (d\otimes\mathrm{id})e^* + e^{*(1)}\wedge A(\tilde\pi_{\Omega_0}e^{*(2)\bar{(0)}})\otimes e^{*(2)\bar{(1)}}, \]

where e∗ = e∗ (1) ⊗ e∗ (2) ∈ 1 (H ) ⊗ 0 , or equivalently ¯ ¯ ) ∧ A(π˜ 0 w (2) ) DA ∧ e∗ (w) = de∗ (w) + e∗ (w (1)

(33)

for the right coaction on ∗0 adjoint to the left coaction on 0 obtained as in (8). This second form where e∗ : ∗0 → 1 (H ) and equivariance of η immediately gives the result. Hence we have a canonical framing and metric and at least one natural (not generally torsion free or cotorsion free) connection on any Hopf algebra, and concrete equations for the torsion and cotorsion conditions. We also have a “tautological” choice of “gamma” matrix and hence an induced Dirac operator for each connection. Thus, let W be a right H -comodule viewed as in (8) as a left comodule. Also let the inverse map η−1 (v) = η−(1) η−(2) , v define η−1 ∈ ∗0 ⊗ ∗0 or η−1 : 0 ⊗ 0 → k depending on one’s point of view (we assume finite-dimensionality). Corollary 4.3. For any right comodule W and η as above there is a canonical equivariant map γ : 0 ⊗ W → W,

¯ ¯ γ (v)w = η−1 (v ⊗ π˜ 0 (w (0) ))w (1)

obeying additionally the identity ¯ ¯ w (1) , (γ ◦ γ )(η)w = c, w (0)

c = η−(1) η−(2) .

Proof. By similar Hopf algebra methods, equivariance of η can be written as ¯ ¯ ¯ (2) η−1 (v (1) ⊗ w(1) )v (2) w ¯ = η−1 (v ⊗ w). From this one similarly computes ¯ (1) ¯ −1 (1) ¯ (2) ¯ ¯ (2) ¯ ¯ ¯ (2) η (v ¯ ⊗ S −1 w (1) ) ⊗ v (2) w ¯ = γ (v (1) )w (1) ⊗ v (2) w¯, R (γ (v)w) = w(1) ¯ (1) ¯ −1 (1) ¯ (2) ¯ ¯ γ (η(1) )γ (η(2) )w = w (1) η (η ⊗ S −1 w (1) )η−1 (η(2) ⊗ S −1 w (2) ) ¯ (1) ¯ −1 −1 (2) ¯ (2) ¯ = w(1) η (S w ¯ ⊗ S −1 w (1) ) ¯ −1 ¯ ¯ = w(1) η ((S −1 w (2) )(1) ⊗(S −1 w (2) )(2) )

as required. Note that $c$ is invariant under the right coadjoint coaction on $\Omega_0^*$ because $\eta^{-(2)}\otimes\eta^{-(1)}$ is (the reversal is because it is the left coadjoint coaction that respects the product here). There is also a tautological $\gamma^*$ defined similarly without the $\eta^{-1}$, i.e. just from the comodule itself and with similar features. The equivariance of $\gamma$ (and $\gamma^*$) here replaces the idea that the antisymmetric products of $\gamma$ classically generate a representation of the rotation group or that $\gamma$ generates a representation of the spin group. Meanwhile, the coadjoint invariant element $c$ is central at least when it lies in a Hopf algebra $U$ dually paired with $H$ (which will generally be the case). We denote by $\rho_W$ the left action of $U$ corresponding to the right coaction of $H$, so when $\rho_W$ is irreducible then $(\gamma\circ\gamma)(\eta)$, etc. will be a multiple of the identity, which is a remnant of the usual “Clifford algebra” property for the symmetric products of $\gamma$.

Proposition 4.4. With framing and connection provided by the Maurer–Cartan form itself and with the tautological $\gamma$ as above, the Dirac operator associated to any right $H$-comodule is

\[ D\!\!\!\!/ = \partial^a\gamma_a - \rho_W(S^{-1}c),\qquad \gamma_a = \eta^{-1}_{ab}\,\rho_W(S^{-1}f^b). \]

Proof. Here $\eta^{-1}_{ab}=\eta^{-1}(e_a\otimes e_b)$. The general expression for the Dirac operator is in Corollary 3.6. We note that if $A=e=A^ae(e_a)$ then its components are $A^a(v)=\langle f^a,v\rangle$. Here $f^a$ are a dual basis of $\Omega_0^*\subset\ker\epsilon\subset U$ (which we assume for convenience of presentation). Hence $\langle f^a,h\rangle=\langle f^a,\tilde\pi_{\Omega_0}h\rangle$ automatically makes the projection, giving the general form of $D\!\!\!\!/$ as stated. We write the coaction as an action of the dual basis for convenience. For the particular form of $\gamma$ itself given by the coaction or by $\rho_W$ we immediately obtain the result stated.

This completes our analysis for general Hopf algebras. Before turning to nontrivial examples let us note that for $H$ cocommutative (e.g. classically an Abelian group) all connections $A$ are torsion free and induce the same $\nabla$ given by $\nabla\alpha=d\alpha^a\otimes_H e(e_a)$ with zero Riemannian curvature. Any nondegenerate bilinear form $\eta\in\Omega_0\otimes\Omega_0$ defines a metric with zero cotorsion as well. This does, however, give a simple example of noncommutative geometry fully in keeping with the classical picture. For example, for a Lie algebra $g$ the enveloping algebra $H=U(g)$ can be viewed “up side down” as the quantisation of the Kirillov-Kostant bracket on $g^*$.

Proposition 4.5. For $H=U(g)$, coirreducible calculi are provided by $(V,\lambda)$ with $V$ an irreducible right module (with right action $\rho$) and $\lambda\in P(V)$ a ray. Here

\[ \Omega_0=\ker\epsilon/\ker\rho_\lambda,\qquad \rho_\lambda:\Omega_0\cong V,\qquad \rho_\lambda(h)=\lambda\cdot\rho(h),\qquad \forall h\in U(g). \]

Then $e=e_{MC}\circ\rho_\lambda^{-1}$ is a framing, where $e_{MC}$ is the Maurer–Cartan form, and

\[ e(v)\xi=\xi e(v)+e(v\cdot\rho(\xi)),\qquad d(\xi_1\cdots\xi_n)=\sum_{m=0}^{n-1}\ \sum_{\sigma\in S_{m|n-m}}\xi_{\sigma(1)}\cdots\xi_{\sigma(m)}\,e(\rho_\lambda(\xi_{\sigma(m+1)}\cdots\xi_{\sigma(n)})), \]

where $\xi,\xi_i\in g$ and $S_{m|n-m}$ denotes permutations of $\{1,2,\cdots,n\}$ such that $\sigma(1)<\cdots<\sigma(m)$ and $\sigma(m+1)<\cdots<\sigma(n)$ (an $m$-shuffle). Any bilinear form $\eta$ in $V\otimes V$ defines a metric as above, and $\nabla$ is torsion free and cotorsion free.

Proof. The differential calculus is a “differentiation” of the classification in [22] for the calculi for group algebras as a pair consisting of an irreducible representation and ray. After differentiating those formulae one verifies directly that the above defines a calculus and that it is coirreducible. Here $\ker\rho_\lambda$ is clearly an ideal and for fixed $\rho$ and in the irreducible case the image of $\rho_\lambda$ must be all of $V$. Actually the minimum we need for a calculus here is that $\lambda$ is a cyclic vector. If we simply identify $\Omega_0$ with $V$ in this way then clearly

\[ d\xi=\lambda\rho(\xi),\qquad v\xi=\xi v+v\rho(\xi) \tag{34} \]

which is easily seen to extend by Leibniz to a well-defined calculus. Thus $d(\xi\eta)=(\lambda\rho(\xi))\eta+\xi(\lambda\rho(\eta))=\xi\lambda\rho(\eta)+\eta\lambda\rho(\xi)+\lambda\rho(\xi\eta)$ so that $d(\xi\eta-\eta\xi)=d[\xi,\eta]$. A proof by induction gives the general form of $d$ (writing the identification $e$ explicitly). Also the right action on $V$ corresponds to right multiplication on $\Omega_0$ as it should, since $\rho_\lambda(h\xi)=\lambda\rho(h\xi)=\lambda\rho(h)\rho(\xi)=\rho_\lambda(h)\rho(\xi)$. If $\Omega_0'$ defines a quotient differential calculus then it corresponds to a surjection $\phi:V\to\Omega_0'$, an intertwiner as $U(g)$-modules which, for irreducible $V$, must be an isomorphism. To form

a commutative triangle, $d\xi=\phi(\lambda\rho(\xi))=\phi(\lambda)\rho'(\xi)$, say, so that the quotient calculus is isomorphic to our $(V,\lambda')$ calculus with $\lambda'=\phi(\lambda)$. Moreover, $(V,\lambda)$ is isomorphic to $(V,\lambda')$ if and only if $\phi$ is a nonzero multiple of the identity, i.e. $\lambda'$ proportional to $\lambda$, i.e. the calculus depends on $\lambda$ only up to scale.

This describes the calculus that we use. While these are not all possible calculi (any ideal in $\ker\epsilon$ defines a calculus since $H$ is cocommutative), they are the natural “integrable” calculi in the sense that they “differentiate” the formulae in the finite group case. We compute the geometric structure. This is defined in terms of $\Omega_0$ (which is hard to work with) so we work instead with its isomorphic image which is $V$ as stated. Hence we take $V$ itself as the framing space and $e$ the Maurer–Cartan form converted under the identification (similarly for all the formulae above). For the exterior algebra we have $de(v)=0$ and $e(v)\wedge e(w)=-e(w)\wedge e(v)$.

More generally, it is clear from the proof that any representation $V$ and cyclic $\lambda$ likewise gives a framing, etc. (if we do not care about irreducibility). This describes $U(g)$ as a “noncommutative flat space” (namely quantized $g^*$). One can also choose interesting spinor spaces and $\gamma$-matrices and hence a Dirac operator sensitive to $A$. On $U(su_2)$ for example one could take the usual $\gamma$ (Pauli) matrices. And, of course, one can have other metrics not induced by constant $\eta$.

4.1. Killing form metric on $C_q[G]$. We now turn to our main construction which is the example of $M=H$ a dual quasitriangular Hopf algebra. It means that there is a “universal R-matrix functional” $\mathcal R:H\otimes H\to k$, which includes the standard deformations $C_q[G]$ of the classical simple Lie groups. $\Omega^1(H)$ is built from a finite-dimensional right comodule $W$ (which we view as a left module of $H^*$ with action $\rho_W$). The element $Q=\mathcal R_{21}\mathcal R$ is the “universal Killing form” and we view it as a map $Q:H\to H^*$ by evaluation, i.e. $\langle g,Q(h)\rangle=Q(h\otimes g)=\mathcal R(g_{(1)}\otimes h_{(1)})\mathcal R(h_{(2)}\otimes g_{(2)})$ for $g,h\in H$. We assume that $\rho_W\circ Q$ is surjective (e.g. if $\mathcal R$ is factorisable and $\rho_W$ irreducible). We also define the induced actions of $H$:

\[ \rho_+(h)^\alpha{}_\beta=\mathcal R(h\otimes\rho^\alpha{}_\beta),\qquad \rho_-(h)^\alpha{}_\beta=\mathcal R^{-1}(\rho^\alpha{}_\beta\otimes h), \tag{35} \]

where $e_\alpha\mapsto e_\beta\otimes\rho^\beta{}_\alpha$ defines the matrix elements of $\rho_W$ for a basis $\{e_\alpha\}$ of $W$. With these notations one knows that there is a bicovariant differential calculus defined by

\[ \Omega_0=\ker\epsilon/\ker\rho_W\circ Q,\qquad \rho_W\circ Q:\Omega_0\cong\mathrm{End}(W). \tag{36} \]

This is part of the construction in [22], where it was shown that such calculi with $\rho_W$ irreducible essentially classify all the coirreducible calculi for factorisable quantum groups such as $C_q[G]$. We let $W^\circ$ be the predual of $W$ as a right comodule.

Proposition 4.6. A dual-quasitriangular Hopf algebra $H$ with calculus defined by $(W,\rho_W)$ is framed by $V=\mathrm{End}(W)=W\otimes W^\circ$ and $e=e_{MC}\circ(\rho_W\circ Q)^{-1}$. We have

\[ e(\phi)h=h_{(1)}e(\rho_-(Sh_{(2)})\circ\phi\circ\rho_+(h_{(3)})),\qquad dh=\cdot(\mathrm{id}\otimes e)(h_{(1)}\otimes\rho_W\circ Q(h_{(2)})-h\otimes\mathrm{id}) \]

for all $h\in H$, $\phi\in\mathrm{End}(W)$. Moreover, there is a natural choice of spinor space, namely $W$, with equivariant $\gamma:V\otimes W\to W$ provided by the identity matrix and

\[ (D\!\!\!\!/\psi)^\alpha=\partial^\alpha{}_\beta\psi^\beta-A(\tilde\pi_{\Omega_0}S^{-1}\rho^\beta{}_\gamma)^\alpha{}_\beta\psi^\gamma, \]

where $\psi^\alpha\in H$ are the spinor components and $A$ is a gauge field.

Proof. With the identification (36) understood, one could write the calculus $\Omega^1(H)$ as

\[ dh=h_{(1)}\rho_W\circ Q(h_{(2)})-h\,\mathrm{id},\qquad \phi h=h_{(1)}\rho_-(Sh_{(2)})\circ\phi\circ\rho_+(h_{(3)}),\qquad \forall h\in H \tag{37} \]

for the exterior derivative and bimodule structure on $\phi\in\mathrm{End}(W)$. In our context this identification is made by the framing and gives the structure shown when we write this explicitly. One may check that $\rho_W\circ Q(hg)=\rho_-(S(hg)_{(1)})\rho_+((hg)_{(2)})=\rho_-(Sg_{(1)})\rho_-(Sh_{(1)})\rho_+(h_{(2)})\rho_+(g_{(2)})=\rho_-(Sg_{(1)})\,\rho_W\circ Q(h)\,\rho_+(g_{(2)})$ which leads to the stated $H$-module structure on $V$. Meanwhile, the right adjoint coaction is known [18] to intertwine under $Q$ with the right coadjoint coaction on $H^*$, which means

\[ \rho_W\circ Q(h_{(2)})^\alpha{}_\beta\otimes(Sh_{(1)})h_{(3)}=\rho_W\circ Q(h)^a{}_b\otimes\rho^\alpha{}_aS\rho^b{}_\beta. \tag{38} \]

In our present setting the equivariance follows easily from the dual-quasitriangularity axioms for $\mathcal R$ provided the coaction maps a dual basis element as $f^\alpha\mapsto f^\beta\otimes S\rho^\alpha{}_\beta$. This means that we identify $V=W\otimes W^\circ$ as stated. It is straightforward to verify that $d$ as stated obeys the Leibniz rule and that we indeed have a calculus. Also from (38) and (30) one obtains easily the braiding in terms of R-matrices $R^\alpha{}_\beta{}^\gamma{}_\delta=\mathcal R(\rho^\alpha{}_\beta\otimes\rho^\gamma{}_\delta)$ and $\tilde R^\alpha{}_\beta{}^\gamma{}_\delta=\mathcal R(\rho^\alpha{}_\beta\otimes S\rho^\gamma{}_\delta)$,

\[ \Psi(\phi\otimes\psi)^\alpha{}_\beta{}^\gamma{}_\delta=R^a{}_\mu{}^\alpha{}_b\,\phi^\mu{}_\nu\,R^b{}_\sigma{}^\nu{}_c\,\psi^\sigma{}_\tau\,R^{-1\,\tau}{}_d{}^c{}_\delta\,\tilde R^\gamma{}_a{}^d{}_\beta,\qquad \forall\phi,\psi\in\mathrm{End}(W), \tag{39} \]

or $\sum_i\psi^i_2R\phi^i_1R_{21}=R\phi_1R_{21}\psi_2$ if $\Psi(\phi\otimes\psi)\equiv\sum_i\psi^i\otimes\phi^i$ in a standard notation. The partial derivatives are as usual the coefficient in $d$ of the basic forms $e(e_\alpha\otimes f^\beta)$, which means

\[ \partial^\alpha{}_\beta(h)=h_{(1)}\rho_W\circ Q(h_{(2)})^\alpha{}_\beta-h\,\delta^\alpha{}_\beta. \tag{40} \]
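
As a consistency check on (40), one can verify that the partial derivatives vanish on $h=1$, as they must since $d1=0$. The sketch below uses only the standard normalisation $\mathcal R(g\otimes 1)=\epsilon(g)=\mathcal R(1\otimes g)$ of a dual quasitriangular structure, which is assumed here rather than quoted from the text:

```latex
\begin{align*}
\langle g,Q(1)\rangle &= \mathcal R(g_{(1)}\otimes 1)\,\mathcal R(1\otimes g_{(2)})
   = \epsilon(g_{(1)})\,\epsilon(g_{(2)}) = \epsilon(g),
   && \text{so } Q(1)=\epsilon \text{ is the unit of } H^*,\\
\partial^\alpha{}_\beta(1) &= \rho_W\circ Q(1)^\alpha{}_\beta-\delta^\alpha{}_\beta
   = \delta^\alpha{}_\beta-\delta^\alpha{}_\beta = 0 .
\end{align*}
```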

We define the gamma-matrices as projectors in $W\otimes W^\circ$ acting by evaluation (or $\gamma:V\to W\otimes W^\circ$ the identity map). Thus $\gamma_\beta{}^\alpha(\psi)=e_\beta\psi^\alpha$ and $\gamma_\beta{}^\alpha=e_\beta\otimes f^\alpha$, giving $D\!\!\!\!/$ as stated.

We turn now to the construction of a natural “Killing form” metric. In fact we will give a self-contained quantum-group construction which avoids braided categories, but the following is the picture behind it. Thus, it was shown in [22] that for all such calculi the dual of $\Omega_0$ forms a braided-Lie algebra $L$ in the sense of [14]. These are modelled on the properties of the 1-dimensional extension $g\oplus k.c$ when $g$ is an ordinary Lie algebra; there is a coproduct $\Delta:L\to L\otimes L$ and an extended bracket $[\ ,\ ]:L\otimes L\to L$ and everything lives in a braided category (classically we would extend by $[\xi,c]=\xi$, $[c,\xi]=0$ for $\xi\in g$ and $[c,c]=c$ with $\Delta c=c\otimes c$, $\Delta\xi=\xi\otimes c+c\otimes\xi$ for the coproduct, and have a trivial braiding). The main “pentagonal Jacobi identity” axiom of a (right-handed) braided Lie algebra is shown in Fig. 1(a) in a diagrammatic notation [18] with operations flowing down the page and with the braid-crossing denoting the “background braiding” of the category. The axioms for $(L,[\ ,\ ],\Delta)$ and a counit $\epsilon$ are strong enough to define an additional “double” braiding $\Psi$ shown in Fig. 1(b) and from this an enveloping algebra $U(L)$ as a bialgebra or “braided group” in the braided category. This is defined as the quadratic algebra generated by $L$ with relations of symmetry with

Fig. 1. Pentagonal axiom (a) of a braided-Lie algebra. Induced “double braiding” (b), braided-Killing form and its braided symmetry (c) and construction (d) as η = ΔT in our formulation. [String diagrams not reproduced in this text version.]

respect to (i.e. setting to zero the image of id − ) and coproduct extending on L (classically this would recover a quadratic extension of the usual U (g)). There is also a braided-Killing form η in Fig. 1(c) which is shown there to be braided-symmetric in the sense η = η ◦ . Here ∪ and ∩ are evaluation and coevaluation of L with a suitable dual. The braided-Killing form η classically restricts to the usual one on g and η(c, c) = 1. Thus Lie theory is contained as a special class of braided-Lie algebras and acquires extra structure such as the double braiding . In our case L = W ∗ ⊗ W = V ∗ in the preceding proposition with basis {x α β = α f ⊗ eβ } and x α β = x α γ ⊗ x γ β has a matrix form. The Lie bracket [ , ] is given in [14] in an R-matrix form as well as the background braiding defined by R. The double braiding is the adjoint of (39) for the exterior algebra and correspondingly the enveloping bialgebra U (L) is the left-handed braided matrices BL (R) with relations x2 Rx1 R21 = Rx1 R21 x2 . There is an algebra map U (L) → H ∗ sending x α β to ρW ◦ Q( )α β ∈ H ∗ and we identify the image of L with ∗0 by the counit projection to f α β = ρW ◦ Q( )α β − δ α β ∈ H ∗

(41)

adjoint to the restriction to ker ⊂ H in (36). The braided Killing form on L ⊗ L can then be viewed in 0 ⊗ 0 . We now give a version of this construction directly in our setting. Theorem 4.7. Let H be a dual-quasitriangular Hopf algebra with differential calculus as above. There is a braided-symmetric and Ad-invariant “braided-Killing form” η 0 = (π˜ 0 ⊗ π˜ 0 )T ∈ 0 ⊗ 0 ;

T = R(τ (1) ⊗ Sτ (2) )τ (3) ,

τ = ρ α α Sρ β β ∈ H,

η = (ρW ◦ Q ⊗ ρW ◦ Q)(T ) − id ⊗ ρW ◦ Q(T ) − ρW ◦ Q(T ) ⊗ id + (T )id ⊗ id ∈ V ⊗ V . If nondegenerate, there is a braided-symmetric Riemannian metric g = (e ⊗H e)η with ∧(g) = 0.

Proof. The two applications of [ , ] in the “figure of eight” braided trace in Fig. 1(c) can be written as a product in U (L) followed by a single [ , ] and this dualises to the coproduct of U (L)∗ applied to the element T in Fig. 1(d) (after some convention adjustments). This coproduct of U (L)∗ is essentially that of H , so we have η = (ρW ◦ Q ⊗ ρW ◦ Q)T ∈ V ⊗ V ,

(42)

where we use ρW ◦ Q to map H to V . This is the natural object from the braided-Lie theory and we will see that it has the stated features of η, however for our geometrical application we have to first project T down to 0 ⊗ 0 which is η 0 as stated (we have done the same in previous sections in the expression (id ⊗ π˜ 0 )Ad : 0 → 0 ⊗ 0 dual to [ , ]). Or by (36) we apply the counit projection π to T and then ρW ◦ Q to give the corresponding element of V ⊗ V . We now directly verify the properties of η and hence η. Notice first that if {ea } is a basis of V and {f a } a dual basis of V ∗ , let ¯ ¯ , f a ea (2) . τ = ea (1)

(43)

Cyclicity of the trace here appears as the following fact: τ (1) ⊗ τ (2) ⊗ . . . ⊗ τ(n) = τ(n) ⊗ τ (1) ⊗ · · · ⊗ τ(n−1) .

(44)

This is because the first expression may be written as the trace of n applications of the coaction, ¯ ··· (1) ¯ ··· (1) ¯ (2) ¯ ¯ ¯ (2) ¯ ¯ , f a ea (1) ⊗ · · · ⊗ ea (1) ⊗ ea (2) . τ = ea (1) ¯ ¯ is equivalent due to equivariance of the duality pairing to f a (1) and The outermost (1) ¯ ¯ ¯ ¯ ¯ ¯ ¯ ¯ (1) ··· (1) (2) a (2) a (1) a (2) a (1) ea replaced by Sf . On the other hand f ⊗ ea ⊗ Sf = f ⊗ ea ⊗ ea (2) a by a change of basis (or invariance of the coevalution element f ⊗ ea ). Hence we may ¯ ¯ replace the coaction on f a by an innermost coaction ea (1) , putting an extra (1) in all ¯ ¯ a (2) (2) the other places and replacing Sf by e . Converting the iterated coactions back to coproducts gives the cyclicity property (in fact one needs only a coalgebra for the cyclicity with the appropriate adjoint operation in the role of S). In our case V = W ⊗ W ◦ with the coaction given in the preceding proposition. Then ¯ ¯ τ = (eα ⊗ f β )(1) , f α ⊗ eβ (eα ⊗ f β )(2) = f α ⊗ eβ , ea ⊗ f b ρ a α Sρ β b = ρ α α Sρ β β .

Now we compute the figure-of eight braided trace, which is fairly routine [18]. We ¯ ¯ read Fig. 1(d) from the top down, starting with f a ⊗ ea . This becomes f a ⊗ ea (1) ⊗ ea (2) . We then apply the background braiding to the first two places and evaluation, to find ¯ (1) ¯ (2) ¯ ¯ ¯ ¯ ¯ T = ea (1) , f a (1) R(f a (2) ⊗ ea (1) )ea (2) ¯ (1) ¯ ¯ (1) ¯ (2) ¯ (1) ¯ ¯ (2) ¯ ¯ = ea (1) , f a R(S −1 ea (2) ⊗ ea (1) )ea (1)

= τ (2) R(S −1 τ (3) ⊗ τ (1) ) = R(τ (1) ⊗ Sτ (2) )τ (3) = v(τ (1) )τ (2) using that R is S ⊗ S-invariant and cyclicity (44) again. Here v(h) = R(h(1) ⊗ Sh(2) ) implements S −2 by convolution [18]. Next, AdT = R(τ (1) ⊗ Sτ (2) )τ (4) ⊗(Sτ (3) )τ (5) = R(τ (2) ⊗ Sτ (3) )τ (5) ⊗(Sτ (4) )τ (1) = R(τ (1) ⊗ Sτ (4) )τ (5) ⊗ τ (2) Sτ (3) = T ⊗ 1

by cyclicity and the axioms of a dual-quasitriangular structure or that v implements S −2 [18]. So T and hence η are Ad-invariant (since and ρW ◦ Q (and π˜ 0 ) are Adcovariant). Similarly from the cyclicity (44) and the property of v it is clear that ST = T and (id ⊗ S 2 )op T = T so η and hence η are -invariant as in Proposition 4.2. One also has explicit formulae using the definition of Q and the dual-quasitriangularity axioms for R, ηα β γ δ = ub5 b6 Qb6 b1 a2 a3 R˜ a1 a2 a3 a4 Qa4 a5 a b R˜ α a b4 b5 × R −1b3 b4 b β Qa5 a1 c d R˜ γ c b2 b3 R −1b1 b2 d δ , ρW ◦ Q(T )α β = R˜ a1 a2 a3 a4 Qb4 b1 a2 a3 Qa4 a1 a b R −1b1 b2 b β R˜ α a b2 b3 ub3 b4 , (T ) = ub1 b2 Qb2 b1 a2 a3 R˜ a1 a2 a3 a1 , where u = R˜ a · · a and Q = R21 R.

(45)

(46)

The braided-Killing form of the standard quantum groups $C_q[G]$ is closely related to the usual Killing form and is typically nondegenerate for generic $q\neq 1$ (being rational functions in $q$ they need to be nondegenerate at only one point to establish this). Hence the theorem above provides a construction of the metric for such quantum groups and their standard bicovariant differential calculi. We will demonstrate this explicitly for the case of $C_q[SU_2]$ with its standard 4-dimensional calculus (here $W$ is the spin $\frac12$ representation). The exterior algebra in this case is well-known and in our conventions is as follows. We let $e(e_1\otimes f^1)=e_a$, $e(e_1\otimes f^2)=e_b$, etc., and $\theta=e_a+e_d$. Then $e_a,e_b,e_c$ behave like usual forms or Grassmann variables and

\begin{gather*}
e_a\wedge e_d+e_d\wedge e_a+\lambda e_c\wedge e_b=0,\qquad e_d\wedge e_c+q^2e_c\wedge e_d+\lambda e_a\wedge e_c=0,\\
e_b\wedge e_d+q^2e_d\wedge e_b+\lambda e_b\wedge e_a=0,\qquad e_d^2=\lambda e_c\wedge e_b,\qquad d=[\theta,\ \},\\
e_a\begin{pmatrix}a&b\\ c&d\end{pmatrix}=\begin{pmatrix}qa&q^{-1}b\\ qc&q^{-1}d\end{pmatrix}e_a,\\
[e_c,b]=[e_c,d]=[e_b,a]=[e_b,c]=0,\\
[e_c,a]=q\lambda\,be_a,\qquad [e_b,b]=q\lambda\,ae_a,\qquad [e_c,c]=q\lambda\,de_a,\qquad [e_b,d]=q\lambda\,ce_a,\\
[e_d,a]_{q^{-1}}=\lambda be_b,\qquad [e_d,b]_q=\lambda ae_c+q\lambda^2be_a,\\
[e_d,c]_{q^{-1}}=\lambda de_b,\qquad [e_d,d]_q=\lambda ce_c+q\lambda^2de_a,
\end{gather*}

where $[x,y]_q=xy-qyx$ and $\lambda=(1-q^{-2})$, and $a,b,c,d\in SU_q(2)$. Note that the $\wedge$ relations are essentially those for the exact differentials on $q$-Minkowski space [18, Sect. 10.5] given by the braid statistics for the addition law on that, as must be the case because $B_L(R)$ is the coordinate algebra of $q$-Minkowski space as well as $U(L)$, see [14, 22].


Proposition 4.8. Let [n]q = (1 − q n )/(1 − q). The braided-Killing form for the spin differential calculus on Cq [SU2 ], divided by (q − 1)2 is η = q −12 [8]q [2]q ηK − λ q −9 [3]q [2]q 2 − [2]2q q −2 θ ⊗ θ,

1 2

ηK = ec ⊗ eb + q 2 eb ⊗ ec +

(ea ⊗ ea − qea ⊗ ed − qed ⊗ ea + q(q 2 + q − 1)ed ⊗ ed ) . [2]q

Proof. We use the R-matrix formulae obtained in Theorem 4.7. In fact the difference between η and η is a multiple of θ ⊗ θ = id ⊗ id so only affects the second term here. One has ρW ◦ Q(T ) = id(2 + q −4 + q −8 ) and (T ) = (1 + q −2 )(1 + q −4 ) and their subtraction from η makes ηK the leading term and θ ⊗ θ O(q − 1) relative to it. This ηK is a q-deformation of ρW of the usual split Casimir X+ ⊗ X− + X− ⊗ X+ + ½ H ⊗ H, as it should be. The θ ⊗ θ is a kind of “null mode” that does not affect the geometry too much. One can take any coefficient for it in the metric subject to invertibility.

5. Finite Riemannian Geometry

In this section we apply the general results above to the special case of M = C[G], the algebra of functions on a finite group G. We first specialise the results of Sect. 3 to M = C[Σ] for Σ a finite set and list the main formulae of Riemannian geometry for this case in a self-contained manner that could be put on a computer. We then concentrate on the group case as a good source of examples where there are clear choices for the differential structures, etc. Finally, we compute everything for the permutation group S3, including solving for a canonical torsion-free, cotorsion-free or “Levi–Civita” spin connection on it.

5.1. Riemannian geometry on finite sets. Here we will see that even finite sets can be endowed with a rich variety of “manifold” structures using the framework of Sect. 3. In fact it is not true that every differential calculus on a finite set is parallelizable (see below); i.e. there may be a still more general theory over finite sets where we specialise the global constructions of Sect. 2. This is not relevant to the finite group case which is our main goal, and will therefore be considered elsewhere.
On the other hand, we keep the fiber of the frame bundle to be a Hopf algebra $H$ equipped with a bicovariant differential calculus defined by $\Omega_0$ of dimension $n$, since no special simplification is afforded by specialising further for the tensor product bundle. To be as concrete as possible (we have in mind actual matrix computations for numerical gravity on finite sets) let us assume that $H$ is finite-dimensional and choose a basis $\{e_i\}$ for it with $e_0 = 1$ and $\tilde\pi_{\Omega_0}(e_i) = e_i$ for $1 \le i \le n$, with the image here a basis of $\Omega_0$ (and zero otherwise). In this way we identify $\Omega_0$ with its lift in $H$. The dual basis $\{f^i\}$ similarly splits, with $1 \le i \le n$ a basis of $\Omega_0^*$. The coproduct is of course

\[ \Delta e_i = c_i{}^{jk}\, e_j \otimes e_k \tag{47} \]

for some structure constants. Finally, we write right $H$-comodules $V$ explicitly as left actions $\rho_V$ of $H^*$. We define (since we typically convert right actions to left ones by $S^{-1}$) the matrices

\[ \tau^i = \rho_V(S^{-1} f^i). \tag{48} \]


In fact the formulae below in the tensor product bundle depend only on this coalgebra and the choice of quotient space (so that similar formulae hold for coalgebra bundles [12] as well, except that we would specify the matrices $\tau^i$ or the right action of $H^*$ directly).

Next, we let $\Sigma$ be a finite set and $M = \mathbb{C}[\Sigma]$, spanned by delta-functions $\{\delta_x\}$ for $x \in \Sigma$. It is easy to see (and well known) that a general differential calculus $\Omega^1(M)$ corresponds to a subset

\[ E \subseteq \Sigma \times \Sigma - \text{diagonal}, \qquad \Omega^1(M) = \{\delta_x \otimes \delta_y \mid (x,y) \in E\} = \mathbb{C}E, \tag{49} \]

where we set to zero delta-functions corresponding to the complement of $E$ and identify the remainder with their lifts as shown. If $f = \sum_x f_x \delta_x$ is a function with components $f_x$, then $df$ has components $(df)_{x,y} = f_y - f_x$ for $(x,y) \in E$.

Lemma 5.1. A $V$-bein for a finite set $\Sigma$ is a vector space on which $H$ coacts and 1-forms

\[ E_a = \sum_{(x,y) \in E} E_{a,x,y}\, \delta_x \otimes \delta_y \]

for each element of a basis $\{e_a\}_{a \in I}$ of $V$ such that the matrices $\{E_{a,x,y}\}$ are invertible for each $x \in \Sigma$ held fixed. A necessary and sufficient condition for the existence of a $V$-bein is that $E$ is fibred over $\Sigma$, which implies in particular that $|E| = |\Sigma| \dim(V)$.

Proof. We write $E_a = e(e_a)$, etc. In principle we require the matrices $s_e{}^{z,a}{}_{x,y} = \delta^z_x E_{a,x,y}$ to be invertible as maps $M \otimes V \to \Omega^1(M)$, but since they are left $M$-module maps (or from their special form) we know that their inverses must also be left $M$-module maps and hence of the form $(s_e^{-1})^{x,y}{}_{z,a} = \delta_x^z E_a^{-1\,x,y}$ for a collection of matrices $E_a^{-1\,x,y}$ inverse to the $E_{a,x,y}$ for each $x$. This requires in particular that for each $x \in \Sigma$ the set $F_x = \{y \mid (x,y) \in E\}$ has the same size, namely the dimension of $V$, i.e. that $E$ is a fibration over $\Sigma$ (and $E_{a,x,y}$ is a trivialisation of the vector bundle with fiber $\mathbb{C}F_x$ over $x$). The fibration is also sufficient for the existence of a trivialisation since bundles over finite sets are trivial. Indeed, a natural “local” class of $V$-beins is just given by any collection of bijections $s_x : I \cong F_x$ with $E_{a,x,y} = \delta_{s_x(a),y}$.

Similarly a $V$-cobein is a collection of 1-forms with components $E^{*a}_{x,y}$ with respect to a dual basis $\{f^a\}$ and with the matrices $\{E^{*a}_{x,y}\}$ invertible for each $y \in \Sigma$ held fixed. The metric is then

\[ g = \sum_{(x,y,z) \in F} g_{x,y,z}\, \delta_x \otimes \delta_y \otimes \delta_z, \qquad g_{x,y,z} = E^{*a}_{x,y} E_{a,y,z}, \tag{50} \]
\[ F = \{(x,y,z) \in \Sigma^3 \mid (x,y), (y,z) \in E\}, \]

where $\Omega^1(M) \otimes_M \Omega^1(M) = \mathbb{C}F$. Moreover, a connection or gauge field with values in the dual of $\Omega_0$ is clearly a collection of 1-forms with components $A_{i,x,y}$. In our case $H$ coacts on $V$ so that it plays the role of frame transformations in the frame bundle approach. In that case $A$ induces a covariant derivative on 1-forms

\[ (\nabla\alpha)_{x,y,z} = (\alpha^a_y - \alpha^a_x) E_{a,y,z} - \alpha^a_x A_{i,x,y} E_{b,y,z}\, \tau^{i\,b}{}_a, \tag{51} \]

where $\alpha = \alpha^a E_a$ defines the component functions $\alpha^a$ of a 1-form $\alpha$ in the $V$-bein basis. Next we specify $\Omega^2(M)$ by a bimodule surjection $\wedge : \Omega^1(M) \otimes_M \Omega^1(M) \to \Omega^2(M)$.
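The finite-set data introduced so far (the edge set $E$ of (49), the differential $df$, and the fibration condition of Lemma 5.1) are simple enough to put on a computer directly. The following is only a minimal sketch under our own conventions: the function names, the choice of a 4-point example, and the ordering used for the bijections $s_x$ are illustrative assumptions, not part of the text.

```python
def fibres(points, edges):
    """F_x = {y | (x, y) in E} for every point x."""
    return {x: sorted(y for (a, y) in edges if a == x) for x in points}


def is_fibred(points, edges):
    """Condition of Lemma 5.1: every fibre F_x has the same size."""
    sizes = {len(f) for f in fibres(points, edges).values()}
    return len(sizes) == 1


def d(f, edges):
    """(df)_{x,y} = f_y - f_x on each allowed pair (x, y) in E."""
    return {(x, y): f[y] - f[x] for (x, y) in edges}


def local_vbein(points, edges):
    """E_{a,x,y} = delta_{s_x(a), y}, with s_x: I -> F_x taken
    (an arbitrary choice) to be the sorted order of each fibre."""
    F = fibres(points, edges)
    dim = len(next(iter(F.values())))
    return {(a, x, y): int(F[x][a] == y)
            for a in range(dim) for x in points for y in F[x]}


# Hypothetical example: four points on a circle, each joined to both neighbours.
points = [0, 1, 2, 3]
edges = {(x, (x + 1) % 4) for x in points} | {(x, (x - 1) % 4) for x in points}
f = {0: 1.0, 1: 4.0, 2: 9.0, 3: 16.0}

print(is_fibred(points, edges))   # fibres all have size dim(V) = 2
print(d(f, edges)[(0, 1)])        # f_1 - f_0
```

Here the calculus is 2-dimensional in the sense that $|E| = |\Sigma| \dim(V)$ with $\dim(V) = 2$; dropping a single edge breaks the fibration and hence the existence of a $V$-bein.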


Lemma 5.2. The surjections $\wedge$ are necessarily given by quotients $V_{x,z}$ of the spaces $\mathbb{C}F_{x,z}$, where $F_{x,z} = \{y \in \Sigma \mid (x,y,z) \in F\}$, such that the image of the vector $(1,1,\dots,1)$ is zero whenever $(x,z) \notin E$ with $x \neq z$. Explicitly,

\[ (\wedge f)_{x,\alpha,z} = \sum_{y \in F_{x,z}} f_{x,y,z}\, p_{x,z}{}^y{}_\alpha \tag{52} \]

for a family of matrices $p_{x,z}$ with respect to a basis $\{e_\alpha\}$ of each $V_{x,z}$ and with rows summing to zero when $(x,z) \notin E$ with $x \neq z$.

Proof. We require to quotient $\Omega^1(M) \otimes_M \Omega^1(M) = \mathbb{C}F$ by a subbimodule. This must therefore take the form shown for some surjections $p_{x,z}$. The additional stated condition is for $d^2 = 0$ (so the maximal prolongation will be $V_{x,z} = \mathbb{C}F_{x,z}$ when $(x,y) \in E$ and $\mathbb{C}F_{x,z}/\mathbb{C}.(1,1,\dots,1)$ otherwise). The argument is similar to that in [20].

There may be additional restrictions imposed by requiring the $\Omega^2(M)$ to be part of a global $\Omega^2(P)$ as explained in Sect. 3. When $\alpha_{x,y}$, $\beta_{x,y}$ are the components of 1-forms as above then

\[ (d\alpha)_{x,\alpha,z} = \sum_{y \in F_{x,z}} (\alpha_{x,y} + \alpha_{y,z} - \alpha_{x,z})\, p_{x,z}{}^y{}_\alpha, \qquad (\alpha \wedge \beta)_{x,\alpha,z} = \sum_{y \in F_{x,z}} \alpha_{x,y} \beta_{y,z}\, p_{x,z}{}^y{}_\alpha. \tag{53} \]

With such an explicit description of $\Omega^2(M)$ it is clear that a connection $A$ is regular if

\[ \sum_{1 \le j,k \le n;\, y} c_i{}^{jk} A_{j,x,y} A_{k,y,z}\, p_{x,z}{}^y{}_\alpha = 0, \qquad \forall q \notin \mathcal{C} \cup \{e\}. \tag{54} \]

Its curvature is

\[ F_{i,x,\alpha,z} = (dA_i)_{x,\alpha,z} + \sum_{1 \le j,k \le n;\, y} c_i{}^{jk} A_{j,x,y} A_{k,y,z}\, p_{x,z}{}^y{}_\alpha. \tag{55} \]

The actual Riemann tensor is the 2-form valued operator on 1-forms,

\[ R_{x,\alpha,z}{}^a{}_b = F_{i,x,\alpha,z}\, \tau^{i\,a}{}_b, \qquad R\alpha = \alpha^a R^b{}_a \otimes_M E_b. \tag{56} \]

Meanwhile, the zero torsion and zero cotorsion equations are vanishing of

\[ (\bar D \wedge E)_{a,x,\alpha,z} = (dE_a)_{x,\alpha,z} + \sum_{i,b,y} A_{i,x,y} E_{b,y,z}\, p_{x,z}{}^y{}_\alpha\, \tau^{i\,b}{}_a, \tag{57} \]
\[ (D \wedge E^*)^a_{x,\alpha,z} = (dE^{*a})_{x,\alpha,z} + \sum_{i,b,y} E^{*b}_{x,y} A_{i,y,z}\, p_{x,z}{}^y{}_\alpha\, \tau^{i\,a}{}_b. \tag{58} \]

Also, a “lift” $i : \Omega^2(M) \to \Omega^1(M) \otimes_M \Omega^1(M)$ is given similarly to the discussion above by a collection of inclusions $i_{x,z} : V_{x,z} \to \mathbb{C}F_{x,z}$ or a family of rectangular matrices $i_{x,z}{}^\alpha{}_y$. We let $\pi_{x,z} = i_{x,z} \circ p_{x,z}$ so that $\pi^y{}_w = p^y{}_\alpha\, i^\alpha{}_w$ at each $x, z$. If $i$ is a true lift so that $p \circ i = \mathrm{id}$ then $i \circ \wedge$ is a projection splitting $\Omega^1(M) \otimes_M \Omega^1(M)$ into something isomorphic to $\Omega^2(M)$ plus a complement and the $\pi_{x,z}$ are likewise a family of projection matrices. We do not want to strictly assume this, however. Given $i$, we have an interior product and, in particular, a Ricci tensor

\[ \mathrm{Ricci}_{x,y,z} = i(F_i)^{ab}_x\, E_{b,x,y} E_{c,y,z}\, \tau^{i\,c}{}_a. \tag{59} \]

Here $i(F_i)_{x,w,z}$ in $\Omega^1(M) \otimes_M \Omega^1(M)$ is as in (55) but with $\pi_{x,z}{}^y{}_w$ in place of the $p_{x,z}{}^y{}_\alpha$ written there. The $V$-bein components here are defined by $i(F_i)_{x,y,z} = i(F_i)^{ab}_x E_{a,x,y} E_{b,y,z}$ as usual. Finally, gamma-matrices are a collection of matrices $\gamma_a$ acting on spinors $\psi$, which are functions with values in a vector space $W$ on which $H$ coacts by $\rho_W$, say. We define the corresponding matrices $\tau_W^i$ as above. Then the associated Dirac operator is

\[ \not{D} = \partial^a \gamma_a - A_i^a\, \gamma_a\, \tau_W^i. \tag{60} \]

For the case when $H = \mathbb{C}[G]$ it is actually useful to choose a different basis for $H$ that better reflects the group structure, namely we label the basis by the group elements themselves (so $e_i$ is the delta-function at $i \in G$ and $c_i{}^{jk} = \delta_i^{jk}$). This has the same form as above except that the old $e_0$ above is the sum of all the new basis elements. The role of $e_1, \dots, e_n$ is played by $e_i$ for $i \in \mathcal{C}$, a subset of order $n$ not containing the identity element $e \in G$ (see below), which is purely a notational change. All the formulae above have the same form in this case except the regularity and curvature equations, for which one has to make a careful change of basis (or use the form of $\tilde\pi_{\Omega_0}$ in the new basis as given in the next section). One has instead

\[ \sum_{jk=q,\, y} A_{j,x,y} A_{k,y,z}\, p_{x,z}{}^y{}_\alpha = 0, \qquad \forall q \notin \mathcal{C} \cup \{e\}, \tag{61} \]
\[ F_{i,x,\alpha,z} = (dA_i)_{x,\alpha,z} + \sum_{jk=i,\, y} A_{j,x,y} A_{k,y,z}\, p_{x,z}{}^y{}_\alpha - \sum_{j,y} (A_{j,x,y} A_{i,y,z} + A_{i,x,y} A_{j,y,z})\, p_{x,z}{}^y{}_\alpha, \tag{62} \]

respectively, with $i, j, k \in \mathcal{C}$.

Whereas the above tensorial formulae are suitable for numerical computations, let us note finally that we also have more algebraic “Cartan calculus” formulae based on (21). Thus,

\[ E_a f = \rho_a{}^b(f) E_b, \qquad df = [\theta, f], \qquad \rho_a{}^b(f)(x) = \sum_{y \in F_x} E_b^{-1\,x,y} f(y) E_{a,x,y}, \]
\[ \theta = \theta^a E_a, \qquad \theta^a(x) = \sum_{y \in F_x} E_a^{-1\,x,y} \tag{63} \]

for all functions $f$. For $\Omega^2(M)$ we can build $\wedge$ from a $G$-equivariant projector $\pi(e_a \otimes e_b) \equiv \pi_{ab}{}^{cd}\, e_c \otimes e_d$ on $V \otimes V$. Then

\[ \pi_{x,z}{}^y{}_w = \pi_{ab}{}^{cd}\, E_a^{-1\,x,y} E_b^{-1\,y,z} E_{c,x,w} E_{d,w,z} \tag{64} \]

for the above family of projection matrices. This imposes constraints on $(\pi, E)$ and defines a moduli space of $G$-parallelizable manifold structures on a finite set of a given order.


5.2. Riemannian geometry on finite groups. We now specialise further to the case $M = H = \mathbb{C}[G]$ with the same bicovariant differential calculus on both. This gives a nontrivial setting at the level of finite groups. In principle one obtains “geometric invariants” of finite groups equipped with a differential calculus (i.e. a conjugacy class), which is certainly of independent mathematical interest as well as of physical interest as a simple toy setting for finite gravity. As mentioned above, the coirreducible calculi are classified immediately from [4] by nontrivial conjugacy classes $\mathcal{C} \subset G$. In fact we do not need to assume that the calculus is irreducible and hence in what follows $\mathcal{C}$ is any Ad-stable subset not containing the group unit element $e \in G$. We denote the elements of $\mathcal{C}$ by $a, b, c$, etc. Then

\[ Q_H = \{\delta_q \mid q \neq e,\ q \notin \mathcal{C}\}, \qquad \Omega_0 = \{\delta_a \mid a \in \mathcal{C}\} = \mathbb{C}\mathcal{C}, \qquad \mathrm{Ad}(\delta_a) = \sum_{g \in G} \delta_{gag^{-1}} \otimes \delta_g, \tag{65} \]
\[ df = \sum_{a \in \mathcal{C}} (\partial^a f) \cdot E_a, \qquad \partial^a = R_a - \mathrm{id}, \qquad E_a \cdot f = R_a(f) \cdot E_a, \tag{66} \]

where $R_a(f) = f((\ )a)$. In this description we identify a basis element $e_a$ of $\Omega_0$ with a fixed lift $\delta_a \in \ker \epsilon$, which is an Ad-invariant identification. The projection from $\mathbb{C}[G]$ to $\Omega_0$ is then

\[ \tilde\pi_{\Omega_0}(\delta_g) = \begin{cases} \delta_g & \text{if } g \in \mathcal{C} \\ -\sum_{a \in \mathcal{C}} \delta_a & \text{if } g = e \\ 0 & \text{else.} \end{cases} \tag{67} \]

The elements of $\Omega_0$ viewed in $\Omega^1(H)$ are the values of the Maurer–Cartan form $e : \Omega_0 \to \Omega^1(H)$,

\[ E_a = e(\delta_a) = \pi_{N_H}\Bigl( \sum_{g \in G} \delta_g \otimes \delta_{ga} \Bigr), \qquad N_H = \{\delta_g \otimes \delta_{gq} \mid g \in G,\ q \neq e,\ q \notin \mathcal{C}\}. \tag{68} \]

In terms of the general finite set case, we have a local form of the $V$-bein and the action,

\[ s_x(a) = xa, \qquad E_{a,x,y} = \delta_{xa,y}, \qquad \tau^{a\,b}{}_c = \delta^b_{a^{-1}ca} - \delta^b_c. \tag{69} \]

The rest of our treatment in the finite group case is more easily handled in the “Cartan calculus” form at the end of Sect. 5.1, i.e. by algebraic relations among the $\{E_a\}$ generators of the entire exterior algebra rather than in “spacetime coordinates” $\alpha_{x,y}$, etc. Thus, the higher exterior algebra is generated [4] by the relations at the first order and the additional relations implied by the braiding

\[ \Psi(E_a \otimes_H E_b) = E_{aba^{-1}} \otimes_H E_a. \tag{70} \]

Thus, in $\Omega^2(H)$ the quotient to the wedge product consists in setting to zero all linear combinations invariant under $\Psi$. In particular, for all $g \in G$ the elements $\sum_{ab=g} E_a \otimes E_b$ are invariant (after a change of variables), hence these along with the clearly invariant $E_a \otimes E_a$ give some immediate relations

\[ \sum_{a,b \in \mathcal{C},\ ab=g} E_a \wedge E_b = 0 \quad \forall g \in G, \qquad E_a \wedge E_a = 0. \tag{71} \]
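Since the braiding (70) simply permutes the basis elements $E_a \otimes_H E_b$, its invariant combinations are spanned by the sums over its orbits, and the number of independent 2-forms $E_a \wedge E_b$ is $|\mathcal{C}|^2$ minus the number of orbits. The sketch below counts these orbits for the transposition class of $S_3$, which is the example treated in Sect. 5.3; the representation of group elements as permutation tuples and all function names are our own illustrative choices.

```python
from itertools import permutations, product


def mul(p, q):
    """Product pq of permutations given as tuples, acting as i -> p[q[i]]."""
    return tuple(p[i] for i in q)


def inv(p):
    """Inverse permutation."""
    r = [0] * len(p)
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)


def braid(a, b):
    """Psi on basis labels: E_a (x) E_b -> E_{a b a^-1} (x) E_a, as in (70)."""
    return (mul(mul(a, b), inv(a)), a)


def psi_orbits(cls):
    """Orbits of Psi acting on the pairs (a, b) in C x C."""
    seen, orbits = set(), []
    for pair in product(cls, repeat=2):
        if pair in seen:
            continue
        orbit, cur = [], pair
        while cur not in seen:
            seen.add(cur)
            orbit.append(cur)
            cur = braid(*cur)
        orbits.append(orbit)
    return orbits


def dim_omega2(cls):
    """Number of independent 2-forms after quotienting by Psi-invariants."""
    return len(cls) ** 2 - len(psi_orbits(cls))


S3 = list(permutations(range(3)))
# Transpositions are the permutations of three points with exactly one fixed point.
transpositions = [p for p in S3 if sum(p[i] == i for i in range(3)) == 1]

print(dim_omega2(transpositions))
```

Each orbit consists of pairs with a common product $ab = g$, which is why the orbit sums reproduce the relations (71): three singleton orbits give $E_a \wedge E_a = 0$ and the two orbits of length three give the relations for the two elements of order 3.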


Using these relations and (67) the Maurer–Cartan equation on any Hopf algebra becomes

\[ dE_a - \sum_b (E_a \wedge E_b + E_b \wedge E_a) = 0. \tag{72} \]

Meanwhile, the partial derivatives trivially obey

\[ \partial^a(mn) = m\, \partial^a n + (\partial^a m) R_a(n), \qquad \partial^a \partial^b = \partial^{ab} - \partial^a - \partial^b \tag{73} \]

as $R_{ab} = R_a R_b$, where we extend the same definitions to $ab \in G$. One can also write

\[ \partial^a \partial^b - \partial^b \partial^{b^{-1}ab} = \partial^{b^{-1}ab} - \partial^a \tag{74} \]

as some form of Lie algebra [4]; however, such a point of view can only be taken so far, and we do not use it. Rather, the $\partial^a$ form a representation of a braided-Lie algebra [14]. The above formulae, with the exception of our notations such as (67) and the observation (71), are all immediate from the general theory of [4] and are the starting point of any quantum-groups inspired noncommutative geometry on finite groups. We will also need a fuller description of $\Omega^2(H)$ in the finite group case, provided by the following lemma.

Lemma 5.3. For all $g \in G$ let $P_g = (\mathbb{C}\,\mathcal{C} \cap g\mathcal{C}^{-1})^\sigma$ be the invariant subspace of the vector space with basis $\mathcal{C} \cap g\mathcal{C}^{-1}$, where $\sigma$ sends a basis element $a$ to $a^{-1}g$. Let $\{\lambda^{g,\alpha}\}$ be a basis of $P_g$. Then the relations of $\Omega^2(H)$ are

\[ \sum_{a,b \in \mathcal{C},\ ab=g} \lambda^{g,\alpha}_a\, E_a \wedge E_b = 0, \qquad \forall \alpha,\ \forall g \in G. \]

Proof. For any $\lambda \in P_g$, we clearly have invariance under $\Psi$ as $\sum_{ab=g} \lambda_a E_{aba^{-1}} \otimes_H E_a = \sum_{cd=g} \lambda_{c^{-1}g} E_c \otimes_H E_d = \sum_{ab=g} \lambda_a E_a \otimes_H E_b$, by the $\sigma$-invariance of $\lambda$. Hence relations of the form shown hold in $\Omega^2(H)$ for any basis of $P_g$, for each $g \in G$. One can show that this is a full set of relations after a detailed analysis of the kernel of $\mathrm{id} - \Psi$ in this case.

We are now ready to specialise our results of Sects. 3 and 4 to obtain a theory of Riemannian geometry for finite groups. First of all, as a trivial example of the theory in [14] we may view $\Omega_0^*$ as the image under $\pi$ of a braided-Lie algebra with trivial background braiding and

\[ L = \{x^a \mid a \in \mathcal{C}\} = \mathbb{C}\mathcal{C}, \qquad [x^a, x^b] = x^{b^{-1}ab}, \qquad \Delta x^a = x^a \otimes x^a. \tag{75} \]

The braided enveloping bialgebra $U(L)$ from [14] in this case (because the background braiding is trivial) is actually a usual bialgebra or quantum group without antipode. It comes with a bialgebra homomorphism to the group algebra $\mathbb{C}G$,

\[ U(L) = \mathbb{C}\langle x^a \rangle / \bigl( x^a x^b = x^b x^{b^{-1}ab} \bigr), \qquad p : U(L) \to \mathbb{C}G, \qquad p(x^a) = a. \tag{76} \]

The further projection of the braided-Lie algebra generators to the kernel of the counit gives the basis $\{f^a = a - e\}$ of $\Omega_0^*$ dual to the $e_a = \delta_a$ via Hopf algebra duality. One may also consider “braided gauge theory” with $A$ having values in $L$ rather than in $\Omega_0^*$, but for the present we need this theory mainly to have a braided-Killing form.
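The first-order identities (73) and (74) used throughout this subsection follow from $\partial^a = R_a - \mathrm{id}$ and $R_{ab} = R_a R_b$, and they are easy to confirm numerically. The following sketch checks them on $S_3$ with integer-valued test functions; the helper names and the use of random test functions are our own assumptions for illustration.

```python
import random
from itertools import permutations


def mul(p, q):
    """Product pq of permutations given as tuples, acting as i -> p[q[i]]."""
    return tuple(p[i] for i in q)


def inv(p):
    """Inverse permutation."""
    r = [0] * len(p)
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)


G = list(permutations(range(3)))


def R(a, f):
    """Right translation R_a(f) = f(( )a)."""
    return {g: f[mul(g, a)] for g in G}


def partial(a, f):
    """partial^a = R_a - id."""
    shifted = R(a, f)
    return {g: shifted[g] - f[g] for g in G}


def leibniz_holds(a, m, n):
    """First identity of (73): d^a(mn) = m d^a n + (d^a m) R_a(n)."""
    mn = {g: m[g] * n[g] for g in G}
    lhs = partial(a, mn)
    da_m, da_n, Ra_n = partial(a, m), partial(a, n), R(a, n)
    return all(lhs[g] == m[g] * da_n[g] + da_m[g] * Ra_n[g] for g in G)


def second_order_holds(a, b, f):
    """Second identity of (73): d^a d^b = d^{ab} - d^a - d^b."""
    lhs = partial(a, partial(b, f))
    dab, da, db = partial(mul(a, b), f), partial(a, f), partial(b, f)
    return all(lhs[g] == dab[g] - da[g] - db[g] for g in G)


def lie_like_holds(a, b, f):
    """Identity (74): d^a d^b - d^b d^{b^-1 a b} = d^{b^-1 a b} - d^a."""
    c = mul(mul(inv(b), a), b)
    lhs1, lhs2 = partial(a, partial(b, f)), partial(b, partial(c, f))
    dc, da = partial(c, f), partial(a, f)
    return all(lhs1[g] - lhs2[g] == dc[g] - da[g] for g in G)


random.seed(0)
m = {g: random.randint(-9, 9) for g in G}
n = {g: random.randint(-9, 9) for g in G}
print(all(leibniz_holds(a, m, n) for a in G))
```

Integer values keep the comparison exact. The checks run over all $a, b \in G$ rather than only over a conjugacy class, in line with the remark that the same definitions extend to $ab \in G$.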


Proposition 5.4. The braided-Killing form of $L$ is a symmetric positive-integer valued and Ad-invariant bilinear form on the conjugacy class given by

\[ \eta(x^a, x^b) = n(ab) \equiv \#\{c \in \mathcal{C} \mid cab = abc\} = \eta(x^b, x^{b^{-1}ab}), \qquad \forall a, b \in \mathcal{C}. \]

We say that a conjugacy class is semisimple if this associated Killing form is nondegenerate.

Proof. The braided-Killing form is defined as the trace

\[ \eta(x^a, x^b) = \sum_{c \in \mathcal{C}} \langle \delta_c, [[x^c, x^a], x^b] \rangle, \]

which is clearly as shown (the number of $c \in \mathcal{C}$ commuting with $ab$). Its formal properties are part of the general theory of braided-Lie algebras. The relevant braiding in the present case is that of the category of crossed $\mathbb{C}G$-modules and has the form $\Psi(x^a \otimes x^b) = x^b \otimes x^{b^{-1}ab}$ (as above for differentials), so that $\eta$, depending only on the product, is clearly braided-symmetric in the sense $\eta = \eta \circ \Psi$. Hence it is also symmetric in the usual sense (because $S^2 = \mathrm{id}$ in Proposition 4.2). Ad-invariance is clear as well.

Note that the braided-Lie algebra itself is bosonic, as the category of $\mathbb{C}[G]$-comodules in which it lives has a trivial background braiding. Because $\Omega_0^*$ can be identified naturally (and Ad-invariantly) with $L$ viewed inside $\mathbb{C}G$, we pull back this braided-Killing form to obtain an Ad-invariant bilinear form

\[ \eta(f^a, f^b) \equiv \eta(x^a, x^b) = n(ab). \tag{77} \]

Note that this gives a slightly different Killing form than the trace of $\mathrm{Ad}_{f^a} \mathrm{Ad}_{f^b}$, i.e. taking the “Lie bracket” as the quantum group adjoint action of $f^a = a - e$ in $\mathbb{C}G$, giving instead $\eta(f^a, f^b) = n(ab) - n(a) - n(b) + n(e)$, more similar to Theorem 4.7. This is also Ad-invariant so (if nondegenerate) could also be used to define a metric with essentially the same geometry. Also note that for an Abelian group the conjugacy classes are singletons but we may take $\mathcal{C}$ a collection of these and the same formulae as above for a (reducible) differential calculus. The Killing form will be degenerate in this case but the $\delta$-function provides instead a suitable symmetric and invariant bilinear form. Let

\[ \eta^{ab} = \eta(f^a, f^b), \qquad \eta^{-1}_{ab} = \eta^{-1}(\delta_a, \delta_b) \tag{78} \]

for the braided Killing form and its inverse in our basis. We will write $\alpha = \alpha^a \cdot E_a$ for any 1-form and similarly for the components of higher cotensors. We sum over repeated indices $a, b \in \mathcal{C}$ in tensor expressions.

In this setting the main equations of “quantum group Riemannian geometry” of Sect. 3 become as follows. A framing is a collection of 1-forms $\{E_a\}$ such that every 1-form is a unique linear combination of these with coefficient functions from the left (e.g. as above). A spin connection is a collection $\{A_a\}$ of 1-forms, and the covariant derivative associated to a spin connection and framing on any 1-form $\alpha$ is

\[ \nabla\alpha = d\alpha^a \otimes_H E_a - \alpha^a \sum_b A_b \otimes_H (E_{b^{-1}ab} - E_a). \tag{79} \]


The extra term in $\nabla$ comes from the projection $\tilde\pi_{\Omega_0}$ or equivalently from the fact that the role of “Lie algebra” is being played by the vectors $\{a - e\}$ in the group algebra as explained above. The associated torsion tensor $T : \Omega^1(H) \to \Omega^2(H)$ measures the deviation $T\alpha = d \wedge \alpha - \nabla\alpha$ and the zero-torsion condition is vanishing of

\[ \bar D_A \wedge E_a = dE_a + \sum_b A_b \wedge (E_{b^{-1}ab} - E_a). \tag{80} \]

The curvature $\nabla^2$ associated to a regular connection corresponds to a collection of 2-forms $\{F_a\}$ defined by

\[ F_a = dA_a + \sum_{cd=a,\ c,d \in \mathcal{C}} A_c \wedge A_d - \sum_b (A_b \wedge A_a + A_a \wedge A_b), \tag{81} \]

where regularity in Sect. 3 becomes the condition

\[ \sum_{ab=q,\ a,b \in \mathcal{C}} A_a \wedge A_b = 0, \qquad \forall q \neq e,\ q \notin \mathcal{C}. \tag{82} \]

This ensures that the curvature descends to $\Omega_0$; otherwise it potentially has values $\{F_g\}$ for all $g \neq e$. It is clear that the Maurer–Cartan form can be viewed as a regular connection with zero curvature. For any connection the associated Riemann curvature is the 2-form-valued operator

\[ R\alpha = \alpha^a \sum_b F_b \otimes_H (E_{b^{-1}ab} - E_a) \tag{83} \]

on 1-forms $\alpha = \alpha^a E_a$, according to the correspondence in Sect. 3. To define the Ricci tensor (or to define interior products in general) we need a bimodule inclusion or “lift” $\Omega^2(H) \to \Omega^1(H) \otimes_H \Omega^1(H)$. The obvious one for the bicovariant calculus, although not precisely a lift any more (not covered by $\wedge$), is provided by

\[ i = \mathrm{id} - \Psi, \qquad i(E_a \wedge E_b) = E_a \otimes_H E_b - E_{aba^{-1}} \otimes_H E_a. \tag{84} \]

We provide now another possibility which is actually a lift in a natural manner, so that $i \circ \wedge$ is an actual projection operator on $\Omega^1(H) \otimes_H \Omega^1(H)$.

Proposition 5.5. For $H = \mathbb{C}[G]$ and the bicovariant calculus, there is a canonical splitting of $\wedge$ to a bimodule projection operator, defined by

\[ i(E_a \otimes_H E_b) = E_a \otimes_H E_b - \sum_\alpha \mu^{\alpha,a} \sum_{cd=g} \lambda^\alpha_c\, E_c \otimes_H E_d, \]

where $\mu^\alpha \in P_g$ are a dual basis to the $\lambda^\alpha$ with respect to the dot product as vectors in $\mathbb{C}\,\mathcal{C} \cap g\mathcal{C}^{-1}$, and $g = ab$ is fixed.

Proof. Here the summed terms vanish under $\wedge$ by Lemma 5.3, so that $i$ as stated indeed splits this for any choice of coefficients $\mu^\alpha$. We choose these to be the dual basis to the $\lambda^\alpha$ so that $\sum_{a \in \mathcal{C} \cap g\mathcal{C}^{-1}} \mu^{\alpha,a} \lambda^\beta_a = \delta^{\alpha,\beta}$. Here $g = ab$ is suppressed in our notation. Then one may verify that the map is well-defined on $\Omega^2(H)$, i.e. $\sum_{ab=g} \lambda^\alpha_a\, i(E_a \wedge E_b) = 0$. Finally, $i$ by definition extends as a left $H$-module map and, since we only add terms of the same “total degree” $g$ with respect to the right action, it becomes also a right module map.


Given the choice of “lift”, the Ricci tensor constructed from the Riemann tensor by making a point-wise trace over the input and the first output of $i(R)$ is

\[ \mathrm{Ricci} = \sum_{a,b,c} i(F_c)^{ab}\, E_b \otimes_H (E_{c^{-1}ac} - E_a), \tag{85} \]

where $i(F_c) = i(F_c)^{ab} E_a \otimes_H E_b$. Next, a gamma-matrix is a collection of endomorphisms $\{\gamma_a\}$ of a vector space $W$ on which $G$ acts by a representation $\rho_W$, say, subject to further constraints to be discussed on the $\gamma$. A “spinor” field is a $W$-valued function on $G$, and

\[ \not{D} = \partial^a \gamma_a - A_b{}^a\, \gamma_a\, \tau_W^b, \qquad \tau_W^a = \rho_W(a^{-1} - e), \tag{86} \]

where $A_b = A_b{}^a E_a$ determines the components of each $A_b$. The $\tau_W^a$ are the “Lie algebra” generators $f^a$ in the representation $\rho_W$. The group inverse here makes them actually a right action rather than a left one (just as the $\partial^a$ are actually right-derivations).

Finally, a metric is determined by a choice of framing and a coframing $\{E^{*a}\}$, which is a collection of 1-forms such that every 1-form is a unique combination of these with coefficient functions from the right. Given a framing, a general coframing and hence a general metric is determined by a point-wise invertible function-valued matrix $\{g_{ab}\}$ and given as a cotensor by

\[ g = E_a g^{ab} \otimes_H E_b, \tag{87} \]

where $g^{ab}$ is the matrix inverse (e.g. $g^{ab} = \eta^{ab}$ above). The cotorsion of the spin connection is the torsion with respect to the coframing and corresponds to

\[ D_A \wedge E^{*a} = dE^{*a} + \sum_c (E^{*cac^{-1}} - E^{*a}) \wedge A_c. \tag{88} \]

Vanishing of the cotorsion generalises the notion of metric compatibility in a slightly weaker “skew” formulation appropriate to our not requiring the metric to be symmetric [2].

One is at liberty now to do “finite gravity”. That is, one can look at the moduli spaces for the above data and solve the various equations as well as others such as given by the variation or minimisation of an action. The role of the Einstein–Hilbert action can be played for example by the trace of $\not{D}^2$. Since everything is finite we do not need to worry so much about regularisations and Dixmier traces, etc. as in the approach of Connes [5]. We will not attempt this here but we will show for a nontrivial example in the next section that the moduli space of our basic data is not empty. For example, we could fix the framing and coframing to be the natural ones on any quantum group defined as above by the Maurer–Cartan form and a “braided-Killing form” $g = \eta$. We have established these canonical choices in Sect. 4. If one wants a torsion-free spin connection then we have to solve (in view of the Maurer–Cartan equations already obeyed) the condition

\[ \sum_{b \neq a} \bigl( A_b \wedge (E_{b^{-1}ab} - E_a) + E_b \wedge E_a + E_a \wedge E_b \bigr) = 0, \qquad \forall a \in \mathcal{C}. \tag{89} \]

We need only solve this for all except one $a$ since the sum over $a$ is automatically zero in view of (71). Finally we could take for $\gamma_a$ the “tautological” one in Sect. 4,

\[ \gamma_a = \sum_b \eta^{-1}_{ab}\, \rho_W(b - e). \tag{90} \]


These are equivariant and obey

\[ \eta^{ab} \gamma_a \gamma_b = \rho_W(C), \qquad C = \sum_{a,b \in \mathcal{C}} \eta^{-1}_{ab}\, (a - e)(b - e), \tag{91} \]

where $C$ is the braided Casimir element associated to the braided-Killing form. We can also consider other choices of gamma-matrices $\{\gamma_a\}$. Our other new proposal mentioned in Sect. 4 is that the gamma-matrices could be restricted by the requirement that Connes’ prescription [5] for the exterior algebra $\Omega^\cdot_{\not{D}}$ obtained from $\not{D}$ should coincide with our bicovariant approach above, which would be the case classically. This condition is independent of the choice of framing, coframing or spin connection since the commutators $[\not{D}, m]$ relevant for this ($m$ any function) are independent of these. The following theorem shows, however, that this is not necessarily a natural restriction in the present context of finite groups.

Theorem 5.6. A necessary condition for the Connes exterior algebra induced by $\not{D}$ to contain the relations of the Woronowicz bicovariant one on $\mathbb{C}[G]$ is

\[ \gamma_a^2 = 0 \quad \text{if } a^2 \in \mathcal{C} \cup \{e\},\ a \in \mathcal{C}, \qquad \sum_{ab=g,\ a,b \in \mathcal{C}} \gamma_a \gamma_b = 0 \quad \forall g \in \mathcal{C} \cup \{e\}. \]

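Proof. We recall that [5] considers a representation $\pi_{\not{D}}$ of the universal exterior algebra for a spectral triple. The relevant part of this construction, however, does not depend on Hilbert spaces or self-adjointness and works for any algebra $M$ and operator $\not{D}$ on a vector space in which $M$ is also represented. In our case the algebra is $M = H = \mathbb{C}[G]$ and the vector space is of the form $M \otimes W$, and $M$ is represented by multiplication. Then

\[ \pi_{\not{D}} : \Omega^\cdot M \to \mathrm{End}(M \otimes W), \qquad \pi_{\not{D}}(m \otimes n \otimes \cdots \otimes p) = m [\not{D}, n] \cdots [\not{D}, p] \]

defines the exterior algebra $\Omega^\cdot_{\not{D}}$ as the quotient of the universal one modulo the differential graded ideal generated by the kernel of $\pi_{\not{D}}$. At degree 1 we know from Sect. 3 that

\[ m [\not{D}, n] = \sum_{a \in \mathcal{C}} m (\partial^a n) R_a \otimes \gamma_a, \]

from which it is clear that for an injective map $\gamma : \Omega_0 \to \mathrm{End}(W)$ the kernel of $\pi_{\not{D}}$ at degree 1 is the same as $N_H$, the ideal set to zero by $m\,dn = m(\partial^a n) E_a$. At degree 2 we have

\[ [\not{D}, m][\not{D}, n] = \sum_{c,d \in \mathcal{C}} (\partial^c m) \circ R_c \circ (\partial^d n) R_d \otimes \gamma_c \gamma_d = \sum_{c,d \in \mathcal{C}} (\partial^c m)(\partial^d R_c n) R_{dc} \otimes \gamma_c \gamma_{c^{-1}dc} \]

after a change of variables. Next, working in the universal calculus, the product of Maurer–Cartan forms is

\[ e(\delta_g) \otimes_H e(\delta_h) = \sum_{b \in G} \delta_b \otimes \delta_{bg} \otimes \delta_{bgh} \]

and one finds

\[ \pi_{\not{D}}\bigl(e(\delta_g) \otimes_H e(\delta_h)\bigr) = \sum_{c,d \in \mathcal{C},\ b \in G} \delta_b (\delta_{bgc^{-1}} - \delta_{bg})(\delta_{bghc^{-1}d^{-1}} - \delta_{bghc^{-1}}) R_{dc} \otimes \gamma_c \gamma_{c^{-1}dc} = \sum_b \delta_b R_{gh} \otimes \gamma_g \gamma_h = R_{gh} \otimes \gamma_g \gamma_h, \]

where only the leading term in each difference contributes when $g \neq e$ and $h \neq e$. The $\delta$-functions then fix $c = g$ and $d = ghg^{-1}$ provided $h \in \mathcal{C}$ (otherwise we obtain zero). Also,

\[ \pi_{\not{D}}(dN_H) = \pi_{\not{D}}\{(d\delta_g) \otimes_H e(\delta_q) + \delta_g\, de(q) \mid g \in G,\ q \neq e,\ q \notin \mathcal{C}\} \]
\[ = \Bigl\{ \delta_g R_q \otimes \sum_{c,d \in \mathcal{C},\ dc=q} \gamma_c \gamma_{c^{-1}dc} \Bigm| g \in G,\ q \neq e,\ q \notin \mathcal{C} \Bigr\} = \Bigl\{ \delta_g R_q \otimes \sum_{a,b \in \mathcal{C},\ ab=q} \gamma_a \gamma_b \Bigm| g \in G,\ q \neq e,\ q \notin \mathcal{C} \Bigr\}, \]

where the first term fails to contribute since it is of the form a function times $e(\delta_h) \otimes_H e(\delta_q)$, where $h, q \neq e$ and $q \notin \mathcal{C}$ (see above). The second term only contributes $\pi_{\not{D}}(\delta_g \otimes \delta_{bh^{-1}} \otimes \delta_b)$, which comes out as stated by similar computations to those above and a further change of variables as shown. Hence $\pi_{\not{D}}$ applied to the expressions leading to the relations (71) in the Woronowicz calculus, namely the expressions

\[ R_g \otimes \sum_{ab=g} \gamma_a \gamma_b, \qquad R_{a^2} \otimes \gamma_a^2, \]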

do not lie in $\pi_{\not{D}}(dN_H)$ if $g, a^2 \in \mathcal{C} \cup \{e\}$ respectively, unless zero. Hence for the Woronowicz ideal at degree 2 to be contained in $\ker \pi_{\not{D}} + dN_H$ a necessary condition is for these operators to vanish when $g, a^2 \in \mathcal{C} \cup \{e\}$. This gives the conditions stated.

This will often be a sufficient condition as well, for suitably non-degenerate $\gamma$ and in the nice cases where (71) are all the relations (at least at degree 2). Moreover, when the conclusion holds it often means that the Connes and Woronowicz calculi actually coincide, because the Woronowicz one tends to have the most relations in practice anyway. The theorem is a surprising result but easily verified for example on $\mathbb{C}[\mathbb{Z}_2]$. This has only one nontrivial conjugacy class $\mathcal{C} = \{u\}$, where $u$ with $u^2 = e$ is the nontrivial element of $\mathbb{Z}_2$. The Woronowicz calculus has $E_u \wedge E_u = 0$ and hence $\Omega^2 = 0$, while the Connes prescription can give this (if and) only if $\gamma_u^2 = 0$. The nilpotency is associated to the order 2 of $u$ and means in particular that the Dirac operator itself will not typically be Hermitian with respect to the obvious inner products. Such nilpotent models could still be physically interesting in the context of a model where Connes’ approach and the quantum groups approach to the discrete part of the geometry intersect [11]. One may easily make the same analysis in the general setting of Sect. 4 for any Hopf algebra, but this simple example is enough to show the limitations of this approach (therefore we have omitted the full analysis). The result means that for $\gamma$ chosen according to other criteria (such as equivariance) one will typically have a different induced higher order calculus $\Omega^\cdot_{\not{D}}$ than the usual bicovariant one of Woronowicz natural in this context. One may work with either one or with the maximal prolongation, with the difference appearing at $\Omega^2$ and higher, i.e. affecting the curvature and vanishing of torsion, etc.


5.3. Riemannian geometry of $S_3$. We now turn to a concrete example, the permutation group $G = S_3$ generated by $u, v$ with relations

\[ u^2 = v^2 = e, \qquad uvu = vuv. \tag{92} \]

The conjugacy class $\mathcal{C} = \{u, v, uvu\}$ is semisimple in the sense of Proposition 5.4 while the other nontrivial conjugacy class $\{uv, vu\}$ is not. We therefore fix this first case, i.e. work with a 3-dimensional bicovariant differential calculus. In this case one finds by enumeration that

\[ \eta^{ab} = 3\delta^{ab}. \tag{93} \]
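The enumeration behind (93) can be reproduced directly from the counting formula $n(ab) = \#\{c \in \mathcal{C} \mid cab = abc\}$ of Proposition 5.4. The sketch below does so for both nontrivial conjugacy classes of $S_3$; the encoding of group elements as permutation tuples and the helper names are our own assumptions.

```python
from itertools import permutations


def mul(p, q):
    """Product pq of permutations given as tuples, acting as i -> p[q[i]]."""
    return tuple(p[i] for i in q)


def killing_form(cls):
    """Matrix eta(x^a, x^b) = n(ab) = #{c in C | c(ab) = (ab)c}."""
    def n(g):
        return sum(1 for c in cls if mul(c, g) == mul(g, c))
    return [[n(mul(a, b)) for b in cls] for a in cls]


def det(m):
    """Exact determinant of a small integer matrix by Laplace expansion."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


S3 = list(permutations(range(3)))
fixed = lambda p: sum(p[i] == i for i in range(3))
transpositions = [p for p in S3 if fixed(p) == 1]   # the class {u, v, uvu}
three_cycles = [p for p in S3 if fixed(p) == 0]     # the class {uv, vu}

print(killing_form(transpositions))
print(killing_form(three_cycles))
```

For the transpositions the product $ab$ is either $e$ (when $a = b$, so all three elements commute with it) or an element of order 3 (with which no transposition commutes), which gives the diagonal form. For the class of elements of order 3 every entry equals 2, so the form is degenerate, consistent with that class not being semisimple.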

The braided-Lie algebra here is (writing $w = uvu$)

\[ [x^u, x^v] = x^w = [x^v, x^u], \qquad [x^v, x^w] = x^u = [x^w, x^v], \qquad [x^u, x^w] = x^v = [x^w, x^u], \tag{94} \]

and $U(L)$ is generated by 1 and $x^a$ with the relations

\[ x^u x^v = x^v x^w = x^w x^u, \qquad x^v x^u = x^u x^w = x^w x^v. \tag{95} \]

If one defined the Killing form by the adjoint action of the $f^a$ then one would have instead $\eta^{ab} = 3\delta^{ab} + 3$. In fact any constant offset here does not change anything in terms of the resulting connection, etc. (but could render $\eta$ degenerate). The various metrics just differ by a multiple of $\sum_{a,b} E_a \otimes_H E_b$, which will turn out to play a somewhat neutral role. The explicit form of the higher differential calculus is well known and in this case (71) gives all the relations at degree 2, namely

\[ E_u \wedge E_u = E_v \wedge E_v = E_{uvu} \wedge E_{uvu} = 0, \]
\[ E_u \wedge E_v + E_v \wedge E_{uvu} + E_{uvu} \wedge E_u = 0, \]
\[ E_v \wedge E_u + E_{uvu} \wedge E_v + E_u \wedge E_{uvu} = 0, \tag{96} \]

so that $\Omega^2(\mathbb{C}[S_3])$ is 4-dimensional. Lemma 5.3 establishes that these are in fact a full set of relations in this case. The Maurer–Cartan equations (72) immediately become

\[ dE_u + E_{uvu} \wedge E_v + E_v \wedge E_{uvu} = 0, \]
\[ dE_v + E_u \wedge E_{uvu} + E_{uvu} \wedge E_u = 0, \]
\[ dE_{uvu} + E_v \wedge E_u + E_u \wedge E_v = 0. \tag{97} \]

This has been observed by many authors using Woronowicz bicovariant calculus. With this background we now construct explicit solutions to our torsion and cotorsion conditions.

Proposition 5.7. For the framing by the Maurer–Cartan form, the moduli space of zero-torsion spin connections is 12-dimensional and takes the form

\[ A_u = (\alpha + 1) E_u + \gamma E_v + \beta E_{uvu}, \]
\[ A_v = \gamma E_u + (\beta + 1) E_v + \alpha E_{uvu}, \]
\[ A_{uvu} = \beta E_u + \alpha E_v + (\gamma + 1) E_{uvu}, \qquad \alpha + \beta + \gamma = -1, \]

where $\alpha, \beta, \gamma$ are functions subject to the constraint shown. They obey $\sum_a A_a = 0$.


Proof. We solve the two equations

\[ A_v \wedge (E_{uvu} - E_u) + A_{uvu} \wedge (E_v - E_u) = E_{uvu} \wedge E_v + E_v \wedge E_{uvu}, \]
\[ A_u \wedge (E_{uvu} - E_v) + A_{uvu} \wedge (E_u - E_v) = E_u \wedge E_{uvu} + E_{uvu} \wedge E_u, \]

where the third equation in (89) will be automatic given the other two. The right-hand sides here are $-dE_u$ and $-dE_v$ respectively. Into these equations we write the component decomposition $A_a = A_a{}^b E_b$ with

\[ A_u{}^u = \alpha + 1, \qquad A_v{}^v = \beta + 1, \qquad A_{uvu}{}^{uvu} = \gamma + 1, \]

say (these could be functions on the group, not numbers). We then write everything in terms of any four linearly independent 2-forms, say $E_u \wedge E_v$, $E_v \wedge E_u$, $E_u \wedge E_{uvu}$ and $E_{uvu} \wedge E_u$, writing the other two in terms of these via the above relations in $\Omega^2$. The coefficients of these four 2-forms must separately vanish and give us the four equations

\[ A_{uvu}{}^u - \beta = 0, \qquad -A_{uvu}{}^v - \gamma - (\beta + 1) = 0, \]
\[ A_v{}^u - \gamma = 0, \qquad -A_v{}^{uvu} - (\gamma + 1) - \beta = 0, \]

respectively. Similarly for the other equation, to give the solution stated.
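For constant coefficients the statement can be checked mechanically, using the fact from Sect. 5.2 that a tensor in $\Omega^1(H) \otimes_H \Omega^1(H)$ has vanishing wedge product exactly when it is invariant under the braiding (70). The sketch below tests the zero-torsion condition (89) and the regularity condition (82) in this way. It is only an illustration restricted to numerical $\alpha, \beta, \gamma$ (the proposition allows functions), and the data layout and function names are our own assumptions.

```python
from fractions import Fraction
from itertools import permutations, product


def mul(p, q):
    """Product pq of permutations given as tuples, acting as i -> p[q[i]]."""
    return tuple(p[i] for i in q)


def inv(p):
    r = [0] * len(p)
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)


e = (0, 1, 2)
u, v = (1, 0, 2), (0, 2, 1)
w = mul(mul(u, v), u)                      # the element uvu
C = [u, v, w]


def E(a):
    """Basis 1-form E_a as a coefficient dictionary."""
    return {c: Fraction(int(c == a)) for c in C}


def combo(*terms):
    """Linear combination sum(k * x) of 1-forms, given as (k, x) pairs."""
    return {c: sum(Fraction(k) * x[c] for k, x in terms) for c in C}


def tensor(x, y):
    """x (x)_H y for 1-forms with constant coefficients."""
    return {(a, b): x[a] * y[b] for a, b in product(C, repeat=2)}


def total(ts):
    return {k: sum(t[k] for t in ts) for k in product(C, repeat=2)}


def psi(t):
    """Braiding (70): E_a (x) E_b -> E_{a b a^-1} (x) E_a."""
    out = {k: Fraction(0) for k in t}
    for (a, b), coeff in t.items():
        out[(mul(mul(a, b), inv(a)), a)] += coeff
    return out


def wedge_vanishes(t):
    """The image of t in Omega^2 is zero iff t is Psi-invariant."""
    return psi(t) == t


def connection(alpha, beta, gamma):
    """The family of Proposition 5.7 with numerical coefficients."""
    Eu, Ev, Ew = E(u), E(v), E(w)
    return {u: combo((alpha + 1, Eu), (gamma, Ev), (beta, Ew)),
            v: combo((gamma, Eu), (beta + 1, Ev), (alpha, Ew)),
            w: combo((beta, Eu), (alpha, Ev), (gamma + 1, Ew))}


def is_torsion_free(A):
    """Condition (89) for every a in C."""
    for a in C:
        terms = []
        for b in (c for c in C if c != a):
            conj = mul(mul(inv(b), a), b)
            terms.append(tensor(A[b], combo((1, E(conj)), (-1, E(a)))))
            terms.append(tensor(E(b), E(a)))
            terms.append(tensor(E(a), E(b)))
        if not wedge_vanishes(total(terms)):
            return False
    return True


def is_regular(A):
    """Condition (82) for every q outside C and different from e."""
    for q in permutations(range(3)):
        if q == e or q in C:
            continue
        pairs = [(a, b) for a, b in product(C, repeat=2) if mul(a, b) == q]
        if not wedge_vanishes(total([tensor(A[a], A[b]) for a, b in pairs])):
            return False
    return True


third = Fraction(-1, 3)
print(is_torsion_free(connection(third, third, third)))
```

With these helpers one can confirm that numerical triples with $\alpha + \beta + \gamma = -1$ pass `is_torsion_free`, that the triple $(0, 0, 0)$ does not, and that among such triples `is_regular` singles out $\alpha = \beta = \gamma = -\frac13$.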

Next we consider metrics. The general moduli space of all metrics is clearly $GL_3$ raised to the 6th power, as we have a reference metric $\eta^{ab}$ provided by the braided-Killing form, and hence a natural reference coframing $E^{*a} = \eta^{ba} E_b$. The corresponding metric induced by the braided Killing form is of course

\[ g = \eta^{ab} E_a \otimes_H E_b = 3 \sum_a E_a \otimes_H E_a. \tag{98} \]

Corollary 5.8. (i) The moduli space of cotorsion-free connections with respect to the coframing defined by the braided-Killing form metric $\eta^{ab}$ is also 12-dimensional and has a similar form to the above, with coefficients $\alpha, \beta, \gamma$ on the right.
(ii) The moduli of torsion free and cotorsion free connections is 2-dimensional, with $\alpha, \beta, \gamma$ numbers.
(iii) The point $\alpha = \beta = \gamma = -\frac13$ in this moduli space is the unique regular torsion-free and cotorsion-free or “Levi–Civita” connection on $S_3$. This and its nonzero curvature are

$$A_a = E_a - \frac13\theta,\quad \theta = \sum_a E_a,\quad F_a = dE_a.$$

Proof. We have to show vanishing of (88) for the coframing $E^{*a} = \eta^{ba}E_b$. However, because $\eta$ is Ad-invariant and constant, this reduces in terms of $E$ to vanishing of

$$D_A\wedge E_a = dE_a + \sum_b (E_{bab^{-1}} - E_a)\wedge A_b.$$

Note that this is a different equation from the torsion equation solved above. However, since every element of $\mathcal{C}$ has order 2, the inverse is irrelevant and the equation then differs only by a reversal of the $\wedge$. Looking at the equations solved for zero torsion above, we see that they are invariant under such a reversal provided we write $A_a = E_b A'_a{}^b$ with coefficients $A'_a{}^b$ from the right. Next we consider the intersection of the moduli of torsion-free and cotorsion-free connections. Given the bimodule structure, if $A_u = E_u(\alpha'+1) + E_v\gamma' + E_{uvu}\beta'$, etc., is also torsion free, we need $R_u(\alpha') = \alpha$, $R_v(\gamma') = \gamma$ and $R_{uvu}(\beta') = \beta$, and similarly for $A_v$, $A_{uvu}$. As a result, $R_a(\alpha') = \alpha$ for all $a$, hence $\alpha$ is a multiple of the identity function (a number) and $\alpha = \alpha'$. It is similar for $\beta, \gamma$. Finally in this moduli of torsion-free and cotorsion-free connections we look for regular connections, i.e. those for which

$$A_u\wedge A_v + A_{uvu}\wedge A_u + A_v\wedge A_{uvu} = 0,\quad A_v\wedge A_u + A_{uvu}\wedge A_v + A_u\wedge A_{uvu} = 0, \tag{99}$$

corresponding to products of elements from $\mathcal{C}$ with values $uv$ or $vu$. As before, we take the first equation, write $A_u = (\alpha+1)E_u + \gamma E_v + \beta E_{uvu}$, etc., (as found above), and write all products in terms of our chosen four 2-forms. The coefficients of $E_u\wedge E_v$ and $E_v\wedge E_u$ each yield $\alpha = \gamma$, while those of $E_u\wedge E_{uvu}$ and $E_{uvu}\wedge E_u$ each yield $\alpha = \beta$. The second equation above follows in an identical manner and can only give the same constraints by a symmetry in which we reverse the $\wedge$. Hence there is a unique regular connection among torsion free and cotorsion free ones. We write it in the way shown in terms of the Maurer–Cartan form and $\theta$. Finally, for any regular connection in our example, the curvature has to take the form

$$F_a = dA_a - \sum_b (A_b\wedge A_a + A_a\wedge A_b) \tag{100}$$

because the product of all distinct elements of the conjugacy class lies outside it, so there is no $A_c\wedge A_d$ term in (81). For our connections the second term vanishes since $\sum_b A_b = 0$. Also, $d\theta = 0$ when we put in the values of each $dE_a$ and average, and use (71). Hence $F_a = dE_a$, which is certainly non-zero, being equal to the quadratic parts in the Maurer–Cartan equation. The explicit $\nabla$ from the general formulae in Sect. 5.2 is

$$\nabla E_u = -E_u\otimes_H E_u - E_v\otimes_H E_{uvu} - E_{uvu}\otimes_H E_v + \frac13\theta\otimes_H\theta,$$
$$\nabla E_v = -E_v\otimes_H E_v - E_u\otimes_H E_{uvu} - E_{uvu}\otimes_H E_u + \frac13\theta\otimes_H\theta,$$
$$\nabla E_{uvu} = -E_{uvu}\otimes_H E_{uvu} - E_v\otimes_H E_u - E_u\otimes_H E_v + \frac13\theta\otimes_H\theta,$$

and one may then verify that indeed torsion and cotorsion vanish as

$$\nabla\wedge E_a = dE_a,\quad (\nabla\wedge\mathrm{id} - \mathrm{id}\wedge\nabla)\Big(\sum_a E_a\otimes_H E_a\Big) = 0.$$

On the other hand a similar computation to the latter gives

$$\nabla\Big(\sum_a E_a\otimes_H E_a\Big) = 2\sum_{\text{not } a=b=c} E_a\otimes_H E_b\otimes_H E_c - 2\sum_{\sigma\in S_3} E_{\sigma(u)}\otimes_H E_{\sigma(v)}\otimes_H E_{\sigma(uvu)} \neq 0, \tag{101}$$


where we keep the left output of $\nabla$ to the far left and act as a derivation. This is manifestly nonzero (as well as somewhat basis dependent, i.e. not really a natural computation on $E_a\otimes_H E_a$). Therefore full metric compatibility in the naive sense does not hold even for this simplest nontrivial example. This justifies our weaker notion of vanishing cotorsion as the appropriate generalisation for noncommutative geometry.

We are then able from the general theory above to compute the Riemann and Ricci curvatures etc., for the Levi–Civita connection on $S_3$, the latter with respect to a choice of “lift”. One choice (84) is clearly

$$i(E_u\wedge E_v) = E_u\otimes_H E_v - E_{uvu}\otimes_H E_u,\quad i(E_{uvu}\wedge E_u) = E_{uvu}\otimes_H E_u - E_v\otimes_H E_{uvu},$$
$$i(E_v\wedge E_u) = E_v\otimes_H E_u - E_{uvu}\otimes_H E_v,\quad i(E_u\wedge E_{uvu}) = E_u\otimes_H E_{uvu} - E_v\otimes_H E_u. \tag{102}$$

For the second choice, the bases of $P_{uv}$ and $P_{vu}$ are easily seen to be the unique vector $\lambda_u = \lambda_v = \lambda_{uvu} = 1$ so that the lift in Proposition 5.5 is

$$i(E_a\wedge E_b) = E_a\otimes_H E_b - \frac13\sum_{cd=ab} E_c\otimes_H E_d,\quad \forall a\neq b. \tag{103}$$

Proposition 5.9. The unique Levi–Civita connection on $S_3$ constructed above has constant curvature with respect to either of the above two lifts, with

$$\mathrm{Ricci} = \mu(-g + \theta\otimes_H\theta),$$

where $g$ is the metric (98) induced by the Killing form and $\mu = 1, 2/3$ respectively.

Proof. This is a direct computation from (85). The Riemann tensor is

$$RE_u = dE_u\otimes_H E_u + dE_v\otimes_H E_{uvu} + dE_{uvu}\otimes_H E_v,$$
$$RE_v = dE_v\otimes_H E_v + dE_u\otimes_H E_{uvu} + dE_{uvu}\otimes_H E_u, \tag{104}$$
$$RE_{uvu} = dE_{uvu}\otimes_H E_{uvu} + dE_v\otimes_H E_u + dE_u\otimes_H E_v,$$

since $\sum_a dE_a = 0$. We lift each term by applying the chosen $i$, then pick out the coefficient of $E_u\otimes$ in $RE_u$ etc., for the trace.

Thus $S_3$ with its natural Riemannian structure is more or less an “Einstein space”. We could take $g_\lambda = g - \lambda\theta\otimes_H\theta$ as the metric from the start without changing anything above (although $\lambda = 1$ itself is degenerate). The scalar curvature itself is the further contraction of this with the inverse metric. One can similarly consider several other lifts with the same conclusions but a different value of $\mu$. Note also that our trace conventions for Ricci in the classical case would become the first and third indices of the Riemann tensor, so that we have an opposite sign convention from the usual one. Hence $S_3$ above for the natural choices of lift looks more like a compact manifold with constant positive curvature in the usual terms.

Finally, to fix a Dirac operator, for the sake of discussion we choose the tautological $\gamma$ defined as in (90) by the two-dimensional representation

$$\rho_W(u) = \begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix},\quad \rho_W(v) = \begin{pmatrix} 1 & 0\\ -1 & -1\end{pmatrix}. \tag{105}$$

The braided-Casimir is

$$C = \frac13\big((u-e)^2 + (v-e)^2 + (uvu-e)^2\big) = 2 - \frac23(u+v+uvu),\qquad \rho_W(C) = 2, \tag{106}$$

so that from (91),

$$\gamma_a\gamma_b\eta^{ab} = \rho_W(C) = 2. \tag{107}$$

Hence by Theorem 5.6 (because the elements of $\mathcal{C}$ all have order 2) the calculus implied by $\slashed{D}$ for these will be different from the one that we have already imposed from quantum group considerations. In fact $\slashed{D}$ imposes fewer relations. This is not a problem from the quantum groups point of view, we can still use $\slashed{D}$ perfectly well. The gamma-matrices are explicitly

$$\gamma_u = \frac13\begin{pmatrix} -1 & 1\\ 1 & -1\end{pmatrix},\quad \gamma_v = \frac13\begin{pmatrix} 0 & 0\\ -1 & -2\end{pmatrix},\quad \gamma_{uvu} = \frac13\begin{pmatrix} -2 & -1\\ 0 & 0\end{pmatrix}. \tag{108}$$

They have some nice identities, however. In fact for any $\rho_W$ with $\rho_W(uv - e)$ invertible, which is the case here, one can show by enumeration of the cases and the identity $\rho_W(e + uv + (uv)^2) = 0$ which then holds, that

$$\gamma_a\gamma_b + \gamma_b\gamma_a + \frac23(\gamma_a + \gamma_b) = \frac13(\delta_{ab} - 1),\qquad \sum_a\gamma_a = -1. \tag{109}$$

Proposition 5.10. For the Dirac operator on $S_3$ with the above gamma-matrices and the canonical “Levi–Civita” connection on $S_3$ constructed above, we have

$$\slashed{D} = \partial^a\gamma_a - 1 = \frac13\begin{pmatrix} -\partial^u - 2\partial^{uvu} - 3 & \partial^u - \partial^{uvu}\\ \partial^u - \partial^v & -\partial^u - 2\partial^v - 3\end{pmatrix}.$$

Proof. To find this note first that $\tau_W^a = \rho_W(a^{-1} - e) = 3\gamma_a$ since all elements of $\mathcal{C}$ have order 2. The canonical connection in terms of components is $A_a{}^b = \delta_a{}^b - \frac13$, hence

$$\slashed{D} = \partial^a\gamma_a - 3\sum_a\gamma_a^2 + \sum_{a,b}\gamma_a\gamma_b.$$
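The matrix statements in (105)–(109) and the constant term of the Dirac operator can be checked by exact rational arithmetic. The following is a minimal sketch, assuming only the representation (105) and the relation $\gamma_a = \frac13\rho_W(a - e)$ quoted in the proof; the helper names (`mat`, `mul`, `casimir`, and so on) are illustrative and not part of the paper.

```python
from fractions import Fraction

def mat(rows):
    """Build a 2x2 matrix with exact rational entries."""
    return [[Fraction(x) for x in row] for row in rows]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[Fraction(c) * a[i][j] for j in range(2)] for i in range(2)]

ONE = mat([[1, 0], [0, 1]])
ZERO = mat([[0, 0], [0, 0]])

# Representation (105); rho(uvu) is obtained by multiplying out.
rho = {"u": mat([[0, 1], [1, 0]]), "v": mat([[1, 0], [-1, -1]])}
rho["uvu"] = mul(mul(rho["u"], rho["v"]), rho["u"])

# gamma_a = (rho(a) - 1) / 3, using that every a has order 2.
gamma = {a: scale(Fraction(1, 3), add(r, scale(-1, ONE))) for a, r in rho.items()}

def casimir():
    """rho_W(C) for C = 2 - (2/3)(u + v + uvu), cf. (106)."""
    total = ZERO
    for r in rho.values():
        total = add(total, r)
    return add(scale(2, ONE), scale(Fraction(-2, 3), total))

def identities_hold():
    """Check both identities in (109) for all pairs (a, b)."""
    for a in gamma:
        for b in gamma:
            lhs = add(add(mul(gamma[a], gamma[b]), mul(gamma[b], gamma[a])),
                      scale(Fraction(2, 3), add(gamma[a], gamma[b])))
            rhs = scale(Fraction((1 if a == b else 0) - 1, 3), ONE)
            if lhs != rhs:
                return False
    total = ZERO
    for g in gamma.values():
        total = add(total, g)
    return total == scale(-1, ONE)

def dirac_constant():
    """-3 sum_a gamma_a^2 + sum_{a,b} gamma_a gamma_b, the zeroth-order term."""
    squares, products = ZERO, ZERO
    for a in gamma:
        squares = add(squares, mul(gamma[a], gamma[a]))
        for b in gamma:
            products = add(products, mul(gamma[a], gamma[b]))
    return add(scale(-3, squares), products)
```

Under these assumptions `casimir()` returns twice the identity, `identities_hold()` is true, and `dirac_constant()` returns minus the identity, in line with $\slashed{D} = \partial^a\gamma_a - 1$.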

We then use the gamma-matrix identities above.

The −1 appearing here reflects again a “constant curvature” now detected for $S_3$ with its canonical Riemannian structure by the Dirac operator.

Finally we note that while we have focussed here on the canonical metric induced by the braided-Killing form, one can similarly consider more general triples $(A, E, E^*)$ and solve for zero torsion, and zero cotorsion, compute the curvature, etc. One may then minimise an action defined for example by suitable contraction of the Ricci curvature, i.e. proceed to finite quantum gravity. Also, there is no problem introducing Maxwell or Yang–Mills fields and matter fields since we already have a bundle formalism, sections, etc. This intended application is beyond our present scope and will be attempted in detail elsewhere. A further application may be to insert our canonical Dirac operator on $S_3$ into the framework for elementary particle Lagrangians of Connes and Lott.

Acknowledgements. The writing of the manuscript was completed while visiting the CPT at Luminy, Marseilles during the month of May, 2000; I thank my hosts there for an excellent stay.


References

1. Majid, S.: Hopf algebras for physics at the Planck scale. J. Classical and Quantum Gravity 5, 1587–1606 (1988)
2. Majid, S.: Quantum and braided group Riemannian geometry. J. Geom. Phys. 30, 113–146 (1999); q-alg/9709025
3. Brzeziński, T. and Majid, S.: Quantum group gauge theory on quantum spaces. Commun. Math. Phys. 157, 591–638 (1993); Erratum 167, 235 (1995)
4. Woronowicz, S.L.: Differential calculus on compact matrix pseudogroups (quantum groups). Commun. Math. Phys. 122, 125–170 (1989)
5. Connes, A.: Noncommutative Geometry. London–New York: Academic Press, 1994
6. Major, S. and Smolin, L.: Quantum deformation of quantum gravity. Nucl. Phys. B 473, 267–290 (1996)
7. Beggs, E. and Majid, S.: Poisson-Lie T-duality for quasitriangular Lie bialgebras. Commun. Math. Phys. 220, 455–488 (2001)
8. Amelino-Camelia, G. and Majid, S.: Waves on noncommutative spacetime and gamma-ray bursts. Int. J. Mod. Phys. A 15, 4301–4323 (2000)
9. Connes, A., Douglas, M.R. and Schwarz, A.: Noncommutative geometry and matrix theory: Compactification on tori. J. High Energy Phys. 2, U40–U74 (1998)
10. Connes, A.: Noncommutative geometry and reality. J. Math. Phys. 36, 6194 (1995)
11. Majid, S. and Schücker, T.: Z2 × Z2 Lattice as a Connes–Lott-quantum group model. Preprint, 2000
12. Brzeziński, T. and Majid, S.: Quantum geometry of algebra factorisations and coalgebra bundles. Commun. Math. Phys. 213, 491–521 (2000)
13. Majid, S.: Conceptual issues for noncommutative gravity on algebras and finite sets. Int. J. Mod. Phys. B 14, 2427–2449 (2000)
14. Majid, S.: Quantum and braided Lie algebras. J. Geom. Phys. 13, 307–356 (1994)
15. Castellani, L.: Gravity on finite groups. Commun. Math. Phys. 218, 609–632 (2001)
16. Dimakis, A. and Müller-Hoissen, F.: Discrete Riemannian geometry. J. Math. Phys. 40, 1518 (1999)
17. Heckenberger, I. and Schmüdgen, K.: Levi–Civita connections on the quantum groups SLq(N), Oq(N) and Sp(q)(N). Commun. Math. Phys. 185, 177–196 (1997)
18. Majid, S.: Foundations of Quantum Group Theory. Cambridge: Cambridge University Press, 1995
19. Schneider, H-J.: Principal homogeneous spaces for arbitrary Hopf algebras. Isr. J. Math. 72, 167–195 (1990)
20. Brzeziński, T. and Majid, S.: Quantum differentials and the q-monopole revisited. Acta Appl. Math. 54, 185–232 (1998)
21. Majid, S.: Diagrammatics of braided group gauge theory. J. Knot Th. Ramif. 8, 731–771 (1999)
22. Majid, S.: Classification of bicovariant differential calculi. J. Geom. Phys. 25, 119–140 (1998)

Communicated by A. Connes

Commun. Math. Phys. 225, 171 – 189 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

The Correlation Between Multiplicities of Closed Geodesics on the Modular Surface

Manfred Peter

Mathematisches Institut, Albert-Ludwigs-Universität, Eckerstr. 1, 79104 Freiburg, Germany. E-mail: [email protected]

Received: 25 May 2001 / Accepted: 28 August 2001

Abstract: An explicit product representation is proved for the correlation function of the multiplicities of closed geodesics on the modular surface. This makes rigorous part of the investigation of Bogomolny, Leyvraz and Schmit on the correlation of the eigenvalues of the Laplacian on the modular surface. The result can also be seen as a (rigorously proved) analogue of the Hardy-Littlewood twin prime conjecture.

1. Introduction

One of the connections between Number Theory and Mathematical Physics that emerged in recent years is Arithmetical Quantum Chaos. On the physical side we have the problem of how the notion of chaos in classical dynamical systems can be transferred to quantum mechanical systems. Here Gutzwiller’s trace formula is a useful quantitative tool which connects the lengths of closed orbits in the classical picture with the energy eigenvalues in the quantum mechanical picture. Unfortunately it is an asymptotic relation without rigorous error estimates. But in the special case of motion on a surface of constant negative curvature generated by a discontinuous group, Gutzwiller’s trace formula is Selberg’s trace formula and is therefore exact. Furthermore, for particular choices of groups, connections with well understood number theoretical objects can be exploited which are not available for general groups.

Numerical experiments led to surprising results: For a generic group the eigenvalue statistics of the Laplacian on the surface seem to be in accordance with the Gaussian Orthogonal Ensemble (see Bohigas, Giannoni, Schmit [6]). But for arithmetic groups it is closer to Poisson distribution ([6], Aurich, Steiner [1]; see also [5]). It seems that the high degeneracy of the length spectrum is responsible for the Poissonian behaviour of the eigenvalues.

Note that a similar phenomenon was observed by Hejhal and Selberg for quaternion groups ([9], Theorems 17.1 and 18.8). They used the high degeneracy of the length spectrum to prove exceptionally large lower bounds for integral means of the remainder term in Weyl’s asymptotic law for the eigenvalues. Luo and Sarnak [10] used


the same phenomenon in their study of the “number variance” (the mean square for the remainder term in short intervals) for general arithmetic (not necessarily congruence) groups. They proved that for arithmetic groups, the length spectrum without multiplicities has at most linear growth and conjectured that this property characterizes arithmetic groups, a fact later proved by Schmutz [16] in the noncompact case.

In an attempt to understand the Poissonian behaviour for the arithmetic group $\Gamma := SL(2,\mathbb{Z})$, Bogomolny, Leyvraz and Schmit [4] calculated the two point correlation function for the eigenvalues. Their arguments are not mathematically rigorous and are given in two steps: First Selberg’s trace formula along with heuristic arguments is used to reduce the pair correlation function of the eigenvalues to that of the lengths of closed orbits. Second a heuristic version of the Hardy-Littlewood method is used to express the latter correlation function as an infinite product with easily calculable factors. Making the first step in [4] mathematically rigorous is extremely hard without a new methodological tool. Selberg’s trace formula seems too weak for this purpose. In the present paper it will be shown how the second step can be made rigorous with an approach different from [4].

There is another way of looking at the result of this paper. Since the structure of the Selberg trace formula is the same as that of Weil’s explicit formula in prime number theory (see [2]) one can look upon the lengths of closed geodesics as the logarithms of some sort of generalized primes. Thus the theorem below is an analogue of the Hardy-Littlewood twin prime conjecture.

In order to state the main result, let $\mathbb{H}$ be the complex upper half plane and $\Gamma := SL(2,\mathbb{Z})$. This group acts discontinuously on $\mathbb{H}$ by Möbius transformations. Let $F := \mathbb{H}\backslash\Gamma$ be the Riemann surface generated by $\Gamma$. The closed geodesics on $F$ are in one-to-one correspondence to conjugacy classes of primitive hyperbolic elements in $\Gamma$.
For $n\in\mathbb{N}$, $n>2$, let $g(n)$ be the number of closed geodesics on $F$ which correspond to primitive conjugacy classes with trace $n$. Set $\alpha(n) := g(n)\log n/n$. Then the function $\alpha$ has mean value 1. Set $\tilde\alpha(n) := \alpha(n) - 1$.

Theorem. For $r\in\mathbb{N}_0$, the limit

$$\gamma(r) := \lim_{N\to\infty}\frac1N\sum_{2<n\le N}\tilde\alpha(n)\,\tilde\alpha(n+r)$$

exists. Its value is given by

$$\gamma(r) + 1 = \prod_p\Big(1 + \sum_{b\ge1}A_r(p^b)\Big),$$

where for a prime $p>2$ and $b\ge2$, we have

$$A_r(p) = \frac{1}{(p^2-1)^2}\bigg(p\sum_{n \bmod p}\Big(\frac{(n^2-4)((n+r)^2-4)}{p}\Big) - 1\bigg),$$

$$A_r(p^b) = \frac{1}{(p^2-1)^2\,p^{3b-4}}\begin{cases} 2p^b(1-1/p), & r\equiv0\ (p^b),\\ -2p^{b-1}, & r\not\equiv0\ (p^b),\ r\equiv0\ (p^{b-1}),\\ \big(\frac{-1}{p}\big)^b p^b(1-1/p), & r\equiv\pm4\ (p^b),\\ -\big(\frac{-1}{p}\big)^b p^{b-1}, & r\not\equiv\pm4\ (p^b),\ r\equiv\pm4\ (p^{b-1}),\\ 0, & \text{otherwise.}\end{cases}$$

Furthermore, for $b\ge6$, we have

$$A_r(2) = \frac19\begin{cases}1, & r\equiv0\ (2),\\ -1, & r\equiv1\ (2),\end{cases}\qquad A_r(4) = \frac1{18}\begin{cases}1, & r\equiv0\ (4),\\ -1, & r\equiv2\ (4),\\ 0, & r\equiv1\ (2),\end{cases}$$

$$A_r(8) = 0,\qquad A_r(16) = \frac1{9\cdot16}\begin{cases}1, & r\equiv0\ (16),\\ -1, & r\equiv8\ (16),\\ 0, & r\not\equiv0\ (8),\end{cases}\qquad A_r(32) = 0,$$

$$A_r(2^b) = \frac1{9\cdot2^{2b-4}}\begin{cases}2, & r\equiv0\ (2^b),\\ -2, & r\equiv2^{b-1}\ (2^b),\\ 1, & r\equiv\pm(4+2^{b-2})\ (2^b),\\ -1, & r\equiv\pm(4+2^{b-2}+2^{b-1})\ (2^b),\\ 0, & \text{otherwise.}\end{cases}$$

This theorem was already stated in [4] and was made plausible on heuristic grounds. In the present paper, a different approach is used which exploits the connection of $\alpha(n)$ with a sum of class numbers of primitive binary quadratic forms (see (2.1) below). The main step is to show that $\alpha$ is almost periodic and to compute its Fourier coefficients. Then the theorem follows from Parseval’s equation. To this end a method is used that has already been applied in [13] and [14]. It might be of interest to note that almost periodic functions – albeit on the real line – already found applications to trace formulas for integrable geodesic flows (see [3] for an overview).

The present paper is organized as follows: In Sect. 2 the function $\alpha$ is reduced to class numbers. Section 3 contains a short review of almost periodic arithmetical functions. Furthermore, the function $\alpha$ is shown to be almost periodic by approximating it with a carefully chosen periodic function. This will considerably simplify the computation of the product representation of $\gamma(r)$. In Sect. 4 this Euler product is derived and its local factors are computed in Sect. 5.

2. Reduction to Class Numbers

In order to state the connection with class numbers we need some notation. Proofs of the number theoretic facts in this section can be found for example in [7], [8] and [11]. The letter $d$ will always stand for a positive non-square discriminant (i.e. $d\equiv0,1\ (4)$). A primitive binary quadratic form is a polynomial $f = ax^2 + bxy + cy^2$ with $a,b,c\in\mathbb{Z}$ and $\gcd(a,b,c) = 1$. Its discriminant is $d = b^2 - 4ac$. Two such forms $f_i$, $i = 1,2$, are called equivalent if there is a matrix $\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}\in SL(2,\mathbb{Z})$ with $f_1(x,y) = f_2(\alpha x + \beta y, \gamma x + \delta y)$. A fundamental theorem in number theory now states that the number $h(d)$ of equivalence classes of primitive binary quadratic forms with discriminant $d$ is finite. This number is one of the important quantities in number theory since it appears in a surprising variety of situations.

The Pellian equation $u^2 - dv^2 = 4$ has infinitely many solutions in integers $(u,v)$. The solution $(u_d, v_d)$ with $u_d, v_d > 0$ and $u_d$ minimal is called the fundamental solution since all the other solutions can be generated from it by a simple law. Set $\varepsilon_d := (u_d + v_d\sqrt d)/2$. The trace of $\varepsilon_d$ is defined as $\mathrm{tr}(\varepsilon_d) := u_d$.

The one-to-one correspondence between primitive conjugacy classes in $\Gamma$ and closed geodesics on $F$ is as follows: Every primitive hyperbolic $P\in\Gamma$ has two real fixed points


(one of them can be $\infty$). The orthogonal circle in $\mathbb{H}$ which ends in these fixed points induces a closed geodesic on $F$ of length $l$ where $\cosh(l/2) = |\operatorname{Tr} P|/2$. The possible values for $l$ and their multiplicities are described in

Proposition 2.1 ([15], Corollary 1.5). The lengths of closed geodesics on $F$ are the numbers $2\log\varepsilon_d$ with multiplicities $h(d)$.

From this proposition and the preceding description it follows that

$$\alpha(n) = \frac{\log n}{n}\sum_{d:\ \mathrm{tr}(\varepsilon_d) = n} h(d). \tag{2.1}$$
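The summation condition $\mathrm{tr}(\varepsilon_d) = n$ in (2.1) can be made concrete with a short search. The sketch below is illustrative only: it finds the fundamental solution of $u^2 - dv^2 = 4$ by brute force and lists the discriminants $d$ whose fundamental unit has trace $n$; the function names are hypothetical, and computing the class numbers $h(d)$ themselves is not attempted.

```python
from math import isqrt

def is_discriminant(d):
    """Positive non-square d with d = 0 or 1 (mod 4)."""
    return d > 0 and d % 4 in (0, 1) and isqrt(d) ** 2 != d

def fundamental_solution(d):
    """Smallest (u, v) with u, v >= 1 and u^2 - d v^2 = 4 (brute force over v)."""
    v = 1
    while True:
        m = d * v * v + 4
        u = isqrt(m)
        if u * u == m:
            return u, v
        v += 1

def discriminants_with_trace(n):
    """All discriminants d with tr(eps_d) = n, i.e. the index set of (2.1)."""
    m = n * n - 4
    result = []
    v = 1
    while v * v <= m:
        if m % (v * v) == 0:
            d = m // (v * v)
            # (n, v) solves the Pell equation for d; keep d only if this
            # solution is the fundamental one, not a higher power of eps_d.
            if is_discriminant(d) and fundamental_solution(d) == (n, v):
                result.append(d)
        v += 1
    return result
```

For example, `discriminants_with_trace(7)` keeps only $d = 45$: the pair $(d, v) = (5, 3)$ also satisfies $dv^2 = 45$, but it corresponds to the square of $\varepsilon_5 = (3 + \sqrt5)/2$ and is therefore excluded.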

Quantitative results about class numbers are often derived using Dirichlet’s class number formula. First we must define Jacobi’s character and Dirichlet L-series. Define $\chi_d : \mathbb{Z}\to\{0,\pm1\}$ to be completely multiplicative with

$$\chi_d(p) := \begin{cases} 1, & x^2\equiv d\ (p) \text{ is solvable},\ p\nmid d,\\ -1, & x^2\equiv d\ (p) \text{ is insolvable},\\ 0, & p\mid d,\end{cases}\qquad \chi_d(2) := \begin{cases} 1, & d\equiv1\ (8),\\ -1, & d\equiv5\ (8),\\ 0, & d\equiv0\ (4),\end{cases}\qquad \chi_d(-1) := 1,$$

where $p$ denotes an odd prime. $\chi_d$ is a Dirichlet character modulo $d$ (the quadratic reciprocity law is used to prove the $d$-periodicity). For $\Re s > 0$, the series

$$L(s,\chi_d) := \sum_{n\ge1}\frac{\chi_d(n)}{n^s}$$

is uniformly convergent and defines a holomorphic function.

Proposition 2.2 (Dirichlet’s class number formula). For a positive non-square discriminant $d$, we have

$$h(d)\log\varepsilon_d = \sqrt d\,L(1,\chi_d).$$

In the next section this formula will be used to prove the almost periodicity of $\alpha$. In principle this could be done by writing

$$L(1,\chi_d) = \sum_{1\le n\le N}\frac{\chi_d(n)}{n} + \text{error}.$$

Note that the sum on the right-hand side is a periodic function of $d$ and can be used to approximate $L(1,\chi_d)$. But this procedure has three severe drawbacks: First the approximation is not particularly good. Therefore we will use a smoothed version of the series representation of $L(1,\chi_d)$ instead. Second we must approximate a sum of values $L(1,\chi_d)$ with the condition $\mathrm{tr}(\varepsilon_d) = n$. Breaking up this condition into easier summations, as must be done to process it further, would make things much more difficult. Third we want to compute the Fourier coefficients of $\alpha$ and show that they are multiplicative. This would be near to impossible with the above approach.
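To get a feeling for the two series representations mentioned here, the truncated sum and a smoothed sum with weights $e^{-l/N}$ can be evaluated numerically and compared with the class number formula. The sketch below assumes the case $d = 5$, where $h(5) = 1$ and $\varepsilon_5 = (3+\sqrt5)/2$; the function names and the cutoff used to truncate the smoothed series are illustrative choices, not taken from the paper.

```python
from math import exp, log, sqrt

def chi_prime(d, p):
    """chi_d at a prime p, following the case distinction in the text."""
    if p == 2:
        if d % 4 == 0:
            return 0
        return 1 if d % 8 == 1 else -1
    if d % p == 0:
        return 0
    # Euler's criterion decides whether x^2 = d (mod p) is solvable.
    return 1 if pow(d, (p - 1) // 2, p) == 1 else -1

def chi(d, n):
    """Completely multiplicative extension of chi_prime to n >= 1."""
    value, p = 1, 2
    while n > 1:
        if p * p > n:
            p = n  # the remaining cofactor is prime
        while n % p == 0:
            value *= chi_prime(d, p)
            n //= p
        p += 1
    return value

def character_table(d):
    """Values chi_d(1), ..., chi_d(d); chi_d is periodic modulo d."""
    return [chi(d, r) for r in range(1, d + 1)]

def truncated_sum(d, N):
    """Partial sum of chi_d(n)/n over 1 <= n <= N."""
    table = character_table(d)
    return sum(table[(n - 1) % d] / n for n in range(1, N + 1))

def smoothed_sum(d, N, cutoff=30):
    """Sum of chi_d(l)/l * exp(-l/N), cut off at l = cutoff * N (assumed cutoff)."""
    table = character_table(d)
    return sum(table[(l - 1) % d] / l * exp(-l / N)
               for l in range(1, cutoff * N + 1))

# Class number formula for d = 5: h(5) log(eps_5) = sqrt(5) L(1, chi_5).
target = log((3 + sqrt(5)) / 2)
approx_truncated = sqrt(5) * truncated_sum(5, 100000)
approx_smoothed = sqrt(5) * smoothed_sum(5, 2000)
```

Both `approx_truncated` and `approx_smoothed` should lie close to `target`; the comparison is only a numerical sanity check for one discriminant and says nothing about uniformity in $d$, which is the real difficulty addressed in Sect. 3.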


Instead an approximating periodic function is used which already incorporates some sort of multiplicativity. From the multiplicativity of $\chi_d$ the Euler product

$$L(s,\chi_d) = \prod_p\Big(1 - \frac{\chi_d(p)}{p^s}\Big)^{-1},\qquad \Re s > 1,$$

follows. Thus it seems more reasonable to use a partial product of this representation than a partial sum of the series as approximating function.
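A partial Euler product at $s = 1$ is easy to evaluate for a single discriminant. The following sketch, again for the assumed test case $d = 5$, multiplies the local factors $(1 - \chi_d(p)/p)^{-1}$ over primes $p\le P$ and compares the result with $L(1,\chi_5) = 2\log\big((1+\sqrt5)/2\big)/\sqrt5$, which follows from Proposition 2.2 with $h(5) = 1$. The helper names are hypothetical.

```python
from math import log, sqrt

def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def chi_prime(d, p):
    """chi_d at a prime p, following the definition in Sect. 2."""
    if p == 2:
        if d % 4 == 0:
            return 0
        return 1 if d % 8 == 1 else -1
    if d % p == 0:
        return 0
    return 1 if pow(d, (p - 1) // 2, p) == 1 else -1

def partial_euler_product(d, P):
    """Product over primes p <= P of (1 - chi_d(p)/p)^(-1)."""
    product = 1.0
    for p in primes_up_to(P):
        product /= 1.0 - chi_prime(d, p) / p
    return product

exact_value = 2 * log((1 + sqrt(5)) / 2) / sqrt(5)  # L(1, chi_5)
approximation = partial_euler_product(5, 10000)
```

The product converges only conditionally at $s = 1$, so the agreement with `exact_value` is modest for small $P$; the point of the sketch is merely that each factor depends on a single prime, which is the multiplicative structure exploited later.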

3. Almost Periodicity

The standard reference for almost periodic arithmetical functions is [17]. Here the necessary material will be reviewed briefly. Let $q\ge1$. For $f:\mathbb{N}\to\mathbb{C}$, define the seminorm

$$\|f\|_q := \Big(\limsup_{N\to\infty}\frac1N\sum_{1\le n\le N}|f(n)|^q\Big)^{1/q}\in[0,\infty].$$

$f$ is called $q$-limit periodic if for every $\varepsilon>0$ there is a periodic function $h$ with $\|f-h\|_q\le\varepsilon$. The set $\mathcal{D}^q$ of all $q$-limit periodic functions becomes a Banach space with norm $\|\cdot\|_q$ if functions $f_1, f_2$ with $\|f_1-f_2\|_q = 0$ are identified. If $1\le q_1\le q_2<\infty$, we have $\mathcal{D}^1\supseteq\mathcal{D}^{q_1}\supseteq\mathcal{D}^{q_2}$ as sets (but they are endowed with different norms!). There is the more general notion of $q$-almost periodic function which will not be used in this paper (they are defined as above but with arbitrary trigonometric polynomials for $h$ instead of periodic functions). For all $f\in\mathcal{D}^1$, the mean value

$$M(f) := \lim_{N\to\infty}\frac1N\sum_{1\le n\le N}f(n)$$

exists. The space $\mathcal{D}^2$ is a Hilbert space with inner product

$$\langle f,h\rangle := M(f\bar h),\qquad f,h\in\mathcal{D}^2.$$

For $u\in\mathbb{R}$, define $e_u(n) := e^{2\pi iun}$, $n\in\mathbb{N}$. For all $f\in\mathcal{D}^1$, the Fourier coefficients $\hat f(u) := M(fe_{-u})$, $u\in\mathbb{R}$, exist. For $u\notin\mathbb{Q}$, we have $\hat f(u) = 0$ (this comes from the fact that $f$ can be approximated by linear combinations of functions $e_v$ with $v\in\mathbb{Q}$, and that $M(e_ve_{-u})$ equals 1 if $v-u\in\mathbb{Z}$ and 0 otherwise; for almost periodic functions it is no longer true). In $\mathcal{D}^2$, we have the canonical orthonormal base $\{e_{a/b}\}$, where $1\le a\le b$ and $\gcd(a,b) = 1$.

Limit periodic (and almost periodic) functions have a couple of nice properties. They can be added, multiplied and plugged into continuous functions and, under certain conditions, the result is again a limit periodic (almost periodic) function. They have mean values and limit distributions. Here we will use Parseval’s equation. In Corollary 4.2 we will prove that $\alpha\in\mathcal{D}^q$ for all $q\ge1$. As a side result in Sect. 5, we get $M(\alpha) = \hat\alpha(0)$


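$= 1$. Thus for $r\in\mathbb{N}_0$, we have $\tilde\alpha, \tilde\alpha_{+r}\in\mathcal{D}^2$ (where $\tilde\alpha_{+r}(n) := \tilde\alpha(n+r)$), and Parseval’s equation gives

$$\gamma(r) = M(\tilde\alpha\,\tilde\alpha_{+r}) = \langle\alpha,\alpha_{+r}\rangle - 2M(\alpha) + 1 = \sum_{b\ge1}\sum_{\substack{1\le a\le b:\\ (a,b)=1}}\hat\alpha\Big(\frac ab\Big)\overline{\hat\alpha_{+r}\Big(\frac ab\Big)} - 1 = \sum_{b\ge1}\sum_{\substack{1\le a\le b:\\ (a,b)=1}}\Big|\hat\alpha\Big(\frac ab\Big)\Big|^2e^{2\pi iar/b} - 1. \tag{3.1}$$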
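For a genuinely periodic real-valued function the Parseval identity behind (3.1) is a finite computation and can be checked directly. The sketch below takes a function of period $q$ given by its values on one period, computes the Fourier coefficients $\hat f(k/q) = M(fe_{-k/q})$ for $k = 0,\dots,q-1$ (each $k/q$ is, modulo 1, a reduced fraction $a/b$ with $b\mid q$), and compares the mean value $M(f\,f_{+r})$ with $\sum_k|\hat f(k/q)|^2e^{2\pi ikr/q}$. The sample values and function names are arbitrary illustrations.

```python
import cmath

def fourier_coefficient(values, k):
    """M(f e_{-k/q}) for a q-periodic f given by values[n % q]."""
    q = len(values)
    return sum(values[n % q] * cmath.exp(-2j * cmath.pi * k * n / q)
               for n in range(1, q + 1)) / q

def correlation(values, r):
    """Mean value of f(n) f(n + r) over one period."""
    q = len(values)
    return sum(values[n % q] * values[(n + r) % q] for n in range(q)) / q

def parseval_side(values, r):
    """Sum over k of |f^(k/q)|^2 e^{2 pi i k r / q}."""
    q = len(values)
    return sum(abs(fourier_coefficient(values, k)) ** 2
               * cmath.exp(2j * cmath.pi * k * r / q)
               for k in range(q))

sample = [3, 1, 4, 1, 5, 9]  # arbitrary real values on one period
differences = [abs(correlation(sample, r) - parseval_side(sample, r))
               for r in range(8)]
```

All entries of `differences` vanish up to rounding. For limit periodic $\alpha$ the same identity is obtained by approximation in $\mathcal{D}^2$, which is where the analytic work of this section goes.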

The last equation follows easily since $\alpha$ is real valued. If we can compute $\hat\alpha$ and show that it is multiplicative our main theorem follows.

The approximation of $\alpha$ by periodic functions is done in two steps. First we bring Dirichlet L-series into the picture. For $n\in\mathbb{N}$, $n>2$, define

$$\beta(n) := \sum_{\substack{d,v\ge1:\\ dv^2=n^2-4}}\frac1v L(1,\chi_d),$$

where we must remember that $d$ will always run through non-square positive discriminants.

Lemma 3.1. For $q\ge1$, we have $\|\alpha-\beta\|_q = 0$.

Proof. For $d$ fixed, the powers $\varepsilon_d^l$, $l\ge1$, of the fundamental unit give all solutions $(u,v)$ of the Pellian equation $u^2-dv^2 = 4$ with $u,v\ge1$ by way of the rule

$$\frac{u+v\sqrt d}{2} = \varepsilon_d^l.$$

For every such solution we have $v\sqrt d = \sqrt{u^2-4}$. Thus Proposition 2.2 gives

$$\beta(n) = \sum_{\substack{d,l\ge1:\\ \mathrm{tr}(\varepsilon_d^l)=n}}\frac{h(d)\log\varepsilon_d}{\sqrt{n^2-4}}.$$

Since

$$\log\varepsilon_d = \frac1l\log\varepsilon_d^l = \frac1l\log\frac{n+\sqrt{n^2-4}}{2},$$

it follows from (2.1) that

$$\beta(n) = \frac{1}{\sqrt{n^2-4}}\sum_{\substack{d,l\ge1:\\ \mathrm{tr}(\varepsilon_d^l)=n}}h(d)\,\frac1l\log\frac{n+\sqrt{n^2-4}}{2} = \frac{n}{\sqrt{n^2-4}}\cdot\frac{\log\frac12(n+\sqrt{n^2-4})}{\log n}\,\alpha(n) + O\Big(\frac{\log n}{n}\sum_{\substack{d\ge1,\ l\ge2:\\ \mathrm{tr}(\varepsilon_d^l)=n}}h(d)\Big).$$

Since $h(d)\ll\sqrt d$ (this can be seen, e.g. from Proposition 2.2 and the estimates $\log\varepsilon_d\ge\log\frac12(1+\sqrt d)$ and $L(1,\chi_d)\ll\log d$; the latter follows easily by partial summation from the orthogonality relation for characters), it follows that for $d,l\ge1$ with $\mathrm{tr}(\varepsilon_d^l) = n$,

$$h(d)\ll\sqrt d\ll(\varepsilon_d^l)^{1/l} = \Big(\frac{n+\sqrt{n^2-4}}{2}\Big)^{1/l}\le n^{1/l}.$$

Thus

$$\alpha(n)\ll\log n\sum_{\substack{d,v\ge1:\\ dv^2=n^2-4}}1\le\log n\cdot\tau(n^2-4)\ll n^{\varepsilon}$$

for all $\varepsilon>0$; here $\tau$ denotes the divisor function and $\tau(m)\ll m^{\varepsilon}$ was used. This implies

$$|\beta(n)-\alpha(n)|\ll\Big|\Big(1-\frac4{n^2}\Big)^{-1/2}\Big(1+\frac{\log\frac12(1+\sqrt{1-4/n^2})}{\log n}\Big)-1\Big|\cdot|\alpha(n)| + \frac{\log n}{\sqrt n}\sum_{\substack{d,v\ge1:\\ dv^2=n^2-4}}1\ll n^{-1/2+\varepsilon},$$

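which proves the lemma.

In the crucial second step $\beta$ is approximated by a sum of partial products of the Euler product of $L(1,\chi_d)$. For $n>2$, $P\ge2$, define

$$\beta_P(n) := \sum_{\substack{d,v\ge1:\ dv^2=n^2-4,\\ p\mid v\Rightarrow p\le P}}\frac1v\prod_{p\le P}\Big(1-\frac{\chi_d(p)}{p}\Big)^{-1}.$$

Then $\beta(n)-\beta_P(n) = \Delta^{(1)}_P(n)+\Delta^{(2)}_P(n)$, where

$$\Delta^{(1)}_P(n) := \sum_{\substack{d,v\ge1:\ dv^2=n^2-4,\\ p\mid v \text{ for some } p>P}}\frac1vL(1,\chi_d)$$

and

$$\Delta^{(2)}_P(n) := \sum_{\substack{d,v\ge1:\ dv^2=n^2-4,\\ p\mid v\Rightarrow p\le P}}\frac1v\Big(L(1,\chi_d)-\prod_{p\le P}\Big(1-\frac{\chi_d(p)}{p}\Big)^{-1}\Big).$$

Fix $q\in\mathbb{N}$ with $q>1$ and choose $q'>1$ with $1/(2q)+1/q' = 1$. The next lemma shows that $\Delta^{(1)}_P(n)$ is negligible in the $2q$-norm as $P\to\infty$.

Lemma 3.2. For $P\ge2$, we have

$$\|\Delta^{(1)}_P\|_{2q}\ll\Big(\sum_{v>P}\frac1{v^{q'}}\Big)^{1/q'}.$$

Proof. Hölder’s inequality gives

$$|\Delta^{(1)}_P(n)|\le\Big(\sum_{\substack{d,v\ge1:\ dv^2=n^2-4,\\ p\mid v \text{ for some } p>P}}\frac1{v^{q'}}\Big)^{1/q'}\Big(\sum_{\substack{d,v\ge1:\ dv^2=n^2-4,\\ p\mid v \text{ for some } p>P}}L(1,\chi_d)^{2q}\Big)^{1/(2q)}.$$

For $x\ge1$, this gives

$$\sum_{2<n\le x}|\Delta^{(1)}_P(n)|^{2q}\le\Big(\sum_{v>P}\frac1{v^{q'}}\Big)^{2q/q'}\sum_{2<n\le x}\ \sum_{\substack{d,v\ge1:\\ dv^2=n^2-4}}L(1,\chi_d)^{2q}.$$

The second sum on the right-hand side is

$$\sum_{\substack{d,l\ge1:\\ \mathrm{tr}(\varepsilon_d^l)\le x}}L(1,\chi_d)^{2q}\sim\mathrm{const}\cdot x$$

as $x\to\infty$ (see [12]). This means that the values $L(1,\chi_d)$ are constant in the mean when ordered according to the sizes of their fundamental units. Thus the lemma follows.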

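In order to estimate $\Delta^{(2)}_P(n)$ we must compare $L(1,\chi_d)$ with a partial product of its Euler product. This is done by comparing both terms with a smoothed version of the Dirichlet series for $L(1,\chi_d)$. Let $N\ge1$. Then

$$\Delta^{(2)}_P(n) = \Delta^{(2,1)}_{P,N}(n)+\Delta^{(2,2)}_{P,N}(n)+\Delta^{(2,3)}_{P,N}(n),$$

where

$$\Delta^{(2,1)}_{P,N}(n) := \sum_{d,v}\frac1v\Big(L(1,\chi_d)-\sum_{l\ge1}\frac{\chi_d(l)}{l}e^{-l/N}\Big),$$

$$\Delta^{(2,2)}_{P,N}(n) := \sum_{d,v}\frac1v\sum_{\substack{l\ge1:\\ p\mid l \text{ for some } p>P}}\frac{\chi_d(l)}{l}e^{-l/N},$$

$$\Delta^{(2,3)}_{P,N}(n) := \sum_{d,v}\frac1v\sum_{\substack{l\ge1:\\ p\mid l\Rightarrow p\le P}}\frac{\chi_d(l)}{l}\big(e^{-l/N}-1\big);$$

here the conditions on $d$ and $v$ are as in $\Delta^{(2)}_P(n)$. The third term can be estimated easily.

Lemma 3.3. For $P\ge2$ and $x,N\ge1$, we have

$$\Big(\frac1x\sum_{2<n\le x}|\Delta^{(2,3)}_{P,N}(n)|^{2q}\Big)^{1/(2q)}\ll N^{-1/2}+\sum_{\substack{l>\sqrt N:\\ p\mid l\Rightarrow p\le P}}\frac1l.$$

Proof. Since $|e^{-u}-1|\ll u$ for $0\le u\le1$, we see that for $n>2$ the inner sum in $\Delta^{(2,3)}_{P,N}(n)$ is

$$\ll\sum_{\substack{l\ge1:\\ p\mid l\Rightarrow p\le P}}\frac1l\big|e^{-l/N}-1\big|\ll\sum_{\substack{l>\sqrt N:\\ p\mid l\Rightarrow p\le P}}\frac2l+\sum_{1\le l\le\sqrt N}\frac1l\cdot\frac lN\ll\sum_{\substack{l>\sqrt N:\\ p\mid l\Rightarrow p\le P}}\frac1l+N^{-1/2} =: c_1(P,N).$$

Hölder’s inequality now gives

$$\sum_{2<n\le x}|\Delta^{(2,3)}_{P,N}(n)|^{2q}\ll\sum_{2<n\le x}\Big(\sum_{\substack{d,v\ge1:\\ dv^2=n^2-4}}\frac1vc_1(P,N)\Big)^{2q}\le c_1(P,N)^{2q}\sum_{2<n\le x}\Big(\sum_{\substack{d,v\ge1:\\ dv^2=n^2-4}}\frac1{v^{q'}}\Big)^{2q/q'}\sum_{\substack{d,v\ge1:\\ dv^2=n^2-4}}1\ll c_1(P,N)^{2q}\sum_{2<n\le x}\ \sum_{\substack{d,v\ge1:\\ dv^2=n^2-4}}1,$$

where $q'>1$ was used. The last sum is

$$\sum_{\substack{d,l\ge1:\\ \mathrm{tr}(\varepsilon_d^l)\le x}}1\sim\mathrm{const}\cdot x$$

(see [12] or [15]), and the lemma follows.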

So far we did not use the oscillation of the Jacobi character. For the estimation of $\Delta^{(2,2)}_{P,N}$ we must take it into account.

Lemma 3.4. For $l,v\in\mathbb{N}$ and $x\ge3$, we have

$$\sum_{\substack{d\ge1,\ 2<n\le x:\\ dv^2=n^2-4}}\chi_d(l)\ll\frac{x}{v^{2-\varepsilon}K(l)}+v^{\varepsilon}l,$$

where $K(l)$ is the squarefree kernel of $l$ and $\varepsilon>0$.

Proof. See [14], estimate (2.7).

Lemma 3.5. For $P\ge2$ and $x,N\ge1$, we have

$$\frac1x\sum_{2<n\le x}|\Delta^{(2,2)}_{P,N}(n)|^{2q}\ll\sum_{l>P^{2q}}\frac{\tau_{2q}(l)}{l\,K(l)}+\frac1x(\log N)^{2q}+x^{-1/2+\varepsilon}N^{2q},$$

where

$$\tau_k(l) := \sum_{\substack{l_1,\dots,l_k\ge1:\\ l_1\cdots l_k=l}}1.$$
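The generalized divisor function $\tau_k(l)$ counts ordered factorizations of $l$ into $k$ positive factors. As a small illustration (the helper names are hypothetical), it can be computed either from the prime factorization, using that $\tau_k(p^e) = \binom{e+k-1}{k-1}$, or by direct recursion over divisors; the two should agree.

```python
from math import comb, prod

def factorize(n):
    """Prime factorization of n >= 1 as a dict {prime: exponent}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def tau_k(l, k):
    """Number of ordered k-tuples (l_1, ..., l_k) of positive integers with product l."""
    return prod(comb(e + k - 1, k - 1) for e in factorize(l).values())

def tau_k_bruteforce(l, k):
    """Same count by recursion on the first factor; only for cross-checking."""
    if k == 1:
        return 1
    return sum(tau_k_bruteforce(l // t, k - 1)
               for t in range(1, l + 1) if l % t == 0)
```

For $k = 2$ this is the ordinary divisor function $\tau$ used in the proof of Lemma 3.1.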

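Proof. Split the first sum in $\Delta^{(2,2)}_{P,N}(n)$ into two sums depending on whether $v\le\sqrt n$ or $v>\sqrt n$. Thus $\Delta^{(2,2)}_{P,N}(n) = \Delta^{(2,2,1)}_{P,N}(n)+\Delta^{(2,2,2)}_{P,N}(n)$. A trivial estimate gives

$$\Delta^{(2,2,2)}_{P,N}(n)\ll\sum_{\substack{d,v\ge1:\ dv^2=n^2-4,\\ v>\sqrt n}}\frac1v\sum_{l\ge1}\frac1le^{-l/N}\ll\frac1{\sqrt n}\,\tau(n^2-4)\log N$$

and thus

$$\sum_{2<n\le x}|\Delta^{(2,2,2)}_{P,N}(n)|^{2q}\ll\sum_{2<n\le x}n^{2q(-1/2+\varepsilon)}(\log N)^{2q}\ll(\log N)^{2q}, \tag{3.2}$$

since $q>1$. Hölder’s inequality gives

$$|\Delta^{(2,2,1)}_{P,N}(n)|\le\sum_{\substack{d,v\ge1:\ dv^2=n^2-4,\\ v\le\sqrt n}}\frac1v\Big|\sum_{\substack{l\ge1:\\ p\mid l \text{ for some } p>P}}\frac{\chi_d(l)}{l}e^{-l/N}\Big|\le\Big(\sum_{\substack{d,v\ge1:\\ dv^2=n^2-4}}\frac1{v^{q'}}\Big)^{1/q'}\times\Big(\sum_{\substack{d,v\ge1:\ dv^2=n^2-4,\\ v\le\sqrt n}}\Big|\sum_{\substack{l\ge1:\\ p\mid l \text{ for some } p>P}}\frac{\chi_d(l)}{l}e^{-l/N}\Big|^{2q}\Big)^{1/(2q)}.$$

Thus for $x\ge1$,

$$\sum_{2<n\le x}|\Delta^{(2,2,1)}_{P,N}(n)|^{2q}\ll\sum_{1\le v\le\sqrt x}\ \sum_{\substack{d\ge1,\ 2<n\le x:\\ dv^2=n^2-4}}\Big(\sum_{\substack{l\ge1:\\ p\mid l \text{ for some } p>P}}\frac{\chi_d(l)}{l}e^{-l/N}\Big)^{2q} = \sum_{\substack{l_1,\dots,l_{2q}:\\ p_i\mid l_i \text{ for some } p_i>P}}\frac{1}{l_1\cdots l_{2q}}e^{-(l_1+\cdots+l_{2q})/N}\times\sum_{1\le v\le\sqrt x}\ \sum_{\substack{d\ge1,\ 2<n\le x:\\ dv^2=n^2-4}}\chi_d(l_1\cdots l_{2q}).$$

Applying Lemma 3.4 to the innermost sum gives the estimate

$$\ll\sum_{l>P^{2q}}\frac{\tau_{2q}(l)}{l}\sum_{1\le v\le\sqrt x}\frac{x}{v^{2-\varepsilon}K(l)}+\sum_{l_1,\dots,l_{2q}\ge1}\ \sum_{1\le v\le\sqrt x}v^{\varepsilon}\,\frac{l_1\cdots l_{2q}}{l_1\cdots l_{2q}}e^{-(l_1+\cdots+l_{2q})/N}\ll x\sum_{l>P^{2q}}\frac{\tau_{2q}(l)}{l\,K(l)}+x^{(1+\varepsilon)/2}N^{2q},$$

which together with (3.2) proves the lemma.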

In order to estimate $\Delta^{(2,1)}_{P,N}(n)$ we must show that the error

$$I(d,N) := L(1,\chi_d)-\sum_{l\ge1}\frac{\chi_d(l)}{l}e^{-l/N}, \tag{3.3}$$

which comes from smoothing the Dirichlet series expansion of $L(1,\chi_d)$, is small for large $N$. This is done by representing $I(d,N)$ as an integral over a vertical line in the critical strip $\{0<\Re s<1\}$ and using information about the location of the nontrivial zeros of $L(s,\chi_d)$. The Dirichlet series for $L(s,\chi_d)$ is absolutely and uniformly convergent on the line $\Re s = 2$. Stirling’s formula gives $\Gamma(s)\ll|\Im s|^ce^{-\pi|\Im s|/2}$ for $c_1\le\Re s\le c_2$, $|\Im s|\ge1$, with some constant $c = c(c_1,c_2)$. Using Mellin’s formula

$$\frac1{2\pi i}\int_{1-i\infty}^{1+i\infty}\Gamma(s)y^{-s}\,ds = e^{-y},\qquad y>0,$$

we get

$$\frac1{2\pi i}\int_{2-i\infty}^{2+i\infty}\Gamma(s-1)L(s,\chi_d)N^{s-1}\,ds = \sum_{l\ge1}\frac{\chi_d(l)}{l}e^{-l/N}. \tag{3.4}$$

On the other hand, for $1/2\le\Re s\le2$, we have

$$L(s,\chi_d)\ll(d|s|)^{1/2} \tag{3.5}$$

(this is easily seen by partial summation). Thus the line of integration in (3.4) may be moved to the line $\Re s = \kappa$ with some $1/2<\kappa<1$. Taking into account the pole of the integrand at $s = 1$, and using the residue theorem we get for (3.4) the expression

$$\frac1{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty}\Gamma(s-1)L(s,\chi_d)N^{s-1}\,ds+L(1,\chi_d)$$

and thus

$$I(d,N) = -\frac1{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty}\Gamma(s-1)L(s,\chi_d)N^{s-1}\,ds. \tag{3.6}$$

We must now considerably reduce the exponent $1/2$ of $d$ in (3.5). In order to see the principle let us first assume the Generalized Riemann Hypothesis which says that for all Dirichlet characters $\chi$ modulo $q$, the L-series $L(s,\chi)$ has only zeros with real part $1/2$ in the critical strip $0<\Re s<1$. From this the Generalized Lindelöf Hypothesis follows and in particular for all $d\ge1$, $\Re s = \kappa$ and $\varepsilon>0$, we have $L(s,\chi_d)\ll(d|s|)^{\varepsilon}$. From (3.6) it follows that for $d,N\ge1$, $I(d,N)\ll d^{\varepsilon}N^{\kappa-1}$. This shows that

$$\Delta^{(2,1)}_{P,N}(n)\ll\sum_{\substack{d,v\ge1:\\ dv^2=n^2-4}}|I(d,N)|\ll N^{\kappa-1}n^{3\varepsilon}$$

and for $P\ge2$ and $x,N\ge1$, we get

$$\frac1x\sum_{2<n\le x}|\Delta^{(2,1)}_{P,N}(n)|^{2q}\ll x^{6q\varepsilon}N^{2q(\kappa-1)}. \tag{3.7}$$

Here it is important that the exponent of $N$ is negative. Taking $N$ to be a power of $x$ with small exponent therefore lets the contribution of $\Delta^{(2,1)}_{P,N}(n)$ vanish as $x\to\infty$. The next lemma gives an estimate which for our purposes is as good as (3.7) and can be proved unconditionally.


Lemma 3.6. There are 0 < κ, µ < 1 such that for P ≥ 2, x, N ≥ 1 and > 0, we have 2q 1 (2,1) 2q ,P ,N (n) x N 2q(κ−1) + x µ−1+ log(x 2 N ) . x 2
Proof. Choose $1/2 < \sigma_0 < 1$ with $\mu := 8(1-\sigma_0)/\sigma_0 < 1$. Choose $\sigma_0 < \kappa < 1$. Define the rectangle

$$R_x := \{ s \in \mathbb{C} \mid \sigma_0 \le \Re s \le 1,\ |\Im s| \le \log^2 x \}.$$

If $d \le x^2$ and $L(s,\chi_d)$ has no zeros in $R_x$ then a standard argument (see, for example, Titchmarsh [18], Theorem 14.2) shows that for $\Re s = \kappa$, $|\Im s| \le (\log x)^2/2$, we have

$$\log|L(s,\chi_d)| \ll \log\log(x+2)\,\bigl(\log(x+2)\bigr)^{(1-\kappa)/(1-\sigma_0)} \le \epsilon \log x + c(\epsilon)$$

with some constant $c(\epsilon) > 0$ depending on $\epsilon$. Together with (3.5) and (3.6) this gives

$$I(d,N) \ll \int_{|t|\le(\log x)^2/2} (|t|+1)^{c} e^{-\pi|t|/2}\, x^{\epsilon} N^{\kappa-1}\, dt + \int_{|t|\ge(\log x)^2/2} |t|^{c} e^{-\pi|t|/2}\, (d|t|)^{1/2} N^{\kappa-1}\, dt \ll x^{\epsilon} N^{\kappa-1}. \tag{3.8}$$

Next we must show that $L(s,\chi_d)$ cannot have a zero in $R_x$ too often. From zero density estimates it follows that

$$\#\bigl\{(n,v,d) \bigm| 2 < n \le x,\ d, v \ge 1,\ n^2 - dv^2 = 4,\ L(s,\chi_d) \text{ has a zero in } R_x \bigr\} \ll x^{\mu+\epsilon} \tag{3.9}$$

(see [12], Lemma 4.11 or [14], estimate (2.6)). A trivial estimation of (3.3) gives

$$I(d,N) \ll \log(dN). \tag{3.10}$$

(3.8), (3.9), (3.10) and Hölder’s inequality give (2,1) 2q , P ,N (n) 2

2

×

d,v≥1: dv 2 =n2 −4

1 vq

2q/q

|I (d, N )|2q

d,v≥1: dv 2 =n2 −4

2

(x N κ−1 )2q +

2
2q x(x N κ−1 )2q + x µ+ log(x 2 N ) , which proves the lemma.

log(dN )

2q


Now the results are collected. Proposition 3.7. For P ≥ 2, we have β − βP 2q

v>P

1 vq

1/q +

τ2q (l) 1/(2q) . l K(l) 2q l>P

Proof. For x ≥ 1, choose N := x 1/(8q) . Then Lemmas 3.3, 3.5 and 3.6 show that 1 (2) 2q ,P (n) x 2
l>x 1/(16q) : p|l⇒p≤P

1 l

2q +

τ2q (l) l K(l) 2q

l>P

1 (log x)2q + x −1/4+ + x (κ−1)/4+ + x µ−1+ (log x)2q . x Since the series l≥1: p|l⇒p≤P 1/ l converges, we have for P ≥ 2 fixed +

τ2q (l) (2) 2q , (n) . P 2q l K(l) 2q l>P

Together with Lemma 3.2 this proves the proposition.

4. An Euler Product

In this section we exploit the particular construction of $\beta_P$ by writing it as a product of functions each depending only on a single prime. For $p$ a prime, $b \in \mathbb{N}_0$ and $n \in \mathbb{Z}$, set $I_{p^b}(n) := 1$ if $n^2 \equiv 4 \ (p^{2b})$ and, in case $p = 2$, if $(n^2-4)2^{-2b}$ is a discriminant (for $p > 2$ this is automatically fulfilled). Set $I_{p^b}(n) := 0$ otherwise. Define

$$\beta_{(p)}(n) := \sum_{b\ge 0} \frac{1}{p^b}\Bigl(1 - \frac{1}{p}\,\chi_{(n^2-4)p^{-2b}}(p)\Bigr)^{-1} I_{p^b}(n), \qquad n > 2. \tag{4.1}$$

Lemma 4.1. For $P \ge 2$, we have

$$\beta_P = \prod_{p\le P} \beta_{(p)}.$$

Proof. In the definition of $\beta_P(n)$, write $v = \prod_{p\le P} p^{b_p}$ with $b_p \in \mathbb{N}_0$. There is a discriminant $d > 0$ with $dv^2 = n^2 - 4$ iff $p^{2b_p} \mid n^2 - 4$ for all $p \le P$ and $d := (n^2-4)v^{-2} \equiv 0, 1 \ (4)$. Since $p^{2b_p} \equiv 1 \ (4)$ for $2 < p \le P$, the last condition is equivalent to $(n^2-4)2^{-2b_2} \equiv 0, 1 \ (4)$. If these conditions are fulfilled, we have $(n^2-4)p^{-2b_p} = d r_p^2$ with $r_p \in \mathbb{N}$, $p \nmid r_p$, for $p \le P$. Thus $\chi_{(n^2-4)p^{-2b_p}}(p) = \chi_d(p)$ for $p \le P$. This proves the lemma.

Corollary 4.2. For $q \ge 1$, we have $\alpha \in D^q$. In particular, $\lim_{P\to\infty} \beta_P = \alpha$ with respect to the $q$-norm.


Proof. For q ∈ N with q > 1 fixed and q > 1 with 1/(2q) + 1/q = 1 it follows from Lemma 3.1 and Proposition 3.7 that

α − βP 2q c2 (P )1/q + c3 (P )1/(2q) ; here c2 (P ) :=

1 → 0 vq v>P

as P → ∞ since q > 1. Furthermore, c3 (P ) :=

τ2q (l) →0 l K(l) 2q

l>P

as P → ∞, since τ2q (l) = l K(l) l≥1

a,b≥1: a squarefree

a b2 τ2q (ab2 ) < ∞. ab2 · a a2 b2 a≥1

b≥1

Thus limP →∞ α − βP 2q = 0. For f : N → ∞ arbitrary and 1 ≤ q1 ≤ q2 < ∞, we have f q1 ≤ f q2 by Hölder’s inequality. Thus limP →∞ α − βP q = 0 for all q ≥ 1. Since the bth summand of β(p) is p 2b+1 -periodic (22b+3 -periodic in case p = 2) and the series representing β(p) is uniformly convergent, the function β(p) is uniformly limit periodic, i.e. β(p) ∈ Du ; here Du is the set of all functions which can be approximated to an arbitrary accuracy by periodic functions with respect to the supremum norm. Since Du is closed under multiplication it follows from Lemma 4.1 that βP ∈ Du for all P ≥ 2. This gives α ∈ Dq for all q ≥ 1. Next the Fourier coefficients of α are computed in terms of the Fourier coefficients of the β(p) . In particular, this will show their multiplicativity. Lemma 4.3. (a) For all primes p, we have β (p) (0) = 1. (b) For b ∈ N, a ∈ Z, gcd(a, b) = 1, choose ap ∈ Z for all p|b such that − ordp b ≡ ab−1 (1). Then p|b ap p a a p α . = β (p) ordp b b p p|b Proof. From Corollary 4.2 it follows that for arbitrary 0 < < 1 there is some P ≥ 2 with α − βP 1 ≤ and ordp b = 0 for all p > P . For all p ≤ P it follows from (4.1) that there is some lp ≥ ordp b and coefficients cp (ap∗ ) ∈ C, 1 ≤ ap∗ ≤ plp , such that cp (ap∗ ) ea ∗ /plp ≤ , β(p) − 1≤ap∗ ≤plp

u

p

−P . where · u denotes the supremum norm and := P −1 maxp≤P β(p) u + 1 From Lemma 4.1 it follows that cp (ap∗ ) ea ∗ /plp ≤ . βP − p≤P

1≤ap∗ ≤plp

p

u


Thus α −

1≤ap∗ ≤plp (p≤P ) p≤P

cp (ap∗ ) e

lp ∗ p≤P ap /p

≤ 2. 1

For f ∈ D1 we have |f| ≤ f 1 . Furthermore, p≤P ap∗ p −lp ≡ ab−1 (1) iff ap∗ ≡ ap p lp −ordp b (p lp ) for p ≤ P . Therefore the orthogonality relation for the exponential function gives a α cp (ap p lp −ordp b ) ≤ 2. − b p≤P

− ordp b ) − c (a p lp −ordp b ) ≤ for p ≤ P and thus Similarly, β p p (p) (ap p − ordp b lp −ordp b β (a p ) − c (a p ) ≤ . p p (p) p p≤P

p≤P

This gives a a p α β − ≤ 3. (p) b p ordp b p≤P

In the next section we will compute β (p) and thereby show that β (p) (0) = 1. This gives (a) and a a p α − β ≤ 3. (p) b p ordp b p|b

Since 0 < < 1 is arbitrary, (b) follows.

From Corollary 4.2 it follows that (3.1) holds. Here the series on the right hand side is absolutely convergent (plug in $r = 0$). Thus Lemma 4.3 gives

$$\gamma(r) + 1 = \prod_{p}\Bigl(1 + \sum_{b_p\ge 1} A_r(p^{b_p})\Bigr),$$

where for $p$ prime and $b \in \mathbb{N}$, we define

$$A_r(p^b) := \sum_{1\le a\le p^b,\ p\nmid a} \Bigl|\hat\beta_{(p)}\Bigl(\frac{a}{p^b}\Bigr)\Bigr|^2 e^{2\pi i a r/p^b}.$$


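5. Computation of the Local Factors

The last step is to calculate $\hat\beta_{(p)}$. In particular, this will show that $\hat\beta_{(p)}(0) = 1$, which is left over from the proof of Lemma 4.3. For $p$ prime, $b \in \mathbb{N}_0$, define

$$\beta_{(p,b)}(n) := \Bigl(1 - \frac{1}{p}\,\chi_{(n^2-4)p^{-2b}}(p)\Bigr)^{-1} I_{p^b}(n), \qquad n > 2.$$

Then $\beta_{(p)} = \sum_{b\ge 0} p^{-b}\beta_{(p,b)}$, where the series is uniformly convergent. The calculation will only be done for $p > 2$. The case $p = 2$ is similar but somewhat more elaborate. Let $b, c \in \mathbb{N}_0$, $a \in \mathbb{Z}$, $p \nmid a$.

Case 1. $2b < c - 1$. Since $\beta_{(p,b)}$ is $p^{2b+1}$-periodic, we have $\hat\beta_{(p,b)}(a/p^c) = 0$.

Case 2. $b = c = 0$. Then

$$\hat\beta_{(p,0)}(0) = \frac{1}{p}\sum_{n \bmod p}\Bigl(1 - \frac{1}{p}\chi_{n^2-4}(p)\Bigr)^{-1} = \frac{1}{p(1-1/p)}\,\#\{n \bmod p \mid \chi_{n^2-4}(p) = 1\} + \frac{1}{p(1+1/p)}\,\#\{n \bmod p \mid \chi_{n^2-4}(p) = -1\} + \frac{2}{p}.$$

The cardinality of the first set is $(p-3)/2$ and that of the second is $(p-1)/2$ (see for example the proof of Lemma 3.3 in [14]). Thus $\hat\beta_{(p,0)}(0) = 1 - 2/(p(p^2-1))$.

Case 3. $b = 0$, $c = 1$. Define

$$S_p^{\pm}(a) := \sum_{n \bmod p:\ \chi_{n^2-4}(p) = \pm 1} e^{2\pi i a n/p}.$$

Then

$$\hat\beta_{(p,0)}\Bigl(\frac{a}{p}\Bigr) = \frac{1}{p}\sum_{n \bmod p}\Bigl(1 - \frac{1}{p}\chi_{n^2-4}(p)\Bigr)^{-1} e^{-2\pi i a n/p} = \frac{1}{p}\Bigl(2\cos\frac{4\pi a}{p} + \sum_{\pm}\Bigl(1 \mp \frac{1}{p}\Bigr)^{-1} S_p^{\pm}(a)\Bigr).$$

Case 4. $2b \ge c - 1$, $b \ge 1$. Then

$$\hat\beta_{(p,b)}\Bigl(\frac{a}{p^c}\Bigr) = \frac{1}{p^{2b+1}}\sum_{\pm}\ \sum_{n \bmod p^{2b+1}:\ n\equiv \pm 2\ (p^{2b})}\Bigl(1 - \frac{1}{p}\chi_{(n^2-4)p^{-2b}}(p)\Bigr)^{-1} e^{-2\pi i a n/p^c}.$$

Setting $n = \pm 2 + m p^{2b}$ gives

$$\hat\beta_{(p,b)}\Bigl(\frac{a}{p^c}\Bigr) = \frac{1}{p^{2b+1}}\sum_{\pm} e^{\mp 4\pi i a/p^c}\sum_{m \bmod p}\Bigl(1 - \frac{1}{p}\Bigl(\frac{m}{p}\Bigr)\Bigr)^{-1} e^{\mp 2\pi i a m p^{2b-c}},$$

where $\bigl(\frac{\cdot}{p}\bigr)$ denotes the Legendre symbol.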


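Case 4.1. $2b = c - 1$. We have

$$\sum_{m \bmod p:\ (m/p) = 1} e^{\mp 2\pi i a m/p} = \frac{1}{2}\sum_{m \bmod p} e^{\mp 2\pi i a m/p}\Bigl(\Bigl(\frac{m}{p}\Bigr) + 1\Bigr) - \frac{1}{2} = \frac{1}{2}\sum_{m \bmod p} e^{\mp 2\pi i a m/p}\Bigl(\frac{m}{p}\Bigr) - \frac{1}{2}.$$

Set $\epsilon_p := 1$ if $p \equiv 1 \ (4)$ and $\epsilon_p := i$ otherwise. The last sum can be reduced to the Gaussian sum associated to the Legendre character, which can be computed explicitly (see for example [8], Chapter 2). This gives for the above quantity the value

$$\frac{1}{2}\Bigl(\frac{\mp a}{p}\Bigr)\epsilon_p\, p^{1/2} - \frac{1}{2}.$$

Therefore

$$\hat\beta_{(p,b)}\Bigl(\frac{a}{p^c}\Bigr) = \frac{1}{p^{2b+1}}\Bigl(\frac{-2}{p^2-1}\cos\frac{4\pi a}{p^c} + \frac{\epsilon_p\, p^{3/2}}{p^2-1}\Bigl(\Bigl(\frac{a}{p}\Bigr) e^{4\pi i a/p^c} + \Bigl(\frac{-a}{p}\Bigr) e^{-4\pi i a/p^c}\Bigr)\Bigr).$$

Case 4.2. $2b > c - 1$. Then

$$\hat\beta_{(p,b)}\Bigl(\frac{a}{p^c}\Bigr) = \frac{1}{p^{2b+1}}\sum_{\pm} e^{\mp 4\pi i a/p^c}\sum_{m \bmod p}\Bigl(1 - \frac{1}{p}\Bigl(\frac{m}{p}\Bigr)\Bigr)^{-1} = \frac{1}{p^{2b+1}}\sum_{\pm} e^{\mp 4\pi i a/p^c}\Bigl(\frac{1}{1-1/p}\cdot\frac{p-1}{2} + \frac{1}{1+1/p}\cdot\frac{p-1}{2} + 1\Bigr) = \frac{2}{p^{2b+1}}\cos\Bigl(\frac{4\pi a}{p^c}\Bigr)\frac{p^2+p+1}{p+1}.$$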
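The Gauss sum evaluation used in Case 4.1 can be checked numerically. The following is a minimal sketch, assuming the standard evaluation $\sum_{m \bmod p} (m/p)\, e^{2\pi i k m/p} = (k/p)\,\epsilon_p\, p^{1/2}$ for $p \nmid k$, with $\epsilon_p = 1$ for $p \equiv 1 \ (4)$ and $\epsilon_p = i$ otherwise; the function names are illustrative and not taken from the paper.

```python
import cmath
import math


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r  # r is 0, 1 or p - 1


def gauss_sum(k: int, p: int) -> complex:
    """Sum over m mod p of (m/p) * exp(2*pi*i*k*m/p)."""
    return sum(legendre(m, p) * cmath.exp(2j * math.pi * k * m / p)
               for m in range(p))


def predicted(k: int, p: int) -> complex:
    """Closed form (k/p) * eps_p * sqrt(p) assumed in the lead-in."""
    eps = 1 if p % 4 == 1 else 1j
    return legendre(k, p) * eps * math.sqrt(p)


def residue_sum(k: int, p: int) -> complex:
    """Sum of exp(2*pi*i*k*m/p) over quadratic residues m mod p (m != 0)."""
    return sum(cmath.exp(2j * math.pi * k * m / p)
               for m in range(p) if legendre(m, p) == 1)


def max_gauss_error(primes=(3, 5, 7, 11, 13)) -> float:
    """Largest deviation between the Gauss sum and its closed form."""
    return max(abs(gauss_sum(k, p) - predicted(k, p))
               for p in primes for k in range(1, p))
```

In the notation of Case 4.1 one takes $k = \mp a$; `residue_sum(k, p)` should then agree with `(gauss_sum(k, p) - 1) / 2`, which is the first displayed identity of that case.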

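Now we can calculate the Fourier coefficients

$$\hat\beta_{(p)}\Bigl(\frac{a}{p^c}\Bigr) = \sum_{b\ge 0}\frac{1}{p^b}\,\hat\beta_{(p,b)}\Bigl(\frac{a}{p^c}\Bigr).$$

Case 1. $c = 0$. Then

$$\hat\beta_{(p)}(0) = 1 - \frac{2}{p(p^2-1)} + \sum_{b\ge 1}\frac{1}{p^b}\,\frac{2}{p^{2b+1}}\cos(0)\,\frac{p^2+p+1}{p+1} = 1.$$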
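The two ingredients of this computation, the cardinalities quoted in Case 2 of the previous list and the geometric series over $b \ge 1$, can be verified in exact rational arithmetic. This is a sketch under the assumption (consistent with the Legendre symbols appearing later in this section) that $\chi_{n^2-4}(p)$ equals the Legendre symbol of $n^2-4$ modulo the odd prime $p$; the helper names are hypothetical.

```python
from fractions import Fraction


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def character_counts(p: int):
    """Number of n mod p with ((n^2-4)/p) equal to +1, -1 and 0."""
    values = [legendre(n * n - 4, p) for n in range(p)]
    return values.count(1), values.count(-1), values.count(0)


def beta_hat_p0_at_zero(p: int) -> Fraction:
    """(1/p) * sum over n mod p of (1 - chi/p)^(-1), computed exactly."""
    total = sum(1 / (1 - Fraction(legendre(n * n - 4, p), p))
                for n in range(p))
    return total / p


def tail(p: int) -> Fraction:
    """Closed form of sum_{b>=1} p^(-b) * 2/p^(2b+1) * (p^2+p+1)/(p+1)."""
    geometric = Fraction(2, p) / (p**3 - 1)  # sum_{b>=1} 2 / p^(3b+1)
    return geometric * Fraction(p * p + p + 1, p + 1)


def beta_hat_p_at_zero(p: int) -> Fraction:
    """b = 0 term plus the tail over b >= 1."""
    return beta_hat_p0_at_zero(p) + tail(p)
```

The tail equals $2/(p(p^2-1))$ because $p^3 - 1 = (p-1)(p^2+p+1)$, which is exactly what cancels the deficit of the $b = 0$ term.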


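Case 2. $c = 1$. Since $S_p^{+}(a) + S_p^{-}(a) = -2\cos(4\pi a/p)$, we get

$$\hat\beta_{(p)}\Bigl(\frac{a}{p}\Bigr) = \frac{1}{p}\Bigl(2\cos\frac{4\pi a}{p} + \sum_{\pm}\Bigl(1 \mp \frac{1}{p}\Bigr)^{-1} S_p^{\pm}(a)\Bigr) + \sum_{b\ge 1}\frac{1}{p^b}\,\frac{2}{p^{2b+1}}\cos\Bigl(\frac{4\pi a}{p}\Bigr)\frac{p^2+p+1}{p+1}$$

$$= \frac{2p}{p^2-1}\cos\frac{4\pi a}{p} + \sum_{\pm}\frac{1}{p \mp 1}\, S_p^{\pm}(a) = \frac{1}{p^2-1}\sum_{n \bmod p}\Bigl(\frac{n^2-4}{p}\Bigr) e^{2\pi i a n/p}.$$

Case 3. $c \ge 2$. Then

$$\hat\beta_{(p)}\Bigl(\frac{a}{p^c}\Bigr) = \sum_{0\le b<(c-1)/2}\frac{1}{p^b}\cdot 0 + \sum_{b:\ 2b=c-1}\frac{1}{p^b}\,\frac{1}{p^{2b+1}}\Bigl(\frac{-2}{p^2-1}\cos\frac{4\pi a}{p^c} + \frac{\epsilon_p\, p^{3/2}}{p^2-1}\Bigl(\Bigl(\frac{a}{p}\Bigr)e^{4\pi i a/p^c} + \Bigl(\frac{-a}{p}\Bigr)e^{-4\pi i a/p^c}\Bigr)\Bigr) + \sum_{b>(c-1)/2}\frac{1}{p^b}\,\frac{2}{p^{2b+1}}\cos\Bigl(\frac{4\pi a}{p^c}\Bigr)\frac{p^2+p+1}{p+1}$$

$$= \frac{1}{(p^2-1)\,p^{3c/2-2}}\begin{cases}\epsilon_p\Bigl(\bigl(\frac{a}{p}\bigr)e^{4\pi i a/p^c} + \bigl(\frac{-a}{p}\bigr)e^{-4\pi i a/p^c}\Bigr), & c \text{ odd},\\[4pt] 2\cos\dfrac{4\pi a}{p^c}, & c \text{ even}.\end{cases}$$

Finally $A_r(p^c)$ can be computed.

Case 1. $c = 1$. Then

$$A_r(p) = \frac{1}{(p^2-1)^2}\sum_{n_1, n_2 \bmod p}\Bigl(\frac{n_1^2-4}{p}\Bigr)\Bigl(\frac{n_2^2-4}{p}\Bigr)\sum_{1\le a\le p-1} e^{2\pi i a(n_1-n_2+r)/p}.$$

The innermost sum is $p - 1$ for $n_1 - n_2 + r \equiv 0 \ (p)$ and $-1$ otherwise. Since

$$\sum_{n \bmod p}\Bigl(\frac{n^2-4}{p}\Bigr) = \frac{p-3}{2} - \frac{p-1}{2} = -1,$$

the value for $A_r(p)$ as given in the theorem follows.

Case 2. $c \ge 2$. Then for $p \nmid a$, we have

$$\Bigl|\hat\beta_{(p)}\Bigl(\frac{a}{p^c}\Bigr)\Bigr| = \frac{1}{(p^2-1)\,p^{3c/2-2}}\,\Bigl|1 + \Bigl(\frac{-1}{p}\Bigr)^{c} e^{-8\pi i a/p^c}\Bigr|.$$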


Thus

$$A_r(p^c) = \frac{1}{(p^2-1)^2\, p^{3c-4}}\sum_{1\le a\le p^c,\ p\nmid a} e^{2\pi i a r/p^c}\Bigl(2 + 2\Bigl(\frac{-1}{p}\Bigr)^{c}\cos\frac{8\pi a}{p^c}\Bigr).$$

A short calculation gives the value in the theorem.

Acknowledgements. I would like to express my sincere gratitude to Prof. Zeév Rudnick for bringing this problem to my attention. I gained much from conversations with him and from the stimulating atmosphere which he and his co-organizers created at the DMV seminar “The Riemann Zeta Function and Random Matrix Theory”.

References

1. Aurich, R., Steiner, F.: Energy-level statistics of the Hadamard-Gutzwiller ensemble. Phys. D 43, 155–180 (1990)
2. Barner, K.: On A. Weil's explicit formula. J. Reine Angew. Math. 323, 139–152 (1981)
3. Bleher, P.: Trace formula for quantum integrable systems, lattice point problem, and small divisors. In: Emerging Applications of Number Theory, Hejhal, D.A., Friedman, J., Gutzwiller, M.C., Odlyzko, A.M. (eds). Berlin–Heidelberg–New York: Springer, 1999
4. Bogomolny, E.B., Leyvraz, F., Schmit, C.: Distribution of eigenvalues for the modular group. Commun. Math. Phys. 176, 577–617 (1996)
5. Bogomolny, E.B., Georgeot, B., Giannoni, M.-J., Schmit, C.: Arithmetical Chaos. Phys. Rep. 291, 219–324 (1997)
6. Bohigas, O., Giannoni, M.-J., Schmit, C.: Characterization of chaotic quantum spectra and universality of level fluctuation laws. Phys. Rev. Lett. 52, 1–4 (1984); Spectral properties of the Laplacian and random matrix theory. J. Physique Lett. 45, L-1015 (1984)
7. Borevich, Z.I., Shafarevich, I.R.: Number Theory. New York: Academic Press, 1966
8. Davenport, H.: Multiplicative Number Theory. Graduate Texts in Mathematics 74. Berlin–New York: Springer, 1980
9. Hejhal, D.A.: The Selberg Trace Formula, Vol. 1. Lecture Notes in Mathematics 548, Berlin–Heidelberg–New York: Springer, 1976
10. Luo, W., Sarnak, P.: Number variance for arithmetic hyperbolic surfaces. Commun. Math. Phys. 161, 419–432 (1994)
11. Ono, T.: An Introduction to Algebraic Number Theory. New York: Plenum Press, 1990
12. Peter, M.: Momente der Klassenzahlen binärer quadratischer Formen mit ganzalgebraischen Koeffizienten. Acta Arith. 70, 43–77 (1995)
13. Peter, M.: Almost periodicity of the normalized representation numbers associated to positive definite ternary quadratic forms. J. Number Theory 77, 122–144 (1999)
14. Peter, M.: The limit distribution of a number theoretic function arising from a problem in statistical mechanics. J. Number Th. 90, 265–280 (2001)
15. Sarnak, P.: Class numbers of indefinite binary quadratic forms. J. Number Th. 15, 229–247 (1982)
16. Schmutz, P.: Arithmetic groups and the length spectrum of Riemann surfaces. Duke Math. J. 84, 199–215 (1996)
17. Schwarz, W., Spilker, J.: Arithmetical Functions. London Mathematical Society LNS 184. Cambridge: Cambridge University Press, 1994
18. Titchmarsh, E.C.: The Theory of the Riemann Zeta-Function. Oxford: Oxford University Press, 1986

Communicated by P. Sarnak

Commun. Math. Phys. 225, 191 – 217 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

Weak Hopf Algebras and Singular Solutions of Quantum Yang–Baxter Equation

Fang Li1,3, Steven Duplij2

1 Department of Mathematics, Zhejiang University (Xixi Campus), Hangzhou, Zhejiang 310028, P.R. China.

E-mail: [email protected]; [email protected]

2 Theoretical Physics, Kharkov National University, Kharkov 61077, Ukraine.

E-mail: [email protected]

3 Institute of Mathematics, Chinese Academy of Sciences, Beijing 100080, P.R. China

Received: 1 May 2001 / Accepted: 1 September 2001

Abstract: We investigate a generalization of the Hopf algebra $sl_q(2)$ by weakening the invertibility of the generator $K$, i.e. exchanging its invertibility $KK^{-1} = 1$ to the regularity $K\overline{K}K = K$. This leads to a weak Hopf algebra $wsl_q(2)$ and a $J$-weak Hopf algebra $vsl_q(2)$ which are studied in detail. It is shown that the monoids of group-like elements of $wsl_q(2)$ and $vsl_q(2)$ are regular monoids, which supports the general conjecture on the connection between weak Hopf algebras and regular monoids. Moreover, from $wsl_q(2)$ a quasi-braided weak Hopf algebra $\overline{U}_q^{w}$ is constructed and it is shown that the corresponding quasi-R-matrix is regular: $R^w \hat{R}^w R^w = R^w$.

1. Introduction

The concept of a weak Hopf algebra as a generalization of a Hopf algebra [29, 1] was introduced in [18] and its characterizations and applications were studied in [20]. A $k$-bialgebra (footnote 1) $H = (H, \mu, \eta, \Delta, \varepsilon)$ is called a weak Hopf algebra if there exists $T \in \mathrm{Hom}_k(H, H)$ such that $\mathrm{id} * T * \mathrm{id} = \mathrm{id}$ and $T * \mathrm{id} * T = T$, where $T$ is called a weak antipode of $H$. This concept also generalizes the notion of the left and right Hopf algebras [24, 12]. The first aim of this concept is to give a new sub-class of bialgebras which includes all Hopf algebras, such that it is possible to characterize this sub-class through their monoids of all group-like elements [18, 20]. It was known that for every regular monoid $S$, its semigroup algebra $kS$ over $k$ is a weak Hopf algebra as the generalization of a group algebra [19]. The second aim is to construct some singular solutions of the quantum Yang–Baxter equation (QYBE) and research QYBE in a larger scope. In this direction, in [20] a quantum quasi-double $D(H)$ for a finite dimensional cocommutative perfect weak Hopf algebra with invertible weak antipode was built and it was verified that its quasi-R-matrix is a regular solution of the QYBE. In particular, the quantum quasi-double of a finite Clifford monoid as a generalization of the quantum double of a finite group was derived [20].

In this paper, we will construct two weak Hopf algebras in the other direction, as a generalization of the quantum algebra $sl_q(2)$ [22, 2]. We show that $wsl_q(2)$ possesses a quasi-R-matrix which becomes a singular (in fact, regular) solution of the QYBE, with a parameter $q$. For this reason, we want to treat the meaning of $wsl_q(2)$ and its quasi-R-matrix just as for $sl_q(2)$ [28, 16]. It is interesting to note that $wsl_q(2)$ is a natural and non-trivial example of weak Hopf algebras.

Project (No. 19971074) supported by the National Natural Science Foundation of China.
Footnote 1: In this paper, $k$ always denotes a field.

2. Weak Quantum Algebras

For completeness and consistency we recall the definition of the enveloping algebra $U_q = U_q(sl(2))$ (see e.g. [16]). Let $q \in \mathbb{C}$ and $q \ne \pm 1, 0$. The algebra $U_q$ is generated by four variables (Chevalley generators) $E$, $F$, $K$, $K^{-1}$ with the relations

$$K^{-1}K = KK^{-1} = 1, \tag{1}$$
$$KEK^{-1} = q^{2}E, \tag{2}$$
$$KFK^{-1} = q^{-2}F, \tag{3}$$
$$EF - FE = \frac{K - K^{-1}}{q - q^{-1}}. \tag{4}$$

Now we try to generalize the invertibility condition (1). The first thought is to weaken the invertibility to regularity, as is usually done in semigroup theory [17] (see also [10, 6, 7] for higher regularity). So we will consider such a weakening of the algebra $U_q(sl_q(2))$, in which instead of the set $\{K, K^{-1}\}$ we introduce a pair $\{K_w, \overline{K}_w\}$ by means of the regularity relations

$$K_w \overline{K}_w K_w = K_w, \qquad \overline{K}_w K_w \overline{K}_w = \overline{K}_w. \tag{5}$$

If $\overline{K}_w$ satisfying (5) is unique for a given $K_w$, then it is called the inverse of $K_w$ (see e.g. [27, 11]). The regularity relations (5) imply that one can introduce the variables

$$J_w = K_w \overline{K}_w, \qquad \overline{J}_w = \overline{K}_w K_w. \tag{6}$$

In terms of $J_w$ the regularity conditions (5) are

$$J_w K_w = K_w, \qquad \overline{K}_w J_w = \overline{K}_w, \tag{7}$$
$$\overline{J}_w \overline{K}_w = \overline{K}_w, \qquad K_w \overline{J}_w = K_w. \tag{8}$$

Since the noncommutativity of the generators $K_w$ and $\overline{K}_w$ complicates the generalized construction considerably (footnote 2), we first consider the commutative case and assume in what follows that

$$J_w = \overline{J}_w. \tag{9}$$

Footnote 2: This case will be considered elsewhere.


Let us list some useful properties of $J_w$ which will be needed below. First we note that commutativity of $K_w$ and $\overline{K}_w$ leads to the idempotency condition

$$J_w^2 = J_w, \tag{10}$$

which means that $J_w$ is a projector (see e.g. [15]).

Conjecture 1. In algebras satisfying the regularity conditions (5) there exists at least one zero divisor $J_w - 1$.

Remark 1. In addition to the unity 1 we have an idempotent analog of unity $J_w$, which makes the structure of weak algebras more complicated, but simultaneously more interesting.

For any variable $X$ we will define "$J$-conjugation" as

$$X_{J_w} \overset{\mathrm{def}}{=} J_w X J_w \tag{11}$$

and the corresponding mapping will be written as $e_w(X) : X \to X_{J_w}$. Note that the mapping $e_w(X)$ is idempotent:

$$e_w^2(X) = e_w(X). \tag{12}$$

Remark 2. In the invertible case $K_w = K$, $\overline{K}_w = K^{-1}$ we have $J_w = 1$ and $e_w(X) = X = \mathrm{id}(X)$ for any $X$, so $e_w = \mathrm{id}$.

It is seen from (5) that the generators $K_w$ and $\overline{K}_w$ are stable under "$J_w$-conjugation":

$$K_{J_w} = J_w K_w J_w = K_w, \qquad \overline{K}_{J_w} = J_w \overline{K}_w J_w = \overline{K}_w. \tag{13}$$

Obviously, for any $X$

$$K_w X \overline{K}_w = K_w X_{J_w} \overline{K}_w, \tag{14}$$

and for any $X$ and $Y$

$$K_w X \overline{K}_w = Y \ \Rightarrow\ K_w X_{J_w} \overline{K}_w = Y_{J_w}. \tag{15}$$

Another definition connected with the idempotent analog of unity $J_w$ is the "$J_w$-product" for any two elements $X$ and $Y$, viz.

$$X \odot_{J_w} Y \overset{\mathrm{def}}{=} X J_w Y. \tag{16}$$

Remark 3. From (7) it follows that the "$J_w$-product" coincides with the usual product if $X$ ends with the generators $K_w$ and $\overline{K}_w$ on the right side or $Y$ starts with them on the left side.

Let $J^{(ij)} = K_w^{i} \overline{K}_w^{\,j}$; then we will need the formula

$$J_w^{(ij)} = K_w^{i} \overline{K}_w^{\,j} = \begin{cases} K_w^{\,i-j}, & i > j,\\ J_w, & i = j,\\ \overline{K}_w^{\,j-i}, & i < j, \end{cases} \tag{17}$$


which follows from the regularity conditions (7). The variables $J^{(ij)}$ satisfy the regularity conditions

$$J_w^{(ij)} J_w^{(ji)} J_w^{(ij)} = J_w^{(ij)} \tag{18}$$

and are stable under "$J$-conjugation" (11): $J^{(ij)}_{w J_w} = J_w^{(ij)}$. The regularity conditions (7) lead to noncancellativity: for any two elements $X$ and $Y$ the following relations hold:

$$X = Y \ \Rightarrow\ K_w X = K_w Y, \tag{19}$$
$$K_w X = K_w Y \ \nRightarrow\ X = Y, \tag{20}$$
$$X = Y \ \Rightarrow\ \overline{K}_w X = \overline{K}_w Y, \tag{21}$$
$$\overline{K}_w X = \overline{K}_w Y \ \nRightarrow\ X = Y, \tag{22}$$
$$X = Y \ \Rightarrow\ X_{J_w} = Y_{J_w}, \tag{23}$$
$$X_{J_w} = Y_{J_w} \ \nRightarrow\ X = Y. \tag{24}$$
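A concrete pair satisfying the regularity relations (5) makes the idempotency (10), formula (17) and the noncancellativity (19)–(24) easy to see. The sketch below uses a hypothetical realisation by $2\times 2$ diagonal matrices, $K_w = \mathrm{diag}(k, 0)$ and $\overline{K}_w = \mathrm{diag}(1/k, 0)$; this choice is an illustration only and is not taken from the paper.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def sub(a, b):
    """Entrywise difference a - b."""
    return tuple(tuple(a[i][j] - b[i][j] for j in range(2)) for i in range(2))


def diag(x, y):
    return ((x, 0), (0, y))


def power(m, n):
    """n-th power of m for n >= 1."""
    result = m
    for _ in range(n - 1):
        result = matmul(result, m)
    return result


k = Fraction(3)            # any nonzero scalar works here
K = diag(k, 0)             # plays the role of K_w (not invertible)
K_BAR = diag(1 / k, 0)     # plays the role of the regular partner of K_w
ONE = diag(1, 1)
ZERO = diag(0, 0)
J = matmul(K, K_BAR)       # idempotent analog of unity, here diag(1, 0)


def j_ij(i, j):
    """K^i * K_BAR^j for i, j >= 1, the left side of formula (17)."""
    return matmul(power(K, i), power(K_BAR, j))
```

With these matrices $J_w = \mathrm{diag}(1, 0) \ne 1$, and $X = \mathrm{diag}(1,1)$, $Y = \mathrm{diag}(1,5)$ give $K_w X = K_w Y$ although $X \ne Y$, which is the failure of cancellation recorded in (20).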

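The generalization of $U_q(sl_q(2))$ by exploiting regularity (5) instead of invertibility (1) can be done in two different ways.

Definition 1. Define $U_q^{w} = wsl_q(2)$ as the algebra generated by the four variables $E_w$, $F_w$, $K_w$, $\overline{K}_w$ with the relations:

$$K_w \overline{K}_w = \overline{K}_w K_w, \tag{25}$$
$$K_w \overline{K}_w K_w = K_w, \qquad \overline{K}_w K_w \overline{K}_w = \overline{K}_w, \tag{26}$$
$$K_w E_w = q^{2} E_w K_w, \qquad \overline{K}_w E_w = q^{-2} E_w \overline{K}_w, \tag{27}$$
$$K_w F_w = q^{-2} F_w K_w, \qquad \overline{K}_w F_w = q^{2} F_w \overline{K}_w, \tag{28}$$
$$E_w F_w - F_w E_w = \frac{K_w - \overline{K}_w}{q - q^{-1}}. \tag{29}$$

We call $wsl_q(2)$ a weak quantum algebra.

Definition 2. Define $U_q^{v} = vsl_q(2)$ as the algebra generated by the four variables $E_v$, $F_v$, $K_v$, $\overline{K}_v$ with the relations ($J_v = K_v \overline{K}_v$):

$$K_v \overline{K}_v = \overline{K}_v K_v, \tag{30}$$
$$K_v \overline{K}_v K_v = K_v, \qquad \overline{K}_v K_v \overline{K}_v = \overline{K}_v, \tag{31}$$
$$K_v E_v \overline{K}_v = q^{2} E_v, \tag{32}$$
$$K_v F_v \overline{K}_v = q^{-2} F_v, \tag{33}$$
$$E_v J_v F_v - F_v J_v E_v = \frac{K_v - \overline{K}_v}{q - q^{-1}}. \tag{34}$$

We call $vsl_q(2)$ a J-weak quantum algebra.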
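To see that both sets of relations admit realisations with $J \ne 1$, one can extend the standard two-dimensional representation of $U_q(sl(2))$ by a zero block. The following sketch checks relations (25)–(29) and (32)–(34) for such $3\times 3$ matrices in exact arithmetic; the matrices and the value $q = 2$ are assumptions made for illustration and do not come from the paper.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices stored as nested tuples."""
    n = len(a)
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))


def scale(c, a):
    return tuple(tuple(c * x for x in row) for row in a)


def sub(a, b):
    return tuple(tuple(x - y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))


def prod(*ms):
    """Left-to-right product of several matrices."""
    result = ms[0]
    for m in ms[1:]:
        result = matmul(result, m)
    return result


q = Fraction(2)
K = ((q, 0, 0), (0, 1 / q, 0), (0, 0, 0))     # K acts as 0 on the third basis vector
KB = ((1 / q, 0, 0), (0, q, 0), (0, 0, 0))    # regular partner of K
E = ((0, 1, 0), (0, 0, 0), (0, 0, 0))
F = ((0, 0, 0), (1, 0, 0), (0, 0, 0))
ONE = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
ZERO = scale(0, ONE)
J = matmul(K, KB)                             # diag(1, 1, 0), an idempotent != 1
RHS = scale(1 / (q - 1 / q), sub(K, KB))      # (K - KB) / (q - q^-1)


def wsl_relations_hold() -> bool:
    """Relations (25)-(29) of Definition 1."""
    return all([
        matmul(K, KB) == matmul(KB, K),
        prod(K, KB, K) == K and prod(KB, K, KB) == KB,
        matmul(K, E) == scale(q**2, matmul(E, K)),
        matmul(KB, E) == scale(q**-2, matmul(E, KB)),
        matmul(K, F) == scale(q**-2, matmul(F, K)),
        matmul(KB, F) == scale(q**2, matmul(F, KB)),
        sub(matmul(E, F), matmul(F, E)) == RHS,
    ])


def vsl_relations_hold() -> bool:
    """Relations (32)-(34) of Definition 2; (30)-(31) coincide with (25)-(26)."""
    return all([
        prod(K, E, KB) == scale(q**2, E),
        prod(K, F, KB) == scale(q**-2, F),
        sub(prod(E, J, F), prod(F, J, E)) == RHS,
    ])
```

In this realisation $J - 1 = \mathrm{diag}(0, 0, -1)$ is nonzero yet multiplies every generator to zero, so it is a zero divisor in the image, while $J$ commutes with all four matrices.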


In these definitions the first two lines (25)–(26) and (30)–(31) generalize the invertibility $KK^{-1} = K^{-1}K = 1$. Each of the following lines (27)–(29) and (32)–(34) generalizes the corresponding line (2)–(4) in two different ways respectively. In the first, the weak quantum algebra $wsl_q(2)$, the last relation (29) between the $E$ and $F$ generators remains unchanged from $sl_q(2)$, while the two $EK$ and $FK$ relations are extended to the four relations (27)–(28). In $vsl_q(2)$, on the contrary, the two $EK$ and $FK$ relations remain unchanged from $sl_q(2)$ (with the substitution $K^{-1} \to \overline{K}_v$ only), while the last relation (34) between the $E$ and $F$ generators has the additional multiplier $J_v$, whose role will become clear later. Note that the $EK$ and $FK$ relations (32)–(33) can be written in the following form close to (27)–(28):

$$K_v E_v J_v = q^{2} J_v E_v K_v, \qquad \overline{K}_v E_v J_v = q^{-2} J_v E_v \overline{K}_v, \tag{35}$$
$$K_v F_v J_v = q^{-2} J_v F_v K_v, \qquad \overline{K}_v F_v J_v = q^{2} J_v F_v \overline{K}_v. \tag{36}$$

Using (16) and (7) in the case of $J_v$ we can also present the $vsl_q(2)$ algebra as an algebra with the "$J_v$-product" of (16), written here as $X \odot_{J_v} Y = X J_v Y$:

$$K_v \odot_{J_v} \overline{K}_v = \overline{K}_v \odot_{J_v} K_v, \tag{37}$$
$$K_v \odot_{J_v} \overline{K}_v \odot_{J_v} K_v = K_v, \qquad \overline{K}_v \odot_{J_v} K_v \odot_{J_v} \overline{K}_v = \overline{K}_v, \tag{38}$$
$$K_v \odot_{J_v} E_v \odot_{J_v} \overline{K}_v = q^{2} E_v, \tag{39}$$
$$K_v \odot_{J_v} F_v \odot_{J_v} \overline{K}_v = q^{-2} F_v, \tag{40}$$
$$E_v \odot_{J_v} F_v - F_v \odot_{J_v} E_v = \frac{K_v - \overline{K}_v}{q - q^{-1}}. \tag{41}$$

Remark 4. Due to (7) the only relation where the "$J_v$-product" really plays its role is the last relation (41).

From the following proposition, one can find the connection between $U_q^{w} = wsl_q(2)$, $U_q^{v} = vsl_q(2)$ and the quantum algebra $sl_q(2)$.

Proposition 1. $wsl_q(2)/(J_w - 1) \cong sl_q(2)$; $vsl_q(2)/(J_v - 1) \cong sl_q(2)$.

Proof. For cancellative $K_w$ and $K_v$ it is obvious.

Proposition 2. The quantum algebras $wsl_q(2)$ and $vsl_q(2)$ possess zero divisors, one of which is (footnote 3) $(J_{w,v} - 1)$, which annihilates all generators.

Proof. From regularity (26) and (31) it follows that $K_{w,v}(J_{w,v} - 1) = 0$ (see also (1)). Multiplying (27) on $J_w$ gives $K_w E_w J_w = q^{2} E_w K_w J_w \Rightarrow K_w (E_w \overline{K}_w) K_w = q^{2} E_w K_w$. Using the second equation in (27) for the term in the bracket we obtain $K_w (q^{2} \overline{K}_w E_w) K_w = q^{2} E_w K_w \Rightarrow (J_w - 1) E_w K_w = 0$. For $F_w$ similarly, but we use Eq. (28). By analogy, multiplying (32) on $J_v$ we have $K_v E_v \overline{K}_v K_v \overline{K}_v = q^{2} E_v J_v \Rightarrow K_v E_v \overline{K}_v = q^{2} E_v J_v \Rightarrow q^{2} E_v = q^{2} E_v J_v$, and so $E_v (J_v - 1) = 0$. For $F_v$ similarly, but we use Eq. (33).

Remark 5. Since $sl_q(2)$ is an algebra without zero divisors, some properties of $sl_q(2)$ cannot be upgraded to $wsl_q(2)$ and $vsl_q(2)$, e.g. the standard theorem of Ore extensions and its proof (see Theorem I.7.1 in [16]).

Footnote 3: We denote by $X_{w,v}$ one of the variables $X_w$ or $X_v$.


Remark 6. We conjecture that in $U_q^{w}$ and $U_q^{v}$ there are no zero divisors other than $J_{w,v} - 1$ which annihilate all generators. Otherwise a thorough analysis of them will be much more complicated and very different from the standard case of non-weak algebras.

We can get some properties of $U_q^{w}$ and $U_q^{v}$ as follows.

Lemma 1. The idempotent $J_w$ is in the center of $wsl_q(2)$.

Proof. For $K_w$ it follows from (13). Multiplying the first equation in (27) on $\overline{K}_w$ we derive $K_w E_w \overline{K}_w = q^{2} E_w J_w$, and applying the second equation in (27) we obtain $E_w J_w = J_w E_w$. For $F_w$ similarly, but we use Eq. (28).

Lemma 2. There are unique algebra automorphisms $\omega_w$ and $\omega_v$ of $U_q^{w}$ and $U_q^{v}$ respectively such that

$$\omega_{w,v}(K_{w,v}) = \overline{K}_{w,v}, \quad \omega_{w,v}(\overline{K}_{w,v}) = K_{w,v}, \quad \omega_{w,v}(E_{w,v}) = F_{w,v}, \quad \omega_{w,v}(F_{w,v}) = E_{w,v}. \tag{42}$$

Proof. The proof is obvious, if we note that $\omega_w^2 = \mathrm{id}$ and $\omega_v^2 = \mathrm{id}$.

As in the case of the automorphism $\omega$ for $sl_q(2)$ [16], the mappings $\omega_w$ and $\omega_v$ can be called the weak Cartan automorphisms.

Remark 7. Note that $\omega_w \ne \omega$ and $\omega_v \ne \omega$ in the general case.

The connection between the algebras $wsl_q(2)$ and $vsl_q(2)$ can be seen from the following

Proposition 3. There exists the following partial algebra morphism $\chi : vsl_q(2) \to wsl_q(2)$ such that

$$\chi(X) = e_v(X), \tag{43}$$

or more exactly: the generators $X_w^{(v)} = J_v X_v J_v = X_{vJ_v}$ for all $X_v = K_v, \overline{K}_v, E_v, F_v$ satisfy the same relations as $X_w$ (25)–(29).

Proof. Multiplying Eq. (32) on $K_v$ we have $K_v E_v \overline{K}_v K_v = q^{2} E_v K_v$, and using (7) we obtain $K_v E_v J_v = q^{2} E_v J_v K_v \Rightarrow K_v J_v E_v J_v = q^{2} J_v E_v J_v K_v$, and so $K_{vJ_v} E_{vJ_v} = q^{2} E_{vJ_v} K_{vJ_v}$, which has the shape of the first equation in (27). For $F_v$ similarly using Eq. (33) we obtain $K_{vJ_v} F_{vJ_v} = q^{-2} F_{vJ_v} K_{vJ_v}$. Equation (34) can be modified using (7) and then applying (11); then we obtain

$$E_{vJ_v} F_{vJ_v} - F_{vJ_v} E_{vJ_v} = \frac{K_{vJ_v} - \overline{K}_{vJ_v}}{q - q^{-1}},$$

which coincides with (29).


For the conjugated equations (the second ones in (27)–(28)), after multiplication of (32) on $\overline{K}_v$ we have $\overline{K}_v K_v E_v \overline{K}_v = q^{2} \overline{K}_v E_v \Rightarrow J_v E_v J_v \overline{K}_v = q^{2} \overline{K}_v J_v E_v J_v$, or using definition (11) and (7), $\overline{K}_{vJ_v} E_{vJ_v} = q^{-2} E_{vJ_v} \overline{K}_{vJ_v}$. By analogy from (33) it follows that $\overline{K}_{vJ_v} F_{vJ_v} = q^{2} F_{vJ_v} \overline{K}_{vJ_v}$.

Note that the generators $X_w^{(v)}$ coincide with $X_w$ if $J_v = 1$ only. Therefore, some (but not all) properties of $wsl_q(2)$ can be extended to $vsl_q(2)$ as well, and below we mostly will consider $wsl_q(2)$ in detail.

Lemma 3. Let $m \ge 0$ and $n \in \mathbb{Z}$. The following relations hold in $U_q^{w}$:

$$E_w^{m} K_w^{n} = q^{-2mn} K_w^{n} E_w^{m}, \qquad F_w^{m} K_w^{n} = q^{2mn} K_w^{n} F_w^{m}, \tag{44}$$
$$E_w^{m} \overline{K}_w^{\,n} = q^{2mn} \overline{K}_w^{\,n} E_w^{m}, \qquad F_w^{m} \overline{K}_w^{\,n} = q^{-2mn} \overline{K}_w^{\,n} F_w^{m}, \tag{45}$$
$$[E_w, F_w^{m}] = [m] F_w^{m-1}\, \frac{q^{-(m-1)} K_w - q^{m-1} \overline{K}_w}{q - q^{-1}} = [m]\, \frac{q^{m-1} K_w - q^{-(m-1)} \overline{K}_w}{q - q^{-1}}\, F_w^{m-1}, \tag{46}$$
$$[E_w^{m}, F_w] = [m]\, \frac{q^{-(m-1)} K_w - q^{m-1} \overline{K}_w}{q - q^{-1}}\, E_w^{m-1} = [m] E_w^{m-1}\, \frac{q^{m-1} K_w - q^{-(m-1)} \overline{K}_w}{q - q^{-1}}. \tag{47}$$

Proof. The first two relations result easily from Definition 1. The third one follows by induction using Definition 1 and

$$[E_w, F_w^{m}] = [E_w, F_w^{m-1}] F_w + F_w^{m-1} [E_w, F_w] = [E_w, F_w^{m-1}] F_w + F_w^{m-1}\, \frac{K_w - \overline{K}_w}{q - q^{-1}}.$$

Applying the automorphism $\omega_w$ (42) to (46), one gets (47).

Note that the commutation relations (44)–(47) coincide with the $sl_q(2)$ case. For $vsl_q(2)$ the situation is more complicated, because Eqs. (32)–(33) cannot be solved for $\overline{K}_v$ due to noncancellativity (see also (19)–(24)). Nevertheless, some analogous relations can be derived. Using the morphism (43) one can conclude that relations similar to (44)–(47) hold for $X_w^{(v)} = J_v X_v J_v$, from which we obtain for $vsl_q(2)$,

$$J_v E_v^{m} K_v^{n} = q^{-2mn} K_v^{n} E_v^{m} J_v, \qquad J_v F_v^{m} K_v^{n} = q^{2mn} K_v^{n} F_v^{m} J_v, \tag{48}$$
$$J_v E_v^{m} \overline{K}_v^{\,n} = q^{2mn} \overline{K}_v^{\,n} E_v^{m} J_v, \qquad J_v F_v^{m} \overline{K}_v^{\,n} = q^{-2mn} \overline{K}_v^{\,n} F_v^{m} J_v, \tag{49}$$


$$J_v E_v J_v F_v^{m} J_v - J_v F_v^{m} J_v E_v J_v = [m] J_v F_v^{m-1}\, \frac{q^{-(m-1)} K_v - q^{m-1} \overline{K}_v}{q - q^{-1}} = [m]\, \frac{q^{m-1} K_v - q^{-(m-1)} \overline{K}_v}{q - q^{-1}}\, F_v^{m-1} J_v, \tag{50}$$
$$J_v E_v^{m} J_v F_v J_v - J_v F_v J_v E_v^{m} J_v = [m]\, \frac{q^{-(m-1)} K_v - q^{m-1} \overline{K}_v}{q - q^{-1}}\, E_v^{m-1} J_v = [m] J_v E_v^{m-1}\, \frac{q^{m-1} K_v - q^{-(m-1)} \overline{K}_v}{q - q^{-1}}. \tag{51}$$

It is important to stress that due to the noncancellativity of weak algebras we cannot cancel $J_v$ in these relations (see (19)–(24)). In order to discuss the basis of $U_q^{w} = wsl_q(2)$, we need to generalize some properties of Ore extensions (see [16]).

3. Weak Ore Extensions

Let $R$ be an algebra over $k$ and $R[t]$ be the free left $R$-module consisting of all polynomials of the form $P = \sum_{i=0}^{n} a_i t^i$ with coefficients in $R$. If $a_n \ne 0$, define $\deg(P) = n$; say $\deg(0) = -\infty$. Let $\alpha$ be an algebra morphism of $R$. An $\alpha$-derivation of $R$ is a $k$-linear endomorphism $\delta$ of $R$ such that $\delta(ab) = \alpha(a)\delta(b) + \delta(a)b$ for all $a, b \in R$. It follows that $\delta(1) = 0$.

Theorem 1. (i) Assume that $R[t]$ has an algebra structure such that the natural inclusion of $R$ into $R[t]$ is a morphism of algebras and $\deg(PQ) \le \deg(P) + \deg(Q)$ for any pair $(P, Q)$ of elements of $R[t]$. Then there exists a unique injective algebra endomorphism $\alpha$ of $R$ and a unique $\alpha$-derivation $\delta$ of $R$ such that $ta = \alpha(a)t + \delta(a)$ for all $a \in R$; (ii) Conversely, given an algebra endomorphism $\alpha$ of $R$ and an $\alpha$-derivation $\delta$ of $R$, there exists a unique algebra structure on $R[t]$ such that the inclusion of $R$ into $R[t]$ is an algebra morphism and $ta = \alpha(a)t + \delta(a)$ for all $a \in R$.

Proof. (i) Take any $0 \ne a \in R$ and consider the product $ta$. We have $\deg(ta) \le \deg(t) + \deg(a) = 1$. By the definition of $R[t]$, there exist uniquely determined elements $\alpha(a)$ and $\delta(a)$ of $R$ such that $ta = \alpha(a)t + \delta(a)$. This defines maps $\alpha$ and $\delta$ in a unique fashion. The left multiplication by $t$ being linear, so are $\alpha$ and $\delta$. Expanding both sides of the equality $(ta)b = t(ab)$ in $R[t]$ using $ta = \alpha(a)t + \delta(a)$ for $a, b \in R$, we get $\alpha(a)\alpha(b)t + \alpha(a)\delta(b) + \delta(a)b = \alpha(ab)t + \delta(ab)$. It follows that $\alpha(ab) = \alpha(a)\alpha(b)$ and $\delta(ab) = \alpha(a)\delta(b) + \delta(a)b$, and $\alpha(1)t + \delta(1) = t1 = t$. So, $\alpha(1) = 1$, $\delta(1) = 0$. Therefore, we know that $\alpha$ is an algebra endomorphism and $\delta$ is an $\alpha$-derivation. The uniqueness of $\alpha$ and $\delta$ follows from the freeness of $R[t]$ over $R$.

(ii) We need to construct the multiplication on $R[t]$ as an extension of that on $R$ such that $ta = \alpha(a)t + \delta(a)$. For this, one only needs to determine the multiplication $ta$ for any $a \in R$.


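Let $M = \{(f_{ij})_{i,j\ge 1} : f_{ij} \in \mathrm{End}_k(R)$ and each row and each column has only finitely many $f_{ij} \ne 0\}$, and let

$$I = \begin{pmatrix} 1 & & \\ & 1 & \\ & & \ddots \end{pmatrix}$$

be the identity of $M$. For $a \in R$, let $\hat{a} : R \to R$ satisfy $\hat{a}(r) = ar$. Then $\hat{a} \in \mathrm{End}_k(R)$; and for $r \in R$, $(\alpha\hat{a})(r) = \alpha(ar) = \alpha(a)\alpha(r) = (\widehat{\alpha(a)}\alpha)(r)$, $(\delta\hat{a})(r) = \delta(ar) = \alpha(a)\delta(r) + \delta(a)r = (\widehat{\alpha(a)}\delta + \widehat{\delta(a)})(r)$, thus $\alpha\hat{a} = \widehat{\alpha(a)}\alpha$, $\delta\hat{a} = \widehat{\alpha(a)}\delta + \widehat{\delta(a)}$ in $\mathrm{End}_k(R)$, and, obviously, for $a, b \in R$, $\widehat{ab} = \hat{a}\hat{b}$; $\widehat{a+b} = \hat{a} + \hat{b}$.

Let

$$T = \begin{pmatrix} \delta & & & \\ \alpha & \delta & & \\ & \alpha & \delta & \\ & & \ddots & \ddots \end{pmatrix} \in M$$

and define $\Phi : R[t] \to M$ satisfying $\Phi\bigl(\sum_{i=0}^{n} a_i t^i\bigr) = \sum_{i=0}^{n} (\hat{a}_i I) T^i$. It is seen that $\Phi$ is a $k$-linear map.

Lemma 4. The map $\Phi$ is injective.

Proof. Let $P = \sum_{i=0}^{n} a_i t^i$. Assume $\Phi(P) = 0$. Let $e_i$ denote the column vector whose $i$-th entry is 1 and whose other entries are 0; obviously, $\{e_i\}_{i\ge 1}$ are linearly independent. Since $\delta(1) = 0$ and $\alpha(1) = 1$, the vector $T e_i$ has $\delta(1) = 0$ in the $i$-th entry, $\alpha(1) = 1$ in the $(i+1)$-th entry and 0 elsewhere, so $T e_i = e_{i+1}$ and $T^i e_1 = e_{i+1}$ for any $i \ge 0$. Thus, $0 = \Phi(P) e_1 = \sum_{i=0}^{n} (\hat{a}_i I) T^i e_1 = \sum_{i=0}^{n} \hat{a}_i e_{i+1}$. It means that $\hat{a}_i = 0$ for all $i$, then $a_i = a_i 1 = \hat{a}_i 1 = 0$. Hence $P = 0$.

Lemma 5. The following relation holds: $T(\hat{a} I) = (\widehat{\alpha(a)} I) T + \widehat{\delta(a)} I$.

Proof. We have

$$T(\hat{a} I) = \begin{pmatrix} \delta\hat{a} & & \\ \alpha\hat{a} & \delta\hat{a} & \\ & \ddots & \ddots \end{pmatrix} = \begin{pmatrix} \widehat{\alpha(a)}\delta + \widehat{\delta(a)} & & \\ \widehat{\alpha(a)}\alpha & \widehat{\alpha(a)}\delta + \widehat{\delta(a)} & \\ & \ddots & \ddots \end{pmatrix} = \widehat{\alpha(a)}\, T + \widehat{\delta(a)}\, I = (\widehat{\alpha(a)} I) T + \widehat{\delta(a)} I.$$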


Now, we complete the proof of Theorem 1. Let $S$ denote the subalgebra generated by $T$ and $\hat{a} I$ (all $a \in R$) in $M$. From Lemma 5, we see that every element of $S$ can be generated linearly by some elements of the form $(\hat{a} I) T^n$ ($a \in R$, $n \ge 0$). But $\Phi(a t^n) = (\hat{a} I) T^n$, so $\Phi(R[t]) = S$, i.e. $\Phi$ is surjective. Then by Lemma 4, $\Phi$ is bijective. It follows that $R[t]$ and $S$ are linearly isomorphic. Define $ta = \Phi^{-1}(T(\hat{a} I))$; then we can extend this formula to define the multiplication of $R[t]$ with $fg = \Phi^{-1}(xy)$ for any $f, g \in R[t]$ and $x = \Phi(f)$, $y = \Phi(g)$. Under this definition, $R[t]$ becomes an algebra and $\Phi$ is an algebra isomorphism from $R[t]$ to $S$, and $ta = \Phi^{-1}(T(\hat{a} I)) = \Phi^{-1}((\widehat{\alpha(a)} I) T + \widehat{\delta(a)} I) = \alpha(a)t + \delta(a)$ for all $a \in R$. Obviously, the inclusion of $R$ into $R[t]$ is an algebra morphism.

Remark 8. Note that Theorem 1 can be recognized as a generalization of Theorem I.7.1 in [16], since $R$ does not need to be without zero divisors, $\alpha$ does not need to be injective and only $\deg(PQ) \le \deg(P) + \deg(Q)$ is required.

Definition 3. We call the algebra constructed from $\alpha$ and $\delta$ a weak Ore extension of $R$, denoted as $R_w[t, \alpha, \delta]$.

Let $S_{n,k}$ be the linear endomorphism of $R$ defined as the sum of all $\binom{n}{k}$ possible compositions of $k$ copies of $\delta$ and of $n - k$ copies of $\alpha$. By induction on $n$, from $ta = \alpha(a)t + \delta(a)$ under the condition of Theorem 1(ii), we get $t^n a = \sum_{k=0}^{n} S_{n,k}(a)\, t^{n-k}$ and moreover, $\bigl(\sum_{i=0}^{n} a_i t^i\bigr)\bigl(\sum_{i=0}^{m} b_i t^i\bigr) = \sum_{i=0}^{n+m} c_i t^i$, where $c_i = \sum_{p=0}^{n} a_p \sum_{k=0}^{p} S_{p,k}(b_{i-p+k})$, with the convention $b_j = 0$ for $j < 0$ or $j > m$.

Corollary 1. Under the condition of Theorem 1(ii), the following statements hold: (i) As a left $R$-module, $R_w[t, \alpha, \delta]$ is free with basis $\{t^i\}_{i\ge 0}$; (ii) If $\alpha$ is an automorphism, then $R_w[t, \alpha, \delta]$ is also a right free $R$-module with the same basis $\{t^i\}_{i\ge 0}$.

Proof. (i) It follows from the fact that $R_w[t, \alpha, \delta]$ is just $R[t]$ as a left $R$-module. (ii) Firstly, we can show that $R_w[t, \alpha, \delta] = \sum_{i\ge 0} t^i R$, i.e. for any $p \in R_w[t, \alpha, \delta]$, there are $a_0, a_1, \dots, a_n \in R$ such that $p = \sum_{i=0}^{n} t^i a_i$. Equivalently, we show by induction on $n$ that for any $b \in R$, $b t^n$ can be written in the form $\sum_{i=0}^{n} t^i a_i$ for some $a_i$. When $n = 0$, it is obvious. Suppose that for $n \le k - 1$ the result holds. Consider the case $n = k$. Since $\alpha$ is surjective, there is $a \in R$ such that $b = \alpha^n(a) = S_{n,0}(a)$. But $t^n a = \sum_{k=0}^{n} S_{n,k}(a)\, t^{n-k}$, so we get $b t^n = t^n a - \sum_{k=1}^{n} S_{n,k}(a)\, t^{n-k} = \sum_{i=0}^{n} t^i a_i$ by the hypothesis of induction for some $a_i$ with $a_n = a$. For any $i$ and $a, b \in R$, $(t^i a)b = t^i(ab)$ since $R_w[t, \alpha, \delta]$ is an algebra. Then $R_w[t, \alpha, \delta]$ is a right $R$-module. Suppose $f(t) = t^n a_n + \cdots + t a_1 + a_0 = 0$ for $a_i \in R$ and $a_n \ne 0$. Then $f(t)$ can be written as an element of $R[t]$ by the formula $t^n a = \sum_{k=0}^{n} S_{n,k}(a)\, t^{n-k}$, whose highest degree term is just that of $t^n a_n = \sum_{k=0}^{n} S_{n,k}(a_n)\, t^{n-k}$, i.e. $\alpha^n(a_n) t^n$. From (i), we get $\alpha^n(a_n) = 0$. It implies $a_n = 0$. This is a contradiction. Hence $R_w[t, \alpha, \delta]$ is a free right $R$-module.

We will need the following:

Lemma 6. Let $R$ be an algebra, $\alpha$ be an algebra automorphism and $\delta$ be an $\alpha$-derivation of $R$. If $R$ is left (resp. right) Noetherian, then so is the weak Ore extension $R_w[t, \alpha, \delta]$.

The proof can be made similarly as for Theorem I.8.3 in [16].
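The expansion $t^n a = \sum_{k} S_{n,k}(a)\, t^{n-k}$ stated before Corollary 1 can be tested on a small example. The sketch below takes $R = \mathbb{Q}[x]$ with the hypothetical choice $\alpha(f)(x) = f(qx)$ and $\delta(f) = (f(qx) - f(x))/((q-1)x)$, which is an $\alpha$-derivation; it compares repeated use of the rule $ta = \alpha(a)t + \delta(a)$ with the sum over all words in $\alpha$ and $\delta$. The example ring and all function names are assumptions for illustration, not part of the paper.

```python
from fractions import Fraction
from itertools import combinations

Q = Fraction(2)  # dilation parameter of the example, any value other than 0 and 1


def trim(f):
    """Drop trailing zero coefficients (polynomials are low-degree-first tuples)."""
    f = list(f)
    while len(f) > 1 and f[-1] == 0:
        f.pop()
    return tuple(f)


def add(f, g):
    n = max(len(f), len(g))
    f, g = f + (0,) * (n - len(f)), g + (0,) * (n - len(g))
    return tuple(a + b for a, b in zip(f, g))


def mul(f, g):
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return tuple(out)


def alpha(f):
    """alpha(f)(x) = f(Qx): the coefficient of x^i is multiplied by Q^i."""
    return tuple(c * Q**i for i, c in enumerate(f))


def delta(f):
    """(f(Qx) - f(x)) / ((Q-1)x): x^i is sent to (Q^i - 1)/(Q - 1) * x^(i-1)."""
    shifted = tuple(c * (Q**i - 1) / (Q - 1) for i, c in enumerate(f))[1:]
    return shifted or (Fraction(0),)


def S(n, k, a):
    """Sum of all C(n, k) words with k letters delta and n - k letters alpha, applied to a."""
    total = (Fraction(0),)
    for delta_positions in combinations(range(n), k):
        term = a
        for pos in reversed(range(n)):  # the rightmost letter acts first
            term = delta(term) if pos in delta_positions else alpha(term)
        total = add(total, term)
    return total


def t_times(element):
    """Left-multiply sum_j r_j t^j by t using t r = alpha(r) t + delta(r)."""
    result = [(Fraction(0),)] * (len(element) + 1)
    for j, r in enumerate(element):
        result[j + 1] = add(result[j + 1], alpha(r))
        result[j] = add(result[j], delta(r))
    return result


def t_power_times(n, a):
    """Coefficients (indexed by the power of t) of t^n a."""
    element = [a]
    for _ in range(n):
        element = t_times(element)
    return element
```

For instance, with `a = (1, 2, 0, 5)` (the polynomial $1 + 2x + 5x^3$), the coefficient of $t^{n-k}$ in `t_power_times(n, a)` should agree with `S(n, k, a)` for every $0 \le k \le n$.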

Weak Hopf Algebras and Yang–Baxter Equation


Theorem 2. The algebra $wsl_q(2)$ is Noetherian with the basis
\[ P_w = \{E_w^i F_w^j K_w^l,\ E_w^i F_w^j \overline{K}_w^m,\ E_w^i F_w^j J_w\}, \tag{52} \]

where $i, j, l$ are any non-negative integers and $m$ is any positive integer.

Proof. As is well known, the two-variable polynomial algebra $k[K_w, \overline{K}_w]$ is Noetherian (see e.g. [15]). Then $A_0 = k[K_w, \overline{K}_w]/(J_w K_w - K_w,\ \overline{K}_w J_w - \overline{K}_w)$ is also Noetherian. For any $i, j \ge 0$ and $a, b, c \in k$, if at least one of $a, b, c$ does not equal 0, then $aK_w^i + b\overline{K}_w^j + cJ_w$ is not in the ideal $(J_w K_w - K_w,\ \overline{K}_w J_w - \overline{K}_w)$ of $k[K_w, \overline{K}_w]$. So, in $A_0$, $aK_w^i + b\overline{K}_w^j + cJ_w \ne 0$. It follows that $\{K_w^i, \overline{K}_w^j, J_w : i, j \ge 0\}$ is a basis of $A_0$.

Let $\alpha_1$ satisfy $\alpha_1(K_w) = q^2 K_w$ and $\alpha_1(\overline{K}_w) = q^{-2}\overline{K}_w$. Then $\alpha_1$ can be extended to an algebra automorphism of $A_0$, and $A_1 = A_0[F_w, \alpha_1, 0]$ is a weak Ore extension of $A_0$ with $\alpha = \alpha_1$ and $\delta = 0$. By Corollary 1, $A_1$ is a free left $A_0$-module with basis $\{F_w^j\}_{j \ge 0}$. Thus, $A_1$ is a $k$-algebra with basis $\{K_w^l F_w^j,\ \overline{K}_w^m F_w^j,\ J_w F_w^j\}$, where $l$ and $j$ run over all non-negative integers and $m$ runs over all positive integers. But, from the definition of the weak Ore extension, we have
\[ K_w^l F_w^j = q^{-2lj} F_w^j K_w^l, \qquad \overline{K}_w^m F_w^j = q^{2mj} F_w^j \overline{K}_w^m, \qquad J_w F_w^j = F_w^j J_w. \]
So, we conclude that $\{F_w^j K_w^l,\ F_w^j \overline{K}_w^m,\ F_w^j J_w\}$, with $l$ and $j$ running over all non-negative integers and $m$ over all positive integers, is a basis of $A_1$.

Let $\alpha_2$ satisfy
\[ \alpha_2(F_w^j K_w^l) = q^{-2l} F_w^j K_w^l, \qquad \alpha_2(F_w^j \overline{K}_w^m) = q^{2m} F_w^j \overline{K}_w^m, \qquad \alpha_2(F_w^j J_w) = F_w^j J_w. \]
Then $\alpha_2$ can be extended to an algebra automorphism of $A_1$. Let $\delta$ satisfy $\delta(1) = \delta(K_w) = \delta(\overline{K}_w) = 0$ and
\[ \delta(F_w^j K_w^l) = \sum_{i=0}^{j-1} F_w^{j-1}\, \frac{q^{-2i} K_w - q^{2i} \overline{K}_w}{q - q^{-1}}\, K_w^l, \]
\[ \delta(F_w^j \overline{K}_w^l) = \sum_{i=0}^{j-1} F_w^{j-1}\, \frac{q^{-2i} K_w - q^{2i} \overline{K}_w}{q - q^{-1}}\, \overline{K}_w^l, \]
\[ \delta(F_w^j J_w) = \sum_{i=0}^{j-1} F_w^{j-1}\, \frac{q^{-2i} K_w - q^{2i} \overline{K}_w}{q - q^{-1}}\, J_w \]

for $j > 0$ and $l \ge 0$. Then, just as in the proof of Lemma VI.1.5 in [16], it can be shown that $\delta$ can be extended to an $\alpha_2$-derivation of $A_1$ such that $A_2 = A_1[E_w, \alpha_2, \delta]$ is a weak Ore extension of $A_1$. Then in $A_2$,
\[ E_w K_w = \alpha_2(K_w)E_w + \delta(K_w) = q^{-2} K_w E_w, \qquad E_w \overline{K}_w = q^{2}\, \overline{K}_w E_w, \]
\[ E_w F_w = \alpha_2(F_w)E_w + \delta(F_w) = F_w E_w + \frac{K_w - \overline{K}_w}{q - q^{-1}}. \]

From these, we conclude that $A_2 \cong U_q^w$ as algebras. Thus, from Lemma 6, $U_q^w$ is Noetherian. By Corollary 1, $U_q^w$ is free with basis $\{E_w^i\}_{i \ge 0}$ as a left $A_1$-module. Thus, as a $k$-linear space, $U_q^w$ has the basis
\[ Q_w = \{F_w^j K_w^l E_w^i,\ F_w^j \overline{K}_w^m E_w^i,\ F_w^j J_w E_w^i\}, \]
where $i, j, l$ run over all non-negative integers and $m$ runs over all positive integers. By Lemma 3 any $x \in P_w$ (resp. $Q_w$) can be $k$-linearly generated by some elements of $Q_w$ (resp. $P_w$), and therefore $P_w$ and $Q_w$ generate the same space $U_q^w$.
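The commutation rule $K_w^l F_w^j = q^{-2lj} F_w^j K_w^l$ quoted in the proof of Theorem 2 can be spot-checked in the smallest nontrivial case $l = 1$, $j = 2$. The lines below are a sketch that uses only the weak Ore extension rule $ta = \alpha(a)t$ (with $t = F_w$, $\delta = 0$) and the invertibility of $q$.

```latex
\begin{align*}
F_w K_w &= \alpha_1(K_w)\,F_w = q^{2} K_w F_w
  \quad\Longrightarrow\quad K_w F_w = q^{-2} F_w K_w, \\
K_w F_w^{2} &= (K_w F_w)F_w = q^{-2} F_w (K_w F_w) = q^{-4} F_w^{2} K_w ,
\end{align*}
% which agrees with q^{-2lj} for l = 1, j = 2.
```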


F. Li, S. Duplij

A similar theorem can be proved for $vsl_q(2)$ as well.

Theorem 3. The algebra $vsl_q(2)$ is Noetherian with the basis
\[ P_v = \{J_v E_v^i J_v F_v^j K_v^l,\ J_v E_v^i J_v F_v^j \overline{K}_v^m,\ J_v E_v^i J_v F_v^j J_v\}, \tag{53} \]

where $i, j, l$ are any non-negative integers and $m$ is any positive integer.

Proof. The two-variable polynomial algebra $k[K_v, \overline{K}_v]$ is Noetherian (see e.g. [15]). Then $A_0 = k[K_v, \overline{K}_v]/(J_v K_v - K_v,\ \overline{K}_v J_v - \overline{K}_v)$ is also Noetherian. For any $i, j \ge 0$ and $a, b, c \in k$, if at least one of $a, b, c$ does not equal 0, then $aK_v^i + b\overline{K}_v^j + cJ_v$ is not in the ideal $(J_v K_v - K_v,\ \overline{K}_v J_v - \overline{K}_v)$ of $k[K_v, \overline{K}_v]$. So, in $A_0$, $aK_v^i + b\overline{K}_v^j + cJ_v \ne 0$. It follows that $\{K_v^i, \overline{K}_v^j, J_v : i, j \ge 0\}$ is a basis of $A_0$.

Let $\alpha_1$ satisfy $\alpha_1(K_v) = q^2 K_v$ and $\alpha_1(\overline{K}_v) = q^{-2}\overline{K}_v$. Then $\alpha_1$ can be extended to an algebra automorphism of $A_0$, and $A_1 = A_0[J_v F_v J_v, \alpha_1, 0]$ is a weak Ore extension of $A_0$ with $\alpha = \alpha_1$ and $\delta = 0$. By Corollary 1, $A_1$ is a free left $A_0$-module with basis $\{J_v F_v^j J_v\}_{j \ge 0}$. Thus, $A_1$ is a $k$-algebra with basis $\{K_v^l F_v^j J_v,\ \overline{K}_v^m F_v^j J_v,\ J_v F_v^j J_v\}$, where $l$ and $j$ run over all non-negative integers and $m$ runs over all positive integers. From the definition of the weak Ore extension, we have
\[ K_v^l F_v^j J_v = q^{-2lj} J_v F_v^j K_v^l, \qquad \overline{K}_v^m F_v^j J_v = q^{2mj} J_v F_v^j \overline{K}_v^m, \qquad J_v F_v^j = F_v^j J_v. \]
So, we conclude that $\{F_v^j K_v^l J_v,\ F_v^j \overline{K}_v^m J_v,\ J_v F_v^j J_v\}$, with $l$ and $j$ running over all non-negative integers and $m$ over all positive integers, is a basis of $A_1$.

Let $\alpha_2$ satisfy
\[ \alpha_2(J_v F_v^j K_v^l) = q^{-2l} J_v F_v^j K_v^l, \qquad \alpha_2(J_v F_v^j \overline{K}_v^m) = q^{2m} J_v F_v^j \overline{K}_v^m, \qquad \alpha_2(J_v F_v^j J_v) = J_v F_v^j J_v. \]
Then $\alpha_2$ can be extended to an algebra automorphism of $A_1$. Let $\delta$ satisfy $\delta(1) = \delta(K_v) = \delta(\overline{K}_v) = 0$ and
\[ \delta(J_v F_v^j K_v^l) = \sum_{i=0}^{j-1} J_v F_v^{j-1}\, \frac{q^{-2i} K_v - q^{2i} \overline{K}_v}{q - q^{-1}}\, K_v^l, \]
\[ \delta(J_v F_v^j \overline{K}_v^l) = \sum_{i=0}^{j-1} J_v F_v^{j-1}\, \frac{q^{-2i} K_v - q^{2i} \overline{K}_v}{q - q^{-1}}\, \overline{K}_v^l, \]
\[ \delta(J_v F_v^j J_v) = \sum_{i=0}^{j-1} J_v F_v^{j-1}\, \frac{q^{-2i} K_v - q^{2i} \overline{K}_v}{q - q^{-1}}\, J_v \]

for $j > 0$ and $l \ge 0$. Then, just as in the proof of Lemma VI.1.5 in [16], it can be shown that $\delta$ can be extended to an $\alpha_2$-derivation of $A_1$ such that $A_2 = A_1[J_v E_v J_v, \alpha_2, \delta]$ is a weak Ore extension of $A_1$. Then in $A_2$,
\[ J_v E_v K_v = \alpha_2(K_v) J_v E_v J_v + \delta(K_v) = q^{-2} K_v E_v J_v, \qquad J_v E_v \overline{K}_v = q^{2}\, \overline{K}_v E_v J_v, \]
\[ J_v E_v J_v F_v J_v = \alpha_2(F_v) J_v E_v J_v + \delta(J_v F_v J_v) = J_v F_v J_v E_v J_v + \frac{K_v - \overline{K}_v}{q - q^{-1}}. \]

From these, we conclude that $A_2 \cong U_q^v$ as algebras. Thus, from Lemma 6, $U_q^v$ is Noetherian. By Corollary 1, $U_q^v$ is free with basis $\{J_v E_v^i J_v\}_{i \ge 0}$ as a left $A_1$-module. Thus, as a $k$-linear space, $U_q^v$ has the basis
\[ Q_v = \{J_v F_v^j K_v^l E_v^i J_v,\ J_v F_v^j \overline{K}_v^m E_v^i J_v,\ J_v F_v^j J_v E_v^i J_v\}, \]


where $i, j, l$ run over all non-negative integers and $m$ runs over all positive integers. By (48)–(51) any $x \in P_v$ (resp. $Q_v$) can be $k$-linearly generated by some elements of $Q_v$ (resp. $P_v$), and therefore $P_v$ and $Q_v$ generate the same space $U_q^v$.

4. Extension to the q = 1 Case

Let us discuss the relation between $U_q^w = wsl_q(2)$ and $U(sl_q(2))$. Just as for the quantum algebra $sl_q(2)$, we first have to give another presentation for $U_q^w$. Let $q \in \mathbb{C}$ and $q \ne \pm 1, 0$. Define $U_q^{w\prime}$ as the algebra generated by the five variables $E_w, F_w, K_w, \overline{K}_w, L_w$ with the relations (for $U_q^{v\prime}$, Eqs. (56) and (57) should be exchanged with (32) and (33) respectively):
\[ K_w \overline{K}_w = \overline{K}_w K_w, \tag{54} \]
\[ K_w \overline{K}_w K_w = K_w, \qquad \overline{K}_w K_w \overline{K}_w = \overline{K}_w, \tag{55} \]
\[ K_w E_w = q^{2} E_w K_w, \qquad \overline{K}_w E_w = q^{-2} E_w \overline{K}_w, \tag{56} \]
\[ K_w F_w = q^{-2} F_w K_w, \qquad \overline{K}_w F_w = q^{2} F_w \overline{K}_w, \tag{57} \]
\[ [L_w, E_w] = q\,(E_w K_w + \overline{K}_w E_w), \tag{58} \]
\[ [L_w, F_w] = -q^{-1}(F_w K_w + \overline{K}_w F_w), \tag{59} \]
\[ E_w F_w - F_w E_w = L_w, \qquad (q - q^{-1})L_w = (K_w - \overline{K}_w). \tag{60} \]

For $vsl_q(2)$ we can similarly define the algebra $U_q^{v\prime}$ by
\[ K_v \overline{K}_v = \overline{K}_v K_v, \tag{61} \]
\[ K_v \overline{K}_v K_v = K_v, \qquad \overline{K}_v K_v \overline{K}_v = \overline{K}_v, \tag{62} \]
\[ K_v E_v \overline{K}_v = q^{2} E_v, \tag{63} \]
\[ K_v F_v \overline{K}_v = q^{-2} F_v, \tag{64} \]
\[ L_v J_v E_v - E_v J_v L_v = q\,(E_v K_v + \overline{K}_v E_v), \tag{65} \]
\[ L_v J_v F_v - F_v J_v L_v = -q^{-1}(F_v K_v + \overline{K}_v F_v), \tag{66} \]
\[ E_v J_v F_v - F_v J_v E_v = L_v, \qquad (q - q^{-1})L_v = (K_v - \overline{K}_v). \tag{67} \]

Note that, contrary to $U_q^w$ and $U_q^v$, the algebras $U_q^{w\prime}$ and $U_q^{v\prime}$ are defined for all invertible values of the parameter $q$, in particular for $q = 1$.

Proposition 4. The algebra $U_q^w$ is isomorphic to the algebra $U_q^{w\prime}$ with $\varphi_w$ satisfying $\varphi_w(E_w) = E_w$, $\varphi_w(F_w) = F_w$, $\varphi_w(K_w) = K_w$, $\varphi_w(\overline{K}_w) = \overline{K}_w$.

Proof. The proof is similar to that of Proposition VI.2.1 in [16] for $sl_q(2)$. It suffices to check that $\varphi_w$ and the map $\psi_w : U_q^{w\prime} \to U_q^w$ satisfying $\psi_w(E_w) = E_w$, $\psi_w(F_w) = F_w$, $\psi_w(K_w) = K_w$, $\psi_w(L_w) = [E_w, F_w]$ are reciprocal algebra morphisms.

On the other hand, we can give the following relationship between $U_q^{w\prime}$ and $U(sl(2))$, whose proof is easy.


Proposition 5. For $q = 1$:
(i) the algebra isomorphism $U(sl(2)) \cong U_1^{w\prime}/(K_w - 1)$ holds;
(ii) there exists an injective algebra morphism $\pi$ from $U_1^{w\prime}$ to $U(sl(2))[K_w]/(K_w^3 - K_w)$ satisfying $\pi(E_w) = XK_w$, $\pi(F_w) = Y$, $\pi(K_w) = K_w$, $\pi(L_w) = HK_w$.

Remark 9. In Proposition 5(ii), $\pi$ is only injective, but not surjective, since $K^2 \ne 1$ in $U(sl(2))[K]/(K^3 - K)$ and therefore $X$ does not lie in the image of $\pi$.
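Returning to Proposition 4, the key point in checking that $\psi_w$ is an algebra morphism is that the image of $L_w$ satisfies the commutator relations of the second presentation. The sketch below verifies the relation for $[L_w, E_w]$, under the assumption that in $U_q^w$ one has $K_w E_w = q^2 E_w K_w$, $\overline{K}_w E_w = q^{-2} E_w \overline{K}_w$ and $[E_w, F_w] = (K_w - \overline{K}_w)/(q - q^{-1})$, as used in the proof of Theorem 2; the symbol $L$ below is shorthand for $\psi_w(L_w)$.

```latex
% L := \psi_w(L_w) = [E_w, F_w] = (K_w - \overline{K}_w)/(q - q^{-1}).
\begin{align*}
(q - q^{-1})\,[L, E_w]
  &= (K_w - \overline{K}_w)E_w - E_w(K_w - \overline{K}_w) \\
  &= (K_w E_w - E_w K_w) + (E_w \overline{K}_w - \overline{K}_w E_w) \\
  &= (q^{2} - 1)\,E_w K_w + (q^{2} - 1)\,\overline{K}_w E_w , \\
[L, E_w]
  &= \frac{q^{2} - 1}{q - q^{-1}}\,\bigl(E_w K_w + \overline{K}_w E_w\bigr)
   = q\,\bigl(E_w K_w + \overline{K}_w E_w\bigr).
\end{align*}
% Here E_w \overline{K}_w = q^{2}\,\overline{K}_w E_w was used in the second bracket.
```

The relation for $[L_w, F_w]$ follows in the same way with $q$ replaced by $q^{-1}$ and an overall sign change.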

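5. Weak Hopf Algebras Structure

Here we define weak analogs in $wsl_q(2)$ and $vsl_q(2)$ of the standard Hopf algebra structures $\Delta$, $\varepsilon$, $S$ (comultiplication, counit and antipode), which should be algebra morphisms. For the weak quantum algebra $wsl_q(2)$ we define the maps $\Delta_w : wsl_q(2) \to wsl_q(2) \otimes wsl_q(2)$, $\varepsilon_w : wsl_q(2) \to k$ and $T_w : wsl_q(2) \to wsl_q(2)$ satisfying respectively
\[ \Delta_w(E_w) = 1 \otimes E_w + E_w \otimes K_w, \qquad \Delta_w(F_w) = F_w \otimes 1 + \overline{K}_w \otimes F_w, \tag{68} \]
\[ \Delta_w(K_w) = K_w \otimes K_w, \qquad \Delta_w(\overline{K}_w) = \overline{K}_w \otimes \overline{K}_w, \tag{69} \]
\[ \varepsilon_w(E_w) = \varepsilon_w(F_w) = 0, \qquad \varepsilon_w(K_w) = \varepsilon_w(\overline{K}_w) = 1, \tag{70} \]
\[ T_w(E_w) = -E_w \overline{K}_w, \quad T_w(F_w) = -K_w F_w, \quad T_w(K_w) = \overline{K}_w, \quad T_w(\overline{K}_w) = K_w. \tag{71} \]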

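The difference from the standard case (we follow the notation of [16]) lies in the substitution of $K^{-1}$ with $\overline{K}_w$ and in the last line, where instead of the antipode $S$ the weak antipode $T_w$ is introduced [18].

Proposition 6. The relations (68)–(71) endow $wsl_q(2)$ with a bialgebra structure.

Proof. It can be shown by direct calculation that the following relations hold:
\[ \Delta_w(K_w)\Delta_w(\overline{K}_w) = \Delta_w(\overline{K}_w)\Delta_w(K_w), \tag{72} \]
\[ \Delta_w(K_w)\Delta_w(\overline{K}_w)\Delta_w(K_w) = \Delta_w(K_w), \tag{73} \]
\[ \Delta_w(\overline{K}_w)\Delta_w(K_w)\Delta_w(\overline{K}_w) = \Delta_w(\overline{K}_w), \tag{74} \]
\[ \Delta_w(K_w)\Delta_w(E_w) = q^{2}\,\Delta_w(E_w)\Delta_w(K_w), \tag{75} \]
\[ \Delta_w(\overline{K}_w)\Delta_w(E_w) = q^{-2}\,\Delta_w(E_w)\Delta_w(\overline{K}_w), \tag{76} \]
\[ \Delta_w(K_w)\Delta_w(F_w) = q^{-2}\,\Delta_w(F_w)\Delta_w(K_w), \tag{77} \]
\[ \Delta_w(\overline{K}_w)\Delta_w(F_w) = q^{2}\,\Delta_w(F_w)\Delta_w(\overline{K}_w), \tag{78} \]
\[ \Delta_w(E_w)\Delta_w(F_w) - \Delta_w(F_w)\Delta_w(E_w) = \frac{\Delta_w(K_w) - \Delta_w(\overline{K}_w)}{q - q^{-1}}; \tag{79} \]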


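\[ \varepsilon_w(K_w)\varepsilon_w(\overline{K}_w) = \varepsilon_w(\overline{K}_w)\varepsilon_w(K_w), \tag{80} \]
\[ \varepsilon_w(K_w)\varepsilon_w(\overline{K}_w)\varepsilon_w(K_w) = \varepsilon_w(K_w), \tag{81} \]
\[ \varepsilon_w(\overline{K}_w)\varepsilon_w(K_w)\varepsilon_w(\overline{K}_w) = \varepsilon_w(\overline{K}_w), \tag{82} \]
\[ \varepsilon_w(K_w)\varepsilon_w(E_w) = q^{2}\,\varepsilon_w(E_w)\varepsilon_w(K_w), \tag{83} \]
\[ \varepsilon_w(\overline{K}_w)\varepsilon_w(E_w) = q^{-2}\,\varepsilon_w(E_w)\varepsilon_w(\overline{K}_w), \tag{84} \]
\[ \varepsilon_w(K_w)\varepsilon_w(F_w) = q^{-2}\,\varepsilon_w(F_w)\varepsilon_w(K_w), \tag{85} \]
\[ \varepsilon_w(\overline{K}_w)\varepsilon_w(F_w) = q^{2}\,\varepsilon_w(F_w)\varepsilon_w(\overline{K}_w), \tag{86} \]
\[ \varepsilon_w(E_w)\varepsilon_w(F_w) - \varepsilon_w(F_w)\varepsilon_w(E_w) = \frac{\varepsilon_w(K_w) - \varepsilon_w(\overline{K}_w)}{q - q^{-1}}; \tag{87} \]
\[ T_w(\overline{K}_w)T_w(K_w) = T_w(K_w)T_w(\overline{K}_w), \tag{88} \]
\[ T_w(K_w)T_w(\overline{K}_w)T_w(K_w) = T_w(K_w), \tag{89} \]
\[ T_w(\overline{K}_w)T_w(K_w)T_w(\overline{K}_w) = T_w(\overline{K}_w), \tag{90} \]
\[ T_w(E_w)T_w(K_w) = q^{2}\,T_w(K_w)T_w(E_w), \tag{91} \]
\[ T_w(E_w)T_w(\overline{K}_w) = q^{-2}\,T_w(\overline{K}_w)T_w(E_w), \tag{92} \]
\[ T_w(F_w)T_w(K_w) = q^{-2}\,T_w(K_w)T_w(F_w), \tag{93} \]
\[ T_w(F_w)T_w(\overline{K}_w) = q^{2}\,T_w(\overline{K}_w)T_w(F_w), \tag{94} \]
\[ T_w(F_w)T_w(E_w) - T_w(E_w)T_w(F_w) = \frac{T_w(K_w) - T_w(\overline{K}_w)}{q - q^{-1}}. \tag{95} \]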

Therefore, through the basis in Theorem 2, $\Delta_w$ and $\varepsilon_w$ can be extended to algebra morphisms from $wsl_q(2)$ to $wsl_q(2) \otimes wsl_q(2)$ and from $wsl_q(2)$ to $k$ respectively, and $T_w$ can be extended to an anti-algebra morphism from $wsl_q(2)$ to $wsl_q(2)$. Using (72)–(87) it can be shown that
\[ (\Delta_w \otimes id)\Delta_w(X) = (id \otimes \Delta_w)\Delta_w(X), \tag{96} \]
\[ (\varepsilon_w \otimes id)\Delta_w(X) = (id \otimes \varepsilon_w)\Delta_w(X) = X \tag{97} \]
for any $X = E_w, F_w, K_w$ or $\overline{K}_w$. Let $\mu_w$ and $\eta_w$ be the product and the unit of $wsl_q(2)$ respectively. Hence $(wsl_q(2), \mu_w, \eta_w, \Delta_w, \varepsilon_w)$ becomes a bialgebra.

Next we introduce the star product in the bialgebra $(wsl_q(2), \mu_w, \eta_w, \Delta_w, \varepsilon_w)$ in the standard way (see e.g. [16]):
\[ (A \star_w B)(X) = \mu_w [A \otimes B]\, \Delta_w(X). \tag{98} \]
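As a spot check of the "direct calculation" invoked in the proof of Proposition 6, relation (75) can be verified from the definitions (68)–(69). The sketch assumes only the commutation rule $K_w E_w = q^2 E_w K_w$ of $wsl_q(2)$ and the componentwise product on the tensor square.

```latex
\begin{align*}
\Delta_w(K_w)\Delta_w(E_w)
  &= (K_w \otimes K_w)(1 \otimes E_w + E_w \otimes K_w) \\
  &= K_w \otimes K_w E_w + K_w E_w \otimes K_w^{2} \\
  &= q^{2}\bigl(K_w \otimes E_w K_w + E_w K_w \otimes K_w^{2}\bigr) \\
  &= q^{2}\,(1 \otimes E_w + E_w \otimes K_w)(K_w \otimes K_w)
   = q^{2}\,\Delta_w(E_w)\Delta_w(K_w).
\end{align*}
```

The remaining relations (76)–(78) follow by the same pattern with $\overline{K}_w$ or $F_w$ in place of $K_w$ or $E_w$.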

Proposition 7. $T_w$ satisfies the regularity conditions
\[ (id \star_w T_w \star_w id)(X) = X, \tag{99} \]
\[ (T_w \star_w id \star_w T_w)(X) = T_w(X) \tag{100} \]
for any $X = E_w, F_w, K_w$ or $\overline{K}_w$. This means that $T_w$ is a weak antipode.


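Proof. This follows from (72)–(95) by tedious calculations. For $X = K_w, \overline{K}_w$ it is easy, so we consider $X = E_w$ as an example. We have
\begin{align*}
(id \star_w T_w \star_w id)(E_w) &= \mu_w[(id \star_w T_w) \otimes id]\,\Delta_w(E_w) \\
&= \mu_w[(id \star_w T_w) \otimes id]\,(1 \otimes E_w + E_w \otimes K_w) \\
&= (id \star_w T_w)(1)\, id(E_w) + (id \star_w T_w)(E_w)\, id(K_w) \\
&= \mu_w[id \otimes T_w]\,\Delta_w(1)\, id(E_w) + \mu_w[id \otimes T_w]\,\Delta_w(E_w)\, id(K_w) \\
&= \mu_w[id \otimes T_w]\,(1 \otimes 1)\, id(E_w) + \mu_w[id \otimes T_w]\,(1 \otimes E_w + E_w \otimes K_w)\, id(K_w) \\
&= T_w(1)\, id(E_w) + id(1)\, T_w(E_w)\, id(K_w) + id(E_w)\, T_w(K_w)\, id(K_w) \\
&= E_w - E_w \overline{K}_w \cdot K_w + E_w \cdot \overline{K}_w \cdot K_w = E_w = id(E_w).
\end{align*}
By analogy, for (100) and $X = E_w$ we obtain
\begin{align*}
(T_w \star_w id \star_w T_w)(E_w) &= \mu_w[(T_w \star_w id) \otimes T_w]\,\Delta_w(E_w) \\
&= \mu_w[(T_w \star_w id) \otimes T_w]\,(1 \otimes E_w + E_w \otimes K_w) \\
&= (T_w \star_w id)(1)\, T_w(E_w) + (T_w \star_w id)(E_w)\, T_w(K_w) \\
&= \mu_w[T_w \otimes id]\,(1 \otimes 1)\, T_w(1 E_w 1) + \mu_w[T_w \otimes id]\,(1 \otimes E_w + E_w \otimes K_w)\, T_w(K_w) \\
&= T_w(1)\, T_w(E_w) + T_w(1)\, id(E_w)\, T_w(K_w) + T_w(E_w)\, id(K_w)\, T_w(K_w) \\
&= -E_w \overline{K}_w + E_w \overline{K}_w - E_w \overline{K}_w K_w \overline{K}_w = -E_w \overline{K}_w = T_w(E_w).
\end{align*}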
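The proof of Proposition 7 treats only $X = E_w$ explicitly. For comparison, a sketch of the same computation of (99) for $X = F_w$ is given below; it assumes $T_w(1) = 1$ (as used implicitly above) and the definitions (68), (69), (71), (98), and needs no commutation relations.

```latex
\begin{align*}
(id \star_w T_w)(F_w)
  &= \mu_w[id \otimes T_w](F_w \otimes 1 + \overline{K}_w \otimes F_w)
   = F_w - \overline{K}_w K_w F_w, \\
(id \star_w T_w)(\overline{K}_w)
  &= \mu_w[id \otimes T_w](\overline{K}_w \otimes \overline{K}_w)
   = \overline{K}_w K_w, \\
(id \star_w T_w \star_w id)(F_w)
  &= \mu_w[(id \star_w T_w) \otimes id](F_w \otimes 1 + \overline{K}_w \otimes F_w) \\
  &= (F_w - \overline{K}_w K_w F_w)\cdot 1 + \overline{K}_w K_w \cdot F_w
   = F_w .
\end{align*}
```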

Corollary 2. The bialgebra $wsl_q(2)$ is a weak Hopf algebra with the weak antipode $T_w$.

We can get an inner endomorphism as follows:

Proposition 8. $T_w^2$ is an inner endomorphism of the algebra $wsl_q(2)$ satisfying, for any $X \in wsl_q(2)$,
\[ T_w^2(X) = K_w X \overline{K}_w, \tag{101} \]
and in particular
\[ T_w^2(K_w) = id(K_w), \qquad T_w^2(\overline{K}_w) = id(\overline{K}_w). \tag{102} \]

Proof. This follows from (71).

Assume that with the operations $\mu_w, \eta_w, \Delta_w, \varepsilon_w$ the algebra $wsl_q(2)$ possessed an antipode $S$ so as to become a Hopf algebra. Then $S$ should satisfy $(S \star_w id)(K_w) = \eta_w \varepsilon_w(K_w)$, and so it should follow that $S(K_w)K_w = 1$. But this cannot hold, since $S(K_w)$ can be written as a linear sum of the basis in Theorem 2. This implies that $wsl_q(2)$ cannot become a Hopf algebra with the operations above.

Corollary 3. $wsl_q(2)$ is an example of a non-commutative and non-cocommutative weak Hopf algebra which is not a Hopf algebra.


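In order for $U_q^{w\prime}$ to become a weak Hopf algebra, it is enough to define $\Delta_w(E_w)$, $\Delta_w(F_w)$, $\Delta_w(K_w)$, $\Delta_w(\overline{K}_w)$, $\varepsilon_w(E_w)$, $\varepsilon_w(F_w)$, $\varepsilon_w(K_w)$, $\varepsilon_w(\overline{K}_w)$, $T_w(E_w)$, $T_w(F_w)$, $T_w(K_w)$, $T_w(\overline{K}_w)$ just as in $wsl_q(2)$ and to define
\[ \Delta_w(L_w) = \frac{1}{q - q^{-1}}\,(K_w \otimes K_w - \overline{K}_w \otimes \overline{K}_w), \qquad \varepsilon_w(L_w) = 0, \qquad T_w(L_w) = \frac{\overline{K}_w - K_w}{q - q^{-1}}. \]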

From Proposition 4 we conclude that $wsl_q(2)$ is isomorphic to the algebra $U_q^{w\prime}$ with $\varphi_w$. Moreover, one can see easily that $\varphi_w$ is an isomorphism of weak Hopf algebras from $wsl_q(2)$ to $U_q^{w\prime}$.

For the $J$-weak quantum algebra $vsl_q(2)$ we suppose that some additional $J_v$ should appear even in the definitions of comultiplication and antipode. A thorough analysis gives the following nontrivial definitions:
\[ \Delta_v(E_v) = J_v \otimes J_v E_v J_v + J_v E_v J_v \otimes K_v, \tag{103} \]
\[ \Delta_v(F_v) = J_v F_v J_v \otimes J_v + \overline{K}_v \otimes J_v F_v J_v, \tag{104} \]
\[ \Delta_v(K_v) = K_v \otimes K_v, \qquad \Delta_v(\overline{K}_v) = \overline{K}_v \otimes \overline{K}_v, \tag{105} \]
\[ \varepsilon_v(E_v) = \varepsilon_v(F_v) = 0, \qquad \varepsilon_v(K_v) = \varepsilon_v(\overline{K}_v) = 1, \tag{106} \]
\[ T_v(E_v) = -J_v E_v \overline{K}_v, \qquad T_v(F_v) = -K_v F_v J_v, \tag{107} \]
\[ T_v(K_v) = \overline{K}_v, \qquad T_v(\overline{K}_v) = K_v. \tag{108} \]

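Note that from (105) it follows that
\[ \Delta_v(J_v) = J_v \otimes J_v, \tag{109} \]
and so $J_v$ is a group-like element.

Proposition 9. The relations (103)–(108) endow $vsl_q(2)$ with a bialgebra structure.

Proof. First we should prove that $\Delta_v$ defines a morphism of algebras from $vsl_q(2)$ into $vsl_q(2) \otimes vsl_q(2)$. We check that
\[ \Delta_v(K_v)\Delta_v(\overline{K}_v) = \Delta_v(\overline{K}_v)\Delta_v(K_v), \tag{110} \]
\[ \Delta_v(K_v)\Delta_v(\overline{K}_v)\Delta_v(K_v) = \Delta_v(K_v), \tag{111} \]
\[ \Delta_v(\overline{K}_v)\Delta_v(K_v)\Delta_v(\overline{K}_v) = \Delta_v(\overline{K}_v), \tag{112} \]
\[ \Delta_v(K_v)\Delta_v(E_v)\Delta_v(\overline{K}_v) = q^{2}\,\Delta_v(E_v), \tag{113} \]
\[ \Delta_v(K_v)\Delta_v(F_v)\Delta_v(\overline{K}_v) = q^{-2}\,\Delta_v(F_v), \tag{114} \]
\[ \Delta_v(E_v)\Delta_v(J_v)\Delta_v(F_v) - \Delta_v(F_v)\Delta_v(J_v)\Delta_v(E_v) = \frac{\Delta_v(K_v) - \Delta_v(\overline{K}_v)}{q - q^{-1}}. \tag{115} \]
The relations (110)–(112) are clear from (105). For (113) we have
\begin{align*}
\Delta_v(K_v)\Delta_v(E_v)\Delta_v(\overline{K}_v) &= (K_v \otimes K_v)(J_v \otimes J_v E_v J_v + J_v E_v J_v \otimes K_v)(\overline{K}_v \otimes \overline{K}_v) \\
&= J_v \otimes K_v E_v \overline{K}_v + K_v E_v \overline{K}_v \otimes K_v \\
&= q^{2}\,(J_v \otimes J_v E_v J_v + J_v E_v J_v \otimes K_v) = q^{2}\,\Delta_v(E_v).
\end{align*}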


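Relation (114) is obtained similarly. Next, for (115), exploiting (7), (34) and (35)–(36) we derive
\begin{align*}
&\Delta_v(E_v)\Delta_v(J_v)\Delta_v(F_v) - \Delta_v(F_v)\Delta_v(J_v)\Delta_v(E_v) \\
&\quad = (J_v \otimes J_v E_v J_v + J_v E_v J_v \otimes K_v)(J_v \otimes J_v)(J_v F_v J_v \otimes J_v + \overline{K}_v \otimes J_v F_v J_v) \\
&\qquad - (J_v F_v J_v \otimes J_v + \overline{K}_v \otimes J_v F_v J_v)(J_v \otimes J_v)(J_v \otimes J_v E_v J_v + J_v E_v J_v \otimes K_v) \\
&\quad = J_v F_v J_v \otimes J_v E_v J_v - J_v F_v J_v \otimes J_v E_v J_v + J_v E_v \overline{K}_v \otimes K_v F_v J_v - \overline{K}_v E_v J_v \otimes J_v F_v K_v \\
&\qquad + J_v E_v J_v F_v J_v \otimes K_v - J_v F_v J_v E_v J_v \otimes K_v + \overline{K}_v \otimes J_v E_v J_v F_v J_v - \overline{K}_v \otimes J_v F_v J_v E_v J_v \\
&\quad = J_v (E_v J_v F_v - F_v J_v E_v) J_v \otimes K_v + \overline{K}_v \otimes J_v (E_v J_v F_v - F_v J_v E_v) J_v \\
&\quad = J_v\, \frac{K_v - \overline{K}_v}{q - q^{-1}}\, J_v \otimes K_v + \overline{K}_v \otimes J_v\, \frac{K_v - \overline{K}_v}{q - q^{-1}}\, J_v \\
&\quad = \frac{K_v \otimes K_v - \overline{K}_v \otimes \overline{K}_v}{q - q^{-1}} = \frac{\Delta_v(K_v) - \Delta_v(\overline{K}_v)}{q - q^{-1}}.
\end{align*}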
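The "similar" computation for (114) can be written out in the same style as the one for (113). The sketch below assumes $J_v = K_v \overline{K}_v$ (consistent with (109) following from (105)), the regularity relations $K_v J_v = K_v$ and $J_v \overline{K}_v = \overline{K}_v$, and the defining relation $K_v F_v \overline{K}_v = q^{-2} F_v$; these are taken from the earlier sections of the paper and are not re-derived here.

```latex
\begin{align*}
\Delta_v(K_v)\Delta_v(F_v)\Delta_v(\overline{K}_v)
  &= (K_v \otimes K_v)(J_v F_v J_v \otimes J_v + \overline{K}_v \otimes J_v F_v J_v)
     (\overline{K}_v \otimes \overline{K}_v) \\
  &= K_v J_v F_v J_v \overline{K}_v \otimes K_v J_v \overline{K}_v
     + K_v \overline{K}_v \overline{K}_v \otimes K_v J_v F_v J_v \overline{K}_v \\
  &= K_v F_v \overline{K}_v \otimes J_v + \overline{K}_v \otimes K_v F_v \overline{K}_v \\
  &= q^{-2}\,(J_v F_v J_v \otimes J_v + \overline{K}_v \otimes J_v F_v J_v)
   = q^{-2}\,\Delta_v(F_v).
\end{align*}
% Last step: K_v F_v \overline{K}_v = J_v (K_v F_v \overline{K}_v) J_v = q^{-2} J_v F_v J_v,
% because J_v K_v = K_v and \overline{K}_v J_v = \overline{K}_v.
```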

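Then we show that $\Delta_v$ is coassociative:
\[ (\Delta_v \otimes id)\,\Delta_v(X) = (id \otimes \Delta_v)\,\Delta_v(X). \tag{116} \]
Take $E_v$ as an example. On the one hand,
\begin{align*}
(\Delta_v \otimes id)\,\Delta_v(E_v) &= (\Delta_v \otimes id)(J_v \otimes J_v E_v J_v + J_v E_v J_v \otimes K_v) \\
&= \Delta_v(J_v) \otimes J_v E_v J_v + \Delta_v(J_v)\Delta_v(E_v)\Delta_v(J_v) \otimes K_v \\
&= J_v \otimes J_v \otimes J_v E_v J_v + J_v \otimes J_v E_v J_v \otimes K_v + J_v E_v J_v \otimes K_v \otimes K_v.
\end{align*}
On the other hand,
\begin{align*}
(id \otimes \Delta_v)\,\Delta_v(E_v) &= (id \otimes \Delta_v)(J_v \otimes J_v E_v J_v + J_v E_v J_v \otimes K_v) \\
&= J_v \otimes \Delta_v(J_v)\Delta_v(E_v)\Delta_v(J_v) + J_v E_v J_v \otimes \Delta_v(K_v) \\
&= J_v \otimes J_v \otimes J_v E_v J_v + J_v \otimes J_v E_v J_v \otimes K_v + J_v E_v J_v \otimes K_v \otimes K_v,
\end{align*}
which coincides with the previous expression. The proof that the counit $\varepsilon_v$ defines a morphism of algebras from $vsl_q(2)$ onto $k$ is straightforward, and the result has the form
\[ \varepsilon_v(K_v)\varepsilon_v(\overline{K}_v) = \varepsilon_v(\overline{K}_v)\varepsilon_v(K_v), \tag{117} \]
\[ \varepsilon_v(K_v)\varepsilon_v(\overline{K}_v)\varepsilon_v(K_v) = \varepsilon_v(K_v), \tag{118} \]
\[ \varepsilon_v(\overline{K}_v)\varepsilon_v(K_v)\varepsilon_v(\overline{K}_v) = \varepsilon_v(\overline{K}_v), \tag{119} \]
\[ \varepsilon_v(K_v)\varepsilon_v(E_v)\varepsilon_v(\overline{K}_v) = q^{2}\,\varepsilon_v(E_v), \tag{120} \]
\[ \varepsilon_v(K_v)\varepsilon_v(F_v)\varepsilon_v(\overline{K}_v) = q^{-2}\,\varepsilon_v(F_v), \tag{121} \]
\[ \varepsilon_v(E_v)\varepsilon_v(J_v)\varepsilon_v(F_v) - \varepsilon_v(F_v)\varepsilon_v(J_v)\varepsilon_v(E_v) = \frac{\varepsilon_v(K_v) - \varepsilon_v(\overline{K}_v)}{q - q^{-1}}. \tag{122} \]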


Moreover, it can be shown that $(\varepsilon_v \otimes id)\Delta_v(X) = (id \otimes \varepsilon_v)\Delta_v(X) = X$ for $X = E_v, F_v, K_v, \overline{K}_v$.

Further we check that $T_v$ defines an anti-morphism of algebras from $vsl_q(2)$ to $vsl_q(2)^{op}$ as follows:
\[ T_v(K_v)T_v(\overline{K}_v) = T_v(\overline{K}_v)T_v(K_v), \tag{123} \]
\[ T_v(K_v)T_v(\overline{K}_v)T_v(K_v) = T_v(K_v), \tag{124} \]
\[ T_v(\overline{K}_v)T_v(K_v)T_v(\overline{K}_v) = T_v(\overline{K}_v), \tag{125} \]
\[ T_v(\overline{K}_v)T_v(E_v)T_v(K_v) = q^{2}\,T_v(E_v), \tag{126} \]
\[ T_v(\overline{K}_v)T_v(F_v)T_v(K_v) = q^{-2}\,T_v(F_v), \tag{127} \]
\[ T_v(F_v)T_v(J_v)T_v(E_v) - T_v(E_v)T_v(J_v)T_v(F_v) = \frac{T_v(K_v) - T_v(\overline{K}_v)}{q - q^{-1}}. \tag{128} \]
The first three relations are obvious. For (126), using (107) and (35), we have
\begin{align*}
T_v(\overline{K}_v)T_v(E_v)T_v(K_v) &= K_v(-J_v E_v \overline{K}_v)\overline{K}_v = -q^{2} K_v(\overline{K}_v E_v J_v)\overline{K}_v \\
&= -q^{2} J_v E_v J_v \overline{K}_v = -q^{2} J_v E_v \overline{K}_v = q^{2}\,T_v(E_v).
\end{align*}
For the last relation (128), using (35)–(36), we obtain
\begin{align*}
T_v(F_v)T_v(J_v)T_v(E_v) - T_v(E_v)T_v(J_v)T_v(F_v) &= (-K_v F_v J_v)J_v(-J_v E_v \overline{K}_v) - (-J_v E_v \overline{K}_v)J_v(-K_v F_v J_v) \\
&= J_v(F_v J_v E_v - E_v J_v F_v)J_v = J_v\, \frac{\overline{K}_v - K_v}{q - q^{-1}}\, J_v = \frac{T_v(K_v) - T_v(\overline{K}_v)}{q - q^{-1}}.
\end{align*}
Therefore, we conclude that $(vsl_q(2), \mu_v, \eta_v, \Delta_v, T_v)$ has the structure of a bialgebra.

The following property of $T_v$ is crucial for understanding the structure of the bialgebra $(vsl_q(2), \mu_v, \eta_v, \Delta_v, T_v)$.

Proposition 10. For any $X \in vsl_q(2)$ we have (cf. (101)–(102))
\[ T_v^2(K_v) = e_v(K_v), \qquad T_v^2(\overline{K}_v) = e_v(\overline{K}_v), \tag{129} \]
\[ T_v^2(E_v) = K_v E_v \overline{K}_v, \qquad T_v^2(F_v) = K_v F_v \overline{K}_v, \tag{130} \]
where $e_v(X)$ is defined in (11).

Proof. This follows from (7) and (107)–(108). As an example, for $E_v$ we have
\[ T_v^2(E_v) = T_v(-J_v E_v \overline{K}_v) = -T_v(\overline{K}_v)T_v(E_v)T_v(J_v) = K_v J_v E_v \overline{K}_v J_v = K_v E_v \overline{K}_v. \]

The star product in $(vsl_q(2), \mu_v, \eta_v, \Delta_v, T_v)$ has the form
\[ (A \star_v B)(X) = \mu_v [A \otimes B]\, \Delta_v(X). \tag{131} \]
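The proof of Proposition 10 works out only $T_v^2(E_v)$. A sketch of the companion computation for $F_v$ is given below. It assumes that $T_v$ is an anti-morphism, that $J_v = K_v \overline{K}_v$ (so that $T_v(J_v) = T_v(\overline{K}_v)T_v(K_v) = K_v \overline{K}_v = J_v$), and the regularity relations $J_v K_v = K_v$, $J_v \overline{K}_v = \overline{K}_v$.

```latex
\begin{align*}
T_v^2(F_v) &= T_v(-K_v F_v J_v) \\
           &= -\,T_v(J_v)\,T_v(F_v)\,T_v(K_v) \\
           &= -\,J_v\,(-K_v F_v J_v)\,\overline{K}_v \\
           &= J_v K_v F_v J_v \overline{K}_v
            = K_v F_v \overline{K}_v ,
\end{align*}
% in agreement with the second formula in (130).
```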


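Proposition 11. $T_v$ satisfies the regularity conditions
\[ (e_v \star_v T_v \star_v e_v)(X) = e_v(X), \tag{132} \]
\[ (T_v \star_v e_v \star_v T_v)(X) = T_v(X) \tag{133} \]
for any $X = E_v, F_v, K_v$ or $\overline{K}_v$.

Proof. This follows from (103)–(108) and (131). For $X = K_v, \overline{K}_v$ it is easy, so we consider $X = E_v$ as an example. We have
\begin{align*}
(e_v \star_v T_v \star_v e_v)(E_v) &= \mu_v[(e_v \star_v T_v) \otimes e_v]\,\Delta_v(E_v) \\
&= \mu_v[(e_v \star_v T_v) \otimes e_v]\,(J_v \otimes J_v E_v J_v + J_v E_v J_v \otimes K_v) \\
&= (e_v \star_v T_v)(J_v)\, e_v(J_v E_v J_v) + (e_v \star_v T_v)(J_v E_v J_v)\, e_v(K_v) \\
&= \mu_v[e_v \otimes T_v]\,\Delta_v(J_v)\, e_v(J_v E_v J_v) + \mu_v[e_v \otimes T_v]\,\Delta_v(E_v)\, e_v(K_v) \\
&= \mu_v[e_v \otimes T_v]\,(J_v \otimes J_v)\, e_v(E_v) + \mu_v[e_v \otimes T_v]\,(J_v \otimes J_v E_v J_v + J_v E_v J_v \otimes K_v)\, e_v(K_v) \\
&= e_v(J_v)\, T_v(J_v)\, e_v(E_v) + e_v(J_v)\, T_v(J_v E_v J_v)\, e_v(K_v) + e_v(E_v)\, T_v(K_v)\, e_v(K_v) \\
&= J_v \cdot J_v \cdot J_v E_v J_v - J_v \cdot J_v J_v E_v \overline{K}_v \cdot J_v K_v J_v + J_v E_v J_v \cdot \overline{K}_v \cdot J_v K_v J_v \\
&= J_v E_v J_v = e_v(E_v).
\end{align*}
By analogy, for (133) and $X = E_v$ we obtain
\begin{align*}
(T_v \star_v e_v \star_v T_v)(E_v) &= \mu_v[(T_v \star_v e_v) \otimes T_v]\,\Delta_v(E_v) \\
&= \mu_v[(T_v \star_v e_v) \otimes T_v]\,(J_v \otimes J_v E_v J_v + J_v E_v J_v \otimes K_v) \\
&= (T_v \star_v e_v)(J_v)\, T_v(J_v E_v J_v) + (T_v \star_v e_v)(E_v)\, T_v(K_v) \\
&= \mu_v[T_v \otimes e_v]\,(J_v \otimes J_v)\, T_v(J_v E_v J_v) + \mu_v[T_v \otimes e_v]\,(J_v \otimes J_v E_v J_v + J_v E_v J_v \otimes K_v)\, T_v(K_v) \\
&= T_v(J_v)\, e_v(J_v)\, T_v(J_v E_v J_v) + T_v(J_v)\, e_v(J_v E_v J_v)\, T_v(K_v) + T_v(J_v E_v J_v)\, e_v(K_v)\, T_v(K_v) \\
&= -J_v \cdot J_v \cdot J_v J_v E_v \overline{K}_v J_v + J_v \cdot J_v E_v J_v \cdot \overline{K}_v - J_v J_v E_v \overline{K}_v J_v \cdot J_v K_v J_v \cdot \overline{K}_v \\
&= -J_v E_v \overline{K}_v = T_v(E_v).
\end{align*}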

From (132)–(133) it follows that $vsl_q(2)$ is not a weak Hopf algebra in the sense of the definition in [18]. So we will call it a $J$-weak Hopf algebra and $T_v$ a $J$-weak antipode. As is seen from (99)–(100) and (132)–(133), the difference between them is in the exchange of $id$ with $e_v$.

Remark 10. The variable $e_v$ can be treated as an $n = 2$ example of the “tower identity” $e_{\alpha\beta}^{(n)}$ introduced for semisupermanifolds in [9, 10] or the “obstructor” $e_X^{(n)}$ for general mappings, categories and the Yang–Baxter equation in [6–8].

Comparing (68)–(71) with (103)–(108) we conclude that the connection of $\Delta_w, T_w, \varepsilon_w$ and $\Delta_v, T_v, \varepsilon_v$ can be written in the following way:
\[ \Delta_v(X) = \Delta_w(e_v(X)), \tag{134} \]
\[ T_v(X) = T_w(e_v(X)), \tag{135} \]
\[ \varepsilon_v(X) = \varepsilon_w(e_v(X)), \tag{136} \]
which means that, in addition to the partial algebra morphism (43), there exists a partial coalgebra morphism which is described by (134)–(136).


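6. Group-Like Elements

Now we discuss the set $G(wsl_q(2))$ of all group-like elements of $wsl_q(2)$. As is well known (see e.g. [14]), a semigroup $S$ is called an inverse semigroup if for every $x \in S$ there exists a unique $y \in S$ such that $xyx = x$ and $yxy = y$, and a monoid is a semigroup with identity. We will show the following

Proposition 12. The set of all group-like elements is $G(wsl_q(2)) = \{J_w^{(ij)} = K_w^i \overline{K}_w^j : i, j \text{ run over all non-negative integers}\}$, which forms a regular monoid under the multiplication of $wsl_q(2)$.

Proof. Suppose $x \in wsl_q(2)$ is a group-like element, i.e. $\Delta_w(x) = x \otimes x$. By Theorem 2, $x$ can be written as
\[ x = \sum_{i,j,l,m} \bigl(\alpha_{ijl} E_w^i F_w^j K_w^l + \beta_{ijm} E_w^i F_w^j \overline{K}_w^m + \gamma_{ij} E_w^i F_w^j J_w\bigr). \]
Here and in the sequel, every $\alpha$, $\beta$ and $\gamma$ with subscripts is in the field $k$ and does not equal zero. Then
\begin{align*}
\Delta_w(x) &= \sum_{i,j,l,m} \bigl[\alpha_{ijl}\,\Delta_w(E_w^i F_w^j K_w^l) + \Delta_w(\beta_{ijm} E_w^i F_w^j \overline{K}_w^m) + \Delta_w(\gamma_{ij} E_w^i F_w^j J_w)\bigr] \\
&= \sum_{i,j,l,m} \bigl[\alpha_{ijl}\,(1 \otimes E_w + E_w \otimes K_w)^i (F_w \otimes 1 + \overline{K}_w \otimes F_w)^j (K_w \otimes K_w)^l \\
&\qquad + \beta_{ijm}\,(1 \otimes E_w + E_w \otimes K_w)^i (F_w \otimes 1 + \overline{K}_w \otimes F_w)^j (\overline{K}_w \otimes \overline{K}_w)^m \\
&\qquad + \gamma_{ij}\,(1 \otimes E_w + E_w \otimes K_w)^i (F_w \otimes 1 + \overline{K}_w \otimes F_w)^j (J_w \otimes J_w)\bigr];
\end{align*}
and
\begin{align*}
x \otimes x &= \Bigl(\sum_{i,j,l,m} \alpha_{ijl} E_w^i F_w^j K_w^l + \beta_{ijm} E_w^i F_w^j \overline{K}_w^m + \gamma_{ij} E_w^i F_w^j J_w\Bigr) \\
&\quad \otimes \Bigl(\sum_{i,j,l,m} \alpha_{ijl} E_w^i F_w^j K_w^l + \beta_{ijm} E_w^i F_w^j \overline{K}_w^m + \gamma_{ij} E_w^i F_w^j J_w\Bigr).
\end{align*}
It is seen that if $i \ne 0$ or $j \ne 0$, then $\Delta_w(x)$ cannot equal $x \otimes x$. So $i = 0$ and $j = 0$. We get $x = \sum_{l,m} (\alpha_l K_w^l + \beta_m \overline{K}_w^m + J_w)$. Then
\[ \Delta_w(x) = \sum_{l,m} \bigl(\alpha_l K_w^l \otimes K_w^l + \beta_m \overline{K}_w^m \otimes \overline{K}_w^m + J_w \otimes J_w\bigr); \]
\begin{align*}
x \otimes x = \sum_{l,l',m,m'} \bigl(&\alpha_l \alpha_{l'} K_w^l \otimes K_w^{l'} + \alpha_l \beta_m K_w^l \otimes \overline{K}_w^m + \alpha_l K_w^l \otimes J_w \\
&+ \alpha_l \beta_m \overline{K}_w^m \otimes K_w^l + \beta_m \beta_{m'} \overline{K}_w^m \otimes \overline{K}_w^{m'} + \beta_m \overline{K}_w^m \otimes J_w \\
&+ \alpha_l J_w \otimes K_w^l + \beta_m J_w \otimes \overline{K}_w^m + J_w \otimes J_w\bigr).
\end{align*}
If there exist $l \ne l'$, then $x \otimes x$ possesses the monomial $K_w^l \otimes K_w^{l'}$, which does not appear in $\Delta_w(x)$. This contradicts $\Delta_w(x) = x \otimes x$. Hence we have only a unique $l$. Similarly, there exists a unique $m$. Thus $x = \alpha_l K_w^l + \beta_m \overline{K}_w^m + J_w$. Moreover, it is easy to see that $\alpha_l K_w^l$, $\beta_m \overline{K}_w^m$ and $J_w$ cannot appear simultaneously in the expression of $x$. Therefore, we conclude that $x = \alpha_l K_w^l$, $\beta_m \overline{K}_w^m$ or $J_w$ (no summation), and we have
\[ \Delta_w(J_w^{(ij)}) = J_w^{(ij)} \otimes J_w^{(ij)}. \tag{137} \]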


It follows that $G(wsl_q(2)) = \{J_w^{(ij)} = K_w^i \overline{K}_w^j : i, j \text{ run over all non-negative integers}\}$. For any $J_w^{(ij)} = K_w^i \overline{K}_w^j \in G(wsl_q(2))$, one can find $J_w^{(ji)} = K_w^j \overline{K}_w^i \in G(wsl_q(2))$ such that the regularity (18) takes place, $J_w^{(ij)} J_w^{(ji)} J_w^{(ij)} = J_w^{(ij)}$, which means that $G(wsl_q(2))$ forms a regular monoid under the multiplication of $wsl_q(2)$.

For $vsl_q(2)$ we have a similar statement.
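The regularity $J_w^{(ij)} J_w^{(ji)} J_w^{(ij)} = J_w^{(ij)}$ stated at the end of the proof of Proposition 12 can be made explicit. The sketch below assumes $J_w = K_w \overline{K}_w$, that $K_w$ and $\overline{K}_w$ commute, and the relations $K_w \overline{K}_w K_w = K_w$, $\overline{K}_w K_w \overline{K}_w = \overline{K}_w$ (so that $J_w^2 = J_w$, $J_w K_w = K_w$ and $J_w \overline{K}_w = \overline{K}_w$).

```latex
% Case i + j >= 1:
\begin{align*}
J_w^{(ij)} J_w^{(ji)}
  &= K_w^{i}\overline{K}_w^{j}\,K_w^{j}\overline{K}_w^{i}
   = (K_w \overline{K}_w)^{i+j} = J_w^{\,i+j} = J_w , \\
J_w^{(ij)} J_w^{(ji)} J_w^{(ij)}
  &= J_w\,K_w^{i}\overline{K}_w^{j}
   = K_w^{i}\overline{K}_w^{j} = J_w^{(ij)} .
\end{align*}
% Case i = j = 0: J_w^{(00)} = 1 and the identity 1 \cdot 1 \cdot 1 = 1 is trivial.
```

The same computation applies verbatim to $J_v^{(ij)}$ in $vsl_q(2)$ under the corresponding assumptions.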

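Proposition 13. The set of all group-like elements is $G(vsl_q(2)) = \{J_v^{(ij)} = K_v^i \overline{K}_v^j : i, j \text{ run over all non-negative integers}\}$, which forms a regular monoid under the multiplication of $vsl_q(2)$.

Proof. Suppose $x \in vsl_q(2)$ is a group-like element, i.e. $\Delta_v(x) = x \otimes x$. By Theorem 3, $x$ can be written as
\[ x = \sum_{i,j,l,m} \bigl(\alpha_{ijl} J_v E_v^i J_v F_v^j K_v^l + \beta_{ijm} J_v E_v^i J_v F_v^j \overline{K}_v^m + \gamma_{ij} J_v E_v^i J_v F_v^j J_v\bigr). \]
Here and in the sequel, every $\alpha$, $\beta$ and $\gamma$ with subscripts is in the field $k$ and does not equal zero. Then
\begin{align*}
\Delta_v(x) &= \sum_{i,j,l,m} \bigl[\alpha_{ijl}\,\Delta_v(J_v E_v^i J_v F_v^j K_v^l) + \Delta_v(\beta_{ijm} J_v E_v^i J_v F_v^j \overline{K}_v^m) + \Delta_v(\gamma_{ij} J_v E_v^i J_v F_v^j J_v)\bigr] \\
&= \sum_{i,j,l,m} \bigl[\alpha_{ijl}\,(J_v \otimes J_v)(J_v \otimes J_v E_v J_v + J_v E_v J_v \otimes K_v)^i \\
&\qquad\quad \times (J_v \otimes J_v)(J_v F_v J_v \otimes J_v + \overline{K}_v \otimes J_v F_v J_v)^j (K_v \otimes K_v)^l \\
&\qquad + \beta_{ijm}\,(J_v \otimes J_v)(J_v \otimes J_v E_v J_v + J_v E_v J_v \otimes K_v)^i \\
&\qquad\quad \times (J_v \otimes J_v)(J_v F_v J_v \otimes J_v + \overline{K}_v \otimes J_v F_v J_v)^j (\overline{K}_v \otimes \overline{K}_v)^m \\
&\qquad + \gamma_{ij}\,(J_v \otimes J_v)(J_v \otimes J_v E_v J_v + J_v E_v J_v \otimes K_v)^i \\
&\qquad\quad \times (J_v \otimes J_v)(J_v F_v J_v \otimes J_v + \overline{K}_v \otimes J_v F_v J_v)^j (J_v \otimes J_v)\bigr];
\end{align*}
and
\begin{align*}
x \otimes x &= \Bigl(\sum_{i,j,l,m} \alpha_{ijl} J_v E_v^i J_v F_v^j K_v^l + \beta_{ijm} J_v E_v^i J_v F_v^j \overline{K}_v^m + \gamma_{ij} J_v E_v^i J_v F_v^j J_v\Bigr) \\
&\quad \otimes \Bigl(\sum_{i,j,l,m} \alpha_{ijl} J_v E_v^i J_v F_v^j K_v^l + \beta_{ijm} J_v E_v^i J_v F_v^j \overline{K}_v^m + \gamma_{ij} J_v E_v^i J_v F_v^j J_v\Bigr).
\end{align*}
It is seen that if $i \ne 0$ or $j \ne 0$, then $\Delta_v(x)$ cannot equal $x \otimes x$. So $i = 0$ and $j = 0$. We get $x = \sum_{l,m} (\alpha_l K_v^l + \beta_m \overline{K}_v^m + J_v)$. Then
\[ \Delta_v(x) = \sum_{l,m} \bigl(\alpha_l K_v^l \otimes K_v^l + \beta_m \overline{K}_v^m \otimes \overline{K}_v^m + J_v \otimes J_v\bigr); \]
\begin{align*}
x \otimes x = \sum_{l,l',m,m'} \bigl(&\alpha_l \alpha_{l'} K_v^l \otimes K_v^{l'} + \alpha_l \beta_m K_v^l \otimes \overline{K}_v^m + \alpha_l K_v^l \otimes J_v \\
&+ \alpha_l \beta_m \overline{K}_v^m \otimes K_v^l + \beta_m \beta_{m'} \overline{K}_v^m \otimes \overline{K}_v^{m'} + \beta_m \overline{K}_v^m \otimes J_v \\
&+ \alpha_l J_v \otimes K_v^l + \beta_m J_v \otimes \overline{K}_v^m + J_v \otimes J_v\bigr).
\end{align*}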


If there exist $l \ne l'$, then $x \otimes x$ possesses the monomial $K_v^l \otimes K_v^{l'}$, which does not appear in $\Delta_v(x)$. This contradicts $\Delta_v(x) = x \otimes x$. Hence we have only a unique $l$. Similarly, there exists a unique $m$. Thus $x = \alpha_l K_v^l + \beta_m \overline{K}_v^m + J_v$. Moreover, it is easy to see that $\alpha_l K_v^l$, $\beta_m \overline{K}_v^m$ and $J_v$ cannot appear simultaneously in the expression of $x$. Therefore, we conclude that $x = \alpha_l K_v^l$, $\beta_m \overline{K}_v^m$ or $J_v$ (no summation), and we have
\[ \Delta_v(J_v^{(ij)}) = J_v^{(ij)} \otimes J_v^{(ij)}. \tag{138} \]
It follows that $G(vsl_q(2)) = \{J_v^{(ij)} = K_v^i \overline{K}_v^j : i, j \text{ run over all non-negative integers}\}$. For any $J_v^{(ij)} = K_v^i \overline{K}_v^j \in G(vsl_q(2))$, one can find $J_v^{(ji)} = K_v^j \overline{K}_v^i \in G(vsl_q(2))$ such that the regularity (18) takes place, $J_v^{(ij)} J_v^{(ji)} J_v^{(ij)} = J_v^{(ij)}$, which means that $G(vsl_q(2))$ forms a regular monoid under the multiplication of $vsl_q(2)$.

These results show that $wsl_q(2)$ and $vsl_q(2)$ are examples of a weak Hopf algebra whose monoid of all group-like elements is a regular monoid. This illustrates further the corresponding relationship between weak Hopf algebras and regular monoids [19].

7. Regular Quasi-R-Matrix

From Proposition 1 we have seen that $wsl_q(2)/(J_w - 1) = sl_q(2)$. Now we give another relationship between $wsl_q(2)$ and $sl_q(2)$ so as to construct a non-invertible universal $R^w$-matrix from $wsl_q(2)$.

Theorem 4. $wsl_q(2)$ possesses an ideal $W$ and a sub-algebra $Y$ satisfying $wsl_q(2) = Y \oplus W$ and $W \cong sl_q(2)$ as Hopf algebras.

Proof. Let $W$ be the linear sub-space generated by $\{E_w^i F_w^j K_w^l,\ E_w^i F_w^j \overline{K}_w^m,\ E_w^i F_w^j J_w : i \ge 0,\ j \ge 0,\ l > 0,\ m > 0\}$, and let $Y$ be the linear sub-space generated by $\{E_w^i F_w^j : i \ge 0,\ j \ge 0\}$. It is easy to see that $wsl_q(2) = Y \oplus W$; that $wsl_q(2)\, W\, wsl_q(2) \subseteq W$, so $W$ is an ideal; and that $Y$ is a sub-algebra of $wsl_q(2)$. Note that the identity of $W$ is $J_w$. Moreover, $W$ is a Hopf algebra with the unit $J_w$, the comultiplication $\Delta_w^W$ satisfying
\[ \Delta_w^W(E_w) = J_w \otimes E_w + E_w \otimes K_w, \tag{139} \]
\[ \Delta_w^W(F_w) = F_w \otimes J_w + \overline{K}_w \otimes F_w, \tag{140} \]
\[ \Delta_w^W(K_w) = K_w \otimes K_w, \qquad \Delta_w^W(\overline{K}_w) = \overline{K}_w \otimes \overline{K}_w, \tag{141} \]
and the same counit, multiplication and antipode as in $wsl_q(2)$. Let $\rho$ be the algebra morphism from $sl_q(2)$ to $W$ satisfying $\rho(E) = E_w$, $\rho(F) = F_w$, $\rho(K) = K_w$ and $\rho(K^{-1}) = \overline{K}_w$. Then $\rho$ is, in fact, a Hopf algebra isomorphism, since $\{E_w^i F_w^j K_w^l,\ E_w^i F_w^j \overline{K}_w^m,\ E_w^i F_w^j J_w : i \ge 0,\ j \ge 0,\ l > 0,\ m > 0\}$ is a basis of $W$ by Theorem 2.

Let us assume here that $q$ is a root of unity of order $d$ in the field $k$, where $d$ is an odd integer and $d > 1$. Set $I = (E_w^d, F_w^d, K_w^d - J_w)$, the two-sided ideal of $U_q^w$ generated by $E_w^d$, $F_w^d$, $K_w^d - J_w$. Define the algebra $\overline{U}_q^w = U_q^w / I$.


Remark 11. Note that $\overline{K}_w^d = J_w$ in $\overline{U}_q^w = U_q^w / I$ since $K_w^d = J_w$.

It is easy to prove that $I$ is also a coideal of $U_q^w$ and $T_w(I) \subseteq I$. Then $I$ is a weak Hopf ideal. It follows that $\overline{U}_q^w$ has a unique weak Hopf algebra structure such that the natural morphism is a weak Hopf algebra morphism, so the comultiplication, the counit and the weak antipode of $\overline{U}_q^w$ are determined by the same formulas as for $U_q^w$. We will show that $\overline{U}_q^w$ is a quasi-braided weak Hopf algebra. As a generalization of a braided bialgebra and R-matrix we have the following definitions [18].

Definition 4. Let there be $k$-linear maps $\mu : H \otimes H \to H$, $\eta : k \to H$, $\Delta : H \to H \otimes H$, $\varepsilon : H \to k$ in a $k$-linear space $H$ such that $(H, \mu, \eta)$ is a $k$-algebra and $(H, \Delta, \varepsilon)$ is a $k$-coalgebra. We call $H$ an almost bialgebra if $\Delta$ is a $k$-algebra morphism, i.e. $\Delta(xy) = \Delta(x)\Delta(y)$ for every $x, y \in H$.

Definition 5. An almost bialgebra $H = (H, \mu, \eta, \Delta, \varepsilon)$ is called quasi-braided if there exists an element $R$ of the algebra $H \otimes H$ satisfying
\[ \Delta^{op}(x) R = R \Delta(x) \tag{142} \]
for all $x \in H$ and
\[ (\Delta \otimes id_H)(R) = R_{13} R_{23}, \tag{143} \]
\[ (id_H \otimes \Delta)(R) = R_{13} R_{12}. \tag{144} \]

Such $R$ is called a quasi-R-matrix.

By Theorem 4, we have $\overline{U}_q^w = U_q^w / I = Y/I \oplus W/I \cong Y/(E_w^d, F_w^d) \oplus \overline{U}_q$, where $\overline{U}_q = sl_q(2)/(E^d, F^d, K^d - 1)$ is a finite Hopf algebra. We know from [16] that the sub-algebra $\overline{B}_q$ of $\overline{U}_q$ generated by $\{E^m K^n : 0 \le m, n \le d - 1\}$ is a finite dimensional Hopf sub-algebra and that $\overline{U}_q$ is a braided Hopf algebra as a quotient of the quantum double of $\overline{B}_q$. The R-matrix of $\overline{U}_q$ is
\[ \overline{R} = \frac{1}{d} \sum_{0 \le i,j,k \le d-1} \frac{(q - q^{-1})^k}{[k]!}\, q^{k(k-1)/2 + 2k(i-j) - 2ij}\, E^k K^i \otimes F^k K^j. \]
Since $sl_q(2) \cong W$ as Hopf algebras under $\rho$, and $(E^d, F^d, K^d - 1) \cong I$ under $\rho$, we get $\overline{U}_q \cong W/I$ as Hopf algebras under the induced morphism of $\rho$. Then $W/I$ is a braided Hopf algebra with an R-matrix
\[ R^w = \frac{1}{d} \sum_{0 \le k \le d-1;\ 1 \le i,j \le d} \frac{(q - q^{-1})^k}{[k]!}\, q^{k(k-1)/2 + 2k(i-j) - 2ij}\, E_w^k K_w^i \otimes F_w^k K_w^j. \]
Because the identity of $W/I$ is $J_w$, there exists the inverse $\hat{R}^w$ of $R^w$ such that $\hat{R}^w R^w = R^w \hat{R}^w = J_w$. Then we have
\[ R^w \hat{R}^w R^w = R^w, \tag{145} \]
\[ \hat{R}^w R^w \hat{R}^w = \hat{R}^w, \tag{146} \]


which shows that this R-matrix is regular in $\overline{U}_q^w$. It obeys the following relations:
\[ \Delta_w^{op}(x) R^w = R^w \Delta_w(x) \tag{147} \]
for any $x \in W/I$ and
\[ (\Delta_w \otimes id)(R^w) = R^w_{13} R^w_{23}, \tag{148} \]
\[ (id \otimes \Delta_w)(R^w) = R^w_{13} R^w_{12}, \tag{149} \]
which are also satisfied in $\overline{U}_q^w$. Therefore $R^w$ is a von Neumann regular quasi-R-matrix of $\overline{U}_q^w$. So we get the following

Theorem 5. $\overline{U}_q^w$ is a quasi-braided weak Hopf algebra with
\[ R^w = \frac{1}{d} \sum_{0 \le k \le d-1;\ 1 \le i,j \le d} \frac{(q - q^{-1})^k}{[k]!}\, q^{k(k-1)/2 + 2k(i-j) - 2ij}\, E_w^k K_w^i \otimes F_w^k K_w^j \]
as its quasi-R-matrix, which is regular.

The quasi-R-matrix from the $J$-weak Hopf algebra $vsl_q(2)$ has a more complicated structure and will be considered elsewhere.

8. Discussion

In conclusion we would like to compare the presented generalization of the Hopf algebra with the existing ones. A weak Hopf algebra in the sense of [4, 30, 26] is a $k$-linear vector space $H$ that is both an associative algebra $(H, \mu, \eta)$ and a coassociative coalgebra $(H, \Delta_{weak}, \varepsilon_{weak})$ related to each other in a certain self-dual way [3, 26] and that possesses an antipode $S_{weak}$ satisfying (in Sweedler notation [29])
\[ S_{weak}(x_{(1)})\, x_{(2)} = 1_{(1)}\, \varepsilon_{weak}(x 1_{(2)}), \tag{150} \]
\[ x_{(1)}\, S_{weak}(x_{(2)}) = \varepsilon_{weak}(x 1_{(1)})\, 1_{(2)} \tag{151} \]
(pre-antipode), and if in addition
\[ S_{weak}(x_{(1)})\, x_{(2)}\, S_{weak}(x_{(3)}) = S_{weak}(x), \tag{152} \]
then $S_{weak}$ can be called a Nill's antipode. Weak Hopf algebras have “weaker” axioms related to the unit and counit: $\varepsilon_{weak}(xyz) = \varepsilon_{weak}(x y_{(1)})\, \varepsilon_{weak}(y_{(2)} z)$ and $\Delta_{weak}^{(2)}(1) = (\Delta_{weak}(1) \otimes 1)(1 \otimes \Delta_{weak}(1))$. So the comultiplication is non-unital, $\Delta_{weak}(1) \ne 1 \otimes 1$ (as in weak quasi Hopf algebras [23]), and the counit is only “weakly” multiplicative, $\varepsilon(xy) = \varepsilon(x 1_{(1)})\, \varepsilon(1_{(2)} y)$. Therefore they can be called non-unital weak Hopf algebras. Note that this kind of “weakness” is the “strength” of weak Hopf algebras [3], because it allows (even in the finite dimensional and semisimple cases) the weak Hopf algebra to possess non-integral (quantum) dimensions. The earlier proposals of face algebras [13], quantum groupoids [25] and the (finite dimensional) generalized Kac algebras [31] are weak Hopf algebras in this sense [26], not the most general ones, but having an involutive antipode.

The weak antipode $T$ introduced in [18] and in this paper ($T_w$ and $T_v$) is not usually a pre-antipode in the sense of (150)–(151). Therefore the class of non-unital Hopf


algebras [26, 3] (or quantum groupoids [25]) and the class of weak Hopf algebras [18, 20, 5] are not included in each other. In fact, we have the following relation:
\[ \begin{array}{ccccc} A & \longrightarrow & B & \longrightarrow & C \\ \downarrow & & & & \downarrow \\ D & & \longrightarrow & & E \end{array} \]
where A denotes a Hopf algebra, B a non-unital weak Hopf algebra, C a non-unital almost weak Hopf algebra, D a weak Hopf algebra and E an almost weak Hopf algebra. From this, we see easily that just Hopf algebras compose their common subclass.

Nill [26] points out that these algebras have many examples in the theory of quantum chain models. By contrast, our examples come from regular monoid algebras [18–20] and also from this paper, i.e. $wsl_q(2)$, $vsl_q(2)$, etc.

Note that although the weak Hopf algebras in this paper and the non-unital weak Hopf algebras introduced earlier do not usually include each other, their antipodes are defined by a similar method, that is, by using the regularity of antipodes in the involution algebra of the original algebras. Therefore, we believe that it is possible to characterize certain aspects in similar ways. A further interesting line of work, which we want to continue, is to study our weak Hopf algebras through similar objects and methods as for the non-unital weak Hopf algebras and, moreover, to find applications in the theory of quantum chain models and other related areas.

Acknowledgements. F.L. thanks M. L. Ge, P. Trotter and N. H. Xi for kind help and fruitful discussions during his visits. S.D. is thankful to A. Kelarev, V. Lyubashenko, W. Marcinek and B. Schein for useful remarks. S.D. is grateful to the Zhejiang University for kind hospitality and the National Natural Science Foundation of China for financial support.

References

1. Abe, E.: Hopf Algebras. Cambridge: Cambridge Univ. Press, 1980
2. Bernstein, J. and Khovanova, T.: On quantum group SLq(2). Preprint MIT, hep-th/9412056, Cambridge, 1994
3. Böhm, G., Nill, F., and Szlachányi, K.: Weak Hopf algebras I. Integral theory and C∗-structure. J. Algebra 221, 385–438 (1999)
4. Böhm, G. and Szlachányi, K.: A coassociative C∗-quantum group with nonintegral dimensions. Lett. Math. Phys. 35, 437–456 (1996)
5. Duplij, S. and Li, F.: On regular solutions of quantum Yang–Baxter equation and weak Hopf algebras. J. Kharkov National University, ser. Nuclei, Particles and Fields 521, 15–30 (2001)
6. Duplij, S. and Marcinek, W.: Higher regularity properties of mappings and morphisms. Preprint Univ. Wrocław, IFT UWr 931/00, math-ph/0005033, Wrocław, 2000
7. Duplij, S. and Marcinek, W.: On higher regularity and monoidal categories. Kharkov State University Journal (Vestnik KSU), ser. Nuclei, Particles and Fields 481, 27–30 (2000)
8. Duplij, S. and Marcinek, W.: Noninvertibility, semisupermanifolds and categories regularization. In: Noncommutative Structures in Mathematics and Physics (Duplij S. and Wess J., eds.). Dordrecht: Kluwer, 2001, pp. 125–140
9. Duplij, S.: On semi-supermanifolds. Pure Math. Appl. 9, 283–310 (1998)
10. Duplij, S.: Semisupermanifolds and semigroups. Kharkov: Krok, 2000
11. Goodearl, K.: Von Neumann Regular Rings. London: Pitman, 1979
12. Green, J.A., Nichols, W.D., and Taft, E.J.: Left Hopf algebras. J. Algebra 65, 399–411 (1980)
13. Hayashi, T.: An algebra related to the fusion rules of Wess–Zumino–Witten models. Lett. Math. Phys. 22, 291–296 (1991)
14. Howie, J.M.: Fundamentals of Semigroup Theory. Oxford: Clarendon Press, 1995
15. Hungerford, T.W.: Algebra. New York: Springer-Verlag, 1980


16. Kassel, C.: Quantum Groups. New York: Springer-Verlag, 1995
17. Lawson, M.V.: Inverse Semigroups: The Theory of Partial Symmetries. Singapore: World Sci., 1998
18. Li, F.: Weak Hopf algebras and some new solutions of Yang–Baxter equation. J. Algebra 208, 72–100 (1998)
19. Li, F.: Weak Hopf algebras and regular monoids. J. Math. Research and Exposition 19, 325–331 (1999)
20. Li, F.: Solutions of Yang–Baxter equation in endomorphism semigroup and quasi-(co)braided almost bialgebras. Comm. Algebra 28, 2253–2270 (2000)
21. Li, F.: Weaker structures of Hopf algebras and singular solutions of Yang–Baxter equation. In: Symposium on The Frontiers of Physics at Millennium. Singapore: World Scientific Publishing Co. Pte. Ltd, 2001
22. Lusztig, G.: On quantum groups. J. Algebra 131, 466–475 (1990)
23. Mack, G. and Schomerus, V.: Quasi Hopf quantum symmetry in quantum theory. Nucl. Phys. B370, 185–191 (1992)
24. Nichols, W.D. and Taft, E.J.: The Left Antipodes of a Left Hopf Algebra. Contemp. Math. 13. Providence: Amer. Math. Soc., 1982
25. Nikshych, D. and Vainerman, L.: Finite quantum groupoids and their applications. Preprint Univ. California, math.QA/0006057, Los Angeles, 2000
26. Nill, F.: Axioms for weak bialgebras. Preprint Inst. Theor. Phys. FU, math.QA/9805104, Berlin, 1998
27. Petrich, M.: Inverse Semigroups. New York: Wiley, 1984
28. Shnider, S. and Sternberg, S.: Quantum Groups. Boston: International Press, 1993
29. Sweedler, M.E.: Hopf Algebras. New York: Benjamin, 1969
30. Szlachányi, K.: Weak Hopf algebras. In: Operator algebras and quantum field theory (Rome, 1996). Cambridge, MA: Internat. Press, 1997, pp. 621–632
31. Yamanouchi, T.: Duality for generalized Kac algebras and a characterization of finite groupoid algebras. J. Algebra 163, 9–50 (1994)

Communicated by A. Connes

Commun. Math. Phys. 225, 219 – 221 (2002)


Erratum

Ground State Energy of the One-Component Charged Bose Gas Elliott H. Lieb1 , Jan Philip Solovej2 1 Department of Physics, Jadwin Hall, P. O. Box 708, Princeton University, Princeton, NJ 08544, USA.

E-mail: [email protected]

2 Department of Mathematics, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen,

Denmark. E-mail: [email protected] Received: 17 September 2001 / Accepted: 5 November 2001 Commun. Math. Phys. 217, 127–163 (2001)

The proof of Lemma B.1 of [1] contains an unjustified operator inequality. In the last estimate on p. 162 the Cauchy–Schwarz inequality was used incorrectly. The lemma is however still correct as stated. We shall show this below. The operator inequality to be proven is that

\[
(\nabla\theta)^2\,\frac{-\Delta_N}{-\Delta_N+s^{-2}} + \frac{-\Delta_N}{-\Delta_N+s^{-2}}\,(\nabla\theta)^2 \;\le\; Ct^{-2}\,\frac{-\Delta_N}{-\Delta_N+s^{-2}} + Cs^{2}t^{-4}, \tag{1}
\]

where $-\Delta_N$ is the Neumann Laplacian of some bounded open set $O \subset \mathbb{R}^n$, $s > 0$, and $\theta \in C^\infty(O)$ is constant near the boundary of $O$ and satisfies the estimates $\|\partial^\alpha\theta\|_\infty \le Ct^{-|\alpha|}$, for some $t > 0$ and all multi-indices $\alpha$ with $|\alpha| \le 3$. The proof of (1) is a little technical. For the application in the paper the following estimate, in which the Cauchy–Schwarz inequality has been used correctly, would have sufficed:

\[
\begin{aligned}
(\nabla\theta)^2\,\frac{-\Delta_N}{-\Delta_N+s^{-2}} + \frac{-\Delta_N}{-\Delta_N+s^{-2}}\,(\nabla\theta)^2
&\le st\,(\nabla\theta)^4 + (st)^{-1}\left(\frac{-\Delta_N}{-\Delta_N+s^{-2}}\right)^{2} \\
&\le Cst^{-3} + C(st)^{-1}\,\frac{-\Delta_N}{-\Delta_N+s^{-2}}.
\end{aligned}
\]
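The weighted Cauchy–Schwarz step used here, $AB + BA \le \epsilon A^2 + \epsilon^{-1}B^2$ for self-adjoint $A$, $B$ and any $\epsilon > 0$, can be checked numerically on finite matrices. The sketch below is only an illustration under stated assumptions: the random symmetric matrix `A` stands in for multiplication by $(\nabla\theta)^2$, the positive semidefinite matrix `lap` stands in for $-\Delta_N$, and the names `cauchy_schwarz_gap`, `lap`, `gaps` and `b_minus_b2` are hypothetical and do not come from the paper. In the estimate above the weight is $\epsilon = st$.

```python
import numpy as np

def cauchy_schwarz_gap(A, B, eps):
    """Smallest eigenvalue of eps*A^2 + B^2/eps - (A B + B A) for symmetric A, B.

    The matrix equals (sqrt(eps)*A - B/sqrt(eps))^2, so it is positive semidefinite.
    """
    G = eps * (A @ A) + (B @ B) / eps - (A @ B + B @ A)
    return float(np.linalg.eigvalsh((G + G.T) / 2).min())

rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                      # stand-in for multiplication by (grad theta)^2
K = rng.standard_normal((n, n))
lap = K @ K.T                          # positive semidefinite stand-in for -Delta_N
s = 0.7
B = lap @ np.linalg.inv(lap + np.eye(n) / s**2)   # -Delta_N / (-Delta_N + s^-2)
B = (B + B.T) / 2                      # remove rounding asymmetry

# The gap should be non-negative (up to rounding) for every weight eps.
gaps = [cauchy_schwarz_gap(A, B, eps) for eps in (0.1, 1.0, 10.0)]
# Since 0 <= B <= 1, also B^2 <= B, which is used in the second inequality.
b_minus_b2 = float(np.linalg.eigvalsh(B - B @ B).min())
print(gaps, b_minus_b2)
```

The second quantity illustrates why the squared fraction can be replaced by the fraction itself: its spectrum lies in $[0, 1)$.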

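© 2001 by the authors. This article may be reproduced in its entirety for non-commercial purposes.

In order to prove (1) we shall use the two operator inequalities

\[
[-\Delta_N, f]\,[f, -\Delta_N] \;\le\; C\|\nabla f\|_\infty^{2}\,(-\Delta_N) + C\Big(\sum_i \|\partial_i^{2} f\|_\infty\Big)^{2} \tag{2}
\]

and

\[
\begin{aligned}
f(-\Delta_N)f &= -\sum_i \partial_i f^{2}\partial_i + \sum_i [\partial_i f, f\partial_i]
\;\le\; -C\sum_i \partial_i f^{2}\partial_i + C\sum_i (\partial_i f)^{2} \\
&\le C\|f\|_\infty^{2}\,(-\Delta_N) + C\|\nabla f\|_\infty^{2},
\end{aligned} \tag{3}
\]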

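where $f$ is a smooth function with compact support in $O$, which we identify as a multiplication operator. We begin by rewriting the left side of (1):

\[
\begin{aligned}
&(\nabla\theta)^2\,\frac{-\Delta_N}{-\Delta_N+s^{-2}} + \frac{-\Delta_N}{-\Delta_N+s^{-2}}\,(\nabla\theta)^2 \\
&\quad= \int_0^\infty \left[(\nabla\theta)^2\,\frac{-\Delta_N}{(-\Delta_N+s^{-2}+u)^2} + \frac{-\Delta_N}{(-\Delta_N+s^{-2}+u)^2}\,(\nabla\theta)^2\right] du \\
&\quad= \int_0^\infty \left[\frac{1}{-\Delta_N+s^{-2}+u}\,(\nabla\theta)^2\,\frac{-\Delta_N}{-\Delta_N+s^{-2}+u} + \frac{-\Delta_N}{-\Delta_N+s^{-2}+u}\,(\nabla\theta)^2\,\frac{1}{-\Delta_N+s^{-2}+u}\right] du \\
&\qquad+ \int_0^\infty \left[\frac{1}{-\Delta_N+s^{-2}+u}\,\big[-\Delta_N, (\nabla\theta)^2\big]\,\frac{-\Delta_N}{(-\Delta_N+s^{-2}+u)^2} + \frac{-\Delta_N}{(-\Delta_N+s^{-2}+u)^2}\,\big[(\nabla\theta)^2, -\Delta_N\big]\,\frac{1}{-\Delta_N+s^{-2}+u}\right] du.
\end{aligned} \tag{4}
\]

The first integral we estimate using a Cauchy–Schwarz inequality:

\[
\begin{aligned}
&\int_0^\infty \left[\frac{1}{-\Delta_N+s^{-2}+u}\,(\nabla\theta)^2\,\frac{-\Delta_N}{-\Delta_N+s^{-2}+u} + \frac{-\Delta_N}{-\Delta_N+s^{-2}+u}\,(\nabla\theta)^2\,\frac{1}{-\Delta_N+s^{-2}+u}\right] du \\
&\quad\le t^{2}\int_0^\infty \frac{1}{-\Delta_N+s^{-2}+u}\,(\nabla\theta)^2(-\Delta_N)(\nabla\theta)^2\,\frac{1}{-\Delta_N+s^{-2}+u}\,du + t^{-2}\int_0^\infty \frac{-\Delta_N}{(-\Delta_N+s^{-2}+u)^2}\,du \\
&\quad\le Ct^{-2}\,\frac{-\Delta_N}{-\Delta_N+s^{-2}} + Cs^{2}t^{-4},
\end{aligned}
\]

where in the last estimate we have used (3) with $f = (\nabla\theta)^2$.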


The final step in the proof of (1) is to estimate the last integral in (4) using a Cauchy–Schwarz inequality, this time together with (2) with $f = (\nabla\theta)^2$:

\[
\begin{aligned}
&\int_0^\infty \left[\frac{1}{-\Delta_N+s^{-2}+u}\,\big[-\Delta_N, (\nabla\theta)^2\big]\,\frac{-\Delta_N}{(-\Delta_N+s^{-2}+u)^2} + \frac{-\Delta_N}{(-\Delta_N+s^{-2}+u)^2}\,\big[(\nabla\theta)^2, -\Delta_N\big]\,\frac{1}{-\Delta_N+s^{-2}+u}\right] du \\
&\quad\le t^{4}\int_0^\infty \frac{1}{-\Delta_N+s^{-2}+u}\,\big[-\Delta_N, (\nabla\theta)^2\big]\big[(\nabla\theta)^2, -\Delta_N\big]\,\frac{1}{-\Delta_N+s^{-2}+u}\,du + t^{-4}\int_0^\infty \frac{(-\Delta_N)^2}{(-\Delta_N+s^{-2}+u)^4}\,du \\
&\quad\le t^{4}\int_0^\infty \frac{1}{-\Delta_N+s^{-2}+u}\,\big(Ct^{-6}(-\Delta_N) + Ct^{-8}\big)\,\frac{1}{-\Delta_N+s^{-2}+u}\,du + s^{2}t^{-4} \\
&\quad\le Ct^{-2}\,\frac{-\Delta_N}{-\Delta_N+s^{-2}} + Cs^{2}t^{-4}.
\end{aligned}
\]

This proves (1).

Another correction concerns the assumption made in Corollary 6.5 and on p. 152 that $\rho^{1/4}R$ is large. It is not necessary to assume this. Indeed the integrand of the $I$ integral in Lemma 6.11 is monotone increasing in $g$ and, therefore, $g$ may be replaced by $4\pi|k|^{-2}$, and the resulting integral is finite. On the top of p. 154 we make a choice of $R$ that violates the assumption that $\rho^{1/4}R$ is large but, as we have just seen, this assumption is not needed.

References

1. Lieb, E.H. and Solovej, J.P.: Ground State Energy of the One-Component Charged Bose Gas. Commun. Math. Phys. 217, 127–163 (2001)

Communicated by M. Aizenman

Commun. Math. Phys. 225, 223 – 274 (2002)


© Springer-Verlag 2002

On the Point-Particle (Newtonian) Limit of the Non-Linear Hartree Equation Jürg Fröhlich1 , Tai-Peng Tsai2 , Horng-Tzer Yau2,∗ 1 Theoretical Physics, ETH-Hönggerberg, 8093 Zürich, Switzerland. E-mail: [email protected] 2 Courant Institute, New York University, New York, NY 10012, USA.

E-mail: [email protected]; [email protected] Received: 30 June 2000 / Accepted: 25 June 2001

Abstract: We consider the nonlinear Hartree equation describing the dynamics of weakly interacting non-relativistic Bosons. We show that a nonlinear Møller wave operator describing the scattering of a soliton and a wave can be defined. We also consider the dynamics of a soliton in a slowly varying background potential $W(\varepsilon x)$. We prove that the soliton decomposes into a soliton plus a scattering wave (radiation) up to times of order $\varepsilon^{-1}$. To leading order, the center of the soliton follows the trajectory of a classical particle in the potential $W(\varepsilon x)$.

1. Introduction and Summary of Main Results

The problem of identifying classical regimes of quantum mechanics is a long standing problem of quantum theory. For simple systems it was first studied by Schrödinger in 1926; see [1]. In this paper, we explore a classical regime for a class of systems of identical, non-relativistic bosons, e.g., bosonic atoms such as $^7$Li, with very weak two-body interactions described by a potential $-\kappa\Phi$ of van der Waals or Newtonian type satisfying certain regularity properties described below. These bosons move under the influence of an external potential $\lambda V$, where $V$ is a smooth, positive function on physical space $\mathbb{R}^3$ and $\lambda \ge 0$. The potential $\lambda V$ describes e.g. a trap confining the bosons. Let $\kappa$ denote the strength of the two-body interaction between two bosons as compared to their average kinetic energy (e.g. in the sense that $\kappa\Phi$ is small as compared to the kinetic energy operator of two bosons, in the sense of Kato and Rellich, [2]). We are interested in understanding the dynamics of a “condensate” of $N = O(\kappa^{-1})$ bosons in the “mean-field regime”, where $\kappa$ is very small. By a “condensate” we mean a state of the system with the property that all except for $o(N)$ bosons are in the same one-particle state described by a wave function $\psi(x)$, $x \in \mathbb{R}^3$. $N$-particle states of this kind are also called coherent states.

∗ Work partially supported by National Science Foundation Grant no. DMS-9703752 and DMS-0072098


Let $\psi_0 = \psi_0(x)$, $x \in \mathbb{R}^3$, denote the initial one-particle wave function of a coherent state of the system at time $t = 0$. In the mean-field limit,

\[
\kappa \to 0, \quad N \to \infty, \quad \text{with } \kappa \cdot N =: \nu = \text{const.}, \tag{1.1}
\]

the quantum-mechanical time evolution of a condensate of bosons has the property that it maps the initial coherent state with a one-particle wave function $\psi_0$ to a coherent state at a later time $t$ with a one-particle wave function $\psi_t$. As proven by K. Hepp [3] (see also [4] for some refinements and extensions), the one-particle wave function $\psi_t$ of the condensate turns out to be a solution of the (non-linear) Hartree equation, Eq. (1.5) below. If the two-body interactions are dominantly attractive, as for $^7$Li atoms, and, given $\kappa$, the number of bosons is large enough (i.e., $N > N_{\mathrm{crit.}}(\kappa)$, or $\nu > \nu_{\mathrm{crit.}}$), the system has bound states. In other words, the bosons may condense into a tightly bound, spatially sharply localized cluster. In the mean-field regime, such bound states appear to be (weakly) well approximated by coherent states with a one-particle wave function corresponding to a non-trivial local minimum of the Hartree energy functional. Turning on a very slowly varying external potential,

\[
\lambda V(x) := W(\varepsilon x), \tag{1.2}
\]

where $W$ is a smooth, positive function, and $\varepsilon$ is much smaller than the diameter of a bound state of $N$ bosons when $\lambda = 0$, one expects that the position, $r(t) \in \mathbb{R}^3$, of the center of mass of that bound state closely follows a solution of Newton’s equations of motion,

\[
\dot r(t) = v(t), \qquad \dot v(t) = -\varepsilon\,(\nabla W)(\varepsilon r(t)), \tag{1.3}
\]

for times $t$ with $|t| < O(\varepsilon^{-1})$. It is in this precise sense that the quantum system of bosons described above approaches a classical regime in the mean-field limit.

For attractive two-body interactions, the Hartree equation describing the dynamics of a condensate (coherent state) in the mean-field limit has a self-focussing non-linearity. As a consequence, it has non-trivial “solitary wave solutions” looking like approximate δ-functions, for $\nu$ sufficiently large. These solitary wave solutions are precisely the one-particle wave functions of coherent bound states in the mean-field limit.

Our main objective in this paper is to study slow motion of solitons of the Hartree equation. We propose to show that, under the influence of a slowly varying external potential $W(\varepsilon x)$, the center of mass position, $r(t)$, of a solitary-wave solution of the self-focussing Hartree equation remains close to a solution of Newton’s equations of motion stated above, for all times $t$ with $|t| < O(\varepsilon^{-1})$. (We do not, however, prove rigorous results on the precise way in which a system of identical bosons approaches its mean-field limit; but see [3–5].) Our main results on the self-focussing Hartree equations have been announced in [6], where the reader can find additional background material and motivation coming from physics.

In order to be able to describe our main results concisely, we introduce some notation and recall some known results on the Hartree equation. Let $H^1(\mathbb{R}^n)$ denote the Sobolev space,

\[
H^1(\mathbb{R}^n) = \big\{\psi(x),\ x \in \mathbb{R}^n \;\big|\; \|\nabla\psi\|_2 + \|\psi\|_2 < \infty\big\}, \tag{1.4}
\]


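where $\psi$ denotes a measurable complex function on $\mathbb{R}^n$, $\nabla\psi$ denotes its gradient, and $\|(\cdot)\|_2$ denotes the $L^2$-norm. We study properties of solutions of the Hartree equation

\[
i\partial_t\psi_t = -\tfrac{1}{2}\Delta\psi_t + \lambda V\psi_t - \nu\big(\Phi * |\psi_t|^2\big)\psi_t. \tag{1.5}
\]

In Eq. (1.5), $\psi_t(x) = \psi(x,t)$, $x \in \mathbb{R}^n$, $t \in \mathbb{R}$, is a time ($t$)-dependent, complex-valued scalar function on physical space $\mathbb{R}^n$ belonging to the Sobolev space $H^1(\mathbb{R}^n)$, for each time $t$; $\Delta$ denotes the scalar Laplacian, $\lambda V(x)$, $\lambda \in \mathbb{R}$, is an external potential, with $V$ a smooth, bounded, positive function on $\mathbb{R}^n$, and $-\Phi(x)$ is a radially symmetric two-body potential, with $\Phi \in L^p(\mathbb{R}^n, d^nx) + L^\infty(\mathbb{R}^n)$, $p \ge \frac{n}{2}$; furthermore $*$ denotes convolution. We shall use the following standard notation: for an arbitrary measurable function $\psi$ on $\mathbb{R}^n$,

\[
\int\psi := \int_{\mathbb{R}^n}\psi(x)\,d^nx, \tag{1.6}
\]

\[
\|\psi\|_p := \Big(\int|\psi|^p\Big)^{1/p} \tag{1.7}
\]

is the norm on the space $L^p = L^p(\mathbb{R}^n, d^nx)$, $1 \le p < \infty$,

\[
\|\psi\|_{H^1} := \|\nabla\psi\|_2 + \|\psi\|_2 \tag{1.8}
\]

is the norm on $H^1 = H^1(\mathbb{R}^n)$, and

\[
(\psi * \chi)(x) := \int_{\mathbb{R}^n}\psi(x-y)\chi(y)\,d^ny \tag{1.9}
\]

denotes the convolution of $\psi$ with another such function $\chi$.

There are two important functionals on Sobolev space $H^1$ which are conserved under the flow $\psi := \psi_0 \to \psi_t$, $\psi \in H^1$, determined by the Hartree equation (1.5). The first one is the squared $L^2$-norm of $\psi$,

\[
\mathcal{N}(\bar\psi,\psi) := \int|\psi|^2 = \|\psi\|_2^2, \tag{1.10}
\]

and the second one is the Hamilton (or energy) functional

\[
\mathcal{H}(\bar\psi,\psi) := \frac{1}{4}\int|\nabla\psi|^2 + \frac{\lambda}{2}\int V|\psi|^2 - \frac{1}{4}\int\big(\Phi * |\psi|^2\big)|\psi|^2. \tag{1.11}
\]

We note that if $\Phi$ is a non-negative function belonging to $L^p + L^\infty$, $p \ge \frac{n}{2}$, then, for an arbitrary $\delta > 0$, there exists a finite constant $C(\delta)$ such that

\[
0 \le \int\big(\Phi * |\psi|^2\big)|\psi|^2 \le \delta\,\mathcal{N}(\bar\psi,\psi)\,\|\nabla\psi\|_2^2 + C(\delta)\,\mathcal{N}(\bar\psi,\psi)^2, \tag{1.12}
\]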


see e.g. [7]. Thus, for an arbitrary, but fixed value of $\mathcal{N}(\bar\psi,\psi)$, and for arbitrary $\lambda$, $|\lambda| < \infty$, the Hamilton functional $\mathcal{H}(\bar\psi,\psi)$ is bounded from below.

Under the assumptions that $\lambda V(x)$ has a minimum at $x = x_*$, $|x_*| < \infty$, that $\Phi(x) \ge 0$ and that the value, $N$, of the functional $\mathcal{N}(\bar\psi,\psi)$ is large enough, one can show (see Sect. 3) that the Hamilton functional $\mathcal{H}(\bar\psi,\psi)$ restricted to the sphere

\[
S_N := \big\{\psi \;\big|\; \psi \in H^1,\ \mathcal{N}(\bar\psi,\psi) = N\big\} \tag{1.13}
\]

in Sobolev space reaches its minimum on a positive function $Q_N \in S_N$ concentrated near $x_*$ and decaying exponentially fast in $|x|$, as $|x| \to \infty$. This result still holds when $\lambda = 0$ (i.e., for a vanishing external potential); but if $Q_N$ is a minimizer of $\mathcal{H}|_{S_N}$ then so is $Q_{N,a}$, where $Q_{N,a}(x) := Q_N(x - a)$, for arbitrary $a \in \mathbb{R}^n$. This is a consequence of the translation invariance of $\mathcal{H}$, for $\lambda = 0$. A minimizer, $Q_N$, of $\mathcal{H}|_{S_N}$ is a solution of the non-linear eigenvalue equation

\[
-\tfrac{1}{2}\Delta Q + \lambda VQ - \big(\Phi * Q^2\big)Q = EQ, \tag{1.14}
\]

for some real number $E$, with $\mathcal{N}(\bar Q, Q) = N$.

Then $\psi(x,t) = Q_N(x)e^{-iEt}$ is a stationary solution of the Hartree equation (1.5). Multiplying Eq. (1.14) by $Q := Q_N$ and integrating, we find that

\[
E = \frac{1}{2N}\int(\nabla Q_N)^2 + \frac{\lambda}{N}\int VQ_N^2 - \frac{1}{N}\int\big(\Phi * Q_N^2\big)Q_N^2. \tag{1.15}
\]

One should notice that $E_N$ is not the value of the energy functional $\mathcal{H}(\bar\psi,\psi)|_{S_N}$ evaluated on the minimizer $\psi = Q_N$, because one is minimizing $\mathcal{H}(\bar\psi,\psi)$ in the presence of a constraint, namely $\mathcal{N}(\bar\psi,\psi) = N$.

Let $Q_N^{(0)}$ be a minimizer of the Hamilton functional $\mathcal{H}(\bar\psi,\psi)|_{S_N}$, with $\lambda = 0$, centered at $x = 0$ ($Q_N^{(0)}$ is known to exist and to be non-trivial, for $N$ large enough). We set $\lambda = 1$ and choose

\[
V(x) \equiv V^{(\varepsilon)}(x) := W(\varepsilon x), \tag{1.16}
\]

where $W$ is a fixed, smooth, bounded, positive function on $\mathbb{R}^n$, and $\varepsilon > 0$ is a parameter. Our main concern, in this paper, is to construct local (in time $t$) solutions of the Hartree equation (1.5), with $\lambda = 1$ and $V = V^{(\varepsilon)}$ as in (1.16), of the form

\[
\psi(x,t) = \Big[Q_N^{(0)}(x - r(t)) + h_\varepsilon(x - r(t), t)\Big]e^{i\theta(x,t)}, \tag{1.17}
\]

where $h_\varepsilon$ is a small, dispersive correction to the solitary wave described by $Q_N^{(0)}(x - r(t))e^{i\theta(x,t)}$, with

\[
\|h_\varepsilon(\cdot,t)\|_{H^1} \le O(\varepsilon^{3/2}), \tag{1.18}
\]


$\theta(x,t)$ is a time-dependent phase,

\[
\theta(x,t) = v(t)\cdot x - Et + \vartheta_0(t), \tag{1.19}
\]

where $v(t) = \frac{dr(t)}{dt}$ is the velocity of the solitary wave, and $\vartheta_0(t)$ is independent of $x$, for all times $t$ with $|t| \le O(\varepsilon^{-1})$, and provided the soliton trajectory $(r(t), v(t))$ solves appropriate equations of motion. It will be shown that $(r(t), v(t))$ must solve the Newtonian equations of motion

\[
\dot r(t) = v(t), \qquad \dot v(t) = -\varepsilon\,(\nabla W)(\varepsilon r(t)) + a(t), \tag{1.20}
\]

where $a(t)$ is a “friction force”, with

\[
|a(t)| \le O(\varepsilon^2), \tag{1.21}
\]

for $|t| \le O(\varepsilon^{-1})$. The friction force $a(t)$ will be determined more precisely in Sect. 3. Neglecting the friction force $a(t)$, Eqs. (1.20) are Newton’s equations of motion for a point particle of mass $N$ moving in an external acceleration field of strength $\varepsilon$ with potential $V^{(\varepsilon)}$. Thus, for the velocity $v(t)$ of this particle to deviate substantially from the initial condition $v(0) = v_0$, the time $t$ must be $O(\varepsilon^{-1})$. For times $t$ with $|t| \le O(\varepsilon^{-1})$, the friction force $a(t)$ has a negligibly small effect, for small $\varepsilon$.

A solution of the Hartree equation (1.5) of the form (1.17), with properties (1.18) through (1.21), for times $t$ with $|t| \le O(\varepsilon^{-1})$, describes the motion of an extended particle in a shallow potential well $V^{(\varepsilon)}$ interacting weakly with a dispersive medium of infinitely many degrees of freedom with which it can exchange mass and energy. The point-particle limit in which Newton’s laws of motion become exact is the limit $\varepsilon \to 0$. For $\varepsilon > 0$, the interactions between the extended particle and the dispersive medium can lead to phenomena such as mass accretion, loss of mass and energy from the particle into dispersive waves, and friction, for times $t$ large on a scale of $\varepsilon^{-1}$. The intuitive picture is one of a bound cluster of “dust” describing an extended particle, which exhibits Newtonian motion with friction. The friction is caused by the loss of some “dust” originally bound to the particle. This loss of “dust” is only observed when the motion of the particle is not inertial (i.e., accelerated or decelerated) and is described by dispersive waves satisfying a wave equation which is essentially the linearization of Eq. (1.5) around a solitary wave described by $Q_{N(t)}^{(0)}(x - r(t))e^{i\theta(x,t)}$.
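The time scale $\varepsilon^{-1}$ in Eqs. (1.20) can be illustrated numerically. The sketch below integrates the frictionless equations $\dot r = v$, $\dot v = -\varepsilon(\nabla W)(\varepsilon r)$ with a velocity-Verlet scheme. It is only an illustration under stated assumptions: the potential $W(y) = 2 - e^{-|y|^2/2}$, the integrator, and the names `grad_W`, `newton_trajectory` and `energy` are hypothetical choices, not taken from the paper, and the friction force $a(t)$ is set to zero.

```python
import numpy as np

def grad_W(y):
    # Hypothetical smooth, bounded, positive potential W(y) = 2 - exp(-|y|^2 / 2);
    # its gradient is y * exp(-|y|^2 / 2).
    return y * np.exp(-0.5 * np.dot(y, y))

def newton_trajectory(r0, v0, eps, T, steps_per_unit=20):
    """Velocity-Verlet for r' = v, v' = -eps * (grad W)(eps * r) on 0 <= t <= T / eps."""
    n_steps = max(1, round(steps_per_unit * T / eps))
    dt = (T / eps) / n_steps
    r = np.array(r0, dtype=float)
    v = np.array(v0, dtype=float)
    acc = -eps * grad_W(eps * r)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * acc
        r = r + dt * v_half
        acc = -eps * grad_W(eps * r)
        v = v_half + 0.5 * dt * acc
    return r, v

def energy(r, v, eps):
    # Energy per unit mass, (1/2)|v|^2 + W(eps * r), conserved when a(t) = 0.
    return 0.5 * np.dot(v, v) + 2.0 - np.exp(-0.5 * eps**2 * np.dot(r, r))

for eps in (0.1, 0.01):
    r0 = np.array([1.0 / eps, 0.0, 0.0])   # so that eps * r0 has length 1
    v0 = np.zeros(3)
    r, v = newton_trajectory(r0, v0, eps, T=1.0)
    print(eps, np.linalg.norm(v - v0), energy(r, v, eps) - energy(r0, v0, eps))
```

In the rescaled variables $y = \varepsilon r$, $\tau = \varepsilon t$ the equations become $y'' = -\nabla W(y)$ with no $\varepsilon$ left, which is why the velocity change over a time $T\varepsilon^{-1}$ should be of order one and essentially independent of $\varepsilon$ in this sketch.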
For very large times, the trajectory of the extended particle is expected either to approach an inertial motion diverging to spatial infinity (if W (x) → const, as |x| → ∞ and if the initial mass and velocity of the particle were large enough), or to approach a local minimum of W where the particle will come to rest. This dissipative behavior of the particle motion is an example of the general phenomenon of “dissipation through radiation”. Some simple results on the large-time asymptotics of solutions of the Hartree equation (1.5) (existence of wave operators) are proven in Sect. 4. But it is fair to say that we do not yet have a good mathematical understanding of large-time behavior of solutions of Eq. (1.5). For some earlier results on scattering for the Hartree and nonlinear Schrödinger equation, see, e.g., [7,8] and references given there. Our analysis of solutions of the Hartree equation (1.5) of the form described in (1.17), with properties (1.18) through (1.21), is based on a key assumption, which is, implicitly,


an assumption on the two-body potential $-\Phi$ that will not be made explicit in this paper: Let $(f, g) := \int\bar f g$ denote the usual scalar product on $L^2$, and let $\mathcal{H}''$ denote the Hessian of the Hamilton functional $\mathcal{H}(\bar\psi,\psi)$, with $\lambda = 0$, at $\psi = Q_N^{(0)}$. Furthermore, let $\mathcal{H}''_{\mathrm{real}}$ denote the restriction of $\mathcal{H}''$ on real-valued functions, and extend it to a complex-linear operator. It will be shown in Sect. 3 that $\mathcal{H}''_{\mathrm{real}}$ is given by an unbounded, selfadjoint operator on $L^2$ defining a quadratic form on $H^1$ which is bounded from below. It is not hard to see that

\[
\big(Q, (\mathcal{H}''_{\mathrm{real}} - E)Q\big) = \varepsilon_0\,(Q, Q) < 0, \quad \text{for } Q := Q_N^{(0)}, \tag{1.22}
\]

where $E = E_N$ and

\[
\varepsilon_0 = -\frac{2}{N}\int\big(\Phi * Q^2\big)Q^2. \tag{1.23}
\]

Actually, $\mathcal{H}''_{\mathrm{real}} - E$ has only one negative eigenvalue. Since $\mathcal{H}$ is translation-invariant, it follows that $\nabla Q := \{\partial_1 Q, \ldots, \partial_n Q\}$, $\partial_j := \frac{\partial}{\partial x_j}$, $j = 1, \ldots, n$, are $n$ non-vanishing, linearly independent zero-modes for $\mathcal{H}''_{\mathrm{real}} - E$ orthogonal to $Q$, i.e.,

\[
(\mathcal{H}''_{\mathrm{real}} - E)\,\partial_j Q = 0, \quad \text{and} \quad (\partial_j Q, Q) = 0, \tag{1.24}
\]

for all $j = 1, \ldots, n$. Thus 0 is an at least $n$-fold degenerate eigenvalue of $\mathcal{H}''_{\mathrm{real}} - E$. Since $Q$ is a minimizer of $\mathcal{H}(\bar\psi,\psi)|_{S_N}$, there is no spectrum of $\mathcal{H}''_{\mathrm{real}} - E$ in the interval $(\varepsilon_0, 0)$. Furthermore, it is easy to see that the spectrum of $\mathcal{H}''_{\mathrm{real}} - E$ in the interval $[0, -E)$, where

\[
E = \frac{1}{N}\left[\frac{1}{2}\int(\nabla Q)^2 - \int\big(\Phi * Q^2\big)Q^2\right] < 0, \tag{1.25}
\]

is pure-point, while, on the half-line $[-E, \infty)$, it is continuous. Thus, there is a gap, $\varepsilon_2 > 0$, between 0 and the rest of the spectrum of $\mathcal{H}''_{\mathrm{real}} - E$ in $[0, \infty)$; see Sect. 3 for details.

Our key assumption is that the multiplicity of the eigenvalue 0 of $\mathcal{H}''_{\mathrm{real}} - E$ is precisely equal to $n$. This implies that

\[
\big(h, (\mathcal{H}''_{\mathrm{real}} - E)h\big) \ge \varepsilon_2\,(h, h), \quad \varepsilon_2 > 0, \tag{1.26}
\]

for all functions $h \in H^1$ with $h \perp \{Q, \nabla Q\}$ in the $L^2$-scalar product $(\cdot,\cdot)$.

We are now prepared to summarize the contents of this paper and to state our main results in the form of theorems. In Sect. 2, we recall the Hamiltonian nature of the Hartree equation (1.5) on the phase space $H^1$. We exhibit continuous symmetries of the Hamilton functional that give rise to Eq. (1.5) and derive the corresponding conservation laws. We show that the Hartree equation can also be viewed as the Euler–Lagrange equation derived from an action functional. The Lagrangian formulation of the non-linear Hartree equation is useful to study the formal point-particle limit (the $\varepsilon \to 0$ limit in (1.16) through (1.21)). This limit is discussed, in general terms but without mathematical proofs, in Sect. 2, using ideas


similar to those in [9] in an analysis of vortex motion in the Ginzburg–Landau equation, which is based on an effective-action formalism. We also discuss some expected features of the non-linear Hartree dynamics in the large-time limit. Our first main result is proven in Sect. 3.

Theorem 1.1. Suppose that assumption (1.26) holds for all minimizers $Q = Q_N^{(0)}$, with $N$ in an open neighborhood of some $N_0 > 0$. We also assume that $\Phi(x)$ is radial,

\[
\|\Phi\|_{W^{2,1}(\mathbb{R}^3)\cap W^{2,\infty}(\mathbb{R}^3)} \le C_\Phi \tag{1.27}
\]

for some constant $C_\Phi$. Then there is a positive constant $C_0$ such that, for an arbitrary $T < \infty$, there is an $\varepsilon_0 > 0$ with the property that, for any $0 < \varepsilon \le \varepsilon_0$ and any initial condition of the form

\[
\psi(x, 0) = \psi_0(x) = \big[Q(x - r_0) + h_{\varepsilon,0}(x)\big]e^{iv_0\cdot x}, \tag{1.28}
\]

with $Q = Q_{N_0}$ and $\|h_{\varepsilon,0}\|_{H^1} \le C_0\varepsilon^{3/2}$, the Hartree equation, Eq. (1.5), with $\lambda = 1$ and $V(x) = W(\varepsilon x)$ as in (1.16), has a solution of the form (1.17), for all times $t$ with $|t| < T\varepsilon^{-1}$, with the following properties:
1. The phase $\theta(x,t)$ is as in (1.19);
2. the trajectory $(r(t), v(t))$ of the extended-particle solution (1.17) is a solution of the equations of motion (1.20) with initial conditions $r(0) = r_0$, $v(0) = v_0$, for a friction force $a(t)$ bounded by $|a(t)| \le C_1\varepsilon^2$;
3. the dispersive correction $h_\varepsilon$ satisfies $\|h_\varepsilon(\cdot,t)\|_{H^1} \le C_2\varepsilon^{3/2}$,
for some finite constants $C_1$, $C_2$ depending on $T$.

This result makes the point-particle limit ($\varepsilon \to 0$) of the Hartree equation (1.5) precise for initial conditions describing a single extended particle (solitary wave) moving in a shallow potential well, $W(\varepsilon x)$, and perturbed by a small amount of radiation (described by $h_\varepsilon$). It is a special case of the more general situation considered in Sect. 3. A more detailed discussion and the proof of Theorem 1.1 form the contents of Sect. 3.

The results just described raise the issue of asymptotic properties of the dynamics determined by the Hartree equation, as time $t$ tends to $\pm\infty$. In Sect. 4, we establish a result on the scattering of small-amplitude waves off a single solitary wave. For simplicity, we suppose that physical space is three-dimensional, $n = 3$ (but our methods can be applied whenever $n \ge 3$), we set $\lambda = 0$, and we choose $\Phi$ to be a non-negative, bounded function of rapid decrease, as $|x| \to \infty$. We consider an “asymptotic profile” described by

\[
\psi_{\mathrm{as}}(x,t) = Q(x - r_0 - v_0t)\,e^{i\left[x\cdot v_0 - \left(\frac{1}{2}v_0^2 + E\right)t\right]} + h_{\mathrm{as}}(x,t), \tag{1.29}
\]

where $h_{\mathrm{as}}$ is a solution of the free-particle Schrödinger equation

\[
i\partial_t h_{\mathrm{as}}(x,t) = -\tfrac{1}{2}\Delta h_{\mathrm{as}}(x,t), \tag{1.30}
\]


with initial condition $h_{\mathrm{as}}(x, 0) =: h_{\mathrm{as},0}(x)$ belonging to and being sufficiently small in the space $H^4(\mathbb{R}^3) \cap W^{3,1}\big(\mathbb{R}^3, (1 + |x|^2)\,d^3x\big)$ and such that the Fourier transform, $\hat h_{\mathrm{as},0}(k)$, vanishes at $k = v_0$. In (1.29), $Q = Q_{N_0}^{(0)} \in S_{N_0}$ is a solution of Eq. (1.14), with $\lambda = 0$, and it is assumed that inequality (1.26) is satisfied for $Q = Q_N^{(0)} \in S_N$, for all $N$ in a small neighborhood of $N_0 > 0$.

Theorem 1.2. For an asymptotic profile $\psi_{\mathrm{as}}(x,t)$ as described in (1.29), (1.30), and under the hypotheses stated above, there are solutions, $\psi_\pm(x,t)$, of the Hartree equation (1.5) (for $\lambda = 0$) such that

\[
\psi_\pm(x,t) \longrightarrow \psi_{\mathrm{as}}(x,t), \quad \text{as } t \to \pm\infty, \tag{1.31}
\]

in $H^2(\mathbb{R}^3)$. Their difference is of order $O(t^{-1})$.

Thus the non-linear Møller wave maps $\Omega_\pm : \psi_{\mathrm{as}} \longrightarrow \psi_\pm$ exist as symplectic maps on asymptotic profiles of the form (1.29), (1.30). We emphasize that the effect of the scattering wave on the location and the phase of the soliton has to be tracked precisely for all time. The stability of the soliton is quite simple and can be obtained purely from energy considerations. A review can be found in Sect. 3 (see also Weinstein [14]). Therefore, the key points of Theorem 1.2 are its two precise assertions:
1. The location of the soliton is almost “linear.”
2. The scattering wave behaves like an ordinary dispersive wave (described by $h_{\mathrm{as}}(x,t)$), plus a small correction.
The condition on the Fourier transform of $h_{\mathrm{as},0}$ is a technical one and we expect to remove it later on. Our result constitutes the first step toward scattering theory. The proof of Theorem 1.2 forms the contents of the final section, Sect. 4, of this paper.

2. The Hartree Equation as a Hamiltonian System with Infinitely Many Degrees of Freedom, and Its Point-Particle Limit

In the introduction, we have described results indicating how the Hartree equation (1.5) captures the dynamics of a system of very many non-relativistic bosons with very weak two-body interactions in a condensate state. This regime has been called the “mean-field limit”. Actually, the mean-field limit is equivalent, mathematically, to the classical limit in which the value of Planck’s constant, $\hbar$, is sent to 0. We are accustomed to expect (actually in general erroneously) that the unitary dynamics of a quantum-mechanical system reduces to the Hamiltonian dynamics of a corresponding classical system, in the classical limit. In the examples studied in this paper, this expectation is justified.

2.1. The Hamiltonian nature of the Hartree equation. The phase space, $\Gamma$, for the Hartree equation (1.5) is the Sobolev (energy) space $H^1(\mathbb{R}^n)$ defined in (1.4). We use $\psi(x)$ and its complex conjugate $\bar\psi(x)$, $x \in \mathbb{R}^n$, as complex coordinates for $\Gamma$. The symplectic 2-form on $\Gamma$ is given by $\frac{i}{2}\,d\psi \wedge d\bar\psi$. It leads to the following Poisson brackets:

\[
\{\psi(x), \psi(y)\} = \{\bar\psi(x), \bar\psi(y)\} = 0, \tag{2.1}
\]

\[
\{\psi(x), \bar\psi(y)\} = 2i\,\delta(x - y). \tag{2.2}
\]


The Hamilton functional, $\mathcal{H}(\bar\psi,\psi)$, leading to the Hartree equation (1.5) is given by

\[
\mathcal{H}(\bar\psi,\psi) = \frac{1}{4}\int\Big[|\nabla\psi|^2 + 2\lambda V|\psi|^2 - \big(\Phi * |\psi|^2\big)|\psi|^2\Big]. \tag{2.3}
\]

For $\Phi \in L^p + L^\infty$, $p \ge \frac{n}{2}$, $\mathcal{H}$ is well defined on $\Gamma$ and bounded below on the spheres

\[
S_N = \big\{\psi \;\big|\; \psi \in \Gamma,\ \mathcal{N}(\bar\psi,\psi) = N < \infty\big\}, \tag{2.4}
\]

where

\[
\mathcal{N}(\bar\psi,\psi) = \int|\psi|^2; \tag{2.5}
\]

see inequality (1.12). Hamilton’s equations of motion for $\psi$ are given by

\[
\dot\psi_t(x) = \big\{\mathcal{H}(\bar\psi_t,\psi_t), \psi_t(x)\big\} = i\Big[\tfrac{1}{2}\Delta\psi_t(x) - \lambda V(x)\psi_t(x) + \big(\Phi * |\psi_t|^2\big)(x)\,\psi_t(x)\Big], \tag{2.6}
\]

which is precisely the Hartree equation (1.5). From (2.3) we infer the following symmetries and corresponding conservation laws.

(1) Gauge invariance of the first kind. The phase transformations

\[
\psi(x) \to e^{i\theta}\psi(x), \qquad \bar\psi(x) \to e^{-i\theta}\bar\psi(x) \tag{2.7}
\]

leave $\mathcal{H}(\bar\psi,\psi)$ invariant. These transformations describe the symplectic flow generated by the Hamiltonian vector field corresponding to the function $\frac{1}{2}\mathcal{N}(\bar\psi,\psi)$. Since they are a symmetry of $\mathcal{H}(\bar\psi,\psi)$, it follows that

\[
\{\mathcal{H}, \mathcal{N}\} = 0, \tag{2.8}
\]

and hence $\mathcal{N}$ is conserved, and the spheres $S_N$ defined in (2.4) are invariant under the time evolution $\psi \to \psi_t$ described by (2.6).

(2) Galilei invariance, for $\lambda = 0$. We shall assume henceforth that $\Phi$ is rotation-invariant. If the external potential $\lambda V$ vanishes then arbitrary Galilei transformations are symmetries of $\mathcal{H}$. Space translations, $x \to x + a$, are represented on $\Gamma$ by $\psi(x) \to \psi_a(x) := \psi(x - a)$, $a \in \mathbb{R}^n$, and are generated by the momentum functional

\[
\mathcal{P}(\bar\psi,\psi) := \frac{i}{2}\int\bar\psi\,\nabla\psi. \tag{2.9}
\]

They clearly leave $\mathcal{H}(\bar\psi,\psi)$ invariant, hence $\mathcal{P}$ is conserved under the time evolution and

\[
\{\mathcal{H}, \mathcal{P}\} = 0. \tag{2.10}
\]


Rotations, $R_{ab}$, in the $(ab)$-plane of $\mathbb{R}^n$, $1 \le a < b \le n$, are represented on $\Gamma$ by $\psi(x) \to \psi_{R_{ab}}(x) := \psi\big(R_{ab}^{-1}x\big)$. They are generated by the angular momentum functionals

\[
\mathcal{L}_{ab}(\bar\psi,\psi) := \frac{i}{2}\int\bar\psi\,\big(x^a\partial_b - x^b\partial_a\big)\psi, \tag{2.11}
\]

with $\partial_b = \partial/\partial x^b$. Since $\Phi$ has been assumed to be rotation-invariant, rotations leave $\mathcal{H}(\bar\psi,\psi)$ invariant, hence the functionals $\mathcal{L}_{ab}$ are conserved under the time evolution and Poisson-commute with $\mathcal{H}$, for all $(ab)$. Finally, boosts (velocity transformations), $x \to x - vt$, $v \in \mathbb{R}^n$, where $t$ denotes time, are represented on time-dependent trajectories, $\psi_t(x)$, in $\Gamma$ by

\[
\psi_t(x) \to \psi_t(v; x) := \psi_t(x - vt)\,e^{i\left(v\cdot x - \frac{v^2}{2}t\right)}. \tag{2.12}
\]

They do not leave $\mathcal{H}$ invariant, but one easily checks that if $\psi_t(x)$ is a solution of Hamilton’s equations of motion (2.6) then so is $\psi_t(v; x)$, for arbitrary $v \in \mathbb{R}^n$. The conserved quantity corresponding to (2.12) is given by

\[
\mathcal{M}_v(\bar\psi_t,\psi_t) := \int\bar\psi_t\,v\cdot(x + it\nabla)\,\psi_t. \tag{2.13}
\]

It follows that the “centre of mass motion” of a solution $\psi_t$ of (2.6) is inertial.

We conclude this section by noting that, as usual, Hamilton’s equations of motion (2.6) can also be viewed as Euler–Lagrange equations derived from an action principle. The action functional is defined on a space of continuously differentiable (in time) trajectories in phase space $\Gamma$. It is given by

\[
S(\bar\psi,\psi) := \int_{t_1}^{t_2}dt\,\Big[\frac{i}{2}\int\bar\psi_t\,\dot\psi_t - \mathcal{H}(\bar\psi_t,\psi_t)\Big]. \tag{2.14}
\]

The Hartree equation (2.6) is obtained from the action functional $S(\bar\psi,\psi)$ by variation with respect to $\bar\psi$, i.e., it is equivalent to the equation

\[
\frac{\delta S(\bar\psi,\psi)}{\delta\bar\psi_t(x)} = 0, \tag{2.15}
\]

under the boundary conditions that

\[
\delta\psi_{t_i}(x) = 0, \quad i = 1, 2. \tag{2.16}
\]

Global existence and uniqueness of solutions of the equations of motion (2.6), for $\Phi \in L^p + L^\infty$, $p \ge \frac{n}{2}$, is proven in [7] and refs. given there.
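Conservation of the functional $\mathcal{N}$ can be observed in a simple numerical experiment. The following sketch evolves the Hartree equation (2.6) with $\lambda = 0$ in one space dimension on a periodic grid, using a Strang split-step Fourier scheme. It is an illustration only, under stated assumptions: the scheme, the one-dimensional periodic setting, the Gaussian kernel standing in for $\Phi$, and the names `hartree_split_step` and `mass` are hypothetical choices and not part of the paper.

```python
import numpy as np

def hartree_split_step(psi, x, phi, dt, n_steps):
    """Strang splitting for i dpsi/dt = -(1/2) psi_xx - (Phi * |psi|^2) psi, periodic in x."""
    dx = x[1] - x[0]
    k = 2.0 * np.pi * np.fft.fftfreq(x.size, d=dx)
    half_kinetic = np.exp(-0.25j * k**2 * dt)        # free flow over dt/2 in Fourier space
    phi_hat = np.fft.fft(np.fft.ifftshift(phi))      # kernel moved so that x = 0 sits at index 0
    for _ in range(n_steps):
        psi = np.fft.ifft(half_kinetic * np.fft.fft(psi))
        # Periodic convolution (Phi * |psi|^2)(x); |psi|^2 is unchanged by the phase step below.
        conv = dx * np.real(np.fft.ifft(phi_hat * np.fft.fft(np.abs(psi) ** 2)))
        psi = psi * np.exp(1j * dt * conv)
        psi = np.fft.ifft(half_kinetic * np.fft.fft(psi))
    return psi

def mass(psi, x):
    # Discrete version of N = integral of |psi|^2.
    return float(np.sum(np.abs(psi) ** 2) * (x[1] - x[0]))

x = np.linspace(-20.0, 20.0, 256, endpoint=False)
phi = np.exp(-x**2)                                  # hypothetical even, non-negative kernel
psi0 = np.exp(-x**2 / 2) * np.exp(0.5j * x)          # Gaussian with a small initial velocity
psi1 = hartree_split_step(psi0, x, phi, dt=0.01, n_steps=200)
print(mass(psi0, x), mass(psi1, x))
```

Each substep multiplies $\psi$ by a factor of modulus one, either in position space or in Fourier space, so the discrete mass is preserved up to rounding. This mirrors the role of the phase transformations (2.7) as a symmetry, but it is a property of this particular discretization rather than a statement proved in the text.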


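2.2. Stationary solutions of the Hartree equations, for fixed values of $N$, $P$ and $L_{ab}$. In this section, we consider stationary solutions of the non-linear Hartree equations (2.6), assuming that $\lambda V = 0$ and that $\Phi$ is rotation-invariant. Since the $L^2$-norm $\mathcal{N}(\bar\psi,\psi)$, the momentum functional $\mathcal{P}(\bar\psi,\psi)$ and the angular momentum functionals $\mathcal{L}_{ab}(\bar\psi,\psi)$ are conserved, we may put them to fixed values, $N$, $\pi$ and $\lambda_{ab}$, respectively. In order to find stationary solutions of (2.6), with $\mathcal{N}(\bar\psi,\psi) = N$, $\mathcal{P}(\bar\psi,\psi) = \pi$ and $\mathcal{L}_{ab}(\bar\psi,\psi) = \lambda_{ab}$, we may look for critical points of the generalized energy functional

\[
\mathcal{E}(\bar\psi,\psi; E, P, L^{ab}) := \mathcal{H}(\bar\psi,\psi) + \frac{E}{2}\big(N - \mathcal{N}(\bar\psi,\psi)\big) + P\cdot\big(\pi - \mathcal{P}(\bar\psi,\psi)\big) + \sum_{a<b}L^{ab}\big(\lambda_{ab} - \mathcal{L}_{ab}(\bar\psi,\psi)\big), \tag{2.17}
\]

where $E$, $P$ and $L^{ab}$ are Lagrange multipliers. By varying $\mathcal{E}(\bar\psi,\psi; E, P, L^{ab})$ with respect to $\bar\psi$, $\psi$, $E$, $P$ and $L^{ab}$, we find the equations

\[
-\tfrac{1}{2}\Delta\psi - \big(\Phi * |\psi|^2\big)\psi - E\psi - iP\cdot\nabla\psi - i\sum_{a<b}L^{ab}\big(x^a\partial_b - x^b\partial_a\big)\psi = 0 \tag{2.18}
\]

(variation with respect to $\bar\psi$), and

\[
\mathcal{N}(\bar\psi,\psi) = N \quad \text{(variation with respect to } E\text{)}, \tag{2.19}
\]

\[
\mathcal{P}(\bar\psi,\psi) = \pi \quad \text{(variation with respect to } P\text{)}, \tag{2.20}
\]

and

\[
\mathcal{L}_{ab}(\bar\psi,\psi) = \lambda_{ab} \quad \text{(variation with respect to } L^{ab}\text{)}, \tag{2.21}
\]

$1 \le a < b \le n$. When the Lagrange multipliers $P$ and $L^{ab}$ vanish, Eqs. (2.18) and (2.19) reduce to

\[
-\tfrac{1}{2}\Delta\psi - \big(\Phi * |\psi|^2\big)\psi = E\psi, \tag{2.22}
\]

\[
\mathcal{N}(\bar\psi,\psi) = N, \tag{2.23}
\]

and the solution, $\psi = Q_N^{(0)}$, must satisfy

\[
\mathcal{P}\big(\bar Q_N^{(0)}, Q_N^{(0)}\big) = \frac{i}{2}\big(Q_N^{(0)}, \nabla Q_N^{(0)}\big) = 0 \tag{2.24}
\]

and

\[
\big(x^a\partial_b - x^b\partial_a\big)Q_N^{(0)} = 0, \quad \text{for all } a < b. \tag{2.25}
\]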


Equation (2.22) is identical to Eq. (1.14), for $\lambda V = 0$, and $E$ is given by

$$E = \frac{1}{2N}\big\|\nabla Q_N^{(0)}\big\|^2 - \frac{1}{N}\big(Q_N^{(0)2},\,\Phi*Q_N^{(0)2}\big), \tag{2.26}$$

see Eq. (1.15), which is strictly negative, for a non-trivial minimizer $Q_N^{(0)}$.

Lemma 2.1. For a positive, rotation-invariant potential $\Phi \in L^p + L^\infty$, $p \ge \frac{n}{2}$, with $\Phi(x) \to 0$, as $|x| \to \infty$, there exists a constant $N_* = N_*(\Phi)$, with $0 \le N_* < \infty$, such that, for $N > N_*$, (2.23) has a non-trivial solution $\psi = Q_N^{(0)}$, with $N(\bar Q_N^{(0)},Q_N^{(0)}) = N$, corresponding to a local minimum of $\mathcal{H}(\bar\psi,\psi)|_{S_N}$. The phase of $Q_N^{(0)}$ can be chosen such that $Q_N^{(0)} > 0$. The non-linear eigenvalue $E$ is given by (2.26) and is strictly negative, for $N > N_*$. The function $Q_N^{(0)}(x)$ is smooth and decays exponentially, as $|x| \to \infty$, with decay rate $\sqrt{-E}$.

Remarks. (i) From the theory of quantum-mechanical bound states we infer that, in $n = 1, 2$ dimensions, $N_* = 0$, while, for $n \ge 3$, $N_*$ is strictly positive if $\Phi$ is integrable, but vanishes for potentials of very long range, such as the Coulomb potential; see [10].

(ii) Given a solution, $Q_N^{(0)}$, of (2.22), the function

$$\psi_t(v;x) := Q_N^{(0)}(x - r - vt)\,e^{i\left[v\cdot x - \left(\frac12 v^2 + E\right)t\right]} \tag{2.27}$$

solves the Hartree equation (2.6), with $\lambda V = 0$, for arbitrary $r \in \mathbb{R}^n$ and $v \in \mathbb{R}^n$. This follows from the Galilei invariance of the theory. For $\psi_t$ as in (2.27),

$$P(\bar\psi_t,\psi_t) = Nv. \tag{2.28}$$

Equation (2.6) also has wave-like solutions with $P \ne 0$ (e.g. $\psi_t(x) = \psi_0 \exp i\,(k\cdot x - E(k,\psi_0)\,t)$, which has infinite energy and momentum). It would be of interest to also study square-integrable, stationary rotating soliton solutions of (2.6) with $L_{ab} \ne 0$.

(iii) It is straightforward to extend Lemma 2.1 to systems where $\lambda V \ne 0$. Such generalizations are of particular interest when $V$ has symmetries. Then minimizers, $Q_N^{(0)}$, of $\mathcal{H}(\bar\psi,\psi)|_{S_N}$ tend to break the symmetries of $\lambda V$ if $N$ is large enough.

(iv) Let $\mathcal{H}''$ denote the Hessian of $\mathcal{H}(\bar\psi,\psi)$ at $\psi = Q_N^{(0)}$, ($\lambda = 0$). In our proofs of Theorems 1.1 and 1.2 (see Sects. 3 and 4), we shall always assume that assumption (1.26) holds, for all $N$ in an open neighborhood of some $N_0 > N_*$.

Since the proof of Lemma 2.1 is standard, it is omitted. The interesting analytical issues arise in the problems described in Remarks (iii) and (iv). They deserve further study.
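The claim in Remark (ii) can be checked by direct substitution. The sketch below assumes the Hartree equation with $\lambda V = 0$ in the form $i\partial_t\psi = -\tfrac12\Delta\psi - (\Phi*|\psi|^2)\psi$ (as written in Sect. 3, with $\Phi$ the two-body potential and $\Delta$ the Laplacian) and abbreviates $Q = Q_N^{(0)}$, $y = x - r - vt$ and $\theta = v\cdot x - (\tfrac12 v^2 + E)t$; these abbreviations are introduced here for illustration only.

```latex
% psi = Q(y) e^{i theta},  y = x - r - vt,  theta = v.x - (v^2/2 + E) t
\begin{align*}
i\partial_t\psi &= \big[-i\,v\cdot\nabla Q+\big(\tfrac12v^2+E\big)Q\big]e^{i\theta},\\
-\tfrac12\Delta\psi &= \big[-\tfrac12\Delta Q-i\,v\cdot\nabla Q+\tfrac12v^2Q\big]e^{i\theta},\\
(\Phi*|\psi|^2)\,\psi &= (\Phi*Q^2)\,Q\,e^{i\theta}.
\end{align*}
% Adding up, the v-dependent terms cancel:
\[
i\partial_t\psi+\tfrac12\Delta\psi+(\Phi*|\psi|^2)\psi
 =\big[E\,Q+\tfrac12\Delta Q+(\Phi*Q^2)\,Q\big]e^{i\theta}=0 ,
\]
% which vanishes because Q solves -\tfrac12\Delta Q-(\Phi*Q^2)Q=EQ.
```

Here $Q$ and its derivatives are evaluated at $y$, and the convolution term uses that $|\psi|^2$ is a translate of $Q^2$.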


2.3. A heuristic discussion of the point-particle limit of the Hartree equation. In this section we start from the results reviewed in the last section (see Lemma 2.1) to study the point-particle (Newtonian) limit of the Hartree equation. In this limit the Hartree equation reduces to the Newtonian mechanics of point-particles interacting through two-body potential forces. We use ideas closely related to those proposed in [9] in an analysis of vortex motion in the plane, as described by the Ginzburg–Landau equations. Let $\lambda V$ and $\Phi$ be as in Eqs. (1.5), (2.6). We set $\lambda = 1$ and consider a family of external potentials of the form

$$V(x) \equiv V^{(\varepsilon)}(x) := W(\varepsilon x), \tag{2.29}$$

where $W$ is some smooth, positive function on $\mathbb{R}^n$, and $\varepsilon > 0$ is a parameter. Furthermore, the two-body potential, $-\Phi$, is chosen to be

$$\Phi(x) = \Phi_s(x) + \Phi_\ell(\varepsilon x), \tag{2.30}$$

where $\Phi_s(x)$ is a rotation-invariant, smooth function decaying rapidly in $\rho := |x|$, as $\rho \to \infty$, and with the properties that

$$\frac{d\Phi_s(\rho)}{d\rho} < 0,\quad \text{for } \rho > 0, \tag{2.31}$$

and that the key gap assumption (1.26) stated in Sect. 1 holds for $\Phi = \Phi_s$. The perturbing potential $\Phi_\ell$ is rotation-invariant and smooth and may be of long range, e.g.

$$\Phi_\ell(\rho) \sim \rho^{2-n},\quad \text{as } \rho \to \infty, \tag{2.32}$$

for $n \ge 3$, which is the behavior of the Coulomb and of Newton's gravitational potential. For simplicity, we assume that $|d\Phi_\ell(\rho)/d\rho|$ is uniformly bounded in $\rho$. We pick $k$ positive integers $N_1,\dots,N_k$, with $N_j > N_*(\Phi_s)$, for all $j$. For $\lambda V = 0$ and $N > N_*(\Phi_s)$, we define

$$\delta_N := N^{-1}\int d^nx\, Q_N^{(0)}(x)^2\, x^2, \tag{2.33}$$

where $Q_N^{(0)}$ is a rotation-invariant minimizer of the functional $\mathcal{H}(\bar\psi,\psi)|_{S_N}$, as described in Lemma 2.1.

We consider an initial condition, $\psi_0(x)$, for the Hartree equation (2.6) describing a configuration of $k$ far-separated “solitons”, $Q_{N_j}^{(0)}(x - r_j)$, $r_j \in \mathbb{R}^n$, $j = 1,\dots,k$, (perturbed by a small-amplitude wave), with the following properties: Each soliton $Q_{N_j}^{(0)}(x)$ is a rotation-invariant solution of Eq. (2.22), with $\Phi = \Phi_s$ and $N = N_j$, minimizing $\mathcal{H}(\bar\psi,\psi)|_{S_N}$ (for $\lambda = 0$, $\Phi = \Phi_s$). Furthermore

$$\frac{\max_{j=1,\dots,k}\delta_{N_j}}{\min_{1\le i<j\le k}|r_i - r_j|} \le \varepsilon, \tag{2.34}$$

where $\varepsilon$ is the parameter introduced in (2.29), (2.30).


Our goal is to construct a solution, $\psi_t$, of the Hartree equation (2.6) of the form

$$\psi_t(x) = \sum_{j=1}^{k} Q^{(0)}_{N_j(t)}\big(x - r_j(t)\big)\,e^{i\theta_j(x,t)} + h_\varepsilon(x,t), \tag{2.35}$$

where $r_j(0) = r_j$, as in (2.34), and $\dot r_j(0) = v_j \in \mathbb{R}^n$, $j = 1,\dots,k$, with the following properties: There is a positive constant $T$ such that, for all times $t$ with $|t| < T/\varepsilon$,

(a) $\|h_\varepsilon(\cdot,t)\| \sim o(\varepsilon)$, for an appropriately chosen norm $\|(\cdot)\|$,

(b) $\theta_j(x,t) = \dot r_j(t)\cdot\big(x - r_j(t)\big) + \vartheta_j(t)$, where $\vartheta_j(t)$ is independent of $x$, and $\dot N_j(t) = o(\varepsilon)$.

(c) The trajectories $r_1(t),\dots,r_k(t)$ and the phases $\vartheta_1(t),\dots,\vartheta_k(t)$ will turn out to satisfy equations of motion which can be derived from the Hartree equation.

In this section we do not present a mathematical proof of the claim that solutions of the Hartree equation (2.6) of the form (2.35) with properties (a)–(c) exist (but see Sect. 3). We merely verify that a function $\psi_t(x)$ of the form (2.35) with properties (a)–(c) approaches a critical point of the action functional $S(\bar\psi,\psi)$ introduced in (2.14), as $\varepsilon \to 0$, provided the trajectories $r_j(t)$ satisfy certain Newtonian equations of motion and the phases $\vartheta_j(t)$ are suitably chosen ($j = 1,\dots,k$). Since critical points of $S(\bar\psi,\psi)$ satisfy the Hartree equation (2.6), this makes it plausible that solutions of (2.6) of the form (2.35) with properties (a)–(c) exist. This claim is proven in Sect. 3 for $k = 1$.

Our heuristic analysis is based on the following simple facts:

(1) For $i \ne j$, $\int d^nx\, Q^{(0)}_{N_i}(x - r_i)\, Q^{(0)}_{N_j}(x - r_j) \to 0$, exponentially fast, as $|r_i - r_j| = O(\varepsilon^{-1}) \to \infty$. This follows from Lemma 2.1.

(2) $\big(Q^{(0)}_{N_i(t)}, h_\varepsilon(\cdot,t)\big) = o(\varepsilon)$, for $|t| \le T/\varepsilon$, as $\varepsilon \to 0$, for all $i = 1,\dots,k$; see (2.35) and property (a).

(3) $\big(Q^{(0)}_{N_i}, \nabla Q^{(0)}_{N_i}\big) = 0$, for all $i$, by translation invariance (see Eq. (1.24)).

(4) For $y := x - r_i(t)$, $\int d^ny\, Q^{(0)}_{N_i(t)}(y)^2\, y = 0$, for all $i$, by rotation invariance.

(5) $\dot N_i(t) = 2\big(Q^{(0)}_{N_i(t)}, Q^{(0)}_{\dot N_i(t)}\big)$, for all $i$, because $N_i = \big(Q^{(0)}_{N_i}, Q^{(0)}_{N_i}\big)$.

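Using that

$$\frac{\partial}{\partial t}\Big[Q^{(0)}_{N_j(t)}\big(x - r_j(t)\big)e^{i\theta_j(x,t)}\Big] = \Big[Q^{(0)}_{\dot N_j(t)}\big(x - r_j(t)\big) - \dot r_j(t)\cdot\nabla Q^{(0)}_{N_j(t)}\big(x - r_j(t)\big) + i\,\dot\theta_j(x,t)\,Q^{(0)}_{N_j(t)}\big(x - r_j(t)\big)\Big]e^{i\theta_j(x,t)}, \tag{2.36}$$

with

$$\dot\theta_j(x,t) = \ddot r_j(t)\cdot\big(x - r_j(t)\big) - \dot r_j(t)^2 + \dot\vartheta_j(t), \tag{2.37}$$

and

$$\nabla\theta_j(x,t) = \dot r_j(t), \tag{2.38}$$

we find that, for $\psi_t(x)$ as in (2.35), the action functional $S(\bar\psi,\psi)$ introduced in (2.14), with $-\frac{T}{\varepsilon} \le t_1 < t_2 \le \frac{T}{\varepsilon}$, is given by

$$\begin{aligned}
S(\bar\psi,\psi) = \frac12\int_{t_1}^{t_2} dt \sum_{j=1}^{k}\Big\{ &\frac{i}{2}\dot N_j - \int d^nx\, Q^{(0)}_{N_j}(x - r_j)^2\, \ddot r_j\cdot(x - r_j) + N_j\dot r_j^2 - N_j\dot\vartheta_j - \frac12\big\|\nabla Q^{(0)}_{N_j}\big\|^2 - \frac{N_j}{2}\dot r_j^2 \\
&+ \frac12\int\big(\Phi*Q^{(0)2}_{N_j}\big)Q^{(0)2}_{N_j} - N_jW(\varepsilon r_j) + \frac12\sum_{i:\,i\ne j} N_iN_j\,\Phi_\ell\big(\varepsilon(r_i - r_j)\big)\Big\} + s_\varepsilon,
\end{aligned} \tag{2.39}$$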

where $s_\varepsilon$ is an error term $\sim o(\varepsilon)$. In the first term on the right-hand side of (2.39) we have used (5), the second term proportional to $\ddot r_j$ vanishes by (4), in the third and fourth term we have used (2.37), in the sixth term we have used (2.38), and various cross terms vanish because of (3) or only contribute to the error term because of (1) and (2). We have also used that

$$\int d^nx\, W(\varepsilon x)\,Q^{(0)}_{N_j}(x - r_j)^2 = N_jW(\varepsilon r_j) + o(\varepsilon);$$

and that, for $i \ne j$,

$$\int d^nx\int d^ny\, Q^{(0)}_{N_i}(x - r_i)^2\,\Phi(x - y)\,Q^{(0)}_{N_j}(y - r_j)^2 = N_iN_j\,\Phi_\ell\big(\varepsilon(r_i - r_j)\big) + o(\varepsilon),$$

by (4) and because $\Phi_s(x)$ decays rapidly in $|x|$. Thus

$$S(\bar\psi,\psi) = \frac12 S_{\rm Newton}\big((r_j,N_j)_{j=1,\dots,k}\big) + \frac12\int_{t_1}^{t_2} dt\sum_{j=1}^{k}\Big\{\frac{i}{2}\dot N_j - N_j\dot\vartheta_j - 2\mathcal{H}\big(Q^{(0)}_{N_j},Q^{(0)}_{N_j}\big)\Big\} + s_\varepsilon, \tag{2.40}$$


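where

$$S_{\rm Newton}\big((r_j,N_j)_{j=1,\dots,k}\big) = \int_{t_1}^{t_2} dt\sum_{j=1}^{k}\Big\{\frac{N_j}{2}\dot r_j^2 - N_jW(\varepsilon r_j) + \frac12\sum_{i:\,i\ne j}N_iN_j\,\Phi_\ell\big(\varepsilon(r_i - r_j)\big)\Big\} \tag{2.41}$$

is the usual Hamiltonian action for $k$ point particles with masses $N_1,\dots,N_k$ in an external acceleration field potential $W(\varepsilon\,\cdot)$ and interacting through two-body forces with potential $N_iN_j\,\Phi_\ell\big(\varepsilon(r_i - r_j)\big)$. In order to guarantee that the ansatz (2.35) yields a solution of the Hartree equation (2.6) with properties (a), (b) and (c), we must require that the variation of the action $S(\bar\psi,\psi)$ calculated in (2.40), (2.41) with respect to the variational parameters $r_j$, $N_j$, $\vartheta_j$, $j = 1,\dots,k$, and $h_\varepsilon$ vanish. To write down the variational equations, we observe that the second term on the right-hand side of (2.40) is independent of $r_1,\dots,r_k$, except for the error term $s_\varepsilon$, which is $o(\varepsilon)$. Thus, varying $S(\bar\psi,\psi)$ with respect to $r_1,\dots,r_k$ yields Newton's equations of motion

$$\ddot r_j = -\varepsilon(\nabla W)(\varepsilon r_j) + \frac{\varepsilon}{2}\sum_{i:\,i\ne j}N_i\,(\nabla\Phi_\ell)\big(\varepsilon(r_j - r_i)\big) + a_j, \tag{2.42}$$

where $a_j$ comes from the error term $s_\varepsilon$, and $|a_j(t)| \sim o(\varepsilon)$, for $|t| \le \frac{T}{\varepsilon}$; $j = 1,\dots,k$. Variation with respect to $N_1,\dots,N_k$ yields the equations

$$\dot\vartheta_j = \frac12\dot r_j^2 - W(\varepsilon r_j) + \sum_{i:\,i\ne j}N_i\,\Phi_\ell\big(\varepsilon(r_i - r_j)\big) - \frac{\partial}{\partial N_j}\mathcal{H}\big(Q^{(0)}_{N_j},Q^{(0)}_{N_j}\big) + o(\varepsilon). \tag{2.43}$$

It is easy to see that

$$\frac{\partial}{\partial N_j}\mathcal{H}\big(Q^{(0)}_{N_j},Q^{(0)}_{N_j}\big) = \frac{1}{2N_j}\big\|\nabla Q^{(0)}_{N_j}\big\|^2 - \frac{1}{N_j}\big(Q^{(0)2}_{N_j},\,\Phi*Q^{(0)2}_{N_j}\big) = E_j - N_j\Phi_\ell(0) + o(\varepsilon), \tag{2.44}$$

see Eq. (2.26). Hence, for $|t| \le \frac{T}{\varepsilon}$,

$$\dot\vartheta_j = \frac12\dot r_j^2 - W(\varepsilon r_j) + \sum_{i=1}^{k}N_i\,\Phi_\ell\big(\varepsilon(r_i - r_j)\big) - E_j + o(\varepsilon). \tag{2.45}$$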
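For a single soliton ($k = 1$) the interaction sum in (2.41) is empty, and the step from the Newtonian action to the equation of motion can be written out in full. The sketch below drops the error term $a_1$ and treats the mass $N_1$ as constant, which is only justified to the order considered in this section; the rescaled variables $\tau$ and $R$ are introduced here purely for illustration.

```latex
% k = 1, N_1 constant, error terms dropped (assumptions of this sketch)
\begin{align*}
L(r,\dot r) &= \tfrac{N_1}{2}\,\dot r^{\,2}-N_1\,W(\varepsilon r),\\
\frac{d}{dt}\frac{\partial L}{\partial\dot r}-\frac{\partial L}{\partial r}
 &= N_1\,\ddot r+\varepsilon N_1\,(\nabla W)(\varepsilon r)=0
 \quad\Longrightarrow\quad
 \ddot r=-\varepsilon\,(\nabla W)(\varepsilon r).
\end{align*}
% With \tau=\varepsilon t and R(\tau)=\varepsilon\,r(\tau/\varepsilon):
\[
\frac{d^2R}{d\tau^2}=\frac{1}{\varepsilon}\,\ddot r=-(\nabla W)(R),
\]
% so the window |t| \le T/\varepsilon corresponds to |\tau| \le T, independent of \varepsilon.
```

The rescaling indicates why times of order $\varepsilon^{-1}$ are the natural scale on which the soliton explores the potential $W$.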


Variation with respect to $\vartheta_1,\dots,\vartheta_k$ yields the equations

$$\dot N_j = o(\varepsilon), \tag{2.46}$$

(approximate conservation of masses of particles), and, finally, variation with respect to $h_\varepsilon$ yields an equation of motion of the form

$$\frac{\partial}{\partial t}h_\varepsilon(x,t) = X\big(h_\varepsilon,(r_j,N_j,\vartheta_j)_{j=1}^{k}\big)(x,t), \tag{2.47}$$

with $\|X\| \sim o(\varepsilon)$, for $|t| \le \frac{T}{\varepsilon}$, where $\|(\cdot)\|$ is an appropriately chosen norm.

At a heuristic level, Eqs. (2.42) and (2.46) show very clearly that the limit $\varepsilon \to 0$ corresponds to the point-particle limit in which the masses, $N_1,\dots,N_k$, of the particles (“solitons”) are constant and their trajectories are solutions of Newton's equations of motion, on time scales of $O(\varepsilon^{-1})$. It is interesting and useful to work out explicit expressions for all the terms of $o(\varepsilon)$ in Eqs. (2.42), (2.45), (2.46) and (2.47), in order to understand more about the corrections to the Newtonian point-particle limit and to get a handle on phenomena like radiation loss and dissipation through emission of small-amplitude dispersive radiation. But, since our discussion in this section is at a formal level, we shall not do so here. In the special case where $k = 1$, the terms of size $o(\varepsilon)$ are analyzed in Sect. 3.

The analysis of the correction term $s_\varepsilon$ in expression (2.40) for the action functional and of the properties of solutions of Eq. (2.47) is crucial in attempting to understand the long-time behavior of solutions of the Hartree equation (2.6). In the introduction, we have drawn attention to results of Soffer and Weinstein [8], see also [12], concerning “nonlinear Rayleigh scattering” for small-amplitude solutions of the non-linear Schrödinger or Hartree equations with a suitable external potential $\lambda V$. One would like to extend their results in the direction of a theory of non-linear resonances (metastable states) and gain understanding of the phenomenon of “approach to a groundstate”. Of particular interest are situations where the Hamilton functional $\mathcal{H}(\bar\psi,\psi)|_{S_N}$, see (2.3), restricted to a sphere $S_N$ in phase space has several distinct local minima, for $N$ large enough.
This happens when $\lambda V$ has several minima separated by large barriers and $-\Phi$ is the potential of an attractive force. One would then like to understand the shape of the “basins of attraction” in phase space of the local minima of $\mathcal{H}(\bar\psi,\psi)|_{S_N}$: The forward (backward) basin of attraction of a family of local minima of $\mathcal{H}(\bar\psi,\psi)|_{S_N}$ parametrized by $N$ consists of all initial conditions in phase space which approach an element of this family plus dispersive radiation decaying to 0 at the free dispersion rate, as $t \to +\infty$ ($t \to -\infty$). This is the phenomenon of “approach to a groundstate”. More ambitiously, one might try to construct a “centre manifold” of asymptotically attracting configurations of solitons to which solutions of the Hartree equation with initial conditions sufficiently close to the centre manifold converge locally in space, as $|t| \to \infty$. See [12] for some preliminary results.

Let us consider an example: We choose an initial condition for the Hartree equation describing two far-separated solitons at positions $r_1$, $r_2$ and with initial velocities $v_1$, $v_2$. We suppose that $\lambda V = 0$ and that $-\Phi_\ell$ is purely attractive and of short range. The “masses” $N_1$, $N_2$ of the solitons and the initial conditions $r_1, v_1$ and $r_2, v_2$ are chosen such that the two solitons form a bound state, i.e., that

$$\frac{N_1}{2}v_1^2 + \frac{N_2}{2}v_2^2 - N_1N_2\,\Phi_\ell\big(\varepsilon(r_1 - r_2)\big) < 0. \tag{2.48}$$


One would then like to calculate the power, $P_R(t)$, of emission of dispersive radiation through a sphere of radius $R \gg \max(|r_1|,|r_2|)$. Moreover, one would like to show that, as $t \to \pm\infty$, a typical configuration of two solitons satisfying (2.48) collapses to a single soliton moving through space at a constant velocity. This phenomenon would describe the “radiative collapse of a binary system”. More generally, it would be interesting to understand how, at intermediate times, small inhomogeneities in the initial conditions for solutions of the Hartree equation grow to form a structure of rotating bodies (solitons) perturbed by outgoing, dispersive radiation, before it eventually approaches a number of far separated solitons escaping from each other. [In studying such problems, one finds out that the Hartree equation not only “knows” about Newton's equations of motion, it also “knows” about the Euler equations for the motion of rigid bodies.] The problems described here are problems on the scattering theory for the Hartree equation. If $-\Phi$ is attractive, i.e., for a self-focussing non-linearity, scattering theory is bound to be very subtle, involving infinitely many “scattering channels”, and is beyond the reach of our methods (see, however, Sect. 4 for some preliminary results, and [7] for the case where $-\Phi$ is repulsive).

3. Proof of Theorem 1.1

In this section, we prove the first main result (Theorem 1.1) of this paper.

3.1. Stability of soliton solutions of Hartree equations. We first review the stability of the soliton solutions to the Hartree equation without external potential, i.e., for $\lambda = 0$. The equation is

$$i\partial_t\psi = 2\frac{\partial\mathcal{H}}{\partial\bar\psi} = -\frac12\Delta\psi - (\Phi*|\psi|^2)\psi, \tag{3.1}$$

where $\frac{\partial\mathcal{H}}{\partial\bar\psi}$ ($\mathcal{H} = \mathcal{H}^{(\lambda=0)}$, see (1.11)) is the first variation of the energy functional w.r.t. $\bar\psi$. Recall that $Q$ is a minimizer of $\mathcal{H}$ under the constraint $N(\bar\psi,\psi) := \|\psi\|^2 = N$, for some $N$ fixed, and thus $Q$ satisfies the equation

$$-\frac12\Delta Q - (\Phi*|Q|^2)Q = EQ, \tag{3.2}$$

for some non-linear eigenvalue (Lagrange multiplier) $E$. Suppose the function $\psi$ can be written in the form $\psi = (Q + h)e^{-iEt}$. Then the linearized equation satisfied by $h$ takes the form

$$i\partial_t h = Lh, \tag{3.3}$$

where

$$Lh = -\frac12\Delta h - Eh - (\Phi*Q^2)h - Q\big(\Phi*(Q(h + \bar h))\big). \tag{3.4}$$

Due to the appearance of $\bar h$ on the right side of (3.4), $L$ is not a complex-linear operator. It is therefore convenient to separate the last equation into real and imaginary parts

$$Lh = L_+A + iL_-B,\qquad h = A + iB,\qquad (A \text{ and } B \text{ real}), \tag{3.5}$$

where

$$L_- = -\frac12\Delta - E - \Phi*Q^2,\qquad L_+ = L_- - 2Q[\Phi*(Q\,\cdot)],\qquad \big(L_+A = L_-A - 2Q[\Phi*(QA)]\big).$$

In matrix form,

$$\frac{\partial}{\partial t}\begin{pmatrix}A\\B\end{pmatrix} = \begin{pmatrix}0 & L_-\\ -L_+ & 0\end{pmatrix}\begin{pmatrix}A\\B\end{pmatrix} =: \mathcal{L}\begin{pmatrix}A\\B\end{pmatrix}. \tag{3.6}$$
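The passage from (3.3) and (3.5) to the matrix form (3.6) is a one-line computation. It is spelled out below with $h = A + iB$, $A$ and $B$ real, using nothing beyond (3.3) and (3.5).

```latex
\begin{align*}
\partial_t(A+iB) &= -i\,Lh = -i\,(L_+A+i\,L_-B) = L_-B-i\,L_+A,\\
\text{hence}\qquad \partial_tA &= L_-B,\qquad \partial_tB=-L_+A .
\end{align*}
```

Taking real and imaginary parts is legitimate here because $L_\pm$ map real functions to real functions.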

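($\mathcal{L}$ is the matrix form of $-iL$; it determines a linear Hamiltonian vector field.) The operators $L_-$ and $L_+$ also appear naturally in the second variation of the energy functional $\mathcal{H}$. Writing $\psi = u + iv$, we have by explicit computation

$$\begin{aligned}
\mathcal{H}(Q + h) = \mathcal{H}(Q) &+ \int dx\,\frac{\partial\mathcal{H}}{\partial u}\Big|_Q A + \int dx\,\frac{\partial\mathcal{H}}{\partial v}\Big|_Q B \\
&+ \frac12\Big[\int dx\,A\,\frac{\partial^2\mathcal{H}}{\partial u\,\partial u}\Big|_Q A + 2\int dx\,A\,\frac{\partial^2\mathcal{H}}{\partial u\,\partial v}\Big|_Q B + \int dx\,B\,\frac{\partial^2\mathcal{H}}{\partial v\,\partial v}\Big|_Q B\Big] + O(h^3),
\end{aligned}$$

where

$$\frac{\partial\mathcal{H}}{\partial u}\Big|_Q = \frac{\partial\mathcal{H}}{\partial u}\Big|_{\bar\psi=\psi=Q}.$$

Notice that $\mathcal{H}$ has no cross terms in $u$ and $v$, except in the nonlinear term depending only on $|\psi|^2$. Since $Q$ is real, we have that

$$\frac{\partial\mathcal{H}}{\partial v}\Big|_Q = 0.$$

Thus the first order term is just

$$\int dx\,\frac{\partial\mathcal{H}}{\partial u}\Big|_Q A = 2\int dx\,\frac{\partial\mathcal{H}}{\partial\bar\psi}\Big|_Q A = E\int dx\,QA,$$

where we have used Eq. (3.2). Similarly,

$$\frac{\partial^2\mathcal{H}}{\partial u\,\partial v}\Big|_Q = 0,$$

and the second order term is just

$$\frac12\Big[\int dx\,A\,\frac{\partial^2\mathcal{H}}{\partial u\,\partial u}\Big|_Q A + \int dx\,B\,\frac{\partial^2\mathcal{H}}{\partial v\,\partial v}\Big|_Q B\Big] = \int dx\,AL_+A + \int dx\,BL_-B + E\int dx\,(A^2 + B^2).$$

(Observe that $\mathcal{H}''_{\rm real} - E = L_+$.) We have thus proved that

$$\mathcal{H}(Q + h) = \mathcal{H}(Q) + E\big[(Q,A) + \|h\|^2\big] + \mathrm{Re}(Lh,h) + O(h^3), \tag{3.7}$$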


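where $(f,g) = \int\bar f g\,dx$ is the standard $L^2$ scalar product and

$$\mathrm{Re}(Lh,h) = \int dx\,AL_+A + \int dx\,BL_-B.$$

Let $Q_\varepsilon \equiv Q_{N+\varepsilon}$ be the (real) minimizer centered at the origin, with $\|Q_\varepsilon\|^2 = N + \varepsilon$. Let $h_\varepsilon = Q_\varepsilon - Q$. Then

$$\varepsilon = \|Q + h_\varepsilon\|^2 - \|Q\|^2 = 2\int Qh_\varepsilon + \int h_\varepsilon^2 = 2\int Qh_\varepsilon + O(\varepsilon^2).$$

We define $\mathcal{E}(N)$ as the minimal energy subject to the constraint $\|\psi\|^2 = N$:

$$\mathcal{E}(N) = \inf_{\|\psi\|^2 = N}\mathcal{H}(\psi).$$

The last two equations and (3.7) then yield the standard relation

$$\frac{\partial\mathcal{E}(N)}{\partial N} = E/2. \tag{3.8}$$

For an arbitrary $h$ with $\mathrm{Re}\,h \perp Q$, Eq. (3.7) yields

$$\int dx\,AL_+A + \int dx\,BL_-B = \mathcal{H}(Q + h) - \mathcal{H}(Q) - E\|h\|^2 + O(h^3). \tag{3.9}$$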
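The step leading to (3.8) can be made explicit. The sketch below uses only (3.7) applied to the real perturbation $h_\varepsilon = Q_\varepsilon - Q$ (so $A = h_\varepsilon$, $B = 0$) together with the relation $\varepsilon = 2(Q,h_\varepsilon) + O(\varepsilon^2)$ derived above. It assumes that $h_\varepsilon$ is of size $O(\varepsilon)$, i.e. that the minimizer depends smoothly on $N$, and it writes $\mathcal{E}(N)$ for the minimal energy and $\mathcal{H}$ for the energy functional, a notational choice made here to keep them apart from the eigenvalue $E$.

```latex
\begin{align*}
\mathcal E(N+\varepsilon)-\mathcal E(N)
 &= \mathcal H(Q+h_\varepsilon)-\mathcal H(Q)
  = E\,(Q,h_\varepsilon)+O(\varepsilon^2)
  = \tfrac{E}{2}\,\varepsilon+O(\varepsilon^2),\\
\frac{\partial\mathcal E(N)}{\partial N} &= \frac{E}{2}.
\end{align*}
```

All second-order terms in (3.7) are quadratic in $h_\varepsilon$ and therefore absorbed in $O(\varepsilon^2)$.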

Since $\mathcal{H}(Q + h) \ge \mathcal{E}(\|Q + h\|^2) = \mathcal{E}(N + \|h\|^2)$, (because $\mathrm{Re}\,h \perp Q$, $\|Q + h\|^2 = \|Q\|^2 + \|h\|^2$), we obtain from Eq. (3.8)

$$\mathcal{H}(Q + h) - \mathcal{H}(Q) - E\|h\|^2 \ge O(h^3).$$

This proves that

$$\int dx\,AL_+A \ge 0,\qquad \int dx\,BL_-B \ge 0,$$

for all $A \perp Q$ and arbitrary $B$. Thus $L_- \ge 0$, and $L_+$ has at most one negative eigenvalue. From the explicit form of $L_-$ and $L_+$ we conclude that

$$L_-Q = 0,\qquad L_+\nabla Q = 0,\qquad L_-(xQ) = -\nabla Q. \tag{3.10}$$

Since $Q$ is positive and $L_- \ge 0$, its null space is the span of $Q$, i.e.,

$$L_- \ge 0,\qquad N(L_-) = \mathrm{span}_{\mathbb{R}}\{Q\}.$$

From the explicit form of $L_+$ we have that

$$(Q,L_+Q) =: \varepsilon_0\cdot(Q,Q) < 0, \tag{3.11}$$

where

$$\varepsilon_0 = -2\big(N(Q)\big)^{-1}\int Q^2\,\Phi*Q^2 < 0.$$

Thus $L_+$ has exactly one negative eigenvalue. The continuous spectra of $L_-$ and $L_+$ can easily be shown to be the half-line $[-E,\infty)$. Since $L_+\nabla Q = 0$, 0 is an at least $n$-fold degenerate eigenvalue of $L_+$. A key assumption in our analysis is that the whole null space of $L_+ = \mathcal{H}''_{\rm real} - E$ is spanned by $\nabla Q$, i.e.,

$$N(L_+) = \mathrm{span}_{\mathbb{R}}\{\nabla Q\}. \tag{3.12}$$

Since the continuous spectrum of $L_-$ and of $L_+$ is the half-line $[-E,\infty)$, 0 is an isolated point. Hence there is a positive number $\delta$ such that $(h,L_+h) \ge \delta(h,h)$, if $h$ is orthogonal to the span of $\nabla Q$ and to the ground state of $L_+$. In particular, the number of eigenvalues strictly below $\delta$ is exactly $n + 1$. We have proved the following lemma.

Lemma 3.1. Assume that (3.12) holds. Then the null spaces of $L_-$ and $L_+$ are given by $N(L_-) = \mathrm{span}_{\mathbb{R}}\{Q\}$, $N(L_+) = \mathrm{span}_{\mathbb{R}}\{\nabla Q\}$. Furthermore, there is a constant $\varepsilon_2 > 0$ such that

(a) $(g,L_-g) \ge \varepsilon_2(g,g)$ if $g \perp Q$.

(b) $(f,L_+f) \ge \varepsilon_2(f,f)$ if $f \perp \mathrm{span}_{\mathbb{R}}\{Q,\nabla Q\}$.

If we assume that $\|Q + h\|^2 = \|Q\|^2$, the term with the factor $E$ in (3.7) vanishes, because $2(Q,A) = -\|h\|^2$, and we have that

$$\mathcal{H}(Q + h) = \mathcal{H}(Q) + \int dx\,AL_+A + \int dx\,BL_-B + O(h^3). \tag{3.13}$$

Thus if $h = A + iB$, with

$$A \perp \mathrm{span}_{\mathbb{R}}\{\nabla Q\},\qquad B \perp Q,\qquad \|Q + h\|^2 = \|Q\|^2, \tag{3.14}$$

then we can write $A = A_1 + cQ$, with $(A_1,Q) = 0$, for some $c$ of order $\|h\|^2$, ($c(Q,Q) = (A,Q) = -(h,h)/2$). Since $(Q,\nabla Q) = 0$, we have that $(\nabla Q,A_1) = 0$, provided that $(A,\nabla Q) = 0$. Therefore, under assumption (3.14), we can rewrite (3.7) as

$$\mathcal{H}(Q + h) - \mathcal{H}(Q) = (A,L_+A) + (B,L_-B) + O(h^3) \tag{3.15}$$

$$= (A_1,L_+A_1) + (B,L_-B) + O(h^3). \tag{3.16}$$

Since $A_1 \perp \mathrm{span}_{\mathbb{R}}\{Q,\nabla Q\}$, we can apply Lemma 3.1 to conclude that $(A_1,L_+A_1) + (B,L_-B) \ge \varepsilon_2\big(\|A_1\|^2 + \|B\|^2\big)$. Since the difference between $\|A_1\|^2$ and $\|A\|^2$ is of higher order, we obtain

$$\mathcal{H}(Q + h) - \mathcal{H}(Q) \ge \varepsilon_2\|h\|^2 + O(h^3). \tag{3.17}$$

The last equation implies the global (modulational) stability of the soliton solution under small perturbations. To see this, suppose the initial data is of the form $Q + h_0$, with $\|Q + h_0\|^2 = \|Q\|^2$ (the last condition always holds, since we can choose a $Q$ with the mass of the initial value). At a later time $t$, we can find $r$ and $\theta$ such that $\psi_t(x - r)e^{-i\theta} = h(x) + Q(x)$ with the mass of the correction, $\|h\|^2$, minimized. One can easily check that $h$ satisfies condition (3.14). By inequality (3.17), $\|h\|^2$ is bounded from above by the left hand side, which is conserved under the time evolution.
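The stability argument of the preceding paragraph amounts to a short chain of inequalities, sketched below. Here $h(t)$ denotes the minimizing correction at time $t$ and $h_0$ the initial one, and $\mathcal{H}$ is the energy functional; the constant $C$ in the last step (boundedness of the energy difference by the $H^1$-norm of a small initial perturbation) is an assumption of this sketch and is not stated in the text.

```latex
\[
\varepsilon_2\,\|h(t)\|^2+O\big(\|h(t)\|^3\big)
 \;\le\; \mathcal H\big(Q+h(t)\big)-\mathcal H(Q)
 \;=\; \mathcal H(Q+h_0)-\mathcal H(Q)
 \;\le\; C\,\|h_0\|_{H^1}^2 .
\]
```

The middle equality combines conservation of energy with the invariance of $\mathcal{H}$ under translations and constant phase changes for $\lambda = 0$.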


3.2. Dynamical linearization of the Hartree equation around solitons. We now return to the Hartree equation (1.5) with external potential $\lambda V(x) = W(\varepsilon x)$. Since our time scale is of order $t \sim \varepsilon^{-1}$, the change in the external potential during the evolution on this time scale may not be small. Thus the argument in the last section no longer applies. We shall show that, nevertheless, the soliton solution is stable on this time scale, and we shall track the motion of the soliton precisely. The Hartree equation (1.5) is

$$i\partial_t\psi = -\frac12\Delta\psi + W(\varepsilon x)\psi - (\Phi*|\psi|^2)\psi =: H(\psi)\psi. \tag{3.18}$$

The solutions we are interested in are of the form

$$\psi(x,t) = \big[Q(x - r(t)) + h_\varepsilon(x - r(t),t)\big]e^{i\theta(x,t)}, \tag{3.19}$$

for $\varepsilon > 0$ small enough, where $Q(x) = Q^{(\varepsilon=0)}(x)$ is a minimizer of the energy functional $\mathcal{H}$, and $h_\varepsilon(x - r(t),t)$ is a small correction term which tends to 0, as $\varepsilon \to 0$; $\theta(x,t)$ is a time-dependent phase of the form $\theta(x,t) = v(t)\cdot(x - r(t)) - Et + \theta_1(t)$. Also, we expect that, to leading order, the velocity $v(t)$ and the location $r(t)$ of the soliton are given by

$$\dot r(t) = v(t),\qquad \dot v(t) = -\varepsilon(\nabla W)(\varepsilon r(t)).$$

For the time being, there is no canonical way to determine corrections to these equations, as the decomposition (3.19) is not unique. We require $v$, $r$, and $\theta_1$ to obey the following equations:

$$\dot r(t) = v(t),\qquad \dot v(t) = -\varepsilon(\nabla W)(\varepsilon r(t)) + a(t),\qquad \dot\theta_1(t) = \frac12v^2(t) - W(\varepsilon r(t)) + \omega(t),$$

where the (vector) acceleration correction $a(t)$ and the (scalar) “angular velocity” correction $\omega(t)$ are of higher order in $\varepsilon$ and will be used for fine adjustment, later on. Their initial values will be discussed in Subsect. 4.5.1, when we adjust the initial datum $h_{\varepsilon,0}$. We now derive the equations for $a$, $\omega$ and $h$. Let $\xi(x,t) = Q(y)e^{i\theta}$, $y = x - r(t)$. By explicit computation,

$$\begin{aligned}
\xi^{-1}\{i\partial_t - H(\xi)\}\xi &= \xi^{-1}\Big[i\partial_t\xi + \frac12\Delta\xi\Big] - W(\varepsilon x) + \Phi*|\xi|^2 \\
&= -\dot\theta_1(t) - \partial_t[v(t)\cdot(x - r(t))] - \frac12v^2(t) + i\big(v(t) - \dot r(t)\big)\cdot\frac{\nabla Q}{Q} + \frac{\Delta Q}{2Q} - W(\varepsilon x) + E + \Phi*|\xi|^2.
\end{aligned}$$

We expand the potential $W$ around the point $r(t)$:

$$W(\varepsilon x) = W(\varepsilon r(t)) + \nabla W(\varepsilon r(t))\cdot\varepsilon(x - r(t)) + \Omega_0(x,t),$$

where the remainder $\Omega_0(x,t)$ is real and, by the mean value theorem,

$$|\Omega_0(x,t)| \le C\varepsilon^2|y|^2, \tag{3.20}$$

where $C = C(W)$ depends on $W$. Recalling the equation for $r$, $v$ and (3.2), we then have

$$\{i\partial_t - H(\xi)\}\xi = -\Omega\,\xi, \tag{3.21}$$

where

$$\Omega = -W(\varepsilon r) + W(\varepsilon x) + \dot v\cdot y + \omega = \Omega_0(x,t) + a(t)\cdot y + \omega(t). \tag{3.22}$$
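The reduction of the explicit expression for $\xi^{-1}\{i\partial_t - H(\xi)\}\xi$ to (3.21), (3.22) can be followed term by term. In the sketch below $\Omega$ denotes the quantity defined in (3.22), $\Phi$ the two-body potential and $\Delta$ the Laplacian; only the equations for $r$, $v$, $\theta_1$ and the stationary equation (3.2) are used.

```latex
% xi = Q(y) e^{i theta},  y = x - r(t),  theta = v.(x - r) - E t + theta_1
\begin{align*}
\xi^{-1}\{i\partial_t-H(\xi)\}\xi
 &= -\dot\theta_1-\dot v\cdot y+v\cdot\dot r-\tfrac12v^2
    +\Big[\tfrac{\Delta Q}{2Q}+\Phi*Q^2+E\Big]-W(\varepsilon x)
    && (\dot r=v)\\
 &= -\dot\theta_1-\dot v\cdot y+\tfrac12v^2-W(\varepsilon x)
    && \text{(bracket vanishes by (3.2))}\\
 &= -\big[W(\varepsilon x)-W(\varepsilon r)+\dot v\cdot y+\omega\big]=-\Omega
    && (\dot\theta_1=\tfrac12v^2-W(\varepsilon r)+\omega).
\end{align*}
% Inserting \dot v=-\varepsilon(\nabla W)(\varepsilon r)+a and the expansion of W(\varepsilon x):
\[
\Omega=\Omega_0(x,t)+a(t)\cdot y+\omega(t).
\]
```

The linear term $\varepsilon\nabla W(\varepsilon r)\cdot y$ of the Taylor expansion is cancelled exactly by the leading part of $\dot v\cdot y$, which is why only the corrections $a$, $\omega$ and the remainder survive.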

Next, we consider $h(y,t) = h_\varepsilon(x - r(t),t)$. Substituting $\psi = (Q + h)e^{i\theta}$ into Eq. (3.18) and canceling $e^{i\theta}$ we get

$$i\big\{(Q + h)(i\partial_t\theta) - \dot r\cdot\nabla(Q + h) + \partial_th\big\} = -\frac12\Delta(Q + h) - iv\cdot\nabla(Q + h) + \frac12v^2(Q + h) + W(\varepsilon x)(Q + h) - (\Phi*|Q + h|^2)(Q + h),$$

where $Q$ and $h$ are taken at $(y,t) = (x - r(t),t)$, that is, $Q = Q(x - r(t)) = Q(y)$ and $h = h(x - r(t),t) = h(y,t)$. Using $\dot r(t) = v(t)$, Eq. (3.2), and

$$\partial_t\theta = \dot v\cdot x + \Big(-\frac12v^2 - \dot v\cdot r - W(\varepsilon r) + \omega(t)\Big) - E = -W(\varepsilon x) + \Omega - \frac12v^2 - E,$$

we obtain

$$i\partial_th = -\frac12\Delta h - Eh + \Omega(Q + h) - \big[(\Phi*|Q + h|^2)(Q + h) - (\Phi*Q^2)Q\big]. \tag{3.23}$$

Treating $\Omega h$ as an error term, we can rewrite this equation as

$$\partial_th = -iLh + G, \tag{3.24}$$

where the operator $L$ is given by (3.4), and the nonlinear part is

$$G = -i\,\Omega(Q + h) - iF(h),\quad \text{with}\quad F(h) = -\big[(\Phi*|h|^2)(Q + h) + (\Phi*[Q(h + \bar h)])\,h\big]. \tag{3.25}$$

In matrix form,

$$\frac{\partial}{\partial t}\begin{pmatrix}A\\B\end{pmatrix} = \begin{pmatrix}0 & L_-\\ -L_+ & 0\end{pmatrix}\begin{pmatrix}A\\B\end{pmatrix} + \begin{pmatrix}\mathrm{Re}\,G\\ \mathrm{Im}\,G\end{pmatrix}. \tag{3.26}$$

We observe that, except for $\Omega_0$ which is part of $\Omega$ (and thus appears in $G$), all quantities in this system are evaluated at $(y,t)$.


3.3. Properties of the linearized flow. We have shown that the linear part in the dynamical linearization of the nonlinear Hartree equation results in the standard linear evolution (3.3) with matrix form given in (3.6). We notice that $L_+$ and $L_-$, the real and imaginary part of $L$, can be reinterpreted as complex-linear operators which turn out to be self-adjoint in the usual $L^2$ space. The operator

$$\mathcal{L} = \begin{pmatrix}0 & L_-\\ -L_+ & 0\end{pmatrix},\quad \text{with}\quad \mathcal{L}^* = \begin{pmatrix}0 & -L_+\\ L_- & 0\end{pmatrix},$$

acting on $H^1 \times H^1$ is, however, not symmetric. Although our functions $A$ and $B$ are real, we shall view $L_-$ and $L_+$ as self-adjoint operators on the Sobolev space $H^1$ of complex-valued functions. The operator $\mathcal{L}$ is, however, not self-adjoint and thus does not have a spectral decomposition. A standard procedure is to decompose the space $H^1 \times H^1$ into a direct sum of its generalized null space, $S := N_g(\mathcal{L}) = \{v : \mathcal{L}^nv = 0 \text{ for some } n\}$, and the orthogonal complement of the generalized null space of its adjoint, i.e., the space $M = N_g(\mathcal{L}^*)^\perp$. It is simple to check that both spaces, $S = N_g(\mathcal{L})$ and $M = N_g(\mathcal{L}^*)^\perp$, are invariant under $\mathcal{L}$. Note that the decomposition $H^1 \times H^1 = M \oplus S$ is, however, not an orthogonal decomposition. Following M. I. Weinstein [13], we want to establish the following picture:

1. $H^1 \times H^1 = M \oplus S$.

2. The $H^1 \times H^1$-norm on $M$ remains uniformly bounded under the linearized flow for all time.

3. The dynamics on $S$ can be computed explicitly.

We use $P_M$ and $P_S$ to denote (non-orthogonal) projections with respect to the decomposition $M \oplus S$. We first establish some spectral properties of $L_+$ and $L_-$.

3.3.1. Generalized null space. We first determine the generalized null space $S = N_g(\mathcal{L})$. We recall Lemma 3.1 and the equations

$$L_-Q = 0,\qquad L_+\nabla Q = 0,\qquad L_-xQ = -\nabla Q.$$

Since $Q \perp \mathrm{span}_{\mathbb{R}}\{\nabla Q\} = N(L_+)$ and $L_+$ is self-adjoint, there exists a solution, $\Gamma_1$, of the equation $L_+\Gamma_1 = Q$. We may assume $\Gamma_1 \perp \nabla Q$ by subtracting its projection on the $\nabla Q$-direction. If $\Gamma_1 \perp Q$, then $(\Gamma_1,Q) = (\Gamma_1,L_+\Gamma_1) > \varepsilon_2(\Gamma_1,\Gamma_1)$, by Lemma 3.1. This contradiction shows $(\Gamma_1,Q) \ne 0$. Now we let $\Gamma = \Gamma_1 + b\nabla Q$ with $b = 2(\Gamma_1,xQ)$. Then $(\Gamma,Q) = (\Gamma_1,Q) \ne 0$, and $(\Gamma,xQ) = 0$. To summarize, we have found a $\Gamma$ such that

$$L_+\Gamma = Q,\qquad (\Gamma,xQ) = 0,\qquad (\Gamma,Q) \ne 0. \tag{3.27}$$

We require $(\Gamma,xQ) = 0$, in order to construct a dual basis on $S$ in Proposition 3.2 below. To determine the generalized null space, we need to find all solutions of the equation $\mathcal{L}^n\binom{u}{v} = \binom{0}{0}$ for some $n$. If $n = 2k$ is even, it is equivalent to $(L_-L_+)^ku = 0$ and $(L_+L_-)^kv = 0$. If $n = 2k + 1$ is odd, it is equivalent to $L_+(L_-L_+)^ku = 0$ and $L_-(L_+L_-)^kv = 0$. We have solved the case $n = 1$ above: It is the span of $\binom{0}{Q}$ and $\binom{\nabla Q}{0}$.

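We next consider the case $n = 2$. The null space of $L_+L_-$ is

$$N(L_+L_-) = L_-^{-1}N(L_+) = N(L_-) \oplus \mathrm{span}_{\mathbb{R}}\{xQ\} = \mathrm{span}_{\mathbb{R}}\{Q,xQ\}. \tag{3.28}$$

Similarly,

$$N(L_-L_+) = N(L_+) \oplus \mathrm{span}_{\mathbb{R}}\{\Gamma\} = \mathrm{span}_{\mathbb{R}}\{\nabla Q,\Gamma\}. \tag{3.29}$$

For the case $n = 3$, we have $N(L_-L_+L_-) = L_-^{-1}N(L_-L_+) = L_-^{-1}\mathrm{span}_{\mathbb{R}}\{\nabla Q,\Gamma\}$. Since $N(L_-) = \mathrm{span}_{\mathbb{R}}\{Q\}$ and $(Q,\Gamma) \ne 0$, $\Gamma$ is not in the range of $L_-$. Thus $L_-^{-1}\mathrm{span}_{\mathbb{R}}\{\nabla Q,\Gamma\} = L_-^{-1}\mathrm{span}_{\mathbb{R}}\{\nabla Q\} = N(L_+L_-)$. This proves that $N(L_-L_+L_-) = N(L_+L_-)$. Similarly, $N(L_+L_-L_+) = N(L_-L_+)$. Therefore, if $\mathcal{L}^n\binom{u}{v} = \binom{0}{0}$ for some $n \ge 2$, then $\mathcal{L}^2\binom{u}{v} = \binom{0}{0}$. Thus we have found a basis for $N_g(\mathcal{L})$. We also have similar statements for $N_g(\mathcal{L}^*)$. Summarizing, we have proved

Proposition 3.2.

$$\begin{aligned}
S = N_g(\mathcal{L}) &= \mathrm{span}_{\mathbb{R}}\Big\{\begin{pmatrix}0\\Q\end{pmatrix},\ \begin{pmatrix}\nabla Q\\0\end{pmatrix},\ \begin{pmatrix}0\\xQ\end{pmatrix},\ \begin{pmatrix}\Gamma\\0\end{pmatrix}\Big\},\\
N_g(\mathcal{L}^*) &= \mathrm{span}_{\mathbb{R}}\Big\{\begin{pmatrix}0\\\Gamma\end{pmatrix},\ \begin{pmatrix}xQ\\0\end{pmatrix},\ \begin{pmatrix}0\\\nabla Q\end{pmatrix},\ \begin{pmatrix}Q\\0\end{pmatrix}\Big\}.
\end{aligned} \tag{3.30}$$

Notice that these vectors are dual bases and we have ordered them correspondingly. In particular, for an arbitrary function $g$ we have

$$P_S(g) = \kappa_1(\mathrm{Im}\,g,\Gamma)\begin{pmatrix}0\\Q\end{pmatrix} + \kappa_2(\mathrm{Re}\,g,xQ)\begin{pmatrix}\nabla Q\\0\end{pmatrix} + \kappa_2(\mathrm{Im}\,g,\nabla Q)\begin{pmatrix}0\\xQ\end{pmatrix} + \kappa_1(\mathrm{Re}\,g,Q)\begin{pmatrix}\Gamma\\0\end{pmatrix},$$

where $\kappa_1 = 1/(Q,\Gamma)$ and $\kappa_2 = 1/(x_jQ,\partial_jQ) = -2$. Also note that we have

$$\mathcal{L}\begin{pmatrix}0\\Q\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix},\quad \mathcal{L}\begin{pmatrix}\nabla Q\\0\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix},\quad \mathcal{L}\begin{pmatrix}0\\yQ\end{pmatrix} = -\begin{pmatrix}\nabla Q\\0\end{pmatrix},\quad \mathcal{L}\begin{pmatrix}\Gamma\\0\end{pmatrix} = -\begin{pmatrix}0\\Q\end{pmatrix}. \tag{3.31}$$

Let $g(t)$ be a solution to the linear evolution (3.3) and denote the projection onto $S$ by

$$P_Sg(t) = \alpha(t)\begin{pmatrix}0\\Q\end{pmatrix} + \beta(t)\begin{pmatrix}\nabla Q\\0\end{pmatrix} + \gamma(t)\begin{pmatrix}0\\xQ\end{pmatrix} + \delta(t)\begin{pmatrix}\Gamma\\0\end{pmatrix}.$$

Then by (3.31) the equations for the coefficients $(\alpha(t),\beta(t),\gamma(t),\delta(t))$ (note $\beta(t)$ and $\gamma(t)$ are vector functions) are given by

$$\begin{pmatrix}0\\Q\end{pmatrix}:\ \dot\alpha = -\delta,\qquad \begin{pmatrix}\nabla Q\\0\end{pmatrix}:\ \dot\beta = -\gamma,\qquad \begin{pmatrix}0\\yQ\end{pmatrix}:\ \dot\gamma = 0,\qquad \begin{pmatrix}\Gamma\\0\end{pmatrix}:\ \dot\delta = 0.$$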
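Because the system for the coefficients is triangular, it integrates at once. The following sketch records the solution; the initial values $\alpha(0)$, $\beta(0)$, $\gamma(0)$, $\delta(0)$ are notation introduced here for illustration.

```latex
\begin{align*}
\gamma(t) &= \gamma(0), & \delta(t) &= \delta(0),\\
\beta(t)  &= \beta(0)-\gamma(0)\,t, & \alpha(t) &= \alpha(0)-\delta(0)\,t .
\end{align*}
```

In particular the component of the linearized flow along $S$ grows at most linearly in $t$, in contrast to the uniformly bounded behavior on $M$ discussed next.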


3.3.2. The flow on M. We have decomposed the space $H^1 \times H^1$ into a direct sum of the generalized null space $S = N_g(\mathcal{L})$ and $M = N_g(\mathcal{L}^*)^\perp$. The generalized null spaces for $\mathcal{L}$ and $\mathcal{L}^*$ are given by Proposition 3.2. Thus, $M$ is the space

$$M = \Big\{\begin{pmatrix}u\\v\end{pmatrix} : u \perp \mathrm{span}_{\mathbb{R}}\{Q,xQ\},\ v \perp \mathrm{span}_{\mathbb{R}}\{\nabla Q,\Gamma\}\Big\}.$$

Since all functions in the space $S = N_g(\mathcal{L})$ and $M^\perp = N_g(\mathcal{L}^*)$ are smooth, the projections $P_S$ and $P_M$ are bounded in any $H^k$ space. Our first aim is to prove

Lemma 3.3 ($H^1$-norm on $M$). 1. If $g \in M$, then $\mathrm{Re}(Lg,g)$ is non-negative and comparable to $\|g\|^2_{H^1}$. 2. If $g(t) = e^{-itL}\phi$ and $0 \ne \phi \in M$, then $\|g(t)\|_{H^1}$ is uniformly bounded below and above.

To prove this lemma, we first show that, for all vectors $\binom{u}{v} \in M$,

$$C^{-1}\|u\|^2_{L^2} \le (u,L_+u),\qquad C^{-1}\|v\|^2_{L^2} \le (v,L_-v), \tag{3.32}$$

for some constant $C$, as follows from Lemma 3.1. In fact, it is sufficient to assume that $u \perp \mathrm{span}_{\mathbb{R}}\{Q,xQ\}$ and $v \perp \mathrm{span}_{\mathbb{R}}\{\Gamma\}$. (As will become clear, we only use $(\Gamma,Q) \ne 0$ and $(xQ,\nabla Q) \ne 0$ in this argument.) For the $v$-part, if $(v,Q) = 0$, the claim follows from Lemma 3.1. Hence we may assume $tv = Q + w$ for some $t \ne 0$, $w \perp Q$. By assumption $0 = (\Gamma,tv) = (\Gamma,Q + w)$, hence $|(\Gamma,Q)| = |(\Gamma,w)| \le \|\Gamma\|_2\|w\|_2$. Therefore we have $\|w\|_2 \ge c_3 > 0$ and

$$\frac{(v,L_-v)}{(v,v)} = \frac{(w,L_-w)}{\|Q\|_2^2 + \|w\|_2^2} \ge \frac{\varepsilon_2\|w\|_2^2}{\|Q\|_2^2 + \|w\|_2^2} \ge \frac{\varepsilon_2c_3^2}{\|Q\|_2^2 + c_3^2}.$$

For the $u$-part, if $(u,\nabla Q) = 0$, the claim follows from Lemma 3.1. Hence we may assume $u = b\nabla Q + w$ for some vector $b \ne 0$ and some $w \perp Q,\nabla Q$. By assumption, $0 = b(xQ,u) = (bxQ,b\nabla Q + w) = C|b|^2 + (bxQ,w)$, with $C \ne 0$. Hence $\|w\|_2 > C|b|$ and

$$\frac{(u,L_+u)}{(u,u)} = \frac{(w,L_+w)}{Cb^2 + \|w\|_2^2} \ge \frac{\varepsilon_2\|w\|_2^2}{Cb^2 + \|w\|_2^2} \ge C\varepsilon_2,$$

by a similar estimate. Hence (3.32) is proved. Now, since $\|\nabla u\|_2$ is bounded by $(u,L_+u)$ and $\|u\|_2$, (and hence by $(u,L_+u)$, see (3.32)), we can replace the norm on the left hand side of (3.32) by the $H^1$-norm (here the $H^1$-norm is the sum of the $L^2$-norm plus the $L^2$-norm of the derivative). Therefore we have proved that, for $\binom{u}{v} \in M$,

$$C^{-1}(u,u)_{H^1} \le (u,L_+u) \le C(u,u)_{H^1},\qquad C^{-1}(v,v)_{H^1} \le (v,L_-v) \le C(v,v)_{H^1}. \tag{3.33}$$

The upper bounds on $(u,L_+u)$ and $(v,L_-v)$ are obvious. Hence the first part of the lemma is proved. The second part follows from the first part and the next lemma, which states that the quantity $(u,L_+u) + (v,L_-v)$, which is equivalent to the $H^1$-norm on $M$, is actually conserved by the linear flow (3.3). We note that $(u,L_+u) + (v,L_-v) = \mathrm{Re}(Lg,g) = \mathrm{Im}(\mathcal{L}g,g)$.


Lemma 3.4. Recall that $\mathcal{L} = -iL$, and $i\mathcal{L} = \mathcal{L}i$, (see (4.5)).

1. $\mathrm{Re}(Lf,g) = \mathrm{Im}(\mathcal{L}f,g) = \mathrm{Im}(\mathcal{L}g,f) = -\mathrm{Im}(f,\mathcal{L}g)$.

2. If $g(t) = e^{-itL}\phi$, then $\mathrm{Im}(\mathcal{L}^kg,g)$ is constant for any integer $k \ge 0$.

3. For any $g(t)$ with $\partial_tg = \mathcal{L}g + G$, one has $\frac{d}{dt}\mathrm{Im}(\mathcal{L}g,g) = 2\,\mathrm{Im}(\mathcal{L}g,G)$.

Proof. All these assertions can be checked by simple computations. We only prove the last one in the following.

$$\frac{d}{dt}\mathrm{Im}(\mathcal{L}g,g) = \mathrm{Im}(\mathcal{L}^2g + \mathcal{L}G,g) + \mathrm{Im}(\mathcal{L}g,\mathcal{L}g + G) = \mathrm{Im}(\mathcal{L}G,g) + \mathrm{Im}(\mathcal{L}g,G) = 2\,\mathrm{Im}(\mathcal{L}g,G). \qquad\square$$

The following two lemmas will be used to prove inequality (3.61) below. (Note $H^k$ denotes the Sobolev space $W^{k,2}$.)

Lemma 3.5. (a) For any $m \ge 1$, $e^{-itL}$ is a bounded map from $M \cap H^m$ into itself. Explicitly, for any $\phi \in M \cap H^m$,

$$\big\|e^{-itL}\phi\big\|_{H^m} \le C_m\|\phi\|_{H^m}.$$

(b) $\mathcal{L} = -iL$, restricted to $M$, has an inverse which is bounded from $M \cap L^2$ to $M \cap H^2$.

Proof. Proof of (a): Let $g(t) = e^{-itL}\phi$. The case $m = 1$ is Lemma 3.3, part 2. If $m \ge 3$ is odd, we have that

$$\|g(t)\|^2_{H^m} \le C\,\mathrm{Im}(\mathcal{L}^mg,g) + C\|g(t)\|^2_{H^{m-2}} \le C\,\mathrm{Im}(\mathcal{L}^m\phi,\phi) + C\|\phi\|^2_{H^{m-2}} \le C\|\phi\|^2_{H^m}.$$

The second inequality uses Lemma 3.4, part 2. (Note: If $m$ is even, $\mathrm{Im}(\mathcal{L}^mg,g) = 0$, and the first inequality fails.) The general case follows from interpolation.

Proof of (b): For $\binom{u}{v} \in M$ we seek $\binom{x}{y} \in M$ such that $\mathcal{L}\binom{x}{y} = \binom{u}{v}$, i.e., $L_-y = u$ and $L_+x = -v$. Notice that $u \perp Q$ and $v \perp \nabla Q$, and the null spaces of the self-adjoint operators $L_-$ and $L_+$ are spanned by $Q$ and $\nabla Q$ respectively. Since 0 is an isolated eigenvalue of $L_-$ and $L_+$, it follows that $L_-^{-1}$ and $L_+^{-1}$ are bounded operators on the orthogonal complements of the null spaces. This proves that $\mathcal{L}$ has a bounded inverse on $M \cap L^2$. To prove the bound, write $w = \binom{u}{v} \in M \cap H^2$. By (3.33)

$$\|u\|^2_{W^{1,2}} \le C(u,L_+u) \le \frac12\|u\|_2^2 + C\|L_+u\|_2^2.$$

Hence $\|u\|_{W^{1,2}} \le C\|L_+u\|_2$. Similarly $\|v\|_{W^{1,2}} \le C\|L_-v\|_2$. Furthermore, write $L_+ = -\frac12\Delta + V$. (The explicit form of $V$ can easily be read from the definition of $L_+$.)

$$\|\Delta u\|_2 = 2\|L_+u - Vu\|_2 \le 2\|L_+u\|_2 + C(V)\|u\|_2 \le C\|L_+u\|_2.$$

Hence $\|u\|_{W^{2,2}} \le C\|L_+u\|_2$. Similarly $\|v\|_{W^{2,2}} \le C\|L_-v\|_2$. We conclude that

$$\|w\|_{W^{2,2}} \le C\|\mathcal{L}w\|_2.$$

The lemma follows by a duality argument. $\square$


J. Fröhlich, T.-P. Tsai, H.-T. Yau

Let
\[ X_k = H^k \cap L^2\big((1 + |y|^{2k})\,dy\big) \tag{3.34} \]
denote the subspace of $H^k$ of functions with prescribed decay at infinity.

Lemma 3.6 (Finite propagation speed). For any integer $k \ge 0$, for any real $m \ge 1$, and for $\phi \in M \cap X_k \cap H^{k+m}$,
\[ \|y^m e^{tL}\phi\|_{H^k} \le C\|y^m \phi\|_{X_k} + C(1 + |t|^m)\|\phi\|_{H^{k+m}}. \tag{3.35} \]

The constant C depends on k and m.

Remark. For the free Schrödinger equation, one need not assume that $\phi \in M$, since Lemma 3.5 (a) always holds.

Proof. Let α be any multi-index with |α| = k. Let $g(t) = e^{-itL}\phi$ and $v(t) = \nabla_y^\alpha g(t)$. We have
\[ \partial_t v = Lv + Ig, \qquad \text{with } I = [\nabla^\alpha, L]. \]
Hence,
\[ \frac{d}{dt}\int y^{2m}|v|^2 = 2\,\mathrm{Re}\int y^{2m}\bar v v_t \le C\int y^{2m-1}|v||\nabla v|\,dy + C\|v\|_2^2 + C\int y^{2m}|v||Ig|. \]
Since I is a localized operator involving derivatives only up to (k − 1)st order (I vanishes for k = 0), the last term is bounded by $\|v\|_2\|g\|_{H^{k-1}} \le C\|\phi\|_{H^k}^2$. Hence, by interpolation,
\[ \frac{d}{dt}\|y^m v\|_2^2 \le C\|y^m v\|_2 \cdot \|y^{m-1}\nabla v\|_2 + C\|\phi\|_{H^k}^2 \le C\|y^m v\|_2^{2(1-1/2m)} \cdot \|v\|_{H^m}^{1/m} + C\|\phi\|_{H^k}^2. \]
Let $f(t) = \|y^m v\|_2^2$ and $N = C\|\phi\|_{H^{k+m}}^2$. By Hölder's inequality,
\[ f' \le f^{1-1/2m}N^{1/2m} + N \le \frac{f}{1+t} + C(1+t)^{2m-1}N, \]
hence $f(t) \le Cf(0) + C(1+t)^{2m}N$, which proves the claim. We will need the case k = 1 when we prove Lemma 3.8. □
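The last step of this proof is a scalar comparison argument, and it can be sanity-checked numerically. The sketch below is illustrative only: it integrates the equality case of the differential inequality for f with a hand-written Runge-Kutta scheme and compares the result with growth of order $(1+t)^{2m}$. The function names, the sample parameters, and the explicit constant $2^{2m}$ are assumptions made for this illustration and are not taken from the paper.

```python
def rhs(f, m, N):
    # Equality case of f' <= f^(1 - 1/(2m)) * N^(1/(2m)) + N.
    return f ** (1.0 - 1.0 / (2 * m)) * N ** (1.0 / (2 * m)) + N


def solve(f0, m, N, t_end, steps=20000):
    # Classical fourth-order Runge-Kutta integration on [0, t_end].
    f, dt = f0, t_end / steps
    for _ in range(steps):
        k1 = rhs(f, m, N)
        k2 = rhs(f + 0.5 * dt * k1, m, N)
        k3 = rhs(f + 0.5 * dt * k2, m, N)
        k4 = rhs(f + dt * k3, m, N)
        f += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return f


def polynomial_bound(f0, m, N, t):
    # One admissible (non-optimal) bound of the form C f(0) + C (1 + t)^(2m) N.
    return 2 ** (2 * m) * (f0 + N * (1.0 + t) ** (2 * m))


if __name__ == "__main__":
    for t in (1.0, 10.0, 50.0):
        print(t, solve(0.3, 2, 1.5, t), polynomial_bound(0.3, 2, 1.5, t))
```

The constant $2^{2m}$ comes from passing to $g = f^{1/2m}$, whose derivative is at most of order $N^{1/2m}$ once $f \ge N$; any other choice of constant with the same growth rate would serve equally well here.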


3.4. The fine adjustment. We first recall the conclusion of dynamical linearization. We decompose the function ψ into the sum
\[ \psi(x, t) = \big[Q(x - r(t)) + h_\varepsilon(x - r(t), t)\big]\,e^{i\theta(x,t)}, \]
where θ(x, t) is a time-dependent phase of the form
\[ \theta(x, t) = v(t)\cdot(x - r(t)) - Et + \theta_1(t), \]
with
\[ \dot r(t) = v(t), \qquad \dot v(t) = -\varepsilon\nabla W(\varepsilon r(t)) + a(t), \qquad \dot\theta_1(t) = \tfrac12 v^2(t) - W(\varepsilon r(t)) + \omega(t). \]
Here the (vector) acceleration correction a(t) and the (scalar) angular velocity correction ω(t) are of smaller order, and we shall determine their values in this subsection. The main correction h satisfies the equation
\[ \frac{\partial}{\partial t}h = Lh + G, \qquad h(0) = h_{\varepsilon,0}, \tag{3.36} \]

with G = −i/(Q + h) − iF (h), / = /0 + ay + ω, /0 = W (εx) − W (εr) − εy∇W (εr), ¯ F = −( ∗ |h|2 )(Q + h) − ( ∗ [Q(h + h)])h. We decompose h(t) into a sum of its components in S and M: h(t) = hS (t) + hM (t). The component in S is a sum of the basis vectors (3.30) 0 0 + β(t) ∇Q + γ (t) yQ + δ(t) 00 . hS (t) = α(t) Q 0 We now consider projections of Eq. (3.36) onto S and M. Taking inner products with the dual basis, (see Proposition 3.2), we obtain the equations on S: 0 ˙ = −δ +κ1 (ImG, 0), (3.37) Q : α ∇Q ˙ (3.38) : β = −γ +κ2 (ReG, yQ), 0 0 (3.39) κ2 (ImG, ∇Q), yQ : γ˙ = 0 (3.40) : δ˙ = κ1 (ReG, Q). 0

The equation on M is ∂ hM = LhM + PM G. ∂t

(3.41)

Notice that /0 = W (εx) − W (εr) − εy∇W (εr) is determined by r(t), which solves r¨ = −ε∇W (εr) + a.


This system is not yet closed, since we still need to determine a and ω, which are used for the fine adjustment. Observe that a and ω appear explicitly in the equation on S only through ImG, that is, ayQ and ωQ. These two terms appear in (3.37) and (3.39), the equations for α and γ. Our strategy is to choose a and ω so that $\dot\alpha = 0$ and $\dot\gamma = 0$. Then $h_S(t)$ has at most linear growth. It is important to understand the orders of these quantities. Assume that h ≤ o(ε). Since the force G contains an external input /0 Q ∼ $\varepsilon^2$, G is of the form $O(h^2) + \varepsilon^2$. The equation for $h_M$, i.e., (3.41), is thus of the form
\[ f' \le f^2 + c^2\varepsilon^2, \qquad c > 1, \tag{3.42} \]
(and we have assumed that we can take care of the linear part). The solutions of this equation can blow up at times of order $(c\varepsilon)^{-1}$. Explicitly, if f(0) = 0 then, in the case of equality, $f(t) = c\varepsilon\tan(c\varepsilon t)$. A more careful examination shows that, due to a cancellation property when integrating in time (which is due to oscillatory behaviour in time), one has $h(t) \sim \varepsilon^{3/2}$. Based on this observation, we will prove that
\[ a(t) \sim \varepsilon^2, \qquad \omega(t) \sim \varepsilon^2, \qquad P_S(h) \sim \varepsilon^3, \qquad P_M(h) \sim \varepsilon^{3/2}. \tag{3.43} \]
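The explicit solution of the model equation (3.42) can be checked numerically. The following is a minimal sketch, assuming the equality case $f' = f^2 + c^2\varepsilon^2$ with $f(0) = 0$; the function names and the sample values of c and ε are hypothetical choices made for illustration. It integrates the equation with a hand-written Runge-Kutta scheme and compares the result with $c\varepsilon\tan(c\varepsilon t)$, whose first singularity sits at $t = \pi/(2c\varepsilon)$, a time of order $(c\varepsilon)^{-1}$.

```python
import math


def rhs(f, c, eps):
    # Right-hand side of the equality case f' = f^2 + c^2 eps^2.
    return f * f + (c * eps) ** 2


def integrate_rk4(c, eps, t_end, steps):
    # Classical fourth-order Runge-Kutta integration from f(0) = 0 up to t_end.
    f, dt = 0.0, t_end / steps
    for _ in range(steps):
        k1 = rhs(f, c, eps)
        k2 = rhs(f + 0.5 * dt * k1, c, eps)
        k3 = rhs(f + 0.5 * dt * k2, c, eps)
        k4 = rhs(f + dt * k3, c, eps)
        f += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return f


def exact_solution(c, eps, t):
    # Closed-form solution with f(0) = 0.
    return c * eps * math.tan(c * eps * t)


def blow_up_time(c, eps):
    # First singularity of tan(c * eps * t).
    return math.pi / (2.0 * c * eps)


if __name__ == "__main__":
    c, eps = 2.0, 0.01
    t_end = 0.5 * blow_up_time(c, eps)
    print(integrate_rk4(c, eps, t_end, 20000), exact_solution(c, eps, t_end))
```

At half the blow-up time the exact solution equals $c\varepsilon$, which gives a convenient reference value for the comparison.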

In the following subsections, we will prove the existence of h(y, t) by proving a priori bounds and using its local existence. It is also possible to prove existence by a contraction mapping argument, as we will do in Sect. 4 for the wave operator. 3.5. Initial value and equations on S. 3.5.1. Initial value. Recall that the initial datum is given by ψ0 (x) = Q (x) + hε,0 (x) eiv0 x . The coordinates of the initial value hε,0 in the S direction can be calculated: 0 Q : α(0) = κ1 (Imhε,0 , 0), ∇Q : β(0) = κ2 (Rehε,0 , yQ), 0 0 yQ : γ (0) = κ2 (Imhε,0 , ∇Q), 0 δ(0) = κ1 (Rehε,0 , Q). 0 : By our assumption on hε,0 , these initial values are of order ε 3/2 , which is too large for our purpose. They can be made smaller by introducing suitable normalization conventions. We first replace Q by Q∗ = Q ψ0 2 , with Q∗ L2 = ψ0 L2 . We then define h1 by the equation ψ0 (x) = Q (x) + hε,0 (x) eiv0 x = Q∗ (x) + h1 (x) eiv0 x .


From the assumption Q∗ 2 = ψ0 2 , we have 2(Q∗ , Reh1 ) = − h1 2 . Next, we want to choose r ∗ , v ∗ , θ ∗ , and write ∗ ∗ ∗ ψ0 (x) = Q∗ x − r ∗ + h∗ x − r ∗ eiv (x−r )+iθ ,

(3.44)

so that PS h∗ is essentially zero. Notice that h∗ is determined once we have chosen r ∗ , v ∗ and θ ∗ : As a function of y = x − r ∗ , ∗ ∗ ∗ h∗ (y) = Q∗ y + r ∗ + h1 y + r ∗ ei[v0 (y+r )−v y−θ ] − Q∗ (y) . The leading term of h∗ is given by (we will choose r ∗ ∼ 0, v ∗ ∼ v0 and θ ∗ ∼ v0 r ∗ ) h∗ (y) ∼ h1 (y) + Q∗ (y) i(v0 − v ∗ )y + i(v0 r ∗ − θ ∗ ) + r ∗ · ∇Q∗ (y) . We can now calculate the initial value of h∗ along the S direction (w.r.t. Q∗ ) as before. The conclusion is 0 ∗ ∗ +κ1 (v0 r ∗ − θ ∗ ) Q∗ , 0 ∗ , Q : α ∼ κ1 Imh1 , 0 ∇Q : β ∗ ∼ κ2 Reh1 , yQ∗ +κ2 k rk∗ ∇k Q∗ , yQ∗ , 0 0 ∗ ∗ ∗ ∗ ∗ yQ : γ ∼ κ2 Imh1 , ∇Q +κ2 (v0 − v ) Q , ∇Q , 0 δ ∗ ∼ κ1 Reh1 , Q∗ . 0 : Since the initial value hε,0 is of order ε 3/2 , h1 is of the same order and we can choose v0 r ∗ − θ ∗ , r ∗ and v0 − v ∗ of order ε3/2 such that α ∗ , β ∗ and γ ∗ vanish to leading order. It is easy to check that the next order is bounded by ε3 . Furthermore, δ ∗ is of order ε 3 as well, thanks to the relation 2(Q∗ , Reh1 ) = − h1 2 . In the remaining part of this section, we will prove Theorem 1.1 with ψ0 of the form (3.44), and PS h∗ ∼ ε3 . The initial values of r(0), v(0), and θ(0) are defined correspondingly. Notice that, by the assumption of the Theorem, (3.12) is satisfied by Q∗ . After this case is proved, the statement in the theorem, with Q = QN0 , can be obtained by defining h as h(y, t) = ψ(x, t)e−iθ − QN0 (y) = (ψe−iθ − Q∗ ) + (Q∗ − QN0 ) = h∗ (y, t) + (Q∗ − QN0 ) = O(ε3/2 ). From now on, we may and will drop the superscript ∗ and assume PS hε,0 ≤ Cc0 ε 3 ,

(3.45)

where ε is sufficiently small: $\varepsilon \le \varepsilon_{-1}$, with $\varepsilon_{-1}$ and C depending only on the initial setting (H, N0 , Q,...) but not on W or T. Equation (3.45) will be used in (3.52) below. We note that the smallness of $c_0$ is only used to find a suitable $h^*(0)$. It is no longer needed afterwards, and hence $c_0$ is independent of T and W. Also note that we may assume $\|h_{\varepsilon,0}\| \le c_0\varepsilon^{1+\sigma}$ for σ ∈ (0, 1/2]. Then we replace $\varepsilon^{3/2}$ in the above argument by $\varepsilon^{1+\sigma}$, and we get a similar conclusion, with (3.45) replaced by $\|P_S h_{\varepsilon,0}\| \le Cc_0\varepsilon^{2+2\sigma}$.


3.5.2. Equations on S. From now on, C denotes a constant which may depend on the quantities (, Q...), but not on W or T . Recall that we want to set α˙ = 0 and γ˙ = 0 in (3.37) and (3.39), which yield equations for a and ω. From the definition of G and the inner product relations (xQ, 0) = 0 and κ1 = 1/(Q, 0), we have κ1 (ImG, 0) = −ω − κ1 (G2 , 0), where G2 = /0 Q + /Reh + ReF (h).

(3.46)

Similarly, from (Q, ∇Q) = 0 and κ2 = 1/(xj Q, ∂j Q), we have κ2 (ImG, ∇Q) = −a − κ2 (G2 , ∇Q). Therefore, in order to have α˙ = 0 and γ˙ = 0, it suffices to set ω = −δ − κ1 (G2 , 0), a = −κ2 (G2 , ∇Q).

(3.47)

With this choice of a and ω, we have α(t) = α(0), γ(t) = γ(0); β(t) and δ(t) are defined by solving the ODEs (3.38) and (3.40), i.e.,
\[ \delta(t) = \int_0^t \kappa_1(\mathrm{Re}\,G(s), Q)\,ds + \delta(0), \tag{3.48} \]
\[ \beta(t) = \int_0^t \kappa_2(\mathrm{Re}\,G(s), yQ)\,ds - \gamma(0)t + \beta(0). \tag{3.49} \]

Let $C_w = 1 + \|W\|_{W^{3,\infty}}$. Then |/0 (x, t)| ≤ $CC_w\varepsilon^2|y|^2$ (cf. (3.20)). Define
\[ \zeta(t) := |a(t)| + |\omega(t)| + \varepsilon^{1/2}\|h(t)\|_{H^1} + C_w\varepsilon^2. \]
(We would like to have that $\zeta(t) = O(\varepsilon^2)$ for $0 \le t \le T\varepsilon^{-1}$.) In the following we work in the time range $[0, t_1]$ where
\[ \zeta(t) \le C_*\varepsilon^2, \qquad \text{with } \varepsilon \le \varepsilon_0 \le (C_* + T + 100)^{-2} \tag{3.50} \]
holds. Here $C_* > C_w$ is a (large) constant to be determined later. Equation (3.50) is true for t = 0 if $C_*$ is sufficiently large with respect to $c_0$. Moreover, if $\zeta(s) < C_*\varepsilon^2$ for some $s < T\varepsilon^{-1}$, then (3.50) holds for a small time interval [s, s + δs] by a local wellposedness result. Our goal is to show that Eq. (3.50) holds for $0 \le t \le T\varepsilon^{-1}$, by requiring $\varepsilon_0$ to be sufficiently small. Our strategy is to show that, indeed, $\zeta(t) \le \tfrac12 C_*\varepsilon^2$ if (3.50) holds. A local wellposedness result then guarantees that (3.50) holds for the whole time range. The quantities ω and a are defined in terms of $G_2$ in (3.47); recall the definition of $G_2$, Eq. (3.46). The leading term in $G_2$, and hence the main term in the definitions of ω and a, is /0 Q, and we have |(/0 Q, 0)| + |(/0 Q, ∇Q)| ≤ $CC_w\varepsilon^2$.


Also |(/(t)Reh(t), 0)| + |(/(t)Reh(t), ∇Q)| ≤ C(Cw ε 2 + |a(t)| + |ω(t)|) h 2 ≤ Cε −1/2 ζ (t)2 . From the assumption (1.27) on , we have for a general φ ∈ H 1 , ∗ |φ|2 φ 1 ≤ ∗ |φ|2 ∞ · φ H 1 + (∇) ∗ |φ|2 H

L∞

L

From the Young inequality, we have ∗ |φ|2 ∞ ≤ L∞ |φ|2 L

L1

· φ L2 .

= L∞ φ 2L2 .

Similarly, we can bound the term with replaced by ∇. Thus we have proved that

F (φ) H 1 ≤ C φ 2H 1 + C φ 3H 1 for some constant depending on . Hence we can bound (F (h(t)), 0) and (F (h(t)), ∇Q) by |(F (h(t)), 0)| + |(F (h(t)), ∇Q)| ≤ C h 2H 1 + C h 3H 1 ≤ Cε −1 ζ (t)2 . Under assumption (3.50), we have thus proved that |ω(t)| + |a(t)| ≤ CCw ε 2 + Cε −1 ζ (t)2 .

(3.51)

To estimate β and δ, we note that ReG = (/0 + ay + ω)Imh + ImF(h). Since we are only interested in the inner products of ReG with Q or yQ, and Q has exponential decay, we can treat y as being of order one. Thus we have the bound
\[ |\beta(t)| + |\delta(t)| \le Cc_0(T + 1)\varepsilon^3 + \int_0^t \varepsilon^{-1}\zeta(s)^2\,ds, \tag{3.52} \]

where we have used (3.45) and εt ≤ T . 3.6. Modified linear operator on M. It is important to observe that / is not bounded. In fact, (3.53) / =W (εx) − W (εr) − εy∇W (εr) + ay + ω = O ε 2 (y 2 + 1) = O (1 + ε|y|) ,

(3.54)

depending on whether we use Taylor expansion. In either case / is not bounded. This makes the term −i/h in the nonlinear term G hard to control, although the term −i/Q stays fine since Q is localized. By a finite propagation speed estimate we will see that / is of order 1. However, −i/h still cannot be considered an error term. To overcome this difficulty, we will include this term in the linear operator.


Explicitly, Eq. (3.41) for hM can be rewritten as ∂t hM = (L + PM 1i /)hM + PM G, = −i/(Q + hS ) − iF (h). G

(3.55)

Hence we must consider the solution propagator P(s, t) which solves the following problem: If u(t) = P(s, t)φ, then u is a solution of the equation ∂t u(t) = (L + PM 1i /)u(t),

u(s) = φ.

We note that the operator L + PM 1i / leaves M and S invariant; but we will primarily consider P(s, t) on M. Now the equation for hM can be written as t hM (t) = + P(0, t)hM (0). P(s, t)PM G(s)ds (3.56) 0

into the sum of a main part, φ(s), and a remainder, PM G3 (s), We decompose PM G where φ(s) = PM (−i/(s)Q) = PM (−i/0 (s)Q),

G3 = −i/hS − iF (h).

The following lemma provides a basic estimate on the propagator P(s, t).

Lemma 3.7. Assume (3.50) is true for $0 \le t \le T\varepsilon^{-1}$. For $\phi \in M$,
\[ \|P(s, t)\phi\|_{H^1} \le C_{10}\|\phi\|_{H^1} \]
for $0 \le s \le t \le T\varepsilon^{-1}$, where $C_{10} = e^{CC_wT}$ is independent of ε.

We shall prove this lemma in the next subsection. Assuming this lemma and recalling that $G_3(s)$ is of order $h^2 + h^3$, we can bound the contribution of $G_3(s)$ to $h_M$ by
\[ \Big\|\int_0^t P(s, t)P_M G_3(s)\,ds\Big\|_{H^1} \le CC_{10}\int_0^t \varepsilon^{-1}\zeta(s)^2\,ds. \]

The key observation is the following lemma, which takes into account cancellations in the time integration.

Lemma 3.8 (Cancellation). Assume (3.50) is true for $0 \le t \le T\varepsilon^{-1}$. Let $\phi \in M \cap X_3$. For $1 \ll t \le T\varepsilon^{-1}$, we have that
\[ \Big\|\int_0^t P(s, t)\phi\,ds\Big\|_{H^1} \le C_{12}t^{1/2}\|\phi\|_{X_3} \]
for a constant $C_{12} = C_{12}(W, T)$ independent of ε. Furthermore, for $\phi(t) : [0, T\varepsilon^{-1}] \to M \cap X_3$,
\[ \Big\|\int_0^t P(s, t)\phi(t)\,ds\Big\|_{H^1} \le C_{12}t^{1/2}\sup_s\|\phi(s)\|_{X_3} + C_{12}\,t\sup_{|s-\sigma| \le t^{1/2}}\|\phi(s) - \phi(\sigma)\|_{H^1}. \tag{3.57} \]
The space $X_3$ has been defined in (3.34).


We also claim the following bounds on the main term φ(s) = PM (−i/0 (s)Q),

φ(s) X3 ≤ CCw ε 2 ,

φ(s) − φ(σ ) H 1 ≤ CCw ε 3 |s − σ |.

(3.58)

We will prove the lemma and the claim in next subsection. Assuming Lemma 3.8 and the claim, we get t P(s, t)PM (−i/0 (s)Q)ds

H1

0

≤ C12 t 1/2 CCw ε 2 + C12 tCCw ε 3 t 1/2 ≤ C13 ε 3/2 ,

where $C_{13} = CC_{12}C_w(T + 1)^{3/2}$. Hence, by (3.56), $h_M(t)$ is bounded by
\[ \|h_M(t)\|_{H^1} \le C_{13}\varepsilon^{3/2} + CC_{10}\int_0^t \varepsilon^{-1}\zeta(s)^2\,ds + Cc_0\varepsilon^{3/2}. \tag{3.59} \]
Recall that $\zeta(t) = |a(t)| + |\omega(t)| + \varepsilon^{1/2}\|h(t)\|_{H^1}$. Then we can combine all these bounds, (3.51), (3.52), and (3.59), to obtain the following estimate:
\[ \zeta(t) \le C\big(C_w + c_0(1 + \varepsilon^{1/2}T) + C_{13}\big)\varepsilon^2 + C\varepsilon^{-1}\zeta^2(t) + CC_{10}\int_0^t \varepsilon^{-1}\zeta(s)^2\,ds \]
\[ \le C\varepsilon^2\big(C_w + c_0(1 + \varepsilon^{1/2}T) + C_{13} + C_{10}C_*^2\varepsilon(1 + T)\big) \le C\varepsilon^2C_{14}, \]
where $C_{14} = C_w + 2c_0 + C_{13} + 1$, if we require $\varepsilon^{1/2}T \le 1$ and $C_{10}C_*^2\varepsilon(1 + T) < 1$, in addition to assumption (3.50). We now choose $C_* = 2CC_{14}$ and then $\varepsilon_0$ such that
\[ \varepsilon_0 \le (C_* + 100)^{-2}, \qquad \varepsilon_0^{1/2}T \le 1, \qquad C_{10}C_*^2\varepsilon_0(1 + T) < 1. \]
With these choices, we have proven that
\[ \zeta(t) \le \tfrac12 C_*\varepsilon^2 \tag{3.60} \]
under assumption (3.50) that $\zeta(t) \le C_*\varepsilon^2$. Suppose that $[0, t_1]$ is the maximal time interval such that (3.50) holds and $t_1 < T\varepsilon^{-1}$. Then equality must hold at $t = t_1$ by local existence and continuity, and ζ(t) must be slightly less than $C_*\varepsilon^2$ for some $t < t_1$. This is a contradiction to (3.60), and hence (3.50) holds true for all $t \le T\varepsilon^{-1}$.


3.7. Proofs of lemmas. 3.7.1. Proof of Lemma 3.7. Here we prove that the flow given by P(s, t) is bounded in M: Proof. Let u(t) = P(s, t)φ ∈ M, and f (t) = Im(Lu(t), u(t)) ≥ 0. Recall the second assertion of Lemma 3.4: It implies that fˆ(t) = Im(Lg(t), g(t)), with g(t) := et L φ, is constant. We propose that f (t) does not grow in t very fast, for s, t ∈ (0, T ε −1 ). More precisely, we will prove that d f (t) ≤ Cεf (t), dt which implies f (t) ≤ Cf (s), and hence Lemma 4.7 follows. We recall the third assertion of Lemma 3.4. In our case, ∂t u = Lu + PM 1i /u, hence d f (t)/2 = Im(Lu, PM 1i /u) = Im(Lu, −i/u) − Im(Lu, PS (−i/u)) dt 1 = Im ∇ u(∇/)u ¯ + O(ε 2 u 22 ) − Im(Lu, PS (−i/u)) . 2 j (e , −i/u)ej = If {ej } and {ej } denote dual bases of S, then PS (−i/u) = (i/ej , u)ej . Hence PS (−i/u) H 1 ≤ Cε2 u L2 , and Im(Lu, PS (−i/u)) ≤ C u H 1 · PS (−i/u) H 1 ≤ Cε2 u 2H 1 . Since ∇/ ∞ = ε∇W (εx) − ε∇W (εr) + a ∞ ≤ 2Cw ε + C∗ ε 2 , (with no y depen dence), the term Im 21 ∇ u(∇/)u ¯ dominates, and d f (t) ≤ Im ∇ u(∇/)u ¯ + Cε 2 u 2H 1 ≤ (2Cw ε + Cζ (t)) u 2H 1 ≤ CCw εf. dt The last inequality follows from (3.50) and Lemma 3.3. Hence f (t) ≤ eCCw εt f (0) ≤ eCCw T f (0) for t ≤ T ε−1 .

□

3.7.2. Proof of Lemma 3.8. Next we prove the key cancellation lemma. The cancellation is due to oscillatory behavior in time. We first prove a variant of Lemma 3.8 for the original flow $e^{tL}$, which will help us to visualize the oscillation. Then we will prove a weaker result for the modified flow in Lemma 3.8.

Suppose ρ(t) ∈ M satisfies ρ(t) = O(1) and $\frac{d}{dt}\rho(t) = O(\varepsilon)$ in $H^1$. (One such example is $\varepsilon^{-2}$ /0 (t)Q.) Then there is a C > 0 such that
\[ \Big\|\int_0^t e^{(t-s)L}\rho(s)\,ds\Big\|_{H^1} \le C(1 + \varepsilon t). \tag{3.61} \]
By Lemma 3.5, $L^{-1}$ is defined on M and commutes with $e^{(t-s)L}$. Thus
\[ \int_0^t e^{(t-s)L}\rho(s)\,ds = \int_0^t \frac{d}{ds}\big[-e^{(t-s)L}L^{-1}\big]\rho(s)\,ds \]
\[ = \big[-e^{(t-s)L}L^{-1}\rho(s)\big]_0^t + \int_0^t e^{(t-s)L}L^{-1}\frac{d}{ds}\rho(s)\,ds \]
\[ = O(1) + \int_0^t e^{(t-s)L}O(\varepsilon)\,ds = O(1) + O(\varepsilon t), \qquad \text{in } H^1. \]

Here we have used Lemma 3.3. (Notice the analogy with the integration of $e^{it}$, which does not increase the order of $e^{it}$ because of its oscillation.) Now we prove the lemma.

Proof. Choose $\tau \sim t^{1/2} \gg 1$. We have
\[ \int_0^t P(s, t)\phi\,ds = \sum_j \int_{j\tau}^{(j+1)\tau} P(s, t)\phi\,ds = \sum_j P((j+1)\tau, t)\int_{j\tau}^{(j+1)\tau} P(s, (j+1)\tau)\phi\,ds. \]

We write each summand as (j +1)τ P(s, (j + 1)τ )φds ≡ (I) jτ

=

(j +1)τ jτ

+

e((j +1)τ −s)L φds

(j +1)τ

jτ

(j +1)τ s

P(σ, (j + 1)τ )PM 1i /(σ )e(σ −s)L φdσ ds

≡ (II) + (III). We have

(II) H 1 ≤ C φ H 1 (1 + ετ ) ≤ C φ H 1 by (3.61) and (3.50). For (III), since φ is localized, we expect it is not affected much by the large potential in PM 1i /(σ ) for large y. To prove this, we use the finite propagation speed estimate (3.35): For s ∈ (0, τ ), PM 1i /(·)es L φ 1 ≤ C Cw ε 2 y 2 + C∗ ε 2 (1 + |y|) es L φ 1 H H 2 2 2 ≤ CCw ε (1 + y )φ 1 + (1 + s ) φ H 3 H 2 + CC∗ ε (1 + |y|)φ H 1 + (1 + s) φ H 2 .


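Hence
\[ \|(\mathrm{III})\|_{H^1} \le CC_{10}\varepsilon^2\tau^2\Big[(C_w + C_*)\|(1 + y^2)\phi\|_{H^1} + (C_w\tau^2 + C_*\tau)\|\phi\|_{H^3}\Big] \le CC_{10}C_w\|\phi\|_{X_3}, \]
since $\tau^2\varepsilon < 2$ and $\varepsilon^{1/2}C_* \le 1$, see (3.50). Therefore $\|(\mathrm{I})\|_{H^1} \le C_{11}\|\phi\|_{X_3}$ with $C_{11} = C + CC_{10}C_w$, and
\[ \Big\|\int_0^t P(s, t)\phi\,ds\Big\|_{H^1} \le \sum_j C_{10}C_{11}\|\phi\|_{X_3} \le CC_{10}C_{11}t^{1/2}\|\phi\|_{X_3}. \]
Next, for a suitably localized function $\phi(t) \in M \cap X_3$,
\[ \Big\|\int_0^t P(s, t)\phi(t)\,ds\Big\|_{H^1} = \Big\|\sum_j P((j+1)\tau, t)\int_{j\tau}^{(j+1)\tau} P(s, (j+1)\tau)\big[\phi(j\tau) + \phi(s) - \phi(j\tau)\big]\,ds\Big\|_{H^1} \]
\[ \le \sum_j C_{10}C_{11}\|\phi(j\tau)\|_{X_3} + \sum_j C_{10}^2\tau\sup_{|s-\sigma| \le \tau}\|\phi(s) - \phi(\sigma)\|_{H^1} \]
\[ \le C_{12}t^{1/2}\sup_s\|\phi(s)\|_{X_3} + C_{12}\,t\sup_{|s-\sigma| \le t^{1/2}}\|\phi(s) - \phi(\sigma)\|_{H^1}, \]
with $C_{12} = CC_{11}^2 = C(1 + C_we^{CC_wT})^2$. □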
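The cancellation mechanism behind (3.61) can be illustrated with a scalar toy model. In the sketch below, which is an assumption-laden illustration rather than part of the proof, the flow $e^{(t-s)L}$ is replaced by the scalar phase $e^{i(t-s)}$ and ρ by the slowly varying function $\rho(s) = \cos(\varepsilon s)$, so that $|\rho| \le 1$ and $|\rho'| \le \varepsilon$. Integrating by parts exactly as in the derivation of (3.61) gives the bound $2 + \varepsilon t$, far below the trivial bound t. The function name and parameter values are hypothetical.

```python
import cmath
import math


def duhamel_integral(t, eps, n=20000):
    # Composite Simpson rule (n must be even) for
    #   I(t) = int_0^t exp(i (t - s)) * rho(s) ds,  with rho(s) = cos(eps * s).
    h = t / n
    total = 0j
    for k in range(n + 1):
        s = k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * cmath.exp(1j * (t - s)) * math.cos(eps * s)
    return total * h / 3.0


if __name__ == "__main__":
    eps = 0.01
    for t in (50.0, 100.0, 200.0):
        # Compare |I(t)| with the integration-by-parts bound and the trivial bound.
        print(t, abs(duhamel_integral(t, eps)), 2.0 + eps * t, t)
```

The boundary terms contribute at most $|\rho(t)| + |\rho(0)| \le 2$ and the remaining integral at most $\varepsilon t$, which mirrors the O(1) + O(εt) structure of the operator-valued computation.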

This estimate is mainly used for φ(s) = PM (−i/0 (s)Q). 3.7.3. Proof of claim (3.58). We rewrite /0 in the form /0 (x, t) = W (εr + εy) − W (εr) − ∇W (εr) · εy 1 {∇W (εr + uεy) · εy} du − ∇W (εr) · εy =

0

1 1

0

0

=

∇ 2 W (εr + vuεy) : εy ⊗ uεy dvdu.

From the first line we have that ∇ 3 /0 ∞ ≤ ε3 ∇ 3 W ∞ . From the third line we obtain /0 e−ν|y| ≤ ε2 ∇ 2 W . Hence, for φ(s) = PM (−i/0 (s)Q), we have that ∞ ∞ 3

φ(s) X3 ≤ C ∇ /0 + C /0 e−ν|y| ≤ C W W 3,∞ ε 2 , ∞

∞

e−ν|y|

where the factor is due to the exponential decay of Q. Furthermore, since 2 ∇ W (εr(s) + vuεy) − ∇ 2 W (εr(σ ) + vuεy) ≤ sup |∇ 3 W | · ε|r(s) − r(σ )| and |r(s) − r(σ )| ≤ CC∗ |s − σ |, (note |v(t)| ≤ CC∗ ), we conclude that

φ(s) − φ(σ ) L2 ≤ C ∇ 3 W C∗ ε 3 |s − σ |. By rewriting ∇/0 (x, t) = for ∇[φ(s) − φ(σ )] L2 .

1 0

∞

∇ 2 W (εr(t) + uεy) · ε 2 y du, we get the same bound
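The double-integral form of the second-order Taylor remainder used in this proof can be verified numerically. The sketch below is a one-dimensional illustration under stated assumptions: W is replaced by the sample function sin, and the function names, grid size, and parameter values are hypothetical. It compares the remainder $W(\varepsilon r + \varepsilon y) - W(\varepsilon r) - \varepsilon y\,W'(\varepsilon r)$ with a midpoint-rule approximation of $\int_0^1\!\int_0^1 W''(\varepsilon r + vu\varepsilon y)\,(\varepsilon y)(u\varepsilon y)\,dv\,du$.

```python
import math


def remainder_direct(W, dW, eps, r, y):
    # Second-order Taylor remainder of W at eps*r, evaluated at eps*r + eps*y.
    return W(eps * r + eps * y) - W(eps * r) - eps * y * dW(eps * r)


def remainder_integral(d2W, eps, r, y, n=200):
    # Midpoint rule on [0,1]^2 for W''(eps*r + v*u*eps*y) * (eps*y) * (u*eps*y).
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            total += d2W(eps * r + v * u * eps * y) * (eps * y) * (u * eps * y)
    return total * h * h


if __name__ == "__main__":
    eps, r, y = 0.1, 0.7, 3.0
    direct = remainder_direct(math.sin, math.cos, eps, r, y)
    integral = remainder_integral(lambda x: -math.sin(x), eps, r, y)
    print(direct, integral)
```

Since the integrand carries the factor $u(\varepsilon y)^2$, the representation immediately yields a bound of the form $\tfrac12\varepsilon^2|y|^2\sup|W''|$, which is the kind of estimate the claim relies on.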


4. Møller Wave Operator In this section we prove Theorem 1.2. We assume for simplicity that the space dimension n = 3. All arguments can be modified easily to n > 3. In the main argument of this section, we assume v0 = 0 and work with the profile ξ∞ = has,0 , with ξˆ∞ (0) = 0. At the end of this section we deal with general v0 by applying a Galilei transform. In either case, we have has,0 (x) = ξ∞ (x)eiv0 ·x , and hˆ as,0 (v0 ) = ξˆ∞ (0) = 0. 4.1. Dynamical linearization. We recall the Hartree equation 1 i∂t ψ = − 9ψ − ( ∗ |ψ|2 )ψ, 2 and the equation for the ground state Q, 1 − 9Q − ( ∗ Q2 )Q = EQ. 2 We consider solutions of the Hartree equation of the form ψ = (Q(y) + h(y, t))eiθ(y,t) , where y = x − r(t),

r˙ (t) = v(t),

θ (y, t) = v(t)y − Et + θ1 (t),

v(t) ˙ = a(t), ∞ 1 2 θ1 (t) = − v + ω ds . 2 t

The argument here is the same as that in Subsect. 4.2, with W ≡ 0. We obtain the equation for h: ∂t h = Lh + G(h), where the linear part 1 1 −29 − E + A , i ¯ A(h) = −( ∗ Q2 )h − Q( ∗ (Q(h + h))), L=

and the nonlinear part 1 {/(Q + h) + F (h)} , / = ω + ay, i ¯ F (h) = −( ∗ |h|2 )(Q + h) − ( ∗ [Q(h + h)])h. G=

We take projections of Eq. (4.1) onto S and M. The equations on S are 0 ˙ = −δ +κ1 (ImG, 0), Q : α ∇Q : β˙ = −γ +κ2 (ReG, yQ), 0 0 κ2 (ImG, ∇Q), yQ : γ˙ = 0 : δ˙ = κ1 (ReG, Q). 0

(4.1)


(See Proposition 4.2.) The equation on M is
\[ \partial_t h_M = Lh_M + P_M G(h). \]
Next we consider the wave operator. Given a profile $\xi_\infty$ at t = ∞, we hope to find a function h(y, t) such that $h(y, t) - e^{tL_0}\xi_\infty \to 0$, as t → ∞, in a sense to be made more precise. Here $L_0 = -i\big(-\tfrac12\Delta - E\big)$, so that $L = L_0 - iA$. Our strategy is to write h(·, t) = ξ(·, t) + g(·, t), where ξ(t) is the main term, which satisfies a linear equation and has the desired profile explicitly; g(t) is an error and converges to zero, as t → ∞, in a suitable sense. In view of the equation for h, we would like ξ to satisfy the linear equation
\[ \partial_t \xi = L\xi + P_M J\xi, \qquad \xi(t) \in M, \tag{4.2} \]
with $\xi(t) \to e^{tL_0}\xi_\infty$, as t → ∞. The operator J is a modification of the multiplication operator −i/ and is to be defined later in (4.9). Define the propagator P(s, t) such that u(t) := P(s, t)φ solves the equation
\[ \partial_t u = Lu + P_M Ju, \qquad u(s) = \phi \in M. \]
Clearly, P(s, t) leaves the space M invariant so that u ∈ M. Note that t < s in this section, cf. Sect. 3. We define ξ to be given by
\[ \xi(t) = P_M e^{tL_0}\xi_\infty - \int_t^\infty P(s, t)P_M\Big[\frac{1}{i}A + P_M J(s)\Big]e^{sL_0}\xi_\infty\,ds. \tag{4.3} \]
We have that ξ ∈ M, by definition, and that ξ satisfies (4.2) (differentiate (4.3) and use that $[L, P_M] = 0$). We shall prove later on that
\[ \xi(t) \to e^{tL_0}\xi_\infty \quad \text{in } L^2, \text{ as } t \to \infty, \tag{4.4} \]
under the assumption
\[ \hat\xi_\infty(0) = 0. \tag{4.5} \]

The potential / = ω + ay is unbounded and complicates the analysis. One may prove certain finite propagation speed estimates, so that y is effectively cut off, as in Sect. 3. Alternatively, we can modify the form of ψ so that the unbounded potential is cut off. We shall follow the second option in this section. Specifically, we would like h not to “see” the fast phase change vy in θ when y is large. Let χ (·) be a smooth cutoff function with χ (x) = 1, for |x| ≤ 1, and χ (x) = 0, for |x| ≥ 2. We consider ψ of the form: ψ = Q(y)eiθ + h(y, t)ei(χvy−Et+θ1 ) = Q + µ−1 h eiθ ,


where θ = vy − Et + θ1 , µ = exp(i(1 − χ )vy), y = x − r(t) and χ = χ (C∗ y/t), (C∗ > 0 is a constant to be chosen later). Then µ−1 h satisfies (4.1) ∂t (µ−1 h) = L(µ−1 h) − i/(Q + µ−1 h) − iF (µ−1 h).

(4.6)

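Now $\mu\,\partial_t(\mu^{-1}h) = \partial_t h + h\,\partial_t(-i(1 - \chi)vy)$ and $\partial_t(-i(1 - \chi)vy) =: -i(ay + J^{(1)})$, where
\[ J^{(1)} = -\chi ay - (1 - \chi)v^2 + (\nabla\chi)(vt + y)t^{-2}vy. \]
Also $\mu L(\mu^{-1}h) = Lh + \mu[L, \mu^{-1}]h$. Explicit computation gives
\[ \mu\nabla\mu^{-1} = i\big[-(1 - \chi)v + (\nabla\chi)t^{-1}vy\big] = iJ^{(2)}, \]
\[ \mu\Delta\mu^{-1} = -(J^{(2)})^2 + i\nabla\cdot J^{(2)}, \qquad \nabla\cdot J^{(2)} = 2(\nabla\chi)\cdot t^{-1}v + (\Delta\chi)t^{-2}vy. \]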
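The identity $\mu\Delta\mu^{-1} = -(J^{(2)})^2 + i\nabla\cdot J^{(2)}$ holds for any smooth phase, and it can be checked numerically in one space dimension. The sketch below makes several assumptions that are not in the paper: the compactly supported cutoff is replaced by the Gaussian stand-in $\chi(x) = e^{-x^2}$ evaluated at $x = y/t$ (the identity itself does not use compact support), the constant $C_*$ in the argument of χ is set to 1, and all function names and sample values are hypothetical.

```python
import cmath
import math


def phase(y, v, t):
    # S(y) = (1 - chi(y/t)) * v * y with chi(x) = exp(-x^2), so that mu = exp(i S).
    return (1.0 - math.exp(-(y / t) ** 2)) * v * y


def J2(y, v, t):
    # J2 = -(1 - chi) v + chi'(y/t) t^{-1} v y, which equals -S'(y).
    e = math.exp(-(y / t) ** 2)
    return -v * (1.0 - e) - 2.0 * v * (y / t) ** 2 * e


def dJ2(y, v, t):
    # Derivative of J2 with respect to y, computed by hand for the Gaussian stand-in.
    e = math.exp(-(y / t) ** 2)
    return -v * e * (6.0 * y / t ** 2 - 4.0 * y ** 3 / t ** 4)


def mu_laplace_mu_inv(y, v, t, h=1e-4):
    # mu * (d^2/dy^2) mu^{-1}, with the second derivative taken by central differences.
    def mu_inv(z):
        return cmath.exp(-1j * phase(z, v, t))

    second = (mu_inv(y + h) - 2.0 * mu_inv(y) + mu_inv(y - h)) / h ** 2
    return cmath.exp(1j * phase(y, v, t)) * second


if __name__ == "__main__":
    v, t = 0.3, 5.0
    for y in (-4.0, 0.5, 2.0, 7.0):
        print(y, mu_laplace_mu_inv(y, v, t), -J2(y, v, t) ** 2 + 1j * dJ2(y, v, t))
```

The check works because $\mu^{-1} = e^{-iS}$ gives $(\mu^{-1})'' = \mu^{-1}(-iS'' - S'^2)$, and $J^{(2)} = -S'$ in this one-dimensional setting.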

Recall that L = −i(−9/2 − E + A). Thus, iµ [9, µ−1 ]h − iµ[A, µ−1 ]h 2 i (2) 2 ∇ · J (2) = − (J ) − h − J (2) · ∇h + J (3) h, 2 2

µ[L, µ−1 ]h =

where ¯ − iQ ∗ [Q(h + h)] ¯ J (3) h := −iµ[A, µ−1 ]h = iµQ ∗ [Q(µ−1 h + µh)]. This yields the following equation for h: ∂t h = Lh + J h − i/µQ − iµF (µ−1 h),

(4.7)

where J h = −iω + iJ

(1)

i ∇ · J (2) − (J (2) )2 − 2 2

h − J (2) · ∇h + J (3) h.

Notice that J depends on ω, a and v with $\dot v(t) = a(t)$. Throughout the rest of this section we assume that there is a constant $C_*$ such that
\[ t^3|a(t)| + t^2|\omega(t)| \le C_*. \tag{4.8} \]
We shall prove later on that this assumption holds. Under this assumption, one finds that
\[ \|J^{(1)}\|_\infty \le O(t^{-2}), \qquad \|J^{(2)}\|_\infty \le O(t^{-2}), \qquad \|\nabla\cdot J^{(2)}\|_\infty \le O(t^{-3}), \qquad \|J^{(3)}\|_\infty \le O(e^{-t}). \]
We write
\[ J = J_a + J_b\cdot\nabla + J_c, \tag{4.9} \]
with
\[ J_a = i\big[-\omega - \chi ay + (\nabla\chi)t^{-2}(vy)y\big], \]
\[ J_b = -J^{(2)} = -\big[-(1 - \chi)v + (\nabla\chi)t^{-1}vy\big], \]
\[ J_c = -i(1 - \chi)v^2 + i(\nabla\chi)vt^{-1}(vy) - \frac{i}{2}(J^{(2)})^2 - \frac{\nabla\cdot J^{(2)}}{2} + J^{(3)}. \]
Note that $J_b$ is real. Furthermore, the only appearance of µ in J is in $J^{(3)}$, which is exponentially small. Assuming the bound (4.8) on a and ω, we can check the following bounds on J:
\[ \|J_a\|_\infty + \|J_b\|_\infty \le O(t^{-2}), \qquad \|J_c\|_\infty \le O(t^{-3}). \tag{4.10} \]

Once J (t) is defined, so is ξ(t) by (4.2). We can now use (4.7) and (4.2) to obtain an equation for g := h − ξ : ∂t g = (L + J )g + PS J ξ − i/µQ − iµF (µ−1 (ξ + g)).

(4.11)

−1 G(1) µ := J gS − i/(µ − 1)Q − iµF (µ (ξ + g)).

(4.12)

Let

(1)

Since −i/Q ∈ S, we have that PM G = PM J gM + PM Gµ , and the equation for g on M is gM (t) = −

t

∞

P(s, t)PM G(1) µ ds,

(4.13)

Let −1 G(2) µ := J g + PS J ξ − i/(µ − 1)Q − iµF (µ (ξ + g)).

(4.14)

(2)

Then PS G = −i/Q + PS Gµ , and the equations on S are 0 Q

∇Q

0 0 yQ

0 0

:

α˙ = −δ − ω +κ1 (ImG(2) µ , 0),

:

β˙ = −γ

:

γ˙ =

:

δ˙ =

+κ2 (ReG(2) µ , yQ), −a+κ2 (ImG(2) µ , ∇Q), κ1 (ReG(2) µ , Q).

Here we have used that κ1 (−/Q, 0) = −ω, κ2 (−/Q, ∇Q) = −a.

(4.15)


4.2. Bounds on the Propagator P(s, t). The following lemma shows that P(s, t) conserves the $H^1$-norm in M.

Lemma 4.1. Assume the bound (4.8). Then P(s, t) is bounded in $M \cap H^k$, k = 1, 2, 3. More precisely, there is a constant C such that for any $C_*$ (the bound in (4.8)), any T ≥ 1, and any $\phi \in M \cap H^k$, we have
\[ \|P(s, t)\phi\|_{H^k} \le e^{CC_*/T}\|\phi\|_{H^k}, \qquad k = 1, 2, 3, \]
provided that s, t ≥ T. (The larger T is, the better the estimate.)

Proof. We first consider the case k = 1. Assume u(t) ∈ M, $\partial_t u(t) = Lu + P_M J(t)u$, u(s) = φ. Let f(t) = Im(Lu, u) ≥ 0. Then
\[ \frac{d}{2dt}f(t) = \mathrm{Im}(Lu, P_M Ju) = \mathrm{Im}(Lu, Ju) - \mathrm{Im}(Lu, P_S Ju). \]
Here we have used Lemma 4.3. Note $|(Lu, P_S Ju)| \le CC_*t^{-2}\|u\|_{L^2}^2$ and $\mathrm{Im}(Lu, Ju) = C\,\mathrm{Re}(\Delta u, J_b\cdot\nabla u) + O(t^{-2})\|u\|_{H^1}^2$; also (recall $J_b$ is real)
\[ 2\,\mathrm{Re}(\Delta u, J_b\cdot\nabla u) = -\int J_b\cdot\nabla|\nabla u|^2 - 2\,\mathrm{Re}\int(\nabla\bar u\cdot\nabla)J_b\cdot\nabla u \]
\[ = \int(\nabla\cdot J_b)|\nabla u|^2 - 2\,\mathrm{Re}\int(\nabla\bar u\cdot\nabla)J_b\cdot\nabla u \le CC_*t^{-3}\|u\|_{H^1}^2. \tag{4.16} \]
Hence we have
\[ \frac{d}{dt}f(t) \le CC_*t^{-2}\|u\|_{H^1}^2 \le CC_*t^{-2}f(t). \tag{4.17} \]
Hence we get
\[ [\ln f]_s^t \le \big[-CC_*t^{-1}\big]_s^t \le CC_*T^{-1}. \]
In particular,
\[ \frac{f(t)}{f(s)},\ \frac{f(s)}{f(t)} \le e^{CC_*T^{-1}}. \]
Now we consider the case k = 3. The case k = 2 follows by interpolation. Let u(t) be as above and w = Lu ∈ M. We have
\[ \partial_t w = Lw + LP_M Ju = Lw + Jw + [L, J]u - LP_S Ju. \]
This time we let $f_3(t) = \mathrm{Im}(Lw, w)$ and have
\[ \frac{d}{2dt}f_3(t) = \mathrm{Im}(Lw, Jw + [L, J]u - LP_S Ju). \]
We have $|(Lw, LP_S Ju)| \le CC_*t^{-2}\|w\|_2\|u\|_2$, and we already showed $|\mathrm{Im}(Lw, Jw)| \le CC_*t^{-2}\|w\|_{H^1}^2$ when we considered f(t), see especially (4.16). Finally
\[ |\mathrm{Im}(Lw, [L, J]u)| = |\mathrm{Im}(Lw, -i(\nabla J_b)\cdot\nabla u + O(t^{-2})u)| \le CC_*t^{-3}\|w\|_{H^1}\|u\|_{H^2} + CC_*t^{-2}\|w\|_{H^1}\|u\|_{H^1}, \]


by integration by parts. Since $\|w\|_{H^1}^2$ is comparable with $f_3$, we conclude
\[ \frac{d}{dt}f_3(t) \le CC_*t^{-2}\big[f_3(t) + f_3(t)^{1/2}\|u(t)\|_{H^1}\big] \le CC_*t^{-2}\big[f_3(t) + f(t)\big]. \]
Together with (4.17), we see that $(f + f_3)$ satisfies the same inequality as in (4.17), and hence the same bound. Since $(f + f_3) \sim \|u(t)\|_{H^3}^2$, the lemma is proved. □

Remark. Due to the spatial cut-off in our Eq. (4.7), we do not need to prove a finite speed estimate for P (as we did in Lemma 4.6 for P) in order to prove the above lemma.

4.3. Estimates of ξ. We now estimate ξ precisely. Recall (4.2) and (4.3), the equations of ξ. Our goal is to estimate the term $-\int_t^\infty P(s, t)P_M\big(\tfrac{1}{i}A + P_M J(s)\big)e^{sL_0}\xi_\infty\,ds$. We need the following standard results on the free evolution.

Lemma 4.2 (Decay of $e^{it\Delta/2}$). Let k > 0 be a positive integer and assume $\nabla_p^m\hat\xi_\infty(0) = 0$ for all non-negative integers m ≤ 2k − 2. Then
\[ \Big|\nabla_x^n e^{it\Delta/2}\xi_\infty(x)\Big|_{x = O(1)} \le \frac{C}{t^{d/2+k}}\int(1 + |y|^{2k})|\nabla_y^n\xi_\infty(y)|\,dy, \tag{4.18} \]
for any integer n ≥ 0.

Proof. We first consider the case n = 0. Write $r = \frac{i|x-y|^2}{2t}$. We have
\[ (e^{it\Delta/2}\xi_\infty)(x) = \frac{1}{(2\pi it)^{d/2}}\int e^{\frac{i|x-y|^2}{2t}}\xi_\infty(y)\,dy = \frac{1}{(2\pi it)^{d/2}}\int\Big[1 + r + \frac12r^2 + \cdots + \frac{1}{(k-1)!}r^{k-1} + O(r^k)\Big]\xi_\infty(y)\,dy. \]
Therefore, the conclusion of the lemma holds if $\int|x - y|^{2l}\xi_\infty(y)\,dy = 0$ for all x, for all l < k, which is true under the assumption of the lemma. For general n, we take the derivative first and then proceed as above. Note $\nabla_p^m\widehat{(\nabla_x^n\xi_\infty)}(0) = \nabla_p^m(p^n\hat\xi_\infty)(0) = 0$ for all m ≤ 2k − 2. □

We now use that P(s, t) is bounded in $H^1$ (Lemma 4.1) to have
\[ \Big\|\int_t^\infty P(s, t)P_M\frac{1}{i}Ae^{sL_0}\xi_\infty\,ds\Big\|_{H^1} \le \int_t^\infty\Big\|P_M\frac{1}{i}Ae^{sL_0}\xi_\infty\Big\|_{H^1}ds. \]
From Lemma 4.2 with k = 1, the last term is bounded by
\[ \int_t^\infty\|e^{sL_0}\xi_\infty\|_{W^{1,\infty}(y \sim 1)}\,ds \le \int_t^\infty s^{-5/2}\,ds \le Ct^{-3/2}. \]


Notice that this is the only place we use assumption (4.5). Now we recall $J = J_a + J_b\cdot\nabla + J_c$. Since $\|J_c(s)\|_\infty \le s^{-3}$, we have
\[ \Big\|-\int_t^\infty P(s, t)P_M J_c(s)e^{sL_0}\xi_\infty\,ds\Big\|_{H^1} \le \int_t^\infty Cs^{-3}\,ds\,\|\xi_\infty\|_{H^1} \le Ct^{-2}\|\xi_\infty\|_{H^1}. \]
We now expand P(s, t) once more to get
\[ -\int_t^\infty P(s, t)P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\,ds = \xi_J + \xi_{AJ} + \xi_{JJ}, \]
where
\[ \xi_J = -P_M\int_t^\infty e^{(t-s)L_0}P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\,ds, \]
\[ \xi_{AJ} = \int_t^\infty\int_t^s P(\sigma, t)P_M\frac{1}{i}Ae^{(\sigma-s)L_0}\,d\sigma\,P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\,ds, \]
\[ \xi_{JJ} = \int_t^\infty\int_t^s P(\sigma, t)P_M J(s)e^{(\sigma-s)L_0}\,d\sigma\,P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\,ds. \]
Recall from J.-L. Journé, A. Soffer and C. D. Sogge [15],
\[ \big\|e^{is_0H_0}Ve^{is_1H_0}\big\|_{(L^1, L^\infty)} \le \frac{C\|\hat V\|_1}{(s_0 + s_1)^{d/2}}. \tag{4.19} \]
Suppose that we can neglect the second projection $P_M$ in the definition of $\xi_J$. Since $J_b\nabla e^{sL_0} = J_be^{sL_0}\nabla$, and we can write $J_a = -i\omega + J_{a2}$, $J_b = v + J_{b2}$, where $J_{a2}$ and $J_{b2}$ have compact supports and $\|J_{a2}(s)\|_{L^1(p)} + \|J_{b2}(s)\|_{L^1(p)} = O(s^{-2})$, the $L^\infty$-norm of the integrand of $\xi_J$ is bounded by $Ct^{-3/2}s^{-2}$. Integrating in s we get
\[ \|\xi_J(t)\|_{L^\infty} \le Ct^{-3/2}\|\xi_\infty\|_{W^{1,1}}. \]
To handle the $P_M$, we simply use that $P_M = 1 - P_S$. Since $P_S$ is a projection onto local smooth functions, the same proof applies. We shall not repeat the argument to handle the projection $P_M$ later on. We can also bound $\xi_J(t)$ in the $H^1$ norm by brute force, as we dealt with $J_c$:
\[ \|\xi_J(t)\|_{H^1} \le Ct^{-1}\|\xi_\infty\|_{H^2}, \]
since $\xi_J$ involves only free evolution. We now use that P(s, t) is bounded in $H^1$ to have
\[ \|\xi_{AJ}(t)\|_{H^1} \le \int_t^\infty\int_t^s\Big\|Ae^{(\sigma-s)L_0}P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\Big\|_{H^1}d\sigma\,ds. \]
From the definition of A, we have
\[ \Big\|Ae^{(\sigma-s)L_0}P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\Big\|_{H^1} \le \Big\|e^{(\sigma-s)L_0}P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\Big\|_{L^\infty} + \Big\|\nabla e^{(\sigma-s)L_0}P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\Big\|_{L^\infty}. \]


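Again we use (4.19) to have
\[ \Big\|e^{(\sigma-s)L_0}P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\Big\|_{L^\infty} \le \sigma^{-3/2}s^{-2}\|\xi_\infty\|_{W^{1,1}}. \]
Since ∇ and $e^{(\sigma-s)L_0}$ commute, we can bound the term with $\nabla e^{(\sigma-s)L_0}$ in the same way by also using $\|\nabla_yJ_{a2}\|_{L^1(p)} + \|\nabla_yJ_{b2}\|_{L^1(p)} \le O(s^{-2})$. We conclude that
\[ \|\xi_{AJ}(t)\|_{H^1} \le \int_t^\infty\int_t^s\sigma^{-3/2}s^{-2}\,d\sigma\,ds\,\|\xi_\infty\|_{W^{2,1}} \le t^{-3/2}\|\xi_\infty\|_{W^{2,1}}. \tag{4.20} \]
Finally, we can bound $\xi_{JJ}(t)$ by
\[ \|\xi_{JJ}(t)\|_{H^1} \le \int_t^\infty\int_t^s\sigma^{-2}s^{-2}\,d\sigma\,ds\,\|\xi_\infty\|_{H^3} \le t^{-2}\|\xi_\infty\|_{H^3}. \tag{4.21} \]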
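The two double integrals appearing in (4.20) and (4.21) have elementary closed forms, namely $\tfrac23t^{-3/2}$ and $\tfrac12t^{-2}$, which is consistent with the stated decay rates. The sketch below checks this numerically; it is an illustration only, with hypothetical function names. The inner σ-integral is evaluated in closed form, and the outer integral is computed by a midpoint rule after the substitution s = t/u.

```python
def outer_integral(inner, t, n=100000):
    # Midpoint rule for int_t^infinity s^{-2} inner(s) ds,
    # after the substitution s = t / u with u in (0, 1].
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        s = t / u
        total += s ** -2 * inner(s) * t / u ** 2
    return total * h


def xi_AJ_weight(t):
    # int_t^inf int_t^s sigma^{-3/2} s^{-2} dsigma ds, inner integral in closed form.
    return outer_integral(lambda s: 2.0 * (t ** -0.5 - s ** -0.5), t)


def xi_JJ_weight(t):
    # int_t^inf int_t^s sigma^{-2} s^{-2} dsigma ds, inner integral in closed form.
    return outer_integral(lambda s: 1.0 / t - 1.0 / s, t)


if __name__ == "__main__":
    for t in (2.0, 10.0):
        print(t, xi_AJ_weight(t), (2.0 / 3.0) * t ** -1.5)
        print(t, xi_JJ_weight(t), 0.5 * t ** -2)
```

Both weights are therefore bounded by the right-hand sides used in (4.20) and (4.21), with room to spare in the constants.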

Let $\xi(t) = \xi^{(0)}(t) + \xi^{(1)}(t) + \xi^{(2)}(t)$, where $\xi^{(0)}(t) = P_Me^{tL_0}\xi_\infty$, $\xi^{(1)} = \xi_J$ and $\xi^{(2)}(t)$ denotes the rest. Then we have proved that
\[ \|\xi^{(0)}(t)\|_{L^\infty} + \|\xi^{(1)}(t)\|_{L^\infty} \le Ct^{-3/2}, \qquad \|\xi^{(1)}(t)\|_{H^1} \le Ct^{-1}, \qquad \|\xi^{(2)}(t)\|_{H^1} \le Ct^{-3/2}, \tag{4.22} \]
with the constants depending on $\xi_\infty$. In fact, tracking the proof we see that, since ∇ commutes with $e^{sL_0}$, we actually have
\[ \|\xi^{(0)}(t)\|_{W^{2,\infty}} + \|\xi^{(1)}(t)\|_{W^{2,\infty}} \le Ct^{-3/2}, \qquad \|\xi^{(1)}(t)\|_{H^2} \le Ct^{-1}, \qquad \|\xi^{(2)}(t)\|_{H^2} \le Ct^{-3/2}. \tag{4.23} \]
Of course we need to use a stronger norm for $\xi_\infty$. The following norm is sufficient:
\[ \|\xi_\infty\|_{H^4} + \|\xi_\infty\|_{W^{3,1}} + \|\xi_\infty\|_{W^{2,1}((1+x^2)dx)} \le C^{-1}C_*, \tag{4.24} \]

where $C_*$ is a small constant to be chosen in the next subsection.

4.4. Existence of g. In this section we construct the solution via a contraction mapping argument. After defining the map in Step 1, we show the following bounds in Step 2:
\[ t^2|\omega(t)| + t^3|a(t)| + t^2\|g(t)\|_{H^2} < C_*, \qquad (t > T) \tag{4.25} \]
provided that $\|\xi_\infty\| \le C^{-1}C_*$ with $C_* > 0$ sufficiently small (see (4.24)) and T sufficiently large. Finally, in Step 3 we show that the contraction mapping converges in the norm $t^2|\omega(t)| + t^3|a(t)| + t^2\|g(t)\|_{H^1}$ in the ball $t^2|\omega(t)| + t^3|a(t)| + t^2\|g(t)\|_{H^2} < C_*$. Notice that we use the $H^1$ norm for g(t) in the contraction, which is weaker than the $H^2$ norm appearing in (4.25). Our approach is certainly not the shortest. Once a certain a priori bound is established, one can follow a standard existence construction by taking weak limits. This would avoid the proof of the contraction completely. Our approach, however, provides more information about the scattering operator.

STEP 1

We first define the map (ω, a, g) −→ (ω , a , g )

(4.26)

with the convention $g'_S = P_S g'$ and $g'_M = P_M g'$, and so on. Recall that $J(t)$ and $\xi(t)$, defined by (4.9) and (4.3) respectively, depend on $\omega$ and $a$. To solve the equation in the $S$ direction (4.15), we first solve for $\beta$ and $\delta$ from (4.15). Since we plan to solve the equation by iteration, we define (taking $\gamma = 0$)

$$\delta'(t) = -\int_t^\infty \kappa_1 \bigl(\mathrm{Re}\, G_\mu^{(2)}(s), Q\bigr)\, ds, \qquad \beta'(t) = -\int_t^\infty \kappa_2 \bigl(\mathrm{Re}\, G_\mu^{(2)}(s), yQ\bigr)\, ds. \tag{4.27}$$

Instead of solving the equation for α and γ , we choose ω and a such that α˙ = γ˙ = 0. Therefore, we define ω , a to be ω = −δ + κ1 (ImG(2) µ , 0),

a =

(4.28)

κ2 (ImG(2) µ , ∇Q).

With this choice, the component of g in the S direction is simply gS (t) = β (t) ∇Q + δ (t) 00 . 0 Finally, the component on the M direction is given by ∞ P(s, t)PM G(1) gM (t) = − µ ds, t

(4.29)

(1)

P(s, t) depends on a and ω, so is where Gµ is defined in (4.12). Note the definition of µ. Our next step is to prove this map is bounded in a certain norm. STEP 2

Suppose that $\|\xi_\infty\| \le C^{-1}C_*$ (see (4.24)) and

$$t^2|\omega(t)| + t^3|a(t)| + t^2\|g(t)\|_{H^2} < C_*. \tag{4.30}$$

We will prove the following bound:

$$t^2|\omega'(t)| + t^3|a'(t)| + t^2\|g'(t)\|_{H^2} < C_*/2 \tag{4.31}$$

provided that $C_*$ is sufficiently small. The last statement seems contradictory, as the norm gets smaller after each iteration and we could drive the constant to zero. But this is impossible, as the constant in the estimate of $\xi_\infty$ remains unchanged. Indeed, the right-hand side of the last bound depends mainly on the constant appearing in the estimate of $\xi_\infty$, i.e., in the inequality $\|\xi_\infty\| \le C^{-1}C_*$. Since $a(t)$ satisfies (4.30) and $a = \dot v$, $v = \dot r$, we have $|v(t)| \le CC_* t^{-2}$ and $|r(t)| \le CC_* t^{-1}$. We now estimate $\|\mu F(\mu^{-1}(\xi + g))\|_{H^2}$. By definition,

$$\mu F(\mu^{-1}h) = -\Phi * |h|^2\, (\mu Q + h) - 2\,\Phi * [Q\, \mathrm{Re}(\mu^{-1}h)]\, h.$$


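Recall the decomposition and the estimate for $\xi$ (4.23) from Subsect. 4.3. Write $h = \xi + g = \xi^{(0)} + \xi^{(1)} + (\xi^{(2)} + g)$. Because of the bound (1.27) on $\Phi$,

$$\begin{aligned}
\bigl\|\Phi * |h|^2 (\mu Q + h)\bigr\|_{H^2} &\le \bigl\|\Phi * |h|^2\bigr\|_{W^{2,\infty}} \cdot \|\mu Q + h\|_{H^2} \\
&\le C \bigl\|\Phi * |\xi^{(0)} + \xi^{(1)}|^2\bigr\|_{W^{2,\infty}} + C \bigl\|\Phi * |\xi^{(2)} + g|^2\bigr\|_{W^{2,\infty}} \\
&\le C \|\Phi\|_{W^{2,1}} \cdot \|\xi^{(0)} + \xi^{(1)}\|_{W^{2,\infty}}^2 + C \|\Phi\|_{W^{2,\infty}} \cdot \|\xi^{(2)} + g\|_{H^2}^2 \\
&\le C C_*^2 t^{-3}.
\end{aligned}$$

Since $\Phi * [Q\, \mathrm{Re}(\mu^{-1}h)]\, h$ is a local term by the presence of $Q$, using the bound (1.27) on $\Phi$ we have

$$\bigl\|\Phi * [Q\, \mathrm{Re}(\mu^{-1}h)]\, h\bigr\|_{H^2} \le C \|h\|_{H^2(y \sim 1)}^2 \le C C_*^2 t^{-3}.$$

We conclude that

$$\bigl\|\mu F(\mu^{-1}(\xi + g))\bigr\|_{H^2} \le C C_*^2 t^{-3}.$$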

From the bound of $J$ (4.10) and the assumption on the norm of $g$ (4.25), we have

$$\|J g_S(t)\|_{H^2} \le C C_*^2 t^{-2-2}.$$

For any $f \in S$, we also have $|(f, J g_M)| \le C C_* t^{-2} \|f\|_{H^1} \|g_M\|_{L^2} \le C C_*^2 t^{-4} \|f\|_{H^1}$. Also, $|(f, P_S J \xi)| \le C C_*^2 t^{-2-3/2}$. Finally −i/(µ − 1)Q is exponentially small in $t$. Hence we conclude that $|(f, G_\mu^{(2)})| \le C C_*^2 t^{-3} \|f\|$. Thus

$$|\beta'(t)| + |\delta'(t)| \le \tfrac{1}{8} C_* t^{-2}, \qquad |\omega'(t)| \le \tfrac{1}{8} C_* t^{-2}, \qquad |a'(t)| \le \tfrac{1}{8} C_* t^{-3}, \qquad \|g'_S(t)\|_{H^2} \le \tfrac{1}{8} C_* t^{-2},$$

provided that $C_*$ is sufficiently small. One can also easily check that

$$\|g'_M(t)\|_{H^2} \le \int_t^\infty \|G_\mu^{(1)}(s)\|_{H^2}\, ds \le \int_t^\infty C C_*^2 s^{-3}\, ds \le \tfrac{1}{8} C_* t^{-2}.$$

The claim (4.31) is proved.

STEP 3

Given two data $(\omega_1, a_1, g_1)$ and $(\omega_2, a_2, g_2)$ we denote by $\delta$ their differences: $\delta\omega = \omega_1 - \omega_2$, $\delta a = a_1 - a_2$, $\delta g = g_1 - g_2$, $\delta g' = g'_1 - g'_2$, and so on. We also let

$$\delta_0 = \sup_t \bigl[ t^2|\delta\omega(t)| + t^3|\delta a(t)| + t^2\|\delta g(t)\|_{H^1} \bigr]. \tag{4.32}$$

Note: different $a(t)$ gives different $\mu$ ($\mu = e^{i(1-\chi)vy}$), but $\chi$ is the same. Also, from the definition of $J$, we have $\|\delta J_a(t)\|_\infty + \|\delta J_b(t)\|_\infty + \|\delta J_c(t)\|_\infty \le C\delta_0 t^{-2}$. Our goal is to estimate $t^2|\delta\omega'(t)| + t^3|\delta a'(t)| + t^2\|\delta g'(t)\|_{H^1}$. Recall the definition of $\omega'$, $a'$ and $g'$ from (4.27), (4.28) and (4.29). In order to estimate the difference of $\omega'$, $a'$ from two initial data, we need to control the difference $\delta G_\mu^{(2)} := G_{\mu,1}^{(2)} - G_{\mu,2}^{(2)}$, where $G_{\mu,k}^{(2)} := J_k g_k - i\mu_k F(\mu_k^{-1}(\xi_k + g_k))$, $k = 1, 2$. Here $\mu_k$, $\xi_k$ and $J_k$ denote the corresponding $\mu$, $\xi$ and $J$, $k = 1, 2$, and thus $\partial_t \xi_k = (L + P_M J_k)\xi_k$. We shall first estimate $\delta\xi$, then $\delta F = \mu_1 F(\mu_1^{-1}(\xi_1 + g_1)) - \mu_2 F(\mu_2^{-1}(\xi_2 + g_2))$ and finally $\delta(Jg) := J_1 g_1 - J_2 g_2$ and $\delta(J\xi) := J_1 \xi_1 - J_2 \xi_2$. From the equation of $\xi$, $\delta\xi$ satisfies

$$\partial_t \delta\xi = (L + P_M J_1)\delta\xi + P_M(\delta J)\xi_2.$$

Since $\delta\xi(t) \to 0$ in $H^1$ as $t \to \infty$ (see (4.23)), we have

$$\delta\xi = -\int_t^\infty P_1(s,t) P_M (\delta J(s)) \xi_2(s)\, ds$$

in $H^1$. We now derive a bound on $\delta\xi$. The last term can be decomposed into two parts $A + B$ with

$$A := -\int_t^\infty P_1(s,t) P_M (\delta J(s)) P_M e^{sL_0}\xi_\infty\, ds, \qquad B := -\int_t^\infty P_1(s,t) P_M (\delta J(s)) \bigl[\xi_2(s) - P_M e^{sL_0}\xi_\infty\bigr]\, ds.$$

Since $\|\delta J(s)\|_\infty \le C\delta_0 s^{-2}$ and $\|\xi_2(s) - e^{sL_0}\xi_2\|_{H^2} \le C s^{-1}$ from (4.23), we can bound $B$ by

$$\|B\|_{H^1} \le \int_t^\infty C\delta_0 s^{-2}\, C_* s^{-1}\, ds = C C_* \delta_0 t^{-2}.$$

We can bound $A$ exactly as in Subsect. 4.3. In other words, it can be written as a sum of three terms satisfying (4.23). More precisely, $A = A^{(0)} + A^{(1)} + A^{(2)}$ and

$$\|A^{(0)}(t)\|_{W^{2,\infty}} + \|A^{(1)}(t)\|_{W^{2,\infty}} \le C\delta_0 t^{-3/2}, \qquad \|A^{(1)}(t)\|_{H^2} \le C\delta_0 t^{-1}, \qquad \|A^{(2)}(t)\|_{H^2} \le C\delta_0 t^{-3/2}.$$

(In fact, $A^{(0)} = 0$.) Notice that the constants on the right-hand side now have a $\delta_0$ factor. In particular, we can write $\delta\xi = (\delta\xi)_a + (\delta\xi)_b$ with $(\delta\xi)_a = A^{(0)} + A^{(1)}$ and $(\delta\xi)_b = A^{(2)} + B$ such that

$$\|(\delta\xi)_a(t)\|_{W^{1,\infty}} \le C\delta_0 t^{-3/2}, \qquad \|(\delta\xi)_b(t)\|_{H^1} \le C\delta_0 t^{-3/2}. \tag{4.33}$$

From the definition of $\delta F$, we can bound $\delta F$ in terms of $\delta\xi$ and $\delta g$. (Note that $(\mu_1 - \mu_2)Q$ is exponentially small in $t$.) The previous bound on $\delta\xi$ and the bound (4.32) on $\delta g$ thus yield

$$\|\delta F\|_{H^1} \le C C_* \delta_0 t^{-3}.$$

Also, $\delta(Jg) = (\delta J) g_1 + J_2 (\delta g)$. Thus, for any $f \in S$ with $\|f\|_{H^1} \le 1$ we have

$$|(f, \delta[Jg])| \le C C_* \delta_0 t^{-4}.$$


Similarly, |(f, δ[PS J ξ ])| ≤ CC∗ δ 0 t −7/2 . Finally, δ(−i/(µ − 1)Q) ≤ CC∗ t −2 e−Ct . We conclude for any f ∈ S with f H 1 ≤ 1 that −3 |(f, δG(2) µ )| ≤ CC∗ δ 0 t .

Simple calculations then show that

$$|\delta a'(t)| \le \tfrac{1}{8}\delta_0 t^{-3}, \qquad |\delta\omega'(t)| \le \tfrac{1}{8}\delta_0 t^{-2}, \qquad \|\delta g'_S(t)\|_{H^1} \le \tfrac{1}{8}\delta_0 t^{-2},$$

provided that C∗ is sufficiently small. Finally, the equation of gM (4.29) can be written explicitly as ∂t gM = LgM + PM J (gM + gS ) − i/(µ − 1)Q − iµF (µ−1 (g + ξ )) .

Hence for δgM = g1,M − g2,M we have ∂t δgM = (L + PM J1 )δgM + PM (−δJ )g2,M + δ(J gS ) − iδ(/(µ − 1))Q − iδF .

Since (δgM )(t) → 0 as t → ∞ in H 1 , we can put it in integral form:

(δgM )(t) = ∞ − P1 (s, t)PM (−δJ )g2,M + δ(J gS ) − iδ(/(µ − 1))Q − iδF ds. t

(4.34)

Therefore, we can bound the H1 norm of δgM by ∞ ≤ C δgM 1 (−δJ )g2,M + δ(J gS ) − iδ(/(µ − 1))Q − iδF H

Since

t

(δJ )g2,M

H1

≤ Cδ 0 s −2 g2,M

H2

H1

ds .

,

(that is why we needed to prove a stronger bound for g in Step 2), together with previous bounds on δ(J gS ), −iδ(/(µ − 1))Q and iδF , we can bound the integrand by C∗ δ 0 s −3 . Thus we have 1 δgM 1 ≤ δ 0 t −2 H 8 provided that C∗ is sufficiently small.

Conclusion: For the case $v_0 = 0$, we have proved that

$$t^2|\delta\omega'(t)| + t^3|\delta a'(t)| + t^2\|\delta g'(t)\|_{H^1} \le \delta_0/2$$

under the assumptions (4.24), (4.30) and (4.32). Thus the map (4.26) is a contraction. Since (4.30) holds for a nonempty set of functions (including zero), we obtain a solution $(\omega, a, g)$, together with $\xi$. Furthermore, we have proved that $t^2|\omega(t)| + t^3|a(t)| + t^2\|g(t)\|_{H^2} \le C_*$ for $t$ greater than an absolute constant $T$. Hence $v(t) = -\int_t^\infty a(s)\, ds = O(t^{-2})$. Similarly $r(t) = O(t^{-1})$ and $\theta_0(t) = O(t^{-1})$. Also recall $y = x - r(t)$ and $h_{as,0} = \xi_\infty$. Therefore, by Taylor expansion,

$$\psi(x,t) - \psi_{as}(x,t) = \bigl[ Q(y) e^{i(vy - Et + \theta_0)} + h(y,t) e^{i(\chi v y - Et + \theta_0)} \bigr] - \bigl[ Q(x) e^{-iEt} + e^{it\Delta/2}\xi_\infty(x) \bigr] = O(t^{-1}) \quad \text{in } H^2.$$

Note that our result is true for $t > T$. However, if we replace all previous estimates of the form $t^{-m}$ by $(t + T)^{-m}$, our contraction argument still holds. Hence Theorem 1.2 is proved for the case $v_0 = 0$. To conclude Theorem 1.2 for general $v_0$, we apply the following Galilei transform (boost):

$$\psi(x,t) \longrightarrow \psi(x - v_0 t, t)\, e^{i(v_0 \cdot x - \frac{1}{2} v_0^2 t)}.$$

(Recall $h_{as,0}(x) = \xi_\infty(x) e^{i v_0 \cdot x}$ and $\hat h_{as,0}(v_0) = \hat\xi_\infty(0) = 0$.) Also, for general $r_0$ we apply a translation, which does not require a change of assumption. The proof is complete.

Acknowledgements. We thank J. Bourgain, I.M. Sigal and T. Spencer for some useful discussions and comments. T. P. Tsai and H. T. Yau would like to acknowledge the support of the Center for Theoretical Sciences at Taiwan where part of this work was done.

References

1. Schrödinger, E.: Der stetige Übergang von der Mikro- zur Makromechanik. Die Naturwissenschaften 28, 664–669 (1926)
2. Kato, T.: Perturbation Theory for Linear Operators on Hilbert Space. Berlin–Heidelberg–New York: Springer-Verlag, 1980
3. Hepp, K.: The classical limit for quantum mechanical correlation functions. Commun. Math. Phys. 35, 265–277 (1974)
4. Ginibre, J. and Velo, G.: The classical field limit of nonrelativistic bosons, I. Ann. Phys. (NY) 128, 243–285 (1980); “· · · , II”. Ann. Inst. H. Poincaré 33, 363–394 (1980); (see also: Ginibre, J. and Velo, G.: Commun. Math. Phys. 66, 37–76 (1979) and 68, 45–68 (1979))
5. Lieb, E.H., Seiringer, R. and Yngvason, J.: Bosons in a Trap: A Rigorous Derivation of the Gross-Pitaevskii Energy Functional. Los Alamos archive, math-ph/9908027
6. Fröhlich, J., Tsai, T.-P. and Yau, H.-T.: On a classical limit of quantum theory and the non-linear Hartree equation. To appear in the proceedings of the conference “Visions in Mathematics” (Tel Aviv, 1999). Special volume of GAFA. Basel: Birkhäuser, 2000
7. Ginibre, J. and Velo, G.: On a class of nonlinear Schrödinger equations with nonlocal interaction. Math. Z. 170, 109–136 (1980); Scattering theory in the energy space for a class of nonlinear Schrödinger equations. J. Math. Pure Appl. 64, 363–401 (1985); Scattering theory in the energy space for a class of Hartree equations. Preprint 1998
8. Soffer, A. and Weinstein, M.I.: Multichannel nonlinear scattering theory for nonintegrable equations I, II. Commun. Math. Phys. 133, 119–146 (1990); J. Diff. Eqns. 98, 376–390 (1992)
9. Ovchinnikov, Y.N. and Sigal, I.M.: Dynamics of localized structures. Physica A 261, 143–158 (1998); Ginzburg-Landau equation, I, general discussion. In: PDE’s and their Applications. Seco, L. et al. (eds.). CRM Proceedings and Lecture Notes 12, 199–220 (1997)
10. Reed, M. and Simon, B.: Methods of modern mathematical physics, II, Fourier analysis, self-adjointness. New York, San Francisco, London: Academic Press, 1975
11. Schlein, B.: diploma thesis, ETH 1999
12. Pillet, C.-A. and Wayne, C.E.: Invariant manifolds for a class of dispersive, Hamiltonian partial differential equations. J. Diff. Equations 141, 310–326 (1997)
13. Weinstein, M.I.: Modulational stability of ground states of nonlinear Schrödinger equations. SIAM J. Math. Anal. 16, no. 3, 472–491 (1985)
14. Weinstein, M.I.: Lyapunov stability of ground states of nonlinear dispersive evolution equations. Comm. Pure Appl. Math. 39, 51–68 (1986)
15. Journe, J.-L., Soffer, A. and Sogge, C.D.: Decay estimates for Schrödinger operators. Comm. Pure Appl. Math. 44, no. 5, 573–604 (1991)

Communicated by A. Kupiainen

Commun. Math. Phys. 225, 275 – 304 (2002)


© Springer-Verlag 2002

From Invariant Curves to Strange Attractors

Qiudong Wang1, Lai-Sang Young2

1 Dept. of Math., University of Arizona, Tucson, AZ 85721, USA. E-mail: [email protected]
2 Courant Institute of Mathematical Sciences, 251 Mercer St., New York, NY 10012, USA. E-mail: [email protected]

Received: 10 January 2001 / Accepted: 10 July 2001

Abstract: We prove that simple mechanical systems, when subjected to external periodic forcing, can exhibit a surprisingly rich array of dynamical behaviors as parameters are varied. In particular, the existence of global strange attractors with fully stochastic properties is proved for a class of second order ODEs.

Introduction

In the history of classical mechanics, dissipative systems received only limited attention, in part because it was believed that in these systems all orbits eventually tended toward stable equilibria (fixed points or periodic cycles). Evidence that second order equations with a periodic forcing term can have interesting behavior first appeared in the study of van der Pol’s equation, which describes an oscillator with nonlinear damping. The first observations were due to van der Pol and van der Mark. Cartwright and Littlewood proved later that in certain parameter ranges, this equation had periodic orbits of different periods [CL]. Their results pointed to an attracting set more complicated than a fixed point or an invariant curve. Levinson obtained detailed information for a simplified model [Ln]. His work inspired Smale, who introduced the general idea of a horseshoe [Sm], which Levi used later to explain the observed phenomena [Li1]. A number of other differential equations with chaotic behavior have been studied in the last few decades, both numerically and analytically. Examples from the dissipative category include the equations of Lorenz [Lo, G, Ro, Ry, Sp, T, W], Duffing’s equation [D, Ho], Lorentz gases acted on by external forces [CELS], and modified van der Pol type systems [Li2]. For a systematic treatment of the Lorenz and Duffing equations, see [GH]. While some progress has been made, the number of equations for which a rigorous global description of the dynamics is available has remained small.

This research is partially supported by a grant from the NSF


In this paper, we consider an equation of the form

$$\frac{d^2\theta}{dt^2} + \lambda\Bigl(\frac{d\theta}{dt} - 1\Bigr) = \Phi(\theta) P_T(t),$$

where $\theta \in S^1$ and $\lambda > 0$. If the right side is set identically equal to zero, this equation represents the motion of a particle subjected to a constant external force which causes it to decelerate when its velocity exceeds one and to accelerate when it is below one. Independent of the initial condition, the particle approaches uniform motion in which it moves with velocity equal to one. To this extremely simple dynamical system, we add another external force in the form of a pulse: $\Phi$ is an arbitrary function, $P_T$ is time-periodic with period $T$, and for $t \in [0, T)$, it is equal to 1 on a short interval and 0 otherwise. We learned after this work was completed that a similar equation has been studied numerically in the physics literature by G. Zaslavsky.1 We prove that the system above exhibits, for different values of $\lambda$ and $T$, a very rich array of dynamical phenomena, including (a) invariant curves with quasi-periodic behavior, (b) gradient-like dynamics with stable and unstable equilibria, (c) transient chaos caused by the presence of horseshoes, with almost every trajectory eventually tending to a stable equilibrium, and (d) strange attractors with SRB measures and fully stochastic behavior. These results are new for the equation in question. As abstract dynamical phenomena, (a)–(c) are fairly well understood, and their occurrences in concrete models have been noted; see [GH]. The situation with regard to (d) is very different. The analysis that allows us to handle attractors of this type was not available until recently. To our knowledge, this is the first time a concrete differential equation has been proved analytically to have a global nonuniformly hyperbolic attractor with an SRB measure.2 We regard Theorem 3, which discusses the strange attractor case, as the main result of this paper.
Our proof of Theorem 3 is based on [WY], in which we built a dynamical theory for a (general) class of attractors with one direction of instability and strong dissipation. In [WY], we identified a set of conditions which guarantees the existence of strange attractors with strong stochastic properties. The properties in question include most of the standard mathematical notions associated with chaos: positive Lyapunov exponents, positive entropy, SRB measures, exponential decay of correlations, symbolic coding of orbits, fractal geometry, etc. The occurrence of scenario (d) above is proved by checking the conditions in [WY]. For the convenience of the reader, we will recall these conditions as well as the package of results that follows once these conditions are checked. Our purpose in writing this paper is not only to point out the range of phenomena that can occur when simple second order equations are periodically forced, but to bring to the foreground the techniques that have allowed us to reach these conclusions in a relatively straightforward manner. These techniques are clearly not limited to the systems considered here. It is our hope that they will find applications in other dynamical systems, particularly those that arise naturally from mechanics or physics. 1 Zaslavsky produced in [Z1] numerical evidence of strange attractors. He also discussed in [Z2] how this model can be viewed as a strong idealization of the turbulence problem. 2 Levi proved in [Li1] the occurrence of scenario (c) for his modified van der Pol systems, not scenario (d) as is sometimes incorrectly reported.


1. Statement of Results

1.1. Setting and assumptions. Consider the differential equation

$$\frac{d^2\theta}{dt^2} + \lambda \frac{d\theta}{dt} = \mu + \Phi(\theta) P_T(t), \tag{1}$$

where $\theta \in S^1$, $\lambda, \mu > 0$ are constants, $\Phi : S^1 \to \mathbb{R}$ is a smooth function, and $P_T$ has the following form: for some $t_0 < T$, $P_T$ satisfies $P_T(t) = P_T(t + T)$ for all $t$ and

$$P_T(t) = \begin{cases} 1 & \text{for } t \in [0, t_0], \\ 0 & \text{for } t \in (t_0, T). \end{cases}$$

As discussed in the introduction, (1) describes a simple mechanical system consisting of a particle moving in a circle subjected to an external time-periodic force. With $r = \frac{d\theta}{dt} - \frac{\mu}{\lambda}$, (1) is equivalent to

$$\frac{d\theta}{dt} = r + \frac{\mu}{\lambda}, \qquad \frac{dr}{dt} = -\lambda r + \Phi(\theta) P_T(t). \tag{2}$$
Let $F_T$ denote the time-$T$-map of (2), that is, the map that transforms the phase space $S^1 \times \mathbb{R}$ from time 0 to time $T$. Unless explicitly stated otherwise, when we write $F_T$, it will be assumed that $T$ is the period of the forcing. We set $\mu = \lambda$ for simplicity, and normalize the forcing term as follows: Given a function $\Phi_0 : S^1 \to \mathbb{R}$, we let $\Phi = \frac{1}{t_0}\Phi_0$, that is to say, the magnitude of this part of the force is taken to be inversely proportional to the duration of its action, and the proportionality constant is taken to be 1 for simplicity. Our analysis will proceed as follows:

* The function $\Phi_0$ is fixed throughout. With the exception of Theorem 2(b) (where more is assumed), the only requirements are that $\Phi_0$ is of class $C^4$ and all of its critical points are nondegenerate.
* We assume $t_0 < \frac{1}{10}\min\{\lambda^{-1}, K_0^{-2}\}$, where $K_0 = \max\{\|\Phi_0\|_{C^4}, 1\}$. Further restrictions on $t_0$ are imposed in each case as needed. (We do not regard $t_0$ as an important parameter and will assume it is as small as the arguments require.)
* The two important parameters are $\lambda$ and $T$. We will prove that (i) the properties of (1) are intrinsically different for $\lambda$ small and for $\lambda$ large, and (ii) for fixed $\lambda$, the properties of (1) depend quite delicately on the value of $T$.

To interpret our results correctly, the reader should keep in mind that the dynamical pictures described below are not the only ones that can occur, and it is possible to have combinations of them, such as sinks and strange attractors, on different parts of the phase space. Our aim here is to identify several important pure dynamics types, to indicate the nature and approximate locations of the parameter sets on which they occur, and to convey a sense of prevalence, meaning that these phenomena occur naturally and not as a result of mere coincidence.
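To make the setting concrete, the following is a minimal numerical sketch of the time-$T$-map of system (2) under the normalization above ($\mu = \lambda$, forcing of size $1/t_0$ on $[0, t_0]$). It is an illustration only, not the method of the paper: the profile `phi0` (a sine), the fixed-step Runge-Kutta integrator, and all function names are assumptions made for this example, and the circle is taken to be $\mathbb{R}/\mathbb{Z}$.

```python
import math


def phi0(theta):
    # Hypothetical forcing profile; any smooth 1-periodic function could be used.
    return math.sin(2.0 * math.pi * theta)


def rhs(theta, r, lam, t0, forcing_on):
    # Right-hand side of system (2) with mu = lam and forcing phi0 / t0.
    kick = phi0(theta) / t0 if forcing_on else 0.0
    return r + 1.0, -lam * r + kick


def rk4(theta, r, lam, t0, forcing_on, duration, steps):
    # Classical fourth-order Runge-Kutta with a fixed step size.
    h = duration / steps
    for _ in range(steps):
        k1t, k1r = rhs(theta, r, lam, t0, forcing_on)
        k2t, k2r = rhs(theta + 0.5 * h * k1t, r + 0.5 * h * k1r, lam, t0, forcing_on)
        k3t, k3r = rhs(theta + 0.5 * h * k2t, r + 0.5 * h * k2r, lam, t0, forcing_on)
        k4t, k4r = rhs(theta + h * k3t, r + h * k3r, lam, t0, forcing_on)
        theta += h * (k1t + 2.0 * k2t + 2.0 * k3t + k4t) / 6.0
        r += h * (k1r + 2.0 * k2r + 2.0 * k3r + k4r) / 6.0
    return theta, r


def time_T_map(theta0, r0, lam, t0, T, steps=2000):
    # Pulse phase on [0, t0], then free relaxation on (t0, T).
    theta, r = rk4(theta0, r0, lam, t0, True, t0, steps)
    theta, r = rk4(theta, r, lam, t0, False, T - t0, steps)
    return theta % 1.0, r
```

Iterating `time_T_map` from a grid of initial points and plotting the images is one way to explore how the picture changes with `lam` and `T`; splitting the integration at `t0` avoids stepping across the discontinuity of the pulse.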


1.2. Statements of theorems. The setting of Sect. 1.1 is assumed throughout. We consider the discrete-time system defined by the Poincaré map FT . Precise meanings of some of the technical terms are given after the statements of the theorems. Theorem 3 is our main result. The scenarios presented in Theorems 1 and 2 are also integral parts of the picture. Theorem 1 (Existence of invariant curves). Let λ ≥ 4K0 and T ≥ t0 + 23 . Then there is a simple closed curve of class C 4 to which all the orbits of FT converge. Moreover, we have the following dichotomy: (a) (Quasi-periodic attractors) Let 0 = {T : ρ(T ) ∈ R \ Q}, where ρ(T ) is the rotation number of FT |. Then (i) 0 intersects every unit interval in [ 23 , ∞) in a set of positive Lebesgue measure, and (ii) the following hold for T ∈ 0 : FT | is topologically conjugate to an irrational rotation, and for every z ∈ S 1 × R, 1 n−1 δF i z converges weakly to µ where µ is the unique invariant probability 0 n T measure on . ¯0 (b) (Periodic sinks and saddles) There is an open and dense subset 1 of [t0 + 23 , ∞)\ such that for T ∈ 1 , FT has a finite number of periodic sinks and saddles on . Every orbit of FT converges to one of these periodic orbits. Theorem 2 is elementary; it uses standard techniques, and 0 is required only to be C 2 . We include this result because the dynamical pictures described occur for a nontrivial set of parameters. Theorem 2 (Convergence to stable equilibria). (a) (Gradient-like dynamics) ∃λ0 < max | 0 | such that ∀λ > λ0 , if t0 is sufficiently small, then there are open intervals of T for which FT has a finite number of periodic points all of which are saddles or sinks, and every orbit not on the stable manifold of a saddle tends to a sink. (b) (Transient chaos) Assume 0 has exactly two critical points. 
Then there exist intervals of λ accumulating at 0 such that for each of these λ, if t0 is sufficiently small, then there are open intervals of T for which FT has a periodic sink and a “horseshoe”, i.e. a uniformly hyperbolic invariant set such that FT | is conjugate to a shift of finite type with positive topological entropy. Lebesgue-a.e. z ∈ S 1 × R is attracted to the sink as n → ∞. Remarks. (i) The picture in Theorem 2(a) is more general than that in Theorem 1(b): there are no simple closed invariant curves in general (see Proposition 4.1). (ii) We describe the scenario in Theorem 2(b) as “transient chaos” for the following reasons: being an invariant set, points near it tend to stay near it for some period of time, mimicking the dynamics on . This chaotic behavior, however, is transient, because has Lebesgue measure zero, and for a typical initial condition, the orbit eventually leaves behind and heads for a sink. Our next result deals with a notion of chaos that is sustained through time. A compact, FT -invariant set ⊂ S 1 × R is called a global attractor for FT if for every z ∈ S 1 × R, dist(FTn z, ) → 0 as n → ∞. In order not to interrupt the flow of ideas, we postpone the technical definitions of some of the terms used in Theorems 2 and 3 to after the statements of both results. Here is our main result:


Theorem 3 (Strange attractors). For the parameters specified below, F = FT has a strange attractor, a description of which follows: ¯ t¯0 > 0 such that for every λ < λ¯ and t0 < t¯0 , Relevant parameter set. There exist λ, there is a positive Lebesgue measure set = (λ, t0 ) in T -space for which the results of this theorem hold; ⊂ [T0 , ∞) for some large T0 , and meets every subinterval of [T0 , ∞) of length O(λ) in a set of positive Lebesgue measure. ¯ t0 < t¯0 , and T ∈ (λ, t0 ). Then F = FT has Dynamical characteristics. Let λ < λ, a global attractor with the following dynamical properties: (1) Hyperbolic behavior. F | is nonuniformly hyperbolic with an identifiable set C ⊂ which is the source of all nonhyperbolic behavior. More precisely: (a) C = ∪i Ci where Ci is a Cantor set located near (θ, r) = (ci , 0), ci being the critical points of 0 ; at each z ∈ C, stable and unstable directions coincide, i.e. there is a vector v with DF n (z)v → 0 exponentially fast as n → ±∞. (b) Away from C the dynamics is uniformly hyperbolic. More precisely, let ε := {z ∈ : dC (F n z) ≥ ε∀n ∈ Z}, where dC (·) is a notion of distance to C. Then is the closure of ∪ε>0 ε , ε is a uniformly hyperbolic invariant set for each ε > 0, and the hyperbolicity of F |ε deteriorates (e.g. minimum (E u , E s ) → 0) as ε → 0. (2) Statistical properties. (a) F admits a unique SRB measure µ supported on . (b) With the exception of a Lebesgue measure zero set of initial conditions, the asymptotic behavior of every orbit of F is governed by µ. More precisely, for Lebesgue-a.e. z ∈ S1 × R, if ϕ : S 1 × R → R is a continuous function, then 1 n−1 ϕ(F i z) → ϕdµ as n → ∞. 0 n (c) (F, µ) is ergodic, mixing, and Bernoulli. (d) For every observable ϕ : → R of Hölder class, the sequence ϕ, ϕ ◦ F, ϕ ◦ F 2 , · · · , ϕ ◦ F n , · · · viewed as a stochastic process with underlying probability space (, µ) has exponential decay of correlations and obeys the Central Limit Theorem. 
(3) Symbolic coding and other geometric properties. (a) Kneading sequences are well defined for all critical orbits, i.e. all orbits emanating from C. (b) With respect to the partition defined by the fractal sets Ci , the coding of orbits in is well defined and essentially one-to-one. More precisely, if σ is the shift operator, then there is a closed subset & ⊂ '∞ −∞ {1, · · · , s} with σ (&) ⊂ & and a continuous surjection π : & → such that π ◦ σ = F ◦ π ; moreover, π is i one-to-one except over ∪∞ −∞ F C, where it is two-to-one. (In general, (&, σ ) is not a shift of finite type.) (c) Let htop (F ) denote the topological entropy of F , Nn the number of cylinder sets of length n in & above, and Pn the number of fixed points of F n . Then htop (F ) = lim

n→∞

1 1 log Nn = lim log Pn . n→∞ n n

Moreover, F has an invariant measure of maximal entropy.


For a more detailed description of the dynamics on these strange attractors, see [WY]. We review below the definitions and related background information for some of the technical terms used in the theorems. For more information on this material, see [KH] and [Y1]. A compact F -invariant set is called uniformly hyperbolic if the following hold: (1) The tangent space at every x ∈ splits into E u (x)+E s (x) with minx∈ (E u , E s ) > 0; (2) this splitting is DF -invariant; and (3) there exist C ≥ 1 and σ < 1 such that for all x ∈ and n ≥ 0, DF n (x)v ≤ Cσ n v for all v ∈ E s (x), DF −n (x)v ≤ Cσ n v for all v ∈ E u (x). In Theorem 3(1)(b), not only does min (E u , E s ) → 0 as ε → 0, we have C → ∞ as well. This means the smaller ε, the longer it takes for the geometry of hyperbolic behavior to take hold. An F -invariant Borel probability measure µ is called an SRB measure if F has a positive Lyapunov exponent µ-a.e. and the conditional measures of µ on unstable manifolds are equivalent to the Riemannian volume on these leaves. SRB measures are of physical relevance because they can be observed: in dissipative dynamical systems, all invariant probability measures are necessarily singular, but ergodic SRB measures with nonzero Lyapunov exponents have the property that there is a positive Lebesgue ϕ(F i z) → ϕdµ as n → ∞ for every measure set of points z for which n1 n−1 0 continuous function ϕ. Referring to the set of points z above as the measure-theoretic basin of µ, Theorem 3(2)(b) says that the measure-theoretic basin here is not just a positive Lebesgue measure set, it is, modulo a set of Lebesgue measure zero, the entire phase space. By a decomposition theorem for SRB measures with no zero exponents ([Le]), the uniqueness of µ implies that it is ergodic, and the mixing and Bernoulli properties are equivalent to (F n , µ) being ergodic for all n ≥ 1. 
We say the dynamical system (F, µ) has exponential decay of correlations for Hölder continuous observables if given a Hölder exponent η, there exists τ = τ (η) < 1 such that for all ϕ ∈ L∞ (µ) and ψ : → R Hölder with exponent η, there exists K = K(ϕ, ψ) such that (ϕ ◦ F n )ψdµ − ϕdµ ψdµ ≤ K(ϕ, ψ)τ n for all n ≥ 1. Finally, we say the Central Limit Theorem holds for ϕ with ϕdµ = 0 if n−1 √1 ϕ ◦ F i converges in distribution to the normal distribution, and the variance 0 n is strictly positive unless ϕ ◦ F = ψ ◦ F − ψ for some ψ. 1.3. Illustrations. Figure 1 below shows the approximate location and shape of the invariant curve or strange attractor (corresponding to different values of λ and T ) for the time-T -map FT : S 1 × R → S 1 × R. Figure 2 explains the mechanisms behind the changes in the dynamical picture as λ decreases. The straight line in (a) represents {r = 0} in (θ, r)-coordinates, and the subsequent pictures show the images of this line (or circle) at various times under the flow. Figure 2(b) shows the effect of the forcing; observe that it need not constitute a large perturbation. For t ∈ (t0 , T ], the forcing is turned off, and the system relaxes to a limit cycle with contraction rate e−λ . Figure 2(d) shows the image of {r = 0} for λ > 1 and e−λT reasonably contractive; these parameters correspond to the existence of invariant curves. As λ decreases, the effect of the shear term in (2) becomes more

Fig. 1. Left: invariant curves, λ > 1; right: strange attractor, λ ≪ 1

Fig. 2 a–g. Image of {r = 0} at time t. Panels: (a) t = 0; (b) t = t0; (c) t0 < t < T (contraction rate e^{−λ}, speed ~ 1); (d) t = T, λ > 1; (e) t = T, λ decreasing; (f) t = T, λ decreasing further; (g) t = T, T ≫ 1, λ ≪ 1

prominent, as shown in (e). As λ decreases further, one sees a phenomenon resembling “the breaking of the wave” which accompanies the break-up of the invariant circle. Finally, in Figure 2(g), a tubular neighborhood of {r = 0} is folded and mapped into itself, leading to the formation of horseshoes and/or strange attractors.

2. Preliminary Information on the ODE

2.1. Singular limits. Let

$$\begin{pmatrix} \theta(t) \\ r(t) \end{pmatrix} = \begin{pmatrix} \theta(\theta_0, r_0; t) \\ r(\theta_0, r_0; t) \end{pmatrix}$$

denote the solution of (2) with $\theta(0) = \theta_0$ and $r(0) = r_0$. Then a simple exercise gives

$$F_T : \begin{pmatrix} \theta_0 \\ r_0 \end{pmatrix} \mapsto \begin{pmatrix} \theta(T) \\ r(T) \end{pmatrix} = \begin{pmatrix} \theta(t_0) + (T - t_0) + \frac{r(t_0)}{\lambda}\bigl(1 - e^{-\lambda(T - t_0)}\bigr) \\ r(t_0)\, e^{-\lambda(T - t_0)} \end{pmatrix},$$

where the value of $\theta(T)$ above is to be interpreted as mod 1 or on $S^1$. We let $a = \{T - t_0\}$ be the fractional part of $T - t_0$, $b = e^{-\lambda n}$, where $n = [T - t_0]$ is the integer part of $T - t_0$, and let $T_{a,b} = F_T$. Then

$$T_{a,b} : \begin{pmatrix} \theta_0 \\ r_0 \end{pmatrix} \mapsto \begin{pmatrix} \theta(t_0) + a + \frac{r(t_0)}{\lambda} - b e^{-\lambda a}\frac{r(t_0)}{\lambda} \\ b e^{-\lambda a}\, r(t_0) \end{pmatrix}. \tag{3}$$

(The appearance of “T” in both $F_T$ and $T_{a,b}$ is unfortunate; we hope it is not confusing. We wish eventually to make a connection to [WY] and this notation is used there.) We first fix $t_0$ and $\lambda$, and let $T \to \infty$. Clearly, $b \to 0$ as $T \to \infty$. The limit of $F_T$ as $T \to \infty$ does not exist. However, $T_{a,b}$ has the following well defined singular limit as $b \to 0$:

$$T_{a,0} : \begin{pmatrix} \theta_0 \\ r_0 \end{pmatrix} \mapsto \begin{pmatrix} \theta(t_0) + \frac{r(t_0)}{\lambda} + a \\ 0 \end{pmatrix}. \tag{4}$$

Let $T^1_{a,0}$ denote the first component of $T_{a,0}$. We will show in Sect. 2.3 that as $t_0 \to 0$, $T^1_{a,0} \to \hat T^1_{a,0}$, where

$$\hat T^1_{a,0} : \begin{pmatrix} \theta_0 \\ r_0 \end{pmatrix} \mapsto \theta_0 + \frac{r_0}{\lambda} + \frac{1}{\lambda}\Phi_0(\theta_0) + a. \tag{5}$$

In later sections, we will also work with two families of circle maps $f_a$ and $\hat f_a$ obtained by restricting $T^1_{a,0}$ and $\hat T^1_{a,0}$ respectively to {r = 0}, i.e.

$$f_a(\theta_0) = \theta(\theta_0, 0; t_0) + \frac{r(\theta_0, 0; t_0)}{\lambda} + a; \qquad \hat f_a(\theta_0) = \theta_0 + \frac{1}{\lambda}\Phi_0(\theta_0) + a.$$

While our results are not confined to these limiting situations, the relation between $F_T$ and the objects that appear in Eq. (1), namely, $\Phi_0$, $\lambda$ and $t_0$, can be made transparent by comparing first $T_{a,b}$ and $T_{a,0}$ and then $T^1_{a,0}$ and $\hat T^1_{a,0}$. This is how we will go about obtaining information on $F_T$.
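The limiting circle map $\hat f_a$ is explicit, so it is easy to experiment with. The sketch below is a hedged illustration, not part of the paper's argument: it assumes the hypothetical profile $\Phi_0(\theta) = \sin(2\pi\theta)$ and the circle $\mathbb{R}/\mathbb{Z}$, and the function names are ours. It evaluates $\hat f_a$ and samples its derivative $1 + \Phi_0'(\theta)/\lambda$ on a grid, which shows how the map goes from a circle diffeomorphism for large $\lambda$ to a map with critical points for small $\lambda$.

```python
import math


def phi0(theta):
    # Hypothetical forcing profile, assumed only for this illustration.
    return math.sin(2.0 * math.pi * theta)


def dphi0(theta):
    # Derivative of the assumed profile.
    return 2.0 * math.pi * math.cos(2.0 * math.pi * theta)


def f_hat(theta, lam, a):
    # Limiting circle map: theta -> theta + phi0(theta) / lam + a (mod 1).
    return (theta + phi0(theta) / lam + a) % 1.0


def df_hat(theta, lam):
    # Derivative of the lift of f_hat with respect to theta.
    return 1.0 + dphi0(theta) / lam


def min_derivative(lam, grid=1000):
    # Smallest sampled derivative; a negative value means the map folds.
    return min(df_hat(k / grid, lam) for k in range(grid))
```

For this particular profile the minimum of the derivative is $1 - 2\pi/\lambda$, so the sampled map is monotone when $\lambda > 2\pi$ and has turning points when $\lambda < 2\pi$; the threshold depends entirely on the assumed $\Phi_0$.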


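2.2. The time-$t_0$-map. In this subsection we consider the solution of (2) for $t \in [0, t_0]$ and record some derivative estimates. We first write

$$r(t) = u(t) e^{-\lambda t}, \qquad \theta(t) = v(t) + t - \frac{1}{\lambda} u(t) e^{-\lambda t}. \tag{6}$$

Differentiating (6) and plugging into (2), we obtain

$$u(t) = u_0 + \frac{1}{t_0}\int_0^t \Phi_0(\theta(\tau)) e^{\lambda\tau}\, d\tau, \qquad v(t) = v_0 + \frac{1}{\lambda t_0}\int_0^t \Phi_0(\theta(\tau))\, d\tau. \tag{7}$$

Substituted back into (6), this gives $u_0 = r_0$, $v_0 = \theta_0 + \frac{r_0}{\lambda}$, and

$$\begin{aligned}
r(t) &= \Bigl( r_0 + \frac{1}{t_0}\int_0^t \Phi_0(\theta(\tau)) e^{\lambda\tau}\, d\tau \Bigr) e^{-\lambda t}, \\
\theta(t) &= \theta_0 + t + \frac{r_0}{\lambda}\bigl(1 - e^{-\lambda t}\bigr) + \frac{1}{\lambda t_0}\Bigl( \int_0^t \Phi_0(\theta(\tau))\, d\tau - e^{-\lambda t}\int_0^t \Phi_0(\theta(\tau)) e^{\lambda\tau}\, d\tau \Bigr).
\end{aligned} \tag{8}$$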

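We assume in the rest of this subsection that $|r_0| < 1$. Recall that $K_0 = \max\{\|\Phi\|_{C^4}, 1\}$. We remark that in most of our bounds involving $K_0$, it is in fact not necessary to use the $C^4$-norm. For example, $K_0$ in Lemmas 2.1 and 2.4 can be replaced by the $C^1$-norm of $\Phi$.

Lemma 2.1.
(i) $|\theta(t_0) - \theta_0| < 5K_0 t_0$;
(ii) $\max_{t \le t_0}\left|\frac{\partial\theta(t)}{\partial\theta_0} - 1\right| < 4K_0 t_0$; $\quad \max_{t \le t_0}\left|\frac{\partial\theta(t)}{\partial r_0}\right| < 2t_0$;
(iii) $\left|\frac{\partial r(t_0)}{\partial\theta_0}\right| < 2K_0$; $\quad \left|\frac{\partial r(t_0)}{\partial r_0}\right| \le 2$.

Proof. (i) From (8),

$$|\theta(t) - \theta_0| < t + \frac{|r_0|}{\lambda}(1 - e^{-\lambda t}) + \left|\frac{1}{\lambda t_0}\int_0^t \Phi(\theta(\tau))(1 - e^{\lambda\tau})\,d\tau\right| + \left|\frac{(1 - e^{-\lambda t})}{\lambda t_0}\int_0^t \Phi(\theta(\tau))e^{\lambda\tau}\,d\tau\right|.$$

Using the inequalities $1 - e^{-x} \le x$ and $e^x - 1 \le xe^x$ for $x > 0$ and the fact that $\lambda t_0 < \frac{1}{10}$, we see immediately that the four terms above add up to $< 5K_0 t_0$.

(ii) $$\frac{\partial\theta}{\partial\theta_0} = 1 + A + B,$$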


Q. Wang, L.-S. Young

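where

$$A = \frac{1}{\lambda t_0}\int_0^t \Phi'(\theta)\frac{\partial\theta}{\partial\theta_0}(1 - e^{\lambda\tau})\,d\tau, \qquad B = \frac{1}{\lambda t_0}(1 - e^{-\lambda t})\int_0^t \Phi'(\theta)\frac{\partial\theta}{\partial\theta_0}e^{\lambda\tau}\,d\tau. \tag{9}$$

Letting $\Delta_1 = \max_{t \le t_0}\left|\frac{\partial\theta(t)}{\partial\theta_0} - 1\right|$ and recalling that $t_0 K_0 < \frac{1}{10}K_0^{-1} \le \frac{1}{10}$, we obtain

$$\Delta_1 \le |A| + |B| \le \frac{5}{2}K_0 t_0(\Delta_1 + 1) \le \frac{1}{4}\Delta_1 + \frac{5}{2}K_0 t_0,$$

which implies that $\Delta_1 < 4K_0 t_0$. Similarly, writing

$$\frac{\partial\theta}{\partial r_0} = \frac{1}{\lambda}(1 - e^{-\lambda t}) + \tilde A + \tilde B,$$

where

$$\tilde A = \frac{1}{\lambda t_0}\int_0^t \Phi'(\theta)\frac{\partial\theta}{\partial r_0}(1 - e^{\lambda\tau})\,d\tau, \qquad \tilde B = \frac{1}{\lambda t_0}(1 - e^{-\lambda t})\int_0^t \Phi'(\theta)\frac{\partial\theta}{\partial r_0}e^{\lambda\tau}\,d\tau,$$

and reasoning as above, we arrive at the desired bound for $\frac{\partial\theta}{\partial r_0}$.

(iii) follows from (ii) and a straightforward computation. □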
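The bound in Lemma 2.1(i) can be sanity-checked numerically. The sketch below integrates, with a classical Runge-Kutta scheme, the system $\dot\theta = 1 + r$, $\dot r = -\lambda r + \Phi(\theta)/t_0$ on $[0, t_0]$. This form of Eq. (2) is an assumption here: it is what differentiating (6) with (7) yields, since Eq. (2) itself is stated elsewhere in the paper. The profile $\Phi(\theta) = \sin(2\pi\theta)$, the parameter values, and the use of the $C^1$-norm for $K_0$ (which the text allows for Lemma 2.1) are likewise illustrative choices.

```python
import math

def phi(theta):
    """Hypothetical forcing profile (an assumption for illustration only)."""
    return math.sin(2.0 * math.pi * theta)

def flow_to_t0(theta0, r0, lam, t0, steps=2000):
    """RK4 integration of theta' = 1 + r, r' = -lam*r + phi(theta)/t0 on [0, t0].

    The right-hand side is the form of Eq. (2) implied by (6) and (7).
    """
    def rhs(th, r):
        return 1.0 + r, -lam * r + phi(th) / t0

    h = t0 / steps
    th, r = theta0, r0
    for _ in range(steps):
        k1 = rhs(th, r)
        k2 = rhs(th + 0.5 * h * k1[0], r + 0.5 * h * k1[1])
        k3 = rhs(th + 0.5 * h * k2[0], r + 0.5 * h * k2[1])
        k4 = rhs(th + h * k3[0], r + h * k3[1])
        th += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        r += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return th, r

# C^1-norm of sin(2*pi*theta), taken here as max(sup|phi|, sup|phi'|).
K0 = max(2.0 * math.pi, 1.0)
```

With `t0 = 0.002` and `lam = 5.0` the standing assumptions $t_0 < \frac{1}{10}K_0^{-2}$ and $\lambda t_0 < \frac{1}{10}$ hold for this $K_0$, and `flow_to_t0(0.1, 0.5, 5.0, 0.002)` can be compared against $5K_0 t_0$.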

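We will also need estimates on higher derivatives.

Lemma 2.2. For $i = 2$ and $3$,

$$\max_{0 \le t \le t_0}\left|\frac{\partial^i\theta(t)}{\partial\theta_0^i}\right| \le 20K_0 t_0.$$

Proof. Letting $A$ and $B$ be as in (9), we have

$$\frac{\partial^2\theta}{\partial\theta_0^2} = \frac{\partial A}{\partial\theta_0} + \frac{\partial B}{\partial\theta_0},$$

where

$$\frac{\partial A}{\partial\theta_0} = \frac{1}{\lambda t_0}\int_0^t\left[\Phi''(\theta)\left(\frac{\partial\theta}{\partial\theta_0}\right)^2 + \Phi'(\theta)\frac{\partial^2\theta}{\partial\theta_0^2}\right](1 - e^{\lambda\tau})\,d\tau,$$

$$\frac{\partial B}{\partial\theta_0} = \frac{1}{\lambda t_0}(1 - e^{-\lambda t})\int_0^t\left[\Phi''(\theta)\left(\frac{\partial\theta}{\partial\theta_0}\right)^2 + \Phi'(\theta)\frac{\partial^2\theta}{\partial\theta_0^2}\right]e^{\lambda\tau}\,d\tau.$$

Let $\Delta_2 = \max_{0 \le t \le t_0}\left|\frac{\partial^2\theta(t)}{\partial\theta_0^2}\right|$. Similar reasoning as before gives

$$\Delta_2 \le \frac{5}{2}K_0 t_0(1 + \Delta_1)^2 + \frac{1}{4}\Delta_2,$$

and thus $\Delta_2 < 8K_0 t_0$. The proof for $i = 3$ is similar. □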


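2.3. Relating $F_T$ to $\Phi$, $\lambda$ and $t_0$. Let $T^1_{a,0}$ and $\hat T^1_{a,0}$ be as in Sect. 2.1. Reading off $\theta(t_0)$ and $r(t_0)$ from Eq. (8), we have

$$T^1_{a,0}(\theta_0, r_0) = \theta_0 + \frac{r_0}{\lambda} + \frac{1}{\lambda t_0}\int_0^{t_0}\Phi(\theta(t))\,dt + a.$$

From this, it follows immediately that $|T^1_{a,0} - \hat T^1_{a,0}| \le \frac{K_0 t_0}{\lambda}$. We record the following estimates for future use.

Lemma 2.3.
(i) $$\frac{4}{5\lambda} < \frac{\partial T^1_{a,0}}{\partial r_0} < \frac{6}{5\lambda};$$
(ii) There is a numerical constant $M_0$ such that for $i = 1, 2, 3$,
$$\left|\frac{\partial^i T^1_{a,0}}{\partial\theta_0^i} - \frac{\partial^i \hat T^1_{a,0}}{\partial\theta_0^i}\right| \le \frac{M_0 K_0^2 t_0}{\lambda}.$$

Proof. (i) Since

$$\frac{\partial T^1_{a,0}}{\partial r_0} = \frac{1}{\lambda} + \frac{1}{\lambda t_0}\int_0^{t_0}\Phi'(\theta)\frac{\partial\theta}{\partial r_0}\,dt,$$

it suffices to observe that the second term on the right has absolute value bounded above by $\frac{1}{\lambda}K_0(2t_0)$ (Lemma 2.1(ii)), which is $< \frac{1}{5\lambda}$.

(ii)

$$\left|\frac{\partial T^1_{a,0}}{\partial\theta_0} - \frac{\partial \hat T^1_{a,0}}{\partial\theta_0}\right| = \frac{1}{\lambda t_0}\left|\int_0^{t_0}\Phi'(\theta(t))\left(\frac{\partial\theta}{\partial\theta_0} - 1\right)dt + \int_0^{t_0}\big(\Phi'(\theta(t)) - \Phi'(\theta_0)\big)\,dt\right| \le \frac{K_0}{\lambda}(4K_0 t_0 + 5K_0 t_0) = \frac{9K_0^2 t_0}{\lambda}$$

by Lemma 2.1(ii) and (i). For the second derivative,

$$\frac{\partial^2 T^1_{a,0}}{\partial\theta_0^2} - \frac{\partial^2 \hat T^1_{a,0}}{\partial\theta_0^2} = \frac{1}{\lambda t_0}\int_0^{t_0}\left[\Phi''(\theta)\left(\frac{\partial\theta}{\partial\theta_0}\right)^2 + \Phi'(\theta)\frac{\partial^2\theta}{\partial\theta_0^2}\right]dt - \frac{1}{\lambda}\Phi''(\theta_0)$$

$$= \frac{1}{\lambda t_0}\int_0^{t_0}\Phi''(\theta)\left[\left(\frac{\partial\theta}{\partial\theta_0}\right)^2 - 1\right]dt + \frac{1}{\lambda t_0}\int_0^{t_0}\Phi'(\theta)\frac{\partial^2\theta}{\partial\theta_0^2}\,dt + \frac{1}{\lambda t_0}\int_0^{t_0}\big(\Phi''(\theta(t)) - \Phi''(\theta_0)\big)\,dt.$$

Observe that each of the functions of the integrals is bounded by constant $\cdot K_0^2 t_0$. The third derivative is estimated similarly. □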

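We are now ready to estimate the derivative of $F_T$. Let $\hat T_{a,0} = (\hat T^1_{a,0}, 0)$. Then

$$D\hat T_{a,0} = \begin{pmatrix} 1 + \frac{1}{\lambda}\Phi' & \frac{1}{\lambda} \\ 0 & 0 \end{pmatrix}.$$

Lemma 2.4.

$$DF_T = \begin{pmatrix} 1 + \frac{1}{\lambda}\Phi' & \frac{1}{\lambda} \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} A_1 & B_1 \\ C_1 & D_1 \end{pmatrix},$$

where

$$|A_1| \le \frac{9K_0^2 t_0}{\lambda} + \frac{2K_0}{\lambda}e^{-\lambda(T - t_0)}, \qquad |B_1| \le \frac{2K_0 t_0}{\lambda} + \frac{2}{\lambda}e^{-\lambda(T - t_0)},$$

$$|C_1| \le 2K_0 e^{-\lambda(T - t_0)}, \qquad |D_1| \le 2e^{-\lambda(T - t_0)}.$$

Proof. Comparing (3) and (4), and using Lemma 2.1, we see that

$$DT_{a,b} = DT_{a,0} + \begin{pmatrix} A_2 & B_2 \\ C_2 & D_2 \end{pmatrix},$$

where $|A_2| \le \frac{2K_0}{\lambda}e^{-\lambda(T - t_0)}$, $|B_2| \le \frac{2}{\lambda}e^{-\lambda(T - t_0)}$, $|C_2| \le 2K_0 e^{-\lambda(T - t_0)}$, and $|D_2| \le 2e^{-\lambda(T - t_0)}$. By Lemma 2.3,

$$DT_{a,0} = D\hat T_{a,0} + \begin{pmatrix} A_3 & B_3 \\ C_3 & D_3 \end{pmatrix},$$

where $|A_3| \le \frac{9K_0^2 t_0}{\lambda}$, $|B_3| \le \frac{2K_0 t_0}{\lambda}$ and $C_3 = D_3 = 0$. □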

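2.4. Absorbing sets. For dynamical systems with noncompact phase spaces, it is convenient to know that the action takes place in compact regions. An absorbing set for $F_T$ is an open set $A$ with compact closure such that $F_T(A) \subset A$ and for all $z \in S^1 \times \mathbb{R}$, there exists $n = n(z)$ such that $F_T^n z \in A$.

Lemma 2.5. Assume $\lambda(T - t_0) \ge 1$. Then $A := \{(\theta, r) \in S^1 \times \mathbb{R} : |r| < 4K_0 e^{-\lambda(T - t_0)}\}$ is an absorbing set for $F_T$.

Proof. Write $(\theta_n, r_n) = F_T^n(z)$ for $z = (\theta_0, r_0)$. By (3) and (8) we have $|r_n| < e^{-\lambda(T - t_0)}|r_{n-1}| + 2K_0 e^{-\lambda(T - t_0)}$. With $e^{-\lambda(T - t_0)} < \frac{1}{2}$, this proves $F_T(A) \subset A$. The other condition follows since inductively we have

$$|r_n| < e^{-n\lambda(T - t_0)}|r_0| + 2K_0\sum_{i=1}^{n} e^{-i\lambda(T - t_0)},$$

where the first term tends to $0$ as $n \to \infty$ and, writing $c = e^{-\lambda(T - t_0)} < \frac{1}{2}$, the second term is bounded by $\frac{2K_0 c}{1 - c} < 4K_0 e^{-\lambda(T - t_0)}$ for every $n$. □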
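To see the absorption mechanism of Lemma 2.5 in action, the sketch below iterates the worst case permitted by the recursion in the proof, $r_n = c\,r_{n-1} + 2K_0 c$ with $c = e^{-\lambda(T - t_0)}$, and counts how many steps are needed before $|r_n| < 4K_0 c$. Replacing the strict inequality by this equality is an assumption made to obtain an upper envelope; the function name is hypothetical.

```python
import math

def steps_to_absorb(r0, K0, lam, T_minus_t0, max_iter=10_000):
    """Iterate the worst-case envelope r_n = c*r_{n-1} + 2*K0*c until |r_n| < 4*K0*c.

    Returns (number of steps, final envelope value, radius of the absorbing set).
    """
    c = math.exp(-lam * T_minus_t0)
    if c >= 0.5:
        raise ValueError("the argument needs exp(-lam*(T - t0)) < 1/2")
    bound = 4.0 * K0 * c
    r, n = abs(r0), 0
    while r >= bound:
        r = c * r + 2.0 * K0 * c  # worst case allowed by the recursion
        n += 1
        if n > max_iter:
            raise RuntimeError("did not enter the absorbing set")
    return n, r, bound
```

Because the envelope has the attracting fixed point $\frac{2K_0 c}{1 - c}$, which lies strictly below $4K_0 c$ when $c < \frac{1}{2}$, the loop terminates for every initial $r_0$.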


3. A View from the Singular Limit

Recall that $f_a$ is the restriction of $T^1_{a,0}$ to $\{r_0 = 0\}$ (see Sect. 2.1). We have thus defined, for each choice of $\Phi$, $\lambda$ and $t_0$, a one-parameter family of circle maps which represents the behavior of Eq. (1) as $T \to \infty$. Conversely, one can recover information on the system defined by (1) if its singular limit $\{f_a\}$ is known: for large $T$ or, equivalently, small $b > 0$, $F_T = T_{a,b}$ can be thought of as a perturbation of $f_a$ or an “unfolding” of $f_a$ to a small neighborhood of $\{r_0 = 0\}$. In this section, we look at the problem from the point of view of the singular limit. Forgetting temporarily their connection to Eq. (1), we think of $f_a$ as abstract circle maps. The following is a brief review of several types of behaviors that are known to be “typical” and a general discussion of existing methods for transporting these one-dimensional behaviors to two dimensions.

The invertible case: circle diffeomorphisms. The classical theory of Poincaré and Denjoy is well known (see any elementary text). We point out a striking resemblance between $\hat f_a$ and the well known family of circle maps first studied by Arnold [A]:

$$g_{\mu,\varepsilon} : x \mapsto x + \mu + \varepsilon\cos(2\pi x), \qquad \varepsilon \ge 0.$$

A dichotomy of behavior was observed for this family: “resonant wedges” in the (µ, ε)plane corresponding to rational rotation numbers, and the “devil’s staircase” defined by µ → ρ(gµ,ε ). These ideas are very much behind our results in Theorem 1. For us, an important question is how to bring these results for fa to Ta,b for b > 0. KAM techniques (using the intersection property) come to mind for the persistence of invariant circles with Diophantine rotation numbers, but they will not be used here. Because of strong normal contraction, invariant curves are shown to exist independent of rotation number using techniques from hyperbolic theory. The situation is then reduced to one dimension. Smooth non-invertible circle maps. For general information on one-dimensional maps, see e.g. [dMvS]. Two types of dynamical behaviors are known to be prevalent. They are (i) maps with attractive periodic cycles, and (ii) maps with absolutely continuous invariant measures. There is some evidence that these are the only observable pure dynamics types. For the quadratic family Qa : x → 1 − ax 2 , (i) and (ii) together account for a set of full Lebesgue measure in parameter space3 [Lyu2]. We discuss these two cases separately. (i) Periodic sinks. Continuing to use the quadratic family as a paradigm, we see that period doubling occurs for a below some a0 (see e.g. [CE]). In this regime, Qa has a periodic sink which attracts all points in the interval except for a finite number of unstable periodic orbits and their pre-images. Above a0 , there is an open set of a for which Qa has an attractive periodic orbit, but the set of points not attracted to the sink is now a complicated invariant set on which the map is uniformly expanding. The set of parameters with this property has been shown to be dense ([GS] and [Lyu1]). When the dynamical picture of a one-dimensional map is as above, it “unfolds” into a two-dimensional diffeomorphism satisfying Smale’s Axiom A [Sm]. 
The passage of uniform expansion in one dimension to uniform hyperbolicity in two dimensions is relatively simple due to the robustness or stability of uniform hyperbolic behavior (see e.g. [Sh]).

3 (i) and (ii) can easily occur on different parts of the phase space for multimodal maps.
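The periodic-sink regime of the quadratic family discussed above is easy to observe numerically. The sketch below iterates $Q_a : x \mapsto 1 - ax^2$, discards a transient, and then reads off the period of the cycle the orbit has settled on together with its multiplier. The starting point, transient length and tolerance are arbitrary choices for this illustration, and the function names are hypothetical.

```python
def Q(x, a):
    """Quadratic map Q_a(x) = 1 - a*x^2."""
    return 1.0 - a * x * x

def attracting_cycle(a, x0=0.3, transient=2000, max_period=64, tol=1e-9):
    """Return (period, cycle points) of the cycle the orbit of x0 settles on, or None."""
    x = x0
    for _ in range(transient):
        x = Q(x, a)
    cycle = [x]
    for period in range(1, max_period + 1):
        y = Q(cycle[-1], a)
        if abs(y - cycle[0]) < tol:
            return period, cycle
        cycle.append(y)
    return None

def multiplier(cycle, a):
    """Product of Q_a'(x) = -2*a*x along the cycle; modulus < 1 means a sink."""
    m = 1.0
    for x in cycle:
        m *= -2.0 * a * x
    return m
```

For instance, at $a = 1$ the critical point $0$ lies on the period-two orbit $\{0, 1\}$, so the detected cycle has multiplier $0$; at $a = 0.5$ the orbit settles on an attracting fixed point.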


(ii) Absolutely continuous invariant measures. There is another type of dynamics that is prevalent in the probabilistic sense. Theorem ([J]). Let Qa : x → 1 − ax 2 , a ∈ [0, 2]. Then there exists a positive measure set of a for which Qa has an absolutely continuous invariant probability measure with a positive Lyapunov exponent. The question here is: what does the existence of absolutely continuous invariant measures for fa tell us about Ta,b for b small? The answer to this question is far from simple, and the situation was only resolved quite recently. It came as a result of two different sets of developments. The first is an “abstract” theory of nonuniformly hyperbolic systems, which provides a general framework for studying chaotic behavior. The most important idea that has come out of this theory is probably the notion of an SRB measure ([Si, R1, B, P, R2, Le, LY, PS, Y2]; see also [Y1] for an exposition). The other set of developments is more directly related to small perturbations of one-dimensional maps. Pioneering work in this direction was carried out by Benedicks and Carleson [BC], who achieved an important breakthrough on the Hénon maps. The utility of [BC] in applications, however, is limited by the fact that it relied on computations using explicitly the formulas of the Hénon maps. For other results related to the Hénon maps, see e.g. [BY, MV]. In a recent paper [WY], we extended the analysis in [BC] to a more general class of attractors, namely those with strong dissipation, one direction of instability, and well defined singular limits. We also developed the geometric and dynamical pictures of these attractors more fully, merging some of the ideas from [BC] with those from general nonuniform hyperbolic theory. Checkable conditions were given for the first time that guarantee the existence of SRB measures and their stochastic behavior. 
The properties in the statement of Theorem 3 are a summary of the results in [WY], and this entire package is guaranteed once certain fairly simple conditions are satisfied. These conditions are stated and checked for our equation in Sect. 5.

4. Proofs of Theorems 1 and 2

4.1. Proof of Theorem 1. We assume throughout this subsection that $\lambda$ and $T$ satisfy the hypotheses of Theorem 1, i.e. $\lambda \ge 4K_0$ and $T - t_0 \ge \frac{3}{2}$. This implies in particular that $e^{-\lambda(T - t_0)} \le e^{-6} < \frac{1}{100}$. Recall also that a standing assumption throughout is $t_0 < \frac{1}{10}K_0^{-2} \le \frac{1}{10}$ (see Sect. 1.1). Let $F = F_T$.

4.1.1. Existence of invariant circles. Identifying the tangent space of $z \in S^1 \times \mathbb{R}$ with the $(\theta, r)$-plane, we introduce the following cones:

$$K_c := \{(\theta, r) : |r| < \tfrac{1}{4}|\theta|\}, \qquad K_s := \{(\theta, r) : |r| > |\theta|\}.$$

Lemma 4.1. (a) For $z \in \{|r| < 1\}$, $v \in K_c \Longrightarrow DF_z(v) \in K_c$ and $|DF_z(v)| > \frac{1}{3}|v|$.
(b) For $z$ with $F^{-1}z \in \{|r| < 1\}$, $v \in K_s \Longrightarrow DF_z^{-1}(v) \in K_s$ and $|DF_z^{-1}(v)| > 10|v|$.


Proof. Write

$$DF_z = \begin{pmatrix} A & B \\ C & D \end{pmatrix}.$$

Substituting the admissible values of $\lambda$, $T$ and $t_0$ into Lemma 2.4, we obtain

$$|A - 1| < \frac{1}{2}, \qquad |B| < \frac{1}{3}, \qquad \text{and} \qquad |C|, |D| < \frac{1}{50}. \tag{10}$$

(The estimate for $|C|$ uses $K_0 e^{-6K_0} < e^{-6}$.) Let $s(v)$ denote the slope of a vector $v \in \mathbb{R}^2$. Then

$$s(DF(v)) = \frac{C + Ds(v)}{A + Bs(v)}.$$

To verify $DF(K_c) \subset K_c$, for example, we choose $v$ with $|s(v)| < \frac{1}{4}$, and substituting in the numbers from (10), we obtain

$$|s(DF(v))| < \frac{\frac{1}{50} + \frac{1}{50}\cdot\frac{1}{4}}{\frac{1}{2} - \frac{1}{3}\cdot\frac{1}{4}} < \frac{1}{4}.$$

The other claims are checked similarly. □
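The arithmetic behind the cone invariance $DF(K_c) \subset K_c$ can be checked exactly. The sketch below evaluates the worst-case slope bound $\frac{|C| + |D|\,|s|}{|A| - |B|\,|s|}$ using the extreme values permitted by (10) and $|s(v)| < \frac{1}{4}$; the function name is a hypothetical helper introduced for this illustration.

```python
from fractions import Fraction

def worst_case_slope(s_max, a_min, b_max, c_max, d_max):
    """Upper bound for |s(DF v)| = |C + D s| / |A + B s| under the given bounds."""
    return (c_max + d_max * s_max) / (a_min - b_max * s_max)

# Extreme values from (10): |A - 1| < 1/2 gives |A| > 1/2; |B| < 1/3; |C|, |D| < 1/50.
slope_bound = worst_case_slope(
    Fraction(1, 4), Fraction(1, 2), Fraction(1, 3), Fraction(1, 50), Fraction(1, 50)
)
```

The resulting bound is far below the cone aperture $\frac{1}{4}$, which shows how much room the estimate leaves.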

We have thus identified a family of stable cones $K_s$ and a family of center cones $K_c$. We call $K_c$ “center cones” because while vectors in $K_c$ may be expanded or contracted by $DF$, they are not contracted as strongly as vectors in $DF^{-1}(K_s)$. This domination implies uniform hyperbolicity on the projective level, a property relied upon heavily in the proof of the next lemma. Recall that $A = \{|r| < 4K_0 e^{-\lambda(T - t_0)}\}$ is an absorbing set of $F$ (Lemma 2.5).

Lemma 4.2. There is an $F$-invariant curve $\Omega$ in $A$ such that
(a) $\Omega$ is the graph of a $C^4$ function $g : S^1 \to \mathbb{R}$ with $|g'| < 1/4$;
(b) for every $z \in A$, $\mathrm{dist}(F^n z, \Omega) \to 0$ as $n \to \infty$.

Proof. By standard arguments from hyperbolic theory, it follows from Lemma 4.1 that there is a stable foliation $W^s$ defined everywhere on $A$. Tangent vectors to the leaves of $W^s$ satisfy $|s(v)| > 1$, so that each $W^s$-leaf is a $C^1$ segment joining the two boundary components of $A$. Moreover, $F$ maps each $W^s$-leaf strictly into a $W^s$-leaf, contracting length by a factor $< \frac{1}{10}$. It follows from this that $\Omega := \cap_{n > 0} F^n(A)$ is a compact set which meets each $W^s$-leaf in exactly one point. Part (b) of Lemma 4.2 follows immediately.

Let $\gamma_0$ be the curve $\{r = 0\}$. Then the images $\gamma_n := F^n\gamma_0$ converge in the Hausdorff metric to $\Omega$, the center manifold of $F$. By Lemma 4.1(a), the tangent vectors to $\gamma_n$ have slopes between $\pm 1/4$ for all $n$. This proves that $\Omega$ is the graph of a Lipschitz function $g$ with Lipschitz constant $\le 1/4$. That $g$ is $C^4$ follows from the fact that $F$ is $C^4$ and standard graph transform arguments involving the Fiber Contraction Theorem. We refer the reader to [HPS]. □


4.1.2. Dynamics on invariant circles. For each $T$, let $\Omega_T$ be the simple closed curve left invariant by $F_T$. We introduce a family of maps $h_T : S^1 \to S^1$ as follows: For $\theta_0 \in S^1$, let $z$ be the unique point in $\Omega_T$ whose $\theta$-coordinate is $\theta_0$. Then $h_T(\theta_0) = \theta_1$, where $\theta_1$ is the $\theta$-coordinate of $F_T(z)$. Let $\rho(h_T)$ denote the rotation number of $h_T$. Since

$$\frac{d\theta_1}{dT} > 1 - e^{-\lambda(T - t_0)}|r(t_0)| > \frac{99}{100}, \tag{11}$$

it is an easy exercise to see that $T \mapsto \rho(h_T)$ is a continuous nondecreasing function with $\rho(h_{T+1}) \approx \rho(h_T) + 1$.

Case 1. $\rho(h_T) \in \mathbb{R} \setminus \mathbb{Q}$. By Denjoy theory, $h_T$ is topologically conjugate to the rigid rotation by $\rho(h_T)$, which is well known to admit only one invariant probability measure. This together with Lemmas 2.5 and 4.2(b) implies immediately the unique ergodicity of $F_T$. To prove that the set of $T$ in Theorem 1 for which this case occurs has positive Lebesgue measure, we appeal to the following theorem of Herman:

Theorem ([He]). Let $\mathrm{Diff}^r_+(S^1)$ denote the space of $C^r$ orientation-preserving diffeomorphisms of $S^1$. Let $s \mapsto h_s \in \mathrm{Diff}^3_+(S^1)$ be $C^1$ and suppose that for some $s_0 < s_1$, $\rho(h_{s_0}) \ne \rho(h_{s_1})$. Then $\{s \in [s_0, s_1] : \rho(h_s) \in \mathbb{R} \setminus \mathbb{Q}\}$ has positive Lebesgue measure.

Case 2. $\rho(h_T) \in \mathbb{Q}$. We fix $p, q \in \mathbb{Z}^+$, $p, q$ relatively prime, and let $I$ be a connected component of $\{T : \rho(h_T) = \frac{p}{q}\}$ with nonempty interior. From (11), it follows that $\frac{d}{dT}(h_T^q(\theta_0)) > \frac{99}{100}$ for every $\theta_0$. Standard transversality arguments give an open and dense subset $\tilde I$ of $I$ such that for $T \in \tilde I$, the graph of $h_T^q$ is transversal to the diagonal of $S^1 \times S^1$. For $T \in \tilde I$, the fixed points of $h_T^q$ (in the order in which they appear on $S^1$) are alternately strictly repelling and strictly contracting. With the contraction normal to $\Omega_T$, they correspond to saddles and sinks respectively for $F_T$. This completes the proof of Theorem 1.
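The monotone dependence of the rotation number on the parameter can be illustrated on a model family. The sketch below does not compute $h_T$ itself, which would require the invariant curve; as a stand-in it uses the lift $x \mapsto x + \frac{1}{\lambda}\sin(2\pi x) + a$ of a circle diffeomorphism of the form $\hat f_a$, with an assumed profile and $\lambda > 2\pi$, and estimates the rotation number by the average displacement of an orbit of the lift.

```python
import math

def lift(x, a, lam):
    """Lift to the real line of x -> x + sin(2*pi*x)/lam + a (assumed model family)."""
    return x + math.sin(2.0 * math.pi * x) / lam + a

def rotation_number(a, lam, n=5000):
    """Estimate the rotation number as (F^n(0) - 0) / n for the lift F."""
    x = 0.0
    for _ in range(n):
        x = lift(x, a, lam)
    return x / n

# Rotation numbers along an increasing sequence of parameters a (lam = 10 > 2*pi).
samples = [rotation_number(k / 10.0, 10.0) for k in range(11)]
```

The list `samples` is nondecreasing and runs from $0$ at $a = 0$ to approximately $1$ at $a = 1$, with plateaus where the rotation number locks onto a rational value, in the spirit of the devil's staircase mentioned in Sect. 3.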

4.2. Proof of Theorem 2. Our analysis will proceed as follows. Referring the reader to Sect. 2.1 for definitions and notation, we will argue that uniformly expanding invariant sets of $f_a$ translate directly into uniformly hyperbolic invariant sets of $T_{a,b}$ for $b$ sufficiently small. That being the case, to produce the phenomena described in Theorem 2, it suffices to produce the corresponding behaviors for $f_a$. Furthermore, since uniformly expanding invariant sets are stable under perturbations, and $f_a$ is a small perturbation of $\hat f_a$ for $t_0 \ll \lambda$ (Lemma 2.3), it suffices to work with $\hat f_a$. Recall that

$$\hat f_a(s) = s + \frac{1}{\lambda}\Phi(s) + a.$$

4.2.1. Gradient-like dynamics. Let $m_0 = -\min\Phi'$. Then $\hat f_a$ is a circle diffeomorphism if and only if $\lambda > m_0$. Fix $\lambda > m_0$. Varying $a$ (which corresponds to moving the graph of $\hat f_a$ up and down), we see that there is an open set of $a$ for which $\hat f_a$ has a finite number of fixed points which are alternately repelling and attracting. For these $a$, it is a simple exercise to show that for sufficiently small $t_0$ and $b$, $F_T = T_{a,b}$ has the gradient-like dynamics described in Theorem 2. More generally, if $\rho(\hat f_a) = \frac{p}{q}$, then the discussion above applies to $\hat f_a^q$ unless $\hat f_a^q = \mathrm{id}$.
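The alternation of attracting and repelling fixed points described above can be seen on a concrete example. The sketch below assumes $\Phi(s) = \sin(2\pi s)$, $\lambda = 10$ (so that $\lambda > m_0 = 2\pi$) and $a = 0.05$; these values and the function names are illustrative choices, not taken from the paper. It locates the zeros of the displacement $\hat f_a(s) - s$ by bisection and classifies each fixed point by the slope $1 + \frac{1}{\lambda}\Phi'(s)$.

```python
import math

LAM = 10.0      # assumed damping; LAM > 2*pi keeps f_hat a diffeomorphism
A_SHIFT = 0.05  # assumed value of the parameter a

def phi(s):
    return math.sin(2.0 * math.pi * s)

def dphi(s):
    return 2.0 * math.pi * math.cos(2.0 * math.pi * s)

def displacement(s, a=A_SHIFT, lam=LAM):
    """f_hat(s) - s; its zeros are the fixed points since |displacement| < 1 here."""
    return phi(s) / lam + a

def fixed_points(a=A_SHIFT, lam=LAM, grid=1000, iters=60):
    """Return [(s, kind)] for the fixed points in [0, 1), in increasing order."""
    found = []
    for i in range(grid):
        lo, hi = i / grid, (i + 1) / grid
        if displacement(lo, a, lam) * displacement(hi, a, lam) < 0.0:
            for _ in range(iters):  # plain bisection on the bracketing cell
                mid = 0.5 * (lo + hi)
                if displacement(lo, a, lam) * displacement(mid, a, lam) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            s = 0.5 * (lo + hi)
            slope = 1.0 + dphi(s) / lam
            found.append((s, "attracting" if abs(slope) < 1.0 else "repelling"))
    return found
```

For these parameter values the fixed points solve $\sin(2\pi s) = -\frac{1}{2}$, so one expects $s = \frac{7}{12}$ (attracting) and $s = \frac{11}{12}$ (repelling).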

Fig. 3 a,b. (Panels (a) and (b); marked points $p_1$, $c_1$, $p_2$, $c_2$ in both panels, and $x_1$ in panel (b).)

Gradient-like dynamics, in general, persist when $\lambda$ drops below $m_0$. Intuitively, no simple closed invariant curve exists beyond this point because the unstable manifold of the saddle “turns around”. We provide a rigorous proof in a restricted context.

Proposition 4.1. Suppose $\Phi$ has exactly two critical points and negative Schwarzian derivative. Then there exist intervals of $\lambda$, $t_0$ and $T$ for which $F_T$ has gradient-like dynamics but there are no smooth simple closed invariant curves.

Proof. Let $c_1$ and $c_2$ denote the critical points of $\Phi$. There is an interval of $a_0$ such that if $\Phi = a_0 + \tilde\Phi$, then $\tilde\Phi$ has exactly two zeros, at say $p_1$ and $p_2$. Fix such an $a_0$. Without loss of generality, we assume $p_1 < c_1 < p_2 < c_2 < p_1 + 1 = p_1$, and $\Phi'(p_1) > 0$, $\Phi'(p_2) < 0$. In the rest of the proof, for each $\lambda$ we consider, let $f = \hat f_a$, where $a = -\frac{a_0}{\lambda}$ mod 1, so that $f(s) = s + \frac{1}{\lambda}\tilde\Phi(s)$. Observe that $p_1$ is a repelling fixed point of $f$, $p_2$ is an attractive fixed point of $f$, and $f'(c_1) = f'(c_2) = 1$. This discussion is valid for all $\lambda$.

For large $\lambda$, $f$ maps $(c_1, c_2)$ strictly into itself. (See Fig. 3(a).) This continues to be the case for some interval of $\lambda$ below $m_0$. Since $\Phi' < 0$ on $(c_1, c_2)$, we have $1 - \frac{m_0}{\lambda} < f' < 1$ on $(c_1, c_2)$, so there exist $\varepsilon, \varepsilon' > 0$ and an interval $L$ of $\lambda$ below $m_0$ for which $f(c_1 + \varepsilon, c_2 - \varepsilon) \subset (c_1 + 2\varepsilon, c_2 - 2\varepsilon)$ and $|f'|_{(c_1 + \varepsilon, c_2 - \varepsilon)}| < 1 - \varepsilon'$. (See Fig. 3(b).) Thus every point in $(c_1 + \varepsilon, c_2 - \varepsilon)$ tends to $p_2$, and since every point in $S^1 \setminus (c_1 + \varepsilon, c_2 - \varepsilon)$ eventually enters $(c_1 + \varepsilon, c_2 - \varepsilon)$, we conclude that $f$ and hence $F = T_{a,b}$ have gradient-like dynamics for $a$ as above and $t_0$ and $b$ suitably small.

Let $\tilde p_1$ and $\tilde p_2$ denote the saddle and sink of $F$ respectively. To prove the proposition, suppose $F$ leaves invariant a smooth simple closed curve $\Omega$. Since it is not possible for all the points in an invariant circle to converge to the same point, $\Omega$ must intersect the stable manifold of $\tilde p_1$. This implies $\tilde p_1 \in \Omega$, and hence $W^u$, the unstable manifold of $\tilde p_1$, must be contained in $\Omega$. Fix an orientation on $\Omega$, and let $\tau$ be a positively oriented tangent field on $W^u$. To derive a contradiction, we will produce, for every $\varepsilon_1 > 0$, two points $z, z' \in W^u$ such that $d(z, z') < \varepsilon_1$ and $\tau(z)$ and $\tau(z')$ point in opposite directions.

By the negative Schwarzian property of $\Phi$, $f' = 0$ at exactly two points $x_1 < x_2$ in $(c_1, c_2)$. Move $\lambda$ if necessary so $x_i \ne p_2$, $i = 1, 2$. Without loss of generality, we

Fig. 4. (Labels in the figure: stable curves, $\tilde p_1$, $p_1$, $x_1$, $f(x_1)$.)

may assume $x_1 \in (c_1, p_2)$. The following two statements, which we claim are valid for suitable choices of $t_0$, $a$ and $b$, clearly lead to the desired contradiction.
(1) The right branch of $W^u$ is roughly horizontal until about $f(x_1)$, where it makes a sharp turn and doubles back for a definite distance, creating two roughly parallel segments with opposite orientation (see Fig. 4).
(2) There exist pairs of points on these parallel segments joined by stable curves.

Claims (1) and (2) follow from Lemma 4.3, which is a general result valid for any $\lambda$ and any $\Phi$ (and not just the ones considered in this subsection). It is similar in spirit to Lemma 4.2 and has the same proof, which will be omitted.

Lemma 4.3. Given $f_a$ and constants $\delta, \varepsilon > 0$, $\exists\, \bar b = \bar b(\Phi, \lambda, \delta, \varepsilon) \ll \delta$ such that the following hold for $F = T_{a,b}$ with $b < \bar b$. Let $z = (r, \theta) \in A$ (which depends on $b$) be such that $|f_a'(\theta)| > \delta$. Then:
(a) $|s(v)| = O(\frac{b}{\delta}) \Longrightarrow |s(DF_z v)| = O(\frac{b}{\delta})$ and $|DF_z v| > (1 - \varepsilon)\delta|v|$;
(b) there exists $C = C(\Phi, \lambda)$ such that $|s(DF_z v)| > C\delta \Longrightarrow |s(v)| > C\delta$ and $\frac{|DF_z v|}{|v|} = O(\frac{b}{\delta})$.

Claim (1) follows immediately from Lemma 4.3(a). Part (b) of this lemma implies that if a region of $A$ misses the two rectangles $\{(r, \theta) : |f'(\theta)| < \delta\}$ in all of its forward iterates, then it is foliated by stable curves. Since $f'(p_2) \ne 0$, Claim (2) is easily arranged by choosing $\delta$ sufficiently small. □

4.2.2. Transient chaos. We return to the family $\hat f_a$ where $\lambda$ is now assumed to be small. Let $c_1$ and $c_2$ be the critical points of $\Phi$. Then $\hat f_a$ has exactly two critical points $s_1$ and $s_2$ near $c_1$ and $c_2$. Let $a$ be fixed for now. As $\lambda$ is varied, the critical values $\hat f_a(s_1)$ and $\hat f_a(s_2)$ move at rates $\sim \frac{1}{\lambda}$ in opposite directions. There exists, therefore, a sequence of $\lambda$ for which they coincide. Observe that this sequence is independent of $a$. We now fix each of these $\lambda$ and adjust $a$ so that $\hat f_a(s_1) = s_1$, where $s_1$ is the critical point with the property that $|\Phi''(c_1)| \le |\Phi''(c_2)|$.

We will show that for the $(\lambda, a)$-pairs selected above, $f = \hat f_a$ has the following properties: (i) it has a sink, and (ii) when restricted to the set of points that are not attracted to the sink, $f$ is uniformly expanding. By design, we have $f(s_1) = s_1$, which is therefore a sink, and $f(s_2) = s_1$. For $i = 1, 2$, let $\alpha_i = \frac{\sqrt{1.5}}{|\Phi''(c_i)|}\lambda$ and $I_i = [s_i - \alpha_i, s_i + \alpha_i]$.


Lemma 4.4. Assume $\lambda$ is sufficiently small. Then
(a) for $s \notin I_1 \cup I_2$, we have $|f'(s)| > \sqrt{1.4}$;
(b) for $s \in I_1 \cup I_2$, we have $f^n s \to s_1$ as $n \to \infty$.

Proof. (a) We may assume for $s \notin I_1 \cup I_2$ that $|f'(s)| \ge |f'(s_i \pm \alpha_i)|$ for some $i$. Since this is $= \frac{1}{\lambda}|\Phi''(\xi_i)|\alpha_i$ for some $\xi_i \in I_i$, it is $> \sqrt{1.4}$.

(b) First we check $f(I_i) \subset I_1$, $i = 1, 2$:

$$|f(s_i \pm \alpha_i) - f(s_i)| = \frac{1}{2\lambda}|\Phi''(\xi_i)|\alpha_i^2 \le \frac{1}{2\lambda}|\Phi''(\xi_i)| \cdot \frac{1.5\,\lambda^2}{|\Phi''(c_i)|^2} \le \frac{\lambda}{|\Phi''(c_i)|} \le \frac{\lambda}{|\Phi''(c_1)|} < \alpha_1.$$

A similar computation shows that $f$ restricted to $I_1$ is a contraction. □

Let $F = T_{a,b}$, where $\lambda$ and $a$ are near the ones selected above and $t_0$ and $b$ are sufficiently small. Let $B_i$, $i = 1, 2$, be the two components of $A \setminus \{(\theta, r) : \theta \in I_1 \cup I_2\}$. With $\lambda$ sufficiently small, $F$ wraps each $B_i$ around $A$ (in the horizontal direction) at least once, with $F(B_i)$ crossing completely $B_j$ every time they meet. This, on the topological level, is the standard construction of a horseshoe. Let

$$\Lambda := \{z \in A : F^n(z) \in B_1 \cup B_2 \ \ \forall n \in \mathbb{Z}\}.$$

With $b$ sufficiently small, the uniform hyperbolicity of $F|_\Lambda$ follows from Lemma 4.3. This completes the proof of Theorem 2.

5. Proof of Theorem 3

5.1. Conditions from [WY] for strange attractors. As explained in the introduction, the proof of Theorem 3 is obtained largely via a direct application of [WY], provided the conditions in Sect. 1.1 of [WY] are verified. For the convenience of the reader, we give a self-contained discussion of these conditions here, modifying one of them to improve its checkability and adding a new one, (C4), to guarantee mixing.

The notation in this section is that in [WY]. We consider a family of maps $T_{a,b} : A = S^1 \times [-1, 1] \to A$, where $a \in [a_0, a_1] \subset \mathbb{R}$ and $b \in B_0 \subset \mathbb{R}$, $B_0$ being any subset with $0$ as an accumulation point.4 In this setup, $b$ is a measure of dissipation; our results hold for $b$ sufficiently small. We explain the role of the parameter $a$: For systems that are not uniformly hyperbolic, a scenario that competes with that of strange attractors and SRB measures is the presence of periodic sinks. In general, arbitrarily near systems with SRB measures, there are open sets of maps with sinks; proving directly the existence of an SRB measure for a given dynamical system requires information of arbitrarily high precision. We get around this problem by considering one-parameter families, in our case $a \mapsto T_{a,b}$, and by showing that if a family satisfies certain reasonable conditions, then a positive measure set of parameters with SRB measures is guaranteed. We now state our conditions on these families.

4 In [WY], $B_0$ is taken to be an interval but the formulation here is all that is used.


(C1) Regularity conditions.
(i) For each $b \in B_0$, the function $(x, y, a) \mapsto T_{a,b}(x, y)$ is $C^3$; and as $b \to 0$, these functions converge in the $C^3$ norm to $(x, y, a) \mapsto T_{a,0}(x, y)$.
(ii) For each $b \ne 0$, $T_{a,b}$ is an embedding of $A$ into itself, whereas $T_{a,0}$ is a singular map with $T_{a,0}(A) \subset S^1 \times \{0\}$.
(iii) There exists $K > 0$ such that for all $a, b$ with $b \ne 0$,
$$\frac{|\det DT_{a,b}(z)|}{|\det DT_{a,b}(z')|} \le K \qquad \forall z, z' \in S^1 \times [-1, 1].$$

As before, we refer to $T_{a,0}$ as well as its restriction to $S^1 \times \{0\}$, i.e. the family of one-dimensional maps $f_a : S^1 \to S^1$ defined by $f_a(x) = T_{a,0}(x, 0)$, as the singular limit of $T_{a,b}$. The rest of our conditions are imposed on the singular limit alone. The second condition in [WY] is:

(C2) There exists $a^* \in [a_0, a_1]$ such that $f = f_{a^*}$ satisfies the Misiurewicz condition.

The Misiurewicz condition (see [M]) encapsulates a number of properties some of which are hard to check or not needed in full force. We propose here to replace it by (C2’), a set of conditions that is more directly checkable (although a little cumbersome to state). That the results in [WY] are valid when (C2) is replaced by (C2’) below is proved in Lemma A.1 in the Appendix.

(C2’) Existence of a sufficiently expanding map from which to perturb. There exists $a^* \in [a_0, a_1]$ such that $f = f_{a^*}$ has the following properties: There are numbers $c_1 > 0$, $N_1 \in \mathbb{Z}^+$, and a neighborhood $I$ of the critical set $C$ such that
(i) $f$ is expanding on $S^1 \setminus I$ in the following sense:
  (a) if $x, fx, \cdots, f^{n-1}x \notin I$, $n \ge N_1$, then $|(f^n)'x| \ge e^{c_1 n}$;
  (b) if $x, fx, \cdots, f^{n-1}x \notin I$ and $f^n x \in I$, any $n$, then $|(f^n)'x| \ge e^{c_1 n}$;
(ii) $f^n x \notin I$ $\forall x \in C$ and $n > 0$;
(iii) in $I$, the derivative is controlled as follows:
  (a) $|f''|$ is bounded away from $0$;
  (b) by following the critical orbit, every $x \in I \setminus C$ is guaranteed a recovery time $n(x) \ge 1$ with the property that $f^j x \notin I$ for $0 < j < n(x)$ and $|(f^{n(x)})'x| \ge e^{c_1 n(x)}$.

Next we introduce the notion of smooth continuations. Let $C_a$ denote the critical set of $f_a$. For $x = x(a^*) \in C_{a^*}$, the continuation $x(a)$ of $x$ to $a$ near $a^*$ is the unique critical point of $f_a$ near $x$. If $p$ is a hyperbolic periodic point of $f_{a^*}$, then $p(a)$ is the unique periodic point of $f_a$ near $p$ having the same period. It is a fact that in general, if $p$ is a point whose $f_{a^*}$-orbit is bounded away from $C_{a^*}$, then for $a$ sufficiently near $a^*$, there is a unique point $p(a)$ with the same symbolic itinerary under $f_a$.

(C3) Conditions on $f_{a^*}$ and $T_{a^*,0}$.
(i) Parameter transversality. For each $x \in C_{a^*}$, let $p = f(x)$, and let $x(a)$ and $p(a)$ denote the continuations of $x$ and $p$ respectively. Then
$$\frac{d}{da}f_a(x(a)) \ne \frac{d}{da}p(a) \qquad \text{at } a = a^*.$$


(ii) Nondegeneracy at “turns”.
$$\frac{\partial}{\partial y}T_{a^*,0}(x, 0) \ne 0 \qquad \forall x \in C_{a^*}.$$

The following fact often facilitates the checking of condition (C3)(i):

Lemma 5.1 ([TTY], Sect. VII). Let $f = f_{a^*}$, and suppose $\sum_{n \ge 0}\frac{1}{|(f^n)'(fx)|} < \infty$ for all $x \in C$. Then

$$\sum_{k=0}^{\infty}\frac{[(\partial_a f_a)(f^k x)]_{a = a^*}}{(f^k)'(fx)} = \left[\frac{d}{da}f_a(x(a)) - \frac{d}{da}p(a)\right]_{a = a^*}.$$

The main conditions in [WY] are contained in (C1)–(C3) (or, equivalently, (C1), (C2’) and (C3)). The conclusions of Theorem 3, however, are more specific than those of [WY], which allow the co-existence of multiple ergodic SRB measures. We now introduce a fourth condition,5 which along with (C1)–(C3) implies the uniqueness of SRB measures and their mixing properties. This implication is proved in Lemma A.2 in the Appendix.

(C4) Conditions for mixing.
(i) $e^{c_1} > 2$ where $c_1$ is in (C2’).
(ii) Let $J_1, \cdots, J_r$ be the intervals of monotonicity of $f_{a^*}$, and let $P = (p_{i,j})$ be the matrix defined by
$$p_{i,j} = \begin{cases} 1 & \text{if } f(J_i) \supset J_j, \\ 0 & \text{otherwise.} \end{cases}$$
Then there exists $N_2 > 0$ such that $P^{N_2} > 0$.

The discussion in this subsection can be summarized as follows:

Theorem 3’. Assume $\{T_{a,b}\}$ satisfies (C1), (C2’), (C3) and (C4) above. Then for all sufficiently small $b > 0$, there is a positive measure set of $a$ for which $T_{a,b}$ has the properties in (1), (2) and (3) of Theorem 3.

We remark that [WY] contains a more detailed description of the dynamical picture than the statement of Theorem 3 and refer the interested reader there for more information.

In the rest of this section the discussion pertains to the differential Eq. (1) defined in Sect. 1.1. All notation is as in Sect. 2.1. To prove Theorem 3, it suffices to verify that for the parameters in question, $T_{a,b}$ satisfies the conditions above. This is carried out in the next three subsections.

5 Condition (*) in Sect. 1.2 of [WY], the only condition in [WY] not implied by (C1)–(C3), is clearly contained in (C4).
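Condition (C4)(ii) asks that some power of the 0-1 transition matrix $P$ be entrywise positive, which is a finite check. The sketch below searches for the smallest such power using Boolean matrix products, stopping at Wielandt's bound $(r - 1)^2 + 1$ for an $r \times r$ matrix, beyond which no new positive power can appear. The function names and the sample matrices are hypothetical and are not taken from the paper.

```python
def bool_matmul(P, Q):
    """Boolean product of two square 0-1 matrices given as lists of lists."""
    n = len(P)
    return [
        [1 if any(P[i][k] and Q[k][j] for k in range(n)) else 0 for j in range(n)]
        for i in range(n)
    ]

def first_positive_power(P):
    """Smallest N with P^N > 0 entrywise, or None if no power is positive."""
    n = len(P)
    limit = (n - 1) ** 2 + 1  # Wielandt's bound on the exponent of a primitive matrix
    M = [row[:] for row in P]
    for power in range(1, limit + 1):
        if all(all(row) for row in M):
            return power
        M = bool_matmul(M, P)
    return None
```

A matrix such as `[[0, 1], [1, 0]]`, for which $f$ merely swaps two intervals of monotonicity, returns `None`, so (C4)(ii) fails for it.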


5.2. Verification of (C2’): Expanding properties. Among the conditions to be checked, (C2’), which guarantees a suitable environment from which to perturb, is arguably the most fundamental of the four. It is also the one that requires the most work. In this subsection, we will, after placing some restrictions on $\lambda$ and $t_0$, show that (C2’) is valid for all $f_a$ for which (C2’)(ii) is satisfied. The existence of $a$ satisfying (C2’)(ii) is the topic of the next subsection.

Let $\bar x_1, \bar x_2, \cdots, \bar x_{k_1}$ be the critical points of $\Phi$, and let $k_2 = \min\{1, \frac{1}{2}\min_i|\Phi''(\bar x_i)|\}$. We fix $\varepsilon = \varepsilon(\Phi) > 0$ with the property that $|\bar x_i - \bar x_j| > 4\varepsilon$ for $i \ne j$ and $|\Phi''| > k_2$ on $\cup_i(\bar x_i - 2\varepsilon, \bar x_i + 2\varepsilon)$, and claim that by choosing $\lambda$ and $t_0$ sufficiently small, we may assume the following about $f_a$. Let $C$ denote the critical set of $f_a$, and let $C_\varepsilon$ denote the $\varepsilon$-neighborhood of $C$. Then
(i) $C = \{x_1, \cdots, x_{k_1}\}$ with $|x_i - \bar x_i| < \varepsilon$;
(ii) on $C_\varepsilon$, $|f_a''| > \frac{k_2}{\lambda}$.

To justify these claims, observe first that by taking $\lambda$ small enough, the critical set of $\hat f_a$ can be made arbitrarily close to that of $\Phi$. Second, by choosing $t_0$ sufficiently small (independent of $\lambda$), we can make $\|f_a - \hat f_a\|_{C^3} < \frac{\varepsilon_1}{\lambda}$ for $\varepsilon_1$ as small as we please (Lemma 2.3). These observations together with $\hat f_a'' = \frac{1}{\lambda}\Phi''$ imply (i) and (ii).

A number of other conditions will be imposed on $\lambda$; they will be specified as we go along. Some of these conditions are determined via an auxiliary constant $K > 1$ which depends only on $\Phi$ and which will be chosen to be large enough for certain purposes. Let $\sigma := 2k_2^{-1}K^3\lambda$. We assume $\frac{1}{2}\sigma < \varepsilon$, so that $|f_a'(x)| > K^3$ for $x \in C_\varepsilon \setminus C_{\frac{1}{2}\sigma}$. We also assume $\lambda$ is small enough that $|f_a'| > K^3$ outside of $C_\varepsilon$. Together these imply
(iii) $|f_a'| > K^3$ outside of $C_{\frac{1}{2}\sigma}$.

For simplicity of notation, we write $f = f_a$ in the rest of this subsection.

Lemma 5.2. Let $c \in C$ be such that $f^n(c) \notin C_\sigma$ $\forall n > 0$. Consider $x$ with $|x - c| < \frac{1}{2}\sigma$, and let $n(x)$ be the smallest $n$ such that $|f^n(x) - f^n(c)| > \frac{1}{3K_0}K^3\lambda$. Then $n(x) > 1$ and $|(f^{n(x)})'(x)| \ge k_3 K^{n(x)}$ for some $k_3 = k_3(K_0, k_2)$.

Before giving the proof of this lemma, we first prove a distortion estimate.

Sublemma 5.1. Let $x, y \in S^1$ and $n \in \mathbb{Z}^+$ be such that $\omega_i$, the segment between $f^i x$ and $f^i y$, satisfies $|\omega_i| < \frac{1}{3K_0}K^3\lambda$ and $\mathrm{dist}(\omega_i, C) > \frac{1}{2}\sigma$ for all $i$ with $0 \le i < n$. Then

$$\frac{(f^n)'x}{(f^n)'y} \le 2.$$

Proof.

$$\log\frac{(f^n)'x}{(f^n)'y} = \sum_{i=0}^{n-1}\log\frac{f'(f^i x)}{f'(f^i y)} \le \sum_{i=0}^{n-1}\frac{|f'(f^i x) - f'(f^i y)|}{|f'(f^i y)|} \le \sum_{i=0}^{n-1}\frac{1 + \frac{K_0}{\lambda}}{K^3}|f^i x - f^i y| < \frac{1 + \frac{K_0}{\lambda}}{K^3}\sum_{i=0}^{n-1}\frac{1}{K^{3i}}\,|f^{n-1}x - f^{n-1}y|.$$

Assuming that $\frac{1}{\lambda}$ and $K$ are sufficiently large, this is $< \frac{1}{2}$. □

Proof of Lemma 5.2. First we show $n(x) > 1$. Given the location of $x$, we have $K > |f'x| = |f''(\xi)||x - c|$ for some $\xi$ between $x$ and $c$. This implies
$$|fx - fc| = \frac12 |f''(\zeta)||x - c|^2 < \frac12 \frac{|f''(\zeta)|}{|f''(\xi)|^2} K^2,$$
which we may assume is $< \frac{1}{3K_0}K^3\lambda$.

For $n(x) = 2$, use
$$|(f^2)'x| \cdot |x - c| \ge \mathrm{const}\, K^3\lambda \quad\text{and}\quad |x - c| < \mathrm{const}\, K\lambda.$$

We assume from here on that $n = n(x) \ge 3$, and estimate $|(f^n)'(x)|$ as follows. Since $|f^n x - f^n c| > \frac{1}{3K_0}K^3\lambda$, it follows from Sublemma 5.1 that for some $\xi_1$,
$$\frac12 |f''(\xi_1)||x - c|^2 \cdot 2|(f^{n-1})'(fc)| > \frac{1}{3K_0}K^3\lambda. \qquad (12)$$
Reversing the inequality at time $n - 1$ and using Sublemma 5.1 again, we have
$$\frac12 |f''(\xi_2)||x - c|^2 \cdot \frac12 |(f^{n-2})'(fc)| < \frac{1}{3K_0}K^3\lambda. \qquad (13)$$
Substituting the estimate for $|(f^{n-1})'(fc)|$ from (12) into
$$|(f^n)'x| \ge |f''(\zeta)||x - c| \cdot \frac12 |(f^{n-1})'(fc)|,$$
we obtain
$$|(f^n)'x| \ge \frac12 \frac{|f''(\zeta)|}{|f''(\xi_1)|} \frac{1}{3K_0} K^3\lambda \frac{1}{|x - c|}.$$
Now plug the estimate for $|x - c|$ from (13) into the last inequality and use the lower bounds for $|f''(\xi_2)|$ and $|(f^{n-2})'(fc)|$ from (ii) and (iii) earlier on in this subsection. We arrive at the estimate
$$|(f^n)'x| > \frac12 \frac{|f''(\zeta)|}{|f''(\xi_1)|} \frac{1}{3K_0} K^3\lambda \sqrt{\frac{\frac{k_2}{\lambda} K^{3(n-2)}}{4 \cdot \frac{1}{3K_0} K^3\lambda}} = \mathrm{const}\, K^{\frac32(n-2) + \frac32}.$$
The power to which $K$ is raised is $\ge n$ for $n \ge 3$. This completes the proof of Lemma 5.2. □

We have proved the following: Suppose $f_a$ has the property that each of its critical points $c$ satisfies $f_a^n(c) \notin C_\sigma$ for all $n > 0$. Then (C2’)(i) and (iii) hold for $f_a$ with $I = C_{\frac12\sigma}$. This follows from properties (ii) and (iii) in the first part of this subsection and from Lemma 5.2.


5.3. Verification of (C2’): “Multiple Misiurewicz points”. The goal of this section is to show that for many values of the parameter $a$, $f_a$ has the property that its critical orbits (in strictly positive time) stay away from its critical set. Precise statements will be formulated later. We remark that for the quadratic family $x \mapsto 1 - ax^2$ or any other family with a single critical point, this is a trivial exercise: there are many periodic orbits or compact invariant Cantor sets $\Lambda$ disjoint from the critical set, and if changes in parameter correspond to the movement of $f_a(c)$ in a reasonable way, then there would be many parameters for which $f_a(c) \in \Lambda$. We call these parameters “Misiurewicz points”. For maps with more than one critical point, as circle maps necessarily are, the required condition is that all of the critical orbits are trapped in some invariant set $\Lambda$ away from $C$. This is clearly more problematic, especially with $\Lambda$ having measure zero. We call parameters with these properties “multiple Misiurewicz points”. Their existence and $O(\lambda)$-density within the family $\{f_a\}$ is the concern of this subsection.

Recall that $\sigma = 2k_2^{-1}K^3\lambda$ and $C_\sigma$ is the $\sigma$-neighborhood of $C$. Recall also from Sect. 5.2 that outside of $C_\sigma$, $|f_a'| > K^3$. We are looking for a parameter $a^*$ such that $f = f_{a^*}$ has the property that for all $c \in C$, $f^n c \notin C_\sigma$ $\forall n > 0$. Write $C = \{x_1, \cdots, x_{k_1}\}$ as before, and let $\Delta$ be a parameter interval. For $k = 1, 2, \cdots, k_1$ and $i = 1, 2, \cdots$, we introduce the curves of critical points
$$a \mapsto \gamma_i^{(k)}(a) := f_a^i(x_k), \qquad a \in \Delta.$$
Observe that for all $k$, $\frac{d}{da}\gamma_1^{(k)} = 1$, and for all $i$,
$$\frac{d}{da}\gamma_{i+1}^{(k)}(a) = \frac{d}{da}\gamma_i^{(k)}(a)\, f_a'(\gamma_i^{(k)}(a)) + 1.$$
Thus if $\gamma_j^{(k)}(a) \notin C_\sigma$ for all $j \le i$ and $K$ is sufficiently large, then
$$\frac{d}{da}\gamma_{i+1}^{(k)}(a) \approx \frac{d}{da}\gamma_i^{(k)}(a)\, f_a'(\gamma_i^{(k)}(a)) \qquad (14)$$
and
$$\left|\frac{d}{da}\gamma_{i+1}^{(k)}(a)\right| \ge \frac12 K^{3i}. \qquad (15)$$
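The recursion for the parameter derivative of the critical curves can be illustrated numerically. The sketch below is a minimal example under stated assumptions: it uses a hypothetical lifted circle map $f_a(x) = x + a + \sin(2\pi x)$, chosen only because $\partial_a f_a = 1$ and its critical points do not depend on $a$; it is not the map $f_a$ of the paper. The names `f`, `df`, `critical_curve` and `C_POINT` are illustrative.

```python
import math


def f(x: float, a: float) -> float:
    """Hypothetical lifted circle map f_a(x) = x + a + sin(2*pi*x) (an assumption)."""
    return x + a + math.sin(2.0 * math.pi * x)


def df(x: float) -> float:
    """Derivative of f_a with respect to x (independent of a for this model)."""
    return 1.0 + 2.0 * math.pi * math.cos(2.0 * math.pi * x)


def critical_curve(c: float, a: float, n: int):
    """Return (gamma_n(a), d/da gamma_n(a)) for gamma_i(a) = f_a^i(c)."""
    gamma, dgamma = f(c, a), 1.0           # gamma_1 and d/da gamma_1 = 1
    for _ in range(n - 1):
        dgamma = dgamma * df(gamma) + 1.0  # d/da gamma_{i+1} = d/da gamma_i * f_a'(gamma_i) + 1
        gamma = f(gamma, a)
    return gamma, dgamma


# A critical point of the model map: 1 + 2*pi*cos(2*pi*c) = 0.
C_POINT = math.acos(-1.0 / (2.0 * math.pi)) / (2.0 * math.pi)

if __name__ == "__main__":
    a, n, h = 0.3, 3, 1e-5
    _, exact = critical_curve(C_POINT, a, n)
    finite_diff = (critical_curve(C_POINT, a + h, n)[0]
                   - critical_curve(C_POINT, a - h, n)[0]) / (2.0 * h)
    print(exact, finite_diff)
```

The derivative produced by the recursion is compared with a central finite difference in $a$; the two agree up to discretization error, which is the content of the exact recursion preceding (14).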

We also have the following distortion estimate:

Sublemma 5.2. For $k = 1, 2, \cdots, k_1$ and $n \in \mathbb{Z}^+$, let $\Delta \subset [0, 1)$ be such that $\gamma_i^{(k)}(a) \notin C_\sigma$ for $i = 1, 2, \cdots, n - 1$. Assume that $|\gamma_{n-1}^{(k)}| \le \frac{1}{3K_0}K^3\lambda$. Then for all $a, a' \in \Delta$, we have
$$\frac{\frac{d}{da}\gamma_n^{(k)}(a)}{\frac{d}{da}\gamma_n^{(k)}(a')} \le 2.$$

Using (14) and (15), we see that the proof is entirely parallel to that of Sublemma 5.1 with slightly weaker estimates. We leave it as an exercise for the reader.

Let $d$ be the minimum distance between critical points. Choosing $\lambda$ sufficiently small, we may assume $6k_1\sigma \ll d$. The following is the main result of this subsection.

Lemma 5.3. Given $\Delta_0 \subset [0, 1)$ with $|\Delta_0| = 6k_1\sigma$, there exists $a^* \in \Delta_0$ such that $\forall c \in C$, $f_{a^*}^n c \notin C_\sigma$ $\forall n > 0$.


Proof. We describe first an algorithm for selecting a sequence of intervals $\Delta_0 \supset \Delta_1 \supset \Delta_2 \supset \cdots$ so that $a^* \in \cap_i \Delta_i$ has the desired property: At step $n$, the $(k_1 + 1)$-tuple $(\Delta_n; i_{1,n}, i_{2,n}, \cdots, i_{k_1,n})$ is called an “admissible configuration” if $\Delta_n$ is a subinterval of $\Delta_0$, $i_{k,n} \le n$, and the following conditions are satisfied for each $k$:

(A1) $\gamma_i^{(k)}|_{\Delta_n} \cap C_\sigma = \emptyset$ for all $i \le i_{k,n}$;

(A2) for all $a, a' \in \Delta_n$,
$$\frac{\frac{d}{da}\gamma_{i_{k,n}}^{(k)}(a)}{\frac{d}{da}\gamma_{i_{k,n}}^{(k)}(a')} \le 2;$$

(A3) (“minimum length condition”) $|\gamma_{i_{k,n}+1}^{(k)}|_{\Delta_n}| \ge 12k_1\sigma$.

Observe that (A3) is about the length of the critical curve one iterate later.

Let us first show that we have an admissible configuration for $n = 1$. Let $i_{k,1} = 1$ for all $k$. The parameter interval $\Delta_1$ is chosen as follows. Since $\frac{d}{da}\gamma_1^{(k)} = 1$, we have $|\gamma_1^{(k)}|_{\Delta_0}| = 6k_1\sigma$, so that $\gamma_1^{(k)}$ meets at most one component of $C_\sigma$ and $|(\gamma_1^{(k)})^{-1}C_\sigma| \le 2\sigma$. Even in the worst case scenario when all $k_1$ intervals $(\gamma_1^{(k)})^{-1}C_\sigma$ are evenly spaced, there exists an interval $\Delta_1 \subset \Delta_0$ with $|\Delta_1| = 2\sigma$ such that $\gamma_1^{(k)}|_{\Delta_1} \cap C_\sigma = \emptyset$ for all $k$. Conditions (A1) and (A2) are trivially satisfied, as is (A3) since $|\gamma_2^{(k)}|_{\Delta_1}| > 2\sigma K^3$, and $2K^3$ is assumed to be $> 12k_1$.

We now discuss how to proceed at a generic step, i.e. step $n$, assuming we are handed an admissible configuration $(\Delta_n; i_{1,n}, i_{2,n}, \cdots, i_{k_1,n})$. First, we divide the set $\{1, 2, \cdots, k_1\}$ into indices $k$ that are “ready to advance”, meaning the situation is right for the $k$-th curve to progress to the next iterate, and those that are not. Say $k \in A$ if

(A4) $|\gamma_{i_{k,n}}^{(k)}|_{\Delta_n}| < \frac{1}{3K_0}K^3\lambda$ (distortion estimate holds for the next iterate);

(A5) $|\gamma_{i_{k,n}+1}^{(k)}|_{\Delta_n}| < d$ (image of the next iterate meets at most one interval in $C_\sigma$).

Consider first the case where $A \neq \emptyset$. We set $i_{k,n+1} = i_{k,n} + 1$ for $k \in A$, $i_{k,n+1} = i_{k,n}$ otherwise, and look for $\Delta_{n+1} \subset \Delta_n$ so that $(\Delta_{n+1}; i_{1,n+1}, \cdots, i_{k_1,n+1})$ is again an admissible configuration.

Let $k \in A$. By virtue of (A3) and (A5), we have $12k_1\sigma < |\gamma_{i_{k,n+1}}^{(k)}|_{\Delta_n}| < d$, so that the fraction of $\gamma_{i_{k,n+1}}^{(k)}|_{\Delta_n}$ in $C_\sigma$ is $\le \frac{1}{6k_1}$. By virtue of (A4) and Sublemma 5.2, we have good control of the distortion of $a \mapsto \frac{d}{da}\gamma_{i_{k,n+1}}^{(k)}$. Together this gives
$$|(\gamma_{i_{k,n+1}}^{(k)}|_{\Delta_n})^{-1}C_\sigma| \le \frac{1}{3k_1}|\Delta_n|. \qquad (16)$$
By the same geometric argument as in the case $n = 1$, there exists a subinterval $\Delta_{n+1} \subset \Delta_n$ of length $\frac{1}{3k_1}|\Delta_n|$ with the property that $\gamma_{i_{k,n+1}}^{(k)}|_{\Delta_{n+1}} \cap C_\sigma = \emptyset$ for all $k \in A$. For this choice of $\Delta_{n+1}$, we have (A1) by design, and (A2) is given by (A4) from step $n$. As for (A3), observe that by the same reasoning as in (16), the pullback of any interval of $S^1$ of length $2\sigma$ has length $\le \frac{1}{3k_1}|\Delta_n|$, so $|\gamma_{i_{k,n+1}}^{(k)}|_{\Delta_{n+1}}| \ge 2\sigma$, and one iterate later, it is guaranteed to have length $> 2K^3\sigma$.
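The geometric argument used at step $n = 1$ and again for (16) is a pigeonhole count: removing at most $k_1$ short intervals from a parameter interval leaves a gap of a definite length. The sketch below illustrates this selection step only. The helper `largest_gap` and the sample numbers are assumptions introduced for illustration; the proof only needs the existence of such a gap, not this particular procedure.

```python
def largest_gap(interval, excluded):
    """Return the longest subinterval of `interval` that avoids every interval in `excluded`.

    `interval` is a pair (lo, hi); `excluded` is a list of pairs (left, right).
    Ties are resolved in favor of the leftmost gap.
    """
    lo, hi = interval
    best = (lo, lo)
    cursor = lo
    for left, right in sorted(excluded):
        left, right = max(left, lo), min(right, hi)
        if right <= left:
            continue  # this excluded piece does not meet the interval
        if left - cursor > best[1] - best[0]:
            best = (cursor, left)
        cursor = max(cursor, right)
    if hi - cursor > best[1] - best[0]:
        best = (cursor, hi)
    return best


if __name__ == "__main__":
    # Step n = 1 with hypothetical values k_1 = 3, sigma = 0.01: the parameter
    # interval has length 6*k_1*sigma and each excluded piece has length 2*sigma.
    k1, sigma = 3, 0.01
    pieces = [(0.03, 0.05), (0.08, 0.10), (0.13, 0.15)]  # evenly spaced worst case
    gap = largest_gap((0.0, 6 * k1 * sigma), pieces)
    print(gap, gap[1] - gap[0] >= 2 * sigma - 1e-12)
```

In the evenly spaced case the remaining length $4k_1\sigma$ is split into $k_1 + 1$ gaps, so the largest gap has length at least $\frac{4k_1}{k_1+1}\sigma \ge 2\sigma$, which matches the length required of $\Delta_1$.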


Consider now $k \notin A$. Conditions (A1) and (A2) are inherited from the previous step, and (A3) is checked as follows: If $k \notin A$ because (A4) fails, then
$$|\gamma_{i_{k,n+1}}^{(k)}|_{\Delta_{n+1}}| \ge \frac12 \cdot \frac{1}{3k_1} |\gamma_{i_{k,n}}^{(k)}|_{\Delta_n}| \ge cK^3\lambda,$$
where $c$ is a constant independent of $K$ or $\lambda$. Notice that this uses only the distortion estimate from step $n$. One iterate later, this curve will have length $> cK^6\lambda$, which we may assume is $> 12k_1\sigma$. If (A4) holds but (A5) fails, then the distortion estimate holds for the next iterate, and
$$|\gamma_{i_{k,n+1}+1}^{(k)}|_{\Delta_{n+1}}| \ge \frac{1}{6k_1} |\gamma_{i_{k,n}+1}^{(k)}|_{\Delta_n}| \ge cd,$$
which we may also assume is $> 12k_1\sigma$. This completes the construction from step $n$ to step $n + 1$ when $A \neq \emptyset$.

If $A = \emptyset$, then we let $\Delta_n'$ be the left half of $\Delta_n$, and observe that the $(k_1 + 1)$-tuple $(\Delta_n'; i_{1,n}, i_{2,n}, \cdots, i_{k_1,n})$ is again admissible. To verify (A3), we fix $k$, and argue separately as in the last paragraph the two cases corresponding to (i) the failure of (A4) with respect to $\Delta_n$ and (ii) the failure of (A5) but not (A4). Repeat this process if necessary until $A \neq \emptyset$. □

5.4. Verification of (C1), (C3) and (C4). We now verify the remaining conditions in Sect. 5.1. Observe from the arguments below that (C1) and (C3)(ii) are quite natural for systems arising from differential equations, while (C3)(i) and (C4) are, to a large extent, consequences of the fact that the maps $f_a$ are sufficiently expanding.

Verification of (C1): Let $F_{t_0}$ denote the time-$t_0$-map of (2) (the period of the forcing continues to be $T$). Then (i) follows from the fact that $F_{t_0}$ has bounded $C^3$ norms on $S^1 \times [-1, 1]$; (ii) is obvious, and (iii) is a consequence of the fact that $\det(DF_T) = e^{-\lambda(T - t_0)} \det(DF_{t_0})$.

Verification of (C3): For (i), since $(\partial_a f_a)(\cdot) = 1$ and $|(f^k)'(fx)| \ge K^k$, Lemma 5.1 applies, and the quantity in question has absolute value $\ge 1 - \sum_{i \ge 1} \frac{1}{K^i} > 0$. Part (ii) is Lemma 2.3(i).

Verification of (C4): (i) is proved since $e^{c_1} = K > 2$. For (ii), by choosing $\lambda$ sufficiently small depending on $\Phi_0$, it is easily arranged that $p_{i,j} = 1$ for all $i, j$. This completes the proof of Theorem 3.

Appendix

We supply here the proofs of the two lemmas promised in Sect. 5.1. This appendix has to be read in conjunction with [WY].

Lemma A.4. All the theorems in [WY] remain valid if the Misiurewicz condition in Step I, Sect. 1.1, of [WY] is replaced by condition (C2’) in Sect. 5.1 of this paper.

Proof.
The three most important uses of the Misiurewicz condition in [WY] are:
– the nondegeneracy of the critical points (this is guaranteed by (C2’)(iii)(a));
– every critical orbit stays a fixed distance away from $C$ (this is precisely (C2’)(ii));
– there exist $c_0, c > 0$ such that for every critical point $x$, $|(f^n)'(fx)| > c_0 e^{cn}$ (this is guaranteed by (C2’)(i) and (ii)).

These three properties aside, the only consequences of the Misiurewicz condition used in [WY] are contained in Lemma 2.5 of [WY]. Let $C_\delta$ denote the $\delta$-neighborhood of $C$. Then there exist $\hat c_0, \hat c_1 > 0$ such that the following hold for all sufficiently small $\delta > 0$: Let $x \in S^1$ be such that $x, fx, \cdots, f^{n-1}x \notin C_\delta$, any $n$. Then
(i) $|(f^n)'x| \ge \hat c_0 \delta e^{\hat c_1 n}$;
(ii) if, in addition, $f^n x \in C_\delta$, then $|(f^n)'x| \ge \hat c_0 e^{\hat c_1 n}$.

We claim that the conclusions of this lemma also follow from (C2’). Let $n_1 < \cdots < n_q$, $0 \le n_1$, $n_q \le n$, be the times when $f^{n_i}x \in I$. Then
– $|(f^{n_1})'x| \ge e^{c_1 n_1}$ by (C2’)(i)(b);
– $|(f^{n_{i+1} - n_i})'(f^{n_i}x)| \ge e^{c_1(n_{i+1} - n_i)}$ by (C2’)(iii)(b) followed by (i)(b);
– $|(f^{n - n_q})'(f^{n_q}x)| = |f'(f^{n_q}x)| \cdot |(f^{n - (n_q + 1)})'(f^{n_q + 1}x)|$, where $|f'(f^{n_q}x)| \ge |f''(\xi)|\, d(x, C) \ge c_0\delta$ by (C2’)(iii)(a) and $|(f^{n - (n_q + 1)})'(f^{n_q + 1}x)| \ge c_0 e^{c_1(n - (n_q + 1))}$ by (C2’)(i)(a).

Together these inequalities prove both of the assertions in the lemma. □

Lemma A.5. Let $\{T_{a,b}\}$ be as in Sect. 5.1 of this paper, and let $\Delta$ be the set of $(a, b)$ such that $T = T_{a,b}$ satisfies the conclusions of Theorem 1 in [WY]. Suppose $\{T_{a,b}\}$ also satisfies (C4), and $\delta$ is smaller than a number depending on $c_1$. Then (i) $T$ admits at most one SRB measure $\mu$; (ii) $(T, \mu)$ is mixing.

Proof. Let $\{x_1 < \cdots < x_r\}$ be the set of critical points of $f$. Consider a segment $\omega \subset \partial R_0$ corresponding to an outermost $I_{\mu j}$ at one of the components of $C^{(0)}$. First we claim there exist $N \in \mathbb{Z}^+$ and $\hat\omega \subset \omega$ such that $T^i\hat\omega \cap C^{(0)} = \emptyset$ for all $0 < i < N$ and $T^N\hat\omega$ connects two components of $C^{(0)}$.

This claim is proved as follows. Let $\omega'$ denote the image of $\omega$ at the end of its bound period. Then $\omega'$ has length $> \delta^{K\beta}$. We continue to iterate, deleting all parts that fall into $C^{(0)}$. Then $i$ steps later, the undeleted part of $T^i\omega'$ is made up of finitely many segments. Suppose that for all $i \le n$, none of these segments is long enough to connect two components of $C^{(0)}$, so that the number of segments deleted up to step $i$ is $\le 2^i$. We estimate the average length of these segments at time $n$ as follows: First, the pull-back to $\omega'$ of all the deleted parts has total measure $\le \sum_{i \le n} 2^i e^{-c_1 i}(2\delta)$ by (C2’)(i)(b). Since $2 < e^{c_1}$ by (C4)(i), we may assume this is $< \frac12\delta^{K\beta}$ provided $\delta$ is sufficiently small. The undeleted segments of $T^n\omega'$ add up, therefore, to $> e^{c_1 n}\frac12\delta^{K\beta}$ in length, and since there are at most $2^n$ of them, their average length is $> 2^{-n}e^{c_1 n}\frac12\delta^{K\beta}$. Thus one sees that as $n$ increases, there must come a point when our claim is fulfilled.

Next we observe that if $\omega$ is a $C^2(b)$ segment connecting two components of $C^{(0)}$, then using (C4)(ii) and reasoning as with finite state Markov chains, we have that for every $n \ge N_2$ and every $k \in \{1, \cdots, r\}$, there is a subsegment $\omega_{n,k} \subset \omega$ such that for all $i < n$, $T^i\omega_{n,k} \cap C^{(0)} = \emptyset$ and $T^n\omega_{n,k}$ stretches across the region between $x_k$ and $x_{k+1}$, extending beyond the critical regions containing these two points.


Recall that in [WY], Sects. 8.1 and 8.2, a finite number of ergodic SRB measures $\{\mu_i, i \le r'\}$ are constructed, and it is shown in Sect. 8.3 that these are all the ergodic SRB measures $T$ has. The discussion above shows that starting at any reference set, a segment $\omega \subset \partial R_0$ as above will spend a positive fraction of time in every reference set, proving that $r' \le 1$. Furthermore, starting from any reference set, the return time to it takes on all values greater than some $N_0$, proving that $\mu_1$ is mixing. □

6. Concluding Remarks

• For area-preserving maps, it is well known that when integrability first breaks down, the phase portrait is dominated by KAM curves. Farther away from integrability, one sees larger Birkhoff zones of instability interspersed with elliptic islands. Continuing to move toward the chaotic end of the spectrum, it is widely believed – though not proved – that most of the phase space is covered with ergodic regions with positive Lyapunov exponents.

This paper deals with the corresponding pictures for strongly dissipative systems. We consider a simple model consisting of a periodically forced limit cycle. Keeping the magnitude of the “kick” constant, we prove that scenarios roughly parallel to those in the last paragraph occur for our Poincaré maps, with attracting invariant circles (taking the place of KAM curves), periodic sinks (instead of elliptic islands), and as the contractive power of the cycle diminishes, we prove that the stage is shared by at least two scenarios occupying parameter sets that are delicately intertwined: horseshoes and sinks, and strange attractors. By “strange attractors”, we refer to attractors characterized by SRB measures, positive Lyapunov exponents, and strong mixing properties. For the differential equation in question, we prove that the system has global strange attractors of this kind for a positive measure set of parameters.
• Our second point has to do with bridging the gap between abstract theory and concrete problems. Today we have a fairly good hyperbolic theory, yet chaotic phenomena in naturally occurring dynamical systems have continued to resist analysis. One of the messages of this paper is that for certain types of strange attractors, the situation is now improved: For attractors with strong dissipation and one direction of instability, there are now relatively simple, checkable conditions which, when satisfied, guarantee the existence of an attractor with a detailed package of statistical and geometric properties. Our conditions are formulated to give rigorous results, but where rigorous analysis is out of reach, they can also serve as a basis for numerical work to provide justification for various mathematical statements about strange attractors.

References

[A] Arnold, V.I.: Small denominators, I: Mappings of the circumference onto itself. AMS Transl. Ser. 2 46, 213–284 (1965)
[BC] Benedicks, M. and Carleson, L.: The dynamics of the Hénon map. Ann. Math. 133, 73–169 (1991)
[BY] Benedicks, M. and Young, L.-S.: Sinai-Bowen-Ruelle measure for certain Hénon maps. Invent. Math. 112, 541–576 (1993)
[B] Bowen, R.: Equilibrium states and the ergodic theory of Anosov diffeomorphisms. Lecture Notes in Math. Vol. 470, Berlin: Springer, 1975
[CL] Cartwright, M.L. and Littlewood, J.E.: On nonlinear differential equations of the second order. J. London Math. Soc. 20, 180–189 (1945)
[CELS] Chernov, N., Eyink, G., Lebowitz, J. and Sinai, Ya.G.: Steady-state electrical conduction in the periodic Lorentz gas. Commun. Math. Phys. 154, 569–601 (1993)
[CE] Collet, P. and Eckmann, J.-P.: Iterated Maps on the Interval as Dynamical Systems. Progress on Physics I (1980)
[D] Duffing, G.: Erzwungene Schwingungen bei veränderlicher Eigenfrequenz. Braunschweig: 1918
[GS] Graczyk, J. and Swiatek, G.: Generic hyperbolicity in the logistic family. Ann. Math. 146, 1–52 (1997)
[G] Guckenheimer, J.: A strange strange attractor. In: Bifurcation and its Application (J.E. Marsden and M. McCracken, ed.). Berlin–Heidelberg–New York: Springer-Verlag, 1976, pp. 81
[GH] Guckenheimer, J. and Holmes, P.: Nonlinear oscillators, dynamical systems and bifurcations of vector fields. Appl. Math. Sciences 42, Berlin–Heidelberg–New York: Springer-Verlag, 1983
[He] Herman, M.: Mesure de Lebesgue et Nombre de Rotation. Lecture Notes in Math. 597, Berlin–Heidelberg–New York: Springer, 1977, pp. 271–293
[HPS] Hirsch, M., Pugh, C. and Shub, M.: Invariant Manifolds. Lecture Notes in Math. 583, Berlin–Heidelberg–New York: Springer Verlag, 1977
[Ho] Holmes, P.: A nonlinear oscillator with a strange attractor. Phil. Trans. Roy. Soc. A. 292, 419–448 (1979)
[J] Jakobson, M.: Absolutely continuous invariant measures for one-parameter families of one-dimensional maps. Commun. Math. Phys. 81, 39–88 (1981)
[KH] Katok, A. and Hasselblatt, B.: Introduction to the modern dynamical systems. Cambridge: Cambridge University Press, 1995
[Le] Ledrappier, F.: Propriétés ergodiques des mesures de Sinai. Publ. Math. Inst. Hautes Etud. Sci. 59, 163–188 (1984)
[LY] Ledrappier, F. and Young, L.-S.: The metric entropy of diffeomorphisms. Ann. Math. 122, 509–574 (1985)
[Li1] Levi, M.: Qualitative analysis of periodically forced relaxation oscillations. Mem. AMS 214, 1–147 (1981)
[Li2] Levi, M.: A new randomness-generating mechanism in forced relaxation oscillations. Physica D 114, 230–236 (1998)
[Ln] Levinson, N.: A second order differential equation with singular solutions. Ann. Math. 50, No. 1, 127–153 (1949)
[Lo] Lorenz, E.N.: Deterministic nonperiodic flow. J. Atmos. Sc. 20, No. 1, 130–141 (1963)
[Lyu1] Lyubich, M.: Dynamics of quadratic polynomials, I-II. Acta Math. 178, 185–297 (1997)
[Lyu2] Lyubich, M.: Regular and stochastic dynamics in the real quadratic family. Proc. Natl. Acad. Sci. USA 95, 14025–14027 (1998)
[dMvS] de Melo, W. and van Strien, S.: One-dimensional Dynamics. Berlin–Heidelberg–New York: Springer-Verlag, 1993
[M1] Misiurewicz, M.: Absolutely continuous invariant measures for certain maps of an interval. Publ. Math. IHES. 53, 17–51
[MV] Mora, L. and Viana, M.: Abundance of strange attractors. Acta. Math. 171, 1–71 (1993)
[P] Pesin, Ja.B.: Characteristic Lyapunov exponents and smooth ergodic theory. Russ. Math. Surv. 32.4, 55–114 (1977)
[PS] Pugh, C. and Shub, M.: Ergodic attractors. Trans. A. M. S. 312, 1–54 (1989)
[Ro] Robinson, C.: Homoclinic bifurcation to a transitive attractor of Lorenz type. Nonlinearity 2, 495–518 (1989)
[R1] Ruelle, D.: A measure associated with Axiom A attractors. Am. J. Math. 98, 619–654 (1976)
[R2] Ruelle, D.: Ergodic theory of differentiable dynamical systems. Publ. Math. Inst. Hautes Étud. Sci. 50, 27–58 (1979)
[Ry] Rychlik, M.: Lorenz attractors through Sil’nikov-type bifurcation, Part I. Ergodic Theory and Dynamical Systems 10, 793–822 (1990)
[Sh] Shub, M.: Global Stability of Dynamical Systems. Berlin–Heidelberg–New York: Springer, 1987
[Si] Sinai, Y.G.: Gibbs measure in ergodic theory. Russ. Math. Surv. 27, 21–69 (1972)
[Sp] Sparrow, C.: The Lorenz Equations. Berlin–Heidelberg–New York: Springer, 1982
[TTY] Thieullen, P., Tresser, C. and Young, L.-S.: Positive exponent for generic 1-parameter families of unimodal maps. C.R. Acad. Sci. Paris, t. 315, Serie (1992), 69–72; J. d’Analyse 64, 121–172 (1994)
[T] Tucker, W.: The Lorenz attractor exists. C. R. Acad. Sci. Paris Ser. I Math. 328 (1999), no. 12, 1197–1202
[W] Williams, R.: The structure of Lorenz attractors. In: Turbulence Seminar Berkeley 1976/77 (P. Bernard and T. Ratiu, ed.), Berlin–Heidelberg–New York: Springer-Verlag, 1977, pp. 94–112
[WY] Wang, Q.D. and Young, L.-S.: Strange attractors with one direction of instability. Commun. Math. Phys. 218, 1–97 (2001)
[Y1] Young, L.-S.: Ergodic theory of differentiable dynamical systems. In: Real and Complex Dynamical Systems, B. Branner and P. Hjorth (eds.), Dordrecht: Kluwer Acad. Press, 1995
[Y2] Young, L.-S.: Statistical properties of dynamical systems with some hyperbolicity. Ann. of Math. 147, 585–650 (1998)
[Z1] Zaslavsky, G.: The simplest case of a strange attractor. Phys. Lett. A 69, no. 3, 145–147 (1978)
[Z2] Zaslavsky, G.: Chaos in Dynamic Systems. Harwood Academic Publishers, first printing, 1985

Communicated by M. Aizenman

Commun. Math. Phys. 225, 305 – 329 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

Exponential Convergence to Non-Equilibrium Stationary States in Classical Statistical Mechanics

Luc Rey-Bellet, Lawrence E. Thomas

Department of Mathematics, University of Virginia, Kerchof Hall, Charlottesville, VA 22903, USA. E-mail: [email protected]; [email protected]

Received: 12 March 2001 / Accepted: 5 August 2001

Abstract: We continue the study of a model for heat conduction [6] consisting of a chain of non-linear oscillators coupled to two Hamiltonian heat reservoirs at different temperatures. We establish existence of a Liapunov function for the chain dynamics and use it to show exponentially fast convergence of the dynamics to a unique stationary state. Ingredients of the proof are the reduction of the infinite dimensional dynamics to a finite-dimensional stochastic process as well as a bound on the propagation of energy in chains of anharmonic oscillators.

1. Introduction

In its present state, non-equilibrium statistical mechanics is lacking the firm theoretical foundations that equilibrium statistical mechanics has. This is due, perhaps, to the extremely great variety of physical phenomena that non-equilibrium statistical mechanics describes. We will concentrate here on a system which is maintained, by suitable forces, in a state far from equilibrium. In such an idealization, the non-equilibrium phenomena can be described by stationary non-equilibrium states (SNS), which are the analog of canonical or microcanonical states of equilibrium.

Recently many works have been devoted to the rigorous study of SNS. Two main streams are emerging. In the first approach, for open systems, a system is driven out of equilibrium by interacting with several reservoirs at different temperatures. In the second approach, for thermostated systems, a system is driven out of equilibrium by non-Hamiltonian forces and constrained to a compact energy surface by Gaussian (or other) thermostats [9, 24]. One should view both approaches as two different idealizations of the same physical situation, in the same spirit as the equivalence of ensembles in equilibrium statistical mechanics. But for the moment, the extent to which both approaches are equivalent remains a largely open problem.

We consider here an open system, a model of heat conduction consisting of a finite-dimensional classical Hamiltonian model, a one-dimensional finite lattice of anharmonic


oscillators (referred to as the chain), coupled, at the boundaries only, to two reservoirs of classical non-interacting phonons at positive and different temperatures. We believe this model to be quite realistic, in particular it is completely Hamiltonian and non-linear. This model goes back (in the linear case) to [8] (see also [23, 26]). First rigorous results for anharmonic models appear in [6] and go further in [7, 5]. Similar models in classical and quantum mechanics have attracted attention in the last few years, mostly for systems coupled to a single reservoir at zero or positive temperature, i.e., for systems near thermal equilibrium (see e.g. [12, 13, 3, 15, 25]). In our case, with two reservoirs, no Gibbs Ansatz is available and in general, even the very existence of a (non-equilibrium) stationary state is a mathematically challenging question which requires a sufficiently deep understanding of the dynamics. For the model at hand, conditions for the existence of the SNS have been given in [6] and generalized in [5]. The uniqueness of the SNS as well as the strict positivity of entropy production (or heat flux) have been proved in [7]. The leading asymptotics of the invariant measure (for low temperatures) are studied in [21] and shown to be described by a variational principle. Under suitable assumptions on the chain interactions and its interactions with the reservoirs, we establish the existence of a Liapunov function for the chain dynamics. We then use this Liapunov function to establish that the relaxation to the SNS occurs at an exponential rate, and finally we prove that the system has a spectral gap (using probabilistic techniques developed by Meyn and Tweedie in [18]).

The Hamiltonian of the model has the form
$$H = H_B + H_S + H_I. \qquad (1)$$
The two reservoirs of free phonons are described by wave equations in $\mathbb{R}^d$ with the Hamiltonian
$$H_B = H(\varphi_L, \pi_L) + H(\varphi_R, \pi_R), \qquad H(\varphi, \pi) = \frac12 \int dx\, (|\nabla\varphi(x)|^2 + |\pi(x)|^2),$$
where $L$ and $R$ stand for the “left” and “right” reservoirs, respectively. The Hamiltonian describing the chain of length $n$ is given by
$$H_S(p, q) = \sum_{i=1}^n \frac{p_i^2}{2} + V(q_1, \ldots, q_n), \qquad V(q) = \sum_{i=1}^n U^{(1)}(q_i) + \sum_{i=1}^{n-1} U^{(2)}(q_i - q_{i+1}),$$
where $(p_i, q_i) \in \mathbb{R}^d \times \mathbb{R}^d$ are the momenta and coordinates of the $i$-th particle of the chain. The phase space of the chain is $\mathbb{R}^{2dn}$. The interaction between the chain and the reservoirs occurs at the boundaries only and is of dipole-type
$$H_I = q_1 \cdot \int dx\, \nabla\varphi_L(x)\rho_L(x) + q_n \cdot \int dx\, \nabla\varphi_R(x)\rho_R(x),$$
where $\rho_L$ and $\rho_R$ are coupling functions (“charge densities”) which we will assume spherically symmetric.
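To make the structure of the chain Hamiltonian concrete, the sketch below evaluates $H_S(p, q)$ for a finite chain: a kinetic term, a one-body term, and a nearest-neighbor term. The potentials $U^{(1)}(x) = |x|^2/2$ and $U^{(2)}(x) = |x|^4/4$ are hypothetical sample choices (with growth exponents 2 and 4), and the function names are illustrative; the paper does not fix specific potentials.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Euclidean inner product on R^d."""
    return sum(a * b for a, b in zip(u, v))


def chain_energy(p: Sequence[Vector], q: Sequence[Vector],
                 U1: Callable[[Vector], float],
                 U2: Callable[[Vector], float]) -> float:
    """H_S(p, q) = sum_i |p_i|^2 / 2 + sum_i U1(q_i) + sum_{i<n} U2(q_i - q_{i+1})."""
    kinetic = sum(dot(pi, pi) / 2.0 for pi in p)
    one_body = sum(U1(qi) for qi in q)
    coupling = sum(
        U2([a - b for a, b in zip(q[i], q[i + 1])]) for i in range(len(q) - 1)
    )
    return kinetic + one_body + coupling


def U1_sample(x: Vector) -> float:
    """Hypothetical one-body potential |x|^2 / 2."""
    return dot(x, x) / 2.0


def U2_sample(x: Vector) -> float:
    """Hypothetical nearest-neighbor potential |x|^4 / 4."""
    return dot(x, x) ** 2 / 4.0


if __name__ == "__main__":
    # Chain of length n = 2 in dimension d = 1.
    p = [[1.0], [2.0]]
    q = [[0.0], [1.0]]
    print(chain_energy(p, q, U1_sample, U2_sample))
```

Only the chain part $H_S$ is modeled here; the reservoir fields and the dipole coupling $H_I$ are infinite-dimensional and are left out of this sketch.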


Our assumptions on the anharmonic lattice described by $H_S(p, q)$ are the following:

• H1 Growth at infinity. The potentials $U^{(1)}(x)$ and $U^{(2)}(x)$ are $C^\infty$ and grow at infinity like $\|x\|^{k_1}$ and $\|x\|^{k_2}$: There exist constants $C_i$, $D_i$, $i = 1, 2$ such that
$$\lim_{\lambda \to \infty} \lambda^{-k_i} U^{(i)}(\lambda x) = a^{(i)} \|x\|^{k_i}, \qquad (2)$$
$$\lim_{\lambda \to \infty} \lambda^{-k_i + 1} \nabla U^{(i)}(\lambda x) = a^{(i)} k_i \|x\|^{k_i - 2} x, \qquad (3)$$
$$\|\partial^2 U^{(i)}(x)\| \le (C_i + D_i V(x))^{1 - \frac{2}{k_i}}, \qquad (4)$$
where $\|\cdot\|$ in Eq. (4) denotes some matrix-norm. Moreover we will assume that $k_2 \ge k_1 \ge 2$, so that, for large $\|x\|$ the interaction potential $U^{(2)}$ is “stiffer” than the one-body potential $U^{(1)}$. It follows from Eqs. (2) and (3) that the critical set of $V(q)$, i.e., the set $\{q : \nabla V(q) = 0\}$ is a compact set.

• H2 Non-degeneracy. The coupling potential between nearest neighbors $U^{(2)}$ is non-degenerate in the following sense. For $x \in \mathbb{R}^d$ and $m = 1, 2, \cdots$, let $A^{(m)}(x) : \mathbb{R}^d \to \mathbb{R}^{d^m}$ denote the linear maps given by
$$\left(A^{(m)}(x)v\right)_{l_1 l_2 \cdots l_m} = \sum_{l=1}^d \frac{\partial^{m+1} U^{(2)}}{\partial x^{(l_1)} \cdots \partial x^{(l_m)} \partial x^{(l)}}(x)\, v_l.$$
We assume that for each $x \in \mathbb{R}^d$ there exists $m_0$ such that
$$\mathrm{Rank}\left(A^{(1)}(x), \cdots, A^{(m_0)}(x)\right) = d.$$
In particular this condition is satisfied, for $m_0 = 1$, if $U^{(2)}$ is strictly convex. If $d = 1$, this condition means that for any $x$, there exists $m_0 = m_0(x) \ge 2$ such that $\partial^{m_0} U^{(2)}(x)/\partial x^{m_0} \neq 0$. In other words the potential $U^{(2)}$ has no flat piece or infinitely degenerate points.

The class of coupling functions $\rho_i$, $i \in \{L, R\}$ we can allow is relatively restrictive:

• H3 Rationality of the coupling. Let $\hat\rho_i$ denote the Fourier transform of $\rho_i$. We assume that
$$|k|^{d-1} |\hat\rho_i(k)|^2 = \frac{1}{Q_i(k^2)},$$
where $Q_i$, $i \in \{L, R\}$ are polynomials with real coefficients and no roots on the real axis. In particular, if $k_0$ is a root of $Q_i$, then so are $-k_0$, $\bar k_0$ and $-\bar k_0$.

Under these conditions we have the following result (a more detailed and precise statement will be given in the next section). Let $F(p, q)$ be an observable on the phase space of the chain, for example any function with at most polynomial growth (no smoothness is required). We denote as $(p(t), q(t))$ the solution of the Hamiltonian equation of motion with Hamiltonian (1) and initial conditions $(p, q)$. Of course $(p(t), q(t))$ depends also on the variables of the reservoirs, though only through their initial conditions $(\pi_L, \varphi_L, \pi_R, \varphi_R)$. We introduce the temperature by making the assumption that the initial conditions of the reservoirs are distributed according to thermal equilibrium at temperature $T_R$ and $T_L$ respectively and we denote $\langle \cdot \rangle_{LR}$ as the corresponding average.
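Returning to Condition H1, the scaling limits (2) and (3) can be checked numerically for a concrete potential. The sketch below does this in dimension $d = 1$ for the hypothetical potential $U(x) = x^4/4 + x^2$, for which $k = 4$ and $a = 1/4$; the potential and all function names are assumptions made for illustration only.

```python
def U(x: float) -> float:
    """Hypothetical one-dimensional potential with quartic growth (k = 4, a = 1/4)."""
    return x**4 / 4.0 + x**2


def grad_U(x: float) -> float:
    """Derivative of the sample potential U."""
    return x**3 + 2.0 * x


def scaled_potential(x: float, lam: float, k: int) -> float:
    """lam^{-k} U(lam * x), the left-hand side of the scaling limit (2)."""
    return lam ** (-k) * U(lam * x)


def scaled_gradient(x: float, lam: float, k: int) -> float:
    """lam^{-k+1} U'(lam * x), the left-hand side of the scaling limit (3)."""
    return lam ** (-k + 1) * grad_U(lam * x)


if __name__ == "__main__":
    k, a, x = 4, 0.25, 1.5
    for lam in (1e1, 1e2, 1e3):
        err_potential = scaled_potential(x, lam, k) - a * abs(x) ** k
        err_gradient = scaled_gradient(x, lam, k) - a * k * abs(x) ** (k - 2) * x
        print(lam, err_potential, err_gradient)
```

For this sample potential both differences decay like $\lambda^{-2}$, since the lower-order term $x^2$ is suppressed by the rescaling.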


Theorem 1.1. Under Conditions H1–H3, there is a measure $\nu(dp, dq)$ with a smooth everywhere positive density such that the Law of Large Numbers holds:
$$\lim_{T \to \infty} \frac{1}{T} \int_0^T F(p(t), q(t))\, dt = \int F\, d\nu$$
for almost all initial conditions $(\pi_L, \varphi_L, \pi_R, \varphi_R)$ of the reservoirs and for all initial conditions $(p, q)$ of the chain. Moreover there exist a constant $r > 1$ and a function $C(p, q)$ with $\int C\, d\nu < \infty$ such that
$$\left| \langle F(p(t), q(t)) \rangle_{LR} - \int F\, d\nu \right| \le C(p, q)\, r^{-t}$$
for all initial conditions $(p, q)$. That is, if we average over the initial conditions of the reservoirs the convergence is exponential.

Note that the ergodic properties stated in Theorem 1.1 hold not only for $\nu$-almost every initial condition $(p, q)$, but in fact for every $(p, q)$. The existence of a (unique) stationary state was proved for (exactly solvable) quadratic harmonic potentials $V(q)$ in [26], for $k_1 = k_2 = 2$ (i.e., for potentials which are quadratic at infinity) in [6, 7] and generalized to the case $k_2 > k_1 \ge 2$ in [5]. What is really new here is that we prove that the convergence occurs exponentially fast and we also weaken slightly the conditions on the potential (in particular the case $k_1 = k_2$ is allowed and our Condition H2 on $U^{(2)}$ is weaker than the one used in [6, 7, 5]). Our methods also differ notably from those used in [6, 5]; in fact we reprove the existence of the SNS (with a shorter and more constructive proof than in [6, 5]) and, at the same time, we prove much stronger ergodic properties.

We devote the rest of this section to a brief discussion of the Assumptions H1–H3. Since the reservoirs are free phonon gases and since we make a statistical assumption on the initial condition of the reservoirs, one can integrate out the variables of the reservoirs yielding random integro-differential equations for the variables $(p, q)$.
Our Assumption H3 of rational coupling is, in effect, a Markovian assumption: with such coupling one can eliminate the memory terms by adding a finite number of auxiliary variables to obtain a system of Markovian stochastic differential equations on the extended phase space consisting of the dynamical variables (p, q) together with the auxiliary variables. The main (new) ingredient in our proof is then the construction of a Liapunov function for the system, which implies, using probabilistic methods developed in [1, 20, 18], the exponential convergence towards the stationary state. To explain the construction of a Liapunov function, note that the dynamics of the chain in the bulk is simply Hamiltonian, while at the boundaries the action of the reservoirs results into two distinct forces. There are dissipative forces which correspond to the fact that the energy of the chain dissipates into the reservoirs. This force is independent of the temperature. On the other hand since the reservoirs are infinite and at positive temperatures, they exert (random) forces at the boundaries of the chain and these forces turn out to be proportional to the temperatures of the reservoirs. The construction of the Liapunov function proceeds in two steps. In a first step we neglect completely the random force, only dissipation acts. This corresponds to dynamics at temperature zero, and one can prove that the energy decreases and that the system relaxes to a (local) equilibrium of the Hamiltonian H (p, q). We establish the rate at which this relaxation takes place (at sufficiently high energies). In the second step we consider the complete dynamics and we show that for energies which are much higher


than the temperatures of the reservoirs, the random force is essentially negligible with respect to the dissipation. This means that except for (exponentially) rare excursions the system spends most of its time in a compact neighborhood of the equilibrium points. On the other hand, in this compact set, i.e., at energies of order of the temperatures of the reservoirs, the dynamics is essentially determined by the fluctuations and to prove exponential convergence to a SNS one has to show that the fluctuations are such that every part of the phase space is visited by the dynamics. To summarize, we control the dynamics at any temperature by the dynamics at zero temperature. This allows one to understand the meaning of our assumptions on the potential V (q). If we suppose that the energy has an infinite number of local minima tending to infinity, the zero temperature (long-time) dynamics is not confined to a compact energy domain and our argument fails. With regard to the condition k2 ≥ k1 in Condition H1 on the exponents of the potentials, since the results of [27] and the rigorous proofs of [17, 2], it is known that stable (in the sense of Nekhoroshev) localized states exist in non-linear lattices. Consider, for example, an infinite chain of oscillators (without reservoirs). Numerically and in certain cases rigorously [17], one can show the existence of breathers, i.e., of solutions which are spatially (exponentially) localized and time-periodic. Although the breathers occur both for k1 > k2 and k2 ≥ k1 they behave differently at high energies. For k1 > k2 , the higher the energy, the more localized the breathers get (hard breathers), while for k2 ≥ k1 , as the energy gets bigger the breathers become less and less localized (soft breathers). 
In fact a key point of our analysis is to show that at high energy, if the energy $E$ of the initial condition is localized away from the boundary, then after a time of order one, the oscillators at the boundaries carry at least an energy of order $E^{2/k_2}$, so that the energy of the chain can relax into the reservoirs. Although we believe that the existence of a SNS may not depend too much on these localization phenomena, the rate of convergence to the SNS presumably does. Our approach of controlling the dynamics by the zero-temperature dynamics may not be adequate if Condition H1 fails to hold, and so more refined estimates on the dynamics are needed to show that these localized states might in fact be destroyed by the coupling to the reservoirs.

As regards the organization of this paper, Sect. 2 presents the effective stochastic differential equations for the chain, a discussion of allowable interactions between the reservoirs and the chain and a concise statement, Theorem 2.1, of the exponential convergence. In Sect. 3 we discuss the dissipative deterministic system (corresponding to reservoirs at temperature 0), Theorem 3.3, and then we show the extent to which the random paths follow the deterministic ones, Proposition 3.7. We give a lower bound on the random energy dissipation, Corollary 3.8. We then conclude Sect. 3 by providing the Liapunov function, Theorem 3.10, and bounds on the exponential hitting times on (sufficiently large) compact sets, Theorem 3.11. In Sect. 4 we prove that the random process has a smooth law and at most one ergodic component, improving slightly results of [6, 7, 5]. Finally in Sect. 5 we conclude the proof of Theorem 2.1 by invoking results of [18] on the ergodic theory of Markov processes.

2. Effective Equations

We first give a precise description of the reservoirs and of their coupling to the system and derive the stochastic equations which we will study.
A free phonon gas is described by a linear wave equation in $\mathbf{R}^d$, i.e., by the pair of real fields $\phi(x) = (\varphi(x), \pi(x))$, $x \in \mathbf{R}^d$. We define the norm $\|\phi\|$ by $\|\phi\|^2 \equiv \int dx\,(|\nabla\varphi(x)|^2 + |\pi(x)|^2)$ and denote by $\langle\cdot,\cdot\rangle$
the corresponding scalar product. The phase space of the reservoirs at finite energy is the real Hilbert space of functions $\phi(x)$ such that the energy $H_B(\phi) = \|\phi\|^2/2$ is finite, and the equations of motion are

$$\dot\phi(t,x) = L\phi(t,x), \qquad L = \begin{pmatrix} 0 & 1 \\ -\Delta & 0 \end{pmatrix}.$$

In order to describe the coupling of the reservoir to the system, let us consider first a single confined particle in $\mathbf{R}^d$ with Hamiltonian $H_S(p,q) = p^2/2 + V(q)$. As the Hamiltonian for the coupled system particle plus one single reservoir, we have

$$H(\phi,p,q) = \frac12\left(\|\phi\|^2 + p^2\right) + V(q) + q\cdot\int dx\,\nabla\varphi(x)\rho(x) = H_B(\phi) + H_S(p,q) + q\cdot\langle\phi,\alpha\rangle,$$

where $\rho(x)$ is a real rotation invariant function and $\alpha = (\alpha^{(1)},\cdots,\alpha^{(d)})$ is, in Fourier space, given by

$$\hat\alpha^{(i)} = \begin{pmatrix} -ik^{(i)}\hat\rho(k)/k^2 \\ 0 \end{pmatrix}.$$

We introduce the covariance matrix $C^{(ij)}(t) = \langle \exp(Lt)\alpha^{(i)}, \alpha^{(j)}\rangle$. A simple computation shows that

$$C^{(ij)}(t) = \frac1d\,\delta_{ij}\int dk\,|\rho(k)|^2 e^{i|k|t},$$

and we define a coupling constant $\lambda$ by putting $\lambda^2 = C^{(ii)}(0) = \frac1d\int dk\,|\rho(k)|^2$. The equations of motion of the coupled system are

$$\begin{aligned}
\dot q(t) &= p(t),\\
\dot p(t) &= -\nabla V(q(t)) - \langle\phi,\alpha\rangle,\\
\dot\phi(t,k) &= L\left(\phi(t,k) + q(t)\cdot\alpha(k)\right).
\end{aligned} \tag{5}$$

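With the change of variables $\psi(k) = \phi(k) + q\cdot\alpha(k)$, Eqs. (5) become

$$\begin{aligned}
\dot q(t) &= p(t),\\
\dot p(t) &= -\nabla V_{\rm eff}(q(t)) - \langle\psi,\alpha\rangle,\\
\dot\psi(t,k) &= L\psi(t,k) + p(t)\cdot\alpha(k),
\end{aligned} \tag{6}$$

where $V_{\rm eff}(q) = V(q) - \lambda^2 q^2/2$. Integrating the last of Eqs. (6) with initial condition $\psi_0(k)$ one finds

$$\psi(t,k) = e^{Lt}\psi_0(k) + \int_0^t ds\, e^{L(t-s)}\alpha(k)\cdot p(s),$$

and inserting into the second of Eqs. (6) gives

$$\begin{aligned}
\dot q(t) &= p(t),\\
\dot p(t) &= -\nabla V_{\rm eff}(q(t)) - \int_0^t ds\, C(t-s)p(s) - \langle\psi_0, e^{-Lt}\alpha\rangle.
\end{aligned} \tag{7}$$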


If we now assume that, at time $t = 0$, the reservoir is at temperature $T$, then $\psi_0$ is distributed according to the Gaussian measure with covariance $T\langle\cdot,\cdot\rangle$, and then $\xi(t) \equiv \langle\psi_0, e^{-Lt}\alpha\rangle$ is a $d$-dimensional stationary Gaussian process with mean 0 and covariance $TC(t-s)$. Note that the covariance itself appears in the deterministic memory term on the r.h.s. of Eq. (7) (fluctuation-dissipation relation). By Assumption H3 there is a polynomial $p(u)$ which is a real function of $iu$ and which has its roots in the lower half plane such that

$$C^{(ii)}(t) = \int_{-\infty}^{\infty} du\, \frac{1}{|p(u)|^2}\, e^{iut}.$$

Note that this is a Markovian assumption [4]: $\xi(t)$ is Markovian in the sense that we have the identity $p(-i\,d/dt)\xi(t) = \dot\omega(t)$, where $\dot\omega(t)$ is a white noise, i.e., the joint motion of $d^m\xi(t)/dt^m$, $0 \le m \le \deg p - 1$, is a (Gaussian) Markov process. This assumption together with the fluctuation-dissipation relation permits, by extending the phase space with a finite number of variables, to rewrite the integro-differential equations (7) as a Markov process. Note that $\xi(t)$ can be written as [4]

$$\xi(t) = \int_{-\infty}^{\infty} k(t-t')\,d\omega(t'), \qquad k(t) = \int_{-\infty}^{\infty} du\, e^{iut} p(u)^{-1},$$

with $k(t) = 0$ for $t \le 0$. For example if $p(u) \propto iu + \gamma$ then $C^{(ii)}(t) = \lambda^2 e^{-\gamma|t|}$. Introducing the variable $r$ defined by

$$\lambda r(t) = \int_0^t ds\, C(t-s)p(s) + \int_{-\infty}^t k(t-t')\,d\omega(t'),$$

we obtain from Eqs. (7) the set of Markovian differential equations:

$$\begin{aligned}
\dot q(t) &= p(t),\\
\dot p(t) &= -\nabla V_{\rm eff}(q(t)) - \lambda r(t),\\
dr(t) &= (-\gamma r(t) + \lambda p(t))\,dt + (2T\gamma)^{1/2}\,d\omega(t).
\end{aligned} \tag{8}$$
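The elimination of the memory term can be checked numerically for the exponential kernel $C(t) = \lambda^2 e^{-\gamma|t|}$. The following is a minimal sketch, not part of the original argument: it ignores the noise and compares the memory integral $\lambda\int_0^t e^{-\gamma(t-s)}p(s)\,ds$ with the solution of $\dot r = -\gamma r + \lambda p$, $r(0)=0$, which is the drift part of Eq. (8). The function names, the test signal $p(s) = \cos s$ and the step counts are illustrative assumptions.

```python
import math


def memory_integral(p, t, lam, gamma, n=20000):
    """Trapezoid rule for r(t) = lam * int_0^t exp(-gamma (t - s)) p(s) ds (noise omitted)."""
    h = t / n
    total = 0.0
    for i in range(n + 1):
        s = i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(-gamma * (t - s)) * p(s)
    return lam * h * total


def auxiliary_ode(p, t, lam, gamma, n=20000):
    """Classical RK4 for dr/dt = -gamma r + lam p(t), r(0) = 0 (drift part of Eq. (8))."""
    h = t / n
    r = 0.0

    def f(time, value):
        return -gamma * value + lam * p(time)

    for i in range(n):
        s = i * h
        k1 = f(s, r)
        k2 = f(s + h / 2, r + h / 2 * k1)
        k3 = f(s + h / 2, r + h / 2 * k2)
        k4 = f(s + h, r + h * k3)
        r += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return r
```

For $p(s) = \cos s$ both functions should agree with the closed form $\lambda(\gamma\cos t + \sin t - \gamma e^{-\gamma t})/(\gamma^2+1)$ up to discretization error.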

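If $p(u) \propto (iu + \gamma + i\sigma)(iu + \gamma - i\sigma)$, then $C(t) = \lambda^2\cos(\sigma t)e^{-\gamma|t|}$, and introducing the two auxiliary variables $r$ and $s$ defined by

$$\begin{aligned}
\lambda r(t) &= \lambda^2\int_0^t ds\,\cos(\sigma(t-s))e^{-\gamma|t-s|}p(s) + (T\lambda^2\gamma)^{1/2}\int_{-\infty}^t \cos(\sigma(t-s))e^{-\gamma|t-s|}\,d\omega(s),\\
\lambda s(t) &= \lambda^2\int_0^t ds\,\sin(\sigma(t-s))e^{-\gamma|t-s|}p(s) + (T\lambda^2\gamma)^{1/2}\int_{-\infty}^t \sin(\sigma(t-s))e^{-\gamma|t-s|}\,d\omega(s),
\end{aligned}$$

we obtain then the set of Markovian differential equations:

$$\begin{aligned}
\dot q(t) &= p(t),\\
\dot p(t) &= -\nabla V_{\rm eff}(q(t)) - \lambda r(t),\\
dr(t) &= (-\gamma r(t) - \sigma s(t) + \lambda p(t))\,dt + (2T\gamma)^{1/2}\,d\omega(t),\\
\dot s(t) &= -\gamma s(t) + \sigma r(t).
\end{aligned} \tag{9}$$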


Obviously other similar sets of equations can be derived for an arbitrary polynomial $p(u)$. Another coupling which we could easily handle with our methods occurs in the following limiting case, see [8]. Formally one wants to take $C^{(ii)}(t) = \eta^2\delta(t)$. Note that this corresponds to a coupling function with $|\rho(k)|^2 = 1$, in which case $\lambda^2 = \infty$. A possible limiting procedure consists in taking a sequence of covariances tending to a delta function and at the same time suitably rescaling the coupling (see [8]). In this case one obtains the Langevin equations which serve as the commonly-used model system with reservoir in the physics literature,

$$\begin{aligned}
\dot q(t) &= p(t),\\
dp(t) &= (-\nabla V_{\rm eff}(q(t)) - \eta^2 p(t))\,dt + (2T\eta^2)^{1/2}\,d\omega(t).
\end{aligned} \tag{10}$$

The derivation of the effective equations for the chain is a straightforward generalization of the above computations. Our techniques apply equally well to any of the couplings above. However, for simplicity, we will only consider the case where the couplings to both reservoirs satisfy $|\rho_i(k)|^2 \propto k^2 + \gamma^2$, $i = L, R$. For notational simplicity we set $T_1 = T_L$ and $T_n = T_R$, we denote $r_1$ and $r_n$ as the two auxiliary variables and we will use the notations $r = (r_1, r_n)$ and $x = (p, q, r) \in X = \mathbf{R}^{2d(n+1)}$. In this case we obtain the set of Markovian stochastic differential equations given by

$$\begin{aligned}
\dot q_1 &= p_1,\\
\dot p_1 &= -\nabla_{q_1}V_{\rm eff}(q) - \lambda r_1,\\
dr_1 &= (-\gamma r_1 + \lambda p_1)\,dt + (2T_1\gamma)^{1/2}\,d\omega_1,\\
\dot q_j &= p_j, \qquad j = 2,\ldots,n-1,\\
\dot p_j &= -\nabla_{q_j}V_{\rm eff}(q), \qquad j = 2,\ldots,n-1,\\
\dot q_n &= p_n,\\
\dot p_n &= -\nabla_{q_n}V_{\rm eff}(q) - \lambda r_n,\\
dr_n &= (-\gamma r_n + \lambda p_n)\,dt + (2T_n\gamma)^{1/2}\,d\omega_n,
\end{aligned} \tag{11}$$

where $V_{\rm eff}(q) = V(q) - \lambda^2 q_1^2/2 - \lambda^2 q_n^2/2$. From now on, for notational simplicity we will suppress the index "eff" and consider $V = V_{\rm eff}$ as our potential energy. It will be useful to introduce the following notation. We define the linear maps $\Lambda: \mathbf{R}^{dn} \to \mathbf{R}^{2d}$ by $\Lambda(x_1,\ldots,x_n) = (\lambda x_1, \lambda x_n)$ and $T: \mathbf{R}^{2d} \to \mathbf{R}^{2d}$ by $T(x,y) = (T_1x, T_ny)$. With this we can rewrite Eqs. (11) in the compact form

$$\begin{aligned}
\dot q &= p,\\
\dot p &= -\nabla_q V - \Lambda^* r,\\
dr &= (-\gamma r + \Lambda p)\,dt + (2\gamma T)^{1/2}\,d\omega.
\end{aligned} \tag{12}$$
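To make the structure of Eqs. (12) concrete, here is a minimal simulation sketch for $d = 1$. It is an illustration only and rests on several assumptions not made in the text: the quartic potential $V(q) = \sum_i q_i^4/4 + \sum_i (q_i - q_{i+1})^4/4$, a first-order semi-implicit Euler discretization with Gaussian increments for the noise, at least two oscillators, and the names `grad_V`, `energy` and `simulate`.

```python
import math
import random


def grad_V(q):
    """Gradient of the assumed potential sum q_i^4/4 + sum (q_i - q_{i+1})^4/4."""
    g = [qi ** 3 for qi in q]
    for i in range(len(q) - 1):
        c = (q[i] - q[i + 1]) ** 3
        g[i] += c
        g[i + 1] -= c
    return g


def energy(p, q, r):
    """G(p, q, r) = r^2/2 + p^2/2 + V(q) for the assumed potential."""
    V = sum(qi ** 4 for qi in q) / 4
    V += sum((q[i] - q[i + 1]) ** 4 for i in range(len(q) - 1)) / 4
    return sum(pi * pi for pi in p) / 2 + V + (r[0] ** 2 + r[1] ** 2) / 2


def simulate(p, q, r, T1, Tn, lam=1.0, gamma=1.0, dt=1e-3, steps=5000, seed=0):
    """Integrate Eqs. (12) for d = 1; r = [r_1, r_n] couples to the two end oscillators."""
    rng = random.Random(seed)
    p, q, r = list(p), list(q), list(r)
    n = len(q)
    for _ in range(steps):
        g = grad_V(q)
        # dp = (-grad V - Lambda^* r) dt: only the end oscillators feel the reservoirs
        for i in range(n):
            p[i] -= dt * g[i]
        p[0] -= dt * lam * r[0]
        p[-1] -= dt * lam * r[1]
        # dq = p dt, using the updated momenta
        for i in range(n):
            q[i] += dt * p[i]
        # dr = (-gamma r + Lambda p) dt + (2 gamma T)^{1/2} d omega
        r[0] += dt * (-gamma * r[0] + lam * p[0]) + math.sqrt(2 * gamma * T1 * dt) * rng.gauss(0.0, 1.0)
        r[1] += dt * (-gamma * r[1] + lam * p[-1]) + math.sqrt(2 * gamma * Tn * dt) * rng.gauss(0.0, 1.0)
    return p, q, r
```

Setting `T1 = Tn = 0` switches the noise off, so that the same routine integrates the purely dissipative dynamics, along which the energy $G$ should decrease up to discretization error.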

The solution $x(t)$ of Eqs. (12) is a Markov process. We denote $T^t$ as the associated semigroup, $T^t f(x) = E_x[f(x(t))]$, with generator

$$L = \gamma\left(\nabla_r T\nabla_r - r\nabla_r\right) + \Lambda p\nabla_r - r\Lambda\nabla_p + p\nabla_q - (\nabla_q V(q))\nabla_p, \tag{13}$$

and $P_t(x,dy)$ as the transition probability of the Markov process $x(t)$. There is a natural energy function which is associated to Eq. (12), given by

$$G(p,q,r) = \frac{r^2}{2} + H(p,q).$$

A straightforward computation shows that in the special case $T_1 = T_n = T$, $Z^{-1}e^{-G(p,q,r)/T}$ is an invariant measure for the Markov process $x(t)$. Given a function $W: X \to \mathbf{R}$ satisfying $W \ge 1$ we consider the following weighted total variation norm $\|\cdot\|_W$ given by

$$\|\pi\|_W = \sup_{|f|\le W}\left|\int f\,d\pi\right|, \tag{14}$$

for any (signed) measure $\pi$. We introduce norms $\|\cdot\|_\theta$ and Banach spaces $L^\infty_\theta(X)$ given by

$$\|f\|_\theta = \sup_{x\in X}\frac{|f(x)|}{e^{\theta G(x)}}, \qquad L^\infty_\theta(X) = \{f : \|f\|_\theta < \infty\}, \tag{15}$$

and write $\|K\|_\theta$ for the norm of an operator $K: L^\infty_\theta(X) \to L^\infty_\theta(X)$. Theorem 1.1 is a direct consequence of the following result:

Theorem 2.1. Assume that Conditions H1 and H2 hold. The Markov process $x(t)$ which solves (12) has smooth transition probability densities, $P_t(x,dy) = p_t(x,y)\,dy$, with $p_t(x,y) \in C^\infty((0,\infty)\times X\times X)$. The Markov process $x(t)$ has a unique invariant measure $\mu$, and $\mu$ has a $C^\infty$ everywhere positive density. For any $\theta$ with $0 < \theta < (\max\{T_1,T_n\})^{-1}$ there exist constants $r = r(\theta) > 1$ and $R = R(\theta) < \infty$ such that

$$\|P_t(x,\cdot) - \mu\|_{\exp(\theta G)} \le R\,r^{-t}\exp(\theta G(x)), \tag{16}$$

for all $x \in X$ (exponential convergence to the SNS), or equivalently

$$\|T^t - \mu\|_\theta \le R\,r^{-t}$$

(spectral gap). Furthermore for all functions $f$, $g$ with $f^2, g^2 \in L^\infty_\theta(X)$ and all $t > 0$ we have

$$\left|\int g\,T^t f\,d\mu - \int f\,d\mu\int g\,d\mu\right| \le R\,r^{-t}\,\|f^2\|_\theta^{1/2}\,\|g^2\|_\theta^{1/2}$$

(exponential decay of correlations in the SNS).

The convergence in the weighted variation norm, Eq. (16), implies that the Law of Large Numbers holds [10, 18].

Corollary 2.2. Under Assumptions H1 and H2, $x(t)$ satisfies the Law of Large Numbers: For all initial conditions $x \in X$ and all $f \in L^1(X,d\mu)$,

$$\lim_{T\to\infty}\frac1T\int_0^T f(x(t))\,dt = \int f\,d\mu$$

almost surely.


The convergence of the transition probabilities as given in (16) is shown in [18] to follow from the following properties:

• Strong Feller property. The diffusion process is strong Feller, i.e., the semigroup $T^t$ maps bounded measurable functions into continuous functions. This is a consequence of the hypoellipticity of the diffusion $x(t)$, which follows from Condition H2, see Sect. 4.

• Small-time open set accessibility. For all $t > 0$, all $x \in X$ and all open sets $A \subset X$ we have $P_t(x,A) > 0$. This means that the Markov process is "strongly aperiodic". In particular, combined with the strong Feller property it implies uniqueness of the invariant measure. This property is discussed in Sect. 4 using the support theorem of [28] and explicit computations. This generalizes (slightly) the result obtained in [7].

• Liapunov function and hitting times. Fix $s > 0$ arbitrary. Set $W = \exp(\theta G)$ and choose $\theta$ with $0 < \theta < (\max\{T_1,T_n\})^{-1}$. Then $W$ is a Liapunov function for the Markov chain $\{x(ns)\}_{n\ge0}$: $W > 1$, $W$ has compact level sets and there is a compact set $U$ (depending on $s$ and $\theta$) and constants $\kappa < 1$ and $b < \infty$ (both depending on $U$, $s$ and $\theta$) such that

$$T^s W(x) \le \kappa W(x) + b\,\mathbf{1}_U(x), \tag{17}$$

where $\mathbf{1}_U$ denotes the indicator function of the set $U$. In addition the constant $\kappa$ in Eq. (17) can be chosen arbitrarily small by choosing the set $U$ sufficiently large.

The existence of a Liapunov function is the main technical result of this paper (see Sect. 3) and the Condition H1 is crucial to obtain it. Note that the time derivative of the (averaged) energy

$$\frac{d}{dt}E_x[G(x(t))] = \gamma E_x[\mathrm{Tr}(T) - r^2(t)]$$

is not necessarily negative. But it is the case, as follows from our analysis below, that, for $t > 0$, $E_x[G(x(t)) - G(x)] < -cG(x)^{2/k_2}$ for $x$ sufficiently large.

A nice interpretation of a Liapunov bound of the form (17) is in terms of hitting times. Let $\tau_U$ denote the first time the diffusion $x(t)$ hits the set $U$; then Eq. (17) implies that $\tau_U$ is exponentially bounded. We will show that for any $a > 0$, no matter how large, we can find a compact set $U = U(a)$ such that $E_x[e^{a\tau_U}] < \infty$ for all $x \in X$. So except for exponentially rare excursions the Markov process $x(t)$ lives on the compact set $U$. Combined with the fact that the process has a smooth law, this provides an intuitive picture of the exponential convergence result of Theorem 2.1.
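One elementary consequence of a drift bound of the form (17) can be made explicit. Since $\mathbf{1}_U \le 1$, iterating the bound along the chain $\{x(ns)\}$ gives $E_x[W(x(ns))] \le \kappa^n W(x) + b/(1-\kappa)$. The sketch below only illustrates this arithmetic; the function names and the sample values of $\kappa$, $b$ and $W(x)$ are hypothetical.

```python
def drift_bound(W0, kappa, b, n):
    """Iterate the worst case w -> kappa * w + b of the drift bound n times."""
    w = W0
    for _ in range(n):
        w = kappa * w + b
    return w


def closed_form(W0, kappa, b, n):
    """Closed form of the same recursion: kappa^n W0 + b (1 - kappa^n) / (1 - kappa)."""
    return kappa ** n * W0 + b * (1 - kappa ** n) / (1 - kappa)
```

For instance, `drift_bound(50.0, 0.3, 2.0, n)` approaches `2.0 / (1 - 0.3)` geometrically in `n`, independently of the starting value.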


3. Liapunov Function and Hitting Times

3.1. Scaling and deterministic energy dissipation. We first consider the question of energy dissipation for the following deterministic equations:

$$\begin{aligned}
\dot q &= p,\\
\dot p &= -\nabla_q V(q) - \Lambda^* r,\\
\dot r &= -\gamma r + \Lambda p,
\end{aligned} \tag{18}$$

obtained from Eq. (12) by setting $T_1 = T_n = 0$, corresponding to an initial condition of the reservoirs with energy 0. A simple computation shows that the energy $G(p,q,r)$ is non-increasing along the flow $x(t) = (p(t),q(t),r(t))$ given by Eq. (18):

$$\frac{d}{dt}G(p(t),q(t),r(t)) = -\gamma r^2(t) \le 0.$$

We now show by a scaling argument that for any initial condition with sufficiently high energy, after a small time, a substantial amount of energy is dissipated. At high energy, the two-body interaction $U^{(2)}$ in the potential dominates the term $U^{(1)}$ since $k_2 \ge k_1$, and so for an initial condition with energy $G(x) = E$, the natural time scale (essentially the period of a single one-dimensional oscillator in the potential $|q|^{k_2}$) is $E^{1/k_2 - 1/2}$. We scale a solution of Eq. (18) with initial energy $E$ as follows:

$$\begin{aligned}
\tilde p(t) &= E^{-\frac12}\,p\big(E^{\frac1{k_2}-\frac12}t\big),\\
\tilde q(t) &= E^{-\frac1{k_2}}\,q\big(E^{\frac1{k_2}-\frac12}t\big),\\
\tilde r(t) &= E^{-\frac1{k_2}}\,r\big(E^{\frac1{k_2}-\frac12}t\big).
\end{aligned} \tag{19}$$

Accordingly the energy scales as $G(p,q,r) = E\,\tilde G_E(\tilde p,\tilde q,\tilde r)$, where

$$\begin{aligned}
\tilde G_E(\tilde p,\tilde q,\tilde r) &= \frac{\tilde p^2}{2} + E^{\frac2{k_2}-1}\frac{\tilde r^2}{2} + \tilde V_E(\tilde q),\\
\tilde V_E(\tilde q) &= \sum_{i=1}^n \tilde U^{(1)}(\tilde q_i) + \sum_{i=1}^{n-1}\tilde U^{(2)}(\tilde q_i - \tilde q_{i+1}),\\
\tilde U^{(i)}(\tilde x) &= E^{-1}U^{(i)}\big(E^{\frac1{k_2}}\tilde x\big), \qquad i = 1,2.
\end{aligned}$$

The equations of motion for the rescaled variables are

$$\begin{aligned}
\dot{\tilde q} &= \tilde p,\\
\dot{\tilde p} &= -\nabla_{\tilde q}\tilde V_E(\tilde q) - E^{\frac2{k_2}-1}\Lambda^*\tilde r,\\
\dot{\tilde r} &= -E^{\frac1{k_2}-\frac12}\gamma\tilde r + \Lambda\tilde p.
\end{aligned}$$

By Assumption H1, as $E \to \infty$ the rescaled energy becomes

$$\tilde G_\infty(\tilde p,\tilde q,\tilde r) \equiv \lim_{E\to\infty}\tilde G_E(\tilde p,\tilde q,\tilde r) = \begin{cases} \tilde p^2/2 + \tilde V_\infty(\tilde q), & k_1 = k_2 > 2 \text{ or } k_2 > k_1 \ge 2,\\ \tilde r^2/2 + \tilde p^2/2 + \tilde V_\infty(\tilde q), & k_1 = k_2 = 2, \end{cases} \tag{20}$$

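where

$$\tilde V_\infty(\tilde q) = \begin{cases} \sum_{i=1}^{n} a^{(1)}\|\tilde q_i\|^{k_2} + \sum_{i=1}^{n-1} a^{(2)}\|\tilde q_i - \tilde q_{i+1}\|^{k_2}, & k_1 = k_2 \ge 2,\\ \sum_{i=1}^{n-1} a^{(2)}\|\tilde q_i - \tilde q_{i+1}\|^{k_2}, & k_2 > k_1 \ge 2. \end{cases}$$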

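The equations of motion scale in this limit to

$$\begin{aligned}
\dot{\tilde q} &= \tilde p,\\
\dot{\tilde p} &= -\nabla_{\tilde q}\tilde V_\infty(\tilde q),\\
\dot{\tilde r} &= \Lambda\tilde p,
\end{aligned} \tag{21}$$

in the case $k_2 > 2$, while they scale to

$$\begin{aligned}
\dot{\tilde q} &= \tilde p,\\
\dot{\tilde p} &= -\nabla_{\tilde q}\tilde V_\infty(\tilde q) - \Lambda^*\tilde r,\\
\dot{\tilde r} &= -\gamma\tilde r + \Lambda\tilde p,
\end{aligned} \tag{22}$$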

in the case $k_1 = k_2 = 2$.

Remark 3.1. The scaling for $p$ and $q$ is natural due to the Hamiltonian nature of the problem, but the scaling of $r$ has a certain amount of arbitrariness. Since $G$ is quadratic in $r$, it might appear natural to scale $r$ with a factor $E^{-1/2}$ instead of $E^{-1/k_2}$ as we do. On the other hand, the very definition of $r$ as an integral of $p$ suggests that $r$ should scale as $q$, as we have chosen.

Remark 3.2. Had we supposed, instead of H1, that $k_1 > k_2$, then the natural time scale at high energy would be $E^{1/k_1 - 1/2}$. Scaling the variables (with $k_2$ replaced by $k_1$) would yield the limiting Hamiltonian $\sum_i \big(\tilde p_i^2/2 + a^{(1)}\|\tilde q_i\|^{k_1}\big)$, i.e., the Hamiltonian of $n$ uncoupled oscillators. So in this case, at high energy, essentially no energy is transmitted through the chain. While this does not necessarily preclude the existence of an invariant measure, we expect in this case the convergence to a SNS to be much slower. In any case even the existence of the SNS in this case remains an open problem.

Theorem 3.3. Given $\tau > 0$ fixed there are constants $c > 0$ and $E_0 < \infty$ such that for any $x$ with $G(x) = E > E_0$ and any solution $x(t)$ of Eq. (18) with $x(0) = x$ we have the estimate, for $t_E = E^{1/k_2 - 1/2}\tau$,

$$G(x(t_E)) - E \le -cE^{\frac3{k_2}-\frac12}. \tag{23}$$
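The energy balance behind Theorem 3.3, $G(x(t)) - G(x(0)) = -\gamma\int_0^t r^2(s)\,ds$ along solutions of Eq. (18), can be checked numerically. The following is only a sketch under assumptions that are not taken from the text: $d = 1$, the quartic potential $V(q) = \sum_i q_i^4/4 + \sum_i (q_i - q_{i+1})^4/4$, a classical fourth-order Runge-Kutta integrator, and illustrative function names. The dissipated energy is carried as an extra state component $D$ with $\dot D = \gamma r^2$, so that $G + D$ should stay constant.

```python
def rhs(state, n, lam, gamma):
    """Right-hand side of Eq. (18) for d = 1; state = p (n), q (n), r (2), D (1)."""
    p, q, r = state[:n], state[n:2 * n], state[2 * n:2 * n + 2]
    g = [qi ** 3 for qi in q]
    for i in range(n - 1):
        c = (q[i] - q[i + 1]) ** 3
        g[i] += c
        g[i + 1] -= c
    dp = [-gi for gi in g]
    dp[0] -= lam * r[0]    # -Lambda^* r acts on the first oscillator
    dp[-1] -= lam * r[1]   # ... and on the last one
    dq = list(p)
    dr = [-gamma * r[0] + lam * p[0], -gamma * r[1] + lam * p[-1]]
    dD = gamma * (r[0] ** 2 + r[1] ** 2)  # accumulated dissipation
    return dp + dq + dr + [dD]


def rk4_step(state, n, lam, gamma, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(base, k, h):
        return [b + h * ki for b, ki in zip(base, k)]

    k1 = rhs(state, n, lam, gamma)
    k2 = rhs(shifted(state, k1, dt / 2), n, lam, gamma)
    k3 = rhs(shifted(state, k2, dt / 2), n, lam, gamma)
    k4 = rhs(shifted(state, k3, dt), n, lam, gamma)
    return [s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def integrate(state, n, lam=1.0, gamma=1.0, dt=1e-3, steps=2000):
    for _ in range(steps):
        state = rk4_step(state, n, lam, gamma, dt)
    return state


def energy(state, n):
    """G = p^2/2 + V(q) + r^2/2 for the assumed quartic potential."""
    p, q, r = state[:n], state[n:2 * n], state[2 * n:2 * n + 2]
    V = sum(qi ** 4 for qi in q) / 4
    V += sum((q[i] - q[i + 1]) ** 4 for i in range(n - 1)) / 4
    return sum(pi * pi for pi in p) / 2 + V + (r[0] ** 2 + r[1] ** 2) / 2
```

Starting from a state with the energy stored in the positions, the last component of the returned state gives the energy lost to the reservoirs up to the final time.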

Remark 3.4. In view of Eq. (23), this shows that $r$ is at least typically $O(E^{1/k_2})$ on the time interval $[0, E^{1/k_2 - 1/2}\tau]$.

Proof. Given a solution of Eq. (18) with initial condition $x$ of energy $G(x) = E$, we use the scaling given by Eq. (19) and we obtain

$$G(x(t_E)) - E = -\gamma\int_0^{t_E} dt\, r^2(t) = -\gamma E^{\frac3{k_2}-\frac12}\int_0^\tau dt\,\tilde r^2(t), \tag{24}$$

where $\tilde r(t)$ is the solution of Eq. (20) with initial condition $\tilde x$ of (rescaled) energy $\tilde G_E(\tilde x) = 1$. By Assumption H2 we may choose $E_0$ so large that for $E > E_0$ the critical points of $\tilde G_E$ are contained in, say, the set $\{\tilde G_E \le 1/2\}$.


For a fixed $E$ and $x$ with $G(x) = E$, we show that there is a constant $c_{x,E} > 0$ such that

$$\int_0^\tau dt\,\tilde r^2(t) \ge c_{x,E}. \tag{25}$$

The proof is by contradiction, cf. [21]. Suppose that $\int_0^\tau dt\,\tilde r^2(t) = 0$; then we have $\tilde r(t) = 0$ for all $t \in [0,\tau]$. From the third equation in (20) we conclude that $\tilde p_1(t) = \tilde p_n(t) = 0$ for all $t \in [0,\tau]$, and so from the first equation in (20) we see that $\tilde q_1(t)$ and $\tilde q_n(t)$ are constant on $[0,\tau]$. The second equation in (20) gives then

$$0 = \dot{\tilde p}_1(t) = -\nabla_{\tilde q_1}\tilde V(\tilde q(t)) = -\nabla_{\tilde q_1}\tilde U^{(1)}(\tilde q_1(t)) - \nabla_{\tilde q_1}\tilde U^{(2)}(\tilde q_1(t) - \tilde q_2(t)),$$

together with a similar equation for $\dot{\tilde p}_n$. By our Assumption H1 the map $\nabla\tilde U^{(2)}$ has a right inverse $g$, locally bounded and measurable, and thus we obtain

$$\tilde q_2(t) = \tilde q_1(t) - g\big(\tilde U^{(1)}(\tilde q_1(t))\big).$$

Since $\tilde q_1$ is constant, this implies that $\tilde q_2$ is also constant on $[0,\tau]$. Similarly we see that $\tilde q_{n-1}$ is constant on $[0,\tau]$. Using again the first equation in (20) we obtain now $\tilde p_2(t) = \tilde p_{n-1}(t) = 0$ for all $t \in [0,\tau]$. Inductively one concludes that $\tilde r = 0$ implies $\tilde p = 0$ and $\nabla_{\tilde q}\tilde V = 0$, and thus the initial condition $\tilde x$ is a critical point of $\tilde G_E$. This contradicts our assumption and Eq. (25) follows.

Now for given $E$, the energy surface $\{\tilde G_E = 1\}$ is compact. Using the continuity of the solutions of O.D.E. with respect to initial conditions we conclude that there is a constant $c_E > 0$ such that

$$\inf_{\tilde x \in \{\tilde G_E = 1\}}\int_0^\tau dt\,\tilde r^2(t) \ge c_E.$$

Finally we investigate the dependence on $E$ of $c_E$. We note that for $E = \infty$, $\tilde G_\infty$ has a well-defined limit given by Eq. (21) and the rescaled equations of motion, in the limit $E \to \infty$, are given by Eqs. (21) in the case $k_2 > 2$ and by Eq. (22) in the case $k_1 = k_2 = 2$. Except in the case $k_1 = k_2 = 2$ the energy surface $\{\tilde G_\infty = 1\}$ is not compact. However, in the case $k_1 = k_2 > 2$, the Hamiltonian $\tilde G_\infty$ and the equation of motion are invariant under the translation $r \to r + a$, for any $a \in \mathbf{R}^{2d}$. And in the case $k_2 > k_1 > 2$ the Hamiltonian $\tilde G_\infty$ and the equation of motion are invariant under the translation $r \to r + a$, $q \to q + b$, for any $a \in \mathbf{R}^{2d}$ and $b \in \mathbf{R}^{dn}$. The quotient of the energy surface $\{\tilde G_\infty = 1\}$ by these translations is compact.

Note that for a given $\tilde x \in \{\tilde G_\infty = 1\}$ a similar argument as above shows that $\int_0^\tau dt\,(\tilde r + a)^2 > 0$ for any $a > 0$, and since this integral clearly goes to $\infty$ as $a \to \infty$ there exists a constant $c_\infty > 0$ such that

$$\inf_{\tilde x \in \{\tilde G_\infty = 1\}}\int_0^\tau \tilde r^2(t)\,dt > c_\infty.$$

Using again that the solution of O.D.E. depends smoothly on its parameters, we obtain

$$\inf_{E > E_0}\ \inf_{\tilde x \in \{\tilde G_E = 1\}}\int_0^\tau dt\,\tilde r^2(t) > c.$$

This estimate, together with Eq. (24), gives the conclusion of Theorem 3.3. □


3.2. Approximate deterministic behavior of random paths. In this section we show that at sufficiently high energies, the overwhelming majority of the random paths $x(t) = x(t,\omega)$ solving Eqs. (12) follows very closely the deterministic paths $x_{\rm det}$ solving Eqs. (18). As a consequence, for most random paths the same amount of energy is dissipated into the reservoirs as for the corresponding deterministic ones. We need the following a priori "no-runaway" bound on the growth of $G(x(t))$.

Lemma 3.5. Let $\theta \le (\max\{T_1,T_n\})^{-1}$. Then $E_x[\exp(\theta G(x(t)))]$ is well-defined and satisfies the bound

$$E_x[\exp(\theta G(x(t)))] \le \exp(\gamma\mathrm{Tr}(T)\theta t)\exp(\theta G(x)). \tag{26}$$

Moreover for any $x$ with $G(x) = E$ and any $\delta > 0$ we have the estimate

$$P_x\Big\{\sup_{0\le s\le t} G(x(s)) \ge (1+\delta)E\Big\} \le \exp(\gamma\mathrm{Tr}(T)\theta t)\exp(-\delta\theta E). \tag{27}$$

Remark 3.6. The lemma shows that for $E$ sufficiently large, with very high probability, $G(x(t)) = O(E)$ if $G(x) = E$. The assumption on $\theta$ here arises naturally in the proof, where we need $(1 - \theta T) \ge 0$, cf. Eq. (28).

Proof. For $\theta \le (\max\{T_1,T_n\})^{-1}$ we have the bound (the generator $L$ is given by Eq. (13))

$$L\exp(\theta G(x)) = \gamma\theta\exp(\theta G(x))\left(\mathrm{Tr}(T) - r(1-\theta T)r\right) \le \gamma\theta\,\mathrm{Tr}(T)\exp(\theta G(x)), \tag{28}$$

so that for the function $W(t,x) = \exp(-\gamma\theta\mathrm{Tr}(T)t)\exp(\theta G(x))$ we have the inequality $(\partial_t + L)W(t,x) \le 0$. We denote $\sigma_R$ as the exit time from the set $\{G(x) < R\}$, i.e., $\sigma_R = \inf\{t \ge 0,\ G(x(t)) \ge R\}$. If the initial condition $x$ satisfies $G(x) = E < R$, we denote $x_R(t)$ the process which is stopped when it exits $\{G(x) < R\}$, i.e., $x_R(t) = x(t)$ for $t < \sigma_R$ and $x_R(t) = x(\sigma_R)$ for $t \ge \sigma_R$. We set $\sigma_R(t) = \min\{\sigma_R, t\}$, and applying Ito's formula with stopping time to the function $W(t,x)$ we obtain

$$E_x\big[\exp(\theta G(x(\sigma_R(t))))\exp(-\gamma\theta\mathrm{Tr}(T)\sigma_R(t))\big] - \exp(\theta G(x)) \le 0,$$

thus

$$E_x\big[\exp(\theta G(x(\sigma_R(t))))\big] \le \exp(\gamma\theta\mathrm{Tr}(T)t)\exp(\theta G(x)). \tag{29}$$
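For the reader's convenience, the computation behind Eq. (28) can be spelled out. The following derivation is a sketch that uses only the generator (13) and $G = r^2/2 + p^2/2 + V(q)$; it is an added worked step and not a quotation of the authors' computation.

```latex
% Each first-order derivative of e^{\theta G} produces a factor \theta \nabla G:
\nabla_r e^{\theta G} = \theta\, r\, e^{\theta G}, \qquad
\nabla_p e^{\theta G} = \theta\, p\, e^{\theta G}, \qquad
\nabla_q e^{\theta G} = \theta\, \nabla_q V\, e^{\theta G}.
% Second-order part of the generator:
\nabla_r T \nabla_r e^{\theta G}
  = \theta \left( \mathrm{Tr}(T) + \theta\, r T r \right) e^{\theta G}.
% The Hamiltonian and coupling terms cancel:
\left( \Lambda p \nabla_r - r \Lambda \nabla_p + p \nabla_q - (\nabla_q V) \nabla_p \right) e^{\theta G}
  = \theta \left( \Lambda p \cdot r - r \cdot \Lambda p + p \cdot \nabla_q V - \nabla_q V \cdot p \right) e^{\theta G} = 0.
% Collecting terms:
L e^{\theta G}
  = \gamma \theta \left( \mathrm{Tr}(T) + \theta\, r T r - r^2 \right) e^{\theta G}
  = \gamma \theta \left( \mathrm{Tr}(T) - r (1 - \theta T) r \right) e^{\theta G}.
```

The last factor is bounded by $\mathrm{Tr}(T)$ exactly when $1 - \theta T \ge 0$, which is the condition $\theta \le (\max\{T_1,T_n\})^{-1}$ of the lemma.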

Since Ex exp (θ G(x(σR (t)))) ≥ Ex exp (θ G(x(σR (t))))1σR

It follows that $G(x_R(t)) \to G(x(t))$ almost surely as $R \to \infty$, so by the Fatou lemma we obtain from Eq. (29) the bound Eq. (26). The bound Eq. (27) is obtained by noting that the left side is equal to

$$P_x\{\sigma_{E(1+\delta)} < t\} \le \exp(\gamma\theta\mathrm{Tr}(T)t)\exp(-\delta\theta E),$$

and this concludes the proof of Lemma 3.5.

We have the following "tracking" estimates to the effect that the random path closely follows the deterministic one at least up to time $t_E$ for a set of paths which have nearly full measure. We set $\Delta x(t) \equiv x(t,\omega) - x_{\rm det}(t) = (\Delta r(t), \Delta p(t), \Delta q(t))$ with both $x(t)$ and $x_{\rm det}(t)$ having initial condition $x$. Let

$$S(x,E,t) = \Big\{x(\cdot)\,;\ G(x) = E \ \text{ and } \ \sup_{0\le s\le t} G(x(s)) < 2E\Big\}.$$

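By Lemma 3.5, $P\{S(x,E,t)\} \ge 1 - \exp(\gamma\theta\mathrm{Tr}(T)t - \theta E)$.

Proposition 3.7. There exist constants $E_0 < \infty$ and $c > 0$ such that for paths $x(t,\omega) \in S(x,E,t_E)$ with $t_E = E^{1/k_2 - 1/2}\tau$ and $E > E_0$ we have

$$\sup_{0\le t\le t_E}\begin{pmatrix} \|\Delta q(t)\| \\ \|\Delta p(t)\| \\ \|\Delta r(t)\| \end{pmatrix} \le c\,\sup_{0\le t\le t_E}\big\|\sqrt{2\gamma T}\,\omega(t)\big\|\begin{pmatrix} E^{\frac2{k_2}-1} \\ E^{\frac1{k_2}-\frac12} \\ 1 \end{pmatrix}. \tag{30}$$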

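Proof. We write differential equations for $\Delta x(t)$, again assuming both the random and deterministic paths start at the same point $x$ with energy $G(x) = E$. These equations can be written in the somewhat symbolic form:

$$\begin{aligned}
d\Delta q &= \Delta p\,dt,\\
d\Delta p &= \left(O(E^{1-2/k_2})\Delta q - \Lambda^*\Delta r\right)dt,\\
d\Delta r &= (-\gamma\Delta r + \Lambda\Delta p)\,dt + \sqrt{2\gamma T}\,d\omega.
\end{aligned} \tag{31}$$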

The $O(E^{1-2/k_2})$ coefficient refers to the difference between forces, $-\nabla_q V(\cdot)$ evaluated at $x(t)$ and $x_{\rm det}(t)$; we have that $G(x(t)) \le 2E$, so that $\nabla_q V(q) - \nabla_q V(q_{\rm det}) = O(\partial^2 V)\Delta q = O(E^{1-2/k_2})\Delta q$. For later purposes we pick a constant $c'$ so large that

$$\rho = \rho(x) = c'E^{1-\frac2{k_2}} \ge \sup_{\{q\,:\,V(q)\le 2E\}}\ \sup_i\ \sum_j\left|\frac{\partial^2 V(q)}{\partial q_i\partial q_j}\right|$$

for all sufficiently large $E$. In order to estimate the solutions of Eqs. (31), we consider the $3\times3$ matrix which bounds the coefficients in this system, and which is given by

$$M = \begin{pmatrix} 0 & 1 & 0 \\ \rho & 0 & \lambda \\ 0 & \lambda & \gamma \end{pmatrix}. \tag{32}$$


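We have the following estimate on powers of $M$: For $\Delta X^{(0)} = (0,0,1)^T$, we set $\Delta X^{(m)} \equiv M^m\Delta X^{(0)}$. For $\alpha = \max(1, \gamma+\lambda)$, we obtain $\Delta X^{(1)} \le \alpha(0,1,1)^T$, $\Delta X^{(2)} \le \alpha^2(1,1,1)^T$, and, for $m \ge 3$,

$$\Delta X^{(m)} \equiv \begin{pmatrix} u^{(m)} \\ v^{(m)} \\ w^{(m)} \end{pmatrix} \le \alpha^m 2^{m-2}\begin{pmatrix} \rho^{\frac{m-2}{2}} \\ \rho^{\frac{m-1}{2}} \\ \rho^{\frac{m-2}{2}} \end{pmatrix},$$

where the inequalities are componentwise. From this we obtain the bound

$$e^{tM}\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \le \begin{pmatrix} \frac12(\alpha t)^2 e^{\sqrt\rho\,2\alpha t} \\ \alpha t\, e^{\sqrt\rho\,2\alpha t} \\ 1 + \alpha t + \frac12(\alpha t)^2 e^{\sqrt\rho\,2\alpha t} \end{pmatrix}. \tag{33}$$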
If 0 ≤ t ≤ tE we have bounded, and

√

ρt <

√

c . Then the exponentials in the above equation are

    0 1/ρ √ etM  0  ≤ c  1/ ρ  , 1 1

(34)

for some constant c. Returning now to the original differential equation system Eq. (31), we write this equation in the usual integral equation form:     t ,p(s) ,q(t)  −∇q V (q(s, ω))ds + ∇q V (qdet (s)) − :∗ ,r(s)   ,p(t)  = 0 ,r(t) −γ ,r(s) + :,p(s)   0 . + √ 0 (35) 2γ T ω(t) From this we obtain the bound       t ,q(t) 0 ,q(t)  ,p(t)  ≤ M  ,p(t)  ds +  0  , 0 ,r(t) ωmax ,r(t) √ where M is the matrix given by Eq. (32), and ωmax = supt≤tE 2γ T ω(t). Note that the solution of the integral equation   t 0 ,X(t) = (36) dsM,X(s) +  0  , 0 ωmax is ,X(t) = exp (tM)(0, 0, ωmax )T . We can solve both Eq. (35) and Eq. (36) by iteration. Let ,xm (s), ,Xm (s) denote the respective mth iterates (with ,x0 (s) =

Exponential Convergence to Non-Equilibrium Stationary States

321

√ (0, 0, 2γ T ω(s))T , and ,X0 (s) = (0, 0, ωmax )T , 0 ≤ s ≤ tE ). The ,Xm ’s are monotone increasing in m. Then it is easy to see that   ,qm (t)  ,pm (t)  ≤ ,Xm (t) ≤ ,X(t), ,rm (t) for each iterate. By Eqs. (33), (34), and the definition of ρ the conclusion Eq. (30) follows. ! As a consequence of Theorem 3.3 and Proposition 3.7 we obtain α Corollary 3.8. √ Let L(E) = E with α < 1/k2 and assume that w(t) is such that sup0≤t≤tE 2γ T ω(t) ≤ L(E) and x(·, ω) ∈ S(x, E, tE ). Then there are constants c > 0 and E0 < ∞ such that all paths x(t, w) with initial condition x with G(x) = E > E0 satisfy the bound tE 3 −1 r 2 (s)ds ≥ cE k2 2 . (37) 0

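Remark 3.9. For large energy $E$, paths not satisfying the hypotheses of the corollary have measure bounded by

$$\begin{aligned}
P_x\Big\{\sup_{0\le s\le t_E}\big\|\sqrt{2\gamma T}\,\omega\big\| > L(E)\Big\} + P\{S(x,E,t_E)^C\}
&\le \frac a2\exp\left(-\frac{L(E)^2}{b\gamma T_{\max}t_E}\right) + \exp\left(\theta(\gamma\mathrm{Tr}(T)t_E - E)\right)\\
&\le a\exp\left(-\frac{L(E)^2}{b\gamma T_{\max}t_E}\right),
\end{aligned} \tag{38}$$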

where $a$ and $b$ are constants which depend only on the dimension of $\omega$. Here we have used the reflection principle to estimate the first probability and Eq. (27) and the definition of $S$ to estimate the second probability. For $E$ large enough, the second term is small relative to the first.

Proof. It is convenient to introduce the $L^2$-norm on functions on $[0,t]$, $\|f\|_t \equiv \left(\int_0^t ds\, f^2(s)\right)^{1/2}$. By Theorem 3.3, there are constants $E_1$ and $c_1$ such that for $E > E_1$ the deterministic paths $x_{\rm det}(s)$ satisfy the bound

$$\|r_{\rm det}\|^2_{t_E} = \int_0^{t_E} r_{\rm det}^2(s)\,ds \ge c_1E^{\frac3{k_2}-\frac12}.$$

By Proposition 3.7, there are constants $E_2$ and $c_2$ such that $\|\Delta r(s)\| \le c_2L(E)$, uniformly in $s$, $0 \le s \le t_E$, and uniformly in $x$ with $G(x) > E_2$. So we have

$$\|r\|_{t_E} \ge \|r_{\rm det}\|_{t_E} - \|\Delta r\|_{t_E} \ge \left(c_1E^{\frac3{k_2}-\frac12}\right)^{1/2} - c_2L(E)\left(E^{\frac1{k_2}-\frac12}\right)^{1/2}.$$

But the last term is $O(E^{\alpha - 1/4 + 1/2k_2})$, which is of lower order than the first since $\alpha < 1/k_2$, so the corollary follows, for an appropriate constant $c$ and $E$ sufficiently large. □


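3.3. Liapunov function and exponential hitting times. With these estimates we now prove our main technical result.

Theorem 3.10. Let $s > 0$ and $\theta < \theta_0 \equiv (\max\{T_1,T_n\})^{-1}$. Then there are a compact set $U = U(s,\theta)$ and constants $\kappa = \kappa(U,s,\theta) < 1$ and $L = L(U,s,\theta) < \infty$ such that

$$T^s\exp(\theta G)(x) \le \kappa\exp(\theta G)(x) + L\,\mathbf{1}_U(x), \tag{39}$$

where $\mathbf{1}_U$ is the indicator function of the set $U$. The constant $\kappa$ can be made arbitrarily small by choosing $U$ large enough.

Proof. For any compact set $U$ and for any $t$, $T^s\exp(\theta G)(x)$ is a bounded function, uniformly on $[0,t]$. So, in order to prove Eq. (39), we only have to prove that there exist a compact set $U$ and $\kappa < 1$ such that

$$\sup_{x\in U^C} E_x\left[\exp(\theta(G(x(s)) - G(x)))\right] \le \kappa < 1.$$

Using Ito's Formula to compute $G(x(s)) - G(x)$ in terms of a stochastic integral we obtain

$$E_x\left[\exp(\theta(G(x(s)) - G(x)))\right] = \exp(\theta\gamma\mathrm{Tr}(T)s)\,E_x\left[\exp\left(-\theta\int_0^s\gamma r^2\,dt + \theta\int_0^s\sqrt{2\gamma T}\,r\,d\omega(t)\right)\right]. \tag{40}$$

For any $\theta < \theta_0$, we choose $p > 1$ such that $\theta p < \theta_0$. Using Hölder's inequality we obtain

$$\begin{aligned}
&E_x\left[\exp\left(-\theta\int_0^s\gamma r^2\,dt + \theta\int_0^s\sqrt{2\gamma T}\,r\,d\omega(t)\right)\right]\\
&\quad= E_x\Bigg[\exp\left(-\theta\int_0^s\gamma r^2\,dt + \frac{p\theta^2}{2}\int_0^s\big(\sqrt{2\gamma T}\,r\big)^2dt\right)\\
&\qquad\qquad\times\exp\left(-\frac{p\theta^2}{2}\int_0^s\big(\sqrt{2\gamma T}\,r\big)^2dt + \theta\int_0^s\sqrt{2\gamma T}\,r\,d\omega(t)\right)\Bigg]\\
&\quad\le E_x\left[\exp\left(-q\theta\int_0^s\gamma r^2\,dt + \frac{qp\theta^2}{2}\int_0^s\big(\sqrt{2\gamma T}\,r\big)^2dt\right)\right]^{1/q}\\
&\qquad\qquad\times E_x\left[\exp\left(-\frac{p^2\theta^2}{2}\int_0^s\big(\sqrt{2\gamma T}\,r\big)^2dt + \theta p\int_0^s\sqrt{2\gamma T}\,r\,d\omega(t)\right)\right]^{1/p}\\
&\quad= E_x\left[\exp\left(-q\theta\int_0^s dt\,\gamma r^2 + \frac{qp\theta^2}{2}\int_0^s dt\,\big(\sqrt{2\gamma T}\,r\big)^2\right)\right]^{1/q}.
\end{aligned}$$

Here, in the next to last line, we have used the fact that the second factor is the expectation of a martingale (the integrand is non-anticipating) with expectation 1. Finally we obtain the bound

$$E_x\left[\exp(\theta(G(x(s)) - G(x)))\right] \le \exp(\theta\gamma\mathrm{Tr}(T)s)\,E_x\left[\exp\left(-q\theta(1 - p\theta T_{\max})\int_0^s dt\,\gamma r^2\right)\right]^{1/q}. \tag{41}$$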
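The passage from the Hölder estimate to Eq. (41) uses only that $T$ is diagonal with entries $T_1$, $T_n$, both bounded by $T_{\max} = \max\{T_1,T_n\}$. As a sketch of this added intermediate step (with $r = (r_1, r_n)$):

```latex
\frac{q p \theta^2}{2} \int_0^s \left( \sqrt{2 \gamma T}\, r \right)^2 dt
  = q p \theta^2 \gamma \int_0^s \left( T_1 r_1^2 + T_n r_n^2 \right) dt
  \le q p \theta^2 \gamma\, T_{\max} \int_0^s r^2 \, dt ,
% and therefore
- q \theta \int_0^s \gamma r^2 \, dt
  + \frac{q p \theta^2}{2} \int_0^s \left( \sqrt{2 \gamma T}\, r \right)^2 dt
  \le - q \theta \left( 1 - p \theta T_{\max} \right) \int_0^s \gamma r^2 \, dt .
```

Since $p\theta < \theta_0 = T_{\max}^{-1}$, the factor $1 - p\theta T_{\max}$ is strictly positive, so the exponent in Eq. (41) is non-positive.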


In order to proceed we need to distinguish two cases according to whether $3/k_2 - 1/2 > 0$ or $3/k_2 - 1/2 \le 0$ (see Corollary 3.8). In the first case we let $E_0$ be defined by $s = E_0^{1/k_2 - 1/2}\tau$. For $E > E_0$ we break the expectation Eq. (41) into two parts according to whether the paths satisfy the hypotheses of Corollary 3.8 or not. For the first part we use Corollary 3.8 and that $\int_0^s r^2(s)\,ds \ge \int_0^{t_E} r^2(s)\,ds \ge cE^{3/k_2 - 1/2}$; for the second part we use estimate (38) in Remark 3.9 on the probability of unlikely paths together with the fact that the exponential under the expectation in Eq. (41) is bounded by 1. We obtain for all $x$ with $G(x) = E > E_0$ the bound

$$E_x\left[\exp(\theta(G(x(s)) - G(x)))\right] \le \exp\left(\theta\gamma\mathrm{Tr}(T)t_{E_0}\right)\times\left[\exp\left(-q\theta(1 - p\theta T_{\max})\,cE^{\frac3{k_2}-\frac12}\right) + a\exp\left(-\frac{L(E)^2\theta_0}{b\gamma t_E}\right)\right]^{1/q}. \tag{42}$$

Choosing the set $U = \{x;\ G(x) \le E_1\}$ with $E_1$ large enough we can make the term in Eq. (42) as small as we want.

If $3/k_2 - 1/2 \le 0$, for a given $s$ and a given $x$ with $G(x) = E$ we split the time interval $[0,s]$ into $E^{1/2 - 1/k_2}$ pieces $[t_j, t_{j+1}]$, each one of size of order $E^{1/k_2 - 1/2}s$. For the "good" paths, i.e., for the paths $x(t)$ which satisfy the hypotheses of Corollary 3.8 on each time interval $[t_j, t_{j+1}]$, the tracking estimates of Proposition 3.7 imply that $G(x(t)) = O(E)$ for $t$ in each interval. Applying Corollary 3.8 and using that $G(x(t_j)) = O(E)$ we conclude that $\int_0^s r^2(s)\,ds$ is at least of order $E^{3/k_2 - 1/2}\times E^{1/2 - 1/k_2} = E^{2/k_2}$. The probability of the remaining paths can be estimated, using Eq. (38), not to exceed

$$1 - \left(1 - a\exp\left(-\frac{L_{\max}^2\theta_0}{b\gamma t_E}\right)\right)^{E^{\frac12-\frac1{k_2}}}.$$

The remainder of the argument is essentially as above, Eq. (42), and this concludes the proof of Theorem 3.10. □

The existence of the Liapunov function given by Eq. (39) can be interpreted in terms of hitting times. Let $\tau_U$ be the time for the diffusion $x(t)$ to hit the set $U$.

Theorem 3.11. Assume that $\theta < (\max\{T_1,T_n\})^{-1}$. For any (arbitrarily large) $a > 0$ there exists a constant $E_0 = E_0(a) > 0$ such that for $U = \{x;\ G(x) \le E_0\}$ and $x \in U^C$ we have

$$E_x\left[e^{a\tau_U}\right] < e^a + (e^a - 1)\exp(\theta(G(x) - E_0)). \tag{43}$$

Proof. Let $s = 1$ and $\theta < \theta_0$ be given; we set $\kappa = \exp(-a)/2$ and take $U$ to be the set given by Theorem 3.10. Let $X_n$ be the Markov chain defined by $X_n = x(n)$ and $N_U$ be the least integer such that $X_{N_U} \in U$. Then

$$E_x[e^{a\tau_U}] \le E_x[e^{aN_U}], \tag{44}$$

so that to estimate the exponential hitting time, it suffices to estimate the exponential "step number".


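Using Chernoff's inequality we obtain

$$\begin{aligned}
P_x\{N_U > n\} &= P_x\Big\{-\sum_{j=1}^n\big(G(X_j) - G(X_{j-1})\big) < G(x) - E_0,\ X_j \in U^c\Big\}\\
&\le e^{\theta(G(x)-E_0)}\,E_x\Big[\prod_{j=1}^n e^{\theta(G(X_j)-G(X_{j-1}))},\ X_j \in U^c\Big]\\
&\le e^{\theta(G(x)-E_0)}\,E_x\Big[\prod_{j=1}^{n-1} e^{\theta(G(X_j)-G(X_{j-1}))}\times E_{X_{n-1}}\big[e^{\theta(G(X_n)-G(X_{n-1}))}\big],\ X_j \in U^c\Big]\\
&\le e^{\theta(G(x)-E_0)}\,\sup_{y\in U^c}E_y\big[e^{\theta(G(X_1)-G(y))}\big]\times E_x\Big[\prod_{j=1}^{n-1} e^{\theta(G(X_j)-G(X_{j-1}))},\ X_j \in U^c\Big]\\
&\le \cdots \le e^{\theta(G(x)-E_0)}\Big(\sup_{y\in U^c}E_y\big[e^{\theta(G(X_1)-G(y))}\big]\Big)^n.
\end{aligned}$$

By Theorem 3.10 we have

$$\sup_{x\in U^c}E_x\big[e^{\theta(G(X_1)-G(x))}\big] < \kappa,$$

and therefore we have geometric decay of $P_{>n} \equiv P_x\{N_U > n\}$ in $n$,

$$P_{>n} \le \kappa^n\exp(\theta(G(x) - E_0)).$$

Summing by parts we obtain

$$E_x\big[e^{aN_U}\big] = \sum_{n=1}^\infty e^{an}P_x\{N_U = n\} = \lim_{M\to\infty}\Big(\sum_{n=1}^M P_{>n}\big(e^{a(n+1)} - e^{an}\big) + e^aP_{>0} - e^{a(M+1)}P_{>M}\Big),$$

which, together with Eq. (44), gives Eq. (43). □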
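The arithmetic of the last step can be verified directly. With the choice $\kappa = e^{-a}/2$ and the worst case $P_{>n} = \kappa^n c$, where $c$ stands for $\exp(\theta(G(x) - E_0))$, each term of the sum equals $c\,(e^a - 1)\,2^{-n}$, so the series sums to $(e^a - 1)c$, and $e^aP_{>0} \le e^a$ supplies the remaining term of Eq. (43). The following sketch evaluates the partial sums; the function name and the sample values of $a$ and $c$ are illustrative assumptions.

```python
import math


def hitting_time_bound(a, c, M=60):
    """Partial sum of the summation-by-parts bound with P_{>n} replaced by c * kappa^n."""
    kappa = math.exp(-a) / 2          # the choice of kappa made in the proof
    total = math.exp(a)               # e^a * P_{>0}, using P_{>0} <= 1
    for n in range(1, M + 1):
        total += c * kappa ** n * (math.exp(a * (n + 1)) - math.exp(a * n))
    return total
```

For moderate `a` the partial sums converge quickly to `math.exp(a) + (math.exp(a) - 1) * c`, the right-hand side of Eq. (43).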

4. Accessibility and Strong Feller Property

In this section we prove that the Markov process is strong Feller and moreover we show that it is strongly aperiodic in the sense that for all $t > 0$, all $x \in X$ and all open sets $A \subset X$ we have $P_t(x,A) > 0$. Both results imply immediately that $x(t)$ has at most one invariant measure: Since the process is strong Feller the invariant measure (if it exists) has a smooth density which is everywhere positive by the property of aperiodicity. Obviously no two different such measures can exist.


The strong Feller property is an immediate consequence of the hypoelliptic properties of the generator L of the diffusion. The result is an easy consequence of the estimates in [7, 5], since there much stronger global hypoelliptic estimates are proven (though under stronger conditions on the potential $U^{(2)}$). We present here the argument for completeness. The generator of the Markov process x(t) can be written in the form

$$L = \sum_{i=1}^{2d} X_i^2 + X_0.$$

If the Lie algebra generated by the set of commutators

$$\{X_i\}_{i=1}^{2d},\quad \{[X_i, X_j]\}_{i,j=0}^{2d},\quad \{[[X_i, X_j], X_k]\}_{i,j,k=0}^{2d},\quad \cdots \tag{45}$$

has rank dim(X) at every point x ∈ X, then the Markov process has a $C^\infty$ law. In particular it is strong Feller. This is a consequence of the Hörmander Theorem [11, 16] or it can be proved directly using the Malliavin Calculus developed by Malliavin, Bismut, Stroock and others (see e.g. [19]).

Proposition 4.1. If H2 holds then the generator L given by Eq. (13) satisfies the rank condition (45).

Proof. This is a straightforward computation. The vector fields $X_i$, i = 1, ···, 2d give $\partial_{r_i^{(j)}}$, i = 1, n, j = 1, ···, d. The commutators

$$\big[\partial_{r_1^{(j)}}, X_0\big] = \gamma\,\partial_{r_1^{(j)}} - \lambda\,\partial_{p_1^{(j)}}, \qquad \big[\big[\partial_{r_1^{(j)}}, X_0\big], X_0\big] = \gamma^2\,\partial_{r_1^{(j)}} - \gamma\lambda\,\partial_{p_1^{(j)}} - \lambda\,\partial_{q_1^{(j)}}$$

yield the vector fields $\partial_{p_1^{(j)}}$ and $\partial_{q_1^{(j)}}$. Further

$$\big[\partial_{q_1^{(j)}}, X_0\big] = \sum_{l=1}^{d} \frac{\partial^2 V}{\partial q_1^{(j)}\,\partial q_1^{(l)}}(q)\,\partial_{p_1^{(l)}} + \sum_{l=1}^{d} \frac{\partial^2 U^{(2)}}{\partial q_1^{(j)}\,\partial q_2^{(l)}}(q_1 - q_2)\,\partial_{p_2^{(l)}}.$$

If $U^{(2)}$ is strictly convex, this yields $\partial_{p_2^{(j)}}$, while in the general case we need to consider further the commutators

$$\Big[\partial_{q_1^{(j_1)}}, \Big[\cdots, \Big[\partial_{q_1^{(j_{m-1})}}, \sum_{l=1}^{d} \frac{\partial^2 U^{(2)}}{\partial q_1^{(j_m)}\,\partial q_2^{(l)}}(q_1 - q_2)\,\partial_{p_2^{(l)}}\Big]\Big]\Big] = \sum_{l=1}^{d} \frac{\partial^{m+1} U^{(2)}}{\partial q_1^{(j_1)} \cdots \partial q_1^{(j_m)}\,\partial q_2^{(l)}}(q_1 - q_2)\,\partial_{p_2^{(l)}}.$$

Condition H3 means that we can write $\partial_{p_2^{(j)}}$ as a linear combination of these commutators for every x ∈ X. The other basis elements of the tangent space are obtained inductively following the same procedure. □


We now prove the strong aperiodicity of the process x(t). This is based on the support theorem of Stroock and Varadhan [28]. The support of the diffusion process x(t) with initial condition x on the time interval [0, t] is, by definition, the smallest closed subset $S_{x,t}$ of C([0, t]) such that $P_x[x(t, \omega) \in S_{x,t}] = 1$. The support can be studied using the associated control system, i.e., the ordinary differential equation where the white noise $\dot\omega(t)$ is replaced by a control $u(t) \in L^1([0, T])$. For our problem we have the control system

$$\dot q = p, \qquad \dot p = -\nabla_q V + \Lambda^* r, \qquad \dot r = (-\gamma r + \Lambda p) + u, \tag{46}$$

and we denote by $x_u(t)$ the solution of this control system with initial condition x and control u. The support theorem asserts that the support of the diffusion $S_{x,t}$ is the closure of the set $\{x_u ;\ u \in L^1([0, t])\}$. As a consequence $\operatorname{supp} P_t(x, \cdot)$, the support of the transition probabilities, is equal to the closure of the set of accessible points $\{y ;\ \exists u \in L^1([0, t]) \text{ s.t. } x_u(t) = y\}$.

Proposition 4.2. If Condition H2 holds then for all t > 0, all x ∈ X,

$$\operatorname{supp} P_t(x, \cdot) = X. \tag{47}$$

Proof. This result is proved in [7] under the additional condition that the interaction potential $U^{(2)}$ is strictly convex, in particular $\nabla U^{(2)}$ is a diffeomorphism. Our Condition H2 implies that $\nabla U^{(2)}$ is surjective. We can choose an inverse $g : \mathbb{R}^d \to \mathbb{R}^d$ which is locally bounded. From this point the proof proceeds exactly as in Theorem 3.2 of [7] and we will not repeat it here. □

5. Proof of Theorem 2.1 The proof of Theorem 2.1 is a consequence of the theory linking the ergodic properties of the Markov process with existence of Liapunov functions, a theory which has been developed over the past twenty years. The proof of these ergodic properties relies on the intuition that the compact set U together with a Liapunov function plays much the same role as an atom in, say, a countable state space Markov chain. The technical device to implement this idea was invented in [1, 20], and is called splitting. It consists in constructing a new Markov chain with state space $X_0 \cup X_1$, where the $X_i$ are two copies of the original state space X. The new chain possesses an atom and has a projection which is the original chain. The ergodic properties of a chain with an atom are then analyzed by means of renewal theory and a coupling argument is applied to the return times to the atom. A complete account of this theory for a discrete time Markov process is developed in the book of Meyn and Tweedie [18], from which the result needed here is taken (Chapter 15). For a given s > 0 consider the discrete time Markov chain $X_j = x(js)$ with transition probabilities $P(x, dy) \equiv P_s(x, dy)$ and semigroup $P^j \equiv T^{js}$. By the results of Sect. 4, the Markov chain is strongly aperiodic, i.e., P(x, A) > 0 for any open set A and for any x, and it is strong Feller. The exponential bound on the hitting time given in Theorem 3.11 implies in particular that $E_x[\tau_U]$ is finite for all x ∈ X and thus we have an invariant measure µ (for hypoelliptic diffusions this is established in [14]).
By aperiodicity and the strong Feller property, this invariant measure is unique.


The following theorem is proved in [18]: Theorem 5.1. If the Markov chain {Xj } is strong Feller and strongly aperiodic and if there are a function W > 1, a compact set U , and constants κ < 1 and L < ∞ such that P W (x) ≤ κW (x) + L1U (x),

(48)

then there exist constants r > 1 and R < ∞ such that, for any x, r n P (x, ·) − µW ≤ RW (x), n

where the weighted variation norm $\|\cdot\|_W$ is defined in Eq. (14). By Theorem 3.10 the assumptions of Theorem 5.1 are satisfied with W = exp(θG) and $\theta < (\max\{T_1, T_n\})^{-1}$. For the semigroup $T^t$ we note that we have the a priori estimate $T^t \exp(\theta G)(x) \le \exp(\gamma\theta\,\mathrm{Tr}(T)\,t)\,\exp(\theta G)(x)$, cf. Lemma 3.5, which shows that $T^t$ is a bounded operator on $L^\infty_\theta(X)$ defined in Eq. (15). Setting t = ns + u with 0 ≤ u < s, and using the invariance of µ, one obtains

$$\|T^t - \mu\|_\theta \le \|T^{ns} - \mu\|_\theta\, \|T^{u}\|_\theta \le \tilde R\, \tilde r^{-t}, \tag{49}$$

for some r˜ > 1 and R˜ < ∞ or equivalently ∞ r˜ t Pt (x, ·) − µexp (θG) ≤ R˜ exp (θ G(x)). 0

As a consequence, for any s > 0, $T^s$ has 1 as a simple eigenvalue and the rest of the spectrum is contained in a disk of radius ρ < 1. The exponential decay of correlations in the stationary states follows from this.

Corollary 5.2. There exist constants R < ∞ and r > 1 such that for all f, g with $f^2, g^2 \in L^\infty_\theta(X)$, we have

$$\Big| \int f\, T^t g\, d\mu - \int f\, d\mu \int g\, d\mu \Big| \le R\, \|f^2\|_\theta^{1/2}\, \|g^2\|_\theta^{1/2}\, r^{-t}.$$

Proof. If $f^2 \in L^\infty_\theta$, we have $|f(x)| \le \|f^2\|_\theta^{1/2} \exp(\theta G(x)/2)$ and similarly for g. Further, if Eq. (49) holds with W = exp(θG) it also holds for exp(θG/2), and thus for some $R_1 < \infty$ and $r_1 > 1$ we have

$$\Big| T^t g(x) - \int g\, d\mu \Big| \le R_1\, r_1^{-t}\, \|g^2\|_\theta^{1/2} \exp\Big(\frac{\theta G(x)}{2}\Big).$$

Therefore we obtain

$$\Big| \int f\, T^t g\, d\mu - \int f\, d\mu \int g\, d\mu \Big| \le \int |f(x)|\, \Big| T^t g(x) - \int g\, d\mu \Big|\, d\mu \le \int \exp(\theta G)\, d\mu\; R_1\, r_1^{-t}\, \|f^2\|_\theta^{1/2}\, \|g^2\|_\theta^{1/2}.$$


To conclude we need to show that $\int \exp(\theta G)\, d\mu < \infty$. This follows from Eq. (48), which we rewrite as

$$(1-\kappa) \exp(\theta G(x)) \le \exp(\theta G(x)) - P \exp(\theta G(x)) + L\, 1_U(x).$$

From this we obtain

$$(1-\kappa)\, \frac{1}{N} \sum_{k=1}^{N} \exp(\theta G(X_k)) \le \frac{1}{N} \exp(\theta G(x)) + L\, \frac{1}{N} \sum_{k=1}^{N} 1_U(X_k). \tag{50}$$

By the Law of Large Numbers the r.h.s. of Eq. (50) converges to Lµ(U), which is finite, and thus $\int \exp(\theta G)\, d\mu$ is finite, too. □

This concludes the proof of Theorem 2.1.

Note added in proof. Stronger spectral properties as well as a fluctuation theorem for the entropy production are proved in [22].

Acknowledgement. We would like to thank Pierre Collet, Jean-Pierre Eckmann, Servet Martinez and Claude-Alain Pillet for their comments and suggestions as well as Martin Hairer for useful comments on the controllability issues discussed in Sect. 4. L. E. Thomas is supported in part by NSF Grant 980139.

References

1. Athreya, K.B., Ney, P.: A new approach to the limit theory of recurrent Markov chains. Trans. Am. Math. Soc. 245, 493–501 (1978)
2. Bambusi, D.: Exponential stability of breathers in Hamiltonian networks of weakly coupled oscillators. Nonlinearity 9, 433–457 (1996)
3. Bach, V., Fröhlich, J., Sigal, I.M.: Quantum electrodynamics of confined nonrelativistic particles. Adv. Math. 137, 299–395 (1998)
4. Dym, H., McKean, H.P.: Gaussian processes, function theory, and the inverse spectral problem. Probability and Mathematical Statistics, Vol. 31. New York–London: Academic Press, 1976
5. Eckmann, J.-P., Hairer, M.: Non-equilibrium statistical mechanics of strongly anharmonic chains of oscillators. Commun. Math. Phys. 212, 105–164 (2000)
6. Eckmann, J.-P., Pillet, C.-A., Rey-Bellet, L.: Non-equilibrium statistical mechanics of anharmonic chains coupled to two heat baths at different temperatures. Commun. Math. Phys. 201, 657–697 (1999)
7. Eckmann, J.-P., Pillet, C.-A., Rey-Bellet, L.: Entropy production in non-linear, thermally driven Hamiltonian systems. J. Stat. Phys. 95, 305–331 (1999)
8. Ford, G.W., Kac, M., Mazur, P.: Statistical mechanics of assemblies of coupled oscillators. J. Math. Phys. 6, 504–515 (1965)
9. Gallavotti, G., Cohen, E.G.D.: Dynamical ensembles in stationary states. J. Stat. Phys. 80, 931–970 (1995)
10. Has’minskii, R.Z.: Stochastic stability of differential equations. Alphen aan den Rijn–Germantown: Sijthoff and Noordhoff, 1980
11. Hörmander, L.: The Analysis of linear partial differential operators. Vol. III. Berlin: Springer, 1985
12. Jakšić, V., Pillet, C.-A.: Ergodic properties of classical dissipative systems. I. Acta Math. 181, 245–282 (1998)
13. Jakšić, V., Pillet, C.-A.: On a model for quantum friction. III. Ergodic properties of the spin-boson system. Commun. Math. Phys. 178, 627–651 (1996)
14. Kliemann, W.: Recurrence and invariant measures for degenerate diffusions. Ann. of Prob. 15, 690–702 (1987)
15. Komech, A., Spohn, H., Kunze, M.: Long-time asymptotics for a classical particle interacting with a scalar wave field. Comm. Partial Differ. Eq. 22, 307–335 (1997)
16. Kunita, H.: Supports of diffusion processes and controllability problems. In: Proc. Intern. Symp. SDE Kyoto 1976. New York: Wiley, 1978, pp. 163–185
17. MacKay, R.S., Aubry, S.: Proof of existence of breathers for time-reversible or Hamiltonian networks of weakly coupled oscillators. Nonlinearity 7, 1623–1643 (1994)


18. Meyn, S.P., Tweedie, R.L.: Markov Chains and Stochastic Stability. Communication and Control Engineering Series, London: Springer-Verlag London, 1993
19. Norris, J.: Simplified Malliavin Calculus. In: Séminaire de probabilités XX. Lecture Notes in Math. 1204, Berlin: Springer, 1986, pp. 101–130
20. Nummelin, E.: A splitting technique for stationary Markov Chains. Z. Wahrscheinlichkeitstheorie Verw. Geb. 43, 309–318 (1978)
21. Rey-Bellet, L., Thomas, L.E.: Asymptotic behavior of thermal non-equilibrium steady states for a driven chain of anharmonic oscillators. Commun. Math. Phys. 215, 1–24 (2000)
22. Rey-Bellet, L., Thomas, L.E.: Fluctuations of the entropy production in anharmonic chains. Preprint 2001
23. Rieder, Z., Lebowitz, J.L., Lieb, E.: Properties of a harmonic crystal in a stationary non-equilibrium state. J. Math. Phys. 8, 1073–1085 (1967)
24. Ruelle, D.: Smooth dynamics and new theoretical ideas in non-equilibrium statistical mechanics. J. Stat. Phys. 95, 393–468 (1999)
25. Ruelle, D.: Natural non-equilibrium states in quantum statistical mechanics. J. Stat. Phys. 98, 57–75 (2000)
26. Spohn, H., Lebowitz, J.L.: Stationary non-equilibrium states of infinite harmonic systems. Commun. Math. Phys. 54, 97–120 (1977)
27. Sievers, A.J., Takeno, S.: Intrinsic localized modes in anharmonic crystals. Phys. Rev. Lett. 61, 970–973 (1988)
28. Stroock, D.W., Varadhan, S.R.S.: On the support of diffusion processes with applications to the strong maximum principle. In: Proc. 6th Berkeley Symp. Math. Stat. Prob., Vol. III. Berkeley: Univ. California Press, 1972, pp. 361–368

Communicated by H. Spohn

Commun. Math. Phys. 225, 331 – 359 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

A Quantum Weak Energy Inequality for Dirac Fields in Curved Spacetime

Christopher J. Fewster1, Rainer Verch2

1 Department of Mathematics, University of York, Heslington, York YO10 5DD, UK. E-mail: [email protected]

2 Institut für Theoretische Physik, Universität Göttingen, Bunsenstr. 9, 37073 Göttingen, Germany. E-mail: [email protected]

Received: 21 May 2001 / Accepted: 23 August 2001

Abstract: Quantum fields are well known to violate the weak energy condition of general relativity: the renormalised energy density at any given point is unbounded from below as a function of the quantum state. By contrast, for the scalar and electromagnetic fields it has been shown that weighted averages of the energy density along timelike curves satisfy “quantum weak energy inequalities” (QWEIs) which constitute lower bounds on these quantities. Previously, Dirac QWEIs have been obtained only for massless fields in two-dimensional spacetimes. In this paper we establish QWEIs for the Dirac and Majorana fields of mass m ≥ 0 on general four-dimensional globally hyperbolic spacetimes, averaging along arbitrary smooth timelike curves with respect to any of a large class of smooth compactly supported positive weights. Our proof makes essential use of the microlocal characterisation of the class of Hadamard states, for which the energy density may be defined by point-splitting.

1. Introduction In general relativity, it is customary to assume that the stress-energy tensor satisfies one or more of the classical energy conditions; the weak energy condition, for example, being the assertion that the energy density measured by any observer is nonnegative. The primary motivation behind these energy conditions is that they ensure that gravity acts as an attractive force (in the sense of focussing geodesic congruences) in accordance with our experience of gravitation on a wide range of scales. It is therefore natural to assume that physically reasonable forms of classical matter obey such conditions, and to regard matter theories (such as the nonminimally coupled scalar field [4,43,12,17]) violating such conditions as being of questionable physical significance on many scales. Moreover, the energy conditions have proved to be of great value in obtaining deep results in classical general relativity, such as the positive mass and singularity theorems [39, 48, 21].


However, it is well known that all the pointwise energy conditions are violated in quantum field theory. Indeed, Epstein, Glaser and Jaffe [6] proved that no Wightman field theory on Minkowski space can admit a (nontrivial) energy density observable whose expectation values are bounded from below and vanish in the Minkowski vacuum state. Moreover, in linear field theories (both in flat and curved spacetimes) it is easy to construct states whose energy density at a given point may be tuned to be arbitrarily negative [3, 26]. This raises the possibility that quantum matter might be used to construct spacetimes with exotic properties, such as traversable wormholes [18] or so-called “warp drive” spacetimes [33], usually excluded by the classical energy conditions. One might also ask whether the conclusions of the singularity theorems remain valid for quantum matter. Furthermore, it is necessary to understand how classical matter contrives to obey the classical energy conditions, given that its fundamental constituents need not. One profitable line of enquiry, starting with the work of Ford [14], has been to investigate weighted averages of the renormalised energy density along the worldline of an observer, or over a small spacetime region. It turns out that the expectation values of these averaged observables are bounded from below independently of the state, and such bounds have been developed in successively greater generality over the last few years [16, 34, 11, 7, 9, 22, 8]. Most recently, one of us [8] has established the existence of such bounds (and given an explicit, though not optimal, lower bound) for the minimally coupled real linear scalar field in any globally hyperbolic spacetime, in the case where averaging is performed with respect to proper time along any smooth timelike curve using an arbitrary smooth compactly supported positive weight belonging to the class1 W = {f ∈ C0∞ (R) | f (τ ) = g(τ )2 for some real-valued g ∈ C0∞ (R)}.

(1.1)

The constraints imposed by these lower bounds have been called “quantum inequalities” by various authors. However, as we hope to discuss elsewhere, there seem to be strong parallels between the phenomena discussed above and various situations arising in quantum mechanics. Indeed, quantum inequalities appear to be a widespread feature of quantum theory as a whole, stemming ultimately from the uncertainty principle. In this light, we adopt the more specific terminology “quantum weak energy inequality” (or QWEI) in relation to lower bounds on the renormalised energy density of a quantum field. There are also related constraints which demand that the integral of the energy density over any complete (i.e. inextendible) smooth timelike or lightlike (“null”-) curve be nonnegative. These are called the “averaged weak energy condition” (AWEC) or “averaged null energy condition” (ANEC), respectively; they may be viewed as a limiting case of QWEIs when the weight function (f in Eq. (1.2) below) against which the energy density is integrated along a timelike or lightlike curve approaches the unit function. Historically, it was first pointed out in a work by Tipler [41] that suitable versions of AWEC or ANEC imply singularity theorems in general relativity similar to those which one obtains from pointwise positivity conditions on the energy density. Subsequently, the question of whether quantum fields obey AWEC or ANEC has been investigated in a number of works [26, 13, 46, 49, 50, 15, 12, 42]. Most of these references treat linear quantum fields; [42] establishes ANEC for general (axiomatic) quantum field theory in two-dimensional Minkowski spacetime. A study of the interrelations between averaged energy conditions and QWEIs is contained in [15]. We refer to [12] for further discussion and review of averaged energy conditions. 1 See the remarks following Theorem 4.1 for a brief discussion of this class.


QWEIs place stringent constraints on attempts to generate exotic spacetimes [18, 33] and may open a route towards proving results analogous to the singularity theorems for quantum matter [49, 50]. To date, however, most QWEI results have been obtained for scalar field theories, while the more physically interesting electromagnetic and Dirac fields have received comparatively little attention. Ford and Roman have considered the electromagnetic field in Minkowski space [16] and shown that a QWEI holds for averaging against the Lorentzian weight $f(\tau) = \tau_0/[\pi(\tau^2 + \tau_0^2)]$ along timelike geodesics; this result was generalised to static trajectories in static spacetimes by Pfenning [31], who has also removed the restriction to Lorentzian weights [32]. It is reasonable to suppose that even more general QWEIs may be obtained for this case. Both the scalar and electromagnetic fields have the property that the classical energy density is manifestly nonnegative, a fact which underpins all the results on these fields. The Dirac field is technically very different in that the “classical” energy density is unbounded both from above and below; in second quantization, renormalisation serves the dual purpose of restoring finiteness and imposing positivity of the Hamiltonian. This problem appears to have restricted progress on the Dirac field to date. The main contribution has been that of Vollick, who established a QWEI for Dirac fields in two-dimensional spacetimes [45] by converting the problem to one involving a scalar field and then adapting arguments due to Flanagan [11]. There seems little prospect of generalising this argument beyond the two-dimensional setting. In four dimensions, Vollick has also given explicit examples of states with locally negative energy densities [44] and demonstrated that the resulting energy densities nonetheless obey QWEIs modelled on those for the scalar field.
In this paper we establish a general QWEI for massive or massless Dirac fields on four-dimensional2 globally hyperbolic spacetimes. To be more specific, let γ be a smooth timelike curve, parametrized by its proper time τ, in a globally hyperbolic spacetime (M, g). Let $\omega_0$ be a given (but arbitrary) Hadamard state3 of the Dirac field on (M, g). The state $\omega_0$ is used as a “reference state” to define the expected normal ordered energy density $\langle {:}T_{00}{:} \rangle_\omega$ for any other Hadamard state ω. Our main result, Theorem 4.1, asserts that

$$\inf_{\omega} \int d\tau\, \langle {:}T_{00}{:} \rangle_\omega(\gamma(\tau))\, f(\tau) > -\infty, \tag{1.2}$$

where the infimum is taken over the class of Hadamard states and f belongs to the class W. In principle our arguments yield an explicit lower bound for the left-hand side of (1.2). This expression is unfortunately not particularly enlightening and is not expected to be sharp. Let us note that (1.2) remains true if the normal ordered energy density is replaced by the renormalised energy density, as these two quantities differ by a smooth function. The plan of the paper is as follows. We begin, for completeness and to fix notation, by reviewing the theory of the quantized Dirac field in Sect. 2. Particular attention is given to the class of Hadamard states, which may be characterised by a microlocal spectrum condition on the wave-front set of the two-point function. This formulation of the Hadamard condition is technically convenient and allows us to bring the tools of microlocal analysis to bear. Section 3 explains how the normal ordered energy density may be constructed by point-splitting. 2 The restriction to four dimensions is purely for convenience: our methods would apply in more general dimensions. 3 See Sect. 2 for a brief review of the concepts used here.


The proof of our QWEI begins in Sect. 4, using the following strategy. The averaged normal ordered energy density is first expressed as an integral over $\mathbb{R}^2$; decomposing this integral according to the quadrants of $\mathbb{R}^2$, each piece is then split further into four using a decomposition induced by the reference state $\omega_0$. All but two of the resulting sixteen contributions can be bounded (both above and below) using estimates obtained in Sect. 5. The remaining terms are then expressed in the form $R = \lim_{\Lambda\to+\infty} \operatorname{Tr} J_\Lambda W_\Lambda$, where $J_\Lambda$ and $W_\Lambda$ are self-adjoint and $J_\Lambda$ is independent of ω. The parameter $\Lambda \in \mathbb{R}^+$ defines a cut-off, used to avoid domain problems. We prove that $W_\Lambda$ is positive and trace-class with bounded trace as ω varies. To conclude that R is bounded below, it then suffices to establish that the operators $J_\Lambda$ are bounded below uniformly in Λ. This is accomplished in Sect. 6, completing the proof of our QWEI. In Sect. 7 we briefly describe how our arguments can also be applied to the Majorana field.

Conventions. The metric signature is (+, −, −, −). Lower (resp. upper) case Latin characters from the beginning of the alphabet will label tetrad (resp. spinor) indices. Tetrad indices run from 0 to 3, and we will use j, k to label the spatial components 1, 2, 3. The summation convention will be used throughout the paper except where otherwise indicated. Units with c = ħ = 1 are adopted. The Fourier transform of an integrable function f on $\mathbb{R}^n$ will be defined using the non-standard convention $\hat f(k) = \int d^n x\, e^{ikx} f(x)$, with inverse $\check h(x) = (2\pi)^{-n} \int d^n k\, e^{-ikx} h(k)$. The Fourier transform of a distribution u with compact support is $\hat u(k) = u(e_k)$ with $e_k(x) = e^{ikx}$. Given a Lorentzian manifold (M, g), $\mathcal{D}'(M)$ will denote the space of distributions on M as defined in §6.3 of [24]. Thus if $u \in \mathcal{D}'(M)$ there is, for each chart (U, κ) in M, a distribution $u_\kappa \in \mathcal{D}'(\kappa(U))$ such that4

$$u(f) = u_\kappa\big( (\sqrt{-|g|}\, f) \circ \kappa^{-1} \big) \qquad \forall f \in C_0^\infty(U) \tag{1.3}$$

and so that $u_{\kappa'} = (\kappa \circ \kappa'^{-1})^* u_\kappa$ in $\kappa'(U \cap U')$ for any other chart $(U', \kappa')$. We will also (and usually) write $u \circ \kappa^{-1}$ for $u_\kappa$. Spinor and cospinor distributions will be defined in an analogous fashion, with the convention that a spinor distribution acts on test cospinor fields, and vice versa.
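Because the sign and normalisation of the Fourier convention stated above differ from the more common one, a small numerical check may help. The sketch below is an illustration only, for n = 1: the test function (a Gaussian centred at x = 1), the integration window [−10, 10], the step count and all helper names are assumptions made for the example. With the e^{+ikx} kernel, the shifted Gaussian picks up the phase e^{+ik}, and the stated inverse with the factor (2π)^{-1} recovers the original function.

```python
import cmath
import math

def trapezoid(func, lo, hi, steps):
    """Composite trapezoidal rule for a complex-valued integrand."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h

def fourier(f, k, lo=-10.0, hi=10.0, steps=400):
    """Forward transform in the paper's convention: integral of e^{+ikx} f(x) dx."""
    return trapezoid(lambda x: cmath.exp(1j * k * x) * f(x), lo, hi, steps)

def inverse_fourier(h, x, lo=-10.0, hi=10.0, steps=400):
    """Inverse transform: (2 pi)^{-1} times the integral of e^{-ikx} h(k) dk."""
    return trapezoid(lambda k: cmath.exp(-1j * k * x) * h(k), lo, hi, steps) / (2 * math.pi)

def shifted_gaussian(x):
    """Illustrative test function (assumption): Gaussian centred at x = 1."""
    return math.exp(-0.5 * (x - 1.0) ** 2)

k0 = 0.5
numeric = fourier(shifted_gaussian, k0)
# Exact value in this convention: sqrt(2 pi) * exp(-k^2 / 2) * exp(+i k)
exact = math.sqrt(2 * math.pi) * math.exp(-0.5 * k0 ** 2) * cmath.exp(1j * k0)

x0 = 0.3
round_trip = inverse_fourier(lambda k: fourier(shifted_gaussian, k), x0)

print(numeric, exact)
print(round_trip.real, shifted_gaussian(x0))
```

With the opposite sign in the kernel the phase would be e^{-ik} instead, which is the practical difference to keep in mind when comparing with formulas written in the other convention.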

4 The factor of $\sqrt{-|g|}$ is used to identify test functions with test densities, on which u strictly speaking acts.

2. The Quantized Dirac Field 2.1. Geometrical preliminaries. In order to make the present paper sufficiently self-contained, we need to summarize a few basic facts about the geometry of spinor fields in curved spacetimes. We will follow Dimock’s work [5] to a large extent. We will consider Dirac fields in a four-dimensional globally hyperbolic spacetime (M, g). To begin with, we recall that a globally hyperbolic spacetime is a Lorentzian spacetime admitting a Cauchy surface, the latter being a smooth hypersurface in M which is intersected exactly once by each inextendible causal curve in (M, g). We will also suppose that (M, g) is orientable and time-orientable, and that such orientations have been chosen. Then (M, g) possesses a spin-structure, that is, there is a principal fibre bundle S(M, g) having SL(2, C) as structure group, acting from the right, together with a 2-1 fibre-bundle homomorphism ψ : S(M, g) → F(M, g) which projects S(M, g) onto the frame bundle F(M, g). That is to say, ψ preserves base-points and obeys

$$\psi \circ R_s = R_{\Lambda(s)} \circ \psi, \tag{2.1}$$

where R denotes the right action of the structure groups on the principal fibre bundles involved and $SL(2, \mathbb{C}) \ni s \mapsto \Lambda(s) \in L_+^\uparrow$ is the covering projection onto the proper orthochronous Lorentz group. We recall that F(M, g) is the bundle of oriented and time-oriented tetrads $(e_0, e_1, e_2, e_3)$, so that $g(e_a, e_b) = \eta_{ab}$ with $\eta_{ab} = \mathrm{diag}(1, -1, -1, -1)$, where $e_0$ is timelike and future-pointing and the tetrad is given the orientation of M. Moreover, a collection of 4 × 4-matrices $\gamma_0, \ldots, \gamma_3$ is called a set of Dirac matrices if

$$\gamma_a \gamma_b + \gamma_b \gamma_a = 2\eta_{ab} \cdot 1. \tag{2.2}$$
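The defining relation (2.2) is easy to verify for a concrete set of matrices. The sketch below is an illustration under stated assumptions: it uses the familiar Dirac representation built from Pauli matrices, which is one possible choice and not one fixed by the text, and the helper names are made up for the example. It checks the anticommutation relations with η = diag(1, −1, −1, −1) using plain nested lists.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i][j] * b[k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]  # Pauli matrices
I_SIGMA_2 = [[0, 1], [-1, 0]]                                        # i * sigma_2
ETA = [1, -1, -1, -1]                                                # diagonal of eta_ab

# Dirac representation (assumed choice): gamma_0 = sigma_3 (x) 1, gamma_k = (i sigma_2) (x) sigma_k
GAMMA = [kron(SIGMA[2], I2)] + [kron(I_SIGMA_2, s) for s in SIGMA]

def satisfies_clifford(gamma):
    """True if gamma_a gamma_b + gamma_b gamma_a = 2 eta_ab * 1 for all a, b."""
    for a in range(4):
        for b in range(4):
            ab = matmul(gamma[a], gamma[b])
            ba = matmul(gamma[b], gamma[a])
            anti = [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]
            target = [[2 * ETA[a] * (a == b) * (i == j) for j in range(4)]
                      for i in range(4)]
            if not close(anti, target):
                return False
    return True

print(satisfies_clifford(GAMMA))
```

Any other set obtained from this one by conjugation with an invertible matrix passes the same check, since conjugation preserves products and sums.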

A theorem due to Pauli states that, if $\gamma_0, \ldots, \gamma_3$ and $\gamma_0', \ldots, \gamma_3'$ are two sets of Dirac matrices, then there is an invertible matrix M so that $\gamma_a' = M \gamma_a M^{-1}$. Any set of Dirac matrices $\gamma_0, \ldots, \gamma_3$ is connected to the covering $SL(2, \mathbb{C}) \xrightarrow{\Lambda(\,\cdot\,)} L_+^\uparrow$ in the following way. Let Spin(1, 3) consist of all unimodular 4 × 4-matrices S so that

$$S \gamma_a S^{-1} = \gamma_b\, \Lambda^b{}_a \tag{2.3}$$

holds for some real numbers $\Lambda^b{}_a = \Lambda^b{}_a(S)$. It follows from the defining properties of Dirac matrices that $\Lambda^b{}_a(S)$ is contained in the Lorentz group. The restriction of the map $S \mapsto \Lambda^b{}_a(S)$ in (2.3) to $\mathrm{Spin}_0(1, 3)$, the unit connected component of Spin(1, 3), is a group homomorphism with range $L_+^\uparrow$, and thus $\mathrm{Spin}_0(1, 3)$ is isomorphic to SL(2, C). Sometimes it is useful to distinguish sets of Dirac matrices with certain properties. One says that a set $\gamma_0, \ldots, \gamma_3$ of Dirac matrices belongs to a standard representation if

$$\gamma_0^* = \gamma_0 \quad \text{and} \quad \gamma_k^* = -\gamma_k. \tag{2.4}$$

Here, $\gamma_a^*$ is the hermitian adjoint of $\gamma_a$. A set of Dirac matrices which belongs to a standard representation and has the additional property that the complex conjugate matrices fulfil

$$\overline{\gamma_a} = -\gamma_a \tag{2.5}$$

is said to belong to a Majorana representation. We will now suppose that we are given a globally hyperbolic spacetime (M, g) together with a spin-structure (S(M, g), ψ) and a set of Dirac matrices $\gamma_0, \ldots, \gamma_3$ which we will assume, for the sake of notational simplicity, to belong to a standard representation. (We note, however, that everything which follows could also be carried out in a similar way without that assumption.) As was pointed out above, via the isomorphism $\mathrm{Spin}_0(1, 3) \cong SL(2, \mathbb{C})$, $\mathbb{C}^4$ carries a representation of the universal covering group of $L_+^\uparrow$ which is given by the action of the matrices S in $\mathrm{Spin}_0(1, 3)$ on vectors in $\mathbb{C}^4$. Via $\mathrm{Spin}_0(1, 3) \cong SL(2, \mathbb{C})$, we can also regard S(M, g) as a $\mathrm{Spin}_0(1, 3)$-principal bundle and form the associated vector bundle

$$DM = S(M, g) \times_{\mathrm{Spin}_0(1,3)} \mathbb{C}^4. \tag{2.6}$$


That is, the fibre of DM at p ∈ M consists of the orbits

$$[s_p, x] = \{ (R_{S^{-1}} s_p, Sx) : S \in \mathrm{Spin}_0(1, 3) \} \tag{2.7}$$

for $s_p \in S(M, g)_p$ and $x \in \mathbb{C}^4$. There is a fibrewise left action of $\mathrm{Spin}_0(1, 3)$ on DM by $L_S [s_p, x] = [s_p, Sx]$. Elements in DM are called spinors, and elements in the dual bundle $D^*M$ are called cospinors. Moreover, if E is a (local) section in S(M, g), then it induces on one hand a tetrad field $(e_0, \ldots, e_3) = \psi \circ E$, i.e. a (local) smooth section in F(M, g), via the spin-structure, and on the other hand it induces a set $(E_A)_{A=1}^4$ of (local) smooth sections in DM, defined by

$$E_A = [E, b_A], \tag{2.8}$$

where $b_1, \ldots, b_4$ is the standard basis in $\mathbb{C}^4$. There are corresponding dual tetrad fields $(e^0, \ldots, e^3)$ defined by $e^b(e_a) = \delta^b_a$ and $(E^B)_{B=1}^4$ defined by $E^B(E_A) = \delta^B_A$. The $e^b$ are smooth sections in $T^*M$, and the $E^B$ are smooth sections in $D^*M$, the dual bundle to DM. We shall denote by $C^\infty(DM)$ and $C^\infty(D^*M)$ the sets of smooth sections in DM and $D^*M$, respectively. The notation for smooth sections in TM and $T^*M$ will be similar. With respect to the given set of Dirac matrices, one can define a section γ in $C^\infty(T^*M) \otimes C^\infty(DM) \otimes C^\infty(D^*M)$, i.e. a mixed spinor-tensor field, by setting its components $\gamma_b{}^A{}_B$ in the induced frame $e^b \otimes E_A \otimes E^B$ to be equal to the matrix elements $(\gamma_b)^A{}_B$ of $\gamma_b$. This definition is independent of the induced frames, i.e. independent of the chosen (local) section E in S(M, g). (Once the set of Dirac matrices $\gamma_0, \ldots, \gamma_3$ is given, γ encodes the spin-structure at the level of DM.) Moreover, there is an anti-linear isomorphism $DM \to D^*M$ induced by forming the Dirac adjoint: If $u = u^A E_A$ is a spinor, one can assign to it a cospinor $u^+ = u^+_B E^B$ with components

$$u^+_B = \overline{u^A}\, \gamma_{0AB}, \tag{2.9}$$

where $\gamma_{0AB}$ are the matrix elements of $\gamma_0$. This assignment possesses an inverse (denoted by the same symbol), where a cospinor $v = v_B E^B$ is mapped to a spinor $v^+$ having components $v^{+A} = \gamma_0{}^{AB}\, \overline{v_B}$. Again, $\gamma_0{}^{AB}$ are the matrix elements of $\gamma_0$. The operation of taking the Dirac adjoint gives rise to anti-linear isomorphisms between $C^\infty(DM)$ and $C^\infty(D^*M)$ in the obvious manner. The metric-induced covariant derivative ∇ on $C^\infty(TM)$ induces a covariant derivative, also denoted by ∇, on $C^\infty(DM)$. If $(e_0, \ldots, e_3)$ and $(E_A)_{A=1}^4$ are induced by a section E in S(M, g) and $f = f^A E_A$ is a local section in DM, then

$$\nabla f = \nabla_b f^A\, (e^b \otimes E_A) \in C^\infty(T^*M) \otimes C^\infty(DM) \tag{2.10}$$

has the frame components $\nabla_b f^A = \partial_b f^A + \sigma_b{}^A{}_B f^B$, where

$$\partial_b f^A = df^A(e_b), \qquad \sigma_b{}^A{}_B = -\frac{1}{4}\, \Gamma^a_{bd}\, \gamma_a{}^A{}_C\, \gamma^{dC}{}_B. \tag{2.11}$$


Here, $df^A$ is the differential of the function $f^A$, and we read the components of the Dirac matrices on the right hand side, while $\Gamma^a_{bd}$ are Christoffel’s connection coefficients, defined by

$$\nabla k = (\partial_b k^a + \Gamma^a_{bd} k^d)\, e^b \otimes e_a \tag{2.12}$$

for $k = k^b e_b \in C^\infty(TM)$. The covariant derivative ∇ can be extended to cospinor fields and mixed spinor-tensor fields by requiring the Leibniz rule and commutativity with contractions. Thus, if $h = h_B E^B$ is a cospinor field, then $\nabla h = \nabla_b h_B\, e^b \otimes E^B$ has the components

$$\nabla_b h_B = \partial_b h_B - h_C\, \sigma_b{}^C{}_B. \tag{2.13}$$

It follows that ∇γ = 0.

2.2. The Dirac Equation. The Dirac-operator $\slashed{\nabla}$ is a first order differential operator taking spinor fields to spinor fields, or cospinor fields to cospinor fields; it is defined as the action of the covariant derivative followed by contraction with the spinor-tensor γ. More precisely, if $f = f^A E_A \in C^\infty(DM)$ and $h = h_B E^B \in C^\infty(D^*M)$, then

$$\slashed{\nabla} f = (\slashed{\nabla} f)^A E_A = \eta^{ab}\, \gamma_a{}^A{}_B\, \nabla_b f^B\, E_A, \tag{2.14}$$

$$\slashed{\nabla} h = (\slashed{\nabla} h)_B E^B = \eta^{ab}\, \nabla_b h_C\, \gamma_a{}^C{}_B\, E^B. \tag{2.15}$$

An important property of the Dirac operator is that it commutes with taking the Dirac adjoint, i.e.

$$(\slashed{\nabla} f)^+ = \slashed{\nabla} f^+ \quad \text{and} \quad (\slashed{\nabla} h)^+ = \slashed{\nabla} h^+ \tag{2.16}$$

for $f \in C^\infty(DM)$ and $h \in C^\infty(D^*M)$. The Dirac equation is the following first order partial differential equation for spinor fields $u \in C^\infty(DM)$ or for cospinor fields $v \in C^\infty(D^*M)$:

$$(-i\slashed{\nabla} + m)u = 0, \tag{2.17}$$

$$(i\slashed{\nabla} + m)v = 0, \tag{2.18}$$

where m ≥ 0 is a constant. Then

$$P = (i\slashed{\nabla} + m)(-i\slashed{\nabla} + m) \tag{2.19}$$

is the Lichnerowicz wave operator on spinors or cospinors. It is a second order wave operator which has metric principal part, and owing to global hyperbolicity of (M, g) this implies that the Cauchy problem for the corresponding wave equations is well-posed and that P possesses uniquely determined advanced and retarded fundamental solutions. As shown in [5], this implies that the Dirac operators $-i\slashed{\nabla} + m$ on spinor fields and $i\slashed{\nabla} + m$ on cospinor fields possess uniquely determined pairs of advanced(−) and retarded(+) fundamental solutions $S^\pm_{\mathrm{sp}}$ and $S^\pm_{\mathrm{cosp}}$, respectively. This means that, for the spinor case,

$$S^\pm_{\mathrm{sp}} : C_0^\infty(DM) \to C^\infty(DM) \tag{2.20}$$

are continuous linear maps so that

$$(-i\slashed{\nabla} + m)\, S^\pm_{\mathrm{sp}} u = u = S^\pm_{\mathrm{sp}}\, (-i\slashed{\nabla} + m) u \tag{2.21}$$


C. J. Fewster, R. Verch

holds for all $u \in C_0^\infty(DM)$ and, moreover, $\operatorname{supp} S^\pm_{\mathrm{sp}} u$ is contained in the causal future(+)/causal past(−) of $\operatorname{supp} u$. (We note that our convention concerning advanced/retarded fundamental solutions is opposite to that in [38].) The cospinor case is analogous. Then one defines the retarded-minus-advanced fundamental solutions

\[ S_{\mathrm{sp}} = S^+_{\mathrm{sp}} - S^-_{\mathrm{sp}} \quad\text{and}\quad S_{\mathrm{cosp}} = S^+_{\mathrm{cosp}} - S^-_{\mathrm{cosp}} . \tag{2.22} \]

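In order to quantize the Dirac field, it is very convenient to “double” the system by taking pairs of spinor fields and cospinor fields together, as was done in the references [28, 29, 23]. We give here an equivalent version which makes contact with the notation used in [38]. To this end, let us denote $C_0^\infty(DM)$ by $D_{\mathrm{sp}}$ and $C_0^\infty(D^*M)$ by $D_{\mathrm{cosp}}$, and define the doubled space $D_{\mathrm{double}} = D_{\mathrm{cosp}} \oplus D_{\mathrm{sp}}$. On $D_{\mathrm{double}}$ we introduce the sesquilinear form

\[ \left( \begin{pmatrix} h_1 \\ f_1 \end{pmatrix}, \begin{pmatrix} h_2 \\ f_2 \end{pmatrix} \right) = \langle f_1^+, f_2 \rangle - \langle h_2, h_1^+ \rangle \tag{2.23} \]

for $h_1, h_2 \in D_{\mathrm{cosp}}$ and $f_1, f_2 \in D_{\mathrm{sp}}$, where for $v \in D_{\mathrm{cosp}}$ and $u \in D_{\mathrm{sp}}$ we employ the dual pairing

\[ \langle v, u \rangle = \int_M d\mu_g(p)\, v_p(u_p) \tag{2.24} \]

with $d\mu_g$ denoting the canonical 4-volume form induced by the metric $g$ on $M$. This dual pairing also embeds $D_{\mathrm{cosp}}$ in the (topological) dual space $D'_{\mathrm{sp}}$ of $D_{\mathrm{sp}}$, and vice versa, embeds $D_{\mathrm{sp}}$ in $D'_{\mathrm{cosp}}$. The sesquilinear form $(\cdot\,, \cdot)$ is non-degenerate, but not positive. A useful relation is

\[ \langle S_{\mathrm{cosp}} h, f \rangle = -\langle h, S_{\mathrm{sp}} f \rangle \tag{2.25} \]

for $h \in D_{\mathrm{cosp}}$ and $f \in D_{\mathrm{sp}}$. Let us define the conjugate-linear isomorphism $\Gamma : D_{\mathrm{double}} \to D_{\mathrm{double}}$, playing the role of a charge-conjugation, by

\[ \Gamma \begin{pmatrix} h \\ f \end{pmatrix} = \begin{pmatrix} f^+ \\ h^+ \end{pmatrix} . \tag{2.26} \]

Then one finds that $\Gamma$ is a skew-conjugation with respect to $(\cdot\,, \cdot)$, namely it holds that

\[ (\Gamma F_1, \Gamma F_2) = -(F_2, F_1) \qquad \forall\, F_1, F_2 \in D_{\mathrm{double}} . \]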

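Now we introduce the following “doubled” operators on $D_{\mathrm{double}}$:

\[ D_{✁} = \begin{pmatrix} -\slashed{\nabla} + im & 0 \\ 0 & \slashed{\nabla} + im \end{pmatrix}, \tag{2.27} \]

\[ D_{✄} = \begin{pmatrix} -\slashed{\nabla} - im & 0 \\ 0 & \slashed{\nabla} - im \end{pmatrix}, \tag{2.28} \]

\[ S_{✁} = \begin{pmatrix} iS_{\mathrm{cosp}} & 0 \\ 0 & iS_{\mathrm{sp}} \end{pmatrix} . \tag{2.29} \]

Then it holds that

\[ D_{✁} D_{✄} = D_{✄} D_{✁} = P_{\mathrm{double}} = \begin{pmatrix} P_{\mathrm{cosp}} & 0 \\ 0 & P_{\mathrm{sp}} \end{pmatrix}, \tag{2.30} \]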

A Quantum Weak Energy Inequality for Dirac Fields in Curved Spacetime


where $P_{\ldots}$ denotes the Lichnerowicz wave operator on spinors and cospinors, respectively; moreover, one finds that $\Gamma$ commutes with $P_{\mathrm{double}}$ and

\[ \Gamma D_{✁} = -D_{✁} \Gamma, \qquad \Gamma D_{✄} = -D_{✄} \Gamma, \tag{2.31} \]

\[ (D_{✁} F_1, F_2) = -(F_1, D_{✁} F_2), \qquad (D_{✄} F_1, F_2) = -(F_1, D_{✄} F_2) \qquad \forall\, F_1, F_2 \in D_{\mathrm{double}} . \tag{2.32} \]

One may also check that $S^\pm_{✁}$ (defined in obvious analogy to $S_{✁}$) are the retarded(+)/advanced(−) fundamental solutions for the operator $D_{✄}$; consequently

\[ D_{✄} S_{✁} = S_{✁} D_{✄} = 0 . \tag{2.33} \]
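The step from the fundamental-solution property to (2.33) can be spelled out as follows. This is only a sketch: it assumes that $S^\pm_{✁}$ invert $D_{✄}$ on both sides on $D_{\mathrm{double}}$, in the same sense as (2.21), and that $S_{✁} = S^+_{✁} - S^-_{✁}$ in analogy with (2.22).

```latex
\begin{align*}
D_{✄} S^{\pm}_{✁} F &= F = S^{\pm}_{✁} D_{✄} F
   && (F \in D_{\mathrm{double}}),\\
D_{✄} S_{✁} F &= D_{✄} S^{+}_{✁} F - D_{✄} S^{-}_{✁} F = F - F = 0,\\
S_{✁} D_{✄} F &= S^{+}_{✁} D_{✄} F - S^{-}_{✁} D_{✄} F = F - F = 0 .
\end{align*}
```

The same cancellation occurs if both $S^\pm_{✁}$ invert $D_{✄}$ only up to a common constant factor.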

Furthermore, from (2.31) one can see that

\[ \Gamma S_{✁} = -S_{✁} \Gamma . \tag{2.34} \]

This entails that $\Gamma$ is a complex conjugation for the sesquilinear form

\[ (F_1, F_2)_S = (S_{✁} F_1, F_2), \qquad F_1, F_2 \in D_{\mathrm{double}}, \tag{2.35} \]

so that

\[ (\Gamma F_1, \Gamma F_2)_S = (F_2, F_1)_S \qquad \forall\, F_1, F_2 \in D_{\mathrm{double}} . \tag{2.36} \]

On the other hand one can see that (cf. (2.25))

\[ \left( \begin{pmatrix} h_1 \\ f_1 \end{pmatrix}, \begin{pmatrix} h_2 \\ f_2 \end{pmatrix} \right)_S = -i \langle f_1^+, S_{\mathrm{sp}} f_2 \rangle + i \langle S_{\mathrm{cosp}} h_2, h_1^+ \rangle \tag{2.37} \]

and this implies by Prop. 2.2 in [5] that $(\cdot\,, \cdot)_S$ is positive-semidefinite, $(F, F)_S \ge 0$. Now we introduce the quotient space $D_{\mathrm{double}} / \ker S_{✁}$ and denote by $H$ its completion with respect to $(\cdot\,, \cdot)_S$. The conjugation $\Gamma$ induces by (2.34) a conjugation on $H$ which will again be denoted by $\Gamma$. Hence, we have derived from the doubled Dirac equation a complex Hilbert space $H$ (with scalar product $(\cdot\,, \cdot)_S$) together with a complex conjugation $\Gamma$. The system can be quantized, following Araki [1], by assigning to these data the algebra of canonical anti-commutation relations $\mathrm{CAR}(H, \Gamma)$. This is the unique $C^*$-algebra with unit $1$ which is generated by a family $\{B(v) : v \in H\}$ subject to the relations:
(i) $v \mapsto B(v)$ is $\mathbb{C}$-linear,
(ii) $B(\Gamma v) = B(v)^*$,
(iii) $B(v)^* B(w) + B(w) B(v)^* = (v, w)_S \cdot 1$.
Now let $q : D_{\mathrm{double}} \to D_{\mathrm{double}} / \ker S_{✁}$ denote the quotient map; then we define the quantized Dirac field as the linear map which assigns to each $h \in D_{\mathrm{cosp}}$ the element

\[ \Psi(h) = B\!\left( q \begin{pmatrix} h \\ 0 \end{pmatrix} \right) \tag{2.38} \]

in $\mathrm{CAR}(H, \Gamma)$. The adjoint spinor field will be defined by

\[ \Psi^+(f) = B\!\left( q \begin{pmatrix} 0 \\ f \end{pmatrix} \right), \qquad f \in D_{\mathrm{sp}} . \tag{2.39} \]


As a consequence of (iii), the field and its adjoint satisfy the anti-commutation relations

\[ \Psi(h) \Psi^+(f) + \Psi^+(f) \Psi(h) = -i \langle h, S_{\mathrm{sp}} f \rangle . \tag{2.40} \]

We also note that, owing to (2.33), $B(q(D_{✄} F)) = 0$, and this entails

\[ \Psi((i\slashed{\nabla} + m) h) = 0 \quad\text{and}\quad \Psi^+((-i\slashed{\nabla} + m) f) = 0 \tag{2.41} \]

for all $h \in D_{\mathrm{cosp}}$ and $f \in D_{\mathrm{sp}}$.

Remarks. (i) $\Psi$ acts linearly on cospinors and fulfills, in the sense of distributions, the equation $(-i\slashed{\nabla} + m)\Psi = 0$; therefore the map $h \mapsto \Psi(h)$ is regarded as a spinor field. Similarly, $\Psi^+$ acts linearly on spinor fields and fulfills in distributional sense $(i\slashed{\nabla} + m)\Psi^+ = 0$; hence it is viewed as a cospinor field.
(ii) $\Psi$ and $\Psi^+$ are $C^*$-valued distributions since e.g. $2\|\Psi^+(f)\|^2 = -i\langle f^+, S_{\mathrm{sp}} f \rangle$ and $D_{\mathrm{sp}} \otimes D_{\mathrm{sp}} \ni f_1 \otimes f_2 \mapsto -i\langle f_1^+, S_{\mathrm{sp}} f_2 \rangle$ is continuous (with respect to the usual test-function topology).
(iii) We briefly comment on the case where $\gamma_0, \ldots, \gamma_3$ belong to a Majorana representation (as considered in [38]) and one wishes to quantize the Majorana field. In that situation, $D_{\mathrm{cosp}}$ (and similarly, $D_{\mathrm{sp}}$) carries an “intrinsic” charge conjugation $\Gamma$ given by $v = v_B E^B \mapsto \overline{v_B}\, E^B$ in any frame. Then $\Gamma$ is a skew-conjugation for the sesquilinear form

\[ (h, h') = \int_M d\mu_g(p)\, h_p(h'^{+}_p) \tag{2.42} \]

on $D_{\mathrm{cosp}}$. Upon defining $D_{✄} = \slashed{\nabla} - im$, $D_{✁} = \slashed{\nabla} + im$ and $S_{✁} = iS_{\mathrm{cosp}}$, one obtains similar relations as before. Then $H$ arises as completion of $D_{\mathrm{cosp}} / \ker S_{✁}$ and $(h, h')_S = (S_{✁} h, h')$. The field operators then simplify to

\[ \Psi(h) = B(q(h)); \qquad \Psi^+(f) = \Psi(f^+)^* = \Psi(\Gamma f^+) \tag{2.43} \]

for $h \in D_{\mathrm{cosp}}$, $f \in D_{\mathrm{sp}}$. In other words, the Majorana field may be quantized without doubling the classical system. Instead of starting with $D_{\mathrm{cosp}}$ one can likewise consider $D_{\mathrm{sp}}$; this has been done in [38].

2.3. Hadamard states. We recall that a state on a $C^*$-algebra $\mathcal{C}$ is a linear functional $\omega : \mathcal{C} \to \mathbb{C}$ fulfilling $\omega(1) = 1$ and $\omega(A^*A) \ge 0$ for all $A \in \mathcal{C}$. For the purpose of the present work, it is sufficient to focus on the two-point functions $\omega_2$ of states $\omega$ on $\mathrm{CAR}(H, \Gamma)$. The two-point function $\omega_2$ of $\omega$ is an element in $(D_{\mathrm{double}} \otimes D_{\mathrm{double}})'$ given by

\[ \omega_2(F_1 \otimes F_2) = \omega(B(q(F_1)) B(q(F_2))), \qquad F_1, F_2 \in D_{\mathrm{double}} . \tag{2.44} \]

It was shown in [1] that there is for each state $\omega$ on $\mathrm{CAR}(H, \Gamma)$ a linear operator $Q$ on $H$ with the properties:
(I) $0 \le Q^* = Q \le 1$,
(II) $Q + \Gamma Q \Gamma = 1$,
(III) $\omega_2(F_1 \otimes F_2) = (\Gamma q(F_1), Q q(F_2))_S$ for all $F_1, F_2 \in D_{\mathrm{double}}$.
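Properties (I) and (II) can be illustrated in a finite-dimensional toy model. The sketch below is not the construction used in the paper: it assumes the Hilbert space $\mathbb{C}^2$ with the antilinear conjugation $\Gamma v = C\bar{v}$, where $C$ swaps the two components, and the helper names `conjugate_operator` and `satisfies_I_and_II` are hypothetical.

```python
import numpy as np

# Toy conjugation on C^2 (an assumption of this sketch): Gamma v = C @ conj(v).
# C is real, symmetric and squares to the identity, so Gamma is an involution.
C = np.array([[0.0, 1.0], [1.0, 0.0]])

def conjugate_operator(Q):
    """Matrix of the linear operator Gamma Q Gamma, i.e. v -> C conj(Q C conj(v))."""
    return C @ Q.conj() @ C.conj()

def satisfies_I_and_II(Q, tol=1e-12):
    """Check (I) 0 <= Q* = Q <= 1 and (II) Q + Gamma Q Gamma = 1 for a 2x2 matrix."""
    hermitian = np.allclose(Q, Q.conj().T, atol=tol)
    eigs = np.linalg.eigvalsh((Q + Q.conj().T) / 2)
    bounded = eigs[0] >= -tol and eigs[-1] <= 1 + tol
    complementary = np.allclose(Q + conjugate_operator(Q), np.eye(2), atol=tol)
    return bool(hermitian and bounded and complementary)

a = 0.3
Q_mixed = np.diag([a, 1 - a]).astype(complex)         # 0 < a < 1, not a projection
Q_projection = np.diag([1.0, 0.0]).astype(complex)    # satisfies Q @ Q == Q
Q_identity = np.eye(2, dtype=complex)                 # violates (II)

for name, Q in [("mixed", Q_mixed), ("projection", Q_projection), ("identity", Q_identity)]:
    print(name, satisfies_I_and_II(Q))
```

In this toy model (II) forces $Q = \mathrm{diag}(a, 1-a)$, so the admissible operators form a one-parameter family with $0 \le a \le 1$.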


Conversely, each linear operator $Q$ with the properties (I) and (II) determines by (III) the two-point function $\omega_2$ of some state $\omega$ on $\mathrm{CAR}(H, \Gamma)$, a so-called quasifree state, determined by the two-point function $\omega_2$ (see [1] for discussion). Such a quasifree state $\omega$ is pure (and then often called a Fock state) if and only if $Q$ is a projection, i.e. $Q^2 = Q$. The operator $Q$ with the properties (I), (II) and (III) above will be called the operator labelling the quasifree state $\omega$.

The stress-energy tensor is defined using the two-point functions of a particular class of states, the Hadamard states. One says that a state $\omega$ on $\mathrm{CAR}(H, \Gamma)$ is a Hadamard state if we may write

\[ \omega_2(F_1 \otimes F_2) = i\, w(D_{✁} F_1 \otimes F_2) \tag{2.45} \]

for some distribution $w \in (D_{\mathrm{double}} \otimes D_{\mathrm{double}})'$ of Hadamard form for the doubled wave operator $P_{\mathrm{double}}$ on $D_{\mathrm{double}}$; the definition of a Hadamard form for such a wave operator has been given in [38] (cf. also [28, 29, 23, 30]). This definition entails that the difference between the two-point functions of two Hadamard states is smooth. For the purposes of this work, we will also need the characterization of Hadamard states in terms of properties of the wave-front set $\mathrm{WF}(\omega_2)$ that appears in [29, 23, 38], following a line of argument given in [35] for the scalar case. The relevant statement, proven in the references just stated, is: A state $\omega$ on $\mathrm{CAR}(H, \Gamma)$ is a Hadamard state if and only if the wave-front set of its two-point function $\omega_2$ satisfies the relation

\[ \mathrm{WF}(\omega_2) = \{ (p, \xi; p', -\xi') \in \dot{T}^*(M \times M) \mid (p, \xi) \sim (p', \xi');\ \xi \in N_p^+ \}, \tag{2.46} \]

where $\dot{T}^*(M \times M)$ is the cotangent bundle over $M \times M$ without the zero-section, $(p, \xi) \sim (p', \xi')$ means that there is a lightlike geodesic connecting the points $p$ and $p'$ in $M$ and to which $\xi$ and $\xi'$ are co-tangent, and $N_p^+$ is the set of all future-directed null covectors at $p$. (We remark that in [38] the opposite sign convention for the Fourier transform was chosen, leading to the opposite constraint $\xi \in N_p^-$ (i.e., past-directed null covectors) in that reference compared to (2.46).)

At this point we very briefly recall the definition of the wave-front set of a distribution [24]. For a distribution $t \in D'(\mathbb{R}^n)$, a point $(x, k) \in \mathbb{R}^n \times (\mathbb{R}^n \setminus \{0\})$ is called a regular directed point of $t$ if there exists $\chi \in D(\mathbb{R}^n)$ with $\chi(x) \neq 0$ and a conic open neighbourhood $C$ of $k$ in $\mathbb{R}^n \setminus \{0\}$ such that

\[ \sup_{k' \in C} (1 + |k'|)^N\, |\widehat{\chi t}(k')| < \infty \qquad \forall\, N \in \mathbb{N} . \tag{2.47} \]

(If this holds we will say that $\widehat{\chi t}$ is of rapid decay in $C$.) The complement in $\mathbb{R}^n \times (\mathbb{R}^n \setminus \{0\})$ of the set of all regular directed points of $t$ is called the wave-front set $\mathrm{WF}(t)$ of $t$. Given a scalar distribution $\tau$ on a manifold $M$, one says that a non-zero covector $(p, \xi) \in \dot{T}^*(M)$ is in $\mathrm{WF}(\tau)$ if there is a coordinate chart $(U, \kappa)$ around $p \in M$ so that $(\kappa(p), {}^t(\kappa^{-1})'\xi) \in \mathrm{WF}(\tau \circ \kappa^{-1})$, where (as discussed at the end of Sect. 1) $\tau \circ \kappa^{-1} \in D'(\kappa(U))$ is a distribution on the chart range of $\kappa$. This definition of $\mathrm{WF}(\tau)$ is independent of the choice of the chart $\kappa$. We refer the reader to [24] for further discussion of the properties of the wave-front set of distributions on manifolds. For the case that $\tau$ is a distribution on test-sections of a vector bundle, e.g. defined on $D_{\mathrm{double}}$, $\tau$ can be viewed, via (local) trivializations of the bundle, as the collection $(\tau_{AB})_{A,B}$ of scalar distributions, and then $\mathrm{WF}(\tau)$ is defined as the union of $\mathrm{WF}(\tau_{AB})$ over all components $A, B$. It is not difficult to see that this definition is independent of the chosen (local) trivialization.
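The rapid-decay criterion (2.47) can be made concrete in one dimension. The following sketch is purely illustrative and rests on several assumptions not taken from the text: the distribution $1/(x + i0)$ is replaced by its regularisation $1/(x + i\varepsilon)$, a Gaussian stands in for the compactly supported cut-off $\chi$, and the Fourier transform is taken with the kernel $e^{-ikx}$. With these choices the localised Fourier transform stays of order one along one half-line of frequencies and is negligible along the other, which is the one-sided behaviour that a wave-front set records.

```python
import numpy as np

def fourier_transform(values, x, k):
    """Riemann-sum approximation of the integral of values(x) * exp(-i k x) dx."""
    dx = x[1] - x[0]
    return np.sum(values * np.exp(-1j * k * x)) * dx

# Grid wide enough that the Gaussian factor is negligible at the end points.
x = np.linspace(-8.0, 8.0, 2**14, endpoint=False)
eps = 0.05                       # regularisation parameter (assumption of this sketch)
chi = np.exp(-x**2)              # smooth, rapidly decaying stand-in for a cut-off with chi(0) != 0
t_eps = 1.0 / (x + 1j * eps)     # approximates 1/(x + i0), which is singular only at x = 0

for k in (5.0, 10.0, 20.0):
    forward = abs(fourier_transform(chi * t_eps, x, k))
    backward = abs(fourier_transform(chi * t_eps, x, -k))
    print(f"k = {k:5.1f}   |FT(+k)| = {forward:.3e}   |FT(-k)| = {backward:.3e}")
```

Which half-line carries the slow decay depends on the sign convention chosen for the Fourier transform, in line with the remark on conventions following (2.46).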


Now let $\omega_2$ be the two-point function of a Hadamard state $\omega$ on $\mathrm{CAR}(H, \Gamma)$, and let $Q$ be the corresponding operator on $H$ with the properties (I), (II) and (III) above. We will use the notation

\[ Q^\Gamma = \Gamma Q \Gamma \tag{2.48} \]

for the “charge conjugate” of $Q$, and we will adopt this notation also for other operators on $H$. In studying the stress-energy tensor, we will be particularly interested in the following distributions on $D_{\mathrm{sp}} \otimes D_{\mathrm{cosp}}$ associated with $\omega$ (respectively, with $Q$), which we will also refer to as two-point functions:

\[ \omega_Q(f \otimes h) = \omega(\Psi^+(f) \Psi(h)), \qquad \omega_Q^\Gamma(f \otimes h) = \omega(\Psi(h) \Psi^+(f)), \tag{2.49} \]

where $f \in D_{\mathrm{sp}}$ and $h \in D_{\mathrm{cosp}}$. Note that $\omega_Q^\Gamma = \omega_{Q^\Gamma}$. As a consequence of the constraint on the wave-front set (2.46) for Hadamard states, which is also called the microlocal spectrum condition, one finds that the following microlocal spectrum condition holds for $\omega_Q$ and $\omega_Q^\Gamma$:

\[ \mathrm{WF}(\omega_Q^\sharp) = \{ (p, \xi; p', -\xi') \in \dot{T}^*(M \times M) \mid (p, \xi) \sim (p', \xi');\ \xi \in N_p^\sharp \} . \tag{2.50} \]

Here, and below, we use $\sharp$ and $\flat$ to denote either the presence or absence of a $\Gamma$ in the following context-dependent way:

\[ Q^\sharp = \begin{cases} Q \\ Q^\Gamma \end{cases} \qquad \omega_Q^\sharp = \begin{cases} \omega_Q \\ \omega_Q^\Gamma \end{cases} \qquad N_p^\sharp = \begin{cases} N_p^+ \\ N_p^- \end{cases} \qquad \mathbb{R}^\sharp = \begin{cases} \mathbb{R}^+ \\ \mathbb{R}^- \end{cases} \]

To avoid confusion we will sometimes use $\cdot$ as a placeholder to indicate the absence of a $\Gamma$. Thus $Q^{\cdot} = Q$, for example. We note that the microlocal spectrum condition in the form (2.50) has been proved directly for quasifree Hadamard states in [29, 23]. Let us note that (2.49) may be written

\[ \omega_Q(f \otimes h) = \left( \Gamma q \begin{pmatrix} 0 \\ f \end{pmatrix}, Q q \begin{pmatrix} h \\ 0 \end{pmatrix} \right)_S ; \tag{2.51} \]

as a slight abuse of notation we will use this relation to define $\omega_Q$ for general bounded operators $Q$ on $H$ and refer to $\omega_Q$ as the two-point function labelled by $Q$. (Of course $\omega_Q$ is not in general the two-point function of a state.) We will also denote by $\mathrm{Had}(H, \Gamma)$ the class of operators which obey properties (I) and (II) and such that $\omega_Q^\sharp$ obeys (2.50). Thus $\mathrm{Had}(H, \Gamma)$ parametrizes the quasifree Hadamard states on $\mathrm{CAR}(H, \Gamma)$.

Our proof of Thm. 4.1 below relies on the existence of pure, quasifree Hadamard states for the Dirac field. It seems that this has never been established in the literature in full detail, therefore we sketch here how such states may be constructed by adapting an argument employed by Fulling, Narcowich and Wald [19] for the case of the free scalar field to the Dirac field. The first step is to show that there exists a pure, quasifree Hadamard state for the Dirac field for ultrastatic $(M, g)$. In fact, if $(M, g)$ is ultrastatic, then it may be endowed with a suitable spin structure so that the ultrastatic time shifts give rise to a continuous unitary group $U_t$ ($t \in \mathbb{R}$) on $H$ leaving the scalar product $(\cdot\,, \cdot)_S$ invariant and fulfilling $\Gamma U_t = U_t \Gamma$. Moreover, if the mass parameter $m$ appearing in the Dirac equation is strictly positive, then the spectrum of the self-adjoint generator of $U_t$


is bounded away from zero. Theorem 2 in [1], or the results in [47], thus show that there is a pure quasifree state ω0 on the CAR algebra of the Dirac field on ultrastatic (M, g) which is a ground state for the C ∗ -dynamics induced by Ut ; the projection P0 labelling ω0 is the projection onto the positive spectral subspace of the unitary group Ut . Since ω0 is a ground state, it fulfills the microlocal spectrum condition [37] and hence is a Hadamard state [38]. In a second step, one uses a technique developed in [19] which allows one to view a neighbourhood of a Cauchy surface of any given globally hyperbolic spacetime as being isometrically embedded in a globally hyperbolic spacetime that has an ultrastatic part (with suitable spin structure as above) in its past. By the uniqueness of the Cauchy problem and the “propagation of Hadamard form” under the dynamics of the Dirac equation [38], any pure, quasifree Hadamard state prescribed on the ultrastatic part of the spacetime (e.g. ω0 ) induces a pure, quasifree Hadamard state everywhere on the spacetime, in particular on the embedded neighbourhood of the Cauchy surface of the initially given globally hyperbolic spacetime. Using the same argument once more, a pure, quasifree Hadamard state for the Dirac field is thereby induced on any given globally hyperbolic spacetime. The mass parameter m may be allowed to be variable over spacetime in this process without affecting the Hadamard form, so that one obtains a pure, quasifree Hadamard state of the Dirac field on any globally hyperbolic spacetime for any m ≥ 0. The argument just sketched implicitly also shows that there exists an abundance of quasifree Hadamard states. 3. A Point-Split Energy Density For the remainder of this paper, we will assume that (M, g) is globally hyperbolic, orientable and time orientable, with spin structure (S(M, g), ψ) and that the Dirac matrices γa belong to a standard representation. 
Let $\gamma : \mathbb{R} \to M$ be a smooth timelike curve in $(M, g)$, parametrized by its proper time, along which we wish to establish a QWEI. The starting point is the construction of a normal ordered energy density on $\gamma$, which is accomplished as follows. We first claim that there exists a tubular neighbourhood $C_\gamma$ of $\gamma$ and a local section $E$ of $S(M, g)$ over $C_\gamma$ such that the induced tetrad field $(e_0, \ldots, e_3) = \psi \circ E$ satisfies $e_0|_\gamma = u$, where $u = \dot{\gamma}$ is the velocity of $\gamma$. To see this, choose any locally finite open cover $\{U_j \mid j \in \mathbb{Z}\}$ of $\gamma$ by charts $U_j$ such that (i) $U_j \cap U_k = \emptyset$ unless $|j - k| \le 1$; (ii) $U_j \cap U_{j+1}$ is contractible for each $j \in \mathbb{Z}$; (iii) $U_j \cap U_k \cap U_l = \emptyset$ if $j, k, l$ are distinct. The existence of such a cover follows from global hyperbolicity of $(M, g)$ since $\gamma$ is timelike. Now extend $u$ to a smooth timelike unit vector field $u$ on some tubular neighbourhood $C_\gamma$ of $\gamma$, so that $C_\gamma \subset \bigcup_j U_j$. Choose any tetrad $(e'_0, \ldots, e'_3)$ on $C_\gamma$. Then we may obtain a tetrad $(e_0, \ldots, e_3)$ with $e_0 = u$ by applying a unique boost in $\mathcal{L}_+^\uparrow$ at each point (whose parameters are given by the components of $u$ with respect to $e'_a$, and therefore vary smoothly). This tetrad lifts smoothly to $S(M, g)$ in each $U_j \cap C_\gamma$ and may be patched together along $C_\gamma$ to obtain the required section $E$ by virtue of properties (i), (ii) and (iii).⁵ Next, we choose smooth spinor fields $v_A$ ($A = 1, \ldots, 4$) in $C_\gamma$, such that

\[ \delta^{AB}\, v_A \otimes v_B^+ = \gamma_0 . \tag{3.1} \]

⁵ Of course, there are exactly two such sections.


This is easily satisfied by taking $v_A = E_A$, where $E_A$ is the spin frame induced by $E$; however, it will be convenient to make a slightly different choice when considering the Majorana field. The changes relevant for Majorana fields will be described in Sect. 7. With respect to the frame $e_a$ the Dirac stress-energy tensor is

\[ T_{ab} = \frac{i}{2} \left[ \psi^+ \gamma_{(a} \nabla_{b)} \psi - (\nabla_{(a} \psi^+) \gamma_{b)} \psi \right], \tag{3.2} \]

which is manifestly symmetric, and is conserved provided $\psi$ obeys the Dirac equation (2.17). In particular, $T_{00}(\gamma(\tau))$ is the energy density measured by an observer with worldline $\gamma$ at proper time $\tau$. We may use (3.1) to define a bi-scalar point-split energy density

\[ T(x, y) = \delta^{AB}\, \frac{i}{2} \left[ (\psi^+ v_A)(x)\, (v_B^+\, e_0 \cdot \nabla \psi)(y) - ([e_0 \cdot \nabla \psi^+] v_A)(x)\, (v_B^+ \psi)(y) \right] \tag{3.3} \]

with the property that $T(x, x) = T_{00}(x)$. Integrating by parts, $T$ becomes a scalar bi-distribution $T \in (D(M) \otimes D(M))'$,

\[ T(f \otimes g) = \delta^{AB}\, \frac{i}{2} \left[ \psi^+(\nabla \cdot [e_0 v_A f])\, \psi(v_B^+ g) - \psi^+(v_A f)\, \psi(\nabla \cdot [e_0 v_B^+ g]) \right] . \tag{3.4} \]

Here (and below) the notation $\nabla \cdot [e_0 v]$ denotes minus the distributional dual of $e_0 \cdot \nabla$, applied to the test function or (co)spinor $v$. Thus,

\[ \nabla \cdot [e_0 v] = v\, \nabla \cdot e_0 + e_0 \cdot \nabla v, \tag{3.5} \]

where, with respect to local coordinates $(x^\mu)$, $\nabla \cdot e_0 = \nabla_\mu e_0^\mu$ and $e_0 \cdot \nabla v = e_0^\mu \nabla_\mu v$. Upon quantization, we obtain the algebra-valued bi-distribution $T$ given by

\[ T(f \otimes g) = \delta^{AB}\, \frac{i}{2} \left[ \Psi^+(\nabla \cdot [e_0 v_A f])\, \Psi(v_B^+ g) - \Psi^+(v_A f)\, \Psi(\nabla \cdot [e_0 v_B^+ g]) \right] . \tag{3.6} \]

Given a state $\omega$ we will now use the same symbol to denote its two-point function $\omega(f \otimes g) = \omega(\Psi^+(f) \Psi(g))$ and also set $v_{AB} = v_A \otimes v_B^+$. Thus $v_{AB}\omega$ will denote the matrix of scalar bi-distributions

\[ v_{AB}\omega(f \otimes g) = \omega(\Psi^+(v_A f)\, \Psi(v_B^+ g)) . \tag{3.7} \]

The formulae $\nabla \cdot (e_0 v_A f) = v_A \nabla \cdot (e_0 f) + \sigma_0{}^C{}_A\, f v_C$ and $\nabla \cdot (e_0 v_B^+ g) = v_B^+ \nabla \cdot (e_0 g) + \sigma_0{}^C{}_B\, g v_C^+$ now allow us to write the expectation value $\langle T \rangle_\omega$ of $T$ in state $\omega$ as

\[ \langle T \rangle_\omega = L^{AB} v_{AB}\omega, \tag{3.8} \]

where

\[ L^{AB} = \frac{1}{2} (1 \otimes i e_0 \cdot \nabla - i e_0 \cdot \nabla \otimes 1)\, \delta^{AB} + \frac{1}{2} A^{AB}, \tag{3.9} \]

and

\[ A^{AB} = i \left[ \delta^{CB} \sigma_0{}^A{}_C \otimes 1 - 1 \otimes \delta^{AC} \sigma_0{}^B{}_C \right] . \tag{3.10} \]


If a reference Hadamard state $\omega_0$ is now specified, we may define the normal ordered point-split energy density (with respect to $\omega_0$) by

\[ \langle {:}T{:} \rangle_\omega = \langle T \rangle_\omega - \langle T \rangle_{\omega_0} . \tag{3.11} \]

This may also be written

\[ \langle {:}T{:} \rangle_\omega = L^{AB} v_{AB}\, {:}\omega{:}, \tag{3.12} \]

where ${:}\omega{:} = \omega - \omega_0$ is the normal ordered two-point function. Since ${:}\omega{:}$ is smooth for Hadamard $\omega$, $\langle {:}T{:} \rangle_\omega$ is also smooth. Accordingly, the “coincidence limit” (i.e., the restriction of $\langle {:}T{:} \rangle_\omega$ to the diagonal) is well defined and yields the normal ordered energy density near $\gamma$. We denote the energy density along $\gamma$ by

\[ \langle {:}\rho{:} \rangle_\omega(\tau) = \langle {:}T{:} \rangle_\omega(\gamma(\tau), \gamma(\tau)) . \tag{3.13} \]

Note also that $\langle {:}T{:} \rangle_\omega(f \otimes g)$ is symmetric in $f$ and $g$ by virtue of the CAR's. It will be convenient to regard $\langle {:}\rho{:} \rangle_\omega$ as the diagonal of the pull-back $\gamma_2^* \langle {:}T{:} \rangle_\omega$, where $\gamma_2(\tau, \tau') = (\gamma(\tau), \gamma(\tau'))$. In turn, $\gamma_2^* \langle {:}T{:} \rangle_\omega$ may be written as the action of a differential operator on the pulled-back normal ordered two-point function:

\[ \gamma_2^* \langle {:}T{:} \rangle_\omega = \frac{1}{2} \left[ (1 \otimes D - D \otimes 1)\, \delta^{AB} + \gamma_2^* A^{AB} \right] \gamma_2^* v_{AB}\, {:}\omega{:}, \tag{3.14} \]

where $D = i\, d/d\tau$ (strictly speaking, $D$ should be regarded as the distributional dual of $-i\, d/d\tau$).

4. Main Argument. We now come to the proof of the QWEI for Dirac fields. In the following, $(M, g)$ is assumed to satisfy the hypotheses stated at the beginning of the previous section.

Theorem 4.1. Let $\gamma : \mathbb{R} \to M$ be a smooth timelike curve in $(M, g)$ parametrized by its proper time. Let $\omega_0$ be a Hadamard state of the Dirac field on $(M, g)$. Define the normal ordered energy density $\langle {:}\rho{:} \rangle_\omega$ by (3.13) with respect to the reference state $\omega_0$. Then for any weight $f$ belonging to $W$ (defined in Eq. (1.1))

\[ \inf_\omega \int d\tau\, \langle {:}\rho{:} \rangle_\omega(\tau)\, f(\tau) > -\infty, \tag{4.1} \]

where the infimum is taken over all Hadamard states $\omega$. That is, there exists a quantum weak energy inequality for the Dirac field.

Remarks. (i) If the reference state $\omega_0$ is changed, $\langle {:}\rho{:} \rangle_\omega$ is modified by a smooth function which is independent of $\omega$. Thus we may assume without loss of generality that $\omega_0$ is pure and quasifree. Exactly the same argument entails that (4.1) holds if we replace the normal ordered energy density by the renormalised energy density.
(ii) Perhaps surprisingly, the class $W$ (of squares of real-valued $C_0^\infty(\mathbb{R})$ functions) does not coincide with the class of nonnegative smooth compactly supported functions. In fact, Glaeser [20]⁶ has constructed an example of a $C^\infty$ nonnegative function $f$, vanishing only at the origin, so that $\frac{d^2}{dx^2}\sqrt{f(x)}$ diverges as $x \to 0$. The delicacy of this point resides in the behaviour of $f$ at zeros of infinite order. It is not clear whether the restriction to weights in $W$ is purely a technical limitation of our proof, or whether QWEIs should be understood as quadratic form results (cf. [10]).

⁶ We are grateful to S.P. Eveson and P.J. Bushell for help in locating this reference.


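Proof. It is sufficient to prove (4.1) for arbitrary $f \in C_0^\infty(I) \cap W$, where $I \subset \mathbb{R}$ is an arbitrary open interval with compact closure. To this end, choose $\eta \in C_0^\infty(M)$ such that $\eta$ equals unity on a neighbourhood of $\gamma(I)$. It is easy to see that $\langle {:}\rho{:} \rangle_\omega$ is unaltered on $I$ if we replace $v_{AB}\,{:}\omega{:}$ in (3.14) by the compactly supported distribution $u_{AB}\,{:}\omega{:}$, where

\[ u_{AB} = \eta v_A \otimes \eta v_B^+ . \tag{4.2} \]

Applying the formula

\[ \int d\tau\, F(\tau, \tau)\, \varphi(\tau) = \int \frac{d\lambda\, d\lambda'}{(2\pi)^2}\, \hat{\varphi}(\lambda - \lambda')\, \hat{F}(-\lambda, \lambda'), \tag{4.3} \]

which is valid for $F \in C_0^\infty(\mathbb{R}^2)$ and $\varphi \in C_0^\infty(\mathbb{R})$, one may show that

\[ I = \int d\tau\, \langle {:}\rho{:} \rangle_\omega(\tau)\, f(\tau) = \int d\lambda\, d\lambda'\, J^{AB}(\lambda, \lambda')\, W^{(\omega)}_{AB}(\lambda, \lambda'), \tag{4.4} \]

where

\[ W^{(\omega)}_{AB}(\lambda, \lambda') = \left[ \gamma_2^* u_{AB}\, {:}\omega{:} \right]^\wedge (-\lambda, \lambda'), \tag{4.5} \]

and we have also written

\[ J^{AB}(\lambda, \lambda') = \frac{1}{8\pi^2} \left[ (\lambda + \lambda')\, \hat{f}(\lambda - \lambda')\, \delta^{AB} + [\theta^{AB} f]^\wedge(\lambda - \lambda') \right], \tag{4.6} \]

where

\[ \theta^{AB}(\tau) = \gamma_2^* A^{AB}(\tau, \tau) = i \left[ \delta^{CB} \sigma_0{}^A{}_C|_{\gamma(\tau)} - \delta^{AC} \sigma_0{}^B{}_C|_{\gamma(\tau)} \right] \tag{4.7} \]

is clearly hermitian ($\overline{\theta^{BA}(\tau)} = \theta^{AB}(\tau)$). It follows that $J^{AB}$ is a hermitian matrix kernel, i.e., $\overline{J^{AB}(\lambda, \lambda')} = J^{BA}(\lambda', \lambda)$. Note that $J^{AB}$ is state-independent, while $W^{(\omega)}_{AB}$ contains all the dependence on the state of interest $\omega$ and the reference state $\omega_0$. We also note that $J^{AB}(\lambda, \lambda')$ decays rapidly away from the diagonal in $\mathbb{R}^2$.

Assuming without loss that $\omega_0$ is pure and quasifree, it must be labelled by some projection $P$ on $H$. Since the stress-energy tensor is defined in terms of the two-point function, it is enough to establish (4.1) when the infimum is taken over quasifree Hadamard states, whose two-point functions are of the form $\omega_Q$ for $Q \in \mathrm{Had}(H, \Gamma)$ as discussed in Sect. 2.3. The normal ordered two-point function ${:}\omega_Q{:} = \omega_Q - \omega_P$ is labelled by $Q - P$, i.e., ${:}\omega_Q{:} = \omega_{Q-P}$ in the spirit of the remarks following (2.51). Now

\[ Q - P = -P Q^\Gamma P + P^\Gamma Q P + P Q P^\Gamma + P^\Gamma Q P^\Gamma, \tag{4.8} \]

and this induces a decomposition of ${:}\omega_Q{:}$ into four pieces

\[ {:}\omega_Q{:} = -\omega^{\cdot\Gamma\cdot} + \omega^{\Gamma\cdot\cdot} + \omega^{\cdot\cdot\Gamma} + \omega^{\Gamma\cdot\Gamma}, \tag{4.9} \]

where $\omega^{\sharp\natural\flat}$ is the two-point function labelled by $P^\sharp Q^\natural P^\flat$; that is,

\[ \omega^{\sharp\natural\flat}(f \otimes h) = \left( \Gamma q \begin{pmatrix} 0 \\ f \end{pmatrix}, P^\sharp Q^\natural P^\flat q \begin{pmatrix} h \\ 0 \end{pmatrix} \right)_S \tag{4.10} \]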


for $f \in D_{\mathrm{sp}}$ and $h \in D_{\mathrm{cosp}}$. In Sect. 5 we will show that each $\omega^{\sharp\natural\flat}$ may be pulled back to $\mathbb{R}^2$ by $\gamma_2$, allowing us to write

\[ W^{(\omega_Q)}_{AB} = -W^{\cdot\Gamma\cdot}_{AB} + W^{\Gamma\cdot\cdot}_{AB} + W^{\cdot\cdot\Gamma}_{AB} + W^{\Gamma\cdot\Gamma}_{AB}, \tag{4.11} \]

where

\[ W^{\sharp\natural\flat}_{AB}(\lambda, \lambda') = \left[ \gamma_2^* u_{AB}\, \omega^{\sharp\natural\flat} \right]^\wedge (-\lambda, \lambda') . \tag{4.12} \]

These functions are analytic in $\lambda, \lambda'$ because they are the Fourier transforms of compactly supported distributions. Furthermore, the following $Q$-independent bounds will be established in Sect. 5.

Lemma 4.2. For any $Q \in \mathrm{Had}(H, \Gamma)$,

\[ \left| W^{\sharp\natural\flat}_{AB}(\lambda, \lambda') \right| \le X^{\sharp\flat}_{AB}(\lambda, \lambda') \qquad \forall\, (\lambda, \lambda') \in \mathbb{R}^2, \tag{4.13} \]

where $X^{\sharp\flat}_{AB}$ is independent of $Q$ and is defined in terms of the reference two-point function $\omega_P$ by

\[ X^{\sharp\flat}_{AB}(\lambda, \lambda') = Y^\sharp_A(\lambda)\, Y^\flat_B(\lambda'), \tag{4.14} \]

where $Y^\sharp_A(\lambda)$ is the positive square root of

\[ Y^\sharp_A(\lambda)^2 = \left[ \gamma_2^* u_{AA}\, \omega^\sharp_P \right]^\wedge (-\lambda, \lambda) \qquad \text{(no sum on } A\text{)} . \tag{4.15} \]

Furthermore, $Y^{\cdot}_A(\lambda)$ (resp. $Y^\Gamma_A(\lambda)$) decays rapidly as $\lambda \to +\infty$ (resp. $\lambda \to -\infty$) and is of polynomially bounded growth as $\lambda \to -\infty$ (resp. $\lambda \to +\infty$).

Remarks. (i) The right-hand side of Eq. (4.15) is nonnegative because $u_{AA}\, \omega^\sharp_P$ is of positive type as a scalar bi-distribution, and this property is inherited under pull-back by $\gamma_2$ (see Theorem 2.2 in [8]). In fact, the calculation

\[ u_{AB}\, \omega^\sharp_P (f^A \otimes \overline{f^B}) = \omega^\sharp_P (f^A u_A \otimes (f^B u_B)^+) = \omega_0 (\Psi(f^A u_A)\, \Psi(f^B u_B)^*) \ge 0 \tag{4.16} \]

(summing on $A$ and $B$) for $f^A \in C_0^\infty(M)$ ($A = 1, \ldots, 4$) shows that $u_{AB}\, \omega^\sharp_P$ is of positive type as a matrix-valued distribution. Positive type of $u_{AA}\, \omega^\sharp_P$ follows in consequence. A similar argument (using (4.10) and the property $Q \ge 0$ for $Q \in \mathrm{Had}(H, \Gamma)$) shows that $u_{AB}\, \omega^{\cdot\Gamma\cdot}$, $u_{AB}\, \omega^{\Gamma\cdot\Gamma}$ and their pull-backs by $\gamma_2$ share the matrix positive type property.
(ii) The statements on the growth of $Y^\sharp_A$ are obtained from the Paley–Wiener–Schwartz theorem [24], which entails that the Fourier transform of a compactly supported distribution is of at worst polynomial growth. Below, we will frequently use the fact that the product of a rapidly decaying function and one of polynomial growth is itself rapidly decaying.

Because the bounds obtained in Lemma 4.2 exhibit different behaviour in the four quadrants $C_1, \ldots, C_4$ of the $(\lambda, \lambda')$-plane it is convenient to decompose the averaged energy density (4.4) as $I = \sum_k I_k$, where $I_k$ is the contribution arising from quadrant $C_k$. We proceed to bound the $I_k$ in turn.
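The fact quoted in Remark (ii), that a rapidly decaying function times a polynomially bounded one is again rapidly decaying, follows from a one-line estimate. In the sketch below the functions $g$, $h$ and the constants $C$, $C_N$, $M$ are notation introduced only for this illustration.

```latex
\begin{align*}
|g(\lambda)| &\le C_N\, (1+|\lambda|)^{-N} \quad \text{for every } N \in \mathbb{N},
\qquad
|h(\lambda)| \le C\, (1+|\lambda|)^{M} \quad \text{for some } M \in \mathbb{N},\\
|g(\lambda)\, h(\lambda)| &\le C\, C_{N+M}\, (1+|\lambda|)^{-(N+M)}\, (1+|\lambda|)^{M}
 = C\, C_{N+M}\, (1+|\lambda|)^{-N} \quad \text{for every } N \in \mathbb{N}.
\end{align*}
```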


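Starting with the second and fourth quadrants $C_2 = \mathbb{R}^- \times \mathbb{R}^+$ and $C_4 = \mathbb{R}^+ \times \mathbb{R}^-$, Lemma 4.2 yields the $Q$-independent bound

\[ \left| W^{(\omega_Q)}_{AB}(\lambda, \lambda') \right| \le \sum_{\sharp, \flat} X^{\sharp\flat}_{AB}(\lambda, \lambda') \tag{4.17} \]

in which each summand on the right-hand side is of at worst polynomial growth. This may be combined with the rapid decay of $J^{AB}$ away from the diagonal to yield the following $Q$-independent bound on the contribution from these quadrants:

\[ |I_2 + I_4| \le \int_{C_2 \cup C_4} d\lambda\, d\lambda'\, \left| J^{AB}(\lambda, \lambda') \right| \sum_{\sharp, \flat} X^{\sharp\flat}_{AB}(\lambda, \lambda') < \infty . \tag{4.18} \]

We are left with the first and third quadrants. Since $J^{AB}$ exhibits polynomial growth along the diagonal, the previous argument will not allow us to bound all the terms arising from the decomposition (4.11). To see this, note that Lemma 4.2 applied to the $P^\Gamma Q P^\Gamma$ term $W^{\Gamma\cdot\Gamma}_{AB}$ gives a bound $X^{\Gamma\Gamma}_{AB}$ growing polynomially⁷ in all directions in the first quadrant. Similarly, the bound $X^{\cdot\cdot}_{AB}$ for the $P Q^\Gamma P$ term is polynomially growing in the third quadrant. However, Lemma 4.2 suffices to bound the other contributions to $I_1$ and $I_3$ because at least one factor in the relevant $X^{\sharp\flat}_{AB}$ is rapidly decaying. Thus

\[ |I_1 - R_1| \le \int_{C_1} d\lambda\, d\lambda'\, \left| J^{AB}(\lambda, \lambda') \right| \left[ X^{\cdot\cdot}_{AB}(\lambda, \lambda') + X^{\cdot\Gamma}_{AB}(\lambda, \lambda') + X^{\Gamma\cdot}_{AB}(\lambda, \lambda') \right] < \infty \tag{4.19} \]

and

\[ |I_3 - R_3| \le \int_{C_3} d\lambda\, d\lambda'\, \left| J^{AB}(\lambda, \lambda') \right| \left[ X^{\Gamma\Gamma}_{AB}(\lambda, \lambda') + X^{\cdot\Gamma}_{AB}(\lambda, \lambda') + X^{\Gamma\cdot}_{AB}(\lambda, \lambda') \right] < \infty, \tag{4.20} \]

where the remaining terms are

\[ R_1 = \int_{C_1} d\lambda\, d\lambda'\, J^{AB}(\lambda, \lambda')\, W^{\Gamma\cdot\Gamma}_{AB}(\lambda, \lambda') \tag{4.21} \]

and

\[ R_3 = -\int_{C_3} d\lambda\, d\lambda'\, J^{AB}(\lambda, \lambda')\, W^{\cdot\Gamma\cdot}_{AB}(\lambda, \lambda') \tag{4.22} \]

(the leading minus sign arises because it is $-P Q^\Gamma P$ which appears in (4.8)). The quantities $R_1$ and $R_3$ are real, because $J^{AB}$, $W^{\Gamma\cdot\Gamma}_{AB}$ and $W^{\cdot\Gamma\cdot}_{AB}$ are hermitian matrix kernels. To complete the proof of the QWEI it is required to show that $R_1$ and $R_3$ are bounded from below independently of $Q$. We will present the argument for $R_1$ in detail and indicate how the proof is modified for $R_3$. Let $\chi_\Lambda$ ($\Lambda \in \mathbb{R}^+$) be a family of smooth, real-valued, nonincreasing functions such that $\chi_\Lambda$ equals unity on $[0, \Lambda]$ and vanishes on $[\Lambda + 1, \infty)$. It is clear that

\[ R_1 = \lim_{\Lambda \to \infty} \int_{C_1} d\lambda\, d\lambda'\, \chi_\Lambda(\lambda)\, \chi_\Lambda(\lambda')\, J^{AB}(\lambda, \lambda')\, W^{\Gamma\cdot\Gamma}_{AB}(\lambda, \lambda') \tag{4.23} \]

⁷ Note that it is the bound which is polynomially growing; we expect that $W^{\Gamma\cdot\Gamma}_{AB}$ is actually decaying.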


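and the cut-off $\chi_\Lambda$ now allows us to interpret the integral as a trace in the following way. Define $\sigma : \mathbb{R}^+ \to \mathbb{R}$ by

\[ \sigma(\lambda) = (1 + \lambda^2)^{1/2} \left[ 1 + |Y^\Gamma(\lambda)|_{\mathbb{C}^4} \right], \tag{4.24} \]

where $|\cdot|_{\mathbb{C}^4}$ is the usual vector norm on $\mathbb{C}^4$. Then $\sigma$ is smooth, positive, bounded away from zero and of polynomially bounded growth. Next, define operators $\widetilde{J}_\Lambda$ and $\widetilde{W}$ on $L^2(\mathbb{R}^+, d\lambda) \otimes \mathbb{C}^4$ by

\[ (\widetilde{J}_\Lambda \varphi)^A(\lambda) = \int_0^\infty d\lambda'\, \sigma_\Lambda(\lambda)\, \sigma_\Lambda(\lambda')\, J^{AC}(\lambda, \lambda')\, \delta_{CB}\, \varphi^B(\lambda') \tag{4.25} \]

and

\[ (\widetilde{W} \varphi)^B(\lambda') = \delta^{BD} \int_0^\infty d\lambda\, \frac{W^{\Gamma\cdot\Gamma}_{CD}(\lambda, \lambda')}{\sigma(\lambda)\, \sigma(\lambda')}\, \varphi^C(\lambda), \tag{4.26} \]

where we have written $\sigma_\Lambda(\lambda) = \chi_\Lambda(\lambda)\, \sigma(\lambda)$. Now $\widetilde{J}_\Lambda$ is Hilbert–Schmidt (due to the cut-off) and self-adjoint while the properties of $\widetilde{W}$ are summarised in the following proposition, which is proved at the end of this section.

Proposition 4.3. For all $Q \in \mathrm{Had}(H, \Gamma)$, $\widetilde{W}$ is a positive trace-class operator with

\[ 0 \le \operatorname{Tr} \widetilde{W} \le \frac{\pi}{2} . \tag{4.27} \]

We may therefore rewrite Eq. (4.23) in the form

\[ R_1 = \lim_{\Lambda \to \infty} \operatorname{Tr} \widetilde{J}_\Lambda \widetilde{W} . \tag{4.28} \]

Introducing an orthonormal basis of eigenvectors $\upsilon_n$ for $\widetilde{W}$, and using $\widetilde{W} \ge 0$ and (4.27), we have

\begin{align*}
\operatorname{Tr} \widetilde{J}_\Lambda \widetilde{W} &= \sum_n \langle \upsilon_n \mid \widetilde{J}_\Lambda \widetilde{W} \upsilon_n \rangle = \sum_n \langle \upsilon_n \mid \widetilde{J}_\Lambda \upsilon_n \rangle \langle \upsilon_n \mid \widetilde{W} \upsilon_n \rangle \\
&\ge \inf \operatorname{spec}(\widetilde{J}_\Lambda) \sum_n \langle \upsilon_n \mid \widetilde{W} \upsilon_n \rangle \\
&\ge \inf \operatorname{spec}(\widetilde{J}_\Lambda)\, \operatorname{Tr} \widetilde{W} \\
&\ge \frac{\pi}{2} \min\{0, \inf \operatorname{spec}(\widetilde{J}_\Lambda)\} . \tag{4.29}
\end{align*}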
Noting that the right-hand side has no $Q$-dependence, the required lower bound on $R_1$ now follows from the following proposition, which is proved in Sect. 6.

Proposition 4.4. The spectrum of $\widetilde{J}_\Lambda$ is bounded from below uniformly in $\Lambda$ (with a finite lower bound).

Turning to the integral $R_3$, we follow exactly the same argument but with the single difference that the kernel $J^{AB}$ on $\mathbb{R}^- \times \mathbb{R}^-$ defines an operator on $L^2(\mathbb{R}^-) \otimes \mathbb{C}^4$ which may be bounded above by a nonnegative quantity; this is compensated by the leading minus sign in the definition of $R_3$. Thus $R_1 + R_3$ has a finite $Q$-independent lower bound as required and the proof of Theorem 4.1 is complete. □


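Proof of Proposition 4.3. Using Lemma 4.2 and Cauchy–Schwarz, we first estimate

\[ |(\widetilde{W} \varphi)(\lambda')|_{\mathbb{C}^4} \le \int_0^\infty d\lambda\, \frac{Y^\Gamma_C(\lambda)\, |\varphi^C(\lambda)|}{(1 + \lambda^2)^{1/2} [1 + |Y^\Gamma(\lambda)|_{\mathbb{C}^4}]} \cdot \frac{|Y^\Gamma(\lambda')|_{\mathbb{C}^4}}{(1 + \lambda'^2)^{1/2} [1 + |Y^\Gamma(\lambda')|_{\mathbb{C}^4}]} \tag{4.30} \]

for $\varphi \in L^2(\mathbb{R}^+) \otimes \mathbb{C}^4$, from which we obtain $\|\widetilde{W} \varphi\| \le \frac{1}{2}\pi \|\varphi\|$. Thus $\widetilde{W}$ is bounded. A short calculation shows that

\[ \langle \varphi \mid \widetilde{W} \varphi \rangle = \gamma_2^* u_{CB}\, \omega^{\Gamma\cdot\Gamma} (\psi^C \otimes \overline{\psi^B}) \ge 0 \tag{4.31} \]

for $\varphi \in C_0^\infty(\mathbb{R}^+) \otimes \mathbb{C}^4$, where

\[ \psi(\tau) = \frac{1}{2\pi} \left( \frac{1}{\sigma} \varphi \right)^{\!\vee} (\tau) \in S(\mathbb{R}) \otimes \mathbb{C}^4, \tag{4.32} \]

and we have used the matrix positive type property and compact support of $\gamma_2^* u_{CB}\, \omega^{\Gamma\cdot\Gamma}$. Thus $\widetilde{W}$ is positive. In order to show that $\widetilde{W}$ is trace-class it is now enough (by the lemma⁸ following Theorem XI.31 in [36]) to show that the formal trace of $\widetilde{W}$ is finite (whereupon the formal trace is indeed the trace). But this is just

\[ \int_0^\infty d\lambda\, \sigma(\lambda)^{-2}\, \delta^{BC}\, W^{\Gamma\cdot\Gamma}_{BC}(\lambda, \lambda) \le \int_0^\infty d\lambda\, \frac{|Y^\Gamma(\lambda)|^2_{\mathbb{C}^4}}{(1 + \lambda^2)(1 + |Y^\Gamma(\lambda)|_{\mathbb{C}^4})^2} \le \frac{\pi}{2}, \tag{4.33} \]

where we have used Lemma 4.2 and the fact that the diagonal of the kernel of $\widetilde{W}$ is positive (because $\widetilde{W}$ is positive and has continuous kernel; see the lemma in [36] mentioned above). Thus $\widetilde{W}$ is trace-class on $L^2(\mathbb{R}^+) \otimes \mathbb{C}^4$; putting this together with the positivity property, we have $0 \le \operatorname{Tr} \widetilde{W} \le \pi/2$ as required. □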

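5. Proof of Lemma 4.2. We first establish the existence of the quantities $W^{\sharp\natural\flat}_{AB}$ and $Y^\sharp_A(\lambda)$ defined by (4.12) and (4.15). Using self-adjointness of $P^\sharp$, Cauchy–Schwarz and $\|Q^\natural\| \le 1$ (following from property (I) of Sect. 2.3 for $Q \in \mathrm{Had}(H, \Gamma)$) we obtain from (4.10) the inequality

\[ |\omega^{\sharp\natural\flat}(f_1 \otimes f_2^+)|^2 \le \left\| P^\sharp\, \Gamma q \begin{pmatrix} 0 \\ f_1 \end{pmatrix} \right\|^2 \left\| P^\flat\, q \begin{pmatrix} f_2^+ \\ 0 \end{pmatrix} \right\|^2 = \omega^\sharp_P(f_1 \otimes f_1^+)\, \omega^\flat_P(f_2 \otimes f_2^+) \tag{5.1} \]

for any $f_i \in D_{\mathrm{sp}}$ ($i = 1, 2$). This inequality underlies the following lemma, which will be proved at the end of this section.

Lemma 5.1. The wave-front set of $\omega^{\sharp\natural\flat}$ satisfies

\[ \mathrm{WF}(\omega^{\sharp\natural\flat}) \subset (N^\sharp \cup Z) \times (-N^\flat \cup Z) \tag{5.2} \]

as a subset of $T^*(M \times M)$, where $N^\sharp = \{(p, \xi) \mid \xi \in N^\sharp_p\}$ and $Z$ is the zero section $Z = \{(p, 0) \mid p \in M\}$ of $T^*M$.

⁸ For this purpose, we regard $L^2(\mathbb{R}^+, d\lambda) \otimes \mathbb{C}^4$ as $L^2(X, d\lambda \otimes d\mu)$, where $X$ is the locally compact space $\mathbb{R}^+ \times \mathbb{Z}_4$ and $\mu$ is the counting measure on $\mathbb{Z}_4$. The measure $d\lambda \otimes d\mu$ is a Baire measure and the lemma may be applied.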


This is by no means a sharp estimate of the wave-front set, but it will suffice for our purposes. Defining uAB as in (4.2), uAB ω= G > is a scalar bi-distribution with wave-front set contained in the right-hand-side of Eq. (5.2). Now by Theorem 2.5.11 in [25], the pullback γ2∗ uAB ω= G > exists provided the intersection of its wave-front set with the set of normals Nγ2 of γ2 is empty. One may show that

$$N_{\gamma_2} = \{(\gamma(\tau), \xi; \gamma(\tau'), \xi') \in T^*(M \times M) \mid \xi_a u^a(\tau) = \xi'_b u^b(\tau') = 0\} \tag{5.3}$$

(see Sect. 3 of [8] where a corresponding argument is given). This has trivial intersection with WF (uAB ω= G > ), because no null covector can annihilate a timelike vector. Accordingly, γ2∗ uAB ω= G > exists in D (R2 ), and (again by Theorem 2.5.11 in [25]) its wave-front set obeys WF (γ2∗ uAB ω= G > ) ⊂ R × R= × R × −R> .

(5.4)

Since uAB is compactly supported, we may take Fourier transforms and conclude that =G> the WAB do indeed exist. Exactly the same argument, using the microlocal spectrum = condition (2.50) in place of (5.2), shows that the pull-backs γ2∗ uAB ωP exist with =

WF (γ2∗ uAB ωP ) ⊂ R × R= × R × −R= .

(5.5)

It remains to prove Lemmas 4.2 and 5.1. Proof of Lemma 4.2. An argument using regularising sequences in analogy with the proof of Theorem 2.2 in [8] shows that the inequality (5.1) is inherited by the pull-back and becomes |γ2∗ uAB ω= G > (f1 ⊗ f2 )|2 =

>

≤ γ2∗ uAA ωP (f1 ⊗ f1 ) γ2∗ uBB ωP (f2 ⊗ f2 ) ∀ f1 , f2 ∈ C0∞ (R)

(5.6)

(no sum on either A or B). Substituting f1 (t) = e−itλ , f2 (t ) = e−it λ , the required bounds (4.13) are obtained. As γ2∗ uAA ωP· is compactly supported, its set of singular directions (those directions in which its Fourier transform fails to decay rapidly) is given by K(γ2∗ uAA ωP· ) = {(λ, λ ) | (τ, λ; τ , λ ) ∈ WF (γ2∗ uAA ωP· ) for some (τ, τ ) ∈ R2 } = R+ × R −

(5.7)

(see Proposition 8.1.3 in [24]). Thus (−1, 1) is not a singular direction for γ2∗ uAA ωP· and we deduce that YA· (λ) is rapidly decaying at λ → +∞ and of polynomially bounded growth as λ → −∞ by the Paley-Wiener-Schwartz theorem [24]. An analogous argument shows that YA1 (λ) decays rapidly as λ → −∞ and is polynomially bounded as λ → +∞. , Proof of Lemma 5.1. Suppose (p1 , ξ1 ; p2 , −ξ2 ) ∈ WF (ω= G > ) with ξ1 ' = 0. We will show that =

(p1 , ξ1 ; p1 , −ξ1 ) ∈ WF (ωP ),

(5.8)

C. J. Fewster, R. Verch

from which it follows that ξ1 ∈ Np1 by the microlocal spectrum condition (2.50). To prove (5.8) fix charts (Ui , κi ) with pi ∈ Ui (i = 1, 2), and define ki so that ξi = t κi (pi )ki . Let χi be arbitrary smooth spinor fields compactly supported in Ui with χi (pi ) ' = 0. We will use the notation χij = χi ⊗ χj+ ;

κij = κi × κj .

(5.9)

Since ξ1 , and hence k1 , is nonzero, any conical neighbourhood V11 of (k1 , −k1 ) contains a set of formO1 ×−O1 , where O1 is a neighbourhood of k1 bounded away from ∧

=

−1 is not of rapid decay in the conic neighbourhood zero. We claim that χ11 ωP ◦ κ11 α>0 α (O1 × −O1 ) ⊂ V11 . Since V11 was arbitrary, we conclude that (k1 , −k1 ) is a = −1 singular direction for χ11 ωP ◦ κ11 ; letting the support of χ1 tend to {p1 }, Eq. (5.8) is established as required. To justify our claim above we apply (5.1) to

fj =

√

1 −|g| ◦ κj−1

χj e

i( . )Nj

◦ κj ,

(5.10)

and recall (1.3) to obtain ∧ ∧ χ12 ω= G > ◦ κ −1 (N1 , −N2 ) ≤ χ11 ω= ◦ κ −1 (N1 , −N1 ) 12 11 P ∧ > −1 (N2 , −N2 ). × χ22 ωP ◦ κ22

(5.11)

Now let O2 be any neighbourhood of k2 . By the Paley-Wiener-Schwartz theorem we have ∧ > −1 χ22 ωP ◦ κ22 (αN2 , −αN2 ) ≤ R(α) ∀N2 ∈ O2 (5.12) for some polynomial R; accordingly, ∧ ∧ = −1 −1 sup χ12 ω= G > ◦ κ12 (αN1 , −αN2 ) ≤ R(α) sup χ11 ωP ◦ κ11 (αN1 , −αN1 )

Nj ∈Oj

N∈O1

(5.13) for all α > 0. Were our claim false, the right-hand side of this equation would be rapidly ∧

−1 decaying as α → +∞ and we could infer that χ12 ω= G > ◦ κ12 was of rapid decay in the conical neighbourhood V12 = ∪α>0 α (O1 × −O2 ) of (k1 , −k2 ). But this would contradict the initial hypothesis that (p1 , ξ1 ; p2 , −ξ2 ) ∈ WF (ω= G > ), so the claim is proved. = We have therefore shown that (p1 , ξ1 ; p2 , −ξ2 ) ∈ WF (ω= G > ) implies ξ1 ∈ Np1 ∪{0}. > An exactly analogous argument shows that, in addition, ξ2 ∈ Np2 ∪ {0}. This completes the proof. , -


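6. Proof of Proposition 4.4

We now prove that the operators J on $L^2(\mathbb{R}^+, d\lambda) \otimes \mathbb{C}^4$ are bounded from below uniformly in the cut-off parameter. To begin, we consider the related operator K acting on $L^2(\mathbb{R}^+, d\lambda)$ by

$$(K\varphi)(\lambda) = \int_0^\infty d\lambda'\, \frac{\sigma(\lambda)\sigma(\lambda')}{8\pi^2}\, (\lambda + \lambda')\, \hat f(\lambda - \lambda')\, \varphi(\lambda'). \tag{6.1}$$

If the spin-connection terms vanished, J would be equal to K ⊗ 1. Our analysis of K is based on the following identity.

Lemma 6.1. If $f = g^2$ for real-valued $g \in C_0^\infty(\mathbb{R})$, then

$$(\lambda + \lambda')\, \hat f(\lambda - \lambda') = \int_{-\infty}^{\infty} \frac{d\mu}{\pi}\, \mu\, \hat g(\lambda - \mu)\, \overline{\hat g(\lambda' - \mu)}. \tag{6.2}$$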

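Proof. Note first that the right-hand side (RHS) exists for each λ, λ′ ∈ R. Changing variables to ν = µ − (λ + λ′)/2 and writing ζ = (λ − λ′)/2,

$$\text{RHS of (6.2)} = \int_{-\infty}^{\infty} \frac{d\nu}{\pi} \left( \frac{\lambda + \lambda'}{2} + \nu \right) \hat g(\zeta - \nu)\, \hat g(\zeta + \nu) = (\lambda + \lambda')(\hat g \star \hat g)(2\zeta) = (\lambda + \lambda')\, \hat f(\lambda - \lambda') \tag{6.3}$$

as required, where we have also used $\hat g(u) = \overline{\hat g(-u)}$, the fact that $\nu\, \hat g(\zeta - \nu)\, \hat g(\zeta + \nu)$ is odd in ν, and a further change of variables. In addition, we have used ⋆ to denote the convolution $(h_1 \star h_2)(\lambda) = \int \frac{d\lambda'}{2\pi}\, h_1(\lambda - \lambda')\, h_2(\lambda')$. □

It follows from this identity that the kernel of K may be rewritten in the form

$$K(\lambda, \lambda') = \frac{\sigma(\lambda)\sigma(\lambda')}{8\pi^3} \int_{-\infty}^{\infty} d\mu\, \mu\, \hat g(\lambda - \mu)\, \overline{\hat g(\lambda' - \mu)}. \tag{6.4}$$

We now define $K^{\pm}$ to have the kernels

$$K^{+}(\lambda, \lambda') = \frac{\sigma(\lambda)\sigma(\lambda')}{8\pi^3} \int_{-\infty}^{\infty} d\mu\, |\mu|\, \hat g(\lambda - \mu)\, \overline{\hat g(\lambda' - \mu)} \tag{6.5}$$

and

$$K^{-}(\lambda, \lambda') = -\frac{2\sigma(\lambda)\sigma(\lambda')}{8\pi^3} \int_{-\infty}^{0} d\mu\, \mu\, \hat g(\lambda - \mu)\, \overline{\hat g(\lambda' - \mu)}. \tag{6.6}$$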
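The identity of Lemma 6.1 can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the Fourier convention $\hat g(\lambda) = \int g(t) e^{-i\lambda t}\,dt$ and uses the Gaussian $g(t) = e^{-t^2/2}$ as a stand-in for a compactly supported test function (so that $\hat g$ and $\hat f$ are known in closed form and $\hat g$ is real). The function names and quadrature parameters are illustrative choices.

```python
import math

def g_hat(lam):
    # Fourier transform of the assumed test function g(t) = exp(-t^2 / 2);
    # g is real and even, so g_hat is real and equals its own conjugate.
    return math.sqrt(2.0 * math.pi) * math.exp(-lam * lam / 2.0)

def f_hat(lam):
    # Fourier transform of f = g^2 = exp(-t^2).
    return math.sqrt(math.pi) * math.exp(-lam * lam / 4.0)

def lhs_6_2(lam, lam_p):
    # Left-hand side of (6.2): (lam + lam') * f_hat(lam - lam').
    return (lam + lam_p) * f_hat(lam - lam_p)

def rhs_6_2(lam, lam_p, half_width=12.0, steps=4000):
    # Trapezoidal approximation of
    #   int dmu/pi  mu * g_hat(lam - mu) * conj(g_hat(lam' - mu)),
    # on a window centred at (lam + lam')/2, where the integrand is concentrated.
    centre = 0.5 * (lam + lam_p)
    a, b = centre - half_width, centre + half_width
    h = (b - a) / steps
    total = 0.0
    for i in range(steps + 1):
        mu = a + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * mu * g_hat(lam - mu) * g_hat(lam_p - mu)
    return total * h / math.pi

if __name__ == "__main__":
    for lam, lam_p in [(1.0, 2.0), (0.3, 4.0), (2.5, 2.5)]:
        print(lam, lam_p, lhs_6_2(lam, lam_p), rhs_6_2(lam, lam_p))
```

Splitting the same integral at µ = 0 is what produces the two positive pieces $K^{+}$ and $K^{-}$ of (6.5) and (6.6).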

The integrals in these kernels are bounded on compact subsets of R+ ×R+ , so the cut-off ± + − functions σ ensure that K and K are Hilbert-Schmidt. Clearly K = K − K ; furthermore, the easily proven identity 2 ∞ µ ∞ − dµ 3 dλ g (λ + µ)σ (λ )ϕ(λ ) (6.7) ϕ | K ϕ = 4π 0 0


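(valid, say, for $\varphi \in C_0^\infty(\mathbb{R}^+)$) shows that $K^{-}$ is positive. A similar argument establishes positivity of $K^{+}$.

Our aim is now to find a bound on $K^{-}$ which will allow a bound uniform in the cut-off parameter to be obtained. (The operator $K^{+}$ is bounded for each value of the cut-off parameter, e.g., by its Hilbert–Schmidt norm, but becomes unbounded in the limit as that parameter tends to ∞.) Regarding the inner integral in (6.7) as an $L^2$-inner product and applying Cauchy–Schwarz, we obtain

$$\langle \varphi \mid K^{-} \varphi \rangle \le C\, \|\varphi\|^2, \tag{6.8}$$

where

$$C = \frac{1}{4\pi^3} \int_{\mathbb{R}^+ \times \mathbb{R}^+} d\mu\, d\lambda'\, \mu\, |\hat g(\mu + \lambda')\, \sigma(\lambda')|^2 = \int_0^\infty du\, |\hat g(u)|^2\, F(u) \tag{6.9}$$

and

$$F(u) = \frac{1}{4\pi^3} \int_0^u d\lambda'\, (u - \lambda')\, \sigma(\lambda')^2 \tag{6.10}$$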

is bounded and nonnegative. Let us observe that this step depends in an essential way on the fact that, for µ > 0, the argument of g in (6.7) is bounded away from zero, together with the rapid decay property of g. The above analysis entails that −C is a lower bound for K , but this is certainly − not the sharpest bound. In fact essentially the same argument applies if K is replaced + + − 1 − by 2 K and K is adjusted to maintain K = K − K , with the conclusion that −C /2 is also a lower bound for K . The convenience of the choices made above is + that, as we now show, the operator L = J − K ⊗ 1 is form bounded relative to K with relative bound no greater than 21 . To this end, we first use the convolution theorem to write 1 ∞ ϕ | L ϕ = dτ [σ ϕ]∨ (τ )† f (τ )θ (τ )[σ ϕ]∨ (τ ) (6.11) 2 −∞ for ϕ ∈ C0∞ (R+ ) ⊗ C4 , where † denotes the matrix hermitian conjugate. Setting C = sup .θ (τ ).C4 , we then estimate

τ ∈R

2 C ∞ dτ (g[σ ϕ]∨ )(τ )C4 2 −∞ 2 C ∞ dµ (g[σ ϕ]∨ )∧ (µ)C4 ≤ 2 −∞ 2π 2 C ∞ dµ ∞ dλ = g (µ − λ )σ (λ )ϕ(λ ) . 2 −∞ 2π 0 2π C4

|ϕ | L ϕ | ≤

(6.12)

+ By comparison with the definition of K , this implies

|ϕ | L ϕ | ≤

1 + ⊗ 1)ϕ ϕ | (K 2 C dµ + 16π 3 |µ|

∞ 0

2 dλ g (µ − λ )σ (λ )ϕ(λ ) . (6.13) 4

C


Using a similar argument to that used to obtain (6.8), the last term may be bounded by .ϕ.2 , where C ∞ C dµ dλ | g (µ + λ )σ (λ )|2 16π 3 |µ|
= C

−C

(6.14)

and the bounded nonnegative function G is given on [−C , ∞) by C G (u) = 16π 3

u+C

max{u−C ,0}

dλ σ (λ )2 .

(6.15)

+ + − Thus L is form bounded relative to K as claimed above. Since J = K −K +L , we have + − |ϕ | (J − K ⊗ 1)ϕ | ≤ |ϕ | L ϕ | + |ϕ | K ϕ | 1 + ≤ ϕ | (K ⊗ 1)ϕ + (C + C ).ϕ.2 2

(6.16)

+ ). and, as K is positive, it follows that J is bounded below by −(C + C may both To complete the proof of Proposition 4.4 we must show that C and C be bounded above uniformly in . This holds because σ (λ) is pointwise dominated by σ (λ) and thus F and G are pointwise dominated by the functions F and G obtained by replacing σ by σ in (6.10) and (6.15). Furthermore, F and G are of polynomially bounded growth so we obtain the bounds ∞ C ≤ du| g (u)|2 F (u) < ∞ (6.17) 0

and C

≤

∞ −C

du | g (u)|2 G(u) < ∞

(6.18)

for all , where the rapid decay property of g has been used. 7. Majorana Fields In this section we will indicate the changes required in Sect. 3 when treating Majorana fields. Suppose that γ0 , . . . , γ3 belong to a Majorana representation and that H is the completion of Dcosp /ker S✁ with scalar product given by (h, h )S = (S✁ h, h ), cf. Remark (iii) at the end of Sect. 2.2. The field operators are then given by 5(h) = B(q(h));

5 + (f ) = 5(f + )∗ = 5(1f + )

h ∈ Dcosp , f ∈ Dsp ,

(7.1)

where q : Dcosp → Dcosp /ker S✁ is the quotient map. Now, as in Sect. 3 one may choose, in a tubular neighbourhood of any timelike curve γ , induced frames (e0 , . . . , e3 ) and (EA )so that e0 |γ = u, the tangent of γ in proper time parametrization. However, it is


now convenient to set vA = iEA , for then we have the two properties δ AB vA ⊗ vB+ = γ0

and

+ + 1vA = vA

(7.2)

as we are working in a Majorana representation. To quantize the point-split energy density (3.4), we may substitute from (7.1) and + ) for f ∈ D(M) to obtain use the formula 1[∇ · (e0 vA f )]+ = ∇ · (f e0 vA T (f ⊗ g) = δ AB

i + + ])5(vB+ g) − 5(f vA )5(∇ · [e0 vB+ g]) 5(∇ · [f e0 vA 2

(7.3)

as the replacement for (3.6). The CAR’s may be used to show that : T : ω (f ⊗ g) is symmetric in f, g. + Using the formula e0 · ∇vB+ = σ0 C B vC , it follows that T ω may be expressed in the form + ⊗ vB+ ω2 , T ω = LAB vA

(7.4)

where ω2 (h, h ) = ω(5(h)5(h )) is the two-point function and LAB =

1 1 (1 ⊗ ie0 · ∇ − ie0 · ∇ ⊗ 1) δ AB + AAB , 2 2

(7.5)

with AAB = i[δ CB σ0 A C ⊗ 1 − 1 ⊗ δ AC σ0 B C ].

(7.6)

The components of σ0 are real in a Majorana representation and it follows that θ AB (τ ) = + γ2∗ AAB (τ, τ ) is hermitian for each τ . Moreover, vA ⊗ vB+ ω2 is of matrix positive type as shown by the calculation + + A ⊗ vB+ ω2 )(f A ⊗ f B ) = ω2 (1[vA f ] ⊗ vB+ f B ) (vA + ∗ = ω(5(vA f ) 5(vB+ f B )) ≥ 0

(7.7)

+ + = vA and (7.1). in which we have used 1vA From this point onwards, one may proceed to define : ρ : ω as in Sect. 3, and the statement and proof of Thm. 4.1 carry over (apart from some obvious changes) upon observing that Hadamard states ω of the Majorana field obey

$$\mathrm{WF}(\omega_2) = \{(p, \xi; p', -\xi') \in \dot T^*(M \times M) \mid (p, \xi) \sim (p', \xi');\ \xi \in N_p^{+}\} \tag{7.8}$$

(cf. [38], note again that in this reference a different sign convention for the Fourier transform was chosen which results in the opposite form of the wave-front set there).


8. Conclusion

In this paper, we have established general QWEIs for the Dirac and Majorana fields in globally hyperbolic spacetimes. We conclude with various remarks.

First, these QWEIs hold despite the fact that the “classical” Dirac equation fails to obey the weak energy condition. This is encouraging evidence that QWEIs are a widespread feature of all quantum field theories and that they are the correct replacement for the classical energy conditions. It would be interesting to understand whether QWEIs can be obtained in a general axiomatic setting.

A second point was posed to us by Buchholz (private communication): given a weight in W the corresponding averaged energy density may be defined as a symmetric operator on a suitable dense domain (such as the domain of microlocal smoothness introduced in [2]) in some Hilbert space representation. The force of our result (and the corresponding result in [8]) is that such operators are semibounded, and therefore admit self-adjoint extensions (in particular, the Friedrichs extension). Can one give any interpretation to the evolution generated by this operator? The answer is not clear, but we speculate that there could be a connection with the dynamics discussed by Keyl [27] in his recent study of quantum fields on timelike curves. The connection is somewhat tentative (in particular, the role of the weight must be understood), and would only be expected to hold under restricted conditions such as for static trajectories in static spacetimes. Nonetheless, it remains an intriguing possibility.

Finally, although our approach does lead to explicit lower bounds on the various contributions to the averaged energy density, these are not expected to be optimal. We hope to return to this question elsewhere.

Acknowledgement. We thank the organisers of the meeting on Microlocal Analysis and Quantum Field Theory at the Mathematisches Forschungsinstitut Oberwolfach, where this work was commenced. CJF thanks Simon Eveson, Alfredo Calvo Pereira and Stefan Hollands for useful discussions, and is grateful to the Institut für Theoretische Physik in Göttingen for hospitality in the later stages of the work. We also thank Detlev Buchholz for raising the issue discussed above. The work of CJF was assisted by a grant from the Nuffield Foundation.

References

1. Araki, H.: On quasifree states of CAR and Bogoliubov transformations. Publ. RIMS 6, 385 (1970/71)
2. Brunetti, R., Fredenhagen, K.: Microlocal analysis and interacting quantum field theories: renormalization on physical backgrounds. Commun. Math. Phys. 208, 623 (2000)
3. Davies, P.C.W., Fulling, S.A.: Radiation from moving mirrors and from black holes. Proc. Roy. Soc. A356, 237 (1977)
4. Deser, S.: Improvement versus stability in gravity-scalar coupling. Phys. Lett. 134B, 419 (1984)
5. Dimock, J.: Dirac quantum fields on a manifold. Trans. Am. Math. Soc. 269, 133 (1982)
6. Epstein, H., Glaser, V., Jaffe, A.: Nonpositivity of the energy density in quantized field theories. Nuovo Cimento 36, 1016 (1965)
7. Fewster, C.J., Eveson, S.P.: Bounds on negative energy densities in flat spacetime. Phys. Rev. D 58, 084010 (1998)
8. Fewster, C.J.: A general worldline quantum inequality. Class. Quantum Grav. 17, 1897 (2000)
9. Fewster, C.J., Teo, E.: Bounds on negative energy densities in static spacetimes. Phys. Rev. D 59, 104016 (1999)
10. Fewster, C.J., Teo, E.: Quantum inequalities and quantum interest as eigenvalue problems. Phys. Rev. D 61, 084012 (2000)
11. Flanagan, É.É.: Quantum inequalities in two-dimensional Minkowski spacetime. Phys. Rev. D 56, 4922 (1997)


12. Flanagan, É.É., Wald, R.M.: Does backreaction enforce the averaged null energy condition in semiclassical gravity? Phys. Rev. D 54, 6233 (1996)
13. Folacci, A.: Averaged-null-energy condition for electromagnetism in Minkowski spacetime. Phys. Rev. D 46, 2726 (1992)
14. Ford, L.H.: Quantum coherence effects and the second law of thermodynamics. Proc. Roy. Soc. Lond. A364, 227 (1978)
15. Ford, L.H., Roman, T.A.: Averaged energy conditions and quantum inequalities. Phys. Rev. D 51, 4277 (1995)
16. Ford, L.H., Roman, T.A.: Restrictions on negative energy density in flat spacetime. Phys. Rev. D 55, 2082 (1997)
17. Ford, L.H., Roman, T.A.: Classical Scalar Fields and the Generalized Second Law. Phys. Rev. D 64, 024023 (2001)
18. Ford, L.H., Roman, T.A.: Quantum field theory constrains traversable wormhole geometries. Phys. Rev. D 53, 5496 (1996)
19. Fulling, S.A., Narcowich, F.J., Wald, R.M.: Singularity structure of the two-point function in quantum field theory in curved spacetime, II. Ann. Phys. (N.Y.) 136, 243 (1981)
20. Glaeser, G.: Racine carrée d’une fonction différentiable. Ann. Inst. Fourier, Grenoble 13, 203 (1963)
21. Hawking, S.W., Ellis, G.F.R.: The large scale structure of space-time. Cambridge: Cambridge University Press, 1973
22. Helfer, A.D.: The Hamiltonians of Linear Quantum Fields: II. Classically Positive Hamiltonians. arXiv:hep-th/9908012
23. Hollands, S.: The Hadamard condition for Dirac fields and adiabatic states on Robertson-Walker spacetimes. Commun. Math. Phys. 216, 635 (2001)
24. Hörmander, L.: The analysis of linear partial differential operators I. Berlin: Springer-Verlag, 1983
25. Hörmander, L.: Fourier integral operators. I. Acta Math. 127, 79 (1971)
26. Klinkhammer, G.: Averaged energy conditions for free scalar fields in flat space-time. Phys. Rev. D 43, 2542 (1991)
27. Keyl, M.: Quantum fields on timelike curves. arXiv:math-ph/0012024
28. Köhler, M.: The stress energy tensor of a locally supersymmetric quantum field on a curved spacetime. Dissertation, Hamburg University, 1995. Preprint DESY-95-080, arXiv:gr-qc/9505014
29. Kratzert, K.: Singularity structure of the two-point function of the free Dirac field on a globally hyperbolic spacetime. Annalen Phys. 9, 475 (2000)
30. Najmi, A.-H., Ottewill, A.C.: Quantum states and the Hadamard form II, Energy minimization for spin 1/2 fields. Phys. Rev. D 30, 2573 (1984)
31. Pfenning, M.J.: Quantum inequality restrictions on negative energy densities in curved spacetimes. Ph.D. thesis, Tufts University, 1998. Preprint arXiv:gr-qc/9805037
32. Pfenning, M.J.: Quantum inequalities for the electromagnetic field. Phys. Rev. D 65, 024009 (2002)
33. Pfenning, M.J., Ford, L.H.: The unphysical nature of “warp drive”. Class. Quantum Grav. 14, 1743 (1997)
34. Pfenning, M.J., Ford, L.H.: Scalar field quantum inequalities in static spacetimes. Phys. Rev. D 57, 3489 (1998)
35. Radzikowski, M.J.: Micro-local approach to the Hadamard condition in quantum field theory in curved spacetime. Commun. Math. Phys. 179, 529 (1996)
36. Reed, M., Simon, B.: Methods of modern mathematical physics, Vol. III. San Diego: Academic Press, 1979
37. Sahlmann, H., Verch, R.: Passivity and microlocal spectrum condition. Commun. Math. Phys. 214, 705 (2000)
38. Sahlmann, H., Verch, R.: Microlocal spectrum condition and Hadamard form for vector-valued quantum fields in curved spacetime. Rev. Math. Phys. 13, 1203 (2001)
39. Schoen, R., Yau, S.-T.: Proof of the positive mass theorem. II. Commun. Math. Phys. 79, 231 (1981)
40. Taylor, M.E.: Pseudodifferential operators. Princeton: Princeton University Press, 1981
41. Tipler, F.J.: Energy conditions and spacetime singularities. Phys. Rev. D 17, 2521 (1978)
42. Verch, R.: The averaged null energy condition for general quantum field theories in two dimensions. J. Math. Phys. 41, 206 (2000)
43. Visser, M., Barcelo, C.: Energy conditions and their cosmological implications. arXiv:gr-qc/0001099
44. Vollick, D.N.: Negative energy density states for the Dirac field in flat space-time. Phys. Rev. D 57, 3484 (1998)
45. Vollick, D.N.: Quantum inequalities in curved two dimensional spacetime. Phys. Rev. D 61, 084022 (2000)


46. Wald, R.M., Yurtsever, U.: General proof of the averaged null energy condition for a massless scalar field in two-dimensional curved spacetime. Phys. Rev. D 44, 403 (1991)
47. Weinless, M.: Existence and uniqueness of the vacuum for linear quantized fields. J. Funct. Anal. 4, 350 (1969)
48. Witten, E.: A new proof of the positive energy theorem. Commun. Math. Phys. 80, 381 (1981)
49. Yurtsever, U.: Averaged null energy condition and difference inequalities in quantum field theory. Phys. Rev. D 51, 5797 (1995)
50. Yurtsever, U.: Remarks on the averaged null energy condition in quantum field theory. Phys. Rev. D 52, R564 (1995)

Communicated by H. Araki

Commun. Math. Phys. 225, 361 – 397 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

Non-linear Stability of Modulated Fronts for the Swift–Hohenberg Equation

J.-P. Eckmann1,2, G. Schneider3

1 Dépt. de Physique Théorique, Université de Genève, 1211 Genève 4, Switzerland
2 Section de Mathématiques, Université de Genève, 1211 Genève 4, Switzerland
3 Mathematisches Institut, Universität Bayreuth, 95440 Bayreuth, Germany

Received: 23 February 2001 / Accepted: 27 August 2001

Abstract: We consider front solutions of the Swift–Hohenberg equation $\partial_t u = -(1 + \partial_x^2)^2 u + \varepsilon^2 u - u^3$. These are traveling waves which leave in their wake a periodic pattern in the laboratory frame. Using renormalization techniques and a decomposition into Bloch waves, we show the non-linear stability of these solutions. It turns out that this problem is closely related to the question of stability of the trivial solution for the model problem $\partial_t u(x, t) = \partial_x^2 u(x, t) + (1 + \tanh(x - ct))\, u(x, t) + u(x, t)^p$ with p > 3. In particular, we show that the instability of the perturbation ahead of the front is entirely compensated by a diffusive stabilization which sets in once the perturbation has hit the bulk behind the front.

Contents

1. Statement of the Problem
Part I. A Simplified Problem
2. The Model Equation
3. The Linear Simplified Problem
4. The Renormalization Approach for the Simplified Problem
Part II. The Swift–Hohenberg Equation
5. Bloch Waves
6. The Linearized Problem
7. The Renormalization Process for the Full Problem


1. Statement of the Problem

We consider the Swift–Hohenberg equation

$$\partial_t u = -(1 + \partial_x^2)^2 u + \varepsilon^2 u - u^3, \tag{1.1}$$

with u(x, t) ∈ R, x ∈ R, t ≥ 0 and 0 < ε ≪ 1 a small bifurcation parameter. It has been shown some time ago that a 2-parameter family of (small) spatially periodic solutions exists which are independent of t. These solutions correspond to a periodic pattern which exists in the laboratory frame. These solutions are of the form

$$U_{q,a}(x) = A_q \cos\big((1 + \varepsilon q)x + a\big) + O(\varepsilon^2),$$

which bifurcate from the solution u ≡ 0. Here, $A_q = 2\varepsilon(1 - 4q^2)^{1/2}$. It is furthermore well-known and proved in [CE90a] that these solutions are marginally stable for $4|q|^2 \le \frac{1}{3}$, the so-called Eckhaus stability range ([Eck65]), and that the spectrum of the linearization about these solutions is all of $\mathbb{R}^-$. Finally, after a long time it was shown in [Schn96] that these solutions are also non-linearly stable, and this proof was presented in a slightly different form in [EWW97].

In another direction, in earlier work of [CE86] and [EW91] traveling wave solutions of a special kind leaving a fixed pattern in the laboratory space were shown to exist, and their linear stability was studied in [CE87]. Our present paper is concerned with a first proof of the non-linear stability of these traveling solutions.

We first describe the traveling solutions. One way to view them is to write

$$U_{q,a}(x) = \frac{1}{2} \sum_{n \in 2\mathbb{Z}+1} A_{q,n}\, e^{in((1+\varepsilon q)x + a)},$$

where $A_{q,1} = A_q$ as defined above and $A_{q,-n} = \bar A_{q,n}$, with $\bar x$ the complex conjugate of x. Here, the $A_{q,n}$ are in fact $O(\varepsilon^{|n|})$, and furthermore $U_{q,a}$ extends to an analytic function. The modulated front solutions are then of the form $u(x, t) = F_{c,q,a}(x - ct, x)$, with

$$F_{c,q,a}(\xi, x) = \frac{1}{2} \sum_{n \in 2\mathbb{Z}+1} W_{c,q,n}(\xi)\, e^{in((1+\varepsilon q)x + a)}. \tag{1.2}$$

Note that these are not classical traveling waves of the form u(x − ct), and note furthermore that $F_{c,q,a}$ is periodic in its second argument (with period 2π/(1 + εq)). The modulated front solutions satisfy [CE86, EW91], when c > 0:

$$\lim_{\xi \to -\infty} W_{c,q,n}(\xi) = A_{q,n}, \qquad \lim_{\xi \to \infty} W_{c,q,n}(\xi) = 0.$$

These modulated front solutions are constructed with the help of a center manifold reduction, where all $W_{c,q,n}$ are determined by the central modes $W_{c,q,\pm 1}$. In the reduced four-dimensional system for $W_{c,q,\pm 1} = W_{c,q,\pm 1}(\xi)$ there is a heteroclinic connection lying in the intersection of a four-dimensional stable manifold of the origin and a two-dimensional unstable manifold of an equilibrium corresponding to $U_{q,a}$. Since this is a very robust situation these solutions can be constructed by some perturbation analysis from the ones for q = 0. For small ε and q = 0 the solution $W_{c,0,1}$ of the amplitude equation on the center manifold is close to the real-valued front solution $W_{c,0,1}(\xi) = \varepsilon B(\varepsilon \xi) = \varepsilon B(\zeta)$ of the equation

$$4 \partial_\zeta^2 B + c_B\, \partial_\zeta B + B - 3B|B|^2 = 0,$$

connecting $W_{c,0,1} = 0$ at ζ = +∞ with $W_{c,0,1} = A_0$ at ζ = −∞. The constant $c_B$ is given by $c_B = \varepsilon^{-1} c = O(1)$.

Our paper deals with the question: Under which conditions does the solution of (1.1) with initial data $F_{c,q,a}(x, x) + v(x)$ converge to $F_{c,q,a}(x - ct, x)$ as t → ∞?

We will show our results for the case q = 0 and a = 0 only, to keep the notation on a reasonable level. The extension to arbitrary a is trivial by translating the origin, while the extension to arbitrary q satisfying $4|q|^2 < \frac{1}{3}$ necessitates some notational work and leads to bounds which depend on q. Thus, we will write the periodic solution as

$$U_*(x) = A \cos x + O(\varepsilon^2), \tag{1.3}$$

with A = 2ε, and the modulated front (moving with speed c = O(ε)) as

$$F_c(\xi, x) = \frac{1}{2} \sum_{n \in 2\mathbb{Z}+1} W_{c,n}(\xi)\, e^{inx}.$$

We describe next the nature of the stability problem. Consider an initial condition $u_0(x) = F_c(x, x) + v_0(x)$, and let u(x, t) denote the solution of (1.1) with that initial condition. Since $F_c$ solves (1.1), we find for the evolution of $v(x, t) \equiv u(x, t) - F_c(x - ct, x)$:

$$\partial_t v(x, t) = (Lv)(x, t) - 3F_c(x - ct, x)^2\, v(x, t) - 3F_c(x - ct, x)\, v(x, t)^2 - v(x, t)^3. \tag{1.4}$$

Here, $L = -(1 + \partial_x^2)^2 + \varepsilon^2$. We define the translation operator $\tau_{ct}$ by $(\tau_{ct} f)(x) = f(x - ct, x)$, so that (1.4) can be written as

$$\partial_t v = Lv - 3(\tau_{ct} F_c)^2 v - 3(\tau_{ct} F_c) v^2 - v^3. \tag{1.5}$$

Introduce now $K_{ct}$ (the difference between the modulated front and the periodic solution) by

$$K_{ct}(x) = \tau_{ct} F_c(x) - U_*(x) = F_c(x - ct, x) - U_*(x). \tag{1.6}$$

Note that $K_{ct}(x)$ vanishes as x → −∞, and approaches $-U_*(x)$ as x → ∞. With these notations we can rewrite (1.5) as

$$\partial_t v = Lv - 3U_*^2 v - 6U_* K_{ct} v - 3K_{ct}^2 v - 3U_* v^2 - v^3 - 3K_{ct} v^2 = Mv + M_i v + N(v) + N_i(v), \tag{1.7}$$

where

$$Mv = Lv - 3U_*^2 v, \qquad M_i v = -6U_* K_{ct} v - 3K_{ct}^2 v, \qquad N(v) = -3U_* v^2 - v^3, \qquad N_i(v) = -3K_{ct} v^2. \tag{1.8}$$
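The regrouping from (1.5) to (1.7) and (1.8) is pure pointwise algebra in the values of $U_*$, $K_{ct}$ and v, so it can be sanity-checked with random numbers. The sketch below is only an illustration under that reading: the function names are hypothetical, and the linear part Lv is left out because it appears unchanged on both sides.

```python
import random

def full_nonlinear_terms(U, K, v):
    # Pointwise terms of (1.5) other than L v, with tau_ct F_c = U + K:
    #   -3 F^2 v - 3 F v^2 - v^3  =  -(F + v)^3 + F^3
    F = U + K
    return -(F + v) ** 3 + F ** 3

def split_terms(U, K, v):
    # The four groups of (1.8), again without the common linear part L v.
    M_part = -3 * U**2 * v                   # from M v = L v - 3 U^2 v
    Mi_part = -6 * U * K * v - 3 * K**2 * v  # M_i v
    N_part = -3 * U * v**2 - v**3            # N(v)
    Ni_part = -3 * K * v**2                  # N_i(v)
    return M_part, Mi_part, N_part, Ni_part

if __name__ == "__main__":
    random.seed(0)
    U, K, v = (random.uniform(-1, 1) for _ in range(3))
    print(full_nonlinear_terms(U, K, v), sum(split_terms(U, K, v)))
```

Setting K = 0, which is the situation far behind the front, makes the two terms carrying the index i vanish and leaves only M and N.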


The variables with index i vanish with some exponential rate for fixed x ∈ R in the laboratory frame. They will be seen to be exponentially “irrelevant” in terms of a renormalization group analysis. In order to explain this renormalization problem, we will study, in the next section, the model problem

$$\partial_t u(x, t) = \partial_x^2 u(x, t) + a(x - ct)\, u(x, t) + u(x, t)^p,$$

with $a(\xi) = \frac{1}{2}(1 + \tanh \xi)$, and p > 3. This problem is nice in its own right. The similitude will come from the correspondence of M with $\partial_x^2$, and of $M_i v$ with the term a(x − ct)u(x, t). Indeed:

– the first term will be seen to be diffusive in the laboratory frame,
– the second term will be seen to be irrelevant in the laboratory frame, but the first together with the second term will be exponentially damping in a suitable space of exponentially decaying functions in a frame moving with a speed close to c.

As in previous work [Sa77, BK94, Ga94, EW94] our analysis will be based on an interplay of estimates obtained in these two topologies. Our main results are stated in Theorem 4.1 for the simplified problem and in Theorem 7.1 for the Swift–Hohenberg problem. We not only show convergence to the front, but also give precise first order estimates in both cases. As far as possible, the treatment of the two problems is done in analogous fashion, so that the reader who has followed the proof of the simplified problem should have no difficulty in reading the proof for the full, more complicated, problem.

Remark. An ideal treatment of this problem would necessitate a norm in a frame moving with the same speed as the front. Such a space is needed to study the stability of so-called critical fronts (moving at the minimal possible speed where they are linearly stable). Achieving this aim seems to be a necessary step in solving the long-standing problem of “front selection” [DL83], in a case where the maximum principle [AW78] is not available.

Remark. The method also applies to more complicated systems, like hydrodynamic stability problems. Typical examples are the fronts connecting the Taylor vortices with the Couette flow in the Taylor-Couette problem. These fronts have been constructed in [HS99]. The stability of the spatially periodic Taylor vortices has been shown in [Schn98].

Notation. Throughout this paper many different constants are denoted with the same symbol C.

Part I. A Simplified Problem

2. The Model Equation

Let

$$a(\xi) = \tfrac{1}{2}(1 + \tanh \xi). \tag{2.1}$$

We want to study the equation

$$\partial_t u(x, t) = \partial_x^2 u(x, t) + a(x - ct)\, u(x, t) + u(x, t)^p, \tag{2.2}$$

with c > 0 and p > 3. For notational simplicity we assume p ∈ N. To understand the dynamics of (2.2) it might be useful to consider the following simplified problem:

$$\partial_t v(x, t) = \partial_x^2 v(x, t) + \vartheta(x - ct)\, v(x, t), \tag{2.3}$$

where ϑ(z) = 1 when z > 0 and ϑ(z) = 0 when z < 0. If we go to the moving frame ξ = x − ct and let w(ξ, t) = v(ξ + ct, t) = v(x, t), then the equation for w becomes

$$\partial_t w(\xi, t) = \partial_\xi^2 w(\xi, t) + c\, \partial_\xi w(\xi, t) + \vartheta(\xi)\, w(\xi, t). \tag{2.4}$$

For ξ > 0, we have ϑ(ξ) = 1 and hence the corresponding characteristic polynomial for (2.4) (in momentum space) is $-k^2 + ick + 1$, while for ξ < 0, we have ϑ(ξ) = 0 with its corresponding polynomial $-k^2 + ick$. Thus, we expect the solution to be exponentially unstable ahead of the front, i.e., for ξ > 0, and diffusively stable behind the front. If we consider an initial condition $v_0(\xi)$ localized near $\xi = \xi_0 > 0$, and of amplitude A, then we expect the amplitude to grow like $e^{t} A$ until $t = t_* = \xi_0/c$, when this perturbation “hits” the back of the front (in the moving frame), or, in other words, when the back of the front hits the perturbation (in the laboratory frame). Thus, the perturbation does not grow larger than $A e^{\xi_0/c}$. We use this in the following way. Assume that the amplitude at ξ > 0 is bounded by $A e^{-\beta \xi}$. Then, ignoring diffusion, we find that the contribution to the amplitude at the origin at time $t = \xi_0/c$ is bounded by

$$\int_0^{\xi_0} d\xi\, A\, e^{\xi(1 - \beta c)/c}.$$

Clearly, if βc > 1, the initial perturbations are sufficiently small for the total effect at the origin (in the moving frame) to be small. Once this has happened, a second epoch starts where the perturbation is behind the front. Then, due to the diffusive behavior, the amplitude will go down as

$$\frac{C}{(t - t_* + 1)^{1/2}}.$$

These considerations will be used in the choice of topology below.
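The heuristic above can be made concrete with a few lines of arithmetic. The sketch below is only an illustration of that reasoning: it evaluates the characteristic polynomial of (2.4) ahead of and behind the front, and the accumulated amplitude integral together with its limit as $\xi_0 \to \infty$, which is finite exactly when βc > 1. All function names and the sample values of A, β, c are assumptions made for the example.

```python
import math

def growth_rate(k, theta, c):
    # Inserting w = exp(i k xi + lam t) into (2.4) with constant theta gives
    #   lam = -k^2 + i c k + theta   (theta = 1 ahead of the front, 0 behind).
    return complex(-k * k + theta, c * k)

def accumulated_amplitude(A, beta, c, xi0):
    # Closed form of  int_0^{xi0} A exp(xi (1 - beta c) / c) d xi.
    r = (1.0 - beta * c) / c
    if abs(r) < 1e-14:
        return A * xi0
    return A * math.expm1(r * xi0) / r

def accumulated_amplitude_bound(A, beta, c):
    # Limit xi0 -> infinity of the integral above; finite only if beta c > 1.
    if beta * c <= 1.0:
        return math.inf
    return A * c / (beta * c - 1.0)

if __name__ == "__main__":
    c = 0.5
    print("ahead, k = 0: ", growth_rate(0.0, 1.0, c))   # positive real part
    print("behind, k = 0.5:", growth_rate(0.5, 0.0, c)) # negative real part
    for beta in (1.0, 3.0):
        print(beta, accumulated_amplitude(1.0, beta, c, 20.0),
              accumulated_amplitude_bound(1.0, beta, c))
```

With c = 0.5 the choice β = 3 gives βc = 1.5 > 1 and a bounded total contribution, while β = 1 gives βc = 0.5 < 1 and an integral that grows without bound in $\xi_0$.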

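2.1. Function spaces and Fourier transform. We start the precise analysis and will work in Fourier space and revert to the x-variables only at the end of the discussion. We define the Fourier transform by

$$(\mathcal{F} f)(k) = \frac{1}{2\pi} \int dx\, f(x)\, e^{-ikx}.$$

Notation. If f denotes a function, then $\tilde f$ is defined by $\tilde f = \mathcal{F} f$, and if A is an operator, then $\tilde A$ is defined by $\tilde A = \mathcal{F} A \mathcal{F}^{-1}$. We also use the notation $\tilde f * \tilde g$ for the convolution product

$$\widetilde{fg}\,(k) = (\tilde f * \tilde g)(k) = \int d\ell\, \tilde f(k - \ell)\, \tilde g(\ell).$$

Finally, $\tilde T_\zeta$ denotes the conjugate of translation: $(\tilde T_\zeta \tilde f)(k) = e^{-i\zeta k} \tilde f(k)$, so that the Fourier transform of $(T_\zeta f)(x) = f(x - \zeta)$ is

$$\mathcal{F} T_\zeta f = \tilde T_\zeta \mathcal{F} f. \tag{2.5}$$

The relation ([Ta97])

$$k^\alpha \partial_k^\beta (\mathcal{F} f)(k) = (-i)^{\alpha + \beta}\, \mathcal{F}(\partial_x^\alpha x^\beta f)(k)$$

motivates the introduction of the following norms: We fix a small δ > 0 and define

$$\|\tilde f\|_{\tilde H^{m,\delta}_n} = \Bigg( \sum_{\ell=0}^{n} \sum_{j=0}^{m} \delta^{2(\ell + j)} \int dk\, |\partial_k^j (k^\ell \tilde f(k))|^2 \Bigg)^{1/2}. \tag{2.6}$$

The dual norm to this is

$$\|f\|_{H^n_{m,\delta}} = \Bigg( \sum_{\ell=0}^{n} \sum_{j=0}^{m} \delta^{2(\ell + j)} \int dx\, |\partial_x^\ell f(x)|^2\, x^{2j} \Bigg)^{1/2}. \tag{2.7}$$

Parseval’s identity immediately leads to:

$$\|f\|_{H^n_{m,\delta}} = \|\mathcal{F} f\|_{\tilde H^{m,\delta}_n}.$$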

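In the following we mainly work with the spaces with m = n = 2 and m = 0, n = 2. For some constant C independent of 1 ≥ δ > 0,

$$\|fg\|_{H^2_{2,\delta}} \le C \|f\|_{H^2_{2,\delta}} \|g\|_{H^2_{2,\delta}}, \qquad \|\tilde f * \tilde g\|_{\tilde H^{2,\delta}_2} \le C \|\tilde f\|_{\tilde H^{2,\delta}_2} \|\tilde g\|_{\tilde H^{2,\delta}_2}, \tag{2.8}$$

or in a stronger version

$$\|fg\|_{H^2_{2,\delta}} \le C \|f\|_{H^2_{2,\delta}} \|g\|_{H^2_{0,\delta}}, \qquad \|\tilde f * \tilde g\|_{\tilde H^{2,\delta}_2} \le C \|\tilde f\|_{\tilde H^{2,\delta}_2} \|\tilde g\|_{\tilde H^{0,\delta}_2}. \tag{2.9}$$

Finally, we shall also need the inequality

$$\|\tilde f * \tilde g\|_{\tilde H^{0,\delta}_2} \le \|f\|_{C^2_{b,\delta}} \|\tilde g\|_{\tilde H^{0,\delta}_2}, \tag{2.10}$$

where

$$\|f\|_{C^2_{b,\delta}} = \sum_{j=0}^{2} \delta^j \sup_{x \in \mathbb{R}} |\partial_x^j f(x)|. \tag{2.11}$$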


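This follows from
$$\|\tilde f * \tilde g\|_{\tilde H^{0,\delta}_2} = \|f\cdot g\|_{H^{2}_{0,\delta}} \le \|f\|_{C^2_{b,\delta}}\|g\|_{H^{2}_{0,\delta}} = \|f\|_{C^2_{b,\delta}}\|\tilde g\|_{\tilde H^{0,\delta}_2},$$
where the inequality above is a direct consequence of the definition of $\tilde H^{0,\delta}_2$. We define the map $W_{\beta,\hat c t}$ by
$$(W_{\beta,\hat c t}f)(\xi) = f(\xi + \hat c t)e^{\beta\xi}, \tag{2.12}$$
where $\beta \in (0,\beta_*)$ and $\hat c \in (0,c)$ will be fixed later. The Fourier conjugate of this operator then satisfies
$$(\tilde W_{\beta,\hat c t}\tilde f)(k) \equiv (\mathcal{F}W_{\beta,\hat c t}\mathcal{F}^{-1}\tilde f)(k) = e^{i(k+i\beta)\hat c t}\tilde f(k+i\beta), \tag{2.13}$$
as one sees from the following equalities:
$$\begin{aligned}
2\pi\,\widetilde{(W_{\beta,\hat c t}f)}(k) &= \int d\xi\, e^{-ik\xi}(W_{\beta,\hat c t}f)(\xi) \\
&= \int d\xi\, e^{-ik\xi}f(\xi+\hat c t)e^{\beta\xi} \\
&= \int d\xi\, e^{-i(k+i\beta)\xi}f(\xi+\hat c t) \\
&= \int d\xi\, e^{-i(k+i\beta)(\xi-\hat c t)}f(\xi) \\
&= 2\pi e^{i(k+i\beta)\hat c t}\tilde f(k+i\beta).
\end{aligned}$$
This calculation also shows that if $f(\xi)e^{\beta_*\xi} \in H^{2}_{0,\delta}$ for $f \in C^2_{b,\delta}$, then $\tilde W_{\beta,\hat c t}\tilde f$ extends to an analytic function in $\{0 < \operatorname{Im} k < \beta_*\}$ and $(\tilde W_{\beta,\hat c t}\tilde f)(\cdot - i\beta) \in \tilde H^{0,\delta}_2$ for all $\beta \in [0,\beta_*)$.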
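As a sanity check of (2.13), one can compare both sides numerically for a concrete function. The sketch below assumes the test function $f(x) = e^{-x^2}$ (whose transform in the convention above is $e^{-k^2/4}/(2\sqrt{\pi})$ and is entire in $k$); the helper names and the parameter `shift`, standing in for $\hat c t$, are illustrative choices rather than notation from the text.

```python
import cmath
import math

def fourier(f, k, lo=-30.0, hi=30.0, steps=6_000):
    """(1/2pi) * integral of f(x)*exp(-i*k*x) dx by the midpoint rule."""
    h = (hi - lo) / steps
    total = 0j
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += f(x) * cmath.exp(-1j * k * x)
    return total * h / (2.0 * math.pi)

def gaussian(x):
    return math.exp(-x * x)

def gaussian_hat(k):
    """Transform of exp(-x^2) in the same convention; valid for complex k."""
    return cmath.exp(-k * k / 4.0) / (2.0 * math.sqrt(math.pi))

def weighted_shift(f, beta, shift):
    """The map (2.12): f -> f(xi + shift) * exp(beta * xi)."""
    return lambda xi: f(xi + shift) * math.exp(beta * xi)

def rhs_2_13(k, beta, shift):
    """Right-hand side of (2.13) for the Gaussian test function."""
    return cmath.exp(1j * (k + 1j * beta) * shift) * gaussian_hat(k + 1j * beta)
```

Evaluating `fourier(weighted_shift(gaussian, beta, shift), k)` and `rhs_2_13(k, beta, shift)` for a few real `k` gives matching values, which illustrates how the exponential weight turns into a shift of the Fourier variable into the complex plane.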

Remark. Since the norms for different $\delta$ are equivalent, all theorems throughout this paper can also be formulated in a version with $\delta = 1$.

3. The Linear Simplified Problem

In this section we study the linearization of Eq. (2.2):
$$\partial_t U(x,t) = \partial_x^2 U(x,t) + a(x-ct)U(x,t). \tag{3.1}$$
The function $a$ is given as
$$a(\xi) = \tfrac12(1+\tanh\xi), \tag{3.2}$$
but our methods will work for many other functions. The crucial property we need is the existence of a $\beta_* > 0$ such that $a(\xi)e^{-\beta\xi}$ satisfies
$$\|\xi \mapsto a(\xi)e^{-\beta\xi}\|_{H^{2}_{2,\delta}} \le C, \tag{3.3}$$
for all $\beta \in (0,\beta_*)$. For the case of (3.2) we can take $\beta_* = 2$. The Fourier transform $\tilde a$ of $a$ is therefore a tempered distribution which is the boundary value of a function (again called $\tilde a$) which is analytic in the strip $\{z \mid 0 > \operatorname{Im} z > -\beta_*\}$. Furthermore, there is a $K$ such that, for all $\delta \in (0,1]$,
$$\|a\|_{C^2_{b,\delta}} \le 1 + K\delta, \tag{3.4}$$
since
$$\sup_{x\in\mathbb{R}}|a(x)| \le 1. \tag{3.5}$$

The bound (3.5) will be tacitly used later. The next proposition describes how solutions of (3.1) tend to 0 as $t \to \infty$. We write $U_t(x)$ for $U(x,t)$ and use similar notation for other functions of space and time.

Proposition 3.1. Assume that there are a $\beta$ and a $\hat c \in (0,c)$ such that $\beta^2 - \beta\hat c + 1 \equiv -2\gamma < 0$. Then there exists a $\delta \in (0,1]$ such that the following holds. Assume that $U_0 \in H^{2}_{2,\delta}$ and that $W_0(\xi) = W_{\beta,0}U_0(\xi) = U_0(\xi)e^{\beta\xi} \in H^{2}_{0,\delta}$. (These conditions are independent of $\delta > 0$.) Then the solution $U_t(x) = U(x,t)$ of (3.1) with initial data $U_0$ exists for all $t > 0$ and with $\tilde\psi(k) = e^{-k^2}$ the rescaled solution $\tilde V(k,t) = \tilde U(kt^{-1/2},t)$ satisfies
$$\|\tilde V_t - \tilde U_0(0)\tilde\psi\|_{\tilde H^{2,\delta}_2} \le \frac{C}{(1+t)^{1/2}}\|\tilde U_0\|_{\tilde H^{2,\delta}_2}. \tag{3.6}$$
The function $\tilde W_t = \tilde W_{\beta,\hat c t}\tilde U_t$ satisfies
$$\|\tilde W_t\|_{\tilde H^{0,\delta}_2} \le Ce^{-3\gamma t/2}\|\tilde W_0\|_{\tilde H^{0,\delta}_2}. \tag{3.7}$$
The constant $C$ does not depend on $U_0$.

Remark. Note that it is optimal to choose $\hat c$ arbitrarily close to $c$.

Proof. First of all, we rewrite Eq. (3.1) for $U_t$ in terms of $\tilde U_t$ and $\tilde W_t$: The equation for $W_t = W_{\beta,\hat c t}U_t$ is
$$\partial_t W(\xi,t) = \partial_\xi^2 W(\xi,t) + (\hat c - 2\beta)\partial_\xi W(\xi,t) + a(\xi - (c-\hat c)t)W(\xi,t) + (\beta^2 - \beta\hat c)W(\xi,t). \tag{3.8}$$
Taking Fourier transforms we then find, omitting the argument $k$ and using the notation of (2.5):
$$\partial_t\tilde U_t = -k^2\tilde U_t + (T_{ct}\tilde a) * \tilde U_t, \tag{3.9}$$
$$\partial_t\tilde W_t = \bigl(\beta^2 - \beta\hat c - k^2 + ik(\hat c - 2\beta)\bigr)\tilde W_t + (T_{(c-\hat c)t}\tilde a) * \tilde W_t. \tag{3.10}$$

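It is at this point that the simultaneous choice of two representations for the solution and their associated topologies is crucial. We first show that $\tilde W_t$ converges to 0, i.e., we show (3.7). We find from (2.10):
$$\|(T_\zeta\tilde a) * \tilde f\|_{\tilde H^{0,\delta}_2} \le \|a(\cdot - \zeta)\|_{C^2_{b,\delta}}\cdot\|\tilde f\|_{\tilde H^{0,\delta}_2} = \|a\|_{C^2_{b,\delta}}\cdot\|\tilde f\|_{\tilde H^{0,\delta}_2}.$$
Therefore, (3.4) implies
$$\|(T_{(c-\hat c)t}\tilde a) * \tilde W_t\|_{\tilde H^{0,\delta}_2} \le (1+K\delta)\|\tilde W_t\|_{\tilde H^{0,\delta}_2}, \tag{3.11}$$
and we get from (3.10) the bound
$$\tfrac12\partial_t\|\tilde W_t\|^2_{\tilde H^{0,\delta}_2} \le (\beta^2 - \beta\hat c + 1 + K\delta)\|\tilde W_t\|^2_{\tilde H^{0,\delta}_2}.$$
We choose $\delta > 0$ so small that $\beta^2 - \beta\hat c + 1 + K\delta \le -3\gamma/2$. Integrating over $t$ we get from the choice of $\beta$, $\delta$, and $\hat c$:
$$\|\tilde W_t\|_{\tilde H^{0,\delta}_2} \le e^{-3\gamma t/2}\|\tilde W_0\|_{\tilde H^{0,\delta}_2}. \tag{3.12}$$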

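Thus, we have shown Eq. (3.7). Next, we study $\tilde U$. From (2.13) and deforming the contour of integration, we get
$$\begin{aligned}
(T_\zeta\tilde a * \tilde f)(k) &= \int d\ell\, e^{-i\zeta(k-\ell)}\tilde a(k-\ell)\tilde f(\ell) \\
&= \int d\ell\, e^{-i\zeta(k-\ell)}\tilde a(k-\ell)\,(\tilde W_{\beta,\hat c t}\tilde f)(\ell - i\beta)e^{-i\ell\hat c t} \\
&= \int d\ell\, e^{-i\zeta(k-\ell-i\beta)}\tilde a(k-\ell-i\beta)\,(\tilde W_{\beta,\hat c t}\tilde f)(\ell)e^{-i(\ell+i\beta)\hat c t} \\
&= e^{-\beta(\zeta-\hat c t)}\int d\ell\, e^{-i\zeta(k-\ell)}\tilde a(k-\ell-i\beta)\,(\tilde W_{\beta,\hat c t}\tilde f)(\ell)e^{-i\ell\hat c t}.
\end{aligned} \tag{3.13}$$
Let $\tilde h(k) = e^{-ictk}\tilde a(k-i\beta)$ and $\tilde g(k) = e^{-ik\hat c t}(\tilde W_{\beta,\hat c t}\tilde U_t)(k) = e^{-ik\hat c t}\tilde W_t(k)$. Then (3.13) implies
$$T_{ct}\tilde a * \tilde U_t = e^{-\beta(c-\hat c)t}\,\tilde h * \tilde g.$$
From this and (3.3) we conclude that
$$\begin{aligned}
\|T_{ct}\tilde a * \tilde U_t\|_{\tilde H^{2,\delta}_2} &= e^{-\beta(c-\hat c)t}\|\tilde h * \tilde g\|_{\tilde H^{2,\delta}_2} \\
&\le Ce^{-\beta(c-\hat c)t}\|\tilde h\|_{\tilde H^{2,\delta}_2}\|\tilde g\|_{\tilde H^{0,\delta}_2} \\
&\le C(1+tc)^2e^{-\beta(c-\hat c)t}\|\tilde W_t\|_{\tilde H^{0,\delta}_2}.
\end{aligned} \tag{3.14}$$
On the other hand, from (3.7) we know that $\|\tilde W_t\|_{\tilde H^{0,\delta}_2}$ stays bounded (it actually decays exponentially), and thus the evolution equation for $\tilde U_t$ is of the form
$$\partial_t\tilde U_t(k) = -k^2\tilde U_t(k) + \tilde f(k,t)(1+tc)^2e^{-\beta(c-\hat c)t},$$
with $\|\tilde f(\cdot,t)\|_{\tilde H^{2,\delta}_2}$ uniformly bounded in $t$. Since, by construction, $\hat c < c$, we conclude that (3.6) holds, using well-known arguments which will be made explicit in the proof of Theorem 4.1. The proof of Proposition 3.1 is complete.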
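The hypothesis $\beta^2 - \beta\hat c + 1 = -2\gamma < 0$ of Proposition 3.1 and the choice of $\delta$ in the proof are elementary to explore numerically. The sketch below is illustrative only: the function names are ad hoc, and the constant `K` of (3.4) is treated as a given input since the text does not compute it. The additional requirements $\beta \in (0,\beta_*)$ and $\hat c < c$ are not checked here.

```python
import math

def beta_window(c_hat):
    """Open interval of beta with beta^2 - beta*c_hat + 1 < 0, or None if empty."""
    disc = c_hat * c_hat - 4.0
    if disc <= 0.0:
        return None  # the quadratic is non-negative unless c_hat > 2
    root = math.sqrt(disc)
    return ((c_hat - root) / 2.0, (c_hat + root) / 2.0)

def gamma_of(beta, c_hat):
    """gamma defined through beta^2 - beta*c_hat + 1 = -2*gamma."""
    return -(beta * beta - beta * c_hat + 1.0) / 2.0

def max_delta(beta, c_hat, K):
    """Largest delta in (0, 1] with beta^2 - beta*c_hat + 1 + K*delta <= -3*gamma/2.

    K is the (assumed known) constant from (3.4).
    """
    gamma = gamma_of(beta, c_hat)
    if gamma <= 0.0:
        raise ValueError("stability condition beta^2 - beta*c_hat + 1 < 0 fails")
    return min(1.0, gamma / (2.0 * K))
```

In particular the window is empty unless $\hat c > 2$, and the condition on $\delta$ reduces to $K\delta \le \gamma/2$.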


4. The Renormalization Approach for the Simplified Problem

We consider now the non-linear problem (2.2) and its related version for $\tilde w_t = \tilde W_{\beta,\hat c t}\tilde u_t = \mathcal{F}W_{\beta,\hat c t}u_t$ in Fourier space. It takes the form
$$\begin{aligned}
\partial_t\tilde u_t &= -k^2\tilde u_t + (T_{ct}\tilde a) * \tilde u_t + \tilde u_t^{*p}, \\
\partial_t\tilde w_t &= \bigl(\beta^2 - \beta\hat c - k^2 + ik(\hat c - 2\beta)\bigr)\tilde w_t + (T_{(c-\hat c)t}\tilde a) * \tilde w_t + (T_{-\hat c t}\tilde u_t)^{*(p-1)} * \tilde w_t.
\end{aligned} \tag{4.1}$$
Let $M_\beta$ be the operator of multiplication: $(M_\beta f)(x) = e^{\beta x}f(x)$. Choose the constants $\hat c$ and $\beta$ such that they satisfy as before $0 > -2\gamma = \beta^2 - \beta\hat c + 1$, and fix them henceforth. Our main result for the simplified problem is:

Theorem 4.1. For all $\vartheta \in (0,1/2)$ there are positive constants $R$, $C$ and $\delta \in (0,1]$ such that the following holds: Assume $\|u_0\|_{H^{2}_{2,\delta}} + \|M_\beta u_0\|_{H^{2}_{0,\delta}} \le R$. Then the solution $u_t$ of (2.2) with initial condition $u_0$ converges to a Gaussian in the sense that there is a constant $A_* = A_*(u_0)$ such that with $\tilde\psi(k) = e^{-k^2}$ the rescaled solution $\tilde v(k,t) = \tilde u(kt^{-1/2},t)$ satisfies
$$\|\tilde v_t - A_*\tilde\psi\|_{\tilde H^{2,\delta}_2} \le \frac{CR}{(t+1)^{1/2-\vartheta}}. \tag{4.2}$$
Furthermore,
$$\|\tilde w_t\|_{\tilde H^{0,\delta}_2} = \|\mathcal{F}W_{\beta,\hat c t}u_t\|_{\tilde H^{0,\delta}_2} \le CRe^{-\gamma t}.$$

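We shall use the renormalization technique of [BK92] to show that $\tilde u_t$ and $\tilde w_t$ behave (as $t \to \infty$) essentially in the same way as their linear counterparts $\tilde U_t$ and $\tilde W_t$ from the previous section. This technique consists, see [CEE92], in pushing forward the solution for some time and then rescaling it. This process makes the effective non-linearity smaller at each step, so that in the end the convergence properties of the linearized problem are obtained. We fix $0 < \sigma \le 1$ and introduce:
$$(\tilde L\tilde f)(\kappa) = \tilde f(\sigma\kappa). \tag{4.3}$$
This is a linear change of coordinates in function space. Definition (2.6) and (4.3) imply that
$$\|\tilde L\tilde f\|^2_{\tilde H^{2,\delta}_2} = \sigma^{-1}\int d(\sigma\kappa)\sum_{j,\ell=0}^{2}\delta^{2(\ell+j)}\sigma^{-2\ell}\sigma^{2j}\,|(\partial^j\tilde f)(\sigma\kappa)|^2(\sigma\kappa)^{2\ell}.$$
From this we conclude immediately that for $0 < \sigma < 1$:
$$\|\tilde L\tilde f\|_{\tilde H^{2,\delta}_2} \le \sigma^{-5/2}\|\tilde f\|_{\tilde H^{2,\delta}_2} \quad\text{and}\quad \|\tilde L^{-1}\tilde f\|_{\tilde H^{2,\delta}_2} \le \sigma^{-3/2}\|\tilde f\|_{\tilde H^{2,\delta}_2}. \tag{4.4}$$
Similarly
$$\|\tilde L\tilde f\|_{\tilde H^{0,\delta}_2} \le \sigma^{-1/2}\|\tilde f\|_{\tilde H^{0,\delta}_2} \quad\text{and}\quad \|\tilde L^{-1}\tilde f\|_{\tilde H^{0,\delta}_2} \le \sigma^{1/2}\|\tilde f\|_{\tilde H^{0,\delta}_2}. \tag{4.5}$$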

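Note also that
$$\tilde L(\tilde f * \tilde g)(\kappa) = \int d\kappa'\, \tilde f(\sigma\kappa - \kappa')\tilde g(\kappa') = \sigma\int d(\sigma^{-1}\kappa')\, \tilde f(\sigma\kappa - \sigma\sigma^{-1}\kappa')\tilde g(\sigma\sigma^{-1}\kappa') = \sigma\bigl((\tilde L\tilde f) * (\tilde L\tilde g)\bigr)(\kappa). \tag{4.6}$$
Furthermore, by the definition of $T_\zeta$,
$$\tilde L(T_\zeta\tilde a)(\kappa) = e^{-i\zeta\sigma\kappa}\tilde a(\sigma\kappa) = \bigl(T_{\sigma\zeta}(\tilde L\tilde a)\bigr)(\kappa),$$
and therefore we have
$$\tilde L\bigl((T_\zeta\tilde a) * \tilde f\bigr) = \sigma\,(T_{\sigma\zeta}\tilde L\tilde a) * (\tilde L\tilde f). \tag{4.7}$$
We next define
$$\tilde u_{n,\tau}(\kappa) = (\tilde L^n\tilde u)(\kappa,\sigma^{-2n}\tau) = \tilde u(\sigma^n\kappa,\sigma^{-2n}\tau), \qquad \tilde w_{n,\tau}(\kappa) = e^{\gamma\sigma^{-2n}\tau}\,\tilde w(\kappa,\sigma^{-2n}\tau),$$
so that this corresponds to an additional rescaling of the time axis. Note that
$$\tilde u_{n,\sigma^2}(\kappa) = \tilde u_{n-1,1}(\sigma\kappa),$$
and
$$\tilde w_{n,\sigma^2}(\kappa) = e^{\gamma\sigma^{-2n}\sigma^2}\,\tilde w(\kappa,\sigma^{-2n}\sigma^2) = \tilde w_{n-1,1}(\kappa),$$
i.e., the exponentially damped variable $\tilde w$ is not scaled in space. We also let $\tilde a_n = \tilde L^n\tilde a$. From (4.6), (4.7), and $\partial_\tau = \sigma^{-2n}\partial_t$ we find easily that (4.1) transforms to the system (omitting the argument $\kappa$):
$$\partial_\tau\tilde u_{n,\tau} = -\kappa^2\tilde u_{n,\tau} + \sigma^{-n}(T_{c\sigma^{-n}\tau}\tilde a_n) * \tilde u_{n,\tau} + \sigma^{n(p-3)}\tilde u_{n,\tau}^{*p}, \tag{4.8}$$
$$\sigma^{2n}\partial_\tau\tilde w_{n,\tau} = \bigl((\beta^2 - \beta\hat c + \gamma) - \kappa^2 + i\kappa(\hat c - 2\beta)\bigr)\tilde w_{n,\tau} + (T_{(c-\hat c)\sigma^{-2n}\tau}\tilde a) * \tilde w_{n,\tau} + \bigl(T_{-\hat c\sigma^{-2n}\tau}(\tilde L^{-n}\tilde u_{n,\tau})\bigr)^{*(p-1)} * \tilde w_{n,\tau}. \tag{4.9}$$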

We see that under these rescalings the coefficients of the non-linear terms in the first equation go to 0 as $n \to \infty$. We will now put this observation into more mathematical form. Equation (4.1) is of the form $\partial_t X_t = LX_t + N(X_t)$, where $L$ contains the linear parts with the exception of those depending on $\tilde a_n$ and $N$ denotes the other terms. We can write the solution as
$$X_t = e^{(t-t_0)L}X_{t_0} + \int_{t_0}^{t} ds\, e^{(t-s)L}N(X_s).$$
Going to the rescaled variables $X_{n,\tau}$, and taking $t_0 = \sigma^{-2(n-1)}$ and $t = \sigma^{-2n}\tau$, we can express this (for the $\tilde u$) as follows. Equation (4.8) leads to
$$\tilde u_{n,\tau}(\kappa) = e^{-\kappa^2(\tau-\sigma^2)}\tilde u_{n,\sigma^2}(\kappa) + \int_{\sigma^2}^{\tau} d\tau'\, e^{-\kappa^2(\tau-\tau')}\Bigl(\sigma^{-n}(T_{c\sigma^{-n}\tau'}\tilde a_n) * \tilde u_{n,\tau'} + \sigma^{n(p-3)}\tilde u_{n,\tau'}^{*p}\Bigr)(\kappa). \tag{4.10}$$


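Similarly, we rewrite (4.9) as
$$\partial_\tau\tilde w_{n,\tau} = \tilde G_{n,\tau}\tilde w_{n,\tau} + \sigma^{-2n}\bigl(T_{-\hat c\sigma^{-2n}\tau}(\tilde L^{-n}\tilde u_{n,\tau})\bigr)^{*(p-1)} * \tilde w_{n,\tau},$$
where $\tilde G_{n,\tau}$ is defined, cf. (4.9), by
$$\sigma^{2n}(\tilde G_{n,\tau}\tilde f)(\kappa) = \bigl((\beta^2 - \beta\hat c + \gamma) - \kappa^2 + i\kappa(\hat c - 2\beta)\bigr)\tilde f(\kappa) + \bigl((T_{(c-\hat c)\sigma^{-2n}\tau}\tilde a) * \tilde f\bigr)(\kappa).$$
The solution of the linear evolution equation $\partial_\tau\tilde f_{n,\tau} = \tilde G_{n,\tau}\tilde f_{n,\tau}$ is nothing but (3.10) in a new coordinate system. We write the solution as $\tilde f_{n,\tau} = \tilde S_{n,\tau,\tau'}\tilde f_{n,\tau'}$. Then, in analogy to (4.10) we get
$$\tilde w_{n,\tau}(\kappa) = (\tilde S_{n,\tau,\sigma^2}\tilde w_{n,\sigma^2})(\kappa) + \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\,\Bigl(\tilde S_{n,\tau,\tau'}\bigl(\bigl(T_{-\hat c\sigma^{-2n}\tau'}(\tilde L^{-n}\tilde u_{n,\tau'})\bigr)^{*(p-1)} * \tilde w_{n,\tau'}\bigr)\Bigr)(\kappa). \tag{4.11}$$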

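Remark. The proof of Theorem 4.1 is divided into several steps: In Lemma 4.2 below, we give the inequalities for the exponentially damped part in scaled variables. Then in Lemma 4.4 a priori estimates for the solutions of (4.10) and (4.11) are established. With these a priori bounds we show Proposition 4.5. From these results, Theorem 4.1 will follow rather simply by a contraction argument.

4.1. The weighted linear problem. We bound $\tilde S_{n,\tau,\tau'}$. Recall that we are assuming $\beta^2 - \beta\hat c + 1 = -2\gamma < 0$.

Lemma 4.2. There exists a $C > 0$ such that for $1 > \tau > \tau' \ge 0$ one has
$$\|\tilde S_{n,\tau,\tau'}\tilde f\|_{\tilde H^{0,\delta}_2} \le Ce^{-\gamma\sigma^{-2n}(\tau-\tau')/2}\|\tilde f\|_{\tilde H^{0,\delta}_2}, \tag{4.12}$$
for all $n \in \mathbb{N}$.

Proof. We consider the equation $\partial_\tau\tilde f_\tau = \tilde G_{n,\tau}\tilde f_\tau$, with solution $\tilde f_\tau = \tilde S_{n,\tau,\tau'}\tilde f_{\tau'}$:
$$\partial_\tau\tilde f_\tau = \lambda_n\tilde f_\tau + \sigma^{-2n}(T_{(c-\hat c)\sigma^{-2n}\tau}\tilde a) * \tilde f_\tau,$$
where $\lambda_n$ is the operator of multiplication by
$$\lambda_n(\kappa) = \bigl((\beta^2 - \beta\hat c + \gamma) - \kappa^2 + i\kappa(\hat c - 2\beta)\bigr)\sigma^{-2n}.$$
The variation of constant formula yields
$$\tilde f_\tau = e^{\lambda_n(\tau-\tau')}\tilde f_{\tau'} + \int_{\tau'}^{\tau} ds\, e^{\lambda_n(\tau-s)}\sigma^{-2n}(T_{(c-\hat c)\sigma^{-2n}s}\tilde a) * \tilde f_s.$$
We use
$$\|e^{\lambda_n\tau}\tilde f\|_{\tilde H^{0,\delta}_2} \le \|e^{\lambda_n\tau}\|_{C^0_b}\|\tilde f\|_{\tilde H^{0,\delta}_2}, \tag{4.13}$$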


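and
$$\|e^{\lambda_n\tau}\|_{C^0_b} \le e^{(\beta^2 - \beta\hat c + \gamma)\sigma^{-2n}\tau}.$$
We find
$$\|\tilde f_\tau\|_{\tilde H^{0,\delta}_2} \le e^{(\beta^2 - \beta\hat c + \gamma)\sigma^{-2n}(\tau-\tau')}\|\tilde f_{\tau'}\|_{\tilde H^{0,\delta}_2} + \int_{\tau'}^{\tau} ds\, e^{(\beta^2 - \beta\hat c + \gamma)\sigma^{-2n}(\tau-s)}\sigma^{-2n}\|a\|_{C^2_{b,\delta}}\|\tilde f_s\|_{\tilde H^{0,\delta}_2},$$
since
$$\|(T_\zeta\mathcal{F}a) * \tilde f\|_{\tilde H^{0,\delta}_2} = \|a(\cdot-\zeta)\mathcal{F}^{-1}\tilde f\|_{H^{2}_{0,\delta}} \le \|a(\cdot-\zeta)\|_{C^2_{b,\delta}}\|\mathcal{F}^{-1}\tilde f\|_{H^{2}_{0,\delta}} = \|a\|_{C^2_{b,\delta}}\|\tilde f\|_{\tilde H^{0,\delta}_2}.$$
Using $\|a\|_{C^2_{b,\delta}} \le 1 + K\delta$ and applying Gronwall’s inequality to $e^{-(\beta^2 - \beta\hat c + \gamma)\sigma^{-2n}\tau}\|\tilde f_\tau\|_{\tilde H^{0,\delta}_2}$ we get
$$e^{-(\beta^2 - \beta\hat c + \gamma)\sigma^{-2n}(\tau-\tau')}\|\tilde f_\tau\|_{\tilde H^{0,\delta}_2} \le e^{\sigma^{-2n}(1+K\delta)(\tau-\tau')}\|\tilde f_{\tau'}\|_{\tilde H^{0,\delta}_2},$$
or equivalently,
$$\|\tilde f_\tau\|_{\tilde H^{0,\delta}_2} \le \|\tilde f_{\tau'}\|_{\tilde H^{0,\delta}_2}\,e^{(\beta^2 - \beta\hat c + \gamma + 1 + K\delta)\sigma^{-2n}(\tau-\tau')}. \tag{4.14}$$
Since $\beta^2 - \beta\hat c + 1 = -2\gamma$, the exponent equals $(-\gamma + K\delta)\sigma^{-2n}(\tau-\tau')$. Choosing $\delta \in (0,1]$ so small that $K\delta < \gamma/2$ completes the proof of Lemma 4.2.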

4.2. An a priori bound on the non-linear problem. We now state and prove a priori bounds on the solution of (4.10) and (4.11). Finally these solutions will be controlled by proving inequalities for the elements of the following sequences.

Definition 4.3. For all $n$, we define
$$\rho^u_n = \|\tilde u_{n,1}\|_{\tilde H^{2,\delta}_2} \quad\text{and}\quad \rho^w_n = \|\tilde w_{n,1}\|_{\tilde H^{0,\delta}_2}.$$
Moreover, we define
$$R^u_n = \sup_{\tau\in[\sigma^2,1]}\|\tilde u_{n,\tau}\|_{\tilde H^{2,\delta}_2} \quad\text{and}\quad R^w_n = \sup_{\tau\in[\sigma^2,1]}\|\tilde w_{n,\tau}\|_{\tilde H^{0,\delta}_2}. \tag{4.15}$$

Lemma 4.4. For all $n \in \mathbb{N}$ there is a constant $\eta_n > 0$ such that the following holds: If $\rho^u_{n-1}$, $\rho^w_{n-1}$, and $\sigma > 0$ are smaller than $\eta_n$, the solutions of (4.10) and (4.11) exist for all $\tau \in [\sigma^2,1]$. Moreover, we have the estimates
$$R^u_n \le C\sigma^{-5/2}\rho^u_{n-1} + Ce^{-C\sigma^{-n}}R^w_n + C\sigma^{n(p-3)}(R^u_n)^p, \tag{4.16}$$
and
$$R^w_n \le C\rho^w_{n-1} + C\sigma^{n(p-3/2)}(R^u_n)^{p-1}R^w_n, \tag{4.17}$$
with a constant $C$ independent of $\sigma$ and $n$.


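Remark. There is no need for a detailed expression for $\eta = \eta_n$ since the existence of the solutions is guaranteed if we can show $R^u_n < \infty$ and $R^w_n < \infty$. With (4.16) and (4.17) we have detailed control of these quantities in terms of the norm of the initial conditions and $\sigma$.

Proof. We start with (4.11). We bound the first term of (4.11) by
$$C\rho^w_{n-1}. \tag{4.18}$$
For the second term in (4.11), we get with (4.5) and (4.6),
$$\begin{aligned}
\bigl\|\bigl(T_{-\hat c\sigma^{-2n}\tau}(\tilde L^{-n}\tilde u_{n,\tau})\bigr)^{*(p-1)} * \tilde w_{n,\tau}\bigr\|_{\tilde H^{0,\delta}_2}
&\le \bigl\|\bigl(\tilde L^{-n}(T_{-\hat c\sigma^{-n}\tau}\tilde u_{n,\tau})\bigr)^{*(p-1)} * \tilde w_{n,\tau}\bigr\|_{\tilde H^{0,\delta}_2} \\
&\le \sigma^{n/2}\sigma^{n(p-2)}\|\tilde u_{n,\tau}\|^{p-1}_{\tilde H^{0,\delta}_2}\|\tilde w_{n,\tau}\|_{\tilde H^{0,\delta}_2} \\
&\le \sigma^{n(p-3/2)}(R^u_n)^{p-1}R^w_n.
\end{aligned}$$
Together with Lemma 4.2 and the prefactor $\sigma^{-2n}$ in (4.11), this gives for the second term a bound
$$C\sigma^{n(p-7/2)}\int_{\sigma^2}^{\tau} d\tau'\, e^{-\gamma\sigma^{-2n}(\tau-\tau')/2}(R^u_n)^{p-1}R^w_n \le C\sigma^{n(p-3/2)}(R^u_n)^{p-1}R^w_n.$$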

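We next consider (4.10). The first term is bounded by
$$\|\kappa \mapsto e^{-\kappa^2(\tau-\sigma^2)}\tilde u_{n-1,1}(\sigma\kappa)\|_{\tilde H^{2,\delta}_2} \le \|\kappa \mapsto e^{-\kappa^2(\tau-\sigma^2)}\|_{C^2_{b,\delta}}\,\|\kappa \mapsto \tilde u_{n-1,1}(\sigma\kappa)\|_{\tilde H^{2,\delta}_2} \le C\sigma^{-5/2}\rho^u_{n-1}, \tag{4.19}$$
using (4.4). We use (4.7) and recall $\tilde a_n = \tilde L^n\tilde a$ to rewrite the second term of (4.10) into
$$\begin{aligned}
&\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{-\kappa^2(\tau-\tau')}\bigl((T_{c\sigma^{-2n}\tau'}\tilde a) * (\tilde L^{-n}\tilde u_{n,\tau'})\bigr)(\kappa) \\
&\quad= \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{-\kappa^2(\tau-\tau')}\bigl((T_{c\sigma^{-2n}\tau'}\tilde a) * \tilde u_{\sigma^{-2n}\tau'}\bigr)(\kappa) \\
&\quad= \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{-\beta\sigma^{-2n}\tau'(c-\hat c)}e^{-\kappa^2(\tau-\tau')} \\
&\qquad\times\int d\ell\, e^{-i(\kappa-\ell)c\sigma^{-2n}\tau'}\,\tilde a(\kappa-\ell-i\beta)\,\tilde w_{n,\tau'}(\ell)\,e^{-i\ell\hat c\sigma^{-2n}\tau'}e^{-\gamma\sigma^{-2n}\tau'},
\end{aligned}$$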


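where (3.13) is used for the last equality. Using this identity, we get from the techniques leading to (3.14):
$$\begin{aligned}
&\sigma^{-n}\Bigl\|\kappa \mapsto \int_{\sigma^2}^{\tau} d\tau'\, e^{-\kappa^2(\tau-\tau')}\bigl((T_{c\sigma^{-n}\tau'}\tilde a_n) * \tilde u_{n,\tau'}\bigr)(\kappa)\Bigr\|_{\tilde H^{2,\delta}_2} \\
&\quad\le \sigma^{-n}\int_{\sigma^2}^{\tau} d\tau'\, \|(T_{c\sigma^{-n}\tau'}\tilde a_n) * \tilde u_{n,\tau'}\|_{\tilde H^{2,\delta}_2} \\
&\quad\le \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, \|\tilde L^n((T_{c\sigma^{-2n}\tau'}\tilde a) * \tilde u_{\sigma^{-2n}\tau'})\|_{\tilde H^{2,\delta}_2} \\
&\quad\le \sigma^{-9n/2}\int_{\sigma^2}^{\tau} d\tau'\, \|(T_{c\sigma^{-2n}\tau'}\tilde a) * \tilde u_{\sigma^{-2n}\tau'}\|_{\tilde H^{2,\delta}_2} \\
&\quad\le C\sigma^{-9n/2}\int_{\sigma^2}^{\tau} d\tau'\, (1+c\sigma^{-2n}\tau')^2e^{-(\beta(c-\hat c)+\gamma)\sigma^{-2n}\tau'}R^w_n \\
&\quad\le C\sigma^{-17n/2}e^{-(\beta(c-\hat c)+\gamma)\sigma^{-2(n-1)}}R^w_n \le Ce^{-(\beta(c-\hat c)+\gamma)\sigma^{-n}}R^w_n.
\end{aligned} \tag{4.20}$$
For the last term in (4.10) we get a bound
$$C\sigma^{n(p-3)}\int_{\sigma^2}^{\tau} d\tau'\, (R^u_n)^p \le C\sigma^{n(p-3)}(R^u_n)^p. \tag{4.21}$$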

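The proof of Lemma 4.4 now follows by applying the contraction mapping principle to (4.10) and (4.11). For $\rho^u_{n-1}$, $\rho^w_{n-1}$ and $\sigma > 0$ sufficiently small the Lipschitz constant for the right-hand side of (4.10) and (4.11) in $C([\sigma^2,1], \tilde H^{2,\delta}_2 \times \tilde H^{0,\delta}_2)$ is smaller than 1. An application of a classical fixed point argument completes the proof of Lemma 4.4.

4.3. The iteration process. We next decompose the solution $\tilde u_{n,\tau}$ for $\tau = 1$ into a Gaussian part and a remainder. Let $\tilde\psi(\kappa) = e^{-\kappa^2}$ and write
$$\tilde u_{n,1}(\kappa) = A_n\tilde\psi(\kappa) + \tilde r_n(\kappa),$$
where $\tilde r_n(0) = 0$, and the amplitude $A_n$ is in $\mathbb{R}$. We also define $\Pi : \tilde H^{2,\delta}_2 \to \mathbb{R}$ by
$$\Pi\tilde f = \tilde f\big|_{\kappa=0}. \tag{4.22}$$
Then (4.10) can be decomposed accordingly and takes the form
$$A_n = A_{n-1} + \Pi\Bigl(\int_{\sigma^2}^{1} d\tau\, e^{-\kappa^2(1-\tau)}\bigl(\sigma^{-n}(T_{c\sigma^{-n}\tau}\tilde a_n) * \tilde u_{n,\tau} + \sigma^{n(p-3)}\tilde u_{n,\tau}^{*p}\bigr)(\kappa)\Bigr), \tag{4.23}$$
$$\begin{aligned}
\tilde r_n(\kappa) &= e^{-\kappa^2(1-\sigma^2)}\tilde r_{n-1}(\sigma\kappa) \\
&\quad+ \int_{\sigma^2}^{1} d\tau\, e^{-\kappa^2(1-\tau)}\bigl(\sigma^{-n}(T_{c\sigma^{-n}\tau}\tilde a_n) * \tilde u_{n,\tau} + \sigma^{n(p-3)}\tilde u_{n,\tau}^{*p}\bigr)(\kappa) \\
&\quad+ e^{-\kappa^2(1-\sigma^2)}A_{n-1}\tilde\psi(\sigma\kappa) - A_n\tilde\psi(\kappa).
\end{aligned} \tag{4.24}$$
Then we define $\rho^r_n = \|\tilde r_n\|_{\tilde H^{2,\delta}_2}$ and so $\rho^u_n \le C(|A_n| + \rho^r_n)$. Our main estimate is now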


Proposition 4.5. There is a constant $C > 0$ such that for $\sigma > 0$ sufficiently small the solution $\tilde u$ of (2.2) satisfies for all $n \in \mathbb{N}$:
$$|A_n - A_{n-1}| \le Ce^{-C\sigma^{-n}}R^w_n + C\sigma^{n(p-3)}(R^u_n)^p, \tag{4.25}$$
$$\begin{aligned}
\rho^r_n &\le C\sigma\rho^r_{n-1} + Ce^{-C\sigma^{-n}}R^w_n + C\sigma^{n(p-3)}(R^u_n)^p, \\
\rho^w_n &\le Ce^{-C\sigma^{-2n}}\rho^w_{n-1} + C\sigma^{n(p-3/2)}(R^u_n)^{p-1}R^w_n.
\end{aligned} \tag{4.26}$$

Proof. We begin by bounding the difference $A_n - A_{n-1}$ using (4.23). Observe that since we work in $\tilde H^{2,\delta}_2$, we have
$$|\Pi\tilde f| \le C\|\tilde f\|_{\tilde H^{2,\delta}_2}, \tag{4.27}$$
with $C$ independent of $\delta$. Thus, it suffices to bound the norm of the integral in (4.23). The first term in (4.23) is the one containing the translated term $\tilde a_n$ and was already bounded in (4.20) while the second was bounded in (4.21). Combining these bounds with (4.27), we find (4.25). We next bound $\tilde r_n$ in terms of $\tilde r_{n-1}$, using (4.24). The first term is the one where the projection is crucial: For $\sigma > 0$ sufficiently small, $\tilde f \in \tilde H^{2,\delta}_2$ with $\tilde f(0) = 0$ one has
$$\|\kappa \mapsto e^{-\kappa^2(1-\sigma^2)}\tilde f(\sigma\kappa)\|_{\tilde H^{2,\delta}_2} \le C\sigma\|\tilde f\|_{\tilde H^{2,\delta}_2}. \tag{4.28}$$
Indeed, writing out the definition (2.6) of $\tilde H^{2,\delta}_2$, one gets for the term with $j = \ell = 0$:
$$\int d\kappa\, e^{-2\kappa^2(1-\sigma^2)}|\tilde f(\sigma\kappa)|^2 = \int d\kappa\, e^{-2\kappa^2(1-\sigma^2)}(\sigma\kappa)^2\left|\frac{\tilde f(\sigma\kappa) - \tilde f(0)}{\sigma\kappa}\right|^2.$$
Clearly, a bound of the type of (4.28) follows for this term by the assumptions on $\tilde f$. The derivatives are handled similarly, except that there is no need to divide and multiply by powers of $\sigma\kappa$ since each derivative produces a factor $\sigma$. We now bound the other terms in (4.24). The first term is bounded using (4.28) and yields a bound (in $\tilde H^{2,\delta}_2$) of
$$C\sigma\rho^r_{n-1}. \tag{4.29}$$
The second and third terms have been bounded in (4.20) and (4.21):
$$Ce^{-C\sigma^{-n}}R^w_n + C\sigma^{n(p-3)}(R^u_n)^p. \tag{4.30}$$
Finally, the last term in (4.24) can be written as
$$\tilde X_n \equiv A_{n-1}\bigl(e^{-\kappa^2(1-\sigma^2)}e^{-\kappa^2\sigma^2} - e^{-\kappa^2}\bigr) + (A_{n-1} - A_n)e^{-\kappa^2}.$$
The first expression vanishes and we get a bound (in $\tilde H^{2,\delta}_2$):
$$\|\tilde X_n\|_{\tilde H^{2,\delta}_2} \le Ce^{-C\sigma^{-n}}R^w_n + C\sigma^{n(p-3)}(R^u_n)^p. \tag{4.31}$$
Collecting the bounds (4.29)–(4.31), the assertion (4.26) for $\tilde r_n$ follows. Finally, the bounds on $\rho^w_n$ follow as those in Lemma 4.4. The proof of Proposition 4.5 is complete.


Proof of Theorem 4.1. The proof is an induction argument, using repeatedly the above estimates. Again we write $C$ for (positive) constants which can be chosen independent of $\sigma$ and $n$. Assume that $R = \sup_{n\in\mathbb{N}}R^u_n < \infty$ exists. From Lemma 4.4 we observe for $\sigma > 0$ sufficiently small
$$\begin{aligned}
R^w_n &\le \frac{C\rho^w_{n-1}}{1 - C\sigma^{n(p-3/2)}R^{p-1}} \le C\rho^w_{n-1}, \\
R^u_n &\le \frac{C\sigma^{-5/2}\rho^u_{n-1} + Ce^{-C\sigma^{-n}}R^w_n}{1 - C\sigma^{n(p-3)}R^{p-1}} \le C\sigma^{-5/2}\rho^u_{n-1} + Ce^{-C\sigma^{-n}}\rho^w_{n-1},
\end{aligned} \tag{4.32}$$
with a constant $C$ which can be chosen independent of $R$. Using Proposition 4.5 we find
$$\begin{aligned}
|A_n - A_{n-1}| &\le Ce^{-C\sigma^{-n}}\rho^w_{n-1} + C\sigma^{n(p-3)}\sigma^{-5/2}\rho^u_{n-1}, \\
\rho^r_n &\le C\sigma\rho^r_{n-1} + Ce^{-C\sigma^{-n}}\rho^w_{n-1} + C\sigma^{n(p-3)}\sigma^{-5/2}\rho^u_{n-1}, \\
\rho^u_n &\le C(|A_n| + \rho^r_n), \\
\rho^w_n &\le Ce^{-C\sigma^{-2n}}\rho^w_{n-1} + C\sigma^{n(p-3/2)}\rho^w_{n-1}.
\end{aligned} \tag{4.33}$$
Therefore, we can choose $\sigma > 0$ so small that for $n > 3$ (recall $p > 3$ and $p \in \mathbb{N}$):
$$\begin{aligned}
|A_n - A_{n-1}| &\le \rho^w_{n-1}/10 + \sigma^{n-3}(|A_{n-1}| + \rho^r_{n-1}), \\
\rho^r_n &\le 3\rho^r_{n-1}/4 + \rho^w_{n-1}/10 + \sigma^{n-3}|A_{n-1}|, \\
\rho^w_n &\le \rho^w_{n-1}/10.
\end{aligned}$$
Thus, the sequence of $A_n$ converges geometrically to a finite limit $A_*$. Furthermore, we find that $\lim_{n\to\infty}\rho^r_n = 0$, and $\lim_{n\to\infty}\rho^w_n = 0$. Since the quantities $|A_n|$, $\rho^r_n$, $\rho^w_n$ increase only for at most three steps the term $CR^{p-1}$ in (4.32) stays less than $1/2$ if we choose $|A_1|, \rho^r_1, \rho^w_1 = O(\sigma^m)$, for an $m > 0$ sufficiently large. We then deduce from (4.32) the existence of a finite constant $R = \sup_{n\in\mathbb{N}}R^u_n$. Going back to (4.33) for given $\vartheta > 0$ we can choose $\sigma > 0$ so small that $|A_n - A_{n-1}| + \rho^r_n \le C\sigma^{(1-2\vartheta)n}$ which implies the associated convergence rate stated in Theorem 4.1. This holds since $\rho_n \le C\sigma\rho_{n-1}$ implies $\rho|_{t=\sigma^{-2n}} = \rho_n \le (C\sigma)^n\rho_0$ and
$$\rho(t) \le \rho_0\,t^{-1/2}\,t^{\ln C/\ln\sigma^{-2}} \le \rho_0\,t^{-1/2+\vartheta}$$
for $\sigma > 0$ sufficiently small. Finally, the scaling of $\tilde w_{n,\tau}$ implies the exponential decay of $\tilde w_t$. The proof of Theorem 4.1 is complete.
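The last step of the proof converts a geometric bound in the iteration index $n$ into an algebraic rate in $t = \sigma^{-2n}$. A small numerical sketch of this bookkeeping follows; the function names and sample constants are illustrative assumptions, and `C > 1`, `theta > 0` are assumed.

```python
import math

def decay_exponent(C, sigma):
    """Exponent e with (C*sigma)**n == t**e at t = sigma**(-2n).

    Since n = ln t / ln sigma^(-2), one finds e = -1/2 + ln C / ln sigma^(-2).
    """
    return -0.5 + math.log(C) / math.log(sigma ** -2)

def sigma_for_loss(C, theta):
    """A sigma for which the loss ln C / ln sigma^(-2) equals theta (C > 1, theta > 0)."""
    return C ** (-1.0 / (2.0 * theta))
```

Any smaller `sigma` gives a loss below `theta`, which is why the rate $t^{-1/2+\vartheta}$ is reached for every $\vartheta > 0$ but not for $\vartheta = 0$.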


Part II. The Swift–Hohenberg Equation

5. Bloch Waves

Since the problem we consider takes place in a setting with a periodic background provided by the stationary solution of the Swift–Hohenberg equation, it is natural to work with the Bloch representation of the functions. For additional information see [RS72]. The starting point of Bloch wave analysis in the case of a $2\pi$-periodic underlying pattern is the following relation:
$$\begin{aligned}
u(x) &= \int dk\, e^{ikx}\tilde u(k) = \sum_{n\in\mathbb{Z}}\int_{-1/2}^{1/2} d\ell\, e^{i(n+\ell)x}\tilde u(n+\ell) \\
&= \int_{-1/2}^{1/2} d\ell \sum_{n\in\mathbb{Z}} e^{i(n+\ell)x}\tilde u(n+\ell) = \int_{-1/2}^{1/2} d\ell\, e^{i\ell x}\hat u(\ell,x),
\end{aligned} \tag{5.1}$$
where we define
$$(\mathcal{T}u)(\ell,x) \equiv \hat u(\ell,x) = \sum_{n\in\mathbb{Z}} e^{inx}\tilde u(n+\ell). \tag{5.2}$$

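The operator $\mathcal{T}$ will play a rôle analogous to that played by the Fourier transform $\mathcal{F}$ for the simplified problem of Part I. We will use analogous notation:

Notation. If $f$ denotes a function, then $\hat f$ is defined by $\hat f = \mathcal{T}f$, and if $A$ is an operator, then $\hat A$ is defined by $\hat A = \mathcal{T}A\mathcal{T}^{-1}$.

Note that
$$\int_{\mathbb{R}} dx\, |u(x)|^2 = 2\pi\int_{-1/2}^{1/2} d\ell\int_0^{2\pi} dx\, |\hat u(\ell,x)|^2. \tag{5.3}$$
This is easily seen from Parseval’s identity:
$$\begin{aligned}
\int_{\mathbb{R}} dx\, |u(x)|^2 &= 2\pi\int_{\mathbb{R}} dk\, |\tilde u(k)|^2 \\
&= 2\pi\sum_{n\in\mathbb{Z}}\int_{-1/2}^{1/2} d\ell\, |\tilde u(n+\ell)|^2 \\
&= 2\pi\int_{-1/2}^{1/2} d\ell\sum_{n\in\mathbb{Z}} |\tilde u(n+\ell)|^2 \\
&= 2\pi\int_{-1/2}^{1/2} d\ell\int_0^{2\pi} dx\, |\hat u(\ell,x)|^2.
\end{aligned}$$
The sum and the integral can be interchanged in (5.1) due to Fubini’s theorem when $u$ is in the Schwartz space $\mathcal{S}$. We shall use frequently the following fundamental properties (which follow at once from (5.2)):
$$\begin{aligned}
\hat u(\ell,x) &= e^{ix}\hat u(\ell+1,x), \\
\hat u(\ell,x) &= \hat u(\ell,x+2\pi), \\
\hat u(\ell,x) &= \overline{\hat u(-\ell,x)} \quad\text{for real-valued } u.
\end{aligned} \tag{5.4}$$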
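The three properties (5.4) can be illustrated numerically with the definition (5.2). The sketch below assumes a Gaussian $\tilde u(k) = e^{-k^2}$ (real and even, so that $u$ is real-valued) and truncates the sum over $n$; the function names and the truncation parameter `n_max` are illustrative choices, not part of the text.

```python
import cmath
import math

def u_tilde(k):
    """Sample Fourier transform: a Gaussian, so the sum over n converges fast."""
    return math.exp(-k * k)

def bloch(ell, x, n_max=20):
    """u_hat(ell, x) = sum_n exp(i*n*x) * u_tilde(n + ell), truncated to |n| <= n_max."""
    return sum(cmath.exp(1j * n * x) * u_tilde(n + ell)
               for n in range(-n_max, n_max + 1))
```

Comparing `bloch(ell, x)` with `cmath.exp(1j * x) * bloch(ell + 1, x)`, with `bloch(ell, x + 2 * math.pi)`, and with `bloch(-ell, x).conjugate()` reproduces the three identities up to rounding.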


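Finally $T_\zeta$ denotes the conjugate of translation, so that the Bloch transform of $\tau_\zeta f(x) = f(x-\zeta)$ is
$$\mathcal{T}\tau_\zeta f = T_\zeta\mathcal{T}f.$$
Multiplication in position space corresponds to a modified convolution operation for the Bloch-functions:
$$\widehat{u\cdot v}(\ell,x) = \int_{-1/2}^{1/2} d\ell'\, \hat u(\ell-\ell',x)\hat v(\ell',x) \equiv (\hat u * \hat v)(\ell,x).$$
This follows from (5.4) and the identities:
$$\begin{aligned}
\widehat{u\cdot v}(\ell,x) &= \sum_{m\in\mathbb{Z}}\int_{\mathbb{R}} dk\, \tilde u(\ell+m-k)\tilde v(k)e^{imx} \\
&= \sum_{m,n\in\mathbb{Z}}\int_{-1/2}^{1/2} d\ell'\, \tilde u(\ell+m-\ell'-n)\tilde v(\ell'+n)e^{i(m-n)x}e^{inx}.
\end{aligned}$$
Recalling the norm
$$\|f\|_{H^{n}_{s,\delta}} = \Bigl(\sum_{j=0}^{n}\sum_{m=0}^{s}\delta^{2(m+j)}\int dx\, |\partial_x^m f(x)|^2x^{2j}\Bigr)^{1/2}$$
we now introduce
$$\|\hat f\|_{\hat H^{s,\delta}_n} = \Bigl(\sum_{j=0}^{n}\sum_{m=0}^{s}\delta^{2(m+j)}\int_{-1/2}^{1/2} d\ell\int_0^{2\pi} dx\, |\partial_\ell^j\partial_x^m\hat f(\ell,x)|^2\Bigr)^{1/2}.$$
We get from Parseval’s equality
$$C^{-1}\|u\|_{H^{n}_{s,\delta}} \le \|\hat u\|_{\hat H^{s,\delta}_n} \le C\|u\|_{H^{n}_{s,\delta}},$$
for some $C$ independent of $\delta \in (0,1)$. As before, in the following we mainly work with the spaces corresponding to $s = n = 2$ and $s = 0$, $n = 2$. Similarly, in analogy to (2.8), we also have
$$\|\widehat{uv}\|_{\hat H^{2,\delta}_2} = \|\hat u * \hat v\|_{\hat H^{2,\delta}_2} \le C\|\hat u\|_{\hat H^{2,\delta}_2}\|\hat v\|_{\hat H^{2,\delta}_2}, \tag{5.5}$$
$$\|\widehat{uv}(\cdot - i\beta,\cdot)\|_{\hat H^{0,\delta}_2} = \|(\hat u * \hat v)(\cdot - i\beta,\cdot)\|_{\hat H^{0,\delta}_2} \le C\|\hat u\|_{\hat H^{0,\delta}_2}\|\hat v(\cdot - i\beta,\cdot)\|_{\hat H^{0,\delta}_2}, \tag{5.6}$$
or in a stronger version
$$\begin{aligned}
\|\widehat{uv}\|_{\hat H^{2,\delta}_2} &= \|\hat u * \hat v\|_{\hat H^{2,\delta}_2} \le C\|\hat u\|_{\hat H^{0,\delta}_2}\|\hat v\|_{\hat H^{2,\delta}_2}, \\
\|\widehat{uv}(\cdot - i\beta,\cdot)\|_{\hat H^{2,\delta}_2} &= \|(\hat u * \hat v)(\cdot - i\beta,\cdot)\|_{\hat H^{2,\delta}_2} \le C\|\hat u\|_{\hat H^{2,\delta}_2}\|\hat v(\cdot - i\beta,\cdot)\|_{\hat H^{0,\delta}_2}.
\end{aligned} \tag{5.7}$$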

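Finally, suppose $f$ is a function in $C^2_{b,\delta}$ (see (2.11) for the definition): Then,
$$\|\widehat{fv}\|_{\hat H^{0,\delta}_2} = \|\hat f * \hat v\|_{\hat H^{0,\delta}_2} \le C\|f\|_{C^2_{b,\delta}}\|\hat v\|_{\hat H^{0,\delta}_2}, \tag{5.8}$$
$$\|(\hat f * \hat v)(\cdot - i\beta,\cdot)\|_{\hat H^{0,\delta}_2} \le C\|f\|_{C^2_{b,\delta}}\|\hat v(\cdot - i\beta,\cdot)\|_{\hat H^{0,\delta}_2}. \tag{5.9}$$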

Thus, apart from notational differences, we can work in the Bloch spaces with much the same bounds as in the spaces used for the model problem of the previous sections.

6. The Linearized Problem

We discuss here again the behavior of the linearized problem as in Sect. 3, but now for the Swift–Hohenberg equation. The discussion will again be split in an aspect behind the front and one ahead of the front. In Sect. 3, the behavior of the problem in the bulk behind the traveling front was diffusive by construction, and the only difficulty was to understand the rôle of the decay of $a$ to 0 (as $e^{-\beta|x|}$) as $x \to -\infty$. For the problem of the Swift–Hohenberg equation, the situation is similar, leading again to diffusive behavior. However, this observation is not obvious. Therefore, the first problem consists in showing the diffusive behavior. In order to obtain optimal results for the analysis ahead of the front, i.e., for the variable in the weighted representation, we use our approximate knowledge of the shape of the front.

6.1. The unweighted representation. In analogy with the simplified example, the linearized problem would be now
$$\partial_t v = Mv + M_i v, \tag{6.1}$$
where $M$ and $M_i$ have been defined in Eqs. (1.7) and (1.8). By the analysis for the model problem we expect that the term $M_i v$ will be irrelevant for the dynamics in the bulk with some exponential rate. Therefore, it will be considered in the sequel together with the non-linear terms. As a consequence, the linear equation dominating the behavior behind the front is given by
$$\partial_t v = Mv. \tag{6.2}$$
We recall those features of the proof of diffusive stability of [Schn96, Schn98] which are relevant to the study of (6.2). In order to do this, we need to localize the spectrum of $M$. Since this is well documented, we just summarize the results. As the linearized problem has periodic coefficients, the operator $\hat M = \mathcal{T}M\mathcal{T}^{-1}$ equals a direct integral $\int^{\oplus} d\ell\, M_\ell$, where each $M_\ell$ acts on the subspace with fixed quasi-momentum $\ell$ in $\hat H^{2,\delta}_2$. The eigenfunctions of $M_\ell$ are given by Bloch waves of the form $e^{i\ell x}w_{\ell,n}$ with $2\pi$-periodic $w_{\ell,n}$. The index $n \in \mathbb{N}$ counts various eigenvalues for fixed $\ell$. For each $\ell \in \mathbb{R}$ (or rather in the Brillouin zone $[-\tfrac12,\tfrac12]$) they are solutions of the eigenvalue equation
$$(M_\ell w_\ell)(x) \equiv -\bigl(1 + (i\ell + \partial_x)^2\bigr)^2w_\ell(x) + \varepsilon^2w_\ell(x) - 3U_*^2(x)w_\ell(x) = \mu_\ell w_\ell(x).$$
The spectrum takes the familiar form of a curve $\mu_1(\ell)$ with an expansion
$$\mu_1(\ell) = -c_1\ell^2 + O(\ell^3),$$


and $c_1 > 0$ and the remainder of the spectrum negative and bounded away from 0. The eigenfunction associated with $\mu_1(0)$ is $\partial_xU_*(x)$, reflecting the translation invariance of the original problem (1.1). There is an $\ell_0 > 0$ such that for fixed $\ell \in (-\ell_0,\ell_0)$ the eigenfunction $\varphi_\ell(x) = w_{\ell,1}(x)$ of the main branch $\mu_1(\ell)$ is well defined (and a continuation of $\partial_xU_*(x)$) as $\ell$ is varied away from 0. Corresponding to this we define the central projections $\hat P_c(\ell)$ by
$$\hat P_c(\ell)f = \langle\bar\varphi_\ell, f\rangle\varphi_\ell,$$
where $\langle\cdot,\cdot\rangle$ is the scalar product in $L^2([0,2\pi])$ and $\bar\varphi_\ell$ the associated eigenfunction of the adjoint problem. We will need a smooth version of the projection in $\hat H^{2,\delta}_2$. We fix once and for all a non-negative smooth cutoff function $\chi$ with support in $[-\ell_0/2,\ell_0/2]$ which equals 1 on $[-\ell_0/4,\ell_0/4]$. Then we define the operators $\hat E_c$ and $\hat E_s$ by:
$$\hat E_c(\ell) = \chi(\ell)\hat P_c(\ell), \qquad \hat E_s(\ell) = 1(\ell) - \hat E_c(\ell).$$
It will be useful to define auxiliary “mode filters” $\hat E^h_c$ and $\hat E^h_s$ by
$$\hat E^h_c(\ell) = \chi(\ell/2)\hat P_c(\ell), \qquad \hat E^h_s(\ell) = 1(\ell) - \chi(2\ell)\hat P_c(\ell).$$
These definitions are made in such a way that
$$\hat E^h_c\hat E_c = \hat E_c, \qquad \hat E^h_s\hat E_s = \hat E_s,$$
which will be used to replace the (missing) projection property of $\hat E_c$ and $\hat E_s$. We next extend the definitions (4.3) of Sect. 4 to the Bloch spaces. To avoid cumbersome notation, we shall use mostly the same symbols as in that section. Thus, with $\sigma < 1$ as before, we let now
$$(\mathcal{L}\hat u)(\kappa,x) = \hat u(\sigma\kappa,x).$$
Note that here, and elsewhere, the scaling does not act on the $x$ variable, only on the quasi-momentum $\kappa$. The novelty of renormalization in Bloch space here is that since the integration region over the $\ell$ variable is finite it will change with the scaling. Therefore, we introduce (for fixed $\delta > 0$),
$$K_{\sigma,\rho} = \{\hat u \mid \|\hat u\|_{K_{\sigma,\rho}} < \infty\}, \tag{6.3}$$
where
$$\|\hat u\|^2_{K_{\sigma,\rho}} \equiv \sum_{n,n'=0}^{2}\int_{-1/(2\sigma)}^{1/(2\sigma)} d\ell\int_0^{2\pi} dx\, \delta^{2(n+n')}|\partial_\ell^n\partial_x^{n'}\hat u(\ell,x)|^2(1+\ell^2)^{2\rho}.$$
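The mode-filter identities $\hat E^h_c\hat E_c = \hat E_c$ and $\hat E^h_s\hat E_s = \hat E_s$ stated above rely only on the supports of $\chi(\ell)$, $\chi(\ell/2)$, $\chi(2\ell)$ and on $\hat P_c^2 = \hat P_c$. The following toy model is a sketch under simplifying assumptions that are not in the text: $\hat P_c$ is replaced by a fixed (non-orthogonal) rank-one projection on $\mathbb{R}^2$, independent of $\ell$, and $\chi$ is one particular smooth cutoff.

```python
import math

def bump(ell, ell0):
    """Smooth cutoff: 1 for |ell| <= ell0/4, 0 for |ell| >= ell0/2, values in [0, 1]."""
    a, b = ell0 / 4.0, ell0 / 2.0
    r = abs(ell)
    if r <= a:
        return 1.0
    if r >= b:
        return 0.0
    s = (r - a) / (b - a)
    f = lambda y: math.exp(-1.0 / y) if y > 0 else 0.0
    return f(1.0 - s) / (f(1.0 - s) + f(s))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def scaled(weight, M):
    return [[weight * M[i][j] for j in range(2)] for i in range(2)]

def complement(M):
    return [[(1.0 if i == j else 0.0) - M[i][j] for j in range(2)] for i in range(2)]

# Toy projection P f = <w, f> v with <w, v> = 1, mimicking <phi_bar, f> phi.
V, W = (1.0, 2.0), (3.0, -1.0)
P = [[V[i] * W[j] for j in range(2)] for i in range(2)]

def max_defect(ell, ell0):
    """Largest entry of E_c^h E_c - E_c and E_s^h E_s - E_s at quasi-momentum ell."""
    E_c = scaled(bump(ell, ell0), P)
    E_s = complement(E_c)
    E_ch = scaled(bump(ell / 2.0, ell0), P)
    E_sh = complement(scaled(bump(2.0 * ell, ell0), P))
    d_c, d_s = matmul(E_ch, E_c), matmul(E_sh, E_s)
    return max(max(abs(d_c[i][j] - E_c[i][j]), abs(d_s[i][j] - E_s[i][j]))
               for i in range(2) for j in range(2))
```

In this model `max_defect` vanishes up to rounding for every `ell`, whereas `E_c` itself is not a projection where `0 < bump(ell, ell0) < 1`; this is the “missing” projection property that the filters compensate for.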

For technical reasons we introduced a weight in the Bloch variable $\ell$. It turns out that an appropriate choice for the critical part is $K^c_\sigma = K_{\sigma,3/2}$ and for the stable part $K^s_\sigma = K_{\sigma,1}$. Note that $\mathcal{T}$, as defined in (5.2) is an isomorphism between the space $H^{2}_{2,\delta}$ and the space $K_{\sigma,\rho}$ by (5.3) and the definition (6.3). As before we have
$$\|\mathcal{L}\hat f\|_{K_{\sigma^n,\rho}} \le \sigma^{-5/2-2\rho}\|\hat f\|_{K_{\sigma^{n-1},\rho}}, \tag{6.4}$$


for $0 < \sigma \le 1$, where the additional factor $\sigma^{-2\rho}$ is due to the weight in the $\ell$-variable. Moreover, as before, we will not scale the weighted variable and so we fix
$$K_w = \hat H^{0,\delta}_2.$$
Consider again the eigenfunctions $\varphi_\ell(x)$. The function
$$\hat v_t(\ell,x) = e^{\mu_1(\ell)t}\varphi_\ell(x)$$
solves the equation
$$\partial_t\hat v_t(\ell,\cdot) = M_\ell(\hat v_t(\ell,\cdot)).$$
Because of the nature of the spectrum $\mu_1(\ell)$, this solution satisfies
$$\hat v_t(\ell t^{-1/2},x) = e^{-c_1\ell^2}\hat v_0(0,x) + O(t^{-1/2}).$$
Using this observation and the fact that the $\hat E_s$-part is exponentially damped, the result will be

Proposition 6.1. The solution $\hat V_t$ of the problem (6.2) with initial data $\hat V_0$ satisfies:
$$\bigl\|(\ell,x) \mapsto \hat V_t(\ell t^{-1/2},x) - e^{-c_1\ell^2}\hat P_c(0)\hat V_0(0,x)\bigr\|_{K_{1/\sqrt t,1}} \le \frac{C}{t^{1/2}}\|\hat V_0\|_{\hat H^{2,\delta}_2}, \tag{6.5}$$
for a constant $C > 0$ and all $t \ge 1$. Moreover, there is a constant $\gamma_- > 0$ such that
$$\bigl\|(\ell,x) \mapsto \hat E_s\hat V_t(\ell t^{-1/2},x)\bigr\|_{K_{1/\sqrt t,1}} \le Ce^{-\gamma_-t}\|\hat V_0\|_{\hat H^{2,\delta}_2}, \tag{6.6}$$
for all $t \ge 1$.

6.2. The weighted representation. The weighted representation will be obtained by translating the effect of the transformation $W_{\beta,\hat c t}$ defined in (2.12) to the language of the Bloch waves. In accordance with our notational conventions, we set
$$\hat W_{\beta,\hat c t} = \mathcal{T}W_{\beta,\hat c t}\mathcal{T}^{-1},$$
and we get now, in analogy to (2.13),
$$(\hat W_{\beta,\hat c t}\hat f)(\ell,x) = e^{i\hat c(\ell+i\beta)t}\hat f(\ell+i\beta, x+\hat c t).$$
Equation (6.1), expressed in terms of $\hat W_{\beta,\hat c t}\hat v$, then takes the form
$$\partial_t(\hat W_{\beta,\hat c t}\hat v) = \hat M_{\beta,\hat c t}(\hat W_{\beta,\hat c t}\hat v) + \hat M_{i,\beta,\hat c t}(\hat W_{\beta,\hat c t}\hat v), \tag{6.7}$$
with
$$\begin{aligned}
(\hat M_{\beta,\hat c t}\hat f)(\ell,x) &= (\hat L_{i\beta}\hat f)(\ell,x) - 3U_*^2(x+\hat c t)\hat f(\ell,x) + \hat c\,(i(\ell+i\beta) + \partial_x)\hat f(\ell,x), \\
(\hat M_{i,\beta,\hat c t}\hat f)(\ell,x) &= -6U_*(x+\hat c t)\bigl(T_{-\hat c t}\hat K_{ct} * \hat f\bigr)(\ell,x) - 3\bigl(T_{-\hat c t}\hat K_{ct} * T_{-\hat c t}\hat K_{ct} * \hat f\bigr)(\ell,x).
\end{aligned}$$
Some explanations are in order: $\hat L_{i\beta}$ is the operator $-(1 + (\partial_x + i\ell - \beta)^2)^2 + \varepsilon^2$. The functions $U_*$ are just multiplications in the Bloch representation because they are periodic. More precisely, one has $\hat U_*(\ell,x) = U_*(x)\delta(\ell)$ in the sense of distributions. The functions $\hat K_{ct}$ are derived from $K_{ct}$ of Eq. (1.6) and are seen to be given by
$$\hat K_{ct}(\ell,x) \equiv (\mathcal{T}K_{ct})(\ell,x) = e^{-i\ell ct}\hat F_c(\ell, x-ct, x) - U_*(x)\delta(\ell),$$
where the Bloch transform is taken in the first (non-periodic) variable of $F_c$.

In order to obtain optimal results for the analysis ahead of the front, i.e., for the variable in the weighted representation, we recall some facts from the construction [CE86, EW91] of the fronts. For small $\varepsilon > 0$ the bifurcating solutions $u$ of the Swift–Hohenberg equation can be approximated by
$$\tilde\psi(x,t,\varepsilon) = \varepsilon A(\varepsilon x,\varepsilon^2t)e^{ix} + \text{c.c.},$$
up to an error $O(\varepsilon^2)$, where $A$ satisfies the Ginzburg–Landau equation
$$\partial_TA = 4\partial_X^2A + A - 3A|A|^2,$$
with $X \in \mathbb{R}$, $T \ge 0$ and $A(X,T) \in \mathbb{C}$. See [CE90b, vH91, KSM92, Schn94]. This equation possesses a real-valued front $A_f(X,T) = B(X - c_BT)$, where $\xi \mapsto B(\xi)$ satisfies the ordinary differential equation
$$4B'' + c_BB' + B - 3B|B|^2 = 0.$$
For $|c_B| \ge 4$ the real-valued fronts of this equation are monotonic. These fronts and the trivial solution $A = 0$ can be stabilized by introducing a weight $e^{\beta_Ax}$ satisfying the stability condition
$$D_A(c_B,\beta_A) = 4\beta_A^2 - \beta_Ac_B + 1 < 0,$$
see [BK92].

Remark. Since $B(\xi)$ converges at a faster rate to $1/\sqrt3$ for $\xi \to -\infty$ than to 0 for $\xi \to \infty$ there will be no additional restriction such as (3.3) on $\beta_A$.

Remark. Our result will be optimal in the sense that each modulated front $F_c$ which corresponds to a front of the associated amplitude equation satisfying $D_A(c_B,\beta_A) < 0$ is stable. The connection between the quantities of the Ginzburg–Landau equation and the associated Swift–Hohenberg equation is as follows. We have $c = \varepsilon c_B + O(\varepsilon^2)$, and $\beta = \varepsilon\beta_A + O(\varepsilon^2)$.

In order to prove this remark we write the modulated front $F_c$ as defined in (1.2) as a sum of the Ginzburg–Landau part and a remainder
$$F_c(\xi,x) = 2\varepsilon B(\varepsilon\xi)\cos(x) + \varepsilon^2F_r(\xi,x),$$
where $F_r$ satisfies
$$\sup_{y\in\mathbb{R}}\|F_r(\cdot+y,\cdot)\|_{C^2_{b,\delta}} \le C,$$
for a constant $C$ independent of $\varepsilon \in (0,1)$ and $\delta \in (0,1)$. Then we consider (6.7) which we write without decomposition as
$$\partial_t\hat W = \hat L_{i\beta}\hat W - 3\,T_{-\hat c t}(\widehat{\tau_{ct}F_c}) * T_{-\hat c t}(\widehat{\tau_{ct}F_c}) * \hat W + \hat c\,(i(\ell+i\beta) + \partial_x)\hat W. \tag{6.8}$$


In order to control these solutions we use that the linearized system (6.7) evolves in such a way that during times of order $O(1/\varepsilon^2)$ it can be approximated by the associated linearized Ginzburg–Landau equation
$$\partial_\tau A = 4(\partial_X - \beta_A)^2A + c_B(\partial_X - \beta_A)A + A - 3B^2(2A + \bar A). \tag{6.9}$$

Theorem 6.2. For all $C_0 > 0$, and $\tau_1 > 0$ there exist positive constants $\varepsilon_0$, $C_1$, $C_2$, and $\tau_0$ such that for all $\varepsilon \in (0,\varepsilon_0]$ the following is true: For all initial conditions $\hat W_0$ with $\|\hat W_0\|_{\hat H^{0,\delta}_2} \le C_0\varepsilon$ there are a solution $\hat W_t$ of (6.8) and a solution $A_\tau$ of (6.9) with $\|A_0\|_{\tilde H^{0,\delta}_2} \le C_1$ such that the function $A_\tau$ approximates $\hat W_t$ in the sense that
$$\bigl\|\hat W_t - \varepsilon\mathcal{T}\bigl(A_{\varepsilon^2t-\tau_0}(x)e^{ix} + \text{c.c.}\bigr)\bigr\|_{\hat H^{0,\delta}_2} \le C_2\varepsilon^2,$$
for all $t \in [\tau_0/\varepsilon^2, (\tau_0+\tau_1)/\varepsilon^2]$. Here $\mathcal{T}$ again denotes the map of Eq. (5.2) from a function $f$ of $x$ to its Bloch representation $\hat f(\ell,x)$.

Proof. The proof of this is very similar to the case of the (non-linear) Swift–Hohenberg equation which was discussed in the literature [CE90b, vH91, KSM92, Schn94]. Our (linear) problem is in fact easier and the proof is left to the reader.

For the system (6.9) we have the estimate [BK92]
$$\|A_\tau\|_{H^{2}_{0,\delta}} \le Ce^{D_A(c_B,\beta_A,\delta)\tau}\|A_0\|_{H^{2}_{0,\delta}},$$
with $\lim_{\delta\to0}D_A(c_B,\beta_A,\delta) = D_A(c_B,\beta_A)$. The deviation of $D_A(c_B,\beta_A,\delta)$ from $D_A(c_B,\beta_A)$ comes again from the derivatives of $B$. As a consequence of this estimate and of Theorem 6.2 we conclude that
$$\|\hat W_t\|_{\hat H^{0,\delta}_2} \le Ce^{D(c,\beta,\varepsilon,\delta)(t-t')}\|\hat W_{t'}\|_{\hat H^{0,\delta}_2}, \tag{6.10}$$
for a constant $C$ and a coefficient $D = D(c,\beta,\varepsilon,\delta)$. We can (and will) choose this constant $D$ in such a way that (for $\varepsilon \to 0$):
$$D(c,\beta,\varepsilon,\delta) = \varepsilon^2\bigl(D_A(c_B,\beta_A,\delta) + o(1)\bigr). \tag{6.11}$$
We define $D(c,\beta,\varepsilon) = \lim_{\delta\to0}D(c,\beta,\varepsilon,\delta)$.

Remark. The choice of a sufficiently small $\delta > 0$ and $\varepsilon > 0$ will allow us to prove the stability of all fronts which are predicted to be stable by the associated amplitude equation since $\lim_{(\varepsilon,\delta)\to0}\varepsilon^{-2}D(c,\beta,\varepsilon,\delta) = D_A(c_B,\beta_A)$.

In the following we consider a modulated front with velocity $c$ and a given (sufficiently small) bifurcation parameter $\varepsilon > 0$ for which there are a $\beta$ and a $\hat c \in (0,c)$ which satisfy:
$$D(\hat c,\beta,\varepsilon) = -2\gamma < 0. \tag{6.12}$$


Proposition 6.3. Suppose that the above stability condition (6.12) is satisfied. Then there is a $\delta \in (0,1]$ such that: There is a $C < \infty$ for which the functions $\hat W_t = \mathcal W_{\beta,\hat c t}\hat V_t$ obey the bounds

$$\|\hat W_t\|_{\hat H^2_{0,\delta}} \le C e^{-3\gamma(t-s)/2}\,\|\hat W_s\|_{\hat H^2_{0,\delta}}. \tag{6.13}$$

As in the previous sections this result will have to be improved for the non-linear problem. Therefore, we skip the proof at this point, and will only deal with the improved version later. Thus, the linear problems (6.2) and (6.7) are the analogs of (3.9) and (3.10) and can be studied pretty much as in the case of the simplified problem, yielding inequalities similar to (3.6) and (3.7).

7. The Renormalization Process for the Full Problem

We assume throughout this section that the stability condition (6.12) is satisfied. We prove here our main

Theorem 7.1. There are a $\delta > 0$ and positive constants $R$ and $C$ such that the following holds: Assume $\|v_0\|_{H^2_{2,\delta}} + \|M_\beta v_0\|_{H^2_{2,\delta}} \le R$ and denote by $v_t$ the solution of (1.4) with initial condition $v_0$. Let $\tilde\psi(\ell) = \exp(-c_1\ell^2)$. There is a constant $A_* = A_*(v_0)$ such that the rescaled solution $\hat v^r_t(\ell,x) = \hat v_t(\ell t^{-1/2},x)$ satisfies

$$\|\hat v^r_t - A_*\tilde\psi\,\partial_x U_*\|_{\mathcal K_{1/\sqrt t,1}} \le \frac{CR}{(t+1)^{1/4}}. \tag{7.1}$$

Furthermore,

$$\|\hat w_t\|_{\mathcal K^w} = \|\mathcal W_{\beta,\hat c t}\hat v_t\|_{\mathcal K^w} \le CR\,e^{-\gamma t}. \tag{7.2}$$

Remarks.

• The inequality (7.1) really says that the difference

$$\hat v_t(\ell t^{-1/2},x) - A_* e^{-c_1\ell^2}\,\partial_x U_*(x)$$

is small, where $U_*$ is the periodic solution (see Eq. (1.3)) of the Swift–Hohenberg equation. Expressed in the laboratory frame, this means that an initial perturbation $v_0(x)$ will go to $0$ like

$$v_t(x) \approx A_*(v_0)\sqrt{\frac{\pi}{c_1 t}}\,\exp\Big(\frac{-x^2}{4c_1 t}\Big)\,\partial_x U_*(x),$$

when $t \to \infty$, uniformly for $x \in \mathbb R$. See [Schn96]. In particular, this means that near the extrema of $U_*$ the convergence is faster than $O(t^{-1/2})$ since at those points $\partial_x U_*$ vanishes.

• The inequality (7.2) gives a more precise bound on the growth of a perturbation ahead of the front, because it says that this perturbation decays exponentially in the weighted norm. More explicitly, we have at least a bound

$$|v_t(x + ct)| \le C e^{\beta x - \gamma' t},$$

with $\gamma'$ slightly smaller than $\gamma$.


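• The decay $(t+1)^{-1/4}$ in (7.1) can be improved easily to $(t+1)^{-1/2+\vartheta}$ for any $\vartheta > 0$. We have chosen $\vartheta = 1/4$ to keep the notation at a reasonable level.

Proof. As we explained before, the proof is similar to the one in Sect. 3 except that now the function behind the front is split into a diffusive part $\hat v_c$ and into an exponentially damped part $\hat v_s$, and correspondingly there will be a few more equations. In Bloch space the initial conditions satisfy $\|\hat v_0\|_{\hat H^2_{2,\delta}} + \|\hat v_0(\cdot + i\beta,\cdot)\|_{\hat H^2_{0,\delta}} \le R$. The system for the variables $\hat v_c$ and $\hat v_s$ with initial conditions $\hat v_c|_{t=0} = \hat E_c\hat v|_{t=0}$, $\hat v_s|_{t=0} = \hat E_s\hat v|_{t=0}$, and for the variable $\hat w = \mathcal W_{\beta,\hat c t}\hat v$ with initial conditions $\hat w|_{t=0} = \mathcal W_{\beta,0}\hat v|_{t=0}$ is given in Bloch space by

$$\begin{aligned}
\partial_t\hat v_c &= \hat M\hat v_c + \hat E_c\hat H(\hat v_c,\hat v_s) + \hat E_c\hat N(\hat v_c,\hat v_s),\\
\partial_t\hat v_s &= \hat M\hat v_s + \hat E_s\hat H(\hat v_c,\hat v_s) + \hat E_s\hat N(\hat v_c,\hat v_s),\\
\partial_t\hat w &= \hat M_w\hat w + \hat N_w(\hat v_c,\hat v_s,\hat w),
\end{aligned} \tag{7.3}$$

where, see (1.8) and (6.7), with $\hat v = \hat v_c + \hat v_s$,

$$\begin{aligned}
\hat M &= \mathcal T M\mathcal T^{-1},\\
\hat H(\hat v_c,\hat v_s) &= \mathcal T M_i\mathcal T^{-1}\hat v + \mathcal T N_i(\mathcal T^{-1}\hat v),\\
\hat N(\hat v_c,\hat v_s) &= \mathcal T N(\mathcal T^{-1}\hat v),\\
\hat M_w &= \hat M_{\beta,\hat c t} + \hat M_{i,\beta,\hat c t},\\
\hat N_w(\hat v_c,\hat v_s,\hat w) &= -3\,T_{-\hat c t}U_*\cdot T_{-\hat c t}\hat v * \hat w - 3\,T_{-\hat c t}\hat K_{ct} * T_{-\hat c t}\hat v * \hat w - T_{-\hat c t}\hat v * T_{-\hat c t}\hat v * \hat w.
\end{aligned}$$

It is useful to modify this system by introducing the coordinates $(\hat u_c,\hat u_s)$ by

$$\hat u_c = \hat v_c,\qquad \hat u_s = -\hat M^{-1}\hat E_s(3U_*\cdot\hat v_c * \hat v_c) + \hat v_s. \tag{7.4}$$

This coordinate transform takes care of the fact that asymptotically $\hat v_s$ can be expressed by $\hat v_c$. Under the scaling used below the new variable $\hat u_s$ converges to zero, while the old variable $\hat v_s$ converges to a nontrivial expression. Under this transform (7.3) becomes

$$\begin{aligned}
\partial_t\hat u_c &= \hat M\hat u_c + \hat N_{c,i}(\hat u_c,\hat u_s) + \hat N_c(\hat u_c,\hat u_s),\\
\partial_t\hat u_s &= \hat M\hat u_s + \hat N_{s,i}(\hat u_c,\hat u_s) + \hat N_s(\hat u_c,\hat u_s),\\
\partial_t\hat w &= \hat M_w\hat w + \hat N_w(\hat u_c,\hat u_s,\hat w),
\end{aligned}$$

where

$$\begin{aligned}
\hat N_{c,i}(\hat u_c,\hat u_s) &= \hat E_c\hat H\big(\hat u_c,\ \hat M^{-1}\hat E_s(3U_*\cdot\hat u_c * \hat u_c) + \hat u_s\big),\\
\hat N_{s,i}(\hat u_c,\hat u_s) &= \hat E_s\hat H\big(\hat u_c,\ \hat M^{-1}\hat E_s(3U_*\cdot\hat u_c * \hat u_c) + \hat u_s\big),\\
\hat N_c(\hat u_c,\hat u_s) &= \hat E_c\hat N\big(\hat u_c,\ \hat M^{-1}\hat E_s(3U_*\cdot\hat u_c * \hat u_c) + \hat u_s\big),\\
\hat N_s(\hat u_c,\hat u_s) &= \hat E_s\hat N\big(\hat u_c,\ \hat M^{-1}\hat E_s(3U_*\cdot\hat u_c * \hat u_c) + \hat u_s\big) - \partial_t\big[\hat M^{-1}\hat E_s(3U_*\cdot\hat u_c * \hat u_c)\big],\\
\hat N_w(\hat u_c,\hat u_s,\hat w) &= \hat N_w(\hat v_c,\hat v_s,\hat w).
\end{aligned} \tag{7.5}$$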
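The laboratory-frame profile quoted in the first remark after Theorem 7.1 comes from inverting the Gaussian $A_* e^{-c_1 t\ell^2}$ in the Bloch variable. The following is a minimal numerical sketch of that single step, assuming that the inverse transform reduces to a plain Fourier integral over the whole real line with unit normalisation (the paper's own normalisation of the Bloch transform is not restated here). The function names and the values of `c1` and `t` are hypothetical and chosen only for illustration.

```python
import math

def inverse_fourier_gaussian(x, c1, t, half_width=40.0, n=200_000):
    """Midpoint-rule value of the integral of e^{i l x} e^{-c1 t l^2} over l.

    The imaginary part is odd in l and integrates to zero, so only cos(l x) is kept.
    """
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        l = -half_width + (k + 0.5) * h
        total += math.cos(l * x) * math.exp(-c1 * t * l * l)
    return total * h

def closed_form(x, c1, t):
    """sqrt(pi / (c1 t)) * exp(-x^2 / (4 c1 t)), the profile quoted in the remark."""
    return math.sqrt(math.pi / (c1 * t)) * math.exp(-x * x / (4.0 * c1 * t))

c1, t = 0.7, 3.0   # hypothetical values, for illustration only
for x in (0.0, 1.0, 2.5):
    print(x, inverse_fourier_gaussian(x, c1, t), closed_form(x, c1, t))
```

The two printed columns should agree to quadrature accuracy, which is the origin of the prefactor $\sqrt{\pi/(c_1 t)}$ and of the $t^{-1/2}$ decay mentioned in the remark.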


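We follow the lines of Sect. 4 and start with the renormalization process by introducing the scalings

$$\begin{aligned}
\hat v_{c,n}(\kappa,x,\tau) &= \hat u_c(\sigma^n\kappa, x, \sigma^{-2n}\tau),\\
\hat v_{s,n}(\kappa,x,\tau) &= \sigma^{-3n/2}\,\hat u_s(\sigma^n\kappa, x, \sigma^{-2n}\tau),\\
\hat w_n(\kappa,x,\tau) &= e^{\gamma\sigma^{-2n}\tau}\,\hat w(\kappa, x, \sigma^{-2n}\tau).
\end{aligned}$$

(The 3rd argument is the time, and the function $w$ has here another meaning than in Sect. 4.) Note again that only the Bloch variable is rescaled, but $x$ is left untouched. As before the Bloch variable is not scaled in the weighted representation $\hat w$. Under these scalings the functions $\hat v_{s,n}$ and $\hat w_n$ still converge to $0$ as $n \to \infty$. The variation of constant formula yields now

$$\begin{aligned}
\hat v_{c,n}(\kappa,x,\tau) ={}& e^{\sigma^{-2n}\hat M_{c,n}(\tau-\sigma^2)}\,\hat v_{c,n-1}(\sigma\kappa,x,1)\\
&+ \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau-\tau')}\,\hat N_{c,i,n}(\hat v_{c,n},\hat v_{s,n})(\kappa,x,\tau')\\
&+ \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau-\tau')}\,\hat N_{c,n}(\hat v_{c,n},\hat v_{s,n})(\kappa,x,\tau'),
\end{aligned} \tag{7.6}$$

$$\begin{aligned}
\hat v_{s,n}(\kappa,x,\tau) ={}& e^{\sigma^{-2n}\hat M_{s,n}(\tau-\sigma^2)}\,\sigma^{-3/2}\,\hat v_{s,n-1}(\sigma\kappa,x,1)\\
&+ \sigma^{-7n/2}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{s,n}(\tau-\tau')}\,\hat N_{s,i,n}(\hat v_{c,n},\hat v_{s,n})(\kappa,x,\tau')\\
&+ \sigma^{-7n/2}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{s,n}(\tau-\tau')}\,\hat N_{s,n}(\hat v_{c,n},\hat v_{s,n})(\kappa,x,\tau'),
\end{aligned} \tag{7.7}$$

$$\hat w_n(\kappa,x,\tau) = S_n(\tau,\sigma^2)\,\hat w_{n-1}(\kappa,x,1) + \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, S_n(\tau,\tau')\,\hat N_{w,n}(\hat v_{c,n},\hat v_{s,n},\hat w_n)(\kappa,x,\tau'), \tag{7.8}$$

with

$$\begin{aligned}
\hat M_{c,n} &= \mathcal L^n\hat E^h_c\hat M\mathcal L^{-n},\\
\hat M_{s,n} &= \mathcal L^n\hat E^h_s\hat M\mathcal L^{-n},\\
\hat N_{c,i,n}(\hat v_{c,n},\hat v_{s,n}) &= \mathcal L^n\hat N_{c,i}(\mathcal L^{-n}\hat v_{c,n},\ \sigma^{3n/2}\mathcal L^{-n}\hat v_{s,n}),\\
\hat N_{s,i,n}(\hat v_{c,n},\hat v_{s,n}) &= \mathcal L^n\hat N_{s,i}(\mathcal L^{-n}\hat v_{c,n},\ \sigma^{3n/2}\mathcal L^{-n}\hat v_{s,n}),\\
\hat N_{c,n}(\hat v_{c,n},\hat v_{s,n}) &= \mathcal L^n\hat N_c(\mathcal L^{-n}\hat v_{c,n},\ \sigma^{3n/2}\mathcal L^{-n}\hat v_{s,n}),\\
\hat N_{s,n}(\hat v_{c,n},\hat v_{s,n}) &= \mathcal L^n\hat N_s(\mathcal L^{-n}\hat v_{c,n},\ \sigma^{3n/2}\mathcal L^{-n}\hat v_{s,n}),\\
\hat N_{w,n}(\hat v_{c,n},\hat v_{s,n},\hat w_n) &= \hat N_w(\mathcal L^{-n}\hat v_{c,n},\ \sigma^{3n/2}\mathcal L^{-n}\hat v_{s,n},\ \hat w_n),
\end{aligned}$$

where we recall the definition

$$(\mathcal L\hat f)(\ell,x) \equiv \hat f(\sigma\ell,x),$$

and where $S_n(\tau,\tau')$ is now the evolution operator associated with the equation

$$\partial_\tau\hat f_\tau = \sigma^{-2n}(\hat M_w + \gamma)\hat f_\tau. \tag{7.9}$$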


Again, the exponential scaling of $\hat w_n$ with respect to time does not affect the definition of $\hat N_w$ due to the fact that $\hat w_n$ only appears linearly. All this is quite analogous to the developments in Eqs. (4.10) and (4.11).

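7.1. The scaled linear evolution operators. First we bound the linear evolution operators generated by $\hat M_{c,n}$ and $\hat M_{s,n}$.

Lemma 7.2. For all $\rho_1 \ge \rho_2 \ge 0$ there exist $C_{\rho_1,\rho_2} > 0$ and $\gamma_- > 0$ such that for $1 \ge \tau > \tau' \ge \sigma^2$ and all $\sigma \in (0,1)$ one has

$$\begin{aligned}
\big\|e^{\sigma^{-2n}\hat M_{c,n}(\tau-\tau')}\,\mathcal L^n\hat E^h_c\mathcal L^{-n}\hat g\big\|_{\mathcal K_{\sigma^n,\rho_1}} &\le C(\tau-\tau')^{\rho_2-\rho_1}\,\|\hat g\|_{\mathcal K_{\sigma^n,\rho_2}},\\
\big\|e^{\sigma^{-2n}\hat M_{s,n}(\tau-\tau')}\,\mathcal L^n\hat E^h_s\mathcal L^{-n}\hat g\big\|_{\mathcal K_{\sigma^n,\rho_1}} &\le C e^{-\gamma_-\sigma^{-2n}(\tau-\tau')}(\tau-\tau')^{\rho_2-\rho_1}\,\|\hat g\|_{\mathcal K_{\sigma^n,\rho_2}},
\end{aligned}$$

for all $n \in \mathbb N$.

Proof. The first estimate follows directly from the fact that

$$\hat M_{c,n}(\ell)f = \mu_1(\ell)\hat P_c(\ell)f = -c_1\ell^2\hat P_c(\ell)f + O(\ell^3).$$

The second estimate follows from the fact that the real part of the spectrum of $\hat M_{s,n}(\ell)$ as a function of $\ell$ can be bounded from above by a strictly negative parabola.

Next, we bound $S_n(\tau,\tau')$ as defined through (7.9) and state the analog of Lemma 4.2.

Lemma 7.3. Suppose that the stability condition (6.12) is satisfied. Then there is a $\delta \in (0,1]$ and a $C > 0$ such that for $1 > \tau > \tau' \ge 0$ and all $\sigma \in (0,1]$ one has

$$\|S_n(\tau,\tau')\hat w\|_{\mathcal K^w} \le C e^{-\gamma\sigma^{-2n}(\tau-\tau')/2}\,\|\hat w\|_{\mathcal K^w}, \tag{7.10}$$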

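for all $n \in \mathbb N$.

The proof of Lemma 7.3 follows closely the one of Lemma 4.2 in Sect. 4.1. Therefore, it will be omitted here.

7.2. The scaled non-linear terms. Next we estimate the scaled non-linear terms in $\hat N_{c,n}$, $\hat N_{s,n}$, and $\hat N_{w,n}$. In order to estimate the time derivatives on the right-hand side of (7.5) coming from the coordinate transform (7.4) we need to choose $\hat v_{c,n}$ in the better space $\mathcal K^c_{\sigma^n} = \mathcal K_{\sigma^n,3/2}$ instead of only being in $\mathcal K_{\sigma^n,1}$.

Lemma 7.4. Suppose $\max\{\|\hat v_{c,n}\|_{\mathcal K^c_{\sigma^n}}, \|\hat v_{s,n}\|_{\mathcal K^s_{\sigma^n}}, \|\hat w_n\|_{\mathcal K^w}\} \le 1$. Then there exists a $C_1 > 0$ such that for all $\sigma \in (0,1]$ one has

$$\begin{aligned}
\|\hat N_{c,n}\|_{\mathcal K_{\sigma^n,3/4}} &\le C_1\sigma^{5n/2}\big(\|\hat v_{c,n}\|_{\mathcal K^c_{\sigma^n}} + \|\hat v_{s,n}\|_{\mathcal K^s_{\sigma^n}}\big)^2,\\
\|\hat N_{s,n}\|_{\mathcal K^s_{\sigma^n}} &\le C_1\sigma^{2n}\big(\|\hat v_{c,n}\|_{\mathcal K^c_{\sigma^n}} + \|\hat v_{s,n}\|_{\mathcal K^s_{\sigma^n}}\big)^2,\\
\|\hat N_{w,n}\|_{\mathcal K^w} &\le C_1\sigma^{n/2}\big(\|\hat v_{c,n}\|_{\mathcal K^c_{\sigma^n}} + \|\hat v_{s,n}\|_{\mathcal K^s_{\sigma^n}}\big)\,\|\hat w_n\|_{\mathcal K^w}.
\end{aligned}$$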


Proof. Throughout the proof we use

$$\big(\mathcal L(\hat f * \hat g)\big)(\kappa) = \sigma\,\big((\mathcal L\hat f) * (\mathcal L\hat g)\big)(\kappa). \tag{7.11}$$
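The scaling identity (7.11) is a change of variables in the convolution integral, and it is the source of the extra powers of $\sigma$ in the estimates that follow. Below is a minimal numerical sketch, assuming for simplicity that the convolution is taken over the whole real line and that $(\mathcal L f)(\ell) = f(\sigma\ell)$. The test functions `f`, `g` and the values of `sigma` and `kappa` are hypothetical.

```python
import math

def convolve_at(f, g, kappa, half_width=30.0, n=60_000):
    """Midpoint approximation of (f * g)(kappa) = integral of f(kappa - m) g(m) dm."""
    h = 2.0 * half_width / n
    return h * sum(f(kappa - m) * g(m)
                   for m in (-half_width + (k + 0.5) * h for k in range(n)))

def rescale(f, sigma):
    """Return L f, where (L f)(l) = f(sigma * l)."""
    return lambda l: f(sigma * l)

# Hypothetical rapidly decaying test functions.
f = lambda l: math.exp(-l * l)
g = lambda l: (1.0 + l) * math.exp(-2.0 * l * l)

sigma, kappa = 0.3, 0.8
lhs = convolve_at(f, g, sigma * kappa)                                   # (L(f*g))(kappa)
rhs = sigma * convolve_at(rescale(f, sigma), rescale(g, sigma), kappa)   # sigma ((Lf)*(Lg))(kappa)
print(lhs, rhs)
```

Each convolution of rescaled functions therefore contributes one factor of $\sigma$ (or $\sigma^n$ after $n$ rescalings), which is how quadratic terms gain a factor $\sigma^n$ in the proof.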

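i) We start with the estimates for $\hat N_{w,n}$. The most dangerous term in $\hat N_{w,n}$ coming from $\hat N_w(\hat v_c,\hat v_s,\hat w)$ is

$$3\,T_{-\hat c\sigma^{-2n}\tau}\hat K_{ct} * T_{-\hat c\sigma^{-2n}\tau}(\mathcal L^{-n}\hat v_{c,n}) * \hat w_n.$$

From $\|\mathcal L^{-n}\hat v_{c,n}\|_{\hat H^2_{0,\delta}} \le C\sigma^{n/2}\|\hat v_{c,n}\|_{\mathcal K^c_{\sigma^n}}$ and (5.6) we obtain

$$\begin{aligned}
\big\|T_{-\hat c\sigma^{-2n}\tau}\hat K_{ct} * T_{-\hat c\sigma^{-2n}\tau}(\mathcal L^{-n}\hat v_{c,n}) * \hat w_n\big\|_{\hat H^2_{0,\delta}}
&\le \|T_{-\hat c\sigma^{-2n}\tau}K_{ct}\|_{C^2_{b,\delta}}\,\big\|\mathcal L^{-n}(T_{-\hat c\sigma^{-n}\tau}\hat v_{c,n}) * \hat w_n\big\|_{\hat H^2_{0,\delta}}\\
&\le C\sigma^{n/2}\,\|\hat v_{c,n}\|_{\mathcal K^c_{\sigma^n}}\,\|\hat w_n\|_{\mathcal K^w}.
\end{aligned}$$

ii) We use (7.11) to obtain the estimates for $\hat N_{s,n}$. The only difficulty stems from the term

$$\partial_t\big[\hat M^{-1}\hat E_s(3U_*\cdot\hat u_c * \hat u_c)\big] = \hat M^{-1}\hat E_s(6U_*\cdot\hat u_c * \partial_t\hat u_c)$$

coming from the change of coordinates (7.4). This can be estimated in the required way by expressing $\partial_t\hat u_c$ by the right-hand side of (7.5), by using then the points ii.1)–ii.3) and the fact that we already have a factor $\sigma^n$ from $\hat u_c * \partial_t\hat u_c$, using again (7.11).

ii.1) The first bound for the terms on the right-hand side of (7.5) is

$$\|\hat M_{c,n}\hat v_{c,n}\|_{\mathcal K_{\sigma^n,1}} \le C\sigma^n\,\|\hat v_{c,n}\|_{\mathcal K_{\sigma^n,3/2}},$$

which follows from the form of $\mu_1(\ell)$ by using the following lemma.

Lemma 7.5. Let $\mu \in C^2_{per}([-1/2,1/2), C^2((0,2\pi),\mathbb C))$ with $\|\mu(\ell,\cdot)\|_{C^2((0,2\pi),\mathbb C)} \le C|\ell|^{2(\rho_1-\rho_2)}$ for a $\rho_1 \ge \rho_2 \ge 0$. Then, there exists a $C > 0$ such that for all $\sigma \in (0,1]$ we have

$$\|(\mathcal L_\sigma\mu)\hat u\|_{\mathcal K_{\sigma,\rho_2}} \le C\sigma^{2(\rho_1-\rho_2)}\,\|\mu\|_{C^2_{per}([-1/2,1/2),C^2((0,2\pi),\mathbb C))}\,\|\hat u\|_{\mathcal K_{\sigma,\rho_1}}. \tag{7.12}$$

Proof. This follows since

$$\sup_{\ell\in\mathbb R}\left|\frac{\ell^{2(\rho_1-\rho_2)}\,\sigma^{2(\rho_1-\rho_2)}}{(1+\ell^2)^{(\rho_1-\rho_2)}}\right| < C\sigma^{2(\rho_1-\rho_2)}.$$

ii.2) By Lemma 7.8 below the term $\hat N_{c,i,n}$ is exponentially small in terms of $\sigma$.

ii.3) From (7.11) we easily obtain

$$\|\hat N_{c,n}\|_{\mathcal K_{\sigma^n,1}} \le \sigma^n\big(\|\hat v_{c,n}\|_{\mathcal K^c_{\sigma^n}} + \|\hat v_{s,n}\|_{\mathcal K^s_{\sigma^n}}\big)^2.$$

iii) From [Schn96] we recall the estimates for the $\hat N_{c,n}$ part. Note that $\hat N_{c,n}$ can be written as

$$\hat N_{c,n} = \hat s_1 + \hat s_2 + \hat N_{c,n,r},$$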


where

$$\begin{aligned}
\hat s_1 &= -3\sigma^n\,\mathcal L^n\hat E_c\mathcal L^{-n}(U_*\cdot\hat v_{c,n} * \hat v_{c,n}),\\
\hat s_2 &= -6\sigma^{2n}\,\mathcal L^n\hat E_c\mathcal L^{-n}\big(U_*\cdot\hat v_{c,n} * (\hat M_{s,n})^{-1}(3U_*\cdot\hat v_{c,n} * \hat v_{c,n})\big) - \sigma^{2n}\,\mathcal L^n\hat E_c\mathcal L^{-n}(\hat v_{c,n} * \hat v_{c,n} * \hat v_{c,n}),\\
\|\hat N_{c,n,r}\|_{\mathcal K_{\sigma^n,1}} &= O\big(\sigma^{5n/2}(\|\hat v_{c,n}\|_{\mathcal K^c_{\sigma^n}} + \|\hat v_{s,n}\|_{\mathcal K^s_{\sigma^n}})^2\big).
\end{aligned}$$

The estimate for $\hat N_{c,n,r}$ follows easily by applying again (7.11). It remains to estimate $\hat s_1$ and $\hat s_2$. These estimates have been obtained in [Schn96]. For completeness we recall some of the arguments. Introducing $a_n(\ell) \in \mathbb C$ by $\hat v_{c,n}(\ell,x) = a_n(\ell)\varphi_{\sigma^n\ell}(x)$ shows that the terms $\hat s_1$ and $\hat s_2$ are of the form

$$\begin{aligned}
\hat s_2(\ell,x) &= \sigma^{2n}\int dm\int dk\; K_2\big(\sigma^n\ell, \sigma^n(\ell-m), \sigma^n(m-k), \sigma^n k\big)\, a_n(\ell-m)\,a_n(m-k)\,a_n(k)\,\varphi_{\sigma^n\ell}(x),\\
\hat s_1(\ell,x) &= \sigma^{n}\int dm\; K_1\big(\sigma^n\ell, \sigma^n(\ell-m), \sigma^n m\big)\, a_n(\ell-m)\,a_n(m)\,\varphi_{\sigma^n\ell}(x),
\end{aligned}$$

with $K_j : \mathbb R^{2+j} \to \mathbb C$ the kernel of an integral operator. The detailed expression for $K_1$ is given in (7.13) below. The case $n = m = k = \ell = 0$ corresponds to the spatially periodic case. In the spatially periodic case there exists a center manifold $G = \{u = U_{0,a} \mid a \in \mathbb R\}$, consisting of the spatially periodic fixed points related to each other by the translation invariance of the original Swift–Hohenberg equation. By a formal calculation it turns out that the flow of the one-dimensional center manifold $G$ is determined by the ordinary differential equation

$$\frac{d}{dt}a = 0\cdot a + K_1(0,0,0)\,a^2 + K_2(0,0,0,0)\,a^3 + O(a^4).$$

Since the center manifold consists of fixed points the flow $a = a(t)$ is trivial, i.e., $\frac{d}{dt}a = 0$. Consequently, we obtain $K_1(0,0,0) = K_2(0,0,0,0) = 0$. Therefore,

$$|K_2(\ell,\ell-m,m-k,k)| \le C\big(|\ell| + |\ell-m| + |m-k| + |k|\big),$$

and so (7.11) and (7.12) imply

$$\|\hat s_2\|_{\mathcal K_{\sigma^n,1}} \le C\sigma^{3n}\big(\|\hat v_{c,n}\|_{\mathcal K^c_{\sigma^n}}\big)^2.$$

Interestingly it turned out that the first derivatives of $K_1$ vanish as well. Since the eigenvalue problem $\hat M_\ell\varphi_\ell = \mu_1(\ell)\varphi_\ell$ is self-adjoint, the projection $\hat P_c(\ell)$ is orthogonal in $L^2(0,2\pi)$ and is given by $\hat P_c(\ell)u = \big(\int\bar\varphi_\ell(x)\,u(\ell,x)\,dx\big)\varphi_\ell(\cdot)$. Thus

$$K_1(\ell,\ell-m,m) = 3\int dx\;\bar\varphi_\ell(x)\,\varphi_{\ell-m}(x)\,\varphi_m(x)\,U(x). \tag{7.13}$$


Expanding $\varphi_\ell(x) = \partial_x U(x) + i\ell g(x) + O(\ell^2)$, with $g(x) \in \mathbb R$, yields

$$\begin{aligned}
K_1(\ell,\ell-m,m) = 3\int dx\,\Big[&(\partial_x U(x))^3 U(x) - i\ell\,g(x)(\partial_x U(x))^2 U(x) + i(\ell-m)\,g(x)(\partial_x U(x))^2 U(x)\\
&+ (\partial_x U(x))^2\,i m\,g(x)\,U(x) + O\big(\ell^2 + (\ell-m)^2 + m^2\big)\Big].
\end{aligned}$$

Note that $U$ is an even function, so $\partial_x U$ is odd, which proves again $K_1(0,0,0) = 0$. Since, in addition, the first order terms cancel we have

$$|K_1(\ell,\ell-m,m)| \le C\,|\ell^2 + (\ell-m)^2 + m^2|,$$

and so from (7.11) and (7.12),

$$\|\hat s_1\|_{\mathcal K_{\sigma^n,3/4}} \le C\sigma^{5n/2}\big(\|\hat v_{c,n}\|_{\mathcal K^c_{\sigma^n}}\big)^2.$$

Summing the estimates shows the assertion.
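The two cancellations used for $K_1$ (the parity argument at order zero and the vanishing of the first-order terms) can be checked on a toy example. The sketch below uses a hypothetical even $2\pi$-periodic profile `U` and a hypothetical real function `g`; neither is the actual periodic solution or the actual expansion coefficient of the Swift–Hohenberg problem. It only illustrates the structure of the expansion of (7.13).

```python
import math

def periodic_integral(f, n=4096):
    """Rectangle rule over one period [0, 2*pi); very accurate for smooth periodic f."""
    h = 2.0 * math.pi / n
    return h * sum(f(k * h) for k in range(n))

# Hypothetical even profile U, its derivative dU (odd), and a real function g.
U  = lambda x: math.cos(x) + 0.3 * math.cos(2.0 * x)
dU = lambda x: -math.sin(x) - 0.6 * math.sin(2.0 * x)
g  = lambda x: 0.5 + math.sin(x) ** 2

# Order zero: 3 * integral of (dU)^3 * U, an odd integrand over a full period.
zeroth = 3.0 * periodic_integral(lambda x: dU(x) ** 3 * U(x))

# First order: all three terms share the same integral J and differ only in
# their prefactors -i*l, +i*(l - m), +i*m coming from the three factors of phi.
J = periodic_integral(lambda x: g(x) * dU(x) ** 2 * U(x))

def first_order_coefficient(l, m):
    """Imaginary coefficient of the first-order part of K_1(l, l - m, m)."""
    return 3.0 * J * (-l + (l - m) + m)

print(zeroth, first_order_coefficient(0.2, -0.7))
```

Both printed numbers vanish up to rounding, for any choice of `g`, which is why $K_1$ is quadratically small near the origin.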

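7.3. Bounds on the integrals. Here we estimate the integrals in the variation of constant formula in terms of the following quantities.

Definition 7.6. For all $n$, we define

$$R^u_{cs,n} = \sup_{\tau\in[\sigma^2,1]}\|\hat v_{c,n}(\tau)\|_{\mathcal K^c_{\sigma^n}} + \sup_{\tau\in[\sigma^2,1]}\|\hat v_{s,n}(\tau)\|_{\mathcal K^s_{\sigma^n}},$$

and

$$R^w_n = \sup_{\tau\in[\sigma^2,1]}\|\hat w_n(\tau)\|_{\mathcal K^w}.$$

In the following two lemmas we estimate the integrals appearing in (7.6)–(7.8).

Lemma 7.7. Assume $R^u_{cs,n} + R^w_n \le 1$. Then for all $1 \ge \tau \ge \sigma^2$ and all $\sigma \in (0,1]$ one has

$$\begin{aligned}
\Big\|\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau-\tau')}\,\hat N_{c,n}(\hat v_{c,n},\hat v_{s,n})(\cdot,\cdot,\tau')\Big\|_{\mathcal K^c_{\sigma^n}} &\le C\sigma^{n/2}(R^u_{cs,n})^2,\\
\Big\|\sigma^{-7n/2}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{s,n}(\tau-\tau')}\,\hat N_{s,n}(\hat v_{c,n},\hat v_{s,n})(\cdot,\cdot,\tau')\Big\|_{\mathcal K^s_{\sigma^n}} &\le C\sigma^{n/2}(R^u_{cs,n})^2,\\
\Big\|\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, S_n(\tau,\tau')\,\hat N_{w,n}(\hat v_{c,n},\hat v_{s,n},\hat w_n)(\cdot,\cdot,\tau')\Big\|_{\mathcal K^w} &\le C\sigma^{n/2}R^u_{cs,n}R^w_n.
\end{aligned}$$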


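Proof. We first use Lemma 7.2 and Lemma 7.4. For the second integral in (7.6) we get a bound

$$\begin{aligned}
\sup_{\tau\in[\sigma^2,1]}\Big\|\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau-\tau')}\,\hat N_{c,n}(\hat v_{c,n},\hat v_{s,n})(\cdot,\cdot,\tau')\Big\|_{\mathcal K^c_{\sigma^n}}
&\le C\sigma^{-2n}(R^u_{cs,n})^2\,\sigma^{5n/2}\int_{\sigma^2}^{1} d\tau'\,(1-\tau')^{-3/4}\\
&\le C\sigma^{n/2}(R^u_{cs,n})^2.
\end{aligned}$$

For the second integral in (7.7) we find similarly

$$\begin{aligned}
\sup_{\tau\in[\sigma^2,1]}\Big\|\sigma^{-7n/2}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{s,n}(\tau-\tau')}\,\hat N_{s,n}(\hat v_{c,n},\hat v_{s,n})(\cdot,\cdot,\tau')\Big\|_{\mathcal K^s_{\sigma^n}}
&\le C(R^u_{cs,n})^2\,\sigma^{-3n/2}\int_{\sigma^2}^{1} d\tau'\, e^{-C\sigma^{-2n}(1-\tau')}\\
&\le C\sigma^{n/2}(R^u_{cs,n})^2.
\end{aligned}$$

For the integral in (7.8) we find, using now Lemma 7.3 and Lemma 7.4, a bound

$$C\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{-\gamma\sigma^{-2n}(\tau-\tau')/2}\,\big(\sigma^{n/2}R^u_{cs,n}R^w_n\big) \le C\sigma^{n/2}R^u_{cs,n}R^w_n.$$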
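The power counting in the proof of Lemma 7.7 rests on two elementary integrals: the integrable singularity $(1-\tau')^{-3/4}$ contributes a constant, and the exponentially damped integral contributes a factor $\sigma^{2n}$. The sketch below evaluates both in closed form and checks the resulting $\sigma^{n/2}$ behaviour. The function names and the generic damping constant `rate` are hypothetical stand-ins for the unspecified constants $C$ of the text.

```python
import math

def singular_integral(sigma):
    """Closed form of the integral of (1 - s)^(-3/4) over [sigma^2, 1]."""
    return 4.0 * (1.0 - sigma ** 2) ** 0.25

def damped_integral(sigma, n, rate):
    """Closed form of the integral of exp(-rate * sigma^(-2n) * (1 - s)) over [sigma^2, 1]."""
    k = rate * sigma ** (-2 * n)
    return (1.0 - math.exp(-k * (1.0 - sigma ** 2))) / k

def first_bound(sigma, n):
    """sigma^(-2n) * sigma^(5n/2) * singular integral: the (7.6) estimate without C (R)^2."""
    return sigma ** (-2 * n) * sigma ** (2.5 * n) * singular_integral(sigma)

def second_bound(sigma, n, rate):
    """sigma^(-7n/2) * sigma^(2n) * damped integral: the (7.7) estimate without C (R)^2."""
    return sigma ** (-3.5 * n) * sigma ** (2 * n) * damped_integral(sigma, n, rate)

for n in (1, 5, 10):
    sigma, rate = 0.5, 1.0
    print(n, first_bound(sigma, n), second_bound(sigma, n, rate), sigma ** (n / 2))
```

In both cases the printed bound is at most a fixed multiple of $\sigma^{n/2}$ (4 for the first, `1/rate` for the second), in line with the statement of the lemma.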

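Lemma 7.8. Assume $R^u_{cs,n} + R^w_n \le 1$. Then for all $1 \ge \tau \ge \sigma^2$ and all $\sigma \in (0,1)$ one has

$$\begin{aligned}
\Big\|\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau-\tau')}\,\hat N_{c,i,n}(\hat v_{c,n},\hat v_{s,n})(\cdot,\cdot,\tau')\Big\|_{\mathcal K^c_{\sigma^n}} &\le C e^{-(\beta(c-\hat c)+\gamma)\sigma^{-n}}\,R^w_n,\\
\Big\|\sigma^{-7n/2}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{s,n}(\tau-\tau')}\,\hat N_{s,i,n}(\hat v_{c,n},\hat v_{s,n})(\cdot,\cdot,\tau')\Big\|_{\mathcal K^s_{\sigma^n}} &\le C e^{-(\beta(c-\hat c)+\gamma)\sigma^{-n}}\,R^w_n.
\end{aligned}$$

Proof. We restrict ourselves to the linear part $M_i$. A typical term of (7.6), the first in the definition of $M_i$ in (1.8), can be rewritten as

$$\begin{aligned}
&\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau-\tau')}\,\mathcal L^n\big(\hat K_{c\sigma^{-2n}\tau'} * (\mathcal L^{-n}\hat v_{c,n}(\tau'))\big)(\kappa,x)\,U(x)\\
&\quad= \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau-\tau')}\,\mathcal L^n\big(\hat K_{c\sigma^{-2n}\tau'} * \hat u_{\sigma^{-2n}\tau'}\big)(\kappa,x)\,U(x).
\end{aligned}$$

Since $K_{ct}(x)$ vanishes as $x \to -\infty$ with some exponential rate, its Bloch wave transform $\hat K_{c\sigma^{-2n}\tau'}$ can be extended into a strip in the complex plane such that

$$\begin{aligned}
\big(\hat K_{c\sigma^{-2n}\tau'} * \hat u_{\sigma^{-2n}\tau'}\big)(\kappa,x) = \int d\ell\;&\hat K_{c\sigma^{-2n}\tau'}(\kappa-\ell-i\beta,x)\,\hat w_n(\ell,x,\sigma^{-2n}\tau')\,e^{-i\ell\hat c\sigma^{-2n}\tau'}\,e^{-\gamma\sigma^{-2n}\tau'}\\
&\times e^{-\beta(c-\hat c)\sigma^{-2n}\tau'}\,e^{i(\kappa-\ell)c\sigma^{-2n}\tau'}.
\end{aligned}$$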


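Using this identity, we get, as in (4.20) (because $\exp(\sigma^{-2n}\hat M_{c,n}(\tau-\tau'))$ is bounded), from $\|\mathcal L^n\hat u\|_{\mathcal K_{\sigma^n,3/2}} \le C\sigma^{-11n/2}\|\hat u\|_{\hat H^2_{2,\delta}}$ and (5.7), and recalling $\mathcal K^w = \hat H^2_{0,\delta}$:

$$\begin{aligned}
&\Big\|\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau-\tau')}\,\mathcal L^n\big(\hat K_{c\sigma^{-2n}\tau'} * \hat u_{n,\tau'}\big)\Big\|_{\mathcal K_{\sigma^n,3/2}}\\
&\quad\le C\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\,\big\|\mathcal L^n\big(\hat K_{c\sigma^{-2n}\tau'} * \hat u_{n,\tau'}\big)\big\|_{\mathcal K_{\sigma^n,3/2}}\\
&\quad\le C\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{-\beta(c-\hat c)\sigma^{-2n}\tau'}\,\sigma^{-11n/2}\,(1 + c\sigma^{-2n}\tau')^2\\
&\qquad\times\big\|(\kappa,x)\mapsto e^{-ic\sigma^{-2n}\tau'\kappa}\,\hat K_{c\sigma^{-2n}\tau'}(\kappa-i\beta,x)\big\|_{\hat H^2_{2,\delta}}\\
&\qquad\times\big\|(\kappa,x)\mapsto e^{-i\kappa\hat c\sigma^{-2n}\tau'}\,\hat w_{n,\tau'}(\kappa,x)\big\|_{\mathcal K^w}\,e^{-\gamma\sigma^{-2n}\tau'}\\
&\quad\le C\sigma^{-15n/2}\int_{\sigma^2}^{\tau} d\tau'\,(1 + c\sigma^{-2n}\tau')^2\,e^{-\beta(c-\hat c)\sigma^{-2n}\tau'}\,e^{-\gamma\sigma^{-2n}\tau'}\,R^w_n\\
&\quad\le C\sigma^{-23n/2}\,e^{-(\beta(c-\hat c)+\gamma)\sigma^{-2(n-1)}}\,R^w_n \le C e^{-(\beta(c-\hat c)+\gamma)\sigma^{-n}}\,R^w_n.
\end{aligned} \tag{7.14}$$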

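The non-linear terms coming from $N_i$ can be handled in exactly the same way and yield similar bounds. The same is true for the terms with $\hat N_{s,i,n}$ in (7.7).

7.4. Bounds on the initial condition. Here, we estimate the first terms on the right-hand side of the variation of constant formulae (7.6)–(7.8).

Lemma 7.9. For all $1 \ge \tau \ge \sigma^2$ and all $\sigma \in (0,1]$ we have

$$\begin{aligned}
\big\|e^{\sigma^{-2n}\hat M_{c,n}(\tau-\sigma^2)}\,\mathcal L^n\hat E^h_c\mathcal L^{-n}\mathcal L\hat g\big\|_{\mathcal K^c_{\sigma^n}} &\le C\sigma^{-11/2}\,\|\hat g\|_{\mathcal K^c_{\sigma^{n-1}}},\\
\big\|e^{\sigma^{-2n}\hat M_{s,n}(\tau-\sigma^2)}\,\sigma^{-3/2}\,\mathcal L^n\hat E^h_s\mathcal L^{-n}\mathcal L\hat g\big\|_{\mathcal K^s_{\sigma^n}} &\le C\sigma^{-6}\,e^{-C\sigma^{-2n}(\tau-\sigma^2)}\,\|\hat g\|_{\mathcal K^s_{\sigma^{n-1}}},\\
\|S_n(\tau,\sigma^2)\hat g\|_{\mathcal K^w} &\le C e^{-\gamma\sigma^{-2n}(\tau-\sigma^2)/2}\,\|\hat g\|_{\mathcal K^w}.
\end{aligned}$$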

Proof. The first two bounds of Lemma 7.9 follow immediately from Lemma 7.2 and (6.4). The third inequality is an immediate consequence of Lemma 7.3.

7.5. A priori bounds on the non-linear problem. This section follows closely Sect. 4.4. We need a priori bounds on the solution of (7.6)–(7.8). We (re)define now quantities analogous to those of Definition 4.3.

Definition 7.10. For all $n \in \mathbb N$, we define

$$\rho^u_{cs,n} = \big\|\hat v_{c,n}|_{\tau=1}\big\|_{\mathcal K^c_{\sigma^n}} + \big\|\hat v_{s,n}|_{\tau=1}\big\|_{\mathcal K^s_{\sigma^n}},$$

and

$$\rho^w_n = \big\|\hat w_n|_{\tau=1}\big\|_{\mathcal K^w}.$$

Lemma 7.11. For all $n \in \mathbb N$ there is a constant $\eta_n > 0$ such that the following holds: If $\rho^u_{cs,n-1}$, $\rho^w_{n-1}$, and $\sigma > 0$ are smaller than $\eta_n$, the solutions of (7.6)–(7.8) exist for all $\tau \in [\sigma^2,1]$. Moreover, we have the estimates

$$R^u_{cs,n} \le C\sigma^{-6}\rho^u_{cs,n-1} + C e^{-C\sigma^{-n}}R^w_n + C\sigma^{n/2}(R^u_{cs,n})^2, \tag{7.15}$$


and

$$R^w_n \le C\rho^w_{n-1} + C\sigma^{n/2}R^u_{cs,n}R^w_n, \tag{7.16}$$

with a constant $C$ independent of $\sigma$ and $n$.

Remark. We remark again that there is no need for a detailed expression for $\eta_n$ since the existence of the solutions is guaranteed if we can show $R^u_{cs,n} < \infty$ and $R^w_n < \infty$. By (7.15) and (7.16) we have detailed control of these quantities in terms of the norms of the initial conditions and $\sigma$.

Proof. For the derivation of the estimates we assume in the sequel, without loss of generality, that $R^u_{cs,n} + R^w_n \le 1$. For the first term in (7.8) we obtained in Lemma 7.9 a bound

$$C\rho^w_{n-1}. \tag{7.17}$$

For the second term in (7.8), we obtained in Lemma 7.7 a bound $C\sigma^{n/2}R^u_{cs,n}R^w_n$.

We now discuss in detail (7.7). Using Lemma 7.9 the first term is bounded by $C\sigma^{-6}\rho^u_{cs,n-1}$. Lemma 7.7 and Lemma 7.8 yield for the second and third terms a bound $C\sigma^{n/2}(R^u_{cs,n})^2 + C e^{-C\sigma^{-n}}R^w_n$ for a $C > 0$ independent of $\sigma \in (0,1]$ and $n \in \mathbb N$.

Finally, we come to the bounds for (7.6). Using Lemma 7.9 the first term is bounded by $C\sigma^{-11/2}\rho^u_{cs,n-1}$. Lemma 7.7 and Lemma 7.8 yield for the second and third terms a bound $C\sigma^{n/2}(R^u_{cs,n})^2 + C e^{-C\sigma^{-n}}R^w_n$ for a $C > 0$ independent of $\sigma \in (0,1]$ and $n \in \mathbb N$.

The proof of Lemma 7.11 now follows by applying the contraction mapping principle to the system consisting of (7.6), (7.7), and (7.8). Then for $\rho^u_{cs,n-1}$, $\rho^w_{n-1}$ and $\sigma > 0$ sufficiently small the Lipschitz constant on the right-hand side of (7.6) to (7.8) in $C([\sigma^2,1], \mathcal K^c_{\sigma^n}\times\mathcal K^s_{\sigma^n}\times\mathcal K^w)$ is smaller than 1. An application of a classical fixed point argument completes the proof of Lemma 7.11.

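7.6. The iteration process. As in the case of the simplified problem, we decompose the solution $\hat v_{c,n}(\cdot,\cdot,\tau)$ for $\tau = 1$ into a Gaussian part and a remainder. Let $\tilde\psi(\kappa) = e^{-c_1\kappa^2}$ and write

$$\hat v_{c,n}(\kappa,x,1) = A_n\tilde\psi(\kappa)\,\varphi_{\sigma^{-n}\kappa}(x) + \hat r_n(\kappa,x),$$

where $\hat r_n(0,x) = 0$, and the amplitude $A_n$ is in $\mathbb C$. We also define $\Pi : \mathcal K^c_{\sigma^n} \to \mathbb C$ by

$$(\Pi\hat f)\,\varphi_0 = \hat P_c(0)\hat f\big|_{\kappa=0}. \tag{7.18}$$

Then (7.6) can be decomposed accordingly and takes the form

$$A_n = A_{n-1} + \Pi\Big(\sigma^{-2n}\int_{\sigma^2}^{1} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(1-\tau')}\,(\hat N_{c,i,n} + \hat N_{c,n})\Big), \tag{7.19}$$

$$\begin{aligned}
\hat r_n(\kappa,x) ={}& e^{\sigma^{-2n}\hat M_{c,n}(1-\sigma^2)}\,\hat r_{n-1}(\sigma\kappa,x)\\
&+ \sigma^{-2n}\int_{\sigma^2}^{1} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(1-\tau')}\,(\hat N_{c,i,n} + \hat N_{c,n})(\kappa,x)\\
&+ e^{\sigma^{-2n}\hat M_{c,n}(1-\sigma^2)}\,A_{n-1}\tilde\psi(\sigma\kappa)\,\varphi_{\sigma^{-n}\kappa}(x) - A_n\tilde\psi(\kappa)\,\varphi_{\sigma^{-n}\kappa}(x).
\end{aligned} \tag{7.20}$$

If we define next $\rho^r_n = \|\hat r_n\|_{\mathcal K^c_{\sigma^n}} + \|\hat v_{s,n}|_{\tau=1}\|_{\mathcal K^s_{\sigma^n}}$ then the above construction implies

$$\rho^u_{cs,n} \le C(|A_n| + \rho^r_n).$$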


Our main estimate is now

Proposition 7.12. There is a constant $C > 0$ such that for sufficiently small $\sigma > 0$ the solution $(\hat v_{c,n},\hat v_{s,n},\hat w_n)$ of (7.6)–(7.8) satisfies for all $n \in \mathbb N$:

$$|A_n - A_{n-1}| \le C e^{-C\sigma^{-n}}R^w_n + C\sigma^{n/2}(R^u_{cs,n})^2, \tag{7.21}$$

$$\rho^r_n \le C\sigma\rho^r_{n-1} + C e^{-C\sigma^{-n}}R^w_n + C\sigma^{n/2}(R^u_{cs,n})^2 + C\sigma^n R^u_{cs,n}, \tag{7.22}$$

$$\rho^w_n \le C e^{-C\sigma^{-2n}}\rho^w_{n-1} + C\sigma^{n/2}R^u_{cs,n}R^w_n. \tag{7.23}$$

Proof. We begin by bounding the difference $A_n - A_{n-1}$ using (7.19). Since $\hat f$ is in $H^2$ as a function of $\ell$ we obviously have

$$|\Pi\hat f| \le C\|\hat f\|_{\mathcal K^c_{\sigma^n}}. \tag{7.24}$$

Thus, it suffices to bound the norm of the integral in (7.19), but this has already been done in the proof of Lemma 7.7 and Lemma 7.8.

We next bound $\hat r_n$ in terms of $\hat r_{n-1}$, using (7.20). The first term is the one where the projection is crucial: For $\sigma > 0$ sufficiently small and $\hat r_{n-1} \in \mathcal K^c_{\sigma^{n-1}}$ with $\hat r_{n-1}(0) = 0$ one has

$$\big\|(\kappa,x)\mapsto e^{\sigma^{-2n}\hat M_{c,n}(1-\sigma^2)}\,\hat r_{n-1}(\sigma\kappa,x)\big\|_{\mathcal K^c_{\sigma^n}} \le C\sigma\,\|\hat r_{n-1}\|_{\mathcal K^c_{\sigma^{n-1}}}, \tag{7.25}$$

as in the proof of Proposition 4.5. This leads for the first term in (7.20) to a bound (in $\mathcal K^c_{\sigma^n}$)

$$C\sigma\rho^r_{n-1}. \tag{7.26}$$

The second and third term have been bounded in the proof of Lemma 7.7 and Lemma 7.8 by

$$C e^{-C\sigma^{-n}}R^u_n + C\sigma^{n/2}(R^u_n)^2. \tag{7.27}$$

Finally, the last term

$$\hat X_n(\kappa,x) \equiv e^{\sigma^{-2n}\hat M_{c,n}(1-\sigma^2)}\,A_{n-1}\tilde\psi(\sigma\kappa)\,\varphi_{\sigma^{-n}\kappa}(x) - A_n\tilde\psi(\kappa)\,\varphi_{\sigma^{-n}\kappa}(x),$$

in (7.20) leads to a bound (in $\mathcal K^c_{\sigma^n}$):

$$\|\hat X_n\| \le C e^{-C\sigma^{-n}}R^w_{n-1} + C\sigma^{n/2}(R^u_{cs,n})^2 + C\sigma^n R^u_{cs,n}, \tag{7.28}$$

where the last term is due to $\mu_1(\ell) = -c_1\ell^2 + O(\ell^3)$ not being exactly a parabola. For details see [Schn96]. Collecting the bounds, the assertion (7.22) for $\hat r_n$ follows.

Finally, the bounds on $\rho^w_n$ follow in the same way as those in Lemma 7.11. The proof of Proposition 7.12 is complete.


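Proof of Theorem 7.1. As before the proof is just an induction argument, using repeatedly the above estimates. Again we write $C$ for constants which can be chosen independent of $\sigma$ and $n$. Assume that $R = \sup_{n\in\mathbb N}R^u_{cs,n} < \infty$ exists. From Lemma 7.11 we observe for $\sigma > 0$ sufficiently small,

$$\begin{aligned}
R^w_n &\le \frac{C\rho^w_{n-1}}{1 - C\sigma^{n/2}R} \le C\rho^w_{n-1},\\
R^u_{cs,n} &\le \frac{C\sigma^{-6}\rho^u_{cs,n-1} + C e^{-C\sigma^{-n}}R^w_n}{1 - C\sigma^{n/2}R} \le C\sigma^{-6}\rho^u_{cs,n-1} + C e^{-C\sigma^{-n}}\rho^w_{n-1},
\end{aligned} \tag{7.29}$$

with a constant $C$ which can be chosen independent of $R$. Using Proposition 7.12 we find

$$\begin{aligned}
|A_n - A_{n-1}| &\le C e^{-C\sigma^{-n}}\rho^w_{n-1} + C\sigma^{n/2}\sigma^{-6}\rho^u_{cs,n-1},\\
\rho^r_n &\le C\sigma\rho^r_{n-1} + C e^{-C\sigma^{-n}}\rho^w_{n-1} + C\sigma^{n/2}\sigma^{-6}\rho^u_{cs,n-1},\\
\rho^u_{cs,n} &\le C(|A_n| + \rho^r_n),\\
\rho^w_n &\le C e^{-C\sigma^{-2n}}\rho^w_{n-1} + C\sigma^{n/2}\rho^w_{n-1}.
\end{aligned} \tag{7.30}$$

Therefore, we can choose $\sigma > 0$ so small that for $n > 13$:

$$\begin{aligned}
|A_n - A_{n-1}| &\le \rho^w_{n-1}/10 + \sigma^{(n-13)/2}(|A_{n-1}| + \rho^r_{n-1}),\\
\rho^r_n &\le 3\rho^r_{n-1}/4 + \rho^w_{n-1}/10 + \sigma^{(n-13)/2}|A_n|,\\
\rho^w_n &\le \rho^w_{n-1}/10.
\end{aligned}$$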
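The three recursive inequalities for $n > 13$ can be iterated numerically to see why the amplitudes converge. The sketch below treats each inequality as an equality and lets every increment of $A_n$ have the same sign, which is a worst-case assumption and not something the proof itself uses. The starting values at $n = 13$ and the value of `sigma` are hypothetical.

```python
def iterate(sigma, steps, a=0.1, rho_r=0.1, rho_w=0.1, n0=14):
    """Iterate the worst-case version of the recursion for n = n0, ..., n0 + steps - 1.

    Returns the final (a, rho_r, rho_w) and the list of bounds on |A_n - A_{n-1}|.
    """
    increments = []
    for n in range(n0, n0 + steps):
        s = sigma ** ((n - 13) / 2)
        d = rho_w / 10 + s * (abs(a) + rho_r)   # bound on |A_n - A_{n-1}|
        a_new = a + d                           # worst case: increments never cancel
        rho_r = 0.75 * rho_r + rho_w / 10 + s * abs(a_new)
        rho_w = rho_w / 10
        a = a_new
        increments.append(d)
    return a, rho_r, rho_w, increments

a, rho_r, rho_w, increments = iterate(sigma=0.5, steps=80)
print(a, rho_r, rho_w, increments[-1])
```

With these inputs the increments eventually shrink by a fixed factor per step, so their sum is finite and $A_n$ has a limit, while the remainder quantities tend to zero.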

Thus, the sequence of $A_n$ converges geometrically to a finite limit $A_*$. Furthermore, we find that $\lim_{n\to\infty}\rho^r_n = 0$, and $\lim_{n\to\infty}\rho^w_n = 0$. Since the quantities $|A_n|$, $\rho^r_n$, $\rho^w_n$ increase only for at most 13 steps, the term $CR$ in (7.29) stays less than $1/2$ if we choose $|A_1|, \rho^r_1, \rho^w_1 = O(\sigma^m)$, for a sufficiently large $m > 0$. From (7.29) the existence of a finite constant $R = \sup_{n\in\mathbb N}R^u_{cs,n}$ follows. Going back to (7.30) we can choose $\sigma > 0$ so small that $|A_n - A_{n-1}| + \rho^r_n \le C\sigma^{n/2}$, which implies the associated convergence rate stated in Theorem 7.1. Finally, the scaling of $\hat w_n(\cdot,\cdot,\tau)$ implies the exponential decay of $\hat w(t)$. The proof of Theorem 7.1 is complete.

Acknowledgement. Guido Schneider would like to thank the Physics Department of the University of Geneva for its kind hospitality. Both authors would like to thank the referee for reading the paper very carefully and for very helpful comments. This work is partially supported by the Fonds National Suisse. The work of Guido Schneider is partially supported by the Deutsche Forschungsgemeinschaft DFG under the grant Mi459/2–3.

References

[AW78] Aronson, D.G., Weinberger, H.: Multidimensional nonlinear diffusion arising in population genetics. Adv. Math. 30, 33–76 (1978)
[BK92] Bricmont, J., Kupiainen, A.: Renormalization group and the Ginzburg–Landau equation. Commun. Math. Phys. 150, 193–208 (1992)
[BK94] Bricmont, J., Kupiainen, A.: Stability of moving fronts in the Ginzburg–Landau equation. Commun. Math. Phys. 159, 287–318 (1994)
[CE86] Collet, P., Eckmann, J.-P.: The existence of dendritic fronts. Commun. Math. Phys. 107, 39–92 (1986)
[CE87] Collet, P., Eckmann, J.-P.: The stability of modulated fronts. Helv. Phys. Acta 60, 969–991 (1987)
[CE90a] Collet, P., Eckmann, J.-P.: Instabilities and fronts in extended systems. Princeton: Princeton University Press, 1990
[CE90b] Collet, P., Eckmann, J.-P.: The time dependent amplitude equation for the Swift–Hohenberg problem. Commun. Math. Phys. 132, 139–153 (1990)
[CEE92] Collet, P., Eckmann, J.-P., Epstein, H.: Diffusive repair for the Ginsburg–Landau equation. Helv. Phys. Acta 65, 56–92 (1992)
[DL83] Dee, G., Langer, J.S.: Propagating pattern selection. Phys. Rev. Lett. 50, 383–386 (1983)
[Eck65] Eckhaus, W.: Studies in nonlinear stability theory. Springer Tracts in Nat. Phil. Vol. 6, Berlin–Heidelberg–New York: Springer, 1965
[EW91] Eckmann, J.-P., Wayne, C.E.: Propagating fronts and the center manifold theorem. Commun. Math. Phys. 136, 285–307 (1991)
[EW94] Eckmann, J.-P., Wayne, C.E.: The non-linear stability of front solutions for parabolic partial differential equations. Commun. Math. Phys. 161, 323–334 (1994)
[EWW97] Eckmann, J.-P., Wayne, C.E., Wittwer, P.: Geometric stability analysis of periodic solutions of the Swift–Hohenberg equation. Commun. Math. Phys. 190, 173–211 (1997)
[Ga94] Gallay, T.: Local stability of critical fronts in nonlinear parabolic partial differential equations. Nonlinearity 7, 741–764 (1994)
[HS99] Haragus, M., Schneider, G.: Bifurcating fronts for the Taylor–Couette problem in infinite cylinders. Zeitschrift für Angewandte Mathematik und Physik (ZAMP) 50, 120–151 (1999)
[KSM92] Kirrmann, P., Schneider, G., Mielke, A.: The validity of modulation equations for extended systems with cubic nonlinearities. Proceedings of the Royal Society of Edinburgh 122A, 85–91 (1992)
[RS72] Reed, M., Simon, B.: Methods of Modern Mathematical Physics I–IV. New York: Academic Press, 1972
[Sa77] Sattinger, D.H.: Weighted norms for the stability of travelling waves. J. Diff. Eqns. 25, 130–144 (1977)
[Schn94] Schneider, G.: Error estimates for the Ginzburg–Landau approximation. J. Appl. Math. Phys. 45, 433–457 (1994)
[Schn96] Schneider, G.: Diffusive stability of spatial periodic solutions of the Swift–Hohenberg equation. Commun. Math. Phys. 178, 679–702 (1996)
[Schn98] Schneider, G.: Nonlinear stability of Taylor-vortices in infinite cylinders. Arch. Rational Mech. Anal. 144, 121–200 (1998)
[Ta97] Taylor, M.E.: Partial Differential Equations I: Basic Theory. Appl. Math. Sciences 115, Berlin–Heidelberg–New York: Springer, 1997
[vH91] van Harten, A.: On the validity of Ginzburg–Landau's equation. J. Nonlinear Science 1, 397–422 (1991)
[Wa97] Wayne, C.E.: Invariant manifolds for parabolic partial differential equations on unbounded domains. Arch. Rat. Mech. Anal. 138, 279–306 (1997)

Communicated by A. Kupiainen

Commun. Math. Phys. 225, 399 – 421 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

Pauli Operator and Aharonov–Casher Theorem for Measure Valued Magnetic Fields

László Erdős¹, Vitali Vougalter²

¹ School of Mathematics, Georgia Tech, Atlanta, GA 30332, USA. E-mail: [email protected]
² Department of Mathematics, University of British Columbia, Vancouver, BC, Canada V6T 1Z2. E-mail: [email protected]

Received: 14 May 2001 / Accepted: 5 September 2001

Partially supported by NSF grant DMS-9970323

Abstract: We define the two dimensional Pauli operator and identify its core for magnetic fields that are regular Borel measures. The magnetic field is generated by a scalar potential, hence we bypass the usual $A \in L^2_{loc}$ condition on the vector potential, which does not allow one to consider such singular fields. We extend the Aharonov–Casher theorem for magnetic fields that are measures with finite total variation and we present a counterexample in case of infinite total variation. One of the key technical tools is a weighted $L^2$ estimate on a singular integral operator.

1. Introduction

We consider the usual Pauli operator in $d = 2$ dimensions with a magnetic field $B$,

$$H = [\sigma\cdot(-i\nabla + A)]^2 = (-i\nabla + A)^2 + \sigma_3 B \qquad\text{on}\qquad L^2(\mathbb R^2,\mathbb C^2),$$

$B := \mathrm{curl}(A) = \nabla^\perp\cdot A$ with $\nabla^\perp := (-\partial_2,\partial_1)$. Here $\sigma\cdot(-i\nabla + A)$ is the two dimensional Dirac operator on the trivial spinor bundle over $\mathbb R^2$ with real vector potential $A$ and $\sigma = (\sigma_1,\sigma_2)$ are the first two Pauli matrices. Precise conditions on $A$ and $B$ will be specified later. The Aharonov–Casher theorem [A-C] states that the dimension of the kernel of $H$ is given by $\dim\mathrm{Ker}(H) = \lfloor|\Phi|\rfloor$, where

$$\Phi := \frac{1}{2\pi}\int_{\mathbb R^2} B(x)\,dx \tag{1}$$


(possibly $\pm\infty$) is the flux (divided by $2\pi$) and $\lfloor\ \rfloor$ denotes the lower integer part ($\lfloor n\rfloor = n - 1$ for $n \ge 1$ integer and $\lfloor 0\rfloor = 0$). Moreover, $\sigma_3\psi = -s\psi$ for any $\psi \in \mathrm{Ker}(H)$, where $s = \mathrm{sign}(\Phi)$. On a $\mathrm{Spin}^c$-bundle over $S^2$ with a smooth magnetic field the analogous theorem is equivalent to the index theorem (for a short direct proof see [E-S]). For topological reasons the analogue of $\Phi$, the total curvature of a connection, is an integer (the Chern number of the determinant line bundle), and the number of zero modes of the corresponding Dirac operator is $|\Phi|$.

In the present paper we investigate two related questions:
(i) What is the most general class of magnetic fields for which the Pauli operator can be properly defined on $\mathbb R^2$?
(ii) What is the most general class of magnetic fields for the Aharonov–Casher theorem to hold on $\mathbb R^2$?

Pauli operators are usually defined either via the magnetic Schrödinger operator, $(-i\nabla + A)^2$, by adding the magnetic field $\sigma_3 B$ as an external potential, or directly by the quadratic form of the Dirac operator $\sigma\cdot(-i\nabla + A)$ (see Sect. 2.1). In both ways, the standard condition $A \in L^2_{loc}(\mathbb R^2,\mathbb R^2)$ is necessary. On the other hand, the statement of the Aharonov–Casher theorem uses only that $B \in L^1(\mathbb R^2)$, and in fact $B$ can even be a measure. It is therefore a natural question to extend the Pauli operator for such magnetic fields and investigate the validity of the Aharonov–Casher theorem. However, even if $B \in L^1$, it might not be generated by an $A \in L^2_{loc}$. For example, any gauge $A$ generating the radial field $B(x) = |x|^{-2}\,|\log|x||^{-3/2}\,\mathbf 1(|x| \le \tfrac12) \in L^1$ satisfies $\int_{|x|\le 1/2}|A(x)|^2\,dx \ge \int_0^{1/2}(r|\log r|)^{-1}\,dr = \infty$ (here $\mathbf 1$ is the characteristic function). Hence the Pauli operator cannot be defined in the usual way on $C_0^\infty$ as its core. In case of a point singularity at $p \in \mathbb R^2$ one can study the extensions from $C_0^\infty(\mathbb R^2\setminus\{p\})$, but such an approach may not be possible for $B$ with a more complicated singular set.
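The example of an $L^1$ field without an $L^2_{loc}$ vector potential can be made concrete in one particular gauge. The sketch below assumes the rotationally symmetric gauge, in which $|A(r)| = \frac1r\int_0^r B(s)\,s\,ds = 2|\log r|^{-1/2}/r$; the text's claim concerns any gauge, so this is only an illustration of the mechanism. With the substitution $u = -\log r$ the flux integral becomes $2\pi\int u^{-3/2}du$ and the energy integral becomes $8\pi\int u^{-1}du$. The helper names are hypothetical.

```python
import math

U0 = math.log(2.0)   # u = -log r, so r <= 1/2 corresponds to u >= log 2

def integrate(f, a, b, n=200_000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def flux_up_to(u_max):
    """Integral of B over the annulus e^{-u_max} <= |x| <= 1/2."""
    return 2.0 * math.pi * integrate(lambda u: u ** -1.5, U0, u_max)

def energy_up_to(u_max):
    """Integral of |A|^2 over the same annulus, in the radial gauge (an assumption)."""
    return 8.0 * math.pi * integrate(lambda u: 1.0 / u, U0, u_max)

for u_max in (10.0, 100.0, 1000.0):
    print(u_max, flux_up_to(u_max), energy_up_to(u_max))
```

The flux column stays below $4\pi/\sqrt{\log 2}$ however small the inner radius is, while the energy column grows like $8\pi\log u_{max}$, that is, like $\log|\log r|$, without bound.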
In this paper we present an alternative method which enables us to define the Pauli operator for any magnetic field that is a regular Borel measure (Theorem 2.7). Moreover, we actually define the corresponding quadratic form on the maximal domain and identify a core. We recall that the maximal domain contains all finite energy states, hence it has a direct physical interpretation. For mathematical analysis, however, one needs to know explicitly a core that contains reasonably “nice” functions. For most Schrödinger type operators the core consists of smooth functions. In the case of Pauli operators with singular magnetic fields the core will be identified as the set of smooth functions times an explicit nonsmooth factor. The basic idea is to define the Pauli operator via a real generating potential function $h$, satisfying

$$\Delta h = B \tag{2}$$

instead of the usual vector potential $A$. This potential function appears in the original proof of the Aharonov–Casher theorem. The key identity is the following:

$$\int \big|\sigma\cdot(-i\nabla + A)\psi\big|^2 = 4\int \big|\partial_{\bar z}(e^{-h}\psi_+)\big|^2 e^{2h} + \big|\partial_z(e^{h}\psi_-)\big|^2 e^{-2h} \tag{3}$$

for regular data, with $A := \nabla^\perp h$ (integrals without specified domains are understood on $\mathbb R^2$ with respect to the Lebesgue measure). We will define the Pauli quadratic form by


the right-hand side even for less regular data. It turns out that any magnetic field that is a regular Borel measure can be handled by an $h$-potential. The main technical tool is that for an appropriate choice of $h$, the weight function $e^{\pm 2h}$ (locally) belongs to the Muckenhoupt $A_2$ class ([G-R, St]). Therefore the maximal operator and certain singular integral operators are bounded on the weighted $L^2$ spaces. This will be essential to identify the core of the Pauli operator. We point out that this approach does not apply to the magnetic Schrödinger operator $(-i\nabla + A)^2$.

The Aharonov–Casher theorem has been rigorously proven only for a restricted class of magnetic fields on $\mathbb R^2$. The conditions involve some control on the decay at infinity and on local singularities. In fact, to our knowledge, the optimal conditions have never been investigated. The original paper [A-C] does not focus on conditions. The exposition [CFKS] assumes a compactly supported bounded magnetic field $B(x)$. The Ph.D. thesis by K. Miller [Mi] assumes boundedness, and assumes that $\int |B(x)|\log|x|\,dx < \infty$. The boundedness condition is clearly too strong, and it can be easily replaced with the assumption that $B$ belongs to the Kato class $K(\mathbb R^2)$. Miller also observes that in the case of an integer $\Phi\ne 0$ there could be either $|\Phi|$ or $|\Phi|-1$ zero states, but if the field is compactly supported then the number of states is always $|\Phi|-1$ [CFKS].

The idea behind each proof is to construct a potential function $h$ satisfying (2). Locally, $H\psi = 0$ is equivalent to $\psi = (e^{h}g_+, e^{-h}g_-)$ with $\partial_{\bar z}g_+ = 0$, $\partial_z g_- = 0$, where we identify $\mathbb R^2$ with $\mathbb C$ and use the notations $x = (x_1,x_2)\in\mathbb R^2$ and $z = x_1 + ix_2\in\mathbb C$ simultaneously. The condition $\psi\in L^2(\mathbb R^2,\mathbb C^2)$ together with the explicit growth (or decay) rate of $h$ at infinity determines the global solution space by identifying the space of (anti)holomorphic functions $g_\pm$ with a controlled growth rate at infinity.
For bounded magnetic fields decaying fast enough at infinity, a solution to (2) is given by

$$h(x) = \frac{1}{2\pi}\int_{\mathbb R^2}\log|x-y|\,B(y)\,dy \tag{4}$$

and $h(x)$ behaves as $\approx\Phi\log|x|$ for large $x$. If $\Phi\ge 0$, then $e^{h}g_+$ is never in $L^2$, and $e^{-h}g_-\in L^2$ if $|g_-|$ grows at most as the $(\lfloor\Phi\rfloor-1)$th power of $|x|$. If $\Phi<\infty$, then $g_-$ must be a polynomial of degree at most $\lfloor\Phi\rfloor-1$. If $\Phi=\infty$, then the integral in (4) is not absolutely convergent. If the radial behavior of $B$ is regular enough, then $h$ may still be defined via (4) as a conditionally convergent integral and we then have a solution space of infinite dimension.

Conditions on local regularity and decay at infinity are used to establish bounds on the auxiliary function $h$ given by (4), but they are not a priori needed for the Aharonov–Casher Theorem (1). We show that local regularity conditions are irrelevant by proving the Aharonov–Casher theorem for any measure valued magnetic field with finite total variation (Theorem 3.1). Many fields with infinite total variation can also be covered; some regular behavior at infinity is sufficient (Corollary 3.3). However, some control is needed in general, as we present a counterexample to the Aharonov–Casher theorem for a magnetic field with infinite total variation.

Counterexample 1.1. There exists a continuous bounded magnetic field $B$ such that $\int_{\mathbb R^2}|B| = \infty$ and

$$\Phi := \lim_{r\to\infty}\Phi(r) = \lim_{r\to\infty}\frac{1}{2\pi}\int_{|x|\le r}B(x)\,dx \tag{5}$$

exists and $\Phi>1$, but $\dim\mathrm{Ker}\,H = 0$.

Finally, we recall a conjecture from [Mi]:

Conjecture 1.2. Let $B(x)\ge 0$ with flux $\Phi := \frac{1}{2\pi}\int B$, which may be infinite. Then the dimension of $\mathrm{Ker}(H)$ is at least $\lfloor\Phi\rfloor$.
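To illustrate how the lower integer part enters the count of zero modes, here is a worked example. It is a sketch in an idealized setting: it assumes, as in the discussion of (4), a field with finite flux $\Phi>0$ for which $h(x)\approx\Phi\log|x|$ at infinity and $e^{-2h}$ is locally integrable. The monomials $\bar z^{\,k}$ are test functions chosen here for illustration.

```latex
\begin{align*}
g_-(z) = \bar z^{\,k}:\qquad
|e^{-h} g_-|^2 &\approx |x|^{2k-2\Phi} \quad (|x|\to\infty),\\
\int_{|x|\ge 1} |x|^{2k-2\Phi}\,dx
  &= 2\pi\int_1^\infty r^{2k-2\Phi+1}\,dr < \infty
  \iff k < \Phi - 1 ,\\
\Phi = \tfrac52:&\quad k\in\{0,1\}, \quad\text{two zero modes} = \lfloor \tfrac52\rfloor,\\
\Phi = 2:&\quad k\in\{0\}, \quad\text{one zero mode} = \lfloor 2\rfloor = 2-1 .
\end{align*}
```

For integer flux the borderline monomial $k = \Phi-1$ fails only logarithmically in this picture, which is consistent with the convention $\lfloor n\rfloor = n-1$ for integers $n\ge1$.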

The proof in [Mi] failed because it would have relied on the conjecture that for any continuous function $B\ge 0$ there exists a positive solution $h$ to (2). This is false. A counterexample (even with finite $\Phi$) was given by C. Fefferman and B. Simon and it was presented in [Mi]. However, the same magnetic field does not yield a counterexample to Conjecture 1.2. Theorem 3.1 settles this conjecture for $\Phi<\infty$, but the case $\Phi=\infty$ remains open. The magnetic field in our counterexample does not have a definite sign; in fact $\Phi$ is defined only as an improper integral.

2. Definition of the Pauli Operator

2.1. Standard definition for $A\in L^2_{loc}$. The standard definition of the magnetic Schrödinger operator, $(-i\nabla + A)^2$, or the Pauli operator, $[\sigma\cdot(-i\nabla + A)]^2$, as a quadratic form, requires $A\in L^2_{loc}$ (see e.g. [L-S, L-L] and for the Pauli operator [So]). We define $\Pi_k := -i\partial_k + A_k$, $Q_\pm := \Pi_1\pm i\Pi_2$, or with complex notation $Q_+ = -2i\partial_{\bar z} + a$, $Q_- = -2i\partial_z + \bar a$ with $a := A_1 + iA_2$. These are closable operators, originally defined on $C_0^\infty(\mathbb R^2)$. Their closures are denoted by the same letter on the minimal domains $\mathcal D_{min}(\Pi_j)$ and $\mathcal D_{min}(Q_\pm)$. Let

$$s_A(u,u) := \|\Pi_1u\|^2 + \|\Pi_2u\|^2 = \int|(-i\nabla + A)u|^2,\qquad u\in C_0^\infty(\mathbb R^2)$$

be the closable quadratic form associated with the magnetic Schrödinger operator on the minimal form domain $\mathcal D_{min}(s_A)$. It is known [Si] that the minimal domain coincides with the maximal domain $\mathcal D_{max}(s_A) := \{u\in L^2(\mathbb R^2): s_A(u,u)<\infty\}$. We will denote $\mathcal D(s_A) := \mathcal D_{max}(s_A) = \mathcal D_{min}(s_A)$ and let $S_A$ be the corresponding self-adjoint operator. The closable quadratic form associated with the Pauli operator is

$$p_A(\psi,\psi) := \|Q_+\psi_+\|^2 + \|Q_-\psi_-\|^2 = \int|\sigma\cdot(-i\nabla + A)\psi|^2,\qquad \psi = \binom{\psi_+}{\psi_-}\in C_0^\infty(\mathbb R^2,\mathbb C^2).$$
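The key identity (3) can be checked directly from these definitions when $h$ is smooth. The following sketch assumes $A = \nabla^\perp h = (-\partial_2h,\partial_1h)$, which matches the convention $x^\perp = (-x_2,x_1)$ used later in the text, and smooth compactly supported $\psi_\pm$.

```latex
\begin{align*}
a &= A_1 + iA_2 = -\partial_2 h + i\,\partial_1 h = 2i\,\partial_{\bar z} h,
\qquad \bar a = -2i\,\partial_z h,\\
Q_+\psi_+ &= -2i\,\partial_{\bar z}\psi_+ + 2i(\partial_{\bar z}h)\psi_+
          = -2i\, e^{h}\,\partial_{\bar z}\big(e^{-h}\psi_+\big),\\
Q_-\psi_- &= -2i\,\partial_{z}\psi_- - 2i(\partial_{z}h)\psi_-
          = -2i\, e^{-h}\,\partial_{z}\big(e^{h}\psi_-\big),\\
p_A(\psi,\psi) &= \|Q_+\psi_+\|^2 + \|Q_-\psi_-\|^2
   = 4\int \big|\partial_{\bar z}(e^{-h}\psi_+)\big|^2 e^{2h}
   + \big|\partial_{z}(e^{h}\psi_-)\big|^2 e^{-2h}.
\end{align*}
```

With this choice of $A$ one also has $\nabla^\perp\cdot A = \Delta h$, so the field generated by $A$ is the $B$ of (2).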

The condition $A\in L^2_{loc}$ is obviously necessary. The minimal form domain is $\mathcal D_{min}(p_A) = \mathcal D_{min}(Q_+)\otimes\mathcal D_{min}(Q_-)$, while $\mathcal D_{max}(p_A) := \{\psi\in L^2(\mathbb R^2,\mathbb C^2): p_A(\psi,\psi)<\infty\}$. The unique self-adjoint operators associated with these forms are $P_A^{min}$ and $P_A^{max}$. Clearly $\mathcal D_{min}(p_A)\subset\mathcal D_{max}(p_A)$. For a locally bounded magnetic field $B = \nabla^\perp\cdot A$ one can choose a vector potential $A\in L^\infty_{loc}$ by the Poincaré formula and in this case $\mathcal D_{min}(p_A) = \mathcal D_{max}(p_A)$, i.e., $P_A^{min} = P_A^{max}$. To see this, we first approximate any $\psi\in\mathcal D_{max}(p_A)$ in the norm $[\|\cdot\|^2 + p_A(\cdot,\cdot)]^{1/2}$ by functions $\psi_n = \psi\chi_n$ of compact support, where $\chi_n\to1$ and $\|\nabla\chi_n\|_\infty\to0$. Then we use that $\|\nabla\psi_n\|^2\le 2p_A(\psi_n,\psi_n) + 2\|A\psi_n\|^2<\infty$, i.e. $\psi_n\in H^1$, so it can be approximated by $C_0^\infty$ functions in $H^1$ and also in $[\|\cdot\|^2 + p_A(\cdot,\cdot)]^{1/2}$.

To our knowledge, the precise conditions for $\mathcal D_{min}(p_A) = \mathcal D_{max}(p_A)$ have not been investigated in general. Such a result is expected to be harder than $\mathcal D_{min}(s_A) = \mathcal D_{max}(s_A)$ due to the lack of the diamagnetic inequality. In the present paper we do not address this question. We will define the Pauli quadratic form differently and always on the appropriate maximal domain, since this is the physically relevant object (finite energy), and we identify a natural core for computations. We will see that this approach works for data even more singular than $A\in L^2_{loc}$, and for $A\in L^2_{loc}$ we obtain $P_A^{max}$ back. It is nevertheless a mathematically interesting open question to determine the biggest subset of $L^2_{loc}$ vector potentials such that the set $C_0^\infty$ is still a core for the Pauli form.

Finally we remark that $\mathcal D(s_A)\otimes\mathcal D(s_A)\subset\mathcal D_{min}(p_A)$ for $A\in L^2_{loc}$. In the case of $B = \nabla^\perp\cdot A\in L^\infty$, these two domains are equal and $P_A^{min} = S_A\otimes I_2 + \sigma_3B$. If $B\in L^\infty_{loc}$ only, then the form domains coincide locally. For more details on these statements, see Sect. 2 of [So].

2.2. Measures and integer point fluxes. Let $\overline{\mathcal M}$ be the set of signed real Borel measures $\mu(dx)$ on $\mathbb R^2$ with finite total variation, $|\mu|(\mathbb R^2) = \int_{\mathbb R^2}|\mu|(dx)<\infty$. Let $\mathcal M$ be the set of signed real regular Borel measures $\mu$ on $\mathbb R^2$; in particular they have $\sigma$-finite total variation. If $\mu(dx) = B(x)dx$ is absolutely continuous, then $\mu\in\overline{\mathcal M}$ is equivalent to $B\in L^1$. Let $\mathcal M^*$ be the set of all measures $\mu\in\mathcal M$ such that $\mu(\{x\})\in(-2\pi,2\pi)$ for any point $x\in\mathbb R^2$, and $\overline{\mathcal M}^* := \overline{\mathcal M}\cap\mathcal M^*$.

Definition 2.1. Two measures $\mu,\mu'\in\mathcal M$ are said to be equivalent if $\mu-\mu' = 2\pi\sum_jn_j\delta_{z_j}$, where $n_j\in\mathbb Z$, $z_j\in\mathbb R^2$. The equivalence class of any measure $\mu\in\mathcal M$ contains a unique measure, called the reduction of $\mu$ and denoted by $\mu^*$, such that $\mu^*(\{x\})\in[-\pi,\pi)$ for any $x\in\mathbb R^2$. In particular, $\mu^*\in\mathcal M^*$.

The Pauli operator associated with $\mu\in\mathcal M$ will depend only on the equivalence class of $\mu$ up to a gauge transformation, so we can work with $\mu\in\mathcal M^*$. This just reflects the physical expectation that any magnetic point flux $2\pi n\delta_z$, with integer $n$, is removable by the gauge transformation $\psi(x)\to e^{in\varphi}\psi(x)$, where $\varphi = \arg(x-z)$. In the case of several point fluxes, $2\pi\sum_jn_j\delta_{z_j}$, the phase factor should be $\exp\big(i\sum_jn_j\arg(x-z_j)\big)$, but it may not converge for an infinite set of points $\{z_j\}$. However, any $\mu\in\mathcal M$ can be uniquely written as $\mu = \mu^* + 2\pi\sum_jn_j\delta_{z_j}$ with $n_j\in\mathbb Z\setminus\{0\}$ and with a set of distinct points $\{z_j\}$ which do not accumulate in $\mathbb R^2\equiv\mathbb C$. Let $I_+ := \{j: n_j>0\}$ and $I_- := \{j: n_j<0\}$ be the sets of indices of the points with positive and negative masses, respectively. By the Weierstrass theorem, there exist analytic functions $F_\mu(x)$ and $G_\mu(x)$ (recall $x = x_1 + ix_2$) such that $F_\mu$ has zeros exactly at the points $\{z_j: j\in I_+\}$ with multiplicities $n_j$, and $G_\mu$ has zeros at $\{z_j: j\in I_-\}$ with multiplicities $-n_j$. Let $L_\mu(x) := F_\mu(x)\overline{G_\mu(x)}$. Then the integer point fluxes can be removed by the unitary gauge transformation

$$U_\mu:\ \psi(x)\to\frac{L_\mu(x)}{|L_\mu(x)|}\,\psi(x). \tag{6}$$


For example, for any compact set $K\subset\mathbb R^2$, we can write $L_\mu/|L_\mu|$ as

$$L_\mu(x)/|L_\mu(x)| = \exp\Big(i\sum_{j:\,z_j\in K}n_j\arg(x-z_j) + iH_K(x)\Big),\qquad x\in K,$$

where $H_K$ is a real harmonic function on $K$. In particular, for any $\psi$ supported on $K$,

$$U_\mu^*(-i\nabla)U_\mu\psi = \Big(-i\nabla + \sum_{j:\,z_j\in K}n_jA_j + \nabla H_K\Big)\psi,$$

where $\nabla^\perp\cdot A_j = 2\pi\delta_{z_j}$.

2.3. Potential function. The Pauli quadratic form for magnetic fields $\mu\in\mathcal M^*$ will be defined via the right hand side of (3), where $h$ is a solution to $\Delta h = \mu$. The following theorem shows that for $\mu\in\overline{\mathcal M}^*$ one can always choose a good potential function $h$. Later we will extend it to $\mu\in\mathcal M^*$.

Theorem 2.2. Let $\mu\in\overline{\mathcal M}^*$ and let $\Phi := \frac{1}{2\pi}\int\mu(dx)$ be the total flux (divided by $2\pi$). There is $0<\varepsilon(\mu)\le1$ such that for any $0<\varepsilon<\varepsilon(\mu)$ there exists a real valued function $h = h^{(\varepsilon)}\in\bigcap_{p<2}W^{1,p}_{loc}$ with $\Delta h = \mu$ (in the distributional sense), such that

(i) For any compact set $K\subset\mathbb R^2$ and any square $Q\subset K$,

$$\Big(\frac{1}{|Q|}\int_Qe^{2h}\Big)\Big(\frac{1}{|Q|}\int_Qe^{-2h}\Big)\le C_1(K,\varepsilon,\mu). \tag{7}$$

(ii) $e^{\pm2h}\in L^{1+\varepsilon}_{loc}$.

(iii) $h$ can be split as $h = h_1 + h_2$ with the following estimates:

$$\Big|\frac{h_1(x)}{\log|x|}-\Phi\Big|\le\varepsilon\qquad\text{for } |x|\ge R(\varepsilon,\mu) \tag{8}$$

and

$$\int_{Q(u)}e^{\pm2h_2}\le C_2(\varepsilon)\,\langle u\rangle^{2\varepsilon} \tag{9}$$

with some constants $C_1(K,\varepsilon,\mu)$, $C_2(\varepsilon)$ and $R(\varepsilon,\mu)$. Here $Q(u) = [u-\frac12,u+\frac12]^2$ denotes the unit square about $u\in\mathbb R^2$ and $\langle u\rangle = (u^2+1)^{1/2}$.

Remark. The property (i) means that $e^{2h}$ satisfies a certain reversed Hölder inequality locally. If (7) were true for any square $Q\subset\mathbb R^2$ with a $K$-independent constant, then $e^{2h}$ would be in the weight-class $A_2$ used in harmonic analysis (see [G-R, St]). Nevertheless, this property will allow us to use weighted $L^2$-bounds on a certain singular integral operator locally (Lemma 2.9). We also remark that property (ii) follows from the local analog of the well-known fact that $\omega\in A_2\Longrightarrow\omega\in A_p$ for some $p<2$.

Corollary 2.3. If $h\in L^1_{loc}$ satisfies $\Delta h\in\mathcal M^*$, then $h\in W^{1,p}_{loc}$ for all $p<2$ and $e^{\pm2h}\in L^1_{loc}$. If, in addition, $\Delta h\in\overline{\mathcal M}^*$, then $e^{\pm2h}\in L^{1+\varepsilon}_{loc}$ with some $\varepsilon>0$.
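The role of the restriction $\mu(\{x\})\in(-2\pi,2\pi)$ can be seen on a single point flux. The following worked example is an illustration only: it takes $\mu = 2\pi C\delta_0$ with $h(x) = C\log|x|$, and it averages over discs $D_r$ centered at the origin rather than over the squares used in (7).

```latex
\begin{align*}
e^{\pm 2h(x)} &= |x|^{\pm 2C}, \qquad
\int_{D_r} |x|^{\pm 2C}\,dx = \frac{2\pi\, r^{2\pm 2C}}{2\pm 2C}
\quad\text{(finite iff } \pm C > -1),\\
\Big(\frac{1}{|D_r|}\int_{D_r} e^{2h}\Big)
\Big(\frac{1}{|D_r|}\int_{D_r} e^{-2h}\Big)
 &= \frac{1}{\pi^2 r^4}\cdot\frac{4\pi^2 r^4}{(2+2C)(2-2C)}
  = \frac{1}{1-C^2}, \qquad |C|<1,\\
e^{\pm 2h}\in L^{1+\varepsilon}_{loc}
 &\iff 2|C|(1+\varepsilon) < 2
  \iff \varepsilon < \frac{1}{|C|} - 1 \quad (C \neq 0).
\end{align*}
```

The product of averages is independent of $r$ and blows up as $|C|\to1$, which suggests why the constants in the theorem depend on how close the point masses of $\mu$ come to $\pm2\pi$.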


Proof. Suppose first that $\mu := \Delta h\in\overline{\mathcal M}^*$. Choose $\varepsilon<\varepsilon(\mu)$ and consider $h^{(\varepsilon)}\in\bigcap_{p<2}W^{1,p}_{loc}$ constructed in Theorem 2.2 with $e^{\pm2h^{(\varepsilon)}}\in L^{1+\varepsilon}_{loc}$. Since $\Delta(h-h^{(\varepsilon)}) = 0$, we have $h = h^{(\varepsilon)} + \varphi$ with a smooth function $\varphi$, so the statements follow for $h$ as well.

If $\mu = \Delta h$ has infinite total variation, then Theorem 2.2 cannot be applied directly. But for any compact set $K$ one can find another compact set $K^*$ with $K\subset\mathrm{int}(K^*)$, and then the measure $\Delta h\in\mathcal M^*$ restricted to $K^*$ has finite total variation. Therefore one can find a function $h^*\in\bigcap_{p<2}W^{1,p}_{loc}$, $e^{\pm h^*}\in L^2_{loc}$, with $\Delta h^* = \mu$ on $K^*$, i.e., $h-h^*$ is harmonic on $K^*$, hence it is smooth and bounded on $K$. So $h\in\bigcap_{p<2}W^{1,p}(K)$ and $e^{\pm h}\in L^2(K)$ follow from the same properties of $h^*$. □

Proof of Theorem 2.2. Step 1. First we write $\mu = \mu_d + \mu_c$, where $\mu_d := 2\pi\sum_jC_j\delta_{z_j}$ ($C_j\in(-1,1)$, $z_j\in\mathbb C\equiv\mathbb R^2$) is the discrete part of the measure $\mu$, and $\mu_c$ is continuous, i.e., $\mu_c(\{x\}) = 0$ for any point $x\in\mathbb R^2$. The summation can be infinite, finite or empty, but $\sum_j|C_j|<\infty$. We also assume that the $z_j$'s are distinct. Let

$$\varepsilon(\mu) := \frac{1}{10}\min_j\big\{1-|C_j|\big\}, \tag{10}$$

then clearly $\varepsilon(\mu)>0$. We fix an $0<\varepsilon<\varepsilon(\mu)$. All objects defined below will depend on $\varepsilon$, but we will neglect this fact in the notations. We split the measure $\mu_d = \mu_{d,1} + \mu_{d,2}$ such that

$$\mu_{d,1} := 2\pi\sum_{j=1}^{N}C_j\delta_{z_j},\qquad \mu_{d,2} := 2\pi\sum_{j=N+1}^{\infty}C_j\delta_{z_j},$$

where $N$ is chosen such that $2\pi\sum_{j=N+1}^\infty|C_j|<\varepsilon/2$. In particular $|\mu_{d,2}|(\mathbb R^2)<\varepsilon/2$. We define

$$h_{d,j}(x) := \frac{1}{2\pi}\int_{\mathbb R^2}\log\frac{|x-y|}{\langle y\rangle}\,\mu_{d,j}(dy),\qquad j = 1,2, \tag{11}$$

so that $\Delta h_{d,j} = \mu_{d,j}$. Notice that $h_{d,j}(x)$ is well defined for a.e. $x$; moreover $h_{d,j}\in W^{1,p}_{loc}$ for all $p<2$ by Jensen's inequality.

Step 2. We split $\mu_c = \mu_{c,1} + \mu_{c,2}$ such that $\mu_{c,1}$ is compactly supported and $|\mu_{c,2}|(\mathbb R^2)<\varepsilon/2$. We set $\mu_j := \mu_{d,j} + \mu_{c,j}$, $j = 1,2$. Then we define

$$h_{c,j}(x) := \frac{1}{2\pi}\int_{\mathbb R^2}\log\frac{|x-y|}{\langle y\rangle}\,\mu_{c,j}(dy),\qquad j = 1,2, \tag{12}$$

clearly $h_{c,j}\in W^{1,p}_{loc}$ for all $p<2$, and $\Delta h_{c,j} = \mu_{c,j}$ (in distributional sense). Finally, we define

$$h_1 := h_{d,1} + h_{c,1},\qquad h_2 := h_{d,2} + h_{c,2},\qquad h := h_1 + h_2 \tag{13}$$

and clearly $\Delta h_j = \mu_j$. Since $\mu_{d,1}$ and $\mu_{c,1}$ are compactly supported, the estimate (8) is straightforward. We will also need the notation $\nu := \mu_{d,2} + \mu_c = \mu_{c,1} + \mu_2$.

Step 3. For any integer $L$ we define $\Lambda_L := (2^{-L}\mathbb Z)^2 + (2^{-L-1},2^{-L-1})$ to be the shifted and rescaled integer lattice. We define the dyadic squares of scale $L$ to be the squares

$$D_k^{(L)} := \big[k_1-2^{-L-1},\,k_1+2^{-L-1}\big)\times\big[k_2-2^{-L-1},\,k_2+2^{-L-1}\big)$$

of side-length $2^{-L}$ about the lattice points $k = (k_1,k_2)\in\Lambda_L$. The squares

$$\widetilde D_k^{(L)} := \big[k_1-2^{-L},\,k_1+2^{-L}\big)\times\big[k_2-2^{-L},\,k_2+2^{-L}\big)$$

of double side-length with the same center $k$ are called doubled dyadic squares of scale $L$. Similarly, the squares

$$\widehat D_k^{(L)} := \big[k_1-3\cdot2^{-L-1},\,k_1+3\cdot2^{-L-1}\big)\times\big[k_2-3\cdot2^{-L-1},\,k_2+3\cdot2^{-L-1}\big)$$

are called tripled dyadic squares of scale $L$. For a fixed scale $L$ the collection of dyadic squares is denoted by $\mathcal D_L$; $\widetilde{\mathcal D}_L$ and $\widehat{\mathcal D}_L$ denote the sets of doubled and tripled dyadic squares, respectively. The elements of $\mathcal D_L$ partition $\mathbb R^2$ for each $L$. Notice also that every square $Q\subset\mathbb R^2$ can be covered by a doubled dyadic square of area not bigger than a universal constant times $|Q|$.

Lemma 2.4. There exists $1\le M = M(\mu,\varepsilon)<\infty$ such that $|\mu|(Q)<2\pi(1-\varepsilon)$ for any $Q\in\widehat{\mathcal D}_M$.

Proof. We first notice that the support of $\mu_{d,1}$ consists of finitely many points, hence for large enough $L$ each element of $\widehat{\mathcal D}_L$ contains at most one point from this support. Second, since the measure $|\nu| = |\mu_c| + |\mu_{d,2}|$ does not charge more than $\varepsilon/2$ to any point, we claim that there exists a positive integer $1\le M = M(\mu,\varepsilon)<\infty$ such that $|\nu|(D)<\varepsilon$ for any dyadic square of scale $M$. We can choose $M(\mu,\varepsilon)\ge L$. This statement is clear by a dyadic decomposition; we start with the partition of $\mathbb R^2$ into dyadic squares of scale $L$. There are just finitely many squares $D\in\mathcal D_L$ such that $|\nu|(D)\ge\varepsilon$. We split these squares further into four identical dyadic squares. If this process stops after finitely many steps, then we have reached our $M$ as the scale of the finest decomposition. Now suppose on the contrary that this process never stops. Then we could find a strictly decreasing sequence of nested dyadic squares $D_1\supset D_2\supset\dots$ such that $|\nu|(D_j)\ge\varepsilon$, but then $|\nu|$ would charge at least $\varepsilon$ weight to their intersection, which is a point. Finally, since $|\mu| = |\mu_{d,1}| + |\nu|$ and every tripled square can be covered by 9 dyadic squares of the same scale, we have $|\mu|(Q)\le 9\varepsilon + 2\pi\max_j|C_j|<2\pi(1-\varepsilon)$ for each $Q\in\widehat{\mathcal D}_M$ for large enough $M$. □

Step 4. Now we turn to the proof of (7) and first we prove it for any doubled dyadic square of big scale. Let $Q = \widetilde D_k^{(K)}\in\widetilde{\mathcal D}_K$ be a doubled dyadic square with $K\ge M$ and let $\widehat Q = \widehat D_k^{(K)}$ be the corresponding tripled square with the same center $k\in\Lambda_K$. We split the measure $\mu$ as $\mu = \mu^{int} + \mu^{ext} := \mathbf 1_{\widehat Q}\,\mu + \mathbf 1_{\widehat Q^c}\,\mu$ with $|\mu| = |\mu^{int}| + |\mu^{ext}|$, and $h$ is decomposed accordingly as $h = h^{int} + h^{ext}$ with

$$h^{\#}(x) := \frac{1}{2\pi}\int_{\mathbb R^2}\log\frac{|x-y|}{\langle y\rangle}\,\mu^{\#}(dy),$$

where $\# = int, ext$. We also define

$$\tilde h^{int}(x) := \frac{1}{2\pi}\int_{\mathbb R^2}\log|x-y|\,\mu^{int}(dy) = h^{int}(x) + \frac{1}{2\pi}\int_{\mathbb R^2}\log\langle y\rangle\,\mu^{int}(dy).$$


Let $\mathrm{Av}_Qh^{ext} := |Q|^{-1}\int_Qh^{ext}$ be the average of $h^{ext}$ on $Q$. A simple calculation shows that

$$\big|h^{ext}(x)-\mathrm{Av}_Qh^{ext}\big|\le C|\mu|(\mathbb R^2),\qquad\forall x\in Q \tag{14}$$

with a universal constant $C$, using that $\mu^{ext}$ is supported outside of the tripled square. Therefore

$$\Big(\frac{1}{|Q|}\int_Qe^{2h}\Big)\Big(\frac{1}{|Q|}\int_Qe^{-2h}\Big)\le e^{4C|\mu|(\mathbb R^2)}\Big(\frac{1}{|Q|}\int_Qe^{2\tilde h^{int}}\Big)\Big(\frac{1}{|Q|}\int_Qe^{-2\tilde h^{int}}\Big).$$

We split $\mu^{int}$ into its positive and negative parts, $\mu^{int} = \mu^{int}_+-\mu^{int}_-$, and we let $\phi_\pm := \frac{1}{2\pi}\int_{\widehat Q}\mu^{int}_\pm\ge0$. By Lemma 2.4 and $K\ge M$ we have $\phi := \phi_+ + \phi_-<(1-\varepsilon)$. Now we apply Jensen's inequality for the probability measures $(2\pi\phi_\pm)^{-1}\mu^{int}_\pm$ (if $\phi_\pm\ne0$):

$$\begin{aligned}
\int_Qe^{2\tilde h^{int}} &= \int_Q\exp\Big[\frac{1}{2\pi\phi_+}\int_{\widehat Q}\log|x-y|^{2\phi_+}\,\mu^{int}_+(dy)\Big]\times\exp\Big[\frac{1}{2\pi\phi_-}\int_{\widehat Q}\log|x-y'|^{-2\phi_-}\,\mu^{int}_-(dy')\Big]dx\\
&\le\int_Qdx\,\frac{1}{2\pi\phi_+}\int_{\widehat Q}\mu^{int}_+(dy)\,\frac{1}{2\pi\phi_-}\int_{\widehat Q}\mu^{int}_-(dy')\,|x-y|^{2\phi_+}|x-y'|^{-2\phi_-}\le C(\varepsilon)|Q|^{1+\phi_+-\phi_-}
\end{aligned} \tag{15}$$

with an $\varepsilon$-dependent constant. When performing the $dx$ integration, we used the fact that $\phi_-<1-\varepsilon$, hence the singularity is integrable. Similarly, we have

$$\int_Qe^{-2\tilde h^{int}}\le C(\varepsilon)|Q|^{1-\phi_++\phi_-},$$

which completes the proof of (7) for doubled dyadic squares of scale at least $M$ with a $K$-independent constant.

Step 5. Next, we prove $e^{\pm2h}\in L^{1+\varepsilon}_{loc}$. We can follow the argument in Step 4. On any square $Q\in\widetilde{\mathcal D}_M$ we can use that $h^{ext}$ is bounded by (14) and we can focus on $\exp(\pm2\tilde h^{int})$. Then we use Jensen's inequality as in (15) and the fact that $x\mapsto|x-y|^{-2(1+\varepsilon)\phi_\pm}$ is locally integrable since $\phi_\pm<(1-\varepsilon)$.

Step 6. Now we complete the proof of (7) for all squares $Q\subset K$. Since every square can be covered by a doubled dyadic square of comparable size, we can assume that $Q$ is such a square. If the scale of $Q$ is smaller than $M$, then $|Q|^{-1}\le4^{M(\mu,\varepsilon)}$ and we can simply use $e^{\pm2h}\in L^1_{loc}$ to estimate the integrals.

Step 7. Finally, we prove (9). Let $\widetilde Q(u) := [u-1,u+1]^2$ and split the measure $\mu_2$ as

$$\mu_2 = \mu_2^{int} + \mu_2^{ext} := \mathbf 1_{\widetilde Q(u)}\,\mu_2 + \mathbf 1_{\widetilde Q(u)^c}\,\mu_2$$

and the function $h_2 = h_2^{int} + h_2^{ext}$, where

$$h_2^{\#}(x) = \frac{1}{2\pi}\int_{\mathbb R^2}\log\frac{|x-y|}{\langle y\rangle}\,\mu_2^{\#}(dy),\qquad\# = int, ext.$$


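Similarly to the estimates (14) and (15) in Step 4, we obtain

$$\int_{Q(u)}e^{\pm2h_2}\le C(\varepsilon)\exp\big[\pm2\,\mathrm{Av}_{Q(u)}h_2^{ext}\big]\exp\big[2|\mu_2|(\widetilde Q(u))\log\langle u\rangle\big],$$

and a simple calculation shows

$$\big|\mathrm{Av}_{Q(u)}h_2^{ext}\big|\le\int_{Q(u)}\int_{\widetilde Q(u)^c}\Big|\log\frac{|x-y|}{\langle y\rangle}\Big|\,|\mu_2^{ext}|(dy)\,dx\le|\mu_2|(\widetilde Q(u)^c)\log\langle u\rangle + C(\varepsilon).$$

From these estimates (9) follows using that $|\mu_2|(\mathbb R^2)\le\varepsilon$. □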
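For completeness, the final step can be spelled out. This sketch only combines the two estimates displayed at the end of Step 7 with $\langle u\rangle\ge1$ and $|\mu_2|(\mathbb R^2)\le\varepsilon$. Here $\widetilde Q(u)$ stands for the square $[u-1,u+1]^2$ of Step 7, and the constant $C'(\varepsilon)$ is a name introduced for this illustration.

```latex
\begin{align*}
\int_{Q(u)} e^{\pm 2h_2}
 &\le C(\varepsilon)\,
   \exp\!\big(2\,|\mathrm{Av}_{Q(u)} h_2^{ext}|\big)\,
   \exp\!\big(2|\mu_2|(\widetilde Q(u))\log\langle u\rangle\big)\\
 &\le C(\varepsilon)\,e^{2C(\varepsilon)}\,
   \exp\!\Big(2\big[|\mu_2|(\widetilde Q(u)^c) + |\mu_2|(\widetilde Q(u))\big]\log\langle u\rangle\Big)\\
 &= C'(\varepsilon)\,\langle u\rangle^{2|\mu_2|(\mathbb R^2)}
  \le C'(\varepsilon)\,\langle u\rangle^{2\varepsilon}.
\end{align*}
```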

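2.4. Definition of the Pauli operator for measure valued fields. For any real valued function $h\in L^1_{loc}(\mathbb R^2)$ we define the following quadratic form:

$$\pi^h(\psi,\xi) := \pi^h_+(\psi_+,\xi_+) + \pi^h_-(\psi_-,\xi_-)$$

with

$$\pi^h_+(\psi_+,\xi_+) := 4\int\overline{\partial_{\bar z}(e^{-h}\psi_+)}\,\partial_{\bar z}(e^{-h}\xi_+)\,e^{2h},\qquad\pi^h_-(\psi_-,\xi_-) := 4\int\overline{\partial_z(e^{h}\psi_-)}\,\partial_z(e^{h}\xi_-)\,e^{-2h}$$

on the natural maximal domains

$$\mathcal D(\pi^h_\pm) = \big\{\psi_\pm\in L^2(\mathbb R^2):\ \pi^h_\pm(\psi_\pm,\psi_\pm)<\infty\big\},$$

$$\mathcal D(\pi^h) = \mathcal D(\pi^h_+)\otimes\mathcal D(\pi^h_-) = \Big\{\psi = \binom{\psi_+}{\psi_-}\in L^2(\mathbb R^2,\mathbb C^2):\ \pi^h(\psi,\psi)<\infty\Big\}.$$

We use $\|\cdot\|$ to denote the usual $L^2(\mathbb R^2,dx)$ or $L^2(\mathbb R^2,\mathbb C^2,dx)$ norms. We define the following norms on functions:

$$|||f|||_{h,+} := \Big[\|f\|^2 + \big\|\partial_{\bar z}(e^{-h}f)\,e^{h}\big\|^2\Big]^{1/2},\qquad|||f|||_{h,-} := \Big[\|f\|^2 + \big\|\partial_z(e^{h}f)\,e^{-h}\big\|^2\Big]^{1/2}, \tag{16}$$

and for a spinor $\psi$ we let $|||\psi|||_h := |||\psi_+|||_{h,+} + |||\psi_-|||_{h,-}$. For any real function $h\in L^1_{loc}$ with $\Delta h\in\mathcal M$, we define the set

$$\mathcal C_h := \Big\{\psi = \binom{g_+e^{h}}{g_-e^{-h}}:\ g_\pm\in C_0^\infty(\mathbb R^2)\Big\}. \tag{17}$$

Notice that this set depends only on $\mu = \Delta h$: if $h$, $h'$ are two functions such that $\Delta h = \Delta h' = \mu$ in the distributional sense, then $h-h'$ is harmonic, i.e., smooth. Therefore $e^{h}$ and $e^{h'}$ differ by a smooth multiplicative factor, i.e. $\mathcal C_h = \mathcal C_{h'}$, hence we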


can denote this set by $\mathcal C_\mu$. Moreover, by Theorem 2.2, for any $\mu\in\mathcal M^*$ and any compact set $K$, there exists an $h\in L^1_{loc}$ with $\Delta h = \mu$ on $K$, and $h$ is unique modulo adding a smooth (harmonic) function. Since the support of $g_\pm$ is compact, the following set is well-defined for all $\mu\in\mathcal M^*$:

$$\mathcal C_\mu := \Big\{\psi = \binom{g_+e^{h}}{g_-e^{-h}}:\ g_\pm\in C_0^\infty(\mathbb R^2),\ \Delta h = \mu\ \text{on}\ \mathrm{supp}(g_-)\cup\mathrm{supp}(g_+)\Big\}. \tag{18}$$

Theorem 2.5. Let $h\in L^1_{loc}(\mathbb R^2)$ be a real valued function such that $\mu := \Delta h\in\mathcal M^*$. Then

(i) The quadratic form $\pi^h$ is nonnegative, symmetric and closed, hence it defines a unique self-adjoint operator $H_h$,

$$(H_h\psi,\xi) := \pi^h(\psi,\xi),\qquad\psi\in\mathcal D(H_h),\ \xi\in\mathcal D(\pi^h),$$

with domain $\mathcal D(H_h) := \{\psi\in\mathcal D(\pi^h):\ \pi^h(\psi,\cdot)\in L^2(\mathbb R^2,\mathbb C^2)'\}$.

(ii) The set $\mathcal C_\mu$ is dense in $\mathcal D(\pi^h)$ with respect to $|||\cdot|||_h$, i.e., it is a form core of $H_h$.

(iii) For any $L^1_{loc}$-functions $h$ and $h'$ with $\Delta h = \Delta h'\in\mathcal M^*$, the operators $H_h$ and $H_{h'}$ are unitarily equivalent by a $U(1)$-gauge transformation. In particular, the spectral properties of $H_h$ depend only on $\mu = \Delta h$.

Definition 2.6. For any real function $h\in L^1_{loc}$ with $\mu = \Delta h\in\mathcal M^*$ the operator $H_h$ will be called the Pauli operator with generating potential $h$. For any $\mu\in\mathcal M^*$ the unitarily equivalent operators $\{H_h: \Delta h = \mu\}$ are called the Pauli operators with a magnetic field $\mu$. The Pauli operators for any $\mu\in\mathcal M$ are defined as $U_\mu^*HU_\mu$ on the core $U_\mu^*\,\mathcal C_{\mu^*}$, where $H$ is a Pauli operator with the reduced field $\mu^*\in\mathcal M^*$ (see Definition 2.1) and $U_\mu$ is defined in (6).

To complete the definition of the Pauli operator for any magnetic field $\mu\in\mathcal M$, we need

Theorem 2.7. For any $\mu\in\mathcal M^*$, there exists $h\in L^1_{loc}$ with $\Delta h = \mu$ (in fact, for any $p<2$ one can find an $h\in W^{1,p}_{loc}$). Hence the above definition of $H_h$ actually defines the Pauli operators for any measure valued magnetic field $\mu\in\mathcal M$.

Proof of Theorem 2.5. From Corollary 2.3 we know that $e^{\pm h}\in L^2_{loc}$, and we show below that for any doubled dyadic square $Q_0$ the estimate

$$\Big(\frac{1}{|Q|}\int_Qe^{2h}\Big)\Big(\frac{1}{|Q|}\int_Qe^{-2h}\Big)\le C_3(h,Q_0), \tag{19}$$

analogous to (7), is valid on any square $Q\subset Q_0$, with an $(h,Q_0)$-dependent constant. These are the two properties of $h$ which we use below. For any $Q_0$ one can find a compact set $K$ such that $Q_0\subset\mathrm{int}(K)$ and $\mu = \Delta h$ restricted to $K$, $\mu|_K$, has finite total variation. Let $\varepsilon = \varepsilon(\mu|_K)/2$ and consider $h^{(\varepsilon)}$ defined in Theorem 2.2. Since $\Delta h = \Delta h^{(\varepsilon)}$ on $K$, we can write $h = h^{(\varepsilon)} + \varphi$ with a


smooth real function $\varphi$ depending on $h$. In particular, for any doubled dyadic square $Q_0$ the estimate (7) for $h^{(\varepsilon)}$ implies that (19) is valid for $h = h^{(\varepsilon)} + \varphi$ on any square $Q\subset Q_0$.

Part (i). Let $\psi_n = (\psi_{n+},\psi_{n-})$ be a Cauchy sequence in the norm $|||\cdot|||_h$, i.e., $\psi_n\to\psi$ in $L^2(dx)$, $\partial_{\bar z}(e^{-h}\psi_{n+})\to u_+$ in $L^2(e^{2h}dx)$ and $\partial_z(e^{h}\psi_{n-})\to u_-$ in $L^2(e^{-2h}dx)$. We have to show that $\partial_{\bar z}(e^{-h}\psi_+) = u_+$, $\partial_z(e^{h}\psi_-) = u_-$. For any $\phi\in C_0^\infty(\mathbb R^2)$,

$$\int\phi\,u_+ = \lim_{n\to\infty}\int\phi\,\partial_{\bar z}(e^{-h}\psi_{n+}) = -\lim_{n\to\infty}\int(\partial_{\bar z}\phi)\,e^{-h}\psi_{n+} = -\int(\partial_{\bar z}\phi)\,e^{-h}\psi_+,$$

hence $\partial_{\bar z}(e^{-h}\psi_+) = u_+$ in the distributional sense. Here we used that

$$\Big|\int\phi\,\big[u_+-\partial_{\bar z}(e^{-h}\psi_{n+})\big]\Big|\le\|\phi e^{-h}\|\,\big\|u_+-\partial_{\bar z}(e^{-h}\psi_{n+})\big\|_{L^2(e^{2h})}\to0$$

and

$$\Big|\int\partial_{\bar z}\phi\;e^{-h}(\psi_+-\psi_{n+})\Big|\le\|\partial_{\bar z}\phi\;e^{-h}\|\,\|\psi_+-\psi_{n+}\|\to0,$$

which follows from $e^{-h}\in L^2_{loc}$. The proof for the spin-down component is similar. This shows that the form $\pi^h$ is closed. The rest of the argument is standard (see, e.g., Lemma 1 in [L-S]).

Part (ii). The spin-up and spin-down parts can be treated separately and analogously, so we focus only on the spin-up part.

Step 1. We first show that the set $\mathcal C_0 := \{f\in\mathcal D(\pi^h_+),\ \mathrm{supp}(f)\ \text{compact}\}$ is dense in $\mathcal D(\pi^h_+)$ with respect to $|||\cdot|||_{h,+}$. This is standard: let $\chi(x)$ be a compactly supported smooth cutoff function, $0\le\chi\le1$, $\chi(x)\equiv1$ for $|x|\le1$, and let $\chi_n(x) := \chi(x/n)$. For any $f\in\mathcal D(\pi^h_+)$ we consider $f_n = \chi_nf$, then clearly $|||f-f_n|||_{h,+}\to0$.

Step 2. We need the following

Lemma 2.8. Let $f\in\mathcal C_0$, then $\nabla(fe^{-h})\in L^2(e^{2h})$.

Proof of Lemma 2.8. Let $g := fe^{-h}$. Let $Q_1$ be a doubled dyadic square that contains a neighborhood of $K := \mathrm{supp}(g)$, and let $Q_0$ be a doubled dyadic square that strictly contains $Q_1$ and $|Q_0| = 4|Q_1|$. We define

$$\omega(x) := \begin{cases}e^{2h(x)} & \text{for } x\in Q_1,\\ 1 & \text{for } x\in Q_1^c.\end{cases} \tag{20}$$

Lemma 2.9. The function $\omega(x)$ satisfies the inequality

$$\Big(\frac{1}{|Q|}\int_Q\omega\Big)\Big(\frac{1}{|Q|}\int_Q\omega^{-1}\Big)\le C_4(h,Q_0) \tag{21}$$

for any square $Q\subset\mathbb R^2$, i.e., $\omega$ is an $A_2$-weight (see [G-R, St]).


Proof of Lemma 2.9. It is sufficient to prove (21) for all doubled dyadic squares $Q$. It is easy to see that one of the following cases occurs: (i) $Q$ is disjoint from $Q_1$, (ii) $Q\subset Q_0$, (iii) $|Q_1|\le9|Q|$. In the first case (21) is trivial, in case (ii) it follows from (19). Finally, in case (iii) we have

$$\Big(\frac{1}{|Q|}\int_Q\omega\Big)\Big(\frac{1}{|Q|}\int_Q\omega^{-1}\Big)\le36^2\Big(1+\frac{1}{|Q_0|}\int_{Q_0}e^{2h}\Big)\Big(1+\frac{1}{|Q_0|}\int_{Q_0}e^{-2h}\Big),$$

hence (21) holds with an appropriate constant. □

Since $|\nabla g|^2 = 2(|\partial_zg|^2 + |\partial_{\bar z}g|^2)$ and $\omega = e^{2h}$ on $\mathrm{supp}(g)$, Lemma 2.8 follows immediately from

$$\int_{\mathbb R^2}|\partial_zg|^2\,\omega\le C_5(h,Q_0)\int_{\mathbb R^2}|\partial_{\bar z}g|^2\,\omega. \tag{22}$$

Notice that

$$\widehat{\partial_zg}(\xi) = m(\xi)\,\widehat{\partial_{\bar z}g}(\xi)\qquad\text{with}\qquad m(\xi) := \frac{(\xi_1-i\xi_2)^2}{|\xi|^2},$$

where the hat stands for the Fourier transform, $\xi\in\mathbb R^2$, and $m(\xi)$ is a homogeneous multiplier of degree 0. Hence (22) is just the weighted $L^2$-inequality for the regular singular integral operator $T_m$ with Fourier multiplier $m(\xi)$ and with weight $\omega\in A_2$ [G-R, St]. □
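The form of the multiplier and the gradient identity used for Lemma 2.8 can be verified in a few lines. The sketch below assumes the convention $\widehat{\partial_jg}(\xi) = i\xi_j\hat g(\xi)$ for the Fourier transform; other normalizations change only constants that cancel in the ratio.

```latex
\begin{align*}
\partial_z &= \tfrac12(\partial_1 - i\partial_2), \qquad
\partial_{\bar z} = \tfrac12(\partial_1 + i\partial_2),\\
\widehat{\partial_z g}(\xi) &= \tfrac{i}{2}(\xi_1 - i\xi_2)\,\hat g(\xi), \qquad
\widehat{\partial_{\bar z} g}(\xi) = \tfrac{i}{2}(\xi_1 + i\xi_2)\,\hat g(\xi),\\
m(\xi) &= \frac{\xi_1 - i\xi_2}{\xi_1 + i\xi_2}
        = \frac{(\xi_1 - i\xi_2)^2}{|\xi|^2}, \qquad |m(\xi)| = 1,\\
|\partial_z g|^2 + |\partial_{\bar z} g|^2
  &= \tfrac14\big(|\partial_1 g - i\partial_2 g|^2 + |\partial_1 g + i\partial_2 g|^2\big)
   = \tfrac12 |\nabla g|^2 .
\end{align*}
```

Since $|m| = 1$, the unweighted version of (22) is just Plancherel's identity; it is the weight $\omega$ that requires the $A_2$ theory.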

Step 3. To conclude that $\mathcal C_0\cap e^{h}C_0^\infty$ is dense in $\mathcal C_0$ with respect to $|||\cdot|||_{h,+}$, we use the fact that $C_0^\infty$ is dense in the weighted Sobolev space $W^{1,2}(\omega)$ with the $A_2$-weight $\omega$ (see e.g. [K]). Here we only recall the key point of the proof. Let $g\in W^{1,2}(\omega)$ be compactly supported and $g_\varepsilon := J_\varepsilon*g\in C_0^\infty$, where $J_\varepsilon(x) := \varepsilon^{-2}J(x/\varepsilon)$ is a standard mollifier: $0\le J\le1$, $\int J = 1$, $J$ smooth, compactly supported. Then the functions $|\nabla g_\varepsilon|\le J_\varepsilon*|\nabla g|$ have an $L^2(\omega)$-integrable majorant by the weighted maximal inequality [St] applied to $|\nabla g|\in L^2(\omega)$, hence $g_\varepsilon\to g$ in $W^{1,2}(\omega)$ as $\varepsilon\to0$. Notice that every $g_\varepsilon$ is supported on a common compact neighborhood of the support of $g$.

Part (iii). Since $\Delta h = \Delta h'$, we can write $h' = h + \varphi$ with a smooth real function $\varphi$. We define $\lambda$ as the harmonic conjugate of $\varphi$, $\nabla\lambda = \nabla^\perp\varphi$, which exists and is smooth by $\Delta\varphi = 0$. By $\partial_{\bar z}(\varphi + i\lambda) = 0$ we have

$$\pi^{h}(\psi,\psi) = \pi^{h'}\big(e^{-i\lambda}\psi,\,e^{-i\lambda}\psi\big),\qquad\psi\in\mathcal C_\mu,$$

and then by the density of $\mathcal C_\mu$ we obtain the same relation for all $\psi\in\mathcal D(\pi^{h})$. □
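The gauge relation in Part (iii) can be verified on the spin-up component as follows; the spin-down component is analogous, with $\partial_z$ and the antiholomorphic function $\varphi-i\lambda$. This is a sketch for smooth data, and it assumes the labeling $h' = h + \varphi$ with $\nabla\lambda = \nabla^\perp\varphi$.

```latex
\begin{align*}
\partial_{\bar z}(\varphi + i\lambda)
 &= \tfrac12\big[(\partial_1\varphi - \partial_2\lambda) + i(\partial_2\varphi + \partial_1\lambda)\big] = 0
 \quad\text{since } \nabla\lambda = \nabla^\perp\varphi = (-\partial_2\varphi, \partial_1\varphi),\\
\partial_{\bar z}\big(e^{-h'} e^{-i\lambda}\psi_+\big)
 &= \partial_{\bar z}\big(e^{-(\varphi+i\lambda)}\, e^{-h}\psi_+\big)
  = e^{-(\varphi+i\lambda)}\,\partial_{\bar z}\big(e^{-h}\psi_+\big),\\
\big|\partial_{\bar z}(e^{-h'} e^{-i\lambda}\psi_+)\big|^2 e^{2h'}
 &= e^{-2\varphi}\,\big|\partial_{\bar z}(e^{-h}\psi_+)\big|^2 e^{2h+2\varphi}
  = \big|\partial_{\bar z}(e^{-h}\psi_+)\big|^2 e^{2h}.
\end{align*}
```

Integrating the last line gives $\pi^{h'}_+(e^{-i\lambda}\psi_+, e^{-i\lambda}\psi_+) = \pi^{h}_+(\psi_+,\psi_+)$, so the two forms differ only by the unitary multiplication by $e^{-i\lambda}$.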

Proof of Theorem 2.7. Since $|\mu|$ is finite on every bounded set, we can find a sequence of disjoint rings, $R_j := \{x: r_j\le|x|\le r_j+2\delta_j\}$, $j = 1,2,\dots$, with appropriate widths $2\delta_j>0$ and radii $r_j\to\infty$ (as $j\to\infty$), such that $\sum_j|\mu|(R_j)<\infty$. For $j = 0$ we set $r_j = \delta_j = 0$. Let $0\le\chi_j\le1$ ($j = 0,1,\dots$) be smooth functions such that $\chi_j(x)\equiv1$ for $r_j+2\delta_j\le|x|\le r_{j+1}$ and $\chi_j(x)\equiv0$ for $|x|\le r_j+\delta_j$ or $|x|\ge r_{j+1}+\delta_{j+1}$. Notice that the supports of the $\chi_j$ are disjoint.

We define $\mu_j := \mu\cdot\mathbf 1\{x: r_j+2\delta_j\le|x|\le r_{j+1}\}$. By Theorem 2.2 there exist $h_j\in\bigcap_{p<2}W^{1,p}_{loc}$, $e^{\pm h_j}\in L^2_{loc}$, with $\Delta h_j = \mu_j$. We notice that

$$\Delta\Big(\sum_j\chi_jh_j\Big) = \nu + \sum_j\chi_j\mu_j,$$

where $\nu$ is absolutely continuous, $\nu = N(x)dx$ with

$$N = \sum_j\big(2\nabla\chi_j\cdot\nabla h_j + h_j\Delta\chi_j\big)\in L^1_{loc}.$$

We can find a decomposition $N = N_1 + N_2$, $N_1\in L^\infty_{loc}$ and $N_2\in L^1(\mathbb R^2)$. Let $\kappa := \mu-\sum_j\chi_j\mu_j-N_2(x)dx$, then $\kappa\in\overline{\mathcal M}$ since $N_2\in L^1(\mathbb R^2)$ and the measure $\mu-\sum_j\chi_j\mu_j$ belongs to $\overline{\mathcal M}$: it vanishes on the complement of $\bigcup_jR_j$, it has a total variation smaller than $|\mu|$ on each $R_j$, and $\sum_j|\mu|(R_j)<\infty$. It is also clear that $\kappa$ does not charge more to any point than $\mu$ does since $0\le\chi_j\le1$ and they have disjoint supports, hence $\kappa\in\overline{\mathcal M}^*$. By Theorem 2.2 there is $k\in\bigcap_{p<2}W^{1,p}_{loc}$, $e^{\pm k}\in L^2_{loc}$, such that $\Delta k = \kappa$. We define $h^* := k + \sum_j\chi_jh_j$; clearly $h^*\in\bigcap_{p<2}W^{1,p}_{loc}$ and $\mu = \Delta h^*-N_1(x)dx$.

By the Poincaré formula there exists $\widetilde A\in L^\infty_{loc}$ with $\nabla^\perp\cdot\widetilde A = -N_1$. For any fixed $p<2$ (even for $p<\infty$), one can find $A\in L^p_{loc}$ with $\nabla^\perp\cdot A = -N_1$, $\nabla\cdot A = 0$ by Lemma 1.1 (ii) [L]. But then $A^\perp = (-A_2,A_1)$ is curl-free, hence $A^\perp = \nabla\tilde h$ for some $\tilde h\in W^{1,p}_{loc}$ by Lemma 1.1 (i) [L]. Then $\Delta\tilde h = N_1$, hence $h := h^*-\tilde h$ satisfies $\Delta h = \mu$ and we see that $h\in W^{1,p}_{loc}\subset L^1_{loc}$. □
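As a consistency check, the pieces of this construction add up as follows. The computation only restates the definitions of $\nu$, $\kappa$ and $h^*$ from the proof. The symbol $\tilde h$ for the correction term (the function whose gradient is $A^\perp$) is notation adopted here, since the original decoration is not recoverable from the extracted text.

```latex
\begin{align*}
\Delta\Big(\sum_j \chi_j h_j\Big)
 &= \sum_j\big(\chi_j\,\Delta h_j + 2\nabla\chi_j\cdot\nabla h_j + h_j\,\Delta\chi_j\big)
  = \sum_j \chi_j\mu_j + (N_1+N_2)\,dx,\\
\Delta h^* &= \Delta k + \Delta\Big(\sum_j\chi_j h_j\Big)
  = \Big(\mu - \sum_j\chi_j\mu_j - N_2\,dx\Big) + \sum_j\chi_j\mu_j + (N_1+N_2)\,dx
  = \mu + N_1\,dx,\\
\Delta\tilde h &= \nabla\cdot A^\perp = -\partial_1 A_2 + \partial_2 A_1
  = -\nabla^\perp\!\cdot A = N_1,\\
\Delta\big(h^* - \tilde h\big) &= \mu .
\end{align*}
```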

Finally, we have to verify that the Pauli operator $H_h$ defined in this section coincides with the standard Pauli operator if $A\in L^2_{loc}$, modulo a gauge transformation.

Proposition 2.10. Let $A\in L^2_{loc}$. We assume that $\nabla^\perp\cdot A$ (in the distributional sense) is a measure and that $\mu := \nabla^\perp\cdot A\in\overline{\mathcal M}$. Then $\mu\in\overline{\mathcal M}^*$; in fact $\mu$ has no discrete component. Moreover, if $\Delta h = \mu$ with some $h\in L^1_{loc}$, then the operator $H_h$ defined in Theorem 2.5 is unitarily equivalent to the Pauli operator $P_A^{max}$ associated with the maximal form $p_A$ on $\mathcal D_{max}(p_A)$ as defined in Sect. 2.1.

Remark. $A\in L^2_{loc}$ does not imply that $\nabla^\perp\cdot A$ is even locally a measure of finite variation. One example is the radial gauge $A(x) := \Phi(|x|)\,|x|^{-2}x^\perp$, $x^\perp := (-x_2,x_1)$, that generates the radial field

$$B(x) := \sum_{n=1}^\infty\frac{(-4)^n}{n}\cdot\mathbf 1\big(2^{-n}\le|x|<2^{-n+1}\big)$$

with flux $\Phi(r) := \frac{1}{2\pi}\int_{|x|\le r}B(x)dx$. One can easily check that $\int_{|x|\le1}|B(x)|dx = \infty$ but $A\in L^2_{loc}$. However, if $\nabla^\perp\cdot A\ge0$ as a distribution, then it is a (positive) Borel measure $\mu\in\mathcal M^*$ (see [L-L]).

Proof. First we show that $\mu = \nabla^\perp\cdot A$ has no discrete component. Suppose, on the contrary, that $\mu(\{x\})\ne0$ for some $x$, and we can assume $x = 0$, $\mu(\{0\})>0$. Let $\chi$ be a radially symmetric smooth function on $\mathbb R^2$, $0\le\chi\le1$, $\mathrm{supp}\,\chi\subset\{|x|\le2\}$, $\chi(x)\equiv1$ for $|x|\le1$, $|\nabla\chi|\le2$, and let $\chi_n(x) := \chi(2^nx)$. Clearly

$$-\int A\cdot\nabla^\perp\chi_n = \int\chi_n\,d\mu\to\mu(\{0\})$$


as $n\to\infty$. Using polar coordinates, we have, for large enough $n$,

$$\begin{aligned}
\frac12\mu(\{0\})\le-\int A\cdot\nabla^\perp\chi_n&\le4\int_{2^{-n}}^{2^{-n+1}}\int_0^{2\pi}|A(s,\theta)|\,d\theta\,ds\\
&\le4\sqrt{2\pi}\Big(\int_{2^{-n}}^{2^{-n+1}}\int_0^{2\pi}|A(s,\theta)|^2\,d\theta\,s\,ds\Big)^{1/2}\\
&\le4\sqrt{2\pi}\Big(\int|A(x)|^2\cdot\mathbf 1\big(2^{-n}\le|x|\le2^{-n+1}\big)\,dx\Big)^{1/2},
\end{aligned}$$

hence $\int_{|x|\le1}|A|^2 = \infty$. The proof also works if we assume only $\mu\in\mathcal M$ instead of $\mu\in\overline{\mathcal M}$.

Now we prove the unitary equivalence. Without loss of generality we can assume that $\nabla\cdot A = 0$ by part (ii) of Lemma 1.1 of [L]. Let $A_h := \nabla^\perp h$, then $\nabla^\perp\cdot A_h = \mu$, $\nabla\cdot A_h = 0$ and $A_h\in L^1_{loc}$ by Corollary 2.3. Since $\nabla^\perp\cdot(A-A_h) = 0$, there exists $\lambda\in W^{1,1}_{loc}$ such that $A = A_h + \nabla\lambda$ by part (i) of Lemma 1.1 of [L]. Taking the divergence, we see that $\Delta\lambda = 0$, hence $\lambda$ is smooth. Let $\varphi$ be a smooth harmonic conjugate of $\lambda$, $\nabla\lambda = \nabla^\perp\varphi$. We have the following identity:

$$p_A(\psi,\psi) = \pi^{h+\varphi}(\psi,\psi) = \pi^{h}\big(e^{i\lambda}\psi,\,e^{i\lambda}\psi\big).$$

From the first equality we obtain that $\mathcal D_{max}(p_A) = \mathcal D(\pi^{h+\varphi})$, i.e., $P_A^{max} = H_{h+\varphi}$ on $\mathcal D(P_A^{max}) = \mathcal D(H_{h+\varphi})$. From the second equality it follows that $H_{h+\varphi} = e^{-i\lambda}H_he^{i\lambda}$ and that $\mathcal D(H_{h+\varphi}) = e^{-i\lambda}\mathcal D(H_h)$. In fact, $\mathcal D(H_h) = \mathcal D(H_{h+\varphi})$ since the multiplication by the smooth factor $e^{i\lambda}$ leaves the form core $\mathcal C_h = \mathcal C_{h+\varphi} = \mathcal C_\mu$ invariant. □

2.5. Pauli operator generated by both potentials. Theorem 2.7 showed that every measure $\mu\in\mathcal M^*$ can be generated by an $h$-potential, $\Delta h = \mu$, and we defined the Pauli operators. However, it may be useful to combine the scalar potential with the usual vector potential $A\in L^2_{loc}$ to generate the given magnetic field. In this way one has more freedom in choosing the potentials. Typically, the singularities can be handled more easily by the $h$-potential, and the standard formula $h = \frac{1}{2\pi}\log|\cdot|*\mu$ is (locally) available. But this formula exhibits a strong non-locality of $h$, and the truncation method of the proof of Theorem 2.7 is not particularly convenient in practice. Large distance behavior of the bulk magnetic field is better described by a vector potential.

In this section we give such a unified definition of the Pauli operator. For any $h\in L^1_{loc}$, $A\in L^2_{loc}$ we define the quadratic form

$$\pi^{h,A}(\psi,\psi) := \int\big|(-2i\partial_{\bar z}+a)(e^{-h}\psi_+)\big|^2e^{2h} + \big|(-2i\partial_z+\bar a)(e^{h}\psi_-)\big|^2e^{-2h}$$

on the maximal domain $\mathcal D(\pi^{h,A}) := \{\psi\in L^2(\mathbb R^2,\mathbb C^2): |||\psi|||_{h,A}<\infty\}$, where $a = A_1 + iA_2$ and

$$|||\psi|||_{h,A} := \big[\|\psi\|^2 + \pi^{h,A}(\psi,\psi)\big]^{1/2}.$$


Let P ∗ := (h, A) : h ∈ L1loc , e±h A ∈ L2loc , h ∈ M∗ , ∇ ⊥ ·A ∈ M be the set of admissible potential pairs. The measure µ := h + ∇ ⊥ ·A ∈ M is called the magnetic field generated by (h, A). We recall from Corollary 2.3 that (h, A) ∈ P ∗ 1,p implies h ∈ p<2 Wloc and e±h ∈ L2loc , moreover, e±h A ∈ L2loc implies A ∈ L2loc . Since ∇ ⊥ ·A has no discrete component (Proposition 2.10), the measure µ generated by (h, A) ∈ P ∗ is in M∗ . In particular, the set of measures generated by a potential pair from P ∗ is the same as the set of measures generated by only L1loc h-potentials (Theorem 2.7). Theorem 2.11. (i) (Self-adjointness). Assume that (h, A) ∈ P ∗ and let µ := h + ∇ ⊥ ·A. Then π h,A is a nonnegative symmetric closed form, hence it defines a unique self-adjoint operator Hh,A . (ii) (Core). The set Cµ (see (18)) is dense in D(π h,A ) with respect to ||| · |||h,A , i.e., it is a form core for Hh,A . h ∈ L1loc such that h + ∇ ⊥ · A = h, then (iii) (Consistency). If (h, A) ∈ P ∗ and Hh,A is unitary equivalent to H h defined in Theorem 2.5. Definition 2.12. For any (h, A) ∈ P ∗ the operator Hh,A is called the Pauli operator with a potential pair (h, A). Notice that Proposition 2.10 and (iii) of Theorem 2.11 guarantees that the Pauli operators with the same magnetic field are unitarily equivalent, irrespective of which definition we use. Proof of Theorem 2.11. Part (i). The proof that π h,A is closed is very similar to the proof of part (i) of Theorem 2.5. The operators ∂z¯ and ∂z should be replaced by ∂z¯ + ia and ∂z − ia, but the extra terms with a can always be estimated by the local L2 norm of e±h A. Part (ii). Step 1. We need the following preliminary observation. Since A ∈ L2loc , we 1,2 can consider the decomposition A = A + ∇ λ, ∇ · A = 0, A ∈ L2loc , λ ∈ Wloc (see Lemma 1.1 [L]). 
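$\tilde A$ and $\tilde\lambda$ are called the divergence-free and the gradient component of $A \in L^2_{loc}$. Notice that $\tilde A$ is unique up to a smooth gradient, since if $\tilde A + \nabla\tilde\lambda = \tilde A' + \nabla\tilde\lambda'$ then $0 = \nabla\cdot(\tilde A - \tilde A') = \Delta(\tilde\lambda' - \tilde\lambda)$, i.e., $\tilde\lambda' - \tilde\lambda$ is smooth.

Moreover, if $Ae^{\pm h} \in L^2_{loc}$, then $\tilde A e^{\pm h} \in L^2_{loc}$ as well. To see this, we fix a compact set $K$ and a compact set $K^*$ whose interior contains $K$, then we choose a cutoff function $0 \le \varphi_K \le 1$ with $\varphi_K \equiv 1$ on $K$ and $\mathrm{supp}\,\varphi_K \subset K^*$. We let $A_K := \varphi_K A \in L^2$ and let $\lambda$ be defined via its Fourier transform
$$\hat\lambda(\xi) := \frac{\xi\cdot\hat A_K(\xi)}{|\xi|^2}, \qquad \xi \in \mathbb{R}^2,$$
i.e., $-\Delta\lambda = \nabla\cdot A_K$. Then $\nabla\lambda$ is obtained from $A_K$ by the action of a singular integral operator whose multiplier is $\xi\otimes\xi/|\xi|^2$. Choose $\omega$ as in (20), where $Q_0$ is a dyadic square containing $K^*$, then $\omega \in A_2$. Hence, by the weighted $L^2$-inequality we have
$$\int |\nabla\lambda|^2\,\omega \le C(\omega)\int |A_K|^2\,\omega = C(\omega)\int |A_K|^2 e^{2h}$$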


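with some $\omega$-dependent constant. In particular $\nabla\lambda \in L^2_{loc}(e^{2h})$. The proof of $\nabla\lambda \in L^2_{loc}(e^{-2h})$ is identical. Now $\tilde\lambda$ satisfies $\Delta\tilde\lambda = -\Delta\lambda$ on $K$, i.e. $\tilde\lambda$ and $-\lambda$ differ by an additive smooth function, hence $\nabla\tilde\lambda \in L^2_{loc}(e^{\pm 2h})$, which means that $\tilde A e^{\pm h} \in L^2_{loc}$.

Step 2. We show that $\mathcal{C}_\mu \subset D(\pi^{h,A})$ if $(h,A) \in \mathcal{P}^*$, $\mu = \Delta h + \nabla^\perp\cdot A$. Let $\psi \in \mathcal{C}_\mu$ be compactly supported on $K$ and let $K^*$ be a compact set whose interior contains $K$. Since $\mu$ restricted to $K^*$ has finite total variation, we can apply Theorem 2.2 for the restricted measure to construct a function $h^* \in L^1_{loc}$ such that $\Delta h^* = \mu = \Delta h + \nabla^\perp\cdot A$ on $K^*$. Then there is a real function $\chi$ such that $\tilde A = \nabla^\perp(h^* - h) + \nabla\chi$ (Lemma 1.1 of [L]). After taking the divergence, we see that $\chi$ is harmonic on $K^*$. Let $\varphi \in C^\infty$ be its harmonic conjugate, $\nabla\chi = \nabla^\perp\varphi$. We have the identity
$$\pi^{h,A}(e^{-i\tilde\lambda}\psi, e^{-i\tilde\lambda}\psi) = \pi^{h,\tilde A}(\psi,\psi) = \pi^{h^*+\varphi}(\psi,\psi) \tag{23}$$
for any $\psi$ supported on $K^*$. Since $\psi \in \mathcal{C}_\mu$, we can write $\psi_\pm = g_\pm e^{\pm h}$ and we see that the right-hand side of (23) is finite, hence $e^{-i\tilde\lambda}\psi \in D(\pi^{h,A})$. But by the Schwarz inequality
$$\pi^{h,A}(e^{-i\tilde\lambda}\psi, e^{-i\tilde\lambda}\psi) \ge \frac12\,\pi^{h,A}(\psi,\psi) - 4\sum_{\pm}\|g_\pm\|_\infty^2\int_{K^*}|\nabla\tilde\lambda|^2 e^{\pm 2h},$$
hence $\psi \in D(\pi^{h,A})$ by Step 1.

Step 3. We now show that $\mathcal{C}_\mu$ is dense in $D(\pi^{h,A})$ with respect to $|||\cdot|||_{h,A}$ if $(h,A) \in \mathcal{P}^*$, $\mu = \Delta h + \nabla^\perp\cdot A$. We first notice that it is sufficient to show that $\mathcal{C}_\mu$ is dense in the set $\mathcal{C}^0 := \{\psi \in D(\pi^{h,A}) : \mathrm{supp}(\psi)\ \text{compact}\}$, similarly to Step 1 of the proof of Theorem 2.5 (ii). So let $\psi \in D(\pi^{h,A})$ be supported on a compact set $K$. As in Step 2, we let $h^* \in L^1_{loc}$ be a function such that $\Delta h^* = \mu$ on a compact neighborhood $K^*$ of $K$, $K \subset \mathrm{int}(K^*)$. As before, we have $\tilde A = \nabla^\perp(h^* - h) + \nabla\chi$ with a harmonic $\chi$ and let $\varphi \in C^\infty(K^*)$ be its harmonic conjugate, $\nabla\chi = \nabla^\perp\varphi$. The identity (23) is now written as
$$\pi^{h,A}(\psi,\psi) = \pi^{h^*+\varphi}\bigl(e^{i\tilde\lambda}\psi, e^{i\tilde\lambda}\psi\bigr) \tag{24}$$
for any $\psi$ supported on $K^*$. In particular, $\psi \in D(\pi^{h,A})$ implies $e^{i\tilde\lambda}\psi \in D(\pi^{h^*+\varphi})$. We define the set
$$\widetilde{\mathcal{C}}_\mu := \Bigl\{\psi = \begin{pmatrix} g_+ e^{h} \\ g_- e^{-h}\end{pmatrix} : g_\pm \in L^\infty_0,\ \Delta h = \mu \ \text{on}\ \mathrm{supp}(g_-)\cup\mathrm{supp}(g_+)\Bigr\},$$
where $L^\infty_0$ denotes the set of bounded, compactly supported functions. The set $\widetilde{\mathcal{C}}_\mu$ is well defined, see the remark before the definition (18).

Since $\mathcal{C}_\mu$ is dense in $D(\pi^{h^*+\varphi})$ with respect to $|||\cdot|||_{h^*+\varphi}$ by part (ii) of Theorem 2.5, we can find a sequence of spinors $\xi_n \in \mathcal{C}_\mu$ such that $|||\xi_n - e^{i\tilde\lambda}\psi|||_{h^*+\varphi} \to 0$. We can assume that all $\xi_n$ are supported in $K^*$ (see the remark at the end of Step 3 of the proof Theorem 2.2 (ii)). But then $|||e^{-i\tilde\lambda}\xi_n - \psi|||_{h,A} \to 0$ again by (24); in particular the set $\mathcal{C}^1 := D(\pi^{h,A}) \cap \widetilde{\mathcal{C}}_\mu$ is dense in $\bigl(D(\pi^{h,A}), |||\cdot|||_{h,A}\bigr)$.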


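Finally, we show that $\mathcal{C}_\mu$ is dense in $\bigl(\mathcal{C}^1, |||\cdot|||_{h,A}\bigr)$. Let $\chi \in \mathcal{C}^1$, i.e., $\chi_\pm = g_\pm e^{\pm h}$ with some compactly supported bounded functions $g_\pm$. Notice that if $g$ is a bounded function, then $g e^{\pm h}A \in L^2_{loc}$ since $e^{\pm h}A \in L^2_{loc}$. In particular $(\partial_{\bar z} + ia)g_+ \in L^2(e^{2h})$ implies $\partial_{\bar z} g_+ \in L^2(e^{2h})$. But then $\nabla g_+ \in L^2(e^{2h})$ by Lemma 2.8 and similarly for $g_-$. We focus only on the spin-up part, the spin-down part is similar. Let $g^{(\varepsilon)} := J_\varepsilon * g_+$, where $J_\varepsilon$ is a standard mollifier (see Step 3 of the proof of Theorem 2.5 (ii)). Recall that $\|g^{(\varepsilon)}\|_\infty \le \|g_+\|_\infty$ and the functions $|\nabla g^{(\varepsilon)}| \le J_\varepsilon * |\nabla g_+|$ have an $L^2(e^{2h})$-integrable majorant using the weighted maximal inequality. By passing to a subsequence, $g^{(\varepsilon)} \to g_+$, $\nabla g^{(\varepsilon)} \to \nabla g_+$ in $L^2(e^{2h})$ and $g^{(\varepsilon)} \to g_+$ a.e. as $\varepsilon \to 0$. Therefore
$$\int \bigl|(\partial_{\bar z} + ia)(g^{(\varepsilon)} - g_+)\bigr|^2 e^{2h} + \bigl\|(g^{(\varepsilon)} - g_+)e^{h}\bigr\|^2 \le 2\int \bigl|\nabla(g^{(\varepsilon)} - g_+)\bigr|^2 e^{2h} + 2\int |A|^2\,|g^{(\varepsilon)} - g_+|^2 e^{2h} + \bigl\|(g^{(\varepsilon)} - g_+)e^{h}\bigr\|^2 \to 0$$
as $\varepsilon \to 0$ since $e^{h} \in L^2_{loc}$ and $e^{h}A \in L^2_{loc}$.

Part (iii). Since $\Delta\tilde h \in \mathcal{M}^*$, we know that $\tilde h \in L^2_{loc}$ (Corollary 2.3). Since $\nabla^\perp\cdot(\nabla^\perp h - \nabla^\perp\tilde h + A) = 0$, there exists $\lambda \in W^{1,2}_{loc}$ with $\nabla^\perp(h - \tilde h) + A = \nabla\lambda$. A simple calculation shows that $\pi^{h,A}(\psi,\psi) = \pi^{\tilde h}(e^{i\lambda}\psi, e^{i\lambda}\psi)$. $\Box$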

3. Aharonov–Casher Theorem

We prove the following extension of the Aharonov–Casher theorem:

Theorem 3.1. Let $\mu \in \mathcal{M}$ and we assume that $\mu^* \in \mathcal{M}^*$, i.e., we assume that after reducing the point masses in $\mu$ the reduced measure $\mu^*$ has finite total variation (see Definition 2.1). Let $\Phi := \frac{1}{2\pi}\int\mu^*$. The dimension of the kernel of any Pauli operator $H$ with magnetic field $\mu$ is given by
$$\dim\mathrm{Ker}\,H = \begin{cases} [|\Phi|] & \text{if } \Phi \notin \mathbb{Z} \text{ or } \Phi = 0, \\ [|\Phi|] \text{ or } [|\Phi|] - 1 & \text{if } \Phi \in \mathbb{Z}\setminus\{0\} \end{cases} \tag{25}$$
(here $[a]$ denotes the integer part of $a$). In the case of nonzero integer $\Phi$ (second line) both cases can occur, but if, additionally, $\mu^*$ has a compact support or has a definite sign, then always $\dim\mathrm{Ker}\,H = [|\Phi|] - 1$. In all cases the kernel is in the eigenspace of $\sigma_3$: $\mathrm{Ker}(H) \subset \{\psi : \sigma_3\psi = -s\psi\}$ with $s = \mathrm{sign}(\Phi)$.

Combining this theorem with Proposition 2.10 we obtain

Corollary 3.2. If $A \in L^2_{loc}$ and $B \in L^1$, then the dimension of the kernel of $P_A^{\max}$ is given by (25) where $\Phi = \frac{1}{2\pi}\int B$. $\Box$
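The case distinction in (25) can be written out as a small lookup. The following is only an illustrative sketch: the function name `kernel_dimensions` and the choice to pass the flux as an exact rational are assumptions made here, not part of the paper. It returns the set of values that (25) allows, without using the extra hypotheses (compact support or definite sign) that single out one value in the integer case.

```python
from fractions import Fraction
from math import floor


def kernel_dimensions(flux):
    """Values of dim Ker H permitted by formula (25).

    `flux` stands for Phi = (1/2pi) * (total mass of mu^*); pass an int or
    Fraction so that the test "Phi is an integer" is exact (hypothetical API).
    """
    phi = abs(Fraction(flux))
    n = floor(phi)  # integer part [|Phi|]
    if phi == 0 or phi.denominator != 1:
        return {n}  # first line of (25): non-integer flux or zero flux
    return {n, n - 1}  # second line of (25): both values can occur


if __name__ == "__main__":
    for phi in (Fraction(5, 2), 0, -3, 1):
        print(phi, sorted(kernel_dimensions(phi)))
```

For a nonzero integer flux the theorem only narrows the answer to two candidates; which one occurs depends on the field, as the radial example in the proof shows.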


Proof of Theorem 3.1. We can assume that $\mu \in \mathcal{M}^*$. Recalling the definition of $\varepsilon(\mu)$ from (10) we apply Theorem 2.2 to choose $h := h^{(\varepsilon)}$ with some $\varepsilon < \varepsilon(\mu)$. By (iii) of Theorem 2.5 it is sufficient to consider the operator $H_h$. We can also assume that $\Phi \ge 0$.

Suppose first that $\Phi$ is not integer; let $\{\Phi\} = \Phi - [\Phi]$ be its fractional part. Choose $\varepsilon < \min\{\varepsilon(\mu), \{\Phi\}/3, (1 - \{\Phi\})/3\}$. Any normalized eigenspinor $\psi$ with $\pi^h(\psi,\psi) = 0$ must be in the form $\psi = (e^{h}g_+, e^{-h}g_-)$, where $g_+$ is holomorphic and $g_-$ is antiholomorphic. First we show that $g_+ = 0$. Let $u \in \mathbb{R}^2$ with $|u| \ge R(\varepsilon,\mu) + 1$. We use the decomposition $h = h_1 + h_2$ from Theorem 2.2. We have
$$1 = \|\psi\|^2 \ge \int_{Q(u)} e^{2h}|g_+|^2 \ge \Bigl(\int_{Q(u)} e^{h_1}|g_+|\Bigr)^2 \Bigl(\int_{Q(u)} e^{-2h_2}\Bigr)^{-1},$$
and by (iii) of Theorem 2.2 and subharmonicity of $|g_+|$ we see that $|g_+(u)| \le C\int_{Q(u)}|g_+| \le C\langle u\rangle^{-\Phi+2\varepsilon} \to 0$ as $u \to \infty$, hence $g_+ = 0$. A similar calculation shows that $|g_-(u)| \le C\langle u\rangle^{\Phi+2\varepsilon}$, i.e., $g_-$ must be a polynomial of degree at most $[\Phi]$ since $\Phi + 2\varepsilon < [\Phi] + 1$. However, a polynomial of degree $[\Phi]$ would give
$$\int e^{-2h}|g_-|^2 \ge C\sum_{k\in\Lambda_0,\,|k|\ge R}|k|^{2[\Phi]}\int_{Q(k)} e^{-2h} \ge C\sum_{k\in\Lambda_0,\,|k|\ge R}|k|^{2[\Phi]}\Bigl(\int_{Q(k)} e^{2h}\Bigr)^{-1} \ge C\sum_{k\in\Lambda_0,\,|k|\ge R}|k|^{2([\Phi]-\Phi-2\varepsilon)} = \infty$$
for some large enough $R$ and various positive constants $C$. On the other hand, the functions $g_-(z) = 1, z, \ldots, z^{[\Phi]-1}$ all give normalizable spinors since for these choices
$$\int e^{-2h}|g_-|^2 \le C\sum_{k\in\Lambda_0}|k|^{2([\Phi]-\Phi-1)}\int_{Q(k)} e^{-2h_2} \le C\sum_{k\in\Lambda_0}|k|^{2([\Phi]-\Phi-1+\varepsilon)} < \infty. \tag{26}$$

If $\Phi$ is integer, then the same arguments work except (26); in fact $g_-(z) = z^{\Phi-1}$ may or may not give normalizable solutions. If $\mu$ is compactly supported and $\Phi \ge 0$ is integer, then from the definition of $h$ (11)–(13) we see that $h(x) - \Phi\log|x|$ is bounded for all large enough $|x|$. Similarly one can easily verify that if $\mu \ge 0$, then $h(x) \le \Phi\log\langle x\rangle + C$ since $\log|x - y| \le \log\langle x\rangle + \log\langle y\rangle + C$. In both cases $z^{\Phi-1}e^{-h}$ is not $L^2$-normalizable at infinity. However, if $\mu$ can change sign and is not compactly supported then there could be $\Phi \in \mathbb{Z}$ zero energy states. For example the radial field (with $\beta > 0$, $N \in \mathbb{N}$)
$$B(x) = \begin{cases} 2(N+\beta)e^{-2} & \text{for } |x| \le e, \\ -\beta(|x|\log|x|)^{-2} & \text{for } e < |x|, \end{cases}$$
with $\Phi = \frac{1}{2\pi}\int B = N$, is generated by a radial potential $h(x)$ such that $h(x) = N\log|x| + \beta\log\log|x|$ for large $x$ and is regular for small $x$. The threshold state $z^{\Phi-1}e^{-h}$


is normalizable only for $\beta > \frac12$, so in this case the dimension of the kernel is $\Phi$, otherwise $\Phi - 1$. $\Box$

This theorem requires $\mu \in \mathcal{M}$. We will see in Sect. 4 that the Aharonov–Casher theorem need not be true for magnetic fields with infinite total variation. However, the proof above still works for magnetic fields that can be decomposed into the sum of a component in $\mathcal{M}$ and a component with a regularly behaving generating potential. We just remark about one possible extension:

Corollary 3.3. Suppose that $\mu \in \mathcal{M}$ can be written as $\mu = \mu_{rad} + \tilde\mu$ such that $\tilde\mu \in \mathcal{M}$ and $\mu_{rad}$ is a rotationally symmetric Borel measure (i.e., $\mu_{rad} = \mu_{rad}\circ R$ for any rotation $R$ in $\mathbb{R}^2$ around the origin). We can assume that $\mu_{rad}(\{0\}) = 0$ by including the possible delta function at the origin into $\tilde\mu$. We assume that $\Phi_{rad} := \lim_{R\to\infty}\Phi(R) := \lim_{R\to\infty}\frac{1}{2\pi}\int_{|x|\le R}\mu_{rad}(dx)$ exists (possibly infinite) and $h_{rad}(x) := \Phi(|x|)\log|x|$ satisfies $\nabla h_{rad} \in L^\infty_{loc}$. Then all statements of Theorem 3.1 for the Pauli operator with magnetic field $\mu$ are valid with $\Phi := \Phi_{rad} + \frac{1}{2\pi}\int\tilde\mu^*(dx)$.

Proof. Let $A_{rad} := \nabla^\perp h_{rad} \in L^\infty_{loc}$ and let $\tilde h$ be the generating function of $\tilde\mu$ given in Theorem 2.2 for some $\varepsilon < \varepsilon(\tilde\mu)$. Then $(\tilde h, A_{rad}) \in \mathcal{P}^*$ with a magnetic field $\mu$, hence the Pauli operators are well defined and unitarily equivalent. Clearly $\pi^{\tilde h,A_{rad}}(\psi,\psi) = \pi^{h}(\psi,\psi)$ with $h = h_{rad} + \tilde h$, hence any zero energy state $\psi$ must be in the form $\psi = (e^{h}g_+, e^{-h}g_-)$, where $g_\pm$ are (anti)holomorphic. Now we can follow the proof of Theorem 3.1. We use the estimates (7), (9) for the $\tilde h$ part of the generating potential and we estimate $h_{rad}(x) := \Phi(|x|)\log|x|$ by $\lim_{R\to\infty}\Phi(R)$ for large $R$, $\lim_{R\to 0}|\Phi(R)| = 0$ for small $R$ and by $\Phi \in L^\infty_{loc}(\mathbb{R})$ for intermediate $R$. $\Box$

4. A Counterexample

In this section we present the construction of Counterexample 1.1. For simplicity, the magnetic field will be only bounded and not continuous, but it will be easy to see that a small mollification does not modify the estimates.
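Let $\delta < 1/10$ be a fixed small number and $N_k = 10k$ for $k = 1, 2, \ldots$. We denote the $N_k$-th roots of unity by $\zeta_{k,j} := \exp(2\pi i j/N_k)$, $j = 1, 2, \ldots, N_k$. Let $D_{k,n,j} := \{x : |x - n\zeta_{k,j}| \le \delta\}$ be the disk of radius $\delta$ about $n\zeta_{k,j}$, and let $\tilde D_{k,n,j} := \{x : |x - n\zeta_{k,j}| \le 2\delta\}$ be the twice bigger disk.

Let $0 < \varepsilon < 1/4$ be fixed. The magnetic field $B$ is given as $B := B_0 + \tilde B - \hat B$ with
$$B_0(x) := 2(1+\varepsilon)\delta^{-2}\,\mathbf{1}(|x| \le \delta),$$
$$\tilde B := \sum_{k=1}^{\infty}\tilde B_k, \qquad \tilde B_k := \sum_{n=4^k+1}^{4^k+2^k}\tilde B_{k,n}, \qquad \tilde B_{k,n}(x) := \sum_{j=1}^{N_k} 2\delta^{-2}\,\mathbf{1}(x \in D_{k,n,j})$$
and
$$\hat B := \sum_{k=1}^{\infty}\hat B_k, \qquad \hat B_k := \sum_{n=4^k+1}^{4^k+2^k}\hat B_{k,n}, \qquad \hat B_{k,n}(x) := \frac{1}{2\pi}\int_0^{2\pi}\tilde B_{k,n}(|x|e^{i\theta})\,d\theta.$$
The field $\tilde B$ consists of uniform field "bumps" with flux $2\pi$ localized on the disks $D_{k,n,j}$ around points $n\zeta_{k,j}$ that are located on concentric circles of radius $n$. The field $\hat B$ is the radial average of $\tilde B$. The field $B_k = \tilde B_k - \hat B_k$ is called the $k$-th band. The relation (5) is straightforward by construction.

We define the potential function $h := h_0 + \tilde h - \hat h$, with
$$h_0(x) := \frac{1}{2\pi}\int_{\mathbb{R}^2}\log|x - y|\,B_0(y)\,dy,$$
$$\tilde h := \sum_{k=1}^{\infty}\tilde h_k, \qquad \tilde h_k := \sum_{n=4^k+1}^{4^k+2^k}\tilde h_{k,n}, \qquad \tilde h_{k,n}(x) := \frac{1}{2\pi}\int_{\mathbb{R}^2}\log|x - y|\,\tilde B_{k,n}(y)\,dy - N_k\log n$$
and
$$\hat h := \sum_{k=1}^{\infty}\hat h_k, \qquad \hat h_k := \sum_{n=4^k+1}^{4^k+2^k}\hat h_{k,n}, \qquad \hat h_{k,n}(x) := \frac{1}{2\pi}\int_0^{|x|}\frac{dr}{r}\int_{|y|\le r}\hat B_{k,n}(y)\,dy.$$
Clearly $\Delta\tilde h_k = \tilde B_k$, $\Delta\hat h_k = \hat B_k$. Easy computations yield the following relations:
$$h_0(x) = (1+\varepsilon)\log|x| \quad \text{for } |x| \ge \delta,$$
$$\tilde h_{k,n}(x) = \sum_{j=1}^{N_k}\log\Bigl|\zeta_{k,j} - \frac{x_1 + ix_2}{n}\Bigr| = \log\Bigl|1 - \Bigl(\frac{x_1 + ix_2}{n}\Bigr)^{N_k}\Bigr| \quad \text{for } x \notin \bigcup_{j=1}^{N_k} D_{k,n,j},$$
$$\hat h_{k,n}(x) = \begin{cases} 0 & \text{for } |x| \le n - \delta, \\ N_k\log\frac{|x|}{n} + O(n^{-1}) & \text{for } |x| \ge n - \delta. \end{cases}$$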
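The closed form $\log|1 - ((x_1+ix_2)/n)^{N_k}|$ for the sum over the roots of unity comes from the factorization $\prod_j (w - \zeta_{k,j}) = w^{N_k} - 1$. A quick numerical sanity check can be sketched as follows; the function names are hypothetical and the sample points are chosen here, away from the points $n\zeta_{k,j}$, purely for illustration.

```python
import cmath
import math


def log_sum_over_roots(z, n, N):
    """Sum over the N-th roots of unity zeta_j of log|zeta_j - z/n|."""
    return sum(
        math.log(abs(cmath.exp(2j * math.pi * j / N) - z / n))
        for j in range(1, N + 1)
    )


def log_closed_form(z, n, N):
    """Closed form log|1 - (z/n)^N| of the same quantity."""
    return math.log(abs(1 - (z / n) ** N))


if __name__ == "__main__":
    # z plays the role of x_1 + i x_2; (n, N) play the role of (n, N_k).
    for z, n, N in [(3.2 + 1.1j, 5, 10), (7 - 2j, 5, 10), (0.3j, 17, 20)]:
        print(log_sum_over_roots(z, n, N), log_closed_form(z, n, N))
```

The two columns agree up to rounding, both inside and outside the circle of radius $n$, as long as the point stays off the roots themselves.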

The infinite sums in the definition of $\tilde h$ and $\hat h$ are absolutely convergent, hence $h \in L^\infty_{loc}$. The sum of the $\tilde h_k(x)$'s converges since
$$\sum_{k=1}^{\infty}\sum_{n=4^k+1}^{4^k+2^k}\Bigl|\frac{x}{n}\Bigr|^{N_k} < \infty$$
for each fixed $x$, and $\hat h_k(x)$ is actually zero for all but finitely many $k$. Therefore we know that $\Delta h = B$ in the distributional sense. Moreover, we can rearrange the sums and write
$$h = h_0 + \sum_{k=1}^{\infty} h_k, \qquad h_k := \sum_{n=4^k+1}^{4^k+2^k} h_{k,n}, \qquad h_{k,n} := \tilde h_{k,n} - \hat h_{k,n}.$$
A short calculation shows that for each $k_0$,
$$\sum_{k=1,\ k\ne k_0}^{\infty}|h_k(x)| = O(1) \qquad \text{for } 3\cdot 4^{k_0-1} - 1 \le |x| \le 3\cdot 4^{k_0} + 1. \tag{27}$$
Hence the size of $h(x)$ is determined by $h_0(x)$ and the band nearest to $x$.


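Now we show that $\int e^{-2h} = \infty$, in fact $\int_D e^{-2h} = \infty$, where $D := \bigcup_{k,n,j}\tilde D_{k,n,j}\setminus D_{k,n,j}$. The proof of $\int e^{2h} = \infty$ is similar but easier. We fix $k \ge 1$, $4^k + 1 \le m \le 4^k + 2^k$, $1 \le G \le N_k$ and let $x \in \tilde D_{k,m,G}\setminus D_{k,m,G}$. Then
$$h_{k,n}(x) = \begin{cases} \log\Bigl|1 - \Bigl(\dfrac{n}{x_1 + ix_2}\Bigr)^{N_k}\Bigr| + N_k\,O(n^{-1}) & \text{for } n < m, \\[2mm] \log\Bigl|1 - \Bigl(\dfrac{x_1 + ix_2}{n}\Bigr)^{N_k}\Bigr| & \text{for } n > m. \end{cases}$$
Writing $x_1 + ix_2 = m\zeta_{k,G} + (H_1 + iH_2)$, $H = (H_1, H_2) \in \mathbb{R}^2$, $\delta \le |H| \le 2\delta$ and expanding $h_{k,n}(x)$ around $m\zeta_{k,G}$ up to second order in $H$ we easily obtain that $h_{k,n}(x) \le N_k\,O(n^{-1})$ for each $n \ne m$ if $\delta$ is small enough, hence $h_k(x) \le h_{k,m}(x) + O(1)$. Moreover, $h_{k,m}(x) = \log\bigl|1 - [(x_1 + ix_2)/m]^{N_k}\bigr| + O(1)$ since $|\hat h_{k,m}(x)| = O(1)$ for any $m - 2\delta \le |x| \le m + 2\delta$. Hence
$$\int e^{-2h} \ge C\sum_{k=1}^{\infty}\sum_{m=4^k+1}^{4^k+2^k}\sum_{G=1}^{N_k}\frac{1}{m^{2(1+\varepsilon)}}\int_{\tilde D_{k,m,G}\setminus D_{k,m,G}}\Bigl|1 - \Bigl(\frac{x}{m}\Bigr)^{N_k}\Bigr|^{-2}dx$$
$$= C\sum_{k=1}^{\infty}\sum_{m=4^k+1}^{4^k+2^k}\frac{N_k}{m^{2(1+\varepsilon)}}\int_{\delta\le|H|\le 2\delta}\Bigl|1 - \Bigl(1 + \frac{H_1 + iH_2}{m}\Bigr)^{N_k}\Bigr|^{-2}dH$$
$$\ge C\sum_{k=1}^{\infty}\frac{1}{N_k}\,2^{k(1-4\varepsilon)} = \infty.$$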
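The passage from the integral over the annulus to the final sum is not spelled out in the text. One way to see it, as a heuristic sketch under the assumption that $N_k|w|/m$ is small on the annulus (which holds for large $k$ because $N_k = 10k$ and $m > 4^k$), with the shorthand $w := H_1 + iH_2$ introduced here, is:

```latex
\begin{align*}
\Bigl(1+\frac{w}{m}\Bigr)^{N_k} - 1
  &= \frac{N_k\,w}{m}\,\bigl(1 + O(N_k|w|/m)\bigr), \\
\int_{\delta\le|w|\le 2\delta}\Bigl|1-\Bigl(1+\frac{w}{m}\Bigr)^{N_k}\Bigr|^{-2}dw
  &\approx \frac{m^2}{N_k^2}\int_{\delta\le|w|\le 2\delta}\frac{dw}{|w|^2}
   = 2\pi\log 2\;\frac{m^2}{N_k^2}, \\
\frac{N_k}{m^{2(1+\varepsilon)}}\cdot\frac{m^2}{N_k^2}
  &= \frac{m^{-2\varepsilon}}{N_k}
   \;\ge\; c\,\frac{2^{-4\varepsilon k}}{N_k}
   \qquad (4^k < m \le 2\cdot 4^k), \\
\sum_{m=4^k+1}^{4^k+2^k}\frac{m^{-2\varepsilon}}{N_k}
  &\ge c\,\frac{2^{k(1-4\varepsilon)}}{N_k}.
\end{align*}
```

Since $N_k$ grows only linearly in $k$ and $\varepsilon < 1/4$, the terms $2^{k(1-4\varepsilon)}/N_k$ tend to infinity, so the series diverges.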

Finally, we have to show that $e^{-h}\bar f \notin L^2(\mathbb{R}^2)$ for any entire function $f$. First we show that $f$ cannot have zeros. Suppose that $a$ is (one of) its zero closest to the origin, i.e., $f(z) = (z - a)^m g(z)$, $g$ is entire, $g(0) \ne 0$, $m \ge 1$. Let $A_k := \{x : 3\cdot 4^k - 1 \le |x| \le 3\cdot 4^k + 1\}$, then $h(x) = h_0(x) + O(1)$ for all $x \in A_k$ by (27). Hence for a large enough $K$,
$$\int e^{-2h}|f|^2 \ge C\sum_{k=K}^{\infty}\int_{A_k} e^{-2h_0(x)}|x - a|^{2m}|g(x)|^2\,dx \ge C\sum_{k=K}^{\infty} 4^{2k(m-1-\varepsilon)}\int_{A_k}|g(x)|^2\,dx \ge C|g(0)|^2\sum_{k=K}^{\infty} 4^{2k(m-1-\varepsilon)}\cdot 4^k = \infty,$$
using that $|g|^2$ is subharmonic and the area of $A_k$ is of order $4^k$.

Now, since $f$ has no zeros, we can write $f = e^{\varphi}$ and we would like to show that $\varphi$ is constant. It is enough to show that $R := \mathrm{Re}\,\varphi$ is constant and we can assume $R(a) = 0$. Suppose that $\nabla R(a) \ne 0$ for some $a \in \mathbb{C}$. Let $z_k$ be the point where the maximum of $R$ over the closed disk $D_k := \{|x| \le 3\cdot 4^k\}$ is attained. Since $R$ is harmonic, $|z_k| = 3\cdot 4^k$. Using (27) and the subharmonicity of $|e^{2\varphi}|$, we have
$$\int e^{-2h}|f|^2 \ge C\sum_{k=1}^{\infty} 4^{-2k(1+\varepsilon)}\int_{|x - z_k|\le 1}|e^{2\varphi(x)}|\,dx \ge C\sum_{k=1}^{\infty} 4^{-2k(1+\varepsilon)}e^{2R(z_k)}. \tag{28}$$


From the Poisson formula we easily obtain $|\nabla R(a)| \le 4^{-k}\max_{D_k} R = 4^{-k}R(z_k)$ for large enough $k$. Hence $R(z_k) \ge 4^k|\nabla R(a)|$ and the integral in (28) is infinite. $\Box$

Acknowledgement. This work started during the first author's visit at the Erwin Schrödinger Institute, Vienna. Valuable discussions with T. Hoffmann-Ostenhof and M. Loss are gratefully acknowledged. The authors thank the referee for careful reading and comments.

References

[A-C] Aharonov, Y., Casher, A.: Ground state of spin-1/2 charged particle in a two-dimensional magnetic field. Phys. Rev. A19, 2461–2462 (1979)
[CFKS] Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger Operators with Application to Quantum Mechanics and Global Geometry. Berlin–Heidelberg–New York: Springer-Verlag, 1987
[E-S] Erdős, L., Solovej, J.P.: The kernel of the Dirac operator. Rev. Math. Phys. 13 No. 10, 1247–1280 (2001)
[G-R] Garcia-Cuerva, J., Rubio de Francia, J.L.: Weighted Norm Inequalities and Related Topics. Amsterdam: North-Holland, 1985
[K] Kilpeläinen, T.: Weighted Sobolev spaces and capacity. Ann. Acad. Sci. Fenn., Series A. I. Math. 19, 95–113 (1994)
[L] Leinfelder, H.: Gauge invariance of Schrödinger operators and related spectral properties. J. Op. Theory 9, 163–179 (1983)
[L-S] Leinfelder, H., Simader, C.: Schrödinger operators with singular magnetic vector potentials. Math. Z. 176, 1–19 (1981)
[L-L] Lieb, E., Loss, M.: Analysis. Providence, RI: Amer. Math. Soc., 1997
[Mi] Miller, K.: Bound states of Quantum Mechanical Particles in Magnetic Fields. Ph.D. Thesis, Princeton University, 1982
[Si] Simon, B.: Maximal and minimal Schrödinger forms. J. Operator Theory 1, 37–47 (1979)
[So] Sobolev, A.: On the Lieb-Thirring estimates for the Pauli operator. Duke J. Math. 82, 607–635 (1996)
[St] Stein, E.: Harmonic Analysis. Princeton, NJ: Princeton University Press, 1993

Communicated by B. Simon

Commun. Math. Phys. 225, 423 – 448 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

Space-Time Invariant Measures, Entropy, and Dimension for Stochastic Ginzburg–Landau Equations

Jacques Rougemont

Department of Mathematics, Heriot–Watt University, Edinburgh EH14 4AS, United Kingdom

Received: 21 June 2000 / Accepted: 28 September 2001

Abstract: We consider a randomly forced Ginzburg–Landau equation on an unbounded domain. The forcing is smooth and homogeneous in space and white noise in time. We prove existence and smoothness of solutions, existence of an invariant measure for the corresponding Markov process and we define the spatial densities of topological entropy, of measure-theoretic entropy, and of upper box-counting dimension. We prove inequalities relating these different quantities. The proof of existence of an invariant measure uses the compact embedding of some space of uniformly smooth functions into the space of locally square-integrable functions and a priori bounds on the semi-flow in these spaces. The bounds on the entropy follow from spatially localised estimates on the rate of divergence of nearby orbits and on the smoothing effect of the evolution.

1. Introduction The use of dynamical system techniques and ideas in the study of extended partial differential equations has proved extremely fruitful in the past, see for example P. Collet’s talk at ICM’98, [C2] (where he also emphasises the limitations of such an approach). However, until now only results using topological or geometric properties of the dynamics have been used (like invariant manifolds, bifurcation theory, topological entropy, Hausdorff dimension). That is to say, extended dynamical systems are usually regarded as topological dynamical systems. In contrast, most of the very deep results in finite dimensional dynamical systems use measure-theoretic ideas, namely ergodic theory (as advocated for instance in the review by L.-S. Young at the ICMP in 1997, [Y2]). One of the favourite models of infinite dimensional dynamical systems studied recently is the Ginzburg–Landau equation. It appears as a generic normal form describing the amplitude of periodic bifurcated solutions (see [C2]) and it is also believed to be a good example of spatio-temporally chaotic dynamics [LO]. It is known that its attractor is infinite-dimensional and has positive ε-entropy (see [CE1, CE2, CE3, Ro]).


Here we propose to use random perturbations to obtain, by probabilistic techniques, the existence of invariant measures for the corresponding random dynamical system. The existence result is based on the observation by J. Ginibre and G. Velo that the Ginzburg– Landau equation has global solutions both in uniformly local Sobolev spaces and in local L2 space. Their proofs go through to the stochastic case without much effort, if we assume the noise to be bounded and smooth in space. Since uniformly local Sobolev spaces of sufficiently high order are compactly embedded into local L2 space, we get the Feller property of the semi-group and the tightness of the Cesàro means therefore existence of an invariant measure (by standard arguments for stochastic differential equations, see [DZ1]). These measures are also translation invariant, because the noise, the deterministic part of the equation and the spaces used are all translation invariant. We refer to the property of being invariant under the time evolution as “stationarity” and the invariance under space translations as “homogeneity” of the measure, following Vishik and Fursikov [VF]. In a second part of the paper we define the topological entropy and the measuretheoretic entropy, or rather their spatial densities, since both quantities are extensive (this has been discovered in this context by Collet and Eckmann in [CE1], see also [Ru] for earlier similar ideas). Usual inequalities from ergodic theory can be proved in this case and the Collet–Eckmann bound on the topological entropy is also valid (see [CE2]). The paper is organised as follows: in Sect. 2, we set the model and the functional analysis background needed for the remainder of the paper. The main results of the paper are summarised in Sect. 3. In Sect. 4 we obtain uniform bounds on the solutions in Sobolev spaces, these bounds being then used in Sect. 5 to prove the existence of invariant measures. 
Section 6 is devoted to the results on existence and properties of the (measure-theoretic and topological) entropies. Various technical proofs are provided in the final Sects. 7–12.

We finish this introduction by commenting on the fact that many new results on invariant measures for nonlinear PDEs have recently appeared. We mention for instance [BKL, FM, EH, KS, Ku, Ma, S]. To the best of our knowledge the present work is the first where the model considered enjoys: infinite volume (hence continuous spectrum without gap), genuinely nonlinear interaction, homogeneous noise (hence infinite supply of energy at each time), non-trivial deterministic dynamics (e.g. the attractor of the deterministic Ginzburg–Landau is infinite dimensional). However, we are still unable to prove uniqueness of the invariant measure (i.e. an ergodicity result, as for example in [KS, F1, FM, Ma, BKL, DZ2]). The construction of a noise term for which ergodicity could be expected is a difficult problem in our context, because, intuitively, one should try to drive all unstable frequencies with the noise. However, our system has continuous Fourier spectrum, hence "all frequencies" is an uncountable set.

2. Model and Definitions

We consider equations of the form
$$du = \bigl((1 + i\alpha)\Delta u + u - (1 + i\beta)|u|^{2q}u\bigr)\,dt + \xi(x)\,dw(t), \qquad u(x,t) \in \mathbb{C},\ x \in \mathbb{R}^d,\ t \ge 0,\ \alpha, \beta \in \mathbb{R}, \tag{2.1}$$
where $\Delta$ is the $d$-dimensional Laplacian, $w(t)$ is a Wiener process and Eq. (2.1) is understood as an Itô stochastic differential in $t$. For the moment we simply assume $\xi \in C_b^\infty(\mathbb{R}^d)$. The extension to a convergent (e.g. finite) sum $\sum_k \xi_k(x)\,dw_k(t)$ is immediate


and time-dependent forces $\xi(x,t)\,dw(t)$ could also be considered. We refrain however from going into too much generality for the sake of readability. The noise term will be discussed in more detail in Sect. 5. We also assume $u(\cdot,0) = u_0 \in C_b^\infty(\mathbb{R}^d)$. We make the following

Hypothesis 2.1. We assume $d \le 2$, $q > \frac12$, and
$$-(1 + \alpha\beta) < |\alpha - \beta|\,\frac{\sqrt{2q+1}}{q}, \qquad |\beta| \le \frac{\sqrt{2q+1}}{q}. \tag{2.2}$$
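The parameter conditions of Hypothesis 2.1 are easy to test for concrete values. The sketch below uses function names invented for this illustration, and treats the strictness of the inequalities exactly as written in (2.2). It also samples parameters to illustrate that the second inequality forces the first.

```python
import math
import random


def gv_constant(q):
    """The constant sqrt(2q + 1) / q appearing in (2.2)."""
    return math.sqrt(2 * q + 1) / q


def satisfies_hypothesis(d, q, alpha, beta):
    """Check d <= 2, q > 1/2 and both inequalities of (2.2)."""
    c = gv_constant(q)
    first = -(1 + alpha * beta) < abs(alpha - beta) * c
    second = abs(beta) <= c
    return d <= 2 and q > 0.5 and first and second


def second_implies_first(q, samples=10_000, seed=0):
    """Sample (alpha, beta) with |beta| <= c and test the first inequality."""
    rng = random.Random(seed)
    c = gv_constant(q)
    for _ in range(samples):
        beta = rng.uniform(-c, c)
        alpha = rng.uniform(-50.0, 50.0)
        if not (-(1 + alpha * beta) < abs(alpha - beta) * c):
            return False
    return True


if __name__ == "__main__":
    print(satisfies_hypothesis(1, 1.0, 0.5, -0.5))
    print(satisfies_hypothesis(2, 1.0, 0.0, 2.0))
    print(second_implies_first(1.0))
```

The sampling is only a plausibility check, not a proof; the sampling range for $\alpha$ is an arbitrary choice made here.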

Remark 2.1. The second inequality in (2.2) implies the first one. We wrote the first one because it appears in this form in the proof of Proposition 4.1, while the second condition appears in the proof of Lemma 10.1, see Ginibre and Velo [GV1, GV2] for an extensive discussion of the influence of these parameters on the dynamics.

We next introduce the function spaces used in this paper. Let
$$\varphi_{\delta,y}(x) = \frac{1}{\cosh(\delta x_1)\cdots\cosh(\delta x_d)}. \tag{2.3}$$
The main feature of this function is that it belongs to $L^p$ for all $p$ and
$$\Bigl\|\frac{\nabla^n\varphi_{\delta,y}}{\varphi_{\delta,y}}\Bigr\|_\infty = A_n\delta^n < \infty \tag{2.4}$$

for all $y \in \mathbb{R}^d$ and $n \in \mathbb{N}$. This function is used as a weight on Sobolev and Lebesgue spaces:

Definition 2.1. The local Lebesgue space $L^2_{\delta,y}$ is defined as the completion of $C_b^\infty$ (bounded smooth functions) in the norm induced by the scalar product
$$\langle f, g\rangle_{\delta,y} = \int \varphi_{\delta,y}(x)\,\overline{f(x)}\,g(x)\,dx.$$
The local Sobolev spaces $H^m_{\delta,y}$ are defined as
$$H^m_{\delta,y} = \bigl\{f : \nabla^k f \in L^2_{\delta,y},\ k = 0, \ldots, m\bigr\}.$$
The uniformly local Sobolev spaces $H^m_{ul}$ are defined as the completion of $C_b^\infty$ in the norm
$$\|f\|^2_{H^m_{ul}} = \sum_{k=0}^{m}\sup_{y\in\mathbb{R}^d}\langle\nabla^k f, \nabla^k f\rangle_{\delta,y}.$$

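Remark that $H^m_{ul}$ is actually independent of $\delta > 0$, since the following inclusion holds:
$$H^m_{ul} = \bigcap_{y\in\mathbb{R}^d} H^m_{\delta,y} \subset \bigcap_{\delta>0} H^m_{\delta,y}. \tag{2.5}$$
Usual Sobolev embeddings hold [Ad]; for example if $m > d/2$, the inequality
$$\|f\|^2_\infty \le C_\delta\,\|f\|^2_{H^m_{ul}} \tag{2.6}$$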


implies the continuous embedding $H^m_{ul} \to L^\infty$. Moreover, by the Rellich–Kondrachov Theorem [Ad],
$$H^{m+k}_{\delta,y} \to H^m_{\delta',y} \tag{2.7}$$
is compact if $k > d/2$ and $0 < \delta < \delta'$ (see Sect. 11).

Notations. Throughout the paper, $\bar z$ denotes the complex conjugate of $z$, $f_t(x) \equiv f(x,t)$, hence $\|f_t\|_X$ is the norm of $f(x,t)$ in the space $X(dx)$ (e.g. $X = L^2$). Norms in Lebesgue spaces $L^p$ are denoted $\|\cdot\|_p$ ($\|\cdot\|$ is usually the norm on $L^2_{\delta,y}$ for the current choice of $\delta$ and $y$). Expectations and probabilities with respect to the Wiener measure are denoted $\mathbf{E}$ and $\mathbf{P}$. We denote the integer part of the positive real $x$ by $[x] \equiv \max\{n \in \mathbb{N} : n \le x\}$. Symbols $C, C_1, C_2, \ldots, c, c_1, c_2, \ldots$ usually denote generic numerical constants. The product $f \star g$ means the convolution of the functions $f$ and $g$. The cube of side $L$ and centre $0$ in $\mathbb{R}^d$ is $Q_L = [-\frac12 L, \frac12 L]^d$.

3. Summary of Results

In this section, we describe in a rather informal way the main results of this paper. The first result (Sect. 4) is the following theorem of existence of smooth bounded solutions to Eq. (2.1):

Theorem A. If Hypothesis 2.1 holds, then Eq. (2.1) with initial data $u_0 \in C_b^\infty$ has a unique solution $u(x,t) = u_t(x)$. For all real $p \ge 1$ and integer $m$, there is a $B_{p,m} < \infty$ such that for all $t > t_0(u_0)$:
$$\mathbf{E}\|u_t\|^p_{H^m_{ul}} \le B_{p,m}.$$

The proof relies on well-known estimates [GV1, Mi, C1] using the dissipative nature of the nonlinear term in Eq. (2.1) for the deterministic part and on Itô's Lemma to treat the stochastic term. Actually, in the evolution equation for $\mathbf{E}\|u_t\|^p_{H^m_{ul}}$, Itô's Lemma only generates terms which are dominated by the nonlinear dissipative term and this implies that the techniques which were developed for the deterministic equation are applicable.

By Lemma 10.1, Eq. (2.1) defines a stochastic semi-flow $\Phi^t_\omega$ on the space $L^2_{\delta,y}$ for any specific choice of $y$, for example $y = 0$. It also defines a Markovian Feller semi-group $P_t$ acting on $C_b(L^2_{\delta,0}, \mathbb{C})$:
$$P_t f(u) = \mathbf{E}\bigl(\omega \mapsto f(\Phi^t_\omega(u))\bigr).$$
An invariant measure for Eq. (2.1) is a fixed point of the dual semi-group $P_t^*$. We next assume that $\xi$ is a homogeneous process adapted to the Brownian motion. Since $H^m_{ul}$ is compactly embedded into $L^2_{\delta,y}$ for $m > d/2$ (see (2.7)), the following theorem (see Sect. 5) is an immediate consequence of Theorem A by the Prokhorov and Krylov–Bogolyubov Theorems (see [Ar, VF, DZ2]):

Theorem B. There exists at least one invariant measure $\mu$ for Eq. (2.1). This measure is homogeneous in $x$ and its support is contained in $\bigcap_{m\ge 0} H^m_{ul}$.


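Finally, in Sect. 6, we define the random attractor (see [CDF, D])
$$A(\omega, R) = \bigcap_{T>0}\overline{\bigcup_{t>T}\Phi^t_{\theta^{-t}\omega}(B_R)}^{\,L^2_{\delta,0}}, \qquad A_\omega = \overline{\bigcup_{R>0} A(\omega, R)}^{\,L^2_{\delta,0}}.$$
Here and below, $\theta^t$ is the time-shift in the noise, and $T_x$ the group of spatial translations. Moreover $B_R \subset H^m_{ul}$ is the ball of radius $R$ and centre $0$. We introduce the following dynamical observables (see [KH, LQ]):
$$h_{top} = \lim_{\varepsilon\to 0}\lim_{L\to\infty}\frac{1}{L^d}\lim_{n\to\infty}\frac{1}{n\tau}\,\mathbf{E}\log N_{\omega,n,\tau,Q_L,\varepsilon},$$
$$h_\mu = \lim_{\varepsilon\to 0}\lim_{L\to\infty}\frac{1}{L^d}\lim_{n\to\infty}\frac{1}{n\tau}\,\mathbf{E}\,H_\mu\Bigl(\bigvee_{x\in\mathbb{Z}^d\cap Q_L}\ \bigvee_{k=0}^{n-1}\Phi^{-k\tau}T_{-x}\bigl(\Pi_{\theta^{k\tau}T_x\omega,\varepsilon}\bigr)\Bigr), \tag{3.1}$$
$$d_{up} = \limsup_{\varepsilon\to 0}\lim_{L\to\infty}\mathbf{E}\,\frac{\log M_{\varepsilon,Q_L,\omega}}{L^d\log\varepsilon^{-1}},$$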

where $N_{\omega,n,\tau,Q,\varepsilon}$ is the cardinality of a minimal $(n,\varepsilon)$-cover of $A_\omega|_Q$, $\Pi_{\omega,\varepsilon}$ is a sequence of partitions of $A_\omega$ in sets of diameter at most $\varepsilon$ in the metric of $L^\infty(Q_1)$, $M_{\varepsilon,Q,\omega}$ is the least cardinality of an $\varepsilon$-cover of $A_\omega|_Q$ and $Q_L = [-\frac12 L, \frac12 L]^d$ (see Sect. 6 for detailed definitions). The quantities in Eq. (3.1) are called respectively the topological entropy, the metric or measure-theoretic entropy [KH] and the upper (box-counting) dimension. It is important to note that the above numbers are all spatial densities (limit as $L \to \infty$ of quantities divided by $L^d$) although the limits are not taken in the most natural order. They are thus spatially localised versions of the usual entropies and dimensions. We then prove the following estimates:

Theorem C. There is a $\gamma < \infty$ such that $h_\mu \le h_{top} \le \gamma\,d_{up} < \infty$.

The proof that all the various limits in Eq. (3.1) exist relies on standard subadditive bounds [KH]. The upper bound on $d_{up}$ follows from spatially localised estimates of the rate of divergence of nearby orbits (Lemma 6.1) as well as the smoothing action of the evolution (see Sect. 7, in particular Lemma 7.1). It is similar to the proof of the deterministic case [CE2].

4. Bounded Smooth Solutions

In this section, we prove bounds on solutions $u(t)$ to Eq. (2.1) using weighted energy inequalities and Itô's Lemma. The aim is to prove that the expectations of moments of $\|u(t)\|_{H^m_{ul}}$ (and of $\|u(t)\|_\infty$ as a consequence of (2.6)) are, asymptotically in time, uniformly bounded. Theorem 4.1 below summarises these results. We obtain these bounds recursively in $m$. Lemma 4.2 treats the case $m = 0$, Proposition 4.1 the case $m = 1$. This is sufficient to get $L^\infty$ bounds (by (2.6), see Proposition 4.2) in dimension $d = 1$, but for $d = 2$, we need the Gagliardo–Nirenberg inequality Lemma 4.3. Bounds for arbitrary $m$ then follow easily from standard energy estimates.


Theorem 4.1. If Hypothesis 2.1 holds, then Eq. (2.1) with initial data $u_0 \in C_b^\infty$ has a unique solution $u(x,t) = u_t(x)$. For all real $p \ge 1$ and integer $m$, there is a $B_{p,m} < \infty$ such that for all $t > t_0(u_0)$,
$$\mathbf{E}\|u_t\|^p_{H^m_{ul}} \le B_{p,m}.$$

Remark 4.1. This proof is amply simplified by our assumptions on the regularity of $\xi$ in Eq. (2.1). A much more general theory of stochastic PDEs on $\mathbb{R}^d$ can be found, for example, in Krylov [Kr]. Funaki [F1, F2] has studied similar equations with stronger assumptions on the nonlinearity and Eckmann–Hairer [EH] have recently proved a similar result for stochastic forcings with finite energy.

Proof. In the first part of the proof, we fix $y \in \mathbb{R}^d$ and $\delta > 0$ such that $A_1\delta + A_2\delta^2 < 1$ (see Eq. (2.4)). We write $\|\cdot\|$ and $(\cdot\,,\cdot)$ for the norm and scalar product in the corresponding space $L^2_{\delta,y}$. All bounds will actually turn out to be uniform in $y$. We stress that scalar products denoted $(\cdot\,,\cdot)$ contain the weight $\varphi_{\delta,y}$ (see Definition 2.1), hence integration by parts produces commutators with terms like $\nabla\varphi_{\delta,y}/\varphi_{\delta,y}$. From now on, we also write $\varphi$ for $\varphi_{\delta,y}$.

Let $L = (1 + i\alpha)\Delta - \frac12(2 + \alpha^2)$. For $f \in D^m(\Delta) \subset H^m_{\delta,y}$ (the domain of the closure in $H^m_{\delta,y}$ of $\Delta$ with core $C_b^\infty$), the following holds by Eq. (2.4):
$$\mathrm{Re}\,(\nabla^m f, \nabla^m L f) = -(\nabla^{m+1}f, \nabla^{m+1}f) + \mathrm{Re}\,\bigl(\nabla\varphi\,\varphi^{-1}\nabla^m f, (1 + i\alpha)\nabla^{m+1}f\bigr) - \frac12(2 + \alpha^2)(\nabla^m f, \nabla^m f)$$
$$\le -\frac12(\nabla^{m+1}f, \nabla^{m+1}f) - \frac12(\nabla^m f, \nabla^m f),$$
namely $L$ is a dissipative operator. By the Lumer–Phillips Theorem [Y1], $L$ generates a strongly continuous quasi-bounded semi-group $\exp(tL)$ on $H^m_{\delta,y}$, with
$$\bigl\|e^{tL}\bigr\|_{H^m_{\delta,y}\to H^m_{\delta,y}} \le e^{-t/2}. \tag{4.1}$$
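The inequality behind the dissipativity of $L$ is stated without its intermediate step. A sketch of one way to obtain it, assuming only $A_1\delta < 1$ (which follows from the choice $A_1\delta + A_2\delta^2 < 1$) together with the Cauchy–Schwarz and Young inequalities, is:

```latex
\begin{align*}
\bigl|\mathrm{Re}\,(\nabla\varphi\,\varphi^{-1}\nabla^m f,\,(1+i\alpha)\nabla^{m+1}f)\bigr|
  &\le A_1\delta\,\sqrt{1+\alpha^2}\;\|\nabla^m f\|\,\|\nabla^{m+1}f\| \\
  &\le \tfrac12\,\|\nabla^{m+1}f\|^2
     + \tfrac12\,A_1^2\delta^2(1+\alpha^2)\,\|\nabla^m f\|^2, \\
-\tfrac12(2+\alpha^2) + \tfrac12\,A_1^2\delta^2(1+\alpha^2)
  &\le -\tfrac12(2+\alpha^2) + \tfrac12(1+\alpha^2) = -\tfrac12 .
\end{align*}
```

Half of the term $-\|\nabla^{m+1}f\|^2$ is used to absorb the commutator, and the constant $\frac12(2+\alpha^2)$ in $L$ is exactly what is needed to leave a coefficient $-\frac12$ in front of $\|\nabla^m f\|^2$.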

We define mild solutions to Eq. (2.1) in $H^m_{\delta,y}$ by the Duhamel formula. Let $\zeta(t) = \int_0^t e^{(t-s)L}\xi\,dw(s)$ (Itô integral) and define
$$z_t = e^{tL}u_0 + \int_0^t e^{(t-s)L}\Bigl(\frac{4 + \alpha^2}{2} - (1 + i\beta)|z_s|^{2q}\Bigr)z_s\,ds + \zeta(t). \tag{4.2}$$
We let $\tilde P_M : \mathbb{R}_+ \to \mathbb{R}_+$ be a smooth cutoff function satisfying $\tilde P_M(x) = 1$ if $x < M^2$ and $\tilde P_M(x) = 0$ if $x > M + 1$. We next define $P_M(u) = \tilde P_M(\|u\|^2_{H^m_{ul}})$. We introduce this cutoff into the nonlinear term above, effectively rendering the nonlinearity uniformly Lipschitz:
$$\tilde z_t = e^{tL}u_0 + \int_0^t e^{(t-s)L}P_M(\tilde z_s)\Bigl(\frac{4 + \alpha^2}{2} - (1 + i\beta)|\tilde z_s|^{2q}\Bigr)\tilde z_s\,ds + \zeta(t). \tag{4.3}$$
We next define the random stopping time $\tau(R)$ by
$$\tau(R) = \min\bigl\{t > 0 : \|\tilde z_t\|_{H^m_{ul}} \ge R\bigr\}. \tag{4.4}$$
We fix arbitrarily a positive number $R < M$, and if $\chi_I$ denotes the characteristic function of the set $I$, we consider the following integral equation for $t < \tau(R)$:
$$u_t = e^{tL}u_0 + \int_0^{t\wedge\tau(R)} e^{(t-s)L}P_M(u_s)\Bigl(\frac{4 + \alpha^2}{2} - (1 + i\beta)|u_s|^{2q}\Bigr)u_s\,ds + \zeta(t). \tag{4.5}$$
The following is a simple consequence of our construction:

Lemma 4.1. For all $\delta$, $y$, all $u_0 \in H^m_{ul}$, there is almost surely a unique function $u_t \in H^m_{ul}$ satisfying Eq. (4.5), this function is independent of $M > R$ and it also satisfies Eq. (4.2) for $t < \tau(R)$.

Proof. The proof is classical, see [DZ1, Ku]. See also Sect. 10 for details of the uniqueness result. The remaining part of the proof of Theorem 4.1 follows very closely the paper by Mielke [Mi] which is based on [BGO, C1, GV1, GV2]. We first establish uniform bounds in L2δ,y . Lemma 4.2. For all δ > 0 and p ≥ 1, there are C0,p (δ) such that the following bound holds for all t > t0 (u0 ) and all y ∈ Rd : p

Eut L2 ≤ C0,p (δ).

(4.6)

δ,y

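Proof. We first estimate the square of the norm in $L^2_{\delta,y} = L^2$. By Itô's formula, we have
\[
\begin{aligned}
d\|u_t\|^2 &= -2\|\nabla u_t\|^2dt - 2\operatorname{Re}\,(\nabla\varphi\,\varphi^{-1}u_t, (1+i\alpha)\nabla u_t)\,dt + 2\|u_t\|^2dt\\
&\quad - 2\operatorname{Re}\,(u_t, (1+i\beta)|u_t|^{2q}u_t)\,dt + \|\xi\|^2dt + 2\operatorname{Re}\,(u_t,\xi)\,dw(t)\\
&\le -\|\nabla u_t\|^2dt + \bigl(2 + (1+\alpha^2)\bigr)\|u_t\|^2dt - 2(u_t, |u_t|^{2q}u_t)\,dt + \|\xi\|^2dt + 2\operatorname{Re}\,(u_t,\xi)\,dw(t)\\
&\le -\|\nabla u_t\|^2dt + C(\alpha,\|\xi\|,q)\,dt - \|u_t\|^2dt + 2\operatorname{Re}\,(u_t,\xi)\,dw(t).
\end{aligned} \tag{4.7}
\]
We integrate this last inequality over $t$ and take expectations. By standard arguments the expectation of the Itô integral vanishes (recall that we consider stopped solutions, Eq. (4.5), see [DZ1]) and we obtain
\[
\mathbf{E}\|u_T\|^2 \le \mathbf{E}\|u_0\|^2 - \mathbf{E}\int_0^T \|\nabla u_t\|^2dt - \mathbf{E}\int_0^T \bigl(\|u_t\|^2 - C\bigr)dt.
\]
By Gronwall's inequality, this is
\[
\mathbf{E}\|u_T\|^2 \le \max\bigl\{C_{0,2},\ (\mathbf{E}\|u_0\|^2 - C_{0,2})e^{-T} + C_{0,2}\bigr\}.
\]
For higher powers of the $L^2$ norm, we use Itô's formula again:
\[
\frac1p\,d\|u_t\|^{2p} = \|u_t\|^{2p-2}\,d\|u_t\|^2 + 2(p-1)\|u_t\|^{2p-4}\bigl(\operatorname{Re}\,(u_t,\xi)\bigr)^2dt,
\]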


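hence (after substituting the estimate (4.7))
\[
\mathbf{E}\|u_T\|^{2p} \le \mathbf{E}\|u_0\|^{2p} - \mathbf{E}\int_0^T \bigl(\|u_t\|^{2p} - C_{0,2p}\bigr)dt,
\]
which by Gronwall's inequality gives a uniform bound on $\mathbf{E}\|u_t\|^p$ for $p > 2$. For $p \in [1,2)$, we use Jensen's inequality:
\[
\mathbf{E}\|u_t\|^p \le \bigl(\mathbf{E}\|u_t\|^2\bigr)^{p/2} \le C_{0,2}^{p/2} = C_{0,p}.
\]
If $u_0$ is uniformly bounded and because $\|\xi\|_{L^2_{\delta,y}}$ is bounded uniformly in $y$ and $t$, we obtain the uniform bound in the spaces $L^2_{\delta,y}$ for all $y$,
\[
\sup_{t>0}\,\sup_{y\in\mathbb{R}^d}\,\mathbf{E}\|u_t\|^p_{L^2_{\delta,y}} \le C_{0,p}(\delta), \tag{4.8}
\]
which proves Lemma 4.2.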
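The Gronwall step in the proof of Lemma 4.2 can be illustrated numerically. The following is only a sketch under simplifying assumptions: it replaces the differential inequality for $\mathbf{E}\|u_t\|^2$ by the scalar comparison equation $y' = C - y$, and the function names `gronwall_bound` and `euler_solution` are hypothetical helpers introduced here, not part of the paper.

```python
import math

def gronwall_bound(y0, c, t):
    """Right-hand side max{C, (y0 - C) e^{-t} + C} of the Gronwall bound."""
    return max(c, (y0 - c) * math.exp(-t) + c)

def euler_solution(y0, c, t, steps=10_000):
    """Explicit Euler integration of the comparison equation y' = c - y."""
    h = t / steps
    y = y0
    for _ in range(steps):
        y += h * (c - y)
    return y

# Starting above or below the constant C, the solution stays under the bound.
for start in (5.0, 0.0):
    assert euler_solution(start, 1.0, 3.0) <= gronwall_bound(start, 1.0, 3.0) + 1e-9
```

The bound is attained by the exact solution when $y_0 > C$ and is simply $C$ when $y_0 \le C$, which is why the maximum appears.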

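Proposition 4.1. For all $\delta > 0$ and $p \ge 1$, there are $C_{1,p}(\delta)$ such that the following bound holds for all $t > t_0(u_0)$ and all $y \in \mathbb{R}^d$:
\[
\mathbf{E}\|u_t\|^p_{H^1_{\delta,y}} \le C_{1,p}(\delta).
\]
Proof. We first consider the differential
\[
\begin{aligned}
d\|\nabla u_t\|^2 &= -2\|\Delta u_t\|^2dt - 2\operatorname{Re}\,(\nabla\varphi\,\varphi^{-1}\nabla u_t, (1+i\alpha)\Delta u_t)\,dt + 2\|\nabla u_t\|^2dt\\
&\quad + 2\operatorname{Re}\,(\Delta u_t, (1+i\beta)|u_t|^{2q}u_t)\,dt + 2\operatorname{Re}\,(\nabla\varphi\,\varphi^{-1}\nabla u_t, (1+i\beta)|u_t|^{2q}u_t)\,dt\\
&\quad + \|\nabla\xi\|^2dt + 2\operatorname{Re}\,(\nabla u_t,\nabla\xi)\,dw(t)\\
&\le -2\|\Delta u_t\|^2dt + 2\|\nabla u_t\|^2dt + 2\operatorname{Re}\,(\Delta u_t, (1+i\beta)|u_t|^{2q}u_t)\,dt\\
&\quad + 2\bigl(\sqrt{1+\alpha^2}\,\|\Delta u_t\| + \sqrt{1+\beta^2}\,\||u_t|^{2q+1}\|\bigr)\|\nabla u_t\|\,dt\\
&\quad + \|\nabla\xi\|^2dt - 2\operatorname{Re}\,(u_t, \varphi^{-1}\nabla(\varphi\nabla\xi))\,dw(t),
\end{aligned} \tag{4.9}
\]
and we also compute the following differential that will help us to cancel out some of the terms above:
\[
\begin{aligned}
\frac{1}{q+1}\,d\||u_t|^{q+1}\|^2 &= 2\operatorname{Re}\,(|u_t|^{2q}u_t, (1+i\alpha)\Delta u_t)\,dt + 2\||u_t|^{q+1}\|^2dt - 2\||u_t|^{2q}u_t\|^2dt\\
&\quad + 2q\bigl(\operatorname{Re}\,(|u_t|^{q-1}u_t,\xi)\bigr)^2dt + 2\operatorname{Re}\,(|u_t|^{2q}u_t,\xi)\,dw(t)\\
&\le 2\operatorname{Re}\,(|u_t|^{2q}u_t, (1+i\alpha)\Delta u_t)\,dt + 2\||u_t|^{q+1}\|^2dt + 2q\|\xi\|^2\||u_t|^q\|^2dt\\
&\quad - 2\||u_t|^{2q}u_t\|^2dt + 2\operatorname{Re}\,(|u_t|^{2q}u_t,\xi)\,dw(t).
\end{aligned} \tag{4.10}
\]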


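We take a convex combination of Inequalities (4.10) and (4.9) (here $\lambda \in [0,1]$):
\[
\begin{aligned}
\frac{\lambda}{2}\,d\|\nabla u_t\|^2 + \frac{(1-\lambda)}{2}\frac{1}{q+1}\,d\||u_t|^{q+1}\|^2
&\le \bigl(\lambda\|\nabla u_t\|^2 + (1-\lambda)\||u_t|^{q+1}\|^2\bigr)dt\\
&\quad + \Bigl((1-\lambda)q\|\xi\|^2\||u_t|^q\|^2 + \frac{\lambda}{2}\|\nabla\xi\|^2\Bigr)dt\\
&\quad + \lambda\bigl(\sqrt{1+\alpha^2}\,\|\Delta u_t\| + \sqrt{1+\beta^2}\,\||u_t|^{2q+1}\|\bigr)\|\nabla u_t\|\,dt + M\,dt\\
&\quad + \operatorname{Re}\bigl(-\lambda(u_t,\varphi^{-1}\nabla(\varphi\nabla\xi)) + (1-\lambda)(|u_t|^{2q}u_t,\xi)\bigr)dw(t).
\end{aligned}
\]
The term denoted by $M$ is treated separately:
\[
\begin{aligned}
M &= -\bigl(\lambda\|\Delta u_t\|^2 + (1-\lambda)\||u_t|^{2q+1}\|^2\bigr) + \operatorname{Re}\Bigl(\bigl(1 - i(\lambda\beta - (1-\lambda)\alpha)\bigr)(|u_t|^{2q}u_t, \Delta u_t)\Bigr)\\
&\le -\varepsilon\Bigl(\lambda\|\Delta u_t\|^2 + \frac{(1-\lambda)}{q+1}\||u_t|^{2q+1}\|^2\Bigr) - 2(1-\varepsilon)\sqrt{\lambda(1-\lambda)}\,\bigl|(|u_t|^{2q}u_t, \Delta u_t)\bigr|\\
&\quad + \operatorname{Re}\Bigl(\bigl(1 - i(\lambda\beta - (1-\lambda)\alpha)\bigr)(|u_t|^{2q}u_t, \Delta u_t)\Bigr)\\
&\equiv -\varepsilon\Bigl(\lambda\|\Delta u_t\|^2 + \frac{(1-\lambda)}{q+1}\||u_t|^{2q+1}\|^2\Bigr) + \tilde M(\varepsilon).
\end{aligned}
\]
Under Hypothesis 2.1, there is an $\varepsilon > 0$ such that $\tilde M(\varepsilon)$ is negative (see e.g. [GV1, Mi, BGO]). The proof goes as follows: we first remark that integration by parts leads to
\[
(|u_t|^{2q}u_t, \Delta u_t) = -\Bigl(\frac{\nabla\varphi}{\varphi}|u_t|^{2q}u_t, \nabla u_t\Bigr) - (q+1)\int \varphi\,|u_t|^{2q}|\nabla u_t|^2\Bigl(1 + \frac{q}{1+q}\,\frac{\bar u_t^{\,2}(\nabla u_t)^2}{|u_t|^2|\nabla u_t|^2}\Bigr).
\]
The last bracket above is of the form $1 + z$. Its argument can be estimated as follows:
\[
|\arg(1+z)| \le \arcsin|z| = \arcsin\frac{q}{1+q} \equiv \theta^*.
\]
We plug this into $\tilde M$:
\[
\tilde M(\varepsilon) \le -(q+1)\bigl(|u_t|^{2q}, |\nabla u_t|^2\bigr)\Bigl\{2(1-\varepsilon)\sqrt{\lambda(1-\lambda)} + \cos\theta^* - |\lambda\beta - (1-\lambda)\alpha|\sin\theta^*\Bigr\} + C(\alpha,\beta,\lambda,\varepsilon)\,\||u_t|^{2q+1}\|\,\|\nabla u_t\|.
\]
The curly bracket above can be made positive by suitably choosing $\lambda$ and $\varepsilon$, namely we take $\lambda = \cos^2\eta$, we optimise for $\eta$ and we obtain the following condition, which is obviously fulfilled for small $\varepsilon > 0$ under Hypothesis 2.1 (remark that $1/\tan\theta^* = \sqrt{2q+1}/q$):
\[
-(1+\alpha\beta) - |\beta-\alpha|/\tan\theta^* + \varepsilon(2-\varepsilon)/\sin^2\theta^* \le 0,
\]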


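(see Ginibre and Velo [GV1] for this argument; Mielke [Mi] has a slightly different formulation). We thus obtain
\[
\begin{aligned}
\frac{\lambda}{2}\,d\|\nabla u_t\|^2 + \frac{(1-\lambda)}{2}\frac{1}{q+1}\,d\||u_t|^{q+1}\|^2
&\le -\varepsilon\Bigl(\lambda\|\Delta u_t\|^2 + \frac{(1-\lambda)}{q+1}\||u_t|^{2q+1}\|^2\Bigr)dt\\
&\quad + \bigl(C_1\|\nabla u_t\|^2 + C_2\||u_t|^{q+1}\|^2\bigr)dt + \bigl(C_3\|\xi\|^2\||u_t|^q\|^2 + C_4\|\nabla\xi\|^2\bigr)dt\\
&\quad + \bigl(C_5\|\Delta u_t\| + C_6\||u_t|^{2q+1}\| + C_7\bigr)\|\nabla u_t\|\,dt\\
&\quad + \operatorname{Re}\bigl(-\lambda(u_t,\varphi^{-1}\nabla(\varphi\nabla\xi)) + (1-\lambda)(|u_t|^{2q}u_t,\xi)\bigr)dw(t)\\
&\le C\,dt - \frac{\varepsilon}{2}\Bigl(\lambda\|\nabla u_t\|^2 + \frac{(1-\lambda)}{q+1}\||u_t|^{q+1}\|^2\Bigr)dt\\
&\quad + \operatorname{Re}\bigl(-\lambda(u_t,\varphi^{-1}\nabla(\varphi\nabla\xi)) + (1-\lambda)(|u_t|^{2q}u_t,\xi)\bigr)dw(t),
\end{aligned}
\]
thanks to the following obvious inequality:
\[
-\|\Delta u_t\|^2 \le -\rho\|\nabla u_t\|^2 + C\rho^2\|u_t\|^2 \tag{4.11}
\]
which holds for all $\rho > 0$ and for some $C > 0$. As before, we take expectations, integrate over $t$ and we use Gronwall's inequality to find the following bound:
\[
\max\bigl\{\mathbf{E}\|\nabla u_T\|^2,\ \mathbf{E}\||u_T|^{q+1}\|^2\bigr\} \le \max\bigl\{C_{1,2},\ (\mathbf{E}\|\nabla u_0\|^2 + \mathbf{E}\||u_0|^{q+1}\|^2 - C_{1,2})e^{-\varepsilon T} + C_{1,2}\bigr\}. \tag{4.12}
\]
This and Lemma 4.2 prove Proposition 4.1.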
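As a small sanity check on the angle $\theta^* = \arcsin\frac{q}{1+q}$ used in the proof of Proposition 4.1, the identity $1/\tan\theta^* = \sqrt{2q+1}/q$ can be verified numerically. The sketch below is illustrative only; the function names and the sample values of $q$ are assumptions made here.

```python
import math

def theta_star(q):
    """Angle theta* = arcsin(q / (1 + q)) bounding |arg(1 + z)|."""
    return math.asin(q / (1.0 + q))

def cot_theta_star(q):
    """Closed form sqrt(2q + 1) / q for 1 / tan(theta*)."""
    return math.sqrt(2.0 * q + 1.0) / q

# Compare both expressions on a few sample exponents q > 0.
for q in (0.5, 1.0, 2.0, 5.0):
    assert math.isclose(1.0 / math.tan(theta_star(q)), cot_theta_star(q), rel_tol=1e-9)
```

The identity follows from $\cos\theta^* = \sqrt{1 - q^2/(1+q)^2} = \sqrt{2q+1}/(1+q)$.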

We next consider solutions $z(x,t)$ to Eq. (4.2) with bounded initial condition. Proposition 4.1 on $u(x,t)$ implies the following:

Proposition 4.2. For all $p \ge 1$, there is a $C_{\infty,p}$ such that for all $t > t_0(u_0)$, the following holds:
\[
\mathbf{E}\|z_t\|^p_\infty \le C_{\infty,p}. \tag{4.13}
\]
Proof. By the bound (2.6), if $d = 1$ then Proposition 4.1 implies the bound (4.13) for the stopped solutions. If $d = 2$ we need a bound in $H^2_{\delta,y}$. This is easily achieved with the help of a Gagliardo–Nirenberg inequality which we prove in Sect. 12:

Lemma 4.3. Let $f \in H^3_{ul}(\mathbb{R}^2)$. For all $K > 0$ there are $C(K)$, $\eta$ such that
\[
\int \varphi_{\delta,y}(x)\,\bigl|\bigl(|f(x)|^{2q}f(x)\bigr)\Delta f(x)\bigr|\,dx \le \frac1K\int \varphi_{\delta,y}|\nabla^3 f(x)|^2dx + C(K)\Bigl(\sup_y\int \varphi_{\delta,y}(x)|f(x)|^{2(q+1)}dx\Bigr)^{\eta}.
\]


We use Inequality (4.11) (with $u_t$ replaced by $\nabla u_t$), Lemma 4.3, and the estimate (4.12) to bound the time derivative of $\|\Delta u_t\|^2_{L^2_{\delta,x}}$:
\[
\begin{aligned}
\frac12\,d\|\Delta u_t\|^2_{L^2_{\delta,x}} &\le -\frac12\|\nabla^3 u_t\|^2_{L^2_{\delta,x}}dt + \bigl(1 + (1+\alpha^2)\bigr)\|\Delta u_t\|^2_{L^2_{\delta,x}}dt\\
&\quad + C(1+\beta^2)\sup_y\||u_t|^{q+1}\|^{2\eta}_{L^2_{\delta,y}}dt + \frac12\|\Delta\xi\|^2_{L^2_{\delta,x}}dt + (\Delta u_t, \Delta\xi)_{\delta,x}\,dw(t)\\
&\le -\frac12\|\Delta u_t\|^2_{L^2_{\delta,x}}dt + C\,dt + (u_t, \varphi^{-1}\Delta(\varphi\Delta\xi))_{\delta,x}\,dw(t),
\end{aligned}
\]
where $C$ depends on the parameters in Eq. (2.1) (including $\xi$) and on $\|u_t\|^p_{L^2_{\delta,x}}$, $\||u_t|^{q+1}\|^{\eta}_{L^2_{\delta,x}}$, and $\|\nabla u_t\|^2_{L^2_{\delta,x}}$, which satisfy bounds (4.6) and (4.12) (or rather some extension of it to deal with the power $\eta$). By the usual Gronwall inequality, this proves (4.13) for stopped solutions (Eq. (4.5)). We now choose a very large $n_0 \gg C_{0,2} + C_{1,2} + C_{2,2}$ and we let
\[
E_n = \bigl\{u : \exists\, t < \tau(2n) \text{ s.t. } \|u_t\|_\infty > n_0 + n\bigr\},
\]
where $\tau$ is the stopping time from Eq. (4.4). By Tchebychev's inequality, by Proposition 4.1 and (2.6) we have
\[
\sum_{n=1}^\infty \mathbf{P}(E_n) \le \sum_{n=1}^\infty n^{-2}\sup_{t\le\tau(2n)}\mathbf{E}\bigl(\|u_t\|^2_\infty\bigr) < \infty.
\]
By the Borel–Cantelli Lemma, it means that almost surely only finitely many of the events $E_n$ happen, and hence $\|u_t\|_\infty$ remains bounded as the cutoffs $R$ and $M$ in Eq. (4.5) are sent to infinity (because the distribution of the stochastic process $u_t$ has bounded moments independently of the cutoff, see Sect. 10); a uniform bound holds in a small interval of time and this can be iterated indefinitely. This implies uniform boundedness of $z_t$, and a similar argument holds for $\|z_t\|^p_\infty$, $p > 1$.

Proof of Theorem 4.1. By Proposition 4.1, it only remains to show that $\mathbf{E}\|\nabla^m u_t\|^p$ is bounded for $m > 1$. Let $p = 2$. We assume that it is true for $m - 1$ and we consider
\[
\begin{aligned}
\frac12\,d\|\nabla^m u_t\|^2 &\le -\frac12\|\nabla^{m+1}u_t\|^2dt + \Bigl(1 + \frac12(1+\alpha^2)\Bigr)\|\nabla^m u_t\|^2dt\\
&\quad - \operatorname{Re}\bigl((1+i\beta)(\nabla^m u_t, \nabla^m(|u_t|^{2q}u_t))\bigr)dt + \frac12\|\nabla^m\xi\|^2dt + \operatorname{Re}\,(\nabla^m u_t, \nabla^m\xi)\,dw(t)\\
&\le -\frac12\|\nabla^{m+1}u_t\|^2dt + C_1\|\nabla^m u_t\|^2dt + \bigl(C_2\|u_t\|^2_{H^{m-1}_{\delta,y}} + C_3\bigr)dt\\
&\quad + (-1)^m\operatorname{Re}\,(u_t, \varphi^{-1}\nabla^m(\varphi\nabla^m\xi))\,dw(t).
\end{aligned}
\]


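Using (4.11) (with $u_t$ replaced by $\nabla^{m-1}u_t$), Proposition 4.2, and the recursion assumption, this can be bounded by:
\[
\frac12\,d\|\nabla^m u_t\|^2 \le \Bigl(-\frac12\|\nabla^m u_t\|^2 + C\Bigr)dt + (-1)^m\operatorname{Re}\,(u_t, \varphi^{-1}\nabla^m(\varphi\nabla^m\xi))\,dw(t).
\]
We then take expectations and integrate:
\[
\mathbf{E}\|\nabla^m u_t\|^2 \le \mathbf{E}\|\nabla^m u_0\|^2 - \mathbf{E}\int_0^t \bigl(\|\nabla^m u_s\|^2 - C\bigr)ds.
\]
The case $p \ne 2$ is similar.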

5. Invariant Measures

We now turn to the problem of the existence of an invariant measure for the process defined by Eq. (2.1). We construct here a simple example of a smooth homogeneous random forcing. This construction admits evident generalisations. Then, assuming such a homogeneous forcing, we prove that there exists (at least) one invariant measure for the time evolution which is also homogeneous in space. The proof is classical: it uses the Krylov–Bogolyubov argument and a tightness property (Prokhorov's Theorem). Compactness in $L^2_{\delta,0}$ is provided by Theorem 4.1, Eq. (2.5) and Eq. (2.7).

Let $\xi(x)$ be a $C^\infty$ almost periodic function on $\mathbb{R}^d$. Denoting $T_y\xi(x) = \xi(x+y)$, the set
\[
G = \overline{\{T_y\xi : y \in \mathbb{R}^d\}}^{\,L^\infty}
\]
is a compact group which can be endowed with the normalised Haar measure $h$. We denote by $(G, \mathcal{F}_1, h)$ the corresponding probability space ($\mathcal{F}_1$ is the sigma-algebra of Borel sets) and by $\xi_y$ the corresponding random variable. Let next $w_\alpha(t)$ be a standard Brownian motion (vanishing at 0) on the probability space $(C_0(\mathbb{R},\mathbb{R}), \mathcal{F}_2, W)$ ($\mathcal{F}_2$ is the sigma-algebra generated by the topology of uniform convergence on compact sets and $W$ is the Wiener measure). We define the noise by $\Xi_\omega(x,t) = \xi_y(x)w_\alpha(t)$, where $\omega = (\alpha,y) \in C_0(\mathbb{R},\mathbb{R})\times G = \Omega$. Our basic probability space is thus $(\Omega, \mathcal{F}, \mathbf{P}) = (C_0(\mathbb{R},\mathbb{R})\times G, \mathcal{F}_2\times\mathcal{F}_1, W\times h)$. By the definition of Haar measures, $\mathbf{P}$ is homogeneous in $x$, i.e. $T_y^*\mathbf{P} = \mathbf{P}$ for all $y \in \mathbb{R}^d$ (see Vishik and Fursikov [VF] for a discussion of homogeneous measures).

Let $\Phi^t_\omega$ be the semi-flow generated by Eq. (2.1) with noise $\Xi_\omega(x,t)$. Using Lemma 10.1, we can define a Markov semi-group $\mathcal{P}_t$ acting on $C_b(L^2_{\delta,0}, \mathbb{C})$ by
\[
\mathcal{P}_t f(u) = \mathbf{E}\bigl(\omega \mapsto f(\Phi^t_\omega(u))\bigr). \tag{5.1}
\]
$\mathcal{P}_t$ is a Markovian Feller semi-group (the Feller property follows from the continuity of $\Phi^t_\omega$). Its dual $\mathcal{P}_t^*$ acts on probability measures over $L^2_{\delta,0}$ by
\[
\bigl(\mathcal{P}_t^*\mu\bigr)(B) = \mathbf{E}\bigl(\omega \mapsto \mu(\Phi^{-t}_\omega(B))\bigr). \tag{5.2}
\]
We call $\mu$ an invariant measure for Eq. (2.1) if $\mathcal{P}_t^*\mu = \mu$ for all $t > 0$ (see Arnold [Ar]).

In this section, we prove the following theorem, which is actually a simple consequence of the bounds derived in Sect. 4.

Theorem 5.1. There exists at least one invariant measure $\mu$ for Eq. (2.1). This measure is homogeneous in $x$ and its support is contained in $\bigcap_{m\ge0}H^m_{ul}$.


Proof. We consider the family of measures $\{\mu_t\}_{t>0}$, where
\[
\mu_t = \frac1t\int_0^t \mathcal{P}_s^*\delta_0\,ds,
\]
$\delta_0$ being the unit mass at $0 \in L^2_{\delta,0}$. By Theorem 4.1, this family is tight in $L^2_{\delta,0}$ for all $\delta > 0$. Namely for any $\varepsilon > 0$ there is a compact $K_\varepsilon \subset\subset L^2_{\delta,0}$ such that $\mu_t(K_\varepsilon) > 1 - \varepsilon$. For the set $K_\varepsilon$ we choose the ball of radius $R(\varepsilon)$ in $H^m_{ul}$ ($m > d/2$) for sufficiently large $R(\varepsilon)$ and the compactness follows from (2.5) and (2.7). By the Prokhorov Theorem, $\{\mu_t\}_{t>0}$ is weakly precompact and thus there is at least one accumulation point $\mu$. By the standard Krylov–Bogolyubov argument $\mu$ is an invariant measure (see [Ar, VF, DZ2] for a detailed statement of these procedures).

Let $t_n$, $n = 1, 2, \dots$ be a sequence such that $\mu = \text{w-}\lim_{n\to\infty}\mu_{t_n}$. Let $B_R$ be the ball of radius $R$ in $H^m_{\delta,y}$ and let $f_{y,R}$ be any bounded continuous function on $L^2_{\delta,y}$ vanishing on $B_R$ (which is a compact set). Since the topologies of $L^2_{\delta,0}$ and of $L^2_{\delta,y}$ are equivalent, this function is continuous on $L^2_{\delta,0}$. Obviously $|\int f_{y,R}(\eta)\mu_{t_n}(d\eta)| < \|f\|_\infty\varepsilon(R)$ for all $n$ and $y$, where $\varepsilon(R) \to 0$ as $R \to \infty$. By weak convergence of $\mu_{t_n}$ to $\mu$ this also holds for $\mu$ and hence the support of $\mu$ must be contained in $\bigcap_{y\in\mathbb{R}^d}H^m_{\delta,y}$.

We next prove the homogeneity of $\mu$. Let $f \in C_b(H^m_{ul}, \mathbb{C})$ and define the translation operator $T_y$ by $T_yf(u) = f(T_yu)$. We have
\[
\begin{aligned}
\int T_yf(\eta)\,\mu(d\eta) &= \lim_{n\to\infty}\frac{1}{t_n}\int_0^{t_n}\!\!\int f(T_y\eta)\,\mathbf{P}(\Phi^t_\omega(0)\in d\eta)\,dt\\
&= \lim_{n\to\infty}\frac{1}{t_n}\int_0^{t_n}\!\!\int f(\eta)\,\mathbf{P}(T_y(\Phi^t_\omega(0))\in d\eta)\,dt\\
&= \lim_{n\to\infty}\frac{1}{t_n}\int_0^{t_n}\!\!\int f(\eta)\,\mathbf{P}(\Phi^t_{T_y\omega}(T_y(0))\in d\eta)\,dt\\
&= \lim_{n\to\infty}\frac{1}{t_n}\int_0^{t_n}\!\!\int f(\eta)\,\mathbf{P}(\Phi^t_\omega(0)\in d\eta)\,dt = \int f(\eta)\,\mu(d\eta),
\end{aligned}
\]
where we have used the homogeneity of $\mathbf{P}$. Since the above holds for all $f$, it proves that $\mu$ is homogeneous and the proof of Theorem 5.1 is finished.

Remark 5.1. In the above construction of a tight family of measures, we could have considered any homogeneous initial measure $\mu_0$ supported by $\bigcap_{m\ge0}H^m_{ul}$ instead of $\delta_0$.

6. Entropy Estimates

In this section, we define and estimate different notions of entropy for Eq. (2.1). We start with the topological entropy, then the measure-theoretic entropy and finally the upper box-counting dimension. All these quantities are extensive, hence we actually define their spatial densities. The inequalities between these different entropies (Theorem 6.1) are classical, and the proofs are straightforwardly adapted from [KH, LQ]. The proof that these are finite quantities follows from an estimate of the maximum rate of divergence of nearby orbits (similar to the largest Lyapunov exponent) in Lemma 6.1, see also [CE2].


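To do so we first introduce the basic dynamical setup: let $\Phi^t_\omega$ ($t > 0$) be the solution semi-flow to Eq. (2.1) for given noise parameter $\omega$ and let $\theta^t$ be the shift semi-flow on $\Omega$:
\[
\Xi_{\theta^\tau\omega}(x,t) = \Xi_\omega(x,t+\tau) - \Xi_\omega(x,t).
\]
Let next $S^t$ be the semi-flow on $L^2_{\delta,0}\times\Omega$ defined by
\[
S^t : L^2_{\delta,0}\times\Omega \to L^2_{\delta,0}\times\Omega,\qquad (u,\omega) \mapsto \bigl(\Phi^t_\omega(u), \theta^t(\omega)\bigr).
\]
We consider the space $H^m_{ul}$ ($m > d$) endowed with the (weaker) topology of uniform convergence on the compact $Q \subset\subset \mathbb{R}^d$. By standard embeddings (see (2.6) and (2.7)) bounded sets of $H^m_{ul}$ are compact in $L^\infty(Q)$. Following Crauel et al. [CDF, D], we define the random attractor $\mathcal{A}_\omega$ as follows: let $B_R$ be the ball of radius $R$, centre 0 in $H^m_{ul}$ and let
\[
\mathcal{A}(\omega,R) = \bigcap_{T>0}\overline{\bigcup_{t>T}\Phi^t_{\theta^{-t}\omega}(B_R)}^{\,L^2_{\delta,0}},\qquad
\mathcal{A}_\omega = \overline{\bigcup_{R>0}\mathcal{A}(\omega,R)}^{\,L^2_{\delta,0}}.
\]
By the estimates of Sect. 4, $\mathcal{A}_\omega$ is almost surely closed and bounded in $H^m_{ul}$, hence it is compact in $L^\infty(Q)$ for any bounded $Q \subset \mathbb{R}^d$. Moreover the diameter of $\mathcal{A}_\omega$ in $H^m_{ul}$ is less than some $R_\omega$ with $\mathbf{P}(\omega : R_\omega < \infty) = 1$ and $\mathbf{E}(\omega \mapsto R_\omega) < \infty$ (by Theorem 4.1). The following equivariance properties hold (we assume $\theta^tT_x = T_x\theta^t$ for all $(x,t) \in \mathbb{R}^d\times\mathbb{R}_+$):
\[
\Phi^t_\omega\mathcal{A}_\omega = \mathcal{A}_{\theta^t\omega},\qquad T_x\mathcal{A}_\omega = \mathcal{A}_{T_x\omega}, \tag{6.1}
\]
and $\mathcal{A}_\omega$ contains the support of any invariant measure for $S^t$. Let next $\mu$ be an invariant measure in the sense of Sect. 5, namely a stationary measure for the Markov semi-group (5.1). We also assume that $\mathbf{P}$ is an invariant measure for $\theta^t$. Then $\mu\times\mathbf{P}$ is an invariant measure for the dynamical system $(S^t, \mathcal{X}, \mathcal{B})$, where $\mathcal{X} = L^2_{\delta,0}\times\Omega$ and $\mathcal{B}$ is the associated sigma-algebra. More precisely, one has
\[
\mathbf{E}\bigl(\omega \mapsto (\Phi^t_\omega)^*\mu\bigr) = \mu \tag{6.2}
\]
(which is only a rephrasing of $\mathcal{P}_t^*\mu = \mu$, see Eq. (5.2)). We introduce the following definitions:

Definition 6.1. Let $\tau > 0$, $n \in \mathbb{N}$, and $Q \subset\subset \mathbb{R}^d$. We define a pseudo-metric $d_{\omega,n,\tau,Q}$ on $H^m_{ul}$ by
\[
d_{\omega,n,\tau,Q}(u,v) = \max_{k=0,\dots,n-1}\|\Phi^{k\tau}_\omega(u) - \Phi^{k\tau}_\omega(v)\|_{L^\infty(Q)}.
\]
Let $N_{\omega,n,\tau,Q,\varepsilon}$ be the cardinality of a minimal $(n,\varepsilon)$-cover of $\mathcal{A}_\omega|_Q$ (that is, $N_{\omega,n,\tau,Q,\varepsilon}$ is the least number of open sets whose diameter in the metric $d_{\omega,n,\tau,Q}$ is at most $\varepsilon$ and whose union contains $\mathcal{A}_\omega$).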
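The pseudo-metric of Definition 6.1 takes the largest distance between two orbits sampled at the first $n$ multiples of $\tau$. The sketch below illustrates this on a toy example; the logistic map standing in for the time-$\tau$ map and the names `orbit_metric`, `logistic`, `abs_dist` are assumptions made for illustration, not objects from the paper.

```python
def orbit_metric(step, u, v, n, dist):
    """d_n(u, v) = max over k = 0..n-1 of dist(F^k(u), F^k(v)) for a one-step map F."""
    best = 0.0
    for _ in range(n):
        best = max(best, dist(u, v))
        u, v = step(u), step(v)
    return best

def logistic(x):
    """Toy stand-in for the time-tau map of the semi-flow."""
    return 4.0 * x * (1.0 - x)

def abs_dist(a, b):
    """Toy stand-in for the sup norm on Q."""
    return abs(a - b)

# The metric can only grow with the number n of sampled times.
values = [orbit_metric(logistic, 0.2, 0.21, n, abs_dist) for n in range(1, 11)]
assert all(a <= b for a, b in zip(values, values[1:]))
```

Because the metric is non-decreasing in $n$, covers that are fine for long orbit segments are also fine for shorter ones.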


We define the cube $Q_L = [-\tfrac12L, \tfrac12L]^d$. We are now able to prove the existence of the spatial density of topological entropy $h_{top}$ for Eq. (2.1):

Proposition 6.1. For all $\tau > 0$ the following limit exists:
\[
h_{top} = \lim_{\varepsilon\to0}\lim_{L\to\infty}\frac{1}{L^d}\lim_{n\to\infty}\frac{1}{n\tau}\,\mathbf{E}\log N_{\omega,n,\tau,Q_L,\varepsilon}. \tag{6.3}
\]
This limit is independent of $\tau > 0$.

Proof. The proof is similar to the deterministic case treated by Collet and Eckmann in [CE2] and is reproduced in Sect. 8.

Let $\mathcal{U} = \{U_1, \dots, U_k, \dots\}$ be a countable (or finite) $\mu$-measurable partition of $\mathcal{A}_\omega$. For two partitions $\mathcal{U}$ and $\mathcal{V}$, we denote their refinement $\{U_k\cap V_\ell : U_k \in \mathcal{U}, V_\ell \in \mathcal{V}, \mu(U_k\cap V_\ell) > 0\}$ by $\mathcal{U}\vee\mathcal{V}$. Moreover $\Phi^{-\tau}_\omega(\mathcal{U}) = \{\Phi^{-\tau}_\omega(U_k) : U_k \in \mathcal{U}\}$ is a measurable partition of $\mathcal{A}_{\theta^{-\tau}\omega}$ whenever $\mathcal{U}$ is a measurable partition of $\mathcal{A}_\omega$. (Here $\Phi^{-t}_\omega$ stands for the inverse of $\Phi^t_\omega$, namely $\Phi^{-t}_\omega(x)$ is the set of all pre-images of $x$.)

Definition 6.2. Let $H_\mu(\mathcal{U})$ and $H_\mu(\mathcal{U}|\mathcal{V})$ denote the entropy of a partition and the conditional entropy, both relative to a given measure $\mu$. They are defined as follows:
\[
H_\mu(\mathcal{U}) = -\sum_{U\in\mathcal{U}}\mu(U)\log\mu(U),\qquad
H_\mu(\mathcal{U}|\mathcal{V}) = -\sum_{U\in\mathcal{U},\,V\in\mathcal{V}}\mu(U\cap V)\log\frac{\mu(U\cap V)}{\mu(V)}.
\]
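The two formulas of Definition 6.2 can be evaluated directly for finite partitions. The sketch below is illustrative: partitions are represented by lists of masses (an assumption made here), `joint[i][j]` stands for $\mu(U_i\cap V_j)$, and terms of zero mass are skipped, following the usual convention $0\log0 = 0$.

```python
import math

def entropy(masses):
    """H(U) = -sum mu(U) log mu(U), skipping atoms of zero mass."""
    return -sum(p * math.log(p) for p in masses if p > 0.0)

def conditional_entropy(joint):
    """H(U|V) = -sum mu(U∩V) log(mu(U∩V) / mu(V)), with joint[i][j] = mu(U_i ∩ V_j)."""
    col_mass = [sum(row[j] for row in joint) for j in range(len(joint[0]))]
    total = 0.0
    for row in joint:
        for p, pv in zip(row, col_mass):
            if p > 0.0:
                total -= p * math.log(p / pv)
    return total

# Four atoms of equal mass give the maximal value log(card U).
assert math.isclose(entropy([0.25] * 4), math.log(4))
```

With these definitions the chain rule $H_\mu(\mathcal{U}\vee\mathcal{V}) = H_\mu(\mathcal{V}) + H_\mu(\mathcal{U}|\mathcal{V})$ holds, which is the identity behind the subadditivity arguments used later.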

We adopt here the convention $0\log0 = 0$, therefore $0 \le H_\mu(\mathcal{U}) \le \log\operatorname{card}(\mathcal{U})$ (which is possibly infinite for countable $\mathcal{U}$). We also choose an arbitrary sequence $\Sigma_{\omega,\varepsilon}$ of partitions of $\mathcal{A}_\omega$ in sets of diameter at most $\varepsilon$ in the metric of $L^\infty(Q_1)$. The second result in this section is the existence of the spatial density of measure-theoretic entropy $h_\mu$.

Proposition 6.2. For all $\tau > 0$ the following limit exists:
\[
h_\mu = \lim_{\varepsilon\to0}\lim_{L\to\infty}\frac{1}{L^d}\lim_{n\to\infty}\frac{1}{n\tau}\,\mathbf{E}\,H_\mu\Bigl(\bigvee_{x\in\mathbb{Z}^d\cap Q_L}\ \bigvee_{k=0}^{n-1}\Phi^{-k\tau}_\omega T_{-x}\bigl(\Sigma_{\theta^{k\tau}T_x\omega,\varepsilon}\bigr)\Bigr). \tag{6.4}
\]
It is independent of $\tau > 0$ and of the particular choice of the sequence of partitions $\Sigma_{\omega,\varepsilon}$.

Proof. Again, the proof is quite standard, see e.g. [KH, LQ] and Sect. 9.

We next introduce the notion of upper density of dimension $d_{up}$.

Definition 6.3. Let $M_{\varepsilon,Q,\omega}$ be the least cardinality of an open cover of $\mathcal{A}_\omega$ by sets of diameter less than $\varepsilon$ in the metric of $L^\infty(Q)$, where $Q$ is compact (we call this an $\varepsilon$-cover of $\mathcal{A}_\omega|_Q$). Let $d_{up}$ be the upper density of dimension of $\mathcal{A}_\omega$:
\[
d_{up} = \limsup_{\varepsilon\to0}\lim_{L\to\infty}\mathbf{E}\,\frac{\log M_{\varepsilon,Q_L,\omega}}{L^d\log\varepsilon^{-1}}.
\]


The main results of the section are the following inequalities involving the different entropies just defined. Corresponding inequalities in finite dimensional dynamical systems are well-known [KH].

Theorem 6.1. There is a $\gamma < \infty$ such that
\[
h_\mu \le h_{top} \le \gamma\,d_{up} < \infty. \tag{6.5}
\]
Before giving the proof of Theorem 6.1, we state a lemma which will prove useful later on.

Lemma 6.1. There are $C$, $\gamma$ such that for all (sufficiently large) $L$ and all (sufficiently small) $\varepsilon > 0$, if $\|u - v\|_{L^\infty(Q_L)} \le \varepsilon$ then for $t > 0$, $L' = L - C(1+t)\log1/\varepsilon$, one has $\|\Phi^t_\omega(u) - \Phi^t_\omega(v)\|_{L^\infty(Q_{L'})} \le Ce^{\gamma t}\varepsilon$ almost surely.

Proof. Let $u_t$ and $v_t$ be two solutions to Eq. (2.1). By Lemma 10.1,
\[
\|u_t - v_t\|_{L^2_{\delta,0}} \le e^{\gamma t}\|u_0 - v_0\|_{L^2_{\delta,0}}
\]
and moreover both $\|u_t\|_\infty$ and $\|v_t\|_\infty$ are bounded uniformly in time (see Proposition 4.2). Let $K_t(\cdot)$ be the convolution kernel associated with the semi-group $\exp(tL)$ (see (4.1)) and let $r_s = u_s - v_s$. By Duhamel's formula, $|r_t(x)|$

t r0 (x)| + Kt−s G1 (us , vs )rs + G2 (us , vs )r s (x) ds 0 γt ≤ c1 e ε + |r0 (y)| sup

≤ |Kt

|x−y|2 ≤Ct log 1/ε

+ sup G1 (us , vs )∞ + G2 (us , vs )∞ 0≤s≤t

≤ c1 e

γt

ε+

sup |x−y|2 ≤Ct log 1/ε

≤ c3 (1 + t)e(1+γ )t 2ε +

0

t

|Kt−s | √ ϕδ,0

√

|r0 (y)| + c2 ϕδ,x |r0 |2 sup

|x−y|2 ≤Ct log 1/ε

|r0 (y)| +

√ ( ϕδ,x |rs |)(x) ds

t 0

sup

e

γs

|Kt−s | √ ϕ ds δ,0 2 |r0 (y)|

|x−y|≤C log 1/ε

$\le 4c_3e^{(2+\gamma)t}\varepsilon$, where in the last line we have assumed $|x| \le \tfrac12L - C(1+t)\log1/\varepsilon$ (hence $|y| \le \tfrac12L$) and used the assumption $\sup_{|y|\le L/2}|r_0(y)| \le \varepsilon$.

Proof of Theorem 6.1. We split Theorem 6.1 into three independent statements, namely each one of the three inequalities in (6.5).

Proof of $h_\mu \le h_{top}$. We follow the most standard proof (originally by Misiurewicz, quoted in [KH]). We modify the partition $\Sigma_{\omega,\varepsilon} = \{\sigma_1, \dots, \sigma_N\}$ by "shrinking" each element, namely by replacing each $\sigma_k$ by a closed set $U_k$ with $U_k \subset \sigma_k$ and we define $U_0 = \mathcal{A}_\omega\setminus\bigcup_{k=1}^NU_k$. We thus obtain a new partition $\mathcal{U}_{\omega,\varepsilon} = \{U_0, \dots, U_N\}$ and an open


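cover $\mathcal{V}_{\omega,\varepsilon} = \{U_1\cup U_0, \dots, U_N\cup U_0\}$. We assume that the $U_k$ have been chosen such that $\mathbf{E}\bigl(H_\mu(\Sigma_{\omega,\varepsilon}|\mathcal{U}_{\omega,\varepsilon})\bigr) < 1$. Remark that
\[
\operatorname{card}\Bigl(\bigvee_{x\in\mathbb{Z}^d\cap Q_L}\ \bigvee_{j=0}^{n-1}\Phi^{-j\tau}_\omega T_{-x}\bigl(\mathcal{U}_{\theta^{-j\tau}T_x\omega,\varepsilon}\bigr)\Bigr) \le 2^{nL^d}\operatorname{card}\Bigl(\bigvee_{x\in\mathbb{Z}^d\cap Q_L}\ \bigvee_{j=0}^{n-1}\Phi^{-j\tau}_\omega T_{-x}\bigl(\mathcal{V}_{\theta^{-j\tau}T_x\omega,\varepsilon}\bigr)\Bigr),
\]
and by Definition 6.2
\[
\begin{aligned}
H_\mu\Bigl(\bigvee_{x\in\mathbb{Z}^d\cap Q_L}\ \bigvee_{j=0}^{n-1}\Phi^{-j\tau}_\omega T_{-x}\bigl(\mathcal{U}_{\theta^{-j\tau}T_x\omega,\varepsilon}\bigr)\Bigr)
&\le \log\operatorname{card}\Bigl(\bigvee_{x\in\mathbb{Z}^d\cap Q_L}\ \bigvee_{j=0}^{n-1}\Phi^{-j\tau}_\omega T_{-x}\bigl(\mathcal{U}_{\theta^{-j\tau}T_x\omega,\varepsilon}\bigr)\Bigr)\\
&\le \log\operatorname{card}\Bigl(\bigvee_{x\in\mathbb{Z}^d\cap Q_L}\ \bigvee_{j=0}^{n-1}\Phi^{-j\tau}_\omega T_{-x}\bigl(\mathcal{V}_{\theta^{-j\tau}T_x\omega,\varepsilon}\bigr)\Bigr) + nL^d\log2.
\end{aligned}
\]
Consequently
\[
\begin{aligned}
&\lim_{L\to\infty}\frac{1}{L^d}\lim_{n\to\infty}\frac{1}{n\tau}\,\mathbf{E}\,H_\mu\Bigl(\bigvee_{x\in\mathbb{Z}^d\cap Q_L}\ \bigvee_{j=0}^{n-1}\Phi^{-j\tau}_\omega T_{-x}\bigl(\mathcal{U}_{\theta^{-j\tau}T_x\omega,\varepsilon}\bigr)\Bigr)\\
&\qquad\le \lim_{L\to\infty}\frac{1}{L^d}\lim_{n\to\infty}\frac{1}{n\tau}\,\mathbf{E}\log\operatorname{card}\Bigl(\bigvee_{x\in\mathbb{Z}^d\cap Q_L}\ \bigvee_{j=0}^{n-1}\Phi^{-j\tau}_\omega T_{-x}\bigl(\mathcal{V}_{\theta^{-j\tau}T_x\omega,\varepsilon}\bigr)\Bigr) + \frac{\log C}{\tau}.
\end{aligned}
\]
Moreover, the difference between the original partition and the new one is small, namely:
\[
\lim_{n\to\infty}\frac{1}{n\tau}\,\mathbf{E}\,H_\mu\Bigl(\bigvee_{j=0}^{n-1}\Phi^{j\tau}_\omega\bigl(\Sigma_{\theta^{-j\tau}\omega,\varepsilon}\bigr)\Bigr) \le \lim_{n\to\infty}\frac{1}{n\tau}\,\mathbf{E}\,H_\mu\Bigl(\bigvee_{j=0}^{n-1}\Phi^{j\tau}_\omega\bigl(\mathcal{U}_{\theta^{-j\tau}\omega,\varepsilon}\bigr)\Bigr) + \frac1\tau\,\mathbf{E}\,H_\mu\bigl(\Sigma_{\omega,\varepsilon}|\mathcal{U}_{\omega,\varepsilon}\bigr).
\]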


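Since all the above holds for arbitrarily large $\tau > 0$ we get
\[
h_\mu \le \lim_{\varepsilon\to0}\lim_{L\to\infty}\frac{1}{L^d}\lim_{n\to\infty}\frac{1}{n\tau}\,\mathbf{E}\log\operatorname{card}\Bigl(\bigvee_{x\in\mathbb{Z}^d\cap Q_L}\ \bigvee_{j=0}^{n-1}\Phi^{-j\tau}_\omega T_{-x}\bigl(\mathcal{V}_{\theta^{-j\tau}T_x\omega,\varepsilon}\bigr)\Bigr). \tag{6.6}
\]
Next let $\delta_{\omega,\varepsilon}$ be the Lebesgue number of the cover $\mathcal{V}_{\omega,\varepsilon}$ (namely the largest $\delta_{\omega,\varepsilon} > 0$ such that every ball of diameter $\delta_{\omega,\varepsilon}$ is contained in an element of $\mathcal{V}_{\omega,\varepsilon}$). Indeed $\delta_{\omega,\varepsilon}$ is also the Lebesgue number of $\bigvee_{j=0}^{n-1}\Phi^{j\tau}_\omega(\mathcal{V}_{\theta^{-j\tau}\omega,\varepsilon})$ with respect to the metric $d_{\omega,n,\tau,Q_L}$. Hence
\[
\operatorname{card}\Bigl(\bigvee_{x\in\mathbb{Z}^d\cap Q_L}\ \bigvee_{j=0}^{n-1}\Phi^{-j\tau}_\omega T_{-x}\bigl(\mathcal{V}_{\theta^{-j\tau}T_x\omega,\varepsilon}\bigr)\Bigr) \le M_{\delta_{\omega,\varepsilon},Q_L,\omega},
\]
and this proves that the r.h.s. of (6.6) is less than $h_{top}$.

Proof of $h_{top} \le \gamma\,d_{up}$. The proof follows [CE2]. Let $\rho > 0$ be such that for all $\varepsilon < \varepsilon_0$, all $L > L_0 = L_0(\varepsilon,\rho)$, we have
\[
\mathbf{E}\,\frac{\log M_{\varepsilon,Q_L,\omega}}{L^d\log\varepsilon^{-1}} \le d_{up} + \rho.
\]
Let $L' = L + C(T+1)\log(1/\varepsilon)$ and $\varepsilon' = C^{-1}\exp(-\gamma T)\varepsilon$ (see Lemma 6.1). Let an $\varepsilon'$-cover of $\mathcal{A}_\omega|_{Q_{L'}}$ (in the sense of Definition 6.3) be given. Then it is also a $(T/\tau,\varepsilon)$-cover (in the sense of Definition 6.1), hence $N_{\omega,T/\tau,\tau,Q_L,\varepsilon} \le M_{\varepsilon',Q_{L'},\omega}$, from which follows
\[
\begin{aligned}
h_{top} &= \lim_{\varepsilon\to0}\lim_{L\to\infty}\frac{1}{L^d}\lim_{T\to\infty}\frac1T\,\mathbf{E}\log N_{\omega,T/\tau,\tau,Q_L,\varepsilon}\\
&= \lim_{\varepsilon\to0}\lim_{L\to\infty}\frac{1}{L^d}\inf_T\frac1T\,\mathbf{E}\log N_{\omega,T/\tau,\tau,Q_L,\varepsilon}\\
&\le \lim_{\varepsilon\to0}\lim_{L\to\infty}\frac1T\,\mathbf{E}\,\frac{\log M_{\varepsilon',Q_{L'},\omega}}{L^d}\\
&\le \lim_{\varepsilon\to0}\lim_{L\to\infty}\frac1T\bigl((d_{up}+\rho)\log1/\varepsilon' + \rho\bigr).
\end{aligned}
\]
Since $\log1/\varepsilon' = \gamma T + \log(C/\varepsilon)$, the limit $T \to \infty$ and $\rho \to 0$ leaves only $\gamma\,d_{up}$ on the r.h.s. above.

Proof of $d_{up} < \infty$. We want to prove a bound on $H_\varepsilon$ of the form $H_\varepsilon \le C\log1/\varepsilon$ for small $\varepsilon > 0$. To do so we use iteratively the following bound:

Lemma 6.2. There are $B_\omega$, $C > 0$ such that for all $L > 0$ and sufficiently small $\varepsilon > 0$, one has almost surely
\[
M_{\varepsilon,Q_L,\omega} \le M_{2\varepsilon,Q_{L+C},\theta^{-1}\omega}\,B_{\theta^{-1}\omega}^{\,L^d+1/\varepsilon^2}. \tag{6.7}
\]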


The proof of Lemma 6.2 is postponed to Sect. 7. Let $\varepsilon > 0$, $L > 0$. Let $T$ be the smallest integer larger than $(\log2)^{-1}\log(1/\varepsilon)$. By iterating $T$ times the bound (6.7), we obtain
\[
M_{\varepsilon,Q_L,\omega} \le M_{1,Q_{L+TC},\theta^{-T}\omega}\prod_{n=1}^{T}B_{\theta^{-n}\omega}^{\,(L+(n-1)C)^d+1/\varepsilon^2}.
\]
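The choice of $T$ guarantees that after $T$ doublings of the scale $\varepsilon$ the cover scale exceeds 1, which is why the iteration ends with $M_{1,\cdot,\cdot}$. The following sketch checks this arithmetic; the helper name `doubling_steps` and the sample values of $\varepsilon$ are assumptions made here for illustration.

```python
import math

def doubling_steps(eps):
    """Smallest integer T strictly larger than log(1/eps) / log 2."""
    return math.floor(math.log2(1.0 / eps)) + 1

# After T doublings the scale 2^T * eps is above 1, and T - 1 doublings are not enough.
for eps in (0.3, 0.125, 0.1, 1e-3):
    T = doubling_steps(eps)
    assert (2 ** T) * eps > 1.0
    assert (2 ** (T - 1)) * eps <= 1.0
```

Since $T$ grows like $\log(1/\varepsilon)$, the product over $n$ contributes only a factor of order $\log(1/\varepsilon)$ to $\log M_{\varepsilon,Q_L,\omega}$ per unit volume.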

Using the results of [KT], the smoothness of the functions in $\mathcal{A}_\omega$ and the rapid decay of the distribution of $R_\omega$ (where $R_\omega$ is such that $M_{R_\omega,Q_L,\omega} = 1$, see Theorem 4.1), we see that $\mathbf{E}(\omega \mapsto M_{1,Q_L,\omega}) \le C$ for all $L$, hence
\[
d_{up} = \limsup_{\varepsilon\to0}\lim_{L\to\infty}\mathbf{E}\,\frac{\log M_{\varepsilon,Q_L,\omega}}{L^d\log\varepsilon^{-1}} \le C < \infty.
\]

7. Proof of Lemma 6.2

We give the proof for the notationally convenient case $d = 1$. Let $u$ and $v$ be two orbits of Eq. (2.1) with initial conditions $u_0$ and $v_0$ such that $u_0$ and $v_0$ belong to $\mathcal{A}_\omega$. The difference $r = u - v$ satisfies almost surely the equation
\[
\partial_tr = \bigl(1 + (1+i\alpha)\partial_x^2\bigr)r + G_1(u,v)\,r + G_2(u,v)\,\bar r, \tag{7.1}
\]
where we have used the notation
\[
\begin{aligned}
N(|x|^2) &= -(b+i\beta)|x|^{2q},\\
G_1(x,y) &= \frac12\Bigl(N(|x|^2) + N(|y|^2) + (|x|^2+|y|^2)\int_0^1N'\bigl(t|x|^2 + (1-t)|y|^2\bigr)dt\Bigr),\\
G_2(x,y) &= xy\int_0^1N'\bigl(t|x|^2 + (1-t)|y|^2\bigr)dt.
\end{aligned} \tag{7.2}
\]
Let $\chi(x)$ be a smooth and monotone function satisfying $\chi(x) = 1$ if $x \le 1$ and $\chi(x) = 0$ if $x \ge 2$. We decompose the kernel of $\exp(tL)$ into a low frequency part and a high frequency part:
\[
\begin{aligned}
K_t^{(-)}(x) &= \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{ipx+t(1-(1+i\alpha)p^2)}\,\chi(|p/p^*|)\,dp,\\
K_t^{(+)}(x) &= \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{ipx+t(1-(1+i\alpha)p^2)}\,\bigl(1-\chi(|p/p^*|)\bigr)\,dp,
\end{aligned}
\]
where $p^* > 4$ is a sufficiently large real number. We decompose the solutions $r_t(x)$ to Eq. (7.1) accordingly:
\[
\begin{aligned}
r_t(x) &= r_t^{(-)}(x) + r_t^{(+)}(x),\\
r_t^{(-)}(x) &= K_t^{(-)}\star r_0(x) + \int_0^tK_{t-s}^{(-)}\star\bigl(G_1(u_s,v_s)r_s + G_2(u_s,v_s)\bar r_s\bigr)(x)\,ds,\\
r_t^{(+)}(x) &= K_t^{(+)}\star r_0(x) + \int_0^tK_{t-s}^{(+)}\star\bigl(G_1(u_s,v_s)r_s + G_2(u_s,v_s)\bar r_s\bigr)(x)\,ds.
\end{aligned}
\]

The kernels $K_t^{(-)}$ and $K_t^{(+)}$ have some regularity and decay properties that we next describe: let the Bernstein class $B_{R,k}$ be the following set of functions:
\[
B_{R,k} \equiv \bigl\{f \in L^\infty : f \text{ extends to an entire function, } |f(z)| \le Re^{k|\operatorname{Im}z|}\bigr\}. \tag{7.3}
\]
We have

Lemma 7.1. For all $p^* > 4$, $t > \tfrac12$, $f \in L^\infty$, $K_t^{(-)}\star f$ is in $B_{R,2p^*}$ with $R \le 2C_0\|f\|_\infty$. Moreover, for all $n \in \mathbb{N}$, there is a $C_n > 0$ such that
\[
|K_t^{(-)}(x)| \le \frac{C_n}{\sqrt t}(1+x^2/t)^{-n},\qquad
|K_t^{(+)}(x)| \le \frac{C_n}{\sqrt t}\,e^{-(p^*)^2t/2}(1+x^2/t)^{-n}.
\]
The proof of Lemma 7.1 is omitted, see [CE2, Ro]. Pick a $2\varepsilon$-cover of $\mathcal{A}_\omega|_{Q_{L+C(\varepsilon)}}$ (which exists a.s. by compactness, see Definition 6.3) and let $u$ and $v$ belong to one of its elements. Then $r_0 = u - v$ satisfies $|r_0(x)| \le 2\varepsilon$ for $|x| \le \tfrac12(L + C(\varepsilon))$. Define
\[
\xi_y^{(n)}(x) = \frac{1}{(1+(x-y)^2)^{n/2}}.
\]

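Remark that Lemma 10.1 also holds with $\varphi_y$ replaced by $\xi_y^{(n)}$ ($n \ge 2$). Moreover by reproducing the proof of Lemma 6.1 using the bounds from Lemma 7.1 we obtain (for $|x| \le L/2$):
\[
\begin{aligned}
|r_1^{(-)}(x)| &\le |K_1^{(-)}\star r_0(x)| + C\int_0^1\bigl\|K_{1-s}^{(-)}/\xi_0^{(n)}\bigr\|_2\,\bigl\|\xi_y^{(n)}r_s\bigr\|_2\,ds\\
&\le C\varepsilon + 2C\varepsilon\int_0^1\frac{C_n}{\sqrt{1-s}}\,e^{\gamma s}\,ds\\
&\le A\varepsilon,
\end{aligned} \tag{7.4}
\]
where $A$ depends on $n$ but not on $p^*$ and
\[
\begin{aligned}
|r_1^{(+)}(x)| &\le |K_1^{(+)}\star r_0(x)| + C\int_0^1\bigl\|K_{1-s}^{(+)}/\xi_0^{(n)}\bigr\|_2\,\bigl\|\xi_y^{(n)}r_s\bigr\|_2\,ds\\
&\le e^{-(p^*)^2/2}\varepsilon + 2C\varepsilon\int_0^1\frac{C_ne^{-(p^*)^2(1-s)/2}}{\sqrt{1-s}}\,e^{\gamma s}\,ds\\
&\le B(p^*)\varepsilon,
\end{aligned} \tag{7.5}
\]
where $B(p^*) \to 0$ as $p^* \to \infty$. We choose $p^*$ so large that $B(p^*) < \tfrac12$. We next use a result of Cartwright (see [KT, Eq. (191)]): for all $f$ in the Bernstein class $B_{R,2p^*}$ (see (7.3)), the following identity holds:
\[
f(x) = \frac{\sin(8p^*x)}{32(p^*)^2}\sum_{n=-\infty}^{\infty}(-1)^nf(x_n)\,\frac{\sin(4p^*(x-x_n))}{(x-x_n)^2}, \tag{7.6}
\]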

where $x_n = \frac{n\pi}{8p^*}$. Let $f$, $g$ be in $B_{R,2p^*}$. A simple application of Eq. (7.6) shows that
\[
\|f - g\|_{L^\infty(Q_L)} \le C\sup_{|n|\le[4p^*L/\pi]+4Cp^*/(\varepsilon\pi)}|f(x_n) - g(x_n)| + \frac14\varepsilon.
\]
Hence, among all the functions in $B_{R_\omega,2p^*}$ that are bounded by $A\varepsilon$ in $[-\tfrac12L, \tfrac12L]$ (by (7.4), $r_1^{(-)}$ is such a function), at most $(4A)^{Cp^*L}(4R_\omega/\varepsilon)^{Cp^*/\varepsilon}$ of them are $\varepsilon/2$-separated on $Q_L$. By taking a ball of diameter $\varepsilon$ around each of them, and repeating the operation for each element of the original $2\varepsilon$-cover, we get an $\varepsilon$-cover of $\Phi^1_\omega(\mathcal{A}_\omega)|_{Q_L} = \mathcal{A}_{\theta^1\omega}|_{Q_L}$. The number of elements in this cover is at most
\[
(4A)^{Cp^*L}(4R_\omega/\varepsilon)^{Cp^*/\varepsilon}\,M_{2\varepsilon,Q_{L+C},\omega}.
\]
The proof of Lemma 6.2 is complete.

8. Proof of Proposition 6.1

We follow Collet and Eckmann's proof [CE2], which is itself an adaptation of standard proofs of existence of the topological entropy, see e.g. [KH] and references therein. The proof of Proposition 6.1 is based on the following inequalities:

Lemma 8.1. For all compacts $Q$, $Q'$, all $m, n \in \mathbb{N}$ and $\varepsilon > \varepsilon' > 0$ one has
\[
N_{\omega,n,\tau,Q,\varepsilon} \le N_{\omega,n,\tau,Q,\varepsilon'}, \tag{8.1}
\]
\[
N_{\omega,n,\tau,Q\cup Q',\varepsilon} \le N_{\omega,n,\tau,Q,\varepsilon}\,N_{\omega,n,\tau,Q',\varepsilon}, \tag{8.2}
\]
\[
N_{\omega,n+m,\tau,Q,\varepsilon} \le N_{\omega,n,\tau,Q,\varepsilon}\,N_{\theta^{n\tau}\omega,m,\tau,Q,\varepsilon}. \tag{8.3}
\]
Furthermore for any $\tau' < \tau$ the following inequalities hold:
\[
N_{\omega,n,\tau',Q_L,\varepsilon} \le N_{\omega,n,\tau,Q_{f(L)},g(\varepsilon)} \le N_{\omega,n,\tau',Q_{f(f(L))},g(g(\varepsilon))}, \tag{8.4}
\]
where $f(L) = L + C(\tau+1)\log\varepsilon^{-1}$ and $g(\varepsilon) = c\exp(-\gamma\tau)\varepsilon$ with $C$, $c$, $\gamma$ some constants.

Lemma 8.1 implies immediately that the limit in Eq. (6.3) exists: by subadditivity (8.3) and by invariance of $\mathbf{P}$ under $\theta^t$, we get that
\[
J_1 = \lim_{n\to\infty}\frac{1}{n\tau}\,\mathbf{E}\log N_{\omega,n,\tau,Q_L,\varepsilon}
\]
exists, it is non-increasing in $\varepsilon$ and by further subadditivity (8.2),
\[
J_2 = \lim_{L\to\infty}\frac{1}{L^d}J_1
\]
also exists and is non-increasing in $\varepsilon$ (by (8.1)). Hence the limit in Eq. (6.3) exists. By (8.4), it is independent of $\tau$.

Proof of Lemma 8.1. The inequality (8.1) is obvious from the definitions. We prove (8.2) by making the observation that if $\{A_1, \dots, A_N\}$ is an $(n,\varepsilon)$-cover of $\mathcal{A}_\omega|_Q$ and $\{B_1, \dots, B_M\}$ an $(n,\varepsilon)$-cover of $\mathcal{A}_\omega|_{Q'}$, then $\{A_j\cap B_k : j = 1, \dots, N,\ k = 1, \dots, M\}$ is an $(n,\varepsilon)$-cover of $\mathcal{A}_\omega|_{Q\cup Q'}$.


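Similarly if $\{A_1, \dots, A_N\}$ is an $(n,\varepsilon)$-cover of $\mathcal{A}_\omega|_Q$ and $\{B_1, \dots, B_M\}$ an $(m,\varepsilon)$-cover of $\mathcal{A}_{\theta^{n\tau}\omega}|_Q$, then $\{A_j\cap\Phi^{-n\tau}_\omega B_k : j = 1, \dots, N,\ k = 1, \dots, M\}$ is an $(m+n,\varepsilon)$-cover of $\mathcal{A}_\omega|_Q$ which proves (8.3). The inequality (8.4) follows immediately from Lemma 6.1, since if $D$ is a set of diameter $g(\varepsilon)$ in the metric $d_{\omega,n,\tau,Q_{f(L)}}$ then $D$ is a set of diameter at most $\varepsilon$ in the metric $d_{\omega,n,\tau',Q_L}$.

Remark 8.1. The topology of $L^\infty(Q)$ is a simplifying choice (as far as Eq. (8.2) is concerned), but [CE3] shows that other topologies can be used as well.

9. Proof of Proposition 6.2

This proof is, like the proof of Proposition 6.1, based on subadditive bounds. We use well-known properties of the function $H_\mu(\cdot)$, see [KH], Chapter 4.3 (in particular Proposition 4.3.3). We recall that $x \mapsto -x\log x$ is concave, hence for any partition $\mathcal{U}$ and any $t > 0$, the following holds:
\[
\int H_\mu\bigl(\Phi^{-t}_\omega(\mathcal{U})\bigr)\,\mathbf{P}(d\omega) \le H_{\int(\Phi^t_\omega)^*\mu\,\mathbf{P}(d\omega)}(\mathcal{U}) = H_\mu(\mathcal{U}),
\]
where we have used Eq. (6.2). We thus have
\[
\begin{aligned}
&\int H_\mu\Bigl(\bigvee_{k=0}^{n+m-1}\Phi^{-k\tau}_\omega\bigl(\Sigma_{\theta^{k\tau}\omega,\varepsilon}\bigr)\Bigr)\mathbf{P}(d\omega)\\
&\quad\le \int H_\mu\Bigl(\bigvee_{k=0}^{n-1}\Phi^{-k\tau}_\omega\bigl(\Sigma_{\theta^{k\tau}\omega,\varepsilon}\bigr)\Bigr)\mathbf{P}(d\omega) + \int H_\mu\Bigl(\bigvee_{k=n}^{n+m-1}\Phi^{-k\tau}_\omega\bigl(\Sigma_{\theta^{k\tau}\omega,\varepsilon}\bigr)\Bigr)\mathbf{P}(d\omega)\\
&\quad= \int H_\mu\Bigl(\bigvee_{k=0}^{n-1}\Phi^{-k\tau}_\omega\bigl(\Sigma_{\theta^{k\tau}\omega,\varepsilon}\bigr)\Bigr)\mathbf{P}(d\omega) + \int H_\mu\Bigl(\Phi^{-n\tau}_\omega\bigvee_{k=0}^{m-1}\Phi^{-k\tau}_{\theta^{n\tau}\omega}\bigl(\Sigma_{\theta^{(k+n)\tau}\omega,\varepsilon}\bigr)\Bigr)\mathbf{P}(d\omega)\\
&\quad\le \int H_\mu\Bigl(\bigvee_{k=0}^{n-1}\Phi^{-k\tau}_\omega\bigl(\Sigma_{\theta^{k\tau}\omega,\varepsilon}\bigr)\Bigr)\mathbf{P}(d\omega) + \iint H_\mu\Bigl(\Phi^{-n\tau}_\omega\bigvee_{k=0}^{m-1}\Phi^{-k\tau}_{\omega'}\bigl(\Sigma_{\theta^{k\tau}\omega',\varepsilon}\bigr)\Bigr)\mathbf{P}(d\omega')\,\mathbf{P}(d\omega)\\
&\quad\le \int H_\mu\Bigl(\bigvee_{k=0}^{n-1}\Phi^{-k\tau}_\omega\bigl(\Sigma_{\theta^{k\tau}\omega,\varepsilon}\bigr)\Bigr)\mathbf{P}(d\omega) + \int H_\mu\Bigl(\bigvee_{k=0}^{m-1}\Phi^{-k\tau}_\omega\bigl(\Sigma_{\theta^{k\tau}\omega,\varepsilon}\bigr)\Bigr)\mathbf{P}(d\omega),
\end{aligned}
\]
namely subadditivity in the time variable. We can prove subadditivity in the space variable in a similar way. Thus the first two limits in Eq. (6.4) exist. These limits are monotonically increasing as $\varepsilon \to 0$, hence the third limit is well-defined.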
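The existence of the limits here and in Sect. 8 rests on the standard fact (Fekete's lemma) that a subadditive sequence $a_{n+m} \le a_n + a_m$ has a limit $\lim a_n/n = \inf_n a_n/n$. The sketch below illustrates this on a toy sequence; the sequence $a_n = 2n + \sqrt n$ and the helper names are assumptions chosen for illustration and have no counterpart in the paper.

```python
import math

def a(n):
    """Toy subadditive sequence: sqrt(n + m) <= sqrt(n) + sqrt(m)."""
    return 2.0 * n + math.sqrt(n)

def is_subadditive(seq, n_max):
    """Check seq(n + m) <= seq(n) + seq(m) for 1 <= n, m < n_max."""
    return all(seq(n + m) <= seq(n) + seq(m) + 1e-12
               for n in range(1, n_max) for m in range(1, n_max))

# The ratios a(n)/n decrease towards their infimum, here equal to 2.
ratios = [a(n) / n for n in (1, 10, 100, 10_000, 1_000_000)]
assert is_subadditive(a, 60)
assert all(x > y for x, y in zip(ratios, ratios[1:]))
```

In the proof the role of $a_n$ is played by the expected entropy of the $n$-fold join, and the role of $n$ by the number of time steps or the number of lattice sites.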


We next prove that the limit is independent of the choice of $\Sigma_{\omega,\varepsilon}$: let $\Sigma_{\omega,\varepsilon}$ and $\tilde\Sigma_{\omega,\varepsilon}$ be two different sequences, we get (by the Rokhlin inequality)
\[
\begin{aligned}
&\Bigl|\lim_{L\to\infty}\frac{1}{L^d}\lim_{n\to\infty}\frac{1}{n\tau}H_\mu\Bigl(\bigvee_{x\in\mathbb{Z}^d\cap Q_L}\ \bigvee_{k=0}^{n-1}\Phi^{-k\tau}_\omega T_{-x}\bigl(\Sigma_{\theta^{k\tau}T_x\omega,\varepsilon}\bigr)\Bigr)\\
&\qquad - \lim_{L\to\infty}\frac{1}{L^d}\lim_{n\to\infty}\frac{1}{n\tau}H_\mu\Bigl(\bigvee_{x\in\mathbb{Z}^d\cap Q_L}\ \bigvee_{k=0}^{n-1}\Phi^{-k\tau}_\omega T_{-x}\bigl(\tilde\Sigma_{\theta^{k\tau}T_x\omega,\varepsilon}\bigr)\Bigr)\Bigr|\\
&\quad\le H_\mu\bigl(\Sigma_{\omega,\varepsilon}|\tilde\Sigma_{\omega,\varepsilon}\bigr) + H_\mu\bigl(\tilde\Sigma_{\omega,\varepsilon}|\Sigma_{\omega,\varepsilon}\bigr),
\end{aligned}
\]
and the r.h.s. above vanishes as $\varepsilon \to 0$ since these sequences generate the whole sigma-algebra of $\mathcal{A}_\omega$ in this limit. We prove that Eq. (6.4) is independent of $\tau$ by using Lemma 6.1 and an argument similar to the one used in Sect. 8.

10. Uniqueness of Solutions

In this section, we provide details of the existence and uniqueness result for Eq. (4.2). First remark that the process
\[
\zeta(t) = \int_0^te^{(t-s)L}\xi\,dw(s)
\]
is a well defined Gaussian stochastic process in $H^m_{ul}$, hence in $L^2_{\delta,y}$ for any $\delta$, $y$. Moreover, by construction, the nonlinearity in Eq. (4.3) is uniformly Lipschitz, hence local existence in $H^m_{ul}$ follows by a contraction argument. It is immediate that the corresponding process $u_t$ has bounded moments in $H^m_{ul}$. The uniqueness in $L^2_{\delta,y}$ space follows from the fact that bounded smooth functions are dense and the following

Lemma 10.1. The semi-flow $\Phi^t_\omega$ extends almost surely to a bounded continuous semi-flow on $L^2_{\delta,y}$ for any $\delta > 0$ and $y \in \mathbb{R}^d$.

Proof. We apply the non-propagation estimate of Ginibre and Velo [GV1]. Let $u_0$ and $v_0$ be two functions in $L^2_{\delta,y}$ and denote the corresponding solutions to Eq. (2.1) by $u_t$ and $v_t$. Their difference $u_t - v_t$ satisfies (almost surely) the following inequality:
\[
\frac12\,\partial_t\bigl\|\sqrt{\varphi_{\delta,y}}\,(u_t - v_t)\bigr\|_2^2 \le \frac12(1 + 1 + \alpha^2)\bigl\|\sqrt{\varphi_{\delta,y}}\,(u_t - v_t)\bigr\|_2^2 - \operatorname{Re}\Bigl((1+i\beta)\int\varphi_{\delta,y}\,\overline{(u_t - v_t)}\bigl(|u_t|^{2q}u_t - |v_t|^{2q}v_t\bigr)\Bigr).
\]
By [GV1] (Proposition 3.1), Hypothesis 2.1 implies that the last term above is negative. We thus get an estimate of the form
\[
\|u_t - v_t\|_{L^2_{\delta,y}} \le \exp(ct)\,\|u_0 - v_0\|_{L^2_{\delta,y}}.
\]
This and Lemma 4.2 prove that $\Phi^t_\omega$ is uniformly bounded and continuous on $L^2_{\delta,y}$ for any $\delta > 0$ and $y \in \mathbb{R}^d$ if we define $u_t = \lim_{n\to\infty}u_t^{(n)}$ where $u_0^{(n)}$ is a Cauchy sequence of bounded functions approaching $u_0$.


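11. Compact Embedding for Local Spaces

In this section, we give a proof of Relation (2.7) which is a trivial adaptation of [Ad], Theorem 6.53, p. 174. More precisely we prove the embedding (2.7) to be Hilbert–Schmidt. Let $\{e_n\}_{n\in\mathbb{N}}$ be a complete orthonormal basis of $H^{m+k}_{\delta,y}$. Let $\{Q_n\}_{n\in\mathbb{N}}$ be a countable cover of $\mathbb{R}^d$ by balls of radius 1. Let $x \in Q_n$, let $\alpha \le m$ and define the bounded linear operator $D_x^\alpha$ on $H^{m+k}_{\delta,y}$ by $D_x^\alpha(u) = \nabla^\alpha u(x)$. Its norm is (by Sobolev embedding) bounded by
\[
\|D_x^\alpha(u)\|^2 \le \max_{0\le\alpha\le m}\,\sup_{x\in Q_n}|\nabla^\alpha u(x)|^2 \le \frac{C}{\inf_{x\in Q_n}\varphi_{\delta,y}(x)}\,\|u\|^2_{H^{m+k}_{\delta,y}}.
\]
By Riesz' Lemma, $D_x^\alpha(\cdot) = (v_x^\alpha, \cdot)_{H^{m+k}_{\delta,y}}$ for some vector $v_x^\alpha$ and
\[
\sum_{n=1}^\infty|\nabla^\alpha e_n(x)|^2 = \sum_{n=1}^\infty\bigl|(e_n, v_x^\alpha)_{H^{m+k}_{\delta,y}}\bigr|^2 = \|v_x^\alpha\|^2_{H^{m+k}_{\delta,y}}.
\]
Thus the Hilbert–Schmidt norm of the embedding map is
\[
\sum_{n=1}^\infty\|e_n\|^2_{H^m_{\delta',y}} = \sum_{\alpha\le m}\int_{\mathbb{R}^d}\|v_x^\alpha\|^2_{H^{m+k}_{\delta,y}}\,\varphi_{\delta',y}(x)\,dx \le m\sum_{n=1}^\infty\int_{Q_n}\frac{C\varphi_{\delta',y}(x)}{\inf_{z\in Q_n}\varphi_{\delta,y}(z)}\,dx,
\]
which is finite whenever $\delta' > \delta$.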

12. Proof of Lemma 4.3

The proof can be found in [BGO, Mi] and is summarised below. We decompose the plane into countably many sets $Q(m,n)$ of unit area and use the bounds $\varphi_{\delta,y}(x) \le \exp(-\delta|x-y|) \le e\,\varphi_{\delta,y}(x)$. For simplicity we assume $\delta = 1$ and we drop it from our notation (if Lemma 4.3 is true for $\delta = 1$ then it is true for all $\delta > 0$ by scaling, possibly with different constants). We simply write $\int_Df$ for $\int_Df(x)\,dx$ for $D \subset \mathbb{R}^2$. We have
\[
\int_{\mathbb{R}^2}\varphi_y\,\bigl|(|f|^{2q}f)\Delta f\bigr| \le C\sum_{m,n}e^{-|n|}\int_{Q(m,n)}|f||f|^{2q-1}\bigl(|f||\Delta f| + |\nabla f|^2\bigr), \tag{12.1}
\]
where $\bigcup_mQ(m,n) \supset \{x \in \mathbb{R}^2 : n - \tfrac12 \le |x-y| \le n + \tfrac12\}$. We estimate each summand using Hölder and Gagliardo–Nirenberg inequalities. For any $p$, $r$ with $p^{-1} + r^{-1} = 1$


and in particular for r = 1 + 1/q and p = 1 + q, we get: |f ||f |2q−1 |f ||f | + |∇f |2 Q(m,n)

2q 2q−1 ≤ c1 f 2p f 2pq/(p−1) f 2p + f 2pq/(p−1) ∇f 24pq/(p+q−1) 2q 2q−1 1/2 1/2 2 ≤ c2 f 2p f 2pq/(p−1) f 2p + f 2pq/(p−1) f 2pq/(p−1) f 2p 2q

= c3 f 22p f 2qr 2(2q+2)/(2q+3)

≤ c4 ∇ 3 f 2

2(q+1/(2q+3))

f 2(q+1)

4q 2 +6q+2

≤ K −1 ∇ 3 f 22 + c5 Kf 2(q+1)

.

By summing up all contribution to (12.1) we arrive at ϕy |(|f |2q f )f | 2 R

≤ CK −1 e−|n| |∇ 3 f |2 + C K e−|n| Q(m,n)

m,n

−1 ≤ CK −1 = CK

R2

R2

ϕy |∇ 3 f |2 + C K

m,n

ne−|n| sup

n

ϕy |∇ 3 f |2 + C K sup

which proves Lemma 4.3.

y

R2

y

R2

Q(m,n)

|f |2(q+1)

ϕy |f |2(q+1)

ϕy |f |2(q+1)

η

η

η

,

Acknowledgements. This work was supported by the Fonds National Suisse. I am grateful to Martin Hairer, Sergei Kuksin and Armen Shirikyan for their comments and suggestions.

References

[Ad] Adams, R.A.: Sobolev Spaces. New York: Academic Press, 1975
[Ar] Arnold, L.: Random Dynamical Systems. Berlin–Heidelberg: Springer, 1998
[BGO] Bartucelli, M.V., Gibbon, J.D., Oliver, M.: Length scales in solutions of the complex Ginzburg–Landau equation. Physica D 89, 267–286 (1996)
[BKL] Bricmont, J., Kupiainen, A., Lefevere, R.: Ergodicity of the 2D Navier–Stokes Equations with Random Forcing. Commun. Math. Phys. 224, 65–81 (2001)
[C1] Collet, P.: Thermodynamic limit of the Ginzburg–Landau equations. Nonlinearity 7, 1175–1190 (1994)
[C2] Collet, P.: Extended Dynamical Systems. Doc. Math. Extra Volume ICM III, 123–132 (1998)
[CDF] Crauel, H., Debussche, A., Flandoli, F.: Random Attractors. J. Dyn. Diff. Equ. 9, 307–341 (1997)
[CE1] Collet, P., Eckmann, J.-P.: Extensive Properties of the Complex Ginzburg–Landau Equation. Commun. Math. Phys. 200, 699–722 (1999)
[CE2] Collet, P., Eckmann, J.-P.: The definition and measurement of the topological entropy per unit volume in parabolic PDEs. Nonlinearity 12, 451–473 (1999)
[CE3] Collet, P., Eckmann, J.-P.: Topological entropy and ε-entropy for damped hyperbolic equations. Ann. Henri Poincaré 1, 715–752 (2000)
[D] Debussche, A.: Hausdorff Dimension of a Random Invariant Set. J. Math. Pures Appl. 77, 967–988 (1998)
[DZ1] Da Prato, G., Zabczyk, J.: Stochastic equations in infinite dimensions. Cambridge: Cambridge University Press, 1992
[DZ2] Da Prato, G., Zabczyk, J.: Ergodicity for infinite-dimensional systems. Cambridge: Cambridge University Press, 1996
[EH] Eckmann, J.-P., Hairer, M.: Invariant Measures for Stochastic PDE's on Unbounded Domains. Nonlinearity 14, 133–151 (2001)
[F1] Funaki, T.: The Reversible measures of Multi-Dimensional Ginzburg–Landau Type Continuum Model. Osaka J. Math. 28, 463–494 (1991)
[F2] Funaki, T.: Regularity Properties for Stochastic Partial Differential Equations of Parabolic Type. Osaka J. Math. 28, 495–516 (1991)
[FM] Flandoli, F., Maslowski, B.: Ergodicity of the 2-D Navier–Stokes equation under random perturbations. Commun. Math. Phys. 172, 119–141 (1995)
[GV1] Ginibre, J., Velo, G.: The Cauchy Problem in Local Spaces for the Complex Ginzburg–Landau Equation I. Compactness Methods. Physica D 95, 191–228 (1996)
[GV2] Ginibre, J., Velo, G.: The Cauchy Problem in Local Spaces for the Complex Ginzburg–Landau Equation II. Contraction Methods. Commun. Math. Phys. 187, 45–79 (1997)
[KH] Katok, A., Hasselblatt, B.: Introduction to the Modern Theory of Dynamical Systems. Cambridge: Cambridge University Press, 1995
[Kr] Krylov, N.V.: An analytic approach to SPDEs. In: Stochastic partial differential equations: six perspectives. Carmona, R.A. and Rozovskii, B., eds. Providence, RI: Am. Math. Soc., 1999
[KS] Kuksin, S.B., Shirikyan, A.: Stochastic Dissipative PDEs and Gibbs Measures. Commun. Math. Phys. 213, 291–330 (2000)
[KT] Kolmogorov, A.N., Tikhomirov, V.M.: ε-entropy and ε-capacity of sets in functional spaces. In: Selected Works of Kolmogorov, A.N., Vol. III, Shiryayev, A.N., ed. Dordrecht: Kluwer, 1993
[Ku] Kuksin, S.B.: Stochastic Nonlinear Schrödinger Equation. 1. A priori Estimates. Proc. Steklov Inst. Math. 225, 219–242 (1999)
[LO] Levermore, C.D., Oliver, M.: The complex Ginzburg–Landau equation as a model problem. In: Dynamical systems and probabilistic methods in partial differential equations. Deift, P. et al., eds. Providence, RI: Am. Math. Soc., 1996
[LQ] Liu, P.-D., Qian, M.: Smooth Ergodic Theory of Random Dynamical Systems. Lecture Notes in Mathematics, 1606. Berlin–Heidelberg: Springer, 1995
[Ma] Mattingly, J.C.: Ergodicity of 2D Navier–Stokes equations with random noise and large viscosity. Commun. Math. Phys. 206, 273–288 (1999)
[Mi] Mielke, A.: Bounds for the solutions of the complex Ginzburg–Landau equation in terms of the dispersion parameters. Physica D 117, 106–116 (1998)
[Ro] Rougemont, J.: ε-Entropy Estimates for Driven Parabolic Equations. Preprint (2000)
[Ru] Ruelle, D.: Large Volume Limit of the Distribution of Characteristic Exponents in Turbulence. Commun. Math. Phys. 87, 287–302 (1982)
[S] Sinai, Ya.G.: Two Results Concerning Asymptotic Behaviour of Solutions of the Burgers Equation. J. Statist. Phys. 64, 1–12 (1991)
[VF] Vishik, M.J., Fursikov, A.V.: Mathematical Problems of Statistical Hydromechanics. Dordrecht: Kluwer, 1988
[Y1] Yosida, K.: Functional Analysis, Sixth edition. Berlin–New York: Springer, 1980
[Y2] Young, L.-S.: Ergodic Theory of Chaotic Dynamical Systems. In: XIIIth International Congress of Mathematical Physics (ICMP'97), Brisbane. Cambridge, MA: Internat. Press, 1999

Communicated by Ya. G. Sinai

Commun. Math. Phys. 225, 449 – 450 (2002)


© Springer-Verlag 2002

Erratum

Monotonicity of Optimal Transportation and the FKG and Related Inequalities

Luis A. Caffarelli

Department of Mathematics, RLM 8.000, C1200, University of Texas at Austin, Austin, TX 78712-1082, USA. E-mail: [email protected]

Received: 3 July 2001 / Accepted: 5 July 2001

Commun. Math. Phys. 214, 547–563 (2000)

It was pointed out to us by Gilles Harge that the proof of Theorem 11, p. 559, was incomplete, since we prove there only that $\delta\varphi \le 2h^2$ and thus $D_{\alpha\alpha}\varphi \le 2$. We now complete the proof. We change the formula on line 11, p. 560, to (the correct one)

\[
\delta\varphi = \int_0^h \langle \nabla\varphi(x_0 + te) - \nabla\varphi(x_0 - te),\, e\rangle \, dt . \tag{$*$}
\]

We first plug in the information $\langle \nabla\varphi(x_0 + te) - \nabla\varphi(x_0 - te),\, e\rangle \le 2\lambda \le 2h$ (from convexity along the $x_0 + te$ line) and we get $\delta\varphi \le 2h^2$, and thus $D_{\alpha\alpha}\varphi \le 2$. We now have the extra information that $0 \le \varphi_{\alpha\alpha} \le 2$. More generally, suppose we know that $0 \le \varphi_{\alpha\alpha} \le a_0$ for some $a_0 > 1$. We plug that information into the formula $(*)$ above, and get, for any $0 \le t \le h$,

\[
\langle \nabla\varphi(x_0 + te) - \nabla\varphi(x_0 - te),\, e\rangle \le \min(2h,\, 2a_0 t) .
\]


Thus, by integration along the segment we get

\[
\delta\varphi \le 2\left[ h^2\left(1 - \frac{1}{a_0}\right) + \frac{h^2}{2a_0} \right] = h^2\,\frac{2a_0 - 1}{a_0} = h^2 a_1 < h^2 a_0 .
\]

Thus, $\varphi_{\alpha\alpha} \le a_1 < a_0$. Starting with $a_0 = 2$ and repeating the argument infinitely many times we end up proving that $\delta\varphi \le h^2$, since 1 is the unique solution of

\[
\frac{2a - 1}{a} = 1 .
\]

This completes the proof.

Communicated by J. L. Lebowitz
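The bootstrap in this erratum can be checked numerically. The following is a minimal sketch, not part of the erratum: the function names `improve_bound` and `iterate_bounds` are hypothetical, and the only input taken from the text is the update $a \mapsto (2a-1)/a$ starting from $a_0 = 2$. Exact rational arithmetic is used so that no rounding affects the monotonicity check.

```python
from fractions import Fraction


def improve_bound(a):
    """One bootstrap step: from the bound phi_aa <= a deduce phi_aa <= (2a - 1)/a."""
    return (2 * a - 1) / a


def iterate_bounds(a0, steps):
    """Return the list [a_0, a_1, ..., a_steps] obtained by repeating the step."""
    bounds = [Fraction(a0)]
    for _ in range(steps):
        bounds.append(improve_bound(bounds[-1]))
    return bounds


if __name__ == "__main__":
    # Print the first few bounds as exact fractions and as floats.
    for k, a in enumerate(iterate_bounds(2, 5)):
        print(k, a, float(a))
```

Starting from $a_0 = 2$ the iterates are $a_k = (k+2)/(k+1)$, a strictly decreasing sequence that stays above 1 and converges to it, which is the limit used in the last step of the argument.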

Commun. Math. Phys. 225, 451 – 452 (2002)


© Springer-Verlag 2002

Erratum

Energy Correlations in O(N) Models and the Wolff Representation

Michael Campbell

Department of Mathematics, University of California, Irvine, CA 92697-3875, USA. E-mail: [email protected]

Received: 27 September 2001 / Accepted: 27 September 2001

Commun. Math. Phys. 218, 99–111 (2001)

1. Introduction In this erratum, an error in [1] is pointed out. 1. The proof of Lemma 1 in [1] is not correct. A high temperature expansion for a simple 4-site graph ([2] and the author) shows inconsistencies that point to an error. ˜ m ) + (1/2) The problem in Lemma 1 is the association above (25) of (1/2)δ( ˜ m = π) with δ(Zm = 0). An approximation to δ( ˜ m = 0) is to replace it with δ( ˜ m = Zm /ym , exp[−λ(arctan(Zm /ym ))2 ]/ dsm exp[−λ(arctan(Zm /ym ))2 ]. Since tan ˜ m = π ) + (1/3)δ( ˜ m = 2π ) ˜ m = 0) + (1/3)δ( clearly this will converge to (1/3)δ( as λ → ∞. However, in the new coordinates of (25), the above approximation does not converge to δ(Zm = 0) in (27), which ends up being a uniform measure in the xm -ym plane. Although the maximum of arctan(Zm /ym ) is uniformly distributed in the xm -ym plane (at Zm = 0), the mass is not. Geometrically this can be established by looking at the surface Zm /ym = constant, which is a plane. Note this plane has the most mass between it and the xm -ym plane when ym = 1 and the least when ym = 0. Thus the approximation to the delta function will not converge to a uniform measure in the coordinates of (25). The δ(Zm = 0) should be removed from (27) and replaced with the correct limit. 2. Theorem 2 in [1] relies upon the assumptions of Lemma 1 mentioned above. So it does not hold, and part (ii) of Theorem 2 is incorrect. However, a slight modification does show that if the inductive assumption that (ii) holds for the O(N −1) model is made, then (ii) does hold for the O(N ) model if we replace all dot products in (ii) si1 sj1 + · · · + siN sjN with the first N − 1 terms: si1 sj1 + · · · + siN−1 sjN−1 . In effect Theorem 2 says that if it is inductively assumed that (ii) holds for O(N − 1), then any subset of the same N − 1 terms in the O(N ) model will also satisfy (ii) by a direct application of the strong-FKG property. If it is assumed (i) and (ii) hold for the O(N − 1) model, then part (i) holds exactly as stated.


3. All other results in [1] are correct for $O(N)$ under the assumption that (i) and (ii) of Theorem 2 hold for the $O(N-1)$ model. Hence if an inductive approach is taken towards proving (i) and (ii), then there are some potentially useful tools available. Namely, if it is assumed that (i) and (ii) hold for the $O(N-1)$ model, then the strong-FKG property can be used in the $O(N)$ model.

References

1. Campbell, M.: Energy Correlations in O(N) models and the Wolff Representation. Commun. Math. Phys. 218, 99–111 (2001)
2. Hara, T. and Sokal, A.: Private communication

Communicated by J. L. Lebowitz

Commun. Math. Phys. 225, 453 – 463 (2002)


© Springer-Verlag 2002

Mean-Field Criticality for Percolation on Planar Non-Amenable Graphs

Roberto H. Schonmann

Department of Mathematics, University of California at Los Angeles, Los Angeles, CA 90095, USA. E-mail: [email protected]

Received: 4 April 2001 / Accepted: 4 October 2001

Abstract: The critical exponents $\beta$, $\gamma$, $\delta$ and $\Delta$ are proved to exist and to take their mean-field values for independent percolation on the following classes of infinite, locally finite, connected transitive graphs: (1) Non-amenable planar with one end. (2) Unimodular with infinitely many ends.

Work partially supported by the N.S.F. through grant DMS-0071766 and by a Guggenheim Foundation fellowship.

1. Introduction

1.1. Results. A great deal of attention has been given recently to the study of statistical mechanics and related systems on various classes of graphs. The reader is invited to consult [Lyo] and [Sch] for introductions to the subject, and for references to the literature. This paper can be seen as a continuation of [Sch], and we refer the reader to that paper for background and motivation. Basic terminology, definitions and notation will be reviewed later in this introduction.

We will consider independent bond percolation on an infinite, locally finite, connected transitive graph $G = (V, E)$. Results similar to the ones presented here hold also for independent site percolation, with similar proofs. The same remark can be made about the extension from transitive to quasi-transitive graphs.

Conjecture 1.2 in [Sch], combined with Conjecture 6 in [BS1] (reproduced as Conjecture 1.1 in [Sch]), states that if the graph is non-amenable, critical exponents exist and take their mean-field values. In [Sch], Theorem 1.1, this was proved for various critical exponents in the case in which the graph is unimodular and the edge-isoperimetric constant (Cheeger constant) is a sufficiently large fraction of the degree of the graph (previously, a special case had been handled in [Wu]). Here we prove similar results in two cases.

Theorem 1.1. For independent bond percolation on the following classes of infinite, locally finite, connected transitive graphs the critical exponents $\beta$, $\gamma$, $\delta$ and $\Delta$ exist and take their mean-field values:


(i) Graphs which are planar non-amenable and have one end. (ii) Graphs which are unimodular and have infinitely many ends. At the end of the next subsection, after introducing the necessary notation, we review the meaning of the exponents addressed in this theorem and recall what their mean-field values are. It is worth pointing out that while Theorem 1.1 in [Sch] was proved by verifying the triangle condition of [AN] (or, more precisely, the open triangle condition of [BA]), in the present paper we will follow a somewhat different route, based nevertheless also on the work of [AN, BA], and [Ngu]. We do not know whether the triangle condition holds in the cases treated here. The fact that there is no percolation at the critical point, which is a feature of mean-field criticality, is known to hold for independent percolation on any infinite, locally finite, connected transitive unimodular graph. This was proved in [BLPS1], and a simpler proof was provided in [BLPS2]. Unfortunately, the methods from these papers do not provide information on critical exponents. Part (i) of Theorem 1.1 is the main contribution in this paper. This is one more instance in which the extra techniques resulting from planarity allow one to prove that results on percolation conjectured to hold with greater generality are true at least in the planar case. In the classical study of percolation (and other statistical mechanics processes) on transitive amenable graphs, and especially on the graphs Zd , this is a well known fact: planarity allows one to make much faster progress, and much more has been proved in the case of Z2 than in the more general case of Zd (see, e.g., [Gri]). In the context of percolation on transitive non-amenable graphs, a similar pattern has been followed. The paper [Lal1] anticipated for certain transitive non-amenable planar graphs some of the results which would later be proved for more general transitive non-amenable graphs. 
The study of percolation on transitive non-amenable planar graphs was later greatly developed in the papers [Lal2] and [BS2]. For instance, the fundamental Conjecture 6 in [BS1], which states that for independent bond or site percolation on transitive non-amenable graphs there is always a regime with infinitely many infinite clusters, was proved to hold under the extra assumption of planarity.

In contrast to Theorem 1.1(i), independent percolation on transitive amenable planar graphs with one end is expected to have critical exponents with non-mean-field values. The case in which the graph is $\mathbb{Z}^2$ is extensively discussed in [Gri]. In the case of site percolation on the triangular lattice, various critical exponents have recently been proved to indeed take their conjectured, non-mean-field, values. This is a result of the rapid progress on conformal invariance, in combination with earlier work by H. Kesten relating various critical exponents in the two-dimensional case (see [LSW, SW] and references therein).

1.2. Terminology and notation. We will consider independent bond percolation on an infinite, locally finite, connected graph $G = (V, E)$, where $V$ is the set of vertices (sites) and $E$ is the set of edges (bonds). A site $r \in V$ will be singled out and denoted the root of $G$. The cardinality of a set $S \subset V$ will be denoted by $|S|$. The edge boundary of a set $S \subset V$ is $\partial_E S = \{\{x, y\} \in E : x \in S,\ y \in S^c\}$ and its inner vertex boundary is $\partial_{\mathrm{in}} S = \{x \in S : \{x, y\} \in \partial_E S \text{ for some } y \in S^c\}$. The edge-isoperimetric constant (Cheeger constant) of $G$ is defined as

\[
i_E(G) = \inf\left\{ \frac{|\partial_E S|}{|S|} : S \subset V,\ 0 \ne |S| < \infty \right\} .
\]

$G$ is said to be amenable in case $i_E(G) = 0$. The number of ends of the graph $G$ is

\[
E(G) = \sup_{S \subset V,\ |S| < \infty} \{\text{number of infinite connected components of } G \setminus S\},
\]

where $G \setminus S$ is the graph obtained from the graph $G$ by removing the vertices which belong to $S$ and the edges incident to these vertices. (The definition of ends of a graph is being omitted because it is not needed in this paper. Those familiar with that concept will note that $E(G)$ coincides with the cardinality of the set of ends of the graph $G$ in case this cardinality is finite and that $E(G) = \infty$ when this cardinality is infinite, but $E(G)$ does not distinguish between different infinite cardinalities. While this is a drawback of $E(G)$, its definition is simpler than that of the set of ends of a graph, and is sufficient for various purposes including those in this paper.)

Informally, a graph is transitive (same as vertex-transitive or homogeneous) if all its vertices play exactly the same role. More precisely, this means that for each pair $x, y \in V$ there is an automorphism of the graph which maps $x$ to $y$. A graph is said to be quasi-transitive if there is a finite set of vertices, $V_0$, with the property that each vertex of the graph can be mapped into one of the vertices of $V_0$ by an automorphism. Informally, a graph is quasi-transitive if there is a finite number of types of vertices, and vertices of the same type play the same role.

The number of ends of an infinite, locally finite, connected transitive graph is 1, 2, or $\infty$; moreover, when the number of ends is 2, the graph is amenable and when the number of ends is $\infty$ the graph is non-amenable (see Sect. 6 of [Moh]).

The stabilizer, $S(x)$, of a vertex $x \in V$ is the set of automorphisms of $G$ which fix $x$. A transitive graph is unimodular if for each $x, y \in V$, $|\{\gamma(y) : \gamma \in S(x)\}| = |\{\gamma(x) : \gamma \in S(y)\}|$.

A graph is said to be planar if it can be embedded in $\mathbb{R}^2$ with vertices being represented by points and edges being represented by lines which connect the corresponding vertices and can only intersect at their end-points.

The probability measure according to which each edge is occupied with probability $p$ and vacant with probability $1 - p$, independently of the others, will be denoted by $P_p$. The corresponding expectation will be denoted by $E_p$. Given $A, B \subset V$, we will write $\{A \leftrightarrow B\}$ for the event that there is a path of occupied bonds connecting $A$ to $B$ (if $A = \{x\}$, we write $\{x \leftrightarrow B\}$, rather than $\{\{x\} \leftrightarrow B\}$, and will use similar conventions systematically). Given also $S \subset V$ we will write $\{A \stackrel{S}{\longleftrightarrow} B\}$ for the event that there is a path of occupied bonds connecting $A$ to $B$ with all the sites which appear in this path belonging to $S$. We will set $\{A \nleftrightarrow B\} = \{A \leftrightarrow B\}^c$, $\{A \stackrel{S}{\nleftrightarrow} B\} = \{A \stackrel{S}{\longleftrightarrow} B\}^c$. For $x \in V$, $C(x) = \{y \in V : x \leftrightarrow y\}$ will denote the cluster of the site $x$.

The probability of percolation is defined as $\theta(p) = P_p(|C(r)| = \infty)$. The susceptibility is defined as $\chi(p) = E_p(|C(r)|) = \sum_{x\in V} P_p(r \leftrightarrow x)$. The threshold for percolation is the critical point $p_c = \inf\{p \in [0, 1] : \theta(p) > 0\}$. From the methods of [AB], we know that for quasi-transitive graphs $p_c = \sup\{p \in [0, 1] : \chi(p) < \infty\}$. The threshold for uniqueness of the infinite cluster is $p_u = \inf\{p \in [0, 1] : P_p(\text{there is a unique infinite cluster}) = 1\}$.

In order to define the critical exponent $\delta$, we introduce a "ghost field". Each site is painted green, independently of anything else, with probability $q$. $P_{p,q}$ will denote the corresponding probability measure in this enlarged probability space, and $E_{p,q}$ will be the corresponding expectation. The random set of green sites will be denoted by $Q$. One defines $\theta(p, q) = P_{p,q}(r \leftrightarrow Q)$, and $\chi(p, q) = E_{p,q}(|C(r)|;\ C(r) \cap Q = \emptyset) = \sum_{x\in V} P_{p,q}(r \leftrightarrow x,\ r \nleftrightarrow Q)$.

Next we review what is meant by saying that each one of the critical exponents which appears in Theorem 1.1 exists and takes its mean-field value. The labels on the


left indicate the way one usually refers to each statement, and provide the corresponding mean-field value of each critical exponent:

\begin{align*}
[\gamma = 1] \quad & C_1 (p_c - p)^{-1} \le \chi(p) \le C_2 (p_c - p)^{-1}, && \text{for } p < p_c,\\
[\beta = 1] \quad & C_1 (p - p_c)^{1} \le \theta(p) \le C_2 (p - p_c)^{1}, && \text{for } p > p_c,\\
[\delta = 2] \quad & C_1 q^{1/2} \le \theta(p_c, q) \le C_2 q^{1/2}, && \text{for } q > 0,\\
[\Delta = 2] \quad & \text{for } m = 1, 2, \dots:\ \ C_1 (p_c - p)^{-2} \le \frac{E_p(|C(r)|^{m+1})}{E_p(|C(r)|^{m})} \le C_2 (p_c - p)^{-2}, && \text{for } p < p_c,
\end{align*}

where in each case $C_1, C_2 \in (0, \infty)$.

2. Sufficient Conditions for Mean-Field Criticality

From the arguments in [AN] (modified in the fashion of Sect. 3.1 of [BA]) and [Ngu], we have:

Lemma 2.1.A. Suppose that $G = (V, E)$ is an infinite, locally finite, connected transitive unimodular graph such that $p_c < 1$. Suppose also that there are $\epsilon, c > 0$ and sites $x_1, x_2 \in V$ such that for every $p \in (p_c - \epsilon, p_c)$,

\[
\sum_{z_1, z_2 \in V} P_p(x_1 \leftrightarrow z_1,\ x_2 \leftrightarrow z_2,\ x_1 \nleftrightarrow x_2) \ge c\,(\chi(p))^2 .
\]

Then $\gamma = 1$ and $\Delta = 2$.

From the arguments in [BA] and [New] we have:

Lemma 2.1.B. Suppose that $G = (V, E)$ is an infinite, locally finite, connected transitive unimodular graph such that $p_c < 1$. Suppose also that there are $\epsilon, c > 0$ and sites $x_1, x_2, x_3 \in V$ such that for every $p \in (p_c - \epsilon, p_c)$ and $q \in (0, \epsilon)$,

\[
\sum_{z \in V} P_{p,q}(x_1 \leftrightarrow z,\ x_1 \nleftrightarrow Q,\ x_2 \leftrightarrow Q,\ x_3 \leftrightarrow Q,\ x_2 \nleftrightarrow x_3) \ge c\,\chi(p, q)\,(\theta(p, q))^2 .
\]

Then $\delta = 2$ and $\beta = 1$.

The role of unimodularity in the derivation of the two lemmas above is explained in Section 3.2 of [Sch]. In the remainder of this section, we will reduce the lemmas above to further sufficient conditions for statements of mean-field criticality. The reader can either study these lemmas in the order in which they will be presented, or alternatively, study first the lemmas labeled with "A", which refer to the exponents $\gamma$ and $\Delta$, and later study the lemmas labeled with "B", which refer to the exponents $\delta$ and $\beta$, and which have more involved proofs.

Lemma 2.2.A. Suppose that $G = (V, E)$ is an infinite, locally finite, connected transitive unimodular graph. Suppose also that there are $\epsilon, c > 0$, disjoint sets of sites $V_1, V_2 \subset V$ and sites $x_1 \in V_1$, $x_2 \in V_2$ such that for every $p \in (p_c - \epsilon, p_c)$,

\[
P_p(V_1 \leftrightarrow V_2) \le 1 - c, \qquad
\sum_{z \in V_i} P_p(x_i \stackrel{V_i}{\longleftrightarrow} z) \ge c\,\chi(p) \quad (i = 1, 2).
\]

Then $\gamma = 1$ and $\Delta = 2$.

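Proof. For $i = 1, 2$, the events $\{x_i \stackrel{V_i}{\longleftrightarrow} z\}$ depend only on the state of occupancy of the edges which have both endpoints in $V_i$, while the event $\{V_1 \leftrightarrow V_2\}$ depends only on the state of occupancy of the other edges. Therefore, by independence,

\begin{align*}
\sum_{z_1, z_2 \in V} P_p(x_1 \leftrightarrow z_1,\ x_2 \leftrightarrow z_2,\ x_1 \nleftrightarrow x_2)
&\ge \sum_{z_1, z_2 \in V} P_p(x_1 \stackrel{V_1}{\longleftrightarrow} z_1,\ x_2 \stackrel{V_2}{\longleftrightarrow} z_2,\ V_1 \nleftrightarrow V_2)\\
&= \sum_{z_1, z_2 \in V} P_p(x_1 \stackrel{V_1}{\longleftrightarrow} z_1)\, P_p(x_2 \stackrel{V_2}{\longleftrightarrow} z_2)\, P_p(V_1 \nleftrightarrow V_2)\\
&= \Bigl( \sum_{z_1 \in V_1} P_p(x_1 \stackrel{V_1}{\longleftrightarrow} z_1) \Bigr) \Bigl( \sum_{z_2 \in V_2} P_p(x_2 \stackrel{V_2}{\longleftrightarrow} z_2) \Bigr) P_p(V_1 \nleftrightarrow V_2)\\
&\ge c^3 (\chi(p))^2 .
\end{align*}

And the claim follows from Lemma 2.1.A. (The hypothesis in that lemma that $p_c < 1$ must hold, since otherwise $P_p(V_1 \leftrightarrow V_2) \to 1$, as $p \nearrow p_c$.)

Lemma 2.2.B. Suppose that $G = (V, E)$ is an infinite, locally finite, connected transitive unimodular graph. Suppose also that there are $\epsilon, c > 0$, disjoint sets of sites $V_1, V_2, V_3 \subset V$ and sites $x_1 \in V_1$, $x_2 \in V_2$, $x_3 \in V_3$ such that for every $p \in (p_c - \epsilon, p_c)$ and $q \in (0, \epsilon)$,

\begin{gather*}
P_{p,q}(V_i \leftrightarrow V_j) \le 1 - c \quad (i \ne j),\\
\sum_{z \in V_1} P_{p,q}(x_1 \stackrel{V_1}{\longleftrightarrow} z,\ x_1 \nleftrightarrow Q) \ge c\,\chi(p, q),\\
P_{p,q}(x_i \stackrel{V_i}{\longleftrightarrow} Q) \ge c\,\theta(p, q) \quad (i = 2, 3).
\end{gather*}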

Then $\delta = 2$ and $\beta = 1$.

Proof. Set

\begin{gather*}
A_1^z = \{x_1 \stackrel{V_1}{\longleftrightarrow} z\}, \qquad
\tilde{A}_1 = \{x_1 \nleftrightarrow Q\}, \qquad
\tilde{A}_1^z = \{x_1 \stackrel{V_1}{\longleftrightarrow} z,\ x_1 \nleftrightarrow Q\},\\
A_2 = \{x_2 \stackrel{V_2}{\longleftrightarrow} Q\}, \qquad
A_3 = \{x_3 \stackrel{V_3}{\longleftrightarrow} Q\},\\
B = \{V_1 \nleftrightarrow V_2,\ V_2 \nleftrightarrow V_3,\ V_3 \nleftrightarrow V_1\}.
\end{gather*}

For each set $E' \subset E$, we will denote by $\mathcal{F}_{E'}$ the $\sigma$-field generated by the state of occupancy of the edges in $E'$. Let $E_1 = \{\{u, v\} \in E : u, v \in V_1\}$, and let $(E_1^k)_{k \ge 1}$ be an increasing sequence of subsets of $E_1$ which converges to this set (i.e., $\cup_k E_1^k = E_1$). For any $k$, and any configuration $\omega_1 \in \{0, 1\}^{E_1^k}$, the set of configurations in $\{0, 1\}^{E \setminus E_1^k} \times \{0, 1\}^V$ which in combination with $\omega_1$ produce a configuration in $\tilde{A}_1$ is a decreasing set. Similarly for $B$. Therefore, by Harris' inequality,

\[
P_{p,q}(\tilde{A}_1 B \mid \mathcal{F}_{E_1^k}) \ge P_{p,q}(\tilde{A}_1 \mid \mathcal{F}_{E_1^k})\, P_{p,q}(B \mid \mathcal{F}_{E_1^k}) = P_{p,q}(\tilde{A}_1 \mid \mathcal{F}_{E_1^k})\, P_{p,q}(B),
\]

where in the last step we used the fact that $B$ depends only on the state of occupancy of the edges which have at least one endpoint in $(V_1)^c$ and therefore is independent of $\mathcal{F}_{E_1^k}$. Letting $k \to \infty$, and using (5.9) on p. 264 of [Dur], yields

\[
P_{p,q}(\tilde{A}_1 B \mid \mathcal{F}_{E_1}) \ge P_{p,q}(\tilde{A}_1 \mid \mathcal{F}_{E_1})\, P_{p,q}(B).
\]

Integration over $A_1^z \in \mathcal{F}_{E_1}$ yields now

\[
P_{p,q}(\tilde{A}_1^z B) = P_{p,q}(\tilde{A}_1 A_1^z B) \ge P_{p,q}(\tilde{A}_1 A_1^z)\, P_{p,q}(B) = P_{p,q}(\tilde{A}_1^z)\, P_{p,q}(B).
\]

Therefore,

\begin{align*}
\sum_{z \in V} P_{p,q}(x_1 \leftrightarrow z,\ x_1 \nleftrightarrow Q,\ x_2 \leftrightarrow Q,\ x_3 \leftrightarrow Q,\ x_2 \nleftrightarrow x_3)
&\ge \sum_{z \in V_1} P_{p,q}(\tilde{A}_1^z A_2 A_3 B)\\
&= \sum_{z \in V_1} P_{p,q}(\tilde{A}_1^z B)\, P_{p,q}(A_2)\, P_{p,q}(A_3)\\
&\ge \sum_{z \in V_1} P_{p,q}(\tilde{A}_1^z)\, P_{p,q}(A_2)\, P_{p,q}(A_3)\, P_{p,q}(B)\\
&\ge c^6 \chi(p, q)\,(\theta(p, q))^2 .
\end{align*}

In the second step above we used the fact that $\tilde{A}_1^z B$ depends only on the state of occupancy of the edges which have at least one endpoint in $(V_2 \cup V_3)^c$ and on the state (green or not) of the vertices in $(V_2 \cup V_3)^c$, while, for $i = 2, 3$, $A_i$ depends only on the state of occupancy of the edges which have both endpoints in $V_i$ and on the state (green or not) of the vertices in $V_i$. In the last step above we used Harris' inequality to obtain $P_{p,q}(B) \ge c^3$. The claim follows now from Lemma 2.1.B. (The hypothesis in that lemma that $p_c < 1$ must hold, since otherwise $P_{p,q}(V_i \leftrightarrow V_j) \to 1$, as $p \nearrow p_c$.)

Lemma 2.3.A. Suppose that $G = (V, E)$ is an infinite, locally finite, connected transitive unimodular graph. Suppose also that there are disjoint sets of sites $V_1, V_2 \subset V$ and sites $x_1 \in V_1$, $x_2 \in V_2$ such that

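\[
P_{p_c}(V_1 \leftrightarrow V_2) < 1, \qquad
\sum_{v \in \partial_{\mathrm{in}} V_i} P_{p_c}(x_i \leftrightarrow v) < 1 \quad (i = 1, 2).
\]

Then $\gamma = 1$ and $\Delta = 2$.

Proof. We will verify the hypothesis of Lemma 2.2.A with $\epsilon = p_c$ and $c = \min\{1 - P_{p_c}(V_1 \leftrightarrow V_2),\ 1 - \sum_{v \in \partial_{\mathrm{in}} V_1} P_{p_c}(x_1 \leftrightarrow v),\ 1 - \sum_{v \in \partial_{\mathrm{in}} V_2} P_{p_c}(x_2 \leftrightarrow v)\}$. By monotonicity in $p$, only the second display in the hypothesis of Lemma 2.2.A requires any non-trivial argumentation. To verify it, we note that if $\{x_i \leftrightarrow z\}$ occurs, then either $\{x_i \stackrel{V_i}{\longleftrightarrow} z\}$ occurs, or else there is some vertex $v \in \partial_{\mathrm{in}} V_i$ for which the event $\{x_i \leftrightarrow v\} \,\square\, \{v \leftrightarrow z\}$ occurs. From the van den Berg–Kesten–Fiebig–Reimer inequality, we obtain then, for $p < p_c$,

\begin{align*}
P_p(x_i \leftrightarrow z)
&\le P_p(x_i \stackrel{V_i}{\longleftrightarrow} z) + \sum_{v \in \partial_{\mathrm{in}} V_i} P_p(x_i \leftrightarrow v)\, P_p(v \leftrightarrow z)\\
&\le P_p(x_i \stackrel{V_i}{\longleftrightarrow} z) + \sum_{v \in \partial_{\mathrm{in}} V_i} P_{p_c}(x_i \leftrightarrow v)\, P_p(v \leftrightarrow z).
\end{align*}

Summing over $z \in V$,

\[
\chi(p) \le \sum_{z \in V_i} P_p(x_i \stackrel{V_i}{\longleftrightarrow} z) + \sum_{v \in \partial_{\mathrm{in}} V_i} P_{p_c}(x_i \leftrightarrow v)\, \chi(p).
\]

Therefore,

\[
\sum_{z \in V_i} P_p(x_i \stackrel{V_i}{\longleftrightarrow} z) \ge c\,\chi(p).
\]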
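The last implication in the proof above is a one-line rearrangement. It is spelled out below as a sketch in the notation of the proof; the shorthand $\sigma_i$ is introduced here only for illustration and does not appear in the paper.

```latex
\[
\sigma_i := \sum_{v \in \partial_{\mathrm{in}} V_i} P_{p_c}(x_i \leftrightarrow v) \le 1 - c ,
\qquad
\sum_{z \in V} P_p(v \leftrightarrow z) = \chi(p) \quad \text{for every } v \in V \text{ (transitivity)},
\]
\[
\chi(p) \le \sum_{z \in V_i} P_p\bigl(x_i \stackrel{V_i}{\longleftrightarrow} z\bigr) + \sigma_i\, \chi(p)
\quad \Longrightarrow \quad
\sum_{z \in V_i} P_p\bigl(x_i \stackrel{V_i}{\longleftrightarrow} z\bigr) \ge (1 - \sigma_i)\, \chi(p) \ge c\, \chi(p).
\]
```

Subtracting $\sigma_i\,\chi(p)$ from both sides is legitimate because $\chi(p) < \infty$ for $p < p_c$, by the characterization $p_c = \sup\{p : \chi(p) < \infty\}$ recalled in Sect. 1.2.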

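Lemma 2.3.B. Suppose that $G = (V, E)$ is an infinite, locally finite, connected transitive unimodular graph. Suppose also that there are disjoint sets of sites $V_1, V_2, V_3 \subset V$ and sites $x_1 \in V_1$, $x_2 \in V_2$, $x_3 \in V_3$ such that

\begin{gather*}
P_{p_c}(V_i \leftrightarrow V_j) < 1 \quad (i \ne j),\\
\sum_{v \in \partial_{\mathrm{in}} V_i} P_{p_c}(x_i \leftrightarrow v) < 1 \quad (i = 1, 2, 3).
\end{gather*}

Then $\delta = 2$ and $\beta = 1$.

Proof. We will verify the hypothesis of Lemma 2.2.B with $\epsilon = p_c$ and $c = \min\{1 - P_{p_c}(V_1 \leftrightarrow V_2),\ 1 - P_{p_c}(V_2 \leftrightarrow V_3),\ 1 - P_{p_c}(V_3 \leftrightarrow V_1),\ 1 - \sum_{v \in \partial_{\mathrm{in}} V_1} P_{p_c}(x_1 \leftrightarrow v),\ 1 - \sum_{v \in \partial_{\mathrm{in}} V_2} P_{p_c}(x_2 \leftrightarrow v),\ 1 - \sum_{v \in \partial_{\mathrm{in}} V_3} P_{p_c}(x_3 \leftrightarrow v)\}$. By monotonicity in $p$, only the second and third displays in the hypothesis of Lemma 2.2.B require any non-trivial argumentation.

To verify the third display, we note that if $\{x_i \leftrightarrow Q\}$ occurs, then either $\{x_i \stackrel{V_i}{\longleftrightarrow} Q\}$ occurs, or else there is some vertex $v \in \partial_{\mathrm{in}} V_i$ for which the event $\{x_i \leftrightarrow v\} \,\square\, \{v \leftrightarrow Q\}$ occurs. From the van den Berg–Kesten–Fiebig–Reimer inequality, we obtain then, for $p < p_c$ and $q \in (0, 1]$,

\begin{align*}
\theta(p, q) = P_{p,q}(x_i \leftrightarrow Q)
&\le P_{p,q}(x_i \stackrel{V_i}{\longleftrightarrow} Q) + \sum_{v \in \partial_{\mathrm{in}} V_i} P_{p,q}(x_i \leftrightarrow v)\, P_{p,q}(v \leftrightarrow Q)\\
&\le P_{p,q}(x_i \stackrel{V_i}{\longleftrightarrow} Q) + \sum_{v \in \partial_{\mathrm{in}} V_i} P_{p_c}(x_i \leftrightarrow v)\, \theta(p, q).
\end{align*}

Therefore,

\[
P_{p,q}(x_i \stackrel{V_i}{\longleftrightarrow} Q) \ge c\,\theta(p, q) \quad (i = 2, 3).
\]

To verify the second display in the hypothesis of Lemma 2.2.B note that if $\{x_1 \leftrightarrow z,\ x_1 \nleftrightarrow Q\}$ occurs, then either $\{x_1 \stackrel{V_1}{\longleftrightarrow} z,\ x_1 \nleftrightarrow Q\}$ occurs, or else there is some vertex $v \in \partial_{\mathrm{in}} V_1$ for which the event $\{x_1 \leftrightarrow v\} \,\square\, \{v \leftrightarrow z,\ v \nleftrightarrow Q\}$ occurs (this is slightly subtle; recall that our sample space is $\{0, 1\}^E \times \{0, 1\}^V$ and, using the notation in [Gri], p. 38, take for the set $K$ in the definition of $\square$ a set of edges which produce a path from $x_1$ to $v$; do not include any vertex in $K$). From the van den Berg–Kesten–Fiebig–Reimer inequality, we obtain then, for $p < p_c$ and $q \in (0, 1]$,

\begin{align*}
P_{p,q}(x_1 \leftrightarrow z,\ x_1 \nleftrightarrow Q)
&\le P_{p,q}(x_1 \stackrel{V_1}{\longleftrightarrow} z,\ x_1 \nleftrightarrow Q) + \sum_{v \in \partial_{\mathrm{in}} V_1} P_{p,q}(x_1 \leftrightarrow v)\, P_{p,q}(v \leftrightarrow z,\ v \nleftrightarrow Q)\\
&\le P_{p,q}(x_1 \stackrel{V_1}{\longleftrightarrow} z,\ x_1 \nleftrightarrow Q) + \sum_{v \in \partial_{\mathrm{in}} V_1} P_{p_c}(x_1 \leftrightarrow v)\, P_{p,q}(v \leftrightarrow z,\ v \nleftrightarrow Q).
\end{align*}

Summing over $z \in V$,

\[
\chi(p, q) \le \sum_{z \in V_1} P_{p,q}(x_1 \stackrel{V_1}{\longleftrightarrow} z,\ x_1 \nleftrightarrow Q) + \sum_{v \in \partial_{\mathrm{in}} V_1} P_{p_c}(x_1 \leftrightarrow v)\, \chi(p, q).
\]

Therefore,

\[
\sum_{z \in V_1} P_{p,q}(x_1 \stackrel{V_1}{\longleftrightarrow} z,\ x_1 \nleftrightarrow Q) \ge c\,\chi(p, q).
\]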

3. The Case of Planar Graphs In this section we suppose that G = (V , E) is an infinite, locally finite, connected transitive non-amenable planar single-ended graph. Proposition 2.1 of [BS2] states that G is unimodular and that it can be embedded in the hyperbolic plane H2 in the following way. Each vertex of G is mapped into a point of H2 and each edge of G is mapped into a geodesic line segment with endpoints at the points of H2 which are images of its endpoints; moreover the group of automorphisms of G is mapped in this way into a group of isometries of H2 . It is clear that, by adjusting the length scale, such an embedding can be chosen so that each face in the embedding has diameter less than 1. In particular any point of H2 is then within distance 1 of a point which represents a vertex of G, and all the geodesic line segments which represent edges of G have length at most 1. We will refer to such an embedding as a “nice embedding”. One convenient way to describe the dual G† = (V † , E † ) of G is to represent each element of V † by a face (tile) in the embedding of G, described above, and represent elements of E † by pairs of faces whose topological boundaries intersect on a nondegenerate geodesic line segment (which represents an edge of G). This establishes a one-to-one correspondence between E and E † , and the image of e ∈ E under this correspondence will be denoted by e† . Since G is transitive, G† is quasi-transitive. Any bond percolation process on G is coupled to a bond percolation process on G† , by declaring each edge e† vacant (resp. occupied) if e is occupied (resp. vacant). Independent percolation at density p on G is coupled in this fashion to independent percolation at density 1 − p on G† . The following lemma is a basic building block in our argumentation in this section. 
In the statement of this lemma, we identify a path in the dual graph with the union of the tiles that correspond to the endpoints of the dual edges in this path in the embedding.

Lemma 3.1. Suppose that $G$ is an infinite, locally finite, connected transitive non-amenable planar single-ended graph, nicely embedded in $\mathbb{H}^2$. If $p < p_u$, then there is $C_0 > 0$ such that the following happens. Let $\mathcal{L}$ be an arbitrary geodesic line in $\mathbb{H}^2$, $s'$ and $s''$ be two points on $\mathcal{L}$, separated by distance $l > 2$, and $\mathcal{L}'$ and $\mathcal{L}''$ be geodesic lines perpendicular to $\mathcal{L}$ through $s'$ and $s''$, respectively. Then

\[
P_p(\text{there is an occupied dual path separating } \mathcal{L}' \text{ from } \mathcal{L}'') > C_0 .
\]

Proof. This was proved in a somewhat more restricted setting and for site percolation in [Lal2], Lemma 2.15. The more general case considered here can be handled in the same way, by using results in [BS2]. First, from Theorem 3.7 of [BS2], we learn that there is percolation in the dual process when $p < p_u$. From the generalization of Corollary 4.4 of [BS2] to quasi-transitive tilings of $\mathbb{H}^2$, we learn then that percolation also occurs in this dual process on hyperbolic half-spaces. This enables us to use the arguments in the proofs of Lemmas 2.14 and 2.15 in [Lal2] to conclude the proof.


Given a nice embedding of $G$ in $\mathbb{H}^2$ and a set $S \subset \mathbb{H}^2$, we will use the notation $\bar{S}$ for the set of vertices of $G$ which are endpoints of edges represented in the embedding by geodesic line segments which intersect $S$.

Lemma 3.2. Suppose that $G$ is an infinite, locally finite, connected transitive non-amenable planar single-ended graph, nicely embedded in $\mathbb{H}^2$. If $p < p_u$, then there are $C_1, C_2 \in (0, \infty)$ such that the following happens. Let $\mathcal{L}$ be an arbitrary geodesic line in $\mathbb{H}^2$, $s'$ and $s''$ be two points on $\mathcal{L}$, separated by distance $L$, and $\mathcal{L}'$ and $\mathcal{L}''$ be geodesic lines perpendicular to $\mathcal{L}$ through $s'$ and $s''$, respectively. Then

\[
P_p(\bar{\mathcal{L}}' \leftrightarrow \bar{\mathcal{L}}'') \le C_1 e^{-C_2 L} .
\]

Proof. Take $l > 2$ and consider the set of geodesic lines which separate $\mathcal{L}'$ from $\mathcal{L}''$, are perpendicular to $\mathcal{L}$ and cross it at points which are at distance $jl$, $j = 1, 2, \dots, \lfloor L/l \rfloor$, from $s'$. Since any path from $\bar{\mathcal{L}}'$ to $\bar{\mathcal{L}}''$ has to cross all these lines, the claim follows from Lemma 3.1.

Lemma 3.3. Suppose that $G$ is an infinite, locally finite, connected transitive non-amenable planar single-ended graph, nicely embedded in $\mathbb{H}^2$. If $p < p_u$, then there are $C_3, C_4 \in (0, \infty)$ such that the following happens. Let $\mathcal{L}$ be an arbitrary geodesic line in $\mathbb{H}^2$, $s$ and $s'$ be two points on $\mathcal{L}$, separated by distance $L$, and $\mathcal{L}'$ be the geodesic line perpendicular to $\mathcal{L}$ through $s'$. Let $x$ be a vertex of $G$ which in the embedding is mapped into a point of $\mathbb{H}^2$ at distance at most 1 from $s$. Then

\[
\sum_{y \in \bar{\mathcal{L}}'} P_p(x \leftrightarrow y) \le C_3 e^{-C_4 L} .
\]

Proof. Let $\mathcal{L}'_+$ and $\mathcal{L}'_-$ be the two half-lines into which $s'$ partitions $\mathcal{L}'$. Take some $l > 2$. Set $s_0 = s'$, and for $k \in \{1, 2, \dots\}$ let $s_k$ (resp. $s_{-k}$) be the point on $\mathcal{L}'_+$ (resp. $\mathcal{L}'_-$) at distance $kl$ from $s'$. For $k \in \{1, 2, \dots\}$ let $I_k$ (resp. $I_{-k}$) be the geodesic segment (contained in $\mathcal{L}'$) with endpoints $s_{k-1}$ and $s_k$ (resp. $s_{-k+1}$ and $s_{-k}$). For $j \in \mathbb{Z}$, let $\mathcal{L}_j$ be the geodesic line perpendicular to $\mathcal{L}'$ through $s_j$. Then

\[
\sum_{y \in \bar{\mathcal{L}}'} P_p(x \leftrightarrow y) \le \sum_{j \in \mathbb{Z} \setminus \{0\}}\ \sum_{y \in \bar{I}_j} P_p(x \leftrightarrow y).
\]

Let $D$ be the degree of $G$. It is easy to see that for some small $\epsilon > 0$ any ball of radius $\epsilon$ in $\mathbb{H}^2$ can intersect at most $D$ edges of the embedding of $G$ in $\mathbb{H}^2$. Therefore it is also easy to see that any geodesic line segment of length $d$ can intersect at most $dD/\epsilon$ such edges. Therefore, from the previous display we obtain, for arbitrary $J$,

\[
\sum_{y \in \bar{\mathcal{L}}'} P_p(x \leftrightarrow y) \le \frac{lD}{\epsilon} \sum_{j : |j| > J} P_p(x \leftrightarrow \bar{I}_j) + \frac{2lJD}{\epsilon}\, P_p(x \leftrightarrow \bar{\mathcal{L}}').
\]

When $j > 1$ (resp. $j < -1$) any path from $x$ to $\bar{I}_j$ has to cross the lines $\mathcal{L}_i$, $i = 1, 2, \dots, j - 1$ (resp. $i = -1, -2, \dots, j + 1$). Hence, Lemma 3.1 implies $P_p(x \leftrightarrow \bar{I}_j) \le C_5 e^{-C_6 |j|}$, for some $C_5, C_6 \in (0, \infty)$. Therefore, using Lemma 3.2 and taking $J = \lfloor L \rfloor$, we obtain

\[
\sum_{y \in \bar{\mathcal{L}}'} P_p(x \leftrightarrow y) \le C_7 e^{-C_6 L} + C_8 L e^{-C_2 L} \le C_3 e^{-C_4 L} .
\]


Proof of Theorem 1.1(i). We will check that the hypotheses of Lemma 2.3.A and Lemma 2.3.B are satisfied (note that the former are contained in the latter). Suppose that $G$ is nicely embedded in $\mathbb{H}^2$. Let $\mathcal{L}$ be a geodesic line and $s_1, \dots, s_7$ be distinct points on $\mathcal{L}$, such that for $i = 1, \dots, 6$, the distance between $s_i$ and $s_{i+1}$ has the same common value $L$. For each $i$, let $\mathcal{L}_i$ be the geodesic line perpendicular to $\mathcal{L}$ through $s_i$. The removal of $\mathcal{L}_2 \cup \mathcal{L}_3 \cup \mathcal{L}_5 \cup \mathcal{L}_6$ breaks $\mathbb{H}^2$ into 5 connected components. For $i = 1, 4, 7$, let $\mathcal{V}_i$ be the connected component which contains $s_i$. Set $V_1 = \bar{\mathcal{V}}_1$, $V_2 = \bar{\mathcal{V}}_4$, $V_3 = \bar{\mathcal{V}}_7$. Let $x_1$, $x_2$ and $x_3$ be vertices of $G$ which in the embedding are mapped into points of $\mathbb{H}^2$ at distance at most 1 from $s_1$, $s_4$ and $s_7$, respectively. With these choices, the hypotheses of Lemma 2.3.B are satisfied, provided that $L$ is large enough, as can be seen from Lemma 3.2, Lemma 3.3 and Theorem 1.1 of [BS2], which states that $p_c < p_u$.

4. The Case of Graphs with Infinitely Many Ends

We will need some notation and terminology related to the binary homogeneous tree, $T_2$, i.e., the tree in which every vertex has degree 3. The set of vertices of this tree will be denoted by $V(T_2)$. Given $i, j, k \in V(T_2)$ we will say that $k$ is between $i$ and $j$ if the shortest path from $i$ to $j$ passes through $k$. The following proposition will be used in this section; it can be easily proved with the arguments in the proof of Proposition 6.1 in [Moh2]. (Compare with Proposition 2.1 in [Sch].) Below $B(u, n)$ will denote the ball of radius $n$ centered at $u \in V$ in the graph $G = (V, E)$.

Proposition 4.1. Suppose that $G = (V, E)$ is an infinite, locally finite, connected transitive graph. If $G$ has infinitely many ends, then there is a positive integer $n$ and vertices $u_k \in V$, $k \in V(T_2)$, such that the balls $B(u_k, n)$, $k \in V(T_2)$, are disjoint and have the following property. For each $i, j \in V(T_2)$ any path from $B(u_i, n)$ to $B(u_j, n)$ intersects each $B(u_k, n)$ with $k$ between $i$ and $j$.

Proof of Theorem 1.1(ii). We will check that the hypotheses of Lemma 2.3.A and Lemma 2.3.B are satisfied (note that the former are contained in the latter). Let $k_0, k_1, k_2, k_3 \in V(T_2)$ be such that for $1 \le i < j \le 3$, $k_0$ is between $k_i$ and $k_j$, and for $i = 1, 2, 3$, the distance in $T_2$ between $k_i$ and $k_0$ has a common value $l$. Using the notation in Proposition 4.1, set $x_i = u_{k_i}$, $i = 0, 1, 2, 3$. Proposition 4.1 implies that $G \setminus B(x_0, n)$ has at least 3 distinct infinite components, which contain respectively $x_1$, $x_2$ and $x_3$. Call them, respectively, $V_1$, $V_2$ and $V_3$. Since $G$ has infinitely many ends, it is non-amenable and hence, by Theorem 2 of [BS1] (adapted to bond percolation), it has $p_c < 1$.

To verify the hypotheses of Lemma 2.3.B, let $K$ be the number of edges of $G$ which have at least one endpoint in $B(x_0, n)$, and note that, for $1 \le i < j \le 3$,
\[ P_{p_c}(V_i \leftrightarrow V_j) \le 1 - (1 - p_c)^K < 1, \]
and, for $i = 1, 2, 3$,
\[ \sum_{v \in \partial_{\rm in} V_i} P_{p_c}(x_i \leftrightarrow v) \le |\partial_{\rm in} V_i|\, P_{p_c}(x_i \leftrightarrow \partial_{\rm in} V_i) \le K \bigl(1 - (1 - p_c)^K\bigr)^{l-1}. \]

The last expression can be made arbitrarily small by taking l sufficiently large.
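As a rough numerical illustration of the last remark, the following sketch evaluates the bound $K(1 - (1 - p_c)^K)^{l-1}$ and finds the first $l$ for which it drops below a chosen threshold. The values of `p_c`, `K` and `target` are arbitrary assumptions picked for the example, not quantities taken from any particular graph, and the function names are hypothetical.

```python
def connection_bound(p_c, K, l):
    """Evaluate K * (1 - (1 - p_c)**K)**(l - 1), the upper bound in the display above."""
    return K * (1.0 - (1.0 - p_c) ** K) ** (l - 1)


def smallest_l(p_c, K, target):
    """Return the first integer l >= 1 for which the bound falls below `target`.

    Terminates whenever 0 < p_c < 1, since the base of the power is then < 1.
    """
    l = 1
    while connection_bound(p_c, K, l) >= target:
        l += 1
    return l


if __name__ == "__main__":
    # Illustrative (assumed) parameters only.
    p_c, K, target = 0.3, 4, 1e-2
    l_star = smallest_l(p_c, K, target)
    print(l_star, connection_bound(p_c, K, l_star))
```

The decay is geometric in $l$, which is all the argument needs: any fixed smallness requirement of Lemma 2.3.B is met by choosing $l$ large.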


Acknowledgement. I am grateful to Ander Holroyd and Oded Schramm for their various comments and suggestions.

References

[AB] Aizenman, M. and Barsky, D.: Sharpness of the phase transition in percolation models. Commun. Math. Phys. 108, 489–526 (1987)
[AN] Aizenman, M. and Newman, C.M.: Tree graph inequalities and critical behavior in percolation models. J. Stat. Phys. 16, 811–828 (1983)
[BA] Barsky, D.J. and Aizenman, M.: Percolation critical exponents under the triangle condition. Commun. Math. Phys. 19, 1520–1536 (1991)
[BLPS1] Benjamini, I., Lyons, R., Peres, Y. and Schramm, O.: Group-invariant percolation on graphs. Geom. and Funct. Anal. 9, 29–66 (1999)
[BLPS2] Benjamini, I., Lyons, R., Peres, Y. and Schramm, O.: Critical percolation on any non-amenable group has no infinite clusters. Ann. of Probability 27, 1347–1356 (1999)
[BS1] Benjamini, I. and Schramm, O.: Percolation beyond Z^d, many questions and a few answers. Electronic Communications in Probability 1, 71–82 (1996)
[BS2] Benjamini, I. and Schramm, O.: Percolation in the hyperbolic plane. J. Am. Math. Soc. 14, 487–507 (2000)
[Dur] Durrett, R.: Probability: Theory and Examples. Duxbury Press, Second edition, 1996
[Gri] Grimmett, G.R.: Percolation. New York–Berlin: Springer-Verlag, 2nd edition, 1999
[Lal1] Lalley, S.P.: Percolation on Fuchsian groups. Annales de L’Institut Henri Poincaré (Probability and Statistics) 34, 151–177 (1998)
[Lal2] Lalley, S.P.: Percolation clusters in hyperbolic tesselations. Geom. and Funct. Anal. (to appear)
[LSW] Lawler, G., Schramm, O. and Werner, W.: One-arm exponent for critical 2D percolation. Preprint, 2001
[Lyo] Lyons, R.: Phase transition on non-amenable graphs. J. Math. Phys. 41, 1099–1126 (2000)
[Moh] Mohar, B.: Some relations between analytic and geometric properties of infinite graphs. Discrete Mathematics 95, 193–219 (1991)
[New] Newman, C.M.: Another critical exponent inequality for percolation: β ≥ 2/δ. J. Stat. Phys. 47, 695–699 (1987)
[Ngu] Nguyen, B.: Gap exponent for percolation processes with triangle condition. J. Stat. Phys. 49, 235–243 (1987)
[Sch] Schonmann, R.H.: Multiplicity of phase transitions and mean-field criticality on highly nonamenable graphs. Commun. Math. Phys. 219, 271–322 (2001)
[SW] Smirnov, S. and Werner, W.: Critical exponents for two-dimensional percolation. Preprint, 2001
[Wu] Wu, C.C.: Critical behavior of percolation and Markov fields on branching planes. J. Appl. Probability 30, 538–547 (1993)

Communicated by M. Aizenman

Commun. Math. Phys. 225, 465 – 485 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

Normal Coordinates and Primitive Elements in the Hopf Algebra of Renormalization C. Chryssomalakos, H. Quevedo, M. Rosenbaum, J. D. Vergara Instituto de Ciencias Nucleares, Universidad Nacional Autónoma de México, Apdo. Postal 70-543, 04510 México, D.F., Mexico. E-mail: {chryss,quevedo,mrosen,vergara}@nuclecu.unam.mx Received: 25 May 2001 / Accepted: 5 October 2001

Abstract: We introduce normal coordinates on the infinite dimensional group G introduced by Connes and Kreimer in their analysis of the Hopf algebra of rooted trees. We study the primitive elements of the algebra and show that they are generated by a simple application of the inverse Poincaré lemma, given a closed left invariant 1-form on G. For the special case of the ladder primitives, we find a second description that relates them to the Hopf algebra of functionals on power series with the usual product. Either approach shows that the ladder primitives are given by the Schur polynomials. The relevance of the lower central series of the dual Lie algebra in the process of renormalization is also discussed, leading to a natural concept of k-primitiveness, which is shown to be equivalent to the one already in the literature.

Contents

1. Introduction
2. Differential Geometry à la Hopf
3. The Hopf Algebra of Rooted Trees and Its Dual
3.1 Functions
3.2 Vector fields
3.3 1-forms
4. Normal Coordinates
4.1 A new basis
4.2 The Hopf structure
5. Primitive Elements
5.1 Ladder generators
5.2 The general case
5.3 The lower central series and k-primitiveness
6. Normal Coordinates and Toy Model Renormalization
6.1 The toy model
6.2 Renormalization in the ψ-basis

1. Introduction The process of renormalization in quantum field theory has been substantially elucidated in recent years. In a series of papers (see, e.g., [11, 7, 2, 9] and references therein), a Hopf algebra structure has been identified that greatly simplifies its combinatorics. This, in turn, has led to the development of an underlying geometric picture, involving an infinite dimensional group manifold G, the coordinates of which are in one-to-one correspondence with (classes of) 1PI superficially divergent Feynman diagrams of the theory. The latter are indexed by a type of graphs known as (decorated) rooted trees, which capture the subdivergence structure of the diagram. The forest formula prescription for the renormalization of a diagram then is translated into a series of operations on the corresponding rooted tree and the latter have been shown to deliver standard Hopf algebraic quantities, like the coproduct and the antipode of the rooted tree. The above results were obtained using a powerful mixture of algebraic and combinatoric techniques that brought to light unexpected interconnections with noncommutative geometry, among several other fields. The complexity of the full Hopf algebra of decorated rooted trees is, in many respects, overwhelming. Even in the simplest cases, one is confronted with an infinite set of available decorations for the vertices of the rooted trees, originating in the infinite number of primitive divergent diagrams appearing in the underlying theory. It is rather fortunate then that the considerably simpler algebra of rooted trees with a single decoration seems to capture many of the features of realistic theories. It is for this reason that it has been studied extensively, as a first step towards an understanding of the full theory. Of primary importance, given their rôle in renormalization theory, is the study of the primitive elements of the above Hopf algebra. 
These correspond to sums of products of diagrams with the property that their renormalization involves a single subtraction. In Ref. [3], an ansatz is presented for a (conjectured) infinite family of such elements, corresponding to the ladder generators of the algebra, i.e., to trees whose every vertex has fertility at most one. Furthermore, dealing with the general case, a set of vertex-increasing operators is constructed that generates new primitive elements from known ones. As the number of primitive elements increases rapidly with increasing number of vertices, this approach necessitates the introduction of new operators in each step, a task that has not yet been systematized. Our motivation in this paper is two-fold. On a general, methodological level, we argue that the above algebraic/combinatoric approach, with all its multiple successes, should nevertheless be complemented by a differential geometric one, which, we feel, has not been sufficiently considered in the literature. On a second, more concrete level, we provide support for our claim, by showing how a simple application of the inverse Poincaré lemma reduces the search for primitive elements to that of closed, left invariant (LI) 1-forms on G. For the case of the ladder primitives, we give a simple generating formula that identifies them with the Schur polynomials. Our discussion uses the normal coordinates on the group, a choice that leads naturally to a concept of k-primitiveness, associated with the lower central series of the dual Lie algebra – we prove that this coincides with the k-primitiveness introduced in Ref. [3]. We discuss the rôle of the new


coordinates in renormalization, using the toy model realization of Ref. [10], while also commenting on similar results obtained for the more realistic heavy quark model of [2].

2. Differential Geometry à la Hopf

We will be dealing with differential geometric concepts expressed in Hopf algebraic terms. We opt for this formulation having in mind the transcription of our results for the non-commutative case – Hopf algebras are ideally suited to this task. We start by providing a short dictionary between the two languages and establish the notation, assuming nevertheless familiarity with the basic definitions. Two algebras will be of main interest to us: on the one hand we have the (commutative, non-cocommutative) algebra $\mathcal{A}$ of functions on a (possibly infinite dimensional) group manifold, generated by $\{\phi^A\}$, with $A$ ranging in an index set – we denote by $a, b, \dots$ general elements of $\mathcal{A}$. On the other hand, we have the (non-commutative, cocommutative) universal enveloping algebra $\mathcal{U}$ of the Lie algebra of the group. We actually work with a suitable completion of $\mathcal{U}$, so as to allow exponentials of its generators $Z_A$, which we identify with the points of the manifold¹ – we denote by $x, y, \dots$ general elements of $\mathcal{U}$ (we use $g, g', \dots$ if we refer to group elements in particular). Both algebras are Hopf algebras. For $\mathcal{A}$, the coproduct $\Delta(a) \equiv a_{(1)} \otimes a_{(2)}$ codifies left and right translations,
\[ L_g^*(a)(\cdot) = a_{(1)}(g)\, a_{(2)}(\cdot), \tag{1} \]
and similarly for right translations. For $\mathcal{U}$, it expresses Leibniz’s rule, $\Delta(Z) = Z \otimes 1 + 1 \otimes Z$, for the left-invariant generator $Z$. The two Hopf algebras are dual, via the inner product (also called pairing)
\[ \langle \cdot\,, \cdot \rangle : \mathcal{U} \otimes \mathcal{A} \to \mathbb{C}, \qquad x \otimes a \mapsto \langle x, a \rangle, \tag{2} \]
which, when $x$ stands for a generator $Z$, amounts to taking the derivative of $a$ along $x$ and evaluating it at the identity. For $x = g$, the above definition produces a Taylor series expansion of $a$ at the identity which gives, for $a$ analytic, the value $a(g)$ of $a$ at the point $g$. The coproduct in $\mathcal{A}$ is dual to the product in $\mathcal{U}$ via
\[ \langle xy, a \rangle = \langle x \otimes y, a_{(1)} \otimes a_{(2)} \rangle \tag{3} \]
and vice-versa. We usually work with dual bases, so that $Z_A$ only gives 1 when paired with $\phi^A$, while its inner product with all other $\phi$’s, as well as with all products of $\phi$’s, vanishes. Given a Poincaré–Birkhoff–Witt basis $\{f^i\}$ for $\mathcal{A}$,
\[ \{f^i\} = \{1, \phi^A, \phi^A \phi^B, \dots\}, \tag{4} \]
one can build a dual basis $\{e_i\}$ for the entire $\mathcal{U}$ by adjoining to the above $Z$’s polynomials in them, $\{e_i\} = \{1, Z_A, \text{quadratic}, \text{cubic}, \dots\}$, with $\langle e_i, f^j \rangle = \delta_i^j$ – this, in general, involves a non-trivial calculation. To every element $a$ of $\mathcal{A}$ we can associate a LI 1-form $\Pi_a$, given by
\[ \Pi_a = S(a_{(1)})\, da_{(2)}, \tag{5} \]
$d$ being the exterior derivative and $S$ the antipode in $\mathcal{A}$. $\Pi$ is linear, while on products it gives
\[ \Pi_{ab} = \Pi_a\, \epsilon(b) + \epsilon(a)\, \Pi_b, \qquad \mathcal{A} \text{ commutative}, \tag{6} \]
where $\epsilon$ is the counit in $\mathcal{A}$. We take all generators $\phi^T$ of $\mathcal{A}$ to be counitless, i.e., we choose functions that vanish at the identity of the group, except for the unit function $1_{\mathcal{A}}$ (which we often write as just 1). This implies that $\Pi$ only returns a non-zero result when applied to the generators and vanishes on all products, as well as on $1_{\mathcal{A}}$. The Maurer–Cartan (MC) equations take the form
\[ d\Pi_a = -\Pi_{a_{(1)}} \Pi_{a_{(2)}}. \tag{7} \]
Using (6), one sees that only the bilinear part of the coproduct contributes to the MC equations.

¹ The particular group we deal with in Sect. 3 is non-compact and infinite dimensional. Nevertheless, in this paper, we only consider elements that correspond to exponentials of linear combinations of the generators. For a readable account of what we might be missing in doing so, see Ref. [12].

3. The Hopf Algebra of Rooted Trees and Its Dual

3.1. Functions. We specialize the general considerations of the previous section to the Connes–Kreimer algebra of renormalization. For a detailed exposition we refer the reader to [10, 6, 8] and references therein; we give here only a brief account of the basic definitions and some illustrative examples. $\mathcal{A}$ is now the Hopf algebra $\mathcal{H}_R$ of functions generated by $\phi^T$, where $T$ is a rooted tree. This means that the group manifold $G$ is, in this case, infinite dimensional, with one dimension for every rooted tree – the $\phi$’s are coordinate functions on this manifold. The group law is encoded in the coproduct
\[ \Delta(\phi^T) = \phi^T \otimes 1 + 1 \otimes \phi^T + \sum_{\text{cuts } C} \phi^{P^C(T)} \otimes \phi^{R^C(T)}. \tag{8} \]

The sum in the above definition is over admissible cuts, i.e., cuts that may consist of more than one single-edge (simple) cut but such that there is no more than one simple cut on any path from the root downwards. $R^C(T)$ is the part that is left containing the root of $T$ while $P^C(T)$ is the product of all branches cut. Writing $\bullet$ for the single vertex and $[t_1 \cdots t_k]$ for the tree whose root carries the branches $t_1, \dots, t_k$, we have, e.g.,
\[ \Delta([\bullet\bullet]) = [\bullet\bullet] \otimes 1 + 1 \otimes [\bullet\bullet] + 2\, \bullet \otimes [\bullet] + \bullet\bullet \otimes \bullet, \tag{9} \]
where we let a tree $T$ itself denote the corresponding function $\phi^T$, a convention freely used in the rest of the paper. The factor 2 on the r.h.s. appears because there are two possible cuts on $[\bullet\bullet]$ generating the corresponding term. A convenient way to recast (8) as a single sum is to introduce a full and an empty cut, above and below any tree $T$ respectively:

[Diagram (10): a tree with the full cut drawn above its root and the empty cut drawn below its leaves.]

We rewrite (8) in the form
\[ \Delta(\phi^T) = \sum_{\text{cuts } C'} \phi^{P^{C'}(T)} \otimes \phi^{R^{C'}(T)}, \tag{11} \]
where the above two extra cuts, included in $C'$, produce the primitive part of the coproduct. Notice that $\Delta$ respects the grading given by the number $v(T)$ of vertices of a tree $T$. We call this the v-degree of $\phi^T$, denote it by $\deg_v(\phi^T)$, and extend it to monomials as the sum of the v-degrees of the factors. The polynomial degree will be called p-degree to avoid confusion – it is obviously not respected by the coproduct. We will use the notation $\mathcal{A}^{(n)}_i$ for the subspace of $\mathcal{A}$ of v-degree $n$ and p-degree $i$, e.g., $\mathcal{A}^{(n)}_1$ is the linear span of the generators with $n$ vertices.

3.2. Vector fields. The rôle of $\mathcal{U}$ is now played by $\mathcal{H}_R^*$, generated by $\{Z_T\}$, with $T$ a rooted tree, and we take the $Z$’s dual to the $\phi$’s, in the sense of the previous section. $Z_T$ is a left invariant vector field on $G$. The Lie algebra of such vector fields is found by computing, using (3), the pairing of $Z_A Z_B - Z_B Z_A$ with all basis functions $\{f^i\}$.

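Example 1. Computation of $[Z_{\bullet}, Z_{[\bullet]}]$. Here $\bullet$ denotes the single vertex and $[t_1 \cdots t_k]$ the tree whose root carries the branches $t_1, \dots, t_k$. We have
\[
\begin{aligned}
\tilde{\Delta}([[\bullet]]) &= [\bullet] \otimes \bullet + \bullet \otimes [\bullet], \\
\tilde{\Delta}([\bullet\bullet]) &= 2\, \bullet \otimes [\bullet] + \bullet\bullet \otimes \bullet, \\
\tilde{\Delta}(\bullet\, [\bullet]) &= \bullet \otimes [\bullet] + [\bullet] \otimes \bullet + \bullet\bullet \otimes \bullet + \bullet \otimes \bullet\bullet,
\end{aligned} \tag{12}
\]
where $\tilde{\Delta}(\phi^T) \equiv \Delta(\phi^T) - \phi^T \otimes 1 - 1 \otimes \phi^T$. These are the only functions that contain the term $\bullet \otimes [\bullet]$ in their coproduct. We find therefore, using (3),
\[ \langle Z_{\bullet} Z_{[\bullet]}, [[\bullet]] \rangle = 1, \qquad \langle Z_{\bullet} Z_{[\bullet]}, [\bullet\bullet] \rangle = 2, \qquad \langle Z_{\bullet} Z_{[\bullet]}, \bullet\, [\bullet] \rangle = 1. \tag{13} \]
Similarly, one computes
\[ \langle Z_{[\bullet]} Z_{\bullet}, [[\bullet]] \rangle = 1, \qquad \langle Z_{[\bullet]} Z_{\bullet}, \bullet\, [\bullet] \rangle = 1, \tag{14} \]
the pairings with all other functions being zero. It follows that the only non-zero pairing of the commutator is
\[ \langle [Z_{\bullet}, Z_{[\bullet]}], [\bullet\bullet] \rangle = 2. \tag{15} \]
But the element $2 Z_{[\bullet\bullet]}$ of $\mathcal{U}$ has exactly the same pairings; therefore, in order for the inner product between $\mathcal{U}$ and $\mathcal{A}$ to be non-degenerate, one must set $[Z_{\bullet}, Z_{[\bullet]}] = 2 Z_{[\bullet\bullet]}$.

Proceeding along these lines, one arrives at the general expression [7]
\[ [Z_{T_1}, Z_{T_2}] = \sum_T \bigl( n^{T}_{T_1 T_2} - n^{T}_{T_2 T_1} \bigr) Z_T \equiv f^{T}_{T_1 T_2} Z_T, \tag{16} \]
where $n^{T}_{T_1 T_2}$ is the number of simple cuts on $T$ that produce $T_1$, $T_2$, with $T_2$ containing the root of $T$ (denoted by $n(T_1, T_2, T)$ in [6]) and the last equation defines the structure constants $f^{T}_{T_1 T_2}$ of the Lie algebra. We introduce, following [7], a $*$-operation among the $Z$’s, defined by
\[ Z_{T_1} * Z_{T_2} = n^{T}_{T_1 T_2} Z_T. \tag{17} \]
Notice that this is not the product in $\mathcal{U}$ but, nevertheless, it gives correctly the commutator when antisymmetrized (cf. (16)). The above Lie bracket conserves the number of vertices.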
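The cut-counting behind (8), (16) and Example 1 can be checked mechanically. The following is a minimal sketch, not taken from the paper: it assumes an encoding of a rooted tree as the sorted tuple of its child subtrees, and all function names are hypothetical. It enumerates admissible cuts, tabulates the reduced coproduct, and computes the structure constants as differences of simple-cut counts.

```python
from collections import Counter
from functools import lru_cache
from itertools import product

# Assumed encoding (not from the paper): a rooted tree is the sorted tuple of
# its child subtrees, so () is the single vertex and ((),) the 2-vertex ladder.
def tree(*children):
    return tuple(sorted(children))

def size(t):
    return 1 + sum(size(c) for c in t)

def cuts(t):
    """Yield (P, R) for each admissible cut of t, the empty cut included.

    P is the sorted tuple of branches cut off, R the part containing the root.
    """
    options = []
    for child in t:
        # Either cut the edge above `child` (the whole branch is pruned) or
        # keep that edge and cut admissibly inside `child`.
        options.append([((child,), None)] + list(cuts(child)))
    for choice in product(*options):
        pruned, kept = [], []
        for p, r in choice:
            pruned.extend(p)
            if r is not None:
                kept.append(r)
        yield tuple(sorted(pruned)), tree(*kept)

def reduced_coproduct(t):
    """Multiplicities of P (x) R over the non-empty admissible cuts, as in (8)."""
    return dict(Counter((p, r) for p, r in cuts(t) if p))

@lru_cache(maxsize=None)
def forests(k):
    """All forests (sorted tuples of trees) with k vertices in total."""
    if k == 0:
        return frozenset({()})
    out = set()
    for m in range(1, k + 1):
        for t in forests(m - 1):          # a tree with m vertices
            for rest in forests(k - m):
                out.add(tuple(sorted((t,) + rest)))
    return frozenset(out)

def n(T, T1, T2):
    """Number of simple cuts on T pruning T1 and leaving T2 with the root."""
    return sum(1 for p, r in cuts(T) if p == (T1,) and r == T2)

def bracket(T1, T2):
    """Structure constants f^T_{T1 T2} of (16), as a dict {T: coefficient}."""
    out = {}
    # The children of a tree with v vertices form a forest with v - 1 vertices.
    for T in forests(size(T1) + size(T2) - 1):
        f = n(T, T1, T2) - n(T, T2, T1)
        if f:
            out[T] = f
    return out

dot, ladder2, cherry = tree(), tree(tree()), tree(tree(), tree())
print(reduced_coproduct(cherry))   # the two non-primitive terms of (9)
print(bracket(dot, ladder2))       # the commutator of Example 1
```

Forbidding further cuts below an edge that has already been cut is what enforces the "at most one simple cut per path" condition; the full cut is left out because it only contributes the primitive term $\phi^T \otimes 1$.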


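3.3. 1-forms. We turn now to LI 1-forms. Starting from (5) and using the particular form of the coproduct in (8), we find
\[ \Pi_{\phi^T} = \sum_{C'} \phi^{S(P^{C'}(T))}\, d\phi^{R^{C'}(T)} = d\phi^T + \sum_{C} \phi^{S(P^{C}(T))}\, d\phi^{R^{C}(T)}. \tag{18} \]
For the MC equations we may use directly (7) and the comment that follows it to find
\[ d\Pi_{\phi^T} = -\sum_{\text{simple } C} \Pi_{\phi^{P^C(T)}}\, \Pi_{\phi^{R^C(T)}}. \tag{19} \]
The restriction to simple cuts is possible since cuts that involve more than one edge produce non-linear terms in the first tensor factor of the coproduct and these are annihilated by $\Pi$. This is probably the easiest way to derive the structure constants.

Example 2. Maurer–Cartan equation for $\Pi_{[\bullet\bullet]}$. Here $\bullet$ denotes the single vertex and $[t_1 \cdots t_k]$ the tree whose root carries the branches $t_1, \dots, t_k$; a tree stands for the corresponding function. Using (18) we find
\[
\begin{aligned}
\Pi_{\bullet} &= d\bullet, \\
\Pi_{[\bullet]} &= d[\bullet] - \bullet\, d\bullet, \\
\Pi_{[\bullet\bullet]} &= d[\bullet\bullet] - 2\, \bullet\, d[\bullet] + \bullet\bullet\, d\bullet.
\end{aligned} \tag{20}
\]
Direct application of $d$ to the above expression for $\Pi_{[\bullet\bullet]}$, or use of (19), gives
\[ d\Pi_{[\bullet\bullet]} = -2\, \Pi_{\bullet}\, \Pi_{[\bullet]}, \tag{21} \]
in agreement with the commutator $[Z_{\bullet}, Z_{[\bullet]}] = 2 Z_{[\bullet\bullet]}$ of Ex. 1.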

General vector and 1-form fields are obtained as linear combinations of the above, with coefficients in $\mathcal{A}$.

4. Normal Coordinates

4.1. A new basis. We introduce new coordinates $\{\psi^A\}$ on $G$, defined by
\[ \langle g, \psi^A \rangle = \alpha^A, \qquad \text{where } g = e^{\alpha^A Z_A}, \tag{22} \]
i.e., the $\psi$’s are normal coordinates centered at the origin and, like the $\phi$’s, are indexed by rooted trees. Of fundamental importance in the sequel will be the canonical element $C$ (see, e.g., [4]), given by
\[ C = e_i \otimes f^i = e^{Z_A \otimes \psi^A}. \tag{23} \]
$\{e_i\}$ and $\{f^i\}$ above are dual bases of $\mathcal{U}$ and $\mathcal{A}$ respectively (see (4)). In contrast with (4), we fix now the $\{e_i\}$ to be $\{1, Z_A, Z_A Z_B, \dots\}$ and define the $\psi$’s by the second equality above (the tensor product sign ensures that the $Z$’s do not act on the $\psi$’s). $C$ may be regarded as an “indefinite group element” – when the $\psi$’s get evaluated on some specific point $g_0$ of the group manifold, $C$ becomes $g_0$. One may also view $C$ as an “indefinite function” on the group – when the $Z$’s get evaluated on some particular (analytic) $\phi_0$, the resulting Taylor series delivers $\phi_0$, i.e.,
\[ \langle e^{Z_A \otimes \psi^A}, \mathrm{id} \otimes g_0 \rangle = g_0, \qquad \langle e^{Z_A \otimes \psi^A}, \phi_0 \otimes \mathrm{id} \rangle = \phi_0. \tag{24} \]
In the above, $g_0$, $\phi_0$ stand for any element in the corresponding universal enveloping algebra, not just the generators.

The second of (24) gives the relation between the two linear bases $f^i_{(\phi)}$ and $f^i_{(\psi)}$, generated by the $\phi$’s and the $\psi$’s respectively. Indeed, taking $\phi_0 = \phi^A$ and expanding the exponential we find
\[
\begin{aligned}
\phi^A &= \sum_{m=0}^{\infty} \frac{1}{m!} \langle Z_{B_1} \dots Z_{B_m}, \phi^A \rangle\, \psi^{B_1} \dots \psi^{B_m} \\
&= \psi^A + \frac{1}{2} \langle Z_{B_1} Z_{B_2}, \phi^A \rangle\, \psi^{B_1} \psi^{B_2} + \dots .
\end{aligned} \tag{25}
\]

Lemma 1. The change of linear basis in $\mathcal{A}$ generated by (25) is invertible.

Proof. Notice that the linear part of $\phi^A(\psi)$ is $\psi^A$ and also that the above expansion preserves the v-degree. We choose a linear basis in $\mathcal{A}$ with the following ordering (with $\bullet$ the single vertex and $[t_1 \cdots t_k]$ the tree whose root carries the branches $t_1, \dots, t_k$)
\[ \{ \underbrace{\phi^{\bullet}}_{v=1},\ \underbrace{\phi^{[\bullet]},\ \phi^{\bullet}\phi^{\bullet}}_{v=2},\ \underbrace{\phi^{[[\bullet]]},\ \phi^{[\bullet\bullet]},\ \phi^{\bullet}\phi^{[\bullet]},\ (\phi^{\bullet})^3}_{v=3},\ \dots \}, \tag{26} \]
namely, in blocks of increasing v-degree and, within each block, non-decreasing p-degree. The above remarks then show that the matrix $A$, defined by
\[ f^i_{(\phi)} = A^i{}_j\, f^j_{(\psi)}, \tag{27} \]
where $\{f^i_{(\psi)}\}$ is also ordered as in (26), is upper triangular, with units along the diagonal, and hence invertible.

Notice that $A$ is in block-diagonal form, with each block $A_v$ acting on $\mathcal{A}^{(v)}$, $v = 1, 2, \dots$. The computation of $\phi^A(\psi)$, via (25), reduces essentially to the evaluation of the inner product of $\phi^A$ with monomials in the $Z$’s – this is facilitated by the following

Lemma 2. The inner product $\langle Z_{B_1} \dots Z_{B_m}, \phi^A \rangle$ is given by
\[ \langle Z_{B_1} \dots Z_{B_m}, \phi^A \rangle = \langle Z_{B_1} * \dots * Z_{B_m}, \phi^A \rangle = n^{A}_{B_1 \dots B_m}, \tag{28} \]
where
\[ n^{A}_{B_1 \dots B_m} = n^{A}_{B_1 R_1}\, n^{R_1}_{B_2 R_2} \dots n^{R_{m-2}}_{B_{m-1} B_m} \tag{29} \]
($Z_{B_1} * \dots * Z_{B_m}$ above is computed starting from the right, e.g., $Z_{B_1} * Z_{B_2} * Z_{B_3} \equiv Z_{B_1} * (Z_{B_2} * Z_{B_3})$).
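As a small worked instance of (28)–(29), consider the three-vertex tree whose root carries two leaves. The bracket notation used here is an assumption of this sketch: $\bullet$ is the single vertex and $[t_1 \cdots t_k]$ the tree whose root carries the branches $t_1, \dots, t_k$. For $m = 3$ and $B_1 = B_2 = B_3 = \bullet$, the intermediate tree $R_1$ must have two vertices, so only $R_1 = [\bullet]$ contributes.

```latex
\begin{align*}
  Z_{\bullet} * Z_{\bullet}
    &= n^{[\bullet]}_{\bullet\,\bullet}\, Z_{[\bullet]} = Z_{[\bullet]}
    && \text{(one simple cut on } [\bullet]\text{)} \\
  \langle Z_{\bullet} Z_{\bullet} Z_{\bullet},\, \phi^{[\bullet\bullet]} \rangle
    &= n^{[\bullet\bullet]}_{\bullet\,[\bullet]}\; n^{[\bullet]}_{\bullet\,\bullet}
     = 2 \cdot 1 = 2
    && \text{(two simple cuts on } [\bullet\bullet]\text{)} \\
  \frac{1}{3!}\,
  \langle Z_{\bullet} Z_{\bullet} Z_{\bullet},\, \phi^{[\bullet\bullet]} \rangle\,
  (\psi^{\bullet})^3
    &= \frac{1}{3}\,(\psi^{\bullet})^3
    && \text{(cubic term of } \phi^{[\bullet\bullet]} \text{ in (25))}
\end{align*}
```

The same bookkeeping with $\phi^{[[\bullet]]}$ in place of $\phi^{[\bullet\bullet]}$ gives $n^{[[\bullet]]}_{\bullet\,[\bullet]} = 1$ and hence the coefficient $1/6$.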


Proof. We have
\[ \langle Z_{B_1} \dots Z_{B_m}, \phi^A \rangle = \langle Z_{B_1} \otimes \dots \otimes Z_{B_m}, \Delta^{m-1}(\phi^A) \rangle. \tag{30} \]
In the above inner product, only the $m$-linear terms in $\Delta^{m-1}(\phi^A)$ contribute, since the $Z$’s vanish on products and the unit function. One particular way of evaluating the $(m-1)$-fold coproduct is to apply $\Delta$ always on the rightmost tensor factor. It is then clear that, in this case, we may instead apply $\Delta_{\rm lin}$, the part of $\Delta$ that is linear in both tensor factors, since $\prod_{j=1}^{m-1} (\mathrm{id}^{\otimes (j-1)} \otimes \Delta)(\phi^A)$ and $\prod_{j=1}^{m-1} (\mathrm{id}^{\otimes (j-1)} \otimes \Delta_{\rm lin})(\phi^A)$ only differ by terms containing products of the $\phi$’s or units (this is only true if $\Delta_{\rm lin}$ is applied in the rightmost factor). Notice now that the $*$-product of the $Z$’s is dual to $\Delta_{\rm lin}$,
\[ \langle Z_{B_1} * Z_{B_2}, \phi^A \rangle = \langle Z_{B_1} \otimes Z_{B_2}, \Delta_{\rm lin}(\phi^A) \rangle. \tag{31} \]
Repeated application of this equation and use of the definition of $*$, Eq. (17), completes the proof.

A concise way to express the relation between the two sets of generators is via the $*$-exponential ($x \in \mathcal{U}_1$)
\[ e_*^{x} \equiv \sum_{i=0}^{\infty} \frac{1}{i!}\, x^{*i} = \sum_{i=0}^{\infty} \frac{1}{i!}\, \underbrace{x * \dots * x}_{i \text{ factors}}. \tag{32} \]
Combining (25) and (28) we find
\[ e_*^{Z_A \otimes \psi^A} = Z_B \otimes \phi^B, \tag{33} \]
where the convention $(Z_A \otimes \psi^A) * (Z_B \otimes \psi^B) = Z_A * Z_B \otimes \psi^A \psi^B$ is understood and the sum on the r.h.s. starts with $1 \otimes 1$.

4.2. The Hopf structure. We derive now the Hopf data for the new basis. A standard property of $C$ is
\[ (\Delta \otimes \mathrm{id})\, C = C_{13} C_{23}, \qquad (\mathrm{id} \otimes \Delta)\, C = C_{12} C_{13}, \tag{34} \]
where, e.g., $C_{13} \equiv e^{Z_A \otimes 1 \otimes \psi^A}$ – this is just the product-coproduct duality in (3). The second of (34) permits the calculation of the coproduct of the $\psi$’s by applying the Baker–Campbell–Hausdorff (BCH) formula to the product on its r.h.s.; $\Delta(\psi^A)$ is the coefficient of $Z_A$ in the resulting single exponential,
\[
\begin{aligned}
\exp\bigl( Z_A \otimes \Delta(\psi^A) \bigr)
&= \exp\bigl( Z_A \otimes \psi^A \otimes 1 \bigr) \exp\bigl( Z_B \otimes 1 \otimes \psi^B \bigr) \\
&= \exp\Bigl( Z_A \otimes \psi^A \otimes 1 + Z_B \otimes 1 \otimes \psi^B + \frac{1}{2} [Z_A, Z_B] \otimes \psi^A \otimes \psi^B + \dots \Bigr) \\
&= \exp\Bigl( Z_A \otimes \bigl( \psi^A \otimes 1 + 1 \otimes \psi^A + \frac{1}{2} f^{A}_{B_1 B_2}\, \psi^{B_1} \otimes \psi^{B_2} + \dots \bigr) \Bigr),
\end{aligned} \tag{35}
\]
so that
\[ \Delta(\psi^A) = \psi^A \otimes 1 + 1 \otimes \psi^A + \frac{1}{2} f^{A}_{B_1 B_2}\, \psi^{B_1} \otimes \psi^{B_2} + \dots . \tag{36} \]
Higher terms in the coproduct can be computed by using a recursion relation for the BCH formula (see, e.g., Sect. 16 of [1]). The counit of all $\psi^A$ vanishes. Although $\Delta(\psi^A)$ can be complicated, $S(\psi^A)$ never is. Using $\langle S(g), \psi^A \rangle = \langle g, S(\psi^A) \rangle$ and the fact that $S(g) = g^{-1}$, it is easily inferred that
\[ S(\psi^A) = -\psi^A, \tag{37} \]

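which extends as $S(p_r(\psi)) = (-1)^r p_r(\psi)$ on homogeneous polynomials of p-degree $r$. We see the first of the many advantages of working in the $\psi$-basis: the antipode is diagonal.

Example 3. Computation of $\psi^{(n)}$, $n \le 4$. Trees are written with $\bullet$ for the single vertex and $[t_1 \cdots t_k]$ for the tree whose root carries the branches $t_1, \dots, t_k$. A straightforward application of (25) gives
\[
\begin{aligned}
\phi^{\bullet} &= \psi^{\bullet} \langle Z_{\bullet}, \phi^{\bullet} \rangle = \psi^{\bullet}, \\
\phi^{[\bullet]} &= \psi^{[\bullet]} \langle Z_{[\bullet]}, \phi^{[\bullet]} \rangle + \tfrac{1}{2} \psi^{\bullet} \psi^{\bullet} \langle Z_{\bullet} Z_{\bullet}, \phi^{[\bullet]} \rangle = \psi^{[\bullet]} + \tfrac{1}{2} (\psi^{\bullet})^2, \\
\phi^{[[\bullet]]} &= \psi^{[[\bullet]]} + \psi^{\bullet} \psi^{[\bullet]} + \tfrac{1}{6} (\psi^{\bullet})^3, \\
\phi^{[\bullet\bullet]} &= \psi^{[\bullet\bullet]} + \psi^{\bullet} \psi^{[\bullet]} + \tfrac{1}{3} (\psi^{\bullet})^3, \\
\phi^{[[[\bullet]]]} &= \psi^{[[[\bullet]]]} + \psi^{\bullet} \psi^{[[\bullet]]} + \tfrac{1}{2} (\psi^{[\bullet]})^2 + \tfrac{1}{2} (\psi^{\bullet})^2 \psi^{[\bullet]} + \tfrac{1}{24} (\psi^{\bullet})^4, \\
\phi^{[[\bullet\bullet]]} &= \psi^{[[\bullet\bullet]]} + \psi^{\bullet} \psi^{[[\bullet]]} + \tfrac{1}{2} \psi^{\bullet} \psi^{[\bullet\bullet]} + \tfrac{2}{3} (\psi^{\bullet})^2 \psi^{[\bullet]} + \tfrac{1}{12} (\psi^{\bullet})^4, \\
\phi^{[\bullet[\bullet]]} &= \psi^{[\bullet[\bullet]]} + \tfrac{1}{2} \psi^{\bullet} \psi^{[[\bullet]]} + \tfrac{1}{2} \psi^{\bullet} \psi^{[\bullet\bullet]} + \tfrac{1}{2} (\psi^{[\bullet]})^2 + \tfrac{5}{6} (\psi^{\bullet})^2 \psi^{[\bullet]} + \tfrac{1}{8} (\psi^{\bullet})^4, \\
\phi^{[\bullet\bullet\bullet]} &= \psi^{[\bullet\bullet\bullet]} + \tfrac{3}{2} \psi^{\bullet} \psi^{[\bullet\bullet]} + (\psi^{\bullet})^2 \psi^{[\bullet]} + \tfrac{1}{4} (\psi^{\bullet})^4.
\end{aligned} \tag{38}
\]
Inverting the above expressions we find
\[
\begin{aligned}
\psi^{\bullet} &= \phi^{\bullet}, \\
\psi^{[\bullet]} &= \phi^{[\bullet]} - \tfrac{1}{2} (\phi^{\bullet})^2, \\
\psi^{[[\bullet]]} &= \phi^{[[\bullet]]} - \phi^{\bullet} \phi^{[\bullet]} + \tfrac{1}{3} (\phi^{\bullet})^3, \\
\psi^{[\bullet\bullet]} &= \phi^{[\bullet\bullet]} - \phi^{\bullet} \phi^{[\bullet]} + \tfrac{1}{6} (\phi^{\bullet})^3, \\
\psi^{[[[\bullet]]]} &= \phi^{[[[\bullet]]]} - \phi^{\bullet} \phi^{[[\bullet]]} - \tfrac{1}{2} (\phi^{[\bullet]})^2 + (\phi^{\bullet})^2 \phi^{[\bullet]} - \tfrac{1}{4} (\phi^{\bullet})^4, \\
\psi^{[[\bullet\bullet]]} &= \phi^{[[\bullet\bullet]]} - \phi^{\bullet} \phi^{[[\bullet]]} - \tfrac{1}{2} \phi^{\bullet} \phi^{[\bullet\bullet]} + \tfrac{5}{6} (\phi^{\bullet})^2 \phi^{[\bullet]} - \tfrac{1}{6} (\phi^{\bullet})^4, \\
\psi^{[\bullet[\bullet]]} &= \phi^{[\bullet[\bullet]]} - \tfrac{1}{2} \phi^{\bullet} \phi^{[[\bullet]]} - \tfrac{1}{2} \phi^{\bullet} \phi^{[\bullet\bullet]} + \tfrac{2}{3} (\phi^{\bullet})^2 \phi^{[\bullet]} - \tfrac{1}{2} (\phi^{[\bullet]})^2 - \tfrac{1}{12} (\phi^{\bullet})^4, \\
\psi^{[\bullet\bullet\bullet]} &= \phi^{[\bullet\bullet\bullet]} - \tfrac{3}{2} \phi^{\bullet} \phi^{[\bullet\bullet]} + \tfrac{1}{2} (\phi^{\bullet})^2 \phi^{[\bullet]}.
\end{aligned} \tag{39}
\]
Concerning the coproduct, Eq. (36) shows that all ladder $\psi$’s are primitive. For the rest of the $\psi$’s, we get (omitting the primitive part)
\[
\begin{aligned}
\tilde{\Delta} \psi^{[\bullet\bullet]} &= \psi^{\bullet} \otimes \psi^{[\bullet]} - \psi^{[\bullet]} \otimes \psi^{\bullet}, \\
\tilde{\Delta} \psi^{[[\bullet\bullet]]} &= \psi^{\bullet} \otimes \psi^{[[\bullet]]} - \psi^{[[\bullet]]} \otimes \psi^{\bullet} + \tfrac{1}{2} \psi^{[\bullet\bullet]} \otimes \psi^{\bullet} - \tfrac{1}{2} \psi^{\bullet} \otimes \psi^{[\bullet\bullet]} \\
&\quad + \tfrac{1}{6} \psi^{\bullet} \psi^{[\bullet]} \otimes \psi^{\bullet} + \tfrac{1}{6} \psi^{\bullet} \otimes \psi^{\bullet} \psi^{[\bullet]} - \tfrac{1}{6} (\psi^{\bullet})^2 \otimes \psi^{[\bullet]} - \tfrac{1}{6} \psi^{[\bullet]} \otimes (\psi^{\bullet})^2, \\
\tilde{\Delta} \psi^{[\bullet[\bullet]]} &= \tfrac{1}{2} \psi^{\bullet} \otimes \psi^{[[\bullet]]} - \tfrac{1}{2} \psi^{[[\bullet]]} \otimes \psi^{\bullet} + \tfrac{1}{2} \psi^{\bullet} \otimes \psi^{[\bullet\bullet]} - \tfrac{1}{2} \psi^{[\bullet\bullet]} \otimes \psi^{\bullet} \\
&\quad - \tfrac{1}{6} \psi^{\bullet} \otimes \psi^{\bullet} \psi^{[\bullet]} - \tfrac{1}{6} \psi^{\bullet} \psi^{[\bullet]} \otimes \psi^{\bullet} + \tfrac{1}{6} (\psi^{\bullet})^2 \otimes \psi^{[\bullet]} + \tfrac{1}{6} \psi^{[\bullet]} \otimes (\psi^{\bullet})^2, \\
\tilde{\Delta} \psi^{[\bullet\bullet\bullet]} &= \tfrac{3}{2} \psi^{\bullet} \otimes \psi^{[\bullet\bullet]} - \tfrac{3}{2} \psi^{[\bullet\bullet]} \otimes \psi^{\bullet} - \tfrac{1}{2} \psi^{\bullet} \otimes \psi^{\bullet} \psi^{[\bullet]} - \tfrac{1}{2} \psi^{\bullet} \psi^{[\bullet]} \otimes \psi^{\bullet} \\
&\quad + \tfrac{1}{2} (\psi^{\bullet})^2 \otimes \psi^{[\bullet]} + \tfrac{1}{2} \psi^{[\bullet]} \otimes (\psi^{\bullet})^2.
\end{aligned} \tag{40}
\]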

One can easily verify that $S(\psi^A) = -\psi^A$.

5. Primitive Elements

We turn now to the study of the primitive elements of $\mathcal{A}$. These are of fundamental importance in any Hopf algebra, but acquire even more privileged status in our case, given their rôle in renormalization. Apart from this, they are also of interest in representation theory: given a primitive element $a \in \mathcal{A}$, $\Delta(a) = a \otimes 1_{\mathcal{A}} + 1_{\mathcal{A}} \otimes a$, one obtains a one-dimensional representation $\rho_a$ of $\mathcal{U}$ via
\[ \rho_a(x) \equiv \langle x, e^a \rangle. \tag{41} \]
Indeed, $e^a$ is group-like, $\Delta(e^a) = e^a \otimes e^a$, so that
\[ \rho_a(xy) \equiv \langle xy, e^a \rangle = \langle x \otimes y, e^a \otimes e^a \rangle = \rho_a(x)\, \rho_a(y). \tag{42} \]
Conversely, every one-dimensional representation of $\mathcal{U}$ is associated to some primitive element in $\mathcal{A}$. Primitive elements are typically rare, but the algebra of rooted trees is quite exceptional in this respect: there is an infinite number of them in $\mathcal{A}$, with a non-trivial index set. We start our discussion with the easiest case, that of the ladder generators, for which our Theorem 1 below supplies a complete answer. We then turn to the considerably more complicated general case which Theorem 2 reduces to the problem of finding all closed LI 1-forms on $G$.

5.1. Ladder generators. We consider the subalgebra $\mathcal{T}$ of $\mathcal{H}_R$ generated by the ladder generators $T_n$, where $n$ counts the number of vertices. Their coproduct is (with $T_0 \equiv 1$)
\[ \Delta(T_n) = \sum_{k=0}^{n} T_k \otimes T_{n-k}, \tag{43} \]
making $\mathcal{T}$ a sub-Hopf algebra of $\mathcal{H}_R$ (notice though that for $\phi$ not in $\mathcal{T}$, $\Delta(\phi)$ may involve terms in $\mathcal{T} \otimes \mathcal{T}$). Experimenting a little we find that, for the first few $n$’s, each $T_n$ gives rise to a primitive $P^{(n)}$. The general case is handled by the following

Theorem 1. To each ladder generator $T_n$, $n = 1, 2, \dots$, corresponds a primitive element $P^{(n)}$, with $T_n$ as its linear part, given by
\[ P^{(n)} = \frac{1}{n!} \left. \frac{\partial^n}{\partial x^n} \log \Bigl( \sum_{m=0}^{\infty} T_m x^m \Bigr) \right|_{x=0}. \tag{44} \]

Proof. Consider the algebra $\mathcal{F}$ of formal power series $f(x) = \sum_{n=0}^{\infty} c_n x^n$, $c_0 = 1$, with the usual product. Define a basis $\{\xi_n,\ n = 0, 1, 2, \dots\}$ of $\mathcal{F}^*$, the dual of $\mathcal{F}$, via
\[ \langle \xi_n, f(x) \rangle = c_n, \tag{45} \]
i.e., $\xi_n$ reads off the coefficient of $x^n$ in $f$ and $\xi_0 = 1$. For $f(x) = f'(x) f''(x)$ we have²
\[ f(x) = \sum_{n=0}^{\infty} c_n x^n, \qquad c_n = \sum_{k=0}^{n} c'_k\, c''_{n-k}, \tag{46} \]
which implies the coproduct
\[ \Delta(\xi_n) = \sum_{k=0}^{n} \xi_k \otimes \xi_{n-k} \tag{47} \]
in $\mathcal{F}^*$. Endowing $\mathcal{F}^*$ with a commutative product, we arrive at the isomorphism $\mathcal{F}^* \cong \mathcal{T}$, as Hopf algebras, with $\xi_n \leftrightarrow T_n$. Define a new basis $\{\sigma_n,\ n = 0, 1, 2, \dots\}$ in $\mathcal{F}^*$ by
\[ \langle \sigma_r, f(x) \rangle = \tilde{c}_r, \qquad \text{with} \quad f(x) = e^{\sum_{r=1}^{\infty} \tilde{c}_r x^r} \tag{48} \]
and $\sigma_0 = 1$. Then
\[ f(x) = e^{\sum_{r=1}^{\infty} \tilde{c}_r x^r}, \qquad \text{with} \quad \tilde{c}_r = \tilde{c}'_r + \tilde{c}''_r, \tag{49} \]
implying the coproduct $\Delta(\sigma_r) = \sigma_r \otimes 1 + 1 \otimes \sigma_r$. The $\sigma$’s, under the above isomorphism, correspond to the $P^{(n)}$ in $\mathcal{T}$. Solving the equation
\[ e^{\sum_{r=1}^{\infty} P^{(r)} x^r} = \sum_{n=0}^{\infty} T_n x^n \tag{50} \]
for $P^{(r)}$, one arrives at (44).

² Notice that primes only distinguish functions here, they do not denote differentiation.

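We read off $P^{(n)}$, for the first few values of $n$, as the coefficient of $x^n$ in the Taylor series expansion
\[
\begin{aligned}
\log \Bigl( \sum_{n=0}^{\infty} T_n x^n \Bigr)
&= T_1 x + \Bigl( T_2 - \tfrac{1}{2} T_1^2 \Bigr) x^2 + \Bigl( T_3 - T_1 T_2 + \tfrac{1}{3} T_1^3 \Bigr) x^3 \\
&\quad + \Bigl( T_4 - T_1 T_3 - \tfrac{1}{2} T_2^2 + T_1^2 T_2 - \tfrac{1}{4} T_1^4 \Bigr) x^4 \\
&\quad + \Bigl( T_5 - T_1 T_4 - T_2 T_3 + T_1^2 T_3 + T_1 T_2^2 - T_1^3 T_2 + \tfrac{1}{5} T_1^5 \Bigr) x^5 \\
&\quad + \Bigl( T_6 - T_1 T_5 - T_2 T_4 - \tfrac{1}{2} T_3^2 + T_1^2 T_4 + 2 T_1 T_2 T_3 - T_1^3 T_3 \\
&\qquad\quad + \tfrac{1}{3} T_2^3 - \tfrac{3}{2} T_1^2 T_2^2 + T_1^4 T_2 - \tfrac{1}{6} T_1^6 \Bigr) x^6 + \dots .
\end{aligned} \tag{51}
\]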
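The expansion (51) can be reproduced with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: polynomials in $T_1, \dots, T_N$ are stored as dictionaries from exponent tuples to `Fraction` coefficients, the truncation order `N = 6` and all function names are choices made for this example, and the logarithm is expanded as $\log(1+u) = \sum_k (-1)^{k+1} u^k / k$ with $u = \sum_{n \ge 1} T_n x^n$.

```python
from fractions import Fraction

N = 6  # truncation order in x (an assumption made for this illustration)

def p_add(p, q, scale=Fraction(1)):
    """Return p + scale * q for polynomials stored as {exponent tuple: coefficient}."""
    out = dict(p)
    for mono, coeff in q.items():
        out[mono] = out.get(mono, 0) + scale * coeff
        if out[mono] == 0:
            del out[mono]
    return out

def p_mul(p, q):
    """Product of two polynomials in T_1, ..., T_N."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = tuple(a + b for a, b in zip(m1, m2))
            out[mono] = out.get(mono, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def s_mul(a, b):
    """Product of two series in x, truncated at order N (lists indexed by power of x)."""
    out = [dict() for _ in range(N + 1)]
    for i, pi in enumerate(a):
        for j, pj in enumerate(b):
            if i + j <= N and pi and pj:
                out[i + j] = p_add(out[i + j], p_mul(pi, pj))
    return out

def T(n):
    """The generator T_n as a polynomial."""
    mono = [0] * N
    mono[n - 1] = 1
    return {tuple(mono): Fraction(1)}

def ladder_primitives():
    """Coefficients of x^1, ..., x^N in log(1 + sum_n T_n x^n), cf. (44) and (51)."""
    u = [dict()] + [T(n) for n in range(1, N + 1)]
    log = [dict() for _ in range(N + 1)]
    power = [{(0,) * N: Fraction(1)}] + [dict() for _ in range(N)]  # u^0 = 1
    for k in range(1, N + 1):
        power = s_mul(power, u)                # u^k, truncated at order N
        sign = Fraction((-1) ** (k + 1), k)
        log = [p_add(l, pw, sign) for l, pw in zip(log, power)]
    return log[1:]

if __name__ == "__main__":
    for n, poly in enumerate(ladder_primitives(), start=1):
        print(n, poly)
```

Since $u$ has no constant term, $u^k$ starts at order $x^k$, so summing $k$ up to `N` already gives every coefficient through $x^N$ exactly.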

The polynomials P (n) (Ti ) are known as Schur polynomials. 5.2. The general case. Given a closed LI 1-form α on G, there exists a linear combination φ i of the generators φ A such that α = φ i . Applying the inverse Poincaré lemma, we may write (locally)

φ i = dψ i ,

(52)

for some function ψ i in A. Requiring additionally that ψ i vanish at the origin, (ψ i ) = 0, fixes the constant left arbitrary by (52) to zero. ψ i can be expressed in terms of the i of ψ i (φ) is φ i . But then φ’s. Since φ i reduces to dφ i at the origin, the linear part ψlin φ i = ψ i , since projects to the linear part. Comparing the r.h.s. of (52) with the

general expression for a LI 1-form, Eq. (5), we conclude that ψ i is primitive. Conversely, every primitive function ψ i gives rise to a closed LI 1-form, dψ i = ddψ i = 0 = dψ i . lin

Normal Coordinates and Primitive Elements in Hopf Algebra


Equation (7), and the comment that follows it, show that lin (φ i ) is symmetric under the interchange of its two tensor factors. This observation leads to a particularly simple way to identify primitive elements. One first looks for linear combinations φ i of the φ A with symmetric lin (φ i ) (notice that lin is given by simple cuts). The explicit expression for the corresponding primitive ψ i then is given by the standard formula for the (local) potential of a closed form. We find that the result is simplified considerably due to the particular form of the coproduct of the φ A , namely the linearity of (φ A ) in its second tensor factor.

Theorem 2. Given φ i ∈ A1 , such that dφ i = 0. Then the element ψ i of A, given by

ψ i = −+−1 ◦ S(φ i ),

(53)

is primitive and has φ i as its linear part (+ above is the p-degree operator for the φ’s, +(φ A1 . . . φ Ar ) = rφ A1 . . . φ Ar ). Proof. We apply the inverse Poincaré lemma to φ i . For a given v-degree n, only φ A of v-degree up to n enter in the formulas – we denote them collectively by x (e.g., S(φ)(x) denotes the standard expression of S(φ) in terms of the φ A while S(φ)(zx) denotes the same expression with every φ A multiplied by z). Consider the family of diffeomorphisms ϕt : x → (1 − t)x, 0 ≤ t ≤ 1. Then ϕ0∗ is the identity map while ϕ1∗ is the zero map. The corresponding velocity field is v =

$$\frac{d}{dt}\varphi_t(x) = -x \;\Rightarrow\; v(y, t) = -\frac{1}{1-t}\, y,$$

where y = ϕt (x). We have3 φ (x) = i

However, we find

d ∗ dt ϕt

ϕ0∗

φ (ϕ0 (x)) i

− ϕ1∗

φ (ϕ1 (x)) = i

0

dt

1

(54)

d ∗ ϕ φ i (y) . dt t

(55)

= ϕt∗ Lv = ϕt∗ (d iv + iv d) and, taking into account the closure of φ i , φ i (x) = d

0 1

dt ϕt∗ iv φ i (y) .

(56)

This is the inverse Poincaré lemma. We concentrate now on the action of iv on φ i (y). We have 1 i i iv = − φ i (y) = S(φ(1) ) dφ(2) (y). (57) y j i∂y j , 1−t

In this latter (implied) sum, all terms in the coproduct of φ i appear except the first one, φ i ⊗ 1, which is annihilated by d. Notice now that y j i∂y j dy i = y i . Since (φ i ) is linear in its second factor we conclude that

$$y^j\, i_{\partial_{y^j}}\, S(\phi^i_{(1)})\, d\phi^i_{(2)}(y) = S(\phi^i_{(1)})\, \phi^i_{(2)}(y) - S(\phi^i)(y) = -S(\phi^i)(y). \tag{58}$$

3 We ignore in the sequel the singularity of v at t = 1; it is easily shown to be harmless.


Substituting back into (56) and putting 1 − t ≡ z we find 1 dz φ i (x) = −d S(φ i )(zx), 0 z

(59)

which, upon performing the integration over z, gives φ i = −d+−1 ◦ S(φ i ). The remarks preceding the theorem complete the proof. 5.3. The lower central series and k-primitiveness. We extend here the notion of primitiveness to that of k-primitiveness. Our starting point is our BCH-based prescription for calculating the coproduct of the ψ’s, Eq. (36). Suppose we identify all generators Zi[1] of G that cannot be written as commutators (the Zi[1] are, in general, linear combinations of the ZA ). Then we may perform a linear change of basis in G and split the generators into two classes, one made up of the above Zi[1] and the other spanning the complement – we denote the latter by {Zi }. Writing the canonical element in the new basis, [1]

$$C = e^{Z_i^{[1]} \otimes \psi^i_{[1]} + Z_i \otimes \psi^i}, \tag{60}$$

we are led to the identification of the $\psi^i_{[1]}$ with the primitive ψ's. This is so since, in the BCH formula, the $Z_i^{[1]}$ are never produced by the commutators, so that the only contribution to $\Delta(\psi^i_{[1]})$ is the primitive part. Consider now the lower central series of G, consisting of the series of subspaces $G^{[1]}, G^{[2]}, \dots$. A particular Z in G belongs to $G^{[k]}$ if it can be written as a (k − 1)-nested commutator. This implies that if Z belongs to $G^{[k]}$, it also belongs to all $G^{[r]}$ with r < k. This is the standard definition of $G^{[k]}$; we actually need a slightly modified one, according to which Z belongs only to the $G^{[k]}$ with the maximum k. With this definition, $G^{[k]} \cap G^{[r]} = \emptyset$ whenever $k \neq r$. We may now perform a linear change of basis in G such that each generator $Z_i^{[k]}$ in the new basis belongs to $G^{[k]}$. Writing the canonical element in the form

$$C = e^{Z_i^{[k]} \otimes \psi^i_{[k]}}, \tag{61}$$

defines the k-primitiveness for the $\psi^i_{[k]}$ dual to the above $Z_i^{[k]}$. Since the $Z_i^{[k]}$ are linear combinations of the $Z_A$, the $\psi^i_{[k]}$ will be linear combinations of the $\psi^A$. A splits accordingly to a direct sum, $A = \bigoplus_{k=1}^{\infty} A^{[k]}$; the primitive ψ's, in particular, span $A^{[1]}$. Notice that ψ's with n vertices may belong to $G^{[k]}$ with k ≤ n − 1. This is so because the “longest” nested commutator with n vertices is $[Z_{\bullet}, [Z_{\bullet}, \dots [Z_{\bullet}, Z_{\bullet\text{-}\bullet}]] \dots ]$, with n − 2 entries of $Z_{\bullet}$ (the subscripts denote the one-vertex and the two-vertex tree).

The above concept of k-primitiveness arose naturally in our study of the primitive $\psi^i_{[1]}$. Some time afterwards, we became aware of Ref. [3], where a concept of k-primitiveness is also defined, as follows: given an element χ of A, one computes successive powers of the coproduct, $\Delta^k(\chi)$. There is a minimum k for which all terms in $\Delta^k(\chi)$ contain a unity in at least one of the tensor factors; this defines the k-primitiveness of χ. Our definition is intrinsically defined only on the generators $\psi^i_{[k]}$, while the above makes sense in all of A. We now show that, for $\psi^i_{[k]}$, the two definitions coincide.

Lemma 3. The minimum value of r for which $\Delta^r(\psi^i_{[k]})$ contains at least one unit tensor factor in each of its terms, is r = k.


Proof. The various powers of the coproduct of $\psi^i_{[k]}$ can be computed by iteration of the second of (34),

$$\Delta^{r-1}(\psi^i_{[k]}) = \text{coeff. of } Z_i^{[k]} \text{ in } \log\big(C_{01} C_{02} \cdots C_{0r}\big). \tag{62}$$

This shows that in $\Delta^k(\psi^i_{[k]})$, the (k + 1)-linear term can only be produced by the k-nested commutator

$$[Z_{i_1}, [Z_{i_2}, \dots [Z_{i_k}, Z_{i_{k+1}}]] \dots ] \otimes \psi^{i_1} \otimes \dots \otimes \psi^{i_{k+1}}.$$

The latter, however, has no $Z_i^{[k]}$ component, since $Z_i^{[k]}$ can be written as a (k − 1)-nested commutator at most. It is also clear, for the same reason, that there are no terms of higher p-degree in the ψ's, as those would correspond to even longer nested commutators. $\Delta^k(\psi^i_{[k]})$ then must have at least one unit tensor factor in each of its terms. On the other hand, the k-linear term in $\Delta^{k-1}(\psi^i_{[k]})$ is not zero, because, by definition, the corresponding (k − 1)-nested commutator has a $Z_i^{[k]}$ component.

As shown in [3], the k-degree satisfies

$$\deg_k\big(\psi^i_{[k_1]}\, \psi^j_{[k_2]}\big) = k_1 + k_2. \tag{63}$$

We use the two definitions of the k-degree interchangeably in what follows. We may now clarify the relation between the primitive elements given by the inverse Poincaré formula, Eq. (53), and the ones introduced above via the lower central series of G.

Lemma 4. Given φ i = ci A φ A , with ci A constants, such that dφ i = 0. Then the

primitive element ψ i of (53) is equal to ci A ψ A , i.e.,

ψ i = −+−1 ◦ S(φ i ) = ci A ψ A .

(64)

All primitive elements of A can be obtained in this form.

Proof. Any linear combination of the $\psi^i_{[1]}$ is primitive, while (sums of) products of them are not, due to (63). Therefore, the $\psi^i_{[1]}$ constitute a linear basis in the vector space of primitive elements of A. To the given $\phi^i$, Eq. (53) associates a primitive element $\psi^i$, with $\phi^i$ as its linear part. The unique linear combination of the $\psi^A$ (and, hence, of the $\psi^i_{[1]}$) with this linear part is $\psi^i = c^i{}_A\, \psi^A$.

We give an example illustrating the above. Example 4. Construction of G (n)[k] , A(n)[k] , for n ≤ 4. To identify the generators of G (n)[k] , we construct all (k−1)-nested commutators with n vertices – G (n)[1] is determined (n)[k] in G (n) (below we use the orthogonal complement as the complement of n−1 k=2 G but this is not essential, one simply has to complete the basis of the Z’s). This gives a matrix that effects the transition from the basis {ZA }, indexed by rooted trees, to the i in terms basis {Zi[k] }, of definite k-primitiveness. The inverse matrix then gives the ψ[k] of the ψ A .


G (1)[1] = G (1) is generated by Z . G (2)[1] = G (2) is generated by Z , since the only commutator with two vertices, [Z , Z ] is zero. For n = 3, we have the only non(3)[2] ≡ [Z , Z ] = 2Z . The complement in G (3) is spanned by zero commutator4 Z1 (3)[1] = Z . Next we look at the case n = 4. We find the only non-zero commutators Z1 b b

b

b

b

b b

b

b b b

b b b

(4)[2]

[Z , Z ] = (0, 2, 1, 0) ≡ Z1 b

b b b

(4)[3]

[Z , Z ] = (0, −1, 1, 3) ≡ Z1

,

b b b

b

,

(65)

in the basis Z , Z , Z , Z . The orthogonal complement in G (4) is spanned by b b b b

b b b b

b bb b

(4)[1]

Z1

b bbb

(4)[1]

= (1, 0, 0, 0),

Z2

= (0, 1, −2, 1).

(66)

Writing the above change of basis symbolically as Zi[k] = MZA , with M a matrix of i = ψ A M −1 . numerical coefficients, the dual change of basis for the ψ’s is given by ψ[k] We find 1 =ψ , ψ(1)[1] b

b b b

b b

1 ψ(2)[1] =ψ ,

1 ψ(3)[1] =ψ ,

1 ψ(3)[2] =

1 ψ , 2 b b b

(67)

while, for n = 4, b b b b

1 ψ(4)[1] =ψ , 2 ψ(4)[1] 1 ψ(4)[2] 1 ψ(4)[3]

b b b b

b bb b

1 1 1 = ψ − ψ + ψ , 3 6 6 7 2 1 = ψ + ψ + ψ , 18 9 18 1 1 5 =− ψ + ψ + ψ . 18 9 18 b b b b

b bbb

b bb b

b b b b

b bbb

b bb b

b bbb

(68)

2 , one easily verifies that Referring to, e.g., ψ(4)[1]

1 φ = φ 6 i

b b b b

b bb b

1 1 − φ + φ 6 3

b bbb

(69)

2 . has symmetric lin and, when inserted in (53), delivers ψ(4)[1]
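The change of basis in Example 4 is easy to check by machine for n = 4. The sketch below is only an illustration under stated assumptions: it takes the four row vectors quoted in (65) and (66) as the matrix M, in the order $Z_1^{(4)[1]}, Z_2^{(4)[1]}, Z_1^{(4)[2]}, Z_1^{(4)[3]}$, inverts M exactly with rational arithmetic, and reads off the columns of $M^{-1}$, which should reproduce the coefficients listed in (68). The helper name `invert` is hypothetical.

```python
from fractions import Fraction


def invert(matrix):
    """Exact Gauss-Jordan inverse of a square matrix with rational entries."""
    n = len(matrix)
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Rows: new generators expressed in the basis of four-vertex Z_A, Eqs. (65), (66).
M = [
    [1, 0, 0, 0],    # Z_1^{(4)[1]}
    [0, 1, -2, 1],   # Z_2^{(4)[1]}
    [0, 2, 1, 0],    # Z_1^{(4)[2]}
    [0, -1, 1, 3],   # Z_1^{(4)[3]}
]
M_inv = invert(M)

# Column i of M^{-1} holds the coefficients of the psi^A in the i-th new psi.
columns = [[M_inv[a][i] for a in range(4)] for i in range(4)]
for col in columns:
    print([str(c) for c in col])
```

The second column, (0, 1/6, −1/3, 1/6), also gives the coefficients of the combination in (69).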

To continue the above construction to the cases n = 5, 6, we developed a REDUCE program, incorporating some of the procedures of [2]. The numbers Pn,k of k-primitive ψ’s with n ≤ 6 vertices that we find coincide with the ones in Table 4 of [3], as expected. In what refers to the primitive ψ’s, the procedure presented above, starting with φ’s with symmetric lin and then using (53), should be considerably more efficient than the one used in [3] – it would be interesting to quantify this statement. Notice that an equivalent procedure involves expanding the primitive ψ’s as ψ[1] = cA ψ A and then determining the constants cA from the set of equations f T ZT , ψ[1] = 0 (the latter is the statement RS that ψ[1] is invariant under the coadjoint coaction). (n)[k]

4 We remind the reader of our notation: Z is the i th element in the subspace of k-primitive, n-vertex i Z’s. The same notation is used for the ψ’s, with the position of the indices (upper–lower) interchanged.


6. Normal Coordinates and Toy Model Renormalization

We turn now to what, in some sense, is our main objective, namely, the application of the formalism presented so far to the problem of renormalization in perturbative quantum field theory. The scope of our considerations in this section can only be modest, since realistic quantum field theories involve rooted trees with an infinite number of decorations. Nevertheless, a toy model exists (see [10]) that realizes the $\phi^A$ as nested divergent integrals, regulated by a parameter ε. We find this an extremely useful construct that captures many of the most important features of realistic renormalization; again, we refer the reader to [10, 6] for a detailed presentation. What we are interested in here is the rôle of the new coordinates ψ in the renormalization of divergent quantities. We start with a brief review of the basics.

6.1. The toy model. The elementary divergence in the toy model we deal with is given by the integral

$$I_1(c; \epsilon) = \int_0^{\infty} dy\, \frac{y^{-\epsilon}}{y + c}, \tag{70}$$

which diverges as ε goes to zero. c above will be referred to as the external parameter of the integral. We associate the function φ of the single-vertex tree with $I_1(c; \epsilon)$. To the function φ of the two-vertex ladder corresponds the nested integral

$$I_2(c; \epsilon) = \int_0^{\infty} dy_1\, \frac{y_1^{-\epsilon}}{y_1 + c}\, I_1(y_1; \epsilon) = \int_0^{\infty} dy_1\, \frac{y_1^{-\epsilon}}{y_1 + c} \int_0^{\infty} dy_2\, \frac{y_2^{-\epsilon}}{y_2 + y_1}. \tag{71}$$

Notice that the external parameter of the subdivergence $I_1$ is $y_1$. To the φ's of the three-vertex ladder and of the three-vertex tree with two leaves attached to the root correspond, respectively,

$$I_{3,1}(c; \epsilon) = \int_0^{\infty} dy_1\, \frac{y_1^{-\epsilon}}{y_1 + c}\, I_2(y_1; \epsilon), \qquad I_{3,2}(c; \epsilon) = \int_0^{\infty} dy_1\, \frac{y_1^{-\epsilon}}{y_1 + c}\, \big(I_1(y_1; \epsilon)\big)^2, \tag{72}$$

it should be clear how this assignment extends to all $\phi^A$. In this way, each $\phi^A$ can be associated with the Laurent series in ε that corresponds to its associated integral, e.g., for the single vertex,

$$\phi = \int_0^{\infty} dy\, \frac{y^{-\epsilon}}{y + c} = c^{-\epsilon}\, \frac{\pi}{\sin(\pi \epsilon)} = \frac{1}{\epsilon} - a + O(\epsilon), \tag{73}$$

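where a ≡ log(c) and, similarly (using MAPLE), for the trees with two, three and four vertices,

$$\begin{aligned}
\phi\ (\text{two-vertex ladder}) &= \frac{1}{2\epsilon^2} - \frac{a}{\epsilon} + \frac{5\pi^2}{12} + a^2 + O(\epsilon), \\
\phi\ (\text{three-vertex ladder}) &= \frac{1}{6\epsilon^3} - \frac{a}{2\epsilon^2} + \frac{1}{\epsilon}\Big(\frac{3a^2}{4} + \frac{7\pi^2}{18}\Big) - \frac{a}{12}\big(9a^2 + 14\pi^2\big) + O(\epsilon), \\
\phi\ (\text{root with two leaves}) &= \frac{1}{3\epsilon^3} - \frac{a}{\epsilon^2} + \frac{1}{\epsilon}\Big(\frac{3a^2}{2} + \frac{11\pi^2}{18}\Big) - \frac{a}{6}\big(9a^2 + 11\pi^2\big) + O(\epsilon), \\
\phi\ (\text{four-vertex ladder}) &= \frac{1}{24\epsilon^4} - \frac{a}{6\epsilon^3} + \frac{1}{\epsilon^2}\Big(\frac{a^2}{3} + \frac{5\pi^2}{24}\Big) - \frac{a}{18\epsilon}\big(8a^2 + 15\pi^2\big) + O(\epsilon^0), \\
\phi\ (\text{root, one child with two leaves}) &= \frac{1}{12\epsilon^4} - \frac{a}{3\epsilon^3} + \frac{1}{\epsilon^2}\Big(\frac{2a^2}{3} + \frac{3\pi^2}{8}\Big) - \frac{a}{18\epsilon}\big(16a^2 + 27\pi^2\big) + O(\epsilon^0), \\
\phi\ (\text{root with a leaf and a two-vertex ladder}) &= \frac{1}{8\epsilon^4} - \frac{a}{2\epsilon^3} + \frac{1}{\epsilon^2}\Big(a^2 + \frac{11\pi^2}{24}\Big) - \frac{a}{6\epsilon}\big(8a^2 + 11\pi^2\big) + O(\epsilon^0), \\
\phi\ (\text{root with three leaves}) &= \frac{1}{4\epsilon^4} - \frac{a}{\epsilon^3} + \frac{1}{\epsilon^2}\Big(2a^2 + \frac{19\pi^2}{24}\Big) - \frac{a}{6\epsilon}\big(16a^2 + 19\pi^2\big) + O(\epsilon^0),
\end{aligned} \tag{74}$$

and so on. It is easily seen that φ's with n vertices give rise to Laurent series with leading pole of order n. The process of renormalization assigns to each $\phi^A$ a finite “renormalized” value $\phi^A_R$ (see, e.g., [5]). In Hopf algebraic terms, the latter is given by [2]

$$\phi^A_R = S_R\big(\phi^A_{(1)}\big)\, \phi^A_{(2)}, \tag{75}$$

where the twisted antipode $S_R$ is defined recursively by

$$S_R\big(\phi^A\big) = -R\big(\phi^A\big) - R\Big(S_R\big(\phi^A_{(1')}\big)\, \phi^A_{(2')}\Big). \tag{76}$$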
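The closed form quoted in (73) for the elementary integral (70), and the leading Laurent coefficients in (73) and in the first of (74), can be checked numerically. The sketch below is an illustration only. It assumes the substitution $y = c\, e^u$ to turn (70) into an integral over the real line, which a plain trapezoidal sum handles well for moderate ε, and it assumes the product form $I_2 = c^{-2\epsilon} B(\epsilon) B(2\epsilon)$ with $B(x) = \pi/\sin(\pi x)$, which follows from inserting (73) into (71) but is not written out in the text. All function names are hypothetical.

```python
import math


def B(x):
    """pi / sin(pi x), the value of int_0^inf t^(-x) / (1 + t) dt for 0 < x < 1."""
    return math.pi / math.sin(math.pi * x)


def I1_closed(c, eps):
    """Closed form of the elementary integral, as in Eq. (73)."""
    return c ** (-eps) * B(eps)


def I2_closed(c, eps):
    """Nested integral (71): the inner I1 contributes y1^(-eps) * B(eps)."""
    return c ** (-2.0 * eps) * B(eps) * B(2.0 * eps)


def I1_numeric(c, eps, half_width=80.0, h=0.02):
    """Trapezoidal estimate of (70) after substituting y = c * exp(u).

    The window [-half_width, half_width] is adequate for eps near 1/2;
    smaller eps would need a wider window because the tail decays like exp(-eps*u).
    """
    n = int(round(half_width / h))
    total = 0.0
    for k in range(-n, n + 1):
        u = k * h
        if u > 0:
            total += math.exp(-eps * u) / (1.0 + math.exp(-u))
        else:
            total += math.exp((1.0 - eps) * u) / (1.0 + math.exp(u))
    return c ** (-eps) * h * total


c = 2.0
a = math.log(c)
print(I1_numeric(c, 0.5), I1_closed(c, 0.5))

eps = 1e-3
print(I1_closed(c, eps) - (1.0 / eps - a))
print(I2_closed(c, eps) - (1.0 / (2.0 * eps**2) - a / eps + 5.0 * math.pi**2 / 12.0 + a**2))
```

The last two printed differences are the O(ε) remainders of (73) and of the first line of (74), so they should shrink linearly as `eps` is decreased.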

R above is a renormalization map that we choose here to give the pole part of its argument, evaluated at the external parameter equal to 1, e.g., for the two-vertex ladder, $R(\phi) = 1/(2\epsilon^2)$ (compare with the first of (74)). The primed sum in the second term of (76) excludes the primitive part of the coproduct. The magic of renormalization lies in the fact that, for any $\phi^A$, the renormalized $\phi^A_R$ in (75) has no poles in ε; what makes this statement non-trivial is that all terms subtracted iteratively from $\phi^A$, to give $\phi^A_R$, are independent of external parameters. We conclude our brief review with the following statement, proven in [11]: if R satisfies the multiplicative constraint

$$R(xy) - R\big(R(x)\, y\big) - R\big(x\, R(y)\big) + R(x) R(y) = 0, \tag{77}$$

then $S_R$ is multiplicative, $S_R(xy) = S_R(x) S_R(y)$; our choice of R above does satisfy (77).

6.2. Renormalization in the ψ-basis. For a given number n of vertices, the renormalization of every generator $\phi^A$ gives rise to $2^n$ counterterms, for a total of $r_n 2^n$, where $r_n$ is the number of rooted trees with n vertices. To renormalize the ψ's, one can always express them in terms of the φ's and then proceed as above. However, for renormalization schemes R that satisfy (77), a much more efficient possibility arises. Equation (75), in this case, is valid for any function in A, and, in particular, for the ψ's. Notice that although the action of the antipode S is trivial on the $\psi^A$, that of the twisted antipode $S_R$ is not, in general. The advantage of working in the basis $\{\psi^i_{[k]}\}$ is that the complexity of the renormalization of a generator $\psi^i_{(n)[k]}$ is governed by k, not n, which entails, in general, significant savings. As an extreme example, a primitive ψ with one hundred vertices is renormalized by a simple subtraction; this should be compared with the $2^{100}$ counterterms necessary for the renormalization of each of the $\phi_{(100)}$'s. How significant the savings can be in, e.g., CPU time, depends on the distribution of the $\psi^i_{(n)}$ in the various k-classes. As proved in [3], the numbers $P_{n,k}$ of k-primitive ψ's with n vertices are generated by

$$P_k(x) \equiv \sum_{n=1}^{\infty} P_{n,k}\, x^n = \sum_{s | k} \frac{\mu(s)}{k} \Big(1 - \prod_{n=1}^{\infty} \big(1 - x^{ns}\big)^{r_n}\Big)^{k/s}, \tag{78}$$


a rather non-trivial result. The sum in the r.h.s. above extends over all divisors s of k, including 1 and k. µ(s) is the Möbius function, equal to zero if s is divisible by a square, and to $(-1)^p$ if s is the product of p distinct primes (µ(1) ≡ 1). Of particular interest to us is the asymptotic behavior of $P_{n,k}$, for large values of n [3],

$$f_k \equiv \lim_{n \to \infty} \frac{P_{n,k}}{r_n} = \frac{1}{c} \Big(1 - \frac{1}{c}\Big)^{k-1}, \tag{79}$$

where c = 2.95 . . . is the Otter constant. This is encouraging, as the population of the CPU-intensive high-k ψ's is seen to be exponentially suppressed. A realistic estimate of the complexity of renormalization in the ψ-basis is outside the scope of this article, as it would probably entail implementation-dependent parameters. Nevertheless, we attempt a first-order estimation by assigning a computational cost of $2^k$ to a k-primitive ψ, while the $\phi_{(n)}$ are assigned the cost $2^n$. The ratio of the total costs of renormalizing all generators with n vertices in the two bases then is

$$\rho_n = \frac{r_n 2^n}{\sum_{k=1}^{n-1} P_{n,k}\, 2^k} \approx (c - 2) \Big(\frac{c}{c - 1}\Big)^{n-1}, \tag{80}$$

with $\rho_{33} \approx 6 \times 10^5$ making the difference between a week and a second. We consider (80) as a loose upper bound on the potential savings.

Another feature of the ψ's that is worth pointing out is their toy model pole structure. As mentioned above, each of the $\phi^A_{(n)}$ corresponds to a Laurent series with maximal pole order n. We find that the behavior of the $\psi^i_{(n)}$ is much milder. We list the series expansion of the first few $\psi^A$, which should be compared with the analogous expressions for the $\phi^A$, Eq. (74),

$$\begin{aligned}
\psi\ (\text{single vertex}) &= \frac{1}{\epsilon} - a + O(\epsilon), \\
\psi\ (\text{two-vertex ladder}) &= \frac{\pi^2}{4} + O(\epsilon), \\
\psi\ (\text{three-vertex ladder}) &= \frac{\pi^2}{18\epsilon} - \frac{\pi^2 a}{6} + O(\epsilon), \\
\psi\ (\text{root with two leaves}) &= \frac{7\pi^2}{36\epsilon} - \frac{7\pi^2 a}{12} + O(\epsilon), \\
\psi\ (\text{four-vertex ladder}) &= \frac{\pi^4}{8} + O(\epsilon), \\
\psi\ (\text{root, one child with two leaves}) &= \frac{19\pi^4}{72} + O(\epsilon), \\
\psi\ (\text{root with a leaf and a two-vertex ladder}) &= \frac{\pi^2}{24\epsilon^2} - \frac{\pi^2 a}{6\epsilon} + O(\epsilon^0), \\
\psi\ (\text{root with three leaves}) &= \frac{\pi^2}{12\epsilon^2} - \frac{\pi^2 a}{3\epsilon} + O(\epsilon^0).
\end{aligned} \tag{81}$$
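Returning briefly to the counting formula (78): its right-hand side can be expanded with truncated integer power series to tabulate the $P_{n,k}$ for small n. The sketch below is a minimal illustration, not the REDUCE program mentioned in Example 4. It assumes the standard rooted-tree counts $r_n$ = 1, 1, 2, 4, 9, 20 for n = 1, ..., 6, and the function names are hypothetical.

```python
from fractions import Fraction

# r_n: number of rooted trees with n vertices (index 0 unused); assumed standard values.
R = [0, 1, 1, 2, 4, 9, 20]


def mul(p, q, N):
    """Product of two power series (coefficient lists), truncated at degree N."""
    out = [0] * (N + 1)
    for i, a in enumerate(p):
        if a == 0:
            continue
        for j, b in enumerate(q):
            if i + j > N:
                break
            out[i + j] += a * b
    return out


def mobius(s):
    """Moebius function mu(s) by trial division."""
    result, p = 1, 2
    while p * p <= s:
        if s % p == 0:
            s //= p
            if s % p == 0:
                return 0  # divisible by a square
            result = -result
        p += 1
    if s > 1:
        result = -result
    return result


def g(s, N):
    """1 - prod_n (1 - x^(n s))^(r_n), truncated at degree N (requires N <= 6)."""
    prod = [1] + [0] * N
    for n in range(1, N + 1):
        if n * s > N:
            break
        factor = [0] * (N + 1)
        factor[0], factor[n * s] = 1, -1
        for _ in range(R[n]):
            prod = mul(prod, factor, N)
    out = [-c for c in prod]
    out[0] += 1
    return out


def k_primitive_counts(k, N):
    """Coefficients P_{n,k}, n = 0..N, of P_k(x) in Eq. (78)."""
    total = [Fraction(0)] * (N + 1)
    for s in range(1, k + 1):
        if k % s:
            continue
        mu = mobius(s)
        if mu == 0:
            continue
        base, power = g(s, N), [1] + [0] * N
        for _ in range(k // s):
            power = mul(power, base, N)
        total = [t + Fraction(mu * c, k) for t, c in zip(total, power)]
    assert all(t.denominator == 1 for t in total)
    return [t.numerator for t in total]


TABLE = {k: k_primitive_counts(k, 5) for k in range(1, 5)}
for k, row in TABLE.items():
    print(k, row[1:])
```

For n = 4 this gives two primitive, one 2-primitive and one 3-primitive generator, in agreement with the basis constructed in Example 4, and for each n ≤ 5 the counts add up to $r_n$.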

Notice that, e.g., the primitive ψ of the four-vertex ladder is actually finite, as is the ψ of the root with one child carrying two leaves, which is not primitive. We emphasize that the renormalized value $\psi_R$ of the latter is still given by (75) (with $\phi^A \to \psi$) and does not coincide with the finite ψ (see Ex. 5 below). The other two $\psi_{(4)}$ are of order $1/\epsilon^2$, even though they have $G^{[3]}$ components. These initial observations point to a general feature of the ψ's: the pole order does not specify the complexity of their renormalization, as is the case with the φ's. The cancellations of the higher-order poles observed point to rather non-trivial underlying combinatorics that, we believe, deserve further investigation. The series expansion of the $\psi^i_{(n)[k]}$ is

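$$\begin{aligned}
\psi^2_{(4)[1]} &= \frac{\pi^4}{48} + O(\epsilon), \\
\psi^1_{(4)[2]} &= \frac{\pi^2}{72\epsilon^2} - \frac{\pi^2 a}{18\epsilon} + O(\epsilon^0), \\
\psi^1_{(4)[3]} &= \frac{\pi^2}{36\epsilon^2} - \frac{\pi^2 a}{9\epsilon} + O(\epsilon^0)
\end{aligned} \tag{82}$$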

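(the rest are essentially identical to the $\psi^A$). We also point out that some of the n = 6 primitive ψ's are of order $1/\epsilon^3$; nevertheless, the coefficients of all poles are independent of c and their renormalization is accomplished by a simple subtraction, in agreement with (75).

Example 5. Renormalization of $\psi^2_{(4)[1]}$, $\psi^1_{(4)[2]}$, and the finite, non-primitive four-vertex ψ. For the primitive $\psi^2_{(4)[1]}$, Eqs. (76), (82) give

$$S_R\big(\psi^2_{(4)[1]}\big) = -R\big(\psi^2_{(4)[1]}\big) = 0, \tag{83}$$

so that the renormalized value $\big(\psi^2_{(4)[1]}\big)_R = \psi^2_{(4)[1]} + S_R\big(\psi^2_{(4)[1]}\big)$ coincides with $\psi^2_{(4)[1]}$. For the 2-primitive $\psi^1_{(4)[2]}$, the first of (65) and (36) give

$$\Delta\big(\psi^1_{(4)[2]}\big) = \psi^1_{(4)[2]} \otimes 1 + 1 \otimes \psi^1_{(4)[2]} + \frac{1}{2}\, \psi^1_{(1)[1]} \otimes \psi^1_{(3)[1]} - \frac{1}{2}\, \psi^1_{(3)[1]} \otimes \psi^1_{(1)[1]}, \tag{84}$$

so that

$$\big(\psi^1_{(4)[2]}\big)_R = \psi^1_{(4)[2]} + S_R\big(\psi^1_{(4)[2]}\big) + \frac{1}{2}\, S_R\big(\psi^1_{(1)[1]}\big)\, \psi^1_{(3)[1]} - \frac{1}{2}\, S_R\big(\psi^1_{(3)[1]}\big)\, \psi^1_{(1)[1]}. \tag{85}$$

For the (non-trivial) twisted antipode we find

$$S_R\big(\psi^1_{(4)[2]}\big) = -R\big(\psi^1_{(4)[2]}\big) + \frac{1}{2}\, R\Big(R\big(\psi^1_{(1)[1]}\big)\, \psi^1_{(3)[1]}\Big) - \frac{1}{2}\, R\Big(R\big(\psi^1_{(3)[1]}\big)\, \psi^1_{(1)[1]}\Big). \tag{86}$$

Substituting above we get

$$\big(\psi^1_{(4)[2]}\big)_R = \frac{7}{96}\, \pi^4 + O(\epsilon). \tag{87}$$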


Finally, for the finite, non-primitive four-vertex ψ, we use the coproduct given in (40) and, proceeding along the same lines, we find

$$\psi_R = \frac{13}{96}\, \pi^4 - \frac{1}{24}\, \pi^2 a^2 + O(\epsilon), \tag{88}$$

which is different, as mentioned above, from the finite ψ. The remarkable pole structure of the ψ's observed above persists for other, more realistic models as well. For example, we have repeated the above analysis for the heavy-quark model of [2]. We find that, for n ≤ 4, the maximal pole order appearing is only $1/\epsilon$, with all ladder ψ's, except the first one, finite.

Acknowledgement. C. C. would like to thank Denjoe O’Connor for discussions and for pointing out Ref. [12]. The authors acknowledge partial support from CONACyT grant 32307-E and DGAPA-UNAM grant IN119792 (C. C.), DGAPA-UNAM grant 981212 (H. Q.) and CONACyT project G245427-E (M. R.).

References

1. Borodulin, V.I., Rogalyov, R.N. and Slabospitsky, S.R.: CORE: COmpendium of RElations. hep-ph/9507456
2. Broadhurst, D.J. and Kreimer, D.: Renormalization Automated by Hopf Algebra. J. Symb. Comp. 27, 581 (1999), hep-th/9810087
3. Broadhurst, D.J. and Kreimer, D.: Towards Cohomology of Renormalization: Bigrading the Combinatorial Hopf Algebra of Rooted Trees. Commun. Math. Phys. 215, 217–236 (2000), hep-th/0001202
4. Chryssomalakos, C., Schupp, P. and Watts, P.: The Rôle of the Canonical Element in the Quantized Algebra of Differential Operators A U . hep-th/9310100
5. Collins, J.: Renormalization. Cambridge: Cambridge University Press, 1984
6. Connes, A. and Kreimer, D.: Hopf Algebras, Renormalization and Noncommutative Geometry. Commun. Math. Phys. 199, 203–242 (1998), hep-th/9808042
7. Connes, A. and Kreimer, D.: Renormalization in Quantum Field Theory and the Riemann-Hilbert Problem. 1. The Hopf Algebra Structure of Graphs and the Main Theorem. Commun. Math. Phys. 210, 249–273 (2000), hep-th/9912092
8. Kastler, D.: Connes-Moscovici-Kreimer Hopf Algebras. Fields Institute Communications XX, 2001, math-ph/0104017
9. Kreimer, D.: Combinatorics of (Perturbative) Quantum Field Theory. hep-th/0010059
10. Kreimer, D.: On the Hopf Algebra Structure of Perturbative Quantum Field Theories. Adv. Theor. Math. Phys. 2, 303–334 (1998), q-alg/9707029
11. Kreimer, D.: Chen’s Iterated Integral Represents the Operator Product Expansion. Adv. Theor. Math. Phys. 3, 627–670 (1999), hep-th/9901099
12. Milnor, J.: Remarks on Infinite-Dimensional Lie Groups. In: DeWitt, B.S. and Stora, R. (eds), Relativity, Groups and Topology II (Les Houches 1983). Elsevier Science B.V., 1984, pp. 1007–1057

Communicated by A. Connes

Commun. Math. Phys. 225, 487 – 521 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

Finite-Wavelength Stability of Capillary-Gravity Solitary Waves

Mariana Haragus1, Arnd Scheel2

1 Mathématiques Appliquées de Bordeaux, Université Bordeaux 1, 351, Cours de la Libération, 33405 Talence Cedex, France. E-mail: [email protected]
2 Institut für Mathematik I, Freie Universität Berlin, Arnimallee 2–6, 14195 Berlin, Germany

Received: 7 February 2001 / Accepted: 6 October 2001

Abstract: We consider the Euler equations describing nonlinear waves on the free surface of a two-dimensional inviscid, irrotational fluid layer of finite depth. For large surface tension, Bond number larger than 1/3, and Froude number close to 1, the system possesses a one-parameter family of small-amplitude, traveling solitary wave solutions. We show that these solitary waves are spectrally stable with respect to perturbations of finite wave-number. In particular, we exclude possible unstable eigenvalues of the linearization at the soliton in the long-wavelength regime, corresponding to small frequency, and unstable eigenvalues with finite but bounded frequency, arising from non-adiabatic interaction of the infinite-wavelength soliton with finite-wavelength perturbations. 1. Introduction In this article, we study stability of solitary waves traveling at constant velocity on the free surface of a two-dimensional inviscid fluid layer of finite depth under the influence of gravity and surface tension. The equations of motion are the Euler equations for nonlinear surface waves. Solitary waves are among the most striking phenomena and appear to be stable in several parameter regimes. Both for large surface tension and in the absence of surface tension, solitary waves are known to exist as particular solutions. Together with the solitary waves, there exists a family of spatially periodic waves, which are known as Stokes waves in the absence of surface tension. Phenomenologically, solitary waves appear to be stable in both parameter regimes mentioned, whereas Stokes waves are stable only for large enough wavelengths. At some critical finite wavelength, the periodic waves destabilize, an instability mechanism first discovered in [BF67, Be67,Wh67] and known as the Benjamin-Feir instability. Mathematically, the water wave problem is an evolutionary partial differential equation and possesses a Hamiltonian structure [Za68]. 
Permanent address: University of Minnesota, School of Mathematics, 206 Church St. S.E., Minneapolis, MN 55455, USA

Various symmetries and associated


M. Haragus, A. Scheel

conservation laws are known; see [BO80]. The initial-value problem to this partial differential equation is well posed locally in time in the case of gravity waves [Na74, KN79,Yo82, Cr85,Wu97]. Both solitary waves and spatially periodic Stokes waves are particular equilibria of the Hamiltonian system. Their stability or instability is to first order determined by the spectrum of the linearization. Complete stability proofs would however have to take into consideration the effects of nonlinearity, as well. Throughout this paper, we focus on the spectrum of the linearization, the first and basic step towards stability of solitary waves. Existence of free surface waves in the full Euler equations has attracted a lot of interest in the late 80’s using bifurcation theory. For example, existence of solitary waves for large surface tension, Bond number larger than 1/3, was shown in [Ki88,AK89, Sa91]. Stability of surface waves in the full Euler equations is, from a mathematical point of view, a completely open problem, for both cases of gravity and capillary-gravity waves. Although a tremendous amount of literature is devoted to stability and instability of surface waves, to our knowledge, the present work represents the first rigorous attempt to show stability of solitary waves. Below, we summarize part of the previous work on stability and instability. Most detailed results are available for Stokes waves. In the absence of surface tension, a rigorous proof of the Benjamin-Feir instability of small-amplitude Stokes waves has been given in [BM95]. Rigorous stability proofs, even for the linearized problem, do not seem to be available. On the other hand, instability induced by critical eigenvalues leaving the imaginary axis of the linearized equations about a periodic wave upon variations of parameters has been extensively studied, both numerically and analytically; see, for example, [Mc82, LH84, Sa85, MS86, LHT97] and the references therein. 
Solitary waves in shallow water in the absence of surface tension appear to be stable at small amplitude. This is suggested by the numerical results on eigenvalues of the linearized operator in the absence of surface tension in [Ta86]. An instability seems to occur at some critical, finite amplitude, see again [Ta86]. The nature of this crest instability has also been investigated in direct numerical simulations, in [LHT97]. As already mentioned, stability results for solitary waves in the full Euler equations are not known. However, for large-wavelength initial data, the evolution of the free surface is governed on large time scales by certain model equations. For example, both for zero and for large surface tension, a formal expansion of the solution in the large wavelength exhibits at leading order a Korteweg–de Vries equation [KdV, Bou]. In other parameter regimes, the fifth order Kawahara equation [Ka72], or nonlinear Schrödinger equations can be derived. Together with these model equations, there come two mathematical problems: (i) What are the wave dynamics in the model equations? (ii) What can we conclude from the dynamics in the model equations for the dynamics of the full equations? For the particular question of stability of solitary waves we are interested in, these two problems reduce to first, the question of stability of solitary waves in the Korteweg– de Vries equation, and second, the question of validity of the approximation. Stability of solitary waves in the Korteweg–de Vries equation is fairly well understood. Orbital stability of the two-parameter family of solitary waves in this infinite-dimensional, integrable Hamiltonian system has been shown in [Be72, BSS87]. More towards the spirit of the present work, asymptotic stability of solitary waves has been shown in [PW96]. The proof there relies on a very careful understanding of the linearized problem using

Stability of Solitary Waves


a scattering-type analysis. Convergence then is, necessarily, established in an exponentially weighted function space, where the Korteweg–de Vries equation is not Hamiltonian. Deviating from the primary objective of this work, we also mention stability results for the Kawahara equation [Ka72]. This fifth order partial differential equation describes the dynamics of surface waves in the critical case of moderate surface tension, that is, for Bond numbers close to 1/3. For Bond numbers slightly larger than 1/3, the Kawahara equation supports solitary wave solutions just like the Korteweg–de Vries equation. Again, existence and orbital stability of these waves have been proved; see [IS92]. These stability results for the model equations let us believe that the solitary waves of the full Euler equations are stable at low amplitudes. However, the question to which extent solutions of the full system are well approximated by solutions of the model equations has not received a satisfactory answer that would allow us to conclude the stability of the solitary waves of the full system from only the stability of the corresponding waves of the model equation. Moreover, results on the validity of the model equations exist only in the case of gravity waves [KN79, KN86, Cr85, SW00]. In the presence of surface tension, the reduction method in [Ha96] permits to derive, in a rigorous and systematic manner, reduced systems that are nonlocal in the unbounded space variable and local in time, for different regions in the parameter plane (λ, b). The model equations, such as the Korteweg–de Vries and Kawahara equations, appear as the lowest order part in these reduced systems, but the connection between the solutions of the model equations and those of the reduced systems is still not clear. If we want to infer stability of solitary waves in the full Euler equations from stability of the soliton in the Korteweg–de Vries equation, two major problems arise. 
First, the Korteweg–de Vries equation is valid on large, but finite time scales. Instabilities beyond these time scales are invisible in this leading order approximation. The second difficulty is non-adiabatic interactions between the infinite-wavelength solitary wave and finite-wavelength perturbations. In the long-wavelength approximation of the Korteweg–de Vries equation, these perturbations are ignored. However, even at the linear level, these types of interaction may produce unstable eigenvalues, as has been shown, in a different context, in [KS98].

We give an outline of our results. In the case of large surface tension, we use bifurcation theory to deduce spectral stability of small-amplitude solitary waves for eigenvalues of finite frequencies, corresponding to finite wave numbers of the perturbations; see Theorem 2. As a first step, we reformulate the Euler equations as an abstract, first-order differential equation in the spatial variable x; Sect. 2. The existence of solitary waves, Sect. 3, is described by a four-dimensional differential equation, which, due to symmetries, reduces at leading order to a one-degree-of-freedom Hamiltonian system. The homoclinic orbit of this Hamiltonian system represents the solitary wave solution. This part of the analysis is similar to [Ki88]. The formulation of the Euler equations as a dynamical system in the spatial variable x in [Ki88] is slightly simpler, but does not generalize to the time-dependent case. We then linearize the Euler equations about this solitary wave solution and look for eigenfunctions with temporal growth $e^{\sigma t}$. We obtain a generalized eigenvalue problem for the linearized operator L(σ), depending on the spectral parameter σ. We formulate the stability problem in terms of the spectrum of this generalized eigenvalue problem and state our main results in Sect. 4.
Stability of the continuous spectrum then follows from general perturbation arguments together with an explicit computation of the dispersion relation; Sect. 5. The main body of the proof is contained in Sect. 6, where point spectrum off the imaginary axis is excluded. It is here that we crucially rely on the dynamical systems formulation of the problem. We

490

M. Haragus, A. Scheel

define a complex analytic function, depending on the spectral parameter σ , which we call the Evans function of the full water-wave problem. Its zeroes σ coincide with the point spectrum. Stability of the solitary wave decomposes into stability in three different regimes, depending on the magnitude of the frequency of the eigenvalue, given by the imaginary part of the spectral parameter σ : (I) the long-wavelength, (II) the intermediate-wavelength, and (III) the short-wavelength regime. Our main result claims stability in (I) and (II). Stability in the short-wavelength regime (III) remains open. In the intermediate-wavelength regime (II), we exclude eigenvalues popping out of the essential spectrum by analytically continuing the Evans function into the essential spectrum and explicitly computing its value from the linear dispersion relation about the flat surface. The long-wavelength regime (I) requires a more subtle analysis. In appropriate scalings, we find the Korteweg–de Vries equation and the Evans function associated to the Korteweg–de Vries soliton, already computed explicitly in [PW92]. The major difficulty then is associated to the fact that the linear dispersion relation about the trivial surface in the long-wavelength limit is the dispersion relation of the wave equation and not the dispersion relation of the Korteweg–de Vries equation. Technically, the problem appears when we formulate the Euler equations for the potential of the velocity field, whereas we derive the Korteweg–de Vries equation for the derivative of the potential. In particular, at bifurcation, we have four critical modes with zero group velocity. Only three are represented in the third order Korteweg–de Vries equation. The central argument relies on the symmetry of the dispersion relation induced by reflection in physical space. The symmetry is exploited in Sect. 6.2.4, where we show that the additional critical mode does not couple to the three other modes. 
More precisely, we show that we can continue the Evans function for the full water-wave problem analytically in the KdV-scaled spectral parameter $\sigma$. At leading order, we are able to compute the Evans function explicitly and find the Evans function of the KdV-soliton, multiplied by $\sigma$. The additional factor $\sigma$ precisely accounts for the fourth critical mode induced by translation of the velocity potential by constants. The stability proof is concluded by a perturbation argument, which shows that all roots of the Evans function are located at the origin, even for higher order perturbations, since they are induced by symmetries of the full water-wave problem. The method developed here for the case of large surface tension can be applied to the case of zero surface tension, as well. Although the formulation of the problem, Sect. 2, has to be adapted, most of the subsequent analysis is very similar. In particular, Theorem 2 on spectral stability holds in the absence of surface tension, as well. An important difference arises when proving the absence of unstable point spectrum with small frequency. The fourth critical mode, which appears in addition to the KdV-spectrum, carries a group velocity with the opposite sign when compared with the case of large surface tension. This actually simplifies the stability proof substantially in allowing for a continuation of the Evans function across the essential spectrum by means of exponentially weighted spaces, just like in the Korteweg–de Vries approximation; see [PW97] and [HS01] for solitary waves in different contexts, where a similar situation arises.

Stability of Solitary Waves

491

2. The Euler Equations and Spatial Dynamics

Consider nonlinear waves propagating at a constant speed $c$ on the free surface of an inviscid fluid layer of mean depth $h$ and constant density $\rho$. Assume that both gravity and surface tension are present, and denote by $g$ the acceleration due to gravity and by $T$ the coefficient of surface tension. In a coordinate system $(X, Y)$ moving with the waves the bottom lies at $Y = 0$ and the free surface is described by $Y = Z(X, t)$, where $t$ is the time variable. The flow is supposed to be irrotational, so the velocity field has a potential $\Phi = \Phi(X, Y, t)$. Introduce dimensionless variables by choosing the unit length to be $h$ and the unit velocity to be $c$. The Euler equations of motion become

\[ \Phi_{XX} + \Phi_{YY} = 0, \quad \text{for } 0 < Y < 1 + Z(X, t), \tag{2.1} \]

with the boundary condition

\[ \Phi_Y = 0 \tag{2.2} \]

at the bottom $Y = 0$, and

\[ Z_t + Z_X + Z_X \Phi_X = \Phi_Y, \tag{2.3} \]
\[ \Phi_t + \Phi_X + \frac{1}{2}\left(\Phi_X^2 + \Phi_Y^2\right) + \lambda Z - \frac{b Z_{XX}}{(1 + Z_X^2)^{3/2}} = 0 \tag{2.4} \]

on the free surface $Y = 1 + Z(X, t)$. The dimensionless numbers

\[ \lambda = gh/c^2 \quad \text{and} \quad b = T/\rho h c^2 \]

are the inverse square of the Froude number and the Bond number, respectively. The analysis is made for capillary-gravity waves, so we fix $b \neq 0$. The goal of this section is to write the system (2.1)–(2.4) in the abstract form

\[ D w_t = w_x + F(w; \lambda), \tag{2.5} \]

with boundary conditions

\[ 0 = f(w), \quad \text{on } y = 0, \tag{2.6} \]
\[ B w_t = f(w), \quad \text{on } y = 1, \tag{2.7} \]

where $D$, $B$ are linear and $F$, $f$ nonlinear maps acting on a Hilbert space of functions defined on the bounded cross-section of the domain. Consider the new variables

\[ u = \Phi_X, \qquad \eta = \frac{b Z_X}{\sqrt{1 + Z_X^2}}, \]

and the change of coordinates

\[ x = X, \qquad y = \frac{Y}{1 + Z(X, t)}, \tag{2.8} \]


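which transforms the moving domain $\{(X, Y) \in \mathbb{R}^2 \mid 0 \le Y \le 1 + Z(X, t)\}$ into $\mathbb{R} \times [0, 1]$. Then, (2.1), (2.4) lead to the system

\[ 0 = \Phi_x - u - \frac{y\eta}{(1 + Z)\sqrt{b^2 - \eta^2}}\,\Phi_y, \quad \text{in } \mathbb{R} \times (0, 1), \tag{2.9} \]
\[ 0 = u_x + \frac{1}{(1 + Z)^2}\,\Phi_{yy} - \frac{y\eta}{(1 + Z)\sqrt{b^2 - \eta^2}}\,u_y, \quad \text{in } \mathbb{R} \times (0, 1), \tag{2.10} \]
\[ 0 = Z_x - \frac{\eta}{\sqrt{b^2 - \eta^2}}, \tag{2.11} \]
\[ \Phi_t = \eta_x - \lambda Z - u - \frac{u^2}{2} + \frac{1}{2(1 + Z)^2}\,\Phi_y^2 - \frac{\eta(1 + u)}{(1 + Z)\sqrt{b^2 - \eta^2}}\,\Phi_y, \quad \text{on } y = 1, \tag{2.12} \]

with boundary conditions

\[ 0 = \Phi_y, \quad \text{on } y = 0, \tag{2.13} \]
\[ Z_t = \frac{1}{1 + Z}\,\Phi_y - \frac{\eta(1 + u)}{\sqrt{b^2 - \eta^2}}, \quad \text{on } y = 1, \tag{2.14} \]

obtained from (2.2) and (2.3). Equations (2.9)–(2.12) are of the form (2.5) in which the variable $w$, the linear operator $D$ and the map $F$ are defined through $w = (\Phi, u, Z, \eta)^T$, $Dw = (0, 0, 0, \Phi|_{y=1})^T$, and

\[ F(w; \lambda) = \begin{pmatrix} -u - \dfrac{y\eta}{(1 + Z)\sqrt{b^2 - \eta^2}}\,\Phi_y \\[2mm] \dfrac{1}{(1 + Z)^2}\,\Phi_{yy} - \dfrac{y\eta}{(1 + Z)\sqrt{b^2 - \eta^2}}\,u_y \\[2mm] -\dfrac{\eta}{\sqrt{b^2 - \eta^2}} \\[2mm] \left( -\lambda Z - u - \dfrac{u^2}{2} + \dfrac{1}{2(1 + Z)^2}\,\Phi_y^2 - \dfrac{\eta(1 + u)}{(1 + Z)\sqrt{b^2 - \eta^2}}\,\Phi_y \right)\Big|_{y=1} \end{pmatrix}. \]

The boundary conditions (2.13), (2.14) are of the form (2.6), (2.7) in which

\[ Bw = Z, \qquad f(w) = \frac{1}{1 + Z}\,\Phi_y - \frac{y\eta(1 + u)}{\sqrt{b^2 - \eta^2}}. \]

We consider (2.5) as an abstract differential equation on the phase space $X := H^1(0, 1) \times L^2(0, 1) \times \mathbb{R}^2$. Set $U = \{(\Phi, u, Z, \eta) \in X \mid Z > -1,\ |\eta| < b\}$, and define $X^1 := H^2(0, 1) \times H^1(0, 1) \times \mathbb{R}^2$, and $V = U \cap X^1$. The properties of $D$, $F$, $B$ and $f$ are summarized in the following lemma.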


Lemma 2.1. The following statements hold:
(i) $D$ is a bounded linear operator from $X$ (resp. $X^1$) into $X$ (resp. $X^1$).
(ii) $B$ is a bounded linear operator from $X$ (resp. $X^1$) into $\mathbb{R}$.
(iii) $F \in C^k(V \times \mathbb{R}, X)$ and $f \in C^k(U, L^2(0, 1)) \cap C^k(V, H^1(0, 1))$, for any $k \ge 0$.

The proof is an easy consequence of the definition of $D$, $B$, $f$, and $F$ and is left to the reader.

Remark 2.2. The Euler equations (2.1)–(2.4) possess a reversibility symmetry. For any solution $(Z(X, t), \Phi(X, t))$, reversibility yields a different solution $(Z(-X, -t), -\Phi(-X, -t))$. For the system (2.5) this means that $D$ commutes and $F$ anticommutes with the reverser $R = \mathrm{diag}(-1, 1, 1, -1)$, and for the boundary conditions (2.6)–(2.7) that $BR = B$ and $f(Rw) = -f(w)$, for any $w \in U$.

3. Steady Solitary Waves

The Euler equations (2.1)–(2.4) possess steady solitary-wave solutions for any $b > 1/3$ and $\lambda = 1 + \varepsilon^2$ for $\varepsilon$ sufficiently small. Mathematical proofs go back to [Ki88, AK89, Sa91]. Our main purpose is a study of the temporal stability properties of these solitary waves. As we explained in the previous section, our approach to the stability problem is technically based on a spatial dynamics formulation of the eigenvalue problem, similar to the existence proof given in [Ki88]. However, our formulation slightly differs from the one exploited there. For the convenience of the reader, and in order to exhibit the main technical tools in the slightly simpler steady problem, we sketch the proof of existence of solitary waves in this section. In particular, we describe the most important properties of the steady solitary wave solutions of (2.5)–(2.7) that exist for $b > 1/3$ and $\lambda > 1$, $\lambda$ close to 1. From now on we fix $b > 1/3$ and set $\lambda = 1 + \varepsilon^2$. The solitary waves are not unique, due to the invariance of the equations under translations in $X$, $\Phi$, and due to Galilean invariance.
Translational symmetry is ruled out by restriction to symmetric waves, that is reversible solutions of the spatial dynamics formulation, satisfying $Z(X) = Z(-X)$ and $\Phi(X, Y) = -\Phi(-X, Y)$. In the steady problem the mean flow $m$ is conserved and can be used to select a unique solitary wave from the family of solitary waves obtained by Galilean invariance. Fixing the mean flow through a cross section to one amounts to the condition

\[ m = 1 + Z(X) + \int_0^{1 + Z(X)} \Phi_X(X, Y)\, dY = 1. \tag{3.1} \]

We consider the steady water-wave problem (2.5) with $w_t = 0$,

\[ w_x + F(w; \lambda) = 0, \tag{3.2} \]

with boundary conditions

\[ f(w) = 0, \quad \text{on } y = 0, 1. \tag{3.3} \]

The proof of existence of solitary waves for this system is, as the one in [Ki88], based on a center manifold reduction. However, the reduction procedure cannot be applied directly to this system because of the nonlinear boundary condition on $y = 1$. We therefore consider first a nonlinear change of variables on $U$ which transforms this boundary condition into a linear condition on $y = 1$.


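Lemma 3.1. The map $\chi : U \to U$ defined by $\chi(\Phi, u, Z, \eta) = (\tilde\Phi, u, Z, \eta)$, where

\[ \tilde\Phi = \Phi + \int_0^y \big( f(w) - f'(0)w \big)\, dy' - \frac{Z}{1 + Z}\,\Phi(0) \]

is a $C^1$-diffeomorphism. Moreover, the restriction $\chi : V \to V$ is a $C^1$-diffeomorphism.

Proof. It is easy to check that $\chi$ is a smooth map from $U$ into $X$. A direct calculation shows that

\[ \tilde\Phi = \frac{1}{1 + Z}\,\Phi + \frac{y^2 \eta}{2b} - \frac{\eta}{\sqrt{b^2 - \eta^2}} \int_0^y y'(1 + u(y'))\, dy', \]

so $\chi$ is invertible with inverse $\chi^{-1} : U \to U$ defined through $\chi^{-1}(\tilde\Phi, u, Z, \eta) = (\Phi, u, Z, \eta)$ with

\[ \Phi = (1 + Z) \left( \tilde\Phi - \frac{y^2 \eta}{2b} + \frac{\eta}{\sqrt{b^2 - \eta^2}} \int_0^y y'(1 + u(y'))\, dy' \right). \]

The fact that $\chi^{-1}$ is smooth proves the first part of the lemma. The second part follows from the fact that the restrictions to $V$, $\chi : V \to V$ and $\chi^{-1} : V \to V$, are well defined and smooth.

Set $w = \chi^{-1}(\tilde w)$. Then (3.2)–(3.3) yields the following system for $\tilde w$:

\[ \tilde w_x = -\left( D\chi^{-1}(\tilde w) \right)^{-1} F(\chi^{-1}(\tilde w); \lambda) =: G(\tilde w; \lambda), \tag{3.4} \]

with boundary conditions

\[ \tilde\Phi_y = 0, \quad \text{on } y = 0, \tag{3.5} \]
\[ \tilde\Phi_y = \frac{\eta}{b}, \quad \text{on } y = 1, \tag{3.6} \]

since

\[ \tilde\Phi_y = f(w) + \frac{y\eta}{b}. \]

We treat this system as an infinite dimensional dynamical system on the phase space $X$. We write

\[ \tilde w_x = \tilde A(\lambda)\tilde w + \tilde G(\tilde w; \lambda), \tag{3.7} \]

where $\tilde A(\lambda) = D_{\tilde w} G(0; \lambda)$ and $\tilde G(\tilde w; \lambda) = G(\tilde w; \lambda) - \tilde A(\lambda)\tilde w$. The boundary conditions (3.5)–(3.6) are included in the domain of definition of the linear operator $\tilde A(\lambda)$ by taking

\[ Y := \mathrm{Dom}(\tilde A(\lambda)) = \left\{ (\tilde\Phi, u, Z, \eta) \in X^1 \;\Big|\; \tilde\Phi_y(0) = 0,\ \tilde\Phi_y(1) = \frac{\eta}{b} \right\}. \]

Then $\tilde A(\lambda)$ is a closed linear operator in $X$ with domain $Y$ dense in $X$, and $\tilde G$ is a smooth map from $W = (U \cap Y) \times \mathbb{R}$ into $X$.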


Note that $\chi(0) = 0$ and $D\chi(0) = I$, so $\tilde A(\lambda) = D_{\tilde w} G(0; \lambda) = -D_w F(0; \lambda)$. This means that the linear part of the system (3.2) is not changed by the transformation above. The same is true for the boundary conditions (3.3). A direct calculation shows that

\[ \tilde A(\lambda)\tilde w = \left( u,\ -\tilde\Phi_{yy},\ \frac{\eta}{b},\ \lambda Z + u|_{y=1} \right)^T. \]

Remark also that (3.7) is reversible with reverser $R$ defined in Sect. 2, since $\chi(Rw) = R\chi(w)$. We apply center manifold reduction directly to this system. We find a four-dimensional reduced system which describes the steady waves. Note that the reduced system obtained in [Ki88] is only two-dimensional. The two additional dimensions here are due to the invariance of (2.1)–(2.4) under translations in the fluid potential $\Phi$ and due to Galilean invariance. Both symmetries are inherited by the system (3.7) from the full Euler equations. In [Ki88], these invariances were factored out, already in the dynamical formulation of the problem, before the reduction procedure, such that the reduced equation did not possess these symmetries any more. Here, we only use them after the reduction, and show that it is possible to simplify the reduced system on the four-dimensional center manifold to a two-dimensional differential equation with the help of reversibility and condition (3.1). The reason for this slightly different approach is that we cannot factor out these symmetries in the eigenvalue problem.

Theorem 1. For any $b > 1/3$ and $k \ge 0$ there exist $\varepsilon^* > 0$ and $C > 0$ such that, for any $\varepsilon \in (0, \varepsilon^*)$ the system (3.2)–(3.3) with $\lambda = 1 + \varepsilon^2$ possesses a unique solitary-wave solution $w_\varepsilon^* \in C_b^k(\mathbb{R}, X^1)$ with the following properties:

(i) $w_\varepsilon^* = w_\varepsilon^{*0} + \tilde w_\varepsilon^*$, where $w_\varepsilon^{*0} = (U^0, u^0, -u^0, -b u^0_x)$ with

\[ u^0(x) = \varepsilon^2\, \mathrm{sech}^2\!\left( \frac{\sqrt{\beta}\,\varepsilon x}{2} \right), \qquad U^0(x) = \int_0^x u^0(x')\, dx', \qquad \beta = \frac{3}{3b - 1}, \]

and $\|\tilde w_\varepsilon^*(x)\|_{X^1} \le C\varepsilon^3$ for any $x \in \mathbb{R}$. Moreover,

\[ \|(I - P)\tilde w_\varepsilon^*(x)\|_{X^1} \le C\varepsilon^4 e^{-\sqrt{\beta}\,\varepsilon |x|}, \qquad \|\partial_y P \tilde w_\varepsilon^*(x)\|_{X^1} \le C\varepsilon^3 e^{-\sqrt{\beta}\,\varepsilon |x|}, \]

where $P$ is the projection on the $\Phi$-component of $w$: $P : X \to X$, $P = \mathrm{diag}(1, 0, 0, 0)$.

(ii) $w_\varepsilon^*$ is reversible, i.e. $R w_\varepsilon^*(x) = w_\varepsilon^*(-x)$, and the components $\Phi_\varepsilon^*$, $u_\varepsilon^*$, $Z_\varepsilon^*$, $\eta_\varepsilon^*$ of $w_\varepsilon^*$ satisfy

\[ Z_\varepsilon^*(x) + (1 + Z_\varepsilon^*(x)) \int_0^1 u_\varepsilon^*(x, y)\, dy = 0. \]

(iii) $w_\varepsilon^*$ is a smooth function of $\varepsilon$.


Proof. By Lemma 3.1 it is enough to show the existence of solitary waves for the system (3.7). As in [Ki88] one can show that $\tilde A(\lambda)$ has compact resolvent, so its spectrum consists only of isolated eigenvalues of finite multiplicities. The eigenvalue problem

\[ \tilde A(\lambda)\tilde w = \zeta \tilde w, \qquad \tilde w \in Y, \]

can be solved explicitly, and we find that $\zeta$ is an eigenvalue of $\tilde A(\lambda)$ if and only if it satisfies the equality

\[ \zeta^2 \cos\zeta = (\lambda - b\zeta^2)\,\zeta \sin\zeta. \]

A direct calculation shows that 0 is always an eigenvalue of $\tilde A(\lambda)$ with generalized eigenvectors

\[ w_0 = (1, 0, 0, 0)^T, \qquad w_\lambda = (0, 1, -1/\lambda, 0)^T, \]

such that $\tilde A(\lambda) w_0 = 0$, $\tilde A(\lambda) w_\lambda = w_0$. If $b > 1/3$ and $\lambda = 1$ this eigenvalue has algebraic multiplicity 4; the generalized eigenvectors

\[ w_0 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad w_1 = \begin{pmatrix} 0 \\ 1 \\ -1 \\ 0 \end{pmatrix}, \quad w_2 = \begin{pmatrix} -\frac{y^2}{2} \\ 0 \\ 0 \\ -b \end{pmatrix}, \quad w_3 = \begin{pmatrix} 0 \\ -\frac{y^2}{2} \\ \frac{1}{2} - b \\ 0 \end{pmatrix}, \]

satisfy $\tilde A(1) w_0 = 0$, $\tilde A(1) w_i = w_{i-1}$, $i = 1, 2, 3$, and form a basis for the generalized eigenspace associated to the eigenvalue 0.

We apply the reduction result in [Mi88] to system (3.7) with $b > 1/3$ and $\lambda = 1 + \varepsilon^2$ close to $\lambda_0 = 1$. By direct calculation one can prove that there exist positive constants $C(\lambda)$ and $q_0$ such that

\[ \|(iq - \tilde A(\lambda))^{-1}\|_{X \to X} \le \frac{C(\lambda)}{|q|}, \tag{3.8} \]

for any $q \in \mathbb{R}$, $|q| > q_0$. Moreover, the map $\tilde G$ is smooth in $\tilde w$ and $\varepsilon^2$ when considered as a map from the domain $Y = \mathrm{Dom}(\tilde A)$ into $X$. With these preparations, the reduction theorem in [Mi88] shows that any small bounded solution $\tilde w \in C_b^k(\mathbb{R}, Y)$ of (3.7) is of the form

\[ \tilde w(x) = a_0(x) w_0 + a_1(x) w_1 + a_2(x) w_2 + a_3(x) w_3 + \Psi(a_0, a_1, a_2, a_3; \varepsilon^2), \tag{3.9} \]

with $\Psi(a_0, a_1, a_2, a_3; \varepsilon^2) = O(|a_j|(|a_j| + \varepsilon^2))$, and the $a_j$ satisfy the reduced system

\[ \begin{aligned} a_{0,x} &= a_1 + f_0(a_0, a_1, a_2, a_3; \varepsilon^2), \\ a_{1,x} &= a_2 + f_1(a_0, a_1, a_2, a_3; \varepsilon^2), \\ a_{2,x} &= a_3 + f_2(a_0, a_1, a_2, a_3; \varepsilon^2), \\ a_{3,x} &= f_3(a_0, a_1, a_2, a_3; \varepsilon^2), \end{aligned} \tag{3.10} \]

in which $f_j(a_0, a_1, a_2, a_3; \varepsilon^2) = O(|a_j|(|a_j| + \varepsilon^2))$. By a careful choice of a cut-off function, necessary in the construction of the center manifold, one can arrange to have the reduced flow inherit the symmetries of the full system (3.7). In particular, the invariance of (3.7) under translation in $\tilde\Phi$ implies that $\Psi$ and (3.10) are invariant under transformations of the form $a_0 \to a_0 + \alpha$, for any $\alpha \in \mathbb{R}$,


such that $\Psi$ and the $f_j$, $j = 0, \ldots, 3$, do not depend upon $a_0$. The reduced equations (3.10) possess a skew-product structure and decouple into a system for $a_1$, $a_2$, $a_3$,

\[ \begin{aligned} a_{1,x} &= a_2 + f_1(a_1, a_2, a_3; \varepsilon^2), \\ a_{2,x} &= a_3 + f_2(a_1, a_2, a_3; \varepsilon^2), \\ a_{3,x} &= f_3(a_1, a_2, a_3; \varepsilon^2), \end{aligned} \tag{3.11} \]

and one differential equation for $a_0$, which can be integrated. Reversibility can be used to uniquely determine $a_0$. The reduced system (3.10) is reversible with reverser $R_0$ acting through $R_0(a_0, a_1, a_2, a_3) = (-a_0, a_1, -a_2, a_3)$, since $R w_0 = -w_0$, $R w_1 = w_1$, $R w_2 = -w_2$, $R w_3 = w_3$. Reversible solutions of (3.10) are those with $a_0$, $a_2$ odd and $a_1$, $a_3$ even functions in $x$. For such solutions $a_0$ is uniquely determined by the condition $a_0(0) = 0$, which leads to

\[ a_0(x) = \int_0^x \big( a_1 + f_0(a_1, a_2, a_3; \varepsilon^2) \big)\, dx'. \tag{3.12} \]

Next, we use the condition (3.1) to uniquely determine $a_3$ for solutions of (3.7) with mean flow one. For $\tilde w = (\tilde\Phi, u, Z, \eta)$ this condition reads

\[ Z(x) + (1 + Z(x)) \int_0^1 u(x, y)\, dy = 0, \qquad x \in \mathbb{R}. \]

Substitution of $\tilde w$ from (3.9) yields an equality

\[ \mathcal{F}(a_1, a_2, a_3; \varepsilon^2) = 0. \]

It is not difficult to see that $\mathcal{F}$ is smooth in its arguments, and a direct calculation shows that

\[ D_{a_3} \mathcal{F}(0, 0, 0; \varepsilon^2) = \frac{1}{3} - b \neq 0. \]

Then by the implicit function theorem we obtain

\[ a_3 = \psi(a_1, a_2; \varepsilon^2) = O(|a_j|(|a_j| + \varepsilon^2)), \tag{3.13} \]

with $\psi$ a smooth function. Substituting (3.13) into (3.11) we obtain the two-dimensional system

\[ \begin{aligned} a_{1,x} &= a_2 + g_1(a_1, a_2; \varepsilon^2), \\ a_{2,x} &= g_2(a_1, a_2; \varepsilon^2). \end{aligned} \tag{3.14} \]

This system is also reversible with reverser acting through $a_1 \to a_1$, $a_2 \to -a_2$. One can now argue as in [Ki88] and prove that (3.14) possesses a unique reversible homoclinic solution $(a_1^*(\varepsilon), a_2^*(\varepsilon))$, a smooth function of $\varepsilon$, for sufficiently small $\varepsilon > 0$. Explicit calculation of the relevant quadratic terms shows that

\[ a_1^*(x; \varepsilon) = \varepsilon^2\, \mathrm{sech}^2\!\left( \frac{\sqrt{\beta}\,\varepsilon x}{2} \right) + O(\varepsilon^4). \]

The equalities (3.12), (3.13) give the reversible homoclinic solution of the reduced system (3.10), and from (3.9) we find the reversible solitary-wave solution of (3.7). This proves the theorem.
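The leading-order profile $\varepsilon^2\,\mathrm{sech}^2(\sqrt{\beta}\,\varepsilon x/2)$ can be checked numerically. The sketch below is only an illustration: the scalar equation $a'' = \beta\varepsilon^2 a - \tfrac{3}{2}\beta a^2$ is not quoted from the text. It is the equation that this profile satisfies exactly by the standard sech² identity, and it is assumed here to represent the leading-order form of (3.14). The function names and the finite-difference step are hypothetical choices.

```python
import math

def sech2_profile(x, eps, b):
    """Leading-order profile eps^2 * sech^2(sqrt(beta)*eps*x/2), beta = 3/(3b-1)."""
    beta = 3.0 / (3.0 * b - 1.0)
    return eps**2 / math.cosh(math.sqrt(beta) * eps * x / 2.0) ** 2

def ode_residual(x, eps, b, h=1e-2):
    """Residual of a'' = beta*eps^2*a - 1.5*beta*a^2 (assumed model equation),
    with a'' approximated by a central second difference of step h."""
    beta = 3.0 / (3.0 * b - 1.0)
    a = sech2_profile(x, eps, b)
    a_xx = (sech2_profile(x + h, eps, b) - 2.0 * a + sech2_profile(x - h, eps, b)) / h**2
    return a_xx - (beta * eps**2 * a - 1.5 * beta * a**2)

if __name__ == "__main__":
    eps, b = 0.1, 1.0  # any b > 1/3 keeps beta positive
    worst = max(abs(ode_residual(x, eps, b)) for x in (-20.0, -5.0, 0.0, 3.0, 12.0))
    print("largest residual on the sample points:", worst)
```

The residual stays at the level of the finite-difference error, which is consistent with the sech² profile being a homoclinic orbit of a one-degree-of-freedom system of this type.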


4. Spectral Stability of Solitary Waves

In this section we formulate the stability problem in terms of the spectrum of a family of linear operators and state the main results.

4.1. Linearized system. Consider the linearization of the problem (2.5)–(2.7) about the solitary wave $w_\varepsilon^* \in C_b^k(\mathbb{R}, X^1)$ found in Theorem 1 for $\varepsilon \in (0, \varepsilon^*)$:

\[ D W_t = W_x + D_w F(w_\varepsilon^*; 1 + \varepsilon^2) W, \tag{4.1} \]
\[ 0 = f'(w_\varepsilon^*) W, \quad \text{on } y = 0, \tag{4.2} \]
\[ B W_t = f'(w_\varepsilon^*) W, \quad \text{on } y = 1. \tag{4.3} \]

We look for solutions of this system of the form

\[ W(t, x) = e^{\sigma t} W_\sigma(x), \tag{4.4} \]

with $W_\sigma$ a bounded function from $\mathbb{R}$ into the complexification of $X^1$, for $\sigma \in \mathbb{C}$. For simplicity we denote the complexification of $X^1$, and later those of $X$ and $Y$, also by $X^1$ (resp. $X$ and $Y$). Roughly speaking, the solitary wave $w_\varepsilon^*$ is stable if (4.1)–(4.3) does not possess any solutions of the form (4.4) for any $\sigma \in \mathbb{C}$ with $\mathrm{Re}\,\sigma > 0$. Substitution of (4.4) into (4.1)–(4.3) yields the following system for $W_\sigma$:

\[ \sigma D W = W_x + D_w F(w_\varepsilon^*; 1 + \varepsilon^2) W, \tag{4.5} \]
\[ 0 = f'(w_\varepsilon^*) W, \quad \text{on } y = 0, \tag{4.6} \]
\[ \sigma B W = f'(w_\varepsilon^*) W, \quad \text{on } y = 1. \tag{4.7} \]

We write this system in abstract form

\[ \mathcal{L}(\sigma, \varepsilon)\widetilde W := \widetilde W_x - L(\sigma, \varepsilon)\widetilde W = 0, \]

with $L(\sigma, \varepsilon)$ some linear operator in $X$, and then formulate the stability problem for $w_\varepsilon^*$ in terms of the spectrum of the family of operators $\mathcal{L}_\varepsilon = (\mathcal{L}(\sigma, \varepsilon))_{\sigma \in \mathbb{C}}$. We proceed as in the steady problem by constructing first a linear diffeomorphism $\chi_\sigma$ which transforms the non-autonomous boundary conditions (4.6)–(4.7) into autonomous boundary conditions.

Lemma 4.1. Assume $\sigma \in \mathbb{C}$ and $\varepsilon \in (0, \varepsilon^*)$. The linear map $\chi_\sigma : X \to X$ defined by

\[ \chi_\sigma W = D\chi(w_\varepsilon^*) W - \left( \frac{\sigma}{2}\, y^2 BW,\ 0,\ 0,\ 0 \right) \]

is bounded and has bounded inverse $\chi_\sigma^{-1} : X \to X$. Moreover, $\chi_\sigma$ and $\chi_\sigma^{-1}$ are analytic in $\sigma$, smooth in $\varepsilon$, and their restrictions to $X^1$ are well defined.

The proof is similar to the one of Lemma 3.1 so we omit it here. Note that $\chi_0$ is the linearization about $w_\varepsilon^*$ of the diffeomorphism $\chi$ in Lemma 3.1. Set $W = \chi_\sigma^{-1} \widetilde W$. Then the system (4.5) becomes

\[ \widetilde W_x = \chi_\sigma \left( \sigma D - D_w F(w_\varepsilon^*; 1 + \varepsilon^2) \right) \chi_\sigma^{-1} \widetilde W + (\partial_x \chi_\sigma)\, \chi_\sigma^{-1} \widetilde W, \tag{4.8} \]


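with boundary conditions

\[ \tilde\Phi_y = 0, \quad \text{on } y = 0, \tag{4.9} \]
\[ \tilde\Phi_y = \frac{\eta}{b}, \quad \text{on } y = 1, \tag{4.10} \]

for $\widetilde W = (\tilde\Phi, u, Z, \eta)$. Explicit calculation of the equations in (4.8) shows that it is of the form

\[ \widetilde W_x = D(\sigma, \varepsilon)\widetilde W + A(\varepsilon)\widetilde W, \tag{4.11} \]

with $D(\sigma, \varepsilon) = D_\infty(\sigma) + \varepsilon^2 D_1(x; \sigma, \varepsilon)$, a bounded linear operator in $X$, and $A(\varepsilon) = A_\infty(\varepsilon^2) + \varepsilon^2 A_1(x; \varepsilon)$, a closed linear operator in $X$. The parts $A_\infty$ and $D_\infty$ correspond to the linearization evaluated at the asymptotic state of the solitary wave, at $x = \infty$. The parts $A_1$ and $D_1$ correspond to the perturbation due to the solitary wave. These are operators with coefficients depending on $x$, and decaying to 0 at $x = \infty$ with the same rate as the decay rate of the solitary wave $w_\varepsilon^*$. Since we do not need the explicit formulas of these operators in the following, we omit them here. However, note that $A_\infty(\varepsilon^2) = \tilde A(1 + \varepsilon^2)$, and that $D_\infty(\sigma)$ and $D_1(x; \sigma, \varepsilon)$ depend upon $\sigma$ in the following way:

\[ D_\infty(\sigma) = \sigma D_{\infty 1} + \sigma^2 D_{\infty 2}, \qquad D_1(x; \sigma, \varepsilon) = \sigma D_{11}(x; \varepsilon) + \sigma^2 D_{12}(x; \varepsilon), \]

since $BD = 0$. As in the formulation of the steady problem (3.7), the boundary conditions (4.9)–(4.10) are included in the domain of definition of the operator $A(\varepsilon)$. The properties of $D(\sigma, \varepsilon)$ and $A(\varepsilon)$ needed later are summarized in the next lemma. They follow from Lemma 2.1, the decay properties of $w_\varepsilon^*$ in Theorem 1, and the definition of $\chi_\sigma$ in Lemma 4.1.

Lemma 4.2. Assume $\sigma \in \mathbb{C}$, $\varepsilon \in (0, \varepsilon^*)$ and $x \in \mathbb{R}$.
(i) $D_\infty(\sigma)$ and $D_1(x; \sigma, \varepsilon)$ are bounded linear operators in $X$ (resp. $X^1$), depending analytically upon $\sigma$ and smoothly upon $\varepsilon$.
(ii) $A_\infty(\varepsilon^2)$ and $A_1(x; \varepsilon)$ are closed linear operators in $X$ with dense domain $Y$, depending analytically upon $\sigma$ and smoothly upon $\varepsilon$.
Moreover, there exists a positive constant $C$ such that the following inequalities hold for any $\sigma \in \mathbb{C}$, $\varepsilon \in (0, \varepsilon^*)$ and $x \in \mathbb{R}$:

\[ \|D_\infty(\sigma)\|_{X(\text{resp. } X^1) \to X(\text{resp. } X^1)} \le C |\sigma| (1 + |\sigma|), \]
\[ \|D_1(x; \sigma, \varepsilon)\|_{X(\text{resp. } X^1) \to X(\text{resp. } X^1)} \le C |\sigma| (1 + |\sigma|)\, e^{-\sqrt{\beta}\,\varepsilon |x|}, \]
\[ \|A_\infty(\varepsilon^2)\|_{Y \to X} \le C, \]
\[ \|A_1(x; \varepsilon)\|_{Y \to X} \le C e^{-\sqrt{\beta}\,\varepsilon |x|}. \]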


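4.2. Spectral stability. Set $L(\sigma, \varepsilon) = D(\sigma, \varepsilon) + A(\varepsilon)$, and consider the family of operators $\mathcal{L}_\varepsilon = (\mathcal{L}(\sigma, \varepsilon))_{\sigma \in \mathbb{C}}$ defined by

\[ \mathcal{L}(\sigma, \varepsilon) = \frac{d}{dx} - L(\sigma, \varepsilon). \]

Equation (4.11) becomes $\mathcal{L}(\sigma, \varepsilon)\widetilde W = 0$. Set $\mathcal{H} = L^2(\mathbb{R}, X)$ and $\mathcal{W} = H^1(\mathbb{R}, X) \cap L^2(\mathbb{R}, Y)$. Then $\mathcal{L}(\sigma, \varepsilon)$ is a closed linear operator in $\mathcal{H}$ with dense domain $\mathcal{W}$. Define the resolvent set of the family of operators $\mathcal{L}_\varepsilon$ as the set

\[ \rho(\mathcal{L}_\varepsilon) = \{\sigma \in \mathbb{C} : \mathcal{L}(\sigma, \varepsilon) \text{ invertible}\}. \]

The set $\Sigma(\mathcal{L}_\varepsilon) = \mathbb{C} \setminus \rho(\mathcal{L}_\varepsilon)$ is called the spectrum of $\mathcal{L}_\varepsilon$. We distinguish between point spectrum

\[ \Sigma_p(\mathcal{L}_\varepsilon) = \Sigma(\mathcal{L}_\varepsilon) \cap \{\sigma \in \mathbb{C} : \mathcal{L}(\sigma, \varepsilon) \text{ Fredholm with index } 0\}, \]

and essential spectrum $\Sigma_e(\mathcal{L}_\varepsilon) = \Sigma(\mathcal{L}_\varepsilon) \setminus \Sigma_p(\mathcal{L}_\varepsilon)$.

Definition 4.3. The solitary wave $w_\varepsilon^*$ is called spectrally stable if $\Sigma(\mathcal{L}_\varepsilon) \subset \{\sigma \in \mathbb{C} : \mathrm{Re}\,\sigma \le 0\}$, and spectrally unstable otherwise.

The main result in this paper is:

Theorem 2. Fix $b > 1/3$, and choose any $R > 0$ large. Then there exists $\varepsilon_b > 0$ such that, for any $\varepsilon \in (0, \varepsilon_b)$, the spectrum of $\mathcal{L}_\varepsilon$ coincides with the imaginary axis in a ball of radius $R$:

\[ \Sigma(\mathcal{L}_\varepsilon) \cap \{\sigma \in \mathbb{C} : |\sigma| \le R\} = i\mathbb{R} \cap \{\sigma \in \mathbb{C} : |\sigma| \le R\}. \]

The proof consists of two parts summarized in the following two theorems.

Theorem 3. There exists $\varepsilon_e > 0$ such that for any $\varepsilon \in (0, \varepsilon_e)$ the essential spectrum of $\mathcal{L}_\varepsilon$ coincides with the imaginary axis.

Theorem 4. Fix $b > 1/3$, and choose any $R > 0$ large. Then there exists $\varepsilon_p > 0$ such that for any $\varepsilon \in (0, \varepsilon_p)$ the point spectrum of $\mathcal{L}_\varepsilon$ is contained in $i\mathbb{R} \cup \{|\sigma| \ge R\}$.

The two theorems are proved in Sects. 5 and 6, respectively. The result in Theorem 2 is a consequence of Theorems 3 and 4.

Remark 4.4. In fact, we prove slightly more. We actually compute eigenvalues embedded into the essential spectrum $\Sigma_e(\mathcal{L}_\varepsilon) = i\mathbb{R}$. We show that inside the essential spectrum, there is only the zero eigenvalue with geometric multiplicity two and algebraic multiplicity three. One eigenfunction is due to the invariance of the Euler equations under $\Phi \to \Phi + \mathrm{const}$, and the second eigenfunction is given by the $x$-derivative of the solitary wave. The generalized eigenvector to the second eigenfunction is given by the derivative of the solitary wave with respect to the wave speed.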


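5. The Essential Spectrum of Solitary Waves

We prove Theorem 3. We study first the spectrum of the family of asymptotic operators $\mathcal{L}_\varepsilon^\infty = (\mathcal{L}_\infty(\sigma, \varepsilon))_{\sigma \in \mathbb{C}}$, where

\[ \mathcal{L}_\infty(\sigma, \varepsilon) = \frac{d}{dx} - L_\infty(\sigma, \varepsilon), \qquad L_\infty(\sigma, \varepsilon) = D_\infty(\sigma) + A_\infty(\varepsilon^2). \]

Lemma 5.1. For any $\varepsilon \ge 0$, the essential spectrum of $\mathcal{L}_\varepsilon^\infty$ is equal to $i\mathbb{R}$. The point spectrum of $\mathcal{L}_\varepsilon^\infty$ is empty.

Proof. The asymptotic operators $D_\infty(\sigma)$ and $A_\infty(\varepsilon^2)$ are independent of $x$, so in order to determine the spectrum of $\mathcal{L}_\varepsilon^\infty$ we can use the Fourier transform in $x$. Let $k$ denote the Fourier variable. Then the spectrum of $\mathcal{L}_\varepsilon^\infty$ in $\mathcal{H}$ coincides with the spectrum of $\widehat{\mathcal{L}}_\varepsilon^\infty = (\widehat{\mathcal{L}}_\infty(\sigma, \varepsilon))_{\sigma \in \mathbb{C}}$, where $\widehat{\mathcal{L}}_\infty(\sigma, \varepsilon) = ik - L_\infty(\sigma, \varepsilon)$. The domain of $\widehat{\mathcal{L}}_\infty(\sigma, \varepsilon)$ is $\widehat{\mathcal{W}} = L^2(\mathbb{R}, Y) \cap \widehat{H}^1(\mathbb{R}, X)$, where

\[ \widehat{H}^1(\mathbb{R}, X) = \{ f \in L^2(\mathbb{R}, X) : (1 + |k|) f \in L^2(\mathbb{R}, X) \}. \]

The resolvent set of $\widehat{\mathcal{L}}_\varepsilon^\infty$ consists of the values $\sigma \in \mathbb{C}$ with the following two properties:
(i) $\Sigma(L_\infty(\sigma, \varepsilon)) \cap i\mathbb{R} = \emptyset$, where $\Sigma(L_\infty(\sigma, \varepsilon))$ is the spectrum of $L_\infty(\sigma, \varepsilon)$ in $X$,
(ii) there exists a positive constant $C(\sigma, \varepsilon)$ such that the estimate

\[ \|(ik - L_\infty(\sigma, \varepsilon))^{-1}\|_{X \to X} \le \frac{C(\sigma, \varepsilon)}{1 + |k|}, \tag{5.1} \]

holds for any $k \in \mathbb{R}$.

Indeed, assume that (i) and (ii) hold for some $\sigma \in \mathbb{C}$. Then, for any $\hat f \in \mathcal{H}$ there exists $\hat g(k) = (ik - L_\infty(\sigma, \varepsilon))^{-1} \hat f(k)$ with

\[ \|(1 + |k|)\hat g\|_{\mathcal{H}}^2 = \int_{\mathbb{R}} (1 + |k|)^2 \|\hat g(k)\|_X^2\, dk \le C(\sigma, \varepsilon)^2 \int_{\mathbb{R}} \|\hat f(k)\|_X^2\, dk = C(\sigma, \varepsilon)^2 \|\hat f\|_{\mathcal{H}}^2. \]

Hence $\hat g \in \widehat{\mathcal{W}}$ and the map $\hat f \to \hat g$ is bounded from $\mathcal{H}$ into $\widehat{\mathcal{W}}$.

The operator $L_\infty(\sigma, \varepsilon)$ has compact resolvent, so its spectrum consists only of isolated eigenvalues of finite multiplicities. The eigenvalue problem

\[ L_\infty(\sigma, \varepsilon)\tilde w = \zeta \tilde w, \qquad \tilde w \in Y, \]

can be solved explicitly. We find that $\zeta$ is an eigenvalue of $L_\infty(\sigma, \varepsilon)$ if and only if

\[ (\sigma + \zeta)^2 \cos\zeta = (1 + \varepsilon^2 - b\zeta^2)\,\zeta \sin\zeta. \tag{5.2} \]

Set $\sigma = \sigma_1 + i\sigma_2$ and $\zeta = ik$. Then (5.2) yields

\[ (\sigma_2 + k)^2 - \sigma_1^2 = (1 + \varepsilon^2 + bk^2)\, k \tanh k, \tag{5.3} \]
\[ 2\sigma_1 (\sigma_2 + k) = 0. \tag{5.4} \]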
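The real roots $k$ of (5.3) on the imaginary axis ($\sigma_1 = 0$) can be explored numerically. The following sketch, with hypothetical function names and an arbitrarily chosen search window and grid, brackets sign changes of the difference between the two sides of (5.3) and refines them by bisection. It also evaluates the residual of (5.2) at $\sigma = i\sigma_2$, $\zeta = ik$, using $\cos(ik) = \cosh k$ and $\sin(ik) = i\sinh k$ implicitly through complex arithmetic. It is a numerical illustration for sample parameters, not a substitute for the argument in the proof.

```python
import cmath
import math

def dispersion_gap(k, sigma2, eps, b):
    """Right-hand side minus left-hand side of (5.3) for sigma1 = 0."""
    return (1.0 + eps**2 + b * k**2) * k * math.tanh(k) - (sigma2 + k) ** 2

def real_roots(sigma2, eps, b, k_max=30.0, n=60000):
    """Find real roots of (5.3) in [-k_max, k_max]: grid bracketing, then bisection."""
    roots = []
    h = 2.0 * k_max / n
    left = -k_max
    g_left = dispersion_gap(left, sigma2, eps, b)
    for i in range(1, n + 1):
        right = -k_max + i * h
        g_right = dispersion_gap(right, sigma2, eps, b)
        if g_left * g_right < 0.0:  # a sign change brackets a root
            lo, hi, g_lo = left, right, g_left
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                g_mid = dispersion_gap(mid, sigma2, eps, b)
                if g_lo * g_mid <= 0.0:
                    hi = mid
                else:
                    lo, g_lo = mid, g_mid
            roots.append(0.5 * (lo + hi))
        left, g_left = right, g_right
    return roots

def residual_52(k, sigma2, eps, b):
    """Residual of (5.2) at sigma = i*sigma2 and zeta = i*k."""
    sigma, zeta = 1j * sigma2, 1j * k
    lhs = (sigma + zeta) ** 2 * cmath.cos(zeta)
    rhs = (1.0 + eps**2 - b * zeta**2) * zeta * cmath.sin(zeta)
    return lhs - rhs

if __name__ == "__main__":
    for k in real_roots(sigma2=1.0, eps=0.1, b=0.5):
        print(k, abs(residual_52(k, 1.0, 0.1, 0.5)))
```

For the sample values $\sigma_2 = 1$, $\varepsilon = 0.1$, $b = 0.5$ the search returns one negative and one positive root, in line with the count of real solutions of (5.3) for $b > 1/3$ and $\sigma_2 \neq 0$.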


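If $\sigma_1 \neq 0$, i.e. $\sigma \notin i\mathbb{R}$, the equality (5.4) implies $k = -\sigma_2$, which is clearly not a solution of (5.3): the left-hand side equals $-\sigma_1^2 < 0$, while the right-hand side is nonnegative. Hence (5.2) has no purely imaginary solutions, i.e. $\Sigma(L_\infty(\sigma, \varepsilon)) \cap i\mathbb{R} = \emptyset$, for any $\sigma \notin i\mathbb{R}$. If $\sigma_1 = 0$, i.e. $\sigma \in i\mathbb{R}$, the equality (5.4) is always satisfied, and (5.3) has, for any $\sigma_2 \neq 0$, exactly two real solutions, one positive and one negative (recall that $b > 1/3$), so (5.2) has in this case two purely imaginary solutions, both simple and different from zero. For $\sigma = 0$, (5.2) has only one purely imaginary solution, $\zeta = 0$, which is a root of multiplicity two if $\varepsilon \neq 0$, and a root of multiplicity four if $\varepsilon = 0$. We conclude that (i) is satisfied for any $\sigma \notin i\mathbb{R}$, and is not satisfied if $\sigma \in i\mathbb{R}$.

We show that (ii) holds for any $\sigma \notin i\mathbb{R}$. Recall that $A_\infty(\varepsilon^2) = \tilde A(1 + \varepsilon^2)$, where $\tilde A(\lambda)$ is the linear operator in (3.7). Then (3.8) implies

\[ \|(ik - A_\infty(\varepsilon^2))^{-1}\|_{X \to X} \le \frac{C(\varepsilon)}{|k|}, \]

for any $|k| \ge k_0$, for some positive $k_0$ and $C(\varepsilon)$. Since $D_\infty(\sigma)$ is a bounded operator in $X$, we find

\[ \|(ik - A_\infty(\varepsilon^2))^{-1} D_\infty(\sigma)\|_{X \to X} \le \|D_\infty(\sigma)\|\, \frac{C(\varepsilon)}{|k|} \le \frac{1}{2}, \]

if $|k| \ge k_1(\sigma, \varepsilon) = \max\{k_0, 2\|D_\infty(\sigma)\| C(\varepsilon)\}$. Then

\[ (ik - L_\infty(\sigma, \varepsilon))^{-1} = \left( I - (ik - A_\infty(\varepsilon^2))^{-1} D_\infty(\sigma) \right)^{-1} (ik - A_\infty(\varepsilon^2))^{-1}, \]

so, for any $|k| \ge k_1(\sigma, \varepsilon)$,

\[ \|(ik - L_\infty(\sigma, \varepsilon))^{-1}\|_{X \to X} \le \frac{2C(\varepsilon)}{|k|}. \]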
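The bound above is a Neumann-series argument: writing $ik - L_\infty = (ik - A_\infty)\,(I - (ik - A_\infty)^{-1} D_\infty)$, which uses $L_\infty = D_\infty + A_\infty$, the second factor is invertible with inverse of norm at most 2 once $\|(ik - A_\infty)^{-1} D_\infty\| \le 1/2$. The sketch below illustrates this in a finite-dimensional toy setting only: the $2 \times 2$ matrices standing in for $A_\infty$ and $D_\infty$ are arbitrary, hypothetical choices, and the maximum-row-sum norm replaces the operator norm on $X$.

```python
def mat_mul(P, Q):
    """Product of two 2x2 (complex) matrices given as nested lists."""
    return [[sum(P[i][m] * Q[m][j] for m in range(2)) for j in range(2)] for i in range(2)]

def mat_sub(P, Q):
    """Entrywise difference P - Q of two 2x2 matrices."""
    return [[P[i][j] - Q[i][j] for j in range(2)] for i in range(2)]

def mat_inv(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]

def norm_inf(M):
    """Induced infinity norm (maximum absolute row sum); submultiplicative, norm of I is 1."""
    return max(sum(abs(entry) for entry in row) for row in M)

def resolvent_check(A, D, k):
    """Return R = (ik - A)^{-1}, K = R D, the direct inverse of ik - A - D,
    and the factored form (I - K)^{-1} R."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    ik = [[1j * k, 0.0], [0.0, 1j * k]]
    R = mat_inv(mat_sub(ik, A))
    K = mat_mul(R, D)
    direct = mat_inv(mat_sub(mat_sub(ik, A), D))
    factored = mat_mul(mat_inv(mat_sub(identity, K)), R)
    return R, K, direct, factored

if __name__ == "__main__":
    A = [[0.0, 1.0], [-1.0, 0.0]]   # toy stand-in for A_infinity
    D = [[0.1, 0.0], [0.2, 0.1]]    # toy stand-in for D_infinity(sigma)
    R, K, direct, factored = resolvent_check(A, D, 5.0)
    print(norm_inf(K), norm_inf(direct), 2.0 * norm_inf(R))
```

In this toy example the two expressions for the inverse agree to rounding error, and the norm of the full resolvent stays below twice the norm of $(ik - A)^{-1}$, mirroring the estimate in the proof.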

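Now (5.1) follows for $\sigma \notin i\mathbb{R}$ from $\Sigma(L_\infty(\sigma, \varepsilon)) \cap i\mathbb{R} = \emptyset$. We conclude that any $\sigma \notin i\mathbb{R}$ belongs to the resolvent set of $\mathcal{L}_\varepsilon^\infty$.

It remains to show that the entire imaginary axis belongs to the essential spectrum. We therefore exhibit an orthonormal sequence $w_\ell \in X$, with $\mathcal{L}_\infty(\sigma, \varepsilon) w_\ell \to 0$, and conclude that $\mathcal{L}_\infty(\sigma, \varepsilon)$ cannot be Fredholm of index zero, for $\sigma \in i\mathbb{R}$. From (5.3), (5.4), we find a $k_* = k_*(\sigma) \in \mathbb{R}$ and a vector $w_0$ such that $(ik_* - L_\infty(\sigma, \varepsilon)) w_0 = 0$. Let $\theta_R$ be a smooth, even cut-off function, with $\theta_R(x) = 1$ for $|x| \le R$, $\theta_R(x) = 0$ for $|x| \ge R + 1$, and $\theta_R(x) = \theta_0(x - R)$ for $x \in [R, R + 1]$. Define $\tilde w_\ell := \theta_\ell(x - 2\ell^2)\, e^{ik_* x} w_0$ and renormalize $w_\ell := \tilde w_\ell / \|\tilde w_\ell\|_{\mathcal{H}}$. Since the supports of all $w_\ell$ are disjoint, the $w_\ell$ form an orthonormal sequence. A straightforward computation shows that $\|\mathcal{L}_\infty(\sigma, \varepsilon) w_\ell\|_{\mathcal{H}} = O(\ell^{-1/2})$. This proves the lemma.

We show now that the essential spectrum of $\mathcal{L}_\varepsilon$ is contained in $i\mathbb{R}$.

Proposition 5.2. There exists $\varepsilon_0 > 0$ such that, for any $\varepsilon \in (0, \varepsilon_0)$ and any $\sigma \notin i\mathbb{R}$, the operator $\mathcal{L}(\sigma, \varepsilon)$ is Fredholm with zero index, so $\Sigma_e(\mathcal{L}_\varepsilon) \subset i\mathbb{R}$.

This proposition is proved in six steps contained in the following lemmas.

Lemma 5.3. There exist positive constants $\varepsilon_1$, $c_1(\sigma, \varepsilon)$, $c_2(\sigma)$, such that the inequalities

\[ \|w\|_{\mathcal{W}} \le c_1(\sigma, \varepsilon)\, \|\mathcal{L}_\infty(\sigma, \varepsilon) w\|_{\mathcal{H}}, \tag{5.5} \]
\[ \|w\|_{\mathcal{W}} \le c_2(\sigma) \left( \|w\|_{\mathcal{H}} + \|\mathcal{L}(\sigma, \varepsilon) w\|_{\mathcal{H}} \right), \tag{5.6} \]

hold, for any $\varepsilon \in (0, \varepsilon_1)$, $\sigma \notin i\mathbb{R}$ and $w \in \mathcal{W}$.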


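Proof. From Lemma 5.1 it follows that

\[ \|\mathcal{L}_\infty(\sigma, \varepsilon)^{-1} v\|_{\mathcal{W}} \le C(\sigma, \varepsilon)\, \|v\|_{\mathcal{H}}, \]

for any $\varepsilon > 0$, $\sigma \notin i\mathbb{R}$, $v \in \mathcal{H}$. For $w \in \mathcal{W}$ set $v = \mathcal{L}_\infty(\sigma, \varepsilon) w \in \mathcal{H}$. Then

\[ \|w\|_{\mathcal{W}} = \|\mathcal{L}_\infty(\sigma, \varepsilon)^{-1} v\|_{\mathcal{W}} \le C(\sigma, \varepsilon)\, \|v\|_{\mathcal{H}} = C(\sigma, \varepsilon)\, \|\mathcal{L}_\infty(\sigma, \varepsilon) w\|_{\mathcal{H}} \]

and (5.5) is proved. Choose $\sigma_0 \notin i\mathbb{R}$ and $\varepsilon_0 \in (0, \varepsilon^*)$. Then

\[ \begin{aligned} \|w\|_{\mathcal{W}} &\le c_1(\sigma_0, \varepsilon_0)\, \|\mathcal{L}_\infty(\sigma_0, \varepsilon_0) w\|_{\mathcal{H}} \\ &\le c_1(\sigma_0, \varepsilon_0) \Big( \|\mathcal{L}(\sigma, \varepsilon) w\|_{\mathcal{H}} + \|(D_\infty(\sigma) - D_\infty(\sigma_0)) w\|_{\mathcal{H}} + \|(A_\infty(\varepsilon^2) - A_\infty(\varepsilon_0^2)) w\|_{\mathcal{H}} \\ &\qquad\qquad + \varepsilon^2 \|D_1(x; \sigma, \varepsilon) w\|_{\mathcal{H}} + \varepsilon^2 \|A_1(x; \varepsilon) w\|_{\mathcal{H}} \Big). \end{aligned} \]

From the explicit formula for $A_\infty(\varepsilon^2) = \tilde A(1 + \varepsilon^2)$ we deduce that $A_\infty(\varepsilon^2) - A_\infty(\varepsilon_0^2)$ is a bounded operator in $X$, and

\[ \|(A_\infty(\varepsilon^2) - A_\infty(\varepsilon_0^2)) w\|_{\mathcal{H}} \le |\varepsilon^2 - \varepsilon_0^2|\, \|w\|_{\mathcal{H}}. \]

Furthermore, Lemma 4.2 implies

\[ \|D_1(x; \sigma, \varepsilon) w\|_{\mathcal{H}} \le C |\sigma| (1 + |\sigma|)\, \|w\|_{\mathcal{H}}, \qquad \|A_1(x; \varepsilon) w\|_{\mathcal{H}} \le C \|w\|_{\mathcal{W}}, \]

for any $w \in \mathcal{W}$. The constant $C$ is independent of $\varepsilon$ and $\sigma$. Finally, recall that $D_\infty(\sigma)$ is bounded in $X$ and conclude

\[ \|w\|_{\mathcal{W}} \le c_1(\sigma_0, \varepsilon_0) \left( \|\mathcal{L}(\sigma, \varepsilon) w\|_{\mathcal{H}} + C(\sigma) \|w\|_{\mathcal{H}} + |\varepsilon^2 - \varepsilon_0^2|\, \|w\|_{\mathcal{H}} + \varepsilon^2 C \|w\|_{\mathcal{W}} \right). \]

Choose $\varepsilon_1$ such that $\varepsilon_1^2\, c_1(\sigma_0, \varepsilon_0)\, C \le 1/2$ and (5.6) is proved.

For the next two lemmas we follow [RS95]. For each $T > 0$ define the Hilbert spaces

\[ \mathcal{H}_T = L^2([-T, T], X), \qquad \mathcal{W}_T = L^2([-T, T], Y) \cap H^1([-T, T], X). \]

The embedding $\mathcal{W}_T \subset \mathcal{H}_T$ is compact (cf. [RS95], Lemma 3.8).

Lemma 5.4. Assume $\varepsilon \in (0, \varepsilon_1)$ and $\sigma \notin i\mathbb{R}$. There exist $T = T(\sigma, \varepsilon) > 0$ and $c_3(\sigma, \varepsilon) > 0$, such that the inequality

\[ \|w\|_{\mathcal{W}} \le c_3(\sigma, \varepsilon) \left( \|w\|_{\mathcal{H}_T} + \|\mathcal{L}(\sigma, \varepsilon) w\|_{\mathcal{H}} \right) \tag{5.7} \]

holds, for any $w \in \mathcal{W}$.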


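Proof. Assume $w \in \mathcal{W}$ is such that $w(x) = 0$, for $|x| \le T$, for some $T > 0$. Then (5.5) and the inequalities in Lemma 4.2 imply

\[ \begin{aligned} \|w\|_{\mathcal{W}} &\le c_1(\sigma, \varepsilon)\, \|\mathcal{L}_\infty(\sigma, \varepsilon) w\|_{\mathcal{H}} \\ &\le c_1(\sigma, \varepsilon) \left( \|\mathcal{L}(\sigma, \varepsilon) w\|_{\mathcal{H}} + \varepsilon^2 \|D_1(x; \sigma, \varepsilon) w\|_{\mathcal{H}} + \varepsilon^2 \|A_1(x; \varepsilon) w\|_{\mathcal{H}} \right) \\ &\le c_1(\sigma, \varepsilon)\, \|\mathcal{L}(\sigma, \varepsilon) w\|_{\mathcal{H}} + C_0(\sigma, \varepsilon)\, \varepsilon^2 e^{-\sqrt{\beta}\,\varepsilon T}\, \|w\|_{\mathcal{W}}. \end{aligned} \]

Then, there exist $T = T(\sigma, \varepsilon) > 0$ and $C_1(\sigma, \varepsilon) > 0$ such that for any $w \in \mathcal{W}$, with $w(x) = 0$, for $|x| \le T - 1$, we have

\[ \|w\|_{\mathcal{W}} \le C_1(\sigma, \varepsilon)\, \|\mathcal{L}(\sigma, \varepsilon) w\|_{\mathcal{H}}. \tag{5.8} \]

Take a smooth cutoff function $\varphi : \mathbb{R} \to [0, 1]$ such that $\varphi(x) = 0$ for $|x| \ge T$, $\varphi(x) = 1$ for $|x| \le T - 1$, and $|\varphi'(x)| \le m$. Using (5.6) and (5.8) we obtain

\[ \begin{aligned} \|w\|_{\mathcal{W}} &\le \|\varphi w\|_{\mathcal{W}} + \|(1 - \varphi) w\|_{\mathcal{W}} \\ &\le c_2(\sigma) \left( \|\varphi w\|_{\mathcal{H}} + \|\mathcal{L}(\sigma, \varepsilon) \varphi w\|_{\mathcal{H}} \right) + C_1(\sigma, \varepsilon)\, \|\mathcal{L}(\sigma, \varepsilon)(1 - \varphi) w\|_{\mathcal{H}} \\ &\le c_3(\sigma, \varepsilon) \left( \|w\|_{\mathcal{H}_T} + \|\mathcal{L}(\sigma, \varepsilon) w\|_{\mathcal{H}} \right), \end{aligned} \]

since $\mathcal{L}(\sigma, \varepsilon) \varphi w = \varphi \mathcal{L}(\sigma, \varepsilon) w + \varphi' w$.

Lemma 5.5. For any $\varepsilon \in (0, \varepsilon_1)$ and $\sigma \notin i\mathbb{R}$, the operator $\mathcal{L}(\sigma, \varepsilon)$ has closed range and finite dimensional kernel.

Proof. Since the restriction $\mathcal{W} \to \mathcal{H}_T$ is compact the conclusion follows from Lemma 5.4 and the Abstract Closed Range Lemma (cf. [RS95]).

Lemma 5.6. For any $\varepsilon \in (0, \varepsilon_1)$ and $\sigma \notin i\mathbb{R}$, the adjoint operator $\mathcal{L}(\sigma, \varepsilon)^*$ has closed range and finite dimensional kernel.

Proof. The proof is similar to the proof of Lemma 5.5 and we omit it.

Lemmas 5.5 and 5.6 imply:

Lemma 5.7. For any $\varepsilon \in (0, \varepsilon_1)$ and $\sigma \notin i\mathbb{R}$, the operator $\mathcal{L}(\sigma, \varepsilon)$ is Fredholm.

Finally, we show

Lemma 5.8. For any $\varepsilon \in (0, \varepsilon_1)$ and $\sigma \notin i\mathbb{R}$, the Fredholm index of $\mathcal{L}(\sigma, \varepsilon)$ is zero.

Proof. Since $\mathcal{L}(\sigma, \varepsilon) - \mathcal{L}_\infty(\sigma, \varepsilon)$ is a small perturbation of $\mathcal{L}_\infty(\sigma, \varepsilon)$, and since this operator has a bounded inverse from $\mathcal{H}$ into $\mathcal{W}$, for any $\sigma \notin i\mathbb{R}$, a perturbation argument shows that $\mathcal{L}(\sigma, \varepsilon)$ is invertible for $\sigma$ in an open set in the right half plane $\mathrm{Re}\,\sigma > 0$, and for $\sigma$ in an open set in the left half plane $\mathrm{Re}\,\sigma < 0$. Hence, for $\sigma$ in these open subsets the Fredholm index of $\mathcal{L}(\sigma, \varepsilon)$ is zero. Since the Fredholm index of $\mathcal{L}(\sigma, \varepsilon)$ is constant on connected subsets of $\mathbb{C} \setminus i\mathbb{R}$, we conclude that its Fredholm index is zero, for any $\varepsilon \in (0, \varepsilon_1)$ and $\sigma \notin i\mathbb{R}$.

Proposition 5.9. For any $\varepsilon \in (0, \varepsilon_1)$, the entire imaginary axis $\sigma \in i\mathbb{R}$ belongs to the essential spectrum of $\mathcal{L}_\varepsilon$.

Proof. The proof is identical to the proof for $\mathcal{L}_\varepsilon^\infty$ from Lemma 5.1. The orthonormal sequence $w_\ell$, which was constructed there, satisfies $\mathcal{L}(\sigma, \varepsilon) w_\ell \to 0$ for $\ell \to \infty$.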

Stability of Solitary Waves

505

6. The Point Spectrum of Solitary Waves

The goal of this section is to prove Theorem 4. Equivalently, given the information on the essential spectrum from Theorem 3, we show that $\operatorname{Re}\sigma \neq 0$ belongs to the resolvent set for bounded $|\sigma|$ and small $\varepsilon$.

Proposition 6.1. For any $R > 0$, there exists $\varepsilon_2 > 0$ such that, for any $\varepsilon \in (0,\varepsilon_2)$ and any $\sigma \notin i\mathbb{R}$, $|\sigma| \le R$, the operator $L(\sigma,\varepsilon)$ is invertible.

The proposition is proved in several steps. Since we have bounds on the norm of $L_\infty(\sigma,\varepsilon)^{-1}$, uniformly for values $|\operatorname{Re}\sigma| \ge \delta > 0$, $|\sigma| \le R$, it is sufficient to consider a neighborhood of the imaginary axis $\sigma \in i[-R,R]$. We therefore concentrate on a neighborhood of $\sigma = iq$ for fixed $q$. There are then two different cases: (I) finite frequencies $q \neq 0$, (II) small frequencies $q = 0$. In both cases, we are interested in the kernel of the operator $L(\sigma,\varepsilon)$, which is Fredholm with index zero for $\operatorname{Re}\sigma \neq 0$. Elements of the kernel are bounded solutions of the abstract, non-autonomous, linear differential equation

$$W_x = D(\sigma,\varepsilon)W + A(\varepsilon)W. \tag{6.1}$$

It is sufficient to show that this ordinary differential equation does not possess any nontrivial, bounded solutions. We will see that, just as for the nonlinear steady equation, bounded solutions lie on a finite-dimensional, invariant manifold. To the abstract, quasilinear differential equation (6.1), we apply non-autonomous center-manifold reduction; see [Mi88]. The reduction is performed for $\sigma$ close to $iq \in i\mathbb{R}$ and $\varepsilon$ small. Note that for any $q$ fixed, finite, and $\varepsilon = 0$, the linear equation is a relatively bounded perturbation of the principal part

$$W_x = L_\infty(iq,0)W = (D_\infty(iq) + A_\infty(0))W,$$

with small relative bound. In Lemma 5.1 we proved the resolvent estimate

$$\|(ik - L_\infty(iq,0))^{-1}\|_{X \to X} \le \frac{C}{|k|},$$

for all $|k| \ge k_0(q)$, and we may apply the reduction theorem in [Mi88] in a neighborhood of any fixed point $q$, uniformly for bounded $q$.

6.1. The case of non-zero frequency.

6.1.1. The reduction. We exclude point spectrum in a neighborhood of $iq \neq 0$, case (I). Set $\sigma = iq + \delta$ and rewrite (6.1) as

$$W_x = L_\infty(iq,0)W + \delta B_\infty(\delta)W + \varepsilon^2\big(B_0 + B_1(x;\delta,\varepsilon)\big)W, \tag{6.2}$$

where $L_\infty(iq,0) = D_\infty(iq) + A_\infty(0)$, and

$$\delta B_\infty(\delta) = D_\infty(iq+\delta) - D_\infty(iq), \qquad \varepsilon^2 B_0 = A_\infty(\varepsilon^2) - A_\infty(0), \qquad B_1(x;\delta,\varepsilon) = D_1(x;iq+\delta,\varepsilon) + A_1(x;\varepsilon).$$

506

M. Haragus, A. Scheel

We view Eq. (6.2) as a small perturbation of the eigenvalue problem for $\delta = 0$ and $\varepsilon = 0$. This is justified by the following inequalities

$$\|B_\infty(\delta)\|_{Y\,(\text{resp. } X) \to Y\,(\text{resp. } X)} \le C(1+|q|), \qquad \|B_0\|_{Y\,(\text{resp. } X) \to Y\,(\text{resp. } X)} \le C, \qquad \|B_1(x;\delta,\varepsilon)\|_{Y \to X} \le C(1+|q|^2)\,e^{-\sqrt{\beta}\varepsilon|x|}$$

for $\varepsilon \in (0,\varepsilon_3)$ and any $q \neq 0$. The reduction procedure is performed for small $\varepsilon$ and $\delta$. We have to find the center eigenspace of the linear operator $L_\infty(iq,0)$. The linear operator $L_\infty(iq,0)$ is closed in $X$ with dense domain $Y$. Moreover, it has compact resolvent, so its spectrum consists only of isolated eigenvalues of finite multiplicities. As shown in the proof of Lemma 5.1, $\zeta$ is an eigenvalue of $L_\infty(iq,0)$ if

$$(iq+\zeta)^2\cos\zeta = (1 - b\zeta^2)\,\zeta\sin\zeta.$$

Imaginary solutions $\zeta = ik$ of this equation satisfy

$$(q+k)^2 = (1 + bk^2)\,k\tanh k.$$

We find exactly two simple roots $ik_1$ and $ik_2$ with $k_2 < 0 < k_1$ (since $b > 1/3$). Hence, $L_\infty(iq,0)$ has two simple, purely imaginary eigenvalues $ik_1$, $ik_2$. The corresponding eigenvectors are $w_{1,2}$

  =  

cosh(k1,2 y) + 21 qy 2 z˜ 1,2 ik1,2 cosh(k1,2 y) i˜z1,2

   ,  

−bk1,2 z˜ 1,2 where z˜ 1,2 = −

k1,2 sinh k1,2 (k1,2 + q) cosh k1,2 . =− 2 k1,2 + q λ + bk1,2
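The two imaginary roots $ik_1$, $ik_2$ of the dispersion relation can be located numerically. The following is a minimal sketch, not part of the original argument: the helper names (`dispersion`, `bisect`, `imaginary_roots`), the bracket size `k_max`, and the sample values $q = 1$, $b = 0.5$ are illustrative assumptions. It relies only on the sign change of $(1+bk^2)k\tanh k - (q+k)^2$ between $k = 0$ (where it equals $-q^2 < 0$) and large $|k|$ (where it grows like $b|k|^3$).

```python
import math

def dispersion(k, q, b):
    """Residual of (q + k)^2 = (1 + b k^2) k tanh k; vanishes at a root."""
    return (1.0 + b * k * k) * k * math.tanh(k) - (q + k) ** 2

def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Plain bisection; requires f(lo) and f(hi) to have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("no sign change on the bracket")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def imaginary_roots(q, b, k_max=50.0):
    """Return (k1, k2) with k2 < 0 < k1; k_max is an assumed bracket size."""
    f = lambda k: dispersion(k, q, b)
    k1 = bisect(f, 0.0, k_max)    # positive root
    k2 = bisect(f, -k_max, 0.0)   # negative root
    return k1, k2

# Illustrative parameter values (assumptions), with b > 1/3 and q != 0.
k1, k2 = imaginary_roots(q=1.0, b=0.5)
```

Bisection is used here because it needs no derivative and cannot leave the bracket; it finds one root on each half-line, which is consistent with, but does not prove, the statement that there are exactly two.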

The center manifold reduction implies that small bounded solutions of (6.2) are of the form

$$W(x) = a_1(x)w_1 + a_2(x)w_2 + O\big((|\delta| + \varepsilon^2)|a_j|\big). \tag{6.3}$$

For the amplitudes $a = (a_1, a_2)$, we find a linear, non-autonomous system of ordinary differential equations, depending on the eigenvalue parameter $\sigma$ and the bifurcation parameter $\varepsilon$,

$$a_x = A(x;\delta,\varepsilon)\,a. \tag{6.4}$$

At $\varepsilon = 0$, the $2 \times 2$-matrix $A$ does not depend on $x$ any more and possesses two distinct purely imaginary eigenvalues. In the remainder of this section, we set up a perturbation argument, which shows that for $\varepsilon$ small and $\operatorname{Re}\delta > 0$, there are no bounded solutions to (6.4).


6.1.2. Exponential dichotomies. At $\varepsilon = 0$, Eq. (6.4) is autonomous. At $\operatorname{Re}\sigma = 0$, the spectrum of the matrix $A$ consists precisely of the eigenvalues $\zeta_1 = ik_1$, $k_1 > 0$ and $\zeta_2 = ik_2$, $k_2 < 0$. Depending on $\delta = \sigma - iq$, the eigenvalues may move off the imaginary axis. A direct computation shows that $d\zeta_1/d\delta > 0$ and $d\zeta_2/d\delta < 0$, such that the eigenvalues leave the axis, with non-vanishing speed, in opposite directions. In particular, for $\operatorname{Re}\sigma > 0$ small and $\varepsilon = 0$, we find that (6.4) is a hyperbolic, linear ordinary differential equation. The eigenspaces are analytic in $\delta = \sigma - iq$. For $\varepsilon > 0$, the eigenvalues $\zeta_1$ and $\zeta_2$ still describe the dynamics at $x = \pm\infty$, since the solitary wave and therefore the coefficients of the matrix $A(x;\delta,\varepsilon)$ converge to zero, with rate $e^{-\sqrt{\beta}\varepsilon|x|}$. Therefore, when $\operatorname{Re}\delta > 0$, the dynamics for $|x| \to \infty$ are hyperbolic, with stable eigenvalue $\zeta_2$ and unstable eigenvalue $\zeta_1$. The following lemma on exponential dichotomies shows in which sense the hyperbolic structure can be continued to finite $x$. We therefore consider a general non-autonomous, linear differential equation

$$a_x = A(x;\delta,\mu)\,a, \qquad a \in \mathbb{R}^n, \tag{6.5}$$

depending on a real parameter $\mu$ and a complex spectral parameter $\delta$. In our example, $\mu$ represents the (small) parameter $\varepsilon$.

Lemma 6.2. Consider (6.5) with fundamental solution $\varphi(x,y)$. Assume asymptotically constant coefficients $A(x;\delta,\mu) \to A_\pm(\delta,\mu)$ as $x \to \pm\infty$, and smoothness: $A$ and $A_\pm$ are $C^k$ in the parameter $\mu \in U_\mu \subseteq \mathbb{R}$, $k \ge 0$, and analytic in the spectral parameter $\delta \in U_\delta \subseteq \mathbb{C}$, and $A$ is continuous in $x$. Furthermore assume that $A_\pm$ are hyperbolic, that is, they do not possess eigenvalues on the imaginary axis, for all $\mu \in U_\mu$ and all $\delta \in U_\delta$. Then there exists a unique decomposition of the phase space $\mathbb{R}^n$ into linear, stable and unstable subspaces $E^s_+(x;\delta,\mu)$ and $E^u_-(x;\delta,\mu)$, which are as smooth as $A$. The subspaces are invariant under the linear evolution $\varphi(x,y)$:

$$\varphi(x,y)E^s_+(y) = E^s_+(x), \qquad \varphi(x,y)E^u_-(y) = E^u_-(x).$$

Moreover, any initial value to a bounded solution on $[0,\infty)$ is contained in $E^s_+(0)$ and initial values to bounded solutions on $(-\infty,0]$ are contained in $E^u_-(0)$. On the other hand, there are positive constants $C$, $\eta_+ > 0$, and $\eta_- > 0$ such that we have uniform exponential decay for solutions in forward time,

$$|\varphi(x,y)a| \le Ce^{-\eta_+|x-y|}|a|$$

for all $a \in E^s_+(y)$, $x \ge y \ge 0$, and in backward time

$$|\varphi(x,y)a| \le Ce^{-\eta_-|x-y|}|a|$$

for all $a \in E^u_-(y)$, $x \le y \le 0$. The constants $C$ and $\eta_\pm$ can be chosen independently of $\mu$, $\delta$ in compact subsets of $U_\mu \times U_\delta$.

For the proof, see [Co78], for example. By the above lemma, we find nontrivial, bounded solutions if and only if stable and unstable subspaces intersect nontrivially:

$$E^u_-(0) \cap E^s_+(0) \neq \{0\}.$$

We may choose bases $a^j_\pm$, analytic in $\delta$ and continuous in $\mu$, in the two subspaces and compute the determinant

$$E(\delta;\mu) = \det\big(a^j_\pm\big). \tag{6.6}$$

A variant of this analytic function is usually referred to as the Evans function [Ev72, AGJ90]. Clearly, zeroes of $E$ detect precisely the nontrivial bounded solutions to (6.4), and therefore the point spectrum coincides with the zeroes of $E$. The algebraic multiplicity of eigenvalues coincides with the order of the zeroes of $E$; see [AGJ90]. By analyticity in $\delta$ and continuous dependence on $\mu$, the number of zeroes counted with multiplicity varies continuously with $\mu$. We are going to exploit this fact in Sect. 6.2.

In our setting, both subspaces are well-defined and complex one-dimensional for $\operatorname{Re}\delta > 0$. They are spanned by the complex vectors $a^s_+(0)$ and $a^u_-(0)$, which lead to solutions $a^s_+(x)$ and $a^u_-(x)$. It is our goal to show that both solutions can be extended, analytically in $\delta$ and continuously in $\varepsilon$, in an open neighborhood of $\delta = 0$, in particular across the imaginary axis, where hyperbolicity at $x = \pm\infty$ is lost, into the left half plane. We show that in the limit $\varepsilon = 0$ and $\operatorname{Re}\delta = 0$, the initial values $a^s_+(0)$ and $a^u_-(0)$ converge to eigenvectors $e_2$ and $e_1$ to the eigenvalues $\zeta_2$ and $\zeta_1$, respectively. In particular, $E(0;0) \neq 0$, and by continuity, we can exclude unstable eigenvalues in a neighborhood of $\sigma = iq$.

6.1.3. A gap lemma. The goal here is to continue the Evans function across the essential spectrum. The idea is to exploit rapid convergence of the coefficients $A(x)$ of the non-autonomous differential equation, compensating for the loss of hyperbolicity in the asymptotic equation at $x = \pm\infty$. The main idea was already used in [GZ98, Theorem 2.3] and [KS98, Lemma 2.2]. We recall the results stated there.

Theorem 5 ([GZ98, KS98]). Consider a non-autonomous, linear differential equation $a_x = A(x;\delta,\mu)a$, $a \in \mathbb{R}^n$, with fundamental solution $\varphi(x,y)$, with parameters $\delta \in U_\delta(0) \subset \mathbb{C}$ and $\mu \in U_\mu(0) \subset \mathbb{R}$ close to the origin. Assume exponential convergence to asymptotically constant coefficients,

$$|A(x;\delta,\mu) - A_\infty(\delta,\mu)| \le Ce^{-\eta|x|},$$

with positive constants $C, \eta > 0$. Assume furthermore that $A$ and $A_\infty$ are $C^k$ in $\mu$, $k \ge 0$, and analytic in $\delta$, and $A$ is continuous in $x$. At $\mu = 0$, $\delta = 0$, we require the existence of a spectral projection $P$ to $A_\infty$ such that $\operatorname{Re}\operatorname{spec} PA_\infty \le 0$ and $\operatorname{Re}\operatorname{spec}(\mathrm{id} - P)A_\infty \ge 0$. Then there exists a unique decomposition of the phase space $\mathbb{R}^n$ into linear, stable and unstable subspaces $E^s_+(x;\delta,\mu)$ and $E^u_-(x;\delta,\mu)$, which are as smooth as $A$. The subspaces are invariant under the linear evolution $\varphi(x,y)$:

$$\varphi(x,y)E^s_+(y) = E^s_+(x), \qquad \varphi(x,y)E^u_-(y) = E^u_-(x).$$

Solutions to initial values in $E^s_+(0)$ converge to $E^s(\delta,\mu)$ as $x \to \infty$, where the eigenspace $E^s(\delta,\mu)$ smoothly depends on $\delta$ and $\mu$ and coincides with the range $\operatorname{Im} P$ for $\mu = 0$, $\delta = 0$. Also, solutions to initial values in $E^u_-(0)$ converge to $E^u(\delta,\mu)$ as $x \to -\infty$, where the eigenspace $E^u(\delta,\mu)$ smoothly depends on $\delta$ and $\mu$ and coincides with the kernel $\operatorname{Ker} P$ for $\mu = 0$, $\delta = 0$. In particular, for parameter values $\delta$, $\mu$ where the eigenspaces $E^{s/u}(\delta,\mu)$ are actually the stable and unstable eigenspaces, respectively, the subspaces $E^s_+(x;\delta,\mu)$ and $E^u_-(x;\delta,\mu)$ coincide with the subspaces from Lemma 6.2.

In our problem, one additional difficulty arises. The convergence rate $\eta$ of the non-autonomous perturbation depends on $\mu = \varepsilon$. The rate, $\sqrt{\beta}\varepsilon$, although fast compared to the eigenvalues of the asymptotic matrix, which are $O(\varepsilon^2)$, is not bounded away from zero, as required in the above theorem. We therefore restate a parameter-dependent version of these results, taking into account the different orders of convergence of the solitary wave and possible eigenfunctions.

Proposition 6.3. Consider a non-autonomous, linear differential equation $a_x = A(x;\delta,\mu)a$, depending on a parameter $\mu \in \mathbb{R}^p$ and an eigenvalue parameter $\delta \in \mathbb{C}$, in a neighborhood of the origin in $\mathbb{R}^p \times \mathbb{C}$. Assume that the coefficients $A$ are $C^k$, $k \ge 0$, in $\mu$ and analytic in $\delta$, and that $A$ is continuous in $x$. Furthermore assume that $A(x;\delta,\mu)$ converge to constant matrices as $|x| \to \infty$,

$$|A(x;\delta,\mu) - A_\infty(\delta,\mu)| \le C|\mu|e^{-\eta(\mu)|x|},$$

and, as $\mu \to 0$,

$$|A(x;\delta,\mu) - A_0(\delta)| \le C|\mu|.$$

Assume $\operatorname{spec} A_0(0) \subset i\mathbb{R}$, and $A_\infty(\delta,\mu)$ is hyperbolic for $\operatorname{Re}\delta \neq 0$, with

$$|(ik - A_\infty(\delta,\mu))^{-1}| \le \frac{C}{|\operatorname{Re}\delta|}, \tag{6.7}$$

for all $k \in \mathbb{R}$ and with $C > 0$ independent of $\mu \ge 0$. Suppose that spatial convergence of the coefficients is fast compared to the rate of hyperbolicity: $\mu/\eta(\mu) \to 0$ as $\mu \to 0$. Then, the Evans function $E(\delta;\mu)$, defined for $\operatorname{Re}\delta > 0$, can be extended continuously in $\mu$ and analytically in $\delta$, in a sector $\{(\delta,\mu);\ -\operatorname{Re}\delta \le M|\mu|\}$, for any fixed constant $M > 0$. In the limit $\mu \to 0$, we find $E(\delta;0) \neq 0$ for $\delta$ close to zero.

Proof. For any $\mu \neq 0$ small, the conclusions of the proposition directly follow from the gap lemma, Theorem 5. We have to show that the limit $\mu \to 0$ of $E(\delta;\mu)$ exists, and is nonzero. For $\operatorname{Re}\delta > 0$, $\mu \ge 0$, the equation possesses exponential dichotomies, as stated in Lemma 6.2. The subspaces can actually be constructed from a fixed point argument. We focus on $E^s_+$ first. From the resolvent estimate (6.7), we conclude that the subspaces $E^{s/u}_+(\delta)$ corresponding to stable and unstable eigenvalues for the equation with $\mu = 0$ continue analytically in a neighborhood of $\delta = 0$. We write $P_+$ for the projection on $E^s_+(0)$ along $E^u_+(0)$, and $B(y) := A(y) - A_\infty$, suppressing the dependence on $\delta$ and

µ. For $\operatorname{Re}\delta > 0$, solutions $a(x)$ which are bounded on $x \ge 0$ then solve the integral equation

$$a(x) = e^{A_\infty x}a_0 + \int_0^x e^{A_\infty(x-y)}P_+B(y)a(y)\,dy + \int_\infty^x e^{A_\infty(x-y)}(\mathrm{id} - P_+)B(y)a(y)\,dy$$

with $a_0 = P_+a(0)$. We substitute $\hat a(x) = e^{-A_\infty x}a(x)$ and arrive at

$$\hat a(x) = a_0 + \int_0^x e^{-A_\infty y}P_+B(y)e^{A_\infty y}\hat a(y)\,dy + \int_\infty^x e^{-A_\infty y}(\mathrm{id} - P_+)B(y)e^{A_\infty y}\hat a(y)\,dy.$$

We view the right side as an affine operator on the space of bounded, continuous functions on $[0,\infty)$, equipped with the supremum norm. Since $|B(y)| \le C|\mu|e^{-\eta(\mu)|y|}$, and $|e^{A_\infty y}| \le Ce^{C|\mu|y}$ for $-\operatorname{Re}\delta \le C|\mu|$, we find that the norm of the linear part of the right side is bounded by $C\mu/\eta(\mu)$, which converges to zero for $\mu \to 0$ by assumption. We therefore find a unique solution $\hat a(x)$ in the sector, which converges to the constant solution as $\mu \to 0$. We find the stable subspace as $\hat a(0)$, parameterized over $a_0$. The construction of the unstable subspace is similar. In the limit, $\mu = 0$, we find the Evans function for the constant coefficient equation, which is nonzero, since we have a spectral decomposition on the imaginary axis corresponding to the limits of stable and unstable subspaces.

Together with the considerations in Sect. 6.1.2, this proves absence of point spectrum in a neighborhood of the imaginary axis, outside a given small neighborhood of the origin, which we consider next.

6.2. The case of small frequency. We exclude point spectrum in a neighborhood of the origin $\sigma = 0$, off the imaginary axis. As a first step, we reduce the eigenvalue problem to finding non-trivial solutions to a four-dimensional non-autonomous ordinary differential equation, Sect. 6.2.1. We then introduce and justify a long-wave scaling corresponding to the Korteweg–de Vries limit, Sect. 6.2.2. We then recall from [PW92] the structure of the spectrum in the scaling limit, where we find the spectrum of the Korteweg–de Vries soliton, Sect. 6.2.3. The last part of this chapter, Sect. 6.2.4, is devoted to the central perturbation arguments. We show that the spectrum of the capillary-gravity waves coincides with the point spectrum of the Korteweg–de Vries soliton in a neighborhood of the imaginary axis.

6.2.1. The reduction. Rewrite (6.1) for $\sigma = \delta$ small as

$$W_x = L_\infty(0,0)W + \delta B_\infty(\delta)W + \varepsilon^2\big(B_0 + B_1(x;\delta,\varepsilon)\big)W, \tag{6.8}$$

where $L_\infty(0,0) = A_\infty(0)$, and

$$\delta B_\infty(\delta) = D_\infty(\delta), \qquad \varepsilon^2 B_0 = A_\infty(\varepsilon^2) - A_\infty(0), \qquad B_1 = D_1 + A_1.$$

Recall that $A_\infty(0) = \tilde A(1)$, so it is exactly the linear operator used for the analysis of the steady problem in Theorem 1. From those results we find that $A_\infty(0)$ has only one purely imaginary eigenvalue $\zeta = 0$, with algebraic multiplicity four. The corresponding (generalized) eigenvectors are $w_0, w_1, w_2, w_3$ found in the proof of Theorem 1.

The center manifold reduction implies that the bounded solutions of (6.8) are of the form

$$W(x) = a_0(x)w_0 + a_1(x)w_1 + a_2(x)w_2 + a_3(x)w_3 + O\big((|\delta| + \varepsilon^2)|a_j|\big),$$

and the amplitudes $a_j$ satisfy a non-autonomous, linear, reduced system of the form

$$\begin{aligned}
a_{0,x} &= a_1 + \delta(c_{00}a_0 + c_{02}a_2) + \varepsilon^2(c_{01}a_1 + c_{03}a_3) + \varepsilon^2 f_0(x;a_1,a_2,a_3) + O\big((|\delta|+\varepsilon^2)^2|a_j|\big),\\
a_{1,x} &= a_2 + \delta(c_{11}a_1 + c_{13}a_3) + \varepsilon^2 f_1(x;a_1,a_2,a_3) + O\big((|\delta|+\varepsilon^2)^2|a_j|\big),\\
a_{2,x} &= a_3 + \delta(c_{20}a_0 + c_{22}a_2) + \varepsilon^2(c_{21}a_1 + c_{23}a_3) + \varepsilon^2 f_2(x;a_1,a_2,a_3) + O\big((|\delta|+\varepsilon^2)^2|a_j|\big),\\
a_{3,x} &= \delta(c_{31}a_1 + c_{33}a_3) + \varepsilon^2 f_3(x;a_1,a_2,a_3) + O\big((|\delta|+\varepsilon^2)^2|a_j|\big).
\end{aligned} \tag{6.9}$$

The constants $c_{ij}$ are $O(1)$ and can be determined explicitly. In particular, we have $c_{20} = c_{31} = -\beta$ and $c_{21} = \beta$. Note that the functions $f_j$ are independent of $a_0$. This is due to the invariance of (2.1)–(2.4) under → + const. which implies the invariance of the reduced system under $a_0 \to a_0 + \text{const.}$ if $\delta = 0$. A direct calculation of the relevant terms gives

$$\varepsilon^2 f_2(x;a_1,a_2,a_3) = -\beta u_0 a_1 + \varepsilon^2 f_{22}(x;a_2,a_3), \qquad \varepsilon^2 f_3(x;a_1,a_2,a_3) = -2\beta u_{0,x}a_1 - 2\beta u_0 a_2 + \varepsilon^2 f_{33}(x;a_3)$$

with $u_0$ from Theorem 1 (i).

6.2.2. Justifying the Korteweg–de Vries scaling. As a first step, we prove that any eigenvalue $\delta$, $\operatorname{Re}\delta \neq 0$, is necessarily located in an $O(\varepsilon^3)$-neighborhood of the origin. Suppose therefore $\varepsilon = \nu|\delta|^{1/3}$ with $\nu$ small. We shall prove that the system (6.9) does not possess non-trivial, bounded solutions, provided $\operatorname{Re}\delta \neq 0$. We may scale the system (6.9) according to

$$\xi = |\delta|^{1/3}x, \qquad a_j(x) = |\delta|^{j/3}A_j(\xi), \qquad j = 0,\dots,3,$$

and obtain

$$\begin{aligned}
A_{0,\xi} &= A_1 + O(\delta^{2/3} + \nu^2\delta^{2/3}),\\
A_{1,\xi} &= A_2 + O(\delta^{2/3} + \nu^2\delta^{1/3}),\\
A_{2,\xi} &= A_3 - \beta e^{i\arg(\delta)}A_0 + O(\delta^{2/3} + \nu^2),\\
A_{3,\xi} &= -\beta e^{i\arg(\delta)}A_1 + O(\delta^{2/3} + \nu^2).
\end{aligned} \tag{6.10}$$

At $\nu = \delta = 0$, we have an autonomous linear ODE with eigenvalues $\zeta_0 = 0$, $\zeta^3_{1,2,3} = -2\beta e^{i\arg(\delta)}$, and corresponding eigenvectors $A^0_0 = 1$, $A^0_1 = A^0_2 = 0$, $A^0_3 = -\beta e^{i\arg(\delta)}$, and $A^k_j = (-\zeta_k)^j$, $k = 1,2,3$ and $j = 0,\dots,3$. Now suppose first $\operatorname{Re}\delta \neq 0$. Then $\zeta_j$, $j = 1,2,3$, are hyperbolic. Therefore the eigenspace to the eigenvalue $\zeta_0$ forms a normally hyperbolic center-manifold for the linear flow. This center-manifold persists

under small, non-autonomous perturbations and contains all bounded solutions (we may construct the center-manifold as the robust intersection of center-stable manifold at $x = \infty$ and center-unstable manifold at $x = -\infty$). On the other hand, the eigenvalue $\zeta_0 = 0$ is easily seen from (5.2) to move off the imaginary axis whenever ζ moves off the axis. But this eigenvalue determines the asymptotic behavior of solutions in the center-manifold at $x = +\infty$ and $x = -\infty$. If now $\delta$ approaches the imaginary axis, we have to refine the arguments as in Case I, above. Using the gap lemma, Proposition 6.3, we continue the center-stable manifold at $x = +\infty$ and the center-unstable manifold at $x = -\infty$ smoothly across the imaginary axis, exploiting fast convergence of the non-autonomous terms on the scale $O(\varepsilon)$, compared to the order of the perturbation, $O(\varepsilon^2)$. We omit the details, which are similar to the case of non-zero frequency, Sect. 6.1.

6.2.3. The Korteweg–de Vries limit. We may now assume that the eigenvalue $\delta$ is necessarily of the order $\varepsilon^3$ and therefore scale $\delta = \varepsilon^3 H$. We obtain in the KdV-scaling

$$\xi = \varepsilon x, \qquad a_j(x) = \varepsilon^j A_j(\xi), \qquad j = 0,\dots,3,$$

the scaled reduced system

$$\begin{aligned}
A_{0,\xi} &= A_1 + O(\varepsilon^2),\\
A_{1,\xi} &= A_2 + O(\varepsilon),\\
A_{2,\xi} &= A_3 - \beta HA_0 + \beta A_1 - \beta A_1^*A_1 + O(\varepsilon^2),\\
A_{3,\xi} &= -\beta HA_1 - 2\beta A^*_{1,\xi}A_1 - 2\beta A_1^*A_2 + O(\varepsilon).
\end{aligned} \tag{6.11}$$

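Here $A_1^*$ is the steady solitary wave solution of the KdV-equation

$$2\beta A_{1,\tau} + A_{1,\xi\xi\xi} - \beta A_{1,\xi} + 3\beta A_1A_{1,\xi} = 0, \qquad A_1^*(\xi) = \operatorname{sech}^2\!\Big(\frac{\sqrt{\beta}\,\xi}{2}\Big). \tag{6.12}$$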
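That the $\operatorname{sech}^2$ profile solves the steady part of (6.12) can be checked numerically. The sketch below is an illustration rather than part of the paper: it uses the once-integrated steady equation $A'' - \beta A + \tfrac{3}{2}\beta A^2 = 0$ (the integration constant vanishes for decaying solutions), approximates $A''$ by a central difference, and the function names, step size `h`, sample grid and the value $\beta = 2$ are assumptions chosen for the example.

```python
import math

def soliton(xi, beta):
    """KdV solitary wave A_1^*(xi) = sech^2(sqrt(beta) * xi / 2)."""
    return 1.0 / math.cosh(0.5 * math.sqrt(beta) * xi) ** 2

def steady_residual(xi, beta, h=1e-3):
    """Residual of A'' - beta*A + (3*beta/2)*A^2 = 0 at xi.

    A'' is approximated by a second-order central difference with step h.
    """
    a_m = soliton(xi - h, beta)
    a_0 = soliton(xi, beta)
    a_p = soliton(xi + h, beta)
    a_xx = (a_p - 2.0 * a_0 + a_m) / (h * h)
    return a_xx - beta * a_0 + 1.5 * beta * a_0 ** 2

# Illustrative value of beta (assumption); sample xi on [-10, 10].
beta = 2.0
max_residual = max(abs(steady_residual(0.1 * n, beta)) for n in range(-100, 101))
```

The residual is limited by the $O(h^2)$ discretization error of the finite difference, so it is small but not exactly zero.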

We consider the case $\varepsilon = 0$ first. We transform variables $B_0 = A_0$, $B_1 = A_1$, $B_2 = A_2$, $B_3 = A_3 - \beta HA_0 + \beta A_1 - \beta A_1^*A_1$ and obtain at $\varepsilon = 0$,

$$\begin{aligned}
B_{0,\xi} &= B_1,\\
B_{1,\xi} &= B_2,\\
B_{2,\xi} &= B_3,\\
B_{3,\xi} &= -2\beta HB_1 + \beta B_2 - 3\beta A^*_{1,\xi}B_1 - 3\beta A_1^*B_2,
\end{aligned} \tag{6.13}$$

which is the KdV-equation, linearized in the soliton solution $A_1^*$, for $B_1 = B_{0,\xi}$. The equation at $|\xi| = \infty$ reduces to

$$B_{0,\xi} = B_1, \qquad B_{1,\xi\xi\xi} + 2\beta HB_1 - \beta B_{1,\xi} = 0$$

with characteristic polynomial $\zeta^4 + 2\beta H\zeta - \beta\zeta^2$ for the $\zeta$-eigenvalues, determining exponential spatial decay or growth of possible eigenfunctions. Besides $\zeta = 0$ with eigenvector $(1,0,0,0)^T$, we have precisely the spectrum of the linearization about the

KdV-soliton. In particular, dynamics in the space $(1,0,0,0)^\perp$ are precisely the (linear) dynamics around the KdV-soliton. This strongly suggests that eigenfunctions will appear wherever the KdV-soliton possesses eigenfunctions, and nowhere else. Given the stability of the KdV-soliton [PW92], this would then prove stability of the solitary wave in the Euler-equations! We construct in the sequel a more refined picture of the spectrum at $\varepsilon = 0$, which will, in particular, be persistent for $\varepsilon > 0$. First of all, we note that the trivial zero-eigenvalue moves out of zero as soon as $\varepsilon$ becomes positive and $H$ non-zero. This can be readily seen from (5.2), by substituting the KdV-scaling $\delta = \varepsilon^3H$ and $\zeta = \varepsilon Z$. From the dispersion relation (5.2) we then obtain a new equation for $Z$, $H$ and $\varepsilon$. The Taylor expansion of this equation in $\varepsilon^2$ is, up to third order,

$$\varepsilon^4\Big[\Big(b - \frac13\Big)Z^4 - Z^2 + 2HZ + \varepsilon^2\Big(-\frac16\Big(b - \frac15\Big)Z^6 + \frac16Z^4 - HZ^3 + H^2\Big) + O(\varepsilon^4)\Big] = 0. \tag{6.14}$$

To second order in $\varepsilon^2$, there is still one eigenvalue $\zeta_0 = 0$ which can be seen to be perturbed to $\zeta_0 = -\frac12\varepsilon^2H + O(\varepsilon^4H)$ by the third order terms in $\varepsilon^2$. We emphasize here that all eigenvalues are, for $\varepsilon \ge 0$ small, smooth functions in $H$ and $\varepsilon$.

6.2.4. Perturbing the Korteweg–de Vries spectrum. We consider the scaled eigenvalue problem for the water-waves (6.11) as a small perturbation of the eigenvalue problem for the KdV-equation (6.13). We distinguish three cases, with increasing difficulty. First we consider $H$ bounded away from the imaginary axis. We then continue the arguments for $H$ close to the imaginary axis, but bounded away from the origin. Finally, we study the eigenvalue problem for $H$ in a neighborhood of the origin.

(I) Eigenvalues far from the imaginary axis. Suppose first that $\operatorname{Re}H \ge \nu_* > 0$ for some $\nu_* > 0$. We have to exclude bounded solutions to (6.11) for $\varepsilon > 0$, small. As in Sect. 6.1, we exploit the fact that the $\xi$-dependent coefficients in (6.11) converge exponentially as $|\xi| \to \infty$, uniformly in $\varepsilon \ge 0$. In order to construct stable and unstable subspaces as in Sect. 6.1, we discuss the spatial eigenvalues $\zeta_j$ of (6.11) at $|\xi| = \infty$. From the scaled dispersion relation (6.14), we find two eigenvalues with positive real part, $\zeta_1$ and $\zeta_3$, one eigenvalue with negative real part, which we call $\zeta_2$, and the eigenvalue $\zeta_0$, which for $\varepsilon = 0$ remains in the origin, and moves into the left half plane for $\varepsilon > 0$:

$$\operatorname{Re}\zeta_1,\ \operatorname{Re}\zeta_3 > 0, \qquad \operatorname{Re}\zeta_0 \le 0,\ \operatorname{Re}\zeta_2 < 0.$$

With Lemma 6.2, we can construct linear subspaces $E^s(0)$ and $E^u(0)$, such that all initial values at $\xi = 0$ of the linear equation (6.11) leading to bounded solutions on $\mathbb{R}_+$ or $\mathbb{R}_-$ are contained in $E^s(0)$ or $E^u(0)$, respectively. Both subspaces depend analytically on $H$, $\operatorname{Re}H \ge \nu_* > 0$, and smoothly on $\varepsilon \ge 0$. Choosing analytic bases $B^{1/2}_{s/u}$ in $E^{s/u}(0)$, we can compute the Evans function

$$E(H;\varepsilon) = \det\big(B^1_s, B^2_s, B^1_u, B^2_u\big).$$

We show that $E(H;0)$ is nonzero for $\operatorname{Re}H \ge 0$. By continuity in $\varepsilon$ and the previous considerations for large $H$, this excludes eigenvalues in $\operatorname{Re}H \ge \nu_* > 0$.

The Evans function $E(H;0)$ can be computed almost explicitly from (6.13). Recall that the equation for $(B_1, B_2, B_3)$ does not depend on $B_0$ and is precisely the linearization about the KdV-soliton. We therefore define the subspace $(0,*,*,*) = (1,0,0,0)^\perp$ as the KdV-subspace. This subspace is not flow-invariant, but the dynamics in this subspace are independent of the value of $B_0$ in the first component if $\varepsilon = 0$. This gives the equations a skew-product structure. We may first solve the equation in the KdV-subspace and then solve the equation for $B_0$. Within the KdV-subspace, we find the eigenvalues $\zeta_1$, $\zeta_2$, and $\zeta_3$. We find the stable and unstable subspaces $E^s_{KdV}(0)$ and $E^u_{KdV}(0)$ by intersecting the subspaces $E^s(0)$ and $E^u(0)$ with the KdV-subspace. In particular, $E^s_{KdV}(0)$ is one-dimensional and $E^u_{KdV}(0)$ is two-dimensional. Choosing analytic bases in these two subspaces, we can compute an analytic function $E_{KdV}(H)$, the Evans function of the KdV-soliton. We are now going to use information from [PW92] on the zeroes of $E_{KdV}(H)$.

Theorem 6 ([PW92]). The Evans function $E_{KdV}(H)$ for the KdV-soliton can be extended analytically into $\operatorname{Re}H > -4/3$. It vanishes precisely in the origin, where we have $E_{KdV}(0) = 0$, $E'_{KdV}(0) = 0$, $E''_{KdV}(0) \neq 0$.

From this information, we can infer absence of zeroes for $E(H;0)$ in $\operatorname{Re}H \ge \nu_*$.

Lemma 6.4. The reduced, scaled Evans function of the water-wave problem, $E(H;0)$, and the Evans function for the KdV-soliton, $E_{KdV}(H)$, differ by a non-vanishing analytic function $S(H)$:

$$E(H;0) = S(H)E_{KdV}(H); \qquad S(H) \neq 0$$

for $\operatorname{Re}H > 0$.

Proof. We compute $E(H)$ choosing a particular analytic basis in $E^s(0)$ and $E^u(0)$. Note first that $B^1_s := B_0 = (1,0,0,0)^T \in E^s(0)$ since this vector is constant under time-$\xi$ evolution. Next, let $B^2_{s,KdV}(H), B^1_{u,KdV}(H), B^2_{u,KdV}(H) \in \mathbb{C}^3$ denote the basis vectors for stable and unstable KdV-subspaces $E^{s/u}_{KdV}(0)$. Solving $B_{0,\xi} = B_1$, with $B_1$ given from the KdV-subspace, with initial condition $B^j_{s/u,KdV}(H)$, we find particular bases $B^j_{s/u}$ of $E^{s/u}(0)$, which coincide with $B^j_{s/u,KdV}$ in the KdV-subspace. Since $B^1_s = (1,0,0,0)^T$, we find that in these coordinates the determinant $\det(B^1_s, B^2_s, B^1_u, B^2_u)$ is of the form

$$E(H;0) = \det\begin{pmatrix}
1 & * & * & *\\
0 & \big(B^2_{s,KdV}\big)_1(H) & \big(B^1_{u,KdV}\big)_1(H) & \big(B^2_{u,KdV}\big)_1(H)\\
0 & \big(B^2_{s,KdV}\big)_2(H) & \big(B^1_{u,KdV}\big)_2(H) & \big(B^2_{u,KdV}\big)_2(H)\\
0 & \big(B^2_{s,KdV}\big)_3(H) & \big(B^1_{u,KdV}\big)_3(H) & \big(B^2_{u,KdV}\big)_3(H)
\end{pmatrix} = \det\big(B^2_{s,KdV}(H), B^1_{u,KdV}(H), B^2_{u,KdV}(H)\big) = E_{KdV}(H).$$

Choosing different analytic bases, the determinant only differs by a nonzero, analytic factor, which proves the lemma.

Corollary 6.5. The scaled Evans function of the water-wave problem $E(H;0)$ does not vanish in the right half plane. In particular, for $0 < \varepsilon \le \varepsilon_*(\nu)$, there are no unstable eigenvalues of the solitary wave in $\operatorname{Re}\delta \ge \nu\varepsilon^{3/2}$.

(II) Eigenvalues close to the imaginary axis. We show that we may continue the construction from Lemma 6.4 across the imaginary axis, outside a neighborhood of the origin.

Lemma 6.6. The reduced Evans function $E(H;\varepsilon)$ can be continued analytically in $H$ and continuously in $\varepsilon$ in a region $\{\operatorname{Re}H \ge -\nu,\ |H| \ge \nu\} \subset \mathbb{C}$.

Proof. We have to show that the stable and unstable subspaces $E^s(0)$ and $E^u(0)$ continue analytically in $H$ and continuously in $\varepsilon$ across the imaginary axis. This in turn is an immediate consequence of the gap lemma, Theorem 5.

Corollary 6.7. The scaled Evans function of the water-wave problem $E(H;\varepsilon)$ does not vanish in a region $\{\operatorname{Re}H \ge -\nu,\ |H| \ge \nu\} \subset \mathbb{C}$.

In order to finish the proof, it remains to exclude eigenvalues for the perturbed, scaled eigenvalue problem (6.14) in a neighborhood of the origin.

(III) Eigenvalues close to the origin. Finally, we address the crucial neighborhood of the origin. We may already suspect that transversality as above might not hold, since already the KdV-equation possesses an eigenvalue $H = 0$ of algebraic multiplicity two, embedded in the essential spectrum. Again, the strategy consists of first continuing the Evans function $E(H;\varepsilon)$ for the water-wave problem analytically in $H$ and continuously in $\varepsilon$ in a neighborhood of the origin. As a second step, we show how this Evans function is related to the Evans function of the Korteweg–de Vries equation, $E_{KdV}(H)$. The goal of this step is to conclude that for all $\varepsilon \ge 0$ sufficiently small, $E$ possesses at most three zeroes in a neighborhood of the origin, exploiting that the number of zeroes of an analytic function is invariant under small perturbations. We then conclude the stability proof by exhibiting two explicit eigenvectors in the kernel and an explicit principal vector in the generalized kernel.

We start with some notational preliminaries for the asymptotic equation at $|\xi| = \infty$. The eigenvalues of the linear equation on the right side of (6.13) at $H = 0$, $|\xi| = \infty$ are $\zeta_s = \zeta_u = 0$, a double zero eigenvalue, and $\zeta_{ss} = -\sqrt{\beta}$ and $\zeta_{uu} = \sqrt{\beta}$. The zero eigenvalue is geometrically simple with eigenvector $(1,0,0,0)^T$. The central observation now is that for $H, \varepsilon \neq 0$ the zero eigenvalues unfold smoothly:

$$\zeta_s = -\frac12\varepsilon^2H + O(\varepsilon^2H^2), \qquad \zeta_u = 2H + O(H^2 + \varepsilon^2H).$$
These expansions are readily computed from the Newton polygon to (6.14), with leading order contribution $-Z^2 + 2HZ + \varepsilon^2H^2$. Eigenvectors are smooth as well and given by $e_j = (-1, \zeta_j, \zeta_j^2, \zeta_j^3)$ for $j = s, u, ss, uu$. For $\varepsilon > 0$, $\operatorname{Re}H > 0$, the stable eigenspace is spanned by $E^s = \operatorname{span}\{e_s, e_{ss}\}$ and the unstable eigenspace by $E^u = \operatorname{span}\{e_u, e_{uu}\}$. At $H = 0$, we find a nontrivial intersection of stable and unstable subspaces $E^s \cap E^u = \operatorname{span}\{e_s\} = \operatorname{span}\{e_u\}$. We emphasize that this smooth unfolding is non-generic: in a typical unfolding of the Jordan block with a parameter $H$, the eigenvalues are smooth functions of $\sqrt{H}$! The smooth unfolding here is due to reversibility: in the scaled dispersion relation (6.14), there is no linear term $H$, which would make the leading order contribution in the $Z$-$H$ Newton polygon for (6.14) to be $-Z^2 + H = 0$, with $Z \sim \sqrt{H}$. Reversibility implies invariance of the dispersion relation under $Z \mapsto -Z$ and $H \mapsto -H$, for all $\varepsilon$! It is this symmetry which excludes linear terms in $H$.

We next show that the subspaces $E^s(\xi)$ and $E^u(\xi)$, constructed for $H$ outside a neighborhood of zero above, can be continued analytically in $H$ and smoothly in $\varepsilon$ across this neighborhood.

Lemma 6.8. The Evans function $E(H;\varepsilon)$ to the scaled linearization about the solitary wave in the water-wave problem (6.14) possesses an analytic extension into an open neighborhood of the origin $|H| \le \nu_0$, which depends continuously on $\varepsilon \ge 0$ sufficiently small. The neighborhood is uniform in $\varepsilon$, that is, $\nu_0$ does not depend on $\varepsilon \ge 0$.

Proof. The construction very much relies, in the spirit of the gap lemma, Theorem 5, on a stable manifold theorem. However, we cannot apply the gap lemma directly, since additional hyperbolic eigenvalues are present, which actually are in resonance with spatial convergence of the coefficients at $H = 0$. We compactify time, $2\beta\xi = \log\big(\frac{1+\tau}{1-\tau}\big)$, $\tau \in [-1,1]$, and obtain a smooth ($C^1$ in $\tau$ and analytic in $H$) differential equation, suspended with the equation $\tau_\xi = \beta(1-\tau^2)$. The fibers $\tau = +1$ and $\tau = -1$ are invariant and describe the limiting situation at $\xi = \pm\infty$. In these fibers the dynamics possesses invariant subspaces which are the linear eigenspaces to the eigenvalues $\zeta_j$, $j = s, u, ss, uu$. In the $\tau$-direction, the asymptotic $\tau = \pm1$-subspaces are linearly stable ($\tau = +1$) and linearly unstable ($\tau = -1$), respectively, with exponential rate $\pm2\beta$. The flows inside $\tau = 1$ and $\tau = -1$ are linear and coincide. Subspaces corresponding to eigenspaces and generalized eigenspaces are flow-invariant subspaces. For example, the two-dimensional subspace in $\tau = \pm1$ corresponding to the generalized kernel for $H = 0$ can be viewed as a smooth, normally hyperbolic, local center-manifold. Inside this center-manifold, we find the particularly important flow-invariant subspaces $\operatorname{span}\{e_s\}$ in $\tau = +1$ and $\operatorname{span}\{e_u\}$ in $\tau = -1$. The subspaces are analytic in $H$ and continuous in $\varepsilon$. They possess strong unstable and strong stable foliations, which are as smooth as the vector field. Indeed, we may smoothly transform variables, $B_j \mapsto B_je^{-\zeta_{s/u}\xi}$, to trivialize the flow inside the eigenspace, which consists of a line of equilibria after the rescaling. The foliations are then given as the strong stable manifolds of the equilibria in the eigenspaces. Analyticity follows from differentiability and the Cauchy-Riemann differential equations. We denote by $W^{ss}(\operatorname{span}\{e_s\})$ the three-dimensional stable manifold of the subspace $\operatorname{span}\{e_s\}$ in the extended phase space $(\tau, B)$. Analogously, let $W^{uu}(\operatorname{span}\{e_u\})$ denote the three-dimensional unstable manifold of the subspace $\operatorname{span}\{e_u\}$. By construction, these manifolds are the smooth continuations of $E^s(\xi)$ and $E^u(\xi)$ that we already constructed in the region $\operatorname{Re}H > 0$: $W^{ss}(\operatorname{span}\{e_s\}) \cap \{\tau = 0\} = E^s(0)$ and $W^{uu}(\operatorname{span}\{e_u\}) \cap \{\tau = 0\} = E^u(0)$. Choosing analytic bases in these subspaces, and evaluating the determinant, we have continued the Evans function $E$ into a neighborhood of the origin $H = 0$ smoothly, analytically in $H$ and continuously in $\varepsilon$.

Remark 6.9. The above construction does not show that we can smoothly single out a particular one-dimensional subspace of initial conditions which converges to $\operatorname{span}\{e_u\}$ or $\operatorname{span}\{e_s\}$ faster than the other solutions, which is part of the proof of the gap lemma; see the proof of Proposition 6.3. In fact, we believe that this is in general impossible, since precisely at the origin, $H = 0$, the contracting and expanding eigenvalues $\zeta_{ss}$ and $\zeta_{uu}$ are equal to the rate of exponential approach in the $\xi$-direction, which makes it impossible to single out a strong stable or unstable direction.
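The time compactification used in the proof of Lemma 6.8 can be made explicit: $2\beta\xi = \log\frac{1+\tau}{1-\tau}$ is equivalent to $\tau = \tanh(\beta\xi)$, which satisfies $\tau_\xi = \beta(1-\tau^2)$ and maps $\xi = \pm\infty$ to the invariant fibers $\tau = \pm1$. The short sketch below checks both statements numerically; the function names, the step size `h`, the sample points and the value $\beta = 1.5$ are assumptions made for illustration.

```python
import math

def tau_of_xi(xi, beta):
    """Invert 2*beta*xi = log((1 + tau)/(1 - tau)): tau = tanh(beta * xi)."""
    return math.tanh(beta * xi)

def xi_of_tau(tau, beta):
    """Original time xi as a function of compactified time tau in (-1, 1)."""
    return math.log((1.0 + tau) / (1.0 - tau)) / (2.0 * beta)

def ode_residual(xi, beta, h=1e-5):
    """Central-difference residual of tau_xi = beta * (1 - tau^2) at xi."""
    d_tau = (tau_of_xi(xi + h, beta) - tau_of_xi(xi - h, beta)) / (2.0 * h)
    tau = tau_of_xi(xi, beta)
    return d_tau - beta * (1.0 - tau * tau)

beta = 1.5  # illustrative value (assumption)
round_trip = max(
    abs(xi_of_tau(tau_of_xi(x, beta), beta) - x) for x in (-2.0, -0.3, 0.0, 0.7, 2.0)
)
max_residual = max(abs(ode_residual(0.25 * n, beta)) for n in range(-12, 13))
```

Linearizing $\tau_\xi = \beta(1-\tau^2)$ at $\tau = \pm1$ gives the rates $\mp2\beta$, in line with the statement that the fiber $\tau = +1$ is linearly stable and $\tau = -1$ linearly unstable in the $\tau$-direction.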


The next step provides an expansion for E(H; 0) near H = 0.

Lemma 6.10. There exists a nonzero coefficient $E_3 \neq 0$ such that $E(H; 0) = E_3 H^3 + O(H^4)$.

Proof. For ε = 0, the linear equation (6.13) possesses a skew-product structure, already exploited in the previous paragraphs (I) and (II). In the KdV-subspace, the dynamics are independent of $B_0$. Stable and unstable subspaces $E^s_{KdV}(0)$ and $E^u_{KdV}(0)$ are well-defined. We may choose particular bases

\[ E^s_{KdV}(0) = \mathrm{span}\{B^{ss}_{KdV}(0)\}, \qquad E^u_{KdV}(0) = \mathrm{span}\{B^{u}_{KdV}(0), B^{uu}_{KdV}(0)\} \]

such that solutions in the KdV-subspace with these initial conditions satisfy

\[ e^{-\zeta^{ss}\xi} B^{ss}_{KdV}(\xi) \to b^{ss}_{KdV} \quad \text{for } \xi \to \infty, \]
\[ e^{-\zeta^{uu}\xi} B^{uu}_{KdV}(\xi) \to b^{uu}_{KdV} \quad \text{for } \xi \to -\infty, \]
\[ e^{-\zeta^{u}\xi} B^{u}_{KdV}(\xi) \to b^{u}_{KdV} \quad \text{for } \xi \to -\infty. \]

From these solutions, we are going to construct a basis of stable and unstable subspaces for the full water-wave problem (6.14), $E^s(0)$ and $E^u(0)$. We start with $E^s(0)$. First, $B^s = (1, 0, 0, 0)^T$ is a ξ-independent, bounded solution and belongs to $E^s(0)$. The second basis vector is readily computed from $B^{ss}_{KdV}(\xi)$. Define

\[ B_0^{ss}(\xi) = \int_{\infty}^{\xi} \big(B^{ss}_{KdV}\big)_1(s)\,ds \quad \text{and} \quad \big(B_1^{ss}(\xi), B_2^{ss}(\xi), B_3^{ss}(\xi)\big)^T = B^{ss}_{KdV}(\xi). \]

Then $B^{ss}(\xi) = \big(B_0^{ss}(\xi), B_1^{ss}(\xi), B_2^{ss}(\xi), B_3^{ss}(\xi)\big)^T$ is exponentially decaying for ξ → ∞ and $B^{ss}(0)$ is the desired second basis vector in $E^s(0)$. Similarly, we define

\[ B_0^{uu}(\xi) = \int_{-\infty}^{\xi} \big(B^{uu}_{KdV}\big)_1(s)\,ds \quad \text{and} \quad \big(B_1^{uu}(\xi), B_2^{uu}(\xi), B_3^{uu}(\xi)\big)^T = B^{uu}_{KdV}(\xi). \]

Then $B^{uu}(\xi) = \big(B_0^{uu}(\xi), B_1^{uu}(\xi), B_2^{uu}(\xi), B_3^{uu}(\xi)\big)^T$ is exponentially decaying for ξ → −∞ and $B^{uu}(0) \in E^u(0)$.


The same construction for $B^u_{KdV}$ would give a pole in H = 0 since the integral diverges due to slow exponential decay, $\zeta^u = 2H + O(H^2 + \varepsilon^2 H)$,

\[ B^u_{KdV}(\xi) = b^u_{KdV}\, e^{\zeta^u \xi} + r(\xi) \]

with $r(\xi) = O(e^{(\zeta^u + \nu)\xi})$ for ξ → −∞ with some ν > 0, uniformly in H close to zero. We therefore rescale the KdV-eigenvector with H and set

\[ \tilde B^u_{KdV}(\xi) = H B^u_{KdV}(\xi). \]

We then proceed as for $B^{uu}$ and define

\[ B_0^u(\xi) = \int_0^{\xi} \big(\tilde B^u_{KdV}\big)_1(s)\,ds + B_0^u(0) \quad \text{with} \quad B_0^u(0) = \frac{H b^u}{\zeta^u} + \int_{-\infty}^{0} r(s)\,ds. \]

With this choice of $B_0^u(0)$, $B_0^u(\xi)$ decays to zero exponentially for Re H > 0. Note that $B_0^u(0)$ is analytic in a neighborhood of H = 0 and that, with a suitable choice of $b^u$, we can arrange to have

\[ B_0^u(0) = 1 + O(H), \qquad \big(B_1^u(\xi), B_2^u(\xi), B_3^u(\xi)\big)^T = B^u_{KdV}(\xi). \]

Then $B^u(\xi) = \big(B_0^u(\xi), B_1^u(\xi), B_2^u(\xi), B_3^u(\xi)\big)^T$ is exponentially decaying for ξ → −∞ and Re H > 0 and $B^u(0) \in E^u(0)$. The Evans function for the water-wave problem is then given by the determinant

\[ E(H, 0) = \det(B^{uu}, B^u, B^s, B^{ss}). \]

Exploiting that $B^s(0) = 1$, we find that

\[ E(H, 0) = \det\big(B^{uu}_{KdV}, \tilde B^{u}_{KdV}, B^{ss}_{KdV}\big) = H E_{KdV}(H). \]

Together with Theorem 6 for the Evans function of the KdV equation, this proves the lemma.

Geometrically, the unfolding of the subspaces is as follows, roughly speaking. For H = 0, $B^u$ and $B^s$ coincide and $B^{ss}$ and $B^{uu}$ can be assumed to coincide as well. The weak directions, $B^s$ and $B^u$, cross transversely in H = 0, contributing a factor H to E. The strong directions $B^{ss}$ and $B^{uu}$ unfold with quadratic tangency, just as in the KdV-equation, contributing a factor $H^2$ to E. By continuity in ε and analyticity in H, Lemma 6.8, we conclude using Rouché’s theorem that for ε > 0 small, E(H; ε) possesses precisely three roots close to the origin, counted with multiplicity. The following lemma therefore shows that there are indeed no unstable eigenvalues in a small enough neighborhood of the origin.
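The root count obtained from Rouché's theorem can be illustrated numerically with the argument principle. The sketch below does not use the Evans function of the water-wave problem: `evans_model` is a hypothetical stand-in with leading term $H^3$ plus a small analytic perturbation, and the radius and perturbation coefficients are assumptions chosen only for illustration. It shows that the number of roots inside a fixed circle stays equal to three; it says nothing about where those roots lie.

```python
import cmath
import math

def count_zeros(f, radius, samples=4000):
    """Count zeros of f inside |H| < radius via the winding number of f
    along the circle |H| = radius (argument principle).
    Assumes f is analytic and has no zero on the circle."""
    total = 0.0
    prev = f(radius)
    for k in range(1, samples + 1):
        h = radius * cmath.exp(2j * math.pi * k / samples)
        cur = f(h)
        # Increment of the argument between consecutive sample points.
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))

def evans_model(eps):
    """Hypothetical model: H^3 plus an eps-small analytic perturbation."""
    return lambda h: h ** 3 + eps * (0.3 + 0.5 * h + h ** 2)

print(count_zeros(evans_model(0.0), 0.5))
print(count_zeros(evans_model(0.01), 0.5))
```

On the circle of radius 0.5 the perturbation is bounded by 0.008 while $|H^3| = 0.125$, which is the hypothesis of Rouché's theorem for this toy example.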


Lemma 6.11. The Evans function for the water-wave problem E(H; ε) possesses a triple root in the origin for all ε ≥ 0 sufficiently small.

Proof. Let H = 0. We find for ε ≥ 0 a two-dimensional intersection of $E^s(0)$ and $E^u(0)$, generated by the derivative of the solitary wave and the translation of the potential $(1, 0, 0, 0)^T$. Indeed, by construction, Lemma 6.8, any bounded solution necessarily lies in the intersection, since solutions which at H = 0 do not belong to the intersection grow at least linearly. From Galilean invariance, we find the exponentially localized derivative of the solitary wave with respect to the wave speed as a principal vector to the derivative of the solitary wave. Following [PW92], we conclude that E(H; ε) possesses at least a triple zero in H = 0. On the other hand, Lemma 6.10 shows that the multiplicity is at most three. This proves the lemma.

6.3. Proof of Proposition 6.1. We conclude the proof of absence of point spectrum in the right half plane. First, we showed in Sect. 6.1 that there are no unstable eigenvalues in a neighborhood of the imaginary axis, up to possible eigenvalues with large imaginary part or in a neighborhood of the imaginary axis. We then showed in Sect. 6.2.2 that eigenvalues in a neighborhood of the imaginary axis necessarily scale with $\varepsilon^3$, justifying the Korteweg–de Vries scaling. Finally, we showed in Sect. 6.2.4 that in the Korteweg–de Vries scaling, there are no unstable eigenvalues. The main part was a perturbation argument, based on the construction of an analytic Evans function. We showed that any eigenvalue is a root of an analytic function E(H; ε). We then continued E(H; ε) analytically in an open neighborhood of H = 0, for ε ≥ 0. Lemma 6.10 showed that there are at most three eigenvalues in a neighborhood of zero, counting multiplicity, and Lemma 6.11 showed that all three eigenvalues are located in zero, for ε ≥ 0 sufficiently small.
This proves spectral stability up to possible eigenvalues with imaginary part tending to ∞ as ε → 0, Proposition 6.1. Acknowledgement. The authors gratefully acknowledge financial support by DAAD/Procope, Nr. D/0031082 and F/03132UD.

References [AGJ90] Alexander, J., Gardner, R. and Jones, C.K.R.T.: A topological invariant arising in the stability analysis of traveling waves. J. Reine Angew. Math. 410, 167–212 (1990) [AK89] Amick, C.J. and Kirchgässner, K.: A theory of solitary water-waves in the presence of surface tension. Arch. Rational Mech. Anal. 105, 1–49 (1989) [Be67] Benjamin, T.B.: Instability of periodic wavetrains in nonlinear dispersive systems. Proc. Roy. Soc. Lond. A 299, 59–75 (1967) [Be72] Benjamin, T.B.: The stability of solitary waves. Proc. R. Soc. Lon. A 328, 153–183 (1972) [BF67] Benjamin, T.B. and Feir, J.E.: The disintegration of wave trains on deep water, Part 1. J. Fluid Mech. 27, 417–430 (1967) [BO80] Benjamin, T.B. and Olver, P.: Hamiltonian structure, symmetries and conservation laws for water waves. J. Fluid Mech. 125, 137–185 (1982) [BSS87] Bona, J.L., Souganidis, P.E. and Strauss, W.A.: Stability and instability of solitary waves of Korteweg-de Vries type. Proc. R. Soc. Lon. A 411, 395–412 (1987) [Bou] Boussinesq, M.J.: Essai sur la théorie des eaux courantes. Mémoires présentés par divers savants à l’Académie des Sciences Inst. France (séries 2) 23, 1–680 (1877) [BM95] Bridges, T.J. and Mielke, A.: A proof of the Benjamin-Feir instability, Arch. Rational Mech. Anal. 133, 145–198 (1995)

[Co78] Coppel, W.A.: Dichotomies in stability theory. Lect. Notes Math. 629. Berlin: Springer, 1978 [Cr85] Craig, W.: An existence theory for water waves and the Boussinesq and Korteweg-de Vries scaling limits. Comm. Partial Diff. Eq. 10, 787–1003 (1985) [Ev72] Evans, J.: Nerve axon equations (iii): Stability of the nerve impulses. Indiana Univ. Math. J. 22, 577–594 (1972) [GZ98] Gardner, R. and Zumbrun, K.: The gap lemma and geometric criteria for instability of viscous shock profiles. Comm. Pure Appl. Math. 51, 797–855 (1998) [Ha96] Haragus, M.: Model equations for water waves in the presence of surface tension. Eur. J. Mech. B/Fluids 15, 471–492 (1996) [HS01] Haragus, M. and Scheel, A.: Linear stability and instability of ion-acoustic plasma solitary waves. Preprint. [IS92] Il’ichev, A.T. and Semenov, A.Y.: Stability of solitary waves in dispersive media described by a fifth-order evolution equation. Theoret. Comput. Fluid Dynamics 3, 307–326 (1992) [KN79] Kano, T. and Nishida, T.: Sur les ondes de surface de l’eau avec une justification mathématique des équations des ondes en eau peu profonde. J. Math. Kyoto Univ. 19, 335–370 (1979) [KN86] Kano, T. and Nishida, T.: A mathematical justification for Korteweg-de Vries and Boussinesq equation of water surface waves. Osaka J. Math. 23, 389–413 (1986) [KS98] Kapitula, T. and Sandstede, B.: Stability of bright solitary-wave solutions to perturbed nonlinear Schrödinger equations. Physica D 124, 58–103 (1998) [Ka72] Kawahara, T.: Oscillatory solitary waves in dispersive media. Phys. Soc. Japan 33, 260–264 (1972) [Ki88] Kirchgässner, K.: Nonlinearly Resonant Surface Waves and Homoclinic Bifurcation. Adv. Appl. Mech. 26, 135–181 (1988) [KdV] Korteweg, D.J. and de Vries, G.: On the change of form of long waves advancing in a rectangular channel, and on a new type of long stationary waves. Phil. Mag. 5, 422–443 (1895) [LH84] Longuet-Higgins, M.S.: On the stability of steep gravity waves. Proc. R. Soc. Lon.
A 396, 269–280 (1984) [LHT97] Longuet-Higgins, M.S. and Tanaka, M.: On the crest intabilities of steep surface waves. J. Fluid Mech. 336, 51–68 (1997) [MS86] MacKay, R.S. and Saffman, P.G.: Stability of water waves. Proc. R. Soc. Lon. A 406, 115–125 (1986) [Mc82] McLean, J.W.: Instabilities of finite-amplitude water waves. J. Fluid Mech. 114, 315–330 (1982) [Mi88] Mielke, A.: Reduction of quasilinear elliptic equations in cylindrical domains with applications. Math. Meth. Appl. Sci. 10, 51–66 (1988) [Na74] Nalimov,V.I.: The Cauchy-Poisson Problem (in Russian). Dynamik Splosh. Sredy 18, 104–210 (1974) [PW92] Pego, R.L. and Weinstein, M.I.: Eigenvalues, and instabilities of solitary waves. Philos. Trans. Roy. Soc. Lond. A 340, 47–94 (1992) [PW96] Pego, R.L. and Weinstein, M.I.: Asymptotic stability of solitary waves. Commun. Math. Phys. 164, 305–349 (1996) [PW97] Pego, R.L. and Weinstein, M.I.: Convective linear stability of solitary waves for Boussinesq equations. Stud. Appl. Math. 99, 311–375 (1997) [RS95] Robbin, J. and Salamon, D.: The spectral flow and the Maslov index. Bull. London Math. Soc. 27, 1–33 (1995) [Sa91] Sachs, R.L.: On the existence of small amplitude solitary waves with strong surface tension. J. Diff. Equ. 90, 31–51 (1991) [Sa85] Saffman, P.G.: The superharmonic instability of finite amplitude water waves. J. Fluid Mech. 159, 169–174 (1985) [SW00] Schneider, G. and Wayne, C.E.: The long wave limit for the water wave problem I. The case of zero surface tension, Comm. Pure Appl. Math. 53, 1475–1535 (2000) [Ta86] Tanaka, M.: The stability of solitary waves, Phys. Fluids 29, 650–655 (1986) [Wh67] Whitham, G.B.: Nonlinear dispersion of water-waves, J. Fluid Mech. 27, 399–412 (1967) [Wu97] Wu, S.: Well-posedness in Sobolev spaces of the full water wave problem in 2-D. Invent. Math. 130, 39–72 (1997)

[Yo82] Yosihara, H.: Gravity waves on the free surface of an incompressible perfect fluid of finite depth. RIMS Kyoto 18, 49–96 (1982) [Za68] Zakharov, V.E.: Stability of periodic waves of finite amplitude on the surface of a deep fluid. J. Appl. Mech. Tech. Phys. 2, 190–194 (1968)

Communicated by P. Constantin

Commun. Math. Phys. 225, 523 – 549 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

One Dimensional Behavior of Singular N Dimensional Solutions of Semilinear Heat Equations

Hatem Zaag1,2

1 Courant Institute, New York University, 251 Mercer Street, New York, NY 10012, USA
2 Département de Mathématiques et Applications, CNRS UMR 8553, École Normale Supérieure, 45 rue d’Ulm, 75005 Paris, France. E-mail: [email protected]

Received: 20 June 2001 / Accepted: 6 October 2001

Abstract: We consider $u(x, t)$ a solution of $u_t = \Delta u + |u|^{p-1}u$ that blows up at time $T$, where $u : \mathbb{R}^N \times [0, T) \to \mathbb{R}$, $p > 1$, $(N - 2)p < N + 2$ and either $u(0) \ge 0$ or $(3N - 4)p < 3N + 8$. We are concerned with the behavior of the solution near a non isolated blow-up point, as $T - t \to 0$. Under a non-degeneracy condition and assuming that the blow-up set is locally continuous and $N - 1$ dimensional, we escape logarithmic scales of the variable $T - t$ and give a sharper expansion of the solution with the much smaller error term $(T - t)^{1/2 - \eta}$ for any $\eta > 0$. In particular, if in addition $p > 3$, then the solution is very close to a superposition of one dimensional solutions as functions of the distance to the blow-up set. Finally, we prove that the mere hypothesis that the blow-up set is continuous implies that it is $C^{1,1/2-\eta}$ for any $\eta > 0$.

1. Introduction

In this paper, we are mainly concerned with the blow-up behavior at non-isolated blow-up points of the following semilinear heat equation:

\[ u_t = \Delta u + |u|^{p-1}u, \qquad u(\cdot, 0) = u_0 \in L^\infty(\mathbb{R}^N), \tag{1} \]

where $u(t) : x \in \mathbb{R}^N \to u(x, t) \in \mathbb{R}$ and $\Delta$ stands for the Laplacian in $\mathbb{R}^N$. We assume in addition that the exponent $p > 1$ is subcritical: if $N \ge 3$ then $1 < p < (N + 2)/(N - 2)$. Moreover, we assume that

\[ \text{either } u_0 \ge 0 \text{ or } (3N - 4)p < 3N + 8. \tag{2} \]

This problem has attracted a lot of attention because it captures features common to a whole range of blow-up problems arising in various physical situations; particularly it highlights the role of scaling and self-similarity. Among related equations, we mention: the motion by mean curvature, surface diffusion (Bernoff, Bertozzi and Witelski [1]) and chemotaxis (Brenner et al. [3], Betterton and Brenner [2]). However, Eq. (1) is simple enough to be tractable in rigorous mathematical terms, unlike other physical equations. In this work, we build up tools that may be useful in more physical situations. As a matter of fact, in Sect. 5 we will mention connections with a chemotaxis problem.

The behavior near singular points is a major concern in all singularity problems. One general idea of this work is to find out how to refine the singular behavior beyond first order terms and reach significantly small error terms. Through a change of variables, singular behavior reduces to the asymptotic behavior of some PDE when a small positive parameter $\epsilon$ goes to zero. For the heat equation (1), $\epsilon = T - t \to 0$, where $T$ is the blow-up time. In previous work, an explicit profile is found to be a good first order approximation, up to $\nu^\alpha$ where $\nu = -1/\log \epsilon$ and $\alpha > 0$. Further refinements in this direction should give an expansion of the solution in terms of powers of $\nu$, i.e., in logarithmic scales of $\epsilon$ (see Stewartson and Stuart [18]). Logarithmic scales also arise in some singular perturbation problems such as low Reynolds number fluids and some vibrating membranes studies (see Ward [20] and the references therein, see also Segur and Kruskal [17] for a Klein–Gordon equation). Since $\nu$ goes to zero slowly, infinite logarithmic series may be of only limited practical use in approximating the exact solution. Relevant approximations, i.e., approximations up to lower order terms such as $\epsilon^\beta$ for $\beta > 0$, lie beyond all logarithmic scales. In this work, our idea to capture such relevant terms is to abandon the explicit profile function obtained as a first order approximation, and take a less explicit function as a first order description of the singular behavior. Both formulations agree to the first order. Through scaling and matching, we can reach the order $\epsilon^\beta$ by iterating the expansion around the less explicit function.
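The gap between logarithmic and power scales can be made concrete with a few numbers. The following is a minimal sketch that compares $\nu = -1/\log \epsilon$ with a power $\epsilon^\beta$; the sample values of `eps` and the exponent `beta = 0.25` are illustrative assumptions and are not taken from the paper.

```python
import math

def nu(eps):
    """Logarithmic scale nu = -1/log(eps), for 0 < eps < 1."""
    return -1.0 / math.log(eps)

def power_scale(eps, beta):
    """Power scale eps**beta, for beta > 0."""
    return eps ** beta

# Illustrative values only: beta = 0.25 is an arbitrary choice.
for eps in (1e-2, 1e-5, 1e-10, 1e-20):
    print(f"eps={eps:.0e}  nu={nu(eps):.4f}  "
          f"eps^0.25={power_scale(eps, 0.25):.2e}")
```

Even at `eps = 1e-20` the logarithmic scale is still of order $10^{-2}$, whereas the power scale is already of order $10^{-5}$, which is why an error term of order $\epsilon^\beta$ is a much stronger statement than any power of $\nu$.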
A second general idea in this work is to see how more constraints on the singular set yield more regularity for that set. This idea is found in studies of free boundary problems, where overdetermined boundary conditions yield regularity of the free boundary. In this work, we focus on the case where the blow-up set of (1) is a continuum. The mere hypothesis that the blow-up set is continuous, which is an unstable situation (see Sect. 5), adds constraints in the problem, yielding $C^{1,\alpha}$ regularity for the blow-up set.

1.1. Blow-up behavior in logarithmic scales of $T - t$. A solution $u(t)$ to (1) blows up in finite time if its maximal existence time $T$ is finite. In this case,

\[ \lim_{t \to T} \|u(t)\|_{H^1(\mathbb{R}^N)} = \lim_{t \to T} \|u(t)\|_{L^\infty(\mathbb{R}^N)} = +\infty. \]

Let us consider such a solution. $T$ is called the blow-up time of $u$. A point $a \in \mathbb{R}^N$ is called a blow-up point if $|u(x, t)| \to +\infty$ as $(x, t) \to (a, T)$ (this definition is equivalent to the usual local unboundedness definition, because of Corollary 2 in Merle and Zaag [15]). $S$ denotes the blow-up set, i.e., the set of all blow-up points. From [15], we know that there exists a blow-up profile $u^* \in C^2_{\mathrm{loc}}(\mathbb{R}^N \setminus S)$ such that

\[ u(x, t) \to u^*(x) \text{ in } C^2_{\mathrm{loc}}(\mathbb{R}^N \setminus S) \text{ as } t \to T. \tag{3} \]

Given $\hat a \in S$, we know from Velázquez [19] that up to some scalings, $u$ approaches a particular explicit function near the singularity $(\hat a, T)$. We consider the case where for all $K_0 > 0$,

\[ \sup_{|z| \le K_0} \left| (T - t)^{\frac{1}{p-1}}\, u\!\left(\hat a + Q_{\hat a} z \sqrt{(T - t)|\log(T - t)|},\, t\right) - f_{l_{\hat a}}(z) \right| \to 0 \tag{4} \]

as $t \to T$, where $Q_{\hat a}$ is an orthonormal $N \times N$ matrix, $l_{\hat a} = 1, \dots, N$, and

\[ f_l(z) = \left( p - 1 + \frac{(p - 1)^2}{4p} \sum_{i=1}^{l} z_i^2 \right)^{-\frac{1}{p-1}}. \tag{5} \]
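For readers who want to experiment with the profile (5), here is a minimal sketch that evaluates $f_l$ for given $z \in \mathbb{R}^l$ and exponent $p$. The function name `profile` and the sample values are assumptions made for illustration. The only facts used are formula (5) itself and its value at the origin, $f_l(0) = (p-1)^{-1/(p-1)}$.

```python
def profile(z, p):
    """Evaluate f_l(z) from (5), where l = len(z) and p > 1:
    (p - 1 + (p-1)^2/(4p) * sum_i z_i^2) ** (-1/(p-1))."""
    squares = sum(zi * zi for zi in z)
    return (p - 1.0 + (p - 1.0) ** 2 / (4.0 * p) * squares) ** (-1.0 / (p - 1.0))

# Sample evaluation with the illustrative choice p = 3, l = 1.
for z1 in (0.0, 1.0, 2.0, 4.0):
    print(z1, profile([z1], 3.0))
```

The profile is maximal at $z = 0$ and decreases in $|z|$, which matches the picture of a solution whose size is governed by the distance to the blow-up point in the rescaled variable.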

Other behaviors with the scaling $(T - t)^{-\frac{1}{2k}}(x - \hat a)$ where $k = 2, 3, \dots$ may occur (see [19]). We suspect them to be unstable. If $l_{\hat a} = N$, then $\hat a$ is an isolated blow-up point. An extensive literature is devoted to this case (Weissler [21], Bricmont and Kupiainen [5], Herrero and Velázquez [12] and [19], . . . ). We have proved the stability of such a behavior with Fermanian and Merle in [8]. The key argument in our proof was the following Liouville Theorem proved by Merle and Zaag in [13] and [15]:

Consider $U$ a solution of (1) defined for all $(x, t) \in \mathbb{R}^N \times (-\infty, T)$ such that for all $(x, t) \in \mathbb{R}^N \times (-\infty, T)$, $|U(x, t)| \le C(T - t)^{-\frac{1}{p-1}}$. Then, either $U \equiv 0$ or $U(x, t) = \left[(p - 1)(T^* - t)\right]^{-\frac{1}{p-1}}$ for some $T^* \ge T$.

When $l_{\hat a} = N$, the blow-up behavior of $u(x, t)$ near the isolated blow-up point $\hat a$ is already contained in (4) which shows that the profile of $u(x, t)$ is a function of a one dimensional variable:

\[ u(x, t) \sim (T - t)^{-\frac{1}{p-1}} f_1\!\left( \frac{d(x, S)}{\sqrt{(T - t)|\log(T - t)|}} \right), \tag{6} \]

since $S = \{\hat a\}$ and $d(x, S) = |x - \hat a|$ when $x$ is close to $\hat a$. This description remains valid even when $\hat a$ is not isolated, as we will show later.

The case $l_{\hat a} < N$ is known to occur, namely when $u$ is invariant with respect to some coordinates. However, when $l_{\hat a} < N$, we cannot even tell whether $\hat a$ is isolated or not. The first singularity description was obtained in [23]. For simplicity, we assume that locally near $\hat a$, $S$ is a $(N - l_{\hat a})$-dimensional $C^1$ manifold. We have shown in Theorems 3 and 4 in [23] that for some $t_0 < T$ and $\delta > 0$, for all $K_0 > 0$, $t \in [t_0, T)$ and $x \in B(\hat a, \delta)$ such that $d(x, S) \le K_0 \sqrt{(T - t)|\log(T - t)|}$, we have

\[ \left| (T - t)^{\frac{1}{p-1}} u(x, t) - f_1\!\left( \frac{d(x, S)}{\sqrt{(T - t)|\log(T - t)|}} \right) \right| \le C_0(K_0) \frac{\log|\log(T - t)|}{|\log(T - t)|}, \tag{7} \]

where $f_1$ is defined in (5). Note that formally, this is the same description as in the case $l_{\hat a} = N$, where $\hat a$ was isolated (see (6)). The variable $d(x, S)$, normal to $S$, appears as the blow-up variable that determines the size of $u$. The major step in [23] is the proof of the stability of the behavior (4) in a neighborhood of $\hat a$ in $S$. The key argument in getting this stability is the Liouville Theorem of [15], stated earlier in this section.

The error term in (7) shows that we fall in logarithmic scales of the small parameter $\epsilon = T - t$. In this paper, we do better, and get to error terms of order $(T - t)^\alpha$ with $\alpha > 0$. Following the ideas of the Introduction, we will replace the explicit profile $f_1$ by a less explicit function, and then go beyond all logarithmic scales, through scaling and matching.


1.2. Blow-up behavior beyond all logarithmic scales of $T - t$. A natural candidate for this non explicit function is simply a one dimensional solution of (1) that has the same profile $f_1$. It is classical that there exists a one dimensional even function $\tilde u(x_1, t)$, solution of (1), which decays on $(0, \infty)$ and blows up at time $T$ only at the origin, with the profile $f_1$, in the sense that for all $K_0 > 0$ and $t \in [t_0, T)$, if $|x_1| \le K_0 \sqrt{(T - t)|\log(T - t)|}$, then

\[ \left| (T - t)^{\frac{1}{p-1}} \tilde u(x_1, t) - f_1\!\left( \frac{x_1}{\sqrt{(T - t)|\log(T - t)|}} \right) \right| \le C_0(K_0) \frac{\log|\log(T - t)|}{|\log(T - t)|} \tag{8} \]

(see Appendix A for a proof of this fact). Hence, it follows from (7) that for all $K_0 > 0$, $t \in [t_0, T)$ and $x \in B(\hat a, \delta)$ such that $d(x, S) \le K_0 \sqrt{(T - t)|\log(T - t)|}$, we have

\[ (T - t)^{\frac{1}{p-1}} \left| u(x, t) - \tilde u(d(x, S), t) \right| \le C(K_0) \frac{\log|\log(T - t)|}{|\log(T - t)|}. \tag{9} \]

This estimate remains valid even if we replace $\tilde u(d(x, S), t)$ by any $\tilde u_{\sigma(x,t)}(d(x, S), t)$, where $\tilde u_\sigma$ is defined by

\[ \tilde u_\sigma(x_1, t) = e^{-\frac{\sigma}{p-1}}\, \tilde u\!\left( e^{-\frac{\sigma}{2}} x_1,\; T - e^{-\sigma}(T - t) \right), \tag{10} \]

provided that $|\sigma(x, t)| \le C(K_0)$. Indeed, for any $\sigma \in \mathbb{R}$, $\tilde u_\sigma$ is still a blow-up solution of (1) with the same properties and the same profile (8) as $\tilde u$. Moreover, $\tilde u_\sigma \neq \tilde u$, unless $\sigma = 0$, because $\tilde u$ is not self-similar (see Appendix A).

For each blow-up point $a$ near $\hat a$, we will suitably choose this free scaling parameter $\sigma = \sigma(a)$ so that the difference $(T - t)^{\frac{1}{p-1}} \left| u(x, t) - \tilde u_{\sigma(a)}(d(x, S), t) \right|$ along the normal direction to $S$ at $a$ is minimum. Following the ideas of the Introduction, if we refine the expansion about this well chosen, though less explicit, function $\tilde u_{\sigma(a)}(d(x, S), t)$, then we escape logarithmic scales. In particular, if $p > 3$, then the difference $u(x, t) - \tilde u_{\sigma(a)}(d(x, S), t)$ is bounded and goes to zero as $t \to T$, although both functions blow up. This can be done only when $l_{\hat a} = 1$ which corresponds to a $(N - 1)$-dimensional blow-up set, according to [23]. We claim the following:

Theorem 1 (The N dimensional solution seen as a superposition of one dimensional solutions of the normal variable to the blow-up set, with a suitable dilation). Assume $N \ge 2$ and consider $u$ a solution of (1) that blows up at time $T$ on a set $S$ which is a $(N - 1)$-dimensional $C^1$ manifold, locally near $\hat a$. If $u$ behaves as stated in (4) near $(\hat a, T)$ with $l_{\hat a} = 1$ and if $p > 3$, then for all $t \in [t_1, T)$ and $x \in B(\hat a, \delta)$ such that $d(x, S) < \epsilon_0$ for some $t_1 < T$, $\delta > 0$ and $\epsilon_0 > 0$, we have

\[ \left| u(x, t) - \tilde u_{\sigma(P_S(x))}(d(x, S), t) \right| \le h(x, t) < M < +\infty, \tag{11} \]

where $P_S(x)$ is the projection of $x$ over $S$ and $h(x, t) \to 0$ as $d(x, S) \to 0$ and $t \to T$.

Thus, when $p > 3$, all the singular terms of $u$ in a neighborhood of $(\hat a, T)$ are contained in the rescaled one dimensional solution $\tilde u_{\sigma(P_S(x))}(d(x, S), t)$, which shows that in a tubular neighborhood of the blow-up set $S$, the space variable splits into 2 independent variables:


– A primary variable, $d(x, S)$, normal to $S$. It accounts for the main singular term of $u$ and gives the size of $u(x, t)$, as already shown in the old formulation (9), which follows directly from [23].
– A secondary variable, $P_S(x)$, whose effect is sharper. Through the optimal choice of the dilation $\sigma(P_S(x))$, it absorbs all next singular terms in the normal direction to $S$ at $P_S(x)$.

Similar ideas are used by Betterton and Brenner [2] in a chemotaxis model; see Sect. 5 for a short discussion of connections with that work. We would like to mention that we have successfully used this idea of modulation of the dilation with Fermanian in [9] to prove that for $N = 1$ and $p \ge 3$, there is only one blow-up solution of (1) with the profile (4), up to a bounded function and to the invariances of the equation (the dilation and translations in space and in time). Theorem 1 is a direct consequence of the following result which is valid also for $1 < p \le 3$.

Theorem 2 (Blow-up behavior and profile near a blow-up point where u behaves as in (4) assuming S is locally a (N − 1)-dimensional manifold). Under the hypotheses of Theorem 1 and without the restriction $p > 3$, there exist $t_1 < T$ and $\epsilon_0 > 0$ such that for all $x \in B(\hat a, \delta)$ such that $d(x, S) \le \epsilon_0$, we have the following:

(i) For all $t \in [t_1, T)$,

\[ \left| u(x, t) - \tilde u_{\sigma(P_S(x))}(d(x, S), t) \right| \le C\, m_M \left\{ (T - t)^{\frac{p-3}{2(p-1)}} |\log(T - t)|^{\frac{3}{2} + C_0},\; d(x, S)^{\frac{p-3}{p-1}} |\log d(x, S)|^{\frac{p}{p-1} + C_0} \right\}, \tag{12} \]

where $P_S(x)$ is the projection of $x$ over $S$, $m_M = \min$ if $1 < p \le 3$ and $m_M = \max$ if $p > 3$.

(ii) If $x \notin S$, then $u(x, t) \to u^*(x)$ as $t \to T$ and

\[ \left| u^*(x) - e^{-\frac{\sigma(P_S(x))}{p-1}}\, \tilde u^*\!\left( e^{-\frac{\sigma(P_S(x))}{2}} d(x, S) \right) \right| \le C\, d(x, S)^{\frac{p-3}{p-1}} |\log d(x, S)|^{\frac{p}{p-1} + C_0}, \]

where $\tilde u^*(x_1) = \lim_{t \to T} \tilde u(x_1, t)$.

Remark. In [23], we have obtained the following explicit equivalent for $u^*$:

\[ u^*(x) \sim \left[ \frac{8p\, |\log d(x, S)|}{(p - 1)^2\, d(x, S)^2} \right]^{\frac{1}{p-1}} \sim \tilde u^*(d(x, S)) \quad \text{as } d(x, S) \to 0. \]

Our new estimate shows that up to a suitable dilation, all the next terms in the expansion of $u^*$ up to the order $d(x, S)^{\frac{p-3}{p-1}} |\log d(x, S)|^{\frac{p}{p-1} + C_0}$ are the same as the particular one dimensional solution.


1.3. $C^{1,\alpha}$ regularity of the blow-up set. The splitting of the space variable $x$ into $d(x, S)$ and $P_S(x)$, as shown in (12), induces a geometric constraint on the blow-up set $S$, leading to more regularity on $S$.

Proposition 3 ($C^{1,\frac{1}{2}-\eta}$ regularity for $S$ and $C^{1-\eta}$ regularity for the dilation $\sigma$). Under the hypotheses of Theorem 2, $S$ is the graph of a function $\varphi \in C^{1,\frac{1}{2}-\eta}(B_{N-1}(0, \delta_1), \mathbb{R})$, locally near $\hat a$, and $\sigma$ is a $C^{1-\eta}$ function, for any $\eta > 0$. More precisely, there is a $h_0 > 0$ such that for all $|\xi| < \delta_1$ and $|h| < h_0$ such that $|\xi + h| < \delta_1$, we have

\[ |\varphi(\xi + h) - \varphi(\xi) - h\,\varphi'(\xi)| \le C |h|^{3/2} |\log|h||^{\frac{1}{2} + C_0}, \]
\[ |\sigma(\xi, \varphi(\xi)) - \sigma(\xi + h, \varphi(\xi + h))| \le C |h|\, |\log|h||^{3 + C_0}. \]

The regularity of the blow-up set $S$ is our second concern in this paper. We know from Velázquez [19] that the $(N - 1)$-dimensional Hausdorff measure of $S$ is bounded on compact sets. Under a local non-degeneracy condition, we have proved in [23] that if $S$ locally contains a continuum, then $S$ is locally a $C^1$ manifold of dimension $k = 1, \dots, N - 1$. Since Proposition 3 derives $C^{1,\frac{1}{2}-\eta}$ regularity assuming $C^1$ regularity, we can weaken the hypotheses of Proposition 3 and get a stronger version that derives $C^{1,\frac{1}{2}-\eta}$ regularity just assuming continuity. Stating this new version requires additional technical notation.

We consider a non-isolated blow-up point $\hat a$ where $u$ has the behavior (4) with $l_{\hat a} = 1$. We may take $Q_{\hat a} = \mathrm{Id}$. According to Theorem 2 in [19], for all $\epsilon > 0$, there is $\delta(\epsilon) > 0$ such that

\[ S \cap B(\hat a, \delta) \subset \Omega_{\hat a, \pi, \epsilon} \equiv \left\{ x \;\middle|\; |P_\pi(x - \hat a)| \ge (1 - \epsilon)|x - \hat a| \right\}, \]

where $P_\pi$ is the orthogonal projection over $\pi$, the subspace spanned by $e_2, \dots, e_N$. Note that $\Omega_{\hat a, \pi, \epsilon}$ is a cone with vertex $\hat a$ that shrinks to $\hat a + \pi$ as $\epsilon \to 0$. In fact, $\hat a + \pi$ is the candidate for the tangent plane to $S$ at $\hat a$. We assume there is $a \in C((-1, 1)^{N-1}, \mathbb{R}^N)$ such that $a(0) = \hat a$ and $\mathrm{Im}\, a \subset S$, where $\mathrm{Im}\, a$ is at least $(N - 1)$-dimensional in the sense that

\[ \forall b \in \mathrm{Im}\, a, \text{ there are } (N - 1) \text{ independent vectors } v_1, \dots, v_{N-1} \text{ in } \mathbb{R}^N \text{ and } a_1, \dots, a_{N-1} \text{ functions in } C^1([0, 1], S) \text{ such that } a_i(0) = b \text{ and } a_i'(0) = v_i. \tag{13} \]

This hypothesis means that $b$ is actually non-isolated in $(N - 1)$ independent directions. We also assume that $\hat a = 0$ is not an endpoint in $\mathrm{Im}\, a$ in the sense that

\[ \forall \epsilon > 0, \text{ the projection of } a((-\epsilon, \epsilon)^{N-1}) \text{ on the plane } \hat a + \pi \text{ contains an open ball with center } \hat a. \tag{14} \]

We claim the following:

Theorem 4 (Regularity of the blow-up set near a point with the behavior (4) assuming S contains a (N − 1)-dimensional continuum). Take $N \ge 2$ and consider $u$ a solution of (1) that blows up at time $T$ on a set $S$ and take $\hat a \in S$, where $u$ behaves locally as stated in (4) with $l_{\hat a} = 1$. Consider $a \in C((-1, 1)^{N-1}, \mathbb{R}^N)$ such that $\hat a = a(0) \in \mathrm{Im}\, a \subset S$ and $\mathrm{Im}\, a$ is at least $(N - 1)$-dimensional in the sense (13). If $\hat a$ is not an endpoint (in the sense (14)), then there are $\delta > 0$, $\delta_1 > 0$ and $\varphi \in C^{1,\frac{1}{2}-\eta}(B_{N-1}(0, \delta_1), \mathbb{R})$ (for any $\eta > 0$) such that

\[ S \cap B(\hat a, 2\delta) = \mathrm{graph}\, \varphi \cap B(\hat a, 2\delta) = \mathrm{Im}\, a \cap B(\hat a, 2\delta). \tag{15} \]


Moreover, the conclusions of Theorem 2 and Proposition 3 hold. In particular, if $p > 3$, then the conclusion of Theorem 1 also holds.

Remark. When $N = 2$, we can replace conditions (13) and (14) just by the existence of $\alpha_0$ such that for all $\epsilon > 0$, $a(-\epsilon, \epsilon)$ intersects the complement of any connected closed cone with vertex at $\hat a$ and angle $\alpha \in (0, \alpha_0]$.

Remark. In the case $l_{\hat a} \ge 2$ in (4), that is when the blow-up set is 2 dimensional, we are unable to suitably choose the dilation in (10) and we cannot escape the logarithmic scale in $T - t$. Hence, we cannot obtain $C^{1,\alpha}$ regularity. We can nonetheless improve estimate (9) and prove that: For all $t \in [t_1, T)$ and $x \in B(\hat a, \delta)$ such that $d(x, S) \le \epsilon_0$, we have

\[ |u(x, t) - \tilde u(d(x, S), t)| \le C \min\left\{ \frac{(T - t)^{-\frac{1}{p-1}}}{|\log(T - t)|},\; \frac{d(x, S)^{-\frac{2}{p-1}}}{|\log d(x, S)|^{\frac{p-2}{p-1}}} \right\}. \]

Theorem 1 is a direct consequence of Theorem 2. Throughout the paper, we assume the hypotheses of Theorem 2. In Sect. 2, we start from the conclusion given in [23] under the hypotheses of Theorem 2 and show that for any blow-up point $a$ near $\hat a$, there is $\sigma(a) \in \mathbb{R}$ such that $\tilde u_{\sigma(a)}$ is the best profile for $u$ along the normal direction to $S$ at $a$. In Sect. 3, we use this to get the blow-up behavior of $u$ in a tubular neighborhood of $S$ (Theorem 2). In Sect. 4, we prove regularity results (Proposition 3). Theorem 4 is a direct consequence of Theorem 2 and Proposition 3 because of the results of [23]. Indeed, Theorem 4 in [23] asserts that under the hypotheses of Theorem 4, $S$ is the graph of a $C^1$ function; hence Theorem 2 and Proposition 3 apply. Some connections with a chemotaxis model are presented in Sect. 5. The results of this paper and those of [23] have been presented in the note [22].

2. Modulation of the Dilation, Uniformly with Respect to the Blow-up Point

This is a major step in our paper. Under the hypotheses of Theorem 2, there is a $C^1$ function $\varphi$ such that

\[ S_\delta \equiv S \cap B(\hat a, 2\delta) = \mathrm{graph}\, \varphi \cap B(\hat a, 2\delta) \tag{16} \]

for some $\delta > 0$ and $\varphi \in C^1(B_{N-1}(0, \delta_1), \mathbb{R})$, where $\delta_1 > 0$ and $B_{N-1}(0, \delta_1)$ is a ball in $\mathbb{R}^{N-1}$. If $a \in S_\delta$ and $w_a$ is defined by

\[ w_a(y, s) = (T - t)^{\frac{1}{p-1}} u(x, t), \qquad y = \frac{x - a}{\sqrt{T - t}}, \qquad s = -\log(T - t), \tag{17} \]

then we see from (1) that for all $(y, s) \in \mathbb{R}^N \times [-\log T, \infty)$,

\[ \frac{\partial w}{\partial s} = \Delta w - \frac{1}{2}\, y \cdot \nabla w - \frac{w}{p - 1} + |w|^{p-1} w. \tag{18} \]

We have proved in Propositions 3.1, 4.4 and 4.4’ of [23] that for all $a \in S_\delta$ and $s \ge -\log T$,

\[ \left\| w_a(Q_a y, s) - \left\{ \kappa + \frac{\kappa}{2ps}\left( 1 - \frac{y_1^2}{2} \right) \right\} \right\|_{L^2_\rho} \le C \frac{\log s}{s^2}, \tag{19} \]


where $Q_a$ is a $N \times N$ orthogonal matrix continuous in terms of $a$, such that $\{Q_a e_i \mid i = 2, \dots, N\}$ span the tangent plane $T_a$ to $S$ at $a$, $Q_a e_1$ is the normal direction to $S$ at $a$,

\[ \kappa = (p - 1)^{-\frac{1}{p-1}} \quad \text{and} \quad \rho(y) = \frac{e^{-\frac{|y|^2}{4}}}{(4\pi)^{N/2}}. \tag{20} \]
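The two constants in (20) can be sanity-checked numerically. The following minimal sketch, written for the one dimensional case $N = 1$, verifies that $\kappa$ is a stationary solution of (18) (the zero-order terms $-w/(p-1) + |w|^{p-1}w$ vanish at $w = \kappa$) and that $\rho$ is a probability density with second moment 2. The helper names and the truncated integration interval are assumptions made for illustration.

```python
import math

def kappa(p):
    """kappa = (p - 1)^(-1/(p-1)) from (20)."""
    return (p - 1.0) ** (-1.0 / (p - 1.0))

def reaction(w, p):
    """Zero-order terms of (18): -w/(p-1) + |w|^(p-1) * w."""
    return -w / (p - 1.0) + abs(w) ** (p - 1.0) * w

def rho(y):
    """Weight rho from (20) in dimension N = 1."""
    return math.exp(-y * y / 4.0) / math.sqrt(4.0 * math.pi)

def integrate(f, a=-40.0, b=40.0, n=80000):
    """Composite trapezoidal rule on [a, b]; the tails of rho beyond
    |y| = 40 are negligible at double precision."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

print(reaction(kappa(3.0), 3.0))             # close to 0
print(integrate(rho))                        # close to 1
print(integrate(lambda y: y * y * rho(y)))   # close to 2
```

The second moment explains why the quadratic correction in (19) appears in the combination $1 - y_1^2/2$: this polynomial has zero mean with respect to $\rho$.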

To show this, we first start from (4) and use the paper by Filippas and Kohn [10] to establish (19) at $a = \hat a$. Then, we use dynamical system methods to show the stability of the behavior (19) for solutions of (18). The Liouville Theorem stated in Subsect. 1.1 is a central argument.

The particular one dimensional solution $\tilde u(x_1, t)$ in Subsect. 1.2 can also be thought as a $N$ dimensional solution blowing up on the hyperplane $\{x_1 = 0\}$ in $\mathbb{R}^N$. Therefore, the results of [23] apply to $\tilde u$ and (19) holds for $\tilde u$ too. Since $\tilde u$ is invariant in the direction of the blow-up set, we have for all $a \in \{x_1 = 0\}$, $Q_a \equiv \mathrm{Id}$ and $\tilde w_a = \tilde w$ defined by

\[ \tilde w(y_1, s) = (T - t)^{\frac{1}{p-1}} \tilde u(x_1, t), \qquad y_1 = \frac{x_1}{\sqrt{T - t}}, \qquad s = -\log(T - t). \tag{21} \]

$\tilde w$ is a solution of (18) and (19) yields for all $s \ge -\log T$,

\[ \left\| \tilde w(y_1, s) - \left\{ \kappa + \frac{\kappa}{2ps}\left( 1 - \frac{y_1^2}{2} \right) \right\} \right\|_{L^2_\rho} \le C \frac{\log s}{s^2}. \tag{22} \]

Using (19) and (22), we get for all $\sigma_0 > 0$, $a \in S_\delta$, $|\sigma| \le \sigma_0$ and $s \ge -\log T + \sigma_0$,

\[ \left\| w_a(Q_a y, s) - \tilde w(y_1, s + \sigma) \right\|_{L^2_\rho} \le C(\sigma_0) \frac{\log s}{s^2}. \tag{23} \]

We aim in this section at choosing a particular $\sigma = \sigma(a)$ so that this difference becomes less than $C e^{-\frac{s}{2}} s^{C_0}$ for some $C_0 \ge 0$. This is equivalent to choosing an appropriate dilation $\lambda(a) = e^{-\sigma(a)}$ in (10) for the original function $\tilde u(x_1, t)$. The following proposition is the goal of this section.

Proposition 2.1 (Modulation of the dilation in the one dimensional solution). There exist $s_0 > 0$ and $C_0 > 0$ and a continuous function $\sigma : S_\delta \to \mathbb{R}$ such that for all $a \in S_\delta$ and $s \ge s_0$,

\[ \left\| w_a(Q_a y, s) - \tilde w(y_1, s + \sigma(a)) \right\|_{L^2_\rho} \le C_0\, e^{-\frac{s}{2}} s^{C_0}. \]

Let us first recall from [15] some consequences of the Liouville Theorem of Subsect. 1.1, namely some $L^\infty$ estimates and a localization property for blow-up solutions of (1). We also need some elementary estimates of the one dimensional solution $\tilde u$.

2.1. Uniform $L^\infty$ estimates. The following propositions are consequences of the Liouville Theorem of Subsect. 1.1.

Proposition 2.2 ($L^\infty$ estimates for solutions to (1) at blow-up). There exists $C > 0$ such that if $u$ is a solution to (1) which blows up at time $T > 0$, then, there exists $\hat s$ such that for all $s \ge \hat s$ and $a \in \mathbb{R}^N$,

\[ \| w_a(s) \|_{L^\infty} \le \kappa + \frac{C}{s} \quad \text{and} \quad \| \nabla^i w_a(s) \|_{L^\infty} \le \frac{C}{s^{i/2}} \tag{24} \]

for all $i \in \{1, 2, 3\}$, where $w_a$ is defined in (17).


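Proposition 2.3 (A uniform localization of the PDE (1) by means of the associated ODE). Let $u$ be a solution to (1) which blows up at time $T$. Then, $\forall \varepsilon > 0$, $\exists C_\varepsilon > 0$, $\forall t \in \left[\frac{T}{2}, T\right)$, $\forall x \in \mathbb{R}^N$,
$$\left| \frac{\partial u}{\partial t} - |u|^{p-1}u \right| \le \varepsilon |u|^p + C_\varepsilon.$$
The reader will find a proof of these propositions in [15] and [14] respectively. In the following lemma, we give some elementary estimates for the particular one dimensional solution $\tilde u$:

Lemma 2.4 (Elementary estimates for $\tilde u$).
(i) There exist $C > 0$ and $\hat s > 0$ such that for all $s \ge \hat s$ and $|y_1| \le \sqrt{s}$, we have
$$\tilde w(y_1,s) \le \tilde w(0,s) - C\,\frac{y_1^2}{s}.$$
(ii) $$\int_{\mathbb{R}} \frac{\partial \tilde w}{\partial s}(y_1,s)\,\frac{(y_1^2 - 2)}{8}\,\frac{e^{-\frac{y_1^2}{4}}}{\sqrt{4\pi}}\,dy_1 \sim \frac{\kappa}{4ps^2} \quad\text{as } s \to \infty.$$
(iii) $$\frac{\partial \tilde w}{\partial s}(0,s) \sim -\frac{\kappa}{2ps^2} \quad\text{as } s \to \infty.$$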

Proof. See Appendix A.

2.2. A dynamical system formulation for the modulation problem. Our approach is identical to what we did with Fermanian in [9] for the difference of two solutions with the radial profile ($l_{\hat a} = N$) in (4), instead of the non symmetric profile ($1 = l_{\hat a} < N$) we handle here. Therefore, we follow the full strategy of [9] and emphasize the novelties. However, some technical details, most of them straightforward and long, are omitted. The reader can find them in [9].

Consider an arbitrary $\sigma_0 \ge 0$ and fix $a \in S_\delta$ and $|\sigma| \le \sigma_0$. If we define
$$g_{a,\sigma}(y,s) = w_a(Q_a y, s) - \tilde w(y_1, s+\sigma), \tag{25}$$
then we see from (18) that for all $(y,s) \in \mathbb{R}^N \times [-\log T + \sigma_0, \infty)$,
$$\partial_s g_{a,\sigma}(y,s) = \left(\mathcal{L} + \alpha_{a,\sigma}\right) g_{a,\sigma}, \tag{26}$$
where $\mathcal{L} = \Delta - \frac{y}{2}\cdot\nabla + 1$ and $\forall (y,s) \in \mathbb{R}^N \times \mathbb{R}$,
$$\alpha_{a,\sigma}(y,s) = \frac{|w_a(Q_a y,s)|^{p-1} w_a - |\tilde w(y_1,s+\sigma)|^{p-1}\tilde w}{w_a - \tilde w} - \frac{p}{p-1} \tag{27}$$
if $w_a(Q_a y,s) \ne \tilde w(y_1,s+\sigma)$, and in general,
$$\alpha(y,s) = p\,|\bar w_{a,\sigma}(y,s)|^{p-1} - \frac{p}{p-1} \tag{28}$$


for some $\bar w_{a,\sigma}(y,s) \in \left(w_a(Q_a y,s),\, \tilde w(y_1,s+\sigma)\right)$. In the following, we drop the index $(a,\sigma)$ unless there is ambiguity. One should keep in mind that all quantities defined from $g$ also depend on $(a,\sigma)$. According to (23) and (25), $g \to 0$ in $L^2_\rho$ as $s \to \infty$. More precisely, for all $s \ge -\log T + \sigma_0$,
$$\| g(s) \|_{L^2_\rho} \le C(\sigma_0)\,\frac{\log s}{s^2}. \tag{29}$$

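Operator $\mathcal{L}$ is self-adjoint on $D(\mathcal{L}) \subset L^2_\rho(\mathbb{R}^N)$ where $\rho$ is defined in (20). The spectrum of $\mathcal{L}$ consists of eigenvalues
$$\operatorname{spec} \mathcal{L} = \left\{ 1 - \frac{m}{2},\ m \in \mathbb{N} \right\}.$$
Note that except for two positive eigenvalues $1$ and $\frac12$ and a null eigenvalue, all the spectrum is negative. The eigenfunctions of $\mathcal{L}$ are
$$h_\beta(y) = h_{\beta_1}(y_1) \cdots h_{\beta_N}(y_N), \tag{30}$$
where $\beta = (\beta_1, \ldots, \beta_N) \in \mathbb{N}^N$ and for each $m \in \mathbb{N}$, $h_m$ is the rescaled Hermite polynomial
$$h_m(\xi) = \sum_{j=0}^{[m/2]} \frac{m!}{j!\,(m-2j)!}\,(-1)^j\,\xi^{m-2j}. \quad\text{We note } k_m = \frac{h_m}{\|h_m\|^2_{L^2_{\rho_1}(\mathbb{R})}}, \tag{31}$$
where $L^2_{\rho_1}(\mathbb{R})$ is the $L^2$ space with the measure
$$\rho_1(\xi) = \frac{e^{-\frac{\xi^2}{4}}}{\sqrt{4\pi}} \quad\text{that satisfies}\quad \rho(y) = \prod_{i=1}^{N} \rho_1(y_i). \tag{32}$$
The polynomials $h_m$ and $h_\beta$ satisfy
$$\mathcal{L} h_\beta = \left(1 - \frac{|\beta|}{2}\right) h_\beta \quad\text{and}\quad \int_{\mathbb{R}} h_m(\xi)\,k_j(\xi)\,\rho_1(\xi)\,d\xi = \delta_{m,j}. \tag{33}$$
Let us introduce the component of $g(\cdot,s)$ on $h_\beta$,
$$g_\beta(s) = \int_{\mathbb{R}^N} k_\beta(y)\,g(y,s)\,\rho(y)\,dy \quad\text{where}\quad k_\beta(y) = \|h_\beta\|^{-2}_{L^2_\rho}\,h_\beta(y). \tag{34}$$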
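As a sanity check on (31) to (33) in one space dimension, the following Python sketch builds $h_m$ from formula (31) in exact rational arithmetic and verifies that $\mathcal{L} h_m = (1 - m/2) h_m$ for $\mathcal{L} = \partial_\xi^2 - \frac{\xi}{2}\partial_\xi + 1$, together with the orthogonality of the $h_m$ in $L^2_{\rho_1}(\mathbb{R})$. It relies on the observation that $\rho_1$ is the density of a centered Gaussian with variance 2. The helper names (`h`, `apply_L`, `moment`, `inner`) are illustrative choices and do not come from the paper.

```python
from fractions import Fraction
from math import factorial


def h(m):
    """Coefficients of h_m (index = power of xi), from formula (31)."""
    c = [Fraction(0)] * (m + 1)
    for j in range(m // 2 + 1):
        c[m - 2 * j] = Fraction((-1) ** j * factorial(m),
                                factorial(j) * factorial(m - 2 * j))
    return c


def apply_L(c):
    """Apply L v = v'' - (xi/2) v' + v to a polynomial given by coefficients."""
    out = [Fraction(0)] * len(c)
    for k, a in enumerate(c):
        out[k] += a * (1 - Fraction(k, 2))   # v - (xi/2) v' acting on xi^k
        if k >= 2:
            out[k - 2] += k * (k - 1) * a    # v'' acting on xi^k
    return out


def moment(k):
    """Integral of xi^k against rho_1: k-th moment of a N(0, 2) variable."""
    if k % 2:
        return 0
    double_factorial = 1
    for i in range(1, k, 2):
        double_factorial *= i
    return double_factorial * 2 ** (k // 2)


def inner(p, q):
    """L^2_{rho_1} inner product of two polynomials (coefficient lists)."""
    return sum(a * b * moment(i + j)
               for i, a in enumerate(p) for j, b in enumerate(q))


for m in range(6):
    # Eigenvalue relation in (33), restricted to one variable.
    assert apply_L(h(m)) == [(1 - Fraction(m, 2)) * a for a in h(m)]
    for n in range(6):
        # Orthogonality; the squared norm of h_m is 2^m m!.
        assert inner(h(m), h(n)) == (2 ** m * factorial(m) if m == n else 0)
```

In particular $\|h_2\|^2_{L^2_{\rho_1}} = 8$, so that $k_2(\xi) = (\xi^2 - 2)/8$, which is the weight appearing in (ii) of Lemma 2.4.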

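If $P_n$ is the orthogonal projector of $L^2_\rho$ over the eigenspace of $\mathcal{L}$ corresponding to $1 - \frac{n}{2}$, then $P_n g(y,s) = \sum_{|\beta| = n} g_\beta(s) h_\beta(y)$. Since the eigenfunctions of $\mathcal{L}$ span the whole space $L^2_\rho$, we can write
$$g(y,s) = \sum_{n \in \mathbb{N}} P_n g = \sum_{\beta \in \mathbb{N}^N} g_\beta(s) h_\beta(y), \qquad \| g(s) \|^2_{L^2_\rho} \equiv I(s)^2 = \sum_{n \in \mathbb{N}} l_n(s)^2 \ \text{ where } l_n(s) \equiv \| P_n g \|_{L^2_\rho}. \tag{35}$$
As for $\alpha$, we claim the following: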


Lemma 2.5 (Estimates on $\alpha$). For all $\sigma_0 \ge 0$, $a \in S_\delta$, $|\sigma| \le \sigma_0$, $y \in \mathbb{R}^N$ and $s \ge -\log T + \sigma_0$,
$$\alpha(y,s) \le \frac{C(\sigma_0)}{s}, \quad |\alpha(y,s)| \le \frac{C(\sigma_0)}{s}(1 + |y|^2) \quad\text{and}\quad \left| \alpha(y,s) + \frac{1}{4s} h_2(y_1) \right| \le \frac{C(\sigma_0)}{s^{3/2}}(1 + |y|^3). \tag{36}$$
Proof. See Lemma 2.5 in [9] where a similar lemma was derived from Proposition 2.2, (22) and (19) (note that both (22) and (19) hold in $C^k_{loc}$ by parabolic regularity).

2.3. Modulation for the dilation in the one dimensional solution. We prove Proposition 2.1 here. Practically, since $g_{a,\sigma}$ satisfies Eq. (26), we consider that equation as a dynamical system and classify all possible asymptotic behaviors the equation can exhibit as $s \to \infty$, under the growth condition (29). It turns out that the effect of $\alpha$ in (26) can be neglected, except in the neutral mode of $\mathcal{L}$. Since the eigenvalues of $\mathcal{L}$ are $1$, $\frac12$, $0$ and $-\frac{k}{2}$ for any integer $k \ge 1$, we expect the positive modes to be neglected. More precisely, unless $g_{a,\sigma}$ decreases faster than $e^{-ks}$ for any $k \in \mathbb{N}$, either the null mode or a negative mode of $\mathcal{L}$ will dominate as $s \to \infty$. Moreover, we expect $g_{a,\sigma}$ to decrease polynomially in the former case (because of the effect of the $\frac1s$ term in $\alpha$) and exponentially in the latter. We proceed in 3 steps:
– In Step 1, we project Eq. (26) on the different modes. We then show that the positive modes are relatively small and that either the null or a negative mode dominates (unless $g_{a,\sigma}$ decreases faster than $e^{-ks}$ for any $k \in \mathbb{N}$).
– In Step 2, we solve the ODE satisfied by the null mode and show that it decays like $\frac{1}{s^2}$, except for a critical explicit value $\sigma(a)$ of $\sigma$, where it decays faster.
– In Step 3, we take $\sigma$ equal to this critical value $\sigma(a)$ and show that the null mode cannot dominate, unless $g_{a,\sigma} \equiv 0$. Thus, we drop down in the spectrum from $0$ to $-\frac12$ or less, which gives exponentially fast decay for $g_{a,\sigma}$.

Step 1: Dominance of a particular mode. Let us first project (26) on the different modes. For the null mode of $\mathcal{L}$ ($|\beta| = 2$), the main term of the equation comes from the main term of $\alpha$ (see (36)).

Lemma 2.6 (Projection of (26) on the different modes). For all $\sigma_0 \ge 0$, $a \in S_\delta$, $|\sigma| \le \sigma_0$ and $s \ge -\log T + \sigma_0$, we have the following:
(i) For all $n \in \mathbb{N}$, $\left| l_n' + \left(\frac{n}{2} - 1\right) l_n \right| \le C(n,\sigma_0)\,\frac{I(s)}{s}$.
(ii) For all $n \in \mathbb{N}$, $I'(s) \le \left(1 - \frac{n+1}{2} + \frac{C_0(\sigma_0)}{s}\right) I(s) + \frac12 \sum_{k=0}^{n} (n+1-k)\,l_k(s)$.
(iii) If $|\beta| = 2$, then $\left| g_\beta'(s) + \frac{\beta_1}{s} g_\beta(s) \right| \le C(\sigma_0)\,\frac{I(s)}{s^{3/2}} + C(\sigma_0)\,\frac{l_0 + l_4}{s}$.
Proof. Parts (i) and (ii) follow from (26) and Lemma 2.5 exactly as in Lemma 2.7 in [9]. (iii) The calculation is straightforward and similar to the proof of Proposition 2.9 in [9]. See Appendix B.1 for details.


Our main goal in this step is to show that one mode has to dominate all the others (unless $I(s)$ decays faster than $e^{-ks}$ for any $k \in \mathbb{N}$). The argument would be clear if $\alpha$ was identically zero, because the modes would not interact in that case. In the actual proof, we rely on this simple fact and treat the term $\alpha g$ as a perturbation to get the result. We claim the following lemma (which was proved in [9] with no special care to uniform estimates with respect to $a \in S_\delta$):

Lemma 2.7 (Dominance of a mode). For all $a \in S_\delta$ and $\sigma \in \mathbb{R}$, either for all $m \in \mathbb{N}$, $l_m(s) = O\left(\frac{I(s)}{s}\right)$ as $s \to \infty$, or there is $n \ge 2$ such that $I \sim l_n$ as $s \to \infty$. In that case, $\forall m \ne n$, $l_m = O\left(\frac{I}{s}\right)$ as $s \to \infty$.
Proof. See Proposition 2.6 in [9].

Lemma 2.7 asserts that the positive modes $l_0$ and $l_1$ are $O\left(\frac{I}{s}\right)$ as $s \to \infty$. We need to know that this holds uniformly with respect to $a$ and $\sigma$. We claim the following:

Lemma 2.8 (Uniform smallness of the positive modes). For all $\sigma_0 \ge 0$, there exists $s_1 > 0$ such that for all $a \in S_\delta$, $|\sigma| \le \sigma_0$ and $s \ge s_1$, $l_0(s) + l_1(s) \le 2C(\sigma_0)\,\frac{I(s)}{s}$.
Proof. It is the same as in [9], with more care about the dependence of the constants. See Appendix B.2 for the proof.

Step 2: Asymptotic behavior of the null mode. We first use the decay information on $I(s)$ and $l_0(s)$ contained in (29) and Lemma 2.8 to solve the ODE satisfied by the null mode and stated in (iii) of Lemma 2.6. We claim the following:

Lemma 2.9 (Decay of the null mode of (26)). For all $\sigma_0 \ge 0$, there is $s_3(\sigma_0)$ such that for all $a \in S_\delta$, $|\sigma| \le \sigma_0$, $s \ge s_3(\sigma_0)$ and $|\beta| = 2$, we have:
$$|g_{a,\sigma,\beta}(s)| \le C(\sigma_0)\,\frac{\log s}{s^{5/2}} \ \text{ if } \beta_1 \ne 2, \qquad \left| g_{a,\sigma,\beta}(s) - \frac{k_{a,\sigma}}{s^2} \right| \le C(\sigma_0)\,\frac{\log s}{s^{5/2}} \ \text{ if } \beta_1 = 2.$$
Proof. This is straightforward. See Appendix B.3.

Now it becomes clear that by making $k_{a,\sigma} = 0$, the decay of the null mode is faster, which suggests that the null mode may not dominate. We would then drop down in the spectrum to $-\frac12$ or less, which yields exponential decay. But can we make $k_{a,\sigma} = 0$? The answer is yes and this comes from a simple fact: the difference $k_{a,\sigma} - k_{a,0}$ does not depend on the function $w$ or on the blow-up point $a \in S_\delta$, or even on the one dimensional solution $\tilde w$; it is a linear function of $\sigma$. More precisely, we have the following lemma, which is the core of our argument:

Lemma 2.10 (Modulation of the value of $\sigma$). For all $a \in S_\delta$ and $\sigma \in \mathbb{R}$, $k_{a,\sigma} = k_{a,0} - \frac{\kappa}{4p}\,\sigma$.
Proof. By definition of $k_{a,\sigma}$ (see Lemma 2.9 and (34)),
$$k_{a,\sigma} = \lim_{s \to \infty} s^2 \int_{\mathbb{R}^N} g_{a,\sigma}(y,s)\,k_2(y_1)\,\rho(y)\,dy. \tag{37}$$

Therefore,
$$k_{a,\sigma} - k_{a,0} = \lim_{s \to \infty} s^2 \int_{\mathbb{R}^N} \left( g_{a,\sigma}(y,s) - g_{a,0}(y,s) \right) k_2(y_1)\,\rho(y)\,dy = \lim_{s \to \infty} s^2 \int_{\mathbb{R}} \left( \tilde w(y_1,s) - \tilde w(y_1,s+\sigma) \right) k_2(y_1)\,\rho_1(y_1)\,dy_1 \tag{38}$$
according to (25) and (32). In particular, $k_{a,0} - k_{a,\sigma}$ does not depend on $w$ or on $a \in S_\delta$. Since we know from (ii) in Lemma 2.4, (31) and (32) that
$$\int_{\mathbb{R}} \frac{\partial \tilde w}{\partial s}(y_1,s)\,k_2(y_1)\,\rho_1(y_1)\,dy_1 \sim \frac{\kappa}{4ps^2} \quad\text{as } s \to \infty,$$
the conclusion follows by the mean value theorem.

In the following, we take $\sigma = \sigma(a) \equiv \frac{4p}{\kappa}\,k_{a,0}$, which makes $k_{a,\sigma} = 0$.

Step 3: Exponential decay of $g_{a,\sigma(a)}$ in $L^2_\rho$. We conclude the proof of Proposition 2.1 here. With this choice of $\sigma$, $k_{a,\sigma} = 0$, hence (iii) of Lemma 2.6 and Lemma 2.9 yield
$$l_2'(s) \ge -\frac{2}{s}\,l_2 - \frac{C}{s^{3/2}}\,I(s) \quad\text{and}\quad l_2(s) = O\left(\frac{\log s}{s^{5/2}}\right) \text{ as } s \to \infty \tag{39}$$
(recall that $l_2^2 = \sum_{|\beta| = 2} g_\beta^2\,\| h_\beta \|^2_{L^2_\rho}$). This implies that we cannot have $I \sim l_2$, unless

$I \equiv 0$. Therefore, Lemma 2.7 implies that either a negative mode dominates, or all the modes are less than $C I(s)/s$. In both cases, the differential inequality (ii) in Lemma 2.6 yields exponential decay for $I(s)$, which is the desired conclusion. However, we need to make this decay uniform with respect to the blow-up point $a \in S_\delta$.

We need first to fix $\sigma_0$. The uniform estimate of Lemma 2.9 along with the continuity of $g_{a,\sigma}(y,s)$ with respect to $a$, $\sigma$ and $s$ (see (25)) yields the continuity of $k_{a,\sigma}$ with respect to $(a,\sigma) \in S_\delta \times \mathbb{R}$ (see (37)). Hence, we can fix
$$\sigma_0 = \max_{a \in S_\delta} \frac{4p}{\kappa}\,|k_{a,0}| < +\infty \tag{40}$$
and define a continuous function $\sigma : S_\delta \to [-\sigma_0, \sigma_0]$ by $\sigma(a) = \frac{4p}{\kappa}\,k_{a,0}$. Just note that if we take $n = 2$ in (i) and (ii) of Lemma 2.6 and use Lemma 2.8, then we see that $x = l_2$ and $y = I$ satisfy the inequality (41) in the following ODE lemma:

Lemma 2.11 (ODE lemma). For all $M > 0$ and $\hat s$, there is $\bar s(M,\hat s) \ge \hat s$ such that if $0 \le x(s) \le y(s) \to 0$ as $s \to \infty$ and
$$\forall s \ge \hat s,\quad x' \ge -\frac{M}{s}\,y, \qquad y' \le -\frac12\,y + \frac{M}{s}\,y + \frac12\,x, \tag{41}$$
then either
$$\forall s \ge \bar s,\quad x(s) \le \frac{5M}{s}\,y(s) \tag{42}$$
or
$$y \ge x > 0 \ \text{ and } \ y \sim x \ \text{ as } s \to \infty. \tag{43}$$
Remark. If (43) holds, then we have no uniform control with respect to $M$ and $\hat s$.


Proof. See Appendix B.2.

We have just proved that (43) doesn't hold. Therefore, for all $a \in S_\delta$ and $s \ge s_2$ for some $s_2 > 0$, $l_2(s) \le C I(s)/s$. Using Lemma 2.8 and (ii) in Lemma 2.6 (take n = 3) yields for all $a \in S_\delta$, if $\sigma = \frac{4p}{\kappa}\,k_{a,0}$, then
$$\forall s \ge s_0,\quad l_k(s) \le C\,\frac{I(s)}{s} \ \text{ if } k = 0, 1 \text{ or } 2, \qquad I'(s) \le \left(-\frac12 + \frac{C_0}{s}\right) I(s) + \frac12 \sum_{k=0}^{2} (n+1-k)\,l_k(s).$$
Therefore, $\forall s \ge s_0$, $I'(s) \le \left(-\frac12 + \frac{C}{s}\right) I(s)$, hence $I(s) \le C_0\,e^{-\frac{s}{2}} s^{C_0}$ for some $C_0 > 0$. This concludes the proof of Proposition 2.1.

3. Blow-up Behavior of u in a Tubular Neighborhood of S

We prove Theorem 2 here. We have proved in [23] that (7) holds. This estimate identifies for each $t \in [0,T)$ three regions in $B(\hat a, \delta)$:
– The blow-up region. It is $\{x \mid d(x,S) \le \sqrt{(T-t)|\log(T-t)|}\}$. According to (7), it corresponds to the set $\{x \mid |u(x,t)| \ge \eta \|u(t)\|_{L^\infty}\}$ for some $0 < \eta < 1$.
– The regular region. It is the region far away from blow-up, where $u$ stays bounded, say by 1. It corresponds to $\{x \mid d(x,S) \ge \varepsilon_0\}$ for some $\varepsilon_0 > 0$.
– The intermediate region. It is between the two others, that is $\{x \mid 1 \le |u(x,t)| \le \eta \|u(t)\|_{L^\infty}\}$ or $\{x \mid \sqrt{(T-t)|\log(T-t)|} \le d(x,S) \le \varepsilon_0\}$.
We handle separately the blow-up and the intermediate regions, whose union makes the tubular neighborhood. Our technique is the same as in [9]. Although we had only one blow-up point in [9], it turns out that the techniques of [9] hold uniformly with respect to the blow-up point, when they are adapted to the present case. Therefore, we follow the method of [9]. However, we omit technical details; the reader can find them in [9] and in the appendix. We proceed in 3 steps:
– In Step 1, we use the transport effect of the term $-\frac12 y\cdot\nabla g$ in Eq. (26) to extend the convergence of Proposition 2.1 from compact sets to larger sets $|y| \le \sqrt{s}$, i.e., the blow-up region $d(x,S) \le \sqrt{(T-t)|\log(T-t)|}$, after the change (17).
– In Step 2, we use the information on the edge of the blow-up region, i.e., when $d(x,S) = \sqrt{(T-t)|\log(T-t)|}$, as initial data to solve the ODE $u' = u^p$, which turns out to be a very good approximation for the PDE in the intermediate region $\sqrt{(T-t)|\log(T-t)|} \le d(x,S) \le \varepsilon_0$, as mentioned in Proposition 2.3.
– In Step 3, we just gather the previous information to prove Theorem 2.

Step 1: The blow-up region. The $L^2_\rho$ estimate of Proposition 2.1 also holds uniformly on compact sets. The convection term $-\frac12 y\cdot\nabla g$ in Eq. (26) allows us to carry estimates from compact sets to sets $|y| \le \sqrt{s}$ along characteristics of the type $y = R\,e^{\frac{s-s'}{2}}$. The following lemma is a corollary of Proposition 2.1 in Velázquez [19]. It is proved in the course of the proof of Proposition 2.13 in [9].


Lemma 3.1 (Velázquez: extension of the convergence from compact sets to sets $|y| \le \sqrt{s}$). Assume $g$ is a solution of
$$\partial_s g = \Delta g - \frac12\,y\cdot\nabla g + g + \alpha(y,s)\,g \quad\text{for } (y,s) \in \mathbb{R}^N \times [\hat s, \infty),$$
where $\alpha(y,s) \le \frac{M}{s}$ and $|g(y,s)| \le M$. Then, for all $s' \ge \hat s$ and $s \ge s' + 1$ such that $e^{\frac{s-s'}{2}} = \sqrt{s}$, we have
$$\sup_{|y| \le \sqrt{s}} |g(y,s)| \le C(M)\,e^{s-s'}\,\| g(s') \|_{L^2_\rho}.$$
This lemma along with Proposition 2.1 yields for all $a \in S_\delta$ and $s \ge s_0 + 1$,
$$\sup_{|y| \le \sqrt{s}} |g_{a,\sigma(a)}(y,s)| \le C\,e^{s-s'}\,C_0\,e^{-\frac{s'}{2}}\,s'^{\,C_0},$$
where $e^{\frac{s-s'}{2}} = \sqrt{s}$. Since $s' = s - \log s$, we have just proved part (i) of the following proposition:

Proposition 3.2 (Uniform estimates for $w_a$ in larger sets $|y| \le \sqrt{s}$). For all $a \in S_\delta$, $s \ge s_0 + 1$ and $|y| \le \sqrt{s}$,
(i) $|g_{a,\sigma(a)}(y,s)| \le C\,e^{-\frac{s}{2}}\,s^{\frac32 + C_0}$,
(ii) $|w_a(y,s) - \tilde w(y\cdot Q_a e_1,\, s + \sigma(a))| \le C\,e^{-\frac{s}{2}}\,s^{\frac32 + C_0}$,
where $s_0$ and $C_0$ are defined in Proposition 2.1.
Proof of (ii). Just change $Q_a y$ into $y$ in part (i) and use the definition of $g$ given in (25).

Now, we just rewrite part (ii) of the previous proposition in the original variables $u(x,t)$ through the transformation (17) to get the following corollary:

Corollary 3.3 (Uniform estimates for $u(x,t)$ in the larger sets $|x - a| \le \sqrt{(T-t)|\log(T-t)|}$). For all $a \in S_\delta$, $t \ge T - e^{-s_0-1}$ and $|x - a| \le \sqrt{(T-t)|\log(T-t)|}$,
$$\left| u(x,t) - (T-t)^{-\frac{1}{p-1}}\,\tilde w\!\left( \frac{d(x,T_a)}{\sqrt{T-t}},\, -\log(T-t) + \sigma(a) \right) \right| = \left| u(x,t) - \tilde u_{\sigma(a)}(d(x,T_a), t) \right| \le C\,(T-t)^{\frac12 - \frac{1}{p-1}}\,|\log(T-t)|^{\frac32 + C_0},$$
where $T_a$ is the tangent plane to $S$ at $a$ and $\tilde u_{\sigma(a)}$ is defined in (10).

The only delicate point in this transformation is the computation of $y\cdot Q_a e_1$ in terms of $x$, $a$ and $t$. Using (17), we have $|y\cdot Q_a e_1| = |(x-a)\cdot Q_a e_1|\,(T-t)^{-1/2} = d(x,T_a)\,(T-t)^{-1/2}$, because $Q_a e_1$ is the normal direction to the blow-up set $S$ at the blow-up point $a$ (see (20)). The relation between $\tilde w$ and $\tilde u_\sigma$ follows directly from the definition of $\tilde w$ (21) and the definition of $\tilde u_\sigma$ (10). Now, if we choose $a$ to be the closest blow-up point to $x$, that is $a = P_S(x)$, the projection of $x$ on the blow-up set $S$, then we get $d(x,T_a) = d(x,S)$, which yields the following corollary:


Corollary 3.4 (Uniform estimates for $u(x,t)$ in the blow-up region $d(x,S) \le \sqrt{(T-t)|\log(T-t)|}$). For all $t \ge T - e^{-s_0-1}$ and $x \in B(\hat a, \delta)$ such that $d(x,S) \le \sqrt{(T-t)|\log(T-t)|}$,
$$\left| u(x,t) - \tilde u_{\sigma(P_S(x))}(d(x,S), t) \right| \le C\,(T-t)^{\frac12 - \frac{1}{p-1}}\,|\log(T-t)|^{\frac32 + C_0},$$
where $P_S(x)$ is the projection of $x$ over $S$.

Remark. We need the restriction $|x - \hat a| < \delta$ to guarantee the fact that $P_S(x)$ is in $S_\delta \equiv S \cap B(\hat a, 2\delta)$, defined in (16), so that Corollary 3.3 applies. Indeed, if $|x - \hat a| < \delta$, then $|P_S(x) - \hat a| \le |P_S(x) - x| + |x - \hat a| \le 2|x - \hat a| < 2\delta$, because $\hat a \in S$. Hence $P_S(x) \in S_\delta$.

Step 2: Estimates in the intermediate region. We consider a point $(x,t)$ in the intermediate region, i.e. such that $d(x,S) \ge \sqrt{(T-t)|\log(T-t)|}$. We remark that the point $(x, \tilde t(d(x,S)))$, where $\tilde t(d)$ is defined by
$$d = \sqrt{(T - \tilde t)\,|\log(T - \tilde t)|}, \tag{44}$$
is on the frontier of the two regions (note that $\tilde t \le t$). Therefore, we have an estimate on $u$ and on $u - \tilde u_{\sigma(P_S(x))}$ at $(x, \tilde t(d(x,S)))$, respectively from (7) and from Corollary 3.4. Moreover, the PDE (1) can be uniformly localized by the ODE $u' = u^p$, according to Proposition 2.3. The same holds for the one dimensional solution $\tilde u$. Our idea is simple: we use the ODE to propagate the information on $u - \tilde u_{\sigma(P_S(x))}$ from time $\tilde t$ to $t$. Thus, the error term on $u - \tilde u_{\sigma(P_S(x))}$ in the intermediate region will be the same as the one on the edge. More precisely:

Proposition 3.5 (Estimates in the intermediate region $\sqrt{(T-t)|\log(T-t)|} \le d(x,S) \le \varepsilon_0$). There exists $\varepsilon_0 > 0$ such that for all $x \in B(\hat a, \delta)$ and $t \in [0,T)$, if $\sqrt{(T-t)|\log(T-t)|} \le d(x,S) \le \varepsilon_0$, then
$$\left| u(x,t) - \tilde u_{\sigma(P_S(x))}(d(x,S), t) \right| \le C\,(T - \tilde t)^{\frac12 - \frac{1}{p-1}}\,|\log(T - \tilde t)|^{\frac32 + C_0} \le C\,d(x,S)^{1 - \frac{2}{p-1}}\,|\log d(x,S)|^{\frac{p}{p-1} + C_0},$$
where $P_S(x)$ is the orthogonal projection of $x$ on $S$ and $\tilde t = \tilde t(d(x,S))$ is defined by (44).
Proof. The main argument of the proof has just been given. The reader can find the "technical" proof in Appendix C.

Step 3: Estimates in a tubular neighborhood of S. We prove Theorem 2 here. Let $t_1 = \max(T - e^{-s_0-1},\, \tilde t(\varepsilon_0))$, where $\varepsilon_0$ and $\tilde t(\varepsilon_0)$ are given in Proposition 3.5, and consider some $x \in B(\hat a, \delta)$ such that $d(x,S) \le \varepsilon_0$.
(i) Let $t \in [t_1, T)$. If $t \le \tilde t(d(x,S))$ defined in (44), then $d(x,S) \le \sqrt{(T-t)|\log(T-t)|}$. Use Corollary 3.4. If $t \ge \tilde t(d(x,S))$, then $d(x,S) \ge \sqrt{(T-t)|\log(T-t)|}$. Use Proposition 3.5.
(ii) Just make $t \to T$ in (i) and use (10).


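4. Regularity of the Blow-up Set

We prove Theorem 4 and Proposition 3 here. To keep up with the notation of [23], we assume that $\hat a = 0$ and $Q_{\hat a} = \mathrm{Id}$, and consider that $S_\delta$, the intersection of $S$ with $B(\hat a, 2\delta)$ (see (16)), is the graph of a function $\varphi \in C^1(B_{N-1}(0,\delta_1), \mathbb{R})$ of the variable $\tilde x = (x_2, \ldots, x_N)$. If we introduce $A(\tilde x) = (\varphi(\tilde x), \tilde x)$, then $\operatorname{Im} A \cap B(\hat a, 2\delta) = \operatorname{graph} \varphi \cap B(\hat a, 2\delta) = S_\delta$.

Given $x$ near $S_\delta$, Corollary 3.3 gives many different asymptotic behaviors for $u(x,t)$, depending on the choice of the point $a \in \operatorname{Im} A \cap B\left(x, \sqrt{(T-t)|\log(T-t)|}\right)$. All these possible behaviors have to agree, up to the error term in Corollary 3.3. This implies a geometric constraint on $S_\delta$, which gives some more regularity on $A$ (and $\varphi$).

We consider some $|\tilde x| < \delta_1$ and some $\tilde h \in \mathbb{R}^{N-1}$ such that $|\tilde x + \tilde h| < \delta_1$ and $A(\tilde x)$ as well as $A(\tilde x + \tilde h)$ are in $S_\delta$. Since $A$ is $C^1$ and $\sigma$ is continuous (see Proposition 2.1), there is $C^*$ such that
$$|\nabla\varphi(\tilde x)| \le C^*, \quad |A(\tilde x + \tilde h) - A(\tilde x)| \le C^* |\tilde h| \quad\text{and}\quad |\sigma(A(\tilde x))| \le C^*. \tag{45}$$
For any time $t \ge T - e^{-s_0-1}$ such that $|A(\tilde x) - A(\tilde x + \tilde h)| \le \sqrt{(T-t)|\log(T-t)|}$, we can estimate $u(A(\tilde x + \tilde h), t)$ from Corollary 3.3 in two ways:
– First by taking $x = a = A(\tilde x + \tilde h)$ and $s = -\log(T-t)$, which gives
$$\left| (T-t)^{\frac{1}{p-1}}\,u(A(\tilde x + \tilde h), t) - \tilde w\left(0,\, s + \sigma(A(\tilde x + \tilde h))\right) \right| \le C\,e^{-\frac{s}{2}}\,s^{\frac32 + C_0}. \tag{46}$$
– Second, by taking $a = A(\tilde x)$, $x = A(\tilde x + \tilde h)$ and $s = -\log(T-t)$, which gives
$$\left| (T-t)^{\frac{1}{p-1}}\,u(A(\tilde x + \tilde h), t) - \tilde w\left( d\left(A(\tilde x + \tilde h), T_{A(\tilde x)}\right) e^{\frac{s}{2}},\, s + \sigma(A(\tilde x)) \right) \right| \le C\,e^{-\frac{s}{2}}\,s^{\frac32 + C_0}. \tag{47}$$
Now, if we fix $t = \tilde t(\tilde x, \tilde h)$ such that
$$|A(\tilde x + \tilde h) - A(\tilde x)| = \sqrt{(T - \tilde t(\tilde x, \tilde h))\,|\log(T - \tilde t(\tilde x, \tilde h))|} \tag{48}$$
and take $|\tilde h| < h_1(s_0)$ for some $h_1(s_0) > 0$, then we see from (45) that $\tilde t(\tilde x, \tilde h) \ge T - e^{-s_0-1}$, hence (46) and (47) hold. Therefore, if $\tilde s = -\log(T - \tilde t)$, then
$$\left| \tilde w\left(0,\, \tilde s + \sigma(A(\tilde x + \tilde h))\right) - \tilde w\left( d\left(A(\tilde x + \tilde h), T_{A(\tilde x)}\right) e^{\frac{\tilde s}{2}},\, \tilde s + \sigma(A(\tilde x)) \right) \right| \le C\,e^{-\frac{\tilde s}{2}}\,\tilde s^{\frac32 + C_0}. \tag{49}$$
By changing the roles of $\tilde x$ and $\tilde x + \tilde h$, we don't change $\tilde t(\tilde x, \tilde h)$ and obtain similarly
$$\left| \tilde w\left(0,\, \tilde s + \sigma(A(\tilde x))\right) - \tilde w\left( d\left(A(\tilde x), T_{A(\tilde x + \tilde h)}\right) e^{\frac{\tilde s}{2}},\, \tilde s + \sigma(A(\tilde x + \tilde h)) \right) \right| \le C\,e^{-\frac{\tilde s}{2}}\,\tilde s^{\frac32 + C_0}. \tag{50}$$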


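Since $\tilde u$, hence $\tilde w$, are radially decreasing (see Subsect. 1.2), this yields
$$\left| \tilde w\left(0,\, \tilde s + \sigma(A(\tilde x))\right) - \tilde w\left(0,\, \tilde s + \sigma(A(\tilde x + \tilde h))\right) \right| \le C\,e^{-\frac{\tilde s}{2}}\,\tilde s^{\frac32 + C_0}. \tag{51}$$
Indeed, if $\tilde w(0, \tilde s + \sigma(A(\tilde x))) - \tilde w(0, \tilde s + \sigma(A(\tilde x + \tilde h))) \ge 0$, then
$$0 \le \tilde w\left(0,\, \tilde s + \sigma(A(\tilde x))\right) - \tilde w\left(0,\, \tilde s + \sigma(A(\tilde x + \tilde h))\right) \le \tilde w\left(0,\, \tilde s + \sigma(A(\tilde x))\right) - \tilde w\left( d\left(A(\tilde x), T_{A(\tilde x + \tilde h)}\right) e^{\frac{\tilde s}{2}},\, \tilde s + \sigma(A(\tilde x + \tilde h)) \right)$$
because $\tilde w$ is radially decreasing. Hence, (51) follows from (50). Do the same and use (49) in the other case. Therefore, with a triangle inequality, we get from (51) and (49)
$$0 \le \tilde w\left(0,\, \tilde s + \sigma(A(\tilde x))\right) - \tilde w\left( d\left(A(\tilde x + \tilde h), T_{A(\tilde x)}\right) e^{\frac{\tilde s}{2}},\, \tilde s + \sigma(A(\tilde x)) \right) \le C\,e^{-\frac{\tilde s}{2}}\,\tilde s^{\frac32 + C_0}. \tag{52}$$
Note that since $A(\tilde x) \in T_{A(\tilde x)}$, we have $\frac{d(A(\tilde x + \tilde h),\, T_{A(\tilde x)})}{|A(\tilde x + \tilde h) - A(\tilde x)|} \le 1$. Therefore, (i) of Lemma 2.4 implies that there is $C > 0$ and $h_2 > 0$ such that if $|\tilde h| < h_2$ then $\tilde s + \sigma(A(\tilde x)) \ge \hat s$ by (48) and (45) and
$$\frac{C}{\tilde s + \sigma(A(\tilde x))} \left( d\left(A(\tilde x + \tilde h), T_{A(\tilde x)}\right) e^{\frac{\tilde s}{2}} \right)^2 \le \tilde w\left(0,\, \tilde s + \sigma(A(\tilde x))\right) - \tilde w\left( d\left(A(\tilde x + \tilde h), T_{A(\tilde x)}\right) e^{\frac{\tilde s}{2}},\, \tilde s + \sigma(A(\tilde x)) \right). \tag{53}$$
Since $\operatorname{Im} A$ is the graph of $\varphi$, we have
$$d\left(A(\tilde x + \tilde h), T_{A(\tilde x)}\right) = \frac{\left| \varphi(\tilde x + \tilde h) - \varphi(\tilde x) - \tilde h\cdot\nabla\varphi(\tilde x) \right|}{\sqrt{1 + |\nabla\varphi(\tilde x)|^2}}. \tag{54}$$
Using (iii) in Lemma 2.4, we get $h_3 > 0$ such that if $|\tilde h| < h_3$, then $\tilde s$ is large enough by (48) and (45) and
$$\frac{C}{\tilde s^2}\,\left| \sigma(A(\tilde x)) - \sigma(A(\tilde x + \tilde h)) \right| \le \left| \tilde w\left(0,\, \tilde s + \sigma(A(\tilde x))\right) - \tilde w\left(0,\, \tilde s + \sigma(A(\tilde x + \tilde h))\right) \right|. \tag{55}$$
If $\tau(d)$ is given by $d = \sqrt{\tau |\log \tau|}$, then
$$\log \tau \sim 2 \log d \quad\text{and}\quad \tau \sim \frac{d^2}{2|\log d|} \quad\text{as } d \to 0.$$
Therefore, $\frac{\log|\log \tau|}{|\log \tau|} \le \frac{\log|\log d|}{|\log d|}$ if $|d| \le d_0$ for some $d_0 > 0$. Combining this with (48) and (45), we have for all $|\tilde h| < h_4$ for some $h_4 > 0$,
$$e^{-\frac{\tilde s}{2}} \sqrt{ \frac{\tilde s + \sigma(A(\tilde x))}{C}\; C\,e^{-\frac{\tilde s}{2}}\,\tilde s^{\frac32 + C_0} } \le C\,d^{\frac32}\,|\log d|^{\frac12 + \frac{C_0}{2}} \le C\,|\tilde h|^{\frac32}\,\left|\log|\tilde h|\right|^{\frac12 + \frac{C_0}{2}},$$
$$\tilde s^2\,e^{-\frac{\tilde s}{2}}\,\tilde s^{\frac32 + C_0} \le C\,d\,|\log d|^{3 + C_0} \le C\,|\tilde h|\,\left|\log|\tilde h|\right|^{3 + C_0}, \tag{56}$$
where $d = |A(\tilde x) - A(\tilde x + \tilde h)|$. Take $h_0 = \min(h_1, h_2, h_3, h_4)$. Combining (53), (54), (52), (56) and (45) gives the regularity estimate for $\varphi$. Combining (55), (51) and (56) gives the regularity estimate for $\sigma$ and closes the proof of Proposition 3.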
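The equivalent $\tau \sim d^2/(2|\log d|)$ used above can be checked numerically. The sketch below is only an illustration under stated assumptions: $d$ is taken small (say $d \le 0.1$), and the function names `log_inverse_tau` and `ratio` are hypothetical. Writing $L = |\log\tau|$, the relation $d^2 = \tau|\log\tau|$ becomes $L - \log L = 2|\log d|$, which is solved here by a fixed-point iteration.

```python
import math


def log_inverse_tau(d, iterations=200):
    """Return L = |log tau|, where tau (small) solves d^2 = tau * |log tau|.

    Taking logarithms gives L - log L = 2 |log d|, solved by the fixed-point
    iteration L <- 2|log d| + log L (a contraction as long as L > 1).
    Assumes d is small, e.g. d <= 0.1.
    """
    big_d = 2.0 * abs(math.log(d))
    value = big_d
    for _ in range(iterations):
        value = big_d + math.log(value)
    return value


def ratio(d):
    """tau(d) divided by its claimed equivalent d^2 / (2 |log d|)."""
    # tau = d^2 / L, hence the ratio equals 2 |log d| / L.
    return 2.0 * abs(math.log(d)) / log_inverse_tau(d)


for d in (1e-2, 1e-10, 1e-100, 1e-300):
    print(d, ratio(d))
```

The ratio stays below 1 and tends to 1 only logarithmically slowly: its distance to 1 is of order $\log|\log\tau| / |\log\tau|$, which is precisely the quantity controlled just before (56).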


5. Connection with a Chemotaxis Problem

We would like to mention connections between the ideas of this paper and the chemotaxis problem of Betterton and Brenner [2]. Chemotaxis refers to the movement of bacteria under a gradient of some chemical substance. Under special conditions, bacteria excrete a substance to attract neighboring individuals. This way, bacteria aggregate and their density blows up in finite time $T > 0$. For simplicity, we assume that the cellular division is much slower than the dynamics of chemotaxis, and that the diffusion of bacteria is much slower than the diffusion of the attractant. Therefore, we have from [2] the equations satisfied by $\rho$, the bacterial density, and $c$, the chemical attractant concentration:
$$\frac{\partial \rho}{\partial t} = \Delta\rho - \nabla\cdot(\rho\nabla c) = \Delta\rho + \rho^2 - \nabla\rho\cdot\nabla c, \qquad 0 = \Delta c + \rho. \tag{57}$$
Many blow-up regimes are possible, depending on the relative importance of the three terms on the right-hand side of (57). A global picture is presented by Brenner et al. in [3], in the case of radial solutions. One of those regimes has the same scaling $\sqrt{(T-t)|\log(T-t)|}$ as Eq. (1) with $p = 2$ (see Subsect. 4.3 in [3]).

In an experiment conducted by Budrene and Berg [6, 7] (see also Brenner, Levitov and Budrene [4]), it appears clearly that the dynamics are 3 dimensional and not radial. The authors observe two regimes in this finite time blow-up:
– The transient regime, for $t \le t_1$ for some $t_1 < T$. The bacteria aggregate along cylindrical structures that shrink towards their common axis, as time grows. This suggests that the axis of the cylinder would be the singular set.
– The asymptotic regime. The cylinder is destabilized at time $t = t_1$ and breaks up into spherical aggregates. Then, the three dimensions of the sphere shrink simultaneously, leading to isolated blow-up points.

Although the chemotaxis equation is non-local, it has the same one dimensional scaling as the heat equation (1). Both equations deal with blow-up on a continuum (say on a line) and share the idea of the instability of such a behavior (only single point blow-up is thought to be generic for Eq. (1)). However, the goals of the two papers are different. Indeed, while [2] proves the instability of the blow-up on a line, we prove here that if this occurs, which is exceptional, then we have more constraints, hence more regularity on that line. Although the goals are different, the same idea is used in both works: how to connect all local singular behavior near singular points (or candidates for singular points in the case of [2]) to get a global picture of the situation? In [2], the destabilization of the cylinder at time $t_1$ breaks the symmetry and induces a variation of a "local blow-up time", or phase. The variation of the phase along the line is governed by a phase equation.
The minimum of the phase determines the actual blow-up point. In our case, the connection between local behaviors is done through the dilation σ (a), a ∈ S, analogous to the phase of [2]. The Liouville theorem of [15] cited in Subsect. 1.1 is the key tool to connect local descriptions. We are unable to find a non trivial phase equation for σ , analogous to that of chemotaxis. However, since σ is linked to the one dimensional scaling of (1), which is also present for chemotaxis, we believe that if one adopts our point of view in chemotaxis, σ would satisfy a non trivial equation, related to the phase equation of [2].


A. Properties of the Particular Single Point Blow-up Solution in One Dimension

A.1. Existence of the one dimensional solution. We prove here the existence of the particular one dimensional solution announced in Subsect. 1.2. Take $g$ a symmetric positive continuous function, decreasing on $(0,\infty)$ and going to zero at infinity. The solution $\tilde u(x_1,t)$ of (1) with initial data $kg$ is symmetric and decreasing on $(0,\infty)$ as well. If $k$ is large enough, then $\tilde u(x_1,t)$ blows up in finite time $\tilde T$, only at the origin (see Theorems 1 and 2 in Mueller and Weissler [16]). We can assume $\tilde T = T$ by changing $\tilde u$ into some
$$\tilde u_\lambda(x_1,t) = \lambda^{\frac{2}{p-1}}\,\tilde u(\lambda x_1, \lambda^2 t).$$
Theorem 1 in Herrero and Velázquez [12] then asserts that $\tilde u$ has the profile $f_1$ defined in (5). $\tilde u$ is not self-similar, because the only self-similar solutions of (1) are independent of space, hence trivial (see Theorem 1' in Giga and Kohn [11]).

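A.2. Elementary estimates for the one dimensional solution. We prove Lemma 2.4 here.

(i) Using a Taylor expansion, we write
$$\tilde w(y_1,s) = \tilde w(0,s) + y_1\,\frac{\partial \tilde w}{\partial y_1}(0,s) + \frac12\,y_1^2\,\frac{\partial^2 \tilde w}{\partial y_1^2}(0,s) + \frac16\,y_1^3\,\frac{\partial^3 \tilde w}{\partial y_1^3}(z_1,s)$$
for some $z_1 \in (0,y_1)$. Since $\tilde w$ is even, we have $\frac{\partial \tilde w}{\partial y_1}(0,s) \equiv 0$. Since (22) also holds in $C^k_{loc}$, we have $\frac{\partial^2 \tilde w}{\partial y_1^2}(0,s) \le -\frac{4\tilde C}{s}$ for some $\tilde C > 0$. Since Proposition 2.2 implies that $\left| \frac{\partial^3 \tilde w}{\partial y_1^3}(z_1,s) \right| \le \frac{C_3}{s^{3/2}}$, we combine all the previous estimates with the Taylor expansion to get
$$\forall |y_1| \le \frac{6\tilde C}{C_3}\sqrt{s} \equiv \tilde\delta\sqrt{s},\quad \tilde w(y_1,s) \le \tilde w(0,s) - \frac{\tilde C}{s}\,y_1^2. \tag{58}$$
If $\tilde\delta \ge 1$, then the proof is complete. If $\tilde\delta < 1$, then recall that
$$\sup_{|y_1| \le \sqrt{s}} \left| \tilde w(y_1,s) - f_1\!\left(\frac{y_1}{\sqrt{s}}\right) \right| \to 0 \quad\text{as } s \to \infty, \tag{59}$$
since $\tilde u$ has the profile $f_1$ defined in (5). Therefore, there is $\hat s > 0$ such that if $s \ge \hat s$ and $\tilde\delta\sqrt{s} \le |y_1| \le \sqrt{s}$, then
$$|\tilde w(0,s) - \tilde w(y_1,s)| \ge \frac12\left( f_1(0) - f_1(\tilde\delta) \right) \ge \frac12\left( f_1(0) - f_1(\tilde\delta) \right)\frac{|y_1|^2}{s}. \tag{60}$$
The conclusion then follows from (58) and (60).

(ii) See identity (5.34) on p. 854 in Filippas and Kohn [10].

(iii) We know from (59) that
$$\tilde w(y_1,s) \to f_1(0) = (p-1)^{-\frac{1}{p-1}} \quad\text{as } s \to \infty$$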


uniformly on compact sets. Since $\|\Delta \tilde w(s)\|_{L^\infty}$ and $\|\nabla \tilde w(s)\|_{L^\infty}$ go to 0 as $s \to \infty$ (see Proposition 2.2), we use Eq. (18) to get
$$\frac{\partial \tilde w}{\partial s}(y_1,s) \to 0 \quad\text{as } s \to \infty$$
uniformly on compact sets. By the Lebesgue Theorem, we obtain
$$\left\| \frac{\partial \tilde w}{\partial s}(s) \right\|_{L^2_{\rho_1}(\mathbb{R})} \to 0 \quad\text{as } s \to \infty.$$
Let us introduce $q(y_1,s) = \frac{\partial \tilde w}{\partial s}(y_1,s)$. From (18), we see that $q$ satisfies an equation of the same type as (26):
$$\frac{\partial q}{\partial s} = \left( \mathcal{L} + \alpha(y_1,s) \right) q, \tag{61}$$
where $\mathcal{L} q = \frac{\partial^2 q}{\partial y_1^2} - \frac12\,y_1\,\frac{\partial q}{\partial y_1} + q$ and $\alpha(y_1,s) = p\,\tilde w(y_1,s)^{p-1} - \frac{p}{p-1}$. In particular, we have the same dynamical system techniques as for Eq. (26). Therefore, we just sketch our argument and borrow techniques from Sect. 2 and from [9] where the same equation was considered. Since $\tilde w$ satisfies Proposition 2.2 and (22), $\alpha$ satisfies the estimates of Lemma 2.5. If we borrow the notations we used for $g$ in Sect. 2 and write
$$q(y_1,s) = \sum_{n \in \mathbb{N}} q_n(s)\,h_n(y_1), \quad I(s) = \| q(s) \|_{L^2_{\rho_1}}, \quad l_n(s) = |q_n(s)|\,\| h_n(y_1) \|_{L^2_{\rho_1}}, \tag{62}$$
then we have $I(s)^2 = \sum_{n \in \mathbb{N}} q_n(s)^2\,\| h_n \|^2_{L^2_{\rho_1}}$ and Eqs. (i) and (ii) in Lemma 2.6 hold. Let us remark that
$$I(s) \ge \frac{C}{s^2} \quad\text{for } s \text{ large, where } C > 0. \tag{63}$$
Indeed, $I(s) \ge |q_2(s)|\,\| h_2 \|_{L^2_{\rho_1}}$ and by definition (see (31) and (32)),
$$q_2(s) = \int \frac{\partial \tilde w}{\partial s}(y_1,s)\,k_2(y_1)\,\rho_1(y_1)\,dy_1 = \tilde w_2'(s) \sim \frac{\kappa}{4ps^2} \tag{64}$$
as mentioned in (ii) of the lemma we are proving. As for Eq. (26), Lemma 2.7 holds and either no mode dominates in $I(s)$ or $I(s) \sim l_n(s)$ as $s \to \infty$ for some $n \ge 2$. We claim that $I(s) \sim l_2(s)$ as $s \to \infty$. Indeed, if no mode dominates or if $I(s) \sim l_n(s)$ with $n \ge 3$, then Lemmas 2.6 and 2.7 imply that $I(s)$ has to decay exponentially fast, which contradicts (63). Using (64), we see that
$$I(s) \sim l_2(s) = 2\sqrt{2}\,|q_2(s)| \sim \frac{\kappa\sqrt{2}}{2ps^2} \quad\text{as } s \to \infty. \tag{65}$$


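Our conclusion follows if we prove that
$$\| q(y_1,s) - q_2(s)\,h_2(y_1) \|_{L^2_{\rho_1}} = O\left(\frac{1}{s^3}\right). \tag{66}$$
Indeed, parabolic regularity implies that (66) also holds in $L^\infty_{loc}$, in particular, at $y_1 = 0$:
$$\frac{\partial \tilde w}{\partial s}(0,s) = q(0,s) \sim q_2(s)\,h_2(0) \sim -\frac{\kappa}{2ps^2} \quad\text{as } s \to \infty,$$
which is the desired conclusion (note that $h_2(0) = -2$, by (31)). Let us prove (66).

Proof of (66). From (62), we see that
$$\| q - q_2(s)\,h_2(y_1) \|^2_{L^2_{\rho_1}} = c_0\,q_0(s)^2 + c_1\,q_1(s)^2 + l_3(s)^2, \tag{67}$$
where $l_3 = \| \pi_3 q \|_{L^2_{\rho_1}}$ and $\pi_3 q = \sum_{n=3}^{\infty} q_n(s)\,h_n(y_1)$. Using (i) of Lemma 2.6 with $n = 0$ or $1$, along with (65), we see that $\left| l_n'(s) + \left(\frac{n}{2} - 1\right) l_n(s) \right| \le \frac{C}{s^3}$, which yields
$$l_n(s) = O\left(\frac{1}{s^3}\right) \quad\text{as } s \to \infty \text{ for } n = 0 \text{ or } 1. \tag{68}$$
If we project (61) using $\pi_3$, we see that $\partial_s \pi_3 q = \mathcal{L}\pi_3 q + \pi_3(\alpha q)$. Multiplying this equation by $\pi_3 q\,\rho_1(y_1)$ and integrating over $\mathbb{R}$, we see that
$$\frac12\,\frac{d}{ds}\,l_3^2 = \int \mathcal{L}\pi_3 q\;\pi_3 q\,\rho_1\,dy_1 + \int \pi_3(\alpha q)\,\pi_3 q\,\rho_1\,dy_1 \le -\frac12\,l_3^2 + \int \pi_3(\alpha q)\,\pi_3 q\,\rho_1\,dy_1$$
because $\pi_3$ is the projector over the negative part of the spectrum. Using the Cauchy–Schwarz inequality twice, we write
$$\left| \int \pi_3(\alpha q)\,\pi_3 q\,\rho_1\,dy_1 \right| \le \| \pi_3(\alpha q) \|_{L^2_{\rho_1}}\,\| \pi_3 q \|_{L^2_{\rho_1}} \le \| \alpha q \|_{L^2_{\rho_1}}\,l_3 \ \text{ (because } \pi_3 \text{ is a projector)} \le \| \alpha \|_{L^4_{\rho_1}}\,\| q \|_{L^4_{\rho_1}}\,l_3.$$
Therefore,
$$l_3' \le -\frac12\,l_3 + \| \alpha \|_{L^4_{\rho_1}}\,\| q \|_{L^4_{\rho_1}}. \tag{69}$$
Using Lemma 2.5, we see that $\| \alpha \|_{L^4_{\rho_1}} \le \frac{C}{s}\,\| (1 + y_1^2) \|_{L^4_{\rho_1}} \equiv \frac{C}{s}$. Equation (61) has a nice property of control of the $L^4_{\rho_1}$ norm by the $L^2_{\rho_1}$ norm up to some delay in time (see Lemma 2.3 in [12]):
$$\left( \int q(y_1,s)^4\,\rho_1\,dy_1 \right)^{1/4} \le C \left( \int q(y_1,s - s_*)^2\,\rho_1\,dy_1 \right)^{1/2}$$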


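for some $s_* > 0$. Using (65), we end up with
$$\| q \|_{L^4_{\rho_1}} \le \frac{C}{s^2}.$$
Therefore, (69) becomes
$$l_3' \le -\frac12\,l_3 + \frac{C}{s^3},$$
which yields
$$l_3(s) = O\left(\frac{1}{s^3}\right) \quad\text{as } s \to \infty. \tag{70}$$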

Thus, (66) follows from (67), (68) and (70). This concludes the proof of Lemma 2.4.
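The step from the differential inequality $l_3' \le -\frac12 l_3 + \frac{C}{s^3}$ to (70) can be illustrated on the extremal case where equality holds. The sketch below is an illustration only: the function name `solve`, the constant $C = 1$, the initial data and the step size are arbitrary assumptions. It integrates $l' = -\frac12 l + \frac{C}{s^3}$ with the classical fourth order Runge–Kutta scheme and looks at $s^3 l(s)$, which is expected to approach $2C$ for large $s$.

```python
def solve(C=1.0, s0=1.0, l0=1.0, s_end=200.0, step=0.01):
    """Integrate l' = -l/2 + C/s^3 from s0 to s_end with the RK4 scheme."""

    def f(s, l):
        return -0.5 * l + C / s ** 3

    n_steps = round((s_end - s0) / step)
    s, l = s0, l0
    for _ in range(n_steps):
        k1 = f(s, l)
        k2 = f(s + step / 2, l + step * k1 / 2)
        k3 = f(s + step / 2, l + step * k2 / 2)
        k4 = f(s + step, l + step * k3)
        l += step * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += step
    return s, l


s, l = solve()
# The initial datum is forgotten exponentially fast; what remains is the
# forced part, for which s^3 * l stays bounded (close to 2C).
print(s, s ** 3 * l)
```

Changing `l0` does not affect the value at large $s$ in any visible way, which reflects the fact that the homogeneous part decays like $e^{-s/2}$ while the forcing term imposes the $s^{-3}$ rate.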

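B. Projection of Equation (26) on the Different Modes

We prove in this appendix various technical lemmas from Sect. 2. In Subsect. B.1, we prove part (iii) of Lemma 2.6. We prove Lemma 2.8 and Lemma 2.11 in Subsect. B.2. Subsection B.3 is devoted to the proof of Lemma 2.9.

B.1. Equation on the null mode. We prove (iii) of Lemma 2.6 here. Take $\beta \in \mathbb{N}^N$ such that $|\beta| = 2$. If we multiply (26) by $k_\beta(y)\rho(y)$ and integrate over $\mathbb{R}^N$, then we get from (34) and (33)
$$g_\beta'(s) = \int \alpha\,g\,k_\beta\,\rho\,dy.$$
Using (36) and the Cauchy–Schwarz inequality, we write for all $a \in S_\delta$, $|\sigma| < \sigma_0$ and $s \ge -\log T + \sigma_0$,
$$\left| g_\beta' + \frac{1}{4s} \int h_2(y_1)\,g\,k_\beta\,\rho\,dy \right| \le \frac{C}{s^{3/2}} \int (1 + |y|^3)\,|g|\,|k_\beta|\,\rho\,dy \le \frac{C}{s^{3/2}}\,\| (1 + |y|^3)\,k_\beta \|_{L^2_\rho}\,\| g \|_{L^2_\rho} \equiv \frac{C(\beta)}{s^{3/2}}\,I(s).$$
Using (35), (30), (32) and (33), we write
$$\int h_2(y_1)\,g(y,s)\,k_\beta(y)\,\rho(y)\,dy = \sum_{\gamma \in \mathbb{N}^N} g_\gamma(s) \int h_2(y_1)\,h_\gamma(y)\,k_\beta(y)\,\rho(y)\,dy$$
$$= \sum_{\gamma \in \mathbb{N}^N} g_\gamma(s) \int h_2(y_1)\,h_{\gamma_1}(y_1)\,k_{\beta_1}(y_1)\,\rho_1(y_1)\,dy_1 \prod_{i=2}^{N} \int h_{\gamma_i}\,k_{\beta_i}\,\rho_1(y_i)\,dy_i$$
$$= \sum_{\gamma \in \mathbb{N}^N} g_\gamma(s) \int h_2(y_1)\,h_{\gamma_1}(y_1)\,k_{\beta_1}(y_1)\,\rho_1(y_1)\,dy_1 \prod_{i=2}^{N} \delta_{\gamma_i,\beta_i}.$$
Because of the orthogonality relation (33) and symmetry, the above term is zero except when for all $i = 2, \ldots, N$, $\gamma_i = \beta_i$ and $|\gamma_1 - \beta_1| = 0$ or $2$. If $\gamma = \beta$, then the term is $g_\beta(s) \int h_2(y_1)\,h_{\beta_1}(y_1)\,k_{\beta_1}(y_1)\,\rho_1(y_1)\,dy_1 = 4\beta_1\,g_\beta(s)$ after straightforward calculations based on (31) and (33), performed for $\beta_1 = 0$, $1$ or $2$. If $\gamma = \beta \pm (2, 0, \ldots, 0)$, then the term is bounded by $\left| g_\gamma(s) \int h_2\,h_{\beta_1 \pm 2}\,k_{\beta_1}\,\rho_1\,dy_1 \right| \equiv C\,|g_\gamma(s)| \le C\,(l_0 + l_4)$ by (35). This concludes the proof of (iii) in Lemma 2.6.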


B.2. Uniform smallness of the positive modes. We prove Lemma 2.8 and Lemma 2.11 here.

Proof of Lemma 2.8. If we take $n = 0$ in (i) and (ii) in Lemma 2.6, then we see that $x = e^{-s} l_0(s)$ and $y = e^{-s} I(s)$ satisfy inequality (41) in the ODE Lemma 2.11. Therefore, either (42) or (43) holds. Let us assume by contradiction that (43) holds. Then, we see that $I(s) \sim l_0 > 0$ as $s \to \infty$. Using (i) of Lemma 2.6 with $n = 0$, we see that $l_0$ and $I$ go to infinity. Contradiction. Thus (42) holds and we get the estimate for $l_0$. We do the same for $l_1$ and $I$, using Lemma 2.11 with $x = e^{-s/2} l_1$ and $y = e^{-s/2} I$. This closes the proof of Lemma 2.8. It remains to prove Lemma 2.11.

Proof of Lemma 2.11. This lemma was proved in [9] with no attention to the dependence of the conclusion on the data. We have proved there that
\[ \text{either } x = O\!\left(\frac{y}{s}\right) \text{ or } x \sim y > 0 \text{ as } s \to \infty, \tag{71} \]
with no uniform estimates. Let us prove the uniform version. Define $\bar s(M, \hat s) \ge \hat s$ such that
\[ \forall s \ge \bar s, \quad \frac{3M}{2s} + \frac{M}{s^2}\left(5 - \frac{35}{2} M\right) \ge 0. \tag{72} \]
If (42) doesn't hold, then there is $\tilde s \ge \bar s$ such that $\gamma(\tilde s) > 0$, where $\gamma(s) = x(s) - \frac{5M}{s} y(s)$ ($\tilde s$ may depend on $x$ and $y$). Using (41) and (72), we get
\[ \forall s \ge \tilde s, \quad \gamma' \ge y\left[\frac{3M}{2s} + \frac{M}{s^2}\left(5 - \frac{35}{2} M\right)\right] - \frac{5M}{2s}\gamma \ge -\frac{5M}{2s}\gamma. \]
Therefore, $\gamma(s) \ge \gamma(\tilde s)\left(\frac{\tilde s}{s}\right)^{5M/2} > 0$ and
\[ \forall s \ge \tilde s, \quad x(s) > \frac{5M}{s}\, y(s). \tag{73} \]
In particular, $y \ge x > 0$ and we can write from (41) the following inequality:
\[ \forall s \ge \tilde s, \quad \left(\frac{x}{y}\right)' \ge -\frac{M}{s}\left(1 + \frac{x}{y}\right) + \frac{1}{2}\frac{x}{y}\left(1 - \frac{x}{y}\right) \ge -\frac{2M}{s} + \frac{1}{2}\frac{x}{y}\left(1 - \frac{x}{y}\right). \tag{74} \]
The proof will be completed if we rule out the first possibility in (71). We proceed by contradiction. If $x = O\!\left(\frac{y}{s}\right)$, then we have from (73) and (74),
\[ \left(\frac{x}{y}\right)' \ge -\frac{2M}{s} + \frac{5M}{2s}\left(1 - \frac{5M}{s}\right) = \frac{M}{2s} - \frac{25 M^2}{2 s^2} \ge \frac{M}{4s} \]
for $s$ large. This implies that $\frac{x}{y} \to \infty$ as $s \to \infty$. Contradiction with $x \le y$. Thus, only the second case in (71) holds and Lemma 2.11 as well as Lemma 2.8 are proved.


B.3. Decay of the null mode. We prove Lemma 2.9 here. We use Eq. (iii) in Lemma 2.6. We need to estimate the error terms there. Let $s_3(\sigma_0) = \max(-\log T + \sigma_0,\, s_1(\sigma_0))$, where $s_1(\sigma_0)$ is defined in Lemma 2.8. Consider some $a \in S_\delta$ and $|\sigma| \le \sigma_0$. According to Lemma 2.8 and (29), we have for all $s \ge s_3(\sigma_0)$,
\[ l_4(s) \le I(s) \le C(\sigma_0)\frac{\log s}{s^2} \quad \text{and} \quad l_0 \le C\frac{I(s)}{s} \le C(\sigma_0)\frac{\log s}{s^3}. \tag{75} \]
As for the size of $l_4$, we integrate the equation in (i) of Lemma 2.6 with $n = 4$ to get
\[ \forall s \ge s_3, \quad l_4(s) \le e^{-(s - s_3)}\, l_4(s_3) + e^{-s} \int_{s_3}^{s} e^{t}\, \frac{I(t)}{t}\,dt. \]
Using (75), we see that
\[ \int_{s_3}^{s} e^{t}\, \frac{I(t)}{t}\,dt \le C(\sigma_0) \int_{s_3}^{s} e^{t}\, \frac{\log t}{t^3}\,dt \le C(\sigma_0)\, e^{s}\, \frac{\log s}{s^3}. \]
Therefore,
\[ \forall s \ge s_3, \quad l_4(s) \le C(\sigma_0)\frac{\log s}{s^3}. \tag{76} \]
Using (iii) of Lemma 2.6 along with (75) and (76) yields
\[ \forall s \ge s_3,\ \forall |\beta| = 2, \quad \left| g_\beta'(s) + \frac{\beta_1}{s}\, g_\beta(s) \right| \le C(\sigma_0)\frac{\log s}{s^{7/2}}. \]
Since $\beta_1 = 0$, $1$ or $2$ and $|g_\beta(s)| \le C l_2(s) \le C I(s) \le C(\sigma_0)\frac{\log s}{s^2}$ by (75), this yields the conclusion.

C. Estimates in the Intermediate Region

We prove Proposition 3.5 here. From (7) and Corollary 3.4, we have information on $u$ and $u - \tilde u_{\sigma(P_S(x))}$ at $(x, \tilde t(d(x, S)))$, a point on the edge of the blow-up region. We use this as initial data, and solve the two ODEs of Proposition 2.3 between $\tilde t$ and $t$ to get an estimate on $u$ and $u - \tilde u_{\sigma(P_S(x))}$ at $(x, t)$, when $t \in [\tilde t, T)$. For clarity, we work with rescaled versions of $u$ and $\tilde u$, defined for all $(\xi, \tau) \in \mathbb{R}^2 \times \left[-\frac{\tilde t}{T - \tilde t}, 1\right)$ by:
\[ \begin{cases} v(x, \xi, \tau) = (T - \tilde t)^{\frac{1}{p-1}}\, u\bigl(x + \xi\sqrt{T - \tilde t},\ \tilde t + \tau(T - \tilde t)\bigr), \\ \tilde v(x, \xi, \tau) = (T - \tilde t)^{\frac{1}{p-1}}\, \tilde u_{\sigma(P_S(x))}\bigl(d(x, S) + \xi_1\sqrt{T - \tilde t},\ \tilde t + \tau(T - \tilde t)\bigr), \\ h(x, \xi, \tau) = v - \tilde v, \end{cases} \tag{77} \]
where $\tilde t = \tilde t(d(x, S))$ is defined in (44) and goes to $T$ as $d(x, S) \to 0$. We start with initial data at $\tau = 0$ for $v$, $\tilde v$ and $h$ (which corresponds to information on $u$ at time $\tilde t$, i.e. at the frontier between the blow-up and the intermediate regions).


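We see from Corollary 3.4 and (7) that there is $\epsilon_1 > 0$ such that if $|x - \hat a| < \delta$ and $d(x, S) < \epsilon_1$, then
\[ \begin{cases} |v(x, 0, 0) - f(1)| \le C\, \dfrac{\log|\log(T - \tilde t)|}{|\log(T - \tilde t)|}, \\[2mm] |h(x, 0, 0)| \le C (T - \tilde t)^{\frac12}\, |\log(T - \tilde t)|^{3/2 + C_0}. \end{cases} \tag{78} \]
As rescaled versions, $v$ and $\tilde v$ are still solutions of the PDE (1). However, it is easier to work with the localizing ODE given in Proposition 2.3: for all $\epsilon > 0$ and $(x, t) \in \mathbb{R}^N \times [\frac{T}{2}, T)$,
\[ \bigl|\partial_t u - |u|^{p-1} u\bigr| \le \epsilon |u|^p + C_\epsilon, \qquad \bigl|\partial_t \tilde u - |\tilde u|^{p-1} \tilde u\bigr| \le \epsilon |\tilde u|^p + C_\epsilon, \]
where $C_\epsilon$ denotes hereafter a constant depending only on $\epsilon$. Since $\sigma(a)$ is continuous in terms of $a$ (see Proposition 2.1), we see from the definition of $\tilde u_\sigma$ (10) that for all $a \in S_\delta$ and $(x, t) \in \mathbb{R}^N \times [T - e^{-\sigma_0}\frac{T}{2}, T)$,
\[ \bigl|\partial_t \tilde u_{\sigma(a)} - |\tilde u_{\sigma(a)}|^{p-1} \tilde u_{\sigma(a)}\bigr| \le \epsilon |\tilde u_{\sigma(a)}|^p + C_\epsilon. \]
Using (77), we get for all $\epsilon > 0$, $x \in B(\hat a, \delta)$ and $\tau \in [0, 1)$,
\[ \begin{aligned} \bigl|\partial_\tau v(x, 0, \tau) - |v|^{p-1} v\bigr| &\le \epsilon |v|^p + C_\epsilon (T - \tilde t)^{\frac{p}{p-1}}, \\ \bigl|\partial_\tau \tilde v(x, 0, \tau) - |\tilde v|^{p-1} \tilde v\bigr| &\le \epsilon |\tilde v|^p + C_\epsilon (T - \tilde t)^{\frac{p}{p-1}}, \\ \bigl|\partial_\tau h(x, 0, \tau) - p|\bar v|^{p-1} h\bigr| &\le \epsilon\,(|v|^p + |\tilde v|^p) + C_\epsilon (T - \tilde t)^{\frac{p}{p-1}} \end{aligned} \tag{79} \]
for some $\bar v \in [v, \tilde v]$. Since the solution of
\[ v_0' = v_0^p, \qquad v_0(0) = f(1) \]
is $v_0(\tau) = \left(\frac{(p-1)^2}{4p} + (p - 1)(1 - \tau)\right)^{-\frac{1}{p-1}}$, a bounded function for all $\tau \in [0, 1]$, we use the continuity of ODE solutions with respect to initial data to get
\[ \sup_{\tau \in [0,1)} |v(x, 0, \tau) - v_0(\tau)| + |\tilde v(x, 0, \tau) - v_0(\tau)| \to 0 \quad \text{as } d(x, S) \to 0 \]
and
\[ \sup_{\tau \in [0,1)} |h(x, 0, \tau)| \le C |h(x, 0, 0)| \]
whenever $d(x, S) \le \epsilon_0$ for some $\epsilon_0 > 0$. Therefore, we get from (77) and (78):
\[ \sup_{\tilde t \le t < T} \bigl|u(x, t) - \tilde u_{\sigma(P_S(x))}(d(x, S), t)\bigr| \le C (T - \tilde t)^{\frac12 - \frac{1}{p-1}}\, |\log(T - \tilde t)|^{\frac32 + C_0}. \]
Since $d(x, S) \ge \sqrt{(T - t)|\log(T - t)|}$ whenever $t \ge \tilde t$ (see (44)) and
\[ (T - \tilde t)^{\frac12 - \frac{1}{p-1}}\, |\log(T - \tilde t)|^{\frac32 + C_0} \sim C\, d(x, S)^{1 - \frac{2}{p-1}}\, |\log d(x, S)|^{\frac{p}{p-1} + C_0} \]
as $d(x, S) \to 0$ (see (44)), this concludes the proof of Proposition 3.5.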
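The explicit comparison solution $v_0$ used in the proof above can be checked by direct differentiation. The sketch below assumes $p > 1$ and writes $A(\tau)$ for the bracket in the formula for $v_0$ (a name introduced here only for illustration); that $v_0(0)$ coincides with $f(1)$ depends on the definition of the profile $f$, which is not restated in this appendix.

```latex
\[
v_0(\tau) = A(\tau)^{-\frac{1}{p-1}}, \qquad
A(\tau) = \frac{(p-1)^2}{4p} + (p-1)(1-\tau), \qquad
A'(\tau) = -(p-1).
\]
% Chain rule:
\[
v_0'(\tau) = -\frac{1}{p-1}\, A(\tau)^{-\frac{1}{p-1}-1} A'(\tau)
           = A(\tau)^{-\frac{p}{p-1}} = v_0(\tau)^p .
\]
% Boundedness on [0,1]: A is decreasing in tau, hence
\[
A(\tau) \ge A(1) = \frac{(p-1)^2}{4p} > 0
\quad\Longrightarrow\quad
0 < v_0(\tau) \le \left(\frac{(p-1)^2}{4p}\right)^{-\frac{1}{p-1}}
\quad \text{for } \tau \in [0,1].
\]
```

The positive term $\frac{(p-1)^2}{4p}$ is what keeps $v_0$ from blowing up at $\tau = 1$, so that continuous dependence on initial data can be applied on the whole interval.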

Acknowledgement. The author wants to thank Fang-Hua Lin and Frank Merle for interesting conversations about the work, and Robert V. Kohn who made valuable suggestions and pointed out many references. Many thanks to Naoufel Ben Abdallah for his kind invitation to the Université Paul Sabatier in Toulouse, where part of this work was done. The remarks of the referee are valuable and highly appreciated. The author wants to acknowledge partial support received from the NSF grant DMS-9631832.


References

1. Bernoff, A.J., Bertozzi, A.L., Witelski, T.P.: Axisymmetric surface diffusion: Dynamics and stability of self-similar pinchoff. J. Statist. Phys. 93, 725–776 (1998)
2. Betterton, M.D., Brenner, M.P.: Collapsing bacterial cylinders. Phys. Rev. E 64, 061904 (2001)
3. Brenner, M.P., Constantin, P., Kadanoff, L.P., Schenkel, A., Venkataramani, S.C.: Diffusion, attraction and collapse. Nonlinearity 12, 1071–1098 (1999)
4. Brenner, M.P., Levitov, L., Budrene, E.O.: Physical mechanisms for chemotactic pattern formation by bacteria. Biophys. J. 74, 1677–1693 (1995)
5. Bricmont, J., Kupiainen, A.: Universality in blow-up for nonlinear heat equations. Nonlinearity 7, 539–575 (1994)
6. Budrene, E.O., Berg, H.C.: Complex patterns formed by motile cells of Escherichia coli. Nature 349, 630–633 (1991)
7. Budrene, E.O., Berg, H.C.: Dynamics of formation of symmetrical patterns by chemotactic bacteria. Nature 376, 49–53 (1995)
8. Fermanian Kammerer, C., Merle, F., Zaag, H.: Stability of the blow-up profile of non-linear heat equations from the dynamical system point of view. Math. Annalen 317, 195–237 (2000)
9. Fermanian Kammerer, C., Zaag, H.: Boundedness up to blow-up of the difference between two solutions to a semilinear heat equation. Nonlinearity 13, 1189–1216 (2000)
10. Filippas, S., Kohn, R.V.: Refined asymptotics for the blowup of $u_t - \Delta u = u^p$. Comm. Pure Appl. Math. 45, 821–869 (1992)
11. Giga, Y., Kohn, R.V.: Asymptotically self-similar blow-up of semilinear heat equations. Comm. Pure Appl. Math. 38, 297–319 (1985)
12. Herrero, M.A., Velázquez, J.J.L.: Blow-up behaviour of one-dimensional semilinear parabolic equations. Ann. Inst. H. Poincaré Anal. Non Linéaire 10, 131–189 (1993)
13. Merle, F., Zaag, H.: Optimal estimates for blowup rate and behavior for nonlinear heat equations. Comm. Pure Appl. Math. 51, 139–196 (1998)
14. Merle, F., Zaag, H.: Refined uniform estimates at blow-up and applications for nonlinear heat equations. Geom. Funct. Anal. 8, 1043–1085 (1998)
15. Merle, F., Zaag, H.: A Liouville theorem for vector-valued nonlinear heat equations and applications. Math. Annalen 316, 103–137 (2000)
16. Mueller, C.E., Weissler, F.B.: Single point blow-up for a general semilinear heat equation. Indiana Univ. Math. J. 34, 881–913 (1985)
17. Segur, H., Kruskal, M.D.: Nonexistence of small-amplitude breather solutions in $\phi^4$ theory. Phys. Rev. Lett. 58, 747–750 (1987)
18. Stewartson, K., Stuart, J.T.: A non-linear instability theory for a wave system in plane Poiseuille flow. J. Fluid Mech. 48, 529–545 (1971)
19. Velázquez, J.J.L.: Higher-dimensional blow up for semilinear parabolic equations. Comm. Partial Differential Equations 17, 1567–1596 (1992)
20. Ward, M.J.: Topics in singular perturbations and hybrid asymptotic-numerical methods. In: ICIAM 95, Hamburg, 1995, Berlin: Akademie Verlag, 1996, pp. 435–462
21. Weissler, F.B.: Single point blow-up for a semilinear initial value problem. J. Differential Equations 55, 204–224 (1984)
22. Zaag, H.: Regularity of the blow-up set and singular behavior for semilinear heat equations. In: Proceedings of the third international Palestinian conference on math and math education, Bethlehem, August 2000
23. Zaag, H.: On the regularity of the blow-up set for semilinear heat equations. Ann. Inst. H. Poincaré Anal. Non Linéaire, 2002. To appear

Communicated by P. Constantin

Commun. Math. Phys. 225, 551 – 571 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

Long-Time Asymptotics for Strong Solutions of the Thin Film Equation

J. A. Carrillo¹, G. Toscani²

¹ Departamento de Matemática Aplicada, Universidad de Granada, 18071 Granada, Spain
² Department of Mathematics, University of Pavia, via Ferrata 1, 27100 Pavia, Italy

Received: 2 February 2001 / Accepted: 7 October 2001

Abstract: In this paper we investigate the large-time behavior of strong solutions to the one-dimensional fourth order degenerate parabolic equation $u_t = -(u u_{xxx})_x$, modeling the evolution of the interface of a spreading droplet. For nonnegative initial values $u_0(x) \in H^1(\mathbb{R})$, either compactly supported or of finite second moment, we prove explicit and universal algebraic decay in the $L^1$-norm of the strong solution $u(x, t)$ towards the unique (among source type solutions) strong source type solution of the equation with the same mass. The method we use is based on the study of the time decay of the entropy introduced in [13] for the porous medium equation, and uses analogies between the thin film equation and the porous medium equation.

1. Introduction

In this paper we analyze the asymptotic behavior of the one-dimensional fourth-order nonlinear degenerate diffusion equation
\[ \frac{\partial u}{\partial t} = -(u u_{xxx})_x, \qquad (x \in \mathbb{R},\ t > 0), \tag{1.1} \]
with
\[ u(x, t = 0) = u_0(x) \ge 0, \qquad (x \in \mathbb{R}). \tag{1.2} \]
This equation, derived from a lubrication approximation, models the surface tension dominated motion of thin viscous films and spreading droplets [24]:
\[ \frac{\partial u}{\partial t} = -\nabla_x \cdot \bigl(f(u)\, \nabla_x \Delta_x u\bigr). \tag{1.3} \]
Equation (1.1) is a particular case of the thin film equation
\[ \frac{\partial u}{\partial t} = -(|u|^n u_{xxx})_x, \qquad (x \in \mathbb{R},\ t > 0), \tag{1.4} \]


where $n > 0$. Compactly supported nonnegative source type solutions to (1.4) exist for all $0 < n < 3$ [7]. For a given $n$ and mass, there is more than one similarity solution $U(x, t)$. A unique (up to translation) solution is obtained by imposing the additional constraint $U_x(x, t) = 0$ at the edge of the support. Recent numerical studies [8, 9] indicate that the support of the solution has finite speed of propagation and continuous flux, two properties desirable for a physically correct model. Moreover, they show a rapid convergence of the solution onto the similarity solution before the merging of support. For second order degenerate diffusion equations, like the porous medium equation [2]
\[ \frac{\partial u}{\partial t} = \nabla_x \cdot (|u|^n \nabla_x u), \qquad n > 0, \tag{1.5} \]

the convergence of the solution towards the similarity solution has been known for many years (see [30, 31] and the references therein). Recently, this problem has been addressed by techniques borrowed from kinetic theory, essentially based on the study of the time decay of the entropy [13, 14, 16, 26]. This strategy is possible any time we can work with an evolution equation which possesses a unique steady state of given mass, in correspondence to which the convex entropy attains the (unique) extremal point. For this reason, instead of working on (1.1) directly, one considers the asymptotic decay towards its equilibrium state of solutions to the (nonlinear) equation
\[ \frac{\partial v}{\partial t} = (xv - v v_{xxx})_x, \qquad (x \in \mathbb{R},\ t > 0), \tag{1.6} \]
\[ v(x, t = 0) = u_0(x) \ge 0, \qquad (x \in \mathbb{R}). \tag{1.7} \]

The reason relies on the following fundamental remark: there exists a time dependent scaling which transforms (1.6) into the thin film equation (1.1); moreover, we can fix the time scaling in order for the initial data for (1.6) after rescaling to be the same as for the original equation. The exact expression of this time transformation follows from a by now standard time dependent change of variables [13, 14, 16, 26] and is given by
\[ v(x, t) = \alpha(t)\, u(\alpha(t) x, \beta(t)), \tag{1.8} \]
where $\alpha(t) = e^t$ and $\beta(t) = \left(e^{5t} - 1\right)/5$.
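The change of variables (1.8) can be checked by a direct chain-rule computation. The following is a formal sketch for smooth solutions only; in it, $u$ and its derivatives are evaluated at $(\alpha(t) x, \beta(t))$, and the identities $\alpha' = \alpha$ and $\beta' = e^{5t} = \alpha^5$ are used.

```latex
\begin{align*}
v_t &= \alpha' u + \alpha \alpha' x\, u_x + \alpha \beta' u_t
     = \alpha u + \alpha^2 x\, u_x + \alpha^6 u_t, \\
(xv)_x &= v + x v_x = \alpha u + \alpha^2 x\, u_x, \\
(v v_{xxx})_x &= \bigl(\alpha u \cdot \alpha^4 u_{xxx}\bigr)_x = \alpha^6 (u u_{xxx})_x .
\end{align*}
% If u solves (1.1), i.e. u_t = -(u u_xxx)_x, then
\[
v_t = (xv)_x + \alpha^6 u_t = (xv)_x - (v v_{xxx})_x = (xv - v v_{xxx})_x ,
\]
% which is (1.6). Moreover alpha(0) = 1 and beta(0) = 0 give v(x,0) = u(x,0) = u_0(x).
```

The exponent $5$ in $\beta$ is exactly what makes the factor $\alpha\beta' = \alpha^6$ match the scaling of the fourth-order term.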

As a conclusion, any property about the asymptotic behavior of $v(x, t)$ can be translated into a result about the asymptotic behavior of $u(x, t)$. In fact, self-similar source-type solutions of (1.1) are translated into steady states for (1.6). Equation (1.6) has a unique $C^1(\mathbb{R})$ (up to translation) compactly supported steady state of given mass $M$,
\[ v_\infty(x) = \frac{1}{24}\left(C^2 - x^2\right)_+^2 \tag{1.9} \]
with $C = C(M)$, and, as usual, $g_+$ indicates the positive part of $g$. Note that the uniqueness is due to the regularity condition $C^1(\mathbb{R})$ which implies that $v_x = 0$ at the edge of the support. In fact, this uniqueness is derived from the uniqueness of the source type solutions $U(x, t)$ for (1.1) with the additional assumption $U_x(x, t) = 0$ at the edge of the support (see [7]). This solution has been found first by Smyth and Hill [28] and


proved to be linearly stable very recently [8]. The steady state (1.9) is nothing but the Barenblatt–Pattle steady state of the rescaled porous medium equation
\[ \frac{\partial h}{\partial t} = \left(xh + h^{\frac12} h_x\right)_x, \qquad (x \in \mathbb{R},\ t > 0). \tag{1.10} \]
Following [13, 14] the natural entropy to study the asymptotic behavior for (1.10) is given by
\[ H(f) = \int_{\mathbb{R}} \left(\frac{x^2}{2} f + \sqrt{\frac{8}{3}}\, f^{3/2}\right) dx. \tag{1.11} \]
In fact, this entropy has a unique minimum point attained at the stationary state (1.9). A related functional involving the second moment has been recently used in [11] for the analysis of a fourth order diffusion equation similar to (1.6), where the confining effect is due to a nonlinear second order antidiffusion term. It is interesting to remark that through the evolution of this functional they are able to study the blow-up in finite time of solutions. However, in the present study the term of $H(f)$ that is nonlinear in $f$ is due to the fourth-order term rather than to the second-order term as in [11].

In the next sections, given the solution to (1.6), we shall study the asymptotic decay of its entropy (2.9) towards the minimum, showing exponential convergence to equilibrium in relative entropy. A Csiszár–Kullback type inequality [20, 14] then implies an exponential convergence to the equilibrium $v_\infty$ in $L^1$. Going back through the change of variables we recover an algebraic decay rate towards the most regular source-type solution [7, 10] of the thin-film equation (1.1). All the above computations can be done rigorously for strong solutions of the problem (1.1) in which $u \in C^1(\mathbb{R})$ for a.e. $t > 0$ and this is the aim of the rest of the paper. We state our main result here and we refer to the fourth section for a more rigorous statement.

Theorem 1.1. The intermediate asymptotics in $L^1$ for strong positive solutions of the thin film equation (1.1) are given by the unique (among source type solutions) strong source type solution of the equation with the same mass. Moreover, an explicit and universal algebraic rate of convergence can be obtained.
Let us point out that to our knowledge the term intermediate asymptotics has been used [3] when self-similar behavior as t → ∞ for solutions of Cauchy problems is obtained, although they can be considered anyway as large-time asymptotics results. For the case of prescribed contact angle at the boundary we refer to [27], in which the existence of solution for this problem is analysed. These solutions and the less regular source-type solutions are not strong solutions, and thus outside the scope of Theorem 1.1. It would be certainly interesting to study the stability properties of these less regular source-type solutions. The methods introduced in this work are restricted to Eq. (1.1) among the set of Eqs. (1.4) due to two facts. First, the entropy associated to the strong steady state (1.9) is explicit as the steady state is; in the general case (1.4), 0 < n < 3, the strong steady states are not explicit nor the entropy functionals. Therefore, the application of the entropy-entropy production method based on the knowledge of the derivative in time of the functional becomes rather difficult. Nevertheless, extensions of the method to other fourth-order equations is possible, and we discuss them in the last section. Equation (1.1) has a very particular structure, and in fact it is the only one among thin-film equations


(1.4), $0 < n < 3$, that can be written formally as the gradient flow for a suitable functional in a suitable metric (we refer to [26, 27] for details). In our case, (1.6) can be written formally as a gradient flow with respect to the entropy functional while (1.1) with respect to the dissipation of $H(f)$. This would be part of future research. Also, there are particular weak forms of the equation that are only valid for $n = 1$, see [11]. Finally, we should remark that the uniqueness of strong solutions of the Cauchy problem for (1.1) continues to remain an open problem. Nevertheless, we are able to understand the large time asymptotics of these strong solutions simply by looking at entropy and energy identities and inequalities. This uniqueness issue has nothing to do with the uniqueness of the strong source type solution among the set of source type solutions [7].

The results of the present paper will be reached by steps. As a first step, in the next section we will discuss the analogies between the second and fourth order diffusions. By this analysis, the case $n = 1$ will be clearly separated from any other equation of type (1.4) with $n \ne 1$. Section 3 will be devoted to an overview of the properties of strong solutions to the thin film equation (1.4). Section 4 contains the main result, namely the rigorous study of the time decay of the entropy for compactly supported initial data. Section 5 contains the generalization of the main result for non compactly supported initial data and finally, in Sect. 6 we discuss different fourth order degenerate diffusion equations for which this strategy may be applied.

Some of these problems have been addressed before. In particular, the asymptotic decay of the solution to (1.4) in a bounded domain with periodic boundary condition has been studied in [10] by means of generalized convex entropies.
The authors show that the weak nonnegative solution becomes a strong positive solution after some finite time, and approaches exponentially in time its mean as $t \to \infty$. Recently, explicit constants for the exponential rate of decay of the solution to the same problem have been obtained in [23]. Here, the natural entropy (1.11) has been introduced for the bounded domain problem.

2. Similarities Between the Fourth- and Second-Order Diffusion Equations

In this section we compare the fourth-order problem (1.4) to the well-known second order degenerate diffusion equation (1.5). Some similarities between the fourth- and second-order cases are that both equations are parabolic and in divergence form, with a subdiffusive nonlinear diffusion coefficient. Moreover, in both cases there are compactly supported source type solutions: for all $n > 1$ in the second-order case, and for all $0 < n < 3$ in the fourth-order case. In general, at any deeper level, the similarities between the fourth- and second-order problems cease to exist. One striking difference is the lack of a maximum principle for the fourth-order problem. If $n = 1$, additional similarities with the porous medium equation
\[ h_t = a\left(h^{\frac12} h_x\right)_x, \tag{2.1} \]
can be found. First, let us remark that (1.1) can be written as
\[ u_t = -2\left(u^{\frac32}\left(u^{\frac12}\right)_{xx}\right)_{xx}. \tag{2.2} \]
After the rescaling (1.8), Eq. (2.2) becomes
\[ v_t = -2\left(v^{\frac32}\left(v^{\frac12}\right)_{xx}\right)_{xx} + (xv)_x. \tag{2.3} \]
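The equivalence between (1.1) and the form (2.2) can be checked with the substitution $w = u^{1/2}$. This is a formal sketch on the set where $u > 0$ and $u$ is smooth; the symbol $w$ is introduced here only for the computation.

```latex
% With u = w^2:
\begin{align*}
u_x &= 2 w w_x, \qquad u_{xx} = 2 w_x^2 + 2 w w_{xx}, \qquad
u_{xxx} = 6 w_x w_{xx} + 2 w w_{xxx}, \\
u\, u_{xxx} &= 6 w^2 w_x w_{xx} + 2 w^3 w_{xxx}
            = 2 \left(w^3 w_{xx}\right)_x
            = 2 \left(u^{\frac32} \left(u^{\frac12}\right)_{xx}\right)_x .
\end{align*}
% Differentiating once more in x:
\[
u_t = -(u\, u_{xxx})_x = -2 \left(u^{\frac32} \left(u^{\frac12}\right)_{xx}\right)_{xx},
\]
% which is (2.2); the drift term (xv)_x in (2.3) is unaffected by this rewriting.
```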


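For a given positive constant $c$, let us add and subtract to (2.3) the quantity
\[ 2c\left(v^{\frac32}\right)_{xx} = 2c\left(v^{\frac32}\left(\frac{x^2}{2}\right)_{xx}\right)_{xx}. \tag{2.4} \]
We have
\begin{align*}
v_t &= -2\left(v^{\frac32}\left(v^{\frac12} + c\,\frac{x^2}{2}\right)_{xx}\right)_{xx} + 2c\left(v^{\frac32}\right)_{xx} + (xv)_x \\
&= -2c\left(v^{\frac32}\left(\frac{1}{c}\, v^{\frac12} + \frac{x^2}{2}\right)_{xx}\right)_{xx} + \left(v\left(6c\, v^{\frac12} + \frac{x^2}{2}\right)_x\right)_x. \tag{2.5}
\end{align*}
If we set $c = 1/\sqrt{6}$, we finally obtain
\[ v_t = -\frac{2}{\sqrt{6}}\left(v^{\frac32}\left(\sqrt{6}\, v^{\frac12} + \frac{x^2}{2}\right)_{xx}\right)_{xx} + \left(v\left(\sqrt{6}\, v^{\frac12} + \frac{x^2}{2}\right)_x\right)_x. \tag{2.6} \]
Now, consider Eq. (2.1) rescaled as in (1.8). If we set $a = \frac{\sqrt{6}}{2}$ we obtain
\[ w_t = \left(w\left(\sqrt{6}\, w^{\frac12} + \frac{x^2}{2}\right)_x\right)_x, \tag{2.7} \]
which is nothing but Eq. (2.6) without the higher order term. Since steady states of both Eqs. (2.6) and (2.7) are obtained by setting
\[ v\left(\sqrt{6}\, v^{\frac12} + \frac{x^2}{2}\right)_x = 0; \tag{2.8} \]
both equations have the same $C^1(\mathbb{R})$ steady states. Thus, by studying entropies of the second order nonlinear degenerate diffusion equation (2.7) we obtain at once entropies for the fourth-order nonlinear diffusion equation (2.6). Following [13, 14], it is immediate to recover the exact form of the entropy associated to the steady state $v_\infty$ given in (1.9):
\[ H(f) = \int_{\mathbb{R}} \left(\frac{x^2}{2} f + \sqrt{\frac{8}{3}}\, f^{3/2}\right) dx. \tag{2.9} \]
Theorem 2.1 of [29] then gives that, for any given $\Omega \supseteq (-C, C)$, $v_\infty$ is the unique extremal of $H(f)$ for all $f$ belonging to the manifold
\[ F_c = \left\{ f \ge 0,\ f \in L^1(\Omega),\ \int_\Omega f(x)\,dx = M \right\}, \tag{2.10} \]
namely
\[ H(f) \ge H(v_\infty), \text{ and the equality holds if and only if } f = v_\infty. \tag{2.11} \]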
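That $v_\infty$ from (1.9) is a critical point of the entropy (2.9) and a steady state of (1.6) can be seen from a short computation on the support $|x| < C$. This is a formal sketch: the first variation is computed without tracking the mass constraint, which only adds a constant Lagrange multiplier.

```latex
% First variation of (2.9):
\[
\frac{\delta H}{\delta f} = \frac{x^2}{2} + \frac{3}{2}\sqrt{\frac{8}{3}}\, f^{\frac12}
                          = \frac{x^2}{2} + \sqrt{6}\, f^{\frac12} .
\]
% On |x| < C, (1.9) gives v_infty^{1/2} = (C^2 - x^2)/(2 sqrt(6)), hence
\[
\sqrt{6}\, v_\infty^{\frac12} + \frac{x^2}{2} = \frac{C^2 - x^2}{2} + \frac{x^2}{2} = \frac{C^2}{2},
\]
% a constant, so (2.8) holds. For the fourth-order term, expand
\[
v_\infty = \frac{1}{24}\left(x^4 - 2 C^2 x^2 + C^4\right), \qquad
(v_\infty)_{xxx} = x, \qquad
x\, v_\infty - v_\infty (v_\infty)_{xxx} = 0 ,
\]
% so the flux in (1.6) vanishes identically on the support.
```

Both $v_\infty$ and $(v_\infty)_x$ vanish at $x = \pm C$, which is the $C^1(\mathbb{R})$ matching condition singled out in the text.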


Using this entropy (2.9) we have, at least formally, integrating by parts in Eq. (2.6), that
\begin{align*}
\frac{d}{dt} H(v) &= \int_{\mathbb{R}} v_t \left(\frac{x^2}{2} + \sqrt{6}\, v^{1/2}\right) dx \\
&= -\int_{\mathbb{R}} v \left[\left(\frac{x^2}{2} + \sqrt{6}\, v^{1/2}\right)_x\right]^2 dx - \frac{2}{\sqrt{6}} \int_{\mathbb{R}} v^{3/2} \left[\left(\frac{x^2}{2} + \sqrt{6}\, v^{1/2}\right)_{xx}\right]^2 dx. \tag{2.12}
\end{align*}
Let $v_\infty(x)$ be the stationary solution defined by (1.9). The relative entropy $H(v|v_\infty)$ is defined by $H(v|v_\infty) = H(v) - H(v_\infty)$. Thus, by (2.12) we have
\[ \frac{d}{dt} H(v|v_\infty) \le -\int_{\mathbb{R}} v \left[\left(\frac{x^2}{2} + \sqrt{6}\, v^{1/2}\right)_x\right]^2 dx = -D_p(v) \le 0. \tag{2.13} \]
We remark that $D_p(v)$ is the entropy production associated to the porous medium type equation (2.1). Lower bounds for the entropy production in terms of the relative entropy have been obtained in [13, 16]. These bounds correspond to generalized logarithmic Sobolev inequalities. The results of [13] and Theorem 17 of [14] assure that
\[ H(v|v_\infty) \le \frac{1}{2} D_p(v). \tag{2.14} \]
Applying (2.14) to $v(t)$ we finally deduce
\[ \frac{d}{dt} H(v(t)|v_\infty) \le -2 H(v(t)|v_\infty), \]
which implies exponential convergence to equilibrium in relative entropy with an explicit rate. A Csiszár–Kullback type inequality [20, 14] shows that the $L^1$ deviation of $v$ with respect to $v_\infty$ is bounded by $H(v(t)|v_\infty)$ and thus, one proves the exponential convergence to the equilibrium $v_\infty$ in $L^1$. The change of variables (1.8) finally gives an algebraic decay rate towards the most regular source-type solution [7, 10] of the thin-film equation (1.1). All the formal computations we gave can be done rigorously for strong solutions and compactly supported initial data. This will be the object of the next sections.

Remark 2.1. The similarity between the thin film equation (1.4) and the porous medium equation (1.5) is unlikely limited to the case discussed in this section. Only in this case in fact the $C^1$-similarity solution to (1.1) is also a similarity solution to the porous medium equation (1.5). In all the other cases in which a similarity solution to (1.4) is known to exist, namely for $0 < n < 3$, the solution is not explicit, neither is it a Barenblatt–Pattle type similarity solution. Moreover, only if $n = 1$ can the thin film equation be written in the form (2.6), which is the key of the whole analogy. Nevertheless, the properties of the similarity solutions for $n \ne 1$ discussed in [7] (compact support and monotonicity) do not exclude, in the spirit of the analysis of [29], the existence of a suitable entropy which attains the minimum in correspondence to the similarity solution.
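The way an exponential rate in the rescaled time turns into an algebraic rate in the original time can be made explicit. The sketch below is formal and only tracks the relative entropy; the symbol $\tau$ for the original time variable of $u$ is introduced here for illustration, and the resulting $L^1$ rate depends on the precise form of the Csiszár–Kullback inequality, which is not reproduced in this section.

```latex
% Gronwall's lemma applied to d/dt H <= -2 H:
\[
H(v(t)|v_\infty) \le e^{-2t}\, H(v(0)|v_\infty), \qquad t \ge 0 .
\]
% Original time tau = beta(t) from (1.8):
\[
\tau = \frac{e^{5t} - 1}{5}
\quad\Longleftrightarrow\quad
e^{t} = (1 + 5\tau)^{\frac15},
\qquad\text{so}\qquad
e^{-2t} = (1 + 5\tau)^{-\frac25} .
\]
% Hence the exponential decay in t is an algebraic decay in tau:
\[
H(v(t)|v_\infty) \le (1 + 5\tau)^{-\frac25}\, H(v(0)|v_\infty) .
\]
```

Any exponential bound $e^{-kt}$ for $v$ becomes in the same way a power law $(1 + 5\tau)^{-k/5}$ in the time variable of $u$.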


3. Overview on Properties of Strong Solutions

In this section, we deal with the asymptotic decay of the solution to (2.5). In the sequel we will assume the initial data $u_0$ is non-negative and compactly supported. Moreover, $u_0 \in H^1(\mathbb{R})$ with mass $M > 0$. We will denote by (H0) this set of hypotheses on the initial data. The Cauchy problem for Eq. (1.1) has been studied deeply in [5]. The basic existence theorem in a bounded domain with no-flux boundary conditions has been obtained by Bernis and Friedman [6]. This theorem assures the existence of a weak solution in the sense of Definition 3.1 below. Later on, a detailed analysis of the regularity of the solution [4, 10] proved the existence of strong solutions in the sense of Definition 3.2 below and many other properties. Bernis [5] studied the finite speed of propagation of strong solutions by means of new entropy estimates. These strong solutions are obtained by a regularization first introduced in [6] and subsequently used in [4, 10, 5]. In what follows, we merely collect the concepts of weak and strong solutions and their properties, referring the reader for details mainly to [5, 10]. We will use the notations $Q = \mathbb{R} \times (0, \infty)$, $Q_T = \mathbb{R} \times (0, T)$, $P = Q \setminus (\{u = 0\} \cup \{t = 0\})$ and $P_T = Q_T \setminus (\{u = 0\} \cup \{t = 0\})$.

Definition 3.1. A weak solution to problem (1.1) with initial condition satisfying (H0) is a function $u(x, t) \ge 0$ enjoying the following properties:
\[ u \in C^{1/2, 1/8}(\bar Q) \cap L^\infty(0, \infty; H^1(\mathbb{R})), \tag{3.1} \]
\[ u \in C^\infty(P) \text{ and } u^{1/2} u_{xxx} \in L^2(P), \tag{3.2} \]
\[ \iint_Q u \psi_t + \iint_P u u_{xxx} \psi_x = 0 \tag{3.3} \]
for all $\psi \in \mathrm{Lip}(\bar Q)$ with compact support inside $Q$,
\[ u(x, 0) = u_0(x),\ x \in \mathbb{R}, \text{ and } u_x(\cdot, t) \to u_{0x} \text{ strongly in } L^2(\mathbb{R}) \text{ as } t \to 0. \tag{3.4} \]
If a weak solution satisfies additional regularity it is called a strong solution. Let us note that the concept of strong solution is much weaker than that of classical solution.

Definition 3.2. A strong solution to problem (1.1) with initial condition satisfying (H0) is a weak solution verifying:
\[ u \in L^2([0, T], H^2(\mathbb{R})) \tag{3.5} \]
for any $T > 0$ and thus $u(\cdot, t) \in C^1(\mathbb{R})$ a.e. in $t > 0$.

By means of a regularization, a strong solution for problem (1.1) was proved to exist in [6, 4, 10, 5]. This solution satisfies the following additional regularity:
\[ u^{1 - s/2} \in L^2([0, T], H^2(\mathbb{R})) \text{ for } 0 < s < \frac{1}{2}, \tag{3.6} \]
\[ (u^r)_x \in L^4(Q_T) \text{ for any } \frac{1}{2} - \frac{s}{4} \le r < 1,\ 0 < s < \frac{1}{2}, \tag{3.7} \]


for any $T > 0$, and $u$ satisfies Eq. (1.1) in the sense
\[ \iint_Q u \psi_t - \iint_Q u u_{xx} \psi_{xx} - \frac{1}{r} \iint_Q u^{1-r} (u^r)_x\, u_{xx} \psi_x = 0 \tag{3.8} \]
for all $\psi \in C^\infty(\bar Q)$ with compact support inside $Q$ and $r$, $s$ as in (3.6)–(3.7). Notice that the case $s = 0$ and $r = 1/2$ is not included. Moreover, $u$ also satisfies (see [11]) Eq. (1.1) in the sense
\[ -\iint_{Q_T} u \psi_t + \int_{\mathbb{R}} u(x, T) \psi(x, T) - \int_{\mathbb{R}} u_0(x) \psi(x, 0) = \iint_{Q_T} u u_x \psi_{xxx} + \frac{3}{2} \iint_{Q_T} u_x^2 \psi_{xx} \tag{3.9} \]
for all $\psi \in C_0^\infty(\bar Q_T)$ and all $T > 0$. Let us remark that this last property comes from the previous weak formulation by integrating by parts once in the last term in (3.8), taking into account that $u_x = 0$ in the set where $u = 0$ due to $u \in C^1(\mathbb{R})$ and $u \ge 0$. Strong solutions are known to preserve mass
\[ \int_{\mathbb{R}} u(x, t)\,dx = \int_{\mathbb{R}} u_0(x)\,dx = M \quad \text{(Conservation of mass)}. \tag{3.10} \]
Moreover, there is dissipation of surface-tension energy, that is,
\[ \int_{\mathbb{R}} u_x^2(x, T)\,dx + 2 \iint_{P_T} u u_{xxx}^2(x, t)\,dx\,dt \le \int_{\mathbb{R}} u_{0x}^2\,dx, \tag{3.11} \]
for all $T > 0$, and they admit entropies (Sect. 3 in [5]): in particular, the function
\[ t \longmapsto \int_{\mathbb{R}} u^{1+\lambda}(x, t)\,dx \]
is absolutely continuous in $[0, \infty)$ for $-\frac{1}{2} < \lambda$ and $\lambda \notin \{0, 2, 3\}$, and verifies
\[ -\frac{1}{\lambda(\lambda + 1)} \frac{d}{dt} \int_{\mathbb{R}} u^{1+\lambda}(x, t)\,dx = \int_{\mathbb{R} \cap \{u > 0\}} u^\lambda u_{xx}^2\,dx + \frac{\lambda(1 - \lambda)}{3} \int_{\mathbb{R} \cap \{u > 0\}} u^{\lambda - 2} u_x^4\,dx. \tag{3.12} \]
Moreover, we have the following integration-by-parts formula
\[ \int_{\mathbb{R} \cap \{u > 0\}} u^{\lambda - 1} u_{xx} u_x^2\,dx = \frac{1 - \lambda}{3} \int_{\mathbb{R} \cap \{u > 0\}} u^{\lambda - 2} u_x^4\,dx \quad \text{a.e. } t > 0. \tag{3.13} \]
Let us remark that this integration by parts formula is not directly written in [5] but it is a straightforward consequence of Lemma 3.3 in [5]. Finally, the support of the solution increases following the law
\[ A_1 M^{1/5} t^{1/5} \le |\chi(u)|(t) \le |\chi(u_0)| + A_2 M^{1/5} t^{1/5} \tag{3.14} \]
for any $t > 0$, where $|\chi(u)|(t) = \chi_+(u)(t) - \chi_-(u)(t)$ and $\chi_\pm(t)$ are the supremum and the infimum of the support of $u(x, t)$ respectively (Theorems 7.1 and 7.2 in [5]). This finite speed of propagation property for strong solutions of (1.1) is the reason


behind the intermediate asymptotics. Let us remark that (3.14) is optimal for the unique strong (self-similar) source-type solution. Finally, we have also the following property on strong solutions of problem (1.1): the function
\[ t \longmapsto \int_{\mathbb{R}} \frac{x^2}{2}\, u(x, t)\,dx \]
is absolutely continuous in $[0, \infty)$ and verifies
\[ \frac{d}{dt} \int_{\mathbb{R}} \frac{x^2}{2}\, u\,dx = \frac{3}{2} \int_{\mathbb{R}} u_x^2\,dx. \tag{3.15} \]
The evolution of the second moment is directly derived from (3.9) by taking as test function $\psi(x, t) = \theta_1(t)\, x^2\, \theta_2(x)$, where $\theta_2(x) \in C_0^\infty(\mathbb{R})$ and $\theta_2(x) = 1$ inside the support of $u(x, t)$ for any $0 \le t \le T$.

Now, we translate all the properties and definitions of solutions for (1.1) to properties and definitions of solutions for the nonlinear Fokker–Planck type equation (1.6) through the change of variables (1.8). It is straightforward to change variables in weak formulations using suitable test functions. Thus, we have a weak solution to problem (1.6) with initial condition satisfying (H0), which is a function $v(x, t) \ge 0$ enjoying the following properties:
\[ v \in C^{1/2, 1/8}(\bar Q) \cap L^\infty(0, \infty; H^1(\mathbb{R})), \tag{3.16} \]
\[ v \in C^\infty(P) \text{ and } v^{1/2} v_{xxx} \in L^2(P), \tag{3.17} \]
\[ \iint_Q v \psi_t - \iint_Q x v \psi_x + \iint_P v v_{xxx} \psi_x = 0 \tag{3.18} \]
for all $\psi \in \mathrm{Lip}(\bar Q)$ with compact support inside $Q$,
\[ v(x, 0) = u_0(x),\ x \in \mathbb{R}, \text{ and } v_x(\cdot, t) \to u_{0x} \text{ strongly in } L^2(\mathbb{R}) \text{ as } t \to 0. \tag{3.19} \]
Let us note that the sets where $u$ and $v$ are positive coincide, and thus $P = Q \setminus (\{v = 0\} \cup \{t = 0\})$. Moreover, $v$ is a strong solution in the sense of Definition 3.2 and thus,
\[ v \in L^2([0, T], H^2(\mathbb{R})) \tag{3.20} \]
for any $T > 0$ with $v(\cdot, t) \in C^1(\mathbb{R})$ a.e. in $t > 0$. Furthermore, $v$ satisfies
\[ v^{1 - s/2} \in L^2([0, T], H^2(\mathbb{R})) \text{ for } 0 < s < \frac{1}{2}, \tag{3.21} \]
\[ (v^r)_x \in L^4(Q_T) \text{ for any } \frac{1}{2} - \frac{s}{4} \le r < 1,\ 0 < s < \frac{1}{2}, \tag{3.22} \]
for any $T > 0$ and $v$ satisfies Eq. (1.6) in the sense
\[ \iint_Q v \psi_t - \iint_Q x v \psi_x - \iint_Q v v_{xx} \psi_{xx} - \frac{1}{r} \iint_Q v^{1-r} (v^r)_x\, v_{xx} \psi_x = 0 \tag{3.23} \]


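for all $\psi \in C^\infty(\bar Q)$ with compact support inside $Q$ and $r$, $s$ as in (3.22)–(3.23). Weak formulation (3.9) can be translated analogously. Also, the solution $v$ verifies the following properties:

$$ \int_{\mathbb{R}} v(x,t)\,dx = \int_{\mathbb{R}} u_0(x)\,dx = M \quad\text{(conservation of mass)}. \tag{3.24} $$

Moreover, the function

$$ t \longmapsto \int_{\mathbb{R}} v^{1+\lambda}(x,t)\,dx $$

is absolutely continuous in $[0,\infty)$ for $-\tfrac12 < \lambda$ and $\lambda \notin \{0, 2, 3\}$, and verifies

$$ -\frac{1}{\lambda(\lambda+1)}\frac{d}{dt}\int_{\mathbb{R}} v^{1+\lambda}(x,t)\,dx + \frac{1}{\lambda+1}\int_{\mathbb{R}} v^{1+\lambda}(x,t)\,dx = \int_{\mathbb{R}\cap\{v>0\}} v^{\lambda} v_{xx}^2\,dx + \frac{\lambda(1-\lambda)}{3}\int_{\mathbb{R}\cap\{v>0\}} v^{\lambda-2} v_x^4\,dx. \tag{3.25} $$

Furthermore, we have the following integration-by-parts formula

$$ \int_{\mathbb{R}\cap\{v>0\}} v^{\lambda-1} v_{xx} v_x^2\,dx = \frac{1-\lambda}{3}\int_{\mathbb{R}\cap\{v>0\}} v^{\lambda-2} v_x^4\,dx \quad\text{a.e. } t > 0 \tag{3.26} $$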

and

$$ A_1 M^{1/5}\Big(\frac{e^{5t}-1}{5e^{5t}}\Big)^{1/5} \le |\chi(v)|(t) \le |\chi(u_0)|\,e^{-t} + A_2 M^{1/5}\Big(\frac{e^{5t}-1}{5e^{5t}}\Big)^{1/5} \tag{3.27} $$

for any $t > 0$. Let us remark that to obtain the last bound we have made use of (1.8), and that we again write $t$ for the new time variable for $v$, as done in the rest of the paper.

Remark 3.3. The finite speed of propagation property for strong solutions of problem (1.1) translates into the property that the support of $v$ always remains inside a suitable bounded interval $[-R, R]$. Therefore, the Cauchy problem for compactly supported initial data $u_0$ for the nonlinear Fokker–Planck equation (1.6) coincides with the no-flux initial-boundary value problem for (1.6) obtained by setting $v_x = v_{xxx} = 0$ at $x = \pm R$. In fact, given $u_0$ there exist $R > 0$ and $\omega_-, \omega_+ \in \mathbb{R}$, $-R < \omega_- < \omega_+ < R$, such that $v = 0$ in $[-R, \omega_-]$ and $[\omega_+, R]$.

Remark 3.4. The existence theory of the initial-boundary value problem for equations slightly different from (1.6), with no-flux boundary conditions, has been studied directly by a regularization procedure in [17]. The results there can be easily extended to cover (1.6).

Finally, we will also need the following property of strong solutions of problem (1.6): the function

$$ t \longmapsto \int_{\mathbb{R}} \frac{x^2}{2}\, v(x,t)\,dx $$

is absolutely continuous in $[0,\infty)$ and verifies

$$ \frac{d}{dt}\int_{\mathbb{R}} \frac{x^2}{2}\, v\,dx = -\int_{\mathbb{R}} x^2 v\,dx + \frac32\int_{\mathbb{R}} v_x^2\,dx. \tag{3.28} $$

This property is proved easily by taking a suitable test function in (3.23) and integrating by parts, which is completely rigorous in this case due to (3.20).
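The identity (3.28) can be sanity-checked numerically on a smooth, rapidly decaying profile. The sketch below assumes that (1.6) reads $v_t = (xv)_x - (v v_{xxx})_x$, which is what the weak formulation (3.18) suggests; the Gaussian test profile and the helper `integrate` are illustrative choices, not taken from the paper.

```python
import math

def integrate(f, a=-8.0, b=8.0, n=16000):
    """Composite trapezoid rule; very accurate for smooth, rapidly decaying f."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Hypothetical test profile v(x) = exp(-x^2) with its exact derivatives.
def v(x):
    return math.exp(-x * x)

def v1(x):
    return -2.0 * x * v(x)

def v3(x):
    return (12.0 * x - 8.0 * x**3) * v(x)

def v4(x):
    return (12.0 - 48.0 * x**2 + 16.0 * x**4) * v(x)

def rhs(x):
    """Assumed right-hand side of (1.6): (x v)_x - (v v_xxx)_x."""
    return v(x) + x * v1(x) - (v1(x) * v3(x) + v(x) * v4(x))

# Left side of (3.28): time derivative of the second moment, via v_t = rhs.
lhs_328 = integrate(lambda x: 0.5 * x * x * rhs(x))
# Right side of (3.28): -int x^2 v dx + (3/2) int v_x^2 dx.
rhs_328 = -integrate(lambda x: x * x * v(x)) + 1.5 * integrate(lambda x: v1(x) ** 2)
print(lhs_328, rhs_328)
```

Both numbers agree up to quadrature error, which is the integration-by-parts computation behind (3.28) carried out on one explicit profile.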

Long-Time Asymptotics for Strong Solutions of the Thin Film Equation


4. Entropy–Entropy Production Method

Our main goal here is to study the asymptotic behavior of strong solutions of (1.6) obtained in the previous section. Let us remark that, whenever the initial datum $u_0$ satisfies (H0), if $\mu$ denotes the first moment of $u_0$,

$$ \mu = \int_{\mathbb{R}} x\,u_0(x)\,dx, \tag{4.1} $$

one has $|\mu| < \infty$. Thus, without loss of generality, we shall study the asymptotic decay of (1.6) choosing as initial datum $\bar u_0(x) = u_0(x + \mu/M)$, which has the first moment equal to zero. The general case will follow by translation. This choice is connected with the optimal decay in relative entropy, as we will detail at the end of this section. As discussed in Sect. 2, we can easily compute a strong positive steady solution of Eq. (1.6), given by

$$ v_\infty(x) = \frac{1}{24}\big(C^2 - x^2\big)_+^2 \tag{4.2} $$

with a suitable $C = C(M)$ to fix the mass $M$. This solution corresponds (up to a time translation) to the unique strong source-type solution of problem (1.1) through the change of variables (1.8). Moreover, by symmetry, it has the first moment equal to zero. Let us recall that the entropy for (4.2) is given by

$$ H(w) = \int_{\mathbb{R}} \Big(\frac{x^2}{2}\,w + \sqrt{\frac{8}{3}}\;w^{3/2}\Big)\,dx. \tag{4.3} $$

The relative entropy between $w$ and $v_\infty$, where $w$ and $v_\infty$ have the same mass, is defined by the quantity

$$ H(w|v_\infty) = H(w) - H(v_\infty). \tag{4.4} $$
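As a small numerical illustration of (4.2)–(4.4), the sketch below fixes $C$ from the mass through $M = 2C^5/45$ (obtained by integrating (4.2) directly; this closed form is our own computation, not quoted from the paper) and evaluates the relative entropy of a Gaussian of the same mass. The Gaussian comparison profile and all function names are hypothetical choices made for the example.

```python
import math

def integrate(f, a, b, n=20000):
    """Composite trapezoid rule on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def support_radius(M):
    """C such that (C^2 - x^2)_+^2 / 24 has mass M, using M = 2 C^5 / 45."""
    return (22.5 * M) ** 0.2

def v_inf(x, C):
    """Steady state (4.2)."""
    s = C * C - x * x
    return s * s / 24.0 if s > 0.0 else 0.0

def entropy(w, a, b):
    """H(w) = int (x^2/2) w + sqrt(8/3) w^(3/2) dx, as in (4.3)."""
    k = math.sqrt(8.0 / 3.0)
    return integrate(lambda x: 0.5 * x * x * w(x) + k * w(x) ** 1.5, a, b)

M = 1.0
C = support_radius(M)
mass = integrate(lambda x: v_inf(x, C), -C, C)

def gaussian(x):
    """Comparison profile with the same mass M and zero first moment."""
    return M * math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

H_inf = entropy(lambda x: v_inf(x, C), -C, C)
H_gauss = entropy(gaussian, -10.0, 10.0)
relative_entropy = H_gauss - H_inf   # H(w | v_inf) in the sense of (4.4)
print(C, mass, relative_entropy)
```

The computed relative entropy is positive, in line with the statement that $H(w|v_\infty) \ge 0$ for profiles of equal mass.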

By (2.11), $H(w|v_\infty) \ge 0$ with equality if and only if $w = v_\infty$. Taking into account the results of the previous section, mainly (3.16)–(3.27), we can rigorously compute the derivative of the relative entropy. Using properties (3.25) for $\lambda = 1/2$ and (3.28), we conclude that the function $t \mapsto H(v|v_\infty)$ is absolutely continuous in $[0,\infty)$ and verifies

$$ \frac{d}{dt} H(v|v_\infty) = -D(v), \tag{4.5} $$

where the entropy dissipation $D(v)$ is given by

$$ D(v) = \int_{\mathbb{R}} x^2 v\,dx - \frac32\int_{\mathbb{R}} v_x^2\,dx - \sqrt{\frac23}\int_{\mathbb{R}} v^{3/2}\,dx + \sqrt{\frac32}\int_{\mathbb{R}\cap\{v>0\}} v^{1/2} v_{xx}^2\,dx + \frac18\sqrt{\frac23}\int_{\mathbb{R}\cap\{v>0\}} v^{-3/2} v_x^4\,dx. \tag{4.6} $$


Let us define the following functional over the solution $v$:

$$ \tilde D(v) = \int_{\mathbb{R}\cap\{v>0\}} v\,\Big[\Big(\frac{x^2}{2} + \sqrt6\,v^{1/2}\Big)_x\Big]^2 dx + \frac{2}{\sqrt6}\int_{\mathbb{R}\cap\{v>0\}} v^{3/2}\Big[\Big(\frac{x^2}{2} + \sqrt6\,v^{1/2}\Big)_{xx}\Big]^2 dx, \tag{4.7} $$

which makes sense due to (3.20) and (3.25). It is an exercise now to check that in fact $\tilde D(v) = D(v)$. This is how the completing of the square in (2.6) and (2.12) of Sect. 2 becomes rigorous. Thanks to (3.20), we expand the squares and take the derivatives up to second order. Comparing terms, we notice that there are three integration-by-parts formulas to be proved. The first one we need is:

$$ \int_{\mathbb{R}} v\,v_{xx}\,dx = -\int_{\mathbb{R}} v_x^2\,dx \quad\text{a.e. } t > 0, \tag{4.8} $$

which is obvious from (3.20). The second one is:

$$ \int_{\mathbb{R}} x\,v^{1/2} v_x\,dx = -\frac23\int_{\mathbb{R}} v^{3/2}\,dx \quad\text{a.e. } t > 0, \tag{4.9} $$

which is also true by (3.16)-(3.23). Finally, the third one is:

$$ \int_{\mathbb{R}\cap\{v>0\}} v^{-1/2} v_{xx} v_x^2\,dx = \frac16\int_{\mathbb{R}\cap\{v>0\}} v^{-3/2} v_x^4\,dx \quad\text{a.e. } t > 0, \tag{4.10} $$

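which corresponds to (3.26) for $\lambda = 1/2$. Therefore, it follows rigorously that

$$ D(v) = \int_{\mathbb{R}\cap\{v>0\}} v\,\Big[\Big(\frac{x^2}{2} + \sqrt6\,v^{1/2}\Big)_x\Big]^2 dx + \frac{2}{\sqrt6}\int_{\mathbb{R}\cap\{v>0\}} v^{3/2}\Big[\Big(\frac{x^2}{2} + \sqrt6\,v^{1/2}\Big)_{xx}\Big]^2 dx, \tag{4.11} $$

which is of course greater than or equal to

$$ D_p(v) = \int_{\mathbb{R}\cap\{v>0\}} v\,\Big[\Big(\frac{x^2}{2} + \sqrt6\,v^{1/2}\Big)_x\Big]^2 dx. \tag{4.12} $$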
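The equality between the expanded dissipation (4.6) and the completed-square form (4.11), together with the bound $D \ge D_p$, can be checked numerically on a single smooth profile. The sketch below is only an illustration: the Gaussian test function (for which all three integrations by parts (4.8)–(4.10) are legitimate) and the helper names are our own assumptions.

```python
import math

SQRT6 = math.sqrt(6.0)

def integrate(f, a=-8.0, b=8.0, n=16000):
    """Composite trapezoid rule; very accurate for smooth, rapidly decaying f."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Hypothetical test profile v(x) = exp(-x^2) with exact derivatives.
def v(x):
    return math.exp(-x * x)

def vx(x):
    return -2.0 * x * v(x)

def vxx(x):
    return (4.0 * x * x - 2.0) * v(x)

def g_x(x):
    """First derivative of x^2/2 + sqrt(6) v^(1/2)."""
    return x + SQRT6 * vx(x) / (2.0 * math.sqrt(v(x)))

def g_xx(x):
    """Second derivative of x^2/2 + sqrt(6) v^(1/2)."""
    half = vxx(x) / (2.0 * math.sqrt(v(x)))
    quarter = vx(x) ** 2 / (4.0 * v(x) ** 1.5)
    return 1.0 + SQRT6 * (half - quarter)

# Expanded form (4.6).
D = (integrate(lambda x: x * x * v(x))
     - 1.5 * integrate(lambda x: vx(x) ** 2)
     - math.sqrt(2.0 / 3.0) * integrate(lambda x: v(x) ** 1.5)
     + math.sqrt(1.5) * integrate(lambda x: math.sqrt(v(x)) * vxx(x) ** 2)
     + math.sqrt(2.0 / 3.0) / 8.0 * integrate(lambda x: v(x) ** -1.5 * vx(x) ** 4))

# Completed-square forms (4.12) and (4.11).
D_p = integrate(lambda x: v(x) * g_x(x) ** 2)
D_tilde = D_p + 2.0 / SQRT6 * integrate(lambda x: v(x) ** 1.5 * g_xx(x) ** 2)
print(D, D_tilde, D_p)
```

For this profile the two expressions for the dissipation coincide to quadrature accuracy, and the porous-medium part $D_p$ accounts for only a fraction of $D$; the remaining nonnegative term is what is discarded in passing to the bound that follows.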

Let us note that Dp (v) is exactly the entropy dissipation of the porous medium equation (2.1) Thus, we proved the entropy dissipation bound d (4.13) H (v|v∞ ) ≤ −Dp (v). dt To this point, we recall the generalized logarithmic Sobolev inequality proved in [13, 14, 16] in the one dimensional case and for m = 3/2. Theorem 4.1. Let w(x) ≥ 0 belong to L1 (IR) with mass M such that the distributional derivative of w is square integrable. Then w 3/2 ∈ L1 (IR) and 3 1 (4.14) w 3/2 dx ≤ wx2 dx + B(M), 8 3 IR IR

x2 3/2 v∞ + 2v∞ dx. 2 IR Moreover there is equality in (4.14) if and only if w is a multiple and translate of w = v∞ . where

B(M) =


The previous generalized logarithmic Sobolev inequality implies

$$ H(w|v_\infty) \le \frac12\,D_p(w). \tag{4.15} $$

Substituting this inequality into (4.13) we finally deduce

$$ \frac{d}{dt} H(v(t)|v_\infty) \le -2\,H(v(t)|v_\infty), $$

which furnishes

$$ H(v|v_\infty) \le e^{-2t}\,H(\bar u_0|v_\infty) \tag{4.16} $$

for any $t \ge 0$. Equation (4.16) gives the exponential decay to the steady state in relative entropy. The next step consists in obtaining the exponential decay towards $v_\infty$ in $L^1(\mathbb{R})$. This is by now a classical argument, which uses Csiszár–Kullback type inequalities. To this aim, we quote the following theorem [14, 16, 26].

Theorem 4.2. Let $w(x) \ge 0$ belong to $L^1(\mathbb{R})$ with mass $M$. Then, the following Csiszár–Kullback type inequality holds

$$ \|w - v_\infty\|_{L^1(\mathbb{R})} \le \Big(\frac{16}{3}\int_{\mathbb{R}} v_\infty^{1/2}\,dx\Big)^{1/2}\sqrt{H(w|v_\infty)}. \tag{4.17} $$

Combining (4.17) with (4.16) we get

$$ \|v - v_\infty\|_{L^1(\mathbb{R})} \le e^{-t}\Big(\frac{16}{3}\int_{\mathbb{R}} v_\infty^{1/2}\,dx\Big)^{1/2}\sqrt{H(\bar u_0|v_\infty)} \tag{4.18} $$

for any $t \ge 0$. Now, the result for the thin film equation (1.1) follows by coming back to the original equation through the change of variables (1.8). So, we proved our main result.

Theorem 4.3. We are given a non-negative, compactly supported initial condition $u_0 \in H^1(\mathbb{R})$ with mass $M > 0$ and first moment $\mu$. Let $u(t,x)$ be a corresponding strong solution to the Cauchy problem (1.1). Then, if $U(t,x)$ is the unique strong source-type (self-similar) solution of (1.1) with mass $M$ ([28,5,7]), up to a time translation to set $U(0,x) = v_\infty$, there exists a constant $A > 0$ depending only on $v_\infty$ and $M$ such that

$$ \|u(x,t) - U(x - \mu/M, t)\|_{L^1(\mathbb{R})} \le A\,\sqrt{H(\bar u_0|v_\infty)}\;(5t+1)^{-1/5} $$

for any $t \ge 0$. The constant $A$ is given by

$$ A = A(M, v_\infty) = \Big(\frac{16}{3}\int_{\mathbb{R}} v_\infty^{1/2}\,dx\Big)^{1/2}. \tag{4.19} $$
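To see how the exponential rate $e^{-t}$ of (4.18) turns into the algebraic rate of Theorem 4.3, the following sketch evaluates the constant (4.19) for the steady state (4.2) and the resulting bound. Two things are assumptions made here and not quoted from the paper: the time change $t_{\text{orig}} = (e^{5s}-1)/5$, which is what the form of (3.27) suggests for (1.8), and the closed forms $C = (45M/2)^{1/5}$ and $\int v_\infty^{1/2}\,dx = 4C^3/(3\sqrt{24})$ obtained by integrating (4.2).

```python
import math

def support_radius(M):
    """C with mass((C^2 - x^2)_+^2 / 24) = M; assumes M = 2 C^5 / 45."""
    return (22.5 * M) ** 0.2

def constant_A(M):
    """A from (4.19), using int v_inf^(1/2) dx = 4 C^3 / (3 sqrt(24))."""
    C = support_radius(M)
    int_sqrt_v_inf = 4.0 * C**3 / (3.0 * math.sqrt(24.0))
    return math.sqrt(16.0 / 3.0 * int_sqrt_v_inf)

def l1_bound(t, M, H0):
    """Right-hand side of the L^1 estimate of Theorem 4.3 at original time t."""
    return constant_A(M) * math.sqrt(H0) * (5.0 * t + 1.0) ** (-0.2)

def original_time(s):
    """Assumed time change behind (1.8): rescaled time s -> original time t."""
    return (math.exp(5.0 * s) - 1.0) / 5.0

# Under the assumed time change, exp(-s) and (5 t + 1)^(-1/5) coincide.
s = 0.7
rate_rescaled = math.exp(-s)
rate_original = (5.0 * original_time(s) + 1.0) ** (-0.2)
print(rate_rescaled, rate_original, l1_bound(10.0, 1.0, 0.1))
```

Under these assumptions the two rates agree identically, which is why the factor $e^{-t}$ in (4.18) reappears as $(5t+1)^{-1/5}$ in the original variables.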

The decay rate in Theorem 4.3 is optimal since the difference between $v(x,t)$ and $v(x - x_0, t)$ decays exactly as $t^{-1/5}$, cf. [8].


Remark 4.4. Theorem 4.3 shows that a solution to the initial value problem for (1.1), with given mass $M$ and first moment $\mu$, converges to a self-similar solution of mass $M$ that is symmetric about $\bar x = \mu/M$ and was a delta function at time $t = -1/5$, with an explicit rate of convergence. Since the support is spreading in time and the height is decaying in time, the initial data clearly does not select a particular self-similar solution as its long-time limit, only specified by $\bar x = \mu/M$ or $t = -1/5$ in the rate of decay. The exact meaning of Theorem 4.3 lies in the fact that the decay towards equilibrium in relative entropy of the solution to the nonlinear Fokker–Planck type equation (1.6), corresponding to an initial condition of the form $u_0(x + c)$, where $c \in \mathbb{R}$, is optimal for $c = \mu/M$, where $M$ and $\mu$ are respectively the mass and the first moment of $u_0(x)$. In fact, a direct computation shows that

$$ H(u_0(x + c)) = H(\bar u_0(x + c - \mu/M)) = H(\bar u_0(x)) + \Big(c - \frac{\mu}{M}\Big)^2 M. \tag{4.20} $$

This equality in particular shows that, for initial data as in Theorem 4.3, translations in space are controlled by

$$ \|u(x,t) - U(x - c, t)\|_{L^1(\mathbb{R})} \le A\,\Big[H(\bar u_0|v_\infty) + \Big(c - \frac{\mu}{M}\Big)^2 M\Big]^{1/2}(5t+1)^{-1/5}. $$

Likewise, direct computations give a control of translations in time. To this aim, consider that, for any time $\tau > 0$,

α(t) ˆ α(t) ˆ |U (x, t) − U (x, t + τ )| dx = v∞ (x, t) − v∞ x dx α(t ˆ + τ) α(t ˆ + τ) IR IR

≤ 1−

α(t) ˆ α(t) ˆ α(t) ˆ v∞ dx, M+ x − v (x, t) ∞ α(t ˆ + τ) α(t ˆ + τ ) IR α(t ˆ + τ)

ˆ α(t ˆ + τ ) < 1. Recalling that a 5 = 15 where α(t) ˆ = (5t + 1)1/5 . Let us set ρ = α(t)/ 16 M, the last integral can be easily evaluated to give the inequality

a a/ρ ρ |v∞ (ρx) − v∞ (x)| dx = 2ρ v∞ (ρx) dx (v∞ (ρx) − v∞ (x)) dx + a

0

IR

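$$ \le \frac74\,M(1-\rho)^2. $$

Finally, since

$$ 1 - \rho \le \frac{\tau}{(5t+1)^{4/5}}, $$

we get

$$ \int_{\mathbb{R}} |U(x,t) - U(x,t+\tau)|\,dx \le M\,\frac{\tau}{(5t+1)^{4/5}} + \frac74\,M\Big(\frac{\tau}{(5t+1)^{4/5}}\Big)^2, \tag{4.21} $$

which implies, for $\tau > 0$, the following formula for time translations:

$$ \|u(x,t) - U(x,t+\tau)\|_{L^1(\mathbb{R})} \le A\,\sqrt{H(\bar u_0|v_\infty)}\,\frac{1}{(5t+1)^{1/5}} + M\,\frac{\tau}{(5t+1)^{4/5}} + \frac74\,M\Big(\frac{\tau}{(5t+1)^{4/5}}\Big)^2. \tag{4.22} $$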


Theorem 4.3 implies, together with the dissipation of surface-tension energy (3.11), the asymptotic decay of the solution in L∞ (IR)-norm. To this aim, the following interpolation inequality will be useful [23]. Lemma 4.5. Let f ∈ L1 (IR) ∩ H 1 (IR). Then, the following inequality: f L∞ (IR) ≤

3

1

f L3 1 (IR) 2

1 fx2 dx

2π 3

3

(4.23)

IR

holds. Using Lemma 4.5, we get Theorem 4.6. We are given a non negative, compactly supported initial condition u0 ∈ H 1 (IR) with mass M > 0 and first moment µ. Let u(t, x) be a corresponding strong solution to the Cauchy problem (1.1). Then, if U (t, x) is the unique strong source-type (self-similar) solution of (1.1) with mass M ([28,5,7]) up to a time translation to set U (0, x) = v∞ , there exists a constant B > 0 depending only on v∞ , M and uo H 1 (IR) such that u(x, t) − U (x − µ/M, t)L∞ (IR) ≤ BH (u¯ 0 |v∞ )1/6 (5t + 1)−1/15 for any t ≥ 0. The constant B is given by 1/6 3A1/3 2 2 2 v∞,x dx + 2 u0,x dx . B= 2 IR IR 2π 3

(4.24)

(4.25)

Remark 4.7. It is straightforward by interpolation to obtain from the previous Theorems 4.3 and 4.6 a decay estimate in the $L^p$ norm with rate

$$ \Big(\frac{1}{5t+1}\Big)^{\frac{1}{15}\left(1+\frac{2}{p}\right)}, $$

for all $1 \le p \le \infty$. Likewise, one can obtain control of space and time translations even in these cases.

5. Cauchy Problem for Non-Compactly Supported Initial Data

In this section we show that Theorems 4.3 and 4.6 can be extended to non-compactly supported initial data with finite entropy. As a first step, we extend the existence theory of strong solutions to non-compactly supported initial data. The procedure is close to that introduced in [10], and makes use of similar arguments. In the remainder of this section we will assume the nonnegative initial data $u_0 \in L^1(\mathbb{R}) \cap H^1(\mathbb{R})$ with mass $M > 0$, $x^2 u_0 \in L^1(\mathbb{R})$ and $u_0 \log u_0 \in L^1(\mathbb{R})$. We will denote by (H1) this set of hypotheses on the initial data. Let us also assume, without loss of generality, that the first moment $\mu$ vanishes.

Let us consider a sequence of positive functions $\chi_n(x) \in C_0^\infty(\mathbb{R})$ such that $0 \le \chi_n \le 1$ with $\chi_n = 1$ on the interval $[-n, n]$ and $\chi_n = 0$ outside the interval $[-(n+1), n+1]$. We consider the approximated initial data $u_{0n}(x) = u_0(x)\chi_n(x)$. For any $n \ge n_0$, $u_{0n}$ verifies all the set of hypotheses (H0) given at the beginning of Sect. 3. Its mass will be denoted by $M_n$. Let us remark that, for $n$ sufficiently large, $M_n > 0$. Moreover, we


have that $u_{0n} \to u_0$ as $n \to \infty$ in $L^1(\mathbb{R}) \cap H^1(\mathbb{R})$ while the second moments $x^2 u_{0n}$ are uniformly bounded and converge towards $x^2 u_0$ in $L^1(\mathbb{R})$. The sequence of solutions $u_n(t,x)$ verifies all the properties and results obtained in the previous two sections. Moreover, if we denote by $M_n$ the mass of the approximated sequence and if $U_n$ and $U$ denote the unique strong source-type (self-similar) solutions of (1.1) with mass $M_n$ and $M$ respectively, then $M_n \to M$ and $U_n \to U$ as $n \to \infty$.

Taking into account the dissipation of surface-tension energy (3.11), the conservation of mass and the Nash inequality [25] (there exists a constant $\Pi > 0$ such that for all $w \in L^1(\mathbb{R}) \cap H^1(\mathbb{R})$, $\|w\|_{L^2}^3 \le \Pi\,\|w\|_{L^1}^2\,\|\nabla w\|_{L^2}$), we deduce that $u_n$ is a bounded sequence in $L^1(Q) \cap L^2([0,T], H^1(\mathbb{R}))$ independent of $t$ and $n$. Let us prove that we can get a uniform control on the $L^2([0,T], H^2(\mathbb{R}))$ norm. In order to do this we use the entropy $G_o(u) = u\log u - u$; as in [6, 10, 11], we have

$$ \int_{Q_T} (u_n)_{xx}^2(x,t)\,dx \le \int_{\mathbb{R}} G_o(u_n)\,dx - G_o(M_n). \tag{5.1} $$

This is not directly written in these references, but choosing a suitable approximation of the initial data, it follows directly, for instance, from estimates around Eq. (43) in [10]. Using hypotheses (H1) we have that the right-hand side is uniformly bounded in $n$. Therefore, $u_n$ is a bounded sequence in $L^2([0,T], H^2(\mathbb{R}))$ independent of $n$ for any $T > 0$. Hence, up to a subsequence (which we denote with the same index), we can take a weak limit in $L^2([0,T], H^2(\mathbb{R}))$ towards a function $u$, and the convergence is strong in the spaces $L^2([0,T], H^1(\mathbb{R}))$ for any $T > 0$. Furthermore, since we have a uniform control on the second order moment by (3.15), i.e.,

$$ \frac{d}{dt}\int_{\mathbb{R}} \frac{x^2}{2}\,u_n\,dx = \frac32\int_{\mathbb{R}} (u_n)_x^2\,dx, \tag{5.2} $$

we deduce the strong convergence of $u_n$ towards $u$ in $L^1(Q_T)$. Due to Sobolev inequalities, the Ascoli–Arzelà theorem and arguments similar to those in [6] (Sect. 2), we deduce that we have convergence in $C^{1/2,1/8}(K \times [0,T])$ for any compact interval $K$ and $T > 0$, and therefore, the initial data $u_0$ is taken by $u$ as a continuous function and in $L^2(\mathbb{R})$ using (5.2). Moreover, we also have convergence of $u_n \to u$ in $C([0,T], L^p(K))$ for any $1 \le p \le \infty$, any $T > 0$ and any compact interval $K$. Again, using (5.2) one can prove convergence in $C([0,T], L^p(\mathbb{R}))$ for any $1 \le p < \infty$.

Regarding the convergence of $u_x(t,\cdot)$ as $t \to 0$ in $L^2(\mathbb{R})$, we use the strong convergence of $u(t,\cdot) \to u_0$ in $L^2(\mathbb{R})$ to deduce $u_x(t,\cdot) \to u_{0x}$ weakly in $L^2(\mathbb{R})$. Using now (5.2) we have that

$$ \limsup_{t\to0}\int_{\mathbb{R}} u_x^2(x,t)\,dx \le \int_{\mathbb{R}} u_{0x}^2\,dx $$

and therefore, since $L^2(\mathbb{R})$ is a uniformly convex space, we deduce that $u_x(t,\cdot) \to u_{0x}$ as $t \to 0$ strongly in $L^2(\mathbb{R})$.


We shall now prove that we can pass to the limit in the weak formulation (3.8) of Eq. (1.1), that is,

$$ \iint_Q u_n\,\psi_t - \iint_Q u_n (u_n)_{xx}\,\psi_{xx} - \iint_Q (u_n)^{1-r}\Big(\frac{(u_n)^r}{r}\Big)_x (u_n)_{xx}\,\psi_x = 0 \tag{5.3} $$

for all $\psi \in C^\infty(\bar Q)$ with compact support inside $Q$, $0 < s < 1/2$ and $1/2 - s/4 \le r < 1$. Since $u_n$ converges strongly in $L^2(Q_T)$ and $(u_n)_{xx}$ converges weakly in $L^2(Q_T)$ for any $T > 0$, we can take limits in the first two terms as $n \to \infty$. In order to pass to the limit in the last term we will use the proof given in [10], to which we refer for the details. Let us sketch it. Let us use the additional entropies (3.12). If $-\tfrac12 < \lambda = -s < 0$ we have

$$ \frac{1}{s(1-s)}\frac{d}{dt}\int_{\mathbb{R}} (u_n)^{1-s}(x,t)\,dx = \int_{\mathbb{R}\cap\{u_n>0\}} (u_n)^{-s}(u_n)_{xx}^2\,dx - \frac{s(1+s)}{3}\int_{\mathbb{R}\cap\{u_n>0\}} (u_n)^{-s-2}(u_n)_x^4\,dx. \tag{5.4} $$

First, using that $u_n$ is uniformly bounded with respect to $n$ in $L^1(Q_T)$ and the uniform bound in $L^\infty([0,T], H^1(\mathbb{R}))$, we have that $u_n$ is uniformly bounded with respect to $n$ in $L^\infty([0,T], L^p(\mathbb{R}))$ for any $1 \le p \le \infty$. Now using (5.4), we can produce a uniform bound of $[(u_n)^{1-s/2}]_{xx}$ in $L^2(Q_T)$ and of $[(u_n)^{1/2-s/4}]_x$ in $L^4(Q_T)$ by mimicking the procedure in [10, Sect. 4, p. 14–16]. Finally, using Lemma 6 in [10] we deduce directly the convergence of the last nonlinear term in (5.3). Concluding, we proved the existence of a strong solution to the Cauchy problem for Eq. (1.1) with initial data $u_0$ satisfying (H1).

Now, it is easy to apply Theorem 4.3 to the approximated problems to obtain

$$ \|u_n(x,t) - U_n(x,t)\|_{L^1(\mathbb{R})} \le A\,\sqrt{H(u_{0n}|v_{\infty n})}\;(5t+1)^{-1/5}, $$

and to take the limit $n \to \infty$ to prove that Theorems 4.3 and 4.6 apply to the strong solutions corresponding to initial data satisfying hypotheses (H1). We proved

Theorem 5.1. We are given a non-negative initial condition $u_0 \in L^1(\mathbb{R}) \cap H^1(\mathbb{R})$ with mass $M > 0$, $x^2 u_0 \in L^1(\mathbb{R})$ and $u_0 \log u_0 \in L^1(\mathbb{R})$. Let $\mu$ be the first moment of $u_0$. Then, there exists a strong solution $u(t,x)$ to the Cauchy problem (1.1). Moreover, if $U(t,x)$ is the unique strong source-type (self-similar) solution of (1.1) with mass $M$ ([28,5,7]), up to a time translation to set $U(0,x) = v_\infty$, there exist constants $A, B > 0$ such that

$$ \|u(x,t) - U(x - \mu/M, t)\|_{L^1(\mathbb{R})} \le A\,\sqrt{H(\bar u_0|v_\infty)}\;(5t+1)^{-1/5} $$

and

$$ \|u(x,t) - U(x - \mu/M, t)\|_{L^\infty(\mathbb{R})} \le B\,H(\bar u_0|v_\infty)^{1/6}\,(5t+1)^{-1/15} $$

for any $t \ge 0$. The constants $A$ and $B$ are given in Theorems 4.3 and 4.6 respectively.


6. Final Remarks

Equation (2.6) admits less regular steady states with positive slope $\gamma$ at the edge of the support. Equation (1.1) with a prescribed contact angle has been recently studied by Otto [27], who did not consider the asymptotic behavior of the solution. It would certainly be interesting to study whether these less regular steady states are the asymptotic limit of solutions (with the same prescribed slope $\gamma$) to the rescaled thin film equation in the situation studied by Otto. In more detail, any steady solution of the rescaled thin film equation, of given mass $M$, with a prescribed positive slope $\gamma$, has the form

$$ v_{\gamma,\infty}(x) = \frac{1}{24}\big(C_1^2 - x^2\big)_+^2 + \frac{\gamma}{2C_1}\big(C_1^2 - x^2\big)_+ \tag{6.1} $$

with $C_1 = C_1(M)$. The $C^1$-steady state we considered in this paper is included in (6.1), and corresponds to the choice $\gamma = 0$. Whenever $\gamma \ne 0$, $v_{\gamma,\infty}(x)$ is no longer a steady state of the rescaled porous medium equation (1.10), the representation (2.6) is useless, and the entropy method of course fails. Any steady state of the form (6.1) admits a natural convex entropy (see [29])

$$ H(f) = \int_{\mathbb{R}} \Big(\frac{x^2}{2}\,f + \Psi(f)\Big)\,dx, \tag{6.2} $$

where

$$ \Psi(f) = \int_0^f \Big[\sqrt{6y + \Big(\frac{3\gamma}{C_1}\Big)^2} - \frac{3\gamma}{C_1}\Big]\,dy. \tag{6.3} $$
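A quick numerical check of (6.1) and (6.3) may help here. The sketch below verifies, for hypothetical sample values of $C_1$ and $\gamma$, that the profile (6.1) satisfies $v_{xxx} = x$ inside its support, has slope $-\gamma$ at the right edge, and makes $x^2/2 + \Psi'(v)$ constant on the support (the Euler–Lagrange condition for (6.2) under a mass constraint). The finite-difference helper and the chosen parameters are illustrative assumptions.

```python
import math

def v_gamma(x, C1, gamma):
    """Steady state (6.1) with slope gamma at the edge of the support."""
    s = C1 * C1 - x * x
    if s <= 0.0:
        return 0.0
    return s * s / 24.0 + gamma * s / (2.0 * C1)

def psi_prime(y, C1, gamma):
    """Integrand of (6.3): sqrt(6 y + (3 gamma / C1)^2) - 3 gamma / C1."""
    b = 3.0 * gamma / C1
    return math.sqrt(6.0 * y + b * b) - b

def third_derivative(f, x, h=1e-2):
    """Central difference for f'''; exact (up to rounding) for quartics."""
    return (f(x + 2 * h) - 2 * f(x + h) + 2 * f(x - h) - f(x - 2 * h)) / (2 * h**3)

C1, gamma = 1.5, 0.4          # hypothetical sample parameters
f = lambda x: v_gamma(x, C1, gamma)
points = [-1.0, -0.3, 0.0, 0.6, 1.2]

# v_xxx = x inside the support (steady thin-film balance v v_xxx = x v).
err_ode = max(abs(third_derivative(f, x) - x) for x in points)

# x^2/2 + Psi'(v) should equal C1^2 / 2 everywhere on the support.
err_el = max(abs(0.5 * x * x + psi_prime(f(x), C1, gamma) - 0.5 * C1 * C1)
             for x in points)

# One-sided slope at the right edge x = C1 should be close to -gamma.
h = 1e-6
edge_slope = (f(C1) - f(C1 - h)) / h
print(err_ode, err_el, edge_slope)
```

For $\gamma = 0$ the integrand of (6.3) reduces to $\sqrt{6y}$, so $\Psi(f) = \sqrt{8/3}\,f^{3/2}$ and (6.2) coincides with the entropy (4.3).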

By Theorem 2.1 of [29], for all nonnegative functions $f$ of given mass $M$,

$$ \Psi(f) \ge \Psi(v_{\gamma,\infty}), \tag{6.4} $$

with equality if and only if $f = v_{\gamma,\infty}$. Unfortunately, at present we do not know if the entropy (6.2) is monotone decreasing along the solution to Eq. (1.6). At least for strong solutions, Eq. (1.6) has another Lyapunov functional (entropy) which is monotonically non-increasing in time,

$$ H(f, f_x) = \int_{\mathbb{R}} \big(x^2 f + (f_x)^2\big)\,dx. \tag{6.5} $$

In fact, surface-tension energy dissipation (3.11) gives, for any $T > 0$,

$$ H(u, u_x)(T) + 2\int_0^T\!\!\int_{\mathbb{R}} u\,\Big[\Big(u_{xx} - \frac{x^2}{2}\Big)_x\Big]^2 dx\,dt = H(u, u_x)(0). \tag{6.6} $$

We remark that both the entropy (6.5) and the entropy production

$$ D(f) = \int_{\mathbb{R}} f\,\Big[\Big(f_{xx} - \frac{x^2}{2}\Big)_x\Big]^2 dx, \tag{6.7} $$

do not distinguish among the steady states (6.1). We believe that the study of the time evolution of this Lyapunov functional would be of great importance to understand the


asymptotic behavior of solutions with a prescribed contact angle, but presently we are not in a position to do it. On the other hand, within the same strategy we can treat in a formal way any equation of the form

$$ \frac{\partial v}{\partial t} = -C_1\big(\Psi(v)\,(V(x) + h(v))_{xx}\big)_{xx} + \big(v\,(V(x) + h(v))_x\big)_x, \tag{6.8} $$

where $C_1 > 0$, $\Psi \ge 0$ and $V$ and $h$ verify the hypotheses needed in [14] to work on the general nonlinear diffusion equation

$$ \frac{\partial v}{\partial t} = \big(v\,(V(x) + h(v))_x\big)_x, \qquad (x \in \mathbb{R},\ t > 0). \tag{6.9} $$

A class of fourth order diffusion equations that can be written in the form (6.8) by taking $V(x) = x^2/2$ is the following:

$$ \frac{\partial v}{\partial t} = -\big(\Psi(v)\,(h(v))_{xx}\big)_{xx} + (xv)_x, \tag{6.10} $$

where $\Psi$ is increasing from $\Psi(0) = 0$, while $h$ is conjugate to $\Psi$, in the sense that

$$ h'(v) = \frac{\Psi'(v)}{v}. \tag{6.11} $$

Unfortunately, we cannot rigorously analyse the asymptotic behavior, since the existence theory for this class of equations is at present not well developed. A particular case of (6.10) is relevant in semiconductor device modeling, and corresponds to the choice $\Psi(v) = v$. In this case we obtain

$$ \frac{\partial v}{\partial t} = -\big(v\,(\log v)_{xx}\big)_{xx} + (xv)_x \tag{6.12} $$

that corresponds through a suitable change of variables of the type (1.8) to the spin equation

$$ \frac{\partial u}{\partial t} = -\big(u\,(\log u)_{xx}\big)_{xx} \tag{6.13} $$

introduced by Derrida, Lebowitz, Speer and Spohn in [15]. Bleher, Lebowitz and Speer in [12] and subsequently Jüngel and Pinnau in [22] studied Eq. (6.13) in the bounded domain $(0,1)$ subject to boundary conditions $u(0) = u(1) = 1$, and $u_x(0) = u_x(1) = 0$. In fact, (6.12) admits the Gaussian as a steady state which is the minimum of the physical entropy

$$ H(f) = \int_{\mathbb{R}} \Big(\frac{x^2}{2}\,f + f\log f\Big)\,dx. $$

The formal computation of the evolution of the relative entropy and the logarithmic Sobolev inequality imply the exponential convergence in relative entropy towards the Gaussian for solutions of (6.12). This result would imply an algebraic decay in $L^1$-norm towards a modified heat kernel for solutions of (6.13). The rigorous study of the Cauchy problem for Eq. (6.13) is under consideration.
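The claim that (6.12) admits the Gaussian as a steady state can be illustrated with a short finite-difference computation. The sketch below evaluates the right-hand side $-(v(\log v)_{xx})_{xx} + (xv)_x$ for the unit-variance profile $e^{-x^2/2}$ and, for contrast, for a wider Gaussian that is not stationary; the step size, sample points and helper names are our own illustrative choices.

```python
import math

def d1(g, x, h):
    """Central first difference."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

def d2(g, x, h):
    """Central second difference."""
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)

def rhs_612(v, x, h=1e-2):
    """Finite-difference value of -(v (log v)_xx)_xx + (x v)_x at the point x."""
    inner = lambda y: v(y) * d2(lambda z: math.log(v(z)), y, h)
    drift = lambda y: y * v(y)
    return -d2(inner, x, h) + d1(drift, x, h)

gauss = lambda x: math.exp(-0.5 * x * x)    # candidate steady state
wide = lambda x: math.exp(-0.25 * x * x)    # not a steady state

points = [-2.0, -1.0, 0.0, 0.5, 1.5]
residual_gauss = max(abs(rhs_612(gauss, x)) for x in points)
residual_wide = abs(rhs_612(wide, 0.0))
print(residual_gauss, residual_wide)
```

The residual for the unit-variance Gaussian is at the level of the discretization error, while the wider profile gives a residual close to $3/4$ at the origin, so only the former balances the fourth-order term against the drift.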


Acknowledgement. This work has been performed and financially supported within the activities both from the TMR project “Asymptotic Methods in Kinetic Theory”, No. ERBFRMXCT 970157, funded by the EC., from the Italian MURST, project “Mathematical Problems in Kinetic Theories”, from the Spanish–Italian Acciones Integradas and from the Spanish DGES projects PB98-1281 and PB98-1294. The authors would like to express their sincere gratitude to Mary Pugh for helpful references and fruitful comments. The suggestions of the anonymous referees, which led to a marked improvement in the structure of the paper, are gratefully acknowledged.

References

1. Arnold, A., Markowich, P., Toscani, G., Unterreiter, A.: On logarithmic Sobolev inequalities, Csiszár–Kullback inequalities and the rate of convergence to equilibrium for Fokker-Planck type equations. To appear in Comm. P.D.E.
2. Aronson, D.G.: The porous media equation. In: Nonlinear Diffusion Problems, edited by A. Fasano and M. Primicerio, Montecatini 1985, Lecture Notes in Mathematics 1224, Berlin: Springer, 1986, pp. 1–46
3. Barenblatt, G.I.: Similarity, Self-Similarity and Intermediate Asymptotics. New York–London: Plenum, 1979
4. Beretta, E., Bertsch, M., Dal Passo, R.: Nonnegative solutions of a fourth-order degenerate parabolic equation. Arch. Rational Mech. Anal. 129, 175–200 (1995)
5. Bernis, F.: Finite speed of propagation and continuity of the interface for thin viscous flows. Adv. in Diff. Eq. 3, 337–368 (1996)
6. Bernis, F., Friedman, A.: Higher order nonlinear degenerate parabolic equations. J. Diff. Eqns. 83, 179–206 (1990)
7. Bernis, F., Peletier, L.A., Williams, S.M.: Source type solutions of a fourth order nonlinear degenerate parabolic equation. Nonlinear Analysis 18, 217–234 (1992)
8. Bernoff, A.J., Witelski, T.P.: Linear stability of source-type similarity solutions of the thin film equation. To appear in Appl. Math. Letters (2001)
9. Bertozzi, A.L.: The mathematics of moving contact lines in thin liquid films. Notices of the AMS, June–July 1998, 689–697 (1998)
10. Bertozzi, A.L., Pugh, M.: The lubrication approximation for thin viscous films: Regularity and long-time behavior of weak solutions. Comm. Pure Appl. Math. XLIX, 85–123 (1996)
11. Bertozzi, A.L., Pugh, M.: Finite-time blow-up of solutions of some long-wave unstable thin film equations. Indiana Univ. Math. J. 49, 1323–1366 (2000)
12. Bleher, P.M., Lebowitz, J.L., Speer, E.R.: Existence and positivity of solutions of a fourth order nonlinear PDE describing interface fluctuations. Commun. Pure Appl. Math. XLVII, 923–942 (1994)
13. Carrillo, J.A., Toscani, G.: Asymptotic $L^1$-decay of solutions of the porous medium equation to self-similarity. Indiana Univ. Math. J. 49, 113–141 (2000)
14. Carrillo, J.A., Jüngel, A., Markowich, P., Toscani, G., Unterreiter, A.: Entropy dissipation methods for degenerate parabolic problems and generalized Sobolev inequalities. Monatsh. Math. 133, 1–82 (2001)
15. Derrida, B., Lebowitz, J.L., Speer, E.R., Spohn, H.: Fluctuations of a stationary nonequilibrium interface. Phys. Rev. Lett. 67, 165–168 (1991)
16. Dolbeault, J., del Pino, M.: Generalized Sobolev inequalities and asymptotic behaviour in fast diffusion and porous medium problems. Preprint
17. Giacomelli, L.: A fourth-order degenerate parabolic equation describing thin viscous flows over an inclined plane. Appl. Math. Lett. 12, 107–111 (1999)
18. Giacomelli, L., Otto, F.: Variational formulation for the lubrication approximation of the Hele-Shaw flow. Preprint
19. Hocking, L.M.: The spreading of a thin drop by gravity and capillarity. Q. J. Mech. Appl. Math. 34, 37–55 (1981)
20. Kullback, S.: Information Theory and Statistics. New York: John Wiley, 1959
21. Lacey, A.A.: The motion with slip of a thin viscous droplet over a solid surface. Stud. Appl. Math. 67, 217–230 (1982)
22. Jüngel, A., Pinnau, R.: Global non-negative solutions of a nonlinear fourth-order parabolic equation for quantum systems. SIAM J. Math. Anal. 32, 760–777 (2000)
23. Lopez, J.L., Soler, J., Toscani, G.: Time rescaling and asymptotic behavior of some fourth order degenerate diffusion equations. To appear in Computers and Math. Applications
24. Myers, T.G.: Thin films with high surface tension. SIAM Reviews 40, 441–462 (1998)
25. Nash, J.: Continuity of solutions of parabolic and elliptic equations. Am. J. Math. 80, 931–954 (1958)
26. Otto, F.: The geometry of dissipative evolution equations: The porous medium equation. Comm. P.D.E. 26, 101–174 (2001)


27. Otto, F.: Lubrication approximation with prescribed nonzero contact angle. Comm. P.D.E. 23, 2077–2164 (1998)
28. Smyth, N.F., Hill, J.M.: Higher order nonlinear diffusion. IMA J. Appl. Math. 40, 73–86 (1988)
29. Toscani, G.: Remarks on entropy and equilibrium states. Appl. Math. Letters 12, 19–25 (1999)
30. Vázquez, J.L.: Asymptotic behaviour for the porous medium equation in the whole space. Notas del curso de doctorado Métodos asintóticos en ecuaciones de evolución
31. Vázquez, J.L.: An introduction to the mathematical theory of the porous medium equation. In: Shape optimization and free boundaries (Montreal, PQ, 1990), NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., 380, Dordrecht: Kluwer Acad. Publ. (1992), pp. 347–389

Communicated by J. L. Lebowitz

Commun. Math. Phys. 225, 573 – 609 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

Unitary Representations of $U_q(\mathfrak{sl}(2,\mathbb{R}))$, the Modular Double and the Multiparticle q-Deformed Toda Chains

S. Kharchev1, D. Lebedev1, M. Semenov-Tian-Shansky2,3

1 Institute of Theoretical and Experimental Physics, Moscow 117259, Russia
2 Université de Bourgogne, 21078 Dijon, France
3 Steklov Math. Institute, St. Petersburg 191011, Russia

Received: 11 April 2001 / Accepted: 8 October 2001

Abstract: The paper deals with the analytic theory of the quantum q-deformed Toda chains; the technique used combines the methods of representation theory and the Quantum Inverse Scattering Method. The key phenomenon which is under scrutiny is the role of the modular duality concept (first discovered by L. Faddeev) in the representation theory of noncompact semisimple quantum groups. Explicit formulae for the Whittaker vectors are presented in terms of the double sine functions and the wave functions of the N-particle q-deformed open Toda chain are given as a multiple integral of the Mellin–Barnes type. For the periodic chain the two dual Baxter equations are derived.

Preface

In the late seventies B. Kostant [1] discovered a fascinating link between the representation theory of non-compact semisimple Lie groups and the quantum Toda chain. Let $G$ be a real split semisimple Lie group, $B = MAN$ its minimal Borel subgroup, and let $N$ and $V = \bar N$ be the corresponding opposite unipotent subgroups. Let $\chi_N$, $\chi_V$ be nondegenerate unitary characters of $N$ and $V$, respectively. Let $H_T$ be the space of smooth functions on $G$ which satisfy the functional equation

$$ \varphi(vxn) = \chi_V(v)\,\chi_N(n)\,\varphi(x), \qquad v \in V,\ n \in N. $$

A function $\varphi \in H_T$ is uniquely determined by its restriction to $A \subset G$. Obviously, $H_T$ is invariant under the action of the center of the universal enveloping algebra $Z \subset U(\mathfrak{g})$; hence, any Casimir operator $C \in Z$ gives rise to a differential operator acting in $C^\infty(A)$. When $C$ is the quadratic Casimir, this is precisely the Toda Hamiltonian; other Casimirs provide a complete set of quantum integrals of motion. This observation reduces the spectral theory of the Toda chain to the representation theory of semisimple Lie groups. The joint eigenfunctions of the quantum Toda Hamiltonians are the so-called generalized Whittaker functions. The theory of Whittaker


S. Kharchev, D. Lebedev, M. Semenov-Tian-Shansky

functions has been extensively studied in the 60's and 70's [2–4]; it displays deep parallels with the celebrated Harish-Chandra theory of spherical functions [5] and depends on a profound study of the principal series representations [6].

The group theoretic approach based on representation theory of finite-dimensional semisimple groups is matched by a more sophisticated technique of the Quantum Inverse Scattering Method [7]. The treatment of the Toda chain by means of QISM is based on a 2 × 2 matrix first order difference Lax operator for the Toda lattice. (In order to understand its relation to the n × n Lax representation which is implicit in Kostant's approach, recall that the Lax matrix is a tridiagonal Jacobi matrix which defines a three-term recurrence relation and hence may be regarded as a second order scalar difference operator.) While the use of the lattice Lax representation restricts generality (we have to assume that $\mathfrak{g} = \mathfrak{sl}(n)$ ¹), it allows us to bring into play the powerful machinery of quantum R-matrices (and hence eventually of infinite dimensional quantum groups). Recently the first two authors have established an explicit connection of the QISM-based approach to the quantum Toda chain to the theory of Whittaker functions [9]. The technique of QISM yields new explicit formulae for the Whittaker functions which, to the best of our knowledge, were not known in the elementary representation theory.

It looks rather natural to generalize this approach to the q-deformed case. The use of lattice Lax representation makes the procedure rather straightforward: one simply has to replace the rational R-matrix with the trigonometric one (we shall see, however, that this generalization includes a number of nontrivial points). On the other hand, the very definitions of "noncompact quantum groups" which one needs to proceed with the q-deformed version of the Kostant approach are by no means obvious. It is the interplay of the explicit formulae based on QISM and of their not-yet-defined counterparts coming from the representation theory of noncompact finite-dimensional quantum groups that makes the entire game very exciting. Our preliminary results suggest that the correct treatment of the problem requires a very significant change in the entire framework of the representation theory of $U_q(\mathfrak{g})$; the crucial role is played by the "modular dual" of $U_q(\mathfrak{g})$ and the modular double $U_q(\mathfrak{g}) \otimes U_{\tilde q}(\mathfrak{g})$ which was introduced recently by Faddeev [10] ². Among other things, this new point of view leads to new possibilities in the choice of real forms of the relevant algebras: it is the real form of the modular double $U_q(\mathfrak{g}) \otimes U_{\tilde q}(\mathfrak{g})$ which really matters. One nontrivial possibility for the choice of the real form has been recently pointed out by Faddeev, Kashaev and Volkov [11] in their study of the quantum Liouville theory; it is very encouraging that the same real form naturally arises in the study of the q-deformed Toda chain. Analytical aspects of the theory bring into play the double gamma and double sine functions of Barnes [12–16], or the closely related quantum dilogarithms [17], which replace the ordinary gamma functions in the formulae for both the Harish-Chandra c-functions and the Whittaker functions. We believe that the implications of these constructions for the representation theory are probably more interesting than the q-deformed Toda model itself (commonly known as the relativistic Toda chain [18]).

Our strategy in the present paper is as follows. In Sect. 1 we shall start with the elementary representation theory of the algebra $U_q(\mathfrak{sl}(2,\mathbb{R}))$. Section 2 deals with the theory of Whittaker vectors and Whittaker functions for the modular double $U_q(\mathfrak{sl}(2,\mathbb{R})) \otimes$

¹ The treatment of other classical Lie algebras is also possible; to that end, one needs to use lattice Lax pairs with boundary conditions introduced by Sklyanin [8]. In the present note we shall not deal with this generalization and assume that $\mathfrak{g} = \mathfrak{sl}(n)$.
² The definition of the modular double was coined by Faddeev in the special case $\mathfrak{g} = \mathfrak{sl}(2)$; as pointed out to the authors by B. Feigin, it is most likely that for general semisimple Lie algebras the modular dual of $U_q(\mathfrak{g})$ is $U_{\tilde q}(\check{\mathfrak{g}})$, where $\check{\mathfrak{g}}$ is the Langlands dual of $\mathfrak{g}$.

Analytic Theory of Quantum q-Deformed Toda Chains


$U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R}))$ and with the 2-particle q-deformed open Toda chain. We obtain explicit formulae for the Whittaker vectors in terms of the double sine functions and derive the integral representations for solutions to a one-parameter family of two-particle relativistic Toda chains in the framework of representation theory; all these solutions possess dual symmetry. The generalization to the N-particle case is described in Sect. 3; using the QISM approach, we derive an appropriate solution of the spectral problem for the open N-particle chain in the form of a multiple integral of the Mellin–Barnes type with the gamma functions replaced by double sine functions. It is shown that the solution for the N-periodic chain is represented as a generalized Fourier transform of the (N − 1)-particle open wave function with the kernel satisfying two mutually dual Baxter equations. Finally, in the Appendix we list the essential analytic properties of the double sine functions.

1. Representations of Principal Series of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ and the Modular Double

In this section we shall discuss the representations of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ which may be regarded as deformations of the principal series representations of $SL(2,\mathbb{R})$. As pointed out by Faddeev [10], these representations possess a remarkable duality which is similar to the modular duality for noncommutative tori discovered by Rieffel [19]. We start with the algebraic definition of $U_q(\mathfrak{sl}(2,\mathbb{C}))$ (see, for example, [20]). It is generated by elements $K^{\pm 1}$, $E$, $F$ subject to the relations
\[
KE = q^{2}EK, \qquad KF = q^{-2}FK, \qquad EF - FE = \frac{K - K^{-1}}{q - q^{-1}}, \tag{1.1}
\]
where
\[
q = e^{\pi i \tau}, \qquad \tau \in \mathbb{C}. \tag{1.2}
\]

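The bialgebra structure on $U_q(\mathfrak{sl}(2,\mathbb{C}))$ is given by the coproduct³
\[
\Delta K = K \otimes K, \qquad \Delta E = E \otimes 1 + K \otimes E, \qquad \Delta F = 1 \otimes F + F \otimes K^{-1}. \tag{1.3}
\]
The center of $U_q(\mathfrak{sl}(2,\mathbb{C}))$ is generated by the Casimir element
\[
C_2 = qK + q^{-1}K^{-1} + (q - q^{-1})^{2} FE. \tag{1.4}
\]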

The algebra $U_q(\mathfrak{sl}(2,\mathbb{C}))$ admits a real form defined by the involution
\[
K^{*} = K, \qquad E^{*} = -E, \qquad F^{*} = -F, \tag{1.5}
\]
which is compatible with the commutation relations (1.1) only if $|q| = 1$, i.e. $\tau \in \mathbb{R}$. The corresponding real algebra is called $U_q(\mathfrak{sl}(2,\mathbb{R}))$. (We shall see later that when $U_q(\mathfrak{sl}(2))$ is replaced with its modular double, there is a possibility to choose the real structure in a different way.)

³ The coalgebraic structure of $U_q(\mathfrak{sl}(2,\mathbb{C}))$ is not used in the present paper.
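The condition $|q| = 1$ can be checked by hand. The short computation below is only a sketch: it assumes, as is usual for real forms but not spelled out above, that $*$ is an antilinear anti-automorphism, i.e. $(XY)^* = Y^*X^*$ and $(cX)^* = \bar c\,X^*$.

```latex
% Apply * to both sides of KE = q^2 EK, using (1.5):
\begin{align*}
(KE)^* &= E^*K^* = -EK, \\
(q^{2}EK)^* &= \bar q^{\,2}K^*E^* = -\bar q^{\,2}KE = -\bar q^{\,2}q^{2}\,EK = -|q|^{4}\,EK ,
\end{align*}
% so compatibility forces |q|^4 = 1, i.e. |q| = 1 and \bar q = q^{-1}.
% With \bar q = q^{-1} the third relation is preserved as well:
\begin{align*}
(EF - FE)^* &= F^*E^* - E^*F^* = -(EF - FE), \\
\Bigl(\frac{K - K^{-1}}{q - q^{-1}}\Bigr)^{*} &= \frac{K - K^{-1}}{\bar q - \bar q^{-1}}
 = -\,\frac{K - K^{-1}}{q - q^{-1}} .
\end{align*}
```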


S. Kharchev, D. Lebedev, M. Semenov-Tian-Shansky

It is sometimes useful to consider the corresponding “infinitesimal” algebra $U_\tau(\mathfrak{sl}(2,\mathbb{R}))$, $\tau \in \mathbb{R}$, with generators $E$, $F$, $H$ and relations
\[
[H, E] = 2E, \qquad [H, F] = -2F, \qquad [E, F] = \frac{q^{H} - q^{-H}}{q - q^{-1}}, \qquad K = e^{H}. \tag{1.6}
\]
Evidently, there is an involution
\[
H^{*} = -H, \qquad E^{*} = -E, \qquad F^{*} = -F. \tag{1.7}
\]

Let us sketch the representation theory of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ in a way which stresses the role of the modular duality concept (cf. [21]). The representations of the principal series of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ admit an explicit realization by means of finite difference operators on the real line; the commutation relations of the basic operators which are the building blocks for these representations are the ordinary Weyl relations. To put it in a different way, the principal series representations of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ factor through a noncommutative torus.

Definition 1.1. The noncommutative torus $A_q$ is the associative algebra generated by $u$, $v$ subject to the relation $uv = q^{2}vu$.

We shall adjoin to $A_q$ the inverse elements $u^{-1}$, $v^{-1}$ (in other words, we replace $A_q$ with its field of fractions, which we denote by the same letter).

Proposition 1.1. For any $z \in \mathbb{C}$ the mapping $U_q(\mathfrak{sl}(2,\mathbb{R})) \to A_q$ defined by
\[
K \mapsto z u^{-1}, \qquad
E \mapsto \frac{v^{-1}}{q - q^{-1}}\,(1 - u^{-1}), \qquad
F \mapsto \frac{q v}{q - q^{-1}}\,(z - z^{-1}u) \tag{1.8}
\]
is a homomorphism of algebras.

Note that the Casimir $C_2$ is mapped by the homomorphism (1.8) to $qz + q^{-1}z^{-1}$. It is sometimes technically convenient to extend the algebra $U_q(\mathfrak{sl}(2))$ by adjoining to it “virtual Casimir elements”. The following assertion is well known.

Proposition 1.2. The center of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ is isomorphic to the polynomial algebra $Z = \mathbb{C}[qz + q^{-1}z^{-1}] \subset \mathbb{C}[z, z^{-1}] = \hat Z$; $U_q(\mathfrak{sl}(2,\mathbb{R}))$ is a free $Z$-module.

Set
\[
\hat U_q(\mathfrak{sl}(2,\mathbb{R})) = U_q(\mathfrak{sl}(2,\mathbb{R})) \otimes_{Z} \hat Z .
\]
The mapping (1.8) canonically extends to $\hat U_q(\mathfrak{sl}(2))$. Informally, we may think of $\hat U_q(\mathfrak{sl}(2))$ as a bundle of noncommutative tori parameterized by the spectrum of the central element $z \in \hat Z$. Proposition 1.1 is a simple instance of the “free field representations” for quantum groups; it may also be compared with the well-known Gelfand–Kirillov theorem [22] which asserts that the field of fractions of the universal enveloping algebra is isomorphic to the standard noncommutative division algebra (central extension of the field of fractions of the Weyl algebra generated by several pairs of “canonical variables” $p_i$, $q_i$).

As a motivation for the study of the modular duality for $U_q(\mathfrak{sl}(2,\mathbb{R}))$ let us recall the following standard construction from ergodic theory [19, 10]. Let $q = \exp(\pi i \omega_1/\omega_2)$, where $\omega_1, \omega_2 \in \mathbb{R}$; we shall assume that $\tau = \omega_1/\omega_2$ is irrational. Put $\tilde q = \exp(\pi i \omega_2/\omega_1)$


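and let $A_{\tilde q}$ be the dual torus with generators $\tilde u$, $\tilde v$ and relations $\tilde u \tilde v = \tilde q^{2}\,\tilde v \tilde u$. Let us define unitary operators $T_{\omega_1}$, $T_{\omega_2}$, $S_{-i\omega_1}$, $S_{-i\omega_2}$ in $L_2(\mathbb{R})$ by
\[
\begin{aligned}
T_{\omega_1}\varphi(t) &= \varphi(t + \omega_1), & T_{\omega_2}\varphi(t) &= \varphi(t + \omega_2), \\
S_{-i\omega_1}\varphi(t) &= e^{\frac{2\pi i t}{\omega_1}}\varphi(t), & S_{-i\omega_2}\varphi(t) &= e^{\frac{2\pi i t}{\omega_2}}\varphi(t).
\end{aligned} \tag{1.9}
\]
Define the dual representations of $A_q$ and $A_{\tilde q}$ in $H = L_2(\mathbb{R})$ by
\[
\begin{aligned}
\rho &: \; u \mapsto T_{\omega_1}, & v &\mapsto S_{-i\omega_2}, \\
\tilde\rho &: \; \tilde u \mapsto T_{\omega_2}, & \tilde v &\mapsto S_{-i\omega_1}.
\end{aligned} \tag{1.10}
\]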
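The Weyl relations behind (1.10) are easy to test numerically. The following is a minimal sketch, not part of the original argument: the sample periods, the Gaussian test function `phi` and the helper names `T`, `S_minus_i` and `weyl_residuals` are illustrative assumptions. It checks $uv = q^2vu$, $\tilde u\tilde v = \tilde q^2\tilde v\tilde u$, and that the images of the two tori commute with each other.

```python
import cmath

# Sample periods (assumed values); tau = w1 / w2 is irrational.
w1, w2 = 1.0, 2.0 ** 0.5
q = cmath.exp(1j * cmath.pi * w1 / w2)    # q = exp(pi i w1 / w2)
qt = cmath.exp(1j * cmath.pi * w2 / w1)   # dual parameter q~ = exp(pi i w2 / w1)

def T(a):
    """Shift operator: (T_a f)(t) = f(t + a)."""
    return lambda f: (lambda t: f(t + a))

def S_minus_i(w):
    """Multiplication operator S_{-i w}: f(t) -> exp(2 pi i t / w) f(t)."""
    return lambda f: (lambda t: cmath.exp(2j * cmath.pi * t / w) * f(t))

# Representations (1.10): rho for A_q, rho~ for the dual torus.
u, v = T(w1), S_minus_i(w2)
ut, vt = T(w2), S_minus_i(w1)

def phi(t):
    """Illustrative test function in L2(R)."""
    return cmath.exp(-t * t)

def weyl_residuals(t):
    """Absolute residuals of the Weyl and commutation relations at the point t."""
    return [
        abs(u(v(phi))(t) - q**2 * v(u(phi))(t)),        # u v = q^2 v u
        abs(ut(vt(phi))(t) - qt**2 * vt(ut(phi))(t)),   # u~ v~ = q~^2 v~ u~
        abs(u(vt(phi))(t) - vt(u(phi))(t)),             # u commutes with v~
        abs(v(ut(phi))(t) - ut(v(phi))(t)),             # v commutes with u~
        abs(u(ut(phi))(t) - ut(u(phi))(t)),             # shifts commute
        abs(v(vt(phi))(t) - vt(v(phi))(t)),             # multiplications commute
    ]
```

Here `u(v(phi))` means "apply `v` first, then `u`", matching the operator product $uv$. The cross-commutation works because $e^{2\pi i (t+\omega_1)/\omega_1} = e^{2\pi i t/\omega_1}$.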

It is easy to see that $A_q$ and $A_{\tilde q}$ are the centralizers of each other in the algebra $B(H)$ of all bounded operators in $H$. The space $H = L_2(\mathbb{R})$, which has the structure of a left $A_q$-module and of a right $A_{\tilde q}$-module, is called the imprimitivity $(A_q, A_{\tilde q})$-bimodule. The images of $A_q$, $A_{\tilde q}$ in $B(H)$ are factors of type II$_1$. Clearly, the representations of $A_q$ and $A_{\tilde q}$ are reducible (in fact, both $A_q$ and $A_{\tilde q}$ contain plenty of idempotent elements which are represented by projection operators in $H$; the image of a projection operator $\tilde\rho(P)$, $P \in A_{\tilde q}$, is an invariant subspace for $A_q$; the subspaces of $H$ which arise in this way are the celebrated fractional dimensional spaces of von Neumann). On the other hand, the second commutant of $A_q \otimes A_{\tilde q}$ coincides with $B(H)$ and hence (1.10) is an irreducible representation of $A_q \otimes A_{\tilde q}$ (as a matter of fact, up to unitary equivalence, this algebra has a unique irreducible representation). The relation between the two noncommutative tori described above is called by Rieffel the strong Morita equivalence; more generally, Rieffel showed [19] that two tori $A_q$ and $A_{\tilde q}$, $q = e^{\pi i \tau}$, $\tilde q = e^{\pi i \tilde\tau}$, are strongly Morita equivalent if and only if
\[
\tilde\tau = \frac{a\tau + b}{c\tau + d}, \qquad \text{where } \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL(2, \mathbb{Z}),
\]
which explains the term “modular duality”.

Definition 1.2. The modular dual of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ is the Hopf algebra $U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R}))$ with $\tilde q = e^{\pi i/\tau}$; we set also
\[
\hat U_{\tilde q}(\mathfrak{sl}(2)) = U_{\tilde q}(\mathfrak{sl}(2)) \otimes_{\tilde Z} \hat{\tilde Z}, \qquad
\tilde Z = \mathbb{C}[\tilde q \tilde z + \tilde q^{-1}\tilde z^{-1}], \qquad
\hat{\tilde Z} = \mathbb{C}[\tilde z, \tilde z^{-1}].
\]
The obvious motivation for this definition is the existence of the “dual free field representation” $U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R})) \to A_{\tilde q}$.

Remark 1.1. The modular transformation usually considered in the theory of theta functions is $\tau \mapsto -\frac{1}{\tau}$; this transformation preserves the upper half-plane $\mathrm{Im}\,\tau > 0$ and the unit disc $|q| < 1$. While the flip $q \mapsto q^{-1}$ amounts to the simple exchange of the generators of the quantum torus (and hence, in particular, the quantum algebras $U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R}))$ and $U_{\tilde q^{-1}}(\mathfrak{sl}(2,\mathbb{R}))$ are isomorphic), our choice of the sign of the modular transform appears to be most natural for the study of the q-deformed Toda chain.

The fundamental difference which arises in the representation theory of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ is that its unitary representations are constructed from non-unitary representations of the quantum torus: we need to make a kind of “Wick rotation” and hence $u, v \in A_q$ are


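represented by unbounded operators [23]. More precisely, let us consider the following operators in $H$:
\[
\begin{aligned}
T_{i\omega_1}\varphi(t) &= \varphi(t + i\omega_1), & T_{i\omega_2}\varphi(t) &= \varphi(t + i\omega_2), \\
S_{\omega_1}\varphi(t) &= e^{\frac{2\pi t}{\omega_1}}\varphi(t), & S_{\omega_2}\varphi(t) &= e^{\frac{2\pi t}{\omega_2}}\varphi(t).
\end{aligned} \tag{1.11}
\]
The dual representations of $A_q$, $A_{\tilde q}$ are now given by
\[
\begin{aligned}
\rho_W &: \; u \mapsto T_{i\omega_1}, & v &\mapsto S_{\omega_2}, \\
\tilde\rho_W &: \; \tilde u \mapsto T_{i\omega_2}, & \tilde v &\mapsto S_{\omega_1}.
\end{aligned} \tag{1.12}
\]
Operators (1.11) are essentially self-adjoint on the common domain $P$ which consists of entire functions $\psi$ such that
\[
\int_{\mathbb{R}} e^{sx}\,|\psi(x + iy)|^{2}\,dx < \infty \qquad \text{for all } y \in \mathbb{R},\ s \in \mathbb{R}.
\]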

Remark 1.2. Unlike the unitary case, the definition of the centralizer of an unbounded operator must take care of the domains of operators; thus $AB = BA$ implies that $B(\mathrm{Dom}_A) \subset \mathrm{Dom}_A$; this may fail even if $B$ is bounded. As a result, although the four operators (1.11) commute with each other, the same is not true, e.g., for their spectral projection operators;⁴ thus, contrary to the ergodic case, the centralizer of $\rho_W(A_q)$ does not contain projection operators, and hence the representations $\rho_W$, $\tilde\rho_W$ are geometrically irreducible. It is much more important for us, however, that they are not irreducible in the operator sense, as each of them still admits a huge algebra of intertwiners.

Proposition 1.3. Operators which commute with all four operators (1.11) are scalars.

Corollary 1.1. The representation of $A_q \otimes A_{\tilde q}$ is strongly irreducible (i.e. it does not admit any nontrivial intertwiners).

Let us now describe explicitly the particular principal series representation of $\hat U_q(\mathfrak{sl}(2,\mathbb{R}))$ which is extensively used in the paper (the realization we use is slightly different from those described in [21]). Namely, the representation $\pi_\lambda$ of $\hat U_q(\mathfrak{sl}(2,\mathbb{R}))$ ($q = e^{\pi i \omega_1/\omega_2}$, $\omega_1, \omega_2 \in \mathbb{R}_+$), which depends on a parameter $\lambda \in \mathbb{C}$, is given by
\[
\pi_\lambda : \quad
\begin{aligned}
K &\mapsto e^{\frac{\pi i \lambda}{\omega_2}}\, T_{i\omega_1}^{-1}, \\
E &\mapsto \frac{S_{\omega_2}^{-1}}{q - q^{-1}} \bigl(1 - T_{i\omega_1}^{-1}\bigr), \\
F &\mapsto \frac{q\,S_{\omega_2}}{q - q^{-1}} \bigl(e^{\frac{\pi i \lambda}{\omega_2}} - e^{-\frac{\pi i \lambda}{\omega_2}}\, T_{i\omega_1}\bigr), \\
C_2 &\mapsto q\,e^{\frac{\pi i \lambda}{\omega_2}} + q^{-1} e^{-\frac{\pi i \lambda}{\omega_2}}, \\
z &\mapsto e^{\frac{\pi i \lambda}{\omega_2}} .
\end{aligned} \tag{1.13}
\]

⁴ Spectral projection operators $E(\Delta)$ for multiplication operators are multiplication operators by the characteristic function of the interval $\Delta$; this function has compact support and hence the spectral projector does not preserve the domain which consists of analytic functions.
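As a sanity check, the relations (1.1) and the value of the Casimir element can be tested numerically in the realization (1.13). The sketch below is illustrative only: the sample values of $\omega_1$, $\omega_2$, $\lambda$, the Gaussian test function `phi` (an entire function with Gaussian decay on horizontal lines) and all helper names are assumptions, not part of the original text.

```python
import cmath

PI = cmath.pi
w1, w2 = 1.0, 2.0 ** 0.5            # sample periods (assumed values)
lam = 0.3 - 0.7j                    # arbitrary complex lambda (assumed value)
q = cmath.exp(1j * PI * w1 / w2)    # q = exp(pi i w1 / w2)
z = cmath.exp(1j * PI * lam / w2)   # central character z = exp(pi i lam / w2)
d = q - 1 / q                       # q - q^{-1}

def phi(t):
    """Entire test function; complex arguments are allowed."""
    return cmath.exp(-t * t)

# Each generator maps a function of t to a new function of t, following (1.13).
def K(f):
    return lambda t: z * f(t - 1j * w1)

def Kinv(f):
    return lambda t: f(t + 1j * w1) / z

def E(f):
    return lambda t: cmath.exp(-2 * PI * t / w2) * (f(t) - f(t - 1j * w1)) / d

def F(f):
    return lambda t: q * cmath.exp(2 * PI * t / w2) * (z * f(t) - f(t + 1j * w1) / z) / d

def rel_err(a, b):
    """Relative difference, guarded against small denominators."""
    return abs(a - b) / max(1.0, abs(a), abs(b))

def relation_errors(t):
    """Errors of KE = q^2 EK, KF = q^-2 FK, [E,F] = (K - K^-1)/(q - q^-1), and C_2 = qz + 1/(qz)."""
    ke = rel_err(K(E(phi))(t), q**2 * E(K(phi))(t))
    kf = rel_err(K(F(phi))(t), q**-2 * F(K(phi))(t))
    comm = rel_err(E(F(phi))(t) - F(E(phi))(t), (K(phi)(t) - Kinv(phi)(t)) / d)
    cas = rel_err(q * K(phi)(t) + Kinv(phi)(t) / q + d**2 * F(E(phi))(t),
                  (q * z + 1 / (q * z)) * phi(t))
    return ke, kf, comm, cas
```

Operator products are written as nested calls, so `K(E(phi))` is $KE$ applied to `phi`. The last entry confirms that $C_2$ from (1.4) acts as the scalar $qz + q^{-1}z^{-1}$, in agreement with (1.13).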


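By duality, we define the representation $\tilde\pi_\lambda$ of the modular dual algebra $\hat U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R}))$ with $\tilde q = e^{\pi i \omega_2/\omega_1}$ by
\[
\tilde\pi_\lambda : \quad
\begin{aligned}
\tilde K &\mapsto e^{\frac{\pi i \lambda}{\omega_1}}\, T_{i\omega_2}^{-1}, \\
\tilde E &\mapsto \frac{S_{\omega_1}^{-1}}{\tilde q - \tilde q^{-1}} \bigl(1 - T_{i\omega_2}^{-1}\bigr), \\
\tilde F &\mapsto \frac{\tilde q\,S_{\omega_1}}{\tilde q - \tilde q^{-1}} \bigl(e^{\frac{\pi i \lambda}{\omega_1}} - e^{-\frac{\pi i \lambda}{\omega_1}}\, T_{i\omega_2}\bigr), \\
\tilde C_2 &\mapsto \tilde q\,e^{\frac{\pi i \lambda}{\omega_1}} + \tilde q^{-1} e^{-\frac{\pi i \lambda}{\omega_1}}, \\
\tilde z &\mapsto e^{\frac{\pi i \lambda}{\omega_1}} .
\end{aligned} \tag{1.14}
\]
The representations $\pi_\lambda$, $\tilde\pi_\lambda$ are defined on a larger space $P_\lambda \supset P$ which depends on $\lambda$.

Definition 1.3. $P_\lambda$ is the set of entire functions such that
(i) For $t \to +\infty$ a function $\psi \in P_\lambda$ admits an asymptotic expansion
\[
\psi(t + is) \sim_{t \to +\infty} e^{\frac{2\pi \lambda t}{\omega_1\omega_2}} \sum_{n_1, n_2 \ge 0} C_{n_1,n_2}\, e^{-\frac{2\pi t (n_1\omega_1 + n_2\omega_2)}{\omega_1\omega_2}} \tag{1.15}
\]
uniformly in each bounded strip.
(ii) For $t \to -\infty$ it admits an asymptotic expansion
\[
\psi(t + is) \sim_{t \to -\infty} C \Bigl(1 + \sum_{n_1, n_2 > 0} C_{n_1,n_2}\, e^{\frac{2\pi t (n_1\omega_1 + n_2\omega_2)}{\omega_1\omega_2}}\Bigr) \tag{1.16}
\]
uniformly in each bounded strip.⁵

The scalar product which is adapted to the discussion of the unitarity conditions in our algebra, defined on $P_\lambda$ only for $\lambda \in i\mathbb{R} - \omega_1 - \omega_2$, is given by
\[
(\varphi, \psi) = \int_{\mathbb{R}} e^{2\pi t \left(\frac{1}{\omega_1} + \frac{1}{\omega_2}\right)}\, \overline{\varphi(t)}\,\psi(t)\,dt. \tag{1.17}
\]

Proposition 1.4. The following statements hold:
(i) Operators $\pi_\lambda(X)$, $X \in U_q(\mathfrak{sl}(2,\mathbb{R}))$, and $\tilde\pi_\lambda(Y)$, $Y \in U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R}))$, leave $P_\lambda \subset L_2(\mathbb{R})$ invariant and commute with each other on this domain; any operator which commutes with both algebras is scalar.
(ii) If $\lambda \in i\mathbb{R} - \omega_1 - \omega_2$, all operators $\pi_\lambda(X)$, $X \in U_q(\mathfrak{sl}(2,\mathbb{R}))$, obey the involution generated by (1.5) with respect to the scalar product (1.17). A similar statement holds for all operators $\tilde\pi_\lambda(Y)$, $Y \in U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R}))$.
(iii) The central characters $z$, $\tilde z$ of $\pi_\lambda$, $\tilde\pi_\lambda$ are related by $z = \tilde z^{\tau}$, $\tau = \omega_1/\omega_2$.

As in Proposition 1.3, the commutativity condition implies that the operator preserves the domains of our unbounded operators.

⁵ The space $P_\lambda$ essentially coincides with that considered in [21].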


Corollary 1.2. The principal series representation $\pi_\lambda$ of $\hat U_q(\mathfrak{sl}(2,\mathbb{R}))$ canonically extends to a representation of $\hat U_q(\mathfrak{sl}(2,\mathbb{R})) \otimes \hat U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R}))$ which is defined on the same domain $P_\lambda$; this representation is unitary if and only if $\lambda \in i\mathbb{R} - \omega_1 - \omega_2$.

Remark 1.3. Since representations of the dual algebras $U_q(\mathfrak{sl}(2))$ and $U_{\tilde q}(\mathfrak{sl}(2))$ are constructed from the dual representations of the quantum tori $A_q$, $A_{\tilde q}$ for any values of the central characters $z = e^{\pi i \lambda/\omega_2}$, $\tilde z = e^{\pi i \mu/\omega_1}$, one might conclude that the principal series representations $\pi_\lambda$, $\tilde\pi_\mu$ centralize each other for any pair of indices $\lambda$, $\mu$; it is the condition on the common domain which imposes the selection rule.

Let us now introduce the following key definition.

Definition 1.4. The modular double of $\hat U_q(\mathfrak{sl}(2))$ is the Hopf algebra $D_{\mathrm{mod}} = \hat U_q(\mathfrak{sl}(2)) \otimes \hat U_{\tilde q}(\mathfrak{sl}(2))$.

The bialgebra structure of the modular double, i.e., its product and coproduct, is standard. The point is that this algebra admits an unexpected class of representations which are not tensor products of representations of the factors, but rather are related to a kind of “type II” operator algebras (the quotation marks reflect the fact that due to analyticity constraints our algebras are “thinner” than the genuine type II factors; in particular, they do not contain projection operators). The modular double should be regarded as an analytic rather than algebraic object which for the first time brings into play the nontrivial analytic properties of noncompact semisimple quantum groups. In what follows we shall be interested only in the principal series representations of $D_{\mathrm{mod}}$ defined above; with respect to this subclass of representations $D_{\mathrm{mod}}$ behaves as a rank one algebra. Note that the kernel of these representations contains the two-sided ideal $J \subset \hat U_q(\mathfrak{sl}(2)) \otimes \hat U_{\tilde q}(\mathfrak{sl}(2))$ generated by the relations
\[
z = \tilde z^{\,\tau}, \qquad K = \tilde K^{\tau}.
\]
The use of the modular double and its representations, instead of those of its factors, appears to be very natural in many ways. We shall see below that the definition of the Whittaker vectors becomes unambiguous only if we require that they are the eigenvectors of both nilpotent generators $\pi_\lambda(E)$, $\tilde\pi_\lambda(\tilde E)$. One more reason to enjoy the presence of a double set of generators is the integrability problem for the q-deformed (relativistic) Toda model discussed below. The q-Toda Hamiltonian, which is derived from the Casimir element of $U_q(\mathfrak{sl}(2))$, is a difference operator which involves only translations $T_{i\omega_1}$; due to the presence of quasiconstants (i.e., functions with period $i\omega_1$), its spectrum becomes multiple with infinite multiplicity; the multiplicity problem is resolved when we take into account the dual Casimir element which involves dual translations $T_{i\omega_2}$.

The real form of $D_{\mathrm{mod}}$ used above is inherited from the real forms of $\hat U_q(\mathfrak{sl}(2))$, $\hat U_{\tilde q}(\mathfrak{sl}(2))$. As pointed out by Faddeev [10], for a special choice of the complex periods $\omega_1$, $\omega_2$ there exists another real form of $D_{\mathrm{mod}}$ which does not reduce to real forms of its factors. Namely,

Proposition 1.5. (i) Let us assume that $\omega_1 = \bar\omega_2$, or, equivalently, that $|\tau| = 1$. Then the mapping
\[
E \mapsto -\tilde E, \qquad F \mapsto -\tilde F, \qquad K \mapsto \tilde K, \qquad z \mapsto \tilde q^{\,2}\,\tilde z \tag{1.18}
\]
extends to a $\mathbb{C}$-antilinear involution of $D_{\mathrm{mod}}$.


(ii) Let $\rho$ be a unitary representation of $D_{\mathrm{mod}}$ with respect to the real form (1.18); then all operators $\rho(X)$, $X \in \hat U_q(\mathfrak{sl}(2)) \subset D_{\mathrm{mod}}$, and $\rho(Y)$, $Y \in \hat U_{\tilde q}(\mathfrak{sl}(2)) \subset D_{\mathrm{mod}}$, are normal.
(iii) Let $\lambda \in i\mathbb{R} - \omega_1 - \omega_2$; then the principal series representation $\pi_\lambda$ extends to a unitary representation of $D_{\mathrm{mod}}$ with respect to the real form (1.18).

Physical self-adjoint Hamiltonians associated with the real form (1.18) can be derived from the real and imaginary parts of the Casimir operators. Analytically, Faddeev’s real form is particularly attractive, since in that case the lattice generated by $\omega_1$, $\omega_2$ is non-degenerate.

The use of the modular double is very well suited for the treatment of interpolation problems. Recall that we are dealing with the “rational form” of the quantum algebra $U_q(\mathfrak{sl}(2))$ which is defined in terms of the generator $K = q^{H}$. This choice is at the core of modular duality: if we replace $U_q(\mathfrak{sl}(2))$ with the “infinitesimal” algebra $U_\tau(\mathfrak{sl}(2))$ generated by $E$, $F$, $H$, the commutativity of the two dual sets of generators is completely destroyed. On the other hand, when it comes to computing special functions associated with representations of $U_q(\mathfrak{sl}(2))$, i.e., some specific matrix coefficients of its irreducible representations, e.g., spherical functions or Whittaker functions, and to constructing the corresponding spectral theory, it is important to define these functions on the entire real line or on its complexification. By contrast, the use of the rational form $U_q(\mathfrak{sl}(2))$ implies that these functions are defined a priori only on a discrete set $\{K^{n},\ n \in \mathbb{Z}\}$. Let us assume that $\omega_1$, $\omega_2$ are real and $\tau = \omega_1/\omega_2$ is irrational.

Proposition 1.6. For any $\alpha \in \mathbb{R}$ the operators $\pi_\lambda(e^{\alpha H})$ are approximated by linear combinations of $\pi_\lambda(K^{n} \cdot \tilde K^{m})$, $n, m \in \mathbb{Z}$.

Indeed, $\pi_\lambda(e^{\alpha H})$ is a translation operator, $\pi_\lambda(e^{\alpha H})\varphi(t) = \varphi(t - i\alpha)$; on the other hand, $\pi_\lambda(K^{n} \cdot \tilde K^{m})\varphi(t) = e^{\pi i \lambda n/\omega_2 + \pi i \lambda m/\omega_1}\,\varphi(t - in\omega_1 - im\omega_2)$. The set $\{in\omega_1 + im\omega_2;\ n, m \in \mathbb{Z}\}$ is dense in $i\mathbb{R}$.

Remark 1.4. There exists a whole family of principal series representations similar to those described above. It is easy to find a realization of the algebras $U_q(\mathfrak{sl}(2))$ and $U_{\tilde q}(\mathfrak{sl}(2))$ labeled by integer indices $k_1$ and $k_2$, respectively, such that all representation operators, which act on an appropriate space $P_\lambda^{k_1 k_2}$, satisfy the unitarity condition with respect to the scalar product with the measure $\exp\bigl\{\frac{2\pi (k_1\omega_1 + k_2\omega_2)t}{\omega_1\omega_2}\bigr\}$. In this case one obtains a unitary representation if and only if $\lambda \in i\mathbb{R} - k_1\omega_1 - k_2\omega_2$. For simplicity, in the present paper we restrict ourselves to the case $k_1 = k_2 = 1$, although the more general case can be treated quite similarly.

2. Whittaker Vectors

Let $\mathfrak{g}$ be a semisimple Lie algebra, $\mathfrak{n}$ its maximal nilpotent subalgebra generated by positive root vectors. A character $\chi : \mathfrak{n} \to \mathbb{C}$ is uniquely fixed by its values on root vectors associated with simple roots; it is called nondegenerate if $\chi(e_\alpha) \neq 0$ for all simple roots $\alpha$. A Whittaker vector in a $\mathfrak{g}$-module $V$ is a vector $w \in V$ such that
\[
Xw = \chi(X)\,w \tag{2.1}
\]
for all $X \in \mathfrak{n}$. The extension of this definition to q-deformed algebras is nontrivial: it is easy to see that for $\mathrm{rank}\,\mathfrak{g} \ge 2$ the algebra $U_q(\mathfrak{n})$ generated by the Chevalley generators


associated with positive simple roots does not admit nondegenerate characters (the obstruction is associated with the q-deformed Serre relations). In [24] Sevostyanov found a way around this difficulty: one has to rescale the generators of the nilpotent subalgebra, multiplying them by appropriate group-like elements from the Cartan subalgebra. Although the Serre relations are vacuous in the $\mathfrak{sl}_2$ case, the same trick proves useful in that case as well; it provides an extra freedom which serves to construct various versions of the q-deformed Toda Hamiltonians.

Whittaker vectors associated with the unitary principal series representations of $SL(2,\mathbb{R})$ do not lie in the Hilbert space, because the spectrum of $E$, $F$ is continuous; as a result, the Whittaker functions, which are defined as formal matrix coefficients of the principal series representations between a pair of Whittaker vectors, are expressed by a divergent integral which requires regularization. The situation in the q-deformed case is completely similar. As already mentioned, the natural definition of Whittaker vectors in the q-deformed setting requires the use of the modular double. The two commuting generators $E, \tilde E \in D_{\mathrm{mod}}$ give rise to two compatible difference equations which have a unique common solution with nice analytic properties; this solution does not belong to $L_2(\mathbb{R})$, because it does not decrease rapidly enough.

With these remarks in mind, we may now proceed to the formal definition. Let $(\pi_\lambda, \tilde\pi_\lambda)$, $\lambda \in i\mathbb{R} - \omega_1 - \omega_2$, be the unitary representation of $D_{\mathrm{mod}}$ and $\alpha \in \mathbb{R}$ an arbitrary parameter. The E-Whittaker vector $\Phi^{(\alpha)}_\lambda$ is defined by
\[
\begin{aligned}
\pi_\lambda(E)\,\Phi^{(\alpha)}_\lambda &= \frac{g^{\omega_1}}{q - q^{-1}}\, e^{\pi i \alpha}\, \pi_\lambda(q^{\alpha H})\,\Phi^{(\alpha)}_\lambda, \\
\tilde\pi_\lambda(\tilde E)\,\Phi^{(\alpha)}_\lambda &= \frac{g^{\omega_2}}{\tilde q - \tilde q^{-1}}\, e^{\pi i \alpha}\, \tilde\pi_\lambda(\tilde q^{\,\alpha \tilde H})\,\Phi^{(\alpha)}_\lambda ;
\end{aligned} \tag{2.2}
\]
here $g$ is a positive real number (the “coupling constant”). The extra parameter $\alpha$ matches the freedom in the choice of the quantum Lax operator in the alternative formulation of the q-deformed Toda theory based on the Quantum Inverse Scattering Method. In other words, particular choices of $\alpha$ correspond to different Toda-like models. In a similar way, the F-Whittaker vectors $\widehat\Phi^{(\alpha)}_\lambda$ are defined by
\[
\begin{aligned}
\pi_\lambda(F)\,\widehat\Phi^{(\alpha)}_\lambda &= -\frac{g^{\omega_1}}{q - q^{-1}}\, e^{\pi i \alpha}\, \pi_\lambda(q^{-\alpha H})\,\widehat\Phi^{(\alpha)}_\lambda, \\
\tilde\pi_\lambda(\tilde F)\,\widehat\Phi^{(\alpha)}_\lambda &= -\frac{g^{\omega_2}}{\tilde q - \tilde q^{-1}}\, e^{\pi i \alpha}\, \tilde\pi_\lambda(\tilde q^{\,-\alpha \tilde H})\,\widehat\Phi^{(\alpha)}_\lambda .
\end{aligned} \tag{2.3}
\]

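The definition of the Whittaker vectors is completely symmetric with respect to the exchange of the two dual algebras $U_q(\mathfrak{sl}(2))$, $U_{\tilde q}(\mathfrak{sl}(2))$. Note that the existence of a common eigenvector of the commuting generators $\pi_\lambda(E)$, $\tilde\pi_\lambda(\tilde E)$ (or $\pi_\lambda(F)$, $\tilde\pi_\lambda(\tilde F)$) is guaranteed due to our “selection rule” for the central characters of $\hat U_q(\mathfrak{sl}(2))$, $\hat U_{\tilde q}(\mathfrak{sl}(2))$.

2.1. Whittaker vectors: Explicit solutions. We shall start with the explicit formulae for the simplest Whittaker vectors corresponding to a particular choice of $\alpha$. Using the representations (1.13), (1.14), we get the following system of difference equations for the vectors $\Phi^{(\alpha)}_\lambda$ with $\alpha = 0, 1$:
\[
\frac{\Phi^{(0)}_\lambda(t - i\omega_1)}{\Phi^{(0)}_\lambda(t)} = 1 - g^{\omega_1} e^{\frac{2\pi t}{\omega_2}}, \tag{2.4a}
\]
\[
\frac{\Phi^{(0)}_\lambda(t - i\omega_2)}{\Phi^{(0)}_\lambda(t)} = 1 - g^{\omega_2} e^{\frac{2\pi t}{\omega_1}}, \tag{2.4b}
\]
\[
\frac{\Phi^{(1)}_\lambda(t - i\omega_1)}{\Phi^{(1)}_\lambda(t)} = \frac{1}{1 - g^{\omega_1} e^{\frac{2\pi t}{\omega_2} + \frac{\pi i \lambda}{\omega_2}}}, \tag{2.4c}
\]
\[
\frac{\Phi^{(1)}_\lambda(t - i\omega_2)}{\Phi^{(1)}_\lambda(t)} = \frac{1}{1 - g^{\omega_2} e^{\frac{2\pi t}{\omega_1} + \frac{\pi i \lambda}{\omega_1}}}. \tag{2.4d}
\]
In a similar way, the Whittaker vectors $\widehat\Phi^{(\alpha)}_\lambda$ with $\alpha = 0, 1$ satisfy the difference equations
\[
\frac{\widehat\Phi^{(0)}_\lambda(t + i\omega_1)}{\widehat\Phi^{(0)}_\lambda(t)} = e^{\frac{2\pi i \lambda}{\omega_2}} \Bigl(1 + q^{-1} g^{\omega_1} e^{-\frac{2\pi t}{\omega_2} - \frac{\pi i \lambda}{\omega_2}}\Bigr), \tag{2.5a}
\]
\[
\frac{\widehat\Phi^{(0)}_\lambda(t + i\omega_2)}{\widehat\Phi^{(0)}_\lambda(t)} = e^{\frac{2\pi i \lambda}{\omega_1}} \Bigl(1 + \tilde q^{-1} g^{\omega_2} e^{-\frac{2\pi t}{\omega_1} - \frac{\pi i \lambda}{\omega_1}}\Bigr), \tag{2.5b}
\]
\[
\frac{\widehat\Phi^{(1)}_\lambda(t + i\omega_1)}{\widehat\Phi^{(1)}_\lambda(t)} = \frac{e^{\frac{2\pi i \lambda}{\omega_2}}}{1 + q^{-1} g^{\omega_1} e^{-\frac{2\pi t}{\omega_2}}}, \tag{2.5c}
\]
\[
\frac{\widehat\Phi^{(1)}_\lambda(t + i\omega_2)}{\widehat\Phi^{(1)}_\lambda(t)} = \frac{e^{\frac{2\pi i \lambda}{\omega_1}}}{1 + \tilde q^{-1} g^{\omega_2} e^{-\frac{2\pi t}{\omega_1}}}. \tag{2.5d}
\]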
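To see how such a difference equation arises, here is the computation for (2.4a), written as a sketch. The symbol $\Phi$ below stands for the E-Whittaker vector with $\alpha = 0$; this name is our notational assumption, the rest uses only (1.13) and (2.2).

```latex
% Definition (2.2) with alpha = 0:
%   pi_lambda(E) Phi = g^{omega_1} / (q - q^{-1}) * Phi .
% Realization (1.13): E -> S_{omega_2}^{-1} (1 - T_{i omega_1}^{-1}) / (q - q^{-1}), hence
\begin{align*}
\bigl(\pi_\lambda(E)\Phi\bigr)(t)
  &= \frac{e^{-\frac{2\pi t}{\omega_2}}}{q - q^{-1}}\,\bigl(\Phi(t) - \Phi(t - i\omega_1)\bigr)
   = \frac{g^{\omega_1}}{q - q^{-1}}\,\Phi(t) \\
\Longrightarrow\quad
\Phi(t) - \Phi(t - i\omega_1) &= g^{\omega_1} e^{\frac{2\pi t}{\omega_2}}\,\Phi(t)
\quad\Longrightarrow\quad
\frac{\Phi(t - i\omega_1)}{\Phi(t)} = 1 - g^{\omega_1} e^{\frac{2\pi t}{\omega_2}} .
\end{align*}
```

The remaining equations follow in the same way; for $\alpha = 1$ the extra factor $e^{\pi i}\pi_\lambda(q^{\pm H})$ moves the shifted value of the vector to the other side, which is why (2.4c) and (2.5c) are reciprocals of expressions of the same shape.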

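Let $S(y)$ be the function defined in terms of the double sine $S_2(y)$ according to (A.17).⁶

Proposition 2.1. The Whittaker vectors satisfying Eqs. (2.4a–2.5d) are given by the following formulae:
\[
\Phi^{(0)}_\lambda(t) = S\Bigl(-it + \omega_1 + \omega_2 - \frac{i\omega_1\omega_2}{2\pi}\log g\Bigr), \tag{2.6a}
\]
\[
\Phi^{(1)}_\lambda(t) = S^{-1}\Bigl(-it + \omega_1 + \omega_2 + \frac{\lambda}{2} - \frac{i\omega_1\omega_2}{2\pi}\log g\Bigr), \tag{2.6b}
\]
\[
\widehat\Phi^{(0)}_\lambda(t) = S\Bigl(it + \tfrac{1}{2}(\omega_1 + \omega_2) - \frac{\lambda}{2} - \frac{i\omega_1\omega_2}{2\pi}\log g\Bigr)\, e^{\frac{2\pi \lambda t}{\omega_1\omega_2}}, \tag{2.6c}
\]
\[
\widehat\Phi^{(1)}_\lambda(t) = S^{-1}\Bigl(it + \tfrac{1}{2}(\omega_1 + \omega_2) - \frac{i\omega_1\omega_2}{2\pi}\log g\Bigr)\, e^{\frac{2\pi \lambda t}{\omega_1\omega_2}}. \tag{2.6d}
\]

⁶ In the main text we shall write $S(y)$ instead of $S(y|\omega)$ for brevity. We omit such dependence for any other function of such type.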


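In a more general way, one can prove the following formulae for the Whittaker vectors $\Phi^{(\alpha)}_\lambda$, $\widehat\Phi^{(\alpha)}_\lambda$ with arbitrary values of $\alpha$:
\[
\Phi^{(\alpha)}_\lambda(t) = \int_{\Gamma_\alpha} c(\zeta)\, \exp\Bigl\{\frac{\pi i (2\alpha - 1)\zeta^{2}}{2\omega_1\omega_2} + \frac{2\pi i \zeta}{\omega_1\omega_2}\Bigl[t + \frac{i\alpha}{2}(\lambda + \omega_1 + \omega_2) + \frac{i}{4}(\omega_1 + \omega_2)\Bigr]\Bigr\}\, d\zeta, \tag{2.7a}
\]
\[
\widehat\Phi^{(\alpha)}_\lambda(t) = e^{\frac{2\pi \lambda t}{\omega_1\omega_2}} \int_{\Gamma_\alpha} c(\zeta)\, \exp\Bigl\{\frac{\pi i (2\alpha - 1)\zeta^{2}}{2\omega_1\omega_2} - \frac{2\pi i \zeta}{\omega_1\omega_2}\Bigl[t + \frac{i(1 - \alpha)}{2}(\lambda + \omega_1 + \omega_2) - \frac{i}{4}(\omega_1 + \omega_2)\Bigr]\Bigr\}\, d\zeta, \tag{2.7b}
\]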

where c(ζ ) ≡ √

g iζ S −1 (−iζ ) ω1 ω2 2

(2.8)

and the contour $\Gamma_\alpha$ is chosen in such a way that it passes above the poles of the integrand and escapes to infinity in the sector where the function $e^{\frac{\pi i \alpha \zeta^{2}}{\omega_1\omega_2}}$ is decaying on the left and in the sector where $e^{\frac{\pi i (\alpha - 1)\zeta^{2}}{\omega_1\omega_2}}$ is decaying on the right. For $\alpha \neq 0, 1$, $\Phi^{(\alpha)}_\lambda$, $\widehat\Phi^{(\alpha)}_\lambda$ are entire functions of the variable $t$; for the degenerate cases $\alpha = 0, 1$, the integrals in (2.7a), (2.7b) may be evaluated explicitly using formulae (A.27) and reduce to (2.6); in these cases both vectors are meromorphic functions of $t$. Let us note that the function $c(\zeta)$ may be regarded as the q-deformed Harish-Chandra function (this term is justified by its role in the asymptotic formulae for the Whittaker functions, see below).

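2.2. Whittaker functions. Now we would like to define the q-deformed Whittaker functions as the matrix elements of Whittaker vectors. As mentioned before, the standard integral (1.17) is divergent in this case. To regularize the integral, one should deform the integration contour in an appropriate way. Therefore, by the scalar product below we mean a suitable regularization of (1.17).

Definition 2.1. Let $\alpha = (\alpha_1, \alpha_2) \in \mathbb{R}^{2}$. The Whittaker functions $w^{(\alpha)}_\lambda(x)$ corresponding to the representation $(\pi_\lambda, \tilde\pi_\lambda)$ of the algebra $D_{\mathrm{mod}}$ are the matrix elements
\[
w^{(\alpha)}_\lambda(x) = e^{-\frac{\pi(\omega_1 + \omega_2)x}{\omega_1\omega_2}} \Bigl(\widehat\Phi^{(\alpha_1)}_\lambda,\; e^{-\frac{\pi x}{\omega_2}H}\,\Phi^{(\alpha_2)}_\lambda\Bigr). \tag{2.9}
\]

Proposition 2.2. The Whittaker functions (2.9) satisfy the equations
\[
w^{(\alpha)}_\lambda(x - i\omega_1) + w^{(\alpha)}_\lambda(x + i\omega_1) + q^{\alpha_1 - \alpha_2} g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}}\, w^{(\alpha)}_\lambda\bigl(x + i(\alpha_1 - \alpha_2)\omega_1\bigr)
= -\Bigl(q\,e^{\frac{\pi i \lambda}{\omega_2}} + q^{-1} e^{-\frac{\pi i \lambda}{\omega_2}}\Bigr) w^{(\alpha)}_\lambda(x), \tag{2.10a}
\]
\[
w^{(\alpha)}_\lambda(x - i\omega_2) + w^{(\alpha)}_\lambda(x + i\omega_2) + \tilde q^{\,\alpha_1 - \alpha_2} g^{2\omega_2} e^{\frac{2\pi x}{\omega_1}}\, w^{(\alpha)}_\lambda\bigl(x + i(\alpha_1 - \alpha_2)\omega_2\bigr)
= -\Bigl(\tilde q\,e^{\frac{\pi i \lambda}{\omega_1}} + \tilde q^{-1} e^{-\frac{\pi i \lambda}{\omega_1}}\Bigr) w^{(\alpha)}_\lambda(x). \tag{2.10b}
\]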


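Let us check (2.10a) formally; we shall discuss the convergence of the integral in (2.9) a little later. Set
\[
F^{(\alpha)}_\lambda(x) = \Bigl(\widehat\Phi^{(\alpha_1)}_\lambda,\; e^{-\frac{\pi x}{\omega_2}H}\,\Phi^{(\alpha_2)}_\lambda\Bigr). \tag{2.11}
\]
The eigenvalue of the Casimir operator $\pi_\lambda(C_2)$ is
\[
C_2 = q\,e^{\frac{\pi i \lambda}{\omega_2}} + q^{-1} e^{-\frac{\pi i \lambda}{\omega_2}}. \tag{2.12}
\]
Therefore,
\[
\Bigl(\widehat\Phi^{(\alpha_1)}_\lambda,\; e^{-\frac{\pi x}{\omega_2}H} C_2\,\Phi^{(\alpha_2)}_\lambda\Bigr)
= \Bigl(q\,e^{\frac{\pi i \lambda}{\omega_2}} + q^{-1} e^{-\frac{\pi i \lambda}{\omega_2}}\Bigr) F^{(\alpha)}_\lambda(x). \tag{2.13}
\]
On the other hand,
\begin{align*}
\Bigl(\widehat\Phi^{(\alpha_1)}_\lambda,\; e^{-\frac{\pi x}{\omega_2}H} C_2\,\Phi^{(\alpha_2)}_\lambda\Bigr)
&= \Bigl(\widehat\Phi^{(\alpha_1)}_\lambda,\; e^{-\frac{\pi x}{\omega_2}H}\bigl[q^{H+1} + q^{-H-1} + (q - q^{-1})^{2} FE\bigr]\Phi^{(\alpha_2)}_\lambda\Bigr) \\
&= \Bigl(\widehat\Phi^{(\alpha_1)}_\lambda,\; e^{-\frac{\pi x}{\omega_2}H}\bigl(q^{H+1} + q^{-H-1}\bigr)\Phi^{(\alpha_2)}_\lambda\Bigr)
 - (q - q^{-1})^{2} e^{\frac{2\pi x}{\omega_2}} \Bigl(F\,\widehat\Phi^{(\alpha_1)}_\lambda,\; e^{-\frac{\pi x}{\omega_2}H} E\,\Phi^{(\alpha_2)}_\lambda\Bigr). \tag{2.14}
\end{align*}
Using the definition of the Whittaker vectors (2.2), (2.3), we obtain
\begin{align*}
\Bigl(\widehat\Phi^{(\alpha_1)}_\lambda,\; e^{-\frac{\pi x}{\omega_2}H} C_2\,\Phi^{(\alpha_2)}_\lambda\Bigr)
&= \Bigl(\widehat\Phi^{(\alpha_1)}_\lambda,\; e^{-\frac{\pi x}{\omega_2}H}\bigl(q^{H+1} + q^{-H-1}\bigr)\Phi^{(\alpha_2)}_\lambda\Bigr) \\
&\quad - e^{\pi i(\alpha_2 - \alpha_1)} g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}} \Bigl(\widehat\Phi^{(\alpha_1)}_\lambda,\; e^{-\frac{\pi x}{\omega_2}H} q^{(\alpha_2 - \alpha_1)H}\,\Phi^{(\alpha_2)}_\lambda\Bigr) \\
&= \Bigl(q\,e^{-i\omega_1\partial_x} + q^{-1} e^{i\omega_1\partial_x} - e^{\pi i(\alpha_2 - \alpha_1)} g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}} e^{i(\alpha_1 - \alpha_2)\omega_1\partial_x}\Bigr) F^{(\alpha)}_\lambda(x). \tag{2.15}
\end{align*}
From (2.13) and (2.15) it follows that the matrix coefficient $F^{(\alpha)}_\lambda$ satisfies the equation
\[
\Bigl(q\,e^{-i\omega_1\partial_x} + q^{-1} e^{i\omega_1\partial_x} - e^{\pi i(\alpha_2 - \alpha_1)} g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}} e^{i(\alpha_1 - \alpha_2)\omega_1\partial_x}\Bigr) F^{(\alpha)}_\lambda(x)
= \Bigl(q\,e^{\frac{\pi i \lambda}{\omega_2}} + q^{-1} e^{-\frac{\pi i \lambda}{\omega_2}}\Bigr) F^{(\alpha)}_\lambda(x). \tag{2.16}
\]
Hence, the function
\[
w^{(\alpha)}_\lambda(x) = e^{-\frac{\pi(\omega_1 + \omega_2)x}{\omega_1\omega_2}}\, F^{(\alpha)}_\lambda(x) \tag{2.17}
\]
satisfies (2.10a).

Corollary 2.1. Let the unitary weight be $\lambda = -i\gamma - \omega_1 - \omega_2$. The Whittaker functions $w^{(\alpha)}_{-i\gamma - \omega_1 - \omega_2} \equiv w^{(\alpha)}_\gamma$ are eigenfunctions of the Hamilton operators
\[
\begin{aligned}
\mathcal{H}^{(\alpha_1 - \alpha_2)} &= e^{i\omega_1\partial_x} + e^{-i\omega_1\partial_x} + q^{\alpha_1 - \alpha_2} g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}} e^{i(\alpha_1 - \alpha_2)\omega_1\partial_x}, \\
\tilde{\mathcal{H}}^{(\alpha_1 - \alpha_2)} &= e^{i\omega_2\partial_x} + e^{-i\omega_2\partial_x} + \tilde q^{\,\alpha_1 - \alpha_2} g^{2\omega_2} e^{\frac{2\pi x}{\omega_1}} e^{i(\alpha_1 - \alpha_2)\omega_2\partial_x}
\end{aligned} \tag{2.18}
\]
with eigenvalues $\varepsilon_\gamma = e^{\frac{\pi\gamma}{\omega_2}} + e^{-\frac{\pi\gamma}{\omega_2}}$ and $\tilde\varepsilon_\gamma = e^{\frac{\pi\gamma}{\omega_1}} + e^{-\frac{\pi\gamma}{\omega_1}}$, respectively.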
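The eigenvalue in Corollary 2.1 can be illustrated numerically. The sketch below is not from the original text: the function names and sample values are assumptions. It encodes the first Hamiltonian in (2.18) as an operator on functions of a complex variable and checks two facts: at zero coupling ($g = 0$) the plane wave $e^{\pi i \gamma x/(\omega_1\omega_2)}$ is an eigenfunction with eigenvalue $\varepsilon_\gamma$, and at the unitary weight $\lambda = -i\gamma - \omega_1 - \omega_2$ the right-hand side factor of (2.10a) equals $\varepsilon_\gamma$.

```python
import cmath

PI = cmath.pi

def epsilon(gamma, w2):
    """Eigenvalue eps_gamma = exp(pi gamma / w2) + exp(-pi gamma / w2) from Corollary 2.1."""
    return cmath.exp(PI * gamma / w2) + cmath.exp(-PI * gamma / w2)

def casimir_value(lam, w1, w2):
    """Scalar value of C_2 in pi_lambda, see (2.12): q z + (q z)^{-1}, z = exp(pi i lam / w2)."""
    q = cmath.exp(1j * PI * w1 / w2)
    z = cmath.exp(1j * PI * lam / w2)
    return q * z + 1 / (q * z)

def toda_hamiltonian(f, eps, g, w1, w2):
    """Apply the first operator of (2.18), with eps = alpha_1 - alpha_2, to the function f."""
    q = cmath.exp(1j * PI * w1 / w2)
    def Hf(x):
        # exp(+-i w1 d/dx) acts as the shift x -> x +- i w1
        kinetic = f(x + 1j * w1) + f(x - 1j * w1)
        potential = q**eps * g**(2 * w1) * cmath.exp(2 * PI * x / w2) * f(x + 1j * eps * w1)
        return kinetic + potential
    return Hf

def plane_wave(gamma, w1, w2):
    """Free solution exp(pi i gamma x / (w1 w2)); the leading term in Lemma 2.3 has this form."""
    return lambda x: cmath.exp(1j * PI * gamma * x / (w1 * w2))
```

With `g = 0` the potential term vanishes and only the two shifts remain, which is why the plane wave reproduces $\varepsilon_\gamma$ exactly; for $g > 0$ the genuine eigenfunctions are the Whittaker functions, which this sketch does not compute.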


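Operators $\mathcal{H}^{(\alpha)}$, $\tilde{\mathcal{H}}^{(\alpha)}$ are the two dual Hamiltonians of the q-deformed 2-particle Toda chain. If $\omega_1$, $\omega_2$ are real, both $\mathcal{H}^{(\alpha)}$ and $\tilde{\mathcal{H}}^{(\alpha)}$ are essentially self-adjoint in the space of smooth functions on the line which decrease faster than $e^{-|\omega x|}$, where $\omega = \max(\omega_1, \omega_2)$. When $\omega_1 = \bar\omega_2$, “physical” self-adjoint Hamiltonians are $\mathcal{H}^{(\alpha)} + \tilde{\mathcal{H}}^{(\alpha)}$ and $i(\mathcal{H}^{(\alpha)} - \tilde{\mathcal{H}}^{(\alpha)})$.

Using the explicit formulae for the Whittaker vectors $\Phi^{(\alpha)}_\lambda$, $\widehat\Phi^{(\alpha)}_\lambda$, we may express the Whittaker functions $w^{(\alpha)}_\gamma$ in integral form:
\begin{align*}
w^{(\alpha)}_\gamma(x) &= N_\gamma\, e^{-\frac{\pi(\omega_1 + \omega_2)x}{\omega_1\omega_2}} \Bigl(\widehat\Phi^{(\alpha_1)}_{-i\gamma - \omega_1 - \omega_2},\; e^{-\frac{\pi x}{\omega_2}H}\,\Phi^{(\alpha_2)}_{-i\gamma - \omega_1 - \omega_2}\Bigr) \\
&= N_\gamma\, e^{\frac{\pi i \gamma x}{\omega_1\omega_2}} \int e^{\frac{2\pi(\omega_1 + \omega_2)t}{\omega_1\omega_2}}\; \overline{\widehat\Phi^{(\alpha_1)}_{-i\gamma - \omega_1 - \omega_2}(t)}\; \Phi^{(\alpha_2)}_{-i\gamma - \omega_1 - \omega_2}(t + x)\, dt, \tag{2.19}
\end{align*}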

where one introduces the normalization factor Nγ for future convenience as follows: Nγ =

1 ω1 ω2

πi

e− 2 [B2,2 (iγ )−B2,2 (0)]

(2.20)

with the polynomial $B_{2,2}(z)$ defined by (A.4). In particular, substituting in (2.19) the expressions (2.6b), (2.6c), and using (A.21), we get for the Whittaker function $w^{(0,1)}_\gamma \equiv w^{(-)}_\gamma$ the integral representation
\begin{align*}
w^{(-)}_\gamma(x) = N_\gamma\, e^{\frac{\pi i \gamma x}{\omega_1\omega_2}} \int_{C_-} & S^{-1}\Bigl(it + \frac{i}{2}\gamma - \frac{i\omega_1\omega_2}{2\pi}\log g\Bigr) \\
\times\; & S^{-1}\Bigl(-it - ix - \frac{i}{2}\gamma + \tfrac{1}{2}(\omega_1 + \omega_2) - \frac{i\omega_1\omega_2}{2\pi}\log g\Bigr)\, e^{\frac{2\pi i \gamma t}{\omega_1\omega_2}}\, dt, \tag{2.21}
\end{align*}
where the contour $C_-$ belongs asymptotically to the sectors
\[
\begin{aligned}
\arg\omega_1 + \frac{\pi}{2} &< \arg t < \tfrac{1}{2}(\arg\omega_1 + \arg\omega_2) + \pi, \\
\arg\omega_1 - \frac{\pi}{2} &< \arg t < \tfrac{1}{2}(\arg\omega_1 + \arg\omega_2)
\end{aligned} \tag{2.22}
\]
(at this point one can relax the “physical” constraints imposed on parameters $\omega_1$, $\omega_2$) and lies between the two sets of poles of the integrand:
\[
\begin{aligned}
t^{(-)}_{n_1,n_2} &= -\frac{\gamma}{2} + \frac{\omega_1\omega_2}{2\pi}\log g + i(n_1\omega_1 + n_2\omega_2), & n_1, n_2 &\ge 0, \\
t^{(-)}_{m_1,m_2}(x) &= -x - \frac{\gamma}{2} - \frac{\omega_1\omega_2}{2\pi}\log g - i\bigl[(m_1 + \tfrac{1}{2})\omega_1 + (m_2 + \tfrac{1}{2})\omega_2\bigr], & m_1, m_2 &\ge 0.
\end{aligned}
\]
(See (A.12), (A.17) for the description of the poles and the zeros of $S(y)$.) The choice of the integration contour assures convergence and provides a natural regularization of the divergent inner product. Indeed, to see that the integral in (2.21) is well defined observe that due to (A.24) the integrand has the asymptotics
\[
e^{-\frac{\pi i t^{2}}{\omega_1\omega_2} + t(\ldots)} .
\]
But in sectors (2.22) the quadratic exponential decreases. Hence, the integral (2.21) is absolutely convergent.
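For orientation, the decay claim can be made explicit in the “physical” case of real positive $\omega_1$, $\omega_2$. The following is a sketch under that assumption; only the leading quadratic term is tracked and the linear terms $t(\ldots)$ are ignored.

```latex
% Write t = r e^{i\theta}, r > 0. Since Re(-i w) = Im(w) for any complex w,
\[
\Bigl| e^{-\frac{\pi i t^{2}}{\omega_1\omega_2}} \Bigr|
  = \exp\Bigl( \frac{\pi\, \mathrm{Im}(t^{2})}{\omega_1\omega_2} \Bigr)
  = \exp\Bigl( \frac{\pi r^{2} \sin 2\theta}{\omega_1\omega_2} \Bigr).
\]
% For arg(omega_1) = arg(omega_2) = 0 the sectors (2.22) read
%   pi/2 < theta < pi   and   -pi/2 < theta < 0,
% and in both of them sin(2 theta) < 0, so the modulus decays like
% exp(-c r^2) with c = pi |sin(2 theta)| / (omega_1 omega_2) > 0.
```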

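In a similar way, the function $w^{(1,0)}_\gamma \equiv w^{(+)}_\gamma$ corresponding to (2.6a), (2.6d) admits the integral representation
\begin{align*}
w^{(+)}_\gamma(x) = N_\gamma\, e^{\frac{\pi i \gamma x}{\omega_1\omega_2}} \int_{C_+} & S\Bigl(it + \tfrac{1}{2}(\omega_1 + \omega_2) - \frac{i\omega_1\omega_2}{2\pi}\log g\Bigr) \\
\times\; & S\Bigl(-it - ix + \omega_1 + \omega_2 - \frac{i\omega_1\omega_2}{2\pi}\log g\Bigr)\, e^{\frac{2\pi i \gamma t}{\omega_1\omega_2}}\, dt, \tag{2.23}
\end{align*}
where the contour $C_+$ belongs asymptotically to the sectors
\[
\begin{aligned}
\tfrac{1}{2}(\arg\omega_1 + \arg\omega_2) + \pi &< \arg t < \arg\omega_2 + \frac{3\pi}{2}, \\
\tfrac{1}{2}(\arg\omega_1 + \arg\omega_2) &< \arg t < \arg\omega_2 + \frac{\pi}{2},
\end{aligned} \tag{2.24}
\]
and lies between the two sets of poles of the integrand:
\[
\begin{aligned}
t^{(+)}_{n_1,n_2} &= \frac{\omega_1\omega_2}{2\pi}\log g - i\bigl[(n_1 + \tfrac{1}{2})\omega_1 + (n_2 + \tfrac{1}{2})\omega_2\bigr], & n_1, n_2 &\ge 0, \\
t^{(+)}_{m_1,m_2}(x) &= -x - \frac{\omega_1\omega_2}{2\pi}\log g + i(m_1\omega_1 + m_2\omega_2), & m_1, m_2 &\ge 0.
\end{aligned}
\]
The integral (2.23) is absolutely convergent.

Quite similarly, one can construct the function $w^{(0,0)}_\gamma(x) \equiv w^{(0)}_\gamma(x)$ using the Whittaker vectors $\widehat\Phi^{(0)}_\lambda$ and $\Phi^{(0)}_\lambda$:
\begin{align*}
w^{(0)}_\gamma(x) = N_\gamma\, e^{\frac{\pi i \gamma x}{\omega_1\omega_2}} \int_{C_0} & S^{-1}\Bigl(it + \frac{i}{2}\gamma - \frac{i\omega_1\omega_2}{2\pi}\log g\Bigr) \\
\times\; & S\Bigl(-it - ix + \omega_1 + \omega_2 - \frac{i\omega_1\omega_2}{2\pi}\log g\Bigr)\, e^{\frac{2\pi i \gamma t}{\omega_1\omega_2}}\, dt, \tag{2.25}
\end{align*}
where the contour $C_0$ belongs asymptotically to the sectors
\[
\begin{aligned}
\arg\omega_1 + \frac{\pi}{2} &< \arg t < \tfrac{1}{2}(\arg\omega_1 + \arg\omega_2) + \pi, \\
\tfrac{1}{2}(\arg\omega_1 + \arg\omega_2) &< \arg t < \arg\omega_2 + \frac{\pi}{2}
\end{aligned} \tag{2.26}
\]
and lies below the poles of the integrand. Thus, the functions (2.21), (2.23), and (2.25) are the eigenfunctions of the corresponding spectral problems,
\[
\Bigl(1 + q^{-1} g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}}\Bigr) w^{(-)}_\gamma(x - i\omega_1) + w^{(-)}_\gamma(x + i\omega_1) = \Bigl(e^{\frac{\pi\gamma}{\omega_2}} + e^{-\frac{\pi\gamma}{\omega_2}}\Bigr) w^{(-)}_\gamma(x), \tag{2.27}
\]
\[
w^{(+)}_\gamma(x - i\omega_1) + \Bigl(1 + q\, g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}}\Bigr) w^{(+)}_\gamma(x + i\omega_1) = \Bigl(e^{\frac{\pi\gamma}{\omega_2}} + e^{-\frac{\pi\gamma}{\omega_2}}\Bigr) w^{(+)}_\gamma(x), \tag{2.28}
\]
\[
w^{(0)}_\gamma(x - i\omega_1) + w^{(0)}_\gamma(x + i\omega_1) + g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}}\, w^{(0)}_\gamma(x) = \Bigl(e^{\frac{\pi\gamma}{\omega_2}} + e^{-\frac{\pi\gamma}{\omega_2}}\Bigr) w^{(0)}_\gamma(x). \tag{2.29}
\]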


Besides, these solutions are the eigenfunctions for the dual spectral problems where ω1 ↔ ω2 . (±) The solutions wγ described above appear to be close to the q-Macdonald functions of the first and second kind which arise in the context of relativistic Toda chain [25]. However, the deformations of the Macdonald function have been investigated in the framework of the standard q-analysis [26] for the typical region |q| < 1 (which evidently fails in the case |q| = 1) and without any reference to the dual symmetry. Formulae (2.19) will be referred to as the Gauss–Euler representation for Whittaker (C) functions. The integral representations for wγ (x), (C = 0, ±1) are the degenerations of a more general q-hypergeometric function [27]. We shall see later that the technique of QISM yields a different integral representation for Whittaker functions which is a q-deformation of the Mellin–Barnes integrals.

2.3. Analytic properties. Let us give the summary of the analytic properties of the Whittaker functions which may be derived directly from the Gauss–Euler representation.

Lemma 2.1. $w^{(\pm)}_\gamma$ and $w^{(0)}_\gamma$ can be extended to entire functions in γ ∈ C. As a function of x ∈ C, $w^{(-)}_\gamma(x)$ has poles at

$$x = -\frac{\omega_1\omega_2}{\pi}\log g - i\bigl(k_1 + \tfrac12\bigr)\omega_1 - i\bigl(k_2 + \tfrac12\bigr)\omega_2, \qquad k_1, k_2 \ge 0. \tag{2.30}$$

Similarly, the function $w^{(+)}_\gamma(x)$ has poles at

$$x = -\frac{\omega_1\omega_2}{\pi}\log g + i\bigl(k_1 + \tfrac12\bigr)\omega_1 + i\bigl(k_2 + \tfrac12\bigr)\omega_2, \qquad k_1, k_2 \ge 0. \tag{2.31}$$

The function $w^{(0)}_\gamma(x)$ is entire in x ∈ C.

Lemma 2.2.

$$w^{(+)}_\gamma(x) = S\Bigl(-ix + \tfrac12(\omega_1 + \omega_2) - \frac{i\omega_1\omega_2}{\pi}\log g\Bigr)\, w^{(-)}_\gamma(x). \tag{2.32}$$

Lemma 2.3. For any γ ∈ C such that

$$\arg\gamma \notin \Bigl[\arg\omega_2 - \frac{\pi}{2},\ \arg\omega_1 - \frac{\pi}{2}\Bigr] \cup \Bigl[\arg\omega_2 + \frac{\pi}{2},\ \arg\omega_1 + \frac{\pi}{2}\Bigr], \tag{2.33}$$

the following asymptotics holds as x tends to infinity in the sector

$$\arg\omega_1 + \frac{\pi}{2} < \arg x < \arg\omega_2 + \frac{3\pi}{2}: \tag{2.34}$$

$$w^{(C)}_\gamma(x) = c(\gamma)\, e^{\frac{\pi i\gamma x}{\omega_1\omega_2}}\bigl(1 + o(1)\bigr) + c(-\gamma)\, e^{-\frac{\pi i\gamma x}{\omega_1\omega_2}}\bigl(1 + o(1)\bigr) \tag{2.35}$$

(C = 0, ±), where the function c(γ) is defined by (2.8). We shall call c(γ) the quantum Harish-Chandra function associated with $U_q(sl(2, R))$.


2.4. Mellin–Barnes representation. To make a comparison with the formulae provided by the Quantum Inverse Scattering Method we need a different integral representation of the Whittaker functions. Put

$$\psi^{(C)}_\gamma(x) = e^{\frac{\pi i\gamma x}{\omega_1\omega_2}} \int_{C_C} c(\zeta)\, c(\zeta + \gamma)\, e^{-\frac{\pi i C}{\omega_1\omega_2}[\zeta^2 + \gamma\zeta]}\, e^{\frac{2\pi i\zeta x}{\omega_1\omega_2}}\, d\zeta, \tag{2.36}$$

where the contour $C_C$ is above the poles of the integrand and belongs in the left (right) half-plane in ζ ∈ C to the sectors where the exponential $e^{-\frac{\pi(C-1)\zeta^2}{\omega_1\omega_2}}$ (respectively $e^{-\frac{\pi(C+1)\zeta^2}{\omega_1\omega_2}}$) quadratically vanishes. The integral (2.36) is absolutely convergent for any x ∈ C provided C ≠ ±1. In the degenerate case C = −1 the integral is convergent provided that

$$\arg x \notin \Bigl[\arg\omega_2 - \frac{\pi}{2},\ \arg\omega_1 - \frac{\pi}{2}\Bigr], \tag{2.37}$$

while for C = 1 it is defined in the region

$$\arg x \notin \Bigl[\arg\omega_2 + \frac{\pi}{2},\ \arg\omega_1 + \frac{\pi}{2}\Bigr]. \tag{2.38}$$

Using the properties of the double sine it can be directly verified that the function $\psi^{(C)}_\gamma(x)$ satisfies Eqs. (2.10), where α1 − α2 = C.

Proposition 2.3. For C = 0, ±1,

$$w^{(C)}_\gamma(x) = \psi^{(C)}_\gamma(x). \tag{2.39}$$

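The expression (2.36) will be referred to as the (q-deformed) Mellin–Barnes representation for Whittaker functions. It will be shown below that this is the representation which can be easily generalized to the N-particle q-deformed Toda chain.

2.5. Limit to SL(2, R) Toda chain. Let ωk > 0 (k = 1, 2). Suppose that the “coupling constant” g(ω) has the asymptotics

$$g^{\omega_1}(\omega) = \frac{2\pi}{\omega_2}\bigl[1 + O(\omega_2^{-1})\bigr] \qquad (\omega_2 \to \infty). \tag{2.40}$$

For example, the simplest (and standard) choice $g^{\omega_1}(\omega) = \frac{q - q^{-1}}{i\omega_1}$ satisfies this condition. After the rescaling $x \to \frac{\omega_2}{\pi}x$, Eqs. (2.27), (2.28), and (2.29) take the form

$$\Bigl[\bigl(1 + q^{-1} g^{2\omega_1} e^{2x}\bigr)\, e^{-\frac{\pi i\omega_1}{\omega_2}\partial_x} + e^{\frac{\pi i\omega_1}{\omega_2}\partial_x}\Bigr] w^{(-)}_\gamma(x) = \bigl(e^{\frac{\pi\gamma}{\omega_2}} + e^{-\frac{\pi\gamma}{\omega_2}}\bigr)\, w^{(-)}_\gamma(x), \tag{2.41a}$$

$$\Bigl[e^{-\frac{\pi i\omega_1}{\omega_2}\partial_x} + \bigl(1 + q\, g^{2\omega_1} e^{2x}\bigr)\, e^{\frac{\pi i\omega_1}{\omega_2}\partial_x}\Bigr] w^{(+)}_\gamma(x) = \bigl(e^{\frac{\pi\gamma}{\omega_2}} + e^{-\frac{\pi\gamma}{\omega_2}}\bigr)\, w^{(+)}_\gamma(x), \tag{2.41b}$$

$$\Bigl[e^{\frac{\pi i\omega_1}{\omega_2}\partial_x} + e^{-\frac{\pi i\omega_1}{\omega_2}\partial_x} + g^{2\omega_1} e^{2x}\Bigr] w^{(0)}_\gamma(x) = \bigl(e^{\frac{\pi\gamma}{\omega_2}} + e^{-\frac{\pi\gamma}{\omega_2}}\bigr)\, w^{(0)}_\gamma(x). \tag{2.41c}$$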


In the limit (2.40) the Eqs. (2.41) are reduced to the SL(2, R) Toda equation

$$\bigl(p^2 + 4e^{2x}\bigr)\, w_\gamma(x) = \gamma^2\, w_\gamma(x), \tag{2.42}$$

where $p = -i\omega_1\partial_x$ and ω1 plays the role of the Planck constant. Note that the more general equation (2.10a) has the same limit (2.42). The solution to (2.42) with appropriate asymptotic behavior is written in terms of the Macdonald function

$$w_\gamma(x) = K_{\frac{\gamma}{i\omega_1}}\Bigl(\frac{2}{\omega_1}e^x\Bigr). \tag{2.43}$$
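The statement that the Macdonald function (2.43) solves the Toda equation (2.42) can be checked numerically. The following is only a sketch under stated assumptions: it takes real γ and ω1 > 0, evaluates $K_{i\mu}(y)$ from the classical integral $K_\nu(y) = \int_0^\infty e^{-y\cosh t}\cosh(\nu t)\,dt$ by Simpson's rule, and approximates $\partial_x^2$ by a central difference. The function names `macdonald_imag_order`, `w` and `toda_residual` are illustrative and do not come from the paper.

```python
import math


def macdonald_imag_order(mu, y, t_max=12.0, steps=6000):
    """K_{i*mu}(y) for real mu and y > 0.

    Uses K_nu(y) = int_0^inf exp(-y cosh t) cosh(nu t) dt with nu = i*mu,
    so that cosh(nu t) = cos(mu t); composite Simpson rule on [0, t_max].
    """
    h = t_max / steps  # steps must be even for Simpson's rule

    def f(t):
        return math.exp(-y * math.cosh(t)) * math.cos(mu * t)

    total = f(0.0) + f(t_max)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(k * h)
    return total * h / 3.0


def w(x, gamma, omega1):
    """Candidate solution (2.43): K_{gamma/(i omega1)}(2 e^x / omega1).

    K_nu is even in nu, so the order gamma/(i omega1) = -i gamma/omega1
    can be replaced by i gamma/omega1.
    """
    return macdonald_imag_order(gamma / omega1, 2.0 * math.exp(x) / omega1)


def toda_residual(x, gamma, omega1, h=1e-3):
    """Residual of (p^2 + 4 e^{2x}) w - gamma^2 w with p = -i omega1 d/dx."""
    w0 = w(x, gamma, omega1)
    second = (w(x + h, gamma, omega1) - 2.0 * w0 + w(x - h, gamma, omega1)) / h**2
    return -omega1**2 * second + 4.0 * math.exp(2.0 * x) * w0 - gamma**2 * w0


if __name__ == "__main__":
    for x, gamma, omega1 in [(0.0, 1.0, 1.0), (-0.5, 1.3, 0.7)]:
        print(x, gamma, omega1, toda_residual(x, gamma, omega1))
```

The residual is limited by the finite-difference step rather than by the quadrature, so it should be small but not at machine precision.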

Lemma 2.4.

$$\lim_{\omega_2\to\infty} \psi^{(C)}_\gamma\Bigl(\frac{\omega_2}{\pi}x\Bigr) = \frac{1}{\pi\omega_1}\, K_{\frac{\gamma}{i\omega_1}}\Bigl(\frac{2}{\omega_1}e^x\Bigr). \tag{2.44}$$

Proof. Using the formula

$$\lim_{\omega_2\to\infty} \sqrt{2\pi}\,\Bigl(\frac{2\pi\omega_1}{\omega_2}\Bigr)^{\frac12 - \frac{z}{\omega_1}} S_2^{-1}(z) = \Gamma\Bigl(\frac{z}{\omega_1}\Bigr), \tag{2.45}$$

proved in [28], one easily finds that the quantum Harish-Chandra function (2.8) reduces to the usual Γ-function:

$$\lim_{\omega_2\to\infty} c(\zeta) = \frac{1}{2\pi\omega_1}\, \omega_1^{\frac{\zeta}{i\omega_1}}\, \Gamma\Bigl(\frac{\zeta}{i\omega_1}\Bigr), \tag{2.46}$$

provided that asymptotics (2.40) holds. (This function is closely related to the standard Harish-Chandra function for the Toda chain [6]; the difference with the usual definition is due to a different normalization of solutions.) Hence, in the limit ω2 → ∞ the rescaled function (2.36) takes the form

$$\lim_{\omega_2\to\infty} \psi^{(C)}_\gamma\Bigl(\frac{\omega_2}{\pi}x\Bigr) = \frac{1}{\pi\omega_1}\Biggl[\frac{1}{4\pi\omega_1}\int \Bigl(\frac{e^x}{\omega_1}\Bigr)^{\frac{2i\zeta + i\gamma}{\omega_1}} \Gamma\Bigl(\frac{\zeta}{i\omega_1}\Bigr)\, \Gamma\Bigl(\frac{\zeta + \gamma}{i\omega_1}\Bigr)\, d\zeta\Biggr], \tag{2.47}$$

where the contour is parallel to the real axis and passes above the poles of the integrand. The expression in brackets is exactly the Macdonald function (2.43) in the Mellin–Barnes representation. □

3. N-Particle q-Toda Chain and Duality

The extension of the formalism described above to the case of the N-particle Toda chain may be performed directly with the help of the “free field representation” for $U_q(sl(N, R))$, i.e., the homomorphism of $U_q(sl(N, R))$ into an appropriate multidimensional quantum torus. Instead, we shall describe a different approach based on the “lattice Lax representation with spectral parameter”. As usual, the Lax representation allows one to construct quantum Hamiltonians for a family of related systems: the periodic Toda chain, the open Toda chain, as well as different degenerate systems obtained by removing some of the potential terms from the Hamiltonians. Of course, the choice of the model


in question depends on our choice of the quantum R-matrix. The obvious choice is the standard trigonometric 4 × 4 R-matrix; to get more freedom in the choice of the model we may use twisted trigonometric R-matrices. In all cases, there is a natural homomorphism of the corresponding quantum algebra into the tensor product of noncommutative tori; this allows us to introduce the corresponding dual system realized by means of the natural representation of the product of modular dual quantum tori in the same Hilbert space. The entire picture of modular duality is thus fully generalized to the N-particle case. We would like to point out that in the R-matrix formalism it is more convenient to work with $U_q(gl(N, R))$ and reduce the final formulae to the case of $U_q(sl(N, R))$ in the standard way. The main advantage provided by the use of the lattice Lax representation is the possibility to get inductive integral representations for the wave functions in question and to generalize the above construction to the periodic case, as was done in [29].

3.1. The models. The q-Toda chain, or relativistic Toda chain (RTC), was introduced by Ruijsenaars [18]. The periodic chain can be described by the Hamiltonian

$$H_1(x_1, p_1; \ldots; x_N, p_N) = \sum_{n=1}^{N} \bigl(1 + q^{-1} g^{2\omega_1} e^{\frac{2\pi}{\omega_2}(x_n - x_{n+1})}\bigr)\, e^{\omega_1 p_n}, \tag{3.1}$$

where $x_n$, $p_n$ are the canonical coordinates and momenta with standard commutation relations $[x_n, p_m] = i\delta_{nm}$ and the boundary condition $x_{N+1} = x_1$ is imposed. The system has exactly N mutually commuting Hamiltonians (the polynomial functions of the Weyl variables $u_n^{\pm1} = e^{\pm\frac{2\pi x_n}{\omega_2}}$, $v_n = e^{\omega_1 p_n}$)7.

Guided by the notion of the modular double considered above, one can define the dual system which is determined by the Hamiltonian

$$\tilde H_1(x_1, p_1; \ldots; x_N, p_N) = \sum_{n=1}^{N} \bigl(1 + \tilde q^{-1} g^{2\omega_2} e^{\frac{2\pi}{\omega_1}(x_n - x_{n+1})}\bigr)\, e^{\omega_2 p_n} \tag{3.2}$$

with the same boundary condition. It is evident that the systems mutually commute. Analogously, the open relativistic Toda chain and its dual system are defined by the Hamiltonians

$$h_1(x_1, p_1; \ldots; x_N, p_N) = \sum_{n=1}^{N} \bigl(1 + q^{-1} g^{2\omega_1} e^{\frac{2\pi}{\omega_2}(x_n - x_{n+1})}\bigr)\, e^{\omega_1 p_n} \tag{3.3}$$

and

$$\tilde h_1(x_1, p_1; \ldots; x_N, p_N) = \sum_{n=1}^{N} \bigl(1 + \tilde q^{-1} g^{2\omega_2} e^{\frac{2\pi}{\omega_1}(x_n - x_{n+1})}\bigr)\, e^{\omega_2 p_n} \tag{3.4}$$

respectively, with the boundary condition $x_{N+1} \equiv \infty$. Similarly to the periodic case, each open system possesses exactly N mutually commuting Hamiltonians. Moreover, the Hamiltonians of the dual system commute with those of the original one. The basic goal of the present section is to construct the explicit integral representation of the common eigenfunctions for all Hamiltonians in the case of the open N-particle RTC. This will be done in the framework of the QISM approach for the periodic RTC.

7 Higher Hamiltonians will be described below using the standard Lax formalism.


3.2. Twisted trigonometric R-matrix. In order to investigate the relativistic Toda chain using the quantum version of the corresponding classical Lax matrix [30], one needs to introduce the notion of the twisted R-matrix [31]. Let

$$R(z/w) = \frac{1}{z^2 - w^2} \begin{pmatrix} qz^2 - q^{-1}w^2 & 0 & 0 & 0 \\ 0 & z^2 - w^2 & (q - q^{-1})zw & 0 \\ 0 & (q - q^{-1})zw & z^2 - w^2 & 0 \\ 0 & 0 & 0 & qz^2 - q^{-1}w^2 \end{pmatrix} \tag{3.5}$$

be the R-matrix in the principal gradation satisfying the standard Yang–Baxter equation. Consider the twisting of the R-matrix (3.5):

$$R_\theta(z/w) = F_{21}(\theta)\, R(z/w)\, F_{12}^{-1}(\theta) \tag{3.6}$$

with

$$F_{12}(\theta) \equiv F_{21}^{-1}(\theta) = \exp\Bigl\{\frac{\theta}{4}\bigl(1 \otimes \sigma_3 - \sigma_3 \otimes 1\bigr)\Bigr\}, \tag{3.7}$$

where σ3 is the Pauli matrix. One gets

$$R_\theta(z/w) = \frac{1}{z^2 - w^2} \begin{pmatrix} a(z, w) & 0 & 0 & 0 \\ 0 & b(z, w) & c(z, w) & 0 \\ 0 & c(z, w) & \tilde b(z, w) & 0 \\ 0 & 0 & 0 & a(z, w) \end{pmatrix}, \tag{3.8}$$

where

$$a(z, w) = qz^2 - q^{-1}w^2, \quad b(z, w) = e^{\theta}(z^2 - w^2), \quad \tilde b(z, w) = e^{-\theta}(z^2 - w^2), \quad c(z, w) = (q - q^{-1})zw. \tag{3.9}$$

It is easy to verify that $R_\theta(z/w)$ satisfies the same Yang–Baxter equation as R(z/w). A quantum Lax operator L(z) is, by definition, a 2 × 2 matrix

$$L(z) = \begin{pmatrix} L_{11}(z) & L_{12}(z) \\ L_{21}(z) & L_{22}(z) \end{pmatrix} \tag{3.10}$$

with operator-valued entries which satisfies the fundamental commutation relations

$$R_\theta(z/w)\, L(z) \otimes L(w) = \bigl(1 \otimes L(w)\bigr)\bigl(L(z) \otimes 1\bigr)\, R_\theta(z/w). \tag{3.11}$$

We define the quantum determinant of the matrix (3.10) by the formula

$$\det{}_q L(z) = L_{11}(zq^{1/2})\, L_{22}(zq^{-1/2}) - e^{\theta} L_{12}(zq^{1/2})\, L_{21}(zq^{-1/2}). \tag{3.12}$$
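The claim that the twisted matrix $R_\theta$ of (3.8), (3.9) satisfies the same Yang–Baxter equation as R can be tested numerically. The sketch below assumes the standard form $R_{12}(z_1/z_2)R_{13}(z_1/z_3)R_{23}(z_2/z_3) = R_{23}(z_2/z_3)R_{13}(z_1/z_3)R_{12}(z_1/z_2)$ on $(\mathbb{C}^2)^{\otimes 3}$ and the basis order ++, +−, −+, −−; the helper names `twisted_r`, `embed`, `matmul` and `ybe_residual` are illustrative only.

```python
import cmath


def twisted_r(z, w, q, theta):
    """4x4 twisted R-matrix of (3.8), (3.9); basis order ++, +-, -+, --."""
    a = q * z * z - w * w / q
    b = z * z - w * w
    c = (q - 1 / q) * z * w
    e = cmath.exp(theta)
    r = [[0j] * 4 for _ in range(4)]
    r[0][0] = r[3][3] = a / b
    r[1][1] = e          # e^{theta} (z^2 - w^2) / (z^2 - w^2)
    r[2][2] = 1 / e      # e^{-theta} (z^2 - w^2) / (z^2 - w^2)
    r[1][2] = r[2][1] = c / b
    return r


def embed(r, p, s):
    """Let the 4x4 matrix r act on tensor factors p < s (0-based) of C^2 x C^2 x C^2."""
    spectator = 3 - p - s
    m = [[0j] * 8 for _ in range(8)]
    for out in range(8):
        ob = [(out >> 2) & 1, (out >> 1) & 1, out & 1]
        for inp in range(8):
            ib = [(inp >> 2) & 1, (inp >> 1) & 1, inp & 1]
            if ob[spectator] == ib[spectator]:
                m[out][inp] = r[2 * ob[p] + ob[s]][2 * ib[p] + ib[s]]
    return m


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def ybe_residual(z1, z2, z3, q, theta):
    """Largest entry of R12 R13 R23 - R23 R13 R12 in absolute value."""
    r12 = embed(twisted_r(z1, z2, q, theta), 0, 1)
    r13 = embed(twisted_r(z1, z3, q, theta), 0, 2)
    r23 = embed(twisted_r(z2, z3, q, theta), 1, 2)
    lhs = matmul(matmul(r12, r13), r23)
    rhs = matmul(matmul(r23, r13), r12)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(8) for j in range(8))


if __name__ == "__main__":
    q = cmath.exp(0.3j * cmath.pi)
    # e^{theta} = q is the choice of Proposition 3.1; generic theta works as well.
    print(ybe_residual(1.3, 0.7 + 0.2j, 0.4 - 0.5j, q, cmath.log(q)))
    print(ybe_residual(1.3, 0.7 + 0.2j, 0.4 - 0.5j, 0.8 + 0.3j, 0.37 - 0.2j))
```

A residual at the level of rounding error for generic complex $z_1, z_2, z_3, q, \theta$ is consistent with the statement; it is of course not a proof.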


3.3. Lax operator and monodromy matrix. As usual in the Quantum Inverse Scattering Method, the entries of the quantum Lax operator generate the basic Hopf algebra $A_R$ (defined implicitly by the fundamental commutation relation (3.11)) which underlies all the associated quantum integrable systems; to get a particular system, we need to fix its representation. The representation which yields the q-deformed Toda chain is provided by the following construction. Let ω1, ω2 ∈ C. We consider a lattice system with local quantum Lax operators

$$L_n(z) = \begin{pmatrix} z - z^{-1} e^{\omega_1 p_n} & g^{\omega_1} e^{-\frac{2\pi x_n}{\omega_2}} \\ -g^{\omega_1} e^{\frac{2\pi x_n}{\omega_2} + \omega_1 p_n} & 0 \end{pmatrix}, \tag{3.13}$$

where $x_n$, $p_n$ are the canonical coordinates and momenta with the commutation relations $[x_n, p_m] = i\delta_{nm}$ and g is a real parameter (possibly depending on ω). On the classical level the Lax matrices (3.13) have been introduced in [30].

Proposition 3.1. The Lax operator (3.13) satisfies the commutation relations (3.11) with the quantum R-matrix (3.8), (3.9), where

$$q = e^{i\pi\frac{\omega_1}{\omega_2}}, \tag{3.14}$$

and

$$e^{\theta} = q. \tag{3.15}$$

The monodromy matrix for the N-periodic chain is defined in the standard way:

$$T_N(z) = L_N(z) \cdots L_1(z) \equiv \begin{pmatrix} A_N(z) & B_N(z) \\ C_N(z) & D_N(z) \end{pmatrix}. \tag{3.16}$$

By the usual Hopf algebra properties, the entries of T(z) satisfy the same commutation relations as the corresponding entries of the Lax operators. The quantum determinant of the Lax operator (3.13) is

$$\det{}_q L_n(z) = g^{2\omega_1} e^{\omega_1 p_n}. \tag{3.17}$$

It is simple to show that the quantum determinant of the monodromy matrix

$$\det{}_q T_N(z) = A_N(zq^{1/2})\, D_N(zq^{-1/2}) - q\, B_N(zq^{1/2})\, C_N(zq^{-1/2}) \tag{3.18}$$

obeys the property

$$\det{}_q T_N(z) = \det{}_q L_N(z) \cdots \det{}_q L_1(z). \tag{3.19}$$

Hence, due to (3.17),

$$\det{}_q T_N(z) = g^{2N\omega_1} \prod_{n=1}^{N} e^{\omega_1 p_n}. \tag{3.20}$$

Note that in the twisted case the quantum determinant is no longer a central element in the quantum algebra.


Following the same line of argument as in Sect. 1, we may introduce the modular dual system by

$$\tilde L_n(z) = \begin{pmatrix} z - z^{-1} e^{\omega_2 p_n} & g^{\omega_2} e^{-\frac{2\pi x_n}{\omega_1}} \\ -g^{\omega_2} e^{\frac{2\pi x_n}{\omega_1} + \omega_2 p_n} & 0 \end{pmatrix}. \tag{3.21}$$

The operator (3.21) satisfies the commutation relation (3.11) with the twisted R-matrix (3.8), (3.9) with the only change $q \to \tilde q$, $\theta \to \tilde\theta$, where $\tilde q = e^{\tilde\theta} = e^{i\pi\frac{\omega_2}{\omega_1}}$. The dual monodromy matrix is defined by

$$\tilde T_N(z) = \tilde L_N(z) \cdots \tilde L_1(z) \equiv \begin{pmatrix} \tilde A_N(z) & \tilde B_N(z) \\ \tilde C_N(z) & \tilde D_N(z) \end{pmatrix}. \tag{3.22}$$

The system described by the Lax operators (3.13), (3.21) may be referred to as the modular relativistic Toda chain.

3.4. Hamiltonians. As usual, the transfer matrix

$$t_N(z) = A_N(z) + D_N(z) \tag{3.23}$$

satisfies the commutation relations

$$[t_N(z), t_N(w)] = 0. \tag{3.24}$$

The same is true for the dual transfer matrix

$$\tilde t_N(z) = \tilde A_N(z) + \tilde D_N(z); \tag{3.25}$$

moreover, the modular duality implies that

$$[t_N(z), \tilde t_N(w)] = 0. \tag{3.26}$$

Clearly, $t_N(z)$ has the following structure:

$$t_N(z) = \sum_{k=0}^{N} (-1)^k z^{N-2k} H_k(x_1, p_1; \ldots; x_N, p_N), \tag{3.27}$$

where $H_0 = 1$ and

$$H_N(p_1, \ldots, p_N) = \exp\Bigl\{\omega_1 \sum_{n=1}^{N} p_n\Bigr\}, \tag{3.28}$$

$$H_1(x_1, p_1; \ldots; x_N, p_N) = \sum_{n=1}^{N} \bigl(1 + q^{-1} g^{2\omega_1} e^{\frac{2\pi}{\omega_2}(x_n - x_{n+1})}\bigr)\, e^{\omega_1 p_n}, \tag{3.29}$$

$$H_{N-1}(x_1, p_1; \ldots; x_N, p_N) = H_N \sum_{n=1}^{N} \bigl(1 + q^{-1} g^{2\omega_1} e^{\frac{2\pi}{\omega_2}(x_{n-1} - x_n)}\bigr)\, e^{-\omega_1 p_n}, \tag{3.30}$$


where in (3.29), (3.30) the periodicity is assumed: $x_{N+1} = x_1$. Hence, due to (3.24), the periodic RTC has exactly N commuting operators. The following statement is true: the operator $A_N(z)$ is the generating function for the Hamiltonians of the N-particle open RTC:

$$A_N(z) = \sum_{k=0}^{N} (-1)^k z^{N-2k} h_k(x_1, p_1; \ldots; x_N, p_N), \tag{3.31}$$

where $h_0 = 1$ and

$$h_N(p_1, \ldots, p_N) = \exp\Bigl\{\omega_1 \sum_{n=1}^{N} p_n\Bigr\}, \tag{3.32}$$

$$h_1(x_1, p_1; \ldots; x_N, p_N) = \sum_{n=1}^{N} \bigl(1 + q^{-1} g^{2\omega_1} e^{\frac{2\pi}{\omega_2}(x_n - x_{n+1})}\bigr)\, e^{\omega_1 p_n}, \tag{3.33}$$

$$h_{N-1}(x_1, p_1; \ldots; x_N, p_N) = h_N \sum_{n=1}^{N} \bigl(1 + q^{-1} g^{2\omega_1} e^{\frac{2\pi}{\omega_2}(x_{n-1} - x_n)}\bigr)\, e^{-\omega_1 p_n} \tag{3.34}$$

assuming $x_{N+1} \equiv \infty$ in (3.33) and $x_0 \equiv -\infty$ in (3.34). The second set of Hamiltonians $\tilde H_1, \ldots, \tilde H_N$ and $\tilde h_1, \ldots, \tilde h_N$ is obtained from the former one by the flip ω1 ↔ ω2.

Lemma 3.1.
1. Suppose that ω1, ω2 are real; then all coefficients of $t_N(z)$, $\tilde t_N(z)$ and $A_N(z)$, $\tilde A_N(z)$ are formally self-adjoint in $L^2(R^N)$.
2. Suppose that Im ω1 ≠ 0 and $\bar\omega_1 = \omega_2$; then all coefficients of $t_N(z)$, $\tilde t_N(z)$ and $A_N(z)$, $\tilde A_N(z)$ are normal operators and their “real” and “imaginary” parts ($X + X^{*}$, $i(X - X^{*})$) are formally self-adjoint.

3.5. Integral representation for the wave functions: Inductive procedure. Our goal is to get an inductive integral representation for the wave functions of the multiparticle open relativistic Toda chain. The approach described below is an analytic version of the algebraic method of separation of variables invented by Sklyanin [32]. Set $\gamma = (\gamma_1, \ldots, \gamma_{N-1}) \in R^{N-1}$, $x = (x_1, \ldots, x_{N-1}) \in R^{N-1}$. Let $\psi_\gamma(x)$ be the common wave function for the dual open RTC systems with N − 1 particles:

$$A_{N-1}(z)\, \psi_\gamma(x) = \prod_{m=1}^{N-1} \bigl(z - z^{-1} e^{\frac{2\pi\gamma_m}{\omega_2}}\bigr)\, \psi_\gamma(x), \tag{3.35}$$

$$\tilde A_{N-1}(z)\, \psi_\gamma(x) = \prod_{m=1}^{N-1} \bigl(z - z^{-1} e^{\frac{2\pi\gamma_m}{\omega_1}}\bigr)\, \psi_\gamma(x). \tag{3.36}$$

The key point of the inductive procedure (described for the first time in [9] for the ordinary Toda chain) is to compute the action on $\psi_\gamma(x)$ of the N-particle Hamiltonians.


It turns out that such an action “preserves” the form of the wave function8. This computation, which starts with the case N = 2, is based on the ordinary RTT commutation relations for the quantum monodromy matrix; the inductive formula given below is based on a self-consistent choice of the normalization for the wave functions.

Proposition 3.2. There exists a unique solution $\psi_{\gamma_1, \ldots, \gamma_{N-1}}(x_1, \ldots, x_{N-1})$ to the common spectral problem (3.35), (3.36) such that for any N ≥ 2 the eigenfunction $\psi_\gamma$ is an entire function in $\gamma \in C^{N-1}$ satisfying the relations

$$A_N\bigl(e^{\frac{\pi\gamma_j}{\omega_2}}\bigr)\, \psi_\gamma(x) = q^{-1} (-ig^{\omega_1})^N \exp\Bigl\{\frac{2\pi}{\omega_2}\sum_{m=1}^{N-1}\gamma_m\Bigr\}\, e^{-\frac{2\pi x_N}{\omega_2}}\, \psi_{\gamma - i\omega_1 e_j}(x), \tag{3.37}$$

$$\tilde A_N\bigl(e^{\frac{\pi\gamma_j}{\omega_1}}\bigr)\, \psi_\gamma(x) = \tilde q^{-1} (-ig^{\omega_2})^N \exp\Bigl\{\frac{2\pi}{\omega_1}\sum_{m=1}^{N-1}\gamma_m\Bigr\}\, e^{-\frac{2\pi x_N}{\omega_1}}\, \psi_{\gamma - i\omega_2 e_j}(x), \tag{3.38}$$

where $\{e_j\}$ is the standard basis of $R^{N-1}$.

As a comment to the proposition above we remark that $C_N(z) = -q^{-1} g^{\omega_1} e^{\frac{2\pi x_N}{\omega_2}} e^{\omega_1 p_N} A_{N-1}(z)$. Hence, the compatibility of (3.35) and (3.37) follows from the quadratic RTT relations; the argument for the dual system is completely similar. Assuming that such a function is known, the heuristic idea behind the inductive integral representation for the N-particle wave function $\psi_{\lambda_1, \ldots, \lambda_N}(x_1, \ldots, x_N)$ is to represent it as the generalized Fourier transform with respect to the (N − 1)-particle wave function $\psi_\gamma(x)$. Note that the “one-particle” solution is $\psi_{\gamma_1}(x_1) = e^{\frac{2\pi i\gamma_1 x_1}{\omega_1\omega_2}}$; it is easy to verify directly that this trivial function satisfies the conditions of Proposition 3.2, which forms the induction basis. The exact statement is given by Theorem 3.1 below; here we present the essential ideas leading to this statement. Introduce an auxiliary wave function by

$$K_{\gamma,\varepsilon}(x_1, \ldots, x_N) \stackrel{\mathrm{def}}{=} \exp\Bigl\{\frac{\pi i}{\omega_1\omega_2}\Bigl(\sum_{m=1}^{N-1}\gamma_m^2 - \varepsilon\sum_{m=1}^{N-1}\gamma_m\Bigr)\Bigr\}\, e^{\frac{2\pi i}{\omega_1\omega_2}\bigl(\varepsilon - \sum_{m=1}^{N-1}\gamma_m\bigr)x_N}\, \psi_\gamma(x_1, \ldots, x_{N-1}). \tag{3.39}$$

Generalizing Proposition 3.2, one can prove the following result.

Proposition 3.3. The action of $A_N(z)$, $\tilde A_N(z)$ on the auxiliary wave function (3.39) is given by

$$A_N(z)\, K_{\gamma,\varepsilon} = \Bigl(z - z^{-1} e^{\frac{2\pi}{\omega_2}\bigl(\varepsilon - \sum_{m=1}^{N-1}\gamma_m\bigr)}\Bigr) \prod_{j=1}^{N-1} \bigl(z - z^{-1} e^{\frac{2\pi\gamma_j}{\omega_2}}\bigr)\, K_{\gamma,\varepsilon} + (-ig^{\omega_1})^N e^{\frac{\pi\varepsilon}{\omega_2}} \sum_{j=1}^{N-1} K_{\gamma - i\omega_1 e_j,\varepsilon} \prod_{s \ne j} \frac{z - z^{-1} e^{\frac{2\pi\gamma_s}{\omega_2}}}{e^{\frac{\pi\gamma_j}{\omega_2}} - e^{\frac{2\pi\gamma_s}{\omega_2}} e^{-\frac{\pi\gamma_j}{\omega_2}}}, \tag{3.40}$$

8 This idea goes back to M. Gutzwiller [33], who explicitly calculated such an action on the 2- and 3-particle eigenfunctions for the Toda chain.

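$$\tilde A_N(z)\, K_{\gamma,\varepsilon} = \Bigl(z - z^{-1} e^{\frac{2\pi}{\omega_1}\bigl(\varepsilon - \sum_{m=1}^{N-1}\gamma_m\bigr)}\Bigr) \prod_{j=1}^{N-1} \bigl(z - z^{-1} e^{\frac{2\pi\gamma_j}{\omega_1}}\bigr)\, K_{\gamma,\varepsilon} + (-ig^{\omega_2})^N e^{\frac{\pi\varepsilon}{\omega_1}} \sum_{j=1}^{N-1} K_{\gamma - i\omega_2 e_j,\varepsilon} \prod_{s \ne j} \frac{z - z^{-1} e^{\frac{2\pi\gamma_s}{\omega_1}}}{e^{\frac{\pi\gamma_j}{\omega_1}} - e^{\frac{2\pi\gamma_s}{\omega_1}} e^{-\frac{\pi\gamma_j}{\omega_1}}}. \tag{3.41}$$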

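Let us write formally

$$\psi_{\lambda_1, \ldots, \lambda_N}(x_1, \ldots, x_N) = \int \mu(\gamma)\, Q(\gamma|\lambda)\, K_{\gamma, \lambda_1 + \ldots + \lambda_N}\, d\gamma, \tag{3.42}$$

where

$$\mu(\gamma) \stackrel{\mathrm{def}}{=} \prod_{j < k} \Bigl\{4\omega_1\omega_2 \sinh\frac{\pi}{\omega_1}(\gamma_j - \gamma_k) \cdot \sinh\frac{\pi}{\omega_2}(\gamma_j - \gamma_k)\Bigr\} \tag{3.43}$$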

and Q(γ|λ) is an unknown kernel. The form of the integrand in (3.42) is a natural generalization of the one for the usual Toda chain [9]; the latter case actually corresponds to the limit ω2 → ∞ of the RTC model, and in this limit the function µ(γ) reduces to the Sklyanin measure [32]. By assumption, $\psi_\gamma(x)$ satisfies, in particular, Eqs. (3.35) and (3.37); on the other hand, if we apply $A_N(z)$ and $A_{N+1}(e^{\frac{\pi\lambda_n}{\omega_2}})$ to (3.42) and demand that similar equations hold for the function $\psi_{\lambda_1, \ldots, \lambda_N}$ (see the exact formulae (3.48) and (3.49) below), these requirements yield, after an appropriate deformation of the integration contour, the difference equations for the Fourier amplitude Q(γ|λ):

$$i^N Q(\gamma + i\omega_1 e_j|\lambda) = g^{-N\omega_1} \prod_{k=1}^{N} 2\sinh\frac{\pi}{\omega_2}(\gamma_j - \lambda_k)\; Q(\gamma|\lambda), \tag{3.44}$$

$$i^{N-1} Q(\gamma|\lambda - i\omega_1 e_n) = g^{(-N+1)\omega_1} \prod_{j=1}^{N-1} 2\sinh\frac{\pi}{\omega_2}(\gamma_j - \lambda_n)\; Q(\gamma|\lambda), \tag{3.45}$$

where $\{e_n\}$ denotes the standard basis in $R^N$. It is clear that Eqs. (3.44), (3.45) can be factorized into first order equations of Baxter's type. Assume for definiteness that $\frac{\omega_1}{\omega_2} \notin R_-$. Up to a quasiconstant, their solutions can be expressed in terms of the double sine $S_2(z)$ according to Eqs. (A.10); to eliminate the quasiconstant (which must be set to 1), we use the dual (ω1 ↔ ω2) counterpart of (3.44), (3.45) (which arises by similar reasoning from (3.36), (3.38)) as well as the requirement of analyticity with respect to the parameters λ1, ..., λN. The formal computation sketched above must be matched by a series of estimates justifying the deformation of the integration contours. The induction basis is given by a careful study of the cases N = 2, 3. In this way we arrive at the following result:


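Theorem 3.1. Let $\psi_{\gamma_1, \ldots, \gamma_{N-1}}(x_1, \ldots, x_{N-1})$ be the solution of Eqs. (3.35)–(3.38). Let c(γ) be the rank 1 quantum Harish-Chandra function defined by (2.8). Introduce

$$Q(\gamma|\lambda) = \prod_{j=1}^{N-1} \prod_{n=1}^{N} c(\gamma_j - \lambda_n). \tag{3.46}$$

Let $\psi_{\lambda_1, \ldots, \lambda_N}$ be the function defined by the following integral:

$$\psi_{\lambda_1, \ldots, \lambda_N}(x_1, \ldots, x_N) = \int_{C} \mu(\gamma)\, Q(\gamma|\lambda)\, K_{\gamma, \lambda_1 + \ldots + \lambda_N}(x_1, \ldots, x_N)\, d\gamma, \tag{3.47}$$

where the auxiliary function $K_{\gamma, \lambda_1 + \ldots + \lambda_N}(x_1, \ldots, x_N)$ is defined by (3.39) and the contour of integration in the multiple integral is chosen in such a way that

(a) $\mathrm{Im}\, \gamma_j > \max_k \{\mathrm{Im}\, \lambda_k\}$;
(b) the left end of the contour escapes to infinity in the sectors $\frac12(\arg\omega_1 + \arg\omega_2) + \pi < \arg\gamma_j < \arg\omega_2 + \frac{3\pi}{2}$;
(c) the right end of the contour escapes to infinity in the sectors $\arg\omega_1 - \frac{\pi}{2} < \arg\gamma_j < \arg\omega_2 + \frac{\pi}{2}$.

Then the function (3.47) is a common eigenfunction for the N-particle open RTC. Namely, it has the following properties:

(i) $\psi_{\lambda_1, \ldots, \lambda_N}$ is an entire function in $\lambda \in C^N$;
(ii) $\psi_{\lambda_1, \ldots, \lambda_N}$ is the solution to the following set of equations:

$$A_N(z)\, \psi_{\lambda_1, \ldots, \lambda_N} = \prod_{k=1}^{N} \bigl(z - z^{-1} e^{\frac{2\pi\lambda_k}{\omega_2}}\bigr)\, \psi_{\lambda_1, \ldots, \lambda_N}, \tag{3.48}$$

$$A_{N+1}\bigl(e^{\frac{\pi\lambda_n}{\omega_2}}\bigr)\, \psi_{\lambda_1, \ldots, \lambda_N} = q^{-1} (-ig^{\omega_1})^{N+1} \exp\Bigl\{\frac{2\pi}{\omega_2}\sum_{k=1}^{N}\lambda_k\Bigr\}\, e^{-\frac{2\pi x_{N+1}}{\omega_2}}\, \psi_{\lambda_1, \ldots, \lambda_n - i\omega_1, \ldots, \lambda_N}, \tag{3.49}$$

$$\tilde A_N(z)\, \psi_{\lambda_1, \ldots, \lambda_N} = \prod_{k=1}^{N} \bigl(z - z^{-1} e^{\frac{2\pi\lambda_k}{\omega_1}}\bigr)\, \psi_{\lambda_1, \ldots, \lambda_N}, \tag{3.50}$$

$$\tilde A_{N+1}\bigl(e^{\frac{\pi\lambda_n}{\omega_1}}\bigr)\, \psi_{\lambda_1, \ldots, \lambda_N} = \tilde q^{-1} (-ig^{\omega_2})^{N+1} \exp\Bigl\{\frac{2\pi}{\omega_1}\sum_{k=1}^{N}\lambda_k\Bigr\}\, e^{-\frac{2\pi x_{N+1}}{\omega_1}}\, \psi_{\lambda_1, \ldots, \lambda_n - i\omega_2, \ldots, \lambda_N}. \tag{3.51}$$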


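By inductive application of the formula (3.47), starting with the trivial one-particle wave function $\psi_{\gamma_1}(x_1) = e^{\frac{2\pi i\gamma_1 x_1}{\omega_1\omega_2}}$, we get an explicit solution for the N-particle system.

Theorem 3.2. Let $\|\gamma_{jk}\|_{j,k=1}^{N}$ be a lower triangular N × N matrix and let the last row $(\gamma_{N1}, \ldots, \gamma_{NN})$ be identified with $(\lambda_1, \ldots, \lambda_N)$.

(i) The solution to (3.48)–(3.51) can be written in the form:

$$\psi_{\lambda_1, \ldots, \lambda_N}(x_1, \ldots, x_N) = \int_{D_N} \prod_{n=1}^{N-1} \Biggl\{ \prod_{\substack{j,k=1 \\ j < k}}^{n} \Bigl[4\omega_1\omega_2 \sinh\frac{\pi}{\omega_1}(\gamma_{nj} - \gamma_{nk}) \cdot \sinh\frac{\pi}{\omega_2}(\gamma_{nj} - \gamma_{nk})\Bigr] \prod_{j=1}^{n} \prod_{k=1}^{n+1} c(\gamma_{nj} - \gamma_{n+1,k})\, \exp\Bigl[\frac{\pi i}{\omega_1\omega_2}\Bigl(\sum_{m=1}^{N}\gamma_{nm}^2 - \sum_{k,m=1}^{N}\gamma_{n+1,k}\gamma_{nm}\Bigr)\Bigr] \Biggr\} \times \exp\Bigl\{\frac{2\pi i}{\omega_1\omega_2}\sum_{n,m=1}^{N} x_n\bigl(\gamma_{nm} - \gamma_{n-1,m}\bigr)\Bigr\} \prod_{\substack{j,k=1 \\ j \ge k}}^{N-1} d\gamma_{jk}, \tag{3.52}$$

where the integral should be understood as follows. We integrate from top to bottom of the lower triangular matrix: first we integrate over $\gamma_{11}$ along the line $\mathrm{Im}\, \gamma_{11} > \max\{\mathrm{Im}\, \gamma_{21}, \mathrm{Im}\, \gamma_{22}\}$; then we integrate over the set $(\gamma_{21}, \gamma_{22})$ along the lines $\mathrm{Im}\, \gamma_{2j} > \max_m \{\mathrm{Im}\, \gamma_{3m}\}$, and so on. The last integrations should be performed over the variables $(\gamma_{N-1,1}, \ldots, \gamma_{N-1,N-1})$ along the lines $\mathrm{Im}\, \gamma_{N-1,k} > \max_m \{\mathrm{Im}\, \gamma_{N,m}\}$. The asymptotic behaviour of all contours is chosen in the same way as in the previous theorem.

(ii) The wave function $\psi_\lambda$ has the following asymptotic behaviour as $x_j - x_k \to -\infty$ inside the positive Weyl chamber $P_+ = \{(x_1, \ldots, x_N);\ x_1 < x_2 < \ldots < x_N\}$:

$$\psi_{\lambda_1, \ldots, \lambda_N}(x_1, \ldots, x_N) \sim \sum_{s \in W} C(s\lambda)\, \exp\Bigl\{\frac{\pi i}{\omega_1\omega_2}(s\lambda, x)\Bigr\}, \tag{3.53}$$

where

$$C(\lambda) = \prod_{j < k} e^{-\frac{\pi i}{\omega_1\omega_2}\lambda_j\lambda_k}\, c(\lambda_j - \lambda_k); \tag{3.54}$$

the sum is over all permutations of $(\lambda_1, \ldots, \lambda_N)$ and (· , ·) is the standard scalar product in $R^N$. We will refer to (3.54) as the quantum Harish-Chandra function for the modular double corresponding to $U_q(gl(n))$.

Remark 3.1. In the case of N = 2 the solution (3.52) has the following form:

$$\psi_{\lambda_1, \lambda_2}(x_1, x_2) = e^{\frac{2\pi i}{\omega_1\omega_2}(\lambda_1 + \lambda_2)x_2} \int_{D_2} e^{\frac{\pi i}{\omega_1\omega_2}[\gamma_{11}^2 - (\lambda_1 + \lambda_2)\gamma_{11}]}\, c(\gamma_{11} - \lambda_1)\, c(\gamma_{11} - \lambda_2)\, e^{\frac{2\pi i\gamma_{11}}{\omega_1\omega_2}(x_1 - x_2)}\, d\gamma_{11}. \tag{3.55}$$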


Changing the integration variable $\gamma_{11} \to \zeta + \lambda_1$ and letting $x = x_1 - x_2$ one can obtain (up to a simple GL(1) factor) the function $\psi_{\lambda_1 - \lambda_2}(x)$ which coincides with the $U_q(sl(2))$ solution (2.36) for C = −1.

Corollary 3.1. Let the periods ω1, ω2 be real positive numbers. Fix the following choice of the coupling constant, $g^{\omega_1} = \frac{q - q^{-1}}{i\omega_1}$, and let ρ be the half-sum of positive roots of sl(N, R) written in the standard basis of $R^N$. After rescaling $x_k \to \frac{\omega_2}{2\pi}x_k$ and sending ω2 → ∞ one obtains in this limit $\psi_\lambda(\frac{\omega_2}{2\pi}x) \to \psi^{(TC)}_\lambda(x)$, where $\psi^{(TC)}_\lambda(x)$ is a solution of the GL(N, R) open Toda chain in the Mellin–Barnes form [9]. In terms of the classical GL(N, R) Whittaker functions W(x; λ) [4] it can be written in the form

$$\psi^{(TC)}_\lambda(x) = \omega_1^{-2i(\lambda,\rho)/\omega_1} \prod_{j < k} \pi^{-1/2}\, \Gamma\Bigl(\frac{\lambda_j - \lambda_k}{i\omega_1} + \frac12\Bigr)\, W(x; \lambda). \tag{3.56}$$

In this limit the quantum Harish-Chandra function (3.54) reduces to the standard one:

$$C(\lambda) \to \omega_1^{-2i(\lambda,\rho)/\omega_1} \prod_{j < k} \Gamma\Bigl(\frac{\lambda_j - \lambda_k}{i\omega_1}\Bigr). \tag{3.57}$$

Remark 3.2. To generalize the solution (2.36) to an arbitrary N-particle case one should deal with the Lax operators

$$L^{(C)}_n(z) = \begin{pmatrix} z - z^{-1} e^{\omega_1 p_n} & g^{\omega_1} e^{-\frac{2\pi x_n}{\omega_2} + \frac{1+C}{2}\omega_1 p_n} \\ -g^{\omega_1} e^{\frac{2\pi x_n}{\omega_2} + \frac{1-C}{2}\omega_1 p_n} & 0 \end{pmatrix}, \tag{3.58}$$

$$\tilde L^{(C)}_n(z) = \begin{pmatrix} z - z^{-1} e^{\omega_2 p_n} & g^{\omega_2} e^{-\frac{2\pi x_n}{\omega_1} + \frac{1+C}{2}\omega_2 p_n} \\ -g^{\omega_2} e^{\frac{2\pi x_n}{\omega_1} + \frac{1-C}{2}\omega_2 p_n} & 0 \end{pmatrix}, \tag{3.59}$$

which satisfy the RTT relation with the same twisted R-matrix. Applying the method described above to this model, one obtains a solution of the same structure as (3.52) with the additional factor −C in front of the quadratic form in the γ-variables. Note that different deformations of the relativistic Toda chain have been introduced in [34] on a purely algebraic level.

3.6. Periodic q-Toda chain. In this section we briefly formulate the extension of our approach to the case of the periodic modular q-Toda chain. The details will be published separately. In the same way as for the open chain, we can calculate the action of the transfer matrices $t_N(z)$ and $\tilde t_N(z)$ on the auxiliary wave function (3.39). One can prove the following result.


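Proposition 3.4. The action of $t_N(z)$, $\tilde t_N(z)$ on the auxiliary wave function (3.39) is given by

$$t_N(z)\, K_{\gamma,\varepsilon} = \Bigl(z - z^{-1} e^{\frac{2\pi}{\omega_2}\bigl(\varepsilon - \sum_{m=1}^{N-1}\gamma_m\bigr)}\Bigr) \prod_{j=1}^{N-1} \bigl(z - z^{-1} e^{\frac{2\pi\gamma_j}{\omega_2}}\bigr)\, K_{\gamma,\varepsilon} + e^{\frac{\pi\varepsilon}{\omega_2}} \sum_{j=1}^{N-1} \Bigl[(-ig^{\omega_1})^N K_{\gamma - i\omega_1 e_j,\varepsilon} + (ig^{\omega_1})^N K_{\gamma + i\omega_1 e_j,\varepsilon}\Bigr] \prod_{s \ne j} \frac{z - z^{-1} e^{\frac{2\pi\gamma_s}{\omega_2}}}{e^{\frac{\pi\gamma_j}{\omega_2}} - e^{\frac{2\pi\gamma_s}{\omega_2}} e^{-\frac{\pi\gamma_j}{\omega_2}}}, \tag{3.60}$$

$$\tilde t_N(z)\, K_{\gamma,\varepsilon} = \Bigl(z - z^{-1} e^{\frac{2\pi}{\omega_1}\bigl(\varepsilon - \sum_{m=1}^{N-1}\gamma_m\bigr)}\Bigr) \prod_{j=1}^{N-1} \bigl(z - z^{-1} e^{\frac{2\pi\gamma_j}{\omega_1}}\bigr)\, K_{\gamma,\varepsilon} + e^{\frac{\pi\varepsilon}{\omega_1}} \sum_{j=1}^{N-1} \Bigl[(-ig^{\omega_2})^N K_{\gamma - i\omega_2 e_j,\varepsilon} + (ig^{\omega_2})^N K_{\gamma + i\omega_2 e_j,\varepsilon}\Bigr] \prod_{s \ne j} \frac{z - z^{-1} e^{\frac{2\pi\gamma_s}{\omega_1}}}{e^{\frac{\pi\gamma_j}{\omega_1}} - e^{\frac{2\pi\gamma_s}{\omega_1}} e^{-\frac{\pi\gamma_j}{\omega_1}}}. \tag{3.61}$$

Using this proposition, one can prove that the integral representation (3.47) is still valid for the wave functions of the periodic q-Toda chain. Namely, the l.h.s. of (3.47) should be considered as the wave function of the periodic chain (i.e. as the common eigenfunction for the operators $t_N(z)$ and $\tilde t_N(z)$) with the following changes. The Fourier coefficient Q(γ|λ) factorizes now into the product

$$Q(\gamma|\lambda) = \prod_{j=1}^{N-1} Q(\gamma_j|\lambda), \tag{3.62}$$

where the one-variable function Q(γ|λ) is an entire function with an appropriate asymptotic behaviour which satisfies the following system of mutually dual Baxter equations:

$$g^{-N\omega_1} \prod_{k=1}^{N} 2\sinh\frac{\pi}{\omega_2}(\gamma - \lambda_k)\; Q(\gamma|\lambda) = i^N Q(\gamma + i\omega_1|\lambda) + i^{-N} Q(\gamma - i\omega_1|\lambda), \tag{3.63a}$$

$$g^{-N\omega_2} \prod_{k=1}^{N} 2\sinh\frac{\pi}{\omega_1}(\gamma - \lambda_k)\; Q(\gamma|\lambda) = i^N Q(\gamma + i\omega_2|\lambda) + i^{-N} Q(\gamma - i\omega_2|\lambda) \tag{3.63b}$$

(compare with (3.44)).

Remark 3.3. In the limit ω2 → ∞ the equation (3.63a) goes to the standard Baxter equation for the N-particle periodic Toda chain [32] with Planck constant ħ = ω1, provided $g^{\omega_1} = \frac{2\pi}{\omega_2}[1 + O(\omega_2^{-1})]$.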


We would like to stress that the proper asymptotic behaviour of the solution of Baxter equations is fixed by the condition that the integral (3.47) converges. Together with the analyticity condition for the solution this leads to the quantization conditions of Gutzwiller’s type for the eigenvalues λ [33, 35, 36]. Note that the Baxter equation (3.63a) has been obtained in [37] in the framework of separation of variables [32]. A similar system of Baxter equations (3.63) appeared for the first time in [11, 38] in the models different from ours, but with the same type of duality property. Acknowledgement. We are deeply indebted to L. Faddeev, B. Feigin, S. Ruijsenaars, S. Khoroshkin, and F. Smirnov for stimulating discussions. We are also grateful to T. Sultanov for preparing the pictures. The research was partly supported by grants INTAS 99-01782; RFBR 00-02-16477 (S. Kharchev); INTAS 97-1312, RFBR 00-02-16530 (D. Lebedev) and by grant 00-15-96557 for Support of Scientific Schools; RFBR 99-01-00101 (M. Semenov-Tian-Shansky). One of us (D.L.) is deeply indebted to the Institut des Hautes Études Scientifiques where part of the work has been done.

A. Double Sine Function

We give here a short summary of the properties of the double sines and related functions. The theory of double sines, double gamma functions, etc. goes back to the papers of Barnes [12, 13], with more recent additions of Shintani [15] and Kurokawa [16]; the closely related quantum dilogarithms were independently introduced by Faddeev and Kashaev [17] in connection with the Quantum Inverse Scattering Method. See also relevant items on the subject in [39, 40].

A.1. Definition and main properties. The basic properties of double sines listed below are extracted mainly from [15] and [16]; all the main ideas are contained already in the papers of Barnes [12]–[14]9.

I. Integral formulae. Set ω = (ω1, ω2), ω1, ω2 > 0. The double sine $S_2(z|\omega)$ is defined by the integral [12, 15]

$$\log S_2(z|\omega) = \int_{C_H} \frac{\sinh\bigl(z - \frac{\omega_1 + \omega_2}{2}\bigr)t}{2\sinh\frac{\omega_1 t}{2}\, \sinh\frac{\omega_2 t}{2}}\, \log(-t)\, \frac{dt}{2\pi i t} \tag{A.1}$$

in the region

$$0 < \mathrm{Re}\, z < \omega_1 + \omega_2, \tag{A.2}$$

where the contour $C_H$ is drawn in Fig. 1:

9 In [12, 13] Barnes has developed the complete theory of the so-called double gamma functions $\Gamma_2(z|\omega_1, \omega_2)$. The double sine function appeared for the first time in a paper by Shintani [15] as a ratio of two appropriate double gamma functions.

Fig. 1. The Hankel contour $C_H$

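An equivalent integral representation is easily derived from (A.1):

$$\log S_2(z|\omega) = \frac{\pi i}{2} B_{2,2}(z|\omega) + \int_{R + i0} \frac{e^{zt}}{(e^{\omega_1 t} - 1)(e^{\omega_2 t} - 1)}\, \frac{dt}{t}, \tag{A.3}$$

where

$$B_{2,2}(z|\omega) = \frac{z^2}{\omega_1\omega_2} - \frac{\omega_1 + \omega_2}{\omega_1\omega_2}\, z + \frac{\omega_1^2 + 3\omega_1\omega_2 + \omega_2^2}{6\omega_1\omega_2} \tag{A.4}$$

and the contour is drawn in Fig. 2:

Fig. 2. Contour R + i0

Note that

$$B_{2,2}(\omega_1 + \omega_2 - z|\omega) = B_{2,2}(z|\omega). \tag{A.5}$$

From (A.1) one can also derive that

$$\log S_2(z|\omega) = \int_0^\infty \Biggl(\frac{\sinh\bigl(z - \frac{\omega_1 + \omega_2}{2}\bigr)t}{2\sinh\frac{\omega_1 t}{2}\, \sinh\frac{\omega_2 t}{2}} - \frac{1}{\omega_1\omega_2 t}(2z - \omega_1 - \omega_2)\Biggr) \frac{dt}{t}. \tag{A.6}$$

II. Series and product formulae. Evaluating (A.6) by the residue formula, one obtains the series expansions which are valid in the regions Im z > 0 and Im z < 0, respectively:

$$\log S_2(z|\omega) = \frac{\pi i}{2} B_{2,2}(z|\omega) + \sum_{n=1}^{\infty} \frac{1}{n} \Biggl(\frac{e^{\frac{2\pi i n z}{\omega_1}}}{e^{\frac{2\pi i n\omega_2}{\omega_1}} - 1} + \frac{e^{\frac{2\pi i n z}{\omega_2}}}{e^{\frac{2\pi i n\omega_1}{\omega_2}} - 1}\Biggr), \tag{A.7}$$

$$\log S_2(z|\omega) = -\frac{\pi i}{2} B_{2,2}(z|\omega) + \sum_{n=1}^{\infty} \frac{1}{n} \Biggl(\frac{e^{-\frac{2\pi i n z}{\omega_1}}}{e^{-\frac{2\pi i n\omega_2}{\omega_1}} - 1} + \frac{e^{-\frac{2\pi i n z}{\omega_2}}}{e^{-\frac{2\pi i n\omega_1}{\omega_2}} - 1}\Biggr). \tag{A.8}$$

By an appropriate deformation of the contour $C_H$ the double sine (A.1) can be extended to all complex values of ω1, ω2, provided that $\frac{\omega_1}{\omega_2} \notin R_-$ [12]. Namely, the contour is in general along the bisector of the smallest angle between −arg ω1 and −arg ω2 enclosing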


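the origin but no other poles of the integrand in (A.1) (the contour R + i0 should be rotated similarly). As a corollary, we obtain the product expansions which are valid when $\mathrm{Im}\, \frac{\omega_1}{\omega_2} > 0$:

$$S_2(z|\omega) = e^{\frac{\pi i}{2} B_{2,2}(z|\omega)}\, \frac{\prod_{m=0}^{\infty} \bigl(1 - q^{2m} e^{\frac{2\pi i z}{\omega_2}}\bigr)}{\prod_{m=1}^{\infty} \bigl(1 - \tilde q^{-2m} e^{\frac{2\pi i z}{\omega_1}}\bigr)} = e^{-\frac{\pi i}{2} B_{2,2}(z|\omega)}\, \frac{\prod_{m=0}^{\infty} \bigl(1 - \tilde q^{-2m} e^{-\frac{2\pi i z}{\omega_1}}\bigr)}{\prod_{m=1}^{\infty} \bigl(1 - q^{2m} e^{-\frac{2\pi i z}{\omega_2}}\bigr)}, \tag{A.9}$$

where

$$q = e^{\pi i\frac{\omega_1}{\omega_2}}, \qquad \tilde q = e^{\pi i\frac{\omega_2}{\omega_1}}.$$

The equality of the two expressions is due to the modular transformation law for the theta function $\theta_1(z|\frac{\omega_1}{\omega_2})$.

III. Functional relations. The function $S_2(z|\omega)$ satisfies the difference equations

$$\frac{S_2(z + \omega_1|\omega)}{S_2(z|\omega)} = \frac{1}{2\sin\frac{\pi z}{\omega_2}}, \tag{A.10a}$$

$$\frac{S_2(z + \omega_2|\omega)}{S_2(z|\omega)} = \frac{1}{2\sin\frac{\pi z}{\omega_1}}. \tag{A.10b}$$

Moreover,

$$S_2(z|\omega)\, S_2(-z|\omega) = -4\sin\frac{\pi z}{\omega_1}\, \sin\frac{\pi z}{\omega_2}, \qquad S_2(z|\omega)\, S_2(\omega_1 + \omega_2 - z|\omega) = 1. \tag{A.11}$$

IV. Poles and zeroes. The zeroes and poles of $S_2(z|\omega)$ are as follows:

$$\text{poles at } z = n_1\omega_1 + n_2\omega_2,\ n_1, n_2 \ge 1; \qquad \text{zeros at } z = n_1\omega_1 + n_2\omega_2,\ n_1, n_2 \le 0. \tag{A.12}$$

Moreover,

$$\lim_{z \to 0} z^{-1} S_2(z|\omega) = \frac{2\pi}{\sqrt{\omega_1\omega_2}}. \tag{A.13}$$

Hence, from (A.10) and (A.13),

$$S_2(\omega_1|\omega) = \sqrt{\frac{\omega_2}{\omega_1}}, \qquad S_2(\omega_2|\omega) = \sqrt{\frac{\omega_1}{\omega_2}}. \tag{A.14}$$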


Using (A.10), (A.13), one can calculate the residues of S2 (z|ω) and S2−1 (z|ω) at the corresponding points (A.12): √ ω1 ω2 (−1)n1 n2 lim zS2 (z+n1 ω1 + n2 ω2 |ω) = , (A.15a) n. 1 −1 2 −1 z→0 2π n. π kω π mω 1 2 2 sin ω 2 sin ω 2 1 m=1

k=1

lim zS2−1 (z−n1 ω1 z→0

√ ω1 ω2 − n2 ω2 |ω) = n1 . 2π k=1

(−1)n1 n2 +n1 +n2 . (A.15b) n kω1 .2 2 sin π mω2 2 sin πω ω1 2 m=1

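V. Asymptotics. Assuming that Im(ω1/ω2) > 0, we have

\[
S_2(z|\omega) \xrightarrow[z \to \infty]{}
\begin{cases}
e^{\frac{\pi i}{2} B_{2,2}(z|\omega)}, & \arg\omega_1 < \arg z < \arg\omega_2 + \pi,\\[4pt]
e^{-\frac{\pi i}{2} B_{2,2}(z|\omega)}, & \arg\omega_1 - \pi < \arg z < \arg\omega_2,\\[4pt]
\dfrac{e^{-\frac{\pi i}{2} B_{2,2}(z|\omega)}}{\prod_{m=1}^{\infty}\bigl(1 - q^{2m} e^{-2\pi i z/\omega_2}\bigr)}, & \arg\omega_2 < \arg z < \arg\omega_1,\\[10pt]
e^{\frac{\pi i}{2} B_{2,2}(z|\omega)} \prod_{m=0}^{\infty}\bigl(1 - q^{2m} e^{2\pi i z/\omega_2}\bigr), & \arg\omega_2 + \pi < \arg z < \arg\omega_1 + \pi.
\end{cases}
\]

VI. Complex conjugation.

\[
\overline{S_2(z|\omega)} = S_2(\bar z|\bar\omega).
\tag{A.16}
\]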

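A.2. Related functions. It is convenient to introduce the function S(z|ω) as follows:

\[
S_2(z|\omega) = e^{\frac{\pi i}{2} B_{2,2}(z|\omega)}\, S(z|\omega),
\tag{A.17}
\]

where, according to (A.3),

\[
\log S(z|\omega) = \int_{\mathbb{R} + i0} \frac{e^{zt}}{(e^{\omega_1 t} - 1)(e^{\omega_2 t} - 1)}\, \frac{dt}{t}.
\tag{A.18}
\]

For complex periods with Im(ω1/ω2) > 0 we get

\[
S(z|\omega) =
\frac{\prod_{m=0}^{\infty}\bigl(1 - q^{2m} e^{2\pi i z/\omega_2}\bigr)}{\prod_{m=1}^{\infty}\bigl(1 - \tilde q^{\,-2m} e^{2\pi i z/\omega_1}\bigr)}
= e^{-\pi i B_{2,2}(z|\omega)}\,
\frac{\prod_{m=0}^{\infty}\bigl(1 - \tilde q^{\,-2m} e^{-2\pi i z/\omega_1}\bigr)}{\prod_{m=1}^{\infty}\bigl(1 - q^{2m} e^{-2\pi i z/\omega_2}\bigr)}.
\tag{A.19}
\]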

Evidently, S(z|ω) satisfies the difference equations

\[
\frac{S(z + \omega_1|\omega)}{S(z|\omega)} = \frac{1}{1 - e^{2\pi i z/\omega_2}},
\tag{A.20a}
\]

\[
\frac{S(z + \omega_2|\omega)}{S(z|\omega)} = \frac{1}{1 - e^{2\pi i z/\omega_1}}
\tag{A.20b}
\]

and obeys the condition

\[
\overline{S(z|\omega)} = S^{-1}(\omega_1 + \omega_2 - \bar z\,|\omega)
\tag{A.21}
\]

when ω1, ω2 are real or \bar ω1 = ω2. Let us finally mention that the quantum dilogarithm e_ω(z) defined in [10, 17] is related to S(z|ω) via

\[
e_\omega(z) = S\Bigl(\frac{\omega_1 + \omega_2}{2} - iz\,\Big|\,\omega\Bigr),
\tag{A.22}
\]

while the hyperbolic gamma function G(z|ω) introduced in [28] is

\[
G(z|\omega) = S_2\Bigl(\frac{\omega_1 + \omega_2}{2} + iz\,\Big|\,\omega\Bigr).
\tag{A.23}
\]
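As a numerical illustration of the difference equations (A.20), the following minimal sketch evaluates a truncated version of the first product in (A.19) and compares both sides. It assumes q = e^{πi ω1/ω2} and \tilde q = e^{πi ω2/ω1} with Im(ω1/ω2) > 0; the function name `S`, the truncation length, and the sample values of ω1, ω2 and z are illustrative choices, not taken from the paper.

```python
import cmath


def S(z, w1, w2, terms=40):
    """Truncated product for S(z|w), assuming Im(w1/w2) > 0 (illustrative sketch)."""
    q = cmath.exp(1j * cmath.pi * w1 / w2)        # |q| < 1 under the assumption
    q_tilde = cmath.exp(1j * cmath.pi * w2 / w1)  # |q_tilde| > 1 under the assumption
    x = cmath.exp(2j * cmath.pi * z / w2)
    y = cmath.exp(2j * cmath.pi * z / w1)
    numerator = 1.0 + 0j
    for m in range(terms):
        numerator *= 1 - q ** (2 * m) * x
    denominator = 1.0 + 0j
    for m in range(1, terms):
        denominator *= 1 - q_tilde ** (-2 * m) * y
    return numerator / denominator


# Hypothetical sample periods with Im(w1/w2) = 1 > 0 and a generic point z.
w1, w2 = 0.3 + 1.0j, 1.0 + 0j
z = 0.2 + 0.1j

# Shift by w1: ratio should equal 1 / (1 - exp(2*pi*i*z/w2)).
lhs_a = S(z + w1, w1, w2) / S(z, w1, w2)
rhs_a = 1 / (1 - cmath.exp(2j * cmath.pi * z / w2))

# Shift by w2: ratio should equal 1 / (1 - exp(2*pi*i*z/w1)).
lhs_b = S(z + w2, w1, w2) / S(z, w1, w2)
rhs_b = 1 / (1 - cmath.exp(2j * cmath.pi * z / w1))

print(abs(lhs_a - rhs_a), abs(lhs_b - rhs_b))
```

Both differences should be at the level of floating-point rounding, because a shift by ω1 multiplies e^{2πiz/ω2} by q² and a shift by ω2 multiplies e^{2πiz/ω1} by \tilde q², which telescopes the corresponding product.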

A.3. Fourier transform. The function S(it|ω) has the following asymptotics:

\[
S(it|\omega) \xrightarrow[t \to \infty]{}
\begin{cases}
1, & \arg\omega_1 - \frac{\pi}{2} < \arg t < \arg\omega_2 + \frac{\pi}{2},\\[4pt]
e^{-\pi i B_{2,2}(it|\omega)}, & \arg\omega_1 + \frac{\pi}{2} < \arg t < \arg\omega_2 + \frac{3\pi}{2}
\end{cases}
\tag{A.24}
\]

(we are not interested in the behavior in the two remaining sectors). Each sector is divided into two subsectors (+) (resp. (−)) where the exponential e^{πit²/(ω1ω2)} (resp. e^{−πit²/(ω1ω2)}) is rapidly decreasing. To be more precise, the sectors (+) are determined by the inequalities

\[
\tfrac{1}{2}(\arg\omega_1 + \arg\omega_2) + \pi < \arg t < \arg\omega_2 + \tfrac{3\pi}{2},
\tag{A.25a}
\]

\[
\tfrac{1}{2}(\arg\omega_1 + \arg\omega_2) < \arg t < \arg\omega_2 + \tfrac{\pi}{2},
\tag{A.25b}
\]

while the sectors (−) are determined by the inequalities

\[
\arg\omega_1 + \tfrac{\pi}{2} < \arg t < \tfrac{1}{2}(\arg\omega_1 + \arg\omega_2) + \pi,
\tag{A.26a}
\]

\[
\arg\omega_1 - \tfrac{\pi}{2} < \arg t < \tfrac{1}{2}(\arg\omega_1 + \arg\omega_2)
\tag{A.26b}
\]

(see Fig. 3 for the typical complex periods ω1 and ω2).

Fig. 3. Sectors in the complex t-plane for typical complex periods ω1 and ω2, bounded by the rays arg t = arg ω1 ± π/2 and arg t = arg ω2 ± π/2 and subdivided by the rays arg t = ½(arg ω1 + arg ω2) and arg t = ½(arg ω1 + arg ω2) + π; S(it|ω) → 1 in one sector and S(it|ω) → e^{−πiB2,2(it|ω)} in the other.

There exist remarkable integral formulae with the double sine type functions [11, 21]. In particular, there are the following Fourier transformation formulae [11]10 : 2π izt 2π za πi √ S(it +ω1 + ω2 − a) e ω1 ω2 dt = ω1 ω2 e− 2 B2,2 (0) S −1 (−iz) e ω1 ω2 , (A.27a) =

2π izt

S −1 (it + a) e ω1 ω2 dt =

πi √ − 2π za ω1 ω2 e 2 B2,2 (0) S(ω1 + ω2 + iz) e ω1 ω2 ,

(A.27b)

L

2π izt

S(−it +ω1 + ω2 −a) e ω1 ω2 dt =

πi √ − 2π za ω1 ω2 e− 2 B2,2 (0) S −1 (iz) e ω1 ω2 ,

(A.27c)

=

2π izt

S −1 (−it + a) e ω1 ω2 dt =

2π za πi √ ω1 ω2 e 2 B2,2 (0) S(ω1 + ω2 − iz) e ω1 ω2 .

(A.27d)

L

The notations here are as follows. The contours = and L are above the poles tn1 ,n2 = −i(a + n1 ω1 + n2 ω2 )

(n1 , n2 ≥ 0)

(A.28)

10 Actually, any three formulae in (A.27) are consequences of the fourth one. Nevertheless, we list all of them for convenience.


of the integrands in (A.27a) and (A.27d) while the contours L and = are below the poles tn 1 ,n2 = i(a + n1 ω1 + n2 ω2 )

n1 , n2 ≥ 0)

(A.29)

of the integrands in (A.27b) and (A.27c). Further, the contours = and L are beginning in subsectors (A.25a) and (A.26a) respectively, but may lie in the whole sector [arg ω1 − π π 2 , arg ω2 + 2 ], while the contours = and L may lie in the whole sector [arg ω1 + π 3π 2 , arg ω2 + 2 ], but are ending in subsectors (A.25b) and (A.26b), respectively. Provided with such a description, the formulae (A.27a) and (A.27b) hold in the region

π π , (A.30) arg z ∈ / arg ω2 − , arg ω1 − 2 2 while in (A.27c) and (A.27d)

π π arg z ∈ / arg ω2 + , arg ω1 + . 2 2

(A.31)

References

1. Kostant, B.: Quantization and representation theory. In: Representation Theory of Lie Groups, Proc. of Symp., Oxford, 1977. London Math. Soc. Lecture Notes Series, Vol. 34. Cambridge, 1979, pp. 287–317
2. Jacquet, H.: Fonctions de Whittaker associées aux groupes de Chevalley. Bull. Soc. Math. France 95, 243–309 (1967)
3. Schiffmann, G.: Intégrales d’entrelacement et fonctions de Whittaker. Bull. Soc. Math. France 99, 3–72 (1971)
4. Hashizume, M.: Whittaker models for real reductive groups. J. Math. Soc. Japan 5, 394–401 (1979); Whittaker functions on semisimple Lie groups. Hiroshima Math. J. 12, 259–293 (1982)
5. Harish-Chandra: Spherical functions on a semisimple Lie group. I. Amer. J. Math. 80, 241–310 (1958)
6. Semenov-Tian-Shansky, M.A.: Quantization of Open Toda Lattices. Encyclopædia of Mathematical Sciences, Vol. 16. Dynamical Systems VII. Ch. 3. Berlin–Heidelberg–New York: Springer Verlag, 1994, pp. 226–259
7. Faddeev, L.: Quantum completely integrable models in field theory. Sov. Sci. Rev., Sect. C (Math. Phys. Rev.) 1, 107–155 (1980)
8. Sklyanin, E.: Boundary conditions for integrable quantum systems. J. Phys. A21, 2375–2389 (1988)
9. Kharchev, S., Lebedev, D.: Eigenfunctions of GL(N, R) Toda chain: The Mellin–Barnes representation. JETP Lett. 71, 235–238 (2000); hep-th/0004065
10. Faddeev, L.: Discrete Heisenberg-Weyl group and modular group. Lett. Math. Phys. 34, 249 (1995); hep-th/9504111; Modular double of quantum group. math.qa/9912078
11. Faddeev, L., Kashaev, R., Volkov, A.: Strongly coupled quantum discrete Liouville theory. I: Algebraic approach and duality. hep-th/0006156
12. Barnes, E.W.: The genesis of the double gamma function. Proc. London Math. Soc. 31, 358–381 (1899)
13. Barnes, E.W.: The theory of the double gamma function. Phil. Trans. Roy. Soc. A196, 265–387 (1901)
14. Barnes, E.W.: On the theory of multiple gamma functions. Trans. Cambr. Phil. Soc. 19, 374–425 (1904)
15. Shintani, T.: On a Kronecker limit formula for real quadratic fields. J. Fac. Sci. Univ. Tokyo, Sect. 1A 24, 167–199 (1977)
16. Kurokawa, N.: Multiple sine functions and Selberg zeta functions. Proc. Japan Acad. A67, 61–64 (1991); Gamma factors and Plancherel measures. Proc. Japan Acad. A68, 256–260 (1992); Multiple zeta functions; an example. Adv. Studies Pure Math. 21, 219–226 (1992)
17. Faddeev, L., Kashaev, R.: Quantum dilogarithm. Mod. Phys. Lett. 9, 265–282 (1994); hep-th/9310070
18. Ruijsenaars, S.: The relativistic Toda systems. Commun. Math. Phys. 133, 217–247 (1990)
19. Rieffel, M.: C∗-algebras associated with irrational rotations. Pacific J. Math. 93, 415–430 (1981)
20. Chari, V., Pressley, A.: A guide to quantum groups. Cambridge: Cambridge Univ. Press, 1994
21. Ponsot, B., Teschner, J.: Liouville bootstrap via harmonic analysis on a noncompact quantum group. hep-th/9911110; Clebsch-Gordan and Racah-Wigner coefficients for a continuous series of representations of Uq(sl(2, R)). math.QA/0007097
22. Gelfand, I., Kirilov, A.: Sur les corps liés aux algébres enveloppantes des algébres de Lie. Publ. Mat. Hautes Etud. Sci. 31, 509–523 (1966)
23. Schmüdgen, K.: Operator representations of Uq(sl2). Lett. Math. Phys. 37, 211–222 (1996)
24. Sevostyanov, A.: Quantum deformation of Whittaker modules and Toda lattice. math.QA/9905128
25. Olshanetsky, M., Rogov, V.-B.: Liouville quantum mechanics on a lattice from geometry of quantum Lorentz group. J. Phys. A27, 4669–4683 (1994)
26. Gasper, G., Rahman, M.: Basic hypergeometric series. Cambridge: Cambridge Univ. Press, 1990
27. Nishizawa, M., Ueno, K.: Integral solutions of q-difference equations of the hypergeometric type with |q| = 1. q-alg/9612014
28. Ruijsenaars, S.: First order analytic difference equations and integrable quantum systems. J. Math. Phys. 38, 1069–1146 (1997)
29. Kharchev, S., Lebedev, D.: Integral representations for the eigenfunctions of quantum open and periodic Toda chains from QISM formalism. J. Phys. A34, 1–12 (2001); hep-th/0007040
30. Suris, Yu.: Discrete time generalized Toda lattices: Complete integrability and relation with relativistic Toda lattices. Phys. Lett. 145, 113–119 (1990)
31. Reshetikhin, S.: Multiparameter quantum groups and twisted quasitriangular Hopf algebras. Lett. Math. Phys. 20, 331–335 (1990)
32. Sklyanin, E.: The quantum Toda chain. Lect. Notes in Phys. 226, 196–233 (1985)
33. Gutzwiller, M.: The quantum mechanical Toda lattice II. Ann. of Phys. 133, 304–331 (1981)
34. Kundu, A.: Generation of a quantum integrable class of discrete-time or relativistic periodic Toda chains. hep-th/9403001
35. Pasquier, V., Gaudin, M.: The periodic Toda chain and a matrix generalization of the Bessel function recursion relations. J. Phys. A25, 5243–5252 (1992)
36. Kharchev, S., Lebedev, D.: Integral representation for the eigenfunctions of a quantum periodic Toda chain. Lett. Math. Phys. 50, 53–77 (1999); hep-th/9910265
37. Kuznetsov, V., Tsiganov, A.: Separation of variables for the quantum relativistic Toda lattices. hep-th/9402111
38. Smirnov, F.: Dual Baxter equations and quantization of Affine Jacobian. J. Phys. A33, 3385–3405 (2000); math-ph/0001032
39. Jimbo, M., Miwa, T.: QKZ equation with |q| = 1 and correlation functions of the XXZ model in the gapless regime. J. Phys. A29, 2923–2958 (1996); hep-th/9601135
40. Ruijsenaars, S.: On Barnes’ multiple zeta and gamma functions. Adv. in Math. 156, 107–132 (2000)

Communicated by A. Connes

Commun. Math. Phys. 225, 611 – 632 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

A Pregenerator for Burgers Equation Forced by Conservative Noise

Sigurd Assing

Department of Mathematics and Statistics, University of Edinburgh, JCMB, King’s Buildings, Mayfield Road, Edinburgh EH9 3JZ, UK. E-mail: [email protected]

Received: 9 March 2001 / Accepted: 10 October 2001

Abstract: We consider the stationarity of a Burgers equation with an external random force of gradient type in one space dimension. The expected stationary measure is the white noise measure on the space of tempered distributions. As a consequence, the nonlinearity of the formal equation u_t + λ u u_x = ν u_xx + η_x is ill-defined. By introducing a pregenerator, we can formulate a generalized martingale problem leading to a meaningful version of the formal equation, which was an open problem.

1. The Problem

Since its introduction by A. R. Forsyth [13] in 1906, the so-called Burgers equation u_t + λ u u_x = ν u_xx has held a steady interest in both mathematics and physics (see for example [30] and the references therein). The linear viscous dissipation term with parameter ν > 0 on the right-hand side was added in order to soften shock wave phenomena, and Burgers himself studied the equation in [10, 11] for random initial data. The parameter λ controls the strength of the nonlinearity. We are mainly interested in the stationary regimes in Burgers’ turbulence but, because dissipation leads to a decay of turbulence, we need a supply of energy from outside. So, our object of interest is a multidimensional stochastic partial differential equation of Burgers’ type

\[
\frac{\partial}{\partial t} u + \lambda (u \cdot \nabla) u = \nu \Delta u + f(t, x), \quad t \ge 0,\ x \in \mathbb{R}^d,
\tag{B}
\]

While visiting USC Los Angeles, the research of the author was supported by the DFG project “Existenz, Eindeutigkeit und Statistik unendlich-dimensionaler Langevin-Gleichungen”.


where ∇ denotes the space gradient, Δ the Laplacian, and the necessary input of energy is provided by the external random force field f. In applications, it is often natural to assume that u is generated by a potential h, that is, u(t, ·) = ∇h(t, ·), t ≥ 0. As a consequence, we also have the identity

\[
\lambda (u \cdot \nabla) u = \frac{\lambda}{2} \nabla u^2
\tag{1}
\]

in dimensions d ≥ 2, and the nonforced equation conserves the integral ∫_{R^d} u(t, x) dx in time, which is a desired property. But, having Eq. (B) for u in mind, for this property it is necessary by formal calculations that

\[
\frac{d}{dt} \int_{\mathbb{R}^d} f(t, x)\, dx = 0, \quad t \ge 0.
\]

So, in what follows, we assume that the random force field f is also generated by a potential, i.e.

\[
f(t, \cdot) = \nabla \eta(t, \cdot), \quad t \ge 0,
\tag{2}
\]

for some random field η(t, x), t ≥ 0, x ∈ R^d. Note that this structure of the forcing term is what distinguishes our setting from the nonconservatively driven Burgers equation, which is also treated extensively in the literature. Under these assumptions, the potential h generating u should satisfy the KPZ equation

\[
\frac{\partial}{\partial t} h + \frac{\lambda}{2} (\nabla h)^2 = \nu \Delta h + \eta
\tag{KPZ}
\]

for growing interfaces (see [20]). Here the random potential η is supposed to satisfy the property of scale invariance, i.e. in the sense of probability distributions

\[
b^{(z+d)/2}\, \eta(b^z t, b x) = \eta(t, x), \quad t \ge 0,\ x \in \mathbb{R}^d,
\tag{3}
\]

for arbitrary b, z > 0, because then statistically self-similar solutions to (KPZ) are expected, which are important (see [22]). The most popular example of such a random potential, which we always have in mind in what follows, is the space-time white noise, where η denotes a Gaussian field with mean zero and covariance structure ⟨η(t, x) η(t′, x′)⟩ = γ δ(t − t′) δ(x − x′), with γ > 0 a constant. So much for the physical motivation of the structure of our Eq. (B), to which we now return. From a rigorous mathematical point of view, the first problem is that if the random potential η generating f is the space-time white noise then, already in one space dimension, solutions to equations structured like (B) do not take values in a function space but in a generalized function space. As a consequence, Eq. (B) is ill-posed, because the usual product makes no sense for arbitrary generalized functions. Similarly, the nonlinearity in the KPZ equation (KPZ) is not defined in any space dimension. Although in the physical literature people formally multiply generalized functions as usual, in combination with taking expectations, the rigorous way to overcome the above problem should be the redefinition of the “ill” terms of the equation. This has been done in [3, 16, 18] using the Wick calculus in Gaussian analysis. Here, solutions to (B)


are treated as random fields with parameters t ≥ 0 and x ∈ R^d. However, the price is that now, for each fixed (t, x), the “random variable” u(t, x) is a stochastic distribution in the sense of T. Hida or even Y. G. Kondratiev. So, for a rigorously defined nonlinearity, the Wick product ◊ has been applied, leading to a Wick-type Burgers equation for the components u_k of u, k = 1, . . . , d:

\[
\frac{\partial u_k}{\partial t} + \lambda \sum_{j=1}^{d} u_j \diamond \frac{\partial u_k}{\partial x_j} = \nu \Delta u_k + f_k(t, x), \quad t \ge 0,\ x \in \mathbb{R}^d.
\tag{WB}
\]

Of course, this type of equation is interesting in itself, but the concrete relation to the physical problem behind Burgers’ equation still remains to be clarified. Let us stress the last point a bit more in what follows. If the space dimension d is equal to one then, using (1) and (2), Eq. (B) formally reads

\[
\frac{\partial}{\partial t} u + \frac{\lambda}{2} \frac{\partial}{\partial x} u^2 = \nu \frac{\partial^2}{\partial x^2} u + \frac{\partial}{\partial x} \eta(t, x), \quad t \ge 0,\ x \in \mathbb{R},
\tag{4}
\]

and, for λ = 0, the corresponding linear equation admits a unique stationary solution with values in the space of tempered distributions S′(R). Its invariant measure is the Gaussian measure µ on S′(R) with characteristic functional

\[
\int \exp\{i\phi(l)\}\, \mu(d\phi) = \exp\Bigl\{-\frac{\gamma}{4\nu}\, \|l\|^2_{L^2(\mathbb{R})}\Bigr\}, \quad l \in S(\mathbb{R}).
\]

Now, it is well known in the physical literature that this measure should also be invariant under the flow generated by the solutions of

\[
\frac{\partial}{\partial t} u + \frac{\lambda}{2} \frac{\partial}{\partial x} u^2 = 0, \quad t \ge 0,\ x \in \mathbb{R}.
\]

The reason for this claim is a calculation combining properties of the usual product for the nonlinearity with taking expectations (see [22] for example). But the measure µ is carried only by singular distributions in S′(R), hence the usual product is hardly applicable. Nevertheless, the argument is a good indication of µ being a possible equilibrium point for the conservative forced Burgers equation in one space dimension. On the other hand, there are also physical arguments in contradiction to the stationarity of µ with respect to Eq. (4) for λ ≠ 0 if u² is defined by the usual product, and this has brought J. Krug and H. Spohn in [22] to the question of how well-defined the equation is mathematically, leading again to the problem described above. As an interesting way out, Ya. G. Sinai [26] could show the existence of an invariant measure for Eq. (4) by changing not the structure (2) but the irregularity of the random force field ∂η(t, x)/∂x, t ≥ 0, x ∈ R. His random potentials η̃(t, x) are white in t ≥ 0 and both C²-smooth and periodic in x ∈ R. The corresponding invariant measures are of course different from µ, but he can stay with the usual product u² for the nonlinearity. It is remarkable that W. E, K. Khanin, A. Mazel, Ya. G. Sinai [12] even obtain invariant measures for Eq. (4) with η̃ instead of η if ν = 0. Unfortunately, the nice property (3) does not hold true for the random potentials η̃ defined in [12], resp. [26].
Letting η be unchanged, the nonlinearity has to be redefined, and from the ideas given in [5] by L. Bertini and G. Giacomin it follows that one should apply some kind of Wick-renormalized power :u²:_µ with respect to µ instead of the usual product u² in


(4). This result has been obtained by approximating Eq. (4) (see also [4]). The authors cannot derive a limiting equation but, solving the corresponding equation on each level of the approximation and taking a certain limit of this sequence of solutions, they get an S′(R)-valued stochastic process which is even stationary if it starts from µ. So, the result in [5] supports the claim of µ being an invariant measure with respect to (4), and it gives a hint how to define the nonlinearity of the equation rigorously. Here we should mention that the known method to construct Wick-renormalized powers (see [15, 25] for example) does not work with respect to our Gaussian measure µ because its covariance is too “bad”. It is one result of our paper that this method can be generalized to a much wider class of Gaussian measures. Applying the generalized Wick-renormalized power, we define an operator L with the domain

\[
\mathcal{P} = \{F : S'(\mathbb{R}) \to \mathbb{R} \;:\; F(\phi) = f(\phi(l_1), \dots, \phi(l_n)),\ l_1, \dots, l_n \in S(\mathbb{R}),\ f \text{ polynomial},\ n = 1, 2, \dots\}
\]

on the space of Hida distributions (S)∗ with respect to µ, leading to a martingale problem for Eq. (4). Hence, in a weak sense, we are able to give a meaningful version of the formal Eq. (4). This version only admits µ as the initial distribution for (4), and the solutions have to be S′(R)-valued stochastic processes (u_t)_{t≥0} whose one-dimensional distributions are µ-stationary, i.e. u_t ∼ µ, t ≥ 0. Such restrictions for solutions of stochastic differential equations are also well known with respect to the stochastic quantization of the Euclidean quantum field theory (see [19]). Note that the nonlinearity defined by the generalized Wick-renormalized power is different from the nonlinearity given by the Wick product used in (WB) (see the Appendix for a more detailed view). The operator (L, P) is not closed, which is why we call it a pregenerator for (4).
We do not know if (L, P) is closable or if there is a closed extension generating a semigroup describing the transition probabilities of a possible Markovian solution to our martingale problem. But we can show that the Gaussian measure µ solves the elliptic equation for stationary measures L∗µ = 0 in the sense of

\[
{}_{(S)}\langle 1, LF \rangle_{(S)^*} = 0, \quad \forall\, F \in \mathcal{P}.
\]

As a consequence, the operator (L, P) should be seen as a µ-divergence free perturbation (see [2]) of the operator core (L0, P) for the generator of the unique strong Markov solution to the corresponding linear problem of (4), where λ = 0. The above elliptic equation itself is of strong interest; we refer to the recent papers [1, 7–9] for existence and uniqueness of solutions. However, it is not known if µ is a unique solution in our case. We have to point out that we do not have a solution for our martingale problem describing (4). But there is a good candidate. It remains to be shown that the limit process constructed by L. Bertini and G. Giacomin in [5], Appendix B, is a solution. We finally want to mention that in d ≥ 2 space dimensions there is no knowledge about invariant measures with respect to the Burgers’ type Eq. (B) with a random forcing term f given by the gradient of a space-time white noise. Unfortunately, our Gaussian measure µ, now considered on S′(R^d) with analogous characteristic functional, is no longer invariant with respect to the corresponding linear problem of (B), where λ = 0.


2. A Generalization of Wick-Renormalized Powers

Let S′(R^d) be the space of real-valued tempered distributions on R^d, d ≥ 1, which consists of the continuous linear functionals on the space S(R^d) of rapidly decreasing functions equipped with Schwartz’ topology. A typical element of S′(R^d) is denoted by φ, resp. φ1, φ2, . . . , and if a linear functional φ ∈ S′(R^d) is applied to a test function l ∈ S(R^d) we write φ(l). We consider mean zero Gaussian measures µ on S′(R^d) whose covariance is given by the inner product of a real separable Hilbert space H including S(R^d) as a continuously embedded dense subspace, so that

\[
\int \phi(l)\, \phi(l')\, \mu(d\phi) = (l, l')_H, \quad l, l' \in S(\mathbb{R}^d).
\]

Assume that H has an orthonormal basis {e_i}_{i=1}^∞ ⊆ S(R^d) and introduce the Hermite polynomials

\[
H_n(x) = (-1)^n \exp\Bigl(\frac{x^2}{2}\Bigr) \frac{d^n}{dx^n} \exp\Bigl(-\frac{x^2}{2}\Bigr), \quad x \in \mathbb{R},\ n = 0, 1, 2, \dots.
\]

For a multi-index α = (α1, α2, . . . ), αi ∈ N, i = 1, 2, . . . , |α| = Σ_{i=1}^∞ αi < ∞, we set

\[
H_\alpha(\phi) = \prod_{i=1}^{\infty} H_{\alpha_i}(\phi(e_i)), \quad \phi \in S'(\mathbb{R}^d).
\]

Then it is well known in Gaussian analysis (see [17] for example) that the linear hull P_{e_i} = Lin{H_α} of the system {H_α} indexed by the set of multi-indices α is dense in L²(µ), and the set P_{e_i} will be our chosen smooth domain in L²(µ) during this section. In particular, any F ∈ L²(µ) is uniquely determined by a sequence of real numbers {c_α} such that

\[
F = \sum_{\alpha} c_\alpha H_\alpha
\quad\text{and}\quad
\|F\|^2_{L^2(\mu)} = \sum_{\alpha} \alpha!\, c_\alpha^2,
\quad\text{where}\quad
\alpha! = \prod_{i=1}^{\infty} \alpha_i!\,.
\]
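The factor α! in the norm formula reflects the one-dimensional orthogonality relation E[H_n(X) H_m(X)] = n! δ_{nm} for a standard Gaussian variable X. The following sketch checks this with exact integer arithmetic; the helper names (`hermite`, `gaussian_moment`, `inner`) are illustrative, and the polynomials are built from the standard three-term recurrence H_{k+1}(x) = x H_k(x) − k H_{k−1}(x), which is equivalent to the derivative formula used in the text.

```python
def hermite(n):
    """Coefficients (lowest degree first) of the Hermite polynomial H_n
    with respect to the weight exp(-x^2/2)."""
    prev, cur = [1], [0, 1]          # H_0 = 1, H_1 = x
    if n == 0:
        return prev
    for k in range(1, n):
        # H_{k+1}(x) = x * H_k(x) - k * H_{k-1}(x)
        nxt = [0] + cur
        for i, c in enumerate(prev):
            nxt[i] -= k * c
        prev, cur = cur, nxt
    return cur


def gaussian_moment(k):
    """E[X^k] for X ~ N(0, 1): zero for odd k, (k - 1)!! for even k."""
    if k % 2:
        return 0
    result = 1
    for j in range(k - 1, 0, -2):
        result *= j
    return result


def inner(n, m):
    """Exact value of E[H_n(X) H_m(X)] for X ~ N(0, 1)."""
    a, b = hermite(n), hermite(m)
    return sum(ca * cb * gaussian_moment(i + j)
               for i, ca in enumerate(a)
               for j, cb in enumerate(b))


print(hermite(3))                  # coefficients of x^3 - 3x
print(inner(3, 3), inner(2, 3))    # 3! and 0
```

Because the coordinates φ(e_i) are independent standard Gaussian variables under µ, the product structure of H_α turns these one-dimensional values into the weight α! = ∏ α_i!.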

Return to the problem of defining “reasonable powers” of tempered distributions φ ∈ S′(R^d). By a power we mean an integer m ≥ 2; the trivial linear case is omitted. Of course, if φ ∈ L^m_loc(R^d), then φ^m(x) := φ(x)^m, x ∈ R^d, defines an element of S′(R^d), so that powers are well-defined at least on a subset of S′(R^d). Now, we have a measure µ on (S′(R^d), B^µ), where B^µ denotes the completion with respect to µ of the σ-algebra generated by the cylinder subsets of S′(R^d). If the set L^m_loc(R^d) is of µ-measure one and we can neglect sets of µ-measure zero, then the above definition of a power works very well. But in important applications L^m_loc(R^d) even has µ-measure zero, and one needs another reasonable definition.


First one could think of finding a definition which works on a special set of µ-measure one as above. This is sometimes possible (see Example a) below), but it does not match the application we have in mind – the Burgers equation forced by conservative noise. So, using ideas borrowed from quantum field theory (see B. Simon [25] for a good introduction to the probabilistic counterpart), we extend our framework as follows. Interpreting (S′(R^d), B^µ, µ) as a probability space, we denote by X a space of random variables on (S′(R^d), B^µ, µ); for example, let X = L²(µ).

Definition 1. (i) A linear mapping Φ : S(R^d) → X is said to be a random-variable-valued distribution. (ii) Let Φ be a random-variable-valued distribution. For 1 ≤ i ≤ d we call the random-variable-valued distribution ∂Φ/∂x_i defined by

\[
\Bigl(\frac{\partial}{\partial x_i} \Phi\Bigr)(l) = -\Phi\Bigl(\frac{\partial}{\partial x_i} l\Bigr), \quad l \in S(\mathbb{R}^d),
\]

a partial derivative of Φ.

Remark 1. a) Our powers of tempered distributions φ ∈ S′(R^d) will be random-variable-valued distributions Φ and not elements of S′(R^d), in general. We do not call Φ a stochastic process indexed by S(R^d) because it is supposed to be applied to the states of a stochastic process as in quantum field theory. b) The spaces X we have to consider can also contain generalized random variables in a sense we explain later.

As we have seen above, the definition of powers on S′(R^d) strongly depends on properties of the underlying measure µ. As a consequence we restrict ourselves to a subclass of the Gaussian measures we introduced at the beginning of this section. More precisely, let the inner product of H be given by a covariance operator C on L²(R^d) with domain S(R^d), that is, (C, S(R^d)) is symmetric, positive definite (i.e. (Cl, l)_{L²(R^d)} > 0 for l ∈ S(R^d) \ {0}) and H corresponds to the completion of S(R^d) with respect to the norm ‖·‖_H defined by ‖l‖²_H = (Cl, l)_{L²(R^d)}, l ∈ S(R^d). Finally, the restricting property of C, always assumed in what follows, is

\[
C\{e_i\}_{i=1}^{\infty} \subseteq C_b(\mathbb{R}^d) = \{f : \mathbb{R}^d \to \mathbb{R} \text{ continuous and bounded}\}.
\tag{5}
\]

A number of interesting examples for µ go back to essentially self-adjoint covariance operators (C, S(R^d)) on L²(R^d), of which the square root √C admits a representation by a kernel C̃(x, y), x, y ∈ R^d, satisfying

\[
\int \tilde C(x, y)^2\, dy < \infty, \quad \forall\, x \in \mathbb{R}^d.
\tag{6}
\]

In this case, C itself also admits a representation by a kernel which is given by

\[
C(x, y) = \int \tilde C(x, z)\, \tilde C(z, y)\, dz, \quad x, y \in \mathbb{R}^d,
\tag{7}
\]


and, following B. Simon [25], for fixed m ≥ 2 we are able to define a random-variable-valued distribution :Φ^m: : S(R^d) → L²(µ) which can be viewed as a reasonable mth power on S′(R^d) if this kernel of C additionally satisfies

\[
|l|^2 := \Bigl|\, \iint l(x)\, C(x, y)^m\, l(y)\, dx\, dy \,\Bigr| < \infty, \quad \forall\, l \in S(\mathbb{R}^d).
\tag{8}
\]

Proposition 1 (Compare Prop. V.1 in [25]). Let the measure µ be given by an essentially self-adjoint covariance operator (C, S(R^d)) on L²(R^d) admitting a representation by a kernel which satisfies (6), (7), (8). Then there is a unique random-variable-valued distribution :Φ^m: : S(R^d) → L²(µ) such that for all multi-indices α and test functions l on R^d

\[
\int H_\alpha\, {:}\Phi^m{:}(l)\, d\mu =
\begin{cases}
m! \displaystyle\int \prod_{i=1}^{\infty} \bigl(Ce_i(x)\bigr)^{\alpha_i}\, l(x)\, dx & : |\alpha| = m,\\[8pt]
0 & : \text{else}.
\end{cases}
\tag{9}
\]

In particular, it holds that

\[
\|{:}\Phi^m{:}(l)\|_{L^2(\mu)} \le \sqrt{m!}\; |l|, \quad l \in S(\mathbb{R}^d).
\]

Remark 2. a) For every l ∈ S(R^d), (9) defines a linear functional on P_{e_i}. The conditions (6), (7) and (8) ensure that this linear functional is continuous with respect to the L²(µ)-norm. In fact, (6), (7) yield

\[
\sum_{i=1}^{\infty} Ce_i(x)\, Ce_i(y) = C(x, y), \quad x, y \in \mathbb{R}^d,
\]

and the proof of Proposition 2 below will show how to apply (8). Since P_{e_i} is dense in L²(µ), the functional can be uniquely extended to a continuous linear functional on L²(µ) and, applying Riesz’ Theorem, it is identified with an element of L²(µ) denoted by :Φ^m:(l). The conditions (6), (7) are only weak assumptions but they do not appear in Prop. V.1, [25]. However, we do not see how the single condition (8) should ensure the continuity of the linear functional. Finally, it follows from the estimate given in Proposition 1 that :Φ^m: is some kind of a continuous random-variable-valued distribution. b) We want to emphasize that, assuming (5), the following easy calculation holds for a multi-index α with |α| = m and l ∈ S(R^d) fixed:

\[
\begin{aligned}
\int H_\alpha\, {:}\Phi^m{:}(l)\, d\mu
&= m! \int \prod_{i=1}^{\infty} \bigl(Ce_i(x)\bigr)^{\alpha_i}\, l(x)\, dx\\
&= \lim_{\varepsilon \downarrow 0}\, m! \int \prod_{i=1}^{\infty} \Bigl(\int Ce_i(y)\, \delta_{\varepsilon,x}(y)\, dy\Bigr)^{\alpha_i} l(x)\, dx\\
&= \lim_{\varepsilon \downarrow 0} \int \Bigl(\int H_\alpha(\phi)\, {:}\phi(\delta_{\varepsilon,x})^m{:}\; \mu(d\phi)\Bigr)\, l(x)\, dx\\
&= \lim_{\varepsilon \downarrow 0} \int H_\alpha(\phi) \Bigl(\int {:}\phi(\delta_{\varepsilon,x})^m{:}\; l(x)\, dx\Bigr)\, \mu(d\phi),
\end{aligned}
\]


where the test functions

\[
\delta_{\varepsilon,x}(y) = \frac{1}{\sqrt{2\pi\varepsilon^2}} \exp\Bigl(-\frac{\|x - y\|^2_{\mathbb{R}^d}}{2\varepsilon^2}\Bigr), \quad y \in \mathbb{R}^d,
\]

approximate Dirac’s delta-distribution δ_x for each x ∈ R^d, and

\[
{:}\phi(l)^m{:} \;=\; \|l\|_H^m\, H_m\Bigl(\frac{\phi(l)}{\|l\|_H}\Bigr), \quad l \in S(\mathbb{R}^d),
\tag{10}
\]

denotes the mth Wick power of the Gaussian random variable φ → φ(l) on (S′(R^d), B^µ, µ). This is the reason why :Φ^m: can be called a Wick-renormalized power. B. Simon assumed a stronger condition than (5) to establish an equivalent calculation as above (see Remark 4a) below). c) The claim of :Φ^m: to make sense as a power on S′(R^d) has been justified by meaningful applications (see the examples below as well as [15, 25] and the references therein).

Examples. a) Let µ be the Euclidean free field measure on S′(R^d) given by the covariance operator C = (1 − Δ)^{−1}, where Δ denotes the closure in L²(R^d) of the Laplace operator (Δ, S(R^d)). In every dimension d ≥ 1, C is determined by a locally integrable kernel C(x, y), x, y ∈ R^d, and condition (5) is satisfied. In dimension d = 1, the subsets L^m_loc(R) ⊆ S′(R) have µ-measure one, which is why there is no need to consider Wick-renormalized powers. But in all dimensions d ≥ 2, the corresponding measure µ does not carry regular distributions in S′(R^d). If d = 2 then (8) is satisfied and Wick-renormalized powers can be defined for all m ∈ N. In this case, there is a whole family of estimates similar to the one given in Proposition 1, namely

\[
\|{:}\Phi^m{:}(l)\|_{L^p(\mu)} \le c\, (p - 1)^{m/2}\, \|l\|_{L^{1+\varepsilon}(\mathbb{R}^2)}, \quad l \in S(\mathbb{R}^2),
\]

for all p ≥ 2 and ε ∈ (0, 1], where the constant c depends only on ε, m. Using these estimates, one can show (see [23]) that the random-variable-valued distribution :Φ^m: µ-a.s. is a tempered distribution in S′(R²). If d = 3 then (8) is only satisfied for m = 2 and, in all dimensions d ≥ 4, (8) does not hold anymore (recall m ≥ 2 by assumption). The case d = 2 is interesting insofar as here the Wick-renormalized powers have successfully been applied to define interacting quantum fields. We should mention that, for the same reason, in 3 dimensions a meaningful fourth power has been defined based on a renormalization more complicated than the simple Wick renormalization (see [14]). This fourth power is no longer a random-variable-valued distribution since the linearity in l ∈ S(R³) is missing. b) Let µ be the so-called white noise on S′(R^d), i.e. µ is defined by the identity operator C = Id. Then the covariance operator C is not determined by a locally integrable kernel, but (5) is satisfied. Nevertheless, we are able to generalize the ideas behind Proposition 1 with respect to the white noise measure µ. This will be important for the conservative forced Burgers equation. Similarly, we can define generalized Wick-renormalized powers for those covariance operators in Example a) above where condition (8) is not satisfied. Here, a possible connection of the generalized Wick-renormalized powers and the fourth power defined in [14] remains to be clarified.
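To make the Wick power (10) concrete in the simplest setting, consider a single centered Gaussian variable X with variance σ² playing the role of φ(l) with ‖l‖²_H = σ². The sketch below builds :X^m: = σ^m H_m(X/σ) as a polynomial in X and checks, with exact integer arithmetic, that it has mean zero and second moment m! σ^{2m}. All function names and the sample variance are illustrative assumptions and not part of the paper.

```python
from math import factorial


def hermite(n):
    """Coefficients (lowest degree first) of H_n for the weight exp(-x^2/2)."""
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for k in range(1, n):
        nxt = [0] + cur                 # x * H_k
        for i, c in enumerate(prev):
            nxt[i] -= k * c             # minus k * H_{k-1}
        prev, cur = cur, nxt
    return cur


def wick_power(m, var):
    """Coefficients in x of sigma^m * H_m(x / sigma), with sigma^2 = var.
    Only terms with m - k even are nonzero, so var ** ((m - k) // 2) is exact."""
    return [c * var ** ((m - k) // 2) for k, c in enumerate(hermite(m))]


def moment(k, var):
    """E[X^k] for X ~ N(0, var): zero for odd k, (k - 1)!! * var^(k/2) otherwise."""
    if k % 2:
        return 0
    dd = 1
    for j in range(k - 1, 0, -2):
        dd *= j
    return dd * var ** (k // 2)


def expect_product(p, q, var):
    """E[p(X) q(X)] for polynomials given by coefficient lists."""
    return sum(a * b * moment(i + j, var)
               for i, a in enumerate(p)
               for j, b in enumerate(q))


var = 3                                  # hypothetical value of ||l||_H^2
w2, w4 = wick_power(2, var), wick_power(4, var)
print(w2)                                # x^2 - var
print(expect_product(w4, [1], var))      # mean of :X^4:
print(expect_product(w4, w4, var), factorial(4) * var ** 4)
```

For m = 2 this reproduces the familiar subtraction :X²: = X² − E[X²]; the renormalization in the text amounts to removing such expectation terms, which diverge as ε ↓ 0 when X = φ(δ_{ε,x}).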


We continue by defining generalized Wick-renormalized powers on S′(R^d) with respect to an arbitrary Gaussian measure µ given by a covariance operator (C, S(R^d)) satisfying (5). Fix m ≥ 2 and l ∈ S(R^d). From the calculation made in Remark 2b) we know that for all multi-indices α with |α| = m,

\[
m! \int \prod_{i=1}^{\infty} \bigl(Ce_i(x)\bigr)^{\alpha_i}\, l(x)\, dx
= \lim_{\varepsilon \downarrow 0} \int H_\alpha(\phi) \Bigl(\int {:}\phi(\delta_{\varepsilon,x})^m{:}\; l(x)\, dx\Bigr)\, \mu(d\phi)
\tag{11}
\]

holds true. In analogy to Remark 2a), in a first step we define :Φ^m:(l) to be the linear functional on P_{e_i} given by

\[
\langle H_\alpha,\, {:}\Phi^m{:}(l) \rangle =
\begin{cases}
m! \displaystyle\int \prod_{i=1}^{\infty} \bigl(Ce_i(x)\bigr)^{\alpha_i}\, l(x)\, dx & : |\alpha| = m,\\[8pt]
0 & : \text{else},
\end{cases}
\tag{12}
\]

where the brackets on the left-hand side mean that the linear functional :Φ^m:(l) is applied to the element H_α of P_{e_i}. If now there is a locally convex topological space X of random variables on (S′(R^d), B^µ, µ) densely containing P_{e_i} such that :Φ^m:(l) is a continuous linear functional on P_{e_i}, then :Φ^m:(l) can be identified with an element of X′ (the topological dual of X). In this case, the brackets ⟨· , ·⟩ in (12) correspond with the dual pairing _X⟨· , ·⟩_{X′}. Moreover, the mapping

\[
S(\mathbb{R}^d) \ni l \mapsto {:}\Phi^m{:}(l) \in X'
\]

is linear, leading to a random-variable-valued distribution :Φ^m:. We are allowed to call it a generalized Wick-renormalized power because of (11). Of course, depending on the properties of (C, S(R^d)), one can always find such a space X, but its topology might be so coarse that the corresponding dual space X′ is a space of generalized random variables. More precisely, looking for appropriate spaces X, resp. X′, the literature provides large spaces of generalized random variables; for example see [17, 21] and the references therein. For our convenience we point out the following construction we found in [17]. Let (A, D(A)) be a self-adjoint operator on H with discrete spectrum 1 < Λ1 ≤ Λ2 ≤ . . . , and eigenbasis {e_i}_{i=1}^∞ such that Σ_{i=1}^∞ Λ_i^{−a} < ∞ for some positive constant a. Define on P_{e_i} the family of norms (‖·‖_{2,p,A})_{p∈N} by

\[
\Bigl\| \sum_{j=1}^{N} c_{{}^{j}\alpha}\, H_{{}^{j}\alpha} \Bigr\|_{2,p,A}
= \Bigl\| \sum_{j=1}^{N} c_{{}^{j}\alpha}\, \Gamma(A^p) H_{{}^{j}\alpha} \Bigr\|_{L^2(\mu)},
\]

where

\[
[\Gamma(A^p) H_{{}^{j}\alpha}](\phi) := \prod_{i=1}^{\infty} {:}\phi(A^p e_i)^{{}^{j}\alpha_i}{:} \quad \text{for } \mu\text{-a.e. } \phi \in S'(\mathbb{R}^d)
\]

for the multi-indices ^jα, j = 1, . . . , N (note that H_{^jα_i}(φ(e_i)) = :φ(e_i)^{^jα_i}: by (10)). Now, let (L²(µ))_{p,A} denote the completion of P_{e_i} with respect to the corresponding norm ‖·‖_{2,p,A} and define

\[
(L^2(\mu))_{\infty,A} = \bigcap_{p \in \mathbb{N}} (L^2(\mu))_{p,A}
\]


to be the projective limit of the Hilbert spaces $(L^2(\mu))_{p,A}$. Then, $(L^2(\mu))_{\infty,A}$ can be understood as a space of random test functions, and its dual with respect to $L^2(\mu)$, which we denote by $(L^2(\mu))_{-\infty,A}$, is a large space of generalized random variables.

Remark 3. Our conditions on the spectrum of $(A, D(A))$ have two important consequences: The space $(L^2(\mu))_{\infty,A}$ is both a nuclear space and an algebra under pointwise multiplication (see [17] and the references therein). But, motivated by Proposition 1, we want to consider the spaces $X$, resp. $X'$, to be "as close as possible to $L^2(\mu)$" (see also Remark 4b)) which is why we choose $X = (L^2(\mu))_{1,A}$. Thus, our test functions can be represented by those $F \in L^2(\mu)$ satisfying

$$\|F\|_{2,1,A}^2 = \sum_{\alpha} (\Lambda^\alpha)^2 (\alpha!)^{-1} \Big( \int F H_\alpha \,d\mu \Big)^2 < \infty, \quad \text{where } \Lambda^\alpha = \prod_{i=1}^{\infty} \Lambda_i^{\alpha_i}.$$

And, the dual space $X' = (L^2(\mu))_{-1,A}$ with respect to $L^2(\mu)$ is the completion of $\mathcal{P}_{\{e_i\}}$ with respect to the norm

$$\|F\|_{2,-1,A} = \Big( \sum_{\alpha} (\Lambda^\alpha)^{-2} (\alpha!)^{-1} \Big( \int F H_\alpha \,d\mu \Big)^2 \Big)^{1/2}, \quad F \in \mathcal{P}_{\{e_i\}}. \qquad (13)$$

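In what follows, ${}_{\infty}\langle \cdot , \cdot \rangle_{-\infty}$ always means dual pairing between $(L^2(\mu))_{\infty,A}$ and $(L^2(\mu))_{-\infty,A}$ which coincides on $\mathcal{P}_{\{e_i\}} \times (L^2(\mu))_{-1,A}$ with the dual pairing between $(L^2(\mu))_{1,A}$ and $(L^2(\mu))_{-1,A}$.

Proposition 2. Let the measure $\mu$ on $S'(\mathbb{R}^d)$ be given by a covariance operator $(C, S(\mathbb{R}^d))$ on $L^2(\mathbb{R}^d)$ satisfying (5) and assume

$$\sum_{i=1}^{\infty} \Lambda_i^{-2} \sup_{x \in \mathbb{R}^d} |Ce_i(x)|^2 < \infty. \qquad (14)$$

Then, for every $m \ge 2$, there exists a unique random-variable-valued distribution $:\Phi^m:\; S(\mathbb{R}^d) \to (L^2(\mu))_{-1,A}$ such that (12) with $\langle \cdot , \cdot \rangle$ replaced by ${}_{\infty}\langle \cdot , \cdot \rangle_{-\infty}$ holds true for all multi-indices $\alpha$ and test functions $l \in S(\mathbb{R}^d)$. Moreover, we have

$$\| :\Phi^m:(l) \|_{2,-1,A} \le \sqrt{m!}\, \Big( \sum_{i=1}^{\infty} \Lambda_i^{-2} \sup_{x \in \mathbb{R}^d} |Ce_i(x)|^2 \Big)^{m/2} \|l\|_{L^1(\mathbb{R}^d)}. \qquad (15)$$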


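Proof. It suffices to show (15). Fix $l \in S(\mathbb{R}^d)$. From (12) and (13) follows

$$\begin{aligned}
\| :\Phi^m:(l) \|_{2,-1,A}^2 &= \sum_{|\alpha|=m} (\Lambda^\alpha)^{-2} (\alpha!)^{-1} \Big( m! \int \prod_{i=1}^{\infty} Ce_i(x)^{\alpha_i}\, l(x)\,dx \Big)^2 \\
&= m! \sum_{|\alpha|=m} \frac{m!}{\alpha!} \prod_{i=1}^{\infty} \Lambda_i^{-2\alpha_i} \Big( \int \prod_{i=1}^{\infty} Ce_i(x)^{\alpha_i}\, l(x)\,dx \Big)^2 \\
&= m!\, \Big| \iint dx\,dy\; l(x)\,l(y) \sum_{|\alpha|=m} \frac{m!}{\alpha!} \prod_{i=1}^{\infty} \Lambda_i^{-2\alpha_i} \big[ Ce_i(x)\,Ce_i(y) \big]^{\alpha_i} \Big| \\
&= m!\, \Big| \iint l(x) \Big( \sum_{i=1}^{\infty} \Lambda_i^{-2}\, Ce_i(x)\,Ce_i(y) \Big)^m l(y)\,dx\,dy \Big| \\
&\le m!\, \Big( \sum_{i=1}^{\infty} \Lambda_i^{-2} \sup_{x \in \mathbb{R}^d} |Ce_i(x)|^2 \Big)^m \Big( \int |l(x)|\,dx \Big)^2 .
\end{aligned}$$

This finishes the proof of the proposition. □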

Definition 2. The random-variable-valued distribution $:\Phi^m:$ which exists under the assumptions of Proposition 2 is called the $m$th generalized Wick-renormalized power on $S'(\mathbb{R}^d)$ with respect to $\mu$, $m \ge 2$.

Remark 4. a) The main reason for condition (5) is to establish (11) justifying the name "Wick-renormalized power" for $:\Phi^m:$. We think that one can hardly find a weaker sufficient condition for (11). Compare (5) with the following condition used by B. Simon in Th. V.3, [25], to get (11). He assumes that for every $l \in S(\mathbb{R}^d)$,

$$\int \phi(l)^2 \,\mu(d\phi) = \frac{1}{2} \int a(k)\, |\hat{l}(k)|^2 \,dk,$$

where $\hat{l}$ denotes the Fourier transform of $l$ and the real function $a$ satisfies

$$|a(k)| \le \mathrm{Const}\, (1 + |k|^2)^{-d/2}, \quad k \in \mathbb{R}^d.$$

Remark that in our Example b) above, B. Simon's condition does not hold.

b) It is more technical that (5) can also be applied to formulate condition (14) in Proposition 2. Finally, condition (14) somehow controls how "close" to $L^2(\mu)$ the space $(L^2(\mu))_{-1,A}$ can be chosen in order to admit Wick-renormalized powers.

c) We are successfully able to apply the second generalized Wick-renormalized power in case of Eq. (4).


3. The Pregenerator

Return to the formal Eq. (4) with a space-time white noise as introduced in Sect. 1. Assuming the setting of the last section, we define

$$L^0 F(\phi) = \nu \sum_{i=1}^{n} \frac{\partial}{\partial x_i} f(\phi(l_1), \dots, \phi(l_n))\, \phi\Big( \frac{d^2}{dx^2} l_i \Big) + \frac{\gamma}{2} \sum_{i,j=1}^{n} \frac{\partial^2}{\partial x_i \partial x_j} f(\phi(l_1), \dots, \phi(l_n)) \Big( \frac{d}{dx} l_i, \frac{d}{dx} l_j \Big)_{L^2(\mathbb{R})}, \quad \phi \in S'(\mathbb{R}),$$

for an arbitrary smooth polynomial $F \in \mathcal{P}$ represented by $F(\phi) = f(\phi(l_1), \dots, \phi(l_n))$, $\phi \in S'(\mathbb{R})$, where $f$ is a polynomial in $n$ variables and $l_1, \dots, l_n \in S(\mathbb{R})$. Let $\mu$ be the mean zero Gaussian measure on $S'(\mathbb{R})$ given by the covariance operator $C = \frac{\gamma}{2\nu}\, \mathrm{Id}$ on $L^2(\mathbb{R})$. This is exactly the same measure we introduced in Sect. 1 by its characteristic functional. Integrating by parts with respect to $\mu$ one sees that the operator $(L^0, \mathcal{P})$ on $L^2(\mu)$ is symmetric. Moreover, it is closable and its closure $(L^0, D(L^0))$ generates a strongly continuous contraction semigroup on $L^2(\mu)$ which corresponds to the Ornstein-Uhlenbeck process solving the linear equation

$$\frac{\partial}{\partial t} u = \nu \frac{\partial^2}{\partial x^2} u + \frac{\partial}{\partial x} \eta(t, x), \quad t \ge 0,\ x \in \mathbb{R}, \qquad (16)$$

(see [6]). The operator $(L^0, D(L^0))$ is the corresponding infinitesimal generator, and we call $(L^0, \mathcal{P})$ a pregenerator. This pregenerator necessarily satisfies

$$\int L^0 F \,d\mu = 0, \quad \forall\, F \in \mathcal{P};$$

we abbreviate this by the elliptic equation $(L^0)^{*} \mu = 0$, because $\mu$ is the unique invariant measure with respect to the above strong Markov solution. Of course, the formal Eq. (4) is a perturbation of (16) and, FORMALLY, its pregenerator $L$ defined on $\mathcal{P}$ should look like a first order perturbation

$$LF(\phi) = L^0 F(\phi) + \frac{\lambda}{2} \sum_{i=1}^{n} \frac{\partial}{\partial x_i} f(\phi(l_1), \dots, \phi(l_n))\, \phi^2\Big( \frac{d}{dx} l_i \Big), \quad \phi \in S'(\mathbb{R}),$$

of $(L^0, \mathcal{P})$ where the terms $\phi^2(\frac{d}{dx} l_i)$ make no sense. But, we now replace $\phi^2$ by the generalized Wick-renormalized power $:\Phi^2:$ with respect to $\mu$ leading to a meaningful pregenerator $(L, \mathcal{P})$. The price is that this pregenerator cannot live on $L^2(\mu)$.

At first, we specify the spaces where $:\Phi^2:$ will be considered. For the operator $A$ defined on $S(\mathbb{R})$ by

$$Af(x) = -\frac{d^2}{dx^2} f(x) + (x^2 + 1) f(x), \quad x \in \mathbb{R},$$

the Hermite functions

$$\xi_n(x) = \big( n!\,(2\pi)^{1/2} \big)^{-1/2} H_n(x)\, e^{-x^2/4}, \quad x \in \mathbb{R},\ n = 0, 1, 2, \dots,$$

form an orthonormal basis on $L^2(\mathbb{R})$ such that $A\xi_n = (2n + 2)\xi_n$, $n = 0, 1, 2, \dots$. Then, the covariance space $H$ of $\mu$ has the renormed Hermite functions

$$e_i = \sqrt{\frac{2\nu}{\gamma}}\; \xi_{i-1}, \quad i = 1, 2, \dots,$$

as an orthonormal basis, and the eigenvalues $(\Lambda_i)_{i=1}^{\infty}$ corresponding to the self-adjoint operator $(A, D(A))$ on $H$ are given by $\Lambda_i = 2i$, $i = 1, 2, \dots$. Now we know (see formula (8.91.10) in [29]) that

$$\sup_{x \in \mathbb{R}} |\xi_n(x)| = O(n^{-1/12}).$$

As a consequence, Condition (14) holds true in the situation described above. Thus, $:\Phi^2:$ exists as an $(L^2(\mu))_{-1,A}$-valued distribution by Proposition 2. Remark that, for this special operator $A$, the space $(L^2(\mu))_{-\infty,A}$ coincides with the space of Hida distributions $(S)'$. In what follows, let us consider $:\Phi^2:$ in the larger space $(L^2(\mu))_{-\infty,A}$ including $(L^2(\mu))_{-1,A}$ because then the space of random test functions $(L^2(\mu))_{\infty,A}$ is an algebra (see Remark 3). Moreover, the domain $\mathcal{P}$, being larger than $\mathcal{P}_{\{e_i\}}$, is also included in $(L^2(\mu))_{\infty,A}$ because $S(\mathbb{R}) \subseteq \bigcap_{p \in \mathbb{N}} D(A^p)$. In order to verify the last statement, we only have to show that for a fixed $l \in S(\mathbb{R})$ the elementary smooth polynomial $F(\phi) = \phi(l)$, $\phi \in S'(\mathbb{R})$, belongs to the algebra $(L^2(\mu))_{\infty,A}$. The last is true if

$$\|F\|_{2,p,A}^2 = \sum_{\alpha} (\Lambda^\alpha)^{2p} (\alpha!)^{-1} \Big( \int F H_\alpha \,d\mu \Big)^2 < \infty$$

for every $p \in \mathbb{N}$. But, fixing an arbitrary $p \in \mathbb{N}$, we get for this elementary smooth polynomial $F$ that

$$\sum_{\alpha} (\Lambda^\alpha)^{2p} (\alpha!)^{-1} \Big( \int \phi(l) H_\alpha(\phi) \,d\mu(\phi) \Big)^2 = \sum_{i=1}^{\infty} (l, e_i)_H^2\, \Lambda_i^{2p} = \sum_{i=1}^{\infty} (l, A^p e_i)_H^2 = \sum_{i=1}^{\infty} (A^p l, e_i)_H^2 = \|A^p l\|_H^2 < \infty,$$

since $S(\mathbb{R}) \subseteq \bigcap_{p \in \mathbb{N}} D(A^p)$.

We come to the correct definition of the operator $L$. For $F \in \mathcal{P}$ represented as at the beginning of this section, the Fréchet derivative $F'$ is given by

$$F'(\phi) = \sum_{i=1}^{n} \frac{\partial}{\partial x_i} f(\phi(l_1), \dots, \phi(l_n))\; l_i, \quad \phi \in S'(\mathbb{R}),$$


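and, using Definition 1(ii), we set

$$LF = L^0 F - \frac{\lambda}{2} \Big( \frac{\partial}{\partial x} :\Phi^2: \Big)(F')$$

being a well-defined element of $(L^2(\mu))_{-\infty,A}$. Indeed, we get

$${}_{\infty}\langle G, LF \rangle_{-\infty} = \int G\, L^0 F \,d\mu + \frac{\lambda}{2} \sum_{i=1}^{n} {}_{\infty}\Big\langle \frac{\partial}{\partial x_i} f(\cdot)\, G,\ :\Phi^2:\Big( \frac{d}{dx} l_i \Big) \Big\rangle_{-\infty}, \quad G \in (L^2(\mu))_{\infty,A},$$

where

$$\frac{\partial}{\partial x_i} f(\cdot):\ \phi \mapsto \frac{\partial}{\partial x_i} f(\phi(l_1), \dots, \phi(l_n)) \in \mathcal{P}$$

is also an element of $(L^2(\mu))_{\infty,A}$. Hence, $\frac{\partial}{\partial x_i} f(\cdot)\, G \in (L^2(\mu))_{\infty,A}$ by Remark 3, which is why $LF$ is correctly defined.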

Proposition 3. The densely defined operator $(L, \mathcal{P})$ on $(L^2(\mu))_{-\infty,A}$ satisfies the elliptic equation $L^{*} \mu = 0$ in the sense of

$${}_{\infty}\langle 1, LF \rangle_{-\infty} = 0, \quad \forall\, F \in \mathcal{P}.$$
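The unperturbed counterpart of this statement, $\int L^0 F \,d\mu = 0$, can be checked by hand in a one-mode toy reduction. The following is only a minimal sketch under assumptions that are not in the text: the test function $l$ is treated formally like a normalized Fourier mode of wavenumber `kappa`, so that $\phi(\frac{d^2}{dx^2} l)$ is replaced by $-\kappa^2 x$ and $\|\frac{d}{dx} l\|^2$ by $\kappa^2$; the names `apply_L0`, `gaussian_mean` and the numerical values are hypothetical. The invariant variance is then $\gamma/(2\nu)$, matching $C = \frac{\gamma}{2\nu}\mathrm{Id}$.

```python
from fractions import Fraction


def gaussian_moment(n, var):
    """E[x^n] for a centred Gaussian with variance var (exact arithmetic)."""
    if n % 2 == 1:
        return Fraction(0)
    result = Fraction(1)
    for k in range(1, n, 2):  # (n-1)!! = 1 * 3 * ... * (n-1)
        result *= k
    return result * var ** (n // 2)


def apply_L0(coeffs, nu, gamma, kappa):
    """Toy one-mode generator acting on a polynomial f(x) = sum coeffs[d] * x^d:
    (L0 f)(x) = -nu*kappa^2 * x * f'(x) + (gamma/2)*kappa^2 * f''(x)."""
    out = [Fraction(0)] * len(coeffs)
    for d, a in enumerate(coeffs):
        out[d] += -nu * kappa**2 * d * a  # drift part: x * d/dx x^d = d * x^d
        if d >= 2:
            out[d - 2] += gamma * kappa**2 * d * (d - 1) * a / 2  # diffusion part
    return out


def gaussian_mean(coeffs, var):
    """Integral of the polynomial against the centred Gaussian of variance var."""
    return sum(a * gaussian_moment(d, var) for d, a in enumerate(coeffs))


if __name__ == "__main__":
    nu, gamma, kappa = Fraction(3), Fraction(5), Fraction(2)
    f = [Fraction(c) for c in (1, 2, 3, 4, 5, 6, 7)]
    # Mean of L0 f under the variance gamma/(2*nu): vanishes exactly.
    print(gaussian_mean(apply_L0(f, nu, gamma, kappa), gamma / (2 * nu)))
```

The mean vanishes for every polynomial and every `kappa` only when the variance equals $\gamma/(2\nu)$; with any other variance the even-degree terms leave a non-zero remainder.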

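Proof. Let us recursively introduce special polynomials in $\mathcal{P}$ by Wick ordering (see [15, 25]) with respect to our covariance operator $C$:

$$:\phi(l_1) \dots \phi(l_m): \;=\; :\phi(l_1) \dots \phi(l_{m-1}):\, \phi(l_m) - \sum_{i=1}^{m-1} :\phi(l_1) \dots \widehat{\phi(l_i)} \dots \phi(l_{m-1}):\, (Cl_i, l_m)_{L^2(\mathbb{R})}, \quad \phi \in S'(\mathbb{R}),$$

where $\widehat{\ \cdot\ }$ means omission and $l_1, \dots, l_m \in S(\mathbb{R})$. Of course, $\mathcal{P}$ coincides with the linear hull spanned by such polynomials which is why we only have to show

$${}_{\infty}\langle 1, LF \rangle_{-\infty} = 0$$

for one of these polynomials

$$F(\phi) = \; :\phi(l_1) \dots \phi(l_m):, \quad \phi \in S'(\mathbb{R}),$$

fixed. If the context is clear, in what follows we will also denote by $:\phi(l_1) \dots \phi(l_m):$ the whole mapping $\phi \mapsto :\phi(l_1) \dots \phi(l_m):$, that is $F$. At first, from the definition of $L$ follows

$${}_{\infty}\langle 1, LF \rangle_{-\infty} = \int L^0 F \,d\mu - \frac{\lambda}{2} \cdot {}_{\infty}\Big\langle 1, \Big( \frac{\partial}{\partial x} :\Phi^2: \Big)(F') \Big\rangle_{-\infty} = -\frac{\lambda}{2} \cdot {}_{\infty}\Big\langle 1, \Big( \frac{\partial}{\partial x} :\Phi^2: \Big)(F') \Big\rangle_{-\infty}$$

because $(L^0, \mathcal{P})$ is symmetric on $L^2(\mu)$.

But, from the above recursive definition we get

$$F'(\phi) = \sum_{i=1}^{m} :\phi(l_1) \dots \widehat{\phi(l_i)} \dots \phi(l_m):\; l_i, \quad \phi \in S'(\mathbb{R}).$$

Hence, we have to prove that

$$\sum_{i=1}^{m} {}_{\infty}\Big\langle :\phi(l_1) \dots \widehat{\phi(l_i)} \dots \phi(l_m):,\ :\Phi^2:\Big( \frac{d}{dx} l_i \Big) \Big\rangle_{-\infty} = 0. \qquad (17)$$

Lemma 1. Let $\tilde{m} \ge 2$ and $l \in S(\mathbb{R})$. Then it holds that

$${}_{\infty}\langle :\phi(l_1) \dots \phi(l_m):,\ :\Phi^{\tilde{m}}:(l) \rangle_{-\infty} = \begin{cases} \tilde{m}! \displaystyle\int Cl_1(x) \dots Cl_m(x)\, l(x)\,dx & : m = \tilde{m}, \\ 0 & : \text{else}. \end{cases}$$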

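Proof. Because of $:\phi(l_1) \dots \phi(l_m): \in \mathcal{P} \subseteq (L^2(\mu))_{\infty,A}$ we may write

$$:\phi(l_1) \dots \phi(l_m): \;=\; \sum_{\alpha} \Big( \int :\phi(l_1) \dots \phi(l_m): \frac{H_\alpha}{\sqrt{\alpha!}} \,d\mu \Big) \frac{H_\alpha}{\sqrt{\alpha!}},$$

where the sum converges in $(L^2(\mu))_{\infty,A}$. Thus

$${}_{\infty}\langle :\phi(l_1) \dots \phi(l_m):,\ :\Phi^{\tilde{m}}:(l) \rangle_{-\infty} = \sum_{\alpha} \int :\phi(l_1) \dots \phi(l_m): H_\alpha \,d\mu \;\cdot\; {}_{\infty}\Big\langle \frac{H_\alpha}{\alpha!},\ :\Phi^{\tilde{m}}:(l) \Big\rangle_{-\infty}.$$

The integrals in the last sum can be explored by Feynman's rules or by Fock-space isometry as follows:

$$\int :\phi(l_1) \dots \phi(l_m): H_\alpha \,d\mu = \begin{cases} \displaystyle\sum_{\sigma \in S_m} \prod_{i=1}^{\infty}\ \prod_{j=\alpha_0+\dots+\alpha_{i-1}+1}^{\alpha_0+\dots+\alpha_i} (Cl_{\sigma(j)}, e_i)_{L^2(\mathbb{R})} & : |\alpha| = m, \\ 0 & : \text{else}, \end{cases}$$

where $\alpha_0 = 0$ by convention, $S_m$ denotes the group of permutations of order $m!$, and, if $\alpha_0 + \dots + \alpha_{i-1} + 1 > \alpha_0 + \dots + \alpha_i$, then the corresponding product in these limits is set to be 1. So, applying the definition of $:\Phi^{\tilde{m}}:$, we have that, on the one hand,

$${}_{\infty}\langle :\phi(l_1) \dots \phi(l_m):,\ :\Phi^{\tilde{m}}:(l) \rangle_{-\infty} = 0$$

if $m \ne \tilde{m}$ because ${}_{\infty}\langle H_\alpha, :\Phi^{\tilde{m}}:(l) \rangle_{-\infty}$ vanishes for multi-indices $\alpha$ with $|\alpha| \ne \tilde{m}$ and, on the other hand, if $m = \tilde{m}$ then

$$\begin{aligned}
{}_{\infty}\langle :\phi(l_1) \dots \phi(l_m):,\ :\Phi^{\tilde{m}}:(l) \rangle_{-\infty}
&= \sum_{|\alpha|=m} \Big[ \sum_{\sigma \in S_m} \prod_{i=1}^{\infty}\ \prod_{j=\alpha_0+\dots+\alpha_{i-1}+1}^{\alpha_0+\dots+\alpha_i} (Cl_{\sigma(j)}, e_i)_{L^2(\mathbb{R})} \Big] \cdot \frac{m!}{\alpha!} \int \prod_{i=1}^{\infty} Ce_i(x)^{\alpha_i}\, l(x)\,dx \\
&= m! \sum_{|\alpha|=m} \sum_{\sigma \in S_m} \prod_{i=1}^{\infty} \frac{\prod_{j=\alpha_0+\dots+\alpha_{i-1}+1}^{\alpha_0+\dots+\alpha_i} (Cl_{\sigma(j)}, e_i)_{L^2(\mathbb{R})}}{\alpha_i!} \int \prod_{i=1}^{\infty} Ce_i(x)^{\alpha_i}\, l(x)\,dx \\
&= m! \sum_{i_1, \dots, i_m = 1}^{\infty} (Cl_1, e_{i_1})_{L^2(\mathbb{R})} \cdot \ldots \cdot (Cl_m, e_{i_m})_{L^2(\mathbb{R})} \int Ce_{i_1}(x) \cdot \ldots \cdot Ce_{i_m}(x)\, l(x)\,dx \\
&= m! \int \sum_{i_1, \dots, i_m = 1}^{\infty} (l_1, e_{i_1})_H\, Ce_{i_1}(x) \cdot \ldots \cdot (l_m, e_{i_m})_H\, Ce_{i_m}(x)\; l(x)\,dx \\
&= m! \int C\Big[ \sum_{i=1}^{\infty} (l_1, e_i)_H\, e_i \Big](x) \cdot \ldots \cdot C\Big[ \sum_{i=1}^{\infty} (l_m, e_i)_H\, e_i \Big](x)\; l(x)\,dx \\
&= m! \int Cl_1(x) \dots Cl_m(x)\, l(x)\,dx.
\end{aligned}$$

□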
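In the one-dimensional special case $l_1 = \dots = l_m = l$, the recursive Wick ordering used above reduces to the three-term recursion $:x^m: = x\, :x^{m-1}: - (m-1)\, c\, :x^{m-2}:$ with $c = (Cl, l)_{L^2(\mathbb{R})}$, and the Fock-space isometry becomes $E[:x^m: :x^n:] = \delta_{mn}\, m!\, c^m$. The following is a minimal sketch of that special case only, not of the infinite-dimensional construction; the function names and the value of `c` are hypothetical.

```python
from fractions import Fraction


def gaussian_moment(n, c):
    """E[x^n] for a centred Gaussian with variance c (exact arithmetic)."""
    if n % 2 == 1:
        return Fraction(0)
    result = Fraction(1)
    for k in range(1, n, 2):  # (n-1)!! = 1 * 3 * ... * (n-1)
        result *= k
    return result * c ** (n // 2)


def wick_power(m, c):
    """Coefficients (lowest degree first) of :x^m: for variance c."""
    prev, cur = [Fraction(0)], [Fraction(1)]  # placeholder for m = -1, and :x^0: = 1
    for k in range(1, m + 1):
        nxt = [Fraction(0)] + cur  # multiply :x^{k-1}: by x
        for i, a in enumerate(prev):
            nxt[i] -= (k - 1) * c * a  # subtract (k-1) * c * :x^{k-2}:
        prev, cur = cur, nxt
    return cur


def expect_product(p, q, c):
    """E[p(x) q(x)] for polynomials given by coefficient lists."""
    return sum(a * b * gaussian_moment(i + j, c)
               for i, a in enumerate(p) for j, b in enumerate(q))


if __name__ == "__main__":
    c = Fraction(3, 2)
    print(wick_power(3, c))  # coefficients of x^3 - 3*c*x
    print(expect_product(wick_power(3, c), wick_power(3, c), c))
```

Wick powers of different order are orthogonal under the Gaussian measure, which is the mechanism behind the case $m \ne \tilde{m}$ in Lemma 1.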

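Remark 5. The above proof also shows that adding $S(\mathbb{R}^d) \subseteq \bigcap_{p \in \mathbb{N}} D(A^p)$ as an assumption to the already made assumptions (5), (14), we could have defined generalized Wick-renormalized powers straight away on $\mathcal{P}$ by the assertion of Lemma 1 instead of on $\mathcal{P}_{\{e_i\}}$ by (12). However, the condition $S(\mathbb{R}^d) \subseteq \bigcap_{p \in \mathbb{N}} D(A^p)$ can be restricting for certain covariance operators $C$.

Let us now finish the proof of Proposition 3. Lemma 1 yields that (17) is in any event true for $m \ne 3$, and, if $m = 3$ then the left-hand side of (17) equals

$$2 \Big[ \int Cl_2(x)\, Cl_3(x)\, \frac{d}{dx} l_1(x)\,dx + \int Cl_1(x)\, Cl_3(x)\, \frac{d}{dx} l_2(x)\,dx + \int Cl_1(x)\, Cl_2(x)\, \frac{d}{dx} l_3(x)\,dx \Big] = 2 \Big( \frac{\gamma}{2\nu} \Big)^2 \int \Big[ \frac{d}{dx} (l_1 l_2 l_3) \Big](x)\,dx = 0,$$

where the first equality uses $C = \frac{\gamma}{2\nu}\,\mathrm{Id}$ and the product rule, and the integral of the derivative vanishes because $l_1 l_2 l_3 \in S(\mathbb{R})$.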

□

Remark 6. a) This pregenerator unifies both the idea of L. Bertini/G. Giacomin to use some kind of Wick-renormalization for the nonlinearity in (4) and the measure $\mu$ being a possible equilibrium point with respect to (4) as explained in Sect. 1.

b) From Proposition 3 it follows that $-\frac{\lambda}{2} \frac{\partial}{\partial x} :\Phi^2:$ is a so-called $\mu$-divergence free perturbation of $(L^0, \mathcal{P})$ in the sense of

$${}_{\infty}\Big\langle 1,\ -\frac{\lambda}{2} \Big( \frac{\partial}{\partial x} :\Phi^2: \Big)(F') \Big\rangle_{-\infty} = 0, \quad \forall\, F \in \mathcal{P}; \qquad (18)$$

we refer to [2] for a good introduction of this notion within an $L^2$-setting. Remark that W. Stannat, [27], studied such perturbations on $L^1$, and our application shows that an extension to spaces of generalized random variables is needed. We want to emphasize that (18) should be seen as the rigorous translation of the physical intuition (see Sect. 1) of $\mu$ being invariant under the flow generated by the solutions of

$$\frac{\partial}{\partial t} u + \frac{\lambda}{2} \frac{\partial}{\partial x} u^2 = 0, \quad t \ge 0,\ x \in \mathbb{R}.$$

4. The Martingale Problem

Since the beginning of its systematic development by D.W. Stroock and S.R.S. Varadhan, [28], martingale problems have deeply been studied in the literature, and it is well-known that weak solutions of stochastic differential equations can equivalently be described by them. In this section we want to give rigorous sense to the formal Eq. (4) by using a generalized version of a martingale problem within a stationary setting.

At first, let us present the idea by means of the linear equation (16). Consider the operator $(L^0, \mathcal{P})$ on $L^2(\mu)$ as introduced in the last section. It is a core for the generator of a strong Markov family $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, (u_t)_{t \ge 0}, (P^0_\phi)_{\phi \in S'(\mathbb{R})})$ with state space $S'(\mathbb{R})$. In our case, one can choose $\Omega$ to be the path space $C([0, \infty), S'(\mathbb{R}))$ of all continuous $S'(\mathbb{R})$-valued functions on the positive time axis, where $S'(\mathbb{R})$ is equipped with the inductive limit topology and, then, $(u_t)_{t \ge 0}$ is the coordinate process

$$u_t(\omega) = \omega(t), \quad \omega \in \Omega,\ t \ge 0.$$

The $\sigma$-algebra $\mathcal{F}$ and the filtration $(\mathcal{F}_t)_{t \ge 0}$ are supposed to be augmented versions (see [24]) of the Borel $\sigma$-algebra $\mathcal{B}(\Omega)$ and the "natural" filtration

$$\mathcal{F}^u_t = \sigma\{u_s : s \le t\}, \quad t \ge 0,$$

respectively. Now, the coordinate process $(u_t)_{t \ge 0}$ considered on $(\Omega, \mathcal{F}, P^0)$, with probability measure

$$P^0 = \int_{S'(\mathbb{R})} P^0_\phi(\cdot)\, \mu(d\phi),$$

is the unique solution of (16) starting from $\mu$ in the following sense: There is a cylindrical Wiener process $(W_t)_{t \ge 0}$ on $(\Omega, \mathcal{F}, P^0)$ with covariance

$$E^0\, W_t(l)\, W_s(l') = \gamma\, (t \wedge s)\, (l, l')_{L^2(\mathbb{R})}, \quad t, s \ge 0,\ l, l' \in S(\mathbb{R}),$$

such that

$$u_t(l) = u_0(l) + \nu \int_0^t u_s\Big( \frac{d^2}{dx^2} l \Big)\,ds - W_t\Big( \frac{d}{dx} l \Big), \quad t \ge 0,\ l \in S(\mathbb{R}),\ P^0\text{-a.s.}$$


Simply using Itô's formula, this equivalently means that the process $(u_t)_{t \ge 0}$ on $(\Omega, \mathcal{F}, P^0)$ solves a so-called martingale problem for $(L^0, \mathcal{P})$, i.e. the $(\mathcal{F}^u_t)_{t \ge 0}$-adapted stochastic process

$$F(u_t) - F(u_0) - \int_0^t L^0 F(u_s)\,ds, \quad t \ge 0,$$

is a martingale for every $F \in \mathcal{P}$, or in other words,

$$E^0\, \big( F(u_{t_{n+1}}) - F(u_{t_n}) \big) \prod_{k=1}^{n} G_k(u_{t_k}) = \int_{t_n}^{t_{n+1}} E^0\, L^0 F(u_s) \prod_{k=1}^{n} G_k(u_{t_k})\,ds \qquad (19)$$

whenever $0 \le t_1 < \dots < t_n < t_{n+1}$ and $F, G_1, \dots, G_n \in \mathcal{P}$.

Remark 7. a) By definition, the operator $L^0$ maps $\mathcal{P}$ into the space $C(S'(\mathbb{R}))$ of continuous functions on $S'(\mathbb{R})$. As a consequence, in (19), we may consider the operator $(L^0, \mathcal{P})$ on $C(S'(\mathbb{R}))$ instead of $L^2(\mu)$ which is why we do not have to ask if the integral on the right-hand side of (19) depends on the chosen $\mu$-version of $L^0 F$, $F \in \mathcal{P}$. Nevertheless, the generator for the strong Markov family is considered on $L^2(\mu)$ because there is no complete family of seminorms on $C(S'(\mathbb{R}))$ known which makes $(L^0, \mathcal{P})$ closable.

b) As already mentioned in Sects. 1 and 3, the process $(u_t)_{t \ge 0}$ is also a stationary process on $(\Omega, \mathcal{F}, P^0)$. Thus, we especially have that

$$P^0 \circ u_t^{-1} = \mu, \quad t \ge 0.$$

As a consequence, the expectations on both sides of (19) exist for all $F, G_1, \dots, G_n \in \mathcal{P}$ because $\mu$ is a Gaussian measure.

The idea is now to define a stationary solution of (4) by an equality like (19) since we already know a reasonable operator which should be used to replace $L^0$. But, if we replace $L^0$ by our pregenerator $L$ then the problem arises that we do not know what the meaning of $LF(u_s)$, $F \in \mathcal{P}$, $s \ge 0$, is, because $L$ maps $\mathcal{P}$ into $(L^2(\mu))_{-\infty,A}$. However, we are able to overcome this difficulty by the following definition.

Definition 3. A $\mu$-stationary continuous $S'(\mathbb{R})$-valued process $(u_t)_{t \ge 0}$ on some probability space $(\Omega, \mathcal{F}, P)$ is called a $\mu$-stationary solution of (4) if, for all $0 \le t_1 < \dots < t_n < t_{n+1}$ and $F, G_1, \dots, G_n \in \mathcal{P}$, it holds that

(i) $E\big( \prod_{k=1}^{n} G_k(u_{t_k}) \,\big|\, u_s = \cdot \big) \in (L^2(\mu))_{\infty,A}$, $\ s \in (t_n, t_{n+1})$;

(ii) $s \mapsto {}_{\infty}\big\langle E\big( \prod_{k=1}^{n} G_k(u_{t_k}) \,\big|\, u_s = \cdot \big), LF \big\rangle_{-\infty}$ is $\mathcal{B}((t_n, t_{n+1}))$-measurable;

(iii) $E\, \big( F(u_{t_{n+1}}) - F(u_{t_n}) \big) \prod_{k=1}^{n} G_k(u_{t_k}) = \displaystyle\int_{t_n}^{t_{n+1}} {}_{\infty}\Big\langle E\Big( \prod_{k=1}^{n} G_k(u_{t_k}) \,\Big|\, u_s = \cdot \Big), LF \Big\rangle_{-\infty}\,ds.$

Remark 8. a) Of course, the assumption of $(u_t)_{t \ge 0}$ to be $\mu$-stationary could slightly be weakened to $P \circ u_t^{-1} = \mu$, $t \ge 0$.


b) The conditions (i) and (ii) look like technical assumptions for the main condition (iii) to be well-defined. However, the example below shows that they reflect some kind of regularity which seems to be natural in our general situation.

c) Finally, the main condition (iii) gives rigorous sense to the equality (19) with $L$ replacing $L^0$, and this is why we also say that a $\mu$-stationary solution of (4) solves the generalized martingale problem for $(L, \mathcal{P}, \mu)$. See the example below for the concrete relation between (iii) and (19).

Example. It is a matter of fact that the pregenerator $(L^0, \mathcal{P})$ for the linear equation (16) can also be considered on $(L^2(\mu))_{-\infty,A}$, and we will show that the coordinate process $(u_t)_{t \ge 0}$ on $(\Omega, \mathcal{F}, P^0)$ is not only a stationary solution of (16) starting from $\mu$ in the classical sense but also solves the generalized martingale problem for $(L^0, \mathcal{P}, \mu)$ in the sense of Definition 3. At first, for fixed $0 \le t_1 < \dots < t_n < t_{n+1}$ and $F, G_1, \dots, G_n \in \mathcal{P}$, we have to show that for $s \in (t_n, t_{n+1})$

$$E^0\Big( \prod_{k=1}^{n} G_k(u_{t_k}) \,\Big|\, u_s = \cdot \Big) \in (L^2(\mu))_{\infty,A}. \qquad (20)$$

But, the closure of $(L^0, \mathcal{P})$ in the space $L^2(\mu)$ generates a strongly continuous contraction semigroup $(T^0_t)_{t \ge 0}$ on $L^2(\mu)$ which corresponds to $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, (u_t)_{t \ge 0}, (P^0_\phi)_{\phi \in S'(\mathbb{R})})$ (see Sect. 3) which is why, for a fixed multi-index $\alpha$, we obtain

$$\begin{aligned}
\int E^0\Big( \prod_{k=1}^{n} G_k(u_{t_k}) \,\Big|\, u_s = \cdot \Big) H_\alpha \,d\mu
&= E^0 \prod_{k=1}^{n} G_k(u_{t_k})\, H_\alpha(u_s) \\
&= \int T^0_{t_1}\big[ G_1 T^0_{t_2 - t_1}\big[ \dots \big[ G_{n-1} T^0_{t_n - t_{n-1}}\big[ G_n T^0_{s - t_n} H_\alpha \big]\big] \dots \big]\big](\phi)\, \mu(d\phi) \\
&= \int \hat{T}^0_{s - t_n}\big[ G_n \hat{T}^0_{t_n - t_{n-1}}\big[ \dots \big[ G_2 \hat{T}^0_{t_2 - t_1}\big[ G_1 \hat{T}^0_{t_1} 1 \big]\big] \dots \big]\big](\phi)\, H_\alpha(\phi)\, \mu(d\phi),
\end{aligned}$$

where $(\hat{T}^0_t)_{t \ge 0}$ is the corresponding cosemigroup on $L^2(\mu)$. Hence,

$$E^0\Big( \prod_{k=1}^{n} G_k(u_{t_k}) \,\Big|\, u_s = \cdot \Big) = \hat{T}^0_{s - t_n}\big[ G_n \hat{T}^0_{t_n - t_{n-1}}\big[ \dots \big[ G_2 \hat{T}^0_{t_2 - t_1}\big[ G_1 \hat{T}^0_{t_1} 1 \big]\big] \dots \big]\big], \qquad (21)$$

since the coefficients $\int E^0\big( \prod_{k=1}^{n} G_k(u_{t_k}) \,\big|\, u_s = \cdot \big) H_\alpha \,d\mu$ determine $E^0\big( \prod_{k=1}^{n} G_k(u_{t_k}) \,\big|\, u_s = \cdot \big)$. As a consequence, a sufficient condition for (20) to be true is

$$\hat{T}^0_t\big( (L^2(\mu))_{\infty,A} \big) \subseteq (L^2(\mu))_{\infty,A}, \quad t \ge 0. \qquad (22)$$

Indeed, $G_k \in (L^2(\mu))_{\infty,A}$, $k = 1, \dots, n$, and $(L^2(\mu))_{\infty,A}$ is an algebra (see Remark 3). We do not want to prove (22) here because there are easier arguments for (20) to be true in our special situation: On the one hand, $(\hat{T}^0_t)_{t \ge 0} = (T^0_t)_{t \ge 0}$ since the closure of $(L^0, \mathcal{P})$ is symmetric on $L^2(\mu)$ and, on the other hand, we know the explicit structure of $(T^0_t)_{t \ge 0}$ by Mehler's formula because $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, (u_t)_{t \ge 0}, (P^0_\phi)_{\phi \in S'(\mathbb{R})})$ is the


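Ornstein-Uhlenbeck process given by the linear equation (16). So, applying [6], we calculate

$$T^0_t F(\phi) = \int F\big( p_t \phi + \sqrt{\mathrm{Id} - p_{2t}}\; \tilde{\phi} \big)\, \mu(d\tilde{\phi}), \quad F \in L^2(\mu),\ t \ge 0,$$

where $p_t$ denotes the extension to $S'(\mathbb{R})$ of the semigroup defined by the heat kernels corresponding to $\nu \frac{d^2}{dx^2}$. From the last formula it even follows that

$$T^0_t(\mathcal{P}) \subseteq \mathcal{P}, \quad t \ge 0,$$

and (20) is a consequence of (21), again.

Now, applying (21) once more, the condition (ii) is satisfied for the coordinate process $(u_t)_{t \ge 0}$ on $(\Omega, \mathcal{F}, P^0)$ if the mapping

$$s \mapsto {}_{\infty}\big\langle \hat{T}^0_{s - t_n}\big[ G_n \hat{T}^0_{t_n - t_{n-1}}\big[ \dots \big[ G_2 \hat{T}^0_{t_2 - t_1}\big[ G_1 \hat{T}^0_{t_1} 1 \big]\big] \dots \big]\big],\ L^0 F \big\rangle_{-\infty}$$

is $\mathcal{B}((t_n, t_{n+1}))$-measurable. But, $(\hat{T}^0_t)_{t \ge 0} = (T^0_t)_{t \ge 0}$ is a strongly continuous semigroup on $L^2(\mu)$ and

$${}_{\infty}\big\langle \hat{T}^0_{s - t_n}\big[ G_n \hat{T}^0_{t_n - t_{n-1}}\big[ \dots \big[ G_2 \hat{T}^0_{t_2 - t_1}\big[ G_1 \hat{T}^0_{t_1} 1 \big]\big] \dots \big]\big],\ L^0 F \big\rangle_{-\infty} = \int \hat{T}^0_{s - t_n}\big[ G_n \hat{T}^0_{t_n - t_{n-1}}\big[ \dots \big[ G_2 \hat{T}^0_{t_2 - t_1}\big[ G_1 \hat{T}^0_{t_1} 1 \big]\big] \dots \big]\big]\; L^0 F \,d\mu$$

in our case which is why the mapping above even turns out to be continuous in $s \in (t_n, t_{n+1})$. Finally, (19) implies (iii) because

$$\begin{aligned}
\int_{t_n}^{t_{n+1}} {}_{\infty}\Big\langle E^0\Big( \prod_{k=1}^{n} G_k(u_{t_k}) \,\Big|\, u_s = \cdot \Big),\ L^0 F \Big\rangle_{-\infty}\,ds
&= \int_{t_n}^{t_{n+1}} \int E^0\Big( \prod_{k=1}^{n} G_k(u_{t_k}) \,\Big|\, u_s = \cdot \Big)\, L^0 F \,d\mu \,ds \\
&= \int_{t_n}^{t_{n+1}} E^0 \prod_{k=1}^{n} G_k(u_{t_k})\, L^0 F(u_s)\,ds.
\end{aligned}$$

The above example illustrates the following: If a closed extension of $(L, \mathcal{P})$ on $(L^2(\mu))_{-\infty,A}$ would generate a semigroup $(T_t)_{t \ge 0}$ on $(L^2(\mu))_{-\infty,A}$ corresponding to the transition function for a $\mu$-stationary Markov process $(u_t)_{t \ge 0}$ on some probability space $(\Omega, \mathcal{F}, P)$ then a sufficient condition for $(u_t)_{t \ge 0}$ to satisfy the conditions (i) and (ii) in Definition 3 would be that the dual semigroup $(\hat{T}_t)_{t \ge 0}$ of $(T_t)_{t \ge 0}$ is strongly continuous on $(L^2(\mu))_{\infty,A}$. Indeed, here one could derive an equality like (21) simply applying the Markov property, and, since the dual semigroup $(\hat{T}_t)_{t \ge 0}$ is by definition a family of mappings

$$\hat{T}_t : (L^2(\mu))_{\infty,A} \to (L^2(\mu))_{\infty,A}, \quad t \ge 0,$$

(i) would follow the same way as (20) would have followed from (21) and (22). Finally, the strong continuity of $(\hat{T}_t)_{t \ge 0}$ on $(L^2(\mu))_{\infty,A}$ would even make the mapping

$$s \mapsto {}_{\infty}\Big\langle E\Big( \prod_{k=1}^{n} G_k(u_{t_k}) \,\Big|\, u_s = \cdot \Big),\ LF \Big\rangle_{-\infty}$$

continuous on $(t_n, t_{n+1})$ which is better than (ii).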


5. Appendix: The Wick Product Reviewed

We want to compare $\frac{\lambda}{2} \big( \frac{\partial}{\partial x} :\Phi^2: \big)$ with the nonlinearity $\lambda\, u(t, x) \diamond_P \frac{\partial}{\partial x} u(t, x)$ applied to (WB) in the special situation of Eq. (4). At first, within the setting of Sect. 3, the Wick product $\diamond_\mu$ of two random variables $F, G \in L^2(\mu)$ represented as

$$F = \sum_{\alpha} c_\alpha H_\alpha \quad \text{resp.} \quad G = \sum_{\alpha} \tilde{c}_\alpha H_\alpha$$

is by definition the element of $L^2(\mu)$ with the unique representation

$$F \diamond_\mu G = \sum_{\alpha, \beta} c_\alpha\, \tilde{c}_\beta\, H_{\alpha + \beta}.$$
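On the level of coefficients, this definition is a convolution over multi-indices. The following minimal sketch makes that concrete; the representation of a random variable as a dictionary mapping a multi-index (a tuple of non-negative integers) to its coefficient, and the name `wick_product`, are assumptions made for illustration and do not come from the text.

```python
from collections import defaultdict


def _trim(alpha):
    """Drop trailing zeros so that equal multi-indices get equal keys."""
    alpha = list(alpha)
    while alpha and alpha[-1] == 0:
        alpha.pop()
    return tuple(alpha)


def wick_product(F, G):
    """Coefficients of the Wick product of F = sum c_alpha H_alpha and
    G = sum c~_beta H_beta: the coefficient c_alpha * c~_beta is added
    to the multi-index alpha + beta."""
    result = defaultdict(float)
    for alpha, c in F.items():
        for beta, c_tilde in G.items():
            n = max(len(alpha), len(beta))
            a = tuple(alpha) + (0,) * (n - len(alpha))  # pad to common length
            b = tuple(beta) + (0,) * (n - len(beta))
            result[_trim(x + y for x, y in zip(a, b))] += c * c_tilde
    return dict(result)


if __name__ == "__main__":
    phi_e1 = {(1,): 1.0}                 # the random variable phi(e_1) = H_(1)
    print(wick_product(phi_e1, phi_e1))  # coefficient 1 on the multi-index (2,)
```

In this representation the Wick square of $\phi(e_1)$ has the single coefficient 1 on the multi-index $(2)$, i.e. it is $H_{(2)}$.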

Of course, this definition can also be applied to generalized random variables in $(L^2(\mu))_{-\infty,A}$. Introducing $\delta_{\varepsilon,x}$ as in Remark 2b), easy calculations show that, at each $x$ for $\varepsilon \downarrow 0$, the random variables $\phi \mapsto \phi(\delta_{\varepsilon,x})$ converge to a generalized random variable in $(L^2(\mu))_{-1,A}$, which we abbreviate by

$$\phi(\delta_x) = \lim_{\varepsilon \downarrow 0} \phi(\delta_{\varepsilon,x}).$$

From (10) it follows that $:\phi(\delta_{\varepsilon,x})^2: \;=\; \phi(\delta_{\varepsilon,x}) \diamond_\mu \phi(\delta_{\varepsilon,x})$ which is why we finally get

$$:\Phi^2:(l) = \int \big[ \phi(\delta_x) \diamond_\mu \phi(\delta_x) \big]\, l(x)\,dx, \quad l \in S(\mathbb{R}),$$

by combining (11) and the definition of $:\Phi^2:$. Here, the integral on the right-hand side has to be understood as a Bochner integral. Remark that (14) is again a suitable sufficient condition to ensure the convergences required above.

Altogether, there is a strong connection between Wick-renormalized powers and the Wick product $\diamond_\mu$ on the state space. But, the Wick product people use to apply to (WB) corresponds to the path space. In fact, the nonlinearity in (WB) is actually given by

$$\lambda\, u(t, x) \diamond_P \frac{\partial}{\partial x} u(t, x),$$

where $P$ denotes the underlying Gaussian path space measure. So it is some kind of convolution on the level of the path space and does not present a local drift function on the state space as $\frac{\lambda}{2} \big( \frac{\partial}{\partial x} :\Phi^2: \big)$. In other words, the nonlinearity $\lambda\, u(t, x) \diamond_P \frac{\partial}{\partial x} u(t, x)$ should be represented by $B(\omega, u(t, x))$, where the drift $B$ additionally depends on the path space variable $\omega$. Such a drift does of course not match the physical ideas behind the nonlinearity of a Burgers equation. One could argue that, for special solutions of (WB), $\omega$ and $B$ are decoupled in an appropriate way. However, we think that the solutions considered in [3, 16, 18] do not have such a property.

Acknowledgements. The author thanks Boris Rozovskii for helpful discussions.


References

1. Albeverio, S., Bogachev, V.I., Röckner, M.: On uniqueness of invariant measures for finite and infinite dimensional diffusions. Commun. Pure and Appl. Math. 52, 325–362 (1999)
2. Albeverio, S., Cruzeiro, A.-B.: Global Flows with Invariant (Gibbs) Measures for Euler and Navier–Stokes Two Dimensional Fluids. Commun. Math. Phys. 129, 431–444 (1990)
3. Benth, F.E., Streit, L.: The Burgers Equation with a Non Gaussian Random Force. In: Decreusefond, L. et al. (eds.) Stochastic Analysis and Related Topics. Progress in Probability 42. Boston: Birkhäuser, 1998
4. Bertini, L., Cancrini, N., Jona-Lasinio, G.: Burgers equation forced by conservative or nonconservative noise. In: Cardoso, A.I. et al. (eds.) Stochastic Analysis and Applications in Physics. Dordrecht, London: Kluwer Academic, 1994
5. Bertini, L., Giacomin, G.: Stochastic Burgers and KPZ Equations from Particle Systems. Commun. Math. Phys. 183, 571–607 (1997)
6. Bogachev, V.I., Röckner, M.: Mehler formula and capacities for infinite dimensional Ornstein-Uhlenbeck processes with general linear drift. Osaka J. Math. 32, 237–274 (1995)
7. Bogachev, V.I., Röckner, M.: Elliptic equations for infinite dimensional probability distributions and Lyapunov functions. C.R. Acad. Sci. Paris Série I 329, 705–710 (1999)
8. Bogachev, V.I., Röckner, M.: Elliptic equations for measures on infinite dimensional spaces and applications. Prob. Th. Rel. Fields 120, 445–496 (2001)
9. Bogachev, V.I., Röckner, M., Zhang, T.-S.: Existence and uniqueness of invariant measures: An approach via sectorial forms. Appl. Math. Optim. 41, 87–109 (2000)
10. Burgers, J.M.: Applications of a model system to illustrate some points of the statistical theory of free turbulence. Proc. Roy. Neth. Acad. Sci. 43, 2–12 (1940)
11. Burgers, J.M.: The Nonlinear Diffusion Equation. Dordrecht, Boston: D. Reidel Pub. Co., 1974
12. E, W., Khanin, K., Mazel, A., Sinai, Ya.G.: Invariant measures for Burgers equation with stochastic forcing. Ann. of Math. 151, 877–960 (2000)
13. Forsyth, A.R.: Partial Differential Equations, Part IV. Theory of Differential Equations VI. Cambridge: Cambridge Univ. Press, 1906
14. Glimm, J., Jaffe, A.: The Positivity of the $\phi^4_3$ Hamiltonian. Fort. Phys. 21, 327–376 (1973)
15. Glimm, J., Jaffe, A.: Quantum Physics, A Functional Integral Point of View. New York: Springer, 1981
16. Grothaus, M., Kondratiev, Y.G., Us, G.F.: Wick calculus for regular generalized stochastic functions. Random Oper. Stochastic Equations 7(4), 301–328 (1999)
17. Hida, T., Kuo, H.-H., Potthoff, J., Streit, L.: White noise. An infinite dimensional calculus. Dordrecht, London: Kluwer Academic, 1993
18. Holden, H., Lindstrøm, B., Øksendal, B., Ubøe, J., Zhang, T.-S.: The stochastic Wick type Burgers equation. In: Etheridge, A. (ed.) Stochastic Partial Differential Equations. Cambridge: Cambridge Univ. Press, 1995
19. Jona-Lasinio, G., Mitter, P.K.: On the stochastic quantization of field theory. Commun. Math. Phys. 101, 406–436 (1985)
20. Kardar, M., Parisi, G., Zhang, Y.C.: Dynamical scaling of growing interfaces. Phys. Rev. Lett. 56, 889–892 (1986)
21. Kondratiev, Y.G., Leukert, P., Streit, L.: Wick Calculus in Gaussian Analysis. Acta App. Math. 44, 269–294 (1996)
22. Krug, J., Spohn, H.: Kinetic roughening of growing surfaces. In: Godrèche, C. (ed.) Solids far from equilibrium: Growth morphology and defects. Cambridge: Cambridge University Press, 1991
23. Röckner, M.: Specifications and Martin boundaries for $P(\phi)_2$-random fields. Commun. Math. Phys. 106, 105–135 (1986)
24. Sharpe, M.T.: General theory of Markov processes. Boston, London: Academic Press, 1988
25. Simon, B.: The $P(\Phi)_2$ Euclidean (Quantum) Field Theory. Princeton: Princeton Univ. Press, 1974
26. Sinai, Ya.G.: Burgers system driven by a periodic stochastic flow. In: Ikeda, N. et al. (eds.) Itô's Stochastic Calculus and Probability Theory. New York: Springer, 1996
27. Stannat, W.: (Nonsymmetric) Dirichlet Operators on $L^1$: Existence, Uniqueness and associated Markov Processes. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 28, No. 1, 99–140 (1999)
28. Stroock, D.W., Varadhan, S.R.S.: Diffusion processes with continuous coefficients I, II. Comm. Pure Appl. Math. 22, 345–400, 479–530 (1969)
29. Szegö, G.: Orthogonal polynomials. Colloquium publications of the American Mathematical Society 23. Providence, RI: Am. Math. Soc., 1939, i.e. 1975
30. Woyczyński, W.A.: Burgers-KPZ Turbulence. Lecture Notes in Mathematics 1700. Berlin, London: Springer, 1998

Communicated by H. Spohn

Commun. Math. Phys. 225, 633 – 637 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

A Remark on $L^p$-Boundedness of Wave Operators for Two Dimensional Schrödinger Operators

Arne Jensen, Kenji Yajima

Department of Mathematical Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8914, Japan. E-mail: [email protected]

Received: 13 September 2001 / Accepted: 15 October 2001

Abstract: Let $H = -\Delta + V$ be a two dimensional Schrödinger operator with a real potential $V(x)$ satisfying the decay condition $|V(x)| \le C \langle x \rangle^{-\delta}$, $\delta > 6$. Let $H_0 = -\Delta$. We show that the wave operators $\text{s-}\lim_{t \to \pm\infty} e^{itH} e^{-itH_0}$ are bounded in $L^p(\mathbb{R}^2)$ under the condition that $H$ has no zero resonances or bound states. In this paper the condition $\int_{\mathbb{R}^2} V(x)\,dx \ne 0$, imposed in a previous paper (K. Yajima, Commun. Math. Phys. 208, 125–152 (1999)), is removed.

1. Introduction

Let $H = -\Delta + V$ and $H_0 = -\Delta$ be Schrödinger operators in $L^2(\mathbb{R}^2)$. We assume that $V$ is multiplication by a function $V(x)$, which satisfies the following condition:

Assumption 1. $V(x)$ is real-valued and $|V(x)| \le C \langle x \rangle^{-\delta}$, $x \in \mathbb{R}^2$, for some $\delta > 6$.

It is well known that under this assumption the wave operators $W_\pm$ defined by the limits

$$W_\pm u = \lim_{t \to \pm\infty} e^{itH} e^{-itH_0} u, \quad u \in L^2(\mathbb{R}^2),$$

exist and are complete, i.e. $\mathrm{Ran}\, W_\pm = L^2_{ac}(H)$, the absolutely continuous subspace of $L^2(\mathbb{R}^2)$ for $H$, and the singular continuous spectrum of $H$ is absent. In this note we prove the following theorem:

On leave from: Department of Mathematical Sciences and MaPhySto, Centre for Mathematical Physics and Stochastics, funded by the Danish National Research Foundation, Aalborg University, Fr. Bajers Vej 7G, 9220 Aalborg Ø, Denmark. E-mail: [email protected]

Partly supported by the Grant-in-Aid for Scientific Research, The Ministry of Education, Science, Sports and Culture, Japan #11304006.


Theorem 1. Let Assumption 1 be satisfied. Suppose that 0 is neither an eigenvalue nor 2 (R 2 ) \ {0} of −u + V u = 0, a resonance of H , viz. there are no solutions u ∈ Hloc which for some a, b1 , and b2 satisfy for |α| ≤ 1, b1 x1 + b2 x2 α = O(|x|−1−ε−|α| ), |x| → ∞. (1.1) ∂x u − a − |x|2 Then the wave operators W± are bounded in Lp (R2 ) for all p, 1 < p < ∞. In [2], one of the authors has shown Theorem 1 under the additional assumption that R2 V (x)dx = 0. This additional assumption was made to simplify the asymptotic analysis as λ → 0 of the boundary values R ± (λ) = limε↓0 R(λ ± iε) on the reals of the resolvent R(z) = (H − z)−1 of H . By applying the recent results [1] of the other author with G. Nenciu on precisely this asymptotic problem, we show that this additional assumption is unnecessary. 2. Proof of the Theorem We choose c > 0 sufficiently small and let χ (t) ∈ C0∞ ([0, ∞)) be a cut-off function such that χ (t) = 1 for t ≤ c/2 and χ (t) = 0 for t ≥ c. We set χ˜(t) = 1 − χ (t). The argument in Sects. 2 and 3 of [2] does not use the assumption R2 V (x)dx = 0, and it implies that the high energy part of the wave operators W± χ˜ (H0 ) are bounded in Lp (R2 ) for 1 < p < ∞. Thus we have only to prove that the low energy part W± χ (H0 ) are bounded in Lp (R2 ) for 1 < p < ∞. 2.1. Preliminaries. We record some results from [1] and [2] which we need in what follows. The following three results are Proposition 2.1, Lemma 4.4 and Lemma 4.1 of [2], respectively. We define the operator W (1) (V ) depending on a function V by ∞ 1 (1) R0− (λ)V {R0+ (λ) − R0− (λ)}u dλ (2.1) W (V )u = − 2πi 0 for u ∈ S(R2 ). Here R0± (λ) = limε↓0 R0 (λ ± iε) denote the boundary values of the free resolvent. As is well known, these boundary values exist for λ > 0 in the norm topology of B(L2,s (R2 ), L2,−s (R2 )) for s > 1/2. Here L2,s (R2 ) denotes the usual weighted space. Lemma 1. 
If $V \in L^{2,s}(\mathbb{R}^2)$ for some $s > 1$, then $W^{(1)}(V)$ extends to a bounded operator in $L^p(\mathbb{R}^2)$ for any $p$, $1 < p < \infty$, and

$$\|W^{(1)}(V)\|_{B(L^p)} \le C_{sp}\,\|\langle x\rangle^{s} V\|_2. \tag{2.2}$$

Corollary 1. Suppose that $K$ is an integral operator with the integral kernel $K(x, y)$ and that $K$ satisfies

$$\int_{\mathbb{R}^2}\Bigl(\int_{\mathbb{R}^2} \langle x\rangle^{2s}\,|K(x, x - y)|^2\,dx\Bigr)^{1/2} dy \equiv \|K\|_s < \infty \tag{2.3}$$

for some $s > 1$. Then the operator $Z$, defined by

$$Zu = -\frac{1}{2\pi i}\int_0^\infty R_0^-(\lambda)\,K\,\{R_0^+(\lambda) - R_0^-(\lambda)\}u\,d\lambda \tag{2.4}$$

for $u \in \mathcal{S}(\mathbb{R}^2)$, can be extended to a bounded operator in $L^p(\mathbb{R}^2)$ for any $p$, $1 < p < \infty$, and furthermore $\|Zu\|_p \le C_{sp}\|K\|_s\|u\|_p$.

Lemma 2. Suppose that $N(k)$ satisfies, for some $s > 3$,

$$\|(d/dk)^j N(k)\|_{B(L^{2,-s},\,L^{2,s})} \le C_j\,k^{2-j}\log k \tag{2.5}$$

for $j = 0, 1, 2$ and for $0 < k < c$. Then the operator $A$, defined by

$$Au = -\frac{1}{\pi i}\int_0^\infty R_0^-(k^2)\,N(k)\,\{R_0^+(k^2) - R_0^-(k^2)\}\chi(k^2)u\,k\,dk \tag{2.6}$$

for $u \in \mathcal{S}(\mathbb{R}^2)$, can be extended to a bounded operator in $L^p(\mathbb{R}^2)$ for any $p$, $1 \le p \le \infty$.

For studying the low energy behavior of $R^\pm(k^2)$ we define, following [1],

$$U(x) = \begin{cases} 1, & \text{if } V(x) \ge 0,\\ -1, & \text{if } V(x) < 0, \end{cases} \qquad v(x) = |V(x)|^{1/2}, \qquad w(x) = U(x)v(x).$$

We also need

$$M^\pm(k) = U + vR_0^\pm(k^2)v, \qquad k > 0.$$

Define the orthogonal projections in $L^2(\mathbb{R}^2)$ by

$$P = \|V\|_1^{-1}\,v \otimes v, \qquad Q = 1 - P.$$

It follows from the results in [1] and Assumption 1 that

$$M^\pm(k) = U + c^\pm(k)P + vG_0v + O(k^2\log k) \tag{2.7}$$

in the operator norm of $B(L^2)$, where $c^\pm(k) = a^\pm + b^\pm\log k$, and $G_0$ is the integral operator with the integral kernel

$$G_0(x, y) = -\frac{1}{2\pi}\log|x - y|.$$

The term $O(k^2\log k)$ stands for a $B(L^2)$-valued $C^2$ function $\tilde N(k)$, which satisfies

$$\|d^j/dk^j\,\tilde N(k)\|_{B(L^2)} \le C\,k^{2-j}\log k, \qquad 0 < k < c, \tag{2.8}$$

for $j = 0, 1, 2$. The differentiability of the expansion (2.7) is easily verified using the results in [1]. Note that the decay rate $V(x) = O(\langle x\rangle^{-\delta})$, $\delta > 6$, suffices in order to differentiate twice. The error term is handled using an appropriate version of the remainder in Taylor’s formula and the results in [1]. Hereafter we denote operators which satisfy (2.8) indiscriminately by $O(k^2\log k)$. Let $M_0 = U + vG_0v$. It is known (cf. [1, Theorem 6.2]) that $QM_0Q$ is invertible in $QL^2(\mathbb{R}^2)$ if and only if 0 is neither an eigenvalue nor a resonance of $H$ and, in that case,

$$M^\pm(k)^{-1} = g^\pm(k)^{-1}\{P - PM_0QD_0Q - QD_0QM_0P + QD_0QM_0PM_0QD_0Q\} + QD_0Q + O(k^2\log k), \tag{2.9}$$

where $g^\pm(k) = c^\pm\log k + d^\pm$ with non-vanishing constant $c^\pm$, and where we introduced the notation $D_0 = (QM_0Q)^{-1}$, see formula (6.27) of [1]. Notice that each of the operators in the braces is a rank one operator. With $\alpha = \|V\|_1$ and $v_1 = QD_0QM_0v$ we have

$$P = \alpha^{-1}\,v \otimes v, \qquad PM_0QD_0Q = \alpha\,v \otimes v_1, \tag{2.10}$$

$$QD_0QM_0P = \alpha\,v_1 \otimes v, \qquad QD_0QM_0PM_0QD_0Q = \alpha\,v_1 \otimes v_1. \tag{2.11}$$

Lemma 3. The operator $QD_0Q - QUQ$ is an operator of Hilbert–Schmidt type.

Proof. Since $QM_0Q$ is invertible in $QL^2(\mathbb{R}^2)$, the operator $T = P + QM_0Q$ is invertible in $L^2(\mathbb{R}^2)$ and $D_0 = QT^{-1}Q$. Clearly

$$T = U + \{vG_0v + P + PM_0P - PM_0Q - QM_0P\} \equiv U(1 + S).$$

Here $P$, $PM_0P$, $PM_0Q$, and $QM_0P$ are rank one operators, and $vG_0v$ is of Hilbert–Schmidt type, since $v(x) = O(\langle x\rangle^{-\delta/2})$, $\delta/2 > 3$. Thus $S$ is a Hilbert–Schmidt operator. Since $U$ is invertible, we have that $1 + S$ is also invertible. Using $(1 + S)^{-1} = 1 - S(1 + S)^{-1}$, it follows that $T^{-1} - U$ is a Hilbert–Schmidt operator, which implies the result in the lemma.

2.2. The Proof. It suffices to consider $W_+$. By the stationary representation formula for the wave operators we have

$$W_+\chi(H_0)u = \chi(H_0)u - \frac{1}{2\pi i}\int_0^\infty R^-(\lambda)\,V\,\{R_0^+(\lambda) - R_0^-(\lambda)\}\chi(\lambda)u\,d\lambda. \tag{2.12}$$

The operator $\chi(H_0)$ has a smooth and rapidly decreasing integral kernel, so it is bounded in $L^p(\mathbb{R}^2)$ for any $1 \le p \le \infty$. Hence, we need to study the operator $W_1$ defined by the integral on the right of (2.12). Change to the variable $k$ determined by $\lambda = k^2$, and use the formula

$$R^\pm(k^2)V = R_0^\pm(k^2)\,v\,M^\pm(k)^{-1}v,$$

(2.13)

cf. Sect. 4 in [1]. Then

$$W_1u = -\frac{1}{\pi i}\int_0^\infty R_0^-(k^2)\,v\,M^-(k)^{-1}v\,\{R_0^+(k^2) - R_0^-(k^2)\}\chi(k^2)u\,k\,dk. \tag{2.14}$$

By virtue of (2.9), (2.10), (2.11), and Lemma 3, we have

$$M^-(k)^{-1} = d(k)F + L + U + O(k^2\log k), \qquad d(k) = g^-(k)^{-1}, \tag{2.15}$$

where $F$ is of rank 3, and $L$ is of Hilbert–Schmidt type. It follows that the integral kernels $K_1(x, y)$ and $K_2(x, y)$ of $vFv$ and $v(L + U)v$ satisfy the condition (2.3) of Corollary 1. Thus,

$$W_{11}u = -\frac{1}{\pi i}\int_0^\infty R_0^-(k^2)\,vFv\,\{R_0^+(k^2) - R_0^-(k^2)\}\chi(k^2)u\,k\,dk, \tag{2.16}$$

$$W_{12}u = -\frac{1}{\pi i}\int_0^\infty R_0^-(k^2)\,v(L + U)v\,\{R_0^+(k^2) - R_0^-(k^2)\}\chi(k^2)u\,k\,dk, \tag{2.17}$$

are bounded in $L^p(\mathbb{R}^2)$ for $1 < p < \infty$. On the other hand $vO(k^2\log k)v$ satisfies the condition (2.5) of Lemma 2, since the error term in (2.15) is found using the Neumann series, cf. [1], and since the error term in (2.7) satisfies (2.8). Therefore we can apply Lemma 2 to conclude that

$$W_{13}u = -\frac{1}{\pi i}\int_0^\infty R_0^-(k^2)\,vO(k^2\log k)v\,\{R_0^+(k^2) - R_0^-(k^2)\}\chi(k^2)u\,k\,dk \tag{2.18}$$

is bounded in $L^p(\mathbb{R}^2)$ for $1 \le p \le \infty$. Thus, $W_1 = W_{11}d(|D|) + W_{12} + W_{13}$ is bounded in $L^p(\mathbb{R}^2)$ for $1 < p < \infty$, since $d(|D|)$ is bounded in $L^p(\mathbb{R}^2)$ for any $p$, $1 < p < \infty$, by the standard Fourier multiplier theorem.

References

[1] Jensen, A. and Nenciu, G.: A unified approach to resolvent expansions at thresholds. Rev. Math. Phys. 13, 717–754 (2001)

[2] Yajima, K.: $L^p$-boundedness of wave operators for two dimensional Schrödinger operators. Commun. Math. Phys. 208, 125–152 (1999)

Communicated by B. Simon

Commun. Math. Phys. 225, 639 – 668 (2002)

Communications in

Mathematical Physics

© Springer-Verlag 2002

Nahm Transform and Spectral Curves for Doubly-Periodic Instantons Marcos Jardim University of Pennsylvania, Department of Mathematics, Philadelphia, PA 19104-6593, USA Received: 15 October 1999 / Accepted: 16 October 2001

Abstract: We present the Nahm transform of the doubly-periodic instantons (i.e. instantons on $T^2 \times \mathbb{R}^2$), converting them into certain meromorphic solutions of Hitchin’s equations over an elliptic curve. We then show how to construct a triple consisting of an algebraic curve plus a line bundle with connection over it from a doubly-periodic instanton, and that such data coincides with the Hitchin spectral data associated with the Nahm transformed Higgs bundle.

1. Introduction

In [14], we have shown how certain $SU(2)$ instantons over $\mathbb{R}^4$ which are periodic in two directions, the so-called doubly-periodic instantons, can be constructed from a particular type of singular solutions of Hitchin’s equations (first introduced in [11]) over an elliptic curve. This was done via a procedure known as Nahm transform, which has attracted much attention among physicists recently (see for instance [5,19,7] and the references therein). We now present the inverse construction, showing that all extensible doubly-periodic instantons were obtained in [14]. Recall that given a function $f : T \times \mathbb{C} \to \mathbb{R}$, we say that $f \sim O(|w|^n)$ if:

$$\lim_{w\to\infty}\frac{|f(w)|}{|w|^n} < \infty.$$

We consider anti-self-dual connections $A$ on a rank two bundle $E \to T \times \mathbb{C}$ satisfying the following conditions:

1. $|F_A| \sim O(r^{-2})$;
2. there is a holomorphic vector bundle $\mathcal{E} \to T \times \mathbb{P}^1$, the so-called instanton bundle, with trivial determinant such that $\mathcal{E}|_{T\times(\mathbb{P}^1\setminus\{\infty\})} \simeq (E, \bar\partial_A)$, where $\bar\partial_A$ is the holomorphic structure on $E$ induced by the instanton connection $A$.


Such connections are said to be extensible. If A is an extensible instanton connection, then its energy (i.e. L2 norm of the curvature FA ) is an integer, the instanton number; furthermore, A splits as a sum of flat connections at the torus added infinity, and such flat connections are called the asymptotic states of A (see Sect. 2). Let us now outline the contents of this paper. The key feature of Nahm transforms is to try to solve the Dirac equation, and then use its L2 -solutions to form a vector bundle over the jacobian torus Tˆ , which parametrises the set of holomorphic flat line bundles over T × C. Therefore, our first task is to show that the Dirac operator is Fredholm and compute its index. The bulk of the paper lies in Sects. 4 and 5, where we present the Nahm transform of doubly-periodic instantons and show some of the properties of the transformed objects. Section 6 is dedicated to prove that the construction here presented is indeed the inverse of the one presented in [14], completing the proof of the main result that motivated these two papers: Theorem 1. The Nahm transform is a bijective correspondence between the following objects: – gauge equivalence classes of extensible, irreducible SU (2) instanton connections on E → T × C with fixed instanton number k and asymptotic state ξ0 ; and – admissible U (k) solutions of Hitchin’s equations over the dual torus Tˆ , such that the Higgs field has at most simple poles at ±ξ0 ∈ Tˆ , with semi-simple residues of rank ≤ 2 if ξ0 is an element of order 2 in the Jacobian of T , and rank ≤ 1 otherwise. We also state a higher rank generalization of the above result in Sect. 7. Finally, we discuss the role played by spectral curves in the correspondence of Theorem 1. More precisely, Hitchin has shown that Higgs pairs are equivalent to a pair consisting of an algebraic curve (the spectral curve) in the total space of the cotangent bundle plus a “line bundle” over it [12]. 
We conclude this paper by showing how to construct spectral data, consisting of an algebraic curve plus a line bundle with connection over it, from the instanton (Sect. 8), and proving that it coincides with the Hitchin spectral data for the Nahm transformed Higgs bundle (Sect. 10). In this way, we complete a circle of ideas analogous to Hitchin’s approach to monopoles [10]:

doubly-periodic instantons ↔ singular Higgs pairs ↔ spectral curves ↔ doubly-periodic instantons

A similar circle of ideas has also been established for periodic monopoles (that is, solutions of Bogomolny equations on $\mathbb{R}^2 \times S^1$) by Cherkis and Kapustin [5]. Similar correspondences are expected to hold for all translation invariant instantons on $\mathbb{R}^4$.

Note. This paper presents the combination of the two previous preprints [15] and [16].


2. Extensibility and Asymptotic Behaviour

We now use the extensibility hypothesis to study the compatibility between the instanton connection $A$ and the extended bundle $\mathcal{E} \to T \times \mathbb{P}^1$. More precisely, we first want to show that the holomorphic type of the restriction of the extended bundle to the added divisor $T_\infty = T \times \{\infty\}$ is indeed directly determined by the asymptotic behaviour of the instanton connection $A$. Then we argue that the topology of $\mathcal{E}$ is fixed by the action ($L^2$-norm) of $A$. Before that, we must fix an appropriate trivialisation at infinity.

2.1. Gauge fixing at infinity. Let $B_R$ denote a closed ball in $\mathbb{C}$ of radius $R$, and let $V_R$ be its complement. Also, consider the obvious projection $p : T \times V_R \to T$. We shall need the following technical proposition, which follows from the gauge-fixing result established in [2] (see also the appendix in [13]).

Proposition 1. If $|F_A| \sim O(r^{-2})$, then, for $R$ sufficiently large, there is a gauge over $T \times V_R$ and a constant flat connection $\Gamma$ on a topologically trivial rank two bundle over the elliptic curve such that: $A - p^*\Gamma = \alpha \sim O(r^{-1}\cdot\log r)$.

2.2. Asymptotic states. By general theory, a constant flat connection on a bundle $S \to T$ determines uniquely a holomorphic structure on this bundle. Moreover, $S$ must split, holomorphically, as the sum of two line bundles, i.e. $S = L_{\xi_0} \oplus L_{-\xi_0}$, uniquely up to $\pm 1$. Here, $\pm\xi_0$ are seen as points in $\hat T$, the Jacobian of the elliptic curve $T$. Therefore, by Proposition 1, to each extensible instanton connection we can associate a unique pair of opposite points $\pm\xi_0 \in \hat T$. Such points are called the asymptotic states of $A$.

Lemma 1. If an extensible instanton connection $A$ has asymptotic states $\pm\xi_0$, then $\mathcal{E}|_{T_\infty} = L_{\xi_0} \oplus L_{-\xi_0}$.

Proof. Let $V_\infty \subset \mathbb{P}^1$ be a small neighbourhood centred at $\infty \in \mathbb{P}^1$; let $w$ be a coordinate there. We can regard $\mathcal{E}|_{T\times V_\infty}$ as a family of rank 2 bundles over $T$, parametrised by $w$.
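Furthermore, if $\bar\partial$ denotes the holomorphic structure on $\mathcal{E}$, let $\bar\partial_w$ be the holomorphic structure on the restriction $\mathcal{E}|_{T_w}$. Clearly, as operators:

$$\lim_{w\to\infty}\bar\partial_w = \bar\partial_\infty.$$

However, from condition (2) in the definition of extensibility, we know that $\bar\partial_w = \bar\partial_A|_{T_w}$ away from $\infty$. But Proposition 1 tells us that $\bar\partial_A|_{T_w}$ approaches $\bar\partial_\Gamma$, the holomorphic structure determined by the flat connection $\Gamma$ of Proposition 1, as $w \to \infty$. Therefore, $\bar\partial_\infty = \bar\partial_\Gamma$, and the lemma follows.

2.3. The instanton number. Let us now argue that the topological type of $\mathcal{E}$ is determined by the action of the instanton connection:

Lemma 2. $c_2(\mathcal{E}) = \frac{1}{8\pi^2}\int_{T\times\mathbb{C}} |F_A|^2$.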


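Proof. Again, let $V$ be a small neighbourhood of $\infty \in \mathbb{P}^1$. Let $\Gamma_{\pm\xi_0}$ be the canonical connection on the bundle $L_{\xi_0} \oplus L_{-\xi_0}$ over an elliptic curve and consider the projection $p : T \times V \to T$. Now consider a connection $A'$ on the extended bundle $\mathcal{E}$ that coincides with $p^*\Gamma_{\pm\xi_0}$ on $T \times V$. Therefore

$$c_2(\mathcal{E}) = \frac{1}{8\pi^2}\int_{T\times\mathbb{P}^1}\mathrm{Tr}(F_{A'} \wedge F_{A'}) = \frac{1}{8\pi^2}\int_{T\times(\mathbb{P}^1\setminus\{\infty\})}\mathrm{Tr}(F_{A'} \wedge F_{A'}) = \frac{1}{8\pi^2}\lim_{R\to\infty}\int_{T\times B_R}\mathrm{Tr}(F_{A'} \wedge F_{A'}). \tag{1}$$

On the other hand, we have from Lemma 1 that $A - A' = \alpha$ is a 1-form in $O(r^{-1}\cdot\log r)$. Define the 1-parameter family of connections $A_t = A' + t\cdot\alpha$, so that the corresponding curvatures satisfy:

$$F_{A_t} = t\cdot F_A + (1 - t)\cdot F_{A'} - (t - t^2)\cdot\alpha \wedge \alpha \;\Rightarrow\; |F_{A_t}| \sim O(r^{-2}\cdot\log^2 r) \quad \forall t \in [0, 1]. \tag{2}$$

So let:

$$i(A) = \frac{1}{8\pi^2}\int_{T\times\mathbb{C}}\mathrm{Tr}(F_A \wedge F_A) = \frac{1}{8\pi^2}\lim_{R\to\infty}\int_{T\times B_R}\mathrm{Tr}(F_A \wedge F_A). \tag{3}$$

Usual Chern–Weil theory tells us that:

$$\begin{aligned} c_2(\mathcal{E}) - i(A) &= \frac{1}{8\pi^2}\lim_{R\to\infty}\int_{T\times B_R}\bigl(\mathrm{Tr}(F_{A'} \wedge F_{A'}) - \mathrm{Tr}(F_A \wedge F_A)\bigr) \\ &= -\frac{1}{4\pi^2}\lim_{R\to\infty}\int_{T\times B_R} d\Bigl(\int_0^1 \mathrm{Tr}(\alpha \wedge F_{A_t})\,dt\Bigr) \\ &= -\frac{1}{4\pi^2}\lim_{R\to\infty}\int_{T\times S^1_R}\Bigl(\int_0^1 \mathrm{Tr}(\alpha \wedge F_{A_t})\,dt\Bigr) = 0 \end{aligned}$$

by our estimates in Proposition 1 and Eq. (2).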
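The curvature identity (2) and the transgression step used above can be checked by hand. The following is a worked sketch, writing $A'$ for the connection on the extended bundle that agrees with the flat model near $T_\infty$, so that $\alpha = A - A'$ and $A_t = A' + t\alpha$. The convention $F_A = dA + A \wedge A$ is an assumption made here for the computation, not a statement taken from the text.

```latex
\begin{align*}
F_{A_t} &= F_{A'} + t\, d_{A'}\alpha + t^2\, \alpha\wedge\alpha, \\
F_{A}   &= F_{A'} + d_{A'}\alpha + \alpha\wedge\alpha
  \quad\Longrightarrow\quad
  d_{A'}\alpha = F_A - F_{A'} - \alpha\wedge\alpha, \\
F_{A_t} &= t\,F_A + (1-t)\,F_{A'} - (t - t^2)\,\alpha\wedge\alpha, \\
\frac{d}{dt}\,\mathrm{Tr}(F_{A_t}\wedge F_{A_t})
        &= 2\,\mathrm{Tr}(d_{A_t}\alpha\wedge F_{A_t})
         = 2\, d\,\mathrm{Tr}(\alpha\wedge F_{A_t})
  \qquad\text{(Bianchi identity } d_{A_t}F_{A_t}=0\text{)}, \\
\mathrm{Tr}(F_A\wedge F_A) - \mathrm{Tr}(F_{A'}\wedge F_{A'})
        &= 2\, d\!\int_0^1 \mathrm{Tr}(\alpha\wedge F_{A_t})\, dt .
\end{align*}
```

With $\alpha \sim O(r^{-1}\log r)$ and $|F_{A_t}| \sim O(r^{-2}\log^2 r)$, the integrand on $T \times S^1_R$ is of size $R^{-3}\log^3 R$ while the volume of $T \times S^1_R$ grows like $R$, so the boundary term tends to zero as $R \to \infty$.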

We denote the space of extensible connections with fixed instanton number $k$ and asymptotic states $\xi_0$ by $\mathcal{A}_{(k,\xi_0)}$.

2.4. Estimating the Dolbeault operator. Finally, we need one more lemma that will be useful in the following section, where we develop a Fredholm theory for the Dirac operator coupled to an instanton connection $A \in \mathcal{A}_{(k,\xi_0)}$. First, note that the bundle $L_{\xi_0} \oplus L_{-\xi_0} \to T$ admits a flat connection with constant coefficients, which we denote by $\Gamma_{\xi_0}$. Use the projection $p_1 : T \times V_R \to T$ to pull it back to $T \times V_R$. We show that:

Lemma 3. Let $A \in \mathcal{A}_{(k,\xi_0)}$ be any extensible instanton connection. Given $\epsilon > 0$, there is $R$ sufficiently large such that:

$$\|\bar\partial_A - \bar\partial_{\Gamma_{\xi_0}}\|_{L^2(T\times V_R)} < \epsilon.$$

Proof. Since $\bar\partial_A - \bar\partial_{\Gamma_{\xi_0}}$ is just the $(0, 1)$-part of the 1-form $\alpha = A - \Gamma_{\xi_0}$, the statement is a simple consequence of the gauge-fixing Proposition 1.


3. Fredholm Theory of the Dirac Operator

We begin by recalling that points in the dual torus $\xi \in \hat T$ parametrise the set of flat holomorphic line bundles $L_\xi \to T$. Moreover, such bundles have a natural choice of connection, denoted $i\xi$, which is consistent with the holomorphic structure; see [14]. In fact, $\hat T$ also parametrises the set of flat holomorphic line bundles over $T \times \mathbb{C}$. Using the projection $p_1 : T \times \mathbb{C} \to T$, one obtains the holomorphic line bundle $p_1^*(L_\xi)$ over $T \times \mathbb{C}$, which we shall also denote by $L_\xi$ for simplicity; let $\omega_\xi$ be the pullback of the flat constant connection on $L_\xi \to T$ described above; clearly, such connection is also flat.

Let $E \to T \times \mathbb{C}$ be a rank 2 bundle provided with an instanton connection $A \in \mathcal{A}_{(k,\xi_0)}$. Form the bundle $E \otimes L_\xi$ with the corresponding connection $A_\xi = A \otimes I + I \otimes \omega_\xi$; since all we have done was to add a flat term to our original instanton, $A_\xi$ is still an instanton on the twisted bundle. We also require $A$ to be irreducible; clearly, its twisted version $A_\xi$ is also irreducible. Consider now the Dirac operator acting on the bundle $E(\xi) = E \otimes L_\xi$, coupled to the connection $A_\xi$, and its adjoint:

$$D_{A_\xi} : \Gamma(E(\xi) \otimes S^+) \to \Gamma(E(\xi) \otimes S^-), \qquad D^*_{A_\xi} : \Gamma(E(\xi) \otimes S^-) \to \Gamma(E(\xi) \otimes S^+),$$

where the spaces of sections are provided with norms suitably defined. Since the base manifold is flat and the connection is anti-self-dual, the Weitzenböck formula on $E(\xi) \otimes S^+ \to T \times \mathbb{C}$ is simply:

$$D^*_{A_\xi}D_{A_\xi} = \nabla^*_{A_\xi}\nabla_{A_\xi} \;\Rightarrow\; \|D_{A_\xi}s\|^2 = \|\nabla_{A_\xi}s\|^2. \tag{4}$$

Hence, if $A_\xi$ is irreducible, there are no covariantly constant sections of $E(\xi) \otimes S^+$. This means that the kernel of $D_{A_\xi}$ is trivial. Now, if $D_{A_\xi}$ is a Fredholm operator, then $\ker D^*_{A_\xi}$ (which coincides with $\mathrm{coker}\,D_{A_\xi}$) is a finite dimensional subspace of $\Gamma(E(\xi) \otimes S^-)$. In this rather technical but fundamental section, we prove that this is indeed the case:

Theorem 2. Given any instanton connection $A \in \mathcal{A}_{(k,\xi_0)}$, the Dirac operators:

$$D^*_{A_\xi} : L^2_1(E(\xi) \otimes S^-) \to L^2(E(\xi) \otimes S^+) \tag{5}$$

form a smooth family of Fredholm operators parametrised by $\hat T \setminus \{\pm\xi_0\}$. Moreover, $\mathrm{index}\,D^*_{A_\xi} = k$, for all $\xi \in \hat T \setminus \{\pm\xi_0\}$.

The Sobolev norm in the left-hand side of (5) is defined as follows. Let $D^*_\xi$ be the Dirac operator $L_\xi \otimes S^- \to L_\xi \otimes S^+$. Then $L^2_1(E(\xi) \otimes S^-)$ is the completion of $\Gamma(E(\xi) \otimes S^-)$ with respect to the norm:

$$\|s\|_{L^2_1} = \|s\|_{L^2} + \|D^*_\xi s\|_{L^2}. \tag{6}$$

The proof consists of three steps, which we now outline. We first prove that the operators Dξ∗ : L21 (Lξ ⊗ S − ) → L2 (Lξ ⊗ S + ) are invertible for nontrivial ξ ∈ Tˆ . A


gluing argument then shows that the Dirac operator coupled to a twisted instanton $A_\xi$ is Fredholm if $\xi \neq \pm\xi_0$. To compute its index, we use the Gromov–Lawson Relative Index Theorem [9]. It is important to note here that $D_{A_\xi}$ fails to be Fredholm when $\xi = \pm\xi_0$; the reason will be clear from the proof of the theorem. As we will see, this phenomenon is the source of the singularities that appear in the transformed objects.

3.1. The flat model. Let $L_\xi \to T \times \mathbb{C}$ be the flat line bundle described above, provided with the constant connection $\omega_\xi$. Our starting point to prove Theorem 2 is the following proposition.

Proposition 2. For non-trivial $\xi \in \hat T$, the coupled Dirac operator $D^*_\xi : L^2_1(L_\xi \otimes S^-) \to L^2(L_\xi \otimes S^+)$ is invertible. Its inverse is denoted by $Q^\infty_\xi$.

Proof. Let $L_\xi \to T \times \mathbb{C}$ be a flat line bundle as above, provided with the constant connection $\omega_\xi = p^*(-i\xi)$, as described in [14]. Consider the twisted Dirac operator $D_\xi : \Gamma(L_\xi \otimes S^+) \to \Gamma(L_\xi \otimes S^-)$ and its adjoint $D^*_\xi$. Since $M = T \times \mathbb{C}$ is a Kähler surface, we have the following decompositions:

$$S^+ = \Lambda^{0,0}_M L_\xi \oplus \Lambda^{0,2}_M L_\xi, \qquad S^- = \Lambda^{0,1}_M L_\xi = \Lambda^{0,1}_T L_\xi \oplus \Lambda^{0,1}_{\mathbb{C}} L_\xi. \tag{7}$$

With respect to these decompositions, the Dirac operator and its adjoint are given by:    (z)  (z),∗ (w),∗ (w),∗ ∂ ξ −∂ ξ −∂ ξ −∂ ξ  D∗ =  , Dξ =  (8) ξ (w) (z),∗ (w) (z) ∂ ξ −∂ ξ ∂ξ ∂ξ (z,w)

where $\bar\partial^{(z,w)}_\xi$ denotes the Dolbeault operator twisted by $\omega_\xi$ along the toroidal ($z$) and plane ($w$) complex coordinates, i.e. the components of the covariant derivative. Hence, the coupled Dirac laplacian $\Delta_\xi = D^*_\xi D_\xi$ mapping $\Lambda^{0,0}_M L_\xi \oplus \Lambda^{0,2}_M L_\xi$ to itself is just:

$$\begin{pmatrix} \Delta^{(z)}_\xi + \Delta^{(w)}_\xi & 0 \\ 0 & \Delta^{(z)}_\xi + \Delta^{(w)}_\xi \end{pmatrix}. \tag{9}$$

The off-diagonal terms cancel, for they are proportional to the curvature, which was supposed to vanish. Moreover, the flat connection $\omega_\xi$ is a pull back from the torus, so that $\Delta^{(w)}_\xi$ is just the usual plane laplacian. Let us concentrate on a single component, say $\Lambda^{0,0}_M L_\xi$.


First, we want to solve the homogeneous equation $\Delta_\xi f = 0$ for $f \in \Lambda^{0,0}_M(L_\xi)$ and a fixed $\xi \in \hat T$. Now, separate variables, supposing that $f(z, w) = \varphi(z)g(w)$:

$$\Delta_\xi f = 0 \;\Leftrightarrow\; g\,\Delta^{(z)}_\xi\varphi + \varphi\,\Delta^{(w)}g = 0.$$

Therefore:

$$\text{(i)}\;\; \Delta^{(z)}_\xi\varphi = \lambda^2\varphi, \qquad \text{(ii)}\;\; \Delta^{(w)}g = -\lambda^2 g \;\to\; (\Delta^{(w)} + \lambda^2)g = 0, \tag{10}$$

where $\lambda^2$ are the eigenvalues of the $\xi$-twisted laplacian over the torus. They form a discrete, unbounded set $\{\lambda_n(\xi)\}_{n\in\mathbb{N}}$ of $\mathbb{R}^+$, each being a function of the parameter $\xi$. Note that since $H^0(T, L_\xi) = 0$ for nontrivial $\xi \in \hat T$, we can indeed guarantee that $\lambda_n(\xi) > 0$ for all nontrivial $\xi$. On the other hand, for $L_\xi = \mathbb{C}$, the laplacian has a 1-dimensional kernel, i.e. one zero eigenvalue.

As usual, we can decompose $f$ on the eigenstates of $\Delta^{(z)}_\xi$, i.e.:

$$f = \sum_n g_n(w)\varphi_n(z), \tag{11}$$

where $\{\varphi_n\}$ is an orthonormal basis for the $L^2$ norm on $\Lambda^{0,0}_M(L_\xi)$ of eigenstates with eigenvalues $\{\lambda_n^2\}$; so, $\|f\|^2_{L^2(T\times\mathbb{C})} = \sum_n \|g_n\|^2_{L^2(\mathbb{C})}$. Moreover:

$$\Delta_\xi f = \sum_n [(\Delta^{(w)} + \lambda_n^2)g_n]\varphi_n. \tag{12}$$
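To make the dependence of the eigenvalues $\lambda_n(\xi)^2$ on the twist concrete, here is a minimal sketch in a model case. The model is an assumption, not taken from the text: $T$ is the square torus $\mathbb{R}^2/\mathbb{Z}^2$, $L_\xi$ is the trivial bundle with connection $d + i(\xi_1\,dx + \xi_2\,dy)$, and the eigenfunctions are the Fourier modes $e^{2\pi i(n_1x + n_2y)}$ with eigenvalues $|2\pi n + \xi|^2$. The function names are hypothetical.

```python
import math


def twisted_torus_eigenvalues(xi, n_max=2):
    """Eigenvalues |2*pi*n + xi|^2 of the xi-twisted Laplacian on R^2/Z^2.

    Model assumption: flat square torus, connection d + i*(xi[0] dx + xi[1] dy),
    Fourier modes exp(2*pi*i*(n1*x + n2*y)) with |n1|, |n2| <= n_max.
    """
    values = []
    for n1 in range(-n_max, n_max + 1):
        for n2 in range(-n_max, n_max + 1):
            values.append((2 * math.pi * n1 + xi[0]) ** 2
                          + (2 * math.pi * n2 + xi[1]) ** 2)
    return sorted(values)


def smallest_eigenvalue(xi):
    """Lowest eigenvalue; reliable for xi in the cell [-pi, pi]^2."""
    return twisted_torus_eigenvalues(xi)[0]


if __name__ == "__main__":
    # The trivial twist has a zero mode; any nontrivial twist lifts it.
    for xi in [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (math.pi, math.pi)]:
        print(xi, smallest_eigenvalue(xi))
```

In this model the lowest eigenvalue is $|\xi|^2$ near the trivial twist, which illustrates why a bound of the form $\lambda^{-2}$ degenerates as $\xi$ approaches the trivial element.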

Proposition 3. Let $\rho \in L^2(L_\xi \otimes S^+)$ be compactly supported and suppose that $\xi$ is nontrivial. Then there is $f \in L^2(L_\xi \otimes S^+)$ and a constant $k < \infty$ such that $\Delta_\xi f = \rho$ and $\|f\|_{L^2} \le k\|\rho\|_{L^2}$.

Proof. Given (12), solving the equation $\Delta_\xi f = \rho$ amounts to solving $(\Delta^{(w)} + \lambda_n^2)g_n = \rho_n$ for each $n$, where $g_n$, $\rho_n$ are the components of $f$, $\rho$ along the eigenspaces of $\lambda_n^2$, respectively. Fix some integer $n$ and denote by $F_n$ the fundamental solution of $(\Delta^{(w)} + \lambda_n^2)F_n(w) = 0$. Rescale the plane coordinate, $w' = \lambda_n w$, which transforms the previous equation to $(\Delta^{(w')} + 1)F_n(w'/\lambda_n) = 0$. The unique integrable solution for this equation is the Bessel function $K_0$ (see below), so that $F_n(w) = K_0(\lambda_n|w|)$. Solutions to the non-homogeneous equations will then be given by the convolution:

$$g_n(w) = \int_{\mathbb{R}^2} F_n(w - x)\rho_n(x)\,dx\,d\bar x \tag{13}$$

and recall that $\|g_n\|_{L^2} \le \|F_n\|_{L^1}\|\rho_n\|_{L^2}$. So, all we need is an estimate for $\|F_n\|_{L^1}$ which is independent of $n$. From the expression above, one sees that each $F_n$ is integrable if the Bessel function $K_0$ is: $\|F_n\|_{L^1} = \lambda_n^{-2}\|K_0\|_{L^1}$. So, let $\lambda = \min\{\lambda_n\}_{n\in\mathbb{N}}$; therefore, $\|F_n\|_{L^1} \le \lambda^{-2}\|K_0\|_{L^1}$; putting $k = \lambda^{-2}\|K_0\|_{L^1}$ we have $\|g_n\|_{L^2} \le k\|\rho_n\|_{L^2}$ for each $n$. This completes the proof.
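The scaling $\|F_n\|_{L^1} = \lambda_n^{-2}\|K_0\|_{L^1}$ used in the proof can be checked numerically. The sketch below uses only the standard library; it evaluates $K_0$ through the classical representation $K_0(r) = \int_0^\infty e^{-r\cosh t}\,dt$ (a standard formula, chosen here for convenience and different from the representation quoted later in the text) and integrates in polar coordinates. The function names, step sizes and cut-offs are illustrative assumptions.

```python
import math


def bessel_k0(r, h=0.1):
    """K0(r), r > 0, from K0(r) = int_0^inf exp(-r*cosh(t)) dt (trapezoid rule)."""
    t_max = math.acosh(1.0 + 50.0 / r)   # integrand is below e^-50 past t_max
    n = int(math.ceil(t_max / h))
    total = 0.5 * math.exp(-r)           # half weight at t = 0
    for j in range(1, n + 1):
        total += math.exp(-r * math.cosh(j * h))
    return h * total


def l1_norm_fundamental_solution(lam, n_steps=3000):
    """L1 norm over the plane of F(w) = K0(lam*|w|), via polar coordinates.

    Midpoint rule for 2*pi * int_0^{30/lam} r*K0(lam*r) dr; the tail beyond
    30/lam is exponentially small and is dropped.
    """
    r_max = 30.0 / lam
    dr = r_max / n_steps
    total = 0.0
    for i in range(n_steps):
        r = (i + 0.5) * dr
        total += r * bessel_k0(lam * r)
    return 2.0 * math.pi * total * dr


if __name__ == "__main__":
    # Compare the numerical norm with the predicted value 2*pi / lam^2.
    for lam in (0.5, 1.0, 2.0):
        print(lam, l1_norm_fundamental_solution(lam), 2.0 * math.pi / lam ** 2)
```

Since the bound $k = \lambda^{-2}\|K_0\|_{L^1}$ is governed by the smallest eigenvalue, the same computation shows how the constant grows as $\lambda \to 0$.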


Consider the Hilbert space $L^2_2(L_\xi \otimes S^\pm)$ obtained by the completion of $\Gamma(L_\xi \otimes S^\pm)$ with respect to the norm:

$$\|s\|_{L^2_2} = \|s\|_{L^2} + \|\Delta_\xi s\|_{L^2}. \tag{14}$$

The map $\Delta_\xi : L^2_2(L_\xi \otimes S^-) \to L^2(L_\xi \otimes S^-)$ is then bounded, for clearly $\|\Delta_\xi s\|_{L^2} \le \|s\|_{L^2_2}$. Let $G_\xi : L^2(L_\xi \otimes S^-) \to L^2_2(L_\xi \otimes S^-)$ be the inverse of $\Delta_\xi$ given by Proposition 3. Using the inequality of the proposition, one shows that $G_\xi$ is also bounded, if $\xi$ is nontrivial:

$$\|G_\xi s\|_{L^2_2} = \|G_\xi s\|_{L^2} + \|\Delta_\xi G_\xi s\|_{L^2} = \|G_\xi s\|_{L^2} + \|s\|_{L^2} \le k\|s\|_{L^2} + \|s\|_{L^2} \le (k + 1)\cdot\|s\|_{L^2}.$$

Moreover, we also conclude that:

$$\|G_\xi\| < 1 + \frac{C}{\lambda^2}. \tag{15}$$

Hence, $G_\xi$ is an invertible operator when acting between the above Hilbert spaces, if $\xi$ is non-trivial.

Remark 1. We emphasise the necessity of assuming that $\xi$ is nontrivial. If $\xi = \hat e$, then Eq. (10i) admits one zero eigenvalue; on the other hand, the fundamental solution of $\Delta^{(w)}g = 0$ is essentially $\log r$, which is not integrable. It is then impossible to get the estimate of Proposition 3; in other words, the operator $\Delta_{(\xi = \hat e)}$ fails to be invertible. In addition, the parameter $k$ also depends on $\xi$, and $k \to \infty$ (i.e. $\lambda \to 0$) as $\xi \to 0$.

Now, define the norms:

$$\|s\|_{L^2_1} = \|s\|_{L^2} + \|D^*_\xi s\|_{L^2} \;\text{ if } s \in \Gamma(L_\xi \otimes S^-), \qquad \|s\|_{L^2_{l+1}} = \|s\|_{L^2_l} + \|D_\xi s\|_{L^2_l} \;\text{ if } s \in \Gamma(L_\xi \otimes S^+), \tag{16}$$

and consider the Dirac operators as maps between the following Hilbert spaces, obtained by the completion of $\Gamma(L_\xi \otimes S^\pm)$ with respect to the above norms:

$$D^*_\xi : L^2_1(L_\xi \otimes S^-) \to L^2(L_\xi \otimes S^+), \qquad D_\xi : L^2_{l+1}(L_\xi \otimes S^+) \to L^2_l(L_\xi \otimes S^-). \tag{17}$$

Then $D^*_\xi$ is clearly bounded. Furthermore, it has an inverse given by $(D^*_\xi)^{-1} = D_\xi G_\xi : L^2(L_\xi \otimes S^+) \to L^2_1(L_\xi \otimes S^-)$, which is also bounded:

$$\begin{aligned} \|(D^*_\xi)^{-1}s\|_{L^2_1} &= \|(D^*_\xi)^{-1}s\|_{L^2} + \|D^*_\xi(D^*_\xi)^{-1}s\|_{L^2} \\ &= \|D_\xi G_\xi s\|_{L^2} + \|s\|_{L^2} = \|D_\xi G_\xi s\|_{L^2_1} \\ &\le \|G_\xi s\|_{L^2_2} \le (k + 1)\cdot\|s\|_{L^2}. \end{aligned}$$

So, $D^*_\xi$ is also Fredholm when acting as in (17), and our proof is complete. For further reference, we shall denote $Q^\infty_\xi = (D^*_\xi)^{-1}$; note, moreover, that this is a bounded, elliptic, pseudo-differential operator of order $-1$.


We are left with one point to establish: the integrability of the fundamental solution of $(\Delta + 1)F = 0$ in the plane. Indeed, first note that since the operator $\Delta + 1$ has polar symmetry, the fundamental solution $F$ has it too. After imposing this symmetry, we obtain the following ODE, for $r > 0$:

$$(\Delta + 1)F(r) = 0 \;\Rightarrow\; F'' + \frac{1}{r}F' - F = 0.$$

This is a Bessel equation with parameter $\nu = 0$. Its solutions are linear combinations of the Bessel functions of imaginary argument $I_0$ and $K_0$ (see [1], chapter 11). Below are possible integral representations for these functions (see [8]):

$$K_0(r) = \int_1^\infty e^{-rt}(t^2 - 1)^{-1/2}\,dt, \qquad I_0(r) = \frac{1}{\pi}\int_{-1}^{1}\cosh(rt)(1 - t^2)^{-1/2}\,dt.$$

It is easy to see that $I_0(r)$ increases exponentially with $r$; it is also finite for $r = 0$. For the purpose of finding a Green’s function for the operator $\Delta + 1$, this solution can be eliminated. With the help of a table of integrals, one finds out that $K_0$ is integrable; indeed, by [8]:

$$\int_{\mathbb{R}^2} K_0(r)\,d^2\mathrm{vol} = \int_0^{2\pi}\!\!\int_0^\infty K_0(r)\,r\,dr\,d\theta = 2\pi\int_0^\infty rK_0(r)\,dr = 2\pi.$$

This means that $\|K_0\|_{L^1} = 2\pi$.

Proposition 4. The solution $f$ of the flat laplacian problem $\Delta_\xi f = \rho$ of Proposition 3 decays exponentially if $\xi$ is nontrivial, in the sense that there is a real constant $\lambda > 0$ such that:

$$\lim_{r\to\infty} e^{\lambda r}|f| < \infty.$$

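Proof. As $r \to \infty$, the Bessel function $K_0$ admits the following asymptotic expansion ([20], p. 202):

$$K_0(r) \sim \Bigl(\frac{\pi}{2}\Bigr)^{1/2}\frac{e^{-r}}{\sqrt{r}}\Bigl(1 - \frac{1}{8r} + \frac{9}{128r^2} + \cdots\Bigr). \tag{18}$$

Now since each $\rho_n$ has compact support, it follows from (13) that each $g_n$ will also decay exponentially:

$$g_n(w) \sim \Bigl(\frac{\pi}{2}\Bigr)^{1/2}\int_\Omega \frac{e^{-\lambda_n|w - x|}}{\sqrt{\lambda_n|w - x|}}\Bigl(1 - \frac{1}{8\lambda_n|w - x|} + \cdots\Bigr)\rho_n(x)\,dx\,d\bar x,$$

where $\Omega$ is the support of $\rho$. As $|w| \to \infty$, then also $|w - x| \sim |w|$ for all $x \in \Omega$. Therefore,

$$g_n(w) \sim \Bigl(\frac{\pi}{2}\Bigr)^{1/2}\frac{e^{-\lambda_n|w|}}{\sqrt{\lambda_n|w|}}\Bigl(1 - \frac{1}{8\lambda_n|w|} + \cdots\Bigr)\int_\Omega \rho_n(x)\,dx\,d\bar x, \qquad \text{as } |w| \to \infty.$$

Choosing $0 < \lambda < \min\{\lambda_n\}_{n\in\mathbb{N}}$, the statement follows from the eigenspace decomposition of $f$ (11) and (12).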
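The quality of the expansion (18) is easy to probe numerically. The following sketch compares its partial sums with a direct evaluation of $K_0$; it is standard-library only, and the evaluation of $K_0$ through $\int_0^\infty e^{-r\cosh t}\,dt$ as well as the function names are choices made here for illustration, not part of the text.

```python
import math


def bessel_k0(r, h=0.1):
    """K0(r), r > 0, from K0(r) = int_0^inf exp(-r*cosh(t)) dt (trapezoid rule)."""
    t_max = math.acosh(1.0 + 50.0 / r)   # integrand is below e^-50 past t_max
    n = int(math.ceil(t_max / h))
    total = 0.5 * math.exp(-r)           # half weight at t = 0
    for j in range(1, n + 1):
        total += math.exp(-r * math.cosh(j * h))
    return h * total


def k0_asymptotic(r, terms=3):
    """Partial sum of the large-r expansion: sqrt(pi/(2r)) e^-r (1 - 1/(8r) + 9/(128 r^2))."""
    coeffs = [1.0, -1.0 / 8.0, 9.0 / 128.0]
    series = sum(c / r ** j for j, c in enumerate(coeffs[:terms]))
    return math.sqrt(math.pi / (2.0 * r)) * math.exp(-r) * series


def relative_error(r, terms=3):
    """Relative deviation of the truncated expansion from K0(r)."""
    exact = bessel_k0(r)
    return abs(k0_asymptotic(r, terms) - exact) / exact


if __name__ == "__main__":
    for r in (2.0, 5.0, 10.0, 20.0):
        print(r, relative_error(r, terms=1), relative_error(r, terms=3))
```

The relative error shrinks as $r$ grows and as more terms are kept, which is the behaviour the exponential decay estimate for $g_n$ relies on.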


In particular, note that (f/w) also belongs to L2 (Lξ ⊗ S + ). Define 2 (Lξ ⊗ S + ) as the space of all ψ ∈ (Lξ ⊗ S + ) such that ψ/w is square-integrable. L The proposition just proved implies that the flat model laplacian acting as follows: 2 (Lξ ⊗ S ± ) → L2 (Lξ ⊗ S ± ) "ξ : L is an invertible operator. Since "ξ = Dξ Dξ∗ , we conclude that: 2 (Lξ ⊗ S − ) → L2 (Lξ ⊗ S + ) Dξ∗ : L

(19)

is also invertible.

3.2. Completing the proof of Theorem 2. Let $K$ denote a closed ball in $\mathbb{C}$ of sufficiently large radius $R$; its complement is $D_R$ defined as above. To show that $D^*_{A_\xi}$ is Fredholm, first note that the usual elliptic theory for compact manifolds guarantees the existence of a parametrix for $D^*_{A_\xi}$ inside this compact core $T \times K$; this is a bounded, elliptic, pseudo-differential operator:

$$Q^K_{A_\xi} : L^2(E(\xi) \otimes S^+|_{T\times K}) \to L^2_1(E(\xi) \otimes S^-|_{T\times K})$$

of order $-1$. On the other hand, it follows from Lemma 3 that:

$$\|D^*_{A_\xi} - (D^*_{\xi_0+\xi} \oplus D^*_{-\xi_0+\xi})\|_{L^2(T\times D_R)} < 2\epsilon,$$

where $\epsilon$ can be made arbitrarily small. Thus, $D^*_{A_\xi}|_{T\times D_R}$ is also invertible for sufficiently large $R \gg 0$, if $\xi \neq \pm\xi_0$. Denote this inverse by $Q^\infty_{A_\xi}$; this is also a bounded, elliptic, pseudo-differential operator of order $-1$. Now choose $\beta_1, \beta_2 : \mathbb{C} \to \mathbb{R}$ respectively supported over $K$ and $D_R$ and satisfying $\beta_1^2 + \beta_2^2 = 1$ everywhere. We can patch together our two parametrices $Q^K_{A_\xi}$ and $Q^\infty_{A_\xi}$ in the following way:

$$P_{A_\xi}g = \beta_1 Q^K_{A_\xi}(\beta_1 g) + \beta_2 Q^\infty_{A_\xi}(\beta_2 g). \tag{20}$$
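The cut-off functions $\beta_1, \beta_2$ are not specified in the text. One possible choice, shown here purely as an illustration, takes a smooth step in the radial variable and sets $\beta_1 = \cos\theta$, $\beta_2 = \sin\theta$, which gives $\beta_1^2 + \beta_2^2 = 1$ automatically. The transition annulus between the radii `r_in` and `r_out`, and all names below, are assumptions of this sketch.

```python
import math


def smooth_step(x):
    """C-infinity step: equals 0 for x <= 0 and 1 for x >= 1."""
    def bump(y):
        return math.exp(-1.0 / y) if y > 0 else 0.0
    return bump(x) / (bump(x) + bump(1.0 - x))


def cutoffs(r, r_in=10.0, r_out=11.0):
    """Radial cut-offs (beta1, beta2) with beta1^2 + beta2^2 = 1.

    beta1 equals 1 for r <= r_in and vanishes (up to rounding) for r >= r_out;
    beta2 vanishes for r <= r_in and equals 1 for r >= r_out.
    """
    theta = 0.5 * math.pi * smooth_step((r - r_in) / (r_out - r_in))
    return math.cos(theta), math.sin(theta)


if __name__ == "__main__":
    for r in (5.0, 10.25, 10.5, 10.75, 12.0):
        b1, b2 = cutoffs(r)
        print(r, b1, b2, b1 * b1 + b2 * b2)
```

The derivatives $d\beta_1$, $d\beta_2$ are then supported in the closed annulus $r_{\mathrm{in}} \le r \le r_{\mathrm{out}}$, which is the compact region that enters the compactness argument for the error term.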

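This is the same as restricting the section $g$ to $T \times K$ (respectively, $T \times D_R$), applying $Q^K_{A_\xi}$ ($Q^\infty_{A_\xi}$) and restricting the result again to $T \times K$ ($T \times D_R$). Note that $P_{A_\xi}$ acts as follows:

$$P_{A_\xi} : L^2(E(\xi) \otimes S^+) \to L^2_1(E(\xi) \otimes S^-).$$

We want to show that this is a parametrix for $D^*_{A_\xi}$. In fact, take $g \in L^2(E(\xi) \otimes S^+)$; then:

$$\begin{aligned} D^*_{A_\xi}P_{A_\xi}g &= D^*_{A_\xi}[\beta_1 Q^K_{A_\xi}(\beta_1 g)] + D^*_{A_\xi}[\beta_2 Q^\infty_{A_\xi}(\beta_2 g)] \\ &= \{\beta_1 D^*_{A_\xi}Q^K_{A_\xi}(\beta_1 g) + \beta_2 D^*_{A_\xi}Q^\infty_{A_\xi}(\beta_2 g)\} + \overbrace{d\beta_1 . Q^K_{A_\xi}(\beta_1 g) + d\beta_2 . Q^\infty_{A_\xi}(\beta_2 g)}^{S^\infty g}, \end{aligned} \tag{21}$$

where “.” means Clifford multiplication.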


Since $Q^K_{A_\xi}$ is a parametrix for $D^*_{A_\xi}$ inside $T \times K$, the first term (inside brackets) equals the identity plus a compact operator $S^K$ acting on $\beta_1 g$. Similarly, in the second term, $Q^\infty_{A_\xi}$ is the inverse of the Dirac operator outside $K$. Together, the first two terms form the identity operator plus $S^K$. Hence:

$$(D^*_{A_\xi}P_{A_\xi} - I)g = S^K g + S^\infty g,$$

where $S^\infty : L^2(E(\xi) \otimes S^+) \to L^2(E(\xi) \otimes S^+)$ is the operator over the brackets in (21). Since $Q^K_{A_\xi}$ and $Q^\infty_{A_\xi}$ are bounded operators, so is $S^\infty$; we argue that this is also a compact operator.

In fact, let $\partial K$ denote the closure of the support of $d\beta_1$ and $d\beta_2$ (which is an annulus around the boundary of $K$). Consider the diagram:

$$L^2(E(\xi) \otimes S^+) \xrightarrow{\;s\;} L^2_1(E(\xi) \otimes S^+|_{T^2\times\partial K}) \xrightarrow{\;i\;} L^2(E(\xi) \otimes S^+|_{T^2\times\partial K}) \subset L^2(E(\xi) \otimes S^+).$$

Now, let $\Upsilon \subset L^2(E(\xi) \otimes S^+)$ be a bounded set; since $s$ is a bounded operator, $s(\Upsilon)$ is also bounded. By the Rellich lemma, the map $i$ is a compact inclusion; note that $\partial K$ is a compact subset of the plane. Hence, $i(s(\Upsilon))$ is a relatively compact subset of $L^2(E(\xi) \otimes S^+|_{T^2\times\partial K})$, and clearly also a relatively compact subset of $L^2(E(\xi) \otimes S^+)$. This means that:

$$S^\infty = i \circ s : L^2(E(\xi) \otimes S^+) \to L^2(E(\xi) \otimes S^+)$$

is a compact operator, as we have claimed. We conclude that:

$$D^*_{A_\xi}P_{A_\xi} - I = [\text{compact operator}]$$

so that (20) is indeed a parametrix for $D^*_{A_\xi}$ if $\xi \neq \pm\xi_0$.

Finally, to compute the index of $D^*_{A_\xi}$ we use the Relative Index Theorem of Gromov & Lawson [9] (see also the appendix in [13]). One can show that:

Lemma 4. If $A \in \mathcal{A}_{(k,\xi_0)}$, then $\mathrm{index}[D^*_{A_\xi}] = k$, for all $\xi \neq \pm\xi_0$.

3.3. The Green’s operator. Clearly, the Dirac laplacian, with the norms as in (14):

$$\Delta_{A_\xi} : L^2_2(E \otimes L_\xi \otimes S^+) \to L^2(E \otimes L_\xi \otimes S^+), \qquad \Delta_{A_\xi} = D^*_{A_\xi}D_{A_\xi} \tag{22}$$

is also a Fredholm operator. In particular, by general Fredholm theory, there is a bounded operator $G_{A_\xi}$, called the Green’s operator, such that: $\Delta_{A_\xi}G_{A_\xi} = I - H_\xi$, where $H_\xi$ is the finite rank orthogonal projection operator: $H_\xi : L^2_2(E \otimes L_\xi \otimes S^+) \to \ker(\Delta_{A_\xi})$.


3.4. Harmonic spinors and cohomology. To conclude this chapter, we want to interpret the harmonic spinors $\psi \in \ker D^*_A$ as some holomorphic object defined in terms of the compactified bundle $\mathcal{E} \to T \times \mathbb{P}^1$. Indeed, we aim to establish the following identification:

Proposition 5. If $A$ has nontrivial asymptotic state $\xi_0 \in \hat T$ and $k > 0$, then there is an isomorphism $H^1(T \times \mathbb{P}^1, \mathcal{E}) \equiv \ker D^*_A$.

Note that $\ker D^*_A \subset L^2_1(E \otimes S^-)$, with the norm defined in (6). First, we must show that $H^1(T \times \mathbb{P}^1, \mathcal{E})$ has the correct dimension.

Vanishing theorem. Since $\chi(\mathcal{E}) = -k$, it is enough to show that the cohomologies of orders 0 and 2 vanish in order to conclude that $h^1(T \times \mathbb{P}^1, \mathcal{O}(\mathcal{E})) = k$.

A holomorphic bundle $\mathcal{E} \to T \times \mathbb{P}^1$ is said to be generically fibrewise semistable if the restriction $\mathcal{E}|_{T_w}$ is semistable$^1$ for generic $w \in \mathbb{P}^1$ (here, $T_w = T \times \{w\}$). Similarly, $\mathcal{E}$ is said to be fibrewise semistable (regular) if the restriction $\mathcal{E}|_{T_w}$ is semistable (regular) for all $w \in \mathbb{P}^1$. Notice that every instanton bundle is generically fibrewise semistable, since $\mathcal{E}|_{T_\infty}$ is semistable, which is a generic condition. This observation leads to the desired vanishing result:

Lemma 5. If $\mathcal{E}$ is an irreducible instanton bundle and $k > 0$, then: $h^0(T \times \mathbb{P}^1, \mathcal{E}(\xi)) = h^2(T \times \mathbb{P}^1, \mathcal{E}(\xi)) = 0$, $\forall\xi \in \hat T$.

Let $L_\xi \to T$ be a flat line bundle as described in [14]; denote:

$$\mathcal{E}(\xi) = \mathcal{E} \otimes p_1^*L_\xi \quad\text{and}\quad \tilde{\mathcal{E}}(\xi) = \mathcal{E} \otimes p_1^*L_\xi \otimes p_2^*\mathcal{O}_{\mathbb{P}^1}(1).$$

Note that we can regard $p_2^*\mathcal{O}_{\mathbb{P}^1}(1)$ as the line bundle corresponding to the divisor $T_\infty$. It follows from the lemma that:

$$h^1(T \times \mathbb{P}^1, \mathcal{E}(\xi)) = h^1(T \times \mathbb{P}^1, \tilde{\mathcal{E}}(\xi)) = k$$

for every $\xi \in \hat T$.

Proof. Take $w \in \mathbb{P}^1$ such that $\mathcal{E}(\xi)|_{T_w} = L_{\xi_1} \oplus L_{\xi_2}$ for some non-trivial $\xi_1, \xi_2 \in \hat T$ and let $V \subset \mathbb{P}^1$ be an open neighbourhood of $w$ such that every point of $V$ satisfies the same condition; the existence of such an open set is guaranteed by the fact that $\mathcal{E}$ is generically fibrewise semistable. Suppose there is a holomorphic section $s \in H^0(M, \mathcal{E}(\xi))$; it gives rise to a holomorphic section $s_w$ of $\mathcal{E}(\xi)|_{T_w} \to T_w$. On the other hand, we have that $h^0(T, \mathcal{E}(\xi)|_{T\times\{w\}}) = 0$, hence $s_w \equiv 0$. Moreover, $s_w \equiv 0$ for all $w \in V$, so that $s$ must vanish identically on the open set $T \times V$, hence vanish everywhere and $h^0(\mathcal{E}(\xi)) = 0$. The vanishing of $h^0(\tilde{\mathcal{E}}(\xi))$ is proved in the very same way by noting $\tilde{\mathcal{E}}(\xi)|_{T_w} \equiv \mathcal{E}(\xi)|_{T_w}$ since $p_2^*\mathcal{O}_{\mathbb{P}^1}(1)|_{T_w} = \mathbb{C}$. The vanishing of the $h^2$’s follows from Serre duality and a similar argument for the bundle $\mathcal{E}(\xi) \otimes K_{\mathbb{P}^1}$. More precisely, Serre duality implies that:

$$H^2(T \times \mathbb{P}^1, \mathcal{E}(\xi)) = H^0(T \times \mathbb{P}^1, \mathcal{E}(\xi)^\vee \otimes K_{T\times\mathbb{P}^1})^* = H^0(T \times \mathbb{P}^1, \mathcal{E}(\xi)^\vee \otimes p_2^*\mathcal{O}_{\mathbb{P}^1}(-2))^*.$$

$^1$ Recall that every semistable, rank 2 vector bundle over an elliptic curve with trivial determinant either splits as a sum of flat line bundles or is the unique nontrivial extension of a flat line bundle of order 2 by itself. Such a bundle is regular if it is not the sum of trivial line bundles.

Nahm Transform and Spectral Curves for Doubly-Periodic Instantons


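On the other hand, it is easy to see that:
\[ E(-\xi)|_{T_w} \equiv \left( E(\xi)^\vee \otimes p_2^* \mathcal{O}_{\mathbb{P}^1}(-2) \right)|_{T_w}, \]
so that we can apply the same argument as above to show that $h^0(T \times \mathbb{P}^1, E(\xi)^\vee \otimes K_{T \times \mathbb{P}^1}) = 0$.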

Proof of Proposition 5. Let $\{w_i\} \subset \mathbb{P}^1$ be such that $H^0(T_{w_i}, E|_{T_{w_i}})$ does not vanish. As we argued above, there are only finitely many such points; in fact, it can be shown that there are at most $k$ such points (see Lemma 2 of [14]). Suppose that $\#\{w_i\} = p \le k$; note also that $\infty \notin \{w_i\}$ if $\xi_0$ is nontrivial. Denote by $B$ the divisor in $T \times \mathbb{P}^1$ consisting of the elliptic curves lying over these points, i.e. $B = \sum_i T \times \{w_i\}$. Also, denote $E(p) = E \otimes \mathcal{O}_{T \times \mathbb{P}^1}(B)$. Consider the exact sequence of sheaves:
\[ 0 \to \mathcal{O}(E) \to \mathcal{O}(E(p)) \to \mathcal{O}(E(p)|_B) \to 0 \]
which induces the following sequence of cohomology:
\[ 0 \to H^0(B, E(p)|_B) \to \underbrace{H^1(T \times \mathbb{P}^1, E)}_{\dim = k} \to \underbrace{H^1(T \times \mathbb{P}^1, E(p))}_{\dim = k} \to H^1(B, E(p)|_B) \to 0 \qquad (23) \]
and note that $p \le h^0(B, E(p)|_B) = h^1(B, E(p)|_B) \le 2k$. It follows from (23) that $h^0(B, E(p)|_B) = h^1(B, E(p)|_B) = k$, so that the map $H^0(B, E(p)|_B) \to H^1(T \times \mathbb{P}^1, E)$ is an isomorphism.

This means that each element in $H^1(T \times \mathbb{P}^1, E)$ can be represented by a $(0,1)$-form $\theta$ supported on tubular neighbourhoods of the fibres $T \times \{w_i\}$. Pulling $\theta$ back to $T \times \mathbb{C}$, we obtain a compactly supported $(0,1)$-form, which we also denote by $\theta$, since $\xi_0$ is nontrivial.

We want to fashion a solution $\psi$ of $D_A^* \psi = 0$ out of $\theta$, and within the same cohomology class. In other words, we want to find a section $s \in L^2(\Lambda^0 E)$ such that $D_A^*(\theta + \bar\partial_A s) = 0$. Since $D_A^* = \bar\partial_A^* - \bar\partial_A$, this is the same as solving the equation:
\[ \bar\partial_A^* \bar\partial_A s = \Delta_A s = -\bar\partial_A^* \theta \]
for a compactly supported $\theta$. In the Fredholm theory for the Dirac operator developed above, we constructed the Green's operator $G_A$ of the Dirac laplacian $\Delta_A$. Thus, we can write $s = -G_A \bar\partial_A^* \theta$ and
\[ \psi = \theta - \bar\partial_A G_A \bar\partial_A^* \theta = P\theta, \]
where $P$ denotes the $L^2$ projection $L^2(E \otimes S^-) \to \ker D_A^*$.

We must verify that $\psi \in L^2(E \otimes S^-)$; it is enough to show that $\bar\partial_A G_A \bar\partial_A^* \theta$ is square-integrable for any compactly supported $(0,1)$-form $\theta$. First note that $\gamma = \bar\partial_A^* \theta$ also has compact support, thus $s = G_A \gamma \in L^2(\Lambda^0 E)$. Therefore, we have:
\[ \| \bar\partial_A s \|^2_{L^2} = \langle \bar\partial_A s, \bar\partial_A s \rangle = \langle \bar\partial_A s, (\bar\partial_A G_A)\gamma \rangle \]
\[ \phantom{\| \bar\partial_A s \|^2_{L^2}} = \langle (\bar\partial_A G_A)^* \bar\partial_A s, \gamma \rangle \]
which is finite, since $\gamma$ is compactly supported. Note that the integration by parts made from the first to the second line is justified by the same fact. Therefore, $\psi$ is indeed a square-integrable solution of $D_A^* \psi = 0$.


M. Jardim

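Finally, to see that the map defined above is injective (hence an isomorphism), let $\theta'$ be another $(0,1)$-form supported around $B$ and within the same cohomology class as $\theta$, so that $\theta' - \theta = \bar\partial_A \alpha$. Thus:
\[ \psi' - \psi = (\theta' - \bar\partial_A G_A \bar\partial_A^* \theta') - (\theta - \bar\partial_A G_A \bar\partial_A^* \theta) \]
\[ \phantom{\psi' - \psi} = (\theta' - \theta) - \bar\partial_A G_A \bar\partial_A^* (\theta' - \theta) = \bar\partial_A \alpha - \bar\partial_A G_A \bar\partial_A^* \bar\partial_A \alpha \qquad (24) \]
\[ \phantom{\psi' - \psi} = \bar\partial_A \alpha - \bar\partial_A \alpha = 0. \]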

This completes the proof.
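
The reduction from the Dirac equation to a Laplace-type equation used in the proof can be spelled out as follows. This is only a sketch, under two assumptions that the argument appears to use implicitly: that $\theta$ is $\bar\partial_A$-closed (it represents a Dolbeault class), and that $\bar\partial_A \bar\partial_A = 0$ (the instanton connection defines a holomorphic structure). The `align*` environment assumes the amsmath package.

```latex
\begin{align*}
D_A^*(\theta + \bar\partial_A s)
  &= (\bar\partial_A^* - \bar\partial_A)(\theta + \bar\partial_A s) \\
  &= \bar\partial_A^*\theta + \bar\partial_A^*\bar\partial_A s
     - \bar\partial_A\theta - \bar\partial_A\bar\partial_A s \\
  &= \bar\partial_A^*\theta + \Delta_A s
  % the last two terms vanish by the two assumptions stated above
\end{align*}
% Hence D_A^*(\theta + \bar\partial_A s) = 0 is equivalent to
% \Delta_A s = -\bar\partial_A^*\theta, solved by s = -G_A \bar\partial_A^*\theta:
\[
\psi \;=\; \theta + \bar\partial_A s
     \;=\; \theta - \bar\partial_A G_A \bar\partial_A^*\theta .
\]
```

Read this way, $\psi$ differs from $\theta$ by the $\bar\partial_A$-exact term $\bar\partial_A s$, which is why it stays in the same cohomology class.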

4. Nahm Transform of Doubly-Periodic Instantons

Recall that our starting point is a rank two vector bundle $E \to T \times \mathbb{C}$ provided with an instanton connection $A \in \mathcal{A}_{(k,\xi_0)}$, where the instanton number $k$ and the asymptotic state $\xi_0$ are from now on fixed.

Over the punctured Jacobian torus $\hat{T} \setminus \{\pm\xi_0\}$, consider the trivial Hilbert bundle $\hat{H} \to \hat{T} \setminus \{\pm\xi_0\}$ whose fibres are $\hat{H}_\xi = L^2_1(E(\xi) \otimes S^-)$. Taking the $L^2_1$-norm on the fibres, $\hat{H}$ becomes a hermitian bundle. Moreover, call $\hat{d}$ the trivial covariant derivative on $\hat{H}$; this derivative is clearly unitary, hence one can define a holomorphic structure over $\hat{H}$.

Now consider the finite-dimensional sub-bundle $V \hookrightarrow \hat{H}$ over $\hat{T} \setminus \{\pm\xi_0\}$ whose fibres are given by $V_\xi = \ker D^*_{A_\xi}$. Remark that this is actually the index bundle for the family of Dirac operators $D_{A_\xi}$. Let $i : V \to \hat{H}$ be the natural inclusion and $P : \hat{H} \to V$ the fibrewise orthogonal $L^2$ projection; more precisely, $P_\xi = I - D_{A_\xi} G_{A_\xi} D^*_{A_\xi}$ for each $\xi \in \hat{T} \setminus \{\pm\xi_0\}$, where $G_{A_\xi}$ denotes the Green's operator for (22) and $I$ is the identity operator. We can define a connection on $V$ via the projection formula:
\[ \nabla_B = P \circ \hat{d} \circ i, \qquad (25) \]
where $B$ is the associated connection form. Clearly, $V$ inherits the hermitian metric from $\hat{H}$, and $B$ is also unitary with respect to this induced metric. Hence, we can provide $V$ with the holomorphic structure coming from the unitary connection $B$.

Alternatively, $V$ also admits an interpretation in terms of monads, see [6]. The Dirac operator can be unfolded into a family of elliptic complexes parametrised by $\hat{T} \setminus \{\pm\xi_0\}$, namely:
\[ 0 \to L^2_2(\Lambda^0 E(\xi)) \xrightarrow{\ \bar\partial_{A_\xi}\ } L^2_1(\Lambda^{0,1} E(\xi)) \xrightarrow{\ -\bar\partial_{A_\xi}\ } L^2(\Lambda^{0,2} E(\xi)) \to 0 \qquad (26) \]
which, of course, are also Fredholm. Moreover, the cohomologies of order 0 and 2 must vanish, by Proposition 5. As in [6], such a holomorphic family defines a holomorphic vector bundle $V \to (\hat{T} \setminus \{\pm\xi_0\})$, with fibres $V_\xi = \ker\{D^*_{A_\xi}\}$, plus a unitary connection, induced by orthogonal projection, which is compatible with the given holomorphic structure. Such a connection will be denoted by $B$. We will invoke this construction repeatedly throughout this work.
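
As a consistency check on the projection formula, one can verify formally that $P_\xi = I - D_{A_\xi} G_{A_\xi} D^*_{A_\xi}$ is a projection with image in $\ker D^*_{A_\xi}$. The sketch below drops the subscripts and assumes that $G$ inverts $D^* D$ and is self-adjoint; these are natural properties for a Green's operator of the Dirac laplacian, but they are assumptions here, since equation (22) is not reproduced in this section.

```latex
% Assumption: G = (D^* D)^{-1}, so that D^* D \, G = G \, D^* D = I, and G^* = G.
\begin{align*}
P^2 &= (I - D G D^*)(I - D G D^*) \\
    &= I - 2\, D G D^* + D G (D^* D) G D^* \\
    &= I - 2\, D G D^* + D G D^* \;=\; P, \\[4pt]
D^* P &= D^* - (D^* D)\, G\, D^* \;=\; D^* - D^* \;=\; 0, \\[4pt]
P^* &= I - (D G D^*)^* \;=\; I - D\, G^*\, D^* \;=\; P .
\end{align*}
```

Under these assumptions $P$ is idempotent and self-adjoint with image in $\ker D^*$, and it fixes $\ker D^*$ pointwise because $D G D^*$ vanishes there. So $P$ is the orthogonal projection onto $\ker D^*$, as the definition of $\nabla_B$ in (25) requires.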


The curvature $F_B$ of $B$ is simply given by:
\[ F_B = \nabla_B \nabla_B = P \hat{d} (P \hat{d}). \]
Explicit formulas for the matrix elements on an arbitrary local trivialisation of $V \to (\hat{T} \setminus \{\pm\xi_0\})$ will be useful later on. For instance, pick an orthonormal frame $\{\psi_i\}_{i=1}^{k}$ over an open set $U \subset \hat{T} \setminus \{\pm\xi_0\}$. Then, we have that:
\[ (B)_{ij} = \langle \psi_j, \nabla_B \psi_i \rangle = \langle \psi_j, \hat{d}\psi_i \rangle, \]
\[ (F_B)_{ij} = \langle \psi_j, F_B \psi_i \rangle = \langle \psi_j, P \hat{d}(P \hat{d}\psi_i) \rangle = \langle \psi_j, \hat{d}(P \hat{d}\psi_i) \rangle. \qquad (27) \]

Higgs field. We now define the Higgs field $E \in \mathrm{End}(V) \otimes K_{\hat{T}}$. Let $w$ be the complex coordinate of the plane, and let $\psi$ be a section of $V$, i.e. for each $\xi \in \hat{T} \setminus \{\pm\xi_0\}$, $\psi[\xi] \in \ker D^*_{A_\xi}$. For a fixed $\xi$, the Higgs field will act on $\psi[\xi]$ by multiplying this section by the plane coordinate $w$ and then projecting it back to $\ker D^*_{A_\xi}$:
\[ (E(\psi))[\xi] = 2\sqrt{2}\pi \, P_\xi (w \psi[\xi]) \, d\xi. \qquad (28) \]
Its conjugate is clearly given by $(E^*(\psi))[\xi] = 2\sqrt{2}\pi \, P_\xi(\bar{w} \psi[\xi]) \, d\bar\xi$.

There is a subtle analytical point here. The spinors $\psi$ belong to $L^2(E(\xi) \otimes S^-)$, but it is not necessarily the case that $w\psi$ also belongs to $L^2(E(\xi) \otimes S^-)$. However, we have the following lemma:

Lemma 6. If $\psi \in \ker D_A^*$ and $A$ has nontrivial asymptotic state, then $w\psi \in L^2(E \otimes S^-)$.

Proof. The key result here is Proposition 4, and the observation that follows it, in particular the invertibility of the operator (19).

Let $K \subset T \times \mathbb{C}$ be a compact subset such that $D_A^*$ is sufficiently close to the flat Dirac operator $D^*_{\pm\xi_0}$ outside $K$. Thus, restricted to the complement of $K$, $D_A^*$ is invertible acting from $\tilde{L}^2 \to L^2$.

Now if $\psi \in \ker D_A^*$, then $D_A^*(w\psi) = dw \cdot \psi \in L^2(E \otimes S^+|_{T \times \mathbb{C} \setminus K})$ and the proposition follows.

Note that the dependence of $(B, E)$ on the original instanton $A$ is contained in the $L^2$ projection operator $P$, i.e. in the $k$ solutions of $D^*_{A_\xi} \psi = 0$. It is easy to see that the finite dimensional space spanned by these $\psi$ is gauge invariant; moreover, the multiplication by $w$ also commutes with gauge transformations $\hat{g} \in \mathrm{Aut}(V)$. Therefore, we have that:

Proposition 6. If $A$ and $A'$ are gauge equivalent irreducible instantons, then the corresponding pairs $(B, E)$ and $(B', E')$ are also gauge equivalent.

A pair $(B, E)$ is called a Higgs pair on the bundle $V \to \hat{T} \setminus \{\pm\xi_0\}$ if it satisfies Hitchin's self-duality equations:
\[ \text{(i)}\ \ F_B + [E, E^*] = 0, \qquad \text{(ii)}\ \ \bar\partial_B E = 0. \qquad (29) \]


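Recall from [14] that the unitary connection of the Poincaré line bundle $P \to T \times \hat{T}$ and its corresponding curvature are given by:
\[ \omega(z, \xi) = i\pi \cdot \sum_{\mu=1}^{2} \left( \xi_\mu \, dz_\mu - z_\mu \, d\xi_\mu \right) \quad \text{and} \quad \Omega(z, \xi) = 2i\pi \cdot \sum_{\mu=1}^{2} d\xi_\mu \wedge dz_\mu. \]
From Braam & Baal [4], we know that if $s$ is a section of $E(\xi) \otimes S^-$, then:
\[ D^*_{A_\xi}(\hat{d}s) = [D^*_{A_\xi}, \hat{d}]s = -\Omega \cdot s, \]
where "$\cdot$" means Clifford multiplication. The local formula for the curvature (27) may now be cast in a more convenient form:
\[ (F_B)_{ij} = \langle \psi_j, \hat{d}(P \hat{d}\psi_i) \rangle = \langle \psi_j, \hat{d}(D_{A_\xi} G_{A_\xi} D^*_{A_\xi} \hat{d}\psi_i) \rangle \]
\[ \phantom{(F_B)_{ij}} = \langle -D^*_{A_\xi} \hat{d}\psi_j, G_{A_\xi}(D^*_{A_\xi} \hat{d}\psi_i) \rangle = \langle \Omega \cdot \psi_j, G_{A_\xi}(\Omega \cdot \psi_i) \rangle. \]
Since the Clifford multiplication commutes with the Green's operator, we end up with:
\[ (F_B)_{ij} = -\langle (\Omega \wedge \Omega) \cdot \psi_j, G_{A_\xi} \psi_i \rangle = 8\pi^2 \langle (dz_1 \wedge dz_2) \cdot \psi_j, G_{A_\xi} \psi_i \rangle \, d\xi_1 \wedge d\xi_2 \]
\[ \phantom{(F_B)_{ij}} = -4\pi^2 i \, \langle (dz_1 \wedge dz_2) \cdot \psi_j, G_{A_\xi} \psi_i \rangle \, d\xi \wedge d\bar\xi. \qquad (30) \]
Note moreover that the inner product is taken in $L^2(E(\xi) \otimes S^-)$, integrating out the $(z, w)$ coordinates.

Theorem 3. If $A \in \mathcal{A}_{(k,\xi_0)}$, then the associated pair $(B, E)$ on the dual bundle $V \to \hat{T} \setminus \{\pm\xi_0\}$ constructed above satisfies Hitchin's equations (29).

Proof. Choose an open set $U \subset \hat{T} \setminus \{\pm\xi_0\}$ and pick a local orthonormal trivialisation of $V \to \hat{T} \setminus \{\pm\xi_0\}$ over $U$, such that the corresponding local frame $\{\psi_i\}_{i=1}^{k}$ is parallel at $\xi$. Recall that $\psi_i(\xi) \in \ker D^*_{A_\xi}$.

First, we shall look at the second equation of (29), and recall that $\hat{T} \setminus \{\pm\xi_0\}$ was given the flat Euclidean metric induced from the quotient. Once a local trivialisation is chosen, the endomorphism $E$ can then be put in matrix form, with matrix elements given by:
\[ a_{ij}(\xi) = \langle \psi_j(\xi), E[\psi_i](\xi) \rangle, \]
where $\langle\ ,\ \rangle$ is the inner product on $L^2(E(\xi) \otimes S^-)$, integrating out the $(z, w)$ coordinates. Clearly, $E$ is a holomorphic endomorphism if its matrix elements in a holomorphic trivialisation are holomorphic functions. However:
\[ E[\psi_i](\xi) = 2\sqrt{2}\pi \, P_\xi(w\psi_i(\xi)) \, d\xi = 2\sqrt{2}\pi \, (I - D_{A_\xi} G_{A_\xi} D^*_{A_\xi})(w\psi_i(\xi)) \, d\xi \]
so that:
\[ a_{ij}(\xi) = 2\sqrt{2}\pi \left[ \langle \psi_j(\xi), w\psi_i(\xi) \rangle - \langle \psi_j(\xi), D_{A_\xi} G_{A_\xi} D^*_{A_\xi}(w\psi_i(\xi)) \rangle \right] \]
\[ \phantom{a_{ij}(\xi)} = 2\sqrt{2}\pi \left[ \langle \psi_j(\xi), w\psi_i(\xi) \rangle - \langle D^*_{A_\xi}\psi_j(\xi), G_{A_\xi} D^*_{A_\xi}(w\psi_i(\xi)) \rangle \right] \]
\[ \phantom{a_{ij}(\xi)} = 2\sqrt{2}\pi \, \langle \psi_j(\xi), w\psi_i(\xi) \rangle. \]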


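Therefore:
\[ \frac{\partial a_{ij}}{\partial \bar\xi}(\xi) = 2\sqrt{2}\pi \left[ \langle \partial_B \psi_j, w\psi_i \rangle + \langle \psi_j, \bar\partial_B (w\psi_i) \rangle \right] = 2\sqrt{2}\pi \left\langle \psi_j, \frac{\partial w}{\partial \bar\xi}\psi_i + w \, \bar\partial_B \psi_i \right\rangle = 0 \]
as $\psi_i$ is parallel at $\xi$. Since this can be done for all $\xi \in \hat{T} \setminus \{\pm\xi_0\}$, the second equation is satisfied.

Now, we move back to (29(i)). Let us first compute the matrix elements $([E, E^*])_{ij}$. Note that:
\[ \text{(i)}\ \ [D^*_{A_\xi}, w]\psi_i(\xi) = D^*_{A_\xi}(w\psi_i(\xi)) = -dw \cdot \psi_i(\xi), \qquad \text{(ii)}\ \ [D^*_{A_\xi}, \bar{w}]\psi_i(\xi) = D^*_{A_\xi}(\bar{w}\psi_i(\xi)) = 0, \qquad (31) \]
where we used the fact that $D^*_{A_\xi} = \bar\partial^*_{A_\xi} - \bar\partial_{A_\xi}$.

Recall that for 1-forms $[E, E^*] = EE^* + E^*E$. We compute each term separately:
\[ E^*E(\psi_i) = 8\pi^2 P[\bar{w} P(w\psi_i)] \, d\bar\xi \wedge d\xi = 8\pi^2 \left\{ \bar{w} P(w\psi_i) - D_{A_\xi} G_{A_\xi} D^*_{A_\xi} \bar{w} P(w\psi_i) \right\} d\bar\xi \wedge d\xi \]
\[ \phantom{E^*E(\psi_i)} = 8\pi^2 \left\{ \bar{w} w \psi_i - \bar{w} D_{A_\xi} G_{A_\xi} D^*_{A_\xi}(w\psi_i) - D_{A_\xi} G_{A_\xi} D^*_{A_\xi} \bar{w} P(w\psi_i) \right\} d\bar\xi \wedge d\xi, \]
\[ EE^*(\psi_i) = 8\pi^2 P[w P(\bar{w}\psi_i)] \, d\xi \wedge d\bar\xi = 8\pi^2 \left\{ w \bar{w} \psi_i - w D_{A_\xi} G_{A_\xi} D^*_{A_\xi}(\bar{w}\psi_i) - D_{A_\xi} G_{A_\xi} D^*_{A_\xi} w P(\bar{w}\psi_i) \right\} d\xi \wedge d\bar\xi. \]
The two first terms of $EE^*$ and $E^*E$ cancel each other and the third terms will cancel out when we take the inner product with $\psi_j$. Moreover, the second term of $EE^*$ is zero by (31(ii)). So we are left with:
\[ ([E, E^*])_{ij} = \langle \psi_j, [E, E^*]\psi_i \rangle = 8\pi^2 \langle \psi_j, \bar{w} D_{A_\xi} G_\xi D^*_{A_\xi}(w\psi_i) \rangle \, d\xi \wedge d\bar\xi \]
\[ \phantom{([E, E^*])_{ij}} = 8\pi^2 \langle D^*_{A_\xi}(w\psi_j), G_\xi D^*_{A_\xi}(w\psi_i) \rangle \, d\xi \wedge d\bar\xi \]
\[ \phantom{([E, E^*])_{ij}} = -8\pi^2 \langle (dw \wedge d\bar{w}) \cdot \psi_j, G_\xi \psi_i \rangle \, d\xi \wedge d\bar\xi = -4\pi^2 i \, \langle (dw_1 \wedge dw_2) \cdot \psi_j, G_\xi \psi_i \rangle \, d\xi \wedge d\bar\xi, \]
where we have once more used the fact that the Clifford multiplication commutes with the Green's operator. Summing the final expression above with (30), one gets:
\[ (F_B)_{ij} + ([E, E^*])_{ij} = -4\pi^2 i \, \langle (dz_1 \wedge dz_2 + dw_1 \wedge dw_2) \cdot \psi_j, G_\xi \psi_i \rangle \, d\xi \wedge d\bar\xi = 0, \]
for the first term of the inner product is zero since it consists of a self-dual form (the Kähler form) acting on a negative spinor.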


Clearly, the above result has two weak points: it tells nothing about the behaviour of the Higgs field around the singular points $\pm\xi_0$; and it fails to show that the Higgs pairs so obtained are admissible in the sense of [14]. In fact, establishing the first point requires the use of algebraic-geometric methods, and will be taken up in Sect. 5 below. The second point will be clarified in Sect. 6.

5. Holomorphic Version

The vanishing results of Sect. 3.4 put us in a position to define the transformed bundle $\mathcal{V} \to \hat{T}$. Indeed, consider the following elliptic complex:
\[ 0 \to L^2_2(\Lambda^0 E(\xi)) \xrightarrow{\ \bar\partial_{A_\xi}\ } L^2_1(\Lambda^{0,1} E(\xi)) \xrightarrow{\ -\bar\partial_{A_\xi}\ } L^2(\Lambda^{0,2} E(\xi)) \to 0. \qquad (32) \]
According to Proposition 5, $H^1(T \times \mathbb{P}^1, E(\xi))$ is the only nontrivial cohomology of this complex. It then follows that the family of vector spaces given by $\mathcal{V}_\xi = H^1(T \times \mathbb{P}^1, E(\xi))$ forms a holomorphic vector bundle of rank $k$ over $\hat{T}$; denote this holomorphic structure by $\bar\partial_{\mathcal{V}}$. Note that $\mathcal{V}_\xi$ is defined even if $\xi = \pm\xi_0$. Furthermore, by Proposition 5, $\mathcal{V}|_{\hat{T} \setminus \{\pm\xi_0\}}$ coincides holomorphically with the dual bundle $V$ defined in the previous section, i.e.:
\[ (\mathcal{V}, \bar\partial_{\mathcal{V}})|_{\hat{T} \setminus \{\pm\xi_0\}} \simeq (V, \bar\partial_B). \]
Moreover, $\mathcal{V}$ comes equipped with a hermitian metric $h'$, which we want to compare with $h$, the hermitian metric on $V$ induced from the monad (26). The key point is a fact we noted before in Lemma 3: given a 1-form $a$ on $T \times \mathbb{P}^1$, its $L^2$-norm with respect to the round metric is always larger than its $L^2$-norm with respect to the flat metric on $T \times (\mathbb{P}^1 \setminus \{\infty\})$:
\[ \| a \|_{L^2_R} > \| a \|_{L^2_F}. \]
Thus, comparing the monads (26) and (32), one sees that $h$ is bounded above by $h'$. In particular, the metric $h$ is bounded at $\pm\xi_0$.

We can regard $\mathcal{V}$ as an index bundle for the family of Dirac operators over $T \times \mathbb{P}^1$ parametrised by $\xi \in \hat{T}$. Hence, its degree can be computed by the Atiyah-Singer index theorem for families. Consider now the bundle $G = p_{12}^* E \otimes p_{13}^* P$ over $T \times \mathbb{P}^1 \times \hat{T}$, and note that $G|_{T \times \mathbb{P}^1 \times \{\xi\}} = E(\xi)$. Then we have:
\[ \mathrm{ch}(\mathcal{V}) = -\mathrm{ch}(G) \cdot \mathrm{td}(T \times \mathbb{P}^1) / [T \times \mathbb{P}^1] \]
\[ \phantom{\mathrm{ch}(\mathcal{V})} = -\left( 2 + 2c_1(P) + c_1(P)^2 - c_2(E) \right) \left( 1 + \tfrac{1}{2} c_1(\mathbb{P}^1) \right) / [T \times \mathbb{P}^1] \]
\[ \phantom{\mathrm{ch}(\mathcal{V})} = k - \tfrac{1}{2} c_1(P)^2 c_1(\mathbb{P}^1) / [T \times \mathbb{P}^1] = k - 2\hat{t}, \]
where the "$-$" sign in the first line is needed since $\mathcal{V}$ is formed by the null spaces of the adjoint Dirac operator. Summing up:

Lemma 7. The dual bundle $(V, \bar\partial_B) \to \hat{T} \setminus \{\pm\xi_0\}$ admits a holomorphic extension $\mathcal{V} \to \hat{T}$ of degree $-2$. Moreover, its hermitian metric $h$ is bounded above at the punctures $\pm\xi_0$.


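The determinant line bundle of $\mathcal{V}$ is not fixed, however. In fact, let $t_x : T \times \mathbb{P}^1 \to T \times \mathbb{P}^1$ be the translation of the torus by $x \in T$, acting trivially on $\mathbb{P}^1$, and let $E' = t_x^* E$. If $\mathcal{V}'$ is the dual bundle associated with $E'$ then $\mathcal{V}' = \mathcal{V} \otimes L_x$. Indeed:
\[ \mathcal{V}'_\xi = H^1(T \times \mathbb{P}^1, E'(\xi)) = H^1\!\left( T \times \mathbb{P}^1, p_{12}^*(t_x^* E) \otimes p_{13}^* P|_{T \times \mathbb{P}^1 \times \{\xi\}} \right) \]
\[ \phantom{\mathcal{V}'_\xi} = H^1\!\left( T \times \mathbb{P}^1, t_x^*(p_{12}^* E \otimes p_{13}^* P) \otimes p_3^* L_x|_{T \times \mathbb{P}^1 \times \{\xi\}} \right) \]
\[ \phantom{\mathcal{V}'_\xi} = H^1\!\left( T \times \mathbb{P}^1, p_{12}^* E \otimes p_{13}^* P|_{T \times \mathbb{P}^1 \times \{\xi\}} \right) \otimes (L_x)_\xi \ \Rightarrow\ \mathcal{V}'_\xi = \mathcal{V}_\xi \otimes (L_x)_\xi \]
as a canonical isomorphism for each $\xi \in \hat{T}$. Thus $\mathcal{V}' = \mathcal{V} \otimes L_x$. Note also that if $B$ is an admissible connection, $\mathcal{V}$ admits no splitting $\mathcal{V} = \mathcal{V}_0 \oplus L$ compatible with $B$ for any flat line bundle $L$.

Defining the Higgs field. The next step is to give a holomorphic description of the Higgs field $E$. Recall that $h^0(T \times \mathbb{P}^1, p_2^* \mathcal{O}_{\mathbb{P}^1}(1)) = 2$, and regarding $\mathbb{P}^1 = \mathbb{C} \cup \{\infty\}$, we can fix two holomorphic sections $s_0, s_\infty \in H^0(\mathbb{P}^1, \mathcal{O}_{\mathbb{P}^1}(1))$ such that $s_0$ vanishes at $0 \in \mathbb{C}$ and $s_\infty$ vanishes at the point added at infinity. In homogeneous coordinates $\{(w_1, w_2) \in \mathbb{C}^2 \mid w_2 \neq 0\}$ and $\{(w_1, w_2) \in \mathbb{C}^2 \mid w_1 \neq 0\}$, we have that, respectively ($w = w_1/w_2$):
\[ s_0(w) = w, \quad s_\infty(w) = 1 \qquad \text{and} \qquad s_0(w) = 1, \quad s_\infty(w) = \frac{1}{w}. \]
Let us first consider an alternative definition of the transformed Higgs field. For each $\xi \in \hat{T}$, we define the map:
\[ H_\xi : H^1(T \times \mathbb{P}^1, E(\xi)) \times H^1(T \times \mathbb{P}^1, E(\xi)) \longrightarrow H^1(T \times \mathbb{P}^1, \tilde{E}(\xi)), \qquad (\alpha, \beta) \mapsto \alpha \otimes s_0 - \beta \otimes s_\infty. \qquad (33) \]
If $(\alpha, \beta) \in \ker H_\xi$, we define an endomorphism $\varphi$ of $H^1(T \times \mathbb{P}^1, E(\xi))$ at the point $\xi \in \hat{T}$ as follows:
\[ \varphi_\xi(\alpha) = \beta. \qquad (34) \]
We check that $\varphi$ actually coincides with the Higgs field $E$ we defined in the previous section, up to a multiplicative constant. Note that:
\[ \alpha \otimes s_0 - \beta \otimes s_\infty = 0 \ \Leftrightarrow\ \beta = \alpha (\otimes s_0)(\otimes s_\infty)^{-1}. \]
Moreover, recall that, for any trivialisation of $\mathcal{O}_{\mathbb{P}^1}(1)$ with local coordinate $w$ on $\mathbb{P}^1$, the quotient $\frac{s_0(w)}{s_\infty(w)} = w$. The claim now follows from the proof of Proposition 5; we denote $E[\xi] = 2\sqrt{2}\pi \cdot \varphi_\xi$.

Proposition 7. The eigenvalues of the Higgs field $E$ have at most simple poles at $\pm\xi_0$. Moreover, the residues of $E$ are semi-simple and have rank $\le 2$ if $\xi_0$ is an element of order 2 in the Jacobian of $T$, and rank $\le 1$ otherwise.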
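
To make the holomorphic definition of the Higgs field more concrete, the relation between $\ker H_\xi$ and eigenvectors of $\varphi_\xi$ can be written out in the chart $w_2 \neq 0$, where $s_0(w) = w$ and $s_\infty(w) = 1$. This is an informal sketch at the level of representatives of the cohomology classes; the symbol $\epsilon$ for an eigenvalue is introduced here as notation and is an assumption.

```latex
% Chart w_2 \neq 0:  s_0(w) = w,  s_\infty(w) = 1.
\begin{align*}
(\alpha,\beta) \in \ker H_\xi
  &\iff \alpha \otimes s_0 - \beta \otimes s_\infty = 0
   \iff \beta = w\,\alpha
   \quad\text{(on representatives)}, \\[4pt]
\varphi_\xi(\alpha) = \epsilon\,\alpha
  &\iff \alpha \otimes s_0 - \epsilon\,\alpha \otimes s_\infty = 0
   \iff \alpha \otimes (s_0 - \epsilon\, s_\infty) = 0 .
\end{align*}
% In this chart s_0 - \epsilon s_\infty = w - \epsilon,
% a section of O(1) whose only zero is at w = \epsilon.
```

So an eigenvector with eigenvalue $\epsilon$ is a class annihilated by tensoring with a section of $\mathcal{O}_{\mathbb{P}^1}(1)$ that vanishes exactly over the fibre $T_\epsilon$. This is the mechanism by which the eigenvalues of the Higgs field detect special fibres of the instanton bundle.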


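Proof. Suppose $\alpha(\xi)$ is an eigenvector of $E_\xi$ with eigenvalue $\epsilon(\xi) = 1/\epsilon'(\xi)$, i.e. $E_\xi(\alpha(\xi)) = \epsilon(\xi) \cdot \alpha(\xi)$. Thus,
\[ \alpha(\xi) \otimes s_0 - \epsilon(\xi) \cdot \alpha(\xi) \otimes s_\infty = 0 \ \Rightarrow\ \alpha(\xi) \otimes \left( \epsilon'(\xi) \cdot s_0 - s_\infty \right) = 0. \]
Therefore, denoting $s_{\epsilon'(\xi)} = \epsilon'(\xi) \cdot s_0 - s_\infty$, we have that $\alpha(\xi) \in \ker(\otimes s_{\epsilon'(\xi)})$. On the other hand, consider the sheaf sequence:
\[ 0 \to E(\xi) \xrightarrow{\ \otimes s_{\epsilon'(\xi)}\ } \tilde{E}(\xi) \to \tilde{E}(\xi)|_{T_{\epsilon(\xi)}} \to 0, \]
since the section $s_{\epsilon'(\xi)}$ vanishes at $\epsilon(\xi)$. It induces the cohomology sequence:
\[ 0 \to H^0(T_{\epsilon(\xi)}, \tilde{E}(\xi)|_{T_{\epsilon(\xi)}}) \to H^1(T \times \mathbb{P}^1, E(\xi)) \xrightarrow{\ \otimes s_{\epsilon'(\xi)}\ } \dots, \qquad (35) \]
so that $\ker(\otimes s_{\epsilon'(\xi)}) = H^0(T_{\epsilon(\xi)}, \tilde{E}(\xi)|_{T_{\epsilon(\xi)}})$, which is nonzero if and only if $E|_{T_{\epsilon(\xi)}} = L_\xi \oplus L_{-\xi}$ or $F_2 \otimes L_\xi$.

Hence, as $\xi$ approaches $\pm\xi_0$, we must have that one of the eigenvalues of $E$, say $\epsilon(\xi)$, approaches $\infty$, since $E|_{T_\infty} = L_{\xi_0} \oplus L_{-\xi_0}$. Moreover, $s_{\epsilon'(\xi)} \to s_\infty$, so that:
\[ \lim_{\xi \to \pm\xi_0} \alpha(\xi) \in \ker(\otimes s_\infty) = H^0(T_\infty, E(\xi)|_{T_\infty}). \]
Therefore, we conclude that, if $\xi_0 \neq -\xi_0$, then one of the eigenvalues of $E$ has a simple pole at $\pm\xi_0$ since $h^0(T_\infty, E(\pm\xi_0)|_{T_\infty}) = 1$; similarly, if $\xi_0 = -\xi_0$, then two of the eigenvalues of $E$ have simple poles at $\xi_0$.

Note in particular that the images of the residues of $E$ at $\pm\xi_0$ are precisely given by:
\[ H^0(T_\infty, \tilde{E}(\pm\xi_0)|_{T_\infty}) \subset H^1(T \times \mathbb{P}^1, E(\pm\xi_0)). \]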

This proposition almost concludes the main task of this paper, namely to construct the inverse of the Nahm transform of [14]. It only remains to be shown that the Nahm transformed Higgs pair is admissible. We must then show how to match the $SU(2)$ bundle $\check{E} \to T \times \mathbb{C}$ with doubly-periodic instanton $\check{A}$ constructed from the transformed Higgs pair $(B, E)$ as in [14] with the original objects $A$ and $E \to T \times \mathbb{C}$ we started with in the present paper. These tasks are taken up in the following section.

6. Proof of Inversion

So far, we have established that the Nahm transform of a doubly-periodic instanton is the same kind of singular Higgs pair as those we started with in the first part of this series [14]. We must now show that the transform presented here is actually the inverse of the construction of instantons of [14]. More precisely, we show that if we start with a doubly-periodic instanton $A$, apply the Nahm transform to obtain a Higgs pair $(B, E)$, then the corresponding doubly-periodic instanton constructed as in [14] is gauge equivalent to the original object.

First, consider the six-dimensional manifold $T \times \mathbb{C} \times (\hat{T} \setminus \{\pm\xi_0\})$. To shorten notation, we denote $M_\xi = T \times \mathbb{C} \times \{\xi\}$ and $\hat{T}_{(z,w)} = \{z\} \times \{w\} \times (\hat{T} \setminus \{\pm\xi_0\})$.


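Now take the bundle $G = p_{12}^* E \otimes p_{13}^* P$ over $T \times \mathbb{C} \times (\hat{T} \setminus \{\pm\xi_0\})$; note that $G|_{M_\xi} \equiv E(\xi)$ and $G|_{\hat{T}_{(z,w)}} \equiv E_{(z,w)} \otimes L_z$, where $E_{(z,w)}$ denotes a trivial rank 2 bundle over $\hat{T} \setminus \{\pm\xi_0\}$ with the fibres canonically identified with the vector space $E_{(z,w)}$. $G$ is clearly holomorphic; we denote by $\bar\partial_M$ the action of the associated Dolbeault operator along the $T \times \mathbb{P}^1$ direction, and by $\bar\partial_{\hat{T}}$ its action along the $\hat{T}$ direction. In particular, $\bar\partial_M|_{M_\xi} \equiv \bar\partial_{A_\xi}$.

Let $C^{p,q} = \Lambda^{0,p}_{T \times \mathbb{C}}(G) \otimes \Lambda^{q}_{\hat{T}}(G)$; in other words, $C^{p,q}$ consists of the $(p+q)$-forms over $T \times \mathbb{C} \times (\hat{T} \setminus \{\pm\xi_0\})$ with values in $G$ spanned by forms of the shape:
\[ s(z, w, \xi) \, d\bar{z}^{i_1} d\bar{w}^{i_2} d\xi^{j_1} d\bar\xi^{j_2}, \quad i_1, i_2, j_1, j_2 \in \{0, 1\} \ \text{and}\ i_1 + i_2 = p,\ j_1 + j_2 = q. \qquad (36) \]
Analytically, we want to regard $C^{p,q}$ as the completion of the set of smooth forms of the shape above with respect to a Sobolev norm described as follows:
\[ s|_{T \times \mathbb{C} \times \{\xi\}} \in L^2_q(\Lambda^{2-q} E(\xi)) \ \text{for each}\ \xi \in \hat{T} \setminus \{\pm\xi_0\}, \qquad s|_{\{(z,w)\} \times \hat{T} \setminus \{\pm\xi_0\}} \in L^2_q(\Lambda^{2-q} L_z) \ \text{for each}\ (z, w) \in T \times \mathbb{C}. \]
Now, define the maps:
\[ C^{p,0} \xrightarrow{\ \delta_1\ } C^{p,1} \xrightarrow{\ \delta_2\ } C^{p,2}, \qquad \delta_1(s) = (\bar\partial_{\hat{T}} s, -w \cdot s \wedge d\xi), \qquad \delta_2(s_1, s_2) = (\bar\partial_{\hat{T}} s_2 + w \cdot s_1 \wedge d\xi) \qquad (37) \]
for $(s_1, s_2) \in \Lambda^{0,p}_{T \times \mathbb{P}^1}(G) \otimes \left( \Lambda^{0,1}_{\hat{T}}(G) \oplus \Lambda^{1,0}_{\hat{T}}(G) \right) \equiv C(p, 1)$. Note that (37) does define a complex.

The inversion result will follow from the analysis of the spectral sequences associated to the following double complex (for the general theory of spectral sequences and double complexes, we refer to [3]):
\[
\begin{array}{ccccc}
C^{0,2} & \xrightarrow{\ \bar\partial_M\ } & C^{1,2} & \xrightarrow{\ -\bar\partial_M\ } & C^{2,2} \\
\uparrow \delta_2 & & \uparrow -\delta_2 & & \uparrow \delta_2 \\
C^{0,1} & \xrightarrow{\ \bar\partial_M\ } & C^{1,1} & \xrightarrow{\ -\bar\partial_M\ } & C^{2,1} \\
\uparrow \delta_1 & & \uparrow -\delta_1 & & \uparrow \delta_1 \\
C^{0,0} & \xrightarrow{\ \bar\partial_M\ } & C^{1,0} & \xrightarrow{\ -\bar\partial_M\ } & C^{2,0}
\end{array}
\qquad (38)
\]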

The idea is to compute the total cohomology of the spectral sequence in the two possible different ways and compare the filtrations of the total cohomology.

Lemma 8. By first taking the cohomology of the rows, we obtain
\[
E_2^{p,q}: \qquad
\begin{array}{ccc}
0 & H^2(C(e, 0)) & 0 \\
0 & H^1(C(e, 0)) & 0 \\
0 & H^0(C(e, 0)) & 0
\end{array}
\qquad (39)
\]
(with $p$ increasing to the right and $q$ increasing upwards), where $H^i(C(e, 0))$ are the cohomology groups of the complex that yields the monad description of the construction of doubly-periodic instantons in [14] (see Proposition 3 there).

Proof. First, note that the rows coincide with the complex (26). Moreover, we can regard elements in $C^{p,q}$ as $q$-forms over $\hat{T}$ with values in $L^2_{2-p}(\Lambda^{0,p}_{T \times \mathbb{C}} G)$. To see this, fix some $\xi \in \hat{T}$; by (36), $s(z, w, \xi) \in \Lambda^{0,p} G|_{M_\xi}$. So, by varying $\xi$ we get the interpretation above.

This said, it is clear that the first and third columns of $E_1^{p,q}$ must vanish, since $A$ is irreducible. In the middle column, we get $q$-forms over $\hat{T}$ with values in $\ker(\bar\partial_M^* - \bar\partial_M)$, which for a fixed $\xi$ restricts to $\ker(D^*_{A_\xi})$.

Therefore, after taking the cohomologies of the rows, we are left with:
\[
E_1^{p,q}: \qquad
\begin{array}{ccc}
0 & L^2(\Lambda^{1,1} V) & 0 \\
 & \uparrow (\bar\partial_B + E) & \\
0 & L^2_1(\Lambda^{1,0} V \oplus \Lambda^{0,1} V) & 0 \\
 & \uparrow (E + \bar\partial_B) & \\
0 & L^2_2(\Lambda^0 V) & 0
\end{array}
\qquad (40)
\]

But this is just the complex that yields the monad description of the construction of doubly-periodic instantons in [14]. The lemma follows after taking the cohomology of the remaining column.

Total cohomology and admissibility. Note that, as we pointed out in the beginning of this section, we still do not know if the Higgs pair $(B, E)$ arising from the instanton $(E, A)$ is admissible or not, i.e. the hypercohomology spaces $\mathbb{H}^0$ and $\mathbb{H}^2$ might be nontrivial. The next lemma deals with this problem.

Lemma 9. The only nontrivial cohomology of the total complex is $H^2(C(p, q))$, which is naturally isomorphic to the fibre $E_{(e,0)}$. In particular, this shows that the Higgs pairs $(B, E)$ obtained via Nahm transform on an instanton connection $A \in \mathcal{A}_{(k,\xi_0)}$ are indeed admissible, see [14].

Proof. First note that we can regard an element in $C^{p,q}$ as a $(0, p)$-form over $T \times \mathbb{C}$ with values in $\Lambda^{q_1,q_2}_{\hat{T}}(G)$. Since $G|_{\hat{T}_{(z,w)}} \equiv E_{(z,w)} \otimes L_z$, $\ker \bar\partial_M$ and $\ker \bar\partial_M^*$ are nontrivial only if $z = e$, the identity element in the group law of $T$. Hence, it is enough to work on a tubular neighbourhood of $\{e\} \times \mathbb{P}^1 \times (\hat{T} \setminus \{\pm\xi_0\})$.

More precisely, we define another double complex $(\mathrm{germ}\, C)^{p,q}$, consisting of forms defined on arbitrary neighbourhoods of $\{e\} \times \mathbb{P}^1 \times (\hat{T} \setminus \{\pm\xi_0\})$. Then we have a restriction map $C^{p,q} \to (\mathrm{germ}\, C)^{p,q}$ commuting with $\bar\partial_M$, $\delta_1$ and $\delta_2$. Such a map also induces an isomorphism between the total cohomologies of $C^{p,q}$ and $(\mathrm{germ}\, C)^{p,q}$. So we can work with $(\mathrm{germ}\, C)^{p,q}$ to prove the lemma.


Let $V_e$ be some neighbourhood of $e \in T$. By the Poincaré lemma applied to $\bar\partial_T$, we get:
\[
(\mathrm{germ}\, C)_1^{p,q}: \qquad
\begin{array}{ccc}
\Lambda^2_{V_e}(G) & 0 & 0 \\
\uparrow & & \\
\Lambda^1_{V_e}(G) & 0 & 0 \\
\uparrow & & \\
\Lambda^0_{V_e}(G) & 0 & 0
\end{array}
\qquad (41)
\]
where $V_e$ denotes a tubular neighbourhood of $N_e = \{e\} \times \mathbb{P}^1 \times (\hat{T} \setminus \{\pm\xi_0\})$. As in [6] (see pp. 91–92), the complex in the first column is, after restriction, mapped into a Koszul complex over $N_e$:
\[ \mathcal{O}_{N_e}(G) \xrightarrow{\ (w\ \ \xi)\ } \mathcal{O}_{N_e}(G) \oplus \mathcal{O}_{N_e}(G) \xrightarrow{\ (-\xi,\ z)\ } \mathcal{O}_{N_e}(G) \]
so that:
\[
(\mathrm{germ}\, C)_2^{p,q}: \qquad
\begin{array}{ccc}
E_{(e,0)} & 0 & 0 \\
0 & 0 & 0 \\
0 & 0 & 0
\end{array}
\qquad (42)
\]
It then follows from Lemmas 8 and 9 that there is a natural isomorphism of vector spaces $\mathcal{I}_I : H^1(C(e, 0)) \equiv \check{E}_{(e,0)} \to E_{(e,0)}$, which in principle may depend on the choice of complex structure $I$ on $T \times \mathbb{C}$.

Matching $(\check{E}, \check{A})$ with the original data. Since the choice of identity element in $T$ and of origin in $\mathbb{C}$ is arbitrary, we can extend $\mathcal{I}_I$ to a bundle isomorphism $\check{E} \to E$. More precisely, let $t_{(u,v)} : T \times \mathbb{C} \to T \times \mathbb{C}$ be the translation map $(z, w) \to (z + u, w + v)$. Clearly, the connection $t^*_{(u,v)} A$ on the pullback bundle $t^*_{(u,v)} E$ is also irreducible and $t^*_{(u,v)} E_{(e,0)} \equiv E_{(u,v)}$. Computing the total cohomology of the double complex (38) associated to the bundle $t^*_{(u,v)} G$ (where $t^*_{(u,v)}$ acts trivially on the $\hat{T}$ coordinate), Lemmas 8 and 9 lead to an isomorphism of vector spaces $H^1(C(u, v)) \equiv \check{E}_{(u,v)} \to E_{(u,v)}$.

It is clear from the naturality of the constructions that these fibre isomorphisms fit together to define a holomorphic bundle isomorphism $\mathcal{I}_I : \check{E} \to E$. In particular, $\mathcal{I}_I$ takes the Dolbeault operator $\bar\partial_A$ of the holomorphic bundle $E \to T \times \mathbb{C}$ to the Dolbeault operator $\bar\partial_{\check{A}}$ of the holomorphic bundle $\check{E} \to T \times \mathbb{C}$. It also follows from this observation that the holomorphic extensions of $E$ and $\check{E}$ must be isomorphic as holomorphic vector bundles.

However, this fact still does not guarantee that the connections $A$ and $\check{A}$ are gauge-equivalent. This is accomplished if we can show that $\mathcal{I}_I$ is actually independent of the choice of complex structure on $T \times \mathbb{C}$. Therefore, the proof of the main Theorem 1 is completed by the following proposition:


Proposition 8. The bundle map $\mathcal{I}_I : \check{E} \to E$ is independent of the choice of complex structure on $T \times \mathbb{C}$.

Proof. Again, it is sufficient to consider only the fibre over $(e, 0)$. As in [6], the idea is to present an explicit description of $\mathcal{I}_I : \check{E}_{(e,0)} \to E_{(e,0)}$, and then show that it is Euclidean invariant.

Let $\alpha \in H^1(C(e, 0)) \subset C^{1,1}$. To find $\mathcal{I}_I([\alpha])$ we have to find $\beta \in C^{0,2}$ such that $\bar\partial_M \beta = \delta_2 \alpha$. A solution to this equation is provided by the Hodge theory for the $\bar\partial_M$ operator:
\[ \beta = G_M(\bar\partial_M^* \delta_2 \alpha), \]
where $G_M$ denotes the Green's operator for $\bar\partial_M^* \bar\partial_M$, which can be regarded fibrewise as the family of Green's operators $G_{A_\xi} = G_M|_{M_\xi}$ parametrised by $\xi \in (\hat{T} \setminus \{\pm\xi_0\})$.

In principle, $\beta$ depends on the complex structure $I$ via the operators $\bar\partial_M^*$ and $G_M$. However, by the Weitzenböck formula applied to the bundle $G$, we have:
\[ \bar\partial_M^* \bar\partial_M = \nabla_M^* \nabla_M. \]
Here, $\nabla_M$ is the covariant derivative in the $T \times \mathbb{C}$ direction on $G$. With this interpretation, $G_M = (\nabla_M^* \nabla_M)^{-1}$ is seen to be independent of the complex structure $I$; in fact, it is Euclidean invariant.

Now $\beta$ as an element of $C^{0,2}$ has the form $\beta(z, w; \xi) \, d\xi \, d\bar\xi$, so that the restriction $r_{(e,0)}(\beta) = \beta|_{\hat{T}_{(e,0)}}$ is a $(1, 1)$-form over $\hat{T} \setminus \{\pm\xi_0\}$ with values in $E_{(e,0)}$. Take its cohomology class in $H^2(\hat{T} \setminus \{\pm\xi_0\}, \mathbb{C} \otimes E_{(e,0)})$, so that:
\[ \mathcal{I}_I([\alpha]) = \int_{\hat{T}_{(e,0)}} r_{(e,0)}(\beta) \]
which is the desired explicit description.

Together with the work done in [14], we have thus proven Theorem 1.

7. Instantons of Higher Rank

One easily realizes that there is nothing really special about rank two bundles; the whole proof could be generalised to higher rank. Indeed, the only point in restricting to the rank two case is to reduce the number of possible vector bundles over an elliptic curve, and avoid a tedious case-by-case study throughout the various stages of the proof.

Before we can state the generalisation of the main Theorem 1, we must review our definitions of asymptotic state and irreducibility. The restriction of the instanton bundle $E \to T \times \mathbb{P}^1$ to the added divisor $T_\infty$ is a flat $SU(n)$ bundle, i.e.
\[ E|_{T_\infty} = L_{\xi_1} \oplus \dots \oplus L_{\xi_n} \quad \text{such that} \quad \bigotimes_{l=1}^{n} L_{\xi_l} = \mathcal{O}_T. \]
In other words, $E|_{T_\infty}$ is determined by a set of points $(\xi_1, \dots, \xi_j) \in J(T)$ with multiplicities $(m_1, \dots, m_j)$, and such that $\sum_{l=1}^{j} m_l \xi_l = 0$. We call such data the generalised asymptotic state.


Moreover, we will say that $(E, A)$ is 1-irreducible if there is no flat line bundle $L \to T \times \mathbb{C}$ such that $E$ admits a splitting $E' \oplus L$ which is compatible with the connection $A$.

Theorem 4. There is a bijective correspondence between the following objects:
– Gauge equivalence classes of 1-irreducible $SU(n)$-instantons over $T \times \mathbb{C}$ with fixed instanton number $k$ and generalised asymptotic state $(\xi_1, \dots, \xi_j)$ with multiplicities $(m_1, \dots, m_j)$;
– Admissible $U(k)$ solutions of Hitchin's equations over the dual torus $\hat{T}$, such that the Higgs field has at most simple poles at $\{\xi_1, \dots, \xi_j\}$; moreover, its residue at each $\xi_l$ is semi-simple and has rank $\le m_l$.

8. The Instanton Spectral Data

Our goal now is to construct an algebraic curve $S \hookrightarrow \hat{T} \times \mathbb{C}$ associated to a doubly-periodic instanton $A$; let $E$ be the associated instanton bundle.

Let $D^*_{A_\xi(w)}$ denote the restriction of the coupled Dirac operator $D^*_{A_\xi}$ to the torus $T_w$. We define:
\[ S = \{ (\xi, w) \in \hat{T} \times \mathbb{C} \mid \ker\{D^*_{A_\xi(w)}\} \neq 0 \}. \qquad (43) \]
Since $D^*_{A_\xi(w)} = \bar\partial^*_{A_\xi}|_{T_w} - \bar\partial_{A_\xi}|_{T_w}$, it is easy to see that:
\[ \ker\{D^*_{A_\xi(w)}\} = H^1(T_w, E(\xi)|_{T_w}). \qquad (44) \]
Note also that $S$ can be compactified to a curve $\bar{S} \hookrightarrow \hat{T} \times \mathbb{P}^1$ by adding the two points $(\pm\xi_0, \infty)$ corresponding to the asymptotic states. Assuming that the instanton bundle $E$ is fibrewise semistable$^{2}$, we conclude that $\bar{S}$ is a branched double cover of $\mathbb{P}^1$; the branch points correspond to those $w \in \mathbb{C}$ such that $E|_{T_w}$ is an extension of a line bundle of order two by itself.

Lemma 10. If $E$ is fibrewise semistable, the natural projection map $\pi_1 : \bar{S} \to \hat{T}$ is a branched $k$-fold covering map. Furthermore, the projection $\pi_2 : \bar{S} \to \mathbb{P}^1$ is a branched double covering map with $4k$ branch points, counted with multiplicity.

It follows that all spectral curves belong to the linear system $|k \cdot [\hat{T}] + 2 \cdot [\mathbb{P}^1]| \subset \hat{T} \times \mathbb{P}^1$. Moreover, if $E$ is regular, then $\bar{S}$ is a smooth curve of genus $g(\bar{S}) = 2k - 1$, by the Riemann–Hurwitz formula.

Proof. The proof of the first statement is a simple application of the Riemann–Roch theorem for the family of Dolbeault operators $\bar\partial_w$ on $E(\xi)|_{T_w}$, parametrised by $\mathbb{P}^1$. For generic $w \in \mathbb{P}^1$, $\dim\{\ker \bar\partial_w\} = 0$; this dimension jumps precisely when $E(\xi)|_{T_w}$ is an extension of the trivial line bundle by itself. Thus the cardinality of $\pi_1^{-1}(\xi)$ coincides with the number of jumping points (counted with multiplicity).

$^{2}$ In general, $E$ is only generically fibrewise semistable, so that $S$ may contain whole fibres.


From index theory, we know that the number of jumping points is precisely the first Chern class of the index bundle; therefore:
\[ \#(\text{jumping points}) = c_1(\mathrm{Ind}[\bar\partial_w]) = \int_{\mathbb{P}^1} \mathrm{ch}(E(\xi)) \cdot \mathrm{td}(K_T) / [T] = -\int_{T \times \mathbb{P}^1} c_2(E(\xi)) = -k. \qquad (45) \]
This shows that $\bar{S}$ is a $k$-fold covering of $\hat{T}$. Since the branch points of the projection $\pi_2$ are exactly the pre-images of the elements of order two in $\hat{T}$, there are $4k$ branch points, counted with multiplicity.

Line bundle with connection. Let $\pi_1 : \hat{T} \times \mathbb{P}^1 \to \hat{T}$ and $\pi_2 : \hat{T} \times \mathbb{P}^1 \to \mathbb{P}^1$ be the natural projection maps; we will also use $\pi_1$ and $\pi_2$ to denote the projections $\bar{S} \to \hat{T}$ and $\bar{S} \to \mathbb{P}^1$. To each $s \in \bar{S}$, we attach the vector space:
\[ L_s = \ker D^*_{A_{\pi_1(s)}(\pi_2(s))} = H^1(T_{\pi_2(s)}, E(\pi_1(s))|_{T_{\pi_2(s)}}). \qquad (46) \]
If $E$ is generically fibrewise semistable, then $L$ is only a coherent sheaf on the (possibly singular, non-reduced) spectral curve. However, when the instanton bundle is regular, $L$ becomes a line bundle over the smooth spectral curve.

So now let us assume that $A$ is a regular doubly-periodic instanton, and consider the bundle $\pi_1^* H \to \bar{S}$. There is a bundle map $T : \pi_1^* H \to L$, which is given by the following composition on each fibre:
\[ L^2_1(\Omega^{0,1} E(\pi_1(s))) \xrightarrow{\ P\ } \ker D^*_{A_{\pi_1(s)}} \xrightarrow{\ r\ } \ker D^*_{A_{\pi_1(s)}(\pi_2(s))}, \qquad (47) \]
where $r$ denotes the restriction map. Let $\iota_{L \to H}$ denote the inclusion $L \hookrightarrow \pi_1^* H$, which makes sense in terms of distributions. A connection on the line bundle $L \to \bar{S}$ is defined by:
\[ \nabla = T \circ \pi_1^* d \circ \iota_{L \to H}. \qquad (48) \]
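
The genus $2k - 1$ claimed in Lemma 10 can be cross-checked in two independent ways. The sketch below assumes the $4k$ branch points of $\pi_2$ are simple, writes $a = [\hat{T} \times \{\mathrm{pt}\}]$ and $b = [\{\mathrm{pt}\} \times \mathbb{P}^1]$ for the two fibre classes (notation introduced here, identified with $[\hat{T}]$ and $[\mathbb{P}^1]$ in the linear system of Lemma 10), and uses the standard facts $a^2 = b^2 = 0$, $a \cdot b = 1$ and $K_{\hat{T} \times \mathbb{P}^1} = -2a$.

```latex
% (1) Riemann--Hurwitz for the double cover \pi_2 : \bar S \to P^1
%     with 4k simple branch points:
\[
2g(\bar S) - 2 \;=\; 2\,\bigl(2\cdot 0 - 2\bigr) + 4k \;=\; 4k - 4
\quad\Longrightarrow\quad g(\bar S) = 2k - 1 .
\]
% (2) Adjunction on \hat T \times P^1 for a smooth curve in the class
%     [\bar S] = k\,a + 2\,b  (k-fold over \hat T, 2-fold over P^1):
\begin{align*}
2g(\bar S) - 2
  &= [\bar S]\cdot\bigl([\bar S] + K\bigr)
   = (k\,a + 2\,b)\cdot\bigl((k-2)\,a + 2\,b\bigr) \\
  &= 2k + 2(k-2) \;=\; 4k - 4 .
\end{align*}
% (3) Riemann--Hurwitz for the k-fold cover \pi_1 : \bar S \to \hat T
%     (genus 1 base) then gives total ramification 2g - 2 - k\cdot 0 = 4k - 4.
```

Both computations give $g(\bar{S}) = 2k - 1$, and the third line shows that, under the same assumptions, $\pi_1$ has $4k - 4$ ramification points counted with multiplicity.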

9. Hitchin's Spectral Data

We now look at the other side of the correspondence in Theorem 1 and review Hitchin's construction of spectral curves associated to Higgs bundles [12]. Recall that $V \to \hat{T} \setminus \{\pm\xi_0\}$ is a rank $k$ vector bundle, and $E$ is an endomorphism valued $(1, 0)$-form with simple poles at $\pm\xi_0$. So, for any fixed $\xi \in \hat{T} \setminus \{\pm\xi_0\}$, $E[\xi]$ is a $k \times k$ matrix and one can compute its $k$ eigenvalues. As we vary $\xi$, we get a $k$-fold covering, possibly branched, of $\hat{T} \setminus \{\pm\xi_0\}$ inside $\hat{T} \times \mathbb{C}$. This curve of eigenvalues is what we want to define as our Higgs spectral curve; more precisely:
\[ C = \left\{ (\xi, w) \in \hat{T} \times \mathbb{C} \mid \det(E[\xi] - w \cdot I_k) = 0 \right\}. \qquad (49) \]
In other words, $C$ is the set of points $(\xi, w) \in \hat{T} \times \mathbb{C}$ such that $w$ is an eigenvalue of the endomorphism $E[\xi] : V_\xi \to V_\xi$. Since we are assuming that $E$ has simple poles at $\pm\xi_0$, the curve $C \hookrightarrow \hat{T} \times \mathbb{C}$ can be compactified to a curve $\bar{C} \hookrightarrow \hat{T} \times \mathbb{P}^1$ by adding the points $(\pm\xi_0, \infty)$. The following proposition is a familiar fact from the theory of Higgs bundles.


Proposition 9. If $\xi_0 \neq -\xi_0$, the spectral curve associated to a generic Higgs bundle $(V, B, E)$ is smooth. Otherwise, if $\xi_0 = -\xi_0$, then all spectral curves have a double point at $(\pm\xi_0, \infty)$, but are generically smooth elsewhere.

Defining the spectral bundle. As before, we will denote the projections $C \to \hat{T}$ and $C \to \mathbb{P}^1$ by $\pi_1$ and $\pi_2$. We define a coherent sheaf $N$ on $C$ with stalks given by:

$$ N_c = \operatorname{coker}\big\{ E[\pi_1(c)] - \pi_2(c) \cdot \mathrm{Id}_k \big\}, \tag{50} $$

i.e. the dual of the $\pi_2(c)$-eigenspace of $E[\pi_1(c)]$. Generically, one expects the eigenvalues to be distinct, so that $N$ becomes a line bundle over the smooth curve $C$.

Assuming that the Higgs bundle $(V, B, E)$ is generic, we define a connection $\tilde{\nabla}$ on the line bundle $N \to C$. First note that $N$ is naturally a subbundle of $\pi_1^* V$; let $\iota_{N \to V}$ be the inclusion and $E : \pi_1^* V \to N$ the fibrewise projection. We define:

$$ \tilde{\nabla} = E \circ \pi_1^* \nabla_B \circ \iota_{N \to V}. \tag{51} $$
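As a rough numerical illustration of the definitions (49) and (50), the following sketch replaces the Higgs field by a hypothetical $2 \times 2$ toy family $M(\xi)$ with rows $(0, 1)$ and $(\xi, 0)$, $\xi \in \mathbb{C}$. This family is an assumption made only for the example and is not a Higgs field arising from the Nahm transform; the function names are likewise illustrative. For this family the equation $\det(M(\xi) - w \cdot I_2) = 0$ reads $w^2 = \xi$, so the eigenvalue curve is a 2-fold covering of the $\xi$-plane, and the code checks the dimension of the cokernel appearing in (50).

```python
import cmath


def toy_higgs(xi):
    """Hypothetical 2x2 toy family M(xi); an assumption for illustration only."""
    return [[0, 1], [xi, 0]]


def eigenvalues_2x2(m):
    """Roots of det(m - w*I) = w^2 - tr(m)*w + det(m) for a 2x2 matrix m."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)


def on_spectral_curve(xi, w, tol=1e-9):
    """Check the defining equation det(M(xi) - w*I) = 0 of the curve."""
    (a, b), (c, d) = toy_higgs(xi)
    return abs((a - w) * (d - w) - b * c) < tol


def coker_dim(xi, w, tol=1e-9):
    """Dimension of coker(M(xi) - w*I), computed as 2 - rank."""
    (a, b), (c, d) = toy_higgs(xi)
    a, d = a - w, d - w
    if abs(a * d - b * c) > tol:
        rank = 2
    elif max(abs(a), abs(b), abs(c), abs(d)) > tol:
        rank = 1
    else:
        rank = 0
    return 2 - rank


if __name__ == "__main__":
    # Print both eigenvalues over a few sample points and the cokernel
    # dimension at the first of them.
    for xi in (4, 1j, 0):
        w1, w2 = eigenvalues_2x2(toy_higgs(xi))
        print(xi, w1, w2, coker_dim(xi, w1))
```

In this toy model there are two distinct points of the curve over every $\xi \neq 0$, the two eigenvalues coincide over $\xi = 0$, and the cokernel is one-dimensional at each point of the curve, which is the behaviour one expects of a line bundle over a branched 2-fold covering.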

10. Matching the Spectral Data

We are finally in a position to state and prove the second main result of this paper:

Theorem 5. If $(V, B, E)$ is the Nahm transform of a regular instanton $(E, A)$, then the instanton spectral data $(S, L, \nabla)$ is equivalent to the Higgs spectral data $(C, N, \tilde{\nabla})$, in the sense that the curves $S$ and $C$ coincide pointwise and there is a natural isomorphism $L \to N$ preserving the connections.

Proof. Clearly, both spectral curves already have the points $(\pm\xi_0, \infty)$ in common. So let $\xi \neq \pm\xi_0$ and suppose that $\alpha$ is an eigenvector of $E[\xi]$ with eigenvalue $w \neq \infty$. In particular, the point $(\xi, w) \in \hat{T} \times \mathbb{C}$ belongs to the Higgs spectral curve $C$. By definition, we have:

$$ E[\xi](\alpha) = w \cdot \alpha \;\Rightarrow\; \alpha \otimes (s_0 - w \cdot s_\infty) = 0. $$

Clearly, $s = s_0 - w \cdot s_\infty$ is a holomorphic section in $H^0(\mathbb{P}^1, \mathcal{O}_{\mathbb{P}^1}(1))$ vanishing at $w \in \mathbb{P}^1 \setminus \{\infty\}$. Therefore it induces the following exact sequence:

$$ 0 \to E(\xi) \xrightarrow{\;\otimes s\;} \tilde{E}(\xi) \to \tilde{E}(\xi)|_{T_w} \to 0, $$

which in turn induces the cohomology sequence:

$$
\begin{aligned}
0 \to H^0\big(T_w, \tilde{E}(\xi)|_{T_w}\big) &\to H^1\big(T \times \mathbb{P}^1, E(\xi)\big) \xrightarrow{\;\otimes s\;} H^1\big(T \times \mathbb{P}^1, \tilde{E}(\xi)\big) \\
&\xrightarrow{\;r\;} H^1\big(T_w, \tilde{E}(\xi)|_{T_w}\big) \to 0.
\end{aligned} \tag{52}
$$

Now $H^0(T_w, \tilde{E}(\xi)|_{T_w})$ is nonzero (since it contains $\alpha$). Thus $H^1(T_w, \tilde{E}(\xi)|_{T_w}) = H^1(T_w, E(\xi)|_{T_w})$ is also nonzero: the map $\otimes s$ in (52) has nonzero kernel and acts between spaces of the same dimension, $h^1(T \times \mathbb{P}^1, E(\xi)) = h^1(T \times \mathbb{P}^1, \tilde{E}(\xi)) = k$, so its cokernel cannot vanish. Therefore, $\ker\{D^*_{A_\xi}(w)\}$ is also nonzero, since it can be identified with $H^1(T_w, E(\xi)|_{T_w})$ (see (44)). Hence $(\xi, w) \in \hat{T} \times \mathbb{C}$ also belongs to the instanton spectral curve $S$. The same argument provides the converse statement. Thus the curves $C$ and $S$ must coincide pointwise.


It also follows from the cohomology sequence (52) that the dual of the $w$-eigenspace of $E[\xi]$ is exactly $H^1(T_w, \tilde{E}(\xi)|_{T_w}) = H^1(T_w, E(\xi)|_{T_w})$. In other words, there are canonical identifications between the fibres $N_{(\xi, w)}$ and $L_{(\xi, w)}$; therefore, the line bundles are isomorphic.

Finally, let us check that the connections $\tilde{\nabla}$ and $\nabla$ also coincide. Noting that the projection $E : \pi_1^* V \to N = L$ is just the restriction map $r : \ker D^*_{A_{\pi_1(s)}} \to \ker D^*_{A_{\pi_1(s)}}(\pi_2(s))$ on each $s \in S = C$, it is easy to see that $T = E \circ \pi_1^* P$. Therefore, we have:

$$ \nabla = T \circ \pi_1^* d \circ \iota_{N \to H} = E \circ \pi_1^* P \circ \pi_1^* d \circ \iota_{V \to H} \circ \iota_{N \to V} = E \circ \pi_1^* \nabla_B \circ \iota_{N \to V} = \tilde{\nabla}. $$

Remark 2. More generally, the above argument shows that the pairs $(S, L)$ and $(C, N)$ also coincide (i.e. the curves $S$ and $C$ coincide pointwise, and $L$ and $N$ are isomorphic as coherent sheaves) when $E$ is fibrewise semistable.

Remark 3. Cherkis and Kapustin used a similar argument to establish the analogous result for periodic monopoles [5]. More precisely, they considered monopoles on $S^1 \times \mathbb{R}^2$, so that the Nahm transformed object is a Higgs pair on $S^1 \times \mathbb{R}$. Each of these objects can be associated to a spectral pair consisting of an algebraic curve in $\mathbb{R}^2 \times (\mathbb{R}^2 \setminus \{0\})$ plus a line bundle over it. If the Higgs pair is the Nahm transform of a periodic monopole, Cherkis and Kapustin have shown that both spectral data coincide.

11. Conclusion

11.1. An analytical remark. The attentive reader might have noticed that the assumptions on doubly-periodic instantons used in this paper (namely extensibility) do coincide with the conclusions of the first paper of the series. However, it is important to point out at this stage the small gap remaining between the conclusions of the present paper and the assumptions in [14]. More precisely, we assumed in [14] that the harmonic metric $h$ associated with the Higgs pair $(B, E)$ on the bundle $V \to \hat{T} \setminus \{\pm\xi_0\}$ is non-degenerate along the kernel of the residues of $E$, and $h \sim O(r^{1 \pm \alpha})$ along the image of the residues of $E$, for some $0 \le \alpha < 1/2$, in a holomorphic trivialisation of $V$ over a sufficiently small neighbourhood around $\pm\xi_0$. The gap is closed in [2], where it is shown that the Nahm transformed Higgs pairs constructed here do satisfy the above condition.

The analytical features of extensible doubly-periodic instantons are further studied by Olivier Biquard and the author in [2]. In particular, we show that if $|F_A| \sim O(r^{-2})$ then $A$ is extensible, and the asymptotic behaviour is completely determined.
Moreover, we give a deformation theory description of the moduli space of rank two doubly-periodic instanton connections as a hyperkähler manifold of complex dimension $4k - 2$. It is also shown that the Nahm transform is a hyperkähler isometry between the moduli space of doubly-periodic instantons and the moduli space of singular Higgs pairs.

11.2. Relation with Fourier–Mukai transform. The instanton spectral pair $(S, L)$ could also be constructed via Fourier–Mukai transform in the following way.


Let $F$ be a sheaf on $T \times \mathbb{P}^1$ and consider the diagram:

$$ T \times \mathbb{P}^1 \;\xleftarrow{\;\;O\;\;}\; T \times \hat{T} \times \mathbb{P}^1 \;\xrightarrow{\;\;\hat{O}\;\;}\; \hat{T} \times \mathbb{P}^1 $$

The Fourier–Mukai transform of $F$ is given by

$$ H(F) = R\hat{O}_*(O^* F \otimes P), $$

where $P$ denotes the pullback of the Poincaré bundle from $T \times \hat{T}$ to $T \times \hat{T} \times \mathbb{P}^1$. If $F$ is torsion-free and generically fibrewise semistable, then $H(F)$ is a torsion sheaf on $\hat{T} \times \mathbb{P}^1$. It is simple to show that if $F$ is locally-free and generically fibrewise semistable (as the instanton bundles considered in this paper are), then $H(F)$ is supported exactly over the spectral curve $S$, and the restriction of $H(F)$ to its support coincides with $L$ [18]. Furthermore, $V = \pi_{1*}(R^1\hat{O}_*(O^* F \otimes P))$. A more careful study of doubly-periodic instantons from the point of view of their Fourier–Mukai transform is done in [17].

Therefore, the holomorphic version of the Nahm transform can be seen as a Fourier–Mukai transform composed with Hitchin’s correspondence. However, the Nahm transform (and the spectral construction of Sect. 8) also contains some differential-geometric information (i.e. the instanton $A$, the transformed connection $B$, and the spectral connection $\nabla$) in addition to the holomorphic information encoded into the Fourier–Mukai transform.

Of course, such differential-geometric information is usually encoded into the holomorphic data in the form of a stability condition. Such a condition is well-known for Higgs bundles [11]. For doubly-periodic instantons, the appropriate concept of stability for the corresponding instanton bundles is established in [2]. It is less clear, though, what stability condition should be imposed on the spectral pairs $(S, L)$; such a question is addressed in [18] in a more general context.

Acknowledgement. This work is part of my Ph.D. project [13], which was funded by CNPq, Brazil. I am grateful to my supervisors, Simon Donaldson and Nigel Hitchin, for their constant support and guidance. I also thank Brian Steer and Olivier Biquard for valuable suggestions in the later stages of this project.

References

1. Arfken, G.: Mathematical methods in physics. London, New York: Academic Press, 1966
2. Biquard, O., Jardim, M.: Asymptotic behaviour and the moduli space of doubly-periodic instantons. J. Eur. Math. Soc. 3, 335–375 (2001)
3. Bott, R., Tu, L.: Differential forms in algebraic topology. New York: Springer-Verlag, 1982
4. Braam, P., van Baal, P.: Nahm’s transform for instantons. Commun. Math. Phys. 122, 267–280 (1989)
5. Cherkis, S., Kapustin, A.: Nahm Transform for Periodic Monopoles and N = 2 Super Yang–Mills Theory. Commun. Math. Phys. 218, 333–371 (2001)
6. Donaldson, S., Kronheimer, P.: Geometry of four-manifolds. Oxford: Clarendon Press, 1990
7. García Pérez, M., González-Arroyo, A., Pena, C., van Baal, P.: Nahm dualities on the torus – a synthesis. Nucl. Phys. B564, 159–181 (2000)
8. Gradshteyn, I., Ryzhik, I., Jeffrey, A. (ed.): Table of integrals, products and series. Boston: Academic Press, 1994
9. Gromov, M., Lawson, H.: Positive scalar curvature and the index of the Dirac operator on complete Riemannian manifolds. Inst. des Hautes Études Scientifiques Publ. Math. 58, 295–408 (1983)
10. Hitchin, N.: Construction of monopoles. Commun. Math. Phys. 89, 145–190 (1983)
11. Hitchin, N.: The self-duality equations on a Riemann surface. Proc. London Math. Soc. 55, 59–126 (1987)
12. Hitchin, N.: Stable bundles and integrable systems. Duke Math. J. 54, 91–114 (1987)
13. Jardim, M.: Nahm transform for doubly-periodic instantons. Ph.D. thesis, Oxford 1999. Available at math.DG/9912028
14. Jardim, M.: Construction of doubly-periodic instantons. Commun. Math. Phys. 216, 1–15 (2001)
15. Jardim, M.: Nahm transform for doubly-periodic instantons. Preprint math.DG/9910120
16. Jardim, M.: Spectral curves and Nahm transform for doubly-periodic instantons. Preprint math.AG/9909146
17. Jardim, M.: Classification and existence of doubly-periodic instantons. Preprint math.DG/0108004
18. Jardim, M., Maciocia, A.: A Fourier–Mukai approach to spectral data for instantons. Preprint math.AG/0006054
19. Kapustin, A., Sethi, S.: Higgs branch of impurity theories. Adv. Theor. Math. Phys. 2, 571–592 (1998)
20. Watson, G.N.: A treatise on the theory of Bessel functions. Cambridge: Cambridge University Press, 1995

Communicated by R. Dijkgraaf