Communications in Mathematical Physics - Volume 200


Author: A. Jaffe (Chief Editor)


Commun. Math. Phys. 200, 1–23 (1999)
Communications in Mathematical Physics © Springer-Verlag 1999

The One-Point Statistics of Viscous Burgers Turbulence Initialized with Gaussian Data*

Reade Ryan¹, Marco Avellaneda²

¹ The Center for Applied Probability, Columbia University, New York, NY 10027, USA. E-mail: [email protected]
² Courant Institute of Mathematical Sciences, New York University, New York, NY 10012, USA. E-mail: [email protected]

Received: 11 July 1997 / Accepted: 3 April 1998

Abstract: We study the statistics of the viscous Burgers turbulence (BT) model, initialized at time $t = 0$ by a large class of Gaussian data. Using a first-principles analysis of the Hopf–Cole formula for the Burgers equation and the theory of large deviations for Gaussian processes, we characterize the tails of the probability distribution functions (PDFs) for the velocity $u(x,t)$ and the velocity derivatives $\partial^n u(x,t)/\partial x^n$, $n = 1, 2, \ldots$. The PDF tails have a non-universal structure of the form $\log P(\theta) \propto -(Re)^{-p}\, t^{q}\, \theta^{r}$, where $Re$ is the Reynolds number and $p$, $q$, and $r$ depend on the order of differentiation and the infrared behavior of the initial energy spectrum.

1. Introduction

The one-dimensional Burgers turbulence (BT) model is described by the nonlinear system

$$\partial_t u(x,t) + u(x,t)\,\partial_x u(x,t) = \frac{1}{Re}\,\partial_{xx} u(x,t), \quad x \in \mathbb{R},\ t > 0, \qquad u(x,0) = u_o(x), \quad x \in \mathbb{R}, \tag{1.1}$$

where the initial data $u_o(x)$ is a random function and the Reynolds number $Re$ is a positive (possibly infinite) constant. This random system was first introduced as a simplified model of Navier–Stokes (N-S) turbulence [5]. The above differential equation, now called the Burgers equation, captures the interaction of two essential mechanisms in hydrodynamic turbulence: nonlinear wave propagation and viscosity. Hence, there exists a strong analogy between the compressible N-S equation and the Burgers equation. Both systems exhibit shock wave formations that dissipate energy at small scales. With its relative simplicity, the BT model is thus often used as a testing ground for analytic approaches to Navier–Stokes turbulence [7, 11]. BT systems are also of interest in their own right and have been employed in, among other things, the study of shock wave formation in compressible fluids [8] and the formation of large-scale mass clustering in an expanding universe [15, 18].

* This work was partially supported by NSF grant NSF-DMS-95-04122 and by the NSF funded Center for Applied Probability (CAP) at Columbia University. The work was initiated while the first author was at the Courant Institute of Mathematical Sciences at NYU. This article is an expansion of R.R.'s doctoral thesis.

Nowhere is the beautiful simplicity of the Burgers equation more apparent than in its connection to the heat equation via the Hopf–Cole transformation:

$$u(x,t) = -\frac{2}{Re}\,\frac{w_x(x,t)}{w(x,t)}.$$

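Applying this transformation to the Burgers equation, it is standard to show that $w(x,t)$ solves the heat equation $w_t = (Re)^{-1} w_{xx}$ with initial data

$$w(x,0) = \exp\Big\{-\frac{Re}{2}\int_0^x u_o(y)\,dy\Big\}.$$

Solving for $w$, one obtains an explicit solution for $u(x,t)$, the Hopf–Cole formula:

$$u(x,t) = \frac{x}{t} - \frac{1}{t}\,\frac{\displaystyle\int_{-\infty}^{\infty} y\,\exp\Big\{-\frac{Re}{2}\Big[\phi_o(y) + \frac{(x-y)^2}{2t}\Big]\Big\}\,dy}{\displaystyle\int_{-\infty}^{\infty} \exp\Big\{-\frac{Re}{2}\Big[\phi_o(y) + \frac{(x-y)^2}{2t}\Big]\Big\}\,dy}, \tag{1.2}$$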

where $\phi_o(y) = \int_0^y u_o(z)\,dz$ is often called the initial potential. Through a first-principles analysis of Eq. (1.2) and large-deviation type estimates of certain rare events for the initial data, we will characterize the tails of the PDFs for the velocity $u(x,t)$ and the velocity derivatives $\partial_x^n u(x,t)$, $n = 1, 2, \ldots$.

The study of PDFs associated with BT started with the work of Burgers [6]. Studying inviscid ($Re = \infty$) BT with white-noise initial data, he found integral representations for the shock strength and rarefaction interval size PDFs. More recently, Avellaneda and E [2, 3] and Ryan [13] investigated the tails of these PDFs and the PDF of the velocity $u(x,t)$ for this system, finding they have a faster-than-Gaussian rate of decay, $\log P(\theta) \propto -C\theta^3$. The one-point statistics of inviscid BT models with more general data have been studied in numerous other works [14, 10, 18, 17]. Much of this work points toward the conclusion that the solution statistics are non-universal: the statistics of $\{u(x,t),\ x \in \mathbb{R}\}$, for $t > 0$, are highly sensitive to differences in the statistics of the initial data. Recently, Gurbatov et al. [9] studied the energy spectrum of $u(x,t)$ in BT models where the initial velocity was stationary and Gaussian with a spectrum proportional to $k^n$ at small wave numbers $k$. They showed that the energy spectrum of $u(x,t)$ for low and intermediate wave numbers depended heavily on the exponent $n$ characterizing the initial spectrum at low wave numbers. In this paper, we show that the shape of the PDFs for $u(x,t)$ and its derivatives has a similar "responsiveness" to the low-wave number behavior of the initial velocity spectrum.

For viscous BT systems, the PDFs of the velocity and its gradient were analyzed previously by Gotoh and Kraichnan [7, 11] using the theory of mapping closure and numerical simulations. Looking to quantify the deviation from Gaussian statistics,¹ their analysis showed that the PDF tails of the velocity gradient have a significantly slower decay rate than that of the velocity. The precise shape of the tails was, however, difficult to ascertain due to the rareness of the tail events.

¹ Their numerical simulations were performed with stationary Gaussian initial data, having energy spectrum $E[k] = E_0\,|k/k_0|^2\, e^{-k^2/k_1^2}$.

In this paper, we study the shape of the PDFs of the velocity, the velocity gradient, and all the higher order velocity derivatives for viscous Burgers systems, initialized with a large class of Gaussian initial data. We rigorously show that the tails of these PDFs should decay like "stretched" exponentials. Through an analytical understanding of the "physics" of Burgers turbulence and the theory of large deviations for general Gaussian processes, we prove that these tails have the form $\log P(\theta) \propto -(Re)^{-p}\, t^{q}\, \theta^{r}$, where the exponents $p$, $q$, and $r$ are determined by the low wave number behavior of the initial velocity spectrum and the order of differentiation. In a 1995 Physics of Fluids article [4], the authors of this paper and Weinan E first conjectured that these PDFs should decay in this fashion. We believe that the methods of proof employed here can also be applied to the study of more general statistical problems associated with BT systems.
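The Hopf–Cole formula (1.2) lends itself to direct numerical evaluation. The following is a minimal sketch, not part of the original paper: it assumes the initial potential is sampled on a uniform grid wide enough that the truncated tails are negligible, and the function name `hopf_cole_velocity` is purely illustrative.

```python
import numpy as np

def hopf_cole_velocity(x, t, Re, y, phi):
    """Approximate u(x, t) from Eq. (1.2) on a uniform grid y.

    phi holds the initial potential phi_o sampled at the grid points y
    (illustrative helper; uniform grid and negligible tails are assumed).
    """
    exponent = -0.5 * Re * (phi + (x - y) ** 2 / (2.0 * t))
    exponent -= exponent.max()  # rescale; the common factor cancels in the ratio
    weights = np.exp(exponent)
    # On a uniform grid the spacing dy cancels between numerator and denominator.
    mean_y = np.sum(y * weights) / np.sum(weights)
    return x / t - mean_y / t

# Sanity check: a constant initial velocity a (phi_o(y) = a*y) should be preserved.
y = np.linspace(-50.0, 50.0, 20001)
a = 0.7
u_const = hopf_cole_velocity(x=0.3, t=1.0, Re=10.0, y=y, phi=a * y)
u_zero = hopf_cole_velocity(x=0.3, t=1.0, Re=10.0, y=y, phi=np.zeros_like(y))
```

The constant-velocity case is used as a check because the Gaussian weight in (1.2) is then centered at $y = x - at$, so the formula returns $u = a$ exactly.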

2. The Initial Potentials and Their Spectral Representations

Looking at the Hopf–Cole formula, Eq. (1.2), it is clear that the solution to the Burgers equation does not depend directly on the initial data $u_o(x)$ but on the initial potential $\phi_o(x)$. Therefore, we shall define the class of initial potentials that we want to investigate rather than the actual initial velocity. This is also convenient because we wish to study BT systems initialized with generalized random data, such as white noise or fractional Brownian noise. The integrals of these processes, on the other hand, are classically defined (even continuous) stochastic processes (respectively, Brownian motion and fractional Brownian motion). Thus, we can consider the solutions to BT systems initialized with generalized data without a lengthy discussion of the definition and statistical properties of generalized random functions.

From a physical point of view, the assumption of isotropic initial data makes intuitive sense. Therefore, we shall consider initial potentials that are mean-zero Gaussian processes that are either stationary themselves or have stationary increments. In this section, we give a precise definition of the class of potentials under consideration through a discussion of their energy spectra and their spectral representations with respect to white noise.

For a stationary process $G(y)$, $y \in \mathbb{R}$, its energy spectrum $E[k]$ is defined as the Fourier transform of its correlation function, i.e.

$$E[G(x)G(y)] = R(x-y) = \int_{-\infty}^{\infty} e^{i(x-y)k}\, E[k]\,dk. \tag{2.1}$$

For real processes $G(y)$, this simplifies to

$$R(x-y) = \int_{-\infty}^{\infty} \cos((x-y)k)\, E[k]\,dk. \tag{2.2}$$

Stochastic processes that have only stationary increments do not have integrable correlation functions. Therefore, the above definition for the energy spectrum does not apply. In a self-consistent way, we define the spectrum of a real stationary-increment process as the function which satisfies the equation

$$E\big[(G(y) - G(0))^2\big] = 2\int_{-\infty}^{\infty} (1 - \cos(yk))\, E[k]\,dk. \tag{2.3}$$
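As an illustration of definition (2.3), and not a computation taken from the paper, consider the idealized case in which the power law $E[k] = C|k|^{-1-\alpha}$ holds for all $k$ with $0 < \alpha < 2$ (an assumption made only for this sketch; the constant $c_\alpha$ below is introduced here for convenience). The substitution $s = |y|k$ then gives:

```latex
\begin{align*}
E\big[(G(y)-G(0))^2\big]
  &= 2C\int_{-\infty}^{\infty} \frac{1-\cos(yk)}{|k|^{1+\alpha}}\,dk \\
  &= 2C\,|y|^{\alpha}\int_{-\infty}^{\infty} \frac{1-\cos s}{|s|^{1+\alpha}}\,ds
     \qquad (s = |y|\,k) \\
  &= c_\alpha\, C\,|y|^{\alpha},
  \qquad c_\alpha := 2\int_{-\infty}^{\infty} \frac{1-\cos s}{|s|^{1+\alpha}}\,ds .
\end{align*}
```

The integral defining $c_\alpha$ converges at $s = 0$ because $\alpha < 2$ and at infinity because $\alpha > 0$, so in this idealized case the mean-square increment grows like $|y|^\alpha$, the scaling of fractional Brownian motion.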

Note that for stationary processes the above is equivalent to Eq. (2.2). The spectral representation of a real mean-zero Gaussian process $G(y)$ with stationary increments and energy spectrum $E[k]$ is given by the formula

$$G(y) - G(0) = \int_{-\infty}^{\infty} (1 - \cos(yk))\,(E[k])^{1/2}\,dZ_1(k) + \int_{-\infty}^{\infty} \sin(yk)\,(E[k])^{1/2}\,dZ_2(k), \tag{2.4}$$

where $Z_1(\cdot)$ and $Z_2(\cdot)$ are independent Brownian motions on $\mathbb{R}$. The basic properties of stochastic integrals imply that the right-hand side of Eq. (2.4) is Gaussian with mean zero and has energy spectrum $E[k]$. Given the one-to-one correspondence between Gaussian processes and correlation functions, Eq. (2.4) gives the unique spectral representation of $G(y)$. This representation will play a key role in establishing the large-deviation estimates on the tail events for the velocity and the velocity derivatives.

The initial potentials under consideration in this paper are continuous Gaussian processes that have, at least, stationary increments and have infrared power laws in their spectra. Their energy spectra satisfy the following conditions:

1. $\exists\, \alpha \in (-\infty, 2)$ such that $E[k] = C|k|^{-1-\alpha}$ for $k \in (-1, 1)$.
2. For the same $\alpha$ and for some $\varepsilon \in (0, 2)$, $E[k] \le C|k|^{-1-(\alpha \vee \varepsilon)}$ for $k \notin (-1, 1)$.
3. $\exists\, \varepsilon' \in (0, 2)$ such that $E[k] \ge C|k|^{-1-\varepsilon'}$ for all $k \notin (-1, 1)$.

Here $C$ is an arbitrary constant whose value may change from line to line, and $x \vee y \equiv \max\{x, y\}$. We shall continue to use this notation below.

Condition 1 is the essential condition here. The low-wave number structure of the energy spectrum dictates the large-scale behavior of the potential processes. Thus, we shall see that the infrared power law exponent $\alpha$ determines the tail-event probabilities in our BT systems. Conditions 2 and 3 are technical necessities. Condition 2 ensures that there is no ultraviolet singularity in the spectrum and that, $\forall x, y \in \mathbb{R}$,

$$E\big[(G(y) - G(x))^2\big] \le C|y - x|^{\alpha \vee \varepsilon}. \tag{2.5}$$

This implies that the paths of $G(\cdot)$ are not too rough.² Condition 3 excludes processes that are too "deterministic". In the large-deviation analysis below, the necessity of these requirements will be clear.

² Equation (2.5) implies that the path of $G(\cdot)$ is Hölder continuous with exponent $< (\alpha \vee \varepsilon)/2$.

3. Results and Hopf–Cole Analysis

Below we state our results on the tail probabilities for viscous Burgers systems with the above-stated initial data. To obtain the upper bounds on the PDF tails, we employ the theory of extreme values for Gaussian processes, making use of Adler's work on this subject [1]. For the lower bounds, we use the spectral representations of the initial potentials in conjunction with the Cameron–Martin–Girsanov formula for Brownian motion.

Theorem 1. Let $u(x,t)$, $x \in \mathbb{R}$ and $t \in \mathbb{R}^+$, be the solution to the 1-D Burgers equation with finite Reynolds number $Re$. If the initial potential $G(x)$ is a Gaussian process satisfying Conditions 1 and 2 and parameterized by $\alpha \in (-\infty, 0) \cup (0, 2)$, then $\forall n = 0, 1, 2, \ldots$, $\exists\, c_n > 0$ such that

$$\mathrm{Prob}\Big\{\Big|\frac{\partial^n u(x,t)}{\partial x^n}\Big| > \theta\Big\} \le \exp\big\{-c_n\,(Re)^{-p}\, t^{q}\, \theta^{r}\big\} \tag{3.1}$$

for $\theta \gg \max\{(Re)^{-1},\ t^{-1}(Re/t)^n\}$, with

$$p = \frac{(4 - \alpha \vee 0)\,n}{n+1}, \qquad q = 2 - \alpha \vee 0, \qquad r = \frac{4 - \alpha \vee 0}{n+1}.$$
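The exponents of Theorem 1 are simple functions of $\alpha$ and $n$. The helper below is only an illustrative sketch (the name `tail_exponents` and the error handling are assumptions of this example, not part of the paper); it excludes $\alpha = 0$, which the paper treats separately.

```python
def tail_exponents(alpha, n):
    """Return (p, q, r) from Theorem 1 for infrared exponent alpha and derivative order n.

    Illustrative helper: valid for alpha < 2 with alpha != 0 and integer n >= 0.
    """
    if alpha == 0 or alpha >= 2:
        raise ValueError("Theorem 1 exponents require alpha < 2 and alpha != 0")
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    a = max(alpha, 0.0)              # alpha v 0 in the paper's notation
    p = (4.0 - a) * n / (n + 1)
    q = 2.0 - a
    r = (4.0 - a) / (n + 1)
    return p, q, r

# alpha = 1 corresponds to a Brownian-motion potential (white-noise initial velocity).
velocity_exponents = tail_exponents(1.0, 0)
gradient_exponents_stationary = tail_exponents(-1.0, 1)
```

For $\alpha = 1$ and $n = 0$ this gives $r = 3$, matching the cubic decay $\log P(\theta) \propto -C\theta^3$ quoted in the Introduction for white-noise initial data.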

For an initial potential $G$ with $\alpha = 0$, there are logarithmic corrections to Eq. (3.1): $\forall n = 0, 1, 2, \ldots$, $\exists\, c_n > 0$ such that

$$\mathrm{Prob}\Big\{\Big|\frac{\partial^n u(x,t)}{\partial x^n}\Big| > \theta\Big\} \le \exp\Big\{\frac{-c_n\,(Re)^{-p}\, t^{q}\, \theta^{r}}{\log\big[(Re)^{-n/(n+1)}\, t\, \theta^{1/(n+1)}\big]}\Big\}. \tag{3.2}$$

Theorem 2. If the initial potential $G(y)$ also satisfies Condition 3 and is parameterized by $\alpha \in (-2, 0) \cup (0, 2)$, then $\exists\, C_n$ such that

$$\mathrm{Prob}\Big\{\Big|\frac{\partial^n u(x,t)}{\partial x^n}\Big| > \theta\Big\} \ge \exp\big\{-C_n\, t^{q}\,(Re)^{-p}\, \theta^{r}\big\} \tag{3.3}$$

for $\theta \gg \max\{(Re)^{-1},\ t^{-1}(Re/t)^n\}$, with $p$, $q$, and $r$ defined as in Theorem 1.

The correct super-exponential lower bounds for the above PDFs are also attainable for $\alpha < -2$. Proving these bounds, however, would be a lengthy and unenlightening process. After proving Theorems 1 and 2, we will indicate how these results can be extended for initial potentials with such $\alpha$. In the interest of brevity, we also leave the proof of the PDF lower bounds when $\alpha = 0$ to the interested reader.

In the above theorems, it is clear that $\alpha = 0$ is a transition point. This is due to the fact that $\alpha = 0$ is the border value between the stationary and non-stationary regimes. For $\alpha \in (0, 2)$, the potential process is non-stationary with

$$E\big[(G(y) - G(x))^2\big] \propto |y - x|^{\alpha}, \quad \text{as } |y - x| \to \infty. \tag{3.4}$$

For $\alpha < 0$, the potentials are stationary, with the critical value $\alpha = 0$ corresponding to a non-stationary process with logarithmic growth of mean-square differences. This dichotomy leads to quite different tail-event behavior. Understanding how the differences in the large-scale behavior of the initial potentials lead to different PDF statistics is, in essence, the project of this paper.

To gain some initial insight into this phenomenon, let us briefly analyze the solution to the Burgers system (1.1), looking, in particular, to see where large velocities and velocity derivatives occur. This analysis will also give the motivation behind our method of proof. If $Re \gg t$, it is well known that the Hopf–Cole formula can be analyzed by the method of steepest descent. The function

$$\Phi(x, y, t) = \phi_o(y) + \frac{(x - y)^2}{2t}$$

in Eq. (1.2) controls the shape of $u(x,t)$. With $t$ fixed, for each $x$ the main contribution to the right side of Eq. (1.2) comes from the $y$-points at which $\Phi(x, y, t)$ achieves its global minimum. For points $x_0$ where $\Phi(x_0, y, t)$ achieves its global minimum at only one point $y(x_0)$, steepest descent analysis shows that in a small neighborhood around $x_0$,

$$u(x,t) \approx \frac{x - y(x_0)}{t}. \tag{3.5}$$

The solution $u(\cdot, t)$ takes the form of a ramp of slope $1/t$. No matter how large the velocity is in this neighborhood, its derivatives are universally small. It is near points $x$ for which $\Phi(x, y, t)$ has two (or more) minimum points that large gradients and large derivatives of higher order are found. Here, steepest descent calculations (see, e.g., [6]) reveal that in the neighborhood of an $x$ such that $\Phi(x, y, t)$ achieves its global minimum at $y_1$ and $y_2$,

$$u(x + r, t) \approx \frac{x + r}{t} - \frac{1}{t}\,\frac{y_1\, e^{(y_1 - x) r Re/2t} + y_2\, e^{(y_2 - x) r Re/2t}}{e^{(y_1 - x) r Re/2t} + e^{(y_2 - x) r Re/2t}} \tag{3.6}$$

$$\approx \frac{x + r - \zeta}{t} - \frac{\delta y}{2t}\,\tanh\Big(\frac{Re\,\delta y\, r}{4t}\Big), \tag{3.7}$$

where $\zeta = (y_2 + y_1)/2$ and $\delta y = y_2 - y_1$. This formula shows that the Burgers equation satisfies to leading order the following scaling laws near a double minimum point:

$$\Big|\frac{\partial^n u(x,t)}{\partial x^n}\Big| \approx 2\Big(\frac{\delta y}{4t}\Big)^{n+1} (Re)^n, \quad n = 0, 1, 2, \ldots. \tag{3.8}$$

In the above discussion, we assumed that $Re \gg t$ and that $\delta y$ was at least of order one. Getting back to the problem of finding large velocity derivatives when $Re$ is relatively small, we make the useful observation that a similar steepest descent argument can be made in the vicinity of a double minimum point provided that the "shock strength" $\delta y/t \gg (Re)^{-1}$. To see this, let us rescale the Burgers equation in the vicinity of such a point with $x = \delta y\, z$ and $\tilde{u}(z, t) = (\delta y)^{-1} u(x, t)$. Then the equation for $\tilde{u}$ is

$$\tilde{u}_t + \tilde{u}\,\tilde{u}_z = \frac{1}{\delta y\, Re}\,\tilde{u}_{zz}. \tag{3.9}$$
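The passage from the two-exponential expression (3.6) to the tanh profile (3.7) is an algebraic identity, which can be checked numerically. The sketch below is illustrative only; the function names and the parameter values are assumptions of this example.

```python
import numpy as np

def two_minimum_profile(r, x, t, Re, y1, y2):
    """Right-hand side of Eq. (3.6): weighted average of the two minimizers."""
    a1 = (y1 - x) * r * Re / (2.0 * t)
    a2 = (y2 - x) * r * Re / (2.0 * t)
    shift = np.maximum(a1, a2)          # avoid overflow; cancels in the ratio
    w1, w2 = np.exp(a1 - shift), np.exp(a2 - shift)
    return (x + r) / t - (y1 * w1 + y2 * w2) / ((w1 + w2) * t)

def tanh_profile(r, x, t, Re, y1, y2):
    """Right-hand side of Eq. (3.7) with zeta = (y1 + y2)/2 and dy = y2 - y1."""
    zeta = 0.5 * (y1 + y2)
    dy = y2 - y1
    return (x + r - zeta) / t - dy / (2.0 * t) * np.tanh(Re * dy * r / (4.0 * t))

r = np.linspace(-0.5, 0.5, 11)
params = dict(x=1.0, t=2.0, Re=30.0, y1=0.0, y2=2.0)
profiles_agree = np.allclose(two_minimum_profile(r, **params),
                             tanh_profile(r, **params))
```

Differentiating the tanh term $n$ times in $r$ produces the prefactor $2(\delta y/4t)^{n+1}(Re)^n$ appearing in (3.8).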

Now the "effective" Reynolds number is $\delta y\, Re$, which is $\gg t$, and the rescaled shock strength equals one. We can apply our previous analysis to the rescaled equation. Therefore, the approximate equality in Eq. (3.8) for velocity derivatives holds for Burgers systems with arbitrary Reynolds numbers in the vicinity of large enough shocks. Thus, estimating the probability that $\partial_x^n u(x,t) > \theta$ when $\theta \approx 2(\delta y/4t)^{n+1}(Re)^n$ is equivalent to estimating the probability that $\Phi(x, y, t)$ has two global minimum points $y_1$ and $y_2$ separated by a distance $> \delta y$. Assuming that $y_1 \approx x$, a natural assumption since $\frac{1}{2t}(y - x)^2$ has its minimum at $x$, and $y_2 \approx x + \delta y$, an off-the-cuff calculation shows that the probability of such a double minimum is equivalent to $\mathrm{Prob}\big\{G(x + \delta y) - G(x) = \frac{1}{2t}(\delta y)^2\big\}$, which is $\propto \exp\{-(\delta y)^r\, t^{-2}/8\}$ with $r$ defined as in Theorem 1. The next two sections establish the PDF bounds of Theorems 1 and 2, using this Hopf–Cole analysis as the pattern on which to mold the proofs.

In order to simplify and unify the proof of Theorem 1, we introduce the following notation: $\forall x > 1$,

$$x^{\alpha^+} = \begin{cases} 1 & \text{for } \alpha < 0, \\ \log(x) & \text{for } \alpha = 0, \\ x^{\alpha} & \text{for } \alpha > 0. \end{cases}$$

This will allow us to handle the logarithmic corrections to the PDF tails for $\alpha = 0$ without the need of additional statements.

4. Upper Bound Proofs for Theorem 1

We start with the proof of the upper bound for the PDF tail for the velocity. We will get the upper bounds for the velocity derivatives by a fairly simple extension of this proof. The idea is to find a large set of initial-potential paths $F$, for which the velocity at some point $x$ ($t$ fixed) is bounded in absolute value and for which $\Phi(x, y, t)$ does not have two widely separated global minima, thus preventing large velocity derivatives. By the stationarity of the initial velocity and the translation invariance of the Burgers equation, we can specify $x = 0$ without loss of generality. The complement of this set, $F^c$, will then give us the appropriate PDF upper bounds. With $G(y)$ as the initial potential, we define the set $F$ as follows:

$$\begin{aligned}
F &= A_1 \cap A_2 \cap B, \\
A_1 &= \Big\{G(\cdot) : \tfrac{1}{4t}\, y^2 \le G(y) + \tfrac{1}{2t}\, y^2,\ \forall y > L\Big\}, \\
A_2 &= \Big\{G(\cdot) : \tfrac{1}{4t}\, y^2 \le G(y) + \tfrac{1}{2t}\, y^2,\ \forall y < -L\Big\}, \\
B &= \Big\{G(\cdot) : G(y) + \tfrac{1}{2t}\, y^2 \le L^{2 - \alpha \vee \varepsilon}/t,\ \forall y \in (-2/L^2,\ 2/L^2)\Big\},
\end{aligned} \tag{4.1}$$

where $L = \theta t/2$ and $\varepsilon$ corresponds to the $\varepsilon$ in Condition 2. First we show that $F \subset \{|u(0,t)| \le \theta\}$. Setting $\Phi(y, t) = G(y) + \frac{1}{2t}\, y^2$,

$$|u(0,t)| = \frac{1}{t}\,\frac{\Big|\int_{-\infty}^{\infty} y\, e^{-Re\,\Phi(y,t)/2}\,dy\Big|}{\int_{-\infty}^{\infty} e^{-Re\,\Phi(y,t)/2}\,dy} \tag{4.2}$$

$$\le \frac{1}{t}\,\frac{\int_{-\infty}^{\infty} |y|\, e^{-Re\,\Phi(y,t)/2}\,dy}{\int_{-\infty}^{\infty} e^{-Re\,\Phi(y,t)/2}\,dy} \tag{4.3}$$

$$\le \frac{L}{t} + \frac{1}{t}\,\frac{\int_{-\infty}^{-L} |y|\, e^{-Re\,\Phi(y,t)/2}\,dy + \int_{L}^{\infty} |y|\, e^{-Re\,\Phi(y,t)/2}\,dy}{\int_{-\infty}^{\infty} e^{-Re\,\Phi(y,t)/2}\,dy}. \tag{4.4}$$

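Using the bounds on the paths of $F$, we get, $\forall L \gg t/Re$,

$$|u(0,t)| \le \frac{L}{t} + \frac{1}{t}\,\frac{\int_{L}^{\infty} y\, e^{-Re\, y^2/8t}\,dy}{\int_{0}^{2/L^2} e^{-Re\, L^{2 - \alpha \vee \varepsilon}/2t}\,dy} \tag{4.5}$$

$$\le \frac{L}{t}\Big[1 + \frac{2tL}{Re}\, e^{-\frac{Re}{8t}\left(L^2 - 4L^{2 - \alpha \vee \varepsilon}\right)}\Big] \tag{4.6}$$

$$\le \frac{2L}{t} = \theta. \tag{4.7}$$

Thus,

$$\mathrm{Prob}\{|u(0,t)| > \theta\} \le P(F^c) = P(A_1^c \cup A_2^c \cup B^c) \tag{4.8}$$

$$\le P(A_1^c) + P(A_2^c) + P(B^c). \tag{4.9}$$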

Using the following result due to Adler [1] and a corollary of this result, we shall bound the right-hand side of the above inequality by the desired probability.

Lemma 4.1 (Adler). Let $G(\cdot)$ be a continuous mean-zero Gaussian process on $\mathbb{R}$. Define a metric on $\mathbb{R}$, $p(x, y) = E\big[(G(x) - G(y))^2\big]$. Given some bounded set $I \subset \mathbb{R}$, we define $N_I(\epsilon)$ = the number of $\epsilon$-balls needed to cover the set $I$ under the metric $p(\cdot, \cdot)$. If $\forall \epsilon > 0$, $N_I(\epsilon) \le K \epsilon^{-\gamma}$ for some $\gamma, K > 0$, then with $\sigma_I^2 = \sup_{x \in I} E[G(x)^2]$,

$$P\Big\{\sup_{x \in I} G(x) \ge m\Big\} \le C(\gamma, K)\, m^{\eta}\, \exp\Big\{-\frac{1}{2}\, m^2/\sigma_I^2\Big\}, \tag{4.10}$$

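where $\eta$ is any real number $> \gamma$.

Since $E\big[(G(y) - G(x))^2\big] \le C|y - x|^{\alpha \vee \varepsilon}$ for some $\varepsilon$ (Eq. (2.5)), the process $G$ satisfies the conditions of Lemma 4.1 with $\gamma = 1/(\alpha \vee \varepsilon)$. Furthermore, Ryan (see [14], Eq. (3.18)) proved the following corollary:

Corollary 4.1. With $G(y)$, $y \in \mathbb{R}$, defined as above and $L > 1$, $\exists\, C$ such that

$$P\Big\{\min_{y > L}\big[G(y) + \beta y^2\big] \le 0\Big\} \le C \beta^{\eta} \exp\big\{-C \beta^2 L^{4 - \alpha^+}\big\}. \tag{4.11}$$

Applying Corollary 4.1 to the first and second terms on the right-hand side of Eq. (4.9), one has

$$P(A_1^c) = P(A_2^c) = P\Big\{\min_{y > L}\Big[G(y) + \frac{1}{4t}\, y^2\Big] \le 0\Big\} \tag{4.12}$$

$$\le \exp\big\{-C L^{4 - \alpha^+}/t^2\big\} = \exp\big\{-C'\, \theta^{4 - \alpha^+}\, t^{2 - \alpha^+}\big\}. \tag{4.13}$$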

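A straightforward application of Lemma 4.1 yields

$$P(B^c) = P\Big\{\max_{0 \le y \le 2/L^2}\Big[G(y) + \frac{1}{2t}\, y^2\Big] \ge L^{2 - \alpha \vee \varepsilon}/t\Big\} \tag{4.14}$$

$$\le P\Big\{\max_{0 \le y \le 2/L^2} B_h(y) \ge L^{2 - \alpha \vee \varepsilon}/2t\Big\} \tag{4.15}$$

$$\le C\big(L^{2 - \alpha \vee \varepsilon}/2t\big)^{\eta} \exp\big\{-C L^4/t^2\big\} \tag{4.16}$$

$$\le \exp\big\{-C\, \theta^{4 - \alpha^+}\, t^{2 - \alpha^+}\big\}. \tag{4.17}$$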

This concludes the upper bound proof for the velocity PDF.

4.1. Proof of velocity derivative upper bounds. Using the same set $F$, we can extend the above proof to obtain the upper bounds on the tails of the velocity derivative PDFs. With the upper bound estimate on the measure of $F^c$ already in hand, all we need to show is that for all paths $G(\cdot) \in F$, $\exists\, K_n > 0$, $n = 1, 2, \ldots$, such that

$$\Big|\frac{\partial^n u(0,t)}{\partial x^n}\Big| \le K_n\,(Re)^n \Big(\frac{L}{t}\Big)^{n+1}. \tag{4.18}$$

In order to obtain this bound, let us rewrite the Hopf–Cole solution for $\partial^n u(0,t)/\partial x^n$ in the following manner:

$$\begin{aligned}
\frac{\partial^n u(0,t)}{\partial x^n}
&= \Big[\frac{\partial^n}{\partial x^n}(x/t) - \frac{2}{Re}\,\frac{\partial^{n+1}}{\partial x^{n+1}} \log\Big\{\int_{-\infty}^{\infty} e^{Re\, xy/2t}\, e^{-Re\,(B_h(y) + y^2/2t)/2}\,dy\Big\}\Big]_{x=0} \\
&= \Big[\frac{\partial^n}{\partial x^n}(x/t) - \frac{2}{Re}\,\frac{\partial^{n+1}}{\partial x^{n+1}} \log E\Big[e^{\frac{Re\, x}{2t} Y}\Big]\Big]_{x=0},
\end{aligned} \tag{4.19}$$

where $Y$ is defined as a random variable whose probability density $P(Y = y)$ is equal to $\frac{1}{C}\, e^{-Re\,(B_h(y) + y^2/2t)/2}$, with $C$ a normalizing constant. The above expectation is the moment-generating function of $Y$ with respect to the parameter $Re\, x/2t$. The log of this expectation is called the cumulant-generating function of $Y$. Formally, we can Taylor expand this function and the moment-generating function of $Y$ to get the identity

$$\begin{aligned}
\log E\Big[e^{\frac{Re\, x}{2t} Y}\Big]
&= \lambda_1 \frac{Re\, x}{2t} + \frac{\lambda_2}{2!}\Big(\frac{Re\, x}{2t}\Big)^2 + \ldots + \frac{\lambda_n}{n!}\Big(\frac{Re\, x}{2t}\Big)^n + \ldots \\
&= \log\Big[1 + \xi_1 \frac{Re\, x}{2t} + \frac{\xi_2}{2!}\Big(\frac{Re\, x}{2t}\Big)^2 + \ldots + \frac{\xi_m}{m!}\Big(\frac{Re\, x}{2t}\Big)^m + \ldots\Big].
\end{aligned} \tag{4.20}$$

Here $\xi_m$ = the $m$th moment of $Y$, and $\lambda_n$ = the $n$th cumulant or semi-invariant moment of $Y$. This identity, however, is more than just formal. The Gaussian-like decay of the density of $Y$ gives us the existence of the Taylor series expansion of the moment-generating function, which in turn implies the existence of the expansion of the cumulant Taylor series. Matching the powers of $x$ in Eq. (4.20), we get:

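$$\begin{aligned}
\lambda_1 &= \xi_1, \\
\lambda_2 &= \xi_2 - \xi_1^2, \\
\lambda_3 &= \xi_3 - 3\xi_2 \xi_1 + 2\xi_1^3, \\
&\ \ \vdots \\
\lambda_n &= n! \sum_{m=1}^{n} \frac{(-1)^{m-1}}{m} \sum \Big(\frac{\xi_{p_1}}{p_1!}\Big)^{\pi_1} \Big(\frac{\xi_{p_2}}{p_2!}\Big)^{\pi_2} \cdots\ \frac{m!}{\pi_1!\, \pi_2! \cdots}.
\end{aligned} \tag{4.21}$$

The second summation in the last equation is over all natural number pair sequences $\{(\pi_i, p_i)\}_i$ such that 1. $\sum_i \pi_i = m$, 2. $\sum_i \pi_i p_i = n$, and 3. $p_i \ne p_j$ if $i \ne j$. By taking derivatives of the cumulant series and setting $x = 0$, we arrive at the formula

$$\frac{\partial^n u(0,t)}{\partial x^n} = \Big[\frac{\partial^n}{\partial x^n}(x/t)\Big]_{x=0} - 2\,\frac{(Re)^n\, \lambda_{n+1}}{(2t)^{n+1}}. \tag{4.22}$$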
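The moment-to-cumulant relations (4.21) can be generated numerically without enumerating partitions. The sketch below uses the standard recursion $\xi_n = \sum_{j=1}^{n} \binom{n-1}{j-1} \lambda_j\, \xi_{n-j}$ (with $\xi_0 = 1$) rather than the explicit sum in (4.21); this recursion is a swapped-in technique, not the paper's formula, and the function name is illustrative.

```python
from math import comb

def cumulants_from_moments(moments):
    """Convert raw moments [xi_1, ..., xi_n] into cumulants [lambda_1, ..., lambda_n].

    Uses the standard recursion
        lambda_k = xi_k - sum_{j=1}^{k-1} C(k-1, j-1) * lambda_j * xi_{k-j},
    an alternative to the explicit partition sum of Eq. (4.21).
    """
    cumulants = []
    for k in range(1, len(moments) + 1):
        value = moments[k - 1]
        for j in range(1, k):
            value -= comb(k - 1, j - 1) * cumulants[j - 1] * moments[k - j - 1]
        cumulants.append(value)
    return cumulants

# Check against the closed form lambda_3 = xi_3 - 3*xi_2*xi_1 + 2*xi_1**3.
xi = [0.5, 1.3, 2.0]
lam = cumulants_from_moments(xi)
lam3_closed_form = xi[2] - 3 * xi[1] * xi[0] + 2 * xi[0] ** 3

# Unit-rate exponential distribution: moments n!, cumulants (n-1)!.
exp_cumulants = cumulants_from_moments([1, 2, 6, 24])
```

With such cumulants in hand, Eq. (4.22) expresses the $n$th velocity derivative at $x = 0$ through $\lambda_{n+1}$ alone.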

Therefore, the problem of finding a bound on the $n$th derivative of $u(x,t)$ at $x = 0$ reduces to the problem of finding a bound on $\lambda_{n+1}$.³ To achieve this bound on $\lambda_{n+1}$, we note that for any sequence $\{(\pi_i, p_i)\}_i$ where $\sum_i \pi_i p_i = n + 1$,

$$\prod_i (\xi_{p_i})^{\pi_i} \le \prod_i E\big[|Y|^{p_i}\big]^{\pi_i} \le \prod_i E\big[|Y|^{\pi_i p_i}\big] \le E\big[|Y|^{n+1}\big]. \tag{4.23}$$

Therefore,

$$|\lambda_{n+1}| \le C_n^*\, E\big[|Y|^{n+1}\big], \tag{4.24}$$

where

$$C_n^* = (n+1)! \sum_{m=1}^{n+1} \frac{1}{m} \sum \Big(\frac{1}{p_1!}\Big)^{\pi_1} \Big(\frac{1}{p_2!}\Big)^{\pi_2} \cdots\ \frac{m!}{\pi_1!\, \pi_2! \cdots}.$$

We have, thus, further reduced the problem to finding the appropriate upper bounds on the moments of $|Y|$. For this, we use the properties of the paths $G(\cdot) \in F$ as we did in the proof of the velocity-tail upper bound. With $\Phi(y, t)$ defined as before,

$$E\big[|Y|^{n+1}\big] \le L^{n+1} + E\big[|Y|^{n+1}\, 1_{[|Y| > L]}\big] \tag{4.25}$$

$$\le L^{n+1} + \frac{\int_{-\infty}^{-L} |y|^{n+1} e^{-Re\,\Phi(y,t)/2}\,dy + \int_{L}^{\infty} |y|^{n+1} e^{-Re\,\Phi(y,t)/2}\,dy}{\int_{-\infty}^{\infty} e^{-Re\,\Phi(y,t)/2}\,dy} \tag{4.26}$$

$$\le L^{n+1} + \frac{\int_{L}^{\infty} y^{n+1}\, e^{-Re\, y^2/8t}\,dy}{\int_{0}^{2/L^2} e^{-Re\, L^{2 - \alpha \vee \varepsilon}/2t}\,dy} \tag{4.27}$$

$$\le L^{n+1} \cdot \Big[1 + \frac{CtL}{Re}\, e^{-\frac{Re}{8t}\left(L^2 - 4L^{2 - \alpha \vee \varepsilon}\right)}\Big] \tag{4.28}$$

$$\le 2L^{n+1}, \quad \forall L \gg \frac{t}{Re}. \tag{4.29}$$

This inequality combined with Eq. (4.24) and our upper bound on the measure of $F^c$ proves that for $\theta = 4C_n^*\,(Re)^n (L/2t)^{n+1} \gg (Re)^n t^{-n-1}$ and $(Re)^{-1}$,

$$\mathrm{Prob}\Big\{\Big|\frac{\partial^n u(0,t)}{\partial x^n}\Big| > \theta\Big\} \le \exp\big\{-c_n\,(Re)^{-p}\, t^{q}\, \theta^{r}\big\}, \tag{4.30}$$

which completes the upper bound proof for the PDF tails of the velocity and its derivatives.

³ We mention in passing that Eq. (4.22) holds for all $x$ where the cumulants are taken with respect to the normalized measure $P(y) = \frac{1}{C}\, e^{-Re\,(B_h(y) + (y - x)^2/2t)/2}$.

5. Proof of Theorem 2

The lower bounds on these tail events are a good deal harder to obtain than the upper bounds. One must find a set of paths $E$ with the correct probability such that $\forall G(\cdot) \in E$, there exist, for some $x$, two global minima, $y_1$ and $y_2$, for the function $\Phi(x, y, t)$. Then one must show that this gives us large velocities and velocity derivatives in the vicinity of $x$ (see Fig. 1). Complicating this further is the fact that one must alter the set $E$, depending on whether or not the potential process is stationary. For this reason, we will first prove Theorem 2 for $\alpha \in (0, 2)$. Then we will show how, altering the set $E$, we can establish this result for $\alpha \in (-2, 0)$.

Proof for $\alpha \in (0, 2)$. For clarity we shall split the proof into two parts. In the first part, we show that for every path $G(\cdot)$ in a certain set $E$, there exist large velocity derivative values at particular points in the neighborhood of zero. The points will depend on the order of the derivative and how large the "shock" near zero is, but not on the particular path chosen from the set $E$. In the second part, we shall find a lower bound on the probability of the set $E$ and show that it gives us the bound that we want for all derivative orders. Let the set $E$ be defined as follows, with $\Phi(y, t) = G(y) + y^2/2t$:

$$\begin{aligned}
E &= A \cap B_1 \cap B_2, \\
A &= \{G(\cdot) : |G(y) + \zeta(y)| \le \epsilon,\ \forall y \in (-L, 3L)\}, \\
B_1 &= \Big\{G(\cdot) : \Phi(y, t) \ge \tfrac{1}{4t}\, y^2,\ \forall y \le -L\Big\}, \\
B_2 &= \Big\{G(\cdot) : \Phi(y, t) \ge \tfrac{1}{4t}\,(y - 2L)^2,\ \forall y \ge 3L\Big\},
\end{aligned} \tag{5.1}$$

where

$$\zeta(y) = \begin{cases} 0 & \text{for } y < L, \\ \frac{2L}{t}(y - L) & \text{for } y \in [L, 3L], \\ \frac{4L^2}{t} & \text{for } y > 3L. \end{cases} \tag{5.2}$$

[Fig. 1. An illustration of the double-minimum structure of $B(y) + \frac{1}{2t}\, y^2$ for $B(\cdot) \in E$.]

For all paths $G(\cdot) \in E$, $\Phi(y, t)$ has a double-welled structure, with one minimum near $y = 0$ and the other near $y = 2L$ (see Fig. 1). For $y \in (-L, L)$, $\Phi(y, t) = \frac{1}{2t}\, y^2 + O(\epsilon)$, and for $y \in [L, 3L)$, $\Phi(y, t) = \frac{1}{2t}(y - 2L)^2 + O(\epsilon)$. With this definition of $E$, we make the following two claims, which encapsulate the proof.

Claim 5.1. Let $u(x,t)$ be the solution to the Burgers equation ($Re < \infty$) with the initial potential path $G(\cdot) \in E$. Then $\forall k = 1, 2, \ldots$, $\exists\, x_{k,L}$ such that

$$\Big|\frac{\partial^k u(x_{k,L}, t)}{\partial x^k}\Big| \ge C_k\,(Re)^k \Big(\frac{L}{2t}\Big)^{k+1}, \tag{5.3}$$

where $x_{k,L}$ and $C_k$ are independent of the path $G(\cdot)$.

Claim 5.2. With the set $E$ defined as above and the process $G(y)$ parameterized by $\alpha \in (0, 2)$, $\exists\, C$ such that

$$P(E) \ge \exp\big\{-C L^{4 - \alpha}\, t^{-2}\big\}. \tag{5.4}$$

Given Claims 5.1 and 5.2, the proof of Theorem 2 for $\alpha \in (0, 2)$ is immediate.

Proof of Claim 5.1. Using the Hopf–Cole formula, we have

$$\frac{\partial^n u(x,t)}{\partial x^n} = -\frac{2}{Re}\,\frac{\partial^{n+1}}{\partial x^{n+1}} \log\Big\{\int_{-\infty}^{\infty} e^{-Re\,(G(y) + (y - x)^2/2t)/2}\,dy\Big\}. \tag{5.5}$$

The first step is to control the contributions to $\partial^k u(x,t)/\partial x^k$, for $x$ near zero, from $y < -L$ and $y > 3L$. We rearrange the above equation to yield

$$\begin{aligned}
\frac{\partial^n u(x,t)}{\partial x^n}
&= -\frac{2}{Re}\,\frac{\partial^{n+1}}{\partial x^{n+1}} \log\Big\{\int_{-L}^{3L} e^{-Re\,(G(y) + (y - x)^2/2t)/2}\,dy\Big\} \\
&\quad - \frac{2}{Re}\,\frac{\partial^{n+1}}{\partial x^{n+1}} \log\Bigg\{1 + \frac{\int_{|y - L| > 2L} e^{-Re\,(G(y) + (y - x)^2/2t)/2}\,dy}{\int_{-L}^{3L} e^{-Re\,(G(y) + (y - x)^2/2t)/2}\,dy}\Bigg\}.
\end{aligned} \tag{5.6}$$

To estimate the second term in the latter equality, we employ the bounds for the paths in $E$. Straightforward calculations using these bounds give for $k \ge 0$ and $|x| < 1$,

$$\Big|\frac{\partial^k}{\partial x^k} \int_{|y - L| > 2L} e^{-Re\,(G(y) + (y - x)^2/2t)/2}\,dy\Big| \le C_{k1}\,\frac{(Re)^{k-1}}{t^{k-1}}\, L^{k-1} \exp\Big\{-\frac{Re}{4t}(L + x)^2\Big\},$$

$$\Big|\frac{\partial^k}{\partial x^k} \int_{-L}^{3L} e^{-Re\,(G(y) + (y - x)^2/2t)/2}\,dy\Big| \propto \frac{(Re)^{k-1}}{t^{k-1}}\, L^{k}.$$

Using the Taylor series expansion for $\log(1 + z)$ with $|z| < 1$ and elementary properties of derivatives, one has for $k \ge 0$,

$$\frac{2}{Re}\,\Bigg|\frac{\partial^{k+1}}{\partial x^{k+1}} \log\Bigg\{1 + \frac{\int_{|y - L| > 2L} e^{-Re\,(G(y) + (y - x)^2/2t)/2}\,dy}{\int_{-L}^{3L} e^{-Re\,(G(y) + (y - x)^2/2t)/2}\,dy}\Bigg\}\Bigg| \le C_k'\,\frac{(Re)^k}{t^k}\, L^k \exp\Big\{-\frac{Re}{4t}(L + x)^2\Big\}. \tag{5.7}$$

By taking $L$ large, we can obtain as much control as needed over the contributions from points outside the interval $(-L, 3L)$ to the value of the velocity and its derivatives in the neighborhood near $x = 0$. Therefore, we shall disregard these contributions and focus on those from the interval $(-L, 3L)$. On this interval, for any path in $E$, it is clear that

$$G(y) + \frac{1}{2t}(x - y)^2 = \frac{1}{2t}\, x^2 - xy/t + \psi(y) - \epsilon + 2\epsilon\,\phi(y), \tag{5.8}$$

where

$$\psi(y) = \begin{cases} y^2/2t, & \forall y \in (-L, L], \\ (y - 2L)^2/2t, & \forall y \in (L, 3L), \end{cases}$$

and $\phi(y)$ is some function whose values are contained in the interval $(0, 1)$. Now let us define two probability measures $\mu$ and $\nu$ on the interval $(-L, 3L)$,

$$\mu(dy) = \frac{1}{C} \exp\Big\{-Re\Big(\psi(y) - \frac{xy}{t}\Big)/2\Big\}\,dy, \tag{5.9}$$

$$\nu(dy) = \frac{\exp\{-Re\,(2\epsilon\,\phi(y))/2\}\,\mu(dy)}{\int_{-L}^{3L} \exp\{-Re\,(2\epsilon\,\phi(y))/2\}\,\mu(dy)}, \tag{5.10}$$

where $C$ is a normalizing constant such that $\mu[(-L, 3L)] = 1$. From previous analysis and specifically Eq. (4.22), one also has $\forall n \ge 0$,

$$\frac{\partial^n u(x,t)}{\partial x^n} = \frac{\partial^n}{\partial x^n}(x/t) - 2\,\frac{(Re)^n\, \lambda_{n+1}}{(2t)^{n+1}}, \tag{5.11}$$

where the cumulants $\lambda_n$ are taken with respect to the measure $\nu(dy)$. What we would actually like is for these cumulants to be taken with respect to the measure $\mu(dy)$, which does not depend on the particulars of the path $B_h(y)$. Therefore, let us estimate the difference between these quantities. The difference for the first cumulant $\lambda_1$, which is equal to the mean, is

$$|E_\mu[y] - E_\nu[y]| = \Big|E_\mu[y] - \frac{E_\mu\big[y\, e^{-Re\,\epsilon\,\phi(y)}\big]}{E_\mu\big[e^{-Re\,\epsilon\,\phi(y)}\big]}\Big| \tag{5.12}$$

$$= \Big|E_\mu\Big[y\Big(1 - \frac{e^{-Re\,\epsilon\,\phi(y)}}{E_\mu\big[e^{-Re\,\epsilon\,\phi(y)}\big]}\Big)\Big]\Big| \tag{5.13}$$

$$\le \big(e^{Re\,\epsilon} - 1\big)\, E_\mu[|y|]. \tag{5.14}$$

We get the inequality from the fact that $\phi(y) \in (0, 1)$, $\forall y \in (-L, 3L)$. Therefore, this difference is controlled by $E_\mu[|y|]$ and the value of $\epsilon$. Since there is nothing special about the function $f(y) = y$, the same inequality holds for all the moments of $y$. Since the cumulant of order $n$ is a polynomial combination of all the moments up to order $n$, it is clear that we also have control of these quantities, given by the formula

$$|\lambda_n^\mu - \lambda_n^\nu| \le C_n^*\,\big(e^{Re\,\epsilon} - 1\big)\, e^{Re\,\epsilon\,(n-1)}\, E_\mu[|y|^n], \tag{5.15}$$

where $C_n^*$ is given in Eq. (4.24). Since $\epsilon$ can be as small as we like and $E_\mu[|y|^n] \le (3L)^n$, we have sufficient control over these differences.

All that is left to do is to calculate the contribution to the Hopf–Cole formula that the interval $(-L, 3L)$ makes when $G(y) + \frac{1}{2t}\, y^2 = \psi(y)$. For $x = O(L^{-1})$, one has

$$\begin{aligned}
\log\Big\{\int_{-L}^{3L} \exp\Big\{-\frac{Re}{2}\Big[G(y) + \frac{1}{2t}(y - x)^2\Big]\Big\}\,dy\Big\}
&= \log\Big\{\int_{-L}^{L} e^{-Re\,(y - x)^2/4t}\,dy + e^{Re\,Lx/t}\int_{L}^{3L} e^{-Re\,(y - 2L - x)^2/4t}\,dy\Big\} \\
&= \log\big[1 + e^{Re\,Lx/t}\big] + \log\Big\{\int_{-L}^{L} e^{-Re\,(y - x)^2/4t}\,dy\Big\}.
\end{aligned} \tag{5.16}$$

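From the Hopf–Cole formula and the above result, we have that

$$u(x,t) = -\frac{L}{t} - \frac{L}{t}\,\tanh\Big(\frac{Re\,Lx}{2t}\Big) - \frac{2}{Re}\,\frac{\partial}{\partial x} \log\Big\{\int_{-L}^{L} e^{-Re\,(y - x)^2/4t}\,dy\Big\} + \text{Other Terms}. \tag{5.17}$$

An easy calculation gives that for $x = O(L^{-1})$,

$$\Big|\frac{\partial^k}{\partial x^k} \log\Big\{\int_{-L}^{L} e^{-Re\,(y - x)^2/4t}\,dy\Big\}\Big| \le C\,\Big(\frac{Re\,L}{2t}\Big)^k e^{-Re\,(L^2 + x^2)/4t}. \tag{5.18}$$

Now we put together all the estimates from Eqs. (5.7), (5.12), (5.15), and (5.18). Given that the initial potential path $G(\cdot)$ is in the set $E$, it is clear that for the Burgers equation with $Re < \infty$,

$$\begin{aligned}
\Big|\frac{\partial^k u(x,t)}{\partial x^k}\Big|
&\ge 2\,(Re)^k \Big(\frac{L}{2t}\Big)^{k+1} \Big|\frac{\partial^k}{\partial z^k} \tanh(z)\Big|_{z = Re\,Lx/2t} \\
&\quad - 2C_k^*\,(Re)^k \Big(\frac{3L}{2t}\Big)^{k+1} \big(e^{Re\,\epsilon} - 1\big)\, e^{Re\,\epsilon\, k} \\
&\quad - C_k''\,\Big(\frac{Re\,L}{t}\Big)^k \exp\Big\{-\frac{Re}{4t}(L + x)^2\Big\}.
\end{aligned} \tag{5.19}$$

For fixed $k \ge 0$ and fixed $L \gg 1$, we set $x_{k,L} = \frac{2t\, z_k}{Re\,L}$, where $z_k$ is the point at which the function $\big|\frac{\partial^k}{\partial z^k} \tanh(z)\big|$ attains its global maximum for $k \ne 0$, and for $k = 0$ we set $x_{0,L} = \frac{2t}{Re\,L}$. For fixed $k$, $\exists\, L_k, \epsilon_k$ such that if $L > L_k$ and $\epsilon \in (0, \epsilon_k)$, then $\forall G(\cdot) \in E$,

$$\Big|\frac{\partial^k u(x_{k,L}, t)}{\partial x^k}\Big| \ge C_k\,(Re)^k \Big(\frac{L}{2t}\Big)^{k+1}. \tag{5.20}$$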

This concludes the proof of Claim 5.1.
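The points $z_k$ used above, where $|\partial_z^k \tanh(z)|$ is largest, are easy to locate numerically. The sketch below is an illustration only (the function name, grid range, and grid resolution are assumptions of this example). It uses the fact that each derivative of $\tanh$ is a polynomial in $T = \tanh z$, since $dT/dz = 1 - T^2$.

```python
import numpy as np
from numpy.polynomial import Polynomial

def tanh_derivative_max(k, z_max=10.0, num=200001):
    """Locate the maximum of |d^k tanh(z)/dz^k| on a grid over [-z_max, z_max], k >= 1.

    Returns (z_k, maximum value). By symmetry -z_k is also a maximizer; the grid
    search simply returns the first one it finds.
    """
    if k < 1:
        raise ValueError("k must be at least 1 (k = 0 is handled separately in the text)")
    one_minus_T2 = Polynomial([1.0, 0.0, -1.0])   # dT/dz as a polynomial in T
    poly = Polynomial([0.0, 1.0])                 # tanh itself, i.e. T
    for _ in range(k):
        poly = poly.deriv() * one_minus_T2        # chain rule: d/dz = (1 - T^2) d/dT
    z = np.linspace(-z_max, z_max, num)
    values = np.abs(poly(np.tanh(z)))
    i = int(np.argmax(values))
    return z[i], values[i]

z1, m1 = tanh_derivative_max(1)   # sech^2(z), largest at z = 0
z2, m2 = tanh_derivative_max(2)   # |2 tanh(z) sech^2(z)|
```

For $k = 1$ the maximum is $1$ at $z = 0$, and for $k = 2$ it is $4/(3\sqrt{3})$ at $\tanh z = \pm 1/\sqrt{3}$, so the constants $C_k$ in (5.20) are governed by order-one numbers of this kind.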

Proof of Claim 5.2. To get a lower bound on the probability of $E$, we recall the spectral representation of our potential processes given by Eq. (2.4) and write

$$G(y) + \zeta(y) = \int_{-\infty}^{\infty} (\cos(yk) - 1)\,(E[k])^{1/2}\,d\big(Z_1(k) + g_c(k)\big) + \int_{-\infty}^{\infty} \sin(yk)\,(E[k])^{1/2}\,d\big(Z_2(k) + g_s(k)\big), \tag{5.21}$$

where $Z_1(\cdot)$ and $Z_2(\cdot)$ are standard independent Brownian motions and $g_c(k)$ and $g_s(k)$ are given by

$$\dot{g}_c(k) = \frac{L}{\pi t}\,\frac{\cos(3Lk) - \cos(Lk)}{k^2}\Big/(E[k])^{1/2}, \tag{5.22}$$

and

$$\dot{g}_s(k) = \frac{L}{\pi t}\,\frac{\sin(Lk) - \sin(3Lk)}{k^2}\Big/(E[k])^{1/2}, \tag{5.23}$$

$E[k]$ being the spectrum of $G$. It is a simple exercise in Fourier analysis to show that $g_c(k)$ and $g_s(k)$ satisfy Eq. (5.21), when $\zeta(y)$ is given by Eq. (5.2). Setting $W_1(k) = Z_1(k) + g_c(k)$ and $W_2(k) = Z_2(k) + g_s(k)$, we employ the Cameron–Martin–Girsanov formula to obtain a new measure on path space, for which $W_1(\cdot)$ and $W_2(\cdot)$ are standard independent Brownian motions without drift. Under this measure, $\tilde G(y) \equiv G(y) + \zeta(y)$ is now a mean-zero Gaussian process on $\mathbb{R}$, identical in law to $G(y)$ under the original measure. Using the Radon–Nikodym derivative provided by the Cameron–Martin formula and setting $Z = \int \dot g_c(k)\,dW_1(k) + \int \dot g_s(k)\,dW_2(k)$, we have

\[
P(E) = \exp\Big\{-\frac12\int\big[(\dot g_c(k))^2 + (\dot g_s(k))^2\big]\,dk\Big\} \times E\Big[\exp\{Z\};\ \sup_{y\in[-L,3L]}|\tilde G(y)| < \epsilon;\ B_1\cap B_2\Big].
\tag{5.24}
\]

The set $B_1$ is unchanged by the mapping of $G(\cdot)$ into $\tilde G(\cdot)$. The set $B_2$ now equals

\[
\Big\{\tilde G(\cdot) : \tilde G(y) > -\frac{L}{t}(y - 3L) - \frac{1}{4t}y^2,\ \forall y \ge 3L\Big\}.
\tag{5.25}
\]

The exponential term in Eq. (5.24) is equivalent, in the sense of large deviations, to the probability that $G(\cdot) = \zeta(\cdot)$. The term inside the exponential is the action of the path $\zeta$ with respect to the process $G$ and equals $\frac12\int|\hat\zeta(k)|^2/E[k]\,dk$, where $\hat\zeta(k)$ denotes the Fourier transform of $\zeta(y)$. Using Conditions 1 and 3 on the spectrum of $G$, we get

\[
\int\big[(\dot g_c(k))^2 + (\dot g_s(k))^2\big]\,dk \le C\,\frac{L^2t^{-2}}{\pi^2}\int_{-1}^{1}\frac{2[1-\cos(2Lk)]}{|k|^{3-\alpha}}\,dk + C\,\frac{L^2t^{-2}}{\pi^2}\int_{|k|>1}\frac{2[1-\cos(2Lk)]}{|k|^{3-\varepsilon_0}}\,dk
\tag{5.26}
\]

\[
\le C L^{4-\alpha}t^{-2} + C' L^2 t^{-2}.
\tag{5.27}
\]
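The passage from (5.22)–(5.23) to (5.26)–(5.27) rests on two elementary steps, sketched here for convenience. The first is the angle-difference identity; the second is the rescaling $\kappa = Lk$ in the low-frequency integral, which converges because $\alpha \in (0, 2)$. The high-frequency bound additionally assumes $\varepsilon_0 < 2$, which is an assumption about Condition 3 made only for this sketch.

```latex
\begin{aligned}
(\cos 3Lk-\cos Lk)^2+(\sin Lk-\sin 3Lk)^2
  &= 2-2\bigl(\cos 3Lk\cos Lk+\sin 3Lk\sin Lk\bigr)
   = 2\,[1-\cos(2Lk)],\\[4pt]
L^2\int_{-1}^{1}\frac{2[1-\cos(2Lk)]}{|k|^{3-\alpha}}\,dk
  &= L^{4-\alpha}\int_{-L}^{L}\frac{2[1-\cos(2\kappa)]}{|\kappa|^{3-\alpha}}\,d\kappa
   \le L^{4-\alpha}\int_{-\infty}^{\infty}\frac{2[1-\cos(2\kappa)]}{|\kappa|^{3-\alpha}}\,d\kappa<\infty,\\[4pt]
L^2\int_{|k|>1}\frac{2[1-\cos(2Lk)]}{|k|^{3-\varepsilon_0}}\,dk
  &\le 4L^2\int_{|k|>1}\frac{dk}{|k|^{3-\varepsilon_0}}
   = \frac{8L^2}{2-\varepsilon_0}.
\end{aligned}
```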


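To get a lower bound on the second term in Eq. (5.24), we first note that $B_1' \subset B_1$ and $B_2' \subset B_2$, where

\[
B_1' = \Big\{\tilde G(\cdot) : |\tilde G(y)| < \frac{1}{4t}y^2,\ \forall y < -L\Big\},
\tag{5.28}
\]

\[
B_2' = \Big\{\tilde G(\cdot) : |\tilde G(y)| < \frac{1}{4t}y^2,\ \forall y > 3L\Big\}.
\tag{5.29}
\]

Therefore,

\[
E\Big[\exp\{Z\};\ \sup_{y\in[-L,3L]}|\tilde G(y)| < \epsilon;\ B_1\cap B_2\Big] \ge \exp\big\{-L^{2-\alpha/2}t^{-1}\big\}\,P\Big[|Z| < L^{2-\alpha/2}t^{-1};\ \sup_{y\in[-L,3L]}|\tilde G(y)| < \epsilon;\ B_1'\cap B_2'\Big].
\tag{5.30}
\]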

In order to obtain a sufficiently tight lower bound on the second term in line (5.30), we need to make use of two technical lemmas. The first is due to Z. Šidák [16], and the second is due to Monrad and Rootzén [12].

Lemma 5.1 (Šidák). Given any continuous mean-zero Gaussian process $G(y)$, $y \in \mathbb{R}$, and a correlated mean-zero Gaussian random variable $Z$, then for any piecewise continuous, strictly-positive function $\beta(y)$, any constant $\eta > 0$, and any Borel set $I \subset \mathbb{R}$,

\[
P\Big(\sup_{y\in I}|G(y)| < \beta(y),\ |Z| < \eta\Big) \ge P\Big(\sup_{y\in I}|G(y)| < \beta(y)\Big)\cdot P(|Z| < \eta).
\]

Lemma 5.2 (Monrad and Rootzén). Let $G(y)$ be a Gaussian process with $G(0) = 0$ and assume that for some $\alpha \in (0, 2)$,

\[
E\big(G(y) - G(x)\big)^2 \le C|x - y|^\alpha \quad \forall x, y \in \mathbb{R}.
\tag{5.31}
\]

Then $\forall \epsilon, t > 0$, $\exists C'$, depending only on $\epsilon$, $\alpha$ and the constant $C$ in Eq. (5.31), such that

\[
P\Big(\sup_{y\in[0,t]}|G(y)| < \epsilon\Big) \ge \exp\{-C' t\}.
\]
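Lemma 5.1 can be illustrated in its simplest instance, with the process replaced by a single standard normal variable $X$ correlated with a standard normal $Z$. The sketch below is only a finite-dimensional illustration under that assumption, not the lemma itself; it evaluates the joint probability by conditioning on $X$ and compares it with the product of the marginals. All function names are illustrative.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def joint_prob(a, b, rho):
    """P(|X| < a, |Z| < b) for standard normals with correlation rho, |rho| < 1.

    Uses the fact that Z given X = x is normal with mean rho*x
    and variance 1 - rho^2.
    """
    s = math.sqrt(1 - rho * rho)
    inner = lambda x: phi(x) * (Phi((b - rho * x) / s) - Phi((-b - rho * x) / s))
    return simpson(inner, -a, a)

def product_prob(a, b):
    """P(|X| < a) * P(|Z| < b), the right-hand side of the Sidak bound."""
    return (2 * Phi(a) - 1) * (2 * Phi(b) - 1)
```

For uncorrelated variables the two quantities coincide; for any nonzero correlation the joint probability should be at least the product.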

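Applying these lemmas to the second term in line (5.30), we get

\[
\begin{aligned}
P\Big[|Z| < L^{2-\alpha/2}t^{-1};\ \sup_{y\in[-L,3L]}|\tilde G(y)| < \epsilon;\ B_1'\cap B_2'\Big]
&\ge P\big[|Z| < L^{2-\alpha/2}t^{-1}\big]\,P\Big[\sup_{y\in[-L,3L]}|\tilde G(y)| < \epsilon;\ B_1'\cap B_2'\Big] && (5.32)\\
&\ge C\Big(P\Big[\sup_{y\in[-L,3L]}|\tilde G(y)| < \epsilon\Big] - P\big((B_1'\cap B_2')^c\big)\Big) && (5.33)\\
&\ge C\Big(\exp\{-CL\} - \big[P\big((B_1')^c\big) + P\big((B_2')^c\big)\big]\Big) && (5.34)\\
&\ge \exp\{-CL\}. && (5.35)
\end{aligned}
\]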


Line (5.35) uses Corollary 4.1, and line (5.33) above uses the fact that $Z$ is a mean-zero Gaussian r.v. with variance $\le CL^{4-\alpha}t^{-2}$, for some $C > 0$. These estimates combined with Eqs. (5.24), (5.26), and (5.30) prove that

\[
P(E) \ge \exp\big\{-CL^{4-\alpha}t^{-2}\big\}, \quad \forall L \gg t,
\tag{5.36}
\]

for some $C > 0$ whose value depends on the spectrum of $G$. This completes the proof of Claim 5.2 and, thus, the proof of Theorem 2 for $\alpha \in (0, 2)$.

Proof of Theorem 2 for $\alpha \in (-2, 0)$. To establish the PDF lower bounds for this class of Gaussian potentials, we must alter the set $E$. We will again look at a set of paths which are contained in a tube of width $2\epsilon$ in the interval $[-L, 3L]$, but the tube must follow a different path $\zeta(y)$. For $\alpha \in (-2, 0)$, $G(y)$ is a stationary process, and so a “stationary” path $\zeta$ must be chosen. We define $E$ as follows, with $\Phi(y,t) = G(y) + y^2/2t$: $E = A \cap B_1 \cap B_2$, with

\[
A = \{G(\cdot) : |G(y) + \zeta(y)| \le \epsilon,\ \forall y \in (-L, 3L)\},
\tag{5.37}
\]

\[
B_1 = \Big\{G(\cdot) : \Phi(y,t) \ge \frac{1}{4t}y^2,\ \forall y \le -L\Big\}, \qquad
B_2 = \Big\{G(\cdot) : \Phi(y,t) \ge \frac{1}{4t}(y-2L)^2,\ \forall y \ge 3L\Big\},
\tag{5.38}
\]

where

\[
\zeta(y) =
\begin{cases}
0 & \text{for } y < 2L - 1,\\[2pt]
\dfrac{2L^2}{t}\big[1 - (y-2L)^2\big] & \text{for } y \in [2L-1, 2L+1],\\[2pt]
0 & \text{for } y > 2L + 1.
\end{cases}
\tag{5.39}
\]
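The shape of the second well can be made concrete with a small numerical sketch. It assumes that $G$ sits on the centre line of the tube defined by the set $A$, i.e. $G(y) = -\zeta(y)$, and compares $G(y) + y^2/2t$ on $[2L-1, 2L+1]$ with the shifted parabola $\frac{1+4L^2}{2t}(y-2L)^2 + \frac{2L}{t}(y-2L)$. The function names are illustrative.

```python
def zeta(y, L, t):
    """Path (5.39): a parabolic bump of height 2L^2/t supported on [2L-1, 2L+1]."""
    s = y - 2 * L
    return 2 * L * L / t * (1 - s * s) if abs(s) <= 1 else 0.0

def potential_on_centre_line(y, L, t):
    """G(y) + y^2/2t with G = -zeta (assumed centre line of the tube)."""
    return -zeta(y, L, t) + y * y / (2 * t)

def second_well(y, L, t):
    """Shifted parabola describing the narrow well near y = 2L."""
    s = y - 2 * L
    return (1 + 4 * L * L) / (2 * t) * s * s + 2 * L / t * s

def second_well_minimiser(L):
    """Location of the minimum of second_well, slightly to the left of 2L."""
    return 2 * L - 2 * L / (1 + 4 * L * L)
```

On the support of the bump the two expressions agree exactly, and the curvature $(1+4L^2)/t$ shows why this well is much narrower than the one near $y = 0$, whose curvature is $1/t$.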

With this new $\zeta$, $\Phi(y,t)$ still has a double-welled structure, with one minimum near $y = 0$ and the other near $y = 2L$. However, the second well is now narrower and steeper, with $\Phi(y,t) = \frac{1+4L^2}{2t}(y-2L)^2 + \frac{2L}{t}(y-2L) + O(\epsilon)$. Once again, there are two things one needs to prove: the lower bound estimate on the velocity derivatives near $x = 0$ and the lower bound estimate on the probability of $E$. To establish the former, we note that the bounds on $G(y)$ given by sets $B_1$ and $B_2$ lead to the same lower bounds on the velocity and velocity derivative near $x = 0$ given by Eqs. (5.6) and (5.7), i.e.

\[
\Big|\frac{\partial^n u(x,t)}{\partial x^n}\Big| \ge \Bigg|-\frac{2}{\mathrm{Re}}\frac{\partial^{n+1}}{\partial x^{n+1}}\log\Bigg\{\int_{-L}^{3L} e^{-\mathrm{Re}(G(y)+(y-x)^2/2t)/2}\,dy\Bigg\}\Bigg| - C_n'\,\frac{(\mathrm{Re})^n}{t^n}L^n\exp\Big\{-\frac{\mathrm{Re}}{4t}(L+x)^2\Big\}.
\tag{5.40}
\]

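Then, using the same reasoning employed in the proof for $\alpha \in (0, 2)$, we can replace $G(y)$ with $\zeta(y)$ in the above equation to get the inequality

\[
\Big|\frac{\partial^n u(x,t)}{\partial x^n}\Big| \ge \Bigg|-\frac{2}{\mathrm{Re}}\frac{\partial^{n+1}}{\partial x^{n+1}}\log\Bigg\{\int_{-L}^{3L} e^{-\mathrm{Re}(\zeta(y)+(y-x)^2/2t)/2}\,dy\Bigg\}\Bigg| - C_n''\big(e^{\mathrm{Re}\,\epsilon}-1\big)e^{\mathrm{Re}\,\epsilon(n-1)}\,\frac{(\mathrm{Re})^n}{t^n}L^n.
\tag{5.41}
\]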

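Finally, we have

\[
\begin{aligned}
\int_{-L}^{3L} e^{-\mathrm{Re}(\zeta(y)+(y-x)^2/2t)/2}\,dy
&= \int_{-L}^{3L} e^{-\mathrm{Re}(y-x)^2/4t}\,dy - \int_{2L-1}^{2L+1} e^{-\mathrm{Re}(y-x)^2/4t}\,dy \\
&\quad + \int_{2L-1}^{2L+1}\exp\Big\{-\frac{\mathrm{Re}}{2}\Big[\frac{1}{2t}(y-x)^2 - \frac{2L^2}{t}\big(1-(y-2L)^2\big)\Big]\Big\}\,dy.
\end{aligned}
\tag{5.42}
\]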

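Setting $z = \sqrt{1+4L^2}\,(y-2L)$ in the third term above, the left side of the above equation equals

\[
\sqrt{\frac{4t\pi}{\mathrm{Re}}}\,\Big\{1 - I_1(x) + \exp\big\{A + Bx + Dx^2\big\}\,[1 - I_2(x)]\Big\},
\tag{5.43}
\]

where

\[
I_1(x) = \sqrt{\frac{\mathrm{Re}}{4t\pi}}\int_{y\notin[-L,2L-1)\cup(2L+1,3L]}\exp\Big\{-\frac{\mathrm{Re}}{4t}(y-x)^2\Big\}\,dy,
\]

\[
I_2(x) = \sqrt{\frac{\mathrm{Re}}{4t\pi}}\int_{z\notin[-\sqrt{1+4L^2},\,\sqrt{1+4L^2}]}\exp\Big\{-\frac{\mathrm{Re}}{4t}\Big(z - \frac{2L-x}{\sqrt{1+4L^2}}\Big)^2\Big\}\,dz,
\]

and

\[
A = \frac{\mathrm{Re}}{t(4+L^{-2})} - \frac12\log(1+4L^2), \qquad
B = \frac{\mathrm{Re}L}{t}\Big(1 - \frac{1}{1+4L^2}\Big), \qquad
D = -\frac{\mathrm{Re}}{4t}\Big(1 - \frac{1}{1+4L^2}\Big).
\]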
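The constants $A$, $B$ and $D$ come from completing the square in $y$ in the third term of (5.42). A quick numerical sanity check is sketched below. It makes one simplifying assumption that is not in the text: the truncation to $[2L-1, 2L+1]$ is dropped and the integrand is integrated over a wide window, which corresponds to setting $I_2 = 0$, so that the integral should equal $\sqrt{4t\pi/\mathrm{Re}}\,e^{A+Bx+Dx^2}$. Function names and parameter values are illustrative.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def constants(L, t, Re):
    """A, B, D as given after Eq. (5.43)."""
    q = 1 - 1 / (1 + 4 * L * L)
    A = Re / (t * (4 + L ** -2)) - 0.5 * math.log(1 + 4 * L * L)
    B = Re * L / t * q
    D = -Re / (4 * t) * q
    return A, B, D

def untruncated_integral(x, L, t, Re):
    """Third integrand of (5.42), integrated over a wide window around y = 2L."""
    def integrand(y):
        s = y - 2 * L
        bracket = (y - x) ** 2 / (2 * t) - 2 * L * L / t * (1 - s * s)
        return math.exp(-0.5 * Re * bracket)
    return simpson(integrand, 2 * L - 12, 2 * L + 12, 24000)

def closed_form(x, L, t, Re):
    """sqrt(4 t pi / Re) * exp(A + B x + D x^2)."""
    A, B, D = constants(L, t, Re)
    return math.sqrt(4 * math.pi * t / Re) * math.exp(A + B * x + D * x * x)
```

The term $-\tfrac12\log(1+4L^2)$ in $A$ is the Jacobian factor $1/\sqrt{1+4L^2}$ of the substitution, absorbed into the exponent.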

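Therefore,

\[
\begin{aligned}
-\frac{2}{\mathrm{Re}}\frac{\partial^{n+1}}{\partial x^{n+1}}\log\Bigg\{\int_{-L}^{3L} e^{-\mathrm{Re}(\zeta(y)+(y-x)^2/2t)/2}\,dy\Bigg\}
&= \Big(1 - \frac{1}{1+4L^2}\Big)\frac{\partial^n}{\partial x^n}\Big[\frac{x/2 - L}{t}\,\big(1 + \tanh\big[A + Bx + Dx^2\big]\big)\Big] \\
&\quad - \frac{2}{\mathrm{Re}}\frac{\partial^{n+1}}{\partial x^{n+1}}\log\Big[1 - \frac{I_1(x) + I_2(x)}{1 + e^{A+Bx+Dx^2}}\Big].
\end{aligned}
\tag{5.44}
\]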


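Straightforward calculations yield that for $i = 1, 2$, $n = 0, 1, 2, \ldots$, and $|x| < 1$,

\[
\Big|\frac{\partial^{n+1}}{\partial x^{n+1}} I_i(x)\Big| \le C_n\Big(\frac{\mathrm{Re}}{t}\Big)^{n+1} L^n \exp\Big\{-\frac{\mathrm{Re}}{4t}(L - x)^2\Big\}.
\]

These bounds and elementary calculus lead to the following inequalities, with $n \in \mathbb{N}$,

\[
\Bigg|\frac{2}{\mathrm{Re}}\frac{\partial^{n+1}}{\partial x^{n+1}}\log\Big[1 - \frac{I_1(x) + I_2(x)}{1 + e^{A+Bx+Dx^2}}\Big]\Bigg| \le C_n'\,\frac{(\mathrm{Re})^n}{t^{n+1}}L^{n+1}\exp\Big\{-\frac{\mathrm{Re}}{4t}(L - x)^2\Big\}.
\tag{5.45}
\]

Using Eqs. (5.41), (5.44), and (5.45) and the fact that $|\partial_x^n \tanh(A + Bx + Dx^2)| \ge C_n''(\mathrm{Re}L/t)^n$ for some $x = O(L^{-1}\log L)$ and some $C_n'' > 0$, one has that $\forall n = 0, 1, 2, \ldots$, $\exists L_n \gg t/\mathrm{Re}$ and $\epsilon_n > 0$ such that $\forall L > L_n$ and $\forall \epsilon \in (0, \epsilon_n)$, $\exists x_{n,L}$ such that

\[
\Big|\frac{\partial^n u(x_{n,L},t)}{\partial x^n}\Big| \ge C\,\frac{(\mathrm{Re})^{n-1}}{t^n}L^n
\tag{5.46}
\]

for all paths $G(\cdot) \in E$. With this established, we need only find the correct lower bound on the probability of the set $E$. The method we use is identical to that used in the proof of Theorem 2 for $\alpha \in (0, 2)$. Starting as we did in that proof, we note that

\[
G(y) + \zeta(y) = \int_{-\infty}^{\infty}(\cos(yk) - 1)\,(E[k])^{1/2}\,d(Z_1(k) + g_c(k)) + \int_{-\infty}^{\infty}\sin(yk)\,(E[k])^{1/2}\,d(Z_2(k) + g_s(k)),
\tag{5.47}
\]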

where $Z_1(\cdot)$ and $Z_2(\cdot)$ are again standard independent Brownian motions and $g_c(k)$ and $g_s(k)$ are given by

\[
\dot g_c(k) = \frac{4L^2}{\pi t\,(E[k])^{1/2}}\,\frac{k^{-1}\sin(k) - \cos(k)}{k^2}\,\cos(2Lk),
\tag{5.48}
\]

and

\[
\dot g_s(k) = -\frac{4L^2}{\pi t\,(E[k])^{1/2}}\,\frac{k^{-1}\sin(k) - \cos(k)}{k^2}\,\sin(2Lk),
\tag{5.49}
\]

$E[k]$ being the spectrum of $G$. Using the Cameron–Martin–Girsanov formula to change to a measure in which $W_1(k) = Z_1(k) + g_c(k)$ and $W_2(k) = Z_2(k) + g_s(k)$ are mean-zero Brownian motions, we arrive at the equality

\[
\mathrm{Prob}\{E\} = \exp\Big\{-\frac12\int\big[(\dot g_c(k))^2 + (\dot g_s(k))^2\big]\,dk\Big\} \times E\Big[\exp\{Z\};\ \sup_{y\in[-L,3L]}|\tilde G(y)| < \epsilon;\ B_1\cap B_2\Big],
\tag{5.50}
\]


where $Z = \int \dot g_c(k)\,dW_1(k) + \int \dot g_s(k)\,dW_2(k)$ and, under the new measure, $\tilde G(\cdot)$ is equivalent in law to $G(\cdot)$ under the original measure. To bound the first term on the right-hand side of the above equation, we note that for energy spectra that satisfy Conditions 1 and 3 with $\alpha \in (-2, 0)$,

\[
\int\big[(\dot g_c(k))^2 + (\dot g_s(k))^2\big]\,dk \le CL^4t^{-2}.
\tag{5.51}
\]

To find the appropriate lower bound on the second term in Eq. (5.50), we simply refer the reader to the proof of the lower bound on the probability of the set $E$ for potentials parameterized by $\alpha \in (0, 2)$. The steps are identical and lead to the inequality

\[
E\Big[\exp\{Z\};\ \sup_{y\in[-L,3L]}|\tilde G(y)| < \epsilon;\ B_1\cap B_2\Big] \ge \exp\{-CL\},
\tag{5.52}
\]

for some $C > 0$, independent of $L$. Thus,

\[
P(E) \ge \exp\big\{-CL^4t^{-2}\big\}.
\tag{5.53}
\]

This completes the proof of Theorem 2.
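The factor $(k^{-1}\sin k - \cos k)/k^2$ in Eqs. (5.48)–(5.49) is, up to a constant, the Fourier cosine transform of the parabolic bump $1 - s^2$ on $[-1, 1]$, namely $\int_{-1}^{1}(1-s^2)\cos(ks)\,ds = 4(k^{-1}\sin k - \cos k)/k^2$. A minimal numerical check of this closed form is sketched below; the function names are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def bump_transform_numeric(k):
    """Quadrature value of the cosine transform of 1 - s^2 on [-1, 1]."""
    return simpson(lambda s: (1 - s * s) * math.cos(k * s), -1.0, 1.0)

def bump_transform_closed(k):
    """Closed form 4 (sin(k)/k - cos(k)) / k^2, valid for k != 0."""
    return 4 * (math.sin(k) / k - math.cos(k)) / (k * k)
```

Multiplying by the height $2L^2/t$ of the bump in (5.39) and by the normalisation $1/2\pi$ of the inverse transform gives the prefactor $4L^2/\pi t$, while the shift of the bump to $y = 2L$ produces the factors $\cos(2Lk)$ and $\sin(2Lk)$.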

6. Discussion and Extensions

As we mentioned above, lower bounds on the velocity and velocity derivative PDFs for BT systems, in which the infrared exponent $\alpha$ of the initial potential spectrum is $\le -2$, are also attainable. The key to these lower-bound proofs is finding an appropriate path $\zeta(y)$ for the initial potential $G(y)$ to follow. The path $\zeta$ must induce large values for $u(x,t)$ and $\partial_x^n u(x,t)$, $n = 1, 2, \ldots$, and have finite action with respect to $G$, i.e. $\int|\hat\zeta(k)|^2/E[k]\,dk < \infty$. For processes $G$ with $\alpha \in (0, 2)$, $|\hat\zeta(k)|$ had to be $\le Ck^{-2}$ for small $k$ to ensure finite action. Therefore, the function $\zeta(y)$ was chosen so that its derivative had no infrared singularity in its Fourier transform, i.e. that its derivative was, in some sense, “stationary”. This makes intuitive sense in that the derivative of $G(y)$ with $\alpha \in (0, 2)$ is stationary. With $\alpha \in (-2, 0)$, we had to choose a different path $\zeta$. For the action of $\zeta$ to be finite given this class of potentials, $\hat\zeta(k)$ had to be no larger than $O(1)$ for $k$ near 0. In other words, the path chosen had to be stationary because the potentials were now stationary.

Looking at BT systems for which the initial potential $G(y)$ is parameterized by $\alpha \in (-4, -2]$, it is clear that $\hat\zeta(k)$ must head to zero at least as fast as $O(k)$ as $k \to 0$ if the action of $\zeta$ is to be finite. This implies that the integral of $\zeta(y)$, $\int^y \zeta(x)\,dx$, in addition to $\zeta$ itself, must be a stationary path with respect to $y$. This requirement on $\zeta$ can also be seen as a consequence of the fact that the integral process of $G(y)$ is stationary as well in this regime. Generalizing this, we see that for BT systems with $\alpha \in (-2n-2, -2n]$, $\hat\zeta(k)$ must be at least $O(k^n)$ as $k \to 0$, implying that the $n$th-order integral of $\zeta(y)$ must be stationary.

Thus, extending Theorem 2 to cover systems for which $\alpha \le -2$ is simply a matter of finding a set of paths that has the correct action and induces the appropriate double-minimum structure for the solution to the viscous Burgers equation. Such a set of paths is indicated in the 1995 paper [4] by Avellaneda, Ryan and E.

Theorem 2 can also be strengthened by weakening Condition 3 on the initial-potential spectrum. This condition can be replaced by the condition that $E[k] \ge Ce^{-Ak^2}$, for some $A, C > 0$. To extend Theorem 2 in this manner, one needs to tackle the possibility of a strong ultra-violet singularity in the action of the path $\zeta$, $\int|\hat\zeta(k)|^2/E[k]\,dk$, due to the rapid decay of the initial spectrum as $k \to \infty$. By convolving the original path $\zeta$ with an appropriate heat kernel, one obtains a new path $\zeta'(y)$ that still has the double-minimum structure, which induces large velocity derivatives of all orders, and has finite action with respect to $G$, since $\hat\zeta'(k) = \hat\zeta(k)e^{-\frac{A'}{2}k^2}$ for some $A' > A$.

Finally, we note that the methods used in this paper can be applied to other statistical problems associated with BT systems. The combination of hands-on analysis of the Hopf–Cole formula and rigorous probabilistic estimates is ideal for the study of the two-point PDFs for the solution to BT models and other higher order statistics. By studying the Hopf–Cole formula, one can determine the sets of initial potential paths which correspond to certain events for the velocity $u(x,t)$ and its derivatives. Then, rigorous probabilistic tools can yield the necessary estimates on the BT events. This method is also appropriate for multi-dimensional BT problems. Via this analysis, the authors have obtained results, analogous to Theorems 1 and 2, for 2-D and 3-D Burgers turbulence that will appear elsewhere.
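The effect of the heat-kernel smoothing on the ultra-violet part of the action can be written out in one line. The following is a sketch under the stated lower bound $E[k] \ge Ce^{-Ak^2}$, with the additional assumption (made only here) that $\hat\zeta$ is bounded for $|k| > 1$:

```latex
\int_{|k|>1}\frac{|\hat\zeta'(k)|^2}{E[k]}\,dk
  = \int_{|k|>1}\frac{|\hat\zeta(k)|^2\,e^{-A'k^2}}{E[k]}\,dk
  \le \frac{1}{C}\int_{|k|>1}|\hat\zeta(k)|^2\,e^{-(A'-A)k^2}\,dk
  < \infty \qquad (A' > A).
```

The infrared part of the action is unaffected, since $e^{-A'k^2/2} \le 1$ for all $k$.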

References

1. Adler, R.J.: An Introduction to Continuity, Extrema and Related Topics for General Gaussian Processes. Inst. Math. Statist. Lecture Notes-Monograph Series, Vol. 12, 1990
2. Avellaneda, M.: PDFs for velocity and velocity gradients in Burgers Turbulence. Commun. Math. Phys. 169, 134 (1995)
3. Avellaneda, M. and E, W.: Statistical properties of shocks in Burgers Turbulence. Commun. Math. Phys. 172, 13 (1995)
4. Avellaneda, M., Ryan, R. and E, W.: Statistical properties of shocks in Burgers Turbulence. Phys. Fluids 7, 3067–71 (1995)
5. Burgers, J.M.: A mathematical model illustrating the theory of turbulence. In: Advances in Applied Mathematics, London–New York: Academic Press, 1948
6. Burgers, J.M.: The nonlinear diffusion equation. Dordrecht: D. Reidel Publishing Co., 1974
7. Gotoh, T. and Kraichnan, R.: Statistics of decaying Burgers turbulence. Phys. Fluids A 5, 445 (1993)
8. Gurbatov, S., Malakhov, A. and Saichev, A.: Nonlinear random waves and turbulence in nondispersive media: Waves, rays and particles. New York: Manchester Univ. Press, 1991
9. Gurbatov, S., Simdyankin, S., Aurell, E., Frisch, U., and Tóth, G.: On the decay of Burgers Turbulence. J. Fluid Mech. 344, 339–374 (1997)
10. Handa, K.: A remark on shocks in inviscid Burgers turbulence. In: Fitzmaurice, N., Gurarie, D., McCaughan, F., Woyczyński, W. (eds.), Nonlinear waves and weak turbulence with applications in oceanography and condensed matter physics. Progress in Nonlinear Differential Equations and their Applications, Vol. 11. Boston, Berlin: Birkhäuser, 1993, pp. 339–345
11. Kraichnan, R.: Models of intermittency in hydrodynamic turbulence. Phys. Rev. Lett. 65, 575 (1990)
12. Monrad, D. and Rootzén, H.: Small values of Gaussian processes and functional laws of the iterated logarithm. Probab. Theory Related Fields 101, no. 2, 173–192 (1995)
13. Ryan, R.: Large deviation analysis of Burgers turbulence with white-noise initial data. Commun. Pure Applied Math. 51, no. 1, 47–75 (1998)
14. Ryan, R.: The statistics of Burgers turbulence initialized with fractional Brownian noise. Commun. Math. Phys. 191, 71–86 (1998)
15. Shandarin, S.N. and Zel’dovich, Ya.B.: The large-scale structure of the Universe: Turbulence, intermittency and structures in a self-gravitating medium. Rev. Mod. Phys. 61, 185 (1989)
16. Šidák, Z.: Rectangular confidence regions for the means of multivariate normal distributions. J. Am. Statist. Assoc. 62, 626–633 (1967)
17. Sinai, Y.: Statistics of shocks in solutions of inviscid Burgers equation. Commun. Math. Phys. 148, 601–622 (1992)
18. Vergassola, M., Dubrulle, B., Frisch, U. and Noullez, A.: Burgers’ Equation, Devil’s Staircases and the Mass Distribution for Large-Scale Structures. Astron. Astrophys. 289, 325–356 (1994)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 200, 25 – 34 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Tunneling Estimates for Magnetic Schrödinger Operators

Shu Nakamura

Graduate School of Mathematical Sciences, University of Tokyo, 3-8-1, Komaba, Meguro-ku, Tokyo, 153-8914, Japan. E-mail: [email protected]

Received: 30 March 1998 / Accepted: 1 May 1998

Abstract: We study the behavior of eigenfunctions in the semiclassical limit for Schrödinger operators with a simple well potential and a (non-zero) constant magnetic field. We prove an exponential decay estimate on the low-lying eigenfunctions, where the exponent depends explicitly on the magnetic field strength.

1. Introduction

In this paper, we consider magnetic Schrödinger operators $H = (-i\hbar\partial_x - A(x))^2 + V(x)$ on $L^2(\mathbb{R}^n)$ in the semiclassical limit $\hbar \downarrow 0$. Here $\hbar > 0$ is the Planck constant, $A(x)$ is the vector potential and $V(x)$ is the scalar potential. We always assume $V(x)$ and $A(x)$ are real-valued (or $\mathbb{R}^n$-valued) functions on $\mathbb{R}^n$. The magnetic field is given by $B = dA$, where $A = \sum_{j=1}^n A_j(x)\,dx_j$, and $B$ is defined as a 2-form on $\mathbb{R}^n$. We denote the momentum operator by $p = -i\hbar\partial_x$. In the following, we mainly study the case $n = 2$, and thus the magnetic field is given simply by $B(x) = \partial_1 A_2(x) - \partial_2 A_1(x)$, $x \in \mathbb{R}^2$, where $\partial_j = \partial/\partial x_j$. In addition, we suppose $B(x)$ is constant, i.e., $B(x) = B \ne 0$. We may suppose $B > 0$ without loss of generality. We require $V(x)$ to be analytic with respect to rotations. We use polar coordinates: $x_1 = r\cos\theta$, $x_2 = r\sin\theta$, $r \ge 0$, $\theta \in \mathbb{T} = \mathbb{R}/2\pi\mathbb{Z}$, and we write, abusing notation slightly, $V(r, \theta)$ for $V(x)$.


Assumption A. (i) $V$ is a smooth simple well, i.e., $V(x)$ is a $C^\infty$-function, $V(0) = 0$, $V(x) > 0$ for $x \ne 0$, and $\liminf_{|x|\to\infty} V(x) > 0$.

(ii) There exist a $\tau > 0$ and a continuous function $f(r)$ such that for each fixed $r > 0$ the potential $V(r, \theta)$ has an analytic extension in $\theta$ to $\theta \in S_\tau = \{z \in \mathbb{C} \mid |\mathrm{Im}\,z| < \tau\}$, and $\mathrm{Re}\,V(r, \theta) \ge f(r) > 0$, $r > 0$, $\theta \in S_\tau$.

Theorem 1.1. Suppose Assumption A, and let $H$ be as above. Let $\psi$ be an eigenfunction of $H$ with an eigenvalue $E = o(1)$ as $\hbar \downarrow 0$. Then for any $\varepsilon > 0$ and any compact set $K \subset \mathbb{R}^2$, there is $C > 0$ such that

\[
|\psi(x)| \le C\exp[-(g(r) - \varepsilon)/\hbar], \quad x \in K,\ \hbar \in (0, 1],
\tag{1.1}
\]

where

\[
g(r) = \int_0^r \sqrt{f(s) + \frac{\delta^2B^2s^2}{4}}\,ds, \quad r > 0,
\tag{1.2}
\]

and $\delta = \dfrac{2\tau}{1+2\tau} \in (0, 1)$.

Remark. It is well-known that for any $\varepsilon > 0$ and any compact $K$,

\[
|\psi(x)| \le C\exp[-(h(x) - \varepsilon)/\hbar], \quad x \in K,\ \hbar \in (0, 1],
\]

where $h(x)$ is the Agmon distance from 0, which is defined by

\[
h(x) = \inf\Big\{\int_0^1 \sqrt{V(\gamma(t))}\,|\dot\gamma(t)|\,dt \ \Big|\ \gamma(0) = 0,\ \gamma(1) = x\Big\}
\]

(cf. Helffer–Sjöstrand [4], Brummelhuis [1]). If $V(x)$ is radial, i.e., $V(x) = V(r)$, then we can take $f(r) = V(r)$, and we have

\[
g(r) = \int_0^r \sqrt{V(s) + \frac{\delta^2B^2s^2}{4}}\,ds > \int_0^r \sqrt{V(s)}\,ds = h(r).
\]

Here $\tau$ can be taken arbitrarily large, and hence $\delta$ can be arbitrarily close to 1. In fact, we can set $\delta = 1$ because of the small constant $\varepsilon > 0$ in Eq. (1.1). Thus the estimate in Theorem 1.1 is better than the above estimate, which is a consequence of the (standard) Agmon method (cf. Simon [12], Helffer–Sjöstrand [3]). Moreover, if $V(x)$ is radial, the above estimate is shown to be optimal by using separation of variables in polar coordinates. On the other hand, if $V(x)$ is not radial, our estimate is not necessarily optimal, as shown in the next example.


Example (Non-isotropic harmonic oscillator in constant magnetic field). Let $a$, $b$ and $B$ be positive, and set

\[
H = \Big(p_1 + \frac{B}{2}x_2\Big)^2 + \Big(p_2 - \frac{B}{2}x_1\Big)^2 + a^2x_1^2 + b^2x_2^2.
\]

Then $H$ is unitarily equivalent to a harmonic oscillator because the symbol is a positive quadratic form in $x$ and $p$, and we can solve the eigenvalue problem explicitly. It is easy to show that each eigenvalue is proportional to $\hbar$. The ground state is given by

\[
\psi(x) = c_0\exp[-\varphi(x)/\hbar], \qquad \varphi(x) = \frac12 c\,x_1^2 + \frac12 d\,x_2^2 + i e\,x_1x_2,
\]

where

\[
c = \frac{a}{a+b}\sqrt{(a+b)^2 + B^2}, \qquad
d = \frac{b}{a+b}\sqrt{(a+b)^2 + B^2}, \qquad
e = \frac12\cdot\frac{a-b}{a+b}\,B.
\]

For any $B \ne 0$, we have $c > a$ and $d > b$, and this confirms that $\psi(x)$ decays faster in the presence of the magnetic field. We note that if $a = b$ then

\[
\frac12 c = \frac12 d = \frac12\sqrt{a^2 + B^2/4}.
\]

If we apply Theorem 1.1, then $\tau$ can be taken arbitrarily large and hence $\delta$ can be arbitrarily close to 1. Thus

\[
g(r) \sim \int_0^r \sqrt{a^2s^2 + \frac{B^2s^2}{4}}\,ds = \frac{r^2}{2}\sqrt{a^2 + B^2/4},
\]

and it is the right exponent expected from the above computations. On the other hand, if $a \ne b$, our result is not optimal. If $a \gg b > 0$, then

\[
c \sim \sqrt{a^2 + B^2}, \qquad d \sim \frac{b}{a}\sqrt{a^2 + B^2}.
\]

Thus the decay rate is strongly direction dependent in this case. Let us see what Theorem 1.1 tells us in this case. By elementary computations, we have

\[
\tau \sim \frac12\log\frac{a+b}{a-b}
\]

and then

\[
g(r) \sim \frac{b}{a}|B|r^2 < \frac{b}{a}\sqrt{a^2 + B^2}\,r^2.
\]

This result is not optimal, but it is not very far from it if $B$ is very large.
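The coefficients of the ground-state phase in this example can be verified directly. Applying $H$ to $\exp[-\varphi(x)/\hbar]$ with $\varphi = \frac12 c x_1^2 + \frac12 d x_2^2 + i e x_1 x_2$ and collecting the terms of order $\hbar^0$ gives three algebraic conditions, one for each of the monomials $x_1^2$, $x_2^2$ and $x_1x_2$. The sketch below (function names are illustrative) evaluates the residuals of these conditions for the stated $c$, $d$, $e$.

```python
import math

def ground_state_coefficients(a, b, B):
    """Coefficients c, d, e of the quadratic phase in the example."""
    s = math.sqrt((a + b) ** 2 + B ** 2)
    c = a * s / (a + b)
    d = b * s / (a + b)
    e = 0.5 * (a - b) / (a + b) * B
    return c, d, e

def eikonal_residuals(a, b, B):
    """Residuals of the order-hbar^0 conditions for H exp(-phi/hbar).

    With u = i*d(phi)/dx1 + B*x2/2 and v = i*d(phi)/dx2 - B*x1/2, the
    condition u^2 + v^2 + a^2 x1^2 + b^2 x2^2 = 0 splits into the
    coefficients of x1^2, x2^2 and x1*x2 returned here (the last one
    divided by 2i).
    """
    c, d, e = ground_state_coefficients(a, b, B)
    r11 = -c * c + (e + B / 2) ** 2 + a * a
    r22 = -d * d + (B / 2 - e) ** 2 + b * b
    r12 = c * (B / 2 - e) - d * (e + B / 2)
    return r11, r22, r12
```

All three residuals should vanish up to rounding, and for $a = b$ the coefficients reduce to $c = d = \sqrt{a^2 + B^2/4}$, $e = 0$.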

Our result can be easily applied to the double well problem with constant magnetic field.


Assumption B. (i) $V(x)$ is a smooth symmetric double well, i.e., $V(x)$ is $C^\infty$-class; $V(x_1, x_2) = V(-x_1, x_2)$, $x \in \mathbb{R}^2$. There are $x^{(1)}$ and $x^{(2)}$ such that

\[
x_1^{(1)} = -x_1^{(2)} \ne 0, \quad x_2^{(1)} = x_2^{(2)}, \quad V(x^{(1)}) = V(x^{(2)}) = 0,
\]

and $V(x) > 0$ for $x \ne x^{(1)}, x^{(2)}$. Moreover, $\liminf_{|x|\to\infty} V(x) > 0$.

(ii) $V(x)$ is analytic in a neighborhood of $x^{(j)}$, $j = 1, 2$.

Theorem 1.2. Suppose $V$ satisfies Assumption B. Let $E_0$ and $E_1$ be the lowest two eigenvalues of $H$ (counting multiplicities). Then there are $a, b > 0$ and $C > 0$ such that

\[
|E_1 - E_0| \le C\exp[-(a + bB)/\hbar]
\tag{1.3}
\]

for any $\hbar \in (0, 1]$ and $B > 0$.

Theorem 1.2 follows from standard means, and we omit the proof (see, e.g., [2], [3], [12]). We note that Theorem 1.1 can be applied to the multiple well problem as well.

The main idea of the proof of Theorem 1.1 is to use a weight function $\rho(r, p_\theta)$, where $p_\theta = -i\hbar\partial_\theta$, i.e., a weight in the phase space. More precisely, we will work in polar coordinates, and compute $H_\rho \equiv e^{\rho(r,p_\theta)/\hbar}He^{-\rho(r,p_\theta)/\hbar}$. We then construct $\rho(r, \eta)$ so that $\mathrm{Re}\,H_\rho > 0$ away from the origin. Note that there is no canonical definition of the “Agmon metric” in this general setting, and we construct $\rho(r, \eta)$ explicitly by hand.

The exponential decay estimates of eigenfunctions of Schrödinger operators (without magnetic field) have been studied extensively by Simon [12], Helffer–Sjöstrand [3] and others (see also [2]). These estimates are known to be optimal, and even asymptotic expansions are known in many cases. In a paper [4], Helffer and Sjöstrand studied exponential decay of eigenfunctions for magnetic Schrödinger operators. They constructed WKB solutions and thus obtained the optimal estimates for the decay rate of the eigenfunctions, but their method applies only to the case with weak magnetic fields. Brummelhuis [1] also studied exponential decay of the eigenfunctions of magnetic Schrödinger operators, but his result does not imply that the exponential decay rate of $O(\hbar^{-1})$ increases due to the magnetic field.

Our proof employs the idea of the phase space tunneling estimates (cf. Martinez [6, 7], Nakamura [8, 9]). The same idea was applied to obtain an alternative proof of the Gaussian decay estimates of eigenfunctions of the magnetic Schrödinger operators due to Erdős [5] (cf. Nakamura [10], Sordoni [13]). Theorem 1.1 may be considered as a semiclassical analogue of the Gaussian decay estimate at infinity.

In Sect. 2, we prepare simple tools, and prove Theorem 1.1 in Sect. 3. We discuss generalizations in the last section.


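2. Preliminaries

We define a pseudodifferential operator on $\ell^2(\hbar\mathbb{Z})$ as follows:

\[
a(\hbar; \eta, -p_\eta)u(\eta) = (2\pi\hbar)^{-1}\sum_{\xi\in\hbar\mathbb{Z}}\int_0^{2\pi} e^{-i(\eta-\xi)\theta/\hbar}\,a(\hbar; \eta, \theta)\,u(\xi)\,d\theta
\]

for $u \in C_0(\hbar\mathbb{Z})$ (or $u \in C_0^\infty(\mathbb{R})$) and $a \in C^\infty(\mathbb{R}\times\mathbb{T})$.

Definition 2.1. Let $m, l \in \mathbb{R}$. We write $a(\hbar; \eta, \theta) \in S^{m,l}$ if for any $\alpha, \beta$,

\[
\big|\partial_\eta^\alpha\partial_\theta^\beta a(\hbar; \eta, \theta)\big| \le C_{\alpha\beta}\,\hbar^l\,\langle\eta\rangle^{m-|\alpha|}, \quad \hbar \in (0, 1],\ \eta \in \mathbb{R},\ \theta \in \mathbb{T}.
\]

The operator $a(\hbar; \eta, -p_\eta)$ is unitarily equivalent to a pseudodifferential operator on $\mathbb{T}$, i.e., $\mathcal{F}^{-1}a(\hbar; \eta, -p_\eta)\mathcal{F}$ is a pseudodifferential operator on $\mathbb{T}$, and the standard theory of pseudodifferential operators is easily constructed. In particular, we will use the following:

Proposition 2.1 (Gårding’s inequality). Suppose $q(\hbar; \eta, \theta) \in S^{m,0}$ satisfies

\[
\mathrm{Re}\,q(\hbar; \eta, \theta) \ge c\,\langle\eta\rangle^m, \quad \hbar \in (0, 1],\ \eta \in \mathbb{R},\ \theta \in \mathbb{T}
\]

with some $c > 0$, then there are $c', C > 0$ such that

\[
\mathrm{Re}[q(\hbar; \eta, -p_\eta)] \ge c'\,\langle\eta\rangle^m - C\hbar.
\]

In order to study the effect of the exponential weight, we also use the following symbol class:

Definition 2.2. Let $m, l \in \mathbb{R}$ and $\tau > 0$. We write $a(\hbar; \eta, \theta) \in S_\tau^{m,l}$ if $a(\hbar; \eta, \theta)$ is analytic in $\theta \in S_\tau$ and for any $\alpha, \beta$,

\[
\big|\partial_\eta^\alpha\partial_\theta^\beta a(\hbar; \eta, \theta)\big| \le C_{\alpha\beta}\,\hbar^l\,\langle\eta\rangle^{m-|\alpha|}, \quad \hbar \in (0, 1],\ \eta \in \mathbb{R},\ \theta \in S_\tau.
\]

The next proposition follows from the standard argument (cf., e.g., [11]).

Proposition 2.2. Let $\rho(\eta) \in C^\infty(\mathbb{R})$ be such that for any $\alpha$,

\[
\big|\partial_\eta^\alpha\rho(\eta)\big| \le C_\alpha\,\langle\eta\rangle^{1-|\alpha|}, \quad \eta \in \mathbb{R},
\]

and

\[
\sup_{\eta\in\mathbb{R}}|\rho'(\eta)| < \tau.
\]

If $A = a(\hbar; \eta, -p_\eta) \in OPS_\tau^{m,l}$, then

\[
A_\rho \equiv e^{\rho(\eta)/\hbar}Ae^{-\rho(\eta)/\hbar} \in OPS^{m,l},
\]

and the principal symbol is given by $a(\hbar; \eta, \theta - i\rho'(\eta))$, i.e.,

\[
\mathrm{Sym}(A_\rho) - a(\hbar; \eta, \theta - i\rho'(\eta)) \in S^{m-1,l+1},
\]

where $\mathrm{Sym}(A)$ denotes the symbol of a pseudodifferential operator $A$.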


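3. Proof of Theorem 1.1

3.1. Hamiltonian. Let $B > 0$ be the magnetic field strength. We can choose a gauge so that

\[
A(x) = \Big(-\frac{B}{2}x_2,\ \frac{B}{2}x_1\Big).
\]

Then our Hamiltonian can be expressed in polar coordinates as follows:

\[
H = p_r^2 + \Big(\frac{p_\theta}{r} - \frac{Br}{2}\Big)^2 - \frac{\hbar^2}{4r^2} + V(r, \theta) \quad \text{on } L^2(dr\,d\theta),
\]

where $p_r = -i\hbar\partial_r$, $p_\theta = -i\hbar\partial_\theta$. We now transform $H$ by the Fourier series expansion:

\[
\mathcal{F}u(\eta) = (2\pi\hbar)^{-1/2}\int_0^{2\pi} e^{-i\eta\theta/\hbar}u(\theta)\,d\theta, \quad u \in L^2(\mathbb{T}),\ \eta \in \hbar\mathbb{Z},
\]

and we obtain

\[
K = \mathcal{F}H\mathcal{F}^{-1} = p_r^2 + \Big(\frac{\eta}{r} - \frac{Br}{2}\Big)^2 - \frac{\hbar^2}{4r^2} + V(r, -p_\eta) \quad \text{on } L^2(\mathbb{R}_+)\otimes\ell^2(\hbar\mathbb{Z}).
\tag{3.1}
\]

Here $\mathcal{F}V\mathcal{F}^{-1} = V(r, -p_\eta)$ is considered as a pseudodifferential operator (with respect to $\eta$). Let $\rho(r, \eta)$ be a smooth function satisfying $|\partial_\eta\rho(r, \eta)| \le \tau$, $r > 0$, $\eta \in \mathbb{R}$, and suppose that $\rho(r, \eta)$ is constant if $|r| + |\eta|$ is sufficiently large. Then we have

\[
K_\rho \equiv e^{\rho(r,\eta)/\hbar}Ke^{-\rho(r,\eta)/\hbar} = (p_r + i\partial_r\rho(r, \eta))^2 + \Big(\frac{\eta}{r} - \frac{Br}{2}\Big)^2 + V_\rho - \frac{\hbar^2}{4r^2},
\]

where

\[
V_\rho \equiv e^{\rho(r,\eta)/\hbar}Ve^{-\rho(r,\eta)/\hbar} = V(r, -p_\eta - i\partial_\eta\rho(r, \eta)) + R
\]

with $R \in OPS^{-1,1}$ (by Proposition 2.2). We will construct $\rho(r, \eta)$ so that $\mathrm{Re}\,K_\rho$ is positive away from $(r, \eta) = (0, 0)$, and as large as possible.

3.2. Weight function. Let $\delta = 2\tau/(1+2\tau)$ (i.e., $\tau = \delta/(2-2\delta)$), and let

\[
\Omega_\delta = \Big\{(r, \eta)\ \Big|\ \frac{B}{2}r^2 - \eta \ge \delta\frac{B}{2}r^2\Big\} = \Big\{(r, \eta)\ \Big|\ \eta \le (1-\delta)\frac{B}{2}r^2\Big\}.
\]

Then we set

\[
\rho_0(r, \eta) =
\begin{cases}
\displaystyle\int_0^r \sqrt{\delta^2\frac{B^2s^2}{4} + f(s)}\,ds & \text{if } (r, \eta) \in \Omega_\delta,\\[10pt]
\displaystyle\tau\eta - \frac{\delta Br^2}{4} + \int_0^r \sqrt{\delta^2\frac{B^2s^2}{4} + f(s)}\,ds & \text{if } (r, \eta) \in \Omega_\delta^c.
\end{cases}
\]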
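The piecewise weight can be explored numerically. The following sketch makes the hypothetical choice $f(s) = a^2s^2$ (a radial harmonic well, not prescribed by the text), for which the radial integral has the closed form $\frac{r^2}{2}\sqrt{\delta^2B^2/4 + a^2}$. It evaluates $\rho_0$ on both sides of the curve $\eta = (1-\delta)Br^2/2$ that separates the two branches; all names are illustrative.

```python
import math

def tau_from_delta(delta):
    """tau = delta / (2 - 2 delta), the inverse of delta = 2 tau / (1 + 2 tau)."""
    return delta / (2 - 2 * delta)

def radial_part(r, a, B, delta):
    """Integral of sqrt(delta^2 B^2 s^2 / 4 + a^2 s^2) over [0, r] for f(s) = a^2 s^2."""
    return 0.5 * r * r * math.sqrt(delta ** 2 * B ** 2 / 4 + a * a)

def boundary_eta(r, B, delta):
    """Value of eta on the curve separating the two branches of rho_0."""
    return (1 - delta) * B * r * r / 2

def rho0(r, eta, a, B, delta):
    """Piecewise weight rho_0(r, eta) for the hypothetical f(s) = a^2 s^2."""
    base = radial_part(r, a, B, delta)
    if eta <= boundary_eta(r, B, delta):
        return base
    # Second branch: the extra term tau*eta - delta*B*r^2/4 vanishes on the
    # boundary curve and is positive beyond it.
    return tau_from_delta(delta) * eta - delta * B * r * r / 4 + base
```

In this sketch the two branches agree on the separating curve, so the weight is continuous, and the second branch never falls below the radial integral.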


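We note that if $(r, \eta) \in \partial\Omega_\delta$, then

\[
\tau\eta - \frac{\delta Br^2}{4} = \frac{\delta}{2(1-\delta)}\Big(\eta - \frac{(1-\delta)Br^2}{2}\Big) = 0
\]

and hence $\rho_0(r, \eta)$ is continuous. Moreover, it is locally Lipschitz continuous. We also see from the above computation that

\[
\tau\eta - \frac{\delta Br^2}{4} > 0 \quad \text{for } (r, \eta) \in \Omega_\delta^c,
\]

and hence

\[
\rho_0(r, \eta) \ge \int_0^r \sqrt{\delta^2\frac{B^2s^2}{4} + f(s)}\,ds, \quad r > 0,\ \eta \in \mathbb{R}.
\]

Now let us fix $0 < \alpha < 1$ and compute (formally) the symbol of $K_{(\alpha\rho_0)}$. On $\Omega_\delta$ we have

\[
\frac{1}{r^2}\Big(\eta - \frac{Br^2}{2}\Big)^2 - |\partial_r(\alpha\rho_0(r, \eta))|^2 + \mathrm{Re}\,V_{(\alpha\rho_0)}
\ge \frac{1}{r^2}\delta^2\Big(\frac{Br^2}{2}\Big)^2 - \alpha^2\Big(\delta^2\frac{B^2r^2}{4} + f(r)\Big) + f(r) \ge (1-\alpha^2)f(r).
\]

On the other hand, on $\Omega_\delta^c$, we have

\[
\frac{1}{r^2}\Big(\eta - \frac{Br^2}{2}\Big)^2 - |\partial_r(\alpha\rho_0(r, \eta))|^2 + \mathrm{Re}\,V(r, \theta - i\alpha\partial_\eta\rho_0(r, \eta))
\ge -\alpha^2\Bigg(\sqrt{\delta^2\frac{B^2r^2}{4} + f(r)} - \sqrt{\delta^2\frac{B^2r^2}{4}}\Bigg)^2 + f(r) \ge (1-\alpha^2)f(r),
\]

since $\sqrt{a+b} - \sqrt{a} \le \sqrt{b}$. Thus the symbol of $K_{(\alpha\rho_0)}$ is bounded from below by $(1-\alpha^2)f(r) - O(\hbar)$. Given $\alpha\rho_0$, we now use a mollifier and a partition of unity to construct $\rho(r, \eta)$ satisfying the following properties:

Lemma 3.1. Let $\varepsilon > 0$ and $R > 0$. Then there exists $\rho(r, \eta) \in C^\infty(\mathbb{R}_+\times\mathbb{R})$ such that

(i) $\rho(r, \eta) = 0$ in a neighborhood of $(0, 0)$.

(ii) $\rho(r, \eta)$ is a constant if $|r| + |\eta|$ is sufficiently large.

(iii) If $r \le R$ then

\[
\rho(r, \eta) \ge \int_0^r \sqrt{\delta^2\frac{B^2s^2}{4} + f(s)}\,ds - \varepsilon.
\]

(iv) There is $\delta_1 > 0$ such that

\[
\Big(\frac{\eta}{r} - \frac{Br}{2}\Big)^2 - |\partial_r\rho(r, \eta)|^2 + \mathrm{Re}\,V(r, \theta - i\partial_\eta\rho(r, \eta)) \ge \delta_1 f(r)\,\langle\eta\rangle^2
\]

for $0 < r < R$, $\eta \in \mathbb{R}$, and

\[
\Big(\frac{\eta}{r} - \frac{Br}{2}\Big)^2 - |\partial_r\rho(r, \eta)|^2 + \mathrm{Re}\,V(r, \theta - i\partial_\eta\rho(r, \eta)) \ge \delta_1\,\langle\eta\rangle^2
\]

if $r \ge R$, $\eta \in \mathbb{R}$.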


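(v) There is $\delta_2 > 0$ such that $\rho(r, \eta)$ is independent of $r$ if $0 < r < \delta_2$.

(vi) $\sup|\partial_\eta\rho(r, \eta)| < \tau$.

The construction is somewhat long but straightforward, and we omit the details.

3.3. Agmon-type estimate. Now we choose a smooth cut-off function $\chi(r, \eta)$. Let $\chi_0 \in C_0^\infty(\mathbb{R})$ be such that $0 \le \chi_0(x) \le 1$ and

\[
\chi_0(x) =
\begin{cases}
1 & \text{if } |x| \le 1/2,\\
0 & \text{if } |x| \ge 1.
\end{cases}
\]

We set

\[
\chi(r, \eta) = 1 - \chi_0\Big(\frac{r}{\delta_3}\Big)\chi_0\Big(\frac{\eta}{\delta_3}\Big),
\]

where $\delta_3 > 0$ is taken so small that $\rho(r, \eta) = 0$ if $|r| \le \delta_3$ and $|\eta| \le \delta_3$, $\delta_3 - \frac{B}{4}\delta_3^2 > 0$ (i.e., $\delta_3 < 4/B$), and $\delta_3 < \delta_2$. Note that $(1-\chi)\rho = 0$.

Lemma 3.2. There exists $\delta_4 > 0$ such that

\[
\chi(r, \eta)\Big[\frac{1}{r^2}\Big(\eta - \frac{Br^2}{2}\Big)^2 - |\partial_r\rho(r, \eta)|^2 + \mathrm{Re}\,V_\rho - \frac{\hbar^2}{4r^2}\Big]\chi(r, \eta) \ge \delta_4\,\chi(r, \eta)^2
\tag{3.2}
\]

in the operator sense on $L^2(\mathbb{R}_+)\otimes\ell^2(\hbar\mathbb{Z})$ if $\hbar$ is sufficiently small.

Proof. Since the variable $r$ appears as a parameter in both sides, it suffices to show it for each $r > 0$ in the operator sense on $\ell^2(\hbar\mathbb{Z})$. If $\delta_3/2 \le r$, then by Lemma 3.1-(iv) and Gårding’s inequality we have

\[
\frac{1}{r^2}\Big(\eta - \frac{Br^2}{2}\Big)^2 - |\partial_r\rho|^2 + \mathrm{Re}\,V_\rho \ge \frac{\delta_1}{2}f(r)\,\langle\eta\rangle^2 - C\hbar \ge \delta_4 > 0
\]

if $\hbar$ is sufficiently small. Now let $r < \delta_3/2$. Then we have

\[
\chi(r, \eta)^2\,\frac{1}{r^2}\Big(\eta - \frac{Br^2}{2}\Big)^2 \ge \chi(r, \eta)^2\,\frac{1}{r^2}\Big(\frac{\delta_3}{2} - \frac{B\delta_3^2}{8}\Big)^2 \ge \frac{\delta_5}{r^2}\,\chi(r, \eta)^2
\]

with $\delta_5 > 0$. Hence (noting $\partial_r\rho = 0$), we have

\[
\chi(r, \eta)\Big[\frac{1}{r^2}\Big(\eta - \frac{Br^2}{2}\Big)^2 + \mathrm{Re}\,V_\rho - \frac{\hbar^2}{4r^2}\Big]\chi(r, \eta) \ge \chi(r, \eta)\Big[\frac{\delta_5}{r^2} - \frac{\hbar^2}{4r^2} - C\hbar\Big]\chi(r, \eta) \ge \delta_6\,\chi(r, \eta)^2
\]

with $\delta_6 > 0$ if $\hbar$ is small. These imply the assertion.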

The next lemma is an easy consequence of Lemma 3.2.


Lemma 3.3. Let $\rho$ and $\chi$ be as above. Let $E = E(\hbar) = o(1)$ as $\hbar \to 0$. Then there is $\delta_7 > 0$ such that

\[
\mathrm{Re}[\chi(K_\rho - E)\chi] \ge \delta_7\chi^2
\]

if $\hbar$ is sufficiently small.

Proof of Theorem 1.1. We apply the standard argument of the Agmon method (cf., e.g., [11], Theorem 2.2 and Lemma 2.3). Let $\psi$ be an eigenfunction of $K$: $K\psi = E\psi$ with $E = o(1)$ as $\hbar \to 0$. Then by Lemma 3.3 we learn

\[
\|e^{\rho/\hbar}\chi\psi\| \le C\big(\|\psi\| + \|K\psi\|\big),
\]

where $C$ is independent of $\hbar > 0$. By Lemma 3.1-(iii), it follows that

\[
\|\exp[(g(r) - \varepsilon)/\hbar]\,\chi\psi\| \le C,
\]

where $g(r)$ is given by (1.2). Now Theorem 1.1 follows immediately by the Sobolev embedding theorem.

4. Discussions

4.1. Generalization with respect to $B(x)$. It is easy to see that we can prove the same result for a non-constant magnetic field $B(x)$ if it is radial and bounded from below, i.e., if $B(x) = B(r)$, and $B(r) \ge B_0 > 0$. On the other hand, if $B(x)$ is not radial, we surely need the analyticity of $B(x)$ with respect to the rotations, and the argument becomes more complicated. In fact we can carry out a similar argument, but we need an additional assumption, and it prohibits us from applying it to strong magnetic fields.

4.2. Generalization with respect to space dimension. We can easily prove essentially the same result for Schrödinger operators on even dimensional Euclidean space with nondegenerate constant magnetic field. If the space dimension is odd, e.g., $n = 3$, then we have improved decay only in the direction perpendicular to the magnetic field. The decay in the direction of the magnetic field is the same as that without magnetic fields.

Acknowledgement. A part of this work was done when the author was staying at The Fields Institute for Research in the Mathematical Sciences, Toronto, Canada. He wishes to thank the institute for the hospitality and the support.

References

1. Brummelhuis, R.: Exponential decay in the semiclassical limit for eigenfunctions of Schrödinger operators with magnetic fields and potentials which degenerate at infinity. Comm. P. D. E. 16, 1489–1502 (1991)
2. Helffer, B.: Semi-Classical Analysis for the Schrödinger Operator and Applications. Springer Lecture Notes in Math. 1336, Berlin–Heidelberg–New York: Springer-Verlag, 1988
3. Helffer, B., Sjöstrand, J.: Multiple wells in the semiclassical limit. Comm. P. D. E. 9, 337–408 (1984)
4. Helffer, B., Sjöstrand, J.: Effet tunnel pour l’équation de Schrödinger avec champ magnétique. Ann. Scuola Norm. Sup. Pisa Cl. Sci. 14, 625–657 (1987)
5. Erdős, L.: Gaussian decay of the magnetic eigenfunctions. Geom. Funct. Anal. 6, 231–248 (1996)
6. Martinez, A.: Estimates on complex interactions in phase space. Math. Nachr. 167, 203–254 (1994)
7. Martinez, A.: Microlocal exponential estimates and applications to tunneling. In: Microlocal Analysis and Spectral Theory, Ser. C, 490, L. Rodino ed., 1997
8. Nakamura, S.: On Martinez’ method of phase space tunneling. Rev. Math. Phys. 7, 431–441 (1995)
9. Nakamura, S.: On an example of phase-space tunneling. Ann. Inst. H. Poincaré (Phys. Théo.) 63, 211–229 (1995)
10. Nakamura, S.: Gaussian decay estimates for the eigenfunctions of magnetic Schrödinger operators. Comm. P. D. E. 21, 993–1006 (1996)
11. Nakamura, S.: Agmon-type exponential decay estimates for pseudodifferential operators. To appear in J. Math. Sci. Univ. Tokyo
12. Simon, B.: Semiclassical analysis of low-lying eigenvalues. II. Tunneling. Ann. Math. 120, 89–118 (1984)
13. Sordoni, V.: Gaussian decay for the eigenfunctions of a Schrödinger operator with magnetic field constant at infinity. To appear in Comm. P. D. E.

Communicated by B. Simon

Commun. Math. Phys. 200, 35 – 41 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Semi-Classical States for Non-Self-Adjoint Schrödinger Operators

E.B. Davies

Department of Mathematics, King's College, Strand, London WC2R 2LS, UK. E-mail: [email protected]

Received: 7 April 1998 / Accepted: 12 June 1998

Abstract: We prove that the spectrum of certain non-self-adjoint Schrödinger operators is unstable in the semi-classical limit $h \to 0$. Similar results hold for a fixed operator in the high energy limit. The method involves the construction of approximate semiclassical modes of the operator by the JWKB method for energies far from the spectrum.

1. Introduction

It is well known that the complex spectrum of many non-self-adjoint differential operators is highly unstable under small perturbations [2, 3, 8, 9]; this has been investigated in detail for the Rayleigh equation in hydrodynamics in [6, Ch. 4]. One way of exploring this fact is by defining the pseudospectrum of such an operator $H$ by
\[ \mathrm{Spec}_\varepsilon(H) := \mathrm{Spec}(H) \cup \{ z : \|R(z)\| > \varepsilon^{-1} \}, \]
where $\varepsilon > 0$ and $R(z)$ is the resolvent of $H$. It is known that $\mathrm{Spec}_\varepsilon(H)$ contains the $\varepsilon$-neighbourhood of the spectrum, and that it is contained in the $\varepsilon$-neighbourhood of the numerical range of $H$. Theorem 1 states that for Schrödinger operators with complex potentials $\mathrm{Spec}_\varepsilon(H)$ expands to fill a region $U$ of the complex plane much larger than the spectrum in the semi-classical limit $h \to 0$. More precisely we obtain an explicit lower bound on $\|R(z)\|$ which increases rapidly as $h \to 0$ for all $z$ in the region $U$ defined by
\[ U := \{ z = \eta^2 + V(a) : \eta \in \mathbb{R}\setminus\{0\} \text{ and } \mathrm{Im}(V'(a)) \neq 0 \}. \]
For $h = 1$ we deduce that for large enough $z$ within a suitable region the resolvent norms of Schrödinger operators with complex potentials become very large even though $z$ may be far from the spectrum. This had apparently not been noticed in other spectral investigations of these operators [1, 4, 5, 7], apart from [3], whose results are greatly extended and improved in Theorem 2 below.


Our results have a positive aspect. Our proofs use a JWKB analysis to construct a continuous family of approximate eigenstates, which we call semi-classical modes, for the operators in question. These modes have complex energies far from the spectrum, but could be used to investigate the time evolution of fairly general initial states by expanding these in terms of the modes.
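The gap between spectrum and pseudospectrum described in this introduction can already be seen for a 2×2 matrix. The sketch below is an illustration only and is not part of the paper's argument: the function names and the test matrix are assumptions chosen for the example. It computes the spectral norm of the resolvent $(A - zI)^{-1}$ from the closed-form singular values of a 2×2 matrix and tests membership in $\mathrm{Spec}_\varepsilon(A)$.

```python
import math

def resolvent_norm_2x2(a, b, c, d, z):
    """Spectral norm of (A - z I)^{-1} for A = [[a, b], [c, d]]; z must not be an eigenvalue."""
    # Entries of A - z I, inverted with the 2x2 adjugate formula.
    p, q, r, s = a - z, b, c, d - z
    det = p * s - q * r
    inv = [[s / det, -q / det], [-r / det, p / det]]
    # For a 2x2 matrix B the singular values s1 >= s2 satisfy
    #   s1^2 + s2^2 = ||B||_F^2   and   s1 * s2 = |det B|.
    frob2 = sum(abs(x) ** 2 for row in inv for x in row)
    absdet = abs(inv[0][0] * inv[1][1] - inv[0][1] * inv[1][0])
    disc = max(frob2 ** 2 - 4 * absdet ** 2, 0.0)
    return math.sqrt((frob2 + math.sqrt(disc)) / 2)

def in_pseudospectrum(a, b, c, d, z, eps):
    """True if ||R(z)|| > 1/eps, i.e. z belongs to Spec_eps(A)."""
    return resolvent_norm_2x2(a, b, c, d, z) > 1.0 / eps
```

For the hypothetical non-normal matrix with rows (0, 1000) and (0, 0), the only eigenvalue is 0, yet the resolvent norm at z = 10 is about 10, a hundred times larger than the value 0.1 obtained for the zero matrix, which has the same spectrum.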

2. The Estimates

For reasons which will become clear in the next section, we consider operators somewhat more general than those described in the last section. We assume that $H$ acts in $L^2(\mathbb{R})$ according to the formula
\[ Hf(x) := -h^2 \frac{d^2 f}{dx^2} + V_h(x) f(x), \]
where $V_h$ are smooth potentials for all small enough $h > 0$ which depend continuously on $h$, together with their derivatives of all orders. In the applications in the next section $V_h$ has an expansion involving fractional powers of $h$, but this is invisible since we treat $V_h$ as it stands, and only need an asymptotic expansion involving integer powers of $h$; this simplification is essential to our solution of the problem. Technically we assume that $H$ is some closed extension of the operator initially defined on $C_c^\infty(\mathbb{R})$. We assume throughout the section that $z := \eta^2 + V_h(a)$, where $\eta \in \mathbb{R}\setminus\{0\}$ and $\mathrm{Im}(V_0'(a)) \neq 0$. The $h$-dependence of $z$ as defined above may be eliminated by a uniformity argument spelled out in the next section, but we do not focus on this issue here. Our goal is to prove the upper bound
\[ \|H\tilde f - z\tilde f\| / \|\tilde f\| = O(h^n) \]
as $h \to 0$, for all $n > 0$, where $\tilde f \in C_c^\infty(\mathbb{R})$ depends upon $h$ and $n$. This immediately implies that $\|(H - zI)^{-1}\|$ diverges as $h \to 0$ faster than any negative power of $h$. Although the above equation may be interpreted as stating that such $z$ are approximate eigenvalues for small enough $h > 0$, it does not follow that they are close to true eigenvalues, and indeed the examples studied in [2, 3, 8, 9] show that there is a strong distinction between the spectrum and pseudospectrum.

Theorem 1. There exists $\delta > 0$ and for each $n > 0$ a positive constant $c_n$ and an $h$-dependent function $\tilde f \in C_c^\infty(\mathbb{R})$ such that if $0 < h < \delta$ then
\[ \|H\tilde f - z\tilde f\| / \|\tilde f\| \le c_n h^n. \]

Proof. The proof is a direct construction. We put $\tilde f(a + s) := \xi(s) f(s)$ for all $s \in \mathbb{R}$, where $\xi \in C_c^\infty(\mathbb{R})$ satisfies $\xi(s) = 1$ if $|s| < \delta/2$ and $\xi(s) = 0$ if $|s| > \delta$, and $\delta > 0$ is determined below. We take $f$ to be the smooth but not square integrable JWKB function $f := \exp(-\psi)$ where
\[ \psi(s) := \sum_{m=-1}^{n} h^m \psi_m(s) \]
and $\psi_m$ are defined below. A direct computation shows that
\[ Hf - zf = \Big( \sum_{m=0}^{2n+2} h^m \phi_m \Big) f, \]
where $\phi_m$ are given by the formulae
\begin{align*}
\phi_0 &:= -(\psi_{-1}')^2 + V_h - z, \\
\phi_1 &:= \psi_{-1}'' - 2\psi_{-1}'\psi_0', \\
\phi_2 &:= \psi_0'' - 2\psi_{-1}'\psi_1' - (\psi_0')^2, \\
\phi_3 &:= \psi_1'' - 2\psi_{-1}'\psi_2' - 2\psi_0'\psi_1', \\
&\;\;\vdots \\
\phi_{2n+2} &:= -(\psi_n')^2.
\end{align*}

By setting $\phi_m = 0$ for $0 \le m \le n + 1$, we obtain a series of equations which enable us to determine all $\psi_m$, provided $\delta > 0$ is small enough. The key equation is the complex eikonal equation
\[ \psi_{-1}'(s)^2 = V_h(a + s) - V_h(a) - \eta^2, \]
whose solution is
\begin{align*}
\psi_{-1}(s) &:= \int_0^s \big( V_h(a + t) - V_h(a) - \eta^2 \big)^{1/2}\, dt \\
&= \int_0^s i\eta \Big( 1 - \frac{V_h(a + t) - V_h(a)}{\eta^2} \Big)^{1/2} dt \\
&= i\eta s - \frac{i V_h'(a) s^2}{4\eta} + O(s^3).
\end{align*}
We have assumed that $\mathrm{Im}(V_0'(a)) \neq 0$ and this implies that $\mathrm{Im}(V_h'(a)) \neq 0$ for all small enough $h > 0$. We choose $\eta$ to be of the same sign as $\mathrm{Im}(V_h'(a))$ so that for a suitable constant $\gamma > 0$ we have
\[ \gamma s^2 \le \mathrm{Re}(\psi_{-1}(s)) \le 3\gamma s^2 \]
for all small enough $s$ and $h$. We also assume that $s$ and $h$ are small enough for $\rho := (2\psi_{-1}')^{-1}$ to satisfy a bound of the form $|\rho(s)| \le \beta$.

We now force $\phi_m = 0$ for $0 \le m \le n + 1$ by putting
\begin{align*}
\psi_0' &= \rho\, \psi_{-1}'', \\
\psi_1' &= \rho\, (\psi_0'' - (\psi_0')^2), \\
\psi_2' &= \rho\, (\psi_1'' - 2\psi_0'\psi_1'),
\end{align*}
etc. We determine the functions uniquely by also imposing $\psi_m(0) = 0$ for all $0 \le m \le n$. Each of the functions is bounded provided $s$ and $h$ are small enough, and the same is true of the remaining functions $\phi_m$. Specifically we assume that for some $\delta > 0$ and constants $c_m$, $c_m'$ we have $|\psi_m(s)| \le c_m$ for $0 \le m \le n$, and

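$|\phi_m(s)| \le c_m'$ for $n + 2 \le m \le 2n + 2$, provided $|s| \le \delta$ and $0 < h < \delta^2$. In the following calculations $a_i$ denote various positive constants, independent of $h$ and $s$. We have
\begin{align*}
\|\tilde f\|_2^2 &\ge \int_{-\delta/2}^{\delta/2} |f(s)|^2\, ds \\
&\ge \int_{-\delta/2}^{\delta/2} e^{-3\gamma s^2 h^{-1} - a_1}\, ds \\
&= h^{1/2} \int_{-\delta h^{-1/2}/2}^{\delta h^{-1/2}/2} e^{-3\gamma t^2 - a_1}\, dt \\
&\ge h^{1/2} \int_{-1/2}^{1/2} e^{-3\gamma t^2 - a_1}\, dt \\
&= a_2 h^{1/2}.
\end{align*}
We also have
\begin{align*}
\|H\tilde f - z\tilde f\|_2 &= \| -h^2 f\xi'' - 2h^2 f'\xi' + \xi(Hf - zf) \|_2 \\
&\le h^2 \|f\xi''\|_2 + 2h^2 \|f'\xi'\|_2 + \sum_{m=n+2}^{2n+2} h^m \|\xi\phi_m f\|_2
\end{align*}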

and need to estimate each of the norms. Since $\xi'$ has support in $\{s : \delta/2 \le |s| \le \delta\}$, we have
\[ \|\xi'' f\|_2^2 \le a_3 \int_{\delta/2 \le |s| \le \delta} e^{-\gamma s^2 h^{-1} + a_4}\, ds \le a_5\, e^{-\gamma\delta^2/4h}. \]
In other words $\|\xi'' f\|_2$ decreases at an exponential rate as $h \to 0$. A similar argument applies to $\|\xi' f'\|_2$. Since $\phi_m$ is bounded on $\{s : |s| \le \delta\}$, uniformly for $|h| \le \delta^2$, we see that
\begin{align*}
\|\xi\phi_m f\|_2^2 &\le a_6 \int_{-\delta}^{\delta} |f(s)|^2\, ds \\
&\le \int_{-\delta}^{\delta} e^{-\gamma s^2 h^{-1} + a_7}\, ds \\
&\le a_8 h^{1/2}
\end{align*}
by an argument similar to that used for $\tilde f$ above. Putting the various inequalities together we obtain the statement of the theorem.
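The two-term expansion of the eikonal solution used in the proof, $\psi_{-1}(s) = i\eta s - iV_h'(a)s^2/(4\eta) + O(s^3)$, can be checked numerically. The following is a minimal sketch under stated assumptions: the test potential $V(x) = ix^2$, the choice $a = \eta = 1$, and all function names are hypothetical and chosen only for illustration. The integral is approximated with the composite Simpson rule, and the principal square root is used, which is adequate while the bracket under the root stays close to 1.

```python
import cmath

def psi_minus1(V, a, eta, s, steps=2000):
    """Simpson approximation of
    psi_{-1}(s) = int_0^s i*eta*(1 - (V(a+t) - V(a))/eta^2)^{1/2} dt."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = s / steps

    def g(t):
        return 1j * eta * cmath.sqrt(1 - (V(a + t) - V(a)) / eta ** 2)

    total = g(0.0) + g(s)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

def psi_minus1_quadratic(dV_a, eta, s):
    """Two-term expansion i*eta*s - i*V'(a)*s^2/(4*eta) quoted in the proof."""
    return 1j * eta * s - 1j * dV_a * s ** 2 / (4 * eta)

# Hypothetical test potential V(x) = i x^2, so that V'(1) = 2i.
V = lambda x: 1j * x * x
```

With these choices $\mathrm{Im}(V'(a)) = 2$ has the same sign as $\eta$, so the real part of $\psi_{-1}(s)$ should be close to $s^2/2$, and the difference between the two functions should shrink roughly like $s^3$ when $s$ is halved.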


3. High Energy Spectrum

By a change of scale our theorem can be applied to prove the instability of the high energy spectrum of a non-self-adjoint Schrödinger operator with complex potential. The results in this section extend those of [3] both by providing greater insight into the mechanism involved and by obtaining much stronger estimates. Adopting quantum mechanical notation we assume that $h = 1$ and that the operator $H$ acting in $L^2(\mathbb{R})$ is given by
\[ H := P^2 + \sum_{m=1}^{n} c_m Q^m, \]
where $n$ is even and the constant $c_n$ has positive real and imaginary parts.

Theorem 2. If $z \in \mathbb{C}$ satisfies $0 < \arg(z) < \arg(c_n)$ and $\sigma > 0$ then $\|(H - \sigma zI)^{-1}\|$ diverges to infinity faster than any power of $\sigma$ as $\sigma \to +\infty$.

Proof. If $u > 0$ then the operator $H$ is unitarily equivalent to the operator
\[ H_1 := u^{-2} P^2 + \sum_{m=1}^{n} c_m u^m Q^m. \]
Putting $u := \sigma^{1/n}$ we obtain
\[ \|(H - \sigma zI)^{-1}\| = \sigma^{-1} \|(H_2 - zI)^{-1}\|, \]
where
\[ H_2 := \sigma^{-1} H_1 = u^{-2-n} P^2 + \sum_{m=1}^{n} c_m u^{m-n} Q^m. \]
Putting $h := u^{-(n+2)/2}$ we have
\[ H_2 := h^2 P^2 + \sum_{m=1}^{n} c_m h^{2(n-m)/(n+2)} Q^m = h^2 P^2 + V_h(Q). \]
This is precisely the form of operator to which Theorem 1 applies. We have $V_0(a) = c_n a^n$ so $\mathrm{Im}(c_n) > 0$ implies that the conditions of Theorem 1 are satisfied for any $z$ in the sector
\[ U := \{ z : 0 < \arg(z) < \arg(c_n) \}. \]
This completes the proof, except for a technical point which we now address. In Theorem 1 we assumed that $z = \eta^2 + V_h(a)$ where $\mathrm{Im}(V_0'(a)) \neq 0$, so $z$ is apparently dependent on $h$, with the limit $z_0 := \eta^2 + V_0(a)$ as $h \to 0$. We rectify this problem by fixing $z \in U$ and making $a$ and $\eta$ depend upon $h$ in such a way that $z = \eta_h^2 + V_h(a_h)$, where $\eta_h \to \eta$ and $a_h \to a$ as $h \to 0$. We now have to check that all the estimates of Sect. 2 are locally uniform with respect to $\eta$ and $a$, so that the result we claim does indeed follow.
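The exponent bookkeeping in the rescaling of Theorem 2 is easy to get wrong, so it may help to check it mechanically. The sketch below is an illustration only (the function name is an assumption): it expresses $u$ as a power of $h$ using $h = u^{-(n+2)/2}$ and returns, as exact fractions, the powers of $h$ multiplying $P^2$ and each $Q^m$ in $H_2$.

```python
from fractions import Fraction

def rescaled_exponents(n):
    """Powers of h in H_2 = u^(-2-n) P^2 + sum_m c_m u^(m-n) Q^m,
    after substituting u = h^(-2/(n+2)), which inverts h = u^(-(n+2)/2)."""
    u_in_h = Fraction(-2, n + 2)               # u = h ** u_in_h
    kinetic = (-2 - n) * u_in_h                # exponent of h in front of P^2
    potential = {m: (m - n) * u_in_h for m in range(1, n + 1)}
    return kinetic, potential
```

For $n = 4$ this gives the exponent 2 in front of $P^2$ and the exponents 1, 2/3, 1/3, 0 in front of $Q, Q^2, Q^3, Q^4$, in agreement with the formula $2(n-m)/(n+2)$ and with the remark in Sect. 2 that $V_h$ involves fractional powers of $h$.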


The method of this paper can be extended to treat certain rotationally invariant problems in higher space dimensions. The condition $-2 \le p(1)$ in the next theorem is included because it is relevant to the existence of a closed extension of the operator, by virtue of an application of the theory of sectorial forms.

Theorem 3. Let the operator $H$ acting in $L^2(\mathbb{R}^N)$ be some closed extension of the operator given by
\[ Hf(x) := -\Delta f(x) + \sum_{m=1}^{n} c_m |x|^{p(m)} f(x) \]
for all $f$ in the initial domain $C_c^\infty(\mathbb{R}^N \setminus \{0\})$, where $c_n$ has positive real and imaginary parts, $p(n) > 0$ and $-2 \le p(1) < p(2) < \dots < p(n)$. If $z \in \mathbb{C}$ satisfies $0 < \arg(z) < \arg(c_n)$ and $\sigma > 0$ then $\|(H - \sigma zI)^{-1}\|$ diverges to infinity faster than any power of $\sigma$ as $\sigma \to +\infty$.

Proof. The difference from Theorem 2 is that after restricting to the usual angular momentum sectors the operators act in $L^2(0, \infty)$ and include angular momentum terms in the potential. However, it may be seen that the analysis of Theorem 2 can be extended to operators of the form
\[ H := P^2 + \sum_{m=1}^{n} c_m Q^{p(m)} \]
so the incorporation of the angular momentum terms causes no difficulties.

Note. Since the supports of the test functions used in the proof of the theorem are compact and move to infinity, Theorem 3 remains valid if we add a non-central potential to H, provided that potential decreases at infinity faster than any negative power of |x|. Weaker versions of Theorems 2 and 3 hold if one adjoins a potential which decreases more slowly at infinity. Acknowledgement. I should like to thank M. Kelbert and Y. Safarov for helpful comments.

References
1. Auscher, P., McIntosh, A., Tchamitchian, P.: Heat kernels of second order elliptic operators and applications. J. Funct. Anal., to appear
2. Davies, E.B.: Pseudospectra of differential operators. J. Oper. Theory, to appear
3. Davies, E.B.: Pseudospectra, the harmonic oscillator and complex resonances. Proc. Royal Soc. London, Ser. A, to appear
4. Duong, X.T., Robinson, D.W.: Semigroup kernels, Poisson bounds and holomorphic functional calculus. J. Funct. Anal., to appear
5. Edmunds, D.E., Evans, W.D.: Spectral Theory and Differential Operators. Oxford: Oxford Univ. Press, 1987
6. Kelbert, M., Sazonov, I.: Pulses and other Wave Processes in Fluids. Dordrecht, Boston, London: Kluwer Acad. Publ., 1996
7. Liskevich, V., Manavi, A.: Dominated semigroups with singular complex potentials. J. Funct. Anal. 151, 281–305 (1997)
8. Reddy, S.C., Trefethen, L.N.: Pseudospectra of the convection-diffusion operator. SIAM J. Applied Math. 54, 1634–1649 (1994)
9. Trefethen, L.N.: Pseudospectra of linear operators. SIAM Review 39, 383–406 (1997)

Communicated by B. Simon

Commun. Math. Phys. 200, 43 – 56 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Zeros of Graph-Counting Polynomials

David Ruelle

I.H.E.S., 35 route de Chartres, 91440 Bures-sur-Yvette, France. E-mail: [email protected]

Received: 4 May 1998 / Accepted: 12 June 1998

Abstract: Given a finite graph $E$ we define a family $A$ of subgraphs $F$ by restricting the number of edges of $F$ with endpoint at any vertex of $E$. Defining $Q_A(z) = \sum_{F \in A} z^{\mathrm{card}\,F}$, we can in many cases give precise information on the location of zeros of $Q_A(z)$ (zeros all real negative, all imaginary, etc.). Extensions of these results to weighted and infinite graphs are given.

1. Introduction and Statement of Results

This paper studies the location of zeros of polynomials
\[ Q_A(z) = \sum_{F \in A} z^{\mathrm{card}\,F}, \tag{1.1} \]
where $A$ is a set of subgraphs of a given finite graph $(V, E)$. The graph $(V, E)$ is defined by the vertex set $V$, the edge set $E$, and the two endpoints $j, k \in V$ of each $a \in E$ (we assume $j \neq k$, but allow several edges with the same endpoints). A subgraph $F$ is viewed as a subset of $E$. We shall consider sets $A$ of the general form
\[ A = \{ F \subset E : (\text{restrictions on the numbers of edges of } F \text{ with any endpoint } j \in V) \}. \]
We may write $A = A(V, E)$ to indicate the dependence on the graph $(V, E)$. Let $\sigma = \{\dots\}$ be a set of nonnegative integers (we shall consider the cases $\sigma = \{0, 1\}$, $\{1, 2\}$, $\{0, 1, 2\}$, $\{0, 2\}$, $\{0, 2, 4\}$, $\{\text{even}\}$, $\{\ge 1\}$, and also $\{< \max\}$ as explained below). A set $A = (\sigma) = (\{\dots\})$ of subgraphs of $(V, E)$ is defined by
\[ A = (\sigma) = \{ F \subset E : (\forall j)\ \mathrm{card}\{a \in F : j \text{ is an endpoint of } a\} \in \sigma \}. \tag{1.2} \]


In the case $\sigma = \{< \max\}$, the set $\sigma$ depends on $j$ and is
\[ \{< \max\} = \{ s \ge 0 : s < \text{number of edges of } E \text{ with endpoint } j \}. \]
Suppose that the graph $(V, E)$ is oriented by placing an arrow on each edge $a \in E$; at each vertex $j \in V$ there are thus ingoing and outgoing edges. Given two sets $\sigma'$, $\sigma''$ of nonnegative integers we define
\[ A = (\sigma' \to \sigma'') = \{ F \subset E : (\forall j)\ \mathrm{card}\{\text{outgoing edges of } F \text{ at } j\} \in \sigma' \text{ and } \mathrm{card}\{\text{ingoing edges of } F \text{ at } j\} \in \sigma'' \}. \tag{1.3} \]

In the cases $A = (\{\dots\})$ and $A = (\{\dots\} \to \{\dots\})$ as just defined, we impose the same restrictions at each vertex $j \in V$ and each edge $a \in E$. One could consider more general situations where several classes of vertices and edges are distinguished, and also study them by the methods of this paper. For simplicity we restrict ourselves to (1.2) and (1.3). Some of our results on the location of the zeros $Z$ of $Q_{(\sigma' \to \sigma'')}$ are summarized in the following table. For $Q_{(\sigma)}$ we obtain the same results as for $Q_{(\sigma \to \sigma)}$. Much more precise statements will be made below in Sect. 6 for $(\sigma)$ and in Sect. 7 for $(\sigma' \to \sigma'')$. Note also that the table may be completed by symmetry (the entry for $(\sigma' \to \sigma'')$ is the same as for $(\sigma'' \to \sigma')$).

σ' \ σ''                  {0,1}     {0,1,2}    {0,2},{0,2,4},{even}   {< max}     {≥ 1}
{0,1}                     Z real    Re Z < 0   Z imaginary            Re Z < 0    (Z = 0)
{0,1,2}                             Re Z < 0   Re Z² < 0              −           −
{0,2},{0,2,4},{even}                           −                      Im Z ≠ 0    −
{< max}                                                               −           −
{≥ 1}                                                                             (cardioid)

The polynomial $Q_{(\{0,1\})}$ counts dimer subgraphs; the fact that its zeros are real (and therefore negative) was first proved by Heilmann and Lieb [3]¹. The case of $Q_{(\{1,2\})}$ is similar (real zeros) and will be discussed in Sect. 6. The polynomial $Q_{(\{0,1,2\})}$ counts unbranched subgraphs; the fact that its zeros have negative real part was proved by Ruelle [8]. The other results appear to be new, for instance the fact that the zeros of $Q_{(\{0,1\} \to \{0,2\})}$ (which counts bifurcating subgraphs) are purely imaginary.

Our method of study of the polynomials $Q_A$ uses the Asano contraction (see [1, 6]) and Grace's theorem (see below). We are thus close to the ideas used in equilibrium statistical mechanics to study the zeros of the grand partition function, in particular the circle theorem of Lee and Yang ([4, 7], and references quoted there). The machinery of proof of the present paper is developed in Sects. 2 to 5. In Sect. 6 we deal with the polynomials $Q_{(\{\dots\})}$ and in Sect. 7 with the polynomials $Q_{(\{\dots\} \to \{\dots\})}$. Finally, Sect. 8 discusses the easy extension to (possibly infinite) graphs with weights.

¹ For a generalisation see Wagner [9], which contains further results and references on graph-counting polynomials with real zeros.


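2. Polynomials and Their Zeros

2.1. Subsets of C. We define closed subsets of $\mathbb{C}$ as follows:
\begin{align*}
\Delta &= \{z : \mathrm{Re}\,z = 1\} = \{\rho e^{i\theta} : \rho = \tfrac{1}{\cos\theta},\ \theta \in (-\tfrac{\pi}{2}, \tfrac{\pi}{2})\}, \\
\hat\Delta &= \{z : \mathrm{Re}\,z \ge 1\} = \{\rho e^{i\theta} : \rho \ge \tfrac{1}{\cos\theta},\ \theta \in (-\tfrac{\pi}{2}, \tfrac{\pi}{2})\}, \\
H &= \{\rho e^{i\theta} : \rho = \tfrac{1}{\sqrt{\cos 2\theta}},\ \theta \in (-\tfrac{\pi}{4}, \tfrac{\pi}{4})\}, \\
\hat H &= \{\rho e^{i\theta} : \rho \ge \tfrac{1}{\sqrt{\cos 2\theta}},\ \theta \in (-\tfrac{\pi}{4}, \tfrac{\pi}{4})\}, \\
P &= \{z = x + iy : 1 - x = \tfrac{y^2}{4}\} = \{\rho e^{i\theta} : \rho = \tfrac{2}{1 + \cos\theta},\ \theta \in (-\pi, \pi)\}, \\
\hat P &= \{z = x + iy : 1 - x \le \tfrac{y^2}{4}\} = \{\rho e^{i\theta} : \rho \ge \tfrac{2}{1 + \cos\theta},\ \theta \in (-\pi, \pi)\}.
\end{align*}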

Note that $H$ is the branch of the hyperbola $\{z = x + iy : x^2 - y^2 = 1\}$ in $\{z = x + iy : x > 0\}$; $P$ is a parabola with focus at 0.

2.2. Symmetric polynomials. We shall use symmetric polynomials $p_\sigma$ of the form
\[ p_\sigma(z_1, \dots, z_n) = a_0 + a_1 \sum_j z_j + a_2 \sum_{j<k} z_j z_k + a_3 \sum_{j<k<l} z_j z_k z_l + \dots, \]
where $\sigma$ is a set of integers $\ge 0$, and $a_j = 1$ for $j \in \sigma$, $a_j = 0$ for $j \notin \sigma$. We consider the cases $\sigma = \{0, 1\}$, $\{1, 2\}$, $\{0, 1, 2\}$, $\{0, 2\}$, $\{0, 2, 4\}$, $\{\text{even}\}$, $\{< \max\}$, $\{\ge 1\}$ as in Sect. 1, but other choices might be interesting². We use the same symbol $p_\sigma$ independently of the number of variables of the polynomial. This is a mild abuse of notation in the case of $p_{\{<\max\}}$ (because $\{< \max\} = \{0, 1, \dots, n - 1\}$ depends on the number $n$ of variables).

Proposition 2.1. To a symmetric polynomial $p_\sigma$ as above, with $a_0 \neq 0$, we may associate a closed set $G \subset \mathbb{C}$ such that $G \not\ni 0$ and $p(z_1, \dots, z_n) \neq 0$ if $z_1, \dots, z_n \notin G$, as follows:

(a) $p_{\{0,1\}}(z_1, \dots, z_n) = 1 + \sum_{j=1}^{n} z_j$; $G = -n^{-1} e^{-i\theta} \cos\theta\, \hat\Delta$ for any $\theta \in (-\pi/2, \pi/2)$.

(b) $p_{\{0,1,2\}}(z_1, \dots, z_n) = 1 + \sum_{j=1}^{n} z_j + \sum_{j<k} z_j z_k$; $G = -\frac{1}{n-1} \sqrt{\frac{2}{n}}\, e^{-i\theta} \sqrt{\cos 2\theta}\, \hat\Delta$ for any $\theta \in (-\frac{\pi}{4}, \frac{\pi}{4})$.

² Possibly interesting choices are $\{1, 3\}$, $\{\text{odd}\}$. The choices $\{1\}$, $\{2\}$, $\{\text{all}\}$ however are not useful for our intended applications.


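(c) We lump together several cases:
\begin{align*}
p_{\{0,2\}}(z_1, \dots, z_n) &= 1 + \sum_{j<k} z_j z_k, \tag{c1} \\
p_{\{0,2,4\}}(z_1, \dots, z_n) &= 1 + \sum_{j<k} z_j z_k + \sum_{j<k<l<m} z_j z_k z_l z_m, \tag{c2} \\
p_{\{\text{even}\}}(z_1, \dots, z_n) &= \frac{1}{2} \Big[ \prod_{j=1}^{n} (1 + z_j) + \prod_{j=1}^{n} (1 - z_j) \Big]. \tag{c3}
\end{align*}
$G = C\hat\Gamma$ where $\hat\Gamma = \Gamma \cup (\text{outside of } \Gamma)$, $\Gamma$ is any circle of finite radius through $\pm i$ and $C > 0$ is given by $C = \sqrt{\frac{2}{n(n-1)}}$ (c1), or $C$ is the smallest positive root of
\[ 1 - \frac{n(n-1)}{2} C^2 + \frac{n(n-1)(n-2)(n-3)}{2 \cdot 3 \cdot 4} C^4 = 0 \]
(c2), or $C = \tan\frac{\pi}{2n}$ (c3).

(d) $p_{\{<\max\}}(z_1, \dots, z_n) = \prod_{j=1}^{n} (1 + z_j) - \prod_{j=1}^{n} z_j$; $G = -\frac{1}{2}\hat\Delta$.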

Interesting limiting cases, where a0 = 0, are the following Pn P (a’) p{1,2} (z1 , . . . , zn ) = j=1 zj + j
1 − n−1

2 n

Notice also that for (c3) one can take
\[ G = \Big\{ z : \Big| \arg\frac{1 - z}{1 + z} \Big| \ge \frac{\pi}{n} \Big\} \]
as is seen by a simple calculation. Our study of the roots of $Q_A$ uses the following lemmas.

Lemma 2.2 (Grace's theorem). Let $Q(z)$ be a polynomial of degree $n$ with complex coefficients and $P(z_1, \dots, z_n)$ the only polynomial which is symmetric in its arguments, of degree 1 in each, and such that $P(z, \dots, z) = Q(z)$. If the $n$ roots of $Q$ are contained in a closed circular region $M$, and $z_1 \notin M, \dots, z_n \notin M$, then $P(z_1, \dots, z_n) \neq 0$.

A closed circular region is a closed subset of $\mathbb{C}$ bounded by a circle or a straight line. The coefficients of $z^n, z^{n-1}, \dots$ in $Q(z)$ are allowed to vanish; we then say that some of the roots of $Q(z)$ are at $\infty$, and we must then take $M$ noncompact.

Proof. See Polya and Szegő [5] V, Exercise 145.
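The symmetric multi-affine polynomial $P$ of Lemma 2.2 can be written down explicitly: if $Q(z) = \sum_k a_k z^k$ is regarded as having degree $n$, then $P = \sum_k a_k e_k / \binom{n}{k}$, where $e_k$ is the $k$-th elementary symmetric polynomial. The sketch below illustrates this standard formula; the function name is an assumption, and the code is not taken from the paper.

```python
from itertools import combinations
from math import comb, prod

def polarize(coeffs):
    """Given Q(z) = sum_k coeffs[k] * z**k, regarded as having degree
    n = len(coeffs) - 1, return the symmetric multi-affine function P
    of n arguments with P(z, ..., z) = Q(z)."""
    n = len(coeffs) - 1

    def P(*zs):
        if len(zs) != n:
            raise ValueError("expected %d arguments" % n)
        total = 0
        for k, a_k in enumerate(coeffs):
            # k-th elementary symmetric polynomial of the arguments
            e_k = sum(prod(c) for c in combinations(zs, k))
            total += a_k * e_k / comb(n, k)
        return total

    return P
```

For example, `polarize([1, 2, 1])` corresponds to $Q(z) = (1 + z)^2$ and gives $P(z_1, z_2) = (1 + z_1)(1 + z_2)$. Both roots of $Q$ lie in the disc $|z + 1| \le 1/2$, and $P$ does not vanish when both arguments are taken outside that disc, as Grace's theorem asserts.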

Lemma 2.3 (Ruelle). Let $K$, $L$ be closed subsets of the complex plane $\mathbb{C}$, which do not contain 0. Suppose that the complex polynomial
\[ \alpha + \beta z_1 + \gamma z_2 + \delta z_1 z_2 \]
can vanish only when $z_1 \in K$ or $z_2 \in L$. Then the polynomial obtained by the Asano contraction, namely $\alpha + \delta z$, can vanish only when $z \in -K * L$, where we have written $K * L = \{z' z'' : z' \in K, z'' \in L\}$.

Proof. For a proof see [6].

Lemma 2.4. Let the coefficients of the polynomials $Q_\lambda(z)$ of order $\le N$ tend to the coefficients of $Q(z)$ when $\lambda \to 0$. If the roots of $Q_\lambda$ are in the closed set $K \subset \mathbb{C}$ and if $Q$ does not vanish identically, then the roots of $Q$ are in $K$.

Proof. (Roots at infinity are ignored here, only finite roots are considered). If $Z \notin K$, we may choose $\epsilon > 0$ such that $\{z : |z - Z| \le \epsilon\} \cap K = \emptyset$ and $Q(z) \neq 0$ if $|z - Z| = \epsilon$. Since $Q_\lambda$ tends to $Q$ uniformly on the circle $\{z : |z - Z| = \epsilon\}$, the number of zeros of $Q_\lambda$ and $Q$ inside this circle is the same for small $\lambda$, hence $Q(Z)$ does not vanish.

3. Graphs

Let a finite graph be defined by the vertex set $V$, the edge set $E$, and the incidence set $I \subset V \times E$ such that $(j, a) \in I$ when $j$ is an endpoint of $a$. [We assume that every vertex is an endpoint of at least one edge, that the two endpoints of each edge are distinct, but we allow several edges between the same endpoints.] We denote by $I(j)$ the set of all $(j, a) \in I$, with fixed $j$.

Proposition 3.1. With the above notation, consider the product
\[ \prod_{j \in V} p_\sigma\big( (z_{aj})_{(j,a) \in I(j)} \big) \tag{3.1} \]
which is a polynomial in the variables $(z_{aj})_{(j,a) \in I}$, linear in each. If for each monomial of this product we replace each factor $z_{aj} z_{ak}$ (where $j, k$ are the endpoints of $a$) by $z_a$, and unmatched $z_{aj}$, $z_{ak}$ by 0, we obtain
\[ P_{(\sigma)}\big( (z_a)_{a \in E} \big) = \sum_{F \in (\sigma)} \prod_{a \in F} z_a. \]

Proof. Each monomial of $P_{(\sigma)}$ is of the form $\prod_{a \in F} z_a$, where $F \subset E$. By construction, the subgraphs $F$ which occur are precisely those for which $(\forall j)\ \mathrm{card}\{a \in F : j \text{ is an endpoint of } a\} \in \sigma$. This proves the proposition.
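The contraction rule of Proposition 3.1 can be checked by brute force on small graphs. The following sketch is an illustration under stated assumptions: vertices are numbered from 0, edges are given as a list of pairs, a monomial is represented by the set of its half-edge variables $(j, a)$, and all function names are hypothetical. It expands the product (3.1), keeps the monomials in which every edge is either matched at both endpoints or absent, and compares the result with the family $(\sigma)$ computed directly from (1.2).

```python
from itertools import combinations, product

def local_monomials(half_edges, sigma):
    """Monomials of p_sigma at one vertex: subsets S of its half-edges with card(S) in sigma."""
    return [frozenset(S)
            for k in sigma if k <= len(half_edges)
            for S in combinations(half_edges, k)]

def contracted_product(n_vertices, edges, sigma):
    """Expand the product (3.1) and apply z_aj * z_ak -> z_a, unmatched -> 0.
    Returns the surviving monomials as frozensets of edge indices."""
    incident = {j: [] for j in range(n_vertices)}
    for a, (j, k) in enumerate(edges):
        incident[j].append((j, a))
        incident[k].append((k, a))
    factors = [local_monomials(incident[j], sigma) for j in range(n_vertices)]
    survivors = []
    for choice in product(*factors):
        chosen = frozenset().union(*choice)
        F, ok = set(), True
        for a, (j, k) in enumerate(edges):
            has_j, has_k = (j, a) in chosen, (k, a) in chosen
            if has_j != has_k:   # unmatched half-edge: the monomial is sent to 0
                ok = False
                break
            if has_j:
                F.add(a)
        if ok:
            survivors.append(frozenset(F))
    return survivors

def direct_family(n_vertices, edges, sigma):
    """The family (sigma) of (1.2), by enumeration of all subsets F of E."""
    family = []
    for r in range(len(edges) + 1):
        for F in combinations(range(len(edges)), r):
            deg = [0] * n_vertices
            for a in F:
                j, k = edges[a]
                deg[j] += 1
                deg[k] += 1
            if all(d in sigma for d in deg):
                family.append(frozenset(F))
    return family
```

Each admissible $F$ arises from exactly one choice of local monomials, so the list returned by `contracted_product` has no repetitions and coincides, as a set, with `direct_family`. The enumeration is exponential in the number of edges and is meant only for very small examples.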


If the graph $(V, E)$ is oriented, the incidence set $I$ is the disjoint union of $I'$, $I''$ where $(j, a) \in I'$, $(k, a) \in I''$ mean that the edge $a$ is outgoing at $j$ and ingoing at $k$. Let us write
\[ I'(j) = I' \cap I(j), \qquad I''(j) = I'' \cap I(j). \]

Proposition 3.2. With the above notation, consider the product
\[ \prod_{j \in V} \Big[ p_{\sigma'}\big( (z_{aj})_{(j,a) \in I'(j)} \big)\, p_{\sigma''}\big( (z_{bj})_{(j,b) \in I''(j)} \big) \Big] = \prod_{j \in V} p_{\sigma'}\big( (z_{aj})_{(j,a) \in I'(j)} \big) \cdot \prod_{k \in V} p_{\sigma''}\big( (z_{ak})_{(k,a) \in I''(k)} \big) \tag{3.2} \]
which is a polynomial in the variables $(z_{aj})_{(j,a) \in I}$, linear in each. If for each monomial in this product we replace each factor $z_{aj} z_{ak}$ (where $j, k$ are the endpoints of $a$) by $z_a$, and unmatched $z_{aj}$, $z_{ak}$ by 0, we obtain
\[ P_{(\sigma' \to \sigma'')}\big( (z_a)_{a \in E} \big) = \sum_{F \in (\sigma' \to \sigma'')} \prod_{a \in F} z_a. \]

Proof. Each monomial of $P_{(\sigma' \to \sigma'')}$ is of the form $\prod_{a \in F} z_a$, where $F \subset E$. By construction, the subgraphs $F$ which occur are precisely those for which
\[ (\forall j)\ \mathrm{card}\{\text{outgoing edges of } F \text{ at } j\} \in \sigma', \qquad (\forall k)\ \mathrm{card}\{\text{ingoing edges of } F \text{ at } k\} \in \sigma''. \]
This proves the proposition.

Propositions 3.1 and 3.2 express that the polynomials $Q_A$ introduced in Sect. 1 can be obtained from a product (3.1) or (3.2) of polynomials $p_\sigma$ by repeated Asano contractions $z_{aj}, z_{ak} \to z_a$ (as described in Lemma 2.3), yielding $P_A((z_a)_{a \in E})$, then taking $Q_A(z) = P_A(z, \dots, z)$. One could in a similar way deal with more general situations than $A = (\sigma)$ or $A = (\sigma' \to \sigma'')$.

4. Geometric Results

We collect here some facts involving $\Delta$, $\hat\Delta$, $H$, $\hat H$, and $P$, $\hat P$ as defined in Sect. 2.1. Because $-\log\cos\theta$ is convex on $(-\pi/2, \pi/2)$ we have
\[ H * H = \hat H * \hat H = \hat\Delta, \tag{4.1} \]
\[ \Delta * \Delta = \hat\Delta * \hat\Delta = \hat P. \tag{4.2} \]
(We shall not use (4.1).) Since $e^{-i\theta} \cos\theta\, \Delta$ is, for $\theta \in (-\pi/2, \pi/2)$, any line through $+1$ except the real axis, and $e^{-i\theta} \cos\theta\, \hat\Delta$ is the corresponding closed half plane not containing 0, we have
\[ \bigcap_\theta e^{-i\theta} \cos\theta\, \hat\Delta = [1, +\infty). \tag{4.3} \]
Using (4.2) and simple geometry, it is also clear that
\[ \bigcap_\theta (e^{-i\theta} \cos\theta\, \hat\Delta) * (e^{-i\theta} \cos\theta\, \hat\Delta) = \bigcap_\theta e^{-2i\theta} \cos^2\theta\, \hat P = [1, +\infty). \tag{4.4} \]
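The inclusion of products of points of the line $\mathrm{Re}\,z = 1$ in the region bounded by the parabola of Sect. 2.1, which underlies (4.2), can be spot-checked numerically. The sketch below is an illustration only, with hypothetical function names. It uses the polar description $\rho \ge 2/(1 + \cos\theta)$ of the region, which is equivalent to $|z| + \mathrm{Re}\,z \ge 2$.

```python
def in_P_hat(z, tol=1e-12):
    """Membership in {rho e^{i theta} : rho >= 2/(1 + cos theta)}, i.e. |z| + Re z >= 2."""
    return abs(z) + z.real >= 2 - tol

def product_on_Delta(t, s):
    """Product of the two points 1 + it and 1 + is of the line Re z = 1."""
    return (1 + 1j * t) * (1 + 1j * s)
```

For $z = (1 + it)(1 + is)$ one has $|z|^2 - (1 + ts)^2 = (t - s)^2 \ge 0$, so the product always satisfies $|z| + \mathrm{Re}\,z \ge 2$, with equality (a point of the parabola itself) exactly when $t = s$.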


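Proposition 4.1. Let $G = \{re^{i\alpha} : r \ge \rho(\alpha)\}$, where $\rho(\cdot)$ is smooth, defined on $\mathbb{R} \pmod{2\pi}$, or on a closed interval of $\mathbb{R} \pmod{2\pi}$, or an open interval such that $\rho(\alpha) \to \infty$ when $\alpha$ tends to the endpoints of the interval. We assume that $\rho(\cdot) > 0$, and $\rho(\cdot) + \rho''(\cdot) > 0$. (The limit case $\rho(\alpha) + \rho''(\alpha) = 0$ arises when the osculating circle to the curve $\alpha \mapsto \rho(\alpha) e^{i\alpha}$ passes through 0). Then
\[ \bigcap_{\theta \in (-\pi/2, \pi/2)} e^{-i\theta} \cos\theta\, \hat\Delta * G = G. \]

Proof. Note that $G$ is closed $\not\ni 0$, and that $\mathbb{C} \setminus \hat\Delta * G$ is the open convex region around 0 bounded by the envelope $E$ of the lines $t \mapsto (1 + it)\rho(\alpha) e^{i\alpha}$ parametrized by $\alpha$. Expressing that the two real linear equations
\[ i\, dt\, \rho(\alpha) e^{i\alpha} + (1 + it)(i\rho(\alpha) + \rho'(\alpha)) e^{i\alpha}\, d\alpha = 0 \]
or
\[ i\, dt + (1 + it)\Big( i + \frac{\rho'(\alpha)}{\rho(\alpha)} \Big) d\alpha = 0 \]
have a vanishing determinant yields
\[ \begin{vmatrix} 0 & \rho'(\alpha)/\rho(\alpha) - t \\ 1 & 1 + t\rho'(\alpha)/\rho(\alpha) \end{vmatrix} = 0, \]
i.e., $t = \rho'(\alpha)/\rho(\alpha)$. The envelope $E$ is thus given parametrically by the map
\[ \alpha \mapsto E(\alpha) = (\rho(\alpha) + i\rho'(\alpha)) e^{i\alpha} \]
with derivative
\[ E'(\alpha) = i(\rho(\alpha) + \rho''(\alpha)) e^{i\alpha} \neq 0. \]
In particular $\rho(\cdot) + \rho''(\cdot) > 0$ implies that $E(\beta) - E(\alpha) \neq 0$ if $0 < \beta - \alpha \le \pi$, i.e., $E$ has no self-intersection. The tangent to $E$ at $E(\alpha)$ passes through the point $\rho(\alpha) e^{i\alpha}$ and is orthogonal to the vector $\rho(\alpha) e^{i\alpha}$. In other words, the orthogonal projection on $\{re^{i\alpha} : r > 0\}$ of $(\mathbb{C} \setminus \hat\Delta * G) \cap \{z : |\arg z - \alpha| < \pi/2\}$ is $\{re^{i\alpha} : r > 0\} \cap (\mathbb{C} \setminus G)$. Therefore
\[ \bigcup_{\theta \in (-\pi/2, \pi/2)} e^{-i\theta} \cos\theta\, (\mathbb{C} \setminus \hat\Delta * G) = \mathbb{C} \setminus G \]
or
\[ \bigcap_{\theta \in (-\pi/2, \pi/2)} e^{-i\theta} \cos\theta\, \hat\Delta * G = G \]
as announced.

4.1. Applications. Proposition 4.1 applies to $G = \hat\Delta, \hat H$. If $\Gamma$ is a circle containing 0 in its inside, the proposition also applies to $G = \Gamma \cup (\text{outside of } \Gamma)$. If $\Gamma$ is a circle which does not contain 0 in its inside, and $G = \Gamma \cup (\text{inside of } \Gamma)$, the proposition applies to $[1, +\infty) * G$:
\[ \bigcap_{\theta \in (-\pi/2, \pi/2)} e^{-i\theta} \cos\theta\, \hat\Delta * G = [1, +\infty) * G. \]

Proposition 4.2. Let the family $(\Gamma)$ consist of the circles (of finite radius) through $\pm i$, and $\hat\Gamma = \Gamma \cup (\text{outside of } \Gamma)$. Then
\[ \bigcap_\Gamma \hat\Delta * \hat\Gamma = \{z = x + iy : |y| \ge 1\}. \]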

Proof. Taking $\Gamma = \{A + Re^{i\theta} : \theta \in (-\pi, \pi]\}$ with $A$ real, $|A| < R$, we find (as in the proof of Proposition 4.1) that $\mathbb{C} \setminus \hat\Delta * \hat\Gamma$ is the region inside the envelope $E$ of the lines $t \mapsto (1 + it)(A + Re^{i\theta})$ parametrized by $\theta$. Direct calculation shows that $E$ is the ellipse
\[ E = \Big\{ z = x + iy : \Big( \frac{x - A}{R} \Big)^2 + \Big( \frac{y}{\sqrt{R^2 - A^2}} \Big)^2 = 1 \Big\}. \]
Here $\sqrt{R^2 - A^2} = 1$, and the union of the insides of the allowed ellipses is $\{z = x + iy : |y| < 1\}$, proving the proposition.

5. Calculations of $-G' * G''$

Each Asano contraction $z_{aj}, z_{ak} \to z_a$ which occurs in Proposition 3.1 or 3.2 involves a variable $z_{aj}$ of a polynomial $p'$ ($n'$ variables) and a variable $z_{ak}$ of a polynomial $p''$ ($n''$ variables). If $G'$, $G''$ are closed sets associated with $p'$, $p''$ in accordance with Proposition 2.1, we are led by Lemma 2.3 to computing $-G' * G''$ (and then taking an intersection over the possible choices of $G'$ and $G''$). We proceed by examining various possible cases, without striving for optimal results.

Case (a)–(a). We have $G' = -n'^{-1} e^{-i\theta'} \cos\theta'\, \hat\Delta$, $G'' = -n''^{-1} e^{-i\theta''} \cos\theta''\, \hat\Delta$, where we allow $\theta', \theta'' \in (-\pi/2, \pi/2)$. Therefore, using Proposition 4.1 and then (4.3),
\begin{align*}
\bigcap -G' * G'' &= -(n' n'')^{-1} \bigcap_{\theta' \theta''} (e^{-i\theta'} \cos\theta'\, \hat\Delta) * (e^{-i\theta''} \cos\theta''\, \hat\Delta) \\
&= -(n' n'')^{-1} \bigcap e^{-i\theta''} \cos\theta''\, \hat\Delta = -(n' n'')^{-1} [1, \infty) = (-\infty, -(n' n'')^{-1}]
\end{align*}
if $\theta'$, $\theta''$ are allowed to vary independently. If we impose $\theta' = \theta''$ the same result is obtained since, using (4.4),
\begin{align*}
\bigcap -G' * G'' &= -(n' n'')^{-1} \bigcap e^{-2i\theta} \cos^2\theta\, \hat\Delta * \hat\Delta = -(n' n'')^{-1} [1, +\infty) \\
&= (-\infty, -(n' n'')^{-1}].
\end{align*}

Case (a)–(-). Here $G' = -n'^{-1} e^{-i\theta'} \cos\theta'\, \hat\Delta$ with $\theta' \in (-\pi/2, \pi/2)$. In our applications, the $G''$ or $[1, \infty) * G''$ satisfy the conditions of Proposition 4.1, and we have thus
\[ \bigcap -G' * G'' = n'^{-1} \bigcap_{G''} ([1, +\infty) * G''). \]

Example. Suppose that we may take $G'' = C\hat\Gamma$ for all the circles $\Gamma$ (of finite radius) through $\pm i$. This situation occurs in case (c) of Proposition 2.1, and we have then
\[ \bigcap [1, +\infty) * G'' = \bigcap G'' = iC((-\infty, -1] \cup [1, +\infty)), \]
so that $\bigcap -G' * G''$ is the imaginary axis minus the interval $i\, n'^{-1} (-C, +C)$.

Case (a')–(a'). We have $G' = -\delta\, n'^{-1} e^{-i\theta'} \cos\theta'\, \hat\Delta$ and $G'' = -\delta\, n''^{-1} e^{-i\theta''} \cos\theta''\, \hat\Delta$, with $\theta', \theta'' \in (-\pi/2, \pi/2)$ and $0 < \delta \to 0$. These are the same expressions as in case (a)–(a), with an extra factor $\delta$. Therefore
\[ \bigcap -G' * G'' = (-\infty, -\delta^2 (n' n'')^{-1}] \subset (-\infty, 0]. \]

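Case (a')–(-). We have the same expressions for $\bigcap -G' * G''$ as in case (a)–(-), but with an extra factor $\delta$. For example in the case (a')–(b) we have
\[ \bigcap -G' * G'' \subset \{z = x + iy : x \le 0,\ |y| \le |x|\}. \]
In case (a')–(c), we have $\bigcap -G' * G'' \subset$ imaginary axis.

Case (b)–(b). We may take
\[ -G' * G'' = -\frac{1}{(n' - 1)(n'' - 1)} \frac{2}{\sqrt{n' n''}}\, e^{-i\theta'} e^{-i\theta''} \sqrt{\cos 2\theta'} \sqrt{\cos 2\theta''}\, \hat\Delta * \hat\Delta \]
with $\theta', \theta'' \in (-\pi/4, \pi/4)$. Using Proposition 4.1, we find
\[ \bigcap e^{-i\theta'} e^{-i\theta''} \sqrt{\cos 2\theta'} \sqrt{\cos 2\theta''}\, \hat\Delta * \hat\Delta \subset \bigcap_{\phi \in (-\pi/2, \pi/2)} e^{-i\phi} \cos\phi\, \hat\Delta * \hat\Delta = \hat\Delta \]
so that
\[ \bigcap -G' * G'' \subset -\frac{1}{(n' - 1)(n'' - 1)} \frac{2}{\sqrt{n' n''}}\, \hat\Delta. \]
If $n' = n'' = n$, we recover the result of [8]:
\[ \bigcap -G' G'' = -\frac{2}{n(n - 1)^2}\, \hat\Delta. \]
This result could be somewhat improved, using circular regions for $G'$, $G''$.

Case (b)–(c). We may take $G' = -\frac{1}{n' - 1} \sqrt{\frac{2}{n'}}\, e^{-i\theta'} \sqrt{\cos 2\theta'}\, \hat\Delta$ and $G'' = C'' \hat\Gamma$, where $\Gamma$ is any circle (of finite radius) through $\pm i$. Thus, using Proposition 4.2 and the fact that the $e^{i\theta} \sqrt{\cos 2\theta}\, \Delta$ are the tangents to $H$, we find
\begin{align*}
\bigcap -G' * G'' &= \frac{C''}{n' - 1} \sqrt{\frac{2}{n'}} \bigcap_{\theta' \in (-\pi/4, \pi/4)} e^{-i\theta'} \sqrt{\cos 2\theta'} \bigcap_\Gamma \hat\Delta * \hat\Gamma \\
&= \frac{C''}{n' - 1} \sqrt{\frac{2}{n'}} \bigcap_{\theta \in (-\pi/4, \pi/4)} e^{-i\theta} \sqrt{\cos 2\theta}\, \{z = x + iy : |y| \ge 1\} \\
&= \frac{iC''}{n' - 1} \sqrt{\frac{2}{n'}} \bigcap_{\theta \in (-\pi/4, \pi/4)} e^{-i\theta} \sqrt{\cos 2\theta}\, ((-\hat\Delta) \cup \hat\Delta) \\
&= \frac{iC''}{n' - 1} \sqrt{\frac{2}{n'}}\, ((-\hat H) \cup \hat H) = \frac{C''}{n' - 1} \sqrt{\frac{2}{n'}}\, \{z = x + iy : y^2 - x^2 \ge 1\}.
\end{align*}
This result could be improved using for $G'$ all closed half-planes $\ni \zeta'_\pm$ and $\not\ni 0$.

Case (b)–(d). For some $\epsilon > 0$ we may take $G' = -\epsilon e^{-i\theta'} \hat\Delta$, any $\theta' \in (-\pi/4, \pi/4)$, and $G'' = -\frac{1}{2} \hat\Delta$. Thus, using (4.2), we get
\[ -G' * G'' \subset \mathbb{C} \setminus \{z = \rho e^{i\theta} : \rho > 0,\ \theta \in (-\tfrac{\pi}{4}, \tfrac{\pi}{4})\}. \]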

Case (b)–(d'). For some $\epsilon > 0$ we may take $G' = -\epsilon e^{-i\theta'} \hat\Delta$, any $\theta' \in (-\pi/4, \pi/4)$, and $G'' = -\delta \hat\Delta$, $0 < \delta \to 0$. Thus
\[ \bigcap -G' * G'' = \mathbb{C} \setminus \{z = \rho e^{i\theta} : \rho > 0,\ \theta \in (-\tfrac{\pi}{4}, \tfrac{\pi}{4})\}. \]

Case (c)–(c). If $G' = C' \hat\Gamma$, $G'' = C'' \hat\Gamma$ with circles $\Gamma$ (of finite radius) through $\pm i$ varying independently for $G'$, $G''$, one finds
\[ \bigcap -G' * G'' = C' C'' \{z : |z| \ge 1\}. \]
This result can be only slightly improved when one has
\[ G' = G'' = \Big\{ z : \Big| \arg\frac{1 - z}{1 + z} \Big| \ge \frac{\pi}{n} \Big\}. \]

Case (c)–(d). Taking $G' = C' \hat\Gamma$ where $\Gamma$ is any circle (of finite radius) through $\pm i$ and $G'' = -\frac{1}{2} \hat\Delta$ we have, using Proposition 4.2,
\[ \bigcap -G' * G'' = \frac{C'}{2} \bigcap \hat\Gamma * \hat\Delta = \frac{C'}{2} \{z = x + iy : |y| \ge 1\}. \]

Case (d)–(d). Taking $G' = G'' = -\frac{1}{2} \hat\Delta$ we have, using (4.2),
\[ -G' * G'' = -\frac{1}{4} \hat\Delta * \hat\Delta = -\frac{1}{4} \hat P. \]

Case (d')–(d'). Taking $G' = G'' = \{z : |z + 1| \le 1 - \delta\}$, with $0 < \delta \to 0$, we have
\[ G' = G'' \subset -\{z = \rho e^{i\theta} : \rho \le 2\cos\theta,\ \theta \in (-\pi/2, \pi/2)\}, \]
hence
\[ -G' * G'' = \{z = \rho e^{i\theta} : \rho \le 2(1 - \cos\theta),\ \theta \in (-\pi, \pi]\}. \]
(This region is bounded by a cardioid.)

6. Zeros of Graph-Counting Polynomials

In this section we consider the polynomial
\[ Q_A(z) = \sum_{F \in A} z^{\mathrm{card}\,F} \]
with $A = (\sigma)$, and make assertions on the location of its zeros for various choices of $\sigma$. Following Propositions 3.1, 2.1, and Lemma 2.3 we have to compute sets $-G * G$. The computations have mostly been done in Sect. 5, and we can here simply read off the results. In what follows, $n$ will denote the maximum number of edges with endpoints at any vertex (degree of the graph $E$).

$A = (\{0, 1\})$. Here $A$ consists of dimer subgraphs $F$: each vertex is an endpoint of at most one edge of $F$. All the zeros of $Q_{(\{0,1\})}$ are real (hence $< 0$), as first proved by Heilmann and Lieb [3]. Indeed by case (a)–(a) of Sect. 5 we find that $Q_{(\{0,1\})}(z)$ can vanish only for $z \in (-\infty, -n^{-2}]$.

$A = (\{1, 2\})$. The subgraphs $F$ occurring in $(\{1, 2\})$ are those unbranched subgraphs which fill $E$. Here all the zeros are real $\le 0$. Indeed, by case (a')–(a') of Sect. 5 and Lemma 2.4 we see that $Q_{(\{1,2\})}(z)$ can vanish only for $z \in (-\infty, 0]$.

$A = (\{0, 1, 2\})$. Here $A$ consists of the unbranched subgraphs $F$ of $E$ and the zeros of $Q_A = Q_{(\{0,1,2\})}$ have negative real part (Ruelle [8]). Indeed by case (b)–(b) of Sect. 5, $Q_{(\{0,1,2\})}$ can vanish only if $\mathrm{Re}\,z \le -2/n(n - 1)^2$, where we have assumed $n \ge 2$.

$A = (\{\text{even}\})$. Let $E$ be a piece of square lattice in the plane, and $A$ consist of those subgraphs $F$ such that each vertex $j \in V$ is an endpoint of exactly 0, 2, or 4 edges $\in F$ (boundary subgraphs). Fisher [2] has presented evidence that in the limit where $E$ is large (as a piece of square lattice), the zeros of $Q_A$ lie asymptotically on the two circles $\{z : |z \pm 1| = \sqrt{2}\}$. This conjecture of Fisher, together with the results presented here, raises the hope that the zeros of graph-counting polynomials tend to be localized on curves under fairly general circumstances.

$A = (\{< \max\})$. For each vertex $j \in V$, the subgraphs $F$ which occur in $(\{< \max\})$ have strictly fewer edges with endpoint $j$ than $E$ has. By case (d)–(d) of Sect. 5, $Q_{(\{<\max\})}(z)$ can vanish only for $z \in -\frac{1}{4} \hat P$.

$A = (\{\ge 1\})$. The zeros of $Q_{(\{\ge 1\})}$ are the inverses of those of $Q_{(\{<\max\})}$ and therefore lie in
\[ -\{z = \rho e^{i\theta} : \rho \le 2(1 + \cos\theta),\ \theta \in (-\pi, \pi)\}, \]
i.e., in a region bounded by a cardioid. This also follows from case (d')–(d') of Sect. 5.
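The statements of this section can be tested by brute force on small graphs. The following sketch is an illustration only, with assumed function names and example graphs: it computes the coefficients of $Q_A$ for $A = (\sigma)$ by enumerating all subsets of edges, and solves the resulting quadratic when the polynomial has degree 2.

```python
import cmath
from itertools import combinations

def graph_counting_coeffs(n_vertices, edges, sigma):
    """Coefficients [q_0, q_1, ...] of Q_A(z) = sum over F in (sigma) of z**card(F)."""
    coeffs = [0] * (len(edges) + 1)
    for r in range(len(edges) + 1):
        for F in combinations(edges, r):
            deg = [0] * n_vertices
            for j, k in F:
                deg[j] += 1
                deg[k] += 1
            if all(d in sigma for d in deg):
                coeffs[r] += 1
    return coeffs

def quadratic_roots(c0, c1, c2):
    """Roots of c0 + c1*z + c2*z**2, assuming c2 != 0."""
    disc = cmath.sqrt(c1 * c1 - 4 * c2 * c0)
    return ((-c1 + disc) / (2 * c2), (-c1 - disc) / (2 * c2))
```

For the path with edges (0,1), (1,2), (2,3) and $\sigma = \{0, 1\}$ one finds $Q_A(z) = 1 + 3z + z^2$, whose roots $(-3 \pm \sqrt{5})/2$ are real and lie in $(-\infty, -n^{-2}] = (-\infty, -1/4]$ since $n = 2$. For the triangle and $\sigma = \{0, 1, 2\}$ one finds $(1 + z)^3$, whose root $-1$ attains the bound $-2/n(n-1)^2 = -1$.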

7. Oriented Graphs

Interesting families of subgraphs can be defined when $(V, E)$ is oriented. Remember that the incidence set $I$ is the disjoint union of $I'$, $I''$, where $(j, a) \in I'$, $(k, a) \in I''$ mean that the edge $a$ begins at $j$ and ends at $k$. We define

$$n' = \max_j \operatorname{card}\{a \in E : (j, a) \in I'\}, \qquad n'' = \max_k \operatorname{card}\{a \in E : (k, a) \in I''\}.$$

We are here concerned with families $A = (\sigma' \to \sigma'')$ of subgraphs such that the number of edges of $F$ originating at a vertex and the number of edges ending at a vertex take restricted sets of values $\sigma'$ and $\sigma''$. The following proposition gives a variety of results on the location of zeros of polynomials of the form $Q_{(\sigma' \to \sigma'')}$, without exhausting possibilities, or giving necessarily best possible results (for improvements the reader is referred to the easy proofs in Sect. 5).


Proposition 7.1. Let again

$$Q_A(z) = \sum_{F \in A} z^{\operatorname{card} F}.$$

We shall write $C'$, $C''$ for the quantity obtained by the replacement $n \to n', n''$ in the definition of $C$ in Proposition 2.1(c). Then

$Q_{(\{0,1\}\to\{0,1\})}$ has real zeros, located on $(-\infty, -(n'n'')^{-1}]$. Also $Q_{(\{1,2\}\to\{1,2\})}$ has real zeros, located on $(-\infty, 0]$.

$Q_{(\{0,1\}\to\{0,1,2\})}$ has zeros with real part $\le -1/n'(n''-1)$ (we assume $n'' \ge 2$). More precisely, the zeros are in $n'^{-1}X''$, where $X''$ is obtained by the replacement $n \to n''$ in the definition of $X$ in Proposition 2.1(b); real zeros are thus $\le -2/n'n''$. In particular $Q_{(\{0,1\}\to\{0,1,2\})}$ has its zeros in $\{z = x + iy : x < 0,\ |y| < |x|\}$. Also $Q_{(\{1,2\}\to\{0,1,2\})}$ has its zeros in $\{z = x + iy : x \le 0,\ |y| \le |x|\}$.

$Q_{(\{0,1\}\to\{0,2\})}$, $Q_{(\{0,1\}\to\{0,2,4\})}$, $Q_{(\{0,1\}\to\{\mathrm{even}\})}$ have purely imaginary zeros, located on $\{z = iy : y^2 \ge n'^{-2}C''^2\}$. Also $Q_{(\{1,2\}\to\{0,2\})}$, $Q_{(\{1,2\}\to\{0,2,4\})}$, $Q_{(\{1,2\}\to\{\mathrm{even}\})}$ have purely imaginary zeros.

$Q_{(\{0,1\}\to\{<\max\})}$ has zeros with real part $\le -n'^{-1}/2$. Also $Q_{(\{1,2\}\to\{<\max\})}$, $Q_{(\{1,2\}\to\{\ge 1\})}$ have zeros with real part $\le 0$.

$Q_{(\{0,1,2\}\to\{0,1,2\})}$ has zeros with real part $\le -\dfrac{2}{(n'-1)(n''-1)}\,\dfrac{1}{\sqrt{n'n''}}$.

$Q_{(\{0,1,2\}\to\{0,2\})}$, $Q_{(\{0,1,2\}\to\{0,2,4\})}$, $Q_{(\{0,1,2\}\to\{\mathrm{even}\})}$ have no real zeros; these polynomials are of the form $F(z^2)$ where the zeros of $F$ have real part $\le -\dfrac{2C''^2}{n'(n'-1)^2}$.

$Q_{(\{0,1,2\}\to\{<\max\})}$ has its zeros in $\mathbb{C} \setminus \{z = \rho e^{i\theta} : \rho \ge 0,\ \theta \in [-\frac{\pi}{4}, \frac{\pi}{4}]\}$; $Q_{(\{0,1,2\}\to\{\ge 1\})}$ has its zeros in $\mathbb{C} \setminus \{z = \rho e^{i\theta} : \rho > 0,\ \theta \in (-\frac{\pi}{4}, \frac{\pi}{4})\}$.

$Q_{(\{0,2\}\to\{<\max\})}$, $Q_{(\{0,2,4\}\to\{<\max\})}$, $Q_{(\{\mathrm{even}\}\to\{<\max\})}$ have their zeros in $\{z = x + iy : |y| > C'/2\}$; these zeros are thus never real.

$Q_{(\{<\max\}\to\{<\max\})}$ has its zeros in $\{z = x + iy : 1 + 4x \ge 4y^2\}$.

$Q_{(\{\ge 1\}\to\{\ge 1\})}$ has its zeros in the region $-\{z = \rho e^{i\theta} : \rho \le 2(1 + \cos\theta),\ \theta \in (-\pi, \pi)\}$ (bounded by a cardioid).

Reversing the direction of the arrows produces more polynomials for which one has information on the location of the zeros.

Proof. The proposition results from the calculations of Sect. 5.

We have omitted some trivial cases from the discussion. Note for instance that $Q_{(\{0,1\}\to\{\ge 1\})} = \mathrm{const.}\ z^{\operatorname{card} V}$, which can vanish only at $z = 0$.
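The oriented families $(\sigma' \to \sigma'')$ can be explored in the same brute-force way. The sketch below is illustrative only: the function name, the use of predicates for $\sigma'$ and $\sigma''$, and the test graph (the complete oriented graph on three vertices, without loops) are assumptions made here. It shows the trivial case just mentioned, where only the coefficient of $z^{\operatorname{card} V}$ survives, and one instance of $Q_{(\{0,1\}\to\{0,1\})}$.

```python
from itertools import combinations


def oriented_counting_polynomial(vertices, edges, sigma_out, sigma_in):
    """Coefficients of Q_(sigma' -> sigma'')(z) for an oriented graph.

    Each edge (j, k) begins at j and ends at k. sigma_out and sigma_in are
    predicates on the number of edges of F originating, resp. ending, at a
    vertex.
    """
    coeffs = [0] * (len(edges) + 1)
    for m in range(len(edges) + 1):
        for subset in combinations(edges, m):
            out_deg = {v: 0 for v in vertices}
            in_deg = {v: 0 for v in vertices}
            for (j, k) in subset:
                out_deg[j] += 1
                in_deg[k] += 1
            if all(sigma_out(out_deg[v]) and sigma_in(in_deg[v])
                   for v in vertices):
                coeffs[m] += 1
    return coeffs


def at_most_one(d):
    return d <= 1


def at_least_one(d):
    return d >= 1


# Hypothetical test graph: all oriented edges between 3 distinct vertices.
tri_vertices = [0, 1, 2]
tri_edges = [(j, k) for j in tri_vertices for k in tri_vertices if j != k]

# ({0,1} -> {>=1}): only the coefficient of z^card(V) is nonzero.
trivial_case = oriented_counting_polynomial(
    tri_vertices, tri_edges, at_most_one, at_least_one)

# ({0,1} -> {0,1}): an oriented analogue of the dimer polynomial.
oriented_dimers = oriented_counting_polynomial(
    tri_vertices, tri_edges, at_most_one, at_most_one)

print(trivial_case)
print(oriented_dimers)
```

In this example $Q_{(\{0,1\}\to\{0,1\})}(z) = 1 + 6z + 9z^2 + 2z^3 = (2z+1)(z^2+4z+1)$, so its zeros $-1/2$ and $-2 \pm \sqrt{3}$ are real and negative.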

8. Graphs with Weights and Infinite Graphs

Suppose that a weight $W_a > 0$ is given to each $a \in E$ and replace $Q_A(z)$ by the weighted polynomial

$$Q_A^W(z) = \sum_{F \in A} \Big(\prod_{a \in F} W_a\Big) z^{\operatorname{card} F}.$$

We note that $Q_A^W(z)$ is obtained from $P_A((z_a)_{a \in E})$ by taking $z_a = W_a z$. In the cases which we considered we had $P_A((z_a)_{a \in E}) \ne 0$ when $z_a \notin \cap -G' * G''$. We have thus $Q_A^W(z) = 0$ only when $z \in \cup_{\lambda > 0} \lambda \cap -G' * G''$. This gives a number of easy results as follows:


Proposition 8.1. $Q^W_{(\{0,1\})}$ and $Q^W_{(\{0,1\}\to\{0,1\})}$ have real zeros $< 0$; $Q^W_{(\{1,2\})}$ and $Q^W_{(\{1,2\}\to\{1,2\})}$ have real zeros $\le 0$.

$Q^W_{(\{0,1,2\})}$, $Q^W_{(\{0,1\}\to\{<\max\})}$, $Q^W_{(\{0,1,2\}\to\{0,1,2\})}$ have zeros with real part $< 0$; $Q^W_{(\{1,2\}\to\{<\max\})}$, $Q^W_{(\{1,2\}\to\{\ge 1\})}$ have zeros with real part $\le 0$.

$Q^W_{(\{0,1\}\to\{0,2\})}$, $Q^W_{(\{0,1\}\to\{0,2,4\})}$, $Q^W_{(\{0,1\}\to\{\mathrm{even}\})}$, $Q^W_{(\{1,2\}\to\{0,2\})}$, $Q^W_{(\{1,2\}\to\{0,2,4\})}$, $Q^W_{(\{1,2\}\to\{\mathrm{even}\})}$ have purely imaginary zeros.

$Q^W_{(\{0,1\}\to\{0,1,2\})}$ has its zeros in $\{z = x + iy : x < 0,\ |y| < |x|\}$; $Q^W_{(\{1,2\}\to\{0,1,2\})}$ has its zeros in $\{z = x + iy : x \le 0,\ |y| \le |x|\}$.

$Q^W_{(\{0,1,2\}\to\{0,2\})}$, $Q^W_{(\{0,1,2\}\to\{0,2,4\})}$, $Q^W_{(\{0,1,2\}\to\{\mathrm{even}\})}$ have their zeros in $\{z = x + iy : |y| > |x|\}$ (they are polynomials of the form $F(z^2)$ where the zeros of $F$ have real part $< 0$).

$Q^W_{(\{0,1,2\}\to\{<\max\})}$ has its zeros in $\mathbb{C} \setminus \{z = \rho e^{i\theta} : \rho \ge 0,\ \theta \in [-\frac{\pi}{4}, \frac{\pi}{4}]\}$; $Q^W_{(\{0,1,2\}\to\{\ge 1\})}$ has its zeros in $\mathbb{C} \setminus \{z = \rho e^{i\theta} : \rho > 0,\ \theta \in (-\frac{\pi}{4}, \frac{\pi}{4})\}$.

$Q^W_{(\{0,2\}\to\{<\max\})}$, $Q^W_{(\{0,2,4\}\to\{<\max\})}$, $Q^W_{(\{\mathrm{even}\}\to\{<\max\})}$ have no real zeros.

Reversing the direction of the arrows does not change the information given here on the zeros.

Proof. The proofs are the same as for Proposition 7.1.

We have defined a family $A = (\ldots)$ (resp. $A = (\ldots \to \ldots)$) of subgraphs $F$ of any finite graph $E$ by restricting the number $\nu$ of allowed edges with a given endpoint (resp. the number $\nu'$ of ingoing edges and $\nu''$ of outgoing edges). Suppose that 0 is an allowed value of $\nu$ (resp. of $\nu'$ and $\nu''$). Then we can in a natural manner define the corresponding family $A$ of subgraphs of a (countable) infinite graph $E$. Giving weights $W_a > 0$ to the edges $a \in E$, and assuming $\sum_a W_a < +\infty$, we define

$$Q_A^W(z) = \sum_{F \in A} \Big(\prod_{a \in F} W_a\Big) z^{\operatorname{card} F}.$$

We have

$$|Q_A^W(z)| \le \prod_{a \in E} (1 + W_a|z|),$$

and therefore $Q_A^W$ is an entire analytic function. Let $V^*$ be a finite subset of the infinite set $V$ of vertices associated with $E$, and $E^* = \{a \in E : \text{both endpoints of } a \text{ are} \in V^*\}$. Also let $A(V^*, E^*)$ consist of the subgraphs of $E^*$ which are in $A$. Then

$$Q^*(z) = \sum_{F \in A(V^*, E^*)} \Big(\prod_{a \in F} W_a\Big) z^{\operatorname{card} F}$$

satisfies

$$|Q^*(z)| \le \prod_{a \in E^*} (1 + W_a|z|)$$

and the coefficients of $Q^*$ tend to those of $Q_A^W$ when $V^*$ tends to $V$. Therefore $Q^*$ tends to $Q_A^W$ uniformly on compacts, and the zeros of $Q_A^W$ must be limits of zeros of $Q^*$. This yields the following result.
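The truncation argument above can be illustrated numerically. The following Python sketch is a hedged illustration, not the author's construction: it assumes a half-infinite path with summable weights $W_a = 2^{-a}$ (an arbitrary choice made here), truncates it to finitely many edges, computes the weighted dimer polynomial $Q^*$, and compares $|Q^*(z)|$ with the product bound $\prod_{a \in E^*}(1 + W_a|z|)$. All function names are hypothetical.

```python
from itertools import combinations
from math import prod


def weighted_polynomial(vertices, weights, sigma):
    """Coefficients of Q^W_A(z) = sum_F (prod_{a in F} W_a) z^card(F).

    weights maps each edge (j, k) to W_a > 0; A = (sigma) restricts the
    number of edges of F at each vertex to the set sigma.
    """
    edges = list(weights)
    coeffs = [0.0] * (len(edges) + 1)
    for m in range(len(edges) + 1):
        for subset in combinations(edges, m):
            degree = {v: 0 for v in vertices}
            for (j, k) in subset:
                degree[j] += 1
                degree[k] += 1
            if all(degree[v] in sigma for v in vertices):
                coeffs[m] += prod(weights[a] for a in subset)
    return coeffs


def evaluate(coeffs, z):
    """Evaluate the polynomial with the given coefficients at z."""
    return sum(c * z ** m for m, c in enumerate(coeffs))


def product_bound(weights, z):
    """The bound prod_a (1 + W_a |z|) from the text."""
    return prod(1.0 + w * abs(z) for w in weights.values())


def truncated_path(size):
    """First `size` edges of a half-infinite path, with W_a = 2^-(a+1)."""
    vertices = list(range(size + 1))
    weights = {(i, i + 1): 2.0 ** -(i + 1) for i in range(size)}
    return vertices, weights


for size in (2, 4, 8):
    vertices, weights = truncated_path(size)
    coeffs = weighted_polynomial(vertices, weights, {0, 1})
    z = 1.0 + 2.0j
    print(size, coeffs[1], abs(evaluate(coeffs, z)), product_bound(weights, z))
```

The coefficient of $z$ in the truncation with $N$ edges is $1 - 2^{-N}$, which converges as $N$ grows; this is the coefficientwise convergence used in the text.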


Proposition 8.2. For infinite $E$, the zeros of $Q^W_A$ are localized as follows:

$Q^W_{(\{0,1\})}$ and $Q^W_{(\{0,1\}\to\{0,1\})}$ have real zeros $< 0$.

$Q^W_{(\{0,1,2\})}$, $Q^W_{(\{0,1\}\to\{<\max\})}$, $Q^W_{(\{0,1,2\}\to\{0,1,2\})}$ have zeros with real part $\le 0$.

$Q^W_{(\{0,1\}\to\{0,2\})}$, $Q^W_{(\{0,1\}\to\{0,2,4\})}$, $Q^W_{(\{0,1\}\to\{\mathrm{even}\})}$ have purely imaginary zeros.

$Q^W_{(\{0,1\}\to\{0,1,2\})}$ has zeros in $\{z = x + iy : x \le 0,\ |y| \le |x|\}$.

$Q^W_{(\{0,1,2\}\to\{0,2\})}$, $Q^W_{(\{0,1,2\}\to\{0,2,4\})}$, $Q^W_{(\{0,1,2\}\to\{\mathrm{even}\})}$ have zeros in $\{z = x + iy : |y| \ge |x|\}$ (they are of the form $F(z^2)$ where the zeros of $F$ have real part $\le 0$).

$Q^W_{(\{0,1,2\}\to\{<\max\})}$ has its zeros in $\mathbb{C} \setminus \{z = \rho e^{i\theta} : \rho > 0,\ \theta \in (-\frac{\pi}{4}, \frac{\pi}{4})\}$.

Reversing the direction of the arrows does not change the information given here on the zeros.

Proof. In view of what was said above, this is a direct consequence of Proposition 8.1.

References

1. Asano, T.: Theorems on the partition functions of the Heisenberg ferromagnets. J. Phys. Soc. Jap. 29, 350–359 (1970)
2. Fisher, M.E.: The nature of critical points. In: Lectures in Theoretical Physics 7c. Boulder: University of Colorado Press, 1964, pp. 1–159
3. Heilmann, O.J. and Lieb, E.H.: Theory of monomer-dimer systems. Commun. Math. Phys. 25, 190–232 (1972); 27, 166 (1972)
4. Lee, T.D. and Yang, C.N.: Statistical theory of equations of state and phase relations. II. Lattice gas and Ising model. Phys. Rev. 87, 410–419 (1952)
5. Polya, G. and Szegö, G.: Aufgaben und Lehrsätze aus der Analysis. 2 Vol., 3rd ed., Berlin: Springer, 1964
6. Ruelle, D.: Extension of the Lee–Yang circle theorem. Phys. Rev. Letters 26, 303–304 (1971)
7. Ruelle, D.: Some remarks on the location of zeroes of the partition function for lattice systems. Commun. Math. Phys. 31, 265–277 (1973)
8. Ruelle, D.: Counting unbranched subgraphs. J. Algebraic Combinatorics, to appear (archived as mp arc 98-104)
9. Wagner, D.J.: Multipartition series. S.I.A.M. J. Discrete Math. 9, 529–544 (1996)

Communicated by A. Jaffe

Commun. Math. Phys. 200, 57 – 103 (1999)


Modular Invariants, Graphs and α-Induction for Nets of Subfactors. II

Jens Böckenhauer, David E. Evans

School of Mathematics, University of Wales Cardiff, PO Box 926, Senghennydd Road, Cardiff CF2 4YH, Wales, UK

Received: 11 May 1998 / Accepted: 16 June 1998

Abstract: We apply the theory of α-induction of sectors which we elaborated in our previous paper to several nets of subfactors arising from conformal field theory. The main applications are conformal embeddings and orbifold inclusions of SU(n) WZW models. For the latter, we construct the extended net of factors by hand. Developing further some ideas of F. Xu, our treatment leads canonically to certain fusion graphs, and in all our examples we rediscover the graphs Di Francesco, Petkova and Zuber associated empirically to the corresponding SU(n) modular invariants. We establish a connection between exponents of these graphs and the appearance of characters in the block-diagonal modular invariants, provided that the extended modular S-matrices diagonalize the endomorphism fusion rules of the extended theories. This is proven for many cases, and our results cover all the block-diagonal SU(2) modular invariants, and thus provide some explanation of the A-D-E classification.

Contents
1 Introduction
1.1 Background
1.2 Outline of this paper
1.3 Preliminaries
2 Application of α-Induction to Conformal Inclusions
2.1 The general method
2.2 Example: $SU(2)_{10} \subset SO(5)_1$
2.3 More examples
2.4 A non-commutative sector algebra
3 The Treatment of Orbifold Inclusions
3.1 Construction of the extended net
3.2 Endomorphisms of the extended net
3.3 Spin and statistics
3.4 Examples
4 Graphs and Intertwining Matrices
4.1 Some matrices and their properties
4.2 Modular invariants and exponents of graphs
4.3 Discussion and consequences
5 Other Applications
5.1 Inclusions of extended U(1) theories
5.2 Minimal models
6 Outlook

1. Introduction

1.1. Background. The SU(n)$_q$ subfactors of Wenzl [56] can be understood from the viewpoint of statistical mechanics [18], the IRF models of [9] or from the viewpoint of conformal field theory, irreducible highest weight positive energy representations of the loop groups of SU(n) [55]. These viewpoints are also related to the study and classification of modular partition functions on a torus. The main aim of this paper is to understand the relation between certain modular invariant partition functions and fusion graphs, and the explanation we provide comes from an induction-restriction procedure of sectors of (sub-) factors arising from associated embeddings of the SU(n) theories.

The statistical mechanical models of [9] are generalizations of the Ising model. The configuration space of the Ising model, distributions of symbols $+$, $-$ on the vertices of the square lattice $\mathbb{Z}^2$, can also be thought of as distributions of edges of the Dynkin diagram $A_3$ on the edges of a square lattice, where the end vertices are labelled by $+$ and $-$. This model can be generalized by replacing $A_3$ by other graphs $\Gamma$ such as Dynkin diagrams or indeed the Weyl alcove $\mathcal{A}^{(m)}$ of the level $k$ integrable representations of SU(n), where $m = k + n$ is the altitude. The vertices of $\mathcal{A}^{(n+k)}$ are given by weights

$$\Big\{\Lambda = \sum_{i=1}^{n-1} m_i \Lambda_{(i)} : m_i \in \mathbb{N}_0,\ \sum_{i=1}^{n-1} m_i \le k\Big\},$$

where the $\Lambda_{(i)}$ are the $n - 1$ weights of the fundamental representation, and the oriented edges are given by the vectors $e_i$ defined by $e_1 = \Lambda_{(1)}$, $e_i = \Lambda_{(i)} - \Lambda_{(i-1)}$, $i = 2, \ldots, n-1$, $e_n = -\Lambda_{(n-1)}$. We can also label our states by partitions or Young tableaux $(p_j)_{j=1}^{n-1}$, $k \ge p_1 \ge p_2 \ge \cdots \ge p_{n-1} \ge p_n \equiv 0$, obtained by the transformation $(m_i)_{i=1}^{n-1} \mapsto (p_j)_{j=1}^{n-1}$, where $p_j = \sum_{i=j}^{n-1} m_i$. The unoccupied state corresponds to $(0, 0, \ldots, 0)$ or the empty Young tableau in the two descriptions, which we often denote by $*$ or $0$.
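The labelling of the alcove vertices by coefficient tuples $(m_i)$ and by partitions $(p_j)$ can be made concrete with a few lines of code. The following Python sketch is illustrative only; the function names `alcove_vertices` and `to_partition` are hypothetical, and the vertex count $\binom{k+n-1}{n-1}$ used in the check is a standard counting fact rather than a statement from the text.

```python
from itertools import product
from math import comb


def alcove_vertices(n, k):
    """All tuples (m_1, ..., m_{n-1}) of non-negative integers with sum <= k.

    These label the vertices of the Weyl alcove A^(n+k) of SU(n) at level k.
    """
    return [m for m in product(range(k + 1), repeat=n - 1) if sum(m) <= k]


def to_partition(m):
    """Map (m_i) to the partition (p_j) with p_j = m_j + m_{j+1} + ... ."""
    return tuple(sum(m[j:]) for j in range(len(m)))


# SU(2) at level 16: the 17 vertices of the Dynkin diagram A_17.
su2_level16 = alcove_vertices(2, 16)
# SU(3) at level 5.
su3_level5 = alcove_vertices(3, 5)

print(len(su2_level16), len(su3_level5))
print(to_partition((1, 2)))
```

Each partition produced this way satisfies $k \ge p_1 \ge p_2 \ge \cdots \ge p_{n-1} \ge 0$, and the zero tuple is mapped to the empty Young tableau.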
A configuration is then a distribution of the edges of $\Gamma$ over $\mathbb{Z}^2$, and associated to each local configuration is a Boltzmann weight (see Fig. 1) satisfying the Yang–Baxter equation of Fig. 2.

Fig. 1. Boltzmann weight, where $\alpha$, $\beta$, $\gamma$, $\delta$ are edges of $\Gamma$

The justification of the term SU(n) models is as follows. By Weyl duality, the representation of the permutation group on $\bigotimes \mathrm{Mat}_n$ is the fixed point algebra of the product action of SU(n); here $\mathrm{Mat}_n$ denotes the algebra of complex $n \times n$ matrices. Deforming this, there is a representation of the Hecke algebra in $\bigotimes \mathrm{Mat}_n$ whose commutant is a representation of a deformation of SU(n), the quantum group SU(n)$_q$ [29]. The Boltzmann weights lie in this Hecke algebra representation, and at


Fig. 2. Yang–Baxter equation

criticality reduce to the natural braid generators $g_i$, so that the Yang–Baxter equation of Fig. 2 reduces to the braid relation $g_i g_{i+1} g_i = g_{i+1} g_i g_{i+1}$. The labels of the irreducible representations of either the Hecke algebra (e.g. the permutation group when $q = 1$) or the quantum group (e.g. SU(n) when $q = 1$) are generically given by $\mathcal{A}$, Young tableaux of at most $n - 1$ rows. However when $q$ is a root of unity $e^{2\pi i/m}$ we have the further constraint of at most $k = m - n$ columns, where $k$ is the level, i.e. $\mathcal{A}^{(m)}$ (e.g. when $n = 2$ the vertices of the Dynkin diagram $A_{k+1}$). The Boltzmann weights involve paths of length two in the Bratteli diagram using the embedding graph $\Gamma$. More generally we can look at matrices $T = [T_{\xi,\eta}]$, where $\xi$, $\eta$ are paths in $\Gamma$ of length $n$ with fixed initial vertex $*$ and the same terminal vertex (Fig. 3), which form a finite dimensional algebra with the usual graphical algebraic operations, see Sect. 2.6 of [19].

Fig. 3. Matrices of partition functions $[T_{\xi,\eta}]$, where $\xi$, $\eta$ are paths on $\Gamma$ with fixed initial vertex $*$ and same terminal vertex, generate a von Neumann algebra $N$

These algebras are nested as $n$ increases, and we can complete with respect to a natural trace and obtain a von Neumann algebra. A subfactor $N \subset M$ can be obtained by the adjoint action, i.e. with the aid of the initial Boltzmann weights placed on the boundary as in Fig. 4 (see Sect. 3.5 of [19]). For the SU(n)$_q$ subfactors, this just amounts to $\{g_i : i = 1, 2, 3, \ldots\}'' \subset \{g_i : i = 0, 1, 2, \ldots\}''$ because of the braid relations $\mathrm{Ad}(g_1 g_2 \cdots)(g_i) = g_{i+1}$. The center $\mathbb{Z}_n$ of SU(n) acts on $\mathcal{A}^{(m)}$ leaving the Boltzmann weights invariant and hence induces an action on $M$, leaving $N$ globally invariant, yielding the orbifold subfactor $N^{\mathbb{Z}_n} \subset M^{\mathbb{Z}_n}$. The action of the center $\mathbb{Z}_n$ (corresponding to the simple currents [52]) on $\mathcal{A}^{(m)}$ is as follows. We set $A_0 = *$, and label the other end vertices of $\mathcal{A}^{(m)}$ by $A_1 = A_0 + (m - n)e_1$, $A_2 = A_1 + (m - n)e_2$, ..., $A_{n-1} = A_{n-2} + (m - n)e_{n-1}$. Then the generator $\sigma$ of the

Fig. 4. Embedding of N ⊂ M by T → Ad(V )(T ), where V = g1 g2 · · · is the product of Boltzmann weights at criticality


center $\mathbb{Z}_n$ acts on the graph $\mathcal{A}^{(m)}$ as the rotation $\sigma(A_j + \sum_r c_r e_r) = A_{j+1} + \sum_r c_r e_{r+1}$, where the indices are in $\mathbb{Z}_n$.

Let us now turn to the loop group picture. The loop group $LG$ is the group of smooth maps from $S^1$ into a compact Lie group $G$ under pointwise multiplication [47]. We are interested in projective representations of $LG \rtimes \mathrm{Rot}(S^1)$, where $\mathrm{Rot}(S^1)$ is the rotation group, which are highest weight representations in that the generator $L_0$ of the rotation group is bounded below. Such representations are called positive energy representations and are classified by irreducible representations of $G$ and a level $k$. For unitary irreducible positive energy representations, the possibilities are severely restricted. Indeed $k$ must be integral and, for a given value of the level, there are only a finite number of admissible (vacuum vector) irreducible representations of $G$. For the case of $G$ = SU(n), the admissible ones at level $k$ are the vertices of $\mathcal{A}^{(m)}$, where $m = n + k$. Restricting to loops $L_I G$ concentrated on an interval $I \subset S^1$, $L_I G = \{f \in LG : f(z) = e,\ z \notin I\}$, we get for each positive energy representation $\pi$ a subfactor $\pi(L_I G)'' \subset \pi(L_{I^c} G)'$ if $I^c$ is the complementary interval, of type III and of finite index – e.g. of index $4\cos^2(\pi/(k + 2))$ in the case of the fundamental representation of SU(2) and level $k$ [55].

We next turn to the modular invariant picture. It is argued on physical grounds that the partition function $Z(\tau)$ in a conformal field theory on the torus should be invariant under re-parameterization of the torus by $SL(2, \mathbb{Z})$ [8]. In the string theory formulation, modular invariance is essentially built into the definition of the partition function (although Nahm [40] has argued the case for modular invariance in terms of the chiral algebra and its representations rather than a functional integral setting).
In the transfer matrix picture of statistical mechanics we can write the partition function as an average over $e^{-\beta H}$, where $H$ is the Hamiltonian, now $L_0 + \bar{L}_0 - c/12$ (the shift by $c/24$ arising from mapping the Virasoro algebra on the plane to a cylinder). We have a momentum $P$ ($= L_0 - \bar{L}_0$) describing evolution along the closed string, so taking both evolutions into account, we first compute

$$Z(\tau) = \mathrm{tr}\big(e^{-\beta H} e^{i\eta P}\big) = \mathrm{tr}\big(e^{2\pi i \tau (L_0 - c/24)}\, e^{-2\pi i \bar{\tau} (\bar{L}_0 - c/24)}\big). \qquad (1)$$

Here $2\pi i \tau = -\beta + i\eta$ parameterizes the metric of the torus, and we then have to average over $\tau$. If we choose one $\tau$ from each orbit under the action of $PSL(2; \mathbb{Z})$ and integrate we implicitly assume that $Z(\tau)$ is modular invariant. From a Hilbert space decomposition of the loop group representation the partition function Eq. (1) decomposes as

$$Z = \sum_{\Lambda, \Lambda'} Z_{\Lambda,\Lambda'}\, \chi_\Lambda \bar{\chi}_{\Lambda'}, \qquad (2)$$

where $\chi_\Lambda$ is the conformal character$^1$ $\mathrm{tr}(q^{L_0 - c/24})$, $q = e^{2\pi i \tau}$, of the unitary positive energy irreducible representation $\pi_\Lambda$, $\Lambda \in \mathcal{A}^{(m)}$, where $m = n + k$ for some fixed level $k$.

$^1$ More precisely, for current algebras the conformal characters also depend on other parameters than $\tau$, corresponding to Cartan subalgebra generators, and the modular group acts on the collection of all these parameters (see e.g. [30]). The Virasoro specialized characters, namely the traces $\mathrm{tr}(q^{L_0 - c/24})$, are obtained by taking the other parameters to be zero.

The problem then is to find or classify all expressions of the form Eq. (2) where $Z$ is $SL(2, \mathbb{Z})$ invariant, subject to the normalization $Z_{0,0} = 1$ and $Z_{\Lambda,\Lambda'}$ is a non-negative integer. A simple argument of Gannon [23] shows that $\sum_{\Lambda,\Lambda'} Z_{\Lambda,\Lambda'} \le 1/S_{0,0}^2$, where


$S_{0,0}$ is a matrix entry of the S-matrix action of $SL(2, \mathbb{Z})$ on characters, and hence for a given $G$ at a fixed level, there are only finitely many possible modular invariants. They have been completely classified in the case SU(2) by [7] and in the case SU(3) by [24], and the program of Gannon to the complete classification is far advanced – see e.g. the notes of Chapter 8 of [19] for a review. The Gannon program involves identifying first a special class of modular invariants, the $\mathcal{ADE}_7$ invariants which satisfy $Z_{0,\Lambda} \ne 0$ implies that $\Lambda$ is (the weight labelling) a simple current or equivalently $Z_{0,\Lambda} \ne 0$ implies $S_{\Lambda,0} = S_{0,0}$, and then identify what appear to be very few remaining exceptions which include those arising from conformal embeddings. The $\mathcal{ADE}_7$ invariants include all the automorphism invariants, for which $Z_{0,\Lambda} = \delta_{0,\Lambda}$. Such an invariant is basically an automorphism of the fusion ring. There is a permutation $\sigma$ of $\mathcal{A}^{(m)}$ such that $Z_{\Lambda,\Lambda'} = \delta_{\Lambda,\sigma(\Lambda')}$. The $\mathcal{ADE}_7$ invariants also include the simple current invariants for which $Z_{\Lambda,\Lambda'} \ne 0$ implies $\Lambda = J \cdot \Lambda'$ for a simple current $J$. Automorphism and simple current invariants constitute the $\mathcal{A}$ and $\mathcal{D}$ type modular invariants. Note that there are two kinds of modular invariants:

$$\sum |\chi_i|^2, \quad \text{type I},$$
$$\sum \chi_i \bar{\chi}_{\sigma(i)}, \quad \text{type II},$$

where $\chi_i$ are (possibly extended) characters and $\sigma$ is a non-trivial permutation of the (extended) fusion rules. The type II invariants where the characters are properly extended (i.e. at least one $\chi_i$ is a proper sum over two or more $\chi_\Lambda$'s) finally constitute the $\mathcal{E}_7$ modular invariants. The type I modular invariants where the characters $\chi_i$ are proper extensions are also called block-diagonal. In fact any SU(n) block-diagonal modular invariant can be interpreted as a completely diagonal invariant of a larger theory embedding the SU(n) level $k$ WZW theory.
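The conditions on $Z$ can be checked numerically in the simplest non-trivial block-diagonal case. The sketch below is an illustration under stated assumptions and not part of the paper's argument: it uses the standard SU(2) level $k$ formulas $S_{jl} = \sqrt{2/(k+2)}\,\sin((j+1)(l+1)\pi/(k+2))$, conformal weights $j(j+2)/(4(k+2))$ and central charge $3k/(k+2)$, none of which are stated in the text, together with the well-known level 4 invariant $|\chi_0 + \chi_4|^2 + 2|\chi_2|^2$. All function names are hypothetical.

```python
from cmath import exp
from math import pi, sin, sqrt


def su2_s_matrix(k):
    """Modular S-matrix of SU(2) at level k (labels j = 0, ..., k)."""
    h = k + 2
    return [[sqrt(2.0 / h) * sin((a + 1) * (b + 1) * pi / h)
             for b in range(k + 1)] for a in range(k + 1)]


def su2_t_diagonal(k):
    """Diagonal entries exp(2 pi i (h_a - c/24)) of the T-matrix."""
    h = k + 2
    c = 3.0 * k / h
    return [exp(2j * pi * (a * (a + 2) / (4.0 * h) - c / 24.0))
            for a in range(k + 1)]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][r] * y[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]


def max_abs_diff(x, y):
    return max(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))


k = 4
S = su2_s_matrix(k)
T = su2_t_diagonal(k)

# Coupling matrix of |chi_0 + chi_4|^2 + 2 |chi_2|^2 at level 4.
Z = [[0] * (k + 1) for _ in range(k + 1)]
for a, b in [(0, 0), (0, 4), (4, 0), (4, 4)]:
    Z[a][b] = 1
Z[2][2] = 2

# Modular invariance: Z commutes with S and with the diagonal T.
s_defect = max_abs_diff(matmul(S, Z), matmul(Z, S))
t_defect = max(abs(Z[a][b] * (T[a] - T[b]))
               for a in range(k + 1) for b in range(k + 1))

# Gannon's bound: the sum of all entries of Z is at most 1 / S_00^2.
total = sum(map(sum, Z))
gannon_bound = 1.0 / S[0][0] ** 2

print(s_defect, t_defect, total, gannon_bound)
```

Here the entries of $Z$ sum to 6, while $1/S_{0,0}^2 = 12$, so the bound is satisfied with room to spare; the same functions can be reused at other levels by supplying a different candidate matrix `Z`.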
In the case of a conformal inclusion the larger theory is given in terms of a $G$ (necessarily level 1) WZW theory with $G$ a simple Lie group, and in the orbifold inclusion case the larger theory is given in terms of a simple current extension of the SU(n) theory, and the SU(n) theory itself can then be thought of as the $\mathbb{Z}_n$ orbifold (i.e. as given in terms of the fixed points under a $\mathbb{Z}_n$ symmetry) of the extended object. For SU(2) and SU(3) both cases actually exhaust all the block-diagonal modular invariants.$^2$

$^2$ However, for SU(n) with larger $n$ there appear also block-diagonal modular invariants which arise from level-rank duality but neither from conformal nor orbifold inclusions, e.g. for SU(10) at level 2. This kind of invariants will not be treated in this paper.

The modular invariants appear to be labelled naturally, in the case of SU(2) and SU(3), by graphs. The SU(2) modular invariants are labelled by A-D-E Dynkin diagrams in the sense that the non-vanishing diagonal entries of the modular invariant are given by the conformal characters labelled by the Coxeter exponents of the labelling A-D-E graph [7]. Recall the eigenvalues of the (adjacency matrix of the) D and E graphs constitute subsets of the eigenvalues of the A graph with the same Coxeter number, and their labels are called Coxeter exponents. For example in case of SU(2) at level 16 there are three modular invariants. In each case the diagonal part of the invariant is described by a certain subset $I = \{j\}$ of the vertices of $A_{17}$. The (adjacency matrix of the) graph of $A_{17}$ has eigenvalues $\{2\cos((j + 1)\pi/h)\}$, where $j = 0, 1, 2, \ldots, 16$ labels the vertices of $A_{17}$ and $h = 18$ is the Coxeter number of $A_{17}$. Then $I$ is the set of the Coxeter exponents, i.e. the set $\{2\cos((j + 1)\pi/h),\ j \in I\}$ gives all the eigenvalues of the Dynkin diagram, $A_{17}$, $D_{10}$ or $E_7$. The completely diagonal invariant then corresponds to the graph $A_{17}$ itself. In this way all SU(2) modular invariants are described by A-D-E graphs. In the subfactor theory only A-D-E Dynkin diagrams with $A_{\ell+1}$, $D_{2\ell+2}$ ($\ell = 1, 2, \ldots$), $E_6$ and $E_8$, appear as the (dual) principal graphs (or fusion graphs) of subfactors with


index less than four (see e.g. [19]). The graphs $D_{2\ell+1}$ and $E_7$ do not appear as principal graphs; they are not flat, but for example the flat part of $E_7$ is $D_{10}$. In the rational conformal field theory of SU(2) models described by A-D-E Dynkin diagrams, one now again may argue that there is a degeneracy so that only $D_{\mathrm{even}}$, $E_6$ and $E_8$, namely the type I cases, need be counted. For example in the case of $k = 16$, the modular invariant for $E_7$ reduces to that of $D_{10}$ under the simple interchange of blocks $\chi_8$ and $\chi_2 + \chi_{14}$ (see e.g. [11]).

A classification of SU(3) modular invariants was completed by [24]. In analogy with the A-D-E classification for SU(2), we label these $\mathcal{A}$ (the completely diagonal invariants), $\mathcal{D}$ (the simple current invariants) and the exceptional $\mathcal{E}$ invariants. We should also throw in their conjugates $Z^c$, $(Z^c)_{\Lambda,\Lambda'} = Z_{\Lambda,\bar{\Lambda}'}$ (here $\bar{\Lambda}$ labels the conjugate representation), although $\mathcal{D}^{(6)} = \mathcal{D}^{(6)c}$, $\mathcal{D}^{(9)} = \mathcal{D}^{(9)c}$, $\mathcal{E}^{(12)} = \mathcal{E}^{(12)c}$, $\mathcal{E}^{(24)} = \mathcal{E}^{(24)c}$. For SU(2), the automorphism invariants are the A-series and the $D_{\mathrm{odd}}$-series. For SU(3) they are $\mathcal{A}^{(m)}$ and $\mathcal{A}^{(m)c}$ for all $m$ and $\mathcal{D}^{(m)}$ and $\mathcal{D}^{(m)c}$ for $m \ne 0 \bmod 3$. The $\mathcal{ADE}_7$ invariants for SU(2) are the A-series, D-series, and the $E_7$ exceptional (hence the name $\mathcal{ADE}_7$). In the SU(3) case the $\mathcal{A}$ invariants are $\mathcal{A}^{(m)}$ and $\mathcal{A}^{(m)c}$, the $\mathcal{D}$ invariants are $\mathcal{D}^{(m)}$ and $\mathcal{D}^{(m)c}$, and the $\mathcal{E}_7$ invariants are the two Moore and Seiberg invariants $\mathcal{E}^{(12)}_{MS}$ and $\mathcal{E}^{(12)c}_{MS}$ [39]. The other invariants $\mathcal{E}^{(8)}$, $\mathcal{E}^{(12)}$, $\mathcal{E}^{(24)}$ correspond to conformal embeddings $SU(3)_5 \subset SU(6)_1$, $SU(3)_9 \subset (E_6)_1$, $SU(3)_{21} \subset (E_7)_1$, respectively (cf. $E_6$ and $E_8$ for SU(2)). The simple current invariants for SU(2) are the A and D series, and for SU(3) are again the $\mathcal{A}$ and $\mathcal{D}$ series (but not their conjugations). Di Francesco and Zuber initiated a program to associate graphs to these invariants [13, 14].
These graphs are three colourable and such that their eigenvalues (“exponents”), constituting again a subset of the set of eigenvalues of the $\mathcal{A}$ graph and thus being labelled by its vertices, match the non-vanishing diagonal entries of the modular invariant. They also associated graphs to several SU(n) modular invariants with higher $n$ and their concept is quite general; however, there may be difficulties associating a graph to some invariants – unlike the SU(2) case. We also consider in Sect. 5 the modular invariants of the extended U(1) current algebras as treated in [6] and the minimal model modular invariants which arise from coset theories $(SU(2)_{m-2} \otimes SU(2)_1)/SU(2)_{m-1}$ and are labelled by pairs $(G_1, G_2)$ of A-D-E graphs, associated to levels $(m - 2, m - 1)$ [7].

1.2. Outline of this paper. A conformal inclusion directly provides a net of subfactors in terms of the von Neumann algebras of local loop groups in the vacuum representation of the larger theory. For the orbifold case we start with the level $k$ vacuum representation of the loop group SU(n) and construct a net of subfactors by a DHR construction of fields implementing automorphisms, constituting a simple current extension [52, 39] in terms of bounded operators. The construction is possible and yields moreover a local extended net exactly at the levels where the orbifold modular invariants occur. In both cases we arrive at a net of subfactors $N \subset M$ satisfying the necessary conditions to apply the procedure of α-induction elaborated in our previous paper. As a consequence of Wassermann’s work, to the level $k$ positive energy representations of the loop group LSU(n) correspond (DHR superselection) sectors of local algebras $N(I_\circ)$, where $I_\circ \subset S^1$ is some proper interval. These sectors are labelled by admissible weights $\Lambda$ and it is proven that their sector products obey the well-known SU(n)$_k$ fusion rules [55], giving rise to a fusion algebra $W = W(n, k)$.
The irreducible subsectors obtained by α-induction of these sectors generate a sector algebra $V$. The results of our previous paper, in particular the homomorphism property of α-induction, allow one to read off partially the structure of $V$ in terms of the fusion rules in $W$, but they do in general not determine the multiplication table completely. However, in many examples they provide enough information to resolve the puzzle. The homomorphism property of α-induction also implies that $V$ carries a representation of $W$ which therefore decomposes into a direct sum over the characters $\gamma_\Lambda$ of the fusion algebra $W$ which are labelled by admissible weights $\Lambda$ as well. The representation matrix associated to the first fundamental weight $\Lambda_{(1)}$ can be interpreted as the adjacency matrix of a graph (which is in fact the fusion graph of $\alpha_{\Lambda_{(1)}}$), and its eigenvalues are the evaluation of the characters $\gamma_\Lambda$ in the decomposition of this representation of $W$. The weights $\Lambda$ labelling the characters which in fact appear this way are called exponents, as this can be recognized as a generalization of the Coxeter exponents in the SU(2) case. As a consequence of ασ-reciprocity, proved in our previous paper, there is a fusion subalgebra $T \subset V$ generated by the (localized) sectors of the larger theory which correspond to the blocks in the modular invariant. It is widely believed in general but proven only for several cases that their sector products coincide with the Verlinde fusion rules known in conformal field theory. Provided that this is true for the embedding theory at hand we show that the interplay of S-matrices diagonalizing the fusion rules and implementing modular transformations at the same time forces a conformal character $\chi_\Lambda$ to appear in the modular invariant if and only if $\Lambda$ is an exponent.

1.3. Preliminaries. Here we briefly review our basic notation and results; however, for precise definitions and statements we refer the reader to our previous paper [4]. There we considered certain nets of subfactors $N \subset M$ on the punctured circle, i.e. we were dealing with a family of subfactors $N(I) \subset M(I)$ on a Hilbert space $\mathcal{H}$, indexed by the set $\mathcal{J}_z$ of open intervals $I$ on the unit circle $S^1$ that neither contain nor touch a distinguished point “at infinity” $z \in S^1$.
The defining representation of $N$ possesses a subrepresentation $\pi_0$ on a distinguished subspace $\mathcal{H}_0$ giving rise to another net $A = \{A(I) = \pi_0(N(I)),\ I \in \mathcal{J}_z\}$. We assumed this net to be strongly additive (which is equivalent to strong additivity of the net $N$) and to satisfy Haag duality, $A(I) = C_A(I')'$, where $C_A(I')$ denotes the $C^*$-algebra generated by all $A(J)$, with intervals $J \in \mathcal{J}_z$ and $J \subset I'$, the (interior of the) complement of $I$, and also locality of the net $M$. Fixing an interval $I_\circ \in \mathcal{J}_z$ we used the crucial observation in [38] that there is an endomorphism $\gamma$ of the $C^*$-algebra $M$ into $N$ (the $C^*$-algebras associated to the nets are denoted by the same symbols as the nets themselves, as usual) such that it restricts to a canonical endomorphism of $M(I_\circ)$ into $N(I_\circ)$. By $\theta$ we denote its restriction to $N$. We defined a map $\Delta_N(I_\circ) \to \mathrm{End}(M)$, $\lambda \mapsto \alpha_\lambda$, called α-induction, where $\Delta_N(I_\circ)$ is the set of transportable endomorphisms localized in $I_\circ$. Explicitly, $\alpha_\lambda = \gamma^{-1} \circ \mathrm{Ad}(\varepsilon(\lambda, \theta)) \circ \lambda \circ \gamma$, with statistics operators $\varepsilon(\lambda, \theta)$. In fact this is the formula for the extended endomorphism given already in [38], which realizes Roberts’ “cohomological extension” (see e.g. [51]). As endomorphisms in $\Delta_N(I_\circ)$ leave $N(I_\circ)$ invariant one can consider elements of $\Delta_N(I_\circ)$ as elements of $\mathrm{End}(N(I_\circ))$, and therefore it makes sense to define the quotient $[\Delta]_N(I_\circ)$ by inner equivalence in $N(I_\circ)$. Similarly, the endomorphisms $\alpha_\lambda$ leave $M(I_\circ)$ invariant, hence we can consider them also as elements of $\mathrm{End}(M(I_\circ))$ and form their inner equivalence classes $[\alpha_\lambda]$ in $M(I_\circ)$. We derived that in terms of these equivalence classes, called sectors, α-induction $[\lambda] \mapsto [\alpha_\lambda]$ preserves the natural additive and multiplicative structures. Crucial for our analysis is also the formula

$$\langle \alpha_\lambda, \alpha_\mu \rangle_{M(I_\circ)} = \langle \theta \circ \lambda, \mu \rangle_{N(I_\circ)}, \qquad \lambda, \mu \in \Delta_N(I_\circ),$$


where for endomorphisms $\rho, \sigma$ of an infinite factor $M$ we denote $\langle \rho, \sigma \rangle_M = \dim \mathrm{Hom}_M(\rho, \sigma) = \dim\{t \in M : t\rho(m) = \sigma(m)t,\ m \in M\}$. We also have a map $\mathrm{End}(\mathcal{M}) \to \mathrm{End}(\mathcal{N})$, $\beta \mapsto \sigma_\beta$, called σ-restriction. Let $\Delta_{\mathcal{M}}(I_\circ) \subset \mathrm{End}(\mathcal{M})$ denote the set of transportable endomorphisms localized in $I_\circ$, and $\Delta^{(0)}_{\mathcal{M}}(I_\circ) \subset \Delta_{\mathcal{M}}(I_\circ)$ the subset of endomorphisms leaving $M(I)$ invariant for any $I \in \mathcal{J}_z$ with $I_\circ \subset I$. (If the net $\mathcal{M}$ is Haag dual then $\Delta^{(0)}_{\mathcal{M}}(I_\circ) = \Delta_{\mathcal{M}}(I_\circ)$.) If $\beta \in \Delta^{(0)}_{\mathcal{M}}(I_\circ)$ then $\sigma_\beta$ leaves $N(I_\circ)$ invariant and hence we can consider $\beta$ and $\sigma_\beta$ as elements of $\mathrm{End}(M(I_\circ))$ and $\mathrm{End}(N(I_\circ))$, respectively, and we derived ασ-reciprocity,
\[ \langle \alpha_\lambda, \beta \rangle_{M(I_\circ)} = \langle \lambda, \sigma_\beta \rangle_{N(I_\circ)}, \qquad \lambda \in \Delta_{\mathcal{N}}(I_\circ),\ \beta \in \Delta^{(0)}_{\mathcal{M}}(I_\circ). \]
If one starts with a certain set $\mathcal{W}$ of sectors in $[\Delta]_{\mathcal{N}}(I_\circ)$ one obtains a set $\mathcal{V}$ of sectors of $M(I_\circ)$ by α-induction, and the above results provide close connections between the algebraic structures of $\mathcal{W}$ and $\mathcal{V}$, conveniently formulated in the language of sector algebras.

2. Application of α-Induction to Conformal Inclusions

In this section we develop our first main application of α-induction. We consider nets of subfactors which arise from conformal inclusions of SU(n).

2.1. The general method. We first explain that conformal inclusions of SU(n) give rise to quantum field theoretical nets of subfactors so that we can apply the machinery of α-induction developed in our previous paper. Let $H_k \subset G_1$ be a conformal inclusion of $H = SU(n)$ at level $k$ with $G$ a connected compact simple Lie group. Then there is an associated block-diagonal modular invariant of SU(n),
\[ Z = \sum_{t \in \mathcal{T}} |\chi^{\mathrm{ext}}_t|^2, \qquad \chi^{\mathrm{ext}}_t = \sum_{\Lambda \in \mathcal{A}^{(n+k)}} b_{t,\Lambda}\, \chi_\Lambda. \tag{3} \]
Here $\mathcal{T}$ denotes the labelling set of positive energy representations $(\pi^t, \mathcal{H}^t)$ of $LG$ at level 1, $\chi^{\mathrm{ext}}_t$ the characters of $\mathcal{H}^t$, and $\chi_\Lambda$ the characters of the level $k$ positive energy representation spaces $\mathcal{H}_\Lambda$ of $LSU(n)$, and $(\pi_\Lambda, \mathcal{H}_\Lambda)$ appears in the decomposition of $(\pi^t, \mathcal{H}^t)$ with multiplicity $b_{t,\Lambda}$. Thus we have in terms of the positive energy representations
\[ \pi^t|_{LH} = \bigoplus_{\Lambda \in \mathcal{A}^{(n+k)}} b_{t,\Lambda}\, \pi_\Lambda. \tag{4} \]

Now let us define a net of subfactors $\mathcal{N} \subset \mathcal{M}$ on the Hilbert space $\mathcal{H} \equiv \mathcal{H}^0$ by
\[ N(I) = \pi^0(L_I H)'', \qquad M(I) = \pi^0(L_I G)'', \tag{5} \]
and also the net $\mathcal{A}$ by
\[ A(I) = \pi_0(L_I H)'' \tag{6} \]
for intervals $I \subset S^1$. For conformal embeddings the index of the subfactors $N(I) \subset M(I)$ is finite (see e.g. [54, 50]), and as the nets $\mathcal{M}$ and $\mathcal{A}$ constitute Möbius covariant precosheaves on the circle they satisfy Haag duality on the closed circle [5] and hence


in particular locality. Moreover, we have a vector $\Omega \in \mathcal{H}$ which is cyclic and separating for each $M(I)$ ($I \neq S^1$ any interval such that $\bar{I} \neq S^1$) on $\mathcal{H}$ and $N(I)$ on $\mathcal{H}_0 \subset \mathcal{H}$. The modular group of $M(I)$ associated to the state $\omega(\cdot) = \langle \Omega, \cdot\,\Omega \rangle$ is geometric, i.e. its action restricts to a geometric action on the loop group elements, see [55] for the case $G = SU(m)$ and [54] for the general case that $G$ is any compact simple Lie group. Hence the modular group leaves the subalgebra $N(I)$ invariant, since $SU(n) \subset G$ is a subgroup. Therefore, by Takesaki's theorem [53], there is a normal conditional expectation $E_I$ from $M(I)$ onto $N(I)$ which preserves the state $\omega$. Furthermore, $E_I$ is unique and faithful as the inclusion is also irreducible. The net $\mathcal{N} \subset \mathcal{M}$ is standard (by the Reeh-Schlieder theorem) and hence the Jones projection $e_N$ from $\mathcal{H}$ onto $\mathcal{H}_0$ does not depend on the interval $I$. Therefore we have $E_I(m)\Omega = E_I(m) e_N \Omega = e_N m e_N \Omega = e_N m \Omega$ for any $I$ and $m \in M(I)$. Hence $E_I(m) = E_J(m)$ for any pair $I \subset J$ since $\Omega$ is separating for $M(J)$. We conclude that we have a faithful normal conditional expectation $E$ from $\mathcal{M}$ onto $\mathcal{N}$ and it obviously preserves the vector state $\omega$.

Now let us remove a "point at infinity" $z \in S^1$ and take the set $\mathcal{J}_z$ as the index set of our nets $\mathcal{A}$, $\mathcal{N}$, $\mathcal{M}$. Then we are clearly dealing with directed nets. For $H = SU(n)$, Haag duality on the closed circle, $A(I) = A(I')'$, has been proven directly by Wassermann [55], as has strong additivity or "irrelevance of points", i.e. $A(I) = A(I_1) \vee A(I_2)$ if the intervals $I_1$ and $I_2$ are obtained by removing one single point from the interval $I$. Moreover, $A(I) = \bigvee_n A(J_n)$ for any sequence of increasing intervals $J_n$ tending to $I$ [54]. Both arguments imply that we have Haag duality even on the punctured circle, $A(I) = \mathcal{C}_{\mathcal{A}}(I')'$. In fact, as the proofs in [54] are formulated for any compact connected simple Lie group we similarly have Haag duality on the punctured circle for $\mathcal{M}$, $M(I) = \mathcal{C}_{\mathcal{M}}(I')'$. (For $G = SO(m)$ (level 1) this has also been proven directly in [3].) As $\pi_0$ appears (precisely once) in $\pi^0|_{LH}$ we conclude that the net $\mathcal{N}$ has a Haag dual subrepresentation, and the corresponding net is given by $\mathcal{A} = \{A(I) = \pi_0(N(I)),\ I \in \mathcal{J}_z\}$ (note that we take, by abuse of notation, the same symbol $\pi_0$ for the subrepresentation of the net $\mathcal{N}$ and for the vacuum representation of $LH$). Let us summarize the discussion in the following

Proposition 2.1. Starting from a conformal inclusion $SU(n)_k \subset G_1$ with $G$ a compact connected simple Lie group, the net $\mathcal{N} \subset \mathcal{M}$ (over the index set $\mathcal{J}_z$) defined as above is a quantum field theoretical net of subfactors of finite index where $\mathcal{M}$ is Haag dual (hence local) and $\mathcal{N}$ is strongly additive and has a Haag dual subrepresentation.

As the positive energy representations of $LH = LSU(n)$ satisfy local equivalence [55], $\pi_\Lambda(L_I H) \simeq \pi_0(L_I H)$, we have by the standard arguments endomorphisms $\lambda_{0;\Lambda} \in \Delta_{\mathcal{A}}(I_\circ)$ that correspond to $\pi_\Lambda$ for some interval $I_\circ \in \mathcal{J}_z$. Wassermann has related the $LSU(n)$ fusion rules to the (relative tensor) product of bimodules, and this is equivalent to the composition of endomorphisms. Hence we have complete information about the sector products $[\lambda_{0;\Lambda}] \times [\lambda_{0;\Lambda'}]$. Equivalently, we can also take the lifted endomorphisms $\lambda_\Lambda = \pi_0^{-1} \circ \lambda_{0;\Lambda} \circ \pi_0 \in \Delta_{\mathcal{N}}(I_\circ)$, and then we clearly have the same sector product rules. By Eq. (27) of [4] we have
\[ [\theta] = \bigoplus_{\Lambda \in \mathcal{A}^{(n+k)}} b_{0,\Lambda}\, [\lambda_\Lambda], \]


where this decomposition corresponds to the vacuum block in the modular invariant, $\chi^{\mathrm{ext}}_0 = \sum_{\Lambda \in \mathcal{A}^{(n+k)}} b_{0,\Lambda}\, \chi_\Lambda$, see Eq. (3).

Our procedure is then as follows. Recall that a sector basis is a finite set of irreducible sectors with finite statistical dimension which contains the identity sector and is closed under sector products and conjugation. A sector basis canonically defines an algebra called sector algebra. (We refer again to [4] for precise definitions.) We take the sector basis $\mathcal{W} \equiv \mathcal{W}(n,k) = \{[\lambda_\Lambda],\ \Lambda \in \mathcal{A}^{(n+k)}\} \subset [\Delta]_{\mathcal{N}}(I_\circ)$ and we denote by $W \equiv W(n,k)$ the associated fusion algebra. By α-induction (see Theorem 4.2 of [4]) we obtain a sector algebra $V$ with sector basis $\mathcal{V} \subset \mathrm{Sect}(M(I_\circ))$, consisting of the distinct irreducible subsectors of the $[\alpha_\Lambda]$. (We write $\alpha_\Lambda$ for $\alpha_{\lambda_\Lambda}$.) Picking endomorphisms $\lambda_{\Lambda_{(p)}}$, associated to the $p$th fundamental representation, $p = 1, 2, \ldots, n-1$ ($\Lambda_{(p)}$ denotes the $p$th fundamental weight) and forming $\alpha_{\Lambda_{(p)}}$, we can compute the sector products $[\alpha_{\Lambda_{(p)}}] \times [\alpha_\Lambda]$ for all $\Lambda \in \mathcal{A}^{(n+k)}$. In many cases, the homomorphism $[\alpha]$ is surjective and therefore all the fusion rules in $V$ can be read off from the fusion rules in $W$. But even for those of our examples where the homomorphism $[\alpha]$ is not surjective we can at least determine the fusion rules $[\alpha_{\Lambda_{(p)}}] \times [\beta]$ for all $[\beta] \in \mathcal{V}$, and thus we can draw the associated fusion graphs.

Since the positive energy representations of a loop group of any connected compact simple Lie group satisfy local equivalence [54] we have endomorphisms $\beta_t \in \Delta^{(0)}_{\mathcal{M}}(I_\circ) = \Delta_{\mathcal{M}}(I_\circ)$, $t \in \mathcal{T}$, corresponding to the level 1 positive energy representations of $LG$. As we know the branching rules of the decomposition of $\pi^t|_{LH}$, Eq. (4), and as σ-restriction corresponds to the restriction of representations, it follows that $[\sigma_{\beta_t}] = \bigoplus_{\Lambda \in \mathcal{A}^{(n+k)}} b_{t,\Lambda} [\lambda_\Lambda]$. As a consequence of ασ-reciprocity,
\[ \langle \alpha_\Lambda, \beta_t \rangle_{M(I_\circ)} = \langle \lambda_\Lambda, \sigma_{\beta_t} \rangle_{N(I_\circ)} = b_{t,\Lambda}, \qquad \Lambda \in \mathcal{A}^{(n+k)},\ t \in \mathcal{T}, \]
we conclude (cf. Theorem 4.3 of [4]) that $\mathcal{T} \subset \mathcal{V}$ and that the associated fusion algebra $T$ must be a sector subalgebra $T \subset V$. (We identify the labelling set $\mathcal{T}$ itself with the associated sector basis: $\mathcal{T} \equiv \{[\beta_t]\} \subset [\Delta]_{\mathcal{M}}(I_\circ)$, $t \equiv [\beta_t]$.) It is widely believed, but in general not known, that the endomorphisms associated to the positive energy representations of a loop group $LG$ obey the Verlinde fusion rules of the corresponding WZW theory. However, for the level 1 theories which are relevant here, this is proven for many cases including $G = SU(m)$ as a special (and most trivial) case of Wassermann's analysis [55] and $G = SO(m)$ as done in [3]; moreover, for $G = G_2$ it follows from our treatment of the conformal embedding $SU(2)_{28} \subset (G_2)_1$.

Now recall that Di Francesco, Petkova and Zuber (see [14, 45] or [10, 11]) associated certain graphs to modular invariants by some empirical procedure. For these graphs they constructed fusion algebras (which are possibly not uniquely determined), and they discovered for the block-diagonal modular invariants some subalgebras spanned by a subset of the vertices, called marked vertices, which obey the fusion rules of the extended theory. Indeed, in our examples we rediscover their graphs by drawing the fusion graphs of $\alpha_{\Lambda_{(p)}}$. The elements of $\mathcal{T}$ turn out to represent exactly the marked vertices, and our theory provides an explanation why the graph algebras (which are in fact the fusion algebras $V$) possess subalgebras corresponding to the fusion rules of the extended theory.

2.2. Example: $SU(2)_{10} \subset SO(5)_1$. We consider the conformal inclusion $SU(2)_{10} \subset SO(5)_1$. The corresponding SU(2) modular invariant is the $E_6$ one,
\[ Z_{E_6} = |\chi_0 + \chi_6|^2 + |\chi_4 + \chi_{10}|^2 + |\chi_3 + \chi_7|^2. \tag{7} \]

The three blocks come from the basic (0), the vector (v) and the spinor (s) representation of $LSO(5)$ at level 1. For $LSU(2)$ at level 10 there are 11 positive energy representations $\pi_j$, labelled by the (doubled, thus integer valued) spin $j = 0, 1, 2, \ldots, 10$. Let $\lambda_j \in \Delta_{\mathcal{N}}(I_\circ)$ be corresponding endomorphisms. The fusion algebra $W \equiv W(2,k)$ is characterized by the fusion rules
\[ [\lambda_{j_1}] \times [\lambda_{j_2}] = \bigoplus_{\substack{j = |j_1 - j_2| \\ j + j_1 + j_2\ \mathrm{even}}}^{\min(j_1 + j_2,\ 2k - (j_1 + j_2))} [\lambda_j], \tag{8} \]
and here $k = 10$. From the vacuum block in Eq. (7) we read off $[\theta] = [\lambda_0] \oplus [\lambda_6]$. By Theorem 3.9 of [4] we obtain (writing $\alpha_j$ for $\alpha_{\lambda_j}$)
\[ \langle \alpha_{j_1}, \alpha_{j_2} \rangle_{M(I_\circ)} = \langle \lambda_{j_1}, \theta \circ \lambda_{j_2} \rangle_{N(I_\circ)} = \langle \lambda_{j_1}, \lambda_{j_2} \rangle_{N(I_\circ)} + \langle \lambda_{j_1}, \lambda_6 \circ \lambda_{j_2} \rangle_{N(I_\circ)}. \]
We find this way
\[ \langle \alpha_j, \alpha_j \rangle_{M(I_\circ)} = \begin{cases} 1 & \text{for } j = 0, 1, 2, 8, 9, 10, \\ 2 & \text{for } j = 3, 4, 5, 6, 7. \end{cases} \]
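The table above can be reproduced mechanically from Eq. (8) and $[\theta] = [\lambda_0] \oplus [\lambda_6]$. The following Python sketch is only an illustration; the function names `su2_fusion` and `hom_dim_alpha` are hypothetical and not taken from the paper. It evaluates $\langle \alpha_{j_1}, \alpha_{j_2} \rangle = \langle \lambda_{j_1}, \theta \circ \lambda_{j_2} \rangle$ by counting how often $\lambda_{j_1}$ occurs in $\lambda_t \times \lambda_{j_2}$ for each summand $\lambda_t$ of $\theta$.

```python
def su2_fusion(j1, j2, k):
    """Doubled spins j occurring in [lambda_j1] x [lambda_j2] at level k, Eq. (8)."""
    top = min(j1 + j2, 2 * k - (j1 + j2))
    # |j1 - j2| has the same parity as j1 + j2, so stepping by 2 keeps
    # j + j1 + j2 even, as required by Eq. (8).
    return list(range(abs(j1 - j2), top + 1, 2))


def hom_dim_alpha(j1, j2, k, theta):
    """<alpha_j1, alpha_j2> = <lambda_j1, theta o lambda_j2>.

    theta is the list of doubled spins t with [theta] = sum of [lambda_t]
    (each with multiplicity one, as in the vacuum block of Eq. (7)).
    """
    return sum(su2_fusion(t, j2, k).count(j1) for t in theta)


K = 10          # level of SU(2)
THETA = [0, 6]  # [theta] = [lambda_0] + [lambda_6]

diagonal = {j: hom_dim_alpha(j, j, K, THETA) for j in range(K + 1)}
print(diagonal)
print(hom_dim_alpha(3, 9, K, THETA))  # overlap of alpha_3 with alpha_9
```

The same two functions can be reused for the other SU(2) examples by changing `K` and `THETA` to the level and the vacuum block of the corresponding modular invariant.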

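We further compute $\langle \alpha_3, \alpha_9 \rangle_{M(I_\circ)} = 1$, hence $[\alpha_3] = [\alpha_9] \oplus [\alpha_3^{(1)}]$ with $[\alpha_3^{(1)}]$ irreducible. As $\langle \alpha_3, \alpha_j \rangle_{M(I_\circ)} = 0$ for $j = 0, 1, 2, 8, 10$, there is no irreducible $[\alpha_j]$ that equals $[\alpha_3^{(1)}]$. Checking all other $\langle \alpha_{j_1}, \alpha_{j_2} \rangle_{M(I_\circ)}$ we finally find that there are six different irreducible sectors, i.e. elements of $\mathcal{V}$, namely $[\alpha_0]$, $[\alpha_1]$, $[\alpha_2]$, $[\alpha_9]$, $[\alpha_{10}]$, and $[\alpha_3^{(1)}]$, and we have the identity $[\alpha_8] = [\alpha_2]$. The reducible $[\alpha_j]$'s decompose into the elements of $\mathcal{V}$ as follows,
\begin{align*}
[\alpha_3] &= [\alpha_3^{(1)}] \oplus [\alpha_9], \\
[\alpha_4] &= [\alpha_2] \oplus [\alpha_{10}], \\
[\alpha_5] &= [\alpha_1] \oplus [\alpha_9], \\
[\alpha_6] &= [\alpha_0] \oplus [\alpha_2], \\
[\alpha_7] &= [\alpha_1] \oplus [\alpha_3^{(1)}].
\end{align*}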
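Since α-induction is a homomorphism of sector algebras, the decompositions just listed suffice to compute products in $V$ from the fusion rules of Eq. (8). The sketch below is an illustration under stated assumptions: the string labels `a0`, `a3_1`, etc. and the helper names `induce` and `multiply` are hypothetical, and the table `ALPHA` is simply copied from the decompositions above (with $[\alpha_8] = [\alpha_2]$). It represents $[\alpha_3^{(1)}]$ as the image of the formal difference $\lambda_3 - \lambda_9$ and evaluates two products.

```python
from collections import Counter

K = 10


def su2_fusion(j1, j2, k=K):
    """Doubled spins occurring in [lambda_j1] x [lambda_j2] at level k, Eq. (8)."""
    top = min(j1 + j2, 2 * k - (j1 + j2))
    return list(range(abs(j1 - j2), top + 1, 2))


# Decomposition of each [alpha_j] into the six irreducible sectors of V,
# copied from the text; "a3_1" stands for [alpha_3^(1)].
ALPHA = {
    0: {"a0": 1}, 1: {"a1": 1}, 2: {"a2": 1},
    3: {"a3_1": 1, "a9": 1}, 4: {"a2": 1, "a10": 1}, 5: {"a1": 1, "a9": 1},
    6: {"a0": 1, "a2": 1}, 7: {"a1": 1, "a3_1": 1},
    8: {"a2": 1}, 9: {"a9": 1}, 10: {"a10": 1},
}


def multiply(w1, w2):
    """Product in W of two formal integer combinations {spin: coefficient}."""
    out = Counter()
    for j1, c1 in w1.items():
        for j2, c2 in w2.items():
            for j in su2_fusion(j1, j2):
                out[j] += c1 * c2
    return dict(out)


def induce(w):
    """Apply the homomorphism [alpha] to a formal combination of spins."""
    out = Counter()
    for j, c in w.items():
        for sector, mult in ALPHA[j].items():
            out[sector] += c * mult
    return {s: c for s, c in out.items() if c != 0}


a3_1 = {3: 1, 9: -1}  # [alpha_3^(1)] as the image of lambda_3 - lambda_9

print(induce(multiply(a3_1, {1: 1})))  # [alpha_3^(1)] x [alpha_1]
print(induce(multiply(a3_1, a3_1)))    # [alpha_3^(1)] x [alpha_3^(1)]
```

Working with signed coefficients is only a bookkeeping device; the results are genuine sectors because all negative contributions cancel after induction.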

Fig. 5. $E_6$ (graph not reproduced; vertices $[\alpha_0]$, $[\alpha_1]$, $[\alpha_2]$, $[\alpha_9]$, $[\alpha_{10}]$, $[\alpha_3^{(1)}]$)

Fig. 6. $\Gamma_1$ (graph not reproduced; vertices $[\lambda_0]$, $[\lambda_6]$, $[\lambda_2]$, $[\lambda_8]$, $[\lambda_4]$, $[\lambda_{10}]$ and $[\alpha_0]$, $[\alpha_2]$, $[\alpha_{10}]$)

Fig. 7. $\Gamma_2$ (graph not reproduced; vertices $[\lambda_1]$, $[\lambda_7]$, $[\lambda_5]$, $[\lambda_3]$, $[\lambda_9]$ and $[\alpha_1]$, $[\alpha_3^{(1)}]$, $[\alpha_9]$)

We are in the fortunate situation that we can write all elements of $\mathcal{V}$ as (integral) linear combinations of $[\alpha_j]$'s, i.e. the homomorphism $[\alpha]$ is surjective. Thus we can determine their fusion rules from those of $LSU(2)$. For instance, we compute
\[ [\alpha_3^{(1)}] \times [\alpha_1] = ([\alpha_3] \times [\alpha_1]) \ominus ([\alpha_9] \times [\alpha_1]) = ([\alpha_2] \oplus [\alpha_4]) \ominus ([\alpha_8] \oplus [\alpha_{10}]) = [\alpha_2]. \]
In particular, we can draw the fusion graph for $[\alpha_{\Lambda_{(1)}}] \equiv [\alpha_1]$. It is straightforward to check that this is $E_6$, Fig. 5. The homomorphism $[\alpha] : W \to V$ induces an induction-restriction graph connecting $A_{11}$ and $E_6$. We just draw an edge from each spin $j$ vertex of $A_{11}$ to the vertices of $E_6$ that represent the irreducible subsectors in the decomposition of $[\alpha_j]$. For example, we draw from the spin $j = 4$ vertex one line to the vertex $[\alpha_2]$ and one to $[\alpha_{10}]$. Completing the picture we obtain a graph with two connected components $\Gamma_1$ and $\Gamma_2$ corresponding

to the even and odd spins, respectively, see Figs. 6, 7. These graphs are actually well known as graphs connecting $A_{11}$ and $E_6$, cf. [43, 16, 31, 26]. Indeed, one can show that $\Gamma_1$ is the principal graph for the inclusion $N(I_\circ) \subset M(I_\circ)$. We plan to come back to this fact in a separate publication.

Now we turn to the discussion of the marked vertices. Let $\beta_0, \beta_v, \beta_s \in \Delta_{\mathcal{M}}(I_\circ)$ be endomorphisms corresponding to the level 1 basic, vector and spinor representation of $LSO(5)$ (as constructed in [3]). From the blocks in Eq. (7) we can read off the decomposition of the σ-restricted endomorphisms, $[\sigma_{\beta_0}] = [\lambda_0] \oplus [\lambda_6]$, $[\sigma_{\beta_v}] = [\lambda_4] \oplus [\lambda_{10}]$, $[\sigma_{\beta_s}] = [\lambda_3] \oplus [\lambda_7]$. By ασ-reciprocity, we conclude that $[\beta_0] = [\mathrm{id}_{M(I_\circ)}]$ must appear in $[\alpha_0]$ and $[\alpha_6]$, $[\beta_v]$ in $[\alpha_4]$ and $[\alpha_{10}]$, and $[\beta_s]$ in $[\alpha_3]$ and $[\alpha_7]$, each with multiplicity one. Hence we conclude $[\alpha_0] = [\beta_0]$, $[\alpha_{10}] = [\beta_v]$, $[\alpha_3^{(1)}] = [\beta_s]$. In Fig. 5 we encircled the marked vertices (and we will do so in the following examples as well). It is easy to check that $[\alpha_0]$, $[\alpha_{10}]$ and $[\alpha_3^{(1)}]$ indeed obey the Ising fusion rules, e.g.
\[ [\alpha_3^{(1)}] \times [\alpha_3^{(1)}] = ([\alpha_3] \ominus [\alpha_9]) \times ([\alpha_3] \ominus [\alpha_9]) = [\alpha_0] \oplus [\alpha_{10}], \]
as is well known for the end vertices in the graph algebra of $E_6$. This now finds an explanation in the machinery of α-induction and σ-restriction. Put differently, our theory proves again the result of [3], namely that the endomorphisms associated to the level 1 positive energy representations of $LSO(5)$ obey the Ising fusion rules.

2.3. More examples. (i) Example. $SU(2)_{28} \subset (G_2)_1$. The corresponding modular invariant is the $E_8$ one,
\[ Z_{E_8} = |\chi_0 + \chi_{10} + \chi_{18} + \chi_{28}|^2 + |\chi_6 + \chi_{12} + \chi_{16} + \chi_{22}|^2. \]


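Fig. 8. $E_8$ (graph not reproduced; vertices $[\alpha_0]$, $[\alpha_1]$, $[\alpha_2]$, $[\alpha_3]$, $[\alpha_4]$, $[\alpha_5^{(1)}]$, $[\alpha_5^{(2)}]$, $[\alpha_6^{(1)}]$)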

The two blocks come from the positive energy representations $\pi^0$ and $\pi^\phi$ of $L(G_2)$ at level 1. With $[\theta] = [\lambda_0] \oplus [\lambda_{10}] \oplus [\lambda_{18}] \oplus [\lambda_{28}]$ we can determine the structure of the induced sector algebra $V$. We omit the straightforward calculations and just present the result here. We find that the sector basis $\mathcal{V}$ has eight elements, given by $[\alpha_0]$, $[\alpha_1]$, $[\alpha_2]$, $[\alpha_3]$, $[\alpha_4]$, $[\alpha_5^{(1)}]$, $[\alpha_5^{(2)}]$ and $[\alpha_6^{(1)}]$. The decompositions of the reducible $[\alpha_j]$'s read
\begin{align*}
[\alpha_5] &= [\alpha_5^{(1)}] \oplus [\alpha_5^{(2)}], \\
[\alpha_6] &= [\alpha_4] \oplus [\alpha_6^{(1)}], \\
[\alpha_7] &= [\alpha_3] \oplus [\alpha_5^{(1)}], \\
[\alpha_8] &= [\alpha_2] \oplus [\alpha_4], \\
[\alpha_9] &= [\alpha_1] \oplus [\alpha_3] \oplus [\alpha_5^{(2)}], \\
[\alpha_{10}] &= [\alpha_0] \oplus [\alpha_2] \oplus [\alpha_4], \\
[\alpha_{11}] &= [\alpha_1] \oplus [\alpha_3] \oplus [\alpha_5^{(1)}], \\
[\alpha_{12}] &= [\alpha_2] \oplus [\alpha_4] \oplus [\alpha_6^{(1)}], \\
[\alpha_{13}] &= [\alpha_3] \oplus [\alpha_5^{(1)}] \oplus [\alpha_5^{(2)}], \\
[\alpha_{14}] &= 2[\alpha_4],
\end{align*}
and we have $[\alpha_{28-j}] = [\alpha_j]$. The fusion graph of $[\alpha_1]$ is in fact $E_8$, given in Fig. 8. The marked vertices are given by $[\alpha_0] = [\beta_0]$, $[\alpha_6^{(1)}] = [\beta_\phi]$, and it is easy to check that they indeed obey the Lee-Yang fusion rules $[\alpha_6^{(1)}] \times [\alpha_6^{(1)}] = [\alpha_0] \oplus [\alpha_6^{(1)}]$ of $(G_2)_1$, i.e. here our theory proves that the endomorphisms associated to the $(G_2)_1$ positive energy representations obey these fusion rules.

(ii) Example. $SU(2)_4 \subset SU(3)_1$. The corresponding modular invariant is the $D_4$ one,
\[ Z_{D_4} = |\chi_0 + \chi_4|^2 + 2|\chi_2|^2. \]
The first block comes from the vacuum representation $\pi^{(0,0)}$ and the second one from the positive energy representations $\pi^{(1,0)}$ and $\pi^{(1,1)}$ of $LSU(3)$ at level 1 which both restrict to the spin 2 representation of $LSU(2)$ at level 4. With $[\theta] = [\lambda_0] \oplus [\lambda_4]$ we find that $\mathcal{V}$ has four elements, namely $[\alpha_0]$, $[\alpha_1]$, $[\alpha_2^{(1)}]$, $[\alpha_2^{(2)}]$, where we have the decomposition $[\alpha_2] = [\alpha_2^{(1)}] \oplus [\alpha_2^{(2)}]$, and also $[\alpha_{4-j}] = [\alpha_j]$. Note that we cannot isolate $[\alpha_2^{(1)}]$ and $[\alpha_2^{(2)}]$. Thus in this case the homomorphism $[\alpha]$ is not surjective!

Fig. 9. $D_4$ (graph not reproduced; vertices $[\alpha_0]$, $[\alpha_1]$, $[\alpha_2^{(1)}]$, $[\alpha_2^{(2)}]$)

However, since
\[ [\alpha_1] \times [\alpha_2] = [\alpha_1] \oplus [\alpha_3] = 2[\alpha_1], \]
it follows that
\[ [\alpha_1] \times [\alpha_2^{(i)}] = [\alpha_1], \qquad i = 1, 2, \]
and we find
\[ [\alpha_1] \times [\alpha_1] = [\alpha_0] \oplus [\alpha_2] = [\alpha_0] \oplus [\alpha_2^{(1)}] \oplus [\alpha_2^{(2)}], \]

hence the fusion graph of $[\alpha_1]$ is uniquely determined to be $D_4$, see Fig. 9. The $SU(3)_1$ positive energy representations obey $\mathbb{Z}_3$ fusion rules, and from ασ-reciprocity we conclude that the marked vertices are given by $[\alpha_0] = [\beta_{(0,0)}]$, $[\alpha_2^{(1)}] = [\beta_{(1,0)}]$, $[\alpha_2^{(2)}] = [\beta_{(1,1)}]$. (Clearly we have the freedom to define which is $[\alpha_2^{(1)}]$ and which $[\alpha_2^{(2)}]$.)

(iii) Example. $SU(3)_3 \subset SO(8)_1$. We now turn to the treatment of the SU(3) conformal embeddings. We shall label the $LSU(3)$ level $k$ positive energy representations by pairs of integers $(p, q)$, $k \ge p \ge q \ge 0$, that give the lengths of the rows of the associated Young tableaux. Thus the (first) fundamental representation has the label $(1, 0)$. We denote the endomorphism that corresponds to the positive energy representation labelled by $(p, q)$ by $\lambda_{(p,q)}$. Thus the sectors $[\lambda_{(p,q)}]$ constitute the sector basis of the fusion algebra $W(3, k)$. Recall that the fusion of the sector $[\lambda_{(1,0)}]$ that corresponds to the fundamental representation is
\[ [\lambda_{(p,q)}] \times [\lambda_{(1,0)}] = [\lambda_{(p+1,q)}] \oplus [\lambda_{(p,q+1)}] \oplus [\lambda_{(p-1,q-1)}], \]
where it is understood that on the r.h.s. only sectors inside $\mathcal{A}^{(k+3)}$ contribute. Now for the conformal embedding $SU(3)_3 \subset SO(8)_1$, the corresponding modular invariant reads
\[ Z_{\mathcal{D}^{(6)}} = |\chi_{(0,0)} + \chi_{(3,0)} + \chi_{(3,3)}|^2 + 3|\chi_{(2,1)}|^2, \]
thus we have $[\theta] = [\lambda_{(0,0)}] \oplus [\lambda_{(3,0)}] \oplus [\lambda_{(3,3)}]$. We find that $\mathcal{V}$ has six elements, $[\alpha_{(0,0)}]$, $[\alpha_{(1,0)}]$, $[\alpha_{(1,1)}]$, $[\alpha_{(2,1)}^{(1)}]$, $[\alpha_{(2,1)}^{(2)}]$ and $[\alpha_{(2,1)}^{(3)}]$. Here the only non-trivial decomposition is $[\alpha_{(2,1)}] = [\alpha_{(2,1)}^{(1)}] \oplus [\alpha_{(2,1)}^{(2)}] \oplus [\alpha_{(2,1)}^{(3)}]$, and we have $[\alpha_{(p,q)}] = [\alpha_{(3-q,p-q)}]$. The fusion graph of $[\alpha_{(1,0)}]$ is indeed $\mathcal{D}^{(6)}$, see Fig. 10. The marked vertices are $[\alpha_{(0,0)}]$, $[\alpha_{(2,1)}^{(1)}]$, $[\alpha_{(2,1)}^{(2)}]$ and $[\alpha_{(2,1)}^{(3)}]$ and hence obey the $\mathbb{Z}_2 \times \mathbb{Z}_2$ fusion rules of $SO(8)_1$.

Fig. 10. $\mathcal{D}^{(6)}$ (graph not reproduced; vertices $[\alpha_{(0,0)}]$, $[\alpha_{(1,0)}]$, $[\alpha_{(1,1)}]$, $[\alpha_{(2,1)}^{(1)}]$, $[\alpha_{(2,1)}^{(2)}]$, $[\alpha_{(2,1)}^{(3)}]$)

The other D-type block-diagonal modular invariants, namely $D_{2\varrho+2}$ for SU(2) and $\mathcal{D}^{(3\varrho+3)}$ for SU(3), $\varrho = 2, 3, 4, \ldots$, do not come from conformal inclusions. This will be discussed in the following section.
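The fundamental fusion rule of $W(3,k)$ quoted in Example (iii) is easy to tabulate. The following Python sketch is an illustration only (the function names `su3_labels` and `fuse_with_fundamental` are hypothetical): it lists the summands of $[\lambda_{(p,q)}] \times [\lambda_{(1,0)}]$ and discards the candidates that violate the labelling condition $k \ge p \ge q \ge 0$ stated in the text.

```python
def su3_labels(k):
    """All level-k labels (p, q) with k >= p >= q >= 0."""
    return [(p, q) for p in range(k + 1) for q in range(p + 1)]


def fuse_with_fundamental(p, q, k):
    """Summands of [lambda_(p,q)] x [lambda_(1,0)] at level k.

    The three candidates are those of the rule quoted in the text;
    a candidate contributes only if it is again a valid level-k label.
    """
    candidates = [(p + 1, q), (p, q + 1), (p - 1, q - 1)]
    return [(a, b) for (a, b) in candidates if k >= a >= b >= 0]


k = 3  # the level relevant for SU(3)_3 in SO(8)_1
for label in su3_labels(k):
    print(label, "->", fuse_with_fundamental(*label, k))
```

Iterating this map over all labels gives the adjacency matrix of the fusion graph of $[\lambda_{(1,0)}]$ in $W(3,k)$, which is the starting point for the induced graphs discussed in these examples.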

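(iv) Example. $SU(3)_5 \subset SU(6)_1$. The corresponding modular invariant reads
\begin{align*}
Z_{\mathcal{E}^{(8)}} ={}& |\chi_{(0,0)} + \chi_{(4,2)}|^2 + |\chi_{(2,0)} + \chi_{(5,3)}|^2 + |\chi_{(2,2)} + \chi_{(5,2)}|^2 \\
& + |\chi_{(3,0)} + \chi_{(3,3)}|^2 + |\chi_{(3,1)} + \chi_{(5,5)}|^2 + |\chi_{(3,2)} + \chi_{(5,0)}|^2,
\end{align*}
hence
\[ [\theta] = [\lambda_{(0,0)}] \oplus [\lambda_{(4,2)}]. \]
By computing all the numbers $\langle \alpha_{(p,q)}, \alpha_{(r,s)} \rangle_{M(I_\circ)} = \langle \theta \circ \lambda_{(p,q)}, \lambda_{(r,s)} \rangle_{N(I_\circ)}$ (where we denote $\alpha_{(p,q)} = \alpha_{\lambda_{(p,q)}}$) we find that $\mathcal{V}$ has 12 elements, and the reducible $[\alpha_{(p,q)}]$'s decompose into these irreducibles as
\begin{align*}
[\alpha_{(2,0)}] &= [\alpha_{(4,4)}] \oplus [\alpha_{(2,0)}^{(1)}], \\
[\alpha_{(2,1)}] &= [\alpha_{(5,1)}] \oplus [\alpha_{(5,4)}], \\
[\alpha_{(2,2)}] &= [\alpha_{(4,0)}] \oplus [\alpha_{(2,2)}^{(1)}], \\
[\alpha_{(3,0)}] &= [\alpha_{(5,4)}] \oplus [\alpha_{(3,0)}^{(1)}], \\
[\alpha_{(3,1)}] &= [\alpha_{(1,0)}] \oplus [\alpha_{(4,0)}] \oplus [\alpha_{(5,5)}], \\
[\alpha_{(3,2)}] &= [\alpha_{(1,1)}] \oplus [\alpha_{(4,4)}] \oplus [\alpha_{(5,0)}], \\
[\alpha_{(3,3)}] &= [\alpha_{(5,1)}] \oplus [\alpha_{(3,0)}^{(1)}], \\
[\alpha_{(4,1)}] &= [\alpha_{(1,1)}] \oplus [\alpha_{(4,4)}], \\
[\alpha_{(4,2)}] &= [\alpha_{(0,0)}] \oplus [\alpha_{(5,1)}] \oplus [\alpha_{(5,4)}], \\
[\alpha_{(4,3)}] &= [\alpha_{(1,0)}] \oplus [\alpha_{(4,0)}], \\
[\alpha_{(5,2)}] &= [\alpha_{(1,0)}] \oplus [\alpha_{(2,2)}^{(1)}], \\
[\alpha_{(5,3)}] &= [\alpha_{(1,1)}] \oplus [\alpha_{(2,0)}^{(1)}].
\end{align*}
We find that the homomorphism $[\alpha]$ is surjective as we can invert these formulae; namely we obtain
\begin{align*}
[\alpha_{(2,0)}^{(1)}] &= [\alpha_{(2,0)}] \ominus [\alpha_{(4,4)}], \\
[\alpha_{(2,2)}^{(1)}] &= [\alpha_{(2,2)}] \ominus [\alpha_{(4,0)}], \\
[\alpha_{(3,0)}^{(1)}] &= \tfrac{1}{2} \left( [\alpha_{(3,0)}] \oplus [\alpha_{(3,3)}] \ominus [\alpha_{(5,1)}] \ominus [\alpha_{(5,4)}] \right).
\end{align*}
It is then a straightforward calculation to determine the fusion rules of $V$, and the fusion graph of $[\alpha_{(1,0)}]$ is given by Fig. 11. The marked vertices are given by $[\alpha_{(0,0)}]$, $[\alpha_{(2,2)}^{(1)}]$, $[\alpha_{(5,0)}]$, $[\alpha_{(3,0)}^{(1)}]$, $[\alpha_{(5,5)}]$ and $[\alpha_{(2,0)}^{(1)}]$, and one may check that they in fact obey the $\mathbb{Z}_6$ fusion rules of $SU(6)_1$.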

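Fig. 11. $\mathcal{E}^{(8)}$ (graph not reproduced; vertices $[\alpha_{(0,0)}]$, $[\alpha_{(1,0)}]$, $[\alpha_{(1,1)}]$, $[\alpha_{(4,0)}]$, $[\alpha_{(4,4)}]$, $[\alpha_{(5,0)}]$, $[\alpha_{(5,1)}]$, $[\alpha_{(5,4)}]$, $[\alpha_{(5,5)}]$, $[\alpha_{(2,0)}^{(1)}]$, $[\alpha_{(2,2)}^{(1)}]$, $[\alpha_{(3,0)}^{(1)}]$)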

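2.4. A non-commutative sector algebra. Example. $SU(4)_4 \subset SO(15)_1$. Labelling the positive energy representations of $SU(4)_4$ with partitions $(p_1, p_2, p_3) \in \mathbb{Z}^3$, $4 \ge p_1 \ge p_2 \ge p_3 \ge 0$, the corresponding modular invariant reads
\begin{align*}
Z ={}& |\chi_{(0,0,0)} + \chi_{(3,1,0)} + \chi_{(3,3,2)} + \chi_{(4,4,0)}|^2 \\
& + |\chi_{(2,1,1)} + \chi_{(4,0,0)} + \chi_{(4,3,1)} + \chi_{(4,4,4)}|^2 + 4|\chi_{(3,2,1)}|^2,
\end{align*}
where the blocks correspond to the basic, the vector and the spinor representation of $SO(15)_1$. With $[\theta] = [\lambda_{(0,0,0)}] \oplus [\lambda_{(3,1,0)}] \oplus [\lambda_{(3,3,2)}] \oplus [\lambda_{(4,4,0)}]$ we can compute the table ($\alpha_{(p_1,p_2,p_3)} \equiv \alpha_{\lambda_{(p_1,p_2,p_3)}}$)
\[ \langle \alpha_{(p_1,p_2,p_3)}, \alpha_{(q_1,q_2,q_3)} \rangle_{M(I_\circ)} = \langle \theta \circ \lambda_{(p_1,p_2,p_3)}, \lambda_{(q_1,q_2,q_3)} \rangle_{N(I_\circ)}. \]
Evaluating this table we first obtain that $[\alpha_{(0,0,0)}]$, $[\alpha_{(1,0,0)}]$, $[\alpha_{(1,1,0)}]$, $[\alpha_{(1,1,1)}]$ and $[\alpha_{(4,0,0)}]$ are distinct irreducible sectors and we find identities
\begin{align*}
[\alpha_{(0,0,0)}] &= [\alpha_{(4,4,0)}], \\
[\alpha_{(1,0,0)}] &= [\alpha_{(3,3,3)}] = [\alpha_{(4,1,0)}] = [\alpha_{(4,4,1)}], \\
[\alpha_{(1,1,0)}] &= [\alpha_{(2,0,0)}] = [\alpha_{(2,2,2)}] = [\alpha_{(3,3,0)}] = [\alpha_{(4,1,1)}] = [\alpha_{(4,2,0)}] = [\alpha_{(4,3,3)}] = [\alpha_{(4,4,2)}], \\
[\alpha_{(1,1,1)}] &= [\alpha_{(3,0,0)}] = [\alpha_{(4,3,0)}] = [\alpha_{(4,4,3)}], \\
[\alpha_{(4,0,0)}] &= [\alpha_{(4,4,4)}].
\end{align*}
We find seven further elements in $\mathcal{V}$, namely $[\alpha_{(2,1,0)}^{(i)}]$, $[\alpha_{(2,2,0)}^{(i)}]$, $[\alpha_{(2,2,1)}^{(i)}]$, $i = 1, 2$, and $[\alpha_{(3,2,1)}^{(1)}]$, and the reducible $[\alpha_{(p_1,p_2,p_3)}]$ decompose in this basis as follows,
\begin{align*}
[\alpha_{(2,1,0)}] &= [\alpha_{(1,1,1)}] \oplus [\alpha_{(2,1,0)}^{(1)}] \oplus [\alpha_{(2,1,0)}^{(2)}], \\
[\alpha_{(2,1,1)}] &= [\alpha_{(4,0,0)}] \oplus [\alpha_{(2,2,0)}^{(1)}] \oplus [\alpha_{(2,2,0)}^{(2)}], \\
[\alpha_{(2,2,0)}] &= [\alpha_{(2,2,0)}^{(1)}] \oplus [\alpha_{(2,2,0)}^{(2)}], \\
[\alpha_{(2,2,1)}] &= [\alpha_{(1,0,0)}] \oplus [\alpha_{(2,2,1)}^{(1)}] \oplus [\alpha_{(2,2,1)}^{(2)}], \\
[\alpha_{(3,1,0)}] &= [\alpha_{(0,0,0)}] \oplus [\alpha_{(2,2,0)}^{(1)}] \oplus [\alpha_{(2,2,0)}^{(2)}], \\
[\alpha_{(3,1,1)}] &= [\alpha_{(1,0,0)}] \oplus [\alpha_{(2,2,1)}^{(1)}] \oplus [\alpha_{(2,2,1)}^{(2)}], \\
[\alpha_{(3,2,0)}] &= [\alpha_{(1,0,0)}] \oplus [\alpha_{(2,2,1)}^{(1)}] \oplus [\alpha_{(2,2,1)}^{(2)}], \\
[\alpha_{(3,2,1)}] &= 2[\alpha_{(1,1,0)}] \oplus 2[\alpha_{(3,2,1)}^{(1)}], \\
[\alpha_{(3,2,2)}] &= [\alpha_{(1,1,1)}] \oplus [\alpha_{(2,1,0)}^{(1)}] \oplus [\alpha_{(2,1,0)}^{(2)}], \\
[\alpha_{(3,3,1)}] &= [\alpha_{(1,1,1)}] \oplus [\alpha_{(2,1,0)}^{(1)}] \oplus [\alpha_{(2,1,0)}^{(2)}], \\
[\alpha_{(3,3,2)}] &= [\alpha_{(0,0,0)}] \oplus [\alpha_{(2,2,0)}^{(1)}] \oplus [\alpha_{(2,2,0)}^{(2)}], \\
[\alpha_{(4,2,1)}] &= [\alpha_{(1,1,1)}] \oplus [\alpha_{(2,1,0)}^{(1)}] \oplus [\alpha_{(2,1,0)}^{(2)}], \\
[\alpha_{(4,2,2)}] &= [\alpha_{(2,2,0)}^{(1)}] \oplus [\alpha_{(2,2,0)}^{(2)}], \\
[\alpha_{(4,3,1)}] &= [\alpha_{(4,0,0)}] \oplus [\alpha_{(2,2,0)}^{(1)}] \oplus [\alpha_{(2,2,0)}^{(2)}], \\
[\alpha_{(4,3,2)}] &= [\alpha_{(1,0,0)}] \oplus [\alpha_{(2,2,1)}^{(1)}] \oplus [\alpha_{(2,2,1)}^{(2)}].
\end{align*}
The marked vertices are $[\alpha_{(0,0,0)}]$, $[\alpha_{(4,0,0)}]$ and $[\alpha_{(3,2,1)}^{(1)}]$, corresponding to the basic, vector and spinor representation of $SO(15)_1$, respectively. That the spinor representation $\pi^s$ of $SO(15)_1$ restricts to two copies of $\pi_{(3,2,1)}$, i.e. $b_{s,(3,2,1)} = 2$, implies in particular that $[\alpha_{(3,2,1)}^{(1)}]$ appears in the decomposition of $[\alpha_{(3,2,1)}]$ with multiplicity 2 by ασ-reciprocity. Using the $SU(4)_4$ fusion rules, i.e. those of $W(4,4)$, we obtain the following sector products by the homomorphism property of α-induction,
\begin{align}
[\alpha_{(1,0,0)}] \times [\alpha_{(1,0,0)}] &= 2[\alpha_{(1,1,0)}], \tag{9} \\
[\alpha_{(1,0,0)}] \times [\alpha_{(1,1,0)}] &= 2[\alpha_{(1,1,1)}] \oplus [\alpha_{(2,1,0)}^{(1)}] \oplus [\alpha_{(2,1,0)}^{(2)}], \tag{10} \\
[\alpha_{(1,0,0)}] \times [\alpha_{(1,1,1)}] &= [\alpha_{(0,0,0)}] \oplus [\alpha_{(4,0,0)}] \oplus [\alpha_{(2,2,0)}^{(1)}] \oplus [\alpha_{(2,2,0)}^{(2)}], \tag{11} \\
[\alpha_{(1,0,0)}] \times [\alpha_{(4,0,0)}] &= [\alpha_{(1,0,0)}], \tag{12} \\
[\alpha_{(1,0,0)}] \times [\alpha_{(3,2,1)}^{(1)}] &= [\alpha_{(2,1,0)}^{(1)}] \oplus [\alpha_{(2,1,0)}^{(2)}]. \tag{13}
\end{align}
However, as the homomorphism $[\alpha] : W \to V$ is not surjective we cannot isolate $[\alpha_{(2,1,0)}^{(i)}]$, $[\alpha_{(2,2,0)}^{(i)}]$ and $[\alpha_{(2,2,1)}^{(i)}]$, $i = 1, 2$. First we can only compute
\begin{align}
[\alpha_{(1,0,0)}] \times \big( [\alpha_{(2,1,0)}^{(1)}] \oplus [\alpha_{(2,1,0)}^{(2)}] \big) &= 2[\alpha_{(2,2,0)}^{(1)}] \oplus 2[\alpha_{(2,2,0)}^{(2)}], \tag{14} \\
[\alpha_{(1,0,0)}] \times \big( [\alpha_{(2,2,0)}^{(1)}] \oplus [\alpha_{(2,2,0)}^{(2)}] \big) &= 2[\alpha_{(1,0,0)}] \oplus 2[\alpha_{(2,2,1)}^{(1)}] \oplus 2[\alpha_{(2,2,1)}^{(2)}], \tag{15} \\
[\alpha_{(1,0,0)}] \times \big( [\alpha_{(2,2,1)}^{(1)}] \oplus [\alpha_{(2,2,1)}^{(2)}] \big) &= 2[\alpha_{(1,1,0)}] \oplus 2[\alpha_{(3,2,1)}^{(1)}]. \tag{16}
\end{align}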

Now recall that the statistical dimension of the positive energy representation of $SU(n)_k$ labelled by a partition $(p_1, p_2, \ldots, p_{n-1})$, $k \ge p_1 \ge p_2 \ge \cdots \ge p_{n-1} \ge p_n \equiv 0$, is given by
\[ d_{(p_1,p_2,\ldots,p_{n-1})} = \prod_{1 \le i < j \le n} \frac{\sin\left( \frac{(p_i - p_j + j - i)\pi}{n+k} \right)}{\sin\left( \frac{(j-i)\pi}{n+k} \right)}. \]
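The product formula above is straightforward to evaluate numerically. The sketch below is only an illustration (the function name `statistical_dimension` is hypothetical): it implements the formula as stated, appending $p_n = 0$ to the partition, and prints the values for a few $SU(4)_4$ labels that are used in this example.

```python
import math


def statistical_dimension(partition, k):
    """Statistical dimension of the SU(n)_k representation with the given partition.

    partition = (p_1, ..., p_{n-1}); p_n = 0 is appended, so n = len(partition) + 1.
    """
    p = list(partition) + [0]
    n = len(p)
    d = 1.0
    for i in range(n):
        for j in range(i + 1, n):
            # The difference j - i is the same for 0-based and 1-based indices.
            d *= math.sin((p[i] - p[j] + j - i) * math.pi / (n + k))
            d /= math.sin((j - i) * math.pi / (n + k))
    return d


k = 4
for label in [(0, 0, 0), (1, 0, 0), (1, 1, 0), (4, 0, 0)]:
    print(label, statistical_dimension(label, k))

print(1 / math.sin(math.pi / 8), 2 + math.sqrt(2))  # closed forms for comparison
```

Comparing the printed values with $\sin(\pi/8)^{-1}$ and $2 + \sqrt{2}$ gives a quick consistency check on the closed forms used in the dimension arguments of this subsection.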

With $n = k = 4$ we obtain $d_{(1,0,0)} = \sin(\pi/8)^{-1}$. Since the marked vertex $[\alpha_{(3,2,1)}^{(1)}]$ has statistical dimension $d_{(3,2,1)}^{(1)} = \sqrt{2} \equiv 4\sin(\pi/8)\cos(\pi/8)$ (we write $d_{(p_1,p_2,p_3)}^{(i)} \equiv d_{\alpha_{(p_1,p_2,p_3)}^{(i)}}$) we obtain from Eq. (13) $d_{(2,1,0)}^{(1)} + d_{(2,1,0)}^{(2)} = d_{(1,0,0)}\, d_{(3,2,1)}^{(1)} = 4\cos(\pi/8)$. So we may assume without loss of generality that $d_{(2,1,0)}^{(1)} \le 2\cos(\pi/8)$. As $4\cos^2(\pi/8) = 2 + \sqrt{2} < 4$ it follows that $[\alpha_{(2,1,0)}^{(1)}] \times [\bar{\alpha}_{(2,1,0)}^{(1)}]$ decomposes into at most three irreducible sectors. Therefore we conclude by Eq. (11) that
\[ \langle \alpha_{(1,0,0)} \circ \alpha_{(2,1,0)}^{(1)}, \alpha_{(1,0,0)} \circ \alpha_{(2,1,0)}^{(1)} \rangle_{M(I_\circ)} = \langle \alpha_{(1,1,1)} \circ \alpha_{(1,0,0)}, \alpha_{(2,1,0)}^{(1)} \circ \bar{\alpha}_{(2,1,0)}^{(1)} \rangle_{M(I_\circ)} \le 3, \]
and thus $[\alpha_{(1,0,0)}] \times [\alpha_{(2,1,0)}^{(1)}]$ cannot contain an irreducible sector with multiplicity larger than one. But we also have
\[ \langle \alpha_{(1,0,0)} \circ \alpha_{(2,1,0)}^{(1)}, \alpha_{(2,2,0)} \rangle_{M(I_\circ)} = \langle \alpha_{(2,1,0)}^{(1)}, \alpha_{(1,1,1)} \circ \alpha_{(2,2,0)} \rangle_{M(I_\circ)} = 2, \]

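since one checks $[\alpha_{(1,1,1)}] \times [\alpha_{(2,2,0)}] = 2[\alpha_{(2,1,0)}]$. It follows by comparison with Eq. (14),
\[ [\alpha_{(1,0,0)}] \times [\alpha_{(2,1,0)}^{(i)}] = [\alpha_{(2,2,0)}^{(1)}] \oplus [\alpha_{(2,2,0)}^{(2)}], \qquad i = 1, 2, \tag{17} \]
and $d_{(2,1,0)}^{(i)} = 2\cos(\pi/8)$, $i = 1, 2$.

We have $[\bar{\alpha}_{(2,1,0)}] = [\alpha_{(2,2,1)}]$, and hence with a suitable choice of notation $[\bar{\alpha}_{(2,1,0)}^{(i)}] = [\alpha_{(2,2,1)}^{(i)}]$ for $i = 1, 2$. One checks
\[ \big( [\alpha_{(2,1,0)}^{(1)}] \oplus [\alpha_{(2,1,0)}^{(2)}] \big) \times [\alpha_{(1,1,1)}] = 2[\alpha_{(1,1,0)}] \oplus 2[\alpha_{(3,2,1)}^{(1)}], \]
and since $2 + \sqrt{2} = d_{(1,1,0)} \neq d_{(3,2,1)}^{(1)} = \sqrt{2}$ we find
\[ [\alpha_{(2,1,0)}^{(i)}] \times [\alpha_{(1,1,1)}] = [\alpha_{(1,1,0)}] \oplus [\alpha_{(3,2,1)}^{(1)}], \qquad i = 1, 2, \tag{18} \]
and conjugation yields
\[ [\alpha_{(1,0,0)}] \times [\alpha_{(2,2,1)}^{(i)}] = [\alpha_{(1,1,0)}] \oplus [\alpha_{(3,2,1)}^{(1)}], \qquad i = 1, 2. \]
We have $[\bar{\alpha}_{(2,2,0)}] = [\alpha_{(2,2,0)}]$, and this is
\[ [\bar{\alpha}_{(2,2,0)}^{(1)}] \oplus [\bar{\alpha}_{(2,2,0)}^{(2)}] = [\alpha_{(2,2,0)}^{(1)}] \oplus [\alpha_{(2,2,0)}^{(2)}], \]
hence conjugation of Eq. (17) yields
\[ [\alpha_{(1,1,1)}] \times [\alpha_{(2,2,1)}^{(i)}] = [\alpha_{(2,2,0)}^{(1)}] \oplus [\alpha_{(2,2,0)}^{(2)}], \qquad i = 1, 2. \]
Thus we find for $i, j = 1, 2$,
\[ \langle \alpha_{(1,0,0)} \circ \alpha_{(2,2,0)}^{(j)}, \alpha_{(2,2,1)}^{(i)} \rangle_{M(I_\circ)} = \langle \alpha_{(2,2,0)}^{(j)}, \alpha_{(1,1,1)} \circ \alpha_{(2,2,1)}^{(i)} \rangle_{M(I_\circ)} = 1, \tag{19} \]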

and similarly we obtain (by use of Eq. (11))
\[ \langle \alpha_{(1,0,0)} \circ \alpha_{(2,2,0)}^{(i)}, \alpha_{(1,0,0)} \rangle_{M(I_\circ)} = \langle \alpha_{(2,2,0)}^{(i)}, \alpha_{(1,1,1)} \circ \alpha_{(1,0,0)} \rangle_{M(I_\circ)} = 1 \]
for $i = 1, 2$. It follows now from Eq. (15),
\[ [\alpha_{(1,0,0)}] \times [\alpha_{(2,2,0)}^{(i)}] = [\alpha_{(1,0,0)}] \oplus [\alpha_{(2,2,1)}^{(1)}] \oplus [\alpha_{(2,2,1)}^{(2)}], \qquad i = 1, 2. \tag{20} \]
We have succeeded in computing $[\alpha_{(1,0,0)}] \times [\beta]$ for each $[\beta] \in \mathcal{V}$, and thus we can draw the fusion graph given in Fig. 12.

Fig. 12. Fusion graph of $[\alpha_{(1,0,0)}]$ (graph not reproduced; vertices $[\alpha_{(0,0,0)}]$, $[\alpha_{(1,0,0)}]$, $[\alpha_{(1,1,0)}]$, $[\alpha_{(1,1,1)}]$, $[\alpha_{(4,0,0)}]$, $[\alpha_{(2,1,0)}^{(1)}]$, $[\alpha_{(2,1,0)}^{(2)}]$, $[\alpha_{(2,2,0)}^{(1)}]$, $[\alpha_{(2,2,0)}^{(2)}]$, $[\alpha_{(2,2,1)}^{(1)}]$, $[\alpha_{(2,2,1)}^{(2)}]$, $[\alpha_{(3,2,1)}^{(1)}]$)

Similarly one finds for the sector products of $[\alpha_{(1,1,0)}]$
\begin{align*}
[\alpha_{(1,1,0)}] \times [\alpha_{(1,1,0)}] &= [\alpha_{(0,0,0)}] \oplus [\alpha_{(4,0,0)}] \oplus 2[\alpha_{(2,2,0)}^{(1)}] \oplus 2[\alpha_{(2,2,0)}^{(2)}], \\
[\alpha_{(1,1,0)}] \times [\alpha_{(1,1,1)}] &= 2[\alpha_{(1,0,0)}] \oplus [\alpha_{(2,2,1)}^{(1)}] \oplus [\alpha_{(2,2,1)}^{(2)}], \\
[\alpha_{(1,1,0)}] \times [\alpha_{(4,0,0)}] &= [\alpha_{(1,1,0)}], \\
[\alpha_{(1,1,0)}] \times [\alpha_{(3,2,1)}^{(1)}] &= [\alpha_{(2,2,0)}^{(1)}] \oplus [\alpha_{(2,2,0)}^{(2)}],
\end{align*}
and also (we omit some details)
\begin{align*}
[\alpha_{(1,1,0)}] \times [\alpha_{(2,1,0)}^{(i)}] &= [\alpha_{(1,0,0)}] \oplus [\alpha_{(2,2,1)}^{(1)}] \oplus [\alpha_{(2,2,1)}^{(2)}], & i &= 1, 2, \\
[\alpha_{(1,1,0)}] \times [\alpha_{(2,2,0)}^{(i)}] &= 2[\alpha_{(1,1,0)}] \oplus [\alpha_{(3,2,1)}^{(1)}], & i &= 1, 2, \\
[\alpha_{(1,1,0)}] \times [\alpha_{(2,2,1)}^{(i)}] &= [\alpha_{(1,1,1)}] \oplus [\alpha_{(2,1,0)}^{(1)}] \oplus [\alpha_{(2,1,0)}^{(2)}], & i &= 1, 2.
\end{align*}
These equations are visualized in the (disconnected) fusion graph of $[\alpha_{(1,1,0)}]$, see Fig. 13.

Fig. 13. Fusion graph of $[\alpha_{(1,1,0)}]$ (graph not reproduced; same twelve vertices as in Fig. 12)

We now want to show that for this example the α-induced sector algebra is in fact non-commutative! (The appearance of a non-commutative sector structure associated to the conformal embedding $SU(4)_4 \subset SO(15)_1$ was first observed by Xu [57] in his framework.) From Eqs. (18) and (19) we obtain (recall $[\bar{\alpha}_{(2,1,0)}^{(i)}] = [\alpha_{(2,2,1)}^{(i)}]$)
\[ \langle \alpha_{(2,1,0)}^{(j)} \circ \alpha_{(2,1,0)}^{(i)}, \alpha_{(1,0,0)} \circ \alpha_{(1,0,0)} \rangle_{M(I_\circ)} = \langle \alpha_{(2,1,0)}^{(j)} \circ \alpha_{(1,1,1)}, \alpha_{(2,2,1)}^{(i)} \circ \alpha_{(1,0,0)} \rangle_{M(I_\circ)} = 2, \qquad i, j = 1, 2, \]

but from $[\alpha_{(1,0,0)}] \times [\alpha_{(1,0,0)}] = 2[\alpha_{(1,1,0)}]$ we conclude that $[\alpha_{(1,1,0)}]$ is a subsector of $[\alpha_{(2,1,0)}^{(j)}] \times [\alpha_{(2,1,0)}^{(i)}]$, and by matching the statistical dimensions we find indeed
\[ [\alpha_{(2,1,0)}^{(j)}] \times [\alpha_{(2,1,0)}^{(i)}] = [\alpha_{(1,1,0)}], \qquad i, j = 1, 2, \]
and hence
\[ \langle \alpha_{(2,1,0)}^{(i)} \circ \alpha_{(2,2,1)}^{(i)}, \alpha_{(2,2,1)}^{(i)} \circ \alpha_{(2,1,0)}^{(i)} \rangle_{M(I_\circ)} = \langle \alpha_{(2,1,0)}^{(i)} \circ \alpha_{(2,1,0)}^{(i)}, \alpha_{(2,1,0)}^{(i)} \circ \alpha_{(2,1,0)}^{(i)} \rangle_{M(I_\circ)} = 1, \qquad i = 1, 2. \tag{21} \]
But $[\alpha_{(2,1,0)}^{(i)}] \times [\alpha_{(2,2,1)}^{(i)}]$ as well as $[\alpha_{(2,2,1)}^{(i)}] \times [\alpha_{(2,1,0)}^{(i)}]$ must contain the identity sector $[\alpha_{(0,0,0)}]$, and also other sectors since $d_{(2,1,0)}^{(i)} = d_{(2,2,1)}^{(i)} = 2\cos(\pi/8) > 1$, $i = 1, 2$. Because Eq. (21) tells us that these products have only the identity sector in common we have shown
\[ [\alpha_{(2,1,0)}^{(i)}] \times [\alpha_{(2,2,1)}^{(i)}] \neq [\alpha_{(2,2,1)}^{(i)}] \times [\alpha_{(2,1,0)}^{(i)}], \qquad i = 1, 2. \]
Indeed one can compute these products as follows. Since
\[ [\alpha_{(2,1,0)}^{(i)}] \times [\alpha_{(1,0,0)}] = [\alpha_{(1,0,0)}] \times [\alpha_{(2,1,0)}^{(i)}] = [\alpha_{(2,2,0)}^{(1)}] \oplus [\alpha_{(2,2,0)}^{(2)}], \qquad i = 1, 2, \]
it follows
\[ \langle \alpha_{(2,1,0)}^{(i)} \circ \alpha_{(2,2,1)}^{(i)}, \alpha_{(1,1,1)} \circ \alpha_{(1,0,0)} \rangle_{M(I_\circ)} = \langle \alpha_{(1,0,0)} \circ \alpha_{(2,1,0)}^{(i)}, \alpha_{(1,0,0)} \circ \alpha_{(2,1,0)}^{(i)} \rangle_{M(I_\circ)} = 2, \qquad i = 1, 2, \]
and
\[ \langle \alpha_{(2,2,1)}^{(i)} \circ \alpha_{(2,1,0)}^{(i)}, \alpha_{(1,0,0)} \circ \alpha_{(1,1,1)} \rangle_{M(I_\circ)} = \langle \alpha_{(2,1,0)}^{(i)} \circ \alpha_{(1,0,0)}, \alpha_{(2,1,0)}^{(i)} \circ \alpha_{(1,0,0)} \rangle_{M(I_\circ)} = 2, \qquad i = 1, 2, \]
and from Eq. (11) we conclude that both $[\alpha_{(2,1,0)}^{(i)}] \times [\alpha_{(2,2,1)}^{(i)}]$ and $[\alpha_{(2,2,1)}^{(i)}] \times [\alpha_{(2,1,0)}^{(i)}]$ must contain, besides the identity sector, one of the sectors $[\alpha_{(4,0,0)}]$, $[\alpha_{(2,2,0)}^{(1)}]$ and $[\alpha_{(2,2,0)}^{(2)}]$. Let us assume that $[\alpha_{(2,1,0)}^{(1)}] \times [\alpha_{(2,2,1)}^{(1)}]$ contains $[\alpha_{(4,0,0)}]$. Then, because of a mismatch of quantum dimensions of $\sqrt{2}$, it necessarily contains a third sector (which is determined to be $[\alpha_{(3,2,1)}^{(1)}]$). Since now $[\alpha_{(2,2,1)}^{(1)}] \times [\alpha_{(2,1,0)}^{(1)}]$ cannot contain $[\alpha_{(4,0,0)}]$ it contains either $[\alpha_{(2,2,0)}^{(1)}]$ or $[\alpha_{(2,2,0)}^{(2)}]$, and as then the quantum dimensions match this means that $[\alpha_{(2,2,1)}^{(1)}] \times [\alpha_{(2,1,0)}^{(1)}]$ decomposes into two irreducible sectors whereas $[\alpha_{(2,1,0)}^{(1)}] \times [\alpha_{(2,2,1)}^{(1)}]$ decomposes into three. However, this contradicts
\[ \langle \alpha_{(2,1,0)}^{(1)} \circ \alpha_{(2,2,1)}^{(1)}, \alpha_{(2,1,0)}^{(1)} \circ \alpha_{(2,2,1)}^{(1)} \rangle_{M(I_\circ)} = \langle \alpha_{(2,2,1)}^{(1)} \circ \alpha_{(2,1,0)}^{(1)}, \alpha_{(2,2,1)}^{(1)} \circ \alpha_{(2,1,0)}^{(1)} \rangle_{M(I_\circ)}. \]
It follows, with a suitable choice of notation,
\begin{align*}
[\alpha_{(2,1,0)}^{(1)}] \times [\alpha_{(2,2,1)}^{(1)}] &= [\alpha_{(0,0,0)}] \oplus [\alpha_{(2,2,0)}^{(1)}], \\
[\alpha_{(2,2,1)}^{(1)}] \times [\alpha_{(2,1,0)}^{(1)}] &= [\alpha_{(0,0,0)}] \oplus [\alpha_{(2,2,0)}^{(2)}].
\end{align*}
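The dimension bookkeeping behind this last step can be written out explicitly. The following worked check is a sketch, not part of the original argument: it assumes $d_{(4,0,0)} = 1$ (obtained from the product formula for statistical dimensions) and $d^{(i)}_{(2,2,0)} = 1 + \sqrt{2}$, which follows from Eq. (20) by dividing by $d_{(1,0,0)} = \sin(\pi/8)^{-1}$. All other values are those stated in the text.

```latex
\begin{align*}
% from Eq. (20): d_{(1,0,0)} d^{(i)}_{(2,2,0)} = d_{(1,0,0)} + 2 \cdot 2\cos(\pi/8)
d^{(i)}_{(2,2,0)} &= 1 + 4\sin(\pi/8)\cos(\pi/8) = 1 + \sqrt{2}, \\
% dimension of either product of the two conjugate sectors
d^{(1)}_{(2,1,0)}\, d^{(1)}_{(2,2,1)} &= 4\cos^2(\pi/8) = 2 + \sqrt{2}, \\
% two-term decomposition: identity plus one of the [\alpha^{(i)}_{(2,2,0)}]
d_{(0,0,0)} + d^{(i)}_{(2,2,0)} &= 1 + (1 + \sqrt{2}) = 2 + \sqrt{2}, \\
% hypothetical three-term decomposition containing [\alpha_{(4,0,0)}]
d_{(0,0,0)} + d_{(4,0,0)} + d^{(1)}_{(3,2,1)} &= 1 + 1 + \sqrt{2} = 2 + \sqrt{2}.
\end{align*}
```

Both candidate decompositions are therefore compatible with the statistical dimensions, which is why the argument above has to rely on the equality of the two self-intertwiner dimensions, rather than on dimensions alone, to exclude the three-term case.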

Petkova and Zuber obtained the fusion graphs of Figs. 12 and 13 in a completely different and more empirical way (Fig. A.2. in [45]). The non-commutativity of V nicely explains why they could not find non-negative structure constants associated to these graphs: They were searching for a (commutative) fusion algebra.

3. The Treatment of Orbifold Inclusions

We have seen that conformal inclusions can be described in terms of nets of subfactors. For orbifold inclusions the extended net is not a priori given. However, it is argued in [52, 39] that orbifold type modular invariants arise from extensions of current algebras by some simple currents. The conformal dimensions of these simple currents are necessarily integers. In the following we will describe this idea in our "bounded operator framework". The techniques we use are not essentially new. Similar and often more general statements can be found in particular in [48, 49]. However, we prefer to give a self-contained presentation and to avoid unnecessary generality if this simplifies our arguments. Starting from the vacuum net $\mathcal{A}$ where $A(I) = \pi_0(L_I SU(n))''$ as usual we will construct a net of subfactors $\mathcal{N} \subset \mathcal{M}$ describing the analogue of the conformal inclusions now for the orbifold type modular invariants, and we will see that the extended net obtained by this construction is local exactly for the levels where the orbifold modular invariants appear.

3.1. Construction of the extended net. We call an automorphism $\hat{\sigma}_0 \in \Delta_{\mathcal{A}}(I_\circ)$ a simple current of order $n$, if $n = 2, 3, 4, \ldots$ is the smallest positive integer such that $\hat{\sigma}_0^n$ is equivalent to the identity, i.e. $\hat{\sigma}_0^n = \mathrm{Ad}(Y)$ for a unitary $Y \in B(\mathcal{H}_0)$, and then $Y \in A(I_\circ)$ by Haag duality. For our construction we need an equivalent automorphism $\sigma_0$ which is periodic, i.e. $\sigma_0^n$ is exactly the identity. We call $\rho_0 \in \Delta_{\mathcal{A}}(I_\circ)$ a fixed point of the simple current $\hat{\sigma}_0$ if $[\hat{\sigma}_0 \circ \rho_0] = [\rho_0]$. The following lemma gives a sufficient criterion for the possibility of a choice of a periodic automorphism (cf. [28], Prop. 3.3, or [48], Lemmata 4.4 and 4.5).

Lemma 3.1. Let $\hat{\sigma}_0 \in \Delta_{\mathcal{A}}(I_\circ)$ be a simple current of order $n$. If there is an irreducible fixed point $\rho_0 \in \Delta_{\mathcal{A}}(I_\circ)$ of $\hat{\sigma}_0$ then there is a simple current $\sigma_0 \in \Delta_{\mathcal{A}}(I_\circ)$ such that $[\sigma_0] = [\hat{\sigma}_0]$ and $\sigma_0^n = \mathrm{id}$.


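Proof. Since $[\hat\sigma_0 \circ \rho_0] = [\rho_0]$ there is a unitary $U \in A(I_\circ)$ such that $\hat\sigma_0 \circ \rho_0 = \mathrm{Ad}(U) \circ \rho_0$. We set $\sigma_0 = \mathrm{Ad}(U^*) \circ \hat\sigma_0$, so that $\sigma_0 \circ \rho_0 = \rho_0$. Then $\sigma_0^n$ is clearly inner, namely
\[
\sigma_0^n = (\mathrm{Ad}(U^*) \circ \hat\sigma_0)^n = \mathrm{Ad}\big(U^* \hat\sigma_0(U^*) \hat\sigma_0^2(U^*) \cdots \hat\sigma_0^{n-1}(U^*)\big) \circ \hat\sigma_0^n = \mathrm{Ad}(Z),
\]
where $Z = U^* \hat\sigma_0(U^*) \hat\sigma_0^2(U^*) \cdots \hat\sigma_0^{n-1}(U^*)\, Y$. (Recall $\hat\sigma_0^n = \mathrm{Ad}(Y)$.) Now we have $\rho_0 = \sigma_0^n \circ \rho_0 = \mathrm{Ad}(Z) \circ \rho_0$ and thus $Z \in \rho_0(A(I_\circ))' \cap A(I_\circ)$. Since we assumed $\rho_0$ to be irreducible it follows that $Z \in \mathbb{C}1$ and hence $\sigma_0^n = \mathrm{Ad}(Z) = \mathrm{id}$.

From now on we assume that there is an irreducible fixed point $\rho_0 \in \Delta_{\mathcal{A}}(I_\circ)$ for the simple current $\hat\sigma_0 \in \Delta_{\mathcal{A}}(I_\circ)$ of order $n$, and hence we have an equivalent periodic automorphism $\sigma_0 \in \Delta_{\mathcal{A}}(I_\circ)$, i.e. $\sigma_0^n = \mathrm{id}$, and also $\sigma_0 \circ \rho_0 = \rho_0$. The following construction is basically the construction of the field group and algebra as in [15]. Recall that $H_0$ is the vacuum Hilbert space on which $\mathcal{A}$ lives. We set
\[
H = \bigoplus_{p=0}^{n-1} H_0 .
\]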

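For a vector $\Psi \in H$ we denote by $\Psi_p \in H_0$ its $p$-th component with respect to this decomposition. We define a representation $\pi$ of $\mathcal{A}$ on $H$ by $\pi(a) = \bigoplus_{p=0}^{n-1} \sigma_0^p(a)$, i.e.
\[
(\pi(a)\Psi)_p = \sigma_0^p(a)\Psi_p, \qquad a \in \mathcal{A},\ \Psi \in H,\ p = 0, 1, 2, \ldots, n-1.
\]
Then the net $\mathcal{N}$ is defined in terms of local algebras by $N(I) = \pi(A(I))$, $I \in \mathcal{J}_z$. Pick a unitary $U_I$ such that $\sigma_{0;I} = \mathrm{Ad}(U_I) \circ \sigma_0 \in \Delta_{\mathcal{A}}(I)$ for some $I \in \mathcal{J}_z$. We define unitary field operators $f_{U_I} \in B(H)$ by
\[
(f_{U_I}\Psi)_p = \sigma_0^{p-1}(U_I^*)\Psi_{p-1}, \qquad \Psi \in H,\ p = 0, 1, 2, \ldots, n-1 \pmod n.
\]
It is easy to check that
\[
(f_{U_I}^*\Psi)_p = \sigma_0^{p}(U_I)\Psi_{p+1}, \qquad \Psi \in H,\ p = 0, 1, 2, \ldots, n-1 \pmod n.
\]
Then we find
\[
\begin{aligned}
(f_{U_I}^* \pi(a) f_{U_I}\Psi)_p &= \sigma_0^p(U_I)\,(\pi(a) f_{U_I}\Psi)_{p+1} \\
&= \sigma_0^p(U_I)\,\sigma_0^{p+1}(a)\,(f_{U_I}\Psi)_{p+1} \\
&= \sigma_0^p(U_I)\,\sigma_0^{p+1}(a)\,\sigma_0^p(U_I^*)\Psi_p = \sigma_0^p \circ \sigma_{0;I}(a)\Psi_p ,
\end{aligned}
\]
hence
\[
f_{U_I}^* \pi(a) f_{U_I} = \pi \circ \sigma_{0;I}(a), \qquad a \in \mathcal{A}. \tag{22}
\]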
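The covariance relation (22) can be checked by hand in a small finite-dimensional toy model. The following sketch is only an illustration under loud assumptions: the algebra is replaced by functions on $\mathbb{Z}_n$ acting by multiplication on $\mathbb{C}^n$, the periodic automorphism $\sigma_0$ is replaced by the cyclic shift of the argument, and $U_I = 1$ is taken, so that $f_{U_I} = f_1$. All function names are hypothetical and not part of the paper.

```python
# Toy model (assumption): A = functions on Z_n, acting on H0 = C^n by pointwise
# multiplication; sigma0 = cyclic shift of the argument, so sigma0^n = id.

def sigma0(a, power=1):
    """Apply the shift automorphism `power` times to the function a."""
    n = len(a)
    return [a[(x + power) % n] for x in range(n)]

def pi(a, psi):
    """(pi(a) psi)_p = sigma0^p(a) psi_p, with psi a list of n vectors in C^n."""
    n = len(a)
    return [[s * v for s, v in zip(sigma0(a, p), psi[p])] for p in range(n)]

def f1(psi):
    """Field operator with U_I = 1: (f_1 psi)_p = psi_{p-1}, indices mod n."""
    n = len(psi)
    return [psi[(p - 1) % n] for p in range(n)]

def f1_adj(psi):
    """Adjoint of f_1: (f_1^* psi)_p = psi_{p+1}, indices mod n."""
    n = len(psi)
    return [psi[(p + 1) % n] for p in range(n)]

if __name__ == "__main__":
    a = [2, 3, 5]
    psi = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    # Both sides of Eq. (22) applied to psi:
    print(f1_adj(pi(a, f1(psi))))
    print(pi(sigma0(a), psi))
```

In this toy model $f_1^n = 1$ holds trivially, which mirrors the crossed product structure $A(I_\circ) \rtimes_{\sigma_0} \mathbb{Z}_n$ for the choice $U_{I_\circ} = 1$.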

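We use the following notation: For $\lambda_0 \in \mathrm{End}(\mathcal{A})$ we define $\lambda \in \mathrm{End}(\mathcal{N})$ by $\lambda(\pi(a)) = \pi(\lambda_0(a))$, $a \in \mathcal{A}$. One checks easily that $\lambda \in \Delta_{\mathcal{N}}(I_\circ)$ if $\lambda_0 \in \Delta_{\mathcal{A}}(I_\circ)$. Then Eq. (22) reads $f_{U_I}^* x f_{U_I} = \sigma_I(x)$ for $x \in \mathcal{N}$, and in particular $f_{U_I} \in N(I')'$, i.e. fields are relatively local to observables. Now we define the extended net $\mathcal{M}$ in terms of local algebras $M(I)$ generated by $N(I)$ and $f_{U_I}$,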

Modular Invariants, Graphs and α-Induction for Nets of Subfactors. II


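\[
M(I) = \langle N(I), f_{U_I} \rangle, \qquad I \in \mathcal{J}_z .
\]
Note that we have $f_{U_I} = f_1 \pi(U_I^*)$ since $(f_{U_I}\Psi)_p = \sigma_0^{p-1}(U_I^*)\Psi_{p-1} = (\pi(U_I^*)\Psi)_{p-1} = (f_1 \pi(U_I^*)\Psi)_p$. Therefore the definition of $M(I)$ is independent of the special choice of $U_I$: if $\mathrm{Ad}(\hat U_I) \circ \sigma_0$ is also localized in $I$ then $U_I \hat U_I^* \in A(I)$ by Haag duality, and hence $f_{U_I}$ and $f_{\hat U_I}$ differ only by an element of $N(I)$. Note that our construction is such that (taking $U_{I_\circ} = 1$) we have $M(I_\circ) \cong A(I_\circ) \rtimes_{\sigma_0} \mathbb{Z}_n$. We want to show that the analogous statement holds for any $I \in \mathcal{J}_z$.

Lemma 3.2. For any $I \in \mathcal{J}_z$ there is a unitary $W \in A(I)$ such that $\tilde\sigma_{0;I} = \mathrm{Ad}(W^*) \circ \sigma_{0;I} \in \Delta_{\mathcal{A}}(I)$ fulfills $\tilde\sigma_{0;I}^n = \mathrm{id}$.

Proof. Since the irreducible fixed point $\rho_0$ is transportable there is a unitary $U_{\rho_0;I_\circ,I}$ such that $\rho_{0;I} = \mathrm{Ad}(U_{\rho_0;I_\circ,I}) \circ \rho_0 \in \Delta_{\mathcal{A}}(I)$, and hence
\[
\begin{aligned}
\sigma_{0;I} \circ \rho_{0;I} &= \mathrm{Ad}(U_I) \circ \sigma_0 \circ \mathrm{Ad}(U_{\rho_0;I_\circ,I}) \circ \rho_0 \\
&= \mathrm{Ad}(U_I \sigma_0(U_{\rho_0;I_\circ,I})) \circ \sigma_0 \circ \rho_0 = \mathrm{Ad}(U_I \sigma_0(U_{\rho_0;I_\circ,I})) \circ \rho_0 \\
&= \mathrm{Ad}(U_I \sigma_0(U_{\rho_0;I_\circ,I}) U_{\rho_0;I_\circ,I}^*) \circ \rho_{0;I} .
\end{aligned}
\]
Now $W = U_I \sigma_0(U_{\rho_0;I_\circ,I}) U_{\rho_0;I_\circ,I}^* \in A(I)$ by Haag duality, since it intertwines two endomorphisms localized in $I$; hence $\rho_{0;I}$ is an irreducible fixed point for $\sigma_{0;I}$. Then, by the same argument as in Lemma 3.1, we find that $\tilde\sigma_{0;I} = \mathrm{Ad}(W^*) \circ \sigma_{0;I}$ fulfills $\tilde\sigma_{0;I}^n = \mathrm{id}$.

Now we take this unitary $W$ such that $\tilde\sigma_{0;I} = \mathrm{Ad}(W^*) \circ \sigma_{0;I}$ and $\tilde\sigma_{0;I}^n = \mathrm{id}$, and then we define $\tilde U_I = W^* U_I$ and set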

\[
f_{\tilde U_I} = f_{U_I} \pi(W) \in M(I).
\]
Then it follows from Eq. (22) that
\[
f_{\tilde U_I}^* \pi(a) f_{\tilde U_I} = \pi \circ \tilde\sigma_{0;I}(a), \qquad a \in \mathcal{A}. \tag{23}
\]

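Lemma 3.3. With a suitable choice of the phase of $W$ we have $f_{\tilde U_I}^n = 1$.

Proof. Since $\tilde\sigma_{0;I} = \mathrm{Ad}(\tilde U_I) \circ \sigma_0$ we have $\sigma_0 = \mathrm{Ad}(\tilde U_I^*) \circ \tilde\sigma_{0;I}$. Choose $J \in \mathcal{J}_z$ such that $J \supset I \cup I_\circ$. Then for any $a \in A(J)$ we find
\[
a = \sigma_0^n(a) = (\mathrm{Ad}(\tilde U_I^*) \circ \tilde\sigma_{0;I})^n(a) = \mathrm{Ad}(X) \circ \tilde\sigma_{0;I}^n(a) = X a X^* ,
\]
where $X = \tilde U_I^*\, \tilde\sigma_{0;I}(\tilde U_I^*)\, \tilde\sigma_{0;I}^2(\tilde U_I^*) \cdots \tilde\sigma_{0;I}^{n-1}(\tilde U_I^*)$, and therefore $X \in A(J)' \cap A(J) = \mathbb{C}1$. If $X = \xi 1$, $\xi \in \mathbb{C}$, $|\xi| = 1$, then we can replace $W$ by $\xi^{-1/n} W$, i.e. $\tilde U_I$ by $\xi^{1/n} \tilde U_I$, to achieve $X = 1$. Now we compute, using $f_{\tilde U_I} = f_1 \pi(\tilde U_I^*)$,
\[
\begin{aligned}
f_{\tilde U_I}^n &= f_1 \pi(\tilde U_I^*) f_{\tilde U_I}^{n-1} \\
&= f_1 f_{\tilde U_I}^{n-1} \pi(\tilde\sigma_{0;I}^{n-1}(\tilde U_I^*)) \\
&= f_1 f_1 \pi(\tilde U_I^*) f_{\tilde U_I}^{n-2} \pi(\tilde\sigma_{0;I}^{n-1}(\tilde U_I^*)) \\
&= f_1^2 f_{\tilde U_I}^{n-2} \pi(\tilde\sigma_{0;I}^{n-2}(\tilde U_I^*)\, \tilde\sigma_{0;I}^{n-1}(\tilde U_I^*)) \\
&= \ldots = f_1^n \pi(X) = 1,
\end{aligned}
\]
where we used Eq. (23).

Equation (23) holds in particular for $a \in A(I)$; moreover, $\tilde\sigma_{0;I}^p$ is outer for $p \neq 0 \pmod n$, and $f_{\tilde U_I}^n = 1$. By the uniqueness of the crossed product we find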

Corollary 3.4. We have $M(I) \cong A(I) \rtimes_{\tilde\sigma_{0;I}} \mathbb{Z}_n$ for any $I \in \mathcal{J}_z$. In particular, each $M(I)$ is a factor.

Let $\Omega_0 \in H_0$ denote the vacuum vector. Then $\Omega_0$ is cyclic and separating for each local algebra $A(I)$. Let $\Omega \in H$ denote the vector given by $\Omega_p = \delta_{p,0}\Omega_0$. It is clear from our construction that $\Omega$ is cyclic and separating for each $M(I)$, that is, our net $\mathcal{N} \subset \mathcal{M}$ is standard. Fixing $U_I$ for any $I \in \mathcal{J}_z$ it is clear that each $m \in M(I)$ can be uniquely written as
\[
m = \sum_{p=0}^{n-1} x_p f_{\tilde U_I}^p , \qquad x_p \in N(I).
\]
Then the map
\[
E^{M(I)}_{N(I)} : M(I) \to N(I), \qquad m \mapsto E^{M(I)}_{N(I)}(m) = x_0 ,
\]
is a faithful normal conditional expectation. It also satisfies $E^{M(J)}_{N(J)}|_{M(I)} = E^{M(I)}_{N(I)}$ for $I \subset J$ and preserves the vector state $\omega = \langle \Omega, \cdot\, \Omega \rangle$. We summarize the discussion in the following

Proposition 3.5. The net $\mathcal{N} \subset \mathcal{M}$ is a standard net of subfactors with a standard conditional expectation.

Note that $\mathcal{N} \subset \mathcal{M}$ is even a quantum field theoretical net of subfactors by relative locality $M(I) \subset N(I')'$. However, $\mathcal{M}$ will in general not be local itself. The requirement of locality of $\mathcal{M}$ imposes restrictions on our simple current $\sigma_0$. For $\lambda_0, \mu_0 \in \Delta_{\mathcal{A}}(I_\circ)$ we denote by $\lambda$ and $\mu$ the corresponding endomorphisms in $\Delta_{\mathcal{N}}(I_\circ)$. For the statistics operators we use the notation $\varepsilon^\pm(\lambda, \mu) = \pi(\epsilon^\pm(\lambda_0, \mu_0))$ (and $\varepsilon(\lambda, \mu) \equiv \varepsilon^+(\lambda, \mu)$) as in the previous paper [4]. Recall that for disjoint intervals $I_1, I_2 \in \mathcal{J}_z$ we write $I_2 > I_1$ (respectively $I_2 < I_1$) if $I_1$ lies clockwise (respectively counter-clockwise) to $I_2$ relative to the point "at infinity" $z$.

Lemma 3.6. For $I_1 \cap I_2 = \emptyset$ we have $f_{U_{I_2}} f_{U_{I_1}} = \varepsilon^\pm(\sigma, \sigma) f_{U_{I_1}} f_{U_{I_2}}$ with the $+$ sign if $I_2 > I_1$ and the $-$ sign if $I_2 < I_1$.


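Proof. We compute
\[
\begin{aligned}
f_{U_{I_2}} f_{U_{I_1}} &= f_1 \pi(U_{I_2}^*) f_1 \pi(U_{I_1}^*) \\
&= \pi\big(\sigma_0^{-1}(U_{I_2}^*)\, \sigma_0^{-2}(U_{I_1}^*)\big) f_1^2 \\
&= \pi\big(\sigma_0^{-1}(U_{I_2}^*)\, \sigma_0^{-2}(U_{I_1}^*)\, \sigma_0^{-2}(U_{I_2})\, \sigma_0^{-1}(U_{I_1})\big) f_1 \pi(U_{I_1}^*) f_1 \pi(U_{I_2}^*) \\
&= \sigma^{-2} \circ \pi\big(\sigma_0(U_{I_2}^*)\, U_{I_1}^*\, U_{I_2}\, \sigma_0(U_{I_1})\big) f_{U_{I_1}} f_{U_{I_2}} \\
&= \sigma^{-2} \circ \pi(\epsilon^\pm(\sigma_0, \sigma_0)) f_{U_{I_1}} f_{U_{I_2}} = \sigma^{-2}(\varepsilon^\pm(\sigma, \sigma)) f_{U_{I_1}} f_{U_{I_2}} ,
\end{aligned}
\]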

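where we recognized the definition of the statistics operator in Subsect. 2.3 of [4], and $\varepsilon^\pm(\sigma, \sigma)$ are just scalars since $\epsilon^\pm(\sigma_0, \sigma_0) \in \sigma_0^2(A(I_\circ))' \cap A(I_\circ)$, hence we can omit the symbol $\sigma^{-2}$.

This leads us immediately to the following

Corollary 3.7. The net $\mathcal{M}$ is local if and only if $\epsilon(\sigma_0, \sigma_0) = 1$.

In Subsect. 3.3 we will use Corollary 3.7 to analyze for which levels we have a local extended net if we take for $\sigma_0$ the simple current corresponding to the weight $k\Lambda_{(1)}$ of the $LSU(n)$ level $k$ theory. For completeness we also add the following

Proposition 3.8. If the net $\mathcal{M}$ is local then it is in fact Haag dual.

Proof. Let $I \in \mathcal{J}_z$ be arbitrary. We have to show that $\mathcal{C}_{\mathcal{M}}(I')' = M(I)$. As $\mathcal{C}_{\mathcal{M}}(I')' \supset M(I)$ follows from locality we only have to show the reverse inclusion. Thus assume $X \in \mathcal{C}_{\mathcal{M}}(I')'$; we have to show that $X \in M(I)$. Choose an interval $J \in \mathcal{J}_z$ such that $I_\circ \cup I \subset J$. Then in particular $X \in \mathcal{C}_{\mathcal{M}}(J')'$ and therefore $X\pi(a) = \pi(a)X$ for all $a \in \mathcal{C}_{\mathcal{A}}(J')$. In matrix components (corresponding to the decomposition of $H$ into $n$ copies of $H_0$) this reads
\[
X_{p,q}\, \sigma_0^q(a) = \sigma_0^p(a)\, X_{p,q}, \qquad p, q \in \mathbb{Z}_n ,
\]
but $\sigma_0$ acts trivially on $\mathcal{C}_{\mathcal{A}}(J')$ as $I_\circ \subset J$. Hence $X_{p,q} \in \mathcal{C}_{\mathcal{A}}(J')' = A(J)$ by Haag duality of $\mathcal{A}$. Now choose $K \in \mathcal{J}_z$ such that $K \subset J'$. Then we have in particular $X f_{U_K} = f_{U_K} X$. From this we obtain for the matrix components
\[
X_{p,q+1}\, \sigma_0^q(U_K^*) = \sigma_0^{p-1}(U_K^*)\, X_{p-1,q}, \qquad p, q \in \mathbb{Z}_n ,
\]
and hence
\[
\begin{aligned}
X_{p+1,q+1} &= \sigma_0^p(U_K^*)\, X_{p,q}\, \sigma_0^q(U_K) \\
&= \sigma_0^p\big(U_K^*\, \sigma_0^{-p}(X_{p,q})\, \sigma_0^{q-p}(U_K)\big) \\
&= \sigma_0^p\big(U_K^* \cdot \sigma_{0;K} \circ \sigma_0^{-p}(X_{p,q}) \cdot \sigma_0^{q-p}(U_K)\big) \\
&= \sigma_0^p\big(\sigma_0^{1-p}(X_{p,q})\, U_K^*\, \sigma_0^{q-p}(U_K)\big) \\
&= \sigma_0(X_{p,q})\, \sigma_0^p(\epsilon(\sigma_0^{q-p}, \sigma_0)) = \sigma_0(X_{p,q}),
\end{aligned}
\]
where we used that $\sigma_0^{-p}(X_{p,q}) \in A(J)$ since $I_\circ \subset J$ and $\sigma_{0;K}$ acts trivially on $A(J)$ since $K \subset J'$, and also that
\[
\epsilon(\sigma_0^{q-p}, \sigma_0) = \epsilon(\sigma_0, \sigma_0)\, \sigma_0(\epsilon(\sigma_0, \sigma_0)) \cdots \sigma_0^{q-p-1}(\epsilon(\sigma_0, \sigma_0)) = 1,
\]
since $\epsilon(\sigma_0, \sigma_0) = 1$ as $\mathcal{M}$ is local. We conclude that $X_{p+k,q+k} = \sigma_0^k(X_{p,q})$, $p, q, k \in \mathbb{Z}_n$, and by setting $\tilde a_p = X_{0,p} \in A(J)$ this means that $X$ can be written as $X = \sum_{p \in \mathbb{Z}_n} \pi(\tilde a_p) f_1^p$, i.e. $X \in M(J)$, but then we can alternatively write $X = \sum_{p \in \mathbb{Z}_n} \pi(a_p) f_{U_I}^p$ with $a_p \in A(J)$ since also $I \subset J$. Because we assumed that $X \in \mathcal{C}_{\mathcal{M}}(I')'$ we must have in particular that $X\pi(b) = \pi(b)X$ whenever $b \in \mathcal{C}_{\mathcal{A}}(I')$. Now
\[
X\pi(b) = \sum_{p \in \mathbb{Z}_n} \pi(a_p) f_{U_I}^p \pi(b) = \sum_{p \in \mathbb{Z}_n} \pi(a_p) \pi(b) f_{U_I}^p
\]
by relative locality of fields and observables, and
\[
\pi(b)X = \sum_{p \in \mathbb{Z}_n} \pi(b) \pi(a_p) f_{U_I}^p ,
\]
hence
\[
\sum_{p \in \mathbb{Z}_n} \pi(a_p) \pi(b) f_{U_I}^p = \sum_{p \in \mathbb{Z}_n} \pi(b) \pi(a_p) f_{U_I}^p .
\]
Multiplication by $f_{U_I}^{-q}$ from the right and application of the conditional expectation yields $\pi(a_q)\pi(b) = \pi(b)\pi(a_q)$ for all $b \in \mathcal{C}_{\mathcal{A}}(I')$, $q \in \mathbb{Z}_n$. It follows that $a_q \in \mathcal{C}_{\mathcal{A}}(I')' = A(I)$, $q \in \mathbb{Z}_n$, and therefore $X = \sum_{p \in \mathbb{Z}_n} \pi(a_p) f_{U_I}^p \in M(I)$.

3.2. Endomorphisms of the extended net. We have lifted endomorphisms $\lambda_0$ of $\mathcal{A}$ to endomorphisms $\lambda$ of $\mathcal{N}$ by $\lambda \circ \pi = \pi \circ \lambda_0$. Next we consider the α-induced endomorphisms $\alpha_\lambda \in \mathrm{End}(\mathcal{M})$. In the following we assume that $\mathcal{M}$ is Haag dual, i.e. that $\epsilon(\sigma_0, \sigma_0) = 1$. For notation we refer again to our previous paper [4] and to Subsect. 1.3.

Lemma 3.9. For $\lambda_0 \in \Delta_{\mathcal{A}}(I_\circ)$ we have $\alpha_\lambda(f_1) = f_1 \varepsilon(\lambda, \sigma)$.

Proof. By applying $\gamma$ to $f_1^* x f_1 = \sigma(x)$, $x \in \mathcal{N}$, we find $\gamma(f_1) \in \mathrm{Hom}_{N(I_\circ)}(\theta \circ \sigma, \theta)$. Hence by the braiding fusion equation (BFE), Eq. (22) of [4], we obtain $\gamma(f_1)\, \theta(\varepsilon(\lambda, \sigma))\, \varepsilon(\lambda, \theta) = \varepsilon(\lambda, \theta) \cdot \lambda \circ \gamma(f_1)$, and therefore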

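\[
\alpha_\lambda(f_1) \equiv \gamma^{-1} \circ \mathrm{Ad}(\varepsilon(\lambda, \theta)) \circ \lambda \circ \gamma(f_1) = f_1 \varepsilon(\lambda, \sigma),
\]
proving the lemma.

Now we ask when $\alpha_\lambda$ is localized. For the sake of simplicity, we restrict the discussion to irreducible $\lambda_0$. Define the monodromy by $Y(\lambda_0, \sigma_0) = \epsilon(\lambda_0, \sigma_0)\, \epsilon(\sigma_0, \lambda_0)$. Note that for irreducible $\lambda_0$ the monodromy is a scalar as $Y(\lambda_0, \sigma_0) \in \sigma_0 \circ \lambda_0(A(I_\circ))' \cap A(I_\circ) = \mathbb{C}1$, i.e. $Y(\lambda_0, \sigma_0) = \omega 1$, $\omega \in \mathbb{C}$. Therefore we have $\epsilon(\lambda_0, \sigma_0) = Y(\lambda_0, \sigma_0)\, \epsilon(\sigma_0, \lambda_0)^* = \omega\, \epsilon^-(\lambda_0, \sigma_0)$.

Lemma 3.10. For $\lambda_0 \in \Delta_{\mathcal{A}}(I_\circ)$ irreducible, $\alpha_\lambda$ is localized in $I_\circ$ if and only if the monodromy $Y(\lambda_0, \sigma_0)$ is trivial, i.e. $\omega = 1$.

Proof. It is clear that $\alpha_\lambda(x) \equiv \lambda(x) = x$ for any $x \in N(I)$ with $I \cap I_\circ = \emptyset$ since $\lambda_0 \in \Delta_{\mathcal{A}}(I_\circ)$. Thus we have to check whether $\alpha_\lambda(f_{U_I}) = f_{U_I}$ whenever $I \cap I_\circ = \emptyset$. By definition
\[
\alpha_\lambda(f_{U_I}) = \alpha_\lambda(f_1)\, \lambda(\pi(U_I^*)) = f_1 \pi\big(\epsilon(\lambda_0, \sigma_0)\, \lambda_0(U_I^*)\big) = f_{U_I} \pi\big(U_I\, \epsilon(\lambda_0, \sigma_0)\, \lambda_0(U_I^*)\big).
\]
For $I \in \mathcal{J}_z$ such that $I \cap I_\circ = \emptyset$ we distinguish two cases.

Case 1. $I > I_\circ$. We can choose an interval $I_+ \in \mathcal{J}_z$ such that $I_+ > I_\circ$ and $I_+ > I$. Since $I > I_\circ$ we can choose some $J_+ \in \mathcal{J}_z$ such that $J_+ \supset I \cup I_+$ but $J_+ \cap I_\circ = \emptyset$. For any unitary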


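$U_{\sigma_0,+}$ such that $\sigma_{0,+} = \mathrm{Ad}(U_{\sigma_0,+}) \circ \sigma_0 \in \Delta_{\mathcal{A}}(I_+)$ the statistics operator can be written as $\epsilon(\lambda_0, \sigma_0) = U_{\sigma_0,+}^* \lambda_0(U_{\sigma_0,+})$. Since $\sigma_{0;I} = \mathrm{Ad}(U_I) \circ \sigma_0$ we have $\sigma_{0,+} = \mathrm{Ad}(V_+) \circ \sigma_{0;I}$ with $V_+ = U_{\sigma_0,+} U_I^*$, and hence $V_+ \in A(J_+)$ by Haag duality. Then $U_I\, \epsilon(\lambda_0, \sigma_0)\, \lambda_0(U_I^*) = V_+^* \lambda_0(V_+) = V_+^* V_+ = 1$ since $J_+ \cap I_\circ = \emptyset$. Hence $\alpha_\lambda(f_{U_I}) = f_{U_I}$ for $I > I_\circ$.

Case 2. $I < I_\circ$. Recall $\epsilon(\lambda_0, \sigma_0) = \omega\, \epsilon^-(\lambda_0, \sigma_0)$, hence $U_I\, \epsilon(\lambda_0, \sigma_0)\, \lambda_0(U_I^*) = \omega\, U_I\, \epsilon^-(\lambda_0, \sigma_0)\, \lambda_0(U_I^*)$. We can choose an interval $I_- \in \mathcal{J}_z$ such that $I_- < I_\circ$ and $I_- < I$. Since $I < I_\circ$ we can choose some $J_- \in \mathcal{J}_z$ such that $J_- \supset I \cup I_-$ but $J_- \cap I_\circ = \emptyset$. For any unitary $U_{\sigma_0,-}$ such that $\sigma_{0,-} = \mathrm{Ad}(U_{\sigma_0,-}) \circ \sigma_0 \in \Delta_{\mathcal{A}}(I_-)$ the statistics operator can be written as $\epsilon^-(\lambda_0, \sigma_0) = U_{\sigma_0,-}^* \lambda_0(U_{\sigma_0,-})$. Then $\sigma_{0,-} = \mathrm{Ad}(V_-) \circ \sigma_{0;I}$ with $V_- = U_{\sigma_0,-} U_I^* \in A(J_-)$, and $U_I\, \epsilon^-(\lambda_0, \sigma_0)\, \lambda_0(U_I^*) = V_-^* \lambda_0(V_-) = V_-^* V_- = 1$, and hence $\alpha_\lambda(f_{U_I}) = \omega f_{U_I}$ for $I < I_\circ$. The statement follows.

The next step is the transportability. Note that $\Delta^{(0)}_{\mathcal{M}}(I_\circ) = \Delta_{\mathcal{M}}(I_\circ)$ since $\mathcal{M}$ is Haag dual. We have the following (cf. Prop. 5.2 in [49])

Lemma 3.11. For $\lambda_0 \in \Delta_{\mathcal{A}}(I_\circ)$ irreducible we have $\alpha_\lambda \in \Delta_{\mathcal{M}}(I_\circ)$ if and only if the monodromy $Y(\lambda_0, \sigma_0)$ is trivial, i.e. $\omega = 1$.

Proof. After Lemma 3.10 all we have to show is that $\alpha_\lambda$ is transportable if $\omega = 1$. Since $\lambda_0 \in \Delta_{\mathcal{A}}(I_\circ)$ there is for any $J \in \mathcal{J}_z$ a unitary $U \equiv U_{\lambda_0;I_\circ,J}$ such that $\tilde\lambda_0 = \mathrm{Ad}(U) \circ \lambda_0 \in \Delta_{\mathcal{A}}(J)$. Define $\tilde\alpha_\lambda = \mathrm{Ad}(\pi(U)) \circ \alpha_\lambda$. It is clear that $\tilde\alpha_\lambda(x) = x$ whenever $x \in N(I)$ with $I \cap J = \emptyset$. We show that also $\tilde\alpha_\lambda(f_{U_I}) = f_{U_I}$ in that case. We again distinguish two cases.

Case 1. $I > J$. We choose $I_+ \in \mathcal{J}_z$ such that $I_+ > I_\circ$ and $I_+ > J$. Since $I > J$ there is a $K_+ \in \mathcal{J}_z$ such that $K_+ \supset I_+ \cup I$ but $K_+ \cap J = \emptyset$. As before, we choose a unitary $U_{\sigma_0,+}$ such that $\sigma_{0,+} = \mathrm{Ad}(U_{\sigma_0,+}) \circ \sigma_0 \in \Delta_{\mathcal{A}}(I_+)$. Then $\sigma_{0,+} = \mathrm{Ad}(V_+) \circ \sigma_{0;I}$ and therefore $V_+ = U_{\sigma_0,+} U_I^* \in A(K_+)$, hence $\tilde\lambda_0(V_+) = V_+$. Since $I_+ > I_\circ$ and $I_+ > J$ we also have $\sigma_{0,+}(U) = U$.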
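Now we compute
\[
\begin{aligned}
\tilde\alpha_\lambda(f_{U_I}) &= \mathrm{Ad}(\pi(U)) \circ \alpha_\lambda(f_{U_I}) = \pi(U) f_{U_I} \pi\big(U_I\, \epsilon(\lambda_0, \sigma_0)\, \lambda_0(U_I^*)\, U^*\big) \\
&= f_{U_I} \pi\big(\sigma_{0;I}(U)\, U_I\, \epsilon(\lambda_0, \sigma_0)\, \lambda_0(U_I^*)\, U^*\big) \\
&= f_{U_I} \pi\big(U_I\, \sigma_0(U)\, \epsilon(\lambda_0, \sigma_0)\, U^*\, \tilde\lambda_0(U_I^*)\big) \\
&= f_{U_I} \pi\big(U_I\, \sigma_0(U)\, U_{\sigma_0,+}^*\, \lambda_0(U_{\sigma_0,+})\, U^*\, \tilde\lambda_0(U_I^*)\big) \\
&= f_{U_I} \pi\big(U_I\, U_{\sigma_0,+}^*\, \sigma_{0,+}(U)\, U^*\, \tilde\lambda_0(U_{\sigma_0,+})\, \tilde\lambda_0(U_I^*)\big) \\
&= f_{U_I} \pi\big(V_+^* \tilde\lambda_0(V_+)\big) = f_{U_I} .
\end{aligned}
\]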


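Case 2. $I < J$. We choose $I_- \in \mathcal{J}_z$ such that $I_- < I_\circ$ and $I_- < J$. Since $I < J$ there is a $K_- \in \mathcal{J}_z$ such that $K_- \supset I_- \cup I$ but $K_- \cap J = \emptyset$. Let $\sigma_{0,-} = \mathrm{Ad}(U_{\sigma_0,-}) \circ \sigma_0 \in \Delta_{\mathcal{A}}(I_-)$. Then $V_- = U_{\sigma_0,-} U_I^* \in A(K_-)$, hence $\tilde\lambda_0(V_-) = V_-$; since $I_- < I_\circ$ and $I_- < J$ we also have $\sigma_{0,-}(U) = U$. If $\omega = 1$ then $\epsilon(\lambda_0, \sigma_0) = \epsilon^-(\lambda_0, \sigma_0)$, so we can compute analogously
\[
\begin{aligned}
\tilde\alpha_\lambda(f_{U_I}) &= f_{U_I} \pi\big(U_I\, \sigma_0(U)\, \epsilon^-(\lambda_0, \sigma_0)\, U^*\, \tilde\lambda_0(U_I^*)\big) \\
&= f_{U_I} \pi\big(U_I\, \sigma_0(U)\, U_{\sigma_0,-}^*\, \lambda_0(U_{\sigma_0,-})\, U^*\, \tilde\lambda_0(U_I^*)\big) \\
&= f_{U_I} \pi\big(U_I\, U_{\sigma_0,-}^*\, \sigma_{0,-}(U)\, U^*\, \tilde\lambda_0(U_{\sigma_0,-})\, \tilde\lambda_0(U_I^*)\big) \\
&= f_{U_I} \pi\big(V_-^* \tilde\lambda_0(V_-)\big) = f_{U_I} .
\end{aligned}
\]
We have shown that $\tilde\alpha_\lambda$ is localized in $J$. Since $J \in \mathcal{J}_z$ was arbitrary it follows that $\alpha_\lambda \in \Delta_{\mathcal{M}}(I_\circ)$.

Our construction of the net $\mathcal{M}$ is such that (Proposition 2.10 and the discussion in Subsect. 2.4 in [4])
\[
[\theta] = \bigoplus_{p=0}^{n-1} [\sigma^p]. \tag{24}
\]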

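Lemma 3.12. For any $\lambda_0 \in \Delta_{\mathcal{A}}(I_\circ)$ we have
\[
\mathrm{Hom}_{M(I_\circ)}(\alpha_\lambda, \alpha_\lambda) = \Big\{ t = \sum_{p=0}^{n-1} \pi(T_p) f_1^p , \ \ T_p \in \mathrm{Hom}_{A(I_\circ)}(\sigma_0^{-p} \circ \lambda_0, \lambda_0) \Big\}. \tag{25}
\]

Proof. Suppose $t \in \mathrm{Hom}_{M(I_\circ)}(\alpha_\lambda, \alpha_\lambda)$. We can write $t = \sum_{p=0}^{n-1} \pi(T_p) f_1^p$ with $T_p \in A(I_\circ)$. Now from $t \cdot \alpha_\lambda \circ \pi(a) = \alpha_\lambda \circ \pi(a) \cdot t$ for all $a \in A(I_\circ)$ we obtain
\[
\sum_{p=0}^{n-1} \pi(T_p) f_1^p \cdot \pi \circ \lambda_0(a) \equiv \sum_{p=0}^{n-1} \pi\big(T_p \cdot \sigma_0^{-p} \circ \lambda_0(a)\big) f_1^p = \sum_{p=0}^{n-1} \pi(\lambda_0(a) T_p) f_1^p .
\]
It follows that $T_p \in \mathrm{Hom}_{A(I_\circ)}(\sigma_0^{-p} \circ \lambda_0, \lambda_0)$. It remains to be shown that then $t\alpha_\lambda(f_1) = \alpha_\lambda(f_1)t$. From the BFE, Eq. (11) in [4], we obtain
\[
\sigma_0(T_p)\, \epsilon(\sigma_0^{-p}, \sigma_0)\, \sigma_0^{-p}(\epsilon(\lambda_0, \sigma_0)) = \epsilon(\lambda_0, \sigma_0)\, T_p .
\]
But $\epsilon(\sigma_0^{-p}, \sigma_0) \equiv \epsilon(\sigma_0^{n-p}, \sigma_0) = \epsilon(\sigma_0, \sigma_0)\, \sigma_0(\epsilon(\sigma_0, \sigma_0)) \cdots \sigma_0^{n-p-1}(\epsilon(\sigma_0, \sigma_0)) = 1$ as $\epsilon(\sigma_0, \sigma_0) = 1$. Hence we find
\[
\sigma_0(T_p)\, \sigma_0^{-p}(\epsilon(\lambda_0, \sigma_0)) = \epsilon(\lambda_0, \sigma_0)\, T_p .
\]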


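Now we compute
\[
\begin{aligned}
t\alpha_\lambda(f_1) &= \sum_{p=0}^{n-1} \pi(T_p) f_1^{p+1} \varepsilon(\lambda, \sigma) \\
&= \sum_{p=0}^{n-1} f_1 \cdot \pi \circ \sigma_0(T_p) \cdot f_1^p\, \varepsilon(\lambda, \sigma) \\
&= \sum_{p=0}^{n-1} f_1 \cdot \pi \circ \sigma_0(T_p) \cdot \sigma^{-p}(\varepsilon(\lambda, \sigma))\, f_1^p \\
&= \sum_{p=0}^{n-1} f_1 \pi\big(\sigma_0(T_p)\, \sigma_0^{-p}(\epsilon(\lambda_0, \sigma_0))\big) f_1^p \\
&= \sum_{p=0}^{n-1} f_1 \pi\big(\epsilon(\lambda_0, \sigma_0)\, T_p\big) f_1^p \\
&= \sum_{p=0}^{n-1} f_1\, \varepsilon(\lambda, \sigma)\, \pi(T_p) f_1^p = \alpha_\lambda(f_1)\, t,
\end{aligned}
\]
thus we have indeed Eq. (25).

For the fixed point $\rho_0$ of $\sigma_0$ we have $\mathrm{Hom}_{A(I_\circ)}(\sigma_0 \circ \rho_0, \rho_0) = \mathbb{C}1$. By Lemma 3.12 it is not hard to see that then $\mathrm{Hom}_{M(I_\circ)}(\alpha_\rho, \alpha_\rho)$ is an $n$-dimensional commutative algebra, i.e. $\mathrm{Hom}_{M(I_\circ)}(\alpha_\rho, \alpha_\rho) \cong \mathbb{C} \oplus \mathbb{C} \oplus \ldots \oplus \mathbb{C}$, and therefore $[\alpha_\rho]$ decomposes into $n$ distinct irreducible sectors. Since $\sigma_{\alpha_\rho} = \gamma \circ \alpha_\rho|_{\mathcal{N}} = \theta \circ \rho$ and thus $[\sigma_{\alpha_\rho}] = [\theta \circ \rho] = n[\rho]$ we arrive at the following

Corollary 3.13. We have $[\alpha_\rho] = \bigoplus_{p=0}^{n-1} [\delta_p]$ with $[\delta_p]$ distinct and irreducible. Moreover, $[\sigma_{\delta_p}] = [\rho]$ for all $p = 0, 1, \ldots, n-1$.

3.3. Spin and statistics. We found that the extended net is local (and even Haag dual) if and only if $\epsilon(\sigma_0, \sigma_0) = 1$. In fact $\epsilon(\sigma_0, \sigma_0)$ can be computed by the spin and statistics connection. In the following the conformal dimensions $h_\Lambda$, which are by definition the lowest eigenvalues of the rotation generator $L_0$ in the positive energy representations $(\pi_\Lambda, H_\Lambda)$, $\Lambda \in \mathcal{A}^{(n+k)}$, will play an important role. They are given by
\[
h_\Lambda = \frac{(\Lambda | \Lambda + 2\rho)}{2(k+n)} ,
\]
where $\rho = \sum_{i=1}^{n-1} \Lambda_{(i)}$ and $(\cdot | \cdot)$ is the symmetric bilinear form. Recalling that $(\Lambda_{(i)} | \Lambda_{(j)}) = i(n-j)/n$ for $1 \le i \le j \le n-1$ one may obtain for $\Lambda = \sum_{i=1}^{n-1} m_i \Lambda_{(i)}$,
\[
h_\Lambda = \sum_{1 \le i \le j \le n-1} m_i m_j \frac{i(n-j)}{n(k+n)} - \sum_{i=1}^{n-1} m_i^2 \frac{i(n-i)}{2n(k+n)} + \sum_{i=1}^{n-1} m_i \frac{i(n-i)}{2(k+n)} , \tag{26}
\]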

where we used the Dynkin labelling, i.e. $m_i \in \mathbb{N}_0$ and $\sum_{i=1}^{n-1} m_i \le k$. Now let $\lambda_{0;\Lambda} \in \Delta_{\mathcal{A}}(I_\circ)$ denote the endomorphisms corresponding to the positive energy representations $(\pi_\Lambda, H_\Lambda)$, $\Lambda \in \mathcal{A}^{(n+k)}$. Then $\sigma_0 = \lambda_{0;k\Lambda_{(1)}}$ is a simple current of order $n$, and its fusion rules correspond to the $\mathbb{Z}_n$-rotation of $\mathcal{A}^{(n+k)}$. It has a fixed point if $k$ is a multiple of $n$, namely $\rho_0 = \lambda_{0;\Lambda_R}$, where $\Lambda_R = \frac{k}{n}\Lambda_{(1)} + \frac{k}{n}\Lambda_{(2)} + \ldots + \frac{k}{n}\Lambda_{(n-1)}$. Therefore we first require $k \in n\mathbb{N}$ so that we can construct the extended net $\mathcal{M}$ by means of $\sigma_0$ as explained in the previous subsections. Then we ask when $\mathcal{M}$ is local.

Proposition 3.14. The net $\mathcal{M}$ is local if and only if $k \in 2n\mathbb{N}$ if $n$ is even and $k \in n\mathbb{N}$ if $n$ is odd.


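Proof. By Corollary 3.7 the net $\mathcal{M}$ is local if and only if $\epsilon(\sigma_0, \sigma_0) = 1$. Since $\sigma_0$ is an automorphism we have $\epsilon(\sigma_0, \sigma_0) = \kappa_{\sigma_0} 1$, where $\kappa_{\sigma_0} \in \mathbb{C}$ is the statistical phase. By the conformal spin and statistics theorem [27] we have $\kappa_{\sigma_0} = e^{2\pi i h_{\sigma_0}}$, where $h_{\sigma_0}$ is the infimum of the spectrum of the rotation generator $L_0$ in the representation $\pi_0 \circ \sigma_0$. But this is the conformal dimension, $h_{\sigma_0} = h_{k\Lambda_{(1)}}$, and by Eq. (26),
\[
h_{k\Lambda_{(1)}} = k\, \frac{n-1}{2n} .
\]
Therefore $\epsilon(\sigma_0, \sigma_0) = 1$ if and only if $k(n-1)/2n \in \mathbb{N}$, and the statement follows.

The next step is to ask for which $\Lambda \in \mathcal{A}^{(n+k)}$ we have $\alpha_\Lambda \equiv \alpha_{\lambda_\Lambda} \in \Delta_{\mathcal{M}}(I_\circ)$. For $\Lambda = m_1\Lambda_{(1)} + m_2\Lambda_{(2)} + \cdots + m_{n-1}\Lambda_{(n-1)}$ we denote $|\Lambda| = \sum_{i=1}^{n-1} i\, m_i$. Recall that the $\mathbb{Z}_n$-rotation $\sigma$ on $\mathcal{A}^{(n+k)}$ is defined by
\[
\sigma(\Lambda) = (k - m_1 - \ldots - m_{n-1})\Lambda_{(1)} + m_1\Lambda_{(2)} + m_2\Lambda_{(3)} + \ldots + m_{n-2}\Lambda_{(n-1)} .
\]

Proposition 3.15. We have $\alpha_\Lambda \in \Delta_{\mathcal{M}}(I_\circ)$ if and only if $|\Lambda| \in n\mathbb{Z}$.

Proof. By Lemma 3.11 we have $\alpha_\Lambda \in \Delta_{\mathcal{M}}(I_\circ)$ if and only if $Y(\lambda_{0;\Lambda}, \sigma_0) = 1$. By Lemma 3.3 of [20] we have for any $T \in \mathrm{Hom}_{A(I_\circ)}(\lambda_{0;\sigma(\Lambda)}, \sigma_0 \circ \lambda_{0;\Lambda})$,
\[
Y(\lambda_{0;\Lambda}, \sigma_0)\, T = \frac{\kappa_{\lambda_{0;\sigma(\Lambda)}}}{\kappa_{\sigma_0}\, \kappa_{\lambda_{0;\Lambda}}}\, T,
\]
where the $\kappa$'s are statistical phases. Since $[\lambda_{0;\sigma(\Lambda)}] = [\sigma_0 \circ \lambda_{0;\Lambda}]$ and since $\lambda_{0;\Lambda}$ is irreducible we can take $T$ unitary, hence
\[
Y(\lambda_{0;\Lambda}, \sigma_0) = \frac{\kappa_{\lambda_{0;\sigma(\Lambda)}}}{\kappa_{\sigma_0}\, \kappa_{\lambda_{0;\Lambda}}}\, 1 .
\]
Using again the conformal spin and statistics theorem we find
\[
\frac{\kappa_{\lambda_{0;\sigma(\Lambda)}}}{\kappa_{\sigma_0}\, \kappa_{\lambda_{0;\Lambda}}} = e^{2\pi i (h_{\lambda_{0;\sigma(\Lambda)}} - h_{\sigma_0} - h_{\lambda_{0;\Lambda}})} \equiv e^{2\pi i (h_{\sigma(\Lambda)} - h_{k\Lambda_{(1)}} - h_\Lambda)} .
\]
Now by Lemma 2.7 of [35] we have
\[
h_{\sigma(\Lambda)} - h_\Lambda = \frac{1}{n}\left( \frac{(n-1)k}{2} - |\Lambda| \right),
\]
hence $h_{\sigma(\Lambda)} - h_{k\Lambda_{(1)}} - h_\Lambda = -|\Lambda|/n$. Therefore $Y(\lambda_{0;\Lambda}, \sigma_0) = 1$ if and only if $|\Lambda| \in n\mathbb{Z}$.

Remark. If we label the positive energy representations of $LSU(n)$ at level $k$ by partitions (or Young tableaux) $(p_1, p_2, \ldots, p_{n-1})$ with $p_i = \sum_{j=i}^{n-1} m_j$, then Proposition 3.15 reads $\alpha_{(p_1,\ldots,p_{n-1})} \in \Delta_{\mathcal{M}}(I_\circ)$ if and only if $\sum_{i=1}^{n-1} p_i \in n\mathbb{Z}$.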
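The arithmetic behind Propositions 3.14 and 3.15 can be reproduced with exact rational numbers. The following is a minimal sketch, assuming Eq. (26) in Dynkin labels and the rotation $\sigma$ as stated above; the function names (`conformal_dimension`, `rotate`, `is_local`, and so on) are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

def conformal_dimension(m, k):
    """h_Lambda from Eq. (26) for Dynkin labels m = (m_1, ..., m_{n-1}) at level k."""
    n = len(m) + 1
    h = Fraction(0)
    for i in range(1, n):
        for j in range(i, n):
            h += Fraction(m[i - 1] * m[j - 1] * i * (n - j), n * (k + n))
        h -= Fraction(m[i - 1] ** 2 * i * (n - i), 2 * n * (k + n))
        h += Fraction(m[i - 1] * i * (n - i), 2 * (k + n))
    return h

def rotate(m, k):
    """Z_n rotation: sigma(Lambda) = (k - sum m_i, m_1, ..., m_{n-2})."""
    return (k - sum(m),) + tuple(m[:-1])

def weight_size(m):
    """|Lambda| = sum_i i * m_i."""
    return sum(i * mi for i, mi in enumerate(m, start=1))

def simple_current_dimension(n, k):
    """h of the weight k * Lambda_(1)."""
    return conformal_dimension((k,) + (0,) * (n - 2), k)

def is_local(n, k):
    """Criterion used in Proposition 3.14: h_{k Lambda_(1)} is an integer."""
    return simple_current_dimension(n, k).denominator == 1

def alcove(n, k):
    """All Dynkin label tuples with m_1 + ... + m_{n-1} <= k."""
    return [m for m in product(range(k + 1), repeat=n - 1) if sum(m) <= k]

if __name__ == "__main__":
    for n, k in ((2, 4), (2, 6), (3, 3), (4, 4), (4, 8)):
        print(n, k, simple_current_dimension(n, k), is_local(n, k))
```

For each weight in `alcove(n, k)` one can compare `conformal_dimension(rotate(m, k), k) - simple_current_dimension(n, k) - conformal_dimension(m, k)` with `Fraction(-weight_size(m), n)`, which is the identity used in the proof of Proposition 3.15.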


By Proposition 3.15 it should be clear that for the orbifold modular invariants the sectors corresponding to the marked vertices are (the irreducible subsectors of) $\alpha_{(p_1,\ldots,p_{n-1})}$ with $\sum_{i=1}^{n-1} p_i \in n\mathbb{Z}$, as these α-induced endomorphisms are localized and transportable endomorphisms of the extended net $\mathcal{M}$. Moreover, as we will see by the treatment of the examples, their σ-restriction corresponds to the block structure of the corresponding orbifold modular invariants. The $SU(n)_k$ sectors that do not appear in the blocks of the modular invariant can be identified as "twisted sectors" if we consider the $SU(n)_k$ theory as the $\mathbb{Z}_n$ orbifold of the extended theory. In fact, α-induction of these sectors does not provide localized sectors; here we only obtain "solitonic" localization of the α-induced endomorphisms.

For $SU(2)$ the positive energy representations are labelled by the spin $j \equiv m_1 = 0, 1, 2, \ldots, k$. Then Eq. (26) reduces to
\[
h_j = \frac{j(j+2)}{4k+8} .
\]
First we find by Proposition 3.14 that we can construct the local extended net for $k = 4\varrho$, $\varrho \in \mathbb{N}$, since then $h_k = \varrho \in \mathbb{Z}$. The rotation $\sigma$ is now the flip $\sigma(j) = k - j$. Hence
\[
h_{\sigma(j)} - h_j = \frac{(k-j)(k-j+2)}{4k+8} - \frac{j(j+2)}{4k+8} = \frac{k-2j}{4} = \varrho - \frac{j}{2} ,
\]
i.e. $\alpha_j \in \Delta_{\mathcal{M}}(I_\circ)$ if and only if $j \in 2\mathbb{Z}$.

3.4. Examples. We now consider some examples for the application of α-induction to the extended net coming from an orbifold block-diagonal modular invariant. The simplest case is the $SU(2)$ $D_4$ modular invariant but we have already discussed this case as the extended net here coincides with the net associated to the $SU(3)_1$ theory. The $D_{2\varrho+2}$ modular invariants with $\varrho > 1$ do not come from conformal inclusions. They appear at level $k = 4\varrho$ and can be written as
\[
Z_{D_{2\varrho+2}} = \frac{1}{2} \sum_{\substack{k \ge j \ge 0 \\ j \in 2\mathbb{Z}}} |\chi_j + \chi_{k-j}|^2 .
\]
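To make the block structure of $Z_{D_{2\varrho+2}}$ concrete, the following sketch expands the formula above into a coefficient matrix $Z_{j,j'}$ (the coefficient of $\chi_j \bar\chi_{j'}$) and uses $h_j = j(j+2)/(4k+8)$ to check that coupled spins have integer difference of conformal dimensions. The names `d_even_mass_matrix` and `h` are hypothetical and serve only this illustration.

```python
from fractions import Fraction

def d_even_mass_matrix(rho):
    """Coefficients Z[j][j'] of chi_j * conj(chi_j') in Z_{D_{2 rho + 2}}, level k = 4 rho."""
    k = 4 * rho
    Z = [[Fraction(0)] * (k + 1) for _ in range(k + 1)]
    for j in range(0, k + 1, 2):           # sum over even spins 0 <= j <= k
        block = [0] * (k + 1)              # coefficient vector of chi_j + chi_{k-j}
        block[j] += 1
        block[k - j] += 1
        for a in range(k + 1):
            for b in range(k + 1):
                Z[a][b] += Fraction(block[a] * block[b], 2)
    return Z

def h(j, k):
    """Conformal dimension of the spin j representation of SU(2) at level k."""
    return Fraction(j * (j + 2), 4 * k + 8)

if __name__ == "__main__":
    rho = 2                                # the D_6 invariant at level 8
    for row in d_even_mass_matrix(rho):
        print([int(entry) for entry in row])
```

At level 8 the vacuum row couples $j = 0$ only to $j = 0$ and $j = 8$, which matches $[\theta] = [\lambda_0] \oplus [\lambda_8]$, and the entry for $j = j' = 4$ equals 2, reflecting the fixed point of the flip $j \mapsto k - j$.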

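Let us first illustrate this at the next case in the D-series, namely $D_6$. The $D_6$ invariant appears at level 8, thus we start with the fusion algebra $W(2, 8)$. The simple current is given by $[\lambda_8]$ and indeed $h_8 = 2$. Equation (24) now reads $[\theta] = [\lambda_0] \oplus [\lambda_8]$ and from this we get immediately that $[\alpha_4]$ decomposes into two irreducible sectors, say $[\alpha_4] = [\alpha_4^{(1)}] \oplus [\alpha_4^{(2)}]$, all other $[\alpha_j]$ are irreducible and $[\alpha_{8-j}] = [\alpha_j]$. The fusion rules involving $[\alpha_j]$, $j = 0, 1, 2, 3$, can be read off from those of $[\lambda_j]$ by the homomorphism property of α-induction, so one only has to find the fusion rules involving $[\alpha_4^{(i)}]$, $i = 1, 2$. One checks that the following fusion rules,
\[
\begin{aligned}
{}[\alpha_1] \times [\alpha_4^{(i)}] &= [\alpha_3], && i = 1, 2, \\
[\alpha_2] \times [\alpha_4^{(i)}] &= [\alpha_2] \oplus [\alpha_4^{(i+1)}], && i = 1, 2 \pmod 2, \\
[\alpha_3] \times [\alpha_4^{(i)}] &= [\alpha_1] \oplus [\alpha_3], && i = 1, 2, \\
[\alpha_4^{(i)}] \times [\alpha_4^{(i)}] &= [\alpha_0] \oplus [\alpha_4^{(i)}], && i = 1, 2, \\
[\alpha_4^{(1)}] \times [\alpha_4^{(2)}] &= [\alpha_2],
\end{aligned}
\]
determine a well-defined fusion algebra with unit $[\alpha_0]$. The fusion graph of $[\alpha_1]$ is easily seen to be $D_6$, see Fig. 14.

[Fig. 14. $D_6$; vertices $[\alpha_0]$, $[\alpha_1]$, $[\alpha_2]$, $[\alpha_3]$, $[\alpha_4^{(1)}]$, $[\alpha_4^{(2)}]$]

For arbitrary $\varrho = 1, 2, 3, \ldots$, $k = 4\varrho$, the fusion algebra can be characterized as follows. We have $2\varrho + 2$ irreducible sectors $[\alpha_j]$, $j = 0, 1, 2, \ldots, 2\varrho - 1$, and $[\alpha_{2\varrho}^{(1)}]$ and $[\alpha_{2\varrho}^{(2)}]$. The fusion rules are given from those in $W(2, 4\varrho)$, see Eq. (8), i.e.
\[
[\alpha_{j_1}] \times [\alpha_{j_2}] = \bigoplus_{\substack{j = |j_1 - j_2| \\ j + j_1 + j_2 \ \mathrm{even}}}^{\min(j_1 + j_2,\, 2k - (j_1 + j_2))} [\alpha_j],
\]
for $j_1, j_2 = 0, 1, 2, \ldots, 2\varrho$, where we identify $[\alpha_{k-j}] = [\alpha_j]$ and $[\alpha_{2\varrho}] = [\alpha_{2\varrho}^{(1)}] \oplus [\alpha_{2\varrho}^{(2)}]$ on the r.h.s. Thus associativity, the homomorphism property of $[\alpha]$ and compatibility with $\langle \alpha_j, \alpha_{j'} \rangle_{M(I_\circ)} = \langle \theta \circ \lambda_j, \lambda_{j'} \rangle_{N(I_\circ)}$, where $[\theta] = [\lambda_0] \oplus [\lambda_k]$, are automatically guaranteed, and the fusion graph of $[\alpha_1]$ is already determined to be $D_{2\varrho+2}$. We only have to specify the fusion rules involving the isolated $[\alpha_{2\varrho}^{(i)}]$, $i = 1, 2$. But it is shown in [28] that the fusion graph $D_{2\varrho+2}$ of $[\alpha_1]$ already determines all the (endomorphism) fusion rules; they are given by
\[
[\alpha_j] \times [\alpha_{2\varrho}^{(i)}] =
\begin{cases}
[\alpha_{2\varrho-j}] \oplus [\alpha_{2\varrho-j+2}] \oplus \ldots \oplus [\alpha_{2\varrho-3}] \oplus [\alpha_{2\varrho-1}], & j \in 2\mathbb{Z} + 1, \\
[\alpha_{2\varrho-j}] \oplus [\alpha_{2\varrho-j+2}] \oplus \ldots \oplus [\alpha_{2\varrho-2}] \oplus [\alpha_{2\varrho}^{(i)}], & j \in 4\mathbb{Z}, \\
[\alpha_{2\varrho-j}] \oplus [\alpha_{2\varrho-j+2}] \oplus \ldots \oplus [\alpha_{2\varrho-2}] \oplus [\alpha_{2\varrho}^{(i+1)}], & j \in 4\mathbb{Z} + 2,
\end{cases}
\]
for $0 < j < 2\varrho$ and $i = 1, 2 \pmod 2$. Of course $[\alpha_0] \times [\alpha_{2\varrho}^{(i)}] = [\alpha_{2\varrho}^{(i)}]$, and
\[
[\alpha_{2\varrho}^{(i)}] \times [\alpha_{2\varrho}^{(i)}] =
\begin{cases}
[\alpha_0] \oplus [\alpha_4] \oplus \ldots \oplus [\alpha_{2\varrho-4}] \oplus [\alpha_{2\varrho}^{(i)}], & \varrho = 2, 4, 6, \ldots, \\
[\alpha_2] \oplus [\alpha_6] \oplus \ldots \oplus [\alpha_{2\varrho-4}] \oplus [\alpha_{2\varrho}^{(i+1)}], & \varrho = 1, 3, 5, \ldots,
\end{cases}
\]
\[
[\alpha_{2\varrho}^{(i)}] \times [\alpha_{2\varrho}^{(i+1)}] =
\begin{cases}
[\alpha_2] \oplus [\alpha_6] \oplus \ldots \oplus [\alpha_{2\varrho-6}] \oplus [\alpha_{2\varrho-2}], & \varrho = 2, 4, 6, \ldots, \\
[\alpha_0] \oplus [\alpha_4] \oplus \ldots \oplus [\alpha_{2\varrho-6}] \oplus [\alpha_{2\varrho-2}], & \varrho = 1, 3, 5, \ldots,
\end{cases}
\]
for $i = 1, 2 \pmod 2$.

Next we consider the $D^{(3\varrho+3)}$, $\varrho \in \mathbb{N}$, (block-diagonal) modular invariant that appears at level $k = 3\varrho$,
\[
Z_{D^{(3\varrho+3)}} = \frac{1}{3} \sum_{\substack{k \ge p \ge q \ge 0 \\ p + q \in 3\mathbb{Z}}} |\chi_{(p,q)} + \chi_{\sigma(p,q)} + \chi_{\sigma^2(p,q)}|^2 ,
\]
where $\sigma$ is the $\mathbb{Z}_3$ rotation of the $\mathcal{A}^{(k+3)}$ graph, $\sigma(p, q) = (k - q, p - q)$, $\sigma^2(p, q) = (k - p + q, k - p)$.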
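The $\mathbb{Z}_3$ rotation in partition labels is simple enough to check by enumeration. The sketch below, with hypothetical helper names `sigma`, `weights` and `fixed_points`, applies the formula $\sigma(p, q) = (k - q, p - q)$ to all labels $k \ge p \ge q \ge 0$ and looks for fixed points; it assumes nothing beyond the formulas just stated.

```python
def sigma(p, q, k):
    """Z_3 rotation of SU(3) level-k weights in partition labels (p, q)."""
    return (k - q, p - q)

def weights(k):
    """All labels (p, q) with k >= p >= q >= 0."""
    return [(p, q) for p in range(k + 1) for q in range(p + 1)]

def fixed_points(k):
    """Labels left invariant by the rotation."""
    return [(p, q) for (p, q) in weights(k) if sigma(p, q, k) == (p, q)]

if __name__ == "__main__":
    k = 6                                   # level k = 3 * rho with rho = 2
    print(fixed_points(k))                  # the only fixed point is (2 rho, rho)
    print(sigma(*sigma(1, 0, k), k))        # sigma^2(p, q) = (k - p + q, k - p)
```

For $k = 3\varrho$ the rotation also preserves $p + q \bmod 3$, so the summation range $p + q \in 3\mathbb{Z}$ in $Z_{D^{(3\varrho+3)}}$ is a union of $\sigma$-orbits.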


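This is an orbifold invariant and it can be treated completely analogously to the $D_{\mathrm{even}}$ invariants of $SU(2)$. The vacuum block gives us the $[\theta]$,
\[
[\theta] = [\lambda_{(0,0)}] \oplus [\lambda_{(k,0)}] \oplus [\lambda_{(k,k)}].
\]
Using the $LSU(3)$ fusion rules at level $k$, in particular those for the simple currents,
\[
[\lambda_{(p,q)}] \times [\lambda_{(k,0)}] = [\lambda_{\sigma(p,q)}], \qquad [\lambda_{(p,q)}] \times [\lambda_{(k,k)}] = [\lambda_{\sigma^2(p,q)}],
\]
we find
\[
\begin{aligned}
\langle \alpha_{(p,q)}, \alpha_{(r,s)} \rangle_{M(I_\circ)} &= \delta_{(p,q),(r,s)} + \delta_{\sigma(p,q),(r,s)} + \delta_{\sigma^2(p,q),(r,s)} \\
&= \delta_{p,r}\delta_{q,s} + \delta_{k-q,r}\delta_{p-q,s} + \delta_{k-p+q,r}\delta_{k-p,s} .
\end{aligned}
\]
Hence we have identifications $[\alpha_{(p,q)}] = [\alpha_{\sigma(p,q)}] = [\alpha_{\sigma^2(p,q)}]$ and all $[\alpha_{(p,q)}]$ are irreducible apart from the fixed point $(p, q) = (2\varrho, \varrho)$, where $\langle \alpha_{(2\varrho,\varrho)}, \alpha_{(2\varrho,\varrho)} \rangle_{M(I_\circ)} = 3$, so that it decomposes into three irreducible sectors as follows,
\[
[\alpha_{(2\varrho,\varrho)}] = [\alpha_{(2\varrho,\varrho)}^{(1)}] \oplus [\alpha_{(2\varrho,\varrho)}^{(2)}] \oplus [\alpha_{(2\varrho,\varrho)}^{(3)}].
\]
One easily checks that the fusion graphs of $[\alpha_{(1,0)}]$ are the orbifold graphs $D^{(k+3)}$ which were first discovered in [36] in the context of statistical mechanical models and then in [18] in the subfactor context.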

4. Graphs and Intertwining Matrices

In this section we define several fusion matrices and study their properties. Using some ideas of Xu [57], we establish identities between these matrices which allow us to identify them with certain matrices considered by Di Francesco and Zuber.

4.1. Some matrices and their properties. Let again $\mathcal{W} \equiv \mathcal{W}(n, k) = \{[\lambda_\Lambda], \Lambda \in \mathcal{A}^{(n+k)}\}$ be the canonical sector basis for $SU(n)_k$. Recall that the structure constants of $W$ can be written as
\[
N_{\Lambda,\Lambda'}^{\Lambda''} = \langle \lambda_\Lambda \circ \lambda_{\Lambda'}, \lambda_{\Lambda''} \rangle_{N(I_\circ)}, \qquad \Lambda, \Lambda', \Lambda'' \in \mathcal{A}^{(n+k)},
\]
and this defines matrices $N_\Lambda$ by $(N_\Lambda)_{\Lambda',\Lambda''} = N_{\Lambda',\Lambda}^{\Lambda''}$. We set $A_p = N_{\Lambda_{(p)}}$ for the fundamental weights $\Lambda_{(p)}$, $p = 1, 2, \ldots, n-1$. Note that $A_1$ is the adjacency matrix of the first fusion graph of the fundamental representation, i.e. of $\mathcal{A}^{(n+k)}$ considered as a graph. For either a conformal inclusion or an orbifold inclusion as discussed above let $V$ denote the sector algebra with basis $\mathcal{V}$ obtained by α-induction. We denote $\alpha_\Lambda \equiv \alpha_{\lambda_\Lambda}$. First of all we claim

Lemma 4.1. For either a conformal or an orbifold inclusion, $\alpha_{\Lambda_{(1)}}$ is always irreducible.


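Proof. Irreducibility of $\alpha_{\Lambda_{(1)}}$ means that $\langle \alpha_{\Lambda_{(1)}}, \alpha_{\Lambda_{(1)}} \rangle_{M(I_\circ)} = 1$. We have
\[
\langle \alpha_{\Lambda_{(1)}}, \alpha_{\Lambda_{(1)}} \rangle_{M(I_\circ)} = \langle \theta \circ \lambda_{\Lambda_{(1)}}, \lambda_{\Lambda_{(1)}} \rangle_{N(I_\circ)} = \langle \theta, \lambda_{\Lambda_{(1)}} \circ \lambda_{\Lambda_{(n-1)}} \rangle_{N(I_\circ)} = 1 + \langle \theta, \lambda_{\Lambda_{(1)} + \Lambda_{(n-1)}} \rangle_{N(I_\circ)},
\]
as $[\bar\lambda_{\Lambda_{(1)}}] = [\lambda_{\Lambda_{(n-1)}}]$ and $[\lambda_{\Lambda_{(1)}}] \times [\lambda_{\Lambda_{(n-1)}}] = [\lambda_0] \oplus [\lambda_{\Lambda_{(1)} + \Lambda_{(n-1)}}]$ and since $[\lambda_0] = [\mathrm{id}]$ appears in the decomposition of $[\theta]$ precisely once. Using the formula for the conformal dimension, Eq. (26), one checks that $h_{\Lambda_{(1)} + \Lambda_{(n-1)}} = n/(k+n) \notin \mathbb{Z}$. However, all subsectors of $[\theta]$ must have integer conformal dimension (and this corresponds to T-invariance in the modular invariant picture): For the conformal inclusion case, the decomposition of $[\theta]$ corresponds to the decomposition of the restricted vacuum representation. In the orbifold inclusion case we have $[\theta] = \bigoplus_{p=0}^{n-1} [\sigma^p]$, $[\sigma^p] = [\lambda_{k\Lambda_{(p)}}]$, and $h_{k\Lambda_{(p)}} = kp(n-p)/2n \in \mathbb{Z}$ as $k \in 2n\mathbb{N}$ if $n$ is even and $k \in n\mathbb{N}$ if $n$ is odd. We conclude that $\langle \theta, \lambda_{\Lambda_{(1)} + \Lambda_{(n-1)}} \rangle_{N(I_\circ)} = 0$, proving irreducibility of $\alpha_{\Lambda_{(1)}}$.

We define the following collection of non-negative integers,
\[
V_{\Lambda;a}^{b} = \langle \beta_a \circ \alpha_\Lambda, \beta_b \rangle_{M(I_\circ)}, \qquad \Lambda \in \mathcal{A}^{(n+k)},\ a, b \in \mathcal{V},
\]
where $\beta_a$ are representative endomorphisms of $a \equiv [\beta_a]$ (and we will use the label 0 for the identity sector of $M(I_\circ)$ as well). This defines square matrices $V_\Lambda$, $\Lambda \in \mathcal{A}^{(n+k)}$, by $(V_\Lambda)_{a,b} = V_{\Lambda;a}^{b}$, as well as rectangular matrices $V_{(a)}$, $a \in \mathcal{V}$, by $(V_{(a)})_{\Lambda,b} = V_{\Lambda;a}^{b}$. Also, we set $G_p = V_{\Lambda_{(p)}}$, $p = 1, 2, \ldots, n-1$. Hence $G_p$ is the adjacency matrix of the fusion graph of $\alpha_{\Lambda_{(p)}}$.

Lemma 4.2. The matrices $V_\Lambda$ and $V_{(a)}$ have the following properties.
1. $V_0 = 1_d$,
2. $A_p V_{(a)} = V_{(a)} G_p$, $a \in \mathcal{V}$, $p = 1, 2, \ldots, n-1$,
3. $V_\Lambda V_{\Lambda'} = \sum_{\Lambda''} N_{\Lambda,\Lambda'}^{\Lambda''} \cdot V_{\Lambda''}$.

Proof. Ad 1. We obviously have $V_{0;a}^{b} = \delta_{a,b}$ as $\mathcal{V}$ is a sector basis.

Ad 2. We compute
\[
\begin{aligned}
(A_p V_{(a)})_{\Lambda,b} &= \sum_{\Lambda' \in \mathcal{A}^{(n+k)}} (A_p)_{\Lambda,\Lambda'} V_{\Lambda';a}^{b} \\
&= \sum_{\Lambda' \in \mathcal{A}^{(n+k)}} \langle \lambda_\Lambda \circ \lambda_{\Lambda_{(p)}}, \lambda_{\Lambda'} \rangle_{N(I_\circ)} \langle \beta_a \circ \alpha_{\Lambda'}, \beta_b \rangle_{M(I_\circ)} \\
&= \langle \beta_a \circ \alpha_{\lambda_\Lambda \circ \lambda_{\Lambda_{(p)}}}, \beta_b \rangle_{M(I_\circ)} \\
&= \sum_{c \in \mathcal{V}} \langle \beta_a \circ \alpha_\Lambda, \beta_c \rangle_{M(I_\circ)} \langle \beta_c \circ \alpha_{\Lambda_{(p)}}, \beta_b \rangle_{M(I_\circ)} \\
&= \sum_{c \in \mathcal{V}} V_{\Lambda;a}^{c} (G_p)_{c,b} = (V_{(a)} G_p)_{\Lambda,b} ,
\end{aligned}
\]
where we used the fact that the $[\lambda_\Lambda]$'s and $[\beta_a]$'s constitute sector bases, and in the third equality we used the additive homomorphism property of α-induction.

Ad 3. We compute
\[
\begin{aligned}
(V_\Lambda V_{\Lambda'})_{a,b} &= \sum_{c \in \mathcal{V}} V_{\Lambda;a}^{c} V_{\Lambda';c}^{b} \\
&= \sum_{c \in \mathcal{V}} \langle \beta_a \circ \alpha_\Lambda, \beta_c \rangle_{M(I_\circ)} \langle \beta_c \circ \alpha_{\Lambda'}, \beta_b \rangle_{M(I_\circ)} \\
&= \langle \beta_a \circ \alpha_\Lambda \circ \alpha_{\Lambda'}, \beta_b \rangle_{M(I_\circ)} \\
&= \sum_{\Lambda'' \in \mathcal{A}^{(n+k)}} N_{\Lambda,\Lambda'}^{\Lambda''} \langle \beta_a \circ \alpha_{\Lambda''}, \beta_b \rangle_{M(I_\circ)} \\
&= \sum_{\Lambda'' \in \mathcal{A}^{(n+k)}} N_{\Lambda,\Lambda'}^{\Lambda''} (V_{\Lambda''})_{a,b} ,
\end{aligned}
\]
where we again used the homomorphism property of α-induction.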


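By some abuse of notation we also denote the sector product matrices associated to $V$ by $N_a$, i.e.
\[
(N_a)_{b,c} = N_{b,a}^{c}, \qquad N_{b,a}^{c} = \langle \beta_b \circ \beta_a, \beta_c \rangle_{M(I_\circ)}, \qquad a, b, c \in \mathcal{V}.
\]
Analogous to the commutative case [33], these matrices realize the "regular" representation of the sector algebra $V$; we have
\[
N_a N_b = \sum_{c \in \mathcal{V}} N_{a,b}^{c} \cdot N_c , \qquad a, b \in \mathcal{V},
\]
since
\[
\begin{aligned}
(N_a N_b)_{d,e} &= \sum_{f \in \mathcal{V}} N_{d,a}^{f} N_{f,b}^{e} = \sum_{f \in \mathcal{V}} \langle \beta_d \circ \beta_a, \beta_f \rangle_{M(I_\circ)} \langle \beta_f \circ \beta_b, \beta_e \rangle_{M(I_\circ)} \\
&= \langle \beta_d \circ \beta_a \circ \beta_b, \beta_e \rangle_{M(I_\circ)} \\
&= \sum_{c \in \mathcal{V}} N_{a,b}^{c} \langle \beta_d \circ \beta_c, \beta_e \rangle_{M(I_\circ)} \\
&= \sum_{c \in \mathcal{V}} N_{a,b}^{c} (N_c)_{d,e} ,
\end{aligned}
\]
where we used that $\mathcal{V}$ is a sector basis.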
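The regular representation property can be tested numerically on a concrete fusion algebra. As an illustration only, the sketch below uses the $SU(2)$ level $k$ fusion rules in the truncated form quoted in Subsect. 3.4 as a stand-in for a general sector algebra (an assumption of this example, since that algebra is commutative), builds the matrices $(N_a)_{b,c} = N_{b,a}^{c}$, and compares $N_a N_b$ with $\sum_c N_{a,b}^{c} N_c$. All function names are hypothetical.

```python
def su2_fusion(k):
    """N[a][b][c]: multiplicity of spin c in the product of spins a and b at level k."""
    r = range(k + 1)
    return [[[1 if (abs(a - b) <= c <= min(a + b, 2 * k - a - b)
                    and (a + b + c) % 2 == 0) else 0
              for c in r] for b in r] for a in r]

def fusion_matrix(N, a):
    """(N_a)_{b,c} = N_{b,a}^c."""
    d = len(N)
    return [[N[b][a][c] for c in range(d)] for b in range(d)]

def matmul(X, Y):
    """Plain integer matrix product."""
    d = len(X)
    return [[sum(X[i][l] * Y[l][j] for l in range(d)) for j in range(d)]
            for i in range(d)]

def regular_rhs(N, a, b):
    """sum_c N_{a,b}^c N_c as a matrix."""
    d = len(N)
    out = [[0] * d for _ in range(d)]
    for c in range(d):
        if N[a][b][c]:
            Nc = fusion_matrix(N, c)
            for i in range(d):
                for j in range(d):
                    out[i][j] += N[a][b][c] * Nc[i][j]
    return out

if __name__ == "__main__":
    N = su2_fusion(4)
    a, b = 1, 3
    print(matmul(fusion_matrix(N, a), fusion_matrix(N, b)) == regular_rhs(N, a, b))
```

The identity holds for every pair of labels because it is equivalent to associativity of the fusion product, which is exactly what the computation with $\langle \beta_d \circ \beta_a \circ \beta_b, \beta_e \rangle_{M(I_\circ)}$ expresses.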

where we used that V is a sector basis. Note that Lemma 4.2 (3) reflects basically the homomorphism property of αP a · induction. Using the decomposition of [α3 ] one can similarly derive V3 = a∈V V3;0 (n+k) . The following lemma reflects the commutativity of [α3 ] with each Na for 3 ∈ A [βa ], proven in Proposition 3.16 of [4]. Lemma 4.3. We have V3 Na = Na V3 for any 3 ∈ A(n+k) and a ∈ V. Proof. We compute P P d c Nd,a = d∈V hβb ◦ α3 , βd iM (I◦ ) hβd ◦ βa , βc iM (I◦ ) (V3 Na )b,c = d∈V V3;b = hβb ◦ α3 ◦ βa , βc iM (I◦ ) = hβb ◦ βa ◦ α3 , βc iM (I◦ ) P = d∈V hβb ◦ βa , βd iM (I◦ ) hβd ◦ α3 , βc iM (I◦ ) P d c = d∈V Nb,a · V3;d = (Na V3 )b,c , where we used Proposition 3.16 of [4]. 4.2. Modular invariants and exponents of graphs. Let us briefly recall some facts about fusion algebras (see e.g. [33]). If W is a fusion algebra with sector basis k then the matrices Ni defined W = {w0 , w1 , . . . , wd−1 } and structure constants Ni,j k form the regular representation of W , and since they constitute a famby (Ni )j,k = Ni,j ily of normal, commuting matrices they can be simultaneously diagonalized by a unitary matrix S. Then the diagonal matrices S ∗ Ni S form a direct sum over all the irreducible (one-dimensional) representations of W , i.e. over its characters. These representations ρj are labelled by j = 0, 1, . . . , d − 1 and are given by ρj (wi ) =

Si,j , i = 0, 1, 2, . . . , d − 1, S0,j

where Si,j are the matrix elements of S. Now let us start with a conformal or orbifold inclusion of SU(n) at level k, and let V again denote the sector basis obtained by α-induction from the sector basis W = W(n, k)


corresponding to the positive energy representations of $SU(n)_k$. Recall that $\mathcal{T} \subset \mathcal{V}$ are the sectors corresponding to the marked vertices, generating a commutative sector subalgebra $T \subset V$ by Theorem 4.3 of [4]; for details see also Subsect. 2.1. Note that $N_{b,\bar a}^{c} = N_{c,a}^{b}$, thus $N_{\bar a}$ is the transpose matrix of $N_a$. Since $\mathcal{T}$ is closed under conjugation and by Lemma 4.3, the matrices $N_t$, $t \in \mathcal{T}$, and $V_\Lambda$, $\Lambda \in \mathcal{A}^{(n+k)}$, form a family of normal, commuting matrices and hence can be simultaneously diagonalized in a suitable orthonormal basis that we denote by $\{\psi^i, i = 1, 2, \ldots, D\}$; here $D = |\mathcal{V}|$. As the matrices $V_\Lambda$ constitute a representation of the fusion algebra $W \equiv W(n, k)$ by Lemma 4.2 they decompose into the one-dimensional irreducible representations $\gamma_\Phi$ of $W$, which are labelled by weights $\Phi \in \mathcal{A}^{(n+k)}$ and are given by
\[
\gamma_\Phi(\Lambda) = \frac{S_{\Lambda,\Phi}}{S_{0,\Phi}}, \qquad \Lambda \in \mathcal{A}^{(n+k)},
\]
where $S_{\Lambda,\Phi}$ denote the entries of the matrix $S$ that diagonalizes the fusion rules of the endomorphisms associated to the $LSU(n)$ level $k$ theory. Due to Wassermann's result [55] these endomorphisms obey the fusion rules given by the Verlinde formula in terms of the modular S-matrix $\mathcal{S}$, therefore the modular S-matrix $\mathcal{S}$ diagonalizes the endomorphism fusion rules, i.e. we have indeed $S = \mathcal{S}$. We conclude that we have a map $\Phi : \{1, 2, \ldots, D\} \to \mathcal{A}^{(n+k)}$, $i \mapsto \Phi(i)$, such that
\[
V_\Lambda = \sum_{i=1}^{D} \gamma_{\Phi(i)}(\Lambda)\, |\psi^i\rangle\langle\psi^i| , \qquad \Lambda \in \mathcal{A}^{(n+k)},
\]
i.e. in components
\[
V_{\Lambda;a}^{b} = \sum_{i=1}^{D} \frac{S_{\Lambda,\Phi(i)}}{S_{0,\Phi(i)}}\, \psi_a^i (\psi_b^i)^* , \qquad \Lambda \in \mathcal{A}^{(n+k)},\ a, b \in \mathcal{V}.
\]
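The statement that the ratios $S_{\Lambda,\Phi}/S_{0,\Phi}$ are one-dimensional representations of the fusion algebra can be checked numerically in the simplest case. The sketch below assumes the standard $SU(2)$ level $k$ modular S-matrix, $S_{a,b} = \sqrt{2/(k+2)}\, \sin(\pi (a+1)(b+1)/(k+2))$, which is not written out in the text above, together with the $SU(2)$ fusion rules quoted in Subsect. 3.4; the function names are hypothetical.

```python
import math

def su2_s_matrix(k):
    """Modular S-matrix of SU(2) at level k (standard formula, assumed here)."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]

def fusion_coefficient(a, b, c, k):
    """Multiplicity of spin c in the fusion product of spins a and b at level k."""
    in_range = abs(a - b) <= c <= min(a + b, 2 * k - a - b)
    return 1 if in_range and (a + b + c) % 2 == 0 else 0

def character(S, phi, lam):
    """gamma_phi(lam) = S[lam][phi] / S[0][phi]."""
    return S[lam][phi] / S[0][phi]

def max_character_defect(k):
    """Largest violation of gamma(a) gamma(b) = sum_c N_ab^c gamma(c) over all labels."""
    S = su2_s_matrix(k)
    worst = 0.0
    for phi in range(k + 1):
        for a in range(k + 1):
            for b in range(k + 1):
                lhs = character(S, phi, a) * character(S, phi, b)
                rhs = sum(fusion_coefficient(a, b, c, k) * character(S, phi, c)
                          for c in range(k + 1))
                worst = max(worst, abs(lhs - rhs))
    return worst

if __name__ == "__main__":
    print(max_character_defect(8))   # should vanish up to rounding errors
```

In this example the exponents of the fusion graph of the fundamental representation are all labels $\Phi = 0, \ldots, k$; for the matrices $V_\Lambda$ of an extended net only the weights in the image of the map $\Phi$ occur.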

The image of $\Phi$ is the set of weights $\Omega \in \mathcal{A}^{(n+k)}$ such that $\gamma_\Omega$ appears in the $V_\Lambda$'s. Since in particular $G_p = V_{\Lambda_{(p)}}$, $p = 1, 2, \ldots, n-1$, we call these weights exponents and denote the set of exponents by $\mathrm{Exp} = \mathrm{Im}\, \Phi$. In other words, Exp labels the joint spectrum of the matrices $V_\Lambda$. Similarly, as the $N_t$'s with $t \in \mathcal{T}$ give a representation of the fusion algebra $T$ of the extended theory by Theorem 4.3 of [4] we have a map $s : \{1, 2, \ldots, D\} \to \mathcal{T}$, $i \mapsto s(i)$, such that
\[
N_t = \sum_{i=1}^{D} \eta_{s(i)}(t)\, |\psi^i\rangle\langle\psi^i| , \qquad t \in \mathcal{T},
\]
where
\[
\eta_s(t) = \frac{S^{\mathrm{ext}}_{t,s}}{S^{\mathrm{ext}}_{0,s}}, \qquad t \in \mathcal{T},
\]
are the one-dimensional representations of $T$ and $S^{\mathrm{ext}}_{t,s}$ denote the entries of a matrix $S^{\mathrm{ext}}$ that diagonalizes the (endomorphism!) fusion rules of $T$. It is widely believed for general conformal field theories (and even conjectured e.g. in [21], Conjecture 4.48) that endomorphisms representing the sectors of a conformal field theory obey the Verlinde fusion rules given in terms of the modular S-matrix, i.e. that we can choose $S^{\mathrm{ext}} = \mathcal{S}^{\mathrm{ext}}$, where $\mathcal{S}^{\mathrm{ext}}$ is the S-matrix coming from the modular transformation of the extended characters. However, a proof exists only for several particular cases, see below.


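Thus we have in components

\[ N_{a,t}^{c} = \sum_{i=1}^{D} \frac{S^{\mathrm{ext}}_{t,s(i)}}{S^{\mathrm{ext}}_{0,s(i)}}\, \psi_a^i (\psi_c^i)^*, \qquad t \in \mathcal{T},\ a, c \in \mathcal{V}. \]

For $\Lambda \in \mathcal{A}^{(n+k)}$ and $t \in \mathcal{T}$ we define $\mathrm{Eig}(\Lambda, t)$ to be the space spanned by those $\psi^i$ which correspond to simultaneous eigenvalues $\gamma_\Lambda(\Lambda')$ of $V_{\Lambda'}$ and $\eta_t(t')$ of $N_{t'}$ for all $t' \in \mathcal{T}$, $\Lambda' \in \mathcal{A}^{(n+k)}$, i.e.

\[ \mathrm{Eig}(\Lambda, t) = \mathrm{span}\{\psi^i : i \in \Phi^{-1}(\Lambda) \cap s^{-1}(t)\}, \]

so in particular $\mathrm{Eig}(\Lambda, t) = 0$ iff $\Lambda \notin \mathrm{Exp}$. So far the vectors $\psi^i$ are fixed up to unitary transformations in each $\mathrm{Eig}(\Lambda, t)$.

Lemma 4.4. We have

\[ \psi_t^i = \frac{S^{\mathrm{ext}}_{t,s(i)}}{S^{\mathrm{ext}}_{0,s(i)}}\, \psi_0^i \]

for any $t \in \mathcal{T}$ and $i = 1, 2, \dots, D$.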

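Proof. Clearly we have $N_{0,t}^{c} = \delta_{c,t}$. Hence

\[ \psi_t^i = \sum_{c \in \mathcal{V}} N_{0,t}^{c}\, \psi_c^i = \sum_{c \in \mathcal{V}} \sum_{j=1}^{D} \frac{S^{\mathrm{ext}}_{t,s(j)}}{S^{\mathrm{ext}}_{0,s(j)}}\, \psi_0^j (\psi_c^j)^* \psi_c^i = \sum_{j=1}^{D} \frac{S^{\mathrm{ext}}_{t,s(j)}}{S^{\mathrm{ext}}_{0,s(j)}}\, \psi_0^j\, \delta_{i,j} = \frac{S^{\mathrm{ext}}_{t,s(i)}}{S^{\mathrm{ext}}_{0,s(i)}}\, \psi_0^i \]

by orthonormality of the $\psi^i$'s.

Let $\psi_0 = (\psi_0^i)_{i=1}^{D}$ denote the dual vector of 0-components, and we set

\[ \|\psi_0\|_{\Lambda,t} = \sqrt{\sum_{i \in \Phi^{-1}(\Lambda) \cap s^{-1}(t)} |\psi_0^i|^2}. \]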

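Lemma 4.5. If $\mathrm{Eig}(\Lambda, t) \neq 0$ for some $\Lambda \in \mathcal{A}^{(n+k)}$ and $t \in \mathcal{T}$ then $\|\psi_0\|_{\Lambda,t} \neq 0$.

Proof. Since $N_a N_b = \sum_{c \in \mathcal{V}} N_{a,b}^{c} \cdot N_c$ and $N_{a,t}^{c} = N_{t,a}^{c}$ for $t \in \mathcal{T}$ by Theorem 4.3 of [4] ($a, b, c \in \mathcal{V}$) we have $N_a N_t = N_t N_a$ for any $a \in \mathcal{V}$. Hence we find for $i \in \Phi^{-1}(\Lambda) \cap s^{-1}(t)$,

\[ V_\Omega N_a \psi^i = \gamma_\Lambda(\Omega)\, N_a \psi^i, \qquad N_u N_a \psi^i = \eta_t(u)\, N_a \psi^i, \qquad \Omega \in \mathcal{A}^{(n+k)},\ u \in \mathcal{T}, \]

i.e. $N_a \psi^i \in \mathrm{Eig}(\Lambda, t)$. In other words, the matrices $N_a$ are block-diagonal in the basis $\psi^i$ corresponding to the decomposition in $\mathrm{Eig}(\Lambda, t)$. It follows that there are matrices, the "blocks" $B_a \equiv B_a(\Lambda, t)$, $(B_a)_{i,j} = B_{a;j}^{i} \in \mathbb{C}$, $i, j \in \Phi^{-1}(\Lambda) \cap s^{-1}(t)$, such that $N_a \psi^i = \sum_{j \in \Phi^{-1}(\Lambda) \cap s^{-1}(t)} B_{a;j}^{i}\, \psi^j$, hence in particular for the 0-components

\[ (N_a \psi^i)_0 = \sum_{j \in \Phi^{-1}(\Lambda) \cap s^{-1}(t)} B_{a;j}^{i}\, \psi_0^j. \]

Since $(N_a \psi^i)_0 = \sum_{c \in \mathcal{V}} N_{a,0}^{c}\, \psi_c^i = \psi_a^i$ we have for any $i \in \Phi^{-1}(\Lambda) \cap s^{-1}(t)$ and any $a \in \mathcal{V}$,

\[ \psi_a^i = \sum_{j \in \Phi^{-1}(\Lambda) \cap s^{-1}(t)} B_{a;j}^{i}\, \psi_0^j. \]

It follows that if $\psi_0^j = 0$ for all $j \in \Phi^{-1}(\Lambda) \cap s^{-1}(t)$ then $\psi_a^i = 0$ for all $i \in \Phi^{-1}(\Lambda) \cap s^{-1}(t)$ and $a \in \mathcal{V}$, i.e. $\mathrm{Eig}(\Lambda, t) = 0$.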


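We set $D_{\Lambda,t} = \dim \mathrm{Eig}(\Lambda, t) \equiv |\Phi^{-1}(\Lambda) \cap s^{-1}(t)|$. Our vectors $\psi^i$ are fixed up to unitary transformations (rotations) in each $\mathrm{Eig}(\Lambda, t)$, $\psi^i \mapsto \sum_{j \in \Phi^{-1}(\Lambda) \cap s^{-1}(t)} u_{i,j}\, \psi^j$, with unitary matrices $u = (u_{i,j})_{i,j \in \Phi^{-1}(\Lambda) \cap s^{-1}(t)}$. Thus we have in particular $\psi_0^i \mapsto \sum_{j \in \Phi^{-1}(\Lambda) \cap s^{-1}(t)} u_{i,j}\, \psi_0^j$. This is a rotation in $\mathbb{C}^{D_{\Lambda,t}}$ on the sphere of radius $\|\psi_0\|_{\Lambda,t}$. As we have shown that $\|\psi_0\|_{\Lambda,t} \neq 0$ if $\mathrm{Eig}(\Lambda, t) \neq 0$ we arrive at the following

Corollary 4.6. There is a choice of eigenvectors $\psi^i$ such that $\psi_0^i \neq 0$ for all $i = 1, 2, \dots, D$, e.g. $\psi_0^i = D_{\Lambda,t}^{-1/2} \|\psi_0\|_{\Lambda,t} > 0$ whenever $i \in \Phi^{-1}(\Lambda) \cap s^{-1}(t)$.

As we now can divide by $\psi_0^i$ we obtain immediately from Lemma 4.4 the following

Corollary 4.7. For such a choice we have for any $t \in \mathcal{T}$ and any $a, c \in \mathcal{V}$,

\[ N_{a,t}^{c} = \sum_{i=1}^{D} \frac{\psi_a^i\, \psi_t^i}{\psi_0^i}\, (\psi_c^i)^*. \qquad (27) \]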

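Let $\chi_t^{\mathrm{ext}}$, $t \in \mathcal{T}$, denote the characters of the extended theory and $\chi_\Lambda$, $\Lambda \in \mathcal{A}^{(n+k)}$, those of $SU(n)_k$. Further, let $b_{t,\Lambda}$ denote the branching coefficients, defined by $\chi_t^{\mathrm{ext}} = \sum_{\Lambda \in \mathcal{A}^{(n+k)}} b_{t,\Lambda}\, \chi_\Lambda$. Let also $\mathcal{S}^{\mathrm{ext}}$ be the modular S-matrix of the extended theory, i.e. (in the notation of [30])

\[ \chi_t^{\mathrm{ext}}\!\left(-\frac{1}{\tau}, \frac{z}{\tau}, u - \frac{(z|z)}{\tau}\right) = \sum_{v \in \mathcal{T}} \mathcal{S}^{\mathrm{ext}}_{t,v}\, \chi_v^{\mathrm{ext}}(\tau, z, u). \]

Lemma 4.8. For any $\Lambda \in \mathcal{A}^{(n+k)}$ and $u \in \mathcal{T}$ we have

\[ \sum_{v \in \mathcal{T}} \mathcal{S}^{\mathrm{ext}}_{u,v}\, b_{v,\Lambda} = \sum_{\Omega \in \mathcal{A}^{(n+k)}} b_{u,\Omega}\, \mathcal{S}_{\Omega,\Lambda}. \qquad (28) \]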

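Proof. This is essentially the computation in [30], p. 268, here for the special case that the branching functions are constants. By taking the S-transformation on both sides of $\chi_u^{\mathrm{ext}} = \sum_{\Omega \in \mathcal{A}^{(n+k)}} b_{u,\Omega}\, \chi_\Omega$ we obtain

\[ \sum_{v \in \mathcal{T}} \mathcal{S}^{\mathrm{ext}}_{u,v}\, \chi_v^{\mathrm{ext}} \equiv \sum_{v \in \mathcal{T}} \sum_{\Lambda \in \mathcal{A}^{(n+k)}} \mathcal{S}^{\mathrm{ext}}_{u,v}\, b_{v,\Lambda}\, \chi_\Lambda = \sum_{\Lambda, \Omega \in \mathcal{A}^{(n+k)}} b_{u,\Omega}\, \mathcal{S}_{\Omega,\Lambda}\, \chi_\Lambda. \]

Since the full (not the Virasoro specialized!) characters $\chi_\Lambda$ are linearly independent functions the coefficients must coincide, so we are done.

Note that $V_{\Lambda;0}^{\,t} = \langle \alpha_\Lambda, \beta_t \rangle_{M(I_\circ)} = \langle \lambda_\Lambda, \sigma_{\beta_t} \rangle_{N(I_\circ)}$ by ασ-reciprocity, thus we find for the branching coefficients $b_{t,\Lambda} = V_{\Lambda;0}^{\,t}$. Let $\tilde{N}_t$ denote the restriction of the matrices $N_t$ to $\mathcal{T}$, i.e. $(\tilde{N}_t)_{u,v} = N_{u,t}^{v}$, $t, u, v \in \mathcal{T}$.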

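Lemma 4.9. Provided that $\mathcal{S}^{\mathrm{ext}}$ diagonalizes the fusion matrices $\tilde{N}_t$, $t \in \mathcal{T}$, i.e. $S^{\mathrm{ext}} = \mathcal{S}^{\mathrm{ext}}$, we have for $\Lambda \in \mathcal{A}^{(n+k)}$ and $t \in \mathcal{T}$,

\[ b_{t,\Lambda} = \frac{\|\psi_0\|_{\Lambda,t}^2}{S^{\mathrm{ext}}_{0,t}\, S_{0,\Lambda}}. \qquad (29) \]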


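Proof. Exploiting $S^{\mathrm{ext}} = \mathcal{S}^{\mathrm{ext}}$, multiplying Eq. (28) by $(S^{\mathrm{ext}}_{u,t})^*$ and summing over $u \in \mathcal{T}$ yields, by unitarity of the S-matrices,

\[
\begin{aligned}
b_{t,\Lambda} &= \sum_{u \in \mathcal{T}} \sum_{\Omega \in \mathcal{A}^{(n+k)}} (S^{\mathrm{ext}}_{u,t})^*\, b_{u,\Omega}\, S_{\Omega,\Lambda} \\
&= \sum_{u \in \mathcal{T}} \sum_{\Omega \in \mathcal{A}^{(n+k)}} (S^{\mathrm{ext}}_{u,t})^*\, V_{\Omega;0}^{\,u}\, S_{\Omega,\Lambda} \\
&= \sum_{u \in \mathcal{T}} \sum_{\Omega \in \mathcal{A}^{(n+k)}} \sum_{i=1}^{D} (S^{\mathrm{ext}}_{u,t})^*\, \frac{(S_{\Omega,\Phi(i)})^*}{S_{0,\Phi(i)}}\, (\psi_0^i)^* \psi_u^i\, S_{\Omega,\Lambda} \\
&= \sum_{u \in \mathcal{T}} \sum_{i=1}^{D} (S^{\mathrm{ext}}_{u,t})^*\, \frac{\delta_{\Lambda,\Phi(i)}}{S_{0,\Lambda}}\, (\psi_0^i)^* \psi_u^i \\
&= \sum_{u \in \mathcal{T}} \sum_{i=1}^{D} (S^{\mathrm{ext}}_{u,t})^*\, \frac{\delta_{\Lambda,\Phi(i)}}{S_{0,\Lambda}}\, |\psi_0^i|^2\, \frac{S^{\mathrm{ext}}_{u,s(i)}}{S^{\mathrm{ext}}_{0,s(i)}} \\
&= \sum_{i=1}^{D} \delta_{t,s(i)}\, \delta_{\Lambda,\Phi(i)}\, \frac{|\psi_0^i|^2}{S^{\mathrm{ext}}_{0,t}\, S_{0,\Lambda}} \\
&= \frac{\|\psi_0\|_{\Lambda,t}^2}{S^{\mathrm{ext}}_{0,t}\, S_{0,\Lambda}},
\end{aligned}
\]

where we used $b_{u,\Omega} = V_{\Omega;0}^{\,u}$ (a real number, hence equal to its complex conjugate) and Lemma 4.4.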

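Recall that the mass matrix of the modular invariant is given by

\[ Z_{\Lambda,\Lambda'} = \sum_{t \in \mathcal{T}} b_{t,\Lambda}\, b_{t,\Lambda'}. \]

We can now summarize Lemmata 4.5 and 4.9 in the following

Theorem 4.10. Provided that $\mathcal{S}^{\mathrm{ext}}$ diagonalizes the fusion matrices $\tilde{N}_t$, $t \in \mathcal{T}$, i.e. $S^{\mathrm{ext}} = \mathcal{S}^{\mathrm{ext}}$, we have $b_{t,\Lambda} \neq 0$ if and only if $\mathrm{Eig}(\Lambda, t) \neq 0$. In particular $Z_{\Lambda,\Lambda} \neq 0$ if and only if $\Lambda \in \mathrm{Exp}$.

Actually we would like to prove a stronger statement than Theorem 4.10, namely $b_{t,\Lambda} = \sqrt{D_{\Lambda,t}}$, because this equality holds in all the examples we have investigated so far. Let us explain why it holds in our examples. Let

\[ \mathrm{tr}\, Z = \sum_{\Lambda \in \mathcal{A}^{(n+k)}} Z_{\Lambda,\Lambda} = \sum_{t \in \mathcal{T}} \sum_{\Lambda \in \mathcal{A}^{(n+k)}} b_{t,\Lambda}^2. \]

We clearly have $\sum_{t \in \mathcal{T}} \sum_{\Lambda \in \mathcal{A}^{(n+k)}} D_{\Lambda,t} = D \equiv |\mathcal{V}|$, and it is a simple observation that $\mathrm{tr}\, Z = D$ in all our examples, hence

\[ \sum_{t \in \mathcal{T}} \sum_{\Lambda \in \mathcal{A}^{(n+k)}} b_{t,\Lambda}^2 = \sum_{t \in \mathcal{T}} \sum_{\Lambda \in \mathcal{A}^{(n+k)}} D_{\Lambda,t}. \]

Thus, if all $b_{t,\Lambda} \in \{0, 1\}$ then our derived equivalence of $b_{t,\Lambda} \neq 0$ and $D_{\Lambda,t} \neq 0$ in Theorem 4.10 implies $b_{t,\Lambda} = \sqrt{D_{\Lambda,t}}$ for all $t \in \mathcal{T}$, $\Lambda \in \mathcal{A}^{(n+k)}$. The only case among our examples where some $b_{t,\Lambda} > 1$ appears is the conformal embedding $SU(4)_4 \subset SO(15)_1$, where the spinor ($s$) representation of $SO(15)_1$ restricts to two copies of $\pi_{(3,2,1)}$, i.e. $b_{s,(3,2,1)} = 2$. Because of Theorem 4.10 we have $D_{\Lambda,t} \geq b_{t,\Lambda}^2$ for all pairs $(t, \Lambda) \neq (s, (3,2,1))$. However, Petkova and Zuber [45] found a multiplicity 4 of the exponent $(3,2,1)$ in the graphs of Figs. 12 and 13, i.e. $\sum_{t \in \mathcal{T}} D_{(3,2,1),t} = 4$. But since $b_{t,(3,2,1)} = 0$ for $t \neq s$ implies $D_{(3,2,1),t} = 0$ for $t \neq s$ it follows that $D_{(3,2,1),s} = 4$, and hence indeed $b_{t,\Lambda} = \sqrt{D_{\Lambda,t}}$ for all $t \in \mathcal{T}$, $\Lambda \in \mathcal{A}^{(4+4)}$ because $\mathrm{tr}\, Z = D = 12$. Nevertheless we have not succeeded in proving this equality for the general case.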
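The observation $\mathrm{tr}\, Z = D$ can be illustrated numerically. The sketch below assumes the well-known $E_6$ modular invariant of $SU(2)$ at level 10, with blocks $\{0,6\}$, $\{3,7\}$, $\{4,10\}$ in labels $j = 0, \dots, 10$, together with the standard $SU(2)_k$ S- and T-matrices; these data and all variable names are assumptions brought in for illustration, not taken from this section.

```python
import numpy as np

k = 10
labels = np.arange(k + 1)
# Assumed standard SU(2)_k modular data.
S = np.sqrt(2.0 / (k + 2)) * np.sin(np.pi * np.outer(labels + 1, labels + 1) / (k + 2))
h = labels * (labels + 2) / (4.0 * (k + 2))           # conformal weights
c = 3.0 * k / (k + 2)                                  # central charge
T = np.diag(np.exp(2j * np.pi * (h - c / 24.0)))

# Branching coefficients b[t, j] of the assumed E6 invariant (all 0 or 1).
blocks = [(0, 6), (3, 7), (4, 10)]
b = np.zeros((len(blocks), k + 1), dtype=int)
for t, block in enumerate(blocks):
    for j in block:
        b[t, j] = 1
Z = b.T @ b                                            # mass matrix Z = sum_t b_t b_t^T

is_modular_invariant = np.allclose(Z @ S, S @ Z) and np.allclose(Z @ T, T @ Z)
exponents = [int(j) + 1 for j in labels if Z[j, j] != 0]

# Adjacency matrix of the E6 Coxeter graph: chain 0-1-2-3-4 with vertex 5 attached to 2.
G = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)]:
    G[u, v] = G[v, u] = 1.0
graph_spectrum = np.sort(np.linalg.eigvalsh(G))
exponent_spectrum = np.sort([2 * np.cos(np.pi * m / (k + 2)) for m in exponents])

print(int(np.trace(Z)), G.shape[0], exponents)
```

In this example the trace of `Z` equals the number of vertices of the graph, and the non-zero diagonal entries sit at the labels whose shifted values are the Coxeter exponents of $E_6$, matching the eigenvalues of the adjacency matrix.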


4.3. Discussion and consequences. Let us now summarize some of the results of this section. To each block-diagonal modular invariant of $SU(n)$, coming either from a conformal inclusion or being of $\mathbb{Z}_n$-orbifold type, we have a net of subfactors such that we can apply α-induction. By doing this, we obtain in particular a set of $n - 1$ normal, mutually commuting matrices $G_p$, $p = 1, 2, \dots, n-1$, which can be interpreted as adjacency matrices of fusion graphs, namely those of $[\alpha_{\Lambda_{(p)}}]$ in the sector algebra $V$. Since $\overline{[\alpha_{\Lambda_{(p)}}]} = [\alpha_{\Lambda_{(n-p)}}]$ we find $G_p^t = G_{n-p}$. The matrices $G_p$ can be simultaneously diagonalized in an orthonormal basis $\{\psi^i,\ i = 1, 2, \dots, |\mathcal{V}|\}$, and the eigenvalues of $G_p$ are given by $S_{\Lambda_{(p)},\Phi}/S_{0,\Phi}$, $\Phi \in \mathrm{Exp}$, where Exp is a subset of $\mathcal{A}^{(n+k)}$. Recall that one can define a $\mathbb{Z}_n$-valued colouring $\tau$ on the vertices of $\mathcal{A}^{(n+k)}$ (which we can identify with the elements of $\mathcal{W}$) by $\tau(\Lambda) = |\Lambda| \bmod n$. Then one has $\tau(0) = 0$ and $N_{\Lambda,\Lambda'}^{\Lambda''} = \langle \lambda_\Lambda \circ \lambda_{\Lambda'}, \lambda_{\Lambda''} \rangle_{N(I_\circ)} = 0$ if $\tau(\Lambda) + \tau(\Lambda') \neq \tau(\Lambda'')$. If $[\theta]$ decomposes only into sectors $[\lambda_\Lambda]$ of colour zero then the elements of $\mathcal{V}$ inherit the colouring from $\mathcal{A}^{(n+k)}$: The colour of $[\beta] \in \mathcal{V}$ is set to be $\tau(\Lambda)$ if $[\beta]$ appears in $[\alpha_\Lambda]$. This is then well defined because $\langle \alpha_\Lambda, \alpha_{\Lambda'} \rangle_{M(I_\circ)} = \langle \theta \circ \lambda_\Lambda, \lambda_{\Lambda'} \rangle_{N(I_\circ)} = 0$ if $\tau(\Lambda) \neq \tau(\Lambda')$. That $[\theta]$ decomposes only into sectors of colour zero is true for all orbifold inclusions and also all conformal embeddings considered here. Therefore the matrices $G_p$ satisfy all the axioms postulated in [45]. However, as already mentioned there, there are also counterexamples, e.g. the conformal embeddings $SU(8)_1 \subset (E_7)_1$ and $SU(9)_1 \subset (E_8)_1$, where $[\theta]$ also has constituents of non-zero colour. The sector algebra $V$ possesses a subalgebra given by the fusion algebra of the sectors of the extended net which restrict to the relevant sectors of the net of the $SU(n)$ theory.

If the corresponding fusion matrices, coming from these sector products, are diagonalized by the modular S-matrix $\mathcal{S}^{\mathrm{ext}}$ of the extended characters then the non-zero diagonal entries $Z_{\Lambda,\Lambda}$ of the modular invariant are precisely those with $\Lambda \in \mathrm{Exp}$. We conjecture that the modular S-matrix $\mathcal{S}^{\mathrm{ext}}$ always diagonalizes the extended (endomorphism) fusion rules, but let us point out the cases where it has already been proven. First we consider the modular invariants coming from conformal inclusions. It follows from Wassermann's results [55] that in particular the endomorphisms of any $LSU(m)$ level 1 theory satisfy the ($\mathbb{Z}_m$) fusion rules of the Verlinde formula, thus we have $S^{\mathrm{ext}} = \mathcal{S}^{\mathrm{ext}}$ for all conformal inclusions $SU(n) \subset G$ with $G = SU(m)$ for some $m$. This covers the infinite series of inclusions $SU(n)_{n-2} \subset SU(n(n-1)/2)_1$ and $SU(n)_{n+2} \subset SU(n(n+1)/2)_1$. By the result of [3], the endomorphisms of the $LSO(m)$ level 1 theories satisfy the well-known $SO(m)_1$ fusion rules, hence $S^{\mathrm{ext}} = \mathcal{S}^{\mathrm{ext}}$ also for $G = SO(m)$. This covers the infinite series of inclusions $SU(n)_n \subset SO(n^2 - 1)_1$, and also $SU(2)_{10} \subset SO(5)_1$. Moreover, we have seen from the treatment of the $E_8$ modular invariant of $SU(2)$ (at level $k = 28$) that the endomorphisms of the $L(G_2)$ level 1 theory obey the Lee-Yang fusion rules, thus we have $S^{\mathrm{ext}} = \mathcal{S}^{\mathrm{ext}}$ also for the conformal inclusion $SU(2)_{28} \subset (G_2)_1$. Now let us turn to the orbifold modular invariants. Unfortunately, our results are only complete for $SU(2)$. We have seen that the fusion algebra $V$ for the $D_{2\varrho+2}$ modular invariants is completely determined although the homomorphism $[\alpha]$ is not surjective. The modular S-matrices $\mathcal{S}^{\mathrm{ext}}$ of the extended characters are known [22, 1] and their Verlinde fusion rules are given explicitly in [1]. They coincide exactly with the fusion rules of the sectors $[\alpha_j]$, $j = 0, 2, 4, \dots, 2\varrho - 2$, and $[\alpha_{2\varrho}]_\pm$, the "marked vertices", which


we gave in Subsect. 3.4. Thus we have $S^{\mathrm{ext}} = \mathcal{S}^{\mathrm{ext}}$ also for these cases. Summarizing, we found that $S^{\mathrm{ext}} = \mathcal{S}^{\mathrm{ext}}$ holds for all block-diagonal modular invariants of $SU(2)$, hence its diagonal entries are labelled by some subset $\mathrm{Exp} \subset A_{k+1}$. As we have seen for the (non-trivial) block-diagonal modular invariants that $G_1 = N_{[\alpha_1]}$ is the adjacency matrix of the Coxeter graphs $E_6$, $E_8$ or $D_{\mathrm{even}}$ (in fact since they are fusion graphs of norm $d_{[\alpha_1]}^2 = d_1^2 = 4\cos^2(\pi/(k+2))$ they can only be these graphs), the set Exp is necessarily given by the Coxeter exponents of these graphs. Thus our theory explains in particular why the spins of the diagonal entries of the non-trivial block-diagonal modular invariants are given by the Coxeter exponents of the graphs $E_6$, $E_8$ and $D_{2\varrho+2}$, $\varrho \in \mathbb{N}$.

5. Other Applications

We shall also discuss some other examples for the application of α-induction which may be of some interest of their own.

5.1. Inclusions of extended U(1) theories. Let $A_N$, $N = 1, 2, 3, \dots$, denote the extension of the $U(1)$ current algebra discussed in [6]. It has $2N$ sectors constituting $\mathbb{Z}_{2N}$ fusion rules. The characters are given by

\[ K_a^{(N)}(q) = \frac{1}{\eta} \sum_{m \in \mathbb{Z}} q^{(a + 2mN)^2/4N}, \qquad a \in \mathbb{Z}_{2N}, \]

where $\eta$ is Dedekind's function. The modular invariant partition functions of these theories have been classified [12]. For each factorization $N = \ell^2 p p'$, $\ell, p, p' \in \mathbb{N}$, $p$ and $p'$ coprime, associate $r, r' \in \mathbb{Z}$ such that $r'p' - rp = 1$. Define $s = r'p' + rp$. Then

\[ Z^{(N)}(\ell, s) = \sum_{a,b \in \mathbb{Z}_{2N}} Z_{a,b}^{(N)}(\ell, s)\, K_a^{(N)} \bar{K}_b^{(N)} \]

with

\[ Z_{a,b}^{(N)}(\ell, s) = \begin{cases} \sum_{c \in \mathbb{Z}_\ell} \delta_{a,\, sb + 2cN/\ell} & \text{if } \ell | a \text{ and } \ell | b, \\ 0 & \text{otherwise} \end{cases} \]

exhaust all modular invariants. Note that $\ell = 1$, $p = N$, $p' = 1$, implying $r = 0$, $r' = 1$, $s = 1$, gives the diagonal modular invariant $Z^{(N)}(1, 1)$. Now choose an $N$ such that $\ell \neq 1$ so that $A_N$ is a non-maximal $U(1)$-extension in the terminology of [6]. Choose $p = N/\ell^2$, $p' = 1$ implying $r = 0$, $r' = 1$ and $s = 1$. The corresponding partition function reads

\[ Z^{(N)}(\ell, 1) = \sum_{a \in \mathbb{Z}_{2p}} \Big| \sum_{c \in \mathbb{Z}_\ell} K_{a\ell + 2cN/\ell}^{(N)} \Big|^2. \qquad (30) \]

But

\[ \sum_{c \in \mathbb{Z}_\ell} K_{a\ell + 2cN/\ell}^{(\ell^2 p)}(q) = \eta^{-1} \sum_{c \in \mathbb{Z}_\ell} \sum_{m \in \mathbb{Z}} q^{(a\ell + 2c\ell p + 2m\ell^2 p)^2/4\ell^2 p} = \eta^{-1} \sum_{m \in \mathbb{Z}} q^{(a + 2mp)^2/4p} = K_a^{(p)}(q), \]

hence

\[ Z^{(\ell^2 p)}(\ell, 1) = \sum_{a \in \mathbb{Z}_{2p}} |K_a^{(p)}|^2 = Z^{(p)}(1, 1). \]
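The character identity used above follows because $c + m\ell$ runs over each integer exactly once as $c \in \mathbb{Z}_\ell$ and $m \in \mathbb{Z}$. As a rough numerical sanity check, the sketch below compares truncated theta sums on both sides with the common factor $1/\eta$ dropped; the function names, the truncation bound `M` and the sample values of $\ell$, $p$, $q$ are illustrative assumptions.

```python
import math

def theta_sum(a, N, q, M=6):
    """Truncated sum over |m| <= M of q**((a + 2 m N)**2 / (4 N)), i.e. eta * K_a^{(N)}(q)."""
    return sum(q ** ((a + 2 * m * N) ** 2 / (4 * N)) for m in range(-M, M + 1))

def block_sum(a, ell, p, q, M=6):
    """Truncated sum over c in Z_ell of eta * K^{(ell^2 p)}_{a ell + 2 c ell p}(q)."""
    N = ell * ell * p
    return sum(theta_sum(a * ell + 2 * c * ell * p, N, q, M) for c in range(ell))

ell, p, q = 2, 3, 0.3
# The right-hand side is truncated at ell * (M + 1) - 1 = 13 so that it covers every
# index c + m * ell that occurs on the left; the omitted terms are far below rounding error.
checks = [
    math.isclose(block_sum(a, ell, p, q), theta_sum(a, p, q, M=13), rel_tol=1e-12)
    for a in range(2 * p)
]
print(all(checks))
```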


Indeed the inclusion $A_{N = \ell^2 p} \subset A_p$ is of $\mathbb{Z}_\ell$ type. Note that $Z^{(\ell^2 p)}(\ell, 1)$ is block-diagonal, and we can take the net of inclusions of local algebras $A_N(I) \subset A_p(I)$, with $I \in \mathcal{J}_z$, as our net of subfactors $N \subset M$. Let us denote by $\lambda_a^{(N)}$ the endomorphisms (which are in fact the automorphisms constructed in [6]) corresponding to the sectors labelled by $a \in \mathbb{Z}_{2N}$. The $\mathbb{Z}_{2N}$ fusion rules then just read

\[ [\lambda_a^{(N)}] \times [\lambda_b^{(N)}] = [\lambda_{a+b}^{(N)}], \qquad a, b \in \mathbb{Z}_{2N}. \]

Thus the associated fusion algebra $W^{(N)}$ is the group algebra of $\mathbb{Z}_{2N}$ (with $\mathbb{Z}_{2N}$ as sector basis). Now we want to apply the machinery of α-induction. For a non-maximal $A_N$, $N = \ell^2 p$, we start with the block-diagonal partition function in Eq. (30) and read off the $[\theta]$ from the vacuum block,

\[ [\theta] = \bigoplus_{c \in \mathbb{Z}_\ell} [\lambda_{2c\ell p}^{(N)}]. \qquad (31) \]

It is easy to see that the formula $\langle \alpha_a^{(N)}, \alpha_b^{(N)} \rangle_{M(I_\circ)} = \langle \theta \circ \lambda_a^{(N)}, \lambda_b^{(N)} \rangle_{N(I_\circ)}$ (we denote $\alpha_a^{(N)} \equiv \alpha_{\lambda_a^{(N)}}$) and the homomorphism property of $[\alpha]$ determine the induced sector algebra $V$ to be the group algebra of $\mathbb{Z}_{2\ell p} = \mathbb{Z}_{2N}/\mathbb{Z}_\ell$. Now $\mathbb{Z}_{2\ell p}$ has the subgroup $\mathbb{Z}_{2p}$, describing the fusion rules of $A_p$, and this corresponds to the marked vertices.

As an illustration we discuss the $\mathbb{Z}_2$ inclusion $A_4 \subset A_1$. The $A_1$ theory has two sectors and it is known to be precisely the $\widehat{su}(2)_1$ theory. The $A_4$ theory has eight sectors, and their conformal weights $\Delta_a$ are given by $\Delta_a = a^2/16$, $a = 0, \pm 1, \pm 2, \pm 3, 4$. Indeed the series of $\mathbb{Z}_2$ orbifold inclusions $SO(2n)_2 \subset SU(2n)_1$ gives the inclusion $A_4 \subset A_1$ when $n = 1$. The sectors of $A_4$ labelled by $a = 0, \pm 2, 4$ are the basic ($\circ$), the spinor ($s$, $c$) and the vector ($v$) modules, the sectors labelled by $\pm 1$ and $\pm 3$ are the twisted sectors $\sigma, \tau$ and $\sigma', \tau'$, respectively, in the terminology of $SO(2n)_2$. The modular invariant $Z^{(4)}(2, 1)$ of $A_4$ reads

\[ Z^{(4)}(2, 1) = |K_0^{(4)} + K_4^{(4)}|^2 + |K_2^{(4)} + K_{-2}^{(4)}|^2 = |K_0^{(1)}|^2 + |K_1^{(1)}|^2 = Z^{(1)}(1, 1). \]

From $[\theta] = [\lambda_0^{(4)}] \oplus [\lambda_4^{(4)}]$ we obtain that $V$ is the group algebra of $\mathbb{Z}_4$. The sectors $[\lambda_0^{(4)}]$, $[\lambda_4^{(4)}]$ and $[\lambda_2^{(4)}]$, $[\lambda_{-2}^{(4)}]$, obtained from $[\lambda_0^{(1)}]$ and $[\lambda_1^{(1)}]$ by σ-restriction, yield irreducible sectors $[\alpha_0^{(4)}] = [\alpha_4^{(4)}]$ and $[\alpha_2^{(4)}] = [\alpha_{-2}^{(4)}]$, respectively, and constitute the $\mathbb{Z}_2 \subset \mathbb{Z}_4$ subgroup. In the fusion graph of $[\alpha_1^{(4)}]$, being the $\mathbb{Z}_4$ graph $A_3^{(1)}$, these sectors represent the marked vertices. The sectors $[\lambda_{\pm 1}^{(4)}]$ and $[\lambda_{\pm 3}^{(4)}]$ are not obtained by σ-restriction of any $A_1$ sectors. Note that these are precisely the twisted sectors. Correspondingly, $[\alpha_1^{(4)}] = [\alpha_{-3}^{(4)}]$ and $[\alpha_3^{(4)}] = [\alpha_{-1}^{(4)}]$ yield the elements of the sector basis $\mathcal{V}$ of $V$ which are not represented by marked vertices. These observations generalize as follows to the block-diagonal modular invariant $Z^{(N)}(\ell, 1)$ for $N = \ell^2 p$.

We have seen that we then can consider $A_N$ as the $\mathbb{Z}_\ell$ orbifold theory of $A_p$. Reading off the $[\theta]$, Eq. (31), from the vacuum block we obtain that $V$ is the group algebra of $\mathbb{Z}_{2\ell p}$. The irreducible sectors $[\alpha_a^{(N)}]$ with $a$ a multiple of $\ell$, i.e. $a \in \mathbb{Z}_{2p} \subset \mathbb{Z}_{2\ell p}$, are represented as marked vertices in the fusion graph of $[\alpha_1^{(N)}]$. Correspondingly, the sectors $[\lambda_a^{(N)}]$, $a \in \mathbb{Z}_{2p}$, are obtained by σ-restriction of the sector $[\lambda_a^{(p)}]$ of $A_p$. Considering $A_N$ as the $\mathbb{Z}_\ell$ orbifold of $A_p$, we can interpret the other sectors $[\lambda_b^{(N)}]$, $b \notin \mathbb{Z}_{2p}$, as twisted sectors.
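The way the reciprocity formula fixes the induced sector algebra in this abelian case can be made concrete with a small computation. The sketch below assumes only the $\mathbb{Z}_{2N}$ fusion rules and the form (31) of $[\theta]$; it groups the labels $a \in \mathbb{Z}_{2N}$ into classes with $\langle \theta \circ \lambda_a, \lambda_b \rangle \neq 0$. The function names are hypothetical.

```python
def theta_labels(ell, p):
    """Labels 2*c*ell*p (mod 2N) of the automorphisms in [theta], N = ell^2 p, cf. Eq. (31)."""
    N = ell * ell * p
    return {(2 * c * ell * p) % (2 * N) for c in range(ell)}

def inner(a, b, ell, p):
    """<theta o lambda_a, lambda_b>: number of c in Z_ell with a + 2*c*ell*p = b mod 2N."""
    N = ell * ell * p
    return sum(1 for t in theta_labels(ell, p) if (a + t - b) % (2 * N) == 0)

def induced_classes(ell, p):
    """Smallest representatives a of the distinct induced sectors [alpha_a]."""
    N = ell * ell * p
    reps = []
    for a in range(2 * N):
        if not any(inner(a, r, ell, p) for r in reps):
            reps.append(a)
    return reps

ell, p = 2, 1                                            # the inclusion A_4 in A_1
reps = induced_classes(ell, p)
marked = [a for a in reps if a % ell == 0]               # labels that are multiples of ell
print(reps, marked)
```

For $\ell = 2$, $p = 1$ this yields four classes, two of them marked, in line with the $\mathbb{Z}_2 \subset \mathbb{Z}_4$ structure described for $A_4 \subset A_1$; in general the number of classes is $2\ell p$.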


5.2. Minimal models. We shall also briefly discuss the treatment of the minimal models here. The minimal models are described by the positive energy representations of the diffeomorphism group of the circle $\mathrm{Diff}(S^1)$, or, on the level of Lie algebras, by the unitary highest weight modules of the Virasoro algebra $\mathrm{Vir}(c)$, where the central charge $c \equiv c(m)$ is given by

\[ c = 1 - \frac{6}{m(m+1)}, \qquad m = 3, 4, 5, \dots. \]

These models arise as coset theories [25]

\[ \frac{SU(2)_{m-2} \otimes SU(2)_1}{SU(2)_{m-1}}. \]

The modules $H_{p,q}$, appearing at fixed $m$, are labelled by pairs of integers $p = 0, 1, 2, \dots, m-2$ and $q = 0, 1, 2, \dots, m-1$, the conformal grid. We have a double counting, $H_{p,q} = H_{m-p-2,\,m-q-1}$. In the setting of local von Neumann algebras the minimal models have been treated in [37] quite analogously to the treatment of the loop groups $LSU(n)$ by Wassermann. Here we are dealing with a net $N$ of local von Neumann algebras $N(I) = \pi_0(\mathrm{Diff}_I(S^1))''$, where $\pi_0$ is the vacuum representation of $\mathrm{Diff}(S^1)$ and $\mathrm{Diff}_I(S^1)$ is the subgroup of diffeomorphisms concentrated on an interval $I \subset S^1$. Analogously to the arguments for $LSU(n)$ we have Haag duality in the vacuum representation and the positive energy representations correspond to localized, transportable endomorphisms. The well-known fusion rules are proven in the bimodule setting in [37] and hence they give the correct fusion rules for the corresponding sectors, explicitly,

\[ [\lambda_{p,q}] \times [\lambda_{p',q'}] = \bigoplus_{\substack{r = |p-p'| \\ r+p+p' \text{ even}}}^{\min(p+p',\, 2m-p-p'-4)} \ \bigoplus_{\substack{s = |q-q'| \\ s+q+q' \text{ even}}}^{\min(q+q',\, 2m-q-q'-2)} [\lambda_{r,s}], \]

where $\lambda_{p,q} \in \Delta_N(I_\circ)$ denote the endomorphisms associated to $H_{p,q}$. This determines the fusion algebra $W_{\mathrm{Vir}(c(m))}$. The modular invariants of the minimal models are classified, and are labelled by pairs $(G_1, G_2)$ of ADE-graphs (with Coxeter numbers $m$ and $m+1$) [7]. If we write the $SU(2)$ modular invariants appearing at level $k$ and labelled by ADE-graphs $G$ as

\[ Z_G = \sum_{j,j'=0}^{k} Z_{j,j'}^{(k)}(G)\, \chi_j \bar{\chi}_{j'}, \]

then the $(G_1, G_2)$ modular invariant of the minimal model with $c = c(m)$ is given by

\[ Z_{G_1,G_2} = \frac{1}{2} \sum_{p,p'=0}^{m-2} \sum_{q,q'=0}^{m-1} Z_{p,p'}^{(m-2)}(G_1)\, Z_{q,q'}^{(m-1)}(G_2)\, \chi_{p,q} \bar{\chi}_{p',q'}, \]

where $\chi_{p,q}$ denotes the character of $H_{p,q}$. The prefactor 1/2 is due to the double counting. Since either $m - 2$ or $m - 1$ is odd either $G_1$ or $G_2$ is necessarily an A-graph.
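The labelling conventions of this subsection are easy to experiment with. The sketch below evaluates the central charge $c(m)$ exactly, enumerates the conformal grid modulo the double counting $H_{p,q} = H_{m-p-2,\,m-q-1}$, and applies the fusion rule as stated above to label pairs (without reducing the result modulo the double counting); the helper names are hypothetical.

```python
from fractions import Fraction

def central_charge(m):
    """c(m) = 1 - 6 / (m (m + 1)) as an exact fraction."""
    return 1 - Fraction(6, m * (m + 1))

def conformal_grid(m):
    """One representative (p, q) per module, using H_{p,q} = H_{m-p-2, m-q-1}."""
    grid = set()
    for p in range(m - 1):          # p = 0 .. m-2
        for q in range(m):          # q = 0 .. m-1
            grid.add(min((p, q), (m - p - 2, m - q - 1)))
    return sorted(grid)

def su2_range(a, b, k):
    """Labels c = |a-b|, |a-b|+2, ..., min(a+b, 2k-a-b) of the level-k truncated rule."""
    return range(abs(a - b), min(a + b, 2 * k - a - b) + 1, 2)

def fuse(x, y, m):
    """Label pairs (r, s) occurring in [lambda_x] x [lambda_y], before identifications."""
    (p, q), (p2, q2) = x, y
    return [(r, s) for r in su2_range(p, p2, m - 2) for s in su2_range(q, q2, m - 1)]

print(central_charge(5), len(conformal_grid(5)), fuse((0, 1), (1, 0), 5))
```

For $m = 5$ this gives $c = 4/5$ and $m(m-1)/2 = 10$ distinct modules.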


We would like to apply the procedure of α-induction. Although these are block-diagonal modular invariants we do not always know what the net $M$ is. Recall that for $SU(2)$ the $E_6$ and $E_8$ modular invariants come from the conformal embedding $SU(2) \subset G$, where $G = SO(5)$ or $G_2$, respectively. So it is natural to ask whether there is an extension of Vir for the $(G_1, G_2)$ modular invariants where $G_1$ or $G_2$ is $E_6$ or $E_8$. For the $(E_6, A_{12})$- and $(E_8, A_{30})$-invariants the natural candidate is the coset

\[ \frac{G_1 \otimes SU(2)_1}{SU(2)_{m-1}}, \]

where $G_1 = SO(5)_1$ and $m = 12$, or $G_1 = (G_2)_1$, $m = 30$, respectively. However, for the $(A_{10}, E_6)$ and $(A_{28}, E_8)$ modular invariants there is no such natural candidate. For any block-diagonal modular invariant of the minimal models we proceed by assuming that there is a net $M$ such that the net of subfactors $N \subset M$ has the correct properties, in particular, that the blocks correspond to σ-restriction of representations of the net $M$. Then we may go on as follows: Let $\Gamma_k(G)$ denote the set of integers $j$ such that $[\lambda_j]$ appears in the $[\theta]$ we associated to the $SU(2)$ $G$ modular invariant, see Table 1.

Table 1. The sets $\Gamma_k(G)$

  Level                                    Graph G              $\Gamma_k(G)$
  $k = 1, 2, 3, \dots$                     $A_{k+1}$            $\{0\}$
  $k = 4\varrho$, $\varrho = 1, 2, 3, \dots$   $D_{2\varrho+2}$     $\{0, 4\varrho\}$
  $k = 10$                                 $E_6$                $\{0, 6\}$
  $k = 28$                                 $E_8$                $\{0, 10, 18, 28\}$

For the $(G_1, G_2)$ modular invariant of the minimal model with $c = c(m)$ define $[\theta] \in [\Delta]_N(I_\circ)$ by

\[ [\theta] = \bigoplus_{p \in \Gamma_{m-2}(G_1)} \ \bigoplus_{q \in \Gamma_{m-1}(G_2)} [\lambda_{p,q}], \]

so that $[\theta]$ precisely corresponds to the vacuum block in $Z_{G_1,G_2}$. (Note that one of the summations is always trivial as either $G_1$ or $G_2$ is an A-graph.) Then we determine the induced fusion algebra $V$ by $\langle \alpha_{p,q}, \alpha_{p',q'} \rangle_{M(I_\circ)} = \langle \theta \circ \lambda_{p,q}, \lambda_{p',q'} \rangle_{N(I_\circ)}$, where we denote $\alpha_{p,q} \equiv \alpha_{\lambda_{p,q}}$. We have to choose an analogue of the fundamental representation of $LSU(n)$ for the minimal models. It is instructive to discuss briefly the fusion graphs of $[\alpha_\lambda]$ for the choices $\lambda = \lambda_{0,1}, \lambda_{1,0}, \lambda_{1,1}$. First one checks that then $[\alpha_\lambda]$ is irreducible, $\langle \alpha_\lambda, \alpha_\lambda \rangle_{M(I_\circ)} = 1$, in all cases. Now $[\alpha_{0,1}]$ ($[\alpha_{1,0}]$) generates the fusion subalgebra corresponding to the first column (row) of the conformal grid, being isomorphic to $W(2, m-1)$ ($W(2, m-2)$). Thus the fusion graph of $[\alpha_{0,1}]$ ($[\alpha_{1,0}]$) is not connected; the identity component is just $G_2$ ($G_1$). Now consider $[\lambda_{1,1}]$, that is $[\lambda_{1,1}] = [\lambda_{0,1}] \times [\lambda_{1,0}]$. The fusion graph of $[\alpha_{1,1}]$ is somehow a combination of the graphs $G_1$ and $G_2$. As an illustration, we give the result for the $(A_4, D_4)$ modular invariant ($m = 5$, $c = 4/5$)

\[ Z_{(A_4,D_4)} = \frac{1}{2} \sum_{p=0}^{3} \left( |\chi_{p,0} + \chi_{p,4}|^2 + 2|\chi_{p,2}|^2 \right). \]


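Fig. 15. Fusion graph of $[\alpha_{1,1}]$ for the $(A_4, D_4)$ modular invariant, with vertices $[\alpha_{0,0}]$, $[\alpha_{1,1}]$, $[\alpha_{2,0}]$, $[\alpha_{3,1}]$, $[\alpha_{0,2}]_\pm$, $[\alpha_{2,2}]_\pm$

Then $[\theta] = [\lambda_{0,0}] \oplus [\lambda_{0,4}]$, and we find eight distinct irreducible sectors, $[\alpha_{0,0}]$, $[\alpha_{1,1}]$, $[\alpha_{2,0}]$, $[\alpha_{3,1}]$, $[\alpha_{0,2}]_\pm$, $[\alpha_{2,2}]_\pm$. The fusion graph of $[\alpha_{1,1}]$ is given in Fig. 15.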

6. Outlook

We have applied the procedure of α-induction and σ-restriction of sectors to chiral conformal field theory models, in particular to the $SU(n)_k$ WZW models. Looking at the block-diagonal modular invariants arising from conformal or orbifold inclusions of $SU(n)$, we have seen that their classification by certain fusion graphs – in particular the A-D-E classification in the $SU(2)$ case – can be understood by the induction-restriction machinery of the relevant sectors. However, many questions remain unanswered. The induction turns out to be non-surjective in several cases; this is apparently closely related to multiplicities in the mass matrix $Z$, but a good understanding of this non-surjectivity (which can even lead to non-commutativity of the induced sector algebra) is still missing. It might be possible to extract more information about the structure of the induced fusion algebra from the $SU(n)_k$ data than our results in [4], like the main reducibility formula or ασ-reciprocity, provide. In fact, the observation $\mathrm{tr}\, Z = D \equiv |\mathcal{V}|$ is still awaiting a good explanation. It will certainly be worth looking also at the block-diagonal $SU(n)$ modular invariants that come neither from conformal nor from orbifold embeddings. Moreover, it is not clear at the moment how to incorporate the type II modular invariants in our framework. Another challenging question, suggested by the treatment of $SU(2)$, concerns a better understanding of the relation between the appearance of modular invariants of $SU(n)$ WZW models and the existence of sub-(equivalent)-paragroups of the paragroups arising from the relevant A-type subfactors ([32, 42]). Of course it will also be interesting to construct the associated fusion graphs for modular invariants of other Lie groups, e.g. $Sp(n)$.

Acknowledgement. We would like to thank F. Goodman for providing us with a Pascal program for the computation of the $SU(n)_k$ fusion coefficients, and J. Fuchs for providing us with a printout of certain fusion matrices. We are also grateful to T. Gannon for helpful correspondence by e-mail, and it is a pleasure to thank K.-H. Rehren for many useful comments on an earlier version of the manuscript. This project is supported by the EU TMR Network in Non-Commutative Geometry.


References

1. Baver, E., Gepner, D.: Fusion rules for extended current algebras. Mod. Phys. Lett. A 11, 1929–1946 (1996)
2. Bisch, D., Jones, V.F.R.: Algebras associated to intermediate subfactors. Invent. Math. 128, 89–157 (1997)
3. Böckenhauer, J.: An algebraic formulation of level one Wess-Zumino-Witten models. Rev. Math. Phys. 8, 925–947 (1996)
4. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors I. Commun. Math. Phys. 197, 361–386 (1998)
5. Brunetti, R., Guido, D., Longo, R.: Modular structure and duality in conformal quantum field theory. Commun. Math. Phys. 156, 201–219 (1993)
6. Buchholz, D., Mack, G., Todorov, I.: The current algebra on the circle as a germ of local field theories. Nucl. Phys. B (Proc. Suppl.) 5B, 20–56 (1988)
7. Cappelli, A., Itzykson, C., Zuber, J.-B.: The A-D-E classification of minimal and $A_1^{(1)}$ conformal invariant theories. Commun. Math. Phys. 113, 1–26 (1987)
8. Cardy, J.L.: Operator content of two-dimensional conformally invariant theories. Nucl. Phys. B 270, 186–204 (1986)
9. Date, E., Jimbo, M., Miwa, T., Okado, M.: Solvable lattice models. Theta functions – Bowdoin 1987, Part 1, Proceedings of Symposia in Pure Mathematics Vol. 49, American Mathematical Society, Providence, R.I. (1987), pp. 295–332
10. Di Francesco, P.: Integrable lattice models, graphs and modular invariant conformal field theories. Int. J. Mod. Phys. A 7, 407–500 (1992)
11. Di Francesco, P., Mathieu, P., Sénéchal, D.: Conformal field theory. New York: Springer-Verlag, 1996
12. Di Francesco, P., Saleur, H., Zuber, J.-B.: Modular invariance in non-minimal two-dimensional conformal field theories. Nucl. Phys. B 285, 454–480 (1987)
13. Di Francesco, P., Zuber, J.-B.: SU(N) lattice integrable models associated with graphs. Nucl. Phys. B 338, 602–646 (1990)
14. Di Francesco, P., Zuber, J.-B.: SU(N) lattice integrable models and modular invariants. In: Randjbar, S. et al (eds.): Recent developments in conformal field theories. Singapore: World Scientific, 1990, 179–215
15. Doplicher, S., Haag, R., Roberts, J.E.: Fields, observables and gauge transformations II. Commun. Math. Phys. 15, 173–200 (1969)
16. Evans, D.E., Gould, J.D.: Dimension groups and embeddings of graph algebras. Intern. J. Math. 5, 291–327 (1994)
17. Evans, D.E., Kawahigashi, Y.: The E7 commuting squares produce D10 as principal graph. Publ. RIMS, Kyoto Univ. 30, 151–166 (1994)
18. Evans, D.E., Kawahigashi, Y.: Orbifold subfactors from Hecke algebras. Commun. Math. Phys. 165, 445–484 (1994)
19. Evans, D.E., Kawahigashi, Y.: Quantum symmetries on operator algebras. Oxford: Oxford University Press, 1998
20. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection sectors with braid group statistics and exchange algebras II. Rev. Math. Phys. Special Issue, 113–157 (1992)
21. Fröhlich, J., Gabbiani, F.: Operator algebras and conformal field theory. Commun. Math. Phys. 155, 569–640 (1993)
22. Fuchs, J., Schellekens, A.N., Schweigert, C.: A modular matrix S for all simple current extensions. Nucl. Phys. B 473, 323–366 (1996)
23. Gannon, T.: WZW commutants, lattices and level-one partition functions. Nucl. Phys. B 396, 708–736 (1993)
24. Gannon, T.: The classification of affine SU(3) modular invariants. Commun. Math. Phys. 161, 233–264 (1994)
25. Goddard, P., Kent, A., Olive, D.: Virasoro algebras and coset space models. Phys. Lett. B 152, 88–93 (1985)
26. Goodman, F.M., de la Harpe, P., Jones, V.F.R.: Coxeter graphs and towers of algebras. New York: Springer-Verlag, 1989
27. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–35 (1996)
28. Izumi, M.: Application of fusion rules to classification of subfactors. Publ. RIMS, Kyoto Univ. 27, 953–994 (1991)


29. Jimbo, M.: A q-analogue of U(N + 1), Hecke algebra and the Yang–Baxter equation. Lett. Math. Phys. 11, 247–252 (1986)
30. Kac, V.G.: Infinite dimensional Lie algebras. 3rd edition, London: Cambridge University Press, 1990
31. Kawahigashi, Y.: Classification of paragroup actions on subfactors. Publ. RIMS, Kyoto Univ. 31, 481–517 (1995)
32. Kawahigashi, Y.: Quantum Galois correspondence for subfactors. In preparation.
33. Kawai, T.: On the structure of fusion algebras. Phys. Lett. B 217, 247–251 (1989)
34. Klümper, A., Pearce, P.A.: Conformal weights of RSOS lattice models and their fusion hierarchies. Physica A 183, 304–350 (1992)
35. Kohno, T., Takata, T.: Symmetry of Witten's 3-manifold invariants for sl(n, C). Journal of Knot Theory and Its Ramifications 2, 149–169 (1993)
36. Kostov, I.K.: Free field presentation of the $A_n$ coset models on the torus. Nucl. Phys. B 300, 559–587 (1988)
37. Loke, T.: Operator algebras and conformal field theory of the discrete series representations of Diff($S^1$). Dissertation Cambridge (1994)
38. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995)
39. Moore, G., Seiberg, N.: Naturality in conformal field theory. Nucl. Phys. B 313, 16–40 (1989)
40. Nahm, W.: A proof of modular invariance. Int. J. Mod. Phys. A 6, 2837–2845 (1991)
41. O'Brian, L., Pearce, P.A.: Lattice realizations of unitary minimal modular invariant partition functions. J. Phys. A 28, 4891–4906 (1995)
42. Ocneanu, A.: Paths on Coxeter diagrams: From platonic solids and singularities to minimal models and subfactors. Lectures given at the Fields Institute (1995), notes recorded by S. Goto.
43. Pasquier, V.: Operator content of the ADE lattice models. J. Phys. A 20, 5707–5717 (1987)
44. Pearce, P.: Recent progress in solving A-D-E lattice models. Physica A 205, 15–30 (1994)
45. Petkova, V.B., Zuber, J.-B.: From CFT to graphs. Nucl. Phys. B 463, 161–193 (1996)
46. Petkova, V.B., Zuber, J.-B.: Conformal field theory and graphs. In: Proceedings Goslar 1996 "Group 21"
47. Pressley, A., Segal, G.: Loop groups. Oxford: Oxford University Press, 1986
48. Rehren, K.-H.: Space-time fields and exchange fields. Commun. Math. Phys. 132, 461–483 (1990)
49. Rehren, K.-H.: Markov traces as characters for local algebras. Nucl. Phys. B (Proc. Suppl.) 18B, 259–268 (1990)
50. Rehren, K.-H.: Subfactors and coset models. In: Doebner, H.-D. et al (eds.): Generalized symmetries in physics. Singapore: World Scientific 1994, pp. 338–356
51. Roberts, J.E.: Lectures on algebraic quantum field theory. In: Kastler, D. (ed.): The algebraic theory of superselection sectors. Singapore: World Scientific 1990, pp. 1–112
52. Schellekens, A.N., Yankielowicz, S.: Extended chiral algebras and modular invariant partition functions. Nucl. Phys. B 327, 673–703 (1989)
53. Takesaki, M.: Conditional expectations in von Neumann algebras. J. Funct. Anal. 9, 306–321 (1972)
54. Wassermann, A.: Subfactors arising from positive energy representations of some infinite dimensional groups. Unpublished notes 1990
55. Wassermann, A.: Operator algebras and conformal field theory III. To appear in Invent. Math.
56. Wenzl, H.: Hecke algebras of type $A_n$ and subfactors. Invent. Math. 92, 345–383 (1988)
57. Xu, F.: New braided endomorphisms from conformal inclusions. Commun. Math. Phys. 192, 349–403 (1998)

Communicated by H. Araki

Commun. Math. Phys. 200, 105 – 124 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Scattering Matrices in Many-Body Scattering

András Vasy

Department of Mathematics, University of California, Berkeley, CA 94720, USA. E-mail: [email protected]

Received: 29 October 1997 / Accepted: 19 June 1998

Abstract: In this paper we show that generalized eigenfunctions of many-body Hamiltonians H with short-range two-body interactions have distributional asymptotics at non-threshold channels. The leading terms of the asymptotics can be used to define a scattering matrix, and we show that this is the same (up to normalization) as that arising from the standard wave-operator approach. We also prove the existence of local distributional asymptotics for locally approximate generalized eigenfunctions in the more general setting of short range perturbations of a scattering metric, defined by Melrose in [13].

1. Introduction

The purpose of this paper is to show the equivalence of two different definitions of the S-matrices (corresponding to non-threshold channels) in the many-body problem. The standard definition comes from the channel wave operators, but we can also define them by using a distributional asymptotic expansion of generalized eigenfunctions. The existence of such distributional asymptotics can be proved in a very elementary way. One reason for considering the second definition is that it makes sense even when the operator we wish to study, i.e. the Hamiltonian, is not a perturbation (in some sense) of a well-understood global operator such as $\Delta$ in the Euclidean setting, see e.g. [13, 21] and the paragraphs after (1.6). It is well-known that these two definitions coincide (up to normalization) in two-body scattering. In [10] Isozaki obtained the top term of an asymptotic expansion in three-body scattering which sufficed to show that the 2-cluster to 3-cluster S-matrix in three-body scattering can be described by such asymptotic expansions. Then Hassell proved in [6] the equivalence for the free channel to free channel scattering in the three-body problem. In both of these cases the asymptotic expansion is actually smooth – this holds more generally at the free cluster as shown in [20]. Here we present a perhaps


somewhat simpler proof (at the cost of not obtaining some of Hassell's other results, see the remark after the statement of the theorem) which we also generalize to other channels. The main part of the proof consists of relating the asymptotic expansions to Isozaki's representation of the S-matrix [9], see also [11] and [19]. Another alternative is to relate the asymptotics to Yafaev's representation of the S-matrix in [25]; see the remarks at the end of Sect. 5.

It should be emphasized that we merely give and relate two definitions of the S-matrices, which is quite easy; the works of Gérard, Isozaki and Skibsted [3, 4, 9, 12] provide the main external material. The analysis of the structure of the S-matrices is a much harder task; see e.g. [9, 19, 21, 22]. In addition to connecting two approaches to scattering theory, we hope to make this paper a bridge between different systems of notations and concepts, namely the traditional ones and those of Melrose [13].

Before we can state the precise definitions, we need to introduce some basic (and mostly standard) notation. We consider the Euclidean space $\mathbb{R}^n$, and we assume that we are given a (finite) family $\mathcal{X}$ of linear subspaces $X_a$, $a \in I$, of $\mathbb{R}^n$ which is closed under intersections and includes the subspace $X_1 = \{0\}$ consisting of the origin, and the whole space $X_0 = \mathbb{R}^n$. Let $X^a$ be the orthocomplement of $X_a$, and let $\pi^a$ be the orthogonal projection to $X^a$, $\pi_a$ to $X_a$. A many-body Hamiltonian is an operator of the form

$$H = \Delta + \sum_{a \in I} (\pi^a)^* V_a; \tag{1.1}$$

here $\Delta$ is the positive Laplacian, $V_0 = 0$, and the $V_a$ are real-valued functions in an appropriate class which we take here to be (not necessarily polyhomogeneous) symbols of order $-\rho < -1$ on the vector space $X^a$ to simplify the problem: $V_a \in S^{-\rho}(X^a)$. Thus, $H$ is a self-adjoint operator on $L^2 = L^2(\mathbb{R}^n)$. We let $R(\lambda) = (H - \lambda)^{-1}$ for $\lambda \in \mathbb{C} \setminus \mathbb{R}$ be the resolvent of $H$.

We have a natural partial ordering on $I$ induced by the ordering of $X^a$ by inclusion. (Though the ordering based on inclusion of the $X_a$ would be sometimes more natural, we use the conventional ordering.) Let $I_1 = \{1\}$ (recall that $X_1 = \{0\}$); 1 is the maximal element of $I$. A maximal element of $I \setminus I_1$ is called a 2-cluster; $I_2$ denotes the set of 2-clusters. In general, once $I_k$ has been defined for $k = 1, \ldots, m-1$, we let $I_m$ (the set of $m$-clusters) be the set of maximal elements of $I'_m = I \setminus \bigcup_{k=1}^{m-1} I_k$, if $I'_m$ is not empty. If $I'_m = \{0\}$ (so $I'_{m+1}$ is empty), we call $H$ an $m$-body Hamiltonian. For example, if $I \neq \{0, 1\}$, and for all $a, b \notin \{0, 1\}$, $a \neq b$, we have $X_a \cap X_b = \{0\}$, then $H$ is a 3-body Hamiltonian. The $N$-cluster of an $N$-body Hamiltonian is also called the free cluster, since it corresponds to the particles which are asymptotically free.

It is convenient to compactify these spaces as in [13]. Thus, we let $\mathbb{S}^n_+$ be the radial compactification of $\mathbb{R}^n$ to a closed hemisphere, i.e. a ball (using the standard map SP), and $\mathbb{S}^{n-1} = \partial \mathbb{S}^n_+$. We write $w = r\omega$, $\omega \in \mathbb{S}^{n-1}$, for polar coordinates on $\mathbb{R}^n$, and we let $x \in C^\infty(\mathbb{S}^n_+)$ be such that $x = (\mathrm{SP}^{-1})^*(r^{-1})$ for $r > 1$. Hence, $x$ is a smoothed version of $r^{-1}$ (smoothed at the origin of $\mathbb{R}^n$), and it is a boundary defining function of $\mathbb{S}^n_+$. We usually identify (the interior of) $\mathbb{S}^n_+$ with $\mathbb{R}^n$. Thus, we write $S^{-\rho}(\mathbb{S}^n_+)$ and $S^{-\rho}(\mathbb{R}^n)$ interchangeably and we drop the explicit pull-back notation in the future and simply write $x = r^{-1}$ (for $r > 1$).
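The inductive definition of the sets $I_m$ of $m$-clusters amounts to repeatedly peeling off the maximal elements of a finite partially ordered set. The following is a minimal sketch of that procedure. It assumes, purely for illustration, that the index set is encoded by set partitions of particle labels (as for a particle system in the center-of-mass frame), with $a \le b$ modeled by "$a$ refines $b$"; the names `refines`, `cluster_levels` and `partition` are hypothetical and not part of the paper.

```python
def refines(a, b):
    """Stand-in for a <= b: every block of partition a lies in a block of b."""
    return all(any(block <= big for big in b) for block in a)


def cluster_levels(index_set, leq):
    """Peel off maximal elements repeatedly; returns [I_1, I_2, ..., I_N]."""
    remaining = set(index_set)
    levels = []
    while remaining:
        # An element is maximal if no other remaining element lies above it.
        maximal = {a for a in remaining
                   if not any(a != b and leq(a, b) for b in remaining)}
        levels.append(maximal)
        remaining -= maximal
    return levels


def partition(*blocks):
    """Hashable encoding of a set partition (illustrative assumption)."""
    return frozenset(frozenset(block) for block in blocks)


# Hypothetical three-particle example.
one = partition({1, 2, 3})        # plays the role of the index 1 (X_1 = {0})
p12 = partition({1, 2}, {3})
p13 = partition({1, 3}, {2})
p23 = partition({2, 3}, {1})
zero = partition({1}, {2}, {3})   # plays the role of the index 0 (X_0 = R^n)

levels = cluster_levels([one, p12, p13, p23, zero], refines)
n_body = len(levels)              # number of levels N: an N-body Hamiltonian
```

In this toy encoding the last level consists of the index 0 alone, which matches the condition $I'_m = \{0\}$ used above to call $H$ an $m$-body Hamiltonian.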
We recall that under SP we have the following correspondence of weighted Sobolev spaces:

$$H^{k,l}_{sc}(\mathbb{S}^n_+) = H^{k,l} = H^{k,l}(\mathbb{R}^n) = \langle w \rangle^{-l} H^k(\mathbb{R}^n), \tag{1.2}$$

where $\langle w \rangle = (1 + |w|^2)^{1/2}$. Thus, for $\lambda \in \mathbb{C} \setminus \mathbb{R}$ the resolvent extends to a map

$$R(\lambda) : H^{k,l}_{sc}(\mathbb{S}^n_+) \to H^{k+2,l}_{sc}(\mathbb{S}^n_+). \tag{1.3}$$

Similarly, we let

$$\bar X_a = \mathrm{cl}(\mathrm{SP}(X_a)), \quad C_a = \bar X_a \cap \partial \mathbb{S}^n_+. \tag{1.4}$$

Hence, $C_a$ is a sphere of dimension $n_a - 1$, where $n_a = \dim X_a$. Again, we write the polar coordinates on $X_a$ (with respect to the induced metric) as $w_a = r_a \omega_a$, $\omega_a \in C_a$, and let $x_a = r_a^{-1}$ (for $r_a > 1$). Note that if $a$ is a 2-cluster then $C_a \cap C_b = \emptyset$ unless $b \le a$. We also define the "singular part" of $C_a$ as the set

$$C_{a,\mathrm{sing}} = \bigcup_{b \not\le a} (C_b \cap C_a), \tag{1.5}$$

and its "regular part" as the set

$$C'_a = C_a \setminus \bigcup_{b \not\le a} C_b = C_a \setminus C_{a,\mathrm{sing}}. \tag{1.6}$$

For example, if $a$ is a 2-cluster then $C_{a,\mathrm{sing}} = \emptyset$ and $C'_a = C_a$.

Here we would like to remark that the scattering calculus of Melrose [13], described briefly in Sect. 2, and the corresponding notation such as $H^{k,l}_{sc}(\mathbb{S}^n_+)$ play only a very minor role in this paper and the arguments can be reworded to eliminate this role completely (at the cost of a slight loss of generality in Sect. 3). The advantages of the compactified notation would be more apparent when dealing with uniform behavior near infinity (i.e. near $\partial \mathbb{S}^n_+$), such as when constructing symbols of pseudo-differential operators ("observables") for propagation estimates, as in [13, 21]. Thus, keeping in mind the intended role of this paper as a bridge, we will often write out statements in both notations, e.g. as done above in (1.2). An example of the more general geometric setting, namely a geometric generalization of the three-body problem, i.e. when $\mathbb{S}^n_+$ is replaced by a manifold with boundary $X$ equipped with a "scattering metric" (see Sect. 3), $C_a$ by disjoint closed embedded submanifolds of $\partial X$ ($C_0$ and $C_1$ are dropped from the notation), $H = \Delta + V$, and the potential $V$ has certain singularities at $\bigcup_a C_a$, was studied in [21] and the wave front relation of the free-to-free S-matrix was analyzed there.

Corresponding to each cluster $a$ we introduce the cluster Hamiltonian $h_a$ as an operator on $L^2(X^a)$ given by

$$h_a = \Delta + \sum_{b \le a} V_b, \tag{1.7}$$

$\Delta$ being the Laplacian of the induced metric on $X^a$. Thus, if $H$ is an $N$-body Hamiltonian and $a$ is a $k$-cluster, then $h_a$ is an $(N + 1 - k)$-body Hamiltonian. The $L^2$ eigenfunctions of $h_a$ play an important role in many-body scattering; we remark that by Froese's and Herbst's result, [1], $\mathrm{spec}_{pp}(h_a) \subset (-\infty, 0]$ (there are no positive eigenvalues). Moreover, $\mathrm{spec}_{pp}(h_a)$ is bounded below since $h_a$ differs from $\Delta$ by a bounded operator. Note that $X^0 = \{0\}$, $h_0 = 0$, so the unique eigenvalue of $h_0$ is 0.

The eigenvalues of $h_a$ can be used to define the set of thresholds of $h_b$. Namely, we let

$$\Lambda_a = \bigcup_{b < a} \mathrm{spec}_{pp}(h_b) \tag{1.8}$$


be the set of thresholds of $h_a$, and we also let

$$\Lambda'_a = \Lambda_a \cup \mathrm{spec}_{pp}(h_a) = \bigcup_{b \le a} \mathrm{spec}_{pp}(h_b). \tag{1.9}$$

Thus, $0 \in \Lambda_a$ for $a \neq 0$ and $\Lambda_a \subset (-\infty, 0]$. It follows from the Mourre theory (see e.g. [2, 16]) that $\Lambda_a$ is closed, countable, and $\mathrm{spec}_{pp}(h_a)$ can only accumulate at $\Lambda_a$. Moreover, $R(\lambda)$, considered as an operator on weighted Sobolev spaces, has a limit

$$R(\lambda \pm i0) : H^{k,l}_{sc}(\mathbb{S}^n_+) \to H^{k+2,l'}_{sc}(\mathbb{S}^n_+) \tag{1.10}$$

for $l > 1/2$, $l' < -1/2$, from either half of the complex plane away from

$$\Lambda = \Lambda_1 \cup \mathrm{spec}_{pp}(H). \tag{1.11}$$

In addition, $L^2$ eigenfunctions of $h_a$ with eigenvalues which are not thresholds are necessarily Schwartz functions on $X^a$ (see [1]). We also label the eigenvalues of $h_a$, counted with multiplicities, by integers $m$, and we call the pairs $\alpha = (a, m)$ channels. We denote the eigenvalue of the channel $\alpha$ by $\epsilon_\alpha$, write $\psi_\alpha$ for a corresponding normalized eigenfunction, and let $e_\alpha$ be the orthogonal projection to $\psi_\alpha$ in $L^2(X^a)$. We say that a channel $\alpha$ is a non-threshold channel if $\epsilon_\alpha \notin \Lambda_a$. If $\lambda > \epsilon_\alpha$, we also introduce the notation

$$\lambda_\alpha = (\lambda - \epsilon_\alpha)^{1/2}. \tag{1.12}$$
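The bookkeeping in (1.8)–(1.12) can be illustrated on finite toy data. The sketch below assumes that each cluster Hamiltonian has only finitely many eigenvalues, stored in a dictionary, and that the strict order relation $b < a$ is given explicitly; the numerical values, the cluster labels and the function names are hypothetical and serve only to show how thresholds, non-threshold channels and $\lambda_\alpha$ fit together.

```python
import math


def thresholds(a, eigenvalues, strictly_below):
    """Lambda_a as in (1.8): union of the point spectra of h_b over b < a."""
    result = set()
    for b in strictly_below[a]:
        result.update(eigenvalues[b])
    return result


def is_non_threshold(channel, eigenvalues, strictly_below):
    """A channel (a, m) is non-threshold if its eigenvalue is not in Lambda_a."""
    a, m = channel
    return eigenvalues[a][m] not in thresholds(a, eigenvalues, strictly_below)


def lambda_alpha(lam, eps_alpha):
    """(1.12): (lambda - eps_alpha)^(1/2), defined for lambda > eps_alpha."""
    if lam <= eps_alpha:
        raise ValueError("lambda must exceed the channel eigenvalue")
    return math.sqrt(lam - eps_alpha)


# Hypothetical three-body toy data: eigenvalues of h_a for each cluster a.
eigenvalues = {
    "0": [0.0],            # h_0 = 0 has the single eigenvalue 0
    "12": [-2.0, -0.5],
    "13": [-1.0],
    "23": [],
    "1": [-3.0, -1.0],     # eigenvalues of H = h_1
}
# Explicit list of the clusters b with b < a.
strictly_below = {
    "0": [],
    "12": ["0"],
    "13": ["0"],
    "23": ["0"],
    "1": ["0", "12", "13", "23"],
}
```

With these toy values the channel `("1", 0)` is non-threshold, while `("1", 1)` is a threshold channel because $-1$ is also an eigenvalue of a subsystem.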

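It is also convenient to introduce

$$H_a = \Delta_{X_a} \otimes \mathrm{Id} + \mathrm{Id} \otimes h_a, \tag{1.13}$$

on $L^2(X_a) \otimes L^2(X^a)$, so $H = H_a + \tilde V_a$, where $\tilde V_a$ is the intercluster interaction

$$\tilde V_a = \sum_{b \not\le a} V_b. \tag{1.14}$$

Note, in particular, that $\tilde V_a \in x^\rho S^0(U) = S^{-\rho}(U)$, where $U = \mathbb{S}^n_+ \setminus \bigcup_{b \not\le a} C_b$ is a neighborhood of $C'_a$. We also let

$$E_\alpha = \mathrm{Id} \otimes e_\alpha, \tag{1.15}$$

and $\pi_\alpha : \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(X_a)$ is given by

$$\pi_\alpha(u)(f) = u(f \otimes \psi_\alpha) \tag{1.16}$$

(we use the real pairing for distributions as usual), so $E_\alpha(u) = \pi_\alpha(u) \otimes \psi_\alpha$. We sometimes write the coordinates on $X_a \oplus X^a$ as $(w_a, w^a)$.

We can now define the wave-operator scattering matrices as follows. The wave operators are given by

$$W^\pm_\alpha = \operatorname*{s-lim}_{t \to \pm\infty} e^{iHt} e^{-iH_a t} J_\alpha, \tag{1.17}$$

where

$$J_\alpha f = ((\pi^a)^* \psi_\alpha)(\pi_a^* f). \tag{1.18}$$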


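Then the scattering operator is

$$S_{\beta\alpha} = (W^+_\beta)^* (W^-_\alpha). \tag{1.19}$$

Using the unitary Fourier transform,

$$F_\alpha : L^2(X_a) \to L^2((\epsilon_\alpha, \infty) \times C_a), \tag{1.20}$$

$$F_\alpha u(\lambda, \omega) = (2\pi)^{-n_a/2}\, 2^{-1/2} (\lambda - \epsilon_\alpha)^{(n_a - 2)/4} \int e^{-i(\lambda - \epsilon_\alpha)^{1/2} \omega \cdot w}\, u(w)\, dw, \tag{1.21}$$

one shows that the scattering matrix

$$\hat S_{\beta\alpha} = F_\beta S_{\beta\alpha} F_\alpha^* \tag{1.22}$$

has the form

$$(\hat S_{\beta\alpha} f)(\lambda, \omega) = (\hat S_{\beta\alpha}(\lambda) f(\lambda, .))(\omega) \tag{1.23}$$

a.e. $\lambda > \epsilon_\alpha$ (see e.g. [17]).

The other definition comes from the stationary theory, more precisely from the asymptotic behavior of generalized eigenfunctions. Namely, we show in Sect. 2 that for $\lambda \in (\epsilon_\alpha, \infty) \setminus \Lambda$ and $g \in C_c^\infty(C'_a)$, there is a unique $u \in H^{\infty,-1/2-\epsilon}_{sc}(\mathbb{S}^n_+)$ (here $\epsilon \in (0, 1/2)$) such that $(H - \lambda)u = 0$, and $u$ has the form

$$u = e^{-i\lambda_\alpha r}\, r^{-(n_a - 1)/2} ((\pi^a)^* \psi_\alpha)\, v_- + R(\lambda + i0) f, \tag{1.24}$$

where $v_- \in C^\infty(\mathbb{S}^n_+)$, $v_-|_{C_a} = g$, and $f \in H^{\infty,1/2+\epsilon'}_{sc}(\mathbb{S}^n_+)$, $\epsilon' > 0$. In addition, in Sects. 3 and 4 we show that for this $u$ and for $\lambda > \epsilon_\beta$ the projection $\pi_\beta u$ has the following distributional asymptotic behavior:

$$\int_{C_b} \pi_\beta u(r_b \omega_b)\, h(\omega_b)\, d\omega_b = r_b^{-(n-1)/2} \left( e^{-i\lambda_\beta r_b} Q^0_{\beta,-}(\pi_\beta u, h) + e^{i\lambda_\beta r_b} Q^0_{\beta,+}(\pi_\beta u, h) + r_b^{-\delta} v \right), \quad h \in C_c^\infty(C'_b), \tag{1.25}$$

where $\delta > 0$, $v \in C^\infty((1, \infty)) \cap L^\infty((1, \infty))$ and $Q^0_{\beta,\pm}(\pi_\beta u, .)$ define distributions in $C^{-\infty}(C'_b)$ (so $Q^0_{\beta,\pm}(\pi_\beta u, h)$ give a complex number for each $h$). In fact, $Q^0_{\alpha,-}(\pi_\alpha u, .) = g$, $Q^0_{\beta,-}(\pi_\beta u, .) = 0$ if $\beta \neq \alpha$ are the "incoming" distributional boundary values, and $Q^0_{\beta,+}(\pi_\beta u, .)$ are the "outgoing" ones. Since the scattering matrix is exactly supposed to relate these, we define

$$S_{\beta\alpha}(\lambda)(g)(h) = Q^0_{\beta,+}(\pi_\beta u, h), \quad u \text{ given by (1.24)}, \tag{1.26}$$

providing a continuous operator $S_{\beta\alpha}(\lambda) : C_c^\infty(C'_a) \to C^{-\infty}(C'_b)$. We remark that our normalization is different from that of [13, 20, 21], where this operator would be denoted by $S_{00}(-\sqrt{\lambda})$ for $\lambda > 0$ (in the free-channel setting discussed there). In this paper we show the distributional asymptotics (1.25), and prove the following theorem: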


Theorem. If $\alpha$ and $\beta$ are non-threshold channels then for $\lambda > \max(\epsilon_\alpha, \epsilon_\beta)$, $\lambda \notin \Lambda$,

$$\hat S_{\beta\alpha}(\lambda) = c\, S_{\beta\alpha}(\lambda) R, \quad c = e^{i\pi(n_a + n_b - 2)/4} (\lambda_\alpha / \lambda_\beta)^{1/2}, \tag{1.27}$$

(as maps $C_c^\infty(C'_a) \to C^{-\infty}(C'_b)$) where $R$ is pull back by the antipodal map on $C_a$.

Note that we take $S_{\beta\alpha}(\lambda)$ to be a map from smooth functions on $C_a$ supported away from $C_{a,\mathrm{sing}}$ to distributions on $C'_b$, i.e. in the dual space of smooth functions on $C_b$ supported away from $C_{b,\mathrm{sing}}$. Now, $C_{b,\mathrm{sing}}$ has measure 0 in $C_b$, so the $L^2$ spaces on $C_b$ and $C'_b$ can be identified. On the other hand, there are distributions on $C_b$ which are supported on the singular part $C_{b,\mathrm{sing}}$ which are thus 0 as distributions on $C'_b$. Correspondingly, there is a possibility of losing information about the "outgoing" data while proceeding in this manner. Thus, without specifying the behavior of generalized eigenfunctions of $H$ at the singular part (e.g. by showing that the result is in $L^2$), our description cannot lead to versions of statements of "asymptotic completeness" (see e.g. [18, 24]).

We remark that the case of the $N$-cluster to $N$-cluster scattering matrix is significantly simpler than the general case. Indeed, in this case we have smooth (not only distributional) asymptotics as discussed in [20]. Thus, we first note that given $\lambda > 0$, $g \in C_c^\infty(C'_0)$ there exists a unique $u \in C^{-\infty}(\mathbb{S}^n_+)$ which satisfies $(H - \lambda)u = 0$ and which is of the form

$$u = e^{-i\sqrt{\lambda}/x}\, x^{(n-1)/2}\, v_- + R(\lambda + i0) f, \quad v_-|_{\mathbb{S}^{n-1}} = g \tag{1.28}$$

for some $f \in \dot C^\infty(\mathbb{S}^n_+)$, $v_- \in C^\infty(\mathbb{S}^n_+)$, at least if the $V_a$ are "classical" symbols (i.e. they have an asymptotic expansion); otherwise $v_-$ is the sum of a function of the angular variable and a negative order symbol (which thus decays at $\infty$, i.e. at $\partial \bar X_a$). (On a manifold with boundary, $X$, $\dot C^\infty(X)$ denotes the space of smooth functions which vanish at $\partial X$ with all derivatives, so $\dot C^\infty(\mathbb{S}^n_+)$ corresponds to $\mathcal{S}(\mathbb{R}^n)$ under SP. The dual space of distributions is denoted $C^{-\infty}(X)$ which thus corresponds to $\mathcal{S}'(\mathbb{R}^n)$ if $X = \mathbb{S}^n_+$.) Moreover, for any $f' \in \dot C^\infty(\mathbb{S}^n_+)$,

$$R(\lambda + i0) f' = e^{i\sqrt{\lambda}/x}\, x^{(n-1)/2}\, v_+, \quad v_+ \in C^\infty(\mathbb{S}^{n-1} \setminus C_{0,\mathrm{sing}}). \tag{1.29}$$

Hence, with $f' = f$, we see that the asymptotic-expansion scattering matrix, $S(\lambda)$, becomes the map

$$S_{00}(\lambda) : C_c^\infty(C'_0) \to C^\infty(C'_0), \tag{1.30}$$

$$S_{00}(\lambda) g = v_+|_{C'_0}. \tag{1.31}$$

Note that $P_{0,+}(\lambda) g$, $g \in C_c^\infty(C'_0)$, is of the form (1.28)–(1.29), so we have $u = P_{0,+}(\lambda) g$ above. Here $P_{0,+}(\lambda)$ is the Poisson operator discussed in the next section. Even the proof of the equivalence with the wave-operator S-matrix is simple: the boundary pairing of [13, Proposition 13] localizes easily, as discussed in [21, Lemma 19.8] for three-body scattering, and one immediately obtains the formula (5.7). The simple calculation of Theorem 5.1 now proves the equivalence.
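The normalization constant $c$ in (1.27) is fully explicit, and it may help to evaluate it in a simple case. The following sketch computes $c$ from the dimensions $n_a$, $n_b$, the energy $\lambda$ and the two channel eigenvalues; the function name and the sample values are hypothetical, and the eigenvalue symbols follow the notation $\lambda_\alpha = (\lambda - \epsilon_\alpha)^{1/2}$ of (1.12).

```python
import cmath
import math


def normalization_constant(n_a, n_b, lam, eps_alpha, eps_beta):
    """c = exp(i*pi*(n_a + n_b - 2)/4) * (lambda_alpha / lambda_beta)^(1/2)."""
    if lam <= max(eps_alpha, eps_beta):
        raise ValueError("lambda must exceed both channel eigenvalues")
    lam_alpha = math.sqrt(lam - eps_alpha)
    lam_beta = math.sqrt(lam - eps_beta)
    phase = cmath.exp(1j * math.pi * (n_a + n_b - 2) / 4)
    return phase * math.sqrt(lam_alpha / lam_beta)


# Free-to-free case in dimension 3 (illustrative values): both eigenvalues
# vanish, so the modulus is 1 and the phase is exp(i*pi) = -1.
c_free = normalization_constant(3, 3, 2.0, 0.0, 0.0)
```

For a single channel ($\alpha = \beta$) the modulus of $c$ is always 1, so in that case the two scattering matrices differ only by a phase and the antipodal pull-back $R$.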


2. Poisson Operators

It is not hard to write down partial spherical wave type generalized eigenfunctions of $H$ corresponding to a non-threshold channel $\alpha$ of a $k$-cluster, $a$. Namely, let $P^0_{a,\pm}(\sigma)$ be the Poisson operator for the free Laplacian on $X_a$ at energy $\sigma$. Thus, spherical waves on $X_a$ of energy $\sigma$ are given by

$$u_{a,\pm} = P^0_{a,\pm}(\sigma) g \in H^{\infty,l'}_{sc}(\mathbb{S}^n_+), \tag{2.1}$$

where $l' < -1/2$, $g \in C^\infty(C_a)$ (recall that $C_a$ is the sphere at "infinity" for $X_a$), and they have the form

$$u_{a,\pm} = e^{-i\sqrt{\sigma} r_a}\, r_a^{-(n_a - 1)/2}\, v_- + e^{i\sqrt{\sigma} r_a}\, r_a^{-(n_a - 1)/2}\, v_+, \quad v_\mp|_{\partial \bar X_a} = g \tag{2.2}$$

with $v_\pm \in C^\infty(\bar X_a)$. Thus, for $u_{a,+}$ only $v_-|_{\partial \bar X_a} = g$ is specified while for $u_{a,-}$ only $v_+|_{\partial \bar X_a} = g$ is. Note that the normalization here is the same as in the introduction, i.e. it differs from [13], etc. We also introduce the "partially free" Poisson operator corresponding to $H_a$; it is defined as

$$\tilde P_{\alpha,\pm}(\lambda) = ((\pi^a)^* \psi_\alpha)\, \pi_a^* P^0_{a,\pm}(\lambda - \epsilon_\alpha). \tag{2.3}$$

Then $u_0 = \tilde P_{\alpha,\pm}(\lambda) g$ satisfies $(H_a - \lambda) u_0 = 0$ (this remains true for $g \in C^{-\infty}(C_a)$). Now assume in addition that $g$ is supported in the regular part of $C_a$: $g \in C_c^\infty(C'_a)$. Then it follows from the standard properties of the free Poisson operator that $u_0$ decays rapidly with all derivatives at $C_b \cap C_a$ for $b \not\le a$. Hence, $(H - \lambda) u_0 = \tilde V_a u_0 \in H^{\infty,\rho - 1/2 - \epsilon}_{sc}(\mathbb{S}^n_+)$ for all $\epsilon > 0$ since $\psi_\alpha \in \mathcal{S}(X^a)$ (having a non-threshold eigenvalue), so $\tilde V_a u_0$ is in particular in $H^{\infty,1/2+\epsilon'}_{sc}(\mathbb{S}^n_+)$ for sufficiently small $\epsilon' > 0$. Thus, letting

$$u = u_0 - R(\lambda + i0)(H - \lambda) u_0 \in H^{\infty,l'}_{sc}(\mathbb{S}^n_+), \quad \lambda \in (\epsilon_\alpha, \infty) \setminus \Lambda, \quad l' < -1/2, \tag{2.4}$$

we obtain a generalized eigenfunction of $H$ which differs from $u_0$ by a purely "outgoing" term modulo $H^{\infty,-1/2+\tilde\epsilon}_{sc}(\mathbb{S}^n_+)$, $0 < \tilde\epsilon < \epsilon'$, in the sense of Gérard, Isozaki and Skibsted [3, 4]. We will discuss this statement below more precisely.

Hence, we can define the channel Poisson operator corresponding to the non-threshold channel $\alpha$ at energy $\lambda$ as

$$P_{\alpha,\pm}(\lambda) : C_c^\infty(C'_a) \to H^{\infty,-1/2-\epsilon}_{sc}(\mathbb{S}^n_+) \subset C^{-\infty}(\mathbb{S}^n_+), \tag{2.5}$$

$$P_{\alpha,\pm}(\lambda) = (\mathrm{Id} - R(\lambda \pm i0)(H - \lambda))\, \tilde P_{\alpha,\pm}(\lambda), \quad \lambda \in (\epsilon_\alpha, \infty) \setminus \Lambda. \tag{2.6}$$

Thus, $u = P_{\alpha,+}(\lambda) g$ gives the generalized eigenfunction of (1.24). Note that the term $e^{i\lambda_\alpha r_a} r_a^{-(n_a - 1)/2} v_+(w_a) \psi_\alpha(w^a)$ in (2.3), which corresponds to the second term in (2.2), is equal to $R(\lambda + i0)(H - \lambda)(e^{i\lambda_\alpha r_a} r_a^{-(n_a - 1)/2} v_+(w_a) \psi_\alpha(w^a))$ by Isozaki's uniqueness theorem (as discussed in the following paragraph) since they are both outgoing and their difference is a generalized eigenfunction.


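Uniqueness of solutions of the form (1.24) is a consequence of Isozaki's uniqueness theorem [12] together with the aforementioned "outgoing" property of $R(\lambda + i0) f$. Indeed, the difference $\tilde u$ of two such solutions satisfies $(H - \lambda)\tilde u = 0$ and it is of the form $\tilde u_- + R(\lambda + i0)\tilde f$ with $\tilde u_- \in L^2_{sc}(\mathbb{S}^n_+)$ and $\tilde f \in H^{\infty,1/2+\epsilon'}_{sc}(\mathbb{S}^n_+)$. Let

$$B = \frac{1}{2i} \sum_{j=1}^n \left( \frac{w_j}{\langle w \rangle} \partial_{w_j} + \partial_{w_j} \frac{w_j}{\langle w \rangle} \right) \tag{2.7}$$

and let $F_-$ be a 0th order symbol on $\mathbb{R}$ (i.e. $F_- \in S^0(\mathbb{R})$) satisfying $\mathrm{supp}(F_-) \subset (-\infty, \sqrt{a(\lambda)})$ where for $\lambda \in \mathrm{spec}_c(H) \setminus \Lambda$ we let

$$a(\lambda) = \inf\{\lambda - \mu : \mu \in \Lambda_1,\ \mu < \lambda\} > 0. \tag{2.8}$$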
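The quantity $a(\lambda)$ in (2.8) is the distance from $\lambda$ to the nearest threshold below it. As a rough illustration only, the sketch below evaluates it for a finite list of thresholds (in general the threshold set is merely closed and countable, so the finite list is an assumption); the function name and the sample threshold values are hypothetical.

```python
def a_of_lambda(lam, thresholds_below_zero):
    """Distance from lam to the largest threshold mu with mu < lam, as in (2.8).

    Assumes a finite collection of thresholds, so the infimum is a minimum.
    """
    gaps = [lam - mu for mu in thresholds_below_zero if mu < lam]
    if not gaps:
        raise ValueError("no threshold lies below lam")
    return min(gaps)


# Hypothetical threshold set; 0 is always a threshold of H in the many-body
# setting, and all thresholds are non-positive.
toy_thresholds = [-2.0, -0.5, 0.0]
```

Since 0 is a threshold and no threshold is positive, the sketch returns $\lambda$ itself for every $\lambda > 0$, in line with the remark that follows in the text.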

Note that $a(\lambda) = \lambda$ for $\lambda > 0$ as $\Lambda_1 \subset (-\infty, 0]$. Since $F_-(B)$ is bounded on $L^2_{sc}(\mathbb{S}^n_+)$, $F_-(B)\tilde u_- \in L^2_{sc}(\mathbb{S}^n_+)$. Also, by a version of the "outgoing" property, [12, Theorem 1.4], $F_-(B) R(\lambda + i0)\tilde f \in H^{0,-\alpha}_{sc}(\mathbb{S}^n_+)$ for some $\alpha \in (0, 1/2)$, so $F_-(B)\tilde u \in H^{0,-\alpha}_{sc}(\mathbb{S}^n_+)$. Hence, by [12, Theorem 1.5], $\tilde u = 0$. (Isozaki actually uses a modified vector field, due to Graf [5], to allow local singularities of the potentials; we do not need this modification.)

Now, another version of the "outgoing" property of $R(\lambda + i0) f$, $f \in H^{\infty,1/2+\epsilon'}_{sc}(\mathbb{S}^n_+)$, $\epsilon' > 0$, is the following. In [3] (see also [23]) it is proved that for pseudo-differential operators $T_-$ arising by a quantization of a symbol $t_-(w, \xi) \in C^\infty(T^*\mathbb{R}^n)$ satisfying estimates

$$|\partial_w^\beta \partial_\xi^\gamma t_-(w, \xi)| \le C \langle w \rangle^{-|\beta|} \langle \xi \rangle^{-k}, \quad 0 \le |\beta|, |\gamma| \le k, \tag{2.9}$$

$k$ sufficiently large, supported in $\frac{w \cdot \xi}{|w|} \le \sigma \sqrt{a(\lambda)}$, $-1 < \sigma < 1$, we have

$$T_- R(\lambda + i0) f \in H^{\infty,-1/2+\tilde\epsilon}_{sc}(\mathbb{S}^n_+), \quad 0 < \tilde\epsilon < \epsilon'. \tag{2.10}$$

We remark that the rather bad behavior of the symbol $t_-$ in $\xi$ is irrelevant in our setting as $u_0 \in H^{\infty,-1/2-\epsilon}_{sc}(\mathbb{S}^n_+)$. The connection between this $T_-$ and $F_-(B)$ is provided by the fact that $B$ is the Weyl quantization of the symbol $\frac{w \cdot \xi}{\langle w \rangle}$.

The version of this result stated in [4] is actually more useful for us to prove the equivalence of the S-matrices in Sect. 5. Thus, let $\zeta_a \in C^\infty(\mathbb{S}^n_+)$ be supported in a sufficiently small neighborhood of $C'_a$. We note that this means that $\zeta_a$ is a 0th order "classical" symbol on $\mathbb{R}^n$ with "cone support" away from $C_{a,\mathrm{sing}} \subset \mathbb{S}^{n-1}$. Also suppose that

$$t_- \in C^\infty(T^* X_a), \quad \mathrm{supp}(t_-) \subset X_a \times K_{\xi_a} \subset T^* X_a, \quad K \text{ compact}, \tag{2.11}$$

it satisfies estimates

$$|\partial_{w_a}^\alpha \partial_{\xi_a}^\beta t_-(w_a, \xi_a)| \le C_{\alpha\beta} \langle w_a \rangle^{-|\alpha|}, \tag{2.12}$$

and its support satisfies

$$\mathrm{supp}\, t_- \subset \left\{ (w_a, \xi_a) : \frac{w_a \cdot \xi_a}{|w_a|} \le \sigma \sqrt{a(\lambda)},\ |w_a| \ge 1 \right\} \tag{2.13}$$


for some $\sigma \in (-1, 1)$. Let $T_-$ be a quantization of $t_-$ (see the remarks below on the degree of independence from the choice of a quantization map), regarded as an operator on $\mathbb{R}^n$. Then for $f \in H^{\infty,1/2+\epsilon'}_{sc}(\mathbb{S}^n_+)$,

$$T_- \zeta_a R(\lambda + i0) f \in H^{\infty,-1/2+\tilde\epsilon}_{sc}(\mathbb{S}^n_+). \tag{2.14}$$

Similar results hold if we replace $R(\lambda + i0)$ by $R(\lambda - i0)$, $t_-$ by $t_+$, where for some $\sigma \in (-1, 1)$, $t_+$ satisfies

$$\mathrm{supp}\, t_+ \subset \left\{ (w_a, \xi_a) : \frac{w_a \cdot \xi_a}{|w_a|} \ge \sigma \sqrt{a(\lambda)},\ |w_a| \ge 1 \right\}. \tag{2.15}$$

We now connect this statement with Melrose's scattering calculus [13]. Let $X$ be a manifold with boundary. Recall that the set of polyhomogeneous scattering pseudo-differential operators of differential order $m$ and weight $l$ is denoted by $\Psi^{m,l}_{sc}(X)$. For $X = \mathbb{S}^n_+$ it arises as the "classical" algebra corresponding to the metric $\langle w \rangle^{-2} dw^2 + \langle \xi \rangle^{-2} d\xi^2$ and weight $\langle w \rangle^{-l} \langle \xi \rangle^m$ in Hörmander's calculus [7]; in general, it arises from this via a local coordinate identification of $X$ with $\mathbb{S}^n_+$. We write $q_L$ for a left quantization map from symbols on ${}^{sc}T^*X$ (to be discussed below) to elements of $\Psi_{sc}(X)$. Thus, operators in $\Psi^{m,l}_{sc}(\bar X_a)$ are given by the quantization of symbols $t \in C^\infty(T^* X_a)$ satisfying estimates

$$|\partial_{w_a}^\alpha \partial_{\xi_a}^\beta t| \le C_{\alpha\beta} \langle w_a \rangle^{-l-|\alpha|} \langle \xi_a \rangle^{m-|\beta|} \tag{2.16}$$

and having asymptotic expansions at infinity (both in $\xi_a$ and in $w_a$). There is also a similar calculus $\Psi_{scc}(X)$, where asymptotic expansions are not required. We remark that the algebras $\Psi_{sc}(X)$ and $\Psi_{scc}(X)$ do not depend on the choice of a (left, right or Weyl) quantization map, and $T \in \Psi^{m,l}_{sc}(X)$ implies that $T$ is a bounded operator from $H^{m',l'}_{sc}(X)$ to $H^{m'-m,l'+l}_{sc}(X)$. Thus, the estimate (2.12) together with the compactness of $K$ imply that $q_L(t_-) \in \Psi^{-\infty,0}_{scc}(\bar X_a)$ and $T_- = q_L(t_-) \otimes \mathrm{Id}$.

To see the role played by the support condition, note that the standard coordinates on the scattering cotangent bundle, ${}^{sc}T^*X$, near $\partial X$, are $(x, y, \tau, \mu)$, where $x$ is a boundary defining function of $X$ and the $y_j$ give coordinates on $\partial X$. That is, covectors in ${}^{sc}T^*X$ are written as

$$\tau \frac{dx}{x^2} + \mu \cdot \frac{dy}{x}. \tag{2.17}$$

Thus,

$$\tau = -\frac{w \cdot \xi}{|w|}, \quad \tau^2 + |\mu|^2 = |\xi|^2, \tag{2.18}$$

in terms of the Euclidean coordinates $(w, \xi)$ on $T^*\mathbb{R}^n$, $|\mu|$ denoting the metric length of covectors on $\mathbb{S}^{n-1}$ with respect to the standard metric, and the first term in (2.17) is just $-\tau\, dr$. Similarly, we write $\tau_a$ instead of $\tau$ when $\mathbb{S}^n_+$ is replaced by $\bar X_a$. Thus, the condition (2.13) means that $t_-$ is supported in $\tau_a \ge \sigma \sqrt{a(\lambda)}$ for some $\sigma \in (-1, 1)$, and similarly (2.15) means that on $\mathrm{supp}\, t_+$, $\tau_a \le \sigma \sqrt{a(\lambda)}$. Hence, a typical symbol satisfying the conditions mentioned above is a function

$$t = \phi_0(x_a)\, \chi(\tau_a)\, \phi(|\xi_a|^2), \tag{2.19}$$

where $\phi \in C_c^\infty(\mathbb{R})$, $\chi \in C^\infty(\mathbb{R})$ is supported in $(\sigma \sqrt{a(\lambda)}, \infty)$ or $(-\infty, \sigma \sqrt{a(\lambda)})$, and $\phi_0 \in C^\infty(\mathbb{R})$ is supported near 0. Here $\phi_0$ is only used to localize near $x_a = 0$ (i.e.


$|w_a| = \infty$) and eliminate the singular behavior of polar coordinates near the origin; we will not indicate this explicitly in the notation from now on.

We remark that the support condition (2.13), and indeed already the support part of (2.11), depend on the choice of the quantization map (left, right, Weyl). That is, if $T$ arises by, say, left quantization of a symbol satisfying (2.11) and (2.13), it may very well happen that it is given by Weyl quantization of a symbol which does not satisfy these estimates. However, the latter symbol still nearly satisfies these properties in the sense that, for example, outside $X_a \times K$ it is given by the restriction of a Schwartz function on $T^* X_a$, i.e. it is rapidly decaying in both $w_a$ and $\xi_a$ outside $X_a \times K$. Note that rapidly decreasing symbols $t'$ in both $w_a$ and $\xi_a$ give rise to operators $T' \in \Psi^{-\infty,\infty}_{sc}(\bar X_a)$ (using any "reasonable" quantization map), which thus map $C^{-\infty}(\bar X_a)$ to $\dot C^\infty(\bar X_a)$. Correspondingly,

$$(T' \otimes \mathrm{Id})\, \zeta_a : H^{\infty,k}_{sc}(\mathbb{S}^n_+) \to \dot C^\infty(\mathbb{S}^n_+) \tag{2.20}$$

for any $k$ with $\zeta_a$ as above, since on the support of $\zeta_a$ we have $\langle w \rangle \le c \langle w_a \rangle$ for some $c > 0$. We also remark that if $T_-$ is an operator arising from a symbol $t_-$ satisfying the above support estimates and $A \in \Psi^{k,0}_{sc}(\mathbb{S}^n_+)$, then the symbols of $T_- A$ and $A T_-$ do not necessarily satisfy the same estimates. However, as a simple consequence of the composition rule in $\Psi_{sc}(\mathbb{S}^n_+)$ (cf. [7, Theorems 18.1.8 and 18.5.4]), it is still true that the symbols of these are given by a Schwartz function outside $X_a \times K$. The same is true for the commutator $[A, T_-]$. In addition, for commutators the estimate actually improves; the right-hand side of (2.12) can be replaced by $C_{\alpha\beta} \langle w_a \rangle^{-1-|\alpha|}$.

3. Distributional Asymptotics of Locally Approximate Generalized Eigenfunctions

We proceed to define a distributional boundary value for certain locally approximate generalized eigenfunctions of $\Delta$. Since it requires no additional effort and it may even clarify matters, we do this in the geometric setting introduced by Melrose in [13], i.e. we take $X$ to be a compact manifold with boundary, with scattering metric $g = \frac{dx^2}{x^4} + \frac{h'}{x^2}$, where $x$ is a boundary defining function of $X$ and $h'$ is a smooth symmetric 2-cotensor on $X$ which restricts to a metric $h$ on $\partial X$. We write the induced Riemannian density on the boundary as $dy$. The Laplacian of $g$ has the form

$$\Delta = (x^2 D_x)^2 + i(n-1)\, x\, (x^2 D_x) + x^2 \Delta_0 + P, \quad P \in x^3 \mathrm{Diff}^2_b(X) \cap x^2 \mathrm{Diff}^2_c(X), \tag{3.1}$$

on a collar neighborhood $[0, 2a)_x \times \partial X$ of $\partial X$, where $\Delta_0$ is (the product coordinate extension of) the Laplacian of the metric on $\partial X$, see [13, Lemma 3]. An example of this setup is Euclidean space with the standard metric but scattering metrics exist on every compact manifold with boundary, not just on $\mathbb{S}^n_+$. Recall that with $X = \mathbb{S}^n_+$, the Euclidean metric gives the Riemannian density $r^{n-1}\, dr\, d\omega$, $d\omega$ the standard measure on the unit sphere, which in the compactified picture, $x = r^{-1}$, $dy = d\omega$, becomes $\frac{dx\, dy}{x^{n+1}}$, and the Laplacian has the form (3.1) with $P = 0$.
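The change of the Riemannian density under $x = r^{-1}$ quoted above is a one-line substitution. The following worked step is only a sketch of that computation, with the orientation reversal absorbed into the absolute value of $dr$:

```latex
\begin{align*}
x = r^{-1} \quad &\Longrightarrow \quad dr = -x^{-2}\,dx, \qquad r^{n-1} = x^{-(n-1)}, \\
r^{n-1}\,|dr|\,d\omega &= x^{-(n-1)}\,x^{-2}\,dx\,dy = \frac{dx\,dy}{x^{n+1}} .
\end{align*}
```

The same substitution in one variable gives $L^1(dr) = x^2 L^1(dx)$, which is the form in which the measure enters weighted estimates on the half-line.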
Indeed, we have $D_r = -x^2 D_x$, so the polar coordinate expression for the (positive) Euclidean Laplacian,

$$\Delta = D_r^2 - i(n-1)\, r^{-1} D_r + r^{-2} \Delta_0, \tag{3.2}$$

$\Delta_0$ being the standard Laplacian on the sphere, shows (3.1). In fact, for the many-body problem itself (i.e. in the rest of this paper) we do not need the general geometric setting,


so the reader may want to keep the Euclidean picture in mind while reading this section. To facilitate this, we rewrite expressions in the Euclidean notation several times during the arguments of this section.

Let $p \in \partial X$, $U$ a product neighborhood of $p$ in $X$, so $U' = U \cap \partial X$ is a neighborhood of $p$ in $\partial X$ and $U = [0, 2a)_x \times U'$. For an arbitrary $u \in H^{\infty,-1/2-\epsilon}_{sc}(U)$, $\epsilon \in (0, \frac{1}{2})$, which satisfies $(\Delta - \lambda^2) u \in H^{\infty,1/2+\epsilon'}_{sc}(U)$, $\epsilon' \in (0, 1 - \epsilon]$, we can define a distributional boundary value on $U'$. Note that we write the spectral parameter as $\lambda^2$ as done in [13]; this is more convenient in the discussion that follows. Also, the space $H^{k,l}_{sc}(U)$ denotes the localized version of $H^{k,l}_{sc}(X)$, i.e. if $v \in C^{-\infty}(U)$, we say that $v \in H^{k,l}_{sc}(U)$ if $\phi v \in H^{k,l}_{sc}(X)$ for all $\phi \in C_c^\infty(U)$. (Perhaps $H^{k,l}_{sc,loc}(U)$ would be a better notation but it is too cumbersome.) Thus, we evaluate the pairing

$$B : C^{-\infty}(U) \times C_c^\infty(U') \to C^{-\infty}([0, 2a)_x), \tag{3.3}$$

$$B(u, h) = \int_{\partial X} x^{-(n-1)/2}\, u(x, y)\, h(y)\, dy. \tag{3.4}$$

In particular, for $X = \mathbb{S}^n_+$, the integral is $\int_{\mathbb{S}^{n-1}} r^{(n-1)/2}\, u(r, \omega)\, h(\omega)\, d\omega$. We note that for $h \in C_c^\infty(U')$, $u \in H^{\infty,k}_{sc}(U)$, we have

$$B(u, h) \in H^{\infty,k}_{sc}([0, 2a)) \subset x^{k+1} L^2([0, 2a]; dx) = r^{-k} L^2\!\left(\left(\tfrac{1}{2a}, \infty\right), dr\right). \tag{3.5}$$

Thus, $B(u, h) \in C^\infty((0, a))$. We have the following result.

Proposition 3.1. For $h \in C_c^\infty(U')$ and $u \in H^{\infty,-1/2-\epsilon}_{sc}(U)$, $\epsilon \in (0, 1/2)$, satisfying $(\Delta - \lambda^2) u \in H^{\infty,1/2+\epsilon'}_{sc}(U)$, $\epsilon' \in (0, 1 - \epsilon]$, we have

$$B(u, h) = e^{-i\lambda/x} Q_-(u, h) + e^{i\lambda/x} Q_+(u, h), \quad Q_\pm(u, h) \in C^{0,\alpha}([0, a)_x), \tag{3.6}$$

here $0 < \alpha < \epsilon'$. Moreover, for each $u$ satisfying our hypotheses,

$$Q^0_\pm(u, h) = Q^0_\pm(u)(h) = Q_\pm(u, h)|_{x=0} = Q_\pm(u, h)(0) \tag{3.7}$$

define distributions $Q^0_\pm(u) \in C^{-\infty}(U')$, called the incoming ($-$) and outgoing ($+$) distributional boundary values of $u$. If in addition $u \in H^{\infty,-1/2}_{sc}(U)$ then $Q^0_\pm(u, h) = 0$ for all $h \in C_c^\infty(U')$.

Remark 3.2. In order to avoid confusion due to the notation, we would like to emphasize that $Q_\pm(u, h)$ are functions on the interval $[0, a)_x$ while $Q^0_\pm(u, h)$ are complex numbers, namely the values of $Q_\pm(u, h)$ at $x = 0$.

We prove this proposition directly by an argument similar to those used in [13, 20]. We remark that if $(\Delta - \lambda^2) u = 0$, $u \in C^{-\infty}(X)$, then $u = P_+(\lambda^2) g$ for some $g \in C^{-\infty}(\partial X)$ (here $P_+(\lambda^2)$ is the Poisson operator for $\Delta$ as in (3.29)), and (3.6) follows with $Q_\pm(u, h) \in C^\infty([0, a)_x)$ from stationary phase and the construction of Melrose and Zworski (in the non-Euclidean setting), [15]. A weaker conclusion than that of the proposition, namely the description of $\mathrm{WF}_{sc}(B(u, h))$, follows from the push-forward theorem proved by Hassell in [6]; of course, that theorem requires only weaker assumptions about $u$.


Proof. First, under our assumptions

$$B(u, h) \in x^{1/2-\epsilon} L^2(dx) \subset x^{1-\delta} L^1(dx) = r^{1+\delta} L^1(dr), \quad \delta > \epsilon, \tag{3.8}$$

while (recall that $(x^2 D_x)^2 - \lambda^2 = D_r^2 - \lambda^2$)

$$((x^2 D_x)^2 - \lambda^2) B(u, h) = \int_{\partial X} x^{-(n-1)/2} \left( (x^2 D_x)^2 + i(n-1)\, x\, (x^2 D_x) - \frac{(n-1)(n-3) x^2}{4} - \lambda^2 \right) u\, h\, dy, \tag{3.9}$$

where we wrote $L^1(dx)$ for $L^1([0, a]; dx)$, etc. Thus, using (3.1),

$$((x^2 D_x)^2 - \lambda^2) B(u, h) + x^2 \int_{\partial X} x^{-(n-1)/2}\, u\, \Delta_0 h\, dy \in x^{3/2+\epsilon'} L^2(dx), \tag{3.10}$$

hence

$$((x^2 D_x)^2 - \lambda^2) B(u, h) \in x^{3/2+\epsilon'} L^2(dx) \subset x^{2+\delta'} L^1(dx) = r^{-\delta'} L^1(dr), \tag{3.11}$$

$0 < \delta' < \epsilon'$. At this point we are considering a function of one variable, and it makes no difference whether we take it to be $x$ or $r$. Recall that $(x, \tau)$ give coordinates on ${}^{sc}T^*[0, a)$, i.e. covectors in ${}^{sc}T^*[0, a)$ have the form $\tau \frac{dx}{x^2}$. In the uncompactified notation, $r = x^{-1}$, $-\tau$ is the dual variable of $r \in (a^{-1}, \infty)$ in the usual sense. We can divide $B(u, h)$ into two terms by use of a ps.d.o. (really just cutoffs on the Fourier transform side) which is the identity microlocally near $\tau = -\lambda$, vanishes near $\tau = \lambda$. We can thus concentrate on the term, say $B_+$, which has scattering wave front set near $\tau = -\lambda$, so $\mathcal{F}u$ is smooth away from $\tau = -\lambda$. As $x^2 D_x - \lambda$ is elliptic there (it is given by multiplication by $\tau - \lambda$ conjugated by the Fourier transform: $\mathcal{F}^{-1}(\tau - \lambda)\mathcal{F}$), it can be removed, so we conclude that

$$(x^2 D_x + \lambda) B_+(u, h) = f \in x^{3/2+\epsilon'} L^2(dx) \subset x^{2+\delta'} L^1(dx) = r^{-\delta'} L^1(dr). \tag{3.12}$$

Multiplying by the integrating factor $e^{-i\lambda/x}$ and integrating gives

$$B_+(u, h)(x) = e^{i\lambda/x} \left( e^{-i\lambda/x_0} B_+(u, h)(x_0) + \int_{x_0}^{x} i e^{-i\lambda/t} f(t)\, t^{-2}\, dt \right). \tag{3.13}$$

Since $f \in x^2 L^1(dx)$ (i.e. $f \in L^1(dr)$, note that $-x^{-2} dx = d(1/x)$ induces the Lebesgue measure in $r = x^{-1}$), it follows that the integral and hence the factor in parentheses, $Q_+(u, h)$, are continuous functions on $[0, a)_x$. In addition, using this continuity, for $\alpha \in [0, \delta']$,

$$|x^{-\alpha}(Q_+(u, h)(x) - Q_+(u, h)(0))| \le \int_0^x |f(t)|\, x^{-\alpha} t^{-2}\, dt \le \int_0^x |f(t)|\, t^{-2-\alpha}\, dt, \tag{3.14}$$

which converges so $Q_+(u, h) \in C^{0,\alpha}([0, a))$ for all $\alpha \in (0, \epsilon')$. Hence, (3.6) holds. Taking into account the explicit expression for $f$, it also follows that $Q^0_\pm(u, .)$ are continuous from $C_c^\infty(U')$ to $\mathbb{C}$, so they define a distribution for each $u$ satisfying our hypotheses.

Finally, if $u \in H^{\infty,-1/2}_{sc}(U)$, then $B(u, h) \in x^{1/2} L^2(dx)$. Since

$$e^{\pm i\lambda/x}(Q_\pm(u, h) - Q^0_\pm(u, h)) \in x^{1/2} L^2(dx), \tag{3.15}$$


we conclude that

$$e^{-i\lambda/x} Q^0_-(u, h) + e^{i\lambda/x} Q^0_+(u, h) \in x^{1/2} L^2(dx) = r^{1/2} L^2(dr). \tag{3.16}$$

This, however, implies that $Q^0_+(u, h) = Q^0_-(u, h) = 0$.

Of course, if $u$ has a local asymptotic expansion

$$u = e^{-i\lambda/x} x^{(n-1)/2} v_- + e^{i\lambda/x} x^{(n-1)/2} v_+, \quad v_\pm \in C^\infty(U), \tag{3.17}$$

then $Q^0_\pm(u, .) = v_\pm|_{\partial X}$. For example, in the free-to-free scattering discussed in (1.28)–(1.29), $Q^0_\pm(u, .)$ are just the boundary restrictions of $v_\pm$, so for $u = P_{0,+}(\lambda^2) g$, $g \in C_c^\infty(C'_0)$, we have $Q^0_-(u) = g$, $Q^0_+(u) = S_{00}(\lambda^2) g$.

It is very useful to have a different way of calculating the distributional boundary values. The idea is to relate the pairing $B$ to the pairing

$$\langle u_1, (\Delta - \lambda^2) u_2 \rangle - \langle (\Delta - \lambda^2) u_1, u_2 \rangle, \tag{3.18}$$

here considered under the assumption that $u_j \in H^{\infty,-1/2-\epsilon}_{sc}(X)$, $(\Delta - \lambda^2) u_j \in H^{\infty,1/2+\epsilon}_{sc}(X)$, $j = 1, 2$. The connection between this pairing and $Q^0_\pm$ is given by the following proposition. It is convenient to introduce a complex-pairing (linear in the first variable) notation for $Q^0_\pm$ as well (as opposed to the real distributional pairing); we write

$$\langle Q^0_\pm(u), h \rangle = Q^0_\pm(u, \bar h). \tag{3.19}$$

Proposition 3.3. Suppose that $u_1 = u \in H^{\infty,-1/2-\epsilon}_{sc}(U)$ satisfies $(\Delta - \lambda^2) u \in H^{\infty,1/2+\epsilon}_{sc}(U)$, $\epsilon \in (0, 1/2)$, and

$$u_2 = u_{2,+} + u_{2,-}, \quad u_{2,\pm} = e^{\pm i\lambda/x} x^{(n-1)/2} v_\pm, \quad v_\pm \in C_c^\infty(U). \tag{3.20}$$

Let $h_\pm = v_\pm|_{\partial X} \in C_c^\infty(U')$. Then

$$\langle Q^0_+(u_1), h_+ \rangle - \langle Q^0_-(u_1), h_- \rangle = \frac{1}{2i\lambda} \left( \langle u_1, (\Delta - \lambda^2) u_2 \rangle - \langle (\Delta - \lambda^2) u_1, u_2 \rangle \right). \tag{3.21}$$

Note that

$$(\Delta - \lambda^2) u_{2,+} = e^{i\lambda/x} x^{(n+3)/2} v', \quad v' \in C_c^\infty(U), \tag{3.22}$$

so $u_{2,+} \in H^{\infty,-1/2-\epsilon}_{sc}(X)$, $(\Delta - \lambda^2) u_{2,+} \in H^{\infty,3/2-\epsilon}_{sc}(X)$, and similarly with $u_{2,-}$. Thus, the right-hand side of (3.21) is automatically defined.

This proposition is analogous to the "boundary pairing" discussed in [13, Proposition 13] and its localized version in [21, Lemma 19.8]. Note also that Isozaki uses essentially the right-hand side of (3.21) in [10, Theorem 1.2] to describe the spatial asymptotics of generalized eigenfunctions (coming from a 2-cluster) in three-body scattering in an averaged sense; see also Corollary 3.4.
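The decay claim (3.22) can be checked by a short computation. The sketch below treats only the model case $P = 0$ in (3.1) (for instance the Euclidean Laplacian), writes $k = (n-1)/2$, and uses $O(x^2)$ for terms of the form $x^2$ times a smooth compactly supported function; the operator $B$ here is a local abbreviation and is unrelated to the pairing $B(u, h)$.

```latex
\begin{align*}
x^2 D_x\big(e^{i\lambda/x} x^{k} v\big)
  &= e^{i\lambda/x} x^{k}\,(-\lambda + B)\,v,
  \qquad B = -ikx + x^2 D_x, \quad k = \tfrac{n-1}{2}, \\
(x^2 D_x)^2\big(e^{i\lambda/x} x^{k} v\big)
  &= e^{i\lambda/x} x^{k}\big(\lambda^2 - 2\lambda B + B^2\big)v
   = e^{i\lambda/x} x^{k}\big(\lambda^2 v + 2i\lambda k\,x\,v + O(x^2)\big), \\
i(n-1)\,x\,(x^2 D_x)\big(e^{i\lambda/x} x^{k} v\big)
  &= e^{i\lambda/x} x^{k}\big(-i\lambda(n-1)\,x\,v + O(x^2)\big), \\
(\Delta - \lambda^2)\big(e^{i\lambda/x} x^{k} v\big)
  &= e^{i\lambda/x} x^{k}\big((2k - (n-1))\,i\lambda\,x\,v + O(x^2)\big)
   = e^{i\lambda/x} x^{k+2}\, v' .
\end{align*}
```

The term of order $x$ cancels precisely because $2k = n - 1$, which is why the amplitude is normalized with the factor $x^{(n-1)/2}$; the remaining contribution $x^2 \Delta_0$ is already of order $x^2$, and $k + 2 = (n+3)/2$.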


Proof. We can take $u_2 = u_{2,+}$; the proof for $u_2 = u_{2,-}$ is similar, and in general we just add the two formulae thus derived. Let $\phi \in C^\infty([0,a])$ be identically 0 near 0 and identically 1 near $a$, so $\phi(x)$ can be regarded as a smooth function on $X$ which is identically 1 away from a collar neighborhood of $\partial X$. Then (3.18) is given by
\[ \lim_{s\to 0}\int \left(u_1\,\overline{\Delta u_2} - \Delta u_1\,\overline{u_2}\right)\phi(x/s)\,dg = \lim_{s\to 0}\int -u_1\, 2\,\overline{(x^2 D_x\phi(x/s))(x^2 D_x u_2)}\,\frac{dx\,dy}{x^{n+1}}. \tag{3.23} \]
Here we used that the other terms are of the form $\int f\,\psi(x/s)\,dg$ with $\psi \in C_c^\infty((0,1))$ and $f \in L^1(dg)$, and by the dominated convergence theorem these integrals go to 0 as $s \to 0$. Using this remark and the explicit form of $u_2$, we deduce that (3.18) is given by
\[ 2i\lambda \lim_{s\to 0}\int u_1 \left(\frac{x}{s}\phi'(x/s)\right) e^{-i\lambda/x}\,\overline{v(0,y)}\,x^{-(n+1)/2}\,dx\,dy. \tag{3.24} \]
Letting $h(y) = v(0,y)$, we see that (3.18) is
\[ \begin{aligned} 2i\lambda \lim_{s\to 0}\int_0^1 B(u_1,\bar h)\left(\frac{x}{s}\phi'(x/s)\right) e^{-i\lambda/x}\,x^{-1}\,dx &= 2i\lambda \lim_{s\to 0}\int_0^1 Q^0_+(u_1,\bar h)\left(\frac{x}{s}\phi'(x/s)\right) x^{-1}\,dx \\ &= 2i\lambda\, Q^0_+(u_1,\bar h)\lim_{s\to 0}\phi(1/s) = 2i\lambda\, Q^0_+(u_1,\bar h). \end{aligned} \tag{3.25} \]
Here we could replace $Q_\pm(u_1,\bar h)$ by $Q^0_\pm(u_1,\bar h)$ by the dominated convergence theorem applied to the difference
\[ \int_0^1 \left(Q_+(u_1,\bar h) - Q^0_+(u_1,\bar h)\right)\left(\frac{x}{s}\phi'(x/s)\right) x^{-1}\,dx, \tag{3.26} \]
noting that
\[ x^{-\alpha}\left(Q_+(u_1,\bar h) - Q^0_+(u_1,\bar h)\right) \in L^\infty([0,a]), \quad \alpha \in [0, 1-\epsilon). \tag{3.27} \]
We could also drop the $Q^0_-$ term because it is of the form
\[ 2i\lambda \lim_{s\to 0}\int_0^1 Q^0_-(u_1,\bar h)\,\frac{x}{s}\phi'(x/s)\,e^{-2i\lambda/x}\,x^{-1}\,dx = 2i\lambda\, Q^0_-(u_1,\bar h)\lim_{s\to 0}\int_{\mathbb{R}} e^{-2i\lambda t/s}\,t^{-2}\phi'(t^{-1})\,dt, \tag{3.28} \]
where the integral is essentially the Fourier transform of a smooth compactly supported function evaluated at $1/s$, which is rapidly decreasing as $s \to 0$. Thus, (3.25), together with the remark at the beginning of the proof, completes the proof of the proposition.
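The last step of the proof rests on the standard fact that the Fourier transform of a smooth compactly supported function decays rapidly. A minimal numerical sketch of this, assuming the model bump $g(t) = \exp(-1/(1-t^2))$ on $(-1,1)$ in place of $t^{-2}\phi'(t^{-1})$ (the bump, the grid size and the function names are illustrative assumptions, not taken from the paper), evaluates the oscillatory integral at frequency $\omega = 2\lambda/s$:

```python
import cmath
import math

def bump(t):
    """Model smooth compactly supported function (an assumption standing in for t^-2 phi'(1/t))."""
    d = 1.0 - t * t
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def oscillatory_integral(omega, n=20000):
    """Trapezoid approximation of the integral of bump(t) * exp(-i*omega*t) over [-1, 1]."""
    h = 2.0 / n
    total = 0.0 + 0.0j
    for k in range(n + 1):
        t = -1.0 + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * bump(t) * cmath.exp(-1j * omega * t)
    return total * h

if __name__ == "__main__":
    lam = 1.0
    for s in (2.0, 0.4, 0.04):
        omega = 2.0 * lam / s  # the frequency 2*lambda/s grows as s -> 0
        print(s, abs(oscillatory_integral(omega)))
```

In this sketch the modulus at $s = 0.04$ (frequency 50) is already far below the value at frequency 0, which is the qualitative behaviour used to drop the $Q^0_-$ term.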


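Let $P_\pm(\lambda^2)$ be the Poisson operator for $\Delta$ at energy $\lambda^2$, $\lambda > 0$. Thus, for $h \in C^\infty(\partial X)$, $P_\pm(\lambda^2)h$ has the asymptotic expansion
\[ P_\pm(\lambda^2)h = e^{-i\lambda/x} x^{(n-1)/2} v_- + e^{i\lambda/x} x^{(n-1)/2} v_+, \quad v_\mp|_{\partial X} = h, \tag{3.29} \]
$v_\pm \in C^\infty(X)$. Let $\chi \in C^\infty(\mathbb{R})$ be identically 0 on $(\lambda-\delta, \infty)$, 1 on $(-\infty, -\lambda+\delta)$, $\delta > 0$, and $\phi \in C_c^\infty(\mathbb{R})$ identically 1 near $\lambda^2$,
\[ T_+ = q_L(\chi(\tau)\phi(\tau^2 + |\mu|_h^2)) \in \Psi_{sc}^{-\infty,0}(X). \tag{3.30} \]
Note that for $X = \mathbb{S}^n_+$ the second factor is $\phi(|\xi|^2)$ in terms of the Euclidean coordinates, and in case $\Delta$ is the Euclidean Laplacian and we use the left quantization, this is the same as $T_+\phi^\flat(\Delta)$ if $\phi^\flat \equiv 1$ on $\operatorname{supp}\phi$ (otherwise the difference is in $\Psi_{sc}^{-\infty,\infty}(\mathbb{S}^n_+)$). Since $u_2 = u_{2,+} = T_+P_-(\lambda^2)h$ satisfies the hypotheses of the proposition, we immediately deduce the following corollary.

Corollary 3.4. Suppose that $u \in C^{-\infty}(X)$ satisfies $u \in H_{sc}^{\infty,-1/2-\epsilon}(U)$ and $(\Delta-\lambda^2)u \in H_{sc}^{\infty,1/2+\epsilon}(U)$, $\epsilon \in (0,1/2)$. For $h \in C_c^\infty(U')$ we have then
\[ \langle Q^0_+(u), h\rangle = \frac{1}{2i\lambda}\left(\langle u, (\Delta-\lambda^2)T_+P_-(\lambda^2)h\rangle - \langle(\Delta-\lambda^2)u, T_+P_-(\lambda^2)h\rangle\right). \tag{3.31} \]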

4. Scattering Matrices

Motivated by the definition of $S_{00}(\lambda)$, we define the asymptotic expansion scattering matrices at the other $k$-clusters using distributional boundary values. This is a little more complicated since at the other $k$-clusters we can have a number of channels, including channels corresponding to adjacent $k'$-clusters, and since we do not have smooth asymptotic expansions in the sense of (1.28)-(1.29). Thus, we project $u$ to the eigenspace corresponding to a channel $\beta = (b, m)$. As usual, we write $\lambda_\beta = (\lambda - \epsilon_\beta)^{1/2}$. Let $\pi_\beta : S'(\mathbb{R}^n) \to S'(X_b)$ as in the Introduction, so
\[ \pi_\beta : \langle w^b\rangle^s H_{sc}^{\infty,k}(\mathbb{S}^n_+) \to H_{sc}^{\infty,k}(\bar X_b) \tag{4.1} \]
for any $k$ and $s$ (we can allow a factor of $\langle w^b\rangle^s$ corresponding to the rapid decay of $\psi_\beta$). Actually, we need a local version of this, namely that
\[ \pi_\beta : \langle w^b\rangle^s H_{sc}^{\infty,k}(\mathbb{S}^n_+ \setminus C_{b,sing}) \to H_{sc}^{\infty,k}(\bar X_b \setminus C_{b,sing}), \tag{4.2} \]
which follows easily from the product decomposition $\mathbb{R}^n = X_b \times X^b$. Thus, if $u \in H_{sc}^{\infty,-1/2-\epsilon}(\mathbb{S}^n_+)$, $(H-\lambda)u \in \dot C^\infty(\mathbb{S}^n_+)$, then
\[ (H_b - \lambda)u = (H-\lambda)u - \tilde V_b u \in \langle w^b\rangle^\rho H_{sc}^{\infty,\rho-1/2-\epsilon}(\mathbb{S}^n_+ \setminus C_{b,sing}) \subset \langle w^b\rangle^\rho H_{sc}^{\infty,1/2+\epsilon'}(\mathbb{S}^n_+ \setminus C_{b,sing}) \tag{4.3} \]
for sufficiently small $\epsilon' > 0$, so
\[ (\Delta_{X_b} - (\lambda_\beta)^2)\pi_\beta u \in H_{sc}^{\infty,1/2+\epsilon'}(U), \quad U = \bar X_b \setminus C_{b,sing}; \tag{4.4} \]

thus $U$ is a neighborhood of $C_b'$ in $\bar X_b$. (Recall here that $H_{sc}^{\infty,1/2+\epsilon'}(U)$ denotes the localized weighted Sobolev space!) Hence, the previous analysis applies with $\bar X_b$ in place of $X$, $x_b = |w_b|^{-1}$ in place of $x$, so we can conclude that for $h \in C_c^\infty(C_b')$,
\[ B_{\bar X_b}(\pi_\beta u, h) = e^{-i\lambda_\beta/x_b} Q_{\beta,-}(\pi_\beta u, h) + e^{i\lambda_\beta/x_b} Q_{\beta,+}(\pi_\beta u, h). \tag{4.5} \]
We let
\[ Q^0_{\beta,\pm}\pi_\beta u(h) = Q^0_{\beta,\pm}(\pi_\beta u, h) = Q_{\beta,\pm}(\pi_\beta u, h)|_{x_b = 0} \tag{4.6} \]
be the incoming and outgoing distributional boundary values of $\pi_\beta u$ on $\bar X_b$. We can now define the asymptotic expansion scattering matrix from channel $\alpha$ to channel $\beta$ for $\lambda > \max(\epsilon_\alpha, \epsilon_\beta)$, $\lambda \notin \Lambda$, as the map
\[ S_{\beta\alpha}(\lambda) : C_c^\infty(C_a') \to C^{-\infty}(C_b') \tag{4.7} \]
given by
\[ S_{\beta\alpha}(\lambda)g = Q^0_{\beta,+}\pi_\beta P_{\alpha,+}(\lambda)g. \tag{4.8} \]

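We also remark that directly from (2.6) and Proposition 3.1, taking into account that $R(\lambda+i0)f$, $f \in H_{sc}^{\infty,1/2+\epsilon}(\mathbb{S}^n_+)$, $\epsilon > 0$, is “outgoing” modulo $H_{sc}^{\infty,-1/2+\epsilon'}(\mathbb{S}^n_+)$, $0 < \epsilon' < \epsilon$, by [3], we have
\[ Q^0_{\alpha,-}\pi_\alpha P_{\alpha,+}(\lambda)g = g, \qquad Q^0_{\beta,-}\pi_\beta P_{\alpha,+}(\lambda)g = 0, \quad \beta \neq \alpha. \tag{4.9} \]
Here it is actually convenient to use (2.14) with $a$ replaced by $b$ and with a symbol as in (2.19) to deduce that
\[ q_L(t_-)\pi_\beta R(\lambda+i0)f \in H_{sc}^{\infty,-1/2+\epsilon'}(\bar X_b \setminus C_{b,sing}). \tag{4.10} \]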

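An application of the push-forward theorem of [6] (or alternatively conjugation by the Fourier transform and standard wave front set analysis, cf. the discussion after (3.11)) is used to show that the vanishing of $Q^0_{\beta,-}(\pi_\beta R(\lambda+i0)f)$ follows from (4.10).

5. Equivalence of Scattering Matrices

We show in this section that the asymptotic expansion scattering matrices $S_{\beta\alpha}(\lambda)$ agree with the wave-operator ones, $\hat S_{\beta\alpha}(\lambda)$, up to normalization. We apply Corollary 3.4 with $X = \bar X_b$, $\lambda$ replaced by $\lambda_\beta$, $u_1 = \pi_\beta u$, $u = P_{\alpha,+}(\lambda)g$, $g \in C_c^\infty(C_a')$. First, we lift the pairing to one on $\mathbb{R}^n$ by replacing $\pi_\beta u$ by $E_\beta u$ and $P^0_{b,-}(\lambda_\beta^2)h$ by $\tilde P_{\beta,-}(\lambda)h$. This amounts to tensoring each factor in each pairing on the right-hand side of (3.31) by $\psi_\beta$, so using $\|\psi_\beta\| = 1$, we deduce that (3.31) becomes
\[ \frac{1}{2i\lambda_\beta}\left(\langle E_\beta u, (\Delta_{X_b} - (\lambda_\beta)^2)T_+\tilde P_{\beta,-}(\lambda)h\rangle - \langle(\Delta_{X_b} - (\lambda_\beta)^2)E_\beta u, T_+\tilde P_{\beta,-}(\lambda)h\rangle\right). \tag{5.1} \]
Here $T_+$ denotes the lifted operator, $T_+ = q_L(\chi(\tau_b)\phi(\xi_b^2)) \otimes \mathrm{Id}$, $\chi \in C^\infty(\mathbb{R})$, $\chi \equiv 1$ on $(-\infty, -\lambda_\beta + \epsilon)$, $\chi \equiv 0$ on $(\lambda_\beta - \epsilon, \infty)$, $\epsilon > 0$, and $\phi \in C_c^\infty(\mathbb{R})$, $\phi \equiv 1$ near $\lambda_\beta^2$. As $\lambda_\beta^2 \geq a(\lambda)$ by (2.8), we may make the stronger assumption that
\[ \chi \in C^\infty(\mathbb{R}), \quad \chi \equiv 1 \text{ on } (-\infty, -\sqrt{a(\lambda)} + \epsilon), \quad \chi \equiv 0 \text{ on } (\sqrt{a(\lambda)} - \epsilon, \infty). \tag{5.2} \]
We note that, due to the presence of $E_\beta$ above and the remarks following (3.30), $T_+$ is the same as $T_+\tilde\phi(H_b)$ if $\tilde\phi \in C_c^\infty(\mathbb{R})$ is identically 1 near $\operatorname{supp}\phi + \epsilon_\beta$. Also, due to the decay in $\xi$ the factor $\tilde\phi(H_b)$ brings, $T_+\tilde\phi(H_b)$ is a very nice operator; it lies in $\Psi_{3sc}^{-\infty,0}(\mathbb{S}^n_+)$ in the three-body problem where this calculus has been constructed in [21], and an analogous result is expected to hold in general. Now $E_\beta$ is symmetric, commutes with $\Delta_{X_b}$ and with $T_+$, and $E_\beta\tilde P_{\beta,-}(\lambda)h = \tilde P_{\beta,-}(\lambda)h$, so we can rewrite the above expression as
\[ \frac{1}{2i\lambda_\beta}\left(\langle u, (\Delta_{X_b} - (\lambda_\beta)^2)T_+\tilde P_{\beta,-}(\lambda)h\rangle - \langle(\Delta_{X_b} - (\lambda_\beta)^2)u, T_+\tilde P_{\beta,-}(\lambda)h\rangle\right). \tag{5.3} \]
Moreover, taking into account that $\psi_\beta \in S(X^b)$ and $(h_\beta - \epsilon_\beta)\psi_\beta = 0$, we can replace $\Delta_{X_b} - (\lambda_\beta)^2$ by $H_b - \lambda$, yielding
\[ \frac{1}{2i\lambda_\beta}\left(\langle u, (H_b-\lambda)T_+\tilde P_{\beta,-}(\lambda)h\rangle - \langle(H_b-\lambda)u, T_+\tilde P_{\beta,-}(\lambda)h\rangle\right). \tag{5.4} \]
In addition, as $h \in C_c^\infty(C_b')$, $\tilde P_{\beta,-}(\lambda)h$ vanishes with all derivatives to infinite order near $C_{b,sing}$, thus near $\cup_{c \not\leq b} C_c$ due to the rapid decay of $\psi_\beta$. Correspondingly, if we replace $H_b - \lambda$ by $\tilde V_b$ in the pairing above, it again vanishes, so we can replace $H_b - \lambda$ by $H - \lambda$. We are thus led to conclude that
\[ 2i\lambda_\beta\langle Q^0_{\beta,+}\pi_\beta u, h\rangle = \langle u, (H-\lambda)T_+\tilde P_{\beta,-}(\lambda)h\rangle - \langle(H-\lambda)u, T_+\tilde P_{\beta,-}(\lambda)h\rangle. \tag{5.5} \]
Since we have $u = P_{\alpha,+}(\lambda)g$, $g \in C_c^\infty(C_a')$, we deduce that
\[ \langle S_{\beta\alpha}(\lambda)g, h\rangle = \frac{1}{2i\lambda_\beta}\langle P_{\alpha,+}(\lambda)g, (H-\lambda)T_+\tilde P_{\beta,-}(\lambda)h\rangle, \tag{5.6} \]
so
\[ S_{\beta\alpha}(\lambda) = \frac{1}{2i\lambda_\beta}\left((H-\lambda)T_+\tilde P_{\beta,-}(\lambda)\right)^* P_{\alpha,+}(\lambda), \tag{5.7} \]
$\lambda > \max(\epsilon_\alpha, \epsilon_\beta)$, $\lambda \notin \Lambda$.

We are now ready to prove the main theorem. Recall that the wave operator scattering matrix is $\hat S_{\beta\alpha}(\lambda)$.

Theorem 5.1. The asymptotic expansion and the wave-operator scattering matrices are equal up to normalization. More precisely, for $\lambda > \max(\epsilon_\alpha, \epsilon_\beta)$, $\lambda \notin \Lambda$,
\[ \hat S_{\beta\alpha}(\lambda) = c\,S_{\beta\alpha}(\lambda)R, \quad c = e^{i\pi(n_a+n_b-2)/4}\left(\frac{\lambda_\alpha}{\lambda_\beta}\right)^{1/2}, \tag{5.8} \]
(as maps $C_c^\infty(C_a') \to C^{-\infty}(C_b')$) where $R$ is pull back by the antipodal map on $C_a$.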


Proof. We only deal with $\alpha = \beta$, the other case being extremely similar. Let $F_\alpha(\lambda)$ be the trace of the Fourier transform (1.21):
\[ (F_\alpha(\lambda)f)(\omega) = (F_\alpha f)(\lambda, \omega) \tag{5.9} \]
and $T_+ = q_L(\chi(\tau_a)\phi(|\xi_a|^2))$, $\phi \in C_c^\infty(\mathbb{R})$ identically 1 near $\lambda_\alpha^2$, $\chi$ as in (5.2), as discussed at the beginning of this section. Note that
\[ HT_+ - T_+H_a = [H_a, T_+] + \tilde V_a T_+. \tag{5.10} \]
If $g \in C_c^\infty(C_a')$ then $F_\alpha^*(\lambda)g$ vanishes to infinite order near $C_{a,sing}$, so due to the rapid decay of $\psi_\alpha$, $J_\alpha F_\alpha^*(\lambda)g$ vanishes to infinite order near $\cup_{c \not\leq a} C_c$. Therefore,
\[ \tilde V_a T_+ J_\alpha F_\alpha^*(\lambda)g \in H_{sc}^{\infty,1/2+\epsilon'}(\mathbb{S}^n_+) \tag{5.11} \]
for sufficiently small $\epsilon'$ since $\tilde V_a$ decays away from $\cup_{c \not\leq a} C_c$. On the other hand, since $\mathrm{Id} \otimes h^a$ commutes with $T_+$ and $\Delta_{X_a}$ commutes with $H_a$, we have
\[ [H_a, T_+] = [\Delta_{X_a}, q_L(\chi(\tau_a)\phi(|\xi_a|^2))] \otimes \mathrm{Id}. \tag{5.12} \]
Since the identity operator commutes with $\Delta_{X_a}$ and composition is microlocal (see the comments at the end of Sect. 2), it follows that $[\Delta_{X_a}, q_L(\chi(\tau_a)\phi(|\xi_a|^2))]$ is, modulo a trivial term in $\Psi_{sc}^{-\infty,\infty}(\mathbb{S}^n_+)$, of the form $q_L(t)$ where $t$ is supported in $\tau_a \in [-\sqrt{a(\lambda)} + \epsilon, \sqrt{a(\lambda)} - \epsilon]$, so it satisfies (2.13). Indeed, due to commutativity to top order in $\Psi_{sc}(\bar X_a)$, $t$ also decays like $\langle w_a\rangle^{-1}$. Thus, $[H_a, T_+]$ (hence $[H_a, T_+]^*$) is of the form $q_L(t) \otimes \mathrm{Id}$. (These statements could be proved by direct commutation arguments as in [3].) Therefore, for $u \in H_{sc}^{\infty,l}(\mathbb{S}^n_+)$, $l > 1/2$, we have, with $\zeta_a$ as in (2.14),
\[ [H_a, T_+]^*\zeta_a R(\lambda+i0)u \in \langle w_a\rangle^{-1} H_{sc}^{\infty,-1/2+\epsilon'}(\mathbb{S}^n_+) \tag{5.13} \]
for sufficiently small $\epsilon' > 0$.

To prove the theorem, we note that Isozaki’s proof in [9, Lemma 3.1] can be repeated nearly verbatim to conclude that for $f, g \in C_c^\infty(C_a')$
\[ \langle(\hat S_{\alpha\alpha}(\lambda) - \mathrm{Id})f, g\rangle = \langle 2\pi i\left(-F_\alpha(\lambda)J_\alpha^*T_+^*\tilde V_a + F_\alpha(\lambda)J_\alpha^*(HT_+ - T_+H_a)^*R(\lambda+i0)\tilde V_a\right)J_\alpha F_\alpha^*(\lambda)f, g\rangle. \tag{5.14} \]
Here $\mathrm{Id}$ appears since the incoming and outgoing channels coincide, so we have an additional term $(W_\alpha^+)^*W_\alpha^+ = \mathrm{Id}$ in Isozaki’s argument. The pairing in (5.14) makes sense due to (5.10)-(5.13), the rapid decay of $\psi_\alpha$ (present as $J_\alpha^*$) in $w^\alpha$ and that of $F_\alpha^*(\lambda)g$ near $C_{a,sing}$, which also allow us to deal with the $1 - \zeta_a$ term. Thus, as a map $C_c^\infty(C_a') \to C^{-\infty}(C_a')$ we have
\[ \hat S_{\alpha\alpha}(\lambda) - \mathrm{Id} = 2\pi i\left(-F_\alpha(\lambda)J_\alpha^*T_+^*\tilde V_a + F_\alpha(\lambda)J_\alpha^*(HT_+ - T_+H_a)^*R(\lambda+i0)\tilde V_a\right)J_\alpha F_\alpha^*(\lambda). \tag{5.15} \]


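Moreover, the Poisson operator satisfies
\[ J_\alpha F_\alpha(\lambda)^* = C\tilde P_{\alpha,-}(\lambda) = C'\tilde P_{\alpha,+}(\lambda)R, \tag{5.16} \]
\[ C = (4\pi\lambda_\alpha)^{-1/2}e^{-i\pi(n_a-1)/4}, \quad C' = C^*, \tag{5.17} \]
see [14] (this simply comes from stationary phase methods), so
\[ \hat S_{\alpha\alpha}(\lambda) - \mathrm{Id} = C''\left(-\tilde P_{\alpha,-}(\lambda)^*T_+^*\tilde V_a + \tilde P_{\alpha,-}(\lambda)^*(HT_+ - T_+H_a)^*R(\lambda+i0)\tilde V_a\right)\tilde P_{\alpha,+}(\lambda)R \tag{5.18} \]
with $C'' = \dfrac{i^{n_a}}{2\lambda_\alpha}$.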
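The value of $C''$ can be checked numerically. The sketch below assumes, as our reading of the step from (5.15) to (5.18), that $F_\alpha(\lambda)J_\alpha^* = (J_\alpha F_\alpha^*(\lambda))^* = \bar C\,\tilde P_{\alpha,-}(\lambda)^*$ and $J_\alpha F_\alpha^*(\lambda) = C'\tilde P_{\alpha,+}(\lambda)R$, so that the prefactor is $2\pi i\,\bar C C' = 2\pi i\,\bar C^2$; the function names are hypothetical.

```python
import cmath
import math

def normalization_constants(n_a, lam_alpha):
    """Return (C, C', C'') with C as in (5.17) and C'' = 2*pi*i*conj(C)**2 (our reading of (5.18))."""
    C = (4.0 * math.pi * lam_alpha) ** -0.5 * cmath.exp(-1j * math.pi * (n_a - 1) / 4.0)
    C_prime = C.conjugate()
    C_double_prime = 2j * math.pi * C_prime * C_prime
    return C, C_prime, C_double_prime

def claimed_c_double_prime(n_a, lam_alpha):
    """The closed form i^{n_a} / (2 * lambda_alpha) stated after (5.18)."""
    return cmath.exp(1j * math.pi * n_a / 2.0) / (2.0 * lam_alpha)

if __name__ == "__main__":
    for n_a in range(1, 5):
        _, _, c2 = normalization_constants(n_a, 1.5)
        print(n_a, c2, claimed_c_double_prime(n_a, 1.5))
```

Under this reading the two expressions agree to rounding error for every dimension $n_a$ and every $\lambda_\alpha > 0$ tried.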

Now, from the argument of the previous paragraphs we know that
\[ \frac{1}{2i\lambda_\alpha}\left((H_a-\lambda)T_+\tilde P_{\alpha,-}(\lambda)\right)^*\tilde P_{\alpha,+}(\lambda) = i^{-(n_a-1)}R, \tag{5.19} \]
since the left-hand side is just the asymptotic expansion scattering matrix of $H_a$ at $\lambda$, which coincides with that of $\Delta_{X_a}$ at $\lambda_\alpha^2$, which is $i^{-(n_a-1)}R$ (see [14]; this is again from stationary phase). The left-hand side can be rewritten as
\[ \frac{1}{2i\lambda_\alpha}\left([H_a, T_+]\tilde P_{\alpha,-}(\lambda)\right)^*\tilde P_{\alpha,+}(\lambda), \tag{5.20} \]
so together with (5.15) we see that
\[ \hat S_{\alpha\alpha}(\lambda) = -C''\left((HT_+ - T_+H_a)\tilde P_{\alpha,-}(\lambda)\right)^*(\mathrm{Id} - R(\lambda+i0)\tilde V_a)\tilde P_{\alpha,+}(\lambda)R. \tag{5.21} \]
But then (2.6) and $(H_a - \lambda)\tilde P_{\alpha,+}(\lambda) = 0$ show that
\[ \hat S_{\alpha\alpha}(\lambda) = -C''\left((H-\lambda)T_+\tilde P_{\alpha,-}(\lambda)\right)^*P_{\alpha,+}(\lambda)R = i^{n_a-1}S_{\alpha\alpha}(\lambda)R, \tag{5.22} \]
where we used (5.7) in the last step. The proof for $\alpha \neq \beta$ is very similar.

Remark 5.2. An alternative approach to proving the equivalence of the S-matrices is to use Yafaev’s representation, [25]. In our notation this would mean that we replace Corollary 3.4 by another result. Namely, we would want to understand $Q^0_{\beta,+}(\pi_\beta R(\lambda+i0)f)$, where $f = (H-\lambda)\tilde P_{\alpha,+}(\lambda)g$, $g \in C_c^\infty(C_a')$. Since $R(\lambda+i0)f$, and hence $u = \pi_\beta R(\lambda+i0)f$, is “outgoing” (modulo $H_{sc}^{\infty,1/2+\epsilon'}(\bar X_b)$, $\epsilon' > 0$ is sufficiently small) by [3, 4], we can use the push-forward theorem of [6] to conclude that $Q^0_-(u) = 0$. Hence, Proposition 3.3 implies that
\[ \langle Q^0_+(u), h\rangle = \frac{-1}{2i\lambda_\beta}\langle(\Delta - \lambda_\beta^2)u, P^0_{b,-}(\lambda_\beta^2)h\rangle. \tag{5.23} \]
Yafaev’s representation follows by rewriting this equation and using the corresponding formula for the Euclidean scattering matrix, much like it was done in the proof of the Theorem.

Acknowledgement. I am very grateful to Andrew Hassell, whose work initiated this project, for our numerous very fruitful discussions on the equivalence of the two different definitions of S-matrices and for providing me with early versions of his manuscript [6]. I would also like to thank Richard Melrose, Erik Skibsted and the referees for their helpful comments about this paper.


References

1. Froese, R.G. and Herbst, I.: Exponential bounds and absence of positive eigenvalues of N-body Schrödinger operators. Commun. Math. Phys. 87, 429–447 (1982)
2. Froese, R.G. and Herbst, I.: A new proof of the Mourre estimate. Duke Math. J. 49, 1075–1085 (1982)
3. Gérard, C., Isozaki, H. and Skibsted, E.: Commutator algebra and resolvent estimates. Adv. Studies in Pure Math. 23, 69–82 (1994)
4. Gérard, C., Isozaki, H. and Skibsted, E.: N-body resolvent estimates. J. Math. Soc. Japan 48, 135–160 (1996)
5. Graf, G.M.: Asymptotic completeness for N-body short range systems: A new proof. Commun. Math. Phys. 132, 73–101 (1990)
6. Hassell, A.: Scattering matrices for the quantum 3-body problem. Preprint (1997)
7. Hörmander, L.: The analysis of linear partial differential operators. Vol. 1–4, Berlin–Heidelberg–New York: Springer-Verlag, 1983
8. Ikawa, M.: Spectral and scattering theory. New York: Marcel Dekker, 1994
9. Isozaki, H.: Structures of S-matrices for three body Schrödinger operators. Commun. Math. Phys. 146, 241–258 (1992)
10. Isozaki, H.: Asymptotic properties of generalized eigenfunctions for three body Schrödinger operators. Commun. Math. Phys. 153, 1–21 (1993)
11. Isozaki, H.: On N-body Schrödinger operators. Proc. Indian Acad. Sci. Math. Sci. 104, 667–703 (1993)
12. Isozaki, H.: A uniqueness theorem for the N-body Schrödinger equation and its applications. In: Ikawa [8], 1994
13. Melrose, R.B.: Spectral and scattering theory for the Laplacian on asymptotically Euclidian spaces. In: Ikawa [8], 1994
14. Melrose, R.B.: Geometric scattering theory. Cambridge: Cambridge University Press, 1995
15. Melrose, R.B. and Zworski, M.: Scattering metrics and geodesic flow at infinity. Invent. Math. 124, 389–436 (1996)
16. Perry, P., Sigal, I.M. and Simon, B.: Spectral analysis of N-body Schrödinger operators. Ann. Math. 114, 519–567 (1981)
17. Reed, M. and Simon, B.: Methods of modern mathematical physics. New York: Academic Press, 1979
18. Sigal, I.M. and Soffer, A.: N-particle scattering problem: Asymptotic completeness for short range systems. Ann. Math. 125, 35–108 (1987)
19. Skibsted, E.: Smoothness of N-body scattering amplitudes. Rev. in Math. Phys. 4, 619–658 (1992)
20. Vasy, A.: Asymptotic behavior of generalized eigenfunctions in N-body scattering. J. Func. Anal. 148, 170–184 (1997)
21. Vasy, A.: Propagation of singularities in three-body scattering. PhD thesis, Massachusetts Institute of Technology, 1997; submitted for publication
22. Vasy, A.: Structure of the resolvent for three-body potentials. Duke Math. J. 90, 379–434 (1997)
23. Wang, X.P.: Microlocal estimates for N-body Schrödinger operators. J. Fac. Sci. Univ. Tokyo Sect. IA, Math. 40, 337–385 (1993)
24. Yafaev, D.: Radiation conditions and scattering theory for N-particle Hamiltonians. Commun. Math. Phys. 154, 523–554 (1993)
25. Yafaev, D.: Resolvent estimates and scattering matrix for N-particle Hamiltonians. Integr. Equat. Oper. Th. 21, 93–126 (1995)

Communicated by B. Simon

Commun. Math. Phys. 200, 125 – 179 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

“Non-Gibbsian” States and their Gibbs Description

R. L. Dobrushin†,⋆, S. B. Shlosman⋆,⋆⋆

CPT, CNRS, Luminy, Marseille, France

Received: 13 March 1998 / Accepted: 19 June 1998

Abstract: The driving principle behind this paper is the following thesis: “Every physically reasonable random field has to be a Gibbs random field”. In this paper the so-called “non-Gibbsian” random fields are considered. The usual definition of the Gibbs field is generalized so as to include some of the discovered “non-Gibbsian” fields. The new definition is then used to show that the projection of the two-dimensional Ising model onto the one-dimensional sublattice $\mathbb{Z}^1$ falls into the class of the generalized Gibbs fields.

A foreword by S. Shlosman

I had the sad duty to complete one of the papers by R. L. Dobrushin, on which he was working during the last several months of his life. He got the idea of this work right after the paper of Roberto Schonmann [S] appeared in 1989. He discussed the project of bringing the “non-Gibbsian” fields back to the Gibbsian fold with me several times. There are two versions of Dobrushin’s manuscript. The first one deals with the projection of the (+)-phase of the 2D Ising model onto an arbitrary countable subset of the lattice $\mathbb{Z}^2$. Because of the wish to have a preliminary version [D95] of this result in time for the conference in The Netherlands (the conference in August–September, 1995, in Renkum, the last one in which R. L. Dobrushin participated), he started to write down the special case of projecting onto the sublattice $\mathbb{Z}^1 \subset \mathbb{Z}^2$. Both manuscripts were left unfinished.

The following is the result of my attempt to finish the second manuscript. I had to change some of the initial statements, and I also added some new ones. I think that, if completed by R. L. Dobrushin, the final version would also differ from his original manuscript, though for sure the changes would be different.

⋆ The work of the authors was partially supported by the Russian Fund for Fundamental Research through grant 930101470.
⋆⋆ On leave from the Department of Mathematics of the University of California at Irvine. The work was partially supported by the NSF through grant DMS 9500958.

1. Introduction

In recent years the following surprising possibility was discovered and intensively discussed in the literature (see the excellent review paper of van Enter, Fernandez, and Sokal [EFS], the recent informative paper of van Enter, Fernandez and Kotecky [EFK] and the references there). Natural functionals of values of Gibbs random fields of the type used in the renormalization-group theory can (in some sense) be non-Gibbsian. More exactly, it means the following.

Let $T$ be a countable set and $X$ be a finite set. For any subsets $V' \subset V \subseteq T$ and any configuration $\sigma_V \in X^V$ we denote the restriction $\sigma_V|_{V'} \in X^{V'}$ of $\sigma_V$ to $V'$ by $\sigma_{V'}$. For any two mutually disjoint subsets $V_1, V_2 \subseteq T$ and any configurations $\sigma_{V_1} \in X^{V_1}$ and $\sigma_{V_2} \in X^{V_2}$ the configuration $\tilde\sigma \in X^{V_1 \cup V_2}$, such that its restriction $\tilde\sigma|_{V_1}$ on $V_1$ coincides with $\sigma_{V_1}$ and its restriction $\tilde\sigma|_{V_2}$ on $V_2$ coincides with $\sigma_{V_2}$, is denoted by $\sigma_{V_1} \cup \sigma_{V_2}$. A Gibbsian potential $U = (U_A(\sigma_A), A \subset T, 0 < |A| < \infty)$ is a system of real-valued functions $U_A(\sigma_A)$ of $\sigma_A \in X^A$, labelled by the system of all finite nonempty subsets $A \subset T$. It is assumed that for any $t \in T$ the following series absolutely converges:
\[ \sum_{A \subset T:\, t \in A,\, |A| < \infty}\ \max_{\sigma_A \in X^A} |U_A(\sigma_A)| < \infty. \tag{1.1} \]
For any finite $V \subset T$, any configuration $\bar\sigma_{T \setminus V} \in X^{T \setminus V}$, called a boundary condition, and any $\sigma_V \in X^V$ consider the relative energy
\[ E_V^U(\sigma_V/\bar\sigma_{T \setminus V}) = \sum_{A \subseteq V,\, A \neq \emptyset} U_A(\sigma_A) + \sum_{A \subset T:\, A \cap V \neq \emptyset,\, A \cap (T \setminus V) \neq \emptyset,\, |A| < \infty} U_A(\sigma_{A \cap V} \cup \bar\sigma_{A \cap (T \setminus V)}). \tag{1.2} \]
The condition (1.1) guarantees the convergence of the series in (1.2) for all boundary conditions $\bar\sigma_{T \setminus V}$ and configurations $\sigma_V$. Let
\[ p_V^U(\sigma_V/\bar\sigma_{T \setminus V}) = \frac{\exp\{-E_V^U(\sigma_V/\bar\sigma_{T \setminus V})\}}{Z_V^U(\bar\sigma_{T \setminus V})}, \tag{1.3} \]
where the partition function
\[ Z_V^U(\bar\sigma_{T \setminus V}) = \sum_{\sigma_V \in X^V} \exp\{-E_V^U(\sigma_V/\bar\sigma_{T \setminus V})\}. \tag{1.4} \]
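The definitions (1.2)-(1.4) can be made concrete on a finite toy example. The following is a minimal sketch, assuming a finite index set $T$ and a potential stored as a dictionary from finite sets $A$ to functions of the full configuration; the data layout and the nearest-neighbour `chain_potential` are illustrative assumptions, not objects from the paper.

```python
import itertools
import math

SPINS = (1, -1)  # the finite single-site space X

def relative_energy(potential, V, sigma_V, sigma_bar):
    """E_V^U(sigma_V / sigma_bar) of (1.2): sum U_A over every finite A that meets V."""
    sigma = {**sigma_bar, **sigma_V}          # glue sigma_V with the boundary condition
    V = set(V)
    return sum(U_A(sigma) for A, U_A in potential.items() if A & V)

def specification(potential, V, sigma_bar):
    """Return {values on V: p_V^U(sigma_V / sigma_bar)} following (1.3) and (1.4)."""
    V = tuple(V)
    weights = {}
    for values in itertools.product(SPINS, repeat=len(V)):
        sigma_V = dict(zip(V, values))
        weights[values] = math.exp(-relative_energy(potential, V, sigma_V, sigma_bar))
    Z = sum(weights.values())                 # partition function (1.4)
    return {values: w / Z for values, w in weights.items()}

def chain_potential(n_sites, J):
    """Hypothetical example: nearest-neighbour pair potential on sites 0..n_sites-1."""
    return {
        frozenset({i, i + 1}): (lambda sigma, i=i: -J * sigma[i] * sigma[i + 1])
        for i in range(n_sites - 1)
    }
```

For instance, `specification(chain_potential(4, 0.5), [1], {0: 1, 2: 1, 3: 1})` gives the conditional law of the spin at site 1 given plus spins elsewhere. The two sums in (1.2) together run over all $A$ meeting $V$, which is why a single filter `A & V` suffices here.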

The transition function $p_V(\sigma_V/\bar\sigma_{T \setminus V})$ is defined for all $\sigma_V$ and all boundary conditions $\bar\sigma_{T \setminus V}$. The system $p^U = \{p_V^U(\sigma_V/\bar\sigma_{T \setminus V}), \sigma_V \in X^V, \bar\sigma_{T \setminus V} \in X^{T \setminus V}, |V| < \infty\}$ of transition functions is called the Gibbs specification with the potential $U$.

For any subset $W \subseteq T$ we consider the smallest σ-algebra of subsets of the space $X^T$ with respect to which all the coordinate functions $\sigma_t$, $t \in W$, are measurable. This σ-algebra is denoted by $\mathcal{B}_W$. A probability measure $P$ on the measurable space $(X^T, \mathcal{B}_T)$ is called consistent with the Gibbs specification $p^U$, if for any finite $V \subset T$, any function $\phi(\sigma_V)$ of $\sigma_V \in X^V$ and any subset $B \in \mathcal{B}_{T \setminus V}$,
\[ \int_B \phi(\sigma_V)\,P(d\sigma) = \int_B \left[\sum_{\sigma_V \in X^V} \phi(\sigma_V)\,p_V^U(\sigma_V/\bar\sigma_{T \setminus V})\right] P_{T \setminus V}(d\bar\sigma_{T \setminus V}), \tag{1.5} \]
where $P_{T \setminus V}$ is the restriction of the measure $P$ to the σ-algebra $\mathcal{B}_{T \setminus V}$. The probability measure $P$ is called “non-Gibbsian”, if there exists no potential $U$ satisfying the condition (1.1), such that the measure $P$ is consistent with the Gibbs specification $p^U$.

To be concrete we restrict ourselves to functionals of the type of decimation, applied to the two-dimensional ferromagnetic symmetric Ising random field with large values of the inverse temperature $\beta > 0$ and (+)-boundary condition. Recall the corresponding definitions. Let $T = \mathbb{Z}^2$ and $X = \{1, -1\}$. Consider the probability measure $P^{\beta,+}$ on the measurable space $(X^{\mathbb{Z}^2}, \mathcal{B}_{\mathbb{Z}^2})$ defined by the following usual construction. Let a subset $V \subset \mathbb{Z}^2$ be finite and $P_V^{\beta,+}(\sigma_V)$, $\sigma_V \in X^V$, be the restriction of the measure $P^{\beta,+}$ to the σ-algebra $\mathcal{B}_V$. Let $W_N$, $N = 1, 2, \ldots$ be the sequence of the lattice squares $W_N = \{t = (t_1, t_2) \in \mathbb{Z}^2 : |t_1| \leq N, |t_2| \leq N\}$. Then
\[ P_V^{\beta,+}(\sigma_V) = \lim_{N \to \infty:\, V \subseteq W_N}\ \sum_{\sigma_{W_N} \in X^{W_N}:\, \sigma_{W_N}|_V = \sigma_V} p_{W_N}^{\beta,+}(\sigma_{W_N}), \quad \sigma_V \in X^V. \tag{1.6} \]
Here $p_W^{\beta,+}$, $W \subset \mathbb{Z}^2$, $|W| < \infty$, are the probability distributions defined by the Ising specification for the case of the plus-boundary conditions, i.e.
\[ p_W^{\beta,+}(\sigma_W) = \frac{\exp\{-\beta E_W^{\mathrm{Ising}}(\sigma_W/+)\}}{Z_W^{\beta,+}}, \tag{1.7} \]
where (for $W^c = \mathbb{Z}^2 \setminus W$)
\[ E_W^{\mathrm{Ising}}(\sigma_W/+) = -\sum_{(s,t) \subseteq W:\, |s-t| = 1} \sigma_s\sigma_t - \sum_{(s,t):\, s \in W,\, t \in W^c,\, |s-t| = 1} \sigma_s \tag{1.8} \]
and
\[ Z_W^{\beta,+} = \sum_{\sigma_W \in X^W} \exp\{-\beta E_W^{\mathrm{Ising}}(\sigma_W/+)\}. \tag{1.9} \]
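For a very small box the finite-volume distribution (1.7)-(1.9) can be computed by brute force. The sketch below assumes that each nearest-neighbour bond inside $W$ is counted once in the first sum of (1.8); the function names are hypothetical and the enumeration is only feasible for $N = 1$ (512 configurations).

```python
import itertools
import math

def box(N):
    """Sites of the lattice square W_N = {|t1| <= N, |t2| <= N}."""
    return [(i, j) for i in range(-N, N + 1) for j in range(-N, N + 1)]

def ising_energy_plus(sigma, W):
    """E_W^Ising(sigma_W / +) as in (1.8); sigma maps the sites of W to +1 or -1."""
    inside = set(W)
    energy = 0
    for (i, j) in W:
        for nb in ((i + 1, j), (i, j + 1)):       # each interior bond counted once
            if nb in inside:
                energy -= sigma[(i, j)] * sigma[nb]
        for nb in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if nb not in inside:                  # bond to a frozen +1 spin outside W
                energy -= sigma[(i, j)]
    return energy

def plus_distribution(beta, N):
    """List of (sigma, p_W^{beta,+}(sigma)) over all configurations, following (1.7) and (1.9)."""
    W = box(N)
    configs = [dict(zip(W, values)) for values in itertools.product((1, -1), repeat=len(W))]
    weights = [math.exp(-beta * ising_energy_plus(s, W)) for s in configs]
    Z = sum(weights)
    return [(s, w / Z) for s, w in zip(configs, weights)]
```

On the $3 \times 3$ box there are 12 interior bonds and 12 bonds to the boundary, so the all-plus configuration has energy $-24$ and the all-minus configuration has energy $0$; the all-plus configuration is the most probable one.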

It is well-known that the limits (1.6) exist. By the Kolmogorov theorem the system of the probability distributions $P_V^{\beta,+}$, $V \subset \mathbb{Z}^2$, $|V| < \infty$, defines the measure $P^{\beta,+}$ in a unique way.

Let now $T \subseteq \mathbb{Z}^2$ be a countable set. Consider the projection $\Pi_T : X^{\mathbb{Z}^2} \to X^T$, defined by $\sigma = (\sigma_t, t \in \mathbb{Z}^2) \to \Pi_T\sigma = (\sigma'_t, t \in T)$, where $\sigma'_t = \sigma_t$ for all $t \in T$. Via this projection, the measure $P^{\beta,+}$ induces a measure $P_T^{\beta,+}$ on the space $(X^T, \mathcal{B}_T)$, defined by
\[ P_T^{\beta,+}(B) = P^{\beta,+}(\Pi_T^{-1}B), \quad B \in \mathcal{B}_T, \tag{1.10} \]
which is called the projection on $T$ of the Ising measure $P^{\beta,+}$. In case of the sublattice $T = b\mathbb{Z}^2$ (then $b$ should be an integer), the transformation described above is called decimation with spacing $b$. It is one of the transformations used in renormalization-group theory. An important result of van Enter, Fernandez, and Sokal [EFS], extending earlier results of Griffiths and Pearce [GP78, GP79] and Israel [I], states that for decimation with any spacing $b \geq 2$ the measure $P_T^{\beta,+}$ is “non-Gibbsian”. The other important result is due to Schonmann [S]. He considers the case of $T = \{t = (t_1, t_2) \in \mathbb{Z}^2 : t_2 = 0\}$ and proved that the measure $P_T^{\beta,+}$ is “non-Gibbsian” in this case also. The flow of witty examples of “non-Gibbsian” measures is on the increase. Sometimes these examples are treated as pathological ones and even as an insult to physical intuition, since the belief that any reasonable transformation of a Gibbs measure leads to another Gibbs measure is a foundation-stone of the generally accepted, but mathematically nonrigorous, renormalization-group theory.

The mathematical results discussed above are very interesting, but the terminology used seems to be misleading in some sense. The property (1.1) of the potential is very convenient, if it is valid, but is not intrinsic to the notion of Gibbs distribution. Moreover, it is not strictly necessary to assume that Gibbs transition functions (1.3) are defined for all boundary conditions $\bar\sigma_{T \setminus V}$; it is enough to assume that they are defined only almost everywhere with respect to the corresponding random field. This point of view is evidently unavoidable, if the space of values $X = \mathbb{R}^1$, and the interaction is infinite range. For example, Gaussian translation-invariant random fields are described as Gibbsian ones with the potential $U = (U_A(\sigma_A), A \subset \mathbb{Z}^d, |A| < \infty)$, such that
\[ U_A(\sigma_A) = \begin{cases} c_{s-t}\,\sigma_s\sigma_t & \text{if } A = \{s, t\}, \\ 0 & \text{in other cases}, \end{cases} \tag{1.11} \]
where $c_{s-t} = c_{t-s}$ are some nonnegative constants (see [D80]). It is clear that if all these constants do not vanish, the series in the definition (1.2) for the relative energy diverges for some boundary conditions $\bar\sigma_{V^c}$, and so the Gibbs specification can not be defined in a reasonable way for all boundary conditions. Of course, in the case of finite $X$ similar possibilities appear exotic, but, if we meet the real difficulties in these cases, we can try to find a way out by using partly defined specifications. At least, physical intuition of renormalization-group theory is vague enough to be not in contradiction with such interpretation of Gibbs measures.

More exactly, let again $T$ be a countable set and $X$ be a finite set. We define the potential $U = (U_A(\sigma_A), A \subset T, |A| < \infty)$ as above, but we will not introduce the condition (1.1). Instead, for any finite $V \subset T$ we introduce the set $X_V^U \in \mathcal{B}_{T \setminus V}$ of all boundary conditions $\bar\sigma_{T \setminus V}$ such that the series (1.2) absolutely converges for all $\sigma_V \in X^V$. (This set $X_V^U$ can in principle be empty.) Then for $\bar\sigma_{T \setminus V} \in X_V^U$ we can still define the probability distribution $p_V^U(\sigma_V/\bar\sigma_{T \setminus V})$ by the formula (1.3). The system $p^U = \{p_V^U(\sigma_V/\bar\sigma_{T \setminus V}), \sigma_V \in X^V, \bar\sigma_{T \setminus V} \in X_V^U, V \subset T, |V| < \infty\}$ of transition functions will be called the partly defined Gibbs specification with the potential $U$. A probability measure $P$ on the measurable space $(X^T, \mathcal{B}_T)$ is called consistent with the partly defined Gibbs specification $p^U$, if for any finite $V \subset T$ the probability
\[ P_{T \setminus V}(X_V^U) = 1, \tag{1.12} \]
and the identity (1.5) holds for any function $\phi(\sigma_V)$ of $\sigma_V \in X^V$ and any set $B \in \mathcal{B}_{T \setminus V}$ such that $B \subseteq X_V^U$.


Now we can formulate the main result of the paper.

Theorem 1.1. There exists a value $\beta_0 < \infty$ such that for all $\beta > \beta_0$ and $T = \{t = (t_1, t_2) \in \mathbb{Z}^2 : t_2 = 0\} \subset \mathbb{Z}^2$ there exists a Gibbs potential $U_T^\beta = \{U_A^\beta(\sigma_A), A \subset T, |A| < \infty, \sigma_A \in X^A\}$, such that the projection $P_T^{\beta,+}$ of the Ising measure $P^{\beta,+}$ is consistent with the partly defined Gibbs specification $p^{U_T^\beta}$.

Remark 1. A similar construction can be applied to the case of minus-boundary condition. As a result we obtain that the projection of the Ising measure with minus-boundary condition is consistent with the partly defined Gibbs specification with the potential $\bar U_T^\beta$, which is obtained from the potential $U_T^\beta$ by the change of variables $\sigma_t \leftrightarrow -\sigma_t$, $t \in T$. For large enough $\beta$ these two potentials do not coincide. We believe that there is no unique potential $\tilde U_T^\beta$, such that both projections $P_T^{\beta,+}$ and $P_T^{\beta,-}$ are consistent with the partly defined Gibbs specification $p^{\tilde U_T^\beta}$.

Remark 2. We consider the two-dimensional Ising model and the sublattice $T = \mathbb{Z}^1 \subset \mathbb{Z}^2$ only to simplify the following construction. In fact, the only essential feature of our approach is that in the considered case the contour method and the cluster expansion for the contour system are applicable. Therefore, similar results can be proved for other examples of the subsets $T \subset \mathbb{Z}^2$, for the Ising model in dimension $d > 2$ and for many other Gibbs models.

Remark 3. There is a hope that the partly defined Gibbs specifications exist also in other “non-Gibbsian” situations, but the investigation of this problem requires a lot of additional work.

Remark 4. There is the following doubt, formulated in connection with the approach developed in this paper: the class of partly defined potentials is too wide and we know nothing about corresponding “almost Gibbs” fields. One possible response is that we also know almost nothing about the usual Gibbs fields with potentials of the class (1.1). We have only a general theorem of existence, a theorem of uniqueness for high temperatures and some examples of phase transitions for low temperatures. It seems that similar results can be obtained under some appropriate restrictions for Gibbs fields with partly defined potentials.

1.1. Some heuristics. We finish this section by an informal description of some properties of the projected low-temperature Ising measure, which is meant to explain the origins of its “non-Gibbsianity” (in the standard meaning) and also to make the main results of this paper (see Propositions 2.2 and 2.3) look natural.

Consider the segment $[-n, n] \subset \mathbb{Z}^1$. We want to fix the configuration $\sigma$ everywhere on $[-n, n]$ except at the origin, and we want to estimate the influence of the rest of the configuration $\sigma$ outside this segment on the value $\sigma_0$. For the ordinary Gibbs field this influence vanishes as $n \to \infty$. As we will see soon, this is generally not the case in our example. To be specific, suppose that the configuration $\sigma$ is antisymmetric on $[-n, n]$, that is $\sigma_s = -\sigma_{-s}$ for $1 \leq s \leq n$, and that $\sigma_{[1,n]}$ is somewhat neutral, that is for some fixed $k \ll n$ all the averages $\frac{1}{k}|\sigma_{s+1} + \cdots + \sigma_{s+k}| < \frac{1}{2}$, say, for all $s$, $0 \leq s \leq n - k$. The reader can consider, for example, the configuration $\sigma_t = (-1)^t$, $0 < t \leq n$. Let us first consider the case when the configuration outside $[-n, n]$ is identically $+1$. Then the typical configuration of the (+)-phase of the 2D model under such condition in the vicinity of the origin looks as follows: the segments of minuses of the configuration $\sigma$ are enclosed by “thin” contours which stay very close to these segments, while the rest of the plane is filled by the (+)-phase. Because of the symmetry of the configuration $\sigma_{[-n,n] \setminus 0}$, its influence on the spin at the origin vanishes, and the influence left is that of the (+)-phase outside the thin contours, so the spin $\sigma_0$ behaves basically as if being in the (+)-phase.

Let us now add two more (−)-segments, $\pm[n+1, n+N]$. If $N$ is moderate, then nothing new happens, and we have just two more thin contours, surrounding the new (−)-segments. However, when $N$ becomes larger than a certain threshold $N(n, \beta)$, then the picture becomes qualitatively different: the two new contours prefer to merge into one big contour, containing the whole segment $[-N, N]$ inside. Energetically this is a loss, because the length of the new system of contours exceeds the minimal possible one by $\sim 2n$. The reason why the system is willing to pay this price is that the arrangement described gives more possibilities for fluctuations. Roughly speaking, the second picture corresponds to the pinning of the external contour of the third one to the segment $[-n, n]$. However, the probability of the event that this contour is located at some given height $h$ above the origin is of the order of $C(\beta)/\sqrt{N}$ for $h \sim \sqrt{N}$. So when $\sqrt{N} \gg \exp\{2\beta n\}$, the entropy considerations win over the energy considerations, and the system prefers the freedom of fluctuations to the energy gain. When the unifying external contour is formed (which happens with probability close to one, if $\sqrt{N} \gg \exp\{2\beta n\}$), the picture in the vicinity of the origin is reversed: inside the big contour we see thin contours surrounding (+)-segments of the configuration $\sigma$. Outside them what we see is basically the (−)-phase, and so the spin $\sigma_0$ behaves basically as if being in the (−)-phase.
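To get a feeling for the scales in this heuristic, the condition $\sqrt{N} \gg \exp\{2\beta n\}$ can be turned into an order-of-magnitude threshold $N \sim \exp\{4\beta n\}$. The sketch below only evaluates that arithmetic; the function names and the factor used to quantify “$\gg$” are arbitrary illustrative choices, and nothing here is a rigorous estimate of $N(n, \beta)$.

```python
import math

def merge_threshold(beta, n):
    """Order of magnitude N* = exp(4*beta*n) at which sqrt(N) equals exp(2*beta*n)."""
    return math.exp(4.0 * beta * n)

def entropy_dominates(beta, n, N, factor=10.0):
    """Heuristic test of sqrt(N) >> exp(2*beta*n); `factor` quantifies '>>' (arbitrary choice)."""
    return math.sqrt(N) > factor * math.exp(2.0 * beta * n)

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, merge_threshold(1.0, n), entropy_dominates(1.0, n, 10**6))
```

Already for $\beta = 1$ the threshold grows by a factor $e^4 \approx 55$ with each additional site of the fixed segment, which illustrates why the influence on $\sigma_0$ requires extremely long outer segments.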
This is how the “non-Gibbsian” nature of the projected field is manifested: no matter how large the segments [−n, −1], [1, n] are, where the configuration is fixed, we still can influence the behavior at the origin by choosing the segments ±[n + 1, n + N] even bigger. That is a clear indication that if we still want to describe the field by a Gibbs potential, we have to be prepared to allow extremely long-range interactions, which might not even decay. Another, more quantitative piece of information we can get from this example is that the total energy of two strings of (−)-particles, $H(\sigma^{[a_1,b_1]\cup[a_2,b_2]})$, where $a_1 < b_1 < a_2 < b_2$, can be (approximately) equal to the sum of the energies of the two strings, $H(\sigma^{[a_1,b_1]}) + H(\sigma^{[a_2,b_2]})$, only if the strings are well separated. Here and in the following, for $A \subset\subset \mathbb{Z}^1$ we denote by $\sigma^A$ the configuration

$$\sigma^A(s) = \begin{cases} -1 & \text{for } s \in A, \\ +1 & \text{for } s \notin A. \end{cases} \tag{1.13}$$

The reason for that is that the approximate additivity of the energy is equivalent to the approximate independence. But when the distance between the segments, $|a_2 - b_1|$, is $o(\min(|b_1 - a_1|, |b_2 - a_2|))$, then we have to expect the two contours to coalesce, which rules out any hope for independence.

1.2. Organization of the paper. In the next section we write down the expression for the potential we are looking for, and formulate its main decay properties. In Sect. 3 we study one of the main objects we need: the meniscus. By this we mean the following. Suppose a box contains a system which is a mixture of two phases, with a phase separation surface between them. In case both phases are touching the walls of the box, the separation surface has its boundary on the wall. What we need is the information on how the phase separation surface approaches the wall of the box in question. We find in particular that the number of pure states of the Ising model in the half space equals the continuum! In Sects. 4 and 5 we remind the reader about the cluster expansion technique.

Sections 6 and 7 contain the main technical results of the paper. There we obtain the estimates on the interaction, which are enough to prove the Gibbsianity of the projected field. This is done in Sect. 8. A preliminary formulation of the results of the present paper appeared in [DS].

2. Construction of the Potential

The proof of Theorem 1.1 is based on an explicit construction of the potential $U_T^{\beta}$. In this section we explain this construction. In the case of the two-point space X = {1, −1} we will use the following notation for the configurations $\sigma_V \in X^V$ in a volume V. For any $\sigma_V = (\sigma_t, t \in V) \in X^V$ we let

$$\sigma_V^+ = \{t \in V : \sigma_t = 1\}, \qquad \sigma_V^- = \{t \in V : \sigma_t = -1\}. \tag{2.1}$$

As is well known (see [G] or [EFS]), distinct Gibbs potentials can define the same Gibbs specification, but there is a subclass of potentials which are defined by the specification in a unique and even, in some sense, constructive way. It is the subclass of the so-called lattice gas potentials. These potentials are especially natural for the fields with two values ±1, when a typical configuration takes mainly one of these two values. In the considered case of the Ising measure $P^{\beta,+}$ with β large, this dominant value is +1. We can treat the sites $t \in \sigma_V^-$ as the sites where interacting particles are situated, and the sites $t \in \sigma_V^+$ as empty (vacuum) sites. We will use the following definitions. In the case X = {1, −1} a Gibbs potential $U = (U_A(\sigma_A), A \subset T, 0 < |A| < \infty)$ is called a lattice gas potential if

$$U_A(\sigma_A) = \begin{cases} U_A, & \text{if } \sigma_A^- = A, \\ 0 & \text{for all other } \sigma_A \in X^A, \end{cases} \tag{2.2}$$

where $\{U_A, A \subset T, 0 < |A| < \infty\}$ is a system of real numbers. We can treat $U_A$ as the energy of mutual interaction between particles which occupy the set A. We show below that the potentials $U_T^{\beta}$, the existence of which is stated in Theorem 1.1, can be constructed as lattice gas potentials. There is a constructive way to recover the values $U_A$ from the probability distribution of a Gibbs field with a lattice gas potential. Let W be a finite volume, X = {1, −1}, and $p_W$ be a probability distribution on the set $X^W$. Assume that $p_W$ is a Gibbs distribution with a lattice gas potential $U = \{U_A, A \subseteq W\}$. It means that

$$p_W(\sigma_W) = \frac{\exp\left\{-\sum_{A : A \subseteq \sigma_W^-} U_A\right\}}{Z_W^U}, \qquad \sigma_W \in X^W, \tag{2.3}$$

where the partition function

$$Z_W^U = \sum_{\sigma_W \in X^W} \exp\left\{-\sum_{A : A \subseteq \sigma_W^-} U_A\right\}. \tag{2.4}$$

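Let $D \subset W$ be a set, and let $\sigma_W^D$ be the configuration defined by (1.13), i.e. $(\sigma_W^D)^- = D$. It follows from (2.3) that

$$p_W(\sigma_W^D) = \frac{\exp\left\{-\sum_{A : A \subseteq D} U_A\right\}}{Z_W^U}. \tag{2.5}$$

In particular, for the case of the “vacuum” configuration $\sigma_W^{\emptyset}$ on W (i.e. $(\sigma_W^{\emptyset})^+ = W$) we have

$$p_W(\sigma_W^{\emptyset}) = (Z_W^U)^{-1}. \tag{2.6}$$

It follows from the relations (2.5) and (2.6) that

$$\sum_{A : A \subseteq D} U_A = -\ln \frac{p_W(\sigma_W^D)}{p_W(\sigma_W^{\emptyset})}. \tag{2.7}$$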

Recall now the well-known (and easily checked) Möbius inversion formula. Let Φ(·), Ψ(·) be a pair of functions defined on the set of all finite subsets A ⊆ R, where R is a finite or countable set. Then

$$\Psi(A) = \sum_{D \subseteq A} \Phi(D), \qquad A \subseteq R,\ |A| < \infty, \tag{2.8}$$

if and only if

$$\Phi(A) = \sum_{D \subseteq A} (-1)^{|A \setminus D|}\, \Psi(D), \qquad A \subseteq R,\ |A| < \infty. \tag{2.9}$$
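The equivalence of (2.8) and (2.9) can be checked numerically on a small ground set. The sketch below is only an illustration: the helper names (`subsets`, `zeta_transform`, `mobius_inverse`), the three-point ground set and the test values of Φ are assumptions made here, not part of the paper.

```python
from itertools import combinations

def subsets(s):
    """Yield every subset of s as a frozenset, including the empty set."""
    items = sorted(s)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def zeta_transform(phi, ground):
    """Psi(A) = sum of Phi(D) over all D contained in A, as in (2.8)."""
    return {A: sum(phi[D] for D in subsets(A)) for A in subsets(ground)}

def mobius_inverse(psi, ground):
    """Phi(A) = sum of (-1)^{|A \\ D|} Psi(D) over all D contained in A, as in (2.9)."""
    return {
        A: sum((-1) ** (len(A) - len(D)) * psi[D] for D in subsets(A))
        for A in subsets(ground)
    }

ground = frozenset({1, 2, 3})                                    # illustrative ground set
phi = {A: float(sum(A)) ** 2 + len(A) for A in subsets(ground)}  # arbitrary test values
psi = zeta_transform(phi, ground)
recovered = mobius_inverse(psi, ground)
max_error = max(abs(recovered[A] - phi[A]) for A in phi)
print(max_error)
```

Since D ⊆ A, the exponent |A \ D| is computed as |A| − |D|.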

It follows from the Möbius formula and the relation (2.7) that for any finite A ⊆ W,

$$U_A = -\sum_{D \subseteq A} (-1)^{|A \setminus D|} \ln \frac{p_W(\sigma_W^D)}{p_W(\sigma_W^{\emptyset})}. \tag{2.10}$$

Of course, we do not know yet that the measure $P_T^{\beta,+}$ is consistent with a partly defined Gibbs specification with a lattice gas potential U. But the previous discussion suggests the following definitions, which will be justified below. We will treat the measure $P_T^{\beta,+}$ as the limit, as N → ∞, of the projections of the Ising distributions $p_{W_N}^{\beta,+}$ (recall (1.7)) to the sets $T \cap W_N$ and write the analog of the formula (2.10) for these projections. For any N and $D \subseteq (T \cap W_N)$ we let

$$Q_{T,D}^{\beta,+,N} = -\ln \frac{p_{W_N}^{\beta,+}\left(\sigma_{W_N} : (\sigma_{W_N}|_{T \cap W_N})^- = D\right)}{p_{W_N}^{\beta,+}\left(\sigma_{W_N} : (\sigma_{W_N}|_{T \cap W_N})^- = \emptyset\right)}. \tag{2.11}$$

For any finite D ⊂ T we let

$$Q_{T,D}^{\beta,+} = \lim_{N \to \infty} Q_{T,D}^{\beta,+,N}, \tag{2.12}$$

if, of course, this limit exists. Let

$$U_{T,A}^{\beta,+,N} = \sum_{D \subseteq A} (-1)^{|A \setminus D|}\, Q_{T,D}^{\beta,+,N}, \tag{2.13}$$

$$U_{T,A}^{\beta,+} = \lim_{N \to \infty} U_{T,A}^{\beta,+,N} = \sum_{D \subseteq A} (-1)^{|A \setminus D|}\, Q_{T,D}^{\beta,+}. \tag{2.14}$$
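The recovery formula (2.10) can be tried out on a toy example. The following sketch builds a Gibbs distribution (2.3) from randomly chosen interaction energies on a hypothetical four-site volume and then reconstructs those energies from the probabilities alone. The volume, the random energies and all function names are assumptions made for illustration.

```python
import math
import random
from itertools import combinations

def subsets(s):
    """Yield every subset of s as a frozenset, including the empty set."""
    items = sorted(s)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

random.seed(0)
W = frozenset(range(4))  # hypothetical four-site volume
# Arbitrary energies U_A for the non-empty subsets A of W (illustrative only).
U = {A: random.uniform(-1.0, 1.0) for A in subsets(W) if A}

def weight(D):
    """Unnormalised weight of the configuration whose set of minus-sites is D, cf. (2.5)."""
    return math.exp(-sum(U[A] for A in subsets(D) if A))

Z = sum(weight(D) for D in subsets(W))      # partition function (2.4)
p = {D: weight(D) / Z for D in subsets(W)}  # Gibbs distribution (2.3)

def recover(A):
    """Reconstruct U_A from the probabilities only, using (2.10)."""
    vacuum = p[frozenset()]
    return -sum(
        (-1) ** (len(A) - len(D)) * math.log(p[D] / vacuum)
        for D in subsets(A)
    )

max_error = max(abs(recover(A) - U[A]) for A in U)
print(max_error)
```

Configurations in $X^W$ are indexed here by their sets of minus-sites, which is exactly the parametrisation $\sigma_W^D$ of (1.13).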

Theorem 2.1. In the situation of Theorem 1.1 all the limits (2.12), (2.14) exist. The measure $P_T^{\beta,+}$ is consistent with the partly defined Gibbs specification with the lattice gas potential $U_T^{\beta,+}$ which is defined by the relation (2.2), where the values $U_A = U_{T,A}^{\beta,+}$.

It is clear that the main Theorem 1.1 follows from Theorem 2.1. The main difficulty of its proof is a derivation of the appropriate estimates on the absolute values of the potentials $U_{T,A}^{\beta,+}$. Once obtained, they would imply easily the consistency of the measure $P_T^{\beta,+}$ with the potential $U_T^{\beta,+}$.

2.1. Bounds on $U_T^{\beta,+}$. Here we formulate these estimates, the derivation of which is the main content of the paper.

Proposition 2.2. A rough estimate. Let A ⊂ T be any finite set. Then

$$|U_A| \le 4\,(2\beta + C)\,|A|\,2^{|A|}. \tag{2.15}$$

The estimate (2.15) does not look very promising, and, what is worse, it seems that for some sets it cannot be improved significantly. These sets include segments and other sets of high density. Fortunately, the probability of the event that the segment A is occupied by the (−)-particles decays as exp{−4β|A|}, and so such a weak estimate as (2.15) by itself does not destroy our argument. The next statement supplements the estimate (2.15) with the bound needed to enable the proof of our main result.

Proposition 2.3. A refined estimate. Let ε > 0 be fixed. Then there exists a value β = β(ε), such that for all β > β(ε) and for all finite sets A ⊂ T with density ρ(A) below 1 − ε:

$$\rho(A) \equiv \frac{|A|}{\operatorname{diam}(A) + 1} < 1 - \varepsilon \tag{2.16}$$

we have

$$|U_A| \le \exp\{-\beta' \operatorname{diam}(A)\},$$

where β′ = β′(β) → ∞ as β → ∞.

3. The Meniscus

As the reader will see in the following, the case of projecting a two-dimensional field onto the one-dimensional sublattice has the best chances to be non-Gibbsian. The reason, roughly speaking, is this: the question about weak dependence of the spatially separated boxes $B_1, B_2$ of the sublattice $\mathbb{Z}^{d'} \subset \mathbb{Z}^d$ according to the projection of the random field P boils down to the following question: Consider two closed contours $\Gamma_1, \Gamma_2 \subset \mathbb{Z}^d$, enclosing the boxes $B_1, B_2$, respectively. Let the contours be distributed independently, each one governed by the field P. The question is about the behavior of the probability of the event that their intersection is nonempty, as a function of dist$(B_1, B_2)$. It is more or less clear that the lower the dimension of the contour is, the bigger are its fluctuations. In other words, the lower the dimension of the contour Γ, the higher are the chances to observe it at some distance from the enclosed set B. That is the reason why we think that the 2D case might be the most difficult one. To obtain our result we have therefore to control the fluctuations of

the contour surrounding a given set B, which is a segment, or punctured segment, in our case. We will do it by using the following meniscus theorem, which seems to be interesting on its own. The meniscus theorem describes the phases of the semi-infinite Ising model, which correspond to various possible slopes of the interface as it approaches the wall. Namely, we consider the 2D Ising model on the half-infinite lattice $(\mathbb{Z}^2)^+ = \{(x, y) \in \mathbb{Z}^2 : y > 0\}$, and we are interested in the set of Gibbs states corresponding to the following boundary condition on the x-axis:

$$\bar\sigma_x = \begin{cases} +1 & \text{for } x \ge 0, \\ -1 & \text{for } x < 0. \end{cases}$$

To remind the reader of the definition of the Gibbs field on $(\mathbb{Z}^2)^+$, corresponding to the Ising model with a given boundary condition, we need some notation. Let $\Lambda \subset (\mathbb{Z}^2)^+$ be a finite subset. By ∂Λ we denote the usual boundary of $\Lambda \subset \mathbb{Z}^2$, and by $\partial^+\Lambda \subset \partial\Lambda$ the intersection $\partial\Lambda \cap (\mathbb{Z}^2)^+$. A random field on the set of all configurations on $(\mathbb{Z}^2)^+$ is called the Gibbs field with Ising interaction and boundary conditions $\bar\sigma$, iff its conditional distribution in any finite Λ given the configuration $\sigma^c$ outside Λ is given by the usual Gibbs formula for the Ising interaction in the case $\partial^+\Lambda = \partial\Lambda$, while in the remaining case $\partial\Lambda \setminus \partial^+\Lambda \ne \emptyset$ we also use the same formula where we supplement the configuration $\sigma^c|_{\partial^+\Lambda}$ by the restriction $\bar\sigma|_{\partial\Lambda \setminus \partial^+\Lambda}$. To formulate the meniscus theorem we introduce the following configurations, which will be used as boundary conditions. Let $n = (n_x, n_y)$ be any unit vector in $\mathbb{R}^2$ with $n_x > 0$. We define the configuration $\sigma^n$ by

$$\sigma^n(x) = \begin{cases} +1 & \text{for } n \cdot x \ge 0, \\ -1 & \text{for } n \cdot x < 0. \end{cases}$$

Denote by $V_N$ the box

$$V_N = \{(x, y) \in \mathbb{Z}^2 : -N < x < N,\ 0 < y < N\}, \tag{3.1}$$

and let $\langle\cdot\rangle^{\beta}_{N,n}$ be the Gibbs state in $V_N$, corresponding to the Ising model at inverse temperature β with boundary condition $\sigma^n$. We also introduce the notation h(N, n) for the integer point on the line y = N closest to the line n · x = 0.

Theorem 3.1 (The meniscus theorem). If β is large enough, then

1. The thermodynamic limit $\langle\cdot\rangle^{\beta}_{n} = \lim_{N \to \infty} \langle\cdot\rangle^{\beta}_{N,n}$ exists for every n.

2. The states $\langle\cdot\rangle^{\beta}_{n}$ are mutually different Gibbs states of the Ising model on the half-lattice $(\mathbb{Z}^2)^+$, corresponding to the boundary condition $\bar\sigma$.

For applications of this result to the problem of Gibbsianity we need a certain property of the phases $\langle\cdot\rangle^{\beta}_{n}$. This property is formulated in terms of the contours of configurations. So in the next subsection we remind the reader of the relevant definitions.

3.1. Peierls contours. Let $\mathbb{Z}^2$ be the two-dimensional integer lattice. Assume that $\mathbb{Z}^2 \subset \mathbb{R}^2$. Let $\mathbb{Z}^2_*$ be the conjugate lattice with vertices $(n_1 + 1/2, n_2 + 1/2)$, $n_1, n_2 \in \mathbb{Z}^1$. Let E be the set of edges of the conjugate lattice, i.e. the set of all closed intervals of the length 1 connecting the adjacent points of this lattice. For each edge e ∈ E there are two vertices of the original lattice $\mathbb{Z}^2$ with the distance 1/2 from e. We say that they are vertices adjacent to the edge e. Let $W \subset \mathbb{Z}^2$ be a finite set. The set of edges e ∈ E such

that at least one of two points to which the edge e is adjacent belongs to W is called the set of edges in the volume W and is denoted by E(W). Return now to the Ising distribution with the plus-boundary condition. The boundary $B(\sigma_W)$ of the configuration $\sigma_W = (\sigma_t, t \in W) \in X^W$, where X = {1, −1}, is defined as the set of all edges e ∈ E(W) such that either both points $t, t' \in \mathbb{Z}^2$ adjacent to e belong to W and $\sigma_t \ne \sigma_{t'}$, or one of these two points t ∈ W, the other one $t' \notin W$ and $\sigma_t = -1$. As usual, we represent the boundary $B(\sigma_W)$ as a sum of contours, but to do it in a unique way it is necessary to be careful in the definition of contours. Let $e_1, e_2 \in E$ be two distinct edges containing a common vertex $t = (t_1, t_2) \in \mathbb{Z}^2_*$. We say that these edges make a legitimate turn, if either one of these edges connects the vertex t with the vertex $(t_1 + 1, t_2)$ and the other edge connects it with the vertex $(t_1, t_2 + 1)$, or if one of these edges connects the vertex t with the vertex $(t_1 - 1, t_2)$ and the other edge connects it with the vertex $(t_1, t_2 - 1)$. A contour is defined as a sequence $e_1, e_2, \dots, e_k$ of mutually distinct edges such that the edges $e_i, e_{i+1}$, i = 1, 2, …, k (here the index k + 1 is identified with 1) have a common vertex, and in case there is another pair of edges $e_{i'}, e_{i'+1}$ of this contour having the same common vertex, the edges $e_i, e_{i+1}$ make a legitimate turn. The set of all contours is denoted by G. We say that a contour Γ ∈ G is a contour in the volume W, if all its edges belong to E(W). The set of all such contours is denoted by G(W). The number of edges in a contour Γ ∈ G is denoted |Γ| and is called the length of this contour. The set of all points $t \in \mathbb{Z}^2$ such that there is no continuous curve in $\mathbb{R}^2$ which does not intersect the contour Γ and connects the point $t \in \mathbb{Z}^2 \subset \mathbb{R}^2$ to “infinity” will be called the interior of the contour Γ and will be denoted by Int Γ.

We say that the contours $\Gamma_1$ and $\Gamma_2$ are compatible if they have no common edges and if at any vertex which is contained in both contours these contours make legitimate turns. We say that a finite system of contours π ⊂ G is a compatible system of contours, if any two different contours in π are compatible. Let H be the set of all systems of compatible contours and H(W) ⊆ H be the set of all compatible systems π ⊆ G(W) of contours in a volume W (the set H(W) includes the empty system of contours). It is easy to understand that for any finite volume W and any configuration $\sigma_W \in X^W$ there exists a unique system of contours $\pi(\sigma_W) \in H(W)$ such that

$$B(\sigma_W) = \bigcup_{\Gamma \in \pi(\sigma_W)} \Gamma. \tag{3.2}$$

Further, for any system of contours π ∈ H(W) there is a unique configuration $\sigma_W(\pi) \in X^W$ such that

$$\pi(\sigma_W(\pi)) = \pi. \tag{3.3}$$

For any point $t \in \mathbb{Z}^2$ denote by O(t) the set of all contours Γ ∈ G such that t ∈ Int Γ. The configuration $\sigma_W = (\sigma_t, t \in W) \in X^W$ can be reconstructed from the contour system $\pi(\sigma_W)$ with the help of the relation

$$\sigma_t = \begin{cases} +1, & \text{if } |O(t) \cap \pi(\sigma_W)| \text{ is even}, \\ -1, & \text{if } |O(t) \cap \pi(\sigma_W)| \text{ is odd}. \end{cases} \tag{3.4}$$

Let Γ be a fixed contour. Define the subset $\Delta(\Gamma) \subset \mathbb{Z}^2$ by the relation:

$$\Delta(\Gamma) = \mathbb{Z}^2 \setminus \Bigl( \bigcup_{\gamma :\, \gamma \text{ is compatible with } \Gamma} \operatorname{Int} \gamma \Bigr).$$

It is a subset of all sites of the lattice which are at the distance not greater than $\frac{\sqrt{2}}{2}$ from Γ. The set Δ(π) is defined in the obvious way.

3.2. Localization of the meniscus. We return now to the meniscus. Note first that if β is large enough, then with $\langle\cdot\rangle^{\beta}_{n}$-probability 1 every configuration σ on $(\mathbb{Z}^2)^+$ contains exactly one infinite contour, and it has an endpoint at the point $(-\frac{1}{2}, 0)$. Let us denote this contour by Γ = Γ(σ), and let $[\gamma_1(\sigma), \gamma_2(\sigma)] \subseteq \mathbb{R}^1$ be the (random) segment of the x-axis, obtained by projecting the contour Γ onto $\mathbb{R}^1$ (the cases of the projection being semi-infinite or infinite are not excluded, of course).

Theorem 3.2 (The theorem on the localization of the meniscus). Suppose additionally to the conditions of the Meniscus Theorem that $n_y > 0$. (That means that the contours Γ go typically to the north-west.) Then the $\langle\cdot\rangle^{\beta}_{n}$-probability of the event $\{\sigma : \gamma_2(\sigma) \ge l\}$, l > 0, is bounded from above by

$$\exp\{-c_n \beta l\} \tag{3.5}$$

with $c_n > 0$. (The positivity of $c_n$ is not uniform in n, of course.)

The above theorem estimates the probability of the event that the contour Γ crosses the vertical line and deviates from it into the positive quadrant by the distance l. A similar result holds for the probabilities of crossing and deviating by a distance l from other straight lines, passing through the origin. However, if the line has equation y = kx with the slope k < 0, $k \ne -\frac{n_x}{n_y}$, then the corresponding estimate is of the form $\exp\{-c_k l\}$, and the positive exponent $c_k$ does not diverge as β → ∞, unlike (3.5). The reason is that such deviations are not suppressed even at zero temperature. The following corollary of the above result is crucial for our purposes. Consider again the box $V_N$ (see (3.1)) and endow it with the boundary condition $\bar\sigma$, which is +1 on the left, top and right border, as well as on the segment [0, N], and is different from being identically +1 on the remaining part of the boundary. Then every configuration $\tilde\sigma$ in $V_N$ has a certain amount of open contours $\Gamma_i(\tilde\sigma)$ attached to the boundary of $V_N$. Denote by $[\tilde\gamma_1(\tilde\sigma), \tilde\gamma_2(\tilde\sigma)]$ the smallest segment of the x-axis, containing the projections of all the contours $\Gamma_i(\tilde\sigma)$. Let $\langle\cdot\rangle^{\beta}_{\bar\sigma,N}$ be the corresponding Gibbs state in $V_N$. Then the following statement holds:

Corollary 3.3. The $\langle\cdot\rangle^{\beta}_{\bar\sigma,N}$-probability of the event $\{\tilde\sigma : \tilde\gamma_2(\tilde\sigma) \ge l\}$, l > 0, is bounded from above by

$$\exp\{-c\beta l\} \tag{3.6}$$

with c > 0, uniformly in $\bar\sigma$, N.

Proof. The state $\langle\cdot\rangle^{\beta}_{\bar\sigma,N}$ can be coupled with any of the states $\langle\cdot\rangle^{\beta}_{n}$ in such a way that the former is higher than the latter in the FKG sense. (That means that with probability one, according to the coupling measure, $\tilde\sigma(t) \ge \sigma(t)$ for every $t \in V_N$.) In particular, the region between the contour Γ(σ) and the ray $(-\infty, -\frac{1}{2}]$ of the x-axis contains all the contours $\Gamma_i(\tilde\sigma)$, and so $\gamma_2(\sigma) \ge \tilde\gamma_2(\tilde\sigma)$. So the corollary holds with $c = c_n$ for arbitrary n with positive coordinates $n_x, n_y$.

3.3. Properties of the phase separation line. The reader who is familiar with the book [DKS] would note that the above statements are quite close to those in its Chapter 4, where the question about large deviations of the phase separation line is discussed. There the deviations were studied for two ensembles: one was the canonical ensemble of the phase separation lines $S \in I_{N,n}$, connecting two points $(\frac{1}{2}, \frac{1}{2})$ and $(h(N, n), N - \frac{1}{2})$, while the other was the grand canonical ensemble of the phase separation lines, which were starting at $(\frac{1}{2}, \frac{1}{2})$, but which were terminating on the line $y = N - \frac{1}{2}$, with the position h(S) of the endpoint on this line randomly distributed in such a way that its mean value was h(N, n), and the distribution was asymptotically normal with the standard deviation of the order of $\sqrt{N}$. Before discussing these results further, we recall some notions introduced in [DKS].

3.3.1. The ensemble of tame animals. We start by considering the canonical ensemble of tame animals, which is defined as a measure on the set

$$I_{N,n}^{\infty} = \{S \in I_{N,n} : |S \cap \{(x, y) : y = m\}| = 1 \text{ for all } m = 0, 1, \dots, N\} \tag{3.7}$$

of all SOS-trajectories S, starting at $(\frac{1}{2}, \frac{1}{2})$ and terminating at $(h(N, n), N - \frac{1}{2})$. We denote by k(r, S) the abscissa of the trajectory S at the level r:

$$k(r, S) = S \cap \{(x, y) : y = r\}. \tag{3.8}$$

We introduce the distribution

$$P_{N,n}^{\infty}(S) = (\Xi(N, n, \infty))^{-1} \exp\{-2\beta|S|\}, \qquad S \in I_{N,n}^{\infty}, \tag{3.9}$$

with the partition function

$$\Xi(N, n, \infty) = \sum_{S \in I_{N,n}^{\infty}} \exp\{-2\beta|S|\}. \tag{3.10}$$

The main tool in investigating this ensemble is the passage to the grand canonical ensemble, which is defined as a family of measures on the union

$$I_N^{\infty} = \bigcup_{n} I_{N,n}^{\infty}, \tag{3.11}$$

indexed by the real parameter H:

$$P_{N,H}^{\infty}(S) = (\Xi(N, H, \infty))^{-1} \exp\{-2\beta|S| + \beta H h(S)\}, \qquad S \in I_N^{\infty}, \tag{3.12}$$

with the partition function

$$\Xi(N, H, \infty) = \sum_{S \in I_N^{\infty}} \exp\{-2\beta|S| + \beta H h(S)\}. \tag{3.13}$$

It is not difficult to calculate explicitly the partition function (3.13) of the ensemble of tame animals. Supposing that −2 < H < 2 we have

$$\Xi(N, H, \infty) = e^{2\beta} (Q_H)^N \tag{3.14}$$

with

$$Q_H = \sum_{k=-\infty}^{+\infty} \exp\{-2\beta(|k| + 1) + \beta H k\} = e^{-2\beta}\, \frac{\sinh(2\beta)}{\cosh(2\beta) - \cosh(H\beta)}. \tag{3.15}$$

The position h(S) of the endpoint of the polygon S on the line $y = N - \frac{1}{2}$ is a random variable in the ensemble (3.12). It equals the sum of N identically distributed random variables with the probability distribution

$$P_H^{\infty}(k) = Q_H^{-1} \exp\{-2\beta(|k| + 1) + \beta H k\}, \tag{3.16}$$

the mean value

$$M_H^{\infty} = \beta^{-1} \frac{\partial \log Q_H}{\partial H} = \frac{\sinh(H\beta)}{\cosh(2\beta) - \cosh(H\beta)}, \tag{3.17}$$

and the variance

$$D_H^{\infty} = \beta^{-2} \frac{\partial^2 \log Q_H}{\partial H^2} = \frac{\cosh(2\beta)\cosh(H\beta) - 1}{(\cosh(2\beta) - \cosh(H\beta))^2}. \tag{3.18}$$
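The closed forms (3.15), (3.17) and (3.18) can be compared with direct summation over the step distribution (3.16). The sketch below does this for one arbitrarily chosen pair (β, H); the truncation level `kmax`, the parameter values and the function names are assumptions made for the illustration.

```python
import math

def step_weights(beta, H, kmax=400):
    """Unnormalised weights exp{-2 beta (|k|+1) + beta H k} for |k| <= kmax, cf. (3.16)."""
    ks = list(range(-kmax, kmax + 1))
    ws = [math.exp(-2 * beta * (abs(k) + 1) + beta * H * k) for k in ks]
    return ks, ws

def q_closed(beta, H):
    """Right-hand side of (3.15)."""
    return math.exp(-2 * beta) * math.sinh(2 * beta) / (
        math.cosh(2 * beta) - math.cosh(H * beta))

def mean_closed(beta, H):
    """Right-hand side of (3.17)."""
    return math.sinh(H * beta) / (math.cosh(2 * beta) - math.cosh(H * beta))

def var_closed(beta, H):
    """Right-hand side of (3.18)."""
    return (math.cosh(2 * beta) * math.cosh(H * beta) - 1) / (
        math.cosh(2 * beta) - math.cosh(H * beta)) ** 2

beta, H = 1.0, 0.7                       # illustrative values with -2 < H < 2
ks, ws = step_weights(beta, H)
q_series = sum(ws)                       # truncated series for Q_H
mean_series = sum(k * w for k, w in zip(ks, ws)) / q_series
var_series = sum((k - mean_series) ** 2 * w for k, w in zip(ks, ws)) / q_series

print(q_series, q_closed(beta, H))
print(mean_series, mean_closed(beta, H))
print(var_series, var_closed(beta, H))
```

The series converges only for −2 < H < 2, where the weights decay geometrically in both directions, so the truncation error is negligible for the values used here.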

According to the standard local limit theorem for sums of independent random variables (see for example [Gn]), as N → ∞ one has

$$P_{N,H}^{\infty}(\{S : h(S) = b_N\}) \sim \frac{1}{\sqrt{2\pi N D_H^{\infty}}} \exp\left\{-\frac{1}{2 N D_H^{\infty}} (b_N - N M_H^{\infty})^2\right\}, \tag{3.19}$$

whenever a sequence of integers $b_N$ is chosen in such a way that the quantity

$$\frac{|b_N - M_H^{\infty} N|}{N^{1/2}}$$

is bounded uniformly in N.

3.3.2. The ensemble of wild animals. The “true” phase separation line of the Ising model is a small perturbation of the ensemble introduced above, see [DKS]. The corresponding ensemble $I_{N,n}$ of the separation lines is again formed by the polygons S of the dual lattice, starting at $(\frac{1}{2}, \frac{1}{2})$ and terminating at $(h(N, n), N - \frac{1}{2})$, but this time it is not required that the intersections $S \cap \{(x, y) : y = r\}$ are singletons. Therefore, instead of the random variable k(r, S) (see (3.8)) we introduce two new variables:

$$\underline{k}(r, S) = \min\{x : S \cap \{(x, y) : y = r\} \ne \emptyset\},$$
$$\bar{k}(r, S) = \max\{x : S \cap \{(x, y) : y = r\} \ne \emptyset\}.$$

The probability distribution $P_{N,n}(S)$ (compare with (3.9)) is the one induced on the set $I_{N,n}$ of separation lines by the Ising model random field $\langle\cdot\rangle^{\beta}_{N,n}$, while the probability distribution $P_{N,H}(S)$ on $I_N = \bigcup_n I_{N,n}$ is obtained as a normalized mixture of the distributions $P_{N,n}(S)$ with the weights exp{βHh(S)}, compare with (3.12). Turning back to the meniscus problem, the event dev$(r, c_k)$ we are interested in is the following one: the random line deviates at the level r from the ray of its expected values by an amount linear in the distance from the starting point, i.e. by $c_k r$. In the ensemble $I_N$ its probability is given by

$$P_{N,H}(\operatorname{dev}(r, c_k)) \equiv P_{N,H}(r, c_k) \equiv \sum_{\substack{S \in I_N :\ \bar{k}(r,S) - r M_H \ge c_k r \\ \text{or } r M_H - \underline{k}(r,S) \ge c_k r}} P_{N,H}(S). \tag{3.20}$$

The result of [DKS] is that this probability is exponentially small in r:

$$P_{N,H}(r, c_k) \le A \exp\{-r \Psi(c_k)\}, \tag{3.21}$$

with Ψ(·) a positive function, defined on a positive semiaxis. This statement is almost what we need, though the estimate does not include the dependence on β. (To have such dependence one has to put restrictions on the slopes $c_k$ and $M_H$.) However, when one looks at the situation in the canonical ensemble, it is much less satisfactory; the corresponding estimate of [DKS] reads:

$$P_{N,n}(r, c_k) \equiv \sum_{\substack{S \in I_{N,n} :\ \bar{k}(r,S) - r M_H \ge c_k r \\ \text{or } r M_H - \underline{k}(r,S) \ge c_k r}} P_{N,n}(S) \le A \sqrt{N} \exp\{-r \Psi(c_k)\}. \tag{3.22}$$

This estimate was enough for the purposes of [DKS] of studying the surface tension. To obtain (3.22) from (3.21) is very easy:

$$P_{N,H}(r, c_k) \ge P_{N,H}(r, c_k;\, h(S) = h(N, n)) \equiv P_{N,H}\bigl(r, c_k \,\big|\, h(S) = h(N, n)\bigr)\, P_{N,H}(h(S) = h(N, n)) \equiv P_{N,n}(r, c_k)\, P_{N,H}(h(S) = h(N, n)),$$

and we get the desired bound by using the statement that the random variable h(S) has the standard local limit behavior with the variance of the order of N. It is clear however that the true estimate in (3.22) should be of the same order as in (3.21), and the probability $P_{N,H}(r, c_k)$ should in fact be asymptotically equal to the conditional probability $P_{N,H}(r, c_k \mid h(S) = h(N, n))$, since the conditioning is done by fixing the random variable h(S) to be equal to its mean value (which belongs to the region of its typical values). As we will see in the following, the corresponding improvement can indeed be done.

In this paper we will not prove the Meniscus Theorem, relegating it to the forthcoming publication. What we will establish is the Meniscus Localization Theorem for the particular choice of the vector $n = (\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$. Since the only thing we need for the purposes of the present paper is the above mentioned Corollary 3.3, it will be enough. The relevant events we should study are that the observables k(r, S) reach some threshold values l > 0. Note however, that the probability of the event k(r, S) ≥ l only increases when the size N of the box $V_N$ increases. So for any fixed r we can choose N as large as is convenient for us.

3.3.3. Calculations in the grand canonical ensemble of tame animals. We start by estimating the probability of the deviation we are interested in, in the grand canonical ensemble $P_{N,H}^{\infty}$ of tame animals. In accordance with our choice of the direction n, the field H has to satisfy the equation

$$M_H^{\infty} \equiv \beta^{-1} \frac{\partial \log Q_H}{\partial H} = \frac{\sinh(H\beta)}{\cosh(2\beta) - \cosh(H\beta)} = -1. \tag{3.23}$$
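Equation (3.23) admits a closed-form root, which gives a quick check of its large-β behavior. Since sinh(Hβ) − cosh(Hβ) = −exp{−Hβ}, the equation is equivalent to exp{−Hβ} = cosh(2β), that is H = −ln(cosh 2β)/β = −2 + (ln 2)/β − ln(1 + exp{−4β})/β. The sketch below verifies this numerically; the function names and the sample values of β are assumptions made for the illustration.

```python
import math

def mean_step(beta, H):
    """Mean step M_H from (3.17)."""
    return math.sinh(H * beta) / (math.cosh(2 * beta) - math.cosh(H * beta))

def field_for_slope_minus_one(beta):
    """Closed-form root of mean_step(beta, H) = -1, i.e. exp(-H beta) = cosh(2 beta)."""
    return -math.log(math.cosh(2 * beta)) / beta

def leading_terms(beta):
    """The two leading terms -2 + ln(2)/beta of the root for large beta."""
    return -2.0 + math.log(2.0) / beta

betas = [1.0, 2.0, 4.0, 8.0]
residuals = [abs(mean_step(b, field_for_slope_minus_one(b)) + 1.0) for b in betas]
gaps = [abs(field_for_slope_minus_one(b) - leading_terms(b)) for b in betas]

for b, res, gap in zip(betas, residuals, gaps):
    print(b, res, gap)
```

The gap between the exact root and its two leading terms equals ln(1 + exp{−4β})/β, so it is exponentially small in β and in particular o(1/β).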

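The function $M_H^{\infty}$ is analytic and increasing, so the inverse function $H^{\infty}(M)$ is also analytic and increasing. The solution to Eq. (3.23) is $H = H^{\infty}(-1) = -2 + \frac{\ln 2}{\beta} + o(\frac{1}{\beta})$. The event we are occupied with is that the observable k(r, S) reaches some threshold l > 0, 0 < r < N. Denoting this probability by $P_{N,H}^{\infty}(r, l)$, we have for every choice of the auxiliary field K that

$$\begin{aligned} P_{N,H}^{\infty}(r, l) &\equiv \sum_{S \in I_N^{\infty} :\, k(r,S) = l} \frac{\exp\{-2\beta|S| + \beta H h(S)\}}{\Xi(N, H, \infty)} \\ &\equiv e^{-\beta K l} \sum_{S \in I_N^{\infty} :\, k(r,S) = l} \frac{\exp\{-2\beta|S| + \beta H h(S) + \beta K k(r, S)\}}{\Xi(N, H, \infty)} \\ &\le e^{-\beta K l}\, \frac{\Xi(N, H, r, K, \infty)}{\Xi(N, H, \infty)}, \end{aligned} \tag{3.24}$$

where

$$\Xi(N, H, r, K, \infty) = \sum_{S \in I_N^{\infty}} \exp\{-2\beta|S| + \beta H h(S) + \beta K k(r, S)\}.$$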

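(The estimate (3.24) is the standard Cramér tilt.) The straightforward calculations (3.14), (3.15) imply that

$$P_{N,H}^{\infty}(r, l) \le e^{-\beta K l} \left( \frac{\cosh(2\beta) - \cosh(H\beta)}{\cosh(2\beta) - \cosh((H + K)\beta)} \right)^{r}. \tag{3.25}$$

(Note that this estimate does not depend on N.) The choice of K is up to us. The approximate optimization in (3.25) suggests the following choice for K:

$$K + H = \begin{cases} 2 - \frac{1}{\beta} \ln\left(1 + \frac{r}{l}\right) & \text{for } \frac{r}{l} < \frac{e^{2\beta}}{2}, \\[4pt] \frac{1}{\beta} \frac{l}{r}\, e^{2\beta} & \text{for } \frac{r}{l} \ge \frac{e^{2\beta}}{2}. \end{cases}$$

The choice which we will use and which is simpler to handle is the following:

$$K + H = \begin{cases} -H & \text{for } r \le l, \\[4pt] 2 - \frac{1}{\beta} \ln\left(1 + \frac{r}{l}\right) & \text{for } l < r < \frac{e^{2\beta}}{2}\, l, \\[4pt] \frac{1}{\beta} \frac{l}{r}\, e^{2\beta} & \text{for } r \ge \frac{e^{2\beta}}{2}\, l. \end{cases} \tag{3.26}$$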

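Hence we have the estimate

$$\begin{aligned} P_{N,H}^{\infty}(l) \equiv \sum_{r} P_{N,H}^{\infty}(r, l) &\le e^{2\beta H l} \sum_{r \le l} 1 + e^{\beta H l} \Biggl[ \sum_{l < r < \frac{e^{2\beta}}{2} l} \left( \frac{l + r}{2r} \right)^{r} + \sum_{r > \frac{e^{2\beta}}{2} l} \left( \frac{1}{2} \right)^{r} \Biggr] \\ &\le e^{2\beta H l}\, l + e^{\beta H l} \Biggl[ l + \sum_{r > 2l} \left( \frac{3}{4} \right)^{r} + \sum_{r > \frac{e^{2\beta}}{2} l} \left( \frac{1}{2} \right)^{r} \Biggr] \\ &< e^{-\beta l}, \end{aligned} \tag{3.27}$$

provided β is large enough.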
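The tilting bound (3.24)-(3.25) can be illustrated on a small example. The sketch below assumes that k(r, S) is modeled as a sum of r independent steps with the law (3.16), computes the probability that this sum equals l by exact convolution on a truncated support, and compares it with the right-hand side of (3.25) for several values of K. The truncation level, the parameter values and the function names are assumptions made for the illustration.

```python
import math

def step_law(beta, H, kmax):
    """Step distribution (3.16), truncated to |k| <= kmax and renormalised."""
    ws = [math.exp(-2 * beta * (abs(k) + 1) + beta * H * k)
          for k in range(-kmax, kmax + 1)]
    total = sum(ws)
    return [w / total for w in ws]

def convolve(p, q):
    """Distribution of the sum of two independent integer variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        if pi == 0.0:
            continue
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def prob_sum_equals(beta, H, r, l, kmax=60):
    """P(k_1 + ... + k_r = l) for i.i.d. steps with the truncated law."""
    law = step_law(beta, H, kmax)
    dist = [1.0]
    for _ in range(r):
        dist = convolve(dist, law)
    return dist[l + r * kmax]  # index 0 corresponds to the value -r * kmax

def cramer_bound(beta, H, K, r, l):
    """Right-hand side of (3.25); requires -2 < H + K < 2."""
    ratio = (math.cosh(2 * beta) - math.cosh(H * beta)) / (
        math.cosh(2 * beta) - math.cosh((H + K) * beta))
    return math.exp(-beta * K * l) * ratio ** r

beta = 1.0
H = -math.log(math.cosh(2 * beta)) / beta  # field with mean step -1, cf. (3.23)
r, l = 5, 3                                # illustrative level and threshold
exact = prob_sum_equals(beta, H, r, l)
bounds = {K: cramer_bound(beta, H, K, r, l) for K in (0.5, 1.0, 2.0, 3.0)}

print(exact)
for K, bound in bounds.items():
    print(K, bound)
```

Every admissible K gives a valid upper bound; the choices in (3.26) are approximate minimizers, which is why no optimization over K is attempted in this sketch.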

3.3.4. Calculations in the canonical ensemble of tame animals. In complete analogy with (3.24) we obtain that in the canonical ensemble

$$P_{N,n}^{\infty}(r, l) \le e^{-\beta K l}\, \frac{\Xi(N, n, r, K, \infty)}{\Xi(N, n, \infty)}, \tag{3.28}$$

where

$$\Xi(N, n, \infty) = \sum_{S \in I_N^{\infty} :\, h(S) = h(N,n)} \exp\{-2\beta|S|\},$$

$$\Xi(N, n, r, K, \infty) = \sum_{S \in I_N^{\infty} :\, h(S) = h(N,n)} \exp\{-2\beta|S| + \beta K k(r, S)\}.$$

To proceed, we use the following relations between the canonical and grand canonical partition functions: for every H:

$$\Xi(N, n, \infty) = P_{N,H}^{\infty}\{S : h(S) = h(N, n)\} \times \Xi(N, H, \infty) \exp\{-\beta H h(N, n)\},$$

$$\Xi(N, n, r, K, \infty) = P_{N,H,r,K}^{\infty}\{S : h(S) = h(N, n)\} \times \Xi(N, H, r, K, \infty) \exp\{-\beta H h(N, n)\}.$$

Of course, the distribution $P_{N,H,r,K}^{\infty}$ is defined on $I_N^{\infty}$ by

$$P_{N,H,r,K}^{\infty}\{S\} = \frac{\exp\{-2\beta|S| + \beta H h(S) + \beta K k(r, S)\}}{\Xi(N, H, r, K, \infty)}.$$

The substitution to (3.28) of the last two relations (with different values H, $\tilde H$) gives

$$P_{N,n}^{\infty}(r, l) \le e^{-\beta K l} \times \frac{P_{N,\tilde H,r,K}^{\infty}\{S : h(S) = h(N, n)\}}{P_{N,H}^{\infty}\{S : h(S) = h(N, n)\}} \times \frac{\Xi(N, \tilde H, r, K, \infty)}{\Xi(N, H, \infty)} \times \frac{\exp\{-\beta \tilde H h(N, n)\}}{\exp\{-\beta H h(N, n)\}}. \tag{3.29}$$

(Here, in contrast with (3.25), we have the N-dependence.) We obtain the best possible bound by choosing the magnetic fields H, $\tilde H$ in such a way that the expectations of the random variable h(S) are equal to the same value h(N, n) under both $P_{N,H}^{\infty}$ and $P_{N,\tilde H,r,K}^{\infty}$. Under that choice the first ratio in (3.29) is less than 1, provided N is large enough. To see this we note first that the distributions of h(S) under both $P_{N,H}^{\infty}$ and $P_{N,\tilde H,r,K}^{\infty}$ are asymptotically normal, and that the standard local limit theorem holds for them. Lemma 3.4, proven below, tells us that the variance of the random variable h(S) under $P_{N,H}^{\infty}$ is not bigger than the one under $P_{N,\tilde H,r,K}^{\infty}$, so that we can use the knowledge of the limit behavior of h(S) and apply the relation (3.19). Also, the difference $|\tilde H - H| \to 0$ as N → ∞, while the product $(\tilde H - H)h(N, n)$ tends to the derivative $\frac{d}{dM} H^{\infty}(-1)$. This derivative is finite, and that takes care of the last factor. (In the present situation also a different choice is possible: H is chosen in the way prescribed, while $\tilde H$ is taken to be equal to H. In such a case the first ratio in (3.29) goes to 1 as N → ∞, since the random variables k(r, S) and h(S) − k(r, S) are independent under the distribution $P_{N,H,r,K}^{\infty}$. Hence the random variables h(S) under both $P_{N,H,r,K}^{\infty}$ and $P_{N,H}^{\infty}$ have the local limit behavior with the divergent variances, and these variances are asymptotically equal.) After these remarks the estimate (3.29) is reduced to (3.24). Of course, the values of N for which this reduction is valid depend on (r, l), but this is

irrelevant for our purposes. Thus, in the limit N → ∞ we have the following analog of (3.27):

$$P_{N=\infty,n}^{\infty}(l) \equiv \sum_{r} P_{N=\infty,n}^{\infty}(r, l) < e^{-\beta l}.$$

3.3.5. Properties of the variance. Here we will study the properties of the variance of the random variable h(S) when the trajectory S is subject to a varying “field”. To be specific, we suppose that for a given fraction λ of the total life-span N the trajectory is under the influence of the “field” K, while after that time the field has a different value L. If we denote by $h_\lambda(S)$ the location of the polygon S at the “time” [λN], then the probability distribution we are interested in is given by

$$P_{N,\lambda,K,L}^{\infty}(S) = e^{-2\beta} (Q_K)^{-[\lambda N]} (Q_L)^{-(N - [\lambda N])} \exp\{-2\beta|S| + \beta K h_\lambda(S) + \beta L (h(S) - h_\lambda(S))\}, \qquad S \in I_N^{\infty}. \tag{3.30}$$

To simplify the notation we will consider hereafter only the case when the total time length is 2N, while $\lambda = \frac{1}{2}$. The corresponding measure (3.30) will be denoted by $P_{2N,K,L}^{\infty}(S)$. Let $2N M_{K,L}^{\infty}$ be the mean value of the random variable h(S) according to the distribution $P_{2N,K,L}^{\infty}(S)$. Clearly,

$$2N M_{K,L}^{\infty} = N M_K^{\infty} + N M_L^{\infty}. \tag{3.31}$$

We are interested in the one-parameter families of the distributions $P_{2N,K,L}^{\infty}(S)$, for which

$$M_{K,L}^{\infty} = \text{const} \equiv C. \tag{3.32}$$

(This restriction is natural, since our main interest is in the canonical ensemble, when the endpoint of the polygon S is fixed; the restriction (3.32) means that the endpoint is fixed “in the mean”.) The relation between K and L is then the following, according to (3.17) and (3.31):

$$\frac{\sinh(K\beta)}{\cosh(2\beta) - \cosh(K\beta)} + \frac{\sinh(L\beta)}{\cosh(2\beta) - \cosh(L\beta)} = 2C. \tag{3.33}$$

What we are interested in is the behavior of the variance $D_{K,L}^{\infty}$, which is defined naturally by:

$$2N D_{K,L}^{\infty} = N D_K^{\infty} + N D_L^{\infty} \tag{3.34}$$

(see (3.18)).

Lemma 3.4. Let C be any real number, and consider the function $D_{K,L}^{\infty}$ restricted to the curve (3.33). Let $H_C$ be the value of magnetic field such that

$$M_{H_C,H_C}^{\infty} = C. \tag{3.35}$$

Then for any K, L satisfying $M_{K,L}^{\infty} = C$, we have

$$D_{K,L}^{\infty} \ge D_{H_C,H_C}^{\infty}. \tag{3.36}$$
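The statement of Lemma 3.4 can be probed numerically before reading the proof. The sketch below fixes β and C, scans a grid of fields K, solves the constraint $M_K^\infty + M_L^\infty = 2C$ for L by bisection, and records the smallest value of $D_{K,L}^\infty - D_{H_C,H_C}^\infty$ found. The grid, the bisection bracket and the function names are assumptions made for the illustration, and the scan is a sanity check, not a proof.

```python
import math

def mean_step(beta, H):
    """M_H from (3.17)."""
    return math.sinh(H * beta) / (math.cosh(2 * beta) - math.cosh(H * beta))

def var_step(beta, H):
    """D_H from (3.18)."""
    return (math.cosh(2 * beta) * math.cosh(H * beta) - 1) / (
        math.cosh(2 * beta) - math.cosh(H * beta)) ** 2

def solve_field(beta, M, lo=-2.0 + 1e-9, hi=2.0 - 1e-9, iters=200):
    """Bisection for the field H with mean_step(beta, H) = M (M_H is increasing in H)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_step(beta, mid) < M:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

beta, C = 1.0, -1.0                      # illustrative values
H_C = solve_field(beta, C)
d_min = var_step(beta, H_C)              # D_{H_C,H_C}, by (3.34)

excess = []
for i in range(35):
    K = -1.9 + 0.1 * i                   # grid of fields K in (-2, 2)
    L = solve_field(beta, 2 * C - mean_step(beta, K))  # constraint (3.31)-(3.32)
    d_KL = 0.5 * (var_step(beta, K) + var_step(beta, L))
    excess.append(d_KL - d_min)

worst = min(excess)
print(H_C, worst)
```

A nonnegative value of `worst` (up to rounding) is consistent with (3.36) on the scanned grid.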

Proof. The proof of this lemma consists in calculation of the derivative of the function $D_{K,L}^{\infty}$ along the curve $M_{K,L}^{\infty} = C$. As we will see, this function has on any such curve exactly one minimum at $H_C$ for every C. The gradient of the function $M_{K,L}^{\infty}$ is clearly equal to

$$\left( \frac{\cosh(2\beta)\cosh(K\beta) - 1}{(\cosh(2\beta) - \cosh(K\beta))^2},\ \frac{\cosh(2\beta)\cosh(L\beta) - 1}{(\cosh(2\beta) - \cosh(L\beta))^2} \right),$$

hence the tangent direction (k, l) to the curve (3.33) satisfies the relation:

$$k\, \frac{\cosh(2\beta)\cosh(K\beta) - 1}{(\cosh(2\beta) - \cosh(K\beta))^2} + l\, \frac{\cosh(2\beta)\cosh(L\beta) - 1}{(\cosh(2\beta) - \cosh(L\beta))^2} = 0. \tag{3.37}$$

Suppose that K > L. Let us show that in this case the derivative of $D_{K,L}^{\infty}$ along the vector (k, l), satisfying (3.37) with k > 0, is positive. Let us take

$$(k, l) = \left( \frac{\cosh(2\beta)\cosh(L\beta) - 1}{(\cosh(2\beta) - \cosh(L\beta))^2},\ -\frac{\cosh(2\beta)\cosh(K\beta) - 1}{(\cosh(2\beta) - \cosh(K\beta))^2} \right).$$

The derivative in question is then equal to

$$\frac{\sinh(K\beta)[\cosh(2\beta)(\cosh(2\beta) + \cosh(K\beta)) - 1][\cosh(2\beta)\cosh(L\beta) - 1]}{[\cosh(2\beta) - \cosh(L\beta)]^2 [\cosh(2\beta) - \cosh(K\beta)]^3} - \frac{\sinh(L\beta)[\cosh(2\beta)(\cosh(2\beta) + \cosh(L\beta)) - 1][\cosh(2\beta)\cosh(K\beta) - 1]}{[\cosh(2\beta) - \cosh(K\beta)]^2 [\cosh(2\beta) - \cosh(L\beta)]^3}.$$

Positivity of this expression is equivalent to the statement that the function

$$\frac{\sinh(H\beta)[\cosh(2\beta)(\cosh(2\beta) + \cosh(H\beta)) - 1]}{[\cosh(2\beta) - \cosh(H\beta)][\cosh(2\beta)\cosh(H\beta) - 1]}$$

is increasing. But this follows from direct calculation of its derivative, which is the sum of four manifestly nonnegative terms, one of which is even manifestly positive.

3.3.6. Wild animals. What should be done next is the same construction for the case of the real Ising model, which has to take into account the real behavior of the separation line, which has overhangs, and which thus has to be described by the ensemble of wild animals (in the terminology of [DKS]). But all the necessary constructions, involving the cluster expansion in the ensemble of wild animals, are presented in Chapter 4 of [DKS], and they need not be repeated here.

4. Contour Representation of the Partition Function

Now we recall the main definitions of the contour method in a variant convenient for our aims and introduce the notation used below. The definition (1.8) implies that the energy

$$E_W^{\text{Ising}}(\sigma_W / +) = 2|\pi(\sigma_W)| - |E(W)|, \qquad \sigma_W \in X^W, \tag{4.1}$$

where we let

$$|\pi| = \sum_{\Gamma \in \pi} |\Gamma|, \qquad \pi \in H(W). \tag{4.2}$$

So recalling the definition (1.7) we find that

$$p_W^{\beta,+}(\sigma_W) = \tilde p_W(\pi(\sigma_W)), \qquad \sigma_W \in X^W, \tag{4.3}$$

where the contour probability distribution

$$\tilde p_W(\pi) = (\tilde Z(W))^{-1} \exp\{-2\beta|\pi|\}, \qquad \pi \in H(W), \tag{4.4}$$

and the contour partition function

$$\tilde Z(W) = \sum_{\pi \in H(W)} \exp\{-2\beta|\pi|\}. \tag{4.5}$$
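The identities (4.1) and (4.3)-(4.5) can be verified by brute force on a tiny volume. The sketch below assumes that the definition (1.8), which is not reproduced in this section, is the usual nearest-neighbour energy, namely minus the sum of $\sigma_s \sigma_t$ over all bonds with at least one endpoint in W, with spins equal to +1 outside W. Under that assumption the total contour length |π(σ_W)| is the number of bonds whose endpoints carry different spins. The 2×2 volume, the value of β and all names are illustrative.

```python
import math
from itertools import product

W = [(x, y) for x in range(2) for y in range(2)]  # hypothetical 2x2 volume

def bonds(volume):
    """Nearest-neighbour pairs with at least one endpoint in the volume.
    Each such pair corresponds to one dual edge of E(W)."""
    found = set()
    for (x, y) in volume:
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            found.add(frozenset({(x, y), (x + dx, y + dy)}))
    return found

E_W = bonds(W)

def spin(config, site):
    """Spin at a site, with the plus boundary condition outside W."""
    return config.get(site, +1)

def boundary_length(config):
    """Number of edges in B(sigma_W): bonds whose endpoints disagree."""
    return sum(1 for b in E_W if len({spin(config, s) for s in b}) == 2)

def energy(config):
    """Assumed form of (1.8): minus the sum of spin products over the bonds of E(W)."""
    total = 0
    for b in E_W:
        s, t = tuple(b)
        total -= spin(config, s) * spin(config, t)
    return total

beta = 0.8  # illustrative inverse temperature
configs = [dict(zip(W, values)) for values in product((+1, -1), repeat=len(W))]

# Identity (4.1): energy = 2 |pi(sigma_W)| - |E(W)| for every configuration.
identity_holds = all(
    energy(c) == 2 * boundary_length(c) - len(E_W) for c in configs)

# Identity (4.3): spin Gibbs weights agree with contour weights (4.4)-(4.5).
Z_spin = sum(math.exp(-beta * energy(c)) for c in configs)
Z_contour = sum(math.exp(-2 * beta * boundary_length(c)) for c in configs)
max_gap = max(
    abs(math.exp(-beta * energy(c)) / Z_spin
        - math.exp(-2 * beta * boundary_length(c)) / Z_contour)
    for c in configs)

print(len(E_W), identity_holds, max_gap)
```

The sum defining `Z_contour` runs over configurations instead of contour systems, which is legitimate because of the one-to-one correspondence (3.2)-(3.3).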

We can also rewrite in contour terms the quantities $Q^{\beta,+,N}_{T,D}$ introduced by relation (2.11) and used in the formulation of the main theorem. It follows from the relations (3.4) and (4.4) that for any finite set $D \subseteq T \cap W_N$,

$$p^{\beta,+}_{W_N}\bigl(\sigma_{W_N} : (\sigma_{W_N})^{-} = D\bigr) = (\tilde{Z}(W_N))^{-1}\, \tilde{Z}_T(D, W_N), \qquad (4.6)$$

where

$$\tilde{Z}_T(D, W_N) = \sum_{\pi \in K_T(D, W_N)} \exp\{-2\beta|\pi|\}, \qquad (4.7)$$

and the set $K_T(D, W_N) \subseteq H(W_N)$ of systems of contours is defined by the relation

$$K_T(D, W_N) = \{\pi \in H(W_N) : |O(t) \cap \pi| \text{ is odd, if } t \in D,\ |O(t) \cap \pi| \text{ is even, if } t \in (T \setminus D)\}. \qquad (4.8)$$

In a similar way

$$p^{\beta,+}_{W_N}\bigl(\sigma_{W_N} : (\sigma_{W_N})^{-} = \emptyset\bigr) = (\tilde{Z}(W_N))^{-1}\, \tilde{Z}_T(\emptyset, W_N). \qquad (4.9)$$

Thus it follows from the definition (2.11)

$$Q^{\beta,+,N}_{T,D} = -\ln \frac{\tilde{Z}_T(D, W_N)}{\tilde{Z}_T(\emptyset, W_N)}. \qquad (4.10)$$

This formula is the starting point of the following estimates.

5. Cluster Expansions

The following estimates use the cluster expansion method, which exists in many versions. We choose the general Kotecky-Preiss model ([KP]) with simplifications introduced in the paper [D 94], though some other variants of the cluster expansion method can also be applied to the derivation of the same estimates. In this section we formulate the definitions and the results from [D 94] which will be used below.

Let us describe the Kotecky-Preiss model. Let $\Theta$ be a finite or a countable set. Its elements will be called animals and so we call this model the animal model. (The ensembles of wild and tame animals of Sect. 3 were particular examples of the animal models.) Assume that a subset $S \subseteq \Theta \times \Theta$ of the set of pairs $(\theta_1, \theta_2)$ of animals is fixed, which is symmetric, i.e. a pair $(\theta_1, \theta_2) \in S$ if and only if the pair $(\theta_2, \theta_1) \in S$, and reflexive, i.e. the diagonal pairs $(\theta, \theta) \in S$. If the pair of animals $(\theta_1, \theta_2) \in S$, we say that they are compatible and write $\theta_1 \leftrightarrow \theta_2$. If the pair of animals $(\theta_1, \theta_2) \in (\Theta \times \Theta) \setminus S$,

“Non-Gibbsian” States and their Gibbs Description


we say that they are incompatible and write $\theta_1 \not\leftrightarrow \theta_2$. The pair $(\Theta, S)$ is called the animal model. Sometimes it is convenient to consider the undirected graph without loops and multiple edges with the set $\Theta$ as the set of its vertices, such that the vertices $\theta_1, \theta_2 \in \Theta$ are connected by an edge of this graph if and only if $\theta_1 \not\leftrightarrow \theta_2$. It is evident that this graph describes the animal model $(\Theta, S)$ in a unique way.

Fix an animal model $(\Theta, S)$. A finite subset $\tau \subseteq \Theta$ is called a herd, if any two animals $\theta_1, \theta_2 \in \tau$ are compatible. For any finite $\Lambda \subseteq \Theta$ the set of all herds $\tau$ such that $\tau \subseteq \Lambda$ is denoted by $H(\Lambda)$ and is called the set of all herds in $\Lambda$. (The set $H(\Lambda)$ includes the empty herd which is denoted by $\emptyset$.) Assume that a complex-valued function $w(\theta)$, $\theta \in \Theta$, is given. The number $w(\theta)$ is called the weight of the animal $\theta$. Let a finite set $\Lambda \subseteq \Theta$ be given. The number

$$Z_w(\Lambda) = \sum_{\tau \in H(\Lambda)} \prod_{\theta \in \tau} w(\theta) \qquad (5.1)$$

is called the partition function in $\Lambda$ defined by the weights $w = (w(\theta), \theta \in \Theta)$. (For $\tau = \emptyset$ the product in (5.1) equals 1 by definition. So if $\Lambda$ is an empty set, the partition function $Z_w(\Lambda) = 1$.)

A good control over the logarithm of the partition function of the animal models is possible only if the absolute values $|w(\theta)|$ are in some sense small enough. This leads to the main restrictions on the domain of applicability of the discussed approach. To be more definite we describe a condition on the weights which is used below and was introduced by Kotecky and Preiss [KP]. Assume that a positive-valued function $b(\theta)$, $\theta \in \Theta$, is given. The value $b(\theta)$ will be called the might of the animal $\theta$. A choice of the function $b(\theta)$ for a concrete animal model with given weights is determined simply by a wish to satisfy the needed conditions. Roughly speaking, the might $b(\theta)$ has to be large if the animal $\theta$ is incompatible with many other animals.

Definition 5.1. We say that a weight function $w(\theta)$, $\theta \in \Theta$, satisfies the KP-condition, if there exists a non-negative weight function $w_0(\theta) \ge 0$, $\theta \in \Theta$, such that for any $\theta \in \Theta$,

$$\exp\Bigl\{ \sum_{\tilde{\theta} \in \Theta:\ \tilde{\theta} \not\leftrightarrow \theta} w_0(\tilde{\theta})\, b(\tilde{\theta}) + w_0(\theta)\, b(\theta) \Bigr\} \le b(\theta) \qquad (5.2)$$

and

$$|w(\theta)| \le w_0(\theta), \qquad \theta \in \Theta. \qquad (5.3)$$

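Proposition 5.2. Fix a positive weight function $w_0(\theta)$, $\theta \in \Theta$, and mights $b(\theta)$, $\theta \in \Theta$, such that the condition (5.2) is satisfied. Let $W_0$ be the set of all (complex-valued) weight functions $w = w(\theta)$ of $\theta \in \Theta$ such that

$$|w(\theta)| \le w_0(\theta), \qquad \theta \in \Theta. \qquad (5.4)$$

Consider a weight function $w \in W_0$. Then for any finite set $\Lambda \subseteq \Theta$ the partition function $Z_w(\Lambda) \ne 0$ and

$$\bigl|\ln |Z_w(\Lambda)|\bigr| \le \sum_{\theta \in \Lambda} w_0(\theta)\, b(\theta). \qquad (5.5)$$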
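The definitions (5.1), (5.2) and the bound (5.5) can be checked by brute force on a toy animal model. The sketch below is only an illustration under stated assumptions: the three animals, their incompatibility pattern, and the numerical values of the weights and mights are hypothetical, and the weights are taken real rather than complex for simplicity.

```python
from itertools import combinations
from math import exp, log

def herds(animals, incompatible):
    """Yield every herd: a subset whose animals are pairwise compatible."""
    for size in range(len(animals) + 1):
        for subset in combinations(animals, size):
            pairs = combinations(subset, 2)
            if all(frozenset(p) not in incompatible for p in pairs):
                yield subset

def partition_function(animals, incompatible, w):
    """Z_w(Lambda) as in (5.1): sum over herds of the product of weights."""
    total = 0.0
    for herd in herds(animals, incompatible):
        term = 1.0  # the empty herd contributes 1
        for theta in herd:
            term *= w[theta]
        total += term
    return total

def kp_condition_holds(animals, incompatible, w0, b):
    """Check inequality (5.2) for every animal."""
    for theta in animals:
        s = w0[theta] * b[theta]
        s += sum(w0[t] * b[t] for t in animals
                 if frozenset((t, theta)) in incompatible)
        if exp(s) > b[theta]:
            return False
    return True

# Hypothetical toy model: "b" is incompatible with both "a" and "c".
animals = ["a", "b", "c"]
incompatible = {frozenset(("a", "b")), frozenset(("b", "c"))}
w0 = {t: 0.1 for t in animals}   # majorizing weights
b = {t: 2.0 for t in animals}    # mights
w = {t: -0.1 for t in animals}   # weights with |w| <= w0

z = partition_function(animals, incompatible, w)
bound = sum(w0[t] * b[t] for t in animals)
```

Here the herds are the empty herd, the three singletons and {a, c}, so the sum has five terms. Negative weights are chosen because they are the least favorable real case for keeping $Z_w$ away from zero.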


Proposition 5.2 is proved in [D 94] by simple induction in $|\Lambda|$. It follows from this proposition that the function $\ln Z_w(\Lambda)$ is an analytic function of the arguments $w(\theta)$, $\theta \in \Theta$. It turns out (see [D 94]) that the cluster expansion is nothing else but the Taylor expansion of this function at the point $w(\theta) \equiv 0$, $\theta \in \Theta$. Its coefficients do not depend on $\Lambda$, the majority of them vanish and the rest can be estimated with the help of the usual Cauchy formula.

For any finite $\Lambda \subseteq \Theta$ the set of all pairs $\rho = (\bar{\rho}, \alpha)$, such that $\bar{\rho} \subseteq \Lambda$ is a subset and $\alpha = \alpha(\theta) \ge 1$, $\theta \in \bar{\rho}$, is an integer-valued function of $\theta \in \bar{\rho}$, will be denoted by $D(\Lambda)$; each such pair will be called a group of animals in $\Lambda$. The set $\bar{\rho}$ will be called the support of the group and the value $\alpha(\theta)$ will be interpreted as the multiplicity of animals of the kind $\theta$ in the group $\rho$. We say that a group $\rho = (\bar{\rho}, \alpha)$ is a sum of groups $\rho_i = (\bar{\rho}_i, \alpha_i)$, $i = 1, 2, \dots, k$, if $\bar{\rho}_i \subseteq \bar{\rho}$, $i = 1, 2, \dots, k$, and

$$\alpha(\theta) = \sum_{i = 1, 2, \dots, k:\ \theta \in \bar{\rho}_i} \alpha_i(\theta), \qquad \theta \in \bar{\rho}. \qquad (5.6)$$

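A gang of animals in $\Lambda$ is a non-empty group of animals $\rho = (\bar{\rho}, \alpha) \in D(\Lambda)$ such that for any two animals $\theta, \theta' \in \bar{\rho}$ there is a sequence $\theta = \theta_1, \theta_2, \dots, \theta_n = \theta'$ of animals in $\bar{\rho}$ such that the animals $\theta_i$ and $\theta_{i+1}$ are incompatible for all $i = 1, 2, \dots, n - 1$, i.e. $\bar{\rho}$ is a connected subset of the graph $\Theta$. The set of all gangs in $\Lambda$ will be denoted by $\mathcal{G}(\Lambda)$. The subsets $\bar{\rho}$, which are connected subsets of the graph $\Theta$, will be called the supports of gangs. The set of all supports of gangs $\bar{\rho} \subseteq \Lambda$ is denoted by $\bar{\mathcal{G}}(\Lambda)$.

Proposition 5.3. Let the conditions of Proposition 5.2 be fulfilled and a finite set $\Lambda \subseteq \Theta$ be fixed. Consider a polydisk

$$W_0(\Lambda) = \{w = (w(\theta), \theta \in \Lambda) : |w(\theta)| \le w_0(\theta),\ \theta \in \Lambda\} \subset \mathbb{C}^{\Lambda}$$

and the set $W_0^{\mathrm{in}}(\Lambda)$ of all inner points of the polydisk $W_0(\Lambda)$. The partition function $Z_w(\Lambda)$ will be treated as a function of $w \in \mathbb{C}^{\Lambda}$. For any $w \in W_0^{\mathrm{in}}(\Lambda)$ a convergent expansion

$$\ln Z_w(\Lambda) = \sum_{\rho \in \mathcal{G}(\Lambda)} q_w(\rho) = \sum_{\rho \in \mathcal{G}(\Lambda)} r(\rho) \prod_{\theta \in \bar{\rho}} w(\theta)^{\alpha(\theta)} \qquad (5.7)$$

holds. The coefficients $r(\rho)$ are real numbers depending only on the restriction of the graph structure on $\Theta$ to $\bar{\rho}$. For any gang $\rho = (\bar{\rho}, \alpha)$,

$$|q_w(\rho)| = \Bigl| r(\rho) \prod_{\theta \in \bar{\rho}} w(\theta)^{\alpha(\theta)} \Bigr| \le \Bigl( \sum_{\theta \in \bar{\rho}} w_0(\theta)\, b(\theta) \Bigr) \prod_{\theta \in \bar{\rho}} \Bigl( \frac{|w(\theta)|}{w_0(\theta)} \Bigr)^{\alpha(\theta)}. \qquad (5.8)$$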

In the applications of the animal model considered below the animals will be contours $\Gamma$ or some combinations of contours and their weights will be $e^{-\beta|\Gamma|}$ or something like it. Since we are interested in the case of large $\beta$, the following strong hypothesis is painless for us:

$$|w(\theta)| \le \tfrac{1}{2}\, w_0(\theta), \qquad \theta \in \Theta. \qquad (5.9)$$

Then we have the following simplification of the cluster expansion (5.7).

Corollary 5.4. Let

$$\bar{q}_w(\bar{\rho}) = \sum_{\rho = (\bar{\rho}', \alpha) \in \mathcal{G}(\Lambda):\ \bar{\rho}' = \bar{\rho}} q_w(\rho). \qquad (5.10)$$


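Then for any finite set $\Lambda \subseteq \Theta$,

$$\ln Z_w(\Lambda) = \sum_{\bar{\rho} \in \bar{\mathcal{G}}(\Lambda)} \bar{q}_w(\bar{\rho}). \qquad (5.11)$$

If the condition (5.9) is fulfilled, then

$$|\bar{q}_w(\bar{\rho})| \le 2^{|\bar{\rho}|} \Bigl( \sum_{\theta \in \bar{\rho}} w_0(\theta)\, b(\theta) \Bigr) \prod_{\theta \in \bar{\rho}} \frac{|w(\theta)|}{w_0(\theta)}, \qquad \bar{\rho} \in \bar{\mathcal{G}}(\Lambda). \qquad (5.12)$$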

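Proof. The expansion (5.11) follows immediately from the expansion (5.7). We find using the condition (5.9) that for any $\bar{\rho} \in \bar{\mathcal{G}}(\Lambda)$,

$$\sum_{\rho = (\bar{\rho}', \alpha) \in \mathcal{G}(\Lambda):\ \bar{\rho}' = \bar{\rho}}\ \prod_{\theta \in \bar{\rho}} \Bigl( \frac{|w(\theta)|}{w_0(\theta)} \Bigr)^{\alpha(\theta)} = \prod_{\theta \in \bar{\rho}} \Bigl[ \frac{|w(\theta)|}{w_0(\theta)} \sum_{a=0}^{\infty} \Bigl( \frac{|w(\theta)|}{w_0(\theta)} \Bigr)^{a} \Bigr] \le \prod_{\theta \in \bar{\rho}} \Bigl[ \frac{|w(\theta)|}{w_0(\theta)} \sum_{a=0}^{\infty} \Bigl( \frac{1}{2} \Bigr)^{a} \Bigr] = 2^{|\bar{\rho}|} \prod_{\theta \in \bar{\rho}} \frac{|w(\theta)|}{w_0(\theta)}. \qquad (5.13)$$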

The desired estimate (5.12) follows now from the relations (5.8) and (5.13).
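For a single animal the step used in (5.13) is just a geometric series. Writing $x = |w(\theta)|/w_0(\theta)$ (a shorthand introduced here only for illustration), the condition (5.9) gives $0 \le x \le 1/2$, and summing over the multiplicity $\alpha(\theta) = a + 1 \ge 1$ gives

$$\sum_{\alpha = 1}^{\infty} x^{\alpha} = x \sum_{a = 0}^{\infty} x^{a} = \frac{x}{1 - x} \le 2x, \qquad 0 \le x \le \tfrac{1}{2}.$$

Taking the product of this bound over the $|\bar{\rho}|$ animals of the support produces the factor $2^{|\bar{\rho}|}$ in (5.12).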

6. A Rough Estimate on the Potential $U^{\beta,+}_{T,A}$

In this section we will obtain a rough estimate on the main quantity

$$Q^{\beta,+,N}_{T,D} = -\ln \frac{\tilde{Z}_T(D, W_N)}{\tilde{Z}_T(\emptyset, W_N)}, \qquad (6.1)$$

introduced in (4.10). A much more precise estimate will be obtained in the next section, at the price of restricting the range of the sets $D$ allowed (and with much more labor). Here we will treat the case of general finite sets $D \subset T$. We first rewrite $Q^{\beta,+,N}_{T,D}$ in the following way:

$$Q^{\beta,+,N}_{T,D} = -\ln \Bigl( \sum_{\pi \in K_T(D, W_N)} E_N(\pi) \Bigr), \qquad (6.2)$$

where

$$E_N(\pi) = \frac{\widehat{Z}_T(\pi, W_N)}{\widehat{Z}_T(\emptyset, W_N)}, \qquad (6.3)$$

and

$$\widehat{Z}_T(\pi_0, W_N) = \sum_{\pi \in R(\pi_0, W_N)} \exp\{-2\beta|\pi|\}. \qquad (6.4)$$

We use the notation $R(\pi_0)$ (correspondingly $R(\pi_0, W_N)$) for the collection of all admissible systems of contours (correspondingly in $W_N$), containing $\pi_0$, such that all contours except those in $\pi_0$ do not intersect the axis $T = \mathbb{Z}^1$. Note that $\tilde{Z}_T(D = \emptyset, W_N) = \widehat{Z}_T(\pi_0 = \emptyset, W_N)$. We will use the following representation for this ratio of the partition functions:

$$\frac{\widehat{Z}_T(\pi, W_N)}{\widehat{Z}_T(\emptyset, W_N)} = \exp\Bigl\{ -2\beta|\pi| - \sum_{M:\ M \subset W_N,\ M \cap \Delta(\pi) \ne \emptyset,\ M \cap T = \emptyset} \Phi(M) \Bigr\}, \qquad (6.5)$$

see (3.4). The function $\Phi$ is defined for all finite subsets $M \subset \mathbb{Z}^2$, vanishes for all subsets which are disconnected, and satisfies the following estimate: for some $\beta_0 < \infty$ and all $\beta \ge \beta_0$ one has

$$|\Phi(M)| \le \exp\{-2(\beta - \beta_0)\, d(M)\}, \qquad (6.6)$$

with $d(M)$ denoting the minimal cardinality of connected sets of bonds belonging to $M$ and containing all boundary bonds of the set $M$. Also, the function $\Phi$ is invariant with respect to shifts:

$$\Phi(M) = \Phi(M + t), \qquad t \in \mathbb{Z}^2. \qquad (6.7)$$

Such a representation follows directly from the cluster expansion of the previous section for the partition functions $\widehat{Z}_T(\pi_0, W_N)$ and $\widehat{Z}_T(\emptyset, W_N)$. The animals here are the contours. Our choice of the weights $w_0$, $w$ and the mights $b$ is the following: $w_0(\Gamma) = \exp\{-\beta_0|\Gamma|\}$, $w(\Gamma) = \exp\{-\beta|\Gamma|\}$ with $\beta \ge 2\beta_0$, $b(\Gamma) = \exp\{|\Gamma|\}$. The quantity $\Phi(M)$ is given by the sum over all herds of contours, "covering" the subset $M$:

$$\Phi(M) = \sum_{\rho:\ \bigcup_{\Gamma \in \bar{\rho}} \mathrm{Int}\, \Gamma = M} q(\rho).$$

A connected set $M$ such that $M \cap \Delta(\pi) \ne \emptyset$ will be called a blob (on $\pi$). For future use we fix for each $M$ a connected set $\delta M \subset M$ of $d(M)$ bonds containing all boundary bonds of it. We will call it a c-boundary of $M$.

It is easy to see from the representation (6.5) that the quantities $E_N(\pi)$ approach their limits $E(\pi)$, while $Q^{\beta,+,N}_{T,D}$ go to a limit, $Q^{\beta,+}_{T,D}$, for every $D$, as $N \to \infty$ (though not uniformly in $D$). The study of the quantities $E_N(\pi)$ and $E(\pi)$, $Q^{\beta,+,N}_{T,D}$ and $Q^{\beta,+}_{T,D}$ is done in the identical manner, so below we will treat only the latter cases of the infinite volume quantities (to save on notation). Their finite volume analogs will be needed only for Proposition 8.2 concerning the comparison of the interactions $U^{N}_{A}$ and $U_{A}$. We start with the following simple estimate.

Lemma 6.1. If $\beta$ is large enough, then there exists a value $C = C(\beta) \to 0$ as $\beta \to \infty$, such that for all $\pi$,

$$(-2\beta - C)|\pi| \le \ln E(\pi) \le (-2\beta + C)|\pi|. \qquad (6.8)$$

Proof. Note that every blob $M$ is defined by its c-boundary $\delta M$. So to sum over all possible blob assignments, which is what we have to do according to (6.5), is the same as to sum over all possible c-boundaries intersecting $\pi$. This summation is standard combinatorics, which is done using the estimate (6.6) and the remark that for every blob $M$ one has $d(M) \ge 4$.


Now we can estimate the inner sum in (6.2).

Lemma 6.2. For some value $C = C(\beta)$, uniformly bounded as $\beta \to \infty$,

$$\exp\{(-2\beta - C)\, 4|D|\} \le \sum_{\pi \in K_T(D, W_N)} E(\pi) \le \exp\{(-2\beta + C)\, 2|D|\}. \qquad (6.9)$$

Proof. Since all the weights $E(\pi)$ are positive, the lower estimate follows by taking the shortest system $\pi$, isolating $D$. Such a system can have at most $4|D|$ bonds. It is easy to see that the number of systems of contours $\pi \in K_T(D, W_N)$ of the total length $l$ does not exceed $3^{l}$. Since for every $\pi$ we have $|\pi| \ge 2|D|$, the result follows from (6.8).

Proof of Proposition 2.2. Since the number of summands in the definition (2.14) is $2^{|A|}$, the relation (2.15) follows immediately from (6.9).
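The upper estimate in Lemma 6.2 can be spelled out as a geometric series. The following is one possible way to track the constant, not necessarily the authors' bookkeeping; the symbols $C_1$ (the constant of Lemma 6.1) and $q$ are introduced here only for illustration. Using (6.8), the counting bound $3^{l}$ and $|\pi| \ge 2|D|$,

$$\sum_{\pi \in K_T(D, W_N)} E(\pi) \le \sum_{l \ge 2|D|} 3^{l}\, e^{(-2\beta + C_1) l} = \frac{q^{2|D|}}{1 - q}, \qquad q = 3\, e^{-2\beta + C_1}.$$

If $\beta$ is so large that $q \le 1/2$, then $1/(1 - q) \le 2 \le 2^{|D|}$ for $|D| \ge 1$, so that

$$\sum_{\pi \in K_T(D, W_N)} E(\pi) \le \bigl( \sqrt{2}\, q \bigr)^{2|D|} = \exp\bigl\{ \bigl(-2\beta + C_1 + \ln 3\sqrt{2}\bigr)\, 2|D| \bigr\},$$

which has the form (6.9) with a constant $C = C_1 + \ln 3\sqrt{2}$ that stays bounded as $\beta \to \infty$.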

7. The Estimate of the Potential $U^{\beta,+}_{T,A}$ for Punctured Sets $A$

In this section we will establish the asymptotic splitting property of the main quantity

$$Q^{\beta,+,N}_{T,D} = -\ln \frac{\tilde{Z}_T(D, W_N)}{\tilde{Z}_T(\emptyset, W_N)},$$

introduced in (4.10), in the situation when the finite set $D$ is essentially disconnected. By this we mean the following: Let $D_i \subset D$, $i \in I$, be a fixed partition of $D$ into disjoint subsets, such that each $D_i$ is a union of connected components of $D$. Our goal is to show that under the condition that the elements of the partition $D_i \subset D$, $i \in I$, are sufficiently separated, the main contribution to (6.1) comes from the terms corresponding to subsets $D_i$:

$$Q^{\beta,+,N}_{T,D} = \sum_{i} Q^{\beta,+,N}_{T,D_i} + \text{higher order terms}. \qquad (7.1)$$

The precise meaning of (7.1) is given by Proposition 7.4 below. Before explaining the notion of essential disconnectedness, we will introduce the notion of $(1 - \varepsilon)$-connected sets, which is crucial for the present section.

7.1. $(1 - \varepsilon)$-connected sets. A finite set $A \subset \mathbb{Z}^1$ will be called connected, iff it is a segment, $A = [a, b] = \{n \in \mathbb{Z}^1,\ a \le n \le b\}$, $a, b \in \mathbb{Z}^1$. We are going to define $(1 - \varepsilon)$-connected sets, $1 > \varepsilon \ge 0$. (The connected sets would then coincide with 1-connected ones.) To do this, we consider all segments $[a, b]$ with $a, b \in A$, inside which the set $A$ has density above $1 - \varepsilon$, which means that

$$\frac{|[a, b] \cap A|}{|[a, b]|} \ge 1 - \varepsilon.$$

(Here $|B|$ is the number of elements in the subset $B \subset \mathbb{Z}^1$; in particular, $|[a, b]| = b - a + 1$.) Consider the union of all such segments $[a, b]$ of $A$-density above $1 - \varepsilon$. This is a finite subset of $\mathbb{Z}^1$, and as such is a disjoint union of segments $I_1, I_2, \dots, I_k$, with $\mathrm{dist}(I_i, I_j) \ge 2$. If it consists of just one segment, then $A$ will be called $(1 - \varepsilon)$-connected. Otherwise it will be called $(1 - \varepsilon)$-disconnected, and the intersections $A_i = A \cap I_i$ will be called $(1 - \varepsilon)$-connected components of $A$.
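The definition above is constructive, and a brute-force sketch may help to fix ideas. The function names `dense_segments` and `components` are hypothetical; the code assumes a nonempty finite set of integers and tests every pair of endpoints, so it is meant only for small examples.

```python
def dense_segments(A, eps):
    """All segments [a, b] with a, b in A whose A-density is at least 1 - eps."""
    pts = sorted(set(A))
    out = []
    for i, a in enumerate(pts):
        for j in range(i, len(pts)):
            b = pts[j]
            count = j - i + 1  # number of points of A inside [a, b]
            if count >= (1 - eps) * (b - a + 1):
                out.append((a, b))
    return out

def components(A, eps):
    """Split a nonempty finite set A of integers into (1 - eps)-connected components."""
    covered = set()
    for a, b in dense_segments(A, eps):
        covered.update(range(a, b + 1))
    # Decompose the union of dense segments into maximal runs I_1, ..., I_k.
    pts = sorted(covered)
    runs = []
    start = prev = pts[0]
    for p in pts[1:]:
        if p != prev + 1:
            runs.append((start, prev))
            start = p
        prev = p
    runs.append((start, prev))
    # The components are the intersections A ∩ I_i.
    return [sorted(x for x in set(A) if lo <= x <= hi) for lo, hi in runs]
```

For instance, with $\varepsilon = 0.25$ the set $\{0, 1, 3\}$ is $(1 - \varepsilon)$-connected, since the segment $[0, 3]$ contains 3 of its 4 points, while for $\varepsilon = 0$ it splits into $\{0, 1\}$ and $\{3\}$.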


For a finite set $A \subset \mathbb{Z}^1$ we define the segment $[l_A, r_A] \subset \mathbb{Z}^1$, $l_A \le r_A \in \mathbb{Z}^1$, as the shortest one containing the set $A$. If the set $A$ is $(1 - \varepsilon)$-disconnected, then the complement $[l_A, r_A] \setminus (I_1 \cup I_2 \cup \dots \cup I_k)$ is nonempty, and consists of segments $J_1, \dots, J_{k-1}$. These $(k - 1)$ segments will be called lacunas of $A$. The quantity

$$l(A) = \sum_{i=1}^{k-1} |J_i| \qquad (7.2)$$

is then the total number of points in the lacunas. We introduce also the density $\rho(A)$ of a finite set $A$ as

$$\rho(A) = \frac{|A|}{\mathrm{diam}(A) + 1} \equiv \frac{|A|}{|[l_A, r_A]|}. \qquad (7.3)$$
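The two quantities (7.2) and (7.3) are simple to compute once the covering segments $I_1, \dots, I_k$ are known. The sketch below takes those segments as an input list of `(lo, hi)` pairs, which is an assumption of this illustration, as are the function names; it relies on the segments being disjoint and on $I_1$ starting at $l_A$ and $I_k$ ending at $r_A$.

```python
def total_lacuna_length(A, segments):
    """l(A) from (7.2): points of [l_A, r_A] not covered by the segments I_1..I_k."""
    l_A, r_A = min(A), max(A)
    covered = sum(hi - lo + 1 for lo, hi in segments)
    return (r_A - l_A + 1) - covered

def density(A):
    """rho(A) from (7.3): |A| divided by the length of the shortest segment containing A."""
    A = set(A)
    return len(A) / (max(A) - min(A) + 1)

# Hypothetical example: A = {0, 1, 2, 10, 11} with I_1 = [0, 2], I_2 = [10, 11].
A = {0, 1, 2, 10, 11}
segments = [(0, 2), (10, 11)]
lacuna_points = total_lacuna_length(A, segments)
rho = density(A)
```

In this example the single lacuna is $J_1 = [3, 9]$, so $l(A) = 7$, and $\rho(A) = 5/12$.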

The next three statements contain the main properties of $(1 - \varepsilon)$-connected sets.

7.1.1. Properties of $(1 - \varepsilon)$-connected sets.

Lemma 7.1. Each $(1 - \varepsilon)$-connected set $A$ (and, in particular, each $(1 - \varepsilon)$-connected component of any set) has density $\rho(A) \ge 1 - 2\varepsilon$.

Proof. To show this suppose that the set $A$ is $(1 - \varepsilon)$-connected, and let $[a_1, b_1], [a_2, b_2], \dots$, $a_i, b_i \in A$, be the collection of all segments of $A$-density above $1 - \varepsilon$, which are ordered in the lexicographic order. This order will be denoted by $\prec$. The segments $[a_1, b_1], [a_2, b_2], \dots$ form a covering of $A$. We want to construct a subcovering of that covering, which is "minimal". We define the segments of this minimal covering inductively. Denoting by $l_J$ ($r_J$) the left (right) endpoint of the segment $J$, define the segment $J_1$ as the ($\prec$)-last one among those segments in the collection for which $l_J = a_1$. Suppose that the segments $J_k$, $k = 1, \dots, n - 1$, are already defined, and their union does not yet contain $A$. We define $J_n$ to be the last segment $J$ among those from our collection which have the following two properties:

i) The intersections $J \cap \bigl( \bigcup_{k=1}^{n-1} J_k \bigr) \ne \emptyset$, $J \cap \bigl( A \setminus \bigcup_{k=1}^{n-1} J_k \bigr) \ne \emptyset$. (Because $A$ is $(1 - \varepsilon)$-connected, the set of such $J$-s has to be nonempty.)

ii) $r_{J_n} \ge r_J$ for all $J$ satisfying i).

This process clearly terminates, and let $N$ be the number of segments in the minimal covering thus constructed. We claim now that $l_{J_n} > r_{J_{n-2}}$ for all $n \ge 3$. Indeed, otherwise the segment $J_n$ would have been added to the minimal covering at the previous step. So every point of $A$ belongs to at most two consecutive segments from our minimal covering. Define the segments $K_i$ to be $J_i \cap J_{i+1}$, $i = 1, \dots, N - 1$, $K_0 = K_N = \emptyset$, $L_i = J_i \setminus \bigcup_{j \ne i} J_j$, $i = 1, \dots, N$. We have now that for every $i = 1, \dots, N$,

$$|A \cap K_{i-1}| + |A \cap L_i| + |A \cap K_i| \equiv |A \cap J_i| \ge (1 - \varepsilon)|J_i| \equiv (1 - \varepsilon)(|K_{i-1}| + |L_i| + |K_i|),$$

since all the segments $J_i$ have $A$-density above $1 - \varepsilon$. Adding all these $N$ inequalities results in the following one:


$$\begin{aligned}
|A \cap L_1| &+ |A \cap K_1| + |A \cap L_2| + |A \cap K_2| + \dots + |A \cap K_{N-1}| + |A \cap L_N| \\
&\ge (1 - \varepsilon)(|L_1| + |K_1| + |L_2| + |K_2| + \dots + |K_{N-1}| + |L_N|) \\
&\quad + (1 - \varepsilon)(|K_1| + |K_2| + \dots + |K_{N-1}|) - (|A \cap K_1| + |A \cap K_2| + \dots + |A \cap K_{N-1}|) \\
&\ge (1 - \varepsilon)(|L_1| + |K_1| + |L_2| + |K_2| + \dots + |K_{N-1}| + |L_N|) - \varepsilon(|K_1| + |K_2| + \dots + |K_{N-1}|) \\
&\ge (1 - 2\varepsilon)(|L_1| + |K_1| + |L_2| + |K_2| + \dots + |K_{N-1}| + |L_N|),
\end{aligned}$$

which proves our statement.

Lemma 7.2. Suppose the set $A$ is $(1 - \varepsilon)$-disconnected, and the subset $B \subset A$ has nonempty intersections with at least two different $(1 - \varepsilon)$-connected components of $A$. Then $B$ is also $(1 - \varepsilon)$-disconnected.

Proof. The proof follows immediately from the fact that, by definition, the union of two intersecting $(1 - \varepsilon)$-connected sets is again $(1 - \varepsilon)$-connected.

As we know from Lemma 7.1, every finite set $B$ with density $\rho(B) < (1 - 2\varepsilon)$ is $(1 - \varepsilon)$-disconnected, and so has lacunas. We will show that if the density of $B$ is even smaller, then the total length of lacunas of $B$ is comparable to its size.

Lemma 7.3. Suppose $\varepsilon$ is small enough and the finite set $B$ has density $\rho(B) \le (1 - 3\varepsilon)$. Then

$$l(B) \ge \frac{\varepsilon}{2}\, \mathrm{diam}(B).$$

Proof. Let $B = B_1 \cup \dots \cup B_k$ be the decomposition of $B$ into its $(1 - \varepsilon)$-connected components. Since each $B_i$ is $(1 - \varepsilon)$-connected,

$$\rho(B_i) = \frac{|B_i|}{\mathrm{diam}(B_i) + 1} \ge (1 - 2\varepsilon).$$

On the other hand,

$$\rho(B) = \frac{\sum_{i=1}^{k} |B_i|}{\sum_{i=1}^{k} (\mathrm{diam}(B_i) + 1) + l(B)}.$$

Hence

$$\frac{l(B)}{\sum_{i=1}^{k} |B_i|} \ge \frac{1}{1 - 3\varepsilon} - \frac{1}{1 - 2\varepsilon} > \varepsilon,$$

provided $\varepsilon$ is small enough. Therefore

$$l(B) > \varepsilon \sum_{i=1}^{k} |B_i| \ge \varepsilon(1 - 2\varepsilon) \sum_{i=1}^{k} (\mathrm{diam}(B_i) + 1),$$

hence

$$l(B)\,[1 + \varepsilon(1 - 2\varepsilon)] > \varepsilon(1 - 2\varepsilon) \Bigl[ \sum_{i=1}^{k} (\mathrm{diam}(B_i) + 1) + l(B) \Bigr] \equiv \varepsilon(1 - 2\varepsilon)\,[\mathrm{diam}(B) + 1],$$

and the statement follows.
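The two places in this proof where "$\varepsilon$ small enough" is used can be made explicit. The thresholds below are a worked check, not taken from the source. First, for $0 < \varepsilon < 1/3$,

$$\frac{1}{1 - 3\varepsilon} - \frac{1}{1 - 2\varepsilon} = \frac{\varepsilon}{(1 - 3\varepsilon)(1 - 2\varepsilon)} > \varepsilon,$$

since the denominator lies in $(0, 1)$. Second, the last displayed inequality gives

$$l(B) > \frac{\varepsilon(1 - 2\varepsilon)}{1 + \varepsilon(1 - 2\varepsilon)}\, [\mathrm{diam}(B) + 1],$$

and the coefficient is at least $\varepsilon/2$ exactly when $2(1 - 2\varepsilon) \ge 1 + \varepsilon(1 - 2\varepsilon)$, i.e. when $1 - 5\varepsilon + 2\varepsilon^{2} \ge 0$. This holds for $\varepsilon \le (5 - \sqrt{17})/4 \approx 0.219$, so, for example, any $\varepsilon \le 1/5$ suffices for the conclusion $l(B) \ge \frac{\varepsilon}{2}\, \mathrm{diam}(B)$.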


We are now in the position to formulate the condition of sufficient separation of the subsets $D_i$.

• Condition of sufficient separation. We suppose that the partition $D_i \subset D$, $i \in \{1, \dots, n\} \equiv I$, is obtained in the following way: for some finite set $\tilde{D}$ and its $(1 - \varepsilon)$-connected components $\tilde{D}_i \subset \tilde{D}$, $i \in I$, we have $D \subset \tilde{D}$ and $D_i = \tilde{D}_i \cap D$ for all $i \in I$.

In particular, the partition of $D$ into $(1 - \varepsilon)$-connected components will do. To save on notation we will present below the proofs only for this specific case; the generalization to the general case will be obvious. For a finite set $D$ endowed with its partition into sufficiently separated subsets $D_i$, $i \in I$, we define, similarly to (7.2),

$$l(D) = l\bigl(D, \{D_i, i \in I\}\bigr) = \sum_{i=1}^{n-1} \bigl( l_{D_{i+1}} - r_{D_i} - 1 \bigr). \qquad (7.4)$$

(We suppose here and in the following that the enumeration is such that $l_{D_1} < l_{D_2} < \dots < l_{D_n}$.)

Proposition 7.4. Suppose that the finite set $D \subset T \cap W_N$ together with its partition into sufficiently separated subsets $D_i$, $i \in I$, is fixed. Then if $\beta$ is large enough, the following expansion holds:

$$Q^{\beta,+,N}_{T,D} = \sum_{i} Q^{\beta,+,N}_{T,D_i} + \sum_{i < j \in I} G_\beta(D_{[i,j]}), \qquad (7.5)$$

where

$$D_{[i,j]} = \bigcup_{k \in [i,j]} D_k,$$

while the function $G_\beta(D_{[i,j]})$ depends only on the sets $D_k$, $k \in [i, j]$, and does not depend on the set $D$ otherwise. We have also the following estimate:

$$|G_\beta(D_{[i,j]})| \le \exp\bigl\{ -\beta'\, l\bigl(D_{[i,j]}, \{D_k, k \in [i, j]\}\bigr) \bigr\}, \qquad (7.6)$$

with $\beta' \to \infty$ as $\beta \to \infty$.

The expansion (7.5) is somewhat similar to the standard cluster expansion of the logarithm of the partition function. The difference here lies in the fact that the usual low temperature expansion is made around a single ground state configuration, which has no contours at all. So in the usual case the leading term of the partition function corresponds to the vacuum and is 1, while here the situation is more complicated and is different from the standard one. The next three subsections contain the proof of the above proposition.

7.2. Dressed system representation. As we have seen above, one of the main objects we have to study is the following ratio of the partition functions:

$$E(\pi) = \frac{\widehat{Z}_T(\pi, W_N)}{\widehat{Z}_T(\emptyset, W_N)}, \qquad (7.7)$$

while

$$Q^{\beta,+,N}_{T,D} = -\ln \Bigl( \sum_{\pi \in K_T(D, W_N)} E(\pi) \Bigr).$$

Recall that (see (6.4))

$$\widehat{Z}_T(\pi_0, W_N) = \sum_{\pi \in R(\pi_0)} \exp\{-2\beta|\pi|\},$$

where $R(\pi_0)$ is the collection of all admissible systems $\pi$ of contours containing $\pi_0$, such that all contours in $\pi$ except those in $\pi_0$ do not intersect the axis $T = \mathbb{Z}^1$. We will use again the representation (6.5), but this time we want to modify the weights $\Phi$ in such a way that the new weights $\Phi'$ will be nonpositive. This can be achieved, but for the price that these new weights are not translation invariant anymore, and moreover, depend not only on $M$, but also on the set $M \cap \pi$. This is the price we can afford. The construction is the following: Let $b$ be an arbitrary bond of $\mathbb{Z}^{2*}$, and $\tau(\beta) > 0$ be the sum

$$\tau(\beta) = \sum_{M \subset \mathbb{Z}^2,\ b \in M} \exp\{-2(\beta - \beta_0)\, d(M)\}.$$

For every bond $b \in \mathbb{Z}^{2*} \setminus \mathbb{Z}^{1*}$ define

$$\tau(\beta, b) = \sum_{M \subset \mathbb{Z}^2,\ b \in M,\ M \cap T \ne \emptyset} \exp\{-2(\beta - \beta_0)\, d(M)\}.$$

For the set $M$ we put

$$\Phi'(M, \pi) = \begin{cases} \Phi(M) - |M \cap \pi| \exp\{-2(\beta - \beta_0)\, d(M)\} & \text{for } |M| > 1, \\ \Phi(M) - |M \cap \pi| \exp\{-2(\beta - \beta_0)\, d(M)\} - \sum_{b \in M \cap \pi} n(b)\, \tau(\beta, b) & \text{for } |M| = 1, \end{cases} \qquad (7.8)$$

where

$$n(b) = \frac{1}{\text{number of sites of } (\mathbb{Z}^2)^{+} \text{ adjacent to } b}.$$

The idea behind the last definition is simple: we add a small fraction, namely $\tau(\beta)$, to the contribution of every bond $b$ of the family $\pi$ into the total weight $2\beta|\pi|$, and distribute the negative of it over all $M$'s adjacent to $b$, according to (7.8). Each $M$ gets as many contributions as there are bonds in $\pi$ to which it is adjacent, while even one contribution is enough to make it negative. The only exceptions are the one-point $M$'s; we put all the addition which is not claimed by other $M$'s to these; this is the source of the non-translation invariance. (Note, however, that the translation invariance with respect to the horizontal shifts is retained.) The analog of (6.6) holds evidently. We therefore have the following representation for $E(\pi)$:

$$E(\pi) = \exp\Bigl\{ -(2\beta + \tau(\beta))|\pi| - \sum_{M:\ M \cap \Delta(\pi) \ne \emptyset,\ M \cap T = \emptyset} \Phi'(M, \pi) \Bigr\}. \qquad (7.9)$$

Next comes the usual trick of the theory of cluster expansions, which starts by defining the function

$$\Psi(M, \pi) = e^{-\Phi'(M, \pi)} - 1, \qquad (7.10)$$


which in our case satisfies

$$0 \le \Psi(M, \pi) \le \exp\{-2(\beta - \beta_0)\, d(M)\}. \qquad (7.11)$$

Next, by a slight abuse of notation, we define a dressed system $\hat{\pi}$ (or a system $\pi$ with a dressing) to be the following object: $\hat{\pi} = \{\pi, M_1, \dots, M_k\}$ consists of the system $\pi$ itself plus a finite collection of blobs sitting on it, i.e. a finite collection of distinct finite connected sets $M_1, \dots, M_k \subset \mathbb{Z}^2$, $M_i \cap T = \emptyset$, such that $m_i \equiv M_i \cap \Delta(\pi) \ne \emptyset$, $i = 1, \dots, k$; $k = 0, 1, \dots$. A full notation for a dressed system should include also the sets $m_i$, i.e. it is $\hat{\pi} = \{\pi, M_1, \dots, M_k; m_1, \dots, m_k\}$. The reason for this inclusion lies in the fact that the natural weight of the dressed systems, which we are going to define next, depends on these intersections. In particular, the union

$$\hat{\pi}' \cup \hat{\pi}'' = \{\pi' \cup \pi'', M'_1, \dots, M'_{k'}, M''_1, \dots, M''_{k''}; m'_1, \dots, m'_{k'}, m''_1, \dots, m''_{k''}\}$$

of two systems $\hat{\pi}' = \{\pi', M'_1, \dots, M'_{k'}; m'_1, \dots, m'_{k'}\}$ and $\hat{\pi}'' = \{\pi'', M''_1, \dots, M''_{k''}; m''_1, \dots, m''_{k''}\}$ would be a dressed system only if, first, the systems $\pi'$ and $\pi''$ are compatible, and, second, all the intersections $M'_i \cap \Delta(\pi'')$ and $M''_j \cap \Delta(\pi')$ are empty. The expression (7.9) may be rewritten now in the form

$$E(\pi) = \exp\{-(2\beta + \tau(\beta))|\pi|\} \prod_{M:\ M \cap \Delta(\pi) \ne \emptyset,\ M \cap T = \emptyset} (\Psi(M, \pi) + 1) = \sum_{k=0}^{\infty} \sum_{\{M_1, \dots, M_k\}} \exp\{-(2\beta + \tau(\beta))|\pi|\} \prod_{i=1}^{k} \Psi(M_i, \pi) \equiv \sum_{\hat{\pi}} E(\hat{\pi}), \qquad (7.12)$$

where the inner sum runs over the collections of $k$ distinct blobs on $\pi$ (the term $k = 0$ being the empty dressing), and the weight $E(\hat{\pi})$ is, of course, nothing else but

$$E(\hat{\pi}) = \exp\{-(2\beta + \tau(\beta))|\pi|\} \prod_{i=1}^{k} \Psi(M_i, \pi). \qquad (7.13)$$
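The passage from the product over blobs to the sum over dressings in (7.12) is the elementary expansion of a product of factors $(\Psi + 1)$ into a sum over finite collections of distinct blobs, the empty collection included. A minimal numerical sketch follows, assuming a hypothetical finite family of three blobs with made-up values of $\Psi$; the common prefactor $\exp\{-(2\beta + \tau(\beta))|\pi|\}$ is omitted since it multiplies both sides.

```python
from itertools import combinations
from math import prod

def sum_over_dressings(psi):
    """Sum over all collections of distinct blobs (the empty one included)
    of the product of the corresponding Psi values."""
    blobs = list(psi)
    total = 0.0
    for k in range(len(blobs) + 1):
        for chosen in combinations(blobs, k):
            total += prod(psi[m] for m in chosen)  # empty product equals 1
    return total

# Hypothetical nonnegative values of Psi(M, pi) for three blobs on a fixed pi.
psi = {"M1": 0.3, "M2": 0.05, "M3": 0.2}
product_form = prod(1 + value for value in psi.values())
dressing_sum = sum_over_dressings(psi)
```

Because every $\Psi(M, \pi)$ is nonnegative by (7.11), each term of the sum is nonnegative, which is what allows the terms to be read as statistical weights.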

The advantage of the positivity in (7.11) is that it allows us to interpret the last expression as the statistical weight and introduce the corresponding probability distributions on various ensembles of dressed systems of contours.

7.3. Structures. Fix now a finite set $D \subseteq T \cap W_N$. Let $D_i \subset D$, $i \in I$, be a fixed partition of $D$ into sufficiently separated components. It is convenient for us to suppose that the index set $I$ is a segment of integers, $I = [1, n] \subset \mathbb{N}$. We denote by $G(I)$ the set of all possible graph structures on the set of vertices $I$ without loops and multiple edges; in other words, $G(I)$ is the set of all subsets of pairs of distinct elements of $I$. Let $\pi \in K_T(D, W_N)$ be a system of compatible contours, isolating $D$, and $\hat{\pi} = \{\pi, M_1, \dots, M_k\}$ be some dressing of it. (It might be an empty dressing, that is, it might contain no $M$'s at all.) Define a graph $g(\hat{\pi}) \in G(I)$ in the following manner: a bond $\{i, j\}$, $i, j \in I$, belongs to $g(\hat{\pi})$ provided that either

i) there is a contour $\Gamma$ in $\pi$ such that both subsets $D_i$, $D_j$ belong to $\mathrm{Int}(\Gamma)$, or

ii) there are two contours $\Gamma'$, $\Gamma''$ in $\pi$ and an element $M$ of the dressing $\hat{\pi}$, such that $D_i$ belongs to $\mathrm{Int}\, \Gamma'$, $D_j$ belongs to $\mathrm{Int}\, \Gamma''$, while $\Delta(\Gamma') \cap M \ne \emptyset \ne \Delta(\Gamma'') \cap M$.

We call two dressed systems $\hat{\pi}$ and $\hat{\pi}'$ of compatible contours, isolating the set $D$, equivalent, iff $g(\hat{\pi}) = g(\hat{\pi}') = g \in G(I)$. We denote by $L \equiv L(D, I, \{D_i, i \in I\}, g)$ such a class of equivalence, and will call it a structure (on $D$). If a system $\hat{\pi}$ belongs to the structure $L$, then we call $L \equiv L(\hat{\pi})$ the structure of the dressed system $\hat{\pi}$. In the case when $|I| = 1$ and the partition $\{D_i, i \in I\}$ contains just one element, the corresponding structure is unique and contains all systems of contours, isolating $D$. Such a structure will sometimes be called an elementary structure, and will be denoted simply by $L(D)$. We would like to compare different $D$'s. So we equip every $D$ with some partition $\{D_i\}$ into sufficiently separated components, and we define the set $\mathcal{L}_T(W_N)$ to be the union of all $L(D, I_D, \{D_i, i \in I_D\}, g)$ over all various $D \subset T$, all $g \in G(I_D)$. For any $\beta > 0$, any finite $D \subset T$, endowed with its partition $\{D_i\}$, and any structure $L = L(D, \{D_i\}, g)$ on it we let

$$S_\beta(L) = \sum_{\hat{\pi} \in L} E(\hat{\pi}). \qquad (7.14)$$

Let now the system $\pi \in K_T(D, W_N)$ of compatible essential contours isolate $D$, and let $\hat{\pi}$ be its dressing. Clearly, if a dressed system $\hat{\pi}'$ has the structure $L(\hat{\pi})$, then $\pi'$ also belongs to $K_T(D, W_N)$. Let $\mathcal{L}_T(D, \{D_i\}, W_N)$ be the set of all structures $L$ on $D$;

$$\mathcal{L}_T(D, \{D_i\}, W_N) = \bigcup_{g \in G(I)} L(D, \{D_i\}, g).$$

Then for any $D \subseteq T \cap W_N$,

$$\sum_{\pi \in K_T(D, W_N)} \frac{\widehat{Z}_T(\pi, W_N)}{\widehat{Z}_T(\emptyset, W_N)} = \sum_{L \in \mathcal{L}_T(D, \{D_i\}, W_N)} S_\beta(L) \equiv \sum_{g \in G(I)} S_\beta(L(D, \{D_i\}, g)). \qquad (7.15)$$

We now rewrite (7.1) and reformulate our goal. We want to show that under the condition that the elements of the partition $D_i \subset D$, $i \in I$, are sufficiently separated, the main contribution to (7.15) comes from the term corresponding to just one graph $g_\emptyset \in G(I)$, where by $g_\emptyset$ we denote the graph with $I$ as its set of vertices, which has no bonds. Moreover, under the same condition the term in question equals in leading order the product of the contributions of the vertices of $g_\emptyset$, which means that

$$\sum_{g \in G(I)} S_\beta(L(D, I, \{D_i\}, g)) = \prod_{j \in I} S_\beta(L(D_j)) + \text{higher order terms}. \qquad (7.16)$$

According to our conventions the full notation for the structure $L(D_j)$ should be $L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset)$; it corresponds to the only structure on $D_j$, with the partition $\{D_j\}$ containing the set $D_j$ itself as its only element; the graph $(g_j)_\emptyset$ is the only element of the set $G(\{j\})$. The rest of this subsection and the next one is devoted to the proof of (7.16).

We say that a pair of systems of compatible contours $\pi, \pi' \in H(W_N)$ is mutually compatible, if these systems are disjoint and their sum $\pi \cup \pi'$ is again a compatible system of contours. Likewise, we say that a pair of dressed systems $\hat{\pi} = \{\pi, M_1, \dots, M_k\}$ and $\hat{\pi}' = \{\pi', M'_1, \dots, M'_{k'}\}$ is mutually compatible, if the pair $\pi, \pi'$ is mutually compatible, while all the intersections $M_i \cap \pi'$, $M'_i \cap \pi$ are empty. (The intersections of $M$'s and $M'$'s are allowed.) Let $L_1, L_2 \in \mathcal{L}_T(W_N)$ be two different structures of contours, and $R(L_1, L_2) \subseteq L_1 \times L_2$ be the set of all pairs of dressed systems of contours $(\hat{\pi}_1, \hat{\pi}_2) \in L_1 \times L_2$ which are not mutually compatible.

Later we will need the following construction, which associates with every graph $g \in G(I)$ the partition $\Pi(g)$ of $I$. It is defined in the following manner: we consider the set $I$ to be a subset of the segment $[1, n] \subset \mathbb{R}^1$, and we associate with every bond $b = \{i_1, i_2\}$ of $g$ the segment $u_b = [i_1, i_2] \subset [1, n]$. The union of all these segments is a closed subset of $\mathbb{R}^1$, which is a collection of disjoint segments $v_j \subset \mathbb{R}^1$. The intersections $v_j \cap I$ are by definition all the non-trivial elements of the partition $\Pi(g)$. The trivial elements of it are the singletons, corresponding to single points of $I$, which do not belong to the above union of segments. For a given structure $L = L(D, I, \{D_i\}, g)$ we define the partition $\Pi(L)$ to be the partition $\Pi(g)$. In case a dressed system $\hat{\pi}$ has $g$ for its graph $g(\hat{\pi})$, we likewise define $\Pi(\hat{\pi})$ to be the partition $\Pi(g(\hat{\pi}))$. Note that $\Pi(g)$ is a partition of $I$ into consecutive segments, some of which might be degenerate. Let $J(g)$ be the index set for these segments.

Let $\Pi \equiv \Pi_I = (\Lambda_1, \Lambda_2, \dots, \Lambda_j, \dots, \Lambda_{|J|})$, $j \in J$, be a partition of the index set $I$ into consecutive segments. (The finest such partition $(\Lambda^*_1, \Lambda^*_2, \dots, \Lambda^*_j, \dots, \Lambda^*_n)$ of $I$ into singletons, $\Lambda^*_j = \{j\} \in I$, will be denoted by $\Pi^*_I$.) We then define the subsets $D(\Lambda_j) = \bigcup_{i \in \Lambda_j} D_i \subset D$, equipped with partitions into corresponding $D_i$, $i \in \Lambda_j$; let us also fix graphs $g_j \in G(\Lambda_j)$, $j \in J$. Consider the product

$$P(\Pi, \{g_j, j \in J\}) = L_1 \times L_2 \times \dots \times L_{|J|},$$

where $L_j \equiv L(D(\Lambda_j), \Lambda_j, \{D_i, i \in \Lambda_j\}, g_j)$. This product consists of all collections of dressed systems $P = (\hat{\pi}_1, \hat{\pi}_2, \dots, \hat{\pi}_{|J|})$, $\hat{\pi}_j \in L_j$, $j = 1, \dots, |J|$, which, however, need not be compatible. So we consider the set of all possible unoriented graphs $G(J)$ on vertices $J$, and for any graph $h \in G(J)$ we let

$$R(\Pi, \{g_j\}, h) = \bigcap_{\{j_1, j_2\} \in h} \{P \in P(\Pi, \{g_j, j \in J\}) : (\hat{\pi}_{j_1}, \hat{\pi}_{j_2}) \in R(L_{j_1}, L_{j_2})\}, \qquad (7.17)$$

where $j_1, j_2 \in J$, and the intersection is taken over all bonds of the graph $h$.
In words, $R(\Pi, \{g_j\}, h)$ is the set of all collections $P$ such that all the pairs $(\hat{\pi}_{j_1}, \hat{\pi}_{j_2})$, $\{j_1, j_2\} \in h$, of dressed systems of contours are not mutually compatible. For a system of contours $P \in P(\Pi, \{g_j, j \in J\})$ we define the graph $h(P) \in G(J)$ to be the maximal one among those $h$'s for which the inclusion $P \in R(\Pi, \{g_j\}, h)$ holds. For any $h \in G(J)$ let $\bar{h} \subseteq J$ be the set of all vertices of the graph $h$ which are adjacent to at least one bond of $h$. Let

$$R(\Pi_I, \{g_j\}) = \bigcup_{i, k \in J} \{P \in P(\Pi, \{g_j\}) : (\hat{\pi}_i, \hat{\pi}_k) \in R(L_i, L_k)\} \qquad (7.18)$$

be the set of all collections $P$ such that some pair $(\hat{\pi}_i, \hat{\pi}_k)$ of systems of contours is not compatible. It is clear that for any collection of systems of contours $P = (\hat{\pi}_1, \hat{\pi}_2, \dots, \hat{\pi}_{|J|}) \in P(\Pi, \{g_j\})$ the system of contours

$$\hat{\pi} = \bigcup_{j=1}^{|J|} \hat{\pi}_j \qquad (7.19)$$

belongs to some structure $L$ on $D$ if and only if $P \notin R(\Pi)$. On the other hand, if $\hat{\pi} \in L$ and $\Pi(\hat{\pi}) = (\Lambda_1, \Lambda_2, \dots, \Lambda_k)$, then there exists a unique collection of graphs $g_j \in G(\Lambda_j)$, $j = 1, \dots, k$, with $\Pi(g_j) = \Lambda_j$ and a unique collection of $k$ dressed systems $\hat{\pi}_j \in L(D(\Lambda_j), \Lambda_j, \{D_i, i \in \Lambda_j\}, g_j)$ such that (7.19) holds.
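The construction of the partition $\Pi(g)$ from a graph $g$ on $I = \{1, \dots, n\}$ amounts to merging the real segments $[i_1, i_2]$ spanned by the bonds and reading off the resulting blocks of consecutive indices. A minimal sketch, assuming the graph is given as a list of pairs of distinct integers in $\{1, \dots, n\}$; the function name `partition_from_graph` is hypothetical.

```python
def partition_from_graph(n, bonds):
    """Return Pi(g) for a graph g on I = {1, ..., n} as a list of consecutive blocks."""
    # Each bond {i1, i2} spans the closed real segment [i1, i2].
    intervals = sorted((min(i, j), max(i, j)) for i, j in bonds)
    merged = []
    for lo, hi in intervals:
        # Closed segments that share a point belong to the same component v_j.
        if merged and lo <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    blocks, nxt = [], 1
    for lo, hi in merged:
        blocks.extend([k] for k in range(nxt, lo))   # trivial elements (singletons)
        blocks.append(list(range(lo, hi + 1)))       # non-trivial element v_j ∩ I
        nxt = hi + 1
    blocks.extend([k] for k in range(nxt, n + 1))
    return blocks
```

Note that the bonds $\{1, 2\}$ and $\{3, 4\}$ give two separate blocks, because the real segments $[1, 2]$ and $[3, 4]$ are disjoint, whereas $\{1, 2\}$ and $\{2, 3\}$ share the point 2 and merge into one block.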


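So we have:

$$\sum_{g \in G(I)} S_\beta(L(D, I, \{D_i\}, g)) = \sum_{k=1}^{n}\ \sum_{\Pi = \Lambda_1, \dots, \Lambda_k}\ \sum_{\substack{g_1 \in G(\Lambda_1), \dots, g_k \in G(\Lambda_k): \\ \Pi(g_j) = \Lambda_j}} \Bigl[ \prod_{j=1}^{k} S_\beta(L(D(\Lambda_j), \Lambda_j, \{D_i, i \in \Lambda_j\}, g_j)) - \sum_{P \in R(\Pi, \{g_j\})}\ \prod_{\hat{\pi}_j \in P} E(\hat{\pi}_j) \Bigr]. \qquad (7.20)$$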

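Since always $0 < S_\beta(L) < \infty$, we can rewrite (7.20) as

$$\ln \Bigl( \sum_{g \in G(I)} S_\beta(L(D, I, \{D_i\}, g)) \Bigr) = \sum_{j=1}^{n} \ln S_\beta(L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset)) + F_\beta(D), \qquad (7.21)$$

where

$$F_\beta(D) = \ln Z_\beta(D),$$

while

$$\begin{aligned}
Z_\beta(D) = 1 &+ \Bigl[ \sum_{k=1}^{n-1}\ \sum_{\Pi = \Lambda_1, \dots, \Lambda_k}\ \sum_{\substack{g_1 \in G(\Lambda_1), \dots, g_k \in G(\Lambda_k): \\ \Pi(g_j) = \Lambda_j}}\ \prod_{j=1}^{k} S_\beta(L(D(\Lambda_j), \Lambda_j, \{D_i, i \in \Lambda_j\}, g_j)) \Bigr] \times \Bigl[ \prod_{j=1}^{n} S_\beta(L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset)) \Bigr]^{-1} \\
&- \Bigl[ \sum_{k=1}^{n}\ \sum_{\Pi = \Lambda_1, \dots, \Lambda_k}\ \sum_{\substack{g_1 \in G(\Lambda_1), \dots, g_k \in G(\Lambda_k): \\ \Pi(g_j) = \Lambda_j}}\ \sum_{P \in R(\Pi, \{g_j\})}\ \prod_{\hat{\pi}_j \in P} E(\hat{\pi}_j) \Bigr] \times \Bigl[ \prod_{j=1}^{n} S_\beta(L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset)) \Bigr]^{-1}. \qquad (7.22)
\end{aligned}$$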

(The only term from (7.20) which is not present in (7.22) is the leading one, which equals the denominators in (7.22), and its absence is reflected in the fact that the first summation is up to $n - 1$.)

7.4. A polymer representation for $Z_\beta(D)$. We will now study the various terms in (7.22), and we will show that many cancellations happen and that the terms are in fact small enough, due either to the fact that the contours in the structures $L(D(\Lambda_j))$ with nontrivial $\Lambda_j$'s are much longer than the contours from the structures $L(D_i)$, or because we have an incompatibility condition entering into the last term of (7.22), which again forces the presence of extra long terms.


R. L. Dobrushin, S. B. Shlosman

Our immediate goal is to represent the sum $Z_\beta(D)$ in (7.22) as a partition function of an animal model. The animals turn out to be just segments $s$ of the set $I$.

Proposition 7.5. There exists a function $w$ defined for all finite $(1 - \varepsilon)$-disconnected subsets of $\mathbb{Z}^1$, such that the following holds. Let $\tilde D \subset \mathbb{Z}^1$ be a finite subset, which is $(1 - \varepsilon)$-disconnected, and let $\tilde D_i$ be its $(1 - \varepsilon)$-connected components, $i \in \{1, \dots, n\} \equiv I$. Let $D \subset \tilde D$ be a subset, such that all the components $D_i = D \cap \tilde D_i$ are nonempty. Then
\[
Z_\beta(D) = 1 + \sum_{\Pi} \prod_{\Lambda \in \Pi} w(D_\Lambda). \tag{7.23}
\]
Here the summation goes over all partitions of the set $I$ into (disjoint) segments $\Pi$ except the partition $\Pi^*$ into singletons,
\[
\Pi = \{\alpha_1, \beta_1, \alpha_2, \beta_2, \dots, \alpha_k, \beta_k : 1 \le k < n, \ \alpha_1 = 1 \le \beta_1 < \alpha_2 \le \beta_2 < \dots \le \beta_k = n\},
\]
the product is taken over all segments $\Lambda \in \Pi$ of positive length (i.e. all $\Lambda$'s which are singletons are excluded from (7.23)), and $D_\Lambda = \cup_{i \in \Lambda} D_i$. The function $w$ satisfies the estimate
\[
|w(D_\Lambda)| \le \exp\{-\beta' \, l(D_\Lambda, \{D_i, i \in \Lambda\})\} \tag{7.24}
\]
(see (7.4)), where $\beta' = \beta'(\beta) \to \infty$ as $\beta \to \infty$.

Once this proposition is proven, Proposition 7.4 follows from the estimates (7.23), (7.24) and the relations (7.20), (7.21) by applying the cluster expansion (5.11) to the logarithm of the partition function $Z_\beta(D)$. The animals here are the sets $D_\Lambda$; they can be identified with the corresponding nontrivial segments $\Lambda \subset I$. Our choice of the weights $w$ is obvious; the weights $w'(D_\Lambda)$ are given by the rhs of (7.24), and the mights $b(D_\Lambda)$ can be taken to be equal to $\exp\{|\Lambda|\}$. (Here $|\Lambda|$ is the number of points in the index set $\Lambda$.)

As the reader of this hard technical section sees, the generic term in the representation (7.22) corresponds to a pair of graphs. One is a graph $g \in G(I)$ (which is not necessarily connected), while another is a graph $h \in G(J(g))$, where the set of indices $J(g)$ enumerates the elements of the partition $\Pi(g)$. The first graph describes the connections between different components $D_i$, $i \in I$, which occur due to the corresponding structures $L$, while the second graph describes the incompatibility pattern between different $L$'s. In the next subsection we will treat the case when the graph $g$ is trivial ($\equiv$ has no bonds), then we will treat the case of trivial $h$'s, and finally we will treat the general case.

7.4.1. A simple term: $R^1$. As a warm-up, we begin the necessary estimates with the term $R^1$ of (7.22), corresponding to the case $k = n$ in the last term of (7.22). In that case the partition $\Pi$ is the partition into singletons, $\Pi = \Pi^*_I$, all the graphs $g_j$ are trivial, so we will omit sometimes $\Pi$'s and/or $g_j$'s from our notations. So we have to consider the product $\mathcal{P}(\Pi^*_I) = L_1 \times L_2 \times \dots \times L_n$ of elementary structures $L_j \equiv L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset)$, consisting of all collections of dressed systems $P = (\hat\pi_1, \hat\pi_2, \dots, \hat\pi_n)$, $\hat\pi_j \in L_j$, $j = 1, \dots, n$. The ratio which we are going to estimate is


\[
R^1(D) = - \frac{\sum_{P \in R(\Pi^*_I)} \prod_{\hat\pi_j \in P} E(\hat\pi_j)}{\prod_{j=1}^{n} S_\beta(L_j)}.
\]

Proposition 7.6. In the notation of Proposition 7.5 there exists a function $w^1$ defined for all finite $(1 - \varepsilon)$-disconnected subsets of $\mathbb{Z}^1$, such that the following holds:
\[
R^1(D) = \sum_{\Pi} \prod_{\Lambda \in \Pi} w^1(D_\Lambda). \tag{7.25}
\]
The function $w^1$ satisfies the estimate
\[
|w^1(D_\Lambda)| \le \exp\{-\beta' \, l(D_\Lambda)\},
\]
where $\beta' = \beta'(\beta) \to \infty$ as $\beta \to \infty$.

Proof. We first rewrite the function $R^1$. Using a well-known formula for the union of events via their intersections, we find that
\[
R^1 = - \sum_{P \in R} \frac{\prod_{\hat\pi_j \in P} E(\hat\pi_j)}{\prod_{j=1}^{n} S_\beta(L_j)} = \sum_{h \in G(I), \, |h| > 0} (-1)^{|h|} \sum_{P \in R(h)} \frac{\prod_{\hat\pi_j \in P} E(\hat\pi_j)}{\prod_{j=1}^{n} S_\beta(L_j)} \tag{7.26}
\]

(recall (7.17)), where we denote by $|h|$ the number of bonds of the graph $h$. We now rewrite the inner sums in the rhs of (7.26) by singling out the factors corresponding to sites $j \in I \setminus \bar h$. We find that
\[
\sum_{P \in R(h)} \prod_{\hat\pi_j \in P} E(\hat\pi_j) = \left[ \prod_{j \in I \setminus \bar h} S_\beta(L_j) \right] \sum_{P = \{\hat\pi_j : j \in \bar h\} \in R(\Pi^*_{\bar h}, h)} \; \prod_{\hat\pi_j \in P} E(\hat\pi_j). \tag{7.27}
\]

Here $\Pi^*_{\bar h} = \{\Lambda_j, j \in \bar h\}$, and we use a slight abuse of notation in the expression $R(\Pi^*_{\bar h}, h)$ by treating the graph $h \in G(I)$ as a graph on the vertices $\bar h$, i.e. as an element of $G(\bar h) \subset G(I)$. The key observation now is that the last sum factors. Namely, if we introduce the set $C(h)$ of all maximal connected subgraphs of $h$, excluding isolated vertices, then
\[
\sum_{P = \{\hat\pi_j : j \in \bar h\} \in R(\Pi^*_{\bar h}, h)} \; \prod_{\hat\pi_j \in P} E(\hat\pi_j) = \prod_{k \in C(h)} \; \sum_{P = \{\hat\pi_j : j \in \bar k\} \in R(\Pi^*_{\bar k}, k)} \; \prod_{\hat\pi_j \in P} E(\hat\pi_j). \tag{7.28}
\]

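So if we define the weight $Q^1_\beta(k)$ of a connected graph $k$ with vertices among the points of the index set $I$ by
\[
Q^1_\beta(k) = (-1)^{|k|} \, \frac{\sum_{P = \{\hat\pi_j : j \in \bar k\} \in R(\Pi^*_{\bar k}, k)} \prod_{\hat\pi_j \in P} E(\hat\pi_j)}{\prod_{j \in \bar k} S_\beta(L_j)}, \tag{7.29}
\]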


then it follows from the relations (7.22), (7.26), (7.27) and (7.28) that
\[
R^1 = \sum_{h \in G(I), \, |h| > 0} \; \prod_{k \in C(h)} Q^1_\beta(k). \tag{7.30}
\]
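The chain of identities leading to (7.30) can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: the four sites, the three states per site, the weights, and the rule `incompatible` are all hypothetical stand-ins for the dressed systems, their weights $E$ and the incompatibility relation. It compares the direct definition of $R^1$ with the sum over graphs of products of component weights, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical toy data: 4 sites, 3 states each, positive weights playing the role of E.
weights = [
    [Fraction(1), Fraction(1, 2), Fraction(1, 3)],
    [Fraction(1), Fraction(1, 4), Fraction(1, 5)],
    [Fraction(1), Fraction(2, 3), Fraction(1, 7)],
    [Fraction(1), Fraction(1, 6), Fraction(3, 4)],
]
n = len(weights)
S = [sum(w) for w in weights]              # normalisations, in the role of S_beta(L_j)
all_bonds = list(combinations(range(n), 2))

def incompatible(i, xi, j, xj):
    """Toy rule (an assumption): states 'reach' far enough to touch each other."""
    return xi + xj >= abs(i - j) + 1

def bad_weight(site_list, bonds):
    """Normalised weight of configurations on site_list with every bond incompatible."""
    total = Fraction(0)
    for states in product(range(3), repeat=len(site_list)):
        x = dict(zip(site_list, states))
        if all(incompatible(i, x[i], j, x[j]) for i, j in bonds):
            term = Fraction(1)
            for s in site_list:
                term *= weights[s][x[s]]
            total += term
    for s in site_list:
        total /= S[s]
    return total

def components(bonds):
    """Maximal connected subgraphs of h, as (vertex set, bond list) pairs."""
    comps = []
    for b in bonds:
        verts, bs = set(b), [b]
        for c in [c for c in comps if c[0] & set(b)]:
            verts |= c[0]
            bs += c[1]
            comps.remove(c)
        comps.append((verts, bs))
    return comps

def r1_direct():
    """Minus the normalised weight of configurations with some incompatible pair."""
    total = Fraction(0)
    for states in product(range(3), repeat=n):
        if any(incompatible(i, states[i], j, states[j]) for i, j in all_bonds):
            term = Fraction(1)
            for s in range(n):
                term *= weights[s][states[s]] / S[s]
            total += term
    return -total

def r1_expansion():
    """Sum over nonempty graphs h of the product of component weights, as in (7.30)."""
    total = Fraction(0)
    for size in range(1, len(all_bonds) + 1):
        for h in combinations(all_bonds, size):
            term = Fraction(1)
            for verts, bs in components(h):
                term *= (-1) ** len(bs) * bad_weight(sorted(verts), bs)
            total += term
    return total

print(r1_direct() == r1_expansion())
```

The agreement rests on two facts only: inclusion-exclusion for the union of the pairwise incompatibility events, and the product structure of the measure, which makes the weight of a graph factor over its connected components.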

One can estimate from above the weights of each graph $k$, contributing to (7.30), directly. However, the estimate, in general, would not be better than $\exp\{-c\beta \, l(k)\}$, where
\[
l(k) = \sum_{r} \operatorname{dist}(D_{i_r}, D_{i_{r+1}}) \tag{7.31}
\]
and where the summation goes over all components $D_i$, $i \in \bar k$, ordered by the natural linear order inherited from $\mathbb{Z}^1$. The more optimistic estimate with
\[
L(k) = \sum_{\{i, j\} \in k} \operatorname{dist}(D_i, D_j)
\]
instead of $l(k)$ does not hold in general. The estimate available is not enough for our purposes, since the number of connected graphs on vertices $\bar k$ is of the order $2^{(|\bar k|^2)}$, while $l(k)$ can well be of order $|\bar k|$. So we have too many animals with a given weight, and the straightforward application of the cluster expansion machinery would fail.

To cope with this problem we will introduce the partition of the set of all graphs $h \in G(I)$ into families, according to what the partition $\Pi(h)$ is. Namely, for every nondegenerate segment $s \subset I$ denote by $H(s) \subset G(I)$ the subset of all graphs which have the segment $s$ as the only nontrivial element of the partition $\Pi(h)$. Introduce now the weight $Q^1_\beta(s)$ by
\[
Q^1_\beta(s) = \sum_{h \in H(s)} \; \prod_{k \in C(h)} Q^1_\beta(k). \tag{7.32}
\]

That definition clearly allows us to rewrite (7.30) as
\[
R^1 = \sum_{S \in S(I)} \; \prod_{s \in S} Q^1_\beta(s), \tag{7.33}
\]
where the summation goes over all collections $S \in S(I)$ of disjoint segments of $I$ of positive lengths. To estimate $Q^1_\beta(s)$, we first rewrite this weight in terms of the families $P$ contributing to it. Clearly,
\[
\begin{aligned}
Q^1_\beta(s) &= \sum_{h \in H(s)} (-1)^{|h|} \, \frac{\sum_{P = \{\hat\pi_j : j \in s\} \in R(\Pi^*_s, h)} \prod_{\hat\pi_j \in P} E(\hat\pi_j)}{\prod_{j \in s} S_\beta(L_j)} \\
&= \sum_{\substack{P = \{\hat\pi_j : j \in s\} \in R(\Pi^*_s): \\ g(P) \in H(s)}} \frac{\prod_{\hat\pi_j \in P} E(\hat\pi_j)}{\prod_{j \in s} S_\beta(L_j)} \left[ \sum_{h \subseteq g(P) : h \in H(s)} (-1)^{|h|} \right].
\end{aligned}
\]
We need an estimate on the coefficient $K(P) \equiv \sum_{h \subseteq g(P) : h \in H(s)} (-1)^{|h|}$; being large, it can destroy our strategy. Happily, as we will show at the end of this subsection (see Lemma 7.8),


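\[
|K(P)| = \left| \sum_{h \subseteq g(P) : h \in H(s)} (-1)^{|h|} \right| \le 1, \tag{7.34}
\]
so
\[
|Q^1_\beta(s)| \le \sum_{\substack{P = \{\hat\pi_j : j \in s\} \in R(\Pi^*_s): \\ g(P) \in H(s)}} \frac{\prod_{\hat\pi_j \in P} E(\hat\pi_j)}{\prod_{j \in s} S_\beta(L_j)}. \tag{7.35}
\]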

The rhs of (7.35) can be interpreted as a probability of an event in a certain ensemble. Namely, introduce the following product distribution on $\mathcal{P}(\Pi^*_s)$:
\[
q_s(P) \equiv q_s(\{\hat\pi_j, j \in s\}) = \frac{\prod_{\hat\pi_j \in P} E(\hat\pi_j)}{\prod_{j \in s} S_\beta(L_j)}.
\]
This is just the ensemble of independent dressed families of contours, each family belonging to the corresponding structure $L_j$, and being distributed according to
\[
q_j(\hat\pi_j) = \frac{E(\hat\pi_j)}{S_\beta(L_j)}. \tag{7.36}
\]
Then we can rewrite (7.35) as
\[
|Q^1_\beta(s)| \le q_s(\{P \in \mathcal{P}(\Pi^*_s) : g(P) \in H(s)\}). \tag{7.37}
\]

To estimate the last probability, we first define for every graph $g$ ($= g(P)$), contributing to (7.37), a spanning subgraph $sp(g) \subseteq g$. The definition is inductive. Suppose the set of all vertices $\bar g$ is enumerated in increasing order: $\bar g = \{i_1 < i_2 < \dots < i_{|\bar g|}\} \subset s$. The first bond $\{c_1, d_1\}$ of $sp(g)$ is the longest one of $g$ among those incident to the vertex $i_1$. Suppose inductively that the bonds $\{c_r, d_r\}$ are already constructed, $r = 1, 2, \dots, k$, and $c_1 (= i_1) < c_2 < \dots < c_k$, $d_1 < d_2 < \dots < d_k$. If $d_k = i_{|\bar g|}$, the process terminates. Otherwise the set of bonds $\{c, d\} \in g$, such that $c \le d_k < d$, is nonempty (since $S(g) = s$), and we take for the bond $\{c_{k+1}, d_{k+1}\}$ the one from this set with the rightmost endpoint $d$. Let $K$ be the total number of bonds in $sp(g)$. By construction, $c_1 < d_1 < c_3 < d_3 < \dots$ and $c_2 < d_2 < c_4 < d_4 < \dots$, and so every vertex of the graph $sp(g)$ belongs to at most two bonds of it. Hence $K \le |s|$. Moreover, every vertex of $\bar g$ belongs to at most one "even" bond and to at most one "odd" bond of $sp(g)$. Denote by $sp_o(g) \subset sp(g)$ (resp. $sp_e(g) \subset sp(g)$) the subgraph composed of only "odd" (resp. "even") bonds, and let $K_o$ ($K_e$) be their number. Since $\sum_{k=1}^{K} (d_k - c_k) \ge l(g)$ (see (7.31)), we have that at least one of the estimates holds:
\[
\sum_{k=1}^{K_o} (d_{2k-1} - c_{2k-1}) \ge \frac{l(g)}{2}, \quad \text{or} \quad \sum_{k=1}^{K_e} (d_{2k} - c_{2k}) \ge \frac{l(g)}{2}.
\]
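The inductive construction of the spanning subgraph can be sketched as a greedy procedure. In the sketch below the names `spanning_subgraph` and `check` are hypothetical, bonds are pairs of integer indices, and the length of a bond is measured in index units, which is a simplifying assumption standing in for the distances between the components $D_i$.

```python
import random
from itertools import combinations

def spanning_subgraph(bonds):
    """Greedy choice of bonds (c_1, d_1), (c_2, d_2), ... covering [min, max]."""
    bonds = {(min(b), max(b)) for b in bonds}
    first = min(c for c, _ in bonds)
    last = max(d for _, d in bonds)
    # First bond: the longest one incident to the leftmost vertex.
    chosen = [max((b for b in bonds if b[0] == first), key=lambda b: b[1])]
    while chosen[-1][1] != last:
        d_k = chosen[-1][1]
        candidates = [b for b in bonds if b[0] <= d_k < b[1]]
        if not candidates:
            raise ValueError("the bonds do not cover the whole segment")
        # Next bond: the candidate with the rightmost endpoint.
        chosen.append(max(candidates, key=lambda b: b[1]))
    return chosen

def check(bonds):
    """Verify the parity structure and the half-length estimate for one graph."""
    sp = spanning_subgraph(bonds)
    odd, even = sp[0::2], sp[1::2]
    for family in (odd, even):
        # Bonds of the same parity are pairwise disjoint segments.
        assert all(a[1] < b[0] for a, b in zip(family, family[1:]))
    length = sp[-1][1] - sp[0][0]
    total_odd = sum(d - c for c, d in odd)
    total_even = sum(d - c for c, d in even)
    assert total_odd + total_even >= length
    assert 2 * max(total_odd, total_even) >= length
    return sp

print(check({(1, 3), (1, 2), (2, 5), (3, 4), (4, 6)}))
```

In the example the chosen bonds are (1,3), (2,5), (4,6): the odd bonds (1,3) and (4,6) are disjoint, and their total length 4 already exceeds half of the covered length 5.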

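That implies the inclusion
\[
\{P \in \mathcal{P}(\Pi^*_s) : g(P) \in H(s)\} \subset \bigcup_{k=1}^{\frac{|s|}{2} + 1} \; \bigcup_{\{c_1, d_1, c_2, d_2, \dots, c_k, d_k \in s : \; c_1 \, \dots\}} R\big(\Pi^*_s, h(c_1, d_1; c_2, d_2; \dots; c_k, d_k)\big), \tag{7.38}
\]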


where the second union is taken over all collections of disjoint segments $[c_1, d_1], [c_2, d_2], \dots, [c_k, d_k]$ in $s$ of total length at least $\frac{l(g)}{2}$, and the graph $h(c_1, d_1; c_2, d_2; \dots; c_k, d_k)$ has $k$ bonds $\{c_1, d_1\}, \{c_2, d_2\}, \dots, \{c_k, d_k\}$. The crucial point for us here is that the number of such collections is bounded from above by only
\[
2^{|s| - 1} \ (\equiv \text{the number of subsets of the set of } |s| - 1 \text{ elements}) \tag{7.39}
\]
(while the number of all graphs $h$ with vertices in $s$ is of the order of $2^{\mathrm{const}\, |s|^2}$, which is the reason why we have to pass to segments $s$ from graphs $h$). By definition of the measure $q_s$ we have that
\[
q_s\big( R(\Pi^*_s, h(c_1, d_1; c_2, d_2; \dots; c_k, d_k)) \big) = \prod_{i=1}^{k} q_{c_i, d_i}\big( R(c_i, d_i, h(c_i, d_i)) \big), \tag{7.40}
\]

where the measure $q_{c_i, d_i} = q_{c_i} q_{d_i}$ (see (7.36)), and the family $R(c_i, d_i, h(c_i, d_i))$ corresponds to the graph with two vertices and one bond joining them, i.e., in accordance with the definition (7.17), it is the collection of systems $P = (\hat\pi_1, \hat\pi_2)$, $\hat\pi_1 \in L_{c_i}$, $\hat\pi_2 \in L_{d_i}$, which are not mutually compatible (i.e. $(\hat\pi_1, \hat\pi_2) \in R(L_{c_i}, L_{d_i})$). To estimate the probability $q_{c,d}(R(c, d, h(c, d)))$ consider the vertical line $l_{cd} = \{(x_{cd}, y)\} \subset \mathbb{R}^2$, with the abscissa $x_{cd}$ positioned halfway between the sets $D_c$ and $D_d$. Then
\[
\sum_{P = \{\hat\pi_c, \hat\pi_d\} \in R(c, d, h(c, d))} q_c(\hat\pi_c) \, q_d(\hat\pi_d) \le q_c\{\hat\pi_c \cap l_{cd} \ne \emptyset\} + q_d\{\hat\pi_d \cap l_{cd} \ne \emptyset\}. \tag{7.41}
\]
The event $\{\hat\pi_c \cap l_{cd} \ne \emptyset\}$ can be written as a sum:
\[
\{\hat\pi_c \cap l_{cd} \ne \emptyset\} = \{\pi_c \cap l_{cd} \ne \emptyset\} \cup \left[ \bigcup_{b \in \mathbb{Z}^2 \setminus l_{cd}} \{b \in \pi_c, \text{ for a blob } M \text{ of } \hat\pi_c \text{ we have } b \in M, \ M \cap l_{cd} \ne \emptyset\} \right]. \tag{7.42}
\]
Now,
\[
q_c\{b \in \pi_c, \text{ for } M \in \hat\pi_c, \ b \in M, \ M \cap l_{cd} \ne \emptyset\} = q_c\{b \in \pi_c\} \, q_c\{\text{for } M \in \hat\pi_c, \ b \in M, \ M \cap l_{cd} \ne \emptyset \mid b \in \pi_c\}. \tag{7.43}
\]
Next, we need the following simple

Lemma 7.7. Consider an elementary structure $L(D)$, and let $p(\hat\pi)$ be the natural probability distribution on it:
\[
p(\hat\pi) = \frac{E(\hat\pi)}{S_\beta(L(D))}.
\]
Let $M$ be a fixed blob. Then the probability of the event that the dressed system $\hat\pi$ contains this blob $M$ in its dressing can be estimated as follows:
\[
p\{\hat\pi : \hat\pi = \{\pi, M, M_1, \dots\}\} \le \exp\{-2(\beta - \beta_0) \, d(M)\}.
\]


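Proof. The proof is almost immediate. Indeed,
\[
\begin{aligned}
p\{\hat\pi : \hat\pi = \{\pi, M, M_1, M_2, \dots\}\} &= \frac{\sum_{\hat\pi = \{\pi, M, M_1, M_2, \dots\}} E(\hat\pi)}{S_\beta(L(D))} \\
&= \frac{\sum_{\hat\pi = \{\pi, M_1, M_2, \dots\} : \, \Delta(\pi) \cap M \ne \emptyset, \, M_i \ne M} \Psi(M, \pi) \, E(\hat\pi)}{S_\beta(L(D))} \\
&\le \exp\{-2(\beta - \beta_0) \, d(M)\} \, \frac{\sum_{\hat\pi = \{\pi, M_1, M_2, \dots\} : \, \Delta(\pi) \cap M \ne \emptyset, \, M_i \ne M} E(\hat\pi)}{S_\beta(L(D))} \\
&\le \exp\{-2(\beta - \beta_0) \, d(M)\}.
\end{aligned}
\]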

According to this lemma the last factor in (7.43) can be estimated as follows:
\[
q_c\{\text{for } M \in \hat\pi_c, \ b \in M, \ M \cap l_{cd} \ne \emptyset \mid b \in \pi_c\} \le \exp\{-4(\beta - \beta_0) \operatorname{dist}(b, l_{cd})\}. \tag{7.44}
\]
Hence
\[
q_c\{\hat\pi_c \cap l_{cd} \ne \emptyset\} \le \sum_{r \in \mathbb{Z}^1 : \, r \ge 0} \exp\{-4(\beta - \beta_0) r\} \, q_c\{\pi_c \cap (l_{cd} - r) \ne \emptyset\}.
\]
Here $(l_{cd} - r)$ is the line $l_{cd}$ shifted to the left by $r$ units, and of course the probabilities $q_c\{\pi_c \cap (l_{cd} - r) \ne \emptyset\}$ are equal to 1 once $r \ge \frac{1}{2} \operatorname{dist}(D_c, D_d)$. For the remaining values of $r$ these probabilities are estimated in (3.6), according to which we have
\[
q_c\{\pi_c \cap (l_{cd} - r) \ne \emptyset\} \le \exp\left\{-c\beta \left( \tfrac{1}{2} \operatorname{dist}(D_c, D_d) - r \right)\right\}.
\]
Together with (7.44) it shows that
\[
\sum_{P = \{\hat\pi_c, \hat\pi_d\} \in R(c, d, h(c, d))} q_c(\hat\pi_c) \, q_d(\hat\pi_d) \le 2 \exp\left\{-\frac{c}{4} \beta \operatorname{dist}(D_c, D_d)\right\}.
\]
Combining the last estimate, the estimates (7.41) and (7.39), the relations (7.40) and (7.38) and the estimate (7.37) we get:
\[
|Q^1_\beta(s)| \le 2^{|2s|} \exp\left\{-\frac{c}{4} \beta \, l(s)\right\} \le \exp\{-c' \beta \, l(s)\},
\]
where $l(s) = \sum_{k \in s} \operatorname{dist}(D_{i_k}, D_{i_{k+1}})$. Together with the formula (7.33) the last estimate shows that the term $R^1$ has the form needed for the machinery of the cluster expansion to be applicable. We finish this subsection by proving the estimate (7.34), together with another statement from graph theory.

7.4.2. Two statements about graphs. Let $g \in G(I)$ be a graph on the vertices $I$ (with no multiple bonds). We will call it a spanning graph, if for every $k \in I$, $k > 1$, there is at least one bond $\{i, j\}$ of $g$, such that $i \le k - 1 < k \le j$. In other words, the graph $g$ is spanning, if every segment $[k - 1, k]$ is covered by at least one bond of $g$. An equivalent definition is that the partition $\Pi(g)$ of $I$ consists of one element, which is the set $I$ itself. Note that a spanning graph is not necessarily connected.
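The notion of a spanning graph, and the alternating sum over spanning subgraphs that is the subject of Lemma 7.8 below, lend themselves to a brute-force check on small index sets. The following sketch is an illustration only; the function names are hypothetical, and the index set is taken to be $\{1, \dots, n\}$.

```python
from itertools import combinations

def is_spanning(bonds, n):
    """True if every unit segment [k-1, k], k = 2..n, lies under some bond."""
    return all(any(i <= k - 1 and k <= j for i, j in bonds)
               for k in range(2, n + 1))

def alternating_sum(bonds, n):
    """Sum of (-1)^{|h|} over all spanning subgraphs h of the graph `bonds`."""
    bonds = list(bonds)
    total = 0
    for size in range(len(bonds) + 1):
        for h in combinations(bonds, size):
            if is_spanning(h, n):
                total += (-1) ** size
    return total

def values_for_all_spanning_graphs(n):
    """Set of values taken by the alternating sum over all spanning graphs on {1..n}."""
    all_bonds = list(combinations(range(1, n + 1), 2))
    values = set()
    for size in range(1, len(all_bonds) + 1):
        for g in combinations(all_bonds, size):
            if is_spanning(g, n):
                values.add(alternating_sum(g, n))
    return values

for n in range(2, 6):
    print(n, sorted(values_for_all_spanning_graphs(n)))
```

The enumeration is exponential in the number of bonds, so it is only practical for very small $n$; it is meant as a sanity check of the statement, not as a replacement for the inductive proof.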


Let $g \in G(I)$ be a spanning graph. Consider the quantity
\[
N(g) = \sum_{h \subset g} (-1)^{|h|},
\]
where the summation goes over all subgraphs of $g$ which are themselves spanning, and where $|h|$ is the number of bonds in $h$. Then we have

Lemma 7.8. $N(g) = -1$, $0$ or $1$.

Proof. The proof goes by induction on the number of points in $I$. For $|I| = 2$ we have $N(g) = -1$ for the only spanning graph $g \in G(I)$. Each bigger set $I$ is also treated by induction, this time in the number of bonds in $g$. If $|g| = 1$ then again $N(g) = -1$. Consider the general spanning graph $g$. We call a bond $\{k, l\}$ a subordinate to a bond $\{i, j\}$, if $i \le k < l \le j$. Suppose first that the graph $g$ has a bond $\{i, j\}$ with a subordinate $\{k, l\}$. We claim then that the subsum $\sum_{h \subset g, \, \{i, j\} \in h} (-1)^{|h|} = 0$. Indeed, the map $h \to h \,\triangle\, \{k, l\}$ is a one-to-one map on the set of all spanning subgraphs of $g$ containing the bond $\{i, j\}$. Hence to evaluate $N(g)$ we can delete the bond $\{i, j\}$ from $g$. If the resulting graph is not spanning, then $N(g) = 0$. Otherwise it has fewer bonds than $g$, which permits the induction step in our case. Suppose next that no bond has a subordinate. That implies in particular that the graph $g$ has exactly one bond incident to the site $n$. Let that bond be $\{k, n\}$. Consider the factor graph $f = g / \{k, \dots, n\}$. Note that there is a natural one-to-one correspondence between the bonds of $f$ and the bonds of $g$ excluding one bond $\{k, n\}$, which disappears after the factorization. It is immediate to see that this correspondence gives rise to a one-to-one correspondence between the spanning subgraphs of $f$ and of $g$, so $N(f) = -N(g)$. Hence the proof is complete, since $f$ has fewer sites than $g$.

Let $A$ be a finite set, and $A = \cup A_i$, $i \in I$, be its partition into disjoint subsets, which are called connected components of $A$. For every subset $C \subset A$ we define connected components $C_i$ of $C$ as those intersections $C_i = C \cap A_i$ which are nonempty. These components are indexed by $i \in I(C) \subseteq I$. Denote by $G(C)$ the set of all connected graph structures with some of the sets $C_i$ as their vertices. (For example, a single set $C_i$ is an element of $G(C)$.) For $g \in G(C)$ we denote by $I(g) \subseteq I(C)$ the subset of indices corresponding to the vertices of $g$. Let $G = \cup_C G(C)$. (The element $g$ of $G$ can be thought of as a collection of nonempty subsets $C_i \subseteq A_i$, $i \in J \subseteq I$, together with a structure of a connected graph on the set $J$ of vertices.) Suppose the function $f$ is defined on $G$. Consider the sum
\[
S(f) = \sum_{C \subset A} (-1)^{|A \setminus C|} \sum_{g \in G(C)} f(g). \tag{7.45}
\]

Note that in general the term $f(g)$ would appear more than once in $S(f)$. Indeed, if for example the set $C$ consists of two connected components, $C = C_{i_1} \cup C_{i_2}$, while the graph $g \in G(C)$ has one vertex $C_{i_1}$ and no bonds, then for any $e \in C_{i_2}$ we also have $g \in G(C \setminus e)$. Hence, some cancellations in (7.45) should be expected. Let the (integer) coefficients $c(g)$, $g \in G$, be the result of these cancellations; in other words, define them to be the coefficients of the formal expansion of the sum $S(f)$:
\[
\sum_{g \in G} c(g) f(g) = \sum_{C \subset A} (-1)^{|A \setminus C|} \sum_{g \in G(C)} f(g). \tag{7.46}
\]
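The coefficients $c(g)$ defined by the formal expansion (7.46) can be computed directly for a small set $A$. The sketch below makes two simplifying assumptions for illustration: the elements of $A$ are integers, and an element $g$ is represented only by its vertex data, a dictionary `vertices` mapping an index $i$ to a nonempty subset of $A_i$, since the bonds of $g$ play no role in deciding whether $g \in G(C)$. The names `coefficient` and `nonempty_subsets` are hypothetical.

```python
from itertools import combinations, product

def subsets(s):
    """All subsets of s, as frozensets."""
    s = sorted(s)
    for r in range(len(s) + 1):
        for c in combinations(s, r):
            yield frozenset(c)

def nonempty_subsets(s):
    return [c for c in subsets(s) if c]

def coefficient(parts, vertices):
    """c(g) for a graph g whose vertices are the sets vertices[i], a subset of parts[i]."""
    A = frozenset().union(*parts.values())
    total = 0
    for C in subsets(A):
        # g belongs to G(C) iff each vertex of g is exactly the component C & A_i.
        if all(C & parts[i] == Ci for i, Ci in vertices.items()):
            total += (-1) ** len(A - C)
    return total

parts = {1: frozenset({0, 1}), 2: frozenset({2}), 3: frozenset({3, 4})}
for r in range(1, len(parts) + 1):
    for J in combinations(parts, r):
        for choice in product(*(nonempty_subsets(parts[i]) for i in J)):
            c = coefficient(parts, dict(zip(J, choice)))
            print(J, [sorted(x) for x in choice], c)
```

In this example every graph that misses at least one index gets coefficient 0, while graphs using all three indices get coefficient +1 or -1, with the sign determined by the number of points of $A$ left outside the vertices.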


Lemma 7.9. If $I(g) \ne I$, then $c(g) = 0$. Otherwise $c(g) = \pm 1$.

Proof. Let $g \in G$ be an arbitrary element, and suppose a subset $C_0 \subset A$ is such that $g \in G(C_0)$, and moreover $I(g) = I(C_0) \ne I$. Then the complement $I^c(C_0) = I \setminus I(C_0)$ is nonempty. Suppose now that $C \subset A$ is a subset for which the inclusion $g \in G(C)$ also holds. This is equivalent to the statement that the set $C = C_0 \cup B$, where $B$ is any subset of the union $A^c(C) = \cup_{i \in I^c(C_0)} A_i$. So
\[
c(g) = \pm \sum_{B \subseteq A^c(C)} (-1)^{|B|} = 0.
\]
On the other hand, if $I(g) = I$, then clearly the graph $g$ appears in the sum (7.46) exactly once.

7.4.3. Interlaced structures: The term $R^2$. We start with the following statement, which will be used in this and the next subsection. Let $D \subset \mathbb{Z}^1$ be a finite set, which is supposed to be $(1 - \varepsilon)$-disconnected. We consider again the smallest segment $[l_D, r_D] \subset \mathbb{Z}^1$ such that $D \subset [l_D, r_D]$, and so $\operatorname{diam}(D) = r_D - l_D$. Let $D_i \subset D$, $i \in I$, be a partition of $D$ into $(1 - \varepsilon)$-connected components of $D$. Consider the ensemble $\hat K T(D)$ of all compatible dressed systems $\hat\pi$, isolating the set $D$. Let us assign to each system $\hat\pi$ the weight $E(\hat\pi)$, introduced in (7.29). We are interested in a certain event $A(D, l)$ in this ensemble, which we describe next. Roughly speaking, $A(D, l)$ happens when the system $\hat\pi$ "covers" the segment $[l_D, r_D + l]$. More precisely it means the following. We say that $\hat\pi \in A(D, l)$ iff
• the partition $\Pi(g(\hat\pi))$ of $I$ contains precisely one element;
• the geometric projection of the set of bonds $b(\hat\pi) = \pi \cup \left( \cup_{M_i \in \hat\pi} M_i \right) \subset \mathbb{R}^2$ on the $x$ axis contains the segment $[l_D, r_D + l]$.

Lemma 7.10. There exists $\beta' = \beta'(\beta, \varepsilon)$, $\beta' \to \infty$ as $\beta \to \infty$, such that
\[
\frac{\sum_{\hat\pi \in A(D, l)} E(\hat\pi)}{\prod_{j=1}^{n} S_\beta(L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset))} \le \exp\{-\beta' (\operatorname{diam}(D) + l)\}. \tag{7.47}
\]

Proof. Let the dressed system $\hat\pi \in A(D, l)$, $\hat\pi = \{\pi, M_1, \dots, M_k\}$, and $\pi = \{\gamma_1, \dots, \gamma_l\}$. The proof will be an adaptation of the Peierls argument to our setting. To implement it we will define first the $(-)$-sets $N_\alpha$ of $\pi$. For that let us consider the configuration $\sigma_\pi$, which is defined by the property that its contours are precisely the contours $\gamma_1, \dots, \gamma_l$. The sets $N_\alpha$ are defined as the maximal connected components of the set $\{t \in \mathbb{Z}^2 : \sigma_\pi(t) = -1\} \setminus D$. So they are either in the upper or in the lower halfplane. We think about them as composed from unit closed plaquettes centered at the sites of the lattice $\mathbb{Z}^2$. In the extreme case the collection $N_\alpha$ is empty, and that corresponds to the shortest possible collection $\{\dot\gamma_1, \dots, \dot\gamma_{l(D)}\}$ of contours, isolating $D$. The set of bonds belonging to the contours in this minimal collection will be denoted by $\Gamma_D$. Consider the connected components $T_i$ of the union $(\cup_\alpha N_\alpha) \cup \left( \cup_{i=1}^{k} M_i \right)$. The elementary (Peierls) surgery of the system $\hat\pi$ corresponds to the removal of one such component $T = \left( \cup_{\alpha \in A(T)} N_\alpha \right) \cup \left( \cup_{\beta \in B(T)} M_\beta \right)$ and results in a system $\hat\pi' = \{\pi', M'_1, \dots, M'_{k'}\}$, which is obtained from $\hat\pi$ in the following way. The collection $\{M'_1, \dots, M'_{k'}\}$ is just the initial collection $\{M_1, \dots, M_k\}$, from which the subcollection $\{M_\beta, \beta \in B(T)\}$ is removed. The family $\pi' = \{\gamma'_1, \dots, \gamma'_{l'}\}$ is defined to be the collection of all the contours of the configuration $\sigma_{\pi'}$, which is given by the relation
\[
\sigma_{\pi'}(t) = \begin{cases} +1 & \text{for } t \in \cup_{\alpha \in A(T)} N_\alpha, \\ \sigma_\pi(t) & \text{otherwise.} \end{cases}
\]

(So indeed it is just flipping some spins inside some contours.) The natural thing to do is to perform such a surgery only on those $T$'s which contain extra long contours. Such $T$'s will be called bridges, and we define them as follows: the component $T$ is a bridge, iff it is adjacent to at least two different $(1 - \varepsilon)$-connected components $D_i$ of $D$. Let us denote the corresponding set of these indices $i$ by $I(T) \subset I$. The second (and the last) case when a component (with $|I(T)| \ge 1$) is called a bridge is when it is responsible for the overhang over the segment $[r_D, r_D + l]$. (The component then picks up its surplus length from the overhang.) These bridges will be called Avignon bridges, or A-bridges, though in such cases the set $I(T)$ might be one-element. (Note. We will not perform surgeries on A-bridges $T$ having single element sets $I(T)$.) We denote by $B(\hat\pi)$ the collection of all bridges of $\hat\pi$. For every bridge $T$ we define the segment $S(I(T)) \subseteq I$ as the smallest one containing the set $I(T)$. For every $T$ we introduce the subset $D_T = \cup_{i \in I(T)} D_i \subset D$, the base of the bridge. Let $[l_{D_T}, r_{D_T}]$ be the smallest segment containing the base $D_T$. It follows from the condition $\hat\pi \in A(D, l)$ that the collection of segments $\{S(I(T)), T \in B(\hat\pi)\}$ is a covering of $I$.

We begin the proof by considering the case of dressed systems which have no A-bridges $T$ with single element sets $I(T)$. So we introduce the subset $A'(D, l) \subset A(D, l)$ as the set of all dressed systems $\hat\pi \in A(D, l)$ with no A-bridges $T$ with $|I(T)| = 1$. By $A''(D, l) \subset A(D, l)$ we denote its complement. Let $T \in B(\hat\pi)$ be some bridge of $\hat\pi \in A'(D, l)$, and $\hat\pi'$ be the result of the surgery. We are going to compare the two weights, $E(\hat\pi)$ and $E(\hat\pi')$. To do it, we will introduce the boundary $\partial T$ of $T$ to be the set $\partial T = \partial\left( \cup_{\alpha \in A(T)} N_\alpha \right) \cup \left( \cup_{\beta \in B(T)} \delta M_\beta \right)$. We split it into two subsets:
\[
\partial^+ T = \Gamma_D \cap \partial\left( \cup_{\alpha \in A(T)} N_\alpha \right), \qquad \partial^- T = \left[ \partial\left( \cup_{\alpha \in A(T)} N_\alpha \right) \setminus \partial^+ T \right] \cup \left( \cup_{\beta \in B(T)} \delta M_\beta \right).
\]
We are using this notation since the result of a surgery is the removal of $\partial^- T$ and replacing it by $\partial^+ T$. Therefore if we introduce the quantity $l(T) = |\partial^- T| - |\partial^+ T|$, then we have:
\[
E(\hat\pi) \le E(\hat\pi') \exp\{-2\beta \, l(T)\}.
\]
For a family $B$ of bridges we also define
\[
l(B) = \sum_{T \in B} l(T).
\]
Likewise we define
\[
L(T) = |\partial T|, \qquad L(B) = \sum_{T \in B} L(T).
\]
Note that
\[
L(T) \le l(T) + 2|D_T|.
\]


Let us apply repeatedly the above estimates to the sequence $T_i \in B(\hat\pi)$ of all bridges of $\hat\pi$, and denote by $\dot\pi = \dot\pi(\hat\pi)$ the final system. Then we likewise have
\[
E(\hat\pi) \le E(\dot\pi) \exp\{-2\beta \, l(B(\hat\pi))\}. \tag{7.48}
\]
We need to have a lower estimate on $l(T)$ and $l(B)$. Since $|I(T)| > 1$, the base $D_T$ of the bridge $T$ is $(1 - \varepsilon)$-disconnected, and its subsets $D_i$, $i \in I(T)$, are its $(1 - \varepsilon)$-connected components. We claim that
\[
|\partial^- T| - |\partial^+ T| \ge 2 \left( \operatorname{diam}(D_T) - \sum_{i \in I(T)} |D_i| \right),
\]
and
\[
|\partial^- T| - |\partial^+ T| \ge 2 \left( \operatorname{diam}(D_T) + l - \sum_{i \in I(T)} |D_i| \right),
\]
if the bridge $T$ is an A-bridge. It is most easily seen in the case when the set $\partial^+ T$ is the maximal possible; i.e. $\partial^+ T = \Gamma_D \cap \{(x, y) \in \mathbb{R}^2 : y = 1/2\}$. Then the set $\partial T$ contains a double connection between points $l_{D_T}$ and $r_{D_T}$ (and an additional overhang of the length $\ge 2l$ in the case of an A-bridge), while always $|\partial^+ T| \le \sum_{i \in I(T)} |D_i|$. In the general case the set $\partial^+ T$ is less than $\Gamma_D \cap \{(x, y) \in \mathbb{R}^2 : y = 1/2\}$ by a subset $\Delta$, say, but then this $\Delta$ can be added both to $\partial^+ T$ and $\partial^- T$, which reduces this case to the previous special one. Hence,
\[
|\partial^- T| - |\partial^+ T| \ge 2\varepsilon \operatorname{diam}(D_T),
\]
and
\[
|\partial^- T| - |\partial^+ T| \ge 2\varepsilon \operatorname{diam}(D_T) + 2l
\]
for an A-bridge, since the set $D_T$ is $(1 - \varepsilon)$-disconnected. So, because the collection $S(I(T_i))$ of segments is a covering of $I$, we have
\[
l(B) \ge 2\varepsilon \operatorname{diam}(D) + 2l. \tag{7.49}
\]

Let us sum the estimates (7.48) over all configurations $\hat\pi$ with $B(\hat\pi) = B$, where $B$ is a fixed collection of bridges. We have:
\[
\sum_{\hat\pi : B(\hat\pi) = B} E(\hat\pi) \le \exp\{-2\beta \, l(B)\} \sum_{\hat\pi : B(\hat\pi) = B} E(\dot\pi(\hat\pi)). \tag{7.50}
\]
But the systems $\dot\pi$ have no bridges. So every such system is an element of the product $\prod_{j=1}^{n} L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset)$. Also, if $\hat\pi_1 \ne \hat\pi_2$, $B(\hat\pi_1) = B(\hat\pi_2) = B$, then $\dot\pi(\hat\pi_1) \ne \dot\pi(\hat\pi_2)$. Hence
\[
\frac{\sum_{\hat\pi : B(\hat\pi) = B} E(\dot\pi(\hat\pi))}{\prod_{j=1}^{n} S_\beta(L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset))} \le 1. \tag{7.51}
\]
So we arrive at an estimate:
\[
\frac{\sum_{\hat\pi \in A'(D, l)} E(\hat\pi)}{\prod_{j=1}^{n} S_\beta(L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset))} \le \sum_{\hat\pi : B(\hat\pi) = B} \exp\{-2\beta \, l(B)\}.
\]


The estimate of the last sum is standard combinatorics. Note first that a bridge $T$ is completely defined by the two sets of bonds: $\partial\left( \cup_{\alpha \in A(T)} N_\alpha \right)$ and $\cup_{\beta \in B(T)} \delta M_\beta$; together they form the connected set $\partial T$. Let us call the bonds of the former $\gamma$-bonds, and the bonds of the latter $M$-bonds. Consider the set of all $T$'s such that they contain a given bond $b$, while $L(T) = L$. Then the number of such bridges is bounded from above by $8^L$; the extra factor 2 comes from the option for a bond to be a $\gamma$-bond or an $M$-bond. Each bridge is attached to the set $D$ along several bonds; let us choose the leftmost one for each. The number of different sets of bonds we can obtain in that way is clearly less than $2^{|D|}$. If that collection is fixed, then the number of different collections $B$ of bridges with $L(B) = L$ is bounded from above by the same quantity $8^L$. So by using (7.49) we arrive at an estimate
\[
\frac{\sum_{\hat\pi \in A'(D, l)} E(\hat\pi)}{\prod_{j=1}^{n} S_\beta(L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset))} \le \sum_{k \ge 2\varepsilon (r_D - l_D) + 2l} 2^{r_D - l_D} \, 8^{2(r_D - l_D) + k} \exp\{-2\beta k\}.
\]

This estimate proves our lemma for the case of configurations which have no A-bridges $T$ with $|I(T)| = 1$. The argument for the remaining $\hat\pi$'s, which form the set $A''(D, l)$, is essentially the same, apart from one modification. Let $T_{i_1}, \dots, T_{i_k}$ be all the A-bridges $T$ of the system $\hat\pi$ with $|I(T)| = 1$. The base of each of them is just a single $(1 - \varepsilon)$-connected component of $D$. Let $\bar T$ be the one for which its base $D_{\bar T}$ is the leftmost such component. The rest of the A-bridges can be ignored, so for the sake of simplicity of the exposition we will suppose that $\bar T$ is the only A-bridge of $\hat\pi$. If we denote by $\bar i$ the only element the set $I(\bar T)$ has, then, according to our notation, $D_{\bar T} = D_{\bar i}$. We introduce also the segment $[l_{D_{\bar i}}, r_{D_{\bar i}}]$ as the smallest one containing the set $D_{\bar i}$.

Let now $\bar B(\hat\pi)$ be the collection of the remaining bridges of $\hat\pi$. We again do the surgeries over all of them, denote by $\dot\pi = \dot\pi(\hat\pi)$ the final system, and have
\[
E(\hat\pi) \le E(\dot\pi) \exp\{-2\beta \, l(\bar B(\hat\pi))\}.
\]
Note that the resulting dressed system $\dot\pi$ still contains the A-bridge $\bar T$ (as its only bridge). Let $\hat\pi_{\bar i} = \hat\pi_{\bar i}(\hat\pi) \subset \dot\pi$ be the dressed subsystem attached to $D_{\bar T}$ (which subsystem develops the bridge $\bar T$). Clearly, $\hat\pi_{\bar i} \in L(D_{\bar i}, \{\bar i\}, \{D_{\bar i}\}, (g_{\bar i})_\emptyset)$. Let us write $\dot\pi$ as the disjoint union, $\dot\pi = \tilde\pi \cup \hat\pi_{\bar i}$. Note that $E(\tilde\pi \cup \hat\pi_{\bar i}) = E(\tilde\pi) E(\hat\pi_{\bar i})$, and that the system $\tilde\pi$ has no bridges. So the analog of (7.50) looks as follows:
\[
\sum_{\hat\pi : B(\hat\pi) = \bar B \cup \bar T} E(\hat\pi) \le \exp\{-2\beta \, l(\bar B)\} \sum_{\hat\pi : B(\hat\pi) = \bar B \cup \bar T} E(\tilde\pi) E(\hat\pi_{\bar i}).
\]
Instead of (7.51) we write:
\[
\frac{\sum_{\hat\pi : B(\hat\pi) = \bar B \cup \bar T} E(\tilde\pi) E(\hat\pi_{\bar i})}{\prod_{j=1}^{n} S_\beta(L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset))} \le \frac{\sum_{\hat\pi_{\bar i} \in L(D_{\bar i}, \{\bar i\}, \{D_{\bar i}\}, (g_{\bar i})_\emptyset) : \, B(\hat\pi_{\bar i}) = \bar T} E(\hat\pi_{\bar i})}{S_\beta(L(D_{\bar i}, \{\bar i\}, \{D_{\bar i}\}, (g_{\bar i})_\emptyset))}.
\]
The rhs of the last estimate (in contrast with the lhs of (7.47)!) can be interpreted as a probability of a certain event in the ensemble $L(D_{\bar i}, \{\bar i\}, \{D_{\bar i}\}, (g_{\bar i})_\emptyset)$: namely, it is the event of observing the dressed system living on $D_{\bar i}$, having the $\bar T$-shaped overhang covering the segment $[r_{D_{\bar i}}, r_D + l]$. As in the relation (7.42), we conclude that

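\[
\frac{\sum_{\hat\pi_{\bar i} \in L(D_{\bar i}, \{\bar i\}, \{D_{\bar i}\}, (g_{\bar i})_\emptyset) : \ \hat\pi_{\bar i} \text{ hangs over } [r_{D_{\bar i}}, r_D + l]} E(\hat\pi_{\bar i})}{S_\beta(L(D_{\bar i}, \{\bar i\}, \{D_{\bar i}\}, (g_{\bar i})_\emptyset))} \le \exp\{-c' \beta \, (r_D + l - r_{D_{\bar i}})\}.
\]
On the other hand,
\[
l(\bar B) \ge 2\varepsilon \operatorname{diam}(D_1 \cup \dots \cup D_{\bar i}),
\]
provided $\bar i > 1$, in which case $\varepsilon \operatorname{diam}(D_1 \cup \dots \cup D_{\bar i}) + r_D + l - r_{D_{\bar i}} \ge \varepsilon \operatorname{diam}(D) + l$. When $\bar i = 1$ we likewise have $r_D + l - r_{D_1} \ge \varepsilon \operatorname{diam}(D) + l$, because $D$ is not $(1 - \varepsilon)$-connected. The rest of the argument for the $A''$ case is the same.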

We now can formulate and prove the statement about the structure of the term $R^2$, similar to the one about $R^1$.

Proposition 7.11. In the notation of Proposition 7.5 there exists a function $w^2$ defined for all finite $(1 - \varepsilon)$-disconnected subsets of $\mathbb{Z}^1$, such that for
\[
R^2 = R^2(D) = \left[ \sum_{k=1}^{n-1} \; \sum_{\Pi = \Lambda_1, \dots, \Lambda_k} \; \sum_{\substack{g_1 \in G(\Lambda_1), \dots, g_k \in G(\Lambda_k): \\ \Pi(g_j) = \Lambda_j}} \; \prod_{j=1}^{k} S_\beta(L(D(\Lambda_j), \Lambda_j, \{D_i, i \in \Lambda_j\}, g_j)) \right] \times \left[ \prod_{j=1}^{n} S_\beta(L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset)) \right]^{-1} \tag{7.52}
\]
the following representation holds:
\[
R^2(D) = \sum_{\Pi} \; \prod_{\Lambda \in \Pi : \, |\Lambda| > 1} w^2(D_\Lambda).
\]

The function $w^2$ satisfies the estimate:
\[
|w^2(D_\Lambda)| \le \exp\{-\beta' \operatorname{diam}(D_\Lambda)\}.
\]

Proof. We remind the reader that each term in the expression for $R^2$ corresponds to a collection of graphs $g_1 \in G(\Lambda_1), \dots, g_k \in G(\Lambda_k)$ with $\Pi(g_j) = \Lambda_j$, such that not all of the sets $\Lambda_j$ are one-point sets. Note that all one-point sets $\Lambda_j$ cancel out, and the remainder factors into the product of terms of the type
\[
Q^2(\bar\Lambda) = \frac{\sum_{g \in G(\bar\Lambda) : \, \Pi(g) = \bar\Lambda} S_\beta(L(D(\bar\Lambda), \bar\Lambda, \{D_i, i \in \bar\Lambda\}, g))}{\prod_{j \in \bar\Lambda} S_\beta(L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset))},
\]
where $\bar\Lambda \subset I$ is a nontrivial segment. But this is exactly the situation we were considering in the preceding lemma (with $l = 0$), and the result follows.


7.4.4. The general term $R^3$. Now we are in the position to consider the generic term in (7.22). The idea is to combine the methods of the two previous subsections. This term corresponds to a partition $\Pi \equiv \Pi_I = (\Lambda_1, \Lambda_2, \dots, \Lambda_j, \dots, \Lambda_{|J|})$, $j \in J$, of the index set $I$ into consecutive segments, and is given by
\[
R^3 = R^3_\Pi = - \left[ \sum_{\substack{g_1 \in G(\Lambda_1), \dots, g_k \in G(\Lambda_k): \\ \Pi(g_j) = \Lambda_j}} \; \sum_{P \in R(\Pi, \{g_j\})} \; \prod_{\hat\pi_j \in P} E(\hat\pi_j) \right] \times \left[ \prod_{j=1}^{n} S_\beta(L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset)) \right]^{-1}. \tag{7.53}
\]
Now we treat the inner sum of (7.53) in the same way we treated the sum (7.26). Namely,
\[
- \sum_{P \in R(\Pi, \{g_j\})} \frac{\prod_{\hat\pi_j \in P} E(\hat\pi_j)}{\prod_{j=1}^{n} S_\beta(L(D_j, \{j\}, \{D_j\}, (g_j)_\emptyset))} = \sum_{h \in G(J), \, |h| \ge 0} (-1)^{|h|} \sum_{P \in R(\Pi, \{g_j\}, h)} \frac{\prod_{\hat\pi_j \in P} E(\hat\pi_j)}{\prod_{j=1}^{n} S_\beta(L_j)}.
\]

Here we are dealing with the case $\Pi_I \ne \Pi^*_I$ (the latter being the partition into single points), and that is why the case $|h| = 0$ is included, in contrast with (7.26). Now we introduce the set $C(h)$ of all maximal connected subgraphs of $h$, excluding isolated vertices, corresponding to trivial graphs $g_j$. Then
\[
(-1)^{|h|} \sum_{P = \{\hat\pi_j : j \in \bar h\} \in R(\Pi, \{g_j\}, h)} \; \prod_{\hat\pi_j \in P} E(\hat\pi_j) = \prod_{k \in C(h)} (-1)^{|k|} \sum_{P = \{\hat\pi_j : j \in \bar k\} \in R(\Pi, \{g_j\}, k)} \; \prod_{\hat\pi_j \in P} E(\hat\pi_j). \tag{7.54}
\]
Here $\bar k$ is a subset of $J$. So we define the weight $Q^3_\beta(k)$ of a connected graph $k$ with vertices among the points of the index set $J$ by
\[
Q^3_\beta(k) = (-1)^{|k|} \, \frac{\sum_{P = \{\hat\pi_j : j \in \bar k\} \in R(\Pi, \{g_j\}, k)} \prod_{\hat\pi_j \in P} E(\hat\pi_j)}{\prod_{j \in \bar k} S_\beta(L_j)}. \tag{7.55}
\]
We then pass to the weight
\[
Q^3_\beta(s) = \sum_{h \in H(s)} \; \prod_{k \in C(h)} Q^3_\beta(k), \tag{7.56}
\]
where $H(s) \subset G(J)$ is the subset of all graphs which have the segment $s \subset J$ as the only nontrivial element of the partition $\Pi(h)$ of $J$. Using (7.34), we have the estimate
\[
|Q^3_\beta(s)| \le \sum_{\substack{P = \{\hat\pi_j : j \in s\} \in R(\Pi, \{g_j\}): \\ g(P) \in H(s)}} \frac{\prod_{\hat\pi_j \in P} E(\hat\pi_j)}{\prod_{i \in \check s} S_\beta(L_i)}, \tag{7.57}
\]


where $\check s \subset I$ is the full preimage of $s \subset J$ under the natural projection of the index set $I$ onto $J$; of course, $|\check s| \ge |s|$. Here comes the difference between the general case and the case of the $R^1$ term: the ratios in (7.57) cannot be interpreted as probabilities, unless $|\check s| = |s|$, in which case all the graphs $g$ are trivial. So we use the lemma of the previous section instead. As in the $R^1$ case, instead of estimating the individual terms in (7.57), we will estimate the sums
\[
\sum_{\substack{P = \{\hat\pi_j : j \in s\} \in R(\Pi, \{g_j\}): \\ sp(g(P)) = \gamma \in H(s)}} \frac{\prod_{\hat\pi_j \in P} E(\hat\pi_j)}{\prod_{i \in \check s} S_\beta(L_i)},
\]

corresponding to different spanning graphs $\gamma$ the families $P$ might have. The rest of the proof follows literally the one for the $R^1$ case and will be omitted. The result is the following

Proposition 7.12. Consider the set of pairs $D', \Pi'$, where $D'$ is a finite subset of $\mathbb{Z}^1$, split into its sufficiently separated components $D'_i$, $i \in I'$, while $\Pi'$ is a partition of $I'$ into consecutive segments. In the notation of Proposition 7.5 there exists a function $w^3$ defined for all pairs $D', \Pi'$, such that the following representation holds:

$$R^3_\Pi(D) = \sum_{\tilde\Pi}\ \prod_{\tilde\Lambda\in\tilde\Pi} w^3\big(D_{\tilde\Lambda},\, \tilde\Lambda\cap\Pi\big).$$

Here the summation is taken over all partitions $\tilde\Pi$ of the set $I$ into segments, such that $\tilde\Pi$ is a strict refinement of $\Pi$, $D_{\tilde\Lambda} = \cup_{i\in\tilde\Lambda} D_i$, while the partition $\tilde\Lambda\cap\Pi$ is the partition of $\tilde\Lambda$ into subsets, which are elements of $\Pi$. The function $w^3$ satisfies the estimate:

$$\big|w^3(D', \Pi')\big| \le \exp\big\{-\beta'\, \tilde d(D', \Pi')\big\}, \tag{7.58}$$

with $\beta' = \beta'(\beta, \varepsilon) \to \infty$ as $\beta \to \infty$ and with the function $\tilde d(D', \Pi')$ defined as follows. Let

$$I' = \Lambda'_r \cup \Lambda'_{r+1} \cup \dots \cup \Lambda'_{r+s}, \quad s \ge 0, \tag{7.59}$$

be our partition $\Pi'$ of the set $I'$ into consecutive segments. Then

$$\tilde d(D', \Pi') = \sum_{i=0}^{s-1} \mathrm{dist}\big(D'_{\Lambda'_{r+i}},\, D'_{\Lambda'_{r+i+1}}\big) + \sideset{}{^*}\sum_{i=0}^{s} \mathrm{diam}\big(D'_{\Lambda'_{r+i}}\big), \tag{7.60}$$

where the last sum $\sum^*$ is taken only over those $\Lambda'_\cdot$ which are not one-element subsets of $I'$. In the special case when $\Pi'$ is a partition into singletons, the second term in (7.60) disappears, and we have $w^3(D', \Pi') = w^1(D')$. In the special case when $\Pi'$ has just one element (i.e. $s = 0$ in (7.59)), while $D'$ consists of more than one component, we have the first term in (7.60) disappearing, and $w^3(D', \Pi') = w^2(D')$. In the special case when $\Pi'$ has just one element, but $D'$ consists of one component, we have $w^3(D', \Pi') = 1$ (which is in line with (7.58), since in that case $\tilde d(D', \Pi') = 0$).
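The quantity defined in (7.60) can be illustrated with a small Python sketch. It rests on assumptions that are not in the text: components are represented as finite sets of integers listed from left to right, a partition is a list of consecutive index blocks, and `dist`, `diam` are taken to be the usual distance between sets and the diameter of a set on the line. The names `dist`, `diam` and `d_tilde` are purely illustrative.

```python
from itertools import product


def diam(points):
    """Diameter of a finite subset of the integer line (assumed: max - min)."""
    return max(points) - min(points)


def dist(a, b):
    """Distance between two finite subsets of the integer line."""
    return min(abs(x - y) for x, y in product(a, b))


def d_tilde(components, partition):
    """Sketch of (7.60): gaps between consecutive blocks plus the
    diameters of those blocks that contain more than one component.

    components -- list of sets of ints (the components D'_i, left to right)
    partition  -- list of lists of indices into `components`
                  (the consecutive segments of the partition)
    """
    # Union of the components belonging to each segment of the partition.
    blocks = [set().union(*(components[i] for i in seg)) for seg in partition]
    # First sum: distances between neighbouring blocks.
    gaps = sum(dist(blocks[i], blocks[i + 1]) for i in range(len(blocks) - 1))
    # Starred sum: diameters, skipping one-element segments.
    spans = sum(diam(b) for seg, b in zip(partition, blocks) if len(seg) > 1)
    return gaps + spans


if __name__ == "__main__":
    comps = [{0, 1}, {5}, {9, 10}]
    print(d_tilde(comps, [[0], [1], [2]]))   # singletons: only the gaps remain
    print(d_tilde(comps, [[0, 1, 2]]))       # one block: only the diameter remains
    print(d_tilde([{0, 1}], [[0]]))          # one component, one block
```

The three calls correspond to the three special cases listed in the proposition: the partition into singletons, the one-element partition of several components, and the one-element partition of a single component.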


7.4.5. Proof of the polymer representation. Now we have all the ingredients needed to finish the proof of the representation for the partition function $Z_\beta(D)$. We just have to put together the results of Propositions 7.6, 7.11 and 7.12. We need to define for all segments $\Lambda$ with $|\Lambda| > 1$ the weights $w(D_\Lambda)$, where $D_\Lambda \subset D$ is equipped with the partition $D_\Lambda = \cup_{i\in\Lambda} D_i$ into the sufficiently separated components of $D$. The definition is the following:

$$w(D_\Lambda) = \sum_{\Pi_\Lambda} w^3(D_\Lambda, \Pi_\Lambda),$$

where the summation goes over all partitions $\Pi_\Lambda$ of the index set $\Lambda$. As was mentioned in Proposition 7.12, for extreme choices of the partition $\Pi_\Lambda$ we get either the functional $w^1(D_\Lambda)$ or $w^2(D_\Lambda)$. Note that the number of summands in the last expression is at most $2^{|\Lambda|-1}$. On the other hand, the distances between the consecutive subsets $D_i$ are not smaller than 2, hence the entropy term $2^{|\Lambda|-1}$ is beaten for large $\beta$. So the estimate (7.24) follows from (7.58), (7.60), the definition (7.4) and the fact that the starting set $\tilde D$ and hence all $D_\Lambda$'s are $(1-\varepsilon)$-disconnected.

7.5. Final estimate. Now we can obtain the desired estimate on the interaction $U$. Recall that

$$U^{\beta,+}_{T,A} = \sum_{D\subseteq A} (-1)^{|A\setminus D|}\, Q^{\beta,+}_{T,D}. \tag{7.61}$$
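Formula (7.61) has the form of a Möbius (inclusion-exclusion) inversion over subsets. The following Python sketch illustrates that mechanism on toy data only: the function `Q` is an arbitrary stand-in, not the logarithm of a partition function, the sum is taken over all subsets including the empty one, and the names `subsets`, `interaction`, `resum` are hypothetical.

```python
from itertools import combinations


def subsets(s):
    """All subsets of a finite set of sites, as frozensets (empty set included)."""
    items = sorted(s)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)


def interaction(Q, A):
    """U_A = sum over D subset of A of (-1)^{|A \\ D|} Q(D), as in (7.61)."""
    A = frozenset(A)
    return sum((-1) ** (len(A) - len(D)) * Q(D) for D in subsets(A))


def resum(Q, A):
    """Summing U_B over all B subset of A recovers Q(A) (Mobius inversion)."""
    return sum(interaction(Q, B) for B in subsets(A))


if __name__ == "__main__":
    # Toy set function, an assumption for illustration only.
    def toy(D):
        return len(D) ** 2 + sum(D)

    A = {1, 2, 4}
    print(resum(toy, A) == toy(frozenset(A)))

    # An additive set function produces no interaction on sets of two or more sites.
    print(interaction(lambda D: sum(D), {1, 2}))
```

The second call shows why only genuinely non-additive contributions survive in $U$: for an additive `Q` all terms with $|A| \ge 2$ cancel.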

Suppose now that the set $A$ is $(1-\varepsilon)$-disconnected, and $A_i \subset A$, $i \in I$, are its $(1-\varepsilon)$-connected components. Then we can write the formula (7.5) for $Q^{\beta,+}_{T,D}$ for every subset $D \subseteq A$, with $D_i = D \cap A_i$. Let us substitute all these formulas into (7.61). To do the cancellations we are going to use Lemma 7.9, which we apply for the following choice of the function $f(g)$, $g \in G(D)$:

• in case the graph $g$ has for its vertices the components $D_{i_1}, \dots, D_{i_k}$ of $D$, which are consecutive components of $D$, while the bonds of $g$ are the following pairs of components of $D$: $\{D_{i_1}, D_{i_2}\}, \{D_{i_2}, D_{i_3}\}, \dots, \{D_{i_{k-1}}, D_{i_k}\}$, we put $f(g) = G_\beta(D_{[i_1,i_k]})$; in the special case when $k = 1$ and $g$ has just one vertex $D_i$, we put $f(g) = Q^{\beta,+}_{T,D_i}$;
• for all other graphs $g$ we put $f(g) = 0$.

Lemma 7.9 then tells us that the only surviving terms are those which have points in every component of $A$:

$$U^{\beta,+}_{T,A} = \sum_{\substack{D\subseteq A:\\ D\cap A_i\ne\emptyset \text{ for all } i=1,\dots,n}} (-1)^{|A\setminus D|}\, G_\beta\big(D; \{D_k, k\in[1,n]\}\big).$$

The estimate (7.6) implies the following bound:

$$\big|U^{\beta,+}_{T,A}\big| \le 2^{|A|} \exp\{-\beta'\, l(A)\}.$$

So, if the density $\rho(A) < 1 - 3\varepsilon$, then by Lemma 7.3 we have

$$\big|U^{\beta,+}_{T,A}\big| \le \exp\{-\beta''\, \mathrm{diam}(A)\}.$$


8. Proof of the Almost Gibbsianity

What is left now is the check of the DLR equation (1.5). More precisely, we have to define the sets $X^U_V$ of boundary conditions, for which the series (1.2) converges absolutely, and then to check the relation (1.5) for the corresponding boundary conditions. Our choice of $X^U_V$ is the following: let $\delta > 0$ be any real, and define

$$\Omega(\delta) = \{\sigma : \exists\, n = n(\sigma)\ \forall\, k > n\quad m_{[0,k]}(\sigma) > -1+\delta,\ m_{[-k,0]}(\sigma) > -1+\delta\},$$

$$X^U_V = X^U_V(\delta) = \{\sigma : \exists\, n = n(\sigma)\ \forall\, k > n\quad m_{[0,k]\setminus V}(\sigma) > -1+\delta,\ m_{[-k,0]\setminus V}(\sigma) > -1+\delta\},$$

where for every finite set $Y$ and every $\sigma$ we denote by $m_Y(\sigma)$ the average magnetization of $\sigma$ on $Y$:

$$m_Y(\sigma) = \frac{\sum_{t\in Y}\sigma_t}{|Y|}.$$

Let us introduce also the subsets $\Omega(\delta, n) \subset \Omega(\delta)$, $X^U_V(\delta, n) \subset X^U_V(\delta)$ by

$$\Omega(\delta, n) = \{\sigma : \forall\, k > n\quad m_{[0,k]}(\sigma) > -1+\delta,\ m_{[-k,0]}(\sigma) > -1+\delta\},$$

$$X^U_V(\delta, n) = \{\sigma : \forall\, k > n\quad m_{[0,k]\setminus V}(\sigma) > -1+\delta,\ m_{[-k,0]\setminus V}(\sigma) > -1+\delta\}.$$

Clearly, $\Omega(\delta) = \cup_n \Omega(\delta, n)$, $X^U_V(\delta) = \cup_n X^U_V(\delta, n)$. In what follows, the arguments for the sets $\Omega(\delta)$ and $X^U_V(\delta)$ are identical, and so we will present them only for the former case.

Proposition 8.1. For every $\delta < 2$ there exists a value $\beta(\delta)$, such that for all $\beta > \beta(\delta)$, all $V$ finite we have:

$$P^{\beta,+}_T(\Omega(\delta, n)) \uparrow P^{\beta,+}_T(\Omega(\delta)) = 1, \qquad P^{\beta,+}_T(X^U_V(\delta, n)) \uparrow P^{\beta,+}_T(X^U_V(\delta)) = 1.$$

The same statements hold for the finite volume Gibbs measures $p^{\beta,+}_{W_N}(\sigma_{W_N})\,d\sigma_{W_N}$ with (+)-boundary conditions, uniformly in $N$.

Proof. We will show that the measures of the complements $P^{\beta,+}_T\big((\Omega(\delta, n))^c\big)$ go to zero as $n \to \infty$, provided $\beta > \beta(\delta)$ is large enough. Let $\sigma \in (\Omega(\delta, n))^c$. By definition it means that the subset $\sigma^-_{[-n,n]} \subset [-n, n]$ has at least $n(2-\delta)$ elements. Note that the number of all possible subsets of $[-n, n]$ is bounded by $2^{2n}$. We are left with the estimate of the $P^{\beta,+}$-probability in the usual 2D Ising model of the event

$$N_A = \{\sigma = (\sigma_t, t \in \mathbb{Z}^2) : \sigma_A = -1,\ \sigma_{[-n,n]\setminus A} = +1\},$$

where $A$ is a subset of $[-n, n] \subset \mathbb{Z}^1$. If the event $N_A$ takes place, then the set $A$ is isolated from infinity by contours surrounding it. Let us consider the family $\gamma$ of all external contours among these. (Note that these exist with probability one with respect to the measure $P^{\beta,+}$.) They are also separating $A$ from infinity. There are two cases to consider:


1. each of the contours in $\gamma$ intersects the segment $[-n, n]$;
2. there exists a contour in $\gamma$ which does not intersect $[-n, n]$.

In the second case the family $\gamma$ consists in fact of just one contour $\Gamma$, which surrounds the whole segment $[-n, n]$. If $L$ is the rightmost point where $\Gamma$ intersects the $x$ axis, then the $P^{\beta,+}$-probability to observe such a contour is bounded from above by

$$\sum_{L\ge n}\ \sum_{l>2(L+n)} 3^l \exp\{-2\beta l\} \le \exp\{-\beta' n\}$$

for some $\beta'$ diverging with $\beta$. In the first case let us fix an intersection point inside $[-n, n]$ for every contour in $\gamma$. Note that the number of different collections of such intersection points again does not exceed $2^{2n}$. Once such a collection is fixed, the number of different contours of the total length $l$, passing through each point of this collection, is bounded from above by $3^l$. Since $\gamma$ surrounds $A$, the total length $l$ has to be bigger than $2|A|$. Putting all these estimates together we get the following bound:

$$P^{\beta,+}_T\big((\Omega(\delta, n))^c\big) \le 2^{4n} \sum_{l>2n(2-\delta)} 3^l \exp\{-2\beta l\} \le \exp\{-\beta' n(2-\delta)\}.$$
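The last bound is an energy-entropy comparison: the counting factors $2^{4n}$ and $3^l$ compete with the Peierls weight $\exp\{-2\beta l\}$. A minimal numerical sketch in Python is given below, assuming the sum runs over integer lengths $l$ and that $3e^{-2\beta} < 1$. The function names `log_tail` and `effective_rate` are illustrative; the "effective rate" is merely one way to read off a value of $\beta'$ from the middle expression, not a quantity defined in the text.

```python
import math


def log_tail(beta, l0):
    """Logarithm of sum_{l > l0} 3^l exp(-2 beta l) over integers l.

    With r = 3 exp(-2 beta) < 1 the sum is a geometric tail
    r^(floor(l0)+1) / (1 - r); we work with logarithms to avoid underflow.
    """
    r = 3.0 * math.exp(-2.0 * beta)
    if r >= 1.0:
        raise ValueError("need 3*exp(-2*beta) < 1 for the series to converge")
    first = math.floor(l0) + 1
    return first * math.log(r) - math.log1p(-r)


def effective_rate(beta, n, delta):
    """Value b such that 2^(4n) * tail = exp(-b * n * (2 - delta))."""
    m = n * (2.0 - delta)
    log_bound = 4 * n * math.log(2.0) + log_tail(beta, 2 * m)
    return -log_bound / m


if __name__ == "__main__":
    for beta in (2.0, 3.0, 5.0, 10.0):
        print(beta, effective_rate(beta, n=10, delta=0.5))
```

In this sketch the effective rate grows roughly linearly in $\beta$ once $\beta$ is large enough to beat the entropy factors, which is the behaviour claimed for $\beta'$.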

¯ T \V ) can indeed be defined by We proceed by checking that the function pU V (σV /σ U

the formula (1.3), provided σ¯ T \V ∈ X V (δ, n). For this we need to check the absolute U

convergence of the series (1.2) for any n, any σ¯ T \V ∈ X V (δ, n). But it is easy to see U

that for every V and n one can find an n0 (V, n), such that for every σ¯ T \V ∈ X V (δ, n) − with the properties and every σV ∈ X V we have for any subset A ∈ σV ∪ σ¯ T \V δ A ∩ V 6= ∅, A ∩ −∞, −n0 ∪ +n0 , +∞ 6= ∅ that ρ (A) < 1 − (see (2.16)). So 2 by Proposition 2.3 we obtain that the terms of the series (1.2) are exponentially small in the diam (A), with the exponent β 0 , diverging as β → ∞. Since the number of different A’s intersecting V and having diameter diam (A) = l is bounded by |V | + l 2l , the absolute convergence follows. ¯ T \V ) Next let us consider the question of measurability. Note that the function pU V (σV /σ U

U / XV . is not yet defined everywhere thus far; pV (σV /σ¯ T \V ) = 0 for all σ¯ T \V ∈ let us put U

X V (δ) Since due to Proposition 8.1, Pβ,+ T

= 1, we can put instead of 0 any other

constant. The argument of the last paragraph shows that the function pU ¯ T \V ) is V (σV /σ U

continuous on every subspace X V (δ, n) in the topology induced by the product topology U on (though not uniformly in n). Hence it is a measurable function on X V (δ, n) with U respect to the σ-algebra of subsets of X V (δ, n) obtained from the σ-algebra BT by taking U U the intersections of the subsets from BT with X V (δ, n). Since the subsets X V (δ, n) ¯ T \V ) follows. (Note are themselves elements of BT , the BT -measurability of pU V (σV /σ that this function is not continuous on .) The last thing we need to show in order to complete our argument is the state¯ [−N,N ]\V ∪ σT+ \[−N,N ] ) and ment that the two probability distributions, pU V (σV /σ N

¯ [−N,N ]\V ), are close enough, provided that the configuration σ¯ [−N,N ]\V ∪ pU V (σV /σ U

σT+ \[−N,N ] belongs to the family X V (δ, n), while the number N , which may depend on n, is large enough.


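Proposition 8.2. Let the integers $n$, $N$ be fixed, with $N \ge 8n$, $N \gg \mathrm{diam}(V)$. Consider the set $S(n, N)$ of all configurations $\bar\sigma_{[-N,N]\setminus V}$ such that $\bar\sigma_{[-N,N]\setminus V} \cup \sigma^+_{T\setminus[-N,N]} \in X^U_V(\delta, n)$. Then the ratio of the two probability distributions satisfies the following estimate:

$$\sup_{\bar\sigma_{[-N,N]\setminus V}\in S(n,N)} \frac{p^{U}_V\big(\sigma_V/\bar\sigma_{[-N,N]\setminus V}\cup\sigma^+_{T\setminus[-N,N]}\big)}{p^{U^N}_V\big(\sigma_V/\bar\sigma_{[-N,N]\setminus V}\big)} \le \exp\{-\beta' N\}, \tag{8.1}$$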

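with $\beta'$ diverging as $\beta \to \infty$.

Proof. Denote by $\Delta$ the set $\bar\sigma^-_{[-N,N]\setminus V}$. Then the ratio

$$\frac{p^{U}_V\big(\sigma_V/\bar\sigma_{[-N,N]\setminus V}\cup\sigma^+_{T\setminus[-N,N]}\big)}{p^{U^N}_V\big(\sigma_V/\bar\sigma_{[-N,N]\setminus V}\big)}$$

is a function of the quantities $U^{\beta,+}_{T,A}$, $U^{\beta,+,N}_{T,A}$, where $A \subseteq \Delta \cup V$, $A \cap V \ne \emptyset$. We would be done once we show that for every $B \subseteq V$, $B \ne \emptyset$,

$$\Big|\sum_{\substack{A:\, A\subseteq\Delta\cup V,\\ A\cap V=B}} U^{\beta,+}_{T,A} - \sum_{\substack{A:\, A\subseteq\Delta\cup V,\\ A\cap V=B}} U^{\beta,+,N}_{T,A}\Big| \le \exp\{-\beta' N\}.$$

Note, however, that for every set $A$ entering the last sums and such that $A \cap \big[-\frac{N}{8}, \frac{N}{8}\big]^c \ne \emptyset$, we have $\rho(A) \le 1 - \frac{\delta}{2}$, and therefore $|U^{\beta,+}_{T,A}| \le \exp\{-\beta'\,\mathrm{diam}(A)\}$, $|U^{\beta,+,N}_{T,A}| \le \exp\{-\beta'\,\mathrm{diam}(A)\}$, according to Proposition 2.3. Hence the contribution of the corresponding sums is less than $\exp\{-\beta' \frac{N}{10}\}$. What is left now is the treatment of the sets $A \subset \big[-\frac{N}{8}, \frac{N}{8}\big]$, which is possible due to the fact that in such a case the distance $\mathrm{dist}(A, W_N^c) \ge \frac{7N}{8}$. Recall that

$$U^{\beta,+,N}_{T,A} = \sum_{D\subseteq A} (-1)^{|A\setminus D|}\, Q^{\beta,+,N}_{T,D}, \tag{8.2}$$

$$Q^{\beta,+,N}_{T,D} = -\ln\Big[\sum_{\pi\in K_T(D,W_N)} E_N(\pi)\Big], \tag{8.3}$$

$$E_N(\pi) = \exp\Big\{-2\beta|\pi| - \sum_{M:\, M\subset W_N,\ M\cap\Delta(\pi)\ne\emptyset,\ M\cap T=\emptyset} \Phi(M)\Big\}, \tag{8.4}$$

while

$$U^{\beta,+}_{T,A} = \sum_{D\subseteq A} (-1)^{|A\setminus D|}\, Q^{\beta,+}_{T,D}, \tag{8.5}$$

$$Q^{\beta,+}_{T,D} = -\ln\Big[\sum_{\pi\in K_T(D)} E(\pi)\Big], \tag{8.6}$$

$$E(\pi) = \exp\Big\{-2\beta|\pi| - \sum_{M:\, M\cap\Delta(\pi)\ne\emptyset,\ M\cap T=\emptyset} \Phi(M)\Big\}. \tag{8.7}$$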

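So the difference between (8.2) and (8.5) comes from the following facts:

• the sum (8.6) contains extra terms over the sum (8.3);
• the families $\pi$, entering both sums, have different weights (8.4) and (8.7).

Note, however, that the difference

$$\sum_{\pi\in K_T(D,W_N)} E_N(\pi) - \sum_{\pi\in K_T(D,W_{N/2})} E_N(\pi) = O\big(\exp\{-2\beta N\}\big),$$

while each of the above sums is of the order not smaller than $\exp\{-4\beta|D|\} \ge \exp\{-\beta N\}$. Hence

$$\sum_{\pi\in K_T(D,W_N)} E_N(\pi) = \big(1 + O(\exp\{-\beta N\})\big) \sum_{\pi\in K_T(D,W_{N/2})} E_N(\pi).$$

Likewise,

$$\sum_{\pi\in K_T(D)} E(\pi) = \big(1 + O(\exp\{-\beta N\})\big) \sum_{\pi\in K_T(D,W_{N/2})} E(\pi).$$

Finally, for a given family $\pi \in K_T(D, W_{N/2})$ we have the representation

$$\frac{E(\pi)}{E_N(\pi)} = \exp\Big\{-\sum_{M:\, M\cap\Delta(\pi)\ne\emptyset,\ M\cap W_N^c\ne\emptyset,\ M\cap T=\emptyset} \Phi(M)\Big\},$$

and since the last sum is $O\big(N^2\exp\{-2\beta N\}\big)$ uniformly in $\pi$, we have

$$\sum_{\pi\in K_T(D,W_{N/2})} E(\pi) = \big(1 + O(N^2\exp\{-2\beta N\})\big) \sum_{\pi\in K_T(D,W_{N/2})} E_N(\pi).$$

Hence

$$Q^{\beta,+,N}_{T,D} - Q^{\beta,+}_{T,D} = O\big(N^2\exp\{-2\beta N\}\big) + O\big(\exp\{-\beta N\}\big),$$

and

$$\Big|\sum_{\substack{A:\, A\subseteq\Delta\cup V,\\ A\cap V=B}} U^{\beta,+}_{T,A} - \sum_{\substack{A:\, A\subseteq\Delta\cup V,\\ A\cap V=B}} U^{\beta,+,N}_{T,A}\Big| \le 2^{N/4}\, O\big(\exp\{-\beta' N\}\big) + \exp\Big\{-\beta' \frac{N}{10}\Big\}.$$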


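Now we are in a position to check (1.5) for the measure $P^{\beta,+}_T$. We first rewrite the integral $\int_B \big[\sum_{\sigma_V\in X_V} \phi(\sigma_V)\, p^U_V(\sigma_V/\bar\sigma_{T\setminus V})\big]\, P^{\beta,+}_{T\setminus V}(d\bar\sigma_{T\setminus V})$ as a sum:

$$\int_B \Big[\sum_{\sigma_V\in X_V} \phi(\sigma_V)\, p^U_V(\sigma_V/\bar\sigma_{T\setminus V})\Big] P^{\beta,+}_{T\setminus V}(d\bar\sigma_{T\setminus V}) = \int_{B\cap X^U_V(\delta,n)} \Big[\sum_{\sigma_V\in X_V} \phi(\sigma_V)\, p^U_V(\sigma_V/\bar\sigma_{T\setminus V})\Big] P^{\beta,+}_{T\setminus V}(d\bar\sigma_{T\setminus V})$$
$$\qquad + \int_{B\setminus X^U_V(\delta,n)} \Big[\sum_{\sigma_V\in X_V} \phi(\sigma_V)\, p^U_V(\sigma_V/\bar\sigma_{T\setminus V})\Big] P^{\beta,+}_{T\setminus V}(d\bar\sigma_{T\setminus V}),$$

and observe that the second integral goes to zero as $n \to \infty$, due to Proposition 8.1 and because the integrand is bounded. Because the function $p^U_V(\sigma_V/\bar\sigma_{T\setminus V})$ is continuous (= quasilocal) on $X^U_V(\delta, n)$, we can find a value $N = N(\varepsilon)$ big enough, such that

$$\Big|\int_{B\cap X^U_V(\delta,n)} \Big[\sum_{\sigma_V\in X_V} \phi(\sigma_V)\, p^U_V(\sigma_V/\bar\sigma_{T\setminus V})\Big] P^{\beta,+}_{T\setminus V}(d\bar\sigma_{T\setminus V})$$
$$\qquad - \int_{B\cap X^U_V(\delta,n)} \Big[\sum_{\sigma_V\in X_V} \phi(\sigma_V)\, p^U_V(\sigma_V/\bar\sigma_{[-N,N]\setminus V}\cup\sigma^+_{T\setminus[-N,N]})\Big]\, p^{\beta,+}_{W_N,[-N,N]\setminus V}(\bar\sigma_{[-N,N]\setminus V})\, d\bar\sigma_{[-N,N]\setminus V}\Big| < \varepsilon.$$

(Here $p^{\beta,+}_{W_N}$ is just the finite volume Ising distribution with (+)-boundary conditions, while $p^{\beta,+}_{W_N,[-N,N]\setminus V}$ is its projection on the subset $[-N, N]\setminus V \subset T$.) Due to Proposition 8.2 we know that if $N$ is large enough, then we can approximate arbitrarily closely the probability distribution $p^U_V(\cdot/\cdot)$ by $p^{U^N}_V(\cdot/\cdot)$, provided that the conditioning belongs to the class $X^U_V(\delta, n)$:

$$\Big|\int_{B\cap X^U_V(\delta,n)} \Big[\sum_{\sigma_V\in X_V} \phi(\sigma_V)\, p^U_V(\sigma_V/\bar\sigma_{[-N,N]\setminus V}\cup\sigma^+_{T\setminus[-N,N]})\Big]\, p^{\beta,+}_{W_N,[-N,N]\setminus V}(\bar\sigma_{[-N,N]\setminus V})\, d\bar\sigma_{[-N,N]\setminus V}$$
$$\qquad - \int_{B\cap X^U_V(\delta,n)} \Big[\sum_{\sigma_V\in X_V} \phi(\sigma_V)\, p^{U^N}_V(\sigma_V/\bar\sigma_{[-N,N]\setminus V})\Big]\, p^{\beta,+}_{W_N,[-N,N]\setminus V}(\bar\sigma_{[-N,N]\setminus V})\, d\bar\sigma_{[-N,N]\setminus V}\Big| < \varepsilon.$$

Again by Proposition 8.1 we know that

$$\Big|\int_{B\cap X^U_V(\delta,n)} \Big[\sum_{\sigma_V\in X_V} \phi(\sigma_V)\, p^{U^N}_V(\sigma_V/\bar\sigma_{[-N,N]\setminus V})\Big]\, p^{\beta,+}_{W_N,[-N,N]\setminus V}(\bar\sigma_{[-N,N]\setminus V})\, d\bar\sigma_{[-N,N]\setminus V}$$
$$\qquad - \int_{B} \Big[\sum_{\sigma_V\in X_V} \phi(\sigma_V)\, p^{U^N}_V(\sigma_V/\bar\sigma_{[-N,N]\setminus V})\Big]\, p^{\beta,+}_{W_N,[-N,N]\setminus V}(\bar\sigma_{[-N,N]\setminus V})\, d\bar\sigma_{[-N,N]\setminus V}\Big| < \varepsilon,$$

if $n$ is large. But for the last integral we have the identity

$$\int_{B} \Big[\sum_{\sigma_V\in X_V} \phi(\sigma_V)\, p^{U^N}_V(\sigma_V/\bar\sigma_{T\setminus V})\Big]\, p^{\beta,+}_{W_N,[-N,N]\setminus V}(\bar\sigma_{[-N,N]\setminus V})\, d\bar\sigma_{[-N,N]\setminus V} = \int_B \phi(\sigma_{W_N}|_V)\, p^{\beta,+}_{W_N}(\sigma_{W_N})\, d\sigma_{W_N},$$

which is just the partial case of the DLR equation for the Ising model Gibbs measure (see (2.7), (2.10)). Since

$$\int_B \phi(\sigma_{W_N}|_V)\, p^{\beta,+}_{W_N}(\sigma_{W_N})\, d\sigma_{W_N} \to \int_B \phi(\sigma_V)\, P^{\beta,+}_T(d\sigma)$$

as $N \to \infty$, that proves our statement.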

9. Concluding Remarks

The prediction of Remark 2 turned out to be true. Recently in the paper by Bricmont, Kupiainen and Lefevre the corresponding result was proven for the case of the projection of the d-dimensional Ising model to the sublattice $b\mathbb{Z}^d$. In the paper [MV] of Ch. Maes and K. Vande Velde a result close to the results of the present paper is obtained. The difference between [MV] and the present paper is that in [MV] the authors use the one-dimensionality of the problem considered in a more essential way. Namely, they use the observation that any one-dimensional potential $U = (U_A(\sigma_A),\ A \subset \mathbb{Z}^1,\ 0 < |A| < \infty)$ can be reduced to the potential $U^* = (U^*_{[a,b]}(\sigma_{[a,b]}),\ [a,b] \subset \mathbb{Z}^1,\ -\infty < a \le b < +\infty)$. The reduction is given by the formula

$$U^*_{[a,b]}(\sigma_{[a,b]}) = \sum_{A\subset[a,b]:\ a\in A,\ b\in A} U_A\big(\sigma_{[a,b]}|_A\big).$$

The potential U ∗ , constructed in [MV], is related to the potential of the present paper by the above summation. Unfortunately, the paper [MV] contains several erroneous statements; it seems, though, that they can be corrected. The subject of the almost Gibbsian fields is also treated in [ES, MS].
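The reduction formula above can be sketched in a few lines of Python. The representation is an assumption made for illustration: a potential is a callable `U(A, spins)` taking a tuple of sites and the restriction of the configuration to those sites, and the names `interval_supports` and `reduced_potential` are hypothetical.

```python
import math
from itertools import combinations


def interval_supports(a, b):
    """All subsets A of [a, b] that contain both endpoints a and b."""
    if a == b:
        return [(a,)]
    interior = range(a + 1, b)
    return [(a,) + mid + (b,)
            for r in range(b - a)               # r = number of interior sites used
            for mid in combinations(interior, r)]


def reduced_potential(U, a, b, sigma):
    """U*_{[a,b]}(sigma) = sum of U_A(sigma restricted to A) over A with a, b in A."""
    return sum(U(A, {t: sigma[t] for t in A}) for A in interval_supports(a, b))


if __name__ == "__main__":
    sigma = {0: 1, 1: -1, 2: 1, 3: -1}

    # Toy potential (an assumption): size of A times the product of its spins.
    def U(A, spins):
        return len(A) * math.prod(spins.values())

    # Total energy in the window [0, 3], computed in two ways.
    total_star = sum(reduced_potential(U, a, b, sigma)
                     for a in range(4) for b in range(a, 4))
    total = sum(U(A, {t: sigma[t] for t in A})
                for r in range(1, 5) for A in combinations(range(4), r))
    print(total_star == total)
```

The check works because every finite nonempty $A$ has a unique leftmost and rightmost point, so the intervals $[a,b]$ group the sets $A$ without overlap; the total energy of a window is therefore unchanged by the reduction.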


References

[BKL] Bricmont, J., Kupiainen, A. and Lefevre, R.: Renormalization group pathologies and the definition of Gibbs states. Preprint, UC Louvain, 1997
[D80] Dobrushin, R.L.: Gaussian random fields – Gibbsian point of view. In: Multicomponent Random Systems, ed. by R.L. Dobrushin and Ya. G. Sinai, New York: Marcel Dekker, 1980, pp. 119–152
[D95] Dobrushin, R.L.: A Gibbsian representation for non-Gibbsian fields. Lecture given at the workshop ‘Probability and Physics’, September 1995, Renkum (the Netherlands)
[D96] Dobrushin, R.L.: Estimates of semi-invariants for the Ising model at low temperatures. In: Berezin memorial volume, AMS Translations, 177(2), 59–81 (1996)
[DKS] Dobrushin, R.L., Kotecky, R. and Shlosman, S.B.: Wulff construction: a global shape from local interaction. AMS translations series, Providence, RI: American Mathematical Society, 1992
[DS] Dobrushin, R.L. and Shlosman, S.B.: Gibbsian description of ‘non-Gibbsian’ fields. Russ. Math. Surv. 52, 285–297 (1997)
[EFK] van Enter, A.C.D., Fernandez, R., and Kotecky, R.: Pathological behavior of renormalization-group maps at high fields and above the transition temperature. J. Stat. Phys. 79, 969–992 (1995)
[EFS] van Enter, A.C.D., Fernandez, R. and Sokal, A.D.: Regularity properties and pathologies of position-space renormalization transformations: Scope and limitations of Gibbsian theory. J. Stat. Phys. 72, 879–1167 (1993)
[ES] van Enter, A.C.D. and Shlosman, S.B.: (Almost) Gibbsian description of the sign-fields of SOS-fields. To appear in J. Stat. Phys.
[G] Georgii, H.-O.: Gibbs Measures and Phase Transitions. Berlin: Walter de Gruyter, 1988
[Gn] Gnedenko, B.V.: The theory of probability. New York: Chelsea, 1962
[GP78] Griffiths, R.B. and Pearce, P.A.: Position-space renormalization-group transformations: Some proofs and some problems. Phys. Rev. Lett. 41, 917–920 (1978)
[GP79] Griffiths, R.B. and Pearce, P.A.: Mathematical problems of position-space renormalization-group transformations. J. Stat. Phys. 20, 499–545 (1979)
[I] Israel, R.B.: Banach algebras and Kadanoff transformations. In: Random Fields (Esztergom, 1979), J. Fritz, J. L. Lebowitz, D. Szasz eds, Amsterdam: North-Holland, 2, 1981, pp. 593–608
[KP] Kotecky, R. and Preiss, D.: Cluster expansion for abstract polymer models. Commun. Math. Phys. 103, 491–498 (1986)
[MV] Maes, Ch. and Vande Velde, K.: Relative energies for non-Gibbsian states. Commun. Math. Phys. 189, 277–286 (1997)
[MS] Maes, C. and Shlosman, S.B.: Freezing Transition in the Ising Model without Internal Contours. Preprint 1997
[S] Schonmann, R.H.: Projections of Gibbs measures can be non-Gibbsian. Commun. Math. Phys. 124, 1–7 (1989)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 200, 181 – 193 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Large-Time Behavior of Solutions to the Equations of a One-Dimensional Viscous Polytropic Ideal Gas in Unbounded Domains*

Song Jiang

Institute of Applied Physics and Computational Mathematics, P.O. Box 8009 (28] ), Beijing 100088, P.R. China. E-mail: [email protected]

Received: 21 July 1997 / Accepted: 21 June 1998

Abstract: The large-time behavior of solutions to the initial and initial boundary value problems for a one-dimensional viscous polytropic ideal gas in unbounded domains is investigated. Using a special cut-off function to localize the problem, we derive a local representation for the specific volume. With the help of the local representation, and certain new estimates for the temperature and the stress, and the weighted energy estimates, we prove that in any bounded interval, the specific volume is pointwise bounded from below and above for all t ≥ 0 and a generalized solution is convergent as time goes to infinity.

1. Introduction

The present paper is concerned with the large-time behavior of solutions to the equations of motion of a one-dimensional viscous polytropic ideal gas in unbounded domains. The motion of a viscous polytropic ideal gas in $\mathbb{R}$ is described by the following equations in Lagrangian coordinates (see [4, 22]):

$$u_t = v_x, \tag{1.1}$$

$$v_t = \sigma_x, \qquad \sigma := \mu\frac{v_x}{u} - R\frac{\theta}{u}, \tag{1.2}$$

$$c_V\,\theta_t = \lambda\Big[\frac{\theta_x}{u}\Big]_x + \sigma v_x, \tag{1.3}$$

where $u$, $v$, and $\theta$ are the specific volume, the velocity, and the absolute temperature respectively, $\sigma$ is the stress, and $\mu$, $R$, $c_V$ and $\lambda$ are positive constants.

* Supported by the Ministry of Education, the NSF and the CAEP of China


We shall consider the initial and initial boundary value problems for (1.1)–(1.3) in the region $\{t > 0,\ x \in \Omega\}$ with the initial conditions

$$(u(x,0), v(x,0), \theta(x,0)) = (u_0(x), v_0(x), \theta_0(x)), \quad x \in \Omega, \tag{1.4}$$

and the boundary conditions

$$v|_{\partial\Omega} = 0, \quad \theta_x|_{\partial\Omega} = 0, \tag{1.5}$$

or

$$v|_{\partial\Omega} = 0, \quad \theta|_{\partial\Omega} = 1, \tag{1.6}$$

where $\Omega = \mathbb{R}$ for the Cauchy problem (1.1)–(1.4) and $\Omega = (0, \infty)$ for the initial boundary value problems (1.1)–(1.5), and (1.1)–(1.4), (1.6).

Since the first work of Kazhikhov and Shelukhin [14] on the global existence in the dynamics of a one-dimensional viscous polytropic ideal gas for large initial data, significant progress has been made on the mathematical aspect of the initial and initial boundary value problems for (1.1)–(1.3). For initial boundary value problems in bounded domains the existence and uniqueness of global (generalized) solutions and the regularity have been known. Moreover, the global solution is asymptotically stable as time tends to infinity; see [1, 19, 16, 17, 18, 2, 3, 5], among others. For the Cauchy problem (1.1)–(1.4) and the initial boundary value problems (1.1)–(1.5) and (1.1)–(1.4), (1.6) (in unbounded domains), Kazhikhov, Shelukhin [13, 14] (also cf. [1, 21, 7, 8]) proved that if

$$u_0 - 1,\ v_0,\ \theta_0 - 1 \in H^1; \qquad u_0,\ \theta_0 > 0 \ \text{on } \bar\Omega, \tag{1.7}$$

and $u_0$, $v_0$, $\theta_0$ are compatible with (1.5), (1.6), then there exists a unique (generalized) solution $(u, v, \theta)$ with $u, \theta > 0$ such that for any $T > 0$,

$$u - 1,\ v,\ \theta - 1 \in L^\infty((0,T), H^1), \quad u_t \in L^\infty((0,T), L^2), \quad v_t,\ \theta_t,\ u_{xt},\ v_{xx},\ \theta_{xx} \in L^2((0,T), L^2), \tag{1.8}$$

and the regularity holds. The asymptotic behavior as $t \to \infty$ of the solution has been studied under some smallness conditions on the initial data; see [10, 12, 20, 11, 6, 15, 9]. However, nothing has been known on the large-time behavior of the solution in the case of large data.

The aim of the present work is to investigate the large-time behavior of the generalized solution of (1.1)–(1.4), (1.1)–(1.5), and (1.1)–(1.4), (1.6). In studying the large-time behavior of the solution, some difficulties arise from the unboundedness of the domains. In the case of bounded domains we can derive a useful representation of $u(x,t)$, which, together with the imbedding $L^2 \hookrightarrow L^1$, yields pointwise bounds of $u(x,t)$. For unbounded domains, however, neither this representation nor the imbedding $L^2 \hookrightarrow L^1$ remains valid (these are two of the difficulties). To overcome such difficulties we have to introduce new techniques. In this paper we use a special cut-off function to localize the problem, and then by modifying Kazhikhov's argument for bounded domains [1, 19], we derive a local representation of $u(x,t)$. Because of the choice of the cut-off function the error term induced by the cut-off function has the right sign and can be controlled using the local representation and new estimates for $\theta(x,t)$ and $\sigma(x,t)$ which implicitly show the lower pointwise boundedness of $\theta$ for all $t \ge 0$ (cf. (2.15), (2.17), (2.22)), such that in any bounded interval $u(x,t)$ is pointwise bounded


from below and above for all $t \ge 0$. With the help of the local pointwise bounds of $u(x,t)$ we apply the weighted energy estimates with different weights to prove that in any bounded interval the solution converges to a (constant) state as time goes to infinity. More precisely, denote

$$u_1(t) := \frac{1}{k}\int_0^k u(x,t)\,dx, \qquad u_2(t) := \frac{1}{2k}\int_{-k}^{k} u(x,t)\,dx; \tag{1.9}$$

the main result of this paper reads:

Theorem 1.1. Let $k > 0$ be an arbitrary but fixed integer. Let (1.7) be satisfied and let $(u, v, \theta)$ be a generalized solution of (1.1)–(1.4), or (1.1)–(1.5), or (1.1)–(1.4), (1.6). Then we have the following:

i) For the initial boundary value problem (1.1)–(1.5) there are positive constants $\underline{u}$ and $\overline{u}$, independent of $t$, such that $\underline{u} \le u(x,t) \le \overline{u}$ for all $x \in [0, k]$ and $t \ge 0$. Moreover, as $t \to \infty$ we have

$$\|u(t) - u_1(t)\|_{L^2(0,k)} \to 0; \quad \text{for an arbitrary but fixed integer } l \ge 2,\ \|v(t)\|_{L^{2l}(0,k)} \to 0. \tag{1.10}$$

ii) For the initial boundary value problem (1.1)–(1.4), (1.6) we have

$$\|(u(t) - u_1(t),\ v(t),\ \theta(t) - 1)\|_{H^1(0,k)} \to 0 \quad \text{as } t \to \infty. \tag{1.11}$$

iii) For the Cauchy problem (1.1)–(1.4) we have $\underline{u} \le u(x,t) \le \overline{u}$ for all $x \in [-k, k]$ and $t \ge 0$, where $\underline{u}$, $\overline{u}$ are positive constants independent of $t$. If, in addition, $u_0$, $\theta_0$ are even functions and $v_0$ is an odd function, then as $t \to \infty$,

$$\|u(t) - u_2(t)\|_{L^2(-k,k)} \to 0; \quad \text{for an arbitrary but fixed integer } l \ge 2,\ \|v(t)\|_{L^{2l}(-k,k)} \to 0. \tag{1.12}$$

We shall prove Theorem 1.1 in Sect. 3. In Sect. 2 we derive local pointwise bounds of $u(x,t)$.

Notation (used throughout this paper). Let $G$ be a domain in $\mathbb{R}$. Let $m \ge 0$ be a nonnegative integer and let $1 \le p \le \infty$. By $W^{m,p}(G)$ we denote the usual Sobolev space defined over $G$ with norm $\|\cdot\|_{W^{m,p}(G)}$; $W^{m,2}(G) \equiv H^m(G)$ with norm $\|\cdot\|_{H^m(G)}$, $W^{0,p}(G) \equiv L^p(G)$ with norm $\|\cdot\|_{L^p(G)}$. For simplicity we also use the following abbreviations: $L^p \equiv L^p(\Omega)$, $H^m \equiv H^m(\Omega)$; $\|\cdot\|_{L^p} \equiv \|\cdot\|_{L^p(\Omega)}$, $\|\cdot\|_{H^m} \equiv \|\cdot\|_{H^m(\Omega)}$. $\|\cdot\|$ stands for the norm in $L^2(\Omega)$. $L^p(I, B)$ resp. $\|\cdot\|_{L^p(I,B)}$ denotes the space of all strongly measurable, $p$th-power integrable (essentially bounded if $p = \infty$) functions from $I$ to $B$ resp. its norm, $I \subset \mathbb{R}$ an interval, $B$ a Banach space. For a vector valued function $f = (f_1, \dots, f_m)$ and a normed space $X$ with the norm $|||\cdot|||$, $f \in X$ means that each component of $f$ is in $X$; we put $|||f||| := |||f_1||| + \dots + |||f_m|||$. The same letter $C$ (sometimes used as $C(k)$ to emphasize the dependence of $C$ on $k$) will denote various positive constants which do not depend on the time $t$.


2. Local Pointwise Estimates of u

In this section we prove that in any bounded interval $u(x,t)$ is bounded from below and above for all $t \ge 0$. We start with the following lemma, which is motivated by the second law of thermodynamics and embodies the dissipative effects of viscosity and thermal diffusion.

Lemma 2.1. There is a positive constant $e_0$, independent of $t$, such that

$$\int_\Omega U(x,t)\,dx + \int_0^t\!\!\int_\Omega \Big(\lambda\frac{\theta_x^2}{u\theta^2} + \mu\frac{v_x^2}{u\theta}\Big)\,dx\,ds \le e_0 \quad \forall\, t \ge 0, \tag{2.1}$$

where

$$U(x,t) := \big[v^2/2 + R(u - \log u - 1) + c_V(\theta - \log\theta - 1)\big](x,t). \tag{2.2}$$

Proof. Using Eqs. (1.1)–(1.3), we obtain after a straightforward calculation that

$$U_t + \mu\frac{v_x^2}{u\theta} + \lambda\frac{\theta_x^2}{u\theta^2} = [v\sigma]_x + R v_x + \lambda\Big[\Big(1 - \frac{1}{\theta}\Big)\frac{\theta_x}{u}\Big]_x.$$

By virtue of Taylor's theorem, (1.7), and Sobolev's imbedding theorem ($H^1 \hookrightarrow L^\infty$), $\int_\Omega U(x,0)\,dx \le C(1 + \|(u_0 - 1, v_0, \theta_0 - 1)\|^2)$. So, if we integrate the above identity over $\Omega \times [0, t]$ ($t \ge 0$) and use the boundary conditions (1.5) or (1.6), we obtain (2.1). The proof is complete.

From Lemma 2.1 we see that

$$\int_i^{i+1} (u - \log u - 1)(x,t)\,dx,\ \ \int_i^{i+1} (\theta - \log\theta - 1)(x,t)\,dx \le \frac{e_0}{\min\{R, c_V\}}, \tag{2.3}$$

where $i = 0, 1, 2, \dots$ for the initial boundary value problems and $i = 0, \pm1, \pm2, \dots$ for the Cauchy problem. Hence by the mean value theorem, for each $t \ge 0$ there are points $a_i(t), b_i(t) \in [i, i+1]$ such that

$$0 < \alpha_1 \le u(a_i(t), t),\ \theta(b_i(t), t) \le \alpha_2, \quad t \ge 0, \tag{2.4}$$

where $\alpha_1$, $\alpha_2$ are the two (positive) roots of the equation $y - \log y - 1 = e_0/\min\{R, c_V\}$, and $i = 0, 1, 2, \dots$ for the initial boundary value problems and $i = 0, \pm1, \pm2, \dots$ for the Cauchy problem. Moreover, if we utilize (2.3) and apply Jensen's inequality to the convex function $y - \log y - 1$, we obtain:

$$\int_i^{i+1} u(x,t)\,dx - \log\int_i^{i+1} u(x,t)\,dx - 1,\ \ \int_i^{i+1} \theta(x,t)\,dx - \log\int_i^{i+1} \theta(x,t)\,dx - 1 \le \frac{e_0}{\min\{R, c_V\}},$$

which gives $\alpha_1 \le \int_i^{i+1} u(x,t)\,dx,\ \int_i^{i+1} \theta(x,t)\,dx \le \alpha_2$. In view of (2.4), we have thus proved


Lemma 2.2. Let $\alpha_1$, $\alpha_2$ be the two (positive) roots of the equation $y - \log y - 1 = e_0/\min\{R, c_V\}$ and the constant $e_0$ be the same as in Lemma 2.1. Then

$$\alpha_1 \le \int_i^{i+1} u(x,t)\,dx,\ \ \int_i^{i+1} \theta(x,t)\,dx \le \alpha_2, \quad t \ge 0, \tag{2.5}$$

and for each $t \ge 0$ there are points $a_i(t), b_i(t) \in [i, i+1]$ such that

$$\alpha_1 \le u(a_i(t), t),\ \theta(b_i(t), t) \le \alpha_2, \quad t \ge 0, \tag{2.6}$$

where $i = 0, 1, 2, \dots$ for the initial boundary value problems and $i = 0, \pm1, \pm2, \dots$ for the Cauchy problem.

Our next aim is to give a local representation of $u$ by using some cut-off function. Let $\phi \in W^{1,\infty}(\mathbb{R})$ be defined by

$$\phi(x) := \begin{cases} 1, & x \le k+1, \\ k + 2 - x, & k+1 \le x \le k+2, \\ 0, & x \ge k+2. \end{cases} \tag{2.7}$$

For simplicity we denote $\Omega_k := (0, k+1)$ for the initial boundary value problems and $\Omega_k := (-k-1, k+1)$ for the Cauchy problem. We have the following local representation:

Lemma 2.3. We have

$$u(x,t) = B(x,t)Y(t) + \frac{R}{\mu}\int_0^t \frac{B(x,t)Y(t)}{B(x,s)Y(s)}\,\theta(x,s)\,ds, \quad x \in \bar\Omega_k,\ t \ge 0, \tag{2.8}$$

where

$$B(x,t) := u_0(x)\exp\Big\{\frac{1}{\mu}\int_x^\infty (v_0(y) - v(y,t))\phi(y)\,dy\Big\}, \qquad Y(t) := \exp\Big\{\frac{1}{\mu}\int_0^t\!\!\int_{k+1}^{k+2}\sigma(y,s)\,dy\,ds\Big\}. \tag{2.9}$$

Proof. We multiply Eq. (1.2) by $\phi$ to obtain: $[\phi v]_t = [\sigma\phi]_x - \phi_x\sigma$. Integrating this over $(x, \infty)$ ($x \in \bar\Omega_k$) with respect to $x$ and recalling (1.1) and the definition of $\phi$, $\sigma$, we arrive at

$$-\int_x^\infty [v\phi]_t\,dy = \sigma + \int_x^\infty \sigma\phi_x\,dy = \mu[\log u]_t - R\frac{\theta}{u} - \int_{k+1}^{k+2}\sigma(y,t)\,dy, \quad x \in \bar\Omega_k. \tag{2.10}$$

Recalling the definition of $B(x,t)$ and $Y(t)$, we integrate (2.10) over $(0, t)$ with respect to $t$ and then take the exponential on both sides of the resulting equation to deduce that

$$\frac{1}{B(x,t)Y(t)} = \frac{1}{u(x,t)}\exp\Big\{\frac{R}{\mu}\int_0^t \frac{\theta(x,s)}{u(x,s)}\,ds\Big\}, \quad x \in \bar\Omega_k,\ t \ge 0. \tag{2.11}$$

Multiplying (2.11) by $R\theta(x,t)/\mu$ and integrating over $(0, t)$, we infer

$$\exp\Big\{\frac{R}{\mu}\int_0^t \frac{\theta(x,s)}{u(x,s)}\,ds\Big\} = 1 + \frac{R}{\mu}\int_0^t \frac{\theta(x,s)}{B(x,s)Y(s)}\,ds.$$

Substituting the above identity into (2.11), we obtain (2.8).
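Two ingredients used so far are easy to evaluate numerically: the cut-off function $\phi$ of (2.7) and the constants $\alpha_1 < 1 < \alpha_2$ of Lemma 2.2, which are the roots of $y - \log y - 1 = c$ with $c = e_0/\min\{R, c_V\}$. The Python sketch below is only an illustration under assumptions: the value of $c$ is taken as a given positive number (the paper does not compute $e_0$), plain bisection is used as the root finder, and the names `cutoff_phi` and `alpha_roots` are hypothetical.

```python
import math


def cutoff_phi(x, k):
    """Piecewise linear cut-off (2.7): 1 up to k+1, linear decay, 0 from k+2 on."""
    if x <= k + 1:
        return 1.0
    if x >= k + 2:
        return 0.0
    return k + 2 - x


def alpha_roots(c):
    """Roots alpha1 < 1 < alpha2 of y - log(y) - 1 = c for a given c > 0.

    g(y) = y - log(y) - 1 is convex with minimum g(1) = 0, decreasing on
    (0, 1) and increasing on (1, inf), so there is one root on each side.
    """
    if c <= 0:
        raise ValueError("c must be positive")

    def g(y):
        return y - math.log(y) - 1.0 - c

    def bisect(lo, hi):
        # Assumes g(lo) and g(hi) have opposite signs.
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if (g(lo) > 0) == (g(mid) > 0):
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # Left bracket: g(exp(-(c+1))) = exp(-(c+1)) > 0, while g(1) = -c < 0.
    alpha1 = bisect(math.exp(-(c + 1.0)), 1.0)
    # Right bracket: double until g becomes positive.
    hi = 2.0
    while g(hi) <= 0:
        hi *= 2.0
    alpha2 = bisect(1.0, hi)
    return alpha1, alpha2


if __name__ == "__main__":
    a1, a2 = alpha_roots(1.0)   # c = 1.0 is an arbitrary sample value
    print(a1, a2)
    print([cutoff_phi(x, k=3) for x in (0.0, 4.0, 4.5, 5.0, 6.0)])
```

As $c$ grows the interval $[\alpha_1, \alpha_2]$ widens, which reflects that a larger energy bound $e_0$ gives weaker control on the local averages of $u$ and $\theta$ in (2.5).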


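Now in order to bound $u(x,t)$ pointwise, we first show the exponential decay of $Y(t)$ by utilizing Lemmas 2.1, 2.2 and some new estimates for $\theta$, $\sigma$ (cf. (2.15), (2.17)); then we use the representation (2.8) to obtain the following local uniform bounds on $u$:

Lemma 2.4. There are positive constants $\underline{u}$, $\overline{u}$ independent of $t$, such that

$$\underline{u} \le u(x,t) \le \overline{u} \quad \forall\, x \in \bar\Omega_k,\ t \ge 0. \tag{2.12}$$

Proof. Recalling (2.7) and (2.9), we get from Cauchy–Schwarz's inequality, (1.7), and (2.1) that

$$C(k)^{-1} \le B(x,t) \le C(k) \quad \forall\, x \in \bar\Omega_k,\ t \ge 0. \tag{2.13}$$

By Cauchy–Schwarz's inequality, (2.1), and (2.5), we see that for $x \in [k+1, k+2]$,

$$\Big|\int_s^t\!\!\int_{b_{k+1}(\tau)}^{x} \frac{\theta_x(y,\tau)}{\theta(y,\tau)}\,dy\,d\tau\Big| \le \int_s^t \Big(\int_{k+1}^{k+2}\frac{\theta_x^2}{u\theta^2}\,dy\Big)^{1/2}\Big(\int_{k+1}^{k+2} u\,dy\Big)^{1/2} d\tau$$
$$\le \frac{\alpha_2}{\log 2}\int_s^t\!\!\int_{k+1}^{k+2}\frac{\theta_x^2}{u\theta^2}\,dx\,d\tau + (t-s)\log 2 \le \frac{\alpha_2 e_0}{\lambda\log 2} + (t-s)\log 2, \tag{2.14}$$

where $b_{k+1}(t)$ is the same as in Lemma 2.2. We apply Jensen's inequality to the convex function $e^x$, and utilize (2.6), (2.14) to obtain that for any $x \in [k+1, k+2]$, $t \ge s \ge 0$,

$$\int_s^t \theta(x,\tau)\,d\tau = \int_s^t \exp\{\log\theta(x,\tau)\}\,d\tau \ge (t-s)\exp\Big\{\frac{1}{t-s}\int_s^t \log\theta\,d\tau\Big\}$$
$$= (t-s)\exp\Big\{\frac{1}{t-s}\int_s^t \Big[\log\frac{\theta(x,\tau)}{\theta(b_{k+1}(\tau),\tau)} + \log\theta(b_{k+1}(\tau),\tau)\Big]d\tau\Big\}$$
$$= (t-s)\exp\Big\{\frac{1}{t-s}\int_s^t \Big[\int_{b_{k+1}(\tau)}^{x}\frac{\theta_x}{\theta}\,dy + \log\theta(b_{k+1}(\tau),\tau)\Big]d\tau\Big\}$$
$$\ge (t-s)\exp\Big\{\log\alpha_1 - \frac{1}{t-s}\Big|\int_s^t\!\!\int_{b_{k+1}(\tau)}^{x}\frac{\theta_x}{\theta}\,dy\,d\tau\Big|\Big\}$$
$$\ge \frac{\alpha_1}{2}(t-s)\exp\Big\{\frac{-\alpha_2 e_0}{(t-s)\lambda\log 2}\Big\} \ge C(t-s)e^{-1/[C(t-s)]}, \tag{2.15}$$

which gives

$$-\int_s^t \inf_{[k+1,k+2]}\theta(\cdot,\tau)\,d\tau \le \begin{cases} 0 & \text{for } 0 \le t-s \le 1, \\ -C(t-s) & \text{for } 1 \le t-s. \end{cases} \tag{2.16}$$

Applying Cauchy–Schwarz's inequality, Jensen's inequality for the convex function $1/x$ ($x > 0$), utilizing (2.1), (2.16), and taking into account that

$$\begin{cases} C & \text{for } 0 \le t-s \le 1, \\ C - (t-s)/C & \text{for } 1 \le t-s \end{cases} \ \le\ C - \frac{t-s}{C} \quad \text{for } t \ge s \ge 0,$$

we obtain

$$\int_s^t\!\!\int_{k+1}^{k+2}\sigma(x,\tau)\,dx\,d\tau = \int_s^t\!\!\int_{k+1}^{k+2}\Big(\mu\frac{v_x}{u} - R\frac{\theta}{u}\Big)dx\,d\tau$$
$$\le C\int_s^t\!\!\int_{k+1}^{k+2}\frac{v_x^2}{u\theta}\,dx\,d\tau - \frac{R}{2}\int_s^t\!\!\int_{k+1}^{k+2}\frac{\theta}{u}\,dx\,d\tau$$
$$\le C - \frac{R}{2}\int_s^t \inf_{[k+1,k+2]}\theta(\cdot,\tau)\int_{k+1}^{k+2}\frac{1}{u}\,dx\,d\tau$$
$$\le C - \frac{R}{2}\int_s^t \inf_{[k+1,k+2]}\theta(\cdot,\tau)\Big(\int_{k+1}^{k+2}u\,dx\Big)^{-1}d\tau$$
$$\le C - \frac{R}{2\alpha_2}\int_s^t \inf_{[k+1,k+2]}\theta(\cdot,\tau)\,d\tau \le C - (t-s)/C \tag{2.17}$$

for $t \ge s \ge 0$. Recalling the definition of $Y(t)$, it follows from (2.17) that

$$0 \le Y(t) \le Ce^{-t/C}, \qquad \frac{Y(t)}{Y(s)} \le Ce^{-(t-s)/C} \quad \forall\, t \ge s \ge 0. \tag{2.18}$$

Inserting (2.13) and (2.18) into (2.8), we infer that

$$u(x,t) \le C + C\int_0^t \theta(x,s)e^{-(t-s)/C}\,ds \quad \text{for all } x \in \bar\Omega_k,\ t \ge 0. \tag{2.19}$$

On the other hand, by Lemmas 2.1–2.2 we have

$$\big|\theta^{1/2}(x,t) - \theta^{1/2}(b_i(t),t)\big| \le \int_i^{i+1}\theta^{-1/2}|\theta_x|\,dx \le \Big(\int_i^{i+1}\frac{\theta_x^2}{u\theta^2}\,dx\Big)^{1/2}\Big(\int_i^{i+1}u\theta\,dx\Big)^{1/2}$$
$$\le \sqrt{\alpha_2}\,\Big(\int_\Omega\frac{\theta_x^2}{u\theta^2}\,dx\Big)^{1/2}\max_{[i,i+1]}u^{1/2}(\cdot,t) \quad \text{for } x \in [i, i+1],$$

and $i = 0, 1, \dots$ for the initial boundary value problems and $i = 0, \pm1, \dots$ for the Cauchy problem. The above inequality together with (2.6) yields

$$\frac{\alpha_1}{2} - \alpha_2\int_\Omega\frac{\theta_x^2}{u\theta^2}\,dx\,\max_{\bar\Omega_k}u(\cdot,t) \le \theta(x,t) \le 2\alpha_2 + 2\alpha_2\int_\Omega\frac{\theta_x^2}{u\theta^2}\,dx\,\max_{\bar\Omega_k}u(\cdot,t), \quad x \in \bar\Omega_k. \tag{2.20}$$

Hence, inserting (2.20) into (2.19), applying Gronwall's inequality, and (2.1), we conclude

$$u(x,t) \le C(k) \quad \forall\, x \in \bar\Omega_k,\ t \ge 0. \tag{2.21}$$

Now, integrating (2.8) over $(0, 1)$ with respect to $x$, using (2.5), (2.13), and (2.18), we obtain

$$\alpha_1 \le Ce^{-t/C} + C\int_0^t \frac{Y(t)}{Y(s)}\,ds, \quad t \ge 0. \tag{2.22}$$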


It follows from (2.8), (2.13), (2.18), and (2.20)–(2.22), (2.1) that Z Z t Y (t) α1 θx2 −C dx ds u(x, t) ≥ C 2 2 uθ 0 Y (s) ! Z Z t/2 Z t Y (t) θx2 −t/C2 −C + dxds ≥ C0 − C1 e Y (s) uθ2 0 t/2 Z t Z Z t/2 Z θx2 θx2 dxds − C dxds ≥ C0 − C1 e−t/C2 − Ce−t/(2C) 2 2 uθ 0 t/2 uθ Z t Z θx2 dxds ≥ C0 − C1 e−t/C2 − Ce−t/(2C) − C 2 t/2 uθ ¯ k , t ≥ T0 , for all x ∈ (2.23) ≥ C2 where T0 , Ci (i = 0, 1, 2) are positive constants independent of t. From the proof of the existence in [1, pp.68–75], [13] (also see [8, pp.348–350]) we see that u satisfies 0 < m1 (t) ≤ u(x, t)

¯ t ≥ 0, for x ∈ ,

(2.24)

where m1 (t) is a continuous function of t (in fact, m1 (t) = Ce−Ct ). Therefore, combining (2.21) with (2.23)–(2.24), we get (2.12). The proof is complete. 3. Proof of Theorem 1.1 We have proved the local pointwise estimates in Theorem 1.1 (i.e. Lemma 2.4). In this section we apply the results obtained in Sect. 2 to prove the rest of Theorem 1.1. The proof is divided into three cases. Case I. For the problem (1.1)–(1.5). We start with the following lemma: Lemma 3.1. Z t

Z max v 2 (·, s)ds,

0 [0,k+1]

t

¯ 2 ds ≤ C max [θ(·, s) − θ(s)]

0 [0,k+1]

for all t ≥ 0,

(3.1)

¯ := θ(b0 (t), t) and b0 (t) is the same as in Lemma 2.2. where θ(t) Proof. It follows from (1.5), Cauchy–Schwarz’s inequality, (2.1), and Lemma 2.4 that Z t Z k+1 Z t max v 2 (·, s)ds ≤ 2 |vvx |dxds 0 [0,k+1] Z t Z k+1

0

0

v uθ dxds uθ 2(k + 1)α2 u 0 0 Z t Z k+1 1 2 max v θdxds ≤C+ 2(k + 1)α2 0 [0,k+1] 0 Z 1 t max v 2 (·, s)ds, ≤C+ 2 0 [0,k+1]

≤

C

vx2

2

+

which implies the first part of (3.1). Similarly, from Cauchy–Schwarz’s inequality and Lemma 2.4 we get


Z

t

¯ 2 ds ≤ C max [θ(·, s) − θ(s)]

0 [0,k+1]

≤C +C

Z tZ 0

0

Z tZ 0

k+1

k+1

0


θx2 dx uθ2

Z

k+1

θ2 dxds

0

θx2 ¯ 2 ds. dx max [θ(·, s) − θ(s)] uθ2 [0,k+1]

Applying Gronwall’s inequality and (2.1) to the above inequality, we obtain the second part of (3.1). The proof is complete. Let ψ ∈ W 1,∞ (R), ψ(x) = 1 for x ≤ k, ψ(x) = k + 1 − x for k ≤ x ≤ k + 1 and ψ(x) = 0 for x ≥ k + 1. Note that Eq. (1.2) can be rewritten as [µux /u − v]t = R[θ/u]x . Multiplying this by (µux /u − v)ψ in L2 ((0, ∞) × (0, t)), applying the inequality |xy| ≤ x2 + −1 y 2 , (2.1), ¯ 2x ≥ (α1 /2 − C(θ − θ) ¯ 2 )u2x , we (2.12), (2.6), and taking into account θu2x = (θ¯ + θ − θ)u obtain Z 2 Z Z tZ ux u2x µ2 2 ψdx ≤ C + C v ψdx − Rµ θ ψ 3 2 u2 u 0 Z tZ 2 2 v2 θx2 2 ux 2v ψdxds + θ + θ + +C uθ2 u3 u u 0 Z t Z 2 α Z t Z u2 C ux 1 x ¯2 ψdxds + C max (θ − θ) ψdxds ≤ − Rµ − C 3 3 2 u u 0 0 [0,k+1] Z t ¯ 2 + v 2 ]ds max [(θ − θ) (0 < < 1), +C−1 0 [0,k+1]

which, by choosing appropriately small, and applying (2.12) and (3.1), gives Z Z tZ Z t Z ¯2 u2x ψdx + u2x ψdxds ≤ C + C max (θ − θ) u2x ψdxds.

0

0 [0,k+1]

(3.2)

Applying Gronwall’s inequality to (3.2) and using (3.1), one gets Lemma 3.2. We have Z tZ Z u2x (x, t)ψ(x)dx + u2x ψdxds ≤ C

0

By virtue of Poincar´e’s inequality and (3.3), Z Z ∞ 2 k(u − u1 )(t)kL2 (0,k) dt ≤ C 0

0

∞

for all t ≥ 0.

(3.3)

kux (t)k2L2 (0,k) dt ≤ C.

(3.4)

It follows from Eq. (1.1), the mean value theorem (for u(k, t) − u1 ), integration by parts, (1.5), (3.1), and (3.4) that Z ∞ Z k Z ∞ d v(k, t) k(u − u1 )(t)k2 2 dt = (u − u )(v − )dx dt 1 x (0,k) L dt k 0 0 0 Z ∞Z k Z ∞ {(u(k, t) − u1 )2 + ku − u1 k2L2 (0,k) + v 2 (k, t)}dt + |ux v|dxdt ≤C 0 0 0 Z ∞ (kux k2L2 (0,k) + max v 2 )dt ≤ C, ≤C 0

[0,k]


which combined with (3.4) yields (1.10)1 . To prove (1.10)2 , we multiply (1.2) by 2jv 2j−1 ψ 2j (j ≥ 2) in L2 ((0, ∞) × (0, t)), integrate by parts with respect to x, utilize (1.5), the inequality |xy| ≤ x2 + y 2 , and Lemma 2.4, we infer that Z tZ Z v 2j (x, t)ψ 2j (x)dx + j 2 v 2j−2 vx2 ψ 2j dxds 0 Z tZ 2j 2j−2 2 + θ|v|2j−1 ψ 2j−1 + θ2 v 2j−2 ψ 2j dxds v ψ ≤ C + Cj ≡ C + Ij (t).

0

(3.5)

R

Define Fj (t) := max0≤s≤t v 2j (x, s)ψ 2j (x)dx. Then Ij can be estimated as follows, using the inequality |xy| ≤ x2 + y 2 , (2.1) and (3.1), Z tZ ¯ 2 v 2 + v 2 ψ 2 dxds v 4 + (θ − θ) I2 (t) ≤ C 0 Z t ¯ 2 (·, s)ds ≤ C, max v 2 + (θ − θ) (3.6) ≤C 0 [0,k+1]

and for j ≥ 3

Z

Ij (t) ≤ Cj 2

t

Z ¯ 2 + v2 } max {|θ − θ|

0 [0,k+1]

[(vψ)2j−2 + (vψ)2j−4 ]dxds

≤ Cj 2 (Fj−1 + Fj−2 )(t). Inserting (3.6)–(3.7) into (3.5), we obtain by induction for j that Z t 2 + j kv j−1 vx ψ j k2 ds ≤ Cj ∀ t ≥ 0, j ≥ 2, kv(t)ψk2j 2j L

(3.7)

(3.8)

0

where Cj is a positive constant that depends only on j and k. Using Eq. (1.2), (3.1) and (3.8), following the same arguments as used for (3.5)–(3.7), we get for j ≥ 2, Z ∞ Z ∞ d kv(t)ψk2j2j dt ≤ C + Cj 2 kv j−1 vx ψ j k2 dt L dt 0 0 Z 2j−2 + (vψ)2j−4 )dx < ∞, +C(j − 2)j max [(vψ) t≥0

which together with (3.1) and (3.8) yields (1.10)2 in Theorem 1.1. Case II. For the problem (1.1)–(1.4), (1.6). First it is easy to see that for the boundary conditions (1.6), Lemmas 3.1, 3.2 still hold. Using (1.6) and applying the same arguments as used for (3.1), we obtain Z ∞ max [θ(t) − 1]2 dt ≤ C. (3.9) 0

[0,k+1]

Now we estimate vx in L2 -norm. Multiplying (1.2) with vψ 2 in L2 ( × (0, t)), integrating by parts for [vx /u]x with respect to x, employing (2.1), (2.12), and Lemma 3.2, we obtain


Z tZ vx2 2 |θx | θ|ux | |vx | ψ dxds ≤ C + C |ψx | + ψ + 2 ψ |v|ψdxds u u u u 0 0 Z t Z tZ 2 θx2 vx + C[ 2 + θ2 v 2 + u2x ] ψ 2 dxds + C max v 2 ds ≤C+ 2u uθ 0 0 [0,k+1] Z Z 2 Z t vx 2 1 t max [v 2 + (θ − 1)2 ]ds, t ≥ 0, ψ dxds + C ≤C+ [0,k+1] 2 0 u 0 Z tZ

whence (also by Lemma 3.1 and (3.9)) Z ∞ kvx (t)ψk2 dt ≤ C.

(3.10)

0

The following lemma gives L2 -bounds of vt . Lemma 3.3. For any t ≥ 0 we have Z kvx (t)ψ 2 k2 + k[θ(t) − 1]ψ 2 k2 +

t

(kvt ψ 2 k2 + kθx ψ 2 k2 )ds ≤ C.

(3.11)

0

Proof. We multiply (1.2) by uvt ψ 4 in L2 ( × (0, t)), integrate by parts for [vx /u]x with respect to x, and utilize (2.12), Cauchy–Schwarz’s inequality, (3.10) and Lemma 3.2, to get Z t Z tZ 2 2 2 2 2 kvt ψ k ds ≤ C + C θx + θ2 u2x + vx2 u2x ψ 4 dxds kvx (t)ψ k + 0 0 Z Z t 2 3 2 2 2 2 ux ψdx + kθx ψ k + max (θ − 1) ds. max vx ψ (3.12) ≤C +C 0

¯

[0,k+1]

Here the term max vx2 ψ 3 can be bounded as follows, using (2.12), Sobolev’s embedding theorem (W 1,1 ,→ L∞ ), and Eq. (1.2). max vx2 ψ 3 ≤ C max (σ 2 + θ2 )ψ 3 ≤ kσx ψ 2 k2 + C−1 kσψk2 + C max θ2 ¯

[0,k+1]

≤ C−1 {1 + max (θ − 1)2 + kvx ψk2 } + kvt ψ 2 k2 (0 < < 1). [0,k+1]

Substituting the above estimate into (3.12), choosing appropriately small, and taking into account (3.3), (3.9)–(3.10), we conclude Z t Z t 2 2 2 2 kvt ψ k ds ≤ C + C kθx ψ 2 k2 ds. (3.13) kvx (t)ψ k + 0

0

If we multiply Eq. (1.3) with (θ − 1)ψ 4 in L2 ( × (0, t)), apply Cauchy–Schwarz’s inequality, (3.9)–(3.10), we find that Z t Z t kθx ψ 2 k2 ds ≤ C + C max (θ − 1)2 kvx ψ 2 k2 ds. (3.14) k[θ(t) − 1]ψ 2 k2 + 0

0 [0,k+1]

Multiplying (3.14) with 2C and adding the resulting inequality to (3.13), applying then Gronwall’s inequality and (3.9), we obtain (3.11).


As a result of Lemma 3.3 we have Z ∞Z 2 (u2tx ψ 4 + vxx ψ 4 )dxds ≤ C.

(3.15)

0

In fact, from Sobolev’s imbedding theorem (W 1,1 ,→ L∞ ), (1.1)–(1.2), and (2.12), (3.3) and (3.9)–(3.10), we get Z tZ 0

Z 2 (u2tx +vxx )ψ 4 dxds ≤ C+C

t

0

Z tZ

max vx2 ψ 3 ds ≤ C+ ¯

0

1 2 4 (Cvx2 ψ+ vxx ψ )dxds, 2

which yields (3.15). If we multiply (1.3) by uθt ψ 8 , integrate and keep in mind that max θx2 ψ 6 ≤ C(−1 kθx ψ 2 k2 + k[θx /u]x ψ 4 k2 ),

we obtain by the arguments similar to those used in Lemma 3.3 and (3.15) that kθx (t)ψ k + 4 2

Z tZ 0

2 (θt2 + θxx )ψ 8 dxds ≤ C

∀ t ≥ 0.

(3.16)

Now weRare in a position to prove (1.11). By (3.3), R R (3.10)–(3.11), (3.15)–(3.16), and the identity vx vxt ψ 8 dx = − vxx vt ψ 8 dx − 8 vx vt ψ 7 ψx dx, we see that Z

∞ 0

d kux ψ 8 k2 + d kvx ψ 8 k2 + d kθx ψ 8 k2 dt ≤ C, dt dt dt

which together with (3.3), (3.10)–(3.11) gives
$$\|(u_x(t), v_x(t), \theta_x(t))\|_{L^2(0,k)} \to 0 \quad \text{as } t \to \infty. \tag{3.17}$$
In view of the boundary conditions (1.6), we apply Poincaré’s inequality and (3.17) to obtain $\|(u(t) - u_1(t), v(t), \theta(t) - 1)\|_{H^1(0,k)} \to 0$ as $t \to \infty$. This proves (1.11).

Case III. For the Cauchy problem. Let $(\tilde u, \tilde v, \tilde\theta)$ be the generalized solution of the initial boundary value problem (1.1)–(1.5) with the initial data $(u_0, v_0, \theta_0)$ in the half line. Then we define $u(x,t)$, $v(x,t)$, $\theta(x,t)$ in $\mathbb{R} \times (0,\infty)$ by
$$u(x,t) := \begin{cases}\tilde u(x,t), & x \ge 0,\\ \tilde u(-x,t), & x < 0,\end{cases}\qquad
v(x,t) := \begin{cases}\tilde v(x,t), & x \ge 0,\\ -\tilde v(-x,t), & x < 0,\end{cases}\qquad
\theta(x,t) := \begin{cases}\tilde\theta(x,t), & x \ge 0,\\ \tilde\theta(-x,t), & x < 0.\end{cases}\tag{3.18}$$
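The reflection in (3.18) extends $u$ and $\theta$ evenly and $v$ oddly across $x = 0$. A minimal sketch follows, assuming the half-line solution is available as three Python callables; the names `extend_to_line`, `u_half`, `v_half`, `theta_half` and the sample fields are hypothetical and serve only as illustration.

```python
import math

def extend_to_line(u_half, v_half, theta_half):
    """Build whole-line fields from half-line ones following (3.18).

    u and theta are reflected evenly, v is reflected oddly.
    """
    def u(x, t):
        return u_half(abs(x), t)

    def v(x, t):
        return v_half(x, t) if x >= 0 else -v_half(-x, t)

    def theta(x, t):
        return theta_half(abs(x), t)

    return u, v, theta

# Hypothetical half-line fields; v_half vanishes at x = 0, so the odd
# extension of v is continuous there (an assumption of this example).
u, v, theta = extend_to_line(
    lambda x, t: 1.0 + x * math.exp(-t),
    lambda x, t: x / (1.0 + x),
    lambda x, t: 1.0 + math.exp(-x - t),
)
print(u(-2.0, 1.0) == u(2.0, 1.0), v(-2.0, 0.0) == -v(2.0, 0.0))
```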

Then u, v, θ satisfy (1.8) and the Cauchy problem (1.1)–(1.4). Therefore, (u, v, θ) defined by (3.18) is a generalized solution of (1.1)–(1.4). By virtue of the uniqueness (u, v, θ) defined by (3.18) coincides with the generalized solution indicated in (1.8). From the definition (3.18) and i) of Theorem 1.1, (1.12) follows immediately. The proof of Theorem 1.1 is complete.


References

1. Antontsev, S.N., Kazhikhov, A.V., Monakhov, V.N.: Boundary Value Problems in Mechanics of Nonhomogeneous Fluids. Amsterdam, New York: North–Holland, 1990
2. Amosov, A.A., Zlotnik, A.A.: Global generalized solutions of the equations of the one-dimensional motion of a viscous heat-conducting gas. Soviet Math. Dokl. 38, 1–5 (1989)
3. Amosov, A.A., Zlotnik, A.A.: Solvability “in the large” of a system of equations of the one-dimensional motion of an inhomogeneous viscous heat-conducting gas. Math. Notes 52, 753–763 (1992)
4. Batchelor, G.K.: An Introduction to Fluid Dynamics. London: Cambridge Univ. Press, 1967
5. Fujita-Yashima, H., Padula, M., Novotný, A.: Equation monodimensionnelle d’un gaz visqueux et calorifère avec des conditions initiales moins restrictives. Ricerche di Matematica XLII, 199–248 (1993)
6. Hoff, D.: Global well-posedness of the Cauchy problem for the Navier–Stokes equations of nonisentropic flow with discontinuous initial data. J. Diff. Eqs. 95, 33–74 (1992)
7. Iskenderova, D.A., Smagulov, Sh.S.: The Cauchy problem for the equations of a viscous heat-conducting gas with degenerate density. Comput. Meths. Math. Phys. 33, 1109–1117 (1993)
8. Jiang, S.: Global spherically symmetric solutions to the equations of a viscous polytropic ideal gas in an exterior domain. Commun. Math. Phys. 178, 339–374 (1996)
9. Jiang, S.: Large-time behavior of solutions to the equations of a viscous polytropic ideal gas. Annali di Mate. Pura ed Appl. (to appear)
10. Kanel, Y.I.: Cauchy problem for the equations of gasdynamics with viscosity. Siberian Math. J. 20, 208–218 (1979)
11. Kawashima, S.: Large-time behaviour of solutions to hyperbolic–parabolic systems of conservation laws and applications. Proc. Roy. Soc. Edinburgh 106A, 169–194 (1987)
12. Kawashima, S., Nishida, T.: Global solutions to the initial value problem for the equations of one-dimensional motion of viscous polytropic gases. J. Math. Kyoto Univ. 21, 825–837 (1981)
13. Kazhikhov, A.V.: Cauchy problem for viscous gas equations. Siberian Math. J. 23, 44–49 (1982)
14. Kazhikhov, A.V., Shelukhin, V.V.: Unique global solution with respect to time of initial boundary value problems for one-dimensional equations of a viscous gas. J. Appl. Math. Mech. 41, 273–282 (1977)
15. Liu, T.-P., Zeng, Y.: Large time behavior of solutions of general quasilinear hyperbolic-parabolic systems of conservation laws. Memoirs of the AMS, No. 599 (1997)
16. Nagasawa, T.: On the one-dimensional motion of the polytropic ideal gas non-fixed on the boundary. J. Diff. Eqs. 65, 49–67 (1986)
17. Nagasawa, T.: On the asymptotic behavior of the one-dimensional motion of the polytropic ideal gas with stress–free condition. Quart. Appl. Math. 46, 665–679 (1988)
18. Nagasawa, T.: On the one-dimensional free boundary problem for the heat-conductive compressible viscous gas. In: Mimura, M., Nishida, T. (eds.) Recent Topics in Nonlinear PDE IV, Lecture Notes in Num. Appl. Anal. 10, Amsterdam, Tokyo: Kinokuniya/North–Holland, 1989, pp. 83–99
19. Nishida, T.: Equations of motion of compressible viscous fluids. In: Nishida, T., Mimura, M., Fujii, H. (eds.) Pattern and Waves, Amsterdam, Tokyo: Kinokuniya/North-Holland, 1986, pp. 97–128
20. Okada, M., Kawashima, S.: On the equations of one-dimensional motion of compressible viscous fluids. J. Math. Kyoto Univ. 23, 55–71 (1983)
21. Serre, D.: Sur l’équation monodimensionnelle d’un fluide visqueux, compressible et conducteur de chaleur. C.R. Acad. Sc. Paris, Sér. I 303, 703–706 (1986)
22. Serrin, J.: Mathematical principles of classical fluid mechanics. In: Flügge, S., Truesdell, C. (eds.), Handbuch der Physik. VIII/1, Berlin–Heidelberg–New York: Springer-Verlag, 1972, pp. 125–262

Communicated by H. Araki

Commun. Math. Phys. 200, 195 – 210 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Nucleation of Topological Dipoles in Nematic Liquid Crystals

Giovanna Guidone Peroli, Epifanio G. Virga

Dipartimento di Matematica, Università di Pavia, via Ferrata 1, 27100 Pavia, Italy

Received: 10 July 1997 / Accepted: 25 June 1998

Abstract: A topological dipole is a pair of point defects with opposite topological charges. In this paper we show by example how the nucleation of such a dipole within a smooth field can drive a metastable state into a stable one. Building on our previous work, we construct a mathematical model for the dynamics of both monopoles and dipoles in a capillary filled with a nematic liquid crystal. Though our analysis is fit for liquid crystals, a similar mechanism is also likely to apply to the field theory for other ordered media.

1. Introduction Recently much interest has been raised about the motion of defects and its setting within a proper dynamical theory. In many mathematical studies defects are considered as singularities in fields which evolve in time according to some partial differential equation. Particle-like defects can be assigned a topological charge, which describes the winding of the field around the point; in some contexts they are called vortices [1]. Their evolution is part of the dynamics of the whole system and their mutual interaction is usually both interesting and involved. In [1] Neu studies fields into the complex plane; he states that “Pairs of vortices evolving under the nonlinear heat equation with like (opposite) winding numbers undergo a repulsive (attractive) interaction” (cf. also [2]). In [3] Lin objects that this is not always the case when considering the heat flow associated with the Ginzburg–Landau equation: he finds that there are circumstances when vortices with opposite topological charges do not tend to coalesce, since they feel a “local energy barrier”. Here we are concerned with nematic liquid crystals, which are materials composed of molecules elongated in one direction and symmetric around it. Within the classical


Oseen-Frank theory, the configuration of a liquid crystal inside a regular region $B$ is described by a field
$$\mathbf{n} : B \to S^2 \tag{1.1}$$

which designates the average direction of the molecules at each point in $B$. The unit vector field $\mathbf{n}$ is often called the director field; it also delivers the optic axis of these optically uniaxial materials, which may change from point to point. Here a point defect is a point where the optic axis $\mathbf{n}$ suffers a discontinuity. Every point defect can be assigned a topological charge, which is the integer corresponding to the homotopy class of the mapping $\mathbf{n}$ restricted to a small sphere around the point (cf. [4] for a review). Were $\mathbf{n}$ a planar field, the topological charge would coincide with the classical winding number. The elastic free energy per unit volume is assumed to depend on $\mathbf{n}$ and its first gradient. In 1958, Frank [5] gave a general representation formula for such an energy; here we use the one-constant approximation to it, according to which the energy associated with (1.1) is
$$E_B[\mathbf{n}] := \frac{K}{2}\int_B |\nabla\mathbf{n}|^2\,dv, \tag{1.2}$$

where K is a positive elastic modulus and v denotes the volume measure. Our main objective is a comprehensive study of the interactions between point defects inside a capillary tube with prescribed boundary conditions. Though in a different context, this is the same objective as in the above studies for vortices. However, here we undertake a different path. We cannot afford the study of the general solutions to the equilibrium equation deriving from (1.2), which are usually called the harmonic maps. As in [6] and [7] we propose a model for the fields around both a +1 and a −1 defect, obtained by minimizing the free energy in a special class of fields which mimic the structure of these singularities. Though we believe these minimizers to be very close to harmonic maps, they are not exactly so; in return, they can be written explicitly, and so it is possible to evaluate both their elastic energy and the energy dissipated by viscous torques in their dynamic rearrangement. Thanks to this fact, by use of the dissipation principle posited in [8] we could write a proper dynamical equation which governs the interaction between two defects [9] (see also [10] and [11]). There is a distinctive feature in our approach: while in the heat flow of the Ginzburg– Landau energy functional the viscosity is a phenomenological coefficient, here an effective viscosity comes naturally into play, which depends on the configurations traversed by the defects in their motion. For two defects with opposite topological charge, the effective viscosity decreases as they come closer to one another. This might indeed be the reason why our model does not predict the counterintuitive repulsion between two opposite vortices found in [3]. In [9] we studied the coalescence of two defects with opposite topological charges in a tube which enforces homeotropic anchoring on the lateral boundary, requiring n to be parallel to the normal. 
The outcomes of this analysis were then compared to the experimental data collected by a group of researchers working in Halle-Saale for the Max-Planck Society (see both [12] and [13]). We obtained a general good agreement between theoretical predictions and experiments. Two defects with topological charges +1 and −1 are seen to approach each other with increasing velocity, until they come together and disappear, leaving no trace behind. In a language borrowed from the physics


of elementary particles, we may then say that an annihilation takes place. Similarly, other phenomena which can occur to defects may be called fission and nucleation. The former, already tackled by Neu in [1], is the decay of a defect with high charge into several defects with low charges. The latter is the germination of a pair of defects with zero total charge out of a regular field. Here we focus attention on this phenomenon. We show in a specific example the rôle that it is also likely to play in more general contexts: namely, conveying a metastable state into a stable one. Suppose that there are two local minimizers for a system with different energies: one is metastable and the other is stable, as the former stores more energy than the latter. Let further both minimizers be smooth fields that do not fall within the same homotopy class. Thus, deforming the metastable field into the stable one, which would imply an energy gain, cannot be achieved through a smooth dynamics. It can, however, be done through the nucleation of two defects with opposite charges, which first arise within the metastable field at the expense of an energy supplied to the system from outside, and then repel each other, leaving the stable field in the space between them. In the end, the total energy of the system decreases more than it increases in the beginning, and so the whole process involves a net energy gain. As long as the lateral boundary of a capillary tube enforces a homeotropic anchoring, there is essentially one smooth minimizer of the total free energy in (1.2). Two distinct smooth local minimizers, however, arise for this problem as soon as the angle made by the director with the normal to the boundary deviates from zero to any degree.
One may conjecture that such a failure of the homeotropic anchoring might indeed occur on decreasing the radius of the capillary, as increasing the curvature of the lateral boundary should loosen the capability of the molecules to stay parallel to the normal. We say that $\mathbf{n}$ is subject to a conical anchoring on a bounding surface $S$ if
$$\mathbf{n}\cdot\boldsymbol{\nu}\,|_S = \cos\varphi_0, \tag{1.3}$$
where $\boldsymbol{\nu}$ is the unit normal to $S$ and $0 < \varphi_0 < \frac{\pi}{2}$. We assume that (1.3) holds on the lateral surface of a cylinder and that the director field is axisymmetric:
$$\mathbf{n}(r,\vartheta,z) = \cos\varphi(r,z)\,\mathbf{e}_r + \sin\varphi(r,z)\,\mathbf{e}_z, \tag{1.4}$$
so that it can take only two symmetric orientations on the anchoring cone:
$$\varphi(R,z) = \pm\varphi_0. \tag{1.5}$$

Actually, either sign in (1.5) corresponds to the same physical situation: changing $\varphi_0$ into its opposite would simply amount to reversing the cylinder upside down. It has long been known (see [14] and [15]) that when the lateral boundary of the tube prescribes a homeotropic anchoring for $\mathbf{n}$ there are two energy minimizers, which differ by a reflection; in the frame $(\mathbf{e}_r, \mathbf{e}_\vartheta, \mathbf{e}_z)$ of cylindrical coordinates they can be written as
$$\mathbf{n}_\pm(r,\vartheta,z) = \frac{2Rr}{R^2+r^2}\,\mathbf{e}_r \pm \frac{R^2-r^2}{R^2+r^2}\,\mathbf{e}_z, \tag{1.6}$$
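A short numerical check of (1.6) may help fix ideas. The sketch below is an illustration only (the function name `escaped_director` is hypothetical): it evaluates the radial and axial components of $\mathbf{n}_\pm$ and confirms that the field has unit length, is radial on the wall $r = R$, and is axial on the axis $r = 0$.

```python
import math

def escaped_director(r, R, sign):
    """Return the (e_r, e_z) components of n_+ (sign=+1) or n_- (sign=-1) in (1.6)."""
    denom = R * R + r * r
    return 2.0 * R * r / denom, sign * (R * R - r * r) / denom

R = 1.0
for r in (0.0, 0.37, 1.0):
    n_r, n_z = escaped_director(r, R, +1)
    # The squared norm should equal 1 at every radius.
    print(r, n_r, n_z, math.isclose(n_r * n_r + n_z * n_z, 1.0))
```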

where $R$ is the radius of the tube. Each of these fields is often referred to as escaped or fluted. They are $z$-independent and one is changed into the other by changing $\mathbf{e}_z$ into $-\mathbf{e}_z$. Hence, they store the same elastic energy per unit height of the tube. On the contrary, when in (1.5) $\varphi_0 \neq 0$, the minimizing fields are no longer symmetric with respect to a reflection through a plane orthogonal to the $z$-axis. Moreover, they do not store the same elastic energy, since one is more distorted than the other. Such minimizing


fields, which we call $\mathbf{n}_+$ and $\mathbf{n}_-$, are studied in Sect. 2; they play an essential rôle in the construction of a class of fields that mimic the structure of either a +1 or a −1 defect. The lack of symmetry between $\mathbf{n}_+$ and $\mathbf{n}_-$ is reflected by the structure of the singularities, which we describe in Sect. 3. The field on one side of a defect stores more energy than the field on the other side. This makes a major difference with the case of homeotropic anchoring, and entails two distinct consequences. On the one hand, a single defect, or monopole, is set in motion along the axis in the direction that makes the less distorted field $\mathbf{n}_-$ extend over a larger region as time elapses. On the other hand, two defects with opposite topological charge may nucleate within the more distorted configuration $\mathbf{n}_+$, and feel a repulsive force. As the defects in this dipole are taken apart, the field $\mathbf{n}_-$ takes the place of $\mathbf{n}_+$, and so the total elastic energy decreases. Section 4 is devoted to the motion of a monopole; in Sect. 5 we describe how a dipole can be generated in a tube, with the defects subject to a repulsive interaction; finally, in Sect. 6 we consider yet another problem, where a frozen dipole, consisting of a pair of defects at a prescribed distance, is seen to degenerate into a singular string, in analogy with what was shown by Brezis, Coron, and Lieb in [16] and [17].

2. Escaped Fields

In this section we seek the minimizers of the elastic free energy (1.2) when $B$ is a cylinder of height $h$ and radius $R$ subject to the conical anchoring condition (1.3) on the lateral boundary. We consider only the axisymmetric configurations described in (1.4), which therefore obey (1.5). In the following we take
$$\varphi(R,z) = +\varphi_0 \tag{2.1}$$
with no loss of generality, and denote by $\mathbf{n}_0$ the boundary datum for the director field:
$$\mathbf{n}_0 := \cos\varphi_0\,\mathbf{e}_r + \sin\varphi_0\,\mathbf{e}_z. \tag{2.2}$$
Moreover, in this section we consider only fields that are constant along $z$. Under this assumption, the elastic energy for the field in (1.4) can be written as
$$F[\varphi] = \pi K h\int_0^R\left\{\varphi'^2 + \frac{1}{r^2}\cos^2\varphi\right\}r\,dr, \tag{2.3}$$
where a prime denotes differentiation with respect to $r$. Standard arguments lead to the following equilibrium equation for $F$:
$$\varphi'^2 = \frac{\cos^2\varphi}{r^2}, \tag{2.4}$$
where the integration constant has been set equal to zero to make the integral in (2.3) converge. Subject to (2.1), this equation has precisely two solutions, namely
$$\begin{aligned}
\varphi_-(r) &:= \arcsin\left(\frac{R^2\cos^2\varphi_0 - (1-\sin\varphi_0)^2 r^2}{R^2\cos^2\varphi_0 + (1-\sin\varphi_0)^2 r^2}\right),\\
\varphi_+(r) &:= \arcsin\left(\frac{(1+\sin\varphi_0)^2 r^2 - R^2\cos^2\varphi_0}{R^2\cos^2\varphi_0 + (1+\sin\varphi_0)^2 r^2}\right).
\end{aligned}\tag{2.5}$$


Fig. 1. The two minimizers of the elastic free energy when the anchoring is not homeotropic: (a) n− , (b) n+

We then call $\mathbf{n}_-$ and $\mathbf{n}_+$ the director fields obtained by inserting $\varphi_-$ and $\varphi_+$ into (1.4). The field $\mathbf{n}_-$ escapes towards $\mathbf{e}_z$ along the same direction as $\mathbf{n}_0$, so it turns less than $\frac{\pi}{2}$; on the contrary, $\mathbf{n}_+$ escapes towards the opposite direction, generating a bigger elastic distortion in the tube. The fields $\mathbf{n}_-$ and $\mathbf{n}_+$ are represented in Figs. 1 a, b; notice that for $\varphi_0 = 0$ they reduce to the escaped fields in (1.6). Using (2.5) in (2.3), we easily compute the energy associated with both equilibrium fields:
$$F[\varphi_-] = 2\pi K h(1-\sin\varphi_0), \qquad F[\varphi_+] = 2\pi K h(1+\sin\varphi_0). \tag{2.6}$$
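The energies in (2.6) can be cross-checked by direct quadrature of (2.3). The sketch below is illustrative rather than part of the original argument: it uses the equilibrium equation (2.4) to replace $\varphi'^2$ by $\cos^2\varphi/r^2$, so the integrand becomes $2\cos^2\varphi/r$, and it applies a plain midpoint rule. The function names and the `branch` flag are hypothetical conventions of this example.

```python
import math

def sin_phi(r, R, phi0, branch):
    """sin of the angle in (2.5); branch=-1 selects phi_-, branch=+1 selects phi_+."""
    a = 1.0 + branch * math.sin(phi0)
    x2 = (R * math.cos(phi0)) ** 2
    y2 = (a * r) ** 2
    return -branch * (x2 - y2) / (x2 + y2)

def reduced_energy(R, phi0, branch, n=100_000):
    """Midpoint-rule value of F[phi] / (pi K h) from (2.3), with (2.4) inserted."""
    step = R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * step
        s = sin_phi(r, R, phi0, branch)
        total += 2.0 * (1.0 - s * s) / r * step  # 2 cos^2(phi) / r
    return total

phi0 = 0.3
for branch in (-1, +1):
    closed_form = 2.0 * (1.0 + branch * math.sin(phi0))  # (2.6) divided by pi K h
    print(branch, reduced_energy(1.0, phi0, branch), closed_form)
```

The quadrature value does not depend on $R$, in agreement with (2.6).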

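We now proceed to prove that both $\mathbf{n}_-$ and $\mathbf{n}_+$ are minimizers for $F$: the former is the absolute minimizer, while the latter is just a local one. For $\varepsilon_0$ a positive real and $\varepsilon \in [-\varepsilon_0, \varepsilon_0]$, we define
$$\varphi_+^\varepsilon(r) := \varphi_+(r) + \varepsilon v(r), \qquad \varphi_-^\varepsilon(r) := \varphi_-(r) + \varepsilon v(r), \tag{2.7}$$
where $v : [0,R] \to \mathbb{R}$ is a mapping of class $C^1$ such that
$$v(0) = v(R) = 0. \tag{2.8}$$
Both fields $\mathbf{n}_+$ and $\mathbf{n}_-$ are locally stable if both
$$\left.\frac{d^2}{d\varepsilon^2}F[\varphi_+^\varepsilon]\right|_{\varepsilon=0} > 0 \tag{2.9}$$
and
$$\left.\frac{d^2}{d\varepsilon^2}F[\varphi_-^\varepsilon]\right|_{\varepsilon=0} > 0, \tag{2.10}$$
for all $v$ that do not vanish identically. Easy computations show that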


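$$\left.\frac{d^2}{d\varepsilon^2}F[\varphi_\pm^\varepsilon]\right|_{\varepsilon=0} = 2\pi K h\int_0^R\left\{v'^2(r)\,r - \frac{1}{r}\cos 2\varphi_\pm(r)\,v^2(r)\right\}dr. \tag{2.11}$$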

Let $G$ be the functional defined by
$$G[v] := \int_0^R\left\{v'^2 r - \frac{N}{r}v^2\right\}dr, \tag{2.12}$$
where $N$ satisfies $N \ge M_\pm$, with
$$M_\pm := \max\Big\{0, \max_{r\in[0,R]}\cos 2\varphi_\pm(r)\Big\}, \tag{2.13}$$
so that
$$\left.\frac{d^2}{d\varepsilon^2}F[\varphi_\pm^\varepsilon]\right|_{\varepsilon=0} > 2\pi K h\,G[v] \tag{2.14}$$
for all $v$ that do not vanish identically. When $N$ can be chosen equal to 0, it follows immediately from (2.12) that $G[v] > 0$, unless $v \equiv 0$. When $N$ must be positive, the equilibrium equation for $G$ reads as
$$v'' + \frac{v'}{r} + \frac{N}{r^2}v = 0, \tag{2.15}$$
which has the following solution:
$$v(r) = A\cos(\sqrt{N}\ln r) + B\sin(\sqrt{N}\ln r), \tag{2.16}$$
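As a sanity check on (2.15) and (2.16), the following sketch evaluates the left-hand side of (2.15) on the general solution (2.16) with central differences. The function names, the step size, and the sample values of $N$, $A$, $B$ are arbitrary choices made for illustration.

```python
import math

def v_general(r, N, A, B):
    """General solution (2.16) of the Euler-type equation (2.15), for N > 0 and r > 0."""
    w = math.sqrt(N) * math.log(r)
    return A * math.cos(w) + B * math.sin(w)

def ode_residual(r, N, A, B, step=1e-4):
    """Central-difference estimate of v'' + v'/r + N v / r^2 at radius r."""
    vm = v_general(r - step, N, A, B)
    v0 = v_general(r, N, A, B)
    vp = v_general(r + step, N, A, B)
    d1 = (vp - vm) / (2.0 * step)
    d2 = (vp - 2.0 * v0 + vm) / (step * step)
    return d2 + d1 / r + N * v0 / (r * r)

N, A, B = 0.5, 1.2, -0.4
print(abs(ode_residual(0.7, N, A, B)) < 1e-5)

# Along r = exp(-m*pi/sqrt(N)) the solution alternates between +A and -A,
# so it has no limit as r -> 0 unless A = 0 (and likewise for B).
print([round(v_general(math.exp(-m * math.pi / math.sqrt(N)), N, A, B), 6) for m in range(1, 5)])
```

The alternating values illustrate why the condition $v(0) = 0$ in (2.8) can only be met by $v \equiv 0$ within the family (2.16).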

where $A$ and $B$ are arbitrary constants. Since the only function in (2.16) that obeys (2.8) is $v \equiv 0$, this is the only minimizer of $G$. Thus, by (2.14), we conclude that both $\mathbf{n}_+$ and $\mathbf{n}_-$ are locally stable. These fields play an essential rôle in our description of point defects, as shown in the following section.

3. Point Defects

Here we describe a class of director fields which serve as prototypes for both +1 and −1 defects; we then find the minimizer of the elastic free energy within this class, and finally compute the minimum energy. In constructing a class of fields that exhibit either a +1 or a −1 defect, we build on the idea that these singularities appear when a field which escapes in one direction along the axis joins another which escapes in the opposite direction. We now focus attention on a +1 defect. Let $h_1$ and $h_2$ be given positive reals and let $r = r_1(z)$ and $r = r_2(z)$ be two unknown functions which, when $z$ varies in the intervals $[0, h_1]$ and $[-h_2, 0]$, describe the longitudinal profiles of two axisymmetric free surfaces which we call joints for short. On each section through a plane orthogonal to the axis of the cylinder we rescale one escaped field so that it ends with $\varphi = \varphi_0$ on the joint; moreover, we assume that the director field is constant on every cross-section with constant $\vartheta$ in the peripheral region bounded by the joints and the lateral boundary of the tube. We call $\mathbf{n}_1$ and $\mathbf{n}_2$ the director fields that appear above and below a +1 point defect. It readily follows from (2.5) that they are determined, respectively, by the angles $\varphi_1$ and $\varphi_2$ defined as


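$$\varphi_1(r,z) = \begin{cases}
\arcsin\left(\dfrac{r_1^2\cos^2\varphi_0 - (1-\sin\varphi_0)^2 r^2}{r_1^2\cos^2\varphi_0 + (1-\sin\varphi_0)^2 r^2}\right), & 0 \le r \le r_1(z),\\[2ex]
\varphi_0, & r_1(z) \le r \le R,
\end{cases}\tag{3.1}$$
$$\varphi_2(r,z) = \begin{cases}
\arcsin\left(\dfrac{(1+\sin\varphi_0)^2 r^2 - r_2^2\cos^2\varphi_0}{r_2^2\cos^2\varphi_0 + (1+\sin\varphi_0)^2 r^2}\right), & 0 \le r \le r_2(z),\\[2ex]
\varphi_0, & r_2(z) \le r \le R.
\end{cases}\tag{3.2}$$
The fields $\mathbf{n}_1$ and $\mathbf{n}_2$ join along the plane $z = 0$, where they form a +1 defect which is shown in Fig. 2a. A −1 defect can easily be obtained by exchanging the rôles of $\mathbf{n}_1$ and $\mathbf{n}_2$, as shown in Fig. 2b.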
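The piecewise definitions (3.1) and (3.2) translate directly into code. The following sketch is a hypothetical helper, not part of the original model description: it takes the local joint radius $r_i(z)$ as a plain number and returns the angle $\varphi_i$, which lets one check that the angle is continuous across the joint and equals $\pm\frac{\pi}{2}$ on the axis.

```python
import math

def joint_angle(r, r_joint, phi0, i):
    """Angle phi_i of (3.1) (i=1) or (3.2) (i=2) at radius r.

    r_joint is the value r_i(z) of the joint profile at the height considered.
    """
    if r >= r_joint:
        return phi0  # peripheral region: the director keeps the anchoring angle
    a = 1.0 + (-1) ** i * math.sin(phi0)
    x2 = (r_joint * math.cos(phi0)) ** 2
    y2 = (a * r) ** 2
    sign = 1.0 if i == 1 else -1.0
    return math.asin(sign * (x2 - y2) / (x2 + y2))

phi0, r_joint = 0.3, 0.5
for i in (1, 2):
    on_axis = joint_angle(0.0, r_joint, phi0, i)
    just_inside = joint_angle(r_joint - 1e-9, r_joint, phi0, i)
    print(i, on_axis, just_inside)
```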

Fig. 2. Director fields exhibiting (a) a +1 defect and (b) a −1 defect

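The elastic free energy stored in the regions adjacent to the joints is indeed the sum of two terms, $E_1$ and $E_2$, obtained by computing (1.2) over the cylinders with heights $h_1$ and $h_2$ (see Fig. 2a). For a field like the one in (1.4),
$$|\nabla\mathbf{n}|^2 = |\nabla\varphi|^2 + \frac{1}{r^2}\cos^2\varphi = \varphi_r'^2 + \varphi_z'^2 + \frac{1}{r^2}\cos^2\varphi, \tag{3.3}$$
where $\varphi'_r$ and $\varphi'_z$ are the derivatives of $\varphi$ with respect to $r$ and $z$. Since
$$\frac{\partial\varphi_i}{\partial r} = \begin{cases}
(-1)^i\,\dfrac{2(1+(-1)^i\sin\varphi_0)\,r_i\cos\varphi_0}{r_i^2\cos^2\varphi_0 + (1+(-1)^i\sin\varphi_0)^2 r^2}, & 0 \le r \le r_i(z),\\[2ex]
0, & r_i(z) \le r \le R,
\end{cases}\tag{3.4}$$
and
$$\frac{\partial\varphi_i}{\partial z} = \begin{cases}
(-1)^{i+1}\,\dfrac{2(1+(-1)^i\sin\varphi_0)\,r\cos\varphi_0}{r_i^2\cos^2\varphi_0 + (1+(-1)^i\sin\varphi_0)^2 r^2}\,r_i', & 0 \le r \le r_i(z),\\[2ex]
0, & r_i(z) \le r \le R,
\end{cases}\tag{3.5}$$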

we arrive at the following expression for the elastic energy stored in each joint:
$$E_1[r_1] = \pi K\int_0^{h_1}\left\{A(\varphi_0)\,r_1'^2 + B(\varphi_0) - \cos^2\varphi_0\ln\frac{r_1}{R}\right\}dz, \tag{3.6}$$
and
$$E_2[r_2] = \pi K\int_{-h_2}^{0}\left\{A(-\varphi_0)\,r_2'^2 + B(-\varphi_0) - \cos^2\varphi_0\ln\frac{r_2}{R}\right\}dz, \tag{3.7}$$
where now the prime denotes differentiation with respect to $z$ and
$$A(\varphi_0) := \frac{1+\sin\varphi_0}{1-\sin\varphi_0}\{2\ln 2 - 1 - 2\ln(1+\sin\varphi_0) + \sin\varphi_0\}, \tag{3.8}$$
$$B(\varphi_0) := 2(1-\sin\varphi_0). \tag{3.9}$$
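The coefficient $A(\varphi_0)$ in (3.8) is the radial integral of the squared $z$-derivative (3.5), taken per unit $r_i'^2$. As a hedged cross-check, the sketch below compares the closed form with a midpoint quadrature of $\int_0^{r_1}(\partial\varphi_1/\partial z)^2 r\,dr$ evaluated with $r_1 = 1$ and $r_1' = 1$ (the integral is invariant under rescaling of $r_1$). It reads the denominator of (3.5) with the squared factor $(1-\sin\varphi_0)^2$, consistently with (3.1); the function names are hypothetical.

```python
import math

def A_closed(phi0):
    """Closed form (3.8)."""
    s = math.sin(phi0)
    return (1.0 + s) / (1.0 - s) * (2.0 * math.log(2.0) - 1.0 - 2.0 * math.log(1.0 + s) + s)

def A_quadrature(phi0, n=100_000):
    """Midpoint value of the radial integral of (d phi_1 / dz)^2 r, with r_1 = r_1' = 1."""
    s, c = math.sin(phi0), math.cos(phi0)
    a = 1.0 - s
    step = 1.0 / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * step
        dphi_dz = 2.0 * a * r * c / (c * c + a * a * r * r)
        total += dphi_dz * dphi_dz * r * step
    return total

for phi0 in (0.3, -0.3):  # A(phi0) for the upper joint, A(-phi0) for the lower one
    print(phi0, A_closed(phi0), A_quadrature(phi0))
```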

The Euler–Lagrange equations for the functionals in (3.6) and (3.7) can be easily integrated once, thus leading to
$$\begin{aligned}
A(\varphi_0)(r_1')^2 &= -\cos^2\varphi_0\ln\frac{r_1}{R} + C_1,\\
A(-\varphi_0)(r_2')^2 &= -\cos^2\varphi_0\ln\frac{r_2}{R} + C_2,
\end{aligned}\tag{3.10}$$
where $C_1$ and $C_2$ are integration constants which can be determined by the natural equilibrium conditions at the free end-points:
$$r_1'(h_1) = 0 \quad\text{and}\quad r_2'(-h_2) = 0. \tag{3.11}$$
These conditions require the minimizing joint to be tangent to the lateral boundary of the cylinder where it ends. For a single defect, both joints reach the lateral boundary of the capillary tube, but for two defects sufficiently close to one another, the joints between them may meet along an inner cylinder. This is why, in describing our model for the director field around a defect, we keep a level of generality that would not be needed, were our discussion here not preliminary to the analysis of a more complicated structure. If we set
$$\rho_1 := \frac{r_1(h_1)}{R}, \qquad \rho_2 := \frac{r_2(-h_2)}{R}, \tag{3.12}$$
from (3.10) and (3.11) we obtain that
$$C_i = \cos^2\varphi_0\ln\rho_i \quad\text{for}\quad i = 1, 2. \tag{3.13}$$


Besides, by use of the geometric constraints ri (0) = 0

for

i = 1, 2,

(3.14)

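we can explicitly solve (3.10) and find the inverses of both functions $r_i(z)$, which here we denote by $z_i(r)$:

$$\begin{aligned}z_1(r)&=\frac{\sqrt{A(\varphi_0)}}{\cos\varphi_0}\,\rho_1R\int_{\ln\frac{\rho_1R}{r}}^{\infty}\frac{e^{-x}}{\sqrt{x}}\,dx,\\ z_2(r)&=-\frac{\sqrt{A(-\varphi_0)}}{\cos\varphi_0}\,\rho_2R\int_{\ln\frac{\rho_2R}{r}}^{\infty}\frac{e^{-x}}{\sqrt{x}}\,dx.\end{aligned}\tag{3.15}$$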

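Equations (3.15) give the shape of the equilibrium joints, and determine through (3.12) an explicit dependence for their heights $h_1$ and $h_2$ on $\rho_1$ and $\rho_2$:

$$h_1=\frac{\sqrt{A(\varphi_0)\pi}}{\cos\varphi_0}\,\rho_1R,\qquad h_2=\frac{\sqrt{A(-\varphi_0)\pi}}{\cos\varphi_0}\,\rho_2R,\tag{3.16}$$

since

$$\int_0^{\infty}\frac{e^{-x}}{\sqrt{x}}\,dx=\sqrt{\pi}.\tag{3.17}$$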
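The joint profile in (3.15) and the height in (3.16) can be evaluated with the complementary error function, since the substitution $x=t^2$ gives $\int_a^\infty e^{-x}x^{-1/2}\,dx=\sqrt{\pi}\,\mathrm{erfc}(\sqrt{a})$. The following Python sketch is only an illustration under that identity; the function names `A`, `z1` and `h1`, and the default values $\rho_1=1$, $R=1$, are choices made here and are not part of the paper.

```python
import math

def A(phi0):
    """Coefficient A(phi0) as defined in (3.8)."""
    s = math.sin(phi0)
    return (1 + s) / (1 - s) * (2 * math.log(2) - 1 - 2 * math.log(1 + s) + s)

def z1(r, phi0, rho1=1.0, R=1.0):
    """Upper-joint profile z1(r) of (3.15), valid for 0 < r <= rho1*R."""
    lower = math.log(rho1 * R / r)  # lower limit of the integral in (3.15)
    # integral of exp(-x)/sqrt(x) from `lower` to infinity, via erfc
    tail = math.sqrt(math.pi) * math.erfc(math.sqrt(lower))
    return math.sqrt(A(phi0)) / math.cos(phi0) * rho1 * R * tail

def h1(phi0, rho1=1.0, R=1.0):
    """Joint height h1 from (3.16)."""
    return math.sqrt(A(phi0) * math.pi) / math.cos(phi0) * rho1 * R
```

At $r=\rho_1R$ the lower limit vanishes and `z1` reduces to `h1`, which is the content of (3.16) and (3.17).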

Equations (3.16) also show that the equilibrium problem for $E_B$ in (1.2) has a unique solution in the class of all admissible joints, provided that

$$h_1\le\frac{\sqrt{A(\varphi_0)\pi}}{\cos\varphi_0}\,R,\qquad h_2\le\frac{\sqrt{A(-\varphi_0)\pi}}{\cos\varphi_0}\,R,\tag{3.18}$$

since both $\rho_1$ and $\rho_2$ cannot exceed 1. We are now in a position to compute the elastic free energy of an equilibrium joint. Inserting (3.10) into (3.6) and (3.7), we arrive at

$$\begin{aligned}E_1&=\pi K\{B(\varphi_0)+\cos^2\varphi_0-\cos^2\varphi_0\,\ln\rho_1\}\,h_1,\\ E_2&=\pi K\{B(-\varphi_0)+\cos^2\varphi_0-\cos^2\varphi_0\,\ln\rho_2\}\,h_2.\end{aligned}\tag{3.19}$$

By (3.16), it is clear that Ei can be made to depend on only ρi or only hi , for both i = 1 and i = 2. Thus far both h1 and h2 have been taken as arbitrarily prescribed. We have shown that whenever they obey (3.18), there is precisely one pair of equilibrium joints. We shall see in the following sections how they are to be chosen to solve the specific variational problems we now address.


4. Escape of a Monopole

Our purpose in this section is to describe the motion of a single defect, which we call a monopole. We focus attention on a $-1$ defect. We show here that this motion is peculiar to the conical anchoring, since it is driven by the lack of symmetry between the equilibrium solutions $n_-$ and $n_+$ found in Sect. 2. We first set a proper dynamical context, which allows us to derive the equation that governs the motion of a single defect. In [8] Leslie rebuilt the classical theory of nematic flows proposed in [18] on a dissipation principle which takes the following form when the hydrodynamic flow is negligible:

$$\dot E+W=0;\tag{4.1}$$

here $\dot E$ is the rate of change of the elastic free energy and $W$ is the energy dissipated by the viscous torques acting during the rearrangement of the director field. In our model, when a defect moves along the axis of the tube the joints rigidly slide with it. Moreover, both joints reach the lateral boundary of the tube; thus, $r_1(h_1)=r_2(-h_2)=R$, and so by (3.12) $\rho_1=\rho_2=1$, which makes both $h_1$ and $h_2$ attain their maximum values. It then follows from (3.19) that the energy stored in both joints remains constant while they move. On the contrary, the energy stored in the fields $n_-$ and $n_+$ on both sides of the defect varies in time, as do the regions they occupy. Apart from an inessential constant, this energy can be computed by adding the energies in both formulae (2.6), provided in the former $h$ is taken as the coordinate $z_0$ of the defect, and in the latter as $H-z_0$, where $H$ is the height of the tube. Figure 3 illustrates the situation we envisage here: the $z$-axis is so oriented that the field $n_-$ stays below the field $n_+$, which would make the defect move upwards to reduce the total elastic energy within the tube. We thus obtain

$$\dot E=-4\pi K\,\dot z_0\sin\varphi_0.\tag{4.2}$$

We now compute the dissipation $W$ as a function of $\dot z_0$, so that (4.1) becomes a first-order differential equation for $z_0$. For a director field represented as in (1.4) the energy dissipated in a prescribed region $B$ is given the following form:

$$W=\gamma_1\int_B\Bigl(\frac{\partial\varphi}{\partial t}\Bigr)^2dv,\tag{4.3}$$

where $\gamma_1>0$ is the rotational viscosity. Here $\varphi$ depends on time, as do the joints which slide along the axis: their longitudinal sections are now represented by the functions

$$Z_i(r,t):=z_0(t)+z_i(r)\quad\text{for}\quad i=1,2,\tag{4.4}$$

where $z_i$ are defined in (3.15). Let $B$ be a portion of the tube which at a given time encloses both the defect and its joints. The integral in (4.3) can be computed by use of both (3.1) and (3.2), where $r_1$ and $r_2$ are replaced by $R_1$ and $R_2$, the inverse functions of $Z_1$ and $Z_2$. Reasoning precisely as in Sect. 3 of [11], with little more labour here we arrive at

$$W=2\pi\gamma_1\Bigl\{A(\varphi_0)\int_0^R\frac{\bigl(\frac{\partial Z_1}{\partial t}\bigr)^2}{\frac{\partial Z_1}{\partial r}}\,dr-A(-\varphi_0)\int_0^R\frac{\bigl(\frac{\partial Z_2}{\partial t}\bigr)^2}{\frac{\partial Z_2}{\partial r}}\,dr\Bigr\},\tag{4.5}$$

whence, by (4.4) and (3.15) it follows that

$$W=\pi\gamma_1\sqrt{\pi}\,R\cos\varphi_0\,\bigl(\sqrt{A(-\varphi_0)}+\sqrt{A(\varphi_0)}\bigr)\,\dot z_0^{\,2}.\tag{4.6}$$


Fig. 3. A −1 defect moving in a capillary tube

We are now in a position to derive the equation that governs the motion of a monopole inside the tube: inserting both (4.6) and (4.2) into (4.1), we obtain

$$\frac{\dot z_0}{R}=\frac{4}{\sqrt{\pi}}\,\frac{1}{\tau}\,\tan\varphi_0\,\frac{1}{\sqrt{A(-\varphi_0)}+\sqrt{A(\varphi_0)}},\tag{4.7}$$

where $\tau:=\gamma_1R^2/K$ is a characteristic time which depends on the size of the capillary. Equation (4.7) shows that in a tube which enforces a conical anchoring, a single defect moves with a constant velocity, which depends on the material constants of the liquid crystal and the anchoring angle $\varphi_0$. This velocity decreases when $\varphi_0$ tends to zero and it vanishes when $\varphi_0=0$, so that a defect would not move when the anchoring is homeotropic. A singular phenomenon occurs in the limit as $\varphi_0\to\frac{\pi}{2}$, when the velocity diverges. This is indeed due to the fact that in this limit the field $n_+$ generates a string, which is a whole singular line along the axis of the tube. This type of singularity is again considered in Sect. 6 for two defects with opposite topological charges at a prescribed distance.
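The dependence of the monopole velocity (4.7) on the anchoring angle can be explored numerically. The Python sketch below simply evaluates the right-hand side of (4.7) with $A$ taken from (3.8); the name `monopole_speed` and the default units $\tau=R=1$ are assumptions made here for illustration.

```python
import math

def A(phi0):
    """Coefficient A(phi0) as defined in (3.8)."""
    s = math.sin(phi0)
    return (1 + s) / (1 - s) * (2 * math.log(2) - 1 - 2 * math.log(1 + s) + s)

def monopole_speed(phi0, tau=1.0, R=1.0):
    """Defect velocity dz0/dt from (4.7), for 0 <= phi0 < pi/2."""
    denom = math.sqrt(A(-phi0)) + math.sqrt(A(phi0))
    return R * 4.0 * math.tan(phi0) / (math.sqrt(math.pi) * tau * denom)

# Tabulate the speed (in units of R/tau) for a few anchoring angles.
for phi0 in (0.0, 0.5, 1.0, 1.5):
    print(f"phi0 = {phi0:.1f}  speed = {monopole_speed(phi0):.4f}")
```

The tabulated values vanish at $\varphi_0=0$ and grow as $\varphi_0$ approaches $\pi/2$, in line with the discussion of (4.7).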

5. Nucleation of a Dipole

In this section we study the generation inside a tube subject to conical anchoring of a topological dipole, that is, a pair of point defects with opposite topological charges. The motion towards annihilation of the defects in a topological dipole has long been observed; here we predict a different phenomenon, which is possible only in the presence of conical anchoring. It stems from the lack of symmetry between the equilibrium


solutions $n_-$ and $n_+$ that we found in Sect. 2. There we proved that $n_+$ is a locally stable point for the energy functional: it is thus conceivable that this configuration be actually attained in a cylinder, though it does not represent the absolute minimizer. Here we propose a mechanism which would allow the excess of elastic energy associated with $n_+$ to decay, and make the system evolve towards its minimum. It can be shown by simple topological arguments that no continuous deformation of $n_+$ into $n_-$ is possible, while the boundary datum $n_0$ is held unchanged. Thus, singularities must be generated; here we consider the nucleation of a pair. In our setting there are two possible configurations for a topological dipole: the more distorted field $n_+$ can lie in the region between the defects or in the regions away from them. The evolution of these initial states is quite different. The former turns into the annihilation of the defects, as shown in [13]. The latter causes the more energetic field $n_+$ to relax into the less energetic $n_-$, as we now show. We again attack this problem from a dynamical perspective: relying on (4.1), we write the equation of motion, and we see from it that the defects are eventually taken apart, but only if the initial distance between them is larger than a critical value, which depends on the anchoring angle. Consider a cylinder sufficiently high to accommodate two defects together with their joints. Let $d$ be the distance between the defects. The inner joints embrace the field $n_-$; by (3.18)$_1$ their maximum height is $h_1=\frac{\sqrt{A(\varphi_0)\pi}}{\cos\varphi_0}R$, and so they do not meet when

$$d>2\,\frac{\sqrt{A(\varphi_0)\pi}}{\cos\varphi_0}\,R=:d_c.\tag{5.1}$$

In other words, $\rho_1=1$ for $d>d_c$. On the contrary, for $d\le d_c$, $h_1$ must equal $\frac{d}{2}$, and so by (3.16)$_1$

$$\rho_1=\frac{d}{d_c}\le 1.\tag{5.2}$$

On the other hand, $\rho_2=1$ for all values of $d$, and so $h_2$ always attains its maximum value. A sketch of the director field for both $d<d_c$ and $d>d_c$ is drawn in Fig. 4. The elastic free energy stored in a cylinder with a dipole can be computed as in Sect. 4 for a monopole. A judicious application of the formulae in (3.19), with both $h_1$ and $h_2$ chosen as just discussed above, gives the energy in the region enclosed by the joints. Similarly, the energies associated with the director fields $n_-$ and $n_+$ follow from the formulae in (2.6), provided that $h$ is chosen as the height of the cylinder occupied by the corresponding field. The result of all these computations is a function $E$ of the distance $d$:

$$E(d)=\begin{cases}2\pi K\bigl\{-4\sin\varphi_0+\cos^2\varphi_0\bigl(1-\ln\frac{d}{d_c}\bigr)\bigr\}\,d+E(0) & \text{for } d\le d_c,\\[1ex] -8\pi K\sin\varphi_0\,d & \text{for } d\ge d_c.\end{cases}\tag{5.3}$$

Following the same pattern as in Sect. 4, we can also compute the dissipation $W$ as a function of $d$ and its time derivative. Here, however, this computation is much more involved, as integrals similar to those in (4.5) must now be evaluated also for joints whose shape changes in time. All these integrals resemble the example worked out explicitly in [11]: putting them all together we arrive at

$$W(d,\dot d)=\begin{cases}\frac12\,\pi^{3/2}\gamma_1R\cos\varphi_0\,\bigl\{\lambda\sqrt{A(\varphi_0)}\,\frac{d}{d_c}+\sqrt{A(-\varphi_0)}\bigr\}\,\dot d^{\,2} & \text{for } d<d_c,\\[1ex] \frac12\,\pi^{3/2}\gamma_1R\cos\varphi_0\,\bigl\{\sqrt{A(\varphi_0)}+\sqrt{A(-\varphi_0)}\bigr\}\,\dot d^{\,2} & \text{for } d>d_c,\end{cases}\tag{5.4}$$


Fig. 4. A topological dipole inside a capillary tube for both d < dc and d > dc

where $\lambda\simeq 1.445$ is a dimensionless parameter which has been computed numerically in [9]. We now insert both the derivative of $E(d)$ with respect to time and $W(d,\dot d)$ into (4.1), so obtaining the first-order differential equation for the motion of the defects:

$$\frac{\dot d}{2R}=\begin{cases}\dfrac{1}{\tau\sqrt{\pi}}\,\dfrac{4\sin\varphi_0+\cos^2\varphi_0\,\ln\frac{d}{d_c}}{\cos\varphi_0\,\bigl\{\sqrt{A(\varphi_0)}\,\lambda\,\frac{d}{d_c}+\sqrt{A(-\varphi_0)}\bigr\}} & \text{for } d<d_c,\\[2ex] \dfrac{1}{\tau\sqrt{\pi}}\,\dfrac{4\tan\varphi_0}{\sqrt{A(\varphi_0)}+\sqrt{A(-\varphi_0)}} & \text{for } d>d_c,\end{cases}\tag{5.5}$$

where $\tau=\gamma_1R^2/K$ is the characteristic time defined in Sect. 4. It should be noticed that since $\lambda\ne 1$ the relative velocity of the defects suffers a jump when $d$ crosses $d_c$, while the force between them is continuous. This is clearly a direct consequence of the analogous discontinuity for $W$ in (5.4), and reveals an impulse somehow occurring in the dissipative torques, which might well be just an artefact of our model. The velocity in Eq. (5.5) changes its sign for

$$d=d_0:=e^{-\frac{4\sin\varphi_0}{\cos^2\varphi_0}}\,d_c;\tag{5.6}$$

since it is negative for $d<d_0$, the dipole contracts within this range, while it dilates beyond it. A graph of $d_0/d_c$ against the anchoring angle $\varphi_0$ is plotted in Fig. 5; it shows that this ratio decreases when $\varphi_0$ increases: it is already in the order of hundredths when $\varphi_0=\frac{\pi}{4}$. The outcome of this model can be interpreted in the following way. Should a fluctuation generate a topological dipole embracing the less distorted field $n_-$, the system would then prefer the more distorted field $n_+$ to retreat from the rest of the cylinder, provided the defects are born sufficiently far apart. Precisely, if a dipole is generated


Fig. 5. The graph of the ratio $d_0/d_c$ vs. the anchoring angle

with $d<d_0$, it closes up again, so restoring the local minimizer of the free energy functional. In other words, there is an energy barrier which kills all dipoles that do not climb it. Such a barrier is simply the difference $E(d_0)-E(0)$, which we scale to the characteristic elastic energy $2\pi KR$: it represents the energy to be supplied to the system for this process to get started. It readily follows from (5.3) that

$$\Delta E:=\frac{E(d_0)-E(0)}{2\pi KR}=\cos\varphi_0\,\sqrt{A(\varphi_0)\pi}\;e^{-\frac{4\sin\varphi_0}{\cos^2\varphi_0}}.\tag{5.7}$$

In Fig. 6 we plot the graph of $\Delta E$ against $\varphi_0$: it shows how the energy barrier is always less than 1.1 times the characteristic energy $2\pi KR$.

Fig. 6. The activation energy vs. the anchoring angle ϕ0
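The quantities plotted in Figs. 5 and 6 follow directly from (5.6) and (5.7), and the sign change of the velocity can be read off the $d<d_c$ branch of (5.5). The Python sketch below is an illustration only: the names `critical_ratio`, `barrier` and `dipole_rate` are introduced here, the rate is expressed in terms of $x=d/d_c$, and $\tau=1$ is an arbitrary choice of units.

```python
import math

def A(phi0):
    """Coefficient A(phi0) as defined in (3.8)."""
    s = math.sin(phi0)
    return (1 + s) / (1 - s) * (2 * math.log(2) - 1 - 2 * math.log(1 + s) + s)

def critical_ratio(phi0):
    """Ratio d0/dc from (5.6)."""
    return math.exp(-4 * math.sin(phi0) / math.cos(phi0) ** 2)

def barrier(phi0):
    """Scaled activation energy Delta E from (5.7)."""
    return math.cos(phi0) * math.sqrt(A(phi0) * math.pi) * critical_ratio(phi0)

def dipole_rate(x, phi0, lam=1.445, tau=1.0):
    """Right-hand side of (5.5) on the branch d < dc, with x = d/dc in (0, 1)."""
    num = 4 * math.sin(phi0) + math.cos(phi0) ** 2 * math.log(x)
    den = math.cos(phi0) * (math.sqrt(A(phi0)) * lam * x + math.sqrt(A(-phi0)))
    return num / (tau * math.sqrt(math.pi) * den)
```

For homeotropic anchoring, $\varphi_0=0$, the ratio equals 1 and the barrier reduces to $\sqrt{(2\ln 2-1)\pi}$; for any $\varphi_0\in(0,\pi/2)$ the rate is negative below `critical_ratio(phi0)` and positive above it.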

6. Frozen Dipole

Here our approach to the problem of finding a minimizer for the energy functional in (1.2) has been a little unusual. Instead of looking for a harmonic map, we restricted attention to a special class of fields, where luckily enough the variational problem was explicitly solved. Then, making these equilibrium fields evolve in time so as to conform to a dissipation principle, we arrived at the equations that describe the motion of both a monopole and a dipole in a tube. There is, however, a major objection to this method: the fields we are using are very different from the harmonic maps which solve exactly the minimum problem. To this


objection we can oppose both the experimental [12] and numerical [19] evidence that confirms our predictions. Here we compare in a limiting case the energy of our special fields and that of other fields introduced elsewhere. We attempt to support our method with an analogy argument, typical of classical mathematical physics. Little is known about harmonic maps obeying cylindrical symmetry and having defects. In [16] and [17], Brezis, Coron, and Lieb consider two points $p_1$ and $p_2$ at a prescribed distance $D$ and seek the minimizers of (1.2) in the class

$$\mathcal{C}:=\{n:\mathbb{R}^3\to S^2\;|\;n\in C^1(\mathbb{R}^3\setminus\{p_1,p_2\})\}.\tag{6.1}$$

They prove that the lower bound for the elastic energy is 4πKD, which is not attained in C. Moreover, they construct a minimizing sequence in C which tends to a field that is singular all along the string connecting p1 and p2 . In the following we show that this result can be recovered within the class of fields employed throughout our analysis. Let p1 and p2 be two prescribed points on the axis of a capillary tube that enforces a conical anchoring with amplitude ϕ0 on its lateral boundary. The situation we envisage here is illustrated in Fig. 7.

Fig. 7. A sketch of the director field between two prescribed point defects

The dipole now embraces the field $n_+$, while the field $n_-$ occupies the outer regions. Thus, the height of the inner joints is $h_2$ and that of the outer joints is $h_1$. While this latter takes its maximum value in (3.18)$_1$, $h_2=\frac{D}{2}$, and so

$$\rho_1=1\quad\text{and}\quad\rho_2=\frac{D\cos\varphi_0}{2\sqrt{A(-\varphi_0)\pi}\,R}.\tag{6.2}$$

Computing both energies in (3.19) for both defects, by use of (6.2) we obtain the energy stored in this dipole as a function of ϕ0 :


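$$2E_1+2E_2=E(\varphi_0):=2\pi K\{B(\varphi_0)+\cos^2\varphi_0\}\,\frac{\sqrt{A(\varphi_0)\pi}\,R}{\cos\varphi_0}+\pi K\Bigl\{B(-\varphi_0)+\cos^2\varphi_0\Bigl(1-\ln\Bigl(\frac{D\cos\varphi_0}{2\sqrt{A(-\varphi_0)\pi}\,R}\Bigr)\Bigr)\Bigr\}\,D,\tag{6.3}$$

where the functions $A(\varphi_0)$ and $B(\varphi_0)$ have been defined in (3.8) and (3.9). We now vary the anchoring angle $\varphi_0$ and make it tend to $\frac{\pi}{2}$. Since

$$\lim_{\varphi_0\to\frac{\pi}{2}^-}\frac{\sqrt{A(\varphi_0)}}{\cos\varphi_0}=\frac12\quad\text{and}\quad\lim_{\varphi_0\to\frac{\pi}{2}^-}\cos^2\varphi_0\,\ln\frac{\cos\varphi_0}{\sqrt{A(-\varphi_0)}}=0,\tag{6.4}$$

from (6.3) we have that

$$\lim_{\varphi_0\to\frac{\pi}{2}^-}E(\varphi_0)=4\pi KD.\tag{6.5}$$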
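The limit (6.5) can be checked numerically by evaluating (6.3) for anchoring angles close to $\pi/2$. The Python sketch below does this under the illustrative choice $D=R=K=1$ (so that $\rho_2\le 1$ in (6.2)); the function name `frozen_dipole_energy` is introduced here and is not taken from the paper.

```python
import math

def A(phi0):
    """Coefficient A(phi0) as defined in (3.8)."""
    s = math.sin(phi0)
    return (1 + s) / (1 - s) * (2 * math.log(2) - 1 - 2 * math.log(1 + s) + s)

def B(phi0):
    """Coefficient B(phi0) as defined in (3.9)."""
    return 2 * (1 - math.sin(phi0))

def frozen_dipole_energy(phi0, D=1.0, R=1.0, K=1.0):
    """Energy E(phi0) of (6.3) for two defects at distance D."""
    c = math.cos(phi0)
    # outer joints: rho1 = 1, maximal height from (3.18)
    outer = 2 * math.pi * K * (B(phi0) + c ** 2) * math.sqrt(A(phi0) * math.pi) * R / c
    # inner joints: rho2 from (6.2), total height D
    rho2 = D * c / (2 * math.sqrt(A(-phi0) * math.pi) * R)
    inner = math.pi * K * (B(-phi0) + c ** 2 * (1 - math.log(rho2))) * D
    return outer + inner

# Ratio E(phi0) / (4 pi K D) for angles approaching pi/2.
for gap in (0.1, 0.01):
    ratio = frozen_dipole_energy(math.pi / 2 - gap) / (4 * math.pi)
    print(f"pi/2 - phi0 = {gap}: E / (4 pi K D) = {ratio:.6f}")
```

The printed ratio approaches 1 as the gap shrinks, which is the statement of (6.5) for this choice of parameters.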

Thus, also within the class of fields described in Sect. 3 we can approach the infimum of the elastic free energy for a frozen dipole.

References

1. Neu, J.C.: Vortices in complex scalar fields. Physica D 43, 385–406 (1990)
2. Jaffe, A. and Taubes, C.: Vortices and Monopoles. Boston: Birkhäuser, 1980
3. Lin, F.H.: Mixed vortex-antivortex solutions of Ginzburg–Landau equations. Arch. Rat. Mech. Anal. 133, 103–128 (1995)
4. Mermin, N.D.: The topological theory of defects in ordered media. Rev. Mod. Phys. 51, 591–648 (1979)
5. Frank, F.C.: On the theory of liquid crystals. Discuss. Faraday Soc. 25, 19–28 (1958)
6. Guidone Peroli, G. and Virga, E.G.: Capillary locking of point defects in nematics. In: Contemporary Research in the Mechanics and Mathematics of Materials, R. C. Batra & M. F. Beatty eds., CIMNE, Barcelona, 1996, pp. 236–247
7. Guidone Peroli, G. and Virga, E.G.: Modelling the capillary locking of point defects in nematic liquid crystals. IMA J. Appl. Math. 58, 211–236 (1997)
8. Leslie, F.M.: Continuum theory for nematic liquid crystals. Continuum Mech. Thermodyn. 4, 167 (1992)
9. Guidone Peroli, G. and Virga, E.G.: Annihilation of point defects in nematic liquid crystals. Phys. Rev. E 54, 5235–5241 (1996)
10. Guidone Peroli, G. and Virga, E.G.: Arrays of defect evolving towards neutral equilibria. Phys. Rev. E 56, 1819–1824 (1997)
11. Guidone Peroli, G. and Virga, E.G.: Dynamics of point defects in nematic liquid crystals. Physica D 111, 356–372 (1998)
12. Guidone Peroli, G., Hillig, G., Saupe, A. and Virga, E.G.: Orientational Capillary Pressure and Nematic Point Defect. To appear in Phys. Rev. E (1998)
13. Guidone Peroli, G. and Virga, E.G.: The role of boundary conditions in the annihilation of nematic point defects. Submitted to Phys. Rev. E (1998)
14. Cladis, P.E. and Kléman, M.: Non-singular disclinations of strength S = 1 in nematics. J. Phys. (Paris) 33, 591–598 (1972)
15. Meyer, R.B.: On the existence of even indexed disclinations in nematic liquid crystals. Phil. Mag. 77, 405–424 (1973)
16. Brezis, H., Coron, J.-M. and Lieb, E.H.: Estimations d'énergie pour des applications de R³ à valeurs dans S². C. R. Acad. Sc. Paris, 303, Série I, n° 5, 1986
17. Brezis, H., Coron, J.-M. and Lieb, E.H.: Harmonic maps with defects. Commun. Math. Phys. 107, 647–705 (1986)
18. Leslie, F.M.: Some constitutive equations for liquid crystals. Arch. Rat. Mech. Anal. 28, 265–283 (1968)
19. Kralj, S. and Žumer, S.: Private Communication

Communicated by D. Brydges

Commun. Math. Phys. 200, 211 – 247 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Stable Magnetic Equilibria in a Symmetric Collisionless Plasma

Yan Guo

Division of Applied Mathematics, Brown University, Providence, RI 02912, USA; Department of Mathematics, Princeton University, Princeton, NJ 08544, USA

Received: 13 February 1998 / Accepted: 13 July 1998

Abstract: A collisionless plasma is described by the Vlasov–Maxwell system. In many physical situations, a plasma is invariant under either rotations or translations. Many symmetric equilibria with nontrivial magnetic fields are critical points of an appropriate Liapunov functional, and their dynamical stability is studied among all symmetric perturbations. The set of all minimizers of the Liapunov functional is dynamically stable. Criteria for stability for general critical points are also established. A simpler sufficient condition for stability is derived for neutral equilibria.

1. Introduction

A collisionless plasma is described by the relativistic Vlasov–Maxwell system, where charged particles interact with a self-consistent electromagnetic field. Let $\Omega$ be a bounded domain in $\mathbb{R}^3$. Let $F_\pm(t,x,v)$ be the distribution functions for the ions (+) and electrons (−) at time $t$, spatial coordinates $x=(x_1,x_2,x_3)\in\Omega\subset\mathbb{R}^3$ and momentum $v=(v_1,v_2,v_3)\in\mathbb{R}^3$. Let $m_\pm$ be the masses for the ions (+) and electrons (−) respectively. For notational simplicity, we normalize all other physical constants to be one. Let the electromagnetic field be $E$ and $B$. Let $\langle v_\pm\rangle=\sqrt{m_\pm^2+|v|^2}$ and the relativistic velocity be $\hat v_\pm=v/\langle v_\pm\rangle$. The Vlasov–Maxwell system takes the form of

$$\begin{aligned}&\partial_tF_\pm+\hat v_\pm\cdot\nabla_xF_\pm\pm(E+\hat v_\pm\times B)\cdot\nabla_vF_\pm=0,\\ &\partial_tE-\operatorname{curl}B=-j=-\int_{\mathbb{R}^3}(\hat v_+F_+-\hat v_-F_-)\,dv,\\ &\partial_tB+\operatorname{curl}E=0,\\ &\operatorname{div}E=\rho=\int_{\mathbb{R}^3}(F_+-F_-)\,dv,\qquad\operatorname{div}B=0.\end{aligned}\tag{1}$$


The initial condition is $F_\pm(0,x,v)=F_{\pm0}$, $E(0,x)=E_0(x)$ and $B(0,x)=B_0(x)$ with constraints $\operatorname{div}E_0=\rho(0,x)$ and $\operatorname{div}B_0=0$. At $\partial\Omega$, we impose the ideal specular reflection boundary condition for the plasma and the perfect conductor boundary condition for the electromagnetic field. One of the fundamental features of the Vlasov–Maxwell system (1) is the multiplicity of its steady states, whose dynamical stability is one of the central issues in the study of plasmas. In many situations, a plasma has certain symmetry. Recently, by using certain invariants with respect to corresponding symmetries, Degond [D], and later Batt and Fabian [BF], have constructed various types of steady state solutions to (1). The goal of this article is to investigate the dynamical stability of a certain class of these known equilibria. We assume

$$\begin{gathered}0<\mu_\pm\in C^2_{\rm loc}(\mathbb{R}),\qquad\mu_\pm'<0,\qquad\lim_{\zeta\to-\infty}\mu_\pm(\zeta)=+\infty,\\ \mu_\pm(\zeta)\le C|\zeta|^{-\gamma},\qquad|\mu_\pm'(\zeta)|\le C|\zeta|^{-\gamma-1},\qquad\text{for }\gamma>4\text{ and }\zeta>0,\\ \sup_{\zeta\in\mathbb{R}}\,[\,|\eta_\pm(\zeta)|+|\eta_\pm'(\zeta)|\,]<\infty.\end{gathered}\tag{2}$$

A typical example is $\mu_\pm(\zeta)=e^{-\zeta}$. If $\Omega$ is invariant with respect to all rotations along the $x_3$-axis $Z$ and $\Omega\cap Z=\emptyset$, we define

$$h_\pm(\alpha,\beta,v)=\langle v_\pm\rangle\pm\beta(r,z)+\eta_\pm(r(v_\theta\pm\alpha(r,z))).\tag{3}$$

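A class of steady axially symmetric solutions takes the form:

$$F_\pm=\mu_\pm(h_\pm(\alpha,\beta,v)),\qquad E=\nabla\beta,\qquad B=\Bigl(-\partial_z\alpha\cos\theta,\ -\partial_z\alpha\sin\theta,\ \frac1r[\partial_r(r\alpha)]\Bigr).$$

Here $z=x_3$, $r=\sqrt{x_1^2+x_2^2}$, $v_\theta=-v_1\sin\theta+v_2\cos\theta$ and $\tan\theta=x_2/x_1$. The axial magnetic potential $\alpha$ and the electric potential $\beta$ satisfy

$$\begin{aligned}-\partial_{zz}\beta-\partial_{rr}\beta-\frac1r\partial_r\beta&=\rho(\alpha,\beta)=\int_{\mathbb{R}^3}[\mu_+(h_+)-\mu_-(h_-)]\,dv,\\ -\partial_{zz}\alpha-\partial_{rr}\alpha-\frac1r\partial_r\alpha+\frac1{r^2}\alpha&=j_\theta(\alpha,\beta)=\int_{\mathbb{R}^3}[\hat v_{\theta+}\mu_+(h_+)-\hat v_{\theta-}\mu_-(h_-)]\,dv\end{aligned}\tag{4}$$

with $\beta=0$ and $\alpha=0$ at the boundary $\partial\Omega$. By (2), $|\mu_\pm^{-1}(\xi)|\le C|\xi|^{-1/\gamma}$ for $\xi$ near zero; here $\mu_\pm^{-1}$ are the inverses of $\mu_\pm$, and $\gamma>4$. We define

$$H_\pm(\zeta)=-\int_0^\zeta\{\mu_\pm^{-1}\}(\xi)\,d\xi.\tag{5}$$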

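To study these axially symmetric equilibria, we observe that they are critical points of the steady Liapunov functional:

$$\begin{aligned}J_0(f_\pm,A_\theta)=&\sum_\pm\int_{\Omega\times\mathbb{R}^3}\{H_\pm(f_\pm)+[\langle v_\pm\rangle+\eta_\pm(r(v_\theta\pm A_\theta))]f_\pm\}\,r\,dr\,dz\,dv\\ &+\frac12\int_\Omega\Bigl\{|\nabla\phi|^2+|\partial_zA_\theta|^2+\Bigl|\frac1r\partial_r(rA_\theta)\Bigr|^2\Bigr\}\,r\,dr\,dz,\end{aligned}\tag{6}$$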


where $-\nabla\phi\in H_0^1(\Omega)$ is the electric potential which satisfies:

$$-\Delta\phi=\int_{\mathbb{R}^3}(f_+-f_-)\,dv.$$

We illustrate our first main result as follows:

Main Theorem A. Assume (2). (a) The set of minimizers of $J_0$ is dynamically stable among axially symmetric perturbations. (b) Assume for a critical point $(\mu_\pm(h_\pm),\alpha)$ of $J_0$,

$$-\partial_{zz}-\partial_{rr}-\frac1r\partial_r+\frac1{r^2}-\partial_\alpha j_\theta(\alpha,\beta)>0.\tag{7}$$

Under some mild conditions, the critical point is dynamically stable among axially symmetric perturbations.

For a precise statement, see Theorem 2. Since the operator $-\partial_{zz}-\partial_{rr}-\frac1r\partial_r+\frac1{r^2}$ is positive, (7) is valid if $\partial_\alpha j_\theta(\alpha,\beta)$ is small. In particular, if $\beta\equiv0$ (neutral) in an equilibrium $(\mu_\pm(h_\pm),\alpha)$, condition (7) is valid if $\alpha$ is a non-degenerate minimizer of the functional

$$G(\alpha)=\int_\Omega\Bigl\{\frac12|\partial_z\alpha|^2+\frac12\Bigl|\frac1r\partial_r(r\alpha)\Bigr|^2+\sum_\pm\int_{\mathbb{R}^3}M_\pm(h_\pm(\alpha,0,v))\,dv\Bigr\}\,r\,dr\,dz,\tag{8}$$

where $M_\pm(\zeta)=\int_\infty^\zeta\mu_\pm(\xi)\,d\xi$. If $\Omega$ is invariant with respect to the translations along the $Z$ axis ($\Omega$ depends only on $x_1$ and $x_2$), we define

$$h_\pm(\alpha,\beta,v)=\langle v_\pm\rangle\pm\beta(x_1,x_2)+\eta_\pm(v_3\pm\alpha(x_1,x_2)).\tag{9}$$

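A class of steady translation invariant solutions takes the form

$$F_\pm=\mu_\pm(h_\pm(\alpha,\beta,v)),\qquad E=(\partial_{x_1}\beta,\ \partial_{x_2}\beta,\ 0),\qquad B=(\partial_{x_2}\alpha,\ -\partial_{x_1}\alpha,\ 0),$$

and the electromagnetic potentials $\beta$ and $\alpha$ satisfy

$$\begin{aligned}-\partial_{x_1x_1}\beta-\partial_{x_2x_2}\beta&=\rho(\alpha,\beta)=\int_{\mathbb{R}^3}[\mu_+(h_+)-\mu_-(h_-)]\,dv,\\ -\partial_{x_1x_1}\alpha-\partial_{x_2x_2}\alpha&=j_3(\alpha,\beta)=\int_{\mathbb{R}^3}[\hat v_{3+}\mu_+(h_+)-\hat v_{3-}\mu_-(h_-)]\,dv,\end{aligned}\tag{10}$$

where both $\beta$ and $\alpha$ vanish at $\partial\Omega$. With (2) and (5), these equilibria are critical points of the steady Liapunov functional:

$$\begin{aligned}J_0(f_\pm,A_z)=&\sum_\pm\int_{\Omega\times\mathbb{R}^3}\{H_\pm(f_\pm)+[\langle v_\pm\rangle+\eta_\pm(v_3\pm A_z)]f_\pm\}\,dx_1\,dx_2\,dv\\ &+\frac12\int_\Omega\{|\nabla\phi|^2+|\nabla A_z|^2\}\,dx_1\,dx_2,\end{aligned}\tag{11}$$

where the electric potential $-\nabla\phi$ satisfies $-\Delta\phi=\rho$. The main results are parallel to those in the axially symmetric case.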


Main Theorem B. Assume (2). (a) The set of minimizers of $J_0$ is dynamically stable among perturbations which are invariant under $Z$-translations. (b) Assume for a critical point $(\mu_\pm,\alpha)$ of $J_0$, $-\partial_{x_1x_1}-\partial_{x_2x_2}-\partial_\alpha j_3(\alpha,\beta)>0$. Then under some mild conditions, the critical point is dynamically stable among perturbations which are invariant under $Z$-translations.

For a precise statement, see Sect. 9. We also have a similar reduction when $\beta\equiv0$. In [G2], the author initiated the study of dynamical stability of equilibria with nontrivial magnetic fields. Many magnetic equilibria, including those new "flat-tail" solutions constructed by Ragazzo and the author [GR], were shown to be dynamically stable in one space dimension. This paper generalizes [G2] to the higher-dimensional case. In addition to the more complicated geometry in the present study, there are several major improvements. This article is based on the crucial observation that many equilibria are critical points of $J_0$. This observation leads to more general stable configurations for $\mu_\pm(\zeta)$, which is restricted to only $\mu_\pm(\zeta)=e^{-\zeta}$ in [G2]. Secondly, in the present framework, the electric field could be nontrivial, while it is absent (purely magnetic) in [G2]. Finally, by connecting steady states to critical points, we apply the natural and direct variational method to study their stability. In [G2], the stability analysis is based on the nodal set analysis, which is very one-dimensional. The present variational method is more general and suitable for multi-dimensional problems and could be applied to other different problems. In particular, without any "hard" estimates, the set of minimizers of $J_0$ is always dynamically stable. Many results in this paper are valid for more general $J_0$. Furthermore, based on a similar variational approach, many stable equilibria have been constructed recently in the stellar dynamics case [G5, GRe]. This paper raises many interesting questions.
First of all, can the restriction on the domain $\Omega\cap Z=\emptyset$ be relaxed for the axially symmetric case? The second question is to further characterize minimizers of $J_0$, especially to study their uniqueness. Thirdly, we believe that a weaker version of (7) should be sufficient for the stability. One of the more important questions is the stability analysis of these equilibria against perturbations without any symmetry. This will lead to a deeper understanding of the dynamical behavior of perturbations near these equilibria. Finally, our stability theorem suffers from the usual drawback of weak solutions. Despite many contributions ([G1, G3] and [G4]), the uniqueness of weak solutions for the general Vlasov–Maxwell system in a bounded domain is open. It certainly will be of great interest to make progress in this direction. Very recently, Rein et al. [R1] have used the same idea as in this paper to study the stability problems in the Vlasov–Poisson system. This article is organized as follows. The major part of the paper, Sect. 2 to Sect. 8, is devoted to the axially symmetric case. In Sect. 2, we formulate the initial-boundary value problem for the Vlasov–Maxwell system with axial symmetry. The key point is to derive a separate equation and boundary condition for the axial magnetic potential $A_\theta$, which is invariant among all axially symmetric gauge transformations. In Sect. 3 we derive the variational formulation to study steady state solutions. Section 4 is devoted to the estimate for the second variation of $J_0$ near a critical point. The stability of minimizers is studied in Sect. 5. Section 6 is the construction of weak solutions with axial symmetry. Here the idea of [G1] is modified to construct a sequence of approximate solutions which preserve a


new invariant. This is a technical section since there is no standard theory for solving the boundary value problem for linear Vlasov equations in the cylindrical coordinate system. We have to transfer the problem to the Cartesian coordinates. We prove the Main Theorem A in Sect. 7. We obtain a simpler criterion of stability for the purely magnetic equilibria in Sect. 8. We state parallel results for a plasma which is invariant under translations in Sect. 9.

2. Axial Symmetry

We use the cylindrical coordinates $(r,\theta,z)$: $r(x)=\sqrt{x_1^2+x_2^2}$, $z=x_3$ for $(x_1,x_2,x_3)\in\mathbb{R}^3$. We define the local vector basis

$$e_r(x)=\Bigl(\frac{x_1}{r(x)},\frac{x_2}{r(x)},0\Bigr),\qquad e_\theta(x)=\Bigl(-\frac{x_2}{r(x)},\frac{x_1}{r(x)},0\Bigr),\qquad e_z(x)=(0,0,1).$$

Any vector function $K:\mathbb{R}^3\to\mathbb{R}^3$ has a decomposition $K(x)=K_r(x)e_r(x)+K_\theta(x)e_\theta(x)+K_z(x)e_z(x)$ with $K_r(x)=\langle K(x),e_r(x)\rangle$, $K_\theta(x)=\langle K(x),e_\theta(x)\rangle$, $K_z(x)=\langle K(x),e_z(x)\rangle$. We define $K$ to be axially symmetric if $K_r$, $K_\theta$ and $K_z$ are invariant with respect to all rotations about $Z$, that is, $K_r=K_r(r,z)$, $K_\theta=K_\theta(r,z)$, $K_z=K_z(r,z)$ do not depend upon $\theta$. For any momentum vector $v=(v_1,v_2,v_3)$ with $v_r=v\cdot e_r(x)$, $v_\theta=v\cdot e_\theta(x)$, $v_z=v\cdot e_z(x)$, we decompose the momentum vector $v$ as $v=v_re_r(x)+v_\theta e_\theta(x)+v_ze_z(x)$. We have

$$\langle v_\pm\rangle=\sqrt{m_\pm^2+|v|^2}=\sqrt{m_\pm^2+v_r^2+v_\theta^2+v_z^2},\qquad\hat v_\pm=v/\langle v_\pm\rangle.$$

Let $p=(t,r,\theta,z,v_r,v_\theta,v_z)$. For $0\le\theta<2\pi$ and $r>0$, we define the following standard smooth, one-to-one mapping between the cylindrical coordinates and Cartesian coordinates. Let $q=(t,x_1,x_2,x_3,v_1,v_2,v_3)$,

$$\begin{aligned}q=T(p)&=(t,\ r\cos\theta,\ r\sin\theta,\ z,\ v_r\cos\theta-v_\theta\sin\theta,\ v_r\sin\theta+v_\theta\cos\theta,\ v_z),\\ p=T^{-1}(q)&=\Bigl(t,\ \sqrt{x_1^2+x_2^2},\ \tan^{-1}(x_2/x_1),\ z,\ \frac{x_1v_1+x_2v_2}{\sqrt{x_1^2+x_2^2}},\ \frac{-x_2v_1+x_1v_2}{\sqrt{x_1^2+x_2^2}},\ v_z\Bigr)\\ &=(t,r,\theta,z,v_r,v_\theta,v_z).\end{aligned}\tag{12}$$

We also define a one-to-one correspondence between functions in Cartesian and in cylindrical coordinates: $F(q)\equiv F(T(p))=f(p)$.
Moreover, let $\Gamma_\pm(s;p)$ be the trajectory

$$\begin{aligned}\frac{dr}{ds}&=\hat v_r^\pm,&\frac{dv_r}{ds}&=\pm\Bigl\{E_r+\hat v_\theta^\pm B_z-\hat v_z^\pm B_\theta+\frac{\hat v_\theta^\pm v_\theta}{r}\Bigr\},\\ \frac{d\theta}{ds}&=\frac{\hat v_\theta^\pm}{r},&\frac{dv_\theta}{ds}&=\pm\Bigl\{E_\theta+\hat v_z^\pm B_r-\hat v_r^\pm B_z-\frac{\hat v_\theta^\pm v_r}{r}\Bigr\},\\ \frac{dz}{ds}&=\hat v_z^\pm,&\frac{dv_z}{ds}&=\pm\{E_z+\hat v_r^\pm B_\theta-\hat v_\theta^\pm B_r\}\end{aligned}\tag{13}$$
(13)


in the cylindrical coordinates with $\Gamma_\pm(t;p)=p$. Then by an elementary computation, $T\circ\Gamma_\pm(s;p)$ is exactly the trajectory for the Vlasov equation in the Cartesian coordinates:

$$\frac{dx}{ds}=\hat v_\pm,\qquad\frac{dv}{ds}=\pm(E+\hat v_\pm\times B).\tag{14}$$
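The change of variables (12) is easy to exercise numerically. The Python sketch below implements $T$ and $T^{-1}$ as plain functions on 7-tuples; the names `T` and `T_inv` mirror the text, while the use of `math.atan2` reduced modulo $2\pi$ is an assumption made here to select the branch $0\le\theta<2\pi$ of $\tan^{-1}(x_2/x_1)$.

```python
import math

def T(p):
    """Map p = (t, r, theta, z, v_r, v_theta, v_z) to Cartesian q, as in (12)."""
    t, r, theta, z, vr, vth, vz = p
    c, s = math.cos(theta), math.sin(theta)
    return (t, r * c, r * s, z, vr * c - vth * s, vr * s + vth * c, vz)

def T_inv(q):
    """Map q = (t, x1, x2, x3, v1, v2, v3) back to cylindrical p (needs r > 0)."""
    t, x1, x2, x3, v1, v2, v3 = q
    r = math.hypot(x1, x2)
    theta = math.atan2(x2, x1) % (2 * math.pi)  # branch 0 <= theta < 2*pi
    vr = (x1 * v1 + x2 * v2) / r
    vth = (-x2 * v1 + x1 * v2) / r
    return (t, r, theta, x3, vr, vth, v3)

# Round trip on a sample point with theta in [0, 2*pi).
p = (0.0, 2.0, 4.0, -1.0, 0.3, -0.7, 1.1)
print(T_inv(T(p)))
```

The round trip returns the starting point up to rounding, and the map leaves $v_1^2+v_2^2=v_r^2+v_\theta^2$ unchanged, so $\langle v_\pm\rangle$ takes the same value in both coordinate systems.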

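Hence $F_\pm(q)$ solves the Vlasov equation (1) if and only if the corresponding $f_\pm(p)=F_\pm(T(p))$ solves the Vlasov equation in cylindrical coordinates:

$$\begin{aligned}&\partial_tf_\pm+\hat v_r^\pm\partial_rf_\pm+\frac{\hat v_\theta^\pm}{r}\partial_\theta f_\pm+\hat v_z^\pm\partial_zf_\pm\pm\Bigl(E_r+\hat v_\theta^\pm B_z-\hat v_z^\pm B_\theta+\frac{\hat v_\theta^\pm v_\theta}{r}\Bigr)\partial_{v_r}f_\pm\\ &\quad\pm\Bigl(E_\theta+\hat v_z^\pm B_r-\hat v_r^\pm B_z-\frac{\hat v_\theta^\pm v_r}{r}\Bigr)\partial_{v_\theta}f_\pm\pm(E_z+\hat v_r^\pm B_\theta-\hat v_\theta^\pm B_r)\partial_{v_z}f_\pm=0.\end{aligned}$$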

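Here $B_r$, $B_z$, $B_\theta$ and $E_r$, $E_\theta$, $E_z$ are the cylindrical components of $(E,B)$. In the cylindrical coordinates, the Maxwell system takes the form

$$\begin{aligned}&\partial_tE_r-\partial_\theta B_z+\partial_zB_\theta=-j_r=-\int[\hat v_r^+f_+-\hat v_r^-f_-],\\ &\partial_tE_\theta+\partial_rB_z-\partial_zB_r=-j_\theta=-\int[\hat v_\theta^+f_+-\hat v_\theta^-f_-],\\ &\partial_tE_z-\frac1r[\partial_r(rB_\theta)-\partial_\theta B_r]=-j_z=-\int[\hat v_z^+f_+-\hat v_z^-f_-],\\ &\partial_tB_\theta-(\partial_rE_z-\partial_zE_r)=0,\qquad\partial_tB_r-\partial_zE_\theta=0,\\ &\partial_tB_z+\frac1r[\partial_r(rE_\theta)-\partial_\theta E_r]=0\end{aligned}$$

with constraints

$$\begin{aligned}&\frac1rE_r+\partial_rE_r+\frac1r\partial_\theta E_\theta+\partial_zE_z=\int[f_+-f_-],\\ &\frac1rB_r+\partial_rB_r+\frac1r\partial_\theta B_\theta+\partial_zB_z=0.\end{aligned}$$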

Let $\Omega\subset\{(r,z)\;|\;r\ge r_0>0,\ z\in\mathbb{R}\}$ with $\partial\Omega\in C^{2,\delta}$ be a bounded domain. $\Omega$ is independent of $\theta$. Since $\Omega\cap Z=\emptyset$, $\Omega$ is regular in the sense in [BF]. We define the measure $d\Omega=r\,dr\,dz$. Since $r\ge r_0$ on $\Omega$, $d\Omega$ is essentially the standard measure $dr\,dz$. The most important example is a torus, which resembles the geometry of a tokamak in the plasma confinement. Let $\partial\Omega=\{z=g(r)=g(\sqrt{x_1^2+x_2^2})\}$, and its outward normal be

$$n=\Bigl(-\frac{g'}{\sqrt{1+g'^2}}\frac{x_1}{r},\ -\frac{g'}{\sqrt{1+g'^2}}\frac{x_2}{r},\ \frac{1}{\sqrt{1+g'^2}}\Bigr).$$

Equivalently, $n_r=-\frac{g'}{\sqrt{1+g'^2}}$, $n_\theta=0$ and $n_z=\frac{1}{\sqrt{1+g'^2}}$.

We look for solutions $(f_\pm,E,B)$ which are axially symmetric. For the electromagnetic fields $E$ and $B$, their corresponding vector potential $A$ and scalar potential $\phi$ satisfy

$$E=-\nabla\phi-\partial_tA,\qquad B=\operatorname{curl}A.\tag{15}$$

This is equivalent to


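$$\begin{aligned}&E_r=-\partial_r\phi-\partial_tA_r,\qquad E_\theta=-\frac1r\partial_\theta\phi-\partial_tA_\theta,\qquad E_z=-\partial_z\phi-\partial_tA_z,\\ &B_r=\frac1r\partial_\theta A_z-\partial_zA_\theta,\qquad B_\theta=\partial_zA_r-\partial_rA_z,\qquad B_z=\frac1r(\partial_r(rA_\theta))-\frac1r\partial_\theta A_r.\end{aligned}$$

It follows that if $A$ and $\phi$ are axially symmetric, so are $E$ and $B$. Clearly, those gauge transformations which map axially symmetric functions to themselves have the form of $A\to A+\nabla\chi(t,r,z)$ and $\phi\to\phi+\partial_t\chi(t,r,z)$, with a scalar, axially symmetric function $\chi$. Moreover, these gauge transformations map $A_\theta$ to itself:

$$A_\theta\to A_\theta-\frac{x_2}{r}\partial_{x_1}\chi+\frac{x_1}{r}\partial_{x_2}\chi\equiv A_\theta.$$

Hence $A_\theta$ plays an important role due to its invariance under these axially symmetric gauge transformations. Because of the special role of $A_\theta$, it is convenient to formulate the problem in terms of $f(t,r,z,v_r,v_\theta,v_z)$, $E_r(t,r,z)$, $E_z(t,r,z)$, $B_\theta(t,r,z)$ as well as $A_\theta(t,r,z)$. The Vlasov–Maxwell system now takes the form:

$$\begin{aligned}L(f_\pm,A_\theta,E_r,E_z,B_\theta)=\ &\partial_tf_\pm+\hat v_r^\pm\partial_rf_\pm+\hat v_z^\pm\partial_zf_\pm\\ &\pm\Bigl(E_r+\frac{\hat v_\theta^\pm}{r}\partial_r(rA_\theta)-\hat v_z^\pm B_\theta+\frac{\hat v_\theta^\pm v_\theta}{r}\Bigr)\partial_{v_r}f_\pm\\ &\pm\Bigl(-\partial_tA_\theta-\hat v_z^\pm\partial_zA_\theta-\frac{\hat v_r^\pm}{r}\partial_r(rA_\theta)-\frac{\hat v_\theta^\pm v_r}{r}\Bigr)\partial_{v_\theta}f_\pm\\ &\pm(E_z+\hat v_r^\pm B_\theta+\hat v_\theta^\pm\partial_zA_\theta)\partial_{v_z}f_\pm=0.\end{aligned}\tag{16}$$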

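We prescribe the specular reflection boundary condition for $f_\pm$ on $\partial\Omega$:

$$f_\pm(t,r,z,v_r,v_\theta,v_z)=f_\pm(t,r,z,\bar v_r,v_\theta,\bar v_z)\equiv Kf_\pm(t,r,z,v_r,v_\theta,v_z)\tag{17}$$

with $\bar v=v-2(n\cdot v)n$, $\bar v_\theta=v_\theta$ since $n_\theta=0$, and $(r,\theta,z)\in\partial\Omega$. We then separate the Maxwell system for $E_r$, $E_z$ and $B_\theta$ as

$$\begin{aligned}&\partial_tE_r+\partial_zB_\theta=-j_r=-\int[\hat v_r^+f_+-\hat v_r^-f_-],\\ &\partial_tE_z-\frac1r\partial_r(rB_\theta)=-j_z=-\int[\hat v_z^+f_+-\hat v_z^-f_-],\\ &\partial_tB_\theta-(\partial_rE_z-\partial_zE_r)=0.\end{aligned}\tag{18}$$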

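The perfect conductor boundary condition $E \times n = 0$ reduces to:
\[ -E_z n_r + E_r n_z = 0 \tag{19} \]
and $E_\theta = 0$. The constraint $\operatorname{div} B = 0$ is automatically satisfied, since $B_r = -\partial_z A_\theta$, $B_z = \frac{1}{r}\partial_r(rA_\theta)$. On the other hand, the constraint $\operatorname{div} E = \rho$ reduces to
\[ \frac{1}{r}E_r + \partial_r E_r + \partial_z E_z = \rho. \tag{20} \]
The equations for $B_r$ and $B_z$ become trivial, and the equation for $E_\theta$ reduces to an equation for $A_\theta$:
\[ -\partial_{tt} A_\theta + \partial_r\Big\{\frac{1}{r}\partial_r(rA_\theta)\Big\} + \partial_{zz} A_\theta = -j_\theta. \tag{21} \]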
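The remark that $\operatorname{div} B = 0$ holds automatically for $B_r = -\partial_z A_\theta$, $B_z = \frac{1}{r}\partial_r(rA_\theta)$ can be checked numerically. The following is a minimal sketch, not part of the paper: the sample potential `A_theta` and the finite-difference helpers are hypothetical choices made only for illustration, and the axially symmetric divergence is taken to be $\frac{1}{r}\partial_r(rB_r) + \partial_z B_z$.

```python
import math

def A_theta(r, z):
    # Hypothetical smooth sample potential, chosen only for illustration.
    return math.sin(1.3 * r) * math.cos(0.7 * z) + 0.2 * r * z

def d_dr(f, r, z, h=1e-4):
    # Central difference in r.
    return (f(r + h, z) - f(r - h, z)) / (2 * h)

def d_dz(f, r, z, h=1e-4):
    # Central difference in z.
    return (f(r, z + h) - f(r, z - h)) / (2 * h)

def B_r(r, z):
    # B_r = -d_z A_theta
    return -d_dz(A_theta, r, z)

def B_z(r, z):
    # B_z = (1/r) d_r (r A_theta)
    return d_dr(lambda s, w: s * A_theta(s, w), r, z) / r

def div_B(r, z):
    # Axially symmetric divergence: (1/r) d_r (r B_r) + d_z B_z.
    return d_dr(lambda s, w: s * B_r(s, w), r, z) / r + d_dz(B_z, r, z)

if __name__ == "__main__":
    for (r, z) in [(1.5, 0.3), (2.2, -0.8)]:
        print(r, z, B_r(r, z), B_z(r, z), div_B(r, z))
```

The divergence vanishes up to rounding error because the mixed derivatives $\partial_r\partial_z$ and $\partial_z\partial_r$ of $rA_\theta$ cancel, which is the content of the remark.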


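Notice that from (15), $E_\theta = -\partial_t A_\theta$, $B_r = -\partial_z A_\theta$, $B_z = \frac{1}{r}\partial_r(rA_\theta)$. Now the boundary conditions $E_\theta = 0$ and $B \cdot n = 0$ reduce to
\[ \partial_t A_\theta = 0, \qquad g'\partial_z A_\theta + \frac{1}{r}(\partial_r(rA_\theta)) = B_r n_r + B_z n_z = 0. \]
The second condition is $A_\theta + \partial_r A_\theta(r, g(r)) = 0$ at the boundary. We therefore impose a compatible boundary condition
\[ A_\theta \equiv 0. \tag{22} \]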

We then separate the Maxwell system as (18) and (21) with separate boundary conditions (19) and (22).

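3. Liapunov Functional and its Critical Points

Let $\Pi = \Omega \times \mathbb{R}^3$. We now use the invariants $\eta_\pm(r(v_\theta \pm A_\theta))$ of (16) to construct steady states to (1). For notational simplicity, we use
\[ \eta_\pm(\xi) \equiv \eta_\pm(r(v_\theta \pm \xi)). \tag{23} \]
Recall $d\Omega = r\,dr\,dz$; we define
\[ U = \{u = (f_\pm, A_\theta) : A_\theta \in H^1(\Omega);\ 0 \le f_\pm \in L^1(\Pi)\}. \tag{24} \]
Let $(E_r, E_z, B_\theta) \in L^2(\Omega)$; we define $\mathbf{u} = (u, E_r, E_z, B_\theta)$. The formally conserved energy functional is
\[ \mathcal{E}(\mathbf{u}) = \sum_\pm \int_\Pi \langle v_\pm\rangle f_\pm\, d\Omega\, dv + \frac{1}{2}\int_\Omega (|E|^2 + |B|^2)\, d\Omega. \tag{25} \]
And we define the full dynamical Liapunov functional
\[ J(\mathbf{u}) = \mathcal{E}(\mathbf{u}) + \sum_\pm \int_\Pi [H_\pm(f_\pm) + \eta_\pm(A_\theta) f_\pm]\, dv\, d\Omega, \tag{26} \]
where $\operatorname{div} E = \rho$, and $H_\pm$ are defined in (5). We also recall the steady Liapunov functional $J_0(f_\pm, A_\theta)$ as in (6).

Lemma 1. Assume (2) and $J_0(u) < \infty$. Then for any large constants $M > 0$ and $K > 0$, there is $C(M, K) > 0$ such that $H_\pm(f_\pm) > 0$ for $f_\pm \ge M$ and
\[
\frac{1}{2}\int_{f_\pm \ge M} H_\pm(f_\pm) + \frac{1}{2}\int_{|v| \ge K} \langle v_\pm\rangle f_\pm + \frac{1}{2}\int_\Omega \Big\{|\nabla\phi|^2 + |\partial_z A_\theta|^2 + \Big|\frac{1}{r}\partial_r(rA_\theta)\Big|^2\Big\}\, d\Omega \le J_0(u) + C(M, K).
\tag{27}
\]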


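Proof. From (6), it suffices to prove:
\[ \frac{1}{2}\int_{f_\pm \ge M} H_\pm(f_\pm) + \frac{1}{2}\int_{|v| \ge K} \langle v_\pm\rangle f_\pm \le \int_\Pi H_\pm(f_\pm) + [\langle v_\pm\rangle + \eta_\pm(A_\theta)] f_\pm + C(M, K). \]
Without loss of generality, we only consider the '+' case. We separate three cases to find a lower bound for $\int H_+(f_+) + [\langle v_+\rangle + \eta_+(A_\theta)] f_+$.

On the set $f_+ \ge M$, where $M$ is a fixed large number: By (2), $|\mu_+^{-1}(\zeta)| \le C|\zeta|^{1/\gamma}$ for $\zeta$ near zero with $\gamma > 4$. Hence $H_+(\zeta) \in C[0, \infty)$ from its definition (5). Moreover, from (2), $H_+(\zeta)$ is superlinear at $\zeta = +\infty$:
\[ \lim_{\zeta\to+\infty} \frac{H_+(\zeta)}{\zeta} = \lim_{\zeta\to\infty} H_+'(\zeta) = -\lim_{\zeta\to+\infty} [\mu_+^{-1}](\zeta) = +\infty. \]
Since $\eta_+(\cdot)$ is bounded, we have
\[ H_+(f_+) + \eta_+(A_\theta) f_+ \ge \frac{1}{2} H_+(f_+) > 0 \]
for $f_+ \ge M$. Hence
\[ \frac{1}{2}\int_{f_+ \ge M} H_+(f_+) + \int_{f_+ \ge M} \langle v_+\rangle f_+ \le \int_{f_+ \ge M} \{H_+(f_+) + [\langle v_+\rangle + \eta_+(A_\theta)] f_+\}. \tag{28} \]

On the set $f_+ \le M$ and $|v| \ge K$: Notice that since $\eta_+$ is bounded,
\[ [\eta_+(A_\theta) + \langle v_+\rangle] f_+ \ge \frac{3}{4}\langle v_+\rangle f_+ \tag{29} \]
when $|v| \ge K$. We also have
\[ H_\pm''(\zeta) = -\{\mu_\pm^{-1}(\zeta)\}' = -\frac{1}{\mu_\pm'(\mu_\pm^{-1}(\zeta))} > 0. \tag{30} \]
Since $H_+$ is convex,
\[
\begin{aligned}
H_+(f_+) &\ge H_+(\mu_+(\langle v_+\rangle/4)) + H_+'(\mu_+(\langle v_+\rangle/4))[f_+ - \mu_+(\langle v_+\rangle/4)] \\
&= H_+(\mu_+(\langle v_+\rangle/4)) - \frac{\langle v_+\rangle}{4}(f_+ - \mu_+(\langle v_+\rangle/4)) \\
&\ge H_+(\mu_+(\langle v_+\rangle/4)) - \langle v_+\rangle f_+/4.
\end{aligned}
\tag{31}
\]
Since $H_+'(\mu_+(\zeta)) \equiv -\zeta$, we have $\frac{d}{d\zeta} H_+(\mu_+(\zeta)) \equiv -\zeta\mu_+'(\zeta)$ and thus
\[ H_+(\mu_+(\zeta)) = -\int_\infty^\zeta s\mu_+'(s)\, ds = -\zeta\mu_+(\zeta) + \int_\infty^\zeta \mu_+(s)\, ds. \]
It then follows that $H_+(\mu_+(\langle v_+\rangle/4)) \in L^1$ by the decay assumption in (2). Therefore, combining with (31) and (29), we have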


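\[ \frac{1}{2}\int_{f_+ \le M,\,|v| \ge K} \langle v_+\rangle f_+ \le \int_{f_+ \le M,\,|v| \ge K} H_+(f_+) + [\langle v_+\rangle + \eta_+(A_\theta)] f_+ - \int_{\mathbb{R}^3} H_+(\mu_+(\langle v_+\rangle/4)). \tag{32} \]

On the set $f_+ \le M$ and $|v| \le K$: Since now both $f_+$ and $v$ are bounded, and $|H_+(f_+)| \le C(M)$ by its continuity, we have
\[ \Big|\int_{f_+ \le M,\,|v| \le K} \{H_+(f_+) + [\eta_+(A_\theta) + \langle v_+\rangle] f_+\}\Big| \le C_2(M, K). \tag{33} \]

Combining all three cases, we deduce from (33), (32) and (28),
\[
\begin{aligned}
\frac{1}{2}\int_{f_+ \ge M} H_+(f_+) + \frac{1}{2}\int_{|v| \ge K} \langle v_+\rangle f_+
&\le \int_{\{f_+ \le M,\,|v| \ge K\} \cup \{f_+ \ge M\}} \{H_+(f_+) + [\eta_+(A_\theta) + \langle v_+\rangle] f_+\} - \int_{\mathbb{R}^3} H_+(\mu_+(\langle v_+\rangle/4)) \\
&= \int_\Pi \{H_+(f_+) + [\eta_+(A_\theta) + \langle v_+\rangle] f_+\} - \int_{f_+ \le M,\,|v| \le K} \{H_+(f_+) + [\eta_+(A_\theta) + \langle v_+\rangle] f_+\} - \int_{\mathbb{R}^3} H_+(\mu_+(\langle v_+\rangle/4)) \\
&\le \int_\Pi \{H_+(f_+) + [\eta_+(A_\theta) + \langle v_+\rangle] f_+\} - C(M, K).
\end{aligned}
\]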

Lemma 2. Assume (2). Then any critical point $u_0$ of $J_0$ with $J_0(u_0) < \infty$ takes the form $(\mu_\pm(h_\pm(\alpha, \beta, v)), \alpha)$, where $\alpha$ and $\beta$ satisfy (4). In particular, $\mathbf{u} = [\mu_\pm(h_\pm), \alpha, -\partial_r\beta, \partial_z\beta, 0]$ is a steady state solution of (16), (18) and (21) with boundary conditions (17), (19) and (22). Moreover
\[ \|\alpha\|_{C^{2,\delta}(\Omega)} + \|\beta\|_{C^{2,\delta}(\Omega)} \le C(\mu_\pm, \eta_\pm). \]

Proof. For any solution $(\alpha, \beta)$ to (4), since $\beta \equiv 0$ on $\partial\Omega$, (19) is satisfied. So are the other boundary conditions. It is straightforward to verify that $\mathbf{u} = [\mu_\pm(h_\pm), \alpha, -\partial_r\beta, \partial_z\beta, 0]$ is a steady state of (16), (18) and (21). Let $u_0 = (g_\pm, \alpha)$ be a critical point of $J_0$. The corresponding electric potential $\beta \in H_0^1(\Omega)$ satisfies $-\partial_{zz}\beta - \partial_{rr}\beta - \frac{1}{r}\partial_r\beta = \int [g_+ - g_-]$. We first claim that $g_\pm > 0$ almost everywhere.

Proof of the claim. If not, without loss of generality, we may assume $g_+ \equiv 0$ on a set $K$ with $0 < m(K) < \infty$. Let $\chi_K$ be the characteristic function of $K$. Since $\|\rho(u_0 + t\chi_K) - \rho(u_0)\|_2 \le m(K)^{1/2} t$ we have
\[ \|\nabla\phi(u_0 + t\chi_K)\|_2^2 - \|\nabla\phi(u_0)\|_2^2 = O(t) \]
for $t > 0$. Therefore from (6),
\[ J_0(u_0 + t\chi_K) - J_0(u_0) = \int_K \{H_+(t) + [\langle v_\pm\rangle + \eta_\pm(r(v_\theta \pm \alpha))] t\} + O(t). \]
Notice that from (5), $H'(\zeta) = -\mu^{-1}(\zeta)$, and $H'(0) = -\infty$. Therefore, $\frac{d}{dt} J(u_0 + t\chi_K)|_{t=0} = -\infty$ does not exist. This contradicts that $u_0$ is a critical point. The claim is proved.

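Now that $g_\pm > 0$, $g_\pm + t g_1 \ge 0$ for any $g_1 \in C_c^\infty$ for $t$ small. Since $\Omega \cap Z = \emptyset$, the standard variation of $J_0$ in (6) with respect to $u$ yields
\[
\begin{aligned}
& H_\pm'(g_\pm) + \langle v_\pm\rangle \pm \beta + \eta_\pm(r(v_\theta \pm \alpha)) = 0, \\
& -\partial_{zz}\alpha - \partial_{rr}\alpha - \frac{1}{r}\partial_r\alpha + \frac{1}{r^2}\alpha + \int_{\mathbb{R}^3} r[\eta_+'(\alpha) g_+ - \eta_-'(\alpha) g_-]\, dv = 0.
\end{aligned}
\tag{34}
\]
By the assumption on $H_\pm$ in (5), we invert the first equation in (34) to get $g_\pm = \mu_\pm(h_\pm(\alpha, \beta, v))$. Since from Lemma 1, $g_\pm \in L^1$, it follows that $\alpha \in W^{1,p}$ for any $p < 2$ from the standard elliptic theory in 2D. In order to verify (4), it suffices to prove
\[ \int_{\mathbb{R}^3} r[\eta_+'(\alpha)\mu_+(h_+) - \eta_-'(\alpha)\mu_-(h_-)]\, dv = -j_\theta(\alpha, \beta). \tag{35} \]
Inserting $g_\pm = \mu_\pm(h_\pm)$ into (4), the left hand side in (35) is
\[ -j_\theta(\alpha, \beta) + \Big\{\int_{\mathbb{R}^3} [\hat v_\theta^+ + r\eta_+'(r(v_\theta + \alpha))]\mu_+(h_+(\alpha, \beta, v))\, dv - \int_{\mathbb{R}^3} [\hat v_\theta^- - r\eta_-'(r(v_\theta - \alpha))]\mu_-(h_-(\alpha, \beta, v))\, dv\Big\}. \tag{36} \]
Notice that from (3), $\partial_{v_\theta} h_\pm(\alpha, \beta, v) = \hat v_\theta^\pm \pm r\eta_\pm'(r(v_\theta \pm \alpha))$. Since $\int_\infty^\zeta \mu_\pm(y)\, dy = M_\pm(\zeta)$, we rewrite the second term in (36) as
\[ \int_{\mathbb{R}^3} \partial_{v_\theta}[M_+(h_+(\alpha, \beta, v)) - M_-(h_-(\alpha, \beta, v))]\, dv = 0, \]
for almost every $(r, z)$ in $\Omega$, by the decay condition in (2). Hence, the left hand side of (35) is $-j_\theta(\alpha, \beta)$ and (4) is satisfied.

We now prove the regularity of $\alpha$ and $\beta$. In light of the standard Schauder estimate, it suffices to prove that both $\int [\mu_+ - \mu_-]$ and $\int [\hat v_\theta^+\mu_+ - \hat v_\theta^-\mu_-]$ are $C^{0,\delta}$ functions of $r$ and $z$. To this end, we first show $\beta \in L^\infty$ via the maximum principle. Since $\lim_{\zeta\to\infty}\mu_\pm(\zeta) = 0$ from (2), there is $N_0 > 0$ such that if $\beta(r, z) > N_0$,
\[ \rho(x) = \int_{\mathbb{R}^3} [\mu_+(\langle v_+\rangle + \beta + \eta_+(\alpha)) - \mu_-(\langle v_-\rangle - \beta + \eta_-(\alpha))]\, dv < 0 \tag{37} \]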

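for any $\alpha \in \mathbb{R}$. We now claim that $\sup\beta \le N_0$. If not, consider the non-empty set $S = \{(r, z) \mid \beta(r, z) > N_0\}$. Since from Lemma 1, $\|\nabla\beta\|_2^2 < \infty$, $\rho = -\Delta\phi \in H^{-1}(\Omega)$. We multiply both sides of (4) by $(\beta - N_0)_+ \in H_0^1$ to get
\[ \int_S |\nabla\beta|^2 - \int_S \rho\beta = 0. \]
Since $\rho\beta \le 0$ on $S$, this implies $m(S) = 0$. This is a contradiction. A similar argument proves that $\beta$ is bounded from below. Now that $\beta$ is bounded, from the decay assumption in (2), it follows that
\[ |\rho| + |j_\theta| \le C\Big|\int \mu_\pm(h_\pm)\, dv\Big| \le C\int \langle v\rangle^{-\gamma}\, dv < \infty. \]
Hence $\alpha \in C^{1,\delta}$ and $\beta \in C^{1,\delta}$. Moreover,
\[ |\rho(r, z) - \rho(r', z')| + |j_\theta(r, z) - j_\theta(r', z')| \le C(|\eta'|_\infty)\Delta\Big|\int \mu_\pm'(h_\pm(\bar\alpha, \bar\beta))\, dv\Big| \le C\Delta\int \langle v\rangle^{-\gamma-1}\, dv \le C\Delta, \]
where $\Delta = |\alpha(r, z) - \alpha(r', z')| + |\beta(r, z) - \beta(r', z')|$, and $\bar\alpha$ is between $\alpha(r, z)$ and $\alpha(r', z')$, and $\bar\beta$ is between $\beta(r, z)$ and $\beta(r', z')$. This implies that both $j_\theta(\alpha, \beta)$ and $\rho(\alpha, \beta)$ are $C^{0,\delta}$ by assumption (2). The lemma thus follows from Schauder's theory.

4. Second Variation of J

For any critical point $u_0 = (\mu_\pm(h_\pm), \alpha)$ of $J_0$, we define
\[ Q(u, u_0) = \sum_\pm \{H_\pm(f_\pm) - H_\pm(\mu_\pm(h_\pm)) + h_\pm(\alpha, \beta, v)(f_\pm - \mu_\pm(h_\pm))\}, \tag{38} \]
where $h_\pm$ are defined in (3). Since $H_\pm'(\mu_\pm(h_\pm)) = -h_\pm$ and $H \in C^2(0, \infty)$, from the Taylor expansion:
\[ Q(u, u_0) = \sum_\pm \frac{1}{2} H_\pm''(f_\pm^*)(f_\pm - \mu_\pm)^2 \ge 0, \tag{39} \]
where $f_\pm^*$ is positive and between $f_\pm$ and $\mu_\pm$. For $\lambda > 0$, we further define a measurement between $\mathbf{u} = (u, E_r, E_z, B_\theta)$ and a critical point $u_0$ as
\[ d(\mathbf{u}, u_0) \equiv d(\mathbf{u}, u_0, \lambda) = \int_\Pi Q(u, u_0) + \frac{1}{2}\int_\Omega \Big\{|E + \nabla\beta|^2 + \lambda|\partial_z(A_\theta - \alpha)|^2 + \lambda\Big|\frac{1}{r}\partial_r(r[A_\theta - \alpha])\Big|^2 + |B_\theta|^2\Big\}\, d\Omega \tag{40} \]
with
\[ \operatorname{div} E = \int_{\mathbb{R}^3} [f_+ - f_-]\, dv, \qquad -\Delta\beta = \int_{\mathbb{R}^3} [\mu_+ - \mu_-]\, dv. \]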

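Lemma 3. Assume (2). Let $u_0$ be a critical point of $J_0$ with $J_0(u_0) < \infty$. If $J(u) < \infty$, then
\[ J(u) - J(u_0) = d(u, u_0, 1) + \Big\{\sum_\pm \int_\Pi [\eta_\pm(A_\theta) - \eta_\pm(\alpha)] f_\pm\, dv\, d\Omega + \int_\Omega j_\theta(\alpha, \beta)(A_\theta - \alpha)\, d\Omega\Big\}, \tag{41} \]
and $d\Omega = r\,dr\,dz$. The second term in (41) can be further written as:
\[ -\frac{1}{2}\int_\Omega \partial_\alpha j_\theta(\alpha, \beta)|A_\theta - \alpha|^2\, d\Omega + \int_\Pi N(f_\pm - \mu_\pm, A_\theta - \alpha)\, d\Omega\, dv. \tag{42} \]
Here $N(f_\pm - \mu_\pm, A_\theta - \alpha)$ is (summation over $\pm$)
\[
\begin{aligned}
& \mp r\eta_\pm'(r(v_\theta \pm \bar A_\theta))(A_\theta - \alpha)(f_\pm - \mu_\pm) - \frac{r^2}{2}\{\eta_\pm'(r(v_\theta \pm \alpha))\}^2\mu_\pm'(A_\theta - \alpha)^2 \\
& + \frac{r^2}{2}\{\eta_\pm''(r(v_\theta \pm \bar A_\theta))\bar f_\pm - \eta_\pm''(r(v_\theta \pm \alpha))\mu_\pm\}(A_\theta - \alpha)^2,
\end{aligned}
\tag{43}
\]
for some $\bar A_\theta$ between $A_\theta$ and $\alpha$, and $\bar f_\pm$ between $\mu_\pm$ and $f_\pm$.

Proof. Recalling the definitions of $J$ and $Q$ in (26) and (38), we decompose $B$ into its axial parts $B_r$, $B_\theta$ and $B_z$. We first rearrange $J(u) - J(u_0)$ as
\[
\begin{aligned}
& \int_\Pi Q(u, u_0) + [\eta_\pm(r(v_\theta \pm A_\theta)) - \eta_\pm(r(v_\theta \pm \alpha))] f_\pm \\
& - \int_\Pi \beta[f_+ - f_- - (\mu_+ - \mu_-)] + \frac{1}{2}\int |E|^2 - |\nabla\beta|^2 \\
& + \frac{1}{2}\int \Big\{B_\theta^2 + (\partial_z A_\theta)^2 + \Big[\frac{1}{r}\partial_r(rA_\theta)\Big]^2 - (\partial_z\alpha)^2 - \Big[\frac{1}{r}\partial_r(r\alpha)\Big]^2\Big\}.
\end{aligned}
\tag{44}
\]
Notice that $E \in L^2$ from Lemma 1, and
\[ \frac{1}{2}|E|^2 - \frac{1}{2}|\nabla\beta|^2 = -\nabla\beta(E + \nabla\beta) + \frac{1}{2}|E + \nabla\beta|^2. \]
Since $\operatorname{div} E = \rho$ and $\beta \in H_0^1(\Omega)$, we can integrate (by standard approximations) the middle term to get
\[ -\int \nabla\beta(E + \nabla\beta) = \int \beta \operatorname{div}(E + \nabla\beta) = \int_\Pi \beta[f_+ - f_- - (\mu_+ - \mu_-)], \]
where $-\Delta\phi = \int [f_+ - f_-]\, dv$. This term thus cancels with the first term in the second line of (44). Notice that
\[
\begin{aligned}
& \frac{1}{2}\Big\{|\partial_z A_\theta|^2 - |\partial_z\alpha|^2 + \Big|\frac{1}{r}\partial_r(rA_\theta)\Big|^2 - \Big|\frac{1}{r}\partial_r(r\alpha)\Big|^2\Big\} \\
&\quad = \frac{1}{2}\Big\{|\partial_z[A_\theta - \alpha]|^2 + \Big|\frac{1}{r}\partial_r[r(A_\theta - \alpha)]\Big|^2\Big\} + \Big\{\partial_z\alpha\,\partial_z(A_\theta - \alpha) + \frac{1}{r^2}(\partial_r(r\alpha))\partial_r(r(A_\theta - \alpha))\Big\}.
\end{aligned}
\tag{45}
\]
Integrating by parts, from (4), we deduce that the $\Omega$-integral of the last term above is:
\[ \int \Big[-\partial_{zz}\alpha - \partial_{rr}\alpha - \frac{1}{r}\partial_r\alpha + \frac{1}{r^2}\alpha\Big](A_\theta - \alpha)\, d\Omega = \int j_\theta(\alpha, \beta)(A_\theta - \alpha)\, d\Omega. \]
We thus conclude (41).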
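Both the electric identity used after (44) and the splitting (45) are instances of the same elementary algebra, $\frac{1}{2}(a^2 - b^2) = \frac{1}{2}(a - b)^2 + b(a - b)$. A small numerical sanity check is sketched here; the vectors and scalars are arbitrary illustrative values, not data from the paper.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def electric_lhs(E, grad_beta):
    # (1/2)|E|^2 - (1/2)|grad beta|^2
    return 0.5 * dot(E, E) - 0.5 * dot(grad_beta, grad_beta)

def electric_rhs(E, grad_beta):
    # -grad beta . (E + grad beta) + (1/2)|E + grad beta|^2
    s = [e + g for e, g in zip(E, grad_beta)]
    return -dot(grad_beta, s) + 0.5 * dot(s, s)

def split_lhs(a, b):
    # (1/2)(a^2 - b^2), with a standing for a derivative of A_theta, b for alpha
    return 0.5 * (a * a - b * b)

def split_rhs(a, b):
    # (1/2)(a - b)^2 + b (a - b): quadratic part plus the term linear in A_theta - alpha
    return 0.5 * (a - b) ** 2 + b * (a - b)

if __name__ == "__main__":
    E = (0.3, -1.2, 2.0)
    gb = (1.1, 0.4, -0.7)
    print(electric_lhs(E, gb), electric_rhs(E, gb))
    print(split_lhs(0.9, -0.35), split_rhs(0.9, -0.35))
```

The linear remainder $b(a - b)$ is exactly the part that is integrated by parts against (4) to produce the $j_\theta(\alpha, \beta)(A_\theta - \alpha)$ term.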


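To prove (42), from (35), the last part in (41) can be written as:
\[ \sum_\pm \int_\Pi [\eta_\pm(A_\theta) - \eta_\pm(\alpha)] f_\pm - r[\eta_+'(\alpha)\mu_+ - \eta_-'(\alpha)\mu_-](A_\theta - \alpha). \]
By the Taylor expansion with respect to $(A_\theta, f_\pm)$, it takes the form:
\[
\begin{aligned}
& \int_\Pi \frac{1}{2} r^2\eta_\pm''(\bar A_\theta)\bar f_\pm(A_\theta - \alpha)^2 \pm r\eta_\pm'(\bar A_\theta)(A_\theta - \alpha)(f_\pm - \mu_\pm) \\
&\quad = \int_\Pi \frac{r^2}{2}\{\eta_\pm''(\bar A_\theta)\bar f_\pm + [\eta_\pm'(\alpha)]^2\mu_\pm'\}(A_\theta - \alpha)^2 \pm \int_\Pi r\eta_\pm'(\bar A_\theta)(A_\theta - \alpha)(f_\pm - \mu_\pm) - \frac{r^2}{2}[\eta_\pm'(\alpha)]^2\mu_\pm'(A_\theta - \alpha)^2,
\end{aligned}
\tag{46}
\]
where $\eta_\pm^{(k)}(A_\theta) = \eta_\pm^{(k)}(r(v_\theta \pm A_\theta))$, $k = 0, 1, 2$, and the linear terms vanish at $f_\pm = \mu_\pm(h_\pm)$, $A_\theta = \alpha$. The last two terms in (46) are in the desired form in (43). We rewrite the first term as
\[ \frac{1}{2}\int_\Pi r^2\eta_\pm''(\bar A_\theta)\bar f_\pm(A_\theta - \alpha)^2 = \frac{1}{2}\int_\Pi r^2\eta_\pm''(\alpha)\mu_\pm(A_\theta - \alpha)^2 + \Big\{\frac{1}{2}\int_\Pi r^2[\eta_\pm''(\bar A_\theta)\bar f_\pm - \eta_\pm''(\alpha)\mu_\pm](A_\theta - \alpha)^2\Big\}. \]
The last term above is of the desired form in $N$. Combining the first term above, $r^2\eta_\pm''\mu_\pm$, with $r^2[\eta_\pm']^2\mu_\pm'$ in (46), and taking one $\alpha$ derivative of (35), we have
\[ \sum_\pm \int_{\mathbb{R}^3} r^2\eta_\pm''(\alpha)\mu_\pm + [\eta_+'(\alpha)]^2\mu_\pm'\, dv \equiv -\partial_\alpha j_\theta(\alpha, \beta). \]
We therefore obtain our lemma.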

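In the case that the operator $-\partial_{zz} - \partial_{rr} - \frac{1}{r}\partial_r + \frac{1}{r^2} - \partial_\alpha j_\theta(\alpha, \beta) > 0$, we have the following

Lemma 4. Assume (2) and
\[ \|\eta_\pm\|_{C^3(\mathbb{R})} + \sup_{\zeta\in[0,\infty)} |\mu''(\zeta)/\mu'(\zeta)| < \infty. \tag{47} \]
If for some $\lambda > 0$,
\[ \int \Big\{|\partial_z(A_\theta - \alpha)|^2 + \Big|\frac{1}{r}\partial_r(r[A_\theta - \alpha])\Big|^2 - \partial_\alpha j_\theta(\alpha, \beta)|A_\theta - \alpha|^2\Big\} \ge \lambda\|A_\theta - \alpha\|_{H^1}^2, \]
then there exist $0 < \sigma < 1$ and a constant $C \equiv C(\|f_+\|_\infty, \|\eta\|_{C^3(\mathbb{R})}, \|\beta\|_{C^0(\mathbb{R})}, \|\langle v\rangle f_\pm\|_{L^1})$ such that if $\operatorname{div} E = \rho$ and $d(u, u_0, \lambda) \le 1$,
\[ J(u) - J(u_0) \ge (1 - \sigma) d(u, u_0, \lambda) - C d^{3/2}(u, u_0, \lambda). \tag{48} \]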


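Proof. Notice that by the positivity assumption on the operator $-\partial_{zz} - \partial_{rr} - \frac{1}{r}\partial_r + \frac{1}{r^2} - \partial_\alpha j_\theta(\alpha, \beta)$, it suffices to prove that
\[ \int_\Pi N \ge -\sigma d - C d^{3/2}. \]
Without loss of generality, we may only take the '+' components in $N$. The case for '−' is the same. We first treat the first two terms in (43). Split the first mixed term in $N$,
\[ \eta_+'(\bar A_\theta) = \eta_+'(\alpha) + \{\eta_+'(\bar A_\theta) - \eta_+'(\alpha)\}. \tag{49} \]
In light of (43), we combine the first term above with the second term in $N$ to get
\[
\begin{aligned}
& -r\eta_+'(\alpha)(A_\theta - \alpha)(f_+ - \mu_+) - \frac{1}{2} r^2[\eta_+'(\alpha)]^2\mu_+'(A_\theta - \alpha)^2 \\
&\quad = \Big\{-r\eta_+'(\alpha)(A_\theta - \alpha)(f_+ - \mu_+) + \frac{1}{2H''(f_+^*)}(1 + \nu) r^2[\eta_+'(\alpha)]^2(A_\theta - \alpha)^2\Big\} \\
&\qquad - \Big\{\frac{\nu}{2H''(f_+^*)} + \frac{1}{2}\Big(\frac{1}{H''(f_+^*)} + \mu_+'\Big)\Big\} r^2[\eta_+'(\alpha)]^2(A_\theta - \alpha)^2 \\
&\quad = L_1 + L_2
\end{aligned}
\]
for any $\nu > 0$ and $f_+^*$ as in (39). Notice that $H_+'' > 0$ from (30). We make a perfect square for $L_1$,
\[
\begin{aligned}
L_1 &= -\frac{H''(f_+^*)(f_+ - \mu_+)^2}{2(1 + \nu)} + \frac{1}{2}\Big\{\sqrt{\frac{H''(f_+^*)}{1 + \nu}}(f_+ - \mu_+) - \sqrt{\frac{1 + \nu}{H''(f_+^*)}}\,\eta_+'(\alpha)(A_\theta - \alpha) r\Big\}^2 \\
&\ge -\frac{1}{1 + \nu} Q(u, u_0).
\end{aligned}
\tag{50}
\]
Notice that from (30), $\frac{1}{H''(0)} = 0$. Since
\[ \frac{d}{d\zeta}\mu'(\mu^{-1}(\zeta)) = \mu''(\mu^{-1}(\zeta))/\mu'(\mu^{-1}(\zeta)), \]
by (47) and (30), $\frac{1}{H''(\zeta)} \in C^1[0, \infty)$. Therefore, $\frac{1}{H''(f_+^*)} \le C f_+^*$, and the first term in $\int L_2$ can be estimated as
\[
\begin{aligned}
-\frac{\nu}{2}\int_\Pi \frac{r^2[\eta'(\alpha)]^2(A_\theta - \alpha)^2}{H''(f_+^*)} &\ge -C\nu\int_\Pi f_+^*[\eta_+'(\alpha)]^2(A_\theta - \alpha)^2 \\
&\ge -C\nu\Big\|\int_{\mathbb{R}^3} f_+^*\, dv\Big\|_{L^{4/3}(\Omega)}\|A_\theta - \alpha\|_{L^8(\Omega)}^2,
\end{aligned}
\]
where we have used the Sobolev Imbedding Theorem for $A_\theta - \alpha$ in two dimensions (since $\Omega \cap Z = \emptyset$). By a well-known estimate in kinetic theory,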


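\[
\begin{aligned}
\Big\|\int_{\mathbb{R}^3} f_+^*\, dv\Big\|_{L^{4/3}(\Omega)} &\le C(\|f_+^*\|_{L^\infty}, \|\langle v\rangle f_+^*\|_{L^1}) \\
&\le C(\|f_+\|_{L^\infty}, \|\langle v\rangle f_+\|_{L^1}) + C(\|\mu_+\|_{L^\infty}, \|\langle v\rangle\mu_+\|_{L^1}) \\
&\le C(\|f_+\|_{L^\infty}, \|\langle v\rangle f_+\|_{L^1})
\end{aligned}
\]
by (2). We thus obtain that for $\nu$ small (depending on $\|f_+\|_{L^\infty}$, $\|\langle v\rangle f_+\|_{L^1}$), the first term in $\int L_2$ is bounded from below by
\[ -\frac{\lambda}{12}\|A_\theta - \alpha\|_{H^1}^2. \tag{51} \]
To estimate the second term in $L_2$, since $\frac{1}{H''} \in C^1[0, \infty)$ from (30),
\[ \Big|\frac{1}{H''(f_+^*)} + \mu_+'(h_+)\Big| = \Big|\frac{1}{H''(f_+^*)} - \frac{1}{H''(\mu_+)}\Big| \le C|f_+^* - \mu_+| \le C|f_+ - \mu_+|, \]
where $C$ depends on $\|f_+\|_\infty$ and $\|\mu_+\|_\infty$. Therefore, the second term in $\int L_2$ can be estimated similarly as
\[
\begin{aligned}
\Big|C\int_\Pi r^2[\eta_+']^2|f_+ - \mu_+|(A_\theta - \alpha)^2\Big| &\le \nu_1\int_\Pi H''(f_+^*)(f_+ - \mu_+)^2 + C(\nu_1)\int_\Pi \frac{1}{H''(f_+^*)}[\eta_+']^4(A_\theta - \alpha)^4 \\
&\le \nu_1\int_\Pi Q(u, u_0) + C(\|f_+\|_{L^\infty}, \|\langle v\rangle f_+\|_{L^1}, \nu_1)\|A_\theta - \alpha\|_{16}^4 \\
&\le \nu_1\int_\Pi Q(u, u_0) + C(\|f_+\|_{L^\infty}, \|\langle v\rangle f_+\|_{L^1}, \nu_1)\|A_\theta - \alpha\|_{H^1}^4,
\end{aligned}
\tag{52}
\]
by the Sobolev Imbedding Theorem in 2D. With the same method, we estimate the second term in (49) as
\[
\begin{aligned}
\Big|\int_\Pi r[\eta_+'(\bar A_\theta) - \eta_+'(\alpha)](f_+ - \mu_+)(A_\theta - \alpha)\Big| &\le r^2\sup|\eta_+''|\int_\Pi |f_+ - \mu_+|(A_\theta - \alpha)^2 \\
&\le \nu_1\int_\Pi Q(u, u_0) + C(\nu_1, \|f_+\|_{L^\infty}, \|\langle v\rangle f_+\|_{L^1})\|A_\theta - \alpha\|_{H^1}^4.
\end{aligned}
\tag{53}
\]
This concludes the estimate for the first two terms in $N$. Now we treat the last term in (43). We first split
\[
\begin{aligned}
r^2\eta_\pm''(\bar A_\theta)\bar f_+ - r^2\eta_\pm''(\alpha)\mu_+ &= r^2[\eta_\pm''(\bar A_\theta) - \eta_\pm''(\alpha)]\mu_+ + r^2\eta_\pm''(\bar A_\theta)(\bar f_+ - \mu_+) \\
&\le r^3\sup|\eta'''||A_\theta - \alpha|\mu_+ + r^2\sup|\eta''||f_+ - \mu_+| \\
&\le C|A_\theta - \alpha|\mu_+ + C|f_+ - \mu_+|
\end{aligned}
\]
since $\Omega$ is bounded. Now we can bound the second term above by the same estimate as in (52) to get
\[ C\int |f_+ - \mu_+||A_\theta - \alpha|^2 \le \nu_1\int Q(u, u_0) + C(\nu_1)\|A_\theta - \alpha\|_{H^1}^4. \]
On the other hand, for the first term, we have
\[ C\int\!\!\int |A_\theta - \alpha|^3\mu_+ \le C\Big\{\sup_x \int \mu_+(h_+)\, dv\Big\}\int |A_\theta - \alpha|^3\, dx \le C\|A_\theta - \alpha\|_{H^1}^3. \]
Letting $\nu$ be small, then choosing $\nu_1$ small, we have obtained
\[ \int N_+ \ge -\sigma\Big\{\int Q(u, u_0) + \frac{\lambda}{2}\|A_\theta - \alpha\|_{H^1}^2\Big\} - C\|A_\theta - \alpha\|_{H^1}^3 \]
for $0 < \sigma < 1$, where $C$ depends on $\|f_+\|_{L^\infty}$, $\|\langle v\rangle f_+\|_{L^1}$.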
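The algebra behind the splitting $L_1 + L_2$ and the perfect square (50) in the proof of Lemma 4 can be verified pointwise. The sketch below is only an illustration under stated assumptions: `H2` stands for $H''(f_+^*) > 0$, `eta1` for $\eta_+'(\alpha)$, `mu1` for $\mu_+'$, `dA` for $A_\theta - \alpha$ and `df` for $f_+ - \mu_+$; all numerical values are arbitrary.

```python
import math

def original_terms(r, eta1, mu1, dA, df):
    # -r eta' dA df - (1/2) r^2 eta'^2 mu' dA^2  (first two terms of N, '+' case)
    return -r * eta1 * dA * df - 0.5 * r**2 * eta1**2 * mu1 * dA**2

def L1(r, eta1, dA, df, H2, nu):
    # -r eta' dA df + (1 + nu) r^2 eta'^2 dA^2 / (2 H'')
    return -r * eta1 * dA * df + (1 + nu) * r**2 * eta1**2 * dA**2 / (2 * H2)

def L2(r, eta1, mu1, dA, H2, nu):
    # -{ nu/(2H'') + (1/2)(1/H'' + mu') } r^2 eta'^2 dA^2
    return -(nu / (2 * H2) + 0.5 * (1 / H2 + mu1)) * r**2 * eta1**2 * dA**2

def L1_as_square(r, eta1, dA, df, H2, nu):
    # Right-hand side of (50) before the final inequality.
    sq = math.sqrt(H2 / (1 + nu)) * df - math.sqrt((1 + nu) / H2) * eta1 * dA * r
    return -H2 * df**2 / (2 * (1 + nu)) + 0.5 * sq**2

if __name__ == "__main__":
    r, eta1, mu1, dA, df, H2, nu = 1.7, 0.6, -0.8, -0.4, 0.9, 2.5, 0.3
    print(original_terms(r, eta1, mu1, dA, df),
          L1(r, eta1, dA, df, H2, nu) + L2(r, eta1, mu1, dA, H2, nu))
    print(L1(r, eta1, dA, df, H2, nu), L1_as_square(r, eta1, dA, df, H2, nu))
```

Since the squared bracket is nonnegative, $L_1 \ge -\frac{H''(f_+^*)(f_+ - \mu_+)^2}{2(1+\nu)}$, which is the '+' part of $-\frac{1}{1+\nu}Q(u, u_0)$ by (39).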

5. Stability of Minimizers of J0

In this section, we construct minimizers of $J_0$ and prove that the set of minimizers is dynamically stable. We consider the problem
\[ \inf_{u\in U} J(u) \]
among all admissible $u$, where $U$ is defined in (24).

Theorem 1. Assume (2). Let $u^n \in U$ be any minimizing sequence for $J_0$. Then there exists $u_0 \in U$ such that
\[ J_0(u_0) = \min_U J_0(u), \]
and $u_0$ is a critical point of $J_0$ and satisfies (4). Moreover, up to a subsequence,
\[ \lim_{n\to\infty} d(u^n, u_0, 1) = 0. \tag{54} \]

Proof. Notice that for any $u \in U$,
\[ J_0(u) \ge \int_\Pi \{H_\pm(f_\pm) + [\langle v_\pm\rangle + \eta_\pm(A_\theta)] f_\pm\}\, d\Omega. \tag{55} \]
Therefore, $J_0$ is bounded from below in light of Lemma 1. Moreover, let $u^n$ be any minimizing sequence such that $\lim_{n\to\infty} J_0(u^n) = \inf_{u\in U} J_0(u)$. Now by Lemma 1, we deduce that
\[ \int_{f_\pm \ge M} H_\pm(f_\pm^n) + \int_{|v| \ge K} \langle v_\pm\rangle f_\pm^n + \int \Big\{|\nabla\phi^n|^2 + |\partial_z A_\theta^n|^2 + \Big|\frac{1}{r}\partial_r(rA_\theta^n)\Big|^2\Big\} \le 2\inf J_0(u) + 2C(M, K) \]
uniformly in $n$. Here $-\Delta\phi^n = \int [f_+^n - f_-^n]\, dv$. This implies that up to a subsequence, there exists $u_0 = (g_\pm, \alpha)$ such that $A_\theta^n \to \alpha$ weakly in $H_0^1(\Omega)$, and $\langle v_\pm\rangle f_\pm^n \to \langle v_\pm\rangle g_\pm$ weakly in $L^1(\Omega \times \mathbb{R}^3)$, and $\nabla\phi^n \to \nabla\beta$ weakly in $L^2(\Omega)$. (The superlinearity of $H_\pm$ excludes the possibility of $g_\pm$ being a measure.) We claim that $J_0(u_0) = \inf_{u\in U} J_0(u)$.


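Proof of the claim. First of all, we have $\lim_{n\to\infty}\int_\Pi \langle v_\pm\rangle f_\pm^n \ge \int_\Pi \langle v_\pm\rangle g_\pm$. By convexity of $H_\pm$ and Mazur's theorem, we have
\[ \lim_{n\to\infty}\int_\Pi H_\pm(f_\pm^n) \ge \int_\Pi H_\pm(g_\pm). \]
Moreover, by lower-semicontinuity of the norms, we have
\[ \lim_{n\to\infty}\int |\nabla\phi^n|^2 + \int |\partial_z A_\theta^n|^2 + \int \Big|\frac{1}{r}\partial_r(rA_\theta^n)\Big|^2 \ge \int |\nabla\beta|^2 + \int |\partial_z\alpha|^2 + \int \Big|\frac{1}{r}\partial_r(r\alpha)\Big|^2, \]
with $-\Delta\beta = \int [g_+ - g_-]\, dv$ since $\Omega$ is bounded. On the other hand, we now show for the mixed term:
\[ \lim_{n\to\infty}\int_\Pi [\eta_\pm(A_\theta^n) f_\pm^n - \eta_\pm(\alpha) g_\pm] = 0. \tag{56} \]
For a large number $k > 0$, we split (56) as
\[ \int_{|v| \ge k} + \int_{|v| \le k} = L_1 + L_2. \]
From Lemma 1, for $k \ge K$, $L_1$ is bounded by
\[ \frac{2\|\eta_\pm\|_\infty}{k}\int_{|v| \ge K} \langle v_\pm\rangle[f_\pm^n + g_\pm]\, d\Omega\, dv \le \frac{C}{k}. \]
We further split $L_2$ as
\[ \int_{|v| \le k} \eta_\pm(\alpha)(f_\pm^n - g_\pm) + \int_{|v| \le k} [\eta_\pm(A_\theta^n) - \eta_\pm(\alpha)] f_\pm^n. \]
For a fixed $k$, the first term clearly goes to zero since $\eta_\pm \in L^\infty$. For the second term, we may only consider the + case. For a given $k_1$, we separate it as
\[ \int_{|v| \le k,\, f_+^n \le k_1} + \int_{|v| \le k,\, f_+^n \ge k_1} \le C(k, k_1)\int |\eta_+(A_\theta^n) - \eta_+(\alpha)| + 2|\eta|_\infty\int_{f_+^n \ge k_1} f_+^n. \]
For fixed $k_1$ and $k$, the first term goes to zero as $n \to \infty$ since $A_\theta^n$ is bounded in $H^1$ and $\eta' \in C^1$. Since $H_+$ is superlinear, the second term is bounded by
\[ \int_{f_+^n \ge k_1} \Big|\frac{f_+^n}{H_+(f_+^n)}\Big| H_+(f_+^n) \le o(1)\int_{f_+^n \ge M} H_+(f_+^n) = o(1) \]
as $k_1 \to \infty$. Therefore we deduce that the mixed term goes to zero by first choosing sufficiently large $k$ and $k_1$, and then letting $n \to \infty$. The claim is proved.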


Now we show that $u_0$ is a critical point of $J_0$. We notice that since $\lim_{\zeta\to 0} H_\pm'(\zeta) = -\infty$, from the same argument as in the proof of Lemma 2, we know that $g_\pm > 0$ almost everywhere. Hence at any point $(r, z, v) \in \Pi$, $H'$ is continuous near $g_\pm(r, z, v)$. Thus we can take the standard variation near any point in $\Pi$. It follows from Lemma 2 that $u_0$ is a steady state solution and $g_\pm = \mu_\pm(h_\pm)$.

We now prove (54). We notice that from (41) in Lemma 3 and the previous arguments in (56), it suffices to show that
\[ \lim_{n\to\infty}\int j_\theta(\alpha, \beta)[A_\theta^n - \alpha] = 0. \]
By Lemma 2, $\alpha \in L^\infty$ and $\beta \in L^\infty$. Hence from the decay condition of $\mu_\pm$ in (2), $\sup_{r,z}\int_{\mathbb{R}^3}\mu_\pm(h_\pm)\, dv < \infty$, and $j_\theta(\alpha, \beta) \in L^\infty(\Omega)$. Hence we deduce (54).

We first define the set of all minimizers as: $U_0 = \{u_0 : J_0(u_0) = \inf_{u\in U} J_0(u)\}$. We also define the measurement of $u$ to the set $U_0$ as
\[ d(u, U_0) \equiv \inf_{\{u_0 \in U_0\}} d(u, u_0). \tag{57} \]

Theorem 2. Assume (2). Assume all solutions $u(t)$ of (16), (18) and (21) with boundary conditions (17), (19) and (22) satisfy
\[ \sup_{0 \le t < \infty} J(u(t)) \le J(u(0)). \tag{58} \]
Then for all $\epsilon > 0$, there exists $\delta > 0$ such that if $d(u(0), U_0) < \delta$,
\[ \sup_{0 \le t < \infty} d(u(t), U_0) < \epsilon. \]

Proof. We prove the theorem by contradiction. If not, there exist $t_n$, $u_0^n = (\mu_\pm^n, \alpha^n) \in U_0$ and initial data $u^n(0)$ such that
\[ d(u^n(0), u_0^n) = \frac{1}{n}, \quad\text{but}\quad d(u^n(t_n), U_0) \ge \epsilon_0 > 0. \tag{59} \]
We first claim that
\[ \lim_{n\to\infty} J(u^n(0)) = J_0(u_0^n). \]

Proof of the claim. Since $d(u^n(0), u_0^n) = \frac{1}{n}$, by Lemma 1, we have
\[ \int_{f_\pm(0) \le M} H_\pm(f_\pm^n(0)) + \int_{|v| \ge K} \langle v_\pm\rangle f_\pm^n(0) \le 2\Big[\int_\Pi H_\pm(\mu_\pm^n) + \langle v_\pm\rangle\mu_\pm^n\Big] + C_2. \]
Since $d(u^n(0), u_0^n) = 1/n$, $\|A_\theta^n(0) - \alpha^n\|_{H^1} \to 0$ as $n \to \infty$. Therefore
\[ \lim_{n\to\infty}\int j_\theta(\alpha^n, \beta^n)(A_\theta^n - \alpha^n) = 0, \]
since $j_\theta(\alpha^n, \beta^n)$ is uniformly bounded by Lemma 2. Moreover, by the same argument as in (56), we deduce
\[ \lim_{n\to\infty}\int [\eta_\pm(A_\theta^n(0)) - \eta_\pm(\alpha^n)] f_\pm^n(0) = 0. \]


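Thus, in light of (41) in Lemma 3, we deduce the claim.

By choosing $\phi^n(t_n) \in H_0^1$ such that $-\Delta\phi^n(t_n) = \rho^n(t_n) = \operatorname{div} E^n(t_n) \in H^{-1}$, we have
\[ \int \nabla\phi^n \cdot (E - \nabla\phi^n) = 0. \]
Now $u^n = (f_\pm^n(t_n), A_\theta^n(t_n)) \in U$ and we then decompose $J$ as
\[ J_0(u^n(t_n)) + \frac{1}{2}\|B_\theta(t_n)\|_2^2 + \frac{1}{2}\|\nabla\phi^n(t_n) + E^n(t_n)\|_2^2 = J(u^n(t_n)). \tag{60} \]
By the assumption (58) and the claim,
\[ \limsup_{n\to\infty} J(u^n(t_n)) \le \lim_{n\to\infty} J(u^n(0)) = J_0(u_0^n) = \inf_{u\in U} J_0(u). \]
Since $\|B_\theta(t_n)\|_2^2 + \|\nabla\phi^n(t_n) + E^n(t_n)\|_2^2 \ge 0$ in (60), the corresponding $u^n(t_n)$ is a minimizing sequence of $J_0(u)$ and thus, from Theorem 1, there exists $u_0 \in U_0$ such that
\[ \lim_{n\to\infty} d(u^n(t_n), u_0) = 0. \]
Moreover, as $n \to \infty$ in (60), $\|B_\theta(t_n)\|_2^2 + \|\nabla\phi^n(t_n) + E^n(t_n)\|_2^2 \to 0$. We thus conclude $d(u^n(t_n), u_0) < \epsilon_0/2$ for $n$ large. This is a contradiction.

6. Weak Solutions

In this section, we construct global finite energy weak solutions $u(t)$ to the Vlasov–Maxwell system which satisfy (58). This implies the stability of the minimizers of $J_0$ thanks to Theorem 2. We follow the idea in [G1] to construct a weak solution for given axially symmetric initial data $u_0$ with $\partial_t A_\theta|_{t=0} = \dot A_{\theta 0}$. We assume that:
\[
\begin{aligned}
& f_{\pm 0} \in L^1 \cap L^\infty(\Pi), \quad H_\pm(f_{\pm 0}) \in L^1(\Pi), \quad A_{\theta 0} \in H_0^1(\Omega), \quad \dot A_{\theta 0} \in L^2(\Omega), \\
& |||u_0||| \equiv J(u_0) + \|f_{\pm 0}\|_{L^\infty} < \infty, \quad \frac{1}{r}E_{r0} + \partial_r E_{r0} + \partial_z E_z = \rho(0).
\end{aligned}
\tag{61}
\]
Notice that from Lemma 1, (61) implies that $f_{\pm 0} \in L^p$ for any $1 \le p \le \infty$, and $E$ and $B$ are initially in $L^2$. Let $p = (t, r, z, v_r, v_\theta, v_z)$; we define
\[
\begin{aligned}
\gamma^\pm &= \{p \in (0, \infty) \times \partial\Omega \times \mathbb{R}^3 \mid \pm(n_r v_r + n_z v_z) > 0\}, \\
\gamma^0 &= \{(t, r, z, v_r, v_\theta, v_z) \in (0, \infty) \times \partial\Omega \times \mathbb{R}^3 \mid n_r v_r + n_z v_z = 0\},
\end{aligned}
\]
where $(n_r, n_z)$ is the outward normal at $\partial\Omega$. We first define

Definition 1 (Test function space).
\[
\begin{aligned}
V &= \{g(p) \in C_c^\infty([0, \infty) \times \mathbb{R}^2 \times \mathbb{R}^3) \mid \operatorname{supp} g \subset\subset \{[0, \infty) \times \bar\Omega \times \mathbb{R}^3\} \setminus \{(0 \times \partial\Omega) \times \mathbb{R}^3 \cup \gamma^0\}\}, \\
M_1 &= \{G(p) \in C_c^\infty([0, \infty) \times \Omega)\}, \\
M_2 &= \{(\psi_1, \psi_2, \phi) \mid (\psi_1, \psi_2) \in C_c^\infty([0, \infty) \times \Omega \times \mathbb{R}^3),\ \phi \in C_c^\infty([0, \infty) \times \Omega \times \mathbb{R}^3)\}.
\end{aligned}
\]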


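We define the test functionals as follows.

Definition 2 (Test functionals). Assume (61). Let $f_\pm \in L^1_{loc}([0, \infty) \times \Pi)$, and let $f_\pm^\gamma \in L^\infty(\gamma^+)$. Let $E_r$, $E_z$, $B_\theta$ and $\partial_t A_\theta \in L^1_{loc}([0, \infty) \times \Omega)$, and $A_\theta \in W^{1,1}_{loc}([0, \infty) \times \Omega)$. Let $g_\pm \in V$, $G(p) \in M_1$ and $(\psi_1, \psi_2, \phi) \in M_2$. We define
\[
\begin{aligned}
A_\pm(f_\pm, f_\pm^\gamma, A_\theta, E_r, E_z, B_\theta, g_\pm; T, V) ={}& -\int_{\Omega\times V} f_{\pm 0}\, g_\pm(0)\, d\Omega\, dv - \int_0^T\!\!\int_{\Omega\times V} L(g_\pm) f_\pm\, dt\, d\Omega\, dv \\
& + \int_{\gamma^+\cap[0,T]} g_\pm f_\pm^\gamma\, d\gamma_\pm + \int_{\gamma^-\cap[0,T]} g_\pm K(f_\pm^\gamma)\, d\gamma_\pm, \\
B(f_\pm, A_\theta, G; T, V) ={}& \int_\Omega [\dot A_{\theta 0} G(0) - A_{\theta 0}\partial_t G(0)]\, d\Omega \\
& + \int_0^T\!\!\int_\Omega \Big[-\partial_{tt} G + \partial_r\Big\{\frac{1}{r}\partial_r(rG)\Big\} + \partial_{zz} G\Big] A_\theta\, d\Omega\, dt + \int_0^T\!\!\int_\Omega j_\theta G, \\
C(f_\pm, E_r, E_z, B_\theta, \psi_1, \psi_2; T, V) ={}& -\int_\Omega (E_{r0}\psi_1(0) + E_{z0}\psi_2(0))\, d\Omega \\
& - \int_0^T\!\!\int \{(E_r\partial_t\psi_1 + E_z\partial_t\psi_2) + (\partial_z\psi_1 - \partial_r\psi_2) B_\theta + (\psi_1 j_r + \psi_2 j_z)\}\, dt\, d\Omega\, dv, \\
D(E_r, E_z, B_\theta, \phi; T, V) ={}& -\int_\Omega \phi(0) B_{\theta 0}\, d\Omega - \int_0^T\!\!\int_\Omega \Big[B_\theta\partial_t\phi + \frac{1}{r}\partial_r(r\phi) E_z + \partial_z\phi\, E_r\Big]\, dt\, d\Omega,
\end{aligned}
\]
where $K$ is defined in (17), $d\gamma_\pm = (\hat v_r^\pm n_r + \hat v_z^\pm n_z)\, r\, d\sigma\, dv\, dt$ with the standard surface measure $d\sigma$ on $\partial\Omega$, and $j_r$, $j_\theta$ and $j_z$ are defined as in (18) and (21) with $v$-integration over the set $V$.

Definition 3 (Weak solutions). Assume (61). $u(t) = [f_\pm, A_\theta, E_r, E_z, B_\theta]$ is a weak solution to (16), (18) and (21) with boundary conditions (17), (19) and (22), if $0 \le f_\pm \in L^\infty([0, \infty); L^1 \cap L^\infty(\Pi))$, $f_\pm^\gamma \in L^\infty(\gamma^+)$, $E_r$, $E_z$, $B_\theta$ and $\partial_t A_\theta \in L^\infty([0, \infty); L^2(\Omega))$; $A_\theta(t) \in H_0^1(\Omega)$, and moreover, for all $g_\pm \in V$, $G \in M_1$ and $(\psi_1, \psi_2, \phi) \in M_2$:
\[
\begin{aligned}
& A_\pm(f_\pm, f_\pm^\gamma, A_\theta, E_r, E_z, B_\theta, g_\pm; \infty, \mathbb{R}^3) = 0, \quad B(f_\pm, A_\theta, G; \infty, \mathbb{R}^3) = 0, \\
& C(f_\pm, E_r, E_z, B_\theta, \psi_1, \psi_2; \infty, \mathbb{R}^3) = 0, \quad D(f_\pm, E_r, E_z, B_\theta, \phi; \infty, \mathbb{R}^3) = 0, \\
& \partial_r E_r + \frac{1}{r}E_r + \partial_z E_z = \rho.
\end{aligned}
\]

We now begin to construct a sequence of approximate solutions for (16), (18) and (21). Without loss of generality, we first assume $f_{\pm 0}$ has compact support and (20) is valid at $t = 0$. The main idea is to cut off the momentum space $\mathbb{R}^3$ as in [G1]. However, we make two revisions to [G1] to preserve the formally conserved quantity $\int_\Pi \eta_\pm(r(v_\theta \pm A_\theta)) f_\pm$. We use a cut-off cylinder (with unbounded $v_\theta$) rather than a cut-off ball in the momentum space. Moreover, we impose a conservative reflexive boundary condition for the distributions on the cut-off boundary. For each fixed $N > 0$, we define a cylinder in the momentum space:
\[ V_N = \Big\{(v_r, v_\theta, v_z) \,\Big|\, \sqrt{v_r^2 + v_z^2} \le N,\ v_\theta \in \mathbb{R}\Big\}. \tag{62} \]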


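The boundary of $V_N$ is $\{(N\cos\omega, v_\theta, N\sin\omega),\ 0 \le \omega \le 2\pi\}$. We let
\[ \gamma_N^\pm = \gamma^\pm \cap \{0 \le t \le N\}. \]
Let $\Pi_N = \Omega \times V_N$ and $\operatorname{supp} f_{\pm 0} \subset \Pi_N$.

We now construct an iterative approximate solution $u^k(t) = (f_\pm^k, A_\theta^k, E_r^k, E_z^k, B_\theta^k)$ for $k = 0, 1, 2, 3, \ldots$ starting with $u^0 = u_0$ in (61). For a given $k$ and for any $g \in L^2(\Omega)$ and $h \in H^1(\Omega)$, we define $g^* = g * q_k$ and $h^* = h * q_k$, with $q_k \in C_c^\infty(\Omega)$ such that
\[ \|g^* - g\|_{L^2(\Omega)} \le \frac{1}{k}, \qquad \|h^* - h\|_{H^1(\Omega)} \le \frac{1}{k}. \tag{63} \]
Given $u^k$, we now solve $f_\pm^{k+1}$ as a solution of the linear equation
\[ L_k(f_\pm^{k+1}, A_\theta^{k*}, E_r^{k*}, E_z^{k*}, B_\theta^{k*}) = 0 \tag{64} \]
(see (16)) with the specular reflection condition on $\partial\Omega$ and other artificial boundary conditions on $\partial V_N$ to be specified later.

Lemma 5. Assume (61) and (2), and $\operatorname{supp} f_{\pm 0} \subset \Pi_N$. Given $E_r^{k*}$, $E_z^{k*}$, $B_\theta^{k*}$, $A_\theta^{k*}$ and $\partial_t A_\theta^{k*} \in C^\infty(\Omega)$, there exist $0 \le f_\pm^{k+1} \in L^\infty([0, N] \times \Pi_N)$ and $0 \le f_\pm^{k+1,\gamma} \in L^\infty(\gamma_N^+)$ such that
\[ A_\pm(f_\pm^{k+1}, f_\pm^{k+1,\gamma}, A_\theta^{k*}, E_r^{k*}, E_z^{k*}, B_\theta^{k*}, g_\pm; N, V_N) = 0, \tag{65} \]
for all $g_\pm \in V$ and $\operatorname{supp} g_\pm \subset\subset [0, N) \times \Omega \times V_N$. Moreover, we have
\[
\begin{aligned}
& \sup_{0 \le t \le N} \|f_\pm^{k+1}(t)\|_{p,(\Pi_N)} = \|f_{\pm 0}\|_{p,(\Pi_N)},\ 1 \le p \le \infty, \qquad \|f_\pm^{k+1,\gamma}\|_{\infty,\gamma_N^+} \le \|f_{\pm 0}\|_\infty, \\
& \operatorname{supp}\{f_\pm^{k+1}, f_\pm^{k+1,\gamma}\} \le C(N, \Omega, \|E_r^{k*}\|_\infty, \|E_z^{k*}\|_\infty, \|B_\theta^{k*}\|_\infty, \|A_\theta^{k*}\|_{C^1}, \|\partial_t A_\theta^{k*}\|_\infty), \\
& \int_{\Pi_N} \langle v_\pm\rangle f_\pm^{k+1}(t) = \int_{\Pi_N} \langle v_\pm\rangle f_{\pm 0} + \int_0^t\!\!\int_{\Pi_N} [E^{k*} \cdot j^{k+1}], \\
& \int_{\Pi_N} H_\pm(f_\pm^{k+1})(t) = \int_{\Pi_N} H_\pm(f_{\pm 0}), \qquad \int_{\Pi_N} \eta_\pm(A_\theta^{k*}(t)) f_\pm^{k+1}(t) = \int_{\Pi_N} \eta_\pm(A_{\theta 0}^{k*}) f_{\pm 0}, \\
& \partial_t\rho^{k+1} + \nabla_x \cdot j^{k+1} = 0, \\
& \Big|\int_{\Pi_N} |v_\theta|^2 f_\pm^{k+1}(t)\, dv_\theta\Big| \le C(N, \|f_\pm^{k+1}\|_\infty)\|A^{k*}\|_2 + C(|||u_0|||, \operatorname{supp} f_{\pm 0}), \\
& \Big\|\int_{V_N} f_\pm^{k+1}\, dv\Big\|_{L^2(\Omega)} \le C(N, \|f_\pm^{k+1}\|_\infty)\|A^{k*}\|_2 + C(|||u_0|||, N, \operatorname{supp} f_{\pm 0}).
\end{aligned}
\]
All integrations over $\Omega$ are with respect to the measure $d\Omega = r\,dr\,dz$.

Proof. Step 1. The construction of $f_\pm^{k+1}$ and $f_\pm^{k+1,\gamma}$. For notational simplicity, without loss of generality, we just consider the case for $f_+^{k+1}$ and $f_+^{k+1,\gamma}$. The initial-boundary value problem for the linear Vlasov equation in an axially symmetric region is not standard. The main idea is to use the linear result in Cartesian coordinates. Let $p = (t, r, \theta, z, v_r, v_\theta, v_z)$. We first consider the full Vlasov equation in the 3D cylindrical domain
\[ \Sigma = \{p \mid 0 \le \theta \le \pi,\ p \in [0, N] \times \Pi_N\}. \]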


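In $\Sigma$, we recall the standard smooth, one-to-one mapping $T$ as in (12). Let $D^-$ ($D^+$) be the incoming (outgoing) set (depending on $k$) for the characteristic equation of $f_+^{k+1}$ in $\Sigma$:
\[
\begin{aligned}
\frac{dr}{ds} &= \hat v_r^+, & \frac{dv_r}{ds} &= E_r^{k*} + \frac{\hat v_\theta^+}{r}[\partial_r(rA_\theta^{k*})] - \hat v_z^+ B_\theta^{k*} + \frac{\hat v_\theta^+ v_\theta}{r}, \\
\frac{d\theta}{ds} &= \frac{\hat v_\theta^+}{r}, & \frac{dv_\theta}{ds} &= -\partial_t A_\theta^{k*} - \hat v_z^+\partial_z A_\theta^{k*} - \frac{\hat v_r^+}{r}[\partial_r(rA_\theta^{k*})] - \frac{\hat v_\theta^+ v_r}{r}, \\
\frac{dz}{ds} &= \hat v_z^+, & \frac{dv_z}{ds} &= E_z^{k*} + \hat v_r^+ B_\theta^{k*} + \hat v_\theta^+\partial_z A_\theta^{k*},
\end{aligned}
\tag{66}
\]
with $\Gamma(t; p) = p$, see (13) and (14). Then $T(D^\pm)$ is the corresponding incoming (outgoing) set for the Vlasov equation in the Cartesian domain $T(\Sigma)$. Therefore, we can solve for $F^{k+1}(q) = f_+^{k+1}(p)$,
\[ Y_k(F^{k+1}) \equiv \partial_t F^{k+1} + \hat v^+ \cdot \nabla_x F^{k+1} + (E^{k*} + \hat v^+ \times B^{k*}) \cdot \nabla_v F^{k+1} = 0 \]
in $T(\Sigma)$ by prescribing the following initial boundary condition. Notice that up to a set of surface measure zero, $\partial\Sigma = \partial\Sigma_t \cup \partial\Sigma_\Omega \cup \partial\Sigma_{V_N} \cup \partial\Sigma_\theta$, and we shall prescribe the boundary conditions piece by piece.

On $D^0 = \{p \mid t = 0\} \subset \partial\Sigma_t$, we prescribe the initial data as:
\[ f_+^{k+1}(0, r, \theta, z, v_r, v_\theta, v_z) = f_{+0}(r, z, v_r, v_\theta, v_z). \tag{67} \]
On $\partial\Sigma_\Omega = \{p \in \partial\Sigma \mid (r, z) \in \partial\Omega\}$, we define the purely specular condition as in (17):
\[ f_+(t, r, \theta, z, v_r, v_\theta, v_z) = f_+(t, r, \theta, z, \bar v_r, v_\theta, \bar v_z) \equiv K f_\pm. \tag{68} \]
On the artificial boundary $\partial\Sigma_{V_N} = \{p \in \partial\Sigma \mid (v_r, v_\theta, v_z) \in \partial V_N\}$, $\partial V_N = (N\cos\omega, v_\theta, N\sin\omega)$ with its outward normal $n_v = (\cos\omega, 0, \sin\omega)$. We define $\dot v^+ \cdot n_v$ (as in the trajectory equation (66)):
\[
\Big\{E_r^{k*} + \hat v_\theta^+\Big[\frac{1}{r}\partial_r(rA_\theta^{k*})\Big] + \frac{\hat v_\theta^+ v_\theta}{r}\Big\}\cos\omega + \{E_z^{k*} + \hat v_\theta^+\partial_z A_\theta^{k*}\}\sin\omega \equiv a_+(r, z, v_\theta)\cos\omega + b_+(r, z, v_\theta)\sin\omega.
\tag{69}
\]
We then split $\partial V_N$ as $I^+ = \{v \in \partial V_N : \dot v^+ \cdot n_v > 0\}$, $I^- = \{v \in \partial V_N : \dot v^+ \cdot n_v < 0\}$, and $I^0 = \{v \in \partial V_N : \dot v^+ \cdot n_v = 0\}$. Notice that $I^\pm$ belongs to the outgoing (incoming) set $D^\pm$ of $f_+^{k+1}$ respectively. We define the reflexive boundary condition on $I^-$:
\[ f_+^{k+1}(t, r, \theta, z, N\cos\omega, v_\theta, N\sin\omega)|_{I^-} = f_+^{k+1}(t, r, \theta, z, -N\cos\omega, v_\theta, -N\sin\omega)|_{I^+}, \tag{70} \]
and an absorbing boundary condition on $I^0$:
\[ f_+^{k+1}(t, x, v)|_{I^0 \cap D^-} = 0. \tag{71} \]
(We impose a similar reflexive boundary condition on $\dot v^- \cdot n_v < 0$ and an absorbing condition on the incoming part of $\dot v^- \cdot n_v = 0$ for $f_-^{k+1}$.)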
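The form (69) of $\dot v^+ \cdot n_v$ contains no $B_\theta$ term even though the trajectory equations (66) do. This can be checked numerically: on $\partial V_N$ the two magnetic contributions cancel, and $a_+$, $b_+$ do not depend on $\omega$. The sketch below is illustrative only. It assumes the relativistic velocity $\hat v = v/\sqrt{1 + |v|^2}$ (unit mass and speed of light; the paper defines $\hat v^\pm$ earlier), and the field values passed in are arbitrary numbers rather than solutions of the system.

```python
import math

def hat(v):
    # Assumed relativistic velocity: v / sqrt(1 + |v|^2).
    g = math.sqrt(1.0 + sum(c * c for c in v))
    return tuple(c / g for c in v)

def vdot_dot_n(N, omega, v_theta, r, Er, Ez, Bth, dr_rA, dz_A):
    # Evaluate dv_r/ds and dv_z/ds from (66) on the cylinder boundary
    # v = (N cos w, v_theta, N sin w), then dot with n_v = (cos w, 0, sin w).
    v = (N * math.cos(omega), v_theta, N * math.sin(omega))
    hr, hth, hz = hat(v)
    vdot_r = Er + hth * dr_rA / r - hz * Bth + hth * v_theta / r
    vdot_z = Ez + hr * Bth + hth * dz_A
    return vdot_r * math.cos(omega) + vdot_z * math.sin(omega)

def a_b_form(N, omega, v_theta, r, Er, Ez, dr_rA, dz_A):
    # Right-hand side of (69); note that B_theta does not appear.
    hth = v_theta / math.sqrt(1.0 + N * N + v_theta * v_theta)
    a = Er + hth * dr_rA / r + hth * v_theta / r
    b = Ez + hth * dz_A
    return a * math.cos(omega) + b * math.sin(omega)

if __name__ == "__main__":
    args = dict(N=3.0, v_theta=-1.4, r=1.8, Er=0.5, Ez=-0.2, dr_rA=0.9, dz_A=0.3)
    for w in (0.0, 0.7, 2.1, 4.0):
        print(w, vdot_dot_n(omega=w, Bth=5.0, **args), a_b_form(omega=w, **args))
```

Because $a_+$ and $b_+$ are independent of $\omega$, the quantity $\dot v^+ \cdot n_v$ changes sign under $\omega \to \omega + \pi$, which is why antipodal points of $\partial V_N$ pair an incoming direction with an outgoing one in (70).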


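On $\partial\Sigma_\theta = \{p \in \partial\Sigma \mid \theta = 0,\ \theta = \pi\}$, we define the following $\pi$-periodic boundary condition. For $\{\theta = 0, v_\theta > 0\} \subset D^-$ (since $\frac{d\theta}{ds} = \frac{\hat v_\theta^+}{r}$):
\[ f_+^{k+1}(t, r, 0, z, v_r, v_\theta, v_z) = f_+^{k+1}(t, r, \pi, z, v_r, v_\theta, v_z). \tag{72} \]
For $\{\theta = \pi, v_\theta < 0\} \subset D^+$:
\[ f_+^{k+1}(t, r, \pi, z, v_r, v_\theta, v_z) = f_+^{k+1}(t, r, 0, z, v_r, v_\theta, v_z). \tag{73} \]
We use $K$ to denote the abstract boundary operator representing all boundary conditions (68), (70), (71), (72) and (73), so that the boundary condition can be simply stated as
\[ f^{k+1}_{D^- \setminus D^0} = K f^{k+1}_{D^+ \setminus D^0}. \]
Now we solve (for fixed $k$)
\[ Y_k(F^{k+1}) = 0, \quad f^{k+1}|_{D^- \setminus D^0} = K f^{k+1}|_{D^+ \setminus D^0}, \quad f^{k+1}|_{D^0} = f_{+0} \tag{74} \]
by an iteration in $l = 0, 1, 2, 3, \ldots$:
\[ Y_k(F_l^{k+1}) = 0, \quad f_l^{k+1}|_{D^- \setminus D^0} = K f_{l-1}^{k+1}|_{D^+ \setminus D^0}, \quad f_l^{k+1}|_{D^0} = f_{+0}, \tag{75} \]
starting from $f_0^{k+1}|_{D^0} = f_{+0}$ and $f_0^{k+1}|_{D^- \setminus D^0} \equiv 0$. By the standard linear theory [BP], there exists a sequence of solutions to (75). Following the trajectory (66), we have estimates uniform in $l$,
\[
\begin{aligned}
& \|F_l^{k+1}\|_{L^\infty(T(\Sigma))} \le \|f_{+0}\|_{L^\infty(\Pi)}, \quad \|F_l^{k+1,\gamma}\|_{L^\infty(T(D^+))} \le \|f_{+0}\|_{L^\infty(\Pi_N)}, \\
& \operatorname{supp} F_l^{k+1} \le \operatorname{supp} f_{+0} + N[\|E^{k*}\|_\infty + \|B^{k*}\|_\infty].
\end{aligned}
\tag{76}
\]
Therefore, letting $l \to \infty$, we obtain $F^{k+1}$ and $F^{k+1,\gamma}$ which satisfy (74) with the same estimate (76). By the standard trace theory in [BP], since $Y_k(F^{k+1}) = 0$, we have
\[ \int_{T(D^+)} F^{k+1}\, d\sigma_T^+ = \int_{T(D^-)} K F^{k+1}\, d\sigma_T^-, \tag{77} \]
where $d\sigma_T^+$ and $d\sigma_T^-$ are two surface measures characterized by the equation
\[ \int_{T(\Sigma)} Y_k(g) = \int_{T(D^+)} g\, d\sigma_T^+ - \int_{T(D^-)} g\, d\sigma_T^-, \]
which is valid for all smooth functions $g$. Let $\bar L_k$ be the "full" Vlasov operator
\[ \bar L_k = L_k + \frac{\hat v_\theta^+}{r}\partial_\theta. \]
By the change of variable $p = T^{-1}(q)$,
\[ \int_{T(\Sigma)} Y_k(g) = \int_\Sigma \bar L_k(g \circ T^{-1})\, dt\, d\Omega\, d\theta\, dv_r\, dv_\theta\, dv_z. \]
Although $\bar L_k$ is not in divergence form, $\bar L_k r$ is. This is due to the fact:
\[ -\frac{1}{r}\partial_{v_r}(v_\theta\hat v_\theta^\pm) + \frac{1}{r}\partial_{v_\theta}(v_\theta\hat v_r^\pm) = \frac{1}{r}\hat v_r^\pm. \]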
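The velocity-divergence identity stated above can be spot-checked with central differences. This is a sketch under an explicit assumption: $\hat v = v/\sqrt{1 + |v|^2}$ (unit mass and speed of light), which may differ from the paper's normalization of $\hat v^\pm$; the test point and step size are arbitrary.

```python
import math

def hat(v):
    # Assumed relativistic velocity: v / sqrt(1 + |v|^2).
    g = math.sqrt(1.0 + v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    return (v[0] / g, v[1] / g, v[2] / g)

def identity_lhs(v, r, h=1e-5):
    # -(1/r) d/dv_r (v_theta * hat v_theta) + (1/r) d/dv_theta (v_theta * hat v_r)
    vr, vt, vz = v

    def p(vr_, vt_):
        return vt_ * hat((vr_, vt_, vz))[1]

    def q(vr_, vt_):
        return vt_ * hat((vr_, vt_, vz))[0]

    d_p = (p(vr + h, vt) - p(vr - h, vt)) / (2 * h)
    d_q = (q(vr, vt + h) - q(vr, vt - h)) / (2 * h)
    return -d_p / r + d_q / r

def identity_rhs(v, r):
    # (1/r) * hat v_r
    return hat(v)[0] / r

if __name__ == "__main__":
    v, r = (0.8, -1.3, 0.5), 2.0
    print(identity_lhs(v, r), identity_rhs(v, r))
```

The extra $\frac{1}{r}\hat v_r$ is what the factor $r$ in $d\Omega = r\,dr\,dz$ absorbs: $\partial_r(r\hat v_r f) = r\hat v_r\partial_r f + \hat v_r f$, so the weighted operator is a pure divergence.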

By the divergence theorem in $\Sigma$,
\[ \int_\Sigma \bar L_k(g \circ T^{-1})\, dt\, d\Omega\, d\theta\, dv_r\, dv_\theta\, dv_z = \int_{D^+} (g \circ T^{-1})\, d\sigma^+ - \int_{D^-} (g \circ T^{-1})\, d\sigma^-, \]
where $d\sigma^+$ and $d\sigma^-$ are the unique surface measures associated with the differential operator $\bar L_k$. Therefore
\[ \int_{T(D^+)} g\, d\sigma_T^+ = \int_{D^+} (g \circ T^{-1})\, d\sigma^+; \qquad \int_{T(D^-)} g\, d\sigma_T^- = \int_{D^-} (g \circ T^{-1})\, d\sigma^-. \]
Combining with (77) and letting $f_+^{k+1} = g \circ T^{-1}$, we deduce
\[ \int_{D^+} f_+^{k+1}\, d\sigma^+ = \int_{D^-} f_+^{k+1}\, d\sigma^-, \tag{78} \]
for $\bar L_k(f_+^{k+1}) = 0$. This means that we can directly compute all the surface integrals over $D^\pm$. We now claim that $K$ is conservative in $L^1(D^- \setminus D^0)$.

Proof of the claim. This is clearly true on $\partial\Sigma_\Omega$ and $\partial\Sigma_\theta$. Now on $\partial\Sigma_{V_N}$, from (69) and (70), the boundary integral is
\[ \int\!\!\int_0^{2\pi} [a_\pm\cos\omega + b_\pm\sin\omega] f_+^{k+1}(N\cos\omega, v_\theta, N\sin\omega)\, d\omega\, dv_\theta\, dt\, d\Omega\, d\theta. \tag{79} \]
Since $a_\pm$ and $b_\pm$ do not depend on $\omega$, for fixed $v_\theta$, we divide the $\omega$-integration into $\int_0^\pi + \int_\pi^{2\pi}$. By changing $\omega' = \pi + \omega$ in the second integral, we obtain:
\[
\begin{aligned}
\int_\pi^{2\pi} d\omega &= -\int_0^\pi [a_\pm\cos\omega + b_\pm\sin\omega] f_+^{k+1}(-N\cos\omega, v_\theta, -N\sin\omega)\, d\omega \\
&= -\int_0^\pi [a_\pm\cos\omega + b_\pm\sin\omega] f_+^{k+1}(N\cos\omega, v_\theta, N\sin\omega)\, d\omega
\end{aligned}
\]
from (70). The claim thus follows.
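The cancellation in the claim can be illustrated with a quadrature of the $\omega$-integral in (79). In the sketch below, the reflexive condition (70) is modeled by a boundary density that is $\pi$-periodic in $\omega$; the constants `a`, `b` and both sample densities are hypothetical, chosen only to contrast the reflexive case with a generic one.

```python
import math

def boundary_flux(f, a, b, n=2000):
    # Riemann sum over omega in [0, 2*pi) of (a cos w + b sin w) f(w);
    # a and b are treated as constants because they do not depend on omega.
    total = 0.0
    for i in range(n):
        w = 2 * math.pi * i / n
        total += (a * math.cos(w) + b * math.sin(w)) * f(w)
    return total * 2 * math.pi / n

def f_reflexive(w):
    # pi-periodic in omega, i.e. f(w + pi) = f(w), mimicking (70).
    return 1.0 + 0.5 * math.cos(2 * w) + 0.3 * math.sin(4 * w)

def f_generic(w):
    # Not pi-periodic: violates the reflexive condition.
    return 1.0 + 0.5 * math.cos(w)

if __name__ == "__main__":
    print(boundary_flux(f_reflexive, 0.7, -1.1))
    print(boundary_flux(f_generic, 0.7, -1.1))
```

With the reflexive density, the contributions at $\omega$ and $\omega + \pi$ cancel pairwise, so no mass leaves through the artificial boundary; the generic density leaves a net flux of $a\pi/2$ for these sample values.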

By [BP], $F^{k+1}$ (or $f_+^{k+1}$) is unique. We now show that $f_+^{k+1}$ is independent of $\theta$, thereby $\bar L_k \equiv L_k$, and $f_+^{k+1}$ satisfies the original equation (64). Define the $\pi$-periodic extension of $f_+^{k+1}$ as:
\[ \bar f_+^{k+1}(t, r, z, m\pi + \theta, v_r, v_\theta, v_z) = f_+^{k+1}(t, r, z, \theta, v_r, v_\theta, v_z) \]
for all $m \in \mathbb{Z}$ and $0 \le \theta \le \pi$. For any $0 \le \theta_0 < \pi$, notice that $\bar f_+^{k+1}(t, r, \theta + \theta_0, z, v_r, v_\theta, v_z)$ satisfies both the same boundary condition $K$ as well as the equation $\bar L_k(\bar f_+^{k+1}) = 0$ as $f_+^{k+1}$ does, since all the data are independent of $\theta$. Therefore, uniqueness implies
\[ f^{k+1}(\theta + \theta_0) = f^{k+1}(\theta) \]
for all $\theta_0 \in [0, \pi)$. Hence from periodicity
\[ f_+^{k+1}(\theta) = \frac{1}{\pi}\int_0^\pi f_+^{k+1}(\theta')\, d\theta'. \]


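Therefore $f_+^{k+1}$ does not depend on $\theta$. We thus finish the construction of the solution. And by the argument in [G1], (65) is satisfied.

Step 2. The estimates. All the identities in (3) then follow from various standard integrations via (78) over $[0, t] \times \Pi$ (see [G1]). The estimate in the first line follows from the standard $L^p$ estimates, $1 \le p \le \infty$. The estimate for the support in the second line follows from (76). The third estimate is the energy estimate. This is proved by an integration of the equation $L_k(\langle v_\pm\rangle f_\pm^{k+1}) = \mathbf{E}^{k*} \cdot \mathbf{j}^{k+1}$ over $[0, t] \times \Pi$. Since $\langle v_\pm\rangle$ is independent of $\omega$ on the artificial boundary $\partial V_N$, it follows from (79) that the boundary integral over $\partial V_N$ has no contribution. A direct computation leads to $L_k[\eta_\pm(r(v_\theta \pm A_\theta^{k*}))] = 0$. It then follows that $L_k[\eta_\pm(r(v_\theta \pm A_\theta^{k*})) f_\pm^{k+1}] = 0$. By (78), estimates on the fourth line follow from integrations of

$$L_k(H_\pm(f_\pm^{k+1})) = 0, \qquad L_k[\eta_\pm(r(v_\theta \pm A_\theta^{k*})) f_\pm^{k+1}] = 0$$

over $[0, t] \times \Pi$. Since $\eta_\pm(r(v_\theta \pm A_\theta^{k*}))$ is independent of $\omega$ on the artificial boundary $\partial V_N$, boundary integrals on both integrations drop due to (79). It then also follows that $\partial_t \rho^{k+1} + \nabla_x \cdot \mathbf{j}^{k+1} = 0$. To prove the last two estimates, we choose $\eta_\pm(x) = x^2$. We have

$$\begin{aligned}
\int_{\Pi_N} r^2 v_\theta^2 f_\pm^{k+1}(t) &\le \int_{\Pi_N} r^2 \{\mp 2 v_\theta A_\theta^{k*} - [A_\theta^{k*}]^2\} f_\pm^{k+1} + C(|||u(0)|||, \operatorname{supp} f_{\pm 0}) \\
&\le C\Big\{\int_{\Pi_N} \langle v_\theta\rangle^2 f_\pm^{k+1}\Big\}^{1/2} \Big\{\int_{\Pi_N} \langle v_\theta\rangle^{-2} |A_\theta^{k*}|^2 f_\pm^{k+1}\Big\}^{1/2} + C(|||u(0)|||, \operatorname{supp} f_{\pm 0}) \\
&\le C\Big\{\sup_{r,z} \int_{V_N} \frac{f_\pm^{k+1}}{\langle v_\theta\rangle^2}\, dv\Big\} \|A^{k*}\|_{L^2} \Big\{\int_{\Pi_N} \langle v_\theta\rangle^2 f_\pm^{k+1}\Big\}^{1/2} + C(|||u(0)|||, \operatorname{supp} f_{\pm 0}) \\
&\le C(N, \|f_\pm^{k+1}\|_\infty)\,[\|A^{k*}\|_{L^2} + 1] \Big\{\int_{\Pi_N} v_\theta^2 f_\pm^{k+1}\Big\}^{1/2} + C(|||u(0)|||, \operatorname{supp} f_{\pm 0}).
\end{aligned}$$

Since $r \ge r_0 > 0$, the second to the last estimate follows. The last estimate follows from the fact:

$$\Big|\int_{V_N} f^{k+1}\, dv\Big| \le \Big\{\int_{V_N} \langle v_\theta\rangle^{-2}\, dv\Big\}^{1/2} \Big\{\int_{V_N} \langle v_\theta\rangle^2 |f^{k+1}|^2\, dv\Big\}^{1/2} \le C(N, \|f_\pm^{k+1}\|_\infty) \Big\{\int_{V_N} \langle v_\theta\rangle^2 |f^{k+1}|\, dv\Big\}^{1/2}.$$

We then conclude our lemma.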

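Given $f_\pm^{k+1}$, we now solve the Maxwell system as

$$\begin{aligned}
\partial_t E_r^{k+1} + \partial_z B_\theta^{k+1} &= -j_r^{k+1} = -\int_{V_N} [\hat v_{r+} f_+^{k+1} - \hat v_{r-} f_-^{k+1}]\, dv, \\
\partial_t E_z^{k+1} - \frac{1}{r}\partial_r (r B_\theta^{k+1}) &= -j_z^{k+1} = -\int_{V_N} [\hat v_{z+} f_+^{k+1} - \hat v_{z-} f_-^{k+1}]\, dv, \\
\partial_t B_\theta^{k+1} - (\partial_r E_z^{k+1} - \partial_z E_r^{k+1}) &= 0
\end{aligned}$$

with boundary condition

$$-E_r^{k+1} n_r + E_z^{k+1} n_z = 0. \tag{80}$$

We also solve $A_\theta^{k+1}$ as:

$$-\partial_{tt} A_\theta^{k+1} + \partial_r\Big\{\frac{1}{r}\partial_r (r A_\theta^{k+1})\Big\} + \partial_{zz} A_\theta^{k+1} = -\int_{V_N} [\hat v_{\theta+} f_+^{k+1} - \hat v_{\theta-} f_-^{k+1}]\, dv \tag{81}$$

with Dirichlet condition $A_\theta^{k+1} = 0$ on $\partial\Omega$.

Lemma 6. Assume (2), (61) and $\operatorname{supp} f_{\pm 0} \subset \Pi_N$. Given $f_\pm^{k+1}$ as in the previous lemma, there exists a unique solution $(E_r^{k+1}, E_z^{k+1}, B_\theta^{k+1}) \in L^\infty(L^2(\Omega))$ and $A^{k+1} \in L^\infty(H^1(\Omega))$ and $\partial_t A^{k+1} \in L^\infty(L^2(\Omega))$ such that

$$\begin{aligned}
B(f_\pm^{k+1}, A^{k+1}, G; N, V_N) &= 0, \\
C(f_\pm^{k+1}, E_r^{k+1}, E_z^{k+1}, B_\theta^{k+1}, \psi_1, \psi_2; N, V_N) &= 0, \\
D(f_\pm^{k+1}, E_r^{k+1}, E_z^{k+1}, B_\theta^{k+1}, \phi; N, V_N) &= 0
\end{aligned} \tag{82}$$

for all $G \in M_1$, $(\psi_1, \psi_2, \phi) \in M_2$ with $\operatorname{supp}(G, \psi_1, \psi_2, \phi) \subset [0, N) \times \Omega$. Moreover, we have for $0 \le t \le N$,

$$\begin{aligned}
&\|A^{k+1}(t)\|_{H^1(\Omega)} + \|\partial_t A^{k+1}(t)\|_{L^2(\Omega)} \le C(N, \operatorname{supp} f_\pm^{k+1}, |||u_0|||), \\
&\frac{1}{2}\int_\Omega \big[|\mathbf{E}^{k+1}|^2(t) + |\mathbf{B}^{k+1}|^2(t)\big] = \frac{1}{2}\int_\Omega \big[|\mathbf{E}_0^{k+1}|^2 + |\mathbf{B}_0^{k+1}|^2\big] - \int_0^t\!\!\int_\Omega \mathbf{E}^{k+1} \cdot \mathbf{j}^{k+1}, \\
&\frac{1}{r} E_r^{k+1}(t) + \partial_r E_r^{k+1}(t) + \partial_z E_z^{k+1}(t) = \rho^{k+1}(t) = \int_{V_N} [f_+^{k+1} - f_-^{k+1}]\, dv.
\end{aligned} \tag{83}$$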

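Proof. By the last estimates in Lemma 5, we deduce that $j_r^{k+1}(t, \cdot)$, $j_z^{k+1}(t, \cdot)$ and $j_\theta^{k+1}(t, \cdot) \in L^\infty(L^2(\Omega))$. By the standard energy estimate of (81) and Lemma 5,

$$\begin{aligned}
\|A^{k+1}(t)\|_{H^1}^2 + \|\partial_t A^{k+1}(t)\|_2^2 + \Big\|\frac{1}{r} A^{k+1}(t)\Big\|_2^2 &\le |||u_0|||^2 + \int_0^t \|\partial_t A^{k+1}(\tau)\|_2\, \|j_\theta^{k+1}(\tau)\|_2\, d\tau \\
&\le |||u_0|||^2 + C\int_0^t \|\partial_t A^{k+1}(\tau)\|_2\, [\|A^k(\tau)\|_2 + 1]\, d\tau.
\end{aligned}$$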

Then the first estimate in (83) follows from an induction on $k$. On the other hand, the main part of the reduced Maxwell system is a symmetric hyperbolic system with $\partial_t v + M_1 \partial_r v + M_2 \partial_z v$, where $v^T = (E_r^{k+1}, E_z^{k+1}, B_\theta^{k+1})$ and

$$M_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad M_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{pmatrix},$$

and

$$M_1 n_r + M_2 n_z = \begin{pmatrix} 0 & n_r & 0 \\ n_r & 0 & -n_z \\ 0 & -n_z & 0 \end{pmatrix}$$

is of rank two. The boundary condition (80) forms a maximal nonnegative space with respect to $M_1 n_r + M_2 n_z$. Hence there exists a strong solution [LP] $(E_r^{k+1}, E_z^{k+1}, B_\theta^{k+1})$ and the second energy estimate in (83) is valid. The last identity in (83) follows from the Maxwell system and Lemma 5:

$$\partial_t\Big[\frac{1}{r} E_r^{k+1}(t) + \partial_r E_r^{k+1}(t) + \partial_z E_z^{k+1}(t)\Big] = -\Big[\frac{1}{r} j_r^{k+1} + \partial_r j_r^{k+1} + \partial_z j_z^{k+1}\Big] = \partial_t \rho^{k+1},$$

and the initial constraints in (61). Now we are ready to prove the following existence result.
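As a side check on the proof of Lemma 6, the rank claim for the boundary matrix can be verified numerically. The sketch below is only an illustration, not part of the original argument: it takes the matrix $M_1 n_r + M_2 n_z$ exactly as displayed in the proof, and the helper names (`boundary_matrix`, `det3`, `max_abs_2x2_minor`, `rank_is_two`) are hypothetical.

```python
import math


def boundary_matrix(n_r, n_z):
    # Matrix M1*n_r + M2*n_z as displayed in the proof of Lemma 6.
    return [
        [0.0, n_r, 0.0],
        [n_r, 0.0, -n_z],
        [0.0, -n_z, 0.0],
    ]


def det3(m):
    # Determinant of a 3x3 matrix by cofactor expansion along the first row.
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def max_abs_2x2_minor(m):
    # Largest absolute value among all 2x2 minors of a 3x3 matrix.
    best = 0.0
    for r0 in range(3):
        for r1 in range(r0 + 1, 3):
            for c0 in range(3):
                for c1 in range(c0 + 1, 3):
                    minor = m[r0][c0] * m[r1][c1] - m[r0][c1] * m[r1][c0]
                    best = max(best, abs(minor))
    return best


def rank_is_two(n_r, n_z, tol=1e-12):
    # Rank two means: the determinant vanishes and some 2x2 minor does not.
    m = boundary_matrix(n_r, n_z)
    return abs(det3(m)) < tol and max_abs_2x2_minor(m) > tol


if __name__ == "__main__":
    for k in range(8):
        angle = 2.0 * math.pi * k / 8
        print(k, rank_is_two(math.cos(angle), math.sin(angle)))
```

For a unit normal the minors $-n_r^2$ and $-n_z^2$ cannot both vanish, which is why the check succeeds for every direction.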

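Theorem 3. Assume (2) and (61). Then there exists a weak solution $[f_\pm, f_\pm^\gamma, A_\theta, E_r, E_z, B_\theta]$ (as in Definition 3) such that

$$\begin{aligned}
&\sup_{0 \le t < \infty} \|f_\pm(t)\|_{L^p} \le \|f_{\pm 0}\|_{L^p}, \quad 1 \le p \le \infty, \qquad \|f_\pm^\gamma(t)\|_{L^\infty, \gamma^\pm} \le \|f_{\pm 0}\|_{L^\infty}, \\
&\sup_{0 \le t < \infty} \int_\Pi H_\pm(f_\pm(t)) \le \int_\Pi H_\pm(f_{\pm 0}), \\
&\int_\Pi \eta_\pm(A_\theta(t)) f_\pm(t) = \int_\Pi \eta_\pm(A_{\theta 0}) f_{\pm 0}, \quad \text{a.e. } t > 0, \\
&\sup_{0 \le t < \infty} E(u(t)) \le E(u_0), \qquad \sup_{0 \le t < \infty} J(u(t)) \le J(u_0).
\end{aligned}$$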

Proof. We first assume that $f_{\pm 0}$ have compact supports. We construct the weak solutions in two steps [G1]. In the constructed approximate solutions $u_N^{k+1}(t)$ in the previous two lemmas, we first fix $N$ and let $k \to \infty$ to get a weak limit $u^N(t)$. In the second step, we let $N \to \infty$ to obtain the final weak solution $u(t)$.

We fix $N$ such that $\operatorname{supp} f_{\pm 0} \subset \Pi_N$, and let $k \to \infty$ in Lemmas 5 and 6. Up to a subsequence, as $k \to \infty$, we assume $f_\pm^{k+1} \to f_\pm^N$ weakly in $L^2([0, N] \times \Pi_N)$ and weak $*$ in $L^\infty(\Pi_N)$, and $f_\pm^{k+1,\gamma} \to f_\pm^{N,\gamma}$ weak $*$ in $L^\infty(\gamma_N^+)$. Moreover, $\mathbf{E}^{k+1}$ and $\mathbf{B}^{k+1}$ converge weakly to $\mathbf{E}^N$ and $\mathbf{B}^N$ in $L^2([0, N] \times \Omega)$ respectively.

By lower-semicontinuity, the $L^p$ estimates in Lemma 5 remain valid for $f_\pm^N$ and $f_\pm^{N,\gamma}$. By convexity,

$$\sup_{0 \le t < \infty} \int_\Pi H_\pm(f_\pm^N(t)) \le \int_\Pi H_\pm(f_{\pm 0}). \tag{84}$$

Since $A_\theta^{k+1} \to A_\theta^N$ weakly in $H_0^1(\Omega)$, and $\|\int_{V_N} f_\pm^{k+1}\, dv\|_{L^2(\Omega)}$ is bounded, we deduce for almost every $t$,

$$\int_\Pi \eta_\pm(A_\theta^N(t)) f_\pm^N(t) = \int_\Pi \eta_\pm(A_{\theta 0}) f_{\pm 0}. \tag{85}$$

The key step is to show that the energy inequality holds for $u^N(t)$:

$$\sup_{0 \le t < \infty} E(u^N(t)) \le E(u_0), \tag{86}$$

which gives $L^2$ bounds for $\mathbf{E}^N$ and $\mathbf{B}^N$ for all $t \in [0, N]$. Combining the energy estimates in Lemmas 5 and 6, we only need to show that for fixed $N$,

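$$\lim_{k \to \infty} \int_0^t\!\!\int_{\Pi_N} [\mathbf{E}^{k+1} - \mathbf{E}^{k*}] \cdot \mathbf{j}^{k+1} = 0. \tag{87}$$

We break $\mathbf{j}^{k+1}$ into $\int_{|v_\theta| \le M} dv + \int_{|v_\theta| \ge M} dv$. In light of the last two estimates in Lemma 5, the second term is bounded by

$$\Big|\int_{|v_\theta| \ge M} f_\pm^{k+1}\, dv\Big|^2 \le \Big\{\int_{|v_\theta| \ge M} \langle v_\theta\rangle^{-2}\Big\} \Big\{\int_{|v_\theta| \ge M} \langle v_\theta\rangle^2 |f_\pm^{k+1}|^2\Big\} \le \frac{C(N)}{M} \int_{|v_\theta| \ge M} \langle v_\theta\rangle^2 f_\pm^{k+1} \le \frac{C(N, \operatorname{supp} f_{\pm 0}, |||u_0|||)}{M}.$$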
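The $1/M$ decay in the bound above comes from the integral of $\langle v_\theta\rangle^{-2}$ over the tail $|v_\theta| \ge M$. The following sketch checks that decay in one variable only. It assumes $\langle v_\theta\rangle = \sqrt{1 + v_\theta^2}$ (the source does not restate this definition here) and ignores the bounded remaining velocity components, which only contribute the constant $C(N)$; the function names are hypothetical.

```python
import math


def tail_integral_exact(M):
    # Closed form of the integral of 1/(1+v^2) over |v| >= M, for M > 0:
    # 2*(pi/2 - atan(M)) = 2*atan(1/M).
    return 2.0 * math.atan(1.0 / M)


def tail_integral_numeric(M, steps=1000):
    # Substituting s = 1/v maps [M, infinity) to (0, 1/M] and
    # dv/(1+v^2) to ds/(1+s^2); apply the midpoint rule there.
    upper = 1.0 / M
    h = upper / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += h / (1.0 + s * s)
    # Factor 2 accounts for the two symmetric tails v >= M and v <= -M.
    return 2.0 * total


if __name__ == "__main__":
    for M in (1.0, 2.0, 10.0, 100.0):
        print(M, tail_integral_exact(M), tail_integral_numeric(M), 2.0 / M)
```

Since $\arctan x \le x$ for $x \ge 0$, the tail integral is at most $2/M$, which is the kind of bound used to make the second term small for large $M$.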

Therefore, the $L^2$ norm of the second term is arbitrarily small for $M$ sufficiently large. On the other hand, for the given large fixed $M$, the first part $\int_{|v_\theta| \le M} dv$ is bounded in $H^{1/4}([0, N] \times \Omega)$ by the averaging lemma. Hence $\mathbf{j}^{k+1}$ converges to $\mathbf{j}^N$ strongly in $L^2(\Omega)$. See [G1]. But $\mathbf{E}^{k+1} - \mathbf{E}^{k*}$ goes to zero weakly in $L^2(\Omega)$; we thus conclude (87) by first letting $M$ be large and then letting $k \to \infty$. Therefore the energy inequality (86) is valid for the weak limits $f_\pm^N$, $\mathbf{E}^N$ and $\mathbf{B}^N$ as $k \to \infty$. It follows that $\partial_t A_\theta^N \in L^2(\Omega)$ and $A_\theta^N \in H^1(\Omega)$. From the constructions in Lemmas 5 and 6,

$$J(u^{k+1}(t)) = J(u_0) + R(k, N), \tag{88}$$

where $R(k, N)$ is

$$\int_0^t\!\!\int_{\Pi_N} [\mathbf{E}^{k+1} - \mathbf{E}^{k*}] \cdot \mathbf{j}^{k+1} - \int_{\Pi_N} \big\{[\eta_\pm(A_\theta^{k*}(t)) - \eta_\pm(A_\theta^{k+1}(t))] f_\pm^{k+1}(t) - [\eta_\pm(A_{\theta 0}^{k*}) - \eta_\pm(A_{\theta 0}^{k+1})] f_{\pm 0}\big\}.$$

Notice that from Lemma 6, $A_\theta^k$ is bounded in $H^1(\Omega)$ for fixed $N$. Since $\|\int_{V_N} f_\pm^{k+1}\|_2$ is bounded for all $k$, we let $k \to \infty$ in (88) to get

$$\Big|\int_{\Pi_N} [\eta_\pm(A_\theta^{k*}(t)) - \eta_\pm(A_\theta^{k+1}(t))] f_\pm^{k+1}(t)\Big| \le C(N)\, \|A_\theta^{k*}(t) - A_\theta^{k+1}(t)\|_2\, \Big\|\int_{V_N} f_\pm^{k+1}\Big\|_2 \to 0.$$

Combining with (87), (85) and (84), we have

$$J(u^N(t)) \le J(u_0). \tag{89}$$

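Now it follows from the argument in [G1] that as $N \to \infty$, there is a weak solution $f_\pm \in L^p_{loc}((0, \infty) \times \Pi)$, for $1 \le p \le \infty$, $f_\pm^\gamma \in L^\infty(\gamma^+)$, and $\mathbf{E}, \mathbf{B} \in L^2_{loc}((0, \infty) \times \Omega)$, $A_\theta \in H^1_{loc}((0, \infty) \times \Omega)$ as the weak limits of $f_\pm^N$, $\mathbf{E}^N$ and $\mathbf{B}^N$ and $A_\theta^N$. Moreover, lower semicontinuity implies the desired $L^p$ estimates and (86) are valid for the weak solution. Equation (84) follows for the weak limit by convexity. Now by the same argument as in (56), (85) holds for the weak limits, and so does (89). The theorem is proved in the case when $f_{\pm 0}$ have compact supports.

In the general case in which $f_{\pm 0}$ do not have compact supports, we approximate $f_{\pm 0}$, $\mathbf{E}_0$ and $\mathbf{B}_0$ by $f_{\pm 0}^m$, $\mathbf{E}_0^m$ and $\mathbf{B}_0^m$, such that $f_{\pm 0}^m = \chi_{|v| \le m}(v) f_{\pm 0}$, and

$$\begin{aligned}
&\lim_{m \to \infty} \|f_{\pm 0}^m - f_{\pm 0}\|_p = 0, \quad \text{for } 1 \le p < \infty; \\
&\lim_{m \to \infty} \big(\|\mathbf{E}_0^m - \mathbf{E}_0\|_2 + \|\mathbf{B}_0^m - \mathbf{B}_0\|_2\big) = 0, \\
&\lim_{m \to \infty} J(u_0^m) = J(u_0); \qquad \operatorname{div} \mathbf{E}_0^m = \rho_0^m.
\end{aligned}$$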

Here $\chi$ is the characteristic function. We obtain a sequence of corresponding weak solutions $u^m$ with the same estimates (independent of the supports of $f_{\pm 0}^m$). The theorem thus follows when we take an appropriate weak limit as $m \to \infty$.

7. Stability of Critical Points

We are now ready to prove the main theorem.

Theorem 4. Assume (61) and (2). (a) (Stability of minimizers). Let $U_0$ be the set of minimizers of $J_0$. Then for every $\epsilon > 0$ there is $\delta > 0$ such that

$$\sup_{0 \le t < \infty} d(u(t), U_0) < \epsilon,$$

provided $d(u(0), U_0) < \delta$, where $u(t)$ is the corresponding weak solution constructed in Theorem 3. (b) (Stability of critical points). Let $u_0$ be a critical point of $J_0$ such that (7) and (47) hold. Then for every $\epsilon > 0$, there is $\delta > 0$ such that if $d(u(0), u_0) < \delta$ and $|||u(0)||| \le C_0$, then the corresponding weak solution $u(t)$ constructed in Theorem 3 satisfies

$$\sup_{0 \le t < \infty} d(u(t), u_0) < \epsilon. \tag{90}$$

Proof. Part (a) follows from Theorems 2 and 3. For part (b), if a given critical point $u_0$ satisfies (7), then from Lemma 4

$$(1 - \sigma)\, d(u(t), u_0) - C d^{3/2}(u(t), u_0) \le J(u(t)) - J(u_0).$$

By a standard continuity argument in stability analysis, this would imply that $u_0$ is stable if $d(u(t), u_0)$ is continuous in $t$, which is not clear for weak solutions. Instead, we establish (90) at the level of the approximate solutions $u_N^{k+1}(t)$ of $u(t)$, that is, we shall prove $d(u_N^{k+1}(t), u_0) \le \epsilon$ if $d(u(0), u_0) \le \delta$. Here $\delta$ is uniform in $k$ and $N$. We then establish the same for $u(t)$ by letting $k \to \infty$ and $N \to \infty$.

Without loss of generality, we again assume that $f_{\pm 0}$ have compact supports. We recall the approximate solutions $u_N^{k+1}(t)$ in Theorem 3. For fixed $N$, if $d(u^{k+1}(t)) \le 1$, since $\operatorname{div} \mathbf{E}^{k+1} = \rho^{k+1}$, we deduce from Lemma 4 that

$$(1 - \sigma)\, d(u^{k+1}(t)) - C d(u^{k+1}(t))^{3/2} \le J(u^{k+1}(t)) - J(u_0) \le J(u(0)) - J(u_0) + R(k, N).$$

Here $R(k, N)$ is defined in (88) and $C = C(\|f_\pm^{k+1}\|_\infty, \|\langle v\rangle f^{k+1}\|_1)$. Notice that both $\|f_\pm^{k+1}\|_\infty$ and $\|\langle v\rangle f^{k+1}\|_1$ are uniformly bounded for $k$ since $E(u^{k+1}(t))$ is bounded. For $k \ge k_0$ large, by the same argument as in (87),

$$R(k, N) < \frac{1}{2}[J(u(0)) - J(u_0)]$$

(for $u(0) \ne u_0$). Hence we have for $k \ge k_0$,

$$(1 - \sigma)\, d(u^{k+1}(t), u_0) - C d(u^{k+1}(t), u_0)^{3/2} \le C_1 [J(u(0)) - J(u_0)]$$

with $C_1$ a fixed constant independent of $k$ and $N$. We now claim that $d(u^{k+1}(t), u_0)$ is continuous in $t$.

Proof of the claim. Notice that from Lemmas 5 and 6, both $\int_{\Pi_N} H_\pm(f_\pm^{k+1}(t))$ and $E(u_N^{k+1}(t))$ are continuous in $t$. By the definition of $d$ as in (38) and (40), it suffices to show that $\int_{\Pi_N} h_\pm(\alpha, \beta, v) f_\pm^{k+1}(t)$ is also continuous in $t$. Notice that

$$L_k(h_\pm(\alpha, \beta, v) f_\pm^{k+1}) = [L_k(h_\pm(\alpha, \beta, v))] f_\pm^{k+1}.$$
Since $h_\pm(\alpha, \beta, v)$ is independent of $\omega$ at $\partial V_N$, we integrate over $[s, t] \times \Pi_N$ as in (78) and (79) to get

$$\int_{\Pi_N} h_\pm(\alpha, \beta, v) [f_\pm^{k+1}(t) - f_\pm^{k+1}(s)] = \int_s^t\!\!\int_{\Pi_N} [L_k(h_\pm(\alpha, \beta, v))] f_\pm^{k+1}(\tau).$$

Since $L_k(h_\pm(\alpha, \beta, v))$ is continuous, it thus follows that $\int_{\Pi_N} h_\pm(\alpha, \beta, v) f_\pm^{k+1}(t)$ is continuous and the claim follows.

Therefore, by a standard continuity argument,

$$\sup_{0 \le t < \infty} d(u^{k+1}(t), u_0) \le 3 C_1 [J(u(0)) - J(u_0)],$$

provided that $d(u(0), u_0) \le \delta$, where $\delta$ is independent of $k$ and $N$. By taking $k \to \infty$ and $N \to \infty$, we deduce that

$$\sup_{0 \le t < \infty} d(u(t), u_0) \le 3 C_1 [J(u(0)) - J(u_0)]$$

by lower semicontinuity of $d$. Now for any $\epsilon > 0$, we choose $\delta$ small such that $3 C_1 [J(u(0)) - J(u_0)] < \epsilon$, and the theorem follows.

8. Neutral Plasmas

It is interesting to check when condition (7) is satisfied for a given critical point $u_0$. In this section, we give an interpretation of condition (7) in the neutral case, i.e., $\beta \equiv 0$.

Definition 4. Given $\mu_\pm$, $\eta_\pm$ is called a neutral pair for $\mu_\pm$, if for all $\alpha \in \mathbb{R}$,

$$\rho(\alpha, 0) = \int_{\mathbb{R}^3} [\mu_+(h_+(\alpha, 0, v)) - \mu_-(h_-(\alpha, 0, v))]\, dv \equiv 0. \tag{91}$$

If such a neutral pair exists, from (4), we have the following easy reduction:

Lemma 7. Let $\eta_\pm$ be a neutral pair for $\mu_\pm$. If $\alpha(r, z)$ satisfies

$$-\partial_{zz}\alpha - \partial_{rr}\alpha - \frac{1}{r}\partial_r \alpha + \frac{1}{r^2}\alpha - j_\theta(\alpha, 0) = 0, \tag{92}$$

then $[\mu_\pm(h_\pm(\alpha, 0, v)), \alpha, 0]$ is a critical point of $J_0$.


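There are many examples of such neutral pairs, especially when $\mu_\pm(x) = e^{-x}$. In this case, let $\nu_\pm = e^{-\eta_\pm}$, and (91) reduces to

$$\int_{\mathbb{R}^3} \nu_+(r(v_\theta + \alpha))\, e^{-\langle v_+\rangle}\, dv = \int_{\mathbb{R}^3} \nu_-(r(v_\theta - \alpha))\, e^{-\langle v_-\rangle}\, dv. \tag{93}$$

For given $\nu_+$, we can solve $\nu_-$ from (93) via the Fourier transform for $\alpha$.

Lemma 8. Assume that $\nu_+$ is even, and $\nu_\infty = \lim_{x \to \pm\infty} \nu_+(x)$. Assume $\nu_+ - \nu_\infty \in W^{2,1}(\mathbb{R})$. Then there exists a bounded function $\nu_-$ and constants $C_\pm$, such that $\nu_\pm + C_\pm \ge 0$ and satisfy (93) for all $\alpha$.

Proof. In the case $m_+ = m_-$, we simply choose $\nu_-(x) = \nu_+(x)$. Otherwise, we assume $m_+ > m_-$. We first assume $\nu_\infty = 0$. Notice that by a change of variable $v_\theta \to -v_\theta$,

$$\int_{\mathbb{R}^3} \nu_+(r(v_\theta + \alpha))\, e^{-\langle v_+\rangle}\, dv = \int_{\mathbb{R}^3} \nu_+(r(v_\theta - \alpha))\, e^{-\langle v_+\rangle}\, dv$$

from the evenness of $\nu_+$. We perform the Fourier transform (denoted by $F$) in (93) with respect to $\alpha$ to get

$$F\Big[\int_{\mathbb{R}^3} \nu_\pm(r(v_\theta - \alpha))\, e^{-\langle v_\pm\rangle}\, dv\Big](y) = F\nu_\pm(r\,\cdot)(y) \int_{\mathbb{R}^3} e^{-\sqrt{m_\pm^2 + |v|^2} + i v_\theta y}\, dv.$$

By elementary computations, as in [St], we have

$$\int_{\mathbb{R}^3} e^{-\sqrt{m_\pm^2 + |v|^2} + i v_\theta y}\, dv = \int_0^\infty \frac{1}{\sqrt{u}}\, e^{-2y^2 u - u - m_\pm^2/4u}\, du \equiv h_\pm(y),$$

and $0 \le h \equiv h_+/h_- \le 1$. Now we can solve $\nu_-$ as:

$$\nu_-(rx) = c \int_{\mathbb{R}}\int_{\mathbb{R}} \nu_+(r(x - y))\, e^{iy\theta} h(\theta)\, d\theta\, dy = c \int_{\mathbb{R}} \Big[\int_{\mathbb{R}} \nu_+(x - y)\, e^{iy\theta/r}\, dy\Big] h(\theta)\, d\theta,$$

where $c$ is some numerical constant. Notice that $\int_{\mathbb{R}} \nu_+(x - y)\, e^{iy\theta/r}\, dy \in L^\infty$. Since $\nu_\infty = 0$, we have

$$\Big|\int_{\mathbb{R}} \nu_+(x - y)\, e^{iy\theta/r}\, dy\Big| = \frac{r^2}{\theta^2} \Big|\int_{\mathbb{R}} \nu_+''(x - y)\, e^{iy\theta/r}\, dy\Big| \le \frac{C \|\nu_+\|_{W^{2,1}}}{\theta^2}.$$

Hence $\int_{\mathbb{R}} \nu_+(x - y)\, e^{iy\theta/r}\, dy \in L^1(\mathbb{R})$. Notice that $0 \le h \le 1$, hence $\nu_-$ is bounded. We then can adjust two constants $C_\pm$ such that $\nu_\pm + C_\pm \ge 0$ and satisfy (93) (even in the case $\nu_\infty$ is not zero).

Under stronger smoothness assumptions on $\nu_+$, we can deduce that $\nu_-$ is more regular too. For a neutral pair $\eta_\pm$, we set up a variational formulation to find a solution $\alpha$. Recall (8).

Lemma 9. Assume (2). Let $\eta_\pm$ be a neutral pair. Then there exists a minimizer $\alpha_0$ of $\inf_{\alpha \in H_0^1(\Omega)} G(\alpha)$, such that $\alpha_0$ satisfies (92) and

$$-\partial_{zz} - \partial_{rr} - \frac{1}{r}\partial_r + \frac{1}{r^2} - \partial_\alpha j_\theta(\alpha_0, 0) \ge 0.$$

In particular, in light of Theorem 4, a non-degenerate minimizer $\alpha_0$ should yield a stable steady state solution $(\mu_\pm, \alpha_0, 0)$. This is the case if

$$\sup_{(r,z) \in \Omega} \partial_\alpha j_\theta(\alpha_0, 0) < \lambda_0, \tag{94}$$

where $\lambda_0$ is the lowest eigenvalue of $-\partial_{zz} - \partial_{rr} - \frac{1}{r}\partial_r + \frac{1}{r^2}$ on $\Omega$.

Proof. Notice that from (2), $M_\pm(\zeta) \le C|\zeta|^{-\gamma+1}$ for $\zeta > 0$. But $\gamma > 4$ and $h_\pm$ is bounded from below by (3), hence $\int_\Omega\int_{\mathbb{R}^3} M_\pm(h_\pm(\alpha, 0, v_\theta))\, dv$ is bounded for all $\alpha$. The existence follows from a standard argument for minimization. Now the $\alpha$ derivative of the nonlinear term in $G(\alpha)$ is

$$\int_{\mathbb{R}^3} \{r \eta_+'(\alpha) \mu_+(h_+(\alpha, v_\theta)) - r \eta_-'(\alpha) \mu_-(h_-(\alpha, v_\theta))\}\, dv = -j_\theta(\alpha, 0)$$

via integration by parts in the $v_\theta$ variable as in (35).

9. Translation Invariants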

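In this section, we study the Vlasov–Maxwell system when the solutions are independent of $z$. The results and proofs are parallel to those for the axially symmetric case. We only state the corresponding theorems. Let $\Omega$ be a bounded open set in $\{(x_1, x_2) \in \mathbb{R}^2\}$ with the standard measure $d\Omega = dx_1 dx_2$ and $\partial\Omega \in C^{2,\delta}$, for some $\delta > 0$. We look for solutions of the Vlasov–Maxwell system of the form $f_\pm(t, x_1, x_2, v_1, v_2, v_3)$, $E_1(t, x_1, x_2)$, $E_2(t, x_1, x_2)$, $B_3(t, x_1, x_2)$ and a magnetic potential $A_z(t, x_1, x_2)$ (invariant under gauge translations which are independent of $z$), such that

$$E_3 = -\partial_t A_z, \qquad B_1 = \partial_{x_2} A_z, \qquad B_2 = -\partial_{x_1} A_z.$$

The Vlasov–Maxwell system now can be written as:

$$\begin{aligned}
L(f_\pm) = {} & \partial_t f_\pm + \hat v_{1\pm} \partial_{x_1} f_\pm + \hat v_{2\pm} \partial_{x_2} f_\pm \pm (E_1 + \hat v_{2\pm} B_3 + \hat v_{3\pm} \partial_{x_1} A_z)\, \partial_{v_1} f_\pm \\
& \pm (E_2 + \hat v_{3\pm} \partial_{x_2} A_z - \hat v_{1\pm} B_3)\, \partial_{v_2} f_\pm \pm (-\partial_t A_z - \hat v_{1\pm} \partial_{x_1} A_z - \hat v_{2\pm} \partial_{x_2} A_z)\, \partial_{v_3} f_\pm = 0.
\end{aligned} \tag{95}$$

And the Maxwell system takes the form:

$$\begin{aligned}
\partial_t E_1 - \partial_{x_2} B_3 &= -j_1 = -\int_{\mathbb{R}^3} [v_{1+} f_+ - v_{1-} f_-]\, dv, \\
\partial_t E_2 + \partial_{x_1} B_3 &= -j_2 = -\int_{\mathbb{R}^3} [v_{2+} f_+ - v_{2-} f_-]\, dv, \\
\partial_t B_3 + \partial_{x_1} E_2 - \partial_{x_2} E_1 &= 0.
\end{aligned} \tag{96}$$

The equations for $B_1$ and $B_2$ become trivial, and the equation for $E_3$ is:

$$-\partial_{tt} A_z + \partial_{x_1 x_1} A_z + \partial_{x_2 x_2} A_z = -j_3 = -\int_{\mathbb{R}^3} [v_{3+} f_+ - v_{3-} f_-]\, dv. \tag{97}$$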


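The constraints are reduced to

$$\partial_{x_1} E_1 + \partial_{x_2} E_2 = \rho = \int_{\mathbb{R}^3} [f_+ - f_-]\, dv. \tag{98}$$

The boundary condition for $f_\pm$ is purely specular:

$$f_\pm(t, x_1, x_2, v) = f_\pm(t, x_1, x_2, \bar v), \tag{99}$$

where $\bar v = v - 2(v \cdot n)n$, and $n = (n_{x_1}, n_{x_2})$ is the outward normal vector of $\partial\Omega$. The perfect conductor boundary condition reduces to

$$E_1 n_{x_2} - E_2 n_{x_1} = 0, \qquad A_z = 0. \tag{100}$$

We now apply the invariant $\eta_\pm(v_3 \pm A_z)$ of (95), (96) and (97) to construct steady states of (1). For notational simplicity, we use

$$\eta_\pm(\xi) \equiv \eta_\pm(v_3 \pm \xi), \qquad h_\pm(\alpha, \beta, v) \equiv \langle v_\pm\rangle \pm \beta + \eta_\pm(v_3 \pm \alpha). \tag{101}$$

Let $\Pi = \Omega \times \mathbb{R}^3$ and

$$\gamma = \{(x, v) \in \Pi \mid x \in \partial\Omega\}, \qquad \gamma^\pm = \{(x, v) \in \gamma \mid \pm(v_1 n_{x_1} + v_2 n_{x_2}) > 0\}. \tag{102}$$

Let $U = \{u = (f_\pm, A_z) : A_z \in H_0^1(\Omega);\ 0 \le f_\pm \in L^1(\Pi)\}$. Let $\mathbf{u} = (u, E_1, E_2, B_3)$. The energy functional is

$$E(\mathbf{u}) = \sum_\pm \int_\Pi \langle v_\pm\rangle f_\pm\, d\Omega\, dv + \frac{1}{2}\int_\Omega (|\mathbf{E}|^2 + |\mathbf{B}|^2)\, d\Omega, \tag{103}$$

with $d\Omega = dx_1 dx_2$. And we define the full dynamical Liapunov functional

$$J(\mathbf{u}) = E(\mathbf{u}) + \sum_\pm \int_\Pi [H_\pm(f_\pm) + \eta_\pm(A_z) f_\pm], \tag{104}$$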

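where $\operatorname{div} \mathbf{E} = \rho$. We also recall the steady Liapunov functional $J_0$ in (11). Lemma 1 is valid now if $|\partial_z A_\theta|^2 + |\frac{1}{r}\partial_r(r A_\theta)|^2$ is replaced by $|\nabla A_z|^2$. Lemma 2 is also valid if Eq. (4) is replaced by Eq. (10), and $(\mu_\pm(h_\pm), \alpha, \partial_{x_1}\beta, \partial_{x_2}\beta, 0)$ is a steady state solution of (95), (96) and (97) with boundary conditions (99) and (100). For any critical point $u_0 = (\mu_\pm, \alpha)$ of $J_0$, we define $Q(u, u_0)$ as in (38) with the new $h_\pm$ in (101). We further define a measurement between $\mathbf{u} = (u, E_1, E_2, B_3)$ and a critical point $u_0$ as

$$d(\mathbf{u}, u_0) \equiv d(\mathbf{u}, u_0, \lambda) = Q(u, u_0) + \frac{1}{2}\int_\Omega \{|\mathbf{E} + \nabla\beta|^2 + \lambda |\nabla(A_z - \alpha)|^2 + |B_3|^2\} \tag{105}$$

with $\operatorname{div} \mathbf{E} = -\int_{\mathbb{R}^3} [f_+ - f_-]\, dv$, $-\Delta\beta = \int_{\mathbb{R}^3} [\mu_+ - \mu_-]\, dv$, $\phi \in H_0^1(\Omega)$. Lemma 3 is still valid if $j_\theta(\alpha, \beta)$ is replaced by $j_3(\alpha, \beta)$, $A_\theta$ is replaced by $A_z$, and $N(f_\pm - \mu_\pm, A_\theta - \alpha)$ is replaced by

$$\mp\eta_\pm'(v_3 \pm A_z)(A_z - \alpha)(f_\pm - \mu_\pm) - \frac{1}{2}\{\eta_\pm'(v_3 \pm \alpha)\}^2 \mu_\pm' (A_z - \alpha)^2 + \frac{1}{2}\{\eta_\pm''(v_3 \pm \bar A_z) \bar f_\pm - \eta_\pm''(v_3 \pm \alpha) \mu_\pm\}(A_z - \alpha)^2, \tag{106}$$

for some $\bar A_z$ between $A_z$ and $\alpha$, and $\bar f_\pm$ between $\mu_\pm$ and $f_\pm$. Lemma 4 is still valid if the positivity condition is replaced by

$$\int_\Omega |\nabla(A_z - \alpha)|^2 + \partial_\alpha j_3(\alpha, \beta) |A_z - \alpha|^2 \ge \lambda \|A_z - \alpha\|_{H^1}^2. \tag{107}$$

Consider $\inf_{u \in U} J(u)$ among all admissible $u$. Define the set of all minimizers as $U_0 = \{u_0 : J_0(u_0) = \inf_{u \in U} J_0(u)\}$ and $d(\mathbf{u}, U_0) \equiv \inf_{u_0 \in U_0} d(\mathbf{u}, u_0)$. Then Theorems 1 and 4 are valid now. We assume

$$\begin{aligned}
&f_{\pm 0} \in L^1 \cap L^\infty(\Pi), \quad H_\pm(f_{\pm 0}) \in L^1(\Pi), \quad A_{z0} \in H_0^1(\Omega), \quad \dot A_{z0} \in L^2(\Omega), \\
&|||u_0||| \equiv J(u_0) + \|f_{\pm 0}\|_{L^\infty} < \infty, \qquad \partial_{x_1} E_{10} + \partial_{x_2} E_{20} = \rho(0).
\end{aligned} \tag{108}$$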

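With the same test function space as in Definition 1, we define

Definition 5 (Test functionals). Assume (108). Let $f_\pm \in L^1_{loc}([0, \infty) \times \Pi)$, and $f_\pm^\gamma \in L^\infty(\gamma^+)$. Let $E_1$, $E_2$, $B_3$ and $\partial_t A_z$ be all in $L^1_{loc}([0, \infty) \times \Omega)$, and $A_z \in W^{1,1}_{loc}([0, \infty) \times \Omega)$. Let $g_\pm \in V$, $G(p) \in M_1$ and $(\psi_1, \psi_2, \phi) \in M_2$. We define

$$\begin{aligned}
A_\pm(f_\pm, f_\pm^\gamma, A_z, E_1, E_2, B_3, g_\pm; T, V) = {} & -\int_{\Omega \times V} f_{\pm 0}\, g_\pm(0)\, d\Omega\, dv - \int_0^T\!\!\int_{\Omega \times V} L(g_\pm) f_\pm\, dt\, d\Omega\, dv \\
& + \int_{\gamma^+ \cap [0,T]} g_\pm f_\pm^\gamma\, d\gamma_\pm + \int_{\gamma^- \cap [0,T]} g_\pm K(f_\pm^\gamma)\, d\gamma_\pm, \\
B(f_\pm, A_z, G; T, V) = {} & \int_\Omega [\dot A_{z0} G(0) - A_{z0} \partial_t G(0)]\, d\Omega + \int_0^T\!\!\int_\Omega [-\partial_{tt} + \partial_{x_1 x_1} + \partial_{x_2 x_2}] G\, A_z\, d\Omega\, dt + \int_0^T\!\!\int_\Omega j_3 G, \\
C(f_\pm, E_1, E_2, B_3, \psi_1, \psi_2; T, V) = {} & -\int_\Omega (E_{10} \psi_1(0) + E_{20} \psi_2(0))\, d\Omega \\
& - \int_0^T\!\!\int_\Omega \{(E_1 \partial_t \psi_1 + E_2 \partial_t \psi_2) - (\partial_{x_2} \psi_1 - \partial_{x_1} \psi_2) B_3 + (\psi_1 j_1 + \psi_2 j_2)\}\, dt\, d\Omega, \\
D(E_1, E_2, B_3, \phi; T, V) = {} & -\int_\Omega \phi(0) B_{30}\, d\Omega - \int_0^T\!\!\int_\Omega [B_3 \partial_t \phi + \partial_{x_1}\phi\, E_2 - \partial_{x_2}\phi\, E_1]\, dt\, d\Omega,
\end{aligned}$$

where $d\gamma_\pm = (\hat v_{1\pm} n_{x_1} + \hat v_{2\pm} n_{x_2})\, d\sigma\, dv\, dt$ with the standard surface measure $d\sigma$ on $\partial\Omega$, $L$ is defined in (95), $K$ is defined in (99), and $j$ is defined as in (96) and (97) with $v$-integration over the set $V$.

Definition 6 (Weak solutions). Assume (108). We say that $u(t) = [f_\pm, A_z, E_1, E_2, B_3]$ is a weak solution to (95), (96) and (97) with boundary conditions (99) and (100) if $0 \le f_\pm \in L^\infty([0, \infty); L^1 \cap L^\infty(\Pi))$, $f_\pm^\gamma \in L^\infty(\gamma^+)$; $E_1$, $E_2$, $B_3$ and $\partial_t A_z \in L^\infty([0, \infty); L^2(\Omega))$; $A_z(t) \in H_0^1(\Omega)$; and moreover

$$\begin{aligned}
&A_\pm(f_\pm, f_\pm^\gamma, A_z, E_1, E_2, B_3, g_\pm; \infty, \mathbb{R}^3) = 0, \qquad B(f_\pm, A_z, G; \infty, \mathbb{R}^3) = 0, \\
&C(f_\pm, E_1, E_2, B_3, \psi_1, \psi_2; \infty, \mathbb{R}^3) = 0, \qquad D(f_\pm, E_1, E_2, B_3, \phi; \infty, \mathbb{R}^3) = 0, \\
&\partial_{x_1} E_1 + \partial_{x_2} E_2 = \rho,
\end{aligned}$$

for all $g_\pm \in V$, $G \in M_1$ and $(\psi_1, \psi_2, \phi) \in M_2$.

With the same ideas as in Lemmas 5 and 6, Theorem 3 is valid now. We also deduce the stability result as in Theorem 4:

Theorem 5. Assume (61) and (2). (a) (Stability of minimizers). Let $U_0$ be the set of minimizers of $J_0$. Then for every $\epsilon > 0$ there is $\delta > 0$ such that

$$\sup_{0 \le t < \infty} d(u(t), U_0) < \epsilon,$$

provided $d(u(0), U_0) < \delta$, where $u(t)$ is the corresponding constructed weak solution. (b) (Stability of critical points). Let $u_0$ be a critical point of $J_0$. Assume (107) and (47). Then for every $\epsilon > 0$, there is $\delta > 0$ such that if $d(u(0), u_0) < \delta$ and $|||u(0)||| \le C_0$, then the corresponding weak solution $u(t)$ satisfies

$$\sup_{0 \le t < \infty} d(u(t), u_0) < \epsilon.$$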

We also define the neutral pair in this case as in Definition 4 with $h_\pm$ defined in (101). Lemma 8 is valid if $r(v_\theta \pm \alpha)$ is replaced by $v_3 \pm \alpha$ in (93). We define

$$G(\alpha) = \frac{1}{2}\int_\Omega |\nabla\alpha|^2\, d\Omega + \sum_\pm \int_\Omega\int_{\mathbb{R}^3} M_\pm(h_\pm(\alpha, 0, v))\, d\Omega\, dv. \tag{109}$$

If such a neutral pair exists, we have the following reduction:

Theorem 6. Assume (2) and let $\eta_\pm$ be a neutral pair for $\mu_\pm$. If $\alpha(x_1, x_2)$ satisfies

$$-\partial_{x_1 x_1}\alpha - \partial_{x_2 x_2}\alpha - j_3(\alpha, 0) = 0, \tag{110}$$

then $(\alpha, 0)$ is a critical point of $J_0$ in (11). Moreover, there exists a minimizer $\alpha_0$ of $\inf_{\alpha \in H_0^1} G(\alpha)$ such that $\alpha_0$ satisfies (110) and $-\partial_{x_1 x_1} - \partial_{x_2 x_2} - \partial_\alpha j_3(\alpha_0, 0) \ge 0$. In particular, in light of Theorem 5, a non-degenerate minimizer $\alpha_0$ yields a stable steady state solution $(\mu_\pm(h_\pm), \alpha_0, 0)$. This is the case if

$$\sup_{(x_1, x_2) \in \Omega} \partial_\alpha j_3(\alpha_0, 0) < \lambda_0, \tag{111}$$

where $\lambda_0$ is the lowest eigenvalue of $-\partial_{x_1 x_1} - \partial_{x_2 x_2}$ on $\Omega$.

Acknowledgement. The research is supported in part by NSF grant 96-23253 and an NSF Postdoctoral Fellowship. The author thanks S. Tahvildar-Zadeh for helpful discussions.

References

[BF] Batt, J., Fabian, K.: Stationary solutions of the relativistic Vlasov–Maxwell system of plasma physics. Chin. Ann. of Math. 14 B 3, 253–278 (1993)
[BMR] Batt, J., Morrison, P., Rein, G.: Linear stability of stationary solutions of the Vlasov–Poisson system in 3 dimensions. Arch. Rational Mech. Anal. 130, 163–182 (1995)
[BP] Beals, R., Protopopescu, V.: Abstract time-dependent transport equations. J. Math. Anal. Appl. 370–405 (1987)
[BR] Batt, J., Rein, G.: A rigorous stability result for the Vlasov–Poisson system in three dimensions. Anal. di Mat. Pura ed Applicata CLXIV, 133–154 (1993)
[D] Degond, P.: Solutions stationaires explicites du systeme de Vlasov–Maxwell relativiste. C. R. Acad. Sci. Paris, Serie I 310, 607–612 (1990)
[G] Gardner, C.: Bound on the energy available from a plasma. Phys. Fluids 6, 839–840 (1963)
[G1] Guo, Y.: Global weak solutions of the Vlasov–Maxwell system with boundary conditions. Commun. Math. Phys. 154, 145–163 (1993)
[G2] Guo, Y.: Stable magnetic equilibria in collisionless plasmas. Comm. Pure and Applied Math. Vol. L, 0891–0933 (1997)
[G3] Guo, Y.: Singular solutions of the Vlasov–Maxwell system on a half line. Arch. Rat. Mech. Anal. 131, 241–304 (1995)
[G4] Guo, Y.: Regularity for the Vlasov equations in a half space. Indiana Univ. Math. J. 43, 255–320 (1994)
[G5] Guo, Y.: Variational method for stable polytropic galaxies. Preprint 1998
[GMP] Greenberg, W., van der Mee, C., Protopopescu, V.: Boundary Value Problems in Abstract Kinetic Theory. Basel–Boston: Birkhäuser, 1987
[GR] Guo, Y., Ragazzo, C.: On steady states in a collisionless plasma. Comm. Pure Appl. Math. XLIX, 1145–1174 (1996)
[GRe] Guo, Y., Rein, G.: Stable steady states in stellar dynamics. Preprint 1998
[GS1] Guo, Y., Strauss, W.: Nonlinear instability of double-humped equilibria. Ann. IHP, Analyse Nonlineaire 12, 339–352 (1995)
[GS2] Guo, Y., Strauss, W.: Instability of periodic BGK equilibria. Comm. Pure Appl. Math. XLVIII, 861–864 (1995)
[GS3] Guo, Y., Strauss, W.: Unstable BGK solitary waves and collisionless shocks. To appear in Commun. Math. Phys.
[HM] Holm, D., Marsden, J., Ratiu, T., Weinstein, A.: Nonlinear stability of fluid and plasma equilibria. Phys. Rep. 123, 1–116 (1985)
[LP] Lax, P., Phillips, R.S.: Local boundary conditions for dissipative symmetric linear differential operators. Comm. Pure Appl. Math. XIII, 427–455 (1960)
[MP] Marchioro, C., Pulvirenti, M.: A note on the nonlinear stability of a spatially symmetric Vlasov–Poisson flow. Math. Meth. in the Appl. Sci. 8, 284–288 (1986)
[R1] Braasch, P., Rein, G., Vukadinovic, J.: Nonlinear stability of stationary plasmas – an extension of the energy-Casimir method. SIAM J. Appl. Math. To appear
[R2] Rein, G.: Nonlinear stability for the Vlasov–Poisson system – the energy-Casimir method. Math. Meth. in Appl. Sci. 17, 1129–1140 (1994)
[R3] Rein, G.: Existence of stationary, collisionless plasmas in bounded domains. Math. Meth. in Appl. Sci. 15, 365–374 (1992)
[W] Wan, Y. H.: Nonlinear stability of stationary spherically symmetric models in stellar dynamics. Arch. Rat. Mech. Anal. 112, 83–95 (1990)
[Wo] Wolansky, G.: On nonlinear stability of polytropic galaxies. Preprint

Communicated by H. Araki

Commun. Math. Phys. 200, 249 – 274 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

On the Incompressible Fluid Limit and the Vortex Motion Law of the Nonlinear Schrödinger Equation

F.-H. Lin1, J. X. Xin2

1 Courant Institute, New York University, 251 Mercer Street, New York, NY 10012, USA
2 Department of Mathematics, University of Arizona, Tucson, AZ 85721, USA

Received: 1 December 1997 / Accepted: 27 June 1998

Abstract: The nonlinear Schrödinger equation (NLS) has been a fundamental model for understanding vortex motion in superfluids. The vortex motion law has been formally derived on various physical grounds and has been around for almost half a century. We study the nonlinear Schrödinger equation in the incompressible fluid limit on a bounded domain with Dirichlet or Neumann boundary condition. The initial condition contains any finite number of degree ±1 vortices. We prove that the NLS linear momentum weakly converges to a solution of the incompressible Euler equation away from the vortices. If the initial NLS energy is almost minimizing, we show that the vortex motion obeys the classical Kirchhoff law for fluid point vortices. Similar results hold for the entire plane and periodic cases, and a related complex Ginzburg–Landau equation. We treat as well the semi-classical (WKB) limit of NLS in the presence of vortices. In this limit, sound waves propagate through steady vortices.

1. Introduction

We study the two dimensional nonlinear Schrödinger (NLS) equation:

$$i u^\epsilon_t = \Delta_x u^\epsilon + \epsilon^{-2}(1 - |u^\epsilon|^2) u^\epsilon, \quad x \in \Omega, \tag{1.1}$$

where $u^\epsilon = u^\epsilon(t, x)$ is a complex valued function defined for each $t > 0$; $\epsilon$ is a small positive parameter; $x = (x_1, x_2) \in \Omega$, a simply connected bounded domain with smooth boundary in $\mathbb{R}^2$; and $\Delta = \partial_{x_1 x_1} + \partial_{x_2 x_2}$ denotes the two-dimensional Laplacian. The NLS (1.1) has been proposed and studied as the fundamental equation for understanding superfluids, see Ginzburg and Pitaevskii [14], Landau and Lifschitz [19], Donnelly [9], Frisch, Pomeau and Rica [13], Josserand and Pomeau [18], and many others. We shall consider (1.1) with the prescribed Dirichlet boundary condition:

$$u^\epsilon|_{\partial\Omega} = g(x), \quad |g| = 1, \quad \deg(g, \partial\Omega) = \pm n, \tag{1.2}$$

where $n$ is a given positive integer, and the zero Neumann boundary condition:

$$u^\epsilon_\nu|_{\partial\Omega} = 0, \tag{1.3}$$

with $\nu$ the normal direction. Our method is general enough that we can handle the entire plane case ($\Omega = \mathbb{R}^2$) and the periodic case too. We will see that as $\epsilon \downarrow 0$, the Dirichlet boundary condition corresponds to applying a tangential force at the boundary so that the tangential fluid velocity is $g \wedge g_\tau$, with $\tau$ the tangential unit direction. The Neumann boundary condition corresponds to zero normal fluid velocity (no fluid penetration) at the boundary. For ease of presentation, we shall work with the Dirichlet case first, then comment on all necessary modifications in the proof to reach a similar conclusion for the Neumann case. Subsequently, we also remark on the entire plane and periodic cases.

The NLS (1.1) preserves the total energy:

$$E_\epsilon(u^\epsilon) = \int_\Omega e_\epsilon(u^\epsilon) \equiv \int_\Omega \frac{1}{2}|\nabla u^\epsilon|^2 + \frac{(1 - |u^\epsilon|^2)^2}{4\epsilon^2}, \tag{1.4}$$

and admits vortices in solutions, which are points where $|u^\epsilon|$ becomes zero and the phase of $u^\epsilon$ or $\frac{u^\epsilon}{|u^\epsilon|}$ has singularities. These points are the locations of regular fluids, which are surrounded by superfluids. If there are $n$ degree one point vortices in the solution, the energy $E_\epsilon(u^\epsilon)$ has the asymptotic expression:

$$E_\epsilon(u^\epsilon)(t) = E_\epsilon(u^\epsilon)(0) = n\pi \log\frac{1}{\epsilon} + O(1). \tag{1.5}$$

So we shall consider initial data $u^\epsilon(0, x) = u^\epsilon_0(x)$ with $n$ degree one vortices, and belonging to $H^2(\Omega)$ for each $\epsilon > 0$ so that (1.5) holds. With initial and boundary data (1.5) and (1.2), it is well-known [3] that the defocusing NLS (1.1) is globally well-posed in $C(\mathbb{R}^+, H^2) \cap C^1(\mathbb{R}^+, L^2)$ for each $\epsilon > 0$. Our goal is to analyze the limiting behavior of solutions as $\epsilon \downarrow 0$.

The systematic matched asymptotic derivation of the limiting vortex motion law was carried out by Neu [28] for $\Omega = \mathbb{R}^2$. The motion law is the classical Kirchhoff law for fluid point vortices [1], and was known to Onsager [30] in 1949. The connection between Schrödinger equations and the classical fluid mechanics was already noted in 1927 by Madelung [26], which applies to NLS (1.1) as well. Along this line, there have been over the years many formal derivations of Kirchhoff law based on Madelung's fluid mechanical formulation, see Creswick and Morrison [7], Ercolani and Montgomery [11], among others. Madelung's idea was to identify $|u|^2$ as the fluid density $\rho$, and $\nabla\theta = \nabla \arg u$ as the fluid velocity $v$. Then he defined the linear momentum $p = \rho\nabla\theta$. In the new variables $(\rho, v)$, the NLS (1.1) becomes:

$$\rho_t - 2\nabla \cdot p = 0, \tag{1.6}$$

$$p_t - 2\nabla \cdot (\rho v \otimes v) = -\nabla P(\rho) - \frac{1}{2}\nabla \cdot (\rho\, \mathrm{Hess}(\log \rho)), \tag{1.7}$$

where $P = \frac{1}{2\epsilon^2}(1 - \rho^2)$ is the pressure, and Hess denotes the Hessian. Madelung's formulation of course relies on the assumption that the amplitude of $u$ is not zero and the phase $\theta$ is not singular, otherwise the transform is not well-defined and (1.6)–(1.7) gets singular even though NLS itself is still regular. When we are studying solutions with vortices, this singular case is however just what we have to deal with, and so an alternative interpretation of the fluid formalism related to but different from Madelung's

transform must be used instead. In view of the energy functional (1.4), $\rho$ is close to one almost everywhere as $\epsilon \downarrow 0$, and (1.6) implies formally that $\nabla \cdot v = 0$, provided $v$ converges. Hence the limiting problem we are considering is an incompressible fluid limit involving vortices. We also see that the Neumann boundary condition (1.3) says that $\theta_\nu = v \cdot \nu = 0$, if we write $u = \rho^{1/2} e^{i\theta}$ and assume that vortices are away from the boundary (so $\rho \sim 1$). Hence (1.3) reduces to the zero normal velocity boundary condition for ideal classical fluids.

Let us mention that a modified Madelung's transform has been utilized in the study of the semi-classical limit (WKB limit) of NLS:

$$i u^\epsilon_t = \Delta_x u^\epsilon + \epsilon^{-1}|u^\epsilon|^2 u^\epsilon, \tag{1.8}$$

with initial data: $u^\epsilon(0, x) = a_0(x) e^{iS_0(x)/\epsilon}$. Grenier [15] showed in particular that for $a_0$ and $S_0$ in $H^s(\mathbb{R}^d)$, $s > 2 + d/2$, solutions $u^\epsilon$ exist on a small time interval $[0, T]$, $T$ independent of $\epsilon$. Moreover, $u^\epsilon = a(t, x, \epsilon) e^{iS(t, x, \epsilon)/\epsilon}$, with $a$ and $S$ in $L^\infty([0, T]; H^s)$ uniformly in $\epsilon$, and $(\rho, \nabla S)$ converge to the solution $(\rho, v)$ of the isentropic compressible Euler equation:

$$\rho_t + \nabla \cdot (\rho v) = 0, \qquad v_t + \nabla\Big(\frac{|v|^2}{2} + \rho\Big) = 0. \tag{1.9}$$

In one space dimension, using integrable machinery, Jin, Levermore and McLaughlin [17] obtained the above convergence results globally in time. These works on the compressible fluid limit treated only the regime of smooth phase functions, and there are no vortices involved. Since the formation of vortices, their motion, and the resulting drag force are of tremendous physical significance in superfluids [13, 18], it has been a longstanding fundamental problem to understand how to rigorously pass to the classical fluid limit in the presence of vortices.

Our approach begins with writing the conservation laws of NLS in the form of fluid dynamic representation. However, in contrast to all earlier applications of the Madelung transform, we avoid making explicit use of the phase variable $\theta$ and do not work with (1.6)–(1.7). The conservation laws of NLS are put into the form:

• Conservation of mass:

$$\partial_t |u^\epsilon|^2 = 2\nabla \cdot p(u^\epsilon), \tag{1.10}$$

where in vector notation $p(u^\epsilon) = u^\epsilon \wedge \nabla u^\epsilon$, the linear momentum.

• Conservation of linear momentum:

$$\partial_t p(u^\epsilon) = 2\,\mathrm{div}(\nabla u^\epsilon \otimes \nabla u^\epsilon) - \nabla P_\epsilon, \tag{1.11}$$

where

$$P_\epsilon = |\nabla u^\epsilon|^2 + u^\epsilon \cdot \Delta u^\epsilon - \frac{|u^\epsilon|^4 - 1}{2\epsilon^2} \tag{1.12}$$

is the pressure.

• Conservation of energy:

$$\partial_t e_\epsilon(u^\epsilon) = \mathrm{div}(u^\epsilon_t \nabla u^\epsilon). \tag{1.13}$$


Then we study convergence of various terms in (1.10)–(1.11) using the above three conservation laws (in particular the projection of (1.11) onto divergence free fields), and perform various circulation calculations involving the linear momentum $p$ and its first moments. We show that vortices do not move on the slower time scale $t \sim O(\lambda_\epsilon)$, $\lambda_\epsilon \to 0$ as $\epsilon \to 0$, and they move continuously on the scale $t \sim O(1)$. With precise characterization of weak limits of linear momentum $p$, we are able to show that $p$ converges locally in space to $v$, the solution of the two-dimensional incompressible Euler equation away from the $n$ continuously moving point vortices, and moreover, $v$ is curl-free. That $v$ is curl-free away from vortices agrees with the physical picture that superfluids are potential flows [19]. Finally, the motion law of point vortices (the Kirchhoff law) follows from the limiting linear momentum equation. Our main results are:

Theorem 1.1 (Weak convergence and fluid limit). Let us consider NLS (1.1) with Dirichlet boundary condition (1.2), and initial energy (1.5) with $n$ degree $n_j = \pm 1$ vortices. Then as $\epsilon \downarrow 0$, the energy density $e_\epsilon(u^\epsilon)$ concentrates as Radon measures in $M(\Omega)$ for any fixed time $t \ge 0$:

$$\frac{e_\epsilon(u^\epsilon)\, dx}{\pi n \log\frac{1}{\epsilon}} \rightharpoonup \sum_{j=1}^n \delta_{a_j(t)},$$

and vortices of $u^\epsilon$ converge to $a_j(t)$ moving continuously in time on the scale $t \sim O(1)$ (or $t \in [0, T]$, $T$ any fixed constant) as $\epsilon \downarrow 0$. Vortices of $u^\epsilon$ do not move on any slower time scale $t \sim O(\lambda_\epsilon) = o(1)$ (or $t = \lambda_\epsilon \tau$, $\tau \in [0, T]$, $T$ any fixed positive constant, and $\lambda_\epsilon \to 0$) as $\epsilon \downarrow 0$. Moreover on the time scale $t \sim O(1)$, the linear momentum $p(u^\epsilon)$ converges weakly in $L^1([0, T]; L^1_{loc}(\Omega_a))$ to a solution $v$ of the incompressible Euler equation:

$$v_t = 2 v \cdot \nabla v - 2\nabla P, \qquad \mathrm{div}\, v = 0, \qquad x \in \Omega_a \equiv \{\Omega \setminus (a_1(t), \cdots, a_n(t))\}$$

with boundary condition: $v \cdot \tau = g \wedge g_\tau$, $\tau$ the unit tangential vector on $\partial\Omega$. The function $v$ is precisely characterized as: $v = \nabla(\Theta_a + h_a)$, where

$$\Theta_a = \sum_{j=1}^n \arg\Big(\frac{x - a_j(t)}{|x - a_j(t)|}\Big)^{n_j},$$

and $h_a$ is harmonic on $\Omega$ satisfying the boundary condition: $h_{a,\tau} = -\Theta_{a,\tau} + g \wedge g_\tau$, on $\partial\Omega$. So $h$ is unique up to an additive constant. The total pressure $2P$ is a single-valued function on $\Omega$, and is smooth on $\Omega_a$. The quadratic tensor product weakly converges as:

$$\nabla u^\epsilon \otimes \nabla u^\epsilon \rightharpoonup v \otimes v + \mu, \quad \text{in } M(\Omega_a), \tag{1.14}$$

where µ is a symmetric tensorial Radon defect measure of finite mass over ; and div(µ) = ∇Pµ on a , where Pµ is a well-defined distribution function on a . Theorem 1.2 (Vortex motion law). Consider the same assumptions as in Theorem 1.1, and in addition assume that the initial NLS energy is almost minimizing, namely E (u )(0) = nπ log

1 + πW (a(0)) + o(1),

Incompressible Fluid Limit and the Vortex Motion Law of NLS Equation


as $\varepsilon$ goes to zero. Let $H_j = H_j(a)$, $a = (a_1,\dots,a_n)$, denote the smooth part of $\Theta_a + h_a$ near each vortex, and define the renormalized energy function as:
$$\nabla_{a_j} W(a) = 2n_j\left(-\frac{\partial H_j}{\partial x_2}(a_j),\ \frac{\partial H_j}{\partial x_1}(a_j)\right), \qquad j = 1,\dots,n.$$
The vortex motion obeys the classical Kirchhoff law:
$$a_j'(t) = n_j J\nabla_{a_j} W(a) = -2\nabla H_j(a), \qquad j = 1,\dots,n,$$
where
$$J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$
and
$$W(a) = -\sum_{l\neq j} n_l n_j \log|a_l - a_j| + \text{boundary contributions}.$$
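As an illustration of the Kirchhoff law in Theorem 1.2, the following Python sketch integrates $a_j' = n_j J\nabla_{a_j}W(a)$ with a classical fourth-order Runge–Kutta step. It assumes the whole-plane form of $W$ with the boundary contributions dropped, which is a simplification and not the bounded-domain setting of the theorem; the function names (`kirchhoff_rhs`, `rk4_step`, `simulate`, `renormalized_energy`) are hypothetical and chosen only for this example.

```python
import math


def renormalized_energy(positions, degrees):
    """Whole-plane W(a) = -sum_{l != j} n_l n_j log|a_l - a_j| (boundary terms omitted by assumption)."""
    total = 0.0
    for j, (xj, yj) in enumerate(positions):
        for l, (xl, yl) in enumerate(positions):
            if l != j:
                total -= degrees[l] * degrees[j] * math.log(math.hypot(xl - xj, yl - yj))
    return total


def kirchhoff_rhs(positions, degrees):
    """Velocities a_j' = n_j J grad_{a_j} W with J = [[0, -1], [1, 0]]."""
    velocities = []
    for j, (xj, yj) in enumerate(positions):
        gx = gy = 0.0
        for l, (xl, yl) in enumerate(positions):
            if l == j:
                continue
            dx, dy = xj - xl, yj - yl
            r2 = dx * dx + dy * dy
            # grad_{a_j} W = -2 n_j sum_{l != j} n_l (a_j - a_l) / |a_j - a_l|^2
            gx += -2.0 * degrees[j] * degrees[l] * dx / r2
            gy += -2.0 * degrees[j] * degrees[l] * dy / r2
        # J maps (gx, gy) to (-gy, gx)
        velocities.append((degrees[j] * (-gy), degrees[j] * gx))
    return velocities


def rk4_step(positions, degrees, dt):
    """One classical Runge-Kutta step for the vortex ODE system."""
    def shifted(base, slope, scale):
        return [(x + scale * kx, y + scale * ky) for (x, y), (kx, ky) in zip(base, slope)]

    k1 = kirchhoff_rhs(positions, degrees)
    k2 = kirchhoff_rhs(shifted(positions, k1, dt / 2), degrees)
    k3 = kirchhoff_rhs(shifted(positions, k2, dt / 2), degrees)
    k4 = kirchhoff_rhs(shifted(positions, k3, dt), degrees)
    return [
        (x + dt / 6 * (a[0] + 2 * b[0] + 2 * c[0] + d[0]),
         y + dt / 6 * (a[1] + 2 * b[1] + 2 * c[1] + d[1]))
        for (x, y), a, b, c, d in zip(positions, k1, k2, k3, k4)
    ]


def simulate(positions, degrees, t_final, steps):
    """Integrate the vortex positions from time 0 to t_final."""
    dt = t_final / steps
    for _ in range(steps):
        positions = rk4_step(positions, degrees, dt)
    return positions
```

For two vortices of degree $+1$ placed at $(\pm 1, 0)$, this sketch produces rigid rotation about the midpoint with the separation preserved. The value of $W$ is conserved along the computed flow, as expected from the antisymmetry of $J$.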

We remark that the total initial NLS energy $E_\varepsilon(u_\varepsilon)$ in (1.5) can be decomposed into a sum of three parts: the vortex self-energy $n\pi\log\frac{1}{\varepsilon}$, the Kirchhoff energy $\pi W(a(0))$, and, in general, a remaining $O(1)$ excess energy. The Kirchhoff energy drives the vortex motion. The remaining energy creates the defect measure $\mu$. The total pressure consists of the contribution from the original NLS pressure and the contribution from the defect measure (the defect pressure). If the excess energy is absent, or in other words the initial energy satisfies
$$E_\varepsilon(u_\varepsilon)(0) = n\pi\log\frac{1}{\varepsilon} + \pi W(a(0)) + o(1), \tag{1.15}$$
which also means that $u_\varepsilon$ is almost energy minimizing for the given vortex locations, then the linear momentum $p(u_\varepsilon)$ converges strongly in $L^1([0,T]; L^1_{loc}(\Omega_a))$ and the defect measure $\mu = 0$. In general, with $O(1)$ excess energy, proving the same motion law requires further information on $\mu$: either that the divergence of the defect measure $\mu$ is the gradient of a distribution on the entire domain (i.e. is globally curl-free as a distribution), or that the support of $\mu$ is away from the vicinities of the vortex locations. Physically, the excess energy is carried by sound waves (time-dependent phase waves); see the discussion of the WKB limit in Sect. 7. It is conceivable that vortices still move according to the Kirchhoff law once sound waves have propagated away from them, being absorbed either by the vortex cores or by the physical boundary. Otherwise, sound waves may modify the motion of vortices by creating oscillations [13]. It is very interesting to understand the vortex-sound interaction (Nore et al. [29]) in terms of the structure of the defect measure $\mu$ based on our results here.

Due to the local nature of our method, we are able to prove the same theorems for the zero Neumann case (1.3), with the modification that the boundary condition is instead $v\cdot\nu = 0$, and $h_{a,\nu} = -\Theta_{a,\nu}$. Similar results are established for the entire plane and the periodic cases, as long as the sum of the vortex degrees is zero and the total energy obeys (1.5). Our results on the Dirichlet and Neumann cases easily extend to the situation where there are $2k + n$ vortices in a bounded domain, $n + k$ being of degree $+1$, and $k$ of degree $-1$. Due to the possibility of finite time vortex collisions in the Kirchhoff law in the case of signed vortices [27], the results are meant for any time before any two vortices come together. It is remarkable that NLS vortices obey the Kirchhoff law in the incompressible fluid limit, considering that the $\pm 1$ vortices are only known to be dynamically marginally


stable in the spectral sense; see Weinstein and Xin [32]. For this reason, it seems impossible to prove the validity of the motion law for the above mentioned initial and boundary conditions by attempting to justify the matched asymptotic derivation of Neu [28], which relied on linearization about vortices. The fluid dynamic approach developed here has been extended by the authors [25] to establish the vortex motion laws of the analogous nonlinear wave (NLW) equation and the nonlinear heat (NLH) equation. In NLW and NLH, Euler-like equations also appear and lead to the motion laws. Under a similar energy almost minimizing assumption (1.15), the NLW vortex motion law is $a_j'' = -n_j\nabla_{a_j}W$, on the time scale $t \sim O(\log^{1/2}\frac{1}{\varepsilon})$. During the preparation of this paper, we learned of the work of Colliander and Jerrard [5] on the periodic case of NLS. They showed the motion law under the energy almost minimizing assumption; however, they did not study the defect measure and the general fluid limit.

The rest of the paper is organized as follows. In Sect. 2, we state and prove energy concentration, and show its direct consequences for the convergence of the linear momentum away from vortices and for basic energy type bounds. In Sect. 3, we study the mobility and continuity of vortex locations based on the linear momentum equation, and subsequently refine the form of the weak limit of solutions based on conservation of mass. We also prove a key energy estimate which is used later to control the defect measure. In Sect. 4, we show, using all results in the previous sections, that the NLS linear momentum converges to a solution of the two-dimensional incompressible Euler equation away from vortices. The Kirchhoff law then follows from the limiting linear momentum equation under the energy minimizing assumption. In Sect. 5, we comment on all necessary modifications to establish similar results for the zero Neumann case, as well as the entire plane and periodic cases. In Sect. 6, we apply our method to show the vortex motion law for a related complex Ginzburg–Landau (CGL) equation. Besides the interest of CGL vortices in their own right, this result provides another proof of the NLS vortex motion law by passing to the CGL to NLS limit. In Sect. 7, we study the semi-classical (WKB) limit of NLS. Due to the slow time scale $O(\varepsilon)$, vortices do not move, and the regular part of the phase function of the solution satisfies the linear wave equation, indicating the propagation of sound waves through vortices.

2. Energy Concentration and Basic Weak Limits

In this section, we present weak convergence results on two basic physical quantities: the energy $e_\varepsilon(u_\varepsilon)$ and the linear momentum $p(u_\varepsilon)$. Consequently, we deduce the weak convergence of the curl of $p(u_\varepsilon)$. One half of the curl of $p(u_\varepsilon)$ is equal to the Jacobian of the map $u_\varepsilon$; hence it will be denoted by $\mathrm{Jac}(u_\varepsilon)$, and it is also known as the vorticity. All the results follow from energy concentration and energy comparisons, and are independent of the dynamics.

Lemma 2.1. Suppose $u_{\varepsilon_k}$ is a sequence of $H^1$-maps from $\Omega$ into $\mathbb{C}$ (the complex plane) satisfying the Dirichlet boundary condition $u_{\varepsilon_k}|_{\partial\Omega} = g$. Suppose also that for a positive $k$-independent constant $C_0$ the energy satisfies:
$$E_{\varepsilon_k}(u_{\varepsilon_k}) = \int_\Omega e_{\varepsilon_k}(u_{\varepsilon_k}) \equiv \int_\Omega \left[\frac{1}{2}|\nabla u_{\varepsilon_k}|^2 + \frac{(1 - |u_{\varepsilon_k}|^2)^2}{4\varepsilon_k^2}\right] \le \pi n\log\frac{1}{\varepsilon_k} + C_0.$$
Then, taking a subsequence in $k$ if necessary, we have as $\varepsilon = \varepsilon_k \downarrow 0$ that
$$\frac{e_\varepsilon(u_\varepsilon)\,dx}{\pi n\log\frac{1}{\varepsilon}} \rightharpoonup \sum_{j=1}^{n}\delta_{a_j}, \tag{2.1}$$
as Radon measures. Moreover,
$$\min\{|a_l - a_j|,\ \mathrm{dist}(a_l,\partial\Omega),\ l,j = 1,\dots,n,\ l\neq j\} \ge \delta_0(g,\Omega,C_0) > 0.$$

Proof. This lemma is the same as Proposition 1 of Lin [23], where the earlier structure theorem of Lin [20] (Theorem 2.4) is extended to show that there are small positive numbers $\varepsilon_0$ and $\alpha_0$ such that for $\varepsilon_k \in (0,\varepsilon_0)$, there are $n$ distinct balls $B_j$ with radii $\varepsilon_k^{\alpha_j}$, $\alpha_j \in [\alpha_0, 1/2]$, which contain vortices of degrees $\pm 1$. In other words, vortex locations are known up to an error of $O(\varepsilon_k^{\alpha_j})$.

Lemma 2.2. Under the assumptions of Lemma 2.1, we have, up to a subsequence if necessary:
$$u_\varepsilon \rightharpoonup \prod_{j=1}^{n}\left(\frac{x - a_j}{|x - a_j|}\right)^{n_j} e^{ih_a(x)} \equiv u_a, \tag{2.2}$$
$n_j = \pm 1$, weakly in $H^1_{loc}(\Omega\setminus\{a_1,\dots,a_n\}) \equiv H^1_{loc}(\Omega_a)$ for some $h_a \in H^1(\Omega)$. Moreover,
$$\int_\Omega |\nabla h_a|^2 \le C_1, \tag{2.3}$$
$$\int_\Omega \frac{(1 - |u_\varepsilon|^2)^2}{\varepsilon^2} \le C_1, \tag{2.4}$$
$$\int_\Omega |\nabla|u_\varepsilon||^2 \le C_1, \tag{2.5}$$
for a positive constant $C_1$, uniformly in $\varepsilon$.

Proof. These results follow from energy comparisons. For the weak convergence (2.2) and inequality (2.3), see the general convergence theorem of [20] and also Proposition 2 of [23]. The inequality (2.4) is shown in Lecture 1 of [21]. For (2.5), we use the fact that $\nabla|u_\varepsilon| = 0$ a.e. on the set $\{x\in\Omega : |u_\varepsilon| = 0\}$, and write $u_\varepsilon = |u_\varepsilon|e^{iH_\varepsilon}$ whenever $|u_\varepsilon| \neq 0$. Substituting this expression into the total energy, which is uniformly bounded away from the set $\{x\in\Omega : |u_\varepsilon| = 0\}$, gives (2.5). Intuitively, the singular part of the energy that contributes to $n\pi\log\frac{1}{\varepsilon}$ comes from the singular part of the phase of $u_\varepsilon$ (the sum of vortex phases). The above three inequalities are valid since they involve only the amplitude $|u_\varepsilon|$ or the regular part of the phase $h_a$.

Remark 2.1. Under the same assumptions as in Lemma 2.1, the renormalized energy is defined as ($\gamma$ a universal constant):
$$W = W(a_1,\dots,a_n) = \lim_{r\downarrow 0}\left[\frac{1}{2\pi}\int_{\Omega\setminus\bigcup_{j=1}^{n}B_r(a_j)} |\nabla u_a|^2 - n\log 1/r\right] + \gamma n, \tag{2.6}$$
see Bethuel, Brezis and Hélein [2]. Here $u_a$ is a harmonic map of the form (2.2). The function $W$ has the following properties: $W \to +\infty$ if some $a_j$ reaches the boundary $\partial\Omega$ or $a_j = a_l$ for some $j\neq l$; otherwise, it is locally analytic in $a$. Due to $\gamma n$, $W(a)$ is also local energy minimizing.
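The logarithmic self-energy appearing in the hypothesis of Lemma 2.1 can be seen on a simple model profile. The Python sketch below evaluates the energy $\int_{B(0,1)} \frac12|\nabla u|^2 + \frac{(1-|u|^2)^2}{4\varepsilon^2}$ for the ansatz $u(x) = f(|x|/\varepsilon)e^{i\theta}$ with $f(s) = s/\sqrt{1+s^2}$. This profile is an assumption made purely for illustration (it is not the energy minimizer and does not appear in the text), and the function name `model_vortex_energy` is hypothetical. For this ansatz the energy equals $\pi\log\frac{1}{\varepsilon} + \frac{\pi}{2} + o(1)$, which is of the form $\pi n\log\frac{1}{\varepsilon} + C_0$ with $n = 1$.

```python
import math


def model_vortex_energy(eps, intervals=4000):
    """Energy on B(0,1) of u = f(r/eps) e^{i theta}, f(s) = s / sqrt(1 + s^2) (illustrative ansatz)."""
    s_max = 1.0 / eps
    y_max = math.asinh(s_max)

    def integrand(y):
        # Substitution s = sinh(y): 1 + s^2 = cosh(y)^2 and s ds = sinh(y) cosh(y) dy.
        c, sh = math.cosh(y), math.sinh(y)
        radial = 0.5 / c ** 6      # (1/2) f'(s)^2 with f'(s) = (1 + s^2)^(-3/2)
        angular = 0.5 / c ** 2     # (1/2) f(s)^2 / s^2
        potential = 0.25 / c ** 4  # (1/4) (1 - f(s)^2)^2
        return (radial + angular + potential) * sh * c

    # Composite Simpson rule on [0, y_max]; `intervals` must be even.
    h = y_max / intervals
    total = integrand(0.0) + integrand(y_max)
    for k in range(1, intervals):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return 2.0 * math.pi * total * h / 3.0


if __name__ == "__main__":
    for eps in (1e-2, 1e-3, 1e-4):
        excess = model_vortex_energy(eps) - math.pi * math.log(1.0 / eps)
        print(f"eps={eps:g}  E - pi*log(1/eps) = {excess:.6f}")
```

The printed excess stays bounded as $\varepsilon$ decreases, which is the behaviour that the constant $C_0$ in Lemma 2.1 is meant to capture.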


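Lemma 2.3. Under the same assumptions as Lemma 2.1, the linear momentum $p(u_\varepsilon)$ is uniformly bounded in $L^1_{loc}(\Omega_a)$, and, up to a subsequence if necessary:
$$p(u_\varepsilon) \rightharpoonup v = \nabla\Theta_a + \nabla h_a, \tag{2.7}$$
in $L^1_{loc}(\Omega_a)$, where
$$\Theta_a = \sum_{j=1}^{n}\arg\left(\frac{x - a_j}{|x - a_j|}\right)^{n_j}. \tag{2.8}$$
Moreover,
$$2\,\mathrm{Jac}(u_\varepsilon)\,dx = \mathrm{curl}(p(u_\varepsilon))\,dx \rightharpoonup 0, \tag{2.9}$$
in the sense of bounded measures $\mathcal{M}(\Omega_a)$.

Proof. We see from Lemmas 2.1 and 2.2 that $p(u_\varepsilon)$ is uniformly bounded in $L^1$ away from the vortices $\{a_1,\dots,a_n\}$. Since $\nabla u_\varepsilon$ is weakly compact in $H^1(\Omega_a)$, and $u_\varepsilon$ compact in $L^2(\Omega_a)$, we have:
$$p(u_\varepsilon) = u_\varepsilon\wedge\nabla u_\varepsilon \rightharpoonup v = \nabla\Theta_a + \nabla h_a,$$
in $L^1_{loc}(\Omega_a)$. Noticing that $v$ is the gradient of an $H^1$ function, we have, by taking the curl of $p(u_\varepsilon)$ and using the weak continuity of Jacobians with respect to $H^1$ weak convergence, that
$$2\,\mathrm{Jac}(u_\varepsilon)\,dx = \mathrm{curl}\,p(u_\varepsilon)\,dx \rightharpoonup 0, \quad \text{in } \mathcal{M}(\Omega_a). \tag{2.10}$$
Note that $\mathrm{Jac}(u_\varepsilon) \in L^1_{loc}(\Omega_a)$. The proof is complete.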

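Lemma 2.4. The linear momentum $p(u_\varepsilon) \in L^1(\Omega)$ uniformly in $\varepsilon$. Let $\varphi \in C_0^\infty(\Omega)$, $\varphi = x_1$ for $x \in B_{R/2}(a_j)$, $\varphi = 0$ for $x \notin B_R(a_j)$, where $R \in (0,\delta_0)$. Then we have, with $a_j = (\xi_j,\eta_j)$:
$$\int_{B_R(a_j)} \nabla^\perp\varphi\cdot p(u_\varepsilon) \to 2\pi\xi_j. \tag{2.11}$$
A similar convergence holds with $x_2$ in place of $x_1$, and $\eta_j$ in place of $\xi_j$.

Proof. The integral in (2.11) is the projection of the linear momentum onto a divergence-free field. We have from Lemma 2.2 that $|u_\varepsilon| \in H^1(\Omega)$, uniformly in $\varepsilon$. Hence $|u_\varepsilon| \in L^q(\Omega)$, uniformly in $\varepsilon$, for any $q < \infty$ by the Gagliardo–Nirenberg inequality. We shall establish that $\nabla u_\varepsilon \in L^{p'}(\Omega)$, uniformly in $\varepsilon$, for $p' \in [1,2)$. Given this fact, $p(u_\varepsilon) = u_\varepsilon\wedge\nabla u_\varepsilon \in L^r(\Omega)$, uniformly in $\varepsilon$, for any $r \in [1,2)$. This and Lemma 2.3 imply that:
$$\begin{aligned}
\int_{B_R}\nabla^\perp\varphi\cdot p(u_\varepsilon) &\to \int_{B_R}\nabla^\perp\varphi\cdot(\nabla\Theta_a + \nabla h_a) = \int_{B_R}\nabla^\perp\varphi\cdot\nabla\theta_j \\
&= \int_{B_{\varepsilon_0}(a_j)}\nabla^\perp\varphi\cdot\nabla\theta_j + \int_{\partial B_{\varepsilon_0}} x_1\,\partial_\tau\theta_j,
\end{aligned}$$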


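where $B_{\varepsilon_0}$ is a small ball of radius $\varepsilon_0$ about $a_j$, and $\partial_\tau$ is the tangential derivative. The first integral clearly goes to zero as $\varepsilon_0 \to 0$, and the second integral goes to $2\pi\xi_j$ by a direct calculation. The convergence (2.11) follows.

Now we show that $\nabla u_\varepsilon \in L^{p'}(\Omega)$, uniformly in $\varepsilon$, for $p' \in [1,2)$, by an energy argument. It is sufficient to consider a finite neighborhood of a single vortex, say of degree plus one. Without loss of generality, we can assume that the essential zero of $u_\varepsilon$ is inside $B(0,\varepsilon^\alpha)$, for some $\alpha \in (1/4,1/2)$, and that $B(0,1)$ is inside $\Omega$ and contains the essential zero. We have then from Lin [20]:
$$\begin{aligned}
& E_\varepsilon(u_\varepsilon, B(0,1)) \le \pi\log\frac{1}{\varepsilon} + C_1, \\
& \varepsilon^\alpha\int_{\partial B(0,\varepsilon^\alpha)} e_\varepsilon(u_\varepsilon) \le C_2(\alpha, C_1), \\
& \deg(u_\varepsilon/|u_\varepsilon|,\ \partial B(0,\varepsilon^\alpha)) = 1.
\end{aligned} \tag{2.12}$$
It follows from (2.12) that there exist a $\theta_\varepsilon \in (1/4,1/2)$ and a constant $\varepsilon_0(C_1)$ such that if $\varepsilon \le \varepsilon_0(C_1)$:
$$\int_{B(0,1)\setminus B(0,\theta_\varepsilon)} e_\varepsilon(u_\varepsilon) \ge \pi\log\frac{1}{\theta_\varepsilon} - C_0. \tag{2.13}$$
In fact, $u_\varepsilon \rightharpoonup e^{i(\Theta + h)}$ in $H^1_{loc}(B(0,1)\setminus 0)$; there exists $\theta_\varepsilon \in (1/4,1/2)$ such that $u_\varepsilon \rightharpoonup e^{i(\Theta + h)}$ in $H^1(\partial B(0,\theta_\varepsilon))$, and $\theta_\varepsilon\int_{\partial B(0,\theta_\varepsilon)} e_\varepsilon(u_\varepsilon) \le C(C_1)$. So $\int_{B(0,1)\setminus B(0,\theta_\varepsilon)} e_\varepsilon(u_\varepsilon) \le C$. Now, as in Lin [20], replace $u_\varepsilon$ by the minimizer $\tilde u_\varepsilon$ of the energy $\int_{B(0,1)\setminus B(0,\theta_\varepsilon)} e_\varepsilon(u)$ subject to the Dirichlet boundary condition $\tilde u_\varepsilon = u_\varepsilon$ on $\partial B(0,\theta_\varepsilon)$, and the zero Neumann condition on $\partial B(0,1)$. Such a minimizer satisfies $|\tilde u_\varepsilon| \ge 1/2$ on $B(0,1)\setminus B(0,\theta_\varepsilon)$ and
$$\int_{B(0,1)\setminus B(0,\theta_\varepsilon)} e_\varepsilon(\tilde u_\varepsilon) \ge \pi\log\frac{1}{\theta_\varepsilon} - C_0, \tag{2.14}$$
proving (2.13). Combining (2.13) and (2.12), we have:
$$\int_{B(0,\theta_\varepsilon)} e_\varepsilon(u_\varepsilon) \le \pi\log\frac{\theta_\varepsilon}{\varepsilon} + C_1 + C_0. \tag{2.15}$$
Now we iterate (2.15) over a sequence of balls $B(0,r^{(n)})$, $r^{(n)} = \theta^{(1)}\cdots\theta^{(n-1)}$, $\theta^{(1)} = \theta_\varepsilon$, with $\theta^{(j)} \in (1/4,1/2)$, $n = 1,2,\dots,N$, where $N$ is such that $r^{(N)} \ge 2\varepsilon^\alpha$. At each $n$, the lower energy bound on the annuli becomes:
$$\int_{B(0,r^{(n)})\setminus B(0,r^{(n+1)})} e_\varepsilon(u_\varepsilon) \ge \pi\log\frac{1}{\theta^{(n+1)}} - C_0\frac{\varepsilon}{r^{(n)}}, \tag{2.16}$$
and the upper bound is:
$$\int_{B(0,r^{(n)})} e_\varepsilon(u_\varepsilon) \le \pi\log\frac{r^{(n)}}{\varepsilon} + C_1 + C_0\Big(1 + \varepsilon\sum_{j=1}^{n} 1/r^{(j)}\Big). \tag{2.17}$$
The sum in the second term in (2.17) is bounded from above by a geometric sum since $\theta^{(j)} \in (1/4,1/2)$, and its upper bound is $\mathrm{const.}\,\varepsilon^{-\alpha}$. Hence the energy upper bound finally is: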


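$$\int_{B(0,r)} e_\varepsilon(u_\varepsilon) \le \pi\log\frac{r}{\varepsilon} + C_1 + C_3\varepsilon^{1-\alpha} \le \pi\log\frac{r}{\varepsilon} + C_1 + 2C_0, \tag{2.18}$$
for small $\varepsilon$, and $r \in (2\varepsilon^\alpha, 1)$. With a similar argument via the energy minimizer, we also have:
$$\int_{B(0,r')} e_\varepsilon(u_\varepsilon) \ge \pi\log\frac{r'}{\varepsilon} - C_4, \tag{2.19}$$
for any $r' \in (2\varepsilon^\alpha, 1)$. Combining (2.18) and (2.19), we infer that for $r \ge 2\varepsilon^\alpha$:
$$\int_{B(0,2r)\setminus B(0,r)} e_\varepsilon(u_\varepsilon) \le C_5. \tag{2.20}$$
Now we bound, for any $p' \in [1,2)$ (with $2^{N+1}\varepsilon^\alpha \in (1/2, 2/3)$), using the Hölder inequality:
$$\begin{aligned}
\int_{B(0,1/2)} |\nabla u_\varepsilon|^{p'} &\le \int_{B(0,2\varepsilon^\alpha)} |\nabla u_\varepsilon|^{p'} + \sum_{j=1}^{N}\int_{B(0,2^{j+1}\varepsilon^\alpha)\setminus B(0,2^{j}\varepsilon^\alpha)} |\nabla u_\varepsilon|^{p'} \\
&\le \left(2\int_{B(0,2\varepsilon^\alpha)} e_\varepsilon(u_\varepsilon)\right)^{p'/2} c_{p'}\,\varepsilon^{(2-p')\alpha} + \sum_{j=1}^{N} c(p',C_5)\,\big|B(0,2^{j+1}\varepsilon^\alpha)\setminus B(0,2^{j}\varepsilon^\alpha)\big|^{(2-p')/2} \\
&\le o(1) + c(p',C_5)(3\pi)^{(2-p')/2}\sum_{j=1}^{N}(2^{j}\varepsilon^\alpha)^{2-p'} \le C_6(p',C_5).
\end{aligned} \tag{2.21}$$
The proof is complete.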
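The last step of (2.21) relies on the dyadic sum $\sum_{j=1}^{N}(2^{j}\varepsilon^\alpha)^{2-p'}$ being bounded uniformly in $\varepsilon$. The Python sketch below checks this numerically. It assumes a simplified stopping rule, namely that $N$ is the largest integer with $2^{N+1}\varepsilon^\alpha \le 1$, which is slightly different from the normalization used in the proof; the names `dyadic_sum` and `geometric_bound` are hypothetical.

```python
def dyadic_sum(eps, alpha, p_prime):
    """Sum of (2^j eps^alpha)^(2 - p') over j = 1..N, with N maximal such that 2^(N+1) eps^alpha <= 1."""
    base = eps ** alpha
    total, j = 0.0, 1
    while 2 ** (j + 1) * base <= 1.0:
        total += (2 ** j * base) ** (2.0 - p_prime)
        j += 1
    return total


def geometric_bound(p_prime):
    """Upper bound independent of eps: the largest term is at most (1/2)^(2 - p'),
    and consecutive terms decrease by the ratio q = 2^-(2 - p') < 1."""
    q = 2.0 ** (-(2.0 - p_prime))
    return 0.5 ** (2.0 - p_prime) / (1.0 - q)


if __name__ == "__main__":
    for p_prime in (1.0, 1.5, 1.9):
        sums = [dyadic_sum(eps, 0.3, p_prime) for eps in (1e-2, 1e-4, 1e-8)]
        print(p_prime, [round(s, 4) for s in sums], round(geometric_bound(p_prime), 4))
```

The bound grows without limit as $p' \to 2$, which is consistent with the restriction $p' \in [1,2)$ in Lemma 2.4.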

3. Mobility and Continuity of Vortex Motion

In the previous section, we obtained in Lemma 2.2 the weak limit of solutions based on energy considerations. Due to conservation of energy, Lemma 2.2 applies to each time slice of the evolution, and so Lemma 2.2 holds with $a_j = a_j(t)$ and $h_a = h_a(t,x)$. In this section, we shall utilize the conservation of linear momentum to show the mobility and continuity of vortex motion. With the additional help of conservation of mass, we also refine the weak limit of the solution $u_\varepsilon$, in that we find out how the function $h$ depends on the vortex locations $a_j$, and that it is harmonic in space. Subsequently, we also prove a key energy estimate for the later analysis of the defect measure.

Proposition 3.1. The vortices in $u_\varepsilon$ do not move on any slower time scale $t \sim o(1)$, as $\varepsilon \to 0$. On the time scale $t \sim O(1)$, the vortex locations $a_{\varepsilon,j}(t)$ are uniformly continuous in $t$ as $\varepsilon \to 0$.

Proof. By Lemma 2.2:
$$u_\varepsilon(0,x) \rightharpoonup \prod_{j=1}^{n}\frac{x - a_j^0}{|x - a_j^0|}\,e^{ih_0(x)},$$
in $H^1_{loc}(\Omega_{a^0})$ with $\|h_0\|_{H^1(\Omega)} \le C_0$. Let $R > 0$ be a small number, $R \ll \frac{1}{4}R_0$, where
$$R_0 = \min\{|a_l - a_j|,\ \mathrm{dist}(a_l,\partial\Omega),\ l,j = 1,\dots,n,\ l\neq j\}.$$
Due to energy conservation, the number $R_0$ remains positive for all time. Let $t_\varepsilon$ be such that for all $t \in [0,t_\varepsilon)$, $u_\varepsilon(t,x)$ has vortices inside $\bigcup_{l=1}^{n} B_{R/4}(a_l^0)$, and $t_\varepsilon$ is the maximum time with this property. In other words, for some $j$, $a_{\varepsilon,j}(t_\varepsilon) \in \partial B_{R/4}(a_j^0)$. By the $H^1$ continuity of $u_\varepsilon$ in time for each $\varepsilon > 0$, such a $t_\varepsilon > 0$ exists. We prove that $\liminf_{\varepsilon\to 0^+} t_\varepsilon > 0$. Suppose otherwise; then, at least for a subsequence of $\varepsilon$, still denoted the same, $t_\varepsilon \to 0$. Write $v_\varepsilon(t,x) = u_\varepsilon(x, t_\varepsilon t)$; then the NLS for $v_\varepsilon$ becomes
$$i v_{\varepsilon,t} = t_\varepsilon\Delta v_\varepsilon + \frac{t_\varepsilon}{\varepsilon^2}(1 - |v_\varepsilon|^2)v_\varepsilon,$$
and the linear momentum equation becomes
$$\partial_t p(v_\varepsilon) = 2t_\varepsilon\,\mathrm{div}(\nabla v_\varepsilon\otimes\nabla v_\varepsilon) - \nabla(t_\varepsilon P_\varepsilon). \tag{3.1}$$
The vortices of $v_\varepsilon$ lie in $\bigcup_{l=1}^{n} B_{R/4}(a_l^0)$ for all $t \in [0,1)$, and at $t = 1$, one of the vortices, say $a_{\varepsilon,j}(1)$, reaches $\bigcup_{l=1}^{n}\partial B_{R/4}(a_l^0)$. The vortex locations are well-defined up to a small error of $O(\varepsilon^{\alpha_0})$. With no loss of generality, let us assume that $a_{\varepsilon,j}(0) = 0$. Let $\varphi \in C_0^\infty(B_{R_0/2})$, with $\varphi = x_1$ for $x \in B_{R_0/4}$. Multiplying both sides of (3.1) by $\nabla^\perp\varphi$ and integrating over $B_{R_0/2}\times[0,1]$, we obtain with integration by parts:
$$\int_{B_{R_0/2}}\nabla^\perp\varphi\cdot p(v_\varepsilon)\Big|_0^1 = -2t_\varepsilon\int_0^1 dt\int_{B_{R_0/2}}(\nabla v_\varepsilon\otimes\nabla v_\varepsilon):\nabla\nabla^\perp\varphi. \tag{3.2}$$

The right side integral is in fact over $B_{R_0/2}\setminus B_{R_0/4}$, hence is uniformly bounded by a constant $C$ independent of $\varepsilon$. Passing $\varepsilon \downarrow 0$, by Lemma 2.4, the left-hand side converges to $2\pi(\xi_j(1) - \xi_j(0))$. Since $t_\varepsilon \to 0$, $\xi_j(1) = \xi_j(0)$. Similarly, $\eta_j(1) = \eta_j(0)$, contradicting the assumption that $a_j$ travels a distance $R/4$ at $t = 1$. Hence $t_\varepsilon$ is bounded away from zero uniformly in $\varepsilon$. Since $R$ can be any small number, we have proved that the vortices $a_{\varepsilon,l}(t)$, $l = 1,\dots,n$, are uniformly continuous in $t$, or that the limiting locations $a_l(t)$ are continuous in $t$. As a byproduct, we have also shown that vortices in $u_\varepsilon$ do not move on any slow time scale $t \sim o(1)$ as $\varepsilon \to 0$.

Replacing $t_\varepsilon$ by $t = O(1)$ in the above proof, we in fact have shown that:

Corollary 3.1. On the time scale $t \sim O(1)$, the limiting vortex locations $a_l(t)$ are Lipschitz continuous, where $l = 1,\dots,n$.

Now let us characterize the function $h_a = h_a(t,x)$ in:

Proposition 3.2. The function $h_a(t,x)$ in the weak limit (2.2) of Lemma 2.2 satisfies:
$$\Delta h_a = 0, \quad x \in \Omega, \qquad h_{a,\tau} = -\Theta_{a,\tau} + g\wedge g_\tau, \quad x \in \partial\Omega, \tag{3.3}$$
where $\Theta_a$ is given in (2.8). So $h_a$ is unique up to an additive constant, and depends on time via the vortex locations $a_j(t)$.


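Proof. By Lemma 2.3 and dominated convergence, for any function $\psi_1(x) \in C_0^\infty(\Omega_a)$ and $\varphi(t) \in C_0^\infty((0,T))$, we have:
$$\lim_{\varepsilon\to 0}\int_0^T\varphi(t)\int_{\Omega_a} p(u_\varepsilon)\psi_1(x) = \int_0^T\varphi(t)\int_{\Omega_a}\nabla(\Theta_a + h_a)\psi_1(x). \tag{3.4}$$
In addition, using the mass conservation law (1.10), we also have:
$$\int_0^T\varphi(t)\int_{\Omega_a} p(u_\varepsilon)\cdot\nabla\psi_1(x) = \frac{1}{2}\int_0^T\varphi_t(t)\int_{\Omega_a}|u_\varepsilon|^2\psi_1(x) \to \frac{1}{2}\int_{\Omega_a}\psi_1(x)\int_0^T\varphi_t(t) = 0, \tag{3.5}$$
where the convergence is due to (2.4) of Lemma 2.2. It follows that the weak limit of $p(u_\varepsilon)$ is divergence free. Hence $h_a$ is a harmonic function on $\Omega_a$, and it is also in $H^1(\Omega)$ by Lemma 2.2. Thus $h_a$ can have at worst removable singularities and is a harmonic function on the whole domain $\Omega$. The function $h_a$ then has a well-defined boundary value, which we identify next. Let $\psi = \psi(t,x)$ be a compactly supported function in a small region $\Omega_0$ near the boundary $\partial\Omega$; for each $t$, $\mathrm{supp}\{\psi\}\cap\partial\Omega$ contains a finite curve; $\psi$ is also compactly supported inside the time interval $[0,T]$, $T > 0$. Note that near the boundary there are no vortices, hence $\Theta_a$ is a single-valued function. Let us calculate:
$$\begin{aligned}
\oint_{\partial\Omega_0}\psi\,p(u_a)\cdot\tau\,ds &= \int_{\partial\Omega_0}\psi\,p(u_a)\cdot d\vec{l} = \int_{\Omega_0}\mathrm{curl}(\psi\,p(u_a)) = \int_{\Omega_0}\nabla\psi\wedge p(u_a) \\
&= \lim_{\varepsilon\downarrow 0}\int_{\Omega_0}\nabla\psi\wedge p(u_\varepsilon) = \lim_{\varepsilon\downarrow 0}\left[\int_{\Omega_0}\mathrm{curl}(\psi\,p(u_\varepsilon)) - \int_{\Omega_0}\psi\,\mathrm{curl}\,p(u_\varepsilon)\right] \\
&= \lim_{\varepsilon\downarrow 0}\oint_{\partial\Omega_0}\psi\,p(u_\varepsilon)\cdot d\vec{l} = \oint_{\partial\Omega_0}\psi\,(g\wedge g_\tau)\,ds,
\end{aligned} \tag{3.6}$$
implying that $p(u_a)\cdot\tau = \partial_\tau(\Theta_a + h_a) = g\wedge g_\tau$ on the boundary $\partial\Omega$ for all $t \ge 0$. Hence the harmonic function $h_a$ is uniquely determined up to an additive constant: integrating the tangential derivative once along the boundary recovers the related Dirichlet boundary data. Prescribing the boundary map $g$ with a certain degree for NLS implies a boundary force along the tangential direction for the limiting fluid motion. This completes the proof.

Proposition 3.3. Let $t > 0$ and $u_\varepsilon = u_\varepsilon(t,x)$ be as in Lemma 2.1, with vortex locations $(a_1,a_2,\dots,a_n)$. If for some $\omega_0 > 0$:
$$\limsup_{\varepsilon\to 0}\left[E_\varepsilon(u_\varepsilon) - \pi n\log\frac{1}{\varepsilon}\right] \le \pi W(a) + \omega_0,$$
then for any $r > 0$, there is a constant $C$ independent of $\varepsilon$ and $r$ such that for any $t > 0$:
$$\limsup_{\varepsilon\to 0}\left\|\frac{p_\varepsilon(u_\varepsilon)}{|u_\varepsilon|} - v\right\|^2_{L^2(\Omega\setminus\bigcup_{j=1}^{n}B_r(a_j))} \le C\omega_0, \tag{3.7}$$
$$\limsup_{\varepsilon\to 0}\big\|\nabla|u_\varepsilon|\big\|^2_{L^2(\Omega\setminus\bigcup_{j=1}^{n}B_r(a_j))} \le C\omega_0. \tag{3.8}$$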


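Proof. We first let $\varepsilon_k \to 0$ be such that
$$\limsup_{\varepsilon\to 0}\big\|\nabla|u_\varepsilon|\big\|^2_{L^2(\Omega\setminus\bigcup_{j=1}^{n}B_r(a_j))} = \limsup_{\varepsilon_k\to 0}\big\|\nabla|u_{\varepsilon_k}|\big\|^2_{L^2(\Omega\setminus\bigcup_{j=1}^{n}B_r(a_j))}.$$
By Lemma 2.2, we can assume without loss of generality that
$$u_{\varepsilon_k} \rightharpoonup e^{i(\Theta_a + h)} \quad \text{in } H^1_{loc}(\Omega_a),$$
for some $h \in H^1(\Omega)$. Here $e^{i\Theta_a} = \prod_{j=1}^{n}\frac{x - a_j}{|x - a_j|}$. Hence
$$\frac{p_{\varepsilon_k}(u_{\varepsilon_k})}{|u_{\varepsilon_k}|} \rightharpoonup \nabla(\Theta_a + h) \quad \text{in } L^2_{loc}(\Omega_a).$$
For any $\rho > 0$, then,
$$\begin{aligned}
E_{\varepsilon_k}\Big(u_{\varepsilon_k},\ \Omega\setminus\bigcup_{j=1}^{n}B_\rho(a_j)\Big) &\equiv \frac{1}{2}\int_{\Omega\setminus\bigcup_{j=1}^{n}B_\rho(a_j)}\left[\big|\nabla|u_{\varepsilon_k}|\big|^2 + \left|\frac{p_{\varepsilon_k}(u_{\varepsilon_k})}{|u_{\varepsilon_k}|}\right|^2 + \frac{1}{2\varepsilon_k^2}(1 - |u_{\varepsilon_k}|^2)^2\right] \\
&\ge \frac{1}{2}\int_{\Omega\setminus\bigcup_{j=1}^{n}B_\rho(a_j)}\left[\big|\nabla|u_{\varepsilon_k}|\big|^2 + \left|\frac{p_{\varepsilon_k}(u_{\varepsilon_k})}{|u_{\varepsilon_k}|} - \nabla(\Theta_a + h)\right|^2\right] \\
&\quad + \frac{1}{2}\int_{\Omega\setminus\bigcup_{j=1}^{n}B_\rho(a_j)}|\nabla(\Theta_a + h)|^2\,dx + o_k(1),
\end{aligned} \tag{3.9}$$
where $o_k(1) \to 0$ as $k \to \infty$. Next, we let $u_{\varepsilon_k}(h,\rho)$ be such that $u_{\varepsilon_k}(h,\rho) = e^{i(\Theta_a + h)}$ on $\Omega\setminus\bigcup_{j=1}^{n}B_\rho(a_j)$, and on each $B_\rho(a_j)$, $u_{\varepsilon_k}(h,\rho)$ is a minimizer of $E_{\varepsilon_k}$ with boundary value $e^{i(\Theta_a + h)}$. We choose $\rho \in (\frac{r}{2}, r)$ so that $u_{\varepsilon_k}|_{\partial B_\rho} \rightharpoonup e^{i(\Theta_a + h)}$ in $H^1(\partial B_\rho(a_j))$ for $j = 1,\dots,n$, by taking a subsequence of $\varepsilon_k$ as needed. Then it is easy to see by a simple comparison that for $j = 1,\dots,n$:
$$E_{\varepsilon_k}(u_{\varepsilon_k}, B_\rho(a_j)) \ge E_{\varepsilon_k}(u_{\varepsilon_k}(h,\rho), B_\rho(a_j)) + o(\rho,\varepsilon_k),$$
where $o(\rho,\varepsilon_k) \to 0$ as $k \to \infty$. Therefore
$$\begin{aligned}
\pi W(a) + o_k(1) &\le E_{\varepsilon_k}(u_{\varepsilon_k}(h,\rho)) - \pi n\log\frac{1}{\varepsilon_k} \\
&\le E_{\varepsilon_k}(u_{\varepsilon_k}) - n\pi\log\frac{1}{\varepsilon_k} + o(\rho,\varepsilon_k) - \frac{1}{2}\int_{\Omega\setminus\bigcup_{j=1}^{n}B_\rho(a_j)}\big|\nabla|u_{\varepsilon_k}|\big|^2\,dx \\
&\quad - \frac{1}{2}\int_{\Omega\setminus\bigcup_{j=1}^{n}B_\rho(a_j)}\left|\frac{p_{\varepsilon_k}(u_{\varepsilon_k})}{|u_{\varepsilon_k}|} - \nabla(\Theta_a + h)\right|^2 dx.
\end{aligned} \tag{3.10}$$
Since $E_{\varepsilon_k}(u_{\varepsilon_k}) - \pi n\log\frac{1}{\varepsilon_k} \le \pi W(a) + \omega_0$, we thus conclude that
$$\lim_{\varepsilon_k\to 0}\int_{\Omega\setminus\bigcup_{j=1}^{n}B_r(a_j)}\big|\nabla|u_{\varepsilon_k}|\big|^2 \le 2\omega_0, \tag{3.11}$$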


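which implies (3.8) and that
$$\limsup_{\varepsilon_k\to 0}\int_{\Omega\setminus\bigcup_{j=1}^{n}B_r(a_j)}\left|\frac{p_{\varepsilon_k}(u_{\varepsilon_k})}{|u_{\varepsilon_k}|} - \nabla(\Theta_a + h)\right|^2 \le 2\omega_0. \tag{3.12}$$
We observe now that if $\varepsilon_k \to 0$ is such that
$$\lim_{\varepsilon_k\to 0}\int_{\Omega\setminus\bigcup_{j=1}^{n}B_r(a_j)}\left|\frac{p_{\varepsilon_k}(u_{\varepsilon_k})}{|u_{\varepsilon_k}|} - v\right|^2 dx$$
is the left-hand side of (3.7), then by (3.12):
$$\limsup_{\varepsilon\to 0}\int_{\Omega\setminus\bigcup_{j=1}^{n}B_r(a_j)}\left|\frac{p_\varepsilon(u_\varepsilon)}{|u_\varepsilon|} - v\right|^2 dx \le 4\omega_0 + 2\int_{\Omega\setminus\bigcup_{j=1}^{n}B_r(a_j)}|\nabla h - \nabla h_a|^2. \tag{3.13}$$
Here $v = \nabla(\Theta_a + h_a)$. Now we show that
$$\int_{\Omega\setminus\bigcup_{j=1}^{n}B_r(a_j)}|\nabla h - \nabla h_a|^2 \le \omega_0.$$
To do this, we observe that for a $\rho > 0$ with
$$\int_{\partial B_\rho}|\nabla h|^2 \le \frac{2}{\rho}\int_{B_{2\rho}\setminus B_{\rho/2}}|\nabla h|^2\,dx \le \frac{C}{\rho},$$
we have
$$E_\varepsilon\Big(u_\varepsilon(h,\rho),\ \bigcup_{j=1}^{n}B_\rho(a_j)\Big) \ge \pi n\log\frac{\rho}{\varepsilon} + \gamma n + o(\rho,\varepsilon).$$
This follows from an easy energy estimate; see [22]. Here $o(\rho,\varepsilon) \to 0$ as $\varepsilon \to 0^+$. This implies in turn that
$$E_\varepsilon\Big(u_\varepsilon(h,\rho),\ \Omega\setminus\bigcup_{j=1}^{n}B_\rho(a_j)\Big) = \frac{1}{2}\int_{\Omega\setminus\bigcup_{j=1}^{n}B_\rho(a_j)}|\nabla(\Theta_a + h)|^2 \le \pi W(a) - \gamma n + \omega_0 + o(\rho,\varepsilon) + n\pi\log\frac{1}{\rho}. \tag{3.14}$$
On the other hand, we have:
$$\frac{1}{2}\int_{\Omega\setminus\bigcup_{j=1}^{n}B_\rho(a_j)}|\nabla(\Theta_a + h_a)|^2 = n\pi\log\frac{1}{\rho} + \pi W(a) - \gamma n + o(\rho), \tag{3.15}$$
where $o(\rho) \to 0^+$ as $\rho \to 0$. We also note that for any $h \in H^1(\Omega)$:
$$\int_{\Omega\setminus\bigcup_{j=1}^{n}B_\rho(a_j)}|\nabla(\Theta_a + h)|^2\,dx = \int_{\Omega\setminus\bigcup_{j=1}^{n}B_\rho(a_j)}\big(|\nabla\Theta_a|^2 + |\nabla h|^2\big) + 2\int_{\partial\Omega}\frac{\partial\Theta_a}{\partial\nu}\cdot h - 2\sum_{j=1}^{n}\int_{\partial B_\rho(a_j)}(h - \bar h)\frac{\partial\Theta_a}{\partial n}, \tag{3.16}$$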


where the last term is bounded by $\mathrm{const.}\sum_{j=1}^{n}\rho\int_{\partial B_\rho}|\nabla h|$, which goes to zero as $\rho \to 0$. By sending $\varepsilon \to 0$, then $\rho \to 0$, we therefore obtain by combining (3.14), (3.15), and (3.16) that
$$\int_\Omega|\nabla h|^2 \le \int_\Omega|\nabla h_a|^2 + \omega_0. \tag{3.17}$$
Inequality (3.17), along with the fact that $h_a$ is harmonic and $h|_{\partial\Omega} = h_a|_{\partial\Omega}$, yields $\int_\Omega|\nabla(h - h_a)|^2 \le \omega_0$. The proof is complete.

We end this section with an interesting conjugation property of the regular part of the vortex phase in terms of the renormalized energy function $W$. Near each vortex $a_j$, write the weak limit as $e^{i\arg(x - a_j) + iH_j}$, where $H_j$ is harmonic. Then:

Lemma 3.1.
$$\nabla_{a_j}W(a) = 2n_j\left(-\frac{\partial H_j}{\partial x_2}(a_j),\ \frac{\partial H_j}{\partial x_1}(a_j)\right). \tag{3.18}$$
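The conjugation identity (3.18) can be checked numerically in a simplified setting. The Python sketch below assumes the whole-plane situation, with $h_a = 0$ and the boundary contributions to $W$ dropped, so that $W(a) = -\sum_{l\neq j} n_l n_j\log|a_l - a_j|$ and $H_j(x) = \sum_{l\neq j} n_l\arg(x - a_l)$. These simplifications and all function names are assumptions for illustration and are not taken from the text or from [2].

```python
import math


def whole_plane_energy(points, degrees):
    """W(a) = -sum_{l != j} n_l n_j log|a_l - a_j|, boundary contributions omitted by assumption."""
    total = 0.0
    for j, (xj, yj) in enumerate(points):
        for l, (xl, yl) in enumerate(points):
            if l != j:
                total -= degrees[l] * degrees[j] * math.log(math.hypot(xl - xj, yl - yj))
    return total


def grad_energy_numeric(points, degrees, j, step=1e-6):
    """Central-difference approximation of grad_{a_j} W."""
    def shifted(dx, dy):
        moved = list(points)
        moved[j] = (points[j][0] + dx, points[j][1] + dy)
        return whole_plane_energy(moved, degrees)

    gx = (shifted(step, 0.0) - shifted(-step, 0.0)) / (2 * step)
    gy = (shifted(0.0, step) - shifted(0.0, -step)) / (2 * step)
    return gx, gy


def grad_regular_phase(points, degrees, j):
    """grad H_j evaluated at a_j, where H_j(x) = sum_{l != j} n_l arg(x - a_l)."""
    xj, yj = points[j]
    hx = hy = 0.0
    for l, (xl, yl) in enumerate(points):
        if l != j:
            dx, dy = xj - xl, yj - yl
            r2 = dx * dx + dy * dy
            hx += degrees[l] * (-dy) / r2  # d/dx1 of arg(x - a_l)
            hy += degrees[l] * dx / r2     # d/dx2 of arg(x - a_l)
    return hx, hy


def lemma_3_1_rhs(points, degrees, j):
    """Right-hand side of (3.18): 2 n_j (-dH_j/dx2, dH_j/dx1) at a_j."""
    hx, hy = grad_regular_phase(points, degrees, j)
    return 2 * degrees[j] * (-hy), 2 * degrees[j] * hx
```

In this setting both sides reduce to $-2n_j\sum_{l\neq j} n_l (a_j - a_l)/|a_j - a_l|^2$, so the finite-difference gradient and the formula agree up to discretization error.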

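For a proof, see [2] (Theorem 8.3).

4. Convergence to Incompressible Euler Equation and Vortex Motion Law

In this section, we use the continuity of vortices, the weak convergence, and the precise form of the weak limit discussed in the previous sections to pass the linear momentum equation (1.11) to the incompressible limit on the punctured domain $\Omega_a$, and show that the limiting equation is the two-dimensional Euler equation. We show properties of the defect measures and the total pressure $P$ to finish proving Theorem 1.1. We then establish the Kirchhoff law for vortex motion based on the limiting projected linear momentum equation. Finally, we show strong convergence of the linear momentum under the initial energy almost minimizing assumption. Let us write the linear momentum equation in component form:
$$p_m(u_\varepsilon)_t = 2(u_{\varepsilon,x_m}\cdot u_{\varepsilon,x_j})_{x_j} - P_{\varepsilon,x_m}, \qquad m = 1,2. \tag{4.1}$$
Direct calculation shows that if $|u_\varepsilon| > 0$ then
$$u_{\varepsilon,x_m} = \frac{p_m(u_\varepsilon)}{|u_\varepsilon|}\,\frac{iu_\varepsilon}{|u_\varepsilon|} + |u_\varepsilon|_{x_m}\frac{u_\varepsilon}{|u_\varepsilon|}. \tag{4.2}$$
Note that $|\nabla u_\varepsilon| = 0$ a.e. on the set $\{|u_\varepsilon| = 0\}$. Hence, we only need to consider the set $\{|u_\varepsilon| > 0\}$. It follows from (4.2) that
$$\begin{aligned}
u_{\varepsilon,x_m}\cdot u_{\varepsilon,x_j} &= \frac{p_m(u_\varepsilon)\,p_j(u_\varepsilon)}{|u_\varepsilon|^2} + |u_\varepsilon|_{x_m}|u_\varepsilon|_{x_j} \\
&= \left(\frac{p_m(u_\varepsilon)}{|u_\varepsilon|} - v_m\right)\left(\frac{p_j(u_\varepsilon)}{|u_\varepsilon|} - v_j\right) + |u_\varepsilon|_{x_m}|u_\varepsilon|_{x_j} \\
&\quad + v_m\frac{p_j(u_\varepsilon)}{|u_\varepsilon|} + v_j\frac{p_m(u_\varepsilon)}{|u_\varepsilon|} - v_m v_j.
\end{aligned} \tag{4.3}$$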
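The decomposition (4.3) is a pointwise algebraic identity, so it can be checked directly on sample values. The Python sketch below treats $u$, $u_{x_m}$, $u_{x_j}$ as arbitrary complex numbers and $v_m$, $v_j$ as arbitrary reals, using $a\cdot b = \mathrm{Re}(a\bar b)$, $a\wedge b = \mathrm{Im}(\bar a b)$ and $|u|_{x_m} = (u\cdot u_{x_m})/|u|$. The helper names are hypothetical and introduced only for this check.

```python
def dot(a, b):
    """Real inner product of two complex numbers viewed as vectors in R^2."""
    return (a * b.conjugate()).real


def wedge(a, b):
    """a ∧ b = a1*b2 - a2*b1 for complex numbers a = a1 + i*a2, b = b1 + i*b2."""
    return (a.conjugate() * b).imag


def gradient_product_split(u, u_m, u_j, v_m, v_j):
    """Return the three expressions in (4.3): the product of derivatives, its splitting into
    momentum and modulus parts, and the same quantity regrouped around the limit v."""
    mod = abs(u)
    p_m, p_j = wedge(u, u_m), wedge(u, u_j)              # linear momentum components
    mod_m, mod_j = dot(u, u_m) / mod, dot(u, u_j) / mod  # derivatives of |u|
    lhs = dot(u_m, u_j)
    middle = p_m * p_j / mod ** 2 + mod_m * mod_j
    rhs = ((p_m / mod - v_m) * (p_j / mod - v_j) + mod_m * mod_j
           + v_m * p_j / mod + v_j * p_m / mod - v_m * v_j)
    return lhs, middle, rhs


if __name__ == "__main__":
    print(gradient_product_split(1.0 + 0.5j, 0.3 - 0.2j, -0.7 + 1.1j, 0.4, -0.9))
```

The three returned values coincide up to rounding for any input with $u \neq 0$, which is the content of (4.3).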

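Note that $\||u_\varepsilon|^{-1}p(u_\varepsilon)\|_{L^2_{loc}(\Omega_a)} \le C$, for a positive constant $C$ independent of $\varepsilon$ and $t \in [0,T]$. Hence $|u_\varepsilon|^{-1}p(u_\varepsilon)$ is weakly compact in $L^2(\Omega_a\times[0,T])$. Since $|u_\varepsilon| \to 1$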


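in $L^2(\Omega_a\times[0,T])$, the weak $L^1(\Omega_a\times[0,T])$ limit of $p(u_\varepsilon)$, equal to $v = \nabla(\Theta_a + h_a)$, coincides with the weak $L^2(\Omega_a\times[0,T])$ limit of $|u_\varepsilon|^{-1}p(u_\varepsilon)$. It follows that
$$v_m\frac{p_j(u_\varepsilon)}{|u_\varepsilon|} \rightharpoonup v_m v_j, \qquad v_j\frac{p_m(u_\varepsilon)}{|u_\varepsilon|} \rightharpoonup v_j v_m \tag{4.4}$$
in $L^2(\Omega_a\times[0,T])$. The product terms converge,
$$\left(\frac{p_m(u_\varepsilon)}{|u_\varepsilon|} - v_m\right)\left(\frac{p_j(u_\varepsilon)}{|u_\varepsilon|} - v_j\right) + |u_\varepsilon|_{x_m}|u_\varepsilon|_{x_j} \rightharpoonup \mu_{m,j}, \tag{4.5}$$
as measures, to a symmetric tensorial measure $\mu_{m,j} \in \mathcal{M}(\Omega_a)$. We prove:

Proposition 4.1. The defect measure $\mu = (\mu_{m,j})$ is a finite mass Radon measure on the domain $\Omega$. Its divergence $\mathrm{div}(\mu_{m,j})$ is curl free in the sense of distributions, and can be written as $\nabla P_\mu$ on $\Omega_a$, where $P_\mu$ is a distribution function well-defined on the entire domain $\Omega_a$. The weak limit $v$ is a solution of the incompressible Euler equation:
$$v_t = 2v\cdot\nabla v - 2\nabla P, \qquad \mathrm{div}\,v = 0, \qquad \forall\,x \in \Omega_a,$$
where the total pressure $2P$ is a single-valued function, and smooth in $\Omega_a$.

Proof. That the defect measure $\mu \ge 0$ is a finite mass Radon measure on the entire domain $\Omega$ follows from Proposition 3.3. Let us take $\psi \in (C_0^\infty(\Omega_a\times[0,T)))^2$ with $\mathrm{div}\,\psi = 0$, form the inner product of $\psi$ with both sides of the linear momentum equation (1.11), and integrate by parts to get
$$\int\psi(0,x)\,p(u_\varepsilon^0) + \iint\big[\psi_t\cdot p(u_\varepsilon) - 2(\nabla u_\varepsilon\otimes\nabla u_\varepsilon):\nabla\psi\big] = 0.$$
Passing to the limit, we get
$$\int\psi(0,x)\,v^0 + \iint\big[\psi_t\,v - 2(v\otimes v + \mu):\nabla\psi\big] = 0. \tag{4.6}$$
In particular, we choose $\psi$ to be of the form:
$$\psi = \alpha(t)(-\varphi_{x_2}, \varphi_{x_1}) = \alpha(t)\nabla^\perp\varphi, \tag{4.7}$$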

where $\varphi \in C_0^\infty(\Omega_a)$, $\alpha(0) = 0$. Then, due to $v$ being curl free on $\Omega_a$, (4.6) reduces to
$$\iint\alpha(t)\,\mu:\nabla\nabla^\perp\varphi = 0, \tag{4.8}$$
which means that the weak divergence of the measure $\mu$ is a weak gradient away from vortices, and hence can be written locally as the gradient of another distribution, by an approximation argument. We denote $\mathrm{div}\,\mu = \nabla P_\mu$, where $P_\mu$ is a local distribution for now. It follows that (4.6) reduces to
$$\int\psi(0,x)\,v^0 + \iint\big[\psi_t\,v - 2(v\otimes v):\nabla\psi\big] = 0. \tag{4.9}$$
Since $v$ is harmonic in $x$ and Lipschitz continuous in time, it is easy to bootstrap on (4.9) to show that $v$ is smooth in $(x,t) \in \Omega_a\times(0,T)$. We can now write (4.9) in the strong form of the Euler equation:
$$v_t = 2v\cdot\nabla v - 2\nabla P, \quad x \in \Omega_a, \qquad \mathrm{div}\,v = 0, \qquad v(0,x) = v^0(x), \tag{4.10}$$

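for some function $2P$ locally defined on $\Omega_a\times(0,T)$. Taking the divergence of (4.10) gives $\Delta P = \mathrm{div}(v\cdot\nabla v)$. That $v$ is harmonic in $\Omega_a$ then implies that $P$ is smooth in $\Omega_a$. Using (4.10), we see that the integral around each vortex satisfies:
$$-\int_{\partial B_R(a_j)}\frac{\partial P}{\partial\theta} = \frac{1}{2}\oint_{\partial B_R(a_j)} v_t\cdot d\vec{l} - \oint_{\partial B_R(a_j)} v\cdot\nabla v\cdot d\vec{l}.$$
By the form of the weak limit $v$, the circulation of $v_t$ is zero. The circulation of the $v\cdot\nabla v$ term is also zero by a direct calculation with $v = \nabla(\Theta_a + h_a)$. First we note that $\mathrm{curl}(v\cdot\nabla v) = v\cdot\nabla\,\mathrm{curl}\,v = 0$, $x \in \Omega_a$. Hence it is enough to calculate the circulation on a very small circle around $a_j$ and show that it goes to zero as the radius of the circle goes to zero. Let $a_j = (\xi_j,\eta_j)$ and $x = (\xi,\eta)$. Let us write $H = \Theta_a + h_a = \arg\frac{x - a_j}{|x - a_j|} + H_j$, and so
$$H_\xi = (H_j)_\xi + \frac{-(\eta - \eta_j)}{(\xi - \xi_j)^2 + (\eta - \eta_j)^2}, \qquad H_\eta = (H_j)_\eta + \frac{\xi - \xi_j}{(\xi - \xi_j)^2 + (\eta - \eta_j)^2}, \tag{4.11}$$
and below we denote $\nabla H_j = (I, II)$. Noticing that $I_\xi + II_\eta = \Delta H_j = 0$, we have
$$\begin{aligned}
\oint_{\partial B_R(a_j)} v\cdot\nabla v\cdot d\vec{l} &= \int_0^{2\pi} R\big[v\cdot\nabla v_1(-\sin\theta) + v\cdot\nabla v_2\cos\theta\big]\,d\theta \\
&= \int_0^{2\pi} R\big[(I - R^{-1}\sin\theta)(I_\xi + 2R^{-2}\sin\theta\cos\theta)(-\sin\theta) \\
&\qquad\quad + (II + R^{-1}\cos\theta)(I_\eta - R^{-2}\cos 2\theta)(-\sin\theta)\big]\,d\theta \\
&\quad + \int_0^{2\pi} R\big[(I - R^{-1}\sin\theta)(II_\xi - R^{-2}\cos 2\theta)\cos\theta \\
&\qquad\quad + (II + R^{-1}\cos\theta)(II_\eta - R^{-2}\sin 2\theta)\cos\theta\big]\,d\theta \\
&= \int_0^{2\pi}\big[I_\xi\sin^2\theta + II_\eta\cos^2\theta\big]\,d\theta + O(R) \\
&= \pi\big(I_\xi(a_j) + II_\eta(a_j)\big) + O(R) = O(R) \to 0.
\end{aligned} \tag{4.12}$$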
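The vanishing circulation in (4.12) can be checked numerically on a model velocity field. The Python sketch below places a single degree-one vortex at the origin and takes $H_j$ to be a harmonic quadratic polynomial, $H_j = b_1\xi + b_2\eta + c(\xi^2 - \eta^2) + d\,\xi\eta$; this choice of $H_j$ and the function names are assumptions made only for illustration. The contour integral is evaluated with the periodic trapezoid rule, which is exact here up to rounding because the integrand is a trigonometric polynomial.

```python
import math


def velocity_and_jacobian(x, y, coeffs):
    """v = grad(arg(x) + H_j) for a vortex at the origin, together with its partial derivatives.
    H_j = b1*x + b2*y + c*(x^2 - y^2) + d*x*y is an illustrative harmonic polynomial."""
    b1, b2, c, d = coeffs
    r2 = x * x + y * y
    v1 = b1 + 2 * c * x + d * y - y / r2
    v2 = b2 - 2 * c * y + d * x + x / r2
    dv1_dx = 2 * c + 2 * x * y / r2 ** 2
    dv1_dy = d - (x * x - y * y) / r2 ** 2
    dv2_dx = d - (x * x - y * y) / r2 ** 2
    dv2_dy = -2 * c - 2 * x * y / r2 ** 2
    return (v1, v2), (dv1_dx, dv1_dy, dv2_dx, dv2_dy)


def circulation_of_advection(radius, coeffs, samples=512):
    """Circulation of v . grad v around the circle of the given radius centered at the vortex."""
    total = 0.0
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        (v1, v2), (dv1_dx, dv1_dy, dv2_dx, dv2_dy) = velocity_and_jacobian(x, y, coeffs)
        adv1 = v1 * dv1_dx + v2 * dv1_dy  # first component of v . grad v
        adv2 = v1 * dv2_dx + v2 * dv2_dy  # second component of v . grad v
        # Line element on the circle: dl = R (-sin(theta), cos(theta)) dtheta
        total += radius * (adv1 * (-math.sin(theta)) + adv2 * math.cos(theta))
    return total * 2 * math.pi / samples


if __name__ == "__main__":
    for radius in (1.0, 0.1, 0.01):
        print(radius, circulation_of_advection(radius, (0.7, -0.4, 0.3, 1.2)))
```

For a curl-free field one has $v\cdot\nabla v = \nabla(\tfrac12|v|^2)$ with $|v|^2$ single-valued, so in this model the computed circulation is zero up to rounding at every radius, in agreement with (4.12).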

Thus the total pressure $2P$ is a well-defined single-valued function over the whole domain $\Omega$. It consists of the defect pressure from $\mu$ and the contribution from the original NLS pressure. Finally, we show that the defect pressure $P_\mu$ is a well-defined distribution on $\Omega$. For $\psi = \psi(r)$, supported in the annulus $B_R(a_j(s))\setminus B_{R/2}(a_j(s)) = B_R\setminus B_{R/2}$, it follows from the linear momentum equation for $t$ near $s$ that
$$\frac{d}{dt}\int_{B_R\setminus B_{R/2}} p(u_\varepsilon)(\psi\tau) = -2\int_{B_R\setminus B_{R/2}}\nabla u_\varepsilon\otimes\nabla u_\varepsilon:\nabla(\psi\tau),$$
where the NLS pressure has zero circulation and is removed. Passing $\varepsilon \downarrow 0$ and using the fact that $v\cdot\nabla v$ has zero circulation as proved above, we have


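$$0 = \int_{B_R\setminus B_{R/2}}(\mu + v\otimes v):\nabla(\psi\tau) = -\int_{B_R\setminus B_{R/2}}(\mathrm{div}\,\mu + v\cdot\nabla v)\cdot(\psi\tau) = -\int_{R/2}^{R}dr\,\psi(r)\int_{\partial B_r}\frac{\partial P_\mu}{\partial\theta},$$
implying that $\int_{\partial B_r}\frac{\partial P_\mu}{\partial\theta} = 0$ for any $r > 0$; hence $P_\mu$ is a well-defined distribution on $\Omega_a$. The proof of the proposition, and also that of Theorem 1.1, is complete.

Proof of Theorem 1.2. Let us consider the time interval $[t, t+k]$, with $k$ small, and the ball $B_R = B_R(a_j(t))$ inside the ball $B_{R_0/2}$ as in the proof of Proposition 3.1. The number $R$ is much smaller than $R_0$ and is large enough for $B_R$ to contain $a_j(s)$, $s \in [t, t+k]$. For example, $R = Ck$, for a suitable constant $C$ depending on the Lipschitz constant of $a_j$. Proceeding as in Proposition 3.1, with $\varphi = x_1$ in $B_R(a_j(t))$ and supported inside $B_{R_0/2}$, we find:
$$\begin{aligned}
\int_{B_{R_0/2}}\nabla^\perp\varphi\cdot p(u_\varepsilon)\Big|_t^{t+k} &= -2\int_t^{t+k}ds\int_{B_{R_0/2}\setminus B_R}(\nabla u_\varepsilon\otimes\nabla u_\varepsilon):\nabla\nabla^\perp\varphi \\
&\to 2\int_t^{t+k}ds\int_{B_{R_0/2}\setminus B_R} -(\mu + v\otimes v):\nabla\nabla^\perp\varphi.
\end{aligned} \tag{4.13}$$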

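Here $\mu \in \mathcal{M}(\Omega)$ and $v\otimes v \notin L^1(\Omega)$. As in Proposition 3.1, the left-hand side of (4.13) converges to $2\pi(\xi_j(t+k) - \xi_j(t))$. For the right-hand side, we calculate the second term in (4.13):
$$\begin{aligned}
&\int_t^{t+k}ds\int_{B_{R_0/2}(a_j(s))\setminus B_R(a_j(s))} -(v\otimes v):\nabla\nabla^\perp\varphi \\
&\quad = \int_t^{t+k}ds\int_{B_{R_0/2}(a_j(s))\setminus B_R(a_j(s))} v\cdot\nabla v\cdot\nabla^\perp\varphi - \int_t^{t+k}ds\int_{\partial B_R(a_j(s))}(v\otimes v):(\nu\otimes n^\perp) \\
&\quad = \int_t^{t+k}ds\int_{\partial B_R(a_j(s))}(v\cdot\nabla v\cdot\nu^\perp)(n\cdot x) + \int_t^{t+k}ds\int_{\partial B_R(a_j(s))} -(v\otimes v):(\nu\otimes n^\perp),
\end{aligned} \tag{4.14}$$
where $n = (1,0)$ and $\nu$ is the normal direction at $\partial B_R(a_j(s))$.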


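Let us calculate the inner part of the first integral on the right-hand side of (4.14) as follows:
$$\begin{aligned}
&\int_0^{2\pi}(\xi_j(t)R + R^2\cos\theta)\big[v\cdot\nabla v_1(-\sin\theta) + v\cdot\nabla v_2\cos\theta\big]\,d\theta \\
&\quad = \int_0^{2\pi}(\xi_j(t)R + R^2\cos\theta)\big[(I - R^{-1}\sin\theta)(I_\xi + 2R^{-2}\sin\theta\cos\theta)(-\sin\theta) \\
&\qquad\qquad + (II + R^{-1}\cos\theta)(I_\eta - R^{-2}\cos 2\theta)(-\sin\theta)\big]\,d\theta \\
&\qquad + \int_0^{2\pi}(\xi_j(t)R + R^2\cos\theta)\big[(I - R^{-1}\sin\theta)(II_\xi - R^{-2}\cos 2\theta)\cos\theta \\
&\qquad\qquad + (II + R^{-1}\cos\theta)(II_\eta - R^{-2}\sin 2\theta)\cos\theta\big]\,d\theta \\
&\quad = -I\int_0^{2\pi}2(\sin\theta\cos\theta)^2\,d\theta - I\int_0^{2\pi}\cos^2\theta\cos 2\theta\,d\theta + O(R) \\
&\quad = -I\int_0^{2\pi}\cos^2\theta = -\pi I.
\end{aligned} \tag{4.15}$$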

Similarly, the inner part of the second integral on the right-hand side of (4.14) also contributes $-\pi I$. Therefore, dividing by $k$ and letting $k \to 0$, we have from (4.13)–(4.15) that $\xi_j' = -2H_{j,\xi} + f_{j,1}(\mu)$. With a similar equation for $\eta_j$, we conclude that
$$a_j' = -2\nabla H_j + f_j(\mu), \tag{4.16}$$
where $f_j(\mu)$ is a possible correction due to the defect measure $\mu$. Using the conjugation of $H_j$ with the renormalized energy, we rewrite (4.16) as
$$a_j' = n_j J\nabla_{a_j}W(a) + f_j(\mu), \tag{4.17}$$
where
$$J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},$$
and
$$W(a) = -\sum_{l\neq j} n_l n_j\log|a_l - a_j| + \text{boundary contributions}.$$
The Kirchhoff law follows if $f_j(\mu) = 0$, which we show below under the energy almost minimizing assumption. Since the Kirchhoff law may encounter finite time collapse for signed vortices, the validity established here applies also to any time prior to the collapse in the signed vortex situation.

Proposition 4.2. Under the almost minimizing initial energy assumption, we have
$$\frac{p(u_\varepsilon)}{|u_\varepsilon|} - v \to 0, \qquad \nabla|u_\varepsilon| \to 0,$$
in $L^2(\Omega_a)$, and the defect measure $\mu = 0$. The Kirchhoff law holds.


Proof. For simplicity, let us consider vortices of the same sign, plus one. Let $\tilde a_{j,t} = J\nabla_{\tilde a_j}W(\tilde a)$, $\tilde a(0) = a(0)$, and define
$$m(t) = \sum_{j=1}^{n}|a_j(t) - \tilde a_j(t)|.$$
Take a small time interval $t \in [0,t_\delta]$ so that $|m(t)| \le \delta$, with $\delta$ a small number to be selected. Lipschitz continuity of $m$ implies that it is differentiable a.e. in $t$. We have
$$\begin{aligned}
m'(t) &\le \sum_{j=1}^{n}|a_j'(t) - \tilde a_j'(t)| \\
&\le \sum_{j=1}^{n}|a_j'(t) - J\nabla_{a_j}W(a)| + \sum_{j=1}^{n}|J\nabla_{a_j}W(a) - J\nabla_{\tilde a_j}W(\tilde a)| \\
&\le \sum_{j=1}^{n}|a_j'(t) - J\nabla_{a_j}W(a)| + Cm(t).
\end{aligned} \tag{4.18}$$

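As before, consider the time interval $[t, t+k]$, with $k$ small, and the ball $B_R = B_R(a_j(t))$ inside $B_{R_0/2}$. Proceeding as before, we find
\[
\begin{aligned}
\mathrm{LHS} &= \int_{B_{R_0/2}} \nabla^\perp\varphi \cdot p(u^\varepsilon)\Big|_t^{t+k}
= -2\int_t^{t+k} ds \int_{B_{R_0/2}\setminus B_R} (\nabla u^\varepsilon \otimes \nabla u^\varepsilon) : \nabla\nabla^\perp\varphi \\
&= -2\int_t^{t+k} ds \int_{B_{R_0/2}\setminus B_R}
\left( v \otimes \frac{p(u^\varepsilon)}{|u^\varepsilon|} + \Big( v \otimes \frac{p(u^\varepsilon)}{|u^\varepsilon|} \Big)^{T} - v \otimes v \right) : \nabla\nabla^\perp\varphi \\
&\quad + (-2)\int_t^{t+k} ds \int_{B_{R_0/2}\setminus B_R}
\left[ \Big( \frac{p(u^\varepsilon)}{|u^\varepsilon|} - v \Big) \otimes \Big( \frac{p(u^\varepsilon)}{|u^\varepsilon|} - v \Big) + \nabla|u^\varepsilon| \otimes \nabla|u^\varepsilon| \right] : \nabla\nabla^\perp\varphi \\
&= \mathrm{RHS}_1 + \mathrm{RHS}_2.
\end{aligned}
\tag{4.19}
\]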

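Now the almost minimizing energy assumption gives:
\[
\begin{aligned}
E(u^\varepsilon) &= n\pi \log\frac{1}{\varepsilon} + W(a(0)) + o(1) \\
&= n\pi \log\frac{1}{\varepsilon} + W(\tilde a(t)) + o(1) \\
&\le n\pi \log\frac{1}{\varepsilon} + W(a(t)) + C m(t) + o(1).
\end{aligned}
\tag{4.20}
\]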

Selecting $\delta C \le \omega_0 \in (0, 1)$, we infer from Proposition 3.3 that for all $t \in (0, t_\delta)$:
\[
\limsup_{\varepsilon \to 0} \left\| \frac{p(u^\varepsilon)}{|u^\varepsilon|} - v \right\|_{L^2(B_{R_0/2}\setminus B_R)} \le C_1 m(t),
\]
and
\[
\limsup_{\varepsilon \to 0} \big\| \nabla|u^\varepsilon| \big\|_{L^2(B_{R_0/2}\setminus B_R)} \le C_1 m(t). \tag{4.21}
\]

Passing $\varepsilon \to 0$ in (4.19), then dividing by $k$ and sending $k \downarrow 0$, we get (with $a = (\xi, \eta)$):
\[
\mathrm{LHS} \to 2\pi \xi_j'(t), \qquad \mathrm{RHS}_1 \to 2\pi J W_{\xi_j}(a(t)).
\]
In view of (4.21), we have from (4.19) that $|\xi_j'(t) - JW_{\xi_j}(a)| \le C_2 m(t)$. With a similar estimate on $\eta_j(t)$, we get $|a_j'(t) - J\nabla_{a_j} W(a)| \le C_2 m(t)$. It follows from (4.18) that $m'(t) \le C m(t)$, with $m(0) = 0$, hence $m(t) = 0$ for all $t \in [0, t_\delta]$. Induction in time shows $a(t) \equiv \tilde a(t)$ for all $t \ge 0$. Hence the Kirchhoff law holds with strong convergence of $p(u^\varepsilon)$ and $\nabla|u^\varepsilon|$. The proof is complete.
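The step from $m'(t) \le C m(t)$, $m(0) = 0$ to $m \equiv 0$ is the usual Gronwall argument. Spelled out, using only that $m$ is Lipschitz (hence absolutely continuous) and nonnegative:

```latex
\[
\frac{d}{dt}\Bigl( e^{-Ct} m(t) \Bigr) = e^{-Ct}\bigl( m'(t) - C\,m(t) \bigr) \le 0
\quad \text{for a.e. } t \in [0, t_\delta],
\]
\[
\text{so}\quad 0 \le e^{-Ct} m(t) \le m(0) = 0
\quad\Longrightarrow\quad m(t) = 0 \ \text{ on } [0, t_\delta].
\]
```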

5. Zero Neumann and Other Boundary Conditions

In this section, we comment on all necessary modifications in the proofs of previous sections to establish similar results for the zero Neumann case, the entire space case, and the periodic case. For the Neumann boundary case, the $h_a$ in the weak limit is harmonic and satisfies the boundary condition $h_{a,\nu} = -\Theta_{a,\nu}$. The resulting renormalized energy $W$ goes to $-\infty$ if one of the vortices goes near $\partial\Omega$. To establish a uniform bound on $W$, we proceed by first showing that the vortices move continuously in time, then using the dynamical law to deduce that the renormalized energy is conserved. Thus the vortices never come close to each other or to the boundary $\partial\Omega$, since initially $W$ is finite. The energy arguments can be modified as in Lin [22] and [23]. What remains is the treatment of the boundary value of $h_a$.

Let us derive the Neumann boundary condition on $h_a$. First, near the boundary $\partial\Omega$, there are no vortices by induction in time. So we can write $u^\varepsilon = \rho^\varepsilon e^{iH^\varepsilon}$, where both $\rho^\varepsilon$ and $H^\varepsilon$ are real functions. Direct calculation shows:
\[
p(u^\varepsilon) = (\rho^\varepsilon)^2 \nabla H^\varepsilon, \qquad
p(u^\varepsilon)\cdot\nu = (\rho^\varepsilon)^2 H^\varepsilon_\nu, \qquad x \in \partial\Omega. \tag{5.1}
\]
Similarly
\[
u^\varepsilon_\nu = (\rho^\varepsilon_\nu + i\rho^\varepsilon H^\varepsilon_\nu)\, e^{iH^\varepsilon},
\]
and so the zero Neumann boundary condition (1.3) says
\[
\rho^\varepsilon_\nu = 0, \qquad H^\varepsilon_\nu = 0 \quad \text{on } \partial\Omega, \tag{5.2}
\]
implying in view of (5.1):
\[
p(u^\varepsilon)\cdot\nu = 0 \quad \text{on } \partial\Omega, \ \forall\, \varepsilon > 0. \tag{5.3}
\]

Let $\psi = \psi(t, x)$ be a compactly supported function in a small region near the boundary; for each $t$, $\operatorname{supp}\{\psi\} \cap \partial\Omega$ contains a finite curve; $\psi$ is also compactly supported inside the time interval $[0, T]$, $T > 0$.


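Due to $\operatorname{div} p(u_a) = 0$ on $\Omega_a$, we have using (5.3) and mass conservation:
\[
\begin{aligned}
\int_{\partial\Omega} \psi\, p(u_a)\cdot\nu
&= \int_{\Omega_a} p(u_a)\cdot\nabla\psi
= \lim_{\varepsilon\to 0} \int_{\Omega_a} p(u^\varepsilon)\cdot\nabla\psi \\
&= -\lim_{\varepsilon\to 0} \int_{\Omega_a} \operatorname{div} p(u^\varepsilon)\,\psi
= -\frac{1}{2}\lim_{\varepsilon\to 0} \int_{\Omega_a} |u^\varepsilon|^2_t\, \psi,
\end{aligned}
\tag{5.4}
\]
which upon integration over $[0, T]$ and integration by parts gives
\[
\int_0^T\!\!\int_{\partial\Omega} \psi\, p(u_a)\cdot\nu
= \frac{1}{2}\lim_{\varepsilon\to 0} \int_0^T\!\!\int_{\Omega_a} |u^\varepsilon|^2\, \psi_t = 0. \tag{5.5}
\]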

It follows from arbitrariness of $\psi$ and smoothness of $p(u_a)$ that $p(u_a)\cdot\nu = 0$ on $\partial\Omega$, which is just the desired boundary condition $h_{a,\nu} = -\Theta_{a,\nu}$. Physically, $h_a$ plays the role of correcting $\Theta_a$ on the boundary so that there is no flow into the wall. □

Let us turn to the entire space $\mathbb{R}^2$ case and the periodic case. For these two cases, we assume that the sum of degrees $\sum_{j=1}^n n_j = 0$ (zero sum condition). Under this condition and that $u^\varepsilon(0, x)$ converges to a constant $e^{i\theta_0}$ at $x = \infty$ sufficiently fast, the total energy $E$ on $\mathbb{R}^2$ retains the same asymptotic expression $n\pi\log\frac{1}{\varepsilon} + O(1)$. Otherwise, the energy is infinite, and one has to look at the energy distribution over finite domains to locate vortices. The analogous problem on $\mathbb{R}^2$ with infinite initial energy has been solved recently for the Ginzburg–Landau equation in Lin and Xin [24]. When the sum of vortex degrees is zero, the harmonic function $h_a$ having a finite $L^2$ gradient over $\mathbb{R}^2$ is a constant. The renormalized energy simplifies to $W_{\mathbb{R}^2} = -\sum_{l\neq j} n_l n_j \log|a_l - a_j|$, free of boundary contributions. The zero sum condition is needed in the periodic case in order to maintain the boundary condition for solutions containing vortices. The renormalized energy is similar: $W_{\mathrm{per}} = -\sum_{l\neq j} n_l n_j G(a_l - a_j)$, with $G$ the periodic Green's function for the Laplacian on the two-dimensional torus ($\Delta G = 2\pi(\delta_0 - 1)$).

6. Vortex Motion Law of a CGL

In this section, we apply our method to establish the vortex motion law of a related complex Ginzburg–Landau (CGL) equation:
\[
\frac{\delta}{\log\frac{1}{\varepsilon}}\, u^\varepsilon_t + i u^\varepsilon_t = \Delta u^\varepsilon + \varepsilon^{-2}(1 - |u^\varepsilon|^2)\, u^\varepsilon, \tag{6.1}
\]

where $\delta > 0$ is a fixed positive number. We shall only consider the Dirichlet boundary condition (1.2), with extensions to other boundary conditions the same as remarked in the last section. The energy conservation is
\[
\frac{d}{dt}\int_\Omega e_\varepsilon(u^\varepsilon)(t, x)\, dx = -\frac{\delta}{\log\frac{1}{\varepsilon}} \int_\Omega |u^\varepsilon_t|^2, \tag{6.2}
\]
which implies via Lemma 2.1:
\[
\int_0^T\!\!\int_\Omega \frac{\delta\, |u^\varepsilon_t|^2}{\log\frac{1}{\varepsilon}}\, dx \le C_0, \tag{6.3}
\]
and energy concentration:
\[
\mu^\varepsilon(t, x) = \frac{e_\varepsilon(u^\varepsilon)(t, x)\, dx}{\pi \log\frac{1}{\varepsilon}} \rightharpoonup \mu(t, x) = \sum_{j=1}^n \delta_{a_j(t)}. \tag{6.4}
\]

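It follows from (6.3) that $a_j(t)$ are Lipschitz continuous in $t$ for any $\delta > 0$; see [20, 22] for details. The conservation of mass is now
\[
\partial_t |u^\varepsilon|^2 = 2\operatorname{div} p(u^\varepsilon) - \frac{2\delta}{\log\frac{1}{\varepsilon}}\, u^\varepsilon \wedge u^\varepsilon_t, \tag{6.5}
\]
and the conservation of linear momentum is
\[
\partial_t p(u^\varepsilon) = 2\operatorname{div}(\nabla u^\varepsilon \otimes \nabla u^\varepsilon) - \nabla P^\varepsilon - \frac{2\delta}{\log\frac{1}{\varepsilon}}\, u^\varepsilon_t \cdot \nabla u^\varepsilon, \tag{6.6}
\]
with the pressure
\[
P^\varepsilon = |\nabla u^\varepsilon|^2 + u^\varepsilon \cdot \Delta u^\varepsilon - \frac{|u^\varepsilon|^4 - 1}{2\varepsilon^2} - \frac{\delta}{\log\frac{1}{\varepsilon}}\, u^\varepsilon \cdot u^\varepsilon_t. \tag{6.7}
\]
We observe that
\[
\frac{\delta}{\log\frac{1}{\varepsilon}}\, u^\varepsilon \wedge u^\varepsilon_t \to 0 \quad \text{in } L^1([0, T]; L^1(\Omega))
\]
by (6.3), and similarly
\[
\frac{\delta}{\log\frac{1}{\varepsilon}}\, u^\varepsilon_t \cdot \nabla u^\varepsilon \to 0 \quad \text{in } L^1([0, T]; L^1(\Omega_a)).
\]
Using the same arguments as before for NLS, we deduce that $p(u^\varepsilon) \rightharpoonup v$, with $v$ satisfying the Euler equation on $\Omega_a$; moreover, the vortices $a_j(t)$ obey the same Kirchhoff law as in Theorem 1.2. Since the results are independent of $\delta$, we have as a byproduct another proof of continuity and the dynamical law for NLS vortices, obtained by sending $\delta \downarrow 0$.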

7. Semiclassical Limit of NLS

In this section, we consider the semiclassical (WKB) limit of NLS:
\[
i\varepsilon v^\varepsilon_t = \varepsilon^2 \Delta v^\varepsilon + (1 - |v^\varepsilon|^2)\, v^\varepsilon, \tag{7.1}
\]
with the Dirichlet boundary condition (1.2) and initial data satisfying (1.5). The case when there are no vortices in solutions (uniformly bounded energy as $\varepsilon \downarrow 0$) has been studied in Colin and Soyeur [4]. Here we are concerned with the case when there are vortices. We show:


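Theorem 7.1. Suppose that the initial data satisfy
\[
v^\varepsilon(0, x) \rightharpoonup \prod_{j=1}^n \frac{x - a_j}{|x - a_j|}\, e^{ih(x)}
\]
weakly in $H^1(\Omega_a)$, with $h(x) \in H^1(\Omega)$, and that $\frac{|v^\varepsilon(0, x)| - 1}{\varepsilon} \to 0$ in $L^2(\Omega_0)$ for any compact subset $\Omega_0$ of $\Omega_a$. Then there is no vortex motion at a later time and
\[
v^\varepsilon(t, x) \rightharpoonup \prod_{j=1}^n \frac{x - a_j}{|x - a_j|}\, e^{ih(t, x)}, \tag{7.2}
\]
where the phase function $h(t, x) \in H^1(\Omega)$ is the finite-energy weak solution of the following initial-boundary value problem for the linear wave equation:
\[
\begin{aligned}
& h_{tt} - 2\Delta h = 0, && x \in \Omega, \\
& h(t, x) = h(x), && x \in \partial\Omega, \\
& h(0, x) = h(x), && h_t(0, x) = 0.
\end{aligned}
\tag{7.3}
\]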

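Proof. By Proposition 3.1 ($t = \varepsilon$), we know that vortices do not move on this slow WKB time scale. By Lemma 2.1 and Lemma 2.2:
\[
v^\varepsilon(t, x) \rightharpoonup \prod_{j=1}^n \frac{x - a_j}{|x - a_j|}\, e^{ih(t, x)}, \tag{7.4}
\]
where $h(t, x) \in H^1(\Omega)$ for each time $t$. The conservation of mass is now
\[
\left( \frac{1 - |v^\varepsilon|^2}{\varepsilon} \right)_t + 2\operatorname{div}(p(v^\varepsilon)) = 0, \tag{7.5}
\]
and the conservation of energy implies
\[
\int_{\Omega_0} |\nabla v^\varepsilon|^2 + \int_{\Omega_0} \frac{(1 - |v^\varepsilon|^2)^2}{2\varepsilon^2} \le C_0, \tag{7.6}
\]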

where $\Omega_0$ is a compact subset of $\Omega_a$, and $C_0$ is a positive constant independent of $\varepsilon$. It follows that $v^\varepsilon$ is bounded in $L^\infty([0, T]; H^1(\Omega_0))$; $v^\varepsilon_t$ is bounded in $L^\infty([0, T]; H^{-1}(\Omega_0))$ in view of (7.6) and (7.1); and $\frac{1 - |v^\varepsilon|^2}{\varepsilon}$ is bounded in $L^\infty([0, T], L^2)$. So $v^\varepsilon$ is strongly compact in $C([0, T], L^2(\Omega_0))$ and weakly compact in $L^\infty([0, T], H^1(\Omega_0))$. Up to a subsequence if necessary: $v^\varepsilon \to v$ strongly in $L^\infty([0, T]; L^2(\Omega_0))$ and weakly in $L^\infty([0, T]; H^1(\Omega_0))$. In the meantime, (7.5) gives:
\[
\frac{1 - |v^\varepsilon|^2}{\varepsilon}
= -2\int_0^t \operatorname{div} p(v^\varepsilon)(t')\, dt' + \frac{1 - |v^\varepsilon(0, x)|^2}{\varepsilon}
\rightharpoonup -2\int_0^t \operatorname{div} p(v)(t')\, dt', \tag{7.7}
\]
in the sense of distributions on $\Omega_0$. This then allows us to pass $\varepsilon \downarrow 0$ in (7.1) and obtain
\[
i v_t = -2v \int_0^t \operatorname{div} p(v)(t')\, dt',
\]
in the distribution sense on $\Omega_0$. Also $|v| = 1$, $\lim_{t\downarrow 0+} v(t, x) = h(x)$ in $L^2(\Omega_0)$, and $\lim_{t\downarrow 0+} v_t(t, x) = 0$ in $H^{-1}(\Omega_0)$. Writing $v = e^{iH}$ shows
\[
H_t - 2\int_0^t \Delta H(t')\, dt' = 0, \qquad x \in \Omega_0,
\]
and further letting $H = \Theta_a + h(t, x)$, with $\Theta_a$ harmonic on $\Omega_0$, yields
\[
h_t - 2\int_0^t \Delta h(t')\, dt' = 0, \qquad x \in \Omega_0, \tag{7.8}
\]
or by arbitrariness of $\Omega_0$:
\[
h_{tt} - 2\Delta h = 0 \quad \text{in } \mathcal{D}'(\Omega_a \times [0, T]). \tag{7.9}
\]
It follows that $h$ is a distribution solution of the linear wave equation on $\Omega_a$. The boundary data $h(t, x) = h(x)$, $x \in \partial\Omega$, follows from $v^\varepsilon \to v$ in $H^s$, $s \in (1/2, 1)$, near the boundary and the standard trace imbedding. Finally, $h(t, x) \in H^1(\Omega)$ implies that $h$ is the unique weak solution of (7.3) with finite total energy. The proof is complete.

Acknowledgement. F.-H. Lin was partially supported by NSF grant DMS-9706862, and J. X. Xin was partially supported by NSF grant DMS-9625680. J. X. Xin would like to thank Yves Pomeau for many enlightening conversations on the physics of superfluids, and the Courant Institute for the hospitality and support during his visit. Both authors wish to thank N. Ercolani, G. Eyink, D. Levermore, A. Majda, D. McLaughlin, and T. Spencer for their interest and suggestions. We thank R. Jerrard for his constructive comments on an earlier version of this paper.

References

1. Batchelor, G.: Introduction to Fluid Mechanics. Cambridge: Cambridge Univ. Press, 1980
2. Bethuel, T., Brezis, H., Hélein, F.: Ginzburg–Landau Vortices. Boston: Birkhäuser, 1994
3. Brezis, H., Gallouët, T.: Nonlinear Schrödinger Evolution Equations. Nonlinear Anal, TMA, 4 (4), 677–681 (1980)
4. Colin, T., Soyeur, A.: Some singular limits for evolutionary Ginzburg–Landau equations. Asymptotic Analysis, 13, 361–372 (1996)
5. Colliander, J.E., Jerrard, R.L.: Vortex Dynamics for the Ginzburg–Landau–Schrödinger Equations. Preprint, 1998
6. Constantin, P., Foias, C.: Navier–Stokes Equations. Chicago and London: Univ. of Chicago Press, 1988
7. Creswick, R., Morrison, H.: On the dynamics of quantum vortices. Phys. Lett. A, 76, 267 (1980)
8. DiPerna, R., Majda, A.: Reduced Hausdorff dimension and concentration-cancellation for 2-D incompressible flow. J. Am. Math. Soc. 1, 59–95 (1988)
9. Donnelly, R.J.: Quantized Vortices in Helium II. Cambridge: Cambridge Univ. Press, 1991
10. E, W.: Dynamics of vortices in Ginzburg–Landau theories with applications to superconductivity. Physica D 77, 383–404 (1994)
11. Ercolani, N., Montgomery, R.: On the fluid approximation to a nonlinear Schrödinger equation. Phys. Lett. A 180, 402–408 (1993)
12. Evans, L.C.: Weak Convergence Methods for Nonlinear PDE's. NSF-CBMS Regional Conference Lectures, 1988
13. Frisch, T., Pomeau, Y., Rica, S.: Transition to Dissipation in a Model of Superflow. Phys. Rev. Lett. 69, No. 11, 1644–1647 (1992)
14. Ginzburg, V.L., Pitaevskii, L.P.: On the theory of superfluidity. Sov. Phys. JETP 7, 585 (1958)
15. Grenier, E.: Limite semi-classique de l'équation de Schrödinger nonlinéaire en temps petit. C. R. Acad. Sci. Paris, 320, Série I, 691–694 (1995)
16. Jaffe, A., Taubes, C.: Vortices and Monopoles. Boston: Birkhäuser, 1980
17. Jin, S., Levermore, C.D., McLaughlin, D.W.: The Behavior of Solutions of the NLS Equation in the Semiclassical Limit, Singular Limits of Dispersive Waves. NATO ASI, Series B: Physics, 320, 235–256 (1994)
18. Josserand, C., Pomeau, Y.: Generation of vortices in a model superfluid He4 by the KP instability. Europhys. Lett. 30 no. 1, 43–48 (1995)
19. Landau, L.D., Lifschitz, E.M.: Fluid Mechanics, Course of Theoretical Physics. Vol. 6, 2nd edition, London–New York: Pergamon Press, 1989
20. Lin, F.-H.: Some Dynamical Properties of Ginzburg–Landau Vortices. Comm. Pure Appl. Math, Vol. XLIX, 323–359 (1996); and remarks on this paper, Ibid, XLIX, 361–364 (1996)
21. Lin, F.-H.: Static and moving vortices in Ginzburg–Landau theories. Nonlinear PDE in Geometry and Physics, the 1995 Barret Lectures, Basel–Boston: Birkhäuser, 1997
22. Lin, F.-H.: Complex Ginzburg–Landau Equations and Dynamics of Vortices, Filaments and Codimension 2 Submanifolds. Comm. Pure Appl. Math, 51 (4), 385–441 (1998)
23. Lin, F.-H.: Vortex Dynamics for the Nonlinear Wave Equations. Comm. Pure and Appl. Math. to appear, 1998
24. Lin, F.-H., Xin, J.: On the Dynamical Law of the Ginzburg–Landau Vortices on the Plane. Comm. Pure and Appl. Math. to appear, 1998
25. Lin, F.-H., Xin, J.: A Unified Approach to Vortex Motion Laws of Complex Scalar Field Equations. Math. Research Letters 5, 1–6 (1998)
26. Madelung, E.: Quantentheorie in Hydrodynamischer Form. Z. Physik 40, 322 (1927)
27. Marchioro, C., Pulvirenti, M.: Mathematical Theory of Incompressible Nonviscous Fluids. Appl. Math. Sci. 96, Berlin–Heidelberg–New York: Springer-Verlag, 1994
28. Neu, J.: Vortices in the Complex Scalar Fields. Physica D 43, 385–406 (1990)
29. Nore, C., Brachet, M., Cerda, E., Tirapegui, E.: Scattering of First Sound by Superfluid Vortices. Phys. Rev. Letters 72, No. 16, 2593–2595 (1994)
30. Onsager, L.: Statistical Hydrodynamics. Nuovo Cimento, V–VI Suppl. 2, 279 (1949)
31. Ovchinnikov, Y., Sigal, I.: The Ginzburg–Landau Equation III, Vortex Dynamics. Preprint, 1997
32. Weinstein, M., Xin, J.: Dynamic Stability of Vortex Solutions of Ginzburg–Landau and Nonlinear Schrödinger Equations. Commun. Math. Phys. 180, 389–428 (1996)

Communicated by A. Kupiainen

Commun. Math. Phys. 200, 275 – 296 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Smeared and Unsmeared Chiral Vertex Operators

Florin Constantinescu1, Günter Scharf2

1 Fachbereich Mathematik, Johann Wolfgang Goethe Universität Frankfurt, Robert-Mayer-Str. 10, D-60054 Frankfurt am Main, Germany
2 Institut für Theoretische Physik, Universität Zürich, Winterthurerstr. 190, CH-8057 Zürich, Switzerland

Received: 16 October 1997 / Accepted: 7 July 1998

Abstract: We prove unboundedness and boundedness of the unsmeared and smeared chiral vertex operators, respectively. We use elementary methods in bosonic Fock space, only. Possible applications to conformal two-dimensional quantum field theory, perturbation thereof, and to the perturbative construction of the sine-Gordon model by the Epstein-Glaser method are discussed. From another point of view the results of this paper can be looked at as a first step towards a Hilbert space interpretation of vertex operator algebras.

1. Introduction

The subject of massless two-dimensional fields was always a source of interesting problems. The light-cone variables, when compactified in the euclidean, allow application of complex methods. Here we are concerned, in the compact case, with both unsmeared and smeared vertex operators near and on the unit circle. Using elementary methods only, we show that, as operators in Hilbert spaces which are related to the bosonic Fock space, the usual unsmeared vertices are poor operators, whereas the smeared ones are nice bounded operators. The result is surprising taking into account that bosonic operators are usually unbounded. On the other hand, two-dimensional abelian bosonization makes the result plausible, at least in some cases (equivalence to fermions, which are bounded). Although the functional properties of the unsmeared vertex operators are not overwhelming, their algebraic properties, when restricted inside the unit circle, are remarkable. This is consistent with their usefulness in the frame of vertex operator algebras. In the smeared case we were motivated by similar results obtained in the framework of the Wess–Zumino–Witten model of two-dimensional conformal quantum field theory [1, 2] and for Minkowski two-dimensional massless fields [3] by explicit fermionic methods. Instead we keep working in the bosonic Fock space where vertex operators naturally live. Besides explicit computation in Fock space, our main tools are a generalized Gram inequality for determinants and an explicit tensor product argument which is reminiscent of a trick used in [1, 3]. The range of validity of our results extends to chiral vertex operators with real charges between −1 and 1.

We expect some input on two-dimensional conformal quantum field theory and on the sine-Gordon model. Indeed, from the conformal field theory point of view the latter is strongly related to the perturbed conformal quantum field theory [20]. The consequences are twofold: (i) On one side, in a hamiltonian approach to perturbed conformal field theory, the boundedness of the vertex operator (which appears as a perturbation) would suggest a regular analytic perturbation, which is seldom found even in quantum mechanics. This agrees with a convergent perturbation series in the lagrangian formalism [4]. (ii) On the other side, a perturbative construction of the S-matrix for the sine-Gordon model by the Epstein-Glaser method [5, 6] appears to provide us with a convergent perturbation series (before the adiabatic limit). Needless to say, although our study in this paper is restricted to the compact case, we expect similar results by similar methods in the non-compact Minkowski case too, the case in which the Epstein-Glaser method is currently used [5, 6]. We will return to this subject elsewhere. For the usual perturbative approach including the case of conformal quantum field theory see [7–9].

The paper is organized as follows. In the second section we set up the bosonic Fock space notations, define the unsmeared chiral vertex operators and discuss their Hilbert space properties. In the third section we introduce the smeared vertex operators on the unit circle in bosonic Fock space and prove some inequalities for their vacuum expectation values in a special case. Here the (generalized) Gram inequality, proven in Appendix 1, is used.
Gram determinant inequalities are a hint to fermions behind the bosons, but we prefer to stay in the bosonic framework and in fact find, if possible, alternative proofs which are not necessarily fermionic in nature. In the fourth and fifth sections we adjust the bosonic Fock space in order to incorporate neutrality and finally interpret the results of the third section as boundedness of the smeared chiral vertex operator with a neutrality condition. We use a Fock space framework which is standard in the old style approach to string theory (and was taken up to conformal field theory and the study of some infinite dimensional Lie algebras). A possible alternative approach is mentioned in Sect. 6. In that section a discussion of the results obtained and perspectives concerning conformal field theory and the sine-Gordon model follows. Appendices provide complementary results partially used in the main text which could also be of independent interest.

2. Unsmeared Chiral Vertex Operators in Bosonic Fock Space

Let $a_n$, $n \in \mathbb{Z} - \{0\}$, generate the Heisenberg algebra $[a_n, a_m] = n\delta_{n,-m}$, where $a_n$, $n \ge 1$, are annihilation and $a_{-n}$, $n \ge 1$, creation operators in bosonic Fock space $F$. We will also consider central extensions by $a_0$ with $a_0 \Phi_0(\alpha) = \alpha\Phi_0(\alpha)$, where $\Phi_0(\alpha)$ is the cyclic vacuum in the bosonic Fock space $F(\alpha)$, now indexed by the basic charge $\alpha$ (in particular we can take $\alpha = 0$). As Hilbert spaces the spaces $F(\alpha)$ are all the same and we will keep denoting them by $F$, with vacuum $\Omega \equiv \Phi_0$ instead of $\Phi_0(\alpha)$ if clear from the context. In $F$ we consider the usual basis
\[
\Phi_\eta = \frac{1}{(\eta!\, I^\eta)^{1/2}}\, a_{-k}^{\eta_k} \cdots a_{-1}^{\eta_1}\, \Phi_0, \tag{1}
\]
where
\[
\eta = (\eta_1, \eta_2, \ldots), \quad \eta_i \ge 0, \qquad
\eta! = \prod_{i=1}^\infty \eta_i!, \qquad
I^\eta = \prod_{i=1}^\infty i^{\eta_i}, \qquad
\|\eta\| = \sum_{i=1}^\infty i\,\eta_i < \infty.
\]

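Let $F_0 \subset F$ be the linear span of the $\Phi_\eta$. The (unbounded, closed, densely defined) operators $a_n$, $n \ne 0$, act as usual for $n \ge 1$,
\[
a_n \Phi_\eta = \sqrt{n\eta_n}\, \Phi_{\eta - e_n}, \qquad
a_{-n} \Phi_\eta = \sqrt{n(\eta_n + 1)}\, \Phi_{\eta + e_n}, \tag{2}
\]
where $e_n$ is the unit vector in $l^2$ with zero components for $k \ne n$ and one for $k = n$. Let $\gamma, z \in \mathbb{C}$, $z \ne 0$. The formal unsmeared vertex operator in $F$ is
\[
V_\gamma(z) \equiv V(\gamma, z) = V_-(\gamma, z)\, V_+(\gamma, z) \tag{3}
\]
with
\[
V_-(\gamma, z) = \exp\left( \gamma \sum_{n=1}^\infty \frac{z^n}{n}\, a_{-n} \right), \qquad
V_+(\gamma, z) = \exp\left( -\gamma \sum_{n=1}^\infty \frac{z^{-n}}{n}\, a_n \right). \tag{4}
\]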
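The action (2) can be checked against the Heisenberg relation $[a_n, a_{-n}] = n$ on a single mode. The sketch below is an illustration only: it restricts to one mode $n$, truncates the occupation number $\eta_n$ at a finite cutoff, and represents a state by its list of coefficients; the function names are hypothetical. Because of the truncation, the relation can only be expected on states with no component at the top level.

```python
import math

def annihilate(mode, state):
    """a_n on a single-mode state: coefficient of level k moves to level k-1 with weight sqrt(n k)."""
    out = [0.0] * len(state)
    for k in range(1, len(state)):
        out[k - 1] = math.sqrt(mode * k) * state[k]
    return out

def create(mode, state):
    """a_{-n}: coefficient of level k moves to level k+1 with weight sqrt(n (k+1)); the top level is cut off."""
    out = [0.0] * len(state)
    for k in range(len(state) - 1):
        out[k + 1] = math.sqrt(mode * (k + 1)) * state[k]
    return out

def commutator_on(mode, state):
    """[a_n, a_{-n}] applied to the state."""
    left = annihilate(mode, create(mode, state))
    right = create(mode, annihilate(mode, state))
    return [l - r for l, r in zip(left, right)]

# Mode n = 3, state supported on levels 0..2 of a 4-level truncation.
state = [0.5, -1.0, 2.0, 0.0]
print(commutator_on(3, state))  # should be 3 * state up to rounding
```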

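Further on we consider vertex operators $\tilde V_\gamma(z) \equiv \tilde V(\gamma, z)$ defined as follows. First consider the operator
\[
T_\gamma : F(\alpha) \to F(\alpha + \gamma) \tag{5}
\]
such that $[T_\gamma, a_n] = 0$ for $n \ne 0$ and $[T_\gamma, a_0] = -\gamma T_\gamma$. We introduce the vertex operator $\tilde V(\gamma, z)$ from $F(\alpha)$ to $F(\alpha + \gamma)$ by
\[
\tilde V_\gamma(z) \equiv \tilde V(\gamma, z) = T_\gamma\, z^{\gamma a_0}\, V(\gamma, z). \tag{6}
\]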

We will use both vertices $V(\gamma, z)$ and $\tilde V(\gamma, z)$. Since the operators $T_\gamma$ and $a_0$ are harmless, there will be, from the Hilbert space point of view, not much difference between the two vertices. Some difference will appear later on after introducing the neutrality condition. Now let us introduce the involution $a_n^+ = a_{-n}$, $n \in \mathbb{Z}$. This is generally not necessary but it is useful in applications. The formal adjoints of vertex operators are again vertex operators
\[
V(\gamma, z)^+ = V\Big( -\gamma^*, \frac{1}{z^*} \Big), \qquad
\tilde V(\gamma, z)^+ = (z^*)^{-(\gamma^*)^2}\, \tilde V\Big( -\gamma^*, \frac{1}{z^*} \Big), \tag{7}
\]
where $z^*$, $\gamma^*$ are the complex conjugates of $z, \gamma \in \mathbb{C}$ and $T_\gamma^+ = T_{\gamma^*}$. For the purpose of the computations to follow we remark that $[a_0, T_\gamma] = \gamma T_\gamma$ implies $z^{\gamma_1 a_0} T_{\gamma_2} = z^{\gamma_1\gamma_2}\, T_{\gamma_2}\, z^{\gamma_1 a_0}$. The proof is a simple computation. We remark that in studying $\tilde V(\gamma, z)$ (but not $V(\gamma, z)$) consistency of the formal adjoint requires $\gamma \in \mathbb{R}$.


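We start looking at vertices no longer formal (as in (4)) but as operators in Hilbert space. We restrict here to $V = V(\gamma, z)$ as an operator in $F(\alpha)$ with $\alpha = 0$ (denoted by $F$) but similar results hold for $\tilde V$, too. A direct computation in Fock space [10–12] shows that we obtain well-defined matrix elements of $V$:
\[
V_{\eta,\nu}(\gamma, z) = (\Phi_\eta, V(\gamma, z)\Phi_\nu)
= \frac{1}{\sqrt{\eta!\,\nu!}} \prod_{i=1}^\infty m_{\eta_i \nu_i}\Big( \frac{\gamma}{\sqrt{i}}\, z^i,\ -\frac{\gamma}{\sqrt{i}}\, z^{-i} \Big), \tag{8}
\]
where
\[
m_{\eta\nu}(x, y) = \sum_{j=0}^{\min(\eta,\nu)} \binom{\eta}{j} \binom{\nu}{j}\, j!\, x^{\eta - j} y^{\nu - j} \tag{9}
\]
are related to the monic Charlier and Laguerre polynomials [13]
\[
C_n^{(a)}(x) = n!\, L_n^{(x - n)}(a) = \sum_{l=0}^n \binom{n}{l} \binom{x}{l}\, l!\, (-a)^{n - l} \tag{10}
\]
by
\[
m_{\eta\nu}(x, y) = y^{\nu - \eta}\, C_\eta^{(-xy)}(\nu). \tag{11}
\]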

Note that the product in (8) is finite because the pair $(\eta_i, \nu_i)$ is different from $(0, 0)$ for only a finite number of $i$ and $m_{00}(x, y) = 1$. The generating function of $C_n^{(a)}(x)$ is
\[
\sum_{n=0}^\infty C_n^{(a)}(x)\, \frac{w^n}{n!} = e^{-aw}(1 + w)^x \tag{12}
\]
and the orthogonality relation
\[
\int_0^\infty C_m^{(a)}(x)\, C_n^{(a)}(x)\, d\psi^{(a)}(x) = a^n\, n!\, \delta_{m,n} \tag{13}
\]
holds with respect to the step function $\psi^{(a)}$ with jumps $(x!)^{-1} e^{-a} a^x$ at $x = 0, 1, 2, \ldots$ and normalization
\[
\sum_{x=0}^\infty \frac{e^{-a} a^x}{x!} = 1. \tag{14}
\]
For $a > 0$ (our case if $\gamma$ is real) this is the Poisson distribution. At this point it is interesting to remark that the basic formula for giving a Hilbert space meaning to formal products of vertex operators (see later) corresponds to the following generalization of the orthogonality relation for the Charlier polynomials [13]:

Lemma 1. For $x, y, z, w \in \mathbb{C}$ and nonnegative integers $i, j$,
\[
\sum_{k=0}^\infty \frac{1}{k!}\, m_{ik}(x, y)\, m_{kj}(z, w) = m_{ij}(x + z,\ y + w)\, e^{yz} \tag{15}
\]
holds in the sense of formal power series in the variables.
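Lemma 1 can be tested numerically straight from the definition (9). The check below uses the exponential factor $e^{yz}$, which is what (9) produces in the simplest case $i = j = 0$, where $m_{0k}(x, y) = y^k$ and $m_{k0}(z, w) = z^k$ give $\sum_k (yz)^k / k! = e^{yz}$. This is a sketch for real arguments with the infinite sum cut off at a fixed number of terms; the function names and the cutoff are illustrative choices, not taken from [10].

```python
import math

def m(eta, nu, x, y):
    """m_{eta nu}(x, y) as in (9)."""
    return sum(math.comb(eta, j) * math.comb(nu, j) * math.factorial(j)
               * x ** (eta - j) * y ** (nu - j)
               for j in range(min(eta, nu) + 1))

def lemma_lhs(i, j, x, y, z, w, terms=60):
    """Truncated left-hand side of (15): sum over k of m_ik(x, y) m_kj(z, w) / k!."""
    return sum(m(i, k, x, y) * m(k, j, z, w) / math.factorial(k) for k in range(terms))

def lemma_rhs(i, j, x, y, z, w):
    """Right-hand side m_ij(x + z, y + w) * exp(y z)."""
    return m(i, j, x + z, y + w) * math.exp(y * z)

for i, j in [(0, 0), (1, 1), (2, 3), (4, 1)]:
    print(i, j, lemma_lhs(i, j, 0.3, -0.7, 0.5, 0.2), lemma_rhs(i, j, 0.3, -0.7, 0.5, 0.2))
```

The factor reflects the single-mode reordering $e^{ya} e^{za^\dagger} = e^{yz} e^{za^\dagger} e^{ya}$ for $[a, a^\dagger] = 1$, which is the mechanism behind the product formula for vertex operators.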


The proof, which is a long induction argument, can be found in [10]. It encodes the fact that formally the product of two vertex operators is, up to scalar factors, again a vertex-like operator. Generally the matrix elements $V_{\eta,\nu} = V_{\eta,\nu}(\gamma, z)$ define an operator $V = V(\gamma, z)$ in Hilbert space $F$ (no longer formal) by
\[
V\Psi = \sum_\eta \sum_\nu V_{\eta,\nu}\, (\Phi_\nu, \Psi)\, \Phi_\eta \tag{16}
\]
for $\Psi$ in the domain of definition
\[
D(V) = \Big\{ \Psi \in F : \lim_{k\to\infty} \sum_{\|\nu\| \le k} V_{\eta,\nu}\, (\Phi_\nu, \Psi) \text{ exists for all } \eta \text{ and } \sum_\eta \Big| \sum_\nu V_{\eta,\nu}\, (\Phi_\nu, \Psi) \Big|^2 < \infty \Big\}. \tag{17}
\]
Certainly the domain of definition of $V$ can be zero, $D(V) = \{0\}$. In this case the matrix elements $V_{\eta,\nu}$ determine $V$ only as a bilinear form and not properly as an operator. This will really happen in some cases below. Using definition (16, 17), a bunch of results on $V = V(\gamma, z)$ has been proven [11] using mainly Lemma 1 and coherent states [14] generated by exponentials of type $V_-$ in (4). We select what is relevant for us:

Theorem 2. We have for arbitrary $\gamma \in \mathbb{C}$,
(i) for $|z| < 1$, $V(\gamma, z)$ is densely defined with $F_0$ in its domain of definition $D(V(\gamma, z))$.
(ii) For $|z| > 1$ we have $D(V(\gamma, z)) = \{0\}$.
(iii) For $|z| < 1$, $V(\gamma, z)$ is not closable.
(iv) Let $|z_2| < |z_1| < 1$, then $V(\gamma, z_2) F_0 \subset D(V(\gamma, z_1))$.

Proofs of the theorem are based on Lemma 1 [10, 11]. In particular, if the involution $a_n^+ = a_{-n}$, $n \in \mathbb{Z}$, is introduced, the adjoints $V^+(\gamma, z)$ cannot be defined as operators for $|z| < 1$. The situation with $V(z)$, $V^+(z)$ appears to be somewhat similar to that of annihilation and creation operators $a(x)$, $a^+(x)$ in elementary quantum mechanics. Indeed, the domain of definition of $a^+(x)$ as an operator in Fock space is zero, but working with bilinear forms instead of operators saves the matter. Only after smearing, $a^+(f)$, $f \in L^2$, becomes a (nontrivial) operator. However, the reader should not push this analogy too far because of chiral properties of $V$, $V^+$ which are absent in $a$, $a^+$. At this stage we retain the fact that there is a striking asymmetry between unsmeared vertices inside and outside the unit circle, as far as their operator properties in the Fock space $F$ are concerned. The symmetry is restored after smearing on the unit circle, as we will see below.

Corollary 3. For arbitrary $\gamma \in \mathbb{C}$ and $|z| < 1$ the vertex $V(\gamma, z)$ is an unbounded operator in Fock space.

This is a consequence of property (iii). Properties (i)–(iii) say nothing about the case $|z| = 1$. Property (iii) shows that $V(\gamma, z)$ in the interesting region $|z| < 1$ is a poor operator. Nevertheless, property (iv) allows for defining products
\[
V(\gamma_1, z_1)\, V(\gamma_2, z_2) \cdots V(\gamma_r, z_r) \tag{18}
\]


for $|z_r| < |z_{r-1}| < \ldots < |z_1| < 1$ as densely defined operators in $F$ with $F_0$ in their domains, a property justifying vertex algebras. Some remarks are in order: First, we didn't use any kind of braid relation between vertices and in fact defined Hilbert space products $V(\gamma_1, z_1) \cdots V(\gamma_r, z_r)$ only for $|z_r| < \ldots < |z_1| < 1$. Second, we didn't introduce the neutrality condition, common in massless two-dimensional field theory. Third, it is interesting to remark that the product of vertices (18) is defined in $F$ without an invariant domain for the factors. Indeed, $F_0$ is not invariant under $V(\gamma, z)$. This is the situation with unsmeared vertex operators. In the next section we will show that a smearing operation applied to vertices dramatically improves their properties (near $|z| = 1$) such that finally under the neutrality condition they turn into bounded operators in the bosonic Fock space to be precisely defined below. We consider the case $-1 \le \gamma \le 1$. For later use we mention the following formula, which now has a Hilbert space operator interpretation (see the remarks above concerning the existence of products):
\[
\begin{aligned}
\tilde V(\gamma_1, z_1)\, \tilde V(\gamma_2, z_2) \cdots \tilde V(\gamma_r, z_r)
&= \prod_{1 \le i < j \le r} (z_i - z_j)^{\gamma_i\gamma_j}\ T_{\sum \gamma_j} \times \prod_{i=1}^r z_i^{\gamma_i a_0} \\
&\quad \times \exp\left( \sum_{n=1}^\infty \frac{1}{n} \Big( \sum_{i=1}^r \gamma_i z_i^{n} \Big) a_{-n} \right)
\exp\left( -\sum_{n=1}^\infty \frac{1}{n} \Big( \sum_{i=1}^r \gamma_i z_i^{-n} \Big) a_n \right)
\end{aligned}
\tag{19}
\]
for $|z_r| < |z_{r-1}| < \ldots < |z_1| < 1$. In the case of $V(\gamma_i, z_i)$ the factors $(z_i - z_j)^{\gamma_i\gamma_j}$ have to be replaced by $\big(1 - \frac{z_j}{z_i}\big)^{\gamma_i\gamma_j}$. In particular (19) reproduces the well known formula for the $n$-point function under neutrality $\sum \gamma_i = 0$,
\[
(\Phi_0, \tilde V(\gamma_1, z_1) \cdots \tilde V(\gamma_n, z_n)\Phi_0) = \prod_{i<j} (z_i - z_j)^{\gamma_i\gamma_j}. \tag{20}
\]

Here we set $\alpha = 0$; otherwise some extra powers of $z_i$ appear. The determination in (19) and (20) is fixed as usual by taking $\log(z_i - z_j)$ real for real $0 < z_j < z_i$.

3. Smeared Chiral Vertex Operators

The formal smeared chiral vertex operator on the unit circle $S^1$ is
\[
V(\gamma, f) = \frac{1}{2\pi i} \int_{|z|=1} V(\gamma, z)\, f(z)\, dz, \tag{21}
\]
where $f = f(z)$ is a test function on $S^1$ to be chosen below. In this section we start by looking at (21) as a limit for $z$ approaching the unit circle from the interior, i.e. (21) has to be understood as
\[
V(\gamma, f) = \lim_{r \to 1^-} \frac{1}{2\pi i} \int_{|z|=1} V(\gamma, rz)\, f(z)\, dz. \tag{22}
\]
Let us consider the case $\gamma = 1$ first. When appropriate we use the notation $V(\gamma = 1, z) = V(z)$. A rough idea of what happens is obtained by calculating the following norm in the unsmeared case for $|z| = 1 - \varepsilon$, $\varepsilon \to 0^+$, with help of (19)


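\[
\|V(z)\Phi_0\|^2 = (\Phi_0, V(z)^+ V(z)\Phi_0) = \frac{1}{1 - z^* z}\bigg|_{|z| = 1 - \varepsilon,\ \varepsilon \to 0^+} = \infty, \tag{23}
\]
whereas in the smeared case with $f \in L^2(S^1)$,
\[
\|V(f)\Phi_0\|^2 = (\Phi_0, V(f)^+ V(f)\Phi_0)
= \frac{1}{4\pi^2} \int\!\!\int_{|z| = |w^*| = 1} \frac{1}{1 - w^* z}\, f^*(w^*)\, f(z)\, dw^*\, dz
= \sum_{n=1}^\infty |c_{-n}|^2 \le \|f\|_2^2 < \infty. \tag{24}
\]
Here $c_n$ is the Fourier coefficient
\[
c_n = \frac{1}{2\pi i} \int_{|z|=1} z^{-n-1} f(z)\, dz = \frac{1}{2\pi} \int_0^{2\pi} e^{-in\theta} f(e^{i\theta})\, d\theta \tag{25}
\]
and the $L^2(S^1)$ norm is given by
\[
\|f\|_2^2 = \frac{1}{2\pi} \int_0^{2\pi} |f(e^{i\theta})|^2\, d\theta = \sum_{n=-\infty}^{+\infty} |c_n|^2. \tag{26}
\]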

In (24) use was made of the formal smeared adjoint
\[
V(\gamma, f)^+ = \frac{1}{2\pi} \int_{|z|=1} V(\gamma, z)^+ f(z)^*\, dz^*
= \frac{1}{2\pi} \int_{|z|=1} V\Big( -\gamma^*, \frac{1}{z^*} \Big) f^*(z^*)\, dz^*. \tag{27}
\]

Following the convention in (23), $z$ approaches the unit circle from the interior and consequently $1/z^*$ approaches it from the exterior. In (24), before taking the limit to the unit circle, we have $|z| < 1 < |w^*|^{-1}$, which assures the validity of the geometric series used there. The rigorous definition of the smeared vertex $V(\gamma, f)$ and adjoint $V(\gamma, f)^+$ as operators in bosonic Fock space is analogous to the definition of their unsmeared counterparts in Sect. 2 (cf. further in this section). In addition we have to remark that the smearing in $L^2$ enables us to take them on the unit circle, a fact which is not true in the unsmeared case. This is the content of the computation in (24), as opposed to (23). For the convenience of the reader let us give some details in the special case $V(\gamma, f)\Phi_0$. Indeed, using formulas from Sect. 2 we have
\[
\lambda_\eta \equiv \lambda_\eta(z) = (\Phi_\eta, V(\gamma, z)\Phi_0) = \prod_{i=1}^\infty (\eta_i!)^{-1/2} \Big( \frac{\gamma z^i}{\sqrt{i}} \Big)^{\eta_i}
\]
and
\[
\begin{aligned}
\sum_\eta |\lambda_\eta|^2
&= \sum_\eta \prod_{i=1}^\infty \frac{1}{\eta_i!} \Big( \frac{|\gamma z^i|^2}{i} \Big)^{\eta_i}
= \prod_{i=1}^\infty \sum_{k=0}^\infty \frac{1}{k!} \Big( \frac{|\gamma z^i|^2}{i} \Big)^{k} \\
&= \prod_{i=1}^\infty \exp\Big( \frac{|\gamma|^2 |z^{2i}|}{i} \Big)
= \exp\Big( |\gamma|^2 \sum_{i=1}^\infty \frac{|z^2|^i}{i} \Big)
= \Big( \frac{1}{1 - |z|^2} \Big)^{|\gamma|^2},
\end{aligned}
\tag{28}
\]
showing that in the unsmeared case $\sum_\eta |\lambda_\eta|^2 < \infty$ for $|z| < 1$, but not for $|z| = 1$ because of the divergence of the harmonic series. On the other hand, in the smeared out case for
\[
\Lambda_\eta = \frac{1}{2\pi i} \int_{|z|=1} \lambda_\eta(z)\, f(z)\, dz \tag{29}
\]

we have $\sum_\eta |\Lambda_\eta|^2 < \infty$ as a consequence of the computation above and Parseval's relation for the Fourier coefficients of $f \in L^2(S^1)$. This shows that the vacuum is in the domain of definition of $V(\gamma, f)$ when concentrating on the unit circle, which was not the case for the unsmeared $V$ (see (23)). Similar considerations hold for products of smeared vertex operators appearing below. We leave the details to the interested reader. We remark that some care is needed for the case $|z| > 1$ (including the adjoint with $|z| < 1$). The truncation which makes divergent sums well defined is provided by looking at the problem in the framework of bilinear forms. The way in which the bilinear adjoint operation is implemented is different from the standard case of annihilation and creation operators. It can be understood in terms of Fourier coefficients of the smearing function: under adjunction they reverse the index. On products of smeared vertices applied to the vacuum the adjoint operation is equivalent (according to the Hardy decomposition) to the change of the regularization in the standard kernel from the inside to the outside of the unit circle. In this way analytic continuation and braiding enter the scene. Consequently the results are fully consistent with those obtained by the usual formal work supplemented by analytic continuation, braiding etc. At the same time, for this particular example, one obtains a Hilbert space version of methods used in operator vertex algebras (formal fields, formal distributions, etc.). Anticipating, the smearing operation reinforces symmetry: both $V(f)$, $f \in L^2$, and its adjoint are densely defined, closed operators. Neutrality will turn them into bounded operators in the case $-1 \le \gamma \le 1$. We didn't study further $V(f)$, $f \in L^2$, without the neutrality condition here. Remark that even for the case $\gamma \in \mathbb{C}$, $|\gamma| \le 1$, the norm $\|V(\gamma, f)\Phi_0\|^2$ is finite for $f \in L^2(S^1)$. Indeed, for $1 > x = |\gamma|^2 > 0$ the binomial series ($|a| < 1$)
\[
(1 - a)^{-x} = 1 + \frac{x}{1!}\, a + \frac{x(x+1)}{2!}\, a^2 + \frac{x(x+1)(x+2)}{3!}\, a^3 + \ldots \tag{30}
\]

Smeared and Unsmeared Chiral Vertex Operators

283

has positive coefficients and gives kV (γ, f )80 k2 =

x x(x + 1) |c−1 |2 + |c−2 |2 + . . . ≤ 1! 2!

1·2 1 |c−1 |2 + |c−2 |2 + . . . = kV (γ = 1, f )80 k2 < ∞. (31) 1! 2! The two-point function results (24) and (31) in the smeared out case are encouraging as opposed to the unsmeared case (23). The estimates by geometric series expansion above can be replaced by the more efficient technique of Hardy spaces (see for instance [15]). In this theory we have the direct sum decomposition ≤

$$L^2(S^1) = H_+^2 \oplus H_-^2, \tag{32}$$

where $H_+^2$ and $H_-^2$ are Hilbert spaces of $L^2$-functions with positive and zero frequencies and negative frequencies, respectively. $H_+^2$ is the usual Hardy space denoted by $H^2$. In a different language we have in $H_+^2$ $L^2$-boundary values from inside and in $H_-^2$ from outside the unit circle. The main formula we use in this context is

$$\frac{1}{2\pi i}\int_{S^1}\int_{S^1} \frac{f(z_1)g(z_2)}{z_1 - z_2}\,dz_1\,dz_2 = \int_{S^1} f^{(+)}(z_2)g(z_2)\,dz_2 = \int_{S^1} f^{(+)}(z_2)g^{(+)}(z_2)\,dz_2, \tag{33}$$

where $f^{(\pm)} \in H_\pm^2$ and the integration variables tend to the unit circle, respecting $|z_1| > 1 > |z_2|$. It is understood that the first integration goes on $z_1$ and the second on $z_2$. We use (33) in several forms which at first glance look different but are always the same formula (33). For instance, we have

$$\begin{aligned}
(V(f)\Phi_0, V(g)\Phi_0) &= (\Phi_0, V(f)^+ V(g)\Phi_0) = \frac{1}{4\pi^2}\int_{|w^*|=|z|=1} \frac{f^*(w^*)g(z)}{1 - w^* z}\,dw^*\,dz \\
&= -\frac{1}{4\pi^2}\int \frac{u^{-1} f^*(u^{-1})g(z)}{u - z}\,du\,dz = \frac{1}{2\pi i}\int \left[z^{-1} f^*(z^{-1})\right]^{(+)} g(z)\,dz = (f^{(-)}, g^{(-)}).
\end{aligned} \tag{34}$$

From (34) we get in particular for $f = g$ the previous relation (24) from which we retain

$$\|V(f)\Phi_0\| = \|f^{(-)}\|_2 \le \|f\|_2. \tag{35}$$

In the following we will generalize the relation (35) to scalar products of the form

$$(V_n(f), V_n(g)), \tag{36}$$

where

$$V_n(f) = V(f_1)V(f_2)\ldots V(f_n)\Phi_0 \tag{37}$$

and similarly for $V_n(g)$, with $f_i(z_i), g_i(z_i) \in L^2(S^1)$, $i = 1, 2, \ldots, n$ and the regularization prescription $|z_n| < |z_{n-1}| < \ldots < |z_1| < 1$.
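The comparison behind (31) and (35) can be checked on a small example. The following is a minimal sketch in Python, not part of the original argument; the function names and the sample Fourier coefficients are illustrative assumptions. It evaluates the coefficients of the binomial series (30) exactly and confirms that, for $0 < x \le 1$, the weighted sum in (31) is dominated by its value at $x = 1$, which is $\|f^{(-)}\|_2^2$.

```python
from fractions import Fraction


def binomial_coefficient(x, k):
    """Coefficient of a**k in (1 - a)**(-x), i.e. x(x+1)...(x+k-1)/k!."""
    coeff = Fraction(1)
    for m in range(k):
        coeff *= (x + m) / Fraction(m + 1)
    return coeff


def smeared_norm_squared(x, negative_coeffs):
    """Weighted sum of |c_{-k}|^2 as in (31).

    negative_coeffs[k - 1] plays the role of the Fourier coefficient c_{-k};
    the values used below are hypothetical sample data.
    """
    return sum(binomial_coefficient(x, k) * abs(c) ** 2
               for k, c in enumerate(negative_coeffs, start=1))


sample_coeffs = [3, -1, 2]  # hypothetical c_{-1}, c_{-2}, c_{-3}
norm_half = smeared_norm_squared(Fraction(1, 2), sample_coeffs)
norm_one = smeared_norm_squared(Fraction(1), sample_coeffs)
print(norm_half, norm_one)
```

For $x = 1$ every coefficient equals 1, so the sum reduces to the squared $L^2$ norm of the negative-frequency part, which is the content of (35).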


We also define $V_{-1} = V_{+1}^{+} = V^{+}$. Before proceeding to some computations let us remark that in smeared vertex operator products like

$$V_{-1}(f)V_{+1}(g) = -\frac{1}{4\pi^2}\int\!\!\int_{|z_1|=|z_2|=1} V_{-1}(z_1)V_{+1}(z_2)f(z_1)g(z_2)\,dz_1\,dz_2, \tag{38}$$

which appear in (34), it is understood that the regularization is of the type $|z_1| > 1 > |z_2|$ and the integration on $z_1$ is to be performed first. We want to stress that there is no inconsistency in the regularization $|z_1| > 1 > |z_2|$ in (38) as compared to $|z_2| < |z_1| < 1$ in (19) where the product of unsmeared vertex operators was defined. In fact, correctly speaking, the smeared product has to be defined over bilinear forms and not over single operators $V(z_1)$, $V(z_2)$ (Theorem 2 shows that $V(z_1)$ for $|z_1| > 1$ is not even defined as an operator). Continuing, we write $f = (f_1, f_2, \ldots, f_n)$, $g = (g_1, g_2, \ldots, g_n)$ and

$$\begin{aligned}
(V_n(f), V_n(g)) &= (\Phi_0, V(f_n)^+ \ldots V(f_1)^+ V(g_1) \ldots V(g_n)\Phi_0) \\
&= \frac{1}{(4\pi^2)^n}\int D(w^*, z)\prod_{i=1}^{n} \underline{f}_i^*(w_i^*)\,\underline{g}_i(z_i)\,dw_i^*\,dz_i,
\end{aligned} \tag{39}$$

where

$$D(w^*, z) = \frac{\prod_{i<j}^{n}(z_i - z_j)(w_i^* - w_j^*)}{\prod_{i,j=1}^{n}(1 - z_i w_j^*)} \tag{40}$$

and

$$\underline{f}_i(z_i) = z_i^{\,i-n} f_i(z_i), \qquad \underline{g}_i(z_i) = z_i^{\,i-n} g_i(z_i) \tag{41}$$

for $i = 1, 2, \ldots, n$. Using the Cauchy determinant formula

$$D(w^*, z) = \det\left[\frac{1}{1 - z_i w_j^*}\right]_{1 \le i, j \le n} \tag{42}$$

and expanding the determinant we get from (39)

$$(V_n(f), V_n(g)) = G_n(\underline{f}^{(-)}; \underline{g}^{(-)}), \tag{43}$$

where

$$G_n(f, g) \equiv G_n(f_1, f_2, \ldots, f_n; g_1, g_2, \ldots, g_n) = \begin{vmatrix} (f_1, g_1) & \ldots & (f_1, g_n) \\ \vdots & & \vdots \\ (f_n, g_1) & \ldots & (f_n, g_n) \end{vmatrix} \tag{44}$$

is the (generalized) Gram determinant. For $f = g$ we write for the usual Gram determinant

$$G_n(f; f) \equiv G_n(f) \equiv G_n(f_1, f_2, \ldots, f_n). \tag{45}$$
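The Cauchy determinant formula (40), (42) is easy to test on sample points. The sketch below is an illustration only, written in Python with exact rational arithmetic; the helper names and the sample values of $z_i$ and $w_j^*$ are assumptions introduced here. It compares the determinant of the matrix $1/(1 - z_i w_j^*)$ with the product formula for $D(w^*, z)$.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant by Laplace expansion along the first row (small n only)."""
    n = len(matrix)
    if n == 0:
        return 1
    if n == 1:
        return matrix[0][0]
    total = 0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * determinant(minor)
    return total


def cauchy_matrix(z, w):
    """Matrix with entries 1 / (1 - z_i w_j), as in (42)."""
    return [[1 / (1 - zi * wj) for wj in w] for zi in z]


def cauchy_product(z, w):
    """Right-hand side of (40): ratio of the two products."""
    n = len(z)
    numerator = Fraction(1)
    for i in range(n):
        for j in range(i + 1, n):
            numerator *= (z[i] - z[j]) * (w[i] - w[j])
    denominator = Fraction(1)
    for zi in z:
        for wj in w:
            denominator *= 1 - zi * wj
    return numerator / denominator


# Hypothetical sample points inside the unit disc.
z = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)]
w = [Fraction(1, 7), Fraction(2, 3), Fraction(-1, 4)]
print(determinant(cauchy_matrix(z, w)) == cauchy_product(z, w))
```

Exact fractions are used so that the two sides can be compared with `==` rather than within a floating-point tolerance.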

As a consequence of the smearing, the Cauchy determinant which often appears in two-dimensional (massless) physics (for instance in (20) with $\gamma_i = \pm 1$ and neutrality $\sum \gamma_i = 0$) goes over into a Gram determinant. Certainly $G_n(f_1, \ldots, f_n) \ge 0$, as necessary because from (43)

$$0 \le \|V_n(f)\|^2 = G_n(\underline{f}^{(-)}). \tag{46}$$

Observe that $\underline{f}_i, \underline{g}_i \in L^2(S^1)$ iff $f_i, g_i \in L^2(S^1)$ and $\|\underline{f}_i\|_2 = \|f_i\|_2$, $\|\underline{g}_i\|_2 = \|g_i\|_2$, $i = 1, 2, \ldots, n$. Using a simple Gram inequality (see Appendix 1 for a collection of Gram determinant inequalities used in this paper) we get

$$\|V(f_0)V_n(f)\| \le \|\underline{f}_0^{(-)}\|_2\,\|V_n(f)\| \le \|f_0\|_2\,\|V_n(f)\|. \tag{47}$$
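The simple Gram inequality behind (47) states that adjoining a vector $f_0$ to a family multiplies the Gram determinant by at most $\|f_0\|^2$. A minimal Python sketch, with finite sequences standing in for $L^2$ functions (the helper names and sample vectors are assumptions made for illustration):

```python
def inner(f, g):
    """Scalar product (f, g) = sum of conj(f_k) * g_k for finite sequences."""
    return sum(a.conjugate() * b for a, b in zip(f, g))


def determinant(matrix):
    """Determinant by Laplace expansion along the first row (small n only)."""
    n = len(matrix)
    if n == 0:
        return 1
    if n == 1:
        return matrix[0][0]
    total = 0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * determinant(minor)
    return total


def gram(xs, ys):
    """Generalized Gram determinant det[(x_i, y_j)], as in (44)."""
    return determinant([[inner(x, y) for y in ys] for x in xs])


# Hypothetical sample vectors.
f0 = [1, 2, 0, 1]
fs = [[0, 1, 1, 0], [2, 0, 1, 1]]

lhs = gram([f0] + fs, [f0] + fs)       # G(f0, f1, f2)
rhs = inner(f0, f0) * gram(fs, fs)     # ||f0||^2 G(f1, f2)
print(lhs, rhs)
```

For these sample vectors the script prints `36 66`, consistent with $0 \le G(f_0, f_1, f_2) \le \|f_0\|^2 G(f_1, f_2)$.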

The two-point function estimate (34) over only negative frequencies cannot be saved here because of the $z_i$-powers in (41). The inequality (47) is a generalization of (35). Similar considerations apply to $\tilde V$ with the same bounds

$$\|\tilde V(f_0)\tilde V_n(f)\| \le \|f_0\|_2\,\|\tilde V_n(f)\|, \tag{48}$$

where $\tilde V_n$ is defined as in (37) with $V$ replaced by $\tilde V$. On the l.h.s. of (48) the norm is taken in $\mathcal{F}(n+1)$ (we have $\gamma = 1$), whereas on the r.h.s. it is taken in $\mathcal{F}(n)$ with $\alpha$ arbitrary (here $\alpha = 0$). In fact, we can do better: for $\tilde V(f)$ there are no powers of $z$ which force $f$ into $\underline{f}$ as in (41) and the bound is $\|f^{(-)}\|_2$, involving only the negative frequency part in the Hardy decomposition of $f$. This remark applies to all operators $\tilde V(f)$ to follow. Indeed, instead of (39) we now have

$$\begin{aligned}
(\tilde V_n(f), \tilde V_n(g)) &= (\tilde V(f_1) \ldots \tilde V(f_n)\Phi_0, \tilde V(g_1) \ldots \tilde V(g_n)\Phi_0) \\
&= \frac{1}{(4\pi^2)^n}\int D(w^*, z)\prod_{i=1}^{n} f_i^*(w_i^*)\,g_i(z_i)\,dw_i^*\,dz_i.
\end{aligned} \tag{49}$$

This formula is obtained by commuting first the operators $T_\gamma$ through the operators $a_0$ in $\tilde V_n(f)$ and $\tilde V_n(g)$ separately and then using the adjoint operation on the $V$'s as in (39). The above formula proves the claim.

Let us pause for a moment in order to understand what we have achieved and what we are going to do. First observe that up to now the elementary Gram determinant inequality was used to estimate the norms (35) and (47) directly in the bosonic Fock space, without appealing to any kind of abstract bosonic-fermionic equivalence and without introducing the braiding relation. Our goal in the next section is to prove boundedness of vertex operators in bosonic Fock space. We will start working with the vertex operator $\tilde V(f)$ restricted from $\mathcal{F}^\oplus = \oplus_{n=0}^{\infty}\mathcal{F}(\alpha + n)$ to

$$\tilde{\mathcal{F}}^\oplus = \oplus_{n=0}^{\infty}\tilde{\mathcal{F}}(\alpha + n), \tag{50}$$

where $\alpha$ is arbitrary and $\tilde{\mathcal{F}}(\alpha + n)$ is the closed subspace of $\mathcal{F}(\alpha + n)$ generated by $\tilde V_n(f)$ as in (37), i.e.

$$\tilde V_n(f) = \tilde V(f_1)\tilde V(f_2) \ldots \tilde V(f_n)\Phi_0(\alpha) \tag{51}$$

with $\Phi_0(\alpha)$ being the vacuum in $\mathcal{F}(\alpha)$ (see also [24]). In choosing orthogonal direct sums in (50), the physical neutrality (the "neutrality condition") is realized in both $\mathcal{F}^\oplus$ and $\tilde{\mathcal{F}}^\oplus$. In physics neutrality is a consequence of the massless limit in two-dimensional quantum field theory [18, 19]. This idea fits well in our framework but we do not continue to discuss it. Let us only remark that in quantum field theory the massless limit in each adequate framework requires supplementary conditions (see for instance [23]); in our case neutrality is a condition of this kind. Certainly, behind $\tilde{\mathcal{F}}^\oplus$ a fermionic structure is hidden


and this is related to the fact that on $V(f)$ (or $\tilde V(f)$) a braiding relation can consistently be imposed, which in the case $\gamma = 1$ degenerates into antisymmetry. This will no longer be the case in Sect. 4 where $\gamma \ne 1$. Nevertheless it is exactly this "fermionic flavor" in bosonic Fock space which does the job for us. For proving boundedness of $\tilde V(f)$ in this framework we need a (generalized) Gram determinant inequality to be proven in Appendix 1. In the fifth section we will extend the results from the present $\gamma = 1$ to $\gamma \in [-1, 1]$. The Hilbert space $\tilde{\mathcal{F}}^\oplus$ in (50) is a reasonable framework to study the vertex $\tilde V(f)$ with neutrality condition. We take $\alpha = 0$ in order to retain the simple formula (28), although related results can be formulated in the general case. The vertex operator $V(f)$, on the other hand, lives in the original bosonic Fock space $\mathcal{F} \equiv \mathcal{F}(\alpha = 0)$. This case, which is of interest too, is touched on in Sect. 5, where the connection to massless two-dimensional physics is also briefly discussed.

4. Boundedness of Vertex Operators for γ = 1 Through Generalized Gram Inequality

By construction the set

$$\tilde V_n(f) = \prod_{i=1}^{n}\tilde V(f_i)\Phi_0, \tag{52}$$

$f = (f_1, f_2, \ldots, f_n)$, $f_i \in L^2(S^1)$, $i = 1, \ldots, n$ and $n = 0, 1, 2, \ldots$, is total in $\tilde{\mathcal{F}}^\oplus$. We look first at

$$\Big\|\tilde V(f_0)\sum_{j=1}^{m}\alpha_j \tilde V_n(f_j)\Big\|, \tag{53}$$

where $\alpha_j$ are complex constants. We changed the notation a little and denote now by $f_j$ in $\tilde V_n(f_j)$ the $n$-tuple $f_j = (f_{j1}, \ldots, f_{jn})$, $j = 1, 2, \ldots, m$. The norm in (53) is in $\mathcal{F}(n)$. To estimate this we use the following (generalized) Gram inequality (see Appendix 1):

$$\sum_{i,j=1}^{m}\alpha_i^*\alpha_j\,G(f_0, f_i; f_0, f_j) \le \|f_0\|_2^2 \sum_{i,j=1}^{m}\alpha_i^*\alpha_j\,G(f_i; f_j). \tag{54}$$

By expanding (53) and using (54) as in (47) we get

$$\Big\|\tilde V(f_0)\sum_{j=1}^{m}\alpha_j\tilde V_n(f_j)\Big\| = \Big\|\sum_{j=1}^{m}\alpha_j\tilde V_{n+1}(f_j^0)\Big\| \le \|f_0\|_2\,\Big\|\sum_{j=1}^{m}\alpha_j\tilde V_n(f_j)\Big\|, \tag{55}$$

where $f_j^0 = (f_0, f_j)$, $j = 1, 2, \ldots, m$, and in (55) obvious norms in $\mathcal{F}(n+1)$ and $\mathcal{F}(n)$, respectively, are chosen. Taking into account the direct sum structure of $\tilde{\mathcal{F}}^\oplus$, it follows that $\tilde V(f_0)$, $f_0 \in L^2(S^1)$, is bounded on a dense set and therefore bounded as an operator in the whole $\tilde{\mathcal{F}}^\oplus$ with bound $\|f_0\|_2$. Note that the braiding condition on vertices (which would turn them into chiral fermions) was not imposed. In fact vertices $\tilde V_{\gamma=1}(z)$ and $\tilde V_{\gamma=-1}(z)$ as constructed in this paper can be explicitly realized as (chiral) fermions satisfying CAR. We leave the details for the interested reader.


In our proof above we did not need CAR between $\tilde V_1 = \tilde V_{\gamma=1}$ and $\tilde V_{-1} = \tilde V_{\gamma=-1}$. Moreover the bound $\|f_0\|_2$ obtained in (55) can be improved, as remarked in the preceding section, to $\|f_0^{(-)}\|_2$ and $\|f_0^{(+)}\|_2$ for $\tilde V_1$ and $\tilde V_{-1}$, respectively. But this seeming advantage comes at a high expense: the Gram inequalities produce boundedness only on the cyclic subspace of $\tilde V_1$ on which $\tilde V_{-1}$ is not defined and boundedness of $\tilde V_{-1}$ on the cyclic subspace of $\tilde V_1$. These cyclic subspaces (and their direct sum) are smaller than the full fermionic Fock space and cannot accommodate observables (selfadjoint operators). The boundedness of $\tilde V_1$ (or $\tilde V_{-1}$) in the full fermionic Hilbert space (cyclically generated by $\tilde V_1$ and $\tilde V_{-1}$) does not follow from the Gram inequalities (at least the classical Gram inequality and the generalization proved in Appendix 1) and requires the full power of CAR (cf. [3]).

5. Boundedness of Vertex Operators with −1 ≤ γ ≤ 1

In this section we study vertices $\tilde V(\gamma, z) \equiv \tilde V_\gamma(z)$ with $-1 \le \gamma \le 1$ as operators in $\mathcal{F}_\gamma^\oplus = \oplus_{n=0}^{\infty}\mathcal{F}(n\gamma)$, where $\tilde V_\gamma(z)$ maps from $\mathcal{F}(\alpha)$ to $\mathcal{F}(\alpha + \gamma)$ in order to account for the neutrality condition. We restrict $\tilde V_\gamma(z)$ to

$$\tilde{\mathcal{F}}_\gamma^\oplus = \oplus_{n=0}^{\infty}\tilde{\mathcal{F}}(n\gamma), \tag{56}$$

where $\tilde{\mathcal{F}}(n\gamma)$ is the closed subspace of $\mathcal{F}(n\gamma)$ generated by

$$\prod_{i=1}^{n}\tilde V_\gamma(f_i)\Phi_0, \qquad f_i \in L^2(S^1),\ i = 1, 2, \ldots, n$$

and $n = 0, 1, 2, \ldots$. Here $\tilde{\mathcal{F}}(0)$ is one-dimensional, generated by $\Phi_0$. We use a trick from [1, 3] which in our bosonic framework becomes fully transparent. Consider two copies of Heisenberg algebras generated by $a_n, a'_n$, $n \in \mathbb{Z}$, satisfying

$$[a_n, a_m] = n\delta_{n,-m}, \qquad [a'_n, a'_m] = n\delta_{n,-m}, \qquad [a_n, a'_m] = 0, \tag{57}$$

and vertex operators $\tilde V_\gamma(z)$, $\tilde V'_{\gamma'}(z)$,

$$\begin{aligned}
\tilde V_\gamma(z) &= T_\gamma\, z^{\gamma a_0}\exp\Big(\gamma\sum_{n=1}^{\infty}\frac{z^n}{n}a_{-n}\Big)\exp\Big(-\gamma\sum_{n=1}^{\infty}\frac{z^{-n}}{n}a_n\Big), \\
\tilde V'_{\gamma'}(z) &= T'_{\gamma'}\, z^{\gamma' a'_0}\exp\Big(\gamma'\sum_{n=1}^{\infty}\frac{z^n}{n}a'_{-n}\Big)\exp\Big(-\gamma'\sum_{n=1}^{\infty}\frac{z^{-n}}{n}a'_n\Big)
\end{aligned} \tag{58}$$

with

$$[T'_{\gamma'}, a_0] = [T_\gamma, a'_0] = [T_\gamma, T'_{\gamma'}] = 0, \qquad [T_\gamma, a_0] = -\gamma T_\gamma, \qquad [T'_{\gamma'}, a'_0] = -\gamma' T'_{\gamma'}. \tag{59}$$


Note that $\tilde V_\gamma(z)$, $\tilde V'_{\gamma'}(z)$ refer to the same complex variable $z$ but act in different Fock spaces. In the tensor product space we consider the tensor product operator on the same variable $z$,

$$\tilde V(z) = \tilde V_\gamma(z) \otimes \tilde V'_{\gamma'}(z) \tag{60}$$

with $\gamma, \gamma' \in [-1, 1]$, $\gamma^2 + \gamma'^2 = 1$. Such tensor products with a restriction to the diagonal in the space-time variables were introduced long ago in axiomatic field theory [26]. Observe that the notation in (60) which suggests a vertex $\tilde V_{\gamma=1}(z)$ is consistent. Indeed, with

$$A_n = \gamma a_n + \gamma' a'_n,\ n \in \mathbb{Z}, \qquad T = T_\gamma T'_{\gamma'}, \tag{61}$$

we have

$$[A_n, A_m] = [\gamma a_n + \gamma' a'_n, \gamma a_m + \gamma' a'_m] = n\delta_{n,-m}(\gamma^2 + \gamma'^2) = n\delta_{n,-m} \tag{62}$$

and $[T, A_0] = -(\gamma^2 + \gamma'^2)T = -T$, $[T, A_n] = 0$, $n \in \mathbb{Z} - \{0\}$. We apologize for the sloppy notation. One should read in (61) $A_n = (\gamma a_n \otimes I) + (I' \otimes \gamma' a'_n)$, where $I, I'$ are identities etc. The tensor product $\tilde V(z)$ in (60) is an extension of the vertex operator

$$\tilde V_{\gamma=1}(z) = T z^{A_0}\exp\Big(\sum_{n=1}^{\infty}\frac{z^n}{n}A_{-n}\Big)\exp\Big(-\sum_{n=1}^{\infty}\frac{z^{-n}}{n}A_n\Big) \tag{63}$$

from its Fock space $\tilde{\mathcal{F}}_{\gamma=1}^\oplus$ to the tensor product of Fock spaces $\tilde{\mathcal{F}}_\gamma^\oplus$, $\tilde{\mathcal{F}}_{\gamma'}^\oplus$, in which $\tilde V_\gamma(z)$, $\tilde V'_{\gamma'}(z)$ live. As in Sect. 4, by the generalized Gram inequality the smeared vertex operator $\tilde V_{\gamma=1}(f)$, $f \in L^2(S^1)$, is bounded in $\tilde{\mathcal{F}}_{\gamma=1}^\oplus$. We claim that its smeared extension $\tilde V(f)$ in (60) is also bounded with the same bound $\|f\|_2$ as operator in $\tilde{\mathcal{F}}_\gamma^\oplus \otimes \tilde{\mathcal{F}}_{\gamma'}^\oplus$. This follows in our case by explicitly realizing $\tilde V(f)$ as a (chiral) fermionic operator satisfying CAR on a dense domain (see [25]) not only in $\tilde{\mathcal{F}}^\oplus$ but also in $\tilde{\mathcal{F}}_\gamma^\oplus \otimes \tilde{\mathcal{F}}_{\gamma'}^\oplus$ (cf. [3]) and using the fact that the fermionic bound is algebraically determined (see for instance Theorem 1 in [25]). In fact, looking at the proof of the generalized Gram inequality in Appendix 1 one realizes that it is also shaped after the proof of Theorem 1 in [25]. Let $\Psi, \Phi \in \tilde{\mathcal{F}}_\gamma^\oplus$ and $\Psi', \Phi' \in \tilde{\mathcal{F}}_{\gamma'}^\oplus$. Then, for $f \in L^2(S^1)$ we can write

$$\big(\Psi \otimes \Psi', \tilde V(f)(\Phi \otimes \Phi')\big) = \frac{1}{2\pi}\int (\Psi, \tilde V_\gamma(z)\Phi)(\Psi', \tilde V'_{\gamma'}(z)\Phi')f(z)\,dz. \tag{64}$$

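We choose $\Phi' = \Omega'$ (vacuum in $\tilde{\mathcal{F}}_{\gamma'}^\oplus$, now denoted by $\Omega'$) and $\Psi' = \tilde V'_{\gamma'}(g)\Omega'$ with $g(w) = w^{-1}$, $w \in S^1$. We write

$$\begin{aligned}
(\Psi', \tilde V'_{\gamma'}(z)\Phi') &= (\tilde V'_{\gamma'}(g)\Omega', \tilde V'_{\gamma'}(z)\Omega') = \Big(\frac{1}{2\pi}\int \tilde V'_{\gamma'}(w)g(w)\,dw\,\Omega', \tilde V'_{\gamma'}(z)\Omega'\Big) \\
&= \frac{1}{2\pi i}\int_{|w^*|=1}\Big(\frac{1}{1 - w^* z}\Big)^{\gamma'^2}\frac{1}{w^*}\,dw^* = 1.
\end{aligned} \tag{65}$$

Then

$$\big(\Psi \otimes \Psi', \tilde V(f)(\Phi \otimes \Phi')\big) = \frac{1}{2\pi}\int (\Psi, \tilde V_\gamma(z)\Phi)f(z)\,dz. \tag{66}$$

Take $\gamma'^2 = 1 - \gamma^2$ and use the boundedness of $\tilde V(f)$ as an operator in $\tilde{\mathcal{F}}_\gamma^\oplus \otimes \tilde{\mathcal{F}}_{\gamma'}^\oplus$ to obtain

$$|(\Psi, \tilde V_\gamma(f)\Phi)| \le \|\Psi\|\,\|\Phi\|\,\|\tilde V'_{\gamma'}(g)\Omega'\|\,\|f\|_2. \tag{67}$$

Now we use (31) to write

$$\|\tilde V'_{\gamma'}(g)\Omega'\| \le \|\tilde V'(\gamma' = 1, g)\Omega'\| \le \|g\|_2 = 1 \tag{68}$$

and finally

$$|(\Psi, \tilde V_\gamma(f)\Phi)| \le \|\Psi\|\,\|\Phi\|\,\|f\|_2 \tag{69}$$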

for arbitrary $\Psi$ and $\Phi$ in $\tilde{\mathcal{F}}_\gamma$. This proves boundedness of $\tilde V_\gamma(f)$ with norm smaller than or equal to $\|f\|_2$, i.e. the same bound as for $\gamma = 1$. Since $\gamma, \gamma' \in \mathbb{R}$, $\gamma^2 + \gamma'^2 = 1$, we have obtained boundedness of vertex operators for $\gamma \in [-1, 1]$. Let us summarize the result of this and the last section in the following

Theorem 4. The chiral vertex operators $\tilde V_\gamma(f)$, $f \in L^2(S^1)$, smeared on the unit circle are bounded operators in the Hilbert space $\tilde{\mathcal{F}}_\gamma^\oplus$ for all $\gamma \in [-1, 1]$.

Note that as discussed in Sect. 3 the bound $\|f\|_2$ on $\tilde{\mathcal{F}}_\gamma^\oplus$ can be improved to $\|f^{(-)}\|_2$. However, on the full Fock space cyclically generated by $V_{\pm\gamma}(z)$ the bound is $\|f\|_2$. For this latter case we cannot replace CAR [25] by elementary Gram inequalities (see the discussion at the end of Sect. 4). For similar results concerning $V_\gamma(f)$ see the next section where we comment on the results obtained in this paper.

6. Remarks and Discussion

We have proved that, contrary to the chiral unsmeared vertices, the smeared ones are well behaved, being bounded operators in the Hilbert space $\tilde{\mathcal{F}}^\oplus$ constructed explicitly in Sect. 4 by starting from the bosonic Fock space in which the vertices naturally live. The smearing in the complex variable $z$ is only one-dimensional on the unit circle $|z| = 1$ and this is typical for free fields in quantum field theory. The result holds for charges $\gamma \in [-1, 1]$. Regarding the Hilbert space, there are further possibilities to realize it. We give one example (other constructions can be found in [17]). Consider the vertex operator $V(z)$ as in (3) with $\gamma = 1$. Let $W$ be a finite linear combination of vectors $V_n(f_j)$, $f_j = (f_{j1}, \ldots, f_{jn})$:

$$W = \sum_n V_n = \sum_{n,j}\alpha_j V_n(f_j), \tag{70}$$


where in $V_n$ we collect all terms in $W$ with the same number $n$ of vertex factors $V$. In the bosonic Fock space $\mathcal{F}$ we change the scalar product $(\cdot, \cdot)$ to a new one which we define on vectors $W$ by linearity and the relations

$$s(V_n, V_m) = 0 \ \text{ if } n \ne m, \qquad s(V_n, V_n) = (V_n, V_n). \tag{71}$$

Let us remark that the set of vectors $W$ (70) is dense in $\mathcal{F}$ (see Appendix 2). We complete the set of vectors $W$ in the new scalar product and get a Hilbert space $\hat{\mathcal{F}}$. In both $\mathcal{F}$, $\hat{\mathcal{F}}$ the set of vectors $W$ is contained densely with respect to the different scalar products involved. In $\hat{\mathcal{F}}$ we introduce a densely defined bilinear form $\hat V(f_0)$, $f_0 \in L^2(S^1)$, by

$$\hat V(f_0)(V_n, V_m) = 0 \ \text{ for } m \ne n - 1, \qquad \hat V(f_0)(V_n, V_{n-1}) = (V_n, V(f_0)V_{n-1}). \tag{72}$$

Again the generalized Gram inequality followed by some elementary reasoning shows that $\hat V(f_0)$ is a densely defined bounded bilinear form in $\hat{\mathcal{F}}$ w.r.t. the new scalar product $s(\cdot, \cdot)$. In conclusion it defines a bounded operator $\hat V(f)$ in $\hat{\mathcal{F}}$.

In our approach the spaces $\tilde{\mathcal{F}}$ and $\hat{\mathcal{F}}$, although showing fermionic flavors, were directly coined in the bosonic Fock space in which vertices are formally defined. Neutrality was essentially used. Up to Sect. 5 braiding was not introduced. In Sect. 5 we used it although we believe that this would not be necessary if we could replace the abstract reasoning on fermions by a more direct computation in bosonic Fock space. For complex charges the smeared vertices $V_\gamma(f)$ and $\tilde V_\gamma(f)$ seem to be badly behaved. We do not know what happens with the smeared vertices without neutrality. Usually neutrality is motivated by the zero mass limit in two-dimensional field theory [18, 19].

Our interest in obtaining smeared vertices as bounded operators in a Hilbert space as close as possible to the bosonic Fock space was stimulated at the beginning by the idea of applying the causal Epstein-Glaser method [5] to the two-dimensional sine-Gordon model as well as to conformal perturbation theory. These two aspects of two-dimensional physics are strongly related [20]. Operatorial boundedness in both chiral and antichiral variables in a Minkowski approach will enable us to prove not only convergence of perturbation theory, which is an ancient result, but also to save a factorial factor in the estimate, leading to entire type in the result and convergence for all coupling constants. A similar (weaker) result for the case of two-dimensional conformal quantum field theory was already mentioned in [4]. We will come back to these questions elsewhere. Last but not least, Sects. 2 and 3 can be looked at as a first step towards a Hilbert space interpretation of vertex operator algebras (see for instance [28]).
Indeed, through the smearing out operation the calculus of formal distributions (formal Dirac function) can be taken over to the Fourier space. Using truncation of formal expansions and going to the unit circle, the convergence is guaranteed independent of the side from which the unit circle is approached. Passing from inside to outside or vice versa is an involution on Fourier coefficients. According to the Hilbert space framework of this paper the calculus of formal distributions is then a formal calculus on kernels of Hilbert space operators.

Appendix 1: Gram Determinant Inequalities

Here we state some generalized Gram inequalities for the generalized Gram determinant. By the generalized Gram determinant we understand

$$G(x, y) \equiv G_p(x_1, x_2, \ldots, x_p; y_1, y_2, \ldots, y_p) = \begin{vmatrix} (x_1, y_1) & \ldots & (x_1, y_p) \\ \vdots & & \vdots \\ (x_p, y_1) & \ldots & (x_p, y_p) \end{vmatrix}, \tag{A.1}$$

where $x = (x_1, \ldots, x_p)$, $y = (y_1, \ldots, y_p)$ are in the same vector space with scalar product $(\cdot, \cdot)$. The usual Gram determinant is obtained by taking $x_i = y_i$, $i = 1, \ldots, p$. Here we consider the case where $x_i, y_j$ are $L^2$ functions. Let $F$ and $G$ be matrix functions given by

$$F = \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_r \end{pmatrix} \equiv \begin{pmatrix} f_{11} & \ldots & f_{1n} \\ f_{21} & \ldots & f_{2n} \\ \vdots & & \vdots \\ f_{r1} & \ldots & f_{rn} \end{pmatrix}, \tag{A.2}$$

$$G = \begin{pmatrix} g_1 \\ g_2 \\ \vdots \\ g_s \end{pmatrix} \equiv \begin{pmatrix} g_{11} & \ldots & g_{1m} \\ g_{21} & \ldots & g_{2m} \\ \vdots & & \vdots \\ g_{s1} & \ldots & g_{sm} \end{pmatrix}, \tag{A.3}$$

and such that the dimension of the linear span generated by the functions $f_{ij}$; $i = 1, \ldots, r$, $j = 1, \ldots, n$ is not higher than $n$. We call the attention of the reader to the fact that here $f_i$, $g_k$ represent vector functions. The generalized Gram inequality we use in this paper is

$$\sum_{\substack{1 \le j, j' \le r \\ 1 \le k, k' \le s}} \alpha_j^*\beta_k^*\alpha_{j'}\beta_{k'}\,G_{n+m}(f_j, g_k; f_{j'}, g_{k'}) \le \sum_{1 \le j, j' \le r}\alpha_j^*\alpha_{j'}\,G_n(f_j; f_{j'}) \sum_{1 \le k, k' \le s}\beta_k^*\beta_{k'}\,G_m(g_k; g_{k'}), \tag{A.4}$$

where $\alpha_j$, $j = 1, \ldots, r$; $\beta_k$, $k = 1, \ldots, s$ are complex constants. The inequality used in (47) is the special case $n = r = s = 1$, $\alpha_1 = 1$ and the inequality (54) is the case $n = r = 1$, $\alpha_1 = 1$ ($\alpha$'s in (54) are $\beta$'s in (A.4)). All inequalities for the usual Gram determinants [21] are special cases of (A.4) with identical $f$ and $g$ functions. The above restriction on the dimension of the linear span can be relaxed. Concerning the proof one could use the Landsberg formula [22]

$$\int_{\Xi^p} \begin{vmatrix} x_1(s_1) & \ldots & x_1(s_p) \\ \vdots & & \vdots \\ x_p(s_1) & \ldots & x_p(s_p) \end{vmatrix}^{*} \begin{vmatrix} y_1(s_1) & \ldots & y_1(s_p) \\ \vdots & & \vdots \\ y_p(s_1) & \ldots & y_p(s_p) \end{vmatrix}\,d^p\nu(s) = p!\begin{vmatrix} (x_1, y_1) & \ldots & (x_1, y_p) \\ \vdots & & \vdots \\ (x_p, y_1) & \ldots & (x_p, y_p) \end{vmatrix} = p!\,G(x, y), \tag{A.5}$$

where $x_i, y_j \in L^2(\Xi, \nu)$, $\Xi^p = \Xi \times \ldots \times \Xi$ and $d^p\nu$ is the product measure, together with the Laplace expansion formula for determinants. Certainly another proof can be given by fermionic Fock space methods. Consider (smeared) fermionic annihilation and creation operators $a(f)$, $a^+(f)$:

$$\{a(f), a(g)\} = 0 = \{a^+(f), a^+(g)\}, \qquad \{a(f), a^+(g)\} = (f, g), \tag{A.6}$$


where $\{\cdot, \cdot\}$ is the anticommutator. The action in the fermionic Fock space is as usual given by

$$(a(f)\psi)^{(n)}(x_1, \ldots, x_n) = \sqrt{n+1}\int dx\,f(x)^*\,\psi^{(n+1)}(x, x_1, \ldots, x_n), \tag{A.7}$$

$$(a^+(f)\psi)^{(n)}(x_1, \ldots, x_n) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(-1)^{i-1}f(x_i)\,\psi^{(n-1)}(x_1, \ldots, \hat{x}_i, \ldots, x_n), \tag{A.8}$$

where $\hat{x}_i$ indicates that the $i$th variable is to be omitted and $\psi^{(n)}(x_1, \ldots, x_n)$ is totally antisymmetric. Let

$$\Psi = \sum_{j=1}^{r}\alpha_j\,a^+(f_{j1}) \ldots a^+(f_{jn}), \tag{A.9}$$

$$\Phi = \sum_{k=1}^{s}\beta_k\,a^+(g_{k1}) \ldots a^+(g_{km}). \tag{A.10}$$

Then we have on the vacuum

$$\|\Psi\|^2 = \sum_{j,j'}\alpha_j^*\alpha_{j'}\,G_n(f_j; f_{j'}), \tag{A.11}$$

$$\|\Phi\|^2 = \sum_{k,k'}\beta_k^*\beta_{k'}\,G_m(g_k; g_{k'}), \tag{A.12}$$

and

$$\|\Psi\Phi\|^2 = \sum_{j,j',k,k'}\alpha_j^*\beta_k^*\alpha_{j'}\beta_{k'}\,G_{n+m}(f_j, g_k; f_{j'}, g_{k'}). \tag{A.13}$$

Hence, the generalized Gram inequality (A.4) is proved if we can show that the operator norm $\|\Psi\|^2$ is equal to

$$\|\Psi\|^2 = \sum_{j,j'}\alpha_j^*\alpha_{j'}\,G_n(f_j; f_{j'}). \tag{A.14}$$

This is a consequence of Wick's theorem about normal ordering of operator products which we write down with the following simplified notation:

$$\begin{aligned}
a(f_n) \ldots a(f_1)a^+(g_1) \ldots a^+(g_n) ={}& {:}\,a(f_n) \ldots a^+(g_n)\,{:} + {:}\,\overbrace{a(f_n) \ldots a^+(g_1)} \ldots a^+(g_n)\,{:} \\
& + \ldots + G(f_1, \ldots, f_n; g_1, \ldots, g_n).
\end{aligned} \tag{A.15}$$

The r.h.s. is obtained from the l.h.s. by normal ordering denoted by double dots, that means by anticommuting all emission operators $a^+$ to the left. The "contractions" (indicated by the bracket upstairs) represent the anticommutators $(f_n, g_j)$ (A.6) which appear in this process. The last term has all operators contracted in pairs in all possible ways and this gives just the Gram determinant.


Now let us consider the square

$$\begin{aligned}
(\Psi\Psi^+)^2 &= \Big[\sum_{j,j'}\alpha_j\alpha_{j'}^*\,a^+(f_{j1}) \ldots a^+(f_{jn})\,a(f_{j'n}) \ldots a(f_{j'1})\Big]^2 \\
&= \sum_{jj'll'}\alpha_j\alpha_{j'}^*\alpha_l\alpha_{l'}^*\,a^+(f_{j1}) \ldots a^+(f_{jn}) \\
&\qquad \times a(f_{j'n}) \ldots a(f_{j'1})\,a^+(f_{l1}) \ldots a^+(f_{ln})\,a(f_{l'n}) \ldots a(f_{l'1}).
\end{aligned} \tag{A.16}$$

In the last line we substitute Wick's theorem (A.15). Then only the last term with Gram's determinant contributes for the following reason. All other terms contain more than $n$ emission operators, but there are only $n$ linearly independent $f$'s. Consequently, expanding the $f$'s with respect to an $n$-dimensional basis one gets at least two equal Fermi operators $a^+(f)a^+(f) = 0$. Therefore we obtain

$$\begin{aligned}
(\Psi\Psi^+)^2 &= \sum_{j',l}\alpha_{j'}^*\alpha_l\,G(f_{j'1}, \ldots, f_{j'n}; f_{l1}, \ldots, f_{ln}) \times \sum_{j,l'}\alpha_j\alpha_{l'}^*\,a^+(f_{j1}) \ldots a^+(f_{jn})\,a(f_{l'n}) \ldots a(f_{l'1}) \\
&= \Big(\sum\alpha^*\alpha G\Big)(\Psi\Psi^+)
\end{aligned}$$

with obvious short-hand notation. Since $\Psi\Psi^+$ is selfadjoint this implies

$$\|\Psi\|^2 = \|\Psi^+\|^2 = \|\Psi\Psi^+\| = \Big|\sum\alpha^*\alpha G\Big|,$$

which is the desired result (A.14).

Appendix 2

Strictly speaking the result of this appendix is not necessary for the main body of the paper. This is because we realized neutrality by passing from the genuine bosonic Fock space $\mathcal{F}$ to Hilbert spaces $\tilde{\mathcal{F}}$ (see Sects. 3–5) where linear combinations of vectors $\tilde V_n$ are dense by construction. The same applies to the space $\hat{\mathcal{F}}$ in Sect. 6. Nevertheless the results of this appendix help to get a better understanding of the structure of the Hilbert spaces in which our vertices live. They could gain in interest if we succeed in abolishing neutrality. Furthermore, the problem touched on here was mentioned as open in a related setting in [24] (see also [27]). It is our aim to show that the set of vectors

$$V_n(f) = \prod_{i=1}^{n}V(f_i)\Phi_0, \tag{A.20}$$

where $f = (f_1, \ldots, f_n)$, $f_i \in L^2(S^1)$, is total in the Fock space $\mathcal{F}$. In other words linear combinations

$$\sum_{n,j}\alpha_j V_n(f_j), \qquad \alpha_j \in \mathbb{C},\ n = 0, 1, 2, \ldots, \tag{A.21}$$

where now $f_j = (f_{1j}, \ldots, f_{nj})$ are vector functions, are dense in $\mathcal{F}$. This type of property is common in quantum field theory and we could invoke this. The point is that vertices


$V(f)$ are not exactly quantum fields because of the massless limit involved. A direct clean proof of this property is therefore desirable. This point was already mentioned in [24]. It turns out that the vectors (A.20) are coherent states of a special type. One expects coherent states to be dense (even overcomplete [14]). We use the relation for the unsmeared vertices

$$\prod_{i=1}^{n}V(z_i) = \prod_{1 \le i < j \le n}\Big(1 - \frac{z_j}{z_i}\Big)\exp\Big(\sum_{i=1}^{\infty}\frac{1}{i}\sum_{j=1}^{n}z_j^i\,a_{-i}\Big) \times \exp\Big(-\sum_{i=1}^{\infty}\frac{1}{i}\sum_{j=1}^{n}z_j^{-i}\,a_i\Big),$$

where $1 > |z_1| > |z_2| > \ldots > |z_n|$, and apply it to the vacuum $\Phi_0$:

$$\prod_{i=1}^{n}V(z_i)\Phi_0 = \prod_{1 \le i < j \le n}\Big(1 - \frac{z_j}{z_i}\Big)\exp\Big(\sum_{i=1}^{\infty}\frac{1}{i}Z_i^{(n)}a_{-i}\Big)\Phi_0 \tag{A.22}$$

with $Z_i^{(n)} = \sum_{j=1}^{n}z_j^i$. Now

$$\exp\Big(\sum_{i=1}^{\infty}\frac{1}{i}Z_i^{(n)}a_{-i}\Big)\Phi_0 \tag{A.23}$$

are typical coherent states [14]. The idea is to integrate (A.22) against powers of $z_i^{-1}$ and use

$$\frac{1}{2\pi i}\int_{S^1}z^n\,dz = \delta_{n,-1}$$

in order to produce elements $\Phi_\eta$ of the Fock basis out of (A.22). For instance, if $n = 1$, $Z_i = z_1^i$, we get

$$V(z_1^{-2})\Phi_0 = \frac{1}{2\pi}\int V(z_1)z_1^{-2}\,dz_1\,\Phi_0 \sim \int\Big(1 + \frac{a_{-1}}{1}z_1 + \frac{a_{-2}}{2}z_1^2 + \ldots\Big)z_1^{-2}\,dz_1\,\Phi_0 = a_{-1}\Phi_0 \sim \Phi_\eta, \qquad \eta = (1, 0, 0, \ldots).$$

How can one obtain states like $a_{-2}\Phi_0$ and $a_{-1}^2\Phi_0$? Let us expand

$$V(z_1)\Phi_0 = \exp\Big(\sum_{n=1}^{\infty}\frac{z_1^n}{n}a_{-n}\Big)\Phi_0 = \Big(1 + \sum_n\frac{z_1^n}{n}a_{-n} + \frac{1}{2!}\Big(\sum_n\frac{z_1^n}{n}a_{-n}\Big)^2 + \ldots\Big)\Phi_0,$$

and collect the terms with $z_1^n$:

$$\frac{a_{-n}}{n} + \frac{1}{2!}\sum_{n_1 + n_2 = n}\frac{a_{-n_1}a_{-n_2}}{n_1 n_2} + \frac{1}{3!}\sum_{\sum n_j = n}\frac{a_{-n_1}a_{-n_2}a_{-n_3}}{n_1 n_2 n_3} + \ldots. \tag{A.24}$$

The corresponding Fock state is obtained by integration with $z_1^{-n-1}$. We now consider two factors and use (A.22),

$$\begin{aligned}
V(z_1)V(z_2)\Phi_0 &= g(z_1, z_2)\exp\Big(\sum_{n=1}^{\infty}\frac{z_1^n + z_2^n}{n}a_{-n}\Big)\Phi_0 \\
&= g(z_1, z_2)\exp\Big(\sum_{n=1}^{\infty}\frac{z_1^n}{n}a_{-n}\Big)\exp\Big(\sum_{n=1}^{\infty}\frac{z_2^n}{n}a_{-n}\Big)\Phi_0,
\end{aligned} \tag{A.25}$$

where $g(z_1, z_2)$ are the prefactors in (A.22). The latter are cancelled by working with the following smearing function $g^{-1}z_1^{-n-1}z_2^{-m-1}$. Since we can choose $n$ and $m$ independently, this gives us arbitrary products of two factors of the form (A.24). The generalization to more than two factors is straightforward. It is not hard to see that in this way we get a total set of Fock vectors. Indeed, starting with $n = 1$ in (A.24) gives $a_{-1}\Phi_0$, and the $\eta_1$-fold product with itself (denoted by $1^{\eta_1}$) gives $a_{-1}^{\eta_1}\Phi_0$. Next we take $n = 2$ in (A.24):

$$\frac{a_{-2}}{2} + \frac{1}{2!}a_{-1}^2.$$

Since $a_{-1}^2\Phi_0$ is already constructed, we get $a_{-2}\Phi_0$. Forming products $1^{\eta_1} \times 2$, we find $a_{-1}^{\eta_1}a_{-2}\Phi_0$. Now we are ready to consider the product ($2^2$) of two $n = 2$:

$$\frac{1}{4}\big(a_{-2}^2 + 2a_{-2}a_{-1}^2 + a_{-1}^4\big).$$

This gives us $a_{-2}^2\Phi_0$ because the other two vectors are already known. The next steps in the process are:

$$a_{-1}^{\eta_1}a_{-2}^2\Phi_0, \quad a_{-2}^3\Phi_0, \quad a_{-1}^{\eta_1}a_{-2}^3\Phi_0, \quad \ldots, \quad a_{-2}^{\eta_2}a_{-1}^{\eta_1}\Phi_0.$$
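The terms collected in (A.24) can be generated mechanically. The sketch below is an illustration in Python under stated assumptions: the creation operators $a_{-k}$ are treated as commuting symbols (which is legitimate here since they commute among themselves), a monomial is stored as a sorted tuple of mode indices, and the function name is hypothetical. It uses the recurrence $n\,h_n = \sum_{k=1}^{n}a_{-k}h_{n-k}$, obtained by differentiating the exponential with respect to $z_1$, where $h_n$ denotes the coefficient of $z_1^n$.

```python
from fractions import Fraction


def coherent_state_coefficients(max_order):
    """Return [h_0, ..., h_max_order], where h_n is the coefficient of z**n
    in exp(sum_k z**k a_{-k} / k).

    Each h_n is a dict mapping a sorted tuple of mode indices, for example
    (1, 1, 2) for a_{-1}^2 a_{-2}, to its rational coefficient.
    """
    coefficients = [{(): Fraction(1)}]
    for n in range(1, max_order + 1):
        h_n = {}
        for k in range(1, n + 1):
            # Multiply h_{n-k} by a_{-k} and divide by n (the recurrence).
            for monomial, value in coefficients[n - k].items():
                new_monomial = tuple(sorted(monomial + (k,)))
                h_n[new_monomial] = h_n.get(new_monomial, Fraction(0)) + value / n
        coefficients.append(h_n)
    return coefficients


h = coherent_state_coefficients(3)
print(h[2])  # terms a_{-1}^2 / 2 and a_{-2} / 2, as in the n = 2 case above
print(h[3])
```

For $n = 2$ this reproduces $a_{-2}/2 + a_{-1}^2/2!$, and for $n = 3$ it gives $a_{-3}/3 + a_{-1}a_{-2}/2 + a_{-1}^3/3!$, in agreement with (A.24).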

We continue with $n = 3$ and so on. We thus arrive at the usual basis in Fock space. The same proof goes through for $\gamma \ne 1$. The test functions we have used in the foregoing construction are of a more general kind than the simple products of $f_i(z_i)$ in (A.20). But this is not an essential point because such test functions can be approximated by linear combinations of simple products with the help of the Laurent expansion. In particular the above proof shows that the set of smeared coherent states

$$\prod_{i=1}^{n}V_-(f_i)\Phi_0, \qquad n = 0, 1, 2, \ldots, \tag{A.26}$$

where $f_i \in L^2(S^1)$, is total in $\mathcal{F}$. Concerning the unsmeared case, the situation is not completely clear to us. We do not know whether the set (A.26) where the smearing is left out and the variables satisfy $1 > |z_1| > |z_2| > \ldots > |z_n|$ is total in $\mathcal{F}$. On the other hand a slightly larger set based on

$$\exp\Big(\sum_n\frac{t_n}{n}a_{-n}\Big)\Phi_0,$$

where now the $t_n$ are left arbitrary in $\mathbb{C}$, is total [11]. These are the genuine coherent states in infinitely many variables.

Acknowledgement. We thank W. Boenkost for assistance at an incipient stage of this work and F. Kleespieß for discussions. Special thanks go to K.-H. Rehren for sending published and unpublished work and helping to improve the paper. One of us (F.C.) thanks Giovanni Felder and Forschungsinstitut für Mathematik, ETH Zürich for hospitality and Jürg Fröhlich for encouragement.


References

1. Wassermann, A.: Operator algebras and conformal field theory III. Preprint, Cambridge (UK), 1995
2. Loke, T.: Operator algebras and conformal field theory of the discrete series representations of Diff(S^1). PhD thesis, Cambridge (UK), 1994
3. Rehren, K.-H.: Lett. Math. Phys. 40, 299 (1997)
4. Constantinescu, F., Flume, R.: Phys. Lett. B 326, 101 (1994)
5. Epstein, H., Glaser, V.: Ann. Inst. Poincare A 19, 211 (1973)
6. Scharf, G.: Finite quantum electrodynamics: The causal approach. Berlin–Heidelberg–New York: Springer-Verlag, 2nd edition, 1995
7. Constantinescu, F., Flume, R.: J. Phys. A23, 2971 (1990)
8. Constantinescu, F., Flume, R.: Phys. Lett. B257, 63 (1991)
9. Konik, R.M., LeClair, A.: Nucl. Phys. B479, 619 (1996)
10. Boenkost, W., Constantinescu, F.: J. Math. Phys. 34, 3607 (1993)
11. Boenkost, W.: Vertex-Operatoren, Darstellungen der Virasoro-Algebra und konforme Quantenfeldtheorie. Dissertation, Frankfurt am Main, 1994
12. Constantinescu, F., de Groote, H.: Algebraische und geometrische Methoden in der Physik. Stuttgart: Teubner, 1994
13. Chihara, T.: An introduction to orthogonal polynomials. New York, London, Paris: Gordon and Breach, 1978
14. Klauder, J., Sudarshan, E.: Fundamentals of quantum optics. New York: Benjamin, 1968
15. Rudin, W.: Real and complex analysis. New York: McGraw-Hill, 1966
16. Mandelstam, S.: Phys. Reports C13, 253 (1974)
17. Pressley, A., Segal, G.: Loop groups. Oxford: Clarendon Press, 1986
18. Coleman, S.: Phys. Rev. D11, 2088 (1975)
19. Itzykson, C., Drouffe, J.-M.: Statistical field theory. Vol. 2, Cambridge: Cambridge Univ. Press, 1989
20. Zamolodchikov, Al.B.: Int. J. Modern Physics 8, 1125 (1995)
21. Gantmacher, F.R.: Matrizentheorie. Berlin–Heidelberg–New York: Springer-Verlag, 1986
22. Landsberg, G.: Math. Ann. 69, 227 (1910)
23. Strocchi, F.: Selected topics on the general properties of quantum field theory. Singapore: World Scientific, 1986
24. Carey, A.L., Ruijsenaars, S.N.M., Wright, J.D.: Commun. Math. Phys. 99, 347 (1985)
25. Araki, H., Wyss, W.: Helv. Phys. Acta 37, 136 (1964)
26. Wightman, A.S.: Introduction to some aspects of the relativistic dynamics of quantum fields. In: Cargèse Lectures in Theoretical Physics, 1964, New York: Gordon and Breach, 1967
27. Carey, A.L., Ruijsenaars, S.N.M., Wright, J.D.: Acta Appl. Math. 10, 1 (1987)
28. Kac, V.: Vertex algebras for beginners. University Lecture Series, vol. 10, Providence, RI: Am. Math. Soc., 1997

Communicated by G. Felder

Commun. Math. Phys. 200, 297 – 324 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Integrable Structure of Conformal Field Theory III. The Yang–Baxter Relation

Vladimir V. Bazhanov^{1,2}, Sergei L. Lukyanov^{3,4}, Alexander B. Zamolodchikov^{3}

1 Department of Theoretical Physics and Center of Mathematics and its Applications, IAS, Australian National University, Canberra, ACT 0200, Australia
2 Saint Petersburg Branch of Steklov Mathematical Institute, Fontanka 27, Saint Petersburg, 191011, Russia
3 Department of Physics and Astronomy, Rutgers University, Piscataway, NJ 08855-049, USA
4 L.D. Landau Institute for Theoretical Physics, Chernogolovka, 142432, Russia

Received: 20 May 1998 / Accepted: 7 July 1998

Abstract: In this paper we fill some gaps in the arguments of our previous papers [1,2]. In particular, we give a proof that the L operators of Conformal Field Theory indeed satisfy the defining relations of the Yang–Baxter algebra. Among other results we present a derivation of the functional relations satisfied by T and Q operators and a proof of the basic analyticity assumptions for these operators used in [1,2].

1. Introduction

This paper is a sequel to our works [1,2] where we have introduced the families of operators $\mathbf T(\lambda)$ and $\mathbf Q(\lambda)$ which act in a highest weight Virasoro module and satisfy the commutativity conditions

$$[\mathbf T(\lambda), \mathbf T(\lambda')] = [\mathbf T(\lambda), \mathbf Q(\lambda')] = [\mathbf Q(\lambda), \mathbf Q(\lambda')] = 0. \tag{1.1}$$

These operators are CFT analogs of Baxter’s commuting transfer-matrices of integrable lattice theory [3,4]. In the lattice theory the transfer-matrices are typically constructed as follows. One first finds an R-matrix which solves the Yang–Baxter equation,

$$\mathbf R_{VV'}(\lambda)\, \mathbf R_{VV''}(\lambda\lambda')\, \mathbf R_{V'V''}(\lambda') = \mathbf R_{V'V''}(\lambda')\, \mathbf R_{VV''}(\lambda\lambda')\, \mathbf R_{VV'}(\lambda). \tag{1.2}$$

Here $\mathbf R_{VV'}$, $\mathbf R_{VV''}$, $\mathbf R_{V'V''}$ act in the tensor product of the identical vector spaces $V$, $V'$ and $V''$. Then one introduces the L-operator,

$$\mathbf L_V(\lambda) = \mathbf R_{VV_1}(\lambda)\, \mathbf R_{VV_2}(\lambda) \cdots \mathbf R_{VV_N}(\lambda), \tag{1.3}$$

which is considered as a matrix in $V$ whose elements are operators acting in the tensor product

$$\mathcal H_N = \otimes_{i=1}^{N} V_i, \tag{1.4}$$

where $N$ is the size of the lattice. The space (1.4) is interpreted as the space of states of the lattice theory. The operator (1.3) satisfies the defining relations of the Yang–Baxter algebra

$$\mathbf R_{VV'}(\lambda/\lambda')\, \mathbf L_V(\lambda)\, \mathbf L_{V'}(\lambda') = \mathbf L_{V'}(\lambda')\, \mathbf L_V(\lambda)\, \mathbf R_{VV'}(\lambda/\lambda'). \tag{1.5}$$

It realizes, thereby, a representation of this algebra in the space of states of the lattice theory. The “transfer-matrix”

$$\mathbf T_V(\lambda) = \mathrm{Tr}_V\, \mathbf L_V(\lambda) : \ \mathcal H_N \to \mathcal H_N \tag{1.6}$$

satisfies the commutativity condition (1.1) as a simple consequence of the defining relations (1.5).

In many cases the integrable lattice theory defined through the transfer-matrix (1.6) can be used as the starting point to construct an integrable quantum field theory (QFT). If the lattice system has a critical point one can define QFT by taking the appropriate scaling limit (which in particular involves the limit $N \to \infty$). Then the space of states of QFT appears as a certain subspace in the limiting space (1.4), $\mathcal H_{QFT} \subset \mathcal H_{N\to\infty}$. Although many integrable QFT can be constructed and studied this way (and this is essentially the way integrable QFT are obtained in the Quantum Inverse Scattering Method [5,6]), the alternative idea of constructing representations of the Yang–Baxter algebra directly in the space of states $\mathcal H_{QFT}$ of continuous QFT seems to be more attractive. This idea was the motivation of our constructions in [1,2].

The natural starting point for implementing this idea is the Conformal Field Theory (CFT) because the general structure of its space of states $\mathcal H_{CFT}$ is relatively well understood [7]. The space $\mathcal H_{CFT}$ can be decomposed as

$$\mathcal H_{CFT} = \oplus_{\Delta,\bar\Delta}\, \mathcal V_\Delta \otimes \bar{\mathcal V}_{\bar\Delta}, \tag{1.7}$$

where $\mathcal V_\Delta$ and $\bar{\mathcal V}_{\bar\Delta}$ are irreducible highest weight representations of “left” and “right” Virasoro algebras with the highest weights $\Delta$ and $\bar\Delta$ respectively. The sum (1.7) may be finite (as in the “minimal models”), infinite or even continuous. In any case the space (1.7) can be embedded into a direct product $\mathcal H_{chiral} \otimes \bar{\mathcal H}_{chiral}$ of left and right “chiral” subspaces,

$$\mathcal H_{chiral} = \oplus_{\Delta}\, \mathcal V_\Delta. \tag{1.8}$$

In [1,2] we introduced the operators $\mathbf L(\lambda)$ which realize particular representations of the Yang–Baxter algebra (1.5) in the space (1.8). The commuting operators (1.1) were constructed in terms of these operators $\mathbf L$. However, the proof that these operators actually satisfy the defining relations (1.5) of the Yang–Baxter algebra was not presented. The main purpose of this paper is to fill this gap.

Here we recall some notations used in [1,2]. Let $\varphi(u)$ be a free chiral Bose field, i.e. the operator-valued function

$$\varphi(u) = iQ + iPu + \sum_{n\neq 0} \frac{a_{-n}}{n}\, e^{inu}, \tag{1.9}$$

where $P$, $Q$ and $a_n$ ($n = \pm1, \pm2, \ldots$) are operators which satisfy the commutation relations of the Heisenberg algebra

$$[Q, P] = \frac{i}{2}\,\beta^2; \qquad [a_n, a_m] = \frac{n}{2}\,\beta^2\, \delta_{n+m,0}, \tag{1.10}$$

with real $\beta$. The variable $u$ is interpreted as a complex coordinate on the 2D cylinder of circumference $2\pi$. The field $\varphi(u)$ is a quasi-periodic function of $u$, i.e.

$$\varphi(u + 2\pi) = \varphi(u) + 2\pi i P. \tag{1.11}$$

Let $\mathcal F_p$ be the Fock space, i.e. the space generated by a free action of the operators $a_n$ with $n < 0$ on the vacuum vector $|p\rangle$ which satisfies

$$a_n\, |p\rangle = 0 \ \ \text{for } n > 0; \qquad P\, |p\rangle = p\, |p\rangle. \tag{1.12}$$

The space $\mathcal F_p$ supports a highest weight representation of the Virasoro algebra generated by the operators

$$L_n = \int_0^{2\pi} \frac{du}{2\pi}\, \Big[\, T(u) + \frac{c}{24} \,\Big]\, e^{inu} \tag{1.13}$$

with the Virasoro central charge

$$c = 13 - 6\,\big(\beta^2 + \beta^{-2}\big) \tag{1.14}$$

and the highest weight

$$\Delta = \Delta(p) \equiv \Big(\frac{p}{\beta}\Big)^2 + \frac{c-1}{24}. \tag{1.15}$$

Here $T(u)$ denotes the composite field

$$-\beta^2\, T(u) = {:}\varphi'(u)^2{:} + (1 - \beta^2)\, \varphi''(u) + \frac{\beta^2}{24} \tag{1.16}$$

which is a periodic function, $T(u + 2\pi) = T(u)$. The symbol $:\ :$ denotes the standard normal ordering with respect to the Fock vacuum (1.12). It is well known that if the parameters $\beta$ and $p$ take generic values this representation of the Virasoro algebra is irreducible. For particular values of these parameters, when null-vectors appear in $\mathcal F_p$, the irreducible representation $\mathcal V_{\Delta(p)}$ is obtained from $\mathcal F_p$ by factoring out all the invariant subspaces. In what follows we will always assume that all the invariant subspaces (if any) are factored out, and identify the spaces $\mathcal F_p$ and $\mathcal V_{\Delta(p)}$. The space

$$\hat{\mathcal F}_p = \oplus_{k=-\infty}^{\infty}\, \mathcal F_{p + k\beta^2} \tag{1.17}$$

admits the action of the exponential fields

$$V_\pm(u) = {:}e^{\pm 2\varphi(u)}{:} \equiv \exp\Big( \pm 2 \sum_{n=1}^{\infty} \frac{a_{-n}}{n}\, e^{inu} \Big)\, \exp\big( \pm 2i\,(Q + Pu) \big)\, \exp\Big( \mp 2 \sum_{n=1}^{\infty} \frac{a_n}{n}\, e^{-inu} \Big). \tag{1.18}$$

The following relations are easily verified from (1.9)–(1.11):

$$V_{\sigma_1}(u_1)\, V_{\sigma_2}(u_2) = q^{2\sigma_1\sigma_2}\, V_{\sigma_2}(u_2)\, V_{\sigma_1}(u_1), \quad u_1 > u_2; \qquad P\, V_\pm(u) = V_\pm(u)\, (P \pm \beta^2), \tag{1.19}$$

where $\sigma_1, \sigma_2 = \pm 1$. Moreover,

$$V_\pm(u + 2\pi) = q^{-2}\, e^{\pm 4\pi i P}\, V_\pm(u). \tag{1.20}$$

Any CFT possesses infinitely many local Integrals of Motion (IM) $\mathbf I_{2k-1}$ [8,9],

$$\mathbf I_{2k-1} = \int_0^{2\pi} \frac{du}{2\pi}\, T_{2k}(u), \qquad k = 1, 2, \ldots, \tag{1.21}$$

where $T_{2k}(u)$ are certain local fields, polynomials in $T(u)$ and its derivatives. For example

$$T_2(u) = T(u), \quad T_4(u) = {:}T^2(u){:}, \quad \ldots, \quad T_{2k}(u) = {:}T^k(u){:} + \text{terms with the derivatives}. \tag{1.22}$$

Here $:\ :$ denote appropriately regularized operator products, see [1] for details. There exist infinitely many densities (1.22) (one for each integer $k$ [10,11]) such that all IM (1.21) commute,

$$[\, \mathbf I_{2k-1}, \mathbf I_{2l-1} \,] = 0. \tag{1.23}$$

Consider the following operator matrix [12,1]$^1$:

$$\mathbf L_j(\lambda) = \pi_j\big[\, \mathbf L(\lambda) \,\big], \tag{1.24}$$

$$\mathbf L(\lambda) = e^{i\pi P H}\, \mathcal P \exp\Big( \lambda \int_0^{2\pi} du\, \big( V_-(u)\, q^{\frac{H}{2}} E + V_+(u)\, q^{-\frac{H}{2}} F \big) \Big), \tag{1.25}$$

where the exponential fields $V_\pm(u)$ are defined in (1.18) and $E$, $F$ and $H$ are the generating elements of the quantum universal enveloping algebra $U_q(sl(2))$ [15],

$$[H, E] = 2E, \qquad [H, F] = -2F, \qquad [E, F] = \frac{q^{H} - q^{-H}}{q - q^{-1}}, \tag{1.26}$$

with

$$q = e^{i\pi\beta^2}. \tag{1.27}$$

The symbol $\pi_j$ in (1.24) stands for the $(2j+1)$-dimensional representation of $U_q(sl(2))$, so that (1.24) is in fact the $(2j+1) \times (2j+1)$ matrix whose elements are the operators acting in the space (1.17). Following the conventional terminology, we will refer to this space as the “quantum space”. The expression (1.24) contains the ordered exponential (the symbol $\mathcal P$ denotes the path ordering) which can be defined in terms of the power series in $\lambda$ as follows,

$$\mathbf L_j(\lambda) = \pi_j\Big[\, e^{i\pi P H} \sum_{k=0}^{\infty} \lambda^k \int_{2\pi \ge u_1 \ge u_2 \ge \cdots \ge u_k \ge 0} K(u_1) K(u_2) \cdots K(u_k)\, du_1\, du_2 \cdots du_k \,\Big], \tag{1.28}$$

where

$$K(u) = V_-(u)\, q^{\frac{H}{2}} E + V_+(u)\, q^{-\frac{H}{2}} F. \tag{1.29}$$

The integrals in (1.28) make perfect sense if

$$-\infty < c < -2. \tag{1.30}$$

For $-2 < c < 1$ the integrals (1.28) diverge and the power series expansion of (1.25) should be written down in terms of contour integrals, as explained in [2] (see also Appendix C of this paper).

$^1$ Note that the discrete analog of the operator $\mathbf L_{\frac12}(\lambda)$ has been used in [13,14] in the context of the quantum lattice KdV equation.
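The relation between the parameter $\beta^2$, the central charge (1.14), the highest weight (1.15) and the convergence range (1.30) can be tabulated with a few lines of Python. This is only an illustrative sketch: the function names are ours, not the paper's, and the convergence test assumes $0 < \beta^2 \le 1$, where the range (1.30) corresponds to $0 < \beta^2 < 1/2$.

```python
def central_charge(beta2: float) -> float:
    """Virasoro central charge c = 13 - 6 (beta^2 + beta^-2), Eq. (1.14)."""
    return 13.0 - 6.0 * (beta2 + 1.0 / beta2)


def highest_weight(p: float, beta2: float) -> float:
    """Highest weight Delta(p) = (p / beta)^2 + (c - 1) / 24, Eq. (1.15)."""
    return p * p / beta2 + (central_charge(beta2) - 1.0) / 24.0


def ordered_integrals_converge(beta2: float) -> bool:
    """True when c lies in the range (1.30), assuming 0 < beta^2 <= 1."""
    return central_charge(beta2) < -2.0


if __name__ == "__main__":
    for beta2 in (0.25, 0.5, 0.75, 1.0):
        print(beta2, central_charge(beta2), ordered_integrals_converge(beta2))
```

The boundary value $\beta^2 = 1/2$ gives $c = -2$ and $\beta^2 = 1$ gives $c = 1$, which are the endpoints of the interval $-2 < c < 1$ where the contour-integral definition is needed.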

In Sect. 2 we will show that the operator matrices (1.24) satisfy the relations (1.5),

$$\mathbf R_{jj'}(\lambda\mu^{-1})\, \big[ \mathbf L_j(\lambda) \otimes 1 \big] \big[ 1 \otimes \mathbf L_{j'}(\mu) \big] = \big[ 1 \otimes \mathbf L_{j'}(\mu) \big] \big[ \mathbf L_j(\lambda) \otimes 1 \big]\, \mathbf R_{jj'}(\lambda\mu^{-1}), \tag{1.31}$$

where the matrix $\mathbf R_{jj'}(\lambda)$ is the R-matrix associated with the representations $\pi_j$, $\pi_{j'}$ of $U_q(sl(2))$; in particular

$$\mathbf R_{\frac12 \frac12}(\lambda) = \begin{pmatrix} q^{-1}\lambda - q\lambda^{-1} & & & \\ & \lambda - \lambda^{-1} & q^{-1} - q & \\ & q^{-1} - q & \lambda - \lambda^{-1} & \\ & & & q^{-1}\lambda - q\lambda^{-1} \end{pmatrix} \tag{1.32}$$

coincides with the R-matrix of the six-vertex model. With an appropriate normalization the matrix $\mathbf R_{jj'}(\lambda)$ is a finite Laurent polynomial in $\lambda$. Therefore, after multiplication by a simple power factor both sides of (1.31) can be expanded in infinite series in the variables $\lambda$ and $\mu$. We will prove that the relations (1.31) are valid to all orders of these expansions. In fact, we will construct more general L-operators which satisfy the Yang–Baxter relation (1.5) with the universal R-matrix for the quantum Kac–Moody algebra $U_q\big(\widehat{sl}(2)\big)$. Equation (1.31) will follow then as a particular case.

2. The Yang–Baxter Relation
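Before the general construction, the six-vertex matrix (1.32) can be checked numerically against the Yang–Baxter equation in the multiplicative form (1.2). The sketch below is an illustration in plain Python, not part of the paper: the helper names (`r_matrix`, `embed`, `ybe_defect`) and the sample values of $\lambda$, $\mu$ and $\beta^2$ are arbitrary choices, and the basis of $\mathbb C^2 \otimes \mathbb C^2$ is assumed to be ordered as $|{+}{+}\rangle, |{+}{-}\rangle, |{-}{+}\rangle, |{-}{-}\rangle$.

```python
import cmath


def r_matrix(lam, q):
    """4x4 six-vertex R-matrix of Eq. (1.32)."""
    a = lam / q - q / lam      # q^{-1} lam - q lam^{-1}
    b = lam - 1 / lam
    c = 1 / q - q
    return [[a, 0, 0, 0], [0, b, c, 0], [0, c, b, 0], [0, 0, 0, a]]


def embed(r, i, j):
    """Let the 4x4 matrix r act on factors i < j of C^2 x C^2 x C^2 (8x8 result)."""
    spectator = 3 - i - j
    out = [[0j] * 8 for _ in range(8)]
    for row in range(8):
        rb = [(row >> (2 - k)) & 1 for k in range(3)]
        for col in range(8):
            cb = [(col >> (2 - k)) & 1 for k in range(3)]
            if rb[spectator] == cb[spectator]:
                out[row][col] = r[2 * rb[i] + rb[j]][2 * cb[i] + cb[j]]
    return out


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def ybe_defect(lam, mu, q):
    """Largest entry of R12(lam) R13(lam mu) R23(mu) - R23(mu) R13(lam mu) R12(lam)."""
    r12 = embed(r_matrix(lam, q), 0, 1)
    r13 = embed(r_matrix(lam * mu, q), 0, 2)
    r23 = embed(r_matrix(mu, q), 1, 2)
    lhs = matmul(matmul(r12, r13), r23)
    rhs = matmul(matmul(r23, r13), r12)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(8) for j in range(8))


if __name__ == "__main__":
    q = cmath.exp(1j * cmath.pi * 0.3)   # q = exp(i pi beta^2) with beta^2 = 0.3
    print(ybe_defect(1.3 + 0.2j, 0.7 - 0.4j, q))   # expected at rounding level
```

A defect at the level of floating-point rounding is consistent with (1.2); it is of course a spot check at sample points, not a proof.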

The quantum Kac–Moody algebra $\mathcal A = U_q\big(\widehat{sl}(2)\big)$ is generated by elements $h_0$, $h_1$, $x_0$, $x_1$, $y_0$, $y_1$, subject to the commutation relations

$$[h_i, h_j] = 0, \qquad [h_i, x_j] = -a_{ij}\, x_j, \qquad [h_i, y_j] = a_{ij}\, y_j, \tag{2.1}$$

$$[y_i, x_j] = \delta_{ij}\, \frac{q^{h_i} - q^{-h_i}}{q - q^{-1}}, \tag{2.2}$$

and the Serre relations

$$x_i^3 x_j - [3]_q\, x_i^2 x_j x_i + [3]_q\, x_i x_j x_i^2 - x_j x_i^3 = 0, \qquad y_i^3 y_j - [3]_q\, y_i^2 y_j y_i + [3]_q\, y_i y_j y_i^2 - y_j y_i^3 = 0. \tag{2.3}$$

Here the indices $i, j$ take two values $i, j = 0, 1$; $a_{ij}$ is the Cartan matrix of the algebra $U_q\big(\widehat{sl}(2)\big)$,

$$a_{ij} = \begin{pmatrix} 2 & -2 \\ -2 & 2 \end{pmatrix},$$

and $[n]_q = (q^n - q^{-n})/(q - q^{-1})$. The sum

$$k = h_0 + h_1 \tag{2.4}$$

is a central element in the algebra $\mathcal A$. Usually the algebra $\mathcal A$ is supplemented by the grade operator $d$,

$$[d, h_0] = [d, h_1] = [d, x_0] = [d, y_0] = 0, \qquad [d, x_1] = x_1, \qquad [d, y_1] = -y_1.$$

The algebra $\mathcal A = U_q\big(\widehat{sl}(2)\big)$ is a Hopf algebra with the co-multiplication

$$\delta : \ \mathcal A \longrightarrow \mathcal A \otimes \mathcal A \tag{2.5}$$

defined as

$$\begin{aligned} \delta(x_i) &= x_i \otimes 1 + q^{-h_i} \otimes x_i, & \delta(y_i) &= y_i \otimes q^{h_i} + 1 \otimes y_i, \\ \delta(h_i) &= h_i \otimes 1 + 1 \otimes h_i, & \delta(d) &= d \otimes 1 + 1 \otimes d, \end{aligned} \tag{2.6}$$

where $i = 0, 1$. As usual we introduce

$$\delta' = \sigma \circ \delta, \qquad \sigma \circ (a \otimes b) = b \otimes a \quad (\forall\, a, b \in \mathcal A). \tag{2.7}$$

Define also two Borel subalgebras $\mathcal B_- \subset \mathcal A$ and $\mathcal B_+ \subset \mathcal A$ generated by $d, h_{0,1}, x_0, x_1$ and $d, h_{0,1}, y_0, y_1$ respectively. There exists a unique element [16,17]

$$\mathcal R \in \mathcal B_+ \otimes \mathcal B_-, \tag{2.8}$$

satisfying the following relations:

$$\begin{aligned} \delta'(a)\, \mathcal R &= \mathcal R\, \delta(a) \quad (\forall\, a \in \mathcal A), \\ (\delta \otimes 1)\, \mathcal R &= \mathcal R^{13}\, \mathcal R^{23}, \\ (1 \otimes \delta)\, \mathcal R &= \mathcal R^{13}\, \mathcal R^{12}, \end{aligned} \tag{2.9}$$

where $\mathcal R^{12}, \mathcal R^{13}, \mathcal R^{23} \in \mathcal A \otimes \mathcal A \otimes \mathcal A$ and $\mathcal R^{12} = \mathcal R \otimes 1$, $\mathcal R^{23} = 1 \otimes \mathcal R$, $\mathcal R^{13} = (\sigma \otimes 1)\, \mathcal R^{23}$. The element $\mathcal R$ is called the universal R-matrix. It satisfies the Yang–Baxter equation

$$\mathcal R^{12}\, \mathcal R^{13}\, \mathcal R^{23} = \mathcal R^{23}\, \mathcal R^{13}\, \mathcal R^{12}, \tag{2.10}$$

which is a simple corollary of the definitions (2.9). The universal R-matrix is understood as a formal series in generators in $\mathcal B_+ \otimes \mathcal B_-$. Its dependence on the Cartan elements can be isolated as a simple factor. It will be convenient to introduce the “reduced” universal R-matrix

$$\overline{\mathcal R} = q^{-(h_0 \otimes h_0)/2 + k \otimes d + d \otimes k}\ \mathcal R = (\text{series in } y_0, y_1, x_0, x_1), \tag{2.11}$$

where $y_i \in \mathcal B_+ \otimes 1$, $x_i \in 1 \otimes \mathcal B_-$ ($i = 0, 1$). There exists an “explicit” expression for the universal R-matrix [18,19] which, in the general case, provides an algorithmic procedure for the computation of this series order by order. Using these results or directly from the definitions (2.8) and (2.9) one can calculate the first few terms in (2.11),

$$\overline{\mathcal R} = 1 + (q - q^{-1})(y_0 \otimes x_0 + y_1 \otimes x_1) + \frac{q - q^{-1}}{[2]_q} \Big\{ (q^2 - 1)\big( y_0^2 \otimes x_0^2 + y_1^2 \otimes x_1^2 \big) + y_0 y_1 \otimes (x_1 x_0 - q^{-2} x_0 x_1) + y_1 y_0 \otimes (x_0 x_1 - q^{-2} x_1 x_0) \Big\} + \ldots. \tag{2.12}$$

The higher terms soon become very complicated and their general form is unknown. This complexity should not be surprising, since the universal R-matrix contains infinitely many nontrivial solutions of the Yang–Baxter equation associated with $U_q\big(\widehat{sl}(2)\big)$. A few more terms of the expansion (2.12) are given in Appendix A.

We are now ready to prove the Yang–Baxter equation (1.31) and its generalizations. Consider the following operator:

$$\mathbf L = e^{i\pi P h}\, \mathcal P \exp\Big( \int_0^{2\pi} K(u)\, du \Big), \tag{2.13}$$

where

$$K(u) = V_-(u)\, y_0 + V_+(u)\, y_1. \tag{2.14}$$

Here $h = h_0 = -h_1$, $y_0$, $y_1$ are the generators of the Borel subalgebra $\mathcal B_+$ and the P-exponent is defined as the series of the ordered integrals of $K(u)$, similarly to (1.28). Notice that we assumed here that the central charge $k$ is zero; considering this case is sufficient for our goals. The operator (2.13) is an element of the algebra $\mathcal B_+$ whose coefficients are operators acting in the quantum space (1.17). It is more general than the one in (1.25) and reduces to the latter for a particular representation of $\mathcal B_+$ (see below). Consider now two operators (2.13)

$$\mathbf L \otimes 1 \in \mathcal B_+ \otimes 1, \qquad 1 \otimes \mathbf L \in 1 \otimes \mathcal B_+ \tag{2.15}$$

belonging to the different factors of the direct product $\mathcal B_+ \otimes \mathcal B_+$. Using (1.19) for the product of these operators one obtains

$$(\mathbf L \otimes 1)(1 \otimes \mathbf L) = e^{i\pi P \delta(h)}\, \mathcal P \exp\Big( \int_0^{2\pi} K_1(u)\, du \Big)\, \mathcal P \exp\Big( \int_0^{2\pi} K_2(u)\, du \Big), \tag{2.16}$$

where

$$\delta(h) = h \otimes 1 + 1 \otimes h, \tag{2.17}$$

and

$$\begin{aligned} K_1(u) &= V_-(u)\, (y_0 \otimes q^{h}) + V_+(u)\, (y_1 \otimes q^{-h}), \\ K_2(u) &= V_-(u)\, (1 \otimes y_0) + V_+(u)\, (1 \otimes y_1). \end{aligned} \tag{2.18}$$

Taking into account (1.19) and (2.1) it is easy to see that

$$[K_1(u_1), K_2(u_2)] = 0, \qquad u_1 < u_2, \tag{2.19}$$

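therefore the product of the P-exponents in (2.16) can be rewritten as

$$\begin{aligned} (\mathbf L \otimes 1)(1 \otimes \mathbf L) &= e^{i\pi P \delta(h)}\, \mathcal P \exp\Big( \int_0^{2\pi} \big( K_1(u) + K_2(u) \big)\, du \Big) \\ &= e^{i\pi P \delta(h)}\, \mathcal P \exp\Big( \int_0^{2\pi} \big( V_-(u)\, \delta(y_0) + V_+(u)\, \delta(y_1) \big)\, du \Big) \\ &= \delta(\mathbf L), \end{aligned} \tag{2.20}$$

where the co-multiplication $\delta$ is defined in (2.6). Similarly

$$(1 \otimes \mathbf L)(\mathbf L \otimes 1) = \delta'(\mathbf L), \tag{2.21}$$

with $\delta'$ defined in (2.7). Combining (2.20) and (2.21) with the first equation in (2.9) one obtains the following Yang–Baxter equation

$$\mathcal R\, (\mathbf L \otimes 1)(1 \otimes \mathbf L) = (1 \otimes \mathbf L)(\mathbf L \otimes 1)\, \mathcal R. \tag{2.22}$$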

Obviously, this equation is more general than (1.31). To obtain the latter from (2.22) we only need to choose appropriate representations in each factor of the direct product $\mathcal A \otimes \mathcal A$ involved in (2.22). Consider the so-called evaluation homomorphism $U_q\big(\widehat{sl}(2)\big) \longrightarrow U_q(sl(2))$ of the form

$$\begin{aligned} x_0 &\to \lambda^{-1} F q^{-H/2}, & y_0 &\to \lambda\, q^{H/2} E, & h_0 &\to H, \\ x_1 &\to \lambda^{-1} E q^{H/2}, & y_1 &\to \lambda\, q^{-H/2} F, & h_1 &\to -H, \end{aligned} \tag{2.23}$$

where $\lambda$ is a spectral parameter, and $E$, $F$, $H$ are the generators of the algebra $U_q(sl(2))$, defined already in (1.26). One can easily check that with the map (2.23) all the defining relations (2.1), (2.2) and (2.3) of the algebra $\mathcal A = U_q\big(\widehat{sl}(2)\big)$ become simple corollaries of (1.26). For any representation $\pi$ of $U_q(sl(2))$ the formulae (2.23) define a representation of the algebra $\mathcal A$ with zero central charge $k$, which will be denoted as $\pi(\lambda)$. In particular, the matrix representations of $\mathcal A$ corresponding to the $(2j+1)$-dimensional representations $\pi_j$ of $U_q(sl(2))$ will be denoted $\pi_j(\lambda)$.

Let us now evaluate the Yang–Baxter equation (2.22) in the representations $\pi_j(\lambda)$ and $\pi_{j'}(\mu)$ for the first and second factor of the direct product respectively. For the L-operators one has

$$\pi_j(\lambda)\big[ \mathbf L \big] = \mathbf L_j(\lambda), \qquad \pi_{j'}(\mu)\big[ \mathbf L \big] = \mathbf L_{j'}(\mu), \tag{2.24}$$

with $\mathbf L_j$ given by (1.24), while for the R-matrix one obtains

$$\big( \pi_j(\lambda) \otimes \pi_{j'}(\mu) \big)\big[ \mathcal R \big] = \rho_{jj'}(\lambda/\mu)\, \mathbf R_{jj'}(\lambda/\mu), \tag{2.25}$$

where $\rho_{jj'}$ is a scalar factor and $\mathbf R_{jj'}$ is the same as in (1.31) [20]. This completes the proof of (1.31).

We conclude this section with the following observation concerning the structure of the L-operator (2.13). As one could expect, Eq. (2.22) is, in fact, a specialization of the Yang–Baxter equation (2.10) for the universal R-matrix. To demonstrate this it would be sufficient to find an appropriate realization of the algebra $\mathcal A$ in the third factor of the product $\mathcal A \otimes \mathcal A \otimes \mathcal A$ involved in (2.10), such that (2.10) reduces to (2.22). A little inspection shows that each side of (2.10) is an element of $\mathcal B_+ \otimes \mathcal A \otimes \mathcal B_-$ rather than an element of $\mathcal A \otimes \mathcal A \otimes \mathcal A$. Therefore we do not need a realization of the full algebra $\mathcal A$ in the third factor; a realization of the Borel subalgebra $\mathcal B_-$ is sufficient. Let us identify the generators $x_0, x_1 \in \mathcal B_-$ of this Borel subalgebra with the integrals of the exponential fields

$$x_0 = \frac{1}{q - q^{-1}} \int_0^{2\pi} V_-(u)\, du, \qquad x_1 = \frac{1}{q - q^{-1}} \int_0^{2\pi} V_+(u)\, du. \tag{2.26}$$

One can check that these generators satisfy [11] the Serre relations (2.3). To do this one should express the fourth order products of $x$’s in (2.3) in terms of the ordered integrals of products of the exponential fields $V_\pm(u)$. The calculations are simple but rather technical. We present them in Appendix A. Substituting the expressions (2.26) for the generators $x_0$, $x_1$ into the “reduced” universal R-matrix $\overline{\mathcal R}$ (2.11), (2.12) one obtains a vector in $\mathcal B_+$ whose coordinates are operators acting in the quantum space (1.17). It is natural to expect that it coincides with the P-ordered exponent from (2.13).

Conjecture.$^2$ The specialization of the “reduced” universal R-matrix $\overline{\mathcal R} \in \mathcal B_+ \otimes \mathcal B_-$ (2.11) for the case when $x_0, x_1 \in \mathcal B_-$ are realized as in (2.26) coincides with the P-exponent

$$\overline{\mathbf L} = \mathcal P \exp\Big( \int_0^{2\pi} K(u)\, du \Big), \tag{2.27}$$

where $K(u)$ is defined in (2.14).

$^2$ This statement requires no restrictions on the value of the central element $k$.

One can check this conjecture in a few lowest orders in the series expansion for the universal R-matrix. Substitute (2.26) into (2.12). It is not difficult to see that every polynomial of $x$’s appearing in (2.12) as a coefficient to the monomial in $y$’s can be written as a single ordered integral of the vertex operators (rather than their linear combination). For example, the second order terms read

$$\begin{aligned} x_0^2 &= \frac{[2]_q}{q\,(q - q^{-1})^2}\, J(-,-), & (x_0 x_1 - q^{-2} x_1 x_0) &= \frac{[2]_q}{(q - q^{-1})}\, J(+,-), \\ x_1^2 &= \frac{[2]_q}{q\,(q - q^{-1})^2}\, J(+,+), & (x_1 x_0 - q^{-2} x_0 x_1) &= \frac{[2]_q}{(q - q^{-1})}\, J(-,+), \end{aligned} \tag{2.28}$$

where$^3$

$$J(\sigma_1, \sigma_2, \ldots, \sigma_n) = \int_{2\pi \ge u_1 \ge \cdots \ge u_n \ge 0} V_{\sigma_1}(u_1)\, V_{\sigma_2}(u_2) \cdots V_{\sigma_n}(u_n)\, du_1 \ldots du_n, \tag{2.29}$$

with $\sigma_i = \pm 1$. Using (2.26) and (2.29) one can rewrite the RHS of (2.12) as

$$\overline{\mathcal R} = 1 + y_0\, J(-) + y_1\, J(+) + y_0^2\, J(-,-) + y_1^2\, J(+,+) + y_0 y_1\, J(-,+) + y_1 y_0\, J(+,-) + \ldots, \tag{2.30}$$

which coincides with the first three terms of the expansion of the P-exponent (2.27). We have verified this conjecture to within the terms of the fourth order in the generators $x_0$ and $x_1$ (see Appendix A). Notice that starting from the fourth order one has to take into account the Serre relations (2.3).

The above conjecture suggests that the operators (2.27) can be reexpressed through algebraic combinations of the two elementary integrals (2.26) instead of the ordered integrals (2.29)$^4$. Conversely, this statement combined with the uniqueness [17] of the universal R-matrix satisfying (2.8) and (2.9) implies the above conjecture. Finally let us stress that our proof of the Yang–Baxter equations (1.31) and (2.22) is independent of this conjecture.

$^3$ Note that this definition differs by the factor $q^n$ from that given in Eq. (2.31) of Ref. [2].

$^4$ Perhaps this statement is less trivial than it might appear. In fact, one can always write any product of $x_0$ and $x_1$ from (2.26) as a linear combination of the integrals (2.29), but not vice versa, since the elementary integrals (2.26) are algebraically dependent due to the Serre relations.

3. Commuting T- and Q-Operators

It is a well known and simple consequence of the Yang–Baxter relation (1.31) that appropriately defined traces of the operator matrices $\mathbf L_j(\lambda)$ give rise to the operators $\mathbf T_j(\lambda)$ which commute for different values of the parameter $\lambda$, i.e.

$$[\mathbf T_j(\lambda), \mathbf T_{j'}(\lambda')] = 0. \tag{3.1}$$

In fact, there is a certain ambiguity in the construction of these operators. Below we show that this ambiguity is eliminated if we impose an additional requirement that the operators $\mathbf T_j(\lambda)$ also commute with the local IM (1.21),

$$[\mathbf T_j(\lambda), \mathbf I_{2k-1}] = 0. \tag{3.2}$$

It is easy to check that the Yang–Baxter equation (1.31) is not affected if one multiplies the L-operator (1.25) by an exponent of the Cartan element $H$,

$$\mathbf L(\lambda) \longrightarrow \mathbf L^{(f)}(\lambda) = e^{ifH}\, \mathbf L(\lambda), \tag{3.3}$$

where $f$ is an arbitrary constant. Therefore the operators

$$\mathbf T^{(f)}_j(\lambda) = \mathrm{Tr}_{\pi_j}\big[\, e^{ifH}\, \mathbf L_j(\lambda) \,\big] \tag{3.4}$$

satisfy the commutativity relations (3.1) for any value of $f$. Moreover, this commutativity is not violated even if the quantity $f$ is a function of $P$ rather than a constant (despite the fact that in this case the operators (3.3) do not necessarily satisfy the ordinary Yang–Baxter equation (1.31)). This is obvious if one uses the standard realization of the spin-$j$ representations $\pi_j$ of the algebra (1.26)

$$\pi_j[E]\, |k\rangle = [k]_q\, [2j - k + 1]_q\, |k-1\rangle, \qquad \pi_j[F]\, |k\rangle = |k+1\rangle, \qquad \pi_j[H]\, |k\rangle = (2j - 2k)\, |k\rangle, \tag{3.5}$$

where $[k]_q = (q^k - q^{-k})/(q - q^{-1})$ and the vectors $|k\rangle$ ($k = 0, 1, \ldots, 2j$) form a basis in the $(2j+1)$-dimensional space. Then, using (1.19) it is easy to show that all the diagonal entries of the $(2j+1) \times (2j+1)$ matrices $\mathbf L_j(\lambda)$ commute with the operator $P$. As an immediate consequence the quantity $f = f(P)$ in (3.4) can be treated as a constant and therefore the commutativity (3.1) remains valid. It follows also that the operators (3.4) act invariantly in each Fock module $\mathcal F_p$.

The commutativity (3.2) requires a special choice of the function $f = f(P)$. We show in Appendix C that the operators (3.4) commute with $\mathbf I_1 = L_0 - c/24$ if

$$f = \pi\, (P + N). \tag{3.6}$$

Here $N$ is an arbitrary integer which obviously has no other effect on (3.4) than the overall sign of this operator; in what follows we set $N = 0$ and define

$$\mathbf T_j(\lambda) = \mathrm{Tr}_{\pi_j}\big[\, e^{i\pi P H}\, \mathbf L_j(\lambda) \,\big]. \tag{3.7}$$

In fact, with this choice of $f$ the operators (3.7) commute with all the local IM (1.21). This is demonstrated in Appendix C. The operators (3.7) act invariantly in each Fock module $\mathcal F_p$ and satisfy both (3.1) and (3.2).

The above operators $\mathbf T_j(\lambda)$ are CFT analogs of the commuting transfer-matrices of Baxter’s lattice theory. Besides these commuting transfer-matrices the “technology” of the solvable lattice models involves also another important object – Baxter’s Q-matrix [3]. It turns out that another specialization of the general L operator (2.13) leads to the CFT analog of the Q-matrix [2]. Consider the so-called q-oscillator algebra generated by the elements $H$, $E_+$, $E_-$ subject to the relations

$$q\, E_+ E_- - q^{-1} E_- E_+ = \frac{1}{q - q^{-1}}, \qquad [H, E_\pm] = \pm 2\, E_\pm. \tag{3.8}$$

One can easily show that the following two maps of the Borel subalgebra $\mathcal B_+$ of $U_q\big(\widehat{sl}(2)\big)$ into the q-oscillator algebra (3.8)

$$h = h_0 = -h_1 \to \pm H, \qquad y_0 \to \lambda\, q^{\pm\frac{H}{2}} E_\pm, \qquad y_1 \to \lambda\, q^{\mp\frac{H}{2}} E_\mp \tag{3.9}$$

(here one has to choose all the upper or all the lower signs) are homomorphisms. Under these homomorphisms the operator (2.13) becomes an element of the algebra (3.8)

$$\mathbf L \to \mathbf L_\pm(\lambda) = e^{\pm i\pi P H}\, \mathcal P \exp\Big( \lambda \int_0^{2\pi} du\, \big( V_-(u)\, q^{\pm\frac{H}{2}} E_\pm + V_+(u)\, q^{\mp\frac{H}{2}} E_\mp \big) \Big). \tag{3.10}$$

Let $\rho_\pm$ be any representations of (3.8) such that the trace

$$Z_\pm(p) = \mathrm{Tr}_{\rho_\pm}\big[\, e^{\pm 2\pi i p H} \,\big] \tag{3.11}$$

exists and does not vanish for complex $p$ belonging to the lower half plane, $\Im m\, p < 0$. Then define two operators

$$\mathbf A_\pm(\lambda) = Z_\pm^{-1}(P)\ \mathrm{Tr}_{\rho_\pm}\big[\, e^{\pm\pi i P H}\, \mathbf L_\pm(\lambda) \,\big]. \tag{3.12}$$

Since we are interested in the action of these operators in $\mathcal F_p$ the operator $P$ in (3.12) can be substituted by its eigenvalue $p$. The definition (3.12) applies to the case $\Im m\, p < 0$. However the operators $\mathbf A_\pm(\lambda)$ can be defined for all complex $p$ (except for some set of singular points on the real axis) by an analytic continuation in $p$. As was shown in [2], the trace in (3.12) is completely determined by the commutation relations (3.8) and the cyclic property of the trace, so the specific choice of the representations $\rho_\pm$ is not significant as long as the above property is maintained. The Q-operators (the CFT analogs of Baxter’s Q-matrix) are defined as

$$\mathbf Q_\pm(\lambda) = \lambda^{\pm 2P/\beta^2}\, \mathbf A_\pm(\lambda). \tag{3.13}$$

Similarly to the T-operators they act invariantly in each Fock module $\mathcal F_p$,

$$\mathbf Q_\pm(\lambda) : \ \mathcal F_p \to \mathcal F_p, \tag{3.14}$$

and commute with the local IM (1.21). The operators $\mathbf Q_\pm(\lambda)$ with different values of $\lambda$ commute among themselves and with all the operators $\mathbf T_j(\lambda)$,

$$[\mathbf Q_\pm(\lambda), \mathbf Q_\pm(\lambda')] = [\mathbf Q_\pm(\lambda), \mathbf T_j(\lambda')] = 0. \tag{3.15}$$

This follows from the appropriate specializations of the Yang–Baxter equation (2.22).

The operators $\mathbf T_j(\lambda)$ and $\mathbf A_\pm(\lambda)$ enjoy remarkable analyticity properties as functions of the variable $\lambda^2$. Namely, all the matrix elements and eigenvalues of these operators are entire functions of this variable [1,2]. The proof is carried out in Appendix B. It is based on the result of [21] on the analyticity of certain Coulomb partition functions which was obtained through the Jack polynomial technique. Currently there is a complete proof of the above analyticity for $\mathbf T_j(\lambda)$ for all values of $\beta^2$ in the domain $0 < \beta^2 < 1/2$ (which corresponds to (1.30)) and the “almost complete” proof of this analyticity for $\mathbf A_\pm(\lambda)$ which extends to all rational values of $\beta^2$ and to almost all irrational values of $\beta^2$ (i.e. to all irrational values except for some set of Lebesgue measure zero, see Appendix B for the details) in the above interval. It is natural to assume that this analyticity remains valid for those exceptional irrationals as well.

4. The Functional Relations

It is well known from the lattice theory that analyticity of the commuting transfer matrices becomes an extremely powerful condition when combined with the functional relations which the transfer matrices satisfy, and, in principle, allows one to determine all the eigenvalues. Therefore, the functional relations (FR) for the operators $\mathbf Q_\pm(\lambda)$ and $\mathbf T_j(\lambda)$ are of primary interest. We start our consideration with the “fundamental” FR (fundamental in the sense that it implies all the other functional relations involving the operators $\mathbf T_j(\lambda)$ or $\mathbf Q_\pm(\lambda)$).

i) Fundamental relation. The “transfer-matrices” $\mathbf T_j(\lambda)$ can be expressed in terms of $\mathbf Q_\pm(\lambda)$ as [2]

$$2i \sin(2\pi P)\, \mathbf T_j(\lambda) = \mathbf Q_+(q^{j+\frac12}\lambda)\, \mathbf Q_-(q^{-j-\frac12}\lambda) - \mathbf Q_+(q^{-j-\frac12}\lambda)\, \mathbf Q_-(q^{j+\frac12}\lambda), \tag{4.1}$$

where $j$ takes (half-)integer values $j = 0, \frac12, 1, \frac32, \ldots, \infty$. Before going into the proof of (4.1) let us mention its simple but important corollary.

ii) T-Q relation. The operators $\mathbf Q_\pm(\lambda)$ satisfy Baxter’s T-Q equation

$$\mathbf T(\lambda)\, \mathbf Q_\pm(\lambda) = \mathbf Q_\pm(q\lambda) + \mathbf Q_\pm(q^{-1}\lambda), \tag{4.2}$$

where $\mathbf T(\lambda) \equiv \mathbf T_{\frac12}(\lambda)$. This equation can be thought of as the finite-difference analog of a second order differential equation so we expect it to have two linearly independent solutions. As $\mathbf T(\lambda)$ is a single-valued function of $\lambda^2$, i.e. it is a periodic function of $\log \lambda^2$, the operators $\mathbf Q_\pm(\lambda)$ are interpreted as two “Bloch-wave” solutions to Eq. (4.2). The operators $\mathbf Q_\pm(\lambda)$ satisfy the “quantum Wronskian” condition

$$\mathbf Q_+(q^{\frac12}\lambda)\, \mathbf Q_-(q^{-\frac12}\lambda) - \mathbf Q_+(q^{-\frac12}\lambda)\, \mathbf Q_-(q^{\frac12}\lambda) = 2i \sin(2\pi P), \tag{4.3}$$

which is just a particular case of (4.1) with $j = 0$.

To prove these relations consider more general T-operators which correspond to the infinite dimensional highest weight representations of $U_q(sl(2))$. These new T-operators are defined by the same formula as (3.7),

$$\mathbf T^+_j(\lambda) = \mathrm{Tr}_{\pi^+_j}\big[\, e^{i\pi P H}\, \mathbf L^+_j(\lambda) \,\big], \qquad \mathbf L^+_j(\lambda) = \pi^+_j\big[\, \mathbf L(\lambda) \,\big], \tag{4.4}$$

except that the trace is now taken over the infinite dimensional representation $\pi^+_j$ of (1.26). The corresponding representation matrices $\pi^+_j[E]$, $\pi^+_j[F]$ and $\pi^+_j[H]$ for the generators of (1.26) are defined by the equations

$$\pi^+_j[E]\, |k\rangle = [k]_q\, [2j - k + 1]_q\, |k-1\rangle, \qquad \pi^+_j[F]\, |k\rangle = |k+1\rangle, \qquad \pi^+_j[H]\, |k\rangle = (2j - 2k)\, |k\rangle, \tag{4.5}$$

which are similar to (3.5), but the basis $|k\rangle$ is now infinite, $k = 0, 1, \ldots, \infty$. The highest weight $2j$ of the representation $\pi^+_j$, $\pi^+_j[H]\, |0\rangle = 2j\, |0\rangle$, can take arbitrary complex values. Since we are interested in the action of the operators $\mathbf T^+_j(\lambda)$ in $\mathcal F_p$ the operator $P$ in (4.4) can be substituted by its eigenvalue $p$. Similarly to (3.12) the definition (4.4) makes sense only if $\Im m\, p < 0$, but it can be extended to all complex $p$ (except for some set of singular points on the real axis) by the analytic continuation in $p$. The operators (4.4) thus defined satisfy the commutativity conditions

$$[\mathbf T_j(\lambda), \mathbf T^+_{j'}(\mu)] = [\mathbf T^+_j(\lambda), \mathbf T^+_{j'}(\mu)] = 0,$$

which follow from the appropriate specializations of the Yang–Baxter equation (2.22).

If $j$ takes a non-negative integer or half-integer value the matrices $\pi^+_j[E]$, $\pi^+_j[F]$ and $\pi^+_j[H]$ in (4.5) have a block-triangular form with two diagonal blocks, one (finite) being equivalent to the $(2j+1)$-dimensional representation $\pi_j$ and the other (infinite) coinciding with the highest weight representation $\pi^+_{-j-1}$. Hence the following simple relation holds:

$$\mathbf T^+_j(\lambda) = \mathbf T_j(\lambda) + \mathbf T^+_{-j-1}(\lambda), \qquad j = 0, 1/2, 1, 3/2, \ldots. \tag{4.6}$$

In some ways the operators $\mathbf T^+_j(\lambda)$ are simpler than $\mathbf T_j(\lambda)$. Making a similarity transformation $E \to \lambda E$, $F \to \lambda^{-1} F$, which does not affect the trace in (4.4), and observing that

$$\lambda^2\, \pi^+_j[E]\, |k\rangle = \frac{[k]_q}{q - q^{-1}}\, \big( \lambda_+^2\, q^{-k} - \lambda_-^2\, q^{k} \big)\, |k-1\rangle, \qquad k = 0, 1, \ldots, \infty,$$

where

$$\lambda_+ = \lambda\, q^{j+\frac12}, \qquad \lambda_- = \lambda\, q^{-j-\frac12}, \tag{4.7}$$

one can conclude that the operator $\mathbf T^+_j(\lambda)$ can be written as

$$\mathbf T^+_j(\lambda) = \mathbf T^+_j(0)\ \Phi(\lambda_+, \lambda_-), \tag{4.8}$$

where

$$\mathbf T^+_j(0) = \frac{e^{2\pi i (2j+1) P}}{2i \sin(2\pi P)} \tag{4.9}$$

and $\Phi(\lambda_+, \lambda_-)$ is a series in $\lambda_+^2$ and $\lambda_-^2$ with the coefficients which do not depend on $j$ and the leading coefficient being equal to 1. Remarkably, the expression (4.8) further simplifies since the quantity $\Phi(\lambda_+, \lambda_-)$ factorizes into a product of two operators (3.12),

$$2i \sin(2\pi P)\, \mathbf T^+_j(\lambda) = e^{2\pi i (2j+1) P}\, \mathbf A_+\big( \lambda\, q^{j+\frac12} \big)\, \mathbf A_-\big( \lambda\, q^{-j-\frac12} \big). \tag{4.10}$$

This factorization can be proved algebraically by using decomposition properties of the tensor product of two representations of the q-oscillator algebra (the latter are also representations of the Borel subalgebra of $U_q\big(\widehat{sl}(2)\big)$) with respect to the co-multiplication from $U_q\big(\widehat{sl}(2)\big)$. The details of the calculations are presented in Appendix D. The functional relation (4.1) trivially follows from (4.6) and (4.10).

The relation (4.3) shows that the operators $\mathbf Q_+(\lambda)$ and $\mathbf Q_-(\lambda)$ are functionally dependent. Using this dependence one can write (4.1) as

$$\mathbf T_j(\lambda) = \mathbf Q(q^{j+\frac12}\lambda)\, \mathbf Q(q^{-j-\frac12}\lambda) \sum_{k=-j}^{j} \frac{1}{\mathbf Q(q^{k+\frac12}\lambda)\, \mathbf Q(q^{k-\frac12}\lambda)}, \tag{4.11}$$

where $\mathbf Q(\lambda)$ is any one of $\mathbf Q_+(\lambda)$ and $\mathbf Q_-(\lambda)$.

The last group of FR we want to discuss here is the relations involving solely the transfer matrices $\mathbf T_j(\lambda)$ and usually referred to as the “fusion relations”$^5$ [22]. Note that these are again simple corollaries of the “fundamental relation” (4.1).

$^5$ In fact, all the above FR can also be called the fusion relation since they all follow from (4.10) which describes the “fusion” of the q-oscillator algebra representations.
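The value (4.9) and the decomposition (4.6) can be spot-checked at $\lambda = 0$, where only the factor $e^{2\pi i p H}$ survives inside the traces. The Python sketch below is an illustration under stated assumptions: the function names are ours, the sample eigenvalue $p$ is an arbitrary point with $\Im m\, p < 0$ (so that the infinite trace converges), and the infinite sum is simply truncated.

```python
import cmath

TWO_PI_I = 2j * cmath.pi


def t_plus_truncated(spin, p, terms=200):
    """Truncated trace of exp(2 pi i p H) over |k>, k >= 0, with H|k> = (2j - 2k)|k>."""
    return sum(cmath.exp(TWO_PI_I * p * (2 * spin - 2 * k)) for k in range(terms))


def t_plus_closed(spin, p):
    """Closed form (4.9): exp(2 pi i (2j + 1) p) / (2i sin(2 pi p))."""
    return cmath.exp(TWO_PI_I * (2 * spin + 1) * p) / (2j * cmath.sin(2 * cmath.pi * p))


def t_finite(spin, p):
    """Trace of exp(2 pi i p H) over the (2j + 1)-dimensional representation (3.5)."""
    dim = int(round(2 * spin)) + 1
    return sum(cmath.exp(TWO_PI_I * p * (2 * spin - 2 * k)) for k in range(dim))


if __name__ == "__main__":
    p = 0.37 - 0.2j          # sample eigenvalue of P in the lower half plane
    for spin in (0.5, 1.0, 1.5):
        series_gap = abs(t_plus_truncated(spin, p) - t_plus_closed(spin, p))
        split_gap = abs(t_plus_closed(spin, p) - t_finite(spin, p)
                        - t_plus_closed(-spin - 1, p))
        print(spin, series_gap, split_gap)
```

Both printed gaps should be at rounding level: the first compares the geometric series with (4.9), the second is the $\lambda = 0$ case of (4.6), $\mathbf T^+_j(0) = \mathbf T_j(0) + \mathbf T^+_{-j-1}(0)$.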

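iii) Fusion relations. The transfer matrices $\mathbf T_j(\lambda)$ satisfy

$$\mathbf T_j(q^{\frac12}\lambda)\, \mathbf T_j(q^{-\frac12}\lambda) = 1 + \mathbf T_{j+\frac12}(\lambda)\, \mathbf T_{j-\frac12}(\lambda), \tag{4.12}$$

where $\mathbf T_0(\lambda) \equiv 1$. These can also be equivalently rewritten as

$$\mathbf T(\lambda)\, \mathbf T_j(q^{j+\frac12}\lambda) = \mathbf T_{j-\frac12}(q^{j+1}\lambda) + \mathbf T_{j+\frac12}(q^{j}\lambda), \tag{4.13}$$

or as

$$\mathbf T(\lambda)\, \mathbf T_j(q^{-j-\frac12}\lambda) = \mathbf T_{j-\frac12}(q^{-j-1}\lambda) + \mathbf T_{j+\frac12}(q^{-j}\lambda). \tag{4.14}$$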
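The way (4.2) and (4.12) follow from (4.1) can be illustrated numerically. For commuting operators the relations may be checked on scalar stand-ins, and for any two functions $Q_\pm$ the antisymmetric combination on the right-hand side of (4.1) obeys Plücker-type identities; once the quantum Wronskian (4.3) is imposed these reduce to (4.2) and (4.12). In the Python sketch below the test functions `q_plus`, `q_minus`, the value of $\beta^2$ and all names are arbitrary assumptions made for illustration, and `wronskian(j, lam)` denotes the right-hand side of (4.1) without the normalization by $2i\sin(2\pi P)$.

```python
import cmath

BETA2 = 0.3   # hypothetical value of beta^2; q = exp(i pi beta^2)


def q_pow(r):
    """q^r computed on a fixed branch so that q^a q^b = q^(a+b)."""
    return cmath.exp(1j * cmath.pi * BETA2 * r)


def q_plus(x):    # arbitrary test function standing in for an eigenvalue of Q_+
    return cmath.exp(0.3 * x) + x


def q_minus(x):   # arbitrary test function standing in for an eigenvalue of Q_-
    return cmath.cos(x) - 0.5 * x * x


def wronskian(spin, lam):
    """Right-hand side of (4.1) for the test functions above."""
    up, down = q_pow(spin + 0.5) * lam, q_pow(-spin - 0.5) * lam
    return q_plus(up) * q_minus(down) - q_plus(down) * q_minus(up)


def fusion_defect(spin, lam):
    """W_j(q^1/2 l) W_j(q^-1/2 l) - W_0(q^(j+1/2) l) W_0(q^(-j-1/2) l) - W_(j+1/2) W_(j-1/2)."""
    lhs = wronskian(spin, q_pow(0.5) * lam) * wronskian(spin, q_pow(-0.5) * lam)
    rhs = (wronskian(0, q_pow(spin + 0.5) * lam) * wronskian(0, q_pow(-spin - 0.5) * lam)
           + wronskian(spin + 0.5, lam) * wronskian(spin - 0.5, lam))
    return abs(lhs - rhs)


def tq_defect(lam):
    """Q_+(l) W_1/2(l) - Q_+(q l) W_0(q^-1/2 l) - Q_+(l / q) W_0(q^1/2 l)."""
    lhs = q_plus(lam) * wronskian(0.5, lam)
    rhs = (q_plus(q_pow(1) * lam) * wronskian(0, q_pow(-0.5) * lam)
           + q_plus(q_pow(-1) * lam) * wronskian(0, q_pow(0.5) * lam))
    return abs(lhs - rhs)


if __name__ == "__main__":
    lam = 0.8 + 0.3j
    print(fusion_defect(0.5, lam), fusion_defect(1.0, lam), tq_defect(lam))
```

When the $j = 0$ combination is the constant $2i\sin(2\pi P)$, as in (4.3), dividing the first identity by its square gives (4.12) and dividing the second by it gives (4.2).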

Considerable reductions of the FR occur when q is a root of unity. Let q N = ±1 and q n 6= ±1 for any integer 0 < n < N,

(4.15)

where N ≥ 2 is some integer. When using (4.1) it is easy to obtain that N

e2πiN P Tj (λ) + T N −1−j (λq 2 ) = 2

1 1 sin(2πN P ) Q+ (λq j+ 2 )Q− (λq −j− 2 ) sin(2πP )

(4.16)

for j = 0, 21 , . . . , N2 − 1. Similarly, T N − 1 (λ) = 2

2

N N sin(2πN P ) Q+ (λq 2 )Q− (λq 2 ). sin(2πP )

(4.17)

Moreover in this case there is an extra relation involving only T-operators, T N (λ) = 2 cos(2πN P ) + T N −1 (λ), 2

2

(4.18)

as it readily follows from (4.1). As is shown in [23] this allows to bring the FR (4.12) to the form identical to the functional TBA equations (the Y -system) of the DN type. Additional simplifications occur when the operators Tj act in Fock spaces Fp with special values of p, `+1 , (4.19) p= 2N where ` ≥ 0 is an integer such that 2p 6= nβ 2 + m for any integers n and m. Then the RHS’s of (4.16) and (4.17) vanish and these relations lead to T N −j−1 (λ) = (−1)` Tj (q

N 2

2

λ), for j = 0, 21 , 1, . . . , N2 − 1 ;

T N − 1 (λ) = 0. 2

(4.20)

2

Further discussion of this case can be found in [1] and [23]. Finally, some remarks concerning the lattice theory are worth making. Although our construction of the Q-operators in terms of the q-oscillator representations was given here specifically for the case of continuous theory, it is clear that the lattice Q-matrices admit a similar construction. In particular the Q-matrix of the six-vertex model can be obtained as a transfer matrix associated with infinite dimensional representations of the q-oscillator algebra (3.8)6 . In the case of the six-vertex vertex model with nonzero (horizontal) field this construction gives rise to two Q-matrices, Q± . As the structure of the FR (4.1), (4.2), (4.10) is completely determined by the decomposition properties b , all these FR are valid in the lattice case, of products of representations of Uq sl(2) with minor modifications mostly related to the normalization conventions of the lattice transfer matrices. 6 Using this construction it is possible, in particular, to reproduce a remarkably simple expression for an arbitrary matrix element of the Q-matrix of the zero field six-vertex model in the “half-filling” sector [3].


Acknowledgement. The authors thank R. Askey and D.S. Libinsky for bringing papers [24,25] to our attention. V.B. thanks L.D.Faddeev, E. Frenkel and S.M.Khoroshkin for interesting discussions and the Department of Physics and Astronomy, Rutgers University for hospitality.

Appendix A

Here we present the results of the series expansion verification of our conjecture on the structure of the universal R-matrix for the quantum Kac–Moody algebra U_q(\widehat{sl}(2)). We will need expressions for products of the basic contour integrals (2.26) in terms of linear combinations of the ordered integrals (2.29). To derive them one only has to use the commutation relation (1.19) for the vertex operators. For example, consider the second order product

\[ \begin{aligned} (q-q^{-1})^2\, x_0 x_1 &= \int_0^{2\pi}\!\!\int_0^{2\pi} :e^{-2\varphi(u_1)}:\; :e^{2\varphi(u_2)}:\; du_1\, du_2 \\ &= \int_{2\pi>u_1>u_2>0} \ldots\, du_1\, du_2 + \int_{2\pi>u_2>u_1>0} \ldots\, du_1\, du_2 \\ &= J(-,+) + q^2 J(+,-), \end{aligned} \tag{A.1} \]

where the J's are defined in (2.29). For the nth order products one has to split the domain of integration in the n-tuple integral into n! pieces corresponding to all possible orderings of the integration variables and then rearrange the products of the vertex operators using the commutation relations (1.19). Below we present the results of these calculations for the products of orders less than or equal to four,

\[ \begin{aligned} x_0 &= \frac{1}{(q-q^{-1})}\, J(-), & x_1 &= \frac{1}{(q-q^{-1})}\, J(+), \\ x_0^2 &= \frac{q^{-1}[2]_q}{(q-q^{-1})^2}\, J(-,-), & x_1^2 &= \frac{q^{-1}[2]_q}{(q-q^{-1})^2}\, J(+,+), \\ x_0^3 &= \frac{q^{-3}[2]_q[3]_q}{(q-q^{-1})^3}\, J(-,-,-), & x_1^3 &= \frac{q^{-3}[2]_q[3]_q}{(q-q^{-1})^3}\, J(+,+,+), \\ x_0^4 &= \frac{q^{-6}[2]_q[3]_q[4]_q}{(q-q^{-1})^4}\, J(-,-,-,-), & x_1^4 &= \frac{q^{-6}[2]_q[3]_q[4]_q}{(q-q^{-1})^4}\, J(+,+,+,+), \end{aligned} \tag{A.2} \]

\[ \begin{pmatrix} x_0 x_1 \\ x_1 x_0 \end{pmatrix} = \frac{1}{(q-q^{-1})^2} \begin{pmatrix} 1 & q^2 \\ q^2 & 1 \end{pmatrix} \begin{pmatrix} J(-,+) \\ J(+,-) \end{pmatrix}, \tag{A.3} \]

\[ \begin{pmatrix} x_0^2 x_1 \\ x_0 x_1 x_0 \\ x_1 x_0^2 \end{pmatrix} = \frac{[2]_q}{q\,(q-q^{-1})^3} \begin{pmatrix} 1 & q^2 & q^4 \\ q^2 & q^2 & q^2 \\ q^4 & q^2 & 1 \end{pmatrix} \begin{pmatrix} J(-,-,+) \\ J(-,+,-) \\ J(+,-,-) \end{pmatrix}, \tag{A.4} \]



\[ \begin{pmatrix} x_1^2 x_0 \\ x_1 x_0 x_1 \\ x_0 x_1^2 \end{pmatrix} = \frac{[2]_q}{q\,(q-q^{-1})^3} \begin{pmatrix} 1 & q^2 & q^4 \\ q^2 & q^2 & q^2 \\ q^4 & q^2 & 1 \end{pmatrix} \begin{pmatrix} J(+,+,-) \\ J(+,-,+) \\ J(-,+,+) \end{pmatrix}, \tag{A.5} \]

\[ \begin{pmatrix} x_0^3 x_1 \\ x_0^2 x_1 x_0 \\ x_0 x_1 x_0^2 \\ x_1 x_0^3 \end{pmatrix} = \frac{[2]_q}{(q-q^{-1})^4} \begin{pmatrix} q^{-3}[3]_q & q^{-1}[3]_q & q[3]_q & q^{3}[3]_q \\ q^{-1}[3]_q & q+2q^{-1} & 2q+q^{-1} & q[3]_q \\ q[3]_q & 2q+q^{-1} & q+2q^{-1} & q^{-1}[3]_q \\ q^{3}[3]_q & q[3]_q & q^{-1}[3]_q & q^{-3}[3]_q \end{pmatrix} \begin{pmatrix} J(-,-,-,+) \\ J(-,-,+,-) \\ J(-,+,-,-) \\ J(+,-,-,-) \end{pmatrix}, \tag{A.6} \]

\[ \begin{pmatrix} x_1^3 x_0 \\ x_1^2 x_0 x_1 \\ x_1 x_0 x_1^2 \\ x_0 x_1^3 \end{pmatrix} = \frac{[2]_q}{(q-q^{-1})^4} \begin{pmatrix} q^{-3}[3]_q & q^{-1}[3]_q & q[3]_q & q^{3}[3]_q \\ q^{-1}[3]_q & q+2q^{-1} & 2q+q^{-1} & q[3]_q \\ q[3]_q & 2q+q^{-1} & q+2q^{-1} & q^{-1}[3]_q \\ q^{3}[3]_q & q[3]_q & q^{-1}[3]_q & q^{-3}[3]_q \end{pmatrix} \begin{pmatrix} J(+,+,+,-) \\ J(+,+,-,+) \\ J(+,-,+,+) \\ J(-,+,+,+) \end{pmatrix}, \tag{A.7} \]

\[ \begin{pmatrix} x_0^2 x_1^2 \\ x_0 x_1 x_0 x_1 \\ x_0 x_1^2 x_0 \\ x_1 x_0^2 x_1 \\ x_1 x_0 x_1 x_0 \\ x_1^2 x_0^2 \end{pmatrix} = \frac{q^2 [2]_q^2}{(q-q^{-1})^4} \begin{pmatrix} q^{-4} & q^{-2} & 1 & 1 & q^2 & q^4 \\ q^{-2} & \frac{2q^{-1}}{[2]_q} & 1 & 1 & \frac{2q}{[2]_q} & q^2 \\ 1 & 1 & [3]_q-2 & 1 & 1 & 1 \\ 1 & 1 & 1 & [3]_q-2 & 1 & 1 \\ q^2 & \frac{2q}{[2]_q} & 1 & 1 & \frac{2q^{-1}}{[2]_q} & q^{-2} \\ q^4 & q^2 & 1 & 1 & q^{-2} & q^{-4} \end{pmatrix} \begin{pmatrix} J_{12} \\ J_{13} \\ J_{14} \\ J_{23} \\ J_{24} \\ J_{34} \end{pmatrix}, \tag{A.8} \]

where

\[ \begin{aligned} J_{12} &= J(-,-,+,+), & J_{13} &= J(-,+,-,+), & J_{14} &= J(-,+,+,-), \\ J_{23} &= J(+,-,-,+), & J_{24} &= J(+,-,+,-), & J_{34} &= J(+,+,-,-). \end{aligned} \tag{A.9} \]
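The entries of the matrices in (A.2)–(A.8) can be cross-checked numerically. The sketch below is not part of the original calculation: it assumes, as (A.1) suggests, that exchanging two vertex operators of opposite sign costs a factor q² and exchanging two of the same sign costs q⁻², and it sums these factors over all orderings of the integration variables. The function names `q_number` and `reorder_coeff` are hypothetical, and an overall factor (q − q⁻¹)⁻ⁿ is left out.

```python
from fractions import Fraction
from itertools import permutations


def q_number(n, q):
    """Symmetric q-number [n]_q = (q^n - q^-n) / (q - q^-1)."""
    return (q**n - q**(-n)) / (q - 1 / q)


def reorder_coeff(word, target, q):
    """Coefficient of J(target) in the product of x's encoded by `word`.

    `word` lists the vertex-operator signs of the product (x_0 -> -1,
    x_1 -> +1); `target` is the sign pattern of the ordered integral.
    Assumed exchange rule: q^(-2 * s_i * s_j) for every pair of operators
    whose order in the ordered integral is reversed relative to `word`.
    The overall factor (q - q^-1)^(-n) is omitted.
    """
    n = len(word)
    total = Fraction(0)
    for perm in permutations(range(n)):
        # perm[slot] = index of the operator with the slot-th largest u
        if tuple(word[i] for i in perm) != tuple(target):
            continue
        slot_of = {idx: slot for slot, idx in enumerate(perm)}
        weight = Fraction(1)
        for i in range(n):
            for j in range(i + 1, n):
                if slot_of[i] > slot_of[j]:  # this pair had to be exchanged
                    weight *= q ** (-2 * word[i] * word[j])
        total += weight
    return total


q = Fraction(3, 2)  # any generic rational value works for an exact check
# (A.3): x_0 x_1 = J(-,+) + q^2 J(+,-), up to the omitted prefactor
print(reorder_coeff((-1, 1), (-1, 1), q), reorder_coeff((-1, 1), (1, -1), q))
```

With exact rational arithmetic the same routine also confirms that the rows of (A.6) are linearly dependent, with the coefficients (1, −[3]_q, [3]_q, −1) of a Serre-type combination.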

We can now invert most of these relations (except (A.6) and (A.7)) to express the J's in terms of products of x's. This is not possible for (A.6) and (A.7) because the products of x's in the left-hand sides are linearly dependent (the rank of the four by four matrix therein is equal to three) as a manifestation of the Serre relations. In fact, using (A.6) and (A.7) one can easily check that the Serre relations (2.3) for the basic contour integrals (2.26) are indeed satisfied. It is, perhaps, not surprising that the J-integrals entering (A.6) and (A.7) appear in the expansion of the L-operator (2.13) only in certain linear combinations^7 which can be expressed through the products of x's. We will need the following combinations:

\[ \begin{aligned} J(-,+,+,+) + J(+,+,+,-) &= \frac{(q-q^{-1})^2}{[2]_q^2} \Big( \frac{3}{[3]_q}\, x_0 x_1^3 - 2\, x_1 x_0 x_1^2 + x_1^2 x_0 x_1 \Big), \\ J(+,-,+,+) - [3]_q J(+,+,+,-) &= \frac{(q-q^{-1})^2}{[2]_q^2} \Big( -2\, x_0 x_1^3 + ([3]_q + q^{-2})\, x_1 x_0 x_1^2 - q^{-1}[2]_q\, x_1^2 x_0 x_1 \Big), \\ J(+,+,-,+) + [3]_q J(+,+,+,-) &= \frac{(q-q^{-1})^2}{[2]_q^2} \Big( x_0 x_1^3 - q^{-1}[2]_q\, x_1 x_0 x_1^2 + q^{-2}\, x_1^2 x_0 x_1 \Big), \\ J(-,-,-,+) + J(+,-,-,-) &= \frac{(q-q^{-1})^2}{[2]_q^2} \Big( \frac{3}{[3]_q}\, x_0^3 x_1 - 2\, x_0^2 x_1 x_0 + x_0 x_1 x_0^2 \Big), \\ J(-,-,+,-) - [3]_q J(+,-,-,-) &= \frac{(q-q^{-1})^2}{[2]_q^2} \Big( -2\, x_0^3 x_1 + ([3]_q + q^{-2})\, x_0^2 x_1 x_0 - q^{-1}[2]_q\, x_0 x_1 x_0^2 \Big), \\ J(-,+,-,-) + [3]_q J(+,-,-,-) &= \frac{(q-q^{-1})^2}{[2]_q^2} \Big( x_0^3 x_1 - q^{-1}[2]_q\, x_0^2 x_1 x_0 + q^{-2}\, x_0 x_1 x_0^2 \Big). \end{aligned} \tag{A.10} \]

For the rest of the J's one has

\[ \begin{pmatrix} J(-,+) \\ J(+,-) \end{pmatrix} = \frac{(q-q^{-1})}{[2]_q} \begin{pmatrix} -q^{-2} & 1 \\ 1 & -q^{-2} \end{pmatrix} \begin{pmatrix} x_0 x_1 \\ x_1 x_0 \end{pmatrix}, \tag{A.11} \]

\[ \begin{pmatrix} J(-,-,+) \\ J(-,+,-) \\ J(+,-,-) \end{pmatrix} = \frac{(q-q^{-1})}{[2]_q^2} \begin{pmatrix} q^{-2} & -q^{-1}[2]_q & 1 \\ -q^{-1}[2]_q & q^{-1}[4]_q & -q^{-1}[2]_q \\ 1 & -q^{-1}[2]_q & q^{-2} \end{pmatrix} \begin{pmatrix} x_0^2 x_1 \\ x_0 x_1 x_0 \\ x_1 x_0^2 \end{pmatrix}, \tag{A.12} \]

\[ \begin{pmatrix} J(+,+,-) \\ J(+,-,+) \\ J(-,+,+) \end{pmatrix} = \frac{(q-q^{-1})}{[2]_q^2} \begin{pmatrix} q^{-2} & -q^{-1}[2]_q & 1 \\ -q^{-1}[2]_q & q^{-1}[4]_q & -q^{-1}[2]_q \\ 1 & -q^{-1}[2]_q & q^{-2} \end{pmatrix} \begin{pmatrix} x_1^2 x_0 \\ x_1 x_0 x_1 \\ x_0 x_1^2 \end{pmatrix}, \tag{A.13} \]

\[ \begin{aligned} J(+,+,-,-) &= q^{-2} \frac{(q-q^{-1})}{[4]_q [2]_q} \Big( \frac{2q^2}{[2]_q}\, x_0^2 x_1^2 - q^2 [2]_q\, x_0 x_1 x_0 x_1 + (q-q^{-1})\, x_0 x_1^2 x_0 \\ &\qquad + (q-q^{-1})\, x_1 x_0^2 x_1 + q^{-2}[2]_q\, x_1 x_0 x_1 x_0 - \frac{2q^{-2}}{[2]_q}\, x_1^2 x_0^2 \Big), \\ J(+,-,+,-) &= q^{-2} \frac{(q-q^{-1})}{[4]_q [2]_q} \Big( -q^2 [2]_q\, x_0^2 x_1^2 + (2q + q^3 + q^5)\, x_0 x_1 x_0 x_1 - (q-q^{-1})[3]_q\, x_0 x_1^2 x_0 \\ &\qquad - (q-q^{-1})[3]_q\, x_1 x_0^2 x_1 - (q^{-5} + q^{-3} + 2q^{-1})\, x_1 x_0 x_1 x_0 + q^{-2}[2]_q\, x_1^2 x_0^2 \Big), \\ J(+,-,-,+) &= q^{-2} \frac{(q-q^{-1})^2}{[4]_q [2]_q} \Big( x_0^2 x_1^2 - [3]_q\, x_0 x_1 x_0 x_1 + x_0 x_1^2 x_0 + [3]_q\, x_1 x_0^2 x_1 - [3]_q\, x_1 x_0 x_1 x_0 + x_1^2 x_0^2 \Big), \\ J(-,+,+,-) &= q^{-2} \frac{(q-q^{-1})^2}{[4]_q [2]_q} \Big( x_0^2 x_1^2 - [3]_q\, x_0 x_1 x_0 x_1 + [3]_q\, x_0 x_1^2 x_0 + x_1 x_0^2 x_1 - [3]_q\, x_1 x_0 x_1 x_0 + x_1^2 x_0^2 \Big), \\ J(-,+,-,+) &= q^{-2} \frac{(q-q^{-1})}{[4]_q [2]_q} \Big( q^{-2}[2]_q\, x_0^2 x_1^2 - (q^{-5} + q^{-3} + 2q^{-1})\, x_0 x_1 x_0 x_1 - (q-q^{-1})[3]_q\, x_0 x_1^2 x_0 \\ &\qquad - (q-q^{-1})[3]_q\, x_1 x_0^2 x_1 + (2q + q^3 + q^5)\, x_1 x_0 x_1 x_0 - q^2 [2]_q\, x_1^2 x_0^2 \Big), \\ J(-,-,+,+) &= q^{-2} \frac{(q-q^{-1})}{[4]_q [2]_q} \Big( -\frac{2q^{-2}}{[2]_q}\, x_0^2 x_1^2 + q^{-2}[2]_q\, x_0 x_1 x_0 x_1 + (q-q^{-1})\, x_0 x_1^2 x_0 \\ &\qquad + (q-q^{-1})\, x_1 x_0^2 x_1 - q^2 [2]_q\, x_1 x_0 x_1 x_0 + \frac{2q^2}{[2]_q}\, x_1^2 x_0^2 \Big). \end{aligned} \tag{A.14} \]

7 This happens again due to the Serre relation but now for the generators y_0 and y_1.

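Expanding the P-exponent in (2.27) in a series one obtains

\[ \mathcal{P}\exp\Big( \int_0^{2\pi} K(u)\, du \Big) = 1 + \sum_{n=1}^{\infty} \sum_{\{\sigma_i = \pm 1\}} y_{\sigma_1} y_{\sigma_2} \cdots y_{\sigma_n}\, J(-\sigma_1, -\sigma_2, \ldots, -\sigma_n), \tag{A.15} \]

where y_+ = y_0, y_− = y_1.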

Let us restrict our attention to the terms in (A.15) of order four or lower. One can exclude the products y_0 y_1^3 and y_0^3 y_1 using the Serre relations (2.3). Then one can substitute the J-integrals in (A.15) with the corresponding expressions (A.10)–(A.14). There is no need to rewrite (A.15) again since this substitution is rather mechanical and no cancellation can occur. The resulting expression is to be compared with the corresponding expansion of the universal R-matrix. The latter can be found using the results of [17–19]. The notation for the generators of U_q(\widehat{sl}(2)) used in those papers is different from ours. The generators e_α, e_{−α}, e_β, e_{−β}, h_α, h_β in [17–19] are related to x_0, x_1, y_0, y_1, h_0, h_1 in (2.1)–(2.3) as follows:

\[ e_\alpha = q^{-h_0} y_0, \quad e_{-\alpha} = x_0\, q^{h_0}, \quad h_\alpha = h_0, \qquad e_\beta = q^{-h_1} y_1, \quad e_{-\beta} = x_1\, q^{h_1}, \quad h_\beta = h_1. \]

(A.16)


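The expression for the “reduced” universal R-matrix (2.11) follows from Eq. (5.1) of Ref. [19]:

\[ \begin{aligned} \mathcal{R} = {} & \Big( \prod_{n\ge 0}^{\rightarrow} \exp_{q^{-2}}\big( (q-q^{-1})\, e_{\alpha+n\delta}\, q^{h_{\alpha+n\delta}} \otimes q^{-h_{\alpha+n\delta}}\, e_{-\alpha-n\delta} \big) \Big) \\ & \times \exp\Big( \sum_{n>0} (q-q^{-1})\, \frac{n\, ( e_{n\delta}\, q^{h_{n\delta}} \otimes q^{-h_{n\delta}}\, e_{-n\delta} )}{[2n]_q} \Big) \\ & \times \Big( \prod_{n\ge 0}^{\leftarrow} \exp_{q^{-2}}\big( (q-q^{-1})\, e_{\beta+n\delta}\, q^{h_{\beta+n\delta}} \otimes q^{-h_{\beta+n\delta}}\, e_{-\beta-n\delta} \big) \Big), \end{aligned} \tag{A.17} \]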

where expp (x) =

∞ X p(n−1)(2−n)/2 xn , [n]p ! = [1]p [2]p · · · [n]p [n]p ! n=0

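and h_{γ+nδ} = h_γ + n (h_α + h_β) (h_γ = 0, h_α, h_β). The index n of the multipliers increases from left to right in the first ordered product above and decreases in the second one. The root vectors e_{α+nδ}, e_{−α−nδ}, etc. appearing in (A.17) are defined recursively by Eqs. (3.2)–(3.5) of Ref. [19]. Applying these formulae one obtains the first few of them:

\[ \begin{aligned} e_{\alpha+\delta} &= \frac{1}{[2]_q} \big( e_\alpha^2 e_\beta - (1+q^{-2})\, e_\alpha e_\beta e_\alpha + q^{-2}\, e_\beta e_\alpha^2 \big), \\ e_{\beta+\delta} &= \frac{1}{[2]_q} \big( e_\alpha e_\beta^2 - (1+q^{-2})\, e_\beta e_\alpha e_\beta + q^{-2}\, e_\beta^2 e_\alpha \big), \end{aligned} \tag{A.18} \]

\[ \begin{aligned} e_{-\alpha-\delta} &= \frac{1}{[2]_q} \big( e_{-\beta} e_{-\alpha}^2 - (1+q^{2})\, e_{-\alpha} e_{-\beta} e_{-\alpha} + q^{2}\, e_{-\alpha}^2 e_{-\beta} \big), \\ e_{-\beta-\delta} &= \frac{1}{[2]_q} \big( e_{-\beta}^2 e_{-\alpha} - (1+q^{2})\, e_{-\beta} e_{-\alpha} e_{-\beta} + q^{2}\, e_{-\alpha} e_{-\beta}^2 \big), \end{aligned} \tag{A.19} \]

\[ e_\delta = e_\alpha e_\beta - q^{-2}\, e_\beta e_\alpha, \qquad e_{-\delta} = e_{-\beta} e_{-\alpha} - q^{2}\, e_{-\alpha} e_{-\beta}, \tag{A.20} \]

\[ \begin{aligned} e_{2\delta} &= \frac{1}{2q^2 [2]_q} \Big( 2q^2\, e_\alpha^2 e_\beta^2 - q^2 [2]_q^2\, e_\alpha e_\beta e_\alpha e_\beta + (q^2 - q^{-2}) ( e_\alpha e_\beta^2 e_\alpha + e_\beta e_\alpha^2 e_\beta ) \\ &\qquad + q^{-2} [2]_q^2\, e_\beta e_\alpha e_\beta e_\alpha - 2q^{-2}\, e_\beta^2 e_\alpha^2 \Big), \\ e_{-2\delta} &= -\frac{q^2}{2[2]_q} \Big( 2q^2\, e_{-\alpha}^2 e_{-\beta}^2 - q^2 [2]_q^2\, e_{-\alpha} e_{-\beta} e_{-\alpha} e_{-\beta} + (q^2 - q^{-2}) ( e_{-\alpha} e_{-\beta}^2 e_{-\alpha} + e_{-\beta} e_{-\alpha}^2 e_{-\beta} ) \\ &\qquad + q^{-2} [2]_q^2\, e_{-\beta} e_{-\alpha} e_{-\beta} e_{-\alpha} - 2q^{-2}\, e_{-\beta}^2 e_{-\alpha}^2 \Big). \end{aligned} \tag{A.21} \]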

These formulae enable us to calculate the expansion of the universal R-matrix (A.17) to within the fourth order terms. Substituting (A.18)–(A.21) into (A.17), expanding the exponents and calculating their product one gets precisely the result obtained above from the expansion of the L-operator (2.13) given by (A.15) and (A.10)–(A.14). Finally, notice that the negative root vectors (A.19)–(A.21) have particularly simple expressions in terms of the J-integrals (2.29); namely, they all reduce to just a single J-integral, as one easily obtains from (A.19)–(A.21) and (A.11)–(A.14):

\[ \begin{aligned} q^{-h_\delta}\, e_{-\delta} &= -q^4\, \frac{[2]_q}{q-q^{-1}}\, J(+,-), \\ q^{-h_{\alpha+\delta}}\, e_{-\alpha-\delta} &= q^6\, \frac{[2]_q}{q-q^{-1}}\, J(+,-,-), \\ q^{-h_{\beta+\delta}}\, e_{-\beta-\delta} &= q^6\, \frac{[2]_q}{q-q^{-1}}\, J(+,+,-), \\ q^{-h_{2\delta}}\, e_{-2\delta} &= -q^4\, \frac{[2]_q [4]_q}{2(q-q^{-1})}\, J(+,+,-,-). \end{aligned} \tag{A.22} \]

Appendix B

In this appendix we show that for 0 < β² < 1/2 the operators T_j(λ) (3.7) and A±(λ) (3.12) are entire functions of the variable λ². Consider the simplest nontrivial T-operator T(λ) ≡ T_{1/2}(λ), which corresponds to the two-dimensional representation of U_q(sl(2)). In this case

\[ \pi_{\frac12}(E) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \pi_{\frac12}(F) = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad \pi_{\frac12}(H) = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{B.1} \]

Using these expressions to compute the trace in (3.7) one finds

\[ \mathbf{T}(\lambda) = 2\cos(2\pi P) + \sum_{n=1}^{\infty} \lambda^{2n}\, G_n, \tag{B.2} \]

where

\[ G_n = q^n e^{2i\pi P}\, J(\underbrace{-,+,\ldots,-,+}_{2n\ \text{elements}}) + q^n e^{-2i\pi P}\, J(\underbrace{+,-,\ldots,+,-}_{2n\ \text{elements}}) \tag{B.3} \]

with the J's defined in (2.29). The operators G_n are the “nonlocal integrals of motion” (NIM) [1], which commute among themselves and with all operators T_j(λ). They act invariantly in each Fock module F_p. In particular, the vacuum state |p⟩ ∈ F_p is an eigenstate of all operators G_n,

\[ G_n\, |p\rangle = G_n^{(vac)}(p)\, |p\rangle, \tag{B.4} \]

where the eigenvalues G_n^{(vac)}(p) are given by the integrals [1]

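\[ \begin{aligned} G_n^{(vac)}(p) = {} & \int_0^{2\pi}\! du_1 \int_0^{u_1}\! dv_1 \int_0^{v_1}\! du_2 \int_0^{u_2}\! dv_2 \ldots \int_0^{v_{n-1}}\! du_n \int_0^{u_n}\! dv_n \\ & \times \prod_{j>i}^{n} \Big( 4 \sin\frac{u_i-u_j}{2}\, \sin\frac{v_i-v_j}{2} \Big)^{2\beta^2} \prod_{j\ge i}^{n} \Big( 2\sin\frac{u_i-v_j}{2} \Big)^{-2\beta^2} \prod_{j>i}^{n} \Big( 2\sin\frac{v_i-u_j}{2} \Big)^{-2\beta^2} \\ & \times 2\cos\Big( 2p \Big( \pi + \sum_{i=1}^{n} (v_i - u_i) \Big) \Big). \end{aligned} \tag{B.5} \]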

Let us now examine the convergence properties of the series

\[ T^{(vac)}(\lambda) = 2\cos(2\pi p) + \sum_{n=1}^{\infty} \lambda^{2n}\, G_n^{(vac)}(p) \tag{B.6} \]

for the vacuum eigenvalue of the operator T(λ). A similar problem was studied in [21] for the series

\[ Z(\lambda) = 1 + \sum_{n=1}^{\infty} \lambda^{2n}\, Z_n \tag{B.7} \]

with

\[ Z_n = \frac{1}{(n!)^2} \int_0^{2\pi}\! du_1 \cdots \int_0^{2\pi}\! du_n \int_0^{2\pi}\! dv_1 \cdots \int_0^{2\pi}\! dv_n \prod_{j>i}^{n} \Big( 4 \sin\frac{u_i-u_j}{2}\, \sin\frac{v_i-v_j}{2} \Big)^{2\beta^2} \prod_{j,i=1}^{n} \Big( 2\sin\frac{u_i-v_j}{2} \Big)^{-2\beta^2}, \tag{B.8} \]

where 0 < β² < 1/2. It was shown (using the Jack polynomial technique) that the leading asymptotics of the integrals (B.8) for large n is given by

\[ \log Z_n = 2(\beta^2 - 1)\, n \log n + O(n), \qquad n \to \infty, \tag{B.9} \]

and hence the series (B.7) defines an entire function of the variable λ². It is easy to see that

\[ |G_n^{(vac)}(p)| < Z_n, \tag{B.10} \]

and therefore the eigenvalue (B.6) is also an entire function of λ². Similar considerations apply to arbitrary matrix elements of T(λ) between the states in F_p. Thus all matrix elements and eigenvalues of T(λ) are entire functions^8 of λ². Consider now the vacuum eigenvalue A^{(vac)}(λ) of the operator A(λ) ≡ A_+(λ) defined in (3.12). It can be written as a series

\[ A^{(vac)}(\lambda) = 1 + \sum_{n=1}^{\infty} \sum_{\sigma_1+\cdots+\sigma_{2n}=0} \lambda^{2n}\, a_n(-\sigma_1, \ldots, -\sigma_{2n})\, J^{(vac)}(\sigma_1, \ldots, \sigma_{2n}), \tag{B.11} \]

8 The higher spin operators T_j(λ) with j > 1/2 can be polynomially expressed through T_{1/2}(λ) (as follows from (4.13)) and obviously enjoy the same analyticity properties.

where the sum is taken over all sets of variables σ_1, . . . , σ_{2n} = ±1 with zero total sum and J^{(vac)}(σ_1, . . . , σ_{2n}) denote the vacuum eigenvalues of the operators (2.29). The numerical coefficients a_n are defined as

\[ a_n(\sigma_1, \ldots, \sigma_{2n}) = q^n\, Z_+^{-1}(p)\, \mathrm{Tr}_{\rho_+} \big( e^{2\pi i p H}\, E_{\sigma_1} \cdots E_{\sigma_{2n}} \big), \tag{B.12} \]

where the trace is taken over the representation ρ_+ of the q-oscillator algebra (3.8) and Z_+(p) is given by (3.11). It is easy to see that

\[ \sum_{\sigma_1+\cdots+\sigma_{2n}=0} J^{(vac)}(\sigma_1, \ldots, \sigma_{2n}) \le Z_n. \tag{B.13} \]

To estimate the coefficients (B.12) it is convenient to use the explicit form of the representation matrices ρ_+(E±) and ρ_+(H) given in (D.6). Using these one can show that

\[ |a_n(\{\sigma\})| < \frac{2^{2n}}{\prod_{j=1}^{n} (1 - q^{-2j} e^{4\pi i P})}, \tag{B.14} \]

where we have assumed that

\[ 2p \ne n\beta^2 + m \tag{B.15} \]

for any integer m and any positive integer n. For rational β² the relation (B.14) obviously implies

\[ |a_n(\{\sigma\})| < C^n, \tag{B.16} \]

where C is some constant. Combining (B.16), (B.13) and (B.9) we conclude that the series (B.11) in this case converges in the whole complex plane of λ². In fact, the same inequality (B.16) holds for (almost all) irrational β². This follows from a remarkable result of [24, 25],

\[ \lim_{n\to\infty} \frac{1}{n} \log \prod_{j=1}^{n} (1 - q^{-2j} e^{4\pi i P}) = \int_0^{2\pi} \log(2\sin x)\, dx = 0, \tag{B.17} \]

which is valid for all irrational β² satisfying (B.15) except for a set of exceptional irrationals of linear Lebesgue measure zero (see [24, 25] for the details).
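The step from the asymptotics (B.9) to the statement that (B.7) is entire in λ² is a root-test argument: (B.9) gives Z_n^{1/n} of order n^{2(β²−1)}, which tends to zero for β² < 1, so the radius of convergence in λ² is infinite. The following sketch only illustrates that estimate. It keeps the leading term of (B.9) and drops the O(n) remainder (which changes the bound by a bounded factor only); the function names are hypothetical.

```python
import math


def log_root_estimate(n, beta_sq):
    """Leading-order estimate of (1/n) * log Z_n taken from (B.9).

    Assumption: the O(n) remainder in (B.9) is dropped, so this is only
    the dominant 2 * (beta^2 - 1) * log n behaviour.
    """
    return 2.0 * (beta_sq - 1.0) * math.log(n)


def radius_estimates(beta_sq, ns):
    """Root-test estimates Z_n^(-1/n) of the convergence radius in lambda^2."""
    return [math.exp(-log_root_estimate(n, beta_sq)) for n in ns]


if __name__ == "__main__":
    # For beta^2 = 1/4 the estimates grow like n^(3/2), i.e. without bound.
    for n, r in zip([10, 100, 1000], radius_estimates(0.25, [10, 100, 1000])):
        print(n, r)
```

Because the estimates grow without bound, no finite radius survives, which is the content of the remark following (B.9); the bound (B.10) then carries the conclusion over to (B.6).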

Appendix C

Using (1.9), (1.13), (1.16) one can write the Virasoro generator L_0 as

\[ L_0 = \frac{P^2}{\beta^2} + \frac{c-1}{24} + \frac{2}{\beta^2} \sum_{n>0} a_{-n} a_n. \tag{C.1} \]

Then it is easy to show that

\[ [L_0, \varphi(u)] = -i\, \partial_u \varphi(u). \tag{C.2} \]

Therefore the adjoint action of the operator exp(iεL_0) on (3.4),

\[ e^{i\varepsilon L_0}\, \mathbf{T}_j^{(f)}(\lambda)\, e^{-i\varepsilon L_0} = \mathrm{Tr}_{\pi_j} \Big[ e^{i(\pi P + f) H}\, \mathcal{P}\exp\Big( \int_{\varepsilon}^{2\pi+\varepsilon} K(u)\, du \Big) \Big], \tag{C.3} \]

reduces to a shift of the limits of integration in the P-exponent by the amount ε, where ε is assumed to be real. Here K(u) is the same as in (1.29). Retaining in (C.3) only the terms linear in ε one gets

\[ [L_0, \mathbf{T}_j^{(f)}(\lambda)] = -i\, \mathrm{Tr}_{\pi_j} \Big[ e^{i(\pi P + f) H} \Big( K(2\pi)\, e^{-i\pi P H}\, \mathbf{L}_j(\lambda) - e^{-i\pi P H}\, \mathbf{L}_j(\lambda)\, K(0) \Big) \Big]. \tag{C.4} \]

Expanding the P-exponent as in (2.30), using (1.20), the commutation relations (1.19) and (1.26) and the cyclic property of the trace one obtains

\[ [L_0, \mathbf{T}_j^{(f)}(\lambda)] = \sin(\pi P - f) \sum_{\sigma_0+\cdots+\sigma_n=0} a^{(f)}(\sigma_0, \sigma_1, \ldots, \sigma_n)\, :e^{-2\sigma_0 \varphi(2\pi)}:\, J(-\sigma_1, \ldots, -\sigma_n), \tag{C.5} \]

where

\[ a^{(f)}(\sigma_0, \sigma_1, \ldots, \sigma_n) = -2\, \sigma_0\, e^{i\sigma_0(\pi P - f)}\, \mathrm{Tr}_{\pi_j} \big[ e^{i(\pi P + f) H}\, y_{\sigma_0} y_{\sigma_1} \cdots y_{\sigma_n} \big] \tag{C.6} \]

with

\[ y_+ = \lambda\, q^{H/2} E, \qquad y_- = \lambda\, q^{-H/2} F, \]

and the ordered integrals J(σ_1, . . . , σ_n) are defined in (2.29). Obviously, the RHS of (C.5) vanishes if f = π(P + N), where N is an arbitrary integer. We set N = 0, since (3.4) depends on N only through a trivial sign factor (−1)^{2jN}. Thus the operators T_j(λ) (3.7) commute with the simplest local IM I_1 = L_0 − c/24. As follows from (4.12) and (B.2), the coefficients of the series expansions of T_j(λ) in the variable λ² can be algebraically expressed in terms of the nonlocal IM (B.3). Therefore the above commutativity is equivalent to

\[ [G_n, I_1] = 0, \qquad n = 1, 2, \ldots, \infty. \tag{C.7} \]

In fact, the operators G_n commute with all local IM (1.21). To check this one has to transform the ordered integrals in (B.3) to contour integrals. For example, G_1 can be written as [2]

\[ \begin{aligned} G_1 = -\frac{1}{q^2 - q^{-2}} \int_0^{2\pi}\! du_1 \int_0^{2\pi}\! du_2\, \Big\{ & \big( q e^{-2\pi i P} - q^{-1} e^{2\pi i P} \big)\, V_-(u_1 + i0)\, V_+(u_2 - i0) \\ & + \big( q e^{2\pi i P} - q^{-1} e^{-2\pi i P} \big)\, V_+(u_1 + i0)\, V_-(u_2 - i0) \Big\}. \end{aligned} \tag{C.8} \]

The characteristic property of the local IM is that their commutators with the exponential fields (1.18) reduce to a total derivative [10, 11],

\[ [I_{2n-1}, V_\pm(u)] = \partial_u \big\{ :O_n^\pm(u)\, V_\pm(u): \big\} \equiv \partial_u X_n^\pm(u). \tag{C.9} \]

Here O_n^±(u) are some polynomials with respect to the field ∂_u φ and its derivatives. It follows then that the commutator of (C.8) with I_{2n−1},

\[ C_n = [I_{2n-1}, G_1], \tag{C.10} \]

is expressed as a double contour integral of a linear combination of products of the form ∂_{v_1} X_n^±(v_1) V_∓(v_2) and V_±(v_1) ∂_{v_2} X_n^∓(v_2). It is important to note that the operator product expansion for these products does not contain any terms proportional to negative integer powers of the difference (v_1 − v_2). Therefore the above integrand for (C.10) does not contain any contact terms (i.e. terms proportional to the delta function δ(u_1 − u_2) and its derivatives). Thus we can easily perform one integration,

\[ \begin{aligned} C_n = {} & -\frac{1}{q^2 - q^{-2}}\, \big( q e^{-2\pi i P} - q^{-1} e^{2\pi i P} \big) \big( q e^{2\pi i P} - q^{-1} e^{-2\pi i P} \big) \\ & \times \int_0^{2\pi}\! du\, \Big\{ q e^{2\pi i P}\, V_-(u) X_+(0) - q^{-1} e^{-2\pi i P}\, X_-(0) V_+(u) \\ & \qquad\qquad + q e^{-2\pi i P}\, V_+(u) X_-(0) - q^{-1} e^{2\pi i P}\, X_+(0) V_-(u) \Big\}, \end{aligned} \tag{C.11} \]

where we have used the periodicity property

\[ X_\pm(u + 2\pi) = q^{-2} e^{\pm 4\pi i P}\, X_\pm(u). \tag{C.12} \]

Using now the commutation relations

\[ X_\pm(0)\, V_\mp(u) = q^2\, V_\mp(u)\, X_\pm(0), \qquad u > 0, \tag{C.13} \]

where σ_1, σ_2 = ±1, one can see that the RHS of (C.11) is equal to zero: the terms in (C.11) cancel pairwise once X_±(0) is moved to the right of V_∓(u). The higher nonlocal IM G_n also admit contour integral representations similar to (C.8) and their commutativity with I_{2k−1} can, in principle, be proved in the same way. However, these representations become more and more complicated for high orders and are in general unknown. It would be interesting to obtain a general proof of the above commutativity to all orders.^9

Appendix D

In this appendix we present the derivation of the factorization (4.10). Using the definition (3.12) one can write the product of the operators A_± from (4.10) in the form

\[ \mathbf{A}_+(\lambda\mu)\, \mathbf{A}_-(\lambda\mu^{-1}) = \big( Z_+(P)\, Z_-(P) \big)^{-1}\, \mathrm{Tr}_{\rho_+ \otimes \rho_-} \Big[ e^{i\pi P \mathcal{H}}\, \mathbf{L}_+(\lambda\mu) \otimes \mathbf{L}_-(\lambda\mu^{-1}) \Big], \tag{D.1} \]

where

\[ \mu = q^{j+\frac12}, \tag{D.2} \]

and the trace is taken over the direct product of the two representations ρ_+ ⊗ ρ_− of (3.8) (these are defined after (3.10) in the main text) and

\[ \mathcal{H} = H \otimes 1 - 1 \otimes H. \tag{D.3} \]

9 B. Feigin and E. Frenkel have pointed out [26] that such a proof can be obtained by extending the results of [10, 11].

It is convenient to choose the representation space of ρ_+ (ρ_−) as a highest weight module generated by a free action of the operator ρ_+[E_−] (ρ_−[E_+]) on a vacuum vector defined respectively as

\[ \rho_\pm[E_\pm]\, |0\rangle_\pm = 0, \qquad \rho_\pm[H]\, |0\rangle_\pm = 0. \tag{D.4} \]

Defining natural bases in these modules,

\[ |k\rangle_\pm = \rho_\pm[E_\mp]^k\, |0\rangle_\pm, \qquad k = 0, 1, 2, \ldots, \infty, \tag{D.5} \]

with the upper signs for ρ_+ and the lower signs for ρ_−, one can easily calculate the matrix elements

\[ \rho_\pm[E_\pm]\, |k\rangle_\pm = \frac{1 - q^{\mp 2k}}{(q - q^{-1})^2}\, |k-1\rangle_\pm, \qquad \rho_\pm[E_\mp]\, |k\rangle_\pm = |k+1\rangle_\pm, \qquad \rho_\pm[H]\, |k\rangle_\pm = \mp 2k\, |k\rangle_\pm. \tag{D.6} \]

Notice that the trace in (3.11) for this choice of ρ_± reads

\[ Z_+(P) = Z_-(P) = \frac{e^{2\pi i P}}{2i \sin(2\pi P)}. \tag{D.7} \]
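The matrix elements (D.6) can be checked directly on basis vectors. The sketch below makes two assumptions that are not spelled out in this appendix: that the q-oscillator relation of (3.8) has the form q E_+E_− − q⁻¹ E_−E_+ = 1/(q − q⁻¹), which is the form implied by (D.11) and (D.12), and that Z_+(P) is the trace of e^{2πiPH} over ρ_+, as its use in (B.12) suggests. All function names are hypothetical.

```python
import cmath
from fractions import Fraction


def e_plus_coeff(k, q):
    """Coefficient c_k in rho_+[E_+]|k> = c_k |k-1>, taken from (D.6)."""
    return (1 - q ** (-2 * k)) / (q - 1 / q) ** 2


def oscillator_defect(k, q):
    """Coefficient of |k> in (q E+E- - q^-1 E-E+ - 1/(q - q^-1)) |k> for rho_+.

    E_- raises |k> to |k+1> with unit coefficient, so E+E- and E-E+ are
    diagonal with entries c_{k+1} and c_k respectively (c_0 = 0).
    """
    e_plus_e_minus = e_plus_coeff(k + 1, q)
    e_minus_e_plus = e_plus_coeff(k, q)
    return q * e_plus_e_minus - e_minus_e_plus / q - 1 / (q - 1 / q)


def z_plus_truncated(p, terms=200):
    """Truncated trace of exp(2 pi i p H) over rho_+, with H|k> = -2k|k>."""
    return sum(cmath.exp(-4j * cmath.pi * p * k) for k in range(terms))


def z_plus_closed(p):
    """Closed form (D.7)."""
    return cmath.exp(2j * cmath.pi * p) / (2j * cmath.sin(2 * cmath.pi * p))


if __name__ == "__main__":
    q = Fraction(3, 2)
    print([oscillator_defect(k, q) for k in range(4)])
    p = 0.3 - 0.2j  # Im p < 0 so that the geometric series converges
    print(z_plus_truncated(p), z_plus_closed(p))
```

The trace is a geometric series in e^{−4πiP}, so the comparison with (D.7) only makes sense where that ratio has modulus below one; elsewhere (D.7) is to be read as the analytic continuation.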

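Specializing now the formula (2.20) for the product of the two operators L_± in (D.1) one obtains

\[ \mathbf{L}_+(\lambda\mu) \otimes \mathbf{L}_-(\lambda\mu^{-1}) = e^{i\pi P \mathcal{H}}\, \mathcal{P}\exp\Big( \lambda \int_0^{2\pi} \big( V_-(u)\, q^{\frac{\mathcal{H}}{2}}\, \mathcal{E} + V_+(u)\, q^{-\frac{\mathcal{H}}{2}}\, \mathcal{F} \big)\, du \Big), \tag{D.8} \]

where \mathcal{H} is given by (D.3) and

\[ \mathcal{E} = \mu\, E_+ \otimes q^{-\frac H2} + \mu^{-1}\, q^{-\frac H2} \otimes E_-, \qquad \mathcal{F} = \mu\, E_- \otimes q^{\frac H2} + \mu^{-1}\, q^{\frac H2} \otimes E_+. \tag{D.9} \]

The last two equations can be written in a compact form,

\[ \mathcal{E} = a_- + b_-, \qquad \mathcal{F} = a_+ + b_+, \tag{D.10} \]

if one introduces the operators

\[ a_\pm = \mu\, E_\mp \otimes q^{\pm\frac H2}, \qquad b_\pm = \mu^{-1}\, q^{\pm\frac H2} \otimes E_\pm, \tag{D.11} \]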

acting in ρ_+ ⊗ ρ_−. These operators satisfy the commutation relations

\[ \begin{aligned} & a_{\sigma_1} b_{\sigma_2} = q^{2\sigma_1\sigma_2}\, b_{\sigma_2} a_{\sigma_1}, \qquad [\mathcal{H}, a_\pm] = \mp 2 a_\pm, \qquad [\mathcal{H}, b_\pm] = \mp 2 b_\pm, \\ & q\, a_- a_+ - q^{-1} a_+ a_- = \frac{\mu^2}{q - q^{-1}}, \qquad q\, b_+ b_- - q^{-1} b_- b_+ = \frac{\mu^{-2}}{q - q^{-1}}, \end{aligned} \tag{D.12} \]

where σ_1, σ_2 = ±1. Further, the direct product of the modules ρ_± can be decomposed in the following way:

\[ \rho_+ \otimes \rho_- = \bigoplus_{m=0}^{\infty} \rho^{(m)}, \tag{D.13} \]

where each ρ^{(m)}, m = 0, 1, 2, . . . , ∞, is again a highest weight module spanned by the vectors

\[ \rho^{(m)}: \quad |\rho_k^{(m)}\rangle = (a_+ + b_+)^k\, (a_+ - \gamma b_+)^m\, |0\rangle_+ \otimes |0\rangle_-, \qquad k = 0, 1, 2, \ldots, \infty. \tag{D.14} \]

The constant γ here is constrained by the relation

\[ \gamma \ne -q^{-2n}, \qquad n = 0, 1, 2, \ldots, \infty, \tag{D.15} \]

but is otherwise arbitrary. To prove that the modules ρ^{(m)} are linearly independent (as subspaces in the vector space ρ_+ ⊗ ρ_−) it is enough to prove that the ℓ + 1 vectors |ρ_k^{(ℓ−k)}⟩, k = 0, 1, . . . , ℓ, on each “level” ℓ = 0, 1, . . . , ∞ are linearly independent (the vectors on different levels are obviously linearly independent). To see this let us use the commutation relations (D.12) and rewrite

\[ z_k = (a_+ + b_+)^k\, (a_+ - \gamma b_+)^{\ell-k}, \qquad k = 0, 1, \ldots, \ell, \tag{D.16} \]

as ordered polynomials in the variables a_+ and b_+,

\[ z_k = \sum_{m=0}^{\ell} C_{km}^{(\ell)}\, (a_+)^{\ell-m} (b_+)^m. \tag{D.17} \]

If γ satisfies (D.15) the determinant of the coefficients of these polynomials,

\[ \det \| C_{km}^{(\ell)} \|_{0\le k,m\le \ell} = \prod_{n=0}^{\ell-1} (\gamma + q^{-2n})^{\ell-n}, \tag{D.18} \]

does not vanish. That implies the required linear independence. From the above definitions it is easy to see that the operators \mathcal{H} and \mathcal{F} entering (D.8) act invariantly in each module ρ^{(m)},

\[ \mathcal{H}, \mathcal{F}: \quad \rho^{(m)} \longrightarrow \rho^{(m)}, \tag{D.19} \]

while the operator \mathcal{E} acts as

\[ \mathcal{E}: \quad \rho^{(m)} \longrightarrow \rho^{(m)} \oplus \rho^{(m-1)} \tag{D.20} \]

with ρ^{(−1)} ≡ 0. The matrix elements of these operators can be easily found from (D.10), (D.12), (D.14),

\[ \begin{aligned} (\rho_+ \otimes \rho_-)[\mathcal{H}]\, |\rho_k^{(m)}\rangle &= -2 (m + k)\, |\rho_k^{(m)}\rangle, \\ (\rho_+ \otimes \rho_-)[\mathcal{F}]\, |\rho_k^{(m)}\rangle &= |\rho_{k+1}^{(m)}\rangle, \\ (\rho_+ \otimes \rho_-)[\mathcal{E}]\, |\rho_k^{(m)}\rangle &= [k]_q [2j - 1 + k]_q\, |\rho_{k-1}^{(m)}\rangle + c_k^{(m)}\, |\rho_k^{(m-1)}\rangle, \end{aligned} \tag{D.21} \]

where we have used (D.2). The values of c_k^{(m)} can be calculated but are not necessary in what follows. Thus the matrices (D.21) have the block triangular form with an infinite number of diagonal blocks. It is essential to note that in each diagonal block these matrices coincide with those of the highest weight representation π_j^+ given by (4.5) (up to an overall shift in the matrix elements of (ρ_+ ⊗ ρ_−)[\mathcal{H}] in different blocks). Substituting now (D.21) into (D.8) and then into (D.1) and using the definition (4.4) one easily arrives at the factorization (4.10).

References

1. Bazhanov, V.V., Lukyanov, S.L. and Zamolodchikov, A.B.: Integrable structure of conformal field theory, quantum KdV theory and thermodynamic Bethe Ansatz. Commun. Math. Phys. 177, 381–398 (1996)
2. Bazhanov, V.V., Lukyanov, S.L. and Zamolodchikov, A.B.: Integrable structure of conformal field theory II. Q-operator and DDV equation. Commun. Math. Phys. 190, 247–278 (1997)
3. Baxter, R.J.: Eight-vertex model in lattice statistics and one-dimensional anisotropic Heisenberg chain; 1. Some fundamental eigenvectors. Ann. Phys. (N.Y.) 76, 1–24 (1973); 2. Equivalence to a generalized ice-type model. Ann. Phys. (N.Y.) 76, 25–47 (1973); 3. Eigenvectors of the transfer matrix and Hamiltonian. Ann. Phys. (N.Y.) 76, 48–71 (1973)
4. Baxter, R.J.: Exactly solved models in Statistical Mechanics. London: Academic Press, 1982
5. Sklyanin, E.K., Takhtajan, L.A. and Faddeev, L.D.: Quantum inverse scattering method I. Teoret. Mat. Fiz. 40, 194–219 (1979) (In Russian) [English translation: Theor. Math. Phys. 40, 688–706 (1979)]
6. Korepin, V.E., Bogoliubov, N.M. and Izergin, A.G.: Quantum inverse scattering method and correlation functions. Cambridge: Cambridge University Press, 1993
7. Belavin, A.A., Polyakov, A.M. and Zamolodchikov, A.B.: Infinite conformal symmetry in two-dimensional quantum field theory. Nucl. Phys. B241, 333–380 (1984)
8. Sasaki, R. and Yamanaka, I.: Virasoro algebra, vertex operators, quantum Sine-Gordon and solvable Quantum Field theories. Adv. Stud. in Pure Math. 16, 271–296 (1988)
9. Eguchi, T. and Yang, S.K.: Deformation of conformal field theories and soliton equations. Phys. Lett. B224, 373–378 (1989)
10. Feigin, B. and Frenkel, E.: Integrals of motion and quantum groups. In: “Integrable Systems and Quantum Groups”, Proceeding of C.I.M.E. Summer School, Lect. Notes in Math., 1620, 349–418 (1996). (hep-th/9310022)
11. Feigin, B. and Frenkel, E.: Free field resolutions in affine Toda field theories. Phys. Lett. B276, 79–86 (1992)
12. Fateev, V.A. and Lukyanov, S.L.: Poisson-Lie group and classical W-algebras. Int. J. Mod. Phys. A7, 853–876 (1992); Fateev, V.A. and Lukyanov, S.L.: Vertex operators and representations of quantum universal enveloping algebras. Int. J. Mod. Phys. A7, 1325–1359 (1992)
13. Volkov, A.Yu.: Quantum Volterra model. Phys. Lett. A167, 345–355 (1992)
14. Volkov, A.Yu.: Quantum lattice KdV equation. Lett. Math. Phys. 39, 313–329 (1997)
15. Kulish, P.P., Reshetikhin, N.Yu. and Sklyanin, E.K.: Yang–Baxter equation and representation theory. Lett. Math. Phys. 5, 393–403 (1981)
16. Drinfel’d, V.G.: Quantum Groups. In: Proceedings of the International Congress of Mathematics. Berkeley 1986, 1, California: Acad. Press, 1987, pp. 798–820
17. Khoroshkin, S.M. and Tolstoy, V.N.: The uniqueness theorem for the universal R-matrix. Lett. Math. Phys. 24, 231–244 (1992)
18. Tolstoy, V.N. and Khoroshkin, S.M.: Universal R-matrix for quantized nontwisted affine Lie algebras. Funktsional. Anal. i Prilozhen. 26(1), 85–88 (1992) (In Russian) [English translation: Func. Anal. Appl. 26, 69–71 (1992)]
19. Khoroshkin, S.M., Stolin, A.A. and Tolstoy, V.N.: Generalized Gauss decomposition of trigonometric R-matrices. Mod. Phys. Lett. A10, 1375–1392 (1995)
20. Jimbo, M.: A q-difference analogue of U G and Yang–Baxter equation. Lett. Math. Phys. 10, 63–68 (1985)
21. Fendley, P., Lesage, F. and Saleur, H.: Solving 1d plasmas and 2d boundary problems using Jack polynomial and functional relations. J. Statist. Phys. 79, 799–819 (1995)
22. Kirilov, A.N. and Reshetikhin, N.Yu.: Exact solution of the integrable XXZ Heisenberg model with arbitrary spin. J. Phys. A20, 1565–1585 (1987)
23. Bazhanov, V.V., Lukyanov, S.L. and Zamolodchikov, A.B.: Integrable quantum field theories in finite volume: Excited state energies. Nucl. Phys. B489, 487–531 (1997)
24. Hardy, G.H. and Littlewood, J.E.: Notes on the Theory of Series (XXIV): A Curious Power Series. Proc. Camb. Phil. Soc. 42, 85–90 (1946)
25. Driver, K.A., Libinsky, D.S., Petruska, G. and Sarnak, P.: Irregular distribution of {nβ}, n = 1, 2, 3, . . . , quadrature of singular integrals and curious basic hypergeometric series. Indag. Mathem., N.S. 2(4), 469–481 (1991)
26. Feigin, B. and Frenkel, E.: private communication, 1997

Communicated by G. Felder

Commun. Math. Phys. 200, 325 – 354 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Vacuum Radiation and Symmetry Breaking in Conformally Invariant Quantum Field Theory*

V. Aldaya1,2, M. Calixto2,3, J. M. Cerveró4

1 Instituto de Astrofísica de Andalucía (CSIC), P.O. Box 3004, 18080-Granada, Spain. E-mail: [email protected]
2 Instituto “Carlos I” de Física Teórica y Computacional, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva, Granada 18002, Spain.
3 Departamento de Física Teórica y del Cosmos, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva, Granada 18002, Spain. E-mail: [email protected]
4 Facultad de Físicas, edificio Trilingüe, Universidad de Salamanca, 37008 Salamanca, Spain. E-mail: [email protected]

Received: 17 September 1997 / Accepted: 7 July 1998

Abstract: The underlying reasons for the difficulty of unitarily implementing the whole conformal group SO(4, 2) in a massless Quantum Field Theory (QFT) on Minkowski space are investigated in this paper. Firstly, we demonstrate that the singular action of the subgroup of special conformal transformations (SCT), on the standard Minkowski space M , cannot be primarily associated with the vacuum radiation problems, the reason being more profound and related to the dynamical breakdown of part of the conformal symmetry (the SCT subgroup, to be more precise) when representations of null mass are selected inside the representations of the whole conformal group. Then we show how the vacuum of the massless QFT radiates under the action of SCT (usually interpreted as transitions to a uniformly accelerated frame) and we calculate exactly the spectrum of the outgoing particles, which proves to be a generalization of the Planckian one, this recovered as a given limit.

1. Introduction

The conformal group SO(4, 2) has long been recognized as a symmetry of the Maxwell equations for classical electrodynamics [C-B], and more recently considered as an invariance of general, non-abelian, massless gauge field theories at the classical level. However, the quantum theory raises, in general, serious problems in the implementation of conformal symmetry, and much work has been devoted to the study of the physical reasons for that (see e.g. [Fr]). Basically, the main trouble associated with this quantum symmetry (at the second quantization level) lies in the difficulty of finding a vacuum which is stable under special conformal transformations acting on the Minkowski space in the form:

\[ x^\mu \to x'^\mu = \frac{x^\mu + c^\mu x^2}{\sigma(x, c)}, \qquad \sigma(x, c) = 1 + 2cx + c^2 x^2. \tag{1} \]

* Work partially supported by the DGICYT under contracts PB92-1055, PB92-0302, PB95-1201 and PB95-0947.
These transformations, which can be interpreted as transitions to systems of relativistic, uniformly accelerated observers (see e.g. [H]), cause vacuum radiation, a phenomenon analogous to the Fulling-Unruh effect [Fu, U] in a non-inertial reference frame. To be more precise, if a(k), a^+(k) are the Fourier components of a scalar massless field φ(x), satisfying the equation

\[ \eta^{\mu\nu} \partial_\mu \partial_\nu \phi(x) = 0, \tag{2} \]

then the Fourier components a'(k), a'^+(k) of the transformed field φ'(x') = σ^{−l}(x, c) φ(x) by (1) (l being the conformal dimension) are expressed in terms of both a(k), a^+(k) through a Bogolyubov transformation

\[ a'(\lambda) = \int dk\, \big[ A_\lambda(k)\, a(k) + B_\lambda(k)\, a^+(k) \big]. \tag{3} \]

In the second quantized theory, the vacuum states defined by the conditions \hat{a}(k)|0⟩ = 0 and \hat{a}'(λ)|0'⟩ = 0 are not identical if the coefficients B_λ(k) in (3) are not zero. In this case the new vacuum has a non-trivial content of untransformed particle states. This situation is always present when quantizing field theories in curved space as well as in flat space, whenever some kind of global mutilation of the space is involved. This is the case of the natural quantization in Rindler coordinates [BD], which leads to a quantization inequivalent to the normal Minkowski quantization, or that of a quantum field in a box, where a dilatation produces a rearrangement of the vacuum [Fu]. Nevertheless, it must be stressed that the situation for SCT is more peculiar. The rearrangement of the vacuum in a massless QFT due to SCT, even though they are a symmetry of the classical system, behaves as if the conformal group were spontaneously broken, and this fact can be interpreted as a kind of topological anomaly. Thinking of the underlying reasons for this anomaly, we are tempted to make the singular action of the transformations (1) in Minkowski space responsible for it, as has in fact been pointed out in [GU]. However, a deeper analysis of the interconnection between symmetry and quantization will reveal a more profound obstruction to the possibility of implementing unitarily SCT in a generalized Minkowski space, free from singularities, when conformally invariant fields are forced to evolve in time. This way, the quantum time evolution itself destroys the conformal symmetry, leading to some sort of dynamical symmetry breaking which preserves the Weyl subgroup (Poincaré + dilatations). This obstruction is traced back to the impossibility of representing the entire SO(4, 2) group unitarily and irreducibly on a space of functions depending arbitrarily on \vec{x} (see e.g. [Fr]), so that a Cauchy surface determines the evolution in time.
Natural representations, however, can be constructed by means of wave functions having support on the whole space-time and evolving in some kind of proper time. From the point of view of particle quantum mechanics (or “first” quantization), the free arguments of wave functions in the configuration-space “representation” correspond to half of the canonically conjugated variables in phase space (or the classical solution manifold), and this phase space is usually defined as a co-adjoint orbit of the basic symmetry group characterizing the physical system. Thus, for instance, for the Galilei or Poincaré group the phase space associated with massive spinless particles has dimension 6 and the corresponding wave functions in configuration space have the time variable


factorized out. However, as mentioned above, this is not the case for the conformal group, for which we shall realize that time is a quantum observable subject to uncertainty relations; this fact extends covariance rules to the quantum domain. The present study is developed in the framework of a Group Approach to Quantization (GAQ) [AA1, ANR], which proves to be especially suitable for facing those quantization problems arising from specific symmetry requirements. Furthermore, this formalism has the virtue of providing in a natural way the space on which the wave functions are defined. A very brief report on GAQ is presented in Sect. 2. In Sect. 3 we apply this quantization technique to the particular case of the group $SO(2,2)$, which is the 1+1 dimensional version of the $SO(4,2)$ symmetry. Although the conformal symmetry in 1+1 dimensions is far richer, we proceed in a way that can be straightforwardly extended to the realistic dimension. In this example we show how a (compact) configuration space, locally isomorphic to Minkowski space-time, can be found inside the $SO(2,2)$ group manifold on which the whole conformal group acts without singularities. We also prove that the unitarity of the irreducible representations of $SO(2,2)$ requires the dynamical character of the time variable or, what amounts to the same thing, prevents the existence of a conformally invariant quantum evolution equation in the time variable. We examine two cases corresponding to non-compact and compact “proper time” dynamics in Subsects. 3.1 and 3.2, respectively. Section 4 is devoted to the application of GAQ to a very special infinite-dimensional Lie group $\tilde G^{(2)}(\mathcal H(\tilde G), \tilde G)$ directly attached to the quantum mechanical Hilbert space $\mathcal H(\tilde G)$ of a “first”-quantized system characterized by the quantizing group $\tilde G$ (a central extension of $G = SO(2,2)$, for the present case). This mechanism is nothing other than a group version of the “second”-quantization algorithm.
With this algorithm at hand we formulate in Sect. 4.1 a conformally invariant quantum field theory and, in Sect. 4.2, we investigate the effect of a SCT on a Weyl vacuum and the associated radiation phenomenon. We calculate exactly the spectrum of an accelerated Weyl vacuum, which proves to be a generalization of the black body spectrum, the latter being recovered as a given limit. Final comments are presented in the last section, Sect. 5.

2. Quantization on a Group $\tilde G$

The starting point of GAQ is a group $\tilde G$ (the quantizing group) with a principal fibre bundle structure $\tilde G(B,T)$, having $T$ as the structure group and $B$ being the base. The group $T$ generalizes the phase invariance of Quantum Mechanics. Although the situation can be more general [ANR], we shall start with the rather general case in which $\tilde G$ is a central extension of a group $G$ by $T$ [$T = U(1)$ or even $T = C^* = \Re^+ \otimes U(1)$]. For the one-parametric group $T = U(1)$, the group law for $\tilde G = \{\tilde g = (g,\zeta)\,/\,g\in G,\ \zeta\in U(1)\}$ adopts the following form:

$$\tilde g' * \tilde g = \left(g' * g,\ \zeta'\zeta\, e^{i\xi(g',g)}\right), \tag{4}$$

where $g'' = g' * g$ is the group operation in $G$ and $\xi(g',g)$ is a two-cocycle of $G$ on $\Re$ fulfilling:

$$\xi(g_2,g_1) + \xi(g_2 * g_1, g_3) = \xi(g_2, g_1 * g_3) + \xi(g_1,g_3), \quad g_i \in G. \tag{5}$$
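As a minimal numerical sketch of condition (5), one can check that any function of the form $\xi(g',g) = \delta(g'*g) - \delta(g') - \delta(g)$ satisfies it, purely as a consequence of associativity. The test group (affine maps $x \to ax + b$ with $a > 0$) and the generating function `delta` below are illustrative assumptions, not objects taken from the text.

```python
import math
import random

# Hypothetical test group (an assumption for illustration): affine maps
# x -> a*x + b with a > 0, stored as pairs (a, b).
def compose(g2, g1):
    """Return g2 * g1, i.e. apply g1 first and then g2."""
    a2, b2 = g2
    a1, b1 = g1
    return (a2 * a1, a2 * b1 + b2)

def delta(g):
    """An arbitrary generating function on the group (our choice)."""
    a, b = g
    return b + math.log(a) ** 2

def xi(g2, g1):
    """Function generated by delta: delta(g2*g1) - delta(g2) - delta(g1)."""
    return delta(compose(g2, g1)) - delta(g2) - delta(g1)

def cocycle_defect(g1, g2, g3):
    """Left-hand side minus right-hand side of Eq. (5)."""
    lhs = xi(g2, g1) + xi(compose(g2, g1), g3)
    rhs = xi(g2, compose(g1, g3)) + xi(g1, g3)
    return lhs - rhs

def random_element(rng):
    return (rng.uniform(0.5, 2.0), rng.uniform(-1.0, 1.0))

rng = random.Random(0)
max_defect = max(
    abs(cocycle_defect(random_element(rng), random_element(rng), random_element(rng)))
    for _ in range(1000)
)
print(max_defect)  # expected to be at the level of rounding errors
```

The defect vanishes up to rounding for any choice of `delta`, while `xi` itself need not vanish.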

In the general theory of central extensions [B], two two-cocycles are said to be equivalent if they differ in a coboundary, i.e. a cocycle which can be written in the form $\xi(g',g) = \delta(g'*g) - \delta(g') - \delta(g)$, where $\delta(g)$ is called the generating function of the coboundary. However, although cocycles differing on a coboundary lead to equivalent central extensions as such, there are some coboundaries which provide a non-trivial connection on the fibre bundle $\tilde G$ and Lie-algebra structure constants different from those of the direct product $G \otimes U(1)$. These are generated by a function $\delta$ with a non-trivial gradient at the identity, and can be divided into equivalence Pseudo-cohomology subclasses: two pseudo-cocycles are equivalent if they differ in a coboundary generated by a function with trivial gradient at the identity [S, AA2, AGM]. Pseudo-cohomology plays an important role in the theory of the finite-dimensional semi-simple groups, as they have trivial cohomology. For them, Pseudo-cohomology classes are associated with coadjoint orbits [AGM].

The right and left finite actions of the group $\tilde G$ on itself provide two sets of mutually commuting (left- and right-, respectively) invariant vector fields:

$$\tilde X^L_{\tilde g^i} = \left.\frac{\partial \tilde g''^{\,j}}{\partial \tilde g^i}\right|_{\tilde g=e}\frac{\partial}{\partial \tilde g^j},\quad \tilde X^R_{\tilde g^i} = \left.\frac{\partial \tilde g''^{\,j}}{\partial \tilde g'^{\,i}}\right|_{\tilde g'=e}\frac{\partial}{\partial \tilde g^j},\quad \left[\tilde X^L_{\tilde g^i}, \tilde X^R_{\tilde g^j}\right] = 0, \tag{6}$$

where $\{\tilde g^j\}$ is a parameterization of $\tilde G$. The GAQ program continues by finding the left-invariant 1-form $\Theta$ (the Quantization 1-form) associated with the central generator $\tilde X^L_\zeta = \tilde X^R_\zeta$, $\zeta \in T$, i.e. the $T$-component $\tilde\theta^{L(\zeta)}$ of the canonical left-invariant 1-form $\tilde\theta^L$ on $\tilde G$. This constitutes the generalization of the Poincaré-Cartan form of Classical Mechanics (see [AM]). The differential $d\Theta$ is a presymplectic form and its characteristic module, $\mathrm{Ker}\,\Theta \cap \mathrm{Ker}\,d\Theta$, is generated by a left subalgebra $\mathcal G_\Theta$ named the characteristic subalgebra. The quotient $(\tilde G, \Theta)/\mathcal G_\Theta$ is a quantum manifold in the sense of Geometric Quantization [GQ]. The trajectories generated by the vector fields in $\mathcal G_\Theta$ constitute the generalized equations of motion of the theory (temporal evolution, rotations, etc.), and the “Noether” invariants under those equations are $F_{\tilde g^j} \equiv i_{\tilde X^R_{\tilde g^j}}\Theta$, i.e. the contraction of right-invariant vector fields with the Quantization 1-form.

Let $B(\tilde G)$ be the set of complex-valued $T$-functions on $\tilde G$ in the sense of principal bundle theory:

$$\psi(\zeta * \tilde g) = D_T(\zeta)\,\psi(\tilde g), \quad \zeta \in T, \tag{7}$$

where $D_T$ is the natural representation of $T$ on the complex numbers $C$. The representation of $\tilde G$ on $B(\tilde G)$ generated by $\tilde{\mathcal G}^R = \{\tilde X^R\}$ is called Bohr Quantization and is reducible. The reduction can be achieved by means of the restrictions imposed by a full polarization $\mathcal P$:

$$\tilde X^L \psi_{\mathcal P} = 0, \quad \forall \tilde X^L \in \mathcal P, \tag{8}$$

which is a maximal, horizontal (excluding $\tilde X^L_\zeta$) left subalgebra of $\tilde{\mathcal G}^L$ which contains $\mathcal G_\Theta$. It should be noted that the existence of a full polarization, containing the whole subalgebra $\mathcal G_\Theta$, is not guaranteed. In case of such a breakdown, called an anomaly, or simply for the desire of choosing a preferred representation space, a higher-order polarization has to be imposed [ABLN]. A higher-order polarization is a maximal, horizontal subalgebra of the left enveloping algebra $U\tilde{\mathcal G}^L$ which contains $\mathcal G_\Theta$.

The group $\tilde G$ is irreducibly represented on the space $\mathcal H(\tilde G) \equiv \{|\psi\rangle\}$ of polarized wave functions, and on its dual $\mathcal H^*(\tilde G) \equiv \{\langle\psi|\}$. If we denote by

$$\langle \tilde g_{\mathcal P}|\psi\rangle \equiv \psi_{\mathcal P}(\tilde g), \quad \langle\psi'|\tilde g_{\mathcal P}\rangle \equiv \psi'^{*}_{\mathcal P}(\tilde g) \tag{9}$$


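the coordinates of the “ket” $|\psi\rangle$ and the “bra” $\langle\psi'|$ in a representation defined through a polarization $\mathcal P$ (first- or higher-order), then a scalar product on $\mathcal H(\tilde G)$ can be naturally defined as:

$$\langle\psi'|\psi\rangle \equiv \int_{\tilde G} v(\tilde g)\,\psi'^{*}_{\mathcal P}(\tilde g)\,\psi_{\mathcal P}(\tilde g), \tag{10}$$

where

$$v(\tilde g) \equiv \theta^L_{\tilde g^1}\wedge \overset{\dim(\tilde G)}{\cdots} \wedge\theta^L_{\tilde g^n} \tag{11}$$

is the left-invariant integration volume in $\tilde G$ and

$$1 = \int_{\tilde G} |\tilde g_{\mathcal P}\rangle\, v(\tilde g)\, \langle\tilde g_{\mathcal P}| \tag{12}$$

formally represents a closure relation. A direct computation proves that, with this scalar product, the group $\tilde G$ is unitarily represented through the left finite action ($\rho$ denotes the representation)

$$\langle\tilde g_{\mathcal P}|\rho(\tilde g')|\psi\rangle \equiv \psi_{\mathcal P}(\tilde g'^{-1} * \tilde g). \tag{13}$$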

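The adjoint action is then defined as

$$\langle\psi'|\rho^\dagger(\tilde g')|\psi\rangle \equiv \langle\psi|\rho(\tilde g')|\psi'\rangle^{*}, \quad \text{i.e.}\quad \langle\tilde g_{\mathcal P}|\rho^\dagger(\tilde g')|\psi\rangle = \psi_{\mathcal P}(\tilde g' * \tilde g). \tag{14}$$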

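We can relate the coordinates of $|\psi\rangle$ in two given representations, corresponding to two different polarizations $\mathcal P_1$ and $\mathcal P_2$, as follows:

$$\psi_{\mathcal P_1}(\tilde g) = \langle\tilde g_{\mathcal P_1}|\psi\rangle = \int_{\tilde G} v(\tilde g')\,\langle\tilde g_{\mathcal P_1}|\tilde g'_{\mathcal P_2}\rangle\langle\tilde g'_{\mathcal P_2}|\psi\rangle \equiv \int_{\tilde G} v(\tilde g')\,\Delta_{\mathcal P_1\mathcal P_2}(\tilde g,\tilde g')\,\psi_{\mathcal P_2}(\tilde g'), \tag{15}$$

where $\Delta_{\mathcal P_1\mathcal P_2}(\tilde g,\tilde g')$ is a “polarization changing” operator. An explicit expression of $\Delta_{\mathcal P_1\mathcal P_2}$ can be obtained by making use of a basis $\{|n\rangle\}_{n\in I}$ ($I$ is a set of indices) of $\mathcal H(\tilde G)$, as follows:

$$\Delta_{\mathcal P_1\mathcal P_2}(\tilde g,\tilde g') = \langle\tilde g_{\mathcal P_1}|\tilde g'_{\mathcal P_2}\rangle = \sum_{n\in I} \psi^{*}_{\mathcal P_1,n}(\tilde g)\,\psi_{\mathcal P_2,n}(\tilde g'), \tag{16}$$

where $\psi_{\mathcal P_i,n}(\tilde g) \equiv \langle\tilde g_{\mathcal P_i}|n\rangle$ are the coordinates of $|n\rangle$ in a polarization $\mathcal P_i$.

Constraints are consistently incorporated into the theory by enlarging the structure group $T$ (which always includes $U(1)$), i.e. through $T$-function conditions:

$$\rho(\tilde t)|\psi\rangle = D_T^{(\epsilon)}(\tilde t)|\psi\rangle, \quad \tilde t \in T \tag{17}$$

or, for continuous transformations,

$$\tilde X^R_{\tilde t}|\psi\rangle = dD_T^{(\epsilon)}(\tilde t)|\psi\rangle, \tag{18}$$

where $D_T^{(\epsilon)}$ means a specific representation of $T$ [the index $\epsilon$ parametrizes different (inequivalent) quantizations] and $dD_T^{(\epsilon)}$ is its differential.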


It is obvious that, for a non-central structure group $T$, not all the right operators $\tilde X^R_{\tilde g}$ will preserve these constraints; a sufficient condition for a subgroup $\tilde G_T \subset \tilde G$ to preserve the constraints is (see [ACG]):

$$\left[\tilde G_T, T\right] \subset \mathrm{Ker}\,D_T^{(\epsilon)} \tag{19}$$

[note that, for the trivial representation of $T$, the subgroup $\tilde G_T$ is nothing other than the normalizer of $T$]. $\tilde G_T$ is part of the set of good operators [ANR], of the enveloping algebra $U\tilde{\mathcal G}^R$ in general, for which the subgroup $T$ behaves as a gauge group (see [NAC] for a thorough study of gauge symmetries and constraints from the point of view of GAQ). A more general situation can be posed when the constraints are lifted to the higher-order level, not necessarily first order as in (18), that is, they are a subalgebra of the right enveloping algebra $U\tilde{\mathcal G}^R$. An interesting example of this last case arises when one selects representations labelled by a value $\epsilon$ of some Casimir operator $Q$ of a subgroup $\tilde G_Q$ of $\tilde G$. This is exactly the case that interests us: null mass representations ($\epsilon = m = 0$) of the Poincaré group ($\tilde G_Q = SO(3,1)\otimes_s T_4$, $Q = P_\nu P^\nu$) inside the conformal group ($\tilde G = SO(4,2)$). In the more general case in which $T$ is not a trivial central extension, $T \neq \check T \times U(1)$, where $\check T \equiv T/U(1)$ (i.e. $T$ contains second-class constraints), the conditions (18) are not all compatible and we must select a subgroup $T_B = T_p \times U(1)$, where $T_p$ is the subgroup associated with a right polarization subalgebra of the central extension $T$ (see [ANR]). For simplicity, we have sometimes made use of infinitesimal (geometrical) concepts, but all this language can be translated to their finite (algebraic) counterparts (see [ANR]), a desirable way of proceeding when discrete transformations are incorporated into the theory.

3. Conformally Invariant Quantum Mechanics

Conformally invariant Quantum Mechanics (in 1+1D) will be developed by finding the unitary irreducible representations of the centrally extended $SO(2,2)$ group in exactly the same way the Hilbert space of the Galilean particle is obtained from the unitary irreducible representations of the centrally extended Galilei group (see e.g. [AA1]). The configuration space of the theory or, rather, an analytic continuation of Minkowski space, will arise as a homogeneous space of the group, on which the wave functions supporting the irreducible representation take arguments. Except for discrete symmetries, which are not relevant at the Lie algebra level, $SO(2,2) \sim SU(1,1)\otimes SU(1,1)$, so that we shall look at the structure of

$$SU(1,1) = \left\{ U = \begin{pmatrix} z_1 & z_2 \\ z_2^* & z_1^* \end{pmatrix},\ z_i, z_i^* \in C \ /\ \det(U) = |z_1|^2 - |z_2|^2 = 1 \right\}. \tag{20}$$

$SU(1,1)$ is a fibre bundle with fibre $U(1)$ and base the hyperboloid. A system of coordinates adapted to this fibration is the following:

$$\eta \equiv \frac{z_1}{|z_1|}, \quad \alpha \equiv \frac{z_2}{z_1}, \quad \alpha^* \equiv \frac{z_2^*}{z_1^*}, \qquad \eta \in U(1),\ \alpha, \alpha^* \in D_1, \tag{21}$$

where D1 is the unit disk and the coordinates α, α∗ are related to the stereographical projection of the hyperboloid on the disk. The inverse transformation is:

$$z_1 = \sqrt{\frac{1}{1-\alpha\alpha^*}}\;\eta, \qquad z_2 = \sqrt{\frac{1}{1-\alpha\alpha^*}}\;\alpha\eta. \tag{22}$$
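The change of coordinates (21) and its inverse (22) can be checked numerically. The sketch below is only an illustration: the helper names and the way random $SU(1,1)$ elements are sampled ($|z_1| = \cosh t$, $|z_2| = \sinh t$) are our own assumptions.

```python
import cmath
import math
import random

def to_coords(z1, z2):
    """Eq. (21): global variables (z1, z2) -> fibration coordinates (eta, alpha)."""
    return z1 / abs(z1), z2 / z1

def from_coords(eta, alpha):
    """Eq. (22): (eta, alpha) -> (z1, z2)."""
    scale = math.sqrt(1.0 / (1.0 - abs(alpha) ** 2))
    return scale * eta, scale * alpha * eta

def random_su11(rng):
    """Random (z1, z2) with |z1|^2 - |z2|^2 = 1 (sampling scheme is our choice)."""
    t = rng.uniform(0.0, 2.0)
    z1 = math.cosh(t) * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    z2 = math.sinh(t) * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    return z1, z2

rng = random.Random(1)
max_error = 0.0
max_alpha = 0.0
for _ in range(1000):
    z1, z2 = random_su11(rng)
    eta, alpha = to_coords(z1, z2)
    max_alpha = max(max_alpha, abs(alpha))   # alpha must stay inside the unit disk
    w1, w2 = from_coords(eta, alpha)
    max_error = max(max_error, abs(w1 - z1), abs(w2 - z2))
print(max_error, max_alpha)
```

The round trip reproduces $(z_1, z_2)$ up to rounding, and $|\alpha| < 1$ for every sampled element, in agreement with $\alpha \in D_1$.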

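The group law $U'' = U'U$ in $\eta, \alpha, \alpha^*$ coordinates is:

$$\begin{aligned}
\eta'' &= \frac{z_1''}{|z_1''|} = \frac{\eta'\eta + \eta'\eta^*\alpha'\alpha^*}{\sqrt{(1+\eta^{*2}\alpha'\alpha^*)(1+\eta^{2}\alpha\alpha^{*\prime})}},\\
\alpha'' &= \frac{z_2''}{z_1''} = \frac{\alpha\eta^2 + \alpha'}{\eta^2 + \alpha'\alpha^*},\\
\alpha^{*\prime\prime} &= \frac{z_2^{*\prime\prime}}{z_1^{*\prime\prime}} = \frac{\alpha^*\eta^{-2} + \alpha^{*\prime}}{\eta^{-2} + \alpha^{*\prime}\alpha},
\end{aligned} \tag{23}$$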

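from which we can extract the left- and right-invariant vector fields:

$$\begin{aligned}
X^L_\eta &= \eta\frac{\partial}{\partial\eta} - 2\alpha\frac{\partial}{\partial\alpha} + 2\alpha^*\frac{\partial}{\partial\alpha^*},\\
X^L_\alpha &= -\frac{1}{2}\eta\alpha^*\frac{\partial}{\partial\eta} + \frac{\partial}{\partial\alpha} - \alpha^{*2}\frac{\partial}{\partial\alpha^*},\\
X^L_{\alpha^*} &= \frac{1}{2}\eta\alpha\frac{\partial}{\partial\eta} - \alpha^{2}\frac{\partial}{\partial\alpha} + \frac{\partial}{\partial\alpha^*},\\
X^R_\eta &= \eta\frac{\partial}{\partial\eta},\\
X^R_\alpha &= \frac{1}{2}\eta^{-1}\alpha^*\frac{\partial}{\partial\eta} + \eta^{-2}(1-\alpha\alpha^*)\frac{\partial}{\partial\alpha},\\
X^R_{\alpha^*} &= -\frac{1}{2}\eta^{3}\alpha\frac{\partial}{\partial\eta} + \eta^{2}(1-\alpha\alpha^*)\frac{\partial}{\partial\alpha^*}.
\end{aligned} \tag{24}$$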
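The group law (23), from which these fields are derived, can be compared numerically with the matrix product $U'' = U'U$ of Eq. (20). The following is a small sketch under our own conventions (function names and sampling ranges are assumptions made for the illustration).

```python
import cmath
import math
import random

def from_coords(eta, alpha):
    """Eq. (22): (eta, alpha) -> (z1, z2)."""
    scale = math.sqrt(1.0 / (1.0 - abs(alpha) ** 2))
    return scale * eta, scale * alpha * eta

def to_coords(z1, z2):
    """Eq. (21): (z1, z2) -> (eta, alpha)."""
    return z1 / abs(z1), z2 / z1

def matrix_product(u_prime, u):
    """First row (z1'', z2'') of U'' = U'U for U = [[z1, z2], [z2*, z1*]]."""
    z1p, z2p = u_prime
    z1, z2 = u
    return (z1p * z1 + z2p * z2.conjugate(), z1p * z2 + z2p * z1.conjugate())

def group_law(g_prime, g):
    """Eq. (23) written directly in (eta, alpha) coordinates."""
    eta_p, alpha_p = g_prime
    eta, alpha = g
    w = eta.conjugate() ** 2 * alpha_p * alpha.conjugate()
    eta_pp = eta_p * eta * (1 + w) / abs(1 + w)
    alpha_pp = (alpha * eta ** 2 + alpha_p) / (eta ** 2 + alpha_p * alpha.conjugate())
    return eta_pp, alpha_pp

def random_coords(rng):
    eta = cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    alpha = rng.uniform(0.0, 0.9) * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    return eta, alpha

rng = random.Random(2)
max_error = 0.0
for _ in range(1000):
    g_prime, g = random_coords(rng), random_coords(rng)
    via_matrices = to_coords(*matrix_product(from_coords(*g_prime), from_coords(*g)))
    direct = group_law(g_prime, g)
    max_error = max(max_error,
                    abs(via_matrices[0] - direct[0]),
                    abs(via_matrices[1] - direct[1]))
print(max_error)
```

Both routes agree up to rounding; the third line of (23) is the complex conjugate of the second and is not checked separately.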

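They close the Lie algebra:

$$\left[X^L_\eta, X^L_\alpha\right] = 2X^L_\alpha, \quad \left[X^L_\eta, X^L_{\alpha^*}\right] = -2X^L_{\alpha^*}, \quad \left[X^L_\alpha, X^L_{\alpha^*}\right] = X^L_\eta, \tag{25}$$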

and the corresponding right version by changing the sign of the structure constants. Let us parameterize $G = SO(2,2)$ as two copies of $SU(1,1)$ with parameters $\{(\eta,\alpha,\alpha^*); (\bar\eta,\bar\alpha,\bar\alpha^*)\}$. There are two possibilities of combining the generators in the Lie algebra

$$\mathcal G^L = \{X^L_\eta, X^L_\alpha, X^L_{\alpha^*}, X^L_{\bar\eta}, X^L_{\bar\alpha}, X^L_{\bar\alpha^*}\} \tag{26}$$

of $G$, in order to get the usual conformal generators

$$\mathcal G^L = \{D^L, M^L, P_0^L, P_1^L, K_0^L, K_1^L\} \tag{27}$$

which fulfill the ordinary commutation relations [K]:


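$$\begin{array}{lll}
\left[P_0^L, D^L\right] = -P_0^L, & \left[P_1^L, D^L\right] = -P_1^L, & \left[P_0^L, M^L\right] = -P_1^L,\\
\left[P_1^L, M^L\right] = -P_0^L, & \left[P_0^L, K_0^L\right] = -2D^L, & \left[P_1^L, K_0^L\right] = -2M^L,\\
\left[P_0^L, K_1^L\right] = 2M^L, & \left[P_1^L, K_1^L\right] = 2D^L, & \left[K_0^L, D^L\right] = K_0^L,\\
\left[K_1^L, D^L\right] = K_1^L, & \left[K_0^L, M^L\right] = -K_1^L, & \left[K_1^L, M^L\right] = -K_0^L,\\
\left[D^L, M^L\right] = 0, & \left[P_0^L, P_1^L\right] = 0, & \left[K_0^L, K_1^L\right] = 0,
\end{array} \tag{28}$$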

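where $D, M, P_\mu, K_\mu$ are the generators of dilatation, boosts, space-time translations and special conformal transformations, respectively. One of the two mentioned choices leads to a non-compact dilatation subgroup, whereas the other leads to a compact one. Let us show what these combinations are:

$$\begin{array}{ll}
\text{COMPACT } D & \text{NON-COMPACT } D\\[4pt]
D^L = -\frac{1}{2}\left(X^L_\eta + X^L_{\bar\eta}\right), & D^L = -\frac{i}{2}\left(X^L_\alpha - X^L_{\alpha^*} + X^L_{\bar\alpha} - X^L_{\bar\alpha^*}\right),\\
M^L = \frac{1}{2}\left(X^L_\eta - X^L_{\bar\eta}\right), & M^L = \frac{i}{2}\left(X^L_\alpha - X^L_{\alpha^*} - X^L_{\bar\alpha} + X^L_{\bar\alpha^*}\right),\\
P_0^L = -\left(X^L_{\alpha^*} + X^L_{\bar\alpha^*}\right), & P_0^L = \frac{1}{2}\left(X^L_\alpha + X^L_{\alpha^*} - X^L_{\bar\alpha} - X^L_{\bar\alpha^*} - i(X^L_\eta - X^L_{\bar\eta})\right),\\
P_1^L = X^L_{\alpha^*} - X^L_{\bar\alpha^*}, & P_1^L = -\frac{1}{2}\left(X^L_\alpha + X^L_{\alpha^*} + X^L_{\bar\alpha} + X^L_{\bar\alpha^*} - i(X^L_\eta + X^L_{\bar\eta})\right),\\
K_0^L = X^L_\alpha + X^L_{\bar\alpha}, & K_0^L = \frac{1}{2}\left(-X^L_\alpha - X^L_{\alpha^*} + X^L_{\bar\alpha} + X^L_{\bar\alpha^*} - i(X^L_\eta - X^L_{\bar\eta})\right),\\
K_1^L = X^L_\alpha - X^L_{\bar\alpha}, & K_1^L = -\frac{1}{2}\left(X^L_\alpha + X^L_{\alpha^*} + X^L_{\bar\alpha} + X^L_{\bar\alpha^*} + i(X^L_\eta + X^L_{\bar\eta})\right).
\end{array} \tag{29}$$

The group $G = SU(1,1)\otimes SU(1,1)$ is non-compact and semisimple. The left-invariant integration volume can be expressed as:

$$v(g) \equiv \theta^L_{g^1}\wedge \overset{\dim(G)}{\cdots} \wedge\theta^L_{g^n} = -\frac{1}{\eta(1-\alpha\alpha^*)^2}\,\frac{1}{\bar\eta(1-\bar\alpha\bar\alpha^*)^2}\; d\alpha\wedge d\alpha^*\wedge d\eta\wedge d\bar\alpha\wedge d\bar\alpha^*\wedge d\bar\eta, \tag{30}$$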

which becomes singular for values $|\alpha|, |\bar\alpha| \to 1$ (unit circumferences surrounding both open disks). However, resorting to a central extension $\tilde G$ of $G$, necessarily trivial since $G$ is semisimple and finite-dimensional, we shall render the extended scalar product, between wave functions on the group, finite for some range of the extension parameter. There are several central extensions of the conformal group, but we are interested in one that afterwards leads to a generalized Minkowski space. This choice corresponds to an extension by a coboundary locally generated by the dilatation parameter, which we shall consider as a “proper time” (see [AA3]). We shall separate the two cases, a) non-compact and b) compact dilatation subgroup, in two subsections, respectively. The essentials of the problem we are involved in are insensitive to the topological character of the dilatation subgroup; however, whereas the non-compact dilatation case will be useful to connect with some standard expressions in Minkowski space, the compact dilatation case will be more manageable to construct and illustrate the second quantization program. It can be proven that a consistent quantum theory needs the group $C^*$ as the structure group $T$ for the first case, whereas a pseudo-extension by $U(1)$ is enough for the second one.

3.1. Non-compact dilatation subgroup. Let us look for a $T = C^* = \{z = r\zeta;\ r\in\Re^+,\ \zeta\in U(1)\}$-pseudo-extension

$$z'' = z'z\,e^{\xi(g',g)}, \qquad \xi(g',g) = \delta(g'*g) - \delta(g') - \delta(g), \qquad z \in C^*, \tag{31}$$


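where $\delta(g) = -i\beta(\alpha - \alpha^* + \bar\alpha - \bar\alpha^*)$ is the function which generates the coboundary and $\beta = \beta_1 + i\beta_2$ is a complex parameter characterizing the representation. The extended left- and right-invariant vector fields in $\tilde G$ are

$$\begin{aligned}
\tilde X^L_r &= \tilde X^R_r = r\frac{\partial}{\partial r},\\
\tilde X^L_\zeta &= \tilde X^R_\zeta = \zeta\frac{\partial}{\partial\zeta},\\
\tilde X^L_\eta &= X^L_\eta + 2i\beta_1(\alpha+\alpha^*)\tilde X^L_r - 2\beta_2(\alpha+\alpha^*)\tilde X^L_\zeta,\\
\tilde X^L_\alpha &= X^L_\alpha - i\beta_1\alpha^{*2}\tilde X^L_r + \beta_2\alpha^{*2}\tilde X^L_\zeta,\\
\tilde X^L_{\alpha^*} &= X^L_{\alpha^*} + i\beta_1\alpha^{2}\tilde X^L_r - \beta_2\alpha^{2}\tilde X^L_\zeta,\\
\tilde X^R_\eta &= X^R_\eta,\\
\tilde X^R_\alpha &= X^R_\alpha - i\beta_1\left(\eta^{-2}(1-\alpha\alpha^*) - 1\right)\tilde X^R_r + \beta_2\left(\eta^{-2}(1-\alpha\alpha^*) - 1\right)\tilde X^R_\zeta,\\
\tilde X^R_{\alpha^*} &= X^R_{\alpha^*} + i\beta_1\left(\eta^{2}(1-\alpha\alpha^*) - 1\right)\tilde X^R_r - \beta_2\left(\eta^{2}(1-\alpha\alpha^*) - 1\right)\tilde X^R_\zeta,
\end{aligned} \tag{32}$$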

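and similar expressions for the $\bar\eta, \bar\alpha, \bar\alpha^*$ parameters. The new commutation relations for the extended conformal Lie algebra $\tilde{\mathcal G}$ of $\tilde G$ are two copies of:

$$\begin{aligned}
\left[\tilde X^L_\eta, \tilde X^L_\alpha\right] &= 2\tilde X^L_\alpha - 2i\beta_1\tilde X^L_r + 2\beta_2\tilde X^L_\zeta,\\
\left[\tilde X^L_\eta, \tilde X^L_{\alpha^*}\right] &= -2\tilde X^L_{\alpha^*} - 2i\beta_1\tilde X^L_r + 2\beta_2\tilde X^L_\zeta,\\
\left[\tilde X^L_\alpha, \tilde X^L_{\alpha^*}\right] &= \tilde X^L_\eta,\\
\left[\tilde X^L_r, \text{all}\right] &= 0,\\
\left[\tilde X^L_\zeta, \text{all}\right] &= 0,
\end{aligned} \tag{33}$$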

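the right ones changing a global sign in the structure constants. The only change induced in the Lie algebra commutators, when expressed in terms of

$$\tilde{\mathcal G}^L = \{\tilde D^L, \tilde M^L, \tilde P_0^L, \tilde P_1^L, \tilde K_0^L, \tilde K_1^L, \tilde X^L_\zeta, \tilde X^L_r\} \tag{34}$$

(having the same functional form as in the right-hand side of Eq. (29)), is in the following two commutators:

$$\begin{aligned}
\left[\tilde P_0^L, \tilde K_0^L\right] &= -2\tilde D^L + 4\beta_1\tilde X^L_r + 4i\beta_2\tilde X^L_\zeta,\\
\left[\tilde P_1^L, \tilde K_1^L\right] &= 2\tilde D^L - 4\beta_1\tilde X^L_r - 4i\beta_2\tilde X^L_\zeta,
\end{aligned} \tag{35}$$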

the remainder keeping the same expression as in Eq. (28). These relations show that the two couples of generators $(\tilde P_0^L, \tilde K_0^L)$ and $(\tilde P_1^L, \tilde K_1^L)$ are canonically conjugate, i.e. they give rise to central terms at the right-hand side of the corresponding commutator. Central extensions of this kind were already considered in [AAB, AA3]. From (35) we conclude that, like the space operator, time is not deprived of dynamical character, that is, it is an operator subject to uncertainty relations (see [JR] for another definition of space-time position operators inside the enveloping algebra of the conformal group).


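The quantization form and its characteristic module are

$$\begin{aligned}
\Theta &= \left(\Theta^{(r)}, \Theta^{(\zeta)}\right),\\
\Theta^{(r)} &= \beta_1\left(\Gamma(\eta,\alpha,\alpha^*) + \Gamma(\bar\eta,\bar\alpha,\bar\alpha^*)\right) + r^{-1}dr,\\
\Theta^{(\zeta)} &= -\beta_2\left(\Gamma(\eta,\alpha,\alpha^*) + \Gamma(\bar\eta,\bar\alpha,\bar\alpha^*)\right) + i\zeta^{-1}d\zeta,\\
\Gamma(\eta,\alpha,\alpha^*) &\equiv \frac{1}{1-\alpha\alpha^*}\left(-2i(\alpha+\alpha^*)\eta^{-1}d\eta - i\alpha\alpha^*d\alpha + i\alpha\alpha^*d\alpha^*\right),\\
\mathcal G_\Theta &= \langle \tilde D^L, \tilde M^L \rangle.
\end{aligned} \tag{36}$$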

Let $B(\tilde G)$ be the set of complex valued $T$-functions on $\tilde G$ in the sense of principal bundle theory: $\psi(z * \tilde g) = D_T(z)\psi(\tilde g)$, and let us choose the representation $D_T(z) = z^p$, where $p$ has to be a negative integer for single-valuedness and the “square integrable” condition of the wave function. In order to reduce the representation of $\tilde G$ on $B(\tilde G)$, we impose the full polarization subalgebra:

$$\mathcal P = \langle \tilde D^L, \tilde M^L, \tilde K_0^L, \tilde K_1^L \rangle. \tag{37}$$

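The solution to the polarization conditions leads to a Hilbert space $\mathcal H(\tilde G)$ made of wave functions of the form

$$\begin{aligned}
\psi^{(\beta)}(\eta,\alpha,\alpha^*,\bar\eta,\bar\alpha,\bar\alpha^*,z) &= z^p\, W_\beta(\alpha,\alpha^*,\bar\alpha,\bar\alpha^*)\,\phi(\mu,\bar\mu),\\
W_\beta(\alpha,\alpha^*,\bar\alpha,\bar\alpha^*) &= w_\beta(\alpha,\alpha^*)\,w_\beta(\bar\alpha,\bar\alpha^*),\\
w_\beta(\alpha,\alpha^*) &= (1-\alpha\alpha^*)^{p\beta}(\alpha+i)^{-p\beta}(\alpha^*-i)^{-p\beta}\,e^{ip\beta(\alpha-\alpha^*)},
\end{aligned} \tag{38}$$

where $W_\beta$ is a “generating function” and $\phi$ is an arbitrary power series

$$\phi(\mu,\bar\mu) = \sum_{n,\bar n=-\infty}^{\infty} a_{n,\bar n}\,\phi_{n,\bar n}(\mu,\bar\mu), \qquad \phi_{n,\bar n}(\mu,\bar\mu) \equiv \mu^n\bar\mu^{\bar n} \tag{39}$$

in the variables

$$\mu = \frac{\alpha^*-i}{\alpha+i}\,\eta^{-2} = \frac{z_2^* - iz_1^*}{z_2 + iz_1}, \qquad \bar\mu = \frac{\bar\alpha^*-i}{\bar\alpha+i}\,\bar\eta^{-2} = \frac{\bar z_2^* - i\bar z_1^*}{\bar z_2 + i\bar z_1}. \tag{40}$$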
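The following sketch checks two properties of the variable $\mu$ of Eq. (40) numerically: that $|\mu| = 1$, and that under a left translation $U'' = U'U$ it transforms by the fractional linear map $\mu'' = (\mu - i\alpha^{*\prime})/(\eta'^2(1 + i\mu\alpha'))$ stated in Eq. (41), with no singular points on the circle. The sampling of $SU(1,1)$ elements and all function names are assumptions made for the illustration.

```python
import cmath
import math
import random

def random_su11(rng):
    """Random (z1, z2) with |z1|^2 - |z2|^2 = 1 (sampling scheme is our choice)."""
    t = rng.uniform(0.0, 2.0)
    z1 = math.cosh(t) * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    z2 = math.sinh(t) * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    return z1, z2

def left_multiply(u_prime, u):
    """First row (z1'', z2'') of U'' = U'U for U = [[z1, z2], [z2*, z1*]]."""
    z1p, z2p = u_prime
    z1, z2 = u
    return (z1p * z1 + z2p * z2.conjugate(), z1p * z2 + z2p * z1.conjugate())

def mu_of(z1, z2):
    """Eq. (40) in global variables."""
    return (z2.conjugate() - 1j * z1.conjugate()) / (z2 + 1j * z1)

def moebius(mu, eta_p, alpha_p):
    """Final expression of Eq. (41) for the transformed mu."""
    return (mu - 1j * alpha_p.conjugate()) / (eta_p ** 2 * (1 + 1j * mu * alpha_p))

rng = random.Random(3)
max_modulus_defect = 0.0
max_action_error = 0.0
for _ in range(1000):
    u_prime, u = random_su11(rng), random_su11(rng)
    eta_p = u_prime[0] / abs(u_prime[0])   # eta' from Eq. (21)
    alpha_p = u_prime[1] / u_prime[0]      # alpha' from Eq. (21)
    mu = mu_of(*u)
    mu_new = mu_of(*left_multiply(u_prime, u))
    max_modulus_defect = max(max_modulus_defect, abs(abs(mu) - 1.0), abs(abs(mu_new) - 1.0))
    max_action_error = max(max_action_error, abs(mu_new - moebius(mu, eta_p, alpha_p)))
print(max_modulus_defect, max_action_error)
```

Since $|\mu| = 1$ and $|\alpha'| < 1$, the denominator $1 + i\mu\alpha'$ never vanishes, which is the numerical counterpart of the statement that the action is free from singularities.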

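Note that $(\mu, \bar\mu)$ are defined in a two-dimensional torus $T^2 = S^1\times S^1$ (the 1+1 dimensional version of the 3+1 dimensional projective cone $S^3\times S^1/Z_2$). Let us show how the conformal group acts on $T^2$ free from singularities. For this, we have only to translate the group composition law, originally written in global variables $z_i, \bar z_i$, $i = 1, 2$, in Eq. (22), to the variables $\mu, \bar\mu$:

$$\begin{aligned}
\mu \to \mu'' &\equiv \frac{z_2^{*\prime\prime} - iz_1^{*\prime\prime}}{z_2'' + iz_1''} = \frac{z_2^{*\prime}z_1 + z_1^{*\prime}z_2^* - i\left(z_2^{*\prime}z_2 + z_1^{*\prime}z_1^*\right)}{z_1'z_2 + z_2'z_1^* + i\left(z_1'z_1 + z_2'z_2^*\right)} = \frac{\mu - i\alpha^{*\prime}}{\eta'^2\left(1 + i\mu\alpha'\right)},\\
\bar\mu \to \bar\mu'' &\equiv \frac{\bar z_2^{*\prime\prime} - i\bar z_1^{*\prime\prime}}{\bar z_2'' + i\bar z_1''} = \frac{\bar z_2^{*\prime}\bar z_1 + \bar z_1^{*\prime}\bar z_2^* - i\left(\bar z_2^{*\prime}\bar z_2 + \bar z_1^{*\prime}\bar z_1^*\right)}{\bar z_1'\bar z_2 + \bar z_2'\bar z_1^* + i\left(\bar z_1'\bar z_1 + \bar z_2'\bar z_2^*\right)} = \frac{\bar\mu - i\bar\alpha^{*\prime}}{\bar\eta'^2\left(1 + i\bar\mu\bar\alpha'\right)}.
\end{aligned} \tag{41}$$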

This action is always well defined and transitive on $T^2$ (see Ref. [LM] for a more detailed study of the global properties of a similar space in 3+1 dimensions), in contrast to the action on the Minkowski space, which can be seen as a local chart of $T^2$ obtained by stereographical projection ($\mu \equiv e^{i\theta}$, $\bar\mu \equiv e^{i\bar\theta}$):

$$\begin{aligned}
t &= \frac{1}{2}\left(\cot\frac{\theta}{2} + \cot\frac{\bar\theta}{2}\right),\\
x &= \frac{1}{2}\left(\cot\frac{\theta}{2} - \cot\frac{\bar\theta}{2}\right),
\end{aligned} \tag{42}$$

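as can be checked by realizing that the expression of the generators of the conformal group in $T^2$ (see Eq. (44)) acquires the standard form in Minkowski space, except for (quantum) inhomogeneous terms proportional to the extension parameter $\beta$ (see [K] for instance), when expressed in terms of $t, x$. The manifold $T^2$ is thus a natural space-time on which a globally-defined 1+1 conformally invariant QFT can live. The invariant integration volume is $v(\tilde g) = v(g)\wedge(r^{-1}dr)\wedge(i\zeta^{-1}d\zeta)$ (see Eq. (30)). The scalar product of two wave functions (38) will be finite when the factor $((1-\alpha\alpha^*)(1-\bar\alpha\bar\alpha^*))^{2p\beta}$, coming from $W_\beta$ (see Eq. (38)), cancels out the singularity of $v(\tilde g)$ at the boundary of the unit disk due to the factor $((1-\alpha\alpha^*)(1-\bar\alpha\bar\alpha^*))^{-2}$. This is possible when

$$p\beta_1 > 1/2, \tag{43}$$

with no restriction in the parameter $\beta_2$ (this is the reason why the pseudo-extension by the real positive line, with parameter $\beta_1 \neq 0$, is fundamental for this case). The action of the right-invariant vector fields (operators in the theory) on polarized wave functions (see Eq. (38)) has the explicit form:

$$\begin{aligned}
\tilde D^R\psi^{(\beta)} &= z^pW_\beta \times \left(-\frac{1}{2}(\mu^2-1)\frac{\partial}{\partial\mu} - \frac{1}{2}(\bar\mu^2-1)\frac{\partial}{\partial\bar\mu} - p\beta(\mu + \mu^{-1} + \bar\mu + \bar\mu^{-1} - 2)\right)\phi(\mu,\bar\mu),\\
\tilde M^R\psi^{(\beta)} &= z^pW_\beta \times \left(\frac{1}{2}(\mu^2-1)\frac{\partial}{\partial\mu} - \frac{1}{2}(\bar\mu^2-1)\frac{\partial}{\partial\bar\mu} - p\beta(-\mu - \mu^{-1} + \bar\mu + \bar\mu^{-1})\right)\phi(\mu,\bar\mu),\\
\tilde P_0^R\psi^{(\beta)} &= z^pW_\beta \times \left(-\frac{i}{2}(\mu-1)^2\frac{\partial}{\partial\mu} + \frac{i}{2}(\bar\mu-1)^2\frac{\partial}{\partial\bar\mu} - p\beta(\mu - \mu^{-1} - \bar\mu + \bar\mu^{-1})\right)\phi(\mu,\bar\mu),\\
\tilde P_1^R\psi^{(\beta)} &= z^pW_\beta \times \left(\frac{i}{2}(\mu-1)^2\frac{\partial}{\partial\mu} + \frac{i}{2}(\bar\mu-1)^2\frac{\partial}{\partial\bar\mu} - p\beta(-\mu + \mu^{-1} - \bar\mu + \bar\mu^{-1})\right)\phi(\mu,\bar\mu),\\
\tilde K_0^R\psi^{(\beta)} &= z^pW_\beta \times \left(\frac{i}{2}(\mu+1)^2\frac{\partial}{\partial\mu} - \frac{i}{2}(\bar\mu+1)^2\frac{\partial}{\partial\bar\mu} + p\beta(\mu - \mu^{-1} - \bar\mu + \bar\mu^{-1})\right)\phi(\mu,\bar\mu),\\
\tilde K_1^R\psi^{(\beta)} &= z^pW_\beta \times \left(\frac{i}{2}(\mu-1)^2\frac{\partial}{\partial\mu} + \frac{i}{2}(\bar\mu-1)^2\frac{\partial}{\partial\bar\mu} - p\beta(-\mu + \mu^{-1} - \bar\mu + \bar\mu^{-1})\right)\phi(\mu,\bar\mu),\\
\tilde X^R_r\psi^{(\beta)} &= p\,\psi^{(\beta)},\\
\tilde X^R_\zeta\psi^{(\beta)} &= p\,\psi^{(\beta)}.
\end{aligned} \tag{44}$$

This representation is irreducible for the extended conformal group $\tilde G$ and this is a consequence, according to the general formalism, of the maximality of the full polarization subalgebra $\mathcal P$ in Eq. (37), i.e. $\mathcal P$ cannot be further enlarged nor the representation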


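further reduced. The process of obtaining the unitary irreducible representations ends here. Any restriction desired on our wave functions should then be imposed as constraints. We are interested, however, in null mass representations, and these can be achieved by selecting those wave functions $\psi_c^{(\beta)}$ in $\mathcal H(\tilde G)$ which are nullified by the Casimir $\tilde Q^R \equiv (\tilde P_0^R)^2 - (\tilde P_1^R)^2$ of the Poincaré subgroup. More explicitly, wave functions which fulfill:

$$\tilde Q^R\psi_c^{(\beta)} = 0 \ \Rightarrow\ \left(\frac{(\mu-1)^2}{(\mu-\mu^{-1})}\frac{\partial}{\partial\mu} + p\beta\right)\left(\frac{(\bar\mu-1)^2}{(\bar\mu-\bar\mu^{-1})}\frac{\partial}{\partial\bar\mu} + p\beta\right)\phi(\mu,\bar\mu) = 0 \ \Rightarrow\ \frac{\partial^2\varphi(\mu,\bar\mu)}{\partial\mu\,\partial\bar\mu} = 0, \tag{45}$$

where

$$\phi(\mu,\bar\mu) \equiv \left(\frac{(\mu-1)^2(\bar\mu-1)^2}{\mu\bar\mu}\right)^{-p\beta}\varphi(\mu,\bar\mu). \tag{46}$$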

This Klein–Gordon-like evolution equation (in light-cone-like coordinates) is then interpreted as a constraint in the theory and leads to a new Hilbert space $\mathcal H_c(\tilde G)$ made of constrained wave functions of the form:

$$\psi_c^{(\beta)} = z^pW_\beta\left(\frac{(\mu-1)^2(\bar\mu-1)^2}{\mu\bar\mu}\right)^{-p\beta}\left(\varphi(\mu) + \bar\varphi(\bar\mu)\right), \tag{47}$$

that is, wave functions for which the arbitrary part splits up into functions which depend on $\mu$ and $\bar\mu$ separately (they resemble the standard left- and right-hand moving modes). Since this constraint is imposed by means of generators of the left translation on the group, not all the operators $\tilde X^R_{g^i}$ will preserve this constraint; only the ones called good in the general approach of [ANR, ACG] will do. One can obtain the good operators for condition (45) by looking at the (right) commutators:

$$\begin{aligned}
\left[\tilde D^R, \tilde Q^R\right] &= -2\tilde Q^R,\\
\left[\tilde M^R, \tilde Q^R\right] &= 0,\\
\left[\tilde P_0^R, \tilde Q^R\right] &= 0,\\
\left[\tilde P_1^R, \tilde Q^R\right] &= 0,\\
\left[\tilde K_0^R, \tilde Q^R\right] &= -4\tilde P_0^R\tilde D^R + 4\tilde P_1^R\tilde M^R - 8ip\beta\tilde P_0^R = f_0(\mu,\bar\mu)\,\tilde Q^R - 8ip\beta\tilde P_0^R,\\
\left[\tilde K_1^R, \tilde Q^R\right] &= -4\tilde P_1^R\tilde D^R + 4\tilde P_0^R\tilde M^R - 8ip\beta\tilde P_1^R = f_1(\mu,\bar\mu)\,\tilde Q^R - 8ip\beta\tilde P_1^R
\end{aligned} \tag{48}$$

[$f_\nu(\mu,\bar\mu)$ are some functions on the torus], from which we can conclude that the set of (first-order) good operators is

$$\mathcal G_{good} = \langle \tilde D^R, \tilde M^R, \tilde P_0^R, \tilde P_1^R, \tilde X^R_r, \tilde X^R_\zeta \rangle, \tag{49}$$

and closes a subalgebra (Poincaré + dilatation ≡ Weyl) of the extended conformal Lie algebra in 1+1 dimensions.


The fact that $\tilde K_0^R$ and $\tilde K_1^R$ are bad operators, i.e. they do not preserve the $\mathcal H_c(\tilde G)$ Hilbert space, will be relevant in the “second quantization” of the constrained theory. The new (Weyl) vacuum will no longer be annihilated by the second quantized version of $\tilde K_0^R$ and $\tilde K_1^R$ but, rather, it will appear to be “polarized” from an accelerated frame (see Subsect. 4.2). This way, the profound reason for the rearrangement of the vacuum (under special conformal transformations) in (massless) Quantum Conformal Field Theories is not a singular action of this subgroup on the space-time but, rather, the impossibility of properly implementing these transformations in the constrained Hilbert space $\mathcal H_c(\tilde G)$. Note that the combinations $\tilde A_+^R \equiv \frac{1}{2}(\tilde K_0^R + \tilde K_1^R)$ and $\tilde A_-^R \equiv \frac{1}{2}(\tilde K_0^R - \tilde K_1^R)$ are “partially good”, in the sense that they preserve the left- and right-hand moving modes subspaces, respectively; we shall see (Subsect. 4.2) how their finite action on a Weyl vacuum (in the second quantized theory) gives rise to a thermal bath of left- and right-hand moving scalar photons, respectively. As far as the classical field theories are concerned, the existence of a well defined scalar product does not really matter; condition (43) can be violated by putting $\beta = 0$, thus leading to a reducible representation where the operators $\tilde K_0^R$ and $\tilde K_1^R$ leave the equation $\tilde Q^R\psi_c^{(\beta)} = 0$ invariant, as can be easily checked from the last two commutators in (48). However, for this particular case, the loss of unitarity can give rise to some problems in the quantization procedure, especially concerning the definition of the field propagators in the quantum field theory (see Sect. 4). Thus, for the null mass case, the conformal symmetry is “spontaneously broken” in the sense that it is a symmetry of the classical massless field theory, whereas the corresponding quantum field theory is only invariant under the Weyl subgroup.
The appearance of terms proportional to $\beta$ at the right-hand side of some commutators, as in (48), might be seen as an “anomaly”; however, this time, anomaly does not mean obstruction to quantization but, on the contrary, it is intrinsic to the quantization procedure and necessary for the good behaviour of the theory. Note that for massive field theories the situation is slightly different. The only symmetry which survives (both for classical and quantum theories), after the constraint

$$\tilde Q^R\psi_c^{(\beta)} = D^{(m)}(\tilde Q^R)\,\psi_c^{(\beta)} = m^2\psi_c^{(\beta)} \tag{50}$$

is imposed, is the Poincaré subgroup. Indeed, the dilatation generator is now a bad operator (it does not preserve the constraint (50), as can be seen from the first line of (48)). Its finite action, although bad, is not “so bad”, in the sense that it changes from one representation $D^{(m)}(\tilde Q^R)$ to another $D^{(m')}(\tilde Q^R)$ with $m' = e^{2\lambda}m$, where $\lambda$ is the parameter of the transformation. That is, it plays the role of a “quantization-changing operator” (see [ACG] for other relevant examples), its domain being the union $\bigoplus_{m\in\Re^+}\mathcal H_c^{(m)}(\tilde G)$ of all the constrained Hilbert spaces corresponding to different masses (i.e. a theory with continuum mass spectrum). One can look for a physical interpretation of those facts and say that “quantum conformal fields do not evolve in time”. The representation (44) is irreducible for the whole conformal group, but reducible under the Poincaré + dilatation (Weyl) subgroup. Some external perturbation breaks the conformal symmetry and forces the fields to evolve in time and acquire a fixed value for the mass (we are interested in the massless case), so that these fields carry an irreducible representation of the Poincaré (+ dilatation) subgroup. In this way, the dynamical symmetry breaking and the fixing of the mass, even null, come together.

3.2. Compact dilatation subgroup. It can be proved that, for this case, a $T = U(1)$-pseudo-extension is enough to have a well defined quantum theory. It has the form:


$$\zeta'' = \zeta'\zeta\,e^{i\xi(g',g)} = \zeta'\zeta\left(\eta''\eta'^{-1}\eta^{-1}\,\bar\eta''\bar\eta'^{-1}\bar\eta^{-1}\right)^{-2N}, \tag{51}$$

where $\xi(g',g)$ is the two-cocycle (in fact, coboundary) generated by a multiple of $i(\log\eta + \log\bar\eta)$, and the parameter $N$ labels the irreducible representations and must be quantized, taking the values

$$N = \frac{j}{2}, \qquad j \in Z, \tag{52}$$

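for globality conditions. The extended left- and right-invariant vector fields on $\tilde G$ are:

$$\begin{array}{ll}
\tilde X^L_\eta = X^L_\eta, & \tilde X^R_\eta = X^R_\eta,\\
\tilde X^L_\alpha = X^L_\alpha + N\alpha^*\tilde X^L_\zeta, & \tilde X^R_\alpha = X^R_\alpha - N\eta^{-2}\alpha^*\tilde X^R_\zeta,\\
\tilde X^L_{\alpha^*} = X^L_{\alpha^*} - N\alpha\tilde X^L_\zeta, & \tilde X^R_{\alpha^*} = X^R_{\alpha^*} + N\eta^{2}\alpha\tilde X^R_\zeta,
\end{array} \tag{53}$$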

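and similar expressions for the $\bar\eta, \bar\alpha, \bar\alpha^*$ parameters. The new commutation relations for the extended conformal Lie algebra $\tilde{\mathcal G}$ of $\tilde G$ are two copies of:

$$\begin{aligned}
\left[\tilde X^L_\eta, \tilde X^L_\alpha\right] &= 2\tilde X^L_\alpha,\\
\left[\tilde X^L_\eta, \tilde X^L_{\alpha^*}\right] &= -2\tilde X^L_{\alpha^*},\\
\left[\tilde X^L_\alpha, \tilde X^L_{\alpha^*}\right] &= \tilde X^L_\eta - 2N\tilde X^L_\zeta,\\
\left[\tilde X^L_\zeta, \text{all}\right] &= 0,
\end{aligned} \tag{54}$$

which, expressed in terms of the basis $\{\tilde D^L, \tilde M^L, \tilde P_0^L, \tilde P_1^L, \tilde K_0^L, \tilde K_1^L, \tilde X^L_\zeta\}$, lead now to

$$\begin{aligned}
\left[\tilde P_0^L, \tilde K_0^L\right] &= -2\tilde D^L - 4N\tilde X^L_\zeta,\\
\left[\tilde P_1^L, \tilde K_1^L\right] &= 2\tilde D^L + 4N\tilde X^L_\zeta,
\end{aligned} \tag{55}$$

and the same expression as in (28) for the remainder. The left-invariant 1-form $\Theta$ has now the form:

$$\Theta = \frac{iN}{1-\alpha\alpha^*}\left(4\alpha\alpha^*\eta^{-1}d\eta + \alpha^*d\alpha - \alpha\,d\alpha^*\right) + \frac{iN}{1-\bar\alpha\bar\alpha^*}\left(4\bar\alpha\bar\alpha^*\bar\eta^{-1}d\bar\eta + \bar\alpha^*d\bar\alpha - \bar\alpha\,d\bar\alpha^*\right) - i\zeta^{-1}d\zeta, \tag{56}$$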

the characteristic module $\mathcal G_\Theta$ and the polarization subalgebra having the same content in fields as in the previous section. The polarized $U(1)$-functions (we choose the faithful representation for $U(1)$) have now the form

$$\begin{aligned}
\psi^{(N)}(\eta,\alpha,\alpha^*,\bar\eta,\bar\alpha,\bar\alpha^*,\zeta) &= \zeta\,W_N(\alpha,\alpha^*,\bar\alpha,\bar\alpha^*)\,\phi(s,\bar s),\\
W_N &= w_N(\alpha,\alpha^*)\,w_N(\bar\alpha,\bar\alpha^*),\\
w_N(\alpha,\alpha^*) &= (1-\alpha\alpha^*)^N,
\end{aligned} \tag{57}$$

where $W_N$ is a “generating function” and $\phi$ is an arbitrary power series

$$\phi(s,\bar s) = \sum_{n,\bar n=0}^{\infty} a_{n,\bar n}\,s^n\bar s^{\bar n} \tag{58}$$


in the variables

$$s = \eta^{-2}\alpha^* = \frac{z_2^*}{z_1}, \qquad \bar s = \bar\eta^{-2}\bar\alpha^* = \frac{\bar z_2^*}{\bar z_1}. \tag{59}$$

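Let us show how the conformal group acts on $s, \bar s$ free from singularities. For this, let us proceed as in Eq. (41):

$$\begin{aligned}
s \to s'' &\equiv \frac{z_2^{*\prime\prime}}{z_1''} = \frac{z_2^{*\prime}z_1 + z_1^{*\prime}z_2^*}{z_1'z_1 + z_2'z_2^*} = \frac{s + \alpha^{*\prime}}{\eta'^2(1 + s\alpha')},\\
\bar s \to \bar s'' &\equiv \frac{\bar z_2^{*\prime\prime}}{\bar z_1''} = \frac{\bar z_2^{*\prime}\bar z_1 + \bar z_1^{*\prime}\bar z_2^*}{\bar z_1'\bar z_1 + \bar z_2'\bar z_2^*} = \frac{\bar s + \bar\alpha^{*\prime}}{\bar\eta'^2(1 + \bar s\bar\alpha')}.
\end{aligned} \tag{60}$$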

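This action is always well defined and transitive on this space. The invariant integration volume can be now chosen as $v(\tilde g) = -(2\pi)^{-5}\, v(g)\wedge(i\zeta^{-1}d\zeta)$ and the scalar product of two basic functions $\check\psi^{(N)}_{n,\bar n} \equiv \zeta W_N s^n\bar s^{\bar n}$ and $\check\psi^{(N)}_{m,\bar m} \equiv \zeta W_N s^m\bar s^{\bar m}$ is:

$$\langle\check\psi^{(N)}_{n,\bar n}|\check\psi^{(N)}_{m,\bar m}\rangle = \frac{n!\,(2N-2)!}{(2N+n-1)!}\,\frac{\bar n!\,(2N-2)!}{(2N+\bar n-1)!}\,\delta_{nm}\delta_{\bar n\bar m} = C_n^{(N)} C_{\bar n}^{(N)}\,\delta_{nm}\delta_{\bar n\bar m}, \qquad C_n^{(N)} \equiv \frac{n!\,(2N-2)!}{(2N+n-1)!}, \tag{61}$$

where we are assuming that $N > \frac12$, a necessary condition for having a well defined (finite) scalar product [this condition can be relaxed to $N > 0$ by going to the universal covering group of $G$]. The set

$$B(\mathcal H_N(\tilde G)) = \left\{\psi^{(N)}_{n,\bar n} \equiv \frac{1}{\sqrt{C_n^{(N)} C_{\bar n}^{(N)}}}\,\check\psi^{(N)}_{n,\bar n}\right\} \tag{62}$$

is then orthonormal and complete, i.e. an orthonormal base of $\mathcal H_N(\tilde G)$. The actions of the right-invariant vector fields (operators in the theory) on polarized wave functions (see Eq. (57)) have the explicit form:

$$\begin{aligned}
\tilde D^R\psi^{(N)} &= \zeta W_N\cdot\left(s\frac{\partial}{\partial s} + \bar s\frac{\partial}{\partial\bar s}\right)\phi(s,\bar s),\\
\tilde M^R\psi^{(N)} &= \zeta W_N\cdot\left(-s\frac{\partial}{\partial s} + \bar s\frac{\partial}{\partial\bar s}\right)\phi(s,\bar s),\\
\tilde P_0^R\psi^{(N)} &= \zeta W_N\cdot\left(-\frac{\partial}{\partial s} - \frac{\partial}{\partial\bar s}\right)\phi(s,\bar s),\\
\tilde P_1^R\psi^{(N)} &= \zeta W_N\cdot\left(\frac{\partial}{\partial s} - \frac{\partial}{\partial\bar s}\right)\phi(s,\bar s),\\
\tilde K_0^R\psi^{(N)} &= \zeta W_N\cdot\left(-s^2\frac{\partial}{\partial s} - \bar s^2\frac{\partial}{\partial\bar s} - 2N(s+\bar s)\right)\phi(s,\bar s),\\
\tilde K_1^R\psi^{(N)} &= \zeta W_N\cdot\left(-s^2\frac{\partial}{\partial s} + \bar s^2\frac{\partial}{\partial\bar s} - 2N(s-\bar s)\right)\phi(s,\bar s),\\
\tilde X^L_\zeta\psi^{(N)} &= \psi^{(N)}.
\end{aligned} \tag{63}$$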
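As a consistency check on (63), the following minimal sketch represents $\phi(s,\bar s)$ as a polynomial and applies the differential operators directly. It verifies that $[\tilde P_0^R,\tilde K_0^R] = 2\tilde D^R + 4N$ and $[\tilde P_1^R,\tilde K_1^R] = -2\tilde D^R - 4N$ on the $\phi$ factor, i.e. the relations (55) with the overall sign reversed, which is what one expects when passing from left- to right-invariant fields. The dictionary representation and all helper names (`lin`, `mul`, `comm`, and so on) are illustrative assumptions, not part of the original formalism.

```python
from fractions import Fraction

# A polynomial phi(s, sbar) is stored as {(n, nbar): coefficient}.

def lin(*terms):
    """Linear combination of polynomials given as (coefficient, poly) pairs."""
    out = {}
    for coef, poly in terms:
        for key, c in poly.items():
            out[key] = out.get(key, 0) + coef * c
    return {k: v for k, v in out.items() if v != 0}

def d_s(p):
    """Partial derivative with respect to s."""
    return {(n - 1, nb): n * c for (n, nb), c in p.items() if n > 0}

def d_sb(p):
    """Partial derivative with respect to sbar."""
    return {(n, nb - 1): nb * c for (n, nb), c in p.items() if nb > 0}

def mul(p, dn, dnb):
    """Multiply by the monomial s**dn * sbar**dnb."""
    return {(n + dn, nb + dnb): c for (n, nb), c in p.items()}

# Operators of Eq. (63), acting on the phi factor only.
def D(p, N):
    return lin((1, mul(d_s(p), 1, 0)), (1, mul(d_sb(p), 0, 1)))

def P0(p, N):
    return lin((-1, d_s(p)), (-1, d_sb(p)))

def P1(p, N):
    return lin((1, d_s(p)), (-1, d_sb(p)))

def K0(p, N):
    return lin((-1, mul(d_s(p), 2, 0)), (-1, mul(d_sb(p), 0, 2)),
               (-2 * N, mul(p, 1, 0)), (-2 * N, mul(p, 0, 1)))

def K1(p, N):
    return lin((-1, mul(d_s(p), 2, 0)), (1, mul(d_sb(p), 0, 2)),
               (-2 * N, mul(p, 1, 0)), (2 * N, mul(p, 0, 1)))

def comm(A, B, p, N):
    """Commutator [A, B] applied to the polynomial p."""
    return lin((1, A(B(p, N), N)), (-1, B(A(p, N), N)))

if __name__ == "__main__":
    N = Fraction(3, 2)
    phi = {(3, 2): Fraction(5), (0, 1): Fraction(-2), (0, 0): Fraction(7)}
    print(comm(P0, K0, phi, N) == lin((2, D(phi, N)), (4 * N, phi)))
    print(comm(P1, K1, phi, N) == lin((-2, D(phi, N)), (-4 * N, phi)))
```

Exact rational arithmetic is used so that the comparison is an identity between polynomials rather than a floating-point approximation.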

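The finite (left) action (13) of an arbitrary element $\tilde g' = (\eta', \alpha', \alpha^{*\prime}, \bar\eta', \bar\alpha', \bar\alpha^{*\prime}, \zeta') \in \tilde G$ on an arbitrary wave function

$$\psi^{(N)}(\tilde g) = \sum_{n,\bar n=0}^{\infty} a_{n,\bar n}\,\psi^{(N)}_{n,\bar n}(\tilde g), \tag{64}$$

can be given through the matrix elements $\rho^{(N)}_{mn;\bar m\bar n}(\tilde g') \equiv \langle\psi^{(N)}_{m,\bar m}|\rho(\tilde g')|\psi^{(N)}_{n,\bar n}\rangle$ of $\rho$ in the base $B(\mathcal H_N(\tilde G))$. They have the following expression:

$$\begin{aligned}
\rho^{(N)}_{mn;\bar m\bar n}(\tilde g) &= \zeta^{-1}\rho^{(N)}_{mn}(\eta,\alpha,\alpha^*)\,\rho^{(N)}_{\bar m\bar n}(\bar\eta,\bar\alpha,\bar\alpha^*),\\
\rho^{(N)}_{mn}(\eta,\alpha,\alpha^*) &= \sqrt{\frac{C_m^{(N)}}{C_n^{(N)}}}\sum_{l=\theta_{nm}}^{n}\binom{n}{l}\binom{2N+m+l-1}{m-n+l}(-1)^l\,\eta^{2m}\alpha^l\alpha^{*\,m-n+l}(1-\alpha\alpha^*)^N,
\end{aligned} \tag{65}$$

where the function $\theta_{nm}$ in the lower limit of the last sum is defined by $\theta_{nm} \equiv (n-m)\frac{\mathrm{sign}(n-m)+1}{2}$, the function $\mathrm{sign}(n)$ being the standard sign function ($\mathrm{sign}(0) = 1$); it guarantees the inequality $m-n+l \geq 0$. These expressions will be very useful for the construction of the corresponding quantum field theory in the next section. The constrained wave functions $\psi_c^{(N)}$ obeying

$$\tilde Q^R\psi_c^{(N)} = \left((\tilde P_0^R)^2 - (\tilde P_1^R)^2\right)\psi_c^{(N)} = 0 \;\Rightarrow\; \frac{\partial^2\phi}{\partial s\,\partial\bar s} = 0 \tag{66}$$

have now the form

$$\psi_c^{(N)} = \zeta W_N\cdot\left(\varphi(s) + \bar\varphi(\bar s)\right). \tag{67}$$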

We arrive at the same conclusions as in the non-compact dilatation case, concerning good and bad operators. For this case, $N$ plays the same role as $\beta$ did in the former. Let us investigate the conformal quantum field theory associated with this “first quantized” theory and how to interpret the dynamical symmetry breaking of the conformal group in the context of the corresponding “second quantized” theory. To this end, let us show how this second quantization approach can be discussed within the GAQ framework.

4. “Second Quantization” on a Group $\tilde G$: A Model for a Conformally Invariant QFT

In this subsection we shall develop a general approach to the quantization of linear, complex quantum fields defined on a group manifold $\tilde G$ (more precisely, on the quotient $\tilde G/(T\cup\mathcal P)$). This formalism can be seen as a “second quantization” of a “first quantized” theory defined by a group $\tilde G$ and a Hilbert space $\mathcal H(\tilde G)$ of polarized wave functions.

The construction of the quantizing group $\tilde G^{(2)}$ for this complex quantum field is as follows. Given the Hilbert space $\mathcal H(\tilde G)$ and its dual $\mathcal H^*(\tilde G)$, we define the direct sum

$$\mathcal F(\tilde G) \equiv \mathcal H(\tilde G)\oplus\mathcal H^*(\tilde G) = \left\{|f\rangle = |A\rangle + |B^*\rangle;\ |A\rangle\in\mathcal H(\tilde G),\ |B^*\rangle\in\mathcal H^*(\tilde G)\right\}, \tag{68}$$

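where we have denoted $|B^*\rangle$ according to $\langle\tilde g_{\mathcal P}^*|B^*\rangle \equiv \langle B|\tilde g_{\mathcal P}\rangle = B_{\mathcal P}^*(\tilde g)$. The group $\tilde G$ acts on this vector space as follows:

$$\rho(\tilde g')|f\rangle = \rho(\tilde g')|A\rangle + \rho(\tilde g')|B^*\rangle, \tag{69}$$

where

$$\langle\tilde g_{\mathcal P}^*|\rho(\tilde g')|B^*\rangle \equiv \langle B|\rho^\dagger(\tilde g')|\tilde g_{\mathcal P}\rangle = B_{\mathcal P}^*(\tilde g'^{-1} * \tilde g). \tag{70}$$

We can also define the dual space

$$\mathcal F^*(\tilde G) \equiv \mathcal H^*(\tilde G)\oplus\mathcal H^{**}(\tilde G) = \left\{\langle f| = \langle A| + \langle B^*|;\ \langle A|\in\mathcal H^*(\tilde G),\ \langle B^*|\in\mathcal H^{**}(\tilde G)\sim\mathcal H(\tilde G)\right\}, \tag{71}$$

where $\tilde G$ acts according to the adjoint action

$$\langle f|\rho^\dagger(\tilde g') = \langle A|\rho^\dagger(\tilde g') + \langle B^*|\rho^\dagger(\tilde g') \tag{72}$$

and now

$$\langle B^*|\rho^\dagger(\tilde g')|\tilde g_{\mathcal P}^*\rangle \equiv \langle\tilde g_{\mathcal P}|\rho(\tilde g')|B\rangle. \tag{73}$$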

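Using the closure relation (12), the product of two arbitrary elements of $\mathcal F(\tilde G)$ is

$$\langle f'|f\rangle = \langle A'|A\rangle + \underbrace{\langle A'|B^*\rangle}_{0} + \underbrace{\langle B'^*|A\rangle}_{0} + \langle B'^*|B^*\rangle; \tag{74}$$

indeed, the second and third integrals

$$\int_{\tilde G} v(\tilde g)\,A'^*_{\mathcal P}(\tilde g)\,B^*_{\mathcal P}(\tilde g) = 0 = \int_{\tilde G} v(\tilde g)\,B'_{\mathcal P}(\tilde g)\,A_{\mathcal P}(\tilde g) \tag{75}$$

are zero because of the integration on the central parameter $\zeta\in U(1)$. Thus, the subspaces $\mathcal H(\tilde G)$ and $\mathcal H^*(\tilde G)$ are orthogonal with respect to this scalar product in $\mathcal F(\tilde G)$. A basis for $\mathcal F(\tilde G)$ is provided by the set $\{|n\rangle + |m^*\rangle\}_{n,m\in I}$.

The space $\mathcal M(\tilde G) \equiv \mathcal F(\tilde G)\otimes\mathcal F^*(\tilde G)$ can be endowed with a symplectic structure

$$S(f',f) \equiv \frac{-i}{2}\left(\langle f'|f\rangle - \langle f|f'\rangle\right), \tag{76}$$

thus defining $\mathcal M(\tilde G)$ as a phase space. This phase space can be naturally embedded into a quantizing group

$$\tilde G^{(2)} \equiv \left\{\tilde g^{(2)} = (g^{(2)};\varsigma) \equiv (\tilde g, |f\rangle, \langle f|;\varsigma)\right\}, \tag{77}$$

which is a (true) central extension by $U(1)$, with parameter $\varsigma$, of the semidirect product $G^{(2)} \equiv \tilde G\otimes_\rho\mathcal M(\tilde G)$ of the basic group $\tilde G$ and the phase space $\mathcal M(\tilde G)$. The group law of $\tilde G^{(2)}$ is formally:

$$\begin{aligned}
\tilde g'' &= \tilde g' * \tilde g,\\
|f''\rangle &= |f'\rangle + \rho(\tilde g')|f\rangle,\\
\langle f''| &= \langle f'| + \langle f|\rho^\dagger(\tilde g'),\\
\varsigma'' &= \varsigma'\varsigma\, e^{i\xi^{(2)}(g^{(2)\prime},\,g^{(2)})},
\end{aligned} \tag{78}$$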

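where $\xi^{(2)}(g^{(2)\prime}, g^{(2)})$ is a two-cocycle defined as

$$\xi^{(2)}(g^{(2)\prime}, g^{(2)}) \equiv \kappa\, S(f', \rho(\tilde g')f) \tag{79}$$

and $\kappa$ is intended to kill any possible dimension of $S$. A system of coordinates for $\tilde G^{(2)}$ corresponds to a choice of representation associated with a given polarization $\mathcal P$,

$$f_{\mathcal P}^{(+)}(\tilde g) \equiv \langle\tilde g_{\mathcal P}|f\rangle, \quad f_{\mathcal P}^{(-)}(\tilde g) \equiv \langle\tilde g_{\mathcal P}^*|f\rangle, \quad f_{\mathcal P}^{*(+)}(\tilde g) \equiv \langle f|\tilde g_{\mathcal P}^*\rangle, \quad f_{\mathcal P}^{*(-)}(\tilde g) \equiv \langle f|\tilde g_{\mathcal P}\rangle. \tag{80}$$

This splitting of $f$ is the group generalization of the more standard decomposition of a field in positive and negative frequency parts. If we make use of the closure relation $1 = \int_{\tilde G} v(\tilde g)\{|\tilde g_{\mathcal P}\rangle\langle\tilde g_{\mathcal P}| + |\tilde g_{\mathcal P}^*\rangle\langle\tilde g_{\mathcal P}^*|\}$ for $\mathcal F(\tilde G)$, the explicit expression of the cocycle (79) in this coordinate system (for simplicity, we discard the semidirect action of $\tilde G$),

$$\begin{aligned}
\xi^{(2)}(g^{(2)\prime}, g^{(2)}) = \frac{-i\kappa}{2}\iint_{\tilde G} v(\tilde g')\,v(\tilde g)\Big\{ & f'^{*(-)}_{\mathcal P}(\tilde g')\,\Delta_{\mathcal P}^{(+)}(\tilde g',\tilde g)\,f_{\mathcal P}^{(+)}(\tilde g) - f^{*(-)}_{\mathcal P}(\tilde g')\,\Delta_{\mathcal P}^{(+)}(\tilde g',\tilde g)\,f'^{(+)}_{\mathcal P}(\tilde g)\\
& + f'^{*(+)}_{\mathcal P}(\tilde g')\,\Delta_{\mathcal P}^{(-)}(\tilde g',\tilde g)\,f_{\mathcal P}^{(-)}(\tilde g) - f^{*(+)}_{\mathcal P}(\tilde g')\,\Delta_{\mathcal P}^{(-)}(\tilde g',\tilde g)\,f'^{(-)}_{\mathcal P}(\tilde g)\Big\},
\end{aligned} \tag{81}$$

where

$$\Delta_{\mathcal P}^{(+)}(\tilde g',\tilde g) \equiv \langle\tilde g'_{\mathcal P}|\tilde g_{\mathcal P}\rangle = \sum_{n\in I}\psi_{\mathcal P,n}(\tilde g')\,\psi^*_{\mathcal P,n}(\tilde g), \qquad \Delta_{\mathcal P}^{(-)}(\tilde g',\tilde g) \equiv \langle\tilde g'^*_{\mathcal P}|\tilde g^*_{\mathcal P}\rangle = \Delta_{\mathcal P}^{(+)}(\tilde g,\tilde g'), \tag{82}$$

shows that the vector fields associated with the co-ordinates in (80) are canonically conjugated

$$\left[\tilde X^L_{f^{*(-)}_{\mathcal P}(\tilde g')}, \tilde X^L_{f^{(+)}_{\mathcal P}(\tilde g)}\right] = \kappa\,\Delta_{\mathcal P}^{(+)}(\tilde g',\tilde g)\,\tilde X^L_\varsigma, \qquad \left[\tilde X^L_{f^{*(+)}_{\mathcal P}(\tilde g')}, \tilde X^L_{f^{(-)}_{\mathcal P}(\tilde g)}\right] = \kappa\,\Delta_{\mathcal P}^{(-)}(\tilde g',\tilde g)\,\tilde X^L_\varsigma. \tag{83}$$

Here, the functions $\Delta_{\mathcal P}^{(\pm)}(\tilde g',\tilde g)$ play the role of propagators (central matrices of the cocycle). At this point, we must stress the importance of a well defined scalar product in $\mathcal H(\tilde G)$ as regards the good behaviour of the two-cocycle (81), an essential ingredient in the corresponding QFT. The non-zero value of the central extension parameter of $\tilde G$ (see Eqs. (43), (52) and the comments after Eq. (61)), which prevents the whole conformal group from being an exact symmetry of the massless quantum field theory (remember the comments after Eq. (48)), proves to be an essential prerequisite for a proper definition of the conformal quantum field theory through the group $\tilde G^{(2)}$.

The propagators in two different parametrizations of $\tilde G^{(2)}$, corresponding to two different polarization subalgebras $\mathcal P_1$ and $\mathcal P_2$ of $\tilde{\mathcal G}^L$ (or $U\tilde{\mathcal G}^L$), are related through polarization-changing operators (16) as follows:

$$\begin{aligned}
\Delta^{(\pm)}_{\mathcal P_2}(\tilde h',\tilde h) &= \iint_{\tilde G} v(\tilde g')\,v(\tilde g)\,\Delta^{(\pm)}_{\mathcal P_2\mathcal P_1}(\tilde h',\tilde g')\,\Delta^{(\pm)}_{\mathcal P_1}(\tilde g',\tilde g)\,\Delta^{(\pm)}_{\mathcal P_1\mathcal P_2}(\tilde g,\tilde h),\\
\Delta^{(+)}_{\mathcal P_i\mathcal P_j}(\tilde h,\tilde g) &\equiv \Delta_{\mathcal P_i\mathcal P_j}(\tilde h,\tilde g), \qquad \Delta^{(-)}_{\mathcal P_i\mathcal P_j}(\tilde h,\tilde g) \equiv \Delta_{\mathcal P_i\mathcal P_j}(\tilde g,\tilde h).
\end{aligned} \tag{84}$$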

To apply the GAQ formalism to $\tilde G^{(2)}$, it is appropriate to use a “Fourier-like” parametrization, alternative to the field-like parametrization above [see (80)]. If we denote by

$$a_n \equiv \langle n|f\rangle, \quad b_n \equiv \langle n^*|f\rangle, \quad a_n^* \equiv \langle f|n\rangle, \quad b_n^* \equiv \langle f|n^*\rangle, \tag{85}$$

the Fourier coefficients of the “particle” and the “antiparticle”, a polarization subalgebra $\mathcal P^{(2)}$ for $\tilde G^{(2)}$ is always given by the corresponding left-invariant vector fields $\tilde X^L_{a_n}$, $\tilde X^L_{b_n}$ and the whole Lie algebra $\tilde{\mathcal G}^L$ of $\tilde G$, which is the characteristic subalgebra $\mathcal G_\Theta^{(2)}$ of the second-quantized theory (see the next subsection). The operators of the theory are the right-invariant vector fields of $\tilde G^{(2)}$; in particular, the basic operators are the annihilation operators of particles and anti-particles, $\hat a_n \equiv \tilde X^R_{a_n^*}$, $\hat b_n \equiv \tilde X^R_{b_n^*}$, and the corresponding creation operators $\hat a_n^\dagger \equiv -\frac{1}{\kappa}\tilde X^R_{a_n}$, $\hat b_n^\dagger \equiv -\frac{1}{\kappa}\tilde X^R_{b_n}$. The operators corresponding to the subgroup $\tilde G$ [the second-quantized version $\tilde X^{R(2)}_{\tilde g^j}$ of the first-quantized operators $\tilde X^R_{\tilde g^j}$ in (6)] are written in terms of the basic ones, since they are in the characteristic subalgebra $\mathcal G_\Theta^{(2)}$ of the second-quantized theory.

The group $\tilde G$ plays a key role in picking out a preferred vacuum state and defining the notion of a “particle”, in the same way as the Poincaré group plays a central role in relativistic quantum theories defined on Minkowski space. In general, standard QFT in curved space suffers from the lack of a preferred definition of particles. The infinite-dimensional character of the symplectic solution manifold of a field system is responsible for the existence of an infinite number of unitarily inequivalent irreducible representations of the Heisenberg–Weyl (H-W) relations and there is no criterion to select a preferred vacuum of the corresponding quantum field (see, for example, [W, BD]). This situation is not present in the finite-dimensional case, according to the Stone–von Neumann theorem [St, N]. In our language, the origin of this fact is related to the lack of a characteristic module for the H-W subgroup $\tilde G^{(2)}/\tilde G$ of $\tilde G^{(2)}$; i.e., for the infinite-dimensional H-W group, we can polarize the wave functions in arbitrary, non-equivalent directions. Thus, so long as we can embed the (curved) space $M$ into a given group $\tilde G$, the existence of a characteristic module (generated by $\tilde{\mathcal G}^L$) in the polarization subalgebra helps us in picking out a preferred vacuum state. This vacuum state will be characterized by being annihilated by the right version of the polarization subalgebra dual to $\mathcal P^{(2)}$, i.e., it will be invariant under the action of $\tilde G\subset\tilde G^{(2)}$ and annihilated by the right-invariant vector fields $\tilde X^R_{a_n^*}$, $\tilde X^R_{b_n^*}$. Other vacuum states might be selected as those states being invariant under a subgroup $\tilde G_Q\subset\tilde G$ only, for example, the uniparametric subgroup of time evolution (see e.g. [A] for a discussion of vacuum states in de Sitter space). From our point of view, this situation would correspond to a breakdown of the symmetry and could be understood as a constrained theory of the original one.

Indeed, let us comment on the influence of the constraints in the first quantized theory at the second quantization level. Associated with a constrained wave function satisfying (18), there is a corresponding constrained quantum field subjected to the condition:

$$\mathrm{ad}_{\tilde X^{R(2)}_{\tilde t}}\,\tilde X^R_{|f\rangle} \equiv \left[\tilde X^{R(2)}_{\tilde t}, \tilde X^R_{|f\rangle}\right] = dD_T^{(\epsilon)}(\tilde t)\,\tilde X^R_{|f\rangle}, \tag{86}$$

where $\tilde X^{R(2)}_{\tilde t}$ stands for the “second-quantized version” of $\tilde X^R_{\tilde t}$. It is straightforward to generalize the last condition to higher-order constraints:

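$$\tilde X_1^R\tilde X_2^R\cdots\tilde X_j^R|\psi\rangle = \epsilon|\psi\rangle \;\longrightarrow\; \mathrm{ad}_{\tilde X_1^{R(2)}}\left(\mathrm{ad}_{\tilde X_2^{R(2)}}\left(\cdots\mathrm{ad}_{\tilde X_j^{R(2)}}\left(\tilde X^R_{|f\rangle}\right)\cdots\right)\right) = \epsilon\,\tilde X^R_{|f\rangle}. \tag{87}$$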

The selection of a given Hilbert subspace $\mathcal H^{(\epsilon)}(\tilde G)\subset\mathcal H(\tilde G)$ made of wave functions $\psi_c$ obeying a higher-order constraint $Q\psi_c = \epsilon\psi_c$, where $Q = \tilde X_1^R\tilde X_2^R\cdots\tilde X_j^R$ is some Casimir operator of $\tilde G_Q\subset\tilde G$, manifests, at the second quantization level, as a new (broken) QFT. The vacuum for the new observables of this broken theory (the good operators in (87)) does not have to coincide with the vacuum of the original theory, and the action of the rest of the operators (the bad operators) could make this new vacuum radiate. This is precisely the problem we are involved with, where $Q \equiv \tilde Q^R$ is the Casimir of the Poincaré subgroup inside the conformal group (see later in Sect. 4.2). In general, constraints lead to gauge symmetries in the constrained theory and, also, the property for a subgroup $N\subset\tilde G$ of being gauge is heritable at the second-quantization level.

To conclude this subsection, it is important to note that the representation of $\tilde G$ on $\mathcal M(\tilde G)$ is reducible, but it is irreducible under $\tilde G$ together with the charge conjugation operation $a_n\leftrightarrow b_n$, which could be implemented on $\tilde G^{(2)}$. For simplicity, we have preferred to discard this transformation; however, a treatment including it would be relevant as a revision of the CPT symmetry in quantum field theory. The Noether invariant associated with $\tilde X_\zeta^{R(2)}$ is nothing other than the total electric charge (the total number of particles in the case of a real field $b_n\equiv a_n$) and its central character, inside the “dynamical” group $\tilde G$ of the first-quantized theory, now ensures its conservation under the action of the subgroup $\tilde G\subset\tilde G^{(2)}$. To account for non-abelian charges (iso-spin, color, etc.), a non-abelian structure group $T\subset\tilde G$ is required.

4.1. The case of the conformal group. Let us now apply the GAQ formalism to the centrally extended group $\tilde G^{(2)}$ given through the group law in (78) for the case of $\tilde G = SO(2,2)$ and compact dilatation. We shall consider the case of a real field and we shall use a “Fourier” parametrization in terms of the coefficients $a_{n,\bar n}$ rather than a “field” parametrization in terms of $f_{\mathcal P}(\tilde g)$. The explicit group law is:

$$\begin{aligned}
\tilde g'' &= \tilde g' * \tilde g,\\
a''_{m,\bar m} &= a'_{m,\bar m} + \sum_{n,\bar n=0}^{\infty}\rho^{(N)}_{mn;\bar m\bar n}(\tilde g')\,a_{n,\bar n},\\
a''^*_{m,\bar m} &= a'^*_{m,\bar m} + \sum_{n,\bar n=0}^{\infty}\rho^{(N)*}_{mn;\bar m\bar n}(\tilde g')\,a^*_{n,\bar n},\\
\varsigma'' &= \varsigma'\varsigma\exp\left[\frac{\kappa}{2}\sum_{m,\bar m=0}^{\infty}\sum_{n,\bar n=0}^{\infty}\left(a'^*_{m,\bar m}\,\rho^{(N)}_{mn;\bar m\bar n}(\tilde g')\,a_{n,\bar n} - a'_{m,\bar m}\,\rho^{(N)*}_{mn;\bar m\bar n}(\tilde g')\,a^*_{n,\bar n}\right)\right].
\end{aligned} \tag{88}$$

The left- and right-invariant vector fields (we denote $\partial_{m,\bar m} \equiv \frac{\partial}{\partial a_{m,\bar m}}$, $\partial^*_{m,\bar m} \equiv \frac{\partial}{\partial a^*_{m,\bar m}}$) are:

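$$\begin{aligned}
\tilde X^L_\varsigma &= \tilde X^R_\varsigma = \varsigma\frac{\partial}{\partial\varsigma},\\
\tilde X^L_{a_{n,\bar n}} &= \sum_{m,\bar m=0}^{\infty}\rho^{(N)}_{mn;\bar m\bar n}(\tilde g)\,\partial_{m,\bar m} + \frac{\kappa}{2}\sum_{m,\bar m=0}^{\infty}\rho^{(N)}_{mn;\bar m\bar n}(\tilde g)\,a^*_{m,\bar m}\,\tilde X^L_\varsigma,\\
\tilde X^L_{a^*_{n,\bar n}} &= \sum_{m,\bar m=0}^{\infty}\rho^{(N)*}_{mn;\bar m\bar n}(\tilde g)\,\partial^*_{m,\bar m} - \frac{\kappa}{2}\sum_{m,\bar m=0}^{\infty}\rho^{(N)*}_{mn;\bar m\bar n}(\tilde g)\,a_{m,\bar m}\,\tilde X^L_\varsigma,\\
\tilde D^{L(2)} &= \tilde D^L, \quad \tilde M^{L(2)} = \tilde M^L, \quad \tilde P_0^{L(2)} = \tilde P_0^L, \quad \tilde P_1^{L(2)} = \tilde P_1^L,\\
\tilde K_0^{L(2)} &= \tilde K_0^L, \quad \tilde K_1^{L(2)} = \tilde K_1^L, \quad \tilde X_\zeta^{L(2)} = \tilde X_\zeta^L,\\
\tilde X^R_{a_{n,\bar n}} &= \partial_{n,\bar n} - \frac{\kappa}{2}a^*_{n,\bar n}\,\tilde X^L_\varsigma,\\
\tilde X^R_{a^*_{n,\bar n}} &= \partial^*_{n,\bar n} + \frac{\kappa}{2}a_{n,\bar n}\,\tilde X^L_\varsigma,\\
\tilde D^{R(2)} &= \tilde D^R - \sum_{m,\bar m=0}^{\infty}(m+\bar m)\left(a_{m,\bar m}\partial_{m,\bar m} - a^*_{m,\bar m}\partial^*_{m,\bar m}\right),\\
\tilde M^{R(2)} &= \tilde M^R + \sum_{m,\bar m=0}^{\infty}(m-\bar m)\left(a_{m,\bar m}\partial_{m,\bar m} - a^*_{m,\bar m}\partial^*_{m,\bar m}\right),\\
\tilde P_0^{R(2)} &= \tilde P_0^R + \sum_{m,\bar m=0}^{\infty}\Big[\sqrt{(m+1)(2N+m)}\left(a_{m+1,\bar m}\partial_{m,\bar m} - a^*_{m,\bar m}\partial^*_{m+1,\bar m}\right)\\
&\qquad\qquad + \sqrt{(\bar m+1)(2N+\bar m)}\left(a_{m,\bar m+1}\partial_{m,\bar m} - a^*_{m,\bar m}\partial^*_{m,\bar m+1}\right)\Big],\\
\tilde P_1^{R(2)} &= \tilde P_1^R - \sum_{m,\bar m=0}^{\infty}\Big[\sqrt{(m+1)(2N+m)}\left(a_{m+1,\bar m}\partial_{m,\bar m} - a^*_{m,\bar m}\partial^*_{m+1,\bar m}\right)\\
&\qquad\qquad - \sqrt{(\bar m+1)(2N+\bar m)}\left(a_{m,\bar m+1}\partial_{m,\bar m} - a^*_{m,\bar m}\partial^*_{m,\bar m+1}\right)\Big],\\
\tilde K_0^{R(2)} &= \tilde K_0^R + \sum_{m,\bar m=0}^{\infty}\Big[\sqrt{(m+1)(2N+m)}\left(a_{m,\bar m}\partial_{m+1,\bar m} - a^*_{m+1,\bar m}\partial^*_{m,\bar m}\right)\\
&\qquad\qquad + \sqrt{(\bar m+1)(2N+\bar m)}\left(a_{m,\bar m}\partial_{m,\bar m+1} - a^*_{m,\bar m+1}\partial^*_{m,\bar m}\right)\Big],\\
\tilde K_1^{R(2)} &= \tilde K_1^R + \sum_{m,\bar m=0}^{\infty}\Big[\sqrt{(m+1)(2N+m)}\left(a_{m,\bar m}\partial_{m+1,\bar m} - a^*_{m+1,\bar m}\partial^*_{m,\bar m}\right)\\
&\qquad\qquad - \sqrt{(\bar m+1)(2N+\bar m)}\left(a_{m,\bar m}\partial_{m,\bar m+1} - a^*_{m,\bar m+1}\partial^*_{m,\bar m}\right)\Big],\\
\tilde X_\zeta^{R(2)} &= \tilde X_\zeta^R - \sum_{m,\bar m=0}^{\infty}\left(a_{m,\bar m}\partial_{m,\bar m} - a^*_{m,\bar m}\partial^*_{m,\bar m}\right).
\end{aligned} \tag{89}$$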

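The non-trivial commutators between those vector fields are:

$$\begin{aligned}
\left[\tilde X^L_{a_{n,\bar n}}, \tilde X^L_{a^*_{m,\bar m}}\right] &= -\kappa\,\delta_{nm}\delta_{\bar n\bar m}\,\tilde X^L_\varsigma,\\
\left[\tilde D^{L(2)}, \tilde X^L_{a_{n,\bar n}}\right] &= -(n+\bar n)\,\tilde X^L_{a_{n,\bar n}},\\
\left[\tilde M^{L(2)}, \tilde X^L_{a_{n,\bar n}}\right] &= (n-\bar n)\,\tilde X^L_{a_{n,\bar n}},\\
\left[\tilde P_0^{L(2)}, \tilde X^L_{a_{n,\bar n}}\right] &= \sqrt{n(2N+n-1)}\,\tilde X^L_{a_{n-1,\bar n}} + \sqrt{\bar n(2N+\bar n-1)}\,\tilde X^L_{a_{n,\bar n-1}},\\
\left[\tilde P_1^{L(2)}, \tilde X^L_{a_{n,\bar n}}\right] &= -\sqrt{n(2N+n-1)}\,\tilde X^L_{a_{n-1,\bar n}} + \sqrt{\bar n(2N+\bar n-1)}\,\tilde X^L_{a_{n,\bar n-1}},\\
\left[\tilde K_0^{L(2)}, \tilde X^L_{a_{n,\bar n}}\right] &= \sqrt{(n+1)(2N+n)}\,\tilde X^L_{a_{n+1,\bar n}} + \sqrt{(\bar n+1)(2N+\bar n)}\,\tilde X^L_{a_{n,\bar n+1}},\\
\left[\tilde K_1^{L(2)}, \tilde X^L_{a_{n,\bar n}}\right] &= \sqrt{(n+1)(2N+n)}\,\tilde X^L_{a_{n+1,\bar n}} - \sqrt{(\bar n+1)(2N+\bar n)}\,\tilde X^L_{a_{n,\bar n+1}},\\
\left[\tilde X_\zeta^{L(2)}, \tilde X^L_{a_{n,\bar n}}\right] &= \tilde X^L_{a_{n,\bar n}},\\
\left[\tilde D^{L(2)}, \tilde X^L_{a^*_{n,\bar n}}\right] &= (n+\bar n)\,\tilde X^L_{a^*_{n,\bar n}},\\
\left[\tilde M^{L(2)}, \tilde X^L_{a^*_{n,\bar n}}\right] &= -(n-\bar n)\,\tilde X^L_{a^*_{n,\bar n}},\\
\left[\tilde P_0^{L(2)}, \tilde X^L_{a^*_{n,\bar n}}\right] &= -\sqrt{(n+1)(2N+n)}\,\tilde X^L_{a^*_{n+1,\bar n}} - \sqrt{(\bar n+1)(2N+\bar n)}\,\tilde X^L_{a^*_{n,\bar n+1}},\\
\left[\tilde P_1^{L(2)}, \tilde X^L_{a^*_{n,\bar n}}\right] &= \sqrt{(n+1)(2N+n)}\,\tilde X^L_{a^*_{n+1,\bar n}} - \sqrt{(\bar n+1)(2N+\bar n)}\,\tilde X^L_{a^*_{n,\bar n+1}},\\
\left[\tilde K_0^{L(2)}, \tilde X^L_{a^*_{n,\bar n}}\right] &= -\sqrt{n(2N+n-1)}\,\tilde X^L_{a^*_{n-1,\bar n}} - \sqrt{\bar n(2N+\bar n-1)}\,\tilde X^L_{a^*_{n,\bar n-1}},\\
\left[\tilde K_1^{L(2)}, \tilde X^L_{a^*_{n,\bar n}}\right] &= -\sqrt{n(2N+n-1)}\,\tilde X^L_{a^*_{n-1,\bar n}} + \sqrt{\bar n(2N+\bar n-1)}\,\tilde X^L_{a^*_{n,\bar n-1}},\\
\left[\tilde X_\zeta^{L(2)}, \tilde X^L_{a^*_{n,\bar n}}\right] &= -\tilde X^L_{a^*_{n,\bar n}},
\end{aligned} \tag{90}$$

where we have omitted the commutators corresponding to the extended conformal subgroup, which have the same form as in (28), except for the two commutators in (55). The quantization 1-form and the characteristic module are:

$$\Theta^{(2)} = \frac{i\kappa}{2}\sum_{n,\bar n=0}^{\infty}\left(a_{n,\bar n}\,da^*_{n,\bar n} - a^*_{n,\bar n}\,da_{n,\bar n}\right) - i\varsigma^{-1}d\varsigma, \qquad \mathcal G_\Theta^{(2)} = \langle\tilde D^{L(2)}, \tilde M^{L(2)}, \tilde P_0^{L(2)}, \tilde P_1^{L(2)}, \tilde K_0^{L(2)}, \tilde K_1^{L(2)}, \tilde X_\zeta^{L(2)}\rangle. \tag{91}$$

A full polarization subalgebra is:

$$\mathcal P^{(2)} = \langle\mathcal G_\Theta^{(2)}, \tilde X^L_{a_{n,\bar n}}\rangle, \quad \forall n,\bar n\geq 0, \tag{92}$$

and the polarized $U(1)$-functions have the form:

$$\Psi[a, a^*, \tilde g, \varsigma] = \varsigma\exp\left\{-\frac{\kappa}{2}\sum_{n,\bar n=0}^{\infty}a^*_{n,\bar n}a_{n,\bar n}\right\}\Phi[a^*] \equiv \varsigma\,\Omega\,\Phi[a^*], \tag{93}$$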

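where $\Omega$ is the vacuum of the second quantized theory and $\Phi$ is an arbitrary power series in its argument. The actions of the right-invariant vector fields (operators in the second-quantized theory) on polarized wave functions in (93) have the explicit form:

$$\begin{aligned}
\tilde X^R_{a_{n,\bar n}}\Psi &= \varsigma\,\Omega\times(-\kappa a^*_{n,\bar n})\Phi \equiv \varsigma\,\Omega\times(-\kappa\hat a^\dagger_{n,\bar n})\Phi,\\
\tilde X^R_{a^*_{n,\bar n}}\Psi &= \varsigma\,\Omega\times(\partial^*_{n,\bar n})\Phi \equiv \varsigma\,\Omega\times(\hat a_{n,\bar n})\Phi,\\
\tilde D^{R(2)}\Psi &= \varsigma\,\Omega\times\left(\sum_{n,\bar n=0}^{\infty}(n+\bar n)\,\hat a^\dagger_{n,\bar n}\hat a_{n,\bar n}\right)\Phi,\\
\tilde M^{R(2)}\Psi &= \varsigma\,\Omega\times\left(-\sum_{n,\bar n=0}^{\infty}(n-\bar n)\,\hat a^\dagger_{n,\bar n}\hat a_{n,\bar n}\right)\Phi,\\
\tilde P_0^{R(2)}\Psi &= \varsigma\,\Omega\times\left(-\sum_{n,\bar n=0}^{\infty}\left[\sqrt{(n+1)(2N+n)}\,\hat a^\dagger_{n,\bar n}\hat a_{n+1,\bar n} + \sqrt{(\bar n+1)(2N+\bar n)}\,\hat a^\dagger_{n,\bar n}\hat a_{n,\bar n+1}\right]\right)\Phi,\\
\tilde P_1^{R(2)}\Psi &= \varsigma\,\Omega\times\left(\sum_{n,\bar n=0}^{\infty}\left[\sqrt{(n+1)(2N+n)}\,\hat a^\dagger_{n,\bar n}\hat a_{n+1,\bar n} - \sqrt{(\bar n+1)(2N+\bar n)}\,\hat a^\dagger_{n,\bar n}\hat a_{n,\bar n+1}\right]\right)\Phi,\\
\tilde K_0^{R(2)}\Psi &= \varsigma\,\Omega\times\left(-\sum_{n,\bar n=0}^{\infty}\left[\sqrt{(n+1)(2N+n)}\,\hat a^\dagger_{n+1,\bar n}\hat a_{n,\bar n} + \sqrt{(\bar n+1)(2N+\bar n)}\,\hat a^\dagger_{n,\bar n+1}\hat a_{n,\bar n}\right]\right)\Phi,\\
\tilde K_1^{R(2)}\Psi &= \varsigma\,\Omega\times\left(-\sum_{n,\bar n=0}^{\infty}\left[\sqrt{(n+1)(2N+n)}\,\hat a^\dagger_{n+1,\bar n}\hat a_{n,\bar n} - \sqrt{(\bar n+1)(2N+\bar n)}\,\hat a^\dagger_{n,\bar n+1}\hat a_{n,\bar n}\right]\right)\Phi,\\
\tilde X_\zeta^{R(2)}\Psi &= \varsigma\,\Omega\times\left(\sum_{n,\bar n=0}^{\infty}\hat a^\dagger_{n,\bar n}\hat a_{n,\bar n}\right)\Phi,
\end{aligned} \tag{94}$$

where $\hat a_{n,\bar n}$ and $\hat a^\dagger_{n,\bar n}$ are interpreted as annihilation and creation operators of modes $|n;\bar n\rangle$, $\tilde D^{R(2)}$ is attached to the total energy (remember that the dilatation parameter plays the role of a proper time), and $\tilde X_\zeta^{L(2)}$ corresponds with the number operator. It should be mentioned that all those quantities appear, in a natural way, normally ordered; this is one of the advantages of this method of quantization: normal order does not have to be imposed by hand but, rather, it is implicitly inside the formalism itself.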
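The square-root coefficients that multiply the ladder terms in (90) and (94) can be traced back to the first-quantized operators (63) and the normalization constants (61). The sketch below, restricted to one chirality, takes $\psi_n = s^n/\sqrt{C_n^{(N)}}$, applies $\tilde P_0^R = -\partial/\partial s$ and $\tilde K_0^R = -s^2\partial/\partial s - 2Ns$, and checks numerically that the resulting coefficients are $-\sqrt{n(2N+n-1)}$ and $-\sqrt{(n+1)(2N+n)}$. The function names are hypothetical, and the factorials of (61) are continued to non-integer $2N$ with the gamma function, which is an assumption of the sketch.

```python
import math

def C(n, N):
    """C_n^(N) of Eq. (61), with factorials written as gamma functions (needs N > 1/2)."""
    return math.factorial(n) * math.gamma(2 * N - 1) / math.gamma(2 * N + n)

def lowering_coeff(n, N):
    """Coefficient of psi_{n-1} in P0 psi_n, since -d/ds s^n = -n s^(n-1)."""
    return -n * math.sqrt(C(n - 1, N) / C(n, N))

def raising_coeff(n, N):
    """Coefficient of psi_{n+1} in K0 psi_n, since (-s^2 d/ds - 2N s) s^n = -(n + 2N) s^(n+1)."""
    return -(n + 2 * N) * math.sqrt(C(n + 1, N) / C(n, N))

def p0_k0_commutator(n, N):
    """Diagonal entry of [P0, K0] on psi_n for one chirality."""
    up_then_down = lowering_coeff(n + 1, N) * raising_coeff(n, N)
    down_then_up = raising_coeff(n - 1, N) * lowering_coeff(n, N) if n > 0 else 0.0
    return up_then_down - down_then_up

if __name__ == "__main__":
    N = 0.75
    for n in range(1, 4):
        print(n, lowering_coeff(n, N), -math.sqrt(n * (2 * N + n - 1)))
    print(p0_k0_commutator(3, N), 2 * 3 + 2 * N)
```

The last function reproduces the single-chirality value $2n + 2N$ of the commutator, which adds up to $2(n+\bar n) + 4N$ once both chiralities are included.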

We can think of the Hilbert space as composed of modes:
1. pure non-bar $|n_1, n_2, \ldots; 0\rangle$,
2. pure bar $|0; \bar n_1, \bar n_2, \ldots\rangle$,
3. mixed $|n_1, n_2, \ldots; \bar n_1, \bar n_2, \ldots\rangle$.

4.2. Breaking down to the Weyl subgroup. Vacuum radiation. In this subsection, we investigate the effect of SCT on a Weyl vacuum, i.e. a vacuum of the massless QFT obtained after constraining the conformal quantum field theory developed in the last subsection. The field degrees of freedom of the massless field are obtained by translating the condition (66) to the second quantization level, according to the general procedure (87), that is, by imposing

$$\left[\tilde P_0^{R(2)} + \tilde P_1^{R(2)}, \left[\tilde P_0^{R(2)} - \tilde P_1^{R(2)}, \tilde X^R_{a_{n,\bar n}}\right]\right] = -4\sqrt{n(2N+n-1)}\sqrt{\bar n(2N+\bar n-1)}\;\tilde X^L_{a_{n-1,\bar n-1}} = 0, \tag{95}$$

which selects the pure non-bar and pure bar operators, i.e., $\hat a^\dagger_{n,0} = -\frac{1}{\kappa}\tilde X^R_{a_{n,0}}$ and $\hat a^\dagger_{0,\bar n} = -\frac{1}{\kappa}\tilde X^R_{a_{0,\bar n}}$. These operators, together with the Weyl generators (good operators of the first-quantized theory), close a Lie subalgebra

$$\mathcal G_c^{(2)} = \langle\tilde D^{R(2)}, \tilde M^{R(2)}, \tilde P_0^{R(2)}, \tilde P_1^{R(2)}, \tilde X_\zeta^{L(2)}, \hat a^\dagger_{n,0}, \hat a^\dagger_{0,\bar n}\rangle \tag{96}$$

of the original Lie algebra of the conformal quantum field. The vacuum of this constrained theory does not have to coincide with the conformal vacuum $|0\rangle = |n=0;\bar n=0\rangle$. In fact, any conformal state made up of an arbitrary content of zero modes

$$|W\{\sigma\}\rangle \equiv \sum_{q=0}^{\infty}\sigma_q\,(\hat a^\dagger_{0,0})^q|0\rangle \tag{97}$$

behaves as a vacuum from the point of view of a Weyl observer, that is, it is annihilated by the Weyl generators and the destruction operators $\hat a_{n,0}$ and $\hat a_{0,\bar n}$, for all $n, \bar n\in\mathbb N - \{0\}$. Note that, since the operator $\hat a^\dagger_{0,0}$ is central in $\mathcal G_c^{(2)}$ (it commutes with all the others), it would be too restrictive to require (97) being nullified by $\hat a_{0,0}$; the only solution would be the conformal vacuum $|0\rangle$. It is then natural to demand that $\hat a_{0,0}$ behave as a multiple $\vartheta$ of the identity, that is, it has to leave the Weyl vacuum stable

$$\hat a_{0,0}|W\{\sigma\}\rangle = \vartheta|W\{\sigma\}\rangle \;\Rightarrow\; \sigma_q^{(0)} = \frac{\vartheta^q}{q!}\sigma_0, \tag{98}$$

a condition which, after normalizing, determines the Weyl vacuum up to a complex parameter $\vartheta$

$$\langle W\{\sigma^{(0)}\}|W\{\sigma^{(0)}\}\rangle = 1 \;\Rightarrow\; |\sigma_0| = e^{-\frac12|\vartheta|^2}. \tag{99}$$
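The conditions (98) and (99) can be checked on a truncated zero-mode Fock space. The sketch below assumes the canonical normalization $[\hat a_{0,0}, \hat a^\dagger_{0,0}] = 1$, so that $\langle 0|\hat a_{0,0}^q(\hat a^\dagger_{0,0})^q|0\rangle = q!$, and stores a state of the form (97) as its list of coefficients $\sigma_q$. The function names and the truncation order are illustrative choices only.

```python
import math

def weyl_coeffs(theta, qmax):
    """Coefficients sigma_q = theta**q * sigma_0 / q! of Eq. (98), with |sigma_0| from Eq. (99)."""
    sigma0 = math.exp(-abs(theta) ** 2 / 2)
    return [sigma0 * theta ** q / math.factorial(q) for q in range(qmax + 1)]

def norm_squared(sigma):
    """<W|W> = sum_q |sigma_q|^2 q!, using <0| a^q (a^dagger)^q |0> = q!."""
    return sum(abs(s) ** 2 * math.factorial(q) for q, s in enumerate(sigma))

def annihilate(sigma):
    """Coefficients of a|W>, using a (a^dagger)^q |0> = q (a^dagger)^(q-1) |0>."""
    return [(q + 1) * sigma[q + 1] for q in range(len(sigma) - 1)]

if __name__ == "__main__":
    theta = 0.7 + 0.4j
    sigma = weyl_coeffs(theta, 40)
    print(abs(norm_squared(sigma) - 1.0) < 1e-12)
    lowered = annihilate(sigma)
    print(all(abs(lowered[q] - theta * sigma[q]) < 1e-12 for q in range(len(lowered))))
```

Up to the truncation, the state is normalized and is an eigenstate of the zero-mode annihilation operator with eigenvalue $\vartheta$, i.e. a coherent state of zero modes.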

Thus, we have found a set of Weyl vacua (coherent states of the conformal quantum field, made of zero modes)

$$|0\rangle_\vartheta \equiv e^{-\frac12|\vartheta|^2}\,e^{\vartheta\hat a^\dagger_{0,0}}|0\rangle, \tag{100}$$

labeled by $\vartheta$ [the existence of a degenerate ground state resembles the “θ-vacuum” phenomenon in Yang-Mills field theories [JcRb-CDG] and, in general, it is present whenever we deal with non-simply connected phase spaces and constrained theories [ACG]]. As the final result is independent of $\vartheta$, from now on we shall implicitly choose $\vartheta = 1$, for the sake of simplicity. An orthonormal basis for the Hilbert space of the constrained theory can be obtained by taking the orbit through the vacuum (100) of the creation operators as follows:

$$|m(n_1),\ldots,m(n_q), m(\bar n_1),\ldots,m(\bar n_j)\rangle_\vartheta \equiv \frac{(\hat a^\dagger_{n_1,0})^{m(n_1)}\cdots(\hat a^\dagger_{n_q,0})^{m(n_q)}\,(\hat a^\dagger_{0,\bar n_1})^{m(\bar n_1)}\cdots(\hat a^\dagger_{0,\bar n_j})^{m(\bar n_j)}}{\left(m(n_1)!\cdots m(n_q)!\,m(\bar n_1)!\cdots m(\bar n_j)!\right)^{1/2}}\,|0\rangle_\vartheta. \tag{101}$$

We can make a comparison with the standard case of a massless field in 1+1 dimensional Minkowski space-time and relate the non-bar and bar modes to the left-hand and right-hand moving scalar photons, respectively. Let us introduce dimensions through the Planck constant $h$ and the frequency mode $\nu$, so that the total energy is given by

$$\hat E \equiv h\nu\left(\tilde D^{R(2)} + 2N\tilde X_\zeta^{L(2)}\right) \equiv h\nu\,D^{R(2)}; \tag{102}$$

the last redefinition of the dilatation generator is intended to render the commutation relations (55) to the usual ones (28) by destroying the pseudo-extension (51). The expected value of the energy in the general state (101) is

$$\langle\hat E\rangle = h\nu\left(\sum_{l=1}^{q}m(n_l)\,n_l + \sum_{l=1}^{j}m(\bar n_l)\,\bar n_l + 2N\right), \tag{103}$$

where $E_0 \equiv 2Nh\nu$ represents the zero point energy, i.e. the expected value of the energy in the Weyl vacuum. Zero modes represent virtual particles (they have no energy and cannot be detected by a Weyl observer) and can be spontaneously created from the Weyl vacuum, as can be deduced from the condition (98).

It is natural to think that zero modes will play an important role in the radiation of a Weyl vacuum, as they will be made real by acceleration. In fact, let us show how a finite special conformal transformation, generated by $A_+^{(2)} \equiv \frac12(\tilde K_0^{L(2)} + \tilde K_1^{L(2)})$, applied to a Weyl vacuum gives rise to a “thermal bath” of no-bar modes (left-hand moving scalar photons), whereas the combination $A_-^{(2)} \equiv \frac12(\tilde K_0^{L(2)} - \tilde K_1^{L(2)})$ gives rise to a “thermal bath” of bar modes (right-hand moving scalar photons). The finite action of $A_+^{(2)}$, with parameter $\alpha$ (the corresponding acceleration is $a \equiv -(2\pi)^2\frac{c\nu}{\log|\alpha|^2}$, where $c$ is the speed of light), on the Fourier parameter,

$$a^*_{0,0} \to a'^*_{0,0} = \sum_{n=0}^{\infty}(-1)^n\sqrt{\frac{C_0^{(N)}}{C_n^{(N)}}}\,a^*_{n,0}\,\alpha^n = \sum_{n=0}^{\infty}r_n\,a^*_{n,0}\,\alpha^n, \qquad r_n \equiv (-1)^n\sqrt{\frac{C_0^{(N)}}{C_n^{(N)}}} \tag{104}$$

(according to the general expression in the third line of (88) and the last equality in Eq. (65)), leads to the following transformation on the Weyl vacuum:

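$$|0\rangle_\vartheta \to |\Psi(\alpha)\rangle_\vartheta \equiv e^{-\frac12}\,e^{\hat a'^{\dagger}_{0,0}}|0\rangle = \sum_{q=0}^{\infty}\alpha^q\sum_{\substack{m_1,\ldots,m_q:\\ \sum_{n=0}^{q}n\,m_n = q}}\ \prod_{n=0}^{q}\frac{r_n^{m_n}}{m_n!}\,\prod_{n=0}^{q}(\hat a^\dagger_{n,0})^{m_n}|0\rangle_\vartheta, \tag{105}$$

where $m_0 = 0$ and we have used the general identity

$$\left(\sum_{n=0}^{\infty}\gamma_n\alpha^n\right)^{l} = \sum_{q=0}^{\infty}\delta_q\,\alpha^q, \qquad \delta_0 = \gamma_0^{\,l}, \quad \delta_q = \frac{1}{q\gamma_0}\sum_{s=1}^{q}(sl - q + s)\,\gamma_s\,\delta_{q-s}. \tag{106}$$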
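The recursion in (106) can be tested against a direct expansion. The sketch below reads the coefficient in (106) as $(sl - q + s)$, with $l$ the exponent on the left-hand side (an assumption about the intended notation), and compares the recursive coefficients $\delta_q$ with those obtained by multiplying the truncated series by itself $l$ times. Both function names are hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def series_power(gamma, l, qmax):
    """Coefficients delta_0..delta_qmax of (sum_n gamma_n a^n)**l via the recursion (106).

    Requires gamma[0] != 0; coefficients beyond len(gamma) are taken as zero.
    """
    delta = [Fraction(gamma[0]) ** l]
    for q in range(1, qmax + 1):
        acc = Fraction(0)
        for s in range(1, q + 1):
            g = gamma[s] if s < len(gamma) else 0
            acc += (s * l - q + s) * g * delta[q - s]
        delta.append(acc / (q * gamma[0]))
    return delta

def series_power_direct(gamma, l, qmax):
    """Same coefficients by repeated truncated polynomial multiplication."""
    result = [Fraction(1)] + [Fraction(0)] * qmax
    for _ in range(l):
        new = [Fraction(0)] * (qmax + 1)
        for i, a in enumerate(result):
            for j, g in enumerate(gamma):
                if i + j <= qmax:
                    new[i + j] += a * g
        result = new
    return result

if __name__ == "__main__":
    gamma = [Fraction(2), Fraction(-1), Fraction(3), Fraction(1, 2)]
    print(series_power(gamma, 4, 8) == series_power_direct(gamma, 4, 8))
```

The recursion follows from differentiating $B = A^l$, which gives $A\,B' = l\,A'\,B$, and comparing coefficients of $\alpha^{q-1}$.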

The relative probability of observing a state with total energy $E_q = h\nu q + E_0$ in a Weyl vacuum from an accelerated frame (i.e. in $|\Psi(\alpha)\rangle_\vartheta$) is

$$P_q = \Lambda(E_q)\,(|\alpha|^2)^q, \qquad \Lambda(E_q) \equiv \sum_{\substack{m_1,\ldots,m_q:\\ \sum_{n=0}^{q}n\,m_n = q}}\ \prod_{n=0}^{q}\frac{r_n^{2m_n}}{m_n!}. \tag{107}$$

We can associate a thermal bath with this distribution function by noticing that $\Lambda(E_q)$ represents a relative weight proportional to the number of states with energy $E_q$, and the factor $(|\alpha|^2)^q$ fits this weight properly to a temperature as

$$(|\alpha|^2)^q = e^{q\log|\alpha|^2} = e^{-\frac{E_q - E_0}{kT}}, \quad\text{where}\quad T \equiv -\frac{h\nu}{k\log|\alpha|^2} = \frac{\hbar a}{2\pi ck} \tag{108}$$

is the temperature associated with a given acceleration $a$, and $k$ is Boltzmann’s constant. This simple, but profound, relation between temperature and acceleration was first considered by Unruh [U]. The balance between the “multiplicity factor” $\Lambda(E_q)$ (an increasing function of the energy) and the temperature factor (108) (a decreasing function of the energy) is favorable (maximum) for a given system of this canonical ensemble, the energy of which is a representative value of the mean energy. In fact, this mean energy can be calculated exactly as the expected value of the energy operator $\hat E$ in the state $|\Psi(\alpha)\rangle_\vartheta$. To this end, let us perform some intermediate calculations. The norm of this accelerated vacuum is

$$\mathrm{Nor}[\Psi(\alpha)] \equiv {}_\vartheta\langle\Psi(\alpha)|\Psi(\alpha)\rangle_\vartheta = \exp\left(-1 + \sum_{n=0}^{\infty}r_n^2|\alpha|^{2n}\right) = \exp\left((1-|\alpha|^2)^{-2N} - 1\right). \tag{109}$$

The probability $P_n(m)$ of observing $m$ particles with energy $E_n$ coincides with the expected value of the projector $\hat P_n(m)$ on the state $|m(n)\rangle_\vartheta$, i.e.:

$$P_n(m) \equiv \frac{{}_\vartheta\langle\Psi(\alpha)|\hat P_n(m)|\Psi(\alpha)\rangle_\vartheta}{\mathrm{Nor}[\Psi(\alpha)]} = \frac{1}{\mathrm{Nor}[\Psi(\alpha)]}\,\frac{(r_n^2|\alpha|^{2n})^m}{m!}; \tag{110}$$

it can be seen that the closure relation $\sum_{n,m=0}^{\infty}P_n(m) = 1$ is in fact verified. The mean number $N_n$ of left-hand moving scalar photons with energy $E_n$ corresponds with the expected value of the number operator $\hat N_n \equiv \hat a^\dagger_{n,0}\hat a_{n,0}$, i.e.:

$$N_n = \sum_{m=0}^{\infty}m\,P_n(m) = \frac{1}{\mathrm{Nor}[\Psi(\alpha)]}\,r_n^2|\alpha|^{2n}\exp\left(r_n^2|\alpha|^{2n}\right). \tag{111}$$
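Two series identities sit behind the closed forms used in this calculation: $\sum_n r_n^2 x^n = (1-x)^{-2N}$, which gives the exponent in (109), and $\sum_n n\,r_n^2 x^n = 2Nx\,(1-x)^{-2N-1}$, which has the same form as the energy sum with $x = |\alpha|^2$. The sketch below checks both numerically, using $r_n^2 = C_0^{(N)}/C_n^{(N)} = \Gamma(2N+n)/(n!\,\Gamma(2N))$ from (61) and (104). Weighting mode $n$ by $r_n^2 x^n$ in the second sum is an assumption of this sketch about how the energy sum is evaluated; the function names and the truncation order are illustrative.

```python
import math

def r_squared(n, N):
    """r_n^2 = C_0 / C_n = Gamma(2N + n) / (n! Gamma(2N)), computed via log-gamma."""
    return math.exp(math.lgamma(2 * N + n) - math.lgamma(n + 1) - math.lgamma(2 * N))

def norm_exponent_series(x, N, terms=400):
    """Truncated sum_n r_n^2 x^n, to be compared with (1 - x)**(-2N)."""
    return sum(r_squared(n, N) * x ** n for n in range(terms))

def energy_series(x, N, terms=400):
    """Truncated sum_n n r_n^2 x^n, to be compared with 2N x (1 - x)**(-2N - 1)."""
    return sum(n * r_squared(n, N) * x ** n for n in range(terms))

if __name__ == "__main__":
    N, x = 0.75, 0.5
    print(norm_exponent_series(x, N), (1 - x) ** (-2 * N))
    print(energy_series(x, N), 2 * N * x * (1 - x) ** (-2 * N - 1))
```

The second identity is the first one acted on by $x\,d/dx$, which is why the exponent $2N+1$ appears in the denominator of the mean energy.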

With this information at hand, we can calculate the expected value of the total energy as:

$$E[\Psi(\alpha)] \equiv E_0 + h\nu\sum_{n=1}^{\infty}n\,N_n = E_0 + \frac{2Nh\nu|\alpha|^2}{(1-|\alpha|^2)^{2N+1}}. \tag{112}$$

If we subtract the zero-point energy, normalize by a $2N$ factor (this normalization can be seen as a reparametrization of the proper time) and make use of the relation $|\alpha|^2 = e^{-\frac{h\nu}{kT}}$ in (108), we obtain a more “familiar” expression for the mean energy per mode

$$\epsilon_N(\nu, T) = \frac{h\nu\,e^{-\frac{h\nu}{kT}}}{\left(1 - e^{-\frac{h\nu}{kT}}\right)^{2N+1}}, \tag{113}$$

the value $N = 0$ corresponding with the well known case of the Bose-Einstein statistic. Note that this particular value of $N$ can be reached only as a limiting process formulated on the universal covering of $SU(1,1)\otimes SU(1,1)$ or, equivalently, by uncompactifying the proper time ($U(1)\to\mathbb R$).

Fig. 1. Departure from the Planck’s spectrum (thickest line) for increasing values of N (decreasing thickness)

Let us compare the spectral distribution of the radiation of the Weyl vacuum for different values of $N$, with the well known case of the black body radiation (Planck’s spectrum). For this purpose, we have to multiply the mean energy per mode by the number of states with frequency $\nu$ which, in $d$ dimensions, would be proportional to $\nu^{d-1}$. If we denote this product by $u(x, N) \equiv u_0\,x^{d-1}\,\epsilon_N(x, T_0)$ with $x \equiv \frac{h\nu}{kT_0}$ ($u_0$ is a constant, for a fixed temperature $T_0$, with dimensions of energy per unit of volume), Fig. 1 shows the departure from the Planckian spectrum ($N = 0$) for four different values $N = \frac12, \frac34, 1, 1.1$ in the realistic dimension $d = 3$. Note that the value of $N \equiv N_c = \frac{d-1}{2}$ corresponds to a critical situation: over this value the theory exhibits an “infrared catastrophe”.
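The behaviour described above can be explored numerically. The sketch below evaluates the mean energy per mode (113) in units of $kT_0$ as a function of $x = h\nu/kT_0$, multiplies it by $x^{d-1}$, and checks two things: that $N = 0$ reproduces the Planck form $x^d/(e^x - 1)$, and that for small $x$ the product scales like $x^{d-1-2N}$, so it tends to zero below $N_c = (d-1)/2$, to a constant at $N_c$, and grows without bound above it. The function names, the choice $u_0 = 1$, and the sample values of $x$ are assumptions made for illustration.

```python
import math

def mean_energy_per_mode(x, N):
    """Eq. (113) in units of k*T0, with x = h*nu / (k*T0)."""
    return x * math.exp(-x) / (-math.expm1(-x)) ** (2 * N + 1)

def spectral_density(x, N, d=3, u0=1.0):
    """u(x, N) = u0 * x**(d-1) * (mean energy per mode)."""
    return u0 * x ** (d - 1) * mean_energy_per_mode(x, N)

def planck(x, d=3, u0=1.0):
    """Planck spectrum u0 * x**d / (exp(x) - 1), the N = 0 reference."""
    return u0 * x ** d / math.expm1(x)

if __name__ == "__main__":
    for N in (0.0, 0.5, 0.75, 1.0, 1.1):
        print(N, [round(spectral_density(x, N), 4) for x in (1e-3, 1e-1, 1.0, 5.0)])
```

Using `math.expm1` keeps the small-$x$ evaluation accurate, which matters precisely in the infrared region where the critical behaviour shows up.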

5. Other Representations: Some Comments

We have shown that it is impossible to establish conformally invariant evolution equations in a (even compactified) Minkowski space, not only for massive fields but also for massless quantum fields. If we wish the whole conformal group to be an exact symmetry of physical laws (at least, at very high energies), then we should reconsider the convenience of the Minkowski space as the frame for describing quantum physical phenomena. In fact, there exists a wider consistent quantum dynamics in which the conformal invariance is exact. The price to be paid is the introduction of an extra dimension, thus increasing by one the number of space-time parameters. The physical interpretation of this new parameter remains obscure, but interpretations in terms of a “unit of measurement” à la Weyl [BtH] and/or a “variable mass” interpretation have already been treated in the literature, even at the non-relativistic (Galilean) level [NiBt].

In the present scheme, this kind of representation can be obtained through a higher-order polarization which, as we are going to see, carries an index $m_0$ eventually interpreted as a conformally invariant mass (see [BtH]). In fact, let us use the following couples of generators:

$$\tilde\Pi_0^L \equiv \frac{\sqrt2}{2}(\tilde P_0^L + \tilde K_0^L);\quad \tilde X_0^L \equiv \frac{\sqrt2}{2}(\tilde P_0^L - \tilde K_0^L) \qquad\text{and}\qquad \tilde\Pi_1^L \equiv \frac{\sqrt2}{2}(\tilde P_1^L + \tilde K_1^L);\quad \tilde X_1^L \equiv \frac{\sqrt2}{2}(\tilde P_1^L - \tilde K_1^L) \tag{114}$$

as conjugate variables. If one tries to introduce the set $\{\tilde X_0^L, \tilde X_1^L\}$ in the polarization, then we face the problem that the characteristic module generated by the operators $\tilde D^L, \tilde M^L$ is too large, since $[\tilde X_\nu^L, \tilde D^L] = -\tilde\Pi_\nu^L$; in fact, only $\tilde M^L$ possesses a compatible set of commutation relations. As we have already pointed out, the dilatation could be introduced at the price of being a higher-order operator (something similar occurs with the time operator in the case of the ordinary free particle as long as we stay in position representation). More precisely, with this higher-order polarization, one can reduce the representation as follows:

$$\mathcal P = \langle\tilde M^L, \tilde X_0^L, \tilde X_1^L, \tilde C^L\rangle, \tag{115}$$

where $\tilde C^L$ is just a Casimir operator of the extended conformal group [we can always add an arbitrary central term to $\tilde C^L$]. For example, in the compact dilatation case

$$\tilde C^L = (\tilde M^L)^2 + (\tilde D^L + 2N\tilde X_\zeta^L)^2 - (\tilde\Pi^L)^2 + (\tilde X^L)^2. \tag{116}$$

Polarized wave functions evolve according to a Klein-Gordon-like equation

$$(\tilde\Pi^L)^2\psi = (\tilde D^L + 2N)^2\psi, \tag{117}$$

which can be interpreted as the motion equation of a scalar field with variable square mass $m^2 = (\tilde D^L + 2N)^2 = (D^L)^2$. The value of the Casimir on polarized wave functions is $\tilde C^R\psi = N(N-1)\psi \equiv m_0\psi$, which justifies the denomination of $m_0$ as a conformally invariant mass [it proves to be quantized for this case, the reason being related to the compact character of the proper time (dilatation)]. The allowed value of $N$, $N = 1$

thus corresponds with null conformal mass. The precise connection between $N$ and the curvature of some homogeneous subspaces (let us say, the Anti-de Sitter universe in 2+1 dimensions) inside the conformal group is being investigated [ACC]. Note that the Cauchy hypersurfaces of Eq. (117) have dimension 2, and the physical interpretation of the extra dimension remains unclear.

Two different approaches can be taken which could be consistent with the physical meaning of the conformal group. One is related to the Weyl idea of different lengths in different points of space time [We]. The “rule” for measuring distances changes at different positions. In Quantum Mechanics, this implies that wave functions measuring probability densities do have different integration measures as functions of space-time. This change in the measure of integration needs to be related to the extra parameter appearing in our Group Approach to Quantization of the full conformal group. The other interpretation, not necessarily unrelated to the previous one, could be attached to the variable character of mass. Even at the level of one particle ordinary conformal quantum mechanics, the inescapable consequence of a variable mass appearing in the formalism was already observed several years ago by Niederer [NiBt]. It indeed would not be a surprise should this fact also have some consequences in the full quantization. Neither interpretation, however, is without controversy, as emphasized previously by Rohrlich [R]. At any rate, we have considered here a more satisfactory point of view by examining the dynamical breaking of the conformal group down to the Weyl subgroup in the framework of the Group Approach to Quantization.

Acknowledgement. M. Calixto thanks the Spanish MEC for a FPI grant. M.C. is also grateful to A.P. Balachandran for his hospitality at the Department of Physics of Syracuse University, where some of the work on the manuscript was carried out.

References

[A] Allen, B.: Phys. Rev. D32, 3136 (1985)
[AA1] Aldaya, V. and de Azcárraga, J.: J. Math. Phys. 23, 1297 (1982)
[AA2] Aldaya, V. and de Azcárraga, J.A.: Int. J. Theor. Phys. 24, 141 (1985)
[AA3] Aldaya, V. and Azcárraga, J.A.: Ann. of Phys. 165, 484 (1985)
[AAB] Aldaya, V., Azcárraga, J.A. and Bisquert, J.: Lecture Notes in Physics 278, Berlin–Heidelberg–New York: Springer-Verlag, 1986, p. 369
[ABLN] Aldaya, V., Bisquert, J., Loll, R. and Navarro-Salas, J.: J. Math. Phys. 33, 3087 (1992)
[ACC] Aldaya, V., Calixto, M. and Cerveró, J.M.: In preparation
[ACG] Aldaya, V., Calixto, M. and Guerrero, J.: Commun. Math. Phys. 178, 399 (1996)
[AGM] Aldaya, V., Guerrero, J. and Marmo, G.: Higher-Order Quantization on a Group. hep-th/9512020; Int. J. Mod. Phys. A12, 3 (1997)
[AM] Abraham, R. and Marsden, J.E.: Foundations of Mechanics. Reading, MA: W. A. Benjamin, Inc., 1967
[ANR] Aldaya, V., Navarro-Salas, J. and Ramírez, A.: Commun. Math. Phys. 121, 541 (1989)
[B] Bargmann, V.: Ann. Math. 59, 1 (1954)
[BD] Birrell, N.D. and Davies, P.C.W.: Quantum fields in curved space. Cambridge: Cambridge University Press, 1982
[BtH] Barut, A.O. and Haugen, R.B.: Ann. of Phys. 71, 519 (1972)
[C-B] Cunningham, E.: Proc. R. Soc. Lond. 8, 77 (1910); Bateman, H.: Proc. London Math. Soc. 8, 223 (1910)
[Fr] Fronsdal, C.: Phys. Rev. D12, 3819 (1975)
[Fu] Fulling, S.A.: Phys. Rev. D7, 2850 (1973)
[GQ] Souriau, J.M.: Structure des systemes dynamiques. Paris: Dunod, 1970; Kostant, B.: Quantization and Unitary Representations. In: Lecture Notes in Math. 170, Berlin: Springer-Verlag, 1970; Sniatycki, J.: Geometric Quantization and Quantum Mechanics. New York: Springer-Verlag, 1970; Woodhouse, N.: Geometric Quantization. Oxford: Clarendon, 1980
[GU] Grib, A.A. and Urusova, N.Sh.: Teoreticheskaya i Matematicheskaya Fizika 54, 398 (1983). Translated in Institute of Precision Mechanics and Optics, Leningrad
[H] Hill, E.L.: Phys. Rev. 67, 358 (1945)
[J] Jacobson, N.: Lie Algebras. Intersc. Tracts, No. 10, New York: John Wiley and Sons, 1962
[JR] Jaekel, M.-T. and Reynaud, S.: Phys. Lett. A220, 10 (1996)
[JcRb-CDG] Jackiw, R. and Rebbi, C.: Phys. Rev. Lett. 37, 172 (1976); Callan, C., Dashen, R. and Gross, D.: Phys. Lett. B63, 334 (1976)
[K] Kastrup, H.A.: Annalen der Physik 9, 388 (1962); Kastrup, H.A.: Phys. Rev. 142, 1060 (1966); Kastrup, H.A.: Phys. Rev. 143, 1021 (1966); Kastrup, H.A.: Phys. Rev. 150, 1183 (1966)
[LM] Lüscher, M. and Mack, G.: Commun. Math. Phys. 41, 203 (1975)
[N] von Neumann, J.: Math. Ann. Bd. 102 (1929)
[NAC] Navarro, M., Aldaya, V. and Calixto, M.: J. Math. Phys. 38, 1454 (1997)
[NiBt] Niederer, U.: Helvetica Physica Acta 45, 802 (1972); Barut, A.O.: Helvetica Physica Acta 46, 496 (1973); Niederer, U.: Helvetica Physica Acta 47, 120 (1974); Niederer, U.: Helvetica Physica Acta 47, 167 (1974)
[R] Rohrlich, F. et al.: Rev. Mod. Phys. 34, 442 (1962)
[S] Saletan, E.J.: J. Math. Phys. 2, 1 (1961)
[St] Stone, M.: Proc. Nat. Ac. (1929, 1930)
[U] Unruh, W.G.: Phys. Rev. D14, 870 (1976)
[W] Wald, R.M.: Quantum Fields in Curved Space and Black Hole Thermodynamics. Chicago: University of Chicago Press, 1995
[We] Weyl, H.: Space, Time and Matter. First Edition, Dover, NY: 1922

Communicated by G. Felder

Commun. Math. Phys. 200, 355 – 379 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Pseudo-Kähler Quantization on Flag Manifolds

Alexander V. Karabegov*

International Centre for Theoretical Physics, Trieste, Italy. E-mail: [email protected]

Received: 5 August 1997 / Accepted: 8 July 1998

Abstract: A unified approach to geometric, symbol and deformation quantizations on a generalized flag manifold endowed with an invariant pseudo-Kähler structure is proposed. In particular cases we arrive at Berezin’s quantization via covariant and contravariant symbols.

1. Introduction

In the series of papers [12, 4, 5, 6] a modern approach to quantization on Kähler manifolds was proposed which combines together geometric quantization [10, 14], symbol quantization [3] and deformation quantization [2]. The main idea of this approach can be formulated for quantization on a general symplectic manifold $M$ as follows:

• To give a geometric realization of a family of Hilbert spaces $\mathcal{H}_\hbar$ over the manifold $M$, parametrized by a small parameter $\hbar$ which plays a role of Planck constant (by means of geometric quantization or of its generalizations).
• To describe a geometric construction of a symbol mapping from functions on $M$ to operators in $\mathcal{H}_\hbar$ (the construction of operator symbols).
• To choose appropriate algebras $\mathcal{A}_\hbar$ of symbols such that the symbol mapping provides a representation of $\mathcal{A}_\hbar$ in $\mathcal{H}_\hbar$.
• To find the deformation quantization which controls the asymptotic expansion of the symbol product as $\hbar \to 0$ in the same geometric framework.

Our goal is to carry out this quantization program in a unified geometric framework on a generalized flag manifold, a homogeneous space of a compact semisimple Lie group, endowed with an invariant pseudo-Kähler structure.

* On leave of absence from the Joint Institute for Nuclear Research, LCTA, Dubna 141980, Moscow Region, Russia.


Let $K$ be a real Lie group, $\mathfrak{k}_r$ the real Lie algebra of $K$, $\mathfrak{k}_r^*$ the real dual of $\mathfrak{k}_r$, $\mathfrak{g}_c$ the complexification of $\mathfrak{k}_r$, $\mathfrak{g}_c = \mathfrak{k}_r \otimes \mathbb{C}$, and $\mathfrak{g}_r$ the realification of $\mathfrak{g}_c$. Suppose $M$ is a homogeneous symplectic $K$-manifold with an invariant pseudo-Kähler polarization, and there is given a moment mapping from $M$ to $\mathfrak{k}_r^*$. (Actually we work in a slightly more general situation.)

In Sect. 3 we introduce a technical notion of s-module. This is a $(\mathfrak{g}_r, K)$-module structure on $C^\infty(M)$, where $K$ acts by shifts and $\mathfrak{g}_r$ by first-order differential operators. To an s-module on $M$ we associate a mapping from the universal enveloping algebra $\mathcal{U}(\mathfrak{g}_c)$ of $\mathfrak{g}_c$ to $C^\infty(M)$, whose kernel is a two-sided ideal, and whose image $\mathcal{A} \subset C^\infty(M)$ thus inherits the structure of the corresponding quotient algebra of $\mathcal{U}(\mathfrak{g}_c)$. This mapping was introduced independently in [1] in a more general context of a homogeneous symplectic manifold with a pair of transversal invariant polarizations.

Let $L \to M$ be a holomorphic hermitian line bundle. In Sect. 4 we define a pushforward mapping from holomorphic differential operators on $L$ to differential operators on $M$. Suppose the group $K$ acts on $L$ by holomorphic line bundle automorphisms which respect the hermitian metrics on $L$. Using the pushforward mapping we relate to $L$ an s-module on $M$ and the corresponding algebra $\mathcal{A} \subset C^\infty(M)$.

Now let $K$ be a compact semisimple Lie group and $M$ be a generalized flag manifold. Then $M$ may be identified with a coadjoint orbit in $\mathfrak{k}_r^*$ via the moment mapping, which is injective. The elements of the algebra $\mathcal{A}$ corresponding to an s-module on $M$ are regular functions on $M$. The Bott–Borel–Weil theorem provides geometric realizations of representations of $K$, and therefore of $\mathcal{U}(\mathfrak{g}_c)$, in the sheaf cohomology spaces of hermitian line bundles on $M$. We show in Sect. 9 that the representation in the sheaf cohomology space of a hermitian line bundle $L$ on $M$ factors through the mapping from $\mathcal{U}(\mathfrak{g}_c)$ to $\mathcal{A} \subset C^\infty(M)$ corresponding to the s-module associated to the line bundle $L$. Thus we obtain a representation of the algebra $\mathcal{A}$ in the sheaf cohomology space of $L$. One may consider $\mathcal{A}$ as a symbol algebra on $M$, and its representation as the symbol mapping. The symbol algebra $\mathcal{A}$ on $M$ and its representation were also constructed in [8] by algebraic methods, based on the theory of Harish-Chandra modules. It was shown there that in some cases $\mathcal{A}$ coincides with the algebras of Berezin’s covariant and contravariant symbols on $M$ (see also Sect. 10).

It would be extremely interesting to give a direct geometric construction of the symbol mapping for the symbol algebra $\mathcal{A}$ on $M$.

In Sect. 7 we consider some special families of s-modules on $M$, rationally parametrized by a parameter $\hbar$, and the corresponding algebras $\mathcal{A}_\hbar$. Denote by $*_\hbar$ the product in $\mathcal{A}_\hbar$. We prove that any regular function on $M$ belongs to the algebra $\mathcal{A}_\hbar$ for all but a finite number of values of $\hbar$. For any regular functions $f, g$ on $M$ their product $f *_\hbar g$ expands to an absolutely and uniformly convergent series in $\hbar$ in a neighborhood of $\hbar = 0$. With $\hbar$ replaced by a formal parameter, this series is shown to define the $\star$-product of some deformation quantization with separation of variables on $M$ (see [7]). This $\star$-product is completely identified and can be described autonomously. In a particular case we obtain the convergent $\star$-products on compact hermitian symmetric spaces, derived from the product of Berezin’s covariant symbols in [4].


2. Equivariant Families of Functions on Homogeneous Manifolds

Let $K$ be a real Lie group, $\mathfrak{k}_r$ its real Lie algebra, $\mathfrak{k}_r^*$ the real dual to $\mathfrak{k}_r$. For $X \in \mathfrak{k}_r$, $F \in \mathfrak{k}_r^*$ denote their pairing by $\langle F, X\rangle$. Let $M$ be a homogeneous $K$-manifold. Denote by $T_k$ the shift operator by $k \in K$ in $C^\infty(M)$, $T_k f(x) = f(k^{-1}x)$, $x \in M$, $f \in C^\infty(M)$. We call a family of real smooth functions $\{f_X\}$, $X \in \mathfrak{k}_r$, on $M$ a $K$-equivariant family if $\mathfrak{k}_r \ni X \mapsto f_X$ is a linear mapping from $\mathfrak{k}_r$ to $C^\infty(M)$, $K$-equivariant with respect to the adjoint action on $\mathfrak{k}_r$ and the shift action on $C^\infty(M)$, so that for all $k \in K$, $X \in \mathfrak{k}_r$, $T_k f_X = f_{\mathrm{Ad}(k)X}$ holds.

For $X \in \mathfrak{k}_r$ denote by $v_X$ the corresponding fundamental vector field on $M$. For $k \in K$ the relation $T_k v_X T_k^{-1} = v_{\mathrm{Ad}(k)X}$, $X \in \mathfrak{k}_r$, holds, where $v_X$ is treated as a differential operator in $C^\infty(M)$. For a $K$-equivariant family $\{f_X\}$ on $M$ and for all $X, Y \in \mathfrak{k}_r$ the relation $v_X f_Y = f_{[X,Y]}$ holds.

Given a $K$-equivariant family $\{f_X\}$ on $M$, define a "moment" mapping $\gamma : M \to \mathfrak{k}_r^*$, $K$-equivariant with respect to the shift action on $M$ and coadjoint action on $\mathfrak{k}_r^*$, such that for all $x \in M$, $X \in \mathfrak{k}_r$, $\langle\gamma(x), X\rangle = f_X(x)$ holds. Since $M$ is homogeneous, the image of $\gamma$ is a single coadjoint orbit $\Omega = \gamma(M) \subset \mathfrak{k}_r^*$.

For $X \in \mathfrak{k}_r$ denote by $v^\Omega_X$ the corresponding fundamental vector field on $\Omega$. It is Hamiltonian with respect to the $K$-invariant symplectic structure given by the Kirillov symplectic form $\omega^\Omega$ on $\Omega$. The function $f^\Omega_X(F) = \langle F, X\rangle$, $F \in \Omega$, is its Hamiltonian function, i.e., $df^\Omega_X = -i(v^\Omega_X)\omega^\Omega$. For $X, Y \in \mathfrak{k}_r$, $\omega^\Omega(v^\Omega_X, v^\Omega_Y) = f^\Omega_{[X,Y]}$ holds.

Remark. $K$-equivariant families $\{f_X\}$ on $M$ are in one-to-one correspondence with $K$-equivariant mappings $\gamma : M \to \mathfrak{k}_r^*$. To a given $\gamma$ there corresponds the coadjoint orbit $\Omega = \gamma(M)$ and the family $\{f_X\}$ such that $f_X = \gamma^* f^\Omega_X = f^\Omega_X \circ \gamma$.

Fix a point $x_0 \in M$ and denote by $K_0 \subset K$ the isotropy subgroup of the point $x_0$. The mapping $\gamma$ and thus the $K$-equivariant family $\{f_X\}$ itself are completely determined by the image point $\gamma(x_0)$ which is an arbitrary $K_0$-stable point in $\mathfrak{k}_r^*$.
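The relation $v_X f_Y = f_{[X,Y]}$ stated above can be checked by differentiating the equivariance condition $T_k f_Y = f_{\mathrm{Ad}(k)Y}$ along a one-parameter subgroup. The sketch below assumes the convention, not spelled out in the text, that the fundamental vector field generates the shift action, $v_X = \frac{d}{dt}\big|_{t=0} T_{\exp(tX)}$; with the opposite sign the right-hand side would change sign, so this is the convention compatible with the stated relation.

```latex
\begin{aligned}
v_X f_Y
  &= \frac{d}{dt}\Big|_{t=0} T_{\exp(tX)} f_Y
   = \frac{d}{dt}\Big|_{t=0} f_{\mathrm{Ad}(\exp(tX))\,Y} \\
  &= f_{\frac{d}{dt}|_{t=0}\,\mathrm{Ad}(\exp(tX))\,Y}
   = f_{[X,Y]} .
\end{aligned}
```

The second line uses the linearity of $X \mapsto f_X$ and the identity $\frac{d}{dt}\big|_{t=0}\mathrm{Ad}(\exp(tX))Y = [X,Y]$.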
For two $K$-equivariant families $\{f^{(1)}_X\}$ and $\{f^{(2)}_X\}$ their linear combination $\{\alpha f^{(1)}_X + \beta f^{(2)}_X\}$ is a $K$-equivariant family as well. Therefore the set of all $K$-equivariant families $\{f_X\}$ on $M$ is a vector space which can be identified with the subspace $(\mathfrak{k}_r^*)^{K_0}$ of all $K_0$-stable points in $\mathfrak{k}_r^*$.

Denote by $\omega$ the pullback of the form $\omega^\Omega$ by $\gamma$. Then $\omega$ is a closed (but not necessarily nondegenerate) $K$-invariant form on $M$ such that for $X, Y \in \mathfrak{k}_r$, $\omega(v_X, v_Y) = f_{[X,Y]}$ holds and $df_X = -i(v_X)\omega$. We say that $\omega$ is associated to the $K$-equivariant family $\{f_X\}$. The form $\omega$ is nondegenerate iff the tangent mapping to $\gamma : M \to \Omega$ at any point $x \in M$ is an isomorphism of the tangent spaces $T_x M$ and $T_{\gamma(x)}\Omega$ or, equivalently, if $\gamma$ is a covering mapping.

3. Modules of Functions on Complex Homogeneous Manifolds

Let $K$ be a real Lie group with the real Lie algebra $\mathfrak{k}_r$. Denote by $\mathfrak{g}_c$ the complexification of $\mathfrak{k}_r$, $\mathfrak{g}_c = \mathfrak{k}_r \otimes \mathbb{C}$, by $\mathfrak{g}_r$ the realification of $\mathfrak{g}_c$, and by $J$ the corresponding operator of complex structure in $\mathfrak{g}_r$, so that $(\mathfrak{g}_r, J)$ is isomorphic to $\mathfrak{g}_c$. Let the group $K$ act transitively and holomorphically on a complex manifold $M$. Then for $X \in \mathfrak{k}_r$ the fundamental vector field $v_X$ on $M$ decomposes into the sum of


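holomorphic and antiholomorphic vector fields $\xi_X$ and $\eta_X$ respectively, $v_X = \xi_X + \eta_X$. Therefore $\eta_X = \bar\xi_X$ and for arbitrary $X, Y \in \mathfrak{k}_r$, $\xi_X$ commutes with $\eta_Y$, $[\xi_X, \xi_Y] = \xi_{[X,Y]}$ and $[\eta_X, \eta_Y] = \eta_{[X,Y]}$. For $X \in \mathfrak{k}_r$, $k \in K$ the following relations hold, $T_k \xi_X T_k^{-1} = \xi_{\mathrm{Ad}(k)X}$ and $T_k \eta_X T_k^{-1} = \eta_{\mathrm{Ad}(k)X}$.

For $Z = X + JY \in \mathfrak{g}_r$, $X, Y \in \mathfrak{k}_r$, set $\xi_Z = \xi_X + i\xi_Y$ and $\eta_Z = \eta_X - i\eta_Y$. Now $(\mathfrak{g}_r, J) \ni Z \mapsto \xi_Z$ is a $\mathbb{C}$-linear homomorphism from $(\mathfrak{g}_r, J)$ to the Lie algebra of holomorphic vector fields on $M$, and $\eta_Z = \bar\xi_Z$. We get that $\mathfrak{g}_r$ acts on $M$ by real vector fields $v_Z = \xi_Z + \eta_Z = (\xi_X + \eta_X) + i(\xi_Y - \eta_Y)$ which respect the holomorphic structure on $M$.

We call a mapping $\mathfrak{k}_r \ni X \mapsto s_X$ of the Lie algebra $\mathfrak{k}_r$ to $\mathrm{End}(C^\infty(M))$ $K$-equivariant if it is $K$-equivariant with respect to the adjoint action of $K$ in $\mathfrak{k}_r$ and the shift action in $C^\infty(M)$. This means that $T_k s_X T_k^{-1} = s_{\mathrm{Ad}(k)X}$, $X \in \mathfrak{k}_r$, $k \in K$.

Now we shall define a special $(\mathfrak{g}_r, K)$-module structure on $C^\infty(M)$. Let $K$ act in $C^\infty(M)$ by the shifts $T_k$, $k \in K$, and $\mathfrak{g}_r$ act by real differential operators $m_Z = v_Z + \varphi_Z$, $Z \in \mathfrak{g}_r$, where $\varphi_Z$ is a real smooth function on $M$, so that the actions of $K$ and $\mathfrak{g}_r$ agree in the usual sense. This means that the actions of the algebra $\mathfrak{k}_r$ as of a subalgebra of $\mathfrak{g}_r$ and as of the Lie algebra of $K$ coincide, $m_X = v_X$ for $X \in \mathfrak{k}_r$, and $T_k m_Z T_k^{-1} = m_{\mathrm{Ad}(k)Z}$ for $k \in K$, $Z \in \mathfrak{g}_r$. In particular, for $X \in \mathfrak{k}_r$, $\varphi_X = 0$ holds and $T_k \varphi_Z = \varphi_{\mathrm{Ad}(k)Z}$. Then we say that there is given an s-module on $M$.

For $X, Y \in \mathfrak{k}_r$ set $Z(X, Y) = \tfrac12(X - iJX + Y + iJY) \in \mathfrak{g}_r \otimes \mathbb{C}$. The mapping $\mathfrak{k}_r \times \mathfrak{k}_r \ni (X, Y) \mapsto Z(X, Y)$ is a Lie algebra homomorphism from $\mathfrak{k}_r \times \mathfrak{k}_r$ to $\mathfrak{g}_r \otimes \mathbb{C}$ (moreover, it extends by $\mathbb{C}$-linearity to an isomorphism of the complex Lie algebras $(\mathfrak{g}_r, J) \times (\mathfrak{g}_r, -J)$ and $\mathfrak{g}_r \otimes \mathbb{C}$). For $X \in \mathfrak{k}_r$ introduce a function $f_X = -\tfrac12\varphi_{JX}$ on $M$. It is easy to check that the functions $f_X$, $X \in \mathfrak{k}_r$, form a $K$-equivariant family. For $X \in \mathfrak{k}_r$ set $l_X = \xi_X + i f_X$, $r_X = \eta_X - i f_X$. Notice that $\eta_X = \bar\xi_X$ and that the mappings $\mathfrak{k}_r \ni X \mapsto l_X$ and $\mathfrak{k}_r \ni X \mapsto r_X$ are $K$-equivariant.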
A straightforward calculation shows that $m_{Z(X,Y)} = l_X + r_Y$, where the mapping $\mathfrak{g}_r \ni Z \mapsto m_Z$ is extended to $\mathfrak{g}_r \otimes \mathbb{C}$ by $\mathbb{C}$-linearity. Taking into account that $(X, 0)$ commutes with $(0, Y)$ in $\mathfrak{k}_r \times \mathfrak{k}_r$, we get the following lemma.

Lemma 1. The mappings $\mathfrak{k}_r \ni X \mapsto l_X$ and $\mathfrak{k}_r \ni X \mapsto r_X$ are commuting $K$-equivariant complex conjugate representations of $\mathfrak{k}_r$ in $C^\infty(M)$.

Suppose there is given a representation of $\mathfrak{k}_r$ in $C^\infty(M)$ of the form $\mathfrak{k}_r \ni X \mapsto l_X = \xi_X + i f_X$, where $\{f_X\}$ is a $K$-equivariant family on $M$ (which is equivalent to $X \mapsto l_X$ being $K$-equivariant). Then there exists an s-module on $M$ to which the representation $\mathfrak{k}_r \ni X \mapsto l_X$ is associated.

Lemma 2. Let $\mathfrak{k}_r \ni X \mapsto l_X = \xi_X + i f_X$ be a $K$-equivariant representation of $\mathfrak{k}_r$ in $C^\infty(M)$. For $Z = X + JY \in \mathfrak{g}_r$, $X, Y \in \mathfrak{k}_r$, define the function $\varphi_Z = -2f_Y$ on $M$. Then the mapping $\mathfrak{g}_r \ni Z \mapsto m_Z = v_Z + \varphi_Z$ is a representation of $\mathfrak{g}_r$ in $C^\infty(M)$. Together with the shift action of $K$ in $C^\infty(M)$ it defines an s-module on $M$.

Proof. Consider the complex conjugate representation $\mathfrak{k}_r \ni X \mapsto r_X = \bar l_X = \eta_X - i f_X$ to the representation $X \mapsto l_X$ of $\mathfrak{k}_r$. Then for $X, Y \in \mathfrak{k}_r$ and $Z(X, Y) = \tfrac12(X - iJX + Y + iJY)$ we have as above $m_{Z(X,Y)} = l_X + r_Y$. In order to show that the mapping $\mathfrak{g}_r \ni Z \mapsto m_Z$ is a representation of $\mathfrak{g}_r$ in $C^\infty(M)$, it is enough to show that the representations $X \mapsto l_X$ and $X \mapsto r_X$ of $\mathfrak{k}_r$ commute or, equivalently, that $\xi_X f_Y + \eta_Y f_X = 0$. We get from the identity $[l_X, l_Y] = l_{[X,Y]}$ that $f_{[X,Y]} = \xi_X f_Y - \xi_Y f_X$. Since $v_Y f_X = f_{[Y,X]}$, we get that $f_{[X,Y]} = -\xi_Y f_X - \eta_Y f_X$. Equating the two expressions for $f_{[X,Y]}$ we obtain the desired identity. The rest of the proof is straightforward.
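The identity $m_{Z(X,Y)} = l_X + r_Y$, used in Lemma 1 and in the proof of Lemma 2, can be verified term by term. The sketch below relies only on the definitions given above: $\xi_{JX} = i\xi_X$ and $\eta_{JX} = -i\eta_X$ (from $\xi_Z = \xi_X + i\xi_Y$, $\eta_Z = \eta_X - i\eta_Y$), $\varphi_X = 0$, $\varphi_{JX} = -2f_X$, and the $\mathbb{C}$-linear extension of $Z \mapsto m_Z$.

```latex
\begin{aligned}
m_X &= v_X = \xi_X + \eta_X, \qquad
m_{JX} = v_{JX} + \varphi_{JX} = i(\xi_X - \eta_X) - 2 f_X, \\
m_{X - iJX} &= (\xi_X + \eta_X) - i\bigl(i(\xi_X - \eta_X) - 2 f_X\bigr)
             = 2(\xi_X + i f_X) = 2\, l_X, \\
m_{Y + iJY} &= (\xi_Y + \eta_Y) + i\bigl(i(\xi_Y - \eta_Y) - 2 f_Y\bigr)
             = 2(\eta_Y - i f_Y) = 2\, r_Y, \\
m_{Z(X,Y)} &= \tfrac12\bigl(m_{X - iJX} + m_{Y + iJY}\bigr) = l_X + r_Y .
\end{aligned}
```

In particular $m_{Z(X,0)} = l_X$ and $m_{Z(0,Y)} = r_Y$, so the commutativity of $l_X$ and $r_Y$ mirrors the fact that $(X, 0)$ and $(0, Y)$ commute in $\mathfrak{k}_r \times \mathfrak{k}_r$.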


It follows from Lemma 2 that any s-module is completely determined by some $K$-equivariant family of functions $\{f_X\}$ on $M$ for which the mapping $\mathfrak{k}_r \ni X \mapsto l_X = \xi_X + i f_X$ is a representation of $\mathfrak{k}_r$, or, equivalently, such that for $X, Y \in \mathfrak{k}_r$ the relation $f_{[X,Y]} = \xi_X f_Y - \xi_Y f_X$ holds. Since this relation is linear with respect to the family $\{f_X\}$, the set $S$ of all s-modules on $M$ is naturally identified with a linear subspace of the vector space of $K$-equivariant families of functions on $M$. It turns out that one can give a simple characterization of those $K$-equivariant families of functions which give rise to s-modules. It is given in terms of the closed 2-form $\omega$ on $M$ associated to the $K$-equivariant family.

Theorem 1. A $K$-equivariant family $\{f_X\}$ on $M$ corresponds to an s-module on $M$ iff the 2-form $\omega$ associated to $\{f_X\}$ is of the type $(1, 1)$ with respect to the complex structure on $M$.

Proof. We have to prove that the form $\omega$ is of the type $(1, 1)$ iff the relation $f_{[X,Y]} = \xi_X f_Y - \xi_Y f_X$ holds for all $X, Y \in \mathfrak{k}_r$. We can rewrite this relation in terms of $\omega$ using that $df_X = -i(v_X)\omega$ and $\omega(v_X, v_Y) = f_{[X,Y]}$ as follows: $\omega(v_X, v_Y) = \omega(\xi_X, v_Y) - \omega(\xi_Y, v_X)$. Since $v_X = \xi_X + \eta_X$, it is equivalent to the relation

$$\omega(\xi_X, \xi_Y) = \omega(\eta_X, \eta_Y) \quad \text{for all } X, Y \in \mathfrak{k}_r. \tag{1}$$

If $\omega$ is of the type $(1, 1)$, both sides of (1) vanish. Suppose now that (1) is true. For $Z = X + iY \in \mathfrak{g}_c = \mathfrak{k}_r \otimes \mathbb{C}$, $X, Y \in \mathfrak{k}_r$, set $v_Z = v_X + i v_Y$, $\xi_Z = \xi_X + i\xi_Y$ and $\eta_Z = \eta_X + i\eta_Y$. It follows from (1) that

$$\omega(\xi_Z, \xi_{Z'}) = \omega(\eta_Z, \eta_{Z'}) \quad \text{for all } Z, Z' \in \mathfrak{g}_c. \tag{2}$$

Fix a point $x \in M$. Since $M$ is $K$-homogeneous, the vectors $v_X$, $X \in \mathfrak{k}_r$, at the point $x$ span the real tangent space $T_x M$ and thus the vectors $v_Z$, $Z \in \mathfrak{g}_c$, span $T_x M \otimes \mathbb{C}$. For arbitrary vectors $\xi, \xi' \in T_x M \otimes \mathbb{C}$ of the type $(1, 0)$ one can find $Z, Z' \in \mathfrak{g}_c$ such that $v_Z = \xi$ and $v_{Z'} = \xi'$ at the point $x$. Therefore, at the point $x$, $\xi_Z = \xi$, $\xi_{Z'} = \xi'$ and $\eta_Z = \eta_{Z'} = 0$. It follows from (2) that $\omega(\xi, \xi') = 0$ at the point $x \in M$ for arbitrary vectors $\xi, \xi'$ of the type $(1, 0)$. Since $\omega$ is real, it follows that it is of the type $(1, 1)$, which completes the proof.

We say that an s-module is nondegenerate if the corresponding 2-form $\omega$ is nondegenerate. It follows from Theorem 1 that the 2-form $\omega$ associated to a nondegenerate s-module is a pseudo-Kähler form. The set of nondegenerate s-modules is either empty or it is a dense open conical (i.e. invariant with respect to the multiplication by non-zero constants, $s \mapsto t \cdot s$, $s \in S$, $t \in \mathbb{R}\setminus\{0\}$) subset of $S$.

Example. Let $\Omega \subset \mathfrak{k}_r^*$ be a coadjoint orbit of the group $K$, endowed with an invariant pseudo-Kähler polarization. This means that there is given an invariant complex structure on $\Omega$ such that the Kirillov form $\omega^\Omega$ is of the type $(1, 1)$. The fundamental vector field $v^\Omega_X$, $X \in \mathfrak{k}_r$, decomposes into the sum of a holomorphic and antiholomorphic vector fields $\xi^\Omega_X$ and $\eta^\Omega_X$ respectively, $v^\Omega_X = \xi^\Omega_X + \eta^\Omega_X$. Then the functions $f^\Omega_X$, $X \in \mathfrak{k}_r$, form a $K$-equivariant family that corresponds to an s-module on $\Omega$. In particular, the mappings $\mathfrak{k}_r \ni X \mapsto l^\Omega_X = \xi^\Omega_X + i f^\Omega_X$ and $\mathfrak{k}_r \ni X \mapsto r^\Omega_X = \eta^\Omega_X - i f^\Omega_X$ are two commuting $K$-equivariant representations of $\mathfrak{k}_r$ in $C^\infty(\Omega)$.


We are going to associate to each s-module on $M$ an associative algebra $\mathcal{A}$ whose elements are smooth functions on $M$. Extend the representations $\mathfrak{k}_r \ni X \mapsto l_X$ and $\mathfrak{k}_r \ni X \mapsto r_X = \bar l_X$ to $\mathfrak{g}_c = \mathfrak{k}_r \otimes \mathbb{C}$ by $\mathbb{C}$-linearity. Then, extending them further to the universal enveloping algebra $\mathcal{U}(\mathfrak{g}_c)$ of $\mathfrak{g}_c$, one obtains two commuting $K$-equivariant representations of $\mathcal{U}(\mathfrak{g}_c)$ in $C^\infty(M)$, $u \mapsto l_u$ and $u \mapsto r_u$, $u \in \mathcal{U}(\mathfrak{g}_c)$ ($K$ acts on $\mathcal{U}(\mathfrak{g}_c)$ by the properly extended adjoint action). Let $u \mapsto \check u$ denote the standard anti-automorphism of $\mathcal{U}(\mathfrak{g}_c)$ which maps $X \in \mathfrak{g}_c$ to $-X$.

Lemma 3. For $u \in \mathcal{U}(\mathfrak{g}_c)$ the following relation holds: $l_u 1 = r_{\check u} 1$.

Proof. We prove the Lemma for the monomials $u_n = X_1 \dots X_n$, $X_j \in \mathfrak{g}_c$, using induction over $n$. One checks directly that for $X \in \mathfrak{g}_c$, $l_X 1 = i f_X = r_{\check X} 1$ holds. Assume that $l_{u_{n-1}} 1 = r_{\check u_{n-1}} 1$ holds. Then $l_{u_n} 1 = l_{u_{n-1}} l_{X_n} 1 = l_{u_{n-1}} r_{\check X_n} 1 = r_{\check X_n} l_{u_{n-1}} 1 = r_{\check X_n} r_{\check u_{n-1}} 1 = r_{\check u_n} 1$. The lemma is proved.

For $u \in \mathcal{U}(\mathfrak{g}_c)$ denote $\sigma_u = l_u 1$ and let $\mathcal{A}$ denote the image of the mapping $\sigma : u \mapsto \sigma_u$ from $\mathcal{U}(\mathfrak{g}_c)$ to $C^\infty(M)$. Notice that the mapping $\sigma$ is $K$-equivariant with respect to the adjoint action on $\mathcal{U}(\mathfrak{g}_c)$ and the action by shifts on $C^\infty(M)$.

Lemma 4. The kernel $I$ of the mapping $\sigma : \mathcal{U}(\mathfrak{g}_c) \to C^\infty(M)$ is a two-sided ideal in $\mathcal{U}(\mathfrak{g}_c)$ and thus $\mathcal{A}$ inherits the algebra structure from the quotient algebra $\mathcal{U}(\mathfrak{g}_c)/I$.

Proof. It follows from the relation $l_u 1 = 0$ that $I$ is a left ideal, while $r_{\check u} 1 = 0$ shows that $I$ is a right ideal since $u \mapsto \check u$ is an anti-homomorphism. The lemma is proved.

We shall denote the associative product in $\mathcal{A}$ by $*$. It follows from Lemma 3 that for $u \in \mathcal{U}(\mathfrak{g}_c)$, $f \in \mathcal{A}$, $l_u f = \sigma_u * f$ and $r_{\check u} f = f * \sigma_u$ holds.

Remark. As a subspace of $C^\infty(M)$ the algebra $\mathcal{A}$ is the spherical $(\mathfrak{g}_r, K)$-submodule of the s-module it is associated to, generated by the constant function 1, which is a spherical ($K$-invariant) vector.

Denote by $\mathcal{Z}(\mathfrak{g}_c)$ the center of $\mathcal{U}(\mathfrak{g}_c)$. The elements of $\mathcal{Z}(\mathfrak{g}_c)$ are stable under the adjoint action of $K$.
Since the mapping $\sigma$ is $K$-equivariant, $\sigma$ maps the central elements of $\mathcal{U}(\mathfrak{g}_c)$ to constants in $\mathcal{A}$. Thus the restriction of the mapping $\sigma$ to $\mathcal{Z}(\mathfrak{g}_c)$ defines a central character $\psi : \mathcal{Z}(\mathfrak{g}_c) \to \mathbb{C}$ of the algebra $\mathcal{U}(\mathfrak{g}_c)$, $\psi(z) = \sigma_z$, $z \in \mathcal{Z}(\mathfrak{g}_c)$ (here we identify the constant functions in $\mathcal{A}$ with the corresponding complex constants).

4. Holomorphic Differential Operators on Hermitian Line Bundles

Let $\pi : L \to M$ be a holomorphic hermitian line bundle over $M$ with hermitian metrics $h$. Denote by $L^*$ the bundle $L$ with the zero section removed. It is a $\mathbb{C}^*$-principal bundle. A local holomorphic trivialization of $L$ is given by a pair $(U, s)$, where $U$ is an open chart on $M$ and $s : U \to L^*$ is a nonvanishing local holomorphic section of $L$.

We are going to define a pushforward of holomorphic differential operators on $L$ to the base space $M$. A holomorphic differential operator $A$ on $L$ is a global geometric object given locally, for a holomorphic trivialization $(U_\alpha, s_\alpha)$, by a holomorphic differential operator $A_\alpha$ on $U_\alpha$. On the intersection of two charts $U_\alpha$ and $U_\beta$ the operators $A_\alpha$ and $A_\beta$ must satisfy the


relation $A_\alpha \varphi_{\alpha\beta} = \varphi_{\alpha\beta} A_\beta$, where $\varphi_{\alpha\beta}$ is a holomorphic transition function on $U_\alpha \cap U_\beta$ such that $\varphi_{\alpha\beta} s_\alpha = s_\beta$. In this relation we consider $\varphi_{\alpha\beta}$ as a multiplication operator. Holomorphic differential operators on $L$ act on the sheaf of local holomorphic sections of $L$ and form an algebra. On each chart $(U_\alpha, s_\alpha)$ introduce a real function $\Phi_\alpha = -\log h \circ s_\alpha$.

Lemma 5. On the intersection of two charts $U_\alpha$ and $U_\beta$ the following equality holds: $\Phi_\alpha - \Phi_\beta = \log|\varphi_{\alpha\beta}|^2$.

Proof. We have $\Phi_\beta = -\log h \circ s_\beta = -\log h \circ (\varphi_{\alpha\beta} s_\alpha) = -\log(|\varphi_{\alpha\beta}|^2\, h \circ s_\alpha) = \Phi_\alpha - \log|\varphi_{\alpha\beta}|^2$. The lemma is proved.

Given a global holomorphic differential operator $A$ on $L$, consider differential operators $\check A_\alpha = e^{-\Phi_\alpha} A_\alpha e^{\Phi_\alpha}$ on each chart $U_\alpha$.

Lemma 6. The operators $\check A_\alpha$ define a global differential operator $\check A$ on $M$.

Proof. We have to check that on the intersection of two charts $U_\alpha$ and $U_\beta$ the equality $\check A_\alpha = \check A_\beta$ holds. It is equivalent to $e^{-\Phi_\alpha} A_\alpha e^{\Phi_\alpha} = e^{-\Phi_\beta} A_\beta e^{\Phi_\beta}$ or $A_\alpha e^{\Phi_\alpha - \Phi_\beta} = e^{\Phi_\alpha - \Phi_\beta} A_\beta$. Applying Lemma 5 we get an equivalent equality $A_\alpha \varphi_{\alpha\beta} \bar\varphi_{\alpha\beta} = \varphi_{\alpha\beta} \bar\varphi_{\alpha\beta} A_\beta$. The assertion of the lemma follows now from the fact that the holomorphic differential operator $A_\beta$ commutes with the multiplication by the antiholomorphic function $\bar\varphi_{\alpha\beta}$.

We call $\check A$ the pushforward of the holomorphic differential operator $A$ on the line bundle $L$ to the base space $M$. It is clear that the pushforward mapping $A \mapsto \check A$ is an injective homomorphism of the algebra of holomorphic differential operators on $L$ into the algebra of differential operators on $M$.

Let $\nabla$ denote the canonical holomorphic connection of the hermitian line bundle $(L, h)$. For a local holomorphic trivialization $(U_\alpha, s_\alpha)$ a local expression of $\nabla$ on $U_\alpha$ is $\nabla = d - \partial\Phi_\alpha$. The curvature $\omega$ of $\nabla$ has a local expression $\omega = i\partial\bar\partial\Phi_\alpha$.

Let the Lie group $K$ act on the line bundle $\pi : L \to M$ by holomorphic line bundle automorphisms which respect the hermitian metrics $h$. The metrics $h$ can be considered as a function on $L$, $L \ni q \mapsto h(q)$.
To each local section $s$ of $L$ over an open set $U \subset M$ relate a function $\psi_s$ on $\pi^{-1}(U) \cap L^*$ such that $\psi_s(q)q = s(\pi(q))$. For $t \in \mathbb{C}^*$, $h(tq) = |t|^2 h(q)$ holds and $\psi_s(tq) = t^{-1}\psi_s(q)$.

Any element $X$ of the Lie algebra $\mathfrak{k}_r$ of $K$ acts on $L^*$ by a real vector field $v^L_X$ which is the sum of holomorphic and antiholomorphic vector fields $\xi^L_X$ and $\eta^L_X = \bar\xi^L_X$ respectively, $v^L_X = \xi^L_X + \eta^L_X$. The vector fields $v^L_X$, $\xi^L_X$ and $\eta^L_X$ are homogeneous of order 0 with respect to the action of $\mathbb{C}^*$ on $L^*$. Let $v_X$, $\xi_X$ and $\eta_X$ denote their projections to $M$, so $v_X = \xi_X + \eta_X$.

The action of $\xi^L_X$ on the functions $\psi_s$ on $L^*$ can be transferred to the action on the corresponding local holomorphic sections $s$ of $L$, which defines a global holomorphic differential operator $A_X$ on $L$. The object of interest to us will be its pushforward $\check A_X$ to the base space $M$.

First, consider a local trivialization of $L^*$ by a local section $s_0 : U \to L^*$, which identifies $(x, v) \in U \times \mathbb{C}^*$ with $s_0(x)v \in L^*|_U$. Then, locally, $\xi^L_X = \xi_X - a_X\, v\,\partial/\partial v$ for some holomorphic function $a_X$ on $U$. To push forward a holomorphic differential operator from $L|_U$ to $U$ we use the function $\Phi = -\log h \circ s_0$ on $U$. The metrics $h$ at the point $(x, v) \in U \times \mathbb{C}^*$ can be expressed as follows: $h(x, v) = e^{-\Phi}|v|^2$. Since the metrics $h$ is $K$-invariant, we have $v^L_X h = 0$. A simple calculation shows then that

$$(\xi_X + \bar\xi_X)\Phi = -a_X - \bar a_X. \tag{3}$$


Introduce a function $f_X = -i(a_X + \xi_X\Phi)$ on $U$. Then (3) means that $f_X$ is real.

Proposition 1. The holomorphic differential operator $A_X$ on $L$ and its pushforward $\check A_X$ to the base space $M$ can be expressed as follows, $A_X = \nabla_{\xi_X} + i f_X$ and $\check A_X = \xi_X + i f_X$. The mapping $\mathfrak{k}_r \ni X \mapsto \check A_X$ is a $K$-equivariant representation of $\mathfrak{k}_r$ in $C^\infty(M)$. The function $f_X$ is globally defined on $M$ and satisfies the relations $h(\xi^L_X h^{-1}) = i f_X \circ \pi$ and $df_X = -i(v_X)\omega$.

Proof. Fix a trivialization of $L$ over an open subset $U \subset M$, $L|_U \approx U \times \mathbb{C}$, and consider a local section of $L$ over $U$, $s : x \mapsto s(x) = (x, \varphi_s(x)) \in U \times \mathbb{C}$. The function $\psi_s$ corresponding to $s$ is defined by the equality $\psi_s(q)q = s(x)$ for $q \in L^*$, $x = \pi(q)$. At the point $q = (x, v) \in U \times \mathbb{C}^*$ we have $(x, \psi_s(x, v)v) = (x, \varphi_s(x))$, whence $\psi_s(x, v) = \varphi_s(x)v^{-1}$. To find the local expression of the operator $A_X$ calculate its action on $\psi_s(x, v)$, $(\xi_X - a_X\, v\,\partial/\partial v)(\varphi_s v^{-1}) = (\xi_X\varphi_s + a_X\varphi_s)v^{-1}$. Thus, locally, $A_X = \xi_X + a_X = (\xi_X - \xi_X\Phi) + (\xi_X\Phi + a_X) = \nabla_{\xi_X} + i f_X$. Pushing it forward to $U$ we get $\check A_X = e^{-\Phi}(\xi_X + a_X)e^{\Phi} = \xi_X + (\xi_X\Phi + a_X) = \xi_X + i f_X$. The $K$-equivariance of the mapping $\mathfrak{k}_r \ni X \mapsto \check A_X$ follows from the fact that $K$ acts on the hermitian line bundle $(L, h)$ by the line bundle automorphisms which preserve the metrics $h$.

We have locally that $h(\xi^L_X h^{-1}) = e^{-\Phi}|v|^2\bigl((\xi_X - a_X\, v\,\partial/\partial v)\,e^{\Phi}|v|^{-2}\bigr) = \xi_X\Phi + a_X = i f_X \circ \pi$. To prove the last relation of the proposition we notice that $i(\xi_X)\omega$ is of the type $(0, 1)$ and $i(\eta_X)\omega$ is of the type $(1, 0)$. We have to show that $\bar\partial f_X = -i(\xi_X)\omega$ and $\partial f_X = -i(\eta_X)\omega$. These equalities are complex conjugate, so we prove the former one. Let $\{z^k\}$ be local holomorphic coordinates on $U$ and $\xi_X = a^k(z)\,\partial/\partial z^k$. Then $\bar\partial f_X = -i\bar\partial(a_X + \xi_X\Phi) = -i\bar\partial\xi_X\Phi = -i a^k(z)(\partial^2\Phi/\partial z^k\partial\bar z^l)\,d\bar z^l$. Taking into account that $\omega = i(\partial^2\Phi/\partial z^k\partial\bar z^l)\,dz^k \wedge d\bar z^l$ we immediately obtain the desired equality, which completes the proof.
It follows from Proposition 1 that to a hermitian line bundle $(L, h) \to M$ on which the group $K$ acts by holomorphic automorphisms which respect the metrics $h$ there corresponds an s-module $s$ on $M$. The relation $df_X = -i(v_X)\omega$ implies that $\omega(v_X, v_Y) = f_{[X,Y]}$, therefore the $(1,1)$-form corresponding to the s-module $s$ is the curvature $\omega$ of the canonical connection $\nabla$ on $L$. The pushforward of the operator $A_X$ to $M$ coincides with the operator $l_X$, associated to $s$, $\check A_X = l_X$. The mapping $\mathfrak{k}_r \ni X \mapsto A_X$ can be extended to the homomorphism of the algebra $\mathcal{U}(\mathfrak{g}_c)$ to the algebra of holomorphic differential operators on $L$, $\mathcal{U}(\mathfrak{g}_c) \ni u \mapsto A_u$. Since the pushforward mapping is a homomorphism of the algebra of holomorphic differential operators on $L$ into the algebra of differential operators on $M$, we get the following corollary of Proposition 1.

Corollary. To a hermitian line bundle $(L, h) \to M$ on which the group $K$ acts by holomorphic automorphisms which respect the metrics $h$ there corresponds an s-module $s$ on $M$ such that for any $u \in \mathcal{U}(\mathfrak{g}_c)$ the pushforward of the operator $A_u$ from $L$ to $M$ coincides with the operator $l_u$, associated to $s$, $\check A_u = l_u$.

Denote by $L_{can}$ the canonical line bundle of $M$, i.e., the top exterior power of the holomorphic cotangent bundle $T^{*\prime}M$ of $M$, $L_{can} = \wedge^m T^{*\prime}M$, where $m = \dim_{\mathbb{C}} M$. Its local holomorphic sections are the local holomorphic $m$-forms on $M$. Let $\mu$ be a global positive volume form on $M$. One can associate to it a hermitian metrics $h_\mu$ on $L_{can}$ such that for an arbitrary local holomorphic $m$-form $\alpha$ on $M$, $h_\mu(\alpha) = \alpha \wedge \bar\alpha/\mu$. Recall that the divergence of a vector field $\xi$ with respect to the volume form $\mu$ is given by the formula $\operatorname{div}_\mu \xi = \mathcal{L}_\xi\mu/\mu$, where $\mathcal{L}_\xi$ is the Lie derivative corresponding to $\xi$.
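As an illustration of the divergence formula, consider a hypothetical one-dimensional chart with holomorphic coordinate $z$, a volume form written as $\mu = \rho\, dz \wedge d\bar z$ with a nonvanishing smooth coefficient $\rho$, and a holomorphic vector field $\xi = a(z)\,\partial/\partial z$. The names $\rho$ and $a$ are chosen for this example only and do not appear in the text. Since $\mu$ has top degree, Cartan's formula reduces to $\mathcal{L}_\xi\mu = d\bigl(i(\xi)\mu\bigr)$, which gives:

```latex
\begin{aligned}
\mathcal{L}_\xi \mu
  &= d\bigl(i(\xi)\mu\bigr)
   = d\bigl(\rho\, a\, d\bar z\bigr)
   = \frac{\partial(\rho a)}{\partial z}\, dz \wedge d\bar z , \\
\operatorname{div}_\mu \xi
  &= \frac{\mathcal{L}_\xi \mu}{\mu}
   = \frac{\partial a}{\partial z} + \frac{a}{\rho}\,\frac{\partial \rho}{\partial z} .
\end{aligned}
```

The first term is the coordinate divergence of $\xi$; the second records how the density of $\mu$ varies along $\xi$.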


Let the volume form $\mu$ on $M$ be $K$-invariant. Then for $X \in \mathfrak{k}_r$, $\operatorname{div}_\mu v_X = \operatorname{div}_\mu\xi_X + \operatorname{div}_\mu\eta_X = 0$. Since $\mu$ is real, it follows that $\operatorname{div}_\mu\xi_X$ and $\operatorname{div}_\mu\eta_X$ are complex conjugate and thus pure imaginary. For $X \in \mathfrak{k}_r$ introduce a real function $f^{can}_X = -i\operatorname{div}_\mu\xi_X$.

The natural geometric action of $K$ on the hermitian line bundle $(L_{can}, h_\mu)$ by holomorphic line bundle automorphisms preserves the metrics $h_\mu$. The corresponding infinitesimal action of an element $X \in \mathfrak{k}_r$ on the local holomorphic $m$-forms on $M$ by the Lie derivative $\mathcal{L}_{\xi_X}$ defines a global holomorphic differential operator $A_X$ on $L_{can}$.

Proposition 2. The holomorphic differential operator $A_X$ on $L_{can}$ is given by the formula $A_X = \nabla_{\xi_X} + \operatorname{div}_\mu\xi_X = \nabla_{\xi_X} + i f^{can}_X$.

Proof. Let $U \subset M$ be a local coordinate chart with holomorphic coordinates $\{z^k\}$. The form $\alpha_0 = dz^1 \wedge \dots \wedge dz^m$ is a local holomorphic trivialization of $L_{can}$. Set $\Phi = -\log h_\mu(\alpha_0)$. Then locally on $(U, \alpha_0)$, $\mu = e^{\Phi}\alpha_0 \wedge \bar\alpha_0$ and $\nabla = d - \partial\Phi$. Since $\xi_X$ is a holomorphic vector field and $\bar\alpha_0$ is an anti-holomorphic form, we get $\mathcal{L}_{\xi_X}\bar\alpha_0 = 0$. Therefore, $\mathcal{L}_{\xi_X}\mu = (\xi_X\Phi)e^{\Phi}\alpha_0 \wedge \bar\alpha_0 + e^{\Phi}(\mathcal{L}_{\xi_X}\alpha_0) \wedge \bar\alpha_0$. On the other hand, $\mathcal{L}_{\xi_X}\mu = (\operatorname{div}_\mu\xi_X)\mu = (\operatorname{div}_\mu\xi_X)e^{\Phi}\alpha_0 \wedge \bar\alpha_0$. Therefore, $(\operatorname{div}_\mu\xi_X)\alpha_0 = (\xi_X\Phi)\alpha_0 + \mathcal{L}_{\xi_X}\alpha_0$. Let $\alpha = f\alpha_0$ be a holomorphic $m$-form on $U$. The holomorphic function $f$ represents the local holomorphic section $\alpha$ of $L_{can}$ in the trivialization $(U, \alpha_0)$. Now $\mathcal{L}_{\xi_X}\alpha = \mathcal{L}_{\xi_X}(f\alpha_0) = (\xi_X f)\alpha_0 + f\mathcal{L}_{\xi_X}\alpha_0 = ((\xi_X - \xi_X\Phi)f)\alpha_0 + (\operatorname{div}_\mu\xi_X)\alpha = (\nabla_{\xi_X} + \operatorname{div}_\mu\xi_X)\alpha$. The proposition is proved.

Applying Proposition 1 we obtain the following

Corollary. The pushforward of the operator $A_X$ to $M$ is $\check A_X = \xi_X + \operatorname{div}_\mu\xi_X = \xi_X + i f^{can}_X$. The mapping $\mathfrak{k}_r \ni X \mapsto \xi_X + i f^{can}_X$ is a $K$-equivariant representation of $\mathfrak{k}_r$ in $C^\infty(M)$.

This means that if there exists a $K$-invariant measure $\mu$ on $M$, we get an s-module on $M$. It is easy to check that if we replace $\mu$ by an arbitrary $K$-invariant measure $c \cdot \mu$, $c \in \mathbb{R}_+$, we will get the same functions $f^{can}_X$, $X \in \mathfrak{k}_r$, and thus the same s-module. This s-module will be called canonical and denoted $s_{can}$. If the set of nondegenerate s-modules on $M$ is non-empty, then there exists a $K$-invariant symplectic (pseudo-Kähler) form $\omega$ on $M$ associated to a nondegenerate s-module. The corresponding symplectic volume is $K$-invariant as well, and therefore gives rise to the canonical s-module.

Suppose $\mu$ is a $K$-invariant measure on $M$, and $\mathfrak{k}_r \ni X \mapsto l_X = \xi_X + i f_X$ is a $K$-equivariant representation of $\mathfrak{k}_r$ in $C^\infty(M)$, which corresponds to the s-module $s \in S$. For a differential operator $A$ in $C^\infty(M)$, denote by $A^t$ its formal transpose with respect to the measure $\mu$, so that for all $\phi, \psi \in C_0^\infty(M)$, $\int (A\phi)\psi\, d\mu = \int \phi(A^t\psi)\, d\mu$ holds. Consider the $K$-equivariant representation $\mathfrak{k}_r \ni X \mapsto (l_{-X})^t = \xi_X - i f_X + \operatorname{div}_\mu\xi_X$. It corresponds to the s-module which we call dual to $s$ and denote by $s'$. Since the canonical module $s_{can}$ corresponds to the $K$-equivariant family $\{-i\operatorname{div}_\mu\xi_X\}$, we get $s' = -s + s_{can}$.

5. Deformation Quantizations with Separation of Variables

Recall the definition of deformation quantization on a symplectic manifold $M$ introduced in [2].

Definition. Formal differentiable deformation quantization on a symplectic manifold $M$ is a structure of associative algebra in the space of all formal series $C^\infty(M)[[\nu]]$.


The product $\star$ of two elements $f = \sum_{r\ge 0} \nu^r f_r$, $g = \sum_{r\ge 0} \nu^r g_r$ of $C^\infty(M)[[\nu]]$ is given by the following formula:

$$ f \star g = \sum_{r} \nu^r \sum_{i+j+k=r} C_i(f_j, g_k), \qquad (4) $$

where $C_r(\cdot,\cdot)$, $r = 0, 1, \dots$, are bidifferential operators such that for smooth functions $\varphi, \psi$ on $M$ the relations $C_0(\varphi,\psi) = \varphi\psi$ and $C_1(\varphi,\psi) - C_1(\psi,\varphi) = i\{\varphi,\psi\}$ hold. Here $\{\cdot,\cdot\}$ is the Poisson bracket on $M$ corresponding to the symplectic structure. Then the product $\star$ is called a star-product.

The star-product can be extended by the same formula (4) to the space $\mathcal{F} = C^\infty(M)[\nu^{-1}, \nu]]$ of formal Laurent series with a finite polar part. Since the star-product is given by bidifferential operators, it is localizable, that is, it can be restricted to any open subset $U \subset M$. For $U \subset M$ denote $\mathcal{F}(U) = C^\infty(U)[\nu^{-1}, \nu]]$ and for $f, g \in \mathcal{F}(U)$ let $L_f$ and $R_g$ denote the left star-multiplication operator by $f$ and the right star-multiplication operator by $g$ in $\mathcal{F}(U)$ respectively, so that $L_f g = f \star g = R_g f$. The operators $L_f$ and $R_g$ commute for all $f, g \in \mathcal{F}(U)$. Let $\mathcal{L}(U)$ and $\mathcal{R}(U)$ denote the algebras of left and right star-multiplication operators in $\mathcal{F}(U)$ respectively. It is important to notice that both left and right star-multiplication operators are formal Laurent series of differential operators with a finite polar part (i.e., with finitely many terms of negative degree of the formal parameter $\nu$). We call such operators formal differential operators.

Let $M$ be a complex manifold endowed with a pseudo-Kähler form $\omega_0$. This means that $\omega_0$ is a real closed nondegenerate form of the type (1,1). Then $M$ is a pseudo-Kähler manifold. The form $\omega_0$ defines a symplectic structure on $M$. A formal deformation of the pseudo-Kähler form $\omega_0$ is a formal series $\omega = \omega_0 + \nu\omega_1 + \dots$, where $\omega_r$, $r > 0$, are closed, possibly degenerate forms of the type (1,1) on $M$. On any contractible chart $U \subset M$ there exists a formal potential $\Phi = \Phi_0 + \nu\Phi_1 + \dots$ of $\omega$, which means that $\omega_r = i\partial\bar\partial\Phi_r$, $r \ge 0$.

Definition. Deformation quantization on a pseudo-Kähler manifold $M$ is called quantization with separation of variables if for any open $U \subset M$ and any holomorphic function $a(z)$ and antiholomorphic function $b(\bar z)$ on $U$, left $\star$-multiplication by $a$ and right $\star$-multiplication by $b$ are point-wise multiplications, i.e., $L_a = a$ and $R_b = b$. We call the corresponding $\star$-product a $\star$-product with separation of variables.

In [7] a complete description of all deformation quantizations with separation of variables on an arbitrary Kähler manifold was given. It was shown that such quantizations are parametrized by the formal deformations of the original Kähler form. The results obtained in [7] are trivially valid for pseudo-Kähler manifolds as well.

Theorem 2 ([7]). Deformation quantizations with separation of variables on a pseudo-Kähler manifold $M$ are in one-to-one correspondence with formal deformations of the pseudo-Kähler form $\omega_0$. If there is given a quantization with separation of variables on $M$ corresponding to a formal deformation $\omega$ of the form $\omega_0$, $U$ is a contractible coordinate chart on $M$ with holomorphic coordinates $\{z^k\}$, and $\Phi$ is a formal potential of $\omega$, then the algebra $\mathcal{L}(U)$ of the left $\star$-multiplication operators consists of those formal differential operators on $U$ which commute with all $\bar z^l$ and $\partial\Phi/\partial\bar z^l + \nu\,\partial/\partial\bar z^l$. Similarly, the algebra $\mathcal{R}(U)$ of the right $\star$-multiplication operators on $U$ consists of those formal differential operators which commute with all $z^k$ and $\partial\Phi/\partial z^k + \nu\,\partial/\partial z^k$.
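As a hedged illustration of Theorem 2 (an example not taken from the paper), consider the flat case $M = \mathbb{C}$ with the potential $\Phi = z\bar z$, which is an assumption made here only for concreteness. The operator $\bar z + \nu\,\partial/\partial z$ commutes with $\bar z$ and with $\partial\Phi/\partial\bar z + \nu\,\partial/\partial\bar z = z + \nu\,\partial/\partial\bar z$, so it is the left multiplication operator by $\bar z$, and on polynomials one gets $f \star g = \sum_r (\nu^r/r!)\,(\partial^r f/\partial\bar z^r)(\partial^r g/\partial z^r)$. The sketch below implements this product on polynomials in $z$, $\bar z$, $\nu$; the dictionary encoding and the function names are hypothetical choices.

```python
from fractions import Fraction
from math import factorial


def falling(n, r):
    """Falling factorial n(n-1)...(n-r+1); it vanishes when r > n."""
    out = 1
    for i in range(r):
        out *= n - i
    return out


def star(f, g):
    """Star-product with separation of variables on flat C (assumed potential z*zbar).

    Polynomials are dicts {(a, b, n): coeff} standing for coeff * z^a * zbar^b * nu^n.
    For monomials, zbar-derivatives hit the left factor and z-derivatives the right one.
    """
    result = {}
    for (a, b, n), cf in f.items():
        for (c, d, m), cg in g.items():
            for r in range(min(b, c) + 1):
                key = (a + c - r, b + d - r, n + m + r)
                weight = Fraction(falling(b, r) * falling(c, r), factorial(r))
                result[key] = result.get(key, 0) + cf * cg * weight
    return {k: v for k, v in result.items() if v != 0}


z = {(1, 0, 0): 1}      # the holomorphic coordinate z
zbar = {(0, 1, 0): 1}   # the antiholomorphic coordinate zbar
```

With these definitions `star(z, zbar)` is the pointwise product $z\bar z$, while `star(zbar, z)` equals $z\bar z + \nu$, so left multiplication by a holomorphic function and right multiplication by an antiholomorphic one are pointwise, as the definition requires.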


Remark. Given the algebra $\mathcal{L}(U)$, one can recover the $\star$-product $f \star g$ for $f, g \in \mathcal{F}(U)$ as follows: one finds a unique operator $A \in \mathcal{L}(U)$ such that $A1 = f$. Obviously, $A = L_f$, whence $f \star g = L_f g$.

Let $(\mathcal{F}, \star)$ denote the deformation quantization with separation of variables on $M$ corresponding to a formal deformation $\omega = \omega_0 + \nu\omega_1 + \dots$ of a pseudo-Kähler form $\omega_0$. Then for $f, g \in \mathcal{F}$, $f \star g = \sum_r \nu^r C_r(f, g)$ for bidifferential operators $C_r(\cdot,\cdot)$. Later we shall meet the product $\tilde\star$ on $\mathcal{F}$, opposite to the $\star$-product $\star$. This means that for $f, g \in \mathcal{F}$, $f \,\tilde\star\, g = g \star f = \sum_r \nu^r C_r(g, f)$, whence it is straightforward that $\tilde\star$ is the $\star$-product corresponding to a formal deformation quantization on the symplectic manifold $(M, -\omega_0)$.

Denote by $\tilde{\mathcal{L}}, \tilde{\mathcal{R}}$ the algebras of left and right star-multiplication operators of the deformation quantization $(\mathcal{F}, \tilde\star)$, and by $\tilde L_f, \tilde R_f$ the operators of left and right star-multiplication by an element $f \in \mathcal{F}$ respectively. It is clear that $\tilde L_f = R_f$, $\tilde R_f = L_f$, $\tilde{\mathcal{L}} = \mathcal{R}$, $\tilde{\mathcal{R}} = \mathcal{L}$. If $a, b$ are, respectively, a holomorphic and an antiholomorphic function on an open subset $U \subset M$, then $\tilde L_b = b$ and $\tilde R_a = a$. This means that the product $\tilde\star$ is a $\star$-product with separation of variables on the complex manifold $\bar M$, opposite to $M$ (i.e., with the opposite complex structure).

Let $U$ be a contractible coordinate chart on $M$ with holomorphic coordinates $\{z^k\}$, and $\Phi$ a formal potential of $\omega$; then the algebra $\tilde{\mathcal{L}}(U) = \mathcal{R}(U)$ consists of formal operators commuting with all $z^k$ and $\partial\Phi/\partial z^k + \nu\,\partial/\partial z^k$. Since on $\bar M$ holomorphic and antiholomorphic coordinates are swapped, the formal (1,1)-form on $\bar M$ corresponding to the quantization $(\mathcal{F}, \tilde\star)$ is $i\bar\partial\partial\Phi = -\omega$. This (1,1)-form is a formal deformation of the pseudo-Kähler form $-\omega_0$ on $\bar M$.

Let $M$ be a $K$-homogeneous complex manifold, $s_0, s_1, \dots$ be s-modules on $M$, $\{f_X^n\}$ and $\omega_n$ be the $K$-equivariant family and the (1,1)-form on $M$, respectively, associated to $s_n$. Then $df_X^n = -i(v_X)\omega_n$ (here $i(v_X)$ denotes the contraction with $v_X$). Assume that $s_0$ is nondegenerate, i.e., $\omega_0$ is a pseudo-Kähler form. Denote by $(\mathcal{F}, \star)$ the deformation quantization with separation of variables on $M$ corresponding to the formal deformation $\omega = \omega_0 + \nu\omega_1 + \dots$ of the pseudo-Kähler form $\omega_0$. Since all the (1,1)-forms $\omega_n$ are $K$-invariant, the $\star$-product $\star$ is invariant under $K$-shifts.

For $X \in \mathfrak{k}_r$ denote $f_X^{(\nu)} = f_X^0 + \nu f_X^1 + \dots$. Introduce a formal operator $l_X^{(\nu)} = \xi_X + (i/\nu) f_X^{(\nu)}$.

Proposition 3. The mapping $\mathfrak{k}_r \ni X \mapsto l_X^{(\nu)}$ is a Lie algebra homomorphism of $\mathfrak{k}_r$ to the algebra $\mathcal{L}(M)$ of the left $\star$-multiplication operators of the deformation quantization $(\mathcal{F}, \star)$. It is $K$-equivariant with respect to the coadjoint action on $\mathfrak{k}_r$ and the conjugation by shift operators in $\mathcal{L}(M)$.

Proof. The mapping $\mathfrak{k}_r \ni X \mapsto l_X^{(\nu)}$ is a $K$-equivariant Lie algebra homomorphism to the Lie algebra of formal operators on $M$ if and only if $\xi_X f_Y^{(\nu)} - \xi_Y f_X^{(\nu)} = f_{[X,Y]}^{(\nu)}$ and $T_k f_X^{(\nu)} = f_{\mathrm{Ad}(k)X}^{(\nu)}$ for all $X, Y \in \mathfrak{k}_r$ and $k \in K$. These relations follow immediately from the corresponding relations for the functions $f_X^n$. Theorem 2 tells that in order to show that $l_X^{(\nu)} \in \mathcal{L}(M)$ one has to check that for a formal potential $\Phi$ of $\omega$ on any contractible coordinate chart $U$ with holomorphic coordinates $\{z^k\}$ the formal operator $l_X^{(\nu)} = \xi_X + (i/\nu) f_X^{(\nu)}$ commutes with all $\bar z^l$ and $\partial\Phi/\partial\bar z^l + \nu\,\partial/\partial\bar z^l$. Thus we have to check the equality

$$ \xi_X(\partial\Phi/\partial\bar z^l) = i\,\partial f_X^{(\nu)}/\partial\bar z^l. \qquad (5) $$


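Taking into account that $\omega = i\partial\bar\partial\Phi = i(\partial^2\Phi/\partial z^k\partial\bar z^l)\, dz^k \wedge d\bar z^l$ and writing down the local expression for $\xi_X$, $\xi_X = a^k(z)\,\partial/\partial z^k$, we rewrite the left hand side of (5) as $a^k(z)\,\partial^2\Phi/\partial z^k\partial\bar z^l$. On the other hand, $i\,\partial f_X^{(\nu)}/\partial\bar z^l = i\langle -i(v_X)\omega, \partial/\partial\bar z^l\rangle = -i\omega(v_X, \partial/\partial\bar z^l) = a^k(z)\,\partial^2\Phi/\partial z^k\partial\bar z^l$, which proves (5) and completes the proof of the proposition.

Extend the mapping $\mathfrak{k}_r \ni X \mapsto l_X^{(\nu)}$ to the homomorphism $U(\mathfrak{g}_c) \ni u \mapsto l_u^{(\nu)}$ from $U(\mathfrak{g}_c)$ to $\mathcal{L}(M)$ and set $\sigma_u^{(\nu)} = l_u^{(\nu)} 1$.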

Corollary. The mapping $U(\mathfrak{g}_c) \ni u \mapsto \sigma_u^{(\nu)}$ is a homomorphism from $U(\mathfrak{g}_c)$ to the algebra $(\mathcal{F}, \star)$, $K$-equivariant with respect to the adjoint action on $U(\mathfrak{g}_c)$ and the shift action on $\mathcal{F}$.

This result was obtained independently in [1]. It follows that the mapping $U(\mathfrak{g}_c) \ni u \mapsto \sigma_u^{(\nu)}$ maps the elements of the center $Z(\mathfrak{g}_c)$ of $U(\mathfrak{g}_c)$ to formal series with constant coefficients.

Lemma 7. For $z \in Z(\mathfrak{g}_c)$ the operator $l_z^{(\nu)}$ is scalar and is equal to $\sigma_z^{(\nu)}$.

Proof. If $A \in \mathcal{L}(M)$ then $A = L_f$ for $f = A1 \in \mathcal{F}$. Therefore $l_z^{(\nu)} = L_{\sigma_z^{(\nu)}}$. Let $B$ denote the multiplication operator by the formal series with constant coefficients $\sigma_z^{(\nu)}$. It commutes with all formal differential operators and therefore $B \in \mathcal{L}(M)$. Since $B1 = \sigma_z^{(\nu)}$ we get that $l_z^{(\nu)} = B$. The lemma is proved.

It was shown in Sect. 3 that for a given s-module $s$ on $M$ the function $\sigma_z = l_z 1$, $z \in Z(\mathfrak{g}_c)$, is scalar and is equal to the value $\psi(z)$ of the central character $\psi$ associated to $s$. Yet it does not mean that for $z \in Z(\mathfrak{g}_c)$ the corresponding operator $l_z$ is scalar. We shall use deformation quantization to prove the following proposition.

Proposition 4. Let $s_1$ be an arbitrary s-module on $M$, $l_u$, $u \in U(\mathfrak{g}_c)$, and $\psi$ be the associated operators and the central character of $U(\mathfrak{g}_c)$ respectively. If the set of nondegenerate s-modules on $M$ is non-empty then for $z \in Z(\mathfrak{g}_c)$, $l_z = \psi(z)\cdot 1$ holds. (We denote by 1 the identity operator.)

Proof. Choose a nondegenerate s-module $s_0$. Denote by $\{f_X^j\}$ and by $\omega_j$ the $K$-equivariant family and the (1,1)-form associated to $s_j$, $j = 0, 1$, respectively. Consider a parameter dependent s-module $s(t) = ts_0 + s_1$. The $K$-equivariant family $\{f_X\}$ associated to $s(t)$ is such that $f_X = t f_X^0 + f_X^1$. Thus for $X \in \mathfrak{k}_r$ the operator $l_X(t)$ associated to $s(t)$ is given by the formula $l_X(t) = \xi_X + i(t f_X^0 + f_X^1)$. When $t = 0$ the operator $l_X(t)$ reduces to the operator $l_X = \xi_X + i f_X^1$ associated to the (possibly degenerate) s-module $s_1$. If we replace the parameter $t$ in $l_X(t)$ by $1/\nu$ we will get the operator $l_X^{(\nu)} = \xi_X + (i/\nu)(f_X^0 + \nu f_X^1)$ of the deformation quantization with separation of variables $(\mathcal{F}, \star)$ which corresponds to the formal (1,1)-form $\omega = \omega_0 + \nu\omega_1$. For $z \in Z(\mathfrak{g}_c)$ the operator $l_z(t)$ is polynomial in $t$. If we replace $t$ by $1/\nu$ in $l_z(t)$ we will get the operator $l_z^{(\nu)}$ which is scalar by Lemma 7. Therefore $l_z(t)$ is scalar as well. Taking $t = 0$ we get that the operator $l_z(0) = l_z$ associated to the s-module $s_1$ is scalar. Since $l_z 1 = \sigma_z = \psi(z)$ it follows that $l_z = \psi(z)\cdot 1$. This completes the proof.

Theorem 3. Let $(L, h) \to M$ be a hermitian line bundle on $M$ on which the group $K$ acts by holomorphic automorphisms which preserve the metric $h$. The algebra $U(\mathfrak{g}_c)$ acts on $(L, h)$ by holomorphic differential operators $A_u$, $u \in U(\mathfrak{g}_c)$. Let $s$ be the corresponding s-module on $M$ and $\psi$ be the central character of $U(\mathfrak{g}_c)$ associated to $s$. If the set of nondegenerate s-modules on $M$ is non-empty, the center $Z(\mathfrak{g}_c)$ of $U(\mathfrak{g}_c)$ acts on the sheaf of local holomorphic sections of $L$ by scalar operators $A_z = \psi(z)\cdot 1$, $z \in Z(\mathfrak{g}_c)$.

Proof. It follows from Proposition 3 and the corollary to Proposition 1 that for $z \in Z(\mathfrak{g}_c)$ the pushforward of the holomorphic differential operator $A_z$ from $L$ to $M$ is scalar and is equal to $\psi(z) = \sigma_z$. Now the theorem is a consequence of the fact that the pushforward mapping $A \mapsto \check{A}$ is injective.

6. s-Modules on Flag Manifolds

We are going to apply the results obtained above to the case of $K$ being a compact semisimple Lie group. The general facts from the theory of semisimple Lie groups mentioned below may be found in [15].

Let $\mathfrak{g}_c$ be a complex semisimple Lie algebra, $\mathfrak{h}_c$ its Cartan subalgebra, $\mathfrak{h}_c^*$ the dual of $\mathfrak{h}_c$, $W$ the Weyl group of the pair $(\mathfrak{g}_c, \mathfrak{h}_c)$, $\Delta, \Delta^+, \Delta^-, \Sigma \subset \mathfrak{h}_c^*$ the sets of all nonzero, positive, negative and simple roots respectively, $\delta$ the half-sum of positive roots. For each $\alpha \in \Delta$ choose weight elements $X_\alpha \in \mathfrak{g}_c$ such that $[H_\alpha, X_{\pm\alpha}] = \pm 2X_{\pm\alpha}$ for $H_\alpha = [X_\alpha, X_{-\alpha}]$. An element $\lambda \in \mathfrak{h}_c^*$ is called dominant if $\lambda(H_\alpha) \ge 0$ for all $\alpha \in \Sigma$, and is a weight if $\lambda(H_\alpha) \in \mathbb{Z}$ for all $\alpha \in \Sigma$. Denote by $\mathcal{W}$ the set of all weights in $\mathfrak{h}_c^*$ (the weight lattice).

Fix an arbitrary subset $\Theta$ of $\Sigma$ and denote by $\langle\Theta\rangle$ the set of roots which are linear combinations of elements of $\Theta$. Then $\Pi = \langle\Theta\rangle \cup \Delta^-$ is a parabolic subset of $\Delta$. Denote by $\mathfrak{g}_c^\Theta$ the Levi subalgebra of $\mathfrak{g}_c$ generated by $\mathfrak{h}_c$ and $X_\alpha$, $\alpha \in \langle\Theta\rangle$, and by $\mathfrak{q}_c$ the parabolic subalgebra generated by $\mathfrak{h}_c$ and $X_\alpha$, $\alpha \in \Pi$. Denote by $\mathfrak{g}_r, \mathfrak{q}_r, \mathfrak{g}_r^\Theta$ the realifications of $\mathfrak{g}_c, \mathfrak{q}_c, \mathfrak{g}_c^\Theta$ respectively, and by $J$ the complex structure in $\mathfrak{g}_r$ inherited from $\mathfrak{g}_c$. Let $\mathfrak{k}_r \subset \mathfrak{g}_r$ denote the compact form of $\mathfrak{g}_c$ generated by $JH_\alpha$, $X_\alpha - X_{-\alpha}$, $J(X_\alpha + X_{-\alpha})$, $\alpha \in \Delta$. Define $\mathfrak{k}_r^\Theta = \mathfrak{k}_r \cap \mathfrak{g}_c^\Theta = \mathfrak{k}_r \cap \mathfrak{q}_c$.
It is generated by $JH_\alpha$, $\alpha \in \Delta$, and $X_\alpha - X_{-\alpha}$, $J(X_\alpha + X_{-\alpha})$, $\alpha \in \langle\Theta\rangle$. Introduce the real Lie algebra $\mathfrak{t}_r = \mathfrak{h}_c \cap \mathfrak{k}_r$, the Lie algebra of a maximal torus in $K$. It is generated by $JH_\alpha$, $\alpha \in \Delta$.

Let $G$ be a complex connected simply connected Lie group with the Lie algebra $\mathfrak{g}_r$, $G^\Theta$ and $Q$ the Levi and parabolic subgroups of $G$ with the Lie algebras $\mathfrak{g}_r^\Theta$ and $\mathfrak{q}_r$ respectively. In the rest of this paper $K$ will denote the maximal compact subgroup of $G$ with the Lie algebra $\mathfrak{k}_r$, and $K^\Theta = K \cap G^\Theta = K \cap Q$. It is known that $K^\Theta$ is the centralizer of a torus and is connected, and that $G/Q = K/K^\Theta$ is a complex compact homogeneous manifold (a generalized flag manifold). Denote it by $M$.

Denote by $x_0$ the class of the unit element of $K$ in $M$ (the "origin" of $M$) and by $\mathcal{E}$ the set of all $K^\Theta$-invariant points of $\mathfrak{k}_r$. The set of $K$-equivariant mappings $\gamma : M \to \mathfrak{k}_r$ is parametrized by $\mathcal{E}$ so that $\gamma$ corresponds to $E = \gamma(x_0) \in \mathcal{E}$. Since the group $K^\Theta$ is connected, the set $\mathcal{E}$ is the centralizer of $\mathfrak{k}_r^\Theta$. It is easy to check that $\mathcal{E} = \{H \in \mathfrak{t}_r \mid \alpha(H) = 0 \text{ for all } \alpha \in \langle\Theta\rangle\}$.

Denote by $(\cdot,\cdot)$ the Killing form on $\mathfrak{g}_c$. It is $\mathbb{C}$-linear, and its restriction to $\mathfrak{k}_r$ is negative-definite. Identify the dual $\mathfrak{k}_r^*$ of the Lie algebra $\mathfrak{k}_r$ with $\mathfrak{k}_r$ via the Killing form.

We are going to show that any $K$-equivariant mapping $\gamma : M \to \mathfrak{k}_r$ (or the $K$-equivariant family defined by $\gamma$) corresponds to an s-module on $M$. Let $\Omega \subset \mathfrak{k}_r$ be the orbit of the point $E = \gamma(x_0) \in \mathcal{E}$, $\omega_\Omega$ be the Kirillov 2-form on $\Omega$, and $v_X^\Omega$, $X \in \mathfrak{k}_r$, the fundamental vector fields on $\Omega$. Then the 2-form $\omega$ on $M$ corresponding to $\gamma$ equals $\gamma^*\omega_\Omega$. It is known that at the point $E \in \Omega$ for $X, Y \in \mathfrak{k}_r$, $\omega_\Omega(v_X^\Omega, v_Y^\Omega) = (E, [X, Y])$ holds. Thus at the point $x_0 \in M$, $\omega(v_X, v_Y) = (E, [X, Y])$.

The tangent space $T_{x_0}M$ to the complex manifold $M$ carries the natural complex structure $\tilde J$. In view of Theorem 1, in order to show that the mapping $\gamma$ corresponds to an s-module it is enough to check that the form $\omega$ on the tangent space $T_{x_0}M$ is of the type (1,1) or, equivalently, that for any $v_1, v_2 \in T_{x_0}M$, $\omega(v_1, v_2) = \omega(\tilde J v_1, \tilde J v_2)$ holds. We can identify $T_{x_0}M$ as a real vector space with the subspace of $\mathfrak{k}_r$ generated by the basis consisting of the elements $X_\alpha - X_{-\alpha}$, $J(X_\alpha + X_{-\alpha})$, $\alpha \in \Delta^+\setminus\langle\Theta\rangle$. Since $\mathfrak{g}_r/\mathfrak{q}_r = \mathfrak{k}_r/\mathfrak{k}_r^\Theta$, we get that $\tilde J(X_\alpha - X_{-\alpha}) = J(X_\alpha + X_{-\alpha})$ for $\alpha \in \Delta^+\setminus\langle\Theta\rangle$. The tangent space $T_{x_0}M$ can be represented as the direct sum of 2-dimensional real subspaces spanned by the vectors $X_\alpha - X_{-\alpha}$, $J(X_\alpha + X_{-\alpha})$, $\alpha \in \Delta^+\setminus\langle\Theta\rangle$. These subspaces are mutually orthogonal with respect to the skew-symmetric form $(E, [\cdot,\cdot])$. Now, for $\alpha \in \Delta^+\setminus\langle\Theta\rangle$ we have

$$ (E, [\tilde J(X_\alpha - X_{-\alpha}), \tilde J J(X_\alpha + X_{-\alpha})]) = (E, [J(X_\alpha + X_{-\alpha}), -(X_\alpha - X_{-\alpha})]) = (E, [X_\alpha - X_{-\alpha}, J(X_\alpha + X_{-\alpha})]) = i(E, [X_\alpha - X_{-\alpha}, X_\alpha + X_{-\alpha}]) = 2i(E, H_\alpha). $$

Thus the form $\omega$ is of the type (1,1).

For $\alpha \in \Delta\setminus\langle\Theta\rangle$ the linear functional $\mathcal{E} \ni H \mapsto \alpha(H)$ is nonzero, therefore the set $\mathcal{E}_{reg} = \{H \in \mathcal{E} \mid (H, H_\alpha) \ne 0 \text{ for all } \alpha \in \Delta\setminus\langle\Theta\rangle\}$ is a dense open subset of $\mathcal{E}$. The form $\omega$ is nondegenerate iff $E \in \mathcal{E}_{reg}$. It is known that under the adjoint action of the compact group $K$ on $\mathfrak{k}_r$ the isotropy subgroup of any element of $\mathfrak{k}_r$ is connected. Now if $\omega$ is nondegenerate, the isotropy subgroup of $E = \gamma(x_0)$ coincides with $K^\Theta$ and thus the mapping $\gamma : M \to \Omega$ is a bijection.

Define a sesquilinear form $\langle\cdot,\cdot\rangle$ on $(TM, \tilde J)$ by the formula $\langle v_1, v_2\rangle = \omega(v_1, \tilde J v_2) - i\omega(v_1, v_2)$. If $\omega$ is nondegenerate and thus pseudo-Kähler, the form $\langle\cdot,\cdot\rangle$ is the corresponding pseudo-Kähler metric on $M$. The vectors $X_\alpha - X_{-\alpha}$, $\alpha \in \Delta^+\setminus\langle\Theta\rangle$, form a basis in the complex vector space $(T_{x_0}M, \tilde J)$. They are orthogonal with respect to the form $\langle\cdot,\cdot\rangle$. We have

$$ \langle X_\alpha - X_{-\alpha}, X_\alpha - X_{-\alpha}\rangle = (E, [X_\alpha - X_{-\alpha}, J(X_\alpha + X_{-\alpha})]) = i(E, [X_\alpha - X_{-\alpha}, X_\alpha + X_{-\alpha}]) = 2i(E, H_\alpha) = 2(E, JH_\alpha). $$

(Notice that since $E, JH_\alpha \in \mathfrak{t}_r$, $(E, JH_\alpha)$ is real.) Thus we have proved the following theorem.

Theorem 4. To an arbitrary $K$-equivariant mapping $\gamma : M \to \mathfrak{k}_r$ there corresponds an s-module $s$ on $M$. It is nondegenerate iff for $E = \gamma(x_0)$ and all $\alpha \in \Delta\setminus\langle\Theta\rangle$, $(E, H_\alpha) \ne 0$ holds. The set of nondegenerate s-modules on $M$ is non-empty. For a nondegenerate $s$ the associated mapping $\gamma : M \to \Omega = \gamma(M)$ is a bijection and the pseudo-Kähler structure on $M$, pushed forward to the orbit $\Omega$, defines a pseudo-Kähler polarization on it. The index of inertia of the corresponding pseudo-Kähler metric $\langle\cdot,\cdot\rangle$ on $M$ (i.e., the number of minuses in the signature) equals $\#\{\alpha \in \Delta^+\setminus\langle\Theta\rangle \mid (E, JH_\alpha) < 0\}$.
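As a hedged illustration of Theorem 4 (a worked example that is not part of the original text), consider the rank one case $\mathfrak{g}_c = \mathfrak{sl}(2,\mathbb{C})$ with $\Theta = \emptyset$, so that $K = SU(2)$, $K^\Theta$ is a maximal torus and $M$ is the projective line. The only outside fact assumed is the standard value $(H_\alpha, H_\alpha) = 8$ of the Killing form of $\mathfrak{sl}(2,\mathbb{C})$.

```latex
% Assumptions: g_c = sl(2,C), Theta empty, one positive root alpha,
% Killing form normalized so that (H_alpha, H_alpha) = 8.
\begin{align*}
\mathcal{E} &= \mathfrak{t}_r = \mathbb{R}\, JH_\alpha, \qquad E = c\, JH_\alpha, \quad c \in \mathbb{R},\\
(E, H_\alpha) &= c\,(JH_\alpha, H_\alpha) = ic\,(H_\alpha, H_\alpha) = 8ic,\\
(E, JH_\alpha) &= c\, i^2\, (H_\alpha, H_\alpha) = -8c .
\end{align*}
% Hence the s-module is nondegenerate iff c \neq 0, and the index of inertia
% \#\{\alpha \in \Delta^+ : (E, JH_\alpha) < 0\} equals 1 for c > 0 and 0 for c < 0.
```

In this example the sign of the single parameter $c$ decides whether the invariant metric on $M$ is positive or negative definite, which is the simplest instance of the signature count in the theorem.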

7. Convergent Star-Products on Flag Manifolds

We are going to extend the class of convergent star-products on generalized flag manifolds introduced in [4], using results from [8]. We retain the notations of Sect. 6. In particular, the group $K$ is compact semisimple and $M$ is a generalized flag manifold.

A representation of the group $K$ in a vector space $V$ is called $K$-finite if any vector $v \in V$ is $K$-finite, i.e., the set $\{kv\}$, $k \in K$, is contained in a finite dimensional subspace of $V$. If this is the case, $V$ splits into the direct sum of isotypic components. For a dominant weight $\zeta \in \mathcal{W}$ denote by $V^\zeta$ the component isomorphic to a multiple of the irreducible representation of $K$ with highest weight $\zeta$.


For a $K$-homogeneous manifold $M$ denote by $F(M)$ the space of continuous functions on $M$ which are $K$-finite with respect to the shift action. Since $K$ is compact, it follows from the Frobenius theorem that each isotypic component $F(M)^\zeta$ is finite dimensional.

Let $\Omega \subset \mathfrak{k}_r$ be a $K$-orbit. A function on $\Omega$ is called regular if it is the restriction of a polynomial function on $\mathfrak{k}_r$. It is easy to show that the set of all regular functions on $\Omega$ coincides with $F(\Omega)$ (see, e.g., [8]).

Let $d$ be a nonnegative integer. Denote by $U_d$ the subspace of $U(\mathfrak{g}_c)$ generated by all monomials of the form $X_1 \dots X_k$, where $X_1, \dots, X_k \in \mathfrak{g}_c$ and $k \le d$. The subspaces $\{U_d\}$ determine the canonical filtration on $U(\mathfrak{g}_c)$.

The symmetric algebra $S(\mathfrak{g}_c)$ can be identified with the space of polynomials on $\mathfrak{k}_r$, so that the element $X \in \mathfrak{g}_c$ corresponds to the linear functional on $\mathfrak{k}_r$, $\tilde X(Y) = (X, Y)$, $Y \in \mathfrak{k}_r$. Let $S^d(\mathfrak{g}_c)$ be the space of homogeneous polynomials on $\mathfrak{k}_r$ of degree $d$. The graded algebra associated with the canonical filtration on $U(\mathfrak{g}_c)$ is canonically isomorphic to $S(\mathfrak{g}_c)$, so that $U_d/U_{d-1}$ corresponds to $S^d(\mathfrak{g}_c)$. For $u \in U_d$ let $u^{(d)}$ denote the corresponding element of $S^d(\mathfrak{g}_c)$. If $k \le d$ and $u = X_1 \dots X_k \in U_d$, then $u^{(d)} = 0$ for $k < d$ and $u^{(d)} = \tilde X_1 \dots \tilde X_d$ for $k = d$.

We say that a parameter dependent vector $v(\hbar)$ in a vector space $V$ depends rationally on a real parameter $\hbar$ if $v(\hbar)$ can be represented in a form $v(\hbar) = \sum_j a_j(\hbar) v_j$ for a finite number of elements $v_j \in V$ and rational functions $a_j(\hbar)$, i.e., $v(\hbar) \in \mathbb{C}((\hbar)) \otimes V$, where $\mathbb{C}((\hbar))$ is the field of rational functions of $\hbar$. Denote by $O(\hbar) \subset \mathbb{C}((\hbar))$ the ring of rational functions of $\hbar$ regular at $\hbar = 0$. The vector $v(\hbar)$ is called regular at $\hbar = 0$ if $v(\hbar) \in O(\hbar) \otimes V$.

Let $v(\hbar) = \sum_r \hbar^r v_r$, $v_r \in V$, be the Laurent expansion of $v(\hbar)$ at $\hbar = 0$. Since $v(\hbar)$ depends rationally on $\hbar$, its Laurent expansion has a finite polar part. Denote by $\Psi(v(\hbar))$ the corresponding formal Laurent series, $\Psi(v(\hbar)) = \sum_r \nu^r v_r$.

The set $S$ of s-modules on $M$ is a finite dimensional vector space. Thus we can consider an s-module $s(\hbar)$ on $M$ depending rationally on $\hbar$ and regular at $\hbar = 0$. Denote by $\omega(\hbar)$ the (1,1)-form associated to $s(\hbar)$. It is clear that $\omega(\hbar)$ also depends rationally on $\hbar$ and is regular at $\hbar = 0$. Moreover, $\Psi(s(\hbar)) = \sum_{r\ge 0} \nu^r s_r$ for some $s_r \in S$ and $\Psi(\omega(\hbar)) = \sum_{r\ge 0} \nu^r \omega_r$, where $\omega_r$ is the (1,1)-form associated to $s_r$.
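To make the passage from a rational function regular at the origin to its formal series concrete, the following minimal sketch computes the Taylor coefficients that the expansion map assigns to a scalar rational function of the parameter. It is only an illustration under stated assumptions: the function name `taylor_coefficients` and the coefficient-list encoding (lowest degree first) are hypothetical, and exact arithmetic with `fractions.Fraction` is a convenience choice.

```python
from fractions import Fraction


def taylor_coefficients(num, den, order):
    """Return [c_0, ..., c_{order-1}] with num(h)/den(h) = sum_r c_r h^r near h = 0.

    num and den are coefficient lists, lowest degree first. The function must be
    regular at h = 0, i.e. den[0] != 0. Uses the recurrence obtained by comparing
    coefficients in den(h) * sum_r c_r h^r = num(h).
    """
    if den[0] == 0:
        raise ValueError("rational function is not regular at h = 0")
    coeffs = []
    for n in range(order):
        acc = Fraction(num[n]) if n < len(num) else Fraction(0)
        for k in range(1, min(n, len(den) - 1) + 1):
            acc -= den[k] * coeffs[n - k]
        coeffs.append(acc / den[0])
    return coeffs
```

For instance, `taylor_coefficients([1], [1, -1], 4)` expands $1/(1-\hbar)$; replacing $\hbar^r$ by $\nu^r$ in the result gives the first terms of the corresponding formal series.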

Denote by $\gamma(\hbar), \gamma_r : M \to \mathfrak{k}_r$ the $K$-equivariant mappings and by $\{f_X^{(\hbar)}\}$, $\{f_X^r\}$ the $K$-equivariant families corresponding to $s(\hbar)$, $s_r$ respectively. For $X \in \mathfrak{k}_r$ the function $f_X^{(\hbar)}$ on $M$ depends rationally on $\hbar$, is regular at $\hbar = 0$ and $\Psi(f_X^{(\hbar)}) = \sum_{r\ge 0} \nu^r f_X^r$.

The $K$-equivariant family $\{(1/\hbar) f_X^{(\hbar)}\}$ corresponds to the s-module $(1/\hbar)s(\hbar)$. It will be convenient for us to denote by $l_u^{(\hbar)}$, $u \in U(\mathfrak{g}_c)$, the operators on $M$ associated to the s-module $(1/\hbar)s(\hbar)$ (rather than to $s(\hbar)$) and set $\sigma_u^{(\hbar)} = l_u^{(\hbar)} 1$. In particular, for $X \in \mathfrak{k}_r$, $l_X^{(\hbar)} = \xi_X + (i/\hbar) f_X^{(\hbar)}$ and $\sigma_X^{(\hbar)} = (i/\hbar) f_X^{(\hbar)}$.

Let $D_\hbar$ denote the algebra of differential operators on $M$ depending rationally on $\hbar$.

Lemma 8. For $u \in U_d$ the differential operator $\hbar^d l_u^{(\hbar)}$ belongs to $D_\hbar$. It is regular at $\hbar = 0$ and $\lim_{\hbar\to 0} \hbar^d l_u^{(\hbar)}$ is a multiplication operator by the function $i^d u^{(d)} \circ \gamma_0$. In particular, the function $\hbar^d \sigma_u^{(\hbar)}$ depends rationally on $\hbar$, is regular at $\hbar = 0$ and $\lim_{\hbar\to 0} \hbar^d \sigma_u^{(\hbar)} = i^d u^{(d)} \circ \gamma_0$.

Proof. The function $f_X^{(\hbar)}$ equals $f_X^0 = X^{(1)} \circ \gamma_0$ at $\hbar = 0$. Let $u = X_1 \dots X_k$, $X_j \in \mathfrak{k}_r$, for $k \le d$. Then $u \in U_d$. We have $\hbar^d l_u^{(\hbar)} = \hbar^d l_{X_1}^{(\hbar)} \dots l_{X_k}^{(\hbar)} = \hbar^{d-k} (\hbar\xi_{X_1} + i f_{X_1}^{(\hbar)}) \dots (\hbar\xi_{X_k} + i f_{X_k}^{(\hbar)})$. Thus the limit $\lim_{\hbar\to 0} \hbar^d l_u^{(\hbar)}$ equals zero if $k < d$ and equals $i^d f_{X_1}^0 \dots f_{X_d}^0 = i^d u^{(d)} \circ \gamma_0$ if $k = d$, whence the lemma follows immediately.
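The limit in Lemma 8 can be made explicit in the lowest nontrivial case. The expansion below is a worked sketch for $d = k = 2$, written out here for illustration and using only the formula for $l_X^{(\hbar)}$ and the Leibniz rule; a vector field composed with a multiplication operator is expanded as $\xi \circ f = (\xi f) + f\,\xi$.

```latex
\begin{align*}
\hbar^2\, l^{(\hbar)}_{X_1 X_2}
  &= \bigl(\hbar\,\xi_{X_1} + i f^{(\hbar)}_{X_1}\bigr)\bigl(\hbar\,\xi_{X_2} + i f^{(\hbar)}_{X_2}\bigr)\\
  &= -\, f^{(\hbar)}_{X_1} f^{(\hbar)}_{X_2}
     + i\hbar\,\Bigl( \bigl(\xi_{X_1} f^{(\hbar)}_{X_2}\bigr)
        + f^{(\hbar)}_{X_2}\,\xi_{X_1} + f^{(\hbar)}_{X_1}\,\xi_{X_2} \Bigr)
     + \hbar^2\, \xi_{X_1}\xi_{X_2},\\
\hbar^2\, \sigma^{(\hbar)}_{X_1 X_2}
  &= -\, f^{(\hbar)}_{X_1} f^{(\hbar)}_{X_2} + i\hbar\, \xi_{X_1} f^{(\hbar)}_{X_2},\\
\lim_{\hbar \to 0} \hbar^2\, l^{(\hbar)}_{X_1 X_2}
  &= -\, f^{0}_{X_1} f^{0}_{X_2} = i^2\, (X_1 X_2)^{(2)} \circ \gamma_0 .
\end{align*}
```

All terms containing derivatives carry at least one factor of $\hbar$, which is why only the multiplication operator survives in the limit.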


For the rest of the section denote $\Omega = \gamma_0(M)$ and assume that the s-module $s_0$ is nondegenerate, so that $\omega_0$ is pseudo-Kähler. Then Theorem 4 implies that the $K$-equivariant mapping $\gamma_0 : M \to \Omega$ is a bijection. Thus $\gamma_0^* : F(\Omega) \to F(M)$ is an isomorphism of $K$-modules.

Proposition 5. If the s-module $s_0$ is nondegenerate, then for any $f \in F(M)$ there exist elements $u_j \in U_{d(j)}$ for some numbers $d(j)$ and rational functions $a_j(\hbar)$ regular at $\hbar = 0$, such that $f = \sum_j \hbar^{d(j)} a_j(\hbar)\, \sigma_{u_j}^{(\hbar)}$ for all but a finite number of values of $\hbar$.

Proof. Fix a dominant weight $\zeta \in \mathcal{W}$. The subspace $U_d \subset U(\mathfrak{g}_c)$ is invariant under the adjoint action of the group $K$, and is finite dimensional. The mapping $U_d \ni u \mapsto u^{(d)} \in S^d(\mathfrak{g}_c)$ is $K$-equivariant, therefore it maps $U_d^\zeta$ to $F(\Omega)^\zeta$. Since the space $F(\Omega)$ coincides with the space of regular functions on $\Omega$ and is isomorphic to $F(M)$, one can choose elements $u_j \in U_{d(j)}^\zeta$ for some numbers $d(j)$ such that the functions $f_j = i^{d(j)} u_j^{(d(j))} \circ \gamma_0$ form a basis $\{f_j\}$ in $F(M)^\zeta$. Since the function $\tilde f_j = \hbar^{d(j)} \sigma_{u_j}^{(\hbar)}$ on $M$ depends rationally on $\hbar$ and is regular at $\hbar = 0$, the elements of the matrix $(b_{jk}(\hbar))$ such that $\tilde f_j = \sum_k b_{jk}(\hbar) f_k$ are rational functions of $\hbar$ regular at $\hbar = 0$. It follows from Lemma 8 that the matrix $(b_{jk}(\hbar))$ coincides with the identity matrix at $\hbar = 0$. Thus the elements of the inverse matrix $(a_{kj}(\hbar)) = (b_{jk}(\hbar))^{-1}$ such that $f_k = \sum_j a_{kj}(\hbar) \tilde f_j = \sum_j a_{kj}(\hbar)\, \hbar^{d(j)} \sigma_{u_j}^{(\hbar)}$ are also rational functions of $\hbar$ regular at $\hbar = 0$. Now the proposition follows from the fact that the space $F(M)$ is a direct sum of the subspaces $F(M)^\zeta$.

Let $(\mathcal{A}_\hbar, *_\hbar)$ denote the algebra of functions on $M$ associated to the s-module $(1/\hbar)s(\hbar)$. Any function $f \in F(M)$ can be represented in the form $f = \sum_j \hbar^{d(j)} a_j(\hbar)\, \sigma_{u_j}^{(\hbar)}$ for some $u_j \in U_{d(j)}$. Thus $f \in \mathcal{A}_\hbar$ for all but a finite number of values of $\hbar$. For a function $g \in \mathcal{A}_\hbar$, $f *_\hbar g = \sum_j \hbar^{d(j)} a_j(\hbar)\, l_{u_j}^{(\hbar)} g$ holds. We get from Lemma 8 the following corollary to Proposition 5.

Corollary.
Any functions $f, g \in F(M)$ are elements of the algebra $(\mathcal{A}_\hbar, *_\hbar)$ for all but a finite number of values of $\hbar$. The product $f *_\hbar g$ as a function on $M$ depends rationally on $\hbar$ and is regular at $\hbar = 0$, i.e., $f *_\hbar g \in O(\hbar) \otimes F(M)$.

Remark. It is easy to show that extending the multiplication $*_\hbar$ by $O(\hbar)$-linearity we obtain the associative algebra $(O(\hbar) \otimes F(M), *_\hbar)$ over the ring $O(\hbar)$ of rational functions of $\hbar$ regular at $\hbar = 0$.

Denote by $\omega$ the formal (1,1)-form $\Psi(\omega(\hbar)) = \omega_0 + \nu\omega_1 + \dots$. It is a formal deformation of the pseudo-Kähler form $\omega_0$. Denote by $(\mathcal{F}, \star)$ the deformation quantization with separation of variables on $M$ corresponding to $\omega$.

Set $f_X^{(\nu)} = \Psi(f_X^{(\hbar)}) = f_X^0 + \nu f_X^1 + \dots$. Then $\Psi(l_X^{(\hbar)}) = \xi_X + (i/\nu) f_X^{(\nu)}$. It follows from Proposition 3 that for $X \in \mathfrak{k}_r$ the operator $l_X^{(\nu)} = \xi_X + (i/\nu) f_X^{(\nu)}$ belongs to the algebra $\mathcal{L}$ of left $\star$-multiplication operators of the deformation quantization $(\mathcal{F}, \star)$. It is easy to check that the mapping $D_\hbar \ni A \mapsto \Psi(A)$ is a homomorphism from $D_\hbar$ to the algebra of formal differential operators on $M$, therefore for $u \in U(\mathfrak{g}_c)$, $\Psi(l_u^{(\hbar)}) = l_u^{(\nu)} \in \mathcal{L}$.

Represent a function $f \in F(M)$ in the form $f = \sum_j \hbar^{d(j)} a_j(\hbar)\, \sigma_{u_j}^{(\hbar)}$ for some $u_j \in U_{d(j)}$ as in Proposition 5 and consider the operator $A = \sum_j \hbar^{d(j)} a_j(\hbar)\, l_{u_j}^{(\hbar)} \in D_\hbar$. It follows from Lemma 8 that $A$ is regular at $\hbar = 0$. It is straightforward that $\Psi(A) \in \mathcal{L}$ and $A1 = f$, whence one can easily obtain that $\Psi(A)1 = f$ and therefore $\Psi(A) = L_f$. For $g \in F(M)$ the product $f *_\hbar g = Ag$ is a function on $M$ which depends rationally on $\hbar$ and is regular at $\hbar = 0$. Therefore the product $f *_\hbar g$ expands to the uniformly and absolutely convergent Taylor series in $\hbar$ at $\hbar = 0$. Finally, $\Psi(f *_\hbar g) = \Psi(Ag) = L_f g = f \star g$. Thus we have proved the following theorem.

Theorem 5. Let $s(\hbar)$ be an s-module on $M$ which depends rationally on the parameter $\hbar$ and is regular at $\hbar = 0$, and $\omega(\hbar)$ be the associated (1,1)-form. Then $\Psi(s(\hbar)) = \sum_{r\ge 0} \nu^r s_r$ for some $s_r \in S$. Assume that the s-module $s_0$ is nondegenerate and denote by $(\mathcal{A}_\hbar, *_\hbar)$ the algebra of functions associated to the s-module $(1/\hbar)s(\hbar)$. Any functions $f, g \in F(M)$ belong to $\mathcal{A}_\hbar$ for all but a finite number of values of $\hbar$. The product $f *_\hbar g$ expands to the uniformly and absolutely convergent Taylor series in $\hbar$ at the point $\hbar = 0$, $f *_\hbar g = \sum_{r\ge 0} \hbar^r C_r(f, g)$, where $C_r(\cdot,\cdot)$, $r = 0, 1, \dots$, are bidifferential operators which define the deformation quantization with separation of variables on $M$ corresponding to the formal deformation $\omega = \Psi(\omega(\hbar)) = \omega_0 + \nu\omega_1 + \dots$ of the pseudo-Kähler form $\omega_0$.

8. Characters Associated to s-Modules on Flag Manifolds

Since the group $K$ is compact, there exists the $K$-invariant measure $\mu$ of the total volume 1 on the flag manifold $M$. Let $s$ be an s-module on $M$ and $\mathcal{A}$ the corresponding algebra of functions on $M$.

It is known that $U(\mathfrak{g}_c) = Z(\mathfrak{g}_c) \oplus [U(\mathfrak{g}_c), U(\mathfrak{g}_c)]$ (see [15]). Let $U(\mathfrak{g}_c) \ni u \mapsto u'$ denote the corresponding projection of $U(\mathfrak{g}_c)$ onto $Z(\mathfrak{g}_c)$. Recall the following

Definition (see [15]). A linear form $\kappa : U(\mathfrak{g}_c) \to \mathbb{C}$ is called a character of $\mathfrak{g}_c$ if: (1) $\kappa(uv) = \kappa(vu)$, $\kappa(1) = 1$; (2) $\kappa(u'v) = \kappa(u)\kappa(v)$ $(u, v \in U(\mathfrak{g}_c))$.

Thus one has $\kappa(u) = \kappa(u')$ for all $u \in U(\mathfrak{g}_c)$. Moreover, $\kappa$ is then a homomorphism of $Z(\mathfrak{g}_c)$ into $\mathbb{C}$, a central character of $\mathfrak{g}_c$. This central character determines $\kappa$ completely.

Fix an s-module $s$ on $M$ and let $l_u$, $u \in U(\mathfrak{g}_c)$, be the operators on $M$ associated to $s$, $\sigma_u = l_u 1$, and $\psi$ be the corresponding central character of $U(\mathfrak{g}_c)$, $\psi(z) = \sigma_z$, $z \in Z(\mathfrak{g}_c)$.

Proposition 6. A linear form $\kappa(u) = \int_M \sigma_u\, d\mu$, $u \in U(\mathfrak{g}_c)$, on $U(\mathfrak{g}_c)$ is a character of $\mathfrak{g}_c$. For $z \in Z(\mathfrak{g}_c)$, $\kappa(z) = \psi(z)$.

Proof.
Since the mapping $u \mapsto \sigma_u = l_u 1$ is $K$-equivariant and the measure $\mu$ is $K$-invariant, for $k \in K$, $\kappa(\mathrm{Ad}(k)u) = \kappa(u)$ holds or, infinitesimally, for $X \in \mathfrak{k}_r$, $\kappa(Xu - uX) = 0$, therefore $\kappa(uv) = \kappa(vu)$. The measure $\mu$ is of the total volume 1 and for $z \in Z(\mathfrak{g}_c)$, $\sigma_z = \psi(z)$ is scalar, therefore $\kappa(z) = \psi(z)$. In particular, $\kappa(1) = 1$. Thus (1) is proved. Now, using that for $u \in U(\mathfrak{g}_c)$, $u' \in Z(\mathfrak{g}_c)$ holds and $l_{u'} = \sigma_{u'}$ is scalar, we get $\kappa(u'v) = \int l_{u'v} 1\, d\mu = \int l_{u'} l_v 1\, d\mu = \sigma_{u'} \int l_v 1\, d\mu = \kappa(u')\kappa(v)$. This completes the proof of the proposition.

Remark. Proposition 6 implies that to any s-module $s$ on $M$ there corresponds a character $\kappa$ of the Lie algebra $\mathfrak{g}_c$. The central character $Z(\mathfrak{g}_c) \ni z \mapsto \kappa(z)$ of $U(\mathfrak{g}_c)$ coincides with the central character $\psi$ associated to $s$. Moreover, the mapping $\mathcal{A} \ni f \mapsto t(f) = \int_M f\, d\mu$ is a trace on the algebra $\mathcal{A}$, i.e., $t(f * g) = t(g * f)$ for $f, g \in \mathcal{A}$.

Let $\tau$ be an $n$-dimensional irreducible representation of $\mathfrak{g}_c$. Since for $z \in Z(\mathfrak{g}_c)$, $\tau(z)$ is scalar, it is straightforward that $\kappa_\tau = (1/n)\,\mathrm{tr}\,\tau$ is a character of $\mathfrak{g}_c$. In particular, $Z(\mathfrak{g}_c) \ni z \mapsto \kappa_\tau(z)$ is the central character of the representation $\tau$.
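The normalized trace can be checked by hand in the smallest example. The sketch below is an illustration under stated assumptions rather than part of the paper: it builds the $n$-dimensional irreducible representation of $\mathfrak{sl}(2)$ in a standard weight basis, with matrices named `H`, `E`, `F` (hypothetical names for the images of $H_\alpha$, $X_\alpha$, $X_{-\alpha}$), and evaluates the normalized trace on products and on the central element $H^2 + 2EF + 2FE$, whose normalization is a choice made here.

```python
from fractions import Fraction


def sl2_irrep(n):
    """Integer matrices (H, E, F) of the n-dimensional irreducible sl(2) representation.

    Basis v_0..v_{n-1}: H v_k = (n-1-2k) v_k, F v_k = v_{k+1}, E v_k = k(n-k) v_{k-1}.
    Column k of each matrix holds the image of v_k.
    """
    H = [[(n - 1 - 2 * k) if r == k else 0 for k in range(n)] for r in range(n)]
    E = [[k * (n - k) if r == k - 1 else 0 for k in range(n)] for r in range(n)]
    F = [[1 if r == k + 1 else 0 for k in range(n)] for r in range(n)]
    return H, E, F


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def combine(A, B, a=1, b=1):
    """Entrywise linear combination a*A + b*B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def normalized_trace(A):
    """The value (1/n) tr A, i.e. the character kappa_tau evaluated on A."""
    return Fraction(sum(A[i][i] for i in range(len(A))), len(A))


def casimir(n):
    """Matrix of the central element H^2 + 2EF + 2FE in the n-dimensional irrep."""
    H, E, F = sl2_irrep(n)
    return combine(matmul(H, H), combine(matmul(E, F), matmul(F, E)), 1, 2)
```

In dimension $n$ the matrix returned by `casimir(n)` is $(n^2 - 1)$ times the identity, so the normalized trace of a central element is just its scalar value, while the normalized trace of a product does not depend on the order of the factors.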


Proposition 7. Let $\tau$ be an $n$-dimensional irreducible representation of $U(\mathfrak{g}_c)$ in the vector space $V$ and $s$ be an s-module on $M$. If the central character $\psi$ associated to $s$ coincides with the central character of the representation $\tau$, then there exists a representation $\rho$ of the algebra $\mathcal{A}$ in the same vector space $V$ such that $\tau = \rho \circ \sigma$.

Proof. Since the characters $\kappa_\tau = (1/n)\,\mathrm{tr}\,\tau$ and $\kappa = \int_M \sigma\, d\mu$ of $\mathfrak{g}_c$ coincide on $Z(\mathfrak{g}_c)$, they coincide identically. We have that for $u, v \in U(\mathfrak{g}_c)$,

$$ \mathrm{tr}(\tau(u)\tau(v)) = \mathrm{tr}\,\tau(uv) = n \int \sigma_{uv}\, d\mu = n \int \sigma_u * \sigma_v\, d\mu. \qquad (6) $$

Assume that $\sigma_v = 0$. Then the last expression in (6) is zero for all $u \in U(\mathfrak{g}_c)$. Since $\tau$ is irreducible, $\tau(u)$ is an arbitrary endomorphism of the representation space $V$, therefore $\tau(v) = 0$. Thus the representation $\tau$ factors through the mapping $\sigma : U(\mathfrak{g}_c) \to \mathcal{A}$, $\tau = \rho \circ \sigma$. The proposition is proved.

9. Holomorphic Line Bundles on Flag Manifolds

The Levi subgroup $G^\Theta \subset G$ is reductive. The pair $(\mathfrak{g}_c^\Theta, \mathfrak{h}_c)$ has a root system $\langle\Theta\rangle$. Induce the ordering on $\langle\Theta\rangle$ from $\Delta$. Denote by $W^\Theta$ the Weyl group of the pair $(\mathfrak{g}_c^\Theta, \mathfrak{h}_c)$, by $\delta_\Theta$ the half-sum of positive roots from $\langle\Theta\rangle$, and set $\delta'_\Theta = \delta - \delta_\Theta$.

The one-dimensional holomorphic representations (the holomorphic characters) of $G^\Theta$ are parametrized by the set $\mathcal{W}^\Theta$ of $W^\Theta$-invariant weights from $\mathcal{W}$. The parabolic group $Q$ is a semi-direct product of $G^\Theta$ and of the unipotent radical $R$ of $Q$. For $\lambda \in \mathcal{W}^\Theta$ denote by $\chi_\lambda$ the holomorphic character of $Q$ which is trivial on $R$ and whose restriction to $G^\Theta$ is the character of $G^\Theta$ parametrized by $\lambda$. For $H \in \mathfrak{h}_c$, $\chi_\lambda(\exp H) = \exp\lambda(H)$. Denote by $\mathbb{C}_\lambda$ a one-dimensional complex vector space with the action of $Q$ given by $\chi_\lambda$.

Consider the holomorphic line bundle $L_\lambda = G \times_Q \mathbb{C}_\lambda$. It is the coset space of $G \times \mathbb{C}_\lambda$ under the equivalence $(gq, v) = (g, \chi_\lambda(q)v)$, $g \in G$, $q \in Q$, $v \in \mathbb{C}_\lambda$. The group $G$ acts on $L_\lambda$ as follows, $G \ni g_0 : (g, v) \mapsto (g_0 g, v)$. Since $G/Q = K/K^\Theta$, one has an alternative description of $L_\lambda$, $L_\lambda = K \times_{K^\Theta} \mathbb{C}_\lambda$.
Using that description one can define a $K$-invariant hermitian metric $h$ on $L_\lambda$ setting $h(k, v) = |v|^2$. It follows from the Iwasawa decomposition that each element $g \in G$ can be (non-uniquely) represented as a product $g = kq$ for some $k \in K$, $q \in Q$. Thus for $g = kq$ we get $h(g, v) = h(kq, v) = h(k, \chi_\lambda(q)v) = |\chi_\lambda(q)v|^2$.

It follows from the results obtained in Sect. 4 that to the hermitian line bundle $(L_\lambda, h)$ there corresponds an s-module on $M$. Denote it $s_\lambda$. Let $\{f_X^\lambda\}$, $X \in \mathfrak{k}_r$, be the corresponding $K$-equivariant family, which defines the mapping $\gamma : M \to \mathfrak{k}_r$ such that $(\gamma(x), X) = f_X^\lambda(x)$ for all $x \in M$, $X \in \mathfrak{k}_r$. We are going to apply Theorem 4 to the s-modules $s_\lambda$, $\lambda \in \mathcal{W}^\Theta$.

Calculate the element $E^\lambda = \gamma(x_0)$. Since $E^\lambda \in \mathfrak{t}_r$, in order to determine $E^\lambda$ it is enough to consider only the pairing $(E^\lambda, JH_\alpha)$ for all $\alpha \in \Delta$. For $Z \in \mathfrak{g}_r$ let $v_Z^L$ be the fundamental vector field on $L_\lambda$; then $\xi_Z^L = (1/2)(v_Z^L - i v_{JZ}^L)$ is its holomorphic component. For $\varphi \in C^\infty(L_\lambda)$, $v_Z^L \varphi(g, v) = (d/dt)\varphi(\exp(-tZ)g, v)|_{t=0}$, $g \in G$, $v \in \mathbb{C}_\lambda$. Using Proposition 1 and taking into account the $K$-invariance of the metric $h$ we get $i f_X^\lambda \circ \pi = h(\xi_X^L h^{-1}) = (-i/2)\, h(v_{JX}^L h^{-1})$ for $X \in \mathfrak{k}_r$. Thus

$$ f_{JH_\alpha}^\lambda(x_0) = \tfrac12\, h(v_{H_\alpha}^L h^{-1}) = \tfrac12\, h(e, v)\, \frac{d}{dt}\bigl(h(\exp(-tH_\alpha), v)\bigr)^{-1}\Big|_{t=0} = \tfrac12\, \frac{d}{dt}\bigl(|\chi_\lambda(\exp(-tH_\alpha))|^{-2}\bigr)\Big|_{t=0} = \tfrac12\, \frac{d}{dt}\bigl(|\exp\lambda(tH_\alpha)|^{2}\bigr)\Big|_{t=0} = \lambda(H_\alpha). $$


Now E λ = γ(x0 ) is the element of tr such that (E λ , JHα ) = λ(Hα ) for all α ∈ 4. The following proposition is a direct consequence of Theorem 4. Proposition 8. The s-module sλ corresponding to the holomorphic hermitian line bundle (Lλ , h), λ ∈ W 2 , is nondegenerate iff for all α ∈ 4\h2i, λ(Hα ) 6= 0 holds. In this case the index of inertia of the corresponding pseudo-K¨ahler metrics on M equals #{α ∈ 4+ \h2i|λ(Hα ) < 0}. Lemma 9. The canonical line bundle Lcan on M is isomorphic to the bundle Lλ for 0 0 . The canonical s-module scan on M coincides with sλ for λ = −2δ2 . λ = −2δ2 Proof. The isotropy subgroup Q ⊂ G of the point x0 ∈ M = G/Q acts on the fibers of G-bundles at x0 . The fiber of the line bundle Lλ at x0 is isomorphic as a Q-module to Cλ . On the other hand, the holomorphic tangent space of M at x0 , Tx0 0 M, is isomorphic as a Q-module to gc /qc under the adjoint action. For H ∈ hc the operator ad(H) on gc /qc is diagonal in the basis {Xα + qc }, α ∈ 4\5, and takes the eigenvalue α(H) on Xα + qc . 0 (H), where m = dimC M. The element H ∈ hc acts on ∧m (gc /qc ) by the scalar 2δ2 m 0 . The lemma Therefore the element q ∈ Q acts on ∧ (gc /qc ) by χλ (q) for λ = 2δ2 follows from the fact that the fiber of the canonical line bundle Lcan at x0 is dual to ∧m (gc /qc ) as a Q-module. Now we shall use a particular case of the Bott–Borel–Weil theorem concerning cohomological realizations of finite dimensional irreducible holomorphic representations of the group G in the sheaf cohomologies of line bundles over M = G/Q (see [9]). Let H i (M, SLλ ) denote the space of i-dimensional cohomology with coefficients in the sheaf of germs of holomorphic sections of the line bundle Lλ . The action of the group G on Lλ gives rise to the action of G on the local holomorphic sections of Lλ , which induces the action of G in the cohomology spaces H i (M, SLλ ). Theorem 6 (Bott–Borel–Weil). Let λ ∈ W 2 , k = #{α ∈ 4+ |(λ + δ)(Hα ) < 0}. 
If $(\lambda+\delta)(H_\alpha) = 0$ for some $\alpha\in\Delta$, then $H^i(M,\mathcal{S}L_\lambda) = 0$ for all $i$. If $(\lambda+\delta)(H_\alpha)\neq 0$ for all $\alpha\in\Delta$ one can choose $w\in W$ so that $w(\lambda+\delta)$ is dominant. Then $\zeta = w(\lambda+\delta)-\delta$ is dominant as well. For all $i\neq k$, $H^i(M,\mathcal{S}L_\lambda) = 0$. The representation of the group $G$ in $H^k(M,\mathcal{S}L_\lambda)$ is isomorphic to the irreducible finite dimensional holomorphic representation of $G$ with highest weight $\zeta$.

Assume that an irreducible finite dimensional holomorphic representation $\tau$ of the group $G$ is realized in the cohomology space $H^k(M,\mathcal{S}L_\lambda)$ as in Theorem 6. Retain the same notation for the representations of the Lie algebra $\mathfrak{g}_c$ and of its universal enveloping algebra $U(\mathfrak{g}_c)$ which correspond to $\tau$. The action of the Lie algebra $\mathfrak{g}_c$ on $L_\lambda$ by holomorphic differential operators can be extended to the action of $U(\mathfrak{g}_c)$. For $u\in U(\mathfrak{g}_c)$ denote by $A_u$ the corresponding holomorphic differential operator on $L_\lambda$. It induces the representation operator $\tau(u)$ in $H^k(M,\mathcal{S}L_\lambda)$. According to Theorem 3, for $z\in Z(\mathfrak{g}_c)$ the holomorphic operator $A_z$ on $L_\lambda$ is scalar and is equal to the value $\psi(z)$ of the central character $\psi$ associated to the s-module $s_\lambda$. It follows immediately that the central character of the representation $\tau$ coincides with $\psi$. As in the proof of Proposition 7 we obtain that for $u\in U(\mathfrak{g}_c)$ the equality
\[ n\int_M \sigma_u\, d\mu = \operatorname{tr}\tau(u) \tag{7} \]
holds, where $n = \dim\tau$. According to Proposition 7, there exists a representation $\rho$ of the algebra $A$ of functions on $M$ associated to the s-module $s_\lambda$ in the space $H^k(M,\mathcal{S}L_\lambda)$, such that $\tau = \rho\circ\sigma$.
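As a hedged illustration of Theorem 6, the sketch below works out the simplest case, the projective line $\mathbb{CP}^1 = SL(2,\mathbb{C})/B$. This example is not treated in the text; it assumes that a weight is identified with the integer $n = \lambda(H_\alpha)$ for the single positive root $\alpha$, so that $\delta(H_\alpha) = 1$, the Weyl group consists of the identity and one reflection, and the irreducible representation with highest weight $\zeta$ has dimension $\zeta + 1$. The function names are hypothetical.

```python
def bott_borel_weil_cp1(n):
    """Cohomology of the line bundle with weight n on CP^1 (illustrative case).

    n stands for lambda(H_alpha); delta(H_alpha) = 1 is assumed.
    Returns None if all cohomology vanishes, otherwise a tuple
    (k, zeta, dim): the only nonzero degree k, the highest weight zeta
    of the representation in H^k, and its dimension.
    """
    shifted = n + 1                 # (lambda + delta)(H_alpha)
    if shifted == 0:
        return None                 # singular case: H^i = 0 for all i
    if shifted > 0:
        k, zeta = 0, n              # w = identity, zeta = (n + 1) - 1
    else:
        k, zeta = 1, -shifted - 1   # w = reflection, zeta = -(n + 1) - 1
    return k, zeta, zeta + 1


def dual_weight(n):
    """Weight of the dual bundle on CP^1: n' = -n - 2 (assumes 2*delta(H_alpha) = 2)."""
    return -n - 2


if __name__ == "__main__":
    for n in (3, 0, -1, -2, -5):
        print(n, bott_borel_weil_cp1(n), bott_borel_weil_cp1(dual_weight(n)))
```

In this toy case the degrees for a weight and its dual weight add up to $m = 1$ and the two representations have the same dimension, which is the pattern of duality discussed in the text for general flag manifolds.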


A. V. Karabegov

Let $\{f^\lambda_X\}$, $X\in\mathfrak{k}_r$, be the $K$-equivariant family associated to $s_\lambda$. Since $\sigma_X = i f^\lambda_X$ for $X\in\mathfrak{k}_r$, the algebra $A$ contains the functions $f^\lambda_X$, $X\in\mathfrak{k}_r$, and is generated by them. The algebra $\mathfrak{k}_r$ acts on $L_\lambda$ by the holomorphic differential operators $A_X = \nabla_{\xi_X} + i f^\lambda_X$, $X\in\mathfrak{k}_r$, due to Proposition 1. The operator $A_X$ induces in $H^k(M,\mathcal{S}L_\lambda)$ the representation operator $\rho(i f^\lambda_X)$.

Remark. For $X\in\mathfrak{k}_r$ consider the operator $Q_X = \nabla_{v_X} + i f^\lambda_X$. The operator $A_X$ differs from $Q_X$ by the anti-holomorphic operator $\nabla_{\eta_X}$, which annihilates the local holomorphic sections of $L_\lambda$ and thus induces the trivial action on the sheaf cohomology. Therefore the operator $Q_X$ also induces in $H^k(M,\mathcal{S}L_\lambda)$ the operator $\rho(i f^\lambda_X)$. If the s-module $s_\lambda$ is nondegenerate, the curvature form $\omega$ of the connection $\nabla$ is symplectic and the function $f^\lambda_X$ is a Hamiltonian of the fundamental vector field $v_X$ on $M$. Then the operator $Q_X$ is the operator of geometric quantization corresponding to the function $f^\lambda_X$.

We see that the Bott–Borel–Weil theorem provides a natural geometric representation of the algebra A in the sheaf cohomology space of the line bundle Lλ . λ }, X ∈ Theorem 7. Let λ ∈ W 2 be such that (λ+δ)(Hα ) 6= 0 for all α ∈ 4, A and {fX kr , be the algebra of functions on M and the K-equivariant family associated to the λ , X ∈ kr . Set s-module sλ respectively. The algebra A is generated by its elements fX + k = #{α ∈ 4 |(λ + δ)(Hα ) < 0}. There exists a unique finite dimensional irreducible representation ρ of the algebra A in the space H k (M, SLλ ) such that for all X ∈ kr the λ ) is induced from the holomorphic differential operator representation operator ρ(ifX λ ∇ξX + ifX on Lλ . There exists an element w ∈ W such that ζ = w(λ + δ) − δ is a dominant weight of the Lie algebra gc . The representation τ = ρ◦σ of gc in H k (M, SLλ ) is irreducible with highest weight ζ.

Denote by w0 and w02 the elements of the maximal reduced length in the Weyl groups W and W 2 respectively. Let τ be the irreducible finite dimensional representation of the algebra gc with highest weight ζ. It is known that the dual representation τ 0 has the highest weight ζ 0 = −w0 ζ. Lemma 10. Let λ ∈ W 2 and w ∈ W be such that ζ = w(λ + δ) − δ is a dominant 0 ∈ W 2 and there exists an element w0 ∈ W such that weight. Then λ0 = −λ − 2δ2 0 0 0 ζ = w (λ +δ)−δ. If (λ+δ)(Hα ) 6= 0 for all α ∈ 4 and k = #{α ∈ 4+ |(λ+δ)(Hα ) < 0}, then (λ0 + δ)(Hα ) 6= 0 for all α ∈ 4 and #{α ∈ 4+ |(λ0 + δ)(Hα ) < 0} = m − k, where m = dimC M. Proof. For α ∈ 2 the reflection sα ∈ W 2 maps α to −α and preserves both 4+ \{α} 0 and h2i. It follows that the group W 2 preserves the set 4+ \h2i, whence −2δ2 ∈ W2 0 0 2 2 + − and therefore λ = −λ − 2δ2 ∈ W . The element w0 maps h2i to h2i and preserves 0 +δ2 ) = 4+ \h2i, whence w02 δ2 = −δ2 . Take w0 = w0 ww02 , then w0 (λ0 +δ) = w0 (−λ−δ2 2 0 0 w0 ww0 (−λ−δ2 +δ2 ) = w0 w(−λ−δ2 −δ2 ) = w0 w(−λ−δ) = w0 (−ζ−δ) = −w0 ζ+δ = 0 + δ2 )(Hα ) = ζ 0 + δ, thus ζ 0 = w0 (λ0 + δ) − δ. For α ∈ 4 we have (λ0 + δ)(Hα ) = (−λ − δ2 2 0 0 (w0 (−λ − δ2 + δ2 ))(Hw2 α ) = (−λ − δ2 − δ2 )(Hw2 α ) = (−λ − δ)(Hw2 α ). Now it 0 0 0 is clear that if (λ + δ)(Hα ) 6= 0 for all α ∈ 4 then (λ0 + δ)(Hα ) 6= 0 for all α ∈ 4. Since m = #(4+ \h2i), λ(Hα ) = λ0 (Hα ) = 0 for α ∈ h2i and δ(Hα ) > 0 for α ∈ 4+ , we get #{α ∈ 4+ |(λ0 + δ)(Hα ) < 0} = #{α ∈ 4+ \h2i|(λ0 + δ)(Hα ) < 0} = #{α ∈ 4+ \h2i|(−λ − δ)(Hw2 α ) < 0} = #{α ∈ 4+ \h2i|(λ + δ)(Hα ) > 0} = m − k. The 0 lemma is proved.


Retain the notations of Lemma 10 and assume that $(\lambda+\delta)(H_\alpha)\neq 0$ for all $\alpha\in\Delta$. It follows from Theorem 6 and Lemma 10 that the dual representations $\tau$ and $\tau'$ of the algebra $\mathfrak{g}_c$ with highest weights $\zeta$ and $\zeta'$ are realized in the cohomology spaces $H^k(M,\mathcal{S}L_\lambda)$ and $H^{m-k}(M,\mathcal{S}L_{\lambda'})$ respectively. The spaces $H^k(M,\mathcal{S}L_\lambda)$ and $H^{m-k}(M,\mathcal{S}L_{\lambda'})$ are dual. This is, in fact, the Kodaira–Serre duality. According to Theorem 7 and Lemma 9, the s-modules $s_\lambda$ and $s_{\lambda'}$ are dual and the associated function algebras $A$ and $A'$ on $M$ have representations $\rho$ and $\rho'$ in $H^k(M,\mathcal{S}L_\lambda)$ and $H^{m-k}(M,\mathcal{S}L_{\lambda'})$, such that $\tau = \rho\circ\sigma$ and $\tau' = \rho'\circ\sigma'$ respectively (here all the notations have their usual meaning). Let $n = \dim\tau$.

Proposition 9. For $f\in A$ and $g\in A'$ the equality
\[ n\int_M f g\, d\mu = \operatorname{tr}\bigl(\rho(f)(\rho'(g))^t\bigr) \]
holds.

Proof. Choose $u, v\in U(\mathfrak{g}_c)$ such that $f = \sigma_u$, $g = \sigma'_v$. Then, using Eq. (7), one gets
\[ n\int f g\, d\mu = n\int \sigma_u \sigma'_v\, d\mu = n\int \sigma_u\,(l'_v 1)\, d\mu = n\int (l'_v)^t \sigma_u\, d\mu = n\int l_{\check v}\,\sigma_u\, d\mu = n\int \sigma_{\check v u}\, d\mu = \operatorname{tr}\tau(\check v u) = \operatorname{tr}\bigl((\tau'(v))^t\tau(u)\bigr) = \operatorname{tr}\bigl(\rho(f)(\rho'(g))^t\bigr). \]
The proposition is proved.

10. Covariant and Contravariant Symbols on Flag Manifolds Assume that λ ∈ W 2 is such that the (finite dimensional) space H = H 0 (M, SLλ ) of global holomorphic sections of Lλ is nontrivial. According to Theorem 6, this is the case iff (λ + δ)(Hα ) > 0 for all α ∈ 4+ or, equivalently, iff λ is a dominant weight. For any elements q, q 0 of the same fiber of L∗λ denote by h(q, q 0 ) their K-invariant hermitian scalar product such that h(q, q) = h(q). Let L2 (M, Lλ ) denote the Hilbert with respect to the K-invariant Hilbert norm space of sections of Lλ , square integrable R || · || given by the formula ||s||2 = M h(s) dµ, s a section of Lλ . Denote the corresponding hermitian scalar product in L2 (M, Lλ ) by h·, ·i. We introduce coherent states in H in a geometrically invariant fashion, following [12]. For q ∈ L∗λ the corresponding coherent state eq is a unique element in H such that the relation hs, eq iq = s ◦ π(q) holds for all s ∈ H. It is known that the coherent states eq exist for all q ∈ L∗λ and the mapping L∗λ 3 q 7→ eq ∈ H is antiholomorphic. For c ∈ C, ecq = c¯−1 eq holds. The group K acts on the sections of the line bundle Lλ as follows, (ks)(x) = k(s(k −1 x)) for k ∈ K, x ∈ M and s a section of Lλ . This action is unitary with respect to the scalar product h·, ·i. For any holomorphic section s of Lλ we have hks, ekq ikq = (ks)(kx) = k(s(x)), therefore hks, ekq iq = s(x). On the other hand, hks, keq i = hs, eq i = s(x), whence keq = ekq . The function ||eq ||2 h(q) is homogeneous of order 0 with respect to C∗ -action and K-invariant. Thus it is identically constant. Set ||eq ||2 h(q) = C. Let A be an operator on H. It is easy to check that the function f˜(q) = hAeq , eq i/ heq , eq i on the bundle L∗λ is constant on the fibers. Therefore there exists a function fA on M such that fA ◦ π = f˜. Definition. 
Berezin's covariant symbol of an operator $A$ on $H$ is the function $f_A$ on $M$ given by the formula $f_A(x) = \langle Ae_q, e_q\rangle/\langle e_q, e_q\rangle$ for any $q\in L^*_\lambda$ such that $\pi(q) = x\in M$.
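The defining ratio can be tried out numerically in a toy setting. The sketch below is only an illustration under stated assumptions: the Hilbert space is taken to be $\mathbb{C}^2$ with its standard scalar product, and the vectors `coherent_vector(z)` $= (1, \bar z)$ are a hypothetical stand-in for coherent states labeled by a point $z$ of a coordinate chart; neither choice is taken from the text. It shows that the ratio $\langle Av, v\rangle/\langle v, v\rangle$ does not change when the vector is rescaled, which is the reason the symbol is constant on the fibers.

```python
def covariant_symbol(A, v):
    """Return <A v, v> / <v, v> for a square matrix A (list of rows) and a vector v."""
    size = len(v)
    Av = [sum(A[i][j] * v[j] for j in range(size)) for i in range(size)]
    numerator = sum(Av[i] * v[i].conjugate() for i in range(size))
    denominator = sum(abs(x) ** 2 for x in v)
    return numerator / denominator


def coherent_vector(z):
    """Hypothetical unnormalized coherent vector (1, conj(z)) in C^2."""
    return [1 + 0j, complex(z).conjugate()]


if __name__ == "__main__":
    diag = [[1, 0], [0, -1]]
    v = coherent_vector(0.3 + 0.4j)
    rescaled = [(2 - 1j) * x for x in v]   # a different point of the same fiber
    print(covariant_symbol(diag, v))
    print(covariant_symbol(diag, rescaled))
```

For the diagonal matrix used here the symbol works out to $(1-|z|^2)/(1+|z|^2)$, and the symbol of the identity operator is identically 1.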


The operator–symbol mapping A 7→ fA is injective and thus induces an algebra structure on the set of all covariant symbols. The algebra of covariant symbols is isomorphic to End(H). Let A be a holomorphic differential operator on Lλ . Fix a local holomorphic trivialization (U, s0 ) of Lλ and let A0 denote the local expression of the operator A on U . Then for x, y ∈ U we have hAes0 (y) , es0 (x) is0 (x) = Aes0 (y) (x) = s0 (x)A0 (es0 (y) (x)/s0 (x)). The function es0 (y) (x)/s0 (x) on U ×U is holomorphic in x and antiholomorphic in y. Set S(x) = hes0 (x) , es0 (x) i = es0 (x) (x)/s0 (x). Let f be the covariant symbol of the operator A. Since A0 is a holomorphic differential operator on U , we get for q = s0 (x) that f (x) = hAeq , eq i/heq , eq i = (hAes0 (y) , es0 (x) i|y=x )/S(x) = (A0 (es0 (y) (x)/s0 (x)))|y=x /S(x) = A0 S(x)/S(x). Introduce the function 8 = − log h ◦ s0 on U . We have S(x) = ||es0 (x) ||2 = C exp 8, ˇ where Aˇ is the pushforward of the whence f (x) = A0 S(x)/S(x) = e−8 (A0 e8 ) = A1, ˇ operator A to M. The formula f = A1 holds globally on M. For u ∈ U(gc ) the pushforward to M of the operator Au on Lλ coincides with the operator lu related to the s-module sλ , Aˇ u = lu . Therefore the covariant symbol fu of the operator Au on H can be expressed by the formula fu = Aˇ u 1 = lu 1 = σu . We have proved the following theorem. Theorem 8. Let λ ∈ W 2 be dominant. Then the space H = H 0 (M, SLλ ) of global Endow it with the Hilbert space structure via holomorphic sections of Lλ is nontrivial. R the norm || · || such that ||s||2 = M h(s) dµ, s ∈ H. Then for u ∈ U(gc ) the covariant symbol of the operator Au on H equals σu , where σ : U(gc ) → C ∞ (M) is the mapping associated to the s-module sλ . According to Theorem 7, the representation ρ of the algebra A = σ(U(gc )) in H is irreducible (the representation τ = ρ ◦ σ of the Lie algebra gc is irreducible with highest weight λ). Therefore any operator on H can be represented as Au for some u ∈ U(gc ). 
Thus we get the following corollary.

Corollary. The algebra $A$ associated to the s-module $s_\lambda$ coincides with the algebra of Berezin's covariant symbols of the operators on $H$.

Let $P : L^2(M, L_\lambda)\to H$ be the orthogonal projection operator. For a measurable function $f$ on $M$ let $M_f$ denote the multiplication operator by $f$. Introduce the operator $\hat f = P M_f P$ on $H$.

Definition. A measurable function $f$ on $M$ is called a contravariant symbol of an operator $A$ on $H$ if $A = \hat f$.

Let $s_1, s_2$ be holomorphic sections of $L_\lambda$. Calculate the covariant symbol of the rank one operator $A_0 = s_1\otimes s_2^*$ in $H$,
\[ f_{A_0} = \frac{\langle A_0 e_q, e_q\rangle}{\langle e_q, e_q\rangle} = \frac{\langle s_1, e_q\rangle\langle e_q, s_2\rangle}{\langle e_q, e_q\rangle} = \frac{(s_1/q)\,\overline{(s_2/q)}}{\|e_q\|^2} = \frac{h(s_1, s_2)}{\|e_q\|^2 h(q)}. \]


Since ||eq ||2 h(q) = C, we obtain fA0 R= h(s1 , s2 )/C. ForR any measurable function g on R ˆ = hgs ˆ 1 , s2 i = hgs1 , s2 i = h(gs1 , s2 ) dµ =R gh(s1 , s2 ) dµ = C fA0 g dµ. M tr(A0 g) Therefore for any operator A on H holds tr(Ag) ˆ = C fA g dµ. Taking A = 1, g = 1 we immediately obtain that C = n = dim H. Proposition 10. A measurable function g on M is a contravariant symbol of an operator R B on H iff for any operator A on H holds the formula tr(AB) = n fA g dµ. The proof is straightforward. 0 . For the s-module sλ0 dual to Let λ ∈ W 2 be dominant. Set λ0 = −λ − 2δ2 0 ∞ sλ let σ : U(gc ) → C (M) denote the mapping associated to sλ0 , ρ0 be the corresponding representation of the algebra A0 = σ 0 (U(gc )) in H m (M, SLλ0 ) The spaces H = H 0 (M, SLλ ) and H m (M, SLλ0 ) are dual as representation spaces of the group G. The following theorem is a direct consequence of Propositions 9,10 and Theorem 8. Theorem 9. A function f ∈ A0 is a contravariant symbol of the operator (ρ0 (f ))t in H. 11. Quantization on Flag Manifolds Now we are ready to put together various results obtained above to give examples of quantization on a generalized flag manifold M endowed with a pseudo-K¨ahler metrics. Let λ ∈ W 2 be such that for all α ∈ 4\h2i, λ(Hα ) 6= 0 holds. According to Proposition 8, in this case the s-module sλ on M is nondegenerate. Denote by ω the 2-form associated to sλ . This form is pseudo-K¨ahler, and the index of inertia of the corresponding pseudo-K¨ahler metrics equals l = #{α ∈ 4+ \h2i|λ(Hα ) < 0}. Denote by A~ the algebra of functions on M associated to the s-module (1/~)sλ . It follows from Theorem 5 that any functions f, g ∈ F (M) belong to A~ for all but a finite number of the uniformly and absolutely convergent Taylor values of ~. The product f ∗~ g expands toP series in ~ at the point ~ = 0, f ∗~ g = r≥0 ~r Cr (f, g), where Cr (·, ·), r = 0, 1, . . . 
, are bidifferential operators which define the deformation quantization with separation of variables on M corresponding to the (non-deformed) pseudo-K¨ahler form ω. For n ∈ N holds nλ ∈ W 2 . Theorem 7 implies that for ~ = 1/n the algebra A~ has a natural geometric representation ρ~ in the sheaf cohomology space of the line bundle Lnλ = (Lλ )n , H~ = H kn (M, SLnλ ), where kn = #{α ∈ 4+ |(nλ + δ)(Hα ) < 0}. Since for α ∈ 4+ , δ(Hα ) > 0 holds, only those α ∈ 4+ contribute to kn for which λ(Hα ) < 0. Therefore kn = l for n >> 0. In other words, for sufficiently small values of ~ = 1/n the dimension of the sheaf cohomology the representation ρ~ is realized in is equal to the index of inertia l of the pseudo-K¨ahler metrics on M corresponding to the (1,1)-form ω. We have obtained pseudo-K¨ahler quantization on a generalized flag manifold. Now assume λ ∈ W 2 is dominant in the rest of the paper. Then the metrics corresponding to the (1,1)-form ω on M is positive definite, i.e., ω is a K¨ahler form, and Theorem 8 implies that for ~ = 1/n the space H~ = H 0 (M, SLnλ ) is the space of global holomorphic sections of the line bundle Lnλ = (Lλ )n and A~ is the corresponding algebra of Berezin’s covariant symbols on M. Thus we arrive at Berezin’s K¨ahler quantization on M and identify the associated formal deformation quantization obtained in [11, 4] with the quantization with separation of variables, corresponding to the non-deformed K¨ahler form ω. Consider the s-module s(~) = −sλ + ~scan . It depends rationally on ~ and is regular at ~ = 0. Denote by ωcan the (1,1)-form associated to the canonical s-module scan on

$M$. Then the form associated to $s(\hbar)$ is $-\omega + \hbar\omega_{can}$. The s-module $(1/\hbar)s(\hbar)$ is dual to $(1/\hbar)s_\lambda$. Denote by $(A'_\hbar, *'_\hbar)$ the algebra of functions on $M$ associated to the s-module $(1/\hbar)s(\hbar)$. For any functions $f, g\in F(M)$ the product $f *'_\hbar g$ depends rationally on $\hbar$ and is regular at $\hbar = 0$. The asymptotic expansion of the product $f *'_\hbar g$ gives rise to the deformation quantization with separation of variables $(F, \star')$ corresponding to the formal deformation of the negative-definite Kähler form $-\omega$, $\omega' = -\omega + \nu\omega_{can}$. If $\hbar = 1/n$ then the algebra $A'_\hbar$ has a representation $\rho'_\hbar$ in the space $H^m(M,\mathcal{S}L_{\lambda'_n})$, where $m = \dim_{\mathbb{C}} M$ and $\lambda'_n = -n\lambda - 2\delta'_\Theta$. The space $H^m(M,\mathcal{S}L_{\lambda'_n})$ is dual to $H_\hbar$ and Theorem 9 implies that any function $f\in A'_\hbar$ is a contravariant symbol of the operator $(\rho'_\hbar(f))^t$ in the space $H_\hbar$. The mapping $A'_\hbar\ni f\mapsto(\rho'_\hbar(f))^t$ is an anti-homomorphism. Thus, in order to obtain quantization on $M$ by contravariant symbols (it is usually called Berezin–Toeplitz quantization, see [13]), we have to consider the algebras $(\tilde A_\hbar, \tilde *_\hbar)$, opposite to $(A'_\hbar, *'_\hbar)$. Then $\tilde A_\hbar\ni f\mapsto(\rho'_\hbar(f))^t$ will be a representation of the algebra $\tilde A_\hbar$. The corresponding deformation quantization $(F, \tilde\star)$ is opposite to $(F, \star')$. As was shown in Sect. 5, this quantization is also a quantization with separation of variables, though with respect to the opposite complex structure on $M$. It corresponds to the formal (1,1)-form $-\omega' = \omega - \nu\omega_{can}$ on the opposite complex manifold $\bar M$. This form is a formal deformation of the Kähler form $\omega$ on $\bar M$. (The metric on $\bar M$ corresponding to the (1,1)-form $\omega$ is a negative-definite Kähler metric.)

It would be interesting to compare the deformation quantization associated to Berezin–Toeplitz quantization on a general compact Kähler manifold in [13] with deformation quantization with separation of variables.

Acknowledgement. I am very grateful to A. Astashkevich for interesting discussions and to the referee for important remarks.
I wish to express my deep gratitude to Professor M.S. Narasimhan for inviting me to the International Centre for Theoretical Physics (ICTP) and to the ICTP for their warm hospitality.

References

1. Astashkevich, A.: On Karabegov's quantization of semisimple coadjoint orbits. To appear in Adv. in Geom. and Math. Phys., Vol. 1, eds. J-L. Brylinski et al., Basel–Boston: Birkhäuser, 1998
2. Bayen, F., Flato, M., Fronsdal, C., Lichnerowicz, A., Sternheimer, D.: Deformation theory and quantization. Ann. Phys. 111, 61–151 (1978)
3. Berezin, F.A.: Quantization. Math. USSR Izv. 8, 1109–1165 (1974)
4. Cahen, M., Gutt, S., Rawnsley, J.: Quantization of Kähler manifolds II. Trans. Am. Math. Soc. 337, 73–98 (1993)
5. Cahen, M., Gutt, S., Rawnsley, J.: Quantization of Kähler manifolds III. Lett. Math. Phys. 30, 291–305 (1994)
6. Cahen, M., Gutt, S., Rawnsley, J.: Quantization of Kähler manifolds IV. Lett. Math. Phys. 34, 159–168 (1995)
7. Karabegov, A.V.: Deformation quantizations with separation of variables on a Kähler manifold. Commun. Math. Phys. 180, 745–755 (1996)
8. Karabegov, A.V.: Berezin's quantization on flag manifolds and spherical modules. Trans. Am. Math. Soc. 350, 1467–1479 (1998)
9. Knapp, A.W.: Introduction to representation in analytic cohomology. Contemp. Math. 154, 1–19 (1993)
10. Kostant, B.: Quantization and unitary representations. In: Lectures in Modern Analysis and Applications III, Lect. Notes in Math. 170, Berlin–New York: Springer-Verlag, 1970
11. Moreno, C.: Invariant star products and representations of compact semisimple Lie groups. Lett. Math. Phys. 12, 217–229 (1986)
12. Rawnsley, J., Cahen, M., Gutt, S.: Quantization of Kähler manifolds I: Geometric interpretation of Berezin's quantization. J. Geom. Phys. 7, 45–62 (1990)
13. Schlichenmaier, M.: Berezin–Toeplitz quantization of compact Kähler manifolds. Preprint q-alg/9601016, 1–15 (1996)
14. Souriau, J.: Structure des systèmes dynamiques. Paris: Dunod, 1970
15. Warner, G.: Harmonic Analysis on Semi-simple Lie Groups, I. Berlin–New York: Springer-Verlag, 1972

Communicated by A. Connes

Commun. Math. Phys. 200, 381 – 398 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Magnetic Monopoles in $U(1)_4$ Lattice Gauge Theory with Wilson Action

V. Cirigliano*, G. Paffuti

Dipartimento di Fisica dell'Università and I.N.F.N., Via Buonarroti, I-56100 Pisa, Italy. E-mail: [email protected]

Received: 2 July 1997 / Accepted: 20 July 1998

Abstract: We construct the Euclidean Green functions for the soliton (magnetic monopole) field in the $U(1)_4$ Lattice Gauge Theory with Wilson action. We show that in the strong coupling regime there is monopole condensation while in the QED phase the physical Hilbert space splits into orthogonal soliton sectors labeled by integer magnetic charge.

Contents
1 Introduction
1.1 Disorder fields and magnetic monopoles
1.2 Mixed order-disorder correlation functions
2 Monopole Condensation in the Confining Phase
2.1 The polymer expansion
2.2 Bounds on polymer activities
2.3 Bound on $S_2(x, q; y, -q)$
3 Monopole Sectors in the QED Phase
3.1 Expansion of Z(D, ) in interacting monopole loops
3.2 Renormalization transformation
3.3 Estimates on renormalized activities and cluster property
4 Concluding Remarks
A Bound on Activity's Derivatives

1. Introduction

In this paper we apply the methods introduced in [4] to the construction of soliton (magnetic monopole) sectors for the $U(1)$ Lattice Gauge Theory with Wilson action [17]. The soliton quantization for the $U(1)$ Lattice Gauge Theory with Villain action [16], as well as for a large class of models, has been carried out in [4, 5] by constructing the Euclidean Green functions of soliton fields as expectation values of suitable disorder operators. These operators are obtained by coupling the theory to a generalized external gauge field in a hyper-gauge invariant way, as we shall briefly recall below. In the statistical approach this procedure corresponds to the introduction of open-ended line defects. In the case of the $U(1)_4$ gauge theories such line defects are just magnetic loops carrying a defect (topological) charge. Opening up a loop one introduces magnetic monopoles at the endpoints of the current line. An Osterwalder-Schrader (O.S.) reconstruction theorem [12, 13] applied to the disorder fields correlation functions permits the identification of vacuum expectation values of the soliton field, which can be considered as a charged operator creating magnetic monopoles [15, 2]. We have focused our attention in particular on the vacuum expectation value of the soliton field, which is defined by a limiting procedure starting from the two point function:
\[ S_1 = \lim_{|x|\to\infty} S_2(0, x). \]

* Present address: Department of Physics and Astronomy, University of Massachusetts, Amherst, MA 01003, USA. E-mail: [email protected]

The result of our work is that $S_1$ is a good disorder parameter for the phase transition occurring in the model [9]. In fact we show that $S_1$ is bounded away from zero in the strong coupling regime ($\beta\ll 1$) while it vanishes in the QED phase. The proof of this statement makes use of two different techniques. The strong coupling phase is analyzed by means of a convergent Mayer expansion, applied to a polymer system obtained by the dual representation of the model [10]. The clustering property in the QED phase, in turn, is proved using an adapted version of the expansion in renormalized monopole loops originally given by Fröhlich and Spencer [6, 7]. The dual representation of the Wilson model has a measure given by products of modified Bessel functions: the estimates are obtained by applying suitable bounds on modified Bessel functions for $\beta\ll 1$, and by using for $\beta$ large an interpolation to Bessel functions given in [6].

The paper is organized as follows: in this section we briefly define the disorder fields and correlation functions for the $U(1)$ Wilson model and describe their connection with magnetic monopoles. Then we state the reconstruction theorem for the $U(1)$ Wilson model. In Sect. 2 we map the model into a polymer system in order to prove that $S_1$ is nonvanishing in the strong coupling regime and thus the lattice solitons "condense" in the vacuum sector. In Sect. 3 we give the expansion in renormalized monopole loops in order to show that $S_1 = 0$ in the weak coupling regime: the Hilbert space of the reconstructed Lattice Quantum Field Theory splits into orthogonal sectors labeled by the magnetic charge. Finally, in Sect. 4 we give some concluding remarks.

1.1. Disorder fields and magnetic monopoles. In what follows we consider all fields as defined on a finite lattice $\Lambda\subset\mathbb{Z}^4$; all estimates needed to prove our statements are uniform in $|\Lambda|$ and therefore extend to the thermodynamic limit ($\Lambda\to\mathbb{Z}^4$).
The partition function for the $U(1)$ gauge model is
\[ Z = \int D\theta \prod_{p\subset\Lambda} \varphi_\beta(d\theta_p), \tag{1} \]
where $D\theta$ is the product measure on the 1-form $\theta$ valued in $[-\pi, \pi)$, $d\theta$ is the field strength defined on the plaquettes $p$ and (for Wilson action)
\[ \varphi_\beta(d\theta) = e^{\beta\cos(d\theta)}. \tag{2} \]


To define a disorder operator we can consider a modified partition function in which an external hyper-gauge field strength $X$ is coupled to the dynamical variables:
\[ Z(X) = \int D\theta \prod_{p\subset\Lambda} \varphi_\beta(d\theta_p + X_p). \tag{3} \]
The mean value of the disorder operator is defined by
\[ \langle D(X)\rangle = \frac{Z(X)}{Z(0)} \tag{4} \]
and is invariant under the hyper-gauge transformation
\[ X \longrightarrow X + d\gamma, \tag{5} \]

with $\gamma$ a generic 1-form. The hyper-gauge invariance follows by the redefinition of link variables ($\theta\to\theta-\gamma$) and amounts to saying that $\langle D(X)\rangle$ depends only on the 3-form $dX = M$. In fact, by Hodge decomposition, on a convex lattice one can write
\[ X = d\alpha + \delta\frac{1}{\Delta}M \tag{6} \]
and the first term on the right-hand side can always be absorbed in a redefinition of $\theta$. Moreover, $M$ turns out to be the dual of the magnetic current density $J^M$, because the total field strength $G = d\theta + X$ satisfies the following modified Maxwell equations (in the presence of an electric current $J^E$):
\[ \delta G = J^E, \tag{7} \]
\[ dG = M \quad\text{or}\quad \delta{*G} = J^M, \tag{8} \]
where by $*G$ we mean the Hodge dual of the form $G$. Finally, from (8) it follows that the magnetic current is identically conserved:
\[ \delta J^M = 0. \tag{9} \]
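The two facts used above, hyper-gauge invariance of $Z(X)$ and conservation of the magnetic current, can be spelled out in one line each. The following is a sketch rather than a full proof: it assumes the usual lattice identities $d^2 = 0$ and $\delta = \pm{*}\,d\,{*}$ (with signs depending on the degree of the form and on conventions), the $2\pi$-periodicity of $\varphi_\beta$, and the resulting invariance of the measure $D\theta$ under shifts of the link variables.

```latex
\begin{align*}
Z(X + d\gamma)
  &= \int D\theta \prod_{p\subset\Lambda} \varphi_\beta\bigl(d\theta_p + X_p + d\gamma_p\bigr)
   = \int D\theta \prod_{p\subset\Lambda} \varphi_\beta\bigl(d(\theta+\gamma)_p + X_p\bigr)
   = Z(X), \\
dM &= d(dG) = 0
  \quad\Longrightarrow\quad
  \delta J^M = \pm\,\delta({*M}) = \pm\,{*}\,dM = 0 .
\end{align*}
```

In the first line the last equality is the change of variables $\theta\to\theta-\gamma$ mentioned in the text; in the second line $J^M = \pm{*M}$ is read off from (8).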

Hence we have that a disorder operator as defined by (4) is unavoidably connected to a magnetic current. In this language a disorder operator describing an open-ended line defect in the statistical system (1) will correspond to a current J M describing the birth and the evolution of a magnetic monopole. It is possible to parametrize such a conserved magnetic current in a general way as follows: J M = 2π [D − ] .

(10)

Here is an integer valued 1-form with support on a line (the Dirac string) whose endpoints are the space-time locations of monopoles: X qi δ(x − xi ), qi ∈ Z\{0}. (11) δ(x) = i

$D_\mu = (0, \vec D)$ is a superposition of Coulomb-like magnetic fields with flux $q_i$, spreading out at the time slices where the monopoles are located ($x^0_i$),
\[ \vec D(x^0, \vec x) = \frac{q_i}{4\pi}\,\frac{\vec x - \vec x_i}{|\vec x - \vec x_i|^3}\,\delta(x^0 - x^0_i), \tag{12} \]


in such a way that δD = δ. Moreover, hyper-gauge invariance joined to the compactness of the action implies that the disorder operator does not depend on the shape of the string. In conclusion we see that a magnetic monopole of charge q can be implemented as a defect in the model (1) by a 2-form X whose curvature dX plays the role of a dual magnetic current. For a monopole-antimonopole pair of charge ±q we have ∗J M = dX = 2πq(∗D − ∗) X = 2πqδ

1 (∗D − ∗), 1

(13)

where ∗ is an integer valued 3-form and $*D$ is the 3-form representing the Coulomb field.

1.2. Mixed order-disorder correlation functions. Performing a Fourier analysis on (3) we obtain
\[ Z(X) = \sum_{n:\,\delta n = 0}\ \prod_{p\subset\Lambda} I_\beta(n_p)\, e^{i(n, X)}. \tag{14} \]

np is the integer valued 2-form labeling the Fourier coefficients. With Iβ (n) we indicate the modified Bessel functions of order n evaluated in β (commonly written as In (β)). On the dual form v = ∗n the constraint δn = 0 becomes dv = 0 and so we can write v = dA and sum over equivalence classes of integer valued 1-forms defined by [A] = {A0 : dA0 = dA}: X Y Iβ (dAp ) ei2πq(A,D−) . (15) Z(D, ) = [A] p⊂3

Now ei2πq(A,) = 1, for integer values of q, and we can conclude that hD(D, )i actually depends Ponly on (xi , qi ), once we have fixed the shape of the magnetic field D satisfying δD = i qi δ(x − xi ). Now one can introduce ordinary fields, preserving hyper-gauge invariance of expectation values: ψp (D, ) = ei[dθp +Xp ] .

(16)
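The Fourier analysis leading to (14) rests on the single-plaquette expansion $e^{\beta\cos\theta} = \sum_{n\in\mathbb{Z}} I_n(\beta)\,e^{in\theta}$ of the Wilson weight (2). The sketch below checks this identity numerically for one angle. It is an illustration only: the function names are hypothetical, the Bessel functions are evaluated from their power series truncated at a fixed number of terms, and the Fourier sum is truncated at $|n|\le n_{max}$, which is adequate for moderate $\beta$.

```python
import cmath
import math


def bessel_i(n, beta, terms=40):
    """Modified Bessel function I_n(beta) for integer n, from its power series."""
    n = abs(n)  # I_{-n} = I_n for integer order
    return sum(
        (beta / 2.0) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
        for k in range(terms)
    )


def wilson_weight(theta, beta):
    """Single-plaquette Wilson weight exp(beta * cos(theta))."""
    return math.exp(beta * math.cos(theta))


def fourier_sum(theta, beta, n_max=20):
    """Truncated Fourier series sum_n I_n(beta) * exp(i n theta)."""
    total = sum(
        bessel_i(n, beta) * cmath.exp(1j * n * theta)
        for n in range(-n_max, n_max + 1)
    )
    return total.real  # imaginary parts cancel because I_{-n} = I_n


if __name__ == "__main__":
    theta, beta = 0.7, 1.3
    print(wilson_weight(theta, beta), fourier_sum(theta, beta))
```

The two printed numbers should agree to within floating-point rounding for the parameter values shown.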

The correlation functions to which the reconstruction theorem applies are then given by
\[ S_{n,m}(x_1 q_1, \dots, x_n q_n; p_1 \dots p_m) = \langle D(x_1 q_1, \dots, x_n q_n)\,\psi_{p_1}\cdots\psi_{p_m}\rangle \tag{17} \]
for $\sum_i q_i = 0$. Correlation functions with nonvanishing total charge are defined by the following limiting procedure for $\sum_i q_i = q$:
\[ S_{n,m} = \lim_{x\to\infty} c_q\, S_{n+1,m}(x_1, q_1; \dots; x_n, q_n; x, -q; p_1 \dots p_m), \tag{18} \]
where $c_q$ is a normalization constant. We recall now the reconstruction theorem in the version given in [4].

Theorem 1. If the set of correlation functions $\{S_{n,m}\}$ 1) is lattice translation invariant; 2) is O.S. (reflection) positive; 3) satisfies cluster properties; then one can reconstruct from $\{S_{n,m}\}$ a) a separable Hilbert space $\mathcal{H}$ of physical states; b) a vector of unit norm, the vacuum;

c) a selfadjoint transfer matrix $T$ with norm $\|T\|\le 1$ and unitary spatial translation operators $U_\mu$, $\mu = 1, \dots, d-1$, which leave the vacuum invariant; d) the vacuum is the unique vector in $\mathcal{H}$ invariant under $T$ and $U_\mu$. If moreover the limits (18) vanish, then $\mathcal{H}$ splits into orthogonal sectors $\mathcal{H}_q$, $q\in\mathbb{Z}$, which are the lattice monopole sectors.

In our case hypotheses 1–3 follow from translation invariance and reflection positivity of the measure defined in the standard way from the Wilson action [14]. In particular one can easily check the reflection positivity of monomials of disorder fields in the dual representation, where they assume a standard form and have support on fixed time slices (because the magnetic field spreads out in fixed time planes). We are now going to show that in the confining phase the two point function $S_2(x, q; y, -q)$ is uniformly bounded away from zero, while in the QED phase it vanishes at large Euclidean time distances:
\[ \lim_{|x-y|\to\infty} S_2(x, q; y, -q) = 0. \tag{19} \]

A generalization of these estimates to $S_{n,m}$ implies that in the confining phase there is the so-called monopole condensation and in the weak coupling region the physical Hilbert space $\mathcal{H}$ decomposes into orthogonal sectors labeled by total magnetic charge.

2. Monopole Condensation in the Confining Phase

In this section we are going to prove monopole condensation in the strong coupling regime: this relies on the fact that given arbitrary $\gamma\in\,]0, 1[$, there exists a $\beta_\gamma$ such that for $\beta\le\beta_\gamma$
\[ S_2(x, q; y, -q)\ge\gamma \quad \forall\, x, y\in\Lambda \tag{20} \]
holds.

In order to prove our statement we shall adopt the following strategy [10]: first, we express the disorder field expectation value in terms of logarithms of the partition functions,
\[ S_2(x, q; y, -q) = \exp\bigl\{\log Z(D) - \log Z\bigr\}, \tag{21} \]
and then we prove that $\log Z(D) - \log Z$ is close to 0 uniformly in $x$ and $y$. The main tool we'll use is a cluster expansion for $\log Z(D)$, which we obtain after rearranging $Z(D)$ as the partition function of a polymer gas.

2.1. The polymer expansion. The first step in our program is the polymer expansion for the (modified) partition function of our system (see e.g. [1, 14]). Although it is a standard technique we now briefly present its application to the compact $U(1)$ system with disorder fields. In the following presentation of the polymer and Mayer expansions we are going to work in a finite lattice $\Lambda\subset\mathbb{Z}^4$ and we'll neglect boundary effects (due, for instance, to polymers whose supports extend to the boundaries): these effects do not influence the proofs. First note that the partition function (15) can be written summing over closed integer valued 2-forms $v$ as follows:


Z(D, ) = N3

X Y

I˜β (vp ) ei2πq(Av ,D) ,

(22)

v:dv=0 p⊂3

with
\[ N_\Lambda = I_\beta(0)^{N_P(\Lambda)}, \qquad \tilde I_\beta(v_p) = \frac{I_\beta(v_p)}{I_\beta(0)}. \tag{23} \]
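To get a feeling for the normalized weights in (23) at strong coupling, the sketch below evaluates $\tilde I_\beta(n) = I_n(\beta)/I_0(\beta)$ and compares it with the elementary bound $\tilde I_\beta(n)\le(\beta/2)^{|n|}/|n|!$, which follows term by term from the power series because $(k+n)!\ge k!\,n!$. This bound is offered only as an illustration of why nonzero $v_p$ are suppressed for small $\beta$; it is not claimed to be the estimate used in the paper, and the function names are hypothetical.

```python
import math


def bessel_i(n, beta, terms=40):
    """Modified Bessel function I_n(beta) for integer n, from its power series."""
    n = abs(n)
    return sum(
        (beta / 2.0) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
        for k in range(terms)
    )


def normalized_weight(n, beta):
    """Rescaled weight I_n(beta) / I_0(beta); equals 1 at n = 0."""
    return bessel_i(n, beta) / bessel_i(0, beta)


def elementary_bound(n, beta):
    """Upper bound (beta/2)^|n| / |n|! on the rescaled weight."""
    n = abs(n)
    return (beta / 2.0) ** n / math.factorial(n)


if __name__ == "__main__":
    beta = 0.2
    for n in range(4):
        print(n, normalized_weight(n, beta), elementary_bound(n, beta))
```

Each unit of $|v_p|$ costs roughly a factor $\beta/2$, so configurations with large support carry small weight when $\beta$ is small.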

In Eq. (22) $A_v$ is the representative element of the class defined by $dA_v = v$: it is simple to verify that each term in the expansion does not depend on the choice of the representative element $A_v$. Moreover we have extracted an overall factor by rescaling the modified Bessel functions by $I_\beta(0)$: the advantage of this choice is that $\tilde I_\beta(0) = 1$, so that in our expansion around $v\equiv 0$ we need not carry along tedious factors.

Let us now give some definitions: the support of a $k$-form $v^{(k)}$ is the following set:
\[ \operatorname{supp} v^{(k)} = \bigl\{x\in\Lambda : x\in c_k,\ c_k \text{ a } k\text{-cell with } v^{(k)}(c_k)\neq 0\bigr\}. \]
Let us note that following [10] we think of $\operatorname{supp} v^{(k)}$ as a set of points rather than a set of $k$-cells. In the same way one can define in a natural way the support of a set of $k$-cells. This permits us to extend to these sets the common definition of connectedness: a set $X\subset\Lambda$ is connected if any two sites in $X$ can be connected by a path of links whose endpoints all lie in $X$.

Returning to Eq. (22), in order to recover the equivalent polymer system, we can rearrange it as
\[ Z(D) = \sum_{v:\,dv=0} k(v), \tag{24} \]
with (we omit the inessential factor $N_\Lambda$)
\[ k(v) = e^{2\pi q i (A_v, D)} \prod_{p\subset\Lambda} \tilde I_\beta(v_p). \tag{25} \]

The main idea is to re-express Eqs. (24) and (25) in terms of closed 2-forms v with connected supports, which will become the supports of the polymers. For this purpose let us recall that it is possible to write [10] $v = \sum_i v_i$ with the property that the $\operatorname{supp} v_i$ are the connected components of $\operatorname{supp} v$. Moreover one can write $A_v = \sum_i A_{v_i}$ with $dA_{v_i} = v_i$, and these relations allow the following factorization:

\[ k(v) = \prod_i k(v_i). \tag{26} \]
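The factorization (26) can be checked directly from (25). The following is only a sketch of that check; it assumes that the connected components have pairwise disjoint supports, so that on each plaquette at most one $v_i$ is nonzero, and it uses $\tilde I_\beta(0) = 1$.

```latex
% Sketch: factorization of k(v) over connected components v = \sum_i v_i.
% Assumption: disjoint supports, hence \tilde I_\beta(v_p) = \prod_i \tilde I_\beta((v_i)_p),
% because all but at most one factor equal \tilde I_\beta(0) = 1.
\begin{align*}
k(v) &= e^{2\pi q i (A_v, D)} \prod_{p \subset \Lambda} \tilde I_\beta(v_p)
      = e^{2\pi q i \sum_i (A_{v_i}, D)} \prod_{p \subset \Lambda} \prod_i \tilde I_\beta\big((v_i)_p\big) \\
     &= \prod_i \Big[ e^{2\pi q i (A_{v_i}, D)} \prod_{p \subset \Lambda} \tilde I_\beta\big((v_i)_p\big) \Big]
      = \prod_i k(v_i).
\end{align*}
```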

Finally, observing that with the above definition of connected set the condition $dv = 0$ implies $dv_i = 0$ for all i, we can reorganize the sum in Eq. (24) as follows:

\[ Z(D) = \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{X_1 \ldots X_n}\ \prod_{i=1}^{n} K(X_i, D), \qquad K(X, D) = \sum_{v:\ \operatorname{supp} v = X} k(v). \tag{27} \]

The sum extends over all finite connected subsets $X_i \subset \Lambda$, with the condition that $X_i \cup X_j$ is disconnected if $i \neq j$. Thus expression (27) has the form of a hard-core interaction [1] between polymers $X_i$ of connected support. This allows us to use the techniques developed for these systems in order to obtain the desired bound on disorder field expectations. We point out that an expansion of the form given in (27) is common to many statistical (or quantum mechanical, in the lattice formulation) systems: the polymer activities $K(X_i, D)$ are the link to the original problem, depending on the form of the starting action. In particular the same expansion is possible for the U(1) gauge model with Villain action [16] and gives monopole condensation in strong coupling [4, 5]. The only difference from the Wilson model is that the dual representation is Gaussian, which simplifies the handling of polymer activities.

2.2. Bounds on polymer activities. It is well known that the main properties (mathematical and physical) of the polymer system are governed by the general behavior of the activities, which in turn depends on parameters such as the temperature or the coupling constant. In this subsection we give a bound on $|K(X, D)|$ which is known [1] to be a sufficient condition for the convergence of the Mayer expansion for $\log Z(D)$; moreover it will be of great importance in the estimate of $\log Z(D) - \log Z$. We want to show that for every $M > 0$ there exists $\beta_M$ such that for $\beta < \beta_M$,

\[ \text{(i)}\ \ |K(X, D)| \le e^{-M|X|}, \qquad \text{(ii)}\ \ \left| \frac{\partial}{\partial D_b} K(X, D) \right| \le e^{-M|X|}. \tag{28} \]

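Here $|X|$ stands for the cardinality of the support of the polymer X. The basic tool used in the proof of Eq. (28) is the following upper bound on the modified (and rescaled) Bessel functions,

\[ \tilde I_n(\beta) \le e^{|n|} \left( \frac{\beta}{2} \right)^{|n|}, \qquad 0 \le \frac{\beta}{2} \le 1, \tag{29} \]

which easily follows from the power expansion [8]:

\[ I_n(\beta) = \left( \frac{\beta}{2} \right)^{n} \sum_{k=0}^{\infty} \frac{1}{\Gamma(k+1)\,\Gamma(n+k+1)} \left( \frac{\beta}{2} \right)^{2k}. \]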
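One possible route from the power expansion to the bound (29) is sketched below. It is not necessarily the argument the authors had in mind; it assumes $n \ge 0$ (the case $n < 0$ follows from $I_{-n} = I_n$) and uses only the elementary inequality $(n+k)! \ge n!\,k!$.

```latex
% Sketch: the series implies (29). I_0(\beta) is what the text writes as I_\beta(0).
\begin{align*}
I_n(\beta) &= \left(\frac{\beta}{2}\right)^{n} \sum_{k \ge 0} \frac{(\beta/2)^{2k}}{k!\,(n+k)!}
  \le \frac{1}{n!} \left(\frac{\beta}{2}\right)^{n} \sum_{k \ge 0} \frac{(\beta/2)^{2k}}{(k!)^{2}}
  = \frac{1}{n!} \left(\frac{\beta}{2}\right)^{n} I_0(\beta), \\
\tilde I_n(\beta) &= \frac{I_n(\beta)}{I_0(\beta)}
  \le \frac{1}{n!} \left(\frac{\beta}{2}\right)^{n}
  \le e^{n} \left(\frac{\beta}{2}\right)^{n}.
\end{align*}
```

The last step simply discards the factor $1/n!$, so (29) is far from sharp; what matters for the sequel is the geometric decay in $|n|$.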

The bound (29) tells us that for small β the series whose nth term is given by $I_\beta(n)$ is convergent. Now let us sketch the argument that leads to (28). In what follows we assume that $\operatorname{supp} v = X$: from Eqs. (25) and (27) one has

\[ |K(X, D)| \le \sum_{v:\, dv=0}\ \prod_{p \subset X} |\tilde I_\beta(v_p)| \le \sum_{\text{all } v}\ \prod_{p \subset X} |\tilde I_\beta(v_p)|. \tag{30} \]

We have added the contributions due to all integer-valued 2-forms on X because we intend to exploit the convergence of $\sum_n I_\beta(n)$. In order to sum over all integer 2-forms with support on X, we first consider each set $Y_i$ of plaquettes such that $\operatorname{supp} Y_i = X$ and sum over 2-forms v satisfying $v_p \neq 0$ for all $p \in Y_i$. Then we collect the contributions due to all the sets $Y_i$. Formally we have:

\[ |K(X, D)| \le \sum_{Y_i:\ \operatorname{supp} Y_i = X}\ \sum_{v:\ v_p \neq 0}\ \prod_{p \in Y_i} \tilde I_\beta(v_p). \tag{31} \]

Now let us focus on the generic term with fixed $Y_j$: exchanging the sum with the product and using the parity property $I_n(\beta) = I_{-n}(\beta)$, we obtain for the polymer activity

\[ |K(X, D)| \le \sum_{Y_i:\ \operatorname{supp} Y_i = X}\ \prod_{p \in Y_i} \Big[ 2 \sum_{n > 0} \tilde I_\beta(n) \Big]. \tag{32} \]

Thus to each plaquette in $Y_i$ is associated a factor that we estimate by replacing $\tilde I_\beta(n)$ with the upper bound (29) and summing the resulting series. Noticing that this is a geometric series missing the first term, we are left with

\[ \prod_{p \in Y_j} \frac{e\beta}{1 - \frac{e\beta}{2}} \le e^{-N_p(Y_j) \log\left( \frac{1}{2e\beta} \right)} \qquad \text{for } \beta < \frac{1}{e}, \tag{33} \]

where $N_p(Y_j)$ is the number of plaquettes in $Y_j$. From the fact that $\operatorname{supp} Y_j = X$ the relation $N_p(Y_j) \ge \frac{1}{4}|X|$ follows. Moreover one can bound the number of $Y_j$ with support on X by $e^{k|X|}$ and obtain

\[ |K(X, D)| \le e^{-(A_\beta - k)|X|}, \qquad A_\beta = \frac{1}{4} \log\left( \frac{1}{2e\beta} \right). \tag{34} \]
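The chain of estimates from (32) to (34) can be spelled out as follows. This is a sketch under the assumptions already stated in the text: the bound (29), $N_p(Y_j) \ge |X|/4$, and at most $e^{k|X|}$ sets $Y_j$ with support X. The extra restriction $2e\beta < 1$ in the last step is added here so that the logarithm is positive.

```latex
% Sketch: geometric series without its first term, then the exponential bound.
\begin{align*}
2 \sum_{n > 0} \tilde I_\beta(n)
  &\le 2 \sum_{n \ge 1} \left( \frac{e\beta}{2} \right)^{n}
   = \frac{e\beta}{1 - \frac{e\beta}{2}}
   \le 2e\beta \qquad (\beta \le 1/e), \\
\prod_{p \in Y_j} 2e\beta
  &= (2e\beta)^{N_p(Y_j)}
   = e^{-N_p(Y_j) \log\frac{1}{2e\beta}}
   \le e^{-\frac{|X|}{4} \log\frac{1}{2e\beta}} \qquad (2e\beta < 1), \\
|K(X, D)|
  &\le e^{k|X|}\, e^{-A_\beta |X|} = e^{-(A_\beta - k)|X|},
   \qquad A_\beta = \frac{1}{4} \log\frac{1}{2e\beta}.
\end{align*}
```

Since $A_\beta \to \infty$ as $\beta \to 0$, any prescribed decay rate M in part (i) of (28) is reached by taking β small enough.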

Thus we see that part (i) of (28) is satisfied with $M_\beta = A_\beta - k$. As far as part (ii) of (28) is concerned, it can be obtained by showing that there is a function $G(\beta)$ such that

\[ \left| \frac{\partial}{\partial D_b} K(X, D) \right| \le G(\beta)\, |X|^3\, e^{-\left( \frac{|X|}{4} - 1 \right) M_\beta} \equiv F(|X|) \tag{35} \]

holds; then one finds a constant $M'_\beta$ such that

\[ F(|X|) \le e^{-M'_\beta |X|}. \tag{36} \]

For a proof of Eq. (35) see Appendix A. The proof of (36) can then be obtained by means of elementary analysis.

2.3. Bound on $S_2(x, q; y, -q)$. For the sake of completeness let us now sketch the analysis given in [10] to bound $\log Z(D) - \log Z$. Let us start from the definition of the Mayer expansion:

\[ \log Z(D) = \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{X_1 \ldots X_n} \psi_c(X_1 \ldots X_n) \prod_{i=1}^{n} K(X_i, D). \tag{37} \]

Here $\psi_c(X_1 \ldots X_n)$ is the connected part of the hard-core interaction of the polymer system and it is nonvanishing only if $\cup_i X_i$ is a connected set. The bound on the polymer activities ensures the convergence of the Mayer expansion [1] and this implies, for example, that correlation functions defined by differentiation of $\log Z(D)$ with respect to $D_b$ share the cluster property. Following [10] we now define $H(s) = \log Z(sD)$ and observe that $H(s) = H(-s)$ because the measure on equivalence classes [A] is even. Hence we see that

\[ \log Z(D) - \log Z(0) = \int_0^1 ds\, (1 - s)\, H''(s), \tag{38} \]

where

\[ H''(s) = \sum_{b, b'} D(b)\, D(b')\, m(b, b'), \tag{39} \]

\[ m(b, b') = \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{i,j=1}^{n}\ \sum_{b \subset B(X_i),\, b' \subset B(X_j)} \psi_c(X_1 \ldots X_n) \times \Big[ \prod_{k \neq i,j} K(X_k, sD) \Big] \frac{\partial}{\partial D_b} K(X_i, sD)\, \frac{\partial}{\partial D_{b'}} K(X_j, sD). \tag{40} \]

In (40) $B(X)$ denotes the smallest rectangular parallelepiped in Λ which contains X. Moreover, from the exponential bounds on activities and derivatives given in Eq. (28) it follows that

\[ \Big| \Big[ \prod_{k \neq i,j} K(X_k, sD) \Big] \frac{\partial}{\partial D_b} K(X_i, sD)\, \frac{\partial}{\partial D_{b'}} K(X_j, sD) \Big| \le \exp\Big[ -M \sum_k |X_k| \Big]. \tag{41} \]

Since $\psi_c(X_1 \ldots X_n) \neq 0$ only if $\cup_i X_i$ is a connected set, we have that $\sum_k |X_k| \ge d(b, b')$ (where $d(b, b')$ is the distance between the two links) and so

\[ |m(b, b')| \le \exp\Big[ -\frac{1}{2} M d(b, b') \Big] \times \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{i,j=1}^{n}\ \sum_{b \subset B(X_i),\, b' \subset B(X_j)} \psi_c(X_1 \ldots X_n) \exp\Big[ -\frac{1}{2} M \sum_i |X_i| \Big]. \tag{42} \]

Now, using the techniques of [1], one can work out from (42) the inequality

\[ |m(b, b')| \le \exp\Big[ -\frac{1}{2} M d(b, b') \Big] \times \sum_n n\, \delta^n \sum_{X \supset b} |X| \exp\Big[ -\frac{1}{2} M |X| \Big], \tag{43} \]

where δ is a constant that becomes small as β decreases. For M large enough the sum over n appearing in (43) converges to a constant $\delta'$ which again becomes small as β decreases. Moreover, reading the sum in (39) as the scalar product between D and mD, we have the following bound on $|H''(s)|$:

\[ |H''(s)| = |(D, mD)| \le \|D\|_2 \|mD\|_2 \le \|D\|_2^2\, \|m\|_2. \tag{44} \]

From properties of matrix norms we have that $\|m\|_2 \le \sup_{b'} \big( \sum_b |m(b, b')| \big)$: using the previous result we obtain

\[ |H''(s)| \le \delta' \|D\|_2^2 \Big[ \sum_b e^{-\frac{1}{2} M d(b, b')} \Big] < \rho(\beta), \tag{45} \]

with $\rho(\beta) \to 0$ as $\beta \to 0$. It is important to point out that the argument works because $\|D\|_2$ is bounded uniformly in x and y. Finally, using (38) and (39) we conclude

\[ S_2(x, q; y, -q) > \exp\Big[ -\frac{1}{2} \rho(\beta) \Big]. \tag{46} \]

This relation implies in particular that $S_1(x, q)$, defined by the limiting procedure in which $y \to \infty$, is nonvanishing. In the language of field theory we can say that the field describing magnetic monopoles acquires a nonvanishing vacuum expectation value. This implies the spontaneous breakdown of the topological symmetry associated with magnetic charge conservation and signals confinement of electric charge [15].
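The step from (38) and (45) to (46) is short enough to write out. The sketch below assumes only that H is twice continuously differentiable, that $H'(0) = 0$ (a consequence of $H(s) = H(-s)$), and that $\log Z(0) = \log Z$.

```latex
% Sketch: Taylor's formula with integral remainder, then the uniform bound (45).
\begin{align*}
H(1) - H(0) &= H'(0) + \int_0^1 ds\, (1 - s)\, H''(s)
             = \int_0^1 ds\, (1 - s)\, H''(s), \\
|\log Z(D) - \log Z|
            &\le \sup_{s \in [0,1]} |H''(s)| \int_0^1 ds\, (1 - s)
             < \frac{1}{2} \rho(\beta), \\
S_2(x, q; y, -q) &= \exp\{ \log Z(D) - \log Z \} > \exp\Big[ -\frac{1}{2} \rho(\beta) \Big].
\end{align*}
```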

3. Monopole Sectors in the QED Phase

In this section we prove the relation (19) for the soliton two point function, showing that for β large enough and $|x - y| \to \infty$ it is possible to find a positive constant $m(\beta)$ such that

\[ S_2(x, q; y, -q) \le e^{-m(\beta)\, q\, |x - y|}. \tag{47} \]

In order to prove our statement, we use a slight modification of the expansion given in [6, 7] and [11]: we re-express the partition function as a gas of monopole loops, to which we apply a renormalization transformation. Estimates on the renormalized loop activities enable us to extract the relevant contribution to $S_2(x, q; y, -q)$.

3.1. Expansion of Z(D, ) in interacting monopole loops. From Eq. (15), defining the modified partition function in dual representation, it is natural to introduce a measure on equivalence classes [A] of integer-valued 1-forms given by

\[ d\mu(A) = \frac{1}{Z} \prod_{p \subset \Lambda} I_\beta(dA_p), \qquad \int_{[A]} d\mu(A) = 1. \tag{48} \]

The main idea [6] on which our construction is based is the following: we want to introduce a measure $d\mu_{I_\beta}(A)$ on $\mathbb{R}^n$ (n is the number of links in Λ), which reproduces (48) once we constrain the real variables $A_b$ to integer values and pick them on a gauge slice. Such a measure should enable us to make suitable estimates in the weak coupling region. We can fulfill the first constraint by inserting a sum of δ-functions for each link variable $A_b$, which by Poisson summation gives us the monopole currents; formally we have

\[ d\mu(A) = \frac{1}{Z} \prod_{b \subset \Lambda} \Big[ \sum_{q \in \mathbb{Z}} e^{i 2\pi q A_b} \Big]\, d\mu_{I_\beta}(A). \tag{49} \]

Moreover, since we are going to compute expectation values of gauge invariant observables, the only contributions come from conserved currents; hence we can impose

\[ \int d\mu_{I_\beta}(A)\, e^{i(J, A)} = 0 \qquad \text{for } \delta J \neq 0. \tag{50} \]

Actually it is possible to construct a measure with the required properties, taking the limit $\Lambda \to \mathbb{Z}^4$ of

\[ d\mu_{I_\beta}(A) = \frac{1}{N_\Lambda} \prod_{p \subset \Lambda} I_\beta(dA) \prod_{b \in \Lambda} dA_b, \qquad A_b \in \mathbb{R}. \tag{51} \]

$I_\beta(\phi)$ is a suitable interpolation of the modified Bessel functions, which has been constructed in [6] for large values of β, and has the properties listed below:

1. $I_\beta(\phi) = \dfrac{1}{2\pi} \displaystyle\int_0^{2\pi} d\theta\, e^{\beta \cos\theta}\, e^{i\phi\theta}$ for integer φ;

2. Iβ (φ) is an even, positive and integrable function on R; 3. Iβ (φ) is analytic on the strip |Imφ| ≤ β2 . Moreover it is possible to find a constant c such that g(a) β Iβ (φ + ia) | ≤ exp with 0 ≤ g(a) ≤ c e2π|a| for 1 ≤ |a| ≤ . | Iβ (φ) β 2 With these new tools the soliton two point function can be written as follows: " # Z Y X i(A,(D−)) i2πnb Ab e . S2 (x, q; y, −q) = dµIβ (A) e b∈3

(52)

nb

Here the factor 2πq has been absorbed in D and . Our first objective is a suitable rearrangement of the product appearing in (52), which, by the constraint (50), gives the usual coupling of A with external current loops. Let us start defining a current density ρ as a 1-form on 3 with values in 2πZ. An 1ensemble E is a set of current densities {ρ} whose supports are disjoint and such that 1 dist(ρ, ρ0 ) ≥ 2 2 ∀ρ, ρ0 ∈ E. It is useful to collect the above currents in 1-ensembles using the following property [7]: " # Y X Y X i2πnb Ab e dσ (53) = [1 + K(ρ) cos(A, ρ)] . b∈3

nb

σ

ρ∈Eσ

The index σ runs over a finite set, each Eσ is an 1-ensemble and dσ > 0. Moreover, kkρk1 , where kρk1 is the norm given the bare loop P activities satisfy 0 < K(ρ) < e by kρk1 = b∈ supp ρ |ρb | and k is a geometrical constant, independent of ρ. The next step is to consider the string as a current density and construct suitable 1-ensembles containing open-ended currents which are obtained by grouping the currents “touching” , in a sense which we’ll specify below. We choose (by hyper-gauge invariance) such that supp ∩ suppD = : this means that we are taking with support in the region bounded by the hyperplanes z 0 = x0 and z 0 = y 0 . The disorder field expectation can be written as follows [6, 7, 11]: Z X cτ cos(A, D − ) + K(ρτ ) cos(A, D − + ρτ ) × Z(, D) = dµIβ (A) τ

×

Y ρ∈Eτ

[1 + K(ρ) cos(A, ρ)] .

(54)


Now, some comments on Eq. (54): the currents ρτ are divergenceless and their support has non vanishing intersection with N = {b ∈ 3 : dist(b, supp(D − )) ≤ 1}: ρτ are the currents touching . cτ are positive constants and K(ρτ ) satisfies 0 < K(ρτ ) < kkρτ k1 . Moreover Eτ is an 1-ensemble of divergenceless currents having vanishing e intersection with N . We must point out that Eτ ∪ {ρτ − } is still an 1-ensemble: we have divided the closed currents from the open-ended ρτ = ρτ − , which play a peculiar role in the proof of clustering of the soliton correlation function, as we shall see below. 3.2. Renormalization transformation. Now we make on (54) a transformation that renormalizes the activities of currents and allows us to give the wanted bound on S2 (x, q; y, −q). The transformation consists in the explicit integration of the factors containing the currents ρ on a suitable subset of suppρ, which we call Bρ , characterized by the fact that two different links contained in it belong to different plaquettes and X |ρb | ≥ c˜kρk1 . (55) b∈Bρ 1 . Such a renormalization In dimension four the geometric constant c˜ can be fixed to be 18 is based on the application of the following property. Let us consider a function G(A) which does not depend on Ab for some link b ∈ 3: for arbitrary real a such that |a| ≤ β2 then we have Z eiqAb G(A)dµIβ (A) =   Z (56) Y ˜ iβ ((b, p)a, dA) G(A)dµIβ (A), eiqAb  = e−Eβ (a,q) p:b∈∂p

where

\[ \tilde E_\beta(a, q) = qa - \frac{n_b}{\beta}\, g(a) \qquad \text{and} \qquad i_\beta(a, \phi) = \frac{I_\beta(\phi + ia)}{I_\beta(\phi)}\, e^{-\frac{g(a)}{\beta}}. \tag{57} \]

In Eq. (57) nb is the number of plaquettes containing b and (b, p) is a factor ±1 which gives the orientation of b in ∂p. The proof of this property is immediate if we make a complex translation on the variable A ( A → A + ia) and use the properties of the function Iβ (φ). Now let us focus on the generic term in the sum over the index τ appearing in (54). First, we choose Bρ ⊂ suppρ for every ρ ∈ Eτ and Bρτ ⊂ suppρτ . By the defining property of these subsets and the fact that we are dealing with 1-ensembles, all links selected in this way belong to different plaquettes. After exponentiating the cosines in (54) and decomposing the resulting products, this observation allows us to apply the relation (56) to the resulting terms for all links in Bρ , Bρτ and supp. These integrations produce the transformation of cos(A, ρ) into a function cβ (A, ρ) and renormalize the activities K(ρ) as described by the following relations: Z X cτ z(β)cβ (A, D − ) + z(β, ρτ )cβ (A, D + ρτ ) × Z(, D) = dµIβ (A) τ

×

Y 1 + z(β, ρ)cβ (A, ρ) . ρ∈Eτ

(58)


The renormalized activities are given by Y ˜ e−Eβ (a,b ) ; z(β) = b∈ supp

z(β, ρτ ) = K(ρτ )

Y

−E˜ β a,(D+ρτ )b

e

;

b∈Bρτ

z(β, ρ) = K(ρ)

Y

e−Eβ (a,ρb ) . ˜

(59)

b∈Bρ

The renormalized version of the cosine is cβ (A, ρ) = Re eβ (A, ρ) , where  eβ (A, ρ) = ei(A,ρ) 

Y

 iβ (a, dA(p))

(60)

p∈T (Bρ )

and $T(B_\rho) = \{ p \in \Lambda : b \in \partial p,\ b \in B_\rho \}$. From the relations given above it is clear that $|c_\beta(A, \rho)| \le 1$.

3.3. Estimates on renormalized activities and cluster property. In order to extract a bound that ensures the cluster property we must now choose a suitable value of the parameter a appearing in $\tilde E_\beta$. By property 3 of the function $I_\beta(\phi)$ it follows that

\[ e^{-\tilde E_\beta(a, q)} \le e^{-E_\beta(a, q)} \qquad \text{with} \qquad E_\beta(a, q) = qa - \frac{n_b}{\beta}\, c\, e^{2\pi |a|}. \tag{61} \]

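In order to give an upper bound on the activities that is as strong as possible, we take the value of a maximizing $E_\beta(a, q)$ in the domain $|a| \le \frac{\beta}{2}$ and we denote it by $a_m$. For a fixed value of β it turns out that $a_m$ depends on the value of the parameter q, which stands here for the value of the generic current on a link in $B_\rho$ ($q \to \rho_b$ in Eqs. (59)): we find

\[ a_m^{(1)}(q) = \operatorname{sgn}(q)\, \frac{1}{2\pi} \log \frac{\beta |q|}{2\pi n_b c} \qquad \text{if } |q| \le \tilde q_\beta; \tag{62} \]

\[ a_m^{(2)}(q) = \operatorname{sgn}(q)\, \frac{\beta}{2} \qquad \text{if } |q| \ge \tilde q_\beta, \tag{63} \]

and correspondingly

\[ E_\beta^{(1)}(q) \equiv E_\beta(a_m^{(1)}, q) = \frac{|q|}{2\pi} \Big[ \log \frac{\beta |q|}{2\pi n_b c} - 1 \Big]; \tag{64} \]

\[ E_\beta^{(2)}(q) \equiv E_\beta(a_m^{(2)}, q) = |q|\, \frac{\beta}{2} \Big[ 1 - \frac{2 n_b c}{|q| \beta^2}\, e^{\pi\beta} \Big]. \tag{65} \]

The discriminant value $\tilde q$ is defined by $a_m^{(1)}(\tilde q) = \frac{\beta}{2}$. The important feature of these current self-energies extracted by renormalization is that for β large enough both $E_\beta^{(1)}(q)$ and $E_\beta^{(2)}(q)$ are positive.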
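The maximization behind (62)–(65) is an elementary calculus exercise; a sketch follows. It assumes $q > 0$ and $a \ge 0$ (the case $q < 0$ is obtained by reflecting $a \to -a$), and writes $f(a)$ for $E_\beta(a, q)$; the name $f$ and the symbol $a^*$ are introduced here only for the illustration.

```latex
% Sketch: maximize f(a) = E_\beta(a,q) = q a - (n_b c/\beta) e^{2\pi a} on 0 <= a <= \beta/2.
\begin{align*}
f'(a) &= q - \frac{2\pi n_b c}{\beta}\, e^{2\pi a} = 0
  \;\Longrightarrow\;
  a^{*} = \frac{1}{2\pi} \log \frac{\beta q}{2\pi n_b c},
  \qquad f''(a) < 0, \\
f(a^{*}) &= \frac{q}{2\pi} \log \frac{\beta q}{2\pi n_b c} - \frac{n_b c}{\beta} \cdot \frac{\beta q}{2\pi n_b c}
  = \frac{q}{2\pi} \Big[ \log \frac{\beta q}{2\pi n_b c} - 1 \Big], \\
f\big(\tfrac{\beta}{2}\big) &= \frac{q \beta}{2} - \frac{n_b c}{\beta}\, e^{\pi\beta}
  = q\, \frac{\beta}{2} \Big[ 1 - \frac{2 n_b c}{q \beta^{2}}\, e^{\pi\beta} \Big].
\end{align*}
```

Because f is concave, the interior critical point $a^*$ is the maximizer whenever $a^* \le \beta/2$; otherwise f is increasing on the whole interval and the maximum sits at the endpoint $\beta/2$. The crossover $a^* = \beta/2$ gives $\tilde q = 2\pi n_b c\, e^{\pi\beta} / \beta$.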

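With the above choice for the parameter a the renormalized activity of the generic current ρ satisfies

\[ z(\beta, \rho) \le K(\rho) \prod_{b \in B_\rho} e^{-|\rho_b|\, A(\beta, \rho_b)}, \tag{66} \]

with

\[ A(\beta, \rho_b) = \begin{cases} \dfrac{1}{2\pi} \Big[ \log \dfrac{\beta |\rho_b|}{2\pi n_b c} - 1 \Big] & \text{for } |\rho_b| \le \tilde q, \\[2ex] \dfrac{\beta}{2} \Big[ 1 - \dfrac{2 n_b c}{|\rho_b| \beta^2}\, e^{\pi\beta} \Big] & \text{for } |\rho_b| \ge \tilde q. \end{cases} \tag{67} \]

In both cases $A(\beta, \rho_b)$ is bounded from below by a function which does not depend on $\rho_b$:

\[ A(\beta, \rho_b) \ge \begin{cases} \dfrac{1}{2\pi} \Big[ \log \dfrac{\beta}{12\pi c} - 1 \Big] = A^{(1)}(\beta), \\[2ex] \dfrac{\beta}{2} \Big[ 1 - \dfrac{1}{\pi\beta} \Big] = A^{(2)}(\beta). \end{cases} \tag{68} \]

The first bound is obtained using the inequalities $|\rho_b| > 1$ and $n_b \le 6$; the second one is obtained by replacing $|\rho_b|$ with $\tilde q$. We must point out that both $A^{(1)}(\beta)$ and $A^{(2)}(\beta)$ are positive functions increasing with β. If we now define $A(\beta) = \min\{ A^{(1)}(\beta), A^{(2)}(\beta) \}$, using the properties of $B_\rho$ and $K(\rho)$ we can write

\[ z(\beta, \rho) \le K(\rho)\, e^{-\tilde c A(\beta) \|\rho\|_1} \le e^{-(\tilde c A(\beta) - k) \|\rho\|_1}. \tag{69} \]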

In particular one can bound the renormalized activity of the string as follows: ˜ ˜ 1 ≤ e−cA(β)q|x−y| . z(β) ≤ e−cA(β)kk

(70)

More delicate is the estimate of z(β, ρτ ), because of the presence of the Coulomb field D. In fact one has Bρτ ⊂ supp(ρτ ) but the generalized current density is ρτ + D . ˜ In general for these activities one can find C(β) and A(β) > 0 such that: z(β, ρτ ) ≤ C(β) e−A(β)q|x−y| . ˜

The starting point to prove (71) is the relation Y τ e−A(β)|ρb +Db | . z(β, ρτ ) ≤ K(ρτ )

(71)

(72)

b∈Bρτ

First one can easily see that in the case in which suppρτ ∩ suppD = one can deal with the current ρτ = ρτ − as for the common ρ (see Eq. (69)) and from the relation ˜ = c˜A(β) − k. kρτ k1 ≥ kk1 follows (71) with C(β) = 1 and A(β) The case in which suppρτ ∩ suppD 6= can be worked out with the following trick: we distinguish the links b ∈ Bρτ such that |Db | > 21 from those such that |Db | < 21 (in other words we decompose Bρτ = Bρ<τ ∪ Bρ>τ ) obtaining a factorization in Eq. (72). Noticing then that ρτ takes values in 2πZ, for b ∈ Bρ<τ we have ! Db 1 τ τ τ (73) |ρb + Db | = |ρb || 1 + τ | ≥ |ρ b | , 2 ρb


by which we reduce the factor involving Bρ<τ to the standard form (66). The term involving Bρ>τ will give us the constant C(β). In fact after a simple little algebra we obtain τ

˜ z(β, ρτ ) ≤ K(ρτ ) e− 2 cA(β)kρ 1

with τ

G(D, ρ

Y

)=

" exp A(β)

b∈Bρ>τ

k1

G(D, ρτ ),

!# τ |ρ τ b | − |ρb + Db | . 2

Finally it is possible to show that $G(D, \rho^\tau)$ is bounded by a constant $C(\beta)$ depending only on β and on the number of links on which $|D| \ge \frac{1}{2}$ (actually this number is a function of q); this completes the proof of Eq. (71). Now we have all the elements to complete our proof. By the property that $|z c_\beta| \le |z|$, starting from (58) we can write

\[ |S_2(x, q; y, -q)| \le \sup_\tau \{ z(\beta) + z(\beta, \rho^\tau) \} \times \int d\mu_{I_\beta}(A) \sum_\tau c_\tau \prod_{\rho \in E_\tau} \big[ 1 + z(\beta, \rho)\, c_\beta(A, \rho) \big]. \tag{74} \]

Let us call R the integral in (74): considering the explicit form of the normalization factor Z, it can be represented as

\[ R = \frac{\sum_\tau A_\tau}{\sum_\tau B_\tau}, \tag{75} \]

with

\[ A_\tau = c_\tau \int d\mu_{I_\beta}(A) \prod_{\rho \in E_\tau} \big[ 1 + z(\beta, \rho)\, c_\beta(A, \rho) \big], \tag{76} \]

\[ B_\tau = c_\tau \int d\mu_{I_\beta}(A)\, \big[ 1 + z(\beta, \rho^\tau)\, c_\beta(\rho^\tau, A) \big] \prod_{\rho \in E_\tau} \big[ 1 + z(\beta, \rho)\, c_\beta(A, \rho) \big]. \tag{77} \]

It is simple to see that for β and $|x - y|$ large enough we have $A_\tau \le 2 B_\tau$. In fact, choosing suitable values of β and $|x - y|$ we can obtain that $z(\beta, \rho^\tau) \le \frac{1}{2}$ and $z(\beta, \rho) \le 1$; thus in order to evaluate $B_\tau$ we must integrate in the positive measure

\[ d\mu_{I_\beta}(A) \prod_{\rho \in E_\tau} \big[ 1 + z(\beta, \rho)\, c_\beta(A, \rho) \big] \]

the function $1 + z(\beta, \rho^\tau)\, c_\beta(A, \rho^\tau) \ge \frac{1}{2}$. This observation leads to the conclusion that $R \le 2$. Finally, using this result in (74) we get the relation

\[ |S_2(x, q; y, -q)| \le 2 \sup_\tau \{ z(\beta) + z(\beta, \rho^\tau) \}. \tag{78} \]

The cluster property of the soliton two point function then follows from the estimates (70) and (71) on the renormalized activities. The extension of the above reasoning to correlation functions with n-point disorder fields is straightforward. The point is that if $\sum_i q_i = 0$, then it is always possible


to write J M = D − , as pointed out in the introduction; therefore the general npoint disorder field correlation function has the form of the right-hand side of (52), with appropriate D and . Now, all manipulations on this correlation function depend only on D having support on fixed time hyperplanes and on behaving such that supp ∩ suppD = , condition fulfilled by means of hyper-gauge freedom: both these features are clearly true in the general case. Equations (70) and (71), giving the fundamental bounds for the proof of clustering, in general read: ˜ 1 ; z(β) ≤ e−cA(β)kk

c˜

z(β, ρτ ) ≤ C(β) e−( 2 A(β)−k)kk1 .

(79)

(80)

The term C(β) actually depends on the Coulomb field D but the combination c2˜ A(β)−k, on which the clustering depends, is the same for all correlation functions: this ensures the validity of our statements about the Hilbert space decomposition. Finally we notice that the introduction of ordinary fields in the correlation functions does not affect the estimation and the extraction of the renormalized monopole loop’s activities. The only effect of the order fields is a modification of the quantities Aτ introduced in Eq. (76). Substantially one is left with |Sn,m | ≤ supτ {z(β) + z(β, ρτ )} × R(β; p1 ....pm ),

(81)

which, thanks to (79) and (80), is a bound ensuring clustering.

4. Concluding Remarks

In conclusion we summarize the results: in the weak coupling region the Hilbert space of the reconstructed lattice quantum field theory splits into orthogonal sectors labeled by magnetic charge. In the strong coupling region, instead, the lattice solitons condense in the vacuum sector: the symmetry associated with the topological conservation of magnetic charge is spontaneously broken, as signaled by the nonvanishing expectation value of the charged monopole operator. This is precisely the criterion for quark confinement proposed by 't Hooft [15]. Moreover our analytical results are in agreement with numerical calculations [3]: these show that the parameter regions in which $S_1 > 0$ and $S_1 = 0$ coincide with the confining phase and the QED phase, respectively, and allow one to extract information about the behavior of the system at the transition. Our result suggests that the correlation functions of soliton (disorder) fields are a viable tool for the study, both numerical and analytical, of phase transitions in lattice models that exhibit monopole-like topological excitations (we point out that a slight generalization allows the extension to lattices with nontrivial topology). Along this line there is the possibility of studying analytically the disorder fields associated with vortices in the 3d XY model and, although more remote, a generalization to non-Abelian models.

Acknowledgement. We wish to thank Adriano Di Giacomo and Michele Mintchev for useful discussions.


A. Bound on Activity's Derivatives

In this appendix we sketch the argument which leads to Eq. (35). The proof is based on the following inequality, given in [10]:

\[ \|A_v\| \le \frac{c}{2\pi} (v, v)^2 \le c\, N_p^2\, m_v^4, \tag{82} \]

where $N_p$ is the number of plaquettes in X and $m_v = \max_{p \subset X} |v_p|$. The derivative with respect to $D_b$ modifies the expression of $k(v)$ by the multiplicative factor $2\pi A_v(b)$ and thus we have:

\[ \left| \frac{\partial}{\partial D_b} K(X, D) \right| \le c\, N_p^2 \sum_{k > 0} k^4 \sum_{[v]:\ m_v = k}\ \prod_{p \subset X} \tilde I_\beta(v_p). \tag{83} \]

Now we can write

\[ \sum_{[v]:\ m_v = k}\ \prod_{p \subset X} \tilde I_\beta(v_p) = \sum_{Y_i:\ \operatorname{supp} Y_i = X}\ \sum_{[v]:\ m_v = k,\ v_p \neq 0}\ \prod_{p \in Y_i} \tilde I_\beta(v_p). \tag{84} \]

The 2-forms with $m_v = k$ may be obtained by fixing the value of v to $\pm k$ on a plaquette $p_M$ and summing over configurations of integers $n \le k$ in the remaining $(N_P(Y_i) - 1)$ plaquettes. We extract a factor $\left( \frac{e\beta}{2} \right)^k$ (coming from $\tilde I_\beta(v_{p_M} = k)$) from each term with fixed $p_M$ and notice that the contribution due to all configurations is bounded by $e^{-(N_P(Y_i) - 1) A_\beta}$. The result is

\[ \left| \frac{\partial}{\partial D_b} K(X, D) \right| \le c\, N_p^2 \sum_{k > 0} k^4 \left( \frac{e\beta}{2} \right)^{k} e^{-(|X| - 1) A_\beta} \sum_{Y_i:\ \operatorname{supp} Y_i = X} N_P(Y_i). \tag{85} \]

The series in k can be estimated by the value of the integral of $x^4 e^{-Bx}$ on $\mathbb{R}^+$ (with $B = \log\left( \frac{2}{e\beta} \right)$), and this gives $G(\beta)$ apart from factors. Moreover the sum over $Y_i$ is bounded by $2|X|\, e^{k|X|}$, in such a way that we can write Eq. (35):

\[ \left| \frac{\partial}{\partial D_b} K(X, D) \right| \le G(\beta)\, |X|^3\, e^{-(|X| - 1) M_\beta} \equiv F(|X|). \tag{86} \]
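For orientation, the integral mentioned above has a closed form. The sketch below records it under the assumption $B > 0$, that is $\beta < 2/e$; it is only meant to indicate the size of $G(\beta)$ up to constants, not to reproduce the authors' exact estimate.

```latex
% Sketch: the integral controlling the k-series, with B = \log(2/(e\beta)) > 0.
\begin{align*}
\left( \frac{e\beta}{2} \right)^{k} &= e^{-Bk},
\qquad
\int_0^{\infty} x^{4} e^{-Bx}\, dx = \frac{\Gamma(5)}{B^{5}} = \frac{24}{B^{5}}.
\end{align*}
```

Since $B \to \infty$ as $\beta \to 0$, this contribution to $G(\beta)$ becomes small in the strong coupling region.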

References

1. Brydges, D.: A short course in cluster expansions. In: Osterwalder, K. and Stora, R. (eds), Critical phenomena, random systems, gauge theories. Proceedings, Les Houches Summer School 1984, Amsterdam: North-Holland, pp. 132–183
2. Del Debbio, L., Di Giacomo, A., Paffuti, G.: Detecting dual superconductivity in the ground state of gauge theory. Phys. Lett. B 349, 513–518 (1995)
3. Di Giacomo, A., Paffuti, G.: Disorder parameter for dual superconductivity in gauge theories. Phys. Rev. D 56, 6816–6823 (1997)
4. Fröhlich, J., Marchetti, P.A.: Soliton Quantization in Lattice Field Theories. Commun. Math. Phys. 112, 343–383 (1987)
5. Fröhlich, J., Marchetti, P.A.: Magnetic Monopoles and Charged States in Four-Dimensional, Abelian Lattice Gauge Theories. Europhys. Lett. 2, 933–940 (1986)
6. Fröhlich, J., Spencer, T.: The Kosterlitz–Thouless Transition in Two-Dimensional Abelian Spin Systems and the Coulomb Gas. Commun. Math. Phys. 81, 527–602 (1981)
7. Fröhlich, J., Spencer, T.: Massless Phases and Symmetry Restoration in Abelian Gauge Theories and Spin Systems. Commun. Math. Phys. 83, 411–454 (1982)
8. Gradshteyn, I.S., Ryzhik, I.M.: eq. 8.445 in: Table of integrals, series and products, London–New York: Academic Press
9. Guth, A.: Existence proof of a nonconfining phase in four dimensional U(1) lattice gauge theory. Phys. Rev. D 21, 2291–2307 (1980)
10. Kennedy, T., King, C.: Spontaneous Symmetry Breakdown in the Abelian Higgs Model. Commun. Math. Phys. 104, 327–347 (1986)
11. Marchetti, P.A.: An Euclidean approach to the construction and the analysis of the soliton sectors. PhD thesis, SISSA, 1986
12. Osterwalder, K., Schrader, R.: Axioms for Euclidean Green Functions. Commun. Math. Phys. 31, 83–112 (1975)
13. Osterwalder, K., Schrader, R.: Axioms for Euclidean Green Functions II. Commun. Math. Phys. 42, 281–305 (1975)
14. Seiler, E.: Gauge theories as a problem of constructive quantum field theory and statistical mechanics. Lecture Notes in Physics, Vol. 159, Berlin–Heidelberg–New York: Springer, 1982
15. 't Hooft, G.: On the phase transition towards permanent quark confinement. Nucl. Phys. B 138, 1–25 (1978)
16. Villain, J.: Theory of one and two-dimensional magnets with an easy magnetization plane. II. The planar, classical two-dimensional magnet. J. Phys. 36, 581–590 (1975)
17. Wilson, K.G.: Confinement of quarks. Phys. Rev. D 10, 2445–2459 (1974)

Communicated by D. Brydges

Commun. Math. Phys. 200, 399 – 420 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Spinor Representations of $U_q(\widehat{\mathfrak{gl}}(n))$ and Quantum Boson-Fermion Correspondence

Jintai Ding

RIMS, Kyoto University, Kyoto 606-01, Japan

Received: 22 November 1995 / Accepted: 21 July 1998

Abstract: This is an extension of the quantum spinor construction in [DF2]. We define quantum affine Clifford algebras based on the tensor product of a finite dimensional representation and an infinite highest weight representation of $U_q(\widehat{\mathfrak{gl}}(n))$ and the solutions of q-KZ equations, construct quantum spinor representations of $U_q(\widehat{\mathfrak{gl}}(n))$ and explain classical and quantum boson-fermion correspondence.

Introduction

In [DF2], we proposed an invariant approach to the study of quantum algebras, which was also stressed in [FRT1]. We used this approach to define quantum Clifford and Weyl algebras using the general representation theory of quantum groups, and to construct spinor and oscillator representations of quantum groups of classical types. The key idea consists in reformulating familiar classical constructions entirely in terms of the tensor category of highest weight representations of quantum groups and then using Lusztig's result on the q-deformation of this category to define the corresponding quantum structures. The quasitriangular structure of quantum groups introduced by Drinfeld [D2], especially the universal Casimir operator, plays an important role in this formulation. It turns out that such an invariant approach also applies to affine quantum groups. In this paper, we implement this idea for the spinor representations of the quantum affine groups $U_q(\widehat{\mathfrak{gl}}(n))$ and $U_q(\widehat{\mathfrak{sl}}(n))$. Subsequently, we will present the corresponding results for the cases of $\hat{\mathfrak{o}}(n)$ [Di] and $\widehat{\mathfrak{sp}}(2n)$, analogously to the cases of $\mathfrak{o}(n)$ and $\mathfrak{sp}(2n)$ in [DF2].

The affine Kac–Moody algebra $\hat{\mathfrak{g}}$ associated to a simple Lie algebra $\mathfrak{g}$ admits a natural realization as a central extension of the corresponding loop algebra $\mathfrak{g} \otimes \mathbb{C}[t, t^{-1}]$ [G]. In the Frenkel–Kac construction, this Lie algebra is given by vertex operators written in terms of the Heisenberg algebra in the so-called bosonic representations. There also exists another family of representations, the fermionic representations [F, FF], in terms of affine Clifford algebras. For these two families of representations, there exists the well-known boson-fermion correspondence [F1], which shows their equivalence. As in the undeformed classical case, with the canonical basis for the n-dimensional representation of $\mathfrak{gl}(n)$, we can write down formulas that are completely parallel to the ones in [DF2]. These formulas enable us to obtain the isomorphism between the algebra generated by intertwiners and the affine Clifford algebra and to derive the spinor construction. This gives a structural explanation of the boson-fermion correspondence.

For the affine quantum groups, the starting point of our construction is their loop realization. Faddeev, Reshetikhin and Takhtajan [FRT2] showed how to extend their realization of $U_q(\mathfrak{g})$ to the quantum loop algebra $U_q(\mathfrak{g} \otimes [t, t^{-1}])$ via a canonical solution of the Yang–Baxter equation depending on a parameter $z \in \mathbb{C}$. The first realization of the quantum affine algebra $U_q(\hat{\mathfrak{g}})$ and its special degeneration called the Yangian were obtained by Drinfeld [D2]. Reshetikhin and Semenov-Tian-Shansky [RS] incorporated the central extension in the previous construction of [FRT2] to obtain the second realization of the quantum affine algebra $U_q(\hat{\mathfrak{g}})$.

In this paper, we first construct the spinor representation of $U_q(\widehat{\mathfrak{gl}}(n))$ with intertwiners completely parallel to the case of the quantum groups of classical types as explained in [DF2].

Frenkel and Jing [FJ] constructed quantum bosonic representations of $U_q(\widehat{\mathfrak{sl}}(n))$ by vertex operators in terms of Drinfeld's realization, which could be extended to $U_q(\widehat{\mathfrak{gl}}(n))$. In [DF], we explicitly established a relation between two realizations of the quantum affine algebra $U_q(\widehat{\mathfrak{gl}}(n))$ through the Gauss decomposition of a matrix composed of elements of the quantum affine algebra. With this isomorphism, we are able to establish the quantum boson-fermion correspondence for the case of $U_q(\widehat{\mathfrak{gl}}(n))$ by the concrete isomorphism between those two representations via the intertwiners, whose bosonization is partially solved in [Ko].

Hayashi constructed fermionic representations of $U_q(\widehat{\mathfrak{sl}}(n))$, where however he only obtained the realization of the basic generators. From the point of view of this paper, his definition of the quantized fermions does not reveal the new structure that quantum fermions possess.

The quantum Clifford algebras we define are closely related to massive quantum field theory [Sm]. We will construct a realization of an algebra which resembles the algebra used to define form factors in massive quantum field theory, based on the quantum Clifford algebra.

This paper is arranged as follows. We present the basic idea of this paper in Sect. 1. In Sect. 2, we reconstruct classical spinor representations and boson-fermion correspondence. Sect. 3 gives the construction of spinor representations of $U_q(\widehat{\mathfrak{gl}}(n))$ and explains the quantum boson-fermion correspondence. We also discuss the connection with the theory of form factors in massive quantum field theory.

1. Universal Casimir Operators of Affine Quantum Groups

The definitions of quantum groups corresponding to the affine Lie algebras, which were given by Drinfeld [D1] and Jimbo [J1] in terms of generators and quantized relations corresponding to the affine Cartan matrices, are very simple.

Definition 1.1. Let $(a_{ij})$ be a Cartan matrix for an affine Lie algebra $\hat{\mathfrak{g}}$ [Ka]. Let $(d_0, \ldots, d_n)$ be a vector with integer entries $d_i \in \{1, 2, 3, 4\}$ such that $(d_i a_{ij})$ is symmetric. Let q be an indeterminate. $U_q(\hat{\mathfrak{g}})$ is an associative algebra on $\mathbb{C}(q)$ with generators

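$E_i$, $F_i$, $K_i$, $K_i^{-1}$ ($i = 0, 1, \dots, n$) and the relations are:

$$K_i K_j = K_j K_i, \qquad K_i K_i^{-1} = K_i^{-1} K_i = 1,$$
$$K_i E_j = q^{d_i a_{ij}} E_j K_i, \qquad K_i F_j = q^{-d_i a_{ij}} F_j K_i,$$
$$E_i F_j - F_j E_i = \delta_{ij}\, \frac{K_i - K_i^{-1}}{q^{d_i} - q^{-d_i}}, \tag{1.1}$$
$$\sum_{r+s=1-a_{ij}} (-1)^s \begin{bmatrix} 1-a_{ij} \\ s \end{bmatrix}_{d_i} E_i^r E_j E_i^s = 0, \quad \text{if } i \neq j,$$
$$\sum_{r+s=1-a_{ij}} (-1)^s \begin{bmatrix} 1-a_{ij} \\ s \end{bmatrix}_{d_i} F_i^r F_j F_i^s = 0, \quad \text{if } i \neq j,$$

where for integers $N, M, d \geq 0$, we define

$$[N]_d! = \prod_{a=1}^{N} \frac{q^{da} - q^{-da}}{q^{d} - q^{-d}}, \qquad \begin{bmatrix} M+N \\ N \end{bmatrix}_d = \frac{[M+N]_d!}{[M]_d!\,[N]_d!}. \tag{1.2}$$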
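As an illustration of (1.2), the following is a minimal sketch that evaluates the $q$-factorial and the $q$-binomial coefficient exactly at a rational value of $q$, and lists the signed coefficients that appear in the Serre relations of (1.1). The function names (`q_int`, `q_factorial`, `q_binomial`, `serre_coefficients`) are hypothetical helpers introduced here and are not taken from the paper; the sketch assumes $q \neq 0, \pm 1$ so that no denominator vanishes.

```python
from fractions import Fraction


def q_int(a, d, q):
    """Symmetric q-integer (q^{da} - q^{-da}) / (q^d - q^{-d}), one factor of (1.2)."""
    q = Fraction(q)
    return (q ** (d * a) - q ** (-d * a)) / (q ** d - q ** (-d))


def q_factorial(n, d, q):
    """[n]_d! as the product of the q-integers for a = 1..n (empty product is 1)."""
    result = Fraction(1)
    for a in range(1, n + 1):
        result *= q_int(a, d, q)
    return result


def q_binomial(total, k, d, q):
    """q-binomial [total choose k]_d = [total]_d! / ([k]_d! [total - k]_d!)."""
    return q_factorial(total, d, q) / (
        q_factorial(k, d, q) * q_factorial(total - k, d, q)
    )


def serre_coefficients(a_ij, d, q):
    """Signed coefficients (-1)^s [1 - a_ij choose s]_d for s = 0, ..., 1 - a_ij."""
    top = 1 - a_ij
    return [(-1) ** s * q_binomial(top, s, d, q) for s in range(top + 1)]
```

For example, with $a_{ij} = -1$ and $d_i = 1$ the three coefficients are $1$, $-(q + q^{-1})$, $1$, so the Serre relation reads $E_i^2 E_j - (q + q^{-1}) E_i E_j E_i + E_j E_i^2 = 0$.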

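The quantum group $U_q(\hat g)$ has a noncocommutative Hopf algebra structure with comultiplication $\Delta$, antipode $S$ and counit $\varepsilon$ defined by

$$\Delta(E_i) = E_i \otimes 1 + K_i \otimes E_i, \qquad \Delta(F_i) = F_i \otimes K_i^{-1} + 1 \otimes F_i, \qquad \Delta(K_i) = K_i \otimes K_i,$$
$$S(E_i) = -K_i^{-1} E_i, \qquad S(F_i) = -F_i K_i, \qquad S(K_i) = K_i^{-1}, \tag{1.3}$$
$$\varepsilon(E_i) = \varepsilon(F_i) = 0, \qquad \varepsilon(K_i) = 1.$$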

We define an automorphism $D_z$ of $U_q(\hat g)$ as $D_z(E_0) = zE_0$, $D_z(F_0) = z^{-1}F_0$, and $D_z$ fixes all other generators. We also define the maps $\Delta_z(a) = (D_z \otimes \mathrm{id})\Delta(a)$ and $\Delta'_z(a) = (D_z \otimes \mathrm{id})\Delta'(a)$, where $a \in U_q(\hat g)$ and $\Delta'$ denotes the opposite comultiplication. Let $U_q(\tilde g)$ be the algebra generated by $U_q(\hat g)$ and the operator $d$, such that $d$ commutes with all other elements and $[d, E_0] = E_0$, $[d, F_0] = -F_0$. It is clear that the action of $D_z$ is equivalent to conjugation by $z^d$. $U_q(\tilde g)$, from the theory of Drinfeld [D2], has a universal R-matrix $\bar R$.

Proposition 1.1 ([D2, FR]). There exists an element $\bar R$ such that $\bar R$ and

$$R(z) = q^{-d \otimes C - C \otimes d}(D_z \otimes \mathrm{id})\bar R \in U_q(\hat b_+) \hat\otimes U_q(\hat b_-) \otimes \mathbb{C}[[z]]$$

satisfy the properties:

$$\bar R \Delta(a) = \Delta'(a) \bar R,$$
$$R(z)\Delta_z(a) = (D^{-1}_{q^{C_2}} \otimes D^{-1}_{q^{C_1}})\Delta'_z(a) R(z), \tag{1.4}$$
$$R_{12}(z) R_{13}(zq^{C_2}/w) R_{23}(w) = R_{23}(w) R_{13}(zq^{-C_2}/w) R_{12}(z),$$


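where $a \in U_q(\hat g)$, $\Delta'$ denotes the opposite comultiplication, $C_1 = C \otimes 1$, $C_2 = 1 \otimes C$, $U_q(\hat b_+)$ is the subalgebra generated by $E_i$, $K_i$ and $U_q(\hat b_-)$ is the subalgebra generated by $F_i$, $K_i$. Let

$$R(z) = (D_z \otimes 1)R, \tag{1.5}$$

which implies $R(1) = R$.

Corollary 1.1. Let $\mathcal{C} = ((D^{-1}_{q^{C_2}} \otimes D^{-1}_{q^{C_1}})R_{21})R$. Then

$$\mathcal{C}\Delta(a) = ((D^{-2}_{q^{C_2}} \otimes D^{-2}_{q^{C_1}})\Delta(a))\,\mathcal{C}. \tag{1.6}$$

Proof. From the property of $R(z)$, we know that

$$R\Delta(a) = (D^{-1}_{q^{C_2}} \otimes D^{-1}_{q^{C_1}})\Delta'(a) R. \tag{1.7}$$

Thus we have

$$R_{21}\Delta'(a) = ((D^{-1}_{q^{C_2}} \otimes D^{-1}_{q^{C_1}})\Delta(a)) R_{21}, \tag{1.8}$$

and

$$(D^{-1}_{q^{C_2}} \otimes D^{-1}_{q^{C_1}})(R_{21}\Delta'(a)) = (D^{-1}_{q^{C_2}} \otimes D^{-1}_{q^{C_1}})(((D^{-1}_{q^{C_2}} \otimes D^{-1}_{q^{C_1}})\Delta(a)) R_{21}). \tag{1.9}$$

Thus we get that

$$((D^{-1}_{q^{C_2}} \otimes D^{-1}_{q^{C_1}})R_{21})\,((D^{-1}_{q^{C_2}} \otimes D^{-1}_{q^{C_1}})\Delta'(a)) = ((D^{-2}_{q^{C_2}} \otimes D^{-2}_{q^{C_1}})\Delta(a))\,((D^{-1}_{q^{C_2}} \otimes D^{-1}_{q^{C_1}})R_{21}). \tag{1.10}$$

Multiplying (1.10) on the right by $R$ and using (1.7) to rewrite the left-hand side, we obtain (1.6).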

We note that $(q^{C_2} \otimes q^{C_1})$ is invariant under the permutation. This proposition shows that the action of $\mathcal{C} = ((D^{-1}_{q^{C_2}} \otimes D^{-1}_{q^{C_1}})R_{21})R$ on a tensor product of two modules is an intertwiner which, however, shifts the actions of $E_0$ and $F_0$ by the constants $q^{\mp 2C_2} \otimes q^{\mp 2C_1}$ respectively. We should also notice that $D_z \otimes D_z$ fixes $R$. Let $R_{21}(z) = (D_z \otimes 1)R_{21}$. Note that $R_{21}(z)$ is not equal to $P(R(z))$, where $P$ is the permutation operator. Let $V$ be a finite dimensional representation of $U_q(\hat g)$. Let

$$\bar L^+(z) = (\mathrm{id} \otimes \pi_V)(R_{21}(z)), \qquad \bar L^-(z) = (\mathrm{id} \otimes \pi_V)R(z^{-1})^{-1},$$

L+ (z) = (πV ⊗ id)(R−1 (z)),

(1.11a)

−1

−1 ). L− (z) = (πV ⊗ id)R−1 21 (z

We have that

−1 −1 L¯ + (z)P L+ (z −1 )P = 1, L¯ − (z)P L− (z −1 )P = 1,

(1.11b)

where P is the permutation operator. Uq (ˆg) as an algebra is generated by operator entries −1 −1 of L¯ + (z) and L¯ − (z), and it is also generated by operator entries of L+ (z) and L− (z). L¯ ± (z) are used in [FR] to obtain q-KZ equation.

Let

$$\bar L(z) = (\mathrm{id} \otimes \pi_V)((1 \otimes D^{-1}_{q^{C}})R_{21}(z))R(z^{-1}), \qquad (D_z \otimes 1)\bar L(z) = \bar L = (\mathrm{id} \otimes \pi_V)((1 \otimes D^{-1}_{q^{C}})R_{21})R, \tag{1.12a}$$

and

$$L(z) = (\pi_V \otimes \mathrm{id})((D_{q^{C}} \otimes 1)R^{-1}(z))R_{21}^{-1}(z^{-1}), \qquad (1 \otimes D_z)L(z) = L = (\pi_V \otimes \mathrm{id})((1 \otimes D_{q^{C}})R^{-1})R_{21}^{-1}. \tag{1.12b}$$

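Proposition 1.2. The following identities hold:

$$\bar R(z/w)\,\bar L_1^{\pm}(z)\,\bar L_2^{\pm}(w) = \bar L_2^{\pm}(w)\,\bar L_1^{\pm}(z)\,\bar R(z/w),$$
$$\bar R(zq^{-C}/w)\,\bar L_1^{+}(z)\,\bar L_2^{-}(w) = \bar L_2^{-}(w)\,\bar L_1^{+}(z)\,\bar R(zq^{C}/w),$$
$$\bar R(z/w)\,\bar L_1(z)\,\bar R(zq^{2C}/w)^{-1}\,\bar L_2(w) = \bar L_2(w)\,\bar R(zq^{-2C}/w)\,\bar L_1(z)\,\bar R(z/w)^{-1}, \tag{1.13a}$$
$$\bar L(z)(\mathrm{id} \otimes \pi_V)\Delta(a) = (\mathrm{id} \otimes \pi_V)((1 \otimes D^{-2}_{q^{C}})\Delta(a))\,\bar L(z),$$

and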

¯ z )L± (z)−1 L± (w)−1 = L± (w)−1 L± (z)−1 R( ¯ z ), R( 2 2 1 w 1 w (1.13b) −C C zq zq − −1 −1 + −1 ¯ ¯ )L+1 (z)−1 L− ), (w) = L (w) L (z) R( R( 1 2 2 w w −2C −1 2C ¯ ¯ ¯ ¯ /w)L2 (w)R(z/w) = R(z/w)L (w) R(zq /w)−1 L1 (z), L1 (z)R(zq 2 L(z)(πV ⊗ id)1(a) = (πV ⊗ id)((Dq2C ⊗ id)1(a))L(z).

Here $\bar R(z/w)$ is the image of $R(z/w)$ on $V \otimes V$. We name $\bar L(z)$ and $L(z)$ the universal Casimir operator and the inverse universal Casimir operator of the quantum algebra respectively. The construction of a representation of $U_q(\hat g)$ is equivalent to finding a specific realization of the operator $\bar L(z)$ or $L(z)$, which plays the same role as $L$ in the case of $U_q(g)$ in [DF2]. This will be proved as in [DF] for the case of $U_q(\widehat{gl}(n))$ in Sect. 3. Naturally we would like to find a way to build this $\bar L(z)$ or $L(z)$ out of the intertwiners, just as in the case of spinor and oscillator representations of the quantum groups of types A, B, C and D in [DF2].

Let $V_{\lambda,k}$ and $V_{\lambda_1,k}$ be two highest weight representations of $U_q(\hat g)$ with highest weights $\lambda$ and $\lambda_1$ and the center $C$ acting as multiplication by a number $k$. Let $\Psi$ be an intertwiner:

$$\Psi : V_{\lambda_1,k} \longrightarrow V_{\lambda,k} \hat\otimes V, \qquad \Psi(x) = \Psi_1(x) \otimes e_1 + \dots + \Psi_n(x) \otimes e_n, \tag{1.14}$$

where $x \in V_{\lambda_1,k}$ and $e_i$ is a basis for $V$. Here $V_{\lambda,k} \hat\otimes V$ denotes the completion of $V_{\lambda,k} \otimes V$. Let $\Psi^*$ be an intertwiner:

$$\Psi^* : V_{\lambda,k} \longrightarrow V_{\lambda_1,k} \hat\otimes V^*, \qquad \Psi^*(x) = \Psi^*_1(x) \otimes e^*_1 + \dots + \Psi^*_n(x) \otimes e^*_n, \tag{1.15}$$

where $V^*$ is the right dual representation of $V$ of $U_q(\hat g)$, $x \in V_{\lambda,k}$ and $e^*_i$ is a basis for $V^*$. Let us identify $V \otimes V^*$ with $\mathrm{End}(V)$. By the right dual representation of $V$ of $U_q(\hat g)$, we mean the action of $U_q(\hat g)$ on the dual space given by $\langle a v', v \rangle = \langle v', S(a) v \rangle$, for $a \in U_q(\hat g)$, $v \in V$ and $v' \in V^*$. $(\Psi \otimes 1)\Psi^* = \sum \Psi_i \Psi^*_j \otimes e_i \otimes e^*_j$ gives a map, which we assume is well defined,

$$(\Psi \otimes 1)\Psi^* : V_{\lambda,k} \longrightarrow V_{\lambda,k} \otimes V \otimes V^*.$$

Let $\tilde L \in \mathrm{End}(V_{\lambda,k}) \otimes \mathrm{End}(V)$:

$$\tilde L = (\tilde L_{ij}) = ((D_{q^{-2k}} \Psi_i)\Psi^*_j), \tag{1.16}$$

where $(D_{q^{-2k}} \Psi_i)$ means shifting the evaluation representation by the constant $q^{-2k}$. Here we need the assumption that $\tilde L$ is well defined. On $V_{\lambda,k}$ and $V_{\lambda_1,k}$, there is a grading such that the action of $E_0$ raises the degree of an element by 1, the action of $F_0$ lowers the degree of an element by 1, and the highest weight vector has degree 0. Then $\Psi_i$ can be written as $\sum_{n \in \mathbb{Z}} \Psi_i(n)$, where $\Psi_i(n)$ shifts the degree of an element by $n$. Let $\Psi_i^{a,b} = \sum_{n=a}^{b} \Psi_i(n)$, $a < b$. When any matrix coefficient of the homogeneous component of the image of an element in $V_{\lambda,k}$ under the composite action of $\Psi^*_j$ and $\Psi_i^{a,b}$ converges when $a$ goes to negative infinity and $b$ goes to positive infinity independently, we say that $\tilde L$ is well defined.

Proposition 1.3.

$$\tilde L\,(\pi_{V_{\lambda,k}} \otimes \pi_V)\Delta(a) = ((1 \otimes D_{q^{-2k}})(\pi_{V_{\lambda,k}} \otimes \pi_V)\Delta(a))\,\tilde L, \tag{1.17}$$

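where $a$ is an element in $U_q(\hat g)$.

Proof. This is because $(\mathrm{id} \otimes F)((\Psi \otimes 1)\Psi^*) \otimes V = \tilde L$ gives an intertwining map

$$V_{\lambda,k} \otimes V \longrightarrow V_{\lambda,k} \otimes V_{q^{-2k}},$$

where $F$ is the map $V^* \otimes V$ to $\mathbb{C}$.

We can also derive a similar construction using another type of intertwiners. Let $\Phi$ be an intertwiner:

$$\Phi : V_{\lambda_1,k} \longrightarrow V \otimes V_{\lambda,k}, \qquad \Phi(x) = e_1 \otimes \Phi_1(x) + \dots + e_n \otimes \Phi_n(x), \tag{1.18}$$

where $x \in V_{\lambda_1,k}$ and $\{e_i\}$ is the basis for $V$. Let $\Phi^*$ be an intertwiner:

$$\Phi^* : V_{\lambda,k} \longrightarrow {}^*V \otimes V_{\lambda_1,k}, \qquad \Phi^*(x) = e^*_1 \otimes \Phi^*_1(x) + \dots + e^*_n \otimes \Phi^*_n(x), \tag{1.19}$$

where ${}^*V$ is the left dual representation of $V$ of $U_q(\hat g)$, $x \in V_{\lambda,k}$ and $e^*_i$ is the basis for ${}^*V$. By the left dual representation of $V$ of $U_q(\hat g)$, we mean the action of $U_q(\hat g)$ on the dual space given by $\langle a v', v \rangle = \langle v', S^{-1}(a) v \rangle$, for $a \in U_q(\hat g)$, $v \in V$ and $v' \in {}^*V$. $(1 \otimes \Phi)\Phi^* = \sum \Phi_i \Phi^*_j \otimes e^*_j \otimes e_i$ gives a map

$$\bar\Phi : V_{\lambda,k} \longrightarrow {}^*V \otimes V \otimes V_{\lambda,k}.$$

Let us identify ${}^*V \otimes V$ with $\mathrm{End}(V)$ by the map which maps the first two components of $V \otimes {}^*V \otimes V$ to $\mathbb{C}$ and fixes the last component. Let $\tilde L \in \mathrm{End}(V) \otimes \mathrm{End}(V_{\lambda,k})$:

$$\tilde L = (\tilde L_{ij}) = ((D_{q^{2k}} \Phi_j)\Phi^*_i), \tag{1.20}$$

where $(D_{q^{2k}} \Phi_i)$ means shifting the evaluation representation by the constant $q^{2k}$ and we assume $\tilde L$ is well defined.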

Proposition 1.4.

$$\tilde L \Delta(a) = ((D_{q^{2k}} \otimes 1)\Delta(a))\,\tilde L. \tag{1.21}$$

Proof. Similar to that of Proposition 1.3.

The key idea is to identify $\tilde L$ of (1.16) with $\bar L$, or $\tilde L$ of (1.20) with $L$, to obtain representations out of intertwiners. The intertwiners for the affine quantum groups are extensively studied by the Kyoto school. They connected the XXZ model in statistical mechanics with the structures of the representations of quantum affine algebras via the intertwiners. Meanwhile Jimbo, Miki, Miwa and Nakayashiki [JMMN] and Koyama [Ko] worked on the bosonization of the intertwiners of quantum affine algebras. These results, in some sense, are trying to obtain the fermions out of bosons. The results in [DF] enable us to obtain the explicit quantum boson-fermion correspondence via Gauss decomposition of $L^{\pm}(z)$. Our construction also brings a different but conceptual understanding to the classical boson-fermion correspondence. On the other hand, Miki [M], Foda, Iohara, Jimbo, Kedem and Yan [FIJKMY] presented another idea to construct realizations of $\bar L^{\pm}$ by composition of $\Psi$ and $\Phi^*$. We will use the idea in this paper to define the corresponding quantum Clifford algebras, quantum Weyl algebras and construct spinor and oscillator representations of $U_q(\hat o(2n))$ and $U_q(\widehat{sp}(2n))$. For the spinor representation of $U_q(\hat o(2n))$, only the bosonic realization [B1] is available. We expect to build the boson-fermion correspondence in a similar and simple way.

2. Spinor Representation of $\widehat{gl}(n)$ and Boson-Fermion Correspondence

For $\widehat{gl}(n)$, we assume $n > 2$. This restriction is to avoid some nonessential but tedious complication caused by the self-dual property of the standard representation of $sl(2)$ on $\mathbb{C}^2$, which can be resolved.

Definition 2.1. The affine Clifford algebra is an associative algebra generated by $a_i(m)$ and $a^*_i(m)$ ($i = 1, \dots, n$; $m \in \mathbb{Z}$) with the commutation relations:

$$a_i(m) a_j(l) + a_j(l) a_i(m) = 0,$$
$$a^*_i(m) a^*_j(l) + a^*_j(l) a^*_i(m) = 0, \tag{2.1}$$
$$a_i(m) a^*_j(l) + a^*_j(l) a_i(m) = \delta_{ij}\,\delta_{m,-l}.$$

Definition 2.2. The affine Heisenberg algebra is an associative algebra generated by $h_i(m)$, $m \neq 0$, $i = 1, \dots, n$ and $m \in \mathbb{Z}$. The relations are

$$h_i(m) h_j(l) - h_j(l) h_i(m) = 0, \quad l \neq -m,$$
$$h_i(m) h_j(-m) - h_j(-m) h_i(m) = m\,\delta_{ij}, \quad m \neq 0. \tag{2.2}$$
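The relations (2.1) can be checked on a concrete model. The following is a minimal sketch, assuming a standard Jordan–Wigner-type sign convention that is not spelled out in the text: each pair $(a_i(m), a^*_i(-m))$ is treated as one fermionic "slot" labelled $(i, m)$, and the vacuum is annihilated by $a_i(m)$ for $m \geq 0$ and by $a^*_i(m)$ for $m > 0$, in line with the generators of the spinor Fock space used in this section. All function names are hypothetical.

```python
from itertools import product


def apply_slot(state, slot, create):
    """Create or annihilate one fermionic slot on a state {occupied slots: coefficient}."""
    out = {}
    for occ, coeff in state.items():
        if (slot in occ) == create:
            continue  # creating an occupied slot or annihilating an empty one gives 0
        # Jordan-Wigner sign: parity of the occupied slots ordered before this one
        sign = (-1) ** sum(1 for s in occ if s < slot)
        new_occ = tuple(sorted(set(occ) ^ {slot}))
        out[new_occ] = out.get(new_occ, 0) + sign * coeff
    return out


def a(i, m):
    """a_i(m): annihilates slot (i, m) if m >= 0, creates it if m < 0."""
    return lambda state: apply_slot(state, (i, m), create=(m < 0))


def a_star(i, m):
    """a*_i(m) is paired with a_i(-m): creates slot (i, -m) if m <= 0, else annihilates it."""
    return lambda state: apply_slot(state, (i, -m), create=(m <= 0))


def anticommutator(x, y, state):
    """Return (xy + yx) applied to state, with zero coefficients dropped."""
    total = {}
    for first, second in ((x, y), (y, x)):
        for occ, coeff in first(second(state)).items():
            total[occ] = total.get(occ, 0) + coeff
    return {occ: c for occ, c in total.items() if c != 0}


def check_relations(n, max_mode, state):
    """Check the three relations of (2.1) on one state for all |m|, |l| <= max_mode."""
    modes = range(-max_mode, max_mode + 1)
    for i, j, m, l in product(range(n), range(n), modes, modes):
        if anticommutator(a(i, m), a(j, l), state):
            return False
        if anticommutator(a_star(i, m), a_star(j, l), state):
            return False
        expected = state if (i == j and m == -l) else {}
        if anticommutator(a(i, m), a_star(j, l), state) != expected:
            return False
    return True
```

Here a state is a dictionary from sorted tuples of occupied slots to coefficients, so the vacuum is `{(): 1}`. The check only probes finitely many modes on the given state, so it is an illustration of (2.1) rather than a proof.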

We now introduce the notation of formal power series. (For details see [FLM].) For a vector space $W$, we denote

$$W[[z, z^{-1}]] = \Big\{ \sum_{m \in \mathbb{Z}} v_m z^m \;\Big|\; v_m \in W \Big\}, \tag{2.3}$$

where $z$ is a formal variable. For example, let $W$ be also an algebra over $\mathbb{C}$ with unit, and $\delta(z) = \sum_{m \in \mathbb{Z}} z^m$. Suppose we are given two formal power series $g_1(z) = \sum g_1(m) z^{-m}$ and $g_2(z) = \sum g_2(m) z^{-m}$ over the algebra $W$. If the limit of all the matrix coefficients of

each homogeneous component of 6m∈Z g1 (m)z −m 6l1
(2.4)

Let $E_{ij}$ be the standard basis of the Lie algebra $gl(n)$. As shown in [G], $\widehat{gl}(n)$ can be realized as $gl(n) \otimes \mathbb{C}[x, x^{-1}] \oplus \mathbb{C}C$, where $C$ is the central element. Let $E_{ij}(z) = \sum E_{ij} \otimes x^n z^{-n}$ be the standard basis of the generating functions of $\widehat{gl}(n)$, where $z$ is a formal variable. We define the formal power series $a_i(z)$, $a^*_i(z)$ with coefficients in the corresponding algebra as

$$a_i(z) = \sum a_i(m) z^{-m}, \qquad a^*_i(z) = \sum a^*_i(m) z^{-m}. \tag{2.5}$$

Let $P$ be an $n$-dimensional lattice $\oplus_i \mathbb{Z} h(0)_i$, with the form $(h(0)_i, h(0)_j) = \delta_{ij}$. Let $\mathbb{C}[\bar P]$ be the central extension of the group algebra of $P$, such that $e^{h(0)_i} e^{h(0)_j} = (-1)^{(h(0)_i, h(0)_j)} e^{h(0)_j} e^{h(0)_i}$. Let $\bar H$ be an associative algebra generated by $h_i(m)$, $m \neq 0$, and $\mathbb{C}[\bar P]$, where $h_i(m)$ and $\mathbb{C}[\bar P]$ commute with each other. Let $h_i(z) = \sum h_i(m) z^{-m} + \partial_{h(0)_i}$, where $\partial_{h(0)_i}$ is the partial differential of $h(0)_i$.

Definition 2.3. Spinor Fock space is the space of the subalgebra of the affine Clifford algebra generated by $a_i(-m)$, $a^*_i(-m)$, $m > 0$, and $a^*_i(0)$.

Definition 2.4. Oscillator Fock space is the space of the subalgebra of $\bar H$ generated by $h_i(-n)$, $n > 0$, and $\mathbb{C}[\bar P]$.

In this paper, we define normal ordering $:\ :$ as in [F1].

Proposition 2.1 ([F, FF, F1]). Let $a_i(z)$, $a^*_i(z)$ be as defined above and $E_{ij}(z)$ be the standard basis of $\widehat{gl}(n)$. There is a representation of $\widehat{gl}(n)$ given by the following on the spinor Fock space:

$$E_{ij}(z) \longrightarrow\ : a_i(z) a^*_j(z) :, \tag{2.6}$$

where $:\ :$ denotes the normal ordering.

Proposition 2.2 ([FK, F1]). Let $h_i(z)$ be as defined above. There is the Frenkel–Kac construction of a representation of $\widehat{gl}(n)$ on the oscillator Fock space by vertex operators given by

$$E_{ii}(z) \longrightarrow h_i(z), \qquad E_{ij}(z) \longrightarrow\ : \exp(\bar h_i(z) - \bar h_j(z)) :, \quad i \neq j, \tag{2.7}$$

where $\bar h_i(z) = \sum_{n \neq 0} \frac{1}{n} h_i(-n) z^n + \partial_{h(0)_i} \ln z + h(0)_i$. This representation is isomorphic to the representation of Proposition 2.1 above. The isomorphism is given as:

$$a_i(z) =\ : \exp(\bar h_i(z)) :, \qquad a^*_i(z) =\ : \exp(-\bar h_i(z)) :. \tag{2.8}$$

Here $:\ :$ is defined in the standard way as in [F1].


The isomorphism can be easily proved. First, we shift the degree of $a_i(z)$ and $a^*_i(z)$ by $\pm 1/2$ respectively. Then by comparing the character, we get the proof [F1]. The propositions above give us the classical boson-fermion correspondence. Let us denote the Fock space representation by $V_{bf}$.

Proposition 2.3. $V_{bf} = \oplus_{l \in \mathbb{Z}} V_l$, such that $V_l$ is an irreducible highest weight representation with $C$ acting as multiplication by 1. $V_{nm}$ is the irreducible highest weight representation of $\widehat{gl}(n)$ with zero highest weight, and $V_l$, for $l \neq nm$, is the irreducible highest weight representation of $\widehat{gl}(n)$ with the $i = (l \bmod n)$th fundamental weight of the classical part $sl(n)$ of $\widehat{sl}(n)$. The action of $\sum_{i=1,\dots,n} E_{ii}$ on $V_l$ is given by multiplication by $l$.

For the above case, $\sum h_i(z) = \sum_{i=1,\dots,n} : a_i(z) a^*_i(z) :$ gives us $\widehat{gl}(1)$, which commutes with $\widehat{sl}(n)$. Let $e_i$ be the standard basis of $V = \mathbb{C}^n$ and $e^*_i$ be the dual basis of the dual module $V^*$. Let $V(z)$ and $V^*(z)$ be the evaluation representations of $\widehat{gl}(n)$, where $z$ is a formal variable and the action of $E_{ij} \otimes t^n$ of $\widehat{gl}(n)$ is given by the action of $E_{ij}$ of $gl(n)$ with a scalar multiple $z^n$. Note that for $\widehat{sl}(n)$, the left dual of a representation is equivalent to the right dual.

Proposition 2.4. Let

$$\Phi_c = \sum a^*_i(z) \otimes e_i, \qquad \Phi^*_c = \sum a_i(z) \otimes e^*_i. \tag{2.9}$$

Then $\Phi_c$ and $\Phi^*_c$ are intertwiners:

$$\Phi_c : V_{bf} \longrightarrow V_{bf} \otimes V_z, \qquad \Phi^*_c : V_{bf} \longrightarrow V_{bf} \otimes V^*_z. \tag{2.10}$$

This can be checked by direct calculation, which is implicitly given in [FF]. This observation is the starting point of our approach to the spinor and oscillator representations. As in the finite dimensional cases [DF2], we explain the hidden structure behind all the constructions above. The existence of such intertwiners follows from the standard theory about the intertwiners [TK].

Now we will derive the above construction without any specific realizations. In the next section, we will present a parallel $q$-deformation of this abstract construction. We now start from an abstract module $V_{bf} = \sum_{l \in \mathbb{Z}} V^l$, where $V^l$ is the highest weight representation with the $(l \bmod n)$th fundamental weight, with central charge 1 and the action of $\sum E_{ii}$ as the integer $l$. By the 0th fundamental weight, we mean zero weight. $V^l$ has infinitely many copies of the highest weight representation of $\widehat{sl}(n)$ corresponding to the $(l \bmod n)$th fundamental weight of $sl(n)$ and with central extension 1. From the representation theory of Kac–Moody algebras, we know there exist the irreducible highest weight modules, which are determined by the weight of the highest weight vector. We extend the representation to that of $\widehat{gl}(n)$ over the space $V_z = V[z, z^{-1}]$ by defining the action of $\sum E_{ii}$ as a unit. We here will use the notation in [FLM], which is introduced in the theory of vertex operator algebras. By $\bar V_z$, we mean the set of all vectors in the form of $\sum v_i f_i(z)$, where $v_i \in V$ form a basis of $V$ and $f_i(z)$ are formal power series in $z$ and $z^{-1}$. Let $V^*$ be the left dual module of $V$. Let $F$ be an invariant vector in $V \otimes V^*$ of $gl(n)$, which is unique up to a scalar multiple. We normalize it such that $F$ is equal to the identity if it is identified with an element in $\mathrm{End}(V)$. Let

$$F(z_1, z_2) = \{ x \mid x = F f(z_1)\,\delta(z_1/z_2) \},$$

where $f(z_1)$ is a polynomial in $z_1$ and $z_1^{-1}$.

Proposition 2.5. $F(z_1, z_2)$ is an invariant subspace of $\bar V_{z_1} \hat\otimes \bar V^*_{z_2}$ under the action of $\widehat{gl}(n)$.

Proposition 2.6. Let $\bar\Psi^l_z$ and $\bar\Psi^{*l}_z$ be intertwiners:

$$\bar\Psi^l_z : V^{l+1} \longrightarrow V^l \otimes V_z, \qquad \bar\Psi^{*l}_z : V^l \longrightarrow V^{l+1} \otimes V^*_z, \tag{2.11}$$

such that $\bar\Psi^l_z = \sum \bar\Psi^l_i(-m) \otimes v_i z^m$ and $\bar\Psi^{*l}_z = \sum \bar\Psi^{*l}_i(-m) \otimes v_i z^m$, where $\bar\Psi^l_i(m)$ and $\bar\Psi^{*l}_i(m)$ are operators shifting the degree by $m$. Then these operators are unique up to a scalar multiple.

However, from [FF], we know that not all the highest weight vectors in the spinor construction above are graded 0, but that most of them have negative grading. We will thus shift the grading of $V^l$, for $l < 0$ or $l > n$, by defining the degree of any vector of $V^l$ of degree $w$ to be degree $w + n\big((\sum_{i=0}^{|m|-1} i) + j(m)\big)$, where $l = mn + j$, $n - 1 > j > 0$. After the shifting, we will denote the new operators by $\Psi^l_z$ and $\Psi^{*l}_z$. We know that the correlation functions of those operators satisfy the KZ equation [TK, FR], and the solutions for the KZ equation in this case can be computed. On the other hand, because the space of the solutions in this case is one dimensional, we can normalize them in such a way that all $\langle v_{V^{l-1}}, \Psi^l_{z_2} \Psi^{l+1}_{z_1} v_{V^l} \rangle$ are equal, all $\langle v_{V^{l+2}}, \Psi^{*l+1}_{z_2} \Psi^{*l}_{z_1} v_{V^l} \rangle$ are equal and all $\langle v_{V^l}, \Psi^l_{z_2} \Psi^{*l}_{z_1} v_{V^l} \rangle$ are equal to $\frac{1}{1 - z_1/z_2} F$, where $v_{V^l}$ is the highest weight vector of $V^l$. More precisely, we can choose the normalization in the way that their correlation functions are equal to the corresponding correlation functions of $\Phi_c$ and $\Phi^*_c$.

Let $\Psi_z = \sum_i \oplus \Psi^i_z$ and $\Psi^*_z = \sum_i \oplus \Psi^{*i}_z$; then $\Psi_z$ and $\Psi^*_z$ are intertwining operators from $V_{bf}$ to $V_{bf} \otimes V_z$ and $V_{bf} \otimes V^*_z$ respectively. By $\Psi_z = \sum_i \oplus \Psi^i_z$ we mean the operator acting on each component $V_l$ of $V_{bf}$ by $\Psi^l_z$. For any vector $v = \sum b_i e_i$ in $V$, we can define $\Psi(v)(n)$ as $(1 \otimes A)\Psi_z \otimes \sum b_i e^*_i$, where $A$ is the map $V \otimes V^*$ to $\mathbb{C}$ and $e_i$ and $e^*_i$ are the standard dual bases of $V$ and $V^*$ respectively. Similarly we can define $\Psi^*(v^*)(n)$ for $v^*$ in $V^*$.

Proposition 2.7.

$$(\Psi_{z_2} \otimes I)\Psi_{z_1} + P'(\Psi_{z_1} \otimes I)\Psi_{z_2} = 0,$$
$$(\Psi^*_{z_2} \otimes I)\Psi^*_{z_1} + P'(\Psi^*_{z_1} \otimes I)\Psi^*_{z_2} = 0, \tag{2.12}$$
$$(\Psi_{z_2} \otimes I)\Psi^*_{z_1} + P'(\Psi^*_{z_1} \otimes I)\Psi_{z_2} - \delta(z_1/z_2) F = 0,$$

where $P'$ is the operator which maps $a_i z_1^m \otimes b_j z_2^l$ to $b_j z_2^l \otimes a_i z_1^m$.

Proof. We first prove the relations on the level of the correlation functions of the highest weight vectors, because we can calculate explicitly the correlation functions of the intertwiners for the highest weight vectors from the KZ equation [TK]. Because the left-hand sides of the formulas above are intertwiners, the equality above is valid for correlation functions of any two vectors. Thus we finish the proof.


Theorem 2.1. The algebra generated by $\Psi(v)(m)$ and $\Psi^*(v^*)(m)$ is isomorphic to the Clifford algebra of Definition 2.1.

The proof is the same as that for Proposition 2.2.

Let $g$ be a simple Lie algebra over $\mathbb{C}$ of type $A_n$, $B_n$, $C_n$ or $D_n$. Let $e_i$, $f_i$ and $h_i$, $i = 1, \dots, n$, be the basic generators of $g$ corresponding to the Cartan matrix. Let $(\ ,\ )$ be the Killing form on $g$. Let $r = \sum h_i \otimes h_i + \sum_{\Delta_+} e_\alpha \otimes e_{-\alpha} + \sum_{\Delta_-} e_\alpha \otimes e_{-\alpha}$, where $\Delta_\pm$ are the sets of all the positive roots and the negative roots respectively, $e_\alpha$ is a root vector in $g_\alpha$ and $(h_i, h_j) = \delta_{ij}$, $(e_\alpha, e_{-\alpha}) = 1$. $r$ is the Casimir operator of $g$. Let $\hat g$ be the affine Lie algebra associated with $g$. $\hat g$ has a concrete realization as $\hat g = g[x, x^{-1}] + \mathbb{C}C$, where $C$ is the central element.

Definition 2.5. Let $z$ be a formal variable. We define the elements $\hat r$ and $r(z)$ in $\hat g \hat\otimes \hat g$ and $\hat g \hat\otimes \hat g[z, z^{-1}]$ as

$$\hat r = \sum_{i,m} h_i x^m \otimes h_i x^{-m} + \sum_{\alpha \in \Delta, m} e_\alpha x^m \otimes e_{-\alpha} x^{-m},$$
$$r(z) = \sum_{i,m} h_i x^m \otimes h_i x^{-m} z^{-m} + \sum_{\alpha \in \Delta, m} e_\alpha x^m \otimes e_{-\alpha} x^{-m} z^{-m}. \tag{2.13}$$

We see that $\hat r$ is basically like a Casimir operator. Let $e_i$, $f_i$, $h_i$ ($i = 0, 1, \dots, n$) be the basic generators of $\hat g$ for the corresponding Cartan matrix of $\hat g$. Let $M$ be any finite dimensional module of $\hat g$. Let $V_{\mu,k}$ be a highest weight module of $\hat g$ with highest weight $\mu$ and central extension $k$, where $k$ is a complex number. $V_{\mu,k}$ is a graded module such that $e_0$ and $f_0$ change the degree by $+1$ and $-1$ respectively, as we explained before.

Theorem 2.2. $\pi_{V_{\mu,k}} \otimes \pi_M(\hat r)$ maps $V_{\mu,k} \otimes M$ to $V_{\mu,k} \hat\otimes M$, commutes with $e_i$, $f_i$, $h_i$ for $i \neq 0$ and with $h_0$, and

$$[\pi_{V_{\mu,k}} \otimes \pi_M(e_0),\ \pi_{V_{\mu,k}} \otimes \pi_M(\hat r)] = -2k\,(\mathrm{id}) \otimes (\pi_M(e_0)),$$
$$[\pi_{V_{\mu,k}} \otimes \pi_M(f_0),\ \pi_{V_{\mu,k}} \otimes \pi_M(\hat r)] = 2k\,(\mathrm{id}) \otimes (\pi_M(f_0)). \tag{2.14}$$

By $V_{\mu,k} \hat\otimes M$, we mean the set of vectors in the form of $\sum_{n \leq 0} \mu(n) \otimes m_i$, where $\mu(n)$ is a vector in $V_{\mu,k}$ of degree $n$. This follows from a direct calculation, which is also a direct corollary of the corresponding assertion in the quantum case.

Let $V = \mathbb{C}^n$ be the fundamental representation of $g$ as in the case for $\widehat{gl}(n)$ explained above. This representation can be extended to a representation of $\hat g$. It is clear that the concrete realization of $\pi_{V_{\mu,k}} \otimes \pi_V(r(z))$ can give us explicitly the construction of the representation. That means, for a specific representation $V_{\mu,k}$, constructing a representation is equivalent to giving an explicit expression of $\pi_{V_{\mu,k}} \otimes \pi_V(r(z))$. This is the central idea to understand the classical spinor constructions. From now on in this section, we assume that $g = gl(n)$. Let $V = \mathbb{C}^n$ be a module of $\widehat{gl}(n)$ as we defined in the section above. Let $t$ be a real number such that $|t|$ is less than 1.

Theorem 2.3. Let $F$ be the standard map $V^* \otimes V$ to $\mathbb{C}$. $(\Psi_z \otimes I)\Psi^*_{zt} - \frac{1}{1-t}\,\mathrm{id} \otimes F$ and $\lim_{t \to 1} (\Psi_z \otimes I)\Psi^*_{zt} - \frac{1}{1-t}\,\mathrm{id} \otimes F$ exist. As a map from $V_{bf} \otimes V$ to $V_{bf} \otimes V[z, z^{-1}]$,

$$-\bar r(z) = (\mathrm{id} \otimes I \otimes F)\Big(\lim_{t \to 1} (\Psi_z \otimes I \otimes I)\Psi^*_{zt} \otimes I - \frac{1}{1-t}\,\mathrm{id} \otimes F \otimes I\Big) = \pi_{V_{bf}} \otimes \pi_V(r(z)). \tag{2.15}$$

By the limit above, we mean that we would take the limit for each homogeneous component separately.


Proof. It is clear that the first assertion implies the second one. The limit we take above is equivalent to the normal ordering defined in [FF]. A direct calculation shows that $\bar r = -(D_z^{-1} \otimes 1)(\bar r(z))$ satisfies the property (2.14) of $\hat r$ on the tensor module. We can show by calculation that the image of the difference of the degree zero terms on the highest weight vectors is zero. However, we know that the difference between $\bar r$ and $\hat r$ is an intertwiner. Thus the difference is zero. Therefore they are equal.

3. Quantum Spinor Representation of $U_q(\widehat{gl}(n))$ and Boson-Fermion Correspondence

We assume $n > 2$ for the same reason explained in the previous section. We will proceed to construct the spinor representation and explain the quantum boson-fermion correspondence for $U_q(\widehat{gl}(n))$. The degeneration of this construction provides us the classical boson-fermion correspondence in the section above.

We will first give a different realization of $U_q(\widehat{gl}(n))$. Let $R(z)$ be an element of $\mathrm{End}(\mathbb{C}^n \otimes \mathbb{C}^n)$ defined by

$$R(z) = \sum_{i=1}^{n} E_{ii} \otimes E_{ii} + \sum_{\substack{i \neq j \\ i,j=1}}^{n} E_{ii} \otimes E_{jj}\, \frac{z - 1}{q^{-1} z - q} + \sum_{\substack{i < j \\ i,j=1}}^{n} E_{ij} \otimes E_{ji}\, \frac{z(q^{-1} - q)}{z q^{-1} - q} + \sum_{\substack{i > j \\ i,j=1}}^{n} E_{ij} \otimes E_{ji}\, \frac{q^{-1} - q}{q^{-1} z - q}, \tag{3.1}$$

where $q$, $z$ are formal variables. $R(z)$ satisfies the Yang–Baxter equation.

Definition 3.1. $U_q(R)$ is an associative algebra with generators $\{ l_{ij}[m],\ m \in \mathbb{Z} \}$. Let $L(z) = (l_{ij}(z))_{i,j=1}^{n} = \big(\sum_{m \in \mathbb{Z}} l_{ij}[m] z^{-m}\big)_{i,j=1}^{n}$. The defining relations are:

$$R(z/w)\, L_1(z)\, R(zq^{2c}/w)^{-1}\, L_2(w) = L_2(w)\, R(zq^{-2c}/w)\, L_1(z)\, R(z/w)^{-1}. \tag{3.2}$$
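The statement that $R(z)$ of (3.1) satisfies the Yang–Baxter equation can be spot-checked numerically. The sketch below assumes the reading of (3.1) in which the $i<j$ terms carry the coefficient $z(q^{-1}-q)/(zq^{-1}-q)$ and the $i>j$ terms carry $(q^{-1}-q)/(q^{-1}z-q)$, and it checks the equation in the form $R_{12}(z/w) R_{13}(z) R_{23}(w) = R_{23}(w) R_{13}(z) R_{12}(z/w)$ with exact rational arithmetic at chosen rational values of $z$, $w$, $q$. The helper names are hypothetical and the check is an illustration at sample points, not a proof.

```python
from fractions import Fraction
from itertools import product


def r_matrix(z, q, n):
    """Entries of R(z) as {(a, b, c, d): coeff}, meaning e_c (x) e_d -> coeff * e_a (x) e_b."""
    z, q = Fraction(z), Fraction(q)
    q_inv = 1 / q
    diag = (z - 1) / (q_inv * z - q)            # coefficient of E_ii (x) E_jj, i != j
    upper = z * (q_inv - q) / (z * q_inv - q)   # coefficient of E_ij (x) E_ji, i < j
    lower = (q_inv - q) / (z * q_inv - q)       # coefficient of E_ij (x) E_ji, i > j
    entries = {}
    for i, j in product(range(n), repeat=2):
        if i == j:
            entries[(i, i, i, i)] = Fraction(1)
        else:
            entries[(i, j, i, j)] = diag
            entries[(i, j, j, i)] = upper if i < j else lower
    return entries


def embed(entries, s, t, n):
    """Matrix of R acting on tensor factors s and t (0-based) of the triple tensor product."""
    size = n ** 3

    def index(v):
        return (v[0] * n + v[1]) * n + v[2]

    matrix = [[Fraction(0)] * size for _ in range(size)]
    for col in product(range(n), repeat=3):
        for (a, b, c, d), coeff in entries.items():
            if col[s] == c and col[t] == d:
                row = list(col)
                row[s], row[t] = a, b
                matrix[index(row)][index(col)] += coeff
    return matrix


def matmul(x, y):
    """Plain exact matrix product of two square lists of Fractions."""
    y_cols = list(zip(*y))
    return [[sum(p * r for p, r in zip(row, col)) for col in y_cols] for row in x]


def yang_baxter_holds(z, w, q, n):
    """Compare both sides of the Yang-Baxter equation at the given sample point."""
    z, w = Fraction(z), Fraction(w)
    r12 = embed(r_matrix(z / w, q, n), 0, 1, n)
    r13 = embed(r_matrix(z, q, n), 0, 2, n)
    r23 = embed(r_matrix(w, q, n), 1, 2, n)
    return matmul(matmul(r12, r13), r23) == matmul(matmul(r23, r13), r12)
```

The sample points must avoid $z = q^2$ (and likewise for $w$ and $z/w$), where the denominators in (3.1) vanish. At $z = 1$ the matrix reduces to the permutation operator, which gives a quick sanity check on the coefficients.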

Proposition 3.1. $U_q(R)$ is isomorphic to $U_q(\widehat{gl}(n))$. $U_q(R)$ is isomorphic to an algebra generated by $U_q(\widehat{sl}(n))$ and $g(m)$ for $m \neq 0$. The $g(m)$ commute with $U_q(\widehat{sl}(n))$ and $[g(l), g(m)] = \delta_{l,-m}\, l\, C$, where $C$ is the central element of $U_q(\widehat{sl}(n))$ [DF].

Proof. From $U_q(\widehat{sl}(n))$, we can obtain $\bar L^{\pm}(z)$, which satisfy (1.13a) [DF]. There exist $b(\pm, m)$ and $g^{\pm}(z) = e^{\sum b(\pm, m)\, g(\mp m)\, z^{\pm m}}\, e^{\pm \frac{1}{2} K}$, $m > 0$, such that the operators

$$\bar l(z) = g^{+}(z)\, \bar L^{+}(zq^{-c})\, (\bar L^{-}(z))^{-1}\, g^{-}(z)$$

satisfy the relation (3.2). This gives us an algebra homomorphism from $U_q(R)$ to $U_q(\widehat{gl}(n))$ [DF] above. Because $U_q(R)$ is isomorphic to $U(\widehat{gl}(n))$ when $q = 1$, this map is surjective. From the fact [DF] that the intersection of the ideals of the integrable representations of $U_q(\widehat{gl}(n))$ is zero, we know that it must also be an injective map.

Lusztig's theorem about deformation of the category of integrable highest weight representations [L] implies that the module $V_{bf}$ as an integrable highest weight module can be deformed. One construction of this module was given by Frenkel and Jing [FJ]. As in the last section, we will start from the highest weight modules. We will denote this deformed module of $V_{bf} = \sum_{l \in \mathbb{Z}} \oplus V_l$ by $V_{BF}$. $V_{BF}$ as a module of $U_q(\widehat{gl}(n))$ can be decomposed into irreducible components: $V_{BF} = \sum_{l \in \mathbb{Z}} \oplus V^l$, where $V^l$ is an irreducible component of $V_{BF}$, which is a deformation of the module $V_l$ of $\widehat{gl}(n)$ in the section above. But $V^l$ as a module of $U_q(\widehat{sl}(n))$ has infinitely many copies of $\bar V_{i \,(\mathrm{mod}\ n)}$, which is a highest weight representation of $U_q(\widehat{sl}(n))$ with the $i$th fundamental weight and central extension 1. We also assume that $V^l$ has the same grading shifting as for $V_l$ in the classical case.

We now present the evaluation representations of $U_q(\widehat{sl}(n))$ on the fundamental representations of the subalgebra $U_q(sl(n))$ generated by $E_i$, $F_i$, $K_i$, $i \neq 0$, from [DO]. We define a characteristic function $\theta_J(j)$ of a set $J$ by $\theta_J(j) = 1$ ($j \in J$), $0$ ($j \notin J$). If $J$ is omitted, it should be understood as $\mathbb{Z}_{\geq 0}$. Fix a positive integer $k$ such that $1 \leq k \leq n - 1$. Let $I = \{i_1, \dots, i_k\}$ be a subset of $\{0, 1, \dots, n-1\}$. For $I = \{i_1, \dots, i_k\}$, we put $s(I) = i_1 + \dots + i_k$. Consider a vector space $V^{(k)}$ spanned by the vectors $\{v_I\}$. With a map $\pi^{(k)}$ from $U_q(sl(n))$ to $\mathrm{End}(V^{(k)})$ [DO], $V^{(k)}$ becomes a module of $U_q(sl(n))$, which is isomorphic to the irreducible highest weight module with highest weight corresponding to the $k$th node of the Dynkin diagram of $sl(n)$ [DO]. Let

$$V_z^{(k)} = V^{(k)} \otimes \mathbb{C}[z, z^{-1}]. \tag{3.3}$$

We can lift $\pi^{(k)}$ to an algebra homomorphism $\pi_z^{(k)} : U_q(\widetilde{sl}(n)) \longrightarrow \mathrm{End}(V_z^{(k)})$ [DO]. Then there is the following isomorphism of $U_q(\widehat{sl}(n))$-modules:

$$C_{\pm}^{(k)} : V^{(k)}_{z(-q)^{\mp n}} \overset{\sim}{\longrightarrow} V_z^{(n-k)\,*s^{\pm 1}}, \qquad v_I \mapsto (-q)^{\pm s(I)}\, v^*_{I^c}. \tag{3.4}$$

Here $I^c$ denotes the complement set of $I$ in $\{0, 1, \dots, n-1\}$, $\{v^*_{I^c}\}$ signifies the dual basis of $\{v_{I^c}\} \subset V^{(n-k)}$ and $V^{*s^{\pm 1}}$ denotes the right and the left dual respectively.

Let $\Lambda_i$ ($i = 0, 1, \dots, n-1$) be the fundamental weights of $U_q(\widehat{sl}(n))$. We extend this index cyclically by $\Lambda_i = \Lambda_{i+n}$. Let us denote by $\bar\Lambda_k = \Lambda_k - \Lambda_0$ the projection of $\Lambda_k$ to the weight lattice of $U_q(sl(n))$ generated by $E_i$, $F_i$ and $K_i$ with $i \neq 0$. Let $\Phi$ be an intertwiner defined in the same way as $\Psi(z)$ in Proposition 2.7:

$$\Phi : V_{BF} \longrightarrow V^{(1)} \otimes V_{BF}, \qquad \Phi(x) = e_1 \otimes \Phi_1(x) + \dots + e_n \otimes \Phi_n(x), \tag{3.5}$$

where $x \in V_{BF}$ and $\{e_i\}$ is a basis for $V^{(1)}$. Let $\Phi^*$ be the intertwiner:

$$\Phi^* : V_{BF} \longrightarrow V^{(1)*} \otimes V_{BF}, \qquad \Phi^*(x) = \bar e^*_1 \otimes \Phi^*_1(x) + \dots + \bar e^*_n \otimes \Phi^*_n(x), \tag{3.6}$$

where $V^{(1)*}$ is the left dual representation of $V^{(1)}$ of $U_q(\widehat{sl}(n))$, $x \in V_{BF}$ and $\{\bar e^*_i\}$ is a basis for $V^{(1)*}$.

Let $\bar\Phi^*$ be the intertwiner defined in the same way as $\Psi^*(z)$ in Proposition 2.7:

$$\bar\Phi^* : V_{BF} \longrightarrow {}^*V^{(1)} \otimes V_{BF}, \qquad \bar\Phi^*(x) = e^*_1 \otimes \bar\Phi^*_1(x) + \dots + e^*_n \otimes \bar\Phi^*_n(x), \tag{3.7}$$

where ${}^*V^{(1)}$ is the right dual representation of $V$ of $U_q(\widehat{sl}(n))$, $x \in V_{BF}$ and $\{e^*_i\}$ is a basis for ${}^*V^{(1)}$. From now on, we will substitute $V^{(1)}$ by $V$. There is an isomorphism from ${}^*V$ to $V^*$ by $a \to q^{2\rho} a$, for $a \in {}^*V$ [FR], where $\rho$ is the half sum of all the positive roots of the $U_q(sl(n))$ in $U_q(\widehat{sl}(n))$ generated by $E_i$, $F_i$ and $K_i$ ($i \neq 0$).

As in the previous case, we can identify $V^* \otimes V$ with $\mathrm{End}(V)$. With this identification, we define an operator $\tilde L \in \mathrm{End}(V) \otimes \mathrm{End}(V_{BF})$:

$$\tilde L = (\tilde L_{ij}) = ((D_{q^2} \Phi_j)\bar\Phi^*_i), \tag{3.8}$$

$$\tilde L \Delta(a) = ((D_{q^2} \otimes 1)\Delta(a))\,\tilde L. \tag{3.9}$$

The shift comes from the shift of $\Phi_{q^2}$.

There is a problem whether the multiplication of two operators $D_{q^2}\Phi_i$ and $\bar\Phi^*_j$ is well defined. With the condition $|q| < 1$, we will show that this multiplication is well defined with Corollary 3.2 below, which comes from the results on the correlation functions of those intertwiners [DO]. We assume, from now on, that $|q| < 1$.

Let $\Delta_j = j(n-j)/2n$, for $j = 0, \dots, n-1$, and we extend this index cyclically by $\Delta_j = \Delta_{j+n}$. Let $\bar V_j$, $j = 0, 1, 2, \dots, n-1$, be equivalent to $V_j$ as a module of $U_q(\widehat{sl}(n))$. We extend the index cyclically by identifying $\bar V_j$ with $\bar V_{j+n}$. For an integer $l$, let $\bar l$ stand for the integer such that $l \equiv \bar l \bmod n$, $0 \leq \bar l < n$. Set $I_{jk} = \{j-k, j-k+1, \dots, j\}$.

Let $\Phi^{V^{(k)} \bar V_{j-k}}_{\bar V_j}(z)$ denote an intertwiner from $\bar V_j$ to $V_z^{(k)} \otimes \bar V_{j-k}$. Then our normalization reads as follows ($0 \leq j < n$):

$$\Phi^{V^{(k)} \bar V_{j-k}}_{\bar V_j}(z)\,|v_j\rangle = z^{\Delta_{j-k} - \Delta_j}\, v_{I_{jk} \setminus \{j\}} \otimes |v_{j-k}\rangle + \cdots,$$
$$\Phi^{(V^{(k)})^{*a^{\pm}} \bar V_{j+k}}_{\bar V_j}(z)\,|v_j\rangle = z^{\Delta_{j+k} - \Delta_j}\, v^*_{(I_{j,n-k} \setminus \{j\})^c} \otimes |v_{j+k}\rangle + \cdots,$$

where $v_i$ is the highest weight vector in $\bar V_i$. Let

$$\Phi(z) = (1 \otimes D_z^{-1})\Phi = \sum \oplus\, \Phi^{V^{(1)} \bar V_{j-1}}_{\bar V_j}(z),$$
$$\Phi^*(z) = (1 \otimes D_z^{-1})\bar\Phi^* = \sum \oplus\, \Phi^{(V^{(1)})^{*a^{+}} \bar V_{j+1}}_{\bar V_j}(z), \tag{3.10}$$
$$\bar\Phi^*(z) = (1 \otimes D_z^{-1})\Phi^* = \sum \oplus\, \Phi^{(V^{(1)})^{*a^{-}} \bar V_{j+1}}_{\bar V_j}(z).$$

The matrix coefficients of the highest weight vector for the product of two intertwiners were obtained in [DO]; they are needed to present the commutation relations between those intertwiners. Let

$$(z; p)_\infty = \prod_{j=0}^{\infty} (1 - z p^j). \tag{3.11}$$
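For numerical experiments with the infinite product $(z; p)_\infty$ of (3.11), a truncated product is usually sufficient when $|p| < 1$. The following is a minimal sketch; the function name `q_pochhammer` and the default number of factors are illustrative assumptions, and the truncation is only meaningful for $|p| < 1$, where the omitted factors are close to 1.

```python
def q_pochhammer(z, p, terms=400):
    """Truncated product prod_{j=0}^{terms-1} (1 - z p^j), approximating (z; p)_infinity."""
    result = 1.0
    for j in range(terms):
        result *= 1.0 - z * p ** j
    return result
```

A useful consistency check is the functional equation $(z; p)_\infty = (1 - z)\,(zp; p)_\infty$, which follows directly from (3.11) by splitting off the $j = 0$ factor.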

From now on, we will always use $\langle\ ,\ \rangle$ to denote the matrix coefficient of the corresponding highest weight vectors of the highest weight modules.

Proposition 3.2 ([DO]). V (k) V¯ j

h8V¯ 2

j+k

0

V (k ) V¯ j+k

(z2 )8V¯ 1

j+k+k0

(z1 )i 0

= ×

1j+k −1j+k+k0 1j −1j+k z2 z1

X 0

I⊂I0(k+k ) ,|I|=k

m Y ((−q)2i+|k−k |−2 zz1 ; q 2n )∞ 2

((−q)−2i−|k−k0 | zz21 ; q 2n )∞ i=1 (k) z1 (−q)s(I0 )−s(I) ( )µj (I) v(I (k+k0 ) \I)[j] ⊗ vI[j] , 0 z2

(3.12)

$m = \min(k, k', n-k, n-k')$, $\mu_j(I) = \#\{i \in I \mid i + j \geq n\} - \#\{i \in I_0^{(k)} \mid i + j \geq n\}$, $I[j] = \{i_1 + j, \dots, i_k + j\}$ for $I = \{i_1, \dots, i_k\}$.

Here $I_0^{(k)} = \{0, 1, \dots, k-1\}$; if $k + k' > n$, we formally understand $I_0^{(k+k')} = \{0, \dots, n-1\} \sqcup I_0^{(k+k'-n)}$, and we assume $I_0^{(k+k'-n)} \subset I,\ I_0^{(k+k')} \setminus I$. Let $\bar R_{V^{(k)} V^{(k')}}(z)$ be as defined in [DO]. Let

$$\Theta_p(z) = (z; p)_\infty\, (p z^{-1}; p)_\infty\, (p; p)_\infty. \tag{3.13}$$

We define

$$R_{kk'}(z) = \rho^{(k,k')}(z)\, \bar R_{V^{(k)} V^{(k')}}(z), \tag{3.14}$$

$$\rho^{(k,k')}(z) = z^{-kk'/n + \min(k,k')}\, \frac{((-q)^{b} z^{-1}; q^{2n})_\infty\, ((-q)^{s} z; q^{2n})_\infty}{((-q)^{b} z; q^{2n})_\infty\, ((-q)^{s} z^{-1}; q^{2n})_\infty}\, \prod_{i=1}^{m} \frac{\Theta_{q^{2n}}((-q)^{2i+b} z^{-1})}{\Theta_{q^{2n}}((-q)^{2i+b} z)}, \tag{3.15}$$

where $b = |k - k'|$, $s = \min(k + k', 2n - k - k')$, $m$ is defined as $\min\{k, k', n-k, n-k'\}$ and $k, k' = 1$ or $n-1$. Let

$$R^{*}_{kk'}(z) = (\mathrm{id} \otimes C_{-}^{(n-k')})\, R_{k,n-k'}(z(-q)^{-n})\, (\mathrm{id} \otimes C_{-}^{(n-k')})^{-1},$$
$$R^{**}_{kk'}(z) = (C_{-}^{(n-k)} \otimes C_{-}^{(n-k')})\, R_{n-k,n-k'}(z)\, (C_{-}^{(n-k)} \otimes C_{-}^{(n-k')})^{-1}. \tag{3.16}$$

. Corollary 3.1. Let k, k 0 =1 or n − 1. Then V (k) V¯ j

h8V¯ 2

j+k

0

V (k ) V¯ j+k

(z1 )8V¯ 1

j+k+k0

(z2 )i = P Rkk0 ( V (k) V¯ j

h8V¯ 2

j+k

∗ = P Rkk 0(

0 )∗a−1

(z1 )8V¯ 1

V¯ j+k

j+k−k0

(z2 )i

0 −1 z1 V (k) V¯ V (k )∗a V¯ j 0 )h8V¯ 2 0 (z2 )8V¯ 1 j−k (z1 )i, j−k j+k−k0 z2

V (k)∗a

h8V¯ 2

j−k

∗∗ = P Rkk 0(

V (k

0 z1 V (k) V¯ 0 V (k ) V¯ )h8V¯ 2 0 j (z2 )8V¯ 1 j+k (z1 )i, j+k j+k+k0 z2

−1

V¯ j

V (k

(z1 )8V¯ 1

0 )∗a−1

j−k−k0

V¯ j−k

(z2 )i

−1 0 −1 z1 V (k)∗a V¯ j−k0 V (k )∗a V¯ j )h8V¯ 2 0 (z2 )8V¯ 1 (z1 )i. j−k j−k−k0 z2

(3.17)


In the neighborhood of | zz21 | = 1, both sides of the second formula above with k = k 0 have a simple pole at z1 = z2 . Its residue is given by V (k) V¯ j

P Resz1 =z2 h8V¯ 2

j+k

= h(k)

V (k)∗a

(z1 )8V¯ 1 j X

−1

V¯ j+k

(z2 ) d

z1 i z2

vI ⊗ vI∗ ,

(3.18)

I⊂I0(n) ,|I|=k

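where

$$h^{(k)} = \frac{(q^{2n-2}; q^{2n})_\infty}{(q^{2n}; q^{2n})_\infty} \prod_{i=1}^{\min(k,n-k)-1} \frac{(q^{2n-2i-2}; q^{2n})_\infty}{(q^{2i}; q^{2n})_\infty}.$$

Let

$$f\Big(\frac{z_1}{z_2}\Big) = \frac{(\frac{z_1}{z_2}; q^{2n})_\infty}{(q^{-2}\frac{z_1}{z_2}; q^{2n})_\infty}, \qquad F\Big(\frac{z_1}{z_2}\Big) = \frac{(q^{2n-2}\frac{z_1}{z_2}; q^{2n})_\infty}{(q^{2n}\frac{z_1}{z_2}; q^{2n})_\infty}.$$

Note that $(1 - \frac{z_1}{z_2} q^{-2})\, f(\frac{z_1}{z_2}) = (1 - \frac{z_1}{z_2}) / F(\frac{z_1}{z_2})$. Let $R_{1,1}(\frac{z_1}{z_2})$, $R^{**}_{n-1,n-1}(\frac{z_1}{z_2})$ and $R^{*}_{1,n-1}(\frac{z_1}{z_2})$ be matrices as defined above, but with the first term of $\rho^{(k,k')}(z)$ in (3.15) substituted by $z^{\delta_{k,k'}}$.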
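The identity relating $f$ and $F$ follows from (3.11): writing $x = z_1/z_2$ and $p = q^{2n}$, one has $(q^{-2}x; p)_\infty = (1 - q^{-2}x)(q^{2n-2}x; p)_\infty$ and $(x; p)_\infty = (1 - x)(q^{2n}x; p)_\infty$, so both sides equal $(x; p)_\infty / (q^{2n-2}x; p)_\infty$. The sketch below checks this numerically with truncated products; the function names and the truncation length are assumptions made for illustration, and $0 < |q| < 1$ is required.

```python
def q_pochhammer(z, p, terms=400):
    """Truncated product approximating (z; p)_infinity for |p| < 1."""
    result = 1.0
    for j in range(terms):
        result *= 1.0 - z * p ** j
    return result


def small_f(x, q, n):
    """f(x) = (x; q^{2n})_inf / (q^{-2} x; q^{2n})_inf."""
    p = q ** (2 * n)
    return q_pochhammer(x, p) / q_pochhammer(x / q ** 2, p)


def big_f(x, q, n):
    """F(x) = (q^{2n-2} x; q^{2n})_inf / (q^{2n} x; q^{2n})_inf."""
    p = q ** (2 * n)
    return q_pochhammer(q ** (2 * n - 2) * x, p) / q_pochhammer(p * x, p)


def identity_gap(x, q, n):
    """Difference between the two sides of (1 - x q^{-2}) f(x) = (1 - x) / F(x)."""
    return (1.0 - x / q ** 2) * small_f(x, q, n) - (1.0 - x) / big_f(x, q, n)
```

The value of `identity_gap` should be zero up to floating-point rounding at any sample point where no factor of the products vanishes.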

Corollary 3.2 ([FR, DFJMN, DO]). Let z, z1 and z2 be formal variables. 8(z) and ¯ ∗ (z) satisfy the commutation relations: 8 1 68j (z1 )8i (z2 )ei ⊗ ej = f ( zz21 )

P0

1 z1 ))(68j 0 (z2 )8i0 (z1 )ei0 ⊗ ej 0 ), z1 (R1,1 ( f ( z2 ) z2 1 ¯ ∗i (z1 )8 ¯ ∗j (z2 )e∗j ⊗ e∗i = 68 f ( zz21 )

P0

1 z1 ¯ ∗i (z2 )8 ¯ ∗j (z1 )e∗j ⊗ e∗i ), (R∗∗ ( )(68 f ( zz21 ) n−1,n−1 z2

(3.19)

1 ¯ j (z1 )8∗i (z2 )e∗i ⊗ ej − 1/(1 − z1 )F = 68 F ( zz21 ) z2

P0 lim

1 z1 z1 z1 (R∗ ( ))(68∗j (z1 )8i (z2 )ei ⊗ e∗j ) − ( )/( − 1)F, F ( zz21 ) 1,n−1 z2 z 2 z2 lim

∗

¯ i (z2 )e∗i ⊗ ei = P (z1 − z2 )68i (z1 )8

z1 →1 z1 →z2 ,|z1 |<|z2 |

(q 2n−2 ; q 2n )∞ F, (q 2n ; q 2n )∞

where $f$ and $F$ are expanded in power series of $z_2/z_1$ on the left-hand side of the formulas above but in power series of $z_1/z_2$ on the right-hand side, and $F = \sum e_i \otimes e_i^*$.


Proof. Our argument is based on the formulas in Proposition 3.2 for the correlation functions. The argument for the first two formulas is straightforward. First, we know that the formula is valid at the level of correlation functions of the highest weight vectors of $V_{BF}$, due to the fact that, after we factor out those functions as above, the correlation functions of the operators on both sides between highest weight vectors are polynomials, as shown in [DO]. On the other hand, both sides are intertwiners. This relation can thus be proved to be valid for the matrix coefficient of any two vectors; thus both sides are equal.

As for the formula for the commutation relations between $\Phi_i$ and $\bar\Phi^*_j$, the argument goes as follows: the first part is the same as that above, namely, the formula is valid at the level of correlation functions of the highest weight vectors; secondly, $\sum_{n\in\mathbb{Z}} (\frac{z_1}{z_2})^n F$ is an invariant vector, thus $\mathrm{id} \otimes \sum_{n\in\mathbb{Z}} (\frac{z_1}{z_2})^n F$ is also an intertwiner. Thus the difference of the two sides is also an intertwiner, and we can then show that the third formula is valid at the level of correlation functions of any two vectors. Thus it is valid.

The commutation relation of each homogeneous component of $\Phi_i(z)$, $\bar\Phi^*_j(z)$ described by using the R-matrix will degenerate into the commutation relation with $\delta(z)$. The locations of the poles of the correlation functions of $\Phi_j(z_1)\bar\Phi^*_i(z_2)$ do not include the line $z_1q^2 = z_2$. From the commutation relations and the condition that $|q| < 1$, the multiplication of $\bar\Phi^*(z)$ and $\Phi(zq^2)$ is well defined. Thus $\sum_{i,j}(D_{q^2}\Phi_i)\bar\Phi^*_j\, e^*_j \otimes e_i = (1 \otimes 1 \otimes D_{z^{-1}})(1 \otimes \Phi(zq^2))\bar\Phi^*(z)$ is a well defined operator.

We know that both $V \otimes V_i$ and $V_{q^2} \otimes V_i$ are irreducible, as shown in [KKMMNN, ?]. This shows that the dimension of the space of the operators $X : V \otimes V_{BF} \to V_{q^2} \otimes V_{BF}$ which satisfy the relation $X\Delta(a) = (D_{q^2} \otimes 1)\Delta(a)X$ is $n$. Thus the difference between $\tilde L$ and $L$ is a constant factor on all irreducible components, which can be determined by looking at their actions on the highest weight vectors. By looking at the homogeneous component of degree 0 of the correlation functions, we know that there is a single factor $c$ for all the irreducible components, which can be derived by comparing the actions of $L$ and $\tilde L$ on the highest weight vectors; this is partially due to the proper normalization. Let $\tilde L(z) = (1 \otimes \Phi(zq^2))\bar\Phi^*(z)$.

Theorem 3.1.
$$\tilde L(z) = c\,(1 \otimes D_{z^{-1}})L, \qquad\text{where } c = \frac{\mathrm{tr}(v_0, (D_{q^2}\bar\Phi^*)\Phi v_0)}{\mathrm{tr}(v_0, L(1)v_0)} \tag{3.20}$$
and $v_0$ is the highest weight vector of $V_0$.

We can remove the $c$ in (3.20) by renormalization of $\Phi$ and $\bar\Phi^*$. Because $\tilde L(z)$ can be decomposed into the product of $L^+(z)$ and $L^-(z)^{-1}$ for the fermionic realizations, $\tilde L(z)v_i = L^+(z)v_i$. Then, if we identify the elements in the Fock space with the elements in the quantum affine algebra, this gives us the formula for $L^+(z)$. This also gives us the unique decomposition of $\tilde L(z)$. However, we are not able to write down any simple formulas.

Next, let us consider $U_q(\hat{gl}(n))$. As in [DF], by adding an extra Heisenberg algebra $h(n)$, which commutes with $U_q(\hat{sl}(n))$, we would obtain the representation of $U_q(\hat{gl}(n))$. From [DF] and (1.11b)


Proposition 3.3. There exist complex numbers $a(n)$ such that

$$\bar L^{+}(z) = \Big(L^{+}(z) \otimes e^{\sum_{m\in\mathbb{Z}_{>0}} a(m)h(-m)z^{m}}\Big)^{-1},\qquad \bar L^{-}(z) = \Big(L^{-}(z) \otimes e^{\sum_{m\in\mathbb{Z}_{>0}} a(-m)h(m)z^{-m}}\Big)^{-1} \tag{3.21}$$

satisfy the commutation relation (3.1). Here $h(m)$ are generators of the Heisenberg algebra as in Definition 2.2.

These operators act on the tensor product of the Fock space of $H(-m)$, $m > 0$, and $V \otimes V_{BF}$. We will denote it by $V_{gl}$: $V_{gl} = V_{BF} \otimes V_h$, where $V_h$ is the module generated by the extra Heisenberg algebra of $U_q(\hat{gl}(n))$. Let us give the same Gauss decomposition to these new operators. Then the action of the product of the zero components of the diagonal components of their decomposition, which we denote by $T$ and $T^{-1}$, is 1. Let $A$ be a group algebra generated by a lattice $\mathbb{Z}a$. Let $\bar V = \sum \oplus\, V_{gl_i} \otimes e^{(mn+i)a}$, $m \in \mathbb{Z}$, in the space $V_{gl_i} \otimes A$, where $V_{gl_i} = V_i \otimes V_h$. We define $\bar V$ to be a module of $U_q(\hat{gl}(n))$ such that all other elements act only on $V_{gl}$, but the action of $T$ on $V_{gl} \otimes e^{ma}$ is multiplication by $q^m$.

Proposition 3.4. $T^{\pm 1/2}\bar L^{\pm}(z)$ gives us a representation $L^{\pm}(z)$ of $U_q(\hat{gl}(n))$ on $\bar V$. $\bar V$ is equivalent to $V_{BF}$.

Let $\bar h(z) = \sum a(-m)h(m)z^{-m}/(q^2 - 1)$ and $\bar h(z_1)\bar h(z_2) = g(\frac{z_1}{z_2}) + {:}\bar h(z_1)\bar h(z_2){:}$. Let $G(\frac{z_1}{z_2}) = e^{g(z_1/z_2)}$. Then ${:}e^{\bar h(z_2)}{:}\,{:}e^{\bar h(z_1)}{:} = G(\frac{z_1}{z_2})\,{:}e^{\bar h(z_2)+\bar h(z_1)}{:}$.

Definition 3.2. Let $z$, $z_1$ and $z_2$ be formal variables. The affine quantum Clifford algebra is defined as the associative algebra generated by $\psi(z) = (\psi_i(z)) = (\sum \psi_i(m)z^{-m})$ and $\psi^*(z) = (\psi_i^*(z)) = (\sum \psi_i^*(m)z^{-m})$, $0 < i < n+1$, modulo the ideal generated by the following relations: G( zz21 ) f ( zz21 )

P

0

G( zz21 ) f ( zz21 )

(R1,1 (

G( zz21 )

P0

G( zz21 ) f ( zz21 )

6ψj (z1 )ψi (z2 )ei ⊗ ej =

f ( zz21 )

z1 ))(6ψi0 (z2 )ψj 0 (z1 )ej 0 ⊗ ei0 ), z2

6ψj∗ (z1 )ψi∗ (z2 )e∗i ⊗ e∗j =

∗∗ (Rn−1,n−1 (

z1 )(6ψi∗ (z2 )ψj∗ (z1 )e∗j ⊗ e∗i ), z2

1 6ψj (z1 )ψi∗ (z2 )e∗i z1 G( z2 )F ( zz21 )

⊗ ej − 1/(1 −

z2 )F = z1

1 z1 (R∗ ( ))(6ψj∗ (z2 )ψi (z1 )ei ⊗ e∗j ) (F ( zz21 )G( zz21 )) 1,n−1 z2 z1 z1 −( )/( − 1)F, z2 z2 (q 2n−2 ; q 2n )∞ lim 1/(z1 − z2 )6ψi (z1 )ψi∗ (z2 )e∗i ⊗ ei = F/G(1), lim z1 →1 z1 →z2 ,|z1 |<|z2 | (q 2n ; q 2n )∞ (3.22) P0


where the functions on the left- and right-hand sides are expanded in $\frac{z_2}{z_1}$ and $\frac{z_1}{z_2}$, respectively.

Theorem 3.2. The affine quantum Clifford algebra is isomorphic to the algebra generated by $\Phi(z) \otimes e^{-\bar h(z)} \otimes e^{-a}$ and $\Phi(z)^* \otimes e^{\bar h(z)} \otimes e^{a}$ on $\bar V$.

Proof. It is straightforward to show that the map sending $\psi(z)$ to $\Phi(z) \otimes e^{-\bar h(z)} \otimes e^{-a}$ and $\psi^*(z)$ to $\Phi(z)^* \otimes e^{\bar h(z)} \otimes e^{a}$ is a surjective algebra homomorphism. Because $\rho^{(k,k)}(z)$ for $k = 1$ and $k = n-1$ has a factor of the form $z(1 - z^{-1})/(1 - z) = -1$, and because of the triangular property of $R(0)$, if we shift the degree of $\psi(z)$ and $\psi^*(z)$ by $\pm 1/2$ as in the classical case, we can define the Fock space as the space generated by the operators $\psi_i(n)$ for $n < 0$ and $\psi_i^*(m)$ for $m \neq 0$, where $\psi_i(n)$ and $\psi_i^*(n)$ are the coefficient operators of $z^{-n}$ in $\psi_i(z)$ and $\psi_i^*(z)$ respectively. The character can be derived by a calculation based on the R-matrix on $\mathbb{C}^n \otimes \mathbb{C}^n$ in the specific basis we choose, which shows that the Fock space has the right character. Thus we prove the isomorphism.

We expect that the complicated functions appearing in Definition 3.2 will cancel each other, so that we would get manageable and easy formulas. If this hypothesis is true, it hints that, as in the classical case, we should look at the corresponding case of $gl(n)$ instead of $sl(n)$, which should make the case much simpler. With Theorem 3.2, we can actually start from the abstract algebra defined in Definition 3.2. Then we can derive $L$, which leads to the realization of $L^{\pm}(z)$. From [DF], through the Gauss decomposition of $L^{\pm}(z)$, we obtain all the quantum bosons out of $L^{\pm}(z)$. Thus we obtain the realization of the quantum boson-fermion correspondence in one direction. We cannot write explicit simple formulas, due to the difficulties coming from the polar decomposition of $L(z)$ and the Gauss decomposition of $L^{\pm}(z)$; however, the case of $U_q(\hat{sl}(2))$ can be solved in a relatively easy way through computation. On the other hand, based on the work of Koyama [Ko] and the Frenkel–Jing construction, we can partially write down the realization of the quantum fermions in bosons. With our results, we expect that a complete formula is very possible if we consider $gl(n)$ instead of $sl(n)$, as we explain in Remark 1.

Let $\bar F(\frac{z_1}{z_2}) = (1 - \frac{z_1}{z_2}q^{-2})F(\frac{z_1}{z_2})$. Let $H(z) = \sum H(n)z^{-n}$, $n \neq 0$, be a Heisenberg algebra such that ${:}e^{H(z_1)}{:}\,{:}e^{-H(z_2)}{:} = \frac{1}{\bar F(z_1/z_2)}\,{:}e^{H(z_1)-H(z_2)}{:}$. Let $\bar\Psi(z) = \Phi(z) \otimes {:}e^{-H(z)}{:}$ and $\bar\Psi^*(z) = \Phi^*(z) \otimes {:}e^{H(z)}{:}$. Let $\tilde V$ be the tensor product space $V_{BF} \otimes H$, where $H$ is the space generated by $H(n)$, $n < 0$. Then, on $\tilde V$, we have

¯ ¯ (z) satisfies the relaand 9 Theorem 3.3. Let z, z1 and z2 be formal variables. 9(z) tions: ¯ i (z2 )ei ⊗ ej = ¯ j (z1 )9 69 z2 ¯ F ( z1 ) z1 z1 ¯ i0 (z2 )9 ¯ j 0 (z1 )ej 0 ⊗ ei0 ), ))(69 (1 − )P 0 z1 (R1,1 ( z2 (f ( z2 )) z2 ∗

∗

¯ i (z2 )e∗i ⊗ e∗j = ¯ j (z1 )9 69 (1 −

z z1 z1 0 F¯ ( z21 ) ∗∗ ¯ ∗i (z2 )9 ¯ ∗j (z1 )e∗j ⊗ e∗i ), (R )P ( )(69 z2 (f ( zz21 )) n−1,n−1 z2 ¯ ∗i (z2 )e∗i ⊗ ej − 1/(1 − z2 )F (1 − q −2 ) = ¯ j (z1 )9 69 z1


P0

lim

1 z1 ¯ ∗i (z2 )9 ¯ j (z1 )ej ⊗ e∗i ) (R∗ ( ))(69 (F ( zz21 )F¯ ( zz21 )) 1,n−1 z2 z1 z1 −( )/( − 1)(1 − q −2 )F, z2 z 2 ¯ i (z1 )9 ¯ ∗i (z2 )e∗i ⊗ ei = (1 − q −2 )F, lim 1/(z1 − z2 )69

(3.23)

z1 →1 z1 →z2 ,|z1 |<|z2 |

where the functions on the left- and right-hand sides are expanded in $\frac{z_2}{z_1}$ and $\frac{z_1}{z_2}$, respectively.

The proof comes from straightforward calculation.

Discussion

Theorem 3.3 gives a realization of an algebra which has the same defining formulas as those for form factors in quantum field theories [Sm], but the R-matrix here differs by a function factor. To construct local operators in the theory of form factors is a very important problem [Sm]. If we consider the case of $U_q(\hat{sl}(n))$, and if we consider the intertwiners in (1.18) and (1.19), which we will call right intertwiners, as the basic generators to define form factors, it is known that a certain composition of the intertwiners of the type in (1.14) and (1.15), which we will call left intertwiners, gives local operators. However, how to derive the left intertwiners from the right intertwiners is a problem. With the spinor construction derived above, it is clear that we can derive the operators $L^{\pm}(z)$ through the Gauss decomposition of $L(z)$ constructed out of the right intertwiners. The proper composition of $L^{\pm}(z)$ with the right intertwiners then gives us the left intertwiners. Thus our construction provides a way to obtain local operators directly from the algebra which defines form factors.

Moreover, we can modify the algebra with an extra Heisenberg algebra, similar to the case in Theorem 3.3, such that we can derive in the same way operators $l^{\pm}(z)$ coming from the Gauss decomposition of the operator $l(z)$ built out of those modified right intertwiners. From these we derive operators similar to the left intertwiners, which, however, simply commute with the modified right intertwiners; this is basically the defining property of local operators.

As a natural continuation, we expect to apply the same idea to uncover the underlying structure of the so-called vertex operator algebras, which should lead us to the corresponding deformed structure: an axiomatic formulation of quantum vertex operator algebras via the representation theory and the structure theory of quantum affine algebras.
The complete establishment of such a theory should provide a proper mathematical setting to understand massive quantum field theory in theoretical physics.

Acknowledgement. This paper is part of a dissertation under Professor I. Frenkel submitted to Yale University in May 1995. I would like to thank my advisor, Igor B. Frenkel, for his guidance and his constant and creative encouragement. I would like to thank Prof. M. Jimbo for his stimulating discussion and advice, especially as concerns the commutation relations of the intertwiners. I would like to thank the Sloan Foundation for the dissertation fellowship. I would also like to thank the referees for their kind advice.

References

[B1] Bernard, D.: Lett. Math. Phys. 165, 555–568 (1989)
[B2] Bernard, D.: Propos du calcul différentiel sur les groupes quantiques. Inst. H. Poincaré, Phys. Theory 56, No. 4 (1992)
[DO] Date, E. and Okado, M.: Calculation of excitation spectra of the spin model related with the vector representation of the quantized affine algebra of type $A_n^{(1)}$. Int. J. Mod. Phys. A 9, No. 3 (1994)
[DFJMN] Davies, B., Foda, O., Jimbo, M., Miwa, T. and Nakayashiki, A.: Diagonalization of the XXZ Hamiltonian by vertex operators. Commun. Math. Phys. 151, 89–153 (1993)
[Di] Ding, J.: Spinor representations of $U_q(\hat{o}(N))$. Lett. Math. Phys. 39, no. 1, 81–94 (1997)
[DF] Ding, J., Frenkel, I.B.: Isomorphism of two realizations of quantum affine algebra $U_q(\hat{gl}(n))$. Commun. Math. Phys. 156, 277–300 (1993)
[DF2] Ding, J. and Frenkel, I.B.: Spinor and oscillator representations of quantum groups. In: Lie Theory and Geometry in Honor of Bertram Kostant, Progress in Mathematics, 123, Boston: Birkhäuser, 1994
[D1] Drinfeld, V.G.: Hopf algebra and the quantum Yang–Baxter Equation. Dokl. Akad. Nauk. SSSR 283, 1060–1064 (1985)
[D2] Drinfeld, V.G.: Quantum Groups. ICM Proceedings, New York, Berkeley, pp. 798–820 (1986)
[D3] Drinfeld, V.G.: New realization of Yangian and quantum affine algebra. Soviet Math. Doklady 36, 212–216 (1988)
[FRT1] Faddeev, L.D., Reshetikhin, N.Yu. and Takhtajan, L.A.: Quantization of Lie groups and Lie algebras. Algebra and Analysis (Russian) 1.1, 118–206 (1989)
[FRT2] Faddeev, L.D., Reshetikhin, N.Yu. and Takhtajan, L.A.: Quantization of Lie groups and Lie algebras, Yang–Baxter equation in Integrable Systems. Advanced Series in Mathematical Physics, Vol. 10. Singapore: World Scientific, 1989, pp. 299–309
[FF] Feingold, A.J., Frenkel, I.B.: Classical affine Lie algebras. Adv. Math. 56, 117–172 (1985)
[FIJKMY] Foda, O., Iohara, H., Jimbo, M., Kedem, R., Miwa, T. and Yan, H.: Notes on highest weight Modules of the Elliptic Algebra $A_{p,q}(\hat{sl}_2)$. To appear in Quantum Field Theory, Integrable Models and Beyond, Suppl. Progr. Theor. Phys., Eds. T. Inami and R. Sasahi
[F] Frenkel, I.B.: Spinor representation of affine Lie algebras. Proc. Natl. Acad. Sci. USA 77, 6303–6306 (1980)
[F1] Frenkel, I.B.: Two constructions of affine Lie algebra representations and boson-fermion correspondence in quantum field theory. J. Funct. Anal. 44, 259–327 (1981)
[FK] Frenkel, I.B. and Kac, V.G.: Basic representations of affine Lie algebras and Dual Resonance Model. Invent. Math. 62, 23–66 (1980)
[FJ] Frenkel, I.B., Jing, N.: Vertex representations of quantum affine algebras. Proc. Natl. Acad. Sci. USA 85, 9373–9377 (1988)
[FLM] Frenkel, I.B., Lepowsky, J. and Meurman, A.: Vertex Operator Algebras and the Monster. Boston: Academic Press, 1988
[FR] Frenkel, I.B., Reshetikhin, N.Yu.: Quantum affine algebras and holomorphic difference equation. Commun. Math. Phys. 146, 1–60 (1992)
[G] Garland, H.: The arithmetic theory of loop groups. Publ. Math. IHES 52, 5–136 (1980)
[H] Hayashi, T.: Q-analogue of Clifford and Weyl algebras – spinor and oscillator representation of quantum enveloping algebras. Commun. Math. Phys. 127, 129–144 (1990)
[J1] Jimbo, M.: A q-difference analogue of U(g) and Yang–Baxter equation. Lett. Math. Phys. 10, 63–69 (1985)
[J2] Jimbo, M.: Quantum R-matrix for the generalized Toda systems. Commun. Math. Phys. 102, 537–548 (1986)
[JMMN] Jimbo, M., Miki, K., Miwa, T. and Nakayashiki, A.: Correlation functions of the XXZ model for $\Delta < -1$. Phys. Lett. A 168, 256–263 (1992)
[Ka] Kac, V.G.: Infinite dimensional Lie algebras. 3rd ed., Cambridge: Cambridge University Press, 1990
[KP] Kac, V.G. and Peterson, D.H.: Spinor and wedge representations of infinite-dimensional Lie algebras and groups. Proc. Natl. Acad. Sci. USA 78, 3308–3312 (1981)
[KKMMNN] Kang, S., Kashiwara, M., Misra, K., Miwa, T., Nakashima, T. and Nakayashiki, A.: Affine crystals and vertex models. Int. J. Mod. Phys. A 7 (Supp 1.1A), 449–484 (1992)
[Ko] Koyama, Y.: Staggered Polarization of Vertex Model with $U_q(\hat{sl}(n))$-symmetry. Commun. Math. Phys. 164, 277–291 (1994)
[L] Lusztig, G.: Quantum deformations of certain simple modules over enveloping algebras. Adv. Math. 70, 237–249 (1988)
[M] Miki, K.: Creation/annihilation operators and form factors of XXZ model. Phys. Lett. A 186, 217–224 (1994)
[R] Reshetikhin, N.Yu.: Quantized Universal Enveloping algebras, The Yang–Baxter equation and invariants of links I, II. LOMI, Preprint E-4-87, E-17-87, L: LOMI (1987–1988)
[RS] Reshetikhin, N.Yu., Semenov-Tian-Shansky, M.A.: Central Extensions of Quantum Current Groups. Lett. Math. Phys. 19, 133–142 (1990)
[S] Segal, G.: Unitary representation of some infinite dimensional groups. Commun. Math. Phys. 80, 301–342 (1981)
[Sm] Smirnov, F.A.: Introduction to quantum groups and integrable Massive Models of Quantum Field Theory. Nankai Lectures on Mathematical Physics, Mo-Lin Ge, Bao-Heng Zhao (eds.) Singapore: World Scientific, 1990
[TK] Tsuchiya, A. and Kanie, Y.: Vertex operators in conformal field theory on $P^1$ and monodromy representation of braid group. Adv. Stud. Pure Math. 16, 297–372 (1988)

Communicated by M. Jimbo

Commun. Math. Phys. 200, 421 – 444 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Harnack Type Inequality: the Method of Moving Planes

Yan Yan Li*

Department of Mathematics, Rutgers University, Piscataway, NJ 08854-8019, USA. E-mail: [email protected]

Received: 30 September 1997 / Accepted: 21 July 1998

Abstract: A Harnack type inequality is established for solutions to some semilinear elliptic equations in dimension two. The result is motivated by our approach to the study of some semilinear elliptic equations on compact Riemannian manifolds, which originated from some Chern–Simons Higgs model and have been studied recently by various authors.

0. Introduction

Let $(M, g)$ be a compact Riemann surface without boundary, $V$ be a positive function on $M$, and $W$ be a function with $\int_M W\,dv_g = 1$. Throughout the paper $dv_g$ denotes the volume element of $g$, and $\Delta_g$ denotes the Laplace–Beltrami operator with respect to $g$. For $\lambda \in \mathbb{R}$, we seek a solution of

$$-\Delta_g u = \lambda\Big(\frac{Ve^{u}}{\int_M Ve^{u}\,dv_g} - W\Big) \quad\text{on } M. \tag{$E_u)_\lambda$}$$

Clearly $\int_M W\,dv_g = 1$ is a necessary condition for $(E_u)_\lambda$ to have a solution. If we set $\xi = u - \log\int_M Ve^{u}\,dv_g$ for a solution of $(E_u)_\lambda$, then $\xi$ satisfies

$$-\Delta_g \xi = \lambda(Ve^{\xi} - W) \quad\text{on } M, \tag{$E_\xi)_\lambda$}$$

and

$$\int_M Ve^{\xi}\,dv_g = 1. \tag{1}$$
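The passage from $u$ to $\xi$ above, and the necessity of the normalization of $W$, can be written out in a few lines. This is only a sketch under the standing assumptions of the text ($M$ closed, $u$ a $C^2$ solution); the constant $c$ is a name introduced here for convenience, and the necessity statement is meaningful for $\lambda \neq 0$.

```latex
% c is an assumed shorthand for the constant log \int_M V e^u dv_g.
\begin{align*}
&\text{Necessity: integrate } (E_u)_\lambda \text{ over the closed surface } M: \\
&\qquad 0 = -\int_M \Delta_g u \, dv_g
        = \lambda\Big(\frac{\int_M V e^{u}\,dv_g}{\int_M V e^{u}\,dv_g}
          - \int_M W\,dv_g\Big)
        = \lambda\Big(1 - \int_M W\,dv_g\Big). \\[4pt]
&\text{Normalization: with } c = \log\int_M V e^{u}\,dv_g \text{ and } \xi = u - c, \\
&\qquad \Delta_g \xi = \Delta_g u, \qquad
        V e^{\xi} = \frac{V e^{u}}{\int_M V e^{u}\,dv_g}, \\
&\qquad\text{so } -\Delta_g \xi = \lambda\,(V e^{\xi} - W)
        \quad\text{and}\quad \int_M V e^{\xi}\,dv_g = 1 .
\end{align*}
```

Conversely, any solution $\xi$ of $(E_\xi)_\lambda$ satisfying (1) is itself a solution of $(E_u)_\lambda$, since the denominator in $(E_u)_\lambda$ then equals 1.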

* Partially supported by the Alfred P. Sloan Foundation Research Fellowship, NSF grant DMS-9706887, and a Rutgers University Research Council grant.


Equation $(E_u)_\lambda$ has been studied by Kazdan and Warner [29] in connection with the prescribed Gauss curvature problem, while it also arises from some Chern–Simons Higgs model as discussed in Taubes [40, 41], Hong, Kim and Pac [27], Jackiw and Weinberg [28], Spruck and Yang [38], Caffarelli and Yang [11], Tarantello [39], Struwe and Tarantello [35], Ding, Jost, Li and Wang [22, 23], and the references therein. Related problems are studied by Carleson and Chang in [14]. Such equations on bounded domains of $\mathbb{R}^2$ with Dirichlet boundary conditions play an important role in the context of statistical mechanics of point vortices in the mean field limit as discussed in Caglioti, Lions, Marchioro and Pulvirenti [12, 13] and Kiessling [30]. In particular, it is proved in [35], when $(M, g)$ is a flat torus with fundamental cell domain $[-\frac12, \frac12] \times [-\frac12, \frac12]$, $V \equiv 1$ and $W \equiv 1/\mathrm{vol}(M)$, that Eq. $(E_u)_\lambda$ has at least one nontrivial solution for $8\pi < \lambda < 4\pi^2$. On the other hand, Eq. $(E_u)_{8\pi}$, with $W \equiv 1/\mathrm{vol}(M)$, is studied in [22], where sufficient conditions are given for the existence of solutions. Such conditions obviously hold when $(M, g)$ is a flat two-dimensional torus, $V \equiv 1$ and $W \equiv 1/\mathrm{vol}(M)$. The author was recently informed by G. Tarantello that she and M. Nolasco have independently established the existence results in the special case that $(M, g)$ is a flat two-dimensional torus, $V \equiv 1$ and $W \equiv 1/\mathrm{vol}(M)$.

In view of our earlier work [31], we propose a different approach to study the existence of solutions of $(E_u)_\lambda$. Clearly $(E_u)_\lambda$ is invariant when replacing $u$ by $u + \text{constant}$. Assuming $V$ and $W$ are Lipschitz functions, it is well known that when $\lambda$ lies in compact subsets of $(-\infty, 8\pi)$, all solutions $u$ of $(E_u)_\lambda$, after a normalization $\int_M u\,dv_g = 0$, stay bounded in $C^{2,\alpha}(M)$ for $0 < \alpha < 1$. For $\lambda$ in compact subsets of $\cup_{m=1}^{\infty}\big(8\pi m, 8\pi(m+1)\big)$, the same conclusion holds due to the results of Brezis and Merle [8] and Li and Shafrir [32]. For $0 < \alpha < 1$, let

$$X_\alpha = \Big\{u \in C^{2,\alpha}(M) \;\Big|\; \int_M u\,dv_g = 0\Big\}.$$

$X_\alpha$, equipped with the $C^{2,\alpha}(M)$ norm, is a Banach space. We introduce an operator $K_\lambda : X_\alpha \to X_\alpha$ by

$$K_\lambda(u) = \lambda(-\Delta_g)^{-1}\Big(\frac{Ve^{u}}{\int_M Ve^{u}\,dv_g} - W\Big).$$

It follows from standard elliptic theories that $K_\lambda$ is a well defined compact operator. Equation $(E_u)_\lambda$ is equivalent to $(I - K_\lambda)u = 0$ in $X_\alpha$. For any bounded open set $O \subset X_\alpha$, the Leray-Schauder degree $\deg(I - K_\lambda, O, 0)$ is well defined provided $0$ does not belong to $(I - K_\lambda)(\partial O)$. For the definition of the Leray-Schauder degree and its various properties, see, for example, Nirenberg [34]. Let $B_a = \{u \in X_\alpha \mid \|u\|_{X_\alpha} < a\}$ denote the ball in $X_\alpha$. Due to the above mentioned a priori estimates of solutions of $(E_u)_\lambda$ for $\lambda$ in compact subsets of $\mathbb{R} \setminus \cup_{m=1}^{\infty}\{8\pi m\}$, there exists some continuous function $a_\lambda$ defined in $\mathbb{R} \setminus \cup_{m=1}^{\infty}\{8\pi m\}$ such that for all $\lambda \in \mathbb{R} \setminus \cup_{m=1}^{\infty}\{8\pi m\}$ and $a > a_\lambda$,

$$d_\lambda := \deg(I - K_\lambda, B_a, 0) \tag{2}$$

is well defined and, in view of the homotopy invariance of the Leray-Schauder degree, is independent of $a$ as long as $a > a_\lambda$. Moreover $d_\lambda$ is a constant in each interval $(8\pi m, 8\pi(m+1))$. The piecewise constant function $d_\lambda$ is determined by the Euler number of $M$. We know that $d_\lambda$ is equal to 1 for $\lambda < 8\pi$. However, due to the possible loss of


compactness of solutions of $(E_u)_\lambda$ when $\lambda$ crosses $8\pi m$, we do not know yet the values of $d_\lambda$ in the other intervals. Knowing the values of $d_\lambda$ should lead to new existence results for $(E_u)_\lambda$ since $d_\lambda \neq 0$ implies that $(E_u)_\lambda$ has at least one solution. The situation here is similar to that in [31] where

$$-\Delta u + 2u = \frac16 K(x)u^3, \quad u > 0, \quad\text{on } S^4 \tag{3}$$

is studied. Let $\mathcal{A}$ be the open and dense subset of $C^2(S^4)^+$, the set of positive twice differentiable functions, defined in [31]. For any Morse function $K \in \mathcal{A}$, let $K_\lambda = (1 - \lambda) + \lambda K$ for $0 \le \lambda \le 1$. It is not difficult to see from [31] that there exist $0 < \lambda_1 < \cdots < \lambda_l < 1$ such that for all $\lambda \in (0, 1] \setminus \{\lambda_1, \cdots, \lambda_l\}$ the total Leray-Schauder degree $d_\lambda$ of all possible solutions of (3) with $K = K_\lambda$ is well defined and is a constant function of $\lambda$ in each interval $(\lambda_m, \lambda_{m+1})$. Since $d_1 \neq 0$ implies that (3) has at least one solution, we wish to have a formula of $d_1$ in terms of $K$. The formula of $d_\lambda$ for small $\lambda$ is known due to the work of Chang and Yang [16]. So, one way to derive a formula of $d_1$ is to calculate the jump-values of $d_\lambda$ at $\lambda_m$ for $1 \le m \le l$. These jump-values can be calculated by using the strong pointwise estimates in [31] of blowup solutions $u_\lambda$, solutions of (3) with $K = K_\lambda$, as $\lambda \to \lambda_m$. Once these jump-values are known, we have a formula of $d_1$ in terms of $K$. This provides an alternative derivation of the formula of $d_1$ obtained in [31].

We propose to take a similar approach to study $(E_u)_\lambda$, namely, to look for a formula of $d_\lambda$ in terms of the Euler number of $M$. Since we know $d_\lambda = 1$ for $\lambda < 8\pi$, we only need to calculate the jump-values of $d_\lambda$ at $8\pi m$, $m \ge 1$. In view of the results in [31], we tend to believe that a good enough pointwise estimate of blowup solutions $\{u_\lambda\}$ as $\lambda \to 8\pi m$ is the most crucial step in evaluating the jump-value of $d_\lambda$ at $8\pi m$. Once we know the jump-values for $m$ less than some $m_0$, we obtain a formula of $d_\lambda$ in $(-\infty, 8\pi m_0) \setminus \cup_{m=1}^{m_0-1}\{8\pi m\}$. The main purpose of this paper is to start making good pointwise estimates for blowup solutions $\{u_\lambda\}$ as $\lambda \to 8\pi m$. The main analytical result of this paper is a new local estimate given in Theorem 0.3. We first state a well known fact in the subcritical case $\lambda < 8\pi$.

Theorem 0.1 (well known). Let $(M, g)$ be a compact Riemann surface, $V$ be a positive continuous function on $M$, $W \in L^\infty(M)$ with $\int_M W\,dv_g = 1$. Then for all $\epsilon > 0$, $-\epsilon^{-1} \le \lambda \le 8\pi - \epsilon$, and all $C^2$ solutions $u$ of $(E_u)_\lambda$ with $\int_M u\,dv_g = 0$, we have $\|u\|_{L^\infty(M)} \le C$, where $C$ depends only on $M$, $g$, $\epsilon$, $\|V\|_{L^\infty(M)}$, $\|W\|_{L^\infty(M)}$, the modulus of continuity of $V$, and the positive lower bound of $V$.

If both $V$ and $W$ are Lipschitz functions, then it is well known that any $C^2$ solution of $(E_u)_\lambda$ is actually in $C^{2,\alpha}(M)$ for all $0 < \alpha < 1$, and $C^{2,\alpha}$ estimates of $u$ follow from the $L^\infty$ estimates. Thus, we have

Corollary 0.1. In addition to the hypothesis in Theorem 0.1, assume that both $V$ and $W$ are Lipschitz functions. Then for all $\epsilon > 0$, $-\epsilon^{-1} \le \lambda \le 8\pi - \epsilon$, $0 < \alpha < 1$, and all $C^2$ solutions $u$ of $(E_u)_\lambda$ with $\int_M u\,dv_g = 0$, we have $\|u\|_{C^{2,\alpha}(M)} \le C$, where $C$ depends only on $M$, $g$, $\epsilon$, $\alpha$, $\|V\|_{L^\infty(M)}$, $\|\nabla V\|_{L^\infty(M)}$, $\|\nabla W\|_{L^\infty(M)}$, and the positive lower bound of $V$.


Corollary 0.2. In addition to the hypothesis in Theorem 0.1, we assume that both $V$ and $W$ are Lipschitz functions. Then $d_\lambda = 1$ for all $\lambda < 8\pi$. Consequently, $(E_u)_\lambda$ has at least one solution for every $\lambda < 8\pi$.

Remark 0.1. The existence of one solution to $(E_u)_\lambda$ for $\lambda < 8\pi$ can easily be established by variational methods using the following consequence of the Moser-Trudinger inequality: For every $\epsilon > 0$,

$$\log\int_M e^{w}\,dv_g \le \Big(\frac{1}{16\pi} + \epsilon\Big)\int_M |\nabla_g w|^2\,dv_g + \frac{1}{\mathrm{vol}(M)}\int_M w\,dv_g + C(\epsilon) \quad \forall\, w \in H^1(M).$$

See, for example, [33, 29] and [22] for more details.

For $\lambda \ge 8\pi$, Eq. $(E_u)_\lambda$ is much more delicate. The difficulty lies in possible loss of compactness of solutions of $(E_u)_\lambda$ for $\lambda \ge 8\pi$. A good understanding of possible blowup behavior of solutions of $(E_u)_\lambda$ is important in the study of $(E_u)_\lambda$. Our next theorem gives some understanding of the possible blowup behavior of solutions of $(E_u)_\lambda$, which should be relevant in the study of the existence of solutions of $(E_u)_\lambda$ for $\lambda \ge 8\pi$. Let $\{V_n\}$ satisfy

$$\liminf_{n\to\infty}\,\min_M V_n > 0, \qquad \limsup_{n\to\infty}\big(\max_M V_n + \|\nabla V_n\|_{L^\infty(M)}\big) < \infty, \tag{4}$$

$\{W_n\}$ satisfy

$$\limsup_{n\to\infty}\|\nabla W_n\|_{L^\infty(M)} < \infty, \qquad \int_M W_n\,dv_g = 1, \tag{5}$$

and $\{\xi_n\}$ satisfy

$$-\Delta_g \xi_n = \lambda_n(V_n e^{\xi_n} - W_n) \quad\text{on } M, \tag{6}$$

and

$$\int_M V_n e^{\xi_n}\,dv_g = 1. \tag{7}$$

We will use $d(x, y)$ to denote the distance between $x$ and $y$ in $M$ and will use the notation

$$\bar\xi_n = \frac{1}{\mathrm{vol}(M)}\int_M \xi_n\,dv_g$$

to denote the average of $\xi_n$ on $M$. Let $G(x, y)$ denote the Green's function of $-\Delta_g$ on $M$, namely,

$$-\Delta_x G(x, y) = \delta_y - \frac{1}{\mathrm{vol}(M)} \ \text{ in } M, \qquad \int_M G(x, y)\,dv_g(x) = 0.$$

It is well known (see, e.g., [2]) that $G(x, y)$ is uniquely defined, symmetric in $x$ and $y$, and a solution of (6) satisfies

$$\xi_n(x) - \bar\xi_n = \lambda_n\int_M \big(V_n(y)e^{\xi_n(y)} - W_n(y)\big)G(x, y)\,dv_g(y), \quad \forall\, x \in M. \tag{8}$$
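The representation formula (8) follows from the defining properties of $G$ in one line of integration by parts. The sketch below assumes only what the text states ($G$ symmetric, $M$ closed, $\xi_n \in C^2(M)$) and treats the Dirac mass formally.

```latex
% Fix x in M. Use symmetry G(x,y) = G(y,x), so that
% -\Delta_y G(x,y) = \delta_x - 1/vol(M), and Green's identity on closed M.
\begin{align*}
\int_M G(x,y)\,\big(-\Delta_g \xi_n\big)(y)\,dv_g(y)
  &= \int_M \big(-\Delta_y G(x,y)\big)\,\xi_n(y)\,dv_g(y) \\
  &= \xi_n(x) - \frac{1}{\mathrm{vol}(M)}\int_M \xi_n\,dv_g
   = \xi_n(x) - \bar\xi_n .
\end{align*}
% Substituting (6), -\Delta_g \xi_n = \lambda_n (V_n e^{\xi_n} - W_n),
% on the left-hand side gives exactly (8).
```

Note that the right-hand side of (8) is unchanged if a constant is added to $G(x,\cdot)$, because $\int_M (V_n e^{\xi_n} - W_n)\,dv_g = 0$ by (5) and (7).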


Theorem 0.2. Let $(M, g)$ be a compact Riemann surface, $\{V_n\}$ and $\{W_n\}$ satisfy (4) and (5), $\lambda_n \to \lambda \in (-\infty, \infty)$, and $\{\xi_n\} \subset C^2(M)$ satisfy (6) and (7). Assume

$$\max_M |\xi_n| \to \infty. \tag{9}$$

Then after passing to a subsequence (still denoted as $\{\xi_n\}$), there exist $m$ distinct points $\{x^{(l)}\}_{1\le l\le m}$ in $M$ and $m$ sequences of points $x_n^{(l)} \to x^{(l)}$ such that

(a) $\xi_n \to -\infty$ uniformly on compact subsets of $M \setminus \{x^{(1)}, \cdots, x^{(m)}\}$.

(b) For each $1 \le l \le m$, and $n$ large, $x_n^{(l)}$ is the unique maximum point of $\xi_n$ in $\{x \in M \mid d(x, x^{(l)}) \le \frac12\min_{l'\ne l}\mathrm{dist}(x^{(l')}, x^{(l)})\}$, and $\xi_n(x_n^{(l)}) \to \infty$.

(c) For each $1 \le l \le m$, let $g = e^{\varphi_n}(dx_1^2 + dx_2^2)$ be an isothermal coordinate system (with $\varphi_n(0) = 0$) centered at $x_n^{(l)}$. We have, for some constant $C$ independent of $n$,

$$\Big|\xi_n(x) - \log\frac{e^{\xi_n(0)}}{\big(1 + \frac{\lambda_n V_n(0)}{8}e^{\xi_n(0)}|x|^2\big)^2}\Big| \le C, \quad \forall\, |x| \le \frac14\min_{l'\ne l}\mathrm{dist}(x^{(l)}, x^{(l')}) \text{ and } \forall\, n.$$

(d) For some constant $C$ independent of $n$,

$$\max_{1\le l\le m}|\xi_n(x_n^{(l)}) + \bar\xi_n| \le C.$$

(e) In $C^2_{loc}(M \setminus \{x^{(1)}, \cdots, x^{(m)}\})$,

$$\xi_n - \bar\xi_n \to 8\pi\sum_{l=1}^{m} G(\cdot, x^{(l)}) - 8\pi m\int_M W(y)G(\cdot, y)\,dv_g(y),$$

where $W = \lim_{n\to\infty} W_n$ weak $*$ in $L^\infty(M)$. Consequently,

$$\lambda_n V_n e^{\xi_n} \rightharpoonup 8\pi\sum_{l=1}^{m}\delta_{x^{(l)}} \ \text{ in the sense of measure, and } \lambda = 8\pi m,$$

where $\delta_{x^{(l)}}$ denotes the Delta mass at $x^{(l)}$.

Remark 0.2. Theorem 0.2 still holds when we replace the metric $g$ by a sequence of metrics $g_n$ converging to $g$ in the $C^2$ norm. This can be seen easily from the proof of Theorem 0.2.

Due to Theorem 0.2 and Theorem 0.1, there exists some continuous function $a_\lambda$ defined in $\mathbb{R} \setminus \cup_{m=1}^{\infty}\{8\pi m\}$ such that $d_\lambda$ in (2) is well defined for all $\lambda \in \mathbb{R} \setminus \cup_{m=1}^{\infty}\{8\pi m\}$ and $a > a_\lambda$. Furthermore, in view of Remark 0.2, $d_\lambda$ is independent of the metric $g$. Therefore, in view of the homotopy invariance of the Leray-Schauder degree, $d_\lambda$ is a constant in each of the open intervals, and all these constants are independent of $V$, $W$ and the metric $g$. So $d_\lambda$ is a piecewise constant function of $\lambda$ determined completely by the Euler number of $M$. We know from Corollary 0.2 that $d_\lambda$ is equal to 1 for $\lambda < 8\pi$, but we do not know yet the values of $d_\lambda$ in other intervals. Knowing the values of $d_\lambda$ should lead to new existence results for $(E_u)_\lambda$. As mentioned earlier we wish to calculate the jump-value of $d_\lambda$ at $8\pi m$. In view of the results in [31], Theorem 0.2, providing pointwise estimates of $\{\xi_n\}$, should be useful in evaluating the jump-value of $d_\lambda$ at $8\pi m$.

Let $\{\xi_n\}$ be the subsequence in Theorem 0.2 satisfying (a)-(e). In an isothermal coordinate system centered at $x_n^{(l)}$, we set


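$$v_n(x) = \xi_n(\delta_n^{(l)}x) + 2\log\delta_n^{(l)}, \qquad |x| < a/\delta_n^{(l)},$$

where $\delta_n^{(l)} = e^{-\xi_n(x_n^{(l)})/2}$ and $a$ is some suitably small positive constant. It will be shown by a blow up argument that

$$v_n \to v \quad\text{in } C^2_{loc}(\mathbb{R}^2)$$

with

$$v(x) = \log\Big\{\frac{1}{\big(1 + \frac{\lambda\,\lim_{n\to\infty}V_n(0)}{8}|x|^2\big)^2}\Big\} \quad\text{in } \mathbb{R}^2. \tag{10}$$

Consequently,

$$R_n^{(l)} := \sup\{R > 0 : \|v_n - v\|_{C^2(B_{2R}(0))} + \|v_n - v\|_{H^2(B_{2R}(0))} < e^{-R}\} \to \infty.$$

This shows that $\xi_n(x)$ is very well approximated by

$$\log\Big\{\frac{e^{\xi_n(x_n^{(l)})}}{\big(1 + \frac{\lambda_n V_n(0)}{8}e^{\xi_n(x_n^{(l)})}|x|^2\big)^2}\Big\}$$

in $|x| \le R_n^{(l)}\delta_n^{(l)}$. For $R_n^{(l)}\delta_n^{(l)} \le |x| \le \frac12\min_{l'\ne l}\mathrm{dist}(x^{(l')}, x^{(l)})$, we will give, using (c), some convergence estimate better than (e). For convenience, we use the notation $\zeta_n \sim 0$ to denote a sequence of functions $\{\zeta_n\}$ in $C^2(M)$ satisfying

$$\lim_{n\to\infty}\max\Big\{\frac{|\zeta_n(x)|}{1 + \sum_{j=1}^{m}|\log d(x, x_n^{(j)})|} : x \in M \setminus \cup_{l=1}^{m} B_{R_n^{(l)}\delta_n^{(l)}}(x_n^{(l)})\Big\} = 0, \tag{11}$$

$$\lim_{n\to\infty}\max\Big\{\frac{|\nabla\zeta_n(x)|}{\sum_{j=1}^{m} d(x, x_n^{(j)})^{-1}} : x \in M \setminus \cup_{l=1}^{m} B_{R_n^{(l)}\delta_n^{(l)}}(x_n^{(l)})\Big\} = 0, \tag{12}$$

and

$$\lim_{n\to\infty}\max\Big\{\frac{|\nabla^2\zeta_n(x)|}{\sum_{j=1}^{m} d(x, x_n^{(j)})^{-2}} : x \in M \setminus \cup_{l=1}^{m} B_{R_n^{(l)}\delta_n^{(l)}}(x_n^{(l)})\Big\} = 0. \tag{13}$$

We also write $\zeta_n \sim_0 0$ for (11), $\zeta_n \sim_1 0$ for (12), and $\zeta_n \sim_2 0$ for (13).

Corollary 0.3. Let $\{\xi_n\}$ be the subsequence in Theorem 0.2 satisfying (a)-(e). Then

$$\varphi_n := \xi_n - \bar\xi_n - 8\pi\sum_{l=1}^{m} G(\cdot, x_n^{(l)}) + 8\pi m\int_M W(y)G(\cdot, y)\,dv_g(y) \sim 0.$$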
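The limiting profile $v$ in (10) can be checked by hand, and the same computation explains where the mass $8\pi$ per blowup point in Theorem 0.2(e) comes from. The following is a verification sketch only: the symbols $V_0 = \lim_{n\to\infty}V_n(0)$, $c = \lambda V_0/8$ and $r = |x|$ are shorthand introduced here, and $c > 0$ is assumed (which holds in the blowup regime, where $\lambda = 8\pi m > 0$).

```latex
% Assumed shorthand: V_0 = lim V_n(0), c = \lambda V_0 / 8 > 0, r = |x|.
% Then v(r) = -2 log(1 + c r^2), and for radial functions on R^2,
% \Delta v = v'' + v'/r.
\begin{align*}
v'(r) &= -\frac{4cr}{1+cr^2}, \qquad
v''(r) = -\frac{4c\,(1-cr^2)}{(1+cr^2)^2}, \\
\Delta v &= -\frac{4c\,(1-cr^2)}{(1+cr^2)^2} - \frac{4c\,(1+cr^2)}{(1+cr^2)^2}
          = -\frac{8c}{(1+cr^2)^2}, \\
-\Delta v &= \frac{8c}{(1+cr^2)^2} = \lambda V_0\, e^{v}
          \quad\text{in } \mathbb{R}^2, \\
\int_{\mathbb{R}^2} \lambda V_0\, e^{v}\,dx
  &= 8c \cdot 2\pi \int_0^\infty \frac{r\,dr}{(1+cr^2)^2}
   = 8c \cdot 2\pi \cdot \frac{1}{2c} = 8\pi .
\end{align*}
```

So $v$ solves the limiting equation $-\Delta v = \lambda V_0 e^{v}$ on the whole plane and carries total mass exactly $8\pi$, independently of $c$.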

Theorem 0.2 will be deduced from some local results on the behavior of blowup solutions to equations of the type $-\Delta u = Ve^{u}$ in domains of $\mathbb{R}^2$. In particular, a new local estimate, Theorem 0.3, is needed in the proof of Theorem 0.2. We first recall some known results. Let $\Omega \subset \mathbb{R}^2$ be a bounded smooth domain, $0 \in \Omega$, and $\{V_n\}$ be a sequence of Lipschitz continuous functions satisfying

$$0 < a \le V_n(x) \le b < \infty, \quad \forall\, x \in \Omega, \tag{14}$$

and

$$|\nabla V_n(x)| \le A, \quad \forall\, x \in \Omega, \tag{15}$$

where $a$, $b$ and $A$ are positive constants. Consider

$$-\Delta u_n = V_n e^{u_n} \quad\text{in } \Omega, \tag{16}$$

and let $\{u_n\}$ be a sequence of $C^2$ solutions of (16) satisfying

$$\limsup_{n\to\infty}\int_\Omega V_n e^{u_n} < \infty. \tag{17}$$

It follows from Theorem 3 in Brezis and Merle [8] that, under (14)–(17), there are only three alternatives after passing to a subsequence:

1. $\{u_n\}$ uniformly converges on compact subsets of $\Omega$,
2. $\{u_n\}$ tends to $-\infty$ uniformly on compact subsets of $\Omega$,
3. There exist finitely many blowup points $\{x^{(1)},\cdots,x^{(l)}\}$ of $\{u_n\}$ such that $\{u_n\}$ tends to $-\infty$ uniformly on compact subsets of $\Omega\setminus\{x^{(1)},\cdots,x^{(l)}\}$, and
\[
V_n e^{u_n}\rightharpoonup\sum_{i=1}^{l}\alpha_i\delta_{x^{(i)}}\qquad\text{in the sense of measure,}
\]
with $\alpha_i\ge 4\pi$.

Here $\delta_{x^{(i)}}$ is the Dirac mass at $x^{(i)}$. We recall that a point $y$ is called a blowup point of $\{u_n\}$ if there exist $y_n\to y$ such that $u_n(y_n)\to\infty$ as $n\to\infty$. It was conjectured in [8] that each $\alpha_i$ can be written as $\alpha_i=8\pi m_i$ for some positive integer $m_i$. This was established by Li and Shafrir in [32]. Chen further demonstrated in [20] that any positive integer $m_i$ can occur in such local situations.

Under (14) and (15), the following Harnack type inequality is proved by Brezis, Li and Shafrir in [9] through the method of moving planes: Every solution of (16) satisfies, on any compact subset $K$ of $\Omega$,
\[
\sup_K u_n+\inf_\Omega u_n\le C(a,b,A,K,\Omega). \tag{18}
\]

It is raised as an open question in [9] whether the above Harnack type inequality still holds when replacing $\|\nabla V_n\|_{L^\infty(\Omega)}$ by $\|V_n\|_{C^\alpha(\Omega)}$ $(0<\alpha<1)$. The answer is affirmative due to some recent work of Chen and Lin [19].

Now we are ready to state our new local estimate, which is essentially equivalent to a Harnack type estimate $|\sup u_n+\inf u_n|\le C$ under additional hypotheses (20) and (21) below. These additional hypotheses are necessary for such an estimate to hold. We will further assume that
\[
u_n(0)=\max_\Omega u_n\to\infty, \tag{19}
\]
and
\[
V_n e^{u_n}\rightharpoonup\alpha\delta_0\qquad\text{in } \Omega, \text{ in the sense of measure,} \tag{20}
\]
where $\alpha>0$ is a constant and $\delta_0$ is the Dirac mass at the origin.

Theorem 0.3. In addition to (14)–(16) and (19)–(20), we assume that
\[
\max_{\partial\Omega}u_n-\min_{\partial\Omega}u_n\le A_1 \tag{21}
\]
for some positive constant $A_1$. Then for some constant $C$ independent of $n$, we have
\[
\left|u_n(x)-\log\frac{e^{u_n(0)}}{\left(1+\frac{V_n(0)}{8}e^{u_n(0)}|x|^2\right)^2}\right|\le C,\qquad \forall\, x\in\Omega \text{ and } \forall\, n. \tag{22}
\]

Theorem 0.3 will be proved by the method of moving planes, which has become a very powerful and convenient tool in the study of nonlinear elliptic partial differential equations starting from the pioneering works of A.D. Alexandrov [1], Serrin [37], and Gidas, Ni and Nirenberg [25, 26]. The method has been further developed in a series of papers by Berestycki, Nirenberg and their collaborators [3]–[7], and Caffarelli, Gidas and Spruck [10]. Many more applications of the moving plane method have been given by various authors. The method of moving planes was used to obtain some Harnack type inequalities by Schoen in [36], subsequently by Brezis, Li, and Shafrir in [9], and by Chen and Lin in [18]. Our proof of Theorem 0.3 requires some new ingredients.

1. Compactness and Existence for $\lambda\in(-\infty,8\pi)$

Throughout this section $V$ is a positive continuous function on $M$ and $W\in L^\infty(M)$ with $\int_M W\,dv_g=1$.

Lemma 1.1. Let $\varepsilon>0$ and $\xi$ satisfy $(E_\xi)_\lambda$ and (1) with $-\varepsilon^{-1}\le\lambda\le 8\pi-\varepsilon$. Then
\[
\max_M|\xi|\le C
\]
for some constant $C$ depending only on $M$, $g$, $\varepsilon$, $\|W\|_{L^\infty(M)}$, $\|V\|_{L^\infty(M)}$, the positive lower bound of $V$, and the modulus of continuity of $V$.

Proof. Suppose the contrary. Then there exist $\{V_n\}$ converging to some positive function in $C(M)$, $\{W_n\}$ bounded in $L^\infty(M)$, $\lambda_n\to\lambda\in[-\varepsilon^{-1},8\pi-\varepsilon]$, and $\xi_n$ with $\int_M V_n e^{\xi_n}\,dv_g=1$, satisfying $(E_\xi)_{\lambda_n}$ with $V=V_n$ and $W=W_n$, but $\max_M|\xi_n|\to\infty$. Let $y_n$ be a maximum point of $\xi_n$. If $\xi_n(y_n)\to\infty$ along a subsequence (still denoted as $\{\xi_n\}$), we work in some isothermal coordinate system $x=(x_1,x_2)$ centered at $y_n$. Without loss of generality, we may assume $y_n\to y$. In a neighborhood of $y$, $g=e^{\varphi_n}(dx_1^2+dx_2^2)$, where $\varphi_n(0)=0$ and $\{\varphi_n\}$ converges in the neighborhood with respect to $C^2$ norms, and, in the neighborhood, the equation of $\xi_n$ takes the form
\[
-\Delta\xi_n=\lambda_n\left(V_n e^{\varphi_n}e^{\xi_n}-e^{\varphi_n}W_n\right),
\]
where $\Delta=\partial_{x_1x_1}+\partial_{x_2x_2}$. Consider
\[
v_n(x)=\xi_n(\delta_n x)+2\log\delta_n,\qquad |x|<a\delta_n^{-1},
\]
where $\delta_n=e^{-\xi_n(0)/2}\to 0$ and $a>0$ is some constant. Clearly $v_n$ satisfies
\[
\begin{cases}
-\Delta v_n(x)=\lambda_n\left(V_n(\delta_n x)e^{\varphi_n(\delta_n x)}e^{v_n(x)}-\delta_n^2e^{\varphi_n(\delta_n x)}W_n(\delta_n x)\right), & |x|<a\delta_n^{-1},\\[2pt]
\int_{|x|\le a\delta_n^{-1}}V_n(\delta_n x)e^{\varphi_n(\delta_n x)}e^{v_n(x)}\le 1, &\\[2pt]
v_n(x)\le v_n(0)=0, & |x|<a\delta_n^{-1}.
\end{cases}
\]

For any $R>1$, let $f_n$ be the solution of
\[
\begin{cases}
-\Delta f_n(x)=\lambda_n\left(V_n(\delta_n x)e^{\varphi_n(\delta_n x)}e^{v_n(x)}-\delta_n^2e^{\varphi_n(\delta_n x)}W_n(\delta_n x)\right), & |x|<R,\\
f_n(x)=0, & |x|=R.
\end{cases}
\]
Then $|f_n|$ is bounded from above by some constant $C=C(R)$ in $|x|\le R$, so $C+f_n-v_n$ is a nonnegative harmonic function in $|x|\le R$ with value at the origin not larger than $2C$. The Harnack inequality yields the upper bound of $C+f_n-v_n$ in $|x|\le R/2$, which in turn yields the lower bound of $v_n$ in $|x|\le R/2$. Therefore, after passing to a subsequence, we have, by applying $W^{2,p}$ estimates to $v_n$, that
\[
v_n\to v\qquad\text{in } C^1_{loc}(\mathbb{R}^2),
\]
where $v$ satisfies, in the distribution sense,
\[
\begin{cases}
-\Delta v=\lambda\left(\lim_{n\to\infty}V_n(0)\right)e^v, & \text{in } \mathbb{R}^2,\\[2pt]
\left(\lim_{n\to\infty}V_n(0)\right)\int_{\mathbb{R}^2}e^v\le 1, &\\[2pt]
v(x)\le v(0)=0, & \text{in } \mathbb{R}^2.
\end{cases} \tag{23}
\]
In fact, due to standard elliptic estimates, $v\in C^2(\mathbb{R}^2)$. It is easy to show (see the Appendix) that there is no solution to (23) if $\lambda\le 0$, so $\lambda>0$. On the other hand, due to the classification of all solutions of (23) (see, for example, [19, 21] and [15]), we know that $v$ is the function given in (10). It follows that
\[
\lambda\left(\lim_{n\to\infty}V_n(0)\right)\int_{\mathbb{R}^2}e^v=8\pi.
\]
Consequently, $\lambda\ge 8\pi$. This is a contradiction. Thus $\{\xi_n\}$ is bounded from above and $-\xi_n(\hat y_n)=-\min_M\xi_n\to\infty$ for some $\hat y_n\in M$. Without loss of generality, we may assume $\hat y_n\to\hat y$. Let $\Omega\subset M$ be any smooth open connected set containing $\hat y$, $\partial\Omega\ne\emptyset$. Define $\eta_n$ by
\[
\begin{cases}
-\Delta_g\eta_n=\lambda_n\left(V_ne^{\xi_n}-W_n\right), & \text{in } \Omega,\\
\eta_n=0, & \text{on } \partial\Omega.
\end{cases}
\]
In view of the upper bound of $\xi_n$, we derive from standard elliptic estimates that $\{\eta_n\}$ is uniformly bounded in $\Omega$. Let $w_n=\xi_n-\eta_n$; then $w_n$ satisfies
\[
-\Delta_g w_n=0,\qquad w_n\le C,\qquad\text{in } \Omega.
\]
Applying the Harnack inequality to $C-w_n$ on compact subsets of $\Omega$, we have, in view of $C-w_n(\hat y_n)\to\infty$, that $C-w_n\to\infty$ uniformly on compact subsets of $\Omega$. Namely, $\xi_n\to-\infty$ uniformly on compact subsets of $\Omega$. Since $\Omega$ can be chosen arbitrarily, $\xi_n\to-\infty$ uniformly on $M$, which violates $\int_M V_ne^{\xi_n}=1$. Lemma 1.1 is established.

Theorem 0.1 can be deduced from Lemma 1.1 as follows.

Proof of Theorem 0.1. Set $\xi=u-\log\int_M Ve^u\,dv_g$. We know from Lemma 1.1 that $|\xi|\le C$ on $M$. Since $\int_M u\,dv_g=0$, $u$ vanishes somewhere in $M$. It follows that $\left|\log\int_M Ve^u\,dv_g\right|\le C$. Consequently, $|u|\le C$.

2. A New Local Estimate by the Method of Moving Planes

In this section we establish Theorem 0.3 by the method of moving planes. Let $G(x,y)$ be the Green's function of $-\Delta$ in $\Omega\subset\mathbb{R}^2$ with respect to the zero boundary condition:
\[
\begin{cases}
-\Delta_xG(x,y)=\delta_y, & \text{in } \Omega,\\
G(x,y)=0, & x\in\partial\Omega.
\end{cases}
\]
Consider
\[
\tilde u_n(x)=\int_\Omega G(x,y)V_n(y)e^{u_n(y)}\,dy.
\]
Namely, $\tilde u_n$ is the solution of
\[
\begin{cases}
-\Delta\tilde u_n=V_ne^{u_n}, & \text{in } \Omega,\\
\tilde u_n=0, & \text{on } \partial\Omega.
\end{cases}
\]

Lemma 2.1. Under the hypotheses of Theorem 0.3, for all $r>0$,
\[
\tilde u_n(x)\to\alpha G(x,0)\qquad\text{in } C^1(\overline\Omega\setminus B_r).
\]

Proof. Write
\[
\tilde u_n(x)=G(x,0)\int_\Omega V_n(y)e^{u_n(y)}\,dy+\int_\Omega[G(x,y)-G(x,0)]V_n(y)e^{u_n(y)}\,dy.
\]
As $y\to 0$, $G(x,y)-G(x,0)\to 0$ uniformly for $x\in\overline\Omega\setminus B_r$. Consequently, using (20),
\[
\tilde u_n(x)\to\alpha G(x,0)\qquad\text{in } C^0(\overline\Omega\setminus B_r).
\]
On the other hand, we have
\[
\nabla\tilde u_n(x)=\int_\Omega\nabla_xG(x,y)V_n(y)e^{u_n(y)}\,dy,
\]
and, as $y\to 0$, $\nabla_xG(x,y)-\nabla_xG(x,0)\to 0$ uniformly for $x\in\overline\Omega\setminus B_r$. The convergence of $\nabla\tilde u_n(x)$ to $\alpha\nabla_xG(x,0)$, and hence the $C^1$ convergence of $\tilde u_n$, follows immediately.

Lemma 2.2. Under the hypotheses of Theorem 0.3, for all $r>0$, there exists some constant $C=C(r,\Omega,a,b,A,A_1,\alpha)$ such that
\[
\max_{\overline\Omega\setminus B_r}u_n-\min_{\overline\Omega\setminus B_r}u_n\le C.
\]

Proof. It follows from Lemma 2.1 and (21) that the oscillation of $u_n-\tilde u_n$ on $\partial\Omega$ is bounded. Since $u_n-\tilde u_n$ is a harmonic function, it follows from the maximum principle that the oscillation of $u_n-\tilde u_n$ in $\Omega$ is bounded. Lemma 2.2 follows.

Due to Lemma 2.2, we only need to establish Theorem 0.3 for a special case: $\Omega=B_1$ is the unit ball in $\mathbb{R}^2$. Without loss of generality, we assume that $V_n(0)=8$. Set $\delta_n=e^{-u_n(0)/2}$,
\[
v_n(x)=u_n(\delta_nx)+2\log\delta_n,\qquad\text{for } |x|\le 1/\delta_n,
\]
\[
w_n(x)=v_n(x)+2\log|x|,\qquad\text{for } |x|\le 1/\delta_n.
\]
It is clear that $v_n$ satisfies
\[
\begin{cases}
-\Delta v_n(x)=V_n(\delta_nx)e^{v_n(x)}, & \text{for } |x|\le 1/\delta_n,\\
v_n(x)\le v_n(0)=0, & \text{for } |x|\le 1/\delta_n.
\end{cases}
\]
Arguing as in Sect. 1,
\[
v_n\to v\qquad\text{in } C^2_{loc}(\mathbb{R}^2), \tag{24}
\]
and therefore
\[
w_n-w\to 0\qquad\text{in } C^2_{loc}(\mathbb{R}^2), \tag{25}
\]
where
\[
v(x)=\log\frac{1}{(1+|x|^2)^2},\qquad w(x)=\log\frac{|x|^2}{(1+|x|^2)^2}.
\]
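The limiting profile $v$ above can be checked numerically. The following is a minimal sketch, not part of the original argument: under the normalization $V_n(0)=8$ it tests, by finite differences, that $v(x)=\log\frac{1}{(1+|x|^2)^2}$ satisfies $-\Delta v=8e^v$, and that the total mass $\int_{\mathbb{R}^2}8e^v$ is close to $8\pi$. The function names, step size `h`, and truncation radius `r_max` are illustrative assumptions.

```python
import math

def v(x1, x2):
    """Limiting profile v(x) = log(1 / (1 + |x|^2)^2)."""
    return -2.0 * math.log(1.0 + x1 * x1 + x2 * x2)

def neg_laplacian(f, x1, x2, h=1e-3):
    """Five-point finite-difference approximation of -Delta f at (x1, x2)."""
    lap = (f(x1 + h, x2) + f(x1 - h, x2) + f(x1, x2 + h) + f(x1, x2 - h)
           - 4.0 * f(x1, x2)) / (h * h)
    return -lap

def pde_residual(x1, x2):
    """Residual of -Delta v = 8 e^v at one point; should be close to zero."""
    return neg_laplacian(v, x1, x2) - 8.0 * math.exp(v(x1, x2))

def total_mass(r_max=200.0, steps=200000):
    """Midpoint rule for the integral of 8 e^v over the disc |x| < r_max."""
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        # radial integrand: 8 e^{v(r)} times the circumference 2 pi r
        total += 8.0 / (1.0 + r * r) ** 2 * 2.0 * math.pi * r * dr
    return total

if __name__ == "__main__":
    print("residual at (0.3, -0.7):", pde_residual(0.3, -0.7))
    print("mass:", total_mass(), "versus 8*pi =", 8.0 * math.pi)
```

The mass $8\pi$ is the quantization value that appears in (23) and in the lower bound $\alpha\ge 8\pi$ used later in this section.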

For convenience, we work in cylindrical coordinates $(t,\theta)$ with
\[
x_1=e^t\cos\theta,\qquad x_2=e^t\sin\theta. \tag{26}
\]
It is easy to check that the transformation given by (26): $(x_1,x_2)\to(t,\cos\theta,\sin\theta)$ is a conformal transformation of $\mathbb{R}^2\setminus\{0\}$ to the cylinder $\mathbb{R}\times S^1=\{(t,\cos\theta,\sin\theta)\}$. Set, for $t<0$ and $\theta\in[0,2\pi]$,
\[
\tilde w_n(t,\theta)=u_n(e^t\cos\theta,e^t\sin\theta)+2t,
\]
and
\[
\tilde w(s)=\log\frac{e^{2s}}{(1+e^{2s})^2}=2s-2\log(1+e^{2s}).
\]
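The properties of $\tilde w$ that drive the reflection argument (evenness in $s$, a maximum at $s=0$, monotonicity for $s<0$, and the asymptotic slope $2s$ as $s\to-\infty$) can be tested with a short script. This is only an illustrative sketch; the sample grid and tolerances are assumptions.

```python
import math

def w_tilde(s):
    """w~(s) = log(e^{2s} / (1 + e^{2s})^2) = 2s - 2 log(1 + e^{2s})."""
    return 2.0 * s - 2.0 * math.log1p(math.exp(2.0 * s))

def max_evenness_defect(samples):
    """Largest |w~(s) - w~(-s)| over the given sample points."""
    return max(abs(w_tilde(s) - w_tilde(-s)) for s in samples)

def is_increasing(samples):
    """True if w~ is strictly increasing along the sorted sample points."""
    values = [w_tilde(s) for s in sorted(samples)]
    return all(a < b for a, b in zip(values, values[1:]))

if __name__ == "__main__":
    grid = [0.1 * k for k in range(1, 201)]           # 0.1, 0.2, ..., 20.0
    print("evenness defect:", max_evenness_defect(grid))
    print("increasing on s < 0:", is_increasing([-s for s in grid]))
    print("w~(0) + 2 log 2 =", w_tilde(0.0) + 2.0 * math.log(2.0))
    print("w~(-10) - 2*(-10) =", w_tilde(-10.0) + 20.0)
```

The last quantity illustrates why $\tilde w(s)$ may be replaced by $2s$, up to a bounded error, for $s\le 0$.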

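Under transformation (26),
\[
w_n(x)=\tilde w_n(t+\log\delta_n,\theta),\qquad w(x)=\tilde w(t).
\]
We derive from (25) that in the new variables,
\[
\lim_{n\to\infty}\|\tilde w_n(s+\log\delta_n,\theta)-\tilde w(s)\|_{L^\infty(s\le\alpha,\ \theta\in[0,2\pi])}=0,\qquad\forall\,\alpha\in\mathbb{R}. \tag{27}
\]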

Clearly, under the above conformal transformation of $\mathbb{R}^2\setminus\{0\}$ to $\mathbb{R}\times S^1$, the equation of $u_n$ is transformed to the following equation on the half cylinder $\mathbb{R}^-\times S^1$:
\[
-\left(\frac{\partial^2}{\partial t^2}+\frac{\partial^2}{\partial\theta^2}\right)\tilde w_n=\tilde V_n(t,\theta)e^{\tilde w_n}\qquad\text{in } Q,
\]
where $Q=\{(t,\theta): t\le 0 \text{ and } 0\le\theta\le 2\pi\}$, and $\tilde V_n(t,\theta)=V_n(e^t\cos\theta,e^t\sin\theta)$. Note that $\tilde w$ achieves its maximum at $s=0$, $\tilde w'(s)>0$ for $s<0$, and $\tilde w(-s)=\tilde w(s)$ for all $s$.

Let us first describe the ideas of the proof. For some $R_n\to\infty$, estimate (22) inside the shrinking balls $|x|\le R_n\delta_n$ follows from the usual blow up argument. What we need to estimate is in the region $R_n\delta_n\le|x|\le 1$. We work on $\mathbb{R}^-\times S^1$, the left half cylinder. It is not difficult to see that the desired estimate (22) in the region $C\delta_n\le|x|\le 1$ is equivalent to
\[
|\tilde w_n(t,\theta)-\tilde w(t-\log\delta_n)|\le C,\qquad\forall\,\log\delta_n+C\le t\le 0 \text{ and } \forall\,\theta. \tag{28}
\]
Here and in the following, $C$ denotes various constants independent of $n$.

The blow up argument gives a precise estimate to $\tilde w_n(t,\theta)-\tilde w(t-\log\delta_n)$ for $t\le\log\delta_n+C$. Since $\tilde w(t-\log\delta_n)$ is symmetric with respect to $t=\log\delta_n$, estimate (28) is then, in view of (27), equivalent to
\[
|\tilde w_n(t,\theta)-\tilde w_n(2\log\delta_n-t,\theta)|\le C,\qquad\forall\,\log\delta_n+C\le t\le 0 \text{ and } \forall\,\theta. \tag{29}
\]

To establish (29) we will introduce two functions, $\hat w_n$ and $w_n^*$, which differ from $\tilde w_n$ by some uniformly bounded functions. The function $\hat w_n$ will be chosen so that the method of moving planes can be applied to $\hat w_n$ from the left to obtain
\[
\hat w_n(t,\theta)\ge\hat w_n(2\lambda_n-t,\theta),\qquad\forall\,\lambda_n\le t\le 0 \text{ and } \forall\,\theta, \tag{30}
\]
where $\lambda_n$ is some number smaller than $\log\delta_n+2$. On the other hand, $w_n^*$ will be chosen so that the method of moving planes can be applied to $w_n^*$ from the right to obtain
\[
w_n^*(t,\theta)\le w_n^*(2\lambda_n^*-t,\theta),\qquad\forall\,\lambda_n^*\le t\le 0 \text{ and } \forall\,\theta, \tag{31}
\]
where $\lambda_n^*$ is some number larger than $\log\delta_n-C$. We emphasize that in order to apply the moving plane method to $w_n^*$ from the right we need (30) and Lemmas 2.1-2.2 so that the plane moving process can get started. These estimates are also needed to ensure that
\[
|\lambda_n-\log\delta_n|+|\lambda_n^*-\log\delta_n|\le C. \tag{32}
\]
The desired estimate (29) follows from (30), (31), (32) and (27).

We first introduce
\[
\hat w_n(t,\theta)=\tilde w_n(t,\theta)-\frac{A}{a}e^t\qquad\text{in } Q.
\]

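Clearly $\hat w_n$ satisfies
\[
-\left(\frac{\partial^2}{\partial t^2}+\frac{\partial^2}{\partial\theta^2}\right)\hat w_n=\hat V_ne^{\hat w_n}+\frac{A}{a}e^t, \tag{33}
\]
where
\[
\hat V_n(t,\theta)=\tilde V_n(t,\theta)e^{Ae^t/a}.
\]
It is easy to see that
\[
\frac{\partial}{\partial t}\left(\hat V_n(t,\theta)e^\xi+\frac{A}{a}e^t\right)\ge 0\qquad\forall\,(t,\theta)\in Q,\ \forall\,\xi\in\mathbb{R}. \tag{34}
\]
We recall some estimates obtained for $\hat w_n$ in [9] by the method of moving planes. For $\lambda<0$ and $\lambda\le t<0$, we set $t^\lambda=2\lambda-t$ and $\hat w_n^\lambda(t,\theta)=\hat w_n(t^\lambda,\theta)$. Clearly $\hat w_n^\lambda$ satisfies
\[
-\left(\frac{\partial^2}{\partial t^2}+\frac{\partial^2}{\partial\theta^2}\right)\hat w_n^\lambda=\hat V_n^\lambda e^{\hat w_n^\lambda}+\frac{A}{a}e^{t^\lambda}, \tag{35}
\]
where $\hat V_n^\lambda=\hat V_n(t^\lambda,\theta)$. It is easy to see that $\hat w_n(t,\theta)$ behaves like $2t$ for $t$ very negative and therefore for $\lambda$ very negative (depending on $n$), we have
\[
\hat w_n^\lambda(t,\theta)-\hat w_n(t,\theta)<0\qquad\text{for } \lambda<t\le 0,\ 0\le\theta\le 2\pi.
\]
Define
\[
\lambda_n=\sup\{\mu<0:\ \hat w_n^\lambda(t,\theta)-\hat w_n(t,\theta)<0 \text{ for all } \lambda<\mu,\ \lambda<t\le 0,\ 0\le\theta\le 2\pi\}.
\]
For every fixed $\alpha\in\mathbb{R}$, we know that $\tilde w_n(t,\theta)$ approximates $\tilde w(t-\log\delta_n,\theta)$ very well in $t\le\log\delta_n+\alpha$. Therefore (see [9] for details)
\[
\lambda_n\le\log\delta_n+2. \tag{36}
\]
Using the fact
\[
\hat w_n^\lambda(t,\theta)-\hat w_n(t,\theta)<0\qquad\forall\,\lambda<t<0,\ \lambda<\lambda_n,\ 0\le\theta\le 2\pi,
\]
it is not difficult to see from (34), (33), (35) and the mean value theorem that
\[
-\left(\frac{\partial^2}{\partial t^2}+\frac{\partial^2}{\partial\theta^2}\right)\left(\hat w_n^\lambda(t,\theta)-\hat w_n(t,\theta)\right)\le 0\qquad\text{for } \lambda\le t\le 0,\ \lambda\le\lambda_n \text{ and } 0\le\theta\le 2\pi.
\]
Since the plane moving process stops at $\lambda_n$, we derive, using the Hopf lemma and the strong maximum principle, that
\[
\min_{0\le\theta\le 2\pi}\{\hat w_n(0,\theta)-\hat w_n(2\lambda_n,\theta)\}=0. \tag{37}
\]
Next, we introduce
\[
w_n^*(t,\theta)=\tilde w_n(t,\theta)+\frac{A}{a}e^t\qquad\text{in } Q.
\]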

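Clearly
\[
-\left(\frac{\partial^2}{\partial t^2}+\frac{\partial^2}{\partial\theta^2}\right)w_n^*=V_n^*e^{w_n^*}-\frac{A}{a}e^t, \tag{38}
\]
where $V_n^*(t,\theta)=\tilde V_n(t,\theta)e^{-Ae^t/a}$. It is easy to see that
\[
\frac{\partial}{\partial t}\left(V_n^*(t,\theta)e^\xi-\frac{A}{a}e^t\right)<0\qquad\forall\,(t,\theta)\in Q,\ \forall\,\xi\in\mathbb{R}. \tag{39}
\]
We will apply the method of moving planes to $w_n^*$, but from the opposite direction. For $\lambda<0$ and $2\lambda\le t\le\lambda$, we set $w_n^{*\lambda}(t,\theta)=w_n^*(t^\lambda,\theta)$, where, as before, $t^\lambda=2\lambda-t$. Clearly
\[
-\left(\frac{\partial^2}{\partial t^2}+\frac{\partial^2}{\partial\theta^2}\right)w_n^{*\lambda}=V_n^{*\lambda}(t,\theta)e^{w_n^{*\lambda}}-\frac{A}{a}e^{t^\lambda}, \tag{40}
\]
where $V_n^{*\lambda}(t,\theta)=V_n^*(t^\lambda,\theta)$.

In order to get started with the plane moving process, appropriate estimates are needed for $w_n^*$. For that purpose, we first use the harmonicity of $u_n-\tilde u_n$ in $B_1$, the boundedness of the oscillation of $u_n-\tilde u_n$ in $B_1$, and standard elliptic estimates to obtain
\[
|\nabla(u_n-\tilde u_n)|\le C\qquad\text{in } B_{1/2}. \tag{41}
\]
Taking $-\Lambda_1>-\Lambda_2\gg 1$, we derive from (41) and Lemma 2.1, for large $n$ (depending on $\Lambda_1$ and $\Lambda_2$), that
\[
\frac{\partial u_n}{\partial t}(t,\theta)\le-\frac{\alpha}{2\pi}+1,\qquad\forall\,\Lambda_1\le t\le\Lambda_2,\ 0\le\theta\le 2\pi.
\]
Noticing that $\alpha\ge 8\pi$, we have
\[
\frac{\partial\tilde w_n}{\partial t}(t,\theta)=\frac{\partial u_n}{\partial t}(t,\theta)+2\le-1,\qquad\forall\,\Lambda_1\le t\le\Lambda_2,\ 0\le\theta\le 2\pi.
\]
Consequently,
\[
\frac{\partial w_n^*}{\partial t}(t,\theta)\le-1/2,\qquad\forall\,\Lambda_1\le t\le\Lambda_2,\ 0\le\theta\le 2\pi. \tag{42}
\]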

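Fix $\Lambda_2$ first. It follows from (37), (27), (36) and Lemma 2.2 that
\[
w_n^*(t,\theta)\le\tilde w(2\lambda_n-\log\delta_n)+C(\Lambda_2)\le 2(2\lambda_n-\log\delta_n)+C(\Lambda_2)
\]
for $\Lambda_2\le t\le 0$, $0\le\theta\le 2\pi$. Therefore, for all $\Lambda_0<\Lambda_2$, $2\Lambda_0\le t\le 2\Lambda_0-\Lambda_2$, we have
\[
w_n^{*\Lambda_0}(t,\theta)=w_n^*(t^{\Lambda_0},\theta)\le 2(2\lambda_n-\log\delta_n)+C(\Lambda_2). \tag{43}
\]
Using the definition of $\lambda_n$, we have
\[
w_n^*(t,\theta)\ge\hat w_n(t,\theta)-C\ge\hat w_n^{\lambda_n}(t,\theta)-C\qquad\forall\,\lambda_n\le t\le 0,\ 0\le\theta\le 2\pi,
\]
where $C$ is some constant independent of $n$, $\Lambda_2$, $\Lambda_1$ and $\Lambda_0$. Namely, for all $\lambda_n\le t\le 0$, $0\le\theta\le 2\pi$ we have
\[
w_n^*(t,\theta)\ge\hat w_n(2\lambda_n-t,\theta)-C.
\]
Therefore for all $\lambda_n\le t\le\Lambda_0$, $0\le\theta\le 2\pi$, we have, in view of (36) and (27),
\[
w_n^*(t,\theta)\ge\hat w_n(2\lambda_n-t,\theta)-C\ge\tilde w(2\lambda_n-\log\delta_n-t,\theta)-C\ge 2(2\lambda_n-\log\delta_n-t)-C. \tag{44}
\]
We see from (43) and (44) that there exists some $\bar\Lambda_0<\Lambda_2$ such that for all $\Lambda_0<\bar\Lambda_0$, and all $\lambda_n\le 2\Lambda_0\le t\le 2\Lambda_0-\Lambda_2$ and $0\le\theta\le 2\pi$, we have
\[
w_n^{*\Lambda_0}(t,\theta)<w_n^*(t,\theta). \tag{45}
\]
Fix one such $\Lambda_0<\bar\Lambda_0$. Using (42) with $\Lambda_1=2\Lambda_0$, we have, for $n$ large,
\[
w_n^{*\Lambda_0}(t,\theta)<w_n^*(t,\theta),\qquad\forall\,2\Lambda_0-\Lambda_2\le t<\Lambda_0,\ 0\le\theta\le 2\pi. \tag{46}
\]
Define
\[
\lambda_n^*=\inf\{\mu\le\Lambda_0:\ w_n^{*\lambda}(t,\theta)-w_n^*(t,\theta)<0\ \ \forall\,\mu\le\lambda\le\Lambda_0,\ 2\lambda\le t<\lambda,\ 0\le\theta\le 2\pi\}.
\]
Due to (45) and (46), $\lambda_n^*$ is well defined for large $n$. It is easy to see from (27), for large $n$, that
\[
\lambda_n^*\ge\log\delta_n-2. \tag{47}
\]
Using the fact
\[
w_n^{*\lambda}(t,\theta)-w_n^*(t,\theta)<0,\qquad\forall\,2\lambda<t<\lambda,\ \lambda_n^*\le\lambda\le\Lambda_0,\ 0\le\theta\le 2\pi,
\]
we derive from (38), (40), (39) and the mean value theorem that
\[
-\left(\frac{\partial^2}{\partial t^2}+\frac{\partial^2}{\partial\theta^2}\right)\left(w_n^{*\lambda}(t,\theta)-w_n^*(t,\theta)\right)\le 0\qquad\forall\,2\lambda<t<\lambda,\ \lambda_n^*\le\lambda\le\Lambda_0,\ 0\le\theta\le 2\pi.
\]
Since the plane moving process stops at $\lambda_n^*$, we have, by using the strong maximum principle and the Hopf lemma, that
\[
\max_{0\le\theta\le 2\pi}\left\{w_n^{*\lambda_n^*}(2\lambda_n^*,\theta)-w_n^*(2\lambda_n^*,\theta)\right\}=0.
\]
Namely,
\[
\max_{0\le\theta\le 2\pi}\{w_n^*(0,\theta)-w_n^*(2\lambda_n^*,\theta)\}=0. \tag{48}
\]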

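It follows from the definition of $w_n^*$ and (21) that
\[
\min_{\partial B_1}u_n-C\le w_n^*(0,\theta)\le\min_{\partial B_1}u_n+C\qquad\forall\,0\le\theta\le 2\pi. \tag{49}
\]
Using (27) and the definition of $\lambda_n$, we also know that
\[
\begin{aligned}
\min_{0\le\theta\le 2\pi}w_n^*(2\lambda_n^*,\theta)&\ge\min_{0\le\theta\le 2\pi}\hat w_n(2\lambda_n^*,\theta)-C\\
&\ge\begin{cases}\min_{0\le\theta\le 2\pi}\hat w_n(2\lambda_n-2\lambda_n^*,\theta)-C & \text{if } 2\lambda_n^*\ge\lambda_n\\ \min_{0\le\theta\le 2\pi}\hat w_n(2\lambda_n^*,\theta)-C & \text{if } 2\lambda_n^*<\lambda_n\end{cases}\\
&\ge\begin{cases}2(2\lambda_n-2\lambda_n^*-\log\delta_n)-C & \text{if } 2\lambda_n^*\ge\lambda_n\\ 2(2\lambda_n^*-\log\delta_n)-C & \text{if } 2\lambda_n^*<\lambda_n.\end{cases}
\end{aligned} \tag{50}
\]
Combining (48), (49) and (50), we have
\[
\max_{\partial B_1}u_n\ge\begin{cases}2(2\lambda_n-2\lambda_n^*-\log\delta_n)-C & \text{if } 2\lambda_n^*\ge\lambda_n\\ 2(2\lambda_n^*-\log\delta_n)-C & \text{if } 2\lambda_n^*<\lambda_n.\end{cases} \tag{51}
\]
On the other hand, we know from (37) and (27) that
\[
\min_{\partial B_1}u_n\le 2(2\lambda_n-\log\delta_n)+C. \tag{52}
\]
It follows from (21), (51), (52) that either
\[
-\lambda_n^*\le C, \tag{53}
\]
or
\[
\lambda_n^*\le\lambda_n+C. \tag{54}
\]
We rule out (53) as follows. Suppose (53) happens. Then, since $\lambda_n^*<\Lambda_0$, we derive from (45), for all $2\lambda_n^*\le t\le 2\lambda_n^*-\Lambda_2$ and $0\le\theta\le 2\pi$, that
\[
w_n^{*\lambda_n^*}(t,\theta)<w_n^*(t,\theta).
\]
Now, in view of (42), we have, for $n$ large, $2\lambda_n^*-\Lambda_2\le t\le\Lambda_2$, and $0\le\theta\le 2\pi$, that
\[
\frac{\partial w_n^*}{\partial t}(t,\theta)\le-1/2<0.
\]
These imply, for some $\varepsilon>0$ and $\lambda\in[\lambda_n^*-\varepsilon,\lambda_n^*]$, that $w_n^{*\lambda}(t,\theta)<w_n^*(t,\theta)$ for all $2\lambda\le t<\lambda$ and $0\le\theta\le 2\pi$. This violates the definition of $\lambda_n^*$, so (53) can not happen. Therefore we always have (54) and, in view of (47) and (36), that
\[
|\lambda_n-\log\delta_n|+|\lambda_n^*-\log\delta_n|\le C. \tag{55}
\]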

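Recall that
\[
w_n^{*\lambda_n^*}(t,\theta)\le w_n^*(t,\theta),\qquad\forall\,2\lambda_n^*\le t\le\lambda_n^*,\ 0\le\theta\le 2\pi,
\]
and
\[
\hat w_n^{\lambda_n}(t,\theta)\le\hat w_n(t,\theta),\qquad\forall\,\lambda_n\le t\le 0,\ 0\le\theta\le 2\pi.
\]
Namely,
\[
w_n^*(t,\theta)\le w_n^*(2\lambda_n^*-t,\theta),\qquad\forall\,\lambda_n^*\le t\le 0,\ 0\le\theta\le 2\pi,
\]
and
\[
\hat w_n(t,\theta)\ge\hat w_n(2\lambda_n-t,\theta),\qquad\forall\,\lambda_n\le t\le 0,\ 0\le\theta\le 2\pi.
\]
Since $|\hat w_n(t,\theta)-\tilde w_n(t,\theta)|+|w_n^*(t,\theta)-\tilde w_n(t,\theta)|\le C$ for all $t\le 0$ and $\theta$, we have
\[
\begin{cases}
\tilde w_n(t,\theta)\le\tilde w_n(2\lambda_n^*-t,\theta)+C, & \forall\,\lambda_n^*\le t\le 0,\ 0\le\theta\le 2\pi,\\
\tilde w_n(t,\theta)\ge\tilde w_n(2\lambda_n-t,\theta)-C, & \forall\,\lambda_n\le t\le 0,\ 0\le\theta\le 2\pi.
\end{cases} \tag{56}
\]
Due to (55), we have
\[
2\lambda_n^*-t\le\log\delta_n+C\qquad\forall\,\lambda_n^*\le t\le 0,
\]
and
\[
2\lambda_n-t\le\log\delta_n+C\qquad\forall\,\lambda_n\le t\le 0.
\]
So we can use (27) to estimate the right hand sides of (56) and obtain, using again (55), that
\[
2(\log\delta_n-t)-C\le\tilde w_n(t,\theta)\le 2(\log\delta_n-t)+C,\qquad\forall\,\log\delta_n\le t\le 0,\ \forall\,\theta.
\]
In terms of $u_n$, this means
\[
|u_n(x)+u_n(0)+4\log|x||\le C,\qquad\forall\,\delta_n\le|x|\le 1. \tag{57}
\]
The standard blow up argument (see (24)) yields, for some $R_n\to\infty$,
\[
\max_{|x|\le R_n\delta_n}\left|u_n(x)-\log\frac{\delta_n^{-2}}{(1+\delta_n^{-2}|x|^2)^2}\right|\to 0\qquad\text{as } n\to\infty.
\]
On the other hand, (57) is equivalent to
\[
\left|u_n(x)-\log\frac{\delta_n^{-2}}{(1+\delta_n^{-2}|x|^2)^2}\right|\le C,\qquad\forall\,\delta_n\le|x|\le 1.
\]
Theorem 0.3 follows from the above two estimates.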
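The equivalence invoked in the last step rests on the fact that, for $\delta_n\le|x|\le 1$, the bubble profile $\log\frac{\delta_n^{-2}}{(1+\delta_n^{-2}|x|^2)^2}$ and the outer profile $-u_n(0)-4\log|x|$ (with $u_n(0)=-2\log\delta_n$) differ by $2\log(1+\delta_n^2/|x|^2)\le 2\log 2$. A small numerical sketch of this bound follows; the function names and the log-spaced sampling grid are assumptions made for illustration.

```python
import math

def bubble(r, delta):
    """log(delta^-2 / (1 + delta^-2 r^2)^2): the profile in (22) when V_n(0) = 8."""
    return -2.0 * math.log(delta) - 2.0 * math.log1p((r / delta) ** 2)

def outer_profile(r, delta):
    """-u_n(0) - 4 log r, using u_n(0) = -2 log(delta)."""
    return 2.0 * math.log(delta) - 4.0 * math.log(r)

def max_gap(delta, samples=2000):
    """Largest |bubble - outer_profile| on a log-spaced grid of delta <= r <= 1."""
    gap = 0.0
    for i in range(samples + 1):
        r = delta ** (1.0 - i / samples)  # runs from r = delta to r = 1
        gap = max(gap, abs(bubble(r, delta) - outer_profile(r, delta)))
    return gap

if __name__ == "__main__":
    for delta in (1e-1, 1e-3, 1e-6):
        print(delta, max_gap(delta), "bound:", 2.0 * math.log(2.0))
```

The gap is largest at $|x|=\delta_n$ and does not grow as $\delta_n\to 0$, which is why the constant $C$ in the two formulations changes only by a fixed amount.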

3. Proof of Theorem 0.2

In this section we establish Theorem 0.2 by using Theorem 0.3.

Proof of Theorem 0.2. We know from Theorem 0.1 that $\lambda\in[8\pi,\infty)$. For any point $y\in M$, let $x=(x_1,x_2)$ be some isothermal coordinate system centered at $y$. The metric $g$ takes the form $e^\varphi(dx_1^2+dx_2^2)$ in $B_r(0):=\{x\mid x_1^2+x_2^2<r\}$ with $\varphi(0)=0$. Then $\xi_n$ satisfies
\[
-\Delta\xi_n=\lambda_ne^\varphi\left(V_ne^{\xi_n}-W_n\right),\qquad\text{in } x_1^2+x_2^2<r,
\]
where $\Delta=\partial_{x_1x_1}+\partial_{x_2x_2}$. Define $\zeta_n$ by
\[
\begin{cases}
-\Delta\zeta_n=\lambda_ne^\varphi W_n+\Delta\varphi, & \text{in } B_r(0),\\
\zeta_n=0, & \text{on } \partial B_r(0),
\end{cases}
\]
and set $\eta_n=\xi_n+\zeta_n+\varphi$. Then $\eta_n$ satisfies
\[
-\Delta\eta_n=\lambda_ne^{-\zeta_n}V_ne^{\eta_n},\qquad\text{in } B_r(0).
\]
It is clear that $\{\zeta_n\}$ is uniformly bounded in $\overline{B}_r(0)$. We see from (7) that $\{\int_Me^{\xi_n}\,dv_g\}$ is bounded from above, so $\lambda_n\int_{B_r(0)}e^{-\zeta_n}V_ne^{\eta_n}\le C$. Therefore it follows from Theorem 3 of [8] that, after passing to a subsequence, there are only three possibilities:

(i) $\{\eta_n\}$ uniformly converges in $C^2(\overline{B}_{r/2}(0))$,
(ii) $\{\eta_n\}$ tends to $-\infty$ uniformly on $\overline{B}_{r/2}(0)$,
(iii) There exist finitely many blowup points $\{x^{(1)},\cdots,x^{(l)}\}$ of $\{\eta_n\}$ such that $\{\eta_n\}$ tends to $-\infty$ uniformly on compact subsets of $\overline{B}_{r/2}(0)\setminus\{x^{(1)},\cdots,x^{(l)}\}$.

Clearly, in view of the boundedness of $\{\zeta_n\}$, there are only the above three possibilities for $\{\xi_n\}$ as well. Since $M$ is connected, we know that, after passing to a subsequence, there are only three possibilities for $\{\xi_n\}$ on $M$:

1° $\{\xi_n\}$ uniformly converges in $C^2(M)$,
2° $\{\xi_n\}$ tends to $-\infty$ uniformly on $M$,
3° There exist finitely many blowup points $\{x^{(1)},\cdots,x^{(m)}\}$ of $\{\xi_n\}$ such that $\{\xi_n\}$ tends to $-\infty$ uniformly on compact subsets of $M\setminus\{x^{(1)},\cdots,x^{(m)}\}$.

Since we know from (7) that $\{\int_Me^{\xi_n}\,dv_g\}$ has a positive lower bound, 2° can not occur. 1° can not occur either because of (9). We are left with 3°. Applying the result in [32], we know that $\lambda_nV_ne^{\xi_n}\rightharpoonup\sum_{l=1}^m8\pi N_l\delta_{x^{(l)}}$ for some positive integers $N_l$. Consequently, in view of (7), $\lambda=8\pi\sum_{l=1}^mN_l$. We then derive from (8) that, in $C^0_{loc}(M\setminus\{x^{(1)},\cdots,x^{(m)}\})$,
\[
\xi_n-\bar\xi_n\to 8\pi\sum_{l=1}^mN_lG(\cdot,x^{(l)})-\lambda\int_MW(y)G(\cdot,y)\,dv_g(y). \tag{58}
\]

Due to (58), $\{\xi_n\}$ has bounded oscillations in compact subsets of $M\setminus\{x^{(1)},\cdots,x^{(m)}\}$. Let $0<a_l<\frac12\min_{l'\ne l}d(x^{(l)},x^{(l')})$ be some small constant, $x_n^{(l)}$ be a maximum point of $\xi_n$ in $\{y\in M\mid d(y,x^{(l)})<a_l\}$, and $x=(x_1,x_2)$ be some isothermal coordinate system centered at $x_n^{(l)}$. The metric $g$ takes the form $e^{\varphi_n}(dx_1^2+dx_2^2)$ in $B_{a_l}(0):=\{x\mid x_1^2+x_2^2<a_l\}$ with $\varphi_n(0)=0$. Define $\zeta_n$ and $\eta_n$ in $B_r(0)$ as at the beginning of this section with $r=a_l$. Then, by applying Theorem 0.3 to $\eta_n$, we have
\[
\left|\eta_n(x)-\log\frac{e^{\eta_n(0)}}{\left(1+\frac{\lambda_ne^{-\zeta_n(0)}V_n(0)}{8}e^{\eta_n(0)}|x|^2\right)^2}\right|\le C,\qquad\forall\,|x|\le a_l,
\]
namely,
\[
\left|\xi_n(x)-\log\frac{e^{\xi_n(0)}}{\left(1+\frac{\lambda_nV_n(0)}{8}e^{\xi_n(0)}|x|^2\right)^2}\right|\le C,\qquad\forall\,|x|\le a_l. \tag{59}
\]
It follows easily that
\[
\lambda_nV_ne^{\xi_n}\rightharpoonup 8\pi\sum_{l=1}^m\delta_{x^{(l)}}\qquad\text{in the sense of measure.}
\]
In the isothermal coordinate system centered at $x_n^{(l)}$, we define
\[
v_n(x)=\xi_n(\delta_nx)+2\log\delta_n,\qquad|x|<a_l\delta_n^{-1},
\]
where $\delta_n=e^{-\xi_n(0)/2}\to 0$. Set
\[
R_n^{(l)}:=\sup\left\{R>0:\ \|v_n-v\|_{C^2(\overline{B}_{2R}(0))}+\|v_n-v\|_{H^2(\overline{B}_{2R}(0))}<e^{-R}\right\},
\]
where
\[
v(x)=\log\left\{\frac{1}{\left(1+\frac{\lambda\lim_{n\to\infty}V_n(0)}{8}|x|^2\right)^2}\right\},\qquad\text{in } \mathbb{R}^2.
\]

Arguing by contradiction using the standard blow up argument as in Sect. 1, we can show that $R_n^{(l)}\to\infty$ as $n\to\infty$. Clearly,
\[
\lambda_n\int_{d(y,x_n^{(l)})\le R_n^{(l)}e^{-\xi_n(x_n^{(l)})/2}}V_ne^{\xi_n}\to 8\pi,
\]
and $\xi_n$, for large $n$, has a unique critical point in $\{y\in M\mid d(y,x_n^{(l)})<R_n^{(l)}e^{-\xi_n(x_n^{(l)})/2}\}$ due to the fact that $v$ has a unique nondegenerate critical point at the origin. It is easy to see from (59) that
\[
\int_{R_n^{(l)}e^{-\xi_n(x_n^{(l)})/2}\le d(y,x_n^{(l)})\le a_l}V_ne^{\xi_n}\to 0.
\]
Consequently, $N_l=1$ for all $l$ and $\lambda=8\pi m$. (e) then follows from (58). We easily derive (d) and (c) from (59) and (58). The above discussion also yields the uniqueness of the maximum point $x_n^{(l)}$, since otherwise another maximum point $\hat x_n^{(l)}$ would lead to
\[
\lambda_n\int_{d(y,\hat x_n^{(l)})\le R_n^{(l)}e^{-\xi_n(\hat x_n^{(l)})/2}}V_ne^{\xi_n}\to 8\pi,
\]
and, due to the definition of $R_n^{(l)}$, the sets $\{y\in M\mid d(y,x_n^{(l)})\le R_n^{(l)}e^{-\xi_n(x_n^{(l)})/2}\}$ and $\{y\in M\mid d(y,\hat x_n^{(l)})\le R_n^{(l)}e^{-\xi_n(\hat x_n^{(l)})/2}\}$ are disjoint, which would violate $N_l=1$. Theorem 0.2 is thus established.

In the rest of this section we derive Corollary 0.3 from Theorem 0.2.

Proof of Corollary 0.3. Using (8), we write $\varphi_n$ as
\[
\varphi_n=\varphi_n^{(1)}+\varphi_n^{(2)}
\]
with
\[
\varphi_n^{(1)}:=\lambda_n\int_MV_n(y)e^{\xi_n(y)}G(\cdot,y)\,dv_g(y)-8\pi\sum_{l=1}^mG(\cdot,x_n^{(l)})
\]
and
\[
\varphi_n^{(2)}:=8\pi m\int_MW(y)G(\cdot,y)\,dv_g(y)-\lambda_n\int_MW_n(y)G(\cdot,y)\,dv_g(y).
\]
Since $\lambda_nW_n\to 8\pi mW$ in $C^\alpha(M)$ for $0<\alpha<1$, we derive from Schauder estimates that $\varphi_n^{(2)}\to 0$ in $C^{2,\alpha}(M)$, so
\[
\varphi_n^{(2)}\sim 0. \tag{60}
\]

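Without loss of generality, we may assume $d(x,x_n^{(1)})=\min_{1\le l\le m}d(x,x_n^{(l)})$. Write
\[
\varphi_n^{(1)}=\varphi_n^{(11)}+\varphi_n^{(12)}+\varphi_n^{(13)}+\varphi_n^{(14)}
\]
with
\[
\varphi_n^{(11)}=\lambda_n\sum_{l=1}^m\int_{B_{\sqrt{R_n^{(l)}}\delta_n^{(l)}}(x_n^{(l)})}V_n(y)e^{\xi_n(y)}\left[G(\cdot,y)-G(\cdot,x_n^{(l)})\right]dv_g(y),
\]
\[
\varphi_n^{(12)}=\sum_{l=1}^m\left(\int_{B_{\sqrt{R_n^{(l)}}\delta_n^{(l)}}(x_n^{(l)})}\lambda_nV_n(y)e^{\xi_n(y)}\,dv_g(y)-8\pi\right)G(\cdot,x_n^{(l)}),
\]
\[
\varphi_n^{(13)}=\lambda_n\int_{d(y,x)\le d(x,x_n^{(1)})/4}V_n(y)e^{\xi_n(y)}G(\cdot,y)\,dv_g(y),
\]
\[
\varphi_n^{(14)}=\lambda_n\int_{d(y,x)\ge d(x,x_n^{(1)})/4,\ y\in M\setminus\cup_{l=1}^mB_{\sqrt{R_n^{(l)}}\delta_n^{(l)}}(x_n^{(l)})}V_n(y)e^{\xi_n(y)}G(\cdot,y)\,dv_g(y).
\]
For $x\in M\setminus\cup_{l=1}^mB_{R_n^{(l)}\delta_n^{(l)}}(x_n^{(l)})$, we derive from (c)-(d) in Theorem 0.2 that
\[
\begin{aligned}
|\varphi^{(13)}(x)|&\le C\,\frac{e^{\xi_n(x_n^{(1)})}}{\left(1+e^{\xi_n(x_n^{(1)})}d(x,x_n^{(1)})^2\right)^2}\int_{d(y,x)\le d(x,x_n^{(1)})/4}|G(x,y)|\,dv_g(y)\\
&\le C\left(1+|\log d(x,x_n^{(1)})|\right)\frac{e^{\xi_n(x_n^{(1)})}d(x,x_n^{(1)})^2}{\left(1+e^{\xi_n(x_n^{(1)})}d(x,x_n^{(1)})^2\right)^2},
\end{aligned}
\]
which implies, in view of
\[
e^{\xi_n(x_n^{(1)})}d(x,x_n^{(1)})^2\ge e^{\xi_n(x_n^{(1)})}\left(R_n^{(1)}\delta_n^{(1)}\right)^2=\left(R_n^{(1)}\right)^2\to\infty,
\]
that
\[
\varphi^{(13)}\sim_0 0. \tag{61}
\]
The usual blow up argument as in Sect. 1 yields
\[
\varphi^{(12)}\sim_0 0. \tag{62}
\]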

Apparently, for $x\in M\setminus\cup_{l=1}^mB_{R_n^{(l)}\delta_n^{(l)}}(x_n^{(l)})$,
\[
\begin{aligned}
|\varphi^{(11)}(x)|&\le C\sum_{l=1}^m\int_{B_{\sqrt{R_n^{(l)}}\delta_n^{(l)}}(x_n^{(l)})}e^{\xi_n(y)}\left|\log\frac{d(x,y)}{d(x,x_n^{(l)})}\right|dv_g(y)\\
&\le C\sum_{l=1}^m\log\frac{R_n^{(l)}+\sqrt{R_n^{(l)}}}{R_n^{(l)}-\sqrt{R_n^{(l)}}}.
\end{aligned}
\]
Consequently,
\[
\varphi^{(11)}\sim_0 0. \tag{63}
\]
Using (a) and (c), we have
\[
\begin{aligned}
|\varphi^{(14)}(x)|&\le C\left(1+|\log d(x,x_n^{(1)})|\right)\int_{M\setminus\cup_{l=1}^mB_{\sqrt{R_n^{(l)}}\delta_n^{(l)}}(x_n^{(l)})}e^{\xi_n}\,dv_g\\
&=o(1)\left(1+|\log d(x,x_n^{(1)})|\right).
\end{aligned}
\]
Namely,
\[
\varphi^{(14)}\sim_0 0. \tag{64}
\]
Combining (61)–(64), we have
\[
\varphi^{(1)}\sim_0 0.
\]
Differentiating $\varphi^{(1)}$ under the integral sign and making estimates as above, we can easily show (details are left to readers) that
\[
\varphi^{(1)}\sim_1 0\qquad\text{and}\qquad\varphi^{(1)}\sim_2 0.
\]
Therefore
\[
\varphi^{(1)}\sim 0. \tag{65}
\]
Corollary 0.3 follows from (60) and (65).

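4. Appendix

For readers' convenience, we provide a proof of the following well known fact.

Lemma 4.1. There is no $C^2$ solution to
\[
\begin{cases}
\Delta v=e^v, & \text{in } \mathbb{R}^2,\\
\int_{\mathbb{R}^2}e^v<\infty.
\end{cases}
\]

Proof. Suppose the contrary, $v$ is a $C^2$ solution. Set
\[
\bar v(r)=\frac{1}{2\pi r}\int_{\partial B_r}v
\]
for $r>0$. We derive from Jensen's inequality that
\[
\frac{1}{2\pi r}\int_{\partial B_r}e^v\ge e^{\bar v(r)}. \tag{66}
\]
It follows that $\bar v$ satisfies
\[
\Delta\bar v\ge e^{\bar v(r)},\qquad\text{in } \mathbb{R}^2,
\]
namely,
\[
\frac1r\left(r\bar v'(r)\right)'\ge e^{\bar v(r)}.
\]
We derive from the above that
\[
r\bar v'(r)\ge\int_0^rse^{\bar v(s)}\,ds\ge 0\qquad\text{for all } r\ge 0.
\]
Consequently,
\[
r\bar v'(r)\ge\int_0^rse^{\bar v(0)}\,ds=e^{\bar v(0)}\frac{r^2}{2}\qquad\text{for all } r\ge 0.
\]
In turn we have
\[
\bar v(r)\ge\bar v(0)+e^{\bar v(0)}\frac{r^2}{4}\qquad\text{for all } r\ge 0. \tag{67}
\]
It follows from (66) and (67) that
\[
\int_{\mathbb{R}^2}e^v=\infty.
\]
Contradiction.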

References

1. Alexandrov, A.D.: Uniqueness theorems for surfaces in the large I-V. Vestnik Leningrad Univ. 11 #19, 5–17 (1956); 12 #7, 15–44 (1957); 13 #7, 14–26 (1958); 13 #13, 27–34 (1958); 13 #19, 5–8 (1958); English transl. in Am. Math. Soc. Transl. 21, 341–354, 354–388, 389–403, 403–411, 412–416 (1962)
2. Aubin, T.: Nonlinear Analysis on Manifolds. Monge–Ampère Equations. New York–Berlin: Springer-Verlag, 1982
3. Berestycki, H., Caffarelli, L. and Nirenberg, L.: Symmetry for elliptic equations in a half space, Boundary value problems for partial differential equations and applications. RMA Res. Notes Appl. Math., 29, Paris: Masson, 1993, pp. 27–42
4. Berestycki, H., Caffarelli, L. and Nirenberg, L.: Further qualitative properties for elliptic equations in unbounded domains. Annali Sc. Norm. Sup. Pisa Cl. Sci. 4, 1 (1998)
5. Berestycki, H. and Nirenberg, L.: Monotonicity, symmetry and antisymmetry of solutions of semilinear elliptic equations. J. Geom. Phys. 5, 237–275 (1988)
6. Berestycki, H. and Nirenberg, L.: Some qualitative properties of solutions of semilinear elliptic equations in cylindrical domains. Analysis, et cetera, Boston, MA: Academic Press, 1990, pp. 115–164
7. Berestycki, H. and Nirenberg, L.: On the method of moving planes and the sliding method. Bol. Soc. Bras. Mat. 22, 1–37 (1991)
8. Brezis, H. and Merle, F.: Uniform estimates and blow-up behavior for solutions of $-\Delta u=V(x)e^u$ in two dimension. Commun. Partial Differential Equation 16, 1223–1253 (1991)
9. Brezis, H., Li, Y.Y. and Shafrir, I.: A sup + inf inequality for some nonlinear elliptic equations involving exponential nonlinearities. J. Funct. Anal. 115, 344–358 (1993)
10. Caffarelli, L., Gidas, B. and Spruck, J.: Asymptotic symmetry and local behavior of semilinear elliptic equations with critical Sobolev growth. Commun. Pure Appl. Math. 42, 271–297 (1989)
11. Caffarelli, L. and Yang, Y.: Vortex condensation in the Chern–Simons Higgs model: an existence theorem. Commun. Math. Phys. 168, 321–336 (1995)
12. Caglioti, E., Lions, P.L. and Marchioro, C.: A special class of stationary flows for two-dimensional Euler equations: A statistical mechanics description. Commun. Math. Phys. 143, 501–525 (1992)
13. Caglioti, E., Lions, P.L., Marchioro, C. and Pulvirenti, M.: A special class of stationary flows for two-dimensional Euler equations: A statistical mechanics description, part II. Commun. Math. Phys. 174, 229–260 (1995)
14. Carleson, L. and Chang, S.Y.: On the existence of an extremal function for an inequality of Moser. Bull. Sci. Math. 110, 113–127 (1986)
15. Chanillo, S. and Kiessling, M.K.H.: Conformally invariant systems of nonlinear PDE of Liouville type. Geom. Funct. Anal. 5, 924–947 (1995)
16. Chang, S.Y. and Yang, P.: A perturbation result in prescribing scalar curvature on $S^n$. Duke Math. J. 64, 27–69 (1991)
17. Chen, C.C. and Lin, C.S.: A sharp sup+inf inequality for a nonlinear elliptic equation in $\mathbb{R}^2$. Comm. Anal. Geom. 6, 1–19 (1998)
18. Chen, C.C. and Lin, C.S.: Estimates of the conformal scalar curvature equation via the method of moving planes. Comm. Pure Appl. Math. 50, 971–1017 (1997)
19. Chen, W. and Li, C.: Classification of solutions of some nonlinear elliptic equations. Duke Math. J. 63, 615–623 (1991)
20. Chen, X.: Remarks on the existence of branch bubbles on the blowup analysis of equation $-\Delta u=e^{2u}$ in dimension two. Comm. Anal. Geom., to appear
21. Chou, K.S. and Wan, T.Y.H.: Asymptotic radial symmetry for solutions of $\Delta u+e^u=0$ in a punctured disc. In: Elliptic and parabolic methods in geometry, Minneapolis, MN, 1994, Wellesley, MA: A K Peters, 1996, pp. 17–21
22. Ding, W., Jost, J., Li, J. and Wang, G.: The differential equation $\Delta u=8\pi-8\pi he^u$ on a compact Riemann surface. Asian J. Math. 1, 230–248 (1997)
23. Ding, W., Jost, J., Li, J. and Wang, G.: An analysis of the two-vortex case in the Chern–Simons Higgs model. Calc. Var. 7, 87–97 (1998)
24. Dunne, G.: Self-Dual Chern–Simons Theories. Lecture notes in Physics, New Series M 36, New York: Springer, 1996
25. Gidas, B., Ni, W.M. and Nirenberg, L.: Symmetry and related properties via the maximum principle. Commun. Math. Phys. 68, 209–243 (1979)
26. Gidas, B., Ni, W.M. and Nirenberg, L.: Symmetry of positive solutions of nonlinear elliptic equations in $\mathbb{R}^n$. Math. Anal. and Applications, Part A, Advances in Math. Suppl. Studies 7A, (ed. L. Nachbin), London–New York: Academic Pr., 1981, pp. 369–402
27. Hong, J., Kim, Y. and Pac, P.Y.: Multivortex solutions of the Abelian Chern Simons theory. Phys. Rev. Lett. 64, 2230–2233 (1990)
28. Jackiw, R. and Weinberg, E.J.: Selfdual Chern Simons vortices. Phys. Rev. Lett. 64, 2234–2237 (1990)
29. Kazdan, J. and Warner, F.: Curvature functions for compact 2-manifolds. Ann. of Math. 99, 14–47 (1974)
30. Kiessling, M.K.H.: Statistical mechanics of classical particles with logarithmic interaction. Comm. Pure Appl. Math. 46, 27–56 (1993)
31. Li, Y.Y.: Prescribing scalar curvature on $S^n$ and related problems, Part II: Existence and compactness. Comm. Pure Appl. Math. 49, 541–597 (1996)
32. Li, Y.Y. and Shafrir, I.: Blow-up analysis for solutions of $-\Delta u=Ve^u$ in dimension two. Indiana Univ. Math. J. 43, 1255–1270 (1994)
33. Moser, J.: On a nonlinear problem in differential geometry. In: Dynamical Systems (M. Peixoto, ed.), New York: Academic Press, 1973, pp. 273–280
34. Nirenberg, L.: Topics in Nonlinear Functional Analysis. Lecture Notes, Courant Institute, New York University, 1974
35. Struwe, M. and Tarantello, G.: On multivortex solutions in Chern–Simons Gauge theory. Boll. Unione Mat. Ital. Sez. B Artic. Ric. Mat. 8 1, 109–121 (1998)
36. Schoen, R.: Courses at Stanford University (1988) and New York University (1989), unpublished
37. Serrin, J.: A symmetry problem in potential theory. Arch. Rat. Mech. Anal. 43, 304–318 (1971)
38. Spruck, J. and Yang, Y.: Topological solutions in the self-dual Chern–Simons theory: Existence and approximation. Ann. Inst. H. Poincaré Anal. Non Linéaire 12, 75–97 (1995)
39. Tarantello, G.: Multiple condensate solutions for the Chern–Simons–Higgs theory. J. Math. Phys. 37, 3769–3796 (1996)
40. Taubes, C.H.: Arbitrary N-vortex solutions to the first order Ginzburg–Landau equation. Commun. Math. Phys. 72, 277–292 (1980)
41. Taubes, C.H.: On the equivalence of the first and second order equations for gauge theories. Commun. Math. Phys. 75, 207–227 (1980)

Communicated by J. L. Lebowitz

Commun. Math. Phys. 200, 445 – 485 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

What is the Relativistic Volterra Lattice?

Yuri B. Suris¹, Orlando Ragnisco²

¹ Technische Universität Berlin, Fachbereich Mathematik, SFB 288, Sekr. MA 8–5, Str. des 17. Juni 136, 10623 Berlin, Germany. E-mail: [email protected]
² Dipartimento di Fisica, Universita di Roma Tre, Via Vasca Navale 84, 00146 Roma, Italy. E-mail: [email protected]

Received: 1 April 1998 / Accepted: 21 July 1998

Abstract: We develop a systematic procedure of finding integrable “relativistic” (regular one-parameter) deformations for integrable lattice systems. Our procedure is based on the integrable time discretizations and consists of three steps. First, for a given system one finds a local discretization living in the same hierarchy. Second, one considers this discretization as a particular Cauchy problem for a certain 2-dimensional lattice equation, and then looks for another meaningful Cauchy problem, which can be, in turn, interpreted as a new discrete time system. Third, one has to identify integrable hierarchies to which these new discrete time systems belong. These novel hierarchies are called then “relativistic”, the small time step h playing the role of inverse speed of light. We apply this procedure to the Toda lattice (and recover the well-known relativistic Toda lattice), as well as to the Volterra lattice and a certain Bogoyavlensky lattice, for which the “relativistic” deformations were not known previously.

1. Introduction

The theory of integrable differential–difference, or lattice, systems is by now a well-developed and well-understood subject. Nevertheless, some intriguing questions remain open, and the aim of this paper is to close one of them: what is the relativistic Volterra lattice? Let us start with the necessary background. The two most celebrated and best studied integrable lattice systems are certainly the Toda lattice (TL),

$$\dot b_k = a_k - a_{k-1}, \qquad \dot a_k = a_k(b_{k+1} - b_k), \tag{1.1}$$

and the Volterra lattice (VL),

$$\dot u_k = u_k(v_k - v_{k-1}), \qquad \dot v_k = v_k(u_{k+1} - u_k). \tag{1.2}$$


For readers more familiar with another form of these systems, we recall that the Newtonian form of the Toda lattice,

$$\ddot x_k = e^{x_{k+1}-x_k} - e^{x_k-x_{k-1}}, \tag{1.3}$$

equivalent also to the Hamiltonian one,

$$\dot x_k = p_k, \qquad \dot p_k = e^{x_{k+1}-x_k} - e^{x_k-x_{k-1}}, \tag{1.4}$$

is recovered from (1.1) if the variables $a_k$, $b_k$ are parametrized according to the Manakov–Flaschka formulas

$$b_k = p_k, \qquad a_k = e^{x_{k+1}-x_k}, \tag{1.5}$$

while the more convenient form of the Volterra lattice,

$$\dot a_k = a_k(a_{k+1} - a_{k-1}), \tag{1.6}$$

arises from (1.2) upon renaming

$$u_k = a_{2k-1}, \qquad v_k = a_{2k}. \tag{1.7}$$

These two lattice systems are connected to one another in two different ways. On the one hand, the Volterra lattice (1.6) is a restriction to the manifold $b_k = 0$ of the second flow of the Toda hierarchy. On the other hand (and this connection will be of primary interest for us here), the flows (1.1) and (1.2) are connected by a Miura map, or, better, by two different Miura maps $M_{1,2} : (u, v) \mapsto (a, b)$:

$$M_1:\ \begin{cases} b_k = u_k + v_{k-1},\\ a_k = u_k v_k, \end{cases} \qquad M_2:\ \begin{cases} b_k = u_k + v_k,\\ a_k = u_{k+1} v_k. \end{cases} \tag{1.8}$$

Some time ago a remarkable discovery was made by Ruijsenaars [R] in the area of integrable lattice systems: he found a relativistic generalization of TL (RTL). The corresponding system may be viewed as a regular deformation of TL:

$$\dot d_k = (1 + h d_k)(a_k - a_{k-1}), \qquad \dot a_k = a_k(d_{k+1} - d_k + h a_{k+1} - h a_{k-1}) \tag{1.9}$$

(the reason for choosing here $d$ instead of $b$ will become clear in the main text). Under the parametrization

$$d_k = \left(e^{h p_k} - 1\right)/h, \qquad a_k = e^{x_{k+1}-x_k+hp_k} \tag{1.10}$$

(which is a regular deformation of (1.5)) the equations of motion (1.9) may be presented as

$$\dot p_k = e^{x_{k+1}-x_k+hp_k} - e^{x_k-x_{k-1}+hp_{k-1}}, \qquad 1 + h\dot x_k = e^{hp_k}\left(1 + h^2 e^{x_{k+1}-x_k}\right), \tag{1.11}$$

which implies also the Newtonian equations of motion

$$\ddot x_k = (1 + h\dot x_k)(1 + h\dot x_{k+1})\,\frac{e^{x_{k+1}-x_k}}{1 + h^2 e^{x_{k+1}-x_k}} - (1 + h\dot x_{k-1})(1 + h\dot x_k)\,\frac{e^{x_k-x_{k-1}}}{1 + h^2 e^{x_k-x_{k-1}}}. \tag{1.12}$$
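The statement that the Miura maps (1.8) relate the flows (1.2) and (1.1) can be checked numerically. The following is a minimal sketch under periodic boundary conditions; the function names and the test data are illustrative choices, not taken from the paper. It pushes the VL vector field forward through $M_1$ and $M_2$ by a central difference (exact up to rounding here, since both maps are at most quadratic) and compares the result with the TL vector field.

```python
# Illustrative sketch: the Miura maps M1, M2 of (1.8) send the periodic
# Volterra flow (1.2) to the periodic Toda flow (1.1).

def vl_rhs(u, v):
    """Right-hand side of VL (1.2), indices taken mod n."""
    n = len(u)
    du = [u[k] * (v[k] - v[k - 1]) for k in range(n)]
    dv = [v[k] * (u[(k + 1) % n] - u[k]) for k in range(n)]
    return du, dv

def tl_rhs(a, b):
    """Right-hand side of TL (1.1), indices taken mod n; returns (da, db)."""
    n = len(a)
    db = [a[k] - a[k - 1] for k in range(n)]
    da = [a[k] * (b[(k + 1) % n] - b[k]) for k in range(n)]
    return da, db

def miura1(u, v):
    """M1: b_k = u_k + v_{k-1}, a_k = u_k v_k; returns (a, b)."""
    n = len(u)
    return [u[k] * v[k] for k in range(n)], [u[k] + v[k - 1] for k in range(n)]

def miura2(u, v):
    """M2: b_k = u_k + v_k, a_k = u_{k+1} v_k; returns (a, b)."""
    n = len(u)
    return ([u[(k + 1) % n] * v[k] for k in range(n)],
            [u[k] + v[k] for k in range(n)])

def miura_defect(miura, u, v, eps=1e-6):
    """Max difference between the pushed-forward VL field and the TL field."""
    du, dv = vl_rhs(u, v)
    up = [x + eps * d for x, d in zip(u, du)]
    vp = [x + eps * d for x, d in zip(v, dv)]
    um = [x - eps * d for x, d in zip(u, du)]
    vm = [x - eps * d for x, d in zip(v, dv)]
    ap, bp = miura(up, vp)
    am, bm = miura(um, vm)
    da, db = tl_rhs(*miura(u, v))
    err_a = max(abs((p - m) / (2 * eps) - d) for p, m, d in zip(ap, am, da))
    err_b = max(abs((p - m) / (2 * eps) - d) for p, m, d in zip(bp, bm, db))
    return max(err_a, err_b)

u0 = [0.7, 1.3, 0.9, 1.1, 0.5]
v0 = [1.2, 0.4, 0.8, 1.5, 0.6]
defects = [miura_defect(m, u0, v0) for m in (miura1, miura2)]
```

Both entries of `defects` should be at rounding level; a defect of order one would indicate a mismatch of indices in (1.8).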


Mathematical structures related to RTL, including Lax representations, multi-Hamiltonian structure, and so on, were further investigated in [BR, ZTOF, S1, S2]. A general approach to constructing “relativistic” generalizations of integrable lattice systems, applicable to the whole lattice KP hierarchy, was proposed in [GK1]. Paradoxically, up to now nobody seems to know what the relativistic Volterra lattice is. We propose here an answer to this question.

Let us stress that we are not concerned here with relativistic invariance. Instead, our aim is to construct integrable lattice systems which are regular one-parameter deformations of VL and are Miura-related to RTL in the same manner as VL is related to TL. Needless to say, the “relativistic Miura maps” we are looking for have to be regular deformations of the standard ones. Surprisingly, these Miura maps can be chosen to be identical with the nonrelativistic ones. There exist two different systems that are sent to the RTL by one or the other of the two Miura maps (1.8), $M_{1,2}(\underline u, \underline v) = (a, d)$, so that “the relativistic Volterra lattice” actually splits into two systems:

$$\dot{\underline u}_k = \underline u_k(\underline v_k - \underline v_{k-1} + h\underline u_k\underline v_k - h\underline u_{k-1}\underline v_{k-1}), \qquad \dot{\underline v}_k = \underline v_k(\underline u_{k+1} - \underline u_k + h\underline u_{k+1}\underline v_{k+1} - h\underline u_k\underline v_k), \tag{1.13}$$

and

$$\dot{\underline u}_k = \underline u_k(\underline v_k - \underline v_{k-1} + h\underline u_{k+1}\underline v_k - h\underline u_k\underline v_{k-1}), \qquad \dot{\underline v}_k = \underline v_k(\underline u_{k+1} - \underline u_k + h\underline u_{k+1}\underline v_k - h\underline u_k\underline v_{k-1}). \tag{1.14}$$
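The Miura relation to RTL can be verified numerically. The sketch below (illustrative function names, periodic boundary conditions, arbitrary test data) pushes the vector field (1.13) forward through $M_1$ and the vector field (1.14) through $M_2$ by the chain rule, and compares each result with the RTL vector field (1.9).

```python
# Illustrative sketch: (1.13) with M1 and (1.14) with M2 both reproduce RTL (1.9).

def rtl_rhs(a, d, h):
    """Right-hand side of RTL (1.9), indices mod n; returns (da, dd)."""
    n = len(a)
    dd = [(1 + h * d[k]) * (a[k] - a[k - 1]) for k in range(n)]
    da = [a[k] * (d[(k + 1) % n] - d[k] + h * a[(k + 1) % n] - h * a[k - 1])
          for k in range(n)]
    return da, dd

def rvl1_rhs(u, v, h):
    """Right-hand side of the system (1.13)."""
    n = len(u)
    du = [u[k] * (v[k] - v[k - 1] + h * u[k] * v[k] - h * u[k - 1] * v[k - 1])
          for k in range(n)]
    dv = [v[k] * (u[(k + 1) % n] - u[k]
                  + h * u[(k + 1) % n] * v[(k + 1) % n] - h * u[k] * v[k])
          for k in range(n)]
    return du, dv

def rvl2_rhs(u, v, h):
    """Right-hand side of the system (1.14)."""
    n = len(u)
    du = [u[k] * (v[k] - v[k - 1] + h * u[(k + 1) % n] * v[k] - h * u[k] * v[k - 1])
          for k in range(n)]
    dv = [v[k] * (u[(k + 1) % n] - u[k]
                  + h * u[(k + 1) % n] * v[k] - h * u[k] * v[k - 1])
          for k in range(n)]
    return du, dv

def defect_m1(u, v, h):
    """M1: d_k = u_k + v_{k-1}, a_k = u_k v_k, applied to (1.13)."""
    n = len(u)
    du, dv = rvl1_rhs(u, v, h)
    a = [u[k] * v[k] for k in range(n)]
    d = [u[k] + v[k - 1] for k in range(n)]
    da_push = [du[k] * v[k] + u[k] * dv[k] for k in range(n)]
    dd_push = [du[k] + dv[k - 1] for k in range(n)]
    da, dd = rtl_rhs(a, d, h)
    return max(abs(x - y) for x, y in zip(da_push + dd_push, da + dd))

def defect_m2(u, v, h):
    """M2: d_k = u_k + v_k, a_k = u_{k+1} v_k, applied to (1.14)."""
    n = len(u)
    du, dv = rvl2_rhs(u, v, h)
    a = [u[(k + 1) % n] * v[k] for k in range(n)]
    d = [u[k] + v[k] for k in range(n)]
    da_push = [du[(k + 1) % n] * v[k] + u[(k + 1) % n] * dv[k] for k in range(n)]
    dd_push = [du[k] + dv[k] for k in range(n)]
    da, dd = rtl_rhs(a, d, h)
    return max(abs(x - y) for x, y in zip(da_push + dd_push, da + dd))

u0 = [0.7, 1.3, 0.9, 1.1, 0.5]
v0 = [1.2, 0.4, 0.8, 1.5, 0.6]
defect1 = defect_m1(u0, v0, 0.3)
defect2 = defect_m2(u0, v0, 0.3)
```

For $h = 0$ the same functions reduce to the nonrelativistic check of VL against TL.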

We suspect that our way of finding these systems might be more significant than the systems themselves. Namely, our route is through the theory of integrable time discretizations. It was discovered in [S1] that a certain integrable discretization of the usual TL,

$$\tilde{\underline b}_k = \underline b_k + h(\underline a_k - \underline a_{k-1}), \qquad \tilde{\underline a}_k(1 + h\tilde{\underline b}_k) = \underline a_k(1 + h\tilde{\underline b}_{k+1}), \tag{1.15}$$

shares the integrals of motion and the invariant Poisson structure with RTL (upon the change of variables $\underline b_k = d_k + h\underline a_{k-1}$). In other words, this discretization belongs to the RTL hierarchy. In [PGR, S3] it was noticed that this discretization is connected to another one, belonging to the TL hierarchy:

$$\tilde{\mathbf b}_k = \mathbf b_k + h(\mathbf a_k - \tilde{\mathbf a}_{k-1}), \qquad \tilde{\mathbf a}_k(1 + h\tilde{\mathbf b}_k) = \mathbf a_k(1 + h\mathbf b_{k+1}). \tag{1.16}$$

These two 1+1-dimensional discrete systems (with one discrete space coordinate and one discrete time) may be seen as resulting from one and the same 2-dimensional discrete system (living on a 2-dimensional lattice) by posing an initial value problem in two different ways. In other words, the Cauchy data for the two discretizations are prescribed on two different “discrete curves” on the 2-dimensional lattice. Although very close on the 2-dimensional lattice, Eqs. (1.15) and (1.16) nevertheless have very different properties, such as invariant Poisson brackets, integrals of motion, Lax matrices, etc.; in short, they belong to different hierarchies. On the level of Hirota’s bilinear representations and τ-functions such a relation was observed also in [OKMS].

Thus, our strategy to find the “relativistic” deformations for continuous time lattice systems (with one discrete space coordinate) may be described as follows.

First, for a given integrable lattice system, one constructs an integrable time discretization belonging to the same hierarchy. Such time discretizations appeared first in [AL, GK2], and a systematic procedure was developed in [S3, S4]. As a rule, this approach results in nonlocal equations of motion. However, these nonlocal discrete time equations of motion may be brought into a local form with the help of the so-called localizing changes of variables, which were found in [S6] for a large set of examples.

Second, and this is a crucial step in finding new hierarchies, one considers the resulting discrete time system as a system on a two-dimensional lattice, and tries to find new meaningful initial value problems for this two-dimensional lattice. This is close in spirit to the constructions in [PNC, NPCQ, PN]. The resulting discrete time systems belong to integrable hierarchies distinct from the original one (this fact is not stressed in the papers just mentioned).

As the third and final step, one has to identify these novel hierarchies, in particular, to find integrals of motion, invariant Poisson structures, and higher flows.

In this paper we first recall how this program could be realized to find RTL (although the way this system was actually discovered was quite different), and then demonstrate how the relativistic Volterra lattice may be derived. In particular, we show that (1.13) and (1.14) are the simplest flows of the hierarchies to which the following two explicit discretizations of VL belong:

$$\tilde{\underline u}_k(1 + h\underline v_{k-1}) = \underline u_k(1 + h\underline v_k), \qquad \tilde{\underline v}_k(1 + h\tilde{\underline u}_k) = \underline v_k(1 + h\tilde{\underline u}_{k+1}), \tag{1.17}$$

and

$$\tilde{\underline u}_k(1 + h\tilde{\underline v}_{k-1}) = \underline u_k(1 + h\tilde{\underline v}_k), \qquad \tilde{\underline v}_k(1 + h\underline u_k) = \underline v_k(1 + h\underline u_{k+1}), \tag{1.18}$$

respectively. One can see that the corresponding constructions may be described as a factorization of RTL, in complete analogy with the nonrelativistic case. Finally, we show how to generalize these results to some of the Bogoyavlensky lattices.

2. General Framework

2.1. Lax equations and representations. Our approach to integrable lattice systems is based on the notion of Lax representations. We consider Lax equations of one of the following types:

$$\dot L = \big[L, \pi_+(f(L))\big] = -\big[L, \pi_-(f(L))\big], \tag{2.1}$$

or

$$\dot L_j = L_j\cdot\pi_+\big(f(T_{j-1})\big) - \pi_+\big(f(T_j)\big)\cdot L_j = -L_j\cdot\pi_-\big(f(T_{j-1})\big) + \pi_-\big(f(T_j)\big)\cdot L_j. \tag{2.2}$$

Let us explain the notation. Let $g$ be an associative algebra. One can introduce in $g$ the structure of a Lie algebra in the standard way. Let $g_+$, $g_-$ be two subalgebras such that as a vector space $g$ is a direct sum $g = g_+ \oplus g_-$. Denote by $\pi_\pm : g \mapsto g_\pm$ the corresponding projections. Finally, let $f : g \mapsto g$ be an Ad-covariant function on $g$, and let $L$ stand for a generic element of $g$. Then (2.1) is a certain differential equation on $g$.

Further, let $\boldsymbol g = \bigotimes_{j=1}^m g$ be a direct product of $m$ copies of the algebra $g$. A generic element of $\boldsymbol g$ is denoted by $\boldsymbol L = (L_1, \ldots, L_m)$. We also use the notation

$$T_j = T_j(\boldsymbol L) = L_j\cdot\ldots\cdot L_1\cdot L_m\cdot\ldots\cdot L_{j+1}. \tag{2.3}$$


Then (2.2) is a certain differential equation on $\boldsymbol g$. Such equations are sometimes called Lax triads.

One says that (2.1), resp. (2.2), is a Lax representation of a Hamiltonian flow

$$\dot x = \{H, x\} \tag{2.4}$$

on a Poisson manifold $(\mathcal X, \{\cdot,\cdot\})$, if there exists a map $L : \mathcal X \mapsto g$ (resp. $\boldsymbol L : \mathcal X \mapsto \boldsymbol g$) such that the former equations of motion are equivalent to the latter ones. Let us stress that when considering equations (2.1), resp. (2.2), in the role of a Lax representation, the letter $L$ (resp. $\boldsymbol L$) does not stand for a generic element of the corresponding algebra any more; rather, it represents the elements of the images of the maps $L : \mathcal X \mapsto g$ and $\boldsymbol L : \mathcal X \mapsto \boldsymbol g$, respectively. The elements $L(x)$, resp. $\boldsymbol L(x)$ (and the map $L$, resp. $\boldsymbol L$, itself) are called Lax matrices.

2.2. r-matrix Poisson brackets. Recall that there exist several constructions of Poisson brackets on associative algebras implying the Lax form of Hamiltonian equations of motion. We recall here some of them. Suppose that $g$ carries a nondegenerate scalar product $\langle\cdot,\cdot\rangle$, bi-invariant with respect to the multiplication in $g$. Let $R$ be a linear operator on $g$.

Definition 1. [STS] A linear r-matrix bracket on $g$ corresponding to the operator $R$ is defined by:

$$\{\varphi, \psi\}_1(L) = \frac12\Big\langle \big[R(\nabla\varphi(L)), \nabla\psi(L)\big] + \big[\nabla\varphi(L), R(\nabla\psi(L))\big],\ L \Big\rangle. \tag{2.5}$$

If this is indeed a Poisson bracket, it will be denoted by $\mathrm{PB}_1(R)$.

Theorem 1. [STS] A sufficient condition for (2.5) to define a Poisson bracket is given by the modified Yang–Baxter equation for the operator $R$, $\mathrm{mYB}(R; \alpha)$, which reads

$$[R(u), R(v)] - R\big([R(u), v] + [u, R(v)]\big) = -\alpha\,[u, v] \qquad \forall u, v \in g. \tag{2.6}$$

Now let $A_1$, $A_2$, $S$ be three linear operators on $g$, $A_1$ and $A_2$ being skew-symmetric:

$$A_1^* = -A_1, \qquad A_2^* = -A_2. \tag{2.7}$$

Definition 2. [S2] A quadratic r-matrix bracket on $g$ corresponding to the triple $A_1$, $A_2$, $S$ is defined by:

$$\{\varphi, \psi\}_2(L) = \frac12\big\langle A_1(d'\varphi(L)), d'\psi(L)\big\rangle - \frac12\big\langle A_2(d\varphi(L)), d\psi(L)\big\rangle + \frac12\big\langle S(d\varphi(L)), d'\psi(L)\big\rangle - \frac12\big\langle S^*(d'\varphi(L)), d\psi(L)\big\rangle, \tag{2.8}$$

where we denote for brevity

$$d\varphi(L) = L\cdot\nabla\varphi(L), \qquad d'\varphi(L) = \nabla\varphi(L)\cdot L. \tag{2.9}$$

If this expression defines a Poisson bracket, we shall denote it by $\mathrm{PB}_2(A_1, A_2, S)$.


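In what follows we shall usually suppose the following condition to be satisfied:

$$A_1 + S = A_2 + S^* = R. \tag{2.10}$$

Then a linearization of $\mathrm{PB}_2(A_1, A_2, S)$ in the unit element of $g$ coincides with $\mathrm{PB}_1(R)$, and we call the former a quadratization of the latter.

Theorem 2. [S2] A sufficient condition for (2.8) to be a Poisson bracket is given by Eqs. (2.10) and

$$\mathrm{mYB}(R; \alpha), \qquad \mathrm{mYB}(A_1; \alpha), \qquad \mathrm{mYB}(A_2; \alpha). \tag{2.11}$$

One of the most important properties of the r-matrix brackets is the following one.

Theorem 3. Ad-invariant functions on $g$ are in involution with respect to the bracket $\mathrm{PB}_1(R)$ and with respect to its quadratizations $\mathrm{PB}_2(A_1, A_2, S)$. The Hamiltonian equations of motion on $g$ corresponding to an Ad-invariant Hamilton function $\varphi$ have the Lax form

$$\dot L = \frac12\big[L, R(f(L))\big], \tag{2.12}$$

where $f(L) = \nabla\varphi(L)$ for the linear r-matrix bracket, and $f(L) = d\varphi(L)$ for its quadratizations.

Quadratic r-matrix brackets have interesting and important features when considered on a “big” algebra $\boldsymbol g = \bigotimes_{j=1}^m g$. This algebra carries a (nondegenerate, bi-invariant) scalar product

$$\langle\langle \boldsymbol L, \boldsymbol M\rangle\rangle = \sum_{k=1}^m \langle L_k, M_k\rangle.$$

Working with linear operators on $\boldsymbol g$, we use the following natural notation. Let $\boldsymbol A : \boldsymbol g \mapsto \boldsymbol g$ be a linear operator, and let $\big(\boldsymbol A(\boldsymbol L)\big)_i$ be the $i$th component of $\boldsymbol A(\boldsymbol L)$; then we set

$$\big(\boldsymbol A(\boldsymbol L)\big)_i = \sum_{j=1}^m (\boldsymbol A)_{ij}(L_j). \tag{2.13}$$

For a smooth function $\Phi(\boldsymbol L)$ on $\boldsymbol g$ we also denote by $\nabla_j\Phi$, $d_j\Phi$, $d'_j\Phi$ the $j$th components of the corresponding objects. Now let $\boldsymbol A_1$, $\boldsymbol A_2$, $\boldsymbol S$ be linear operators on $\boldsymbol g$ satisfying conditions analogous to (2.7) and to (2.11). One has, obviously:

$$(\boldsymbol A_1)_{ij}^* = -(\boldsymbol A_1)_{ji}, \qquad (\boldsymbol A_2)_{ij}^* = -(\boldsymbol A_2)_{ji}, \qquad (\boldsymbol S)_{ij}^* = (\boldsymbol S^*)_{ji}.$$

Then one can define the bracket $\mathrm{PB}_2(\boldsymbol A_1, \boldsymbol A_2, \boldsymbol S)$ on $\boldsymbol g$. In components it reads:

$$\{\Phi, \Psi\}_2(\boldsymbol L) = \frac12\sum_{i,j=1}^m \big\langle (\boldsymbol A_1)_{ij}(d'_j\Phi), d'_i\Psi\big\rangle - \frac12\sum_{i,j=1}^m \big\langle (\boldsymbol A_2)_{ij}(d_j\Phi), d_i\Psi\big\rangle + \frac12\sum_{i,j=1}^m \big\langle (\boldsymbol S)_{ij}(d_j\Phi), d'_i\Psi\big\rangle - \frac12\sum_{i,j=1}^m \big\langle (\boldsymbol S^*)_{ij}(d'_j\Phi), d_i\Psi\big\rangle. \tag{2.14}$$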


Theorem 4. [S5] Let $\boldsymbol g$ be equipped with the Poisson bracket $\mathrm{PB}_2(\boldsymbol A_1, \boldsymbol A_2, \boldsymbol S)$. Suppose that the following relations hold:

$$(\boldsymbol A_1)_{j+1,j+1} + (\boldsymbol S)_{j+1,j} = (\boldsymbol A_2)_{j,j} + (\boldsymbol S^*)_{j,j+1} = R \quad \text{for all } 1 \le j \le m,$$

$$(\boldsymbol A_1)_{i+1,j+1} = -(\boldsymbol S)_{i+1,j} = (\boldsymbol S^*)_{i,j+1} = -(\boldsymbol A_2)_{i,j} \quad \text{for } i \ne j.$$

Then each map $T_j : \boldsymbol g \mapsto g$ (2.3) is Poisson, if the target space $g$ is equipped with the Poisson bracket $\mathrm{PB}_2\big((\boldsymbol A_1)_{j+1,j+1}, (\boldsymbol A_2)_{j,j}, (\boldsymbol S)_{j+1,j}\big)$. A Hamilton function of the form $\Phi(\boldsymbol L) = \varphi(L_m\cdot\ldots\cdot L_1)$, where $\varphi$ is an Ad-invariant function on $g$, generates Hamiltonian equations of motion on $\boldsymbol g$ having the Lax form:

$$\dot L_j = L_j B_{j-1} - B_j L_j, \qquad B_j = \frac12 R\big(d\varphi(T_j)\big). \tag{2.15}$$

(In all formulas the subscripts should be taken (mod $m$).)

We have discussed above the r-matrix origin of Lax equations. If one is concerned with a Lax representation of a Hamiltonian flow (2.4) on a Poisson manifold $(\mathcal X, \{\cdot,\cdot\})$, then finding an r-matrix interpretation for it consists of finding an r-matrix bracket on $g$ (or on $\boldsymbol g$) such that the Lax matrix map $L : \mathcal X \mapsto g$ (resp. $\boldsymbol L : \mathcal X \mapsto \boldsymbol g$) is a Poisson map.

2.3. Factorizations and integrable discretization. A further remarkable feature of Eqs. (2.1) and (2.2) is the possibility of solving them explicitly in terms of a certain factorization problem in the Lie group $G$ corresponding to $g$ [Sy, STS, RSTS]. (Actually, this can be done even in the more general situation of hierarchies governed by R-operators satisfying the modified Yang–Baxter equation, see [RSTS].) The factorization problem is described by the equation

$$U = \Pi_+(U)\cdot\Pi_-(U), \qquad U \in G, \quad \Pi_\pm(U) \in G_\pm, \tag{2.16}$$

where $G_\pm$ are two subgroups of $G$ with the Lie algebras $g_\pm$, respectively. This problem has a unique solution in a certain neighbourhood of the group unit. In the situations we consider in the sequel $G$ will be a matrix group, and we write the adjoint action of the group elements on $g$ as a conjugation by the corresponding matrices. In this context we write $\Pi_\pm^{-1}(U)$ for $(\Pi_\pm(U))^{-1}$. Correspondingly, we call Ad-covariant functions $g \mapsto g$ also “conjugation covariant”. This notation has the additional advantage of being applicable also to functions $g \mapsto G$.

Based on the above mentioned explicit solution, the following recipe for integrable discretization was formulated in [S3, S4]. In all difference equations below we use the tilde to denote the discrete time shift, so that, for example, $\tilde L = L(t + h)$ if $L = L(t)$, $t \in h\mathbb Z$.

Recipe. Suppose you are looking for an integrable discretization of an integrable system (2.4) allowing a Lax representation of the form (2.1). Then as a solution of your task you may take the difference equation

$$\tilde L = \Pi_+^{-1}\big(F(L)\big)\cdot L\cdot\Pi_+\big(F(L)\big) = \Pi_-\big(F(L)\big)\cdot L\cdot\Pi_-^{-1}\big(F(L)\big) \tag{2.17}$$


with the same Lax matrix $L$ and some conjugation covariant function $F : g \mapsto G$ such that $F(L) = I + hf(L) + O(h^2)$. Analogously, if your system has a Lax representation of the form (2.2) on the algebra $\boldsymbol g$, then you may take as its integrable discretization the difference Lax equation

$$\tilde L_j = \Pi_+^{-1}\big(F(T_j)\big)\cdot L_j\cdot\Pi_+\big(F(T_{j-1})\big) = \Pi_-\big(F(T_j)\big)\cdot L_j\cdot\Pi_-^{-1}\big(F(T_{j-1})\big) \tag{2.18}$$

with $F$ as above.

Obviously, by construction, the discretizations obtained in this way share the Lax matrix and therefore the integrals of motion with their underlying continuous time systems. Moreover, they share also the invariant Poisson brackets. Indeed, the above mentioned factorization theorems imply that the maps (2.17), (2.18) are the time $h$ shifts along the trajectories of the corresponding flows (2.1), (2.2) with $f(L)$ replaced by $h^{-1}\log(F(L)) = f(L) + O(h)$. This, in turn, implies that if all flows of the hierarchy (2.1) [resp. (2.2)] are Hamiltonian with respect to a certain Poisson bracket, then our discretizations are Poisson maps with respect to this bracket. In particular, if the Lax matrices $L$ [resp. $\boldsymbol L$] form a Poisson submanifold for some r-matrix bracket, then this submanifold is left invariant by the corresponding Poisson map (2.17) [resp. (2.18)]. We shall say that our recipe yields discretizations living in the same hierarchies as the underlying continuous time systems.

2.4. Basic algebras and operators. In what follows we fix an algebra $g$ suited to describe various lattice systems with periodic boundary conditions, as well as certain operators taking part in the corresponding r-matrix constructions. So, starting from this point the symbols $g$, $g_\pm$, $G$, $G_\pm$, $\pi_\pm$, $\Pi_\pm$, $R$, $A_1$, $A_2$, $S$ will carry only the following meanings.

The algebra $g$ is a twisted loop algebra over $gl(N)$, consisting of Laurent polynomials $L(\lambda)$ with coefficients from $gl(N)$ and a natural commutator $[u\lambda^j, v\lambda^k] = [u, v]\lambda^{j+k}$, satisfying the additional condition $\Omega L(\lambda)\Omega^{-1} = L(\omega\lambda)$, where $\Omega = \mathrm{diag}(1, \omega, \ldots, \omega^{N-1})$, $\omega = \exp(2\pi i/N)$. In other words, elements of $g$ have the following structure:

$$L(\lambda) = \sum_p \ell^{(p)} E^p. \tag{2.19}$$

In this formula $\ell^{(p)} = \mathrm{diag}\big(\ell^{(p)}_1, \ldots, \ell^{(p)}_N\big)$ are diagonal matrices, and

$$E = \lambda\sum_{k=1}^N E_{k+1,k}, \qquad E^{-1} = \lambda^{-1}\sum_{k=1}^N E_{k,k+1}$$

are the shift matrices (here and below $E_{jk}$ stands for the matrix whose only nonzero entry is on the intersection of the $j$th row and the $k$th column and is equal to 1; we set $E_{N+1,N} = E_{1,N}$, $E_{N,N+1} = E_{N,1}$, and in general understand all subscripts (mod $N$)).


The nondegenerate bi-invariant scalar product on $g$ is chosen as

$$\langle L(\lambda), M(\lambda)\rangle = \mathrm{tr}\big(L(\lambda)\cdot M(\lambda)\big)_0, \tag{2.20}$$

the subscript 0 denoting the free term of the formal Laurent series. This scalar product allows one to identify $g^*$ with $g$. The two subalgebras $g_+$, $g_-$ such that $g = g_+ \oplus g_-$ are chosen as

$$g_+ = \Big\{\sum_{p\ge 0}\ell^{(p)}E^p\Big\}, \qquad g_- = \Big\{\sum_{p<0}\ell^{(p)}E^p\Big\}. \tag{2.21}$$

The corresponding decomposition $L = \pi_+(L) + \pi_-(L)$ of an arbitrary element $L \in g$ will be called the generalized LU decomposition. We denote also by $g_0$ the commutative subalgebra of $g$ consisting of $\lambda$-independent diagonal matrices (i.e. of matrices $\ell^{(0)}E^0$). The linear operator on $g$ assigning to each element $L(\lambda)$ (2.19) its free term $\ell^{(0)}$ will be denoted by $P_0$.

The group $G$ corresponding to the twisted loop algebra $g$ is a twisted loop group, consisting of $GL(N)$-valued functions $U(\lambda)$ of the complex parameter $\lambda$, regular in $\mathbb{CP}^1\setminus\{0, \infty\}$ and satisfying $\Omega U(\lambda)\Omega^{-1} = U(\omega\lambda)$, with the diagonal matrix $\Omega = \mathrm{diag}(1, \omega, \ldots, \omega^{N-1})$, $\omega = \exp(2\pi i/N)$. Its subgroups $G_+$ and $G_-$ corresponding to the Lie algebras $g_+$ and $g_-$ are singled out by the following conditions: the elements $U(\lambda) \in G_+$ are regular in the neighbourhood of $\lambda = 0$, and the elements $U(\lambda) \in G_-$ are regular in the neighbourhood of $\lambda = \infty$ and take at this point the value $U(\infty) = I$. We call the corresponding $\Pi_+\Pi_-$ factorization the generalized LU factorization.

The basic operator governing the hierarchies of Lax equations is:

$$R = \pi_+ - \pi_-. \tag{2.22}$$
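That an operator of the form (2.22) satisfies the modified Yang–Baxter equation (2.6) with $\alpha = 1$ can be illustrated in a finite-dimensional setting. The sketch below rests on an assumption made only for illustration: the twisted loop algebra is replaced by $gl(3)$, with $g_+$ the lower triangular matrices (diagonal included) and $g_-$ the strictly upper triangular ones, which are again two subalgebras with $g = g_+ \oplus g_-$. The helper names are hypothetical.

```python
# Finite-dimensional illustration (assumption: gl(3) with a triangular splitting
# in place of the twisted loop algebra) of mYB(R; 1) for R = pi_+ - pi_-.

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def comm(x, y):
    """Matrix commutator [x, y]."""
    xy, yx = matmul(x, y), matmul(y, x)
    n = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(n)] for i in range(n)]

def r_op(x):
    """R = pi_+ - pi_-: keep entries with i >= j, flip the sign where i < j."""
    n = len(x)
    return [[x[i][j] if i >= j else -x[i][j] for j in range(n)] for i in range(n)]

def myb_defect(u, v, alpha=1.0):
    """Largest entry of [Ru, Rv] - R([Ru, v] + [u, Rv]) + alpha [u, v]."""
    n = len(u)
    ru, rv = r_op(u), r_op(v)
    t1 = comm(ru, rv)
    c1, c2 = comm(ru, v), comm(u, rv)
    t2 = r_op([[c1[i][j] + c2[i][j] for j in range(n)] for i in range(n)])
    t3 = comm(u, v)
    return max(abs(t1[i][j] - t2[i][j] + alpha * t3[i][j])
               for i in range(n) for j in range(n))

u = [[0.3, -1.2, 0.5], [2.0, 0.7, -0.4], [-0.9, 1.1, 0.2]]
v = [[1.5, 0.6, -0.8], [-0.3, 0.9, 1.4], [0.7, -1.6, 0.1]]
defect = myb_defect(u, v)
```

The same computation with any other value of `alpha` leaves a residual proportional to $[u, v]$, which is a quick way to see that the normalization $\alpha = 1$ belongs to the operator $\pi_+ - \pi_-$ itself.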

Denote by $R_0$, $P_0$ its skew-symmetric and its symmetric parts, respectively:

$$R_0 = (R - R^*)/2, \qquad P_0 = (R + R^*)/2.$$

Obviously, this definition of $P_0$ is consistent with the previous one. Let the skew-symmetric operator $W$ act on $g_0$ according to

$$W(E_{jj}) = \sum_{k<j} E_{kk} - \sum_{k>j} E_{kk},$$

and on the rest of $g$ according to $W = W \circ P_0$. Finally, define:

$$A_1 = R_0 + W, \qquad A_2 = R_0 - W, \qquad S = P_0 - W, \qquad S^* = P_0 + W. \tag{2.23}$$

These operators will be the basic building blocks in the quadratic r-matrix brackets appearing below.


3. Recalling the Toda Lattice Case

3.1. TL. We consider the equations of motion of TL (1.1) under periodic boundary conditions: all subscripts are taken (mod $N$). The phase space of this system is

$$\mathcal T = \mathbb R^{2N}(b_1, a_1, \ldots, b_N, a_N) \tag{3.1}$$

(recall that $a_0 \equiv a_N$, $b_{N+1} \equiv b_1$). There exist three compatible local Poisson brackets on $\mathcal T$ such that the system TL is Hamiltonian with respect to each one of them, see [K1]. We adopt once and for all the following conventions: the Poisson brackets will be defined by writing down all nonvanishing brackets between the coordinate functions; the indices in the corresponding formulas are taken (mod $N$).

The “linear” Poisson structure on $\mathcal T$ is defined by the brackets

$$\{b_k, a_k\}_1 = -a_k, \qquad \{a_k, b_{k+1}\}_1 = -a_k; \tag{3.2}$$

the corresponding Hamilton function for the flow TL is given by:

$$H_2(a, b) = \frac12\sum_{k=1}^N b_k^2 + \sum_{k=1}^N a_k. \tag{3.3}$$

The “quadratic” Poisson structure has the following definition:

$$\{b_k, a_k\}_2 = -a_k b_k, \qquad \{a_k, b_{k+1}\}_2 = -a_k b_{k+1}, \qquad \{b_k, b_{k+1}\}_2 = -a_k, \qquad \{a_k, a_{k+1}\}_2 = -a_{k+1}a_k. \tag{3.4}$$

The Hamilton function generating TL in this bracket is:

$$H_1(a, b) = \sum_{k=1}^N b_k. \tag{3.5}$$

Finally, the “cubic” bracket on $\mathcal T$ is given by the relations

$$\begin{aligned}
\{b_k, a_k\}_3 &= -a_k(b_k^2 + a_k), & \{a_k, b_{k+1}\}_3 &= -a_k(b_{k+1}^2 + a_k),\\
\{b_k, b_{k+1}\}_3 &= -a_k(b_k + b_{k+1}), & \{a_k, a_{k+1}\}_3 &= -2a_k a_{k+1} b_{k+1},\\
\{b_k, a_{k+1}\}_3 &= -a_k a_{k+1}, & \{a_k, b_{k+2}\}_3 &= -a_k a_{k+1}.
\end{aligned} \tag{3.6}$$

The corresponding Hamilton function of the flow TL is:

$$H_0(a, b) = \frac12\sum_{k=1}^N \log(a_k). \tag{3.7}$$

The Lax representation of the Toda lattice [F, M] lives in the algebra $g$ introduced in the previous section. We shall work with the Lax matrix

$$L(a, b, \lambda) = \lambda\sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N b_k E_{k,k} + \lambda^{-1}\sum_{k=1}^N a_k E_{k,k+1} = E + b + aE^{-1}, \tag{3.8}$$

where the diagonal matrices $a = \mathrm{diag}(a_1, \ldots, a_N)$, $b = \mathrm{diag}(b_1, \ldots, b_N)$ are introduced.
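As a sanity check of the Lax matrix (3.8), one can verify numerically that the commutator $[L, B]$ with $B = \pi_+(L) = E + b$ reproduces the TL equations (1.1). The sketch below evaluates all matrices at $\lambda = 1$ with $N = 4$; the helper names and the test data are illustrative choices, not the paper's.

```python
# Illustrative check at lambda = 1: [L, B] = dL/dt along the Toda flow (1.1),
# where L = E + b + a E^{-1} and B = E + b.

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y, sign=1.0):
    n = len(x)
    return [[x[i][j] + sign * y[i][j] for j in range(n)] for i in range(n)]

def diag(d):
    n = len(d)
    return [[d[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def upper(c):
    """sum_k c_k E_{k,k+1}, indices mod n (lambda = 1)."""
    n = len(c)
    return [[c[i] if j == (i + 1) % n else 0.0 for j in range(n)] for i in range(n)]

def shift(n):
    """E = sum_k E_{k+1,k}, indices mod n (lambda = 1)."""
    return [[1.0 if i == (j + 1) % n else 0.0 for j in range(n)] for i in range(n)]

def lax_defect(a, b):
    """Largest entry of [L, B] - dL/dt, with dL/dt built from (1.1)."""
    n = len(a)
    B = add(shift(n), diag(b))                 # B = pi_+(L) = E + b
    L = add(B, upper(a))                       # L = E + b + a E^{-1}
    db = [a[k] - a[k - 1] for k in range(n)]
    da = [a[k] * (b[(k + 1) % n] - b[k]) for k in range(n)]
    L_dot = add(diag(db), upper(da))           # E does not depend on time
    C = add(matmul(L, B), matmul(B, L), sign=-1.0)   # commutator [L, B]
    D = add(C, L_dot, sign=-1.0)
    return max(abs(D[i][j]) for i in range(n) for j in range(n))

a0 = [1.0, 0.8, 1.2, 0.9]
b0 = [0.1, -0.2, 0.3, 0.0]
defect = lax_defect(a0, b0)
```

The diagonal of the commutator carries $a_k - a_{k-1}$ and the superdiagonal carries $a_k(b_{k+1} - b_k)$, so a vanishing defect is a component-by-component restatement of (1.1).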


The equations of motion (1.1) are equivalent to the Lax equations

$$\dot L = [L, B] = -[L, A] \tag{3.9}$$

with

$$B = \pi_+(L) = \lambda\sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N b_k E_{k,k} = E + b, \tag{3.10}$$

$$A = \pi_-(L) = \lambda^{-1}\sum_{k=1}^N a_k E_{k,k+1} = aE^{-1}, \tag{3.11}$$

where $\pi_\pm : g \mapsto g_\pm$ are the projections to the subalgebras $g_\pm$. Spectral invariants of the Lax matrix $L(a, b, \lambda)$ serve as integrals of motion of this system. Note that all Hamilton functions in different Hamiltonian formulations belong to these spectral invariants. For instance,

$$H_2(a, b) = \frac12\Big(\mathrm{tr}\,L^2(a, b, \lambda)\Big)_0, \qquad H_1(a, b) = \Big(\mathrm{tr}\,L(a, b, \lambda)\Big)_0,$$

where the subscript “0” is used to denote the free term of the corresponding Laurent series. All spectral invariants turn out to be in involution with respect to each of the Poisson brackets (3.2), (3.4), (3.6). Most directly this follows from the r-matrix interpretation of the Lax equation (3.9).

Theorem 5. The Lax matrix map $L(a, b, \lambda) : \mathcal T \mapsto g$ is Poisson, if $\mathcal T$ carries $\{\cdot,\cdot\}_1$ and $g$ carries $\mathrm{PB}_1(R)$, and also if $\mathcal T$ carries $\{\cdot,\cdot\}_2$ and $g$ carries $\mathrm{PB}_2(A_1, A_2, S)$.

The first statement is from [AM], the second one from [S2]. For other versions of such statements (including the r-matrix interpretation of the cubic bracket) see [DLT1, OR, MP].

3.2. TL → dTL. In order to find an integrable time discretization for the flow TL, we apply the recipe of the previous section with $F(L) = I + hL$, i.e. we take as a solution of this problem the map described by the discrete time Lax equation

$$\tilde L = \mathbf B^{-1}L\mathbf B = \mathbf A L\mathbf A^{-1} \quad\text{with}\quad \mathbf B = \Pi_+(I + hL), \quad \mathbf A = \Pi_-(I + hL). \tag{3.12}$$

Theorem 6. [S3] (see also [GK2]). Consider the change of variables $\mathcal T(\mathbf a, \mathbf b) \mapsto \mathcal T(a, b)$ defined by the formulas

$$b_k = \mathbf b_k + h\mathbf a_{k-1}, \qquad a_k = \mathbf a_k(1 + h\mathbf b_k). \tag{3.13}$$

The discrete time Lax equation (3.12) is equivalent to the map $(a, b) \mapsto (\tilde a, \tilde b)$ which in the variables $(\mathbf a, \mathbf b)$ is described by the following equations:

$$\tilde{\mathbf b}_k = \mathbf b_k + h(\mathbf a_k - \tilde{\mathbf a}_{k-1}), \qquad \tilde{\mathbf a}_k(1 + h\tilde{\mathbf b}_k) = \mathbf a_k(1 + h\mathbf b_{k+1}). \tag{3.14}$$
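The map (3.14) is implicit under periodic boundary conditions, but for small $h$ it can be solved by a simple fixed-point iteration. The following sketch does so and then checks that the Toda integrals $H_1$, $H_2$ of (3.5), (3.3), evaluated on the variables obtained through (3.13), are unchanged by one step. The iteration scheme, the function names and the test data are assumptions made for illustration; nothing here is prescribed by the paper.

```python
# Illustrative sketch: one step of dTL (3.14) in the localizing variables,
# solved by fixed-point iteration, plus a check of the Toda integrals.

def dtl_step(a, b, h, iters=200):
    """Solve (3.14) for the updated localizing variables (indices mod n)."""
    n = len(a)
    a_new = list(a)                      # initial guess
    for _ in range(iters):
        b_new = [b[k] + h * (a[k] - a_new[k - 1]) for k in range(n)]
        a_new = [a[k] * (1 + h * b[(k + 1) % n]) / (1 + h * b_new[k])
                 for k in range(n)]
    b_new = [b[k] + h * (a[k] - a_new[k - 1]) for k in range(n)]
    return a_new, b_new

def to_toda(a, b, h):
    """Change of variables (3.13): localizing variables -> Toda variables."""
    n = len(a)
    return ([a[k] * (1 + h * b[k]) for k in range(n)],
            [b[k] + h * a[k - 1] for k in range(n)])

def invariants(a, b):
    """H1 = sum b_k and H2 = (1/2) sum b_k^2 + sum a_k in Toda variables."""
    return sum(b), 0.5 * sum(x * x for x in b) + sum(a)

h = 0.05
a0 = [1.0, 0.8, 1.2, 0.9]
b0 = [0.1, -0.2, 0.3, 0.0]
a1, b1 = dtl_step(a0, b0, h)
inv0 = invariants(*to_toda(a0, b0, h))
inv1 = invariants(*to_toda(a1, b1, h))
```

Conservation of `inv0` reflects the fact that the map is an isospectral transformation of the Toda Lax matrix (3.8); the fixed-point iteration is only one of several possible ways to resolve the implicit dependence.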


Proof. The tri-diagonal structure of the matrix $L$ implies the following bi-diagonal structure for the factors $\mathbf B$, $\mathbf A$:

$$\mathbf B = \Pi_+(I + hL) = \sum_{k=1}^N (1 + h\mathbf b_k)E_{k,k} + h\lambda\sum_{k=1}^N E_{k+1,k}, \tag{3.15}$$

$$\mathbf A = \Pi_-(I + hL) = I + h\lambda^{-1}\sum_{k=1}^N \mathbf a_k E_{k,k+1}. \tag{3.16}$$

The formulas (3.13) represent the matrix equation $I + hL = \mathbf B\mathbf A$, which serves as a definition of the matrices $\mathbf B$, $\mathbf A$. Obviously, these formulas define a local diffeomorphism $\mathcal T(\mathbf a, \mathbf b) \mapsto \mathcal T(a, b)$. Now, the Lax equation (3.12) is easily seen to be equivalent to $I + h\tilde L = \mathbf B^{-1}(\mathbf B\mathbf A)\mathbf B = \mathbf A(\mathbf B\mathbf A)\mathbf A^{-1}$, or

$$\tilde{\mathbf B}\cdot\tilde{\mathbf A} = \mathbf A\cdot\mathbf B. \tag{3.17}$$

The equations of motion (3.14) are nothing but the componentwise form of the latter matrix equation.

The map (3.14) will be denoted dTL. Obviously, it serves as a difference approximation to the Toda flow TL (1.1). The construction assures numerous positive properties of this discretization: in the coordinates $(a, b)$ the map dTL is Poisson with respect to each one of the Poisson brackets (3.2), (3.4), (3.6), it has the same integrals of motion as the flow TL, etc. Going to the coordinates $(\mathbf a, \mathbf b)$ deforms the integrals of motion and the Poisson brackets. Moreover, since the inverse to the map (3.13) is nonlocal, the invariant Poisson brackets become, generally speaking, also nonlocal in the coordinates $(\mathbf a, \mathbf b)$. Remarkably, there turn out to exist linear combinations of the basic invariant Poisson brackets whose pull-backs to the coordinates $(\mathbf a, \mathbf b)$ are described by local formulas.

Theorem 7. a) The pull-back of the bracket

$$\{\cdot,\cdot\}_1 + h\{\cdot,\cdot\}_2 \tag{3.18}$$

on $\mathcal T(a, b)$ under the change of variables (3.13) is the following bracket on $\mathcal T(\mathbf a, \mathbf b)$:

$$\{\mathbf b_k, \mathbf a_k\} = -\mathbf a_k(1 + h\mathbf b_k), \qquad \{\mathbf a_k, \mathbf b_{k+1}\} = -\mathbf a_k(1 + h\mathbf b_{k+1}). \tag{3.19}$$

b) The pull-back of the bracket

$$\{\cdot,\cdot\}_2 + h\{\cdot,\cdot\}_3 \tag{3.20}$$

on $\mathcal T(a, b)$ under the change of variables (3.13) is the following bracket on $\mathcal T(\mathbf a, \mathbf b)$:

$$\begin{aligned}
\{\mathbf b_k, \mathbf a_k\} &= -\mathbf a_k(\mathbf b_k + h\mathbf a_k)(1 + h\mathbf b_k),\\
\{\mathbf a_k, \mathbf b_{k+1}\} &= -\mathbf a_k(\mathbf b_{k+1} + h\mathbf a_k)(1 + h\mathbf b_{k+1}),\\
\{\mathbf b_k, \mathbf b_{k+1}\} &= -\mathbf a_k(1 + h\mathbf b_k)(1 + h\mathbf b_{k+1}),\\
\{\mathbf a_k, \mathbf a_{k+1}\} &= -\mathbf a_k\mathbf a_{k+1}(1 + h\mathbf b_{k+1}).
\end{aligned} \tag{3.21}$$

c) The brackets (3.19), (3.21) are compatible. The map dTL (3.14) is Poisson with respect to both of them.


Proof. To prove the theorem, one has, for example, in the (less laborious) case a) to verify the following statement: the formulas (3.19) imply that the nonvanishing pairwise Poisson brackets of the functions (3.13) are

$$\{b_k, a_k\} = -a_k(1 + hb_k), \qquad \{a_k, b_{k+1}\} = -a_k(1 + hb_{k+1}), \qquad \{b_k, b_{k+1}\} = -ha_k, \qquad \{a_k, a_{k+1}\} = -ha_k a_{k+1}.$$

This verification consists of straightforward calculations.

The map (3.14) was first found in [HTI], along with the Lax representation, but without discussing its Poisson structure and its place in the continuous time Toda hierarchy. We can now determine the hierarchy of continuous time lattice equations to which the map (3.14) belongs. Clearly, this is the Toda hierarchy pulled back under the map (3.13). The previous theorem allows one to calculate the corresponding equations of motion in a systematic (Hamiltonian) fashion.

Theorem 8. [K1]. The pull-back of the flow TL under the change of variables (3.13) is described by the following differential equations:

$$\dot{\mathbf b}_k = (\mathbf a_k - \mathbf a_{k-1})(1 + h\mathbf b_k), \qquad \dot{\mathbf a}_k = \mathbf a_k(\mathbf b_{k+1} - \mathbf b_k). \tag{3.22}$$
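Independently of the Hamiltonian argument, the statement can be confirmed by a direct computation. Writing bold letters for the new variables, so that (3.13) reads $b_k = \mathbf b_k + h\mathbf a_{k-1}$, $a_k = \mathbf a_k(1 + h\mathbf b_k)$, and differentiating along (3.22), one finds

$$\begin{aligned}
\dot b_k &= \dot{\mathbf b}_k + h\dot{\mathbf a}_{k-1} = (\mathbf a_k - \mathbf a_{k-1})(1 + h\mathbf b_k) + h\mathbf a_{k-1}(\mathbf b_k - \mathbf b_{k-1})\\
&= \mathbf a_k(1 + h\mathbf b_k) - \mathbf a_{k-1}(1 + h\mathbf b_{k-1}) = a_k - a_{k-1},\\
\dot a_k &= \dot{\mathbf a}_k(1 + h\mathbf b_k) + h\mathbf a_k\dot{\mathbf b}_k = \mathbf a_k(1 + h\mathbf b_k)\big(\mathbf b_{k+1} - \mathbf b_k + h\mathbf a_k - h\mathbf a_{k-1}\big) = a_k(b_{k+1} - b_k),
\end{aligned}$$

which are exactly the TL equations (1.1).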

Proof. To determine the pull-back of the flow TL, we can use the Hamiltonian formalism. An opportunity to apply it is given by Theorem 7. We shall use the statement a) only. Consider the function $h^{-1}H_1(a, b) = h^{-1}\sum_{k=1}^N b_k$. It is a Casimir of the bracket $\{\cdot,\cdot\}_1$, and generates exactly the flow TL in the bracket $h\{\cdot,\cdot\}_2$. Hence it generates the flow TL also in the bracket (3.18). The pull-back of this Hamilton function is equal to $h^{-1}\sum_{k=1}^N(\mathbf b_k + h\mathbf a_{k-1})$. It remains only to calculate the flow generated by this function in the Poisson brackets (3.19). This results in the equations of motion (3.22).

3.3. TL → explicit dTL. Now consider Eqs. (3.14) as equations on the 2-dimensional lattice. In other words, we attach the variables $\mathbf b_k = \mathbf b_k(t)$, $\mathbf a_k = \mathbf a_k(t)$ to the lattice site $(k, t) \in \mathbb Z \times h\mathbb Z$. A linear change of independent variables $(k, t) \mapsto (k, \tau) = (k, t + kh)$ mixes space and time coordinates, which is equivalent to changing the Cauchy path on the lattice to $\{\tau = 0\} = \{t = -kh\}$. For other instances of such staircase (or sawtooth) Cauchy paths on the lattice, see [PNC, NPCQ, FV]. To deal with the new initial-value problem, denote

$$\underline b_k(\tau) = \mathbf b_k(t) = \mathbf b_k(\tau - kh), \qquad \underline a_k(\tau) = \mathbf a_k(t) = \mathbf a_k(\tau - kh). \tag{3.23}$$

Denoting the $h$-shift in $\tau$ still by the tilde, it is easy to see that the variables $\underline a_k = \underline a_k(\tau)$, $\underline b_k = \underline b_k(\tau)$ satisfy the following difference equations:

$$\tilde{\underline b}_k = \underline b_k + h(\underline a_k - \underline a_{k-1}), \qquad \tilde{\underline a}_k(1 + h\tilde{\underline b}_k) = \underline a_k(1 + h\tilde{\underline b}_{k+1}). \tag{3.24}$$
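Equations (3.24) can be stepped forward without solving any implicit relation. A minimal sketch follows, with illustrative names and periodic boundary conditions. It also records two elementary consequences of (3.24) that are easy to test: the sum of the $b$-variables and the product of the $a$-variables are unchanged by the map.

```python
# Illustrative sketch: one explicit step of the map (3.24), indices mod n.

def explicit_dtl_step(a, b, h):
    n = len(a)
    # first the new b's, which need only the old data ...
    b_new = [b[k] + h * (a[k] - a[k - 1]) for k in range(n)]
    # ... then the new a's, which need the old a's and the new b's
    a_new = [a[k] * (1 + h * b_new[(k + 1) % n]) / (1 + h * b_new[k])
             for k in range(n)]
    return a_new, b_new

def product(xs):
    p = 1.0
    for x in xs:
        p *= x
    return p

a0 = [1.0, 0.8, 1.2, 0.9]
b0 = [0.1, -0.2, 0.3, 0.0]
a1, b1 = explicit_dtl_step(a0, b0, h=0.05)
```

Summing the first equation of (3.24) over $k$ gives the conservation of the sum, and multiplying the second one over $k$ gives the conservation of the product; neither fact by itself says anything about integrability.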

These equations serve as an explicit discretization of the flow TL. Indeed, they allow one to calculate $(\tilde{\underline a}, \tilde{\underline b})$ explicitly if $(\underline a, \underline b)$ is known (first $\tilde{\underline b}$, then $\tilde{\underline a}$).

Of course, it is tempting to infer the integrability of the system (3.24) from the known integrability of Eqs. (3.14). However, we do not know any general statements allowing such conclusions, so that the integrability of (3.24) has to be proved independently. Moreover, we are not aware of any systematic procedure allowing one to determine invariant Poisson brackets of the map (3.24) starting with the known invariant Poisson brackets of the map (3.14). Hence even the definition of integrability is nontrivial in the case of (3.24). Our approach to these problems will be based on Lax representations.

Recall that the Lax equations may be considered as compatibility conditions of linear problems. In particular, the discrete time Lax equation (3.17) has two versions:

$$\tilde{\mathbf B}\tilde{\mathbf A} = \mathbf A(\mathbf B\mathbf A)\mathbf A^{-1} = \mathbf B^{-1}(\mathbf B\mathbf A)\mathbf B.$$

The first one is a compatibility condition of two linear problems:

$$\mathbf B\mathbf A\psi = \mu\psi, \qquad \tilde\psi = \mathbf A\psi, \tag{3.25}$$

while the second one is a compatibility condition of

$$\mathbf B\mathbf A\psi = \mu\psi, \qquad \tilde\psi = \mathbf B^{-1}\psi. \tag{3.26}$$

Here $\psi = (\psi_1, \ldots, \psi_N)$ is an auxiliary vector, and $\mu$ is a spectral parameter (having nothing in common with the spectral parameter $\lambda$ entering the elements of the algebra $g$ such as $L(a, b, \lambda)$). These two pairs of linear problems are equivalent, correspondingly, to the following ones:

$$\mathbf B\tilde\psi = \mu\psi, \qquad \mathbf A\psi = \tilde\psi \tag{3.27}$$

and

$$\mathbf A\psi = \mu\tilde\psi, \qquad \mathbf B\tilde\psi = \psi. \tag{3.28}$$

We are now in a position to derive from these two pairs of auxiliary linear problems two different Lax representations for the map (3.24). To this end, we assume that the variables $\psi_k(t)$ are also attached to lattice sites $(k, t)$, and perform in the above equations the change of variables $(k, t) \mapsto (k, \tau) = (k, t + kh)$. We denote, in addition to the variables $\underline a_k$, $\underline b_k$ introduced above, also $\underline\psi_k(\tau) = \psi_k(t) = \psi_k(\tau - kh)$.

Theorem 9. The map (3.24) allows the following Lax representation:

$$\tilde L_- = D^{-1}L_-D = C^{-1}L_-C \iff \tilde D\tilde C^{-1} = C^{-1}D \tag{3.29}$$

with the Lax matrix $L_- = (DC^{-1} - I)/h$, or

$$L_- = \left(\lambda\sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N(\underline b_k - h\underline a_{k-1})E_{kk} + \lambda^{-1}\sum_{k=1}^N \underline a_k E_{k,k+1}\right)C^{-1}, \tag{3.30}$$

where

$$D = \sum_{k=1}^N(1 + h\underline b_k - h^2\underline a_{k-1})E_{kk} + h\lambda\sum_{k=1}^N E_{k+1,k}, \tag{3.31}$$

$$C = I - h\lambda^{-1}\sum_{k=1}^N \underline a_k E_{k,k+1}. \tag{3.32}$$


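Proof. The statement may be, of course, easily verified, but we want to show how it can be derived. To this end we start with (3.27). In coordinates these equations read:

$$(1 + h\mathbf b_k)\tilde\psi_k + h\lambda\tilde\psi_{k-1} = \mu\psi_k, \qquad \tilde\psi_k = \psi_k + h\lambda^{-1}\mathbf a_k\psi_{k+1}.$$

This implies, after the change of variables $(k, t) \mapsto (k, \tau)$:

$$(1 + h\underline b_k)\tilde{\underline\psi}_k + h\lambda\underline\psi_{k-1} = \mu\underline\psi_k, \qquad \tilde{\underline\psi}_k = \underline\psi_k + h\lambda^{-1}\underline a_k\tilde{\underline\psi}_{k+1}.$$

After simple manipulations the latter system may be put into the form

$$(1 + h\underline b_k - h^2\underline a_{k-1})\tilde{\underline\psi}_k + h\lambda\tilde{\underline\psi}_{k-1} = \mu\underline\psi_k, \qquad \underline\psi_k = \tilde{\underline\psi}_k - h\lambda^{-1}\underline a_k\tilde{\underline\psi}_{k+1}.$$

In matrix notation the latter equations look like

$$D\tilde{\underline\psi} = \mu\underline\psi, \qquad C\tilde{\underline\psi} = \underline\psi,$$

with the matrices $D$, $C$ introduced above. The compatibility condition of these two linear problems coincides with (3.29).

Theorem 10. The map (3.24) allows the following Lax representation:

$$\tilde L_+ = \tilde\Delta^{-1}F\cdot L_+\cdot F^{-1}\tilde\Delta, \tag{3.33}$$

with the matrices

$$L_+ = F^{-1}\left(\lambda\sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N(\underline b_k - h\underline a_{k-1})E_{kk} + \lambda^{-1}\sum_{k=1}^N \underline a_k E_{k,k+1}\right), \tag{3.34}$$

$$F = I - h\lambda\sum_{k=1}^N E_{k+1,k} = I - hE, \tag{3.35}$$

$$\Delta = \mathrm{diag}(1 + h\underline b_1, \ldots, 1 + h\underline b_N). \tag{3.36}$$

Proof. We perform with (3.28) transformations analogous to those of the previous proof. In coordinates (3.28) reads:

$$\psi_k + h\lambda^{-1}\mathbf a_k\psi_{k+1} = \mu\tilde\psi_k, \qquad (1 + h\mathbf b_k)\tilde\psi_k + h\lambda\tilde\psi_{k-1} = \psi_k.$$

This implies, after the change of variables $(k, t) \mapsto (k, \tau)$:

$$\underline\psi_k + h\lambda^{-1}\underline a_k\tilde{\underline\psi}_{k+1} = \mu\tilde{\underline\psi}_k, \qquad (1 + h\underline b_k)\tilde{\underline\psi}_k + h\lambda\underline\psi_{k-1} = \underline\psi_k.$$

Straightforward manipulations imply:

$$\left(1 + h\underline b_k - h^2\underline a_k\frac{1 + h\underline b_k}{1 + h\underline b_{k+1}}\right)\underline\psi_k + h\lambda^{-1}\underline a_k\frac{1 + h\underline b_k}{1 + h\underline b_{k+1}}\,\underline\psi_{k+1} = \mu(\underline\psi_k - h\lambda\underline\psi_{k-1}),$$

$$\tilde{\underline\psi}_k = \frac{1}{1 + h\underline b_k}(\underline\psi_k - h\lambda\underline\psi_{k-1}).$$

This may be put in a matrix form:

$$F^{-1}P\underline\psi = \mu\underline\psi, \qquad \tilde{\underline\psi} = \Delta^{-1}F\underline\psi,$$

with the matrices $F$, $\Delta$ as above, and

$$P = \sum_{k=1}^N\left(1 + h\underline b_k - h^2\underline a_k\frac{1 + h\underline b_k}{1 + h\underline b_{k+1}}\right)E_{kk} + h\lambda^{-1}\sum_{k=1}^N \underline a_k\frac{1 + h\underline b_k}{1 + h\underline b_{k+1}}E_{k,k+1}. \tag{3.37}$$

Now the compatibility condition of the latter two linear problems reads:

$$F^{-1}\tilde P = \Delta^{-1}F\cdot F^{-1}P\cdot F^{-1}\Delta.$$

In order to bring this into the required form (3.33), notice that the equations of motion (3.24) imply the following simple formula:

$$\tilde P = \sum_{k=1}^N(1 + h\underline b_k - h^2\underline a_{k-1})E_{kk} + h\lambda^{-1}\sum_{k=1}^N \underline a_k E_{k,k+1}, \tag{3.38}$$

so that

$$F^{-1}\tilde P = I + hL_+.$$

This completes the proof.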

3.4. Explicit dTL → RTL. Now that we have two different Lax representations for the map (3.24), with the Lax matrices L± , the natural question is one about their relation. It is easy to see that this relation may be described as follows: T −1 −1 1 L− (λ) −1 1 = L+ (α λ ),

where α = (a1 a2 . . . aN )1/N and 1 = diag(1, α−1 a1 , α−2 a1 a2 , . . . , α−N +1 a1 a2 . . . aN −1 ). Both Lax matrices for the explicit dTL (3.24) are substantially different from the one for the implicit dTL (3.14) (the latter Lax matrix is the usual Toda one (3.8) with the variables (a, b) parametrized according to (3.13)). The next natural question is: what is the continuous time hierarchy, to which (3.24) belongs? In other words, how can one get continuous time flows with the Lax matrix L± , and what is the underlying invariant Poisson structure(s)? The answer is known for both Lax formulations. Introduce the phase space R = R2N (d1 , a1 , . . . , dN , aN ).

(3.39)

Theorem 11. [S2] The Lax matrix map $L_-(a, d, \lambda) : \mathcal{R} \mapsto \boldsymbol{g}$, where
\[ L_- = (DC^{-1} - I)/h = \Big( \lambda \sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N d_k E_{kk} + \lambda^{-1} \sum_{k=1}^N a_k E_{k,k+1} \Big) C^{-1}, \tag{3.40} \]
\[ D(a, d, \lambda) = \sum_{k=1}^N (1 + h d_k) E_{kk} + h\lambda \sum_{k=1}^N E_{k+1,k}, \tag{3.41} \]
\[ C(a, d, \lambda) = I - h\lambda^{-1} \sum_{k=1}^N a_k E_{k,k+1}, \tag{3.42} \]

is Poisson, if $\mathcal{R}$ is equipped with the Poisson bracket
\[ \{d_k, a_k\}_1 = -a_k, \quad \{a_k, d_{k+1}\}_1 = -a_k, \quad \{d_k, d_{k+1}\}_1 = h a_k, \tag{3.43} \]
and $\boldsymbol{g}$ is equipped with $\mathrm{PB}_1(R)$, and also if $\mathcal{R}$ is equipped with the Poisson bracket
\[ \{d_k, a_k\}_2 = -a_k d_k, \quad \{a_k, d_{k+1}\}_2 = -a_k d_{k+1}, \quad \{d_k, d_{k+1}\}_2 = -a_k, \quad \{a_k, a_{k+1}\}_2 = -a_k a_{k+1}, \tag{3.44} \]
and $\boldsymbol{g}$ is equipped with $\mathrm{PB}_2(A_1, A_2, S)$. The hierarchy of continuous time flows is of the usual form
\[ \dot{L}_- = \big[ L_-, \pm\pi_\pm(f(L_-)) \big]. \tag{3.45} \]
The "first" flow of the hierarchy, corresponding to $f(L) = L$, coincides with RTL (1.9) and allows a Lax representation with the auxiliary matrix
\[ \pi_+(L_-) = \sum_{k=1}^N (d_k + h a_{k-1}) E_{kk} + \lambda \sum_{k=1}^N E_{k+1,k}. \tag{3.46} \]
The map
\[ \widetilde{L}_- = D^{-1} L_- D = C^{-1} L_- C \]
is interpolated by the flow of this hierarchy with $f(L) = h^{-1} \log(I + hL)$.

Actually, additional information is available from [S2]. The evolution of the factors $C$, $D$ for the flows of the RTL hierarchy (3.45) is described by the Lax triads
\[ \dot{D} = \pm D \cdot \pi_\pm(f(C^{-1}D)) \mp \pi_\pm(f(DC^{-1})) \cdot D, \]
\[ \dot{C} = \pm C \cdot \pi_\pm(f(C^{-1}D)) \mp \pi_\pm(f(DC^{-1})) \cdot C. \]
In particular, for the flow RTL (1.9) the corresponding auxiliary matrices in the Lax triads are (3.46) and
\[ \pi_+\big( (C^{-1}D - I)/h \big) = \sum_{k=1}^N (d_k + h a_k) E_{kk} + \lambda \sum_{k=1}^N E_{k+1,k}. \tag{3.47} \]

For the case of the quadratic bracket an r-matrix interpretation in g ⊗ g is possible. Namely, the Lax matrix map (D, C −1 ) : R 7→ g = g ⊗ g is Poisson, if R is equipped with the bracket {·, ·}1 + h{·, ·}2 , i.e. {dk , ak } = −(1 + hdk )ak , {ak , dk+1 } = −(1 + hdk+1 )ak , {ak , ak+1 } = −hak ak+1 ,

(3.48)


and g ⊗ g is equipped with hPB2 (A1 , A2 , S) with the operators S S A1 −S A2 −S∗ , A2 = , S= . A1 = S A2 S∗ A1 S −S∗

(3.49)

Theorem 4 then assures that the monodromy matrix maps $DC^{-1}$ and $C^{-1}D$ are Poisson, if the target $\boldsymbol{g}$ is equipped with $h\mathrm{PB}_2(A_1, A_2, S)$, which implies the Poisson property for the Lax matrix maps $L_- = (DC^{-1} - I)/h$ and $(C^{-1}D - I)/h$, if the target $\boldsymbol{g}$ is equipped with $\mathrm{PB}_1(R) + h\mathrm{PB}_2(A_1, A_2, S)$.

For the hierarchy attached to the Lax matrix $L_+$, somewhat less detailed information is available.

Theorem 12. [BR, GK1] The hierarchy of Lax equations related to the Lax matrix
\[ L_+(a, d, \lambda) = (I - hE)^{-1} \Big( \lambda \sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N d_k E_{kk} + \lambda^{-1} \sum_{k=1}^N a_k E_{k,k+1} \Big) \tag{3.50} \]
is given by the formula
\[ \dot{L}_+ = \big[ L_+, \pm\pi_\pm(f(L_+)) + \sigma(f(L_+)) \big], \]
where $\sigma$ is a linear map from $\boldsymbol{g}$ into the set of diagonal matrices defined as
\[ \sigma\Big( \sum_p \ell^{(p)} E^p \Big) = \sum_{p<0} h^{-p} \ell^{(p)}. \]
In particular, the "first" flow of this hierarchy, corresponding to $f(L) = L$, coincides with RTL (1.9) and allows the Lax representation
\[ \dot{L}_+ = [A, L_+] \]
with the auxiliary matrix
\[ A = \pi_-(L_+) - \sigma(L_+) = \lambda^{-1} \sum_{k=1}^N a_k E_{k,k+1} - h \sum_{k=1}^N a_k E_{kk}. \]

It has to be mentioned that each of the formulations above has its specific advantages. Theorem 11 has an advantage of including the RTL hierarchy into the standard framework of the lattice KP hierarchy. On the other hand, Theorem 12, found for RTL in [BR], has an advantage of being a particular case of a much more general construction, due to [GK1], which delivers a “relativistic” generalization of the whole lattice KP hierarchy. An r-matrix interpretation of the latter result still remains to be elaborated (see, however, [OR] for a linear bracket in the open-end case). So, upon the identification dk = bk − hak−1 the explicit discrete time Toda map (3.24) belongs to the relativistic Toda hierarchy.


Remarks.

1. In the coordinates $(a, b)$ the equations of motion (1.9) take the form
\[ \dot{b}_k = a_k(1 + h b_k) - a_{k-1}(1 + h b_{k-1}), \qquad \dot{a}_k = a_k(b_{k+1} - b_k + h a_{k+1} - h a_k). \tag{3.51} \]

2. In the coordinates $(a, b)$ the linear Poisson bracket (3.43) of the RTL formally coincides (incidentally) with the linear Poisson bracket (3.2) of the usual Toda lattice:
\[ \{b_k, a_k\}_1 = -a_k, \qquad \{a_k, b_{k+1}\}_1 = -a_k. \tag{3.52} \]
Note also that in the coordinates $(a, d)$ the quadratic Poisson structure (3.44) formally coincides with the quadratic Poisson bracket (3.4) for the usual Toda lattice.

3. The third ("cubic") invariant Poisson structure is known for the RTL hierarchy; however, its r-matrix interpretation is unknown (which means that no cubic r-matrix structure on $\boldsymbol{g}$ is found that could be properly restricted to the set of matrices $L_+$ or $L_-$ and would produce the formulas (3.53) below; cf. [OR]):
\[
\begin{aligned}
\{d_k, a_k\}_3 &= -a_k(d_k^2 + a_k + h d_k a_k), \\
\{a_k, d_{k+1}\}_3 &= -a_k(d_{k+1}^2 + a_k + h d_{k+1} a_k), \\
\{d_k, d_{k+1}\}_3 &= -a_k(d_k + d_{k+1} + h d_k d_{k+1}), \\
\{a_k, a_{k+1}\}_3 &= -a_k a_{k+1}(2 d_{k+1} + h a_k + h a_{k+1}), \\
\{d_k, a_{k+1}\}_3 &= -a_k a_{k+1}(1 + h d_k), \\
\{a_k, d_{k+2}\}_3 &= -a_k a_{k+1}(1 + h d_{k+2}), \\
\{a_k, a_{k+2}\}_3 &= -h a_k a_{k+1} a_{k+2}.
\end{aligned}
\tag{3.53}
\]

4. Volterra and Relativistic Volterra Lattices

4.1. VL. We consider here the equations of motion of VL (1.2) under periodic boundary conditions, so that the corresponding phase space is:
\[ \mathcal{V} = \mathbb{R}^{2N}(u_1, v_1, \ldots, u_N, v_N). \tag{4.1} \]
There exist two compatible local Poisson brackets on $\mathcal{V}$ invariant under the flow VL:
\[ \{u_k, v_k\}_2 = -u_k v_k, \qquad \{v_k, u_{k+1}\}_2 = -v_k u_{k+1}, \tag{4.2} \]
and
\[
\begin{aligned}
\{u_k, v_k\}_3 &= -u_k v_k (u_k + v_k), \\
\{v_k, u_{k+1}\}_3 &= -v_k u_{k+1} (v_k + u_{k+1}), \\
\{u_k, u_{k+1}\}_3 &= -u_k v_k u_{k+1}, \\
\{v_k, v_{k+1}\}_3 &= -v_k u_{k+1} v_{k+1}.
\end{aligned}
\tag{4.3}
\]
The corresponding Hamilton functions for the flow VL are equal to
\[ H_1(u, v) = \sum_{k=1}^N u_k + \sum_{k=1}^N v_k, \tag{4.4} \]
and
\[ H_0(u, v) = \sum_{k=1}^N \log(u_k) \quad \text{or} \quad H_0(u, v) = \sum_{k=1}^N \log(v_k) \tag{4.5} \]

(the difference of the latter two functions is a Casimir of $\{\cdot,\cdot\}_3$).

The Lax representation of VL we use here lives in $\boldsymbol{g} \otimes \boldsymbol{g}$ and was introduced in [K1] (where the system (1.2) was called "modified Toda lattice"), see also [S5]. Consider the following two matrices:
\[ U(u, v, \lambda) = \lambda \sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N u_k E_{k,k} = E + u, \tag{4.6} \]
\[ V(u, v, \lambda) = I + \lambda^{-1} \sum_{k=1}^N v_k E_{k,k+1} = I + vE^{-1}. \tag{4.7} \]

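These formulas define the "Lax matrix" $(U, V) : \mathcal{V} \mapsto \boldsymbol{g} \otimes \boldsymbol{g}$. The flow (1.2) is equivalent to either of the following Lax equations in $\boldsymbol{g} \otimes \boldsymbol{g}$:
\[
\begin{cases} \dot{U} = U B_2 - B_1 U \\ \dot{V} = V B_1 - B_2 V \end{cases}
\quad \text{or} \quad
\begin{cases} \dot{U} = A_1 U - U A_2 \\ \dot{V} = A_2 V - V A_1 \end{cases}
\tag{4.8}
\]
with the matrices
\[ B_1 = \pi_+(UV), \quad B_2 = \pi_+(VU), \quad A_1 = \pi_-(UV), \quad A_2 = \pi_-(VU), \tag{4.9} \]
so that
\[ B_1 = \lambda \sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N (u_k + v_{k-1}) E_{kk}, \tag{4.10} \]
\[ B_2 = \lambda \sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N (u_k + v_k) E_{kk}, \tag{4.11} \]
\[ A_1 = \lambda^{-1} \sum_{k=1}^N u_k v_k E_{k,k+1}, \tag{4.12} \]
\[ A_2 = \lambda^{-1} \sum_{k=1}^N u_{k+1} v_k E_{k,k+1}. \tag{4.13} \]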

As a consequence, the matrices
\[ T_1(u, v, \lambda) = U(u, v, \lambda) V(u, v, \lambda), \qquad T_2(u, v, \lambda) = V(u, v, \lambda) U(u, v, \lambda) \tag{4.14} \]
satisfy the usual Lax equations in $\boldsymbol{g}$:
\[ \dot{T} = [T, \pm\pi_\pm(T)]. \tag{4.15} \]
The Lax equations (4.8), (4.15) may be given an r-matrix interpretation.

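Theorem 13. [S5] The Lax matrix map $(U, V) : \mathcal{V} \mapsto \boldsymbol{g} \otimes \boldsymbol{g}$ is Poisson, if $\mathcal{V}$ is equipped with the bracket $\{\cdot,\cdot\}_2$ and $\boldsymbol{g} \otimes \boldsymbol{g}$ is equipped with the bracket $\mathrm{PB}_2(A_1, A_2, S)$ corresponding to the operators $A_1$, $A_2$, $S$ defined in (3.49). The maps $T_{1,2} : \boldsymbol{g} \otimes \boldsymbol{g} \mapsto \boldsymbol{g}$,
\[ T_1(U, V) = UV, \qquad T_2(U, V) = VU, \]
are Poisson, if the target space $\boldsymbol{g}$ is equipped with the bracket $\mathrm{PB}_2(A_1, A_2, S)$.

It is easy to see that the matrices $T_1$, $T_2$ formally coincide with the Lax matrix (3.8) of the Toda lattice, with the coordinates $(a, b)$ given by the corresponding formula for the Miura maps in (1.8). The Poisson property of the monodromy map is therefore reflected in (the first half of) the following statement about the Miura maps.

Theorem 14. The Miura maps $M_{1,2} : \mathcal{V} \mapsto \mathcal{T}$ are Poisson, if $\mathcal{V}$ carries the bracket $\{\cdot,\cdot\}_2$ (4.2) and $\mathcal{T}$ carries the bracket $\{\cdot,\cdot\}_2$ (3.4), and also if $\mathcal{V}$ carries the bracket $\{\cdot,\cdot\}_3$ (4.3) and $\mathcal{T}$ carries the bracket $\{\cdot,\cdot\}_3$ (3.6).

4.2. VL → dVL. To find an integrable time discretization for the flow VL, we apply the recipe of Sect. 2.3 with $F(T) = I + hT$, i.e. we consider the map described by the discrete time "Lax triads"
\[
\begin{cases} \widetilde{U} = \boldsymbol{B}_1^{-1} U \boldsymbol{B}_2 \\ \widetilde{V} = \boldsymbol{B}_2^{-1} V \boldsymbol{B}_1 \end{cases}
\quad \text{or} \quad
\begin{cases} \widetilde{U} = \boldsymbol{A}_1 U \boldsymbol{A}_2^{-1} \\ \widetilde{V} = \boldsymbol{A}_2 V \boldsymbol{A}_1^{-1} \end{cases}
\tag{4.16}
\]
with
\[ \boldsymbol{B}_1 = \Pi_+(I + hUV), \quad \boldsymbol{B}_2 = \Pi_+(I + hVU), \quad \boldsymbol{A}_1 = \Pi_-(I + hUV), \quad \boldsymbol{A}_2 = \Pi_-(I + hVU). \]

Theorem 15. Consider the change of variables $\mathcal{V}(u, v) \mapsto \mathcal{V}(\mathfrak{u}, \mathfrak{v})$ defined by the formulas
\[ u_k = \mathfrak{u}_k(1 + h\mathfrak{v}_{k-1}), \qquad v_k = \mathfrak{v}_k(1 + h\mathfrak{u}_k). \tag{4.17} \]
The discrete time Lax equations (4.16) are equivalent to the map $(u, v) \mapsto (\widetilde{u}, \widetilde{v})$ which in the coordinates $(\mathfrak{u}, \mathfrak{v})$ is described by the following equations:
\[ \widetilde{\mathfrak{u}}_k(1 + h\widetilde{\mathfrak{v}}_{k-1}) = \mathfrak{u}_k(1 + h\mathfrak{v}_k), \qquad \widetilde{\mathfrak{v}}_k(1 + h\widetilde{\mathfrak{u}}_k) = \mathfrak{v}_k(1 + h\mathfrak{u}_{k+1}). \tag{4.18} \]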

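Proof. The formulas (4.17) allow one to find the factors $\boldsymbol{B}_{1,2}$, $\boldsymbol{A}_{1,2}$ in a closed form. Indeed, with the help of (4.17) we can represent (4.6), (4.7) as
\[ U = \sum_{k=1}^N \mathfrak{u}_k(1 + h\mathfrak{v}_{k-1}) E_{kk} + \lambda \sum_{k=1}^N E_{k+1,k}, \tag{4.19} \]
\[ V = I + \lambda^{-1} \sum_{k=1}^N \mathfrak{v}_k(1 + h\mathfrak{u}_k) E_{k,k+1}. \tag{4.20} \]
From these formulas we derive:
\[ I + hUV = h\lambda \sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N \big( (1 + h\mathfrak{u}_k)(1 + h\mathfrak{v}_{k-1}) + h^2 \mathfrak{u}_{k-1}\mathfrak{v}_{k-1} \big) E_{kk} + h\lambda^{-1} \sum_{k=1}^N \mathfrak{u}_k \mathfrak{v}_k (1 + h\mathfrak{u}_k)(1 + h\mathfrak{v}_{k-1}) E_{k,k+1}. \]
Obviously, this matrix may be factorized as $I + hUV = \boldsymbol{B}_1 \boldsymbol{A}_1$, where
\[ \boldsymbol{B}_1 = \sum_{k=1}^N (1 + h\mathfrak{u}_k)(1 + h\mathfrak{v}_{k-1}) E_{kk} + h\lambda \sum_{k=1}^N E_{k+1,k}, \tag{4.21} \]
\[ \boldsymbol{A}_1 = I + h\lambda^{-1} \sum_{k=1}^N \mathfrak{u}_k \mathfrak{v}_k E_{k,k+1}. \tag{4.22} \]
Similarly, we find:
\[ I + hVU = h\lambda \sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N \big( (1 + h\mathfrak{u}_k)(1 + h\mathfrak{v}_k) + h^2 \mathfrak{u}_k \mathfrak{v}_{k-1} \big) E_{kk} + h\lambda^{-1} \sum_{k=1}^N \mathfrak{u}_{k+1} \mathfrak{v}_k (1 + h\mathfrak{u}_k)(1 + h\mathfrak{v}_k) E_{k,k+1}, \]
which may be factorized as $I + hVU = \boldsymbol{B}_2 \boldsymbol{A}_2$, where
\[ \boldsymbol{B}_2 = \sum_{k=1}^N (1 + h\mathfrak{u}_k)(1 + h\mathfrak{v}_k) E_{kk} + h\lambda \sum_{k=1}^N E_{k+1,k}, \tag{4.23} \]
\[ \boldsymbol{A}_2 = I + h\lambda^{-1} \sum_{k=1}^N \mathfrak{u}_{k+1} \mathfrak{v}_k E_{k,k+1}. \tag{4.24} \]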
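The factorization $I + hUV = \boldsymbol{B}_1 \boldsymbol{A}_1$ can be confirmed with exact arithmetic. The sketch below is illustrative only: it assumes the periodic case with $N \ge 3$, works at a fixed numerical value of $\lambda$, and the names `fu`, `fv` (standing for the local variables on the right-hand side of (4.17)) and `check_factorization` are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Plain matrix product for square lists of lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def check_factorization(fu, fv, h, lam):
    """Return True if I + h*U*V equals B1*A1, with U, V as in (4.19), (4.20)
    and B1, A1 as in (4.21), (4.22); indices are periodic, N >= 3 assumed."""
    n = len(fu)

    def zeros():
        return [[Fraction(0)] * n for _ in range(n)]

    U, V, B1, A1 = zeros(), zeros(), zeros(), zeros()
    for k in range(n):
        kp = (k + 1) % n
        U[k][k] += fu[k] * (1 + h * fv[k - 1])
        U[kp][k] += lam
        V[k][k] += 1
        V[k][kp] += fv[k] * (1 + h * fu[k]) / lam
        B1[k][k] += (1 + h * fu[k]) * (1 + h * fv[k - 1])
        B1[kp][k] += h * lam
        A1[k][k] += 1
        A1[k][kp] += h * fu[k] * fv[k] / lam
    UV = matmul(U, V)
    lhs = [[(1 if i == j else 0) + h * UV[i][j] for j in range(n)]
           for i in range(n)]
    return lhs == matmul(B1, A1)
```

The same pattern, with $v_{k-1}$ replaced by $v_k$ on the diagonal and $\mathfrak{u}_k\mathfrak{v}_k$ replaced by $\mathfrak{u}_{k+1}\mathfrak{v}_k$, would check $I + hVU = \boldsymbol{B}_2 \boldsymbol{A}_2$.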

Now the discrete time Lax equations $\boldsymbol{B}_1 \widetilde{U} = U \boldsymbol{B}_2$, $\boldsymbol{B}_2 \widetilde{V} = V \boldsymbol{B}_1$ (or their A-analogs) immediately imply the equations of motion (4.18).

Comparing the two pairs of formulas (4.21), (4.22) and (4.23), (4.24) for dVL with the formulas (3.15), (3.16), we see immediately that the Miura maps $M_{1,2} : \mathcal{V}(u, v) \mapsto \mathcal{T}(a, b)$ are conjugated by the changes of variables (4.17), (3.13) with the maps $\mathfrak{M}_{1,2} : \mathcal{V}(\mathfrak{u}, \mathfrak{v}) \mapsto \mathcal{T}(\mathfrak{a}, \mathfrak{b})$ given by:
\[ \mathfrak{M}_1 : \quad 1 + h\mathfrak{b}_k = (1 + h\mathfrak{u}_k)(1 + h\mathfrak{v}_{k-1}), \qquad \mathfrak{a}_k = \mathfrak{u}_k \mathfrak{v}_k, \tag{4.25} \]
\[ \mathfrak{M}_2 : \quad 1 + h\mathfrak{b}_k = (1 + h\mathfrak{u}_k)(1 + h\mathfrak{v}_k), \qquad \mathfrak{a}_k = \mathfrak{u}_{k+1} \mathfrak{v}_k. \tag{4.26} \]
The map (4.18), denoted hereafter by dVL, serves as a difference approximation to the flow VL (1.2). By construction, in the variables $(u, v)$ this map shares the integrals of motion and the invariant Poisson structures with the flow VL. In the coordinates $(\mathfrak{u}, \mathfrak{v})$ the integrals of motion and the Poisson brackets become deformed; the brackets moreover become nonlocal, since the inverse of the map (4.17) is nonlocal. Nevertheless, we have the following local invariant Poisson bracket for the map dVL:

Theorem 16. [K1] The pull-back of the bracket
\[ \{\cdot,\cdot\}_2 + h\{\cdot,\cdot\}_3 \tag{4.27} \]
on $\mathcal{V}(u, v)$ under the change of variables (4.17) is the following bracket on $\mathcal{V}(\mathfrak{u}, \mathfrak{v})$:
\[ \{\mathfrak{u}_k, \mathfrak{v}_k\} = -\mathfrak{u}_k \mathfrak{v}_k (1 + h\mathfrak{u}_k)(1 + h\mathfrak{v}_k), \qquad \{\mathfrak{v}_k, \mathfrak{u}_{k+1}\} = -\mathfrak{v}_k \mathfrak{u}_{k+1} (1 + h\mathfrak{v}_k)(1 + h\mathfrak{u}_{k+1}). \tag{4.28} \]
The map (4.18) is Poisson with respect to the bracket (4.28). The maps $\mathfrak{M}_{1,2}$ (4.25), (4.26) are Poisson, if $\mathcal{V}(\mathfrak{u}, \mathfrak{v})$ carries the bracket (4.28) and $\mathcal{T}(\mathfrak{a}, \mathfrak{b})$ carries the bracket (3.21).

Proof. By a direct verification.

The map (4.18) was found in [THO, HTI], where, however, its place in the continuous time Volterra hierarchy and its Poisson structure were not elaborated. The Miura maps (4.25), (4.26) were found in [K1] without connecting them to the discretization problem, and also in [HT] in the discretization context, but without Poisson properties. The previous theorem also allows one to calculate the equations of motion of the continuous time hierarchy to which the map dVL belongs.

Theorem 17. [K1] The pull-back of the flow VL under the map (4.17) is described by the following equations of motion:
\[ \dot{\mathfrak{u}}_k = \mathfrak{u}_k(1 + h\mathfrak{u}_k)(\mathfrak{v}_k - \mathfrak{v}_{k-1}), \qquad \dot{\mathfrak{v}}_k = \mathfrak{v}_k(1 + h\mathfrak{v}_k)(\mathfrak{u}_{k+1} - \mathfrak{u}_k). \tag{4.29} \]

Proof. Since the flow VL has the Hamilton function $h^{-1} H_0(u, v)$ in the Poisson bracket $h\{\cdot,\cdot\}_3$, and this function is a Casimir of the bracket $\{\cdot,\cdot\}_2$, we conclude that this function generates the flow VL also in the bracket (4.27). This means that the pull-back of the flow VL is a Hamiltonian flow in the bracket (4.28) with the Hamilton function
\[ h^{-1} \sum_{k=1}^N \log\big( \mathfrak{u}_k(1 + h\mathfrak{v}_{k-1}) \big). \]
Calculating the corresponding equations of motion, we arrive at Eqs. (4.29).
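Theorem 17 can also be checked without any Poisson brackets, by differentiating the change of variables (4.17) along (4.29) with the product rule. The sketch below assumes that VL has the form $\dot{u}_k = u_k(v_k - v_{k-1})$, $\dot{v}_k = v_k(u_{k+1} - u_k)$, which is what the Lax equations (4.8) give, and that the second equation of (4.29) has the factor $(\mathfrak{u}_{k+1} - \mathfrak{u}_k)$. The names `p`, `q` stand for the local variables $\mathfrak{u}$, $\mathfrak{v}$, and all function names are hypothetical.

```python
from fractions import Fraction


def vl_rhs(u, v):
    """Right-hand side of VL: du_k = u_k (v_k - v_{k-1}), dv_k = v_k (u_{k+1} - u_k)."""
    n = len(u)
    du = [u[k] * (v[k] - v[k - 1]) for k in range(n)]
    dv = [v[k] * (u[(k + 1) % n] - u[k]) for k in range(n)]
    return du, dv


def pulled_back_rhs(p, q, h):
    """Right-hand side of (4.29) in the local variables (p, q)."""
    n = len(p)
    dp = [p[k] * (1 + h * p[k]) * (q[k] - q[k - 1]) for k in range(n)]
    dq = [q[k] * (1 + h * q[k]) * (p[(k + 1) % n] - p[k]) for k in range(n)]
    return dp, dq


def change_of_variables(p, q, h):
    """The map (4.17): u_k = p_k (1 + h q_{k-1}), v_k = q_k (1 + h p_k)."""
    n = len(p)
    u = [p[k] * (1 + h * q[k - 1]) for k in range(n)]
    v = [q[k] * (1 + h * p[k]) for k in range(n)]
    return u, v


def pushed_forward_velocity(p, q, h):
    """Time derivative of (u, v) induced by (4.29), via the product rule."""
    dp, dq = pulled_back_rhs(p, q, h)
    n = len(p)
    du = [dp[k] * (1 + h * q[k - 1]) + h * p[k] * dq[k - 1] for k in range(n)]
    dv = [dq[k] * (1 + h * p[k]) + h * q[k] * dp[k] for k in range(n)]
    return du, dv
```

For any rational data, `pushed_forward_velocity(p, q, h)` should coincide with `vl_rhs(*change_of_variables(p, q, h))`, which is the content of the theorem.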

4.3. dVL → explicit dVL. We use the same trick as in the case of dTL, in order to extract from dVL the explicit discretization for VL. Namely, we attach the variables $\mathfrak{u}_k = \mathfrak{u}_k(t)$, $\mathfrak{v}_k = \mathfrak{v}_k(t)$ to the lattice site $(k, t) \in \mathbb{Z} \times h\mathbb{Z}$, and then perform the change of independent variables $(k, t) \mapsto (k, \tau) = (k, t + kh)$. We denote
\[ \mathbf{u}_k(\tau) = \mathfrak{u}_k(t) = \mathfrak{u}_k(\tau - kh), \qquad \mathbf{v}_k(\tau) = \mathfrak{v}_k(t) = \mathfrak{v}_k(\tau - kh). \tag{4.30} \]
Denoting the h-shift in $\tau$ still by the tilde, we immediately derive from (4.18) the following difference equations for the variables $\mathbf{u}_k$, $\mathbf{v}_k$:
\[ \widetilde{\mathbf{u}}_k(1 + h\mathbf{v}_{k-1}) = \mathbf{u}_k(1 + h\mathbf{v}_k), \qquad \widetilde{\mathbf{v}}_k(1 + h\widetilde{\mathbf{u}}_k) = \mathbf{v}_k(1 + h\widetilde{\mathbf{u}}_{k+1}). \tag{4.31} \]
This is clearly an explicit discretization, since, knowing $(\mathbf{u}, \mathbf{v})$, it allows one to calculate explicitly first $\widetilde{\mathbf{u}}$, and then $\widetilde{\mathbf{v}}$.

We now find Lax representations for the map (4.31). As in the case of the dTL, there exist two versions thereof.
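The explicit character of the scheme (4.31) discussed above is easy to see in code: one sweep over $k$ yields all $\widetilde{\mathbf{u}}_k$, and a second sweep yields all $\widetilde{\mathbf{v}}_k$. The following is a minimal sketch under the assumption of periodic boundary conditions; the function name `explicit_dvl_step` is hypothetical.

```python
from fractions import Fraction


def explicit_dvl_step(u, v, h):
    """One step of the explicit map (4.31) with periodic indices.

    First sweep:  u_new[k] * (1 + h*v[k-1])   = u[k] * (1 + h*v[k])
    Second sweep: v_new[k] * (1 + h*u_new[k]) = v[k] * (1 + h*u_new[k+1])
    """
    n = len(u)
    u_new = [u[k] * (1 + h * v[k]) / (1 + h * v[k - 1]) for k in range(n)]
    v_new = [v[k] * (1 + h * u_new[(k + 1) % n]) / (1 + h * u_new[k])
             for k in range(n)]
    return u_new, v_new


# Example: exact rational data, small step size.
u0 = [Fraction(1), Fraction(2), Fraction(3)]
v0 = [Fraction(1, 2), Fraction(1), Fraction(3, 2)]
u1, v1 = explicit_dvl_step(u0, v0, Fraction(1, 1000))
```

For small $h$ the difference quotient $(\widetilde{\mathbf{u}}_k - \mathbf{u}_k)/h$ stays within $O(h)$ of $\mathbf{u}_k(\mathbf{v}_k - \mathbf{v}_{k-1})$, consistent with (4.31) being a first-order approximation to a flow of Volterra type.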


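Theorem 18. The map (4.31) allows the following Lax representation:
\[ \widetilde{U} = C^{-1} U C_2, \qquad \widetilde{V} \widetilde{C}^{-1} = C_2^{-1} V \tag{4.32} \]
with the matrices
\[ U(\mathbf{u}, \mathbf{v}, \lambda) = \sum_{k=1}^N \mathbf{u}_k E_{kk} + \lambda \sum_{k=1}^N E_{k+1,k}, \tag{4.33} \]
\[ V(\mathbf{u}, \mathbf{v}, \lambda) = I + \lambda^{-1} \sum_{k=1}^N \mathbf{v}_k E_{k,k+1}, \tag{4.34} \]
\[ C(\mathbf{u}, \mathbf{v}, \lambda) = I - h\lambda^{-1} \sum_{k=1}^N \mathbf{u}_k \mathbf{v}_k E_{k,k+1}, \tag{4.35} \]
\[ C_2(\mathbf{u}, \mathbf{v}, \lambda) = I - h\lambda^{-1} \sum_{k=1}^N \widetilde{\mathbf{u}}_{k+1} \mathbf{v}_k E_{k,k+1}. \tag{4.36} \]
The Lax representation (4.32) implies also
\[ \widetilde{L}_- = C^{-1} L_- C \iff \widetilde{U} \widetilde{V} \widetilde{C}^{-1} = C^{-1} U V \tag{4.37} \]
with the Lax matrix
\[ L_- = U V C^{-1}. \tag{4.38} \]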

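Proof. We start with the following Lax representation of the dVL in $\boldsymbol{g} \otimes \boldsymbol{g}$:
\[ \widetilde{U} = \boldsymbol{A}_1 U \boldsymbol{A}_2^{-1}, \qquad \widetilde{V} = \boldsymbol{A}_2 V \boldsymbol{A}_1^{-1} \]
with the matrices (4.19), (4.20), (4.22), (4.24). These Lax equations may be considered as the compatibility condition of the following linear problems:
\[
\begin{cases} U\phi = \mu\psi \\ V\psi = \mu\phi \end{cases}
\qquad
\begin{cases} \widetilde{\phi} = \boldsymbol{A}_2 \phi \\ \widetilde{\psi} = \boldsymbol{A}_1 \psi \end{cases}
\]
We shall call the first pair the spectral linear problem, and the second pair the evolutionary linear problem. In coordinates the above equations may be presented as
\[
\begin{cases} \mathfrak{u}_k(1 + h\mathfrak{v}_{k-1})\phi_k + \lambda\phi_{k-1} = \mu\psi_k \\ \psi_k + \lambda^{-1}\mathfrak{v}_k(1 + h\mathfrak{u}_k)\psi_{k+1} = \mu\phi_k \end{cases}
\qquad
\begin{cases} \widetilde{\phi}_k = \phi_k + h\lambda^{-1}\mathfrak{u}_{k+1}\mathfrak{v}_k\phi_{k+1} \\ \widetilde{\psi}_k = \psi_k + h\lambda^{-1}\mathfrak{u}_k\mathfrak{v}_k\psi_{k+1} \end{cases}
\]
Upon the change of variables $(k, t) \mapsto (k, \tau)$, and writing $\underset{\sim}{x}$ for the backward h-shift in $\tau$, this implies:
\[
\begin{cases} \mathbf{u}_k(1 + h\underset{\sim}{\mathbf{v}}_{k-1})\boldsymbol{\phi}_k + \lambda\underset{\sim}{\boldsymbol{\phi}}_{k-1} = \mu\boldsymbol{\psi}_k \\ \boldsymbol{\psi}_k + \lambda^{-1}\mathbf{v}_k(1 + h\mathbf{u}_k)\widetilde{\boldsymbol{\psi}}_{k+1} = \mu\boldsymbol{\phi}_k \end{cases}
\qquad
\begin{cases} \widetilde{\boldsymbol{\phi}}_k = \boldsymbol{\phi}_k + h\lambda^{-1}\widetilde{\mathbf{u}}_{k+1}\mathbf{v}_k\widetilde{\boldsymbol{\phi}}_{k+1} \\ \widetilde{\boldsymbol{\psi}}_k = \boldsymbol{\psi}_k + h\lambda^{-1}\mathbf{u}_k\mathbf{v}_k\widetilde{\boldsymbol{\psi}}_{k+1} \end{cases}
\]
After simple manipulations (substitute into the first pair of equalities the values of $\underset{\sim}{\boldsymbol{\phi}}_{k-1}$ and $\boldsymbol{\psi}_k$ obtained from the second pair of equalities) this can be brought into the form
\[
\begin{cases} \mathbf{u}_k\boldsymbol{\phi}_k + \lambda\boldsymbol{\phi}_{k-1} = \mu\boldsymbol{\psi}_k \\ \widetilde{\boldsymbol{\psi}}_k + \lambda^{-1}\mathbf{v}_k\widetilde{\boldsymbol{\psi}}_{k+1} = \mu\boldsymbol{\phi}_k \end{cases}
\qquad
\begin{cases} \boldsymbol{\phi}_k = \widetilde{\boldsymbol{\phi}}_k - h\lambda^{-1}\widetilde{\mathbf{u}}_{k+1}\mathbf{v}_k\widetilde{\boldsymbol{\phi}}_{k+1} \\ \boldsymbol{\psi}_k = \widetilde{\boldsymbol{\psi}}_k - h\lambda^{-1}\mathbf{u}_k\mathbf{v}_k\widetilde{\boldsymbol{\psi}}_{k+1} \end{cases}
\]
or in the matrix form
\[
\begin{cases} U\boldsymbol{\phi} = \mu\boldsymbol{\psi} \\ V\widetilde{\boldsymbol{\psi}} = \mu\boldsymbol{\phi} \end{cases}
\qquad
\begin{cases} C_2\widetilde{\boldsymbol{\phi}} = \boldsymbol{\phi} \\ C\widetilde{\boldsymbol{\psi}} = \boldsymbol{\psi} \end{cases}
,
\]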

where we introduced the matrices U , V , C, C 2 from (4.33)-(4.36). The compatibility condition of these linear problems coincides with (4.32). Theorem 19. The map (4.31) allows the following Lax representation:  −1 −1  e1 e 11 e 2 ), e −1 e 2 = (1  F −1 U U 12 · (F −1 1 1 13 F) · F  

−1 −1 e e 1) e −1 e −1 e −1 13 1 1 2 V = (12 11 F) · 12 V · (F

(4.39)

with the matrices U , V , F from (4.33), (4.34), (3.35), and 11 = diag(1 + huk ), 12 = diag(1 + hv k ),

(4.40)

13 = diag(1 + hv k−1 ).

(4.41)

The Lax representation (4.39) implies also −1 −1 e + = (1 e 1) e −1 13 1 L 1 13 F) · L+ · (F

(4.42)

L+ = F −1 U V .

(4.43)

with the Lax matrix

Proof. This time we start with the following Lax representation of the map dVL: e = B −1 U B 2 , Ve = B −1 V B 1 , U 1 2 with the matrices (4.19), (4.20), (4.21), (4.23). These Lax equations are the compatibility conditions of the following linear problems:   e φe = µψe  B 2 φe = φ U ee V ψ = µφe



B 1 ψe = ψ


(it turns out to be more convenient to take the tilded versions of the equations in the first pair). In components the first pair of the above equations (the spectral problem) takes the form: ek (1 + he vk−1 )φek + λφek−1 = µψek , u −1 e e uk )ψek+1 = µφek , ψk + λ vk (1 + he while the second pair (the evolutionary problem) takes the form   (1 + huk )(1 + hvk )φek + hλφek−1 = φk , 

(1 + hvk−1 )(1 + huk )ψek + hλψek−1 = ψk .

After the change of variables (k, t) 7→ (k, τ ) the spectral problem takes the form:  e + λφ e  e k (1 + hv k−1 )φ u k k−1 = µψ k ,  e ψ e e e + λ−1 v ek (1 + he uk )ψ k k+1 = µφk , and the evolutionary problem takes the form  e   (1 + huk )(1 + hv k )φk + hλφk−1 = φk , e + hλψ   (1 + h v k−1 )(1 + huk )ψ k k−1 = ψ k . e Now we use the equations of motion (4.31) to bring the first pair (the spectral problem) into the form  e + λφ e   uk (1 + hv k )φ k k−1 = µψ k , (4.44)  e ψ e e e + λ−1 v k (1 + he u ) ψ = µ φ . k+1 k k+1 k The second pair (the evolutionary problem) may be presented as   −1 −1 e e = φ − hλφ  )(1 + hv ) φ (1 + hu   k k k k k−1  φ = 12 11 Fφ  . ⇐⇒ −1 −1 e e = ψ − hλψ     (1 + h v k−1 )(1 + huk )ψ k k k−1  ψ = 11 13 Fψ (4.45) e e Now we transform the equations describing the spectral problems. From the first equations of the two pairs in (4.44), (4.45) it is easy to derive: e = hµψ e , φk − (1 + hv k )φ k k and using this in the first equation in (4.44), we bring the latter into the form e + hλ(1 + hv k−1 )φ e e e uk (1 + hv k )φ k k−1 = µ(ψ k − hλψ k−1 ). From the second equation in (4.45) we find the value e e (1 + huk+1 )ψ k+1 =

1 e e ), (ψ − hλψ k 1 + hv k k+1

(4.46)


which, being substituted into the second spectral equation in (4.44), implies: 1 e e e ψ k + λ−1 v k ψ k+1 = µφk . 1 + hv k Now we can put the spectral problems in (4.46), (4.47) into the matrix form: −1 e = µψ, e F U 12 φ −1 e e 1 V ψ = µφ.

(4.47)

(4.48)

2

The compatibilty conditions of the linear problems (4.45), (4.48) coincide with (4.42). 4.4. Explicit dVL → RVL. Now we have obtained two Lax matrices for the explicit dVL: L− (u, v, λ) = U V C −1 and L+ (u, v, λ) = F −1 U V . The relation between them is the following: T 0 T 0 −1 −1 T 0 1 U (λ) −1 2 = V (λ ), 2 V (λ) 1 = U (λ ), 1 C(λ) 1 = F (λ ),

so that

T 0 1 L− (λ) −1 1 = L+ (λ ),

where λ0 = α−1 λ−1 , α = (u1 v 1 . . . uN v N )1/N , and 1 = diag(1, α−1 u1 v 1 , α−2 u1 v 1 u2 v 2 , . . . , α−N +1 u1 v 1 . . . uN −1 v N −1 ), 2 = diag(1, α−1 v 1 u2 , α−2 v 1 u2 v 2 u3 , . . . , α−N +1 v 1 u2 . . . v N −1 uN ). We postulate that the hypothetic relativistic Volterra lattice (RVL) lives in the hierarchy associated to these Lax matrices, and it remains to identify this hierarchy. (The above relation between L± assures that both Lax matrices generate one and the same hierarchy.) We start with the Lax matrix L− . Theorem 20. The triples (U , C −1 , V ) form a Poisson Lax matrix map from V(u, v) into g ⊗ g ⊗ g, if the space V(u, v) carries the Poisson bracket {uk , v k }2 = −uk v k , {v k , uk+1 }2 = −v k uk+1 ,

(4.49)

and g ⊗ g ⊗ g carries the Poisson bracket PB2 (A1 , A2 , S) defined by the operators     A1 −S −S A2 −S∗ −S∗ A1 =  S∗ A1 S∗  , A2 =  S A2 −S∗  , S∗ −S A1 S S A2  S S S S =  S −S∗ −S∗  . S S −S∗ 

(4.50)

The Lax matrix maps $L_- = UVC^{-1}$, $VC^{-1}U$, $C^{-1}UV : \mathcal{V}(\mathbf{u}, \mathbf{v}) \mapsto \boldsymbol{g}$ are Poisson, if the target $\boldsymbol{g}$ is equipped with $\mathrm{PB}_2(A_1, A_2, S)$. The hierarchy of continuous time flows is described by the Lax triads
\[ \dot{U} = \pm U \cdot \pi_\pm\big( f(VC^{-1}U) \big) \mp \pi_\pm\big( f(UVC^{-1}) \big) \cdot U, \tag{4.51} \]
\[ \dot{C} = \pm C \cdot \pi_\pm\big( f(C^{-1}UV) \big) \mp \pi_\pm\big( f(UVC^{-1}) \big) \cdot C, \tag{4.52} \]
\[ \dot{V} = \pm V \cdot \pi_\pm\big( f(C^{-1}UV) \big) \mp \pi_\pm\big( f(VC^{-1}U) \big) \cdot V. \tag{4.53} \]
As a consequence, the evolution of the matrix $L_- = UVC^{-1}$ is described by the standard Lax equation
\[ \dot{L}_- = \big[ L_-, \pm\pi_\pm(f(L_-)) \big]. \]
In particular, the "first" flow, corresponding to $f(L) = L$, coincides with the RVL (1.13), and the auxiliary matrices in the Lax triads are given by
\[ \pi_+(UVC^{-1}) = \sum_{k=1}^N (\mathbf{u}_k + \mathbf{v}_{k-1} + h\mathbf{u}_{k-1}\mathbf{v}_{k-1}) E_{kk} + \lambda \sum_{k=1}^N E_{k+1,k}, \tag{4.54} \]
\[ \pi_+(C^{-1}UV) = \sum_{k=1}^N (\mathbf{u}_k + \mathbf{v}_{k-1} + h\mathbf{u}_k\mathbf{v}_k) E_{kk} + \lambda \sum_{k=1}^N E_{k+1,k}, \tag{4.55} \]
\[ \pi_+(VC^{-1}U) = \sum_{k=1}^N (\mathbf{u}_k + \mathbf{v}_k + h\mathbf{u}_k\mathbf{v}_k) E_{kk} + \lambda \sum_{k=1}^N E_{k+1,k}. \tag{4.56} \]

The map (4.31) belongs to this hierarchy (in particular, is Poisson with respect to the bracket (4.49)) and corresponds to $f(L) = h^{-1} \log(I + hL)$.

Proof. The first statement is proved with the help of the tensor notations for the quadratic r-matrix brackets, along the same lines as the proof of analogous statements in [S2, S5]. All other statements, except the last one, are consequences of the first one and Theorem 4. To identify the place of the map (4.31) in this hierarchy, we have, referring to Theorem 18, to prove the following relations:
\[ C^{-1} = \Pi_-(I + hUVC^{-1}), \qquad C_2^{-1} = \Pi_-(I + hVC^{-1}U), \qquad \widetilde{C}^{-1} = \Pi_-(I + hC^{-1}UV). \]
Clearly, it is enough to prove only the first of them, but it follows immediately from $I + hUVC^{-1} = (C + hUV)C^{-1}$ and
\[ C + hUV = \sum_{k=1}^N (1 + h\mathbf{u}_k + h\mathbf{v}_{k-1}) E_{kk} + h\lambda \sum_{k=1}^N E_{k+1,k} \in G_+, \qquad C^{-1} \in G_-. \]
The proof is complete.

Turning to the case of the Lax matrix L+ , we have the following results.


Theorem 21. The hierarchy associated with the Lax matrix $L_+ = F^{-1}UV$ is described by the following Lax triads:
\[ \dot{U} = A_1 U - U A_2, \qquad \dot{V} = A_2 V - V A_0, \tag{4.57} \]
which have to be supplemented by the identity
\[ \dot{F} = 0 = A_1 F - F A_0, \tag{4.58} \]
so that for the matrix $L_+ = F^{-1}UV$ there holds the Lax equation
\[ \dot{L}_+ = [A_0, L_+]. \tag{4.59} \]
Here
\[ A_0 = \pi_-(f(L_+)) - \sigma(f(L_+)), \tag{4.60} \]
and the matrices $A_{1,2}$ are uniquely defined by the condition of compatibility of the equations of the Lax triads (4.57), (4.58). In particular, the "first" flow of the hierarchy, corresponding to $f(L) = L$, coincides with RVL (1.13), and for this flow
\[ A_0 = \lambda^{-1} \sum_{k=1}^N \mathbf{u}_k \mathbf{v}_k E_{k,k+1} - h \sum_{k=1}^N \mathbf{u}_k \mathbf{v}_k E_{kk}, \tag{4.61} \]
\[ A_1 = \lambda^{-1} \sum_{k=1}^N \mathbf{u}_k \mathbf{v}_k E_{k,k+1} - h \sum_{k=1}^N \mathbf{u}_{k-1} \mathbf{v}_{k-1} E_{kk}, \tag{4.62} \]
\[ A_2 = \lambda^{-1} \sum_{k=1}^N \mathbf{u}_{k+1} \mathbf{v}_k E_{k,k+1} - h \sum_{k=1}^N \mathbf{u}_k \mathbf{v}_k E_{kk}. \tag{4.63} \]

Proof. The proof goes by a direct check. We define $A_0$ by (4.60), then put $A_1 = F A_0 F^{-1} \in \boldsymbol{g}_0 \oplus \boldsymbol{g}_-$. The matrix $A_2 \in \boldsymbol{g}_0 \oplus \boldsymbol{g}_-$ may be defined in two ways to make either of the two Lax triads in (4.57) consistent. These two definitions have to be consistent, because the choice (4.60) assures the validity of the Lax equation (4.59) for the matrix $L_+$, due to Theorem 12.

Comparing the Lax matrices $L_\pm$ for the RTL hierarchy (see (3.40), (3.50)) and for the RVL hierarchy (see (4.37), (4.42)), we see that the Miura map between the two hierarchies formally coincides with the Miura map between the TL hierarchy and the VL hierarchy, because it is equivalent to identifying $L(a, d, \lambda)$ with $U(u, v, \lambda)V(u, v, \lambda)$, where $L$, $U$, $V$ are the matrices from the nonrelativistic theories. This Miura map reads:
\[ M_1(\mathbf{u}, \mathbf{v}) = (a, d) : \qquad d_k = \mathbf{u}_k + \mathbf{v}_{k-1}, \qquad a_k = \mathbf{u}_k \mathbf{v}_k. \]
This map is Poisson with respect to the invariant quadratic Poisson brackets, since these also formally coincide with the invariant quadratic Poisson brackets of the nonrelativistic hierarchies. Recall that also the cubic Poisson bracket (3.53) is known for the RTL hierarchy. It turns out that its pull-back by the Miura map is given by the following nice formulas:
\[
\begin{aligned}
\{\mathbf{u}_k, \mathbf{v}_k\}_3 &= -\mathbf{u}_k \mathbf{v}_k (\mathbf{u}_k + \mathbf{v}_k + h\mathbf{u}_k \mathbf{v}_k), \\
\{\mathbf{v}_k, \mathbf{u}_{k+1}\}_3 &= -\mathbf{v}_k \mathbf{u}_{k+1} (\mathbf{v}_k + \mathbf{u}_{k+1}), \\
\{\mathbf{u}_k, \mathbf{u}_{k+1}\}_3 &= -\mathbf{u}_k \mathbf{v}_k \mathbf{u}_{k+1} (1 + h\mathbf{u}_k), \\
\{\mathbf{v}_k, \mathbf{v}_{k+1}\}_3 &= -\mathbf{v}_k \mathbf{u}_{k+1} \mathbf{v}_{k+1} (1 + h\mathbf{v}_{k+1}), \\
\{\mathbf{v}_k, \mathbf{u}_{k+2}\}_3 &= -h \mathbf{v}_k \mathbf{u}_{k+1} \mathbf{v}_{k+1} \mathbf{u}_{k+2}.
\end{aligned}
\tag{4.64}
\]

Finally, let us comment on the twin RVL flow (1.14). Obviously, this system, as well as the corresponding Miura map $M_2$, may be obtained upon the renaming $\mathbf{u}_k \to \mathbf{v}_k$, $\mathbf{v}_k \to \mathbf{u}_{k+1}$. The Lax matrices for the corresponding hierarchy arise from the ones considered previously upon changing $UV$ to $VU$. The explicit discretization of the VL living in this hierarchy reads:
\[ \widetilde{\mathbf{u}}_k(1 + h\widetilde{\mathbf{v}}_{k-1}) = \mathbf{u}_k(1 + h\widetilde{\mathbf{v}}_k), \qquad \widetilde{\mathbf{v}}_k(1 + h\mathbf{u}_k) = \mathbf{v}_k(1 + h\mathbf{u}_{k+1}) \tag{4.65} \]
(here, knowing $\mathbf{u}$, $\mathbf{v}$, one calculates first $\widetilde{\mathbf{v}}$, and then $\widetilde{\mathbf{u}}$).

5. Bogoyavlensky Lattices

5.1. BL2. A class of systems called Bogoyavlensky lattices (although discovered several times independently before his papers [B], see the references in [S5], and also [K1]) serve as generalizations of the Volterra lattice. We consider here only one representative of this class, which is however typical. The integrable lattice system considered in the present section will be called BL2. Its phase space (in the periodic case) is
\[ \mathcal{B} = \mathbb{R}^{3N}(u_1, v_1, w_1, \ldots, u_N, v_N, w_N). \tag{5.1} \]
The equations of motion read:
\[
\begin{aligned}
\dot{u}_k &= u_k(v_k + w_k - v_{k-1} - w_{k-1}), \\
\dot{v}_k &= v_k(u_{k+1} + w_k - u_k - w_{k-1}), \\
\dot{w}_k &= w_k(u_{k+1} + v_{k+1} - u_k - v_k).
\end{aligned}
\tag{5.2}
\]
This system is Hamiltonian with the local Poisson brackets on $\mathcal{B}$
\[
\begin{aligned}
\{u_k, v_k\}_2 &= -u_k v_k, & \{v_k, u_{k+1}\}_2 &= -v_k u_{k+1}, \\
\{u_k, w_k\}_2 &= -u_k w_k, & \{w_k, u_{k+1}\}_2 &= -w_k u_{k+1}, \\
\{v_k, w_k\}_2 &= -v_k w_k, & \{w_k, v_{k+1}\}_2 &= -w_k v_{k+1},
\end{aligned}
\tag{5.3}
\]
and the Hamilton function
\[ H_1(u, v, w) = \sum_{k=1}^N u_k + \sum_{k=1}^N v_k + \sum_{k=1}^N w_k. \]

Notice that the system BL2 is more familiar in the form
\[ \dot{a}_k = a_k(a_{k+1} + a_{k+2} - a_{k-1} - a_{k-2}), \tag{5.4} \]
which arises upon the re-naming of variables $u_k = a_{3k-2}$, $v_k = a_{3k-1}$, $w_k = a_{3k}$.

We use here the Lax representation of BL2 living in $\boldsymbol{g} \otimes \boldsymbol{g} \otimes \boldsymbol{g}$ and introduced in [S5]. Consider the following three matrices:
\[ U(u, v, w, \lambda) = \lambda \sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N u_k E_{k,k} = E + u, \tag{5.5} \]
\[ V(u, v, w, \lambda) = I + \lambda^{-1} \sum_{k=1}^N v_k E_{k,k+1} = I + vE^{-1}, \tag{5.6} \]
\[ W(u, v, w, \lambda) = I + \lambda^{-1} \sum_{k=1}^N w_k E_{k,k+1} = I + wE^{-1}. \tag{5.7} \]

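These formulas define the "Lax matrix" $(U, V, W) : \mathcal{B} \mapsto \boldsymbol{g} \otimes \boldsymbol{g} \otimes \boldsymbol{g}$. The flow (5.2) is equivalent to either of the following Lax equations in $\boldsymbol{g} \otimes \boldsymbol{g} \otimes \boldsymbol{g}$:
\[
\begin{cases} \dot{U} = U B_3 - B_1 U, \\ \dot{V} = V B_1 - B_2 V, \\ \dot{W} = W B_2 - B_3 W, \end{cases}
\quad \text{or} \quad
\begin{cases} \dot{U} = A_1 U - U A_3, \\ \dot{V} = A_2 V - V A_1, \\ \dot{W} = A_3 W - W A_2, \end{cases}
\tag{5.8}
\]
with the matrices
\[ B_1 = \pi_+(UWV), \quad B_2 = \pi_+(VUW), \quad B_3 = \pi_+(WVU), \tag{5.9} \]
\[ A_1 = \pi_-(UWV), \quad A_2 = \pi_-(VUW), \quad A_3 = \pi_-(WVU), \tag{5.10} \]
so that
\[ B_1 = \lambda \sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N (u_k + v_{k-1} + w_{k-1}) E_{kk}, \tag{5.11} \]
\[ B_2 = \lambda \sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N (u_k + v_k + w_{k-1}) E_{kk}, \tag{5.12} \]
\[ B_3 = \lambda \sum_{k=1}^N E_{k+1,k} + \sum_{k=1}^N (u_k + v_k + w_k) E_{kk}, \tag{5.13} \]
\[ A_1 = \lambda^{-1} \sum_{k=1}^N (w_{k-1} v_k + u_k v_k + u_k w_k) E_{k,k+1} + \lambda^{-2} \sum_{k=1}^N u_k w_k v_{k+1} E_{k,k+2}, \]
\[ A_2 = \lambda^{-1} \sum_{k=1}^N (u_k w_k + v_k w_k + v_k u_{k+1}) E_{k,k+1} + \lambda^{-2} \sum_{k=1}^N v_k u_{k+1} w_{k+1} E_{k,k+2}, \]
\[ A_3 = \lambda^{-1} \sum_{k=1}^N (v_k u_{k+1} + w_k u_{k+1} + w_k v_{k+1}) E_{k,k+1} + \lambda^{-2} \sum_{k=1}^N w_k v_{k+1} u_{k+2} E_{k,k+2}. \]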


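The Lax equations (5.8) allow the following r-matrix interpretation.

Theorem 22. [S5] The Lax matrix map $(U, V, W) : \mathcal{B} \mapsto \boldsymbol{g} \otimes \boldsymbol{g} \otimes \boldsymbol{g}$ is Poisson, if $\mathcal{B}$ is equipped with the bracket $\{\cdot,\cdot\}_2$ and $\boldsymbol{g} \otimes \boldsymbol{g} \otimes \boldsymbol{g}$ is equipped with the bracket $\mathrm{PB}_2(A_1, A_2, S)$ corresponding to the operators $A_1$, $A_2$, $S$ defined in (4.50). The monodromy maps $T_{1,2,3} : \boldsymbol{g} \otimes \boldsymbol{g} \otimes \boldsymbol{g} \mapsto \boldsymbol{g}$,
\[ T_1(U, V, W) = UWV, \qquad T_2(U, V, W) = VUW, \qquad T_3(U, V, W) = WVU, \]
are Poisson, if the target space $\boldsymbol{g}$ is equipped with the bracket $\mathrm{PB}_2(A_1, A_2, S)$.

5.2. BL2 → dBL2. To find an integrable time discretization for the flow BL2, we again apply the recipe of Sect. 2.3 with $F(T) = I + hT$, i.e. we consider the map described by the discrete time "Lax triads"
\[
\begin{cases} \widetilde{U} = \boldsymbol{B}_1^{-1} U \boldsymbol{B}_3 \\ \widetilde{V} = \boldsymbol{B}_2^{-1} V \boldsymbol{B}_1 \\ \widetilde{W} = \boldsymbol{B}_3^{-1} W \boldsymbol{B}_2 \end{cases}
\quad \text{or} \quad
\begin{cases} \widetilde{U} = \boldsymbol{A}_1 U \boldsymbol{A}_3^{-1} \\ \widetilde{V} = \boldsymbol{A}_2 V \boldsymbol{A}_1^{-1} \\ \widetilde{W} = \boldsymbol{A}_3 W \boldsymbol{A}_2^{-1} \end{cases}
\tag{5.14}
\]
with
\[ \boldsymbol{B}_1 = \Pi_+(I + hUWV), \quad \boldsymbol{B}_2 = \Pi_+(I + hVUW), \quad \boldsymbol{B}_3 = \Pi_+(I + hWVU), \]
\[ \boldsymbol{A}_1 = \Pi_-(I + hUWV), \quad \boldsymbol{A}_2 = \Pi_-(I + hVUW), \quad \boldsymbol{A}_3 = \Pi_-(I + hWVU). \]

Theorem 23. Consider the change of variables $\mathcal{B}(u, v, w) \mapsto \mathcal{B}(\mathfrak{u}, \mathfrak{v}, \mathfrak{w})$ defined by the formulas
\[
\begin{aligned}
u_k &= \mathfrak{u}_k(1 + h\mathfrak{w}_{k-1})(1 + h\mathfrak{v}_{k-1}), \\
v_k &= \mathfrak{v}_k(1 + h\mathfrak{u}_k)(1 + h\mathfrak{w}_{k-1}), \\
w_k &= \mathfrak{w}_k(1 + h\mathfrak{v}_k)(1 + h\mathfrak{u}_k).
\end{aligned}
\tag{5.15}
\]
The discrete time Lax equations (5.14) are equivalent to the map $(u, v, w) \mapsto (\widetilde{u}, \widetilde{v}, \widetilde{w})$ which in the coordinates $(\mathfrak{u}, \mathfrak{v}, \mathfrak{w})$ is described by the following equations:
\[
\begin{aligned}
\widetilde{\mathfrak{u}}_k(1 + h\widetilde{\mathfrak{v}}_{k-1})(1 + h\widetilde{\mathfrak{w}}_{k-1}) &= \mathfrak{u}_k(1 + h\mathfrak{v}_k)(1 + h\mathfrak{w}_k), \\
\widetilde{\mathfrak{v}}_k(1 + h\widetilde{\mathfrak{u}}_k)(1 + h\widetilde{\mathfrak{w}}_{k-1}) &= \mathfrak{v}_k(1 + h\mathfrak{u}_{k+1})(1 + h\mathfrak{w}_k), \\
\widetilde{\mathfrak{w}}_k(1 + h\widetilde{\mathfrak{u}}_k)(1 + h\widetilde{\mathfrak{v}}_k) &= \mathfrak{w}_k(1 + h\mathfrak{u}_{k+1})(1 + h\mathfrak{v}_{k+1}).
\end{aligned}
\tag{5.16}
\]

Proof. The proof is parallel to the proof of Theorem 15. The formulas (5.15) allow one to find the factors $\boldsymbol{B}_{1,2,3}$, $\boldsymbol{A}_{1,2,3}$ in a closed form. Indeed, with the help of (5.15) we can represent (5.5)–(5.7) as
\[ U = \sum_{k=1}^N \mathfrak{u}_k(1 + h\mathfrak{w}_{k-1})(1 + h\mathfrak{v}_{k-1}) E_{kk} + \lambda \sum_{k=1}^N E_{k+1,k}, \tag{5.17} \]
\[ V = I + \lambda^{-1} \sum_{k=1}^N \mathfrak{v}_k(1 + h\mathfrak{u}_k)(1 + h\mathfrak{w}_{k-1}) E_{k,k+1}, \tag{5.18} \]
\[ W = I + \lambda^{-1} \sum_{k=1}^N \mathfrak{w}_k(1 + h\mathfrak{v}_k)(1 + h\mathfrak{u}_k) E_{k,k+1}. \tag{5.19} \]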


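From these formulas we derive the following expressions for the factors $\boldsymbol{B}_i = \Pi_+(I + hT_i)$ and $\boldsymbol{A}_i = \Pi_-(I + hT_i)$, $i = 1, 2, 3$:
\[ \boldsymbol{B}_1 = \sum_{k=1}^N (1 + h\mathfrak{u}_k)(1 + h\mathfrak{w}_{k-1})(1 + h\mathfrak{v}_{k-1}) E_{kk} + h\lambda \sum_{k=1}^N E_{k+1,k}, \tag{5.20} \]
\[ \boldsymbol{B}_2 = \sum_{k=1}^N (1 + h\mathfrak{v}_k)(1 + h\mathfrak{u}_k)(1 + h\mathfrak{w}_{k-1}) E_{kk} + h\lambda \sum_{k=1}^N E_{k+1,k}, \tag{5.21} \]
\[ \boldsymbol{B}_3 = \sum_{k=1}^N (1 + h\mathfrak{w}_k)(1 + h\mathfrak{v}_k)(1 + h\mathfrak{u}_k) E_{kk} + h\lambda \sum_{k=1}^N E_{k+1,k}, \tag{5.22} \]
\[
\begin{aligned}
\boldsymbol{A}_1 = I &+ h\lambda^{-1} \sum_{k=1}^N (\mathfrak{w}_{k-1}\mathfrak{v}_k + \mathfrak{u}_k\mathfrak{v}_k + \mathfrak{u}_k\mathfrak{w}_k + h\mathfrak{w}_{k-1}\mathfrak{u}_k\mathfrak{v}_k + h\mathfrak{u}_k\mathfrak{v}_k\mathfrak{w}_k) E_{k,k+1} \\
&+ h\lambda^{-2} \sum_{k=1}^N \mathfrak{u}_k\mathfrak{w}_k\mathfrak{v}_{k+1}(1 + h\mathfrak{v}_k)(1 + h\mathfrak{w}_k)(1 + h\mathfrak{u}_{k+1}) E_{k,k+2},
\end{aligned}
\]
\[
\begin{aligned}
\boldsymbol{A}_2 = I &+ h\lambda^{-1} \sum_{k=1}^N (\mathfrak{u}_k\mathfrak{w}_k + \mathfrak{v}_k\mathfrak{w}_k + \mathfrak{v}_k\mathfrak{u}_{k+1} + h\mathfrak{u}_k\mathfrak{v}_k\mathfrak{w}_k + h\mathfrak{v}_k\mathfrak{w}_k\mathfrak{u}_{k+1}) E_{k,k+1} \\
&+ h\lambda^{-2} \sum_{k=1}^N \mathfrak{v}_k\mathfrak{u}_{k+1}\mathfrak{w}_{k+1}(1 + h\mathfrak{w}_k)(1 + h\mathfrak{u}_{k+1})(1 + h\mathfrak{v}_{k+1}) E_{k,k+2},
\end{aligned}
\]
\[
\begin{aligned}
\boldsymbol{A}_3 = I &+ h\lambda^{-1} \sum_{k=1}^N (\mathfrak{v}_k\mathfrak{u}_{k+1} + \mathfrak{w}_k\mathfrak{u}_{k+1} + \mathfrak{w}_k\mathfrak{v}_{k+1} + h\mathfrak{v}_k\mathfrak{w}_k\mathfrak{u}_{k+1} + h\mathfrak{w}_k\mathfrak{u}_{k+1}\mathfrak{v}_{k+1}) E_{k,k+1} \\
&+ h\lambda^{-2} \sum_{k=1}^N \mathfrak{w}_k\mathfrak{v}_{k+1}\mathfrak{u}_{k+2}(1 + h\mathfrak{u}_{k+1})(1 + h\mathfrak{v}_{k+1})(1 + h\mathfrak{w}_{k+1}) E_{k,k+2}.
\end{aligned}
\]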

Now the equations of motion (5.16) follow directly from the B-version of the discrete time Lax triads (5.14).

The map (5.16), denoted hereafter by dBL2, serves as a difference approximation to the flow BL2 (5.2). According to the general properties of discretizations of Sect. 2.3, this map in the coordinates $(u, v, w)$ is Poisson with respect to the bracket (5.3). Unfortunately, this bracket becomes highly nonlocal in the coordinates $(\mathfrak{u}, \mathfrak{v}, \mathfrak{w})$. After re-naming, $\mathfrak{u}_k = \mathfrak{a}_{3k-2}$, $\mathfrak{v}_k = \mathfrak{a}_{3k-1}$, $\mathfrak{w}_k = \mathfrak{a}_{3k}$, the map dBL2 turns into
\[ \widetilde{\mathfrak{a}}_k(1 + h\widetilde{\mathfrak{a}}_{k-1})(1 + h\widetilde{\mathfrak{a}}_{k-2}) = \mathfrak{a}_k(1 + h\mathfrak{a}_{k+1})(1 + h\mathfrak{a}_{k+2}), \]
and in this form it appeared for the first time in [THO], and then in [PN, S4].
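The same renaming relates the continuous flow (5.2) to its scalar form (5.4), and this is straightforward to check by machine. The sketch below is illustrative only: it assumes periodic boundary conditions (period $N$ in $k$, hence period $3N$ for the scalar chain), and the function names `bl2_rhs`, `scalar_rhs`, `interleave` are hypothetical.

```python
def bl2_rhs(u, v, w):
    """Right-hand side of BL2 (5.2) with periodic indices."""
    n = len(u)
    du = [u[k] * (v[k] + w[k] - v[k - 1] - w[k - 1]) for k in range(n)]
    dv = [v[k] * (u[(k + 1) % n] + w[k] - u[k] - w[k - 1]) for k in range(n)]
    dw = [w[k] * (u[(k + 1) % n] + v[(k + 1) % n] - u[k] - v[k]) for k in range(n)]
    return du, dv, dw


def scalar_rhs(a):
    """Right-hand side of (5.4): da_j = a_j (a_{j+1} + a_{j+2} - a_{j-1} - a_{j-2})."""
    m = len(a)
    return [a[j] * (a[(j + 1) % m] + a[(j + 2) % m] - a[j - 1] - a[j - 2])
            for j in range(m)]


def interleave(u, v, w):
    """Renaming u_k = a_{3k-2}, v_k = a_{3k-1}, w_k = a_{3k} (1-based k)."""
    a = []
    for k in range(len(u)):
        a += [u[k], v[k], w[k]]
    return a


# Example: interleaving the BL2 velocities reproduces the scalar velocities.
u, v, w = [1, 2, 3], [4, 5, 6], [7, 8, 9]
lhs = interleave(*bl2_rhs(u, v, w))
rhs = scalar_rhs(interleave(u, v, w))
```

In this example `lhs == rhs` holds, i.e. interleaving commutes with the two vector fields, which is the statement that (5.2) and (5.4) describe the same system.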


5.3. dBL2 → explicit dBL2. Following the scheme used already for dTL and dVL, we extract an explicit discretization for the flow BL2 from the map dBL2. To this end we denote
\[ \mathbf{u}_k(\tau) = \mathfrak{u}_k(t) = \mathfrak{u}_k(\tau - kh), \qquad \mathbf{v}_k(\tau) = \mathfrak{v}_k(t) = \mathfrak{v}_k(\tau - kh), \qquad \mathbf{w}_k(\tau) = \mathfrak{w}_k(t) = \mathfrak{w}_k(\tau - kh). \tag{5.23} \]

These variables satisfy the following difference equations with respect to the discrete time τ : e k (1 + hv k−1 )(1 + hwk−1 ) = uk (1 + hv k )(1 + hwk ), u ek (1 + he uk )(1 + hwk−1 ) = v k (1 + he uk+1 )(1 + hwk ), v

(5.24)

e k (1 + he uk )(1 + he v k ) = wk (1 + he uk+1 )(1 + he v k+1 ). w This is indeed an explicit discretization, because, knowing (u, v, w), one calculates e, v e, w e (in this order). We now turn to finding a Lax representation for this explicitly u map. Unlike the previous cases, we could find reasonable formulas only starting from the B-version of the Lax representation for dBL2. Theorem 24. The map (5.24) allows the following Lax representation:  −1 e e e −1 −1 e e e e −1 −1 −1    F U 12 13 = (11 14 15 F) · F U 12 13 · (F 11 12 13 ),    −1 −1 e e 1 ), e −1 e −1 −1 e −1 15 14 1 1 2 V = (12 11 15 F) · 12 V · (F       −1 −1 f e 11 e 2 ), e −1 e −1 e −1 e −1 15 1 1 3 W = (13 12 11 F) · 13 W · (F

(5.25)

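with the matrices

$$U = \lambda\sum_{k=1}^{N}E_{k+1,k} + \sum_{k=1}^{N}\mathbf{u}_kE_{k,k}, \tag{5.26}$$

$$V = I + \lambda^{-1}\sum_{k=1}^{N}\mathbf{v}_kE_{k,k+1}, \tag{5.27}$$

$$W = I + \lambda^{-1}\sum_{k=1}^{N}\mathbf{w}_kE_{k,k+1}, \tag{5.28}$$

and

$$\Delta_1 = \mathrm{diag}(1+h\mathbf{u}_k), \quad \Delta_2 = \mathrm{diag}(1+h\mathbf{v}_k), \quad \Delta_3 = \mathrm{diag}(1+h\mathbf{w}_k), \tag{5.29}$$

$$\Delta_4 = \mathrm{diag}(1+h\mathbf{v}_{k-1}), \quad \Delta_5 = \mathrm{diag}(1+h\mathbf{w}_{k-1}). \tag{5.30}$$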


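The Lax representation (5.25) implies also

$$\tilde L_+ = (\tilde\Delta_1^{-1}\Delta_4^{-1}\Delta_5^{-1}F)\cdot L_+\cdot(F^{-1}\Delta_5\Delta_4\tilde\Delta_1) \tag{5.31}$$

with the Lax matrix

$$L_+ = (F^{-1}U\Delta_2\Delta_3)(\Delta_3^{-1}W)(\Delta_2^{-1}V) = F^{-1}U\Delta_2W\Delta_2^{-1}V. \tag{5.32}$$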

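Proof. We start with the following Lax representation of the map dBL2:

$$\tilde U = B_1^{-1}UB_3, \quad \tilde V = B_2^{-1}VB_1, \quad \tilde W = B_3^{-1}WB_2,$$

with the matrices (5.17)–(5.19), (5.20)–(5.22). These Lax equations are the compatibility conditions of the following linear problems:

$$\begin{cases}\tilde U\tilde\phi = \mu\tilde\psi,\\ \tilde V\tilde\psi = \mu\tilde\chi,\\ \tilde W\tilde\chi = \mu\tilde\phi,\end{cases} \qquad \begin{cases}B_3\tilde\phi = \phi,\\ B_1\tilde\psi = \psi,\\ B_2\tilde\chi = \chi\end{cases}$$

(it turns out to be more convenient to take the tilded version of the spectral problem, i.e. of the equations in the first triple). In components the equations of the spectral linear problem take the form:

$$\begin{cases}
\tilde u_k(1+h\tilde w_{k-1})(1+h\tilde v_{k-1})\tilde\phi_k + \lambda\tilde\phi_{k-1} = \mu\tilde\psi_k,\\
\tilde\psi_k + \lambda^{-1}\tilde v_k(1+h\tilde u_k)(1+h\tilde w_{k-1})\tilde\psi_{k+1} = \mu\tilde\chi_k,\\
\tilde\chi_k + \lambda^{-1}\tilde w_k(1+h\tilde v_k)(1+h\tilde u_k)\tilde\chi_{k+1} = \mu\tilde\phi_k,
\end{cases}$$

while the equations of the evolutionary linear problem take the form

$$\begin{cases}
(1+hu_k)(1+hv_k)(1+hw_k)\tilde\phi_k + h\lambda\tilde\phi_{k-1} = \phi_k,\\
(1+hv_{k-1})(1+hw_{k-1})(1+hu_k)\tilde\psi_k + h\lambda\tilde\psi_{k-1} = \psi_k,\\
(1+hw_{k-1})(1+hu_k)(1+hv_k)\tilde\chi_k + h\lambda\tilde\chi_{k-1} = \chi_k.
\end{cases}$$

After the change of variables $(k, t) \mapsto (k, \tau)$ the spectral equations take the form:

$$\begin{cases}
\tilde{\mathbf{u}}_k(1+h\mathbf{v}_{k-1})(1+h\mathbf{w}_{k-1})\tilde{\boldsymbol\phi}_k + \lambda\boldsymbol\phi_{k-1} = \mu\tilde{\boldsymbol\psi}_k,\\
\tilde{\boldsymbol\psi}_k + \lambda^{-1}\tilde{\mathbf{v}}_k(1+h\tilde{\mathbf{u}}_k)(1+h\mathbf{w}_{k-1})\tilde{\tilde{\boldsymbol\psi}}_{k+1} = \mu\tilde{\boldsymbol\chi}_k,\\
\tilde{\boldsymbol\chi}_k + \lambda^{-1}\tilde{\mathbf{w}}_k(1+h\tilde{\mathbf{v}}_k)(1+h\tilde{\mathbf{u}}_k)\tilde{\tilde{\boldsymbol\chi}}_{k+1} = \mu\tilde{\boldsymbol\phi}_k,
\end{cases}$$

and the evolutionary equations take the form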

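$$\begin{cases}
(1+h\mathbf{u}_k)(1+h\mathbf{v}_k)(1+h\mathbf{w}_k)\tilde{\boldsymbol\phi}_k + h\lambda\boldsymbol\phi_{k-1} = \boldsymbol\phi_k,\\
(1+h\underset{\sim}{\mathbf{v}}_{k-1})(1+h\underset{\sim}{\mathbf{w}}_{k-1})(1+h\mathbf{u}_k)\tilde{\boldsymbol\psi}_k + h\lambda\boldsymbol\psi_{k-1} = \boldsymbol\psi_k,\\
(1+h\underset{\sim}{\mathbf{w}}_{k-1})(1+h\mathbf{u}_k)(1+h\mathbf{v}_k)\tilde{\boldsymbol\chi}_k + h\lambda\boldsymbol\chi_{k-1} = \boldsymbol\chi_k,
\end{cases} \tag{5.33}$$

or, in matrix notation (an under-tilde denotes the backward shift $\tau \mapsto \tau - h$),

$$\begin{cases}
\tilde{\boldsymbol\phi} = \Delta_3^{-1}\Delta_2^{-1}\Delta_1^{-1}F\boldsymbol\phi,\\
\tilde{\boldsymbol\psi} = \Delta_1^{-1}\underset{\sim}{\Delta}_4^{-1}\underset{\sim}{\Delta}_5^{-1}F\boldsymbol\psi,\\
\tilde{\boldsymbol\chi} = \Delta_2^{-1}\Delta_1^{-1}\underset{\sim}{\Delta}_5^{-1}F\boldsymbol\chi.
\end{cases} \tag{5.34}$$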

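Now we use the equations of motion (5.24) to bring the spectral equations into the form

$$\begin{cases}
\mathbf{u}_k(1+h\mathbf{v}_k)(1+h\mathbf{w}_k)\tilde{\boldsymbol\phi}_k + \lambda\boldsymbol\phi_{k-1} = \mu\tilde{\boldsymbol\psi}_k,\\
\tilde{\boldsymbol\psi}_k + \lambda^{-1}\mathbf{v}_k(1+h\mathbf{w}_k)(1+h\tilde{\mathbf{u}}_{k+1})\tilde{\tilde{\boldsymbol\psi}}_{k+1} = \mu\tilde{\boldsymbol\chi}_k,\\
\tilde{\boldsymbol\chi}_k + \lambda^{-1}\mathbf{w}_k(1+h\tilde{\mathbf{u}}_{k+1})(1+h\tilde{\mathbf{v}}_{k+1})\tilde{\tilde{\boldsymbol\chi}}_{k+1} = \mu\tilde{\boldsymbol\phi}_k.
\end{cases} \tag{5.35}$$

Next, we transform the spectral equations further with the help of the evolutionary ones. Multiplying the first equation in (5.35) by $h$ and subtracting it from the first equation in (5.33), we derive:

$$\boldsymbol\phi_k - (1+h\mathbf{v}_k)(1+h\mathbf{w}_k)\tilde{\boldsymbol\phi}_k = h\mu\tilde{\boldsymbol\psi}_k,$$

and using this (with $k$ replaced by $k-1$) in the first equation in (5.35), we bring the latter into the form

$$\mathbf{u}_k(1+h\mathbf{v}_k)(1+h\mathbf{w}_k)\tilde{\boldsymbol\phi}_k + \lambda(1+h\mathbf{v}_{k-1})(1+h\mathbf{w}_{k-1})\tilde{\boldsymbol\phi}_{k-1} = \mu(\tilde{\boldsymbol\psi}_k - h\lambda\tilde{\boldsymbol\psi}_{k-1}). \tag{5.36}$$

From the second equation in (5.33) we find the expression

$$(1+h\mathbf{w}_k)(1+h\tilde{\mathbf{u}}_{k+1})\tilde{\tilde{\boldsymbol\psi}}_{k+1} = \frac{1}{1+h\mathbf{v}_k}\left(\tilde{\boldsymbol\psi}_{k+1} - h\lambda\tilde{\boldsymbol\psi}_k\right),$$

and plugging this into the second equation in (5.35), we find:

$$\frac{1}{1+h\mathbf{v}_k}\left(\tilde{\boldsymbol\psi}_k + \lambda^{-1}\mathbf{v}_k\tilde{\boldsymbol\psi}_{k+1}\right) = \mu\tilde{\boldsymbol\chi}_k. \tag{5.37}$$

Similarly, from the third equation in (5.33) we find the expression

$$(1+h\tilde{\mathbf{u}}_{k+1})(1+h\tilde{\mathbf{v}}_{k+1})\tilde{\tilde{\boldsymbol\chi}}_{k+1} = \frac{1}{1+h\mathbf{w}_k}\left(\tilde{\boldsymbol\chi}_{k+1} - h\lambda\tilde{\boldsymbol\chi}_k\right),$$

and plugging this into the third equation in (5.35), we find:

$$\frac{1}{1+h\mathbf{w}_k}\left(\tilde{\boldsymbol\chi}_k + \lambda^{-1}\mathbf{w}_k\tilde{\boldsymbol\chi}_{k+1}\right) = \mu\tilde{\boldsymbol\phi}_k. \tag{5.38}$$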


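Putting Eqs. (5.36), (5.37), (5.38) into matrix notation, we finally find:

$$\begin{cases}
F^{-1}U\Delta_2\Delta_3\tilde{\boldsymbol\phi} = \mu\tilde{\boldsymbol\psi},\\
\Delta_2^{-1}V\tilde{\boldsymbol\psi} = \mu\tilde{\boldsymbol\chi},\\
\Delta_3^{-1}W\tilde{\boldsymbol\chi} = \mu\tilde{\boldsymbol\phi}.
\end{cases} \tag{5.39}$$

The compatibility conditions of (5.34), (5.39) coincide with (5.25).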
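The explicit character of the map (5.24) can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes periodic boundary conditions (all indices taken modulo the chain length), positive initial data and a positive step h, and the function name `explicit_dbl2_step` is hypothetical. It computes the updated u, v, w in exactly the order described after (5.24).

```python
from math import prod


def explicit_dbl2_step(u, v, w, h):
    """One step of the explicit map (5.24) on a periodic chain (assumption)."""
    n = len(u)
    # First equation of (5.24): the new u_k needs only the old v and w.
    # The index k - 1 wraps around to the last site for k = 0.
    u_new = [u[k] * (1 + h * v[k]) * (1 + h * w[k])
             / ((1 + h * v[k - 1]) * (1 + h * w[k - 1])) for k in range(n)]
    # Second equation: the new v_k needs the new u and the old w.
    v_new = [v[k] * (1 + h * u_new[(k + 1) % n]) * (1 + h * w[k])
             / ((1 + h * u_new[k]) * (1 + h * w[k - 1])) for k in range(n)]
    # Third equation: the new w_k needs the new u and the new v.
    w_new = [w[k] * (1 + h * u_new[(k + 1) % n]) * (1 + h * v_new[(k + 1) % n])
             / ((1 + h * u_new[k]) * (1 + h * v_new[k])) for k in range(n)]
    return u_new, v_new, w_new


if __name__ == "__main__":
    u = [1.0, 2.0, 0.5, 1.5]
    v = [0.7, 1.1, 0.9, 1.3]
    w = [1.2, 0.4, 0.8, 1.0]
    un, vn, wn = explicit_dbl2_step(u, v, w, h=0.1)
    # On a periodic chain the right-hand factors telescope, so each ratio
    # of products should equal 1 up to rounding.
    print(prod(un) / prod(u), prod(vn) / prod(v), prod(wn) / prod(w))
```

Under the periodicity assumption the products of the u, of the v and of the w over the chain are each unchanged by a step, which gives a cheap sanity check on any implementation.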

5.4. Explicit dBL2 → “relativistic” BL2. Our general philosophy and the previous theorem suggest the matrix

$$L_+ = F^{-1}U\Delta_2W\Delta_2^{-1}V \tag{5.40}$$

as the Lax matrix of the “relativistic” BL2 hierarchy. The spirit of [GK1], however, would suggest

$$L_+ = F^{-1}UWV, \tag{5.41}$$

and this is the choice we actually follow in the next theorem. There is no conflict between these two suggestions, because changing $W \to \Delta_2W\Delta_2^{-1}$ amounts solely to the simple change of variables

$$\mathbf{w}_k \to \mathbf{w}_k\,\frac{1+h\mathbf{v}_k}{1+h\mathbf{v}_{k+1}} \tag{5.42}$$

(not touching $\mathbf{u}_k$, $\mathbf{v}_k$). In other words, in order to make the explicit discretization (5.24) belong to the “relativistic” BL2 hierarchy defined below, one needs to perform the change of variables (5.42). It turns out that the continuous time flows look more symmetric if the Lax matrix is chosen as in (5.41), while the discrete time equations (5.24) look more symmetric with the choice (5.40) for the Lax matrix.

Theorem 25. The hierarchy associated with the Lax matrix (5.41) is described by the following Lax triads:

$$\dot U = A_1U - UA_3, \quad \dot V = A_2V - VA_0, \quad \dot W = A_3W - WA_2, \tag{5.43}$$

which have to be supplemented by the identity

$$\dot F = 0 = A_1F - FA_0, \tag{5.44}$$

so that for the matrix $L_+ = F^{-1}UWV$ there holds the Lax equation

$$\dot L_+ = \left[A_0, L_+\right]. \tag{5.45}$$

Here

$$A_0 = \pi_-(f(L_+)) - \sigma(f(L_+)), \tag{5.46}$$


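and the matrices $A_{1,2,3}$ are uniquely defined by the condition of compatibility of the equations of the Lax triads (5.43), (5.44). In particular, for the “first” flow of the hierarchy corresponding to $f(L) = L$, one has:

$$A_0 = \lambda^{-1}\sum_{k=1}^{N}\left(\mathbf{w}_{k-1}\mathbf{v}_k + \mathbf{u}_k\mathbf{v}_k + \mathbf{u}_k\mathbf{w}_k + h\mathbf{u}_{k-1}\mathbf{w}_{k-1}\mathbf{v}_k\right)E_{k,k+1} + \lambda^{-2}\sum_{k=1}^{N}\mathbf{u}_k\mathbf{w}_k\mathbf{v}_{k+1}E_{k,k+2} - h\sum_{k=1}^{N}\gamma_kE_{kk}, \tag{5.47}$$

$$A_1 = \lambda^{-1}\sum_{k=1}^{N}\left(\mathbf{w}_{k-1}\mathbf{v}_k + \mathbf{u}_k\mathbf{v}_k + \mathbf{u}_k\mathbf{w}_k + h\mathbf{u}_k\mathbf{w}_k\mathbf{v}_{k+1}\right)E_{k,k+1} + \lambda^{-2}\sum_{k=1}^{N}\mathbf{u}_k\mathbf{w}_k\mathbf{v}_{k+1}E_{k,k+2} - h\sum_{k=1}^{N}\gamma_{k-1}E_{kk}, \tag{5.48}$$

$$A_2 = \lambda^{-1}\sum_{k=1}^{N}\left(\mathbf{u}_k\mathbf{w}_k + \mathbf{v}_k\mathbf{w}_k + \mathbf{v}_k\mathbf{u}_{k+1} + h\mathbf{u}_k\mathbf{v}_k\mathbf{w}_k\right)E_{k,k+1} + \lambda^{-2}\sum_{k=1}^{N}\mathbf{v}_k\mathbf{u}_{k+1}\mathbf{w}_{k+1}E_{k,k+2} - h\sum_{k=1}^{N}\gamma_kE_{kk}, \tag{5.49}$$

$$A_3 = \lambda^{-1}\sum_{k=1}^{N}\left(\mathbf{v}_k\mathbf{u}_{k+1} + \mathbf{w}_k\mathbf{u}_{k+1} + \mathbf{w}_k\mathbf{v}_{k+1} + h\mathbf{w}_k\mathbf{u}_{k+1}\mathbf{v}_{k+1}\right)E_{k,k+1} + \lambda^{-2}\sum_{k=1}^{N}\mathbf{w}_k\mathbf{v}_{k+1}\mathbf{u}_{k+2}E_{k,k+2} - h\sum_{k=1}^{N}\gamma_kE_{kk}, \tag{5.50}$$

where

$$\gamma_k = \mathbf{w}_{k-1}\mathbf{v}_k + \mathbf{u}_k\mathbf{v}_k + \mathbf{u}_k\mathbf{w}_k + h\mathbf{u}_{k-1}\mathbf{w}_{k-1}\mathbf{v}_k + h\mathbf{u}_k\mathbf{w}_k\mathbf{v}_{k+1}. \tag{5.51}$$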

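The equations of motion of the “first” flow read:

$$\begin{aligned}
\dot{\mathbf{u}}_k &= \mathbf{u}_k\left(\mathbf{v}_k + \mathbf{w}_k - \mathbf{v}_{k-1} - \mathbf{w}_{k-1} + h\mathbf{w}_k\mathbf{v}_{k+1} - h\mathbf{w}_{k-1}\mathbf{v}_k + h\gamma_k - h\gamma_{k-1}\right),\\
\dot{\mathbf{v}}_k &= \mathbf{v}_k\left(\mathbf{w}_k + \mathbf{u}_{k+1} - \mathbf{w}_{k-1} - \mathbf{u}_k + h\mathbf{u}_k\mathbf{w}_k - h\mathbf{u}_{k-1}\mathbf{w}_{k-1} + h\gamma_{k+1} - h\gamma_k\right),\\
\dot{\mathbf{w}}_k &= \mathbf{w}_k\left(\mathbf{u}_{k+1} + \mathbf{v}_{k+1} - \mathbf{u}_k - \mathbf{v}_k + h\mathbf{u}_{k+1}\mathbf{v}_{k+1} - h\mathbf{u}_k\mathbf{v}_k + h\gamma_{k+1} - h\gamma_k\right).
\end{aligned} \tag{5.52}$$

Proof. The proof is analogous to the proof of Theorem 21. We define $A_0$ by (5.46) in accordance with the prescription of [GK1], then put $A_1 = FA_0F^{-1} \in \boldsymbol{g}_0 \oplus \boldsymbol{g}_-$. The matrices $A_{2,3} \in \boldsymbol{g}_0 \oplus \boldsymbol{g}_-$ are uniquely defined by the condition of consistency of the Lax triads in (5.43). The results for the “first” flow (5.52) (= “relativistic” BL2) are merely the specialization of this construction for $f(L) = L$ and follow by a direct calculation.

The previous theorem allows one to find the equations of motion of all flows of the hierarchy, and to calculate its integrals. Unfortunately, we still do not know its Hamiltonian structure, so that the problem of its integrability in the Liouville–Arnold sense remains open.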
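For numerical experiments with the “first” flow, the right-hand side of (5.52) can be coded directly from (5.51). The sketch below is only an illustration under stated assumptions: periodic boundary conditions (indices modulo the chain length) are assumed, and the names `gamma` and `relativistic_bl2_rhs` are hypothetical. For h = 0 the formulas reduce to the undeformed right-hand side, e.g. u_k(v_k + w_k − v_{k−1} − w_{k−1}) for the first equation.

```python
def gamma(u, v, w, h):
    """The quantities gamma_k of (5.51) on a periodic chain (assumption)."""
    n = len(u)
    return [w[k - 1] * v[k] + u[k] * v[k] + u[k] * w[k]
            + h * u[k - 1] * w[k - 1] * v[k]
            + h * u[k] * w[k] * v[(k + 1) % n] for k in range(n)]


def relativistic_bl2_rhs(u, v, w, h):
    """Right-hand side of the flow (5.52); returns (du, dv, dw)."""
    n = len(u)
    g = gamma(u, v, w, h)
    du, dv, dw = [], [], []
    for k in range(n):
        kp = (k + 1) % n  # index k + 1, wrapped periodically
        km = k - 1        # index k - 1, Python wraps -1 to the last site
        du.append(u[k] * (v[k] + w[k] - v[km] - w[km]
                          + h * w[k] * v[kp] - h * w[km] * v[k]
                          + h * g[k] - h * g[km]))
        dv.append(v[k] * (w[k] + u[kp] - w[km] - u[k]
                          + h * u[k] * w[k] - h * u[km] * w[km]
                          + h * g[kp] - h * g[k]))
        dw.append(w[k] * (u[kp] + v[kp] - u[k] - v[k]
                          + h * u[kp] * v[kp] - h * u[k] * v[k]
                          + h * g[kp] - h * g[k]))
    return du, dv, dw


if __name__ == "__main__":
    u = [1.0, 2.0, 0.5, 1.5]
    v = [0.7, 1.1, 0.9, 1.3]
    w = [1.2, 0.4, 0.8, 1.0]
    du, dv, dw = relativistic_bl2_rhs(u, v, w, h=0.1)
    # Each bracket in (5.52) is a sum of differences of shifted terms, so
    # on a periodic chain the logarithmic derivatives sum to zero.
    print(sum(d / x for d, x in zip(du, u)))
```

With periodic indices the brackets in (5.52) telescope, so the sums of the logarithmic derivatives of u, v and w over the chain vanish; this is a convenient check that the indices were transcribed correctly.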


6. Conclusion

In this paper we discussed an approach to finding new integrable hierarchies based on time-discretization. Namely, by discretizing a known integrable lattice system and then changing the Cauchy problem for the discrete time equation, we arrive at another discrete time equation which belongs, generally speaking, to an integrable hierarchy distinct from the one we started with. This new hierarchy is a regular deformation of the initial one, the deformation parameter being the time step of the discretization. In the Toda lattice case the new hierarchy turns out to be the relativistic Toda one, the deformation parameter acquiring the meaning of the inverse speed of light. This suggests calling the new hierarchies obtained by this procedure “relativistic” in general. It would be interesting to find out whether this terminology admits some physical justification. In any case, we now have a program for finding “relativistic” deformations of a large number of known lattice hierarchies. In this connection the collection of local discretizations in [S6] may serve as a starting point. For the systems belonging to the lattice KP hierarchy, another procedure for finding “relativistic” deformations was suggested in [GK1], which a priori has nothing in common with ours. However, in the examples of the present paper, both procedures lead to similar results. It would be important to find out whether there are some deeper principles behind this coincidence.

Acknowledgement. Y. S. gratefully acknowledges the stimulating correspondence with Boris Kupershmidt. In the early stage of this work he also benefited from discussions with Frank Nijhoff.

References

[AL] Ablowitz, M., Ladik, J.: A nonlinear difference scheme and inverse scattering. Stud. Appl. Math. 55, 213–229 (1976); On solution of a class of nonlinear partial difference equations. Stud. Appl. Math. 57, 1–12 (1977)
[AM] Adler, M., van Moerbeke, P.: Completely integrable systems, Kac–Moody algebras and curves. Adv. Math. 38, 267–317 (1980)
[B] Bogoyavlensky, O.I.: Some constructions of integrable dynamical systems. USSR Math. Izv. 31, 47–75 (1988); Integrable dynamical systems associated with the KdV equation. USSR Math. Izv. 31, 435–454 (1988); The Lax representation with a spectral parameter for certain dynamical systems. USSR Math. Izv. 32, 245–268 (1989)
[BR] Bruschi, M., Ragnisco, O.: Recursion operator and Bäcklund transformations for the Ruijsenaars–Toda lattices. Phys. Lett. A 129, 21–25 (1988); Lax representation and complete integrability for the periodic relativistic Toda lattice. Phys. Lett. A 134, 365–370 (1989); The periodic relativistic Toda lattice: direct and inverse problem. Inv. Probl. 5, 389–405 (1989)
[D] Damianou, P.A.: Master symmetries and R-matrices for the Toda lattice. Lett. Math. Phys. 20, 101–112 (1990); Multiple Hamiltonian structures for Toda type systems. J. Math. Phys. 35, 5511–5541 (1994)
[DLT1] Deift, P., Li, L.-Ch., and Tomei, C.: Matrix factorizations and integrable systems. Commun. Pure Appl. Math. 42, 443–521 (1989)
[FT] Faddeev, L.D., Takhtajan, L.A.: Hamiltonian methods in the theory of solitons. Berlin–Heidelberg–New York: Springer, 1987
[FV] Faddeev, L.D., Volkov, A.Yu.: Hirota equation as an example of integrable symplectic map. Lett. Math. Phys. 32, 125–136 (1994)
[F] Flaschka, H.: On the Toda lattice I. Phys. Rev. B 9, 1924–1925 (1974); On the Toda lattice II. Inverse scattering solution. Progr. Theor. Phys. 51, 703–716 (1974)
[GK1] Gibbons, J., Kupershmidt, B.A.: Relativistic analogs of basic integrable systems. In: Integrable and superintegrable systems, Ed. B.A. Kupershmidt, Singapore: World Scientific, 1990, pp. 207–231
[GK2] Gibbons, J., Kupershmidt, B.A.: Time discretizations of lattice integrable systems. Phys. Lett. A 165, 105–110 (1992)
[H] Hirota, R.: Nonlinear partial difference equations. I–V. J. Phys. Soc. Japan 43, 1423–1433, 2074–2078, 2079–2086 (1977); 45, 321–332 (1978); 46, 312–319 (1978)
[H1] Hirota, R.: Discrete analogue of a generalized Toda equation. J. Phys. Soc. Japan 50, 3785–3791 (1981)
[HT] Hirota, R., Tsujimoto, S.: Conserved quantities of a class of nonlinear difference–difference equations. J. Phys. Soc. Japan 64, 3125–3127 (1995); RIMS Kokyuroku 868, 31 (1994); RIMS Kokyuroku 933, 105 (1995)
[HTI] Hirota, R., Tsujimoto, S., Imai, T.: Difference scheme of soliton equations. In: Future directions of nonlinear dynamics in physical and biological systems, Eds. P.L. Christiansen, J.C. Eilbeck, and R.D. Parmentier, London–New York: Plenum, 1993, pp. 7–15
[KM] Kac, M., van Moerbeke, P.: On an explicitly soluble system of nonlinear differential equations related to certain Toda lattices. Adv. Math. 16, 160–169 (1975)
[K1] Kupershmidt, B.A.: Discrete Lax equations and differential–difference calculus. Asterisque 123 (1985)
[K2] Kupershmidt, B.A.: Infinitely-precise space-time discretizations of the equation u_t + u u_x = 0. In: Algebraic Aspects of Integrable Systems. In Memory of Irene Dorfman, Basel–Boston: Birkhäuser, 1996, pp. 205–216
[LP] Li, L.-C., Parmentier, S.: Nonlinear Poisson structures and r-matrices. Commun. Math. Phys. 125, 545–563 (1989)
[M] Manakov, S.V.: On the complete integrability and stochastization in discrete dynamical systems. Zh. Exp. Theor. Phys. 67, 543–555 (1974)
[MP] Morosi, C., Pizzocchero, L.: R-matrix theory, formal Casimirs and the periodic Toda lattice. J. Math. Phys. 37, 4484–4513 (1996)
[MV] Moser, J., Veselov, A.P.: Discrete versions of some classical integrable systems and factorization of matrix polynomials. Commun. Math. Phys. 139, 217–243 (1991)
[NPCQ] Nijhoff, F.W., Papageorgiou, V.G., Capel, H.W., and Quispel, G.R.W.: The lattice Gel’fand–Dikii hierarchy. Inv. Problems 8, 597–621 (1992)
[OR] Oevel, W., Ragnisco, O.: R-matrices and higher Poisson brackets for integrable systems. Physica A 161, 181–220 (1989)
[OFZR] Oevel, W., Fuchssteiner, B., Zhang, H., and Ragnisco, O.: Mastersymmetries, angle variables and recursion operator of the relativistic Toda lattice. J. Math. Phys. 30, 2664–2676 (1989)
[OKMS] Ohta, Y., Kajiwara, K., Matsukidaira, J., and Satsuma, J.: Casorati determinant solution for the relativistic Toda equation. J. Math. Phys. 34, 5190–5197 (1993)
[PNC] Papageorgiou, V.G., Nijhoff, F.W., and Capel, H.W.: Integrable mappings and nonlinear integrable lattice equations. Phys. Lett. A 147, 106–114 (1990); Capel, H.W., Nijhoff, F.W., and Papageorgiou, V.G.: Complete integrability of Lagrangian mappings and lattices of KdV type. Phys. Lett. A 155, 377–387 (1991)
[PN] Papageorgiou, V.G., Nijhoff, F.W.: On some integrable discrete time systems associated with the Bogoyavlensky lattices. Physica A 228, 172–188 (1996)
[PGR] Papageorgiou, V., Grammaticos, B., and Ramani, A.: Orthogonal polynomial approach to discrete Lax pairs for initial boundary value problems of the QD algorithm. Lett. Math. Phys. 34, 91–101 (1995)
[RSTS] Reyman, A.G., Semenov-Tian-Shansky, M.A.: Group theoretical methods in the theory of finite dimensional integrable systems. In: Encyclopaedia of mathematical science, v. 16: Dynamical Systems VII, Berlin–Heidelberg–New York: Springer, 1994, pp. 116–225
[R] Ruijsenaars, S.N.M.: Relativistic Toda systems. Commun. Math. Phys. 133, 217–247 (1990)
[STS] Semenov-Tian-Shansky, M.A.: What is a classical r-matrix? Funct. Anal. Appl. 17, 259–272 (1983); Classical r-matrices, Lax equations, Poisson Lie groups and dressing transformations. Lecture Notes Phys. 280, 174–214 (1987)
[S1] Suris, Yu.B.: Generalized Toda chains in discrete time. Algebra i Anal. 2, 141–157 (1990); Discrete-time generalized Toda lattices: complete integrability and relation with relativistic Toda lattices. Phys. Lett. A 145, 113–119 (1990)
[S2] Suris, Yu.B.: On the bi-Hamiltonian structure of Toda and relativistic Toda lattices. Phys. Lett. A 180, 419–429 (1993)
[S3] Suris, Yu.B.: Bi-Hamiltonian structure of the qd algorithm and new discretizations of the Toda lattice. Phys. Lett. A 206, 153–161 (1995)
[S4] Suris, Yu.B.: A discrete-time relativistic Toda lattice. J. Phys. A: Math. and Gen. 29, 451–465 (1996); Integrable discretizations of the Bogoyavlensky lattices. J. Math. Phys. 37, 3982–3996 (1996)
[S5] Suris, Yu.B.: Nonlocal quadratic Poisson algebras, monodromy map, and Bogoyavlensky lattices. J. Math. Phys. 38, 4179–4201 (1997)
[S6] Suris, Yu.B.: Integrable discretizations for lattice systems: local equations of motion and their Hamiltonian properties. Rev. Math. Phys., to appear
[Sy] Symes, W.W.: The QR algorithm and scattering for the finite nonperiodic Toda lattice. Physica D 4, 275–280 (1982)
[THO] Tsujimoto, S., Hirota, R., and Oishi, S.: An extension and discretization of Volterra equation. Techn. Report IEICE, NLP 92-90
[V] Veselov, A.P.: Integrable systems with discrete time and difference operators. Funct. Anal. Appl. 22, 1–13 (1988)
[ZTOF] Zhang, H., Tu, G.-Zh., Oevel, W., and Fuchssteiner, B.: Symmetries, conserved quantities, and hierarchies for some lattice systems with soliton structure. J. Math. Phys. 32, 1908–1918 (1991)

Communicated by T. Miwa

This article was processed by the author using the LaTeX style file cljour1 from Springer-Verlag

Commun. Math. Phys. 200, 487 – 494 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Lyapunov Bounds for Lattice Gauge Dynamics

Hans Henrik Rugh

Department of Mathematics, University of Warwick, Coventry, CV4 7AL, UK

Received: 20 December 1997 / Accepted: 21 July 1998

Abstract: We study the classical Hamiltonian dynamics of the Kogut–Susskind model for lattice gauge theories on a finite box in a d-dimensional integer lattice. The coupling constant for the plaquette interaction is denoted $\lambda^2$. When the gauge group is a real or a complex subgroup of a unitary matrix group $U(N)$, $N \ge 1$, we show that the maximal Lyapunov exponent is bounded by $2\lambda\sqrt{d-1}$, uniformly in the size of the lattice, the energy of the system as well as the order, $N$, of the gauge group.

1. Introduction and Statement of Results

In [10] Kogut and Susskind (K-S) introduced a Hamiltonian lattice gauge model for the Yang–Mills equations, originally with a quantum treatment in mind. In recent years, however, the classical dynamics of the K-S model has attracted much attention, mainly for two reasons. First, it is rather simple to conceive a scheme implementing the dynamics on a computer. Second, one hopes that dynamical averages calculated within such a scheme reflect a real quantum physical situation for a Yang–Mills system within a suitable section of parameter space, notably at high temperatures and densities, cf. Ambjørn et al. [1]. Many articles (cf. e.g. [12, 7, 8, 9, 4]) are devoted to numerical studies of the maximal Lyapunov exponent of the K-S dynamical system and, in particular, to how this exponent scales with energy. We refer to [2, 3, 5] for physical motivations and to [14, 13, 11] for a discussion of the scaling behaviour found numerically. In this paper we shall discuss uniform bounds on the maximal Lyapunov exponent and prove the following:

Theorem. Consider the classical dynamics of the Kogut–Susskind model for a gauge group over a finite box in a d-dimensional integer lattice. Let $\lambda^2$ denote the coupling constant for the plaquette interaction (cf. Eqs. (1) to (5) below) and let $\lambda_{\max}$ be the maximal Lyapunov exponent with respect to an invariant measure. If the gauge group is a real or a complex subgroup of a unitary matrix group $U(N)$, $N \ge 1$, then $\lambda_{\max}$ is bounded by $2\lambda\sqrt{d-1}$, uniformly in the size of the lattice, the energy of the system and the order, $N$, of the gauge group.

An intuitive picture which the reader may have in mind during further reading is as follows: The K-S dynamical system may be viewed as a collection of “rotors” on the gauge group $G$, which are coupled to nearest neighbors. Had there been no coupling, each individual “rotor” would constitute an integrable Hamiltonian dynamical system, the first integrals being a Lie algebra element describing the preserved “angular velocities”, Eqs. (11) and (13) below. Introducing the coupling, $\lambda^2 > 0$, gives rise to non-integrable forces which on the other hand are uniformly bounded. The proof then consists in finding an adapted metric on the phase space whose time derivative along a trajectory is uniformly bounded in terms of the metric itself. It seems here to be important that the gauge group is a Lie subgroup of a unitary group, as orthogonality makes the centrifugal term, Eq. (12), vanish and compactness gives a uniform bound on the remaining interaction. We shall mainly use properties of the Lie group which are independent of any particular representation (coordinates) of the group. For the convenience of readers who may wish to use our equations of motion in a numerical scheme we have included explicit formulas and examples in an appendix. In [13] we announced the result that for the gauge group $SU(2)$ on a d-dimensional integer lattice the maximal Lyapunov exponent was uniformly bounded. Compared to the (sketch of the) proof we gave then, our present approach is much simplified, generalizes to an arbitrary subgroup of $U(N)$ and gives an improvement of the bound.

2. The Kogut–Susskind Model

Let $B$ be a finite box in the integer lattice $\mathbb{Z}^d$ and denote by $\Lambda$ the collection of nearest neighbour pairs in $B$ (for simplicity we shall assume periodic boundary conditions). The elements in $\Lambda$ are called links and each link is given an (arbitrary) orientation.
In the following let $G$ be a Lie group (gauge group) which is a subgroup of the unitary group $U(N)$ and consider a configuration manifold, $M = G^\Lambda$, given as the set of maps from $\Lambda$ to $G$. Alternatively one may view $M$ as the space of sections of the trivial principal bundle over $\Lambda$ with $G$ as its typical fiber. The tangent bundle of $M$ is naturally identified with the direct product of tangent bundles of the gauge group, $TM = (TG)^\Lambda$, and we denote a point in $TM$ by $(u, \dot u) = (u_i, \dot u_i)_{i\in\Lambda}$, where $\dot u_i \in T_{u_i}G$ for each link $i \in \Lambda$. A Riemannian metric and norm are given by

$$\langle x, y\rangle = \mathrm{Re}\,\mathrm{tr}\,x^\dagger y, \quad |x| = \langle x, x\rangle^{1/2}, \quad x, y \in T_uG, \tag{1}$$

and they are both left- and right-invariant under the action of the group. We may then define a kinetic energy of $(u, \dot u) \in TM$ as:

$$K[\dot u] \equiv \sum_{i\in\Lambda}\frac{1}{2}\langle\dot u_i, \dot u_i\rangle. \tag{2}$$

Given the collection of oriented links, $\Lambda$, we construct the set of so-called plaquettes, $P$. A plaquette, $\square \in P$, is a quadruple of links which form a closed loop connecting 4 points in $B$. As is readily verified, each link is contained in precisely $2(d-1)$ plaquettes (cf. Fig. 1); since each plaquette contains four links, this gives

$$4|P| = 2(d-1)|\Lambda|, \quad \text{i.e.} \quad |P|/|\Lambda| = (d-1)/2. \tag{3}$$


[Fig. 1. The plaquettes containing the link $u_i$ ($d = 3$)]

[Fig. 2. The plaquette product $u_\square = u_iu_ju_k^\dagger u_l^\dagger$]

For each plaquette, $\square \in P$, we choose an orientation and a base point, writing $\square = (ijkl)$, $i, j, k, l \in \Lambda$, and we let $u_\square = u_iu_ju_ku_l \in G$ be the path-ordered product of the link group elements in the plaquette starting from the base point. In this product, if a link $k \in \Lambda$ is traversed opposite to its orientation, $u_k$ is replaced by $u_k^{-1} = u_k^\dagger$ (cf. Fig. 2, where orientation and base point are indicated). Let $\lambda \ge 0$ be a coupling constant. A potential energy is then given by

$$I[u] = \frac{\lambda^2}{2}\sum_{\square\in P}\left(N - \mathrm{Re}\,\mathrm{tr}\,u_\square\right), \tag{4}$$

and a lattice gauge model is defined through the following Lagrangian function on $TM$:

$$L[u, \dot u] = K[\dot u] - I[u]. \tag{5}$$

One verifies that the Lagrangian does not depend on the choice of orientations and base points. The Kogut–Susskind model is obtained by reformulating the Lagrangian system as the equivalent Hamiltonian system on the cotangent bundle, $T^*M$. For this approach and for more details in the above derivation we refer to [10, 6, 5]. Although the Hamiltonian model is certainly useful in many contexts, including the study of the full Lyapunov spectrum [12], we shall restrict ourselves to the Lagrangian system as it is adequate for our purpose.


3. Lagrangian Dynamics on a Gauge Group

To simplify matters we shall first consider the Lagrangian dynamics on a configuration space consisting of a single copy of a gauge group $G$. We therefore consider a Lagrangian function on $TG$:

$$L[u, \dot u] = \frac{1}{2}\langle\dot u, \dot u\rangle - I[u], \tag{6}$$

in which the potential $I[u]$ is a smooth function of $u \in G$. The differential of $L$ is given by:

$$dL = \langle\dot u, d\dot u\rangle - dI = \frac{d}{dt}\langle\dot u, du\rangle - \left(\langle\ddot u, du\rangle + dI\right), \tag{7}$$

where

$$dI = -\langle f, du\rangle \tag{8}$$

defines a force $f$ on $u$. In terms of this force the Lagrange equation of motion (D’Alembert’s principle) becomes:

$$\langle\ddot u - f, du\rangle \equiv 0. \tag{9}$$

It is convenient to represent the tangent bundle as left translations of the Lie algebra,

$$G \times \mathcal{G} \to TG, \tag{10}$$

$$(u, a) \mapsto (u, ua). \tag{11}$$

Writing $\dot u = ua$ we differentiate once more to get $\ddot u = ua^2 + u\dot a$. When inserted in (9) we obtain two terms in the Lagrange equation. The crucial point in our analysis is that the “centrifugal” term vanishes identically,

$$\langle ua^2, du\rangle = \frac{1}{2}\,\mathrm{tr}\left((ua^2)^\dagger du + du^\dagger ua^2\right) \equiv 0, \tag{12}$$

since $a^2$ is hermitian and $u^\dagger du$ is anti-hermitian (here we are using that $G$ is a subgroup of $U(N)$). We thus arrive at the following reduced equation of motion:

$$\langle\dot a - u^\dagger f, x\rangle \equiv 0, \quad x \in \mathcal{G}, \qquad \dot u = ua. \tag{13}$$

The linear stability of these equations is studied through their differentials:

$$\langle d\dot a - d(u^\dagger f), x\rangle \equiv 0, \quad x \in \mathcal{G}, \qquad d\dot u = du\cdot a + u\cdot da. \tag{14}$$

We parametrize again by the Lie algebra, $dh = u^\dagger du$. Then $d\dot h = -a\,dh + dh\,a + da$ and, noting that $da$ itself is in the Lie algebra, we derive the following inner products:

$$\langle da, d\dot a\rangle = \langle da, d(u^\dagger f)\rangle, \qquad \langle dh, d\dot h\rangle = \langle dh, da\rangle. \tag{15}$$

Here we have used the fact that the terms $\langle dh, a\,dh\rangle$ and $\langle dh, dh\,a\rangle$ vanish by antisymmetry of $a$. For $C > 0$ (to be chosen later) we define a Riemannian metric on $TG$ by:

$$g = \frac{C}{2}\langle da, da\rangle + \frac{1}{2C}\langle dh, dh\rangle. \tag{16}$$

The identities (15) imply that $g$ has the following time-derivative along the flow:

$$\dot g = C\langle da, d(u^\dagger f)\rangle + \frac{1}{C}\langle dh, da\rangle. \tag{17}$$

At this stage we need to know more about the interaction in order to estimate the growth rate of tangent vectors.

4. The Lattice Potential Term

Returning to the full lattice model, the differential of the potential $I$, Eq. (4), takes the form:

$$dI = -\frac{\lambda^2}{2}\sum_{\square\in P}\mathrm{Re}\,\mathrm{tr}\,du_\square \equiv -\sum_{i\in\Lambda}\langle du_i, f_i\rangle, \tag{18}$$

where $f_i$ is the force on the $i$th link. For $i \in \Lambda$ this force is determined by:

$$u_i^\dagger f_i = \frac{\lambda^2}{2}\sum_{\square\ni i}u_{i,\square}, \tag{19}$$

in which the sum extends over all plaquettes containing the $i$th link and oriented such that $u_i^\dagger$ is the first factor. Writing $dh_j = u_j^\dagger du_j$, the differential of (19) is given by:

$$d(u_i^\dagger f_i) = \frac{\lambda^2}{2}\sum_{\square\ni i}\sum_{j\in\square}v^1_{ij}\,dh_j\,v^2_{ji}. \tag{20}$$

Here the last sum extends over all links $j$ in the plaquette and the plaquette product is split into a product:

$$u_{i,\square} = v^1_{ij}v^2_{ji} \tag{21}$$

with $v^1_{ij} = u_i^\dagger \ldots u_j$ and $v^2_{ji}$ containing the rest.

For each link $i \in \Lambda$ the expression (16) defines a quadratic form, $g_i$, on the $i$th component, $T(TG)$, of $T(TM)$. The sum of these,

$$g_\Lambda = \sum_{i\in\Lambda}g_i = \frac{1}{2}\sum_{i\in\Lambda}\left(C\langle da_i, da_i\rangle + \frac{1}{C}\langle dh_i, dh_i\rangle\right), \tag{22}$$

is clearly a positive definite quadratic form on $T(TM)$ and thus it defines a Riemannian metric on $TM$. Its time-derivative along the flow is given by the sum of terms of the form (17) and by inserting (20) we obtain:

$$\dot g_\Lambda = C\frac{\lambda^2}{2}\sum_{i\in\Lambda}\sum_{\square\ni i}\sum_{j\in\square}\langle da_i, v^1_{ij}\,dh_j\,v^2_{ji}\rangle + \frac{1}{C}\sum_{i\in\Lambda}\langle dh_i, da_i\rangle. \tag{23}$$

Here, both $v^1$ and $v^2$ are unitary matrices and hence the summand in the first term is bounded by $|da_i|\cdot|dh_j|$. Introducing the symmetric (integer) matrix:

$$t_{ij} = \#\{\square \in P : i, j \in \square\}, \quad i, j \in \Lambda, \tag{24}$$

we get:

$$|\dot g_\Lambda| \le C\frac{\lambda^2}{2}\sum_{i,j\in\Lambda}t_{ij}\,|da_i|\,|dh_j| + \frac{1}{C}\sum_{i\in\Lambda}|dh_i|\,|da_i|. \tag{25}$$
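The overlap matrix $t_{ij}$ of (24) can be tabulated by brute force on a small periodic box. The sketch below is an illustration only, with hypothetical helper names (`overlap_matrix`, `row_summary`); it assumes a periodic cubic box of side L ≥ 3 (for smaller L distinct plaquettes may share more than one link) and labels a link by its base site and direction. For each link it returns the diagonal entry, the number of off-diagonal entries equal to 1, and the row sum, which for such a box come out as 2(d − 1), 6(d − 1) and 8(d − 1).

```python
from collections import Counter
from itertools import combinations, product


def overlap_matrix(L, d):
    """Links, plaquettes and the matrix t of (24) on the periodic box (Z/L)^d."""
    sites = list(product(range(L), repeat=d))

    def shift(x, mu):
        # Neighbouring site of x in direction mu, with periodic wrap-around.
        y = list(x)
        y[mu] = (y[mu] + 1) % L
        return tuple(y)

    # A link is labelled by (base site, direction).
    links = [(x, mu) for x in sites for mu in range(d)]
    # A plaquette is the set of four links around an elementary square.
    plaquettes = [((x, mu), (shift(x, mu), nu), (shift(x, nu), mu), (x, nu))
                  for x in sites for mu, nu in combinations(range(d), 2)]
    t = Counter()
    for plaq in plaquettes:
        for i in plaq:
            for j in plaq:
                t[i, j] += 1  # one more plaquette containing both i and j
    return links, plaquettes, t


def row_summary(links, t, i):
    """(t_ii, number of j != i with t_ij == 1, sum over j of t_ij) for link i."""
    diag = t[i, i]
    off = [t[i, j] for j in links if j != i and t[i, j] > 0]
    return diag, off.count(1), diag + sum(off)


if __name__ == "__main__":
    links, plaquettes, t = overlap_matrix(L=3, d=3)
    print(len(links), len(plaquettes), row_summary(links, t, links[0]))
```

Since the row sums of a symmetric matrix with non-negative entries bound its operator norm on $l^2(\Lambda)$, the last number in each row summary is the quantity that enters the estimate of the norm of $t$.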

The matrix $t$ has the following properties (cf. Fig. 1): each diagonal term equals $t_{ii} = 2(d-1)$, for each fixed $i \in \Lambda$ there are exactly $6(d-1)$ off-diagonal elements for which $t_{ij} = 1$, and all other elements vanish. Viewing $t$ as a symmetric operator on $l^2(\Lambda)$, the norm of $t$ is bounded by

$$\|t\|_{l^2(\Lambda)} \le \sup_{i\in\Lambda}\sum_{j\in\Lambda}|t_{ij}| = 8(d-1). \tag{26}$$

Writing $\|x\|^2 = \sum_{i\in\Lambda}\langle x_i, x_i\rangle$, the Cauchy–Schwarz inequality yields:

$$|\dot g_\Lambda| \le \left(4\lambda^2(d-1)C + \frac{1}{C}\right)\|da\|\,\|dh\| \le \left(4\lambda^2(d-1)C + \frac{1}{C}\right)g_\Lambda, \tag{27}$$

where for the last inequality we used that for $p$ and $q$ real, $|pq| \le \frac{1}{2}\left(Cp^2 + \frac{1}{C}q^2\right)$. Choosing

$$C = \frac{1}{2\lambda\sqrt{d-1}} \tag{28}$$

we are able to bound the time derivative of the metric $g_\Lambda$ by the metric itself as follows:

$$|\dot g_\Lambda| \le \frac{2}{C}\,g_\Lambda. \tag{29}$$

A Lyapunov exponent measures the exponential growth of the length (under a given norm) of a tangent vector under the flow, and the inequality

$$\frac{d}{dt}\log\sqrt{g_\Lambda} \le 1/C = 2\lambda\sqrt{d-1} \tag{30}$$

shows that this growth rate cannot be faster than $1/C$ when measured in the metric $g_\Lambda$. When $\Lambda$ is finite, the phase space for fixed energy, $K + I = \mathrm{const}$, is compact and, as energy is conserved under the flow, any two norms will be equivalent in a neighborhood of the energy surface. Hence Lyapunov exponents are independent of the choice of the norm, thus proving the Theorem.

Acknowledgement. I am grateful to S.E. Rugh for introducing me to the subject and valuable discussions. Also D. Salamon is acknowledged for his comments and encouragement.


A. Explicit Equations

For the reader who is interested in performing numerical simulations it might be useful to have an explicit representation of the equations of motion for the Lagrangian system considered above. What is needed is to convert an implicit equation for a Lie-algebra element (like Eq. (13)) into an explicit form. Thus, let $G$ be a matrix group, i.e. a subgroup of $GL(N, \mathbb{C})$, which is given as the kernel of a smooth map:

$$K : M_N(\mathbb{C}) \to \mathbb{C}^{d_K}, \quad d_K > 0. \tag{31}$$

Then $G$ is represented by the natural injection:

$$G \hookrightarrow M_N \xrightarrow{\;K\;} 0 \tag{32}$$

and the tangent bundle $TG$ is represented by:

$$TG \hookrightarrow TM_N \xrightarrow{\;K_*\;} 0. \tag{33}$$

As an example the group $G = SU(N)$ can be defined as the kernel of the mapping $K = (K_0, K_1)$ with $d_K = N \times N + 1$ and:

$$K_0[u] = u^\dagger u - e, \tag{34}$$

$$K_1[u] = \det u - 1. \tag{35}$$

The constraints for $TSU(N)$ become:

$$dK_0 = u^\dagger du + du^\dagger u, \tag{36}$$

$$dK_1 = \mathrm{tr}\,u^\dagger du, \tag{37}$$

and the Lie algebra, $\mathcal{G}$, is identified with the kernel of $dK$ at the identity element, $e$, of the group, i.e.

$$a \in \mathcal{G} \iff a + a^\dagger = 0 \ \text{ and } \ \mathrm{tr}\,a = 0. \tag{38}$$

For $a \in M_N(\mathbb{C})$ the operators:

$$\mathcal{A}a \equiv \frac{1}{2}(a - a^\dagger) \quad \text{and} \quad \mathcal{N}a \equiv a - \frac{1}{N}\,e\,\mathrm{tr}\,a \tag{39}$$

are symmetric projections which commute and, with the standard metric,

$$\pi a \equiv \mathcal{N}\mathcal{A}a = \frac{1}{2}(a - a^\dagger) - \frac{1}{2N}\,e\,\mathrm{tr}(a - a^\dagger) \tag{40}$$

is the orthogonal projection of $M_N(\mathbb{C})$ onto $su(N)$. To see this note that the condition (38) corresponds to $(1 - \mathcal{A})a \equiv 0$ and $(1 - \mathcal{N})a \equiv 0$. Given an explicit representation of the orthogonal projection $\pi$ of $M_N(\mathbb{C})$ onto $\mathcal{G}$ (as for $su(N)$ above) the equations of motion (13) become:

$$\dot a = \pi\,u^\dagger f, \tag{41}$$

$$\dot u = ua, \tag{42}$$


and the linear stability is determined by:

$$d\dot a = \pi\,d(u^\dagger f), \tag{43}$$

$$d\dot u = du\cdot a + u\cdot da. \tag{44}$$

Using the parametrization by the Lie algebra, $dh = u^\dagger du$, and writing $d(u^\dagger f) = (u^\dagger f)_*\,dh$, we have the equivalent equations:

$$d\dot a = \pi\,(u^\dagger f)_*\,dh, \tag{45}$$

$$d\dot h = da + dh\,a - a\,dh. \tag{46}$$
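The projection onto $su(N)$ built from the two operators in (39) is the only group-specific ingredient needed in (41) and (45). The following sketch is a plain-Python illustration with hypothetical helper names (`dagger`, `trace`, `matmul`, `inner`, `project_su`), representing matrices as nested lists of complex numbers; it takes the anti-hermitian part first and then removes the trace, and uses the inner product $\langle x, y\rangle = \mathrm{Re}\,\mathrm{tr}\,x^\dagger y$ of Eq. (1) to check orthogonality.

```python
def dagger(a):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inner(x, y):
    """The real inner product <x, y> = Re tr(x^dagger y) of Eq. (1)."""
    return trace(matmul(dagger(x), y)).real


def project_su(a):
    """Anti-hermitian part of a, followed by removal of its trace."""
    n = len(a)
    ad = dagger(a)
    anti = [[(a[i][j] - ad[i][j]) / 2 for j in range(n)] for i in range(n)]
    tr = trace(anti)
    return [[anti[i][j] - (tr / n if i == j else 0) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    a = [[1 + 2j, 3 - 1j], [0.5j, -2 + 1j]]
    p = project_su(a)
    # p should be anti-hermitian and traceless, i.e. an element of su(2).
    print(p, trace(p))
```

In a simulation one would apply `project_su` to the matrix $u^\dagger f$ at every evaluation of the right-hand side of (41); for other subgroups of $U(N)$ the projection has to be replaced by the one appropriate to the corresponding Lie algebra.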

The above equations of motion for the dynamics and the instability, (41)–(42) and (45)–(46), are straightforward to implement in numerical experiments, though they might not be the most economical in terms of computer costs (cf. the discussion in [5] in the case of $G = SU(2)$).

References

1. Ambjørn, J., Askgaard, T., Porter, H. and Shaposhnikov, M.E.: Sphaleron transitions and baryon asymmetry: A numerical real-time analysis. Nucl. Phys. B353, 346 (1991)
2. Biró, T.S., Gong, C. and Müller, B.: Lyapunov exponent and plasmon damping rate in non-Abelian gauge theories. Phys. Rev. D 52, 1260–1266 (1995)
3. Biró, T.S., Gong, C. and Müller, B.: Chaos driven by soft-hard modes coupling in thermal Yang–Mills theory. Phys. Lett. B 362, 29–33 (1995)
4. Biró, T.S., Gong, C., Müller, B. and Trayanov, A.: Hamiltonian Dynamics of Yang–Mills fields on a Lattice. Int. J. Mod. Phys. C 5, 113 (1994)
5. Biró, T.S., Matinyan, S.G. and Müller, B.: Chaos and Gauge Field Theory. Singapore: World Scientific, 1994
6. Creutz, M.J.: Quarks, gluons and lattices. Cambridge: Cambridge University Press, 1983
7. Gong, C.: Lyapunov exponents of classical SU(3) gauge theory. Phys. Lett. B 298, 257 (1993)
8. Gong, C.: Lyapunov spectra in SU(2) lattice gauge theory. Phys. Rev. D 49, 2642–2645 (1994)
9. Gong, C., Matinyan, S.G., Müller, B. and Trayanov, A.: Manifestation of infrared instabilities in high energy processes in non-Abelian gauge theories. Phys. Rev. D 49, 607–610 (1994)
10. Kogut, J. and Susskind, L.: Hamiltonian formulation of Wilson’s lattice gauge theories. Phys. Rev. D 11, 395 (1975)
11. Müller, B.: Study of Chaos and Scaling in Classical SU(2) Gauge Theory. Duke U. preprint DUKETH-96-118 (July 1996), chao-dyn/9607001
12. Müller, B. and Trayanov, A.: Deterministic Chaos in Non-Abelian Lattice Gauge Theory. Phys. Rev. Lett. 68, 3387 (1992)
13. Nielsen, H.B., Rugh, H.H. and Rugh, S.E.: Chaos, Scaling and Existence of a Continuum Limit in Classical Non-Abelian Lattice Gauge Theory. 10 pp, Preprint LA-UR-96-4455 (Nov 1996), hep-th/9611128
14. Rugh, S.E.: Aspects of Chaos in the Fundamental Interactions. Part I. Non-Abelian Yang–Mills fields. (Part of) Licentiate Thesis, The Niels Bohr Institute. September 1994

Communicated by Ya. G. Sinai

Commun. Math. Phys. 200, 495 – 543 (1999)


Gauge Theories on Manifolds with Boundary

Ivan G. Avramidi1,2,*, Giampiero Esposito3,4

1 Department of Mathematics, University of Greifswald, F.-L.-Jahnstr. 15a, D-17489 Greifswald, Germany
2 Department of Mathematics, The University of Iowa, 14 MacLean Hall, Iowa City, IA 52242-1419, USA. E-mail: [email protected]
3 Istituto Nazionale di Fisica Nucleare, Sezione di Napoli, Complesso Universitario di M. S. Angelo, Edificio G, 80126 Napoli, Italy
4 Università di Napoli Federico II, Dipartimento di Scienze Fisiche, Mostra d'Oltremare Padiglione 19, 80125 Napoli, Italy

Received: 6 October 1997 / Accepted: 27 July 1998

Abstract: The boundary-value problem for Laplace-type operators acting on smooth sections of a vector bundle over a compact Riemannian manifold with generalized local boundary conditions including both normal and tangential derivatives is studied. The condition of strong ellipticity of this boundary-value problem is formulated. The resolvent kernel and the heat kernel in the leading approximation are explicitly constructed. As a result, the previous work in the literature on heat-kernel asymptotics is shown to be a particular case of a more general structure. For a bosonic gauge theory on a compact Riemannian manifold with smooth boundary, the problem of obtaining a gauge-field operator of Laplace type is studied, jointly with local and gauge-invariant boundary conditions, which should lead to a strongly elliptic boundary-value problem. The scheme is extended to fermionic gauge theories by means of local and gauge-invariant projectors. After deriving a general condition for the validity of strong ellipticity for gauge theories, it is proved that for Euclidean Yang–Mills theory and Rarita–Schwinger fields all the above conditions can be satisfied. For Euclidean quantum gravity, however, this property no longer holds, i.e. the corresponding boundary-value problem is not strongly elliptic. Some non-standard local formulae for the leading asymptotics of the heat-kernel diagonal are also obtained. It is shown that, due to the lack of strong ellipticity, the heat-kernel diagonal is non-integrable near the boundary.

* On leave of absence from Research Institute for Physics, Rostov State University, Stachki 194, 344104 Rostov-on-Don, Russia.

1. Introduction

The consideration of boundary conditions in the formulation of quantum field theories is crucial for at least two reasons:

(i) Boundary effects are necessary to obtain a complete prescription for the quantization, unless one studies the idealized case where no bounding surfaces occur. In particular, the functional-integral approach relies heavily on a careful assignment of boundary data.

(ii) One may wonder whether the symmetries of the theory in the absence of boundaries are preserved by their inclusion. Moreover, one would like to know whether the application of a commonly accepted physical principle (e.g. the invariance under infinitesimal diffeomorphisms on metric perturbations, in the case of gravitation) is sufficient to determine completely the desired form of the boundary conditions, upon combination with other well known mathematical properties.

The investigations carried out in our paper represent the attempt to solve these problems by using the advanced tools of analysis and geometry, with application to the operators of Laplace or Dirac type. Indeed, the differential operators of Laplace type are well known to play a crucial role in mathematical physics. By choosing a suitable gauge it is almost always possible to reduce the problem of evaluating the Green functions and the effective action in quantum field theory to a calculation of Green functions (and hence the resolvent) and functional determinants (or the ζ-functions) of Laplace-type operators. These objects are well defined, strictly speaking, only for self-adjoint elliptic operators. Thus, on manifolds with boundary, one has to impose some boundary conditions which guarantee the self-adjointness and the ellipticity of the Laplace-type operator. The simplest choices are the Dirichlet and Neumann boundary conditions, when the fields or their normal derivatives are set to zero on the boundary. A slight modification of the Neumann boundary conditions is given by the so-called Robin conditions, when one sets to zero at the boundary a linear combination of the values of the fields and their normal derivatives.
An even more general scheme corresponds to a mixed situation when some field components satisfy Dirichlet conditions, and the remaining ones are subject to Robin boundary conditions [1, 2]. However, this is not the most general scheme, and one can define some generalized boundary conditions which are still local but include both normal and tangential derivatives of the fields [3]. For example, by using linear covariant gauges in quantum gravity, one is led to impose boundary conditions that involve the tangential derivatives as well, to ensure that the whole set of boundary conditions on metric perturbations is invariant under infinitesimal diffeomorphisms [4, 5]. The same boundary conditions may be derived by constructing a BRST charge and requiring BRST invariance of the boundary conditions [6]. In this paper we are going to study the generalized boundary-value problem for Laplace- and Dirac-type operators, the latter being relevant for the analysis of fermionic models. Such a boundary-value problem, however, is not automatically elliptic. Therefore, we find first an explicit criterion of ellipticity (Sect. 2.3). Then we construct the resolvent kernel and the heat kernel in the leading approximation (Sect. 3). The application of this formalism (Sect. 4) proves that the generalized boundary-value problem is strongly elliptic for Euclidean Yang–Mills theory (Sect. 5) and Rarita–Schwinger fields (Sect. 6), but not for Euclidean quantum gravity (Sect. 7). The possible implications are discussed in Sect. 8.

2. Laplace-Type and Dirac-Type Operators

Let (M, g) be a smooth compact Riemannian manifold of dimension m with smooth boundary, say ∂M. Let g be the positive-definite Riemannian metric on M and $\hat g$ be the induced metric on ∂M. Let V be a (smooth) vector bundle over the manifold M and $C^\infty(V, M)$ be the space of smooth sections of the bundle V. Let $V^*$ be the dual vector bundle and $E : V \to V^*$ be a Hermitian non-degenerate metric, $E^\dagger = E$, that determines the Hermitian fibre scalar product in V. Using the invariant Riemannian volume element d vol(x) on M one defines a natural $L^2$ inner product ( , ) in $C^\infty(V, M)$, and the Hilbert space $L^2(V, M)$ as the completion of $C^\infty(V, M)$ in this norm. Further, let $\nabla^V$ be the connection on the vector bundle V compatible with the metric E, $\mathrm{tr}_g = g \otimes 1$ be the contraction of sections of the bundle $T^*M \otimes T^*M \otimes V$ with the metric on the cotangent bundle $T^*M$, and Q be a smooth endomorphism of the bundle V, i.e. Q ∈ End(V), satisfying the condition

\[ \bar Q \equiv E^{-1} Q^\dagger E = Q. \tag{2.1} \]

Hereafter we call such endomorphisms self-adjoint. We also use the notation $\bar A = E^{-1} A^\dagger E$ for any endomorphisms or operators. Then a Laplace-type operator, or generalized Laplacian,

\[ F : C^\infty(V, M) \to C^\infty(V, M), \tag{2.2} \]

is a second-order differential operator defined by

\[ F \equiv -\mathrm{tr}_g \nabla^{T^*M \otimes V} \nabla^V + Q, \tag{2.3} \]

where

\[ \nabla^{T^*M \otimes V} = \nabla^{T^*M} \otimes 1 + 1 \otimes \nabla^V, \tag{2.4} \]

and $\nabla^{T^*M}$ is the Levi–Civita connection on M.

Let V be a Clifford bundle and $\Gamma : T^*M \to \mathrm{End}(V)$, $\Gamma(\xi) = \Gamma^\mu \xi_\mu$, be the Clifford map satisfying

\[ \Gamma(\xi_1)\Gamma(\xi_2) + \Gamma(\xi_2)\Gamma(\xi_1) = 2 g(\xi_1, \xi_2) I, \tag{2.5} \]

for all $\xi_1, \xi_2 \in T^*M$. Hereafter I denotes the identity endomorphism of the bundle V. Let $\nabla^V$ be the Clifford connection compatible with the Clifford map. Then a Dirac-type operator

\[ D : C^\infty(V, M) \to C^\infty(V, M), \tag{2.6} \]

is a first-order differential operator defined by [1, 7]

\[ D \equiv i(\Gamma \nabla^V + S), \tag{2.7} \]

with S ∈ End(V). Of course, the square of the Dirac operator is a Laplace-type operator.

2.1. Geometry of boundary operators.

2.1.1. Laplace-type operator. For a Laplace-type operator we define the boundary data by

\[ \psi_F(\varphi) = \begin{pmatrix} \psi_0(\varphi) \\ \psi_1(\varphi) \end{pmatrix}, \tag{2.8} \]

where

\[ \psi_0(\varphi) \equiv \varphi|_{\partial M}, \qquad \psi_1(\varphi) \equiv \nabla_N \varphi|_{\partial M} \tag{2.9} \]

are the restrictions of the sections ϕ ∈ C ∞ (V, M ) and their normal derivatives, to the boundary (hereafter, N is the inward-pointing unit normal vector field to the boundary). To make the operator F symmetric and elliptic (see Sects. 2.2 and 2.3), one has to impose some conditions on the boundary data ψF (ϕ). In general, a d-graded vector bundle is a vector bundle jointly with a fixed decomposition into d sub-bundles [1]. In our problem, let the vector bundle WF over ∂M be the bundle of the boundary data. WF consists of two copies of the restriction of V to ∂M and inherits a natural grading [1] WF = W0 ⊕ W1 ,

(2.10)

where Wj represents normal derivatives of order j, and, therefore, dim WF = 2 dim V.

(2.11)

The bundles $W_0$ and $W_1$ have the same structure, and hence in the following they will be often identified. Let $W'_F = W'_0 \oplus W'_1$ be an auxiliary graded vector bundle over ∂M such that

\[ \dim W'_F = \dim V. \tag{2.12} \]

Since the dimension of the bundle $W_F$ is twice as big as the dimension of $W'_F$, $\dim W_F = 2 \dim W'_F$, it is convenient to identify the bundle $W'_F$ with a sub-bundle of $W_F$ by means of a projection

\[ P_F : W_F \to W'_F, \qquad P_F^2 = P_F. \tag{2.13} \]

In other words, sections of the bundle $W'_F$ have the form $P_F \chi$, with χ being a section of the bundle $W_F$. Note that the rank of the projector $P_F$ is equal to the dimension of the bundle V,

\[ \mathrm{rank}\, P_F = \mathrm{rank}(I_W - P_F) = \dim V, \tag{2.14} \]

where $I_W$ is the identity endomorphism of the bundle $W_F$. Let $B_F : C^\infty(W_F, \partial M) \to C^\infty(W'_F, \partial M)$ be a tangential differential operator on ∂M. The boundary conditions then read

\[ B_F \psi_F(\varphi) = 0. \tag{2.15} \]

The boundary operator BF is not arbitrary but should satisfy some conditions to make the operator F self-adjoint and elliptic. These conditions are formulated in the Subsects. 2.2 and 2.3. The boundary operator BF can be then presented in the form BF = PF L,

(2.16)

where L is a non-singular tangential operator L : C ∞ (WF , ∂M ) → C ∞ (WF , ∂M ),

(2.17)

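meaning that there is a well defined inverse operator $L^{-1}$. In other words, the boundary conditions mean

\[ \psi_F(\varphi) = K_F \chi \quad \text{with arbitrary } \chi \in C^\infty(W_F, \partial M), \tag{2.18} \]

where

\[ K_F = L^{-1}(I_W - P_F) : C^\infty(W_F, \partial M) \to C^\infty(W_F, \partial M). \tag{2.19} \]

Now let Π be a self-adjoint projector acting on $W_0$ and $W_1$,

\[ \Pi : W_0 \to W_0, \qquad \Pi : W_1 \to W_1, \tag{2.20} \]
\[ \Pi^2 = \Pi, \qquad \bar\Pi \equiv E^{-1} \Pi^\dagger E = \Pi. \tag{2.21} \]

In our analysis, we will consider the projector $P_F$ and the operator L of the form

\[ P_F = \Pi \oplus (I - \Pi) = \begin{pmatrix} \Pi & 0 \\ 0 & (I - \Pi) \end{pmatrix}, \tag{2.22} \]
\[ L = \begin{pmatrix} I & 0 \\ \Lambda & I \end{pmatrix}, \tag{2.23} \]

where Λ is a tangential differential operator on ∂M, $\Lambda : C^\infty(W_0, \partial M) \to C^\infty(W_0, \partial M)$, satisfying the conditions

\[ \Pi \Lambda = \Lambda \Pi = 0. \tag{2.24} \]

Of course, $\mathrm{rank}\, P_F = \mathrm{rank}\, \Pi + \mathrm{rank}(I - \Pi) = \dim V$ as needed. Hence we obtain

\[ B_F = \begin{pmatrix} \Pi & 0 \\ \Lambda & I - \Pi \end{pmatrix}, \qquad K_F = I_W - B_F = \begin{pmatrix} I - \Pi & 0 \\ -\Lambda & \Pi \end{pmatrix}. \tag{2.25} \]

Note that $B_F$ and $K_F$ are complementary projectors

\[ B_F^2 = B_F, \qquad K_F^2 = K_F, \qquad B_F K_F = K_F B_F = 0. \tag{2.26} \]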
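The block algebra leading from (2.16), (2.19) and (2.22)-(2.24) to the complementary projectors of (2.25) and (2.26) can be checked on a small finite-dimensional stand-in. The sketch below is only an illustration under assumptions that are not in the text: dim V = 2, Π = diag(1, 0), and the tangential operator Λ is replaced by a constant matrix supported on the range of I − Π, so that ΠΛ = ΛΠ = 0 holds by construction. All helper names are hypothetical.

```python
# Toy check of (2.16), (2.19) and (2.22)-(2.26); every matrix is a stand-in.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sub(a, b):
    """Entrywise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def block(a, b, c, d):
    """Assemble the 4x4 block matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
I4 = block(I2, Z2, Z2, I2)
PI = [[1, 0], [0, 0]]    # assumed form of the projector Pi
LAM = [[0, 0], [0, 3]]   # constant stand-in for Lambda; Pi*Lam = Lam*Pi = 0

def boundary_operators(pi, lam):
    """Return (B_F, K_F) built as P_F L and L^{-1} (I_W - P_F)."""
    p_f = block(pi, Z2, Z2, sub(I2, pi))       # (2.22)
    l_op = block(I2, Z2, lam, I2)              # (2.23)
    l_inv = block(I2, Z2, sub(Z2, lam), I2)    # inverse of the block-triangular L
    b_f = matmul(p_f, l_op)                    # (2.16)
    k_f = matmul(l_inv, sub(I4, p_f))          # (2.19)
    return b_f, k_f

B_F, K_F = boundary_operators(PI, LAM)
print("B_F idempotent:", matmul(B_F, B_F) == B_F)
print("K_F idempotent:", matmul(K_F, K_F) == K_F)
print("K_F = I_W - B_F:", K_F == sub(I4, B_F))
```

Integer entries are used so that the comparisons are exact; a genuine Λ is a differential operator, and this finite-dimensional model only mirrors the block structure.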

Moreover, by virtue of the property (2.14), both the operators $B_F$ and $K_F$ do not vanish.

2.1.2. Dirac-type operator. For a Dirac-type operator (see (2.7)) the normal derivatives are not included in the boundary data $\psi_D(\varphi)$, and such boundary data consist only of the restriction $\psi_D(\varphi) = \psi_0(\varphi)$ (see (2.9)) of sections $\varphi \in C^\infty(V, M)$ to the boundary. Therefore, the bundle of the boundary data $W_D$ is just the restriction $W_0$ of the bundle V and, similarly, the auxiliary vector bundle is $W'_D = W'_0$. The dimensions of these bundles are

\[ \dim W_D = 2 \dim W'_D = \dim V. \tag{2.27} \]

This should be compared with (2.11) and (2.12). As above, $W'_D$ can be identified with a sub-bundle of $W_D$ by means of a projection

\[ P_D : W_D \to W'_D, \tag{2.28} \]

but now the rank of the projector $P_D$ is equal to half the dimension of the bundle V:

\[ \mathrm{rank}\, P_D = \tfrac{1}{2} \dim V. \tag{2.29} \]

The boundary conditions for the operator D read

\[ B_D \psi_0(\varphi) = 0, \tag{2.30} \]

where now the operator $B_D$ is not a tangential differential operator but just a projector $P_D$:

\[ B_D = P_D, \qquad K_D = I - P_D. \tag{2.31} \]

It is crucial to use projectors to obtain a well posed boundary-value problem of local nature for Dirac-type operators, since any attempt to fix the whole fermionic field at the boundary would lead to an over-determined problem.

2.2. Symmetry.

2.2.1. Laplace-type operator. The Laplace-type operator F is formally self-adjoint, which means that it is symmetric on any section of the bundle V with compact support disjoint from the boundary ∂M, i.e.

\[ I_F(\varphi_1, \varphi_2) \equiv (F\varphi_1, \varphi_2) - (\varphi_1, F\varphi_2) = 0, \quad \text{for } \varphi_1, \varphi_2 \in C_0^\infty(V, M). \tag{2.32} \]

However, a formally self-adjoint operator is not necessarily self-adjoint. It is essentially self-adjoint if its closure is self-adjoint. This implies that the operator F is such that: i) it is symmetric on any smooth section satisfying the boundary conditions (2.15), i.e.

\[ I_F(\varphi_1, \varphi_2) = 0, \quad \text{for } \varphi_1, \varphi_2 \in C^\infty(V, M) : \ B_F \psi(\varphi_1) = B_F \psi(\varphi_2) = 0, \tag{2.33} \]

and ii) there exists a unique self-adjoint extension of F. The property ii) can be proved by studying deficiency indices [8], but is not the object of our investigation. The antisymmetric bilinear form $I_F(\varphi_1, \varphi_2)$ depends on the boundary data. Integrating by parts, it is not difficult to obtain

\[ I_F(\varphi_1, \varphi_2) = \langle \psi(\varphi_1), J_F \psi(\varphi_2) \rangle, \tag{2.34} \]

where

\[ J_F = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}. \tag{2.35} \]

Here $\langle\, ,\, \rangle$ denotes the $L^2$ inner product in $C^\infty(W, \partial M)$ determined by the restriction of the fibre metric to the boundary. To ensure the symmetry, we have to fix the boundary operators $B_F$ so as to make this form identically zero. The condition for that reads

\[ \bar K_F J_F K_F = 0. \tag{2.36} \]

By using the general form (2.25) of the operator $K_F$, we find herefrom that this is equivalent to the condition of symmetry of the operator Λ,

\[ \langle \Lambda \varphi_1, \varphi_2 \rangle = \langle \varphi_1, \Lambda \varphi_2 \rangle, \quad \text{for } \varphi_1, \varphi_2 \in C^\infty(W_0, \partial M). \tag{2.37} \]

Thus, we have proven the following result:

Theorem 1. The Laplace-type operator F (2.3) endowed with the boundary conditions (2.15), with the boundary operator $B_F$ given by (2.25), and any symmetric operator Λ satisfying conditions (2.24) and (2.37), is symmetric.

2.2.2. Dirac-type operator. In the case of Dirac-type operators there are two essentially different cases. The point is that, in general, there exist two different representations of the Clifford algebra satisfying

\[ \bar\Gamma(\xi) = -\varepsilon \Gamma(\xi), \qquad \varepsilon = \pm 1. \tag{2.38} \]

In odd dimension m there is only one possibility, ε = −1, corresponding to self-adjoint Dirac matrices, whereas for even dimension m they can be either self-adjoint or anti-self-adjoint. By requiring the endomorphism S to satisfy the condition

\[ \bar S = \varepsilon S, \tag{2.39} \]

we see that the Dirac-type operator (2.7) is formally self-adjoint for ε = −1 and anti-self-adjoint for ε = 1:

\[ I_D(\varphi_1, \varphi_2) \equiv (D\varphi_1, \varphi_2) + \varepsilon (\varphi_1, D\varphi_2) = 0, \quad \varphi_1, \varphi_2 \in C_0^\infty(V, M). \tag{2.40} \]

In complete analogy with the above, we easily find the condition for the Dirac-type operator to be (anti)-symmetric,

\[ I_D(\varphi_1, \varphi_2) = \langle \psi_0(\varphi_1), J_D \psi_0(\varphi_2) \rangle = 0, \tag{2.41} \]

where

\[ J_D = i\varepsilon \Gamma(N) = i\varepsilon \Gamma^\mu N_\mu, \tag{2.42} \]

for any $\varphi_1, \varphi_2 \in C^\infty(V, M)$ satisfying the boundary conditions (2.30). This leads to a condition on the boundary operator,

\[ \bar K_D J_D K_D = 0, \tag{2.43} \]

wherefrom, by using (2.31), we obtain a condition for the projector $P_D$,

\[ (I - \bar P_D)\, \Gamma(N)\, (I - P_D) = 0. \tag{2.44} \]

Hence we get a sufficient condition on the boundary projector,

\[ \bar P_D = \Gamma(N) (I - P_D) \Gamma(N)^{-1}. \tag{2.45} \]

This means that $P_D$ can be expressed in the form

\[ P_D = \tfrac{1}{2}(I + \eta), \tag{2.46} \]

where η satisfies the conditions

\[ \eta^2 = I, \qquad \bar\eta\, \Gamma(N) + \Gamma(N)\, \eta = 0. \tag{2.47} \]

Thus, there are two cases: i) η is anti-self-adjoint and commutes with Γ(N), ii) η is self-adjoint and anti-commutes with Γ(N). This leads to two particular solutions,

\[ \eta = \pm \Gamma(N), \quad \text{for } \varepsilon = 1 \text{ and even } m, \tag{2.48} \]

and

\[ \eta = \frac{1}{|u|} \Gamma(u), \quad \text{for } \varepsilon = -1 \text{ and any } m, \tag{2.49} \]

with some cotangent vector $u \in T^*\partial M$ on the boundary. In even dimension m there exists another very simple solution,

\[ \eta = \pm C, \quad \text{for even } m, \ \varepsilon = \pm 1, \tag{2.50} \]

where C is the chirality operator defined with the help of an orthonormal basis $e^a$ in $T^*M$,

\[ C = i^{m/2}\, \Gamma(e^1) \cdots \Gamma(e^m). \tag{2.51} \]

One easily finds

\[ C^2 = I, \qquad \bar C = C, \tag{2.52} \]
\[ C\, \Gamma(\xi) + \Gamma(\xi)\, C = 0, \tag{2.53} \]

for any $\xi \in T^*M$, so that the conditions (2.47) are satisfied. Thus, we have

Theorem 2. The Dirac-type operator $D = i(\Gamma\nabla + S)$ with Γ and S satisfying the conditions $\bar\Gamma = -\varepsilon\Gamma$, $\bar S = \varepsilon S$, endowed with the boundary conditions $P_D \varphi|_{\partial M} = 0$, is symmetric for ε = −1 and anti-symmetric for ε = 1 (in agreement with (2.40)), provided that the boundary projector $P_D$ satisfies the condition $(I - \bar P_D)\Gamma(N)(I - P_D) = 0$. Admissible boundary projectors satisfying this criterion are: for ε = 1,

\[ P_D = \tfrac{1}{2}\left[ I \pm \Gamma(N) \right], \tag{2.54} \]

and, for ε = −1,

\[ P_D(u) = \tfrac{1}{2}\left[ I + \frac{1}{|u|} \Gamma(u) \right], \tag{2.55} \]

where $u \in T^*\partial M$. In the case of an even-dimensional manifold M, the boundary projector

\[ P_D = \tfrac{1}{2}(I \pm C), \tag{2.56} \]

C being the chirality operator, is also admissible.

2.3. Strong ellipticity.

2.3.1. Laplace-type operator. Now we are going to study the ellipticity of the boundary-value problem defined by the boundary operator (2.25). First of all we fix the notation. By using the inward geodesic flow, we identify a narrow neighbourhood of the boundary ∂M with a part of $\partial M \times \mathbb{R}_+$ and define a split of the cotangent bundle $T^*M = T^*\partial M \oplus T^*(\mathbb{R})$. Let $\hat x = (\hat x^i)$, (i = 1, 2, ..., m − 1), be the local coordinates on ∂M and r be the normal geodesic distance to the boundary, so that $N = \partial_r = \partial/\partial r$ is the inward unit normal on ∂M. Near ∂M we choose the local coordinates $x = (x^\mu) = (\hat x, r)$, (µ = 1, 2, ..., m) and the split $\xi = (\xi_\mu) = (\zeta, \omega) \in T^*M$, where $\zeta = (\zeta_j) \in T^*\partial M$ and $\omega \in \mathbb{R}$. With our notation, Greek indices run from 1 through m and lower case Latin indices run from 1 through m − 1. Our presentation differs from the one in [9], since we always work with self-adjoint projectors.

In this paper we are interested in the so-called generalized boundary conditions, when Λ is a first-order tangential differential operator acting on sections of the vector bundle $W_0$ over ∂M. Any formally self-adjoint operator of first order satisfying the conditions (2.24) can be put in the form (hereafter, $\hat\nabla_i$ denotes (m − 1)-dimensional covariant differentiation tangentially, defined in ref. [1])

\[ \Lambda = (I - \Pi) \left[ \tfrac{1}{2} \left( \Gamma^i \hat\nabla_i + \hat\nabla_i \Gamma^i \right) + S \right] (I - \Pi), \tag{2.57} \]

where $\Gamma^i \in C^\infty(T\partial M \otimes \mathrm{End}(W_0), \partial M)$ are some endomorphism-valued vector fields on ∂M, and S is some endomorphism of the vector bundle $W_0$, satisfying the conditions

\[ \bar\Gamma^i = -\Gamma^i, \qquad \bar S = S, \tag{2.58} \]
\[ \Pi \Gamma^i = \Gamma^i \Pi = \Pi S = S \Pi = 0. \tag{2.59} \]

Now we are going to determine under which conditions the boundary-value problem is strongly elliptic [1]. First of all, the leading symbol of the operator F should be elliptic in the interior of M. The leading symbol of the operator F reads

\[ \sigma_L(F; x, \xi) = |\xi|^2 \cdot I \equiv g^{\mu\nu}(x) \xi_\mu \xi_\nu \cdot I, \tag{2.60} \]

where $\xi \in T^*M$ is a cotangent vector. Of course, for a positive-definite non-singular metric the leading symbol is non-degenerate for ξ ≠ 0. Moreover, for a complex λ which does not lie on the positive real axis, $\lambda \in \mathbb{C} - \mathbb{R}_+$ ($\mathbb{R}_+$ being the set of positive numbers),

\[ \det(\sigma_L(F; x, \xi) - \lambda) = (|\xi|^2 - \lambda)^{\dim V} \neq 0. \tag{2.61} \]

This equals zero only for ξ = λ = 0. Thus, the leading symbol of the operator F is elliptic.

Second, the so-called strong ellipticity condition should be satisfied [1, 10]. As we already noted above, there is a natural grading in the vector bundles $W_F$ and $W'_F$ which reflects simply the number of normal derivatives of a section of the bundle [1]. The boundary operator $B_F$ (2.25) is said to have the graded order 0. Its graded leading symbol is defined by [1, 10]

\[ \sigma_g(B_F) \equiv \begin{pmatrix} \Pi & 0 \\ iT & (I - \Pi) \end{pmatrix}, \tag{2.62} \]

where, by virtue of (2.57),

\[ T = -i \sigma_L(\Lambda) = \Gamma^j \zeta_j, \tag{2.63} \]

$\zeta \in T^*\partial M$ being a cotangent vector on the boundary. By virtue of (2.58) the matrix T is anti-self-adjoint,

\[ \bar T = -T, \tag{2.64} \]

and satisfies the conditions

\[ \Pi T = T \Pi = 0. \tag{2.65} \]

To define the strong ellipticity condition we take the leading symbol $\sigma_L(F; \hat x, r, \zeta, \omega)$ of the operator F, substitute r = 0 and $\omega \to -i\partial_r$ and consider the following ordinary differential equation for a $\varphi \in C^\infty(V, \partial M \times \mathbb{R}_+)$:

\[ \left[ \sigma_L(F; \hat x, 0, \zeta, -i\partial_r) - \lambda \cdot I \right] \varphi(r) = 0, \tag{2.66} \]

with an asymptotic condition

\[ \lim_{r \to \infty} \varphi(r) = 0, \tag{2.67} \]

where $\zeta \in T^*\partial M$, $\lambda \in \mathbb{C} - \mathbb{R}_+$ is a complex number which does not lie on the positive real axis, and (ζ, λ) ≠ (0, 0). The boundary-value problem (F, B) is said to be strongly elliptic [10, p.415] with respect to the cone $\mathbb{C} - \mathbb{R}_+$ if for every $\zeta \in T^*\partial M$, $\lambda \in \mathbb{C} - \mathbb{R}_+$, (ζ, λ) ≠ (0, 0), and $\psi'_F \in C^\infty(W', \partial M)$ there is a unique solution ϕ to Eq. (2.66) subject to the condition (2.67) and satisfying

\[ \sigma_g(B_F)(\hat x, \zeta)\, \psi_F(\varphi) = \psi'_F, \tag{2.68} \]

with $\psi_F(\varphi) \in C^\infty(W, \partial M)$ being the boundary data defined by (2.8) and (2.9).

A purely algebraic formulation of strong ellipticity can also be given, following Gilkey and Smith [10]. For this purpose, let us denote by $W_F^\pm(\zeta, \lambda)$ the subsets of $W_F$ corresponding to boundary data of solutions of Eq. (2.66) vanishing as r → ±∞. Decompose $\sigma_L(F; \hat x, 0, \zeta, \omega) = p_0 \omega^2 + p_1 \omega + p_2$, where $p_j$ is homogeneous of order j in ζ. Then the differential equation (2.66) can be rewritten in the form of a first-order system

\[ \left[ -i\, \partial_r I + \tau(\zeta, \lambda) \right] \begin{pmatrix} \varphi \\ \varphi_1 \end{pmatrix} = 0, \tag{2.69} \]

where

\[ \tau = i \begin{pmatrix} 0 & -1 \\ p_0^{-1}(p_2 - \lambda) & p_0^{-1} p_1 \end{pmatrix}. \tag{2.70} \]

Herefrom one sees that τ does not have any purely-imaginary eigenvalues for (ζ, λ) ≠ (0, 0) if (F, B) is strongly elliptic. It is then possible to re-express the strong ellipticity condition by saying that

\[ \sigma_g(B_F)(\hat x, \zeta) : W_F^+(\zeta, \lambda) \to W'_F \tag{2.71} \]

should be an isomorphism, for (ζ, λ) ≠ (0, 0), $\zeta \in T^*(\partial M)$, $\lambda \in \mathbb{C} - \mathbb{R}_+$ [10, p.416].

For a Laplace-type operator the Eq. (2.66) takes the form

\[ \left[ -\partial_r^2 + |\zeta|^2 - \lambda \right] \varphi(r) = 0, \tag{2.72} \]

where $|\zeta|^2 \equiv \hat g^{ij}(\hat x) \zeta_i \zeta_j$. The general solution satisfying the decay condition at infinity, r → ∞, reads

\[ \varphi(r) = \chi \exp(-\mu r), \tag{2.73} \]

where $\mu \equiv \sqrt{|\zeta|^2 - \lambda}$. Since (ζ, λ) ≠ (0, 0) and $\lambda \in \mathbb{C} - \mathbb{R}_+$, one can always choose Re µ > 0. The boundary data are now

\[ \psi_F(\varphi) = \begin{pmatrix} \chi \\ -\mu\chi \end{pmatrix}. \tag{2.74} \]

Thus, the question of strong ellipticity for Laplace-type operators is reduced to the invertibility of the equations

\[ \begin{pmatrix} \Pi & 0 \\ iT & (I - \Pi) \end{pmatrix} \begin{pmatrix} \chi \\ -\mu\chi \end{pmatrix} = \begin{pmatrix} \psi'_0 \\ \psi'_1 \end{pmatrix} \tag{2.75} \]

for arbitrary $\psi'_0 \in C^\infty(W'_0, \partial M)$, $\psi'_1 \in C^\infty(W'_1, \partial M)$. This is obviously equivalent to the algebraic criterion (2.71). Equation (2.75) can be transformed into (cf. [9])

\[ \Pi \chi = \psi'_0, \tag{2.76} \]
\[ (\mu I - iT)\chi = \mu \psi'_0 - \psi'_1. \tag{2.77} \]
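To see concretely what unique solvability of the reduced system (2.76)-(2.77) requires, it can be solved by hand in a toy case. The sketch below assumes, purely for illustration, dim V = 2 with E the identity, Π = diag(1, 0), and a single tangential direction with Γ^1 = i a (I − Π) for a real constant a, so that Γ^1 is anti-self-adjoint and annihilated by Π on both sides. Then iT = −aζ (I − Π). The parameter a and all function names are hypothetical and not part of the paper. In this model the system stops being uniquely solvable exactly when µ coincides with the eigenvalue −aζ of iT.

```python
import cmath

def mu_of(zeta, lam):
    """mu = sqrt(zeta^2 - lam) on the principal branch, so Re mu >= 0."""
    return cmath.sqrt(zeta * zeta - lam)

def solve_boundary_system(a, zeta, lam, p, q):
    """Solve (2.76)-(2.77) for chi = (chi_a, chi_b) in the toy model.

    The data are psi'_0 = (p, 0) and psi'_1 = (0, q), which encodes
    (I - Pi) psi'_0 = Pi psi'_1 = 0 for Pi = diag(1, 0).
    """
    mu = mu_of(zeta, lam)
    h = -a * zeta                  # eigenvalue of iT on the range of I - Pi
    if abs(mu - h) < 1e-12:
        raise ZeroDivisionError("mu*I - iT is degenerate")
    chi_a = p                      # first component of (2.77): mu*chi_a = mu*p
    chi_b = -q / (mu - h)          # second component: (mu - h)*chi_b = -q
    return chi_a, chi_b

def residual_of_2_75(a, zeta, lam, p, q):
    """Insert chi back into the graded-symbol equation (2.75)."""
    mu = mu_of(zeta, lam)
    h = -a * zeta
    chi_a, chi_b = solve_boundary_system(a, zeta, lam, p, q)
    first_row = chi_a - p                      # Pi*chi - psi'_0
    second_row = h * chi_b - mu * chi_b - q    # iT*chi - mu*(I - Pi)*chi - psi'_1
    return max(abs(first_row), abs(second_row))

def criterion_holds(a):
    """Check |zeta|*I - iT > 0 at zeta = +1 and -1 (enough by homogeneity)."""
    return all(abs(z) + a * z > 0 for z in (1.0, -1.0))

print(residual_of_2_75(0.5, 1.0, -2.0 + 1.0j, 1.0, 2.0))
print(criterion_holds(0.5), criterion_holds(2.0))
```

In this toy model the positivity check succeeds precisely for |a| < 1. For a = 2, ζ = −1 and λ = −3 one has µ = 2 = −aζ, so the solver raises: a concrete point of the cone where unique solvability is lost.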

Remember that $\psi'_F = P_F \tilde\psi_F$ with some $\tilde\psi_F \in C^\infty(W_F, \partial M)$. Therefore, $(I - \Pi)\psi'_0 = \Pi \psi'_1 = 0$ and the first equation follows from the second one. Therefore, if Eq. (2.77) has a unique solution for any $\psi'_0$ and $\psi'_1$, then the boundary-value problem is strongly elliptic. A necessary and sufficient condition to achieve this is expressed by the non-degeneracy of the matrix [µI − iT], i.e.

\[ \det[\mu I - iT] \neq 0, \tag{2.78} \]

for any (ζ, λ) ≠ (0, 0) and $\lambda \in \mathbb{C} - \mathbb{R}_+$. In this case the solution of Eq. (2.77) reads

\[ \chi = (\mu I - iT)^{-1} \left[ \mu \psi'_0 - \psi'_1 \right]. \tag{2.79} \]

Since the matrix iT is self-adjoint, it has only real eigenvalues; in other words the eigenvalues of $T^2$ are real and non-positive, $T^2 \le 0$. It is clear that, for any non-real $\lambda \in \mathbb{C} - \mathbb{R}$, $\mu = \sqrt{|\zeta|^2 - \lambda}$ is complex and, therefore, the matrix [µI − iT] is non-degenerate. For real negative λ, µ is real and we have µ > |ζ|. Thus, the condition (2.78) means that the matrix |ζ|I − iT is positive-definite,

\[ |\zeta| I - iT > 0. \tag{2.80} \]

A sufficient condition for that reads

\[ |\zeta|^2 I + T^2 > 0. \tag{2.81} \]

On the other hand, there holds, of course, $|\zeta|^2 I + T^2 \le |\zeta|^2$. Equation (2.81) means that the absolute values of all eigenvalues of the matrix (iT), both positive and negative, are smaller than |ζ|, whereas (2.80) means that only the positive eigenvalues are smaller than |ζ|, but says nothing about the negative ones. A similar inequality has been derived in [11], but in that case the boundary operator does not include the effect of Π, following [3, 12]. This proves the following theorem:

Theorem 3. Let F be a Laplace-type operator defined by (2.3), and $B_F$ the generalized boundary operator given by (2.25) with the operator Λ defined by (2.57). Let $\zeta \in T^*\partial M$ be a cotangent vector on the boundary and $T \equiv \Gamma^j \zeta_j$. The boundary-value problem $(F, B_F)$ is strongly elliptic with respect to $\mathbb{C} - \mathbb{R}_+$ if and only if for any ζ ≠ 0 the matrix |ζ|I − iT is positive-definite, i.e. |ζ|I − iT > 0.

2.3.2. Dirac-type operator. The question of strong ellipticity for boundary-value problems involving Dirac-type operators can also be studied. As is well known, the leading symbol of a Dirac-type operator reads

\[ \sigma_L(D; \xi) = -\Gamma(\xi). \tag{2.82} \]

Since the square of a Dirac-type operator is a Laplace-type operator, it is clear that the leading symbol $\sigma_L(D; x, \xi)$ is non-degenerate in the interior of M for ξ ≠ 0. Moreover, for a complex λ one finds

\[ \left[ \sigma_L(D; \xi) - \lambda I \right]^{-1} = \left[ \sigma_L(D; \xi) + \lambda I \right] \frac{1}{|\xi|^2 - \lambda^2}. \tag{2.83} \]

Therefore, $[\sigma_L(D; \xi) - \lambda I]$ is non-degenerate when $|\xi|^2 - \lambda^2 \neq 0$. But this vanishes only for Im λ = 0 and Re λ = ±|ξ| and arbitrary ξ. Thus, for (ξ, λ) ≠ (0, 0) and $\lambda \in \mathbb{C} - \mathbb{R}_+ - \mathbb{R}_-$, $[\sigma_L(D; \xi) - \lambda I]$ is non-degenerate.

The boundary-value problem for a Dirac-type operator D is strongly elliptic if there exists a unique solution of the equation

\[ \left[ i\Gamma(N)\partial_r - \Gamma(\zeta) - \lambda I \right] \varphi(r) = 0, \tag{2.84} \]

for any (ζ, λ) ≠ (0, 0), $\lambda \notin \mathbb{R}_+ \cup \mathbb{R}_-$, such that

\[ \lim_{r \to \infty} \varphi(r) = 0, \tag{2.85} \]
\[ P_D \psi_D(\varphi) = \psi'_D, \tag{2.86} \]

for any $\psi'_D \in W'_D$. The general solution of (2.84) satisfying the decay condition at infinity reads

\[ \varphi(r) = \chi \exp(-\mu r), \tag{2.87} \]

where now $\mu \equiv \sqrt{|\zeta|^2 - \lambda^2}$. Again, since (ζ, λ) ≠ (0, 0), $\lambda \in \mathbb{C} - \mathbb{R}_+ - \mathbb{R}_-$, the root can be always defined by Re µ > 0. The constant prefactor χ should satisfy the equation

\[ (X - \lambda I)\chi = 0, \tag{2.88} \]

where

\[ X \equiv \sigma_L(D; i\mu N, \zeta) = -i\mu \Gamma(N) - \Gamma(\zeta), \tag{2.89} \]

and the boundary condition

\[ P_D \chi = \psi'_D, \tag{2.90} \]

with some $\psi'_D = P_D \tilde\psi_0$, $\tilde\psi_0 \in W_0$. Equations (2.88), (2.90) are reduced to

\[ \chi = \psi'_D + \chi_-, \tag{2.91} \]
\[ \left[ X + \lambda(2P_D - I) \right] \chi_- = -(X - \lambda I)\psi'_D. \tag{2.92} \]

Note that, since $P_D \chi_-$ vanishes, the coefficient of $P_D$ in Eq. (2.92) is arbitrary, and is set equal to 2 for convenience. Thus, the question of strong ellipticity for a Dirac-type operator is reduced to the matrix $[X + \lambda(2P_D - I)]$ being non-degenerate,

\[ \det\left[ X + \lambda(2P_D - I) \right] \neq 0, \tag{2.93} \]

for any (ζ, λ) ≠ (0, 0), $\lambda \in \mathbb{C} - \mathbb{R}_+ - \mathbb{R}_-$. If this is satisfied, then the solution reads

\[ \chi = 2\lambda \left[ X + \lambda(2P_D - I) \right]^{-1} \psi'_D. \tag{2.94} \]

Let us set $(2P_D - I) \equiv \eta$ and let us compute the square of the matrix [X + λη]. For the boundary projectors defined by (2.54)–(2.56) the matrix η is either η = Γ(N), η = Γ(u)/|u| with some $u \in T^*\partial M$ or (in even dimension m only) is exactly the chirality operator η = C (see (2.48)–(2.50)). By using the properties (2.5) of the Clifford algebra and those of the chirality operator (see (2.51)–(2.53)), we compute

\[ [X + \lambda C]^2 = 2\lambda^2 I, \tag{2.95} \]

and

\[ \left[ X + \lambda \frac{1}{|u|} \Gamma(u) \right]^2 = 2\lambda \left( \lambda - \frac{g(\zeta, u)}{|u|} \right) I. \tag{2.96} \]
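The squares of X + λη for the different boundary projectors can be verified numerically in the smallest even-dimensional case. The sketch below assumes, for illustration only, m = 2 with self-adjoint Pauli matrices as Clifford generators (Γ(e^1) = σ1, Γ(e^2) = σ2), the inward normal N along e^2, and ζ and u along e^1. With these choices the chirality operator of (2.51) is C = iσ1σ2 = −σ3. The case η = Γ(N), which the text treats next, is included for comparison. All function names are hypothetical.

```python
import cmath

# Pauli matrices, used as an assumed representation of the Clifford algebra (2.5).
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

def lin(c1, a, c2, b):
    """Return the 2x2 matrix c1*a + c2*b."""
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(2)] for i in range(2)]

def square(a):
    """Matrix square of a 2x2 matrix."""
    return [[sum(a[i][k] * a[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def symbol_x(zeta, lam):
    """X = -i*mu*Gamma(N) - Gamma(zeta) of (2.89), with mu = sqrt(zeta^2 - lam^2)."""
    mu = cmath.sqrt(zeta * zeta - lam * lam)
    return lin(-1j * mu, S2, -zeta, S1), mu

def deviation_from_scalar(mat, c):
    """Largest entry of |mat - c*I|."""
    return max(abs(mat[i][j] - (c if i == j else 0))
               for i in range(2) for j in range(2))

def check_squares(zeta, lam, s):
    """Deviations of [X + lam*eta]^2 from the scalar right-hand sides.

    s is the single component of the tangential covector u = s*e^1 (s != 0).
    """
    x, mu = symbol_x(zeta, lam)
    sign = s / abs(s)
    chirality = lin(-1, S3, 0, S3)       # C = i*S1*S2 = -S3 for m = 2
    eta_u = lin(sign, S1, 0, S1)         # Gamma(u)/|u|
    dev_c = deviation_from_scalar(square(lin(1, x, lam, chirality)),
                                  2 * lam * lam)
    dev_u = deviation_from_scalar(square(lin(1, x, lam, eta_u)),
                                  2 * lam * (lam - zeta * sign))
    dev_n = deviation_from_scalar(square(lin(1, x, lam, S2)),
                                  2 * lam * (lam - 1j * mu))
    return dev_c, dev_u, dev_n

print(check_squares(0.7, 0.3 + 0.4j, -2.0))
```

The three returned numbers are the residuals of the identities for η = C, η = Γ(u)/|u| and η = Γ(N). They should be at rounding level for any non-zero real s and complex λ, since only the relation µ² = |ζ|² − λ² enters.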

We see that, in both cases, the matrix $[X + \lambda(2P_D - I)]$ is non-degenerate for any (ζ, λ) ≠ (0, 0) and any non-vanishing λ that does not lie on the real axis. Further, we find also

\[ [X + \lambda \Gamma(N)]^2 = 2\lambda(\lambda - i\mu) I. \tag{2.97} \]

Bearing in mind that $\mu \equiv \sqrt{|\zeta|^2 - \lambda^2}$, we see that this does not vanish for any ζ ≠ 0. However, for ζ = 0 and arbitrary λ, with Im λ > 0, this equals zero. Thus, the boundary projector $P_D = (I \pm \Gamma(N))/2$ does not lead to strong ellipticity, which only holds for Im λ < 0. Thus, we have proven

Theorem 4. The boundary-value problem $(D, B_D)$, where D is a Dirac-type operator and $B_D$ is a projector taking the forms (2.56), or (2.55), or (2.54), is strongly elliptic with respect to $\mathbb{C} - \{0\}$, $\mathbb{C} - \mathbb{R}$ and the lower half-plane Im λ < 0, respectively.

3. Resolvent and Heat Kernel

Let w be a sufficiently large negative constant and $\lambda \in \mathbb{C}$, Re λ < w, be a complex number with a sufficiently large negative real part. Then the resolvent $G(\lambda) = (F - \lambda I)^{-1} : L^2(V, M) \to L^2(V, M)$ of the strongly elliptic boundary-value problem (F, B) is well defined. The resolvent kernel is a section of the tensor product of the vector bundles V and $V^*$ over the tensor-product manifold M × M, defined by the equation

\[ (F - \lambda I) G(\lambda|x, y) = \delta(x, y) \tag{3.1} \]

with the boundary conditions

\[ B_F \psi[G(\lambda|x, y)] = 0, \tag{3.2} \]

where δ(x, y) is the covariant Dirac distribution, which is nothing but the kernel of the identity operator. Hereafter all differential operators as well as the boundary data map act on the first argument of the resolvent kernel (and the heat kernel), unless otherwise stated. This equation, together with the condition

\[ G(\lambda|x, y) = G(\lambda^*|y, x), \tag{3.3} \]

which follows from the self-adjointness of the operator F, completely determine the resolvent kernel.

Similarly, for t > 0 the heat semi-group operator $U(t) = \exp(-tF) : L^2(V, M) \to L^2(V, M)$ is well defined. The kernel of this operator, called heat kernel, is defined by the equation

\[ (\partial_t + F) U(t|x, y) = 0 \tag{3.4} \]

with the initial condition

\[ U(0|x, y) = \delta(x, y), \tag{3.5} \]

the boundary condition

\[ B_F \psi[U(t|x, y)] = 0, \tag{3.6} \]

and the self-adjointness condition

\[ U(t|x, y) = U(t|y, x). \tag{3.7} \]

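As is well known [1], the heat kernel and the resolvent kernel are related by the Laplace transform:

\[ G(\lambda|x, y) = \int_0^\infty dt\, e^{t\lambda}\, U(t|x, y), \tag{3.8} \]
\[ U(t|x, y) = \frac{1}{2\pi i} \int_{w - i\infty}^{w + i\infty} d\lambda\, e^{-t\lambda}\, G(\lambda|x, y). \tag{3.9} \]

Also, it is well known [1] that the heat kernel U(t|x, y) is a smooth function near the diagonal of M × M and has a well defined diagonal value U(t|x, x), and the functional trace

\[ \mathrm{Tr}_{L^2} \exp(-tF) = \int_M d\,\mathrm{vol}(x)\, \mathrm{tr}_V\, U(t|x, x). \tag{3.10} \]

Moreover, the functional trace has an asymptotic expansion as $t \to 0^+$,

\[ \mathrm{Tr}_{L^2} \exp(-tF) \sim (4\pi t)^{-m/2} \sum_{k \ge 0} t^{k/2} A_{k/2}(F, B_F), \tag{3.11} \]

and the corresponding expansion of the functional trace for the resolvent as λ → −∞ reads

\[ \mathrm{Tr}_{L^2} \left( \frac{\partial}{\partial\lambda} \right)^n G(\lambda) \sim (4\pi)^{-m/2} \sum_{k \ge 0} \Gamma[(k - m)/2 + n + 1]\, (-\lambda)^{(m - k)/2 - n - 1} A_{k/2}(F, B_F), \tag{3.12} \]

for n ≥ m/2, where Γ[·] here denotes the Euler gamma function. Here $A_{k/2}(F, B_F)$ are the famous so-called (global) heat-kernel coefficients (sometimes called also Minakshisundaram-Plejel or Seeley coefficients). The zeroth-order coefficient is very well known:

\[ A_0 = \int_M d\,\mathrm{vol}(x)\, \mathrm{tr}_V\, I = \mathrm{vol}(M) \cdot \dim(V). \tag{3.13} \]

It is independent of the operator F and of the boundary conditions. The higher order coefficients have the following general form:

\[ A_{k/2}(F, B_F) = \int_M d\,\mathrm{vol}(x)\, \mathrm{tr}_V\, a_{k/2}(F|x) + \int_{\partial M} d\,\mathrm{vol}(\hat x)\, \mathrm{tr}_V\, b_{k/2}(F, B_F|\hat x), \tag{3.14} \]

where $a_{k/2}$ and $b_{k/2}$ are the (local) interior and boundary heat-kernel coefficients. The interior coefficients do not depend on the boundary conditions $B_F$. It is well known that the interior coefficients of half-integer order, $a_{k+1/2}$, vanish [1]. The integer order coefficients $a_k$ are calculated for Laplace-type operators up to $a_4$ [13]. The boundary coefficients $b_{k/2}(F, B_F)$ do depend on both the operator F and the boundary operator $B_F$. They are far more complicated because in addition to the geometry of the manifold M they depend essentially on the geometry of the boundary ∂M. For Laplace-type operators they are known for the usual boundary conditions (Dirichlet, Neumann, or mixed version of them) up to $b_{5/2}$ [2, 14]. For generalized boundary conditions including tangential derivatives, to the best of our knowledge, they are not known at all. Only some special cases have been studied in the literature [3, 11, 12].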

where ak/2 and bk/2 are the (local) interior and boundary heat-kernel coefficients. The interior coefficients do not depend on the boundary conditions BF . It is well known that the interior coefficients of half-integer order, ak+1/2 , vanish [1]. The integer order coefficients ak are calculated for Laplace-type operators up to a4 [13]. The boundary coefficients bk/2 (F, BF ) do depend on both the operator F and the boundary operator BF . They are far more complicated because in addition to the geometry of the manifold M they depend essentially on the geometry of the boundary ∂M . For Laplace-type operators they are known for the usual boundary conditions (Dirichlet, Neumann, or mixed version of them) up to b5/2 [2, 14]. For generalized boundary conditions including tangential derivatives, to the best of our knowledge, they are not known at all. Only some special cases have been studied in the literature [3, 11, 12]. We are going to calculate below the next-to-leading coefficient A1/2 (F, BF ) for the generalized boundary conditions. To do this, and also to study the role of the ellipticity condition, we will construct an approximation to the heat kernel U (t|x, y) near the diagonal, i.e. for x close to y and for t → 0+ . Since the heat kernel and resolvent kernel are connected by the Laplace transform, this is equivalent to studying an approximation to the resolvent kernel G(λ|x, y) near the diagonal and for large negative λ → −∞ (this leads, in turn, to an approximate inverse of F − λI, called a parametrix). Let us stress here that we are not going to provide a rigorous construction of the resolvent with all the estimates, which, for the boundary-value problem, is a task that would require a separate paper. For a complete and mathematically rigorous exposition the reader is referred to the classical papers [10, 15, 16, 17, 18, 19]. 
Here we instead keep to a pragmatic approach: we briefly describe how the approximate resolvent kernel for λ → −∞, and hence the heat kernel for t → 0+, can be constructed, and then calculate both kernels in the leading approximation. This will allow us to compute the heat-kernel coefficient A1/2. First of all, we decompose both kernels into two parts,
\[ G(\lambda|x,y) = G_\infty(\lambda|x,y) + G_B(\lambda|x,y), \tag{3.15} \]
\[ U(t|x,y) = U_\infty(t|x,y) + U_B(t|x,y). \tag{3.16} \]


Then we construct different approximations for G∞ and GB and, analogously, for U∞ and UB. The first parts G∞(λ|x, y) and U∞(t|x, y) are approximated by the usual asymptotic expansion of the resolvent and the heat kernel in the case of compact manifolds without boundary when x → y, λ → −∞ and t → 0+. This means that effectively one introduces a small expansion parameter ε reflecting the fact that the points x and y are close to each other, the parameter t is small and the parameter λ is negative and large. This can be done by fixing a point x0, choosing the normal coordinates at this point (with gµν(x0) = δµν) and scaling
\[ x \to x_0 + \varepsilon(x - x_0), \quad y \to x_0 + \varepsilon(y - x_0), \quad t \to \varepsilon^2 t, \quad \lambda \to \varepsilon^{-2}\lambda \tag{3.17} \]
and expanding into an asymptotic series in ε. If one uses the Fourier transform, then the corresponding momenta ξ ∈ T*M are large and scale according to
\[ \xi \to \varepsilon^{-1}\xi. \tag{3.18} \]

This construction is standard [1] and we do not repeat it here. One can also use a completely covariant method [13, 20]. Probably the most convenient formula for the asymptotics as t → 0+, among many equivalent ones, is [13, 20]
\[ U_\infty(t|x,y) \sim (4\pi t)^{-m/2}\,\Delta^{1/2}\exp\left(-\frac{\sigma}{2t}\right)\sum_{k\ge 0} t^k a_k, \tag{3.19} \]
where σ = σ(x, y) = r²(x, y)/2 is one half the square of the geodesic distance between x and y, $\Delta = \Delta(x,y) = g^{-1/2}(x)\,g^{-1/2}(y)\det(-\partial^x_\mu\partial^y_\nu\sigma(x,y))$ is the corresponding Van Vleck–Morette determinant, g = det gµν, and ak = ak(x, y) are the off-diagonal heat-kernel coefficients. These coefficients satisfy certain differential recursion relations which can be solved in the form of a covariant Taylor series near the diagonal [13]. On the diagonal the asymptotic expansion of the heat kernel reads
\[ U_\infty(t|x,x) \sim (4\pi t)^{-m/2}\sum_{k\ge 0} t^k a_k(x,x). \tag{3.20} \]

The explicit formulas for the diagonal values of ak are known up to k = 4 [13]. This asymptotic expansion can be integrated over the manifold M to get
\[ \int_M d\,\mathrm{vol}(x)\,\mathrm{tr}_V\, U_\infty(t|x,x) \sim (4\pi t)^{-m/2}\sum_{k\ge 0} t^k \int_M d\,\mathrm{vol}(x)\,\mathrm{tr}_V\, a_k(x,x). \tag{3.21} \]

Thus, integrating the diagonal of U∞ gives the interior terms in the heat-kernel asymptotics. The asymptotic expansion of G∞(λ|x, y) for λ → −∞ is obtained from here by the Laplace transform (3.8)
\[ G_\infty(\lambda|x,y) \sim (4\pi)^{-m/2}\,\Delta^{1/2}\sum_{k\ge 0} 2\left(\frac{\sigma}{-2\lambda}\right)^{(2k+2-m)/4} K_{k+1-m/2}\left(\sqrt{-2\lambda\sigma}\right) a_k, \tag{3.22} \]
where $K_\nu(z) = \pi/[2\sin(\nu\pi)]\,[I_{-\nu}(z) - I_\nu(z)]$ is the Macdonald function (the modified Bessel function of the third kind). It is singular on the diagonal x = y. However, for a sufficiently large n, n ≥ m/2, $\partial_\lambda^n G$ becomes regular at the diagonal, resulting in the asymptotic series as λ → −∞,
\[ \int_M d\,\mathrm{vol}(x)\,\mathrm{tr}_V \left(\frac{\partial}{\partial\lambda}\right)^{n} G_\infty(\lambda|x,y)\Big|_{x=y} \sim (4\pi)^{-m/2}\sum_{k\ge 0}\Gamma\left(k - \frac{m}{2} + n + 1\right)(-\lambda)^{\frac{m}{2}-k-n-1}\int_M d\,\mathrm{vol}(x)\,\mathrm{tr}_V\, a_k(x,x). \tag{3.23} \]

For a strongly elliptic boundary-value problem the diagonal of the boundary part UB(t|x, x) is exponentially small, i.e. of order ∼ exp(−r²(x)/t), where r(x) is the normal geodesic distance to the boundary, as t → 0+ if x ∉ ∂M. So, it does not contribute to the asymptotic expansion of the heat-kernel diagonal outside the boundary as t → 0+. Therefore, the asymptotic expansion of the total heat-kernel diagonal outside the boundary is determined only by U∞,
\[ U(t|x,x) \sim (4\pi t)^{-m/2}\sum_{k\ge 0} t^k a_k(x,x), \qquad x \notin \partial M. \tag{3.24} \]

The point is, the coefficients of the asymptotic expansion of the diagonal of the boundary part UB(t|x, x) as t → 0+ behave near the boundary like the one-dimensional Dirac distribution δ(r(x)) and its derivatives. Thus, the integral over the manifold M of the boundary part UB(t|x, x) has an asymptotic expansion as t → 0+ with non-vanishing coefficients in the form of integrals over the boundary. Therefore, it determines the local boundary contributions bk/2 to the heat-kernel coefficients Ak/2. It is well known that the coefficient A1/2 is a purely boundary contribution [1]. It is almost obvious that it can be evaluated by integrating the fibre trace of the boundary contribution UB of the heat kernel in the leading order.

Of course, the approximation G∞ constructed above is obtained without taking into account the boundary conditions. Therefore, G∞ satisfies approximately Eq. (3.1) but does not satisfy the boundary conditions (3.2). This means that the compensating term GB(λ|x, y) is defined by the equation
\[ (F - \lambda I)\,G_B(\lambda|x,y) = 0 \tag{3.25} \]
with the boundary condition
\[ B_F\,\psi\left[G_\infty(\lambda|x,y) + G_B(\lambda|x,y)\right] = 0. \tag{3.26} \]
Analogously, UB(t|x, y) is defined by
\[ (\partial_t I + F)\,U_B(t|x,y) = 0 \tag{3.27} \]
with the initial condition
\[ U_B(0|x,y) = 0, \tag{3.28} \]
and the boundary condition
\[ B_F\,\psi\left[U_\infty(t|x,y) + U_B(t|x,y)\right] = 0. \tag{3.29} \]

The most difficult problem is to find the compensating terms GB(λ|x, y) and UB(t|x, y). These functions are important only near the boundary where they behave like distributions when t → 0+ or λ → −∞. Since the points x and y are close to the boundary the coordinates r(x) and r(y) are small separately, hence not only the difference [r(x) − r(y)] but also the sum [r(x) + r(y)] is small. This means that we must additionally scale r(x) → εr(x), r(y) → εr(y). By contrast, the point x̂0 is kept fixed on the boundary, so the coordinates x̂0 do not scale at all: x̂0 → x̂0. Thus, we shall scale the coordinates x = (x̂, r(x)) and y = (ŷ, r(y)), the parameters t and λ and momenta ξ = (ζ, ω) ∈ T*M, with ζ ∈ T*∂M and ω ∈ T*R, according to
\[ \hat x \to \hat x_0 + \varepsilon(\hat x - \hat x_0), \quad \hat y \to \hat x_0 + \varepsilon(\hat y - \hat x_0), \quad r(x) \to \varepsilon r(x), \quad r(y) \to \varepsilon r(y), \tag{3.30} \]
\[ t \to \varepsilon^2 t, \quad \lambda \to \frac{1}{\varepsilon^2}\lambda, \quad \zeta \to \frac{1}{\varepsilon}\zeta, \quad \omega \to \frac{1}{\varepsilon}\omega. \tag{3.31} \]
The corresponding differential operators are scaled by
\[ \hat\partial \to \frac{1}{\varepsilon}\hat\partial, \quad \partial_r \to \frac{1}{\varepsilon}\partial_r, \quad \partial_t \to \frac{1}{\varepsilon^2}\partial_t, \quad \partial_\lambda \to \varepsilon^2\partial_\lambda. \tag{3.32} \]

We will call this transformation just scaling and denote the scaled objects by an index ε, e.g. $G_B^\varepsilon$. The scaling parameter ε will be considered as a small parameter in the theory and we will use it to expand everything in power series in ε. We will not take care about the convergence properties of these expansions and take them as formal power series. In fact, they are asymptotic expansions as ε → 0. At the very end of the calculations we set ε = 1. The non-scaled objects, i.e. those with ε = 1, will not have the index ε, e.g. $G_B^\varepsilon|_{\varepsilon=1} = G_B$. Another way of doing this is by saying that we will expand all quantities in the homogeneous functions of the coordinates (x̂ − x̂0), (ŷ − x̂0), r(x), r(y), the momenta ξ = (ζ, ω) and the parameters t, λ. First of all, we expand the scaled operator $F^\varepsilon$ in power series in ε,
\[ F^\varepsilon \sim \sum_{n\ge 0}\varepsilon^{n-2}F_n, \tag{3.33} \]
where Fn are second-order differential operators with homogeneous symbols. The boundary operator requires more careful handling. Since half of the boundary data (2.8) contain normal derivatives, formally ψ0 = ϕ|r=0 and ψ1 = ∂rϕ|r=0, (2.9), would be of different order in ε. To make them of the same order we have to assume an additional factor ε in all ψ1 ∈ C∞(W1, ∂M). Thus, we define the graded scaling of the boundary data map by
\[ \psi^\varepsilon(\varphi) = \begin{pmatrix} \psi_0^\varepsilon(\varphi) \\ \varepsilon\,\psi_1^\varepsilon(\varphi) \end{pmatrix} = \begin{pmatrix} \varphi(\hat x, r)|_{r=0} \\ \partial_r\varphi(\hat x, r)|_{r=0} \end{pmatrix} = \psi(\varphi), \tag{3.34} \]
so that the boundary data map ψ does not scale at all. This leads to an additional factor ε in the operator Λ determining the boundary operator BF (2.25). Thus, we define the graded scaling of the boundary operator by
\[ B_F^\varepsilon = \begin{pmatrix} \Pi^\varepsilon & 0 \\ \varepsilon\Lambda^\varepsilon & I - \Pi^\varepsilon \end{pmatrix}, \tag{3.35} \]
which has the following asymptotic expansion in ε:
\[ B_F^\varepsilon \sim \sum_{n\ge 0}\varepsilon^{n}B_{F(n)}, \tag{3.36} \]

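where BF(n) are first-order tangential operators with homogeneous symbols. Of course,
\[ F_0 = -\partial_r^2 - \hat\partial^2, \tag{3.37} \]
\[ B_{F(0)} = \begin{pmatrix} \Pi_0 & 0 \\ \Lambda_0 & I - \Pi_0 \end{pmatrix}, \tag{3.38} \]
where
\[ \hat\partial^2 = \hat g^{jk}(\hat x_0)\,\hat\partial_j\hat\partial_k, \quad \Lambda_0 = \Gamma^j(\hat x_0)\,\hat\partial_j, \quad \Pi_0 = \Pi(\hat x_0). \tag{3.39} \]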

Note that all leading-order operators F0, BF(0) and Λ0 have constant coefficients and, therefore, are very easy to handle. This procedure is sometimes called “freezing the coefficients of the differential operator”. The subsequent strategy is rather simple. Expand the scaled resolvent kernel in ε,
\[ G_\infty^\varepsilon \sim \sum_{n\ge 0}\varepsilon^{2-m+n}\,G_{\infty(n)}, \tag{3.40} \]
\[ G_B^\varepsilon \sim \sum_{n\ge 0}\varepsilon^{2-m+n}\,G_{B(n)}, \tag{3.41} \]

and substitute into the scaled version of Eq. (3.25) and the boundary condition (3.26). Then, by equating the like powers in ε one gets an infinite set of recursive equations which determine all GB(n). The zeroth-order term GB(0) is defined by
\[ (F_0 - \lambda I)\,G_{B(0)} = 0, \tag{3.42} \]
and the boundary conditions,
\[ B_{F(0)}\,\psi\left[G_{\infty(0)} + G_{B(0)}\right] = 0, \tag{3.43} \]
where F0 and BF(0) are defined by (3.37)–(3.39). The higher orders are determined from
\[ (F_0 - \lambda I)\,G_{B(k)} = -\sum_{n=1}^{k}F_n\,G_{B(k-n)}, \qquad k = 1, 2, \ldots, \tag{3.44} \]
and the corresponding boundary conditions,
\[ B_{F(0)}\,\psi\left[G_{\infty(k)} + G_{B(k)}\right] = -\sum_{n=1}^{k}B_{F(n)}\,\psi\left[G_{\infty(k-n)} + G_{B(k-n)}\right]. \tag{3.45} \]

The G∞(n) are obtained simply by expanding the scaled version of (3.22) in power series in ε. The operator F0 is a partial differential operator, but fortunately has constant coefficients. By using the Fourier transform in the boundary coordinates (x̂ − x̂0) it reduces to an ordinary differential operator of second order. This enables one to find its resolvent easily. We will do this in the next subsection. Before doing this, let us stress that the same procedure can be applied to get the boundary part UB(t|x, y) of the heat kernel.

3.1. Leading order resolvent.

In this subsection we determine the resolvent to leading order, i.e. G0 = G∞(0) + GB(0). As we already outlined above, we fix a point x̂0 ∈ ∂M on the boundary and the normal coordinates at this point (with ĝik(x̂0) = δik), take the tangent space T∂M and identify the manifold M with M0 ≡ T∂M × R+. By using the explicit form of the zeroth-order operators F0, B0 and Λ0 given by (3.37)–(3.39) we obtain the equation
\[ \left(-\partial_r^2 - \hat\partial^2 - \lambda\right)G_0(\lambda|x,y) = \delta(x - y), \tag{3.46} \]
and the boundary conditions
\[ \Pi_0\,G_0(\lambda|x,y)\Big|_{r(x)=0} = 0, \tag{3.47} \]
\[ (I - \Pi_0)\left(\partial_r I + i\Gamma_0^j\hat\partial_j\right)G_0(\lambda|x,y)\Big|_{r(x)=0} = 0, \tag{3.48} \]

where Π0 = Π(x̂0), $\Gamma_0^j = \Gamma^j(\hat x_0)$. Hereafter the differential operators always act on the first argument of a kernel. Moreover, for simplicity of notation, we will denote Π0 and $\Gamma_0^j$ just by Π and $\Gamma^j$ and omit the dependence of all geometric objects on x̂0. To leading order this cannot cause any misunderstanding. Furthermore, the resolvent kernel should be bounded,
\[ \lim_{r(x)\to\infty}G_0(\lambda|x,y) = \lim_{r(y)\to\infty}G_0(\lambda|x,y) = 0, \tag{3.49} \]
and self-adjoint,
\[ G_0(\lambda|x,y) = G_0(\lambda^*|y,x). \tag{3.50} \]

Since the above boundary-value problem has constant coefficients, by using the Fourier transform in (x̂ − ŷ),
\[ G_0(\lambda|x,y) = \int_{\mathbb{R}^{m-1}}\frac{d\zeta}{(2\pi)^{m-1}}\,e^{i\zeta\cdot(\hat x - \hat y)}\,\tilde G_0(\lambda|\zeta, r(x), r(y)), \tag{3.51} \]
where ζ ∈ T*∂M, $\zeta\cdot\hat x \equiv \zeta_j\hat x^j$, we obtain an ordinary differential equation
\[ \left(-\partial_r^2 + |\zeta|^2 - \lambda\right)\tilde G_0(\lambda|\zeta, r(x), r(y)) = \delta[\rho(x,y)], \tag{3.52} \]
where
\[ |\zeta|^2 = g^{ij}(\hat x_0)\,\zeta_i\zeta_j, \qquad \rho(x,y) = r(x) - r(y), \tag{3.53} \]
the boundary conditions
\[ \Pi\,\tilde G_0(\lambda|\zeta, 0, r(y)) = 0, \tag{3.54} \]
\[ (I - \Pi)\left(\partial_r I + iT\right)\tilde G_0(\lambda|\zeta, r(x), r(y))\Big|_{r(x)=0} = 0, \tag{3.55} \]
and self-adjointness condition
\[ \tilde G_0(\lambda|\zeta, r(x), r(y)) = \tilde G_0(\lambda^*|\zeta, r(y), r(x)). \tag{3.56} \]
Here T is an anti-self-adjoint matrix
\[ T = T(\zeta) = \Gamma\cdot\zeta = \Gamma^j\zeta_j, \qquad \bar T = -T, \tag{3.57} \]
satisfying the conditions
\[ (I - \Pi)\,T = T\,(I - \Pi) = T. \tag{3.58} \]

The first “free” part of the solution of this problem, $\tilde G_{\infty(0)}$, which gives the resolvent kernel on the line R, is almost obvious. Since Eq. (3.52) has constant coefficients and there are no boundary conditions, its solution depends only on the difference ρ(x, y) = r(x) − r(y),
\[ \tilde G_{\infty(0)}(\lambda|\zeta, r(x), r(y)) = \frac{1}{2\mu}\left\{\theta[\rho(x,y)]\,e^{-\mu\rho(x,y)} + \theta[-\rho(x,y)]\,e^{\mu\rho(x,y)}\right\} = \frac{1}{2\mu}\,e^{-\mu|\rho(x,y)|}, \tag{3.59} \]
where
\[ \mu \equiv \sqrt{|\zeta|^2 - \lambda}, \qquad \mathrm{Re}\,\mu > 0, \tag{3.60} \]
and θ is the usual step function
\[ \theta(x) \equiv \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x < 0. \end{cases} \tag{3.61} \]
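As a quick numerical sanity check of (3.59), the sketch below verifies by finite differences that the free kernel solves the homogeneous equation away from ρ = 0 and that its first derivative jumps by −1 across ρ = 0, which is what the delta function in (3.52) requires. The function names, step sizes and sample values are illustrative assumptions, and μ is taken real and positive (that is, λ real with λ < |ζ|²).

```python
import math


def free_resolvent(rho, mu):
    """Free kernel of (3.59): exp(-mu*|rho|) / (2*mu), assuming real mu > 0."""
    return math.exp(-mu * abs(rho)) / (2.0 * mu)


def ode_residual(rho, mu, step=1e-4):
    """Finite-difference value of (-d^2/drho^2 + mu^2) G at a point with |rho| > step."""
    second = (free_resolvent(rho + step, mu)
              - 2.0 * free_resolvent(rho, mu)
              + free_resolvent(rho - step, mu)) / step ** 2
    return -second + mu ** 2 * free_resolvent(rho, mu)


def derivative_jump(mu, step=1e-6):
    """One-sided difference estimate of G'(0+) - G'(0-); expected to be close to -1."""
    right = (free_resolvent(step, mu) - free_resolvent(0.0, mu)) / step
    left = (free_resolvent(0.0, mu) - free_resolvent(-step, mu)) / step
    return right - left


if __name__ == "__main__":
    mu = 1.3  # hypothetical sample value of sqrt(|zeta|^2 - lambda)
    print("residual away from rho = 0:", ode_residual(0.7, mu))
    print("jump of dG/drho at rho = 0:", derivative_jump(mu))
```

The residual should vanish up to discretization error, while the jump should approach −1 as the step shrinks.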

The boundary part $\tilde G_{B(0)}$ takes into account the boundary conditions (3.54) and (3.55). On requiring regularity at infinity and self-adjointness of the kernel, we can write down the solution up to a constant self-adjoint matrix $h = \bar h$,
\[ \tilde G_{B(0)}(\lambda|\zeta, r(x), r(y)) = \frac{1}{2\mu}\,h(\mu;\zeta)\,e^{-\mu[r(x)+r(y)]}. \tag{3.62} \]
From Eqs. (3.59)–(3.62) we find
\[ \tilde G_0(\lambda|\zeta, 0, r(y)) = \frac{1}{2\mu}\,(I + h)\,e^{-\mu r(y)}, \qquad \partial_r\tilde G_0(\lambda|\zeta, r(x), r(y))\Big|_{r(x)=0} = \frac{1}{2}\,(I - h)\,e^{-\mu r(y)}. \tag{3.63} \]
Taking into account the conditions (3.57)–(3.58) we reduce the boundary conditions to the form
\[ \Pi\,(I + h) = 0, \tag{3.64} \]
\[ (I - \Pi)\,\mu\,(I - h) + iT\,(I + h) = 0. \tag{3.65} \]

Since h is self-adjoint and TΠ = ΠT = 0 it must have the form h = αΠ + (I − Π)β(T)(I − Π), where α is a constant and β is a matrix depending on T. Substituting this into (3.64) we immediately obtain α = −1. From the second boundary condition (3.65) we get
\[ (\mu I - iT)\,\beta = \mu I + iT. \tag{3.66} \]
Since the problem is assumed to be strongly elliptic, the matrix μI − iT is invertible for any (ζ, λ) ≠ (0, 0). Thus, we obtain the solution of the boundary conditions in the form
\[ h(\mu;\zeta) = -\Pi + (I - \Pi)(\mu I - iT)^{-1}(\mu I + iT)(I - \Pi) = I - 2\Pi + 2iT\,(\mu I - iT)^{-1}. \tag{3.67} \]

In a particular case of mixed Dirichlet and Neumann boundary conditions, when $\Gamma^j = 0$, this reduces to
\[ h = I - 2\Pi. \tag{3.68} \]
Note that the function h(μ, ζ) depends actually on the ratio ζ/μ, in other words, it satisfies the homogeneity relation
\[ h(s\mu, s\zeta) = h(\mu, \zeta). \tag{3.69} \]
More generally,
\[ \tilde G_0\left(\frac{1}{t}\lambda\,\Big|\,\frac{1}{\sqrt t}\zeta, \sqrt t\,r(x), \sqrt t\,r(y)\right) = \sqrt t\;\tilde G_0(\lambda|\zeta, r(x), r(y)). \tag{3.70} \]
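The equality of the two forms of h in (3.67), the reduction (3.68) and the homogeneity relation (3.69) can be checked on a minimal example. The sketch assumes a two-component fibre with Π = diag(1, 0) and a single tangential direction, so that T = Γ·ζ acts on the (I − Π) block as the number iτζ with τ real (this is what anti-self-adjointness allows for a 1×1 block); only the two diagonal entries of h are stored. The parameter `tau` and all function names are hypothetical, and μ + τζ ≠ 0 is assumed, as strong ellipticity guarantees.

```python
def t_eigenvalue(zeta, tau):
    """Value of T = Gamma.zeta on the (I - Pi) block: purely imaginary, i*tau*zeta."""
    return 1j * tau * zeta


def h_projected_form(mu, zeta, tau):
    """Diagonal of -Pi + (I - Pi)(mu - iT)^(-1)(mu + iT)(I - Pi), first form in (3.67)."""
    t = t_eigenvalue(zeta, tau)
    return (-1.0 + 0j, (mu + 1j * t) / (mu - 1j * t))


def h_compact_form(mu, zeta, tau):
    """Diagonal of I - 2 Pi + 2iT (mu - iT)^(-1), second form in (3.67)."""
    t = t_eigenvalue(zeta, tau)
    return (1.0 - 2.0 + 0j, 1.0 + 2j * t / (mu - 1j * t))


if __name__ == "__main__":
    print(h_projected_form(2.0, 1.0, 0.5))
    print(h_compact_form(2.0, 1.0, 0.5))
    print(h_compact_form(6.0, 3.0, 0.5))   # rescaled (mu, zeta): same h
    print(h_compact_form(2.0, 1.0, 0.0))   # tau = 0: h = I - 2 Pi
```

In this scalar block the nontrivial entry is simply (μ − τζ)/(μ + τζ), which makes the dependence on the ratio ζ/μ explicit.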

This holds for G∞(0) as well as for GB(0). As a function of λ, $\tilde G_{(0)}$ is a meromorphic function with a cut along the positive real axis from |ζ|² to ∞. The “free” part G∞(0) has no other singularities, whereas the boundary part GB(0) has simple poles at the eigenvalues of the matrix |ζ|²I + T². Remember that the strong ellipticity condition (2.80) requires these eigenvalues to be positive and smaller than |ζ|², i.e. 0 < |ζ|²I + T² < |ζ|².

3.2. Leading order heat kernel.

Using G0(λ|x, y), we can also get the zeroth-order heat kernel U0(t|x, y) by the inverse Laplace transform
\[ U_0(t|x,y) = \frac{1}{2\pi i}\int_{w-i\infty}^{w+i\infty}d\lambda\;e^{-t\lambda}\,G_0(\lambda|x,y) \tag{3.71} \]
\[ \phantom{U_0(t|x,y)} = \int_{\mathbb{R}^{m-1}}\frac{d\zeta}{(2\pi)^{m-1}}\int_{w-i\infty}^{w+i\infty}\frac{d\lambda}{2\pi i}\;e^{-t\lambda + i\zeta\cdot(\hat x - \hat y)}\,\tilde G_0(\lambda|\zeta, r(x), r(y)), \tag{3.72} \]
where w is a negative constant. Now, by scaling the integration variables λ → λ/t and ζ → ζ/√t and shifting the contour of integration over λ (w → w/t, which can be done because the integrand is analytic in the left half-plane of λ) and using the homogeneity property we obtain immediately
\[ U_0(t|x,y) = (4\pi t)^{-m/2}\int_{\mathbb{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\,e^{i\zeta\cdot(\hat x - \hat y)/\sqrt t}\int_{w-i\infty}^{w+i\infty}\frac{d\lambda}{i\sqrt\pi}\,e^{-\lambda}\,\tilde G_0\left(\lambda\,\Big|\,\zeta, \frac{r(x)}{\sqrt t}, \frac{r(y)}{\sqrt t}\right). \tag{3.73} \]
Next, let us change the variable λ according to
\[ \lambda \equiv |\zeta|^2 + \omega^2, \tag{3.74} \]

where ω = iμ, and, hence, Im ω > 0. In the upper half-plane, Im ω > 0, this change of variables is single valued and well defined. Under this change the cut in the complex plane λ along the positive real axis from |ζ|² to ∞, i.e. Im λ = 0, |ζ|² < Re λ < ∞, is mapped onto the whole real axis Im ω = 0, −∞ < Re ω < +∞. The interval Im λ = 0, 0 < Re λ < |ζ|² on the real axis of λ is mapped onto an interval Re ω = 0, 0 < Im ω < |ζ|, on the positive imaginary axis of ω. As a function of ω the resolvent $\tilde G_0$ is a meromorphic function in the upper half plane, Im ω > 0, with simple poles on the interval Re ω = 0, 0 < Im ω < |ζ|, on the imaginary axis. The contour of integration in the complex plane of ω is a hyperbola going from $(e^{i3\pi/4})\infty$ through the point $\omega = i\sqrt{|\zeta|^2 - w}$ to $(e^{i\pi/4})\infty$. It can be deformed to a contour C that comes from −∞ + iε, encircles the point ω = i|ζ| in the clockwise direction and goes to +∞ + iε, where ε is an infinitesimal positive parameter. The contour C does not cross the interval Re ω = 0, 0 < Im ω < |ζ|, on the imaginary axis and is above all the singularities of the resolvent. After such a transformation we obtain
\[ U_0(t|x,y) = (4\pi t)^{-m/2}\int_{\mathbb{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\,e^{-|\zeta|^2 + i\zeta\cdot(\hat x - \hat y)/\sqrt t}\int_C\frac{d\omega}{\sqrt\pi}\,e^{-\omega^2}\,2(-i\omega)\,\tilde G_0\left(|\zeta|^2 + \omega^2\,\Big|\,\zeta, \frac{r(x)}{\sqrt t}, \frac{r(y)}{\sqrt t}\right). \tag{3.75} \]

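Substituting here $\tilde G_0$ we find
\[ U_0(t|x,y) = (4\pi t)^{-m/2}\int_{\mathbb{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\,e^{-|\zeta|^2 + i\zeta\cdot(\hat x - \hat y)/\sqrt t}\int_C\frac{d\omega}{\sqrt\pi}\,e^{-\omega^2}\left\{I\,e^{i\omega|r(x)-r(y)|/\sqrt t} + h(-i\omega;\zeta)\,e^{i\omega|r(x)+r(y)|/\sqrt t}\right\}. \tag{3.76} \]
The first “free” part is easily obtained by computing the Gaussian integrals over ω and ζ. We get
\[ U_0(t|x,y) = (4\pi t)^{-m/2}\exp\left(-\frac{1}{4t}|x - y|^2\right)I + U_{B(0)}(t|x,y), \tag{3.77} \]
where
\[ U_{B(0)}(t|x,y) = (4\pi t)^{-m/2}\int_{\mathbb{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\,e^{-|\zeta|^2 + i\zeta\cdot(\hat x - \hat y)/\sqrt t}\int_C\frac{d\omega}{\sqrt\pi}\exp\left(-\omega^2 + i\omega\frac{|r(x)+r(y)|}{\sqrt t}\right)\left[(I - 2\Pi) - 2T\,(\omega I + T)^{-1}\right]. \tag{3.78} \]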

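Here, the part proportional to (I − 2Π) also contains only Gaussian integrals and is easily calculated. Thus (cf. [21])
\[ U_{B(0)}(t|x,y) = (4\pi t)^{-m/2}\left\{\exp\left(-\frac{1}{4t}|\hat x - \hat y|^2 - \frac{1}{4t}[r(x)+r(y)]^2\right)(I - 2\Pi) + \Omega(t|x,y)\right\}, \tag{3.79} \]
where
\[ \Omega(t|x,y) = -2\int_{\mathbb{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\int_C\frac{d\omega}{\sqrt\pi}\exp\left(-|\zeta|^2 + i\zeta\cdot\frac{(\hat x - \hat y)}{\sqrt t} - \omega^2 + i\omega\frac{[r(x)+r(y)]}{\sqrt t}\right)\Gamma\cdot\zeta\,(\omega I + \Gamma\cdot\zeta)^{-1}, \tag{3.80} \]
where we substituted $T = \Gamma\cdot\zeta \equiv \Gamma^j\zeta_j$. Herefrom we obtain easily the diagonal value of the heat kernel
\[ U_{(0)}(t|x,x) = (4\pi t)^{-m/2}\left\{I + \exp\left(-\frac{r^2(x)}{t}\right)(I - 2\Pi) + \Phi\left(r(x)/\sqrt t\right)\right\}, \tag{3.81} \]
where
\[ \Phi(z) = -2\int_{\mathbb{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\int_C\frac{d\omega}{\sqrt\pi}\,e^{-|\zeta|^2 - \omega^2 + 2i\omega z}\,\Gamma\cdot\zeta\,(\omega I + \Gamma\cdot\zeta)^{-1}. \tag{3.82} \]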

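Now, by using the representation (remember that Im (ωI + Γ·ζ) > 0)
\[ (\omega I + \Gamma\cdot\zeta)^{-1} = -2i\int_0^\infty dp\;e^{2ip(\omega I + \Gamma\cdot\zeta)}, \tag{3.83} \]
and computing a Gaussian integral over ω, we obtain
\[ \Phi(z) = 2\int_{\mathbb{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\int_0^\infty dp\;e^{-|\zeta|^2 - (p+z)^2}\,\frac{\partial}{\partial p}\,e^{2ip\,\Gamma\cdot\zeta}. \tag{3.84} \]
Integrating by parts over p we get
\[ \Phi(z) = -2e^{-z^2}I - 2\frac{\partial}{\partial z}\Psi(z), \tag{3.85} \]
where
\[ \Psi(z) = \int_{\mathbb{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\int_0^\infty dp\;e^{-|\zeta|^2 - (p+z)^2 + 2ip\,\Gamma\cdot\zeta}. \tag{3.86} \]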

It is not difficult to show that, as z → ∞, the functions Ψ(z) and Φ(z) are exponentially small:
\[ \Psi(z) \sim \frac{1}{2z}\,e^{-z^2}\left[I - \frac{1}{2z^2}(I + \Gamma^2) + O(z^{-4})\right], \tag{3.87} \]
\[ \Phi(z) \sim \frac{1}{z^2}\,e^{-z^2}\left[-\Gamma^2 + O(z^{-2})\right], \tag{3.88} \]
where $\Gamma^2 \equiv g_{ij}\Gamma^i\Gamma^j$. For z = 0, by using the change ζ → −ζ, we obtain
\[ \Psi(0) = \frac{\sqrt\pi}{2}\int_{\mathbb{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\,e^{-|\zeta|^2 - (\Gamma\cdot\zeta)^2}. \tag{3.89} \]
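To make (3.87) and (3.89) concrete, consider the simplest commuting situation: one tangential direction and Γ acting as a number, so that Γ² is a real number `gamma_sq` with 1 + Γ² > 0 (an assumption made only for this illustration). The Gaussian ζ-integral in (3.86) can then be done in closed form, which gives exp(z²)Ψ(z) = ∫₀^∞ dp exp(−2pz − (1 + Γ²)p²). The sketch below evaluates this integral with a composite Simpson rule and compares it with the z = 0 value implied by (3.89) and with the large-z form (3.87). The function names, the cut-off `upper` and the number of intervals are hypothetical choices, and the cut-off is only adequate when 1 + Γ² is not too small.

```python
import math


def psi_scaled(z, gamma_sq, upper=12.0, intervals=6000):
    """exp(z^2) * Psi(z) for scalar Gamma^2 = gamma_sq, via composite Simpson's rule."""
    a = 1.0 + gamma_sq
    if a <= 0.0:
        raise ValueError("need 1 + Gamma^2 > 0 for the integral to converge")
    if intervals % 2:
        raise ValueError("Simpson's rule needs an even number of intervals")
    step = upper / intervals
    total = 0.0
    for k in range(intervals + 1):
        p = k * step
        if k == 0 or k == intervals:
            weight = 1.0
        else:
            weight = 4.0 if k % 2 else 2.0
        total += weight * math.exp(-2.0 * p * z - a * p * p)
    return total * step / 3.0


def psi_scaled_at_zero(gamma_sq):
    """Closed form at z = 0 from (3.89): (sqrt(pi)/2) * (1 + Gamma^2)^(-1/2)."""
    return 0.5 * math.sqrt(math.pi / (1.0 + gamma_sq))


def psi_scaled_asymptotic(z, gamma_sq):
    """Leading terms of (3.87) with the factor exp(-z^2) removed."""
    return (1.0 - (1.0 + gamma_sq) / (2.0 * z * z)) / (2.0 * z)


if __name__ == "__main__":
    g2 = -0.25  # hypothetical value; Gamma anti-self-adjoint gives Gamma^2 <= 0
    print(psi_scaled(0.0, g2), psi_scaled_at_zero(g2))
    print(psi_scaled(6.0, g2), psi_scaled_asymptotic(6.0, g2))
```

The agreement at large z is only up to the neglected O(z⁻⁴) relative correction, so the second comparison is approximate by construction.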

Note that this integral converges when the strong ellipticity condition (Γ·ζ)² + |ζ|²I > 0 is satisfied. Now, we take the diagonal U(0)(t|x, x) given by (3.81), and integrate over the manifold M. Because the boundary part UB(0) is exponentially small as r(x) → ∞ we can in fact integrate it only over a narrow strip near the boundary, when 0 < r(x) < δ. The difference is asymptotically small as t → 0+. Doing the change of variables z = r/√t we reduce the integration to 0 < z < δ/√t. We see that as t → 0+ we can integrate over z from 0 to ∞. The error is asymptotically small as t → 0+ and does not contribute to the asymptotic expansion of the trace of the heat kernel. Thus, we obtain
\[ \mathrm{Tr}_{L^2}\,U_0(t) \sim (4\pi t)^{-m/2}\left[A_0 + \sqrt t\,A_{1/2} + \cdots\right], \tag{3.90} \]
where A0 is given by (3.13) and
\[ A_{1/2} = \int_{\partial M}d\,\mathrm{vol}(\hat x)\,\mathrm{tr}_V\,a_{1/2}, \tag{3.91} \]
with
\[ a_{1/2} = \frac{\sqrt\pi}{2}\left\{I - 2\Pi + \beta_{1/2}\right\}, \tag{3.92} \]
\[ \beta_{1/2} = \frac{2}{\sqrt\pi}\int_0^\infty dz\;\Phi(z). \tag{3.93} \]

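Now, using (3.85) and (3.89) and the fact that Ψ(∞) = 0 we get easily
\[ \beta_{1/2} = -2I + \frac{4}{\sqrt\pi}\Psi(0) = 2\int_{\mathbb{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\,e^{-|\zeta|^2}\left(e^{-A^{jk}\zeta_j\zeta_k} - I\right), \tag{3.94} \]
where
\[ A^{jk} \equiv \Gamma^{(j}\Gamma^{k)}. \tag{3.95} \]
Eventually,
\[ a_{1/2} = \frac{\sqrt\pi}{2}\left\{I - 2\Pi + 2\int_{\mathbb{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\,e^{-|\zeta|^2}\left(e^{-A^{jk}\zeta_j\zeta_k} - I\right)\right\}. \tag{3.96} \]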

Further calculations of a general nature, without knowing the algebraic properties of the matrices $A^{jk}$, seem to be impossible. One can, however, evaluate the integral in the form of an expansion in the matrices $A^{jk}$, or $\Gamma^i$. Using the Gaussian integrals
\[ \int_{\mathbb{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\,e^{-|\zeta|^2}\,\zeta_{i_1}\cdots\zeta_{i_{2n}} = \frac{(2n)!}{n!\,2^{2n}}\,\hat g_{(i_1 i_2}\cdots\hat g_{i_{2n-1} i_{2n})}, \tag{3.97} \]
we obtain
\[ \beta_{1/2} = 2\sum_{n\ge 1}(-1)^n\frac{(2n)!}{(n!)^2\,2^{2n}}\,\hat g_{i_1 i_2}\cdots\hat g_{i_{2n-1} i_{2n}}\,\Gamma^{(i_1}\cdots\Gamma^{i_{2n})}, \tag{3.98} \]
and, therefore,
\[ a_{1/2} = \frac{\sqrt\pi}{2}\left\{I - 2\Pi + 2\sum_{n\ge 1}(-1)^n\frac{(2n)!}{(n!)^2\,2^{2n}}\,\hat g_{i_1 i_2}\cdots\hat g_{i_{2n-1} i_{2n}}\,\Gamma^{(i_1}\cdots\Gamma^{i_{2n})}\right\}. \tag{3.99} \]

Since our main result (3.94) is rather complicated, we now consider a number of particular cases of physical relevance.

1. The simplest case is, of course, when the matrices $A^{ij}$ vanish, $A^{ij} = 0$. One then gets the familiar result for mixed boundary conditions [1, 2]
\[ \beta_{1/2} = 0, \qquad a_{1/2} = \frac{\sqrt\pi}{2}(I - 2\Pi). \tag{3.100} \]

2. The first non-trivial case is when the matrices $\Gamma^i$ form an Abelian algebra,
\[ [\Gamma^i, \Gamma^j] = 0. \tag{3.101} \]
One can then easily compute the integral (3.94) explicitly,
\[ \beta_{1/2} = 2\left[(I + \Gamma^2)^{-1/2} - I\right]. \tag{3.102} \]
Therefore
\[ a_{1/2} = \frac{\sqrt\pi}{2}\left\{I - 2\Pi + 2\left[(I + \Gamma^2)^{-1/2} - I\right]\right\}. \tag{3.103} \]
In the case Π = 0, this coincides with the result of ref. [3], where the authors considered the particular case of commuting $\Gamma^i$ matrices (without noting this explicitly).

3. Let us assume that the matrices $A^{jk} = \Gamma^{(j}\Gamma^{k)}$ form an Abelian algebra, i.e.
\[ [A^{jk}, A^{lm}] = 0. \tag{3.104} \]
The evaluation of the resulting integral over ζ yields
\[ \beta_{1/2} = 2\left[\left(\det{}_{T\partial M}[\delta^i{}_j + A^i{}_j]\right)^{-1/2} - I\right], \tag{3.105} \]
where the determinant $\det_{T\partial M}$ is taken over the indices in the tangent space to the boundary (we used det ĝ = 1). By virtue of (3.92) one gets the final result
\[ a_{1/2} = \frac{\sqrt\pi}{2}\left\{I - 2\Pi + 2\left[\left(\det{}_{T\partial M}[\delta^i{}_j + A^i{}_j]\right)^{-1/2} - I\right]\right\}. \tag{3.106} \]

4. A particular realization of the last case is when the matrices $\Gamma^i$ satisfy the Dirac-type condition
\[ \Gamma^i\Gamma^j + \Gamma^j\Gamma^i = 2\,\hat g^{ij}\,\frac{1}{(m-1)}\,\Gamma^2, \tag{3.107} \]
so that
\[ A^{ij} = \Gamma^{(i}\Gamma^{j)} = \frac{1}{(m-1)}\,\hat g^{ij}\,\Gamma^2. \tag{3.108} \]
Then, of course, the matrices $A^{ij}$ commute and the result is given by Eq. (3.106). But the determinant is now
\[ \det{}_{T\partial M}[\delta^i{}_j + A^i{}_j] = \left(I + \frac{1}{(m-1)}\Gamma^2\right)^{m-1}. \tag{3.109} \]
Thus,
\[ a_{1/2} = \frac{\sqrt\pi}{2}\left\{I - 2\Pi + 2\left[\left(I + \frac{1}{(m-1)}\Gamma^2\right)^{-(m-1)/2} - I\right]\right\}. \tag{3.110} \]
Note that this differs essentially from the result of ref. [3], and shows again that the result of ref. [3] applies actually only to the completely Abelian case, when all matrices $\Gamma^j$ commute. Note also that, in the most interesting applications (e.g. in quantum gravity), the matrices $\Gamma^i$ do not commute [12]. The result (3.94), however, is valid in the most general case.

5. A very important case is when the operator Λ is a natural operator on the boundary. Since it is of first order it can be only the generalized Dirac operator. In this case the matrices $\Gamma^j$ satisfy the condition
\[ \Gamma^i\Gamma^j + \Gamma^j\Gamma^i \equiv 2A^{ij} = -2\kappa\,\hat g^{ij}(I - \Pi), \tag{3.111} \]
where κ is a constant. Hence the matrices $A^{ij}$ obviously commute and we have the case considered above (see (3.106)). The determinant is easily calculated,
\[ \det{}_{T\partial M}[\delta^i{}_j + A^i{}_j] = \Pi + (I - \Pi)(1 - \kappa)^{m-1}, \tag{3.112} \]
and we eventually obtain
\[ a_{1/2} = \frac{\sqrt\pi}{2}\left\{-\Pi + (I - \Pi)\left[2(1 - \kappa)^{-(m-1)/2} - 1\right]\right\}. \tag{3.113} \]
Thus, a singularity is found at κ = 1. This happens because, for κ = 1, the strong ellipticity condition is violated (see also [11]). Indeed, the strong ellipticity condition (2.81),
\[ T^2 + |\zeta|^2 I = (\Gamma\cdot\zeta)^2 + |\zeta|^2 I = |\zeta|^2\left[\Pi + (1 - \kappa)(I - \Pi)\right] > 0, \tag{3.114} \]
implies in this case κ < 1 (cf. [11]).
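Cases 2 and 5 lend themselves to a quick numerical cross-check. When the $\Gamma^i$ commute, the contraction in (3.98) collapses to $(\Gamma^2)^n$, so the series should sum to the closed form (3.102) wherever it converges; the sketch below tests this with a real number `gamma_sq` standing in for Γ², assuming |Γ²| < 1. It also evaluates the two coefficients in (3.113), which shows the growth of the (I − Π) part as κ → 1. All names and sample values are illustrative assumptions rather than part of the original derivation.

```python
import math


def beta_half_series(gamma_sq, terms=80):
    """Partial sum of (3.98) for commuting Gamma^i, with Gamma^2 replaced by a number."""
    total = 0.0
    for n in range(1, terms + 1):
        coeff = math.factorial(2 * n) / (math.factorial(n) ** 2 * 4 ** n)
        total += (-1) ** n * coeff * gamma_sq ** n
    return 2.0 * total


def beta_half_closed(gamma_sq):
    """Closed form (3.102): 2 * [(1 + Gamma^2)^(-1/2) - 1]."""
    return 2.0 * ((1.0 + gamma_sq) ** -0.5 - 1.0)


def a_half_dirac_case(kappa, m):
    """Coefficients of Pi and of (I - Pi) in (3.113); only meaningful for kappa < 1."""
    if kappa >= 1.0:
        raise ValueError("strong ellipticity requires kappa < 1")
    prefactor = math.sqrt(math.pi) / 2.0
    bracket = 2.0 * (1.0 - kappa) ** (-(m - 1) / 2.0) - 1.0
    return (-prefactor, prefactor * bracket)


if __name__ == "__main__":
    for g2 in (-0.36, 0.5):  # hypothetical sample values of Gamma^2
        print(g2, beta_half_series(g2), beta_half_closed(g2))
    for kappa in (0.0, 0.9, 0.99):
        print(kappa, a_half_dirac_case(kappa, m=4))
```

At κ = 0 the coefficients reduce to those of (3.100), which is a useful consistency check on the sign conventions.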


4. Analysis of Ellipticity in a General Gauge Theory on Manifolds with Boundary

In this section we are going to study gauge-invariant boundary conditions in a general gauge theory (for a review, see [22]). A gauge theory is defined by two vector bundles, V and G, such that dim V > dim G. V is the bundle of gauge fields ϕ ∈ C∞(V, M), and G is the bundle of parameters of gauge transformations ε ∈ C∞(G, M). Both bundles V and G are equipped with some Hermitian positive-definite metrics E, E† = E, and γ, γ† = γ, and with the corresponding natural L² scalar products (, )V and (, )G. The infinitesimal gauge transformations
\[ \delta\varphi = R\epsilon \tag{4.1} \]
are determined by a first-order differential operator R,
\[ R : C^\infty(G, M) \to C^\infty(V, M). \tag{4.2} \]
Further, one introduces two auxiliary operators,
\[ X : C^\infty(V, M) \to C^\infty(G, M) \tag{4.3} \]
and
\[ Y : C^\infty(G, M) \to C^\infty(G, M), \tag{4.4} \]
and one defines two differential operators,
\[ L \equiv XR : C^\infty(G, M) \to C^\infty(G, M) \tag{4.5} \]
and
\[ H \equiv \bar X Y X : C^\infty(V, M) \to C^\infty(V, M), \tag{4.6} \]

where $\bar X = E^{-1}X^\dagger\gamma$. The operators X and Y should satisfy the following conditions (but are otherwise arbitrary):

1) The differential operators L and H have the same order.
2) The operators L and H are formally self-adjoint (or anti-self-adjoint).
3) The operators L and Y are elliptic.

From these conditions we find that there are two essentially different cases:

Case I. X is of first order and Y is of zeroth order, i.e.
\[ X = \bar R, \qquad Y = I_G, \tag{4.7} \]
where $\bar R \equiv \gamma^{-1}R^\dagger E$. Then, of course, L and H are both second-order differential operators,
\[ L = \bar R R, \qquad H = R\bar R. \tag{4.8} \]

Case II. X is of zeroth order and Y is of first order. Let ℛ be the bundle of maps of G into V, and let β ∈ ℛ be a zeroth-order differential operator. Then
\[ X = \bar\beta, \qquad Y = \bar\beta R, \tag{4.9} \]
where $\bar\beta \equiv \gamma^{-1}\beta^\dagger E$, and the operators L and H are of first order,
\[ L = \bar\beta R, \qquad H = \beta\bar\beta R\bar\beta = \beta L\bar\beta. \tag{4.10} \]

We assume that, by suitable choice of the parameters, the second-order operator $\bar R R$ can be made of Laplace type and the first-order operator $\bar\beta R$ can be made of Dirac type, and, therefore, have non-degenerate leading symbols,
\[ \det{}_G\,\sigma_L(\bar R R) \ne 0, \tag{4.11} \]
\[ \det{}_G\,\sigma_L(\bar\beta R) \ne 0. \tag{4.12} \]

The dynamics of gauge fields ϕ ∈ C∞(V, M) at the linearized (one-loop) level is described by a formally self-adjoint (or anti-self-adjoint) differential operator,
\[ \Delta : C^\infty(V, M) \to C^\infty(V, M). \tag{4.13} \]
This operator is of second order for bosonic fields and of first order for fermionic fields. In both cases it satisfies the identities
\[ \Delta R = 0, \qquad \bar R\Delta = 0, \tag{4.14} \]
and, therefore, is degenerate. We consider only the case when the gauge generators are linearly independent. This means that the equation
\[ \sigma_L(R)\,\epsilon = 0, \tag{4.15} \]
for ξ ≠ 0, has the only solution ε = 0. In other words,
\[ \mathrm{Ker}\,\sigma_L(R) = \{0\}, \tag{4.16} \]
i.e. the rank of the leading symbol of the operator R equals the dimension of the bundle G,
\[ \mathrm{rank}\,\sigma_L(R) = \dim G. \tag{4.17} \]
We also assume that the leading symbols of the generators R are complete in that they generate all zero-modes of the leading symbol of the operator Δ, i.e. all solutions of the equation
\[ \sigma_L(\Delta)\,\varphi = 0, \tag{4.18} \]
for ξ ≠ 0, have the form
\[ \varphi = \sigma_L(R)\,\epsilon, \tag{4.19} \]
for some ε. In other words,
\[ \mathrm{Ker}\,\sigma_L(\Delta) = \{\sigma_L(R)\,\epsilon \mid \epsilon \in G\}, \tag{4.20} \]
and hence
\[ \mathrm{rank}\,\sigma_L(\Delta) = \dim V - \dim G. \tag{4.21} \]
Further, let us take the operator H of the same order as the operator Δ and construct a formally (anti-)self-adjoint operator,
\[ F \equiv \Delta + H, \tag{4.22} \]
so that
\[ \sigma_L(F) = \sigma_L(\Delta) + \sigma_L(H). \tag{4.23} \]
It is easy to derive the following result:

Proposition 1. The leading symbol of the operator F is non-degenerate, i.e.
\[ \det{}_V\,\sigma_L(F; x, \xi) \ne 0, \tag{4.24} \]
for any ξ ≠ 0.

Proof. Indeed, suppose there exists a zero-mode, say ϕ0, of the leading symbol of the operator F, i.e.
\[ \sigma_L(F)\,\varphi_0 = \bar\varphi_0\,\sigma_L(F) = 0, \tag{4.25} \]
where $\bar\varphi \equiv \varphi^\dagger E$. Then we have
\[ \bar\varphi_0\,\sigma_L(F)\,\sigma_L(R) = \bar\varphi_0\,\sigma_L(\bar X Y)\,\sigma_L(L) = 0, \tag{4.26} \]
and, since σL(L) is non-degenerate,
\[ \bar\varphi_0\,\sigma_L(\bar X Y) = \sigma_L(YX)\,\varphi_0 = 0. \tag{4.27} \]
But this implies
\[ \sigma_L(H)\,\varphi_0 = 0, \tag{4.28} \]
and hence
\[ \sigma_L(F)\,\varphi_0 = \sigma_L(\Delta)\,\varphi_0 = 0. \tag{4.29} \]
Thus, ϕ0 is a zero-mode of the leading symbol of the operator Δ, and according to the completeness of the generators R must have the form ϕ0 = σL(R)ε for some ε. Substituting this form into Eq. (4.27) we obtain
\[ \sigma_L(YX)\,\sigma_L(R)\,\epsilon = \sigma_L(YL)\,\epsilon = 0. \tag{4.30} \]
Herefrom, by taking into account the non-singularity of σL(YL), ε = ϕ0 = 0 follows, and hence the leading symbol of the operator F does not have any zero-modes, i.e. it is non-degenerate.

Thus, the operators L and F have, both, non-degenerate leading symbols. In quantum field theory the operator X is called the gauge-fixing operator, F the gauge-field operator, the operator L the (Faddeev–Popov) ghost operator and the operator Y in Case II the third (or Nielsen–Kallosh) ghost operator. The most convenient and the most important case is when, by suitable choice of the parameters, it turns out to be possible to make both the operators F and L either of Laplace type or of Dirac type. The one-loop effective action for gauge fields is given by the functional superdeterminants of the gauge-field operator F and the ghost operators L and Y [22],
\[ \Gamma = \frac{1}{2}\log(\mathrm{Sdet}\,F) - \log(\mathrm{Sdet}\,L) - \frac{1}{2}\log(\mathrm{Sdet}\,Y). \tag{4.31} \]
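Proposition 1 can be illustrated on a familiar example that is not worked out in this section and is used here only as a sketch under stated assumptions: Euclidean Maxwell theory in flat space. Dropping factors of i and taking E and γ to be identity matrices (both assumptions), σL(R) is the column vector ξ, σL(Δ) = |ξ|²I − ξξᵀ, and the Case I gauge-fixing term has σL(H) = σL(R R̄) = ξξᵀ. The code checks the gauge identity σL(Δ)σL(R) = 0 and that σL(F) = σL(Δ) + σL(H) equals |ξ|²I, which is non-degenerate for ξ ≠ 0. The function names are hypothetical.

```python
def maxwell_leading_symbols(xi):
    """Leading symbols for the flat-space Maxwell toy model (factors of i dropped).

    Returns (sigma_R, sigma_Delta, sigma_H, sigma_F) for a real covector xi.
    """
    n = len(xi)
    norm_sq = sum(c * c for c in xi)
    sigma_r = list(xi)  # maps the rank-1 bundle G into the rank-n bundle V
    sigma_delta = [[(norm_sq if a == b else 0.0) - xi[a] * xi[b]
                    for b in range(n)] for a in range(n)]
    sigma_h = [[xi[a] * xi[b] for b in range(n)] for a in range(n)]
    sigma_f = [[sigma_delta[a][b] + sigma_h[a][b]
                for b in range(n)] for a in range(n)]
    return sigma_r, sigma_delta, sigma_h, sigma_f


def apply_matrix(matrix, vector):
    """Plain matrix-vector product for lists of lists."""
    return [sum(row[b] * vector[b] for b in range(len(vector))) for row in matrix]


if __name__ == "__main__":
    sigma_r, sigma_delta, sigma_h, sigma_f = maxwell_leading_symbols([1.0, 2.0, 2.0])
    print("sigma(Delta) sigma(R) =", apply_matrix(sigma_delta, sigma_r))
    print("sigma(F) =", sigma_f)
```

Here σL(Δ) alone has a one-dimensional kernel spanned by ξ, in line with rank σL(Δ) = dim V − dim G, and adding σL(H) removes it.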

4.1. Bosonic gauge fields.

Let us consider first the case of bosonic fields, when Δ is a second-order formally self-adjoint operator. The gauge invariance identity (4.14) means, in particular,
\[ \sigma_L(\Delta)\,\sigma_L(R) = 0. \tag{4.32} \]
Now we assume that both the operators $L = \bar R R$ and $F = \Delta + R\bar R$ are of Laplace type, i.e.
\[ \sigma_L(\bar R R) = |\xi|^2 I_G, \tag{4.33} \]
\[ \sigma_L(F) = \sigma_L(\Delta) + \sigma_L(R\bar R) = |\xi|^2 I. \tag{4.34} \]
On manifolds with boundary one has to impose some boundary conditions to make these operators self-adjoint and elliptic. They read
\[ B_L\,\psi(\epsilon) = 0, \tag{4.35} \]
\[ B_F\,\psi(\varphi) = 0, \tag{4.36} \]
where ψ(ε) and ψ(ϕ) are the boundary data for the bundles G and V, respectively, and BL and BF are the corresponding boundary operators (see Sect. 2.1). In gauge theories one tries to choose the boundary operators BL and BF in a gauge-invariant way, so that the condition
\[ B_F\,\psi(R\epsilon) = 0 \tag{4.37} \]
is satisfied identically for any ε subject to the boundary conditions (4.35). This means that the boundary operators BL and BF satisfy the identity
\[ B_F\,[\psi, R]\,(I - B_L) \equiv 0, \tag{4.38} \]
where [ψ, R] is the commutator of the linear boundary data map ψ and the operator R. We will see that this requirement fixes completely the form of the as yet unknown boundary operator BL. Indeed, the most natural way to satisfy the condition of gauge invariance is as follows. Let us decompose the cotangent bundle T*M in such a way that ξ = (N, ζ) ∈ T*M, where N is the inward pointing unit normal to the boundary and ζ ∈ T*∂M is a cotangent vector on the boundary. Consider the restriction W0 of the vector bundle V to the boundary. Let us define restrictions of the leading symbols of the operators R and Δ to the boundary, i.e.
\[ \Pi \equiv \sigma_L(\Delta; N)\Big|_{\partial M}, \tag{4.39} \]
\[ \nu \equiv \sigma_L(R; N)\Big|_{\partial M}, \tag{4.40} \]
\[ \mu \equiv \sigma_L(R; \zeta)\Big|_{\partial M}. \tag{4.41} \]
From Eq. (4.32) we have thus the identity
\[ \Pi\,\nu = 0. \tag{4.42} \]

Moreover, from (4.33) and (4.34) we have also
\[ \bar\nu\nu = I_G, \tag{4.43} \]
\[ \bar\nu\mu + \bar\mu\nu = 0, \tag{4.44} \]
\[ \bar\mu\mu = |\zeta|^2 I_G, \tag{4.45} \]
\[ \Pi = I - \nu\bar\nu. \tag{4.46} \]
From (4.42) and (4.43) we find that Π : W0 → W0 is a self-adjoint projector orthogonal to ν,
\[ \Pi^2 = \Pi, \qquad \Pi\nu = 0, \qquad \bar\Pi = \Pi. \tag{4.47} \]
Then, a part of the boundary conditions for the operator F reads
\[ \Pi\,\varphi\Big|_{\partial M} = 0. \tag{4.48} \]
The gauge transformation of this equation is
\[ \Pi R\,\epsilon\Big|_{\partial M} = 0. \tag{4.49} \]
The normal derivative does not contribute to this equation, and, therefore, if Dirichlet boundary conditions are imposed on ε,
\[ \epsilon\Big|_{\partial M} = 0, \tag{4.50} \]
Eq. (4.49) is satisfied identically. The easiest way to get the other part of the boundary conditions is just to set
\[ \bar R\,\varphi\Big|_{\partial M} = 0. \tag{4.51} \]
Bearing in mind Eq. (4.5) we find that, under the gauge transformations (4.1), this is transformed into
\[ L\,\epsilon\Big|_{\partial M} = 0. \tag{4.52} \]
If some ε is a zero-mode of the operator L, i.e. ε ∈ Ker (L), this is identically zero. For all ε ∉ Ker (L) this is identically zero for the Dirichlet boundary conditions (4.50). In other words, the requirement of gauge invariance of the boundary conditions (4.36) determines in an almost unique way (up to zero-modes) that the ghost boundary operator BL should be of Dirichlet type. Anyway, the Dirichlet boundary conditions for the operator L are sufficient to achieve gauge invariance of the boundary conditions for the operator F. Since the operator $\bar R$ in the boundary conditions (4.51) is a first-order operator, the set of boundary conditions (4.48) and (4.51) is equivalent to the general scheme formulated in Sect. 2.1. Separating the normal derivative in the operator $\bar R$ we find exactly the generalized boundary conditions (2.15) with the boundary operator BF of the form (2.25) with a first-order operator Λ : C∞(W0, ∂M) → C∞(W0, ∂M), the matrices $\Gamma^j$ being of the form
\[ \Gamma^j = -\nu\bar\nu\mu^j\bar\nu. \tag{4.53} \]


These matrices are anti-self-adjoint, $\bar\Gamma^i = -\Gamma^i$, and satisfy the relations
$$\Pi\Gamma^i = \Gamma^i\Pi = 0. \tag{4.54}$$
Thus, one can now define the matrix
$$T \equiv \Gamma\cdot\zeta = -\nu\bar\nu\mu\bar\nu, \tag{4.55}$$
where $\mu \equiv \mu^j\zeta_j$, and study the condition of strong ellipticity (2.80). The condition of strong ellipticity now reads
$$|\zeta| I - iT = |\zeta| I + i\nu\bar\nu\mu\bar\nu > 0. \tag{4.56}$$

Further, using Eqs. (4.53), (4.44) and (4.45) we evaluate
$$A^{ij} = \Gamma^{(i}\Gamma^{j)} = -(I - \Pi)\mu^{(i}\bar\mu^{j)}(I - \Pi). \tag{4.57}$$
Therefore,
$$T^2 = A^{ij}\zeta_i\zeta_j = -(I - \Pi)\mu\bar\mu(I - \Pi), \tag{4.58}$$
and
$$T^2 + |\zeta|^2 I = |\zeta|^2\Pi + (I - \Pi)[|\zeta|^2 I - \mu\bar\mu](I - \Pi). \tag{4.59}$$
Since for non-vanishing ζ the part proportional to Π is positive-definite, the condition of strong ellipticity for bosonic gauge theory means
$$(I - \Pi)[|\zeta|^2 I - \mu\bar\mu](I - \Pi) > 0. \tag{4.60}$$

We have thus proved a theorem:

Theorem 5. Let V and G be two vector bundles over a compact Riemannian manifold M with smooth boundary, such that dim V > dim G. Consider a bosonic gauge theory and let the first-order differential operator $R : C^\infty(G, M) \to C^\infty(V, M)$ be the generator of infinitesimal gauge transformations. Let $\Delta : C^\infty(V, M) \to C^\infty(V, M)$ be the gauge-invariant second-order differential operator of the linearized field equations. Let the operators $L \equiv \bar R R : C^\infty(G, M) \to C^\infty(G, M)$ and $F \equiv \Delta + R\bar R$ be of Laplace type and normalized by $\sigma_L(L) = |\xi|^2 I_G$. Let $\sigma_L(R; N)\big|_{\partial M} \equiv \nu$ and $\sigma_L(R; \zeta)\big|_{\partial M} \equiv \mu$ be the restriction of the leading symbol of the operator R to the boundary, N being the normal to the boundary and $\zeta \in T^*\partial M$ being a cotangent vector, and $\Pi = I - \nu\bar\nu$. Then the generalized boundary-value problem $(F, B_F)$ with the boundary operator $B_F$ determined by the boundary conditions (4.48) and (4.51) is gauge-invariant provided that the ghost boundary operator $B_L$ takes the Dirichlet form. Moreover, it is strongly elliptic with respect to $\mathbf{C} - \mathbf{R}_+$ if and only if the matrix $[|\zeta| I + i\nu\bar\nu\mu\bar\nu]$ is positive-definite. A sufficient condition for that reads
$$(I - \Pi)[|\zeta|^2 I - \mu\bar\mu](I - \Pi) > 0. \tag{4.61}$$


4.2. Fermionic gauge fields. In the case of fermionic gauge fields, Δ is a first-order formally self-adjoint (or anti-self-adjoint), degenerate (i.e. gauge-invariant) operator with leading symbol satisfying
$$\sigma_L(\Delta)\sigma_L(R) = 0. \tag{4.62}$$
Now we have the case II, hence, we assume both the operators $L = \bar\beta R$ and $F = \Delta + \beta\bar\beta R\bar\beta$ to be of Dirac type, i.e. the operators $L^2$ and $F^2$ are of Laplace type:
$$[\bar\beta\sigma_L(R)]^2 = |\xi|^2 I_G, \tag{4.63}$$
$$[\sigma_L(\Delta) + \beta\bar\beta\sigma_L(R)\bar\beta]^2 = |\xi|^2 I. \tag{4.64}$$

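Note that now we have two systems of Dirac matrices, $\gamma^\mu$ on the bundle G and $\Gamma^\mu$ on V. They are defined by the leading symbols of the Dirac-type operators L and F,
$$\sigma_L(L) = -\gamma^\mu\xi_\mu, \quad \sigma_L(F) = -\Gamma^\mu\xi_\mu. \tag{4.65}$$
Let us define
$$b \equiv \sigma_L(\Delta; N)\big|_{\partial M}, \quad c \equiv \sigma_L(\Delta; \zeta)\big|_{\partial M}, \tag{4.66}$$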

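where $\zeta \in T^*\partial M$. Bearing in mind the notation (4.40) and (4.41), we have from (4.62),
$$b\nu = 0, \quad c\nu + b\mu = 0. \tag{4.67}$$
Moreover, from (4.63) we have also
$$\bar\beta\nu\bar\beta\nu = I_G, \tag{4.68}$$
$$\bar\beta\nu\bar\beta\mu + \bar\beta\mu\bar\beta\nu = 0. \tag{4.69}$$
Similarly, from (4.64) we get
$$(b + \beta\bar\beta\nu\bar\beta)^2 = I, \tag{4.70}$$
$$(b + \beta\bar\beta\nu\bar\beta)(c + \beta\bar\beta\mu\bar\beta) + (c + \beta\bar\beta\mu\bar\beta)(b + \beta\bar\beta\nu\bar\beta) = 0. \tag{4.71}$$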

Herefrom, we find that the following operators:
$$P_G = \tfrac{1}{2}[I_G - \gamma(N)] = \tfrac{1}{2}(I_G + \bar\beta\nu), \tag{4.72}$$
$$P_V = \tfrac{1}{2}[I - \Gamma(N)] = \tfrac{1}{2}(I + b + \beta\bar\beta\nu\bar\beta), \tag{4.73}$$
are projectors,
$$P_G^2 = P_G, \quad P_V^2 = P_V. \tag{4.74}$$
The boundary conditions for the Dirac-type operators L and F are given by projectors (see (2.31)),
$$P_L\epsilon\big|_{\partial M} = 0, \tag{4.75}$$
$$P_F\varphi\big|_{\partial M} = 0. \tag{4.76}$$


The problem is to make them gauge-invariant. Here the projectors $P_L$ and $P_F$ are defined by means of the matrices $\gamma^\mu$ and $\Gamma^\mu$, respectively. The gauge transformation of Eq. (4.76) is
$$P_F R\epsilon\big|_{\partial M} = 0. \tag{4.77}$$
Noting that, on the boundary,
$$R\big|_{\partial M} = \bigl[\nu\bar\beta\nu L - i(I - \nu\bar\beta\nu\bar\beta)\mu^j\hat\nabla_j\bigr]\big|_{\partial M}, \tag{4.78}$$
and assuming that μ commutes with $P_L$, we get herefrom two conditions on the boundary projectors,
$$P_F\,\nu\bar\beta\nu(I_G - P_L) = 0, \tag{4.79}$$
$$P_F(I - \nu\bar\beta\nu\bar\beta)(I_G - P_L) = 0. \tag{4.80}$$

Such gauge-invariant boundary operators always exist. We will construct them explicitly in Sect. 6 in the course of studying the Rarita–Schwinger system.

5. Strong Ellipticity in Yang–Mills Theory

The first physical application that we study is the strong ellipticity condition in Yang–Mills theory. Now G = A is the Lie algebra of a semi-simple and compact gauge group, and V is the bundle of 1-forms taking values in A, i.e. $V = T^*M \otimes G$. Let h be the Cartan metric on the Lie algebra defined by
$$h_{ab} \equiv -C^c{}_{ad}C^d{}_{cb}, \tag{5.1}$$
$C^a{}_{bc}$ being the structure constants of the gauge group, and the fibre metric E on the bundle V be defined by
$$E(\varphi, \varphi) \equiv -\mathrm{tr}_A\, g(\varphi, \varphi), \tag{5.2}$$
or, in components,
$$E^{\mu}{}_{a}{}^{\nu}{}_{b} = g^{\mu\nu}h_{ab}. \tag{5.3}$$

The Cartan metric is non-degenerate and positive-definite. Therefore, the fibre metric E is always non-singular and positive-definite. Henceforth we will suppress the group indices. The generator of gauge transformations is now just the covariant derivative
$$R = \nabla^G, \quad \bar R = -\mathrm{tr}_g \nabla^V. \tag{5.4}$$
The leading symbols of these operators are
$$\sigma_L(R; \xi) = i\xi I_G, \quad \sigma_L(\bar R; \xi) = -i\bar\xi I, \tag{5.5}$$

where $\bar\xi \equiv \mathrm{tr}_g\,\xi$ is a map $\bar\xi : T^*M \to \mathbf{R}$. First of all, we see that
$$\sigma_L(L; \xi) = \sigma_L(\bar R R; \xi) = \bar\xi\xi I_G = |\xi|^2 I_G, \tag{5.6}$$
$$\sigma_L(H; \xi) = \sigma_L(R\bar R; \xi) = \xi\otimes\bar\xi\, I, \tag{5.7}$$
so that the operator $L = \bar R R$ is indeed a Laplace-type operator. The gauge-invariant operator Δ in linearized Yang–Mills theory is defined by the leading symbol
$$\sigma_L(\Delta) = \bigl(|\xi|^2 - \xi\otimes\bar\xi\bigr) I. \tag{5.8}$$
Thus, the operator F = Δ + H is of Laplace type,
$$\sigma_L(F) = \sigma_L(\Delta + H) = |\xi|^2 I. \tag{5.9}$$

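Further, we find
$$\nu = iN I_G, \quad \bar\nu = -i\,\mathrm{tr}_g N, \tag{5.10}$$
and
$$\mu = i\zeta I_G, \quad \bar\mu = -i\,\mathrm{tr}_g\,\zeta, \tag{5.11}$$
where N is the normal cotangent vector, and $\zeta \in T^*\partial M$. The projector Π has the form
$$\Pi = q, \tag{5.12}$$
where
$$q \equiv 1 - N\otimes\bar N. \tag{5.13}$$
In components, this reads
$$q_\mu{}^\nu \equiv \delta_\mu^\nu - N_\mu N^\nu. \tag{5.14}$$
Thus, the gauge-invariant boundary conditions are
$$q_\mu{}^\nu\varphi_\nu\big|_{\partial M} = 0, \tag{5.15}$$
$$g^{\mu\nu}\nabla_\mu\varphi_\nu\big|_{\partial M} = 0. \tag{5.16}$$
Since $\bar\zeta N = \bar N\zeta = \langle\zeta, N\rangle = 0$, we find from (5.10) and (5.11),
$$\bar\mu\nu = \bar\nu\mu = 0, \tag{5.17}$$
and hence the matrices $\Gamma^i$ in (2.57), as well as $T = \Gamma^i\zeta_i$, vanish:
$$\Gamma^j = 0, \quad T = 0. \tag{5.18}$$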

Therefore, the matrix $T^2 + |\zeta|^2 I = |\zeta|^2 I$ is positive-definite, so that the strong ellipticity condition (2.80) or (4.61) is satisfied. Thus, we have

Theorem 6. A Laplace-type operator F acting on 1-forms taking values in a semi-simple Lie algebra G, $F : C^\infty(T^*M \otimes G, M) \to C^\infty(T^*M \otimes G, M)$, with the boundary conditions (5.15) and (5.16), is elliptic.

Since such boundary conditions automatically appear in the gauge-invariant formulation of the boundary conditions in one-loop Yang–Mills theory, we have herefrom

Corollary 1. The boundary-value problem in one-loop Euclidean Yang–Mills theory determined by a Laplace-type operator and the gauge-invariant boundary conditions defined by (5.15) and (5.16), is strongly elliptic with respect to $\mathbf{C} - \mathbf{R}_+$.


6. Ellipticity for the Rarita–Schwinger System

The next step is the analysis of Rarita–Schwinger fields. The bundle G is now the bundle of spinor fields taking values in some semi-simple Lie algebra A, i.e. $G = S \otimes A$, and V is the bundle of spin-vector fields, in other words, V is the bundle of 1-forms taking values in the fibre of G, i.e. $V = T^*M \otimes G$. The whole theory does not depend on the presence of the algebra A, so we will omit completely the group indices. Let $\gamma^\mu$ be the Dirac matrices and γ be the Hermitian metric on the spinor bundle S determined by
$$\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu}I_S, \quad \bar\gamma^\mu = -\varepsilon\gamma^\mu, \quad \gamma^\dagger = \gamma, \tag{6.1}$$
where $\bar\gamma^\mu = \gamma^{-1}\gamma^{\mu\dagger}\gamma$. The fibre metric E on the bundle V is defined by
$$E(\varphi, \varphi) \equiv \varphi^\dagger_\mu\,\gamma E^{\mu\nu}\varphi_\nu, \tag{6.2}$$
where
$$E^{\mu\nu} \equiv g^{\mu\nu} + \alpha\gamma^\mu\gamma^\nu, \tag{6.3}$$
with a parameter α. By using (6.1) it is easily seen that E is Hermitian, i.e.
$$\bar E^{\mu\nu} = E^{\nu\mu}, \tag{6.4}$$
if α is real. The inverse metric reads
$$E^{-1}_{\mu\nu} = g_{\mu\nu} - \frac{\alpha}{(1 + m\alpha)}\gamma_\mu\gamma_\nu. \tag{6.5}$$

Therefore, the fibre metric E is positive-definite only for α > −1/m and is singular for α = −1/m. Thus, hereafter α ≠ −1/m. In fact, for α = −1/m the matrix $E_\mu{}^\nu$ becomes a projector on a subspace of spin-vectors $\varphi_\mu$ satisfying the condition $\gamma^\mu\varphi_\mu = 0$. The generator of gauge transformations is now again, as in the Yang–Mills case, just the covariant derivative [23]
$$R = b\nabla^G, \tag{6.6}$$
where b is a normalization constant, with leading symbol
$$\sigma_L(R; \xi) = ib\xi I_G. \tag{6.7}$$
Now we have the Case II of Sect. 4, and hence we define the map $\beta : G \to V$ and its adjoint $\bar\beta : V \to G$ by
$$(\beta\epsilon)_\mu \equiv \frac{i\varepsilon}{b}E^{-1}_{\mu\nu}\gamma^\nu\epsilon = \frac{i\varepsilon}{b(1 + \alpha m)}\gamma_\mu\epsilon, \quad \bar\beta\varphi \equiv \frac{i}{b}\gamma^\mu\varphi_\mu, \tag{6.8}$$
so that
$$\bar\beta\beta = \frac{-\varepsilon m}{b^2(1 + \alpha m)}I_G. \tag{6.9}$$
The operator X of Eq. (4.3) is now equal to $\bar\beta$ so that the operator $L = \bar\beta R$ is of Dirac type with leading symbol
$$\sigma_L(L) = -\gamma^\mu\xi_\mu. \tag{6.10}$$


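The leading symbol of the operator $H = \beta\bar\beta R\bar\beta$ reads
$$\sigma_L(H)\varphi_\lambda = \frac{\varepsilon}{b^2(1 + \alpha m)}\gamma_\lambda\gamma^\mu\gamma^\nu\xi_\mu\varphi_\nu. \tag{6.11}$$
The gauge-invariant operator Δ is now the Rarita–Schwinger operator with leading symbol
$$\sigma_L(\Delta)\varphi_\lambda = \varepsilon E^{-1}_{\lambda\beta}\gamma^{\mu\beta\nu}\xi_\mu\varphi_\nu. \tag{6.12}$$
Here and below we denote the antisymmetrized products of γ-matrices by
$$\gamma^{\mu_1\ldots\mu_n} \equiv \gamma^{[\mu_1}\cdots\gamma^{\mu_n]}. \tag{6.13}$$
Of course, the leading symbol is self-adjoint and gauge-invariant, in that
$$\overline{\sigma_L(\Delta)} = \sigma_L(\Delta), \quad \sigma_L(\Delta)\sigma_L(R) = 0, \tag{6.14}$$
where $\overline{\sigma_L(\Delta)} = E^{-1}\sigma_L^\dagger(\Delta)E$. Further, the leading symbol of the operator F = Δ + H reads
$$\sigma_L(F)\varphi_\lambda = \varepsilon\left\{E^{-1}_{\lambda\beta}\gamma^{\mu\beta\nu} + \frac{1}{b^2(1 + \alpha m)}\gamma_\lambda\gamma^\mu\gamma^\nu\right\}\xi_\mu\varphi_\nu. \tag{6.15}$$
Using the properties of the Clifford algebra we compute
$$\sigma_L(F)\varphi_\lambda = -\varepsilon\left\{\delta_\lambda^\nu\gamma^\mu - \delta_\lambda^\mu\gamma^\nu + \frac{1}{(1 + m\alpha)}\left(1 + 2\alpha - \frac{1}{b^2}\right)\gamma_\lambda\gamma^\mu\gamma^\nu - \frac{(1 + 2\alpha)}{(1 + m\alpha)}\gamma_\lambda g^{\mu\nu}\right\}\xi_\mu\varphi_\nu. \tag{6.16}$$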

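Moreover, one can prove the following property of representation theory:

Proposition 2. The matrices $\tilde\Gamma^\mu(0)$ defined by
$$(\tilde\Gamma^\mu(0))_\lambda{}^\nu = \tilde\Gamma^\mu{}_\lambda{}^\nu(0) \equiv \delta_\lambda^\nu\gamma^\mu + \frac{2}{m}\gamma_\lambda\gamma^\mu\gamma^\nu + (\omega - 1)\delta_\lambda^\mu\gamma^\nu - (\omega + 1)\gamma_\lambda g^{\mu\nu}, \tag{6.17}$$
where ω is defined by
$$\omega^2 \equiv \frac{m - 4}{m}, \tag{6.18}$$
form a representation of the Clifford algebra, i.e.
$$\tilde\Gamma^\mu{}_\beta{}^\lambda(0)\tilde\Gamma^\nu{}_\lambda{}^\rho(0) + \tilde\Gamma^\nu{}_\beta{}^\lambda(0)\tilde\Gamma^\mu{}_\lambda{}^\rho(0) = 2g^{\mu\nu}\delta_\beta^\rho I_S. \tag{6.19}$$

It is thus clear that the set of matrices
$$\tilde\Gamma^\mu{}_\lambda{}^\beta(\alpha') = T^{-1}{}_\lambda{}^\rho(\alpha')\,\tilde\Gamma^\mu{}_\rho{}^\sigma(0)\,T_\sigma{}^\beta(\alpha'), \tag{6.20}$$
with arbitrary non-degenerate matrix $T(\alpha')$ depending on a parameter $\alpha'$, also forms a representation of the Clifford algebra. By choosing
$$T_\lambda{}^\beta(\alpha') = \delta_\lambda^\beta + \alpha'\gamma_\lambda\gamma^\beta, \tag{6.21}$$
and, hence,
$$T^{-1}{}_\lambda{}^\beta(\alpha') = \delta_\lambda^\beta - \frac{\alpha'}{(1 + m\alpha')}\gamma_\lambda\gamma^\beta, \tag{6.22}$$
we prove, more generally, what follows.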


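Corollary 2. The matrices $\tilde\Gamma^\mu(\alpha')$ defined by
$$(\tilde\Gamma^\mu(\alpha'))_\lambda{}^\nu = \tilde\Gamma^\mu{}_\lambda{}^\nu(\alpha') \equiv \delta_\lambda^\nu\gamma^\mu + c_1\gamma_\lambda\gamma^\mu\gamma^\nu + c_2\delta_\lambda^\mu\gamma^\nu + c_3\gamma_\lambda g^{\mu\nu}, \tag{6.23}$$
where
$$c_1 \equiv \frac{(m - 2 - m\omega)}{(1 + m\alpha')}\left(\alpha' + \frac{\omega + 1}{2}\right)^2, \tag{6.24}$$
$$c_2 \equiv -(m - 2 - m\omega)\left(\alpha' + \frac{\omega + 1}{2}\right), \tag{6.25}$$
$$c_3 \equiv -\frac{2}{(1 + m\alpha')}\left(\alpha' + \frac{\omega + 1}{2}\right), \tag{6.26}$$
with $\alpha'$ an arbitrary constant $\alpha' \neq -1/m$, and ω defined by (6.18), form a representation of the Clifford algebra, i.e. satisfy Eq. (6.19).

Note that, by choosing $\alpha' = -(\omega + 1)/2$, all the constants $c_1$, $c_2$ and $c_3$ vanish. The operator F with leading symbol (6.16) should be of Dirac type. Thus, by imposing that the term in curly brackets in Eq. (6.16) should coincide with the right-hand side of Eq. (6.23), one finds a system whose solution is
$$\alpha' = \frac{1}{4}[m - 4 + (m - 2)\omega], \tag{6.27}$$
$$\alpha = \frac{m - 4}{4}, \tag{6.28}$$
$$b = \pm\frac{2}{\sqrt{m - 2}}. \tag{6.29}$$
The simplest case is m = 4, one has then $\alpha = \alpha' = 0$, $b = \pm\sqrt{2}$. After this choice the operator F is of Dirac type, i.e. $\sigma_L(F) = -\varepsilon\Gamma^\mu\xi_\mu$, with
$$(\Gamma^\mu)_\lambda{}^\nu = \Gamma^\mu{}_\lambda{}^\nu \equiv \delta_\lambda^\nu\gamma^\mu + \frac{1}{(m - 2)}\gamma_\lambda\gamma^\mu\gamma^\nu - \delta_\lambda^\mu\gamma^\nu - \frac{2}{(m - 2)}\gamma_\lambda g^{\mu\nu}. \tag{6.30}$$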
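The matching of coefficients between Eq. (6.16) and Eq. (6.23) can be checked numerically. The following Python sketch is only an illustration and is not part of the original derivation: the function names are hypothetical, and the coefficient formulas are transcribed from Eqs. (6.16) and (6.24)–(6.29) as read here. It evaluates both sets of coefficients for several dimensions m ≥ 4 and both signs of ω, and compares them with the values 1/(m − 2), −1 and −2/(m − 2) appearing in Eq. (6.30).

```python
import math


def clifford_coefficients(m, alpha_prime, omega):
    """Coefficients c1, c2, c3 as transcribed from Eqs. (6.24)-(6.26)."""
    s = alpha_prime + (omega + 1.0) / 2.0
    k = m - 2.0 - m * omega
    c1 = k / (1.0 + m * alpha_prime) * s ** 2
    c2 = -k * s
    c3 = -2.0 / (1.0 + m * alpha_prime) * s
    return c1, c2, c3


def symbol_coefficients(m, alpha, b):
    """Coefficients inside the curly brackets of Eq. (6.16), in the order
    (gamma_l gamma^mu gamma^nu, delta_l^mu gamma^nu, gamma_l g^{mu nu})."""
    d1 = (1.0 + 2.0 * alpha - 1.0 / b ** 2) / (1.0 + m * alpha)
    d2 = -1.0
    d3 = -(1.0 + 2.0 * alpha) / (1.0 + m * alpha)
    return d1, d2, d3


def mismatch(m, sign=1):
    """Largest deviation between (6.16), (6.23) and (6.30) at the solution
    (6.27)-(6.29); `sign` selects the root of omega^2 = (m - 4)/m."""
    omega = sign * math.sqrt((m - 4.0) / m)
    alpha_prime = 0.25 * (m - 4.0 + (m - 2.0) * omega)
    alpha = (m - 4.0) / 4.0
    b = 2.0 / math.sqrt(m - 2.0)
    from_symbol = symbol_coefficients(m, alpha, b)
    from_clifford = clifford_coefficients(m, alpha_prime, omega)
    expected = (1.0 / (m - 2.0), -1.0, -2.0 / (m - 2.0))
    dev1 = max(abs(x - y) for x, y in zip(from_symbol, from_clifford))
    dev2 = max(abs(x - y) for x, y in zip(from_symbol, expected))
    return max(dev1, dev2)


if __name__ == "__main__":
    for m in range(4, 12):
        print(m, mismatch(m, +1), mismatch(m, -1))
```

The check only covers the scalar coefficients; it does not verify the Clifford relation (6.19) itself, which would require explicit γ-matrices.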

Thus, we have two Dirac-type operators, L and F, which have elliptic leading symbols. By choosing the appropriate boundary conditions with the projectors $P_L$ and $P_F$ the system becomes elliptic. The problem is to define the boundary projectors in a gauge-invariant way. Let $P_L$ be the boundary projector for the ghost operator L,
$$P_L\epsilon\big|_{\partial M} = 0. \tag{6.31}$$
Remember that it satisfies the symmetry condition (2.44). Then we choose the boundary conditions for the gauge field in the form
$$P_L\,q_\nu{}^\mu\varphi_\mu\big|_{\partial M} = 0, \tag{6.32}$$
$$P_L\,\gamma^\mu\varphi_\mu\big|_{\partial M} = 0, \tag{6.33}$$


where q is a projector defined by (5.13). The gauge transformation of (6.32) reads
$$P_L\,q_\nu{}^\mu\nabla_\mu\epsilon\big|_{\partial M} = 0, \tag{6.34}$$
and does not include the normal derivative. We are assuming that the projector $P_L$ commutes with the tangential derivative (as is usually the case, we find that their commutator vanishes identically by virtue of the boundary conditions on $\epsilon$). The gauge transformation of Eq. (6.33) is proportional exactly to the operator L,
$$P_L\bar\beta R\epsilon\big|_{\partial M} = P_L L\epsilon\big|_{\partial M} = 0. \tag{6.35}$$
By expanding $\epsilon$ in the eigenmodes of the operator L we find that this is proportional to the boundary conditions (6.31) on $\epsilon$, and therefore vanishes. Thus, the boundary conditions (6.32) and (6.33) are gauge-invariant. They can be re-written in another convenient form,
$$P_L[q_\nu{}^\mu + N_\nu\gamma^\mu]\varphi_\mu\big|_{\partial M} = 0. \tag{6.36}$$
This defines eventually the boundary projector $P_F$ for the gauge operator F,
$$P_{F\,\mu}{}^{\beta} = [\delta_\mu^\nu - N_\mu N_\alpha\gamma^\alpha N^\nu - N_\mu N_\alpha\gamma^\alpha\gamma^\nu]\,P_L\,[q_\nu{}^\beta + N_\nu\gamma^\beta]. \tag{6.37}$$
If the projector $P_L$ satisfies the symmetry condition (2.44) then so does the projector $P_F$ (of course, one has to check it with the matrix $\Gamma^\mu N_\mu$). Thus, we have shown that

Theorem 7. The boundary-value problem for the Rarita–Schwinger system with the boundary conditions (6.31)–(6.33) is gauge-invariant and strongly elliptic provided that the projector $P_L$ satisfies the condition (2.44). Particular examples of such projectors are given by (2.54)–(2.56).

7. Euclidean Quantum Gravity

Generalized boundary conditions similar to the ones studied so far occur naturally in Euclidean quantum gravity [4, 5]. The vector bundle G is now the bundle of cotangent vectors, $G = T^*M$, and V is the vector bundle of symmetric rank-2 tensors (also called symmetric 2-forms) over M: $V = T^*M \vee T^*M$, ∨ being the symmetrized tensor product. The metric on the bundle G is, naturally, just the metric on M, and the fibre metric E on the bundle V is defined by the equation
$$E^{ab\,cd} \equiv g^{a(c}g^{d)b} + \alpha g^{ab}g^{cd}, \tag{7.1}$$
where α is a real parameter. One has also, for the inverse metric,
$$E^{-1}{}_{ab\,cd} \equiv g_{a(c}g_{d)b} - \frac{\alpha}{(1 + \alpha m)}g_{ab}g_{cd}. \tag{7.2}$$
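As a quick sanity check of Eqs. (7.1) and (7.2), the Python sketch below builds both matrices for a flat metric $g_{ab} = \delta_{ab}$ and verifies that their product is the identity on symmetric rank-2 tensors. This is an illustrative sketch only: the flat metric, the flattening of index pairs (a, b) into a single index, and all function names are assumptions made for the example.

```python
def delta(i, j):
    """Kronecker delta as a float."""
    return 1.0 if i == j else 0.0


def fibre_metric(m, alpha):
    """E^{ab cd} of Eq. (7.1) for g = identity, stored as an (m*m) x (m*m)
    matrix with row index a*m + b and column index c*m + d."""
    n = m * m
    E = [[0.0] * n for _ in range(n)]
    for a in range(m):
        for b in range(m):
            for c in range(m):
                for d in range(m):
                    sym = 0.5 * (delta(a, c) * delta(b, d) + delta(a, d) * delta(b, c))
                    E[a * m + b][c * m + d] = sym + alpha * delta(a, b) * delta(c, d)
    return E


def inverse_fibre_metric(m, alpha):
    """E^{-1} of Eq. (7.2): same form with alpha -> -alpha / (1 + alpha*m)."""
    return fibre_metric(m, -alpha / (1.0 + alpha * m))


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]


def max_abs_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))


if __name__ == "__main__":
    m, alpha = 4, 0.3
    product = matmul(fibre_metric(m, alpha), inverse_fibre_metric(m, alpha))
    symmetrizer = fibre_metric(m, 0.0)  # identity on symmetric tensors
    print(max_abs_diff(product, symmetrizer))
```

Applying `fibre_metric` to the flattened metric tensor returns it multiplied by (1 + αm), while a trace-free symmetric tensor is left unchanged, which is the eigenvalue structure used in the positivity condition on α.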

We do not fix the α parameter from the beginning, but rather are going to study the dependence of the heat kernel on it. It is not difficult to show that the eigenvalues of the matrix E are 1 (with degeneracy m(m + 1)/2 − 1) and (1 + αm). Therefore, this metric is positive-definite only for
$$\alpha > -\frac{1}{m}, \tag{7.3}$$
and becomes singular for α = −1/m. Thus, hereafter α ≠ −1/m. The generator R of infinitesimal gauge transformations is now defined to be the Lie derivative of the tensor field φ along the vector field $\epsilon$,
$$(R\epsilon)_{ab} \equiv (\mathcal{L}_\epsilon\varphi)_{ab} = \sqrt{2}\,\nabla_{(a}\epsilon_{b)}. \tag{7.4}$$
The adjoint generator $\bar R$ with the metric E is defined by its action,
$$(\bar R\varphi)_a \equiv -\sqrt{2}\,E_a{}^{b\,cd}\nabla_b\varphi_{cd}. \tag{7.5}$$
The leading symbol of the ghost operator $L = \bar R R$ takes now the form
$$\sigma_L(L; \xi)\epsilon_a = 2E_a{}^{bcd}\xi_b\xi_c\epsilon_d = \bigl[\delta_a^b|\xi|^2 + (1 + 2\alpha)\xi_a\xi^b\bigr]\epsilon_b. \tag{7.6}$$
Therefore, we see that it becomes of Laplace type only for α = −1/2. Note also that the operator L has positive-definite leading symbol only for α > −1, and becomes degenerate for α = −1. Further, the leading symbol of the operator $H = R\bar R$ reads
$$\sigma_L(H; \xi)\varphi_{ab} = 2\xi_{(a}E_{b)}{}^{cde}\xi_c\varphi_{de} = \bigl[2\xi_{(a}\delta_{b)}^{(c}\xi^{d)} + 2\alpha\xi_a\xi_b g^{cd}\bigr]\varphi_{cd}. \tag{7.7}$$
The gauge-invariant operator Δ is well known (see, e.g. [22, 23]). It has the following leading symbol:
$$\sigma_L(\Delta; \xi)\varphi_{ab} = \left\{\delta_{(a}^{(c}\delta_{b)}^{d)}|\xi|^2 + \xi_a\xi_b g^{cd} - 2\xi_{(a}\delta_{b)}^{(c}\xi^{d)} + \frac{1 + 2\alpha}{(1 + \alpha m)}g_{ab}\bigl(\xi^c\xi^d - g^{cd}|\xi|^2\bigr)\right\}\varphi_{cd}. \tag{7.8}$$
Thus, we see that the operator F = Δ + H is of Laplace type only in the case α = −1/2. Let us, however, consider for the time being a Laplace-type operator F on symmetric rank-2 tensors with a fibre metric (7.1) with an arbitrary parameter α. Further, we define the projector Π
$$\Pi = \Pi_{ab}{}^{cd} \equiv q_{(a}{}^c q_{b)}{}^d - \frac{\alpha}{(\alpha + 1)}N_a N_b q^{cd}, \tag{7.9}$$
where $q_{ab} \equiv g_{ab} - N_a N_b$. It is not difficult to check that it is self-adjoint with respect to the metric E, i.e. $\bar\Pi = \Pi$, and that
$$\mathrm{tr}_V\,\Pi = \frac{1}{2}m(m - 1). \tag{7.10}$$

Thus, we consider a Laplace-type operator F acting on symmetric rank-2 tensors with the following boundary conditions:
$$\Pi_{ab}{}^{cd}\varphi_{cd}\big|_{\partial M} = 0, \tag{7.11}$$
$$E^{ab\,cd}\nabla_b\varphi_{cd}\big|_{\partial M} = 0. \tag{7.12}$$


Separating the normal derivative we find from here the boundary operator $B_F$ of the form (2.25), the operator Λ being given by (2.57) with the matrices $\Gamma^i$ defined by
$$\Gamma^i = \Gamma^i{}_{ab}{}^{cd} \equiv -\frac{1}{(1 + \alpha)}N_a N_b e^{i(c}N^{d)} + N_{(a}e^i{}_{b)}N^c N^d. \tag{7.13}$$
It is not difficult to check that these matrices are anti-self-adjoint and satisfy the conditions (2.58) and (2.59). The matrix $T \equiv \Gamma\cdot\zeta$ reads (cf. [9])
$$T = -\frac{1}{(1 + \alpha)}p_1 + p_2, \tag{7.14}$$
where
$$p_1 = p_{1\,ab}{}^{cd} \equiv N_a N_b\zeta^{(c}N^{d)}, \quad p_2 = p_{2\,ab}{}^{cd} \equiv N_{(a}\zeta_{b)}N^c N^d, \tag{7.15}$$
and $\zeta_a = e^i{}_a\zeta_i$, so that $\zeta_a N^a = 0$. It is important to note that
$$\Pi T = T\Pi = 0. \tag{7.16}$$
We also define further projectors,
$$\rho = \rho_{ab}{}^{cd} \equiv \frac{2}{|\zeta|^2}N_{(a}\zeta_{b)}N^{(c}\zeta^{d)}, \tag{7.17}$$
$$p = p_{ab}{}^{cd} \equiv N_a N_b N^c N^d, \tag{7.18}$$
which are mutually orthogonal:
$$p\rho = \rho p = 0. \tag{7.19}$$

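The matrices $p_1$ and $p_2$, however, are nilpotent: $p_1^2 = p_2^2 = 0$, and their products are proportional to the projectors:
$$p_1 p_2 = \frac{1}{2}|\zeta|^2 p, \quad p_2 p_1 = \frac{1}{2}|\zeta|^2\rho. \tag{7.20}$$
Therefore, we have
$$T^2 = -\frac{1}{2(1 + \alpha)}|\zeta|^2(p + \rho) \equiv -\tau^2(p + \rho), \tag{7.21}$$
where
$$\tau \equiv \frac{1}{\sqrt{2(1 + \alpha)}}|\zeta|. \tag{7.22}$$
We compute further
$$T^{2n} = (i\tau)^{2n}(p + \rho), \tag{7.23}$$
$$T^{2n+1} = (i\tau)^{2n}T. \tag{7.24}$$
Last, by using
$$\mathrm{tr}\,p = \mathrm{tr}\,\rho = 1, \tag{7.25}$$
$$\mathrm{tr}\,p_1 = \mathrm{tr}\,p_2 = 0, \tag{7.26}$$
we obtain
$$\mathrm{tr}\,T^{2n} = 2(i\tau)^{2n}, \quad \mathrm{tr}\,T^{2n+1} = \mathrm{tr}\,T = 0. \tag{7.27}$$
This suffices to prove: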


Lemma 1. For any function f analytic in the region |z| ≤ τ, one has
$$f(T) = f(0)[I - p - \rho] + \frac{1}{2}[f(i\tau) + f(-i\tau)](p + \rho) + \frac{1}{2i\tau}[f(i\tau) - f(-i\tau)]T, \tag{7.28}$$
$$\mathrm{tr}\,f(T) = \left[\frac{m(m + 1)}{2} - 2\right]f(0) + f(i\tau) + f(-i\tau). \tag{7.29}$$

Thus, the eigenvalues of the matrix T are
$$\mathrm{spec}\,(T) = \begin{cases} 0 & \text{with degeneracy } \frac{m(m+1)}{2} - 2, \\ i\tau & \text{with degeneracy } 1, \\ -i\tau & \text{with degeneracy } 1, \end{cases} \tag{7.30}$$
with τ defined by Eq. (7.22). This means that the eigenvalues of the matrix $T^2$ for a non-vanishing $\zeta_j$ are 0 and $-1/[2(1 + \alpha)]\,|\zeta|^2$. Thus, the strong ellipticity condition (2.81), which means that the matrix $(T^2 + |\zeta|^2 I)$, for non-zero ζ, should be positive-definite, takes the form
$$-\frac{1}{2(1 + \alpha)} + 1 > 0. \tag{7.31}$$
This proves eventually

Theorem 8. The boundary-value problem for a Laplace-type operator acting on sections of the bundle of symmetric rank-2 tensors with the boundary conditions (7.11) and (7.12) is strongly elliptic with respect to $\mathbf{C} - \mathbf{R}_+$ only for
$$\alpha > -\frac{1}{2}. \tag{7.32}$$
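The algebra behind Theorem 8 can be illustrated with explicit matrices. The Python sketch below is a hedged numerical illustration, not the authors' method: it assumes a flat metric, a unit normal N and a tangential covector ζ with ζ·N = 0, represents $p_1$, $p_2$, p and ρ of Eqs. (7.15), (7.17) and (7.18) as matrices acting on flattened 2-tensors, and checks that $T^2 + \tau^2(p + \rho)$ vanishes as stated in Eq. (7.21). All function names are hypothetical.

```python
def sym_outer(u, v):
    """Flattened symmetric product u_(a v_b), a list of length m*m."""
    m = len(u)
    return [0.5 * (u[a] * v[b] + u[b] * v[a]) for a in range(m) for b in range(m)]


def outer(x, y):
    """Operator x_{ab} y^{cd} on 2-tensors as an (m*m) x (m*m) matrix."""
    return [[xi * yj for yj in y] for xi in x]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(cA, A, cB, B):
    n = len(A)
    return [[cA * A[i][j] + cB * B[i][j] for j in range(n)] for i in range(n)]


def build_T(N, zeta, alpha):
    """T = -p1/(1 + alpha) + p2, following Eqs. (7.14)-(7.15)."""
    NN = sym_outer(N, N)
    S = sym_outer(N, zeta)
    p1 = outer(NN, S)
    p2 = outer(S, NN)
    return lincomb(-1.0 / (1.0 + alpha), p1, 1.0, p2)


def t_squared_residual(N, zeta, alpha):
    """Largest entry of T^2 + tau^2 (p + rho); zero if Eq. (7.21) holds."""
    NN = sym_outer(N, N)
    S = sym_outer(N, zeta)
    z2 = sum(c * c for c in zeta)
    p_plus_rho = lincomb(1.0, outer(NN, NN), 2.0 / z2, outer(S, S))
    tau2 = z2 / (2.0 * (1.0 + alpha))
    T = build_T(N, zeta, alpha)
    R = lincomb(1.0, matmul(T, T), tau2, p_plus_rho)
    return max(abs(x) for row in R for x in row)


def strongly_elliptic(alpha):
    """Condition (7.31); meaningful for alpha > -1, where tau is real."""
    return 1.0 - 1.0 / (2.0 * (1.0 + alpha)) > 0.0


if __name__ == "__main__":
    N = [1.0, 0.0, 0.0]        # unit normal (assumed)
    zeta = [0.0, 0.6, -1.3]    # tangential covector, zeta . N = 0 (assumed)
    for alpha in (-0.75, -0.5, 0.0, 1.5):
        print(alpha, t_squared_residual(N, zeta, alpha), strongly_elliptic(alpha))
```

For α > −1 the function `strongly_elliptic` returns True exactly when α > −1/2, in agreement with Eq. (7.32); the borderline value α = −1/2 gives a vanishing eigenvalue of $T^2 + |\zeta|^2 I$ and is therefore excluded.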

Remarks. First, let us note that the condition (7.32) of strong ellipticity is compatible with the condition (7.3) of positivity of the fibre metric E. Second, it is exactly the value α = −1/2 that appears in the gauge-invariant boundary conditions in one-loop quantum gravity in the minimal DeWitt gauge. For a general value α ≠ −1/2, the operator F resulting from Δ and H is not of Laplace type, which complicates the analysis significantly. In other words, we have a corollary [9]:

Corollary 3. The boundary-value problem in one-loop Euclidean quantum gravity, with a Laplace-type operator F acting on rank-two symmetric tensor fields, and with the gauge-invariant boundary conditions (7.11) and (7.12), the fibre metric being E with α = −1/2, is not strongly elliptic with respect to $\mathbf{C} - \mathbf{R}_+$.

We can also evaluate the coefficient $a_{1/2}$ of heat-kernel asymptotics. This is most easily obtained by using the representation (3.94) of the coefficient $\beta_{1/2}$ in form of a Gaussian integral. Using Eq. (7.21) we get first
$$\exp(-T^2) - I = \left[\exp\left(\frac{1}{2(1 + \alpha)}|\zeta|^2\right) - I\right](p + \rho). \tag{7.33}$$


Therefore, from Eq. (3.94) we obtain
$$\beta_{1/2} = 2\int_{\mathbf{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\left[\exp\left(-\frac{1 + 2\alpha}{2(1 + \alpha)}|\zeta|^2\right) - e^{-|\zeta|^2}\right](p + \rho). \tag{7.34}$$
One should bear in mind that ρ is a projector that depends on ζ (see (7.17)). Although ρ is singular at the point ζ = 0, the integral is well defined because of the difference of two exponential functions. Calculating the Gaussian integrals we obtain
$$\beta_{1/2} = 2\left[\left(\frac{2(\alpha + 1)}{1 + 2\alpha}\right)^{(m-1)/2} - 1\right]\left[p + \frac{1}{(m - 1)}\psi\right], \tag{7.35}$$
$$a_{1/2} = \frac{\sqrt{\pi}}{2}\left\{I - 2\Pi + 2\left[\left(\frac{2(\alpha + 1)}{1 + 2\alpha}\right)^{(m-1)/2} - 1\right]\left[p + \frac{1}{(m - 1)}\psi\right]\right\}, \tag{7.36}$$
where ψ is yet another projector:
$$\psi = \psi_{ab}{}^{cd} \equiv 2N_{(a}q_{b)}{}^{(c}N^{d)}. \tag{7.37}$$
Last, from Eq. (3.92), by using the traces of the projectors (7.10), (7.25) and (7.37), and the dimension of the bundle of symmetric rank-2 tensors,
$$\dim V = \mathrm{tr}_V\,I = \frac{1}{2}m(m + 1), \tag{7.38}$$
we obtain
$$\mathrm{tr}_V\,a_{1/2} = \frac{\sqrt{\pi}}{2}\left\{-\frac{1}{2}m(m - 3) - 4 + 4\left(\frac{2(\alpha + 1)}{1 + 2\alpha}\right)^{(m-1)/2}\right\}. \tag{7.39}$$
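The step from Eq. (7.36) to Eq. (7.39) is a short trace computation, which the following Python sketch reproduces. It is meant only as an illustrative check under stated assumptions: the function names are hypothetical, the traces $\mathrm{tr}_V I$, $\mathrm{tr}_V\Pi$ and $\mathrm{tr}\,p$ are taken from (7.38), (7.10) and (7.25), and the value $\mathrm{tr}_V\psi = m - 1$ is our own evaluation of the trace of (7.37). The formulas require α > −1/2 so that the Gaussian factor is real and finite.

```python
import math


def gaussian_factor(m, alpha):
    """[2(alpha + 1)/(1 + 2 alpha)]^{(m-1)/2}, defined for alpha > -1/2."""
    return (2.0 * (alpha + 1.0) / (1.0 + 2.0 * alpha)) ** ((m - 1) / 2.0)


def trace_a_half_closed(m, alpha):
    """Right-hand side of Eq. (7.39)."""
    return math.sqrt(math.pi) / 2.0 * (
        -0.5 * m * (m - 3) - 4.0 + 4.0 * gaussian_factor(m, alpha))


def trace_a_half_from_pieces(m, alpha):
    """Trace of Eq. (7.36) assembled from the individual projector traces."""
    tr_identity = m * (m + 1) / 2.0   # Eq. (7.38)
    tr_pi = m * (m - 1) / 2.0         # Eq. (7.10)
    tr_p = 1.0                        # Eq. (7.25)
    tr_psi = m - 1.0                  # trace of (7.37), evaluated by hand
    bracket = gaussian_factor(m, alpha) - 1.0
    return math.sqrt(math.pi) / 2.0 * (
        tr_identity - 2.0 * tr_pi + 2.0 * bracket * (tr_p + tr_psi / (m - 1.0)))


if __name__ == "__main__":
    for alpha in (0.0, 0.3, -0.4, -0.499):
        print(alpha, trace_a_half_closed(4, alpha), trace_a_half_from_pieces(4, alpha))
```

Evaluating the functions for α approaching −1/2 from above shows the trace growing without bound, since the factor $[2(\alpha + 1)/(1 + 2\alpha)]^{(m-1)/2}$ diverges there.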

Thus, we see that there is a singularity at α = −1/2, which reflects the lack of strong ellipticity in this case.

7.1. Heat-kernel diagonal in the non-elliptic case. Consider now the case α = −1/2. From Eq. (7.28) we have, in particular,
$$(I\mu - iT)^{-1} = \frac{1}{\mu}(I - p - \rho) + \frac{\mu}{\mu^2 - |\zeta|^2}(p + \rho) + i\frac{1}{\mu^2 - |\zeta|^2}T = \frac{1}{\mu}(I - p - \rho) - \frac{\mu}{\lambda}(p + \rho) - i\frac{1}{\lambda}T, \tag{7.40}$$

$$\det(I\mu - iT) = \mu^{m(m+1)/2 - 2}(\mu^2 - |\zeta|^2) = (|\zeta|^2 - \lambda)^{m(m+1)/4 - 1}(-\lambda). \tag{7.41}$$
For the boundary-value problem to be strongly elliptic, this determinant should be non-vanishing for any (ζ, λ) ≠ (0, 0) and $\lambda \in \mathbf{C} - \mathbf{R}_+$, including the case λ = 0, ζ ≠ 0. Actually we see that, for any non-zero $\lambda \notin \mathbf{R}_+$, this determinant does not vanish for any ζ. However, for λ = 0 and any ζ it equals zero, which means that the corresponding boundary conditions do not fix a unique solution of the eigenvalue equation for the leading symbol, subject to a decay condition at infinity. This is reflected by the simple fact that the coefficient $a_{1/2}$ of the asymptotic expansion of the heat kernel is not well defined, in that the integrals that determine it are divergent. At the technical level, the non-ellipticity is reflected in the fact that the heat-kernel diagonal, although well defined, has a non-standard non-integrable behaviour near the boundary, i.e. for r → 0. To prove this property, let us calculate the fibre trace of the heat-kernel diagonal. From (3.81) we have
$$\mathrm{tr}_V\,U_0(t|x, x) = (4\pi t)^{-m/2}\left[c_0 + c_1 e^{-r^2/t} + J(r/\sqrt{t})\right], \tag{7.42}$$
where
$$c_0 \equiv \dim V = \frac{m(m + 1)}{2}, \tag{7.43}$$
$$c_1 \equiv \mathrm{tr}_V(I - 2\Pi) = -\frac{m(m - 3)}{2}, \tag{7.44}$$
$$J(z) \equiv \mathrm{tr}_V\,\Phi(z) = -2\int_{\mathbf{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\int_C\frac{d\omega}{\sqrt{\pi}}\,e^{-|\zeta|^2 - \omega^2 + 2i\omega z}\,\mathrm{tr}_V\,\Gamma\cdot\zeta\,(\omega I + \Gamma\cdot\zeta)^{-1}. \tag{7.45}$$
Now, for α = −1/2 the parameter τ (7.22) determining the eigenvalues of the matrix $T = \Gamma\cdot\zeta$ is equal to τ = |ζ|. By using (7.29) we get
$$J(z) = -4\int_{\mathbf{R}^{m-1}}\frac{d\zeta}{\pi^{(m-1)/2}}\int_C\frac{d\omega}{\sqrt{\pi}}\,e^{-|\zeta|^2 - \omega^2 + 2i\omega z}\,\frac{|\zeta|^2}{\omega^2 + |\zeta|^2}. \tag{7.46}$$

Remember that the contour C lies in the upper half plane: it comes from −∞ + iε, encircles the point ω = i|ζ| in the clockwise direction and goes to ∞ + iε. The integral over ω is calculated by using the formula
$$\int_C d\omega\,f(\omega) = -2\pi i\,\mathrm{Res}_{\omega = i|\zeta|}f(\omega) + \int_{-\infty}^{\infty}d\omega\,f(\omega). \tag{7.47}$$
The integrals over ζ can be reduced to Gaussian integrals by lifting the denominator in the exponent (cf. (3.83)) or by using spherical coordinates. In this way we get eventually
$$J(z) = 2(m - 1)\Gamma(m/2)\frac{1}{z^m} - 2(m - 1)\int_0^1 du\,u^{m/2 - 1}e^{-z^2 u}. \tag{7.48}$$
This can also be written in the form
$$J(z) = 2(m - 1)z^{-m}\left[\Gamma(m/2) - \gamma(m/2, z^2)\right] = 2(m - 1)z^{-m}\Gamma(m/2, z^2), \tag{7.49}$$
by using the incomplete γ-functions
$$\gamma(a, x) \equiv \int_0^x du\,u^{a-1}e^{-u}, \quad \Gamma(a, x) \equiv \int_x^\infty du\,u^{a-1}e^{-u}. \tag{7.50}$$


It is immediately seen that the function J(z) is singular as z → 0,
$$J(z) \sim 2(m - 1)\Gamma(m/2)\frac{1}{z^m}. \tag{7.51}$$
By using the identity
$$\Gamma(a, x) = x^{a-1}e^{-x} + (a - 1)\Gamma(a - 1, x) \tag{7.52}$$
we also find that, when z → ∞, J(z) is exponentially small, i.e.
$$J(z) \sim 2(m - 1)z^{-2}e^{-z^2}. \tag{7.53}$$
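Both asymptotic regimes of J(z) can be explored numerically from the closed form (7.49). The Python sketch below is an illustration under stated assumptions, not a general-purpose implementation: the function names are hypothetical, and the upper incomplete gamma function is evaluated only for positive integer or half-integer first argument, by applying the recursion (7.52) down to $\Gamma(1, x) = e^{-x}$ or $\Gamma(1/2, x) = \sqrt{\pi}\,\mathrm{erfc}(\sqrt{x})$.

```python
import math


def upper_incomplete_gamma(a, x):
    """Gamma(a, x) for a = 1/2, 1, 3/2, 2, ... via the recursion (7.52)."""
    if a == 1.0:
        return math.exp(-x)
    if a == 0.5:
        return math.sqrt(math.pi) * math.erfc(math.sqrt(x))
    return x ** (a - 1.0) * math.exp(-x) + (a - 1.0) * upper_incomplete_gamma(a - 1.0, x)


def J(z, m):
    """J(z) = 2(m - 1) z^{-m} Gamma(m/2, z^2), Eq. (7.49), for integer m >= 1."""
    return 2.0 * (m - 1) * z ** (-m) * upper_incomplete_gamma(m / 2.0, z * z)


def small_z_ratio(z, m):
    """J(z) divided by the leading singular behaviour (7.51)."""
    return J(z, m) / (2.0 * (m - 1) * math.gamma(m / 2.0) * z ** (-m))


def large_z_ratio(z, m):
    """J(z) divided by the exponentially small asymptotics (7.53)."""
    return J(z, m) / (2.0 * (m - 1) * z ** (-2) * math.exp(-z * z))


if __name__ == "__main__":
    for m in (3, 4):
        print(m, small_z_ratio(1e-3, m), large_z_ratio(6.0, m))
```

Both ratios approach 1 in their respective limits; the large-z ratio converges slowly, with a relative correction of order $(m/2 - 1)/z^2$ that follows from one more application of (7.52).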

Note that, for α = −1/2, one has $\mathrm{tr}_V\,T^2 = -2|\zeta|^2$, and hence $\mathrm{tr}_V\,\Gamma^i\Gamma^j = -2g^{ij}$ and $\mathrm{tr}_V\,\Gamma^2 = -2(m - 1)$. Thus, we see that the asymptotics (7.53) as z → ∞ corresponds to Eq. (3.88). The singularity at the point z = 0 results exactly from the pole at ω = i|ζ|. In the strongly elliptic case all poles lie on the positive imaginary line with Im ω < |ζ|, so that there is a finite gap between the pole located at the point with the largest value of the imaginary part and the point i|ζ|. Thus, we obtain
$$\mathrm{tr}_V\,U_0(t|x, x) = (4\pi t)^{-m/2}\left[c_0 + c_1 e^{-r^2/t} + 2(m - 1)\left(\frac{r}{\sqrt{t}}\right)^{-m}\Gamma(m/2, r^2/t)\right]. \tag{7.54}$$
Here the first term is the standard first term in the heat-kernel asymptotics which gives the familiar interior contribution when integrated over a compact manifold. The boundary terms in (7.54) are exponentially small as t → 0+, if r is kept fixed. Actually, these terms should behave, as t → 0+, as distributions near the boundary, so that they give well defined non-vanishing contributions (in form of integrals over the boundary) when integrated with a smooth function. The third term, however, gives rise to an unusual singularity at the boundary as r → 0 (on fixing t):
$$\mathrm{tr}_V\,U_0(t|x, x) \overset{r\to 0}{\sim} (4\pi)^{-m/2}\,2(m - 1)\Gamma(m/2)\frac{1}{r^m}. \tag{7.55}$$

This limit is non-standard in that: i) it does not depend on t and ii) it is not integrable over r near the boundary, as r → 0. This is a direct consequence of the lack of strong ellipticity.

8. Concluding Remarks

We have studied a generalized boundary-value problem for operators of Laplace type, where a part of the field is subject to Dirichlet boundary conditions, and the remaining part is subject to conditions which generalize the Robin case by the inclusion of tangential derivatives. The corresponding boundary operator can be always expressed in the form (2.25), where Λ is a tangential differential operator of the form (2.57). The fermionic analysis for Dirac-type operators has also been developed. The strong ellipticity of the resulting boundary-value problem is crucial, in particular, to ensure the existence of the asymptotic expansions frequently studied in the theory of heat kernels [1]. In physical problems, this means that the one-loop semi-classical expansions of the Green functions and of the effective action in quantum field theory are well defined and can be computed explicitly on compact Riemannian manifolds with smooth boundary. The occurrence of boundaries plays indeed a key role in the path-integral approach to quantum gravity [24], and it appears desirable to study the strong ellipticity problem for all gauge theories of physical interest, now that a unified scheme for the derivation of BRST-invariant boundary conditions is available [6]. We have thus tried to understand whether the following requirements are compatible:

(i) The gauge-field operator, F, should be of Laplace type (and the same for the ghost operator);
(ii) Local nature of the boundary operator $B_F$ (in fact, we have studied the case when $B_F$ is a first-order differential operator, and boundary projectors for fermionic fields);
(iii) Gauge invariance of the boundary conditions;
(iv) Strong ellipticity of the boundary-value problem $(F, B_F)$.

First, we have found a condition of strong ellipticity (see (2.80)) for a generalized boundary-value problem for a Laplace-type operator. For operators of Dirac type, one finds instead the strong ellipticity properties described in our Theorem 4. Second, we have constructed the resolvent kernel and the heat kernel in the leading approximation (Sects. 3.1 and 3.2) and evaluated the first non-trivial heat-kernel coefficient $A_{1/2}$ in Eqs. (3.96) and (3.99). Third, we have found a criterion of strong ellipticity of a general gauge theory in terms of the gauge generators (see (4.61)). As physical applications of the above results we have studied the Yang–Mills, Rarita–Schwinger and Einstein field theories. Interestingly, only in the latter the strong ellipticity condition is not satisfied if the conditions (i), (ii) and (iii) hold.
Moreover, the gauge-invariant boundary conditions for the Rarita–Schwinger system have been found to involve only the boundary projector for the ghost operator. As far as we know, our results as well as the consequent analysis of the physical gauge models, are completely original, or extend significantly previous work in the literature [3, 9, 11]. Since we find that, for gravitation, the four conditions listed above are not, in general, compatible, it seems that one should investigate in detail at least one of the following alternatives:

(1) Quantum gravity and quantum supergravity on manifolds with boundary with gauge-field operators which are non-minimal;
(2) Non-local boundary conditions for gravitation [25] and spin-3/2 fields [26];
(3) Non-gauge-invariant “regularization” of the boundary conditions, which suppresses the tangential derivatives and improves the ellipticity;
(4) Boundary conditions which are not completely gauge-invariant, and hence avoid the occurrence of tangential derivatives in the boundary operator.

The latter possibility has been investigated in [27] and has been widely used in the physical literature (see [28] and references therein). The first three options are not yet (completely) exploited in the literature. In particular, it is unclear whether one has to resort to non-minimal operators to preserve strong ellipticity of the boundary-value problem. Moreover, non-local boundary conditions which are completely gauge-invariant, compatible with local supersymmetry transformations at the boundary (cf. [27]), and ensure strong ellipticity, are not easily obtained (if at all admissible). Indeed, it should not be especially surprising that gauge theories have, in general, an essentially non-local character. Since the projectors on the physical gauge-invariant subspace of the configuration space are non-local, a consistent formulation of gauge theories on Riemannian manifolds (even without boundary) is necessarily non-local (see [23]). Thus, there is increasing evidence in favor of boundaries raising deep and unavoidable foundational issues for the understanding of modern quantum field theories [29]. The solution of such problems might in turn shed new light on spectral geometry and on the general theory of elliptic operators [1, 30, 31].

Acknowledgement. We are much indebted to Andrei Barvinsky, Thomas Branson, Stuart Dowker, Peter Gilkey, Alexander Kamenshchik, Klaus Kirsten and Hugh Osborn for correspondence and discussion of the generalized boundary conditions, and to Stephen Fulling for valuable remarks. The work of IA was supported by the Deutsche Forschungsgemeinschaft.

References

1. Gilkey, P.B.: Invariance Theory, the Heat Equation and the Atiyah-Singer Index Theorem. Boca Raton, FL: Chemical Rubber Company, 1995
2. Branson, T.P., Gilkey, P.B., Ørsted, B.: Proc. Am. Math. Soc. 109, 437 (1990)
3. McAvity, D.M., Osborn, H.: Class. Quantum Grav. 8, 1445 (1991)
4. Barvinsky, A.O.: Phys. Lett. B195, 344 (1987)
5. Avramidi, I.G., Esposito, G., Kamenshchik, A.Yu.: Class. Quantum Grav. 13, 2361 (1996)
6. Moss, I.G., Silva, P.J.: Phys. Rev. D55, 1072 (1997)
7. Berline, N., Getzler, E., Vergne, M.: Heat Kernels and Dirac Operators. Berlin: Springer, 1992
8. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, II: Fourier Analysis and Self-Adjointness. New York: Academic, 1975
9. Avramidi, I.G., Esposito, G.: Class. Quantum Grav. 15, 1141 (1998)
10. Gilkey, P.B., Smith, L.: J. Diff. Geom. 18, 393 (1983)
11. Dowker, J.S., Kirsten, K.: Class. Quantum Grav. 14, L169 (1997)
12. Avramidi, I.G., Esposito, G.: Class. Quantum Grav. 15, 281 (1998)
13. Avramidi, I.G.: Nucl. Phys. B355, 712 (1991)
14. Kirsten, K.: Class. Quantum Grav. 15, L5 (1998)
15. Seeley, R.T.: Topics in Pseudo-Differential Operators. In: C.I.M.E., Conference on Pseudo-Differential Operators, Roma: Edizioni Cremonese, 1969, p. 169
16. Seeley, R.T.: Am. J. Math. 91, 889 (1969)
17. Gilkey, P.B., Smith, L.: Comm. Pure Appl. Math. 36, 85 (1983)
18. Hörmander, L.: The Analysis of Linear Partial Differential Operators III. Grundlehren d. Mathem. 274. Berlin: Springer-Verlag, 1985
19. Booss-Bavnbek, B., Wojciechowski, K.P.: Elliptic Boundary Problems for Dirac Operators. Boston: Birkhäuser, 1993
20. Avramidi, I.G.: J. Math. Phys. 39, 2889 (1998)
21. Avramidi, I.G.: Yad. Fiz. 56, 245 (1993); Transl. in: Phys. Atom. Nucl. 56, 138 (1993)
22. DeWitt, B.S.: The Space-Time Approach to Quantum Field Theory. In: Relativity, Groups and Topology II, eds. B.S. DeWitt, R. Stora, Amsterdam: North-Holland, 1984, p. 381
23. Avramidi, I.G.: Int. J. Mod. Phys. A6, 1693 (1991)
24. Hawking, S.W.: The Path-Integral Approach to Quantum Gravity. In: General Relativity, an Einstein Centenary Survey, eds. S.W. Hawking and W. Israel, Cambridge: Cambridge University Press, 1979, p. 746
25. Marachevsky, V.N., Vassilevich, D.V.: Class. Quantum Grav. 13, 645 (1996)
26. D’Eath, P.D., Esposito, G.: Phys. Rev. D44, 1713 (1991)
27. Luckock, H.C.: J. Math. Phys. 32, 1755 (1991)
28. Esposito, G., Kamenshchik, A.Yu., Pollifrone, G.: Euclidean Quantum Gravity on Manifolds with Boundary, Fundamental Theories of Physics 85. Dordrecht: Kluwer, 1997
29. Vassilevich, D.V.: Phys. Lett. B421, 93 (1998)
30. Grubb, G.: Functional Calculus of Pseudodifferential Boundary Problems, Progress of Mathematics 65. Boston: Birkhäuser, 1996
31. Esposito, G.: Dirac Operators and Spectral Geometry. Cambridge Lecture Notes in Physics 12. Cambridge: Cambridge University Press, 1998

Communicated by H. Araki

Commun. Math. Phys. 200, 545 – 560 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Gerstenhaber Algebras and BV-Algebras in Poisson Geometry

Ping Xu*

Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected]

Received: 19 January 1998 / Accepted: 27 July 1998

Abstract: The purpose of this paper is to establish an explicit correspondence between various geometric structures on a vector bundle with some well-known algebraic structures such as Gerstenhaber algebras and BV-algebras. Some applications are discussed. In particular, we find an explicit connection between the Koszul–Brylinski operator and the modular class of a Poisson manifold. As a consequence, we prove that Poisson homology is isomorphic to Poisson cohomology for unimodular Poisson structures.

1. Introduction

BV-algebras arise from the BRST theory of topological field theory [30]. Recently, there has been a great deal of interest in these algebras in connection with various subjects such as operads and string theory [7, 8, 11, 18, 22, 25, 26, 31]. Let us first recall the relevant definitions, following the terminology of [14].

A Gerstenhaber algebra consists of a triple $(A = \oplus_{i\in\mathbb{Z}} A^i, \wedge, [\cdot,\cdot])$ such that $(A, \wedge)$ is a graded commutative associative algebra, $(A = \oplus_{i\in\mathbb{Z}} A^{(i)}, [\cdot,\cdot])$, with $A^{(i)} = A^{i+1}$, is a graded Lie algebra, and $[a, \cdot]$, for each $a \in A^{(i)}$, is a derivation with respect to $\wedge$ of degree $i$. An operator $D$ of degree $-1$ is said to generate the Gerstenhaber algebra bracket if for every $a \in A^{|a|}$ and $b \in A$,

$$[a, b] = (-1)^{|a|}\bigl(D(a \wedge b) - Da \wedge b - (-1)^{|a|} a \wedge Db\bigr). \tag{1}$$

A Gerstenhaber algebra is said to be exact if there is an operator $D$ of square zero generating the bracket. In this case, $D$ is called a generating operator. An exact Gerstenhaber algebra is also called a Batalin–Vilkovisky algebra (or BV-algebra for short).

* Research partially supported by NSF grants DMS95-04913 and DMS97-04391.

A differential Gerstenhaber algebra is a Gerstenhaber algebra equipped with a differential $d$, which is a derivation of degree 1 with respect to $\wedge$ and satisfies $d^2 = 0$. It is called a strong differential Gerstenhaber algebra if, in addition, $d$ is a derivation of the graded Lie bracket.

Kosmann–Schwarzbach noted [13] that these algebra structures had also appeared in Koszul's work [17] in 1985 in his study of Poisson manifolds. In fact they are connected with a certain differential structure on vector bundles, called Lie algebroids by Pradines [23]. Let us recall for the benefit of the reader the definition of a Lie algebroid [23, 24].

Definition 1.1. A Lie algebroid is a vector bundle $A$ over $M$ together with a Lie algebra structure on the space $\Gamma(A)$ of smooth sections of $A$, and a bundle map $a : A \to TM$ (called the anchor), extended to a map between sections of these bundles, such that
(i) $a([X, Y]) = [a(X), a(Y)]$; and
(ii) $[X, fY] = f[X, Y] + (a(X)f)Y$
for any smooth sections $X$ and $Y$ of $A$ and any smooth function $f$ on $M$.

Among many examples of Lie algebroids are the usual Lie algebras, the tangent bundle of a manifold, and an integrable distribution over a manifold (see [20]). In recent years, Lie algebroids have become increasingly interesting in Poisson geometry. One main reason for this is given by the following example. Let $P$ be a Poisson manifold with Poisson tensor $\pi$. Then $T^*P$ carries a natural Lie algebroid structure, called the cotangent Lie algebroid of the Poisson manifold $P$ [4]. The anchor map $\pi^\# : T^*P \to TP$ is defined by

$$\pi^\# : T_p^*P \longrightarrow T_pP : \quad \pi^\#(\xi)(\eta) = \pi(\xi, \eta), \quad \forall \xi, \eta \in T_p^*P, \tag{2}$$

and the Lie bracket of 1-forms $\alpha$ and $\beta$ is given by

$$[\alpha, \beta] = L_{\pi^\#(\alpha)}\beta - L_{\pi^\#(\beta)}\alpha - d\pi(\alpha, \beta). \tag{3}$$
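As a quick consistency check of Eq. (3), one can evaluate the bracket on exact 1-forms. The sketch below assumes the convention $\{f, g\} = \pi(df, dg)$ for the Poisson bracket and writes $X_f = \pi^\#(df)$, so that $X_f(g) = \{f, g\}$; these conventions are not fixed in the text above, and other sign conventions change the intermediate signs.

```latex
% Assumed conventions (not fixed by the text): {f,g} = \pi(df,dg), X_f = \pi^\#(df).
\begin{align*}
L_{\pi^\#(df)}\,dg &= d\bigl(X_f(g)\bigr) = d\{f,g\},\\
L_{\pi^\#(dg)}\,df &= d\bigl(X_g(f)\bigr) = d\{g,f\} = -\,d\{f,g\},\\
d\,\pi(df,dg) &= d\{f,g\},\\
[df,dg] &= d\{f,g\} + d\{f,g\} - d\{f,g\} = d\{f,g\}.
\end{align*}
```

Under these assumptions, $d : C^\infty(P) \to \Gamma(T^*P)$ carries the Poisson bracket of functions to the bracket (3) of 1-forms.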

In [13], Kosmann–Schwarzbach constructed various examples of strong differential Gerstenhaber algebras and BV-algebras in connection with Lie algebroids. Motivated by her work, in this paper we study the relation between these algebra structures and some of the well-known geometric structures in Poisson geometry. More precisely, we investigate the following question. Let $A$ be a vector bundle of rank $n$ over the base $M$, and let $\mathcal{A} = \oplus_{0 \le k \le n} \Gamma(\wedge^k A)$ be its corresponding exterior algebra. It is graded commutative. The question is: What additional structure on $A$ will make $\mathcal{A}$ into a Gerstenhaber algebra, a strong differential Gerstenhaber algebra, or an exact Gerstenhaber algebra (or a BV-algebra)?

The answer is surprisingly simple. Gerstenhaber algebras and strong differential Gerstenhaber algebras correspond exactly to the structures of Lie algebroids and Lie bialgebroids (see Sect. 2 for the definition), respectively, as already indicated in [13]. And an exact Gerstenhaber algebra structure corresponds to a Lie algebroid $A$ together with a flat $A$-connection on its canonical line bundle $\wedge^n A$. This fact was already implicitly contained in Koszul's work [17], although he only treated the case of multivector fields. However, the formulas (9) and (14) below establishing the explicit correspondence seem to be new. Below is a table of the correspondence.

Structures on the algebra $\mathcal{A}$          ↔   Structures on the vector bundle $A$
Gerstenhaber algebras                            ↔   Lie algebroids
Strong differential Gerstenhaber algebras        ↔   Lie bialgebroids
Exact Gerstenhaber algebras (BV-algebras)        ↔   Lie algebroids with a flat $A$-connection on $\wedge^n A$

The content above occupies Sects. 2 and 3. Sect. 4 is devoted to applications. In particular, we find an explicit connection between the Koszul–Brylinski operator on a Poisson manifold and its modular class. As a consequence, we prove that Poisson homology is isomorphic to Poisson cohomology for unimodular Poisson structures (see [3, 28] for the definition). As another application, we define Lie algebroid homology as the homology group of the complex $D : \Gamma(\wedge^* A) \to \Gamma(\wedge^{*-1} A)$ for a generating operator $D$. Since a generating operator depends on the choice of a flat $A$-connection $\nabla$ on the line bundle $\wedge^n A$, in general this homology depends on the choice of such a connection. When two connections are homotopic (see Sect. 4 for the precise definition), their corresponding homology groups are isomorphic. So a given Lie algebroid has homologies which are in fact parameterized by the first Lie algebroid cohomology $H^1(A, \mathbb{R})$. When $A$ is a Lie algebra and $\nabla$ is the trivial connection, this reduces to the usual Lie algebra homology with trivial coefficients. On the other hand, Poisson homology can also be considered as a special case of Lie algebroid homology, where $A$ is taken as the cotangent Lie algebroid of a Poisson manifold.

We note that in a recent paper [5], Evens, Lu and Weinstein have also established a connection between Poisson homology and the modular class of Poisson manifolds. Some results in this paper have recently been generalized by Huebschmann to the algebraic context of Lie-Rinehart algebras [9, 10].

2. Gerstenhaber Algebras and Differential Gerstenhaber Algebras

In this section, we treat Gerstenhaber algebras and differential Gerstenhaber algebras arising from a vector bundle. Again, let $A$ be a vector bundle of rank $n$ over $M$, and let $\mathcal{A} = \oplus_{0 \le k \le n} \Gamma(\wedge^k A)$. The following proposition establishes a one-to-one correspondence between Gerstenhaber algebra structures on $\mathcal{A}$ and Lie algebroid structures on the underlying vector bundle $A$.

Proposition 2.1. $\mathcal{A}$ is a Gerstenhaber algebra iff $A$ is a Lie algebroid.

This is a well-known result (see [6, 16, 21]). For completeness, we sketch a proof below.

Proof. Suppose that there is a graded Lie bracket $[\cdot,\cdot]$ that makes $\mathcal{A}$ into a Gerstenhaber algebra. It is clear that $(\Gamma(A), [\cdot,\cdot])$ is a Lie algebra. Second, for any $X \in \Gamma(A)$ and $f, g \in C^\infty(M)$, it follows from the derivation property that

$$[X, fg] = [X, f]g + f[X, g].$$

Hence, $[X, \cdot]$ defines a vector field on $M$, which will be denoted by $a(X)$. It is easy to see that $a$ is in fact induced by a bundle map from $A$ to $TM$. By applying the graded Jacobi identity, we find that $a([X, Y]) = [a(X), a(Y)]$. Finally, again from the derivation property, it follows that

$$[X, fY] = (a(X)f)Y + f[X, Y].$$

This shows that $A$ is indeed a Lie algebroid. Conversely, given a Lie algebroid $A$, it is easy to check that $\mathcal{A} = \oplus_{0 \le k \le n} \Gamma(\wedge^k A)$ forms a Gerstenhaber algebra (see [13, 21]).

The following lemma gives another characterization of a Lie algebroid, which should be of interest in itself. Recall that a differential graded algebra is a graded commutative associative algebra equipped with a differential $d$, which is a derivation of degree 1 and of square zero.

Lemma 2.2 ([16, 12]). Given a vector bundle $A$ over $M$, $A$ is a Lie algebroid iff $\oplus_k \Gamma(\wedge^k A^*)$ is a differential graded algebra.

Proof. Given a Lie algebroid $A$, it is known that $\oplus_k \Gamma(\wedge^k A^*)$ admits a differential $d$ that makes it into a differential graded algebra [16]. Here, $d : \Gamma(\wedge^k A^*) \to \Gamma(\wedge^{k+1} A^*)$ is simply the differential defining the Lie algebroid cohomology ([20, 21, 29]):

$$\begin{aligned}
d\omega(X_1, \dots, X_{k+1}) ={}& \sum_{i=1}^{k+1} (-1)^{i+1} a(X_i)\bigl(\omega(X_1, \dots, \hat{X}_i, \dots, X_{k+1})\bigr)\\
&+ \sum_{i<j} (-1)^{i+j} \omega([X_i, X_j], X_1, \dots, \hat{X}_i, \dots, \hat{X}_j, \dots, X_{k+1}),
\end{aligned} \tag{4}$$

for $\omega \in \Gamma(\wedge^k A^*)$, $X_i \in \Gamma(A)$, $1 \le i \le k+1$. Conversely, if $\oplus_k \Gamma(\wedge^k A^*)$ is a differential graded algebra with differential $d$, then the equations

$$a(X)f = \langle df, X\rangle, \tag{5}$$

$$\langle [X, Y], \theta\rangle = a(X)(\theta \cdot Y) - a(Y)(\theta \cdot X) - (d\theta)(X, Y), \tag{6}$$

$\forall f \in C^\infty(M)$, $X, Y \in \Gamma(A)$, and $\theta \in \Gamma(A^*)$, define a Lie algebroid structure on $A$.

Remark. The lemma above is essentially Proposition 6.1 of [16]. Equation (6) is Formula (6.6) in [16].

Recall that a Lie bialgebroid [13, 21] is a dual pair $(A, A^*)$ of vector bundles equipped with Lie algebroid structures such that the differential $d_*$, induced from the Lie algebroid structure on $A^*$ as defined by Eq. (4), is a derivation of the Lie bracket on $\Gamma(A)$, i.e.,

$$d_*[X, Y] = [d_*X, Y] + [X, d_*Y], \quad \forall X, Y \in \Gamma(A). \tag{7}$$
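To see what Eq. (4) says in the simplest case, take $A = \mathfrak{g}$ to be a Lie algebra, regarded as a Lie algebroid over a point, so that the anchor vanishes. The following is a sketch under that assumption only; $\theta \in \mathfrak{g}^*$ and $\omega \in \wedge^2 \mathfrak{g}^*$ are arbitrary.

```latex
% A = g over a point, anchor a = 0; only the second sum in Eq. (4) survives.
\begin{align*}
(d\theta)(X,Y) &= -\,\theta([X,Y]),\\
(d\omega)(X,Y,Z) &= -\,\omega([X,Y],Z) + \omega([X,Z],Y) - \omega([Y,Z],X),\\
(d^2\theta)(X,Y,Z) &= \theta\bigl([[X,Y],Z] + [[Y,Z],X] + [[Z,X],Y]\bigr) = 0 .
\end{align*}
```

The first line agrees with Eq. (6) when $a = 0$, and the last line shows that $d^2 = 0$ on $\mathfrak{g}^*$ is exactly the Jacobi identity, which is the content of Lemma 2.2 in this special case.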

The following result is due to Kosmann–Schwarzbach [13].

Proposition 2.3. $\mathcal{A}$ is a strong differential Gerstenhaber algebra iff $(A, A^*)$ is a Lie bialgebroid.

Proof. Assume that $\mathcal{A}$ is a strong differential Gerstenhaber algebra. Then $A^*$ is a Lie algebroid according to Lemma 2.2. Moreover, the derivation property of the differential with respect to the Lie bracket on $\Gamma(A)$ implies that $(A^*, A)$ is a Lie bialgebroid. By duality [21], this is equivalent to $(A, A^*)$ being a Lie bialgebroid. Conversely, it is straightforward to see, for a given Lie bialgebroid $(A, A^*)$, that $\mathcal{A}$ is a strong differential Gerstenhaber algebra (see [13]).

Example 2.4. Let $P$ be a Poisson manifold with Poisson tensor $\pi$. Let $A = TP$ with the standard Lie algebroid structure. It is well known that the space of multivector fields $\mathcal{A} = \oplus_k \Gamma(\wedge^k TP)$ has a Gerstenhaber algebra structure, where the graded Lie bracket is called the Schouten bracket. In 1977, Lichnerowicz introduced a differential $d = [\pi, \cdot] : \Gamma(\wedge^k TP) \to \Gamma(\wedge^{k+1} TP)$, which he used to define the Poisson cohomology [19]. It is obvious that $\mathcal{A}$ becomes a strong differential Gerstenhaber algebra, so it corresponds to a Lie bialgebroid structure on $(TP, T^*P)$ according to Proposition 2.3. It is not surprising that this Lie bialgebroid is just the standard Lie bialgebroid associated to a Poisson manifold [21], where the Lie algebroid structure on $T^*P$ is defined as in the introduction (see Eqs. (2) and (3)). It is, however, quite amazing that the Lie algebroid structure on $T^*P$ was not known until the mid-1980s (see [15] for the references), and the Lie bialgebroid structure came much later! For the Lie algebroid $T^*P$, the associated differential on $\oplus \Gamma(\wedge^* TP)$ is the Lichnerowicz differential $d = [\pi, \cdot]$. This property was proved independently by Bhaskara and Viswanath [1], and Kosmann–Schwarzbach and Magri [16].

3. Exact Gerstenhaber Algebras

In this section, we move to exact Gerstenhaber algebras arising from a vector bundle. Let $A \to M$ be a Lie algebroid with anchor $a$ and $E \to M$ a vector bundle over $M$. By an $A$-connection on $E$, we mean an $\mathbb{R}$-linear map

$$\Gamma(A) \otimes \Gamma(E) \longrightarrow \Gamma(E), \quad X \otimes s \longmapsto \nabla_X s,$$

satisfying axioms resembling those of the usual linear connections, i.e., $\forall f \in C^\infty(M)$, $X \in \Gamma(A)$, $s \in \Gamma(E)$,

$$\nabla_{fX} s = f\nabla_X s; \qquad \nabla_X(fs) = (a(X)f)s + f\nabla_X s.$$

Similarly, the curvature $R$ of an $A$-connection $\nabla$ is the element in $\Gamma(\wedge^2 A^*) \otimes \mathrm{End}(E)$ defined by

$$R(X, Y) = \nabla_X \nabla_Y - \nabla_Y \nabla_X - \nabla_{[X,Y]}, \quad \forall X, Y \in \Gamma(A). \tag{8}$$
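For a line bundle, Eq. (8) can be made explicit. The following sketch assumes that $E$ is a line bundle admitting a nowhere vanishing section $s$ and writes $\nabla_X s = \lambda(X)s$; the symbol $\lambda$ is introduced here for illustration only, and the first connection axiom makes it a section of $A^*$.

```latex
% Assumption: E is a line bundle with nowhere vanishing section s,
% and \nabla_X s = \lambda(X) s for some \lambda \in \Gamma(A^*).
\begin{align*}
\nabla_X\nabla_Y s &= \nabla_X\bigl(\lambda(Y)s\bigr)
   = \bigl(a(X)\lambda(Y)\bigr)s + \lambda(X)\lambda(Y)\,s,\\
R(X,Y)s &= \bigl(a(X)\lambda(Y) - a(Y)\lambda(X) - \lambda([X,Y])\bigr)s
   = (d\lambda)(X,Y)\,s,
\end{align*}
```

where $d$ is the differential of Eq. (4) with $k = 1$. In this situation the connection is flat exactly when $\lambda$ is $d$-closed.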

Given a Lie algebroid $A$ of rank $n$ and an $A$-connection $\nabla$ on the line bundle $\wedge^n A$, we define a differential operator $D : \Gamma(\wedge^k A) \to \Gamma(\wedge^{k-1} A)$ as follows. Let $U$ be any section in $\Gamma(\wedge^k A)$ and write, locally, $U = \omega \lrcorner \Lambda$, where $\omega \in \Gamma(\wedge^{n-k} A^*)$ and $\Lambda \in \Gamma(\wedge^n A)$. Set, for each $m \in M$,

$$DU|_m = -(-1)^{|\omega|}\Bigl(d\omega \lrcorner \Lambda + \sum_{i=1}^{n} (\alpha_i \wedge \omega) \lrcorner \nabla_{X_i}\Lambda\Bigr), \tag{9}$$

where $X_1, \dots, X_n$ is a basis of $A|_m$ and $\alpha_1, \dots, \alpha_n$ its dual basis in $A^*|_m$. Clearly, this definition is independent of the choice of the basis.

Remark. We would like to make a remark on the notation. Let $E$ be a vector bundle over $M$. Assume that $V \in \Gamma(\wedge^k E)$ and $\theta \in \Gamma(\wedge^l E^*)$ with $k \ge l$. Then, by $\theta \lrcorner V$ we denote the section of $\wedge^{k-l} E$ given by

$$(\theta \lrcorner V)(\omega) = V(\theta \wedge \omega), \quad \forall \omega \in \Gamma(\wedge^{k-l} E^*).$$

We will stick to this convention in the sequel no matter whether $E$ is $A$ itself or its dual $A^*$.

Proposition 3.1. The operator $D$ is well defined and

$$D^2 U = -R \lrcorner U,$$

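where $R \in \Gamma(\wedge^2 A^*)$ is the curvature of the connection $\nabla$ (note that $\mathrm{End}\,E$ is a trivial line bundle).

Proof. Assume that $f$ is any locally nonzero function on $M$, and $U = f\omega \lrcorner \frac{1}{f}\Lambda$. Then,

$$\begin{aligned}
& d(f\omega) \lrcorner \tfrac{1}{f}\Lambda + \sum_i (\alpha_i \wedge f\omega) \lrcorner \nabla_{X_i}\bigl(\tfrac{1}{f}\Lambda\bigr)\\
&\quad= \tfrac{1}{f}(df \wedge \omega) \lrcorner \Lambda + d\omega \lrcorner \Lambda + f\sum_i \bigl(a(X_i)\tfrac{1}{f}\bigr)(\alpha_i \wedge \omega) \lrcorner \Lambda + \sum_i (\alpha_i \wedge \omega) \lrcorner \nabla_{X_i}\Lambda\\
&\quad= \tfrac{1}{f}(df \wedge \omega) \lrcorner \Lambda + f\bigl(d(\tfrac{1}{f}) \wedge \omega\bigr) \lrcorner \Lambda + d\omega \lrcorner \Lambda + \sum_i (\alpha_i \wedge \omega) \lrcorner \nabla_{X_i}\Lambda\\
&\quad= d\omega \lrcorner \Lambda + \sum_i (\alpha_i \wedge \omega) \lrcorner \nabla_{X_i}\Lambda.
\end{aligned}$$

This shows that $D$ is well-defined. For the second part, we have

$$\begin{aligned}
D^2 U &= -(-1)^{|\omega|} D\Bigl(d\omega \lrcorner \Lambda + \sum_{i=1}^{n} (\alpha_i \wedge \omega) \lrcorner \nabla_{X_i}\Lambda\Bigr)\\
&= -\Bigl(\sum_i (\alpha_i \wedge d\omega) \lrcorner \nabla_{X_i}\Lambda + \sum_i d(\alpha_i \wedge \omega) \lrcorner \nabla_{X_i}\Lambda + \sum_{j,i} (\alpha_j \wedge \alpha_i \wedge \omega) \lrcorner \nabla_{X_j}\nabla_{X_i}\Lambda\Bigr)\\
&= -\Bigl(\sum_i (d\alpha_i \wedge \omega) \lrcorner \nabla_{X_i}\Lambda + \sum_{j,i} (\alpha_j \wedge \alpha_i \wedge \omega) \lrcorner \nabla_{X_j}\nabla_{X_i}\Lambda\Bigr)\\
&= -\Bigl[\sum_i \omega \lrcorner \bigl(d\alpha_i \lrcorner \nabla_{X_i}\Lambda\bigr) + \sum_{j,i} \omega \lrcorner \bigl((\alpha_i \wedge \alpha_j) \lrcorner \nabla_{X_i}\nabla_{X_j}\Lambda\bigr)\Bigr].
\end{aligned}$$

The conclusion thus follows from the following lemma.

Lemma 3.2.

$$\sum_i d\alpha_i \lrcorner \nabla_{X_i}\Lambda + \sum_{j,i} (\alpha_i \wedge \alpha_j) \lrcorner \nabla_{X_i}\nabla_{X_j}\Lambda = -R \lrcorner \Lambda.$$

Proof. It is a straightforward verification, and is left to the reader.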

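Proposition 3.3. Let $D : \Gamma(\wedge^k A) \to \Gamma(\wedge^{k-1} A)$ be the operator as defined in Eq. (9). Then $D$ generates the Gerstenhaber algebra bracket on $\oplus_k \Gamma(\wedge^k A)$, i.e., for any $U \in \Gamma(\wedge^u A)$ and $V \in \Gamma(\wedge^v A)$,

$$[U, V] = (-1)^u\bigl(D(U \wedge V) - DU \wedge V - (-1)^u U \wedge DV\bigr). \tag{10}$$

We need a couple of lemmas before proving this proposition.

Lemma 3.4. For any $U \in \Gamma(\wedge^u A)$, $V \in \Gamma(\wedge^v A)$ and $\theta \in \Gamma(\wedge^{u+v-1} A^*)$,

$$[U, V] \lrcorner \theta = (-1)^{(u-1)(v-1)}\, U \lrcorner d(V \lrcorner \theta) - V \lrcorner d(U \lrcorner \theta) - (-1)^{u+1} (U \wedge V) \lrcorner d\theta. \tag{11}$$

Proof. See Eq. (1.16) in [27].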

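Lemma 3.5. For any $U \in \Gamma(\wedge^u A)$ and $\theta \in \Gamma(\wedge^{u-1} A^*)$,

$$\theta \lrcorner DU = (-1)^{|\theta|} D(\theta \lrcorner U) + d\theta \lrcorner U. \tag{12}$$

Proof. Assume that $U = \omega \lrcorner \Lambda$. Then $\theta \lrcorner U = (\omega \wedge \theta) \lrcorner \Lambda$, and therefore,

$$\begin{aligned}
D(\theta \lrcorner U) &= -(-1)^{|\omega|+|\theta|}\Bigl(d(\omega \wedge \theta) \lrcorner \Lambda + \sum_i (\alpha_i \wedge \omega \wedge \theta) \lrcorner \nabla_{X_i}\Lambda\Bigr)\\
&= -(-1)^{|\omega|+|\theta|}\Bigl[(d\omega \wedge \theta) \lrcorner \Lambda + (-1)^{|\omega|}(\omega \wedge d\theta) \lrcorner \Lambda + \sum_i (\alpha_i \wedge \omega \wedge \theta) \lrcorner \nabla_{X_i}\Lambda\Bigr]\\
&= (-1)^{|\theta|}(\theta \lrcorner DU) - (-1)^{|\theta|}\, d\theta \lrcorner U.
\end{aligned}$$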

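Proof of Proposition 3.3. For any $U \in \Gamma(\wedge^u A)$, $V \in \Gamma(\wedge^v A)$ and $\theta \in \Gamma(\wedge^{u+v-1} A^*)$, using Eq. (12), we have

$$\theta \lrcorner D(U \wedge V) = (-1)^{|\theta|} D\bigl(\theta \lrcorner (U \wedge V)\bigr) + d\theta \lrcorner (U \wedge V).$$

On the other hand, we have

$$\theta \lrcorner (U \wedge DV) = (U \lrcorner \theta) \lrcorner DV = (-1)^{|\theta|-u} D\bigl((U \lrcorner \theta) \lrcorner V\bigr) + d(U \lrcorner \theta) \lrcorner V,$$

and

$$\begin{aligned}
\theta \lrcorner (DU \wedge V) &= (-1)^{(u-1)v}\, \theta \lrcorner (V \wedge DU)\\
&= (-1)^{(u-1)v}\bigl[(-1)^{|\theta|-v} D\bigl((V \lrcorner \theta) \lrcorner U\bigr) + d(V \lrcorner \theta) \lrcorner U\bigr]\\
&= (-1)^{uv+|\theta|} D\bigl((V \lrcorner \theta) \lrcorner U\bigr) + (-1)^{(u-1)v}\, d(V \lrcorner \theta) \lrcorner U.
\end{aligned}$$

It thus follows that

$$\begin{aligned}
&\theta \lrcorner \bigl[(-1)^u D(U \wedge V) - (-1)^u DU \wedge V - U \wedge DV\bigr]\\
&\quad= (-1)^{|\theta|+u} D\bigl(\theta \lrcorner (U \wedge V)\bigr) - (-1)^{uv+|\theta|+u} D\bigl((V \lrcorner \theta) \lrcorner U\bigr) - (-1)^{|\theta|-u} D\bigl((U \lrcorner \theta) \lrcorner V\bigr)\\
&\qquad+ (-1)^u\, d\theta \lrcorner (U \wedge V) + (-1)^{(u-1)(v-1)}\, d(V \lrcorner \theta) \lrcorner U - d(U \lrcorner \theta) \lrcorner V.
\end{aligned}$$

The conclusion thus follows from Eq. (11) and the identity:

$$\theta \lrcorner (U \wedge V) = (U \lrcorner \theta) \lrcorner V + (-1)^{uv} (V \lrcorner \theta) \lrcorner U. \tag{13}$$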
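A rank-one example gives a quick sanity check of Eqs. (9) and (10). The sketch below assumes $A = T\mathbb{R}$ with coordinate $x$, $\Lambda = \partial_x$, $\alpha_1 = dx$, and a connection $\nabla_{\partial_x}\partial_x = g\,\partial_x$ for some function $g$; all of these choices are illustrative and not taken from the text. With the pairing convention of the Remark above, $dx \lrcorner \partial_x = 1$.

```latex
% Assumptions: A = T\mathbb{R}, \Lambda = \partial_x, \nabla_{\partial_x}\partial_x = g\,\partial_x.
% A vector field U = h\,\partial_x is written U = \omega \lrcorner \Lambda with \omega = h, |\omega| = 0.
\begin{align*}
D(h\,\partial_x) &= -\bigl(dh \lrcorner \partial_x + (h\,dx)\lrcorner \nabla_{\partial_x}\partial_x\bigr)
                  = -(h' + g h),\\
Df &= 0 \quad \text{for functions } f \text{ (degree reasons)},\\
[h\,\partial_x, f] &= -\bigl(D(fh\,\partial_x) - D(h\,\partial_x)\,f + h\,\partial_x \wedge Df\bigr)\\
                   &= \bigl((fh)' + gfh\bigr) - (h' + gh)f = h f' .
\end{align*}
```

The result $h f' = a(h\,\partial_x)f$ is the expected bracket of a vector field with a function, and the connection coefficient $g$ drops out, in line with Proposition 3.3: different connections give different operators $D$ but the same bracket.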

Proposition 3.3 describes a construction from an $A$-connection to an operator $D$ generating the Gerstenhaber algebra bracket. This construction is in fact reversible. Namely, the connection $\nabla$ can also be recovered from the operator $D$. More precisely, we have

Proposition 3.6. Suppose that $D : \Gamma(\wedge^k A) \to \Gamma(\wedge^{k-1} A)$ is the operator corresponding to an $A$-connection $\nabla$ on $\wedge^n A$. Then, for any $X \in \Gamma(A)$ and $\Lambda \in \Gamma(\wedge^n A)$,

$$\nabla_X \Lambda = -X \wedge D\Lambda. \tag{14}$$

Proof. By definition, $D\Lambda = -\sum_i \alpha_i \lrcorner \nabla_{X_i}\Lambda$. Hence,

$$\begin{aligned}
-X \wedge D\Lambda &= \sum_i X \wedge (\alpha_i \lrcorner \nabla_{X_i}\Lambda)\\
&= \sum_i \alpha_i(X) \nabla_{X_i}\Lambda\\
&= \nabla_X \Lambda,
\end{aligned}$$

where the last equality uses the identity $X = \sum_i \alpha_i(X) X_i$, and the second equality follows from the following simple fact in linear algebra:

Lemma 3.7. Let $V$ be a vector space of dimension $n$, $X \in V$, $\alpha \in V^*$ and $\Lambda \in \wedge^n V$. Then $X \wedge (\alpha \lrcorner \Lambda) = \alpha(X)\Lambda$.

Now we are ready to prove the main theorem of the section.

Theorem 3.8. Let $A$ be a Lie algebroid with anchor $a$, and $\mathcal{A} = \oplus_k \Gamma(\wedge^k A)$ its corresponding Gerstenhaber algebra. There is a one-to-one correspondence between $A$-connections on the line bundle $\wedge^n A$ and linear operators $D$ generating the Gerstenhaber algebra bracket on $\mathcal{A}$. Under this correspondence, flat connections correspond to operators of square zero.

Proof. It remains to prove that Eq. (14) indeed defines an $A$-connection on $\wedge^n A$ if $D$ is an operator generating the Gerstenhaber algebra bracket. First, it is clear that, with this definition, $\nabla_{fX}\Lambda = f\nabla_X \Lambda$ for any $f \in C^\infty(M)$. To prove that it satisfies the second axiom of a connection, we observe that for any $f \in C^\infty(M)$ and $\Lambda \in \Gamma(\wedge^n A)$,

$$D(f\Lambda) = (Df)\Lambda + fD\Lambda + [f, \Lambda] = fD\Lambda + [f, \Lambda].$$

Hence,

$$\nabla_X(f\Lambda) = -X \wedge D(f\Lambda) = -X \wedge (fD\Lambda + [f, \Lambda]) = f\nabla_X \Lambda - X \wedge [f, \Lambda].$$

On the other hand, using the derivation property of Gerstenhaber algebras,

$$[f, X \wedge \Lambda] = [f, X] \wedge \Lambda + (-1) X \wedge [f, \Lambda] = -(a(X)f)\Lambda - X \wedge [f, \Lambda].$$

Since $X \wedge \Lambda = 0$ for degree reasons, the left-hand side vanishes. Thus, $X \wedge [f, \Lambda] = -(a(X)f)\Lambda$. Hence,

$$\nabla_X(f\Lambda) = f\nabla_X \Lambda + (a(X)f)\Lambda.$$
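Theorem 3.8 is easy to trace through when $A = \mathfrak{g}$ is a Lie algebra, viewed as a Lie algebroid over a point. The sketch below assumes this setting and parameterizes the $\mathfrak{g}$-connection on the line $\wedge^n \mathfrak{g}$ by a linear functional $\lambda \in \mathfrak{g}^*$ (a symbol introduced here for illustration), with $X_1, \dots, X_n$ a basis of $\mathfrak{g}$ and $\alpha_1, \dots, \alpha_n$ the dual basis.

```latex
% Assumption: A = g over a point, \nabla_X \Lambda = \lambda(X)\Lambda with \lambda \in g^*.
\begin{align*}
D\Lambda &= -\sum_i \alpha_i \lrcorner \nabla_{X_i}\Lambda
          = -\sum_i \lambda(X_i)\,\alpha_i \lrcorner \Lambda
          = -\lambda \lrcorner \Lambda,\\
-X \wedge D\Lambda &= X \wedge (\lambda \lrcorner \Lambda) = \lambda(X)\Lambda = \nabla_X\Lambda
          \qquad \text{(Lemma 3.7)},\\
R(X,Y)\Lambda &= \bigl(\lambda(X)\lambda(Y) - \lambda(Y)\lambda(X) - \lambda([X,Y])\bigr)\Lambda
          = -\lambda([X,Y])\,\Lambda .
\end{align*}
```

So, in this case, Eq. (14) recovers the connection from $D$, and by Theorem 3.8 the operator $D$ squares to zero exactly when $\lambda$ vanishes on $[\mathfrak{g}, \mathfrak{g}]$.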

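A flat $A$-connection always exists on the line bundle $\wedge^n A$. To see this, note that $\wedge^n A \otimes \wedge^n A$ is a trivial line bundle, which always admits a flat connection. So the "square root" of this connection (see Proposition 4.3 in [5]) is a flat connection of the kind we need. Therefore, for a given Lie algebroid, there always exists an operator of degree $-1$ and of square zero generating the corresponding Gerstenhaber algebra. Such an operator is called a generating operator.

Any $A$-connection $\nabla$ on $A$ induces an $A$-connection on the line bundle $\wedge^n A$. Therefore, it corresponds to a linear operator $D$ generating the Gerstenhaber algebra $\mathcal{A}$. In particular, if it is torsion free, i.e.,

$$\nabla_X Y - \nabla_Y X = [X, Y], \quad \forall X, Y \in \Gamma(A),$$

$D$ possesses a simpler expression. Note that $\nabla$ induces an $A$-connection on the exterior power $\wedge^k A$ and the dual bundle $A^*$ as well. We will denote them by the same symbol $\nabla$.

Proposition 3.9. Suppose that $\nabla$ is a torsion free $A$-connection on $A$. Let $D : \Gamma(\wedge^* A) \to \Gamma(\wedge^{*-1} A)$ be its induced operator. Then, for any $U \in \Gamma(\wedge^u A)$,

$$DU|_m = -\sum_i \alpha_i \lrcorner \nabla_{X_i} U, \tag{15}$$

where $X_1, \dots, X_n$ is a basis of $A|_m$ and $\alpha_1, \dots, \alpha_n$ the dual basis of $A^*|_m$.

Proof. Assume that $U = \omega \lrcorner \Lambda$ for some $\Lambda \in \Gamma(\wedge^n A)$ and $\omega \in \Gamma(\wedge^{n-u} A^*)$. Then,

$$\begin{aligned}
\sum_i \alpha_i \lrcorner \nabla_{X_i}(\omega \lrcorner \Lambda) &= \sum_i \alpha_i \lrcorner \bigl[\nabla_{X_i}\omega \lrcorner \Lambda + \omega \lrcorner \nabla_{X_i}\Lambda\bigr]\\
&= \sum_i \bigl[(\nabla_{X_i}\omega \wedge \alpha_i) \lrcorner \Lambda + (\omega \wedge \alpha_i) \lrcorner \nabla_{X_i}\Lambda\bigr]\\
&= (-1)^{|\omega|}\Bigl(\sum_i (\alpha_i \wedge \nabla_{X_i}\omega) \lrcorner \Lambda + \sum_i (\alpha_i \wedge \omega) \lrcorner \nabla_{X_i}\Lambda\Bigr).
\end{aligned}$$

The conclusion thus follows from the following

Lemma 3.10. For any $\omega \in \Gamma(\wedge^{|\omega|} A^*)$,

$$d\omega = \sum_i \alpha_i \wedge \nabla_{X_i}\omega.$$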

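Proof. Define an operator $\delta : \Gamma(\wedge^k A^*) \to \Gamma(\wedge^{k+1} A^*)$, for all $0 \le k \le n$, by

$$\delta\omega = \sum_i \alpha_i \wedge \nabla_{X_i}\omega.$$

It is simple to check that $\delta$ is a graded derivation with respect to the wedge product, i.e.,

$$\delta(\omega \wedge \theta) = \delta\omega \wedge \theta + (-1)^{|\omega|} \omega \wedge \delta\theta.$$

For any $f \in C^\infty(M)$,

$$\delta f = \sum_i \alpha_i \nabla_{X_i} f = \sum_i [a(X_i)f]\alpha_i = df.$$

For any $\theta \in \Gamma(A^*)$ and $X, Y \in \Gamma(A)$,

$$\begin{aligned}
(\delta\theta)(X, Y) &= \sum_i (\alpha_i \wedge \nabla_{X_i}\theta)(X, Y)\\
&= \sum_i \bigl[\alpha_i(X)(\nabla_{X_i}\theta)(Y) - \alpha_i(Y)(\nabla_{X_i}\theta)(X)\bigr]\\
&= \sum_i \alpha_i(X)\bigl(\nabla_{X_i}(\theta \cdot Y) - \theta \cdot \nabla_{X_i}Y\bigr) - \sum_i \alpha_i(Y)\bigl(\nabla_{X_i}(\theta \cdot X) - \theta \cdot \nabla_{X_i}X\bigr)\\
&= \sum_i \alpha_i(X)\bigl(a(X_i)(\theta \cdot Y) - \theta \cdot \nabla_{X_i}Y\bigr) - \sum_i \alpha_i(Y)\bigl(a(X_i)(\theta \cdot X) - \theta \cdot \nabla_{X_i}X\bigr)\\
&= a(X)(\theta \cdot Y) - \theta \cdot \nabla_X Y - a(Y)(\theta \cdot X) + \theta \cdot \nabla_Y X\\
&= a(X)(\theta \cdot Y) - a(Y)(\theta \cdot X) - \theta \cdot (\nabla_X Y - \nabla_Y X)\\
&= a(X)(\theta \cdot Y) - a(Y)(\theta \cdot X) - \theta \cdot [X, Y]\\
&= d\theta(X, Y).
\end{aligned}$$

Therefore, $\delta$ coincides with the exterior derivative $d$, since $\oplus_k \Gamma(\wedge^k A^*)$ is generated by $\Gamma(A^*)$ over the module $C^\infty(M)$.

Remark. (1) Theorem 3.8 was proved by Koszul [17] for the case of the tangent bundle Lie algebroid $TP$. In fact, his result was the main motivation of the present work. However, Koszul used an indirect argument instead of using Eqs. (9) and (14). We will see more applications of these equations in the next section.

(2) A flat $A$-connection on a vector bundle $E$ is also called a representation of the Lie algebroid by Mackenzie [20, 5].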

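We end this section by introducing the notion of generalized divergence. Let $\nabla$ be a flat $A$-connection on $\wedge^n A$, and $D$ its corresponding generating operator. For any section $X \in \Gamma(A)$, we use $\mathrm{div}_\nabla X$ to denote the function $DX$. When $A = TP$ with the usual Lie algebroid structure and $\nabla$ is the flat connection induced by a volume, $DX$ is the divergence in the ordinary sense. So $DX$ can indeed be considered as a generalized divergence. The following proposition gives a simple geometric characterization for the divergence of a section $X \in \Gamma(A)$.

Proposition 3.11. For any $X \in \Gamma(A)$ and $\Lambda \in \Gamma(\wedge^n A)$,

$$L_X \Lambda - \nabla_X \Lambda = (\mathrm{div}_\nabla X)\Lambda.$$

In other words, the function $\mathrm{div}_\nabla X$ is the multiplier corresponding to the endomorphism $L_X - \nabla_X$ of the line bundle $\wedge^n A$.

Proof. Assume that $X = \omega \lrcorner \Lambda$ for some $\omega \in \Gamma(\wedge^{n-1} A^*)$. Then,

$$DX = -(-1)^{|\omega|}\Bigl(d\omega \lrcorner \Lambda + \sum_{i=1}^{n} (\alpha_i \wedge \omega) \lrcorner \nabla_{X_i}\Lambda\Bigr).$$

Now

$$\begin{aligned}
\sum_{i=1}^{n} \bigl((\alpha_i \wedge \omega) \lrcorner \nabla_{X_i}\Lambda\bigr)\Lambda &= \sum_{i=1}^{n} \bigl((\alpha_i \wedge \omega) \lrcorner \Lambda\bigr)\nabla_{X_i}\Lambda\\
&= \sum_i (-1)^{|\omega|}\bigl((\omega \wedge \alpha_i) \lrcorner \Lambda\bigr)\nabla_{X_i}\Lambda\\
&= \sum_i (-1)^{|\omega|}(\alpha_i \lrcorner X)\nabla_{X_i}\Lambda\\
&= \sum_i (-1)^{|\omega|} X(\alpha_i)\nabla_{X_i}\Lambda\\
&= (-1)^{|\omega|}\nabla_X \Lambda.
\end{aligned}$$

Let $\theta \in \Gamma(\wedge^n A^*)$ be the dual element of $\Lambda$. It follows from Eq. (11) that

$$[X, \Lambda] \lrcorner \theta = -\Lambda \lrcorner d(X \lrcorner \theta).$$

It is simple to see that $X \lrcorner \theta = (\omega \lrcorner \Lambda) \lrcorner \theta = (-1)^{|\omega|(n-|\omega|)}\omega$. Since $|\omega| = n - 1$, we have $X \lrcorner \theta = (-1)^{|\omega|}\omega$, and $[X, \Lambda] \lrcorner \theta = -(-1)^{|\omega|} d\omega \lrcorner \Lambda$. Hence, $(DX)\Lambda = [X, \Lambda] - \nabla_X \Lambda = L_X \Lambda - \nabla_X \Lambda$. This concludes the proof of the proposition.

4. Lie Algebroid Homology

Let $A$ be a Lie algebroid, and $\nabla$ a flat $A$-connection on the line bundle $\wedge^n A$. Let $D$ be its corresponding generating operator and $\partial = (-1)^{n-k} D : \Gamma(\wedge^k A) \to \Gamma(\wedge^{k-1} A)$ (the reason for choosing this sign in the definition of $\partial$ will become clear later; see Eq. (17)). Then $\partial^2 = 0$, and we obtain a chain complex. Let $H_*(A, \nabla)$ denote its homology:

$$H_*(A, \nabla) = \ker \partial / \mathrm{Im}\, \partial.$$

Since $D$ is a derivation with respect to $[\cdot,\cdot]$, we immediately have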

Proposition 4.1. The Schouten bracket passes to the homology $H_*(A, \nabla)$.

Since this homology depends on the choice of the connection $\nabla$, it is natural to ask how $H_*(A, \nabla)$ changes according to the connection $\nabla$.

Proposition 4.2. Suppose that both $\tilde{D}$ and $D$ are operators generating the Gerstenhaber bracket on $\mathcal{A}$. Then $\tilde{D} - D = i_\alpha$ for some $\alpha \in \Gamma(A^*)$. And $\tilde{D}^2 - D^2 = -i_{d\alpha}$. In particular, if $\tilde{D}^2 = D^2 = 0$, then $\alpha \in \Gamma(A^*)$ is closed.

Proof. Let $\tilde{\nabla}$ and $\nabla$ be the $A$-connections on $\wedge^n A$ corresponding to $\tilde{D}$ and $D$ respectively. Then there exists $\alpha \in \Gamma(A^*)$ such that

$$\tilde{\nabla}_X s = \nabla_X s + \langle \alpha, X\rangle s, \quad \forall s \in \Gamma(\wedge^n A).$$

It follows from a direct verification that $\tilde{D} = D - i_\alpha$. According to Proposition 3.1, we have $\tilde{D}^2 U - D^2 U = -(\tilde{R} - R) \lrcorner U$, where $\tilde{R}$ and $R$ are the curvatures of $\tilde{\nabla}$ and $\nabla$, respectively. Finally, it is routine to check that $\tilde{R} - R = d\alpha$.

Definition 4.3. $A$-connections $\nabla_1$ and $\nabla_2$ are said to be homotopic if they differ by an exact form in $\Gamma(A^*)$. Similarly, two generating operators $D_1$ and $D_2$ are said to be homotopic if they differ by an exact form, i.e., $D_1 - D_2 = i_\alpha$ for some exact form $\alpha \in \Gamma(A^*)$.

The following result is thus immediate.

Proposition 4.4. Let $\nabla_1$ and $\nabla_2$ be two flat $A$-connections on the canonical line bundle $\wedge^n A$, and $D_1$ and $D_2$ their corresponding generating operators. If $\nabla_1$ and $\nabla_2$ are homotopic (or equivalently $D_1$ and $D_2$ are homotopic), then

$$H_*(A, \nabla_1) \cong H_*(A, \nabla_2). \tag{16}$$

Now let us assume that $\wedge^n A$ is a trivial bundle, so there exists a nowhere vanishing volume $\Lambda \in \Gamma(\wedge^n A)$. This volume induces a flat $A$-connection $\nabla_0$ on $\wedge^n A$ simply by $(\nabla_0)_X \Lambda = 0$ for all $X \in \Gamma(A)$. Let $D_0$ be its corresponding generating operator. Note that $\Lambda$ being horizontal is equivalent to the condition $D_0 \Lambda = 0$.

Suppose that $\Lambda'$ is another nonvanishing volume, and $\nabla'$ its corresponding flat connection on $\wedge^n A$. Assume that $\Lambda' = f\Lambda$ for some positive $f \in C^\infty(M)$. Then it is easy to see that $\nabla'_X s = (\nabla_0)_X s - \langle d \ln f, X\rangle s$. In other words, their corresponding generating operators are homotopic.

Let us now fix such a volume $\Lambda \in \Gamma(\wedge^n A)$. Define a $*$-operator from $\Gamma(\wedge^k A^*)$ to $\Gamma(\wedge^{n-k} A)$ by $*\omega = \omega \lrcorner \Lambda$. Clearly $*$ is an isomorphism. The following proposition follows immediately from the definition of $*$.

Proposition 4.5. The operator $\partial_0 = (-1)^{n-k} D_0$ equals $-{*} \circ d \circ {*}^{-1}$. That is,

$$\partial_0 = -{*} \circ d \circ {*}^{-1}. \tag{17}$$

Here $d$ is the Lie algebroid cohomology differential (see Eq. (4)). Thus, as a consequence, we have

Theorem 4.6. Let $\nabla_0$ be an $A$-connection on $\wedge^n A$ which admits a global nowhere vanishing horizontal section $\Lambda \in \Gamma(\wedge^n A)$. Then

$$H_*(A, \nabla_0) \cong H^{n-*}(A, \mathbb{R}).$$

Remark. From the discussion above we see that there is a family of Lie algebroid homologies parameterized by the first Lie algebroid cohomology. In the case that the line bundle $\wedge^n A$ is trivial, one of these homologies is isomorphic to the Lie algebroid cohomology with trivial coefficients, and the rest of them can be considered as Lie algebroid cohomology with twisted coefficients. In general, these are special cases of Lie algebroid cohomology with general coefficients in a line bundle (see [20, 5]).

We will discuss two special cases below. Let $\mathfrak{g}$ be an $n$-dimensional Lie algebra. Then $\wedge^n \mathfrak{g}$ is one-dimensional, so it admits a trivial $\mathfrak{g}$-connection, which in turn induces a generating operator $D_0 : \wedge^* \mathfrak{g} \to \wedge^{*-1} \mathfrak{g}$. On the other hand, there exists another standard operator $D : \wedge^* \mathfrak{g} \to \wedge^{*-1} \mathfrak{g}$, namely the dual of the Lie algebra cohomology differential. In general, $D$ and $D_0$ are different. In fact, it is easy to check that $D - D_0 = i_\alpha$, where $\alpha$ is the modular character of the Lie algebra. In particular, when $\mathfrak{g}$ is a unimodular Lie algebra, the Lie algebra homology is isomorphic to Lie algebra cohomology, a well-known result.

Another interesting case, which does not seem trivial, is when $A$ is the cotangent Lie algebroid $T^*P$ of a Poisson manifold $P$ (see Eqs. (2) and (3)). In this case, $\Gamma(\wedge^k T^*P) = \Omega^k(P)$. There is a well-known operator $D : \Omega^k(P) \to \Omega^{k-1}(P)$ due to Koszul [17] and Brylinski [2], given by $D = [i_\pi, d]$. The corresponding homology is called Poisson homology, and is denoted by $H_*(P, \pi)$. It was shown in [17] that $D$ indeed generates the Gerstenhaber bracket on $\Omega^*(P)$ induced from the cotangent Lie algebroid of $P$. Therefore, it corresponds to a flat Lie algebroid connection on $\wedge^n T^*P$. According to Eq. (14), this connection has the form

$$\nabla_\theta \Omega = -\theta \wedge D\Omega = \theta \wedge d(\pi \lrcorner \Omega), \tag{18}$$

for any $\theta \in \Omega^1(P)$ and $\Omega \in \Omega^n(P)$. We note that a similar formula was also discovered independently by Evens–Lu–Weinstein [5].

The Koszul–Brylinski operator $D$ is intimately related to the so-called modular class of the Poisson manifold, a classical analogue of the modular form of a von Neumann algebra, which was introduced recently by Weinstein [28], and independently by Brylinski and Zuckerman [3]. Let us briefly recall its definition below. For simplicity, we assume that $P$ is orientable with a volume form $\Omega$. The modular vector field $\nu_\Omega$ is the vector field defined by $f \mapsto (L_{X_f}\Omega)/\Omega$, $\forall f \in C^\infty(P)$. It is easily shown that the above map satisfies the Leibniz rule, so it indeed defines a vector field on $P$. It can also be shown that $\nu_\Omega$ preserves the Poisson structure; in other words, it is a Poisson vector field. When we change the volume $\Omega$, the corresponding modular vector fields differ by a Hamiltonian vector field. Therefore it defines an element in the first Poisson cohomology $H_\pi^1(P)$, which is called the modular class of the Poisson manifold. A Poisson manifold is called unimodular if its modular class vanishes. In fact, the modular class can be defined for any Poisson manifold by just replacing the volume form by a positive density. We refer the interested reader to [28] for more detail.
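A small explicit case may help make the definition of the modular vector field concrete. The sketch below is an illustration only: it assumes $P = \mathbb{R}^2$ with $\pi = x\,\partial_x \wedge \partial_y$ (so $\{x, y\} = x$), the volume form $\Omega = dx \wedge dy$, and the convention $X_f(g) = \{f, g\} = \pi(df, dg)$; none of these choices is fixed by the text, and the opposite sign convention for $X_f$ flips the sign of the result.

```latex
% Assumptions: P = \mathbb{R}^2, \pi = x\,\partial_x\wedge\partial_y, \Omega = dx\wedge dy,
% X_f(g) = \{f,g\} = x\,(f_x g_y - f_y g_x).
\begin{align*}
X_f &= x\,(f_x\,\partial_y - f_y\,\partial_x),\\
L_{X_f}\Omega &= \bigl(\partial_x(-x f_y) + \partial_y(x f_x)\bigr)\,\Omega
              = \bigl(-f_y - x f_{xy} + x f_{xy}\bigr)\,\Omega = -f_y\,\Omega,\\
\nu_\Omega &= -\,\partial_y .
\end{align*}
```

A Hamiltonian $f$ with $X_f = -\partial_y$ would need $x f_x = -1$, which fails at $x = 0$ for any smooth $f$. So, under these assumptions, $\nu_\Omega$ is a Poisson vector field that is not globally Hamiltonian, and this Poisson structure is not unimodular.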

Now let $P$ be an orientable Poisson manifold with volume form $\Omega$, and let $D_0$ be its corresponding generating operator as in the observation preceding Proposition 4.5. The following result follows immediately from a direct verification.

Proposition 4.7. Let $D$ be the Koszul–Brylinski operator of a Poisson manifold $P$. Then $D - D_0 = i_{\nu_\Omega}$, where $\nu_\Omega$ is the modular vector field corresponding to the volume $\Omega$.

As an immediate consequence, we have

Theorem 4.8. If $P$ is an orientable unimodular Poisson manifold, then

$$H_*(P, \pi) \cong H_\pi^{n-*}(P).$$

In particular, this result holds for any symplectic manifold, which was first proved by Brylinski [2].

Remark. The above situation can be generalized to the case of triangular Lie bialgebroids. Let $A$ be a Lie algebroid with anchor $a$. A triangular r-matrix is a section $\pi$ in $\Gamma(\wedge^2 A)$ satisfying the condition $[\pi, \pi] = 0$. One may think of this as a sort of generalized "Poisson structure" on the generalized manifold $A$. In this case, $A^*$ is equipped with a Lie algebroid structure with the anchor $a \circ \pi^\#$ and the Lie bracket defined by an equation identical to the one defining the Lie bracket on one-forms of a Poisson manifold. Similarly, $D = [i_\pi, d] : \Gamma(\wedge^k A^*) \to \Gamma(\wedge^{k-1} A^*)$ is an operator of square zero and generates the Gerstenhaber bracket $[\cdot,\cdot]$ on $\oplus_k \Gamma(\wedge^k A^*)$. A form of top degree $\Omega \in \Gamma(\wedge^n A^*)$ satisfies the condition $D\Omega = 0$ iff $\pi \lrcorner \Omega \in \Gamma(\wedge^{n-2} A^*)$ is closed. If there exists such a nowhere vanishing form, the homology $H_*(A, \nabla)$ is then isomorphic to the cohomology $H^{n-*}(A, \mathbb{R})$.

5. Discussions

We end this paper with a list of open questions.

Question 1. In the above remark, is the condition that $\pi \lrcorner \Omega \in \Gamma(\wedge^{n-2} A^*)$ is closed equivalent to the Lie algebroid $A^*$ being unimodular?

Question 2. For a general Lie algebroid $A$, does there exist a canonical generating operator corresponding to the modular class of the Lie algebroid, in analogy with the case of the cotangent Lie algebroid of a Poisson manifold (see Proposition 4.7)?

Question 3. For a Poisson manifold $P$, there is a family of homologies parameterized by the first Poisson cohomology $H_\pi^1(P)$. What is the meaning of the rest of the homologies besides the Poisson homology?

Question 4. Suppose that $(A, A^*)$ is a Lie bialgebroid and $\nabla$ a flat $A$-connection on $\wedge^n A$. Then $(\Gamma(\wedge^* A), \wedge, d_*, [\,,], D)$ is a strong differential BV-algebra. It is clear that $d_* D + D d_*$ is a derivation with respect to both $\wedge$ and $[\,,]$. When is $d_* D + D d_*$ inner and, in particular, when is $d_* D + D d_* = 0$?

For the Lie bialgebroid $(T^*P, TP)$ of a Poisson manifold, we may take the connection $\nabla$ as in Eq. (18). Then $d_*$ is the usual de Rham differential and $D$ is the Koszul–Brylinski operator. Thus, $d_* D + D d_*$ is automatically zero, which gives rise to the Brylinski double complex [2]. On the other hand, if we switch the order and consider the Lie bialgebroid $(TP, T^*P)$ for a Poisson manifold $P$ with a volume, then $\mathcal{A} = \oplus_k \Gamma(\wedge^k A)$ is the space of multivector fields. In this case, $d_* = [\pi, \cdot]$ is the Lichnerowicz Poisson cohomology differential, and $D = -(-1)^{n-k}\, {*} \circ d \circ {*}^{-1}$. Here $*$ is the isomorphism between the space of multivector fields and that of differential forms induced by the volume element. Then $d_* D + D d_* = L_X$, where $X$ is the modular vector field of the Poisson manifold (see p. 265 of [17]). So it vanishes iff $P$ is unimodular.

Acknowledgement. The author would like to thank Jean-Luc Brylinski, Jiang-hua Lu, and Alan Weinstein for useful discussions. He is especially grateful to Yvette Kosmann–Schwarzbach and Jim Stasheff for providing many helpful comments on the first draft of the manuscript. In addition to the funding sources mentioned in the first footnote, he would also like to thank the IHES and Max-Planck Institut for their hospitality and financial support while part of this project was being done.

References
1. Bhaskara, K., and Viswanath, K.: Calculus on Poisson manifolds. Bull. London Math. Soc. 20, 68–72 (1988)
2. Brylinski, J.-L.: A differential complex for Poisson manifolds. J. Diff. Geom. 28, 93–114 (1988)
3. Brylinski, J.-L., Zuckerman, G.: The outer derivation of a complex Poisson manifold. J. Reine Angew. Math., to appear
4. Coste, A., Dazord, P. and Weinstein, A.: Groupoïdes symplectiques, Publications du Département de Mathématiques de l'Université de Lyon, I, 2/A, 1987, pp. 1–65
5. Evens, S., Lu, J.-H., and Weinstein, A.: Transverse measures, the modular class, and a cohomology pairing for Lie algebroids. Quart. J. Math. Oxford, to appear
6. Gerstenhaber, M. and Schack, S. D.: Algebras, bialgebras, quantum groups and algebraic deformations. Contemp. Math. 134, Providence, RI: AMS, 1992, pp. 51–92
7. Getzler, E.: Batalin–Vilkovisky algebras and two-dimensional topological field theories. Commun. Math. Phys. 159, 265–285 (1994)
8. Getzler, E., and Kapranov, M. M.: Cyclic operads and cyclic homology. In: Geometry, Topology and Physics for Raoul Bott, S.-T. Yau ed., 1995, pp. 167–201
9. Huebschmann, J.: Duality for Lie-Rinehart algebras and the modular class. J. Reine Angew. Math., to appear
10. Huebschmann, J.: Lie-Rinehart algebras, Gerstenhaber algebras, and BV algebras. Ann. Inst. Fourier (Grenoble) 48, 425–440 (1998)
11. Kimura, T., Voronov, A. and Stasheff, J.: On operad structures of moduli spaces and string theory. Commun. Math. Phys. 171, 1–25 (1995)
12. Kontsevich, M.: Course on deformation theory. UC Berkeley, 1994
13. Kosmann–Schwarzbach, Y.: Exact Gerstenhaber algebras and Lie bialgebroids. Acta Appl. Math. 41, 153–165 (1995)
14. Kosmann–Schwarzbach, Y.: Graded Poisson brackets and field theory. In: Modern group theoretical methods in physics, J. Bertrand et al. (eds.), 1995, pp. 189–196
15. Kosmann–Schwarzbach, Y.: The Lie bialgebroid of a Poisson–Nijenhuis manifold. Lett. Math. Phys. 38, 421–428 (1996)
16. Kosmann–Schwarzbach, Y., and Magri, F.: Poisson–Nijenhuis structures. Ann. Inst. H. Poincaré Phys. Théor. 53, 35–81 (1990)
17. Koszul, J.-L.: Crochet de Schouten–Nijenhuis et cohomologie. Astérisque, numéro hors série, 257–271 (1985)
18. Lian, B.H. and Zuckerman, G.J.: New perspectives on the BRST-algebraic structure of string theory. Commun. Math. Phys. 154, 613–646 (1993)
19. Lichnerowicz, A.: Les variétés de Poisson et leurs algèbres de Lie associées. J. Diff. Geom. 12, 253–300 (1977)
20. Mackenzie, K.: Lie Groupoids and Lie Algebroids in Differential Geometry. LMS Lecture Notes Series, 124, Cambridge: Cambridge Univ. Press, 1987
21. Mackenzie, K. and Xu, P.: Lie bialgebroids and Poisson groupoids. Duke Math. J. 73, 415–452 (1994)
22. Penkava, M. and Schwarz, A.: On some algebraic structures arising in string theory. Perspectives in Mathematical Physics. In: Conf. Proc. Lecture Notes Math. Phys. III, Cambridge, MA: Internat. Press, 1994
23. Pradines, J.: Théorie de Lie par les groupoïdes différentiables. C. R. Acad. Sci. Paris, Série A 267, 245–248 (1968)
24. Pradines, J.: Troisième théorème de Lie pour les groupoïdes différentiables. C. R. Acad. Sci. Paris, Série A 267, 21–23 (1968)
25. Stasheff, J.: Differential graded Lie algebras, quasi-Hopf algebras and higher homotopy algebras. Lecture Notes in Math. 1510, pp. 120–137
26. Stasheff, J.: From operads to 'physically' inspired theories. In: Operads: Proceedings of Renaissance Conference, Hartford, CT/Luminy, 1995, Contemp. Math. 202, pp. 9–14
27. Vaisman, I.: Lectures on the geometry of Poisson manifolds. PM 118, Basel–Boston–Berlin: Birkhäuser, 1994
28. Weinstein, A.: The modular automorphism group of a Poisson manifold. J. Geom. Phys. 23, 379–394 (1997)
29. Weinstein, A. and Xu, P.: Extensions of symplectic groupoids and quantization. J. Reine Angew. Math. 417, 159–189 (1991)
30. Witten, E.: A note on antibracket formalism. Modern Physics Letters A 5, 487–494 (1990)
31. Zwiebach, B.: Closed string field theory: Quantum action and the BV master equation. Nucl. Phys. B 390, 33–152 (1993)

Communicated by G. Felder

Commun. Math. Phys. 200, 561 – 598 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Cohomology of Conformal Algebras*
Bojko Bakalov, Victor G. Kac, Alexander A. Voronov
Department of Mathematics, M.I.T., 77 Massachusetts Ave., Cambridge, MA 02139-4307, USA
Received: 10 March 1998 / Accepted: 27 July 1998

To Bertram Kostant on his seventieth birthday

Abstract: The notion of a conformal algebra encodes an axiomatic description of the operator product expansion of chiral fields in conformal field theory. On the other hand, it is an adequate tool for the study of infinite-dimensional Lie algebras satisfying the locality property. The main examples of such Lie algebras are those "based" on the punctured complex plane, such as the Virasoro algebra and loop Lie algebras. In the present paper we develop a cohomology theory of conformal algebras with coefficients in an arbitrary module. It possesses standard properties of cohomology theories; for example, it describes extensions and deformations. We offer explicit computations for the most important examples.

Contents
Introduction
1 Preliminaries on Conformal Algebras and Modules
2 Basic Definitions
3 Extensions and Deformations
4 Homology
5 Exterior Multiplication, Contraction, and Module Structure
6 Cohomology of Conformal Algebras and Their Annihilation Lie Algebras
6.1 Cohomology of the basic complex
6.2 Cohomology of the reduced complex
6.3 Cohomology of conformal algebras and formal distribution Lie algebras
7 Cohomology of the Virasoro Conformal Algebra
7.1 Cohomology of Vir with trivial coefficients
7.2 Cohomology of Vir with coefficients in $M_{\Delta,\alpha}$
8 Cohomology of Current Conformal Algebras
8.1 Cohomology with trivial coefficients
8.2 Cohomology with coefficients in a current module
9 Hochschild, Cyclic, and Leibniz Cohomology
9.1 Hochschild cohomology
9.2 Cyclic cohomology
9.3 Leibniz cohomology
10 Generalization to Conformal Algebras in Higher Dimensions
11 Higher Differentials
12 Relation to Lie* Algebras
13 Open Problems

* Research of Bakalov and Kac was supported in part by NSF grant #DMS-9622870 and of Voronov by an AMS Centennial Fellowship.

Introduction The notion of a conformal algebra encodes an axiomatic description of the operator product expansion of chiral fields in conformal field theory. On the other hand, it is an adequate tool for the study of infinite-dimensional Lie algebras satisfying the locality property [K2]–[K4, DK]. Likewise, conformal modules over a conformal algebra A correspond to conformal modules over the associated Lie algebra Lie A [CK]. The main examples of Lie algebras Lie A are the Lie algebras “based” on the punctured complex plane C× , namely the Lie algebra VectC× of vector fields on C× (= Virasoro algebra) and the Lie algebra of maps of C× to a finite-dimensional Lie algebra (= loop algebra). Their irreducible conformal modules are the spaces of densities on C× and loop modules, respectively, [CK]. Since complete reducibility does not hold in this case (cf. [F, CKW]), one may expect that their cohomology theory is very interesting. In the present paper we develop a cohomology theory of conformal algebras with coefficients in an arbitrary module. We introduce the basic and the reduced complexes, the latter being a quotient of the former. The basic complex turns out to be isomorphic to the Lie algebra complex for the so-called annihilation subalgebra (Lie A)− of Lie A. For the main examples the annihilation subalgebra turns out to be its complex-plane counterpart (i.e., C× is replaced by C). The cohomology of these Lie algebras has been extensively studied in [GF1, GF2, FF, Fe1, F, Fe2]. This allows us to compute the cohomology of the conformal algebra A, which in its turn captures the main features of the cohomology of the Lie algebra Lie A. As a byproduct of our considerations, we compute the cohomology of a current Lie algebra on C with values in an irreducible highest-weight module (see Theorem 8.2), which has been known only when the module is trivial [Fe1]. 
The first cohomology theory in the context of operator product expansion was the cohomology theory of vertex algebras and conformal field theories introduced in [KV]. The cohomology theory of the present paper relates to the cohomology theory of [KV] as much as the Chevalley–Eilenberg cohomology of Lie algebras relates to the Hochschild (or more exactly, Harrison) cohomology of commutative associative algebras. The two theories possess standard properties of cohomology theories. For example, the cohomology of [KV] describes deformations of vertex algebras, and the cohomology of this paper describes the same for conformal algebras. However, the cohomology of [KV] is hard to compute, whereas this paper offers the computation of cohomology in most of the important examples. The paper is organized as follows. In Sect. 1 we recall the definition of a conformal algebra and of a (conformal) module over it and describe their relation to formal distribution Lie algebras and conformal modules.

In Sect. 2 we construct the basic complex $\tilde{C}^\bullet(A, M)$ and its quotient, the reduced complex $C^\bullet(A, M)$, for a module M over a conformal algebra A. These complexes define the basic and reduced cohomology of a conformal algebra A. In Sect. 3 we show that this cohomology parameterizes A-module extensions, abelian conformal-algebra extensions, first-order deformations, etc. (Theorem 3.1). In Sect. 4 we construct the dual, homology complexes. In Sect. 5 we define the exterior multiplication, contraction and module structure for the basic complex. In Sect. 6 we prove that the basic complex is isomorphic to the Lie algebra complex of the annihilation algebra (Theorem 6.1). Along with Proposition 1.1 this implies, in particular, that basic cohomology can be defined via a derived functor. Apparently this is not the case for the reduced complex. In Sect. 7 we compute the cohomology with trivial coefficients of the Virasoro conformal algebra Vir both for the basic and reduced complexes (Theorem 7.1). As one could expect, the calculation and the result are closely related to Gelfand–Fuchs' calculation of the cohomology of $\mathrm{Vect}\,\mathbb{C}^\times$ [GF1]. We also compute both cohomologies of Vir with coefficients in the modules of densities (Theorem 7.2). This result is closely related to the work of Feigin and Fuchs [FF, F]. In Sect. 8 we compute the cohomology of the current conformal algebras both with trivial coefficients (Theorem 8.1) and with coefficients in current modules (Theorem 8.2). This allows us, in particular, to classify abelian extensions of current algebras (Remark 8.1). Of course, abelian extensions of Vir can be classified by making use of Theorem 7.2. This problem has been solved earlier by M. Wakimoto and one of the authors of the present paper by a lengthy but direct calculation; however, in the case of current algebras the direct calculation is all but impossible. In Sect.
9 we briefly discuss the analogues of Hochschild and cyclic cohomology for associative conformal algebras and of Leibniz cohomology. In Sect. 10 we indicate how to generalize our cohomology theory to the case of conformal algebras in several indeterminates and discuss its relation to cohomology of Cartan's filtered Lie algebras. In Sect. 11 we introduce anticommuting higher differentials which may be useful for computing the cohomology of the basic complex with non-trivial coefficients. In Sect. 12 we briefly discuss the relation of our cohomology theory to Lie algebras in a general pseudo-tensor category introduced in [BD]. In the last Sect. 13 we list several open questions.

Unless otherwise specified, all vector spaces, linear maps and tensor products are considered over the field $\mathbb{C}$ of complex numbers. We will use the divided-powers notation $\lambda^{(m)} = \lambda^m/m!$, $m \in \mathbb{Z}_+$, where $\mathbb{Z}_+$ is the set of non-negative integers.

1. Preliminaries on Conformal Algebras and Modules

Definition 1.1. A (Lie) conformal algebra is a $\mathbb{C}[\partial]$-module A endowed with a λ-bracket $[a_\lambda b]$ which defines a linear map $A \otimes A \to A[\lambda]$, where $A[\lambda] = \mathbb{C}[\lambda] \otimes A$, subject to the following axioms:
Conformal sesquilinearity: $[\partial a_\lambda b] = -\lambda [a_\lambda b]$, $[a_\lambda \partial b] = (\partial + \lambda)[a_\lambda b]$;
Skew-symmetry: $[a_\lambda b] = -[b_{-\lambda-\partial}\, a]$;
Jacobi identity: $[a_\lambda [b_\mu c]] = [[a_\lambda b]_{\lambda+\mu} c] + [b_\mu [a_\lambda c]]$.

Conformal algebras appear naturally in the context of formal distribution Lie algebras as follows. Let $\mathfrak{g}$ be a vector space. A $\mathfrak{g}$-valued formal distribution is a series of the form $a(z) = \sum_{n \in \mathbb{Z}} a_n z^{-n-1}$, where $a_n \in \mathfrak{g}$ and z is an indeterminate. We denote the space of such distributions by $\mathfrak{g}[[z, z^{-1}]]$ and the operator $\partial_z$ on this space by $\partial$.

Let $\mathfrak{g}$ be a Lie algebra. Two $\mathfrak{g}$-valued formal distributions are called local if $(z - w)^N [a(z), b(w)] = 0$ for $N \gg 0$. This is equivalent to saying that one has an expansion of the form [K2]:
\[ [a(z), b(w)] = \sum_{j=0}^{N-1} \bigl(a(w)_{(j)} b(w)\bigr)\, \partial_w^{(j)} \delta(z - w), \tag{1.1} \]
where
\[ a(w)_{(j)} b(w) = \operatorname{Res}_z\, (z - w)^j [a(z), b(w)] \tag{1.2} \]
and
\[ \delta(z - w) = \sum_{n \in \mathbb{Z}} z^{-n-1} w^n. \]
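As an illustration that is not part of the original text, the simplest instance of locality, $(z - w)\delta(z - w) = 0$, can be checked on a truncation of the series. The sketch below stores a two-variable Laurent series as a dictionary from exponent pairs to coefficients; the helper names `delta` and `times_z_minus_w` are hypothetical. On a finite truncation only the two boundary terms survive, and they move off to infinity as the cutoff grows, which is how the formal identity should be read.

```python
from collections import defaultdict

def delta(N):
    """Truncation of delta(z-w) = sum_n z^(-n-1) w^n to |n| <= N.

    The series is stored as {(i, j): coeff} for the monomial z^i w^j.
    """
    return {(-n - 1, n): 1 for n in range(-N, N + 1)}

def times_z_minus_w(series):
    """Multiply a truncated series by (z - w) and drop zero coefficients."""
    out = defaultdict(int)
    for (i, j), c in series.items():
        out[(i + 1, j)] += c   # contribution of z * (z^i w^j)
        out[(i, j + 1)] -= c   # contribution of -w * (z^i w^j)
    return {k: c for k, c in out.items() if c != 0}

N = 5
leftover = times_z_minus_w(delta(N))
# All interior terms telescope away; only the two cutoff terms remain.
print(leftover)
```

The telescoping seen here is the same mechanism that makes the sum in (1.1) finite for a local pair.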

Let $\mathcal{F}$ be a family of pairwise local $\mathfrak{g}$-valued formal distributions such that the coefficients of all distributions from $\mathcal{F}$ span $\mathfrak{g}$. Then the pair $(\mathfrak{g}, \mathcal{F})$ is called a formal distribution Lie algebra. Let $\bar{\mathcal{F}}$ denote the minimal subspace of $\mathfrak{g}[[z, z^{-1}]]$ containing $\mathcal{F}$ which is closed under all j-th products (1.2) and $\partial$-invariant. One knows that $\bar{\mathcal{F}}$ still consists of pairwise local distributions [K2]. Letting
\[ [a_\lambda b] = \sum_{n \in \mathbb{Z}_+} \lambda^{(n)} a_{(n)} b, \]
one endows $\bar{\mathcal{F}}$ with the structure of a conformal algebra, which is denoted by $\mathrm{Conf}(\mathfrak{g}, \mathcal{F})$ [DK, K2].

Conversely, given a conformal algebra A, one associates to it the maximal formal distribution Lie algebra $(\mathrm{Lie}\,A, A)$ as follows. Let $\mathrm{Lie}\,A = A[t, t^{-1}]/(\partial + \partial_t)A[t, t^{-1}]$ and let $a_n$ denote the image of $a t^n$ in $\mathrm{Lie}\,A$. Then the formula ($a, b \in A$, $m, n \in \mathbb{Z}$):
\[ [a_m, b_n] = \sum_{j \in \mathbb{Z}_+} \binom{m}{j} (a_{(j)} b)_{m+n-j} \tag{1.3} \]
gives a well-defined bracket making $\mathrm{Lie}\,A$ a Lie algebra. It forms a formal distribution Lie algebra with the family of pairwise local distributions
\[ \mathcal{F} = \Bigl\{ a(z) = \sum_{n \in \mathbb{Z}} a_n z^{-n-1} \Bigr\}_{a \in A}. \]
We have: $\mathrm{Conf}(\mathrm{Lie}\,A, \mathcal{F}) \simeq A$ via the map $a \mapsto a(z)$ [K2]. The Lie algebra $\mathrm{Lie}\,A$ carries a derivation T induced by $-\partial_t$:
\[ T(a_n) = -n a_{n-1}. \tag{1.4} \]

It is clear from (1.3) that the $\mathbb{C}$-span of the $a_n$ with $n \in \mathbb{Z}_+$, $a \in A$, is a T-invariant subalgebra of the Lie algebra $\mathrm{Lie}\,A$. This subalgebra is denoted by $(\mathrm{Lie}\,A)_-$ and is called the annihilation Lie algebra of A. The semidirect sum $(\mathrm{Lie}\,A)^- = \mathbb{C}T + (\mathrm{Lie}\,A)_-$ is called the extended annihilation Lie algebra.

If one drops the skew-symmetry in the definition of a Lie algebra $\mathfrak{g}$, but keeps the Leibniz version of the Jacobi identity $[a, [b, c]] = [[a, b], c] + [b, [a, c]]$, then $\mathfrak{g}$ is called a (left) Leibniz algebra, see [L1]. If one also drops the condition of locality on $\mathcal{F}$, then $(\mathfrak{g}, \mathcal{F})$ is called a formal distribution Leibniz algebra. In this case $\mathrm{Conf}(\mathfrak{g}, \mathcal{F})$ is a Leibniz conformal algebra, i.e., the skew-symmetry axiom in the definition of a Lie conformal algebra is dropped.

Definition 1.2. A module M over a Lie conformal algebra A is a $\mathbb{C}[\partial]$-module endowed with the λ-action $a_\lambda v$ which defines a map $A \otimes M \to M[[\lambda]]$ such that
\[ a_\lambda (b_\mu v) - b_\mu (a_\lambda v) = [a_\lambda b]_{\lambda+\mu} v, \tag{1.5} \]
\[ (\partial a)_\lambda v = -\lambda a_\lambda v, \qquad a_\lambda (\partial v) = (\partial + \lambda) a_\lambda v. \tag{1.6} \]

If $a_\lambda v \in M[\lambda]$ for all $a \in A$, $v \in M$, then the A-module M is called conformal. If M is finitely generated over $\mathbb{C}[\partial]$, M is simply called finite.

Definition 1.3 ([DK]). A conformal linear map from an A-module M to an A-module N is a $\mathbb{C}$-linear map $f : M \to N[\lambda]$, denoted $f_\lambda : M \to N$, such that $f_\lambda \partial = (\partial + \lambda) f_\lambda$. The space of such maps is denoted $\mathrm{Chom}(M, N)$. It has canonical structures of a $\mathbb{C}[\partial]$- and an A-module:
\[ (\partial f)_\lambda = -\lambda f_\lambda, \qquad (a_\mu f)_\lambda m = a_\mu (f_{\lambda-\mu} m) - f_{\lambda-\mu}(a_\mu m), \]
where $a \in A$, $m \in M$, and $f \in \mathrm{Chom}(M, N)$. When the two modules M and N are conformal and finite, the module $\mathrm{Chom}(M, N)$ will also be conformal. For a finite module M, let $\mathrm{Cend}\,M = \mathrm{Chom}(M, M)$ denote the space of conformal linear endomorphisms of M. Besides the A-module structure, $\mathrm{Cend}\,M$ carries the natural structure
\[ (f_\lambda g)_\mu m = f_\lambda (g_{\mu-\lambda} m), \qquad f, g \in \mathrm{Cend}\,M,\ m \in M, \]
of an associative conformal algebra in the sense of the following definition, see [K4].

Definition 1.4. An associative conformal algebra is a $\mathbb{C}[\partial]$-module A endowed with a λ-multiplication $a_\lambda b$ which defines a linear map $A \otimes A \to A[\lambda]$ subject to the following axioms:
Conformal sesquilinearity: $(\partial a)_\lambda b = -\lambda a_\lambda b$, $a_\lambda \partial b = (\partial + \lambda) a_\lambda b$;
Associativity: $a_\lambda (b_\mu c) = (a_\lambda b)_{\lambda+\mu} c$.

The λ-bracket $[a_\lambda b] = a_\lambda b - b_{-\lambda-\partial}\, a$ makes an associative conformal algebra, in particular $\mathrm{Cend}\,M$, a Lie conformal algebra. $\mathrm{Cend}\,M$ with this structure is denoted $\mathrm{gc}\,M$ and called the general Lie conformal algebra of a module M [DK, K4].

Given an associative conformal algebra A, a left (or right) module M over it may be defined naturally, for example, as in Definition 1.2. A bimodule may be defined by adding the axiom $a_\lambda (m_\mu b) = (a_\lambda m)_{\lambda+\mu} b$ to the list of those for a left and right module. A (bi)module is called conformal, provided the action(s) satisfy the usual polynomiality conditions. The structure of a conformal bimodule on M is equivalent to an extension of the associative conformal algebra structure to the space $A \oplus \varepsilon M$, where $\varepsilon^2 = 0$.

We will be working with Lie conformal algebras and modules over them throughout the paper, except when we discuss Hochschild cohomology in Sect. 9.1. We will therefore usually shorten the term "Lie conformal algebra" to "conformal algebra".

Conformal modules over conformal algebras appear naturally in the context of conformal modules over formal distribution Lie algebras as follows. Let $(\mathfrak{g}, \mathcal{F})$ be a formal distribution Lie algebra and let V be a $\mathfrak{g}$-module. Suppose that $\mathcal{E}$ is a family of V-valued formal distributions which spans V and such that any $a(z) \in \mathcal{F}$ and $v(z) \in \mathcal{E}$ form a local pair, i.e., $(z - w)^N a(z) v(w) = 0$ for $N \gg 0$. Then $(V, \mathcal{E})$ is called a conformal $(\mathfrak{g}, \mathcal{F})$-module. As before, we have:
\[ a(z) v(w) = \sum_{j=0}^{N-1} \bigl(a(w)_{(j)} v(w)\bigr)\, \partial_w^{(j)} \delta(z - w), \tag{1.7} \]
where
\[ a(w)_{(j)} v(w) = \operatorname{Res}_z\, (z - w)^j a(z) v(w). \tag{1.8} \]

Let $\bar{\mathcal{E}}$ denote the minimal subspace of $V[[z, z^{-1}]]$ containing $\mathcal{E}$ which is closed under all j-th actions (1.8) and is $\partial$-invariant. One knows that all pairs $a(z) \in \bar{\mathcal{F}}$ and $v(z) \in \bar{\mathcal{E}}$ are still local [K2, K4]. Letting
\[ a_\lambda v = \sum_{n \in \mathbb{Z}_+} \lambda^{(n)} a_{(n)} v, \]
one endows $\bar{\mathcal{E}}$ with the structure of a conformal $\bar{\mathcal{F}}$-module [K2, K4].

Conversely, given a conformal A-module M, one associates to it the maximal conformal $(\mathrm{Lie}\,A, A)$-module $(V(M), M)$ in the same way as the Lie algebra $\mathrm{Lie}\,A$ was constructed. We let $V(M) = M[t, t^{-1}]/(\partial + \partial_t)M[t, t^{-1}]$, with the well-defined $\mathrm{Lie}\,A$-action
\[ a_m v_n = \sum_{j \in \mathbb{Z}_+} \binom{m}{j} (a_{(j)} v)_{m+n-j}, \tag{1.9} \]

where, as before, $v_n$ stands for the image of $v t^n$ in $V(M)$ [K2]. As before, we denote by $V(M)_-$ the $\mathbb{C}$-span of the $v_n$, where $v \in M$, $n \in \mathbb{Z}_+$. It is clear from (1.9) that $V(M)_-$ is a $(\mathrm{Lie}\,A)_-$- and a $(\mathrm{Lie}\,A)^-$-submodule of $V(M)$.

The following obvious observation plays a key role in representation theory of conformal algebras [CK].

Proposition 1.1. A module M over a conformal algebra A carries the natural structure of a module over the extended annihilation Lie algebra $(\mathrm{Lie}\,A)^-$. This correspondence establishes an equivalence of the category of A-modules and that of $(\mathrm{Lie}\,A)^-$-modules. The A-module M is conformal iff as a $(\mathrm{Lie}\,A)^-$-module it satisfies the condition
\[ a_n v = 0 \quad \text{for } a \in A,\ v \in M,\ n \gg 0. \tag{1.10} \]

Remark 1.1. As a $(\mathrm{Lie}\,A)^-$-module, a conformal A-module M is isomorphic to the module $V(M)/V(M)_-$.

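Remark 1.2. [K2] One can show that the map $A \mapsto (\mathrm{Lie}\,A, A)$ (respectively, $M \mapsto (V(M), M)$) establishes a bijection between isomorphism classes of conformal algebras (respectively, of conformal modules over conformal algebras) and equivalence classes of formal distribution Lie algebras $(\mathfrak{g}, \mathcal{F})$ (respectively, of conformal modules over $(\mathrm{Lie}\,A, A)$). By definition, all formal distribution Lie algebras $((\mathrm{Lie}\,A)/I, \mathcal{F})$, where I is an ideal of $\mathrm{Lie}\,A$ having trivial intersection with A, and $\bar{\mathcal{F}} = A$, are equivalent (and similarly for modules).

Example 1.1. Let $\mathfrak{g}$ be a Lie algebra and let $\tilde{\mathfrak{g}} = \mathfrak{g}[t, t^{-1}]$ be the associated loop (= current) algebra (with the obvious bracket: $[a t^m, b t^n] = [a, b] t^{m+n}$, $a, b \in \mathfrak{g}$, $m, n \in \mathbb{Z}$). For $a \in \mathfrak{g}$ let $a(z) = \sum_{m \in \mathbb{Z}} (a t^m) z^{-m-1} \in \tilde{\mathfrak{g}}[[z, z^{-1}]]$. Then
\[ [a(z), b(w)] = [a, b](w)\, \delta(z - w), \]
hence the family $\mathcal{F} = \{a(z) \mid a \in \mathfrak{g}\}$ consists of pairwise local formal distributions and $(\tilde{\mathfrak{g}}, \mathcal{F})$ is a formal distribution Lie algebra. Note that $\bar{\mathcal{F}} = \mathbb{C}[\partial]\mathcal{F} \simeq \mathbb{C}[\partial] \otimes \mathfrak{g}$ is a conformal algebra with the λ-bracket
\[ [a_\lambda b] = [a, b], \qquad a, b \in \mathfrak{g}. \]
This conformal algebra is called the current conformal algebra associated to $\mathfrak{g}$ and is denoted by $\mathrm{Cur}\,\mathfrak{g}$. Note that $\mathrm{Lie}(\mathrm{Cur}\,\mathfrak{g}, \mathcal{F}) \simeq \tilde{\mathfrak{g}}$, hence $\tilde{\mathfrak{g}}$ is the maximal formal distribution algebra. The corresponding annihilation algebra is $\tilde{\mathfrak{g}}_- = \mathfrak{g}[t]$ and the extended annihilation algebra is $\mathbb{C}\partial_t + \mathfrak{g}[t]$.

Given a $\mathfrak{g}$-module U, one may associate the conformal $\tilde{\mathfrak{g}}$-module $\tilde{U} = U[t, t^{-1}]$ with the obvious action of $\tilde{\mathfrak{g}}$, and the conformal $\mathrm{Cur}\,\mathfrak{g}$-module $M_U = \mathbb{C}[\partial] \otimes U$ defined by $a_\lambda u = a u$, $a \in \mathfrak{g}$, $u \in U$. We have: $V(M_U) \simeq \tilde{U}$ as $\tilde{\mathfrak{g}}$-modules. It is known that, provided that $\mathfrak{g}$ is finite-dimensional semisimple, the $\mathrm{Cur}\,\mathfrak{g}$-modules $M_U$, where U is a finite-dimensional irreducible $\mathfrak{g}$-module, exhaust all finite irreducible non-trivial $\mathrm{Cur}\,\mathfrak{g}$-modules [CK].

Example 1.2. Let $\mathrm{Vect}\,\mathbb{C}^\times$ denote the Lie algebra of all regular vector fields on $\mathbb{C}^\times$. The vector fields $t^n \partial_t$ ($n \in \mathbb{Z}$) form a basis of $\mathrm{Vect}\,\mathbb{C}^\times$ and the formal distribution $L(z) = -\sum_{n \in \mathbb{Z}} (t^n \partial_t) z^{-n-1}$ is local (with respect to itself), since
\[ [L(z), L(w)] = \partial_w L(w)\, \delta(z - w) + 2 L(w)\, \delta'_w(z - w). \]
Hence $(\mathrm{Vect}\,\mathbb{C}^\times, \{L\})$ is a formal distribution Lie algebra. The associated conformal algebra
\[ \mathrm{Vir} = \mathbb{C}[\partial] L, \qquad [L_\lambda L] = (\partial + 2\lambda) L \]
is called the Virasoro conformal algebra. Note that $\mathrm{Lie}(\mathrm{Vir}, \{L\}) \simeq \mathrm{Vect}\,\mathbb{C}^\times$, hence $\mathrm{Vect}\,\mathbb{C}^\times$ is the maximal formal distribution algebra. The corresponding annihilation algebra is $(\mathrm{Vect}\,\mathbb{C}^\times)_- = \mathrm{Vect}\,\mathbb{C}$, the Lie algebra of regular vector fields on $\mathbb{C}$, and $(\mathrm{Vect}\,\mathbb{C}^\times)^-$ is isomorphic to the direct sum of $(\mathrm{Vect}\,\mathbb{C}^\times)_-$ and the 1-dimensional Lie algebra.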
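The following worked check is an addition for the reader and does not appear in the original. It applies the bracket formula (1.3) to Vir, assuming only the definitions above. Reading off the coefficients of $\lambda^{(0)}$ and $\lambda^{(1)}$ in $[L_\lambda L] = (\partial + 2\lambda)L$ gives $L_{(0)}L = \partial L$ and $L_{(1)}L = 2L$, and the relation $(\partial + \partial_t) \equiv 0$ in $\mathrm{Lie}\,A$ gives $(\partial a)_n = -n\,a_{n-1}$. Hence
\[ [L_m, L_n] = (\partial L)_{m+n} + 2m\, L_{m+n-1} = -(m+n) L_{m+n-1} + 2m\, L_{m+n-1} = (m - n)\, L_{m+n-1}. \]
This agrees with the vector-field commutator under $L_n = -t^n \partial_t$, since
\[ [-t^m \partial_t, -t^n \partial_t] = (n - m)\, t^{m+n-1} \partial_t = (m - n)\bigl(-t^{m+n-1} \partial_t\bigr). \]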

It is known that all free non-trivial Vir-modules of rank 1 over $\mathbb{C}[\partial]$ are the following ones ($\Delta, \alpha \in \mathbb{C}$):
\[ M_{\Delta,\alpha} = \mathbb{C}[\partial] v, \qquad L_\lambda v = (\partial + \alpha + \Delta\lambda) v. \]
We have: $V(M_{\Delta,\alpha}) \simeq \mathbb{C}[t, t^{-1}]\, e^{-\alpha t} (dt)^{1-\Delta}$ as $\mathrm{Vect}\,\mathbb{C}$-modules. The module $M_{\Delta,\alpha}$ is irreducible iff $\Delta \neq 0$. The module $M_{0,\alpha}$ contains a unique non-trivial submodule $(\partial + \alpha) M_{0,\alpha}$ isomorphic to $M_{1,\alpha}$. It is known that the modules $M_{\Delta,\alpha}$ with $\Delta \neq 0$ exhaust all finite irreducible non-trivial Vir-modules [CK].

It is known [DK] that the conformal algebras $\mathrm{Cur}\,\mathfrak{g}$, where $\mathfrak{g}$ is a finite-dimensional simple Lie algebra, and Vir exhaust all finite simple conformal algebras. For that reason we shall discuss mainly these two examples in what follows.
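As a sanity check added here (not taken from the original), the Jacobi identity for Vir and the module axiom (1.5) for the rank-one modules with $L_\lambda v = (\partial + \alpha + \Delta\lambda)v$ can be tested numerically. The sketch below assumes the following encoding, which is a choice of this example: an element $p(\partial)L$ or $p(\partial)v$ is represented by the Python function `p`, and sesquilinearity is used to reduce every bracket to the one on generators. The names `vir_bracket`, `vir_action`, `jacobi_defect` and `module_defect` are hypothetical.

```python
from itertools import product

def vir_bracket(p, q, lam):
    """[p(d)L _lam q(d)L] = p(-lam) q(d+lam) (d + 2 lam) L, by sesquilinearity."""
    return lambda d: p(-lam) * q(d + lam) * (d + 2 * lam)

def vir_action(p, q, lam, Delta, alpha):
    """(p(d)L)_lam (q(d)v) = p(-lam) q(d+lam) (d + alpha + Delta lam) v."""
    return lambda d: p(-lam) * q(d + lam) * (d + alpha + Delta * lam)

one = lambda d: 1  # represents the generator L (or v) itself

def jacobi_defect(lam, mu, d):
    """[L_lam [L_mu L]] - [[L_lam L]_{lam+mu} L] - [L_mu [L_lam L]] at a point."""
    lhs = vir_bracket(one, vir_bracket(one, one, mu), lam)(d)
    rhs = (vir_bracket(vir_bracket(one, one, lam), one, lam + mu)(d)
           + vir_bracket(one, vir_bracket(one, one, lam), mu)(d))
    return lhs - rhs

def module_defect(lam, mu, d, Delta, alpha):
    """L_lam(L_mu v) - L_mu(L_lam v) - [L_lam L]_{lam+mu} v at a point."""
    Lv = lambda l: vir_action(one, one, l, Delta, alpha)  # the element L_l v
    lhs = (vir_action(one, Lv(mu), lam, Delta, alpha)(d)
           - vir_action(one, Lv(lam), mu, Delta, alpha)(d))
    rhs = vir_action(vir_bracket(one, one, lam), one, lam + mu, Delta, alpha)(d)
    return lhs - rhs

grid = range(-3, 4)
jacobi_ok = all(jacobi_defect(l, m, d) == 0 for l, m, d in product(grid, repeat=3))
module_ok = all(module_defect(l, m, d, D, a) == 0
                for l, m, d in product(grid, repeat=3)
                for D in (0, 1, 2) for a in (0, 5))
print(jacobi_ok, module_ok)
```

Both defects are polynomials of low degree in each variable, so vanishing on an integer grid is strong evidence, but the algebraic verification remains the actual proof.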

2. Basic Definitions

Definition 2.1. An n-cochain ($n \in \mathbb{Z}_+$) of a conformal algebra A with coefficients in a module M over it is a $\mathbb{C}$-linear map
\[ \gamma : A^{\otimes n} \to M[\lambda_1, \dots, \lambda_n], \qquad a_1 \otimes \cdots \otimes a_n \mapsto \gamma_{\lambda_1,\dots,\lambda_n}(a_1, \dots, a_n), \]
where $M[\lambda_1, \dots, \lambda_n]$ denotes the space of polynomials with coefficients in M, satisfying the following conditions:
Conformal antilinearity: $\gamma_{\lambda_1,\dots,\lambda_n}(a_1, \dots, \partial a_i, \dots, a_n) = -\lambda_i \gamma_{\lambda_1,\dots,\lambda_n}(a_1, \dots, a_i, \dots, a_n)$ for all i;
Skew-symmetry: $\gamma$ is skew-symmetric with respect to simultaneous permutations of the $a_i$'s and $\lambda_i$'s.

We let $A^{\otimes 0} = \mathbb{C}$, as usual, so that a 0-cochain $\gamma$ is an element of M. Sometimes, when the module M is not conformal, one may consider formal power series instead of polynomials in this definition. We define a differential d of a cochain $\gamma$ as follows:
\begin{align*}
(d\gamma)_{\lambda_1,\dots,\lambda_{n+1}}(a_1, \dots, a_{n+1})
&= \sum_{i=1}^{n+1} (-1)^{i+1}\, a_{i\,\lambda_i}\, \gamma_{\lambda_1,\dots,\hat{\lambda}_i,\dots,\lambda_{n+1}}(a_1, \dots, \hat{a}_i, \dots, a_{n+1}) \\
&\quad + \sum_{\substack{i,j=1 \\ i<j}}^{n+1} (-1)^{i+j}\, \gamma_{\lambda_i+\lambda_j,\lambda_1,\dots,\hat{\lambda}_i,\dots,\hat{\lambda}_j,\dots,\lambda_{n+1}}([a_{i\,\lambda_i} a_j], a_1, \dots, \hat{a}_i, \dots, \hat{a}_j, \dots, a_{n+1}),
\end{align*}
where $\gamma$ is extended linearly over the polynomials in $\lambda_i$. In particular, if $\gamma \in M$ is a 0-cochain, then $(d\gamma)_\lambda(a) = a_\lambda \gamma$.

Remark 2.1. Conformal antilinearity implies the following relation for an n-cochain $\gamma$:
\[ \gamma_{\lambda+\mu,\lambda_1,\dots}([a_\lambda b], a_1, \dots) = \gamma_{\lambda+\mu,\lambda_1,\dots}([a_{-\partial-\mu}\, b], a_1, \dots). \]
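To make the general formula concrete, here is the lowest non-trivial case written out; this is an added illustration obtained by substituting $n = 1$ into the definition above. For a 1-cochain $\gamma$ the two sums give
\[ (d\gamma)_{\lambda_1,\lambda_2}(a_1, a_2) = a_{1\,\lambda_1} \gamma_{\lambda_2}(a_2) - a_{2\,\lambda_2} \gamma_{\lambda_1}(a_1) - \gamma_{\lambda_1+\lambda_2}([a_{1\,\lambda_1} a_2]). \]
Taking $\gamma = dm$ for a 0-cochain $m \in M$, so that $\gamma_\lambda(a) = a_\lambda m$, this becomes
\[ (d^2 m)_{\lambda_1,\lambda_2}(a_1, a_2) = a_{1\,\lambda_1}(a_{2\,\lambda_2} m) - a_{2\,\lambda_2}(a_{1\,\lambda_1} m) - [a_{1\,\lambda_1} a_2]_{\lambda_1+\lambda_2}\, m, \]
which vanishes precisely by the module axiom (1.5). In this degree, $d^2 = 0$ is therefore a restatement of the fact that M is an A-module.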

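Lemma 2.1. 1. The operator d preserves the space of cochains; 2. $d^2 = 0$.

Proof. 1. The only non-trivial point in checking the skew-symmetry of $d\gamma$ amounts to the equation
\[ \gamma_{\lambda+\mu,\lambda_1,\dots,\lambda_{n-1}}([a_\lambda b], a_1, \dots, a_{n-1}) = -\gamma_{\lambda+\mu,\lambda_1,\dots,\lambda_{n-1}}([b_\mu a], a_1, \dots, a_{n-1}), \]
which follows from Remark 2.1 and the skew-symmetry of $[a_\lambda b]$.

2. To check that $d^2 = 0$, we will compute $d^2\gamma$ for an n-cochain $\gamma$:
\begin{align*}
(d^2\gamma)_{\lambda_1,\dots,\lambda_{n+2}}&(a_1, \dots, a_{n+2}) \\
&= \sum_{i=1}^{n+2} (-1)^{i+1}\, a_{i\,\lambda_i}\, (d\gamma)_{\lambda_1,\dots,\hat{\lambda}_i,\dots,\lambda_{n+2}}(a_1, \dots, \hat{a}_i, \dots, a_{n+2}) \\
&\quad + \sum_{\substack{i,j=1 \\ i<j}}^{n+2} (-1)^{i+j}\, (d\gamma)_{\lambda_i+\lambda_j,\lambda_1,\dots,\hat{\lambda}_{i,j},\dots,\lambda_{n+2}}([a_{i\,\lambda_i} a_j], a_1, \dots, \hat{a}_{i,j}, \dots, a_{n+2}) \\
&= \sum_{\substack{i,j=1 \\ i\neq j}}^{n+2} (-1)^{i+j}\, \mathrm{sign}\{j, i\}\, a_{i\,\lambda_i}\, a_{j\,\lambda_j}\, \gamma_{\lambda_1,\dots,\hat{\lambda}_{i,j},\dots,\lambda_{n+2}}(a_1, \dots, \hat{a}_{i,j}, \dots, a_{n+2}) \\
&\quad + \sum_{\substack{i,j,k=1 \\ j<k,\ i\neq j,k}}^{n+2} (-1)^{i+j+k+1}\, \mathrm{sign}\{j, k, i\}\, a_{i\,\lambda_i}\, \gamma_{\lambda_j+\lambda_k,\lambda_1,\dots,\hat{\lambda}_{i,j,k},\dots,\lambda_{n+2}}([a_{j\,\lambda_j} a_k], a_1, \dots, \hat{a}_{i,j,k}, \dots, a_{n+2}) \\
&\quad + \sum_{\substack{i,j,k=1 \\ i<j,\ k\neq i,j}}^{n+2} (-1)^{i+j+k}\, \mathrm{sign}\{k, i, j\}\, a_{k\,\lambda_k}\, \gamma_{\lambda_i+\lambda_j,\lambda_1,\dots,\hat{\lambda}_{i,j,k},\dots,\lambda_{n+2}}([a_{i\,\lambda_i} a_j], a_1, \dots, \hat{a}_{i,j,k}, \dots, a_{n+2}) \\
&\quad + \sum_{\substack{i,j=1 \\ i<j}}^{n+2} (-1)^{i+j}\, [a_{i\,\lambda_i} a_j]_{\lambda_i+\lambda_j}\, \gamma_{\lambda_1,\dots,\hat{\lambda}_{i,j},\dots,\lambda_{n+2}}(a_1, \dots, \hat{a}_{i,j}, \dots, a_{n+2}) \\
&\quad + \sum_{\substack{\text{distinct } i,j,k,l=1 \\ i<j,\ k<l}}^{n+2} (-1)^{i+j+k+l}\, \mathrm{sign}\{i, j, k, l\}\, \gamma_{\lambda_k+\lambda_l,\lambda_i+\lambda_j,\lambda_1,\dots,\hat{\lambda}_{i,j,k,l},\dots,\lambda_{n+2}}([a_{k\,\lambda_k} a_l], [a_{i\,\lambda_i} a_j], a_1, \dots, \hat{a}_{i,j,k,l}, \dots, a_{n+2}) \\
&\quad + \sum_{\substack{i,j,k=1 \\ i<j,\ k\neq i,j}}^{n+2} (-1)^{i+j+k+1}\, \mathrm{sign}\{i, j, k\}\, \gamma_{\lambda_i+\lambda_j+\lambda_k,\lambda_1,\dots,\hat{\lambda}_{i,j,k},\dots,\lambda_{n+2}}([[a_{i\,\lambda_i} a_j]_{\lambda_i+\lambda_j} a_k], a_1, \dots, \hat{a}_{i,j,k}, \dots, a_{n+2}),
\end{align*}
where $\mathrm{sign}\{i_1, \dots, i_p\}$ is the sign of the permutation putting the indices in increasing order and $\hat{a}_{i,j,\dots}$ means that $a_i, a_j, \dots$ are omitted.

Notice that each term in the summation over i, j, k, l is skew with respect to the permutation $\begin{pmatrix} i & j & k & l \\ k & l & i & j \end{pmatrix}$. Therefore, the terms of that summation will cancel pairwise. The first and the fourth summations cancel each other, because M is a conformal algebra module:
\[ -a_{i\,\lambda_i}(a_{j\,\lambda_j} m) + a_{j\,\lambda_j}(a_{i\,\lambda_i} m) + [a_{i\,\lambda_i} a_j]_{\lambda_i+\lambda_j} m = 0. \]
The second summation becomes equal to the third one after the substitution (ikj), except that they differ by a sign. Thus, they cancel each other as well. Finally, the sixth summation can be rewritten as a summation over $i < j < k$ of the sum of three permutations of the initial summand. Precisely, in the first entry of $\gamma$, we will have
\[ [[a_{i\,\lambda_i} a_j]_{\lambda_i+\lambda_j} a_k] - [[a_{i\,\lambda_i} a_k]_{\lambda_i+\lambda_k} a_j] + [[a_{j\,\lambda_j} a_k]_{\lambda_j+\lambda_k} a_i]. \tag{2.1} \]
Using Remark 2.1, we can transform the sum (2.1) inside $\gamma$ into
\[ [[a_{i\,\lambda_i} a_j]_{\lambda_i+\lambda_j} a_k] - [[a_{i\,\lambda_i} a_k]_{-\partial-\lambda_j} a_j] + [[a_{j\,\lambda_j} a_k]_{-\partial-\lambda_i} a_i], \]
which vanishes by the Jacobi identity and skew-symmetry in A. Thus, we see that all of the terms in $d^2\gamma$ cancel.

Thus the cochains of a conformal algebra A with coefficients in a module M form a complex, which will be denoted
\[ \tilde{C}^\bullet = \tilde{C}^\bullet(A, M) = \bigoplus_{n \in \mathbb{Z}_+} \tilde{C}^n(A, M). \]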

This complex is called the basic complex for the A-module M. This is not yet the complex defining the right cohomology of a conformal algebra: we need to consider a certain quotient complex. Define the structure of a (left) $\mathbb{C}[\partial]$-module on $\tilde{C}^\bullet(A, M)$ by letting
\[ (\partial \cdot \gamma)_{\lambda_1,\dots,\lambda_n}(a_1, \dots, a_n) = \Bigl(\partial_M + \sum_{i=1}^{n} \lambda_i\Bigr) \gamma_{\lambda_1,\dots,\lambda_n}(a_1, \dots, a_n), \tag{2.2} \]
where $\partial_M$ denotes the action of $\partial$ on M.

Lemma 2.2. $d\partial = \partial d$, and therefore the graded subspace $\partial \tilde{C}^\bullet \subset \tilde{C}^\bullet$ forms a subcomplex.

Proof. The first summation in the differential transforms the factor $\partial_M + \sum_{i=1}^{n} \lambda_i$ into $\partial_M + \sum_{i=1}^{n+1} \lambda_i$, because of the conformal sesquilinearity of the λ-bracket. The second summation does the same for more obvious reasons.

Define the quotient complex
\[ C^\bullet(A, M) = \tilde{C}^\bullet(A, M)/\partial \tilde{C}^\bullet(A, M) = \bigoplus_{n \in \mathbb{Z}_+} C^n(A, M), \]
called the reduced complex.

Definition 2.2. The basic cohomology $\tilde{H}^\bullet(A, M)$ of a conformal algebra A with coefficients in a module M is the cohomology of the basic complex $\tilde{C}^\bullet$. The (reduced) cohomology $H^\bullet(A, M)$ is the cohomology of the reduced complex $C^\bullet = C^\bullet(A, M) = \tilde{C}^\bullet/\partial \tilde{C}^\bullet$.

Remark 2.2. The basic cohomology $\tilde{H}^\bullet(A, M)$ is naturally a $\mathbb{C}[\partial]$-module, whereas the reduced cohomology $H^\bullet(A, M)$ is a complex vector space.

Remark 2.3. The exact sequence $0 \to \partial \tilde{C}^\bullet \to \tilde{C}^\bullet \to C^\bullet \to 0$ gives the long exact sequence of cohomology:
\begin{align*}
0 &\to H^0(\partial \tilde{C}^\bullet) \to \tilde{H}^0(A, M) \to H^0(A, M) \to \\
&\to H^1(\partial \tilde{C}^\bullet) \to \tilde{H}^1(A, M) \to H^1(A, M) \to \tag{2.3} \\
&\to H^2(\partial \tilde{C}^\bullet) \to \tilde{H}^2(A, M) \to H^2(A, M) \to \cdots.
\end{align*}

Proposition 2.1. In degrees $\geq 1$, the complexes $\tilde{C}^\bullet$ and $\partial \tilde{C}^\bullet$ are isomorphic under the map
\[ \tilde{C}^\bullet \to \partial \tilde{C}^\bullet, \qquad \gamma \mapsto \partial \cdot \gamma. \tag{2.4} \]
Therefore, $H^q(\partial \tilde{C}^\bullet) \simeq \tilde{H}^q(A, M)$ for $q \geq 1$, and the natural sequence $0 \to \mathrm{Ker}\,\partial[0] \to \tilde{H}^0(A, M) \to H^0(\partial \tilde{C}^\bullet) \to 0$, where $\mathrm{Ker}\,\partial[0]$ is the subcomplex $\mathrm{Ker}\,\partial$ of $\tilde{C}^\bullet$, in fact concentrated in degree zero, is exact. When the module M is $\mathbb{C}[\partial]$-free, the above isomorphisms take place in all degrees $\geq 0$.

Proof. Indeed, the modules $\tilde{C}^n(A, M)$, $n \geq 1$, are free over $\mathbb{C}[\partial]$, because they are free over $\mathbb{C}[\lambda_1]$. Lemma 2.2 shows that the map (2.4) is a morphism of complexes. When M is $\mathbb{C}[\partial]$-free, this argument extends to $n = 0$.

Remark 2.4. This proposition does not imply that in the long exact sequence (2.3), the maps $H^q(\partial \tilde{C}^\bullet) \to \tilde{H}^q(A, M)$ induced by the embedding $\partial \tilde{C}^\bullet \subset \tilde{C}^\bullet$ are isomorphisms.

3. Extensions and Deformations

Our cohomology theory describes extensions and deformations, just as any cohomology theory.

e (A, M ) = M A = {m ∈ M | aλ m = 0 ∀a ∈ A}. Theorem 3.1. 1. H 2. The isomorphism classes of extensions 0→M →E→C→0 of the trivial A-module C (∂ and A act by zero) by a conformal A-module M correspond bijectively to H0 (A, M ). 3. The isomorphism classes of C[∂]-split extensions 0→M →E→N →0 of conformal modules over a conformal algebra A correspond bijectively to H1 (A, Chom(N, M )), where M and N are assumed to be finite and Chom(N, M ) is the A-module of conformal linear maps from N to M . If, in particular, N = C is the trivial module, then there exist no non-trivial C[∂]-split extensions.


4. Let $C$ be a conformal $A$-module, considered as a conformal algebra with respect to the zero λ-bracket. Then the equivalence classes of $\mathbb C[\partial]$-split "abelian" extensions $0 \to C \to \tilde A \to A \to 0$ of the conformal algebra $A$ correspond bijectively to $H^2(A,C)$.
5. The equivalence classes of first-order deformations of a conformal algebra $A$ (leaving the $\mathbb C[\partial]$-action intact) correspond bijectively to $H^2(A,A)$.

Proof. 1. The computation of $\tilde H^0(A,M)$ follows directly from the definition: for $m \in M = \tilde C^0(A,M)$ and $a \in A$, $(dm)_\lambda(a) = a_\lambda m$.

2. Given an extension $0 \to M \to E \to \mathbb C \to 0$ of modules over a conformal algebra $A$, pick a splitting of this short exact sequence over $\mathbb C$, i.e., assume that as a complex vector space, $E \simeq M \oplus \mathbb C = \{(m,n) \mid m \in M,\ n \in \mathbb C\}$. Define $f \in M$ by writing down the action of $\partial$ on the pair $(m,1) \in E$:
$$\partial(m,1) = (\partial m + f,\ 0). \tag{3.1}$$
We claim that $f \in M = \tilde C^0(A,M)$ defines a zero-cocycle in the reduced complex $C^\bullet(A,M)$ and thereby a class in $H^0(A,M)$. To see that, define a one-cochain $\gamma \in \tilde C^1(A,M)$ using the action of $A$ on $E$:
$$a_\lambda(m,1) = (a_\lambda m + \gamma_\lambda(a),\ 0) \tag{3.2}$$
for $a \in A$. The conformal antilinearity of $\gamma$: $\gamma_\lambda(\partial a) = -\lambda\gamma_\lambda(a)$, follows from the fact that $(\partial a)_\lambda(m,1) = -\lambda(a_\lambda(m,1))$. The property $a_\lambda(\partial(m,1)) = (\lambda+\partial)(a_\lambda(m,1))$ of the action of $A$ on $E$ expands as
$$(df)_\lambda = (\partial\gamma)_\lambda, \tag{3.3}$$
which means that $df = 0$ in the reduced complex. If we choose another splitting $(m,n)'$ of the extension $E$, it will differ by an element $g \in M$: $(m,1)' = (m+g,1)$, so that the new zero-cocycle becomes $f' = f + \partial g$, therefore defining the same cochain in the reduced complex. If we have two isomorphic extensions and choose a compatible splitting over $\mathbb C$, we will get exactly the same zero-cocycles $f$ corresponding to them. This proves that isomorphism classes of extensions give rise to elements of $H^0(A,M)$.

Conversely, given a cocycle in $C^0(A,M)$, we can choose a representative $f \in M$ of it to alter the natural $\mathbb C[\partial]$-module structure on $M \oplus \mathbb C$ by adding $f$ to the action of $\partial$ on $M \oplus \mathbb C$ as in (3.1). This will obviously extend to an action of the free commutative algebra $\mathbb C[\partial]$. We can also alter the natural $A$-module structure by adding $\gamma$ to the action of $a \in A$ as in (3.2), where $\gamma$ is a solution of Eq. (3.3), which means that $f$ is a cocycle in the reduced complex. This action will be conformally linear in $(m,n)$, because of (3.3), and antilinear in $A$, because of the conformal antilinearity of $\gamma$. This action will define an $A$-module structure on $M \oplus \mathbb C$, because $d\gamma = 0$, which follows from (3.3) and the fact that $\mathbb C[\partial]$ acts freely on basic two-cochains. By construction the natural mappings $M \to M \oplus \mathbb C$ and $M \oplus \mathbb C \to \mathbb C$ will be morphisms of $\mathbb C[\partial]$- and $A$-modules.


This construction of a new conformal module structure on $M \oplus \mathbb C$ involved a number of choices. The choice of a different representative $f' = f + \partial g$ defines an isomorphism of the two $\mathbb C[\partial]$-module structures on $M \oplus \mathbb C$, which automatically becomes an isomorphism of the corresponding $A$-module structures, because the corresponding $\gamma$'s are unique. The one-cochain $\gamma$ is uniquely determined by $f$, because $\mathbb C[\partial]$ acts freely on the space $\tilde C^1(A,M)$ of basic one-cochains.

3. We will adjust the proof of Part 2 to the new situation. Given a $\mathbb C[\partial]$-split extension $0 \to M \to E \to N \to 0$ of modules over a conformal algebra $A$, pick a splitting of the short exact sequence over $\mathbb C[\partial]$, i.e., assume that as a $\mathbb C[\partial]$-module, $E \simeq M \oplus N = \{(m,n) \mid m \in M,\ n \in N\}$. We are going to construct a reduced one-cochain with coefficients in $\operatorname{Chom}(N,M)$ out of this data. Note that such cochains are linear maps $\gamma = \gamma_\lambda(a)_\mu$ from $A \otimes N$ to $M$ depending on two variables $\lambda$ and $\mu$, considered modulo $\lambda - \mu$. Note that $\gamma_\lambda(a)_\mu \bmod (\lambda-\mu)$ is fully determined by the restriction $\gamma_\lambda(a)_\lambda$ to the diagonal $\lambda = \mu$. Define a one-cochain $\gamma \in C^1(A,\operatorname{Chom}(N,M))$ using the action of $A$ on $E$:
$$a_\lambda(m,n) = (a_\lambda m + \gamma_\lambda(a)_\lambda n,\ a_\lambda n) \tag{3.4}$$
for $a \in A$. The conformal antilinearity of $\gamma$: $\gamma_\lambda(\partial a)_\lambda = -\lambda\gamma_\lambda(a)_\lambda$, follows from the fact that $(\partial a)_\lambda(m,n) = -\lambda(a_\lambda(m,n))$. The property $a_\lambda(\partial(m,n)) = (\lambda+\partial)(a_\lambda(m,n))$ of the action of $A$ on $E$ expands as
$$(\partial\gamma)_\lambda = 0, \tag{3.5}$$
which means that $\gamma_\lambda(a)_\lambda$ is a conformal linear map $N \to M$. Finally, the module property (1.5) for elements in $E$ implies that $d\gamma = 0$. If we choose another $\mathbb C[\partial]$-splitting $(m,n)'$ of the extension $E$, it will differ by an element $\beta \in \operatorname{Hom}_{\mathbb C[\partial]}(N,M)$: $(m,n)' = (m + \beta(n), n)$. $\operatorname{Hom}_{\mathbb C[\partial]}(N,M)$ may be identified with the degree-zero part of $\operatorname{Chom}(N,M)$, so that the new one-cocycle becomes $\gamma' = \gamma + d\beta$, therefore defining the same cohomology class. If we have two isomorphic extensions and choose a compatible splitting over $\mathbb C[\partial]$, we will have exactly the same one-cocycles $\gamma$ corresponding to them. This proves that isomorphism classes of extensions give rise to elements of $H^1(A,\operatorname{Chom}(N,M))$.

Conversely, given a cohomology class in $H^1(A,\operatorname{Chom}(N,M))$, we can choose a representative $\gamma \in C^1(A,\operatorname{Chom}(N,M))$ of it to alter the natural $A$-module structure on $M \oplus N$ by adding $\gamma$ to the action of $A$ on $M \oplus N$ as in (3.4). This action will be conformally linear in $(m,n)$, because of (3.5), and antilinear in $A$, because of the conformal antilinearity of $\gamma$. This action will define an $A$-module structure on $M \oplus N$, because $d\gamma = 0$ after the restriction to $\mu = \lambda_1 + \lambda_2$ in $\tilde C^2(A,\operatorname{Chom}(N,M))$. By construction the natural mappings $M \to M \oplus N$ and $M \oplus N \to N$ will be morphisms of $\mathbb C[\partial]$- and $A$-modules. This construction of a new conformal module structure on $M \oplus N$ is independent of the choice of a different representative $\gamma' = \gamma + d\beta$, because it defines an isomorphic structure of an $A$-module on $M \oplus N$. Finally, if $N = \mathbb C$, then $\operatorname{Chom}(\mathbb C,M) = 0$, and therefore, there are no non-trivial $\mathbb C[\partial]$-split extensions.


4. Given a $\mathbb C[\partial]$-split extension of a conformal algebra $A$ by a module $C$, choose a splitting $\tilde A = C \oplus A$ thereof. Then the bracket in $\tilde A$
$$[(0,a)_\lambda(0,b)] = (c_\lambda(a,b),\ a_\lambda b) \qquad \text{for } a, b \in A$$
defines a sesquilinear map $c : A \otimes A \to C[\lambda]$, which we may combine with the natural mapping $C[\lambda] \to C[\lambda_1,\lambda_2]/(\partial+\lambda_1+\lambda_2)$, $p(\lambda) \mapsto p(\lambda_1)$, to get the composite mapping, denoted $c_{\lambda_1,\lambda_2}$. It defines a two-cochain, because it is obviously skew and
$$(c_\lambda(\partial a,b),\ -\lambda a_\lambda b) = [(0,\partial a)_\lambda(0,b)] = [\partial(0,a)_\lambda(0,b)] = -\lambda[(0,a)_\lambda(0,b)] = (-\lambda c_\lambda(a,b),\ -\lambda a_\lambda b),$$
which implies $c_{\lambda_1,\lambda_2}(\partial a,b) = -\lambda_1 c_{\lambda_1,\lambda_2}(a,b)$, and similarly, $c_{\lambda_1,\lambda_2}(a,\partial b) = -\lambda_2 c_{\lambda_1,\lambda_2}(a,b) \bmod (\partial+\lambda_1+\lambda_2)$. In fact, this two-cochain $c$ is a cocycle:
$$\begin{aligned}
dc ={}& a_{\lambda_1}c_{\lambda_2,\lambda_3}(b,c) - b_{\lambda_2}c_{\lambda_1,\lambda_3}(a,c) + c_{\lambda_3}c_{\lambda_1,\lambda_2}(a,b) \\
&- c_{\lambda_1+\lambda_2,\lambda_3}(a_{\lambda_1}b,c) + c_{\lambda_1+\lambda_3,\lambda_2}(a_{\lambda_1}c,b) - c_{\lambda_2+\lambda_3,\lambda_1}(b_{\lambda_2}c,a) = 0.
\end{aligned}$$
This is just because the corresponding three-term relation, the Jacobi relation, is satisfied in $\tilde A$.

The construction of $c$ assumed the choice of a splitting $\tilde A = C \oplus A$. A different splitting would differ by a mapping $f : A \to C$, which can be thought of as $f : A \to C[\lambda]/(\partial+\lambda)$, which would contribute by $df$ to $c$. Thus, any extension determines a cohomology class in $H^2(A,C)$. The above arguments can be traced back to show that a class in the cohomology group defines an extension.

5. Let $D = \mathbb C[\epsilon]/(\epsilon^2)$ be the algebra of dual numbers. Then a first-order deformation of a conformal algebra $A$ is the structure of a conformal algebra over $D$ on $A \otimes D$, so that the map $A \otimes D \to A$, $a \otimes p(\epsilon) \mapsto p(0)\cdot a$, is a morphism of conformal algebras and the action of $\partial$ on $A \otimes D$ is induced from that on the first factor. This means classes of first-order deformations are in bijection with classes of $\mathbb C[\partial]$-split abelian extensions of $A$ with the $A$-module $A$ in the sense of Part 4 of this theorem. Therefore, they are classified by $H^2(A,A)$.

4. Homology

Dualizing the cohomology theory we have defined above, the space $\tilde C_n(A,M)$ of $n$-chains of a conformal algebra $A$ with coefficients in a conformal module $M$ over it is defined as the quotient of $A^{\otimes n} \otimes \operatorname{Hom}(\mathbb C[\lambda_1,\dots,\lambda_n],M)$, where $\operatorname{Hom}(\mathbb C[\lambda_1,\dots,\lambda_n],M)$ is the space of $\mathbb C$-linear maps from the space of polynomials to the module $M$, by the following relations:
1. $a_1 \otimes\cdots\otimes \partial a_i \otimes\cdots\otimes a_n \otimes \phi = -a_1 \otimes\cdots\otimes a_i \otimes\cdots\otimes a_n \otimes T_i\phi$, where $(T_i\phi)(f) = \phi(\lambda_i f)$;
2. $a_1 \otimes\cdots\otimes a_i \otimes\cdots\otimes a_j \otimes\cdots\otimes a_n \otimes \phi = -a_1 \otimes\cdots\otimes a_j \otimes\cdots\otimes a_i \otimes\cdots\otimes a_n \otimes \tau_{ij}^*\phi$, where $(\tau_{ij}^*\phi)(f(\lambda_1,\dots,\lambda_i,\dots,\lambda_j,\dots,\lambda_n)) = \phi(f(\lambda_1,\dots,\lambda_j,\dots,\lambda_i,\dots,\lambda_n))$.


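One can also define a differential which takes $n$-chains to $(n-1)$-chains as follows:
$$\begin{aligned}
\delta(a_1 \otimes\cdots\otimes a_n \otimes \phi) ={}& \sum_{i=1}^{n} (-1)^{i+1}\, p_i\bigl(a_1 \otimes\cdots\otimes \widehat{a_i} \otimes\cdots\otimes a_n \otimes a_{i\lambda_i}\phi\bigr) \\
&+ \sum_{\substack{i,j=1\\ i<j}}^{n} (-1)^{i+j}\, p_{ij}\bigl([a_{i\lambda_i}a_j] \otimes a_1 \otimes\cdots\otimes \widehat{a_i} \otimes\cdots\otimes \widehat{a_j} \otimes\cdots\otimes a_n \otimes \phi\bigr),
\end{aligned}$$
where $p_i$ is the natural pairing map $\mathbb C[\lambda_i] \otimes \operatorname{Hom}(\mathbb C[\lambda_1,\dots,\lambda_n],M) \to \operatorname{Hom}(\mathbb C[\lambda_1,\dots,\widehat{\lambda_i},\dots,\lambda_n],M)$ and $p_{ij}$ is the pairing $\mathbb C[\lambda_i] \otimes \operatorname{Hom}(\mathbb C[\lambda_1,\dots,\lambda_n],M) \to \operatorname{Hom}(\mathbb C[\lambda_i+\lambda_j,\lambda_1,\dots,\widehat{\lambda_i},\dots,\widehat{\lambda_j},\dots,\lambda_n],M)$. Similar computations to those in the cochain case show that the operator $\delta$ is well-defined and $\delta^2 = 0$.

One can define basic homology $\tilde H_\bullet(A,M)$ as the homology of the chain complex and reduced homology as the homology of the subcomplex $C_\bullet(A,M)$ of $\partial$-invariant chains, where $\partial$ acts as
$$\partial(a_1 \otimes\cdots\otimes a_n \otimes \phi) = a_1 \otimes\cdots\otimes a_n \otimes \Bigl(\partial\phi - \sum_{i=1}^{n} T_i\phi\Bigr),$$
where $(\partial\phi)(f) = \partial(\phi(f))$, $f \in \mathbb C[\lambda_1,\dots,\lambda_n]$. There are obviously natural pairings $\tilde H^q(A,M^*) \otimes \tilde H_q(A,M) \to \mathbb C$ and $H^q(A,M^*) \otimes H_q(A,M) \to \mathbb C$ for $q \ge 0$, where $M^* = \operatorname{Hom}_{\mathbb C}(M,\mathbb C)$ is the linear dual space with a natural structure of an $A$-module: $(\partial f)(m) = -f(\partial m)$, $(a_\lambda f)(m) = -f(a_\lambda m)$ for $f \in M^*$, $m \in M$, and $a \in A$. One expects these pairings to be perfect, when, for instance, either of the (co)homology spaces is finite-dimensional.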

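5. Exterior Multiplication, Contraction, and Module Structure

For any $u \in \tilde C^m(A,\mathbb C)$, where $\mathbb C$ is the one-dimensional space with the zero action of $A$, let $\epsilon(u)$ be the operator of exterior multiplication on $\tilde C^\bullet(A,M)$:
$$(\epsilon(u)\gamma)_{\lambda_1,\dots,\lambda_{m+n}}(a_1,\dots,a_{m+n}) = \sum_{\pi \in S_{m+n}} \frac{\operatorname{sign}\pi}{m!\,n!}\, u_{\lambda_{\pi(1)},\dots,\lambda_{\pi(m)}}(a_{\pi(1)},\dots,a_{\pi(m)}) \times \gamma_{\lambda_{\pi(m+1)},\dots,\lambda_{\pi(m+n)}}(a_{\pi(m+1)},\dots,a_{\pi(m+n)}).$$
Define also a wedge product $u \wedge \gamma = \epsilon(u)\gamma$ on $\tilde C^\bullet(A,\mathbb C)$. It is clear that $\epsilon(u \wedge v) = \epsilon(u)\epsilon(v)$ for any $u, v \in V$, therefore, we have a graded commutative associative algebra structure on $\tilde C^\bullet(A,\mathbb C)$, along with a $\tilde C^\bullet(A,\mathbb C)$-module structure on $\tilde C^\bullet(A,M)$.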


Similarly, for any chain $v = a_1 \otimes\cdots\otimes a_n \otimes \phi \in \tilde C_n(A,\mathbb C)$, let $\iota(v)$ be the following contraction operator $\tilde C^m(A,M) \to \tilde C^{m-n}(A,M)$, for $m \ge n$:
$$(\iota(v)\gamma)_{\lambda_{n+1},\dots,\lambda_m}(a_{n+1},\dots,a_m) = p\bigl(\phi \otimes \gamma_{\lambda_1,\dots,\lambda_m}(a_1,\dots,a_m)\bigr),$$
where $p$ is the natural pairing $\mathbb C[\lambda_1,\dots,\lambda_n]^* \otimes \mathbb C[\lambda_1,\dots,\lambda_m] \to \mathbb C[\lambda_{n+1},\dots,\lambda_m]$. Note that for any $u \in \tilde C^1(A,\mathbb C)$ and $v \in \tilde C_1(A,\mathbb C)$,
$$\epsilon(u)\iota(v) + \iota(v)\epsilon(u) = \iota(v)u.$$
Furthermore for any $a \in A$, define the following structure of a module over the conformal algebra $A$ on $\tilde C^\bullet(A,M)$:
$$(\theta_\lambda(a)\gamma)_{\lambda_1,\dots,\lambda_n}(a_1,\dots,a_n) = a_\lambda\gamma_{\lambda_1,\dots,\lambda_n}(a_1,\dots,a_n) - \sum_{i=1}^{n} \gamma_{\lambda_1,\dots,\lambda+\lambda_i,\dots,\lambda_n}(a_1,\dots,[a_\lambda a_i],\dots,a_n).$$
Define $\iota_\lambda(a)$ in a similar fashion:
$$(\iota_\lambda(a)\gamma)_{\lambda_1,\dots,\lambda_{n-1}}(a_1,\dots,a_{n-1}) = \gamma_{\lambda,\lambda_1,\dots,\lambda_{n-1}}(a,a_1,\dots,a_{n-1}).$$
Note that every $a \in A$ defines naturally a one-chain $a \otimes \gamma_{\lambda_0} \in \tilde C_1(A,\mathbb C)$ depending on a parameter $\lambda_0$, where $\gamma_{\lambda_0}(f(\lambda)) = f(\lambda_0)$. Then we have $\iota_\lambda(a) = \iota(a \otimes \gamma_\lambda)$. The fundamental identity $d\iota_\lambda + \iota_\lambda d = \theta_\lambda$ of classical Lie theory is also valid in the context of conformal algebras. It also implies $d\theta_\lambda = \theta_\lambda d$. As in the Lie algebra case, the induced action of $A$ on $\tilde H^\bullet(A,M)$ is trivial, cf. Remark 6.2.

6. Cohomology of Conformal Algebras and Their Annihilation Lie Algebras

6.1. Cohomology of the basic complex. Let $A$ be a conformal algebra and $M$ a conformal module over it. Then $M$ is a module over the annihilation Lie algebra $\mathfrak g_- = (\operatorname{Lie} A)_-$, see Sect. 1. Let $C^\bullet(\mathfrak g_-,M)$ be the Chevalley–Eilenberg complex defining the cohomology of $\mathfrak g_-$ with coefficients in $M$. Recall that, by definition (see, e.g., [F]), $C^n(\mathfrak g_-,M)$ is the space of skew-symmetric linear functionals $\gamma : (\mathfrak g_-)^{\otimes n} \to M$ which are continuous, i.e., $\gamma(a_{1m_1} \otimes\cdots\otimes a_{nm_n}) = 0$ for all but a finite number of $m_1,\dots,m_n \in \mathbb Z_+$, where $a_1,\dots,a_n \in A$, and $a_{im_i} \in \mathfrak g_- = (\operatorname{Lie} A)_- = A[t]/(\partial+\partial_t)A[t]$ is the image of the element $a_i t^{m_i}$. $C^\bullet(\mathfrak g_-,M)$ has the following structure of a $\mathbb C[\partial]$-module:
$$(\partial\gamma)(a_1 \otimes\cdots\otimes a_n) = \partial\bigl(\gamma(a_1 \otimes\cdots\otimes a_n)\bigr) - \sum_{i=1}^{n} \gamma(a_1 \otimes\cdots\otimes \partial a_i \otimes\cdots\otimes a_n), \qquad \gamma \in C^n(\mathfrak g_-,M). \tag{6.1}$$


Theorem 6.1. There is a canonical isomorphism of complexes $\tilde C^\bullet(A,M)$ and $C^\bullet(\mathfrak g_-,M)$, compatible with the action of $\mathbb C[\partial]$. Consequently, the complex $C^\bullet(A,M)$ is isomorphic to $C^\bullet(\mathfrak g_-,M)/\partial C^\bullet(\mathfrak g_-,M)$.

Proof. For a cochain $\alpha \in \tilde C^n(A,M)$, we write
$$\alpha_{\lambda_1,\dots,\lambda_n}(a_1,\dots,a_n) = \sum_{m_1,\dots,m_n \in \mathbb Z_+} \lambda_1^{(m_1)}\cdots\lambda_n^{(m_n)}\, \alpha_{(m_1,\dots,m_n)}(a_1,\dots,a_n).$$
In terms of the linear maps
$$\alpha_{(m_1,\dots,m_n)} : A^{\otimes n} \to M, \qquad a_1 \otimes\cdots\otimes a_n \mapsto \alpha_{(m_1,\dots,m_n)}(a_1,\dots,a_n),$$
the definition of $\tilde C^\bullet(A,M)$ translates as follows.
1. For any $a_1,\dots,a_n \in A$, $\alpha_{(m_1,\dots,m_n)}(a_1,\dots,a_n)$ is non-zero for only a finite number of $(m_1,\dots,m_n)$.
2. $\alpha_{(m_1,\dots,m_i,\dots,m_n)}(a_1,\dots,\partial a_i,\dots,a_n) = -m_i\,\alpha_{(m_1,\dots,m_i-1,\dots,m_n)}(a_1,\dots,a_i,\dots,a_n)$.
3. $\alpha$ is skew-symmetric with respect to simultaneous permutations of $a_i$'s and $m_i$'s.

The differential is given by:
$$\begin{aligned}
(d\gamma)_{(m_1,\dots,m_{n+1})}(a_1,\dots,a_{n+1}) ={}& \sum_{i=1}^{n+1} (-1)^{i+1}\, a_{i(m_i)}\,\gamma_{(m_1,\dots,\widehat{m_i},\dots,m_{n+1})}(a_1,\dots,\widehat{a_i},\dots,a_{n+1}) \\
&+ \sum_{\substack{i,j=1\\ i<j}}^{n+1} \sum_{k=0}^{m_i} (-1)^{i+j} \binom{m_i}{k}\, \gamma_{(m_i+m_j-k,\,m_1,\dots,\widehat{m_i},\dots,\widehat{m_j},\dots,m_{n+1})}(a_{i(k)}a_j,\,a_1,\dots,\widehat{a_i},\dots,\widehat{a_j},\dots,a_{n+1}).
\end{aligned}$$
Define linear maps $\phi_n : \tilde C^n(A,M) \to C^n(\mathfrak g_-,M)$ by the formula
$$(\phi_n\alpha)(a_{1m_1} \otimes\cdots\otimes a_{nm_n}) = \alpha_{(m_1,\dots,m_n)}(a_1,\dots,a_n).$$
They are well-defined due to Condition 2 above. Clearly, $\phi_n$ are bijective and, using (1.3), it is easy to see that $\phi_{n+1} \circ d = d \circ \phi_n$. Moreover, $\phi_n \circ \partial = \partial \circ \phi_n$, where $\partial$ acts on $\tilde C^\bullet(A,M)$ via (2.2) and on $C^\bullet(\mathfrak g_-,M)$ via (6.1).

Corollary 6.1. $\tilde H^\bullet(A,M) \simeq H^\bullet(\mathfrak g_-,M)$.

Remark 6.1. Similar results hold for homology. To a chain $a_1 \otimes\cdots\otimes a_n \otimes \phi \in \tilde C_n(A,M)$ ($a_i \in A$, $\phi \in \operatorname{Hom}(\mathbb C[\lambda_1,\dots,\lambda_n],M)$) we associate the chain
$$\langle \phi,\ a_{1\lambda_1} \otimes\cdots\otimes a_{n\lambda_n} \rangle \in C_n(\mathfrak g_-,M).$$
In other words, $a_1 \otimes\cdots\otimes a_n \otimes \partial_{\lambda_1}^{(m_1)}\cdots\partial_{\lambda_n}^{(m_n)}\big|_{\lambda_1=\cdots=\lambda_n=0}$ corresponds to $a_{1m_1} \otimes\cdots\otimes a_{nm_n}$.


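Remark 6.2. One can easily see that the exterior multiplication, contraction, module structure, etc., of Sect. 5 are equivalent to the corresponding notions for the annihilation Lie algebra $\mathfrak g_-$. For example, if $\theta(a_m)$ denotes the action of $a_m \in \mathfrak g_-$ on $C^\bullet(\mathfrak g_-,M) \simeq \tilde C^\bullet(A,M)$, then
$$\theta_\lambda(a) = \sum_{m \in \mathbb Z_+} \lambda^{(m)}\theta(a_m).$$
In particular, the action of $A$ on $\tilde H^\bullet(A,M)$ is trivial.

6.2. Cohomology of the reduced complex. Now we assume that $M$ is a free $\mathbb C[\partial]$-module: $M = \mathbb C[\partial] \otimes_{\mathbb C} U$ for some vector space $U$. Then the $\mathfrak g_-$-module $V_- = V(M)_-$ is just $U[t]$ with
$$\partial(ut^n) = -n\,ut^{n-1}, \qquad a_m(ut^n) = \sum_{j=0}^{m} \binom{m}{j} (a_{(j)}u)\,t^{m+n-j},$$
for $u \in U$, $a \in A$, see Sect. 1. In terms of the usual generating series $a_\lambda = \sum_{m \ge 0} \lambda^{(m)}a_m$, this can be rewritten as
$$a_\lambda(ut^n) = (a_\lambda u)\,t^n e^{t\lambda}.$$

Theorem 6.2. If $A$ is a conformal algebra and $M$ a conformal module which is free as a $\mathbb C[\partial]$-module, then the complex $C^\bullet(A,M)$ is isomorphic to the subcomplex $C^\bullet(\mathfrak g_-,V_-)^\partial$ of $\partial$-invariant cochains in $C^\bullet(\mathfrak g_-,V_-)$.

Proof. Let $\beta \in C^n(\mathfrak g_-,V_-)$. As in the proof of Theorem 6.1, consider the generating series
$$\beta_{\lambda_1,\dots,\lambda_n;t}(a_1,\dots,a_n) = \sum_{m_1,\dots,m_n \in \mathbb Z_+} \lambda_1^{(m_1)}\cdots\lambda_n^{(m_n)}\, \beta(a_{1m_1} \otimes\cdots\otimes a_{nm_n}). \tag{6.2}$$
By Eq. (6.1), $\partial$ acts on $\beta_{\lambda_1,\dots,\lambda_n;t}$ as $-\partial_t + \sum\lambda_i$. Hence $\beta$ is $\partial$-invariant, iff
$$\beta_{\lambda_1,\dots,\lambda_n;t}(a_1,\dots,a_n) = \gamma_{\lambda_1,\dots,\lambda_n}(a_1,\dots,a_n)\, e^{t\sum\lambda_i}, \tag{6.3}$$
where $\gamma_{\lambda_1,\dots,\lambda_n} = \beta_{\lambda_1,\dots,\lambda_n;t}|_{t=0}$ takes values in $U$. Identifying $U$ with $1 \otimes U \subset M$, we can consider $\gamma$ as an element of $\tilde C^n(A,M)$. It is easy to check that $\beta \mapsto \bar\gamma := \gamma \bmod (\partial + \sum\lambda_i)$ is a chain map from $C^\bullet(\mathfrak g_-,V_-)$ to $C^\bullet(A,M)$.

Conversely, for $\bar\gamma \in C^n(A,M)$ choose a representative $\gamma \in \tilde C^n(A,M)$ such that $\bar\gamma = \gamma \bmod (\partial + \sum\lambda_i)$. Define $\beta \in C^n(\mathfrak g_-,V_-)^\partial$ by (6.2, 6.3) with $\partial$ substituted by $-\partial_t$ in $\gamma_{\lambda_1,\dots,\lambda_n}(a_1,\dots,a_n) \in M = U[\partial]$. Then clearly, $\beta$ is independent of the choice of $\gamma$. The correspondence $\beta \leftrightarrow \bar\gamma$ establishes an isomorphism between $C^\bullet(\mathfrak g_-,V_-)^\partial$ and $C^\bullet(A,M)$.

Remark 6.3. Identifying $C^\bullet(A,M)$ with $C^\bullet(\mathfrak g_-,M)/\partial C^\bullet(\mathfrak g_-,M)$, we can rewrite (6.3) as
$$\beta(a_{1m_1} \otimes\cdots\otimes a_{nm_n}) = \sum_{k_1,\dots,k_n \in \mathbb Z_+} \binom{m_1}{k_1}\cdots\binom{m_n}{k_n}\, \gamma(a_{1k_1} \otimes\cdots\otimes a_{nk_n})\, t^{\sum m_i - \sum k_i} \qquad (a_i \in A,\ m_i \in \mathbb Z_+).$$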


6.3. Cohomology of conformal algebras and formal distribution Lie algebras. Let $\mathfrak g$ be the maximal formal distribution Lie algebra corresponding to a conformal algebra $A$ (see Sect. 1). Suppose $\gamma \in C^n(A,M)$. The following formula defines an $n$-cochain $\tilde\gamma$ on the Lie algebra $\mathfrak g$:
$$\tilde\gamma(a_1f_1(t),\dots,a_nf_n(t)) = \operatorname*{Res}_{\lambda_1,\dots,\lambda_n} \gamma_{\partial_1,\dots,\partial_n}(a_1,\dots,a_n)\,\delta(\lambda_1-\lambda_2)\cdots\delta(\lambda_1-\lambda_n)\, f_1(\lambda_1)\cdots f_n(\lambda_n),$$
where $a_i \in A$, $f_i \in \mathbb C[t]$, $\partial_i = \partial/\partial\lambda_i$, and when substituting $\partial$ into a polynomial, one has to use the divided powers $\partial^{(k)} = \partial^k/k!$. This formula is equivalent to the one from Remark 6.3, where $m_i$'s are now allowed to take negative values. This correspondence defines a morphism of complexes and, therefore, of cohomology.

7. Cohomology of the Virasoro Conformal Algebra

The conformal algebra with one free generator $L$ as a $\mathbb C[\partial]$-module and λ-bracket $[L_\lambda L] = (\partial + 2\lambda)L$ is called the Virasoro conformal algebra Vir, cf. Example 1.2.

7.1. Cohomology of Vir with trivial coefficients. Here we will compute the cohomology of Vir with trivial coefficients $\mathbb C$, where both $\partial$ and $L$ act by zero.

Theorem 7.1. For the Virasoro conformal algebra Vir,
$$\dim \tilde H^q(\mathrm{Vir},\mathbb C) = \begin{cases} 1 & \text{if } q = 0 \text{ or } 3, \\ 0 & \text{otherwise,} \end{cases}$$
and
$$\dim H^q(\mathrm{Vir},\mathbb C) = \begin{cases} 1 & \text{if } q = 0, 2, \text{ or } 3, \\ 0 & \text{otherwise.} \end{cases}$$

Proof. Let us first identify the cohomology complex. An $n$-cochain $\gamma$ in this case is determined by its value on $L^{\otimes n}$:
$$P(\lambda_1,\dots,\lambda_n) = \gamma_{\lambda_1,\dots,\lambda_n}(L,\dots,L).$$
Obviously, $P(\lambda_1,\dots,\lambda_n)$ is a skew-symmetric polynomial with values in $\mathbb C$. The differential is then determined by the following formula:
$$(dP)(\lambda_1,\dots,\lambda_{n+1}) = \sum_{\substack{i,j=1\\ i<j}}^{n+1} (-1)^{i+j}(\lambda_i - \lambda_j)\, P(\lambda_i+\lambda_j,\lambda_1,\dots,\widehat{\lambda_i},\dots,\widehat{\lambda_j},\dots,\lambda_{n+1}).$$
This describes the complex $\tilde C^\bullet$. The complex $C^\bullet$ producing the cohomology of Vir is nothing but the quotient of $\tilde C^\bullet$ by the ideal spanned by $\sum_{i=1}^{n}\lambda_i$ in each degree $n$. In other words, $C^n$ is the space of regular (polynomial) functions on the hyperplane $\sum_{i=1}^{n}\lambda_i = 0$ in $\mathbb C^n$ which are skew in the variables $\lambda_1,\dots,\lambda_n$. This complex appeared as an intermediate step in Gelfand–Fuchs's 1968 computation [GF1] of the cohomology of the Virasoro Lie algebra, and the cohomology of $C^\bullet$ was computed therein.

Consider the following homotopy operator $k : \tilde C^q \to \tilde C^{q-1}$:
$$k(P) = (-1)^q \left.\frac{\partial P}{\partial\lambda_q}\right|_{\lambda_q=0}.$$
A straightforward computation shows that $(dk+kd)P = (\deg P - q)P$ for $P \in \tilde C^q$, where $\deg P$ is the total degree of $P$ in $\lambda_1,\dots,\lambda_q$. Thus, only those homogeneous cochains whose degree as a polynomial is equal to their degree as a cochain contribute to the cohomology of $\tilde C^\bullet$. These polynomials must be skew and therefore divisible by $\Lambda_q = \prod_{i<j}(\lambda_i - \lambda_j)$, whose polynomial degree is $q(q-1)/2$. The quadratic inequality $q(q-1)/2 \le q$ has $q = 0, 1, 2$, and $3$ as the only integral solutions. For $q = 0$, the whole $\tilde C^0 = \mathbb C$ contributes to $H^0(\tilde C^\bullet)$. For $q = 1$, the only polynomial of degree 1 is $\lambda_1$, up to a constant factor. $d\lambda_1 = \lambda_2^2 - \lambda_1^2$, which is the only skew polynomial of degree 2 in two variables. This shows that $\tilde H^1 = \tilde H^2 = 0$. Finally, for $q = 3$, the only skew polynomial of degree 3 in 3 variables is $\Lambda_3$, up to a constant. It is easy to see that this polynomial represents a non-trivial class in the cohomology. Indeed, it is closed, because a skew-symmetric function in four variables has a degree at least 6, which is greater than $\deg(d\Lambda_3) = 4$. And $\Lambda_3$ is not a coboundary, because it could only be the coboundary of a two-cochain of degree 2, which must be a constant multiple of $\lambda_2^2 - \lambda_1^2 = d\lambda_1$, whose coboundary is zero.

The computation of the cohomology of the quotient complex $C^\bullet$ is based on the short exact sequence
$$0 \to \partial\tilde C^\bullet \to \tilde C^\bullet \to C^\bullet \to 0. \tag{7.1}$$

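By definition, $\partial\tilde C^0 = 0$. To find the cohomology of $\partial\tilde C^\bullet$, define a homotopy $k_1 : \partial\tilde C^q \to \partial\tilde C^{q-1}$ as $k_1(\partial P) = \partial k(P)$, where $\partial = \sum_i \lambda_i$ and $P \in \tilde C^q$. Then $(dk_1 + k_1d)\partial P = (\deg P - q)\partial P$. As in the previous paragraph, this implies that $\deg P = q = 0, 1, 2$, or $3$. Up to constant factors, the only polynomials in $\partial\tilde C^\bullet$ with this property are $P_1 = \lambda_1^2$ for $q = 1$, $P_2 = (\lambda_1+\lambda_2)(\lambda_1^2-\lambda_2^2)$ for $q = 2$, and $P_3 = (\lambda_1+\lambda_2+\lambda_3)\Lambda_3$ for $q = 3$. One computes: $dP_1 = -P_2$ and $dP_3 = 0$. Therefore $H^q(\partial\tilde C^\bullet) = 0$ for all $q$ but $q = 3$, where it is one-dimensional with the generator $P_3$.

Thus, the long exact sequence of cohomology associated with (7.1) looks as follows:
$$\begin{array}{ccccccc}
0 \longrightarrow & 0 & \longrightarrow & \mathbb C & \longrightarrow & H^0(\mathrm{Vir},\mathbb C) & \longrightarrow \\
\longrightarrow & 0 & \longrightarrow & 0 & \longrightarrow & H^1(\mathrm{Vir},\mathbb C) & \longrightarrow \\
\longrightarrow & 0 & \longrightarrow & 0 & \longrightarrow & H^2(\mathrm{Vir},\mathbb C) & \longrightarrow \\
\longrightarrow & \mathbb C P_3 & \longrightarrow & \mathbb C\Lambda_3 & \longrightarrow & H^3(\mathrm{Vir},\mathbb C) & \longrightarrow \\
\longrightarrow & 0 & \longrightarrow & 0 & \longrightarrow & H^4(\mathrm{Vir},\mathbb C) & \longrightarrow 0 \longrightarrow \cdots .
\end{array}$$
We see that $H^0(\mathrm{Vir},\mathbb C) = \mathbb C$ and $H^q(\mathrm{Vir},\mathbb C) = 0$ for $q = 1, 4, 5, 6, \dots$ and $H^3(\mathrm{Vir},\mathbb C) = \mathbb C\Lambda_3$ and $H^2(\mathrm{Vir},\mathbb C) = \mathbb C(\lambda_1^3 - \lambda_2^3)$, because $d(\lambda_1^3 - \lambda_2^3) = P_3$.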
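The polynomial identities used in this proof can be checked by direct evaluation. The following is a minimal sketch, not part of the original argument: the helper names `d`, `equal_on_grid`, `Lambda3`, `P1`, `P2`, `P3` and `cubic` are hypothetical, and cochains are modeled as plain Python functions of integers. The function `d` encodes the differential of the basic complex stated above, and equality of polynomials is tested exactly on a grid with 7 points per variable, which suffices because every polynomial involved has degree at most 6 in each variable. With $\Lambda_3 = \prod_{i<j}(\lambda_i - \lambda_j)$ as coded here, the last identity comes out as $d(\lambda_1^3 - \lambda_2^3) = -P_3$; the overall sign depends on the orientation chosen for $\Lambda_3$ and does not affect the cohomology.

```python
from itertools import combinations, product


def d(P, n):
    """Differential of the basic complex of Vir with trivial coefficients.

    P is an n-cochain given as a function of n numbers. The result is a
    function of n + 1 numbers computing
    sum_{i<j} (-1)^(i+j) (l_i - l_j) P(l_i + l_j, remaining variables).
    """
    def dP(*lam):
        total = 0
        for i, j in combinations(range(n + 1), 2):  # 0-based, i < j
            rest = [lam[k] for k in range(n + 1) if k not in (i, j)]
            sign = (-1) ** (i + j)  # same parity as the 1-based (i+1)+(j+1)
            total += sign * (lam[i] - lam[j]) * P(lam[i] + lam[j], *rest)
        return total
    return dP


def equal_on_grid(F, G, nvars, pts=range(-3, 4)):
    """Exact equality test for polynomials of degree <= 6 in each variable."""
    return all(F(*x) == G(*x) for x in product(pts, repeat=nvars))


def Lambda3(a, b, c):
    return (a - b) * (a - c) * (b - c)


def P1(a):
    return a ** 2


def P2(a, b):
    return (a + b) * (a ** 2 - b ** 2)


def P3(a, b, c):
    return (a + b + c) * Lambda3(a, b, c)


def cubic(a, b):
    return a ** 3 - b ** 3


checks = {
    "d(l1) = l2^2 - l1^2": equal_on_grid(d(lambda a: a, 1), lambda a, b: b * b - a * a, 2),
    "dP1 = -P2": equal_on_grid(d(P1, 1), lambda a, b: -P2(a, b), 2),
    "d(Lambda3) = 0": equal_on_grid(d(Lambda3, 3), lambda *x: 0, 4),
    "dP3 = 0": equal_on_grid(d(P3, 3), lambda *x: 0, 4),
    "d(l1^3 - l2^3) = -P3": equal_on_grid(d(cubic, 2), lambda a, b, c: -P3(a, b, c), 3),
}
for name, ok in checks.items():
    print(name, ok)
```

Evaluating on a grid instead of expanding symbolically keeps the sketch free of third-party dependencies; a computer algebra system would give the same identities in expanded form.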


Remark 7.1. In fact, this computation shows that the cohomology of the Virasoro conformal algebra is the primitive part of the cohomology ring of the Virasoro Lie algebra, in addition to $\mathbb C$ in degree 0. The reduction of the basic cohomology to the computation of Gelfand and Fuchs [GF1] might be made using Corollary 6.1, but we preferred to use a direct argument in the proof.

Remark 7.2. Instead of the trivial Vir-module $\mathbb C$, consider the module $\mathbb C_a$, which is the one-dimensional vector space $\mathbb C$ on which all elements of Vir act by zero, and $\partial v = av$ for $v \in \mathbb C_a$, $a \ne 0$ being a given complex constant. Then Proposition 2.1 shows that $H^q(\partial\tilde C^\bullet) \simeq \tilde H^q(\mathrm{Vir},\mathbb C_a)$ for $q \ge 0$, and the long exact sequence (2.3) combined with the computation of $\tilde H^\bullet(\mathrm{Vir},\mathbb C_a)$, which is obviously isomorphic to $\tilde H^\bullet(\mathrm{Vir},\mathbb C)$, provided by Theorem 7.1, shows that $H^q(\mathrm{Vir},\mathbb C_a) = 0$ for all $q$.

7.2. Cohomology of Vir with coefficients in $M_{\Delta,\alpha}$. Recall (Example 1.2) that $M_{\Delta,\alpha}$ ($\Delta, \alpha \in \mathbb C$) is the following Vir-module
$$M_{\Delta,\alpha} = \mathbb C[\partial]v, \qquad L_\lambda v = (\partial + \alpha + \Delta\lambda)v.$$
As in the previous subsection, we identify the space of $n$-cochains $C^n(\mathrm{Vir},M_{\Delta,\alpha})$ with the space of all $\mathbb C$-valued skew-symmetric polynomials in $n$ variables: for any $\gamma \in C^n(\mathrm{Vir},M_{\Delta,\alpha})$, there is a unique polynomial $P(\lambda_1,\dots,\lambda_n)$ such that
$$\gamma_{\lambda_1,\dots,\lambda_n}(L,\dots,L) = P(\lambda_1,\dots,\lambda_n)\,v \mod (\partial + \lambda_1 + \cdots + \lambda_n).$$
Then the differential is given by the formula
$$\begin{aligned}
(dP)(\lambda_1,\dots,\lambda_{n+1}) ={}& \sum_{i=1}^{n+1} (-1)^{i+1}\Bigl(\alpha - \sum_{j=1}^{n+1}\lambda_j + \Delta\lambda_i\Bigr) P(\lambda_1,\dots,\widehat{\lambda_i},\dots,\lambda_{n+1}) \\
&+ \sum_{\substack{i,j=1\\ i<j}}^{n+1} (-1)^{i+j}(\lambda_i - \lambda_j)\, P(\lambda_i+\lambda_j,\lambda_1,\dots,\widehat{\lambda_i},\dots,\widehat{\lambda_j},\dots,\lambda_{n+1}).
\end{aligned}$$
Now we interpret this in terms of the Lie algebra $\operatorname{Vect}\mathbb C$ of regular vector fields on $\mathbb C$, which is the annihilation algebra of Vir, see Sect. 1. To $\gamma \in C^n(\mathrm{Vir},M_{\Delta,\alpha})$ we associate a linear map $\beta : \bigwedge^n \operatorname{Vect}\mathbb C \to \mathbb C$ by the formula
$$\sum_{m_1,\dots,m_n \in \mathbb Z_+} \lambda_1^{(m_1)}\cdots\lambda_n^{(m_n)}\, \beta(L_{(m_1)} \wedge\cdots\wedge L_{(m_n)}) = P(\lambda_1,\dots,\lambda_n),$$
where $L_{(m)} = -t^m\partial_t$.


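Then the differential is
$$\begin{aligned}
(d\beta)(L_{(m_1)} \wedge\cdots\wedge L_{(m_{n+1})}) ={}& \sum_{i=1}^{n+1} (-1)^{i+1}\bigl(\alpha\,\delta_{m_i,0} + (\Delta-1)\,\delta_{m_i,1}\bigr)\, \beta\bigl(L_{(m_1)} \wedge\cdots\wedge \widehat{L_{(m_i)}} \wedge\cdots\wedge L_{(m_{n+1})}\bigr) \\
&+ \sum_{\substack{i,j=1\\ i<j}}^{n+1} (-1)^{i+j}(1-\delta_{m_i,0})(1-\delta_{m_j,0})\, \beta\bigl([L_{(m_i)},L_{(m_j)}] \wedge L_{(m_1)} \wedge\cdots\wedge \widehat{L_{(m_i)}} \wedge\cdots\wedge \widehat{L_{(m_j)}} \wedge\cdots\wedge L_{(m_{n+1})}\bigr).
\end{aligned}$$
Let $\operatorname{Vect}_0\mathbb C$ be the subalgebra of $\operatorname{Vect}\mathbb C$ of vector fields that vanish at the origin. It is spanned by the elements $L_{(m)} = -t^m\partial_t$, $m \ge 1$. Let $U_\Delta$ be a 1-dimensional $\operatorname{Vect}_0\mathbb C$-module on which $L_{(m)}$ acts as 0 for $m \ge 2$ and $L_{(1)}$ acts as a multiplication by $\Delta$.

Theorem 7.2.
1. $H^\bullet(\mathrm{Vir},M_{\Delta,\alpha}) = 0$ if $\alpha \ne 0$.
2. $H^q(\mathrm{Vir},M_{\Delta,0}) \simeq H^q(\operatorname{Vect}_0\mathbb C,U_{\Delta-1}) \oplus H^{q-1}(\operatorname{Vect}_0\mathbb C,U_{\Delta-1})$ for any $q$ ($H^q = 0$ for $q < 0$ by definition).
3. $\dim H^q(\mathrm{Vir},M_{\Delta,0}) = \dim H^q(\operatorname{Vect}\mathbb C,\mathbb C[t,t^{-1}](dt)^{1-\Delta})$. Explicitly:
$$\dim H^q(\mathrm{Vir},M_{1-(3r^2\pm r)/2,\,0}) = \begin{cases} 2 & \text{for } q = r+1, \\ 1 & \text{for } q = r,\ r+2, \\ 0 & \text{otherwise,} \end{cases}$$
and $H^q(\mathrm{Vir},M_{\Delta,0}) = 0$ if $\Delta \ne 1 - (3r^2 \pm r)/2$ for any $r \in \mathbb Z_+$.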
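Part 3 of Theorem 7.2 amounts to a lookup: the cohomology is non-zero only when $1 - \Delta$ is a generalized pentagonal number $(3r^2 \pm r)/2$. The sketch below transcribes that table as stated, for integer $\Delta$ and $\alpha = 0$ only. The function names `pentagonal_index` and `dim_reduced_vir` are hypothetical, and the code merely reads off the statement of the theorem; it does not compute cohomology.

```python
def pentagonal_index(delta):
    """Return r >= 0 with delta == 1 - (3r^2 +/- r)/2, or None if no such r.

    Only integer delta can have this form, since (3r^2 +/- r)/2 is an integer.
    """
    target = 1 - delta
    r = 0
    while (3 * r * r - r) // 2 <= target:
        if target in ((3 * r * r - r) // 2, (3 * r * r + r) // 2):
            return r
        r += 1
    return None


def dim_reduced_vir(q, delta):
    """dim H^q(Vir, M_{delta,0}) as tabulated in Part 3 of Theorem 7.2."""
    r = pentagonal_index(delta)
    if r is None:
        return 0
    if q == r + 1:
        return 2
    if q in (r, r + 2):
        return 1
    return 0


# Delta = 1 corresponds to r = 0: dimensions 1, 2, 1 in degrees 0, 1, 2.
print([dim_reduced_vir(q, 1) for q in range(4)])
# Delta = 0 corresponds to r = 1: dimensions 1, 2, 1 in degrees 1, 2, 3.
print([dim_reduced_vir(q, 0) for q in range(5)])
```

The exceptional values of $\Delta$ produced this way are $1, 0, -1, -4, -6, -11, \dots$; for every other $\Delta$ the function returns 0 in all degrees.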

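Proof. We have seen that the complex $C^\bullet(\mathrm{Vir},M_{\Delta,\alpha})$ is isomorphic to $\bigl(\bigwedge^\bullet \operatorname{Vect}\mathbb C\bigr)^*$ endowed with the above non-standard differential. Let $\pi : \bigl(\bigwedge^q \operatorname{Vect}\mathbb C\bigr)^* \to \bigl(\bigwedge^q \operatorname{Vect}_0\mathbb C\bigr)^*$ be the restriction map. It is easy to see that in fact $\pi$ is a chain map from $C^q(\mathrm{Vir},M_{\Delta,\alpha})$ to $C^q(\operatorname{Vect}_0\mathbb C,U_{\Delta-1})$, where we identify $U_{\Delta-1} = \mathbb C$ as a vector space. Define another map $\iota : \bigl(\bigwedge^{q-1} \operatorname{Vect}_0\mathbb C\bigr)^* \to \bigl(\bigwedge^q \operatorname{Vect}\mathbb C\bigr)^*$ by the formula
$$(\iota\beta)(L_{(m_1)} \wedge\cdots\wedge L_{(m_q)}) = \sum_{i=1}^{q} (-1)^{i+1}\delta_{m_i,0}\, \beta\bigl(L_{(m_1)} \wedge\cdots\wedge \widehat{L_{(m_i)}} \wedge\cdots\wedge L_{(m_q)}\bigr).$$
Then $\iota$ is a chain map from $C^{q-1}(\operatorname{Vect}_0\mathbb C,U_{\Delta-1})$ to $C^q(\mathrm{Vir},M_{\Delta,\alpha})$. We have a short exact sequence of complexes
$$0 \to C^{q-1}(\operatorname{Vect}_0\mathbb C,U_{\Delta-1}) \xrightarrow{\ \iota\ } C^q(\mathrm{Vir},M_{\Delta,\alpha}) \xrightarrow{\ \pi\ } C^q(\operatorname{Vect}_0\mathbb C,U_{\Delta-1}) \to 0.$$
A splitting $\phi : C^q(\operatorname{Vect}_0\mathbb C,U_{\Delta-1}) \to \bigl(\bigwedge^q \operatorname{Vect}\mathbb C\bigr)^*$ is given by the formula
$$(\phi\beta)(L_{(m_1)} \wedge\cdots\wedge L_{(m_q)}) = \begin{cases} \beta(L_{(m_1)} \wedge\cdots\wedge L_{(m_q)}) & \text{if all } m_i \ge 1, \\ 0 & \text{otherwise.} \end{cases}$$
One checks that if $d\beta = 0$ then $d\phi\beta = \alpha\,\iota\beta$.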


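Hence, the cohomology long exact sequence associated to the above short exact sequence of complexes looks as follows:
$$\cdots \xrightarrow{\ \alpha\,\mathrm{id}\ } H^{q-1}(\operatorname{Vect}_0\mathbb C,U_{\Delta-1}) \xrightarrow{\ \iota\ } H^q(\mathrm{Vir},M_{\Delta,\alpha}) \xrightarrow{\ \pi\ } H^q(\operatorname{Vect}_0\mathbb C,U_{\Delta-1}) \xrightarrow{\ \alpha\,\mathrm{id}\ } H^q(\operatorname{Vect}_0\mathbb C,U_{\Delta-1}) \xrightarrow{\ \iota\ } \cdots .$$
This proves Parts 1 and 2. Part 3 follows from Part 2 and the results of Feigin and Fuchs, see [F, §2.3]. (Note that our $U_\Delta$ is exactly their $E_{-\Delta}$.)

8. Cohomology of Current Conformal Algebras

8.1. Cohomology with trivial coefficients. Here we will compute the cohomology of a current conformal algebra $\operatorname{Cur}\mathfrak g$ with trivial coefficients for a finite-dimensional semisimple Lie algebra $\mathfrak g$. Recall from Example 1.1 that the current conformal algebra $\operatorname{Cur}\mathfrak g$ is $\mathbb C[\partial] \otimes \mathfrak g$ with the λ-bracket
$$[a_\lambda b] = [a,b] \qquad \text{for } a, b \in \mathfrak g.$$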

The basic complex in this case becomes bigraded, the second grading given by the total degree in $\lambda_i$, which we will call the λ-degree, of the restriction of the cochain to the subspace $\mathfrak g$ of generators of $\operatorname{Cur}\mathfrak g$. The differential respects the λ-degree, and therefore the complex splits into the direct sum of its graded subcomplexes. Let $\tilde C_0^\bullet \subset \tilde C^\bullet$ be the subcomplex of zero λ-degree. This subcomplex is obviously isomorphic to the Chevalley–Eilenberg complex $C^\bullet(\mathfrak g,\mathbb C)$ of the Lie algebra $\mathfrak g$.

Theorem 8.1.
1. The embedding $C^\bullet(\mathfrak g,\mathbb C) \subset \tilde C^\bullet$ is a quasi-isomorphism, i.e., it induces an isomorphism on cohomology. Therefore,
$$\tilde H^\bullet(\operatorname{Cur}\mathfrak g,\mathbb C) \simeq H^\bullet(\mathfrak g,\mathbb C) \simeq \Bigl(\bigwedge\nolimits^{\!\bullet}\mathfrak g^*\Bigr)^{\mathfrak g}.$$
2. For $q \ge 0$,
$$H^q(\operatorname{Cur}\mathfrak g,\mathbb C) \simeq H^q(\mathfrak g,\mathbb C) \oplus H^{q+1}(\mathfrak g,\mathbb C).$$

Proof. 1. According to Theorem 6.1, the complexes $\tilde C^\bullet(\operatorname{Cur}\mathfrak g,\mathbb C)$ and $C^\bullet(\mathfrak g[t],\mathbb C)$ are isomorphic, because $\mathfrak g[t]$ is the annihilation subalgebra of $\operatorname{Cur}\mathfrak g$, see Example 1.1. Moreover, the part of λ-degree zero maps isomorphically to the Chevalley–Eilenberg complex $C^\bullet(\mathfrak g,\mathbb C)$, which is the subcomplex of $C^\bullet(\mathfrak g[t],\mathbb C)$ of cochains vanishing on $t\mathfrak g[t]$. Thus, $\tilde H^\bullet(\operatorname{Cur}\mathfrak g,\mathbb C) \simeq H^\bullet(\mathfrak g[t],\mathbb C)$, which is isomorphic to $H^\bullet(\mathfrak g,\mathbb C)$ via the subcomplex of cochains vanishing on $t\mathfrak g[t]$ by a result of Feigin [Fe1, Fe2]; see a different proof of Feigin's result in Sect. 8.2, which covers the case of non-trivial coefficients as well. The computation of $H^\bullet(\mathfrak g,\mathbb C)$ via the invariants of the dual exterior algebra is standard, see e.g., [Fe2].

2. Consider the long exact sequence (2.3). The mapping $H^q(\partial\tilde C^\bullet) \to H^q(\tilde C^\bullet)$ for $q \ge 1$ is zero, because the cohomology $H^q(\tilde C^\bullet)$ is concentrated in λ-degree zero (see the first statement of the theorem) and the cohomology $H^q(\partial\tilde C^\bullet)$ is concentrated in λ-degree one (see Proposition 2.1). The same is true even for $q = 0$, because $\partial\tilde C^0 = 0$ and the degree-zero differential $d : \tilde C^0 \to \tilde C^1$ is zero. Thus (2.3) splits into the short exact sequences
$$0 \to H^q(\mathfrak g,\mathbb C) \to H^q(\operatorname{Cur}\mathfrak g,\mathbb C) \to H^{q+1}(\mathfrak g,\mathbb C) \to 0$$
for each $q \ge 0$.
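Part 2 of Theorem 8.1 turns the Betti numbers of $\mathfrak g$ into those of the reduced complex of $\operatorname{Cur}\mathfrak g$. The sketch below illustrates this under one assumption brought in from outside the text: the classical fact that for a semisimple $\mathfrak g$ the algebra $H^\bullet(\mathfrak g,\mathbb C)$ is an exterior algebra on generators of degrees $2m_i + 1$, where $m_i$ are the exponents of $\mathfrak g$, so that its Poincaré polynomial is $\prod_i (1 + t^{2m_i+1})$. The function names `lie_betti` and `reduced_cur_betti` are hypothetical.

```python
def lie_betti(exponents):
    """Betti numbers of H^*(g, C), read off from prod_i (1 + t^(2 m_i + 1)).

    `exponents` lists the exponents m_i of the semisimple Lie algebra g,
    e.g. [1] for sl_2 and [1, 2] for sl_3.
    """
    coeffs = [1]
    for m in exponents:
        deg = 2 * m + 1
        new = coeffs + [0] * deg          # copy, padded to the new top degree
        for k, c in enumerate(coeffs):    # add the shifted copy t^deg * coeffs
            new[k + deg] += c
        coeffs = new
    return coeffs


def reduced_cur_betti(exponents):
    """dim H^q(Cur g, C) = dim H^q(g, C) + dim H^(q+1)(g, C), Theorem 8.1(2)."""
    b = lie_betti(exponents) + [0]
    return [b[q] + b[q + 1] for q in range(len(b) - 1)]


print(lie_betti([1]))              # sl_2: [1, 0, 0, 1]
print(reduced_cur_betti([1]))      # Cur sl_2: [1, 0, 1, 1]
print(reduced_cur_betti([1, 2]))   # Cur sl_3: [1, 0, 1, 1, 1, 1, 0, 1, 1]
```

For $\mathfrak g = \mathfrak{sl}_2$ the reduced cohomology of $\operatorname{Cur}\mathfrak g$ is one-dimensional in degrees 0, 2 and 3, the same pattern as for Vir in Theorem 7.1.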

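Remark 8.1. The same argument as in Remark 7.2 shows that $H^\bullet(\operatorname{Cur}\mathfrak g,\mathbb C_a) = 0$, where $\mathbb C_a$ is the one-dimensional $\operatorname{Cur}\mathfrak g$-module, on which $\operatorname{Cur}\mathfrak g$ acts trivially, and $\partial$ acts by multiplication by $a \ne 0$.

8.2. Cohomology with coefficients in a current module. Let $\mathfrak g$ be a finite-dimensional simple Lie algebra, and $U$ a $\mathfrak g$-module. Recall (Example 1.1) that the current module $M_U$ over $\operatorname{Cur}\mathfrak g$ is defined as $M_U = \mathbb C[\partial] \otimes U$ with
$$a_\lambda u = au \qquad \text{for } a \in \mathfrak g,\ u \in U.$$

Proposition 8.1. $H^\bullet(\operatorname{Cur}\mathfrak g,M_U) \simeq H^\bullet(\mathfrak g[t],U)$, where the Lie algebra $\mathfrak g[t]$ acts on the $\mathfrak g$-module $U$ by evaluation at $t = 0$.

This can be deduced from Theorem 6.2 but we will give a more direct argument.

Proof. Since $M_U$ is free over $\mathbb C[\partial]$, any cochain $\alpha \in C^n(\operatorname{Cur}\mathfrak g,M_U)$ has a unique representative $\bmod\ (\partial + \lambda_1 + \cdots + \lambda_n)$ independent of $\partial$. Explicitly, there is a unique $\beta : \mathfrak g^{\otimes n} \to \mathbb C[\lambda_1,\dots,\lambda_n] \otimes U$ such that
$$\alpha_{\lambda_1,\dots,\lambda_n}(a_1,\dots,a_n) = \beta_{\lambda_1,\dots,\lambda_n}(a_1 \otimes\cdots\otimes a_n) \mod (\partial + \lambda_1 + \cdots + \lambda_n)$$
for $a_1,\dots,a_n \in \mathfrak g$. Now writing
$$\beta_{\lambda_1,\dots,\lambda_n}(a_1 \otimes\cdots\otimes a_n) = \sum_{m_1,\dots,m_n \in \mathbb Z_+} \lambda_1^{(m_1)}\cdots\lambda_n^{(m_n)}\, \beta(t^{m_1}a_1 \wedge\cdots\wedge t^{m_n}a_n)$$
we can interpret $\beta$ as a cochain $\bigwedge^n \mathfrak g[t] \to U$, as in the proof of Theorem 6.1.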

•

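To compute $H^\bullet(\mathfrak g[t], U)$, we apply the Hochschild–Serre spectral sequence (see, e.g., [F, §1.5.1]) for the ideal $t\mathfrak g[t]$ of $\mathfrak g[t]$. The $E_2$ term is

\[
E_2^{p,q} \simeq H^p\bigl(\mathfrak g, H^q(t\mathfrak g[t], U)\bigr) \simeq H^p(\mathfrak g) \otimes H^q(t\mathfrak g[t], U)^{\mathfrak g} \simeq H^p(\mathfrak g) \otimes \bigl(H^q(t\mathfrak g[t]) \otimes U\bigr)^{\mathfrak g}. \tag{8.1}
\]

We used that $U$ is a trivial $t\mathfrak g[t]$-module and that $H^p(\mathfrak g, U) \simeq H^p(\mathfrak g) \otimes U^{\mathfrak g}$ for any module $U$ over a simple Lie algebra $\mathfrak g$. Of course, $H^p(\mathfrak g)$ is well-known (cf. Theorem 8.1), so we only need $H^q(t\mathfrak g[t])$. The latter can be deduced from a famous result of Kostant [Ko] (generalized to the affine Kac–Moody case).

First, we need some notation from [K1]. Fix a triangular decomposition $\mathfrak g = \mathfrak n_- \oplus \mathfrak h \oplus \mathfrak n_+$. Let $W$, $\Delta$, $\Delta_+$, $\Delta_l$, $\rho$, $\theta$, $h^\vee$ be respectively the Weyl group, the set of roots, the set of positive roots, the set of long roots, the half sum of positive roots, the highest root, and the dual Coxeter number of $\mathfrak g$. Let $\hat{\mathfrak g} = \mathfrak g[t, t^{-1}] + \mathbb C K + \mathbb C d$ be the affine Kac–Moody algebra associated to $\mathfrak g$. The corresponding objects for $\hat{\mathfrak g}$ will be hatted. For example, $\hat{\mathfrak g} = \hat{\mathfrak n}_- \oplus \hat{\mathfrak h} \oplus \hat{\mathfrak n}_+$, where $\hat{\mathfrak n}_\pm = t^{\pm 1}\mathfrak g[t^{\pm 1}] + \mathfrak n_\pm$, $\hat{\mathfrak h} = \mathfrak h + \mathbb C K + \mathbb C d$. Denote by $\delta$ and $\Lambda_0$ the elements of $\hat{\mathfrak h}^*$ that correspond to $K$ and $d$ via the isomorphism $\hat{\mathfrak h}^* \simeq \hat{\mathfrak h}$ given by the invariant bilinear form $(\cdot|\cdot)$ of $\hat{\mathfrak g}$, normalized by $(\theta|\theta) = 2$. Recall that the simple roots of $\hat{\mathfrak g}$ are $\hat\alpha_0 = \delta - \theta$, $\hat\alpha_i = \alpha_i$ ($1 \le i \le l := \operatorname{rank} \mathfrak g$), where $\alpha_i$ are the simple roots of $\mathfrak g$. The element $\hat\rho \in \hat{\mathfrak h}^*$ is defined by the property $\langle \hat\rho, \hat\alpha_i^\vee \rangle = 1$ ($0 \le i \le l$), i.e., $\hat\rho = \rho + h^\vee \Lambda_0$.

We denote by bar the projection from $\hat{\mathfrak h}^*$ onto $\mathfrak h^*$. Also recall that $\hat W = W \ltimes T$, where $T$ is the group of translations $t_\gamma$ ($\gamma \in \mathbb Z\Delta_l$) such that $\overline{t_\gamma(\lambda)} = \bar\lambda + \langle \lambda, K \rangle \gamma$ for $\lambda \in \hat{\mathfrak h}^*$ ($w \in W$ acts on $t_\gamma$ by $w t_\gamma w^{-1} = t_{w(\gamma)}$). For $\hat w \in \hat W$ we denote its length by $\ell(\hat w)$. Finally, if $\Lambda \in \mathfrak h^*$ is a dominant weight, we denote by $V(\Lambda)$ the irreducible $\mathfrak g$-module with highest weight $\Lambda$. Now we can state

Lemma 8.1. 1. As a $\mathfrak g$-module,

\[
H^q(t\mathfrak g[t]) \simeq \bigoplus_{\hat w \in \hat W^1,\ \ell(\hat w) = q} V\bigl(\overline{\hat w(\hat\rho) - \hat\rho}\bigr),
\]

where $\hat W^1 := \{\hat w \in \hat W \mid \hat w^{-1}\Delta_+ \subset \hat\Delta_+\}$.

2. Equivalently,

\[
H^q(t\mathfrak g[t]) \simeq \bigoplus_{(w,\gamma) \in WT^1,\ \ell(t_\gamma w) = q} V\bigl(w(\rho) - \rho + h^\vee\gamma\bigr),
\]

where $WT^1 := \{(w,\gamma) \in W \ltimes \mathbb Z\Delta_l \mid (\gamma|\alpha) \ge 0\ \forall \alpha \in \Delta_+,\ (\gamma|\alpha) > 0\ \forall \alpha \in \Delta_+ \cap w\Delta_-\}$.

Proof. Part 1 is a special case of Theorem 5.14 of Kostant [Ko] (generalized to the affine Kac–Moody case). His Lie algebra $\mathfrak g$ will be the affine Kac–Moody algebra $\hat{\mathfrak g}$. We take the parabolic subalgebra $\mathfrak u = \mathfrak g[t] + \mathbb C K + \mathbb C d$ of $\hat{\mathfrak g}$, then $\mathfrak n = t\mathfrak g[t]$, $\mathfrak g_1 = \mathfrak g + \mathbb C K + \mathbb C d$. Part 2 is standard, using that $\hat W = W \ltimes T$ (see [K1]).

Lemma 8.2. For any $\hat w \in \hat W$, we have:

1. $\hat\rho - \hat w(\hat\rho) = \sum_{\beta \in \hat\Delta_+ \cap \hat w\hat\Delta_-} \beta$.
2. $\ell(\hat w) = |\hat\Delta_+ \cap \hat w\hat\Delta_-|$.
3. $\hat\rho - \hat w(\hat\rho) \in \mathbb Z\delta$ iff $\hat w = 1$.

Proof. Parts 1 and 2 are exercises from [K1, Chap. 3] and left to the reader. Suppose $\hat\rho - \hat w(\hat\rho) = n\delta$, $n \in \mathbb Z$. Then by Part 1, $n\delta \in \mathbb Z_+\hat\Delta_+$. Since $\hat w^{-1}(\delta) = \delta$, applying $\hat w^{-1}$ to Part 1, we get $n\delta \in \mathbb Z_+\hat\Delta_-$. Hence $n = 0$. But then Parts 1 and 2 imply $\ell(\hat w) = 0$, i.e., $\hat w = 1$.

It follows from Part 3 of the lemma that for any $\Lambda \in \mathfrak h^*$ there is at most one $\hat w \in \hat W$ such that $\Lambda = \overline{\hat w(\hat\rho) - \hat\rho}$. Define $\ell(\Lambda)$ to be the length of this $\hat w$ if it exists, and $+\infty$ otherwise. Then we can restate Lemma 8.1 as follows:

\[
H^q(t\mathfrak g[t]) \simeq \bigoplus_{\Lambda \in \mathfrak h^*,\ \ell(\Lambda) = q} V(\Lambda), \tag{8.2}
\]

where $V(\Lambda)$ is a finite-dimensional representation of highest weight $\Lambda$.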
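Lemma 8.1 can be checked by hand in the smallest case $\mathfrak g = sl_2$. The following Python sketch is an illustration only and not part of the original argument. It assumes the standard affine Cartan matrix of type $A_1^{(1)}$ and writes elements of the affine root lattice as integer pairs in the basis $(\hat\alpha_0, \hat\alpha_1)$; the names `reflect`, `shifted_action` and `highest_weight` are hypothetical helpers. For $sl_2$ the only element of $\hat W^1$ of length $q \ge 1$ is the alternating word $r_0 r_1 r_0 \cdots$ that starts with $r_0$, and the sketch computes the weight $\overline{\hat w(\hat\rho) - \hat\rho}$ for it.

```python
# Affine sl_2 (type A_1^(1)); CARTAN[i][j] = <alpha_j, alpha_i^vee>.
CARTAN = [[2, -2],
          [-2, 2]]

def reflect(i, beta):
    """Simple reflection r_i applied to beta = c0*alpha_0 + c1*alpha_1."""
    pairing = sum(beta[j] * CARTAN[i][j] for j in range(2))  # <beta, alpha_i^vee>
    out = list(beta)
    out[i] -= pairing
    return tuple(out)

def shifted_action(word):
    """Return w(rho_hat) - rho_hat for w = r_{word[0]} r_{word[1]} ... ."""
    f = (0, 0)
    # The rightmost reflection acts first. Since r_i(rho_hat) = rho_hat - alpha_i,
    # the shift obeys f(r_i w') = r_i(f(w')) - alpha_i.
    for i in reversed(word):
        g = list(reflect(i, f))
        g[i] -= 1
        f = tuple(g)
    return f

def highest_weight(q):
    """m such that the projection of w(rho_hat) - rho_hat is the highest weight of V(m)."""
    word = [k % 2 for k in range(q)]  # r_0 r_1 r_0 ..., never starting with r_1
    c0, c1 = shifted_action(word)
    # The projection kills delta = alpha_0 + alpha_1, so alpha_0 maps to -alpha_1.
    # The weight (c1 - c0) * alpha_1 pairs to 2 * (c1 - c0) with the coroot of alpha_1.
    return 2 * (c1 - c0)

for q in range(5):
    print(q, highest_weight(q))
```

Under these assumptions the loop reports $m = 2q$ for $q = 0, \dots, 4$, i.e. the weight of the $(2q+1)$-dimensional irreducible $sl_2$-module in degree $q$.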


B. Bakalov, V. G. Kac, A. A. Voronov

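Theorem 8.2. Let $\mathfrak g$ be a finite-dimensional simple Lie algebra with a fixed Cartan subalgebra $\mathfrak h$. Let $U$ be an irreducible $\mathfrak g$-module. Then

\[
H^n(\operatorname{Cur}\mathfrak g, M_U) \simeq H^n(\mathfrak g[t], U) \simeq H^{n - \ell^*(U)}(\mathfrak g).
\]

Here $\ell^*(U) = +\infty$ whenever $U$ is infinite-dimensional, $\ell^*(U) = \ell(\Lambda^*)$ whenever $U = V(\Lambda)$ is a finite-dimensional irreducible module with a highest weight $\Lambda$, $\Lambda^*$ is the highest weight of the contragredient module $V(\Lambda)^*$, $\ell(\Lambda)$ is as above, and we agree that $H^n = 0$ for $n < 0$ (including $n = -\infty$).

Proof. The first isomorphism in the theorem is from Proposition 8.1. To compute $H^\bullet(\mathfrak g[t], V(\Lambda))$, we apply the Hochschild–Serre spectral sequence for the Lie algebra $\mathfrak g[t]$, its module $U$, and its ideal $t\mathfrak g[t]$. If $U = V(\Lambda)$, then $U^* \simeq V(\Lambda^*)$ and Eqs. (8.1, 8.2) imply that the $E_2$ term is

\[
E_2^{p,q} = \begin{cases} H^p(\mathfrak g) & \text{for } q = \ell(\Lambda^*) < +\infty, \\ 0 & \text{otherwise.} \end{cases}
\]

Hence the spectral sequence degenerates at $E_2$ and $H^n(\mathfrak g[t], V(\Lambda)) \simeq H^{n - \ell(\Lambda^*)}(\mathfrak g)$. If $U$ is infinite-dimensional, then again by (8.1, 8.2), we have $E_2^{p,q} = 0$.

Corollary 8.1 ([Fe1, Fe2]). $H^\bullet(\mathfrak g[t]) \simeq H^\bullet(\mathfrak g)$, where the isomorphism is induced from evaluation at $t = 0$.

Corollary 8.2. For any semisimple $\mathfrak g$-module $U$:

1. $H^1(\operatorname{Cur}\mathfrak g, M_U) \simeq \operatorname{Hom}_{\mathfrak g}(\mathfrak g, U)$. Explicitly, the isomorphism is given by:

\[
\alpha_\lambda(a) = \lambda\,\varphi(a) \mod (\partial + \lambda)
\]

for $a \in \mathfrak g$, $\varphi \in \operatorname{Hom}_{\mathfrak g}(\mathfrak g, U)$.

2. $H^2(\operatorname{Cur}\mathfrak g, M_U) \simeq \operatorname{Hom}_{\mathfrak g}(\bigwedge^2\mathfrak g/\mathfrak g, U)$ provided that $\mathfrak g \not\simeq sl_2$. Explicitly, the isomorphism is given by:

\[
\alpha_{\lambda_1,\lambda_2}(a_1, a_2) = \lambda_1\lambda_2\,\varphi(a_1, a_2) \mod (\partial + \lambda_1 + \lambda_2)
\]

for $a_1, a_2 \in \mathfrak g$, $\varphi \in \operatorname{Hom}_{\mathfrak g}(\bigwedge^2\mathfrak g/\mathfrak g, U)$.

Proof. It is easy to check that the above formulas indeed give cocycles. In fact, Part 2 gives a cocycle for any $\varphi \in \operatorname{Hom}_{\mathfrak g}(\bigwedge^2\mathfrak g, U)$; however, any $\varphi \in \operatorname{Hom}_{\mathfrak g}(\bigwedge^2\mathfrak g, \mathfrak g)$ gives a coboundary. Next, we use Lemma 8.1 and the fact that $H^n(\operatorname{Cur}\mathfrak g, M_U) \simeq \operatorname{Hom}_{\mathfrak g}(H^n(t\mathfrak g[t]), U)$ for $n = 1, 2$.

1. The only element of $\hat W^1$ of length 1 is the simple reflection $r_{\hat\alpha_0}$ with respect to the root $\hat\alpha_0$. Then $r_{\hat\alpha_0}(\hat\rho) - \hat\rho = -\hat\alpha_0 = \theta - \delta$. Hence $H^1(t\mathfrak g[t]) \simeq V(\theta) \simeq \mathfrak g$ as a $\mathfrak g$-module.

2. All elements of $\hat W^1$ of length 2 are of the form $r_{\hat\alpha_0} r_{\hat\alpha_i}$, where $i$ is such that $\langle \hat\alpha_i, \hat\alpha_0^\vee \rangle \ne 0$. Then $r_{\hat\alpha_0} r_{\hat\alpha_i}(\hat\rho) - \hat\rho = -\hat\alpha_0 - \hat\alpha_i + \langle \hat\alpha_i, \hat\alpha_0^\vee \rangle \hat\alpha_0$. When $\mathfrak g = sl_2$ we get $H^2(t\mathfrak g[t]) \simeq V(2\alpha_1)$, see the next example. When $\mathfrak g = sl_{l+1}$, $l \ge 2$, there are two possibilities for $i$: either $i = 1$ or $i = l$; then $H^2(t\mathfrak g[t]) \simeq V(2\theta - \alpha_1) \oplus V(2\theta - \alpha_l)$. For $\mathfrak g \ne sl_{l+1}$ there is a unique possibility for $i$ and $H^2(t\mathfrak g[t]) \simeq V(2\theta - \alpha_i)$. In all cases, except for $sl_2$, one can check that $H^2(t\mathfrak g[t]) \simeq \bigwedge^2\mathfrak g/\mathfrak g$.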
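The last claim of the proof can be illustrated by a dimension count in the smallest case beyond $sl_2$. The following is a sketch for $\mathfrak g = sl_3$ (so $l = 2$), using the Weyl dimension formula $\dim V(a\omega_1 + b\omega_2) = (a+1)(b+1)(a+b+2)/2$; the fundamental weights $\omega_1, \omega_2$ are notation introduced here for the illustration only.

\[
\begin{aligned}
\theta &= \alpha_1 + \alpha_2, \qquad \alpha_1 = 2\omega_1 - \omega_2, \qquad \alpha_2 = -\omega_1 + 2\omega_2, \\
2\theta - \alpha_1 &= \alpha_1 + 2\alpha_2 = 3\omega_2, \qquad 2\theta - \alpha_2 = 2\alpha_1 + \alpha_2 = 3\omega_1, \\
\dim V(3\omega_1) &= \dim V(3\omega_2) = \tfrac{1 \cdot 4 \cdot 5}{2} = 10, \\
\dim \textstyle\bigwedge^2\mathfrak g/\mathfrak g &= \tbinom{8}{2} - 8 = 20 = 10 + 10 .
\end{aligned}
\]

For $sl_2$, by contrast, $\bigwedge^2\mathfrak g \simeq \mathfrak g$, so $\bigwedge^2\mathfrak g/\mathfrak g = 0$, while $V(2\alpha_1)$ is 5-dimensional; this is consistent with $sl_2$ being excluded.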

Cohomology of Conformal Algebras


Example 8.1. Let $V(m)$ be the unique irreducible $sl_2$-module of dimension $m + 1$. Then $\dim H^n(\operatorname{Cur} sl_2, M_{V(m)}) = 1$ for $m = 2n$, $2(n - 3)$, and $= 0$ otherwise. Let $\{e, f, h\}$ be the standard basis of $sl_2$. Then the module $V(2n)$ is isomorphic to $S^n sl_2/(h^2 - 4ef)$. Note that $S^\bullet sl_2/(h^2 - 4ef)$ is the coordinate ring of the nilpotent cone of $sl_2$. This description of $V(2n)$ allows us to give an explicit formula for the cocycles that represent $H^n(\operatorname{Cur} sl_2, M_U)$ for any $sl_2$-module $U$. Namely,

\[
H^n(\operatorname{Cur} sl_2, M_U) \simeq \operatorname{Hom}_{sl_2}(S^n sl_2/(h^2 - 4ef), U) \oplus \operatorname{Hom}_{sl_2}(S^{n-3} sl_2/(h^2 - 4ef), U).
\]

The cocycle $\alpha \in C^n(\operatorname{Cur} sl_2, M_U)$ that corresponds to $(\varphi_n, \varphi_{n-3})$ is

\[
\begin{aligned}
\alpha_{\lambda_1,\dots,\lambda_n}(a_1, \dots, a_n) ={}& \Pi(\lambda_1, \dots, \lambda_n)\,\varphi_n(a_1, \dots, a_n) \\
&+ \sum_{1 \le i < j < k \le n} (-1)^{i+j+k}\,\Pi(\lambda_1, \dots, \hat\lambda_i, \dots, \hat\lambda_j, \dots, \hat\lambda_k, \dots, \lambda_n)\, c_3(a_i, a_j, a_k) \\
&\qquad \times \varphi_{n-3}(a_1, \dots, \hat a_i, \dots, \hat a_j, \dots, \hat a_k, \dots, a_n),
\end{aligned}
\]

where $\Pi(\lambda_1, \dots, \lambda_n) = \lambda_1 \cdots \lambda_n \prod_{1 \le r < s \le n} (\lambda_r - \lambda_s)$ and $c_3(a_1, a_2, a_3) = (a_1 \wedge a_2 \wedge a_3)/(e \wedge f \wedge h)$ is the generator of $H^3(sl_2) \simeq \mathbb C$.

Remark 8.2. Corollary 8.2 in light of Theorem 3.1 implies the following explicit description of the two-cocycles $c_\lambda(a, b)$ corresponding to abelian extensions

\[
0 \to M_U \to \tilde A \to A \to 0
\]

of a current conformal algebra $A = \operatorname{Cur}\mathfrak g$ by a current module $M_U$. (See the proof of Theorem 3.1, Part 4 for the notation.) When $\mathfrak g \ne sl_2$, abelian extensions are parameterized by elements $\varphi \in \operatorname{Hom}_{\mathfrak g}(\bigwedge^2\mathfrak g/\mathfrak g, U)$ and the corresponding cocycle is

\[
c_\lambda(a, b) = \lambda(\partial + \lambda)\,\varphi(a, b).
\]

When $\mathfrak g = sl_2$, abelian extensions are parameterized by elements $\varphi \in \operatorname{Hom}_{sl_2}(S^2 sl_2/(h^2 - 4ef), U) = \operatorname{Hom}_{sl_2}(V(4), U)$ and

\[
c_\lambda(a, b) = \lambda(\partial + \lambda)(\partial + 2\lambda)\,\varphi(a, b).
\]
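The cocycles of Remark 8.2 agree, up to an overall sign, with the formulas of Corollary 8.2 and Example 8.1 once the second variable is eliminated. The computation below is only a sketch: it assumes that the two-cocycle of the extension is read off from $\alpha_{\lambda_1,\lambda_2}$ by working modulo $\partial + \lambda_1 + \lambda_2$, i.e. by substituting $\lambda_1 = \lambda$, $\lambda_2 = -\partial - \lambda$. The precise convention is fixed in the proof of Theorem 3.1, which lies outside this section.

\[
\begin{aligned}
\lambda_1\lambda_2 \big|_{\lambda_1 = \lambda,\ \lambda_2 = -\partial - \lambda} &= -\lambda(\partial + \lambda), \\
\Pi(\lambda_1, \lambda_2) \big|_{\lambda_1 = \lambda,\ \lambda_2 = -\partial - \lambda} &= \lambda_1\lambda_2(\lambda_1 - \lambda_2) \big|_{\lambda_1 = \lambda,\ \lambda_2 = -\partial - \lambda} = -\lambda(\partial + \lambda)(\partial + 2\lambda).
\end{aligned}
\]

The overall sign can be absorbed into $\varphi$, so the two descriptions parameterize the same extensions.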

9. Hochschild, Cyclic, and Leibniz Cohomology

9.1. Hochschild cohomology. We can similarly define the notion of Hochschild cohomology by considering the following analogues of the basic and reduced complexes for an associative conformal algebra $A$ and a conformal bimodule $M$ over it, see Definition 1.4.

Definition 9.1. A Hochschild $n$-cochain ($n \in \mathbb Z_+$) of an associative conformal algebra $A$ with coefficients in a conformal bimodule $M$ over it is a $\mathbb C$-linear operator

\[
\gamma : A^{\otimes n} \to M[\lambda_1, \dots, \lambda_n], \qquad a_1 \otimes \cdots \otimes a_n \mapsto \gamma_{\lambda_1,\dots,\lambda_n}(a_1, \dots, a_n),
\]

satisfying the following condition:

Conformal antilinearity: $\gamma_{\lambda_1,\dots,\lambda_n}(a_1, \dots, \partial a_i, \dots, a_n) = -\lambda_i\,\gamma_{\lambda_1,\dots,\lambda_n}(a_1, \dots, a_i, \dots, a_n)$ for all $i$.


The differential $d$ of a cochain $\gamma$ is defined as follows:

\[
\begin{aligned}
(d\gamma)_{\lambda_1,\dots,\lambda_{n+1}}(a_1, \dots, a_{n+1}) ={}& a_{1\,\lambda_1}\gamma_{\lambda_2,\dots,\lambda_{n+1}}(a_2, \dots, a_{n+1}) \\
&+ \sum_{i=1}^{n} (-1)^i\,\gamma_{\lambda_1,\dots,\lambda_{i-1},\lambda_i+\lambda_{i+1},\lambda_{i+2},\dots,\lambda_{n+1}}(a_1, \dots, a_{i-1}, a_{i\,\lambda_i}a_{i+1}, a_{i+2}, \dots, a_{n+1}) \\
&+ (-1)^{n+1}\,\gamma_{\lambda_1,\dots,\lambda_n}(a_1, \dots, a_n)_{-\partial-\lambda_{n+1}}\,a_{n+1}.
\end{aligned}
\]

One can verify that the operator $d$ preserves the space of cochains and $d^2 = 0$. The cochains of an associative conformal algebra $A$ with coefficients in a bimodule $M$ form a complex $\tilde C^\bullet = \tilde C^\bullet(A, M)$, called the basic Hochschild complex. As in the Lie conformal algebra case, $\tilde C^\bullet(A, M)$ carries the structure of a (left) $\mathbb C[\partial]$-module:

\[
(\partial \cdot \gamma)_{\lambda_1,\dots,\lambda_n}(a_1, \dots, a_n) = \Bigl(\partial_M + \sum_{i=1}^{n} \lambda_i\Bigr)\gamma_{\lambda_1,\dots,\lambda_n}(a_1, \dots, a_n). \tag{9.1}
\]

A straightforward computation shows that $d$ commutes with $\partial$. The quotient complex

\[
C^\bullet(A, M) = \tilde C^\bullet(A, M)/\partial\tilde C^\bullet(A, M)
\]

is called the reduced Hochschild complex, and its cohomology is called the reduced Hochschild cohomology $H^\bullet(A, M)$, as opposed to the basic Hochschild cohomology $\tilde H^\bullet(A, M)$, which is the cohomology of the basic Hochschild complex $\tilde C^\bullet$. Low-degree Hochschild cohomology groups can be interpreted along the lines of Sect. 3, e.g.,

\[
\tilde H^0(A, M) = \{m \in M \mid a_\lambda m = m_{-\partial-\lambda}\,a \ \ \forall a \in A\}.
\]

Remark 9.1. One has obvious analogues of Theorems 6.1, 6.2, and Proposition 8.1 for Hochschild cohomology. For a current conformal algebra $\operatorname{Cur} A$, where $A$ is a $\mathbb C$-algebra, the reduced Hochschild cohomology $H^\bullet(\operatorname{Cur} A, \operatorname{Cur} A) \simeq H^\bullet(A[t], A[t])$, by the analogue of Proposition 8.1. By the Hochschild–Kostant–Rosenberg Theorem [HKR], when $A$ is the algebra of regular functions on an affine nonsingular scheme $\operatorname{Spec} A$ over $\mathbb C$, the latter cohomology is isomorphic to the space of polyvector fields $\bigwedge^\bullet_A T_A \otimes_{\mathbb C} \bigwedge^\bullet_{\mathbb C[t]} \mathbb C[t]\partial_t$ on the product $\operatorname{Spec} A \times \mathbb A^1$, where $T_A = \operatorname{Der}(A, A)$ is the left module of vector fields on $\operatorname{Spec} A$.

Remark 9.2. For a commutative associative conformal algebra [K4], one can define the analogue of the Harrison cohomology by placing the symmetry condition on Hochschild cochains. This cohomology is the closest analogue of the one introduced in [KV] in the context of vertex algebras.

9.2. Cyclic cohomology. In this section we define an analogue of cyclic cohomology, see [C, L1, Ts], for an associative conformal algebra $A$. Define its basic cyclic cohomology $\widetilde{HC}^\bullet(A)$ as the cohomology of the complex $\widetilde{CC}^\bullet$, where $\widetilde{CC}^n$, $n \in \mathbb Z_+$, is the space of $\mathbb C$-linear operators

\[
\gamma : A^{\otimes(n+1)} \to \mathbb C[\lambda_0, \dots, \lambda_n], \qquad a_0 \otimes \cdots \otimes a_n \mapsto \gamma_{\lambda_0,\dots,\lambda_n}(a_0, \dots, a_n),
\]

satisfying the following conditions:


Conformal antilinearity: $\gamma_{\lambda_0,\dots,\lambda_n}(a_0, \dots, \partial a_i, \dots, a_n) = -\lambda_i\,\gamma_{\lambda_0,\dots,\lambda_n}(a_0, \dots, a_i, \dots, a_n)$;

Cyclic invariance: $\gamma_{\lambda_1,\dots,\lambda_n,\lambda_0}(a_1, \dots, a_n, a_0) = (-1)^n\,\gamma_{\lambda_0,\dots,\lambda_n}(a_0, \dots, a_n)$.

The differential $d$ of a cochain $\gamma$ is defined as follows:

\[
\begin{aligned}
(d\gamma)_{\lambda_0,\dots,\lambda_{n+1}}(a_0, \dots, a_{n+1}) ={}& \sum_{i=0}^{n} (-1)^i\,\gamma_{\lambda_0,\dots,\lambda_{i-1},\lambda_i+\lambda_{i+1},\lambda_{i+2},\dots,\lambda_{n+1}}(a_0, \dots, a_{i-1}, a_{i\,\lambda_i}a_{i+1}, a_{i+2}, \dots, a_{n+1}) \\
&+ (-1)^{n+1}\,\gamma_{\lambda_{n+1}+\lambda_0,\dots,\lambda_n}(a_{n+1\,\lambda_{n+1}}a_0, \dots, a_n).
\end{aligned}
\]

The reduced cyclic cohomology $HC^\bullet(A)$ may be defined as the cohomology of the quotient complex by the action of $\partial$, as in the Hochschild case.

9.3. Leibniz cohomology. Nonlocal collections of formal distributions lead to the notion of a Leibniz conformal algebra, see Sect. 1:

Definition 9.2. A Leibniz conformal algebra is a $\mathbb C[\partial]$-module $A$ endowed with a $\lambda$-bracket $[a_\lambda b]$ which defines a conformally sesquilinear map $A \otimes A \to A[[\lambda]]$ satisfying the Jacobi identity as in Definition 1.1.

The difference from Definition 1.1 of a Lie conformal algebra is that the skew-symmetry axiom is omitted and formal power series in $\lambda$ are allowed. For a Leibniz conformal algebra $A$, the definition of a (left) module $M$ over it is the same as that for Lie conformal algebras, see Definition 1.2. The space $C^n(A, M)$ of $n$-cochains of a Leibniz algebra $A$ with values in a module $M$ is the space of $\mathbb C$-linear operators

\[
\gamma : A^{\otimes n} \to M[[\lambda_1, \dots, \lambda_n]], \qquad a_1 \otimes \cdots \otimes a_n \mapsto \gamma_{\lambda_1,\dots,\lambda_n}(a_1, \dots, a_n),
\]

which are conformally antilinear: $\gamma_{\lambda_1,\dots,\lambda_n}(a_1, \dots, \partial a_i, \dots, a_n) = -\lambda_i\,\gamma_{\lambda_1,\dots,\lambda_n}(a_1, \dots, a_i, \dots, a_n)$. The differential $d$ of a cochain $\gamma$ is defined as follows:

\[
\begin{aligned}
(d\gamma)_{\lambda_1,\dots,\lambda_{n+1}}(a_1, \dots, a_{n+1}) ={}& \sum_{i=1}^{n+1} (-1)^{i+1}\,a_{i\,\lambda_i}\gamma_{\lambda_1,\dots,\hat\lambda_i,\dots,\lambda_{n+1}}(a_1, \dots, \hat a_i, \dots, a_{n+1}) \\
&+ \sum_{1 \le i < j \le n+1} (-1)^i\,\gamma_{\lambda_1,\dots,\hat\lambda_i,\dots,\lambda_{j-1},\lambda_i+\lambda_j,\lambda_{j+1},\dots,\lambda_{n+1}}(a_1, \dots, \hat a_i, \dots, a_{j-1}, [a_{i\,\lambda_i}a_j], a_{j+1}, \dots, a_{n+1}),
\end{aligned}
\]

where $\gamma$ is extended linearly over the polynomials in $\lambda_i$. One can verify that the operator $d$ preserves the space of cochains and $d^2 = 0$. The $n$-cochains, $n \in \mathbb Z_+$, of a Leibniz conformal algebra $A$ with coefficients in a module $M$ form a complex $\tilde C^\bullet = \tilde C^\bullet(A, M)$, called the basic Leibniz complex.
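For orientation, the Leibniz differential specializes as follows in the lowest degrees. This is a direct substitution of $n = 0$ and $n = 1$ into the definition, written out as an illustration.

\[
\begin{aligned}
n = 0,\ \gamma = m \in M: \qquad & (dm)_{\lambda_1}(a_1) = a_{1\,\lambda_1} m, \\
n = 1: \qquad & (d\gamma)_{\lambda_1,\lambda_2}(a_1, a_2) = a_{1\,\lambda_1}\gamma_{\lambda_2}(a_2) - a_{2\,\lambda_2}\gamma_{\lambda_1}(a_1) - \gamma_{\lambda_1+\lambda_2}([a_{1\,\lambda_1}a_2]).
\end{aligned}
\]

In particular, a 1-cocycle is a conformally antilinear map $\gamma$ with $\gamma_{\lambda_1+\lambda_2}([a_{1\,\lambda_1}a_2]) = a_{1\,\lambda_1}\gamma_{\lambda_2}(a_2) - a_{2\,\lambda_2}\gamma_{\lambda_1}(a_1)$, which plays the role of a derivation from $A$ to $M$.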


Equation (9.1) defines the structure of a left $\mathbb C[\partial]$-module on $\tilde C^\bullet(A, M)$, which commutes with $d$. The quotient complex

\[
C^\bullet(A, M) = \tilde C^\bullet(A, M)/\partial\tilde C^\bullet(A, M)
\]

is called the reduced Leibniz complex. Its cohomology is called the reduced Leibniz cohomology $H^\bullet(A, M)$, as opposed to the basic Leibniz cohomology $\tilde H^\bullet(A, M)$, which is the cohomology of the basic Leibniz complex $\tilde C^\bullet$. These are conformal analogues of cohomology of Leibniz algebras, see [Cu, L1, L2].

10. Generalization to Conformal Algebras in Higher Dimensions

The theory of conformal algebras, their representations and cohomology has a straightforward generalization to the case when $\lambda$ is a vector. Let us fix a natural number $r$. We replace a single indeterminate $\lambda$ by the vector $\lambda = (\lambda_1, \dots, \lambda_r)$ and $\partial$ by $\partial = (\partial_1, \dots, \partial_r)$, and use the multi-index notation like $\lambda^{(m)} = \lambda_1^{(m_1)} \cdots \lambda_r^{(m_r)}$ for $m \in \mathbb Z^r$, $\delta(z - w) = \prod_i \delta(z_i - w_i)$, etc. Then everything from Sects. 1–6 and 9 holds. Examples of conformal algebras in $r$ indeterminates are provided by $r$-dimensional current algebras, cf. Examples 1.1 and 11.2. Other important examples are the Cartan algebras of vector fields. The structure theory of higher dimensional conformal algebras, including a classification of the simple ones, is currently being developed [BDK].

Example 10.1. The Lie algebra $W_r = \operatorname{Der}\mathbb C[x_1, x_1^{-1}, \dots, x_r, x_r^{-1}]$ is spanned by the coefficients of the formal distributions

\[
L^i(z) = -\delta(z - x)\,\partial_{x_i} = -\sum_{m \in \mathbb Z^r} x_1^{m_1} \cdots x_r^{m_r}\,\partial_{x_i}\, z_1^{-m_1-1} \cdots z_r^{-m_r-1}.
\]

They are pairwise local, since

\[
\begin{aligned}
[L^i(z), L^j(w)] &= \partial_{w_i}\bigl(L^j(w)\,\delta(z - w)\bigr) - \partial_{z_j}\bigl(L^i(w)\,\delta(z - w)\bigr) \\
&= \bigl(\partial_{w_i}L^j(w)\bigr)\delta(z - w) + L^j(w)\,\partial_{w_i}\delta(z - w) + L^i(w)\,\partial_{w_j}\delta(z - w).
\end{aligned}
\]

The corresponding conformal algebra is $A = \bigoplus_{i=1}^{r} \mathbb C[\partial]L^i$ with $\lambda$-brackets

\[
[L^i{}_\lambda L^j] = \partial_i L^j + \lambda_i L^j + \lambda_j L^i. \tag{10.1}
\]
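The passage from the commutator of formal distributions to (10.1) is the usual formal Fourier transform. As a sketch, assume the convention of Sect. 1 (not reproduced in this section) under which $[a(z), b(w)] = \sum_m c_m(w)\,\partial_w^{(m)}\delta(z - w)$ corresponds to $[a_\lambda b] = \sum_m \lambda^{(m)} c_m$, with $\partial_{w_i}$ acting on the coefficients as $\partial_i$. Then

\[
\delta(z - w) \mapsto 1, \qquad \partial_{w_i}\delta(z - w) \mapsto \lambda_i,
\]

\[
\bigl(\partial_{w_i}L^j(w)\bigr)\delta(z - w) + L^j(w)\,\partial_{w_i}\delta(z - w) + L^i(w)\,\partial_{w_j}\delta(z - w) \ \longmapsto\ \partial_i L^j + \lambda_i L^j + \lambda_j L^i .
\]

For $r = 1$ this reduces to $[L_\lambda L] = (\partial + 2\lambda)L$, the $\lambda$-bracket of the Virasoro conformal algebra.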

Its annihilation algebra is $W_{r-} = \operatorname{Der}\mathbb C[x_1, \dots, x_r]$. For $r = 1$, $A$ is the Virasoro conformal algebra Vir, see Example 1.2.

By Corollary 6.1, the cohomology of $W_{r-}$ with trivial coefficients is the same as the cohomology of the complex $\tilde C^\bullet(A, \mathbb C)$. The latter can be described as follows. Let $V$ be the vector space $\bigoplus_{i=1}^{r} \mathbb C L^i$. Every cochain $\alpha \in \tilde C^n(A, \mathbb C)$ is uniquely determined by its values on $V^{\otimes n}$:

\[
\alpha : V^{\otimes n} \to \mathbb C[\lambda_1, \dots, \lambda_n], \qquad L^{k_1} \otimes \cdots \otimes L^{k_n} \mapsto \alpha^{k_1,\dots,k_n}_{\lambda_1,\dots,\lambda_n}.
\]


The differential is given by the formula

\[
\begin{aligned}
(d\alpha)^{k_1,\dots,k_{n+1}}_{\lambda_1,\dots,\lambda_{n+1}} ={}& \sum_{\substack{i,j=1 \\ i<j}}^{n+1} (-1)^{i+j}\,\lambda_{i,k_j}\,\alpha^{k_i,k_1,\dots,\hat k_i,\dots,\hat k_j,\dots,k_{n+1}}_{\lambda_i+\lambda_j,\lambda_1,\dots,\hat\lambda_i,\dots,\hat\lambda_j,\dots,\lambda_{n+1}} \\
&- \sum_{\substack{i,j=1 \\ i<j}}^{n+1} (-1)^{i+j}\,\lambda_{j,k_i}\,\alpha^{k_j,k_1,\dots,\hat k_i,\dots,\hat k_j,\dots,k_{n+1}}_{\lambda_i+\lambda_j,\lambda_1,\dots,\hat\lambda_i,\dots,\hat\lambda_j,\dots,\lambda_{n+1}},
\end{aligned}
\]

where $\lambda_{i,k}$ is the $k$th coordinate of the vector $\lambda_i$. The cohomology of the Lie algebra $W_{r-}$ with trivial coefficients was computed by Gelfand and Fuchs [GF2] (see also [F, §2.2.2]).

Example 10.2. The subalgebra of divergence 0 derivations is a formal distribution subalgebra of $W_r$. The corresponding algebra is the following subalgebra of the algebra in Example 10.1: $\{\sum_i P_i(\partial)L^i \mid \sum_i P_i(\partial)\partial_i = 0\}$.

Example 10.3. The subalgebra $H_r$, $r = 2s$, of Hamiltonian derivations is a formal distribution subalgebra of $W_r$. The corresponding conformal algebra is of rank one: $A = \mathbb C[\partial]L$ with $\lambda$-bracket

\[
[L_\lambda L] = \sum_{i=1}^{s} (\lambda_{s+i}\,\partial_i L - \lambda_i\,\partial_{s+i} L).
\]

Its annihilation algebra $H_{r-}$ is the Lie algebra of Hamiltonian derivations of $\mathbb C[x_1, \dots, x_r]$. The $n$th term of the complex $\tilde C^\bullet(A, \mathbb C)$, whose cohomology is $H^\bullet(H_{r-})$, can be identified with the space of skew-symmetric polynomials in $\lambda_1, \dots, \lambda_n$. The differential is given by the formula

\[
(dP)(\lambda_1, \dots, \lambda_{n+1}) = \sum_{\substack{i,j=1 \\ i<j}}^{n+1} (-1)^{i+j}\,(\lambda_i|\lambda_j)\,P(\lambda_i + \lambda_j, \lambda_1, \dots, \hat\lambda_i, \dots, \hat\lambda_j, \dots, \lambda_{n+1}),
\]

where $(\lambda|\mu) = \sum_{k=1}^{s} (\lambda_k\mu_{s+k} - \lambda_{s+k}\mu_k)$. For $r = 2$ this complex has been known for quite a long time, but the computation of its cohomology is still an open problem (see [F, §2.2.7]).

Example 10.4. The subalgebra $K_r$, $r = 2s + 1$, of contact derivations is also a formal distribution subalgebra of $W_r$, but the corresponding conformal algebra is of infinite rank. It is better viewed as a Lie$^*$ algebra of rank 1, see Sect. 12.

11. Higher Differentials

For the computation of the cohomology with non-trivial coefficients of the Lie algebras of vector fields, it is useful to know the cohomology of their subalgebras of vector fields which have a zero of certain order at the origin (see [F]). The argument of Theorem 6.1 can be generalized to give a complex which produces this cohomology.

Let $A$ be a conformal algebra in $r$ indeterminates which is a free $\mathbb C[\partial]$-module: $A = \bigoplus_{i \in I} \mathbb C[\partial]L^i$. For fixed $N \in \mathbb Z_+^r$, we define $\mathfrak g_N \equiv (\operatorname{Lie} A)_N$ to be the subspace of the annihilation algebra $\mathfrak g_- = (\operatorname{Lie} A)_-$, spanned by $L^i_m$, $i \in I$, $m \ge N$ (meaning that $m_i \ge N_i$ for each $i$). We are interested in the case when $\mathfrak g_N$ is a Lie subalgebra of $\mathfrak g_-$. Note that this is always true when the entries of $N$ are large enough. Indeed, we can write

\[
[L^i{}_\lambda L^j] = \sum_{k \in I} C^k_{ij}(\lambda, \partial)\,L^k \tag{11.1}
\]

for some uniquely determined polynomials $C^k_{ij}$. Then

\[
[L^i_\lambda, L^j_\mu] = \sum_{k \in I} C^k_{ij}(\lambda, -\lambda - \mu)\,L^k_{\lambda+\mu}. \tag{11.2}
\]

It follows that for large $N$ the commutator $[\partial_\lambda^N L^i_\lambda, \partial_\mu^N L^j_\mu]$ can be expressed in terms of $\partial_{\lambda+\mu}^N L^k_{\lambda+\mu}$. Since $\partial_\lambda^N L^i_\lambda = \sum_{m \ge N} L^i_m \lambda^{(m-N)}$, this shows that $\mathfrak g_N$ is a Lie subalgebra of $\mathfrak g_-$.

Let $M$ be a module over the conformal algebra $A$. Then $M$ is a $\mathfrak g_-$-module and hence also a $\mathfrak g_N$-module. Let $V$ be the vector space $\bigoplus_{i \in I} \mathbb C L^i$. As in Sect. 6, the $n$th term of the complex $C^\bullet(\mathfrak g_N, M)$ can be identified with the space of linear maps

\[
\alpha : V^{\otimes n} \to \mathbb C[\lambda_1, \dots, \lambda_n] \otimes_{\mathbb C} M, \qquad L^{k_1} \otimes \cdots \otimes L^{k_n} \mapsto \alpha^{k_1,\dots,k_n}_{\lambda_1,\dots,\lambda_n}, \tag{11.3}
\]

which are skew-symmetric with respect to simultaneous permutations of $k_i$'s and $\lambda_i$'s. Using (11.2), one can easily write its differential $d_N$.

Example 11.1. Let $A$ be the conformal algebra associated to the Lie algebra $W_r$ of vector fields (Example 10.1). Then for $N \in \mathbb Z_+^r$, $W_{r,N}$ is the Lie algebra of vector fields $\sum P_i(x)\partial_{x_i}$ such that all $P_i(x)$ are divisible by $x^N$. Equation (10.1) implies

\[
[\partial_\lambda^N L^i_\lambda, \partial_\mu^N L^j_\mu] = \partial_\lambda^N \lambda_j\,\partial_{\lambda+\mu}^N L^i_{\lambda+\mu} - \partial_\mu^N \mu_i\,\partial_{\lambda+\mu}^N L^j_{\lambda+\mu}.
\]

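The differential $d_N$ of the complex (11.3) is given by the formula

\[
\begin{aligned}
(d_N\alpha)^{k_1,\dots,k_{n+1}}_{\lambda_1,\dots,\lambda_{n+1}} ={}& \sum_{i=1}^{n+1} (-1)^{i+1}\,\partial_{\lambda_i}^N L^{k_i}_{\lambda_i}\,\alpha^{k_1,\dots,\hat k_i,\dots,k_{n+1}}_{\lambda_1,\dots,\hat\lambda_i,\dots,\lambda_{n+1}} \\
&+ \sum_{\substack{i,j=1 \\ i<j}}^{n+1} (-1)^{i+j}\,\partial_{\lambda_i}^N \lambda_{i,k_j}\,\alpha^{k_i,k_1,\dots,\hat k_i,\dots,\hat k_j,\dots,k_{n+1}}_{\lambda_i+\lambda_j,\lambda_1,\dots,\hat\lambda_i,\dots,\hat\lambda_j,\dots,\lambda_{n+1}} \\
&- \sum_{\substack{i,j=1 \\ i<j}}^{n+1} (-1)^{i+j}\,\partial_{\lambda_j}^N \lambda_{j,k_i}\,\alpha^{k_j,k_1,\dots,\hat k_i,\dots,\hat k_j,\dots,k_{n+1}}_{\lambda_i+\lambda_j,\lambda_1,\dots,\hat\lambda_i,\dots,\hat\lambda_j,\dots,\lambda_{n+1}}.
\end{aligned}
\]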

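Example 11.2. Let $\mathfrak g$ be a Lie algebra. Then the current algebra $\tilde{\mathfrak g} = \mathfrak g \otimes_{\mathbb C} \mathbb C[x_1, x_1^{-1}, \dots, x_r, x_r^{-1}]$ is spanned by the coefficients of the pairwise local formal distributions $a(z) := a \otimes \delta(x - z)$, $a \in \mathfrak g$. They satisfy $[a(z), b(w)] = [a, b](w)\,\delta(z - w)$. The corresponding conformal algebra is $A = \mathbb C[\partial] \otimes_{\mathbb C} \mathfrak g$ with $\lambda$-brackets determined by

\[
[a_\lambda b] = [a, b] \qquad \text{for } a, b \in \mathfrak g.
\]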


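The annihilation algebra of $A$ is $\tilde{\mathfrak g}_- = \mathfrak g \otimes_{\mathbb C} \mathbb C[x]$ and for $N \in \mathbb Z_+^r$, $\tilde{\mathfrak g}_N = \mathfrak g \otimes_{\mathbb C} \mathbb C[x]x^N$. Now $C^n(\tilde{\mathfrak g}_N, M)$ consists of all

\[
\alpha : \mathfrak g^{\otimes n} \to M[\lambda_1, \dots, \lambda_n], \qquad a_1 \otimes \cdots \otimes a_n \mapsto \alpha_{\lambda_1,\dots,\lambda_n}(a_1, \dots, a_n),
\]

skew-symmetric with respect to simultaneous permutations of $a_i$'s and $\lambda_i$'s. The differential $d_N$ is given by

\[
\begin{aligned}
(d_N\alpha)_{\lambda_1,\dots,\lambda_{n+1}}(a_1, \dots, a_{n+1}) ={}& \sum_{i=1}^{n+1} (-1)^{i+1}\,\partial_{\lambda_i}^N a_{i\,\lambda_i}\,\alpha_{\lambda_1,\dots,\hat\lambda_i,\dots,\lambda_{n+1}}(a_1, \dots, \hat a_i, \dots, a_{n+1}) \\
&+ \sum_{\substack{i,j=1 \\ i<j}}^{n+1} (-1)^{i+j}\,\partial_{\lambda_i}^N \alpha_{\lambda_i+\lambda_j,\lambda_1,\dots,\hat\lambda_i,\dots,\hat\lambda_j,\dots,\lambda_{n+1}}([a_i, a_j], a_1, \dots, \hat a_i, \dots, \hat a_j, \dots, a_{n+1}).
\end{aligned}
\]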

Remark 11.1. It is easy to see that in the above examples the differentials satisfy $d_N d_{N'} + d_{N'} d_N = 0$.

12. Relation to Lie$^*$ Algebras

The theory of conformal algebras is in many ways analogous to the theory of Lie algebras. The reason is that in fact conformal algebras can be considered as Lie algebras in certain pseudo-tensor categories, instead of the category of vector spaces. A pseudo-tensor category [BD] is a category equipped with "polylinear maps" and a way to compose them. This is enough to define the notions of Lie algebra, representations, cohomology, etc.

As an example, consider first the category Vec of vector spaces (over $\mathbb C$). For a finite non-empty set $I$ and a collection of vector spaces $\{L_i\}_{i \in I}$, $M$, we can define polylinear maps from $\{L_i\}_{i \in I}$ to $M$:

\[
P_I(\{L_i\}_{i \in I}, M) := \operatorname{Hom}(\otimes_{i \in I} L_i, M).
\]

This is a vector space with an action of the symmetric group $S_I$ on it. For any surjection of finite sets $\pi : J \twoheadrightarrow I$ and a collection $\{K_j\}_{j \in J}$, we have the obvious compositions of polylinear maps

\[
P_I(\{L_i\}_{i \in I}, M) \otimes \bigotimes_{i \in I} P_{J_i}(\{K_j\}_{j \in J_i}, L_i) \to P_J(\{K_j\}_{j \in J}, M), \tag{12.1}
\]

\[
\phi \times \{\psi_i\}_{i \in I} \mapsto \phi \circ (\otimes_{i \in I}\psi_i) \equiv \phi(\{\psi_i\}_{i \in I}), \tag{12.2}
\]

where $J_i := \pi^{-1}(i)$ for $i \in I$. The compositions have the following properties:

Associativity: If $H \twoheadrightarrow J$, $\{F_h\}_{h \in H}$ is a family of objects and $\chi_j \in P_{H_j}(\{F_h\}_{h \in H_j}, K_j)$, then

\[
\phi\bigl(\{\psi_i(\{\chi_j\}_{j \in J_i})\}_{i \in I}\bigr) = \bigl(\phi(\{\psi_i\}_{i \in I})\bigr)(\{\chi_j\}_{j \in J}) \in P_H(\{F_h\}_{h \in H}, M).
\]


Unit: For any object $M$ there is an element $\mathrm{id}_M \in P_1(\{M\}, M)$ such that for any $\phi \in P_I(\{L_i\}_{i \in I}, M)$ one has $\mathrm{id}_M(\phi) = \phi(\{\mathrm{id}_{L_i}\}_{i \in I}) = \phi$.

Equivariance: The compositions (12.1) are equivariant with respect to the natural action of the symmetric group.

Definition 12.1. [BD]. A pseudo-tensor category is a class of objects $\mathcal M$ together with vector spaces $P_I(\{L_i\}_{i \in I}, M)$ on which the symmetric group $S_I$ acts, and composition maps (12.1), satisfying the above three properties.

Remark 12.1. For a pseudo-tensor category $\mathcal M$ and objects $L, M \in \mathcal M$, let $\operatorname{Hom}(L, M) = P_1(\{L\}, M)$. This gives a structure of an ordinary (additive) category on $\mathcal M$ and all $P_I$ are functors $(\mathcal M^\circ)^I \times \mathcal M \to \mathrm{Vec}$. (Here $\mathcal M^\circ$ denotes the dual category of $\mathcal M$.)

Remark 12.2. The notion of pseudo-tensor category is a straightforward generalization of the notion of operad. By definition, an operad is a pseudo-tensor category with only one object.

It is instructive to think of a polylinear map $\phi \in P_n(\{L_i\}_{i=1}^n, M)$ as an operation with $n$ inputs and 1 output, represented by the figure

[Figure: a node labelled $\phi$ with $n$ input edges labelled $L_1, \dots, L_n$ and one output edge labelled $M$.]
Definition 12.2. A Lie algebra in a pseudo-tensor category $\mathcal M$ is an object $A$ and $\mu \in P_2(\{A, A\}, A)$ with the following properties.

Skew-symmetry: $\mu = -\sigma_{12}\,\mu$, where $\sigma_{12} = (12) \in S_2$.

Jacobi identity: $\mu(\mu(\cdot,\cdot),\cdot) = \mu(\cdot,\mu(\cdot,\cdot)) - \sigma_{12}\,\mu(\cdot,\mu(\cdot,\cdot))$, where now $\sigma_{12} = (12)$ is viewed as an element of $S_3$.

Pictorially, the skew-symmetry and the Jacobi identity for a Lie algebra $(A, \mu)$ look as follows:

[Figure: string diagrams, with all edges labelled $A$ and all nodes labelled $\mu$, depicting the skew-symmetry and the Jacobi identity stated above.]
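In the category Vec the two conditions of Definition 12.2 are the usual Lie algebra axioms. As a minimal sketch (an illustration, not taken from the paper), take $A = \mathbb C^3$ with $\mu$ the cross product and test both identities on integer vectors; since the identities are polynomial in the entries, this is a plausibility check rather than a proof. The helper names `mu`, `sub`, `neg`, `skew_symmetry_holds` and `jacobi_holds` are hypothetical.

```python
import random

def mu(a, b):
    """Cross product on 3-vectors, playing the role of mu in P_2({A, A}, A)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def sub(u, v):
    """Componentwise difference u - v."""
    return tuple(x - y for x, y in zip(u, v))

def neg(u):
    """Componentwise negation."""
    return tuple(-x for x in u)

def skew_symmetry_holds(a, b):
    # mu = -sigma_12 mu, i.e. mu(a, b) = -mu(b, a).
    return mu(a, b) == neg(mu(b, a))

def jacobi_holds(a, b, c):
    # mu(mu(a, b), c) = mu(a, mu(b, c)) - mu(b, mu(a, c));
    # the last term is sigma_12 applied to mu(., mu(., .)).
    return mu(mu(a, b), c) == sub(mu(a, mu(b, c)), mu(b, mu(a, c)))

rng = random.Random(0)  # fixed seed so the check is reproducible
samples = [tuple(rng.randint(-5, 5) for _ in range(3)) for _ in range(30)]
all_ok = all(skew_symmetry_holds(a, b) and jacobi_holds(a, b, c)
             for a, b, c in zip(samples, samples[1:], samples[2:]))
print(all_ok)
```

Integer arithmetic keeps the comparison exact, so no floating-point tolerance is needed.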

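Definition 12.3. A representation of a Lie algebra $(A, \mu)$ is an object $M$ together with $\rho \in P_2(\{A, M\}, M)$ satisfying $\rho(\mu(\cdot,\cdot),\cdot) = \rho(\cdot,\rho(\cdot,\cdot)) - \sigma_{12}\,\rho(\cdot,\rho(\cdot,\cdot))$.

Definition 12.4. An $n$-cochain of a Lie algebra $(A, \mu)$ with coefficients in a module $(M, \rho)$ over it is a polylinear operation $\alpha \in P_n(\{A, \dots, A\}, M)$ which is skew-symmetric, i.e., satisfying

[Figure: string diagram relating $\alpha$, with inputs $1, \dots, n$ labelled $A$ and output $M$, to $\alpha$ with its $i$th and $(i+1)$st inputs interchanged.]

for all $i = 1, \dots, n$. The differential of a cochain is defined as follows:

[Figure: string diagram for $d\alpha$ with inputs $1, \dots, n+1$ labelled $A$ and output $M$. It is the sum over $1 \le i \le n+1$ of $(-1)^{i+1}$ times the diagram in which the $i$th input and the output of $\alpha$, applied to the remaining inputs, enter $\rho$, plus the sum over $1 \le i < j \le n+1$ of $(-1)^{i+j}$ times the diagram in which the $i$th and $j$th inputs enter $\mu$, whose output is fed to $\alpha$ together with the remaining inputs.]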

The same computation as in the ordinary Lie algebra case shows that $d^2 = 0$. The cohomology of the resulting complex is called the (reduced) cohomology of $A$ with coefficients in $M$ and is denoted by $H^\bullet(A, M)$.

Remark 12.3. One can also define the notions of associative algebra or commutative algebra in a pseudo-tensor category, their representations and analogues of the Hochschild, cyclic, or Harrison cohomology.

Example 12.1. A Lie algebra in the category of vector spaces Vec is just an ordinary Lie algebra. The same is true for representations and cohomology.

Example 12.2. Let $D$ be a cocommutative bialgebra with comultiplication $\Delta$ and counit $\varepsilon$. Then the category $\mathcal M^l(D)$ of left $D$-modules is a symmetric tensor category. Hence, $\mathcal M^l(D)$ is a pseudo-tensor category with polylinear maps

\[
P_I(\{L_i\}_{i \in I}, M) := \operatorname{Hom}_D(\otimes_{i \in I} L_i, M). \tag{12.3}
\]

A Lie algebra in the category $\mathcal M^l(D)$ is an ordinary Lie algebra which is also a left $D$-module and such that its bracket is a homomorphism of $D$-modules.

Example 12.3. Let $D$ be as in Example 12.2. We introduce a pseudo-tensor category $\mathcal M^*(D)$ with the same objects as $\mathcal M^l(D)$ but with another pseudo-tensor structure [BD]

\[
P_I(\{L_i\}_{i \in I}, M) := \operatorname{Hom}_{D^{\otimes I}}(\boxtimes_{i \in I} L_i, D^{\otimes I} \otimes_D M). \tag{12.4}
\]

Here $\boxtimes_{i \in I}$ is the tensor product functor $\mathcal M^l(D)^I \to \mathcal M^l(D^{\otimes I})$. For $\pi : J \twoheadrightarrow I$ the composition of polylinear maps is defined as follows:

\[
\phi(\{\psi_i\}_{i \in I}) := \Delta^{(\pi)}(\phi) \circ (\boxtimes_{i \in I}\psi_i). \tag{12.5}
\]

Here $\Delta^{(\pi)}$ is the functor $\mathcal M^l(D^{\otimes I}) \to \mathcal M^l(D^{\otimes J})$, $M \mapsto D^{\otimes J} \otimes_{D^{\otimes I}} M$, where $D^{\otimes I}$ acts on $D^{\otimes J}$ via the iterated comultiplication determined by $\pi$. The symmetric group $S_I$ acts on $P_I(\{L_i\}_{i \in I}, M)$ by simultaneously permuting the factors in $\boxtimes_{i \in I} L_i$ and $D^{\otimes I}$.

Definition 12.5. A Lie$^*$ algebra is a Lie algebra in the pseudo-tensor category $\mathcal M^*(D)$ defined above.


The following examples of Lie$^*$ algebras are important:

1. When $D = \mathbb C$ we recover Example 12.1.
2. For $D = \mathbb C[\partial]$ (with $\Delta(\partial) = \partial \otimes 1 + 1 \otimes \partial$, $\varepsilon(\partial) = 0$) we get exactly the notions of conformal algebras, conformal modules over them and the reduced cohomology theory introduced in this paper.
3. For $D = \mathbb C[\partial_1, \dots, \partial_r]$ we get conformal algebras in $r$ indeterminates, see Sect. 10.
4. When $D = \mathbb C[\Gamma]$ is the group algebra of a group $\Gamma$, one obtains the $\Gamma$-conformal algebras studied in [GK].
5. Let $\Gamma$ be a subgroup of $\mathbb C^*$ and let $D = \mathbb C[\partial] \rtimes \mathbb C[\Gamma] = \bigoplus_{m \in \mathbb Z_+, \alpha \in \Gamma} \mathbb C\,\partial^m T_\alpha$ with multiplication $T_\alpha T_\beta = T_{\alpha\beta}$, $T_1 = 1$, $T_\alpha \partial T_\alpha^{-1} = \alpha\partial$ and comultiplication $\Delta(\partial) = \partial \otimes 1 + 1 \otimes \partial$, $\Delta(T_\alpha) = T_\alpha \otimes T_\alpha$. Then we get the $\Gamma$-conformal algebras studied in [BDK] (cf. [K4]).
6. Let now $D = \mathbb C[\partial] \times F(\Gamma)$, where $F(\Gamma)$ is the function algebra of a commutative group $\Gamma$. In other words, $D = \bigoplus_{m \in \mathbb Z_+, \alpha \in \Gamma} \mathbb C\,\partial^m \pi_\alpha$ with multiplication $\pi_\alpha\pi_\beta = \delta_{\alpha,\beta}\pi_\alpha$, $\partial\pi_\alpha = \pi_\alpha\partial$ and comultiplication $\Delta(\partial) = \partial \otimes 1 + 1 \otimes \partial$, $\Delta(\pi_\alpha) = \sum_{\gamma \in \Gamma} \pi_{\alpha\gamma^{-1}} \otimes \pi_\gamma$. Then one gets the notion of $\Gamma$-twisted conformal algebra [BDK] (cf. [K4]).
7. Let $D = U(\mathfrak h)$ be the universal enveloping algebra of the Heisenberg Lie algebra $\mathfrak h$ with generators $a_i, b_i, c$ and the only non-zero commutation relations $[a_i, b_i] = c$ ($1 \le i \le s$). Let $A = DL$ be a free left $D$-module of rank one. Define $\mu \in P_2(\{A, A\}, A)$ by the formula

\[
\mu(L \otimes L) = \Bigl(\sum_{i=1}^{s} (a_i \otimes b_i - b_i \otimes a_i) + c \otimes 1 - 1 \otimes c\Bigr) \otimes_D L.
\]

Then $(A, \mu)$ is a Lie algebra in the category $\mathcal M^*(D)$ with annihilation algebra $K_{r-}$, $r = 2s + 1$, cf. Example 10.4.

13. Open Problems

There are a number of interesting problems which we left beyond the scope of this paper.

1. Compute the cohomology of $\operatorname{Cur}\mathfrak g$ with coefficients in $\operatorname{Chom}(M, N)$, where $M$ and $N$ are current modules. The same for the Virasoro conformal algebra, where $M$ and $N$ are modules of densities. Only $H^1$ is known (see [CKW]), and the result is highly nontrivial.
2. Compute the cohomology of the general conformal algebra $gc_N$ and its infinite-rank subalgebras, see [K4], with trivial coefficients. Is it true that $H^\bullet(gc_N, \mathbb C[\partial]^N)$ is trivial?
3. Study the relationship between $H^\bullet(A, M)$ and $H^\bullet(\operatorname{Lie} A, V(M))$. A mapping between the two is given in Sect. 6. Our computations show that in the case of a current or the Virasoro conformal algebra $A$, the image of $H^\bullet(A, \mathbb C)$ contains all generators of $H^\bullet(\operatorname{Lie} A, \mathbb C)$.
4. Compute the cohomology of conformal algebras in several indeterminates.
5. Compute the Hochschild and cyclic conformal cohomology of $\operatorname{Cend}(M)$.

These problems are apparently related to [2].

Acknowledgement. The second author is grateful to Jean-Louis Loday for inspiring discussions on Leibniz algebras, whose conformal version, see Sect. 9.3, seems to be an essential notion in the case of nonlocal fields.


References

[BDK] Bakalov, B., D’Andrea, A. and Kac, V.G.: In preparation
[BD] Beilinson, A. and Drinfeld, V.: Chiral algebras. Preprint
[CK] Cheng, S.-J. and Kac, V.G.: Conformal Modules. Asian J. Math. 1, no. 1, 181–193 (1997); Erratum, Asian J. Math. 2, no. 1, 153–156 (1998)
[CKW] Cheng, S.-J., Kac, V.G. and Wakimoto, M.: Extensions of conformal modules. In: Topological field theory, primitive forms and related topics, Proceedings of Taniguchi and RIMS symposia, Progress in Math. Basel–Boston: Birkhäuser, 1998
[C] Connes, A.: Noncommutative geometry. San Diego, CA: Academic Press, Inc., 1994
[Cu] Cuvier, C.: Homologie de Leibniz et homologie de Hochschild. C. R. Acad. Sci. Paris Sér. I Math. 313, no. 9, 569–572 (1991)
[DK] D’Andrea, A. and Kac, V.G.: Structure theory of finite conformal algebras. Selecta Math. 4, 377–418 (1998)
[Fe1] Feigin, B.L.: Cohomology of groups and of algebras of flows. Uspekhi Mat. Nauk 35, no. 2(212), 225–226 (1980)
[Fe2] Feigin, B.L.: On the cohomology of the Lie algebra of vector fields and of the current algebra. Selecta Math. Soviet. 7, no. 1, 49–62 (1988)
[FF] Feigin, B.L. and Fuchs, D.B.: Homology of the Lie algebra of vector fields on the line. (Russian) Funkc. Anal. i Pril. 14, no. 3, 45–60 (1980)
[F] Fuchs, D.B.: Cohomology of infinite-dimensional Lie algebras. Contemporary Soviet Mathematics. New York: Consultants Bureau, 1986
[GF1] Gelfand, I.M. and Fuchs, D.B.: Cohomologies of the Lie algebra of vector fields on the circle. (Russian) Funkc. Anal. i Pril. 2, no. 4, 92–93 (1968)
[GF2] Gelfand, I.M. and Fuchs, D.B.: Cohomologies of the Lie algebra of formal vector fields. (Russian) Izv. Akad. Nauk SSSR Ser. Mat. 34, 322–337 (1970)
[GK] Golenishcheva-Kutuzova, M.I. and Kac, V.G.: Γ-conformal algebras. J. Math. Phys. 39, no. 4, 2290–2305 (1998), q-alg/9709006
[HKR] Hochschild, G., Kostant, B. and Rosenberg, A.: Differential forms on regular affine algebras. Trans. Am. Math. Soc. 102, 383–408 (1962)
[K1] Kac, V.G.: Infinite-dimensional Lie algebras. 3rd edition, Cambridge: Cambridge University Press, 1990
[K2] Kac, V.G.: Vertex algebras for beginners. University Lecture Series, 10. Providence, RI: American Mathematical Society, 1996. Second edition, 1998
[K3] Kac, V.G.: The idea of locality. In: Physical applications and mathematical aspects of geometry, groups and algebras, H.-D. Doebner et al, eds., Singapore: World Sci., 1997, pp. 16–32, q-alg/9709008
[K4] Kac, V.G.: Formal distribution algebras and conformal algebras. A talk at the Brisbane Congress in Math. Physics, July 1997, q-alg/9709027
[KV] Kimura, T. and Voronov, A.A.: The cohomology of algebras over moduli spaces. In: The moduli spaces of curves (Texel Island, 1994), Progr. Math. 129, Boston, MA: Birkhäuser-Boston, 1995, pp. 305–334, AMSPPS #199606-14-015
[Ko] Kostant, B.: Lie algebra cohomology and the generalized Borel–Weil theorem. Annals of Math. 74, 329–387 (1961)
[L1] Loday, J.-L.: Cyclic homology. Grundlehren der Mathematischen Wissenschaften, 301. Berlin: Springer-Verlag, 1992
[L2] Loday, J.-L.: Une version non commutative des algèbres de Lie: les algèbres de Leibniz. Enseign. Math. (2) 39, no. 3–4, 269–293 (1993)
[Ts] Tsygan, B.: Homology of matrix Lie algebras over rings and the Hochschild homology. Uspekhi Mat. Nauk 38, no. 2(230), 217–218 (1983)

Communicated by G. Felder

Commun. Math. Phys. 200, 599 – 619 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

The Centre of the Graph and Moduli Algebras at Roots of 1

S. A. Frolov⋆

Section Physik, Munich University, Theresienstr. 37, D-80333 Munich, Germany⋆⋆

Received: 8 March 1996 / Accepted: 28 July 1998

Abstract: The structure of the centres $Z(L_g)$ and $Z(M_g)$ of the graph algebra $L_g(sl_2)$ and the moduli algebra $M_g(sl_2)$ is studied at roots of 1. It is shown that $Z(L_g)$ can be endowed with the structure of the Poisson graph algebra. The elements of $\operatorname{Spec}(Z(M_g))$ are shown to satisfy the defining relation for the holonomies of a flat connection along the cycles of a Riemann surface. The irreducible representations of the graph algebra are constructed.

* Alexander von Humboldt fellow
** Permanent address: Steklov Mathematical Institute, Gubkin st. 8, GSP-1, 117966 Moscow, Russia. Present address: Department of Physics and Astronomy, University of Alabama, Tuscaloosa, AL 35487-0324, USA

1. Introduction

The Poisson structure of the moduli space of flat connections on a Riemann surface with $g$ handles can be described by means of a quadratic Poisson algebra, which was introduced by Fock and Rosly [1] and here will be called the Poisson graph algebra. Let us recall the definition of the algebra [1].

Let $G$ be a matrix algebraic group and $D_g = G^{\times 2g}$. An arbitrary element $d$ of $D_g$ is parametrized by matrices $A_i$ and $B_i$ as $d = (A_1, B_1, \ldots, A_g, B_g) \in D_g$. Let all of the matrices be in the fundamental representation of the group $G$. Then the algebra of functions on $D_g$ is generated by the matrix elements $(A_i)_{mn}$ and $(B_i)_{mn}$.

Definition 1. The Poisson graph algebra $P\mathcal{L}_g$ is an algebra of regular functions on $D_g$ with the following Poisson structure:

$$
\begin{aligned}
& i = 1, \ldots, g: \\
& \tfrac{k}{2\pi}\{A_i^1, A_i^2\} = A_i^1 r_+ A_i^2 - A_i^2 A_i^1 r_+ - r_- A_i^2 A_i^1 + A_i^2 r_- A_i^1, \\
& \tfrac{k}{2\pi}\{B_i^1, B_i^2\} = B_i^1 r_+ B_i^2 - B_i^2 B_i^1 r_+ - r_- B_i^2 B_i^1 + B_i^2 r_- B_i^1, \\
& \tfrac{k}{2\pi}\{A_i^1, B_i^2\} = A_i^1 r_+ B_i^2 - B_i^2 A_i^1 r_+ - r_+ B_i^2 A_i^1 + B_i^2 r_- A_i^1, \\
& i < j: \\
& \tfrac{k}{2\pi}\{A_i^1, A_j^2\} = A_i^1 r_+ A_j^2 - A_j^2 A_i^1 r_+ - r_+ A_j^2 A_i^1 + A_j^2 r_+ A_i^1, \\
& \tfrac{k}{2\pi}\{A_i^1, B_j^2\} = A_i^1 r_+ B_j^2 - B_j^2 A_i^1 r_+ - r_+ B_j^2 A_i^1 + B_j^2 r_+ A_i^1, \\
& \tfrac{k}{2\pi}\{B_i^1, B_j^2\} = B_i^1 r_+ B_j^2 - B_j^2 B_i^1 r_+ - r_+ B_j^2 B_i^1 + B_j^2 r_+ B_i^1, \\
& \tfrac{k}{2\pi}\{B_i^1, A_j^2\} = B_i^1 r_+ A_j^2 - A_j^2 B_i^1 r_+ - r_+ A_j^2 B_i^1 + A_j^2 r_+ B_i^1.
\end{aligned}
\tag{1.1}
$$

Here $k$ is an arbitrary complex parameter, and $r_\pm$ are classical r-matrices which satisfy the classical Yang–Baxter equation and the following relations:

$$[r^{12}, r^{13}] + [r^{12}, r^{23}] + [r^{13}, r^{23}] = 0, \tag{1.2}$$

$$r_- = -P r_+ P, \qquad r_+ - r_- = C, \tag{1.3}$$

where $P$ is the permutation in the tensor product $V \otimes V$ ($P\, a \otimes b = b \otimes a$).

In Eqs. (1.1–1.3) we use the standard notations from the theory of quantum groups [2, 3]: for any matrix $A$ acting in some space $V$ one can construct two matrices $A^1 = A \otimes \mathrm{id}$ and $A^2 = \mathrm{id} \otimes A$ acting in the space $V \otimes V$, and for any matrix $r = \sum_a r_1(a) \otimes r_2(a)$ acting in the space $V \otimes V$ one constructs matrices $r^{12} = \sum_a r_1(a) \otimes r_2(a) \otimes \mathrm{id}$, $r^{13} = \sum_a r_1(a) \otimes \mathrm{id} \otimes r_2(a)$ and $r^{23} = \sum_a \mathrm{id} \otimes r_1(a) \otimes r_2(a)$ acting in the space $V \otimes V \otimes V$. The matrix $C$ is the tensor Casimir operator of the Lie algebra $\mathfrak{G}$ of the group $G$: $C = -\eta_{ab}\, \lambda^a \otimes \lambda^b$, where $\eta_{ab}$ is the Killing tensor and the $\lambda^a$ form a basis of $\mathfrak{G}$.

Let us now identify the matrices $A_i$ and $B_i$ with holonomies of a flat connection along the cycles $a_i$ and $b_i$ of a Riemann surface with $g$ handles. Then $A_i$ and $B_i$ should satisfy the following defining relation:

$$M = B_g A_g^{-1} B_g^{-1} A_g \cdots B_1 A_1^{-1} B_1^{-1} A_1 = 1. \tag{1.4}$$

These relations can be regarded as first-class constraints imposed on the variables of $D_g$. The gauge transformations generated by these constraints are just the simultaneous conjugations of $A_i$ and $B_i$: $A_i \to h A_i h^{-1}$, $B_i \to h B_i h^{-1}$.

Let us now consider two Poisson subalgebras of $P\mathcal{L}_g$. The first subalgebra $I$ consists of all of the functions vanishing on the constraint surface, $I = \{f \in P\mathcal{L}_g : f|_{M=1} = 0\}$, and the second subalgebra $F_I$ is the maximal subalgebra of $P\mathcal{L}_g$ such that the subalgebra $I$ is a Poisson ideal of $F_I$:


$$F_I = \{f \in P\mathcal{L}_g : \{f, h\} \in I \ \ \forall h \in I\}.$$

In particular, it is not difficult to check that any function $f$ which is invariant with respect to simultaneous conjugations of $A_i$ and $B_i$, $f(hA_1h^{-1}, \ldots, hB_gh^{-1}) = f(A_1, \ldots, B_g)$, belongs to $F_I$.

Definition 2. The Poisson algebra of functions on the moduli space of flat connections on a Riemann surface with $g$ handles, or the Poisson moduli algebra $P\mathcal{M}_g$, is defined as the quotient of the algebra $F_I$ over the ideal $I$:

$$P\mathcal{M}_g = F_I / I.$$

It was shown by Fock and Rosly that the algebra $P\mathcal{M}_g$ coincides with the canonical Poisson algebra of functions on the moduli space defined by the Atiyah–Bott symplectic structure.

Quantization of the Poisson graph algebra leads to an associative algebra which was introduced in [4] (see also [5]). In the present paper we study the structure of the centre of the quantized graph and moduli algebras for the simplest case of the SL(2) group and, in what follows, present definitions and results only for this case. Our definition of the moduli algebra differs from the definition given in [4], where the truncated case was considered, and can be regarded as the standard one from the quantum theory of constrained systems.

The plan of the paper is as follows. In the second section we introduce the graph and moduli algebras. Then we describe an extension of the graph algebra and the isomorphism between the extended graph algebra $\mathcal{L}_g^*$ and the tensor product of $g$ copies of $\mathcal{L}_1^*$ [6, 7]. In the third section we study the centre $Z(\mathcal{L}_1)$ of $\mathcal{L}_1$ and prove that $Z(\mathcal{L}_1)$ is isomorphic to $P\mathcal{L}_1$. In the fourth section, using the isomorphism mentioned in the second section, we generalize the results obtained in the third section to $\mathcal{L}_g$ and show that the elements of $\mathrm{Spec}(Z(\mathcal{M}_g(sl_2)))$ satisfy the defining relation (1.4). In the fifth section the irreducible representations of $\mathcal{L}_g$ are constructed. In the Conclusion we discuss unsolved problems.

2. Graph and Moduli Algebras

Definition 3. The graph algebra $\mathcal{L}_g(sl_2)$ is an associative algebra with unit element, generated by matrix elements of $A_i, B_i \in \mathrm{End}\,\mathbb{C}^2 \otimes \mathcal{L}_g(sl_2)$, $i = 1, \ldots, g$, and $(A_i)_{11}^{-1}$, $(B_i)_{11}^{-1}$, $M_{11}^{-1}$, which are subject to the following relations:

$$
\begin{aligned}
& i = 1, \ldots, g: \\
& A_i^1 R_+ A_i^2 R_+^{-1} = R_- A_i^2 R_-^{-1} A_i^1, \qquad B_i^1 R_+ B_i^2 R_+^{-1} = R_- B_i^2 R_-^{-1} B_i^1, \\
& A_i^1 R_+ B_i^2 R_+^{-1} = R_+ B_i^2 R_-^{-1} A_i^1,
\end{aligned}
\tag{2.1}
$$

$$
\begin{aligned}
& i < j: \\
& A_i^1 R_+ A_j^2 R_+^{-1} = R_+ A_j^2 R_+^{-1} A_i^1, \qquad A_i^1 R_+ B_j^2 R_+^{-1} = R_+ B_j^2 R_+^{-1} A_i^1, \\
& B_i^1 R_+ B_j^2 R_+^{-1} = R_+ B_j^2 R_+^{-1} B_i^1, \qquad B_i^1 R_+ A_j^2 R_+^{-1} = R_+ A_j^2 R_+^{-1} B_i^1, \\
& (A_i)_{11} (A_i)_{11}^{-1} = (A_i)_{11}^{-1} (A_i)_{11} = (B_i)_{11} (B_i)_{11}^{-1} = (B_i)_{11}^{-1} (B_i)_{11} = 1, \\
& M_{11}^{-1} \left(q^{-3g} B_g A_g^{-1} B_g^{-1} A_g \cdots B_1 A_1^{-1} B_1^{-1} A_1\right)_{11} = \left(q^{-3g} B_g A_g^{-1} B_g^{-1} A_g \cdots B_1 A_1^{-1} B_1^{-1} A_1\right)_{11} M_{11}^{-1} = 1, \\
& M_{11}^{-1} (A_i)_{mn} = q^{2(n-m)} (A_i)_{mn} M_{11}^{-1}, \qquad M_{11}^{-1} (B_i)_{mn} = q^{2(n-m)} (B_i)_{mn} M_{11}^{-1}, \\
& \det\nolimits_q A_i = (A_i)_{11} (A_i)_{22} - q^2 (A_i)_{21} (A_i)_{12} = 1, \\
& \det\nolimits_q B_i = (B_i)_{11} (B_i)_{22} - q^2 (B_i)_{21} (B_i)_{12} = 1,
\end{aligned}
\tag{2.2}
$$

and we denote by 1 the unit element of any algebra throughout the paper. Here the $R_\pm$-matrices

$$
R_+ = q^{-1/2} \begin{pmatrix} q & 0 & 0 & 0 \\ 0 & 1 & q - q^{-1} & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & q \end{pmatrix}, \qquad
R_- = q^{1/2} \begin{pmatrix} q^{-1} & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & q^{-1} - q & 1 & 0 \\ 0 & 0 & 0 & q^{-1} \end{pmatrix}
$$

satisfy the quantum Yang–Baxter equation and the following relations:

$$
R^{12} R^{13} R^{23} = R^{23} R^{13} R^{12}, \qquad R_+ = P R_-^{-1} P, \qquad R_\pm(q) = 1 + \frac{2\pi i}{k} \hbar\, r_\pm + O(\hbar^2),
$$

$$
q = \exp\!\left(\frac{2\pi i}{p} \hbar\right), \qquad p = k + 2\hbar.
$$

We have introduced the element $M_{11}^{-1}$ because the Poisson structure (1.1) is degenerate on the surface $M_{11} = 0$ and because, as we will see in the fifth section, if the element $M_{11}$ is invertible then all irreducible representations of the graph algebra are $p^{3g}$-dimensional.

Using the explicit expressions for $R_\pm$, one can write down the commutation relations (2.2) in terms of the elements $(A_i)_{mn}$ and $(B_i)_{mn}$. The corresponding formulas are given in the Appendix.

It is well known that for a fixed index $i$ the algebra generated by the matrix elements of $A_i$ (or $B_i$) is isomorphic to $U_q(sl_2)$ [8], and the standard generators of $U_q(sl_2)$ can be expressed through $(A_i)_{mn}$ as follows:

$$
K = (A_i)_{11}, \qquad X_+ = -\frac{q^{1/2}}{q - q^{-1}} (A_i)_{21}, \qquad X_- = -\frac{q^{1/2}}{q - q^{-1}} (A_i)_{11}^{-1} (A_i)_{12}.
$$

The Casimir element of $U_q(sl_2)$ is equal to $c_i = \mathrm{tr}_q A_i = q^{-1} (A_i)_{11} + q (A_i)_{22}$. However, the Casimir elements $c_i$ are not central elements of the graph algebra and, moreover, for generic values of $q$ the graph algebra has a trivial centre. It is not difficult to show that for a fixed index $i$ the algebra generated by $A_i$ and $B_i$ is isomorphic to the quantized algebra of functions on the Heisenberg double of a Lie group [9, 10, 11, 12].
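The stated properties of the $R_\pm$-matrices can be checked by direct matrix arithmetic. The following is a minimal sketch, not part of the original argument: it uses exact rational arithmetic at an arbitrary test value of $q$, strips the scalar prefactors $q^{\mp 1/2}$ (they cancel in the Yang–Baxter equation, and cancel against each other in $R_+ = P R_-^{-1} P$), and builds $R^{13}$ by conjugating $R^{12}$ with the flip of the last two tensor factors. All function names are illustrative.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker (tensor) product of a and b."""
    rb, cb = len(b), len(b[0])
    return [[a[i // rb][j // cb] * b[i % rb][j % cb]
             for j in range(len(a[0]) * cb)] for i in range(len(a) * rb)]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

# Flip P(a (x) b) = b (x) a in the basis e1e1, e1e2, e2e1, e2e2.
P = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

def r_matrices(q):
    """R_+ and R_- with the scalar prefactors q^(-1/2), q^(1/2) removed."""
    a = q - 1 / q
    r_plus = [[q, 0, 0, 0], [0, 1, a, 0], [0, 0, 1, 0], [0, 0, 0, q]]
    r_minus = [[1 / q, 0, 0, 0], [0, 1, 0, 0], [0, -a, 1, 0], [0, 0, 0, 1 / q]]
    return r_plus, r_minus

def satisfies_ybe(r):
    """Check R12 R13 R23 == R23 R13 R12 on the triple tensor product."""
    one = identity(2)
    r12, r23, p23 = kron(r, one), kron(one, r), kron(one, P)
    r13 = matmul(matmul(p23, r12), p23)
    return matmul(matmul(r12, r13), r23) == matmul(matmul(r23, r13), r12)

def plus_is_flipped_inverse_of_minus(r_plus, r_minus):
    """R_+ = P R_-^{-1} P is equivalent to R_+ P R_- P = 1."""
    return matmul(r_plus, matmul(P, matmul(r_minus, P))) == identity(4)

q = Fraction(3, 2)  # arbitrary generic test value, an assumption of this sketch
r_plus, r_minus = r_matrices(q)
print(satisfies_ybe(r_plus), satisfies_ybe(r_minus),
      plus_is_flipped_inverse_of_minus(r_plus, r_minus))
```

Because the check is polynomial in $q$ and $q^{-1}$, a single generic rational value already gives good evidence; it is not a substitute for the algebraic proof.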

Let us introduce monodromies $M_i$,

$$M_i = q^{-3(g-i+1)} B_g A_g^{-1} B_g^{-1} A_g \cdots B_i A_i^{-1} B_i^{-1} A_i.$$

The monodromies satisfy the following commutation relations:

$$
\begin{aligned}
& i \le j: \\
& M_i^1 R_+ M_j^2 R_+^{-1} = R_- M_j^2 R_-^{-1} M_i^1, \qquad M_i^1 R_+ A_j^2 R_+^{-1} = R_- A_j^2 R_-^{-1} M_i^1, \\
& M_i^1 R_+ B_j^2 R_+^{-1} = R_- B_j^2 R_-^{-1} M_i^1, \\
& i < j: \\
& A_i^1 R_+ M_j^2 R_+^{-1} = R_+ M_j^2 R_+^{-1} A_i^1, \qquad B_i^1 R_+ M_j^2 R_+^{-1} = R_+ M_j^2 R_+^{-1} B_i^1, \\
& \det\nolimits_q M_i = 1.
\end{aligned}
\tag{2.3}
$$

Using the Gauss decomposition, one can represent the monodromies as follows:

$$M_i = m_-^{-1}(i)\, K_i\, m_+(i),$$

where $m_\pm(i)$ are upper- and lower-triangular matrices with unity on the diagonals and $K_i$ are diagonal matrices. It follows from relations (2.3) that the matrix elements of $K_i$ form a commutative subalgebra of $\mathcal{L}_g$.

Let $\mathcal{L}_g^*$ be an extension of the graph algebra $\mathcal{L}_g$ by means of the elements $Q_i^{\pm 1} = K_i^{\pm 1/2}$. The relations (2.3) and the commutation relations of $Q_i$ are presented in components in the Appendix. Then the following proposition is a refinement of some results from [6, 7]:

Proposition 1. The extended graph algebra $\mathcal{L}_g^*$ is isomorphic to the tensor product of $g$ copies of the extended graph algebra $\mathcal{L}_1^*$. The isomorphism is given by means of the following formulas:

$$A_i = M_+^{-1}(i+1)\, \bar A_i\, M_+(i+1), \qquad B_i = M_+^{-1}(i+1)\, \bar B_i\, M_+(i+1), \tag{2.4}$$

where

$$
\begin{aligned}
& M_+(i) = Q_i\, m_+(i), \qquad M_-(i) = Q_i^{-1} m_-(i), \\
& M_\pm(i) = \bar G_\pm(i)\, \bar G_\pm(i+1) \cdots \bar G_\pm(g), \qquad \bar G_i = q^{-3} \bar B_i \bar A_i^{-1} \bar B_i^{-1} \bar A_i.
\end{aligned}
$$

Remark 1. Proposition 1 is valid for an arbitrary graph algebra. In the case of the SL(2) group, $(K_i)_{11} = (M_i)_{11} = (\bar G_i)_{11} \cdots (\bar G_g)_{11}$, all elements $(M_i)_{11}$ and $(\bar G_i)_{11}$ are invertible due to the invertibility of $M_{11}$, and one can easily show that formulas (2.4) define the isomorphism of $\mathcal{L}_g(sl_2)$ and $\mathcal{L}_1^{\otimes g}$.

There is no natural anti-involution of the graph algebra. However, the following anti-automorphism [13] plays the role of the $*$-operation on $\mathcal{L}_g$:

$$\rho(A_i) = M_+ A_i^{-1} M_+^{-1}, \qquad \rho(B_i) = M_+ B_i^{-1} M_+^{-1}.$$

The square of the anti-automorphism is equal to

$$\rho^2(A_i) = M A_i M^{-1}, \qquad \rho^2(B_i) = M B_i M^{-1}. \tag{2.5}$$

It is worthwhile to note that this anti-automorphism acts on the elements $M_{ij}$ as follows: $\rho(M_\pm) = M_\mp$.

Let us introduce a set $\Phi$ of quantum constraints

$$\Phi_{ij} = M_{ij} - \delta_{ij}, \tag{2.6}$$

where $M_{ij}$ are the components of $M = M_1$. Let us consider the left and right ideals $I_L$ and $I_R$ of the graph algebra generated by the set $\Phi$, and let $F_{I_L}$ ($F_{I_R}$) be the maximal subalgebra of $\mathcal{L}_g$ such that $I_L$ ($I_R$) is a two-sided ideal of $F_{I_L}$ ($F_{I_R}$). Let $F_I$ be the intersection of $F_{I_L}$ and $F_{I_R}$:

$$F_I = \{f \in \mathcal{L}_g : I_L f \subset I_L \ \text{and} \ f I_R \subset I_R\}.$$

It is obvious that $F_I$ is a subalgebra of $\mathcal{L}_g$ and that the elements $\Phi_{ij} \in F_I$.

Definition 4. Let $I$ be the intersection of the algebra $F_I$ with the linear span of elements from $I_L$ and $I_R$. The moduli algebra $\mathcal{M}_g$ is defined as the quotient of the algebra $F_I$ over the ideal $I$:

$$\mathcal{M}_g = F_I / I. \tag{2.7}$$

Some comments are in order. To motivate Definition 4 one should note that $I_R$ and $I_L$ are natural noncommutative analogs of the Poisson ideal which appeared in Definition 2. The only other possible choice would be the two-sided ideal of $\mathcal{L}_g$ generated by $\Phi$. However, one can easily see that this ideal contains too many elements of $\mathcal{L}_g$. The definition of $F_{I_L}$ and $F_{I_R}$ can be rewritten in the form $[F_{I_L}, I_L] \subset I_L$ and $[F_{I_R}, I_R] \subset I_R$. It is now obvious that $F_{I_L}$ and $F_{I_R}$ correspond to the subalgebra $F_I$ of $P\mathcal{L}_g$ from Definition 2.

To understand the origin of $F_I$ and $I$ in the noncommutative case, let us consider the action of the anti-automorphism $\rho$ on $I_L$ ($I_R$) and $F_{I_L}$ ($F_{I_R}$). It follows from the action of $\rho$ on the elements $M_{ij}$ that $\rho(I_L) = I_R$ and $\rho(I_R) = I_L$ and, hence, $\rho(F_{I_L}) = F_{I_R}$ and $\rho(F_{I_R}) = F_{I_L}$. Thus the subalgebra $F_I$ is the maximal subalgebra of $F_{I_L}$ (or $F_{I_R}$) invariant with respect to the action of $\rho$: $\rho(F_I) = F_I$. It is clear that the two-sided ideal $I$ of $F_I$ has the same property: $\rho(I) = I$. Therefore, the moduli algebra inherits the anti-automorphism $\rho$ from $\mathcal{L}_g$. Moreover, it is possible to show that, being restricted to the moduli algebra, the anti-automorphism becomes an anti-involution of $\mathcal{M}_g$.

Let us now suppose that the parameter $\hbar$ is an indeterminate. Then the graph algebra $\mathcal{L}_g$, being regarded as a topological algebra over the ring $\mathbb{C}[[\hbar]]$ of formal power series in $\hbar$ over $\mathbb{C}$, is a deformation of the Poisson graph algebra $P\mathcal{L}_g$, since

$$\{f_0, g_0\} \equiv \frac{i}{\hbar}(fg - gf) \pmod{\hbar},$$

where $f, g \in \mathcal{L}_g$ reduce to $f_0, g_0 \in P\mathcal{L}_g \pmod{\hbar}$.

Analogously, the moduli algebra $\mathcal{M}_g$ seems to be a deformation of $P\mathcal{M}_g$. Although the detailed investigation of the relation to the deformation quantization is beyond the scope of the paper, let us give a sketch of a proof. First one should show that $\mathcal{M}_g$ is isomorphic to $P\mathcal{M}_g[[\hbar]]$ as a $\mathbb{C}[[\hbar]]$-module. Let $f \in F_I$ reduce to $f_0 \in F_I^{cl} \pmod{\hbar}$; here we denote by $F_I^{cl}$ and $I^{cl}$ the subalgebra $F_I \subset P\mathcal{L}_g$ and the ideal $I \subset F_I$ from Definition 2, respectively. It is possible to prove that if $f_0 \in F_I^{cl}$ then it can be presented in the form $f_0 = f_0^{inv} + i_0$, where $f_0^{inv}$ is an invariant function on $P\mathcal{L}_g$ ($\{f_0^{inv}, \Phi\} = 0$) and $i_0 \in I^{cl}$. It is known [7] that to any invariant function $f_0^{inv}$ one can put into correspondence an invariant element $f_0^{inv}(\hbar) \in \mathcal{L}_g$ ($[f_0^{inv}(\hbar), \Phi] = 0$) such that $f_0^{inv}(\hbar) = f_0^{inv} \pmod{\hbar}$. Thus, $f$ can be represented in the form

$$f = f_0^{inv}(\hbar) + i_0(\hbar) + \hbar f_1,$$

where $i_0(\hbar)$ is any element from $I_L$ which reduces to $i_0 \in I^{cl} \pmod{\hbar}$, and $f_1$ is some element from $\mathcal{L}_g$. It is easy to verify that $f_1$ belongs to $F_{I_L}$ and, hence, reduces to a function $f_{10} \in F_I^{cl} \pmod{\hbar}$. Repeating the procedure described above, one represents $f_1$ in the form

$$f_1 = f_{10}^{inv}(\hbar) + i_1(\hbar) + \hbar f_2,$$

and, finally, the element $f$ takes the form $f = f^{inv} + i_L$, where $f^{inv} \in F_I$ is an invariant element which reduces to $f_0^{inv} \in F_I^{cl} \pmod{\hbar}$, and $i_L \in I_L$. Since $f$ belongs to $F_I$, the element $i_L$ belongs to $I$. The subalgebra $F_{inv}$ of invariant elements has a nonzero intersection $I_{inv}$ with the ideal $I$. Thus, $\mathcal{M}_g = F_{inv}/I_{inv}$, and it seems that the statement that $F_{inv}/I_{inv}$ is isomorphic to $P\mathcal{M}_g[[\hbar]]$ can be extracted from [7]. Now, to prove that $\mathcal{M}_g$ is a deformation of $P\mathcal{M}_g$, it is enough to show that $\frac{i}{\hbar}(fg - gf)$ reduces to $\{f_0, g_0\} \pmod{\hbar}$. This can be done by taking into account that $\mathcal{L}_g$ is a deformation of $P\mathcal{L}_g$.

It is worthwhile to note that one could start not with $f \in F_I$ but with $f_L \in F_{I_L}$ or with $f_R \in F_{I_R}$. Then, defining $\mathcal{M}_g^L = F_{I_L}/I_L$ and $\mathcal{M}_g^R = F_{I_R}/I_R$, one would show that $\mathcal{M}_g^L$ and $\mathcal{M}_g^R$ are both isomorphic to $P\mathcal{M}_g[[\hbar]]$ and, therefore, the algebras $\mathcal{M}_g$, $\mathcal{M}_g^L$ and $\mathcal{M}_g^R$ are deformations of $P\mathcal{M}_g$.

The moduli algebra $\mathcal{M}_g$ is singled out for the following simple reason. Let $V$ be a left $\mathcal{L}_g$-module and $\beta$ be a bilinear form on $V \times V$ such that $\beta(v_2, \lambda v_1) = \lambda \beta(v_2, v_1)$, $\beta(\lambda v_2, v_1) = \lambda^* \beta(v_2, v_1)$, $\lambda \in \mathbb{C}$. A representation is called unitary if $\beta(v_2, f v_1) = \beta(\rho(f) v_2, v_1)$ for all $v_1, v_2 \in V$ and $f \in \mathcal{L}_g$. Let $V_0$ be the submodule of $V$ which is annihilated by the elements $\Phi_{ij}$:

$$V_0 = \{\Psi \in V : \Phi_{ij} \Psi = 0\}.$$

It is obvious that $V_0$ is an $F_I$-module (and an $F_{I_L}$-module). Let us consider an element $f \in F_I$ ($F_{I_L}$) of the form

$$f = \sum_{ij} \left( \Phi_{ij} f_{ij}^r + f_{ij}^l \Phi_{ij} \right).$$

It is clear that $\beta(v_2, f v_1) = 0$ for all $v_1, v_2 \in V_0$. Thus we see that, in defining the moduli algebra, it is natural to factorize over all elements of this form and not only over elements from $I_L$.

Our aim is to describe the structure of the centre $Z(\mathcal{L}_g)$ of the graph algebra at roots of 1, and in the remainder of the paper we assume that $q$ is a primitive $p$th root of unity, $p$ being odd. Due to Proposition 1 we can begin with the study of the centre $Z(\mathcal{L}_1)$ of the algebra $\mathcal{L}_1$.

3. The Centre of the Graph Algebra $\mathcal{L}_1(sl_2)$

Let $a_{ij}$, $b_{ij}$ and $M_{11}^{-1}$ be the generators of $\mathcal{L}_1(sl_2)$ with the commutation relations given by formulas (6.1) from the Appendix.

Proposition 2. 1) The centre $Z(\mathcal{L}_1)$ of $\mathcal{L}_1(sl_2)$ is generated by the elements $a_{11}^{\pm p}$, $a_{12}^p$, $a_{21}^p$, $b_{11}^{\pm p}$, $b_{12}^p$, $b_{21}^p$ and $M_{11}^{-p}$, subject to the single relation

$$M_{11}^{-p} \left(\mathbf{B} \mathbf{A}^{-1} \mathbf{B}^{-1} \mathbf{A}\right)_{11} = 1, \tag{3.1}$$

where $\mathbf{A}_{ij} = a_{ij}^p$, $\mathbf{B}_{ij} = b_{ij}^p$ for $i, j \neq 2$ simultaneously and $\det \mathbf{A} = \det \mathbf{B} = 1$.¹
2) The algebra $\mathcal{L}_1(sl_2)$ is a free $Z(\mathcal{L}_1)$-module with basis the set of monomials $a_{11}^{r_1} a_{12}^{s_1} a_{21}^{t_1} b_{11}^{r_2} b_{12}^{s_2} b_{21}^{t_2}$, $0 \le r_i, s_i, t_i \le p - 1$.
3) The ring $\mathcal{L}_1(sl_2)$ is an integral domain.

Proof. Let us introduce a new set of generators of $\mathcal{L}_1(sl_2)$ by means of the following formulas:

$$
\begin{aligned}
X_1 &= (a_{11}^{-1} a_{12} - b_{11}^{-1} b_{12})\, b_{11}^{2}, & X_2 &= a_{11} a_{21} b_{11}^{-2}, \\
X_3 &= a_{11}^{-2} b_{12} b_{11}, & X_4 &= (b_{21} b_{11}^{-1} - a_{21} a_{11}^{-1})\, a_{11}^{2}.
\end{aligned}
$$

In terms of these generators the relations (6.1) from the Appendix take the form

$$
\begin{aligned}
& a_{11} b_{11} = q\, b_{11} a_{11}, \qquad a_{11} X_i = X_i a_{11}, \qquad b_{11} X_i = X_i b_{11}, \\
& X_1 X_4 = q^2 X_4 X_1, \qquad X_i X_{i+2} = q^{-2} X_{i+2} X_i, \quad i = 1, 2, \\
& X_i X_{i+1} = q^2 X_{i+1} X_i + q^2 - 1, \quad i = 1, 2, 3, \\
& M_{11}^{-1} X_1 = q^2 X_1 M_{11}^{-1}, \qquad M_{11}^{-1} X_3 = q^2 X_3 M_{11}^{-1}, \\
& M_{11}^{-1} X_2 = q^{-2} X_2 M_{11}^{-1}, \qquad M_{11}^{-1} X_4 = q^{-2} X_4 M_{11}^{-1}, \\
& M_{11}^{-1} q^{-2} (1 + X_1 X_2 + X_1 X_4 + X_3 X_4 + X_1 X_2 X_3 X_4) = 1.
\end{aligned}
\tag{3.2}
$$

It is now straightforward to show that the elements $M_{11}^{-p}$, $a_{11}^{\pm p}$, $b_{11}^{\pm p}$ and $X_i^p$ generate the centre of $\mathcal{L}_1(sl_2)$. To do this one should use the following lemma, which can be easily proved by induction.

Lemma 1. Let elements $c$, $Z$ and $W$ satisfy the relations $cZ = Zc$, $cW = Wc$, $ZW = q^2 WZ + c$. Then the following relation is valid:

$$
Z^m W^n = \sum_{k=0}^{m} q^{2(m-k)(n-k)} \frac{(m)_q!\, (n)_q!}{(n-k)_q!\, (m-k)_q!\, (k)_q!}\; c^k\, W^{n-k} Z^{m-k},
$$

where $m \le n$ and $(n)_q = \dfrac{1 - q^{2n}}{1 - q^2}$.
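As a sanity check of the reordering formula in Lemma 1, the sketch below tests it in one concrete model, chosen here as an assumption and not taken from the paper: $W$ acts on polynomials in $x$ as multiplication by $x$, and $Z$ acts as $c$ times the $q^2$-difference operator, $Z x^d = c\,(d)_q\, x^{d-1}$. In this model $ZW = q^2 WZ + c$ holds identically, so both sides of the lemma can be compared with exact rational arithmetic at a generic value of $q$. Polynomials are stored as dictionaries mapping degree to coefficient; all names are illustrative.

```python
from fractions import Fraction

def qnum(n, q):
    """q-number (n)_q = (1 - q^(2n)) / (1 - q^2)."""
    return (1 - q ** (2 * n)) / (1 - q ** 2)

def qfact(n, q):
    """q-factorial (n)_q! = (1)_q (2)_q ... (n)_q, with (0)_q! = 1."""
    out = Fraction(1)
    for j in range(1, n + 1):
        out *= qnum(j, q)
    return out

def apply_W(poly):
    """W: multiplication by x."""
    return {d + 1: v for d, v in poly.items()}

def apply_Z(poly, q, c):
    """Z: c times the q^2-difference operator, x^d -> c (d)_q x^(d-1)."""
    return {d - 1: c * qnum(d, q) * v for d, v in poly.items() if d > 0}

def power(op, times, poly):
    for _ in range(times):
        poly = op(poly)
    return poly

def clean(poly):
    return {d: v for d, v in poly.items() if v != 0}

def lhs(m, n, poly, q, c):
    """Z^m W^n applied to poly (W^n acts first)."""
    return clean(power(lambda p: apply_Z(p, q, c), m, power(apply_W, n, poly)))

def rhs(m, n, poly, q, c):
    """Right-hand side of Lemma 1 applied to poly."""
    total = {}
    for k in range(m + 1):
        coeff = (q ** (2 * (m - k) * (n - k)) * c ** k * qfact(m, q) * qfact(n, q)
                 / (qfact(n - k, q) * qfact(m - k, q) * qfact(k, q)))
        term = power(apply_W, n - k, power(lambda p: apply_Z(p, q, c), m - k, poly))
        for d, v in term.items():
            total[d] = total.get(d, 0) + coeff * v
    return clean(total)

q, c = Fraction(2, 3), Fraction(5, 7)  # generic test values (assumed)
ok = all(lhs(m, n, {j: Fraction(1)}, q, c) == rhs(m, n, {j: Fraction(1)}, q, c)
         for m in range(4) for n in range(m, 5) for j in range(5))
print(ok)
```

At a primitive $p$th root of unity with $p$ odd one has $(p)_q = 0$, which is why, for $m = n = p$, only the $k = 0$ term survives in the formula and $Z^p$ commutes with $W^p$.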

The first part of Proposition 2 follows now from the simple relations between the elements $a_{ij}^p$, $b_{ij}^p$ and $X_i^p$,

$$
\begin{aligned}
X_1^p &= (a_{11}^{-p} a_{12}^{p} - b_{11}^{-p} b_{12}^{p})\, b_{11}^{2p}, & X_2^p &= a_{11}^{p} a_{21}^{p} b_{11}^{-2p}, \\
X_3^p &= a_{11}^{-2p} b_{12}^{p} b_{11}^{p}, & X_4^p &= (b_{21}^{p} b_{11}^{-p} - a_{21}^{p} a_{11}^{-p})\, a_{11}^{2p},
\end{aligned}
$$

which can be proved by using the following well-known lemma.

Lemma 2. Let elements $a$ and $b$ satisfy the relation $ab = q^2 ba$ or $a(a + b) = q^2 (a + b) a$, and let $q^p = 1$. Then $(a + b)^p = a^p + b^p$.

¹ Throughout the paper, for any matrix $M$ the element $M_{22}$ is defined from the condition on $\det M$.

The relation (3.1) will be proved later in this section (see Proposition 4).

The second part of Proposition 2 follows from the obvious observation that the products $a_{11}^{\pm r_1} a_{12}^{s_1} a_{21}^{t_1} b_{11}^{\pm r_2} b_{12}^{s_2} b_{21}^{t_2} M_{11}^{-v}$, where $r_i, s_i, t_i, v \in \mathbb{N}$, are a basis of $\mathcal{L}_1(sl_2)$.

Relations (3.2) show that $\mathcal{L}_1(sl_2)$ is the tensor product of the Weyl algebra, generated by $a_{11}$ and $b_{11}$, and the algebra $\mathcal{X}$ generated by $X_i$ and $M_{11}^{-1}$. Therefore, to prove that $\mathcal{L}_1(sl_2)$ is an integral domain, it is enough to show that $\mathcal{X}$ is an integral domain. We have to prove that if $fg = 0$ then either $f = 0$ or $g = 0$. An arbitrary element $f \in \mathcal{X}$ can be presented in the form

$$
f = \sum_{i,l=0}^{p-1} \sum_{j,k=0}^{\infty} f_{ijkl}(M_{11}^{-p}, X_2^p, X_3^p)\, X_1^i X_2^j X_3^k X_4^l.
$$

It is obvious that if $f \neq 0$, then $X_1 f \neq 0$. Let us show that $f X_1 = 0$ if and only if $f = 0$. It is clear that it is enough to prove the statement only for elements of the form

$$f = \sum f_{ij}\, X_1^i X_2^j.$$

Using the commutation relation for $X_1$ and $X_2$, one gets

$$
\begin{aligned}
f X_1 &= \sum f_{ij} \left( q^{-2j} X_1^{i+1} X_2^{j} + (q^{-2j} - 1) X_1^{i} X_2^{j-1} \right) \\
&= \sum \left( f_{ij}\, q^{-2j} + (q^{-2(j+1)} - 1) f_{i+1,j+1} \right) X_1^{i+1} X_2^{j}.
\end{aligned}
$$

This expression is equal to zero only if $f_{ij} = (q^{2j} - q^{-2}) f_{i+1,j+1}$. Let $j = kp + j_0$, $0 \le j_0 \le p - 1$. By simple induction one gets

$$
f_{ij} = (q^{2j_0} - q^{-2})(q^{2(j_0+1)} - q^{-2}) \cdots (q^{2(j_0 + p - j_0 - 1)} - q^{-2})\, f_{i+p-j_0,\,(k+1)p} = 0,
$$

since the last factor, $q^{2(p-1)} - q^{-2}$, vanishes for $q^p = 1$. Thus if $f X_1 = 0$, then $f = 0$. In the same manner one can show that if $f \neq 0$ then $f X_4 \neq 0$, $X_4 f \neq 0$, $f(1 + X_1 X_2) \neq 0$, $(1 + X_1 X_2) f \neq 0$. Let us note that for generic values of $q$ this statement is not valid and the ring $\mathcal{X}$ is not an integral domain.

Let us consider the elements $\tilde f = X_1^{p-1} X_4^{p-1} (1 + X_1 X_2)^{p-1} f$ and $\tilde g = g X_1^{p-1} X_4^{p-1} (1 + X_1 X_2)^{p-1}$. We have just shown that the equation $\tilde f = 0$ ($\tilde g = 0$) is equivalent to the equation $f = 0$ ($g = 0$); therefore it is enough to show that if $\tilde f \tilde g = 0$ then either $\tilde f = 0$ or $\tilde g = 0$. The elements $\tilde f$ and $\tilde g$ can be written in the form

$$
\tilde f = \sum_{j,l=0}^{p-1} \sum_{i,k=0}^{\infty} f_{ijkl}(M_{11}^{-p}, X_2^p, X_3^p)\, M_{11}^{i} X_1^{j} (1 + X_1 X_2)^{k} X_4^{l},
$$

$$
\tilde g = \sum_{j,l=0}^{p-1} \sum_{i,k=0}^{\infty} g_{ijkl}(M_{11}^{-p}, X_2^p, X_3^p)\, M_{11}^{i} X_1^{j} (1 + X_1 X_2)^{k} X_4^{l}.
$$

The product of these elements is equal to

$$
\tilde f \tilde g = \sum_{j_1, j_2, l_1, l_2 = 0}^{p-1}\ \sum_{i_1, i_2, k_1, k_2 = 0}^{\infty} q^{2 i_2 (j_1 - l_1) - 2 j_2 (k_1 + l_1)}\, f_{i_1 j_1 k_1 l_1}\, g_{i_2 j_2 k_2 l_2}\, M_{11}^{i_1 + i_2} X_1^{j_1 + j_2} (1 + X_1 X_2)^{k_1 + k_2} X_4^{l_1 + l_2}.
$$

It is now straightforward to show that $\tilde f \tilde g = 0$ if and only if either $\tilde f = 0$ or $\tilde g = 0$. This proves the proposition.

One can introduce the Poisson structure on the centre of $\mathcal{L}_1(sl_2)$ by means of the following formula (see, for example, [16]):

$$
\{x, y\} = \lim_{\kappa \to q} \frac{k}{2\pi}\, \frac{xy - yx}{1 - \kappa^{p^2}}. \tag{3.3}
$$

Let $P\mathcal{L}_g^*$ be the extension of $P\mathcal{L}_g$ by means of the element $M_{11}^{-1}$. Using relations (3.2) and Lemma 1 one can easily calculate the Poisson brackets between the generators of $Z(\mathcal{L}_1)$ and prove the following proposition.

Proposition 3. The centre $Z(\mathcal{L}_1)$ of $\mathcal{L}_1(sl_2)$ endowed with the Poisson structure (3.3) is isomorphic to the extended Poisson graph algebra $P\mathcal{L}_1^*(sl_2)$, and the isomorphism is given by the formulas

$$\varphi(a_{ij}^p) = \alpha_{ij}, \qquad \varphi(b_{ij}^p) = \beta_{ij},$$

where $\alpha_{ij}$, $\beta_{ij}$ are generators of the extended Poisson graph algebra $P\mathcal{L}_1^*(sl_2)$ and $i, j \neq 2$ simultaneously.

Remark 2. Let us note that as a by-product we have proven the well-known theorem that $Z_0(U_q(sl_2))$ is isomorphic to $\mathbb{C}[SL_2^*]$ (see, e.g., [14, 15, 16]).

To proceed with the study of the centre of the graph algebra $\mathcal{L}_g$ we will need to know how the central elements $\mathbf{M}_{ij} = M_{ij}^p = (B A^{-1} B^{-1} A)_{ij}^p$ are expressed through the elements $a_{ij}^p$ and $b_{ij}^p$. Another reason to find these expressions is that the ideal of $Z(\mathcal{L}_g)$ generated by the elements $M_{ij}^p - \delta_{ij}$ belongs to the centre $Z(I)$.

Proposition 4. The central elements $M_{ij}^p$ are expressed through the generators $a_{ij}^p$ and $b_{ij}^p$ of $Z(\mathcal{L}_1(sl_2))$ by means of the following formula:

$$\mathbf{M} = \mathbf{B} \mathbf{A}^{-1} \mathbf{B}^{-1} \mathbf{A},$$

where $\mathbf{A}_{ij} = a_{ij}^p$, $\mathbf{B}_{ij} = b_{ij}^p$ for $i, j \neq 2$ simultaneously and $\det \mathbf{A} = \det \mathbf{B} = 1$.

Proof. Let us introduce the matrices $D = B A^{-1}$ and $C = B^{-1} A$. We first show that the matrix $\mathbf{M}$ is expressed through the matrix elements $d_{ij}^p$, $c_{ij}^p$ as follows:

$$\mathbf{M} = \mathbf{D} \mathbf{C},$$

where $\mathbf{D}_{ij} = d_{ij}^p$, $\mathbf{C}_{ij} = c_{ij}^p$ for $i, j \neq 2$ simultaneously. The matrices $D$ and $C$ have the following commutation relations:

$$
\begin{aligned}
& D^1 R_+ D^2 R_+^{-1} = R_- D^2 R_-^{-1} D^1, \qquad C^1 R_+ C^2 R_+^{-1} = R_- C^2 R_-^{-1} C^1, \\
& C^1 R_+ D^2 R_+^{-1} = R_+ D^2 R_+^{-1} C^1, \qquad \det\nolimits_q D = \det\nolimits_q C = q^3,
\end{aligned}
$$

which are rewritten in components in the Appendix. Using the relations (6.2) and Lemma 2, one gets

$$
\begin{aligned}
M_{11}^p &= (DC)_{11}^p = d_{11}^p \left( c_{11} + d_{11}^{-1} d_{12} c_{21} \right)^p = d_{11}^p c_{11}^p + d_{12}^p c_{21}^p = (\mathbf{D}\mathbf{C})_{11}, \\[4pt]
M_{12}^p &= (DC)_{12}^p = \left( d_{11} c_{12} + q^2 d_{12} c_{11}^{-1} c_{21} c_{12} + q^3 d_{12} c_{11}^{-1} \right)^p \\
&= \left( d_{11} c_{12} + q^2 d_{12} c_{11}^{-1} c_{21} c_{12} \right)^p + \left( d_{12} c_{11}^{-1} \right)^p = \left( d_{11} + q^2 d_{12} c_{11}^{-1} c_{21} \right)^p c_{12}^p + d_{12}^p c_{11}^{-p} \\
&= d_{11}^p c_{12}^p + d_{12}^p c_{11}^{-p} c_{21}^p c_{12}^p + d_{12}^p c_{11}^{-p} = (\mathbf{D}\mathbf{C})_{12}, \\[4pt]
M_{21}^p &= (DC)_{21}^p = \left( d_{21} c_{11} + q^2 d_{11}^{-1} d_{21} d_{12} c_{21} + q^3 d_{11}^{-1} c_{21} \right)^p \\
&= d_{11}^{-p} c_{21}^p + \left( d_{21} c_{11} + q^2 d_{11}^{-1} d_{21} d_{12} c_{21} \right)^p = d_{11}^{-p} c_{21}^p + d_{21}^p \left( c_{11} + d_{11}^{-1} d_{12} c_{21} \right)^p \\
&= d_{11}^{-p} c_{21}^p + d_{21}^p \left( c_{11}^p + d_{11}^{-p} d_{12}^p c_{21}^p \right) = (\mathbf{D}\mathbf{C})_{21}.
\end{aligned}
$$
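The computation above repeatedly replaces a $p$th power of a sum by the sum of $p$th powers, as allowed by Lemma 2. A minimal numerical sketch of that lemma, under assumptions that are ours and not the paper's: $a$ and $b$ are realised as scaled $p \times p$ "clock" and "shift" matrices, which satisfy $ab = q^2 ba$ for $q = e^{2\pi i/p}$, and the identity is checked in floating-point arithmetic. All names are illustrative.

```python
import cmath

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(a, n):
    result = [[1 if i == j else 0 for j in range(len(a))] for i in range(len(a))]
    for _ in range(n):
        result = matmul(result, a)
    return result

def clock_and_shift(p):
    """Return (a, b, w) with a b = w b a, where w = q^2 and q = exp(2 pi i / p)."""
    q = cmath.exp(2j * cmath.pi / p)
    w = q * q
    a = [[w ** i if i == j else 0 for j in range(p)] for i in range(p)]      # clock
    b = [[1 if i == (j + 1) % p else 0 for j in range(p)] for i in range(p)]  # shift
    return a, b, w

def relation_defect(p):
    """Largest entry of a b - q^2 b a (should vanish up to rounding)."""
    a, b, w = clock_and_shift(p)
    ab, ba = matmul(a, b), matmul(b, a)
    return max(abs(ab[i][j] - w * ba[i][j]) for i in range(p) for j in range(p))

def lemma2_defect(p, alpha=1.0, beta=1.0):
    """Largest entry of (a + b)^p - a^p - b^p for the scaled clock and shift."""
    a, b, _ = clock_and_shift(p)
    a = [[alpha * x for x in row] for row in a]
    b = [[beta * x for x in row] for row in b]
    s = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    lhs, ap, bp = matpow(s, p), matpow(a, p), matpow(b, p)
    return max(abs(lhs[i][j] - ap[i][j] - bp[i][j])
               for i in range(p) for j in range(p))

for p in (3, 5, 7):
    print(p, relation_defect(p) < 1e-12, lemma2_defect(p, 2.0, 0.5) < 1e-9)
```

The same function also illustrates why $p$ is taken odd: for $p = 4$ the element $q^2 = -1$ has order 2 rather than 4, and `lemma2_defect(4)` is of order one instead of vanishing.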

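To complete the proof of Proposition 4 one should show that $\mathbf{C} = \mathbf{B}^{-1} \mathbf{A}$, $\mathbf{D} = \mathbf{B} \mathbf{A}^{-1}$. This can be done by using the following lemma, which can be easily proved with the help of Lemma 2.

Lemma 3. Let elements $c$, $Z$ and $W$ satisfy the relations $cZ = Zc$, $cW = Wc$, $ZW - c = q^{-2}(WZ - c)$, and let $q^p = 1$. Then $(ZW - c)^p = Z^p W^p - c^p$.

Then the calculation of $c_{ij}^p$ gives

$$
\begin{aligned}
c_{11}^p &= (B^{-1} A)_{11}^p = \left( (q^2 b_{22} + (1 - q^2) b_{11})\, a_{11} - q^2 b_{12} a_{21} \right)^p \\
&= \left( q^2 b_{11}^{-1} a_{11} + q^4 b_{11}^{-1} b_{21} b_{12} a_{11} + (1 - q^2) b_{11} a_{11} - q^2 b_{12} a_{21} \right)^p \\
&= \left( b_{11}^{-1} a_{11} + q^2 b_{11}^{-1} b_{21} b_{12} a_{11} - q\, a_{21} b_{12} \right)^p \\
&= b_{11}^{-p} a_{11}^{p} \left( 1 + q^2 (b_{21} - a_{21} a_{11}^{-1} b_{11})\, b_{12} \right)^p \\
&= b_{11}^{-p} a_{11}^{p} \left( 1 + (b_{21} - a_{21} a_{11}^{-1} b_{11})^p\, b_{12}^p \right) = (\mathbf{B}^{-1} \mathbf{A})_{11}, \\[4pt]
c_{12}^p &= (B^{-1} A)_{12}^p = \left( q^2 b_{22} a_{12} - q\, a_{22} b_{12} \right)^p \\
&= \left( -a_{11}^{-1} b_{12} - q^2 a_{11}^{-1} a_{21} a_{12} b_{12} + q\, b_{22} a_{12} \right)^p \\
&= -a_{11}^{-p} b_{12}^{p} + \left( -q^2 a_{11}^{-1} a_{21} a_{12} b_{12} + q\, b_{22} a_{12} \right)^p \\
&= -a_{11}^{-p} b_{12}^{p} + a_{12}^{p} \left( -q\, a_{11}^{-1} a_{21} b_{12} + b_{22} \right)^p \\
&= -a_{11}^{-p} b_{12}^{p} + a_{12}^{p} \left( b_{11}^{-1} + (b_{21} b_{11}^{-1} - a_{21} a_{11}^{-1})\, b_{12} \right)^p \\
&= -a_{11}^{-p} b_{12}^{p} + a_{12}^{p} b_{11}^{-p} \left( 1 + (b_{21} b_{11}^{-1} - a_{21} a_{11}^{-1})\, b_{12} b_{11} \right)^p \\
&= -a_{11}^{-p} b_{12}^{p} + a_{12}^{p} b_{11}^{-p} \left( 1 + (b_{21} - a_{21} a_{11}^{-1} b_{11})^p\, b_{12}^p \right) = (\mathbf{B}^{-1} \mathbf{A})_{12}, \\[4pt]
c_{21}^p &= (B^{-1} A)_{21}^p = \left( -q^2 b_{21} a_{11} + b_{11} a_{21} \right)^p \\
&= a_{11}^{p} \left( -q^2 b_{21} + b_{11} a_{21} a_{11}^{-1} \right)^p = a_{11}^{p} \left( -b_{21}^{p} + b_{11}^{p} a_{21}^{p} a_{11}^{-p} \right) = (\mathbf{B}^{-1} \mathbf{A})_{21}.
\end{aligned}
$$

The calculation of the matrix elements $\mathbf{D}_{ij}$ can be done in the same manner and, hence, Proposition 4 is proved.

The matrix elements of $M$ can be expressed through the generators $a_{11}$, $b_{11}$ and $X_i$ as follows:

$$
\begin{aligned}
M_{11} &= 1 + X_2 X_1 + X_1 X_4 + X_3 X_4 + X_3 X_2 X_1 X_4 \\
&= q^{-2} (1 + X_1 X_2 + X_1 X_4 + X_3 X_4 + X_1 X_2 X_3 X_4), \\
M_{12} &= -q^{-2} \left( (1 + X_1 X_2) X_3 + X_1 \right) + q^{-2} X_1 M_{11} b_{11}^{-2} + X_3 M_{11} a_{11}^{2} b_{11}^{-2}, \\
M_{21} &= -q^{-2} \left( X_2 (1 + X_3 X_4) + X_4 \right) + X_4 M_{11} a_{11}^{-2} + q^{2} X_2 M_{11} a_{11}^{-2} b_{11}^{2}.
\end{aligned}
\tag{3.4}
$$

It follows from Proposition 4 that

$$
\begin{aligned}
M_{11}^p &= 1 + X_1^p X_2^p + X_1^p X_4^p + X_3^p X_4^p + X_1^p X_2^p X_3^p X_4^p, \\
M_{12}^p &= -(1 + X_1^p X_2^p) X_3^p - X_1^p + X_1^p M_{11}^p b_{11}^{-2p} + X_3^p M_{11}^p a_{11}^{2p} b_{11}^{-2p}, \\
M_{21}^p &= -X_2^p (1 + X_3^p X_4^p) - X_4^p + X_4^p M_{11}^p a_{11}^{-2p} + X_2^p M_{11}^p a_{11}^{-2p} b_{11}^{2p}.
\end{aligned}
\tag{3.5}
$$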
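Since all quantities in (3.5) are central, the three formulas can be checked as identities between commuting variables. The sketch below is an illustration under our own assumptions: it takes two $2 \times 2$ matrices with determinant one and rational entries standing for $\mathbf{A}$ and $\mathbf{B}$, forms $\mathbf{M} = \mathbf{B}\mathbf{A}^{-1}\mathbf{B}^{-1}\mathbf{A}$ directly, and compares its entries with the right-hand sides of (3.5), where the variables `x1` to `x4` play the role of $X_1^p, \ldots, X_4^p$ written through the entries of $\mathbf{A}$ and $\mathbf{B}$. The function names are hypothetical.

```python
from fractions import Fraction as F

def mul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def unimodular(m11, m12, m21):
    """2x2 matrix with determinant 1; the 22 entry is fixed by det = 1."""
    return [[m11, m12], [m21, (1 + m12 * m21) / m11]]

def inverse(m):
    """Inverse of a determinant-one 2x2 matrix."""
    return [[m[1][1], -m[0][1]], [-m[1][0], m[0][0]]]

def monodromy(A, B):
    """M = B A^{-1} B^{-1} A for commuting matrix entries."""
    return mul(mul(B, inverse(A)), mul(inverse(B), A))

def entries_from_x(A, B):
    """Right-hand sides of (3.5), with x_i standing for X_i^p."""
    a11, a12, a21 = A[0][0], A[0][1], A[1][0]
    b11, b12, b21 = B[0][0], B[0][1], B[1][0]
    x1 = (a12 / a11 - b12 / b11) * b11 ** 2
    x2 = a11 * a21 / b11 ** 2
    x3 = b12 * b11 / a11 ** 2
    x4 = (b21 / b11 - a21 / a11) * a11 ** 2
    m11 = 1 + x1 * x2 + x1 * x4 + x3 * x4 + x1 * x2 * x3 * x4
    m12 = (-(1 + x1 * x2) * x3 - x1
           + x1 * m11 / b11 ** 2 + x3 * m11 * a11 ** 2 / b11 ** 2)
    m21 = (-x2 * (1 + x3 * x4) - x4
           + x4 * m11 / a11 ** 2 + x2 * m11 * b11 ** 2 / a11 ** 2)
    return m11, m12, m21

A = unimodular(F(2), F(3), F(5))          # arbitrary test values (assumed)
B = unimodular(F(7), F(-1, 3), F(4, 5))
M = monodromy(A, B)
print((M[0][0], M[0][1], M[1][0]) == entries_from_x(A, B))
```

The check relies on the factorization $1 + x_1 x_2 + x_1 x_4 + x_3 x_4 + x_1 x_2 x_3 x_4 = (1 + x_1 x_2)(1 + x_3 x_4) + x_1 x_4$, whose two terms reproduce $\mathbf{D}_{11}\mathbf{C}_{11}$ and $\mathbf{D}_{12}\mathbf{C}_{21}$ respectively.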

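One can use Eqs. (3.4) to prove Proposition 4. We are now ready to discuss the structure of the centre of the graph algebra $\mathcal{L}_g$ and the moduli algebra $\mathcal{M}_g$.

4. The Centre of the Graph and Moduli Algebras

Proposition 5. 1) The centre $Z(\mathcal{L}_g)$ of $\mathcal{L}_g(sl_2)$ is generated by the elements $a_{i11}^{\pm p}$, $a_{i12}^{p}$, $a_{i21}^{p}$, $b_{i11}^{\pm p}$, $b_{i12}^{p}$, $b_{i21}^{p}$ and $M_{11}^{-p}$, subject to the single relation

$$M_{11}^{-p} \left( \mathbf{B}_g \mathbf{A}_g^{-1} \mathbf{B}_g^{-1} \mathbf{A}_g \cdots \mathbf{B}_1 \mathbf{A}_1^{-1} \mathbf{B}_1^{-1} \mathbf{A}_1 \right)_{11} = 1.$$

2) The algebra $\mathcal{L}_g(sl_2)$ is a free $Z(\mathcal{L}_g(sl_2))$-module with basis the set of monomials $\prod_{i=1}^{g} a_{i11}^{r_{i1}} a_{i12}^{s_{i1}} a_{i21}^{t_{i1}} b_{i11}^{r_{i2}} b_{i12}^{s_{i2}} b_{i21}^{t_{i2}}$, $0 \le r, s, t \le p - 1$.
3) The ring $\mathcal{L}_g(sl_2)$ is an integral domain.
4) The centre $Z(\mathcal{L}_g)$ endowed with the Poisson structure (3.3) is isomorphic to the extended Poisson graph algebra $P\mathcal{L}_g^*(sl_2)$.
5) The centre $Z(\mathcal{L}_g)$ is isomorphic to the tensor product of $g$ copies of the extended Poisson graph algebra $P\mathcal{L}_1^*(sl_2)$. The isomorphism is given by means of the following formulas:

$$\mathbf{A}_i = \mathbf{M}_+^{-1}(i+1)\, \bar{\mathbf{A}}_i\, \mathbf{M}_+(i+1), \qquad \mathbf{B}_i = \mathbf{M}_+^{-1}(i+1)\, \bar{\mathbf{B}}_i\, \mathbf{M}_+(i+1), \tag{4.1}$$

where $(\mathbf{A}_i)_{mn} = (A_i)_{mn}^p$, $(\mathbf{B}_i)_{mn} = (B_i)_{mn}^p$, $(\bar{\mathbf{A}}_i)_{mn} = (\bar A_i)_{mn}^p$, $(\bar{\mathbf{B}}_i)_{mn} = (\bar B_i)_{mn}^p$, and $m, n \neq 2$ simultaneously,

$$\mathbf{M}_i = \mathbf{M}_-^{-1}(i)\, \mathbf{M}_+(i) = \mathbf{B}_g \mathbf{A}_g^{-1} \mathbf{B}_g^{-1} \mathbf{A}_g \cdots \mathbf{B}_i \mathbf{A}_i^{-1} \mathbf{B}_i^{-1} \mathbf{A}_i, \tag{4.2}$$

$$\mathbf{M}_\pm(i) = \bar{\mathbf{G}}_\pm(i)\, \bar{\mathbf{G}}_\pm(i+1) \cdots \bar{\mathbf{G}}_\pm(g), \qquad \bar{\mathbf{G}}_i = \bar{\mathbf{G}}_-^{-1}(i)\, \bar{\mathbf{G}}_+(i) = \bar{\mathbf{B}}_i \bar{\mathbf{A}}_i^{-1} \bar{\mathbf{B}}_i^{-1} \bar{\mathbf{A}}_i, \tag{4.3}$$

and $\bar A_i$, $\bar B_i$ are the matrices from Proposition 1.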

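Proof. It is obvious that Propositions 5.1–5.4 follow from Propositions 1, 2, 3, and 5.5; thus it is enough to prove Proposition 5.5.

The matrix elements of $M_+(i)$ can be expressed through the matrix elements of $M_i$ as follows:

$$
\begin{aligned}
& M_+(i)_{11} = m_{i11}^{1/2}, \qquad M_+(i)_{12} = m_{i11}^{-1/2} m_{i12}, \qquad M_+(i)_{22} = m_{i11}^{-1/2}, \\
& M_+^{-1}(i)_{11} = m_{i11}^{-1/2}, \qquad M_+^{-1}(i)_{12} = -q\, m_{i11}^{-1/2} m_{i12}, \qquad M_+^{-1}(i)_{22} = m_{i11}^{1/2}.
\end{aligned}
$$

Using formulas (2.4), one gets

$$
\begin{aligned}
(A_i)_{11}^p &= \left( \bar a_{i11} - q^2 m_{i+1,12}\, \bar a_{i21} \right)^p = \bar a_{i11}^{p} - m_{i+1,12}^{p}\, \bar a_{i21}^{p} = \left( \tilde M_+^{-1}(i+1)\, \bar{\mathbf{A}}_i\, \tilde M_+(i+1) \right)_{11}, \\[4pt]
(A_i)_{21}^p &= \left( m_{i+1,11}\, \bar a_{i21} \right)^p = m_{i+1,11}^{p}\, \bar a_{i21}^{p} = \left( \tilde M_+^{-1}(i+1)\, \bar{\mathbf{A}}_i\, \tilde M_+(i+1) \right)_{21}, \\[4pt]
(A_i)_{12}^p &= \left( \bar a_{i11} m_{i+1,11}^{-1} m_{i+1,12} + \bar a_{i12} m_{i+1,11}^{-1} - \bar a_{i21} m_{i+1,11}^{-1} m_{i+1,12}^{2} - \bar a_{i22} m_{i+1,11}^{-1} m_{i+1,12} \right)^p \\
&= \bar a_{i11}^{-p} m_{i+1,11}^{-p} \left( \bar a_{i11}^{2} m_{i+1,12} + \bar a_{i11} \bar a_{i12} - \bar a_{i11} \bar a_{i21} m_{i+1,12}^{2} - m_{i+1,12} - q^2 \bar a_{i21} \bar a_{i12} m_{i+1,12} \right)^p \\
&= \bar a_{i11}^{-p} m_{i+1,11}^{-p} \left( (\bar a_{i11} - q^2 \bar a_{i21} m_{i+1,12})(\bar a_{i11} m_{i+1,12} + \bar a_{i12}) - m_{i+1,12} \right)^p \\
&= \bar a_{i11}^{-p} m_{i+1,11}^{-p} \left( (\bar a_{i11} - q^2 \bar a_{i21} m_{i+1,12})^p (\bar a_{i11} m_{i+1,12} + \bar a_{i12})^p - m_{i+1,12}^{p} \right) \\
&= \left( \tilde M_+^{-1}(i+1)\, \bar{\mathbf{A}}_i\, \tilde M_+(i+1) \right)_{12},
\end{aligned}
\tag{4.4}
$$

where $\tilde M_{i+1,kl} = m_{i+1,kl}^{p}$. To get these expressions one should use the commutativity of $m_{i+1,kl}$ and $\bar a_{imn}$ and Lemmas 2 and 3; Eq. (4.4) follows from the identification

$$\bar a_{i11} - q^2 \bar a_{i21} m_{i+1,12} = Z, \qquad \bar a_{i11} m_{i+1,12} + \bar a_{i12} = W, \qquad m_{i+1,12} = c.$$

One sees that to prove Proposition 5 one should show that

$$\tilde M_i = \mathbf{M}_i. \tag{4.5}$$

The matrix elements of $M_i$ can be expressed through the matrix elements of $\bar G_i$ as follows:

$$
m_{i11} = \bar g_{i11}\, \bar g_{i+1,11} \cdots \bar g_{g11}, \qquad
m_{i12} = \sum_{k=i}^{g} \bar g_{i11} \cdots \bar g_{k-1,11}\, \bar g_{k12}, \qquad
m_{i21} = \sum_{k=i}^{g} \bar g_{i11} \cdots \bar g_{k-1,11}\, \bar g_{k21}.
$$

Using these expressions and Lemma 2 one immediately gets that

$$\tilde M_\pm(i) = \tilde\Gamma_\pm(i) \cdots \tilde\Gamma_\pm(g),$$

where $(\tilde\Gamma_\pm(i))_{mn} = (\bar G_\pm(i))_{mn}^p$. Thus, to complete the proof of Proposition 5 it remains to recall that the equation $\tilde\Gamma_i = \tilde\Gamma_-^{-1}(i)\, \tilde\Gamma_+(i) = \bar{\mathbf{G}}_i = \bar{\mathbf{B}}_i \bar{\mathbf{A}}_i^{-1} \bar{\mathbf{B}}_i^{-1} \bar{\mathbf{A}}_i$ was proved in Proposition 4. Proposition 5 is proved.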

Remark 3. Strictly speaking, to define the matrices $\mathbf{M}_\pm(i)$, $\bar{\mathbf{G}}_\pm(i)$ one should use the elements $(m_{i11}^{\pm 1/2})^p$, $(\bar g_{i11}^{\pm 1/2})^p$ from the extended graph algebras. However, the formulas (4.1) (and (2.4)) do not depend on them due to the obvious relations $(m_{i11}^{1/2})^2 = m_{i11}$, $(\bar g_{i11}^{1/2})^2 = \bar g_{i11}$, and we use the matrices $\mathbf{M}_\pm$, $\bar{\mathbf{G}}_\pm$ only to simplify the notations and the proof of Eq. (4.5).

Remark 4. The method described in the third section to prove Proposition 4 can be applied to prove Eq. (4.2) without using the decomposition (4.3).

Let us proceed with the study of the centre $Z(\mathcal{M}_g)$ of the moduli algebra.

Proposition 6. The centre $Z(F_I)$ coincides with $Z(\mathcal{L}_g)$, and the centre $Z(I)$ is the ideal of $Z(\mathcal{L}_g)$ generated by the elements $M_{ij}^p - \delta_{ij}$.

Proof. It is obvious that $Z(\mathcal{L}_g) \subset Z(F_I)$. Let $f \in F_I$, $i \in I$ and, hence, $f i \in I$. If $z_I$ is an arbitrary element from $Z(I)$, then one gets $z_I f i = f i z_I = f z_I i$. As was proved in Proposition 5, $\mathcal{L}_g$ and, hence, $F_I$ are integral domains; therefore $z_I f = f z_I$ and, thus, $Z(I) \subset Z(F_I)$.

Let $z$ belong to $Z(F_I)$. It is not difficult to show that the elements $M_{21}^{p-1} M_{12}^{p-1}$, $\bar a_{k11} M_{21}^{p-1} M_{12}^{p-1}$ and $\bar b_{k11} M_{21}^{p-1} M_{12}^{p-1}$ belong to $F_I$. The ring $\mathcal{L}_g$ is an integral domain and, hence, the following equations should be valid:

$$z\, \bar a_{k11} = \bar a_{k11}\, z, \qquad z\, \bar b_{k11} = \bar b_{k11}\, z.$$

These equations show that $z$ commutes with any element of the subalgebra generated by $\bar a_{k11}$ and $\bar b_{k11}$ (the tensor product of $g$ copies of the Weyl algebra) and, therefore, the element $z$ should be of the form

$$z = z\!\left( (X_k)_i,\ \bar a_{k11}^{p},\ \bar b_{k11}^{p} \right).$$

Taking into account that $z$ commutes with $M_{12}$ and $M_{21}$, one gets $z (X_k)_i = (X_k)_i z$ for all $k = 1, \ldots, g$ and all $i = 1, 2, 3, 4$. Therefore, $z$ belongs to $Z(\mathcal{L}_g)$ and $Z(F_I)$ coincides with $Z(\mathcal{L}_g)$.

It can be easily shown using Eqs. (3.4) and (3.5) that any element $z \in Z(I)$ belongs to the ideal of $Z(\mathcal{L}_g)$ generated by the elements $M_{ij}^p - \delta_{ij}$.

Let us consider a subalgebra $J$ of $F_I$ which consists of the elements such that the commutator of an element $j \in J$ with any element $f \in F_I$ belongs to the ideal $I$:

$$J = \{ j \in F_I : jf - fj \in I \ \ \forall f \in F_I \}.$$

It is obvious that $Z(\mathcal{L}_g) \subset J$ is the centre of $J$, that $I$ is an ideal of $J$, and that the centre $Z(\mathcal{M}_g)$ coincides with the factor algebra of $J$ over $I$, $Z(\mathcal{M}_g) = J / I$.

As was shown before, the graph algebra $\mathcal{L}_g$ and, therefore, the algebras $F_I$ and $J$ are finitely generated over $Z(\mathcal{L}_g)$. As has just been proved, $Z(I) \subset Z(\mathcal{L}_g)$ and, hence, the moduli algebra $\mathcal{M}_g = F_I / I$ and the centre $Z(\mathcal{M}_g)$ are finitely generated over $Z(\mathcal{L}_g)/Z(I)$.

Let us consider $\mathrm{Spec}(Z(\mathcal{M}_g))$ and $\mathrm{Spec}(Z(\mathcal{L}_g)/Z(I))$, i.e. the sets of all algebra homomorphisms from $Z(\mathcal{M}_g)$ and $Z(\mathcal{L}_g)/Z(I)$ to $\mathbb{C}$. It is clear that $\mathrm{Spec}(Z(\mathcal{L}_g))$ is isomorphic to $\mathbb{C}^{4g} \times (\mathbb{C}^\times)^{2g}$, i.e. a complex affine space of dimension $6g$ with $2g$ hyperplanes of codimension 1 removed. Then $\mathrm{Spec}(Z(\mathcal{L}_g)/Z(I))$ is a submanifold of $\mathbb{C}^{4g} \times (\mathbb{C}^\times)^{2g}$ which is singled out by means of Eq. (1.4).

Centre of the Graph and Moduli Algebras at Roots of 1

613

5. Irreducible Representations of the Graph Algebra It is clear that every irreducible Lg -module V is finite dimensional and the centre Z(Lg ) acts by scalar operators on V and, therefore, there is a homomorphism χV ∈ Spec(Z(Lg )) : Z(Lg ) → C, the central character of V , such that z.v = χV (z)v for all z ∈ Z(Lg ) and v ∈ V . Let us consider an ideal Iχ of Lg generated by elements z − χV (z), z ∈ Z(Lg ). Then every irreducible representation of the algebra Lχg = Lg /Iχ is an irreducible representation of Lg with the central character χV (z). Due to Proposition 1 it is enough to construct representations of L1 . As was shown in the third section the algebra L1 is isomorphic to the tensor product of the Weyl algebra, generated by a11 and b11 , and the algebra X generated by Xi . There is no problem in constructing representations of the Weyl algebra and we begin with the discussion of irreducible representations of X . p ) 6= 0 is a simple p4 -dimensional algeProposition 7. The algebra Xχ = X /Iχ , χ(M11 χ 6 bra and, hence, the algebra L1 is simple p -dimensional.

Proof. We are going to show that the unity element belongs to an arbitrary (nonzero) ideal J and, hence, the ideal coincides with Xχ and Xχ is a simple algebra. An arbitrary element f ∈ Xχ can be presented in the form X f = cijkl X1i (1 + X1 X2 )j (1 + X3 X4 )k X4l + dijkl X1i (1 + X1 X2 )j (1 + X3 X4 )k X3l + fijkl X2i (1 + X1 X2 )j (1 + X3 X4 )k X4l + gijkl X2i (1 + X1 X2 )j (1 + X3 X4 )k X3l , where i, j, k, l = 0, 1, . . . , p − 1. Let f belong to an ideal J and let l2 (l3 ) be the maximal power of X2 (X3 ) in the element f . Then one can easily show that the element X1l2 f X4l3 , which obviously belongs to J , can be represented in the form X cijkl X1i (1 + X1 X2 )j (1 + X3 X4 )k X4l X1l2 f X4l3 = with some new coefficients cijkl . This element can be rewritten as follows: X α α cαijk M11 (1 + X1 X2 )i X1j X4k + dαijk M11 (1 + X3 X4 )i X1j X4k , X1l2 f X4l3 = where M11 is given by Eq. (3.4). Multiplying X1l2 f X4l3 on (1 + X1 X2 )l , where l is the maximal power of (1 + X3 X4 ) in X1l2 f X4l3 one gets that the ideal J contains an element of the form X α (1 + X1 X2 )i X1j X4k . cαijk M11 Let us now suppose that X1p X4p 6= 0. Then the elements X1 and X4 are invertible and using the commutation relations of M11 , X1 and X4 with Xi one can easily show that the element of the form

$$(1 + X_1X_2)^l \sum_\alpha c_\alpha M_{11}^\alpha X_1^{p-\alpha} X_4^{p-\alpha}$$

belongs to $J$. If $X_2^p = 0$ then the element $(1 + X_1X_2)$ is invertible and from the commutation relations of $(1 + X_1X_2)$ with $X_i$ one gets that the ideal $J$ contains $M_{11}$ and, hence, the unity element. If $X_2^p \neq 0$ then, using the commutation relations of $X_2$ with $X_i$, one gets that the element $(1 + X_1X_2)^l$ belongs to $J$. Now using the relation
$$(1 + X_1X_2)(1 + X_3X_4) = (1 + X_3X_4)(1 + X_1X_2) + (q^2 - 1)X_1X_4,$$
one can easily show that $J$ contains the element $X_1X_4$ and, therefore, the unity element.

Let us now consider the case $X_1^p = 0$. Then the element $(1 + X_1X_2)$ is invertible and from the commutation relations of $M_{11}$ and $(1 + X_1X_2)$ with $X_i$ one can get that the element of the form
$$\sum c_{\alpha i} M_{11}^\alpha (1 + X_1X_2)^i X_1^{l_1} X_4^{l_4}$$
belongs to $J$. Let $X_4^p \neq 0$. Using the commutation relations of $X_4$ with $X_i$ one gets that $J$ contains the element of the form
$$X_1^{l_1} \sum c_i (1 + X_1X_2)^i.$$
With the help of $X_2$ one gets that the element $X_1^{l_1} X_2^{l_2} (1 + X_1X_2)^l$ and, hence, $X_1^{l_1} X_2^{l_2}$ belongs to $J$, and using first $X_1$ and then $X_2$ one sees that the unity element is in $J$. If $X_4^p = 0$ then the element $(1 + X_3X_4)$ is invertible and using this element and $X_2$ one shows that the element $\sum c_\alpha M_{11}^\alpha X_4^{p-1}$ belongs to $J$. Using again $X_2$ one gets that $J$ contains the element $X_4^{p-1} X_2^l$. With the help of $X_1$ and then $X_3$ one gets that the unity element belongs to $J$. The case $X_4^p = 0$, $X_1 \neq 0$ can be considered in the same manner. Proposition 7 is proved.

The algebra $X_\chi$, being simple, is isomorphic to $M_{p^2}(\mathbb{C})$ and, therefore, the following proposition is valid.

Proposition 8. Every irreducible representation of $X$ is isomorphic to one of the following $p^2$-dimensional representations:

1)
$$\begin{aligned}
X_1 \Psi(k,l) &= \Psi(k+1,l),\\
X_2 \Psi(k,l) &= q^{-2k}(y_2x_1 + 1 - q^{2k})\Psi(k-1,l) + z_2 q^{-2(k+l)}\Psi(k,l+1),\\
X_3 \Psi(k,l) &= q^{2(k+l)}(y_3x_4 + 1 - q^{-2l})\Psi(k,l-1) + z_3 q^{2k}\Psi(k+1,l),\\
X_4 \Psi(k,l) &= q^{-2k}\Psi(k,l+1).
\end{aligned}\tag{5.1}$$
Here $\Psi(k,l)$ is a basis of a $p^2$-dimensional vector space, $k, l = 0, 1, \dots, p-1$, and we use the following notations: $\Psi(p,l) = x_1\Psi(0,l)$, $x_1\Psi(-1,l) = \Psi(p-1,l)$, $\Psi(k,p) = x_4\Psi(k,0)$, $x_4\Psi(k,-1) = \Psi(k,p-1)$. The complex parameters $x_1, x_4, y_2, y_3, z_2, z_3$ should satisfy the following equations:
$$(y_2x_1 + 1)(y_3x_4 + 1) \neq 0,\quad z_2z_3 = 0,\quad 1 + q^{-2}z_3(y_2x_1 + 1) + q^{2}z_2(y_3x_4 + 1) = 0.$$


The central character of the representation is defined by the formulas
$$\chi(X_1^p) = x_1,\quad \chi(X_4^p) = x_4,$$
$$\chi(X_2^p) = \frac{1}{x_1}\big((y_2x_1 + 1)^p - 1\big) + z_2^p x_4,\quad \chi(X_3^p) = \frac{1}{x_4}\big((y_3x_4 + 1)^p - 1\big) + z_3^p x_1.$$

The element $M_{11}$ acts in the representation as follows: $M_{11}\Psi(k,l) = q^{2(l-k)}(y_2x_1 + 1)(y_3x_4 + 1)\Psi(k,l)$.

2)
$$\begin{aligned}
X_1 \Psi(k,l) &= \Psi(k+1,l),\\
X_2 \Psi(k,l) &= -\Psi(k-1,l) + b_1 q^{-2(k+l)}\Psi(k,l+1) + b_2 q^{-2(k+2l)}\Psi(k+1,l+2),\\
X_3 \Psi(k,l) &= -q^{2k}\Psi(k,l-1) + c_{p-1} q^{2(k+2l)}\Psi(p+k-1,p+l-2),\\
X_4 \Psi(k,l) &= q^{-2k}\Psi(k,l+1),
\end{aligned}$$
where
$$x_1x_4b_1b_2c_{p-1} \neq 0,\quad 1 + q^{10}x_1x_4b_2c_{p-1} = 0,$$
$$\chi(X_1^p) = x_1,\quad \chi(X_4^p) = x_4,\quad \chi(X_2^p) = (b_1^p + b_2^p x_1x_4)x_4 - x_1^{-1},\quad \chi(X_3^p) = c_{p-1}^p x_1^{p-1} x_4^{p-2} - x_4^{-1},$$
$$M_{11}\Psi(k,l) = q^{2(l-k+3)}x_1x_4b_1c_{p-1}\Psi(k,l).$$

Now irreducible representations of the graph algebra $L_1(sl_2)$ can be constructed using, for example, the following representation of the Weyl algebra:
$$a_{11}\Psi(m) = \Psi(m+1),\quad b_{11}\Psi(m) = \beta_{11}q^{-m}\Psi(m),\quad \Psi(m+p) = \alpha_{11}\Psi(m),\quad \chi(a_{11}^p) = \alpha_{11},\quad \chi(b_{11}^p) = \beta_{11}^p.$$
Then the graph algebra acts in the tensor product of the representations of the Weyl algebra and the algebra $X$,
$$\begin{aligned}
a_{11}\Psi(k,l,m) &= \Psi(k,l,m+1),\qquad b_{11}\Psi(k,l,m) = \beta_{11}q^{-m}\Psi(k,l,m),\\
a_{12}\Psi(k,l,m) &= \beta_{11}^{-2}q^{2m}\Psi(k+1,l,m+1) + \beta_{11}^{-2}z_3q^{2(m+k+1)}\Psi(k+1,l,m+3)\\
&\quad + \beta_{11}^{-2}q^{2(m+k+l+1)}(y_3x_4 + 1 - q^{-2l})\Psi(k,l-1,m+3),\\
a_{21}\Psi(k,l,m) &= \beta_{11}^{2}q^{2(m-k)}(y_2x_1 + 1 - q^{2k})\Psi(k-1,l,m-1) + \beta_{11}^{2}z_2q^{2(m-k-l)}\Psi(k,l+1,m-1),\\
b_{12}\Psi(k,l,m) &= \beta_{11}^{-1}q^{m+2(k+l)}(y_3x_4 + 1 - q^{-2l})\Psi(k,l-1,m+2) + \beta_{11}^{-1}z_3q^{m+2k}\Psi(k+1,l,m+2),\\
b_{21}\Psi(k,l,m) &= \beta_{11}^{3}q^{3m-2k+2}(y_2x_1 + 1 - q^{2k})\Psi(k-1,l,m-2) + \beta_{11}^{3}q^{3m-2(k+l)+2}\Psi(k,l+1,m-2)\\
&\quad + \beta_{11}q^{m-2k}\Psi(k,l+1,m-2),
\end{aligned}$$
and we present formulas only for the representation (5.1) of $X$.
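The Weyl algebra representation quoted above can be checked numerically. The following is a minimal sketch, assuming that $q$ is a primitive $p$-th root of unity and using illustrative values for $p$, $\alpha_{11}$ and $\beta_{11}$ (these values, and all helper names, are hypothetical and not taken from the paper). It builds the matrices of $a_{11}$ and $b_{11}$ in the basis $\Psi(0), \dots, \Psi(p-1)$ and tests the relation $a_{11}b_{11} = q\,b_{11}a_{11}$ together with the central values $a_{11}^p = \alpha_{11}$ and $b_{11}^p = \beta_{11}^p$.

```python
import cmath


def weyl_matrices(p, alpha, beta):
    """Matrices of a11 and b11 in the basis Psi(0), ..., Psi(p-1)."""
    # Assumption: q is a primitive p-th root of unity.
    q = cmath.exp(2j * cmath.pi / p)
    a = [[0j] * p for _ in range(p)]
    b = [[0j] * p for _ in range(p)]
    for m in range(p):
        # a11 Psi(m) = Psi(m+1), with Psi(p) identified with alpha * Psi(0).
        if m + 1 < p:
            a[m + 1][m] = 1
        else:
            a[0][m] = alpha
        # b11 Psi(m) = beta * q^(-m) * Psi(m).
        b[m][m] = beta * q ** (-m)
    return q, a, b


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(x, n):
    result = identity(len(x))
    for _ in range(n):
        result = matmul(result, x)
    return result


def identity(n):
    return [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]


def scaled(c, x):
    return [[c * entry for entry in row] for row in x]


def max_abs_diff(x, y):
    return max(abs(x[i][j] - y[i][j])
               for i in range(len(x)) for j in range(len(x)))


# Illustrative (hypothetical) parameter values.
p, alpha, beta = 5, 2 + 1j, 0.7
q, a, b = weyl_matrices(p, alpha, beta)

print("a b - q b a :", max_abs_diff(matmul(a, b), scaled(q, matmul(b, a))))
print("a^p - alpha :", max_abs_diff(matpow(a, p), scaled(alpha, identity(p))))
print("b^p - beta^p:", max_abs_diff(matpow(b, p), scaled(beta ** p, identity(p))))
```

The three printed residuals should be at the level of floating-point rounding if the assumptions hold; the check says nothing about the action of the remaining generators.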


Using this representation and Proposition 1 one can easily construct all irreducible representations of the graph algebra $L_g$.

Let us briefly discuss representations of the moduli algebra $M_g$. In this case one should consider only representations with $\chi(M_{11}^p) = 1$, $\chi(M_{12}^p) = \chi(M_{21}^p) = 0$. Let $V_0$ be the submodule of an irreducible left $L_g$-module $V$ which is annihilated by the elements $\Phi_{ij} = M_{ij} - \delta_{ij}$:
$$V_0 = \{\Psi \in V : \Phi_{ij}\Psi = 0\},$$
and let $\delta V_0$ be the submodule of $V_0$ which consists of the vectors of the following form:
$$\delta V_0 = \{\Psi \in V_0 : \Psi = \Phi_{ij}\chi_{ij} \text{ for some } \chi_{ij} \in V\}.$$
Then it is obvious that the moduli algebra acts in the factor module $V_{ph} = V_0/\delta V_0$.

6. Conclusion

In this paper we studied the structure of the centre and irreducible representations of the graph algebra. The next problem to be solved is to construct unitary representations of the graph algebra. The anti-automorphism $\rho$ (2.5) can be used to define unitary representations of $L_g$. Namely, let $V$ be a left $L_g$-module and $\beta$ be a bilinear form on $V \times V$ such that $\beta(v_2, \lambda v_1) = \lambda\beta(v_2, v_1)$, $\beta(\lambda v_2, v_1) = \lambda^*\beta(v_2, v_1)$, $\lambda \in \mathbb{C}$. A representation is called unitary if $\beta(v_2, f v_1) = \beta(\rho(f)v_2, v_1)$ for all $v_1, v_2 \in V$ and $f \in L_g$. It is obvious that only representations with central characters satisfying the equations $A_iM = MA_i$, $B_iM = MB_i$ can be unitary.

We did not study irreducible representations of the moduli algebra; however, there are some indications that $V_{ph}$ is an irreducible $M_g$-module. One should prove (or disprove) this conjecture and show that the dimension of $V_{ph}$ is given by Verlinde's formula. It would be interesting to clarify the relation between this approach to quantization of the moduli space and the geometric quantization [17, 18]. It seems that a choice of a point of $\mathrm{Spec}(Z(M_g))$ corresponds to a choice of a polarization on the moduli space.

It seems that the results obtained in the paper can be generalized to the graph algebras corresponding to arbitrary quantized universal enveloping algebras.

Acknowledgement. The author would like to thank G. Arutyunov, P. Schupp and A. A. Slavnov for discussions. He is grateful to Professor J. Wess for kind hospitality and the Alexander von Humboldt Foundation for the support. This work has been supported in part by the Russian Basic Research Fund under grant number 94-01-00300a.

Appendix

Let matrices $A$ and $B$ satisfy the commutation relations of $L_1$,
$$A^1R_+A^2R_+^{-1} = R_-A^2R_-^{-1}A^1,\quad B^1R_+B^2R_+^{-1} = R_-B^2R_-^{-1}B^1,\quad A^1R_+B^2R_+^{-1} = R_+B^2R_-^{-1}A^1,$$
$$\det{}_q A = a_{11}a_{22} - q^2a_{21}a_{12} = \lambda_a,\quad \det{}_q B = b_{11}b_{22} - q^2b_{21}b_{12} = \lambda_b.$$


These relations can be rewritten in components as follows:
$$a_{11}a_{12} = q^{-2}a_{12}a_{11},\quad a_{11}a_{21} = q^{2}a_{21}a_{11},\quad a_{11}a_{22} = a_{22}a_{11},$$
$$[a_{12}, a_{21}] = -(1 - q^{-2})a_{11}(a_{11} - a_{22}) \iff a_{12}a_{21} = q^{2}a_{21}a_{12} + (1 - q^{-2})(\lambda_a - a_{11}^2),$$
$$[a_{12}, a_{22}] = -(1 - q^{-2})a_{11}a_{12},\quad [a_{21}, a_{22}] = (1 - q^{-2})a_{21}a_{11},$$
and the same relations hold for $b_{ij}$. The mixed relations between $a_{ij}$ and $b_{ij}$ are
$$\begin{aligned}
a_{11}b_{11} &= qb_{11}a_{11},\\
a_{11}b_{12} &= q^{-1}b_{12}a_{11},\\
a_{11}b_{21} &= qb_{21}a_{11} + (q - q^{-1})b_{11}a_{21},\\
a_{11}b_{22} &= q^{-1}b_{22}a_{11} + q^{-1}(q - q^{-1})^2b_{11}a_{11} + (q - q^{-1})b_{12}a_{21},\\
a_{12}b_{11} &= qb_{11}a_{12} + (q - q^{-1})b_{12}a_{11},\\
a_{12}b_{12} &= qb_{12}a_{12},\\
a_{12}b_{21} &= q^{-1}b_{21}a_{12} + q^{-1}(q - q^{-1})^2b_{12}a_{21} + q^{-2}(q - q^{-1})\big(b_{22}a_{11} + b_{11}a_{22} + (q^{-2} - 2)b_{11}a_{11}\big),\\
a_{12}b_{22} &= q^{-1}b_{22}a_{12} + q^{-1}(q - q^{-1})^2b_{11}a_{12} + (q - q^{-1})b_{12}a_{22} - q^{-2}(q - q^{-1})b_{12}a_{11},\\
a_{21}b_{11} &= q^{-1}b_{11}a_{21},\\
a_{21}b_{12} &= q^{-1}b_{12}a_{21} + q^{-2}(q - q^{-1})b_{11}a_{11},\\
a_{21}b_{21} &= qb_{21}a_{21},\\
a_{21}b_{22} &= qb_{22}a_{21} + (q - q^{-1})b_{21}a_{11},\\
a_{22}b_{11} &= q^{-1}b_{11}a_{22} + q^{-1}(q - q^{-1})^2b_{11}a_{11} + (q - q^{-1})b_{12}a_{21},\\
a_{22}b_{22} &= qb_{22}a_{22} - q^{-3}(q - q^{-1})^2b_{11}a_{11} + (q - q^{-1})b_{21}a_{12} - q^{-2}(q - q^{-1})b_{12}a_{21},\\
a_{22}b_{21} &= q^{-1}b_{21}a_{22} + q^{-1}(q - q^{-1})^2b_{21}a_{11} + (q - q^{-1})b_{22}a_{21} - q^{-2}(q - q^{-1})b_{11}a_{21},\\
a_{22}b_{12} &= qb_{12}a_{22} + (q - q^{-1})b_{11}a_{12}.
\end{aligned}\tag{6.1}$$
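The equivalence stated for the commutator $[a_{12}, a_{21}]$ uses only the $q$-determinant condition. A short worked derivation, given here as a reader's check rather than as part of the original text:

```latex
\begin{align*}
a_{12}a_{21} - a_{21}a_{12}
  &= -(1 - q^{-2})\,a_{11}^{2} + (1 - q^{-2})\,a_{11}a_{22} \\
  &= -(1 - q^{-2})\,a_{11}^{2} + (1 - q^{-2})\bigl(\lambda_a + q^{2}a_{21}a_{12}\bigr)
     && \text{since } a_{11}a_{22} = \lambda_a + q^{2}a_{21}a_{12} \\
  &= (q^{2} - 1)\,a_{21}a_{12} + (1 - q^{-2})\bigl(\lambda_a - a_{11}^{2}\bigr),
\end{align*}
so that
\[
a_{12}a_{21} = q^{2}a_{21}a_{12} + (1 - q^{-2})\bigl(\lambda_a - a_{11}^{2}\bigr).
\]
```

The second line substitutes $\det_q A = a_{11}a_{22} - q^2a_{21}a_{12} = \lambda_a$, and the last step uses $(1 - q^{-2})q^2 = q^2 - 1$.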

Let matrices $C$ and $D$ satisfy the following relations:
$$C^1R_+D^2R_+^{-1} = R_+D^2R_+^{-1}C^1.$$
The relations look in components as follows:
$$\begin{aligned}
c_{11}d_{11} &= d_{11}c_{11} - q(q - q^{-1})d_{12}c_{21},\\
c_{11}d_{21} &= d_{21}c_{11} + q(q - q^{-1})(d_{11} - d_{22})c_{21},\\
c_{11}d_{12} &= d_{12}c_{11},\\
c_{11}d_{22} &= d_{22}c_{11} + q^{-1}(q - q^{-1})d_{12}c_{21},\\
c_{12}d_{11} &= d_{11}c_{12} + q(q - q^{-1})d_{12}(c_{11} - c_{22}),\\
c_{12}d_{12} &= q^{2}d_{12}c_{12},\\
c_{12}d_{21} + q^{-1}(q - q^{-1})c_{11}(d_{11} - d_{22}) &= q^{-2}d_{21}c_{12} + q^{-1}(q - q^{-1})(d_{11} - d_{22})c_{22},\\
c_{12}d_{22} &= d_{22}c_{12} - q^{-1}(q - q^{-1})d_{12}(c_{11} - c_{22}),\\
c_{21}d_{11} &= d_{11}c_{21},\\
c_{21}d_{12} &= q^{-2}d_{12}c_{21},\\
c_{21}d_{21} &= q^{2}d_{21}c_{21},\\
c_{21}d_{22} &= d_{22}c_{21},\\
c_{22}d_{11} &= d_{11}c_{22} + q^{-1}(q - q^{-1})d_{12}c_{21},\\
c_{22}d_{22} &= d_{22}c_{22} - q^{-3}(q - q^{-1})d_{12}c_{21},\\
c_{22}d_{12} &= d_{12}c_{22},\\
c_{22}d_{21} &= d_{21}c_{22} - q^{-1}(q - q^{-1})(d_{11} - d_{22})c_{21}.
\end{aligned}$$
Let matrices $A$ and $M$ have the commutation relations
$$M^1R_+A^2R_+^{-1} = R_-A^2R_-^{-1}M^1.$$

(6.2)


The relations look in components as follows:
$$\begin{aligned}
a_{11}m_{11} &= m_{11}a_{11},\\
a_{11}m_{12} &= m_{12}a_{11} - q(q - q^{-1})m_{11}a_{12},\\
a_{11}m_{21} &= m_{21}a_{11} + q^{-1}(q - q^{-1})m_{11}a_{21},\\
a_{11}m_{22} &= m_{22}a_{11} + q(q - q^{-1})(m_{12}a_{21} - m_{21}a_{12}) - (q - q^{-1})^2m_{11}(a_{22} - a_{11}),\\
a_{12}m_{11} &= q^{2}m_{11}a_{12},\\
a_{12}m_{12} &= m_{12}a_{12},\\
a_{12}m_{21} &= m_{21}a_{12} - q^{-1}(q - q^{-1})m_{11}(a_{11} - a_{22}),\\
a_{12}m_{22} &= q^{-2}m_{22}a_{12} - q^{-1}(q - q^{-1})m_{12}(a_{11} - a_{22}) + q^{-1}(q - q^{-1})(q^{2} - q^{-2})m_{11}a_{12},\\
a_{21}m_{11} &= q^{-2}m_{11}a_{21},\\
a_{21}m_{12} &= m_{12}a_{21} + q^{-1}(q - q^{-1})m_{11}(a_{11} - a_{22}),\\
a_{21}m_{21} &= m_{21}a_{21},\\
a_{21}m_{22} &= q^{2}m_{22}a_{21} + q(q - q^{-1})m_{21}(a_{11} - a_{22}),\\
a_{22}m_{11} &= m_{11}a_{22},\\
a_{22}m_{12} &= m_{12}a_{22} + q^{-1}(q - q^{-1})m_{11}a_{12},\\
a_{22}m_{21} &= m_{21}a_{22} - q^{-3}(q - q^{-1})m_{11}a_{21},\\
a_{22}m_{22} &= m_{22}a_{22} - q^{-1}(q - q^{-1})(m_{12}a_{21} - m_{21}a_{12}) + q^{-2}(q - q^{-1})^2m_{11}(a_{22} - a_{11}).
\end{aligned}$$

(6.3)

The commutation relations of the elements $Q_i$, $Q_i^{-1}$ with $(A_i)_{mn}$ and $(B_i)_{mn}$, which are used to define $L_g^*$, are given by the formulas
$$Q_iQ_i^{-1} = Q_i^{-1}Q_i = 1,\quad Q_iQ_j = Q_jQ_i,\quad Q_i^2 = (M_i)_{11},$$
for $i \le j$:
$$Q_i(A_j)_{mn} = q^{(m-n)}(A_j)_{mn}Q_i,\quad Q_i(B_j)_{mn} = q^{(m-n)}(B_j)_{mn}Q_i,$$
and for $i < j$:
$$\begin{aligned}
(A_i)_{11}Q_j &= Q_j(A_i)_{11} + (1 - q)Q_j^{-1}(M_j)_{12}(A_i)_{21},\\
(A_i)_{21}Q_j &= Q_j(A_i)_{21},\\
(A_i)_{12}Q_j &= Q_j(A_i)_{12} - (1 - q)Q_j^{-1}(M_j)_{12}\big((A_i)_{11} - (A_i)_{22}\big) + q^{-3}(1 - q)^2Q_j^{-3}(M_j)_{12}^2(A_i)_{21},\\
(A_i)_{22}Q_j &= Q_j(A_i)_{22} - q^{-2}(1 - q)Q_j^{-1}(M_j)_{12}(A_i)_{21},\\
(B_i)_{11}Q_j &= Q_j(B_i)_{11} + (1 - q)Q_j^{-1}(M_j)_{12}(B_i)_{21},\\
(B_i)_{21}Q_j &= Q_j(B_i)_{21},\\
(B_i)_{12}Q_j &= Q_j(B_i)_{12} - (1 - q)Q_j^{-1}(M_j)_{12}\big((B_i)_{11} - (B_i)_{22}\big) + q^{-3}(1 - q)^2Q_j^{-3}(M_j)_{12}^2(B_i)_{21},
\end{aligned}$$

$$(B_i)_{22}Q_j = Q_j(B_i)_{22} - q^{-2}(1 - q)Q_j^{-1}(M_j)_{12}(B_i)_{21}.$$

References

1. Fock, V.V. and Rosly, A.A.: Poisson structures on moduli of flat connections on Riemann surfaces and r-matrices. Preprint ITEP 72-92, June 1992, Moscow
2. Drinfeld, V.: Quantum Groups. Proc. ICM-86, Berkeley, California, USA, 1986, 1987, pp. 798–820
3. Faddeev, L.D., Reshetikhin, N.Yu. and Takhtadjan, L.A.: Leningrad Math. J. 1, 178–206 (1989)
4. Alekseev, A.Yu., Grosse, H. and Schomerus, V.: Combinatorial quantization of the Hamiltonian Chern-Simons theory I, II. HUTMP 94-B336, HUTMP 94-B337
5. Buffenoir, E. and Roche, Ph.: Commun. Math. Phys. 170, 669 (1995)
6. Alekseev, A.Yu. and Malkin, A.Z.: Commun. Math. Phys. 169, 99 (1995)
7. Alekseev, A.Yu.: Integrability in the Hamiltonian Chern-Simons theory. hep-th/9311074, St. Petersburg Math. J. 6, 2, 1 (1994)
8. Reshetikhin, N.Yu. and Semenov-Tian-Shansky, M.A.: Lett. Math. Phys. 19, 133–142 (1990)
9. Semenov-Tian-Shansky, M.A.: Dressing transformations and Poisson-Lie group actions. In: Publ. RIMS, Kyoto University 21, no. 6, 1985, p. 1237
10. Alekseev, A.Yu. and Faddeev, L.D.: Commun. Math. Phys. 141, 413–422 (1991)
11. Semenov-Tian-Shansky, M.A.: Teor. Math. Phys. 93, 302 (1992) (in Russian)
12. Alekseev, A.Yu. and Faddeev, L.D.: An involution and dynamics for the q-deformed quantum top. hep-th/9406196
13. Frolov, S.A.: Mod. Phys. Lett. A10, No. 34, 2619–2631 (1995); Physical phase space of lattice Yang-Mills theory and the moduli space of flat connections on a Riemann surface. hep-th/9511018
14. De Concini, C. and Kac, V.: Representations of quantum groups at roots of 1. In: Progress in Mathematics, vol. 92, Basel–Boston: Birkhäuser, 1990
15. Reshetikhin, N.: Commun. Math. Phys. 170, 79 (1995)
16. Chari, V. and Pressley, A.: A Guide to Quantum Groups. Cambridge: Cambridge University Press
17. Axelrod, S., Della Pietra, S. and Witten, E.: J. Diff. Geom. 33, 787 (1991)
18. Jeffrey, L.C. and Weitsman, J.: Commun. Math. Phys. 150, 593 (1992)

Communicated by A. Connes

Commun. Math. Phys. 200, 621 – 659 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Decay Estimates of Solutions for the Equations of Motion of Compressible Viscous and Heat-Conductive Gases in an Exterior Domain in $\mathbb{R}^3$

Takayuki Kobayashi$^{1,*}$, Yoshihiro Shibata$^{2}$

1 Institute of Mathematics, University of Tsukuba, Tsukuba-shi, Ibaraki 305-8571, Japan. E-mail: kobayasi@math.tsukuba.ac.jp
2 Department of Mathematics, School of Science and Engineering, Waseda University, 3-4-1 Ohkubo, Shinjyuku-ku, Tokyo 169-8555, Japan. E-mail: [email protected]

Received: 6 May 1998 / Accepted: 29 July 1998

Abstract: We consider the equations of motion of compressible viscous and heat-conductive gases in an exterior domain in $\mathbb{R}^3$. We give the $L_q$-$L_p$ estimates for solutions to the linearized equations and show an optimal decay estimate for solutions to the nonlinear problem.

1. Introduction

1.1. Purpose of the paper. In this paper, we consider the optimal decay rate of the solutions to the exterior initial boundary value problem of the equation which describes the motion of compressible viscous and heat-conductive gases. The equation is given by the following system of five equations for the density $\rho$, the velocity $\mathbf{v} = {}^T(v_1, v_2, v_3)$ and the temperature $\theta$:
$$\begin{cases}
\rho_t + (\mathbf{v}\cdot\nabla)\rho + \rho\,\mathrm{div}\,\mathbf{v} = 0,\\[2pt]
\mathbf{v}_t + (\mathbf{v}\cdot\nabla)\mathbf{v} = \dfrac{\mu}{\rho}\Delta\mathbf{v} + \dfrac{\mu + \mu'}{\rho}\nabla(\mathrm{div}\,\mathbf{v}) - \dfrac{\nabla P(\rho,\theta)}{\rho},\\[6pt]
\theta_t + (\mathbf{v}\cdot\nabla)\theta + \dfrac{\theta\,\partial_\theta P(\rho,\theta)}{\rho c}\,\mathrm{div}\,\mathbf{v} = \dfrac{\kappa}{\rho c}\Delta\theta + \dfrac{\Psi(\mathbf{v})}{\rho c},
\end{cases}\tag{1.1}$$
where ${}^T(v_1, v_2, v_3)$ is the transposed $(v_1, v_2, v_3)$, $P = P(\rho,\theta)$ is the pressure, $\mu$ and $\mu'$ the viscosity coefficients, $\kappa$ the coefficient of the heat conduction, $c$ the heat capacity at the constant volume and $\Psi = \Psi(\mathbf{v})$ is the dissipation function:
$$\Psi(\mathbf{v}) = \frac{\mu}{2}\sum_{j,k=1}^{3}(\partial_kv_j + \partial_jv_k)^2 + \mu'\sum_{j=1}^{3}(\partial_jv_j)^2.$$

* Present address: Department of Mathematics, Kyushu Institute of Technology, 1-1 Sensui-cho, Tobata-ku, Kitakyushu 804-8550, Japan. E-mail: [email protected]


We consider the initial boundary value problem (IBVP) of (1.1) in the region $t \ge 0$, $x \in \Omega$, where $\Omega$ is an exterior domain in $\mathbb{R}^3$ with compact smooth boundary $\partial\Omega$. The boundary condition is given by
$$\mathbf{v}|_{\partial\Omega} = 0,\quad \theta|_{\partial\Omega} = \bar\theta_0,\quad \lim_{|x|\to\infty}\mathbf{v}(t,x) = 0,\quad \lim_{|x|\to\infty}\theta(t,x) = \bar\theta_0,\quad t > 0,\tag{1.2}$$
and the initial condition is given by
$$(\rho, \mathbf{v}, \theta)(0,x) = (\rho_0, \mathbf{v}_0, \theta_0)(x) \quad\text{in } \Omega.\tag{1.3}$$
The unique existence of smooth solutions of IBVP: (1.1), (1.2) and (1.3) globally in time near the constant state $(\bar\rho_0, 0, \bar\theta_0)$, where $\bar\rho_0$ and $\bar\theta_0$ are positive constants, was studied by Matsumura and Nishida [26] in the $L_2$ framework, but they did not give the rate of convergence of solutions as $t \to \infty$. In this paper, we will show the optimal rate of convergence of solutions as $t \to \infty$ under the additional assumption that $(\rho_0 - \bar\rho_0, \mathbf{v}_0, \theta_0 - \bar\theta_0)$ belongs to $L_1$, which corresponds to the result for the Cauchy problem obtained by Matsumura and Nishida [22, 23] and Ponce [29]. In order to get the optimal rate of convergence, the main tool is the decay property of solutions to the linearized equations. In the Cauchy problem case, it was obtained from the concrete representation formula of solutions given by the Fourier transform (cf. Theorem 3.1 below). But, in the exterior boundary value problem case, we cannot expect to get such a representation formula. We need another idea to obtain sufficient decay properties of solutions to the linearized equations, which will be discussed in Sect. 4 below.

The basic assumption of this paper is the following.

(1) $c$, $\kappa$ and $\mu$ are positive constants, $\mu'$ is a constant and $\frac{2}{3}\mu + \mu' \ge 0$.
(2) $P$ is a known function of $\rho$ and $\theta$, smooth in a neighborhood of $(\rho_0, \theta_0)$, where $\dfrac{\partial P}{\partial\rho}, \dfrac{\partial P}{\partial\theta} > 0$.

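1.2. Notation. Before stating our main results precisely, at this point we shall explain our notation. Three dimensional row vector valued functions are denoted by a bold-face letter which corresponds to the velocity field, that is for example $\mathbf{v} = {}^T(v_1, v_2, v_3)$. Five dimensional row vector valued functions are denoted by blackboard bold letters, that is for example $\mathbb{U} = {}^T(u_1, u_2, u_3, u_4, u_5)$, where $u_1$ corresponds to the density, ${}^T(u_2, u_3, u_4)$ to the velocity field and $u_5$ to the temperature. If we write $(\rho, \mathbf{v}, \theta)$, $\mathbf{v} = {}^T(v_1, v_2, v_3)$, then this also stands for the five row vector ${}^T(\rho, v_1, v_2, v_3, \theta)$. As usual, $\cdot$ stands for the inner product in $\mathbb{R}^3$, and we set
$$\partial_t = \partial/\partial t,\quad u_t = \partial_tu,\quad \partial_j = \partial/\partial x_j,\quad \Delta = \partial_1^2 + \partial_2^2 + \partial_3^2,\quad \nabla u = {}^T(\partial_1u, \partial_2u, \partial_3u),$$
$$\partial_x^\alpha = \partial_1^{\alpha_1}\partial_2^{\alpha_2}\partial_3^{\alpha_3},\quad \alpha = (\alpha_1, \alpha_2, \alpha_3),\quad |\alpha| = \alpha_1 + \alpha_2 + \alpha_3,$$
$$\Delta\mathbf{u} = {}^T(\Delta u_1, \Delta u_2, \Delta u_3),\quad \nabla u : \nabla\mathbf{v} = {}^T(\nabla u\cdot\nabla v_1, \nabla u\cdot\nabla v_2, \nabla u\cdot\nabla v_3),$$
$$(\mathbf{u}\cdot\nabla)\mathbf{v} = {}^T(\mathbf{u}\cdot\nabla v_1, \mathbf{u}\cdot\nabla v_2, \mathbf{u}\cdot\nabla v_3),\quad \mathrm{div}\,\mathbf{v} = \nabla\cdot\mathbf{v} = \sum_{j=1}^{3}\partial_jv_j,$$
$$\partial_t^j\partial_x^\alpha\mathbf{v} = {}^T(\partial_t^j\partial_x^\alpha v_1, \partial_t^j\partial_x^\alpha v_2, \partial_t^j\partial_x^\alpha v_3),\quad \partial_t^j\partial_x^\alpha\mathbb{U} = (\partial_t^j\partial_x^\alpha\rho, \partial_t^j\partial_x^\alpha\mathbf{v}, \partial_t^j\partial_x^\alpha\theta)\quad (\mathbb{U} = (\rho, \mathbf{v}, \theta)),$$
$$\partial_x^ku = (\partial_x^\alpha u \mid |\alpha| = k),\quad \partial_x^k\mathbf{v} = (\partial_x^\alpha\mathbf{v} \mid |\alpha| = k),\quad \partial_x^k\mathbb{U} = (\partial_x^\alpha\mathbb{U} \mid |\alpha| = k),$$
$$\partial_x^1u = \partial_xu,\quad \partial_x^1\mathbf{v} = \partial_x\mathbf{v},\quad \partial_x^1\mathbb{U} = \partial_x\mathbb{U}.$$
Let $D$ be any domain in $\mathbb{R}^3$. $L_p(D)$ denotes the usual $L_p$ space on $D$ with norm $\|\cdot\|_{p,D}$. Put
$$\|u\|_{m,p,D} = \sum_{|\alpha|\le m}\|\partial_x^\alpha u\|_{p,D},\quad W_p^m(D) = \{u \in L_p(D) \mid \|u\|_{m,p,D} < \infty\},$$
$$H^m(D) = W_2^m(D),\quad W_p^0(D) = L_p(D),\quad H^0(D) = L_2(D).$$
Sobolev spaces of vector valued functions are used as well as of scalar valued functions. For the function spaces of three and five dimensional row vector valued functions we use boldface and blackboard bold letters, respectively. For example,
$$\mathbf{L}_p(D) = \{\mathbf{v} = (v_1, v_2, v_3) \mid v_j \in L_p(D)\},\quad \mathbb{L}_p(D) = \{\mathbb{U} = (\rho, \mathbf{v}, \theta) \mid \rho, \theta \in L_p(D),\ \mathbf{v} \in \mathbf{L}_p(D)\}.$$
Likewise for $\mathbf{W}_p^m(D)$, $\mathbb{W}_p^m(D)$, $\mathbf{H}^m(D)$ and $\mathbb{H}^m(D)$. Set
$$\|\mathbf{v}\|_{m,p,D} = \sum_{j=1}^{3}\|v_j\|_{m,p,D},\quad \|\mathbb{U}\|_{m,p,D} = \sum_{j=1}^{5}\|u_j\|_{m,p,D}.$$
Set $P\mathbb{U} = {}^T(0, \mathbf{v}, \theta)$ and $(I - P)\mathbb{U} = {}^T(\rho, 0, 0, 0, 0)$ for $\mathbb{U} = (\rho, \mathbf{v}, \theta)$. Set
$$\mathbb{W}_p^{k,m}(D) = \{\mathbb{U} \mid (I - P)\mathbb{U} \in \mathbb{W}_p^k(D),\ P\mathbb{U} \in \mathbb{W}_p^m(D)\},\quad \|\mathbb{U}\|_{\mathbb{W}_p^{k,m}(D)} = \|(I - P)\mathbb{U}\|_{k,p,D} + \|P\mathbb{U}\|_{m,p,D}.$$
When $D = \Omega$, we omit the index $\Omega$. Namely,
$$\|\cdot\|_{p,\Omega} = \|\cdot\|_p,\quad \|\cdot\|_{k,p,\Omega} = \|\cdot\|_{k,p},\quad \|\cdot\|_{\mathbb{W}_p^{k,m}(\Omega)} = \|\cdot\|_{\mathbb{W}_p^{k,m}}.$$
Likewise for $L_p$, $\mathbf{L}_p$, $\mathbb{L}_p$, $W_p^m$, $\mathbf{W}_p^m$, $\mathbb{W}_p^m$, $H^m$, $\mathbf{H}^m$, $\mathbb{H}^m$ and $\mathbb{W}_p^{k,m}$. Set
$$C^\ell([t_1, t_2], B) = \{u(t) \mid \ell\text{-times continuously differentiable function of } t \in [t_1, t_2] \text{ with value in a Banach space } B\},$$
$$L_2((t_1, t_2), B) = \{u(t) \mid L_2\text{-function of } t \in [t_1, t_2] \text{ with value in } B\}.$$
To denote various constants we use the same letter $C$, and $C(A, B, \dots)$ denotes the constant depending on the quantities $A, B, \dots$. $O(\cdot)$ means the large order.

1.3. Existence results. In order to state the existence of solutions according to Matsumura and Nishida [26], first we introduce the class of solutions $X^k(0, \infty)$ and some norms of solutions $N_k(0, \infty)$, $k = 1, 2$: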


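$$\begin{aligned}
X^k(0,\infty) = \Big\{\mathbb{U} = (\rho, \mathbf{v}, \theta) \ \Big|\ & \rho - \bar\rho_0 \in \bigcap_{j=0}^{k}C^j([0,\infty); H^{k+2-j}),\\
& \partial_x\rho \in L_2((0,\infty); H^{k+1}),\quad \partial_t^j\rho \in L_2((0,\infty); H^{k+2-j}),\ j = 1, k,\\
& \mathbf{v} \in \bigcap_{j=0}^{k}C^j([0,\infty); \mathbf{H}^{k+2-2j}),\quad \partial_x\mathbf{v} \in L_2((0,\infty); \mathbf{H}^{k+2}),\\
& \partial_t^j\mathbf{v} \in L_2((0,\infty); \mathbf{H}^{k+3-2j}),\ j = 1, k,\\
& \theta - \bar\theta_0 \in \bigcap_{j=0}^{k}C^j([0,\infty); H^{k+2-2j}),\quad \partial_x\theta \in L_2((0,\infty); H^{k+2}),\\
& \partial_t^j\theta \in L_2((0,\infty); H^{k+3-2j}),\ j = 1, k\Big\},
\end{aligned}$$
$$N_1(0,\infty)^2 = \sup_{0\le t<\infty}\Big(\|\mathbb{U}(t) - \bar{\mathbb{U}}_0\|_{3,2}^2 + \|\partial_t\mathbb{U}(t)\|_{\mathbb{W}_2^{2,1}}^2\Big) + \int_0^\infty\Big(\|\partial_x\mathbb{U}(s)\|_{\mathbb{W}_2^{2,3}}^2 + \|\partial_s\mathbb{U}(s)\|_{2,2}^2\Big)ds,$$
$$N_2(0,\infty)^2 = \sup_{0\le t<\infty}\Big(\|\mathbb{U}(t) - \bar{\mathbb{U}}_0\|_{4,2}^2 + \|\partial_t\mathbb{U}(t)\|_{\mathbb{W}_2^{3,2}}^2 + \|\partial_t^2\mathbb{U}(t)\|_{\mathbb{W}_2^{2,0}}^2\Big) + \int_0^\infty\Big(\|\partial_x\mathbb{U}(s)\|_{\mathbb{W}_2^{3,4}}^2 + \|\partial_s\mathbb{U}(s)\|_{3,2}^2 + \|\partial_s^2\mathbb{U}(s)\|_{\mathbb{W}_2^{2,1}}^2\Big)ds,$$
where $\bar{\mathbb{U}}_0 = (\bar\rho_0, 0, \bar\theta_0)$.

Now, let us explain some necessary condition on the initial data $(\rho_0, \mathbf{v}_0, \theta_0)$. To do this, for a while we assume that $(\rho, \mathbf{v}, \theta) \in X^k(0,\infty)$ is a solution of IBVP: (1.1), (1.2) and (1.3). Then, $\partial_t^j(\rho, \mathbf{v}, \theta)|_{t=0}$, $j \ge 1$, are determined successively by the initial data $(\rho_0, \mathbf{v}_0, \theta_0)$ through Eq. (1.1). Namely, if we put $(\rho_j, \mathbf{v}_j, \theta_j) = \partial_t^j(\rho, \mathbf{v}, \theta)|_{t=0}$, then $(\rho_j, \mathbf{v}_j, \theta_j)$ is generated inductively from $(\rho_0, \mathbf{v}_0, \theta_0)$ by means of Eq. (1.1). For example,
$$\begin{aligned}
\rho_1 &= -(\mathbf{v}_0\cdot\nabla)\rho_0 - \rho_0\,\mathrm{div}\,\mathbf{v}_0,\\
\mathbf{v}_1 &= -(\mathbf{v}_0\cdot\nabla)\mathbf{v}_0 + \frac{\mu}{\rho_0}\Delta\mathbf{v}_0 + \frac{\mu + \mu'}{\rho_0}\nabla(\mathrm{div}\,\mathbf{v}_0) - \frac{\nabla P(\rho_0, \theta_0)}{\rho_0},\\
\theta_1 &= -(\mathbf{v}_0\cdot\nabla)\theta_0 - \frac{\theta_0\,\partial_\theta P(\rho_0, \theta_0)}{\rho_0c}\,\mathrm{div}\,\mathbf{v}_0 + \frac{\kappa}{\rho_0c}\Delta\theta_0 + \frac{\Psi(\mathbf{v}_0)}{\rho_0c}.
\end{aligned}$$
From the definition of $X^k(0,\infty)$, the solution $(\rho, \mathbf{v}, \theta)$ belonging to $X^k(0,\infty)$ should satisfy the following condition at $t = 0$:
$$\begin{aligned}
&\rho_0 - \bar\rho_0 \in H^{k+2},\quad \rho_j \in H^{k+2-j}\ (j = 1, k),\\
&\mathbf{v}_j \in \mathbf{H}^{k+2-2j}\ (j = 0, 1, k),\quad \mathbf{v}_j|_{\partial\Omega} = 0\ (j = 0, 1),\\
&\theta_0 - \bar\theta_0 \in H^{k+2},\quad \theta_0|_{\partial\Omega} = \bar\theta_0,\quad \theta_j \in H^{k+2-2j}\ (j = 1, k),\quad \theta_1|_{\partial\Omega} = 0.
\end{aligned}\tag{1.4}$$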

This is a requirement on the initial data (ρ0 , v0 , θ0 ) in order to obtain solutions belonging to X k (0, ∞). If (ρ0 , v0 , θ0 ) satisfies (1.4), we say that (ρ0 , v0 , θ0 ) satisfies the k th order compatibility condition and regularity.


Theorem 1.1. Let $k = 1$ or $2$. Assume that assumptions (1) and (2) hold. Then, there exists an $\epsilon_0 > 0$ such that if $(\rho_0, \mathbf{v}_0, \theta_0)$ satisfies the $k$th order compatibility condition and regularity and $\|(\rho_0 - \bar\rho_0, \mathbf{v}_0, \theta_0 - \bar\theta_0)\|_{3,2} \le \epsilon_0$, then (IBVP): (1.1), (1.2) and (1.3) admits a unique solution $(\rho, \mathbf{v}, \theta) \in X^k(0,\infty)$. Moreover, there exists a constant $C$ such that
$$N_k(0,\infty) \le C\|(\rho_0 - \bar\rho_0, \mathbf{v}_0, \theta_0 - \bar\theta_0)\|_{k+2,2}.\tag{1.5}$$

Remark. Theorem 1.1 was proved by Matsumura and Nishida [26] in the case $k = 1$ only. But, employing the same argument as in [26], we can prove Theorem 1.1 in the case $k = 2$ immediately, and so we may omit the proof of Theorem 1.1 in this paper.

1.4. Rate of convergence. The main purpose of the paper is to prove the following theorem concerning the rate of convergence of solutions to IBVP: (1.1), (1.2) and (1.3).

Theorem 1.2. Assume that assumptions (1) and (2) hold. Assume that $(\rho_0, \mathbf{v}_0, \theta_0)$ satisfies the 2nd order compatibility condition and regularity and $(\rho_0 - \bar\rho_0, \mathbf{v}_0, \theta_0 - \bar\theta_0) \in \mathbb{L}_1$. Then, there exists an $\epsilon > 0$ such that if $\|(\rho_0 - \bar\rho_0, \mathbf{v}_0, \theta_0 - \bar\theta_0)\|_{4,2} \le \epsilon$ then the solution $(\rho, \mathbf{v}, \theta)$ of (IBVP): (1.1), (1.2) and (1.3) has the following asymptotic behaviour as $t \to \infty$:
$$\begin{aligned}
&\|(\rho - \bar\rho_0, \mathbf{v}, \theta - \bar\theta_0)(t)\|_2 = O(t^{-3/4});\\
&\|\partial_x(\rho, \mathbf{v}, \theta)(t)\|_{\mathbb{W}_2^{1,2}} + \|\partial_t(\rho, \mathbf{v}, \theta)(t)\|_{1,2} = O(t^{-5/4});\\
&\|(\rho - \bar\rho_0, \mathbf{v}, \theta - \bar\theta_0)(t)\|_\infty = O(t^{-3/2});\\
&\|\partial_x(\rho, \mathbf{v}, \theta)(t)\|_p = O(t^{-3/2}),\quad 3 < p < \infty.
\end{aligned}$$
Here $\epsilon$ depends on $p$.

1.5. Decay property of solutions to the linearized problem. In order to prove Theorem 1.2, we shall use the decay property of solutions to the corresponding linearized problem. If we linearize Eq. (1.1) at the constant state $(\bar\rho_0, 0, \bar\theta_0)$ and make some linear transformation of the unknown function, then we have the following initial boundary value problem of the linear operators:
$$\begin{cases}
\rho_t + \gamma\,\mathrm{div}\,\mathbf{v} = 0,\\
\mathbf{v}_t - \alpha\Delta\mathbf{v} - \beta\nabla\mathrm{div}\,\mathbf{v} + \gamma\nabla\rho + \omega\nabla\theta = 0,\\
\theta_t - \kappa\Delta\theta + \omega\,\mathrm{div}\,\mathbf{v} = 0
\end{cases}\quad\text{in } [0,\infty)\times\Omega,\tag{1.6}$$
$$\mathbf{v}|_{\partial\Omega} = 0,\quad \theta|_{\partial\Omega} = 0,\quad (\rho, \mathbf{v}, \theta)|_{t=0} = (\rho_0, \mathbf{v}_0, \theta_0),$$
where $\alpha$, $\kappa$, $\gamma$ and $\omega$ are positive constants and $\beta$ is a nonnegative constant. Let $A$ be the $5\times5$ matrix of the differential operators of the form:
$$A = \begin{pmatrix} 0 & \gamma\,\mathrm{div} & 0\\ \gamma\nabla & -\alpha\Delta - \beta\nabla\mathrm{div} & \omega\nabla\\ 0 & \omega\,\mathrm{div} & -\kappa\Delta \end{pmatrix}\tag{1.7}$$
with the domain:


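$$D_p(A) = \{\mathbb{U} = (\rho, \mathbf{v}, \theta) \in \mathbb{W}_p^{1,2} \mid P\mathbb{U}|_{\partial\Omega} = 0\}\quad\text{for } 1 < p < \infty.$$
Then, (1.6) is written in the form:
$$\mathbb{U}_t + A\mathbb{U} = 0\quad\text{for } t > 0,\qquad \mathbb{U}|_{t=0} = \mathbb{U}_0,\tag{1.8}$$
where $\mathbb{U}_0 = (\rho_0, \mathbf{v}_0, \theta_0)$ and $\mathbb{U} = (\rho, \mathbf{v}, \theta)$.

Moreover, if we apply some linear transformation to $(\rho - \bar\rho_0, \mathbf{v}, \theta - \bar\theta_0)$ (the resulting vector of functions being denoted by $\tilde{\mathbb{U}} = (\tilde\rho, \tilde{\mathbf{v}}, \tilde\theta)$), then we can reduce IBVP: (1.1), (1.2) and (1.3) to the problem:
$$\tilde{\mathbb{U}}_t + A\tilde{\mathbb{U}} = \mathbb{F}(\tilde{\mathbb{U}})\quad\text{for } t > 0,\qquad \tilde{\mathbb{U}}|_{t=0} = \tilde{\mathbb{U}}_0\tag{1.9}$$
with a suitable nonlinear term $\mathbb{F}(\tilde{\mathbb{U}})$ (cf. Sect. 5 below). Therefore, in order to prove Theorem 1.2, we have to obtain a suitable decay property of solutions to (1.8). Kobayashi [17] proved that $A$ generates an analytic semigroup $\{e^{-tA}\}_{t\ge0}$ on $\mathbb{W}_p^{1,0}$, $1 < p < \infty$. In this paper, we shall show the following theorem concerning the decay rate of $\{e^{-tA}\}_{t\ge0}$.

Theorem 1.3. Put $\mathbb{U}(t) = e^{-tA}\mathbb{U}_0$. Let $1 \le q \le 2 \le p < \infty$. All the constants $C$ in the theorem depend essentially on $p$ and $q$.

(A) For $\mathbb{U}_0 \in \mathbb{W}_p^{1,0} \cap \mathbb{L}_q$ and $t \ge 1$, we have the following estimates:
$$\begin{aligned}
&\|\mathbb{U}(t)\|_p \le Ct^{-\sigma}\big(\|\mathbb{U}_0\|_q + \|\mathbb{U}_0\|_{\mathbb{W}_p^{1,0}}\big), && 2 \le p < \infty,\quad \sigma = \frac{3}{2}\Big(\frac{1}{q} - \frac{1}{p}\Big);\\
&\|\partial_x\mathbb{U}(t)\|_p + \|\partial_t\mathbb{U}(t)\|_p \le Ct^{-\sigma-\frac12}\big(\|\mathbb{U}_0\|_q + \|\mathbb{U}_0\|_{\mathbb{W}_p^{1,0}}\big), && 2 \le p \le 3;\\
&\|\partial_x\mathbb{U}(t)\|_p + \|\partial_t\mathbb{U}(t)\|_p \le Ct^{-\frac{3}{2q}}\big(\|\mathbb{U}_0\|_q + \|\mathbb{U}_0\|_{\mathbb{W}_p^{1,0}}\big), && 3 < p < \infty;\\
&\|\partial_x^2P\mathbb{U}(t)\|_p \le Ct^{-\frac{3}{2q}}\big(\|\mathbb{U}_0\|_q + \|\mathbb{U}_0\|_{\mathbb{W}_p^{1,0}}\big), && 2 \le p < \infty;\\
&\|\mathbb{U}(t)\|_{\mathbb{W}_\infty^{0,1}} \le Ct^{-\frac{3}{2q}}\big(\|\mathbb{U}_0\|_q + \|\mathbb{U}_0\|_{\mathbb{W}_p^{1,0}}\big), && 3 < p < \infty.
\end{aligned}$$

(B) If $\mathbb{U}_0 \in \mathbb{W}_p^2 \cap \mathbb{L}_q$ and $P\mathbb{U}_0 = 0$ on $\partial\Omega$, then we have the following estimates for $t \ge 1$:
$$\begin{aligned}
&\|\partial_x^2(I - P)\mathbb{U}(t)\|_p + \|\partial_x^3P\mathbb{U}(t)\|_p + \|\partial_x\partial_t\mathbb{U}(t)\|_p \le Ct^{-\frac{3}{2q}}\big(\|\mathbb{U}_0\|_q + \|\mathbb{U}_0\|_{2,p}\big), && 2 \le p < \infty;\\
&\|\mathbb{U}(t)\|_{\mathbb{W}_\infty^{1,2}} \le Ct^{-\frac{3}{2q}}\big(\|\mathbb{U}_0\|_q + \|\mathbb{U}_0\|_{2,p}\big), && 3 < p < \infty.
\end{aligned}$$

(C) If we assume additionally that $q > 1$, then for $\mathbb{U}_0 \in \mathbb{W}_p^{2,1} \cap \mathbb{W}_q^{1,0}$ and $t \ge 1$ we have the following estimates:
$$\begin{aligned}
&\|\partial_x^2(I - P)\mathbb{U}(t)\|_p + \|\partial_x^3P\mathbb{U}(t)\|_p + \|\partial_x\partial_t\mathbb{U}(t)\|_p \le Ct^{-\frac{3}{2q}}\big(\|\mathbb{U}_0\|_{\mathbb{W}_q^{1,0}} + \|\mathbb{U}_0\|_{\mathbb{W}_p^{2,1}}\big), && 2 \le p < \infty;\\
&\|\mathbb{U}(t)\|_{\mathbb{W}_\infty^{1,2}} \le Ct^{-\frac{3}{2q}}\big(\|\mathbb{U}_0\|_{\mathbb{W}_q^{1,0}} + \|\mathbb{U}_0\|_{\mathbb{W}_p^{2,1}}\big), && 3 < p < \infty.
\end{aligned}$$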
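As a small illustration of how the exponents in Theorem 1.3 line up with the rates stated in Theorem 1.2, the sketch below evaluates $\sigma = \frac32\big(\frac1q - \frac1p\big)$ with exact rational arithmetic. It only tabulates exponents for $q = 1$ (data in $L_1$); the function name `sigma` and the convention `p=None` for $p = \infty$ are choices made here, not notation from the paper, and the sketch is not a substitute for the nonlinear argument.

```python
from fractions import Fraction


def sigma(q, p=None):
    """Return (3/2) * (1/q - 1/p); p=None stands for p = infinity."""
    inv_p = Fraction(0) if p is None else Fraction(1, p)
    return Fraction(3, 2) * (Fraction(1, q) - inv_p)


# L_1 data (q = 1), measured in L_2 (p = 2): exponent of the L_2 decay.
l2_rate = sigma(1, 2)
# One derivative costs an extra factor t^(-1/2) in the range 2 <= p <= 3.
gradient_l2_rate = sigma(1, 2) + Fraction(1, 2)
# The exponent 3/(2q) appearing for 3 < p < infinity equals sigma with p = infinity.
sup_rate = sigma(1)

print(l2_rate, gradient_l2_rate, sup_rate)
```

With $q = 1$ these are the exponents $3/4$, $5/4$ and $3/2$ that appear in Theorem 1.2.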


Theorem 1.3 will be proved by combining the $L_p$-$L_q$ type estimate in the $\mathbb{R}^3$ case (cf. Sect. 3) and the local energy decay estimate (cf. Sect. 2) of $\{e^{-tA}\}_{t\ge0}$, via a cut-off technique in Sect. 4. Such a cut-off technique is already known as an effective method in the study of the $L_p$-$L_q$ type estimate for semigroups in exterior domains (cf. Iwashita [15] for the Stokes semigroup, Kobayashi and Shibata [19] for the Oseen semigroup, Shibata [31] and Iwashita and Shibata [16] for elastic and wave equations and references cited therein).

To prove Theorem 1.2, we reduce IBVP: (1.1), (1.2) and (1.3) to the integral equation:
$$\tilde{\mathbb{U}}(t) = e^{-tA}\tilde{\mathbb{U}}_0 + \int_0^te^{-(t-s)A}\mathbb{F}(\tilde{\mathbb{U}}(s))\,ds$$
(cf. (1.9)). Applying Theorem 1.3 and using the fact that $N_2(0,\infty) \le C\|(\rho_0 - \bar\rho_0, \mathbf{v}_0, \theta_0 - \bar\theta_0)\|_{4,2}$ (cf. Theorem 1.1), we will have Theorem 1.2. Since $P\mathbb{F}(\tilde{\mathbb{U}}(s))$ does not vanish at the boundary of $\Omega$, in order to get the estimate of higher order derivatives Theorem 1.3 (C) will be used instead of Theorem 1.3 (B).

1.6. Related results concerning the nonlinear problem. Previous results concerning the uniqueness were obtained by Graffi [11], Serrin [30] and Valli [36]. The local in time existence theorem was proved by Nash [27], Itaya [13, 14] and Vol'pert and Hudjaev [41] for the Cauchy problem in $\mathbb{R}^3$; Fiszdon and Zajaczkowski [8, 9] and Lukaszewicz [20, 21] for the Cauchy Dirichlet problem in a bounded domain; Solonnikov [32], Tani [35] and Valli [39] for the Cauchy Dirichlet problem in a general domain. The global in time existence theorem was proved by Matsumura and Nishida [22, 23] and Ponce [29] for the Cauchy problem in $\mathbb{R}^3$; Matsumura and Nishida [24, 25], Valli [39], Valli and Zajaczkowski [40], Fiszdon and Zajaczkowski [8, 10] and Ströhmer [33, 34] for the Cauchy Dirichlet problem in a bounded domain.

Concerning the decay rate of solutions in the Cauchy problem case, Matsumura and Nishida [22] showed that if the $\mathbb{L}_1(\mathbb{R}^3) \cap \mathbb{H}^4(\mathbb{R}^3)$-norm of the initial data is sufficiently small, then
$$\|(\rho - \bar\rho_0, \mathbf{v}, \theta - \theta_0)\|_{2,2,\mathbb{R}^3} = O(t^{-3/4})\quad\text{as } t \to \infty.$$
Also Ponce [29] showed that if the $\mathbb{W}_1^{s_0}(\mathbb{R}^3) \cap \mathbb{H}^{s_0}(\mathbb{R}^3)$-norm ($s_0 \ge 4$, integer) of the initial data is sufficiently small, then
$$\|\partial_x^\alpha(\rho - \bar\rho_0, \mathbf{v}, \theta - \theta_0)\|_{p,\mathbb{R}^3} = O\big(t^{-\frac{3}{2q} - \frac{|\alpha|}{2}}\big)\quad\text{as } t \to \infty,$$
where $p \ge 2$, $1/p + 1/q = 1$ and $|\alpha| \le 2$. As was already stated, the case of the half-space or exterior domain has been studied by Matsumura and Nishida [26]. They proved the global in time existence theorem for small initial data in $\mathbb{H}^3$ and showed that the $L_\infty$-norm of solutions vanishes as $t \to \infty$. Deckelnick [5, 6] proved the following decay rate:
$$\begin{aligned}
&\|\partial_x^1(\rho, \mathbf{v}, \theta)\|_2 = O(t^{-1/4}) &&\text{as } t \to \infty,\\
&\|\rho - \rho_0\|_\infty = O(t^{-1/8}) &&\text{as } t \to \infty,\\
&\|(\mathbf{v}, \theta - \theta_0)\|_\infty = O(t^{-1/4}) &&\text{as } t \to \infty.
\end{aligned}$$
But this rate is weaker than the decay rate obtained by Matsumura and Nishida [22] and Ponce [29] in the Cauchy problem case, because the initial data are assumed to be in $\mathbb{H}^3$ only.


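Our result gives an optimal rate in the case that the initial data belong to $\mathbb{L}_1$, which corresponds to the rate in the Cauchy problem case obtained by Matsumura and Nishida [22] and Ponce [29]. Moreover, Theorem 1.2 is slightly better than [22] and [29], because we do not assume the smallness of the $L_1$-norm of the initial data.

2. Some Known Properties of the Semigroup $\{e^{-tA}\}_{t\ge0}$

In this section, we shall summarize some properties of the semigroup $\{e^{-tA}\}_{t\ge0}$, which were obtained by Kobayashi [16, 17]. Let $\rho(-A)$ be the resolvent set of the operator $-A$. Then, we have the following lemma.

Lemma 2.1. Let $1 < p < \infty$. Then $-A$ is a closed linear operator in $\mathbb{W}_p^{1,0}(\Omega)$ and
$$\rho(-A) \supset \Sigma = \{\lambda \in \mathbb{C} \mid C\,\mathrm{Re}\,\lambda + (\mathrm{Im}\,\lambda)^2 > 0\},$$
where $C$ is a constant depending only on $\alpha$, $\beta$, $\gamma$, $\kappa$ and $\omega$. Moreover, the following properties are valid: There exist positive constants $\lambda_0$ and $\delta < \frac{\pi}{2}$ such that for any $\lambda - \lambda_0 \in \Sigma_\delta = \{\lambda \in \mathbb{C} \mid |\arg\lambda| \le \pi - \delta\}$,
$$|\lambda|\,\|(\lambda + A)^{-1}\mathbb{F}\|_{\mathbb{W}_p^{1,0}} + \|P(\lambda + A)^{-1}\mathbb{F}\|_{2,p} \le C(\lambda_0, \delta)\|\mathbb{F}\|_{\mathbb{W}_p^{1,0}}\quad\text{if } \mathbb{F} \in \mathbb{W}_p^{1,0};$$
$$|\lambda|^{\frac12}\|(I - P)(\lambda + A)^{-1}\mathbb{F}\|_{2,p} + |\lambda|^{-\frac12}\|P(\lambda + A)^{-1}\mathbb{F}\|_{3,p} \le C(\lambda_0, \delta)\|\mathbb{F}\|_{\mathbb{W}_p^{2,1}}\quad\text{if } \mathbb{F} \in \mathbb{W}_p^{2,1}.$$

By Lemma 2.1 we see that $-A$ generates an analytic semigroup $\{e^{-tA}\}_{t\ge0}$ on $\mathbb{W}_p^{1,0}$ and moreover we have the following estimates for $0 < t \le 2$:
$$\|A^ke^{-tA}\mathbb{U}\|_{\mathbb{W}_p^{1,0}} \le C(p,k)t^{-k}\|\mathbb{U}\|_{\mathbb{W}_p^{1,0}}\quad\text{for } \mathbb{U} \in \mathbb{W}_p^{1,0},\ k \ge 0 \text{ integer},\tag{2.1}$$
$$\|e^{-tA}\mathbb{U}\|_{1,p,\Omega} \le C(p)t^{-\frac12}\|\mathbb{U}\|_{\mathbb{W}_p^{1,0}}\quad\text{for } \mathbb{U} \in \mathbb{W}_p^{1,0},\tag{2.2}$$
$$\|(I - P)e^{-tA}\mathbb{U}\|_{2,p,\Omega} \le C(p)t^{-\frac12}\|\mathbb{U}\|_{\mathbb{W}_p^{2,1}}\quad\text{for } \mathbb{U} \in \mathbb{W}_p^{2,1},\tag{2.3}$$
$$\|Pe^{-tA}\mathbb{U}\|_{3,p,\Omega} \le C(p)t^{-\frac32}\|\mathbb{U}\|_{\mathbb{W}_p^{2,1}}\quad\text{for } \mathbb{U} \in \mathbb{W}_p^{2,1}.\tag{2.4}$$
By Proposition Ap.3 in the Appendix below, we have
$$C_1\|\mathbb{U}\|_{\mathbb{W}_p^{1,2}} \le \|\mathbb{U}\|_{\mathbb{W}_p^{1,0}} + \|A\mathbb{U}\|_{\mathbb{W}_p^{1,0}} \le C_2\|\mathbb{U}\|_{\mathbb{W}_p^{1,2}}\tag{2.5}$$
for suitable constants $C_1$ and $C_2$ provided that $\mathbb{U} \in D_p(A)$. If $\mathbb{U} \in D_p(A)$, then $Ae^{-tA}\mathbb{U} = e^{-tA}(A\mathbb{U})$, and therefore by (2.1) and (2.5)
$$\|e^{-tA}\mathbb{U}\|_{\mathbb{W}_p^{1,2}} \le C\|\mathbb{U}\|_{\mathbb{W}_p^{1,2}}.$$
The next lemma is concerned with a local decay property of $\{e^{-tA}\}_{t\ge0}$. Set
$$\mathbb{W}_{p,b}^{k,m} = \{\mathbb{U} \in \mathbb{W}_p^{k,m} \mid \mathbb{U}(x) = 0 \text{ for } |x| \ge b\},\quad \Omega_b = \Omega \cap B_b,\quad B_b = \{x \in \mathbb{R}^3 \mid |x| < b\}.$$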

(2.6)


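Lemma 2.2 (Local energy decay). Let $1 < p < \infty$ and let $b_0$ be a fixed number such that $B_{b_0} \supset \mathbb{R}^3 \setminus \Omega$. Suppose that $b > b_0$ and $\Omega_b = \Omega \cap B_b$. Then, for $\varphi \in C_0^\infty(\Omega_b)$ such that $\int_{\Omega_b}\varphi(x)\,dx = 1$ and $\mathbb{U} = (\rho, \mathbf{v}, \theta) \in \mathbb{W}_{p,b}^{1,0}$ we have the following representation formula:
$$e^{-tA}\mathbb{U} = T_1(b,\varphi,t)\mathbb{U} + T_2(b,\varphi,t)\mathbb{U},$$
$$T_1(b,\varphi,t)\mathbb{U} = e^{-tA}\{\mathbb{U} - (N_{\Omega_b}\mathbb{U})\cdot\varphi\,\mathbb{I}_1\},$$
$$T_2(b,\varphi,t)\mathbb{U} = (N_{\Omega_b}\mathbb{U})\Big\{\varphi\cdot\mathbb{I}_1 - \int_0^te^{-sA}\,{}^T(0, \nabla\varphi, 0)\,ds\Big\},$$
where $\mathbb{I}_1 = {}^T(1, 0, 0, 0, 0)$ and $N_D\mathbb{U} = \int_D\rho(x)\,dx$. Moreover, the following estimates are valid for $M \ge 0$ integer, $k = 0, 1$, $\mathbb{U} \in \mathbb{W}_{p,b}^{k+1,k}$ and $t \ge 1$:
$$\|\partial_t^MT_1(b,\varphi,t)\mathbb{U}\|_{\mathbb{W}_p^{k+1,k+2}(\Omega_b)} \le C(p,b,\varphi,M)\,t^{-\frac32-M}\|\mathbb{U}\|_{\mathbb{W}_p^{k+1,k}},$$
$$\|\partial_t^{M+1}T_2(b,\varphi,t)\mathbb{U}\|_{\mathbb{W}_p^{k+1,k+2}(\Omega_b)} \le C(p,b,\varphi,M)\,t^{-\frac32-M}\|(I - P)\mathbb{U}\|_{1,\Omega_b}.$$

3. $L_q$-$L_p$ Estimates of Solutions to the Linearized Equations in $\mathbb{R}^3$

In this section, we shall show the $L_q$-$L_p$ estimates of solutions to the initial value problem for the operator $A$ defined by (1.7) in $\mathbb{R}^3$, which were already obtained by Matsumura and Nishida [22] and Ponce [29]. Here, we shall give a slightly sharper estimate than that obtained in [22] and [29]. The equation considered here is the following:
$$\mathbb{U}_t + A\mathbb{U} = 0\quad\text{in } [0,\infty)\times\mathbb{R}^3,\qquad \mathbb{U}(0) = \mathbb{F}\quad\text{in } \mathbb{R}^3,\tag{3.1}$$
where $\mathbb{U} = (\rho, \mathbf{v}, \theta)$ and $\mathbb{F} = (f_1, \mathbf{f}_2, f_3)$. Then, by taking the Fourier transform of (3.1) with respect to the $x$-variable and solving the ordinary differential equation with respect to $t$, we have
$$\mathbb{U}(t) = E(t)\mathbb{F} = \mathcal{F}^{-1}\big(e^{-t\hat A(\xi)}\hat{\mathbb{F}}(\xi)\big),\tag{3.2}$$
where $\mathcal{F}(f) = \hat f$ stands for the Fourier transform of $f$, $\mathcal{F}^{-1}$ denotes the Fourier inverse transform and $\hat A(\xi)$ is the $5\times5$ symmetric matrix of the form:
$$\hat A(\xi) = \begin{pmatrix} 0 & i\gamma\xi_k & 0\\ i\gamma\xi_j & \delta_{jk}\alpha|\xi|^2 + \beta\xi_j\xi_k & i\omega\xi_j\\ 0 & i\omega\xi_k & \kappa|\xi|^2 \end{pmatrix}\quad\text{for } \xi = {}^T(\xi_1, \xi_2, \xi_3),$$
where $i = \sqrt{-1}$ and $\delta_{jk} = 0$ when $k \ne j$ and $= 1$ when $k = j$. Then, we have the following theorem concerning the $L_q$-$L_p$ estimates of $E(t)\mathbb{F}$. For notational simplicity, we use the following abbreviation concerning the norm only in this section:
$$\|\cdot\|_{p,\mathbb{R}^3} = |\cdot|_p,\quad \|\cdot\|_{m,p,\mathbb{R}^3} = |\cdot|_{m,p},\quad \|\cdot\|_{\mathbb{W}_p^{k,m}(\mathbb{R}^3)} = |\cdot|_{\mathbb{W}_p^{k,m}}.$$

Theorem 3.1. Let $E(t)$ be the solution operator of (3.1) defined by (3.2). Then, we have the decomposition: $E(t)\mathbb{F} = E_0(t)\mathbb{F} + E_\infty(t)\mathbb{F}$, where $E_0(t)$ and $E_\infty(t)$ have the following properties: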


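(1) For all integers $\ell, m \ge 0$,
\[ |\partial_t^m\partial_x^\ell E_0(t)F|_p \le C(m,\ell,p,q)\, t^{-\frac32\left(\frac1q-\frac1p\right)-\frac{m+\ell}{2}}\,|F|_q, \qquad \forall t \ge 1, \]
where $1 \le q \le 2 \le p \le \infty$, and
\[ |\partial_t^m\partial_x^\ell E_0(t)F|_p \le C(m,\ell,p,q)\,|F|_q, \qquad 0 < \forall t \le 2, \]
where $1 \le q \le p \le \infty$ and $(p,q) \ne (1,1), (\infty,\infty)$.

(2) Set $(\ell)^+ = \ell$ if $\ell \ge 0$ and $(\ell)^+ = 0$ if $\ell < 0$. For any $p$, $1 < p < \infty$, there exists a $c > 0$ such that for all integers $\ell, m, n \ge 0$,
\[ |\partial_t^m\partial_x^\ell (I-P)E_\infty(t)F|_p \le C(m,\ell,p,n)\,e^{-ct}\Big[t^{-\frac n2}|F|_{(2m+\ell-n-1)^+,p} + |F|_{W_p^{\ell,(\ell-1)^+}}\Big], \qquad \forall t > 0; \]
\[ |\partial_t^m\partial_x^\ell PE_\infty(t)F|_p \le C(m,\ell,p,n)\,e^{-ct}\Big[t^{-\frac n2}|F|_{W_p^{(2m+\ell-n-1)^+,(2m+\ell-n)^+}} + |F|_{W_p^{(\ell-1)^+,(\ell-2)^+}}\Big], \qquad \forall t > 0. \]

(3) Let $1 < p < \infty$ and let $[3/p]$ denote the integer part of $3/p$. Let $c$ be the same as in (2). Then, we have
\[ |\partial_t^m\partial_x^\ell (I-P)E_\infty(t)F|_\infty \le C(m,\ell,p,n)\,t^{-\left(\frac n2+\frac{3}{2p}\right)}e^{-ct}|F|_{(2m+\ell-n-1)^+,p} + C(m,\ell,p)\,e^{-ct}\Big[|F|_{W_p^{(\ell+[3/p]-1)^+,\,\ell+[3/p]}} + |(I-P)F|_{\ell,\infty}\Big], \qquad \forall t > 0, \]
\[ |\partial_t^m\partial_x^\ell PE_\infty(t)F|_\infty \le C(m,\ell,p,n)\,e^{-ct}\Big[t^{-\left(\frac n2+\frac{3}{2p}\right)}|F|_{W_p^{(2m+\ell-n-1)^+,(2m+\ell-n)^+}} + |F|_{W_p^{\ell+[3/p],\,(\ell+[3/p]-1)^+}}\Big], \qquad \forall t > 0. \]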

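To prove Theorem 3.1, first we shall consider the following stationary problem in $\mathbb{R}^3$ with a complex parameter $\lambda$: $(\lambda + A)U = F$ in $\mathbb{R}^3$, i.e.,
\[ \begin{cases} \lambda\rho + \gamma\,\mathrm{div}\,v = f_1, \\ \lambda v - \alpha\Delta v - \beta\nabla(\mathrm{div}\,v) + \gamma\nabla\rho + \omega\nabla\theta = f_2, \\ \lambda\theta - \kappa\Delta\theta + \omega\,\mathrm{div}\,v = f_3 \end{cases} \quad\text{in } \mathbb{R}^3. \]
Then by taking Fourier transform we obtain $U = \mathcal{F}^{-1}\{[\lambda + \hat A(\xi)]^{-1}\hat F\}$, where
\[ [\lambda + \hat A(\xi)]^{-1} = \{\det[\lambda + \hat A(\xi)]\}^{-1}\tilde A(\lambda;\xi), \qquad \det[\lambda + \hat A(\xi)] = (\lambda + \alpha|\xi|^2)^2 F(\lambda;|\xi|), \]
\[ F(\lambda;|\xi|) = \lambda^3 + (\alpha+\beta+\kappa)|\xi|^2\lambda^2 + [(\alpha+\beta)\kappa|\xi|^2 + \gamma^2 + \omega^2]|\xi|^2\lambda + \gamma^2\kappa|\xi|^4, \tag{3.3} \]
and $\tilde A(\lambda;\xi) = (\tilde a_{kj}(\lambda;\xi))$ is the $5\times5$ matrix and the components are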


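\[ \begin{aligned}
\tilde a_{1,1} &= (\lambda+\alpha|\xi|^2)^2\{\lambda^2 + (\alpha+\beta+\kappa)|\xi|^2\lambda + [\omega^2 + (\alpha+\beta)\kappa|\xi|^2]|\xi|^2\}, \\
\tilde a_{1,5} &= \tilde a_{5,1} = -\gamma\omega(\lambda+\alpha|\xi|^2)^2|\xi|^2, \\
\tilde a_{1,j} &= \tilde a_{j,1} = -i\gamma(\lambda+\alpha|\xi|^2)^2(\lambda+\kappa|\xi|^2)\,\xi_{j-1}, \\
\tilde a_{5,j} &= \tilde a_{j,5} = -i\omega\lambda(\lambda+\alpha|\xi|^2)^2\,\xi_{j-1}, \\
\tilde a_{5,5} &= (\lambda+\alpha|\xi|^2)^2\{\lambda^2 + (\alpha+\beta)|\xi|^2\lambda + \gamma^2|\xi|^2\}, \\
\tilde a_{k,j} &= (\lambda+\alpha|\xi|^2)\{\lambda(\lambda+\alpha|\xi|^2)(\lambda+\kappa|\xi|^2)\delta_{kj} + (\delta_{kj}|\xi|^2 - \xi_{k-1}\xi_{j-1})(\beta\lambda^2 + [\beta\kappa|\xi|^2 + \omega^2 + \gamma^2]\lambda + \gamma^2\kappa|\xi|^2)\},
\end{aligned} \tag{3.4} \]
where $k, j = 2, 3, 4$. The following lemma was given by Matsumura and Nishida [24] (cf. Ponce [29]).

Lemma 3.2. (1) Let $\{\lambda_j(\xi)\}_{j=1}^5$ be the roots of $\det[\lambda I + \hat A(\xi)] = 0$, where $\lambda_4(\xi) = \lambda_5(\xi) = -\alpha|\xi|^2$. Then, for $\lambda_j(\xi)$, $j = 1, 2, 3$, we have the following assertions: There exists a positive constant $r_1$ such that $\lambda_j(\xi)$ has a Taylor series expansion for $|\xi| < r_1$ as follows:
\[ \lambda_1(\xi) = \overline{\lambda_2(\xi)} = i(\gamma^2+\omega^2)^{1/2}|\xi| - \frac{(\gamma^2+\omega^2)(\alpha+\beta) + \omega^2\kappa}{2(\gamma^2+\omega^2)}|\xi|^2 + O(|\xi|^3), \]
\[ \lambda_3(\xi) = \frac{-\gamma^2\kappa}{\gamma^2+\omega^2}|\xi|^2 + \frac{\gamma^2\omega^2\kappa^2\{(\gamma^2+\omega^2)(\alpha+\beta) - \gamma^2\kappa\}}{(\gamma^2+\omega^2)^4}|\xi|^4 + O(|\xi|^6), \]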

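where $\lambda_1(\xi)$ and $\lambda_2(\xi)$ are complex numbers and $\lambda_3(\xi)$ is a real number. Similarly, there exists a positive constant $r_2$ $(> r_1)$ such that $\lambda_j(\xi)$ has a Laurent series expansion for $|\xi| > r_2$ as follows: If $\alpha + \beta \ne \kappa$, then
\[ \lambda_1(\xi) = -(\alpha+\beta)|\xi|^2 - \frac{\gamma^2\kappa - (\gamma^2+\omega^2)(\alpha+\beta)}{(\alpha+\beta)(\alpha+\beta-\kappa)} + O(|\xi|^{-2}), \]
\[ \lambda_2(\xi) = -\kappa|\xi|^2 + \frac{\omega^2}{\kappa-\alpha-\beta} + O(|\xi|^{-4}), \]
\[ \lambda_3(\xi) = -\frac{\gamma^2}{\alpha+\beta} + \frac{\gamma^2(\omega^2(\alpha+\beta) - \kappa\gamma^2)}{(\alpha+\beta)^3\kappa}|\xi|^{-2} + O(|\xi|^{-4}), \]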

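where $\lambda_j(\xi)$ are real numbers. If $\alpha + \beta = \kappa$, then
\[ \lambda_1(\xi) = \overline{\lambda_2(\xi)} = -\kappa|\xi|^2 + i\omega|\xi| + O(1), \]
\[ \lambda_3(\xi) = -\frac{\gamma^2}{\kappa} + \gamma^2(\omega^2-\gamma^2)\kappa^{-3}|\xi|^{-2} + O(|\xi|^{-4}), \]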

where $\lambda_1(\xi)$ and $\lambda_2(\xi)$ are complex numbers and $\lambda_3(\xi)$ is a real number.

(2) The matrix exponential has the spectral resolution
\[ e^{-t\hat A(\xi)} = \sum_{j=1}^4 e^{t\lambda_j(\xi)}P_j(\xi) \]
for all $|\xi| > 0$ except for at most four points of $|\xi| > 0$, where
\[ |\xi^\alpha\partial_\xi^\alpha P_j(\xi)| \le C \quad\text{for } \forall\xi \in \mathbb{R}^3\setminus\{0\} \text{ and } |\alpha| \le 2. \]

(3) For any $0 < R_1 < R_2 < \infty$, we have
\[ |e^{-t\hat A(\xi)}| \le C_1(R_1,R_2)(1+t)^3 e^{-C_2(R_1,R_2)t} \quad\text{for } R_1 \le |\xi| \le R_2. \]
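The expansions in Lemma 3.2 (1) can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: it assumes sample coefficient values (hypothetical, chosen only so that $\alpha+\beta \ne \kappa$), refines each truncated expansion by Newton iteration on the cubic $F(\lambda;|\xi|)$ of (3.3), and reports how far the refined root moved from the truncated expansion. The helper names (`cubic`, `refine`, `expansion_errors`) are illustrative.

```python
import math

# Hypothetical sample coefficients (assumptions for illustration only);
# they are positive and satisfy alpha + beta != kappa.
ALPHA, BETA, GAMMA, OMEGA, KAPPA = 1.0, 0.5, 1.0, 0.8, 0.7
A = ALPHA + BETA            # alpha + beta
S = GAMMA**2 + OMEGA**2     # gamma^2 + omega^2


def cubic(lam, r):
    """F(lambda; |xi|) from (3.3), with r = |xi|."""
    return (lam**3 + (A + KAPPA) * r**2 * lam**2
            + (A * KAPPA * r**2 + S) * r**2 * lam + GAMMA**2 * KAPPA * r**4)


def cubic_prime(lam, r):
    """Derivative of F with respect to lambda."""
    return 3 * lam**2 + 2 * (A + KAPPA) * r**2 * lam + (A * KAPPA * r**2 + S) * r**2


def refine(guess, r, steps=50):
    """Newton iteration on F, started from a truncated expansion."""
    lam = complex(guess)
    for _ in range(steps):
        lam -= cubic(lam, r) / cubic_prime(lam, r)
    return lam


def low_frequency(r):
    """Truncated Taylor expansions of lambda_1 and lambda_3 for small r."""
    lam1 = 1j * math.sqrt(S) * r - (S * A + OMEGA**2 * KAPPA) / (2 * S) * r**2
    lam3 = (-GAMMA**2 * KAPPA / S * r**2
            + GAMMA**2 * OMEGA**2 * KAPPA**2 * (S * A - GAMMA**2 * KAPPA) / S**4 * r**4)
    return lam1, lam3


def high_frequency(r):
    """Truncated Laurent expansions of lambda_1, lambda_2, lambda_3 for large r."""
    lam1 = -A * r**2 - (GAMMA**2 * KAPPA - S * A) / (A * (A - KAPPA))
    lam2 = -KAPPA * r**2 + OMEGA**2 / (KAPPA - A)
    lam3 = (-GAMMA**2 / A
            + GAMMA**2 * (OMEGA**2 * A - KAPPA * GAMMA**2) / (A**3 * KAPPA) / r**2)
    return lam1, lam2, lam3


def expansion_errors(r_low=0.05, r_high=20.0):
    """Distance between each truncated expansion and the nearby exact root."""
    errors = {}
    for name, guess in zip(("low_lam1", "low_lam3"), low_frequency(r_low)):
        errors[name] = abs(refine(guess, r_low) - guess)
    for name, guess in zip(("high_lam1", "high_lam2", "high_lam3"),
                           high_frequency(r_high)):
        errors[name] = abs(refine(guess, r_high) - guess)
    return errors


if __name__ == "__main__":
    for key, value in expansion_errors().items():
        print(key, value)
```

With these sample values the low-frequency discrepancies should be of the size of the omitted terms ($|\xi|^3$ for $\lambda_1$, $|\xi|^6$ for $\lambda_3$), and the high-frequency ones should decay as $|\xi|$ grows; the check says nothing about other coefficient choices.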

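If we put $e^{-t\hat A(\xi)}\hat F(\xi) = (\hat\rho(t,\xi), \hat v(t,\xi), \hat\theta(t,\xi))$, where $\hat v(t,\xi) = {}^T(\hat v_1(t,\xi), \hat v_2(t,\xi), \hat v_3(t,\xi))$, then by (3.3) and (3.4) we can also write
\[ \begin{aligned}
\hat\rho(t,\xi) ={}& \sum_{k=1}^3 e^{t\lambda_k(\xi)}\,\frac{\lambda_k^2(\xi) + (\alpha+\beta+\kappa)|\xi|^2\lambda_k(\xi) + (\omega^2 + (\alpha+\beta)\kappa|\xi|^2)|\xi|^2}{(\lambda_k(\xi)-\lambda_{l_1}(\xi))(\lambda_k(\xi)-\lambda_{l_2}(\xi))}\,\hat f_1(\xi) \\
&+ \sum_{k=1}^3\sum_{j=1}^3 e^{t\lambda_k(\xi)}\,\frac{-i\gamma(\lambda_k(\xi)+\kappa|\xi|^2)\xi_j}{(\lambda_k(\xi)-\lambda_{l_1}(\xi))(\lambda_k(\xi)-\lambda_{l_2}(\xi))}\,\hat f_{j+1}(\xi) \\
&+ \sum_{k=1}^3 e^{t\lambda_k(\xi)}\,\frac{-\gamma\omega|\xi|^2}{(\lambda_k(\xi)-\lambda_{l_1}(\xi))(\lambda_k(\xi)-\lambda_{l_2}(\xi))}\,\hat f_5(\xi), \\
\hat v_j(t,\xi) ={}& \sum_{k=1}^3 e^{t\lambda_k(\xi)}\,\frac{-i\gamma(\lambda_k(\xi)+\kappa|\xi|^2)\xi_j}{(\lambda_k(\xi)-\lambda_{l_1}(\xi))(\lambda_k(\xi)-\lambda_{l_2}(\xi))}\,\hat f_1(\xi) \\
&+ \sum_{k=1}^4\sum_{\ell=1}^3 e^{t\lambda_k(\xi)}\,\frac{g^k_{j,\ell}(\xi)}{(\lambda_k(\xi)-\lambda_{m_1}(\xi))(\lambda_k(\xi)-\lambda_{m_2}(\xi))(\lambda_k(\xi)-\lambda_{m_3}(\xi))}\,\hat f_{\ell+1}(\xi) \\
&+ \sum_{k=1}^3 e^{t\lambda_k(\xi)}\,\frac{-i\omega\lambda_k(\xi)\xi_j}{(\lambda_k(\xi)-\lambda_{l_1}(\xi))(\lambda_k(\xi)-\lambda_{l_2}(\xi))}\,\hat f_5(\xi), \qquad j = 1, 2, 3, \\
\hat\theta(t,\xi) ={}& \sum_{k=1}^3 e^{t\lambda_k(\xi)}\,\frac{-\gamma\omega|\xi|^2}{(\lambda_k(\xi)-\lambda_{l_1}(\xi))(\lambda_k(\xi)-\lambda_{l_2}(\xi))}\,\hat f_1(\xi) \\
&+ \sum_{k=1}^3\sum_{j=1}^3 e^{t\lambda_k(\xi)}\,\frac{-i\omega\lambda_k(\xi)\xi_j}{(\lambda_k(\xi)-\lambda_{l_1}(\xi))(\lambda_k(\xi)-\lambda_{l_2}(\xi))}\,\hat f_{j+1}(\xi) \\
&+ \sum_{k=1}^3 e^{t\lambda_k(\xi)}\,\frac{\lambda_k^2(\xi) + (\alpha+\beta)|\xi|^2\lambda_k(\xi) + \gamma^2|\xi|^2}{(\lambda_k(\xi)-\lambda_{l_1}(\xi))(\lambda_k(\xi)-\lambda_{l_2}(\xi))}\,\hat f_5(\xi),
\end{aligned} \tag{3.5} \]
where $l_1, l_2 = 1, 2, 3$, $l_1 \ne l_2$, $l_1 \ne k$, $l_2 \ne k$,
\[ g^k_{j,\ell}(\xi) = \lambda_k(\xi)(\lambda_k(\xi)-\lambda_4(\xi))(\lambda_k(\xi)+\kappa|\xi|^2)\delta_{j\ell} + (\delta_{j\ell}|\xi|^2 - \xi_j\xi_\ell)(\beta\lambda_k^2(\xi) + (\beta\kappa|\xi|^2 + \omega^2 + \gamma^2)\lambda_k(\xi) + \gamma^2\kappa|\xi|^2), \]
and $m_1, m_2, m_3 = 1, 2, 3, 4$, $m_j \ne m_\ell$ $(j \ne \ell)$, $m_j \ne k$ $(j = 1, 2, 3)$.

Proof of Theorem 3.1. We shall use the notation defined in Lemma 3.2, below. Let $\varphi_0(\xi)$ be a function in $C_0^\infty(\mathbb{R}^3)$ such that $\varphi_0(\xi) = 1$ for $|\xi| \le r_1/2$ and $\varphi_0(\xi) = 0$ for $|\xi| \ge r_1$ and set
\[ \Psi_0(t)F(x) = \sum_{j=1}^4 \mathcal{F}^{-1}\big[e^{t\lambda_j(\xi)}\varphi_0(\xi)P_j(\xi)\hat F(\xi)\big](x). \]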

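In view of Lemma 3.2, in order to estimate $\Psi_0(t)F$, we consider the function:
\[ G_\ell(t)f(x) = \mathcal{F}^{-1}\big[e^{\lambda(\xi)t}g_\ell(\xi)\varphi_0(\xi)\hat f(\xi)\big](x), \]
where $\lambda(\xi)$ and $g_\ell(\xi)$ satisfy the condition:
\[ -\beta_0|\xi|^2 \le \mathrm{Re}\,\lambda(\xi) \le -\beta_1|\xi|^2, \qquad |\xi^\alpha\partial_\xi^\alpha\lambda(\xi)| \le C(\alpha)|\xi| \quad\text{and}\quad |\xi^\alpha\partial_\xi^\alpha g_\ell(\xi)| \le C(\alpha)|\xi|^\ell \]
for any $\xi \in \mathbb{R}^3\setminus\{0\}$ with $|\xi| \le r_1$. First, we consider the case when $t \ge 1$ and $1 \le q \le 2 \le p \le \infty$. Let us write
\[ G_\ell(t)f(x) = \mathcal{F}^{-1}\big[e^{-\beta_1|\xi|^2t/4}\big] * \mathcal{F}^{-1}\big[e^{\mu(\xi)t}g_\ell(\xi)\varphi_0(\xi)\big] * \mathcal{F}^{-1}\big[e^{-\beta_1|\xi|^2t/4}\big] * f(x), \]
where $*$ means the convolution and $\mu(\xi) = \lambda(\xi) + \beta_1|\xi|^2/2$. Note that $\mathrm{Re}\,\mu(\xi) \le -\beta_1|\xi|^2/2$. Since
\[ \mathcal{F}^{-1}\big[e^{-\beta_1|\xi|^2t/4}\big] = (\beta_1\pi t)^{-3/2}e^{-|x|^2/\beta_1t}, \]
by Young's inequality
\[ \big|\mathcal{F}^{-1}\big[e^{-\beta_1|\xi|^2t/4}\big] * h\big|_p \le Ct^{-\frac32\left(\frac1q-\frac1p\right)}|h|_q, \tag{3.6} \]
where $1 \le q \le p \le \infty$. By Parseval's formula
\[ \begin{aligned}
\big|\mathcal{F}^{-1}\big[e^{\mu(\xi)t}g_\ell(\xi)\varphi_0(\xi)\big] * h\big|_2 &\le C\Big(\int_{\mathbb{R}^3} e^{-\beta_1|\xi|^2t}|\xi|^{2\ell}|\hat h(\xi)|^2\,d\xi\Big)^{\frac12} \\
&\le Ct^{-\frac\ell2}\Big(\max_{\xi\in\mathbb{R}^3} e^{-\beta_1|\xi|^2}|\xi|^{2\ell}\Big)^{\frac12}\Big(\int_{\mathbb{R}^3}|\hat h(\xi)|^2\,d\xi\Big)^{\frac12} \le Ct^{-\frac\ell2}|h|_2.
\end{aligned} \]
Combining these estimations, we have
\[ |G_\ell(t)f|_p \le Ct^{-\frac32\left(\frac12-\frac1p\right)}\,t^{-\frac\ell2}\,t^{-\frac32\left(\frac1q-\frac12\right)}|f|_q = Ct^{-\frac32\left(\frac1q-\frac1p\right)-\frac\ell2}|f|_q, \]
for $t \ge 1$ and $1 \le q \le 2 \le p \le \infty$. Next, we consider the case when $0 < t \le 1$. Since
\[ \big|\xi^\beta\partial_\xi^\beta\big(e^{\lambda(\xi)t}g_\ell(\xi)\xi^\alpha\varphi_0(\xi)\big)\big| \le C(\alpha,\beta) \]
for any $\alpha$ and $\beta$ when $0 < t \le 1$, by the Fourier multiplier theorem we have
\[ |\partial_x^\alpha G_\ell(t)f|_q \le C(\alpha,p)|f|_q, \qquad 0 < \forall t \le 1,\ \forall\alpha, \]
when $1 < q < \infty$, which together with Sobolev's imbedding theorem implies that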


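\[ |G_\ell(t)f|_p \le C(p,q)|f|_q, \qquad 0 < \forall t \le 1, \]
when $1 < q \le p \le \infty$ and $1 < q < \infty$. If we choose $\psi(\xi) \in C_0^\infty(\mathbb{R}^3)$ so that $\psi(\xi) = 1$ on $\mathrm{supp}\,\varphi_0(\xi)$, then we can write $G_\ell(t)f = G_\ell(t)\mathcal{F}^{-1}[\psi] * f$, and then by Young's inequality we have
\[ |G_\ell(t)f|_p \le |G_\ell(t)\mathcal{F}^{-1}[\psi]|_p\,|f|_1 \le C|f|_1 \]
when $1 < p < \infty$. Finally, we have
\[ |G_\ell(t)f|_\infty \le \int_{\mathbb{R}^3}\big|e^{\lambda(\xi)t}g_\ell(\xi)\varphi_0(\xi)\hat f(\xi)\big|\,d\xi \le C|f|_1, \]
and therefore combination of these estimations implies that
\[ |G_\ell(t)f|_p \le C|f|_q, \qquad 0 < t \le 1, \]
where $1 \le q \le p \le \infty$ and $(p,q) \ne (1,1), (\infty,\infty)$. Applying these estimates with $g_\ell(\xi)\hat f(\xi) = \xi^\alpha\lambda(\xi)^mP_j(\xi)\hat F(\xi)$ with $|\alpha| = \ell$ and $\lambda(\xi) = \lambda_j(\xi)$ and using Lemma 3.2 (2), immediately we have
\[ |\partial_t^m\partial_x^\ell\Psi_0(t)F|_p \le Ct^{-\frac32\left(\frac1q-\frac1p\right)-\frac{m+\ell}{2}}|F|_q \quad\text{for } t \ge 1,\ 1 \le q \le 2 \le p \le \infty, \]
\[ |\partial_t^m\partial_x^\ell\Psi_0(t)F|_p \le C|F|_q \quad\text{for } 0 < t \le 1,\ 1 \le q \le p \le \infty,\ (p,q) \ne (1,1), (\infty,\infty). \]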
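The decay rate in (3.6) comes from the $L_r$ norm of the Gaussian kernel: with $1 + 1/p = 1/r + 1/q$, Young's inequality bounds the convolution by $|K_t|_r|h|_q$, and $|K_t|_r$ scales like $t^{-\frac32(1-\frac1r)} = t^{-\frac32(\frac1q-\frac1p)}$. The sketch below is an illustrative check, not taken from the paper; the function names and the sample value $\beta_1 = 1$ are assumptions. It compares a radial quadrature of $|K_t|_r$ for $K_t(x) = (\beta_1\pi t)^{-3/2}e^{-|x|^2/\beta_1t}$ with the closed form $(\beta_1\pi t)^{-\frac32(1-\frac1r)}r^{-\frac{3}{2r}}$.

```python
import math


def gaussian_kernel_norm(t, r, beta1=1.0, n=20000):
    """Midpoint-rule value of the L^r(R^3) norm of
    K_t(x) = (beta1*pi*t)^(-3/2) * exp(-|x|^2 / (beta1*t)),
    computed in radial coordinates."""
    rho_max = 12.0 * math.sqrt(beta1 * t)   # the kernel is negligible beyond this radius
    h = rho_max / n
    amplitude = (beta1 * math.pi * t) ** -1.5
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * h
        value = amplitude * math.exp(-rho * rho / (beta1 * t))
        total += 4.0 * math.pi * rho * rho * value**r * h
    return total ** (1.0 / r)


def gaussian_kernel_norm_exact(t, r, beta1=1.0):
    """Closed form of the same norm, obtained from the Gaussian integral."""
    return (beta1 * math.pi * t) ** (-1.5 * (1.0 - 1.0 / r)) * r ** (-1.5 / r)


if __name__ == "__main__":
    for t in (1.0, 2.0, 8.0):
        print(t, gaussian_kernel_norm(t, 1.5), gaussian_kernel_norm_exact(t, 1.5))
```

For $r = 1$ the norm is $1$ for every $t$, which matches the case $q = p$ of (3.6) where no decay is gained.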

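Now, we shall estimate the term corresponding to the large $|\xi|$. Put
\[ \Psi_\infty(t)F = \sum_{j=1}^4 \mathcal{F}^{-1}\big[e^{\lambda_j(\xi)t}\varphi_\infty(\xi)P_j(\xi)\hat F(\xi)\big](x), \]
where $\varphi_\infty(\xi)$ is a function in $C^\infty(\mathbb{R}^3)$ such that $\varphi_\infty(\xi) = 1$ for $|\xi| \ge 3r_2$ and $\varphi_\infty(\xi) = 0$ for $|\xi| \le 2r_2$. In order to estimate $\Psi_\infty(t)F$, we shall consider the function:
\[ [H^j_\ell(t)f](x) = \mathcal{F}^{-1}\big[e^{\lambda_j(\xi)t}g_\ell(\xi)\varphi_\infty(\xi)\hat f(\xi)\big](x), \qquad j = 1, 2, 3, 4, \]
where $g_\ell(\xi)$ satisfies the condition:
\[ |\xi^\alpha\partial_\xi^\alpha g_\ell(\xi)| \le C(\alpha)|\xi|^{-\ell}, \qquad \forall\xi \in \mathbb{R}^3. \]
Let us estimate $H^j_\ell(t)f$, $j = 1, 2$ and $4$. There exist $\beta_0 > \beta_1 > 0$ such that $-\beta_0|\xi|^2 \le \mathrm{Re}\,\lambda_j(\xi) \le -\beta_1|\xi|^2$ and $|\xi^\alpha\partial_\xi^\alpha\lambda_j(\xi)| \le C(\alpha)|\xi|^2$ for $|\xi| \ge r_2$. Therefore, by induction we see easily that
\[ |\xi^\alpha\partial_\xi^\alpha e^{\lambda_j(\xi)t}| \le C(\alpha)e^{-\frac{\beta_1}{2^{|\alpha|}}|\xi|^2t} \quad\text{for } |\xi| \ge r_2 \text{ and } t > 0, \]
which implies that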

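\[ \big|\xi^\alpha\partial_\xi^\alpha\big(e^{\lambda_j(\xi)t}\xi^\beta\lambda_j(\xi)^mg_\ell(\xi)\varphi_\infty(\xi)\big)\big| \le C(\alpha,\beta,m,\ell,n)\,t^{-\frac n2}e^{-\frac{\beta_1}{4^{|\alpha|}}|\xi|^2t}(1+|\xi|^2)^{\frac{(2m+|\beta|-\ell-n)^+}{2}} \]
for $|\xi| \ge r_2$ and $t > 0$. Since
\[ \big|\mathcal{F}^{-1}\big[(1+|\xi|^2)^{\frac a2}\hat f(\xi)\big]\big|_p \le C(a,p)|f|_{a,p}, \qquad 1 < p < \infty, \]
as follows from Calderón's theorem [3], by the Fourier multiplier theorem we have
\[ |\partial_t^m\partial_x^\beta H^j_\ell(t)f|_p \le C(m,\beta,n,p)\,t^{-\frac n2}e^{-ct}|f|_{(2m+|\beta|-\ell-n)^+,p}, \qquad \forall t > 0, \tag{3.7} \]
where $c = \beta_1r_2^2/16$ and $j = 1, 2$ and $4$. Next, we shall consider the case that $j = 3$. There exist constants $\beta_0 > \beta_1 > 0$ such that $-\beta_0 \le \mathrm{Re}\,\lambda_3(\xi) \le -\beta_1$ for $|\xi| \ge r_2$. Moreover, there exist a positive constant $\gamma$ and a function $\mu_3(\xi)$ such that $\lambda_3(\xi) = -\gamma + \mu_3(\xi)$ and $|\xi^\alpha\partial_\xi^\alpha\mu_3(\xi)| \le C(\alpha)|\xi|^{-2}$ for $|\xi| \ge r_2$. Since
\[ \big|\xi^\alpha\partial_\xi^\alpha\big(e^{\lambda_3(\xi)t}\xi^\beta\lambda_3(\xi)^mg_\ell(\xi)\varphi_\infty(\xi)\big)\big| \le C(\alpha,\beta,m,\ell)(1+t)^{|\alpha|}e^{-\beta_1t}(1+|\xi|^2)^{\frac{(|\beta|-\ell)^+}{2}}, \qquad \forall t > 0, \tag{3.8} \]
by the Fourier multiplier theorem and Calderón's theorem
\[ |\partial_t^m\partial_x^\beta H^3_\ell(t)f|_p \le C(m,\beta,\ell)(1+t)^3e^{-\beta_1t}|f|_{(|\beta|-\ell)^+,p} \le C(m,\beta,\ell,\beta_1)\,e^{-\frac{\beta_1}{2}t}|f|_{(|\beta|-\ell)^+,p} \tag{3.9} \]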

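for any $t > 0$. Now, using (3.7) and (3.9), we shall estimate $\Psi_\infty(t)F$. Put
\[ \sum_{j=1}^4 e^{\lambda_j(\xi)t}\varphi_\infty(\xi)P_j(\xi)\hat F(\xi) = (\hat\rho(t,\xi), \hat v(t,\xi), \hat\theta(t,\xi)). \]
In view of (3.5) and Lemma 3.2, we write
\[ \hat\rho(t,\xi) = \sum_{k=1}^3\sum_{j=1}^5 e^{\lambda_k(\xi)t}A^1_{kj}(\xi)\varphi_\infty(\xi)\hat f_j(\xi), \]
where
\[ A^1_{31}(\xi) = 1 + p^1_{31}(\xi) \quad\text{and}\quad |\xi^\alpha\partial_\xi^\alpha p^1_{31}(\xi)| \le C(\alpha)|\xi|^{-2}, \qquad |\xi^\alpha\partial_\xi^\alpha A^1_{kj}(\xi)| \le \begin{cases} C_\alpha & j = 1, \\ C_\alpha|\xi|^{-1} & j \ne 1 \end{cases} \]
for $|\xi| \ge r_2$. By (3.8) and Taylor's formula,
\[ e^{\lambda_3(\xi)t}A^1_{31}(\xi)\varphi_\infty(\xi)\hat f_1(\xi) = e^{-\gamma t}\hat f_1(\xi) + e^{-\gamma t}\big[(\varphi_\infty(\xi) - 1) + a^1_{31}(t,\xi)\big]\hat f_1(\xi), \tag{3.10} \]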


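where
\[ a^1_{31}(t,\xi) = \varphi_\infty(\xi)\Big\{\int_0^1 e^{\theta\mu_3(\xi)t}\,d\theta\,\mu_3(\xi)t + p^1_{31}(\xi) + \int_0^1 e^{\theta\mu_3(\xi)t}\,d\theta\,\mu_3(\xi)p^1_{31}(\xi)t\Big\}. \]
Since
\[ |\xi^\alpha\partial_\xi^\alpha a^1_{31}(t,\xi)| \le C(\alpha)(1+|\xi|)^{-2}(1+t)^{|\alpha|}, \qquad \forall\xi \in \mathbb{R}^3, \]
and since $1 - \varphi_\infty(\xi)$ is a rapidly decreasing function, by Fourier multiplier theorem, Calderón's theorem and Sobolev's inequality
\[ \big|\partial_t^m\partial_x^\ell\mathcal{F}^{-1}\big[e^{\lambda_3(\xi)t}A^1_{31}(\xi)\varphi_\infty(\xi)\hat f_1(\xi)\big]\big|_p \le C(m,\ell)e^{-\gamma t}(1+t)^m|f_1|_{\ell,p}, \]
\[ \big|\partial_t^m\partial_x^\ell\mathcal{F}^{-1}\big[e^{\lambda_3(\xi)t}A^1_{31}(\xi)\varphi_\infty(\xi)\hat f_1(\xi)\big]\big|_\infty \le C(m,\ell)e^{-\gamma t}(1+t)^m\big[|f_1|_{\ell,\infty} + |f_1|_{(\ell+[3/p]-1)^+,p}\big]. \]
Applying (3.7) and noting (3.10), we have
\[ \big|\partial_t^m\partial_x^\ell\mathcal{F}^{-1}\big[e^{\lambda_k(\xi)t}A^1_{kj}(\xi)\varphi_\infty(\xi)\hat f_j(\xi)\big]\big|_p \le C(m,\ell,n,p)\,t^{-\frac n2}e^{-ct}|f_j|_{(2m+\ell-1-n)^+,p} \]
for $t > 0$, $1 < p < \infty$, $k = 1, 2$ and $j = 1, 2, 3, 4$. By (3.9) and (3.10) we have also
\[ \big|\partial_t^m\partial_x^\ell\mathcal{F}^{-1}\big[e^{\lambda_3(\xi)t}A^1_{3j}(\xi)\varphi_\infty(\xi)\hat f_j(\xi)\big]\big|_p \le C(m,\ell,p)\,e^{-ct}|f_j|_{(\ell-1)^+,p} \tag{3.11} \]
for $t > 0$, $1 < p < \infty$ and $j = 2, 3, 4$ with a suitable constant $c > 0$. Let us estimate the $L_\infty$-norm. Put $\lambda_k(\xi) = -\delta|\xi|^2 + \sigma_k(\xi)$, $k = 1, 2, 4$, where
\[ \delta = \begin{cases} \min\big(\frac\alpha2, \frac{\alpha+\beta}{2}, \frac\kappa2\big) & \text{if } \alpha+\beta \ne \kappa, \\ \min\big(\frac\alpha2, \frac\kappa2\big) & \text{if } \alpha+\beta = \kappa, \end{cases} \]
\[ \sigma_1(\xi) = -(\alpha+\beta-\delta)|\xi|^2 + O(|\xi|), \quad \sigma_2(\xi) = -(\kappa-\delta)|\xi|^2 + O(|\xi|) \quad\text{if } \alpha+\beta \ne \kappa, \]
\[ \sigma_1(\xi) = \overline{\sigma_2(\xi)} = -(\kappa-\delta)|\xi|^2 + O(|\xi|) \quad\text{if } \alpha+\beta = \kappa, \qquad \sigma_4(\xi) = -(\alpha-\delta)|\xi|^2 \]
for $|\xi| \ge r_2$. When $k = 1, 2$ and $j = 1, 2, 3, 4$, we have
\[ \partial_t^m\partial_x^\alpha\mathcal{F}^{-1}\big[e^{t\lambda_k(\xi)}A^1_{kj}(\xi)\varphi_\infty(\xi)\hat f_j(\xi)\big] = \mathcal{F}^{-1}\big[e^{-\delta|\xi|^2t}\big] * \mathcal{F}^{-1}\big[\lambda_k(\xi)^m\xi^\alpha e^{t\sigma_k(\xi)}A^1_{kj}(\xi)\varphi_\infty(\xi)\hat f_j(\xi)\big], \]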


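and therefore by (3.6) and (3.7),
\[ \begin{aligned}
\big|\partial_t^m\partial_x^\alpha\mathcal{F}^{-1}\big[e^{t\lambda_k(\xi)}A^1_{kj}(\xi)\varphi_\infty(\xi)\hat f_j(\xi)\big]\big|_\infty &\le C(p)\,t^{-\frac{3}{2p}}\big|\mathcal{F}^{-1}\big[\lambda_k(\xi)^m\xi^\alpha e^{t\sigma_k(\xi)}A^1_{kj}(\xi)\varphi_\infty(\xi)\hat f_j(\xi)\big]\big|_p \\
&\le C(p,\alpha,m,n)\,t^{-\left(\frac{3}{2p}+\frac n2\right)}e^{-ct}|f_j|_{(2m+|\alpha|-n-1)^+,p},
\end{aligned} \]
for $t > 0$ and $1 < p < \infty$. By Sobolev's inequality, (3.9) and (3.10) we have also
\[ \begin{aligned}
\big|\partial_t^m\partial_x^\ell\mathcal{F}^{-1}\big[e^{t\lambda_3(\xi)}A^1_{3j}(\xi)\varphi_\infty(\xi)\hat f_j(\xi)\big]\big|_\infty &\le C(p)\big|\partial_t^m\partial_x^\ell\mathcal{F}^{-1}\big[e^{t\lambda_3(\xi)}A^1_{3j}(\xi)\varphi_\infty(\xi)\hat f_j(\xi)\big]\big|_{[3/p]+1,p} \\
&\le C(m,\alpha,p)\,e^{-ct}|f_j|_{\ell+[3/p],p}
\end{aligned} \]
for $t > 0$, $1 < p < \infty$ and $j = 2, 3, 4$. In the same manner, we have
\[ \begin{aligned}
\big|\partial_t^m\partial_x^\ell\mathcal{F}^{-1}\big[e^{t\lambda_k(\xi)}A^1_{k5}(\xi)\varphi_\infty(\xi)\hat f_5(\xi)\big]\big|_p &\le C(m,\ell,n,p)\,t^{-\frac n2}e^{-ct}|f_5|_{(2m+\ell-n-2)^+,p}, & k &= 1, 2, \\
\big|\partial_t^m\partial_x^\ell\mathcal{F}^{-1}\big[e^{t\lambda_k(\xi)}A^1_{k5}(\xi)\varphi_\infty(\xi)\hat f_5(\xi)\big]\big|_\infty &\le C(m,\ell,n,p)\,t^{-\left(\frac{3}{2p}+\frac n2\right)}e^{-ct}|f_5|_{(2m+\ell-n-2)^+,p}, & k &= 1, 2, \\
\big|\partial_t^m\partial_x^\ell\mathcal{F}^{-1}\big[e^{t\lambda_3(\xi)}A^1_{35}(\xi)\varphi_\infty(\xi)\hat f_5(\xi)\big]\big|_p &\le C(m,\ell,p)\,e^{-ct}|f_5|_{(\ell-2)^+,p}, \\
\big|\partial_t^m\partial_x^\ell\mathcal{F}^{-1}\big[e^{t\lambda_3(\xi)}A^1_{35}(\xi)\varphi_\infty(\xi)\hat f_5(\xi)\big]\big|_\infty &\le C(m,\ell,p)\,e^{-ct}|f_5|_{(\ell+[3/p]-1)^+,p},
\end{aligned} \]
for $t > 0$. Summing up, we have proved the following estimates:
\[ |\partial_t^m\partial_x^\ell(I-P)\Psi_\infty(t)F|_p \le C(m,\ell,n,p)\,e^{-ct}\Big[t^{-\frac n2}|F|_{(2m+\ell-n-1)^+,p} + |F|_{W_p^{\ell,(\ell-1)^+}}\Big], \]
\[ |\partial_t^m\partial_x^\ell(I-P)\Psi_\infty(t)F|_\infty \le C(m,\ell,n,p)\,e^{-ct}\Big[t^{-\left(\frac n2+\frac{3}{2p}\right)}|F|_{(2m+\ell-n-1)^+,p} + |F|_{W_p^{(\ell+[3/p]-1)^+,\,\ell+[3/p]}} + |(I-P)F|_{\ell,\infty}\Big], \]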

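for $t > 0$ and $1 < p < \infty$. In view of Lemma 3.2 and (3.5), we write
\[ \hat v_i(t,\xi) = \sum_{k=1}^3\sum_{j=1}^5 e^{t\lambda_k(\xi)}A^{i+1}_{kj}(\xi)\varphi_\infty(\xi)\hat f_j(\xi) + \sum_{j=2}^4 e^{t\lambda_4(\xi)}A^{i+1}_{4j}(\xi)\varphi_\infty(\xi)\hat f_j(\xi), \]
\[ \hat\theta(t,\xi) = \sum_{k=1}^3\sum_{j=1}^5 e^{\lambda_k(\xi)t}A^5_{kj}(\xi)\varphi_\infty(\xi)\hat f_j(\xi), \]
where $\hat v(t,\xi) = {}^T(\hat v_1(t,\xi), \hat v_2(t,\xi), \hat v_3(t,\xi))$,
\[ \begin{aligned}
|\xi^\alpha\partial_\xi^\alpha A^{i+1}_{kj}(\xi)| &\le C_\alpha, & k &= 1, 2, 4,\ j = 2, 3, 4,\ i = 1, 2, 3, \\
|\xi^\alpha\partial_\xi^\alpha A^{i+1}_{k5}(\xi)| &\le C_\alpha, & k &= 1, 2,\ i = 1, 2, 3, \\
|\xi^\alpha\partial_\xi^\alpha A^5_{k5}(\xi)| &\le C_\alpha, & k &= 1, 2, \\
|\xi^\alpha\partial_\xi^\alpha A^{i+1}_{k1}(\xi)| &\le C_\alpha|\xi|^{-1}, & k &= 1, 2, 3,\ i = 1, 2, 3, \\
|\xi^\alpha\partial_\xi^\alpha A^5_{kj}(\xi)| &\le C_\alpha|\xi|^{-1}, & k &= 1, 2,\ j = 1, 2, 3, 4, \\
|\xi^\alpha\partial_\xi^\alpha A^{i+1}_{3j}(\xi)| &\le C_\alpha|\xi|^{-2}, & j &= 2, 3, 4,\ i = 1, 2, 3, \\
|\xi^\alpha\partial_\xi^\alpha A^5_{3j}(\xi)| &\le C_\alpha|\xi|^{-2}, & j &= 1, 5, \\
|\xi^\alpha\partial_\xi^\alpha A^{i+1}_{35}(\xi)| &\le C_\alpha|\xi|^{-3}, & i &= 1, 2, 3, \\
|\xi^\alpha\partial_\xi^\alpha A^5_{3j}(\xi)| &\le C_\alpha|\xi|^{-3}, & j &= 2, 3, 4.
\end{aligned} \]
Employing the same argument as before, we see easily that
\[ |\partial_t^m\partial_x^\ell PE_\infty(t)F|_p \le C(m,\ell,p,n)\,e^{-ct}\Big[t^{-\frac n2}|F|_{W_p^{(2m+\ell-n-1)^+,(2m+\ell-n)^+}} + |F|_{W_p^{(\ell-1)^+,(\ell-2)^+}}\Big], \]
\[ |\partial_t^m\partial_x^\ell PE_\infty(t)F|_\infty \le C(m,\ell,p,n)\,e^{-ct}\Big[t^{-\left(\frac n2+\frac{3}{2p}\right)}|F|_{W_p^{(2m+\ell-n-1)^+,(2m+\ell-n)^+}} + |F|_{W_p^{\ell+[3/p],\,(\ell+[3/p]-1)^+}}\Big], \]
for $t > 0$ and $1 < p < \infty$. Choose $\varphi_M(\xi) \in C_0^\infty(\mathbb{R}^3)$ such that $\varphi_0(\xi) + \varphi_M(\xi) + \varphi_\infty(\xi) = 1$ for $\forall\xi \in \mathbb{R}^3$ and put
\[ E_0(t)F = \Psi_0(t)F, \qquad E_\infty(t)F = \Psi_\infty(t)F + \mathcal{F}^{-1}\big[e^{-t\hat A(\xi)}\varphi_M(\xi)\hat F(\xi)\big](x). \]
By Lemma 3.2 and the Fourier multiplier theorem,
\[ \big|\partial_t^m\partial_x^\ell\mathcal{F}^{-1}\big[e^{-t\hat A(\xi)}\varphi_M(\xi)\hat F(\xi)\big]\big|_p \le C(p,m,\ell)\,e^{-ct}|F|_p \]
for a suitable constant $c > 0$. Combining these estimations, we have Theorem 3.1.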

4. A Proof of Theorem 1.3: $L_q$-$L_p$ Decay Estimate in $\Omega$

In this section we shall prove Theorem 1.3 by using the cut-off technique based on Lemma 2.2 and Theorem 3.1. The strategy follows Shibata [31] (and also Iwashita and Shibata [16], Iwashita [14] and Kobayashi and Shibata [18]). First, to use Theorem 3.1, we construct a suitable extension of $U_0 = (\rho_0, v_0, \theta_0)$ to $\mathbb{R}^3$ in the following manner. Let $U_0'$ be the Lions' extension of $U_0$ such that
\[ U_0'|_\Omega = U_0, \qquad \|U_0'\|_{W_p^{k,\ell}(\mathbb{R}^3)} \le C(p,k,\ell)\|U_0\|_{W_p^{k,\ell}}. \]
Put $U_0' = (\rho_0', v_0', \theta_0')$. Let $\eta$ be a function in $C_0^\infty(\mathcal{O})$, $\mathcal{O} = \mathbb{R}^3\setminus\Omega$, such that $\int_{\mathcal{O}}\eta(x)\,dx = 1$. Set
\[ \rho_1(x) = \rho_0'(x) - \Big(\int_{\mathcal{O}}\rho_0'(x)\,dx\Big)\eta(x), \qquad v_1(x) = v_0'(x), \qquad \theta_1(x) = \theta_0'(x), \]
\[ U_1(x) = (\rho_1(x), v_1(x), \theta_1(x)). \]


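Then, we have
\[ \|U_1\|_{W_p^{k,\ell}(\mathbb{R}^3)} \le C\|U_0\|_{W_p^{k,\ell}}, \tag{4.1} \]
\[ \int_{\mathcal{O}}\rho_1(x)\,dx = 0. \tag{4.2} \]
In the course of the proof, we denote the constant depending essentially on $p$ and $q$ simply by $C$. Now, we set
\[ U_c(t) = (\rho_c(t), v_c(t), \theta_c(t)) = E(t)U_1 \quad\text{in } (0,\infty)\times\mathbb{R}^3, \]
\[ U(t) = (\rho(t), v(t), \theta(t)) = e^{-tA}U_0 \quad\text{in } (0,\infty)\times\Omega. \]
We shall start with the following step.

Step 1. For $t \ge 1$, we have
\[ \|U(t)\|_{1,p,\Omega_b} \le C_bt^{-\frac{3}{2q}}\big[\|U_0\|_q + \|U_0\|_{W_p^{1,0}}\big], \qquad U_0 \in W_p^{1,0}\cap L_q, \]
\[ \|(I-P)U(t)\|_{2,p,\Omega_b} \le C_bt^{-\frac{3}{2q}}\big[\|U_0\|_q + \|U_0\|_{W_p^{2,1}}\big], \qquad U_0 \in W_p^{2,1}\cap L_q, \]
where $b$ is an arbitrary constant such that $B_{b-3} \supset \mathcal{O}$ and $C_b$ is a constant depending on $b$, $p$ and $q$. Below, we also omit the subscript $b$ from the constant $C_b$. Choose $\varphi \in C_0^\infty(\mathbb{R}^3)$ such that $\varphi(x) = 1$ for $|x| \le b-1$ and $\varphi(x) = 0$ for $|x| \ge b$ and set $U_m = (\rho_m, v_m, \theta_m)$, where $\rho_m = \rho - \rho_c$, $v_m = v - (1-\varphi)v_c$ and $\theta_m = \theta - (1-\varphi)\theta_c$. Then, $U_m$ satisfies the equation:
\[ \partial_tU_m + AU_m = F_m \ \text{ in } (0,\infty)\times\Omega, \qquad PU_m|_{\partial\Omega} = 0, \qquad U_m(0) = (0, \varphi v_0, \varphi\theta_0), \]
where $F_m = (f_{m1}, f_{m2}, f_{m3})$ and
\[ \begin{aligned}
f_{m1} &= -\varphi(\rho_c)_t + \gamma(\nabla\varphi)\cdot v_c, \\
f_{m2} &= -\varphi\gamma\nabla\rho_c + \omega(\nabla\varphi)\theta_c - \alpha[2\nabla\varphi:\nabla v_c + (\Delta\varphi)v_c] - \beta[\nabla(\nabla\varphi\cdot v_c) + (\nabla\varphi)\,\mathrm{div}\,v_c], \\
f_{m3} &= -\kappa[(\Delta\varphi)\theta_c + 2(\nabla\varphi)\cdot(\nabla\theta_c)] + \omega(\nabla\varphi)\cdot v_c.
\end{aligned} \]
By Duhamel's principle (cf. Pazy [28, pp. 105–106]),
\[ U_m(t) = e^{-tA}U_m(0) - \int_0^t e^{-(t-s)A}F_m(s)\,ds. \]
Let us use the symbols defined in Lemma 2.2 and let $t \ge 1$ below. Since
\[ N_bU_m(0) = \int_{\Omega_b}0\,dx = 0, \]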


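by Lemma 2.2
\[ \|e^{-tA}U_m(0)\|_{W_p^{1,2}(\Omega_b)} \le Ct^{-\frac32}\|U_m(0)\|_{W_p^{1,0}} \le Ct^{-\frac32}\|PU_0\|_p, \tag{4.3} \]
\[ \|e^{-tA}U_m(0)\|_{W_p^{2,3}(\Omega_b)} \le Ct^{-\frac32}\|U_m(0)\|_{W_p^{2,1}} \le Ct^{-\frac32}\|PU_0\|_{1,p}. \tag{4.4} \]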

In view of Lemma 2.2 we can write Z t Z t Z t −(t−s)A e Fm (s)ds = T1 (b, ψ, t − s)Fm (s)ds + T2 (b, ψ, t − s)Fm (s)ds 0

0

0

= I1 (t) + I2 (t), where ψ is a function in C0∞ (b ) such that

R b

ψ(x)dx = 1. If we introduce the function

Z

gm1 (s, x) = fm1 (s, x) −

b

fm1 (s, x)dxψ(x),

then we can write T1 (b, ψ, t − s)Fm (s) = e−(t−s)A (gm1 (s), fm2 (s), fm3 (s)). To estimate I1 (t), we put Z I1 (t) =

Z

t

Z 1

t−1

+ t−1

T1 (b, ψ, t − s)Fm (s)ds =

+ 1

0

3 X

Jj (t).

j=1

By Theorem 3.1 and (4.1), 5 C k(ρc )s kj+1,p,Bb + kE(s)U1 kj+1,p,Bb kFm (s)kWj+1,j p ], s = 1, j = 0, 1, 5 Cs− 2q [kU0 kq + kU0 kWj+1,j p 3

Z

fm1 (s, x)dx 5 CkFm (s)kW1,0 p () b h i 3 , s = 1. 5 Cs− 2q kU0 kq + kU0 kW1,0 p

(4.5)

By (2.2), (2.3) and (4.5), Z kJ1 (t)k1,p 5 C Z

t−1 t

(t − s)− 2 k(gm1 (s), fm2 (s), fm3 (s))kW1,0 ds p 1

h

i

(t − s)− 2 s− 2q ds kU0 kq + kU0 kW1,0 p t−1 h i 3 ; 5 Ct− 2q kU0 kq + kU0 kW1,0 p Z t 1 5C (t − s)− 2 k(gm1 (s), fm2 (s), fm3 (s))kW2,1 ds p

5C

k(I − P)J1 (t)k2,p

t

Z

5C

t−1 t t−1

1

3

i h 3 1 (t − s)− 2 s− 2q ds kU0 kq + kU0 kW2,1 p

(4.6)


h i 3 5 Ct− 2q kU0 kq + kU0 kW2,1 . p

(4.7)

By Lemma 2.2 and (4.5), Z kJ2 (t)kWj+1,j+2 (b ) 5 C p

t−1

1

Z

5C

3

i h 3 3 (t − s)− 2 s− 2q ds kU0 kq + kU0 kWj+1,j p h i kU0 kq + kU0 kWj+1,j t = 1. p

t−1

1 3 − 2q

5 Ct

(t − s)− 2 kFm (s)kWj+1,j ds p

(4.8)

Since Fm (s) = ∂s (ϕρc (s), 0, 0) + (γ∇ϕ · vc (s), fm2 (s), fm3 (s)), we have J3 (t) = T1 (b, ψ, t)(ϕρc (0), 0, 0) − T1 (b, ψ, t − 1)(ϕρc (1), 0, 0) Z 1 − [∂s T1 (b, ψ, t − s)] (ϕρc (s), 0, 0)ds Z

0 1

T1 (b, ψ, t − s)(γ∇ϕ · vc (s), fm2 (s), fm3 (s))ds.

+ 0

By Theorem 3.1 and (4.1), kϕρc (s)kj+1,p 5 CkU0 kWj+1,j p k(γ∇ϕ · vc (s), fm2 (s), fm3 (s))kWj+1,j 5 CkE(s)U1 kj+1,p,b 5 Cs− 2 kU0 kWj+1,j p p 1

for j = 0, 1 and 0 < s 5 1, and therefore by Lemma 2.2 −2 kU0 kWj+1,j j = 0, 1. kJ3 (t)kWj+1,j+2 (b ) 5 Ct p p 3

(4.8)

Combining (4.6), (4.7) and (4.8) implies that h i 3 , kI1 (t)k1,p,b 5 Ct− 2q kU0 kq + kU0 kW1,0 p h i 3 k(I − P)I1 (t)k2,p,b 5 Ct− 2q kU0 kq + kU0 kW2,1 p

(4.9)

for t = 1. Since (ρc )t + γ div vc = 0, we have fm1 = γ div(ϕvc ), and therefore by Stokes formula, Z Z Z fm1 (s, x)dx = γ ν(x) · (ϕ(x)vc (s, x))dσ = −γ (−ν(x)) · vc (s, x)dσ b ∂b ∂O Z Z Z d div vc (s, x)dx = (ρc )s (s, x)dx = ρc (s, x)dx, = −γ ds O O O (4.10) where ν(x) is the unit external normal vector to ∂ and dσ is the surface element. By Lemma 2.2 and (4.10),


Z I2 (t) = −

O

ρc (t, x)dxψI1 −

Z t Z O

0

R

ρc (s, x)dx e−(t−s)A (0, ∇ψ, 0) ds, R

where we have used the fact that O ρc (0, x)dx = from (4.2). By (2.2), (2.3) and Lemma 2.2,

O

ρ1 (x)dx = 0 which follows

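\[ \|e^{-(t-s)A}(0,\nabla\psi,0)\|_{1,p,\Omega_b} + \|(I-P)e^{-(t-s)A}(0,\nabla\psi,0)\|_{2,p,\Omega_b} \le C\big(1 + (t-s)^{-\frac12}\big)(1+t-s)^{-\frac32} \quad\text{for } 0 < s < t. \]
Since
\[ \Big|\int_{\mathcal{O}}\rho_c(s,x)\,dx\Big| \le C(1+s)^{-\frac{3}{2q}}\big[\|U_0\|_q + \|U_0\|_{W_p^{1,0}}\big], \qquad s \ge 0, \]
as follows from Theorem 3.1 and (4.1), we have also
\[ \|I_2(t)\|_{1,p,\Omega_b} + \|(I-P)I_2(t)\|_{2,p,\Omega_b} \le Ct^{-\frac{3}{2q}}\big[\|U_0\|_q + \|U_0\|_{W_p^{1,0}}\big], \qquad t \ge 1, \]
which together with (4.3), (4.4) and (4.9) implies Step 1.

Step 2. For $t \ge 1$, we have
\[ \|\partial_tU(t)\|_{1,p,\Omega_b} \le Ct^{-\frac{3}{2q}}\big[\|U_0\|_q + \|U_0\|_{W_p^{1,0}}\big], \qquad U_0 \in W_p^{1,0}\cap L_q, \tag{4.11} \]
\[ \|PU(t)\|_{2,p,\Omega_b} \le Ct^{-\frac{3}{2q}}\big[\|U_0\|_q + \|U_0\|_{W_p^{1,0}}\big], \qquad U_0 \in W_p^{1,0}\cap L_q, \tag{4.12} \]
\[ \|PU(t)\|_{3,p,\Omega_b} \le Ct^{-\frac{3}{2q}}\big[\|U_0\|_q + \|U_0\|_{W_p^{2,1}}\big], \qquad U_0 \in W_p^{2,1}\cap L_q,\ PU_0|_{\partial\Omega} = 0. \tag{4.13} \]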

Put Un = (ρn , vn , θn ) = U − (1 − ϕ)Uc and then ∂t Un + AUn = Fn in (0, ∞) × , PUn |∂ = 0, Un (0) = (ϕρ0 , ϕv0 , ϕθ0 ), where Fn = (fn1 , fn2 , fn3 ), fn1 = γ(∇ϕ) · vc , fn2 = γ(∇ϕ)ρc + ω(∇ϕ)θc −α[(1ϕ)vc + 2(∇ϕ) :∇vc ]−β[∇(∇ · vc ) + (∇ϕ) div vc ], fn3 = ω(∇ϕ) · vc − κ[(1ϕ)θc + 2(∇ϕ) · (∇θc )]. By Duhamel’s principle and integration by parts, 1 1 ∂t Un (t) = ∂t e−tA Un (0) − e− 2 A Fn (t − ) 2 Z Z t e−(t−s)A ∂s Fn (s)ds − −

t− 21

t− 21

∂t e−(t−s)A Fn (s)ds.

0

By Lemma 2.2, Theorem 3.1 and (4.1), 5 Ct− 2q kU0 kW1,0 , k∂t e−tA Un (0)kW1,2 p (b ) p 3

1 1 1 1 5 CkFn (t − )kW1,0 5 CkE(t − )U1 kW0,1 ke− 2 A Fn (t − )kW1,2 p (b ) p p (b ) 2 2 2 h i

5 Ct− 2q kU0 kq + kU0 kW1,0 . p 3


Since −

5 Ck∂s E(s)U1 kW0,1 5 Cs k∂s Fn (s)kW1,0 p p (b )

3 1 2q + 2

h

kU0 kq + kU0 kW1,0 p

i

1 as follows from Theorem 3.1 and (4.1), by (2.2) 2

Z Z t h

t i 3 1

−(t−s)A − 21 − 2q + 2 e ∂ F (s)ds 5 C (t − s) s ds kU0 kq + kU0 kW1,0

s n p

t− 1 t− 21 2 1,p i h − 3 +1 . 5 Ct 2q 2 kU0 kq + kU0 kW1,0 p

for s =

Since

h i 3 − 2q − 21 0,1 1,0 1 + s kU 5 CkE(t)U k 5 C(1 + s) k + kU k kFn (s)kW1,0 1 0 q 0 W ( ) W b p p p

as follows from Theorem 3.1 and (4.1), by Lemma 2.2

Z 1

t− 2

∂t e−(t−s)A Fn (s)ds

1,2

0 Wp (b )

Z 5C 0

5 Ct

h i 3 3 1 (t − s)− 2 (1 + s)− 2q 1 + s− 2 ds kU0 kq + kU0 kW1,0 p h i kU0 kq + kU0 kW1,0 . p

t− 21

3 − 2q

Combining these estimations, we have h i 3 . k∂t Un (t)k1,p,b 5 Ct− 2q kU0 kq + kU0 kW1,0 p

(4.14)

On the other hand, by Theorem 3.1 and (4.1), h i − 3 +1 , k∂t Uc (t)k1,p,b 5 Ct 2q 2 kU0 kq + kU0 kW1,0 p which together with (4.14) implies (4.11). By Proposition Ap.2 in the Appendix below we see easily that kPU(t)k2,p,b−1 5 C k∂t PU(t)kp,b + kU(t)k1,p,b . Therefore, by Step 1 and (4.11) we have h i 3 t = 1. kPU(t)k2,p,b−1 5 Ct− 2q kU0 kq + kU0 kW1,0 p Since b is chosen arbitrarily, we have (4.12) with j = 0. Applying Proposition Ap. 2 with m = 1, we have kPU(t)k3,p,b−1 5 C k∂t PU(t)k1,p,b + kU(t)k2,p,b .


And then, by Step 1, (4.11) and (4.12) we have h i 3 kPU(t)k3,p,b−1 5 Ct− 2q kU0 kq + kU0 kW2,1 p for U0 ∈ Wp2,1 ∩ Lq . Since b is chosen arbitrarily, we have (4.13). Now, we shall estimate U(t) for |x| = b. Let ϕ∞ (x) be a function in C ∞ (R3 ) such that ϕ∞ (x) = 1 for|x| = b and ϕ∞ (x) = 0 for |x| 5 b − 1. Put U∞ (t) = (ρ∞ (t), v∞ (t), θ∞ (t)) = ϕ∞ U(t). Then, ∂t U∞ + AU∞ = F∞ in (0, ∞) × R3 and U∞ (0) = ϕ∞ U0 in R3 , where F∞ = (f∞1 , f∞2 , f∞3 ), f∞1 = −γ(∇ϕ) · v, f∞2 = α [(1ϕ∞ )v + 2(∇ϕ∞ ) :∇v] + β [∇((∇ϕ∞ ) · v) + (∇ϕ∞ ) div v] − γ(∇ϕ∞ )ρ − ω(∇ϕ∞ )θ, f∞3 = κ [(1ϕ∞ )θ + 2(∇ϕ∞ ) · ∇θ] − ω(∇ϕ∞ )θ. By Duhamel’s principle we can write Z U∞ (t) = E(t)U∞ (0) − L(t), L(t) =

t

E(t − s)F∞ (s)ds.

0

Let 1 5 q 5 2 5 p < ∞. When we estimate the L∞ -norm of U∞ (t), we always assume that 3 < p < ∞, below. And also, we assume that t = 1, below. Recall that 3 σ= 2

1 1 − q p

(cf. Theorem 1.3). By Theorem 3.1, we have h i k k∂xk (I − P)E(t)U∞ (0)kp 5 Ct−(σ+ 2 ) kU0 kq + kU0 kWk,(k−1)+ , p i h k −(σ+ k2 ) kU0 kq + kU0 kW(k−1)+ ,(k−2)+ , k∂x PE(t)U∞ (0)kp 5 Ct p h i 3 − 2q 5 Ct kU0 kq + kU0 kW1,0 , kE(t)U∞ (0)kW0,1 ∞ p h i 3 , 5 Ct− 2q kU0 kq + kU0 kW2,1 kE(t)U∞ (0)kW1,2 ∞ p

(4.15)

where we have used Sobolev’s inequality for 3 < p < ∞ when we estimated the L∞ -norm. To estimate L(t) we decompose it as follows: Z L(t) =

Z

t

+ t−1

Z 1

t−1

E(t − s)F∞ (s)ds =

+ 1

0

3 X j=1

Lj (t).


By Theorem 3.1 k∂xk PL1 (t)kp 5 C k∂xk (I − P)L1 (t)kp 5 C 5C kL1 (t)kW0,1 ∞

Z

t

t−1 t

(t − s)− 2 kF∞ (s)k(k−1)+ ,p,R3 ds, 1

Z

t−1 Z t t−1 t

Z

5C kL1 (t)kW1,2 ∞

t−1

kF∞ (s)kWk,(k−1)+ (R3 ) ds, p

(t − s)− 2 kF∞ (s)k1,p,R3 ds, 1

(t − s)− 2 kF∞ (s)k2,p,R3 ds. 1

By Steps 1 and 2, h i 3 − 2q j+1,j+2 j+1,j kU 5 CkU(s)k 5 Ct k + kU k kF∞ (s)kWj+2,j+1 3 0 q 0 (R ) Wp (b ) Wp p for s = 1 and j = 0, 1. Combining these estimates implies that h i 3 , 0 5 k 5 2, k∂xk L1 (t)kp 5 Ct− 2q kU0 kq + kU0 kW1,0 p h i 3 , k∂x3 PL1 (t)kp 5 Ct− 2q kU0 kq + kU0 kW2,1 p h i 3 , 5 Ct− 2q kU0 kq + kU0 kW1,0 kL1 (t)kW0,1 ∞ p h i 3 5 Ct− 2q kU0 kq + kU0 kW2,1 . kL1 (t)kW1,2 ∞ p

(4.16)

(4.17)

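Now, we shall estimate $L_2(t)$. In order to do this we introduce the symbol $\ell_{pq}(t)$, which is defined by the following formula:
\[ \ell_{pq}(t) = \begin{cases} t^{-(\sigma+\frac12)} & \text{if } 1 \le q \le 2 \le p \le 3, \\ t^{-\frac{3}{2q}} & \text{if } 1 \le q \le 2 \text{ and } 3 < p < \infty. \end{cases} \]
Since $\|F_\infty(s)\|_{q,\mathbb{R}^3} \le C\|F_\infty(s)\|_{p,\mathbb{R}^3}$, $1 \le q \le p \le \infty$, as follows from the fact that $F_\infty$ has the compact support, by Theorem 3.1 and (4.16),
\[ \begin{aligned}
\|\partial_x^kPL_2(t)\|_p &\le C\int_1^{t-1}(t-s)^{-\frac32\left(1-\frac1p\right)-\frac k2}\Big[\|F_\infty(s)\|_{1,\mathbb{R}^3} + \|F_\infty(s)\|_{W_p^{(k-1)^+,(k-2)^+}(\mathbb{R}^3)}\Big]ds \\
&\le C\int_1^{t-1}(t-s)^{-\frac32\left(1-\frac1p\right)-\frac k2}s^{-\frac{3}{2q}}\,ds\,\big[\|U_0\|_q + \|U_0\|_{W_p^{1,0}}\big], \qquad 0 \le k \le 3;
\end{aligned} \]
\[ \begin{aligned}
\|\partial_x^k(I-P)L_2(t)\|_p &\le C\int_1^{t-1}(t-s)^{-\frac32\left(1-\frac1p\right)-\frac k2}\Big[\|F_\infty(s)\|_{1,\mathbb{R}^3} + \|F_\infty(s)\|_{W_p^{k,(k-1)^+}(\mathbb{R}^3)}\Big]ds \\
&\le C\int_1^{t-1}(t-s)^{-\frac32\left(1-\frac1p\right)-\frac k2}s^{-\frac{3}{2q}}\,ds\,\big[\|U_0\|_q + \|U_0\|_{W_p^{1,0}}\big], \qquad 0 \le k \le 2;
\end{aligned} \]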


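\[ \begin{aligned}
\|L_2(t)\|_{W_\infty^{1,2}(\mathbb{R}^3)} &\le C\int_1^{t-1}(t-s)^{-\frac32}\Big[\|F_\infty(s)\|_{1,\mathbb{R}^3} + \|F_\infty(s)\|_{W_p^{2,1}}\Big]ds \\
&\le C\int_1^{t-1}(t-s)^{-\frac32}s^{-\frac{3}{2q}}\,ds\,\big[\|U_0\|_q + \|U_0\|_{W_p^{1,0}}\big].
\end{aligned} \]
If we divide $\int_1^{t-1}$ into $\int_{(t-1)/2}^{t-1}$ and $\int_1^{(t-1)/2}$, we see easily that
\[ \begin{aligned}
\int_1^{t-1}(t-s)^{-\frac32\left(1-\frac1p\right)}s^{-\frac{3}{2q}}\,ds &\le Ct^{-\sigma}, \\
\int_1^{t-1}(t-s)^{-\frac32\left(1-\frac1p\right)-\frac12}s^{-\frac{3}{2q}}\,ds &\le C\ell_{pq}(t), \\
\int_1^{t-1}(t-s)^{-\frac32\left(1-\frac1p\right)-\frac k2}s^{-\frac{3}{2q}}\,ds &\le Ct^{-\frac{3}{2q}}, \qquad k = 2, 3, \\
\int_1^{t-1}(t-s)^{-\frac32}s^{-\frac{3}{2q}}\,ds &\le Ct^{-\frac{3}{2q}}.
\end{aligned} \]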
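The splitting argument for these convolution-type integrals can be illustrated numerically. The sketch below is an informal check under stated assumptions, not part of the proof: it fixes the sample exponents $p = 2$, $q = 1$ (so $\sigma = 3/4$), evaluates $\int_1^{t-1}(t-s)^{-a}s^{-b}\,ds$ by the midpoint rule, and prints the ratio to the claimed decay rate for a few values of $t$. The function names and the chosen sample points are illustrative.

```python
def split_integral(t, a, b, n=20000):
    """Midpoint-rule value of the integral of (t-s)^(-a) * s^(-b)
    over s in [1, t-1]; requires t > 2."""
    lo, hi = 1.0, t - 1.0
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        s = lo + (i + 0.5) * h
        total += (t - s) ** (-a) * s ** (-b)
    return total * h


def exponents(p, q, k=0):
    """Exponents a, b of the integrand and the rate sigma from Theorem 1.3."""
    a = 1.5 * (1.0 - 1.0 / p) + k / 2.0
    b = 1.5 / q
    sigma = 1.5 * (1.0 / q - 1.0 / p)
    return a, b, sigma


def decay_ratios(p, q, k, rate, times=(4.0, 16.0, 64.0, 256.0)):
    """Ratios integral / t^(-rate); these should stay bounded as t grows."""
    a, b, _ = exponents(p, q, k)
    return [split_integral(t, a, b) / t ** (-rate) for t in times]


if __name__ == "__main__":
    a, b, sigma = exponents(2.0, 1.0)
    print("k=0:", decay_ratios(2.0, 1.0, 0, sigma))   # compared with t^(-sigma)
    print("k=2:", decay_ratios(2.0, 1.0, 2, b))       # compared with t^(-3/(2q))
```

Bounded ratios on a finite sample of $t$ values are only consistent with the estimates; the actual bound for all $t \ge 1$ is what the splitting of the integral at $(t-1)/2$ provides.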

Therefore, we have h i , kL2 (t)kp 5 Ct−σ kU0 kq + kU0 kW1,0 p h i , k∂x L2 (t)kp 5 C`pq (t) kU0 kq + kU0 kW1,0 p h i 3 , k∂x2 L2 (t)kp 5 Ct− 2q kU0 kq + kU0 kW1,0 p h i 3 , k∂x3 PL2 (t)kp 5 Ct− 2q kU0 kq + kU0 kW1,0 p h i 3 5 Ct− 2q kU0 kq + kU0 kW1,0 . kL2 (t)kW1,2 ∞ p Finally, we shall estimate L3 (t). By Theorem 3.1, k∂xk PL3 (t)kp Z 1 h i k 5C (t − s)−σ− 2 kF∞ (s)kq,R3 + kF∞ (s)kW(k−1)+ ,(k−2)+ (R3 ) ds 0

k

5 Ct−σ− 2

Z

p

1

kU(s)kW(k−2)+ ,(k−1)+ ( ) ds; b

p

0

k∂xk (I − P)L3 (t)kp Z 1 h i k 5C (t − s)−σ− 2 kF∞ (s)kq,R3 + kF∞ (s)kWk,(k−1)+ (R3 ) ds 0

k

5 Ct−σ− 2

Z

0

p

1

kU(s)kW(k−1)+ ,k ( ) ds; p

b

(4.18)


Z kL3 (t)kW0,1 5C ∞

1

0

h i 3 (t − s)− 2 kF∞ (s)k1,R3 + kF∞ (s)kW1,0 ds 3 p (R )

5 Ct− 2

3

Z 5C kL3 (t)kW1,2 ∞

0

Z

1

kU(s)kW0,1 ds, p (b )

0

1


h i 3 ds (t − s)− 2 kF∞ (s)k1,R3 + kF∞ (s)kW2,1 3 p (R )

5 Ct− 2

3

Z

0

1

kU(s)kW1,2 ds. p (b )

By (2.2) and (2.6), 5 Cs− 2 kU0 kW1,0 , kU(s)kW0,1 p (b ) p 1

5 kU0 kW1,2 when U0 ∈ Dp (A). kU(s)kW1,2 p (b ) p Combining these estimates implies that , kL3 (t)kp 5 Ct−σ kU0 kW1,0 p

U0 ∈ Wp1,0 ,

, U0 ∈ Wp1,0 , k∂x L3 (t)kp 5 Ct−σ− 2 kU0 kW1,0 p 1

, U0 ∈ Wp1,0 , k∂x2 PL3 (t)kp 5 Ct−σ−1 kU0 kW1,0 p , U0 ∈ Dp (A), k∂x2 (I − P)L3 (t)kp 5 Ct−σ−1 kU0 kW1,2 p , U0 ∈ Dp (A), k∂x3 PL3 (t)kp 5 Ct−σ− 2 kU0 kW1,2 p 3

5 Ct− 2 kU0 kW1,0 , kL3 (t)kW0,1 ∞ p

U0 ∈ Wp1,0 ,

5 Ct− 2 kU0 kW1,2 , kL3 (t)kW1,2 ∞ p

U0 ∈ Dp (A).

3

3

(4.19)

Put Db = {x ∈ R3 | |x| = b}. Since U∞ = U on Db , combining (4.15), (4.17), (4.18) and (4.19) implies that h i , U0 kU(t)kp,Db 5 Ct−σ kU0 kq + kU0 kW1,0 p i h ,U0 k∂x U(t)kp,Db 5 C`pq (t) kU0 kq + kU0 kW1,0 p h i 3 , U0 k∂x2 PU(t)kp,Db 5 Ct− 2q kU0 kq + kU0 kW1,0 p 3 k∂x2 (I − P)U(t)kp,Db 5 Ct− 2q kU0 kq + kU0 k2,p , U0 3 k∂x3 PU(t)kp,Db 5 Ct− 2q kU0 kq + kU0 k2,p , U0 h i 3 − 2q 1,0 , U0 5 Ct k + kU k kU kU(t)kW0,1 0 q 0 Wp ∞ (Db ) 3 − 2q 5 Ct kU0 kq + kU0 k2,p , U0 kU(t)kW1,2 ∞ (Db )

∈ Wp1,0 ∩ Lq , ∈ Wp1,0 ∩ Lq , ∈ Wp1,0 ∩ Lq , ∈ W2p ∩ Lq and PU0 |∂ = 0, ∈ W2p ∩ Lq and PU0 |∂ = 0, ∈ Wp1,0 ∩ Lq , ∈ W2p ∩ Lq and PU0 |∂ = 0,

which together with Steps 1 and 2 and Sobolev’s imbedding theorem implies the proof of (A) and (B) of Theorem 1.3. To prove (C) of Theorem 1.3, we think of U(t) as


U(t) = e−(t− 2 )A (e− 2 A U0 ). By (2.3) and (2.4), 1

1

5 CkU0 kW2,1 when U0 ∈ Wp2,1 . ke− 2 A U0 kW2,3 p p 1

Moreover, when U0 ∈ Wq1,0 . P(e− 2 A U0 )|∂ = 0 and ke− 2 A U0 kq 5 CkU0 kW1,0 q 1

1

Therefore, applying (B) to e−(t− 2 )A (e− 2 A U0 ), we have (C) of Theorem 1.3, which completes the proof of Theorem 1.3. 1

1

5. A Proof of Theorem 1.2 In this section, we shall prove Theorem 1.2. By the change of unknown functions: (ρ, v, θ) → (ρ + ρ¯0 , v, θ + θ¯0 ), the IBVP: (1.1), (1.2) and (1.3) is reduced to the following equation:  ρt + ρ¯0 div v = f1 ,   0 in [0, ∞) × , ˆ − (µˆ + µˆ )∇ div v + p1 ∇ρ + p2 ∇θ = f2 vt − µ1v   θt − κ1θ ˆ + p3 div v = f3 , (5.1) v| = 0, θ| = 0, (ρ, v, θ)|t=0 = (ρ0 − ρ¯0 , v0 , θ0 − θ¯0 ), ∂

∂

where µˆ = µ/ρ¯0 , µˆ 0 = µ0 /ρ¯0 , κˆ = κ/ρ¯0 c, p1 = Pρ (ρ¯0 , θ¯0 )/ρ¯0 , p2 = Pθ (ρ¯0 , θ¯0 )/ρ¯0 , p3 = θ¯0 Pθ (ρ¯0 , θ¯0 )/ρ¯0 c, f1 = −ρ div v − ∇ρ · v, µ µ + µ0 − µˆ 1v + − µˆ − µˆ 0 ∇ div v f2 = −(v · ∇)v + ρ + ρ¯0 ρ + ρ¯0 Pρ (ρ, θ) Pθ (ρ, θ) ∇ρ + p2 − ∇θ, + p1 − ρ + ρ¯0 ρ + ρ¯0 (θ + θ¯0 )Pθ (ρ, θ) κ div v − κˆ 1θ + p3 − f3 = −(v · ∇)θ + (ρ + ρ¯0 )c (ρ + ρ¯0 )c Ψ (v) . + (ρ + ρ¯0 )c If we put ρ0 = (p1 /ρ¯0 ) 2 ρ, v0 = v and θ0 = (c/θ¯0 ) 2 θ, then (5.1) is reduced to the symmetric form:  ρ0t + γ div v0 = f10 ,   vt0 − α1v0 − β∇ div v0 + γ∇ρ0 + ω∇θ0 = f20 in [0, ∞) × ,   0 0 0 0 0 θt − κ 1θ + ω div v = f3 1

1

(5.2) v0 |∂ = 0, θ0 |∂ = 0, (ρ0 , v0 , θ0 )|t=0 = (ρ00 , v00 , θ00 ), q p where α = µ, ˆ β = µˆ + µˆ 0 , γ = ρ¯0 Pρ (ρ¯0 , θ¯0 ), κ0 = κ/ρ¯0 c and ω = Pθ (ρ¯0 , θ¯0 ) θ¯0 /c/ρ¯0 . For notational simplicity, we write: ρ = ρ0 , v = v0 , θ = θ0 , f1 = f10 , f2 = f20 , f3 = f30 ,


κ = κ0 , again. If we put U = (ρ, v, θ), U0 = (ρ00 , v00 , θ00 ) and F(U) = (f1 , f2 , f3 ), then the IBVP: (1.1), (1.2) and (1.3) is reduced to the following equations: Ut + AU = F(U) t > 0, PU|∂ = 0, U|t=0 = U0 ,

(5.3)

where F(U) is written symbolically as follows: (I − P)F(U) = (ρ∂x v, ∂x ρv), PF(U) = (a(ρ)ρ∂x2 PU, b(U)U∂x U, c(ρ)∂x v∂x v). Note that there exist constants C1 , C2 > 0 such that 5 k(ρ0 − ρ¯0 , v0 , θ0 − θ¯0 )kW`,m 5 C2 kU0 kW`,m . C1 kU0 kW`,m p p p Let Nk (0, ∞), k = 1, 2, be the quantity defined in Section 1.3. For the symbol U = (ρ, v, θ), we have Z ∞ k∂x U(s)k2W2,3 + k∂s U(s)k22,2 ds 5 CN1 (0, ∞)2 , kU(t)k23,2 + k∂t U(t)k2W2,1 + 2 2 Z0 ∞ k∂x U(s)k2W3,4 + k∂s U(s)k23,2 ds 5 CN2 (0, ∞)2 . kU(t)k24,2 + k∂t U(t)k2W3,2 + 2 2 0 (5.4) By choosing k(ρ0 − ρ¯0 , v0 , θ0 − θ¯0 )kk+2,2 small enough, we can make Nk (0, ∞) as small as we want, and therefore we will state the smallness assumption in terms of Nk (0, ∞) instead of k(ρ0 − ρ¯0 , v0 , θ0 − θ¯0 )kk+2,2 in the course of our proof of Theorem 1.2 below. Since Nk (0, ∞) will be chosen small enough, without loss of generality we may assume that Nk (0, ∞) 5 1, k = 1, 2. To prove Theorem 1.2, we reduce the problem (5.3) to the integral equation: Z t −tA e−(t−s)A F(U)(s)ds. U(t) = e U0 − N(t), N(t) = 0

Step 1. Put

3 M2 (t) = sup (1 + s) 4 kU(s)kW2,3 + k∂s U(s)k1,2 . 2

05s5t

Then, there exists an 2 > 0 such that if N2 (0, ∞) 5 2 , then

M2 (t) 5 C N1 (0, ∞) + kU0 k1 + kU0 k2,2 .

(5.5)

M2 (t) 5 CN1 (0, ∞),

(5.6)

When 0 5 t 5 2,

and therefore we consider the case when t = 0, below. By Theorem 1.3 (A) and (B), 3 ke−tA U0 k2 5 Ct− 4 kU0 k1 + kU0 kW1,0 , 2 −tA − 45 kU0 k1 + kU0 kW1,0 , k∂x e U0 k2 5 Ct 2 2 −tA − 23 kU0 k1 + kU0 kW1,0 , k∂x Pe U0 k2 5 Ct 2

k∂x2 (I

−tA

− P)e

U0 k2 +

k∂x3 Pe−tA U0 k2

+ k∂x ∂t e−tA U0 k2


3 5 Ct− 2 kU0 k1 + kU0 k2,2 ,

(5.7)

because PU0 |∂ = 0. The main task is the estimation of N(t) which is divided into the two parts as follows: Z t Z t−1 + e−(t−s)A F(U)(s)ds = I(t) + II(t). N(t) = t−1

0

Before going further on the proof of Step 1, at this point we state the estimate of nonlinear terms F(U): kF(U)(s)kW1,0 5 CkU(s)k22,2 ,

2 5 C kU(s)k∞ k∂x U(s)k1,p + k∂x U(s)kp k∂x PU(s)k2,2 , kF(U)(s)kW1,0 p

2 < p < ∞,

kF(U)(s)kW2,1 5 C kU(s)k∞ + k∂x U(s)k4 k∂x U(s)k2,2 ,

2 k∂s F(U)(s)kW1,0 5 C k∂s U(s)k2,2 kU(s)k∞ + k∂s U(s)k1,2 k∂x U(s)k2,2 . 2

(5.8)

In fact, to get

k∂x (∂x ρ v)k2 5 C k∂x2 ρk2 kvk∞ + k∂x ρk4 k∂x vk4 5 CkU(s)k22,2 ,

we use H¨older’s inequality and the inequalities: kukp 5 Ckuk1,2 2 5 p 5 6, kukp 5 Ckuk2,2 6 < p 5 ∞.

(5.9)

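And also, we have $\|\rho\,\partial_x^2 PU\|_2 \le \|\rho\|_\infty\|\partial_x^2 PU\|_2 \le C\|U(s)\|_{2,2}^2$. Other terms can be estimated in the same manner, and then we have the first formula of (5.8). To get the second inequality, observe that
\[
\begin{aligned}
\|\partial_x(\partial_x\rho\, v)\|_p &\le \|\partial_x^2\rho\|_p\|v\|_\infty + \|\partial_x\rho\|_p\|\partial_x v\|_\infty \le C\bigl[\|v\|_\infty\|\partial_x\rho\|_{1,p} + \|\partial_x\rho\|_p\|\partial_x v\|_{2,2}\bigr],\\
\|\partial_x(\rho\,\partial_x v)\|_p &\le C\|\rho\|_\infty\|\partial_x^2 v\|_p + \|\partial_x\rho\|_p\|\partial_x v\|_\infty \le C\bigl[\|\rho\|_\infty\|\partial_x v\|_{1,p} + \|\partial_x\rho\|_p\|\partial_x v\|_{2,2}\bigr],\\
\|\rho\,\partial_x^2 v\|_p &\le C\|\rho\|_\infty\|\partial_x^2 v\|_p \le C\|\rho\|_\infty\|\partial_x v\|_{1,p},\\
\|\partial_x v\,\partial_x v\|_p &\le \|\partial_x v\|_p\|\partial_x v\|_\infty \le C\|\partial_x v\|_p\|\partial_x v\|_{2,2},\\
\|U\partial_x U\|_p &\le \|U\|_\infty\|\partial_x U\|_p,
\end{aligned}
\]
and therefore we have the second inequality of (5.8). To get the third inequality, observe that
\[
\begin{aligned}
\|\partial_x^2(\partial_x\rho\, v)\|_2 &\le \|\partial_x^3\rho\|_2\|v\|_\infty + 2\|\partial_x^2\rho\|_4\|\partial_x v\|_4 + \|\partial_x\rho\|_4\|\partial_x^2 v\|_4 \le C\bigl[\|U(s)\|_\infty + \|\partial_x U(s)\|_4\bigr]\|\partial_x U(s)\|_{2,2};\\
\|\partial_x(\rho\,\partial_x^2 v)\|_2 &\le \|\rho\|_\infty\|\partial_x^3 v\|_2 + \|\partial_x\rho\|_4\|\partial_x^2 v\|_4 \le C\bigl[\|U(s)\|_\infty + \|\partial_x U(s)\|_4\bigr]\|\partial_x U(s)\|_{2,2}.
\end{aligned}
\]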


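Other terms can be estimated in the same manner, and therefore we have the third formula of (5.8). To get the fourth formula of (5.8), observe that
\[
\begin{aligned}
\|\partial_x\partial_s(\partial_x\rho\, v)\|_2 &\le \|\partial_s\partial_x^2\rho\|_2\|v\|_\infty + \|\partial_s\partial_x\rho\|_2\|\partial_x v\|_\infty + \|\partial_x^2\rho\|_4\|\partial_s v\|_4 + \|\partial_x\rho\|_\infty\|\partial_s\partial_x v\|_2\\
&\le C\bigl[\|\partial_s U(s)\|_{2,2}\|U(s)\|_\infty + \|\partial_s U(s)\|_{1,2}\|\partial_x U(s)\|_{2,2}\bigr];\\
\|\partial_s(\rho\,\partial_x^2 v)\|_2 &\le \|\partial_s\rho\|_4\|\partial_x^2 v\|_4 + \|\rho\|_\infty\|\partial_s\partial_x^2 v\|_2\\
&\le C\bigl[\|\partial_s U(s)\|_{2,2}\|U(s)\|_\infty + \|\partial_s U(s)\|_{1,2}\|\partial_x U(s)\|_{2,2}\bigr].
\end{aligned}
\]
Other terms can be estimated in the same manner, and therefore we have the fourth formula of (5.8). Now, we return to the proof of Step 1. By Theorem 1.3 (A) and (C) with $q = 2$,
\[
\|II(t)\|_{W_2^{2,3}} + \|\partial_t II(t)\|_{1,2} \le C\Bigl\{\|e^{-A}F(U)(t-1)\|_{1,2} + \int_0^{t-1}(t-s)^{-3/4}\bigl[\|F(U)(s)\|_1 + \|F(U)(s)\|_{W_2^{2,1}}\bigr]\,ds\Bigr\}.
\]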

By Hölder's inequality, (5.8) and (5.9),
\[
\|F(U)(s)\|_1 \le C\|U(s)\|_{1,2}\|\partial_x U(s)\|_{1,2}, \qquad \|F(U)(s)\|_{W_2^{2,1}} \le C\|U(s)\|_{2,2}\|\partial_x U(s)\|_{2,2}. \tag{5.10}
\]
By (2.2) and (5.8),
\[
\|e^{-A}F(U)(t-1)\|_{1,2} \le C\|F(U)(t-1)\|_{W_2^{1,0}} \le C\|U(t-1)\|_{2,2}^2.
\]

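Combining these estimations, by (5.4) we have
\[
\begin{aligned}
\|II(t)\|_{W_2^{2,3}} + \|\partial_t II(t)\|_{1,2}
&\le C\Bigl[\int_0^{t-1}(t-s)^{-3/4}(1+s)^{-3/4}\|\partial_x U(s)\|_{2,2}\,ds\; M_2(t) + \|U(t-1)\|_{2,2}^2\Bigr]\\
&\le C\Bigl[\Bigl(\int_0^{t-1}(t-s)^{-3/2}(1+s)^{-3/2}\,ds\Bigr)^{1/2}\Bigl(\int_0^{t-1}\|\partial_x U(s)\|_{2,2}^2\,ds\Bigr)^{1/2} M_2(t) + (1+t)^{-3/4}M_2(t)N_1(0,\infty)\Bigr]\\
&\le C(1+t)^{-3/4}N_1(0,\infty)M_2(t).
\end{aligned}
\tag{5.11}
\]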

On the other hand, by (2.2), (5.4) and (5.8),
\[
\begin{aligned}
\|I(t)\|_{1,2} &\le C\int_{t-1}^{t}(t-s)^{-1/2}\|F(U)(s)\|_{W_2^{1,0}}\,ds\\
&\le C\int_{t-1}^{t}(t-s)^{-1/2}(1+s)^{-3/4}\,ds\; N_1(0,\infty)M_2(t)\\
&\le C(1+t)^{-3/4}N_1(0,\infty)M_2(t),
\end{aligned}
\]
which together with (5.11), (5.6) and (5.7) implies that
\[
\|U(t)\|_{1,2} \le C(1+t)^{-3/4}\bigl[N_1(0,\infty)M_2(t) + N_1(0,\infty) + \|U_0\|_1 + \|U_0\|_{2,2}\bigr]. \tag{5.12}
\]

By integration by parts,
\[
\partial_t I(t) = \int_{t-1}^{t} e^{-(t-s)A}\partial_s F(U)(s)\,ds, \tag{5.13}
\]
and therefore by (2.2),
\[
\|\partial_t I(t)\|_{1,2} \le C\int_{t-1}^{t}(t-s)^{-1/2}\|\partial_s F(U)(s)\|_{W_2^{1,0}}\,ds.
\]
By (5.8), (5.9) and (5.4),
\[
\|\partial_s F(U)(s)\|_{W_2^{1,0}} \le C\bigl[\|\partial_s U(s)\|_{2,2}\|U(s)\|_{2,2} + \|\partial_s U(s)\|_{1,2}\|U(s)\|_{3,2}\bigr] \le C N_2(0,\infty)(1+s)^{-3/4}M_2(t), \quad 0 \le s \le t.
\]
Combining these estimations, we have
\[
\|\partial_t I(t)\|_{1,2} \le C(1+t)^{-3/4}N_2(0,\infty)M_2(t),
\]

which together with (5.11), (5.6) and (5.7) implies that
\[
\|\partial_t U(t)\|_{1,2} \le C(1+t)^{-3/4}\bigl[N_2(0,\infty)M_2(t) + N_1(0,\infty) + \|U_0\|_1 + \|U_0\|_{2,2}\bigr]. \tag{5.14}
\]
By (2.2), (2.3), (5.10) and (5.4),
\[
\begin{aligned}
\|(I-P)I(t)\|_{2,2} &\le C\int_{t-1}^{t}(t-s)^{-1/2}\|F(U)(s)\|_{W_2^{2,1}}\,ds\\
&\le C N_1(0,\infty)\int_{t-1}^{t}(t-s)^{-1/2}(1+s)^{-3/4}\,ds\; M_2(t)\\
&\le C(1+t)^{-3/4}N_1(0,\infty)M_2(t),
\end{aligned}
\]

which together with (5.11), (5.6) and (5.7) implies that
\[
\|(I-P)U(t)\|_{2,2} \le C(1+t)^{-3/4}\bigl[N_1(0,\infty)M_2(t) + N_1(0,\infty) + \|U_0\|_1 + \|U_0\|_{2,2}\bigr]. \tag{5.15}
\]
Applying Proposition Ap.3 in the Appendix to the second and third equations of (5.2), we have
\[
\|PU(t)\|_{2+k,2} \le C\bigl[\|U(t)\|_{1+k,2} + \|\partial_t U(t)\|_{k,2} + \|F(U)(t)\|_{k,2}\bigr], \quad k = 0, 1,
\]
which together with (5.15), (5.14) and (5.12) implies that
\[
\|PU(t)\|_{3,2} \le C(1+t)^{-3/4}\bigl[N_2(0,\infty)M_2(t) + N_1(0,\infty) + \|U_0\|_1 + \|U_0\|_{2,2}\bigr].
\]
Therefore, we have
\[
M_2(t) \le C\bigl[N_2(0,\infty)M_2(t) + N_1(0,\infty) + \|U_0\|_1 + \|U_0\|_{2,2}\bigr],
\]


which implies (5.5) provided that $CN_2(0,\infty) < 1$.

Step 2. Put
\[
M_\infty(t) = \sup_{0\le s\le t}(1+s)^{3/2}\|U(s)\|_\infty.
\]
Then, there exists an $\epsilon_\infty > 0$ such that if $N_1(0,\infty) \le \epsilon_\infty$ then
\[
M_\infty(t) \le C\bigl[N_1(0,\infty) + \|U_0\|_1 + \|U_0\|_{2,2} + M_2(t)^2\bigr]. \tag{5.16}
\]

When $0 \le t \le 2$, by Sobolev's inequality and (5.4),
\[
M_\infty(t) \le C N_1(0,\infty), \tag{5.17}
\]
and therefore we consider the case when $t \ge 2$, below. By Theorem 1.3 (A) with $p = 4$ and $q = 1$ and Sobolev's inequality,
\[
\|e^{-tA}U_0\|_{W_\infty^{0,1}} \le C t^{-3/2}\bigl[\|U_0\|_1 + \|U_0\|_{W_2^{2,1}}\bigr]. \tag{5.18}
\]

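By Sobolev's inequality and (2.2),
\[
\|I(t)\|_\infty \le C\int_{t-1}^{t}(t-s)^{-1/2}\|F(U)(s)\|_{W_4^{1,0}}\,ds.
\]
By (5.8) with $p = 4$ and (5.9),
\[
\begin{aligned}
\|F(U)(s)\|_{W_4^{1,0}} &\le C\bigl[\|U(s)\|_\infty\|\partial_x U(s)\|_{1,4} + \|\partial_x U(s)\|_4\|\partial_x PU(s)\|_{2,2}\bigr]\\
&\le C\bigl[\|U(s)\|_\infty\|U(s)\|_{3,2} + \|U(s)\|_{W_2^{2,3}}^2\bigr].
\end{aligned}
\tag{5.19}
\]
Combining these estimations and (5.4), we have
\[
\|I(t)\|_\infty \le C(1+t)^{-3/2}\bigl[N_1(0,\infty)M_\infty(t) + M_2(t)^2\bigr]. \tag{5.20}
\]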

By Theorem 1.3 (A) with $q = 1$ and $p = 4$, (5.19) and (5.10),
\[
\begin{aligned}
\|II(t)\|_\infty &\le C\int_0^{t-1}(t-s)^{-3/2}\bigl[\|F(U)(s)\|_1 + \|F(U)(s)\|_{W_4^{1,0}}\bigr]\,ds\\
&\le C\int_0^{t-1}(t-s)^{-3/2}(1+s)^{-3/2}\,ds\,\bigl[N_1(0,\infty)M_\infty(t) + M_2(t)^2\bigr]\\
&\le C(1+t)^{-3/2}\bigl[N_1(0,\infty)M_\infty(t) + M_2(t)^2\bigr],
\end{aligned}
\]
which together with (5.20), (5.18) and (5.17) implies that
\[
M_\infty(t) \le C\bigl[N_1(0,\infty)M_\infty(t) + M_2(t)^2 + N_1(0,\infty) + \|U_0\|_1 + \|U_0\|_{W_2^{2,1}}\bigr].
\]

If CN1 (0, ∞) < 1, then we have (5.16).
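The same absorption argument closes each step of this proof, so it may help to see it written out once. The display below is only a sketch: $R(t)$ is a shorthand introduced here (it does not appear in the paper) for the terms on the right-hand side that do not contain $M_\infty(t)$, and the argument assumes $M_\infty(t)$ is finite for each fixed $t$.

```latex
\[
M_\infty(t) \le C N_1(0,\infty)\, M_\infty(t) + C R(t),
\qquad
R(t) := M_2(t)^2 + N_1(0,\infty) + \|U_0\|_1 + \|U_0\|_{W_2^{2,1}},
\]
\[
\bigl(1 - C N_1(0,\infty)\bigr) M_\infty(t) \le C R(t)
\quad\Longrightarrow\quad
M_\infty(t) \le \frac{C}{1 - C N_1(0,\infty)}\, R(t)
\quad \text{whenever } C N_1(0,\infty) < 1 .
\]
```

For instance, if $N_1(0,\infty) \le 1/(2C)$, the prefactor is at most $2C$, which is a constant independent of $t$.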


Step 3. We have the estimate:
\[
\|\partial_x U(t)\|_2 \le C(1+t)^{-5/4}\bigl[\|U_0\|_1 + \|U_0\|_{W_2^{1,0}} + M_2(t)^2\bigr] \tag{5.21}
\]
for $t \ge 0$. In view of (5.6) we consider the case when $t \ge 2$ only. By (2.2) and (5.8),
\[
\begin{aligned}
\|\partial_x I(t)\|_2 &\le C\int_{t-1}^{t}(t-s)^{-1/2}\|F(U)(s)\|_{W_2^{1,0}}\,ds\\
&\le C\int_{t-1}^{t}(t-s)^{-1/2}(1+s)^{-3/2}\,ds\; M_2(t)^2\\
&\le C(1+t)^{-3/2}M_2(t)^2.
\end{aligned}
\]

By Theorem 1.3 (A), (5.8) and (5.10),
\[
\begin{aligned}
\|\partial_x II(t)\|_2 &\le C\int_0^{t-1}(t-s)^{-5/4}\bigl[\|F(U)(s)\|_1 + \|F(U)(s)\|_{W_2^{1,0}}\bigr]\,ds\\
&\le C\int_0^{t-1}(t-s)^{-5/4}(1+s)^{-3/2}\,ds\; M_2(t)^2\\
&\le C(1+t)^{-5/4}M_2(t)^2.
\end{aligned}
\]
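The last inequality in each of these chains rests on an elementary bound of the form $\int_0^{t-1}(t-s)^{-a}(1+s)^{-b}\,ds \le C(1+t)^{-\min(a,b)}$, which holds when $\max(a,b) > 1$. The following is a small numerical sanity check of that bound, not part of the proof; the function names, the midpoint rule and the sample values of $t$ are choices made here for illustration.

```python
def decay_integral(t, a, b, n=20000):
    """Midpoint-rule approximation of the integral of
    (t - s)^(-a) * (1 + s)^(-b) over 0 <= s <= t - 1 (requires t >= 1)."""
    upper = t - 1.0
    h = upper / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += (t - s) ** (-a) * (1.0 + s) ** (-b)
    return total * h


def normalized(t, a, b):
    """The integral multiplied by (1 + t)^min(a, b).

    If the claimed decay rate is right, this stays bounded as t grows."""
    return decay_integral(t, a, b) * (1.0 + t) ** min(a, b)


if __name__ == "__main__":
    # Exponent pairs (5/4, 3/2) and (3/2, 3/2) are the ones used in Steps 2 and 3.
    for t in (2.0, 10.0, 100.0, 1000.0):
        print(t, normalized(t, 1.25, 1.5), normalized(t, 1.5, 1.5))
```

Splitting the integral at $s = t/2$ gives the bound by hand; the script only confirms that the normalized values do not grow with $t$.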

Combining these estimations with (5.6) and (5.7) yields (5.21).

Step 4. Let $3 < p < \infty$ and put
\[
M_p(t) = \sup_{0\le s\le t}(1+s)^{3/2}\|\partial_x U(s)\|_p.
\]
Then, there exists an $\epsilon_p > 0$ depending on $p$ such that if $N_1(0,\infty) \le \epsilon_p$ then
\[
M_p(t) \le C\bigl[M_\infty(t) + M_2(t)^2 + N_1(0,\infty) + \|U_0\|_1 + \|U_0\|_{W_p^{1,0}}\bigr]. \tag{5.22}
\]
By Sobolev's inequality, $M_p(t) \le CN_1(0,\infty)$ when $0 \le t \le 2$, and then we consider the case when $t \ge 2$ only. By Theorem 1.3 (A) with $q = 1$ and Sobolev's inequality,
\[
\|\partial_x e^{-tA}U_0\|_p \le C t^{-3/2}\bigl[\|U_0\|_1 + \|U_0\|_{W_p^{1,0}}\bigr].
\]
By (2.2), (5.8), (5.9) and (5.4),
\[
\|\partial_x I(t)\|_p \le C\int_{t-1}^{t}(t-s)^{-1/2}\|F(U)(s)\|_{W_p^{1,0}}\,ds \le C(1+t)^{-3/2}\bigl[N_2(0,\infty)M_\infty(t) + N_1(0,\infty)M_p(t)\bigr].
\]
By Theorem 1.3 (A) with $q = 1$, (5.8), (5.4) and (5.10),
\[
\|\partial_x II(t)\|_p \le C(1+t)^{-3/2}\bigl[N_2(0,\infty)M_\infty(t) + N_1(0,\infty)M_p(t) + M_2(t)^2\bigr].
\]


Combining these estimations, we have
\[
M_p(t) \le C\bigl[N_1(0,\infty)M_p(t) + N_2(0,\infty)M_\infty(t) + M_2(t)^2 + N_1(0,\infty) + \|U_0\|_1 + \|U_0\|_{W_p^{1,0}}\bigr].
\]
If $CN_1(0,\infty) < 1$, then we have (5.22), where the constant $C$ depends essentially on $p$.

Step 5. Put

h i 5 M2 (t) = sup (1 + s) 4 k∂s U(s)k1,2 + k∂x U(s)kW1,2 . 2

05s5t

Then, there exists an 02 > 0 such that if N1 (0, ∞) 5 02 , then M2 (t) 5 C M∞ (t) + M4 (t) + M2 (t)2 + N1 (0, ∞) + kU0 k1 + kUk2,2 .

(5.23)

In view of (5.6), we may concentrate on the case when t = 2 only. By (5.13), (2.2), (5.4) and (5.8), k∂t I(t)k1,2 5 C(1 + t)− 4 [N2 (0, ∞)M∞ (t) + N1 (0, ∞)M2 (t)] . 5

(5.24)

By Theorem 1.3 (A) and (C) with q = 6/5, k∂t II(t)k1,2 5 C ke−A F(U)(t − 1)k1,2 Z

t−1

+

− 45

(t − s)

kF(U)(s)k

0

W1,0 6/5

+ kF(U)(s)k

W2,1 2

+ kF(U)(s)k1 ds .

By (2.2), (5.8) and Step 1, ke−A F(U)(t − 1)k1,2 5 CkF(U)(t − 1)kW1,0 5 C(1 + t)− 2 M2 (t)2 . 3

2

By H¨older’s inequality with the exponent: 5/6 = 1/2 + 1/3 and (5.9), kF(U)(s)kW1,0 5 C kU(s)k3 k∂x2 U(s)k2 + kU(s)k1,3 k∂x U(s)k2 6/5

5 CkU(s)k22,2 . Therefore, by (5.4), (5.8), Step 2 and Step 4, 5 k∂t II(t)k1,2 5 C(1 + t)− 4 N1 (0, ∞)(M∞ (t) + M4 (t)) + M2 (t)2 . By (2.2), (5.8), Step 2 and Step 4 with p = 4, k(I − P)I(t)k2,2 5 C(1 + t)− 2 N1 (0, ∞) (M∞ (t) + M4 (t)) . 3

By Theorem 1.3 (C) with q = 6/5, (5.25) and (5.8), 5 k∂x2 (I − P)II(t)k2 5 C(1 + t)− 4 N1 (0, ∞) (M∞ (t) + M4 (t)) + M2 (t)2 .

(5.25)


Combining these estimations, (5.6) and (5.7), we have k∂t U(t)k1,2 + k∂x2 (I − P)U(t)k2 5 C(1 + t)− 4 [N1 (0, ∞)M2 (t) + M∞ (t) + M4 (t) +M2 (t)2 + N1 (0, ∞) + kU0 k1 + kU0 k2,2 , ∀ t > 0, (5.26) 5

where we have used the fact that Nk (0, ∞) 5 1, k = 1, 2. By Proposition Ap.3 in the Appendix, we have k∂xk+2 PU(t)k2 5 C k∂t U(t)kk,2 + kPF(U)(t)kk,2 + k∂x (I − P)U(t)kk,2 +k∂x PU(t)kk,2 + kPU(t)k2,b , k = 0, 1. (5.27) Since PU(t)|∂ = 0, by Poincar´e’s inequality and Step 3, h i 5 kPU(t)k2,b 5 Ck∂x PU(t)k2 5 C(1 + t)− 4 kU0 k1 + kU0 kW1,0 + M2 (t)2 .

(5.28)

2

By (5.8), (5.4), Step 2 and Step 4 with p = 4, we have kPF(U)(t)k1,2 5 C(1 + t)− 2 N1 (0, ∞) (M∞ (t) + M4 (t)) . 3

(5.29)

Combining Step 3 and (5.26)–(5.29), M2 (t) 5 C [N1 (0, ∞)M2 (t) + M∞ (t) + M4 (t) + M2 (t)2

+N1 (0, ∞) + kU0 k1 + kU0 k2,2 .

(5.30)

If CN1 (0, ∞) < 1, then (5.23) follows from (5.30). Combining all the steps, we have proved Theorem 1.2, which completes the proof of Theorem 1.2. 6. Appendix. A Priori Estimate of an Elliptic Operator By Agmon–Douglis–Nirenberg [1, 2], we know the following estimate. Theorem Ap.1. Let D be a bounded domain with smooth boundary ∂D. Let 1 < p < ∞ and let m be an integer = 0. Suppose that u ∈ Wpm+2 (D) and f ∈ Wpm (D) satisfy the equation: (Ap.1) −α1u − β∇ div v = f in D and u|∂D = 0, where α > 0 and α + β > 0. Then, the following estimate holds: kukm+2,p,D 5 C(m, p, D) kfkm,p,D + kukp,D . By using the cut-off function ϕ we can deduce the following proposition from Theorem Ap.1 immediately. Proposition Ap.2. Let b be an arbitrary number such that Bb−3 ⊃ O. Let 1 < p < ∞ and let m be an integer = 0. Suppose that u ∈ Wpm+2 () and f ∈ Wpm () satisfy the equation: (Ap.2) −α1u − β∇ div u = f in and u|∂ = 0, where α > 0 and α + β > 0. Then, the following estimate holds: kukm+2,p,b−1 5 C(m, p, b) kfkm,p,b + kukp,b .


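By Fourier transform, we can reduce the formula $-\alpha\Delta u - \beta\nabla\operatorname{div} u = f$ in $\mathbb{R}^3$ to the formula
\[
\bigl(\alpha|\xi|^2\delta_{ij} + \beta\xi_i\xi_j\bigr)\hat u(\xi) = \hat f(\xi) \quad \text{in } \mathbb{R}^3,
\]
where $\hat u(\xi)$ denotes the Fourier transform of $u$. Since $\det\bigl(\alpha|\xi|^2\delta_{ij} + \beta\xi_i\xi_j\bigr) = \alpha^2(\alpha+\beta)|\xi|^6$, by the Fourier multiplier theorem and Calderón's theorem we see easily that
\[
\Bigl\|\partial_x^{m+2}\mathcal{F}^{-1}\Bigl[\bigl(\alpha|\xi|^2\delta_{ij} + \beta\xi_i\xi_j\bigr)^{-1}\hat f(\xi)\Bigr]\Bigr\|_{p,\mathbb{R}^3} \le C(m,p)\|f\|_{m,p,\mathbb{R}^3}. \tag{Ap.3}
\]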
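The determinant identity used above follows from the fact that the symbol matrix is $\alpha|\xi|^2 I + \beta\,\xi\xi^{T}$, whose eigenvalues are $\alpha|\xi|^2$ (twice, on the plane orthogonal to $\xi$) and $(\alpha+\beta)|\xi|^2$ (on the line spanned by $\xi$). The short script below checks the identity numerically; the function names are introduced here for illustration only.

```python
def symbol_matrix(alpha, beta, xi):
    """3x3 matrix alpha*|xi|^2*delta_ij + beta*xi_i*xi_j."""
    n2 = sum(x * x for x in xi)
    return [
        [alpha * n2 * (1.0 if i == j else 0.0) + beta * xi[i] * xi[j] for j in range(3)]
        for i in range(3)
    ]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def predicted_det(alpha, beta, xi):
    """Closed form alpha^2 * (alpha + beta) * |xi|^6."""
    n2 = sum(x * x for x in xi)
    return alpha ** 2 * (alpha + beta) * n2 ** 3


if __name__ == "__main__":
    alpha, beta, xi = 1.0, 2.0, (1.0, 2.0, 3.0)
    print(det3(symbol_matrix(alpha, beta, xi)), predicted_det(alpha, beta, xi))
```

The conditions $\alpha > 0$ and $\alpha + \beta > 0$ are exactly what make this determinant nonzero for $\xi \ne 0$, so the symbol is invertible away from the origin.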

By using the cut-off function, we can easily deduce the following proposition from (Ap.3) and Proposition Ap.2. Proposition Ap.3. Let 1 < p < ∞ and let m be an integer = 0. Suppose that u ∈ Wpm+2 () and f ∈ Wpm () satisfy the equation: −α1u − β∇ div u = f in and u|∂ = 0, where α > 0 and α + β > 0. Then, the following estimate holds: k∂xm+2 ukp 5 C(m, p) {kfkm,p + kuk1,p,b } , where b is the same as in Proposition Ap.2. References 1. Agmon, S., Douglis, A., and Nirenberg, L.: Estimates near the boundary for solutions of elliptic partial differential equations satisfying general boundary conditions I. Commun. Pure Appl. Math. 12, 623–727 (1959) 2. Agmon, S., Douglis, A., and Nirenberg, L.: Estimates near the boundary for solutions of elliptic partial differential equations satisfying general boundary conditions II. ibid, 17, 35–92 (1964) 3. Calder´on, A.P.: Lebesgue spaces of differentiable functions and distributions. Proc. Symp. in Pure Math. 5, 33–49 (1961) 4. Cattabriga, L.: Su un problema al contorno relativo al sistema di equazioni di Stokes. Rend. Mat. Sem. Univ. Padova 31, 308–340 (1961) 5. Deckelnick, K.: Decay estimates for the compressible Navier–Stokes equations in unbounded domain. Math. Z. 209, 115–130 (1992) 6. Deckelnick, K.: L2 –decay for the compressible Navier–Stokes equations in unbounded domains. Commun. in Partial Differential Equations 18, 1445–1476 (1993) 7. Edmunds, D. E. and Evans W. E.: Spectral Theory and Differential Operators. Oxford: Oxford University Press, 1987 8. Fiszdon, W. and Zajaczkowski, W. M.: The initial boundary problem for the flow of a baratropic viscous fluid, global in time. Appl. Anal. 15, 91–114 (1983) 9. Fiszdon, W. and Zajaczkowski, W. M.: Existence and uniqueness of solutions of the initial boundary problem for the flow of a baratropic viscous fluid, local in time. Appl. Mech. 35, 497–516 (1983) 10. Fiszdon, W. and Zajaczkowski, W. 
M.: Existence and uniqueness of solutions of the initial boundary problem for the flow of a baratropic viscous fluid, global in time. ibid 35, 517–532 (1983) 11. Graffi, D.: II teorema di unicita` nella dinamica dei fluidi compressibili. J. Rat. Mech. Anal. 2, 99–106 (1953) 12. H¨ormander, L.: The analysis of linear partial differential operators I. Grund. math. Wiss. 256, Berlin– Heidelberg–New York: Springer–Verlag, 1983


13. Itaya, N.: On the Caucy problem for the system of fundamental equations describing the movement of compressible viscous fluid. K˜odai. Math. Sem. Rep. 23, 60–120 (1971) 14. Itaya, N.: On the initial value problem of motion of compressible viscous fluid, especially on the problem of uniqueness. J. Math. Kyoto Univ. 16, 413–427 (1976) 15. Iwashita, H.: Lq −Lr estimates for solutions of the nonstationary Stokes equations in an exterior domain and the Navier–Stokes initial value problems in Lq spaces. Math. Ann. 285, 265–268 (1989) 16. Iwashita, H. and Shibata, Y.: On the analyticity of spectral functions for some exterior boundary value problem. Glas. Mat. III ser. 23, 291–313 (1988) 17. Kobayashi, T.: On a local energy decay of solutions for the equations of motion of viscous and heatconductive gases in an exterior domain in R3 . Tsukuba. J. Math. 21, 629–670 (1997) 18. Kobayashi, T.: On the local energy decay of higer derivatives of solutions for the equations of motion of compressible viscous and heat-conductive gases in an exterior domain in R3 . Proc. Japan Acad. Ser. A 73 (7), 126–129 (1997) 19. Kobayashi, T. and Shibata, Y.: On the Oseen equation in the exterior domains. Math. Ann. 310, 1–45 (1998) 20. Lukaszewicz, G.: An existence theorem for compressible viscous and heat conducting fluids. Math. Meth. Appl. Sci. 6, 234–247 (1984) 21. Lukaszewicz, G.: On the first initial-boundary value problem for the equations of motion of viscous and heat conducting gas. Arch. Mech. 36, 234–247 (1984) 22. Matsumura, A. and Nishida, T.: The initial value problem for the equations of motion of compressible viscous and heat-conductive fluids. Proc. Japan Acad. Ser. A 55, 337–342 (1979) 23. Matsumura, A. and Nishida, T.: The initial value problems for the equations of motion of viscous and heat-conductive gases. J. Math. Kyoto Univ. 20-1, 67–104 (1980) 24. Matsumura, A. 
and Nishida, T.: The initial boundary value problem for the equations of motion of compressible viscous and heat-conductive fluid. University of Wisconsin, MRC Technical summary Report no. 2237 (1981) 25. Matsumura, A. and Nishida, T.: Initial boundary value problems for the equations of motion of general fluids. In: Computing methods in applied sciences and engineering, V, Glowinski, R. , Lions, J. L. eds. Amsterdam–New York–Oxford: North-Holland, 1982 26. Matsumura, A. and Nishida, T.: Initial boundary value problems for the equations of motion of compressible viscous and heat-conductive fluids. Commun. Math. Phys. 89, 445–464 (1983) 27. Nash, J.: Le probl`eme de Cauchy pour les e´ quations diff´erentielles d’un fluid g´en´eral. Bull. Soc. Math. France, 90, 487–497 (1962) 28. Pazy, A.: Semigroups of linear operators and applications to partial differential equations. Appl. Math. Sci. 44, New York: Springer–Verlag, 1983 29. Ponce, G.: Global existence of small solutions to a class of nonlinear evolution equations. Nonlinear. Anal. TMA. 9, 339–418 (1985) 30. Serrin, J.: On the uniqueness of compressible fluid motions. Arch. Rat. Mech. Anal. 3, 271–288 (1959) 31. Shibata, Y.: On the global existence of classical solutions of second order fully nonlinear hyperbolic equations with first order dissipation in the exterior domain. Tsukuba. J. Math. 7, 1–68 (1983) 32. Solonnikov, V. A.: Solvability of initial-boundary value problem for the equations of a viscous compressible fluid. (previously in Zap. Nauchn. Semin. Leningr. Otd. Mat. Inst. Steklova (LOMI) 56, pp. 128–142 (1976) (in Russian)) J. Sov. Math. 14, 1120–1133 (1980) 33. Str¨ohmer, G.: About the resolvent of an operator from fluid dynamics. Math. Z. 194, 183–191 (1987) 34. Str¨ohmer, G.: About compressible viscous fluid flow in a bounded region. Pacific. J. Math. 143, 359–375 (1990) 35. Tani, A.: On the first initial–boundary problem of compressible viscous fluid motion. Publ. RIMS. Kyoto Univ. 
130, 193–253 (1977) 36. Valli, A.: Uniquness theorems for compressible viscous fluids, especially when the Stokes relation holds. Bol. Unione. Mat. Ital. , Anal. Funz. Appl. (V)18-C, 317–325 (1981) 37. Valli, A.: An existence theorem for compressible viscous fluids. Ann. Mat. Pura Appl. (IV) 13 (132), 197–213, (399–400) (1982) 38. Valli, A.: An existence theorem for compressible viscous fluids. Ann. Mat. Pura Appl. (IV) 132, 399–400 (1982) 39. Valli, A.: Periodic and stationary solutions for compressible Navier–Stokes equations via a stability method. Ann. Sc. Norm. Super. Pisa (IV) 10, 607–647 (1983)


40. Valli, A. and Zajaczkowski, W. M.: Navier–Stokes equations for compressible fluids; Global existence and qualitative properties of solutions in the general case. Commun. Math. Phys. 103, 259–296 (1986) 41. Vol’pert, A. I. and Hudjaev, S. I.: On the Cauchy problem for composite systems of nonlinear differential equations. (previously in Mat. Sb. (N. S.) 87 (129), 504–528 (1972) (in Russian)) Math. USSR-Sb. 16, 517–544 (1972) Communicated by H. Araki

Commun. Math. Phys. 200, 661 – 683 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Distribution of Zeros of Random and Quantum Chaotic Sections of Positive Line Bundles*

Bernard Shiffman, Steve Zelditch

Department of Mathematics, Johns Hopkins University, Baltimore, MD 21218, USA

Received: 6 March 1998 / Accepted: 31 July 1998

Abstract: We study the limit distribution of zeros of certain sequences of holomorphic sections of high powers $L^N$ of a positive holomorphic Hermitian line bundle $L$ over a compact complex manifold $M$. Our first result concerns "random" sequences of sections. Using the natural probability measure on the space of sequences of orthonormal bases $\{S_j^N\}$ of $H^0(M, L^N)$, we show that for almost every sequence $\{S_j^N\}$, the associated sequence of zero currents $\frac{1}{N}Z_{S_j^N}$ tends to the curvature form $\omega$ of $L$. Thus, the zeros of a sequence of sections $s_N \in H^0(M, L^N)$ chosen independently and at random become uniformly distributed. Our second result concerns the zeros of quantum ergodic eigenfunctions, where the relevant orthonormal bases $\{S_j^N\}$ of $H^0(M, L^N)$ consist of eigensections of a quantum ergodic map. We show that also in this case the zeros become uniformly distributed.

1. Introduction

This paper is concerned with the limit distribution of zeros of "random" holomorphic sections and of "quantum ergodic" eigensections of powers of a positive holomorphic line bundle $L$ over a compact complex manifold $M$. To introduce our subject, let us consider the simplest case where $M = \mathbb{CP}^m$ and where $L$ is the hyperplane section bundle. As is well known, sections of $L^N$ are given by homogeneous polynomials $p_N(z_0, z_1, \dots, z_m)$ of degree $N$ on $\mathbb{C}^{m+1}$; these polynomials are called SU($m+1$) polynomials when we consider them as elements of a measure space with an SU($m+1$)-invariant Gaussian measure (see Sect. 4). We are concerned with the question: what is the limit distribution of zeros $Z_N = \{p_N = 0\} \subset M$ of a sequence $\{p_N\}$ of such polynomials as the degree $N \to \infty$? Of course, if we consider all possible sequences, then little can be said. However, if we consider only the typical behavior, then there is a simple answer: if the sequence $\{p_N\}$ is chosen independently and at random from the ensembles of homogeneous polynomials of degree $N$ and $L^2$-norm one, then the zero sets of $\{p_N\}$ almost surely become uniformly distributed with respect to the volume form induced by $\omega$.

* Research of the first author partially supported by NSF grant #DMS-9500491; research of the second author partially supported by NSF grant #DMS-9703775.

The same conclusion is true for any positive Hermitian holomorphic line bundle $(L, h)$ over any compact complex manifold $M$. In place of homogeneous polynomials of degree $N$, one now considers holomorphic sections $s_N \in H^0(M, L^N)$. The curvature form $\omega = c_1(h)$ of $h$ defines a Kähler structure on $M$, and the metrics $h$, $\omega$ provide a Hermitian inner product on $H^0(M, L^N)$. (See Eqs. (1)–(2) in Sect. 2.) We then have the notion of a "random" sequence of $L^2$-normalized sections of $H^0(M, L^N)$. Namely, we consider the probability space $(\mathcal{S}, d\mu)$, where $\mathcal{S}$ equals the product $\prod_{N=1}^\infty SH^0(M, L^N)$ of the unit spheres $SH^0(M, L^N)$ in $H^0(M, L^N)$ and $\mu$ is the product of Haar measures on these spheres. Given a sequence $s = \{s_N\} \in \mathcal{S}$, we associate the currents of integration $Z_{s_N}$ over the zero divisors of the sections $s_N$. In complex dimension 1, $Z_{s_N}$ is simply the sum of delta functions at the zeros of $s_N$. Our first result states that for a random (i.e., for almost all) $s \in \mathcal{S}$, the sequence of zeros of the sections $s_N$ is asymptotically uniformly distributed:

Theorem 1.1. For $\mu$-almost all $s = \{s_N\} \in \mathcal{S}$, $\frac{1}{N}Z_{s_N} \to \omega$ weakly in the sense of measures; in other words,
\[
\lim_{N\to\infty}\Bigl(\frac{1}{N}Z_{s_N}, \varphi\Bigr) = \int_M \omega\wedge\varphi
\]
for all continuous $(m-1, m-1)$ forms $\varphi$. In particular,
\[
\lim_{N\to\infty}\frac{1}{N}\operatorname{Vol}_{2m-2}\{z \in U : s_N(z) = 0\} = m\operatorname{Vol}_{2m}U,
\]
for $U$ open in $M$ (where $\operatorname{Vol}_k$ denotes the Riemannian $k$-volume in $(M, \omega)$).

The key ideas in the proof of Theorem 1.1 (as well as Theorem 1.2 below) are Tian's theorem [T, Z4] on approximating the metric $\omega$ using the sections of $H^0(M, L^N)$ (see Theorem 2.1) and an asymptotic estimate of the variances of $Z_{s_N}$, regarded as a current-valued random variable (Lemma 3.3).

A closely related issue is the distribution of zeros of sections $\{S_j^N\}$ forming random orthonormal bases of $H^0(M, L^N)$. Such bases are increasingly used to model orthonormal bases of quantum chaotic eigenfunctions; e.g., see [BBL, Ha, LS, NV]. The properties of these bases are very similar to those of random orthonormal bases of spherical harmonics studied in [Z1] and [V]. To study the zeros of random orthonormal bases, we introduce the probability space $(\mathcal{ONB}, d\nu)$, where $\mathcal{ONB}$ is the infinite product of the sets $\mathcal{ONB}_N$ of orthonormal bases of the spaces $H^0(M, L^N)$, and $\nu = \prod_{N=1}^\infty \nu_N$, where $\nu_N$ is Haar probability measure on $\mathcal{ONB}_N$. A point of $\mathcal{ONB}$ is thus a sequence $S = \{(S_1^N, \dots, S_{d_N}^N)\}_{N\ge 1}$ of orthonormal bases (where $d_N = \dim H^0(M, L^N)$), and we may ask whether all of the zero sets $Z_{S_j^N}$ are tending simultaneously to the uniform distribution. The answer is still essentially yes, but for technical reasons we have to delete a subsequence of relative density zero of the sections.


Theorem 1.2. For $\nu$-almost all $S = \{(S_1^N, \dots, S_{d_N}^N)\} \in \mathcal{ONB}$, we have
\[
\frac{1}{d_N}\sum_{j=1}^{d_N}\Bigl|\Bigl(\frac{1}{N}Z_{S_j^N} - \omega, \varphi\Bigr)\Bigr|^2 \to 0
\]
for all continuous $(m-1, m-1)$ forms $\varphi$. Equivalently, for each $N$ there exists a subset $\Lambda_N \subset \{1, \dots, d_N\}$ such that $\frac{\#\Lambda_N}{d_N} \to 1$ and
\[
\lim_{N\to\infty,\ j\in\Lambda_N}\frac{1}{N}Z_{S_j^N} = \omega
\]
weakly in the sense of measures.

Our final result pertains to actual quantum ergodic eigenfunctions rather than to random sections and shows that their zero divisors also become uniformly distributed in the high power limit. Recall that a quantum map is a unitary operator which "quantizes" a symplectic map on a symplectic manifold. In our setting, the symplectic manifold is the Kähler manifold $(M, \omega)$ and the map is a symplectic transformation $\chi : (M, \omega) \to (M, \omega)$. Under certain conditions, $\chi$ may be quantized as a sequence of unitary operators $U_{\chi,N}$ on $H^0(M, L^N)$. The sequence defines a semiclassical Fourier integral operator of Hermite type (or equivalently a semiclassical Toeplitz operator). For the precise definitions and conditions, we refer to [Z3]. We call $U_{\chi,N}$ a "quantum ergodic map" if $\chi$ is also an ergodic transformation of $(M, \omega)$.

Theorem 1.3. Let $(L, h) \to (M, \omega)$ be a positive Hermitian line bundle over a Kähler manifold with $c_1(h) = \omega$ and let $U_{\chi,N} : H^0(M, L^N) \to H^0(M, L^N)$ be a quantum ergodic map. Further, let $\{S_1^N, \dots, S_{d_N}^N\}$ be an orthonormal basis of eigensections of $U_{\chi,N}$. Then there exists a subsequence $\Lambda \subset \{(N, j) : N = 1, 2, 3, \dots,\ j \in \{1, \dots, d_N\}\}$ of density one such that
\[
\lim_{N\to\infty,\ (N,j)\in\Lambda}\frac{1}{N}Z_{S_j^N} = \omega
\]

weakly in the sense of measures.

This result was proved independently by Nonnenmacher-Voros [NV] in the case of the theta bundle over an elliptic curve $\mathbb{C}/\mathbb{Z}^2$. The main step is to establish the following result:

Lemma 1.4. Let $(L, h) \to (M, \omega)$ be a positive Hermitian holomorphic line bundle over a Kähler manifold $M$ with $c_1(h) = \omega$. Let $s_N \in H^0(M, L^N)$, $N = 1, 2, \dots$, be a sequence of sections with the property that $\|s_N(z)\|^2 \to 1$ in the weak* sense as $N \to \infty$. Then $\frac{1}{N}Z_{s_N} \to \omega$ weakly in the sense of measures.

The convergence hypothesis means that $\int_M \varphi(z)\|s_N(z)\|^2\,dz \to \int_M \varphi(z)\,dz$ for all $\varphi \in C^0(M)$. Our proof of Lemma 1.4 is somewhat different and more general than that of [NV], but both are based on potential theory. The lemma was motivated by an analogous result of Sodin [So] on the asymptotic equidistribution of zero sets of sequences of rational functions in one variable (see also [RSh, RSo] for the higher dimensional case); Sodin's result in turn arose from the Brolin-Lyubich Theorem in complex dynamics (cf. [FS]). The connection between Lemma 1.4 and Theorems 1.2, 1.3 will be established


in Sect. 5, the main point being that both random orthonormal bases and orthonormal bases of chaotic eigenfunctions satisfy the hypothesis of the lemma (Theorems 5.1, 5.2). We end this introduction with a brief discussion of related results. There is an extensive literature on the distribution of zeros of random polynomials, beginning with the classical papers of Bloch-Polya [BP], Littlewood-Offord [LO], Kac [Ka] and ErdosTuran [ET] on polynomials in one variable. The articles of Bleher-Di [BD] and SheppVanderbei [SV] contain recent results and further references. In addition to the mathematical literature there is a growing physics literature on zeros of random polynomials and chaotic quantum eigenfunctions, see in particular [BD, BBL, Ha, LS, NV]. As in this paper, these articles are largely concerned with the distribution of zeros in the semiclassical limit. The main theme is that the distribution of zeros of eigenfunctions of quantum maps should reflect the signature of the dynamics of the underlying classical system: in the case of ergodic quantum maps, the zeros should be uniformly distributed in the semiclassical limit while in the completely integrable case they should concentrate in a singular way. Random polynomials (or more generally sections) are believed to provide an accurate model for quantum chaotic eigenfunctions and hence there is interest in understanding how their zeros are distributed and how the zeros are correlated. To our knowledge, the prior results on distribution of zeros of random holomorphic sections only go as far as determining the average distribution. In the special case of SU(2) polynomials it is shown in [BBL] that the average distribution is uniform. Our result that the expected distribution is achieved asymptotically by almost every sequence of sections appears to be new even in that case. Regarding zeros of quantum ergodic eigenfunctions, the only prior rigorous result appears to be that of [NV] mentioned above. 
We should also mention the study of the zeros of certain sections of positive line bundles in the almost complex setting which has recently been made by Donaldson [D]; the relevant zero sets were also shown to be uniformly distributed in the high power limit.
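The statement that the average distribution of zeros is uniform can be checked by hand in the simplest case, and the following Monte Carlo sketch does so for degree $N = 1$ on $\mathbb{CP}^1$ only (it says nothing about the almost-sure results for $N \to \infty$). A random section is $s(z) = a_0 + a_1 z$ in the affine coordinate $z$, with $(a_0, a_1)$ uniform on the unit sphere of $\mathbb{C}^2$; its zero is $z = -a_0/a_1$, and uniformity with respect to the Fubini-Study form of total area 1 means that the zero lies in $\{|z| < r\}$ with probability $r^2/(1+r^2)$. The function names, sample size and seed are choices made here for illustration.

```python
import random


def sample_zero_modulus_sq(rng):
    """|z|^2 for the zero z = -a0/a1 of s(z) = a0 + a1*z.

    (a0, a1) is a standard complex Gaussian vector; normalizing it would give a
    uniform point on the unit sphere of C^2, but the zero depends only on the
    ratio a0/a1, so the normalization is skipped.
    """
    a0 = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    a1 = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    return abs(a0) ** 2 / abs(a1) ** 2


def fraction_in_disc(radius, samples=20000, seed=0):
    """Empirical probability that the zero lies in the disc |z| < radius."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples) if sample_zero_modulus_sq(rng) < radius ** 2)
    return hits / samples


def fubini_study_area_of_disc(radius):
    """Fubini-Study area of {|z| < radius} in CP^1, normalized to total area 1."""
    return radius ** 2 / (1.0 + radius ** 2)


if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0):
        print(r, fraction_in_disc(r), fubini_study_area_of_disc(r))
```

The empirical fractions should agree with the Fubini-Study areas up to sampling error of order $1/\sqrt{\text{samples}}$.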

2. Background

We begin by introducing some terminology and basic properties of orthonormal bases of holomorphic sections of powers of a positive line bundle.

2.1. Notation. Throughout this paper, we let $L$ denote an ample holomorphic line bundle over an $m$-dimensional compact complex (projective) manifold $M$. We denote the space of global holomorphic sections of $L$ by $H^0(M, L)$. We let $\mathcal{D}^{p,q}(M)$ denote the space of $C^\infty$ $(p, q)$-forms on $M$, and we let $\mathcal{D}'^{\,p,q}(M) = \mathcal{D}^{m-p,m-q}(M)'$ denote the space of $(p, q)$-currents on $M$; $(T, \varphi) = T(\varphi)$ denotes the pairing of $T \in \mathcal{D}'^{\,p,q}(M)$ and $\varphi \in \mathcal{D}^{m-p,m-q}(M)$. If $L$ has a smooth Hermitian metric $h$, its curvature form $c_1(h) \in \mathcal{D}^{1,1}(M)$ is given locally by
\[
c_1(h) = -\frac{\sqrt{-1}}{\pi}\,\partial\bar\partial\log\|e_L\|_h,
\]
where $e_L$ is a nonvanishing local holomorphic section of $L$, and $\|e_L\|_h = h(e_L, e_L)^{1/2}$ denotes the $h$-norm of $e_L$. The curvature form $c_1(h)$ is a de Rham representative of the Chern class $c_1(L) \in H^2(M, \mathbb{R})$; see [GH, SS]. Since $L$ is ample, we can give $L$ a metric $h$ with strictly positive curvature form, and we give $M$ the Kähler metric $\omega = c_1(h)$. Then $\int_M \omega^m = c_1(L)^m \in \mathbb{Z}^+$. Finally, we give $M$ the volume form
\[
dV = \frac{1}{c_1(L)^m}\,\omega^m, \tag{1}
\]
so that $M$ has unit volume: $\int_M dV = 1$.

This paper is concerned with the spaces $H^0(M, L^N)$ of sections of $L^N = L\otimes\cdots\otimes L$. The metric $h$ induces Hermitian metrics $h_N$ on $L^N$ given by $\|s^{\otimes N}\|_{h_N} = \|s\|_h^N$. We give $H^0(M, L^N)$ the inner product structure
\[
\langle s_1, s_2\rangle = \int_M h_N(s_1, s_2)\,dV \qquad (s_1, s_2 \in H^0(M, L^N)), \tag{2}
\]

and we write $|s| = \langle s, s\rangle^{1/2}$. We let $d_N = \dim H^0(M, L^N)$. It is well known that for $N$ sufficiently large, $d_N$ is given by the Hilbert polynomial of $L$, whose leading term is $\frac{c_1(L)^m}{m!}N^m$ (see, for example, [SS, Chapter 7]).

For a holomorphic section $s \in H^0(M, L^N)$, we let $Z_s$ denote the current of integration over the zero divisor of $s$. In a local frame $e_L^N$ for $L^N$, we can write $s = \psi e_L^N$, where $\psi$ is a holomorphic function. We recall the Poincaré–Lelong formula
\[
Z_s = \frac{\sqrt{-1}}{\pi}\,\partial\bar\partial\log|\psi| = \frac{\sqrt{-1}}{\pi}\,\partial\bar\partial\log\|s\|_{h_N} + N\omega. \tag{3}
\]
We also consider the normalized zero divisor
\[
\widetilde{Z}_s^N = \frac{1}{N}Z_s,
\]
so that the currents $\widetilde{Z}_s^N$ are de Rham representatives of $c_1(L)$, and thus
\[
\bigl(\widetilde{Z}_s^N, \omega^{m-1}\bigr) = \frac{c_1(L)^m}{m!}. \tag{4}
\]
Equation (4) says that the currents $\widetilde{Z}_s^N$ all have the same mass.

For example, we consider the hyperplane section bundle, denoted $\mathcal{O}(1)$, over $\mathbb{CP}^m$. Sections $s \in H^0(\mathbb{CP}^m, \mathcal{O}(1))$ are linear functions on $\mathbb{C}^{m+1}$; the zero divisors $Z_s$ are projective hyperplanes. The line bundle $\mathcal{O}(1)$ carries a natural metric $h_{FS}$ given by
\[
\|s\|_{h_{FS}}([w]) = \frac{|(s, w)|}{|w|}, \qquad w = (w_0, \dots, w_m) \in \mathbb{C}^{m+1},
\]
for $s \in \mathbb{C}^{m+1*} \equiv H^0(\mathbb{CP}^m, \mathcal{O}(1))$, where $|w|^2 = \sum_{j=0}^m |w_j|^2$ and $[w] \in \mathbb{CP}^m$ is the complex line through $w$. The curvature form of $h_{FS}$ is given by
\[
c_1(h_{FS}) = \omega_{FS} = \frac{\sqrt{-1}}{2\pi}\,\partial\bar\partial\log|w|^2, \tag{5}
\]
where $\omega_{FS}$ is the Fubini-Study Kähler form on $\mathbb{CP}^m$. Here, $\omega_{FS}$ is normalized so that it represents the generator of $H^2(\mathbb{CP}^m, \mathbb{Z})$. The $N$th tensor power of $\mathcal{O}(1)$ is denoted $\mathcal{O}(N)$. Elements of $H^0(\mathbb{CP}^m, \mathcal{O}(N))$ are homogeneous polynomials on $\mathbb{C}^{m+1}$ of degree $N$; hence, $\dim H^0(\mathbb{CP}^m, \mathcal{O}(N)) = \binom{N+m}{m} = \frac{1}{m!}N^m + \cdots$.

2.2. Holomorphic sections and CR holomorphic functions. The setting for our analysis is the Hardy space $H^2(X) \subset L^2(X)$, where $X \to M$ is the principal $S^1$ bundle associated to $L$. To be precise, let $L^*$ be the dual line bundle to $L$ and let $D = \{v \in L^* : h(v, v) < 1\}$


be its unit disc bundle relative to the metric induced by $h$ and let $X = \partial D = \{v \in L^* : h(v,v) = 1\}$. The positivity of $c_1(h)$ is equivalent to the disc bundle $D$ being strictly pseudoconvex in $L^*$ (see [Gr]). We let $r_\theta x = e^{i\theta}x$ ($x \in X$) denote the $S^1$ action on $X$ and denote its infinitesimal generator by $\frac{\partial}{\partial\theta}$. As the boundary of a strictly pseudoconvex domain, $X$ is a CR manifold, and the Hardy space $H^2(X)$ mentioned above is by definition the space of square integrable CR functions on $X$. Equivalently, it is the space of boundary values of holomorphic functions on $D$ which are in $L^2(X)$. The $S^1$ action on $X$ commutes with the Cauchy-Riemann operator $\bar\partial_b$; hence $H^2(X) = \bigoplus_{N=0}^\infty H^2_N(X)$, where $H^2_N(X) = \{f \in H^2(X) : f(r_\theta x) = e^{iN\theta} f(x)\}$.

A section $s$ of $L$ determines an equivariant function $\hat{s}$ on $L^*$ by the rule: $\hat{s}(z,\lambda) = (\lambda, s(z))$ ($z \in M$, $\lambda \in L^*_z$). It is clear that if $\tau \in \mathbb{C}$ then $\hat{s}(z, \tau\lambda) = \tau\hat{s}(z,\lambda)$. We will usually restrict $\hat{s}$ to $X$ and then the equivariance property takes the form: $\hat{s}(r_\theta x) = e^{i\theta}\hat{s}(x)$. Similarly, a section $s_N$ of $L^N$ determines an equivariant function $\hat{s}_N$ on $L^*$: put $\hat{s}_N(z,\lambda) = (\lambda^N, s_N(z))$, where $\lambda^N = \lambda \otimes \cdots \otimes \lambda$; then $\hat{s}_N(r_\theta x) = e^{iN\theta}\hat{s}_N(x)$. The map $s \mapsto \hat{s}$ is a unitary equivalence between $H^0(M, L^N)$ and $H^2_N(X)$.

We now recall the strong form of Tian's theorem [T] given in [Z4]:

Theorem 2.1 ([Z4]). Let $M$ be a compact complex manifold of dimension $m$ (over $\mathbb{C}$) and let $(L, h) \to M$ be a positive Hermitian holomorphic line bundle. Let $\{S_1^N, \dots, S_{d_N}^N\}$ be any orthonormal basis of $H^0(M, L^N)$ (with respect to the inner product defined above). Then there exists a complete asymptotic expansion

$$\sum_{j=1}^{d_N} \|S_j^N(z)\|^2_{h_N} = a_0 N^m + a_1(z) N^{m-1} + a_2(z) N^{m-2} + \dots$$

with $a_0 = \frac{c_1(L)^m}{m!}$ and with the lower coefficients $a_j(z)$ given by invariant polynomials in the higher derivatives of $h$. More precisely, for any $k \ge 0$,

$$\Big\| \sum_{i=1}^{d_N} \|S_i^N\|^2_{h_N} - \sum_{j<R} a_j N^{m-j} \Big\|_{C^k} \le C_{R,k}\, N^{m-R}.$$

Note that since the $S_j^N$ have unit length (as elements of $H^0(M, L^N)$), if we integrate the above asymptotic expansion over $M$ (with respect to the volume $dV$), we get simply $d_N$. Thus the integrals of the $a_j$ are the coefficients of the Hilbert polynomial of $L$. (The constant $a_0$ differs from that of [T] and [Z4], since we use here the normalized volume $dV$ on $M$.)

The canonical map

$$\Phi_N : M \to \mathbb{P}H^0(M, L^{\otimes N})^*, \qquad z \mapsto \{s \in H^0(M, L^{\otimes N}) : s(z) = 0\} \tag{6}$$

can be described in terms of an orthonormal basis $S = \{S_1^N, \dots, S_{d_N}^N\}$ by the map

$$\Phi_N^S : M \to \mathbb{CP}^{d_N - 1}, \qquad z \mapsto [S_1^N(z), \dots, S_{d_N}^N(z)]. \tag{7}$$

We shall drop the $S$ and denote the map given in (7) simply by $\Phi_N$. For $N$ sufficiently large, the sections $\{S_1^N, \dots, S_{d_N}^N\}$ do not have common zeros and (7) gives a holomorphic embedding, by the Kodaira embedding theorem; see [GH, SS]. Theorem 2.1 can be regarded as an asymptotic formula for the distortion function between the metrics $h_N$ and $\Phi_N^* h_{FS}$ on the line bundle $L^N$. It also gives the following asymptotic estimate of the Riemannian distortion of the maps $\Phi_N$:


Corollary 2.2 ([Z4]). Let $\omega_{FS}$ denote the Fubini-Study form on $\mathbb{CP}^{d_N - 1}$. Then for any $k \ge 0$,

$$\Big\| \frac{1}{N}\Phi_N^*(\omega_{FS}) - \omega \Big\|_{C^k} = O\Big(\frac{1}{N}\Big).$$

3. Zeros of Random Sections

Our first aim is to determine the expected value of the normalized zero divisor $\widetilde{Z}_s^N$ as $s$ is chosen at random from the unit sphere $SH^0(M, L^N) := \{s \in H^0(M, L^N) : |s| = 1\}$ (or equivalently as $[s] \in \mathbb{P}H^0(M, L^N)$ is chosen at random with respect to the Fubini-Study volume). As above, we fix one orthonormal basis $\{S_j^N\}$ of $H^0(M, L^N)$ and write $S_j^N = f_j e_L^N$ relative to a holomorphic frame (= nonvanishing section) $e_L$ over an open set $U \subset M$. Any section in $SH^0(M, L^N)$ may then be written as $s = \sum_{j=1}^{d_N} a_j f_j e_L^N$ with $\sum_{j=1}^{d_N} |a_j|^2 = 1$. To simplify the notation we let $f = (f_1, \dots, f_{d_N}) : U \to \mathbb{C}^{d_N}$ (which is a local representation of $\Phi_N$) and we put

$$\sum_{j=1}^{d_N} a_j f_j = \langle a, f \rangle.$$

Hence

$$\widetilde{Z}_s^N = \frac{\sqrt{-1}}{N\pi}\,\partial\bar\partial \log |\langle a, f \rangle|. \tag{8}$$

3.1. Expected distribution of zeros. We shall frequently use the notation $E(Y)$ for the expected value of a random variable $Y$ on a probability space $(\Omega, d\mu)$, i.e. $E(Y) = \int_\Omega Y\, d\mu$. We view $\widetilde{Z}_s^N$ as a $\mathcal{D}'^{1,1}(M)$-valued random variable (which we call simply a "random current") as $s$ varies over $SH^0(M, L^N)$ regarded as a probability space with the standard measure, which we denote by $\mu_N$. The expected distribution of zeros of the random section $s$ is the current $E(\widetilde{Z}_s^N) \in \mathcal{D}'^{1,1}(M)$ given by

$$\big(E(\widetilde{Z}_s^N), \varphi\big) = \int_{S^{2d_N - 1}} \big(\widetilde{Z}_s^N, \varphi\big)\, d\mu_N, \qquad \varphi \in \mathcal{D}^{m-1,m-1}(M), \tag{9}$$

where we identify $SH^0(M, L^N)$ with the unit $(2d_N - 1)$-sphere $S^{2d_N - 1} \subset \mathbb{C}^{d_N}$. In fact, we have the following simple formula for the expected zero-distribution in terms of the map $\Phi_N$ given by Eq. (7):

Lemma 3.1. For $N$ sufficiently large so that $\Phi_N$ is defined, we have:

$$E(\widetilde{Z}_s^N) = \frac{1}{N}\Phi_N^*\omega_{FS}.$$


Proof. (Lemma 3.1 is a special case of Lemma 4.3 below. We give here a short alternate proof of Lemma 3.1 which will serve as an introduction to our estimate on the variance (Lemma 3.3) to be given below.) We write

$$\omega_N = \frac{1}{N}\Phi_N^*\omega_{FS}. \tag{10}$$

In terms of our fixed orthonormal basis, we have:

$$\omega_N = \frac{\sqrt{-1}}{2\pi N}\,\partial\bar\partial \log \sum_{j=1}^{d_N} |f_j|^2 = \frac{\sqrt{-1}}{2\pi N}\,\partial\bar\partial \log |f|^2, \tag{11}$$

where $f = (f_1, \dots, f_{d_N})$ is a local representation of $\Phi_N$ as defined above. Let $\varphi$ be a smooth $(m-1, m-1)$ form, which we shall refer to as a "test form". We may assume that we have a coordinate frame for $L$ on Support $\varphi$. By (8), we must show that

$$\frac{\sqrt{-1}}{\pi N}\int_{S^{2d_N-1}}\int_M \partial\bar\partial \log|\langle a, f\rangle| \wedge \varphi\; d\mu_N(a) = (\omega_N, \varphi). \tag{12}$$

To compute the integral, we write $f = |f|u$, where $|u| \equiv 1$. Evidently, $\log|\langle a, f\rangle| = \log|f| + \log|\langle a, u\rangle|$. The first term gives

$$\frac{\sqrt{-1}}{\pi N}\int_M \partial\bar\partial \log|f| \wedge \varphi = \int_M \omega_N \wedge \varphi. \tag{13}$$

We now look at the second term. We have

$$\frac{\sqrt{-1}}{\pi}\int_{S^{2d_N-1}}\int_M \partial\bar\partial \log|\langle a, u\rangle| \wedge \varphi\; d\mu_N(a) = \frac{\sqrt{-1}}{\pi}\int_M \partial\bar\partial\Big[\int_{S^{2d_N-1}} \log|\langle a, u\rangle|\, d\mu_N(a)\Big] \wedge \varphi = 0, \tag{14}$$

since the average $\int \log|\langle a, u\rangle|\, d\mu_N(a)$ is a constant independent of $u$ for $|u| = 1$, and thus the operator $\partial\bar\partial$ kills it.

Combining Corollary 2.2 and Lemma 3.1, we obtain:

Proposition 3.2. $E(\widetilde{Z}_s^N) = \omega + O(\frac{1}{N})$; i.e., for each smooth test form $\varphi$, we have

$$E\big(\widetilde{Z}_s^N, \varphi\big) = \int_M \omega \wedge \varphi + O\Big(\frac{1}{N}\Big).$$
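The step that makes (14) vanish is that the spherical average of $\log|\langle a, u\rangle|$ does not depend on the unit vector $u$. The following Python sketch is an illustration only and is not part of the original argument: it estimates this average by Monte Carlo for two different unit vectors in $\mathbb{C}^3$ and compares both with the closed form $-\tfrac12 H_{d-1}$ (with $H_k$ the $k$-th harmonic number), which follows from the standard fact that $|a_1|^2$ is Beta$(1, d-1)$ distributed when $a$ is uniform on $S^{2d-1}$. The function names, the dimension $d = 3$, and the sample size are assumptions made for the example.

```python
import math
import random


def random_unit_vector(d, rng):
    """Uniform point on S^(2d-1) in C^d: normalize a standard complex Gaussian vector."""
    z = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(d)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in z))
    return [c / norm for c in z]


def mean_log_pairing(u, samples, rng):
    """Monte Carlo estimate of the average of log|<a, u>| over a in S^(2d-1).

    The pairing is the bilinear one used in the text: <a, u> = sum_j a_j u_j.
    """
    d = len(u)
    total = 0.0
    for _ in range(samples):
        a = random_unit_vector(d, rng)
        total += math.log(abs(sum(aj * uj for aj, uj in zip(a, u))))
    return total / samples


def expected_mean_log(d):
    """Closed form -H_(d-1)/2 for the average of log|a_1| on S^(2d-1)."""
    return -0.5 * sum(1.0 / k for k in range(1, d))


if __name__ == "__main__":
    rng = random.Random(0)
    s = 1.0 / math.sqrt(3.0)
    u_axis = [1.0, 0.0, 0.0]       # a coordinate direction
    u_mixed = [s, s * 1j, -s]      # another unit vector in C^3
    print("axis  :", mean_log_pairing(u_axis, 20000, rng))
    print("mixed :", mean_log_pairing(u_mixed, 20000, rng))
    print("exact :", expected_mean_log(3))
```

Both estimates should agree with each other and with the closed form up to Monte Carlo error, which is the independence of $u$ used in the proof.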

3.2. Variance estimate. The purpose of this section is to obtain the variance estimate we need to obtain Theorem 1.2. Let $\varphi$ be a test form. It follows from our formula for the expectation (Lemma 3.1) that the variance of $(\widetilde{Z}_s^N, \varphi)$ is given by

$$E\big((\widetilde{Z}_s^N - \omega_N, \varphi)^2\big) = E\big(|(\widetilde{Z}_s^N, \varphi) - (\omega_N, \varphi)|^2\big) = E\big((\widetilde{Z}_s^N, \varphi)^2\big) - (\omega_N, \varphi)^2. \tag{15}$$


We have the following estimate of the variance:

Lemma 3.3. Let $\varphi$ be any smooth test form. Then

$$E\big(|(\widetilde{Z}_s^N, \varphi) - (\omega_N, \varphi)|^2\big) = O\Big(\frac{1}{N^2}\Big).$$

Proof. We again let $f$ be a local representation of $\Phi_N$. Using (8) we easily obtain

$$E\big((\widetilde{Z}_s^N, \varphi)^2\big) = \frac{-1}{\pi^2 N^2}\int_M\int_M (\partial\bar\partial\varphi(z))(\partial\bar\partial\varphi(w)) \int_{S^{2d_N-1}} \log|\langle f(z), a\rangle|\,\log|\langle f(w), a\rangle|\, d\mu_N(a). \tag{16}$$

As in the previous lemma we write $f = |f|u$ with $|u| \equiv 1$. Then

$$\log|\langle f(z), a\rangle|\,\log|\langle f(w), a\rangle| = \log|f(z)|\log|f(w)| + \log|f(z)|\log|\langle u(w), a\rangle| + \log|f(w)|\log|\langle u(z), a\rangle| + \log|\langle u(w), a\rangle|\log|\langle u(z), a\rangle|.$$

The first term contributes

$$\frac{-1}{\pi^2 N^2}\int_M\int_M (\partial\bar\partial\varphi(z))(\partial\bar\partial\varphi(w)) \log|f(z)|\log|f(w)| = \frac{1}{N^2}(\varphi, \Phi_N^*\omega_{FS})^2 = (\varphi, \omega_N)^2. \tag{17}$$

The middle two terms contribute zero to the integral by (14). The lemma at hand thus comes down to the following claim:

$$\int_M\int_M (\partial\bar\partial\varphi(z))(\partial\bar\partial\varphi(w)) \int_{S^{2d_N-1}} \log|\langle u(z), a\rangle|\,\log|\langle u(w), a\rangle|\, d\mu_N(a) = O(1). \tag{18}$$

It suffices to show that

$$G_N(x, y) := \int_{S^{2d_N-1}} \log|\langle x, a\rangle|\,\log|\langle y, a\rangle|\, d\mu_N(a) = C_N + O(1) \qquad (x, y \in S^{2d_N-1}), \tag{19}$$

where $C_N$ is a constant and the $O(1)$ term is uniformly bounded on $S^{2d_N-1} \times S^{2d_N-1}$. To verify (19), we consider the Gaussian integral

$$\widetilde{G}_N(x, y) := \int_{\mathbb{C}^{d_N}} e^{-|a|^2} \log|\langle x, a\rangle|\,\log|\langle y, a\rangle|\, da. \tag{20}$$

We evaluate (20) in two different ways. First, we use spherical coordinates $a = \rho\sigma$ with $\sigma \in S^{2d_N-1}$. We have

$$\widetilde{G}_N(x, y) = \int_0^\infty\int_{S^{2d_N-1}} e^{-\rho^2}\rho^{2d_N-1}\big(\log\rho + \log|\langle x, \sigma\rangle|\big)\big(\log\rho + \log|\langle y, \sigma\rangle|\big)\, d\rho\, d\sigma, \tag{21}$$

where $d\sigma$ denotes the (non-normalized) volume element on the unit sphere. Multiplying out we get four terms. The only term that is non-constant is the term containing both $x$ and $y$. We then have

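$$\widetilde{G}_N(x, y) = C_N' + \int_0^\infty e^{-\rho^2}\rho^{2d_N-1}\, d\rho \int_{S^{2d_N-1}} \log|\langle x, \sigma\rangle|\,\log|\langle y, \sigma\rangle|\, d\sigma = C_N' + \frac{(d_N - 1)!}{2}\int_{S^{2d_N-1}} \log|\langle x, \sigma\rangle|\,\log|\langle y, \sigma\rangle|\, d\sigma.$$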

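We now evaluate $\widetilde{G}_N(x, y)$ a second way by noting that coordinates in $\mathbb{C}^{d_N}$ may be chosen so that $x = (1, 0, \dots, 0)$, $y = (\zeta_1, \zeta_2, 0, \dots, 0)$. Write $a' = (a_1, a_2)$, $\tilde{a} = (a_3, \dots, a_{d_N})$, $\zeta' = (\zeta_1, \zeta_2)$. Then the integral becomes

$$\widetilde{G}_N(x, y) = \int_{\mathbb{C}^{d_N-2}} e^{-|\tilde{a}|^2}\, d\tilde{a}\;\psi(\zeta') = \pi^{d_N-2}\psi(\zeta'), \tag{22}$$

where

$$\psi(\zeta') = \int_{\mathbb{C}^2} e^{-|a'|^2}\log|a_1|\,\log|\langle a', \zeta'\rangle|\, da' \qquad (\zeta' \in S^3 \subset \mathbb{C}^2). \tag{23}$$

(To be precise, we have a well-defined continuous map $\zeta : S^{2d_N-1} \times S^{2d_N-1} \to S^3/S^1 = \mathbb{CP}^1$ and $\psi(\zeta') = \psi(\zeta(x, y))$.) By the Cauchy-Schwarz inequality, we have

$$|\psi(\zeta')| \le \Big(\int_{\mathbb{C}^2} e^{-|a'|^2}(\log|a_1|)^2\, da'\Big)^{1/2}\Big(\int_{\mathbb{C}^2} e^{-|a'|^2}(\log|\langle a', \zeta'\rangle|)^2\, da'\Big)^{1/2} = \int_{\mathbb{C}^2} e^{-|a'|^2}(\log|a_1|)^2\, da' = C < +\infty,$$

for all $\zeta' \in S^3$; the two factors are equal because the Gaussian measure is invariant under unitary changes of coordinates. Since

$$d\sigma = \sigma(S^{2d_N-1})\, d\mu_N = \frac{2\pi^{d_N}}{(d_N - 1)!}\, d\mu_N,$$

we have

$$G_N(x, y) = \frac{1}{\pi^{d_N}}\big[\widetilde{G}_N(x, y) - C_N'\big] = \frac{1}{\pi^2}\psi(\zeta') + C_N. \tag{24}$$

Thus

$$E\big(|(\widetilde{Z}_s^N, \varphi) - (\omega_N, \varphi)|^2\big) \le \frac{C}{\pi^4 N^2}\sup\|\partial\bar\partial\varphi\|^2. \tag{25}$$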
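The passage from (21) to (24) uses two constants: the radial integral $\int_0^\infty e^{-\rho^2}\rho^{2d_N-1}\, d\rho = (d_N-1)!/2$ and the sphere volume $\sigma(S^{2d_N-1}) = 2\pi^{d_N}/(d_N-1)!$, whose product is $\pi^{d_N}$. The Python sketch below is a numerical sanity check of these constants for small dimensions, not part of the original proof. The function names, the midpoint rule, and the truncation of the integral at $\rho = 12$ are choices made for this example.

```python
import math


def radial_integral(d, upper=12.0, steps=200_000):
    """Midpoint-rule approximation of the integral of exp(-rho^2) * rho^(2d-1) over [0, upper].

    The tail beyond `upper` is negligible for the small values of d used here.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        rho = (i + 0.5) * h
        total += math.exp(-rho * rho) * rho ** (2 * d - 1)
    return total * h


def sphere_volume(d):
    """Non-normalized volume of the unit sphere S^(2d-1) in C^d."""
    return 2.0 * math.pi ** d / math.factorial(d - 1)


if __name__ == "__main__":
    for d in range(1, 6):
        numeric = radial_integral(d)
        exact = math.factorial(d - 1) / 2.0
        # The product of the two constants should reproduce pi^d,
        # the total mass of exp(-|a|^2) da on C^d.
        print(d, numeric, exact, numeric * sphere_volume(d), math.pi ** d)
```

The factor $\pi^{d_N}$ obtained this way is exactly the normalization that appears in (24).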

3.3. Almost everywhere convergence. We can now complete the proof of Theorem 1.1 on the convergence of the zero sets for a random sequence of sections of increasing degree, viewed as an element of the probability space $\mathcal{S} = \prod_{N=1}^\infty SH^0(M, L^N)$ with the measure $\mu = \prod_{N=1}^\infty \mu_N$.

Proof. Recall that we identify the unit sphere $SH^0(M, L^N) \subset H^0(M, L^N)$ with the $(2d_N - 1)$-sphere $S^{2d_N-1} \subset \mathbb{C}^{d_N}$ (using the Hermitian inner product described in Sect. 2.1); the measure $\mu_N$ is Haar probability measure on $S^{2d_N-1}$. An element in $\mathcal{S}$ will be denoted $s = \{s_N\}$. Since

$$|(\widetilde{Z}_{s_N}, \varphi)| \le (\widetilde{Z}_{s_N}, \omega^{m-1})\,\|\varphi\|_{C^0} = c_1(L)^m\,\|\varphi\|_{C^0},$$

by considering a countable $C^0$-dense family of test forms, we need only consider one test form $\varphi$. By Corollary 2.2, it suffices to show that

$$(\widetilde{Z}_{s_N} - \omega_N, \varphi) \to 0 \quad \text{almost surely.}$$

Consider the random variables

$$Y_N(s) = (\widetilde{Z}_{s_N} - \omega_N, \varphi)^2 \ge 0.$$

By Lemma 3.3,

$$\int_{\mathcal{S}} Y_N(s)\, d\mu(s) = O\Big(\frac{1}{N^2}\Big).$$

Therefore

$$\int_{\mathcal{S}} \sum_{N=1}^\infty Y_N\, d\mu = \sum_{N=1}^\infty \int_{\mathcal{S}} Y_N\, d\mu < +\infty, \tag{26}$$

and hence $Y_N \to 0$ almost surely.

Remark. Since Lemma 3.3 gives an $O(\frac{1}{N^2})$ bound on $E(Y_N)$, we have for any $\varepsilon > 0$, $\int_{\mathcal{S}} N^{1-2\varepsilon} Y_N\, d\mu = O\big(\frac{1}{N^{1+2\varepsilon}}\big)$. Thus the above proof actually shows that

$$\big|(\widetilde{Z}_{s_N}, \varphi) - (\omega, \varphi)\big| \le O\Big(\frac{1}{N^{\frac12 - \varepsilon}}\Big), \quad \text{almost surely.}$$

3.4. Zeros of random orthonormal bases. We now switch our attention to sequences of orthonormal bases and prove Theorem 1.2. We let $\mathcal{ONB} = \prod_{N=1}^\infty \mathcal{ONB}_N$ denote the space of sequences $\{(S_1^N, \dots, S_{d_N}^N) : N = 1, 2, \dots\}$, where $(S_1^N, \dots, S_{d_N}^N)$ is an element of the space $\mathcal{ONB}_N$ of orthonormal bases for $H^0(M, L^N)$. Choosing a fixed $e = \{e_j^N : j = 1, \dots, d_N,\ N = 1, 2, \dots\} \in \mathcal{ONB}$ gives the identifications $\mathcal{ONB}_N \equiv U(d_N)$ (the unitary group of rank $d_N$) and

$$\mathcal{ONB} \equiv \prod_{N=1}^\infty U(d_N). \tag{27}$$

We give $\mathcal{ONB}$ the measure

$$\nu := \prod_{N=1}^\infty \nu_N, \tag{28}$$

where $\nu_N$ is the unit-mass Haar measure on $U(d_N)$. The variance estimate of Lemma 3.3 carries over to orthonormal bases:

Lemma 3.4. For a smooth test form $\varphi$, we have

$$E\big((\widetilde{Z}^N_{S_j^N} - \omega_N, \varphi)^2\big) = O\Big(\frac{1}{N^2}\Big).$$

Proof. Let $\pi_j^N : \mathcal{ONB}_N \to SH^0(M, L^N)$ denote the projection to the $j$th factor. Since $\pi^N_{j*}\nu_N = \mu_N$, we see that

$$E_{U(d_N)}\big((\widetilde{Z}^N_{S_j^N} - \omega_N, \varphi)^2\big) = E_{S^{2d_N-1}}\big((\widetilde{Z}^N_s - \omega_N, \varphi)^2\big),$$

and thus Lemmas 3.3 and 3.4 are equivalent.

The proof of Theorem 1.2 follows easily from Lemma 3.4 exactly as in the proof of Theorem 1.1. (The equivalence of the second conclusion follows from [Z2, §1.3].)


4. Zeros of SU(k) Polynomials

As an example, we apply Lemma 3.1 to the case where $M = \mathbb{CP}^m$ with the Fubini-Study Kähler form $\omega = \omega_{FS}$ and $L$ is the hyperplane section bundle $\mathcal{O}(1)$ with the standard Hermitian metric $h_{FS}$. (See Sect. 2.1; recall that the curvature $c_1(h_{FS})$ of $\mathcal{O}(1)$ is $\omega$.) We also extend Lemma 3.1 to the case of simultaneous zeros.

4.1. SU(2) polynomials. First consider $m = 1$. Elements of $H^0(M, L^N) = H^0(\mathbb{CP}^1, \mathcal{O}(N))$ are homogeneous polynomials in two variables of degree $N$, or equivalently, polynomials in one variable of degree $\le N$. A basis is given by $\sigma_j = z^j$, $j = 0, \dots, N$. The inner product in $H^0(M, L^N)$ is given by

$$\langle\sigma_j, \sigma_k\rangle = \int_{\mathbb{C}} \frac{z^j\bar{z}^k}{(1 + |z|^2)^N}\,\omega = \frac{1}{\pi}\int_{\mathbb{C}} \frac{z^j\bar{z}^k}{(1 + |z|^2)^{N+2}}\, dx\, dy.$$

Writing the integral in polar coordinates, we see that the $\sigma_j$ are orthogonal, and

$$|\sigma_j|^2 = 2\int_0^\infty \frac{r^{2j+1}}{(1 + r^2)^{N+2}}\, dr = \frac{1}{(N+1)\binom{N}{j}}.$$

We thus can choose an orthonormal basis

$$S_j^N = (N+1)^{\frac12}\binom{N}{j}^{\frac12} z^j, \qquad j = 0, \dots, N. \tag{29}$$

Next, we note that

$$\sum_{j=0}^N \|S_j^N\|^2 = (1 + |z|^2)^{-N}\sum_{j=0}^N (N+1)\binom{N}{j}|z|^{2j} \equiv N + 1,$$

and thus $\omega_N = \frac{1}{N}\Phi_N^*\omega_{FS} = \omega$. We thus recover the following result of [BBL, Appendix C] on "random SU(2) polynomials":

Theorem 4.1 ([BBL]). Suppose we have a random polynomial $P(z) = c_0 + c_1 z + \cdots + c_N z^N$, where $\mathrm{Re}\, c_0, \mathrm{Im}\, c_0, \dots, \mathrm{Re}\, c_N, \mathrm{Im}\, c_N$ are independent Gaussian random variables with mean 0 and variances $E\big((\mathrm{Re}\, c_j)^2\big) = E\big((\mathrm{Im}\, c_j)^2\big) = \binom{N}{j}$. Then the expected distribution of zeros of $P$ is uniform over $\mathbb{CP}^1 \approx S^2$.

In fact, Theorem 1.1 tells us that for a random sequence of such polynomials, the distribution of zeros approaches uniformity.

4.2. SU(m+1) polynomials. We now turn to the case of polynomials in several variables. An "SU(m+1) polynomial of degree $N$" is an element of the probability space of homogeneous polynomials of degree $N$ on $\mathbb{C}^{m+1}$ with an SU(m+1)-invariant Gaussian probability measure. Recall that this space can be identified with $H^0(\mathbb{CP}^m, \mathcal{O}(N))$. We give $H^0(\mathbb{CP}^m, \mathcal{O}(N))$ the standard inner product. A basis for $H^0(\mathbb{CP}^m, \mathcal{O}(N))$ is given


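by the monomials

$$\sigma_J = z_0^{j_0}\cdots z_m^{j_m}, \qquad J = (j_0, \dots, j_m),\ |J| = N.$$

One easily sees that the $\sigma_J$ are orthogonal. We compute

$$|\sigma_J|^2 = \int_{\mathbb{CP}^m} \frac{|\sigma_J(z)|^2}{|z|^{2N}}\,\omega_{FS}^m = \int_{S^{2m+1}} |\sigma_J(z)|^2\, d\mu_{2m+1} = \frac{m!\, j_0!\cdots j_m!}{(N+m)!} \tag{30}$$

(where $\mu_{2m+1}$ is Haar probability measure on $S^{2m+1}$), by writing

$$\int_{\mathbb{C}^{m+1}} e^{-|z|^2}|\sigma_J(z)|^2\, dz = \int_{\mathbb{C}} e^{-|z_0|^2}|z_0|^{2j_0}\, dz_0 \cdots \int_{\mathbb{C}} e^{-|z_m|^2}|z_m|^{2j_m}\, dz_m.$$

Therefore, the sections

$$S_J^N := \Big(\frac{(N+m)!}{m!\, j_0!\cdots j_m!}\Big)^{\frac12} z^J$$

form an orthonormal basis for $H^0(\mathbb{CP}^m, \mathcal{O}(N))$. Furthermore

$$\sum_{|J|=N} \|S_J^N\|^2 \equiv \binom{N+m}{m}, \tag{31}$$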

since the sum is SU(m+1) invariant, hence constant, and the integral of the left side equals $\dim H^0(\mathbb{CP}^m, \mathcal{O}(N))$.

In our results on zeros, we can replace the unit sphere $SH^0(M, L^N)$ with the complex $d_N$-dimensional vector space $H^0(M, L^N)$ with the Gaussian probability measure $\frac{1}{\pi^{d_N}}e^{-|s|^2}ds$ (where $ds$ means $2d_N$-dimensional Lebesgue measure). (We continue to use the inner product structure on $H^0(M, L^N)$ introduced in Sect. 2.1.) The space of SU(m+1) polynomials of degree $N$ is by definition the space $H^0(\mathbb{CP}^m, \mathcal{O}(N))$ of homogeneous polynomials of degree $N$ in $m+1$ variables (or equivalently, polynomials in $m$ variables of degree $\le N$) with this Gaussian measure. We can use (30) to describe the space of SU(m+1) polynomials explicitly as follows. For $P \in H^0(\mathbb{CP}^m, \mathcal{O}(N))$, we write

$$P(z_0, \dots, z_m) = \sum_{|J|=N} \frac{a_J}{\sqrt{j_0!\cdots j_m!}}\, z_0^{j_0}\cdots z_m^{j_m}. \tag{32}$$

The Gaussian measure on $H^0(\mathbb{CP}^m, \mathcal{O}(N))$ is then given by

$$\frac{1}{\pi^{d_N}}e^{-|A|^2}dA, \qquad A = (a_J) \in \mathbb{C}^{d_N},$$

where $d_N = \binom{N+m}{m}$. Lemma 3.1 and (31) now tell us that if $P$ is a polynomial given by (32), with the $a_J$ being independent Gaussian random variables with mean 0 and variance 1, then the expected zero current $Z_P$ equals $N\omega_{FS}$. (This fact, which is the higher dimensional analogue of Theorem 4.1, is extended to cover simultaneous zeros in Proposition 4.5 below.) Furthermore, Theorem 1.1 yields the following:


Proposition 4.2. Suppose we have a sequence of polynomials

$$P_N(z_0, \dots, z_m) = \sum_{|J|=N} \frac{a_J^N}{\sqrt{j_0!\cdots j_m!}}\, z_0^{j_0}\cdots z_m^{j_m},$$

where the $a_J^N$ are independent Gaussian random variables with mean 0 and variance 1. Then

$$\frac{1}{N}Z_{P_N} \to \omega_{FS} \quad \text{almost surely}$$

(weakly in the sense of measures).

4.3. Expected distribution of simultaneous zeros. We take a brief detour now to generalize Lemma 3.1 and Proposition 3.2 to simultaneous zero sets of holomorphic sections. This yields a generalization (Proposition 4.5) of Theorem 4.1 to the case of simultaneous zeros of polynomials in several variables. In particular, the 0-dimensional case of Proposition 4.5 says that the simultaneous zeros of $m$ random SU(m+1) polynomials are uniformly distributed on $\mathbb{CP}^m$ with respect to the volume $\omega_{FS}^m$.

Let $1 \le \ell \le m$, and consider the Grassmannian of $\ell$-dimensional subspaces of $H^0(M, L^N)$, which we denote $G_\ell H^0(M, L^N)$. For an element $\mathcal{S} = \mathrm{Span}\{s_1, \dots, s_\ell\} \in G_\ell H^0(M, L^N)$, we let $Z_{\mathcal{S}} \in \mathcal{D}'^{\ell,\ell}$ denote the current of integration over the set $\{z \in M : s_1(z) = \cdots = s_\ell(z) = 0\}$. Note that this definition is independent of the choice of basis $\{s_j\}$ of $\mathcal{S}$; furthermore by Bertini's theorem (see [GH]), the zero sets $Z_{s_j}$ are smooth and intersect transversely for almost all $\mathcal{S}$, so we can ignore multiplicities if we wish. As before, we consider the normalized current

$$\widetilde{Z}_{\mathcal{S}}^N = \frac{1}{N^\ell}Z_{\mathcal{S}},$$

which we regard as a random current with $\mathcal{S}$ varying over the probability space $G_\ell H^0(M, L^N)$ with unit-mass Haar measure. The expected value of $\widetilde{Z}_{\mathcal{S}}^N$ is then given by the following elementary formula:

Lemma 4.3. For $N$ sufficiently large so that $\Phi_N$ is defined, we have:

$$E(\widetilde{Z}_{\mathcal{S}}^N) = \omega_N^\ell.$$

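Proof. Using our fixed orthonormal basis $\{S_j^N\}$, we can write $s_k = \sum_{j=1}^{d_N} a_{jk}S_j^N$. Let

$$\mathcal{S}^\perp = \Big\{w \in \mathbb{CP}^{d_N-1} : \sum_{j=1}^{d_N} a_{jk}w_j = 0,\ k = 1, \dots, \ell\Big\}.$$

We let $[\mathcal{S}^\perp]$ denote the current of integration over $\mathcal{S}^\perp$, regarded as a $\mathcal{D}'^{\ell,\ell}(\mathbb{CP}^{d_N-1})$-valued random variable. Since $\widetilde{Z}_{\mathcal{S}}^N = \frac{1}{N^\ell}\Phi_N^*[\mathcal{S}^\perp]$, we then have

$$E(\widetilde{Z}_{\mathcal{S}}^N) = \frac{1}{N^\ell}\Phi_N^* E([\mathcal{S}^\perp]),$$

where

$$E([\mathcal{S}^\perp]) = \int_{G_\ell\mathbb{C}^{d_N}} [\mathcal{S}^\perp]\, d\mathcal{S}.$$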


We note that $E([\mathcal{S}^\perp])$ is $U(d_N)$-invariant. It is well-known that the only $(\ell, \ell)$-currents on projective space that are invariant under the unitary group are multiples of $\omega_{FS}^\ell$; see [Sh, Lemma 3.3]. Since $(E([\mathcal{S}^\perp]), \omega^{m-\ell}) = 1$, we conclude that $E([\mathcal{S}^\perp]) = \omega_{FS}^\ell$ and thus

$$E(\widetilde{Z}_{\mathcal{S}}^N) = \frac{1}{N^\ell}\Phi_N^*\omega_{FS}^\ell = \omega_N^\ell.$$

Applying Corollary 2.2, we obtain the following generalization of Proposition 3.2:

Proposition 4.4. Let $\mathcal{S}$ be a random element of $G_\ell H^0(M, L^N)$, where $1 \le \ell \le m$. Then

$$E(\widetilde{Z}_{\mathcal{S}}^N) = \omega^\ell + O\Big(\frac{1}{N}\Big).$$

We now apply Lemma 4.3 to random SU(m+1) polynomials to obtain:

Proposition 4.5. Choose an $\ell$-tuple $P = (P_1, \dots, P_\ell)$ of SU(m+1) polynomials of degree $N$ at random. Then $E(Z_P) = N^\ell\omega_{FS}^\ell$, and in particular

$$E\,\mathrm{Vol}_{2m-2\ell}\{z \in U : P_1(z) = \cdots = P_\ell(z) = 0\} = \frac{m!}{(m-\ell)!}\, N^\ell\, \mathrm{Vol}_{2m}(U)$$

for all open subsets $U$ of $\mathbb{CP}^m$ (where $\mathrm{Vol}_k$ denotes the Riemannian $k$-volume in $(M, \omega)$).

Proof. An $\ell$-tuple of SU(m+1) polynomials is an element of the probability space $\big(H^0(\mathbb{CP}^m, \mathcal{O}(N))^\ell, dG\big)$, where $dG$ is the $\ell$-fold self-product of the Gaussian measure on $H^0(\mathbb{CP}^m, \mathcal{O}(N))$ (which, of course, is itself a Gaussian measure). By (31), we conclude as before that $\omega_N = \omega$. Let

$$\Omega = \big\{(W_1, \dots, W_\ell) \in H^0(\mathbb{CP}^m, \mathcal{O}(N))^\ell : W_1 \wedge \cdots \wedge W_\ell \ne 0\big\},$$

and let $\gamma : \Omega \to G_\ell H^0(\mathbb{CP}^m, \mathcal{O}(N))$ be the natural map. The conclusion follows from Lemma 4.3 by noting that $\gamma_*(dG)$ equals Haar measure on $G_\ell H^0(\mathbb{CP}^m, \mathcal{O}(N))$.

5. Ergodic Orthonormal Bases and Sections

We now turn to the distribution of zeros of sections which form an "ergodic orthonormal basis". As will be explained below, eigenfunctions of quantum ergodic maps form such a basis. So do random orthonormal bases. Both of these facts belong to now familiar genres of results in quantum chaos. Let us briefly recall the basic definitions and results and then prove the principal new results, Theorem 1.3 and Lemma 1.4. Proofs of the background results on ergodic bases are given in the Appendix.

5.1. The ergodic property. The weak*-convergence hypothesis of Lemma 1.4 is closely related to the following "ergodic property":


Definition. We say that $S \in \mathcal{ONB}$ has the ergodic property if

$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^N \frac{1}{d_n}\sum_{j=1}^{d_n}\Big|\int_M \varphi(z)\|S_j^n(z)\|^2_{h_n}\, dV - \bar\varphi\Big|^2 = 0, \qquad \forall\varphi \in C(M). \tag{EP}$$

Here, $\bar\varphi = \int_M \varphi\, dV$ denotes the average value of a function $\varphi$ over $M$.

As is well-known (see, for example [Z2, §1]), this property may be rephrased in the following way: Let $S = \{(S_1^N, \dots, S_{d_N}^N) : N = 1, 2, \dots\} \in \mathcal{ONB}$. Then the ergodic property (EP) is equivalent to the following weak* convergence property: There exists a subsequence $\{S_1', S_2', \dots\}$ of relative density one of the sequence $\{S_1^1, \dots, S_{d_1}^1, \dots, S_1^N, \dots, S_{d_N}^N, \dots\}$ such that

$$\int_M \varphi(z)\|S_n'(z)\|^2\, dV \to \bar\varphi, \qquad \forall\varphi \in C(M). \tag{EP$'$}$$

A subsequence $\{a_{k_n}\}$ of a sequence $\{a_n\}$ is said to have relative density one if $\lim_{n\to\infty} n/k_n = 1$. The equivalence of (EP) and (EP$'$) is a consequence of the fact that if $\{a_1, a_2, a_3, \dots\} = \{A_1^1, \dots, A_{d_1}^1, \dots, A_1^n, \dots, A_{d_n}^n, \dots\}$ is a sequence of non-negative real numbers, then the following are equivalent:

i) there exists a subsequence $\{a_{k_n}\}$ of relative density one such that $a_{k_n} \to 0$ as $n \to \infty$.
ii) $\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^N a_n = 0$.
iii) $\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^N \frac{1}{d_n}\sum_{j=1}^{d_n} A_j^n = 0$.

The equivalence of (i) and (ii) is given in [W, Theorem 1.20]. For the equivalence of (ii) and (iii), which depends on the fact that $d_n \sim n^m$, see [Z2, §1.3]. (To complete the proof that (EP) $\Rightarrow$ (EP$'$), one uses a diagonalization argument to pick a subsequence independent of $\varphi$ satisfying (EP$'$).) We first have:

Theorem 5.1. (a) A random $S \in \mathcal{ONB}$ has the ergodic property (EP), or equivalently, (EP$'$). In fact, in complex dimensions $m \ge 2$, a random $S \in \mathcal{ONB}$ has the property

$$\lim_{N\to\infty}\frac{1}{d_N}\sum_{j=1}^{d_N}\Big|\int_M \varphi\|S_j^N\|^2\, dV - \bar\varphi\Big|^2 = 0, \qquad \forall\varphi \in C(M),$$

or equivalently, for each $N$ there exists a subset $\Lambda_N \subset \{1, \dots, d_N\}$ such that $\frac{\#\Lambda_N}{d_N} \to 1$ and

$$\lim_{N\to\infty,\ j\in\Lambda_N}\int_M \varphi\|S_j^N\|^2\, dV = \bar\varphi.$$

(b) A random sequence of sections $s = \{s_1, s_2, \dots\} \in \mathcal{S}$ has a subsequence $\{s_{N_k}\}$ of relative density 1 such that

$$\int_M \varphi(z)\|s_{N_k}(z)\|^2\, dV \to \bar\varphi, \qquad \forall\varphi \in C(M).$$

In complex dimensions $m \ge 2$, the entire sequence has this property.
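The notion of a subsequence of relative density one, and the equivalence of conditions (i) and (ii) in Sect. 5.1, can be illustrated on a toy sequence. The Python sketch below is an illustration only: the sequence $a_n$ (equal to 1 when $n$ is a perfect square and 0 otherwise) and the function names are hypothetical choices for this example and do not come from the text. The sequence does not tend to zero, but its Cesàro means do, and the non-squares form a subsequence of relative density one along which $a_{k_n} = 0$.

```python
import math


def a(n):
    """Toy non-negative sequence: 1 on perfect squares, 0 elsewhere."""
    r = math.isqrt(n)
    return 1 if r * r == n else 0


def cesaro_mean(N):
    """(1/N) * sum_{n=1}^{N} a_n, the quantity in condition (ii)."""
    return sum(a(n) for n in range(1, N + 1)) / N


def density_ratio(k):
    """n / k_n for the subsequence of non-squares, evaluated at a non-square k = k_n.

    Here n is the number of non-squares in {1, ..., k}.
    """
    if a(k) == 1:
        raise ValueError("k must be a non-square index of the subsequence")
    return (k - math.isqrt(k)) / k


if __name__ == "__main__":
    for N in (10**2, 10**4, 10**6):
        # Cesaro means shrink like 1/sqrt(N); density ratios approach 1.
        print(N, cesaro_mean(N), density_ratio(N - 1))
```

Along the non-squares the sequence vanishes identically, so (i) holds, while the Cesàro means in (ii) decay like $N^{-1/2}$.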


Theorem 5.1(a) is the line-bundle analogue of Theorem (b) in [Z2] on random orthonormal combinations of eigenfunctions of positive elliptic operators with periodic bicharacteristic flow. The proof of Theorem 5.1 closely parallels those of [Z1, Z2] and strengthens them in dimensions $m \ge 2$. Details will be given in the Appendix below.

The second setting in which ergodic orthonormal bases appear is that of quantum ergodicity. We recall the following result from [Z3, Theorem B-Corollary B], which together with Lemma 1.4 yields Theorem 1.3.

Theorem 5.2 ([Z3]). Let $\{S_j^N\}$ be an orthonormal basis of eigenfunctions of an ergodic quantum map $U_{\chi,N}$ on $H^0(M, L^N)$ (as described in Theorem 1.3). Then $\{S_j^N\}$ has the ergodic property (EP), or equivalently, (EP$'$).

Theorem 5.2 belongs to a long line of results originating in the work of A. Shnirelman [Shn1] in 1974 (see also [Shn2]) on eigenfunctions of the Laplacian on compact Riemannian manifolds with ergodic geodesic flow. The definition of "quantum map" and the proof of ergodicity of eigenfunctions for ergodic quantum maps over compact Kähler manifolds is contained in [Z3], where further references can be found to the literature of quantum ergodicity. We now complete the proofs of Theorems 1.2 and 1.3 by verifying Lemma 1.4.

5.2. Proof of Lemma 1.4. Let $(L, h) \to (M, \omega)$ and $s_N \in H^0(M, L^N)$, $N = 1, 2, \dots$, be as in the hypotheses of Lemma 1.4. We write

$$u_N = \frac{1}{N}\log\|s_N(z)\|_{h_N}.$$

We shall prove that $u_N \to 0$ in $L^1(M)$. Indeed, assuming that this is the case (or $u_N \to 0$ weakly), then for any smooth test form $\varphi \in \mathcal{D}^{m-1,m-1}(M)$, we have by the Poincaré-Lelong formula (3),

$$\Big(\frac{1}{N}Z_N - \omega, \varphi\Big) = \Big(u_N, \frac{\sqrt{-1}}{\pi}\partial\bar\partial\varphi\Big) \to 0.$$

Since by (4),

$$\Big|\Big(\frac{1}{N}Z_N, \varphi\Big)\Big| \le \frac{c_1(L)^m}{m!}\sup|\varphi|,$$

the conclusion of the lemma then holds for all $C^0$ test forms $\varphi$. To show that $u_N \to 0$ in $L^1(M)$, we first observe that:

i) the functions $u_N$ are uniformly bounded above on $M$;
ii) $\limsup_{N\to\infty} u_N \le 0$.

Indeed, since $\|s_N\|^2$ converges weakly to 1, we have

$$|s_N|^2 = \int_M \|s_N\|^2_{h_N}\, dV \to 1.$$

Choose orthonormal bases $\{S_j^N\}$ and write $s_N = \sum_j a_j S_j^N$, so that $\sum_j |a_j|^2 = |s_N|^2$. By Theorem 2.1, we have

$$\|s_N(z)\|^2_{h_N} \le |s_N|^2\sum_{j=1}^{d_N}\|S_j^N(z)\|^2_{h_N} = \Big(\frac{c_1(L)^m}{m!} + o(1)\Big)N^m.$$


Hence $\|s_N(z)\|_{h_N} \le CN^{m/2}$ for some $C < \infty$ and taking the logarithm gives both statements.

Let $e_L$ be a local holomorphic frame for $L$ over $U \subset M$ and let $e_L^N$ be the corresponding frame for $L^N$. Let $g(z) = \|e_L(z)\|_h$ so that $\|e_L^N(z)\|_{h_N} = g(z)^N$. Then we may write $s_N = f_N e_L^N$ with $f_N \in \mathcal{O}(U)$ and $\|s_N\|_{h_N} = |f_N|g^N$. It is useful to consider the function

$$v_N = \frac{1}{N}\log|f_N| = u_N - \log g,$$

which is plurisubharmonic on $U$. (For the properties of plurisubharmonic functions used here, see for example, [Kl].)

To finish the proof, we follow the potential-theoretic approach used by Fornaess and Sibony [FS] in their proof of the Brolin-Lyubich theorem on the dynamics of rational functions. Let $U'$ be a relatively compact, open subset of $U$. We must show that $u_N \to 0$ (or equivalently, $v_N \to -\log g$) in $L^1(U')$. Suppose on the contrary that $u_N \not\to 0$ in $L^1(U')$. Then we can find a subsequence $\{u_{N_k}\}$ with $\|u_{N_k}\|_{L^1(U')} \ge \delta > 0$. By a standard result on subharmonic functions (see [Ho, Theorem 4.1.9]), we know that the sequence $\{v_{N_k}\}$ either converges uniformly to $-\infty$ on $U'$ or else has a subsequence which is convergent in $L^1(U')$. Let us now rule out the first possibility. If it occurred, there would exist $K > 0$ such that for $k \ge K$, $z \in U'$,

$$\frac{1}{N_k}\log\|s_{N_k}(z)\|_{h_{N_k}} \le -1. \tag{33}$$

However, (33) means that

$$\|s_{N_k}(z)\|^2_{h_{N_k}} \le e^{-2N_k} \qquad \forall z \in U',$$

which is inconsistent with the hypothesis that $\|s_{N_k}(z)\|^2_{h_{N_k}} \to 1$ in the weak* sense.

Therefore there must exist a subsequence, which we continue to denote by $\{v_{N_k}\}$, which converges in $L^1(U')$ to some $v \in L^1(U')$. By passing if necessary to a further subsequence, we may assume that $\{v_{N_k}\}$ converges pointwise almost everywhere in $U'$ to $v$, and hence by (ii) above,

$$v = \limsup_{k\to\infty} v_{N_k} = \limsup_{k\to\infty} u_{N_k} - \log g \le -\log g \qquad \text{(a.e.)}.$$

Now let

$$v^*(z) := \limsup_{w\to z} v(w)$$

be the upper-semicontinuous regularization of $v$. Then by (i) above, $v^*$ is plurisubharmonic on $U'$ and $v^* = v$ almost everywhere. Thus, $v^*(z) \le -\log g(z)$ at all points $z \in U'$.

Since $\|v_{N_k} + \log g\|_{L^1(U')} = \|u_{N_k}\|_{L^1(U')} \ge \delta > 0$, we know that $v^* \not\equiv -\log g$. Hence, for some $\varepsilon > 0$, the open set $U_\varepsilon = \{z \in U' : v^* < -\log g - \varepsilon\}$ is non-empty. Let $U''$ be a non-empty, relatively compact, open subset of $U_\varepsilon$; by Hartogs' Lemma, there exists a positive integer $K$ such that $v_{N_k}(z) \le -\log g(z) - \varepsilon/2$ for $z \in U''$, $k \ge K$; i.e.,

$$\|s_{N_k}(z)\|^2_{h_{N_k}} \le e^{-\varepsilon N_k}, \qquad z \in U'',\ k \ge K, \tag{34}$$

which contradicts the weak convergence to 1.


6. Appendix

In this Appendix, we give a proof of Theorem 5.1, closely following the proof of Proposition 2.1.4(b) in [Z2]. To simplify things, we write

$$A^\varphi_{nj}(S) = \Big|\int_M \varphi(z)\|S_j^n(z)\|^2_{h_n}\, dV - \bar\varphi\Big|^2. \tag{35}$$

In view of the isomorphism $H^0(M, L^N) \cong H^2_N(X)$, we may identify $\mathcal{ONB}$ with the space of orthonormal bases of eigenfunctions for the operator $\frac{1}{i}\frac{\partial}{\partial\theta}$ generating the $S^1$ action on $X$. Assume without loss of generality that $\varphi$ is real-valued, and consider the Toeplitz operators $T_N^\varphi = \Pi_N M_\varphi \Pi_N = \Pi_N M_\varphi : H^2_N(X) \to H^2_N(X)$, where $M_\varphi$ is multiplication by the lift of $\varphi$ to $X$, and $\Pi_N : L^2(X) \to H^2_N(X)$ is the orthogonal projection. Then $T_N^\varphi$ is a self-adjoint operator on $H^2_N(X)$, which can be identified with a Hermitian $d_N \times d_N$ matrix via the fixed basis $e$. We then have

$$A^\varphi_{nj}(S) = \big|(\varphi S_j^n, S_j^n) - \bar\varphi\big|^2 = \big|(T_n^\varphi S_j^n, S_j^n) - \bar\varphi\big|^2 = \big|(U_n^* T_n^\varphi U_n e_j^n, e_j^n) - \bar\varphi\big|^2, \tag{36}$$

where $S = \{U_N\}$, $U_N \in U(d_N) \equiv \mathcal{ONB}_N$. We have

$$\bar\varphi = \frac{1}{d_n}\int_M \sum_{j=1}^{d_n}\|e_j^n\|^2\varphi\, dV + \int_M \Big(1 - \frac{1}{d_n}\sum_{j=1}^{d_n}\|e_j^n\|^2\Big)\varphi\, dV = \frac{1}{d_n}\mathrm{Tr}\, T_n^\varphi + O\Big(\frac{1}{n}\Big), \tag{37}$$

where the last equality is by Theorem 2.1. Therefore,

$$A^\varphi_{nj}(S) = \widetilde{A}^\varphi_{nj}(S) + O\Big(\frac{1}{n}\Big), \tag{38}$$

where

$$\widetilde{A}^\varphi_{nj}(S) = \Big|(U_n^* T_n^\varphi U_n e_j^n, e_j^n) - \frac{1}{d_n}\mathrm{Tr}\, T_n^\varphi\Big|^2. \tag{39}$$

(The bound for the $O(\frac{1}{n})$ term in (38) is independent of $S$.) Note that $iT_N^\varphi$ can be identified with an element of the Lie algebra $\mathfrak{u}(d_N)$ of $U(d_N)$. Let $\mathfrak{t}(d)$ denote the Cartan subalgebra of diagonal elements in $\mathfrak{u}(d)$, and let $\|\cdot\|^2$ denote the Euclidean inner product on $\mathfrak{t}(d)$. Also let $J_d : i\mathfrak{u}(d) \to i\mathfrak{t}(d)$ denote the orthogonal projection (extracting the diagonal). Finally, let

$$\bar{J}_d(H) = \Big(\frac{1}{d}\mathrm{Tr}\, H\Big)\mathrm{Id}_d,$$

for Hermitian matrices $H \in i\mathfrak{u}(d)$. (Thus, $H = H^0 + \bar{J}_d(H)$, with $H^0$ traceless, gives us the decomposition $\mathfrak{u}(d) = \mathfrak{su}(d) \oplus \mathbb{R}$.)


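We introduce the random variables:

$$Y_n^\varphi : \mathcal{ONB} \to [0, +\infty), \qquad Y_n^\varphi(S) := \big\|J_{d_n}(U_n^* T_n^\varphi U_n) - \bar{J}_{d_n}(T_n^\varphi)\big\|^2.$$

By (38)

$$\frac{1}{d_n}Y_n^\varphi(S) = \frac{1}{d_n}\sum_{j=1}^{d_n}\widetilde{A}^\varphi_{nj}(S) = \frac{1}{d_n}\sum_{j=1}^{d_n}A^\varphi_{nj}(S) + O\Big(\frac{1}{n}\Big) \tag{40}$$

(where the $O(\frac{1}{n})$ term is independent of $S$). Thus, (EP) is equivalent to:

$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^N \frac{1}{d_n}Y_n^\varphi(S) = 0, \qquad \forall\varphi \in C(M). \tag{41}$$

The main part of the proof of (41) is to show the following asymptotic formula for the expected values of the $Y_n^\varphi$.

Lemma 6.1.

$$E(Y_n^\varphi) = \overline{\varphi^2} - (\bar\varphi)^2 + o(1).$$

Assume Lemma 6.1 for the moment. The lemma implies that (41) holds on the average; i.e.,

$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^N E\Big(\frac{1}{d_n}Y_n^\varphi\Big) = 0. \tag{42}$$

Next we note that

$$\mathrm{Var}\Big(\frac{1}{d_n}Y_n^\varphi\Big) \le \sup\Big(\frac{1}{d_n}Y_n^\varphi\Big)^2 \le \max_j \sup\big(\widetilde{A}^\varphi_{nj}\big)^2.$$

By (39),

$$\widetilde{A}^\varphi_{nj}(S) \le 4\,(U_n^* T_n^\varphi U_n e_j^n, e_j^n)^2 \le 4\sup\varphi^2,$$

and therefore

$$\mathrm{Var}\Big(\frac{1}{d_n}Y_n^\varphi\Big) \le 16\sup\varphi^4 < +\infty. \tag{43}$$

Since the variances of the independent random variables $\frac{1}{d_n}Y_n^\varphi$ are bounded, (41) follows from (42) and the Kolmogorov strong law of large numbers, which gives part (a) for general dimensions.

In dimensions $m \ge 2$, we obtain the improved conclusion as follows: From the fact that $E(\frac{1}{d_N}Y_N^\varphi) = O(\frac{1}{N^m})$ it follows that $E\big(\sum_{N=1}^\infty \frac{1}{d_N}Y_N^\varphi\big) < +\infty$ and thus $\frac{1}{d_N}Y_N^\varphi \to 0$ almost surely when $m \ge 2$. The quantity we are interested in is

$$X_N^\varphi := \frac{1}{d_N}\sum_{j=1}^{d_N}\Big|\int_M \varphi\|S_j^N\|^2\, dV - \bar\varphi\Big|^2 = \frac{1}{d_N}\sum_{j=1}^{d_N}A^\varphi_{Nj}.$$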


However, by (40),

$$\sup_{\mathcal{ONB}}\Big|X_N^\varphi - \frac{1}{d_N}Y_N^\varphi\Big| = O\Big(\frac{1}{N}\Big).$$

Hence also $X_N^\varphi \to 0$ almost surely.

To verify part (b), we note that since $E(\widetilde{A}^\varphi_{nj}) = E(\widetilde{A}^\varphi_{n1})$ for all $j$, it follows from (40) that $E(\widetilde{A}^\varphi_{n1}) = E(\frac{1}{d_n}Y_n^\varphi)$. Thus,

$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^N \widetilde{A}^\varphi_{n1} = 0, \tag{44}$$

or equivalently,

$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^N A^\varphi_{n1} = 0. \tag{45}$$

Part (b) then follows from (45) exactly as before.

It remains to prove Lemma 6.1. Denote the eigenvalues of $T_n^\varphi$ by $\lambda_1, \dots, \lambda_{d_n}$ and write

$$S_k(\lambda_1, \dots, \lambda_{d_n}) = \sum_{j=1}^{d_n}\lambda_j^k.$$

Note that

$$\mathrm{Tr}\,(T_n^\varphi)^k = S_k(\lambda_1, \dots, \lambda_{d_n}). \tag{46}$$

We shall use the following "Szegő limit theorem" due to Boutet de Monvel and Guillemin [BG, Theorem 13.13]:

Lemma 6.2 ([BG]). For $k \in \mathbb{Z}^+$, we have

$$\lim_{N\to\infty}\frac{1}{d_N}\mathrm{Tr}\,(T_N^\varphi)^k = \overline{\varphi^k}.$$

Lemma 6.1 is an immediate consequence of Lemma 6.2 and the following formula:

$$\int_{U(d)}\big\|J_d(U^* D(\vec\lambda)U) - \bar{J}_d(D(\vec\lambda))\big\|^2\, dU = \frac{S_2(\vec\lambda)}{d+1} - \frac{S_1(\vec\lambda)^2}{d(d+1)}, \tag{47}$$

where $\vec\lambda = (\lambda_1, \dots, \lambda_d) \in \mathbb{R}^d$, $D(\vec\lambda)$ denotes the diagonal matrix with entries equal to the $\lambda_j$, and integration is with respect to Haar probability measure on $U(d)$.

Proof. A proof of the identity (47) is given in [Z1, pp. 68–69] (see also [Z2]). For completeness, we provide here a simplified proof of (47) following the methods of [Z1, Z2]. Let $E(\vec\lambda)$ denote the left side of (47). Since $E(\vec\lambda)$ is a homogeneous, degree 2, symmetric polynomial in $\vec\lambda$, we can write

$$E(\vec\lambda) = c_d S_2(\vec\lambda) + c_d' S_1(\vec\lambda)^2. \tag{48}$$


Substituting $\vec\lambda = (1, \dots, 1)$ in (48) and using the fact that $E(1, \dots, 1) = 0$, we conclude that $c_d' = -c_d/d$. To find $c_d$, we substitute $\vec\lambda = (1, 0, \dots, 0)$, and write $D = D(1, 0, \dots, 0)$. For $U = (u_{jk}) \in U(d)$, we have

$$(U^* D U)_{jj} = |u_{1j}|^2, \qquad \bar{J}_d D = \frac{1}{d}\mathrm{Id}_d.$$

Therefore,

$$E(1, 0, \dots, 0) = \int_{U(d)}\sum_{j=1}^d\Big(|u_{1j}|^2 - \frac{1}{d}\Big)^2 dU = \int_{S^{2d-1}}\sum_{j=1}^d\Big(|a_j|^2 - \frac{1}{d}\Big)^2 d\mu_{2d-1}(a) = -\frac{1}{d} + d\int_{S^{2d-1}}|a_1|^4\, d\mu_{2d-1}(a),$$

where $a = (a_1, \dots, a_d) \in S^{2d-1}$ and $\mu_{2d-1}$ is unit-mass Haar measure on $S^{2d-1}$. By (30),

$$\int_{S^{2d-1}}|a_1|^4\, d\mu_{2d-1}(a) = \frac{2}{d(d+1)},$$

and therefore

$$E(1, 0, \dots, 0) = \frac{d-1}{d(d+1)}. \tag{49}$$

Substituting (49) into (48) with $c_d' = -c_d/d$, we conclude that

$$c_d = \frac{1}{d+1}.$$
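The algebra behind (47)–(49) can be checked in exact rational arithmetic. The Python sketch below is a verification aid only, not part of the original proof: it takes the fourth moment $\int |a_1|^4\, d\mu_{2d-1} = 2/(d(d+1))$ from (30) as given, evaluates $E(1, 0, \dots, 0) = -\frac1d + d\int |a_1|^4\, d\mu_{2d-1}$, and confirms that the right side of (47) reproduces this value at $\vec\lambda = (1, 0, \dots, 0)$ and vanishes at $\vec\lambda = (1, \dots, 1)$. The function names are hypothetical and introduced only for this example.

```python
from fractions import Fraction


def fourth_moment(d):
    """Integral of |a_1|^4 over S^(2d-1), taken from (30) with m + 1 = d and J = (2, 0, ..., 0)."""
    return Fraction(2, d * (d + 1))


def expectation_at_unit_vector(d):
    """E(1, 0, ..., 0) = -1/d + d * (fourth moment), as in the display before (49)."""
    return -Fraction(1, d) + d * fourth_moment(d)


def rhs_47(lams):
    """Right side of (47): S_2 / (d + 1) - S_1^2 / (d (d + 1))."""
    d = len(lams)
    s1 = sum(Fraction(x) for x in lams)
    s2 = sum(Fraction(x) ** 2 for x in lams)
    return s2 / (d + 1) - s1 * s1 / (d * (d + 1))


if __name__ == "__main__":
    for d in range(2, 8):
        unit = [1] + [0] * (d - 1)
        ones = [1] * d
        print(d, expectation_at_unit_vector(d), rhs_47(unit), rhs_47(ones))
```

For each $d$ the first two printed values coincide with $(d-1)/(d(d+1))$ and the third is zero, which is consistent with $c_d = 1/(d+1)$ and $c_d' = -c_d/d$.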

Acknowledgement. We would like to thank S. Nonnenmacher and A. Voros for sending us a copy of their paper [NV] prior to publication and to acknowledge their priority on the overlapping result. We would also like to thank W. Minicozzi for discussions of Donaldson’s paper at the outset of this work and for suggesting that we study random sequences of sections.

References

[BD] Bleher, P. and Di, X.: Correlations between zeros of a random polynomial. J. Stat. Phys. 88, 269–305 (1997)
[BP] Bloch, A. and Polya, G.: On the roots of certain algebraic equations. Proc. London Math. Soc. 33, 102–114 (1932)
[BBL] Bogomolny, E., Bohigas, O. and Leboeuf, P.: Quantum chaotic dynamics and random polynomials. J. Stat. Phys. 85, 639–679 (1996)
[BG] Boutet de Monvel, L. and Guillemin, V.: The Spectral Theory of Toeplitz Operators. Ann. Math. Studies 99, Princeton, NJ: Princeton Univ. Press, 1981
[D] Donaldson, S.: Symplectic submanifolds and almost complex geometry. J. Diff. Geom. 44, 666–705 (1996)
[ET] Erdos, P. and Turan, P.: On the distribution of roots of polynomials. Ann. Math. 51, 105–119 (1950)
[FS] Fornaess, J.E. and Sibony, N.: Complex dynamics in higher dimensions, II. Modern methods in Complex Analysis, Princeton, NJ, 1992, Ann. of Math. Stud. 137, Princeton, NJ: Princeton Univ. Press, 1995, pp. 135–182
[Gr] Grauert, H.: Über Modifikationen und exzeptionelle analytische Mengen. Math. Annalen 146, 331–368 (1962)

Distribution of Zeros of Sections of Positive Line Bundles

[GH] [Ha] [Ho] [Ka] [Kl] [LS] [LO]

[NV] [RSh] [RSo] [SV] [Sh] [SS] [Shn1] [Shn2]

[So] [T] [V] [W] [Z1] [Z2] [Z3] [Z4]

683

Griffiths, P. and Harris, J.: Principles of Algebraic Geometry. New York: Wiley-Interscience, 1978 Hannay, J.H.: Chaotic analytic zero points: Exact statistics for those of a random spin state. J. Phys. A: Math. Gen. 29, 101–105 (1996) H¨ormander, L.: The Analysis of Linear Partial Differential Operators. Grund. Math. Wiss. bf256, New York: Springer-Verlag, 1983 Kac, M.: On the average number of real roots of a random algebraic equation. Bull. Am. Math. Soc. 49, 314–320 (1943) Klimek, M.: Pluripotential Theory. Oxford: Clarendon Press, 1991 Leboeuf, P. and Shukla, P.: Universal fluctuations of zeros of chaotic wavefunctions. J. Phys. A: Math. Gen. 29, 4827–4835 (1996) Littlewood, J. and Offord, A.: On the number of real roots of random algebraic equations I, II, III. J. London Math. Soc.13, 288–295 (1938); Proc. Camb. Phil. Soc. 35, 133–148 (1939); Math. Sborn. 12, 277–286 (1943) Nonnenmacher, S. and Voros, A.: Chaotic eigenfunctions in phase space. Preprint 1997 Russakovskii, A. and Shiffman, B.: Value distribution for sequences of rational mappings and complex dynamics. Indiana Univ. Math. J. 46, 897–932 (1997) Russakovskii, A. and Sodin, M.: Equidistribution for sequences of polynomial mappings. Indiana Univ. Math. J. 44, 851–882 (1995) Shepp, L.A. and Vanderbei, R.J.: The complex zeros of random polynomials. Trans. Am. Math. Soc. 347, 4365–4384 (1995) Shiffman, B.: Applications of geometric measure theory to value distribution theory for meromorphic maps. Value-Distribution Theory, Part A, New York: Marcel-Dekker, 1974, pp. 63–96 Shiffman, B. and Sommese, A.J.: Vanishing Theorems on Complex Manifolds. Progress in Math. 56, Boston: Birkh¨auser, 1985 Shnirelman, A.I.: Ergodic properties of eigenfunctions. Usp. Mat. Nauk. 29/6, 181–182 (1974) Shnirelman, A.I.: On the asymptotic properties of eigenfunctions in the region of chaotic motion. Addendum to V. F. 
Lazutkin, KAM Theory and Semiclassical Approximations to Eigenfunctions, New York: Springer-Verlag, 1993 Sodin, M.L.: Value distribution of sequences of rational mappings. Entire and Subharmonic Functions, B. Ya. Levin, ed., Advances in Soviet Math. 11, 1992 Tian, G.: On a set of polarized K¨ahler metrics on algebraic manifolds. J. Diff. Geometry 32, 99–130 (1990) VanderKam, J.M.: L∞ norms and quantum ergodicity on the sphere. Int. Math. Res. Notices 7, 329–347 (1997) Walters, P.: An Introduction to Ergodic Theory. New York: Springer-Verlag, 1981 Zelditch, S.: Quantum ergodicity on the sphere. Commun. Math. Phys. 146, 61–71 (1992) Zelditch, S.: A random matrix model for quantum mixing. Int. Math. Res. Notices 3, 115–137 (1996) Zelditch, S.: Index and dynamics of quantized contact transformations. Annales de l’Institut Fourier (Grenoble) 47, 305–363 (1997) Zelditch, S.: Szeg¨o kernels and a theorem of Tian: Int. Math. Res. Notices 6, 317–331 (1998)

Communicated by P. Sarnak

Commun. Math. Phys. 200, 685 – 698 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

On Chern–Simons and WZW Partition Functions

Dana Stanley Fine*

Department of Mathematics, University of Massachusetts Dartmouth, North Dartmouth, MA 02747, USA. E-mail: [email protected]

Received: 2 March 1998 / Accepted: 4 August 1998

* This material is based upon work supported in part by the National Science Foundation under Grant #DMS-9307608.

Abstract: Direct analysis of the path integral reduces partition functions in Chern–Simons theory on a three-manifold $M$ with group $G$ to partition functions in a WZW model of maps from a Riemann surface $\Sigma$ to $G$. In particular, Chern–Simons theory on $S^3$, $S^1 \times \Sigma$, $B^3$ and the solid torus correspond, respectively, to the WZW model of maps from $S^2$ to $G$, the $G/G$ model for $\Sigma$, and Witten's gauged WZW path integral Ansatz for Chern–Simons states using maps from $S^2$ and from the torus to $G$. The reduction hinges on the characterization of $\mathcal{A}/\mathcal{G}_n$, the space of connections modulo those gauge transformations which are the identity at a point $n$, as itself a principal fiber bundle with affine-linear fiber.

Chern–Simons and WZW Partition Functions

Most non-perturbative accounts of the quantum Chern–Simons theory on compact manifolds follow Witten's approach [13], which combines geometric quantization on $\mathbb{R} \times \Sigma$ (where $\Sigma$ denotes a Riemann surface), surgery techniques, and certain formal properties of partition functions to calculate Chern–Simons partition functions from known quantities in conformal field theory. Path integrals serve to elucidate the formal properties of partition functions, to fix the quantization condition on the constant $k$ appearing in the action, and to describe the large-$k$ limit. Recent developments put the geometric quantization on a more rigorous mathematical basis, treat non-semi-simple groups, explore the connections to conformal field theory and axiomatize the required properties of the partition function. Papers by Walker [12] and Birmingham et al. [2] include reviews of the literature.

Several authors have directly evaluated Chern–Simons path integrals. Moore and Seiberg, in their early attempt to categorize rational conformal field theories [8], compute the path integral, in the axial gauge, for the partition function for Chern–Simons on topologically non-trivial bundles over $\mathbb{R} \times \Sigma$. Dijkgraaf and Witten [4] follow this approach to describe a map from $H^4(BG, \mathbb{Z})$ to $H^3(G, \mathbb{Z})$, the classes which index, respectively, Chern–Simons theories and WZW models. Fröhlich and King [7] treat the Minkowski-space path integral in light cone gauge, and show the Knizhnik–Zamolodchikov equation governs expectations of Wilson lines.

The aim of this paper is to use the path integral to calculate the partition function for Chern–Simons theory. We present the technique in the specific case of $S^3$, and then extend it to other manifolds. In the case of manifolds with boundary, the partition function is a Chern–Simons state. Our approach differs from those just cited in computing the path integral directly on $\mathcal{A}/\mathcal{G}_n$, the space of connections modulo those gauge transformations which are the identity in the fiber over a point $n$ in the base manifold. As it requires no gauge choice, this approach is manifestly gauge-invariant. The technique, which is independent of geometric quantization and does not require surgery, provides a direct link between the Chern–Simons path integral and the WZW path integral.

As in our treatment of Yang–Mills on Riemann surfaces [5, 6], we first characterize $\mathcal{A}/\mathcal{G}_n$ as a principal fiber bundle, in this case over $\Omega^2 G$, the space of based maps from $S^2$ to $G$. We find that the action is quadratic along the affine-linear fibers of $\mathcal{A}/\mathcal{G}_n$, and we explicitly integrate over the fibers. Our main result is that the remaining integral over the base is the path integral for a WZW model on $\Omega^2 G$. Precisely, with $\langle\ \rangle$ denoting the Chern–Simons expectation, $X \in \Omega^2 G$, and $\hat X$ any extension of $X$ to a three-manifold $B$ whose boundary is $S^2$, we have

Theorem 2.1. For a function $f$ on $\mathcal{A}/\mathcal{G}_n$ which is constant along the fibers,

$$\langle f \rangle = \frac{1}{Z_0} \int f(X)\, \exp\Big( \frac{ik}{4\pi} \int_{S^2} \partial X \wedge \bar\partial X + \frac{ik}{12\pi} \int_{B} \big(\hat X^{-1} d\hat X\big)^3 \Big)\, DX,$$

where $\bar\partial$ and $\partial$ are determined by a choice of complex structure on a small two-sphere about the north pole of $S^3$.

The organization of this paper is as follows: Section 1 characterizes $\mathcal{A}/\mathcal{G}_n$ for $S^3$ as a bundle over $\Omega^2 G$. Section 2 carries out the integration over the fibers and interprets the result in terms of a WZW path integral. Section 3 extends this result to other three-manifolds.

1. The Bundle $\mathcal{A}/\mathcal{G}_n$

We focus first on Chern–Simons theory on $S^3$, where the constructions elucidate the geometry of $\mathcal{A}/\mathcal{G}_n$. The techniques carry over with slight modifications to other manifolds, of which we will provide examples below. Let $\mathcal{A}$ denote the space of smooth connections on a principal $G$ bundle $P$ over the manifold $M = S^3$, and let $\mathcal{G}_n$ denote the group of gauge transformations which are the identity at the north pole $n \in S^3$. Since $\pi_2(G)$, which classifies $G$-bundles over $S^3$, is trivial for any Lie group, $P$ is necessarily a product bundle. We would like to calculate the expectation of a function $f$ on $\mathcal{A}/\mathcal{G}_n$, as given by the path integral

$$\langle f \rangle = \frac{1}{Z_0} \int f([A])\, e^{iS(A)}\, DA, \tag{1}$$


where $S(A)$ is the Chern–Simons action. In order to carry out the path integration, we first describe the structure of $\mathcal{A}/\mathcal{G}_n$, which is of interest in its own right.

First we define, for a given connection $A$, a corresponding element of $\Omega^2 G$. Let $e_0$ be a fixed point of the equator in $S^3$, and let $e$ be any other point on the equator. Consider the closed path $\gamma(e) : [0, 1] \to S^3$, based at $n$, defined by the longitudes through $e_0$ and $e$, traversed in that order. Figure 1 illustrates one such path. Parallel transport about $\gamma(e)$ by $A$ determines an element of holonomy acting on the fiber over $n$. Relative to a fixed element $p_n$ of the fiber over $n$, this defines a group element $X_A(e)$ for each point $e$ of the equator. Since $X_A(e_0) = 1$, and the equator is diffeomorphic to $S^2$, $X_A$ is an element of $\Omega^2 G$. Moreover, since the effect of a gauge transformation on holonomy is pointwise conjugation, and elements of $\mathcal{G}_n$ are the identity at $n$, the map $X_A$ depends on the equivalence class $[A] \in \mathcal{A}/\mathcal{G}_n$, not on its representative $A$. There is thus a well-defined map $\xi : \mathcal{A}/\mathcal{G}_n \to \Omega^2 G$ given by $[A] \mapsto X_A$.

Fig. 1. The path γ(e)

Atiyah and Jones [1] and Singer [9] have shown that for principal fiber bundles over $S^4$, $\mathcal{A}/\mathcal{G}_n$ is homotopically equivalent to $\Omega^3 G$. Their arguments would extend to justify the analogous statement for bundles over $S^3$ with $\Omega^2 G$ replacing $\Omega^3 G$. However, as we require slightly more information than the topology, we reprove this equivalence in the context of a theorem casting $\xi$ as the projection map for the fiber bundle $\mathcal{A}/\mathcal{G}_n$ over $\Omega^2 G$. Let $\Lambda^1(S^3, \mathfrak{g})$ denote the space of Lie-algebra-valued one-forms on $S^3$, and let $P_0 : \Lambda^1(S^3, \mathfrak{g}) \to \Lambda^1(S^3, \mathfrak{g})$ denote projection onto the longitudinal component. Properly speaking, this projection is ill-defined, since the longitudinal direction is not defined at the north and south poles of $S^3$. However, we can identify 1-forms on $S^3$ with 1-forms on $I \times S^2$, on which $P_0$ is unambiguously defined, with conditions on their behavior at the boundary. Thus, for instance, $\operatorname{Ker} P_0$ in the following theorem refers to 1-forms on $I \times S^2$ which vanish in the direction tangent to $I$ and which are zero on the boundary.

Theorem 1.1. The mapping $\xi$ takes $\mathcal{A}/\mathcal{G}_n$ onto $\Omega^2 G$. The fiber of $\xi$ is an affine-linear space modelled on $\operatorname{Ker} P_0$.

Proof. To see that $\operatorname{Im} \xi$ contains $\Omega^2 G$, let $X \in \Omega^2 G$. Since $\pi_2(G)$ is trivial, there is a homotopy $h$ from the constant 1 to $X$. We shall use $h$ to define a family of lifts $\tilde\gamma(e)$ of the paths $\gamma(e)$. The definition will imply that these are lifts by a connection whose image under $\xi$ is $X$.

Preliminary to defining the lift $\tilde\gamma(e)$, choose a global section $\sigma$ of $P$ and let $\tilde\gamma_0(e)$ denote $\sigma(\gamma(e))$. Let $p_n$ denote $\sigma(n)$. For definiteness, parametrize the paths on $S^3$ so that, for each point $e$ of the equator, $\gamma(\tfrac12, e)$ is the south pole. That is, for each $e$, $\gamma(t, e)$ parametrizes the longitude from the north to the south pole through $e_0$ as $t$ goes from 0 to $\tfrac12$, and it parametrizes the longitude from the south pole to the north pole through $e$ as $t$ goes from $\tfrac12$ to 1. Notice that, with the exception of the north and south poles, each point of $S^3$ is of the form $\gamma(t, e)$ for a unique choice of $t \in [\tfrac12, 1]$ and $e$ on the equator.

Over the longitude through $e_0$, which is the first part of each $\gamma(e)$, define the lift $\tilde\gamma(e)$ to agree with $\tilde\gamma_0(e)$. Then, over the longitude through $e$, use the homotopy $h$ to define the difference between $\tilde\gamma(e)$ and $\tilde\gamma_0(e)$. More precisely,

$$\tilde\gamma(t, e) = \begin{cases} \tilde\gamma_0(t, e), & t \in [0, \tfrac12], \\ \tilde\gamma_0(t, e)\, h(2t - 1, e)^{-1}, & t \in [\tfrac12, 1). \end{cases}$$

Note that each lift $\tilde\gamma(e)$ is continuous at the south pole ($t = \tfrac12$), since $h(0, e) = 1$. To ensure that $\tilde\gamma(e_0)$ is well-defined, we must assume $h(t, e_0) = 1$ for all $t \in [0, 1]$. There is no loss of generality in making this assumption; for any homotopy $h$ from 1 to $X$ there is a homotopy defined by $h(t, e_0)^{-1} h(t, e)$ having this property. Finally, let $\eta$ be any one-form whose longitudinal component is $\eta_0 = h^{-1} \frac{\partial h}{\partial t}$. Relative to the section $\sigma$, the one-form $\eta$ defines a connection, which we also denote by $\eta$, such that $\tilde\gamma_\eta = \tilde\gamma$. Then, $\xi([\eta]) = X$, since

$$p_n X_\eta(e)^{-1} = \lim_{t \to 1} \tilde\gamma(t, e) = p_n X^{-1}.$$

This completes the proof of the first statement of the theorem.

At this point it is easy to determine the fiber of $\xi$. Let $\tilde\gamma_A$ denote the family of lifts by $A$ through $p_n$ of the paths $\gamma$. If $\tau \in \operatorname{Ker} P_0$, then $\tilde\gamma_{A+\tau} = \tilde\gamma_A$; hence, $X_A = X_{A+\tau}$. Thus the fiber through $[A]$ contains $[A + \tau]$ for all $\tau \in \operatorname{Ker} P_0$. Conversely, if $[A]$ and $[B]$ are in the same fiber, so $X_A = X_B$, the lifts as above define a map $g$ by

$$\tilde\gamma_A = \tilde\gamma_B\, g^{-1}. \tag{2}$$

In this case, $g(0, e) = g(1, e) = 1$, the identity in $G$. We will use $g$ to define a gauge transformation $\phi$ such that $B = \phi^{-1} \bullet (A + \tau)$ for some $\tau \in \operatorname{Ker} P_0$. First note that $g$ defines a map $\tilde g : S^3 \to G$ which is the identity at $n$. This in turn defines a gauge transformation $\phi$ via $\phi\, \tilde\gamma_B = \tilde\gamma_B\, g^{-1}$. In this context, then, Eq. (2) says $\tilde\gamma_A = \tilde\gamma_{\phi \bullet B}$, from which the desired statement follows. Thus the fiber is exactly the set of classes of the form $[A + \tau]$ for $\tau \in \operatorname{Ker} P_0$. The proof of Theorem 1.1 is thus complete.

This result characterizes $\mathcal{A}/\mathcal{G}_n$ as a principal fiber bundle with projection $\xi$ over $\Omega^2 G$. The fiber is isomorphic to $\operatorname{Ker} P_0$. The path integral over $\mathcal{A}/\mathcal{G}_n$ thus becomes an integral over the fibers followed by an integral over the base.


2. Integration Over the Fiber

Relative to the global section $\sigma$, a connection $A$ is an element of $\Lambda^1(M, \mathfrak{g})$. The Chern–Simons action is then

$$S(A) = \frac{k}{4\pi} \int_{S^3} \Big( A \wedge dA + \frac{2}{3} A^3 \Big). \tag{3}$$

(The product $A^3$ is often written $\tfrac12 A \wedge [A \wedge A]$, where the wedge-bracket product is the wedge as forms and the Lie bracket as elements of $\mathfrak{g}$, while the plain wedge product is the wedge of forms and the inner product on $\mathfrak{g}$.) Because the fiber is linear, and the action proves to be quadratic along the fiber, we can explicitly perform the integral over the fiber. Expressing the result in terms of the map $X$ representing a point of the base and any extension $\hat X : B^3 \to G$ for which $\hat X|_{\partial B^3}$ is $X$:

Theorem 2.1. For a function $f$ on $\mathcal{A}/\mathcal{G}_n$ which is constant along the fibers,

$$\langle f \rangle = \frac{1}{Z_0} \int f(X)\, \exp\Big( \frac{ik}{4\pi} \int_{S^2} \partial X \wedge \bar\partial X + \frac{ik}{12\pi} \int_{B^3} \big(\hat X^{-1} d\hat X\big)^3 \Big)\, DX,$$

where $\bar\partial$ and $\partial$ are determined by a choice of complex structure on a small two-sphere about the north pole.

In other words, the Chern–Simons path integral directly reduces to the path integral for a WZW model of based maps from $S^2$ to $G$. For a more general function $f$, the result holds true if $f$ is replaced on the right-hand side by a certain average over the fiber, which is computable, at least when $f$ is a polynomial along the fiber.

Proof. The following three subsections present the proof of this theorem. The first, by way of preparation for the integration over the fibers, describes how to shift away a linear term appearing in the restriction to the fibers of the Chern–Simons action. Making this shift, also known as completing the square, amounts to choosing a specific origin in each fiber. The second contains the explicit evaluation of the integral over the fibers. For $f$ constant along the fibers, the integral introduces a ratio of determinants and induces an action on the base. The third subsection re-casts this result in terms of the map $X$, completing the proof of the theorem.

2.1. The origin in a fiber. The action along the fiber is $S(A + \tau)$, where $A$ is as yet arbitrary and $\tau \in \operatorname{Ker} P_0$. In general, for a three-manifold $M$ with boundary $\partial M$, and $\tau$ restricted to vanish in one direction at each point,

$$S(A + \tau) = S(A) + \frac{k}{4\pi} \Big[ \int_M \big( \tau \wedge D_A \tau + 2 F_A \wedge \tau \big) + \int_{\partial M} A \wedge \tau \Big], \tag{4}$$

since the term cubic in $\tau$ vanishes due to the restriction on $\tau$. Possible boundary contributions will become important in what follows. This action is quadratic in $\tau$. We seek a choice of $A$ to eliminate the linear term. This will serve as a choice of origin in the fiber over $X$ in $\mathcal{A}/\mathcal{G}_n$ relative to which the functional integral over the fiber is Gaussian. That there is such a choice is the import of

Proposition 2.1. For any connection $A$, there is a unique class $[\tilde A]$ of connections, smooth on $S^3 - \{n\}$, such that

1. $\xi(\tilde A) = \xi(A)$ and
2. $F_{\tilde A} \wedge \tau = 0$ for all $\tau \in \operatorname{Ker} P_0$.

Proof. On $S^3$, the second property implies $A$ is flat in the longitudinal direction. That is, $A = \hat X^{-1} d\hat X + \tilde\tau$, where $\tilde\tau \in \operatorname{Ker} P_0$ is covariantly constant along longitudes, and $\hat X$ is defined by parallel transport along longitudes:

$$\tilde\gamma_A(t, e) = \tilde\gamma_0(t, e)\, \hat X(t, e)^{-1}.$$

Here covariantly-constant and parallel transport are with respect to the given connection $A$. Equivalently, $A = \hat X \bullet C$, where $C$ is any connection on $S^2$. There is no choice of $C$ for which this $A$ is continuous at the north and south poles. By choosing $C = 0$, we guarantee continuity at the south pole. Thus take

$$\tilde A = \hat X \bullet 0 = \hat X^{-1} d\hat X$$

to represent the origin in the fiber over $X$. By construction, $\tilde\gamma_{\tilde A} = \tilde\gamma_A$, so the first property holds. In principle, the origin $[\tilde A]$ might depend on the initial connection $A$. However, if $\phi \bullet (A + \tau)$ represents some other initial connection, the above construction leads to

$$\widehat{X\phi}^{\,-1}\, d\widehat{X\phi} = \phi \bullet \tilde A.$$

Thus, $[\tilde A]$ is a well-defined choice of origin, relative to which the action on a fiber is purely quadratic. As a corollary, the map $X \mapsto [\tilde A]$ defines a section of $\mathcal{A}/\mathcal{G}_n$ over $\Omega^2 G$, provided $\mathcal{A}$ is understood to include connections, such as $\tilde A$, which are only piece-wise smooth.

Identify connections on $S^3 - \{n\}$ with connections on $B^3$. Then smooth connections on $S^3$ which lie in the fiber through $A$ correspond to connections on $B^3$ of the form $\phi \bullet (\tilde A + \tau)$, where $\tau \in \operatorname{Ker} P_0$ must satisfy

$$\tau|_{\partial B^3} = -\tilde A\big|_{\partial B^3}. \tag{5}$$

(At the origin of $B^3$, $\tau$ vanishes.) Happily, this smoothness requirement also guarantees the boundary term in Eq. (4) vanishes, so, with this choice of origin, the action along the fiber is simply

$$S(\tilde A + \tau) = S(\tilde A) + \frac{k}{4\pi} \int \tau \wedge D_A \tau,$$

where $\tilde A = \hat X^{-1} d\hat X$ and the fiber is parametrized by $\tau \in \operatorname{Ker} P_0$ for which $\tau|_{\partial} = -X^{-1} dX$.

2.2. The integration. Having chosen an origin so that the action along the fiber is purely quadratic, we ought to be able to explicitly integrate over the fiber. First however, we must account for gauge invariance. This introduces a Jacobian factor, so the integral over the fiber becomes

$$e^{iS(\tilde A)}\, {\det}^{1/2}\big(D_A^* P_0 D_A\big) \int f(\tau)\, e^{\frac{ik}{4\pi} \int \tau \wedge D_A \tau}\, D\tau.$$

The derivation of this factor, which depends on the projection from the tangent to the orbits to the orthogonal complement of the fiber, is fairly standard and strictly analogous to the derivation of the Jacobian in the case of Yang–Mills in [6, Sect. 3]. Note that we are forced to introduce explicitly a metric on $S^3$ (or at least on $\mathcal{A}$) at this point to define the orthogonal complement of the fiber.

If $f$ is constant, the integral we wish to evaluate is

$$\int e^{\frac{ik}{4\pi} \int \tau \wedge D_A \tau}\, D\tau,$$

taken over elements of $\widehat{\operatorname{Ker} P_0}$, the subspace of $\operatorname{Ker} P_0$ whose elements satisfy the boundary condition Eq. (5). The restriction of the metric to $\operatorname{Ker} P_0$ defines a decomposition $\operatorname{Ker} P_0 = \Lambda_+ \oplus \Lambda_-$ into eigenspaces of the corresponding Hodge operator. Re-writing the action on the fiber in terms of this decomposition and integrating by parts yields

$$\int_M \tau \wedge D_A \tau = 2 \int_M D_A \tau_- \wedge \tau_+ - \int_{\partial M} \tau_- \wedge \tau_+.$$

Given the condition at $n$, this means

$$\int e^{\frac{ik}{4\pi} \int_{S^3} \tau \wedge D_A \tau}\, D\tau = e^{\frac{ik}{4\pi} \int_{S^2} \partial X \wedge \bar\partial X} \int e^{\frac{ik}{2\pi} \int_{S^3} D_A \tau_- \wedge \tau_+}\, D\tau_-\, D\tau_+. \tag{6}$$

The path integral on the right-hand side should be a power of the determinant of some operator related to $D_A P_-$. Indeed, the integral over $\tau_+$ should yield a delta-function, $\delta(D_A \tau_-)$. Then the integral over $\tau_-$ would produce a factor $\det^{-1} |D_A P_-|$. In the finite dimensional analogue, we will show the required factor is in fact $\det^{-1/2} P_- D_A^* D_A P_-$.

To that end, let $V$ be a vector space of dimension $2m$, with the decomposition $V = V_+ \oplus V_-$ into vector spaces of dimension $m$. Let $T : V \to V^*$ be a linear operator for which $T(V_\pm) = V_\mp^*$, and denote the pairing of a vector and a dual by $\langle\,,\rangle$. In this setting, the integral analogous to that on the right-hand side of Eq. (6) is $\int e^{2i \langle T v_-, v_+ \rangle}\, Dv_+\, Dv_-$, where $Dv_\pm = \frac{dv_\pm^1}{\sqrt{2\pi}} \cdots \frac{dv_\pm^m}{\sqrt{2\pi}}$. To evaluate this integral, choose a basis $e_1^-, \dots, e_m^-, e_1^+, \dots, e_m^+$ with $e_i^\pm \in V_\pm$ and let $f_-^1, \dots, f_+^m$ be the corresponding basis of the dual space. Define $T_{ij}$ by

$$T_-(e_i^-) = f_+^j\, T_{ji},$$

and similarly define components of vectors by $v_\pm = v_\pm^i e_i^\pm$. (Here and in the following we use the summation convention for roman indices.) If $T_-^\perp$ denotes the transpose of the matrix whose elements are $T_{ij}$, then we can state the following proposition evaluating the integral:

Proposition 2.2.

$$\int e^{2i \langle T v_-, v_+ \rangle}\, Dv_+\, Dv_- = {\det}^{-1/2}\, T_-^\perp T_-.$$

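Proof. Introduce a small parameter $\epsilon$ and a (basis-dependent) isomorphism between $V_+$ and its dual; namely $v_+ \mapsto \sum_i v_+^i f_+^i \equiv v_+^*$. (Equivalently, introduce a metric, work with an orthonormal basis, and use the metric to identify $V_+$ with its dual.) Then the left-hand side of the proposition may be replaced by

$$I = \lim_{\epsilon \to 0} \int e^{-\epsilon^2 \langle v_+^*, v_+ \rangle + 2i \langle T v_-, v_+ \rangle}\, Dv_+\, Dv_-.$$

Changing variables from $v_+$ to $\tilde v_+ = \epsilon v_+$ makes this

$$I = \lim_{\epsilon \to 0} \Big( \frac{1}{\epsilon} \Big)^m \int e^{-\langle \tilde v_+^*, \tilde v_+ \rangle + \frac{2i}{\epsilon} \langle T v_-, \tilde v_+ \rangle}\, D\tilde v_+\, Dv_-.$$

Using

$$\big\langle \tilde v_+^*, \tilde v_+ \big\rangle + \frac{2i}{\epsilon} \big\langle T v_-, \tilde v_+ \big\rangle - \frac{1}{\epsilon^2} \big\langle T v_-, (T v_-)^* \big\rangle = \Big\langle \tilde v_+^* + \frac{i}{\epsilon} T v_-,\ \tilde v_+ + \frac{i}{\epsilon} (T v_-)^* \Big\rangle$$

to complete the square and integrating over $\tilde v_+$ gives

$$I = \lim_{\epsilon \to 0} \Big( \frac{1}{\epsilon} \Big)^m \int e^{-\left\langle T \frac{v_-}{\epsilon},\ \left( T \frac{v_-}{\epsilon} \right)^* \right\rangle}\, Dv_-.$$

Changing variables to $\tilde v_- = v_-/\epsilon$ demonstrates that the right-hand side is independent of $\epsilon$ even before taking the limit, and we obtain

$$I = \int e^{-\langle T \tilde v_-, (T \tilde v_-)^* \rangle}\, D\tilde v_-.$$

Introducing the given basis,

$$\big\langle T \tilde v_-, (T \tilde v_-)^* \big\rangle = \sum_k \tilde v_-^i\, T_{ki} T_{kj}\, \tilde v_-^j = \tilde v_-^i \big( T_-^\perp T_- \big)_{ij} \tilde v_-^j = \sum_i \lambda_i\, \tilde v_-^i \tilde v_-^i.$$

In the last equality we have assumed, without loss of generality, that the basis was chosen so that $T_-^\perp T_-$ is diagonal, with eigenvalues $\lambda_i$. Performing the iterated Gaussian integrals over the $\tilde v_-$ thus yields

$$I = \prod_{i=1}^m (\lambda_i)^{-1/2} = {\det}^{-1/2}\, T_-^\perp T_-.$$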
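The last step of the proof, the Gaussian integral of $e^{-\langle T\tilde v_-, (T\tilde v_-)^*\rangle}$, can be illustrated numerically for $m = 2$. The sketch below is not part of the original argument. It assumes each $d\tilde v_-^i$ is normalized by $\sqrt{\pi}$, so that no constant independent of $T$ appears; with the $\sqrt{2\pi}$ normalization used in the text the value differs by a factor depending only on $m$, which would be absorbed into the overall normalization. The matrix `T` and the function names are hypothetical.

```python
import math

def gram_matrix(T):
    """Return M with M[i][j] = sum_k T[k][i] * T[k][j], i.e. (T^perp T)_{ij}."""
    m = len(T)
    return [[sum(T[k][i] * T[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def gaussian_integral_2d(M, half_width=6.0, step=0.05):
    """Riemann-sum approximation of (1/pi) * integral over R^2 of exp(-v^T M v)."""
    n = int(round(half_width / step))
    total = 0.0
    for i in range(-n, n + 1):
        x = i * step
        for j in range(-n, n + 1):
            y = j * step
            q = M[0][0] * x * x + (M[0][1] + M[1][0]) * x * y + M[1][1] * y * y
            total += math.exp(-q)
    return total * step * step / math.pi

# Illustrative 2x2 matrix standing in for the components T_{ij}.
T = [[2.0, 1.0], [0.0, 3.0]]
M = gram_matrix(T)
det_M = M[0][0] * M[1][1] - M[0][1] * M[1][0]

numeric = gaussian_integral_2d(M)
closed_form = det_M ** -0.5  # det^{-1/2} of T^perp T

if __name__ == "__main__":
    print(numeric, closed_form)
```

Under these assumptions the grid sum and $\det^{-1/2} T_-^\perp T_-$ agree to high accuracy, since a Riemann sum converges very quickly for Gaussian integrands.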

There are two obstacles to using this result in the infinite-dimensional setting in which $\det^{-1/2} T_-^\perp T_-$ is replaced by $\det^{-1/2} P_- D_A^* D_A P_-$. The first is that the integral is over $\widehat{\operatorname{Ker} P_0}$, which, due to the non-trivial condition at $n$ (Eq. (5)), is not a linear space. This obstacle is not too serious, as $\widehat{\operatorname{Ker} P_0}$ differs from the linear space $\operatorname{Ker} P_0 = \{\tau \in \operatorname{Ker} P_0 : \tau|_{\partial} = 0\}$ by translation by a constant element. That is, $\widehat{\operatorname{Ker} P_0}$ is an affine space modelled on $\operatorname{Ker} P_0$. Since translation by a constant does not affect the path integral, we will interpret the latter as the determinant of $P_- D_A^* D_A P_-|_{\operatorname{Ker} P_0}$. The remaining obstacle to applying Proposition 2.2 is that the infinite product of eigenvalues must be regularized, by, say, zeta-function regularization. The integral over the fiber is thus

$$\int e^{\frac{ik}{4\pi} \int \tau \wedge D_A \tau}\, D\tau = e^{\frac{ik}{4\pi} \int_{S^2} \partial X \wedge \bar\partial X}\ {\det}^{-1/2}\, P_- D_A^* D_A P_-, \tag{7}$$

where, in the determinant, $D_A^* = * D_A *$ is the formal adjoint of $D_A$, $P_- D_A^* D_A P_-$ is restricted to $\operatorname{Ker} P_0$, and a regularization is understood.

The integration over the fiber is now complete. Substituting the result into the path integral for the expectation of $f$, Eq. (1), yields

$$\langle f \rangle = \frac{1}{Z_0} \int f(X)\, e^{iS(\tilde A)}\, e^{\frac{ik}{4\pi} \int \partial X \wedge \bar\partial X}\ \frac{{\det}^{1/2}\, D_A^* P_0 D_A}{{\det}^{1/2}\, P_- D_A^* D_A P_-}\ \mu_{\mathrm{base}}, \tag{8}$$

where $\mu_{\mathrm{base}}$ is the measure on $\Omega^2 G$ induced by the restriction of the metric on $\mathcal{A}/\mathcal{G}_n$ to the orthogonal complement of the fiber. Descending to $\Omega^2 G$ by re-expressing each factor in terms of $X \in \Omega^2 G$ and its extension $\hat X$ will complete the proof of Theorem 2.1.

Recall that the first exponent on the right-hand side of Eq. (7) is the boundary term $\int_{S^2} \tau_+ \wedge \tau_-$ coming from $\int_{B^3} \tau \wedge D_A \tau$. The above arguments thus prove, at the level of path integral heuristics, the following more general proposition, which will arise in the case of manifolds other than $S^3$:

Proposition 2.3. Let $\tau$ be a $\mathfrak{g}$-valued one-form on $M = I \times \Sigma$, where $\Sigma$ denotes a Riemann surface. Suppose $\tau$ vanishes in the directions tangent to $I \times \{x\}$ for each $x \in \Sigma$. Then

$$\int e^{\frac{ik}{4\pi} \int \tau \wedge D_A \tau}\, D\tau = e^{\frac{ik}{4\pi} \int_{\partial M} \tau_+ \wedge \tau_-}\ {\det}^{-1/2}\, P_- D_A^* D_A P_-.$$

2.3. The descent to $\Omega^2 G$. To see how the preceding expression descends to a path integral on $\Omega^2 G$, first look at the induced action. Direct calculation for $\tilde A = \hat X^{-1} d\hat X$ shows

$$S(\tilde A) = \frac{k}{12\pi} \int_{S^3} \big( \hat X^{-1} d\hat X \big)^3.$$

Next, consider how $\mu_{\mathrm{base}}$ relates to the metric-compatible measure $DX$. The orthogonal complement of the fiber defines a connection on $\mathcal{A}/\mathcal{G}_n$ over $\Omega^2 G$. The horizontal tangent space at $[A]$, denoted $T_A^H \mathcal{A}/\mathcal{G}_n$, may be identified with $\Omega^2 \mathfrak{g}$, the tangent to the base space $\Omega^2 G$, by the map $d\xi|_H$. Thinking of $\mu_{\mathrm{base}}$ and $DX$ as coming from volume forms on $T_A^H \mathcal{A}/\mathcal{G}_n$ and $\Omega^2 G$, respectively, these volume forms are related by the factor $\det\big( d\xi|_H^*\, d\xi|_H \big)$.
To compute this determinant, introduce an orthonormal "basis" on $T_A^H \mathcal{A}/\mathcal{G}_n$ as represented by one-forms $\eta$ which are orthogonal to the gauge directions and to the fiber directions at $A$. These conditions on $\eta$ are

$$\langle D_A f, \eta \rangle = 0 \quad \text{and} \quad \eta = P_0 \eta, \tag{9}$$

where the first must hold for every Lie-algebra-valued zero-form $f$. In any metric on $S^3$ for which $g_{0z} = g_{0\bar z} = 0$, the conditions of Eq. (9) force $\eta$ to take the form

$$\eta = \frac{1}{g^{00} \sqrt{g}}\ U_A^{-1}\, \bar\eta(z, \bar z)\, U_A\, dt.$$

From here calculating the Jacobian is straightforward, though somewhat tedious. The calculation is entirely analogous to one arising in Yang–Mills on $S^2$, the details of which appear in [5, Sect. 4.3]. The result is

$$\det\big( d\xi|_H^*\, d\xi|_H \big) = \det C \big[ 1 + 2\pi C\, (\text{evaluation at } n)^* (\text{evaluation at } n) \big],$$

where $C = \int_{S^2} \frac{1}{g^{00} \sqrt{g}}$ and (evaluation at $n$) acts on Lie-algebra-valued zero-forms on $S^2$. This calculation indicates the Jacobian is a constant and hence may be absorbed into the normalization constant $Z_0$.

Finally, examine the ratio of determinants appearing in Eq. (8). Let $X$ and $Y$ be two points in the base space, and let $A_X$ and $A_Y$ be connections representing the canonical origins in the fibers over $X$ and $Y$. We shall prove

Proposition 2.4.

$$\det P_- D_{A_Y}^* D_{A_Y} P_- = \det P_- D_{A_X}^* D_{A_X} P_-,$$

and

$$\det D_{A_Y}^* P_0 D_{A_Y} = \det D_{A_X}^* P_0 D_{A_X}.$$

Our interest is in the immediate corollary,

Corollary 2.1. The ratio of determinants is constant along the base space in $\mathcal{A}/\mathcal{G}_n$.

Proof. Define $\phi \in \Omega^3 G$ by

$$\tilde\gamma_{A_X} = \tilde\gamma_{A_Y}\, \phi,$$

and note that, therefore,

$$A_Y = \phi^{-1} A_X \phi + \phi^{-1} d\phi + \tau_{XY},$$

for some $\tau_{XY} \in \operatorname{Ker} P_0$. That is, up to an element of $\operatorname{Ker} P_0$, $A_Y$ is the transformation of $A_X$ by the "gauge transformation" $\phi$, which is discontinuous at $n$. (If $X = Y$, then, as in the proof of Theorem 1.1, $\phi$ is an honest gauge transformation.) Away from the north pole, then,

$$P_- D_{A_Y}^* D_{A_Y} P_- = \mathrm{Ad}_{\phi^{-1}}\, P_- D_{A_X}^* D_{A_X} P_-\, \mathrm{Ad}_{\phi}. \tag{10}$$

Suppose now that $\tau$ is an eigenfunction of $P_- D_{A_Y}^* D_{A_Y} P_-$, so

$$P_- D_{A_Y}^* D_{A_Y} P_-\, \tau = \lambda \tau.$$

Then, since $P_-$ commutes with $\mathrm{Ad}_\phi$, Eq. (10) implies that

$$P_- D_{A_X}^* D_{A_X} P_-\, \mathrm{Ad}_\phi \tau = \lambda\, \mathrm{Ad}_\phi \tau.$$

Moreover, according to the observations preceding Eq. (7), $\tau$ should vanish at the north pole, in which case $\mathrm{Ad}_\phi \tau$ will also vanish there. Thus every eigenfunction of $P_- D_{A_Y}^* D_{A_Y} P_-$ determines an eigenfunction of $P_- D_{A_X}^* D_{A_X} P_-$ with the same eigenvalue. The argument works equally well from $X$ to $Y$, so the two operators have the same spectrum. Thus, in any regularization, they have the same determinant. This completes the proof of the first equation of the proposition; the proof of the second is almost identical.

The proof of the proposition concludes the proof of Theorem 2.1.
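The spectral step in the proof above rests on the fact that conjugate operators share their eigenvalues. A finite-dimensional sketch of that fact, not part of the original argument, follows. The matrices `op_x` and `phi` are hypothetical stand-ins for $P_- D_{A_X}^* D_{A_X} P_-$ and $\mathrm{Ad}_\phi$, and exact rational arithmetic is used so that the comparison involves no rounding.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(p):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    return [[p[1][1] / det, -p[0][1] / det],
            [-p[1][0] / det, p[0][0] / det]]

def char_poly_2x2(a):
    """Return (trace, determinant), which fix the characteristic polynomial."""
    return (a[0][0] + a[1][1], a[0][0] * a[1][1] - a[0][1] * a[1][0])

# Hypothetical stand-in for the operator at X.
op_x = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]]
# Hypothetical invertible matrix playing the role of Ad_phi.
phi = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(5)]]

# Finite-dimensional analogue of Eq. (10): op_y = phi^{-1} op_x phi.
op_y = matmul(inverse_2x2(phi), matmul(op_x, phi))

if __name__ == "__main__":
    print(char_poly_2x2(op_x), char_poly_2x2(op_y))
```

Equal trace and determinant mean equal characteristic polynomials, hence equal spectra for these 2x2 matrices. In the infinite-dimensional setting the analogous statement additionally depends on the regularization, as the text notes.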


3. Other Manifolds

The above constructions readily generalize to any manifold obtained from $I \times \Sigma$ by boundary identifications. Here "longitudinal" will mean tangent to the direction of $I$. To illustrate, we treat $B^3$, the solid torus, and $S^1 \times \Sigma$ in the following subsections.

3.1. The three-ball. The only modification required to treat the Chern–Simons path integral over connections on $B^3$ with fixed boundary value $A$ (a connection on $S^2$) is in the smoothness condition of Eq. (5). Rather than setting $\tilde A + \tau$ to 0 on the boundary, to correspond to a continuous connection on $S^3$, now set $(\tilde A + \tau)|_{\partial B^3} = A$ to implement the boundary condition on $B^3$. The new condition on $\tau$ is thus

$$\tau|_{\partial B^3} = A - X^{-1} dX.$$

As before, write each connection as $B = \tilde A + \tau$, so the path integral is

$$\int e^{ikS(B)}\, DB = \int e^{iS(\tilde A)}\, e^{\frac{ik}{4\pi} \int_{B^3} \tau \wedge D_A \tau}\, e^{\frac{ik}{4\pi} \int_{S^2} \tilde A \wedge \tau}\ {\det}^{1/2}\big( D_{\tilde A}^* P_0 D_{\tilde A} \big)\, D\tau\, \mu_{\mathrm{base}}. \tag{11}$$

Note the inclusion of the boundary term (from Eq. (4)) which vanished in the $S^3$-case. Proposition 2.3 in this context says

$$\int e^{\frac{ik}{4\pi} \int_{B^3} \tau \wedge D_A \tau}\, D\tau = e^{\frac{ik}{4\pi} \int_{S^2} \tau_+ \wedge \tau_-}\ {\det}^{-1/2}\, P_- D_A^* D_A P_-.$$

The argument to the exponential on the right-hand side is determined by the boundary condition on $\tau$ as

$$\frac{k}{4\pi} \int_{S^2} \tau_+ \wedge \tau_- = \frac{k}{4\pi} \int_{S^2} \Big( A_+ \wedge A_- + \partial X \wedge \bar\partial X - X^{-1} \partial X \wedge A_- - A_+ \wedge X^{-1} \bar\partial X \Big).$$

Likewise, the boundary term of Eq. (11) becomes

$$\frac{k}{4\pi} \int_{S^2} \tilde A \wedge \tau = \frac{k}{4\pi} \int_{S^2} X^{-1} dX \wedge \big( A - X^{-1} dX \big) = \frac{k}{4\pi} \int_{S^2} \Big( X^{-1} \partial X \wedge A_- - A_+ \wedge X^{-1} \bar\partial X \Big).$$

The arguments of Sect. 2.3 go through to recast the above as an integral over $\Omega^2 G$, yielding

$$\int e^{ikS(B)}\, DB = \int \exp\Big( \frac{ik}{4\pi} \int_{S^2} \partial X \wedge \bar\partial X + \frac{ik}{12\pi} \int_{S^3} \big( \hat X^{-1} d\hat X \big)^3 + \frac{ik}{4\pi} \int_{S^2} A_+ \wedge A_- - \frac{ik}{2\pi} \int_{S^2} A_+ \wedge X^{-1} \bar\partial X \Big)\, DX.$$

Note this agrees with Witten's Ansatz in [14] for a WZW path integral to represent a Chern–Simons state. Here we have obtained it directly from the usual path integral representation for a quantum state.
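The way the two boundary contributions of Sect. 3.1 combine can be checked pointwise with a small algebraic sketch, which is not part of the original argument. It assumes a one-form on the surface is modelled by its pair of matrix components along $dz$ (the "+" part) and $d\bar z$ (the "−" part), and that the wedge product with the inner product on $\mathfrak{g}$ is modelled by a trace; the helper names and the random integer matrices are hypothetical.

```python
import random

def matmul(a, b):
    """Product of two 2x2 integer matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def tr(a):
    return a[0][0] + a[1][1]

def plus_minus(alpha, beta):
    """Model of alpha_+ ^ beta_-: trace of (dz part of alpha)(dzbar part of beta)."""
    return tr(matmul(alpha[0], beta[1]))

def wedge(alpha, beta):
    """Model of alpha ^ beta: coefficient of dz ^ dzbar under the trace pairing."""
    return plus_minus(alpha, beta) - tr(matmul(alpha[1], beta[0]))

def random_form(rng):
    """A pair (dz component, dzbar component) of random 2x2 integer matrices."""
    return tuple([[rng.randint(-5, 5) for _ in range(2)] for _ in range(2)]
                 for _ in range(2))

def boundary_identity_gap(seed):
    """Difference between tau_+ ^ tau_- + Atilde ^ tau and its combined form."""
    rng = random.Random(seed)
    A = random_form(rng)   # boundary connection
    W = random_form(rng)   # stands for X^{-1} dX, so its "-" part is X^{-1} dbar X
    tau = (sub(A[0], W[0]), sub(A[1], W[1]))  # tau = A - X^{-1} dX on the boundary
    lhs = plus_minus(tau, tau) + wedge(W, tau)
    rhs = plus_minus(A, A) + plus_minus(W, W) - 2 * plus_minus(A, W)
    return lhs - rhs

if __name__ == "__main__":
    print([boundary_identity_gap(s) for s in range(5)])
```

Within this model the gap is identically zero, matching the combination $A_+ \wedge A_- + \partial X \wedge \bar\partial X - 2 A_+ \wedge X^{-1} \bar\partial X$ that appears, multiplied by $k/4\pi$, in the exponent of the three-ball result.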
In Theorem 2.1, $Z_0$ is the partition function for Chern–Simons on $S^3$, which we have evaluated as a WZW path integral; namely, the integral obtained by setting $f(X) = 1$ on the right-hand side. On the other hand, we can use the $B^3$ state to check this evaluation of the partition function for $S^3$. Think of $S^3$ as two copies of $B^3$, glued, with opposite orientations, along their boundaries. The $S^3$ path integral for the partition function is then the integral over a common boundary connection of two $B^3$ Chern–Simons states (with opposite orientations). This corresponds also to a path-integral computation of the square norm of the Chern–Simons state. In [14], Witten carries out this computation and arrives at precisely the WZW path integral we would obtain by setting $f(X) = 1$ on the right-hand side of Theorem 2.1.

3.2. The solid torus. Think of the solid torus as $I \times T$, where $T$ is $S^1 \times S^1$, and $\{0\} \times T$ is identified to a circle. Specifically, accomplish this identification by collapsing the second $S^1$ factor in $T$ to a point. The Chern–Simons state should thus depend on a connection $A$ on $T$, corresponding to the boundary $\{1\} \times T$, and a connection $A_0$ on $S^1$, corresponding to the boundary $\{0\} \times T$ with the identification. Analogously to the previous cases, write an arbitrary connection on $I \times T$ as $\hat X \bullet C + \tau$, where $C$ is a connection on $T$. Now, $\hat X$ denotes parallel transport from a basepoint $m$ to $x$ along a path which first goes from $m$ to a point in the circle $\{0\} \times T$. Along this segment only the $I$-component varies. The path then follows the circle, with the $I$-component fixed at 0, and finally goes to $x$, again with only the $I$-component varying. The one-form $\tau$ vanishes on the tangents to such paths. Note that, in referring to parallel transports, we are assuming the bundle over the solid torus is topologically trivial and we have fixed a global section.

To describe this path more explicitly, introduce coordinates on $I \times T$ in which $x \mapsto \big( t(x), e^{i\theta(x)}, e^{i\phi(x)} \big)$. In these coordinates the first segment of the path has the form $\big( t, e^{i\theta(m)}, e^{i\phi(m)} \big)$ with $t$ varying from $t(m)$ to 0. The second segment is $\big( 0, e^{i\theta}, e^{i\phi(m)} \big)$ as $\theta$ goes from $\theta(m)$ to $\theta(x)$, and the third is $\big( t, e^{i\theta(x)}, e^{i\phi(x)} \big)$ as $t$ goes from 0 to $t(x)$.
There is an implicit segment, coming between the second and third, of the form $\big( 0, e^{i\theta(x)}, e^{i\phi} \big)$ as $\phi$ goes from $\phi(m)$ to $\phi(x)$, but, in light of the identification, parallel transport along this segment will be the identity for any connection on $I \times T$ which corresponds to a connection on the solid torus. Any such connection must vanish along the second circle in $\{0\} \times T$. This implies that $C$ must vanish along the first circle. As before, we set $C = 0$ to define $\tilde A = \hat X^{-1} d\hat X$. Connections on the solid torus are of the form $\tilde A + \tau$ with

$$\tau|_{\{0\} \times T} = 0 \quad \text{and} \quad \tau|_{\{1\} \times T} = A - X^{-1} dX,$$

where $A$ is the fixed connection on $T$ and $X = \hat X|_{\{1\} \times T}$. Notice that $A_0 = \hat X|_{\{0\} \times T}^{-1}\, d\hat X|_{\{0\} \times T}$ does not explicitly enter the boundary condition on $\tau$. Integrating over the fiber now yields, for $M$ the solid torus,

$$\int e^{ikS(B)}\, DB = \int \exp\Big( \frac{ik}{4\pi} \int_{T} \partial X \wedge \bar\partial X + \frac{ik}{12\pi} \int_{M} \big( \hat X^{-1} d\hat X \big)^3 + \frac{ik}{4\pi} \int_{T} A_+ \wedge A_- - \frac{ik}{2\pi} \int_{T} A_+ \wedge X^{-1} \bar\partial X \Big)\, DX.$$

This is identical in form to the $B^3$ case.

3.3. The manifold $S^1 \times \Sigma$. Again begin with a connection on $I \times \Sigma$ written as $\hat X \bullet C + \tau$. Now, $\hat X(t, x)$ is parallel transport from $(0, x)$ to $(t, x)$ along the path lying in $I \times \{x\}$. Again $C$ is a connection on $\Sigma$, as identified with $\{0\} \times \Sigma$, and $\tau$ vanishes in the direction tangent to the path in $I \times \{x\}$. On the boundary,

$$\hat X\big|_{\{0\}\times\Sigma} = 1 \quad\text{and}\quad \hat X\big|_{\{1\}\times\Sigma} \equiv X,$$

where X : 6 → G. For this connection on I × 6 to correspond to one on S 1 × 6, its restrictions to {0} × 6 andR{1} × 6 must agree with each other. As before, we seek a connection A˜ for which I×6 FA˜ ∧ τ = 0 for all τ of the above form. Let A˜ {0}×6 ≡ A be an b •A. The condition for A˜ + τ arbitrary connection on 6. Extend this to S 1 × 6 via A˜ = X to correspond to a connection on S 1 × 6 agreeing with A on {0} × 6 is then τ |{0}×6 = 0 and τ |{1}×6 = A − X •A. Although we will not dwell on it, the representation of a gauge orbit by A˜ + τ again has a geometric interpretation. The space of connections on I × 6 modulo gauge transformations which are the identity on the entire boundary is a bundle over Map(6, G), with [A] 7→ X defining the projection. The fiber is an affine-linear space modelled on the space of one-forms τ . The connection A˜ represents a choice of origin relative to which the restriction of the Chern–Simons action to the fiber is purely quadratic. This picture suggests the result of integrating over the fiber in the Chern–Simons path integral will be a path integral over all X ∈ Map(6, G) and the boundary connections A. Moreover, the action induced on Map(6, G) should retain a symmetry under gauge transformations of 6. From the construction of X, it is clear these gauge transformations act simultaneously on A and, by conjugation, on X. Proceeding with the computation, the partition function is Z eikS(B) DB R Z R k k ˜ A∧τ τ ∧DA τ i 4π ˜ i 4π iS(A) {1}×6 I× 6 e e det1/2 (DA∗˜ P0 DA˜ ) Dτ µbase DA. = e The only novelty here is the integration over connections A on 6. To integrate over all connections on S 1 × 6, we first integrate over connections on I × 6 whose boundary restrictions agree on a fixed connection A and then integrate over all such A. 
Using Proposition 2.3 to evaluate the integral over the fiber, and focusing on the resulting argument to the exponential, we get the induced action
$$S_{\rm induced}(X, A) = \frac{k}{12\pi}\int_{I\times\Sigma} \bigl(\hat X^{-1} d\hat X\bigr)^3 + \frac{k}{4\pi}\int_{\Sigma} X^{-1}dX \wedge X^{-1}AX + \frac{k}{4\pi}\int_{\Sigma} X\bullet A \wedge (A - X\bullet A) + \frac{k}{4\pi}\int_{\Sigma} (A - X\bullet A)_+ \wedge (A - X\bullet A)_- .$$
The first two terms are from $S(\tilde A)$ (which now includes a boundary term), and we have substituted the boundary values of $\tilde A$ and $\tau$ into the expression from Proposition 2.3 to obtain the other two terms. This is precisely the action $S_{G/G}$ of the G/G model as presented, for example, in Eq. (4.5) of Blau and Thompson's paper [3]. (Their $g$ is $X^{-1}$.) Thus the Chern–Simons partition function on $S^1 \times \Sigma$ is the partition function of the G/G model.

The preceding statements relating partition functions in Chern–Simons theory to path integrals in various WZW models might seem specious. After all, the path integrals


are only defined up to an overall normalization constant. Notice, however, that any gauge invariant function in the Chern–Simons theory can be regarded as a function of $\tilde A + \tau$. For any such function $f$ that depends on $\tau$ polynomially (or even is analytic in $\tau$), we can explicitly compute the integral over the fiber of $f([B])e^{ikS(B)}$. Upon dividing by the partition function as computed above, we obtain a relation between the quantum expectation of $f$ in the Chern–Simons theory and the quantum expectation of a function $\hat f$ (completely determined by $f$) in the corresponding WZW model.

As an example, consider again the Chern–Simons theory on $S^1 \times \Sigma$ and let $f$ be the trace in a given representation of holonomy about the unknot $S^1 \times \{x\}$ for some $x \in \Sigma$. Then $f = \mathrm{Tr}\, X(x)$ is independent of $\tau$, so the integral over the fiber proceeds exactly as above. The result is
$$\langle f \rangle_{\text{Chern–Simons}} = \frac{1}{Z}\int \mathrm{Tr}\, X(x)\, e^{S_{G/G}(X,A)}\, \mathcal{D}X\, \mathcal{D}A.$$
Here, $Z = \int e^{S_{G/G}(X,A)}\, \mathcal{D}X\, \mathcal{D}A$, so overall constants arising from different regularizations of the path integral will cancel. Blau and Thompson use a diagonalization argument to explicitly calculate the right-hand side and check that it agrees with the expectation on the left-hand side as Witten has computed by other methods in [13].

Acknowledgement. The author wishes to thank S. Axelrod, J. Baez and I. M. Singer for helpful discussions, including the latter's suggestion that our approach to Yang–Mills might work for Chern–Simons theory.

References

1. Atiyah, M.F. and Jones, J.D.S.: Topological Aspects of Yang–Mills Theory. Commun. Math. Phys. 61, 97–118 (1978)
2. Birmingham, D., Blau, M., Rakowski, M. and Thompson, G.: Topological Field Theory. Phys. Rep. 209, 129–340 (1991)
3. Blau, M. and Thompson, G.: Derivation of the Verlinde Formula from Chern–Simons Theory and the G/G Model. Nucl. Phys. B408, 345–390 (1993)
4. Dijkgraaf, R. and Witten, E.: Topological Gauge Theories and Group Cohomology. Commun. Math. Phys. 129, 393–429 (1990)
5. Fine, D.: Yang–Mills on the Two-sphere. Commun. Math. Phys. 134, 273–292 (1990)
6. Fine, D.: Yang–Mills on a Riemann Surface. Commun. Math. Phys. 140, 321–338 (1991)
7. Fröhlich, J. and King, C.: The Chern–Simons Theory and Knot Polynomials. Commun. Math. Phys. 126, 167–199 (1989)
8. Moore, G. and Seiberg, N.: Taming the Conformal Zoo. Phys. Letts B 220, 422–430 (1989)
9. Singer, I.M.: Some Remarks on the Gribov Ambiguity. Commun. Math. Phys. 60, 7 (1978)
10. Singer, I.M.: Families of Dirac Operators with Applications to Physics. Société Mathématique de France Astérisque, hors série, 323–340 (1985)
11. Steenrod, N.: The Topology of Fibre Bundles. Princeton, NJ: Princeton U. Press, 1951
12. Walker, K.: On Witten's 3-manifold Invariants. Preprint February 1991
13. Witten, E.: Quantum Field Theory and the Jones Polynomial. Commun. Math. Phys. 121, 351–399 (1989)
    Witten, E.: On Quantum Gauge Theories in Two Dimensions. Commun. Math. Phys. 141, 153–209 (1991)
14. Witten, E.: On Holomorphic Factorization of WZW and Coset Models. Commun. Math. Phys. 144, 189–212 (1992)

Communicated by A. Jaffe

Commun. Math. Phys. 200, 699 – 722 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Extensive Properties of the Complex Ginzburg–Landau Equation

Pierre Collet1, Jean-Pierre Eckmann2,3

1 Centre de Physique Théorique, Laboratoire CNRS UPR 14, Ecole Polytechnique, F-91128 Palaiseau Cedex, France
2 Département de Physique Théorique, Université de Genève, CH-1211 Genève 4, Switzerland
3 Section de Mathématiques, Université de Genève, CH-1211 Genève 4, Switzerland

Received: 15 February 1998 / Accepted: 10 August 1998

Abstract: We study the set of solutions of the complex Ginzburg–Landau equation in $\mathbb{R}^d$, $d < 3$. We consider the global attracting set (i.e., the forward map of the set of bounded initial data), and restrict it to a cube $Q_L$ of side $L$. We cover this set by a (minimal) number $N_{Q_L}(\varepsilon)$ of balls of radius $\varepsilon$ in $L^\infty(Q_L)$. We show that the Kolmogorov $\varepsilon$-entropy per unit length, $H_\varepsilon = \lim_{L\to\infty} L^{-d} \log N_{Q_L}(\varepsilon)$, exists. In particular, we bound $H_\varepsilon$ by $O(\log(1/\varepsilon))$, which shows that the attracting set is smaller than the set of bounded analytic functions in a strip. We finally give a positive lower bound: $H_\varepsilon > O(\log(1/\varepsilon))$.

1. Introduction

In the last few years, considerable effort has been made towards a better understanding of partial differential equations of parabolic type in infinite space. A typical equation is for example the complex Ginzburg–Landau equation (CGL) on $\mathbb{R}^d$:
$$\partial_t A = (1 + i\alpha)\Delta A + A - (1 + i\beta)A|A|^2. \tag{1.1}$$

Such equations show, at least numerically, in certain parameter ranges, interesting “chaotic” behavior, and our aim here is to discuss notions of chaoticity per unit length for such systems. Our discussion will be restricted to the CGL, but it will become clear from the methods of the proofs that the results can be extended without much additional work to other problems in which high frequencies are strongly damped. A first idea which comes to mind in the context of measuring chaoticity is the notion of “dimension per unit length”. As we shall see, this quantity is a well-defined and useful concept in dynamical systems with finite-dimensional phase space. While the “standard” definition leads to infinite dimensions for finite segments of infinite systems, we shall see that an adequate definition, first introduced by Kolmogorov and Tikhomirov [KT], leads to finite bounds which measure the “complexity” of the set under study.
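The numerical behavior mentioned above can be explored with a very small experiment. The following is only a minimal sketch, assuming one space dimension, periodic boundary conditions, and a plain explicit Euler finite-difference scheme; the function names, the grid, and the step sizes are illustrative choices and are not taken from the paper. The scheme is only reasonable when `dt` is small compared with `dx**2 / (1 + alpha**2)`.

```python
import math


def cgl_step(A, dx, dt, alpha, beta):
    """One explicit Euler step of Eq. (1.1) on a periodic 1D grid (illustrative scheme)."""
    n = len(A)
    new = []
    for j in range(n):
        # Second-order centered difference for the Laplacian; A[j - 1] wraps around at j = 0.
        lap = (A[(j + 1) % n] - 2.0 * A[j] + A[j - 1]) / dx**2
        rhs = (1 + 1j * alpha) * lap + A[j] - (1 + 1j * beta) * A[j] * abs(A[j]) ** 2
        new.append(A[j] + dt * rhs)
    return new


def evolve(A, dx, dt, alpha, beta, steps):
    """Apply `steps` Euler steps and return the final grid values."""
    for _ in range(steps):
        A = cgl_step(A, dx, dt, alpha, beta)
    return A


def sup_norm(A):
    """Discrete analogue of the L-infinity norm used throughout the paper."""
    return max(abs(a) for a in A)


def demo_initial_data(n):
    """A small smooth perturbation of the zero solution (hypothetical test data)."""
    return [0.1 * math.cos(2 * math.pi * j / n) + 0.05j * math.sin(4 * math.pi * j / n)
            for j in range(n)]
```

For instance, `sup_norm(evolve(demo_initial_data(64), 0.5, 0.01, 0.0, 0.0, 2000))` stays bounded by 1 for the real Ginzburg–Landau case `alpha = beta = 0`, while a tiny spatially constant initial condition grows like `(1 + dt)**steps`, reflecting the linear instability of the zero solution.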


2. Attracting Sets

In the study of PDE's, there are several definitions of "attractors". In this work, we concentrate our attention on attracting sets (which may be larger than attractors).

Definition 1. A set $G$ is called an attracting set with fundamental neighborhood $U$ for the flow $\Phi^t$ if
i) The set $G$ is compact.
ii) For every open set $V \supset G$ we have $\Phi^t U \subset V$ when $t$ is large enough.
iii) The set $G$ is invariant.
The open set $\bigcup_{t>0} (\Phi^t)^{-1}(U)$ is called the basin of attraction of $G$. If the basin of attraction is the full space, then $G$ is called a global attracting set.

Remark. One finds a large number of definitions of "attractors" in the literature [T, MS]. Our terminology is inspired by the theory of dynamical systems. In particular, an attracting set is not an attractor in the sense of dynamical systems; it is usually larger than the attractor. See also [ER] for a discussion of these issues.

We will consider Eq. (1.1) in a (large) box $Q_L$ of side $L$ in $\mathbb{R}^d$, with periodic boundary conditions. Let $G_{Q_L}$ denote the global attracting set for this problem. It has been shown [GH] that $G_{Q_L}$ is a compact set in $L^\infty_{{\rm per},Q_L}$ (since the set is made up of functions analytic in a strip around the real axis).

For the CGL on the infinite space the situation is somewhat more complicated. A nontrivial invariant set $G$ can be defined in the topology of uniformly continuous functions as follows: First, if $B$ is a large enough ball of uniformly continuous functions in $L^\infty$, there is a finite time $T_0(B)$ such that for any $T > T_0(B)$ one has $\Theta^T(B) \subset B$, where $t \mapsto \Theta^t$ is the flow defined by the CGL. The set $G(B, T)$ is then defined by
$$G(B, T) = \bigcap_{n\ge 0} \Theta^{nT}(B). \tag{2.1}$$

It can be shown (see [MS]) that this set is invariant and that it does not depend on the initial ball $B$ (if it is large enough) nor on the (large enough) time $T > T_0(B)$. Thus, we define $G = G(B, T)$. It is made up of functions which extend to bounded analytic functions in a strip. Its width and the bound on the functions only depend on the parameters of the problem. These facts can be found scattered in the literature, but are "well-known", see, e.g., [C]. The set $G$ probably lacks properties i) and ii) above in the topology of uniformly continuous functions. We will nevertheless call it a globally attracting set since in [MS] it was proven that in local and/or weaker topologies conditions of the type of i) and ii) are satisfied. The set $G$ defined by Eq. (2.1) will be our main object of study.

3. Dimension in Finite Volume

We define $M_{Q_L}(\varepsilon)$ to be the minimum number of balls of radius $\varepsilon$ in $L^\infty_{{\rm per},Q_L}$ needed to cover $G_{Q_L}$. One can then define
$$C_{Q_L} = \limsup_{\varepsilon\to 0} \frac{\log M_{Q_L}(\varepsilon)}{\log(1/\varepsilon)}.$$

The technical term [M, 5.3] for this is the "upper Minkowski dimension". This dimension is an upper bound for the Hausdorff dimension. It is also equal to the (upper) box-counting dimension (in which the positions of the boxes are centered on a dyadic grid). It has been shown by Ghidaglia and Héron [GH] that $C_{Q_L}$ satisfies an "extensive bound":

Proposition 3.1. For CGL one has in dimensions $d = 1, 2$, the bound
$$\limsup_{L\to\infty} \frac{C_{Q_L}}{L^d} < \infty. \tag{3.1}$$

To our knowledge, it is an open problem to show the existence of the limit in (3.1). The difficulty in obtaining a proof is that the familiar methods of statistical mechanics of matching together pieces of configurations to obtain a subadditivity bound of the form (written for simplicity for the case of dimension $d = 1$ and with $L$ instead of $Q_L$) $C_{L_1+L_2} \le C_{L_1} + C_{L_2} + O(1)$ do not seem to work.

One can try to define a sort of "local" dimension by restricting the global problem to a local window. But this idea does not work either, as we show now. For example, consider the global attracting set $G$ for CGL on the infinite line. As we have said before, this set is compact in a local topology which is not too fine. Take again a cube $Q_L$ of side $L$ in $\mathbb{R}^d$ and then denote by $N_{Q_L}(\varepsilon)$ the minimum number of balls of radius $\varepsilon$ in $L^\infty(Q_L)$ needed to cover $G|_{Q_L}$. Again, this number is finite. But we have the following

Lemma 3.2. For every $L > 0$ we have
$$\liminf_{\varepsilon\to 0} \frac{\log N_{Q_L}(\varepsilon)}{\log(1/\varepsilon)} = \infty. \tag{3.2}$$

Remark. In other words, this lemma shows that the lower Minkowski dimension for the restriction of G to QL is infinite. Thus, there are many more functions in G|QL than in GQL . In fact, our proof will show a little more, namely, see also [BV], Corollary 3.3. The Hausdorff dimension of G|QL is infinite for every L > 0. Proof. The proof will be given in Sect. 5.
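The covering-number definition of dimension used in this section can be made concrete on a classical example. The sketch below is an illustration only and is unrelated to the CGL itself: it assumes the middle-thirds Cantor set on the real line, where a "ball of radius eps" is an interval of length `2 * eps`, and all function names are hypothetical. A greedy left-to-right sweep is optimal for covering points on a line, so `covering_number` returns the minimal count for the finite point set.

```python
import math


def cantor_points(level):
    """Left endpoints of the 2**level intervals of the middle-thirds Cantor construction."""
    pts = [0.0]
    length = 1.0
    for _ in range(level):
        length /= 3.0
        pts = [p for q in pts for p in (q, q + 2.0 * length)]
    return pts


def covering_number(points, eps):
    """Minimal number of intervals of radius eps needed to cover the given points."""
    count = 0
    right_edge = -math.inf
    for p in sorted(points):
        if p > right_edge:
            # Start a new interval [p, p + 2*eps] at the first uncovered point.
            count += 1
            right_edge = p + 2.0 * eps
    return count


def dimension_estimate(points, eps):
    """Finite-eps version of log M(eps) / log(1/eps)."""
    return math.log(covering_number(points, eps)) / math.log(1.0 / eps)
```

With `eps = 0.5 * 3.0 ** (-k)` the covering number of `cantor_points(10)` is `2 ** k` for small `k`, so the estimate equals `k log 2 / (k log 3 + log 2)` and approaches `log 2 / log 3` only slowly. This finite value contrasts with Lemma 3.2, where the same ratio diverges for $G|_{Q_L}$.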

The example of Lemma 3.2 and Corollary 3.3 teaches us that the restriction to nice functions on the infinite line produces "too many" functions on a finite interval, as the observation (the $\varepsilon$) becomes infinitely accurate. This fact calls for a new kind of definition. Such a possibility is offered by the considerations of Kolmogorov and Tikhomirov [KT].

4. The ε-Entropy per Unit Length

The basic idea is to take the limit of infinite $L$ before considering the behavior as $\varepsilon$ goes to zero. Thus, with the definitions of the preceding section, we now define
$$H_\varepsilon = \lim_{L\to\infty} \frac{\log N_{Q_L}(\varepsilon)}{L^d}.$$
In the paper [KT], this quantity was studied for different sets of functions. The authors considered in particular three classes of functions on the real line:

i) The class $E_\sigma(C)$ of entire functions $f$ which are bounded by $|f(z)| \le C e^{\sigma|{\rm Im}\, z|}$.
ii) The class $F_{p,\sigma}(C)$ of entire functions $f$ with growth of order $p > 1$, which are bounded by $|f(z)| \le C e^{\sigma|{\rm Im}\, z|^p}$.
iii) The class $S_h(C)$ of bounded analytic functions in the strip $|{\rm Im}\, z| < h$ with a bound $|f(z)| < C$.

For these classes the following result holds.

Theorem 4.1. [KT]. One has the bounds:
$$H_\varepsilon \sim \begin{cases} \dfrac{2\sigma}{\pi} \cdot \log(1/\varepsilon) & \text{for the class } E_\sigma(C), \\[2mm] \dfrac{2\sigma^{1/p} p^2}{\pi(2p-1)(p-1)^{1-1/p}} \cdot \bigl(\log(1/\varepsilon)\bigr)^{2-1/p} & \text{for the class } F_{p,\sigma}(C), \\[2mm] \dfrac{1}{\pi h} \cdot \bigl(\log(1/\varepsilon)\bigr)^{2} & \text{for the class } S_h(C), \end{cases}$$
as $\varepsilon \to 0$ in the sense that the l.h.s. divided by the r.h.s. has limit equal to 1.

Notation. It will sometimes be convenient to write the dependence on the space such as $H_\varepsilon(E_\sigma(C))$.

Our main result is the following

Theorem 4.2. The global attracting set $G$ of CGL satisfies a bound
$$H_\varepsilon(G) \le {\rm const.}\, \log(1/\varepsilon), \tag{4.1}$$

where the constant depends only on the parameters of the equation.

Remark. The reader should note that this result contains new information about the set $G$ of limiting states. It is for example well known that the solutions of CGL are analytic and bounded in a strip, that is, they are in the class $S_h(C)$ for some $h > 0$ and some $C < \infty$. This alone, however, would only give a bound $\frac{1}{\pi h}\bigl(\log(1/\varepsilon)\bigr)^2$, as we have seen in Theorem 4.1. Therefore, Theorem 4.2 shows that the long-time solutions are not only analytic in a strip, but form a proper subset of $S_h(C)$ with smaller $\varepsilon$-entropy per unit length. On the other hand, the set $G$ is in general not contained in the class $E_\sigma(C)$, because some stationary solutions are not entire. For example, for the real Ginzburg–Landau equation, the function $\tanh(x/\sqrt{2})$ is a stationary solution with a singularity in the complex plane. For the CGL, Hocking and Stewartson [HS, Eq.(5.2)] describe time-periodic solutions which exist in certain parameter ranges of $\alpha$ and $\beta$, and which are again not entire in $x$ and are of the form ${\rm const.}\, e^{i a_1 t}\, {\rm sech}(a_2 x)^{1+i a_3}$, where $a_i = a_i(\alpha, \beta)$ can be found in [HS].
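To get a feeling for the gap between the $\log(1/\varepsilon)$ and $(\log(1/\varepsilon))^2$ rates discussed in this remark, the leading-order expressions of Theorem 4.1 can simply be evaluated. The sketch below is hedged accordingly: it assumes the coefficients as read from Theorem 4.1, the function names are illustrative, and the values are asymptotic leading terms rather than bounds at any fixed $\varepsilon$.

```python
import math


def rate_entire(sigma, eps):
    """Leading-order epsilon-entropy per unit length for the class E_sigma(C)."""
    return (2.0 * sigma / math.pi) * math.log(1.0 / eps)


def rate_order_p(sigma, p, eps):
    """Leading-order epsilon-entropy per unit length for the class F_{p,sigma}(C), p > 1."""
    coeff = (2.0 * sigma ** (1.0 / p) * p**2
             / (math.pi * (2.0 * p - 1.0) * (p - 1.0) ** (1.0 - 1.0 / p)))
    return coeff * math.log(1.0 / eps) ** (2.0 - 1.0 / p)


def rate_strip(h, eps):
    """Leading-order epsilon-entropy per unit length for the class S_h(C)."""
    return math.log(1.0 / eps) ** 2 / (math.pi * h)
```

The ratio `rate_strip(h, eps) / rate_entire(sigma, eps)` equals `log(1/eps) / (2 * sigma * h)`, so it grows without bound as `eps` tends to zero, which is the sense in which Theorem 4.2 improves on the strip estimate. As a consistency check on the reading of the formula, `rate_order_p` tends to `rate_entire` as `p` decreases to 1.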


5. Proof of Lemma 3.2 and Corollary 3.3

We fix $L > 0$, and we want to show that
$$\liminf_{\varepsilon\to 0} \frac{\log N_{Q_L}(\varepsilon)}{\log(1/\varepsilon)} = \infty, \tag{5.1}$$

where $N_{Q_L}(\varepsilon)$ is the minimum number of balls needed to cover $G|_{Q_L}$. The idea of the proof is to observe that $G|_{Q_L}$ contains subsets of arbitrarily high Hausdorff dimension. These subsets are essentially parts of the unstable manifold of the 0 solution. We begin, as in [GH], by considering periodic solutions of period $\Lambda$ for various $\Lambda$. In that space, for $\Lambda$ large enough, the origin is an unstable fixed point and the spectrum of the generator for the linearized evolution is
$$\Bigl\{\, 1 - (1 + i\alpha)\frac{4\pi^2}{\Lambda^2}\bigl(n_1^2 + \cdots + n_d^2\bigr) \;\Big|\; n_i \in \mathbb{Z} \,\Bigr\}.$$
Thus, the origin is a hyperbolic fixed point if $2\pi/\Lambda$ is irrational. In that case the local unstable manifold $W$ of the origin has dimension $D_\Lambda \equiv O(1)\Lambda^d$. In other words, we have a $C^1$ map $\Psi_\Lambda$ from a neighborhood $U$ of 0 in $\mathbb{R}^{D_\Lambda}$ to $W$ which is injective (and in fact has differentiable inverse). This construction can be justified in a Sobolev space with sufficiently high index [GH, Remark 3.2, p. 289], [G]. This unstable manifold is of course contained in the global attracting set $G_\Lambda$. But it is also in $G$. We can consider $W$ as a subset $\Phi(U)$ in $G$ and look at it in $L^\infty(Q_L)$ (with $L \ll \Lambda$). We would like to prove that there also it has a dimension equal to $D_\Lambda$. Note that there is a $C^1$ map $\Phi$ which maps $U$ to $W$. We claim $\Phi$ is injective. Indeed, assume not; then there are two different points $u_1$ and $u_2$ in $U$ such that on $Q_L$ the functions $\Phi(u_1)$ and $\Phi(u_2)$ coincide. But since these functions are analytic in a strip they coincide everywhere and hence $u_1 = u_2$: we have a contradiction. This implies that in $L^\infty(Q_L)$, the local unstable manifold $W$ has also dimension $D_\Lambda$. Therefore, for $\varepsilon$ small enough, we need at least $(1/\varepsilon)^{D_\Lambda - 1}$ balls of radius $\varepsilon$ to cover it. The assertion (5.1) follows by letting $\Lambda$ tend to infinity. The proof of Lemma 3.2 is complete. Since we have constructed a lower bound for every $L$, Corollary 3.3 follows at once.

6. Upper Bound on the ε-Entropy per Unit Length

We study in this section the quantity $H_\varepsilon(G)$ for the global attracting set $G$ for the CGL on the whole space. We begin by

Theorem 6.1. For fixed $\varepsilon > 0$, the sequence $\log N_{Q_L}(\varepsilon)/L^d$ has a limit when $L$ goes to infinity, and there exists a constant $C$ such that
$$\lim_{L\to\infty} \frac{\log N_{Q_L}(\varepsilon)}{L^d} \le C \log(1/\varepsilon). \tag{6.1}$$
The constant $C$ only depends on the parameters of the CGL.


We first prove the existence of the limit.

Lemma 6.2. For any fixed $\varepsilon > 0$, the sequence $\log N_{Q_L}(\varepsilon)/L^d$ has a limit when $L$ goes to infinity.

Proof. Let $B$ and $B'$ denote two disjoint bounded sets of $\mathbb{R}^d$. We denote by $N_B(\varepsilon)$ the minimum number of balls in $L^\infty(B)$ of radius $\varepsilon$ which is needed to cover $G|_B$. Since we are using the sup norm, it is easy to verify that
$$N_{B\cup B'}(\varepsilon) \le N_B(\varepsilon)\, N_{B'}(\varepsilon), \tag{6.2}$$

because one can choose the functions in $B$ and $B'$ independently. The lemma follows by the standard sub-additivity argument, see [R], since the $Q_L$ form a van Hove sequence.

We now begin working towards a bound relating $N_{Q_L}(\varepsilon)$ and $N_{Q_L}(\varepsilon/2)$. The bound will be inefficient for small $L$ but becomes asymptotically better. We let the CGL semiflow act on balls in $L^\infty(Q_L)$, and we will analyze the deformation of these balls by looking at the difference between the trajectory of the center and the trajectory of the other points. We begin by considering functions $f$ and $g$, both in $G$. We set $w_0 = g - f$. It is left to the reader to verify that there are bounded functions $R$ and $S$ of space and time such that
$$\partial_t w = (1 + i\alpha)\Delta w + Rw + S\bar w; \tag{6.3}$$
more precisely, we set $w(t = 0) = w_0$, and
$$R = 1 - (1 + i\beta)(f_t + g_t)\bar f_t, \qquad S = -(1 + i\beta)g_t^2,$$
where $f_t = \Theta^t(f)$, and $g_t = \Theta^t(g)$. Note that since $G$ is bounded in a suitable space of analytic functions, there is a constant $K > 1$ which depends only on $\alpha$ and $\beta$ such that
$$\sup_t \|w(t, \cdot)\|_\infty + \sup_t \|\nabla w(t, \cdot)\|_\infty \le K. \tag{6.4}$$
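The verification of Eq. (6.3) that is left to the reader is short. A sketch, writing $w = g_t - f_t$ and using only Eq. (1.1) for $f_t$ and $g_t$:

```latex
\begin{align*}
\partial_t w
  &= (1+i\alpha)\Delta w + w - (1+i\beta)\bigl(g_t|g_t|^2 - f_t|f_t|^2\bigr),\\
g_t|g_t|^2 - f_t|f_t|^2
  &= g_t^2\,\bar g_t - f_t^2\,\bar f_t
   = g_t^2\,(\bar g_t - \bar f_t) + (g_t^2 - f_t^2)\,\bar f_t
   = g_t^2\,\bar w + (f_t + g_t)\,\bar f_t\, w .
\end{align*}
```

Collecting the coefficients of $w$ and $\bar w$ gives $R = 1 - (1+i\beta)(f_t+g_t)\bar f_t$ and $S = -(1+i\beta)g_t^2$, and both are bounded because $f_t$ and $g_t$ stay in the bounded set $G$.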

We want to show that if $w_0$ is small in $Q_L$ then the same is true for the solution of Eq. (6.3) up to time 1. This might seem not to be true because a large perturbation may reach $Q_L$ from the outside. However, using localization techniques, we now show that this effect can only take place near the boundary. We will therefore introduce a layer of width $\ell$ near the boundary of the cube $Q_L$, and we assume $\ell < L$. We assume $Q_L$ to be centered at the origin and consider the cube $Q_{L-\ell}$ also centered at the origin. We use as in [CE] the family of space cut-off functions
$$\varphi_a(x) = Z\, \frac{1}{(1 + |x - a|^4)^d} \equiv \varphi(x - a),$$
where
$$Z^{-1} = \int dx\, \frac{1}{(1 + |x|^4)^d}.$$
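The property of this cut-off that the localization argument relies on is that its logarithmic derivative is bounded, so that $|\nabla\varphi_a| \le {\rm const.}\,\varphi_a$. A minimal numerical sketch in dimension $d = 1$ follows; the function names, the truncation radius, and the number of quadrature points are illustrative assumptions. In this case $Z = \sqrt{2}/\pi$ and the best constant is $3^{3/4}$, attained at $x^4 = 3$.

```python
import math


def phi_unnormalized(x):
    """The profile 1 / (1 + x^4), i.e. the cut-off for d = 1 without the factor Z."""
    return 1.0 / (1.0 + x**4)


def log_derivative_ratio(x):
    """|phi'(x)| / phi(x) = 4 |x|^3 / (1 + x^4); the normalization Z cancels."""
    return 4.0 * abs(x) ** 3 / (1.0 + x**4)


def normalization(radius=200.0, n=400000):
    """Trapezoid estimate of Z = 1 / (integral of 1 / (1 + x^4) over the real line)."""
    dx = 2.0 * radius / n
    total = 0.5 * (phi_unnormalized(-radius) + phi_unnormalized(radius))
    for i in range(1, n):
        total += phi_unnormalized(-radius + i * dx)
    return 1.0 / (total * dx)
```

The truncation to `[-radius, radius]` discards a tail of size about `2 / (3 * radius**3)`, which is why a fairly large radius is used.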


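Lemma 6.3. Let $f$ and $g$ be in $G$, and let $w_0 = g - f$. In dimension $d \le 3$, if $\ell > 1/\varepsilon$ and $w$ is a solution of Eq. (6.3) with initial data $w_0$ satisfying
$$\|w_0\|_\infty \le 2K \quad\text{and}\quad \sup_{x\in Q_L} |w_0(x)| \le \varepsilon,$$
then
$$\sup_{0\le t\le 1}\; \sup_{a\in Q_{L-\ell}} \int dx\, \varphi_a(x)|w(t, x)|^2 \le O(\varepsilon^2), \tag{6.5}$$
$$\sup_{0\le t\le 1}\; \sup_{a\in Q_{L-\ell}} |w(t, a)| \le O(\varepsilon), \tag{6.6}$$
and
$$\sup_{x\in Q_{L-\ell}} |\nabla w(t = 1, x)| \le O(\varepsilon). \tag{6.7}$$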

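These bounds depend on $K$ but are independent of $\ell > 1/\varepsilon$.

Remark. The constant $K = K(\alpha, \beta)$ in this lemma is the one found in Eq. (6.4). Below, the notation $O_{\alpha,\beta}(1)$ will stand for a bound which depends only on $\alpha$, $\beta$ and this $K(\alpha, \beta)$, but not on $L$, $\ell$ or $\varepsilon$.

Proof. We begin by bounding $X \equiv \partial_t \int dx\, \varphi_a(x)|w(t, x)|^2$. Using Eq. (6.3) we have:
$$X = \int dx\, \bar w \varphi_a \bigl((1 + i\alpha)\Delta w + Rw + S\bar w\bigr) + {\rm cc},$$
where cc denotes the complex conjugate. Integrating by parts we get
$$X = -(1 + i\alpha)\int dx\, \varphi_a |\nabla w|^2 - (1 + i\alpha)\int dx\, \bar w(\nabla\varphi_a \cdot \nabla w) + \int dx\, \varphi_a \bar w\bigl(Rw + S\bar w\bigr) + {\rm cc}. \tag{6.8}$$
By the choice of $\varphi_a$ we have $|\nabla\varphi_a(x)| \le {\rm const.}\,\varphi_a(x)$, uniformly in $x$ and $a$. Therefore $X$ can be bounded above by
$$X \le -2\int dx\, \varphi_a |\nabla w|^2 + O_{\alpha,\beta}(1)\int dx\, \varphi_a |w||\nabla w| + O_{\alpha,\beta}(1)\int dx\, \varphi_a |w|^2.$$
By polarization, and using that $\varphi_a > 0$, we get a bound
$$\partial_t \int dx\, \varphi_a |w|^2 \le -\int dx\, \varphi_a |\nabla w|^2 + O_{\alpha,\beta}(1)\int dx\, \varphi_a |w|^2. \tag{6.9}$$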

Therefore we see that there is a constant $C$ which depends only on $\alpha$ and $\beta$, for which we have the differential inequality
$$\partial_t \int dx\, \varphi_a(x)|w(t, x)|^2 \le C \int dx\, \varphi_a(x)|w(t, x)|^2. \tag{6.10}$$
Since $w(0, x)$ is bounded on $\mathbb{R}^d$ and small on $Q_L$, we have for $\ell > 1/\varepsilon$,
$$\sup_{a\in Q_{L-\ell}} \int dx\, \varphi_a(x)|w(0, x)|^2 \le O(1 + K^2)\varepsilon^2.$$
To see this, split the integration region into $Q_L$ and $\mathbb{R}^d \setminus Q_L$. In the first region, $w$ is small and in the second region the integral of $\varphi_a$ is small and $|w| \le K$. Using Eq. (6.10), we find
$$\sup_{0\le t\le 1}\; \sup_{a\in Q_{L-\ell}} \int dx\, \varphi_a(x)|w(t, x)|^2 \le e^{C} O_{\alpha,\beta}(1)\varepsilon^2 = O_{\alpha,\beta}(1)\varepsilon^2.$$
Thus we have shown Eq. (6.5). We next bound the solutions in $L^\infty$. Let $G_t$ denote the convolution kernel of the semigroup generated by the operator $(1 + i\alpha)\Delta$. We have
$$w(t, \cdot) = G_t \star w_0 + \int_0^t ds\, G_{t-s} \star \bigl(R(s, \cdot)w(s, \cdot) + S(s, \cdot)\bar w(s, \cdot)\bigr). \tag{6.11}$$

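We first bound the term
$$Y_{t,s} \equiv G_{t-s} \star \bigl(R(s, \cdot)w(s, \cdot)\bigr).$$
We rewrite it as
$$Y_{t,s}(x) = \int dy\, \frac{G_{t-s}(x - y)}{\sqrt{\varphi(x - y)}}\, \sqrt{\varphi(x - y)}\, R(s, y)w(s, y).$$
By the Schwarz inequality, we get a bound
$$Y_{t,s}^2 \le \int dy\, \frac{|G_{t-s}|^2(x - y)}{\varphi(x - y)} \cdot O_{\alpha,\beta}(1)\int dz\, \varphi_x(z)|w(s, z)|^2. \tag{6.12}$$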

Using Eq. (6.5), the second factor in (6.12) is bounded by $O(\varepsilon^2)$. The complex heat kernel $G$ can be bounded as follows:

Lemma 6.4. For every $n > 0$ there is a constant $C_n$ such that
$$|G_t(z)| \le \frac{C_n}{t^{d/2}}\, \frac{1}{(1 + z^2/t)^{n/2}}, \tag{6.13}$$
and
$$|\nabla G_t(z)| \le \frac{1}{t^{d/2}}\, \frac{C_n |z|}{(1 + z^2/t)^{n}\, t}. \tag{6.14}$$

Proof. Use the stationary phase method [H].

Using this lemma, the first factor in (6.12) is bounded for $t - s < 1$ and for $n$ large enough, by
$$\int dy\, \frac{|G_{t-s}|^2(x - y)}{\varphi(x - y)} \le \int dy\, \frac{C_n}{(t - s)^d}\, \frac{(1 + |x - y|^4)^d}{\bigl(1 + \frac{(x-y)^2}{t-s}\bigr)^{n}} \le O\bigl((t - s)^{-d/2}\bigr).$$
Inserting in (6.12), and integrating over $s$, we get the bound
$$\int_0^t ds\, Y_{t,s} \le O(\varepsilon),$$
provided $d < 4$. Indeed, (6.12) gives $Y_{t,s} \le O(\varepsilon)(t - s)^{-d/4}$, which is integrable at $s = t$ exactly when $d < 4$. The term involving $S$ is bounded in the same manner. The inhomogeneous term in (6.11) is bounded by splitting the convolution integral into the regions $y \in Q_L$ and $y \in \mathbb{R}^d \setminus Q_L$. The first term gives a small contribution because $w_0$ is $O(\varepsilon)$ on $Q_L$ and the second contribution is small because the kernel $G_t$ is small for $x \in Q_{L-\ell}$ and $y \in \mathbb{R}^d \setminus Q_L$. This proves Eq. (6.6).

It remains to show Eq. (6.7). We have
$$\nabla w(t, \cdot) = \nabla G_t \star w_0 + \int_0^t ds\, \nabla G_{t-s} \star \bigl(R(s, \cdot)w(s, \cdot) + S(s, \cdot)\bar w(s, \cdot)\bigr).$$

We deal first with the inhomogeneous term. Using the same splitting as before, and Lemma 6.4, we get
$$\sup_{x\in Q_{L-\ell}} |(\nabla G_{t=1}) \star w_0(x)| \le O(\varepsilon).$$
The homogeneous term $I$ involving $R$ is:
$$I = \int_0^t ds\, \nabla G_{t-s} \star (wR).$$
We want to bound $I$ for $t = 1$ and rewrite it as
$$I = \int_0^{1/2} ds\, \nabla G_{1-s} \star (wR) + \int_{1/2}^{1} ds\, G_{1-s} \star (w\nabla R) + \int_{1/2}^{1} ds\, G_{1-s} \star (R\nabla w) \equiv I_1 + I_2 + I_3.$$
The term $I_2$ is bounded in the same way as the integral of $Y_{t,s}$. To bound the term $I_1$ we observe that there is no singularity in the kernel (6.14), since $s < \frac12$, and furthermore, $|\nabla G_{1-s}(z)| \le {\rm const.}\,\varphi(z)$. Then the Schwarz inequality and the results on $w$ yield
$$I_1 \le O(\varepsilon). \tag{6.15}$$

Finally, consider $I_3$. Integrating Eq. (6.9) over $s$ from 0 to $\frac12$, we have
$$\int_0^{1/2} ds \int dx\, \varphi_a(x)|\nabla w(s, x)|^2 \le O(1)\int_0^{1/2} ds \int dx\, \varphi_a(x)|w(s, x)|^2 + \int dx\, \varphi_a(x)|w(0, x)|^2.$$
Our previous bounds show that the r.h.s. is bounded by $O(\varepsilon^2)$. Therefore there is a value of $s_* \in (0, \frac12)$ for which
$$\int dx\, \varphi_a(x)|\nabla w(s_*, x)|^2 \le O(\varepsilon^2). \tag{6.16}$$
Furthermore, we have

Lemma 6.5. We have the bounds
$$\partial_t \int dx\, \varphi_a(x)|\nabla w(t, x)|^2 \le O(1)\int dx\, \varphi_a(x)|\nabla w(t, x)|^2 + O(1)\int dx\, \varphi_a(x)|w(t, x)|^2. \tag{6.17}$$

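Proof. We start with
$$\begin{aligned}
\partial_t \int dx\, \varphi_a |\nabla w|^2 &= \int dx\, \varphi_a \nabla\bar w \cdot \partial_t \nabla w + {\rm cc} \\
&= \int dx\, \varphi_a \nabla\bar w \cdot \nabla\bigl((1 + i\alpha)\Delta w + Rw + S\bar w\bigr) + {\rm cc} \\
&= -\int dx\, \varphi_a \Delta\bar w\, \bigl((1 + i\alpha)\Delta w + Rw + S\bar w\bigr) \\
&\quad - \int dx\, (\nabla\varphi_a \cdot \nabla\bar w)\bigl((1 + i\alpha)\Delta w + Rw + S\bar w\bigr) + {\rm cc}.
\end{aligned}$$
Using again the explicit form of $\varphi_a$, completing the square and polarization, as in the proof of Eq. (6.10), the assertion follows.

We continue with the proof of Lemma 6.3. Let $s \in (\frac12, 1]$ and let
$$T_s = \int dx\, \varphi_a(x)|\nabla w(s, x)|^2.$$
Then we integrate the differential inequality (6.17), which reads $\partial_t T_t \le O(1)T_t + O(\varepsilon^2)$, from $s_*$ to $s$. This yields, using Eq. (6.16),
$$\int dx\, \varphi_a(x)|\nabla w(s, x)|^2 \le \exp\bigl(O(1)(s - s_*)\bigr)\Bigl(\int dx\, \varphi_a(x)|\nabla w(s_*, x)|^2 + O(\varepsilon^2)\Bigr) \le O(\varepsilon^2). \tag{6.18}$$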

Using this bound, we rewrite
$$I_3 = \int_{1/2}^{1} ds \int dy\, \frac{G_{1-s}(x - y)}{\sqrt{\varphi(x - y)}}\, \sqrt{\varphi(x - y)}\, R(s, y)\nabla w(s, y).$$
Using the Schwarz inequality as in Eq. (6.12), we get a bound $I_3 \le O(\varepsilon)$. Combining the bounds on $I_1$, $I_2$ and $I_3$ completes the proof of Eq. (6.7). The proof of Lemma 6.3 is complete.

Lemma 6.3 gives us control over the evolution of differences in $G$, when they are small in $G|_{Q_L}$. We shall now use this information to study the deformation of balls covering $G|_{Q_L}$. To formulate the next result we need the following notation: Consider the universal attracting set $G$. The quantity $N_B^{(t)}(\varepsilon)$ denotes the number of balls of radius $\varepsilon$ needed to cover the set $\Theta^t(G)|_B$, in $L^\infty(B)$, where $\Theta^t$ is the semi-flow defined by the CGL equation.

Proposition 6.6. There are constants $c < \infty$ and $D, D_1 < \infty$ such that for all sufficiently small $\varepsilon > 0$ and all $L > 3/\varepsilon$ one has the bound
$$N^{(t+1)}_{Q_L}(\varepsilon/2) \le \Bigl(\frac{c}{\varepsilon}\Bigr)^{D_1 L^{d-1}\varepsilon^{-(1+d)}} D^{L^d}\, N^{(t)}_{Q_L}(\varepsilon). \tag{6.19}$$

Before we prove this proposition, we need a geometric lemma:

Lemma 6.7. Let $Q$ be a set of diameter $r$ in $\mathbb{R}^d$ and assume that $F$ is a family of complex functions $f$ on $Q$ which satisfy the bounds $|f| \le a$, $|\nabla f| \le b$, with $br \le c/2$. Then one can cover $F$ with not more than $(4a/c)^2$ balls of radius $c$ in $L^\infty(Q)$.

Proof. On a disk in $\mathbb{R}^d$ of diameter $r$, the function $f$ varies no more than $br$, which is bounded by $c/2$. On the other hand, one can find a set $S$ of $(4a/c)^2$ complex numbers of modulus less than $a$ such that every complex number of modulus less than $a$ is within $c/2$ of $S$. Since $f$ varies less than $c/2$ one can find a constant function $f^*$ with value in $S$ such that $\sup_Q |f - f^*| < c$.

Proof of Proposition 6.6. By definition we can find, for every $t \ge 0$, $N^{(t)}_{Q_L}(\varepsilon)$ balls of radius $\varepsilon$ in $L^\infty(Q_L)$ which cover $\Theta^t(G)|_{Q_L}$. Therefore we can find a collection $\mathcal{B}$ of $N^{(t)}_{Q_L}(\varepsilon)$ balls of radius $2\varepsilon$ in $L^\infty(Q_L)$ with center in $\Theta^t(G)|_{Q_L}$, which cover $\Theta^t(G)|_{Q_L}$. Let $B$ be a ball (i.e., an element of $\mathcal{B}$). We denote by $B \cap \Theta^t(G)$ those functions in $\Theta^t(G)$ whose restriction to $Q_L$ is in $B$. We have obviously $\bigcup_{B\in\mathcal{B}} \bigl(B \cap \Theta^t(G)\bigr) \supset \Theta^t(G)$, and therefore
$$\Theta^{t+1}(G)|_{Q_L} \subset \bigcup_{B\in\mathcal{B}} \Theta^1\bigl(B \cap \Theta^t(G)\bigr)\big|_{Q_L}.$$

Thus, we can move the time forward by one unit without changing the set we cover. This will be the crux of our argument, which will use the smoothing properties of $\Theta^1$ described in Lemma 6.3. We are going to cover every set $\Theta^1\bigl(B \cap \Theta^t(G)\bigr)|_{Q_L}$ by balls of radius $\varepsilon/2$ in $L^\infty(Q_L)$. Counting all these balls will give the result. So we fix a $B \in \mathcal{B}$ and consider $\Theta^1\bigl(B \cap \Theta^t(G)\bigr)|_{Q_L}$. Since $B \in \mathcal{B}$, its center $f$ is in $\Theta^t(G)|_{Q_L}$, and, since $\Theta^t(G) \subset G$, we also have $f \in G|_{Q_L}$. (In fact $f$ is the restriction of a function in $G$ to $Q_L$.) Let $g$ be an arbitrary point in $\bigl(B \cap \Theta^t(G)\bigr)|_{Q_L}$. Our construction makes sure that both $f$ and $g$ satisfy the assumptions of Lemma 6.3 (with $2\varepsilon$ instead of $\varepsilon$). From Lemma 6.3, there are constants $c_1$ and $c_2$ (which do not depend on $\varepsilon$, $f$, or $g$) such that in $Q_{L-\ell}$ the following holds: If $w_0 = g - f$ and $w = \Theta^1(g) - \Theta^1(f)$, then
$$|w| \le c_1\varepsilon, \qquad |\nabla w| \le c_2\varepsilon.$$
Let
$$r_1 = \min\bigl(1,\, 1/(4c_2)\bigr).$$

We partition $Q_{L-\ell}$ into disjoint cubes $Q$ of side $r_1$ (except at the boundary where we take possibly a strip of smaller cubes if necessary). In each of these cubes we can apply Lemma 6.7 with $c = \varepsilon/2$ since $c_2\varepsilon r_1 \le \varepsilon/4$. Therefore we can cover the restriction of $\Theta^1\bigl(B \cap \Theta^t(G)\bigr)$ to each cube $Q$ by $\bigl(4c_1\varepsilon/(\varepsilon/2)\bigr)^2 = 64c_1^2$ balls of radius $\varepsilon/2$ in $L^\infty(Q)$. We shall now use the same method in the corridor $Q_L \setminus Q_{L-\ell}$ but with balls at a different scale. In $Q_L \setminus Q_{L-\ell}$ we have only inequality (6.4) and not a bound $O(\varepsilon)$ as in $Q_{L-\ell}$. Therefore we define $r_2 = \varepsilon/(4K)$, and again $c = \varepsilon/2$. This leads to
$$K r_2 = c/2.$$

We now cover the corridor $Q_L \setminus Q_{L-\ell}$ by cubes $Q'$ of side $r_2$ (again a smaller strip at the boundary may be needed). In each of these cubes $Q'$ Lemma 6.7 applies and we can cover $\Theta^1\bigl(B \cap \Theta^t(G)\bigr)$ restricted to these cubes by $64K^2\varepsilon^{-2}$ balls of radius $\varepsilon/2$ in $L^\infty(Q')$. We now have a covering of $Q_L$ by disjoint cubes. If we have a ball of radius $\varepsilon/2$ in $L^\infty$ in each cube, this defines a ball in $L^\infty(Q_L)$ since in $L^\infty$ the product of two independent covers is a cover of the union of the sets, see Eq. (6.2). To get a covering of $\Theta^1\bigl(B \cap \Theta^t(G)\bigr)$ in $L^\infty(Q_L)$ we have to consider all these possible balls and in particular count them. It is easy to verify that the number of such balls is bounded by
$$(64c_1^2)^{\bigl(1 + (L-\ell)/\min(1,\,1/4c_2)\bigr)^{d}}\; (64K^2\varepsilon^{-2})^{2(1 + 4KL/\varepsilon)^{d-1}(1 + 4K\ell/\varepsilon)}, \tag{6.20}$$
and the inequality (6.19) follows. The proof of Proposition 6.6 is complete.
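The net of complex constants used in the proof of Lemma 6.7 can be sketched with a square grid. This is an assumption-laden illustration rather than the construction of the paper: the grid spacing $c/\sqrt{2}$ guarantees that every complex number is within $c/2$ of a grid point, but some grid points have modulus slightly larger than $a$, and the count $(4a/c)^2$ is only matched when $a/c$ is not too small. All names below are hypothetical.

```python
import cmath
import math


def grid_spacing(c):
    """Spacing h such that every point of the plane is within c/2 of the grid h*(Z + iZ)."""
    return c / math.sqrt(2.0)


def constant_net(a, c):
    """Grid points covering the square [-a, a]^2, hence the disk |z| <= a."""
    h = grid_spacing(c)
    m = math.ceil(a / h)
    return [complex(i * h, j * h) for i in range(-m, m + 1) for j in range(-m, m + 1)]


def nearest_in_net(z, c):
    """Nearest grid point to z; each coordinate moves by at most h/2, so |z - result| <= c/2."""
    h = grid_spacing(c)
    return complex(round(z.real / h) * h, round(z.imag / h) * h)


def constant_approximation(values, c):
    """Constant f* from the net, attached to the first sampled value of f."""
    return nearest_in_net(values[0], c)


def demo_values(a, b, r, samples=50):
    """Samples of f(x) = a exp(i b x / a) on [0, r], which has |f| = a and |f'| = b."""
    return [a * cmath.exp(1j * b * (r * k / samples) / a) for k in range(samples + 1)]
```

If `f` varies by at most `c/2` over the set, the triangle inequality gives `|f - f*| <= c/2 + c/2`, which is the covering radius `c` of the lemma. For `a = 1`, `c = 0.25` the grid has 169 points, below the bound `(4a/c)**2 = 256`.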

Proof of Theorem 6.1. Finally, we can prove Theorem 6.1, and hence also Theorem 4.2. We use Proposition 6.6 recursively by starting at time $t = 1$ with $\varepsilon = 1$. For this case, we can apply Lemma 6.7 with $a = K$, $b = K$, $r = 1/(4K)$ to get
$$N^{(t=1)}_{Q_L}(\varepsilon = 1) \le e^{O_{\alpha,\beta}\bigl((2L+1)^d\bigr)},$$
and using inequality (6.19) inductively, we get
$$N^{(n+1)}_{Q_L}(2^{-n}) \le e^{O_{\alpha,\beta}\bigl((2L+1)^d\bigr)}\, D^{nL^d} \prod_{j=0}^{n-1} (2^j c)^{D_1 L^{d-1} 2^{j(d+1)}}.$$
Taking logarithms and dividing by $(2L)^d$ we get
$$\frac{\log N^{(n+1)}_{Q_L}(2^{-n})}{(2L)^d} \le n \log D + L^{-d} O_{\alpha,\beta}(L^d) + L^{-d} O_{\alpha,\beta}\bigl(nL^{d-1}2^{n(d+1)}\bigr).$$
Clearly, Theorem 6.1 follows by taking $n$ as the integer part of $\log(1 + 1/\varepsilon)$.
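The last step can be spelled out as follows. This is only a sketch under two assumptions that are not stated explicitly at this point: that $\Theta^{n+1}(G) = G$ by the invariance of $G$ (so that $N^{(n+1)}_{Q_L} = N_{Q_L}$), and that $n$ is chosen with $2^{-n} \le \varepsilon$, for instance $n = \lceil \log_2(1/\varepsilon) \rceil$.

```latex
\begin{align*}
\limsup_{L\to\infty} \frac{\log N_{Q_L}(2^{-n})}{L^d}
  &\le \mathrm{const.}\,\Bigl( n\log D + O_{\alpha,\beta}(1)
     + \lim_{L\to\infty} \frac{1}{L}\, O_{\alpha,\beta}\bigl(n\,2^{n(d+1)}\bigr) \Bigr)
   = \mathrm{const.}\,\bigl( n\log D + O_{\alpha,\beta}(1) \bigr),\\
H_\varepsilon(G)
  &\le H_{2^{-n}}(G)
   \le \mathrm{const.}\,\bigl( \lceil \log_2(1/\varepsilon) \rceil \log D + O_{\alpha,\beta}(1) \bigr)
   \le C \log(1/\varepsilon) \qquad (\varepsilon \text{ small}).
\end{align*}
```

The boundary term carries the factor $L^{d-1}$ and therefore disappears in the limit $L \to \infty$ at fixed $n$; only the bulk term $n \log D$ survives, and $n$ is of order $\log(1/\varepsilon)$.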

Remark. As asserted, $D$ only depends on the parameters of the equation, as can be seen from Eq. (6.20):
$$D = O\bigl(64\, c_1^{\,1/\min(1,\,1/(4c_2))}\bigr),$$
where $c_1$ and $c_2$ can be found in the proof of Lemma 6.3. Note also that there is a crossover point (for our bound) between the behavior described in Theorem 4.2, and the divergence described in Eq. (3.2), at about $\varepsilon = L^{-1/(1+d)}$.

7. Lower Bound on the ε-Entropy per Unit Length

In this section, we construct a lower bound on $H_\varepsilon(G)$. The idea is to construct a subset of the "local unstable manifold" of the origin with large enough $\varepsilon$-entropy per unit length. Working in space dimension 1 is enough, because such solutions are also solutions (in $L^\infty$) in higher dimensions which do not depend on the other variables (of course the lower bounds are not very accurate). The main result of this section is then

Theorem 7.1. There is a constant $A > 0$ such that for sufficiently small $\varepsilon > 0$, the $\varepsilon$-entropy per unit length of the unstable manifold of 0 (and hence of the global attracting set $G$ of CGL) satisfies the bound
$$H_\varepsilon(G) \ge A \log(1/\varepsilon). \tag{7.1}$$

7.1. The idea of the proof. To obtain a lower bound on the $\varepsilon$-entropy (always per unit length), we exhibit a large enough set of functions for which we prove that they are in the global attracting set. This set is built by observing that the 0 solution $u = 0$ has an unstable linear subspace which is made up of functions with momenta $k$ in $[-1, 1]$. For these functions to be in the strongly unstable region, we restrict our attention to the class $E_b(\eta)$ with $b = 1/3$ of entire functions in $z = x + iy$ which are bounded by $|f(z)| \le \eta e^{b|{\rm Im}\, z|}$. The Fourier transform $\hat f$ of a function $f$ in this class is a distribution with support in $[-b, b]$ (see [S]) and is therefore strongly unstable. Furthermore, by Theorem 4.1 we have the bound
$$\lim_{\varepsilon\to 0} \frac{H_\varepsilon\bigl(E_b(\eta)\bigr)}{\frac{2b}{\pi}\log(1/\varepsilon)} = 1, \tag{7.2}$$
so there are "many" such functions. (See [KT], Theorem XXII and the beginning of Sect. 3).

We want to use the set $E_b(\eta)$ as the starting point for the construction of a set in $G$ with positive $\varepsilon$-entropy. Thus, we want to evolve $E_b(\eta)$ forward in time to reach $G$, using the evolution operator $\Theta^t$ defined above. However, this would move us far away from the solution 0 and we would lose control of the non-linearity. To overcome this difficulty, we first evolve the set $E_b(\eta)$ backward in time by a linearized evolution. Thus, we use the method known from the usual construction of unstable manifolds, adapted to the case of continuous spectrum.

We begin by defining the linear evolution. Given $T > 0$ we let $\hat\Theta_0^T(k) = e^{(1-(1+i\alpha)k^2)T}$ and then
$$\widehat{(\Theta_0^T f)}(k) = \hat\Theta_0^T(k)\hat f(k) = e^{(1-(1+i\alpha)k^2)T}\hat f(k).$$
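The reason for restricting to momenta $|k| \le b = 1/3$ can be read off from the modulus of this multiplier, which is $e^{(1-k^2)T}$ independently of $\alpha$. A minimal sketch follows (the function names are illustrative; negative `T` corresponds to the backward linear evolution used in the construction):

```python
import cmath
import math


def linear_multiplier(k, T, alpha):
    """Fourier multiplier exp((1 - (1 + i*alpha) k^2) T) of the linearized CGL."""
    return cmath.exp((1.0 - (1.0 + 1j * alpha) * k * k) * T)


def growth_rate(k):
    """Real part of the exponent per unit time; positive exactly for |k| < 1."""
    return 1.0 - k * k
```

On the band $|k| \le 1/3$ the growth rate is at least $8/9$, so the backward evolution over a time $T$ contracts every such mode by at least the factor $e^{-8T/9}$, uniformly in `alpha`.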


Note that the map (in $x$-space) $\Theta_0^T : f \mapsto \Theta_0^T f$ is the evolution generated by the linearized CGL. Inspired by scattering theory, we will then consider the quantity
$$S(f) = \lim_{T\to\infty}\Theta^T\Theta_0^{-T}(f).$$

Since we consider the unstable manifold of 0 and stay in the vicinity of $f = 0$, the nonlinearities should be negligible and thus the following result seems very natural:

Theorem 7.2. Let $b = 1/3$. There is an $\eta_* > 0$ such that for $\eta \le \eta_*$ the following limit exists in $L^\infty(\mathbb{R})$ for $f \in E_b(\eta)$:
$$S(f) = \lim_{T\to\infty}\Theta^T\Theta_0^{-T} f.$$
Moreover,
$$S(f) = f + Z(f),$$
where $Z$ is Lipschitz continuous in $f$, with a Lipschitz constant of order $O(\eta)$.

In other words, $S$ is close to the identity. Using this kind of information, we shall see that if two functions $f$ and $f'$ are separated by $\varepsilon$, then the functions $S(f)$ and $S(f')$ are separated almost as much. Therefore, knowing that the set $E_b(\eta)$ of $f$ has positive ε-entropy implies that the set $S(E_b(\eta))$, which is in the global attracting set, also has positive ε-entropy, as we shall show later.

7.2. The regularized linear evolution. In this subsection, we construct a somewhat more regular representation of $\Theta_0^T$, which is needed because we consider negative $T$. We consider the class $E_b(\eta)$, with $b = 1/3$. It is clear from the Paley-Wiener-Schwartz theorem [S] that the functions $f \in E_b(\eta)$ have a Fourier transform $\hat f(k) = \int dx\, e^{ikx} f(x)$ which is a distribution with support in $[-b, b]$. If $\hat f$ were a function, we could freely go back and forth between $k$-space and $x$-space. To deal with this problem, we use a regularizing device. Let $c > b$ and let $\hat\psi$ be a positive $C^\infty$ function with support in $[-c, c]$ and equal to 1 on $[-b, b]$. We shall take $b = 1/3$, $c = 1/\sqrt{3}$. Clearly $\hat\psi(k)\hat f(k) = \hat f(k)$ (as a distribution) and therefore $\psi \star f = f$ (in $x$-space), where $\star$ denotes the convolution product. We define a regularized linear evolution kernel
$$g_T(x) = \int dk\, e^{ikx}\hat\psi(k)e^{T(1-(1+i\alpha)k^2)},$$
and then we define

$$\Theta_{0,\psi}^T f(x) \equiv g_T \star f(x).$$
This is our regularized representation of the linear evolution. By construction, it has the property: If $f \in E_b(\eta)$, then
$$\Theta_{0,\psi}^T f = \Theta_0^T f, \tag{7.3}$$
as a distribution. But, as we shall see below, the l.h.s. is a well defined function and thus we can use either of the definitions, whichever is more convenient. Henceforth, we use the notation $f_t$ for $\Theta_0^t f = \Theta_{0,\psi}^t f$.

7.3. Proof of the first part of Theorem 7.2. This theorem is relatively conventional, but tedious, to prove. We will therefore only sketch the standard estimates and describe in detail only the general sequence of estimates which are needed. We begin the proof of the first part of Theorem 7.2 with a study of $\Theta^t$. First we would like to prove that $\Theta^t(f_{-T}) - f_{t-T}$ remains small for $0 \le t \le T$.


Lemma 7.3. For $\eta$ small enough, there is a $\rho > 0$ such that for any $T > 0$ and any $t \in [0, T]$ we have, for all $f \in E_b(\eta)$, the bound
$$\|\Theta^t(f_{-T}) - f_{t-T}\|_\infty \le \eta^2 e^{-\rho(T-t)}.$$

Proof. First observe that by assumption $\|f\|_\infty \le \eta$. By definition, we have
$$\Theta_0^{-T} f(x) = \int dy\,dk\, e^{ik(x-y)}e^{-T(1-k^2(1+i\alpha))}\hat\psi(k)f(y).$$
Since $\hat\psi$ is smooth and supported in $|k| \le c$, we get from this the easy but useful bound
$$\|f_{-T}\|_\infty \le O(\eta)e^{-(1-c^2)T}. \tag{7.4}$$
Using Eq. (7.3), we see that $\Theta_{0,\psi}^t f_{-T} = f_{t-T}$ satisfies
$$\partial_t f_{t-T} = (1+i\alpha)\partial_x^2 f_{t-T} + f_{t-T}.$$
We let $v = \Theta^t(f_{-T}) - f_{t-T}$, and then we find
$$\partial_t v = (1+i\alpha)\partial_x^2 v + v - (1+i\beta)(v + f_{t-T})|v + f_{t-T}|^2.$$
We write this as an integral equation using $v(0, x) = 0$. We get
$$v(t,\cdot) = -(1+i\beta)\int_0^t ds\,\Theta_{0,\psi}^{t-s}\bigl[(v(s,\cdot) + f_{s-T})\cdot|v(s,\cdot) + f_{s-T}|^2\bigr]. \tag{7.5}$$
In particular there is an inhomogeneous term
$$-(1+i\beta)\int_0^t ds\,\Theta_{0,\psi}^{t-s} f_{s-T}|f_{s-T}|^2. \tag{7.6}$$
This term can be bounded by using Eq. (7.4) and the bound $\|\Theta_0^\tau g\|_\infty \le e^\tau\|g\|_\infty$ (which follows from Lemma 6.4). We get
$$\eta^3 O(1)\int_0^t ds\, e^{T-s}e^{-3(1-c^2)(T-s)} \le O(\eta^3)e^{-\rho(T-t)}, \tag{7.7}$$
and here the restriction on the choice of $c$ implies $\rho = 3(1-c^2) - 1 > 0$. Thus we have bounded the inhomogeneous term (7.6). We next consider the set of functions satisfying
$$\sup_{0\le t\le T} e^{\rho(T-t)}\sup_x|v(t,x)| \le \eta^2,$$

with the associated metric. A standard argument using the bound (7.7) shows that in Eq. (7.5) we have a contraction (for η small enough, independent of t, T ) in this space and therefore a unique solution v for Eq. (7.5). Furthermore, the asserted bounds of Lemma 7.3 follow at once. We leave the (trivial) details to the reader. The proof of Lemma 7.3 is complete.
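For the reader's convenience, the integral behind the bound (7.7) can be carried out explicitly. The following is only a sketch, using $\rho = 3(1-c^2)-1 > 0$ as in the text; the constant $1/\rho$ is our own bookkeeping and is absorbed in the $O(\cdot)$ of (7.7).

```latex
\int_0^t ds\, e^{T-s}\, e^{-3(1-c^2)(T-s)}
  = \int_0^t ds\, e^{-\rho (T-s)}
  = \frac{1}{\rho}\Bigl(e^{-\rho(T-t)} - e^{-\rho T}\Bigr)
  \le \frac{1}{\rho}\, e^{-\rho(T-t)} .
```

With the choice $c = 1/\sqrt{3}$ made in Sect. 7.2, one gets $\rho = 3(1 - 1/3) - 1 = 1$.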


We now come to the proof of convergence of $\Theta^T f_{-T}$, as $T \to \infty$. We shall show that the derivative of this quantity is integrable in $T$. We recall that if we have a vector field $X$ with flow $\varphi_t$ then
$$\frac{d}{dt}\varphi_t(x) = D\varphi_t[x]X(x).$$
We use throughout the notation $DF[x]$ for the derivative of $F$ evaluated at $x$; this is usually an operator. In our case, we get
$$\frac{d}{dT}\Theta^T(f_{-T}) = D\Theta^T[f_{-T}]\cdot\Bigl((1+i\alpha)\partial_x^2 f_{-T} + f_{-T} - (1+i\beta)f_{-T}|f_{-T}|^2 - (1+i\alpha)\partial_x^2 f_{-T} - f_{-T}\Bigr) = -(1+i\beta)D\Theta^T[f_{-T}]\bigl(f_{-T}|f_{-T}|^2\bigr). \tag{7.8}$$
We want to prove that this quantity is integrable over $T$. For this purpose we have to control the linear operator $D\Theta^T[f_{-T}]$.

Lemma 7.4. We have the inequality
$$\|D\Theta^T[f_{-T}]w\|_\infty \le O(1)e^{T(1+O(\eta))}\|w\|_\infty. \tag{7.9}$$

Proof. It is easy to verify that $D\Theta^T[f_{-T}]w_0$ is given as the value at time $T$ of the solution of the linear equation
$$\partial_t w = (1+i\alpha)\partial_x^2 w + w + R_\beta w + S_\beta\bar w, \tag{7.10}$$
with initial condition $w(t = 0, \cdot) = w_0(\cdot)$. The coefficients $R_\beta$ and $S_\beta$ are given by
$$R_\beta(t,x) = -2(1+i\beta)|\Theta^t(f_{-T})(x)|^2,$$
and
$$S_\beta(t,x) = -(1+i\beta)\bigl(\Theta^t(f_{-T})(x)\bigr)^2.$$

The assertion of Lemma 7.4 follows now, using a contraction argument, as in the study of Eq. (7.5), from Lemma 7.3 and the previous formula. The details are again left to the reader.

As a consequence, combining the inequalities (7.4) and (7.9), the right-hand side of Eq. (7.8) is exponentially small in $T$ and therefore integrable and we have a limit. So our map $S$ is well-defined by
$$S(f) = \lim_{T\to\infty}\Theta^T(f_{-T}) = f + Z(f),$$
where
$$Z(f) = \lim_{T\to\infty}\int_0^T dt\,\frac{d}{dt}\Theta^t(f_{-t}),$$
and in fact we have proven that this last term is of order $\eta^2$ (in reality $\eta^3$). This completes the proof of the first part of Theorem 7.2. It remains to prove that $Z$ is Lipschitz and to estimate its Lipschitz constant in $L^\infty$. This will be done in the next subsection, together with some even more detailed information on $Z$ which we need later.
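The integrability claim can be made explicit as follows. This is only a sketch: $\Theta^T$ denotes the nonlinear evolution, the factor $\eta^3$ comes from using (7.4) three times on the cubic term, and $\rho = 3(1-c^2)-1$ is the constant from the proof of Lemma 7.3.

```latex
\Bigl\| \frac{d}{dT}\,\Theta^T(f_{-T}) \Bigr\|_\infty
  \le |1+i\beta|\; O(1)\, e^{T(1+O(\eta))}\, \bigl\| f_{-T} \bigr\|_\infty^3
  \le O(\eta^3)\, e^{T(1+O(\eta))}\, e^{-3(1-c^2)T}
  = O(\eta^3)\, e^{-(\rho - O(\eta))T} .
```

The right-hand side is integrable over $T \in [0,\infty)$ as soon as $\eta$ is small enough that $\rho - O(\eta) > 0$, and its integral is then of order $\eta^3$.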


7.4. Proof of the second part of Theorem 7.2. In this subsection, we prove the second part of Theorem 7.2, in fact even more. We first need some notation:

Remark. It will be more convenient to work with the intervals $[-L, L]$ instead of $[-L/2, L/2]$ as in the earlier sections.

We shall use the following notations:
$$B = [-L, L],\quad S = [-L+\ell, L-\ell],\quad S' = [-L+\ell/2, L-\ell/2],\quad S'' = [-L+\ell/4, L-\ell/4],\quad B\setminus S = [-L, -L+\ell)\cup(L-\ell, L].$$
These letters stand for “big” and “small”. Our result is

Proposition 7.5. The function $Z$ is Lipschitz continuous in $f$ in a neighborhood of 0 in $E_b(\eta)$, $b = 1/3$, with a Lipschitz constant $O(\eta)$:
$$\|Z(f) - Z(f')\|_{L^\infty(\mathbb{R})} \le O(\eta)\|f - f'\|_{L^\infty(\mathbb{R})}. \tag{7.11}$$
Moreover, for $\ell \ge 1/\varepsilon$ and $L > \ell$, one has the inequality
$$\|Z(f) - Z(f')\|_{L^\infty(S)} \le O(\eta)\|f - f'\|_{L^\infty(B)} + O(\varepsilon^2)\|f - f'\|_{L^\infty(\mathbb{R})}. \tag{7.12}$$

Clearly, this result states more than what is asserted in Theorem 7.2, and thus, proving Proposition 7.5 will at the same time complete the proof of Theorem 7.2.

Proof. Using Eq. (7.8), we have the expression
$$Z(f) = \lim_{T\to\infty} Z_T(f),$$
where
$$Z_T(f) = -(1+i\beta)\int_0^T dt\, D\Theta^t[f_{-t}]\bigl(f_{-t}|f_{-t}|^2\bigr).$$
To prove the first part of Proposition 7.5, we would like to obtain a bound uniform in $T$ on the differential of $Z_T(f)$ with respect to $f$. Due to the presence of the absolute value, this function is not differentiable in $f$. One should therefore consider the expression obtained by taking the real and imaginary parts (note that we are only dealing with the values on the real axis and analyticity is not used in the following argument). To make the exposition simpler we will only explain the proof for the real Ginzburg–Landau equation (the field is real and $\alpha = \beta = 0$), and for a space dimension equal to one, but the general case only presents notational complications. We have then, since we assume $\beta = 0$,
$$Z_T(f) = -\int_0^T dt\, D\Theta^t[f_{-t}]\, f_{-t}^3.$$


From this formula we have
$$DZ_T[f]w = -\int_0^T dt\, D^2\Theta^t[f_{-t}]\bigl(f_{-t}^3, w\bigr) - 3\int_0^T dt\, D\Theta^t[f_{-t}]\bigl(f_{-t}^2\,(Df_{-t})w\bigr) \equiv X_1 + X_2. \tag{7.13}$$

The second term $X_2$ is easier to handle and we first prove both Eq. (7.11) and (7.12) for the contributions coming from this term. Since $f_{-t}$ is linear in $f$ we have $(Df_{-t})w = \Theta_0^{-t}w = w_{-t}$. Using Lemma 7.4 and Eq. (7.4), the integrand is bounded by
$$\|D\Theta^t[f_{-t}]\bigl(f_{-t}^2 w_{-t}\bigr)\|_\infty \le O(1)e^{t(1+O(\eta))}\,O(\eta)e^{-2(1-c^2)t}\,O(1)e^{-(1-c^2)t}\|w\|_\infty, \tag{7.14}$$
and therefore we get a bound for the integral which is of the form
$$\Bigl\|3\int_0^T dt\, D\Theta^t[f_{-t}]\bigl(f_{-t}^2 w_{-t}\bigr)\Bigr\|_\infty \le O(\eta)\|w\|_\infty,$$
which shows that the contribution from $X_2$ to Eq. (7.11) is of the desired form, by linearity.

We now come to the localized bound Eq. (7.12) for the contribution coming from the term $X_2$. It is enough to assume $T$ large enough and for example $T > t_0\log(1/\varepsilon)$. Using the exponential estimates of Eq. (7.14), we have for a large enough constant $t_0$ (independent of $\varepsilon$ small enough),
$$\Bigl\|3\int_{t_0\log\varepsilon^{-1}}^T dt\, D\Theta^t[f_{-t}]\bigl(f_{-t}^2 w_{-t}\bigr)\Bigr\|_\infty \le O(\varepsilon^2)\|w\|_\infty.$$
For the other part of the integral, from 0 to $t_0\log\varepsilon^{-1}$, we proceed as in the proof of Lemma 7.4. We want to bound
$$X_{2,+} = \int_0^{t_0\log\varepsilon^{-1}} dt\, D\Theta^t[f_{-t}]\bigl(f_{-t}^2 w_{-t}\bigr).$$

In particular we will control the solution of the equation
$$\partial_t v = \partial_x^2 v + v + Rv,$$
where $R = O(\eta^2)$. Note that this is very similar to the estimate in Lemma 6.3, but the proof is more delicate. We can write an integral equation, namely if $K_t$ is the heat kernel (associated with the Laplacian), we have
$$v_t = e^t K_t \star v_0 + \int_0^t ds\, e^{t-s}K_{t-s}\star(R_s v_s). \tag{7.15}$$
It is now convenient to define, as in the proof of Lemma 6.3,
$$y_t = e^{-t(1+\eta)}v_t, \tag{7.16}$$
and to prove uniform bounds in $t$ for $y_t$. This leads to the integral equation
$$y_t = K_t \star v_0 + \int_0^t ds\, e^{-\eta(t-s)}K_{t-s}\star(R_s y_s). \tag{7.17}$$
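The passage from Eq. (7.15) to Eq. (7.17) is a direct substitution. The following sketch writes $v_s = e^{s(1+\eta)}y_s$ as in (7.16), with $\star$ denoting convolution; the bookkeeping of the prefactor in the first term is ours.

```latex
y_t = e^{-t(1+\eta)} v_t
    = e^{-\eta t}\, K_t \star v_0
      + \int_0^t ds\, e^{-t(1+\eta)}\, e^{t-s}\, e^{s(1+\eta)}\, K_{t-s} \star (R_s y_s)
    = e^{-\eta t}\, K_t \star v_0
      + \int_0^t ds\, e^{-\eta (t-s)}\, K_{t-s} \star (R_s y_s),
\qquad
\int_0^t ds\, e^{-\eta(t-s)} \le \frac{1}{\eta} .
```

Compared with (7.17) as printed, the first term carries an extra factor $e^{-\eta t} \le 1$, which can only help in sup-norm estimates. The last inequality shows where a factor $\eta^{-1}$ arises from the $s$-integral; combined with $R = O(\eta^2)$ it gives an integral operator of norm $O(\eta)$.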

In particular, if we consider this equation in the space of functions bounded in space and time, the last term gives an operator of norm $O(\eta)$ because $R = O(\eta^2)$. Therefore we can solve this equation for $\eta$ small by iteration (i.e., the Neumann series converges). This is really the proof of Lemma 6.3.

We are going to use this idea in a slightly more subtle way, taking advantage of the decay properties of the heat kernel. We first choose a number $c_1 > 0$ large enough, basically $c_1^2/t_0 \gg 1$, where $t_0$ was defined above. We then choose an integer $n$ such that
$$\frac{(\log\varepsilon)^2}{\log\eta^{-1}} \ll n, \quad\text{and}\quad n c_1\log(1/\varepsilon) \le \ell/4.$$
Clearly, for our choice of $\ell \ge 1/\varepsilon$, and since $\eta$ is a fixed (but small) constant, we can choose $n$ for example of order $\bigl(\log(1/\varepsilon)\bigr)^3$, if $\varepsilon > 0$ is small enough. We next define a sequence of domains for $0 \le j \le n$ by
$$S'_j = [-L + \ell/2 - jc_1\log(1/\varepsilon),\ L - \ell/2 + jc_1\log(1/\varepsilon)].$$
Note that the distance between $S'_j$ and the complement of $[-L, L]$ is at least $\ell/4$ (for $\varepsilon$ small enough), that $S'_j \subset S'_{j+1}$, that $S'_0 = S'$ and that $S'_n \subset [-L+\ell/4, L-\ell/4] = S''$. Using the integral equation Eq. (7.17) and $t \le t_0\log\varepsilon^{-1}$ we find, upon splitting the convolution integrals in the space variable, and writing $t_* = t_0\log\varepsilon^{-1}$:
$$\sup_{t\in[0,t_*]}\|y_t\|_{L^\infty(S'_j)} \le \|v_0\|_{L^\infty(B)} + O(\varepsilon^2)\|v_0\|_{L^\infty(\mathbb{R})} + O(\eta)\sup_{t\in[0,t_*]}\|y_t\|_{L^\infty(S'_{j+1})} + O(\varepsilon^2)\sup_{t\in[0,t_*]}\|y_t\|_{L^\infty(\mathbb{R})}. \tag{7.18}$$

For example, the term $K_t \star v_0$ is bounded as follows: Writing $t_* = t_0\log\varepsilon^{-1}$ we have
$$\sup_{t\in[0,t_*]}\sup_{x\in S'_j}|K_t\star v_0(x)| \le \sup_{t\in[0,t_*]}\sup_{x\in S'_j}\int_{z\in B\cup(\mathbb{R}\setminus B)} dz\,|K_t(x-z)v_0(z)| \equiv X_B + X_{\mathbb{R}\setminus B}.$$
The term $X_B$ leads to the bound $\|v_0\|_{L^\infty(B)}$, since the integral of $|K_t| = K_t$ equals 1. Using $K_t(z) \le 2^{1/2}e^{-(z^2/(2t))}K_{2t}(z)$, the term $X_{\mathbb{R}\setminus B}$ is bounded by the supremum of $v_0$ times
$$\sup_{t\in[0,t_*]}\int_{|z|>\ell/2-jc_1\log(1/\varepsilon)} dz\, K_t(z) \le \sup_{t\in[0,t_*]}\ \sup_{|x|>\ell/2-jc_1\log(1/\varepsilon)} O(1)\exp\bigl(-\mathrm{const.}\,x^2/(2t)\bigr)\cdot\int_{\mathbb{R}} dz\, K_{2t}(z).$$
Our choice of $n$ and $c_1$ implies that $x^2/t \ge \log(1/\varepsilon^2)$ (in fact a much better bound holds here, but later, when we iterate the argument, we shall use a bound which essentially saturates this inequality) and thus the bound of the first term in Eq. (7.17) follows. The bound on the second term follows using the same techniques and the contraction mapping


principle as in our treatment of Eq. (7.5), and using that $R_s = O(\eta^2)$ to compensate for a factor of $\eta^{-1}$ which comes from the bound on the $s$-integral. Using the estimate on the whole line (Lemma 6.3), we conclude that the last term in Eq. (7.18) is of the same size as the second term and we get
$$\sup_{t\in[0,t_*]}\|y_t\|_{L^\infty(S'_j)} \le \|v_0\|_{L^\infty(B)} + O(\varepsilon^2)\|v_0\|_{L^\infty(\mathbb{R})} + O(\eta)\sup_{t\in[0,t_*]}\|y_t\|_{L^\infty(S'_{j+1})}.$$
We now iterate this inequality $n$ times (and here we only get a bound $x^2/t > \log(1/\varepsilon^2)$ which comes from the lower bound on the separation of $\mathbb{R}\setminus S'_{j+1}$ from $S'_j$) to obtain an estimate on $S'_0 = S'$. Since we have chosen the constant $n$ such that $\eta^n = o(\varepsilon^2)$, we find
$$\sup_{t\in[0,t_*]}\|y_t\|_{L^\infty(S')} \le O(1)\|v_0\|_{L^\infty(B)} + O(\varepsilon^2)\|v_0\|_{L^\infty(\mathbb{R})}.$$

We can now undo the effect of the exponential of Eq. (7.16). If we furthermore replace $v_0$ by the initial data $f_{-t}^2 w_{-t}$, and use the information we have on $f_{-t}^2 w_{-t}$, we get the bound for this part of the integral:
$$\|X_{2,+}\|_{L^\infty(S')} \le O(\eta)\|w\|_{L^\infty(B)} + O(\varepsilon^2)\|w\|_{L^\infty(\mathbb{R})}.$$
Since $S' \supset S$, this is the desired bound, and we have completed the bound on $X_2$.

We finally consider the term $X_1$ of Eq. (7.13). Here, we estimate $D^2\Theta^t[f](w_1, w_2)$. Again, this is a function $z$ which is a solution of
$$\partial_t z = \partial_x^2 z + z - 3\bigl(\Theta^t f\bigr)^2 z - 6\,\Theta^t f\cdot D\Theta^t[f]w_1\cdot D\Theta^t[f]w_2, \tag{7.19}$$
with initial data $z = 0$, which is the analog of Eq. (7.10) which we found for the first derivative. Its estimate is analogous to the previous one. To deal with the localization problem for the non-homogeneous term in Eq. (7.19), we now exploit that the bound on $X_2$ was done on a region $S'$ which is larger (by $\ell/2$) than the region $S$ on which we really need the bounds. Details are left to the reader.

Interpretation. The inequality Eq. (7.12) serves to localize the bounds of the previous subsection. If $\varepsilon$ is small enough (depending only on the bounds of Theorem 7.2 on the derivative of $Z$, which are global), we have for any two functions $f$, $f'$ in $E_b(\eta)$ the inequality
$$\|Z(f) - Z(f')\|_\infty \le O(\eta)\|f - f'\|_\infty.$$
Therefore,
$$\|S(f) - S(f')\|_\infty \ge (1 - O(\eta))\|f - f'\|_\infty.$$
Basically, we want to use Eq. (7.12) to show that if $f$ and $f'$ differ by at least $\varepsilon$ somewhere on $B$ this implies that $S(f)$ and $S(f')$ differ by at least $\varepsilon/2$ somewhere on $S$. While this is not true in general, we will see in the next section that it must be true for enough functions among those which form the centers of the balls which cover $G$. This will be exploited in the next subsection.


7.5. Proof of Theorem 7.1. The idea of the proof is to show that because $S(f) \approx f$ and because the ε-entropy of the set $E_b(\eta_*)$ of $f$ is $O\bigl(\log(1/\varepsilon)\bigr)$, the same will hold for the set $S\bigl(E_b(\eta_*)\bigr)$. Here, and in the sequel we fix $\eta_*$ to the value found as a bound in Theorem 7.2. Basically, we are going to show that if $\|f - f'\|_\infty \ge \varepsilon$, then not only
$$\|S(f) - S(f')\|_\infty \ge \varepsilon/2, \tag{7.20}$$
but also that we can find enough functions for which $\sup_{x\in B}|f(x) - f'(x)| > \varepsilon$ and
$$\sup_{x\in S}|S(f)(x) - S(f')(x)| \ge \varepsilon/4. \tag{7.21}$$
Here, we shall choose $\ell \ge 1/\varepsilon$. Note that we cannot prove Eq. (7.21) for individual pairs of functions, but only for a (large enough) subset of them. The mechanism responsible for that is a “crowding lemma” in the following setting: Let $S$ be a set of $N \gg 1$ functions which are pairwise at a distance at least $\alpha$ from each other, when considered on a set $I_{\mathrm{big}}$ which is a finite union of intervals. Let $I_{\mathrm{small}}$ be another finite union of intervals contained in $I_{\mathrm{big}}$.

Lemma 7.6. Under the above assumptions at least one of the following alternatives holds:
– At least $N^{1/2}/2$ functions in $S$ differ pairwise by $\alpha$ on $I_{\mathrm{big}}\setminus I_{\mathrm{small}}$.
– At least $N^{1/2}$ functions in $S$ differ pairwise by $\alpha/3$ on $I_{\mathrm{small}}$.

Remark. We can symmetrize the statement. We formulate this as a corollary for further use:

Corollary 7.7. Under the above assumptions at least one of the following alternatives holds:
– At least $N^{1/2}/2$ functions in $S$ differ pairwise by $\alpha/3$ on $I_{\mathrm{big}}\setminus I_{\mathrm{small}}$.
– At least $N^{1/2}/2$ functions in $S$ differ pairwise by $\alpha/3$ on $I_{\mathrm{small}}$.

Proof. We first need the following auxiliary

Lemma 7.8. Let $E$ be a set of $M^2 > 4$ points in a metric space. Assume that for a given $\rho > 0$ we can find in $E$ no more than $M$ points which are pairwise at a distance at least $\rho$. Then there is a point $x_*$ in $E$ such that at least $M/2$ points of $E$ are within a distance $\rho$ of $x_*$.

Proof. Let $E_0$ be a maximal set of points in $E$ with pairwise distance at least $\rho$. By assumption, the cardinality of $E_0$ satisfies $|E_0| \le M$. Adding any point $x' \in E\setminus E_0$ to $E_0$, we can find a point $x'' \in E_0$ such that $d(x', x'') < \rho$, where $d$ is the distance. We continue in this fashion with every point $x_j$ of $E\setminus E_0$, finding a partner $x'_j$ in $E_0$ with $d(x_j, x'_j) < \rho$. There are thus $|E\setminus E_0| \ge M^2 - M$ choices of $x'_j$. But since there are at most $M$ points in $E_0$, there must be at least one point in $E_0$ which has at least $(M^2 - M)/M$ partners. Clearly, this point can be chosen as $x_*$. Since $(M^2 - M)/M = M - 1 > M/2$ for $M > 2$, the proof of Lemma 7.8 is complete.
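The pigeonhole argument in the proof of Lemma 7.8 is constructive, and can be illustrated on a finite point set. The following is a minimal sketch, not part of the paper: the function names `maximal_separated_subset` and `crowded_point` and the sample data are our own assumptions, and the greedy selection is just one way to obtain a maximal ρ-separated subset.

```python
def maximal_separated_subset(points, dist, rho):
    """Greedily pick indices of points that are pairwise at distance >= rho.

    The result is maximal: every point that was skipped lies within
    distance < rho of some point that was picked.
    """
    chosen = []
    for i, p in enumerate(points):
        if all(dist(p, points[j]) >= rho for j in chosen):
            chosen.append(i)
    return chosen


def crowded_point(points, dist, rho):
    """Return (index of x_star, indices of its partners, indices of E0)."""
    e0 = maximal_separated_subset(points, dist, rho)
    in_e0 = set(e0)
    partners = {j: [] for j in e0}
    for i, p in enumerate(points):
        if i in in_e0:
            continue
        # Maximality of E0 guarantees that such a partner j exists.
        j = next(j for j in e0 if dist(p, points[j]) < rho)
        partners[j].append(i)
    # Pigeonhole: some element of E0 has at least |E \ E0| / |E0| partners.
    best = max(e0, key=lambda j: len(partners[j]))
    return best, partners[best], e0


# Hypothetical data: M = 3, so M^2 = 9 points on the real line.
sample = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3, 5.4]
x_star, near, e0 = crowded_point(sample, lambda a, b: abs(a - b), rho=1.0)
```

In this sample the separated subset has two elements (at most $M = 3$, as the lemma assumes), and the selected point has four partners within distance $\rho$, which exceeds $M/2$.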


Proof of Lemma 7.6. We assume that the second alternative does not hold and show that then the first must hold. If the second alternative does not hold, then we can apply Lemma 7.8 with $(M+2)^2 > N \ge (M+1)^2$ and $\rho = \alpha/3$ on the set $S$ of functions with the sup norm on $I_{\mathrm{small}}$ and conclude that there is a function, $f^*$, such that on $I_{\mathrm{small}}$ we can find $M/2$ others within distance at most $\alpha/3$ from $f^*$. Call those functions $f_i$ ($i = 1, \dots, K$ with $K \ge M/2$). Therefore,
$$\sup_{x\in I_{\mathrm{small}}}|f_j(x) - f_{j'}(x)| < 2\alpha/3,$$
for all pairs $j, j' \in \{1, \dots, K\}$. This implies that these $M/2$ functions $f_i$ must differ pairwise by at least $\alpha$ on $I_{\mathrm{big}}\setminus I_{\mathrm{small}}$ since they have to differ pairwise by $\alpha$ on the whole interval $I_{\mathrm{big}}$. The proof of Lemma 7.6 is complete.

With these tools in place, we can now start the proof of Theorem 7.1 proper. We first make precise the limiting process in the definition of $H_\varepsilon\bigl(E_b(\eta_*)\bigr)$. Using the definition of $H_\varepsilon$ we have the following information about the set $E_b(\eta_*)$: Let $N_{[-L,L]}(\varepsilon)$ denote again the minimum number of balls of radius $\varepsilon$ in $L^\infty([-L, L])$ needed to cover $E_b(\eta_*)$ (restricted to $[-L, L]$). Then we know that
$$\lim_{\varepsilon\to 0}\frac{1}{\log(1/\varepsilon)}\lim_{L\to\infty}\frac{\log N_{[-L,L]}(\varepsilon)}{2L} = \frac{2b}{\pi}.$$
This leads to upper and lower bounds of the following form: For every $\delta > 0$ there is an $\varepsilon(\delta) > 0$ and for every $\varepsilon$ satisfying $0 < \varepsilon < \varepsilon(\delta)$ there is an $L(\delta, \varepsilon)$ such that for all $L > L(\delta, \varepsilon)$ one has
$$\Bigl(\frac{1}{\varepsilon}\Bigr)^{\sigma_* L(1-\delta)} \le N_{[-L,L]}(\varepsilon) \le \Bigl(\frac{1}{\varepsilon}\Bigr)^{\sigma_* L(1+\delta)}, \tag{7.22}$$
where $\sigma_* = 4b/\pi$. Given $\delta > 0$, we pick $\varepsilon$ and $L$ as above and can find therefore in $E_b(\eta_*)$ a set $S_1$ of
$$N_1(\varepsilon, L) \ge \Bigl(\frac{1}{\varepsilon}\Bigr)^{\sigma_* L(1-\delta)}, \tag{7.23}$$
functions which are pairwise at distance at least $\varepsilon$ in $L^\infty(B)$.

Lemma 7.9. When $\ell \gg 1/\varepsilon$ and $L \gg \ell$ one can find in $S_1$ a set $S_2$ of at least $N_2 = \tfrac12 N_1^{1/2}$ functions which differ pairwise by $\varepsilon/3$ on $L^\infty(S)$.

Proof. We apply Corollary 7.7 with $I_{\mathrm{big}} = [-L, L]$ and $I_{\mathrm{small}} = [-L+\ell, L-\ell]$ and with $N = [(1/\varepsilon)^{\sigma_* L(1-\delta)/2}]$ and $\alpha = \varepsilon$. If the conclusion of Lemma 7.9 does not hold, then by Lemma 7.6 we can find $N_2$ functions which are pairwise at a distance at least $\varepsilon$ on $[-L, -L+\ell]\cup[L-\ell, L]$. Applying Corollary 7.7 with $I_{\mathrm{big}} = [-L, -L+\ell]\cup[L-\ell, L]$ and $I_{\mathrm{small}} = [-L, -L+\ell]$ we conclude that in at least one of the intervals $[-L, -L+\ell]$ and $[L-\ell, L]$ we can find at least $N_3 = \tfrac12 N_2^{1/2}$ functions which are pairwise at a distance $\varepsilon/3$ when considered on that interval. Since we are considering a subset of $E_b(\eta_*)$, we see by Eq. (7.22) that there can be no more than $N_4 \equiv (1/\varepsilon)^{\sigma_*(1+\delta)\ell}$ such functions. Since $\delta > 0$ is arbitrarily small and we have seen that there are at least $N_3$ such functions, we find for $L/5 > \ell(1+\delta)/(1-\delta) + 1$ the inequality $N_3 > N_4$. This is a contradiction and the proof of Lemma 7.9 is complete.
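The counting behind the contradiction $N_3 > N_4$ can be made explicit. The following is only a sketch, using the quantities $N_1 \ge (1/\varepsilon)^{\sigma_* L(1-\delta)}$, $N_2 = \tfrac12 N_1^{1/2}$, $N_3 = \tfrac12 N_2^{1/2}$ and $N_4 = (1/\varepsilon)^{\sigma_*(1+\delta)\ell}$ from the text; the explicit correction term is our own bookkeeping.

```latex
N_3 = \tfrac12 \bigl(\tfrac12 N_1^{1/2}\bigr)^{1/2}
    = 2^{-3/2} N_1^{1/4}
    \ge 2^{-3/2} \Bigl(\frac{1}{\varepsilon}\Bigr)^{\sigma_* L (1-\delta)/4},
\qquad
N_3 > N_4
\;\Longleftarrow\;
\frac{L}{4} > \frac{1+\delta}{1-\delta}\,\ell
      + \frac{3\log 2}{2\,\sigma_*(1-\delta)\log(1/\varepsilon)} .
```

The last term tends to 0 as $\varepsilon \to 0$, so for small $\varepsilon$ the condition $L/5 > \ell(1+\delta)/(1-\delta) + 1$ quoted in the proof is more than sufficient.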


Continuing the proof of Theorem 7.1, we take the set $S_2$ of $N_2$ functions among the initial ones which differ pairwise at least by $\varepsilon/3$ on $S$. Note that this is different from looking at functions which differ by $\varepsilon/3$ only on that interval because in $S_2$ we have some information outside, namely that the functions differ by at least $\varepsilon$ when considered on $B$. We consider the different $S(f)$ for these functions. Assume first that at least
$$N_5 \equiv \tfrac12 N_2^{1/2} = O\bigl(\varepsilon^{-L\sigma_*(1-\delta)/4}\bigr)$$
of these $S(f)$ differ pairwise by at least $\varepsilon/12$ on $S$. This means that $N_5$ balls of radius $\varepsilon/25$ in $L^\infty(S)$ do not cover the set $S(S_1)$. In the terminology of [KT, pp. 86–87], this means that the minimal number of points in an $\varepsilon/25$-net is at least $N_5$. Thus the ε-entropy per unit length of $S(S_1)$ is bounded below by $O\bigl(\log(1/\varepsilon)\bigr)$, we have a lower bound and we are done, i.e., Theorem 7.1 is proved in this case.

For the opposite case, we are going to derive a contradiction, and this will complete the proof of Theorem 7.1 for all cases. By Lemma 7.8, with $\rho = \varepsilon/12$, if we cannot find at least $N_5$ of the $S(f_i)$ which differ pairwise by at least $\varepsilon/12$ on $S$, there is an $f^{**}$ such that in a neighborhood of radius $\varepsilon/36$ around $S(f^{**})$ we can find at least $N_5$ of the other $S(f)$. This implies that we have a sub-collection $\{f_i\}$ of $N_5$ functions for which
$$\sup_{x\in S}|S(f_j)(x) - S(f_{j'})(x)| < \varepsilon/36,$$
for all choices of $j$ and $j'$. Therefore, by the definition of $S$ and $Z$ we have
$$\sup_{x\in S}|f_j(x) - f_{j'}(x)| < \varepsilon/36 + \sup_{x\in S}|Z(f_j)(x) - Z(f_{j'})(x)|.$$
We now apply Eq. (7.12) to bound this quantity by
$$\sup_{x\in S}|f_j(x) - f_{j'}(x)| < \varepsilon/36 + O(\varepsilon^2) + O(\eta_*)\sup_{x\in B}|f_j(x) - f_{j'}(x)|.$$
Now if
$$\sup_{x\in B\setminus S}|f_j(x) - f_{j'}(x)| \le \sup_{x\in S}|f_j(x) - f_{j'}(x)|,$$
the previous inequality implies
$$\sup_{x\in S}|f_j(x) - f_{j'}(x)| < (1 + O(\eta_*))^{-1}\bigl(\varepsilon/6 + O(\varepsilon^2)\bigr).$$
Combining the last two inequalities we have
$$\sup_{x\in B}|f_j(x) - f_{j'}(x)| < (1 + O(\eta_*))^{-1}\bigl(\varepsilon/36 + O(\varepsilon^2)\bigr),$$
and we have a contradiction since the distance should be at least $\varepsilon$. (It is here that we use the additional information we have on the set $S_2$ of $N_2$ functions constructed in Lemma 7.9.) Therefore we conclude that
$$\sup_{x\in B\setminus S}|f_j(x) - f_{j'}(x)| > \sup_{x\in S}|f_j(x) - f_{j'}(x)|,$$
but since the sup over the whole interval must be at least $\varepsilon$ we conclude that the sup on the l.h.s. is at least $\varepsilon$. Applying again Corollary 7.7 we can find among the $\{f_i\}$ at least $\tfrac12 N_5^{1/2}$ functions such that on one of the intervals $[-L, -L+\ell]$ or $[L-\ell, L]$ of $B\setminus S$ they are pairwise at a distance at least $\varepsilon/36$. As before this leads to a contradiction if $L \gg \ell$ because there should be at most $\varepsilon^{-\ell\sigma_*(1+\delta)}$ such functions. The proof of Theorem 7.1 is complete.


Acknowledgement. This work was supported in part by the Fonds National Suisse.

References

[BV] Babin, A.V. and Vishik, M.I.: Attractors of PDE’s in an unbounded domain. Proc. Royal Soc. Edinb. 116A, 221–243 (1990)
[C] Collet, P.: Non-linear parabolic evolutions in unbounded domains. In: Dynamics, Bifurcations and Symmetries. P. Chossat, ed., Nato ASI 437, New York, London: Plenum, 1994, pp. 97–104
[CE] Collet, P. and Eckmann, J.-P.: The time-dependent amplitude equation for the Swift-Hohenberg problem. Commun. Math. Phys. 132, 139–153 (1990)
[ER] Eckmann, J.-P. and Ruelle, D.: Ergodic theory of chaos and strange attractors. Rev. Mod. Phys. 57, 617–656 (1985)
[G] Gallay, Th.: A center-stable manifold theorem for differential equations in Banach spaces. Commun. Math. Phys. 152, 249–268 (1993)
[GH] Ghidaglia, J.M. and Héron, B.: Dimension of the attractors associated to the Ginzburg–Landau partial differential equation. Physica 28D, 282–304 (1987)
[HS] Hocking, L.M. and Stewartson, K.: On the nonlinear response of a marginally unstable plane parallel flow to a two-dimensional disturbance. Proc. R. Soc. Lond. A326, 289–313 (1972)
[H] Hörmander, L.: The Analysis of Linear Partial Differential Equations. Berlin, Heidelberg, New York: Springer, 1983–1985
[KT] Kolmogorov, A.N. and Tikhomirov, V.M.: ε-entropy and ε-capacity of sets in functional spaces.¹ In: Selected Works of A.N. Kolmogorov Vol III. Shiryayev, A.N., ed., Dordrecht: Kluwer, 1993
[M] Mattila, P.: Geometry of Sets and Measures in Euclidean Spaces. Cambridge: Cambridge University Press, 1995
[MS] Mielke, A. and Schneider, G.: Attractors for modulation equations on unbounded domains – existence and comparison. Nonlinearity 8, 743–768 (1995)
[R] Ruelle, D.: Statistical Mechanics. New York: Benjamin, 1969
[S] Schwartz, L.: Théorie des Distributions. Paris: Hermann, 1950
[T] Temam, R.: Infinite-dimensional dynamical systems in mechanics and physics. New York: Springer-Verlag, 1988

Communicated by A. Kupiainen

1 The version in this collection is more complete than the original paper of Uspekhi Mat. Nauk 14, 3–86 (1959).
