Communications in Mathematical Physics - Volume 195

Commun. Math. Phys. 195, 1 – 14 (1998) Communications in Mathematical Physics © Springer-Verlag 1998 Anderson Localiz...

Author: A. Jaffe (Chief Editor)


Commun. Math. Phys. 195, 1 – 14 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Anderson Localization for the Almost Mathieu Equation, III. Semi-Uniform Localization, Continuity of Gaps, and Measure of the Spectrum

Svetlana Ya. Jitomirskaya^{1,*}, Yoram Last^{2}

1 Department of Mathematics, University of California, Irvine, CA 92697, USA
2 Division of Physics, Mathematics, and Astronomy, California Institute of Technology, Pasadena, CA 91125, USA

Received: 7 July 1997 / Accepted: 15 September 1997

Abstract: We show that the almost Mathieu operator, $(H_{\omega,\lambda,\theta}\Psi)(n) = \Psi(n+1) + \Psi(n-1) + \lambda\cos(\pi\omega n + \theta)\Psi(n)$, has semi-uniform (and thus dynamical) localization for $\lambda > 15$ and a.e. $\omega, \theta$. We also obtain a new estimate on gap continuity (in $\omega$) for this operator with $\lambda > 29$ (or $\lambda < 4/29$), and use it to prove that the measure of its spectrum is equal to $|4 - 2|\lambda||$ for $\lambda$ in this range and all irrational $\omega$'s.

1. Introduction

In this paper we study localization for the almost Mathieu operator $H_{\omega,\lambda,\theta}$ acting on $\ell^2(\mathbb{Z})$:

$$(H_{\omega,\lambda,\theta}\Psi)(n) = \Psi(n+1) + \Psi(n-1) + v_n\Psi(n), \tag{1.1}$$

where the potential $v_n$ is given by

$$v_n = \lambda\cos(\pi\omega n + \theta). \tag{1.2}$$
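To make the action of (1.1) and (1.2) concrete, the following is a minimal sketch (not part of the original paper) that applies the operator to a finitely supported sequence. The function name `apply_almost_mathieu` and the dictionary representation are illustrative assumptions, as are the parameter values in the usage lines.

```python
import math

def apply_almost_mathieu(psi, lam, omega, theta):
    """Apply the operator of (1.1)-(1.2) to a finitely supported sequence.

    psi is a dict {site n: value}; sites missing from the dict hold 0.
    (Hypothetical helper, written only to illustrate the definition.)
    """
    out = {}
    for n, value in psi.items():
        # Hopping terms: psi(n) contributes to (H psi)(n - 1) and (H psi)(n + 1).
        out[n - 1] = out.get(n - 1, 0.0) + value
        out[n + 1] = out.get(n + 1, 0.0) + value
        # Potential term v_n * psi(n), with v_n = lam * cos(pi * omega * n + theta).
        v_n = lam * math.cos(math.pi * omega * n + theta)
        out[n] = out.get(n, 0.0) + v_n * value
    return out

# Usage: apply H to the unit vector delta_0 (parameter values are arbitrary).
delta0 = {0: 1.0}
h_delta0 = apply_almost_mathieu(delta0, lam=16.0,
                                omega=(math.sqrt(5) - 1) / 2, theta=0.3)
```

Under these assumptions, `h_delta0` is supported on the sites -1, 0, 1, with the value at 0 equal to the potential $v_0 = \lambda\cos\theta$.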

For background and some recent results on the almost Mathieu operator see [26, 18, 19, 15].

An often used notion of localization is that of pure point spectrum with exponentially decaying eigenfunctions. It can be expressed by the following definition:

Definition 1. $H$ exhibits localization if there exists a constant $\gamma > 0$ such that for any eigenfunction $\psi_s$ one can find a constant $C(s) > 0$ and a site $n(s) \in \mathbb{Z}$ (center of localization) so that $|\psi_s(k)| \le C(s)e^{-\gamma|n(s)-k|}$ for any $k \in \mathbb{Z}$.

* Alfred P. Sloan Research Fellow. The author was supported in part by NSF Grants DMS-9208029 and DMS-9501265.


Localization, in this sense, for the almost Mathieu operator has been proven for $\lambda > 15$ and $\omega, \theta$ satisfying certain arithmetic conditions [32, 13, 16]. At the same time, the physical understanding of localization is connected with what is often called dynamical localization: the non-spreading of initially localized wavepackets under the Schrödinger time evolution. It can be expressed by, for example, boundedness in the time $t$ of $\|x e^{-itH}\delta_0\|^2 = (e^{-itH}\delta_0, x^2 e^{-itH}\delta_0)$. An even stronger condition that we prefer to adopt here is the following:

Definition 2. $H$ exhibits dynamical localization if there exists a constant $\tilde\gamma > 0$ such that for any $\ell \in \mathbb{Z}$, there exists $\tilde C(\ell) > 0$ so that

$$\sup_t |e^{-itH}(n,\ell)| \le \tilde C(\ell)e^{-\tilde\gamma|n-\ell|}. \tag{1.3}$$

Both of the above definitions (as well as Definitions 3 and 4 below), with $\mathbb{Z}$ replaced by $\mathbb{Z}^d$, work for operators $H$ acting on $\ell^2(\mathbb{Z}^d)$. Dynamical localization has been established for the d-dimensional Anderson model in [2, 10] (and, in a restricted form, [28]); it remained an open question for the almost Mathieu operator.

While dynamical localization implies pure point spectrum [23, 5], the converse is not true in general. There exist operators $H$ with localization, but, nevertheless, with $\limsup_{t\to\infty}\|x e^{-itH}\delta_0\|^2/t^{\alpha} = \infty$ for any $\alpha < 2$ [7] (also see [27]). In fact, $t^{\alpha}$ can be replaced by an arbitrary function $f(t) = o(t^2)$. This example shows that Simon's theorem on the absence of ballistic motion for operators with pure point spectrum [31] is optimal, and that mere "exponential localization" of eigenfunctions is not sufficient to determine the dynamics. Indeed, localization might not have much physical meaning if there is no control on the dependence of $C$ on $n$ (or, equivalently, on the eigenenergy $E_n$). In particular, if the $C(m)$'s are allowed to grow arbitrarily fast with $m$, then eigenvectors may be "extended" over arbitrarily large length-scales and one cannot effectively define a "localization length" corresponding to a typical size of the "essential support" of the eigenfunction. An appropriate level of control over the $C(m)$'s is given by the following definition, introduced in [7]:

Definition 3. $H$ has SULE (semi-uniformly localized eigenvectors) if there exists a constant $\gamma > 0$ such that for any $b > 0$, there exists a constant $C(b) > 0$ such that for any eigenfunction $\psi_s$ one can find $n(s) \in \mathbb{Z}$ so that $|\psi_s(k)| \le C(b)e^{b|n(s)|-\gamma|k-n(s)|}$ for any $k \in \mathbb{Z}$.

The notion of SULE, which strengthens mere localization, also has a dynamical counterpart, which strengthens mere dynamical localization:

Definition 4. $H$ has SUDL (semi-uniform dynamical localization) if there exists a constant $\tilde\gamma > 0$ such that for any $b > 0$, there exists a constant $\tilde C(b) > 0$ so that

$$\sup_t |e^{-itH}(n,\ell)| \le \tilde C(b)e^{b|\ell|-\tilde\gamma|n-\ell|}. \tag{1.4}$$

It is shown in [7] that SULE implies SUDL (and thus dynamical localization), and that SUDL, along with a simple spectrum, implies SULE.

There is another direction from which the notion of localization has been challenged recently: Gordon [14] and del Rio, Makarov, and Simon [9] have shown that localization can be destroyed by infinitesimally small localized rank-one perturbations. However, SULE appears to be a condition that implies certain semistability (or physical stability) of localization [7]. Specifically, if $H$ has SULE and $H'$ is obtained from $H$ by adding a localized rank-one perturbation, then all the spectral measures of $H'$ are supported on a set of zero Hausdorff dimension; and, more importantly, while $\|x e^{-itH'}\delta_0\|$ may be unbounded, it never grows faster than $C\ln^2(t)$ [7]. That makes it particularly interesting to establish SULE for systems with localization.

For the d-dimensional Anderson model, SULE was derived in [7] from the dynamical estimates of Aizenman [2] ([10] in the one-dimensional case). More precisely, the dynamical estimates imply SUDL, from which SULE follows by the SULE ⇔ SUDL relation. In the present paper, we obtain SULE for the almost Mathieu operator by direct analysis of eigenfunctions, and so we deduce SUDL (and, in particular, dynamical localization) for this operator by using the SULE ⇔ SUDL relation in the other direction. (It should be pointed out that we do not know any other way to prove dynamical localization for the almost Mathieu operator.)

Throughout the paper we will often assume $\omega$ to be Diophantine; that is, that there exist $c(\omega) > 0$ and $1 < r(\omega) < \infty$ such that

$$|\sin \pi j\omega| > \frac{c(\omega)}{|j|^{r(\omega)}} \tag{1.5}$$

for all $j \ne 0$. Our main result is the following:

Theorem 1.1. The almost Mathieu operator has SULE for any Diophantine $\omega$, $\lambda > 15$, and a.e. $\theta$.

As discussed above this immediately implies:

Corollary 1.2. The almost Mathieu operator with $\omega, \lambda, \theta$ as in Theorem 1.1 has dynamical localization.

Remarks.
1. The set of parameters $\omega, \lambda$ in Theorem 1.1 is exactly the set for which localization has been proven [16].
2. We will, in fact, obtain a polynomial bound on $C(n(s))$. See (2.2).
3. The set of $\theta$'s for which we show SULE is smaller than the set of $\theta$'s for which localization has been proven. One can show that there is a zero measure set of $\theta$'s for which there is localization but not SULE.
4. Similar techniques can be applied to non-Diophantine $\omega$'s with exponential rate of approximation by rationals, for which localization is proven for large $\lambda$ [21].
5. One can study local SULE, that is, semi-uniform localization of eigenfunctions with corresponding eigenvalues belonging to a certain interval. Such local SULE can be shown to imply dynamical localization for the spectral projection of $H$ onto this interval. Our method allows us to establish local SULE for all energy intervals where localization has been shown so far. That includes certain intervals in the center of the spectrum for $\lambda > 5.6$ [16].
6. One can think that a more natural control on the $C(n)$'s is given by uniform localization, UL, defined as localization with a uniform bound $C(n) < c$. However, a large class of ergodic operators, including the almost Mathieu operator, has been shown not to have UL [20, 7]. Indeed, for large families of ergodic potentials, $v_n = v(T^n x)$, UL implies phase-stability of pure point spectrum (see [20]), that is, localization for all phases $x$ [20, 7]. We would like to repeat a remark from [20] that for the almost Mathieu operator, SULE is accompanied by certain phase-semistability of pure point spectrum, as the spectral measures for all $\theta$ are zero-dimensional [19, 22].


This suggests that the general relation between SULE and such semistability, or even a stronger dynamical statement, could be similar to the general relation between UL and the stability. While both phase-stability of pure point spectrum and UL, as in the Maryland model [29, 12, 11, 30], are deeply connected with the lack of resonances (see [20]), both SULE and zero-dimensional spectral measures for the almost Mathieu operator are implied by certain control over the resonances.

The estimates we obtain allow us, in addition, to answer some other almost Mathieu questions. In particular, we establish a strong version of the continuity of gaps theorem [3, 24], and use it to extend the result of [24] on the measure of the spectrum to the case of an arbitrary irrational frequency $\omega$. Let $\sigma(\omega, \lambda)$ denote the spectrum of $H_{\omega,\lambda,\theta}$, which is known to be completely $\theta$-independent for any irrational $\omega$ [5]. We shall prove the following:

Theorem 1.3. For all irrational $\omega$'s and $|\lambda| > 29$ or $|\lambda| < 4/29$, $|\sigma(\omega, \lambda)| = |4 - 2|\lambda||$, where $|\cdot|$ denotes Lebesgue measure.

Remark. The equality $|\sigma(\omega, \lambda)| = |4 - 2|\lambda||$ was conjectured by Aubry and Andre [1] to hold for every $\lambda$ and irrational $\omega$, and was studied by Thouless [33] and by Avron, van Mouche, and Simon [3], who proved the lower bound. Last [24] obtained the equality for every $\lambda$ and a.e. $\omega$.

In Sect. 2 we prove Theorem 1.1 up to some lemmas that we prove in Sections 3 and 4. In Sect. 5 we prove a result about continuity of gaps, which we use in Sect. 6 to prove Theorem 1.3. The Appendix provides a proof for a somewhat technical lemma that we use in Sect. 4.
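Several results in this introduction assume the Diophantine condition (1.5). As a rough illustration (not part of the original paper), the sketch below probes (1.5) on a finite range of $j$ for a given $\omega$ and exponent $r$. The function name `diophantine_constant`, the cutoff `j_max` and the sample frequencies are assumptions made for the example; a finite check of this kind can suggest, but never prove, that (1.5) holds for all $j$.

```python
import math

def diophantine_constant(omega, r, j_max):
    """Return the minimum over 1 <= j <= j_max of |sin(pi*j*omega)| * j**r.

    Condition (1.5) holds for 0 < |j| <= j_max with any c below this value
    (the expression is even in j, so positive j suffice).
    Hypothetical helper for illustration only.
    """
    return min(abs(math.sin(math.pi * j * omega)) * j ** r
               for j in range(1, j_max + 1))

golden = (math.sqrt(5) - 1) / 2           # a badly approximable frequency
c_golden = diophantine_constant(golden, r=2, j_max=1000)
c_rational = diophantine_constant(0.5, r=2, j_max=1000)   # rational: (1.5) fails
```

For the golden mean the returned value stays well away from zero on this range, whereas for the rational frequency 1/2 it collapses to rounding-error size, since $\sin(\pi j/2)$ vanishes for even $j$.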

2. Proof of Theorem 1.1

In this section we prove Theorem 1.1 up to some lemmas that will be proven later. Our proof consists of two parts:

1. Obtaining uniform estimates in the proof of localization for the nonresonant regime.
2. Studying the statistics of resonances, and, particularly, the dependence of the number of resonances on the position of the center of localization.

We introduce the sets of resonant phases:

$$\Theta^s_{j,k} = \left\{\theta : (k+1)^{-s} < \left|\sin 2\pi\left(\theta + \tfrac{j}{2}\omega\right)\right| < k^{-s}\right\}, \quad k \in \mathbb{N},$$

$$\Theta^s_k = \bigcup_{|j|=0}^{k}\Theta^s_{j,k}, \qquad \Theta^s = \limsup_{k\to\infty}\Theta^s_k,$$

and

$$\Theta = \bigcap_{s>r(\omega)}\Theta^s.$$

Note that $\Theta = \{\theta$ : for every $s > r(\omega)$ the relation $|\sin 2\pi(\theta + (j/2)\omega)| < k^{-s}$ for some $|j| \le k$ holds for infinitely many $k$'s$\}$. Since $|\Theta^s_k| = O(k^{-s})$, it follows from the Borel-Cantelli lemma that every $\Theta^s$, $s > r(\omega)$, has zero Lebesgue measure and so does $\Theta$. For $\theta \notin \Theta$, we define the resonant rate as $s(\theta) \equiv \inf\{s > r(\omega) : \theta \notin \Theta^s\} + 1 \ge r(\omega) + 1$. For $s > s(\theta) - 1$ we define the s-resonant number of $\theta$, $k(\theta, s) = \#\{m \in \mathbb{N} : \theta \in \Theta^s_m\}$. Let $n_1(\theta, s) < \cdots < n_{k(\theta,s)}(\theta, s)$ be the positions of resonances: all $m \in \mathbb{N}$ such that $\theta \in \Theta^s_m$. For a fixed $\theta$, the numbers $k(\theta, s)$ and $n_{k(\theta,s)}(\theta, s)$ decrease with $s$. Also, $s(\theta)$ is an invariant function: $s(\theta + \omega) = s(\theta)$, and $\Theta$ is an invariant set. Let $k(\theta) = k(\theta, s(\theta))$. We put $n_i(\theta) = n_i(\theta, s(\theta))$, $i = 1, \ldots, k(\theta)$, and will sometimes write $n_{k(\theta)}$ for $n_{k(\theta)}(\theta)$.

We first obtain some elementary information on the sparseness of the sequence $n_i(\theta)$.

Lemma 2.1. Suppose $\theta \in \Theta^s_n$, $\omega$ satisfies (1.5), and $s > r(\omega)$. Then there exists a positive constant $c_1(\omega)$ such that $\theta \notin \Theta^s_m$, $n < m < c_1(\omega)n^{s/r(\omega)}$.

Proof. $|\sin 2\pi(\theta + (j/2)\omega)| \le n^{-s}$, some $|j| \le n$, and $|\sin 2\pi(\theta + (\ell/2)\omega)| \le m^{-s}$, some $|\ell| \le m$, $|\ell| > |j|$, imply $|\sin \pi(\ell - j)\omega| < 2n^{-s}$, and so, by (1.5), we have $m \ge |\ell| \ge |\ell - j| - |j| \ge (c(\omega)n^s/2)^{1/r(\omega)} - n$.

For $\theta \notin \Theta$, $\lambda > 15$ and $\omega$ satisfying (1.5) the localization has been proven in [16]. Thus every eigenfunction $\Psi_E$ with eigenvalue $E$ attains its maximal value at no more than finitely many points. We define $n(E)$ to be the position of the leftmost maximum of $\Psi_E$. Our key technical result is

Lemma 2.2. Let $\theta \notin \Theta$, $\lambda > 15$ and $\omega$ satisfy (1.5). Then there exist $C = C(\theta, \omega, \lambda) < \infty$ and $\gamma = \gamma(\lambda) > 0$ such that for any eigenfunction $\Psi_E$ of $H_\theta$, we have $|\Psi_E(m)| < 2|\Psi_E(n(E))|e^{-\gamma(\lambda)|m-n(E)|}$ for $|m - n(E)| > C(\theta, \omega, \lambda)\ln n_{k(\theta+n(E)\omega)}$.

Lemma 2.2 will be proven in Sect. 4. For the continuity of gaps and measure of the spectrum parts, we will also need a similar, slightly more detailed statement, Lemma 5.3, asserting the exponential decay between the resonances. In order to relate the number of resonances to the position of $n(E)$, we will need the following:

Lemma 2.3. Fix $0 < r < \infty$. Then for a.e. $\theta \notin \Theta^s$ with $s > r + 1$, there exists $q(\theta, r, s) < \infty$ such that for every eigenvalue $E$ of $H_\theta$, we have $n(E) > n^r_{k(\theta+n(E)\omega)}$ if $n_{k(\theta+n(E)\omega)} > q(\theta, r, s)$.

This lemma will be proven in Sect. 3.

Proof of Theorem 1.1. Suppose $\lambda > 15$ and $\omega$ satisfies (1.5). For a.e. $\theta \notin \Theta$, as in Lemma 2.3, we obtain, using Lemma 2.2, that for any eigenfunction $\Psi_E$ of $H_{\omega,\lambda,\theta}$ and any $m \in \mathbb{Z}$,

$$|\Psi_E(m)| < 2|\Psi_E(n(E))|\,(n_{k(\theta+n(E)\omega)})^{C(\theta,\omega,\lambda)\gamma(\lambda)}\,e^{-\gamma(\lambda)|m-n(E)|}. \tag{2.1}$$

Thus, if we normalize $\Psi$ by $\Psi_E(n(E)) = 1$ and fix $0 < r < s(\theta) - 1$, Lemma 2.3 yields that

$$|\Psi_E(m)| < 2\left(\max\left(n(E)^{1/r},\, q(\theta, r, s(\theta))\right)\right)^{C(\theta,\omega,\lambda)\gamma(\lambda)} e^{-\gamma(\lambda)|m-n(E)|} \tag{2.2}$$

which, in particular, proves the theorem.
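The resonant sets used throughout this section can be explored numerically. The following sketch (an illustration, not taken from the paper) tests membership of a phase $\theta$ in $\Theta^s_k$ directly from the definition, reading the defining inequality with absolute values around the sine. The function name `resonant_scales`, the cutoff `k_max` and the sample values are hypothetical.

```python
import math

def resonant_scales(theta, omega, s, k_max):
    """List the k in 1..k_max for which theta lies in the set Theta^s_k.

    Membership is tested directly from the definition: some |j| <= k with
    (k + 1)**(-s) < |sin(2*pi*(theta + (j/2)*omega))| < k**(-s).
    Hypothetical helper for illustration only.
    """
    scales = []
    for k in range(1, k_max + 1):
        lower, upper = (k + 1) ** (-s), k ** (-s)
        if any(lower < abs(math.sin(2 * math.pi * (theta + 0.5 * j * omega))) < upper
               for j in range(-k, k + 1)):
            scales.append(k)
    return scales

golden = (math.sqrt(5) - 1) / 2
# theta = 1/12 gives |sin(2*pi*theta)| = 1/2, which lies between 2**-2 and 1**-2,
# so k = 1 is a resonant scale through j = 0.
scales = resonant_scales(1 / 12, golden, s=2.0, k_max=50)
```

The list returned for a given $\theta$ is a finite-range stand-in for the positions $n_i(\theta, s)$ of the text; Lemma 2.1 predicts that consecutive entries become sparse as they grow.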


3. Resonant Sets: Proof of Lemma 2.3

Let $p^s_n(\theta) = \min\{|m| : \theta + 2\pi m\omega \in \Theta^s_n\}$.

Lemma 3.1. Fix $0 < r < s - 1$. Then for a.e. $\theta$, there exists $q(\theta, r, s) < \infty$ such that for $n > q(\theta, r, s)$, we have $p^s_n(\theta) > n^r$.

Proof. Since $|\Theta^s_n| = O(n^{-s})$, we have $|\{\theta : p^s_n(\theta) < n^r\}| \le 2n^{r+1-s}$. Thus, the Borel-Cantelli lemma implies the result.

Proof of Lemma 2.3. Since $\theta + n(E)\omega \in \Theta^s_{n_{k(\theta+n(E)\omega)}}$, we have by the definition of $p^s_n(\theta)$ that $n(E) \ge p^s_{n_{k(\theta+n(E)\omega)}}(\theta)$. So, by Lemma 3.1, we obtain the needed statement.

4. Uniform Decay: Proof of Lemma 2.2

Throughout this section we assume that $\omega$ satisfies (1.5). We start with recalling the main definitions and lemmas from the proof of localization in [16]. We will use the notation $G_{[x_1,x_2]}(E)$ for the Green's function $(H - E)^{-1}$ of the operator $H_{\omega,\lambda,\theta}$ restricted to the interval $[x_1, x_2]$ with zero boundary conditions at $x_1 - 1$ and $x_2 + 1$. Let us denote

$$P_k(\theta, E) = \det\left[(H(\theta) - E)\big|_{[0,k-1]}\right].$$

We now fix $E \in \mathbb{R}$; $1 < m_1 < \lambda/2$, $1/m_1 < m_2 < 1$, $\varepsilon > 1$. We will need the following quantities:

$$M(E, \lambda) = \frac{1}{\sqrt{3}}\left|E + i + \sqrt{(E + i)^2 - \lambda^2}\right|,$$

where by $\sqrt{(E + i)^2 - \lambda^2}$ we understand the value with positive imaginary part;

$$C(E, \lambda) = \frac{\ln(\lambda/2)}{\ln M(E, \lambda)} - \frac{3}{4}$$

and

$$c_{\lambda,\varepsilon} = \frac{\ln(m_1 m_2)}{\ln(\varepsilon M(E,\lambda))}.$$

Given $k > 0$, let us denote

$$A_k(E) = \{x \in \mathbb{Z} : |P_k(\theta + x\omega, E)| > m_1^k\}.$$

We will sometimes drop the $E$-dependence, assuming $E$ is fixed.

Definition. Fix $E \in \mathbb{R}$. A point $y \in \mathbb{Z}$ will be called $(m_2, k)$-regular if there exists an interval $[x_1, x_2]$ containing $y$ such that $|G_{[x_1,x_2]}(y, x_i)| < m_2^k$, and $\mathrm{dist}(y, x_i) \le k$; $i = 1, 2$. Otherwise $y$ will be called $(m_2, k)$-singular.

The Poisson formula

$$\Psi(y) = G_{[x_1,x_2]}(y, x_1)\Psi(x_1 - 1) + G_{[x_1,x_2]}(y, x_2)\Psi(x_2 + 1) \tag{4.1}$$

implies that if $\Psi_E$ is a generalized eigenfunction, then every point $y$ with $\Psi_E(y) \ne 0$ is $(m_2, k)$-singular for $k$ sufficiently large: $k > k(E, m_2, \theta, y)$. Of course, in general there is no uniform bound on $k(E, m_2, \theta, y)$. It turns out though that if we pick $H_{\omega,\lambda,\theta}$ with an eigenvalue $E$ and take $y$ to be $n(E)$, such a bound becomes immediate.


Lemma 4.1. $n(E)$ is $(m, k)$-singular for $k > -\dfrac{\ln 2}{\ln m}$.

Proof. Obvious from (4.1).
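The one-line proof can be unpacked as follows. This is a sketch of the standard argument rather than text from the paper; it assumes $0 < m < 1$ and uses that $n(E)$ is a point where $|\Psi_E|$ attains its maximum. Suppose $n(E)$ were $(m, k)$-regular, with an interval $[x_1, x_2]$ as in the definition. Then (4.1) gives:

```latex
\begin{align*}
|\Psi_E(n(E))|
  &\le |G_{[x_1,x_2]}(n(E),x_1)|\,|\Psi_E(x_1-1)|
     + |G_{[x_1,x_2]}(n(E),x_2)|\,|\Psi_E(x_2+1)| \\
  &\le m^k\bigl(|\Psi_E(x_1-1)| + |\Psi_E(x_2+1)|\bigr) \\
  &\le 2m^k\,|\Psi_E(n(E))| .
\end{align*}
```

Since $|\Psi_E(n(E))| > 0$, this forces $2m^k \ge 1$, that is, $k \le -\ln 2/\ln m$ (recall that $\ln m < 0$). Hence for $k > -\ln 2/\ln m$ the point $n(E)$ cannot be regular.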

We will need to recall several statements from the proof of localization in [16].

Lemma 4.2 (Proposition 1 of [16]). For any $\varepsilon > 1$, there exists $k(\varepsilon, E)$ such that for $k > k(\varepsilon, E)$ and all $\theta$ we have $|P_k(\theta, E)| < (\varepsilon M(E, \lambda))^k$.

Lemma 4.3 (Proposition 2 of [16]). Suppose $y \in \mathbb{Z}$ is $(m_2, k)$-singular. Then for any $x$ such that $k(1 - c_{\lambda,\varepsilon}) \le y - x \le kc_{\lambda,\varepsilon}$, we have that $x$ does not belong to $A_k$.

The following lemma is proven in the proof of Lemma 3 in [16]:

Lemma 4.4. Let $2m_1/\lambda < b < 1$. Then for $k > k_1(b)$, if the two points $x_1, x_2 \in \mathbb{Z}$ are such that
1) $x_i, x_i + 1, \ldots, x_i + [\frac{k+1}{2}] \notin A_k$, $i = 1, 2$,
2) $\mathrm{dist}(x_1, x_2) > [\frac{k+1}{2}]$,
then

$$\left|\cos 2\pi\left(\theta + \left(\frac{k-1}{2} + x_1 + j_1\right)\omega\right) - \cos 2\pi\left(\theta + \left(\frac{k-1}{2} + x_1 + j_2\right)\omega\right)\right| \le b^{k/4} \tag{4.2}$$

for some $j_1, j_2 \in [0, [\frac{k+1}{2}]] \cup [x_2 - x_1, x_2 - x_1 + k - 1 - [\frac{k+1}{2}]]$.

Let $E(\theta)$ be a generalized eigenvalue of $H_{\omega,\lambda,\theta}$, $\Psi(x)$ the corresponding generalized eigenfunction. To finish the proof of Lemma 2.2 we will need $E$-independent bounds on how large the scale $k$ should be in Lemmas 4.1–4.4. Since Lemmas 4.1, 4.3, 4.4 are already $E$-independent, we will only have to take care of Lemma 4.2. It turns out that Lemma 4.2, although formulated and proven in [16] in a non-uniform way, is, in fact, a uniform statement:

Lemma 4.5. There exists $k(\varepsilon) < \infty$ such that $k(\varepsilon, E) \le k(\varepsilon)$ for any $E \in [-\lambda - 2, \lambda + 2]$.

Lemma 4.5 will be proven in the appendix.

Proof of Lemma 2.2. Let $\lambda > 15$. It was shown in [16] that in such a case, $C(E, \lambda) > C(\lambda + 2, \lambda) > 0$ for any $E \in [-\lambda - 2, \lambda + 2]$. Thus, there exist ($E$-independent) $1 < m_1 < \lambda/2$, $1/m_1 < m_2 < 1$, $\varepsilon > 1$ such that $2c_{\lambda,\varepsilon} - 1 > \frac{1}{2}$. Fix $2m_1/\lambda < b < 1$ and let

$$k = |x - n(E)| > \max\left[k(\varepsilon),\ k_1(b),\ -\frac{\ln 2}{\ln m_2}\right].$$

Suppose $x$ is $(m_2, k)$-singular. Since, by Lemma 4.1, $n(E)$ is also $(m_2, k)$-singular, we can, by Lemma 4.3, apply Lemma 4.4 with $x_1 = x - [c_{\lambda,\varepsilon}|x - n(E)|]$ and $x_2 = n(E) - [c_{\lambda,\varepsilon}|x - n(E)|]$. We then obtain, by (1.5), (4.2),

$$\left|\sin 2\pi\left(\theta + \left(\frac{k-1}{2} + x_1\right)\omega + \frac{j_1 + j_2}{2}\omega\right)\right| < \frac{b^{k/4}(3k/2)^{r(\omega)}}{2c(\omega)} < b^{k/5}, \quad k > \hat{k}(\omega, b),$$

with some $j_1, j_2 \in [0, [\frac{k+1}{2}]] \cup [x_2 - x_1, x_2 - x_1 + k - 1 - [\frac{k+1}{2}]]$. Noting that $\theta + n(E)\omega \notin \Theta^{s(\theta)}_n$ for $n > n_{k(\theta+n(E)\omega)}$, we obtain that either

$$k < \frac{-5s(\theta)\ln n_{k(\theta+n(E)\omega)}}{\ln b} \quad\text{or}\quad |k - 1 - 2c_{\lambda,\varepsilon}k + j_1 + j_2| > b^{-\frac{k}{5s(\theta)}}.$$

Since for any allowed $j_1, j_2$ we have $|k - 1 - 2c_{\lambda,\varepsilon}k + j_1 + j_2| < \frac{5}{2}k$, the second inequality is contradictory for $k > \hat{k}_1(b, s(\theta))$. Thus any $x$ with

$$k = |x - n(E)| > k_0(\varepsilon, m_2, b, \omega, \theta) = \max\left[k(\varepsilon),\ k_1(b),\ -\frac{\ln 2}{\ln m_2},\ \hat{k}(\omega, b),\ \hat{k}_1(b, s(\theta)),\ \frac{-5s(\theta)\ln n_{k(\theta+n(E)\omega)}}{\ln b}\right]$$

is $(m_2, |x - n(E)|)$-regular. Repeating the argument of [16], we have that there exists an interval $[y_1, y_2]$ containing $x$ such that

$$|y_i - x| \le |x - n(E)|, \qquad |G_{[y_1,y_2]}(x, y_i)| \le m_2^{|x-n(E)|}, \quad i = 1, 2.$$

Using (4.1), we obtain the estimate: $|\Psi(x)| \le 2|\Psi(n(E))|e^{-\gamma(\lambda)|x-n(E)|}$, $\gamma(\lambda) = -\ln m_2$, for any $x$ with $|x - n(E)| > C_1(\omega, \theta, \lambda)\ln n_{k(\theta+n(E)\omega)}$, since our choice of $\varepsilon, m_2, b$ was dependent on $\lambda$ only.
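The role of the threshold $\lambda > 15$ in the proof above can be checked numerically. The sketch below is an illustration, not part of the paper. It assumes the reading $M(E,\lambda) = \frac{1}{\sqrt3}|E + i + \sqrt{(E+i)^2 - \lambda^2}|$ (square root with positive imaginary part) and $C(E,\lambda) = \ln(\lambda/2)/\ln M(E,\lambda) - 3/4$ of the definitions in this section, and evaluates $C(\lambda+2,\lambda)$ at $\lambda = 15$ and $\lambda = 14$. The Python names `M` and `C` simply mirror the symbols of the text.

```python
import cmath
import math

def M(E, lam):
    """M(E, lam) = |E + i + sqrt((E + i)^2 - lam^2)| / sqrt(3),
    taking the square root with positive imaginary part."""
    root = cmath.sqrt((E + 1j) ** 2 - lam ** 2)
    if root.imag < 0:
        root = -root
    return abs(E + 1j + root) / math.sqrt(3)

def C(E, lam):
    """C(E, lam) = ln(lam / 2) / ln M(E, lam) - 3/4 (as read from the text)."""
    return math.log(lam / 2) / math.log(M(E, lam)) - 0.75

c15 = C(17.0, 15.0)   # C(lambda + 2, lambda) at lambda = 15
c14 = C(16.0, 14.0)   # C(lambda + 2, lambda) at lambda = 14
```

Under these assumptions `c15` is positive, though only barely, while `c14` is negative. This is consistent with $\lambda > 15$ being close to the smallest coupling for which the condition $2c_{\lambda,\varepsilon} - 1 > 1/2$ can be arranged uniformly in $E$.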

5. Continuity of Gaps

Let $\sigma(\omega, \lambda, \theta)$ denote the spectrum of $H_{\omega,\lambda,\theta}$ which depends on $\theta$ only if $\omega$ is a rational. We denote $S(\omega, \lambda) = \cup_\theta\,\sigma(\omega, \lambda, \theta)$. In this section we establish the following continuity property of the set $S(\omega, \lambda)$:

Theorem 5.1. For every $\lambda > 29$, there are constants $C(\lambda), D(\lambda) > 0$, such that if $\omega, \omega' \in \mathbb{R}$ satisfy $|\omega - \omega'| < C(\lambda)$, then for every $E \in S(\omega, \lambda)$ there is $E' \in S(\omega', \lambda)$ with $|E - E'| < F(|\omega - \omega'|)$, where $F(x) = -D(\lambda)x\ln x$.

Remarks.
1. This theorem with $F(x) = 6(\lambda x)^{1/2}$ was proven for any $\lambda > 0$ by Avron, van Mouche, and Simon [3].
2. The constant $D(\lambda)$ can be effectively estimated. Namely, as can be seen from the proof, it can be shown that

$$D(\lambda) < \frac{264\lambda}{3\ln\dfrac{\lambda}{2M(\lambda+2,\lambda)^{5/6}}}.$$

This estimate explodes for $\lambda$ approaching the root of $\lambda = 2M(\lambda + 2, \lambda)^{5/6}$ which is slightly smaller than 29. However, as $\lambda$ grows, the estimate becomes increasingly better.

Theorem 5.1 immediately implies the following corollary:

Corollary 5.2. (i) For every $\lambda > 29$, there are constants $C(\lambda), D(\lambda) > 0$, such that if $|\omega - \omega'| < C(\lambda)$, then for every gap in $S(\omega, \lambda)$ with midpoint $E_g$, and measure $|g|$ larger than $-2D(\lambda)|\omega - \omega'|\ln|\omega - \omega'|$, there is a corresponding (containing $E_g$) gap in $S(\omega', \lambda)$ with measure larger than $|g| + 2D(\lambda)|\omega - \omega'|\ln|\omega - \omega'|$.
(ii) The same continuity as in (i) also holds for the extreme edges of $S(\omega, \lambda)$; namely, for $|\omega - \omega'| < C(\lambda)$:

$$|\max S(\omega, \lambda) - \max S(\omega', \lambda)| < -D(\lambda)|\omega - \omega'|\ln|\omega - \omega'|,$$

and likewise with $\max$ replaced by $\min$.

For the proof of Theorem 5.1, we will need a statement asserting decay between the resonances. Fix $\theta$. Let $j(n)$ be the position of the resonance of order $n$: $|\sin 2\pi(\theta + \frac{j(n)}{2}\omega)| < n^{-s}$, $|\sin 2\pi(\theta + \frac{j}{2}\omega)| > n^{-s}$, $|j| < |j(n)|$. By the proof of Lemma 2.1, we have, for $m > n$:

$$|j(m)| > c_1(\omega)|j(n)|^{s/r(\omega)}. \tag{5.1}$$


Lemma 5.3. Let $\lambda > 29$, $\omega$ satisfy (1.5), $\theta \notin \Theta$, $E$ be an eigenvalue of $H_\theta$, and $\theta + n(E)\omega \in \Theta^s_n \cap \Theta^s_m$, $n < m$; $\theta \notin \Theta^s_i$, $n < i < m$. Then for $|l - n(E)| > k(\lambda, \omega)$ we have $|\Psi_E(l)| < 2|\Psi_E(n(E))|e^{-\gamma(\lambda)|l-n(E)|}$ whenever $3j(n) < |l - n(E)| < \frac{3}{8}j(m)$, where $\gamma(\lambda)$ is the same as in Lemma 2.2.

For the proof of Theorem 5.1, we will need the following elementary lemma:

Lemma 5.4. For any Borel set $S \subset [0, 1]$ with $|S| > 0$, there exist $\omega \in S$ for which (1.5) holds with $r(\omega) = 3$ and $c(\omega) > \frac{|S|}{3}$.

Proof. If not, then for every $\omega \in S$ with $r(\omega) = 3$ (such $\omega$'s form a set of full measure) there would exist $j \ne 0$ such that

$$|\sin \pi j\omega| < \frac{|S|}{3|j|^3}. \tag{5.2}$$

If we denote by $A_j$ the set of $\omega \in S$ for which (5.2) is satisfied, we obtain $|S| \le \sum|A_j| < |S|$, and the contradiction proves the lemma.

Proof of Theorem 5.1. Fix $\lambda > 29$. By Lemma 5.4, we can find $\omega_1 \in (\omega, \omega')$ satisfying (1.5) with $r(\omega_1) = 3$, such that $c(\omega_1) > \frac{|\omega - \omega'|}{3}$. Take $E \in \sigma(\omega_1, \lambda)$ and $L > k(\lambda, \omega_1)$ (from Lemma 5.3). Pick $\theta \notin \Theta$ so that $H_{\omega_1,\lambda,\theta}$ has pure point spectrum. Pick $E_1$, an eigenvalue of $H_{\omega_1,\lambda,\theta}$ with $|E_1 - E| < e^{-\frac{3}{8}\gamma(\lambda)L}$. Let $\Psi_{E_1}$ be the corresponding eigenfunction. Let $j(n_i)$ be a sequence of resonance positions for the phase $\theta + n(E_1)\omega_1$. Let $i$ be such that $j(n_i) < L < j(n_{i+1})$. Then by (5.1), one can find a constant $3/8 < a < 3$ such that $3j(n_i) < aL < \frac{3}{8}j(n_{i+1})$. Let $\Phi_{E_1}$ be $\Psi_{E_1}$ restricted to $[n(E_1) - aL, n(E_1) + aL]$ and normalized by $\|\Phi_{E_1}\| = 1$. Then we have

$$\|(H_{\omega',\lambda,\theta} - E)\Phi_{E_1}\| \le \|(H_{\omega_1,\lambda,\theta} - H_{\omega',\lambda,\theta})\Phi_{E_1}\| + \|(H_{\omega_1,\lambda,\theta} - E_1)\Phi_{E_1}\| + |E - E_1|\,\|\Phi_{E_1}\|$$
$$\le 2a\lambda|\omega - \omega'|L + 4e^{-\gamma(\lambda)aL} + e^{-\frac{3}{8}\gamma(\lambda)L} \le 6\lambda|\omega - \omega'|L + 5e^{-\frac{3}{8}\gamma(\lambda)L}.$$

Here we used Lemma 5.3 to estimate the second term. By taking $L = -D_1(\lambda)\ln|\omega - \omega'|$, where $D_1(\lambda) = \max(2c_3(\lambda), \frac{8}{3}\gamma(\lambda)^{-1})$, we obtain the needed statement for $\lambda > 29$.

Proof of Lemma 5.3. The constant 29 was chosen so that $\dfrac{\ln(\lambda/2)}{\ln M(31, 29)} > 5/6$. This implies that for $\lambda > 29$, one can choose $1 < m_1 < \lambda/2$, $m_2 < 1$, $\varepsilon > 1$ so that $2c_{\lambda,\varepsilon} - 1 > 2/3$. That means, by Lemma 4.3, that every $(m_2, k)$-singular point "produces" $[2k/3]$ points not belonging to $A_k$. We will now formulate a version of Lemma 4.4:

Lemma 5.5. Let $2m_1/\lambda < b < 1$. Then for $k > k_2(b)$, if the two points $x_1, x_2 \in \mathbb{Z}$ are such that
1) $x_i, x_i + 1, \ldots, x_i + 2[\frac{k+1}{3}] \notin A_k$, $i = 1, 2$,
2) $\mathrm{dist}(x_1, x_2) > [\frac{k+1}{2}]$,
then

$$\left|\cos 2\pi\left(\theta + \left(\frac{k-1}{2} + x_1 + j_1\right)\omega\right) - \cos 2\pi\left(\theta + \left(\frac{k-1}{2} + x_1 + j_2\right)\omega\right)\right| \le b^{k/4} \tag{5.3}$$

for some $j_1, j_2 \in [0, 2[\frac{k+1}{3}]] \cup [x_2 - x_1, x_2 - x_1 + [\frac{k+1}{3}]]$.

The proof of this lemma is identical to that of Lemma 4.4 and is given in [16].

Assume now that $\theta + n(E)\omega \in \Theta^s_n \cap \Theta^s_m$, $n < m$; $\theta \notin \Theta^s_i$, $n < i < m$. Assume $x$ is $(m_2, k)$-singular, with $k = |x - n(E)|$. We will consider several cases depending on the position of $x$ with respect to the resonance.

• If $x < n(E) < n(E) + j(n)$, we can apply Lemma 5.5 with $x_1 = x - 5k/6$ and $x_2 = n(E) - 5k/6$. As in the proof of Lemma 2.2 this together with (1.5) will imply a resonance condition

$$\left|\sin 2\pi\left(\theta + \left(n(E) + \frac{j}{2}\right)\omega\right)\right| < b^{k/5}, \tag{5.4}$$

for $j = -8k/3 - 1 + j_1 + j_2$, some $j_1, j_2 \in [0, 2[\frac{k+1}{3}]] \cup [k, k + [\frac{k+1}{3}]]$, and $k$ satisfying

$$\frac{b^{k/4}(4k/3)^{r(\omega)}}{2c(\omega)} < b^{k/5}. \tag{5.5}$$

We get a contradiction provided $j \ne j(n)$ or $j(m)$ for all choices of $j_1, j_2$. Since for any allowed $j_1, j_2$ we have $-8k/3 - 1 \le j \le 0$, the contradiction follows from $k < \frac{3}{8}j(m)$.

• If $x > n(E) > n(E) + j(n)$, we apply Lemma 5.5 with $x_1 = x - \frac{5}{6}k$ and $x_2 = n(E) - \frac{k}{2}$. We get a resonant condition (5.4) with $j = 4k/3 - 1 + j_1 + j_2$, with $j_1 + j_2 \in [-4k/3, 4k/3]$. By the same argument we obtain a contradiction from $k < \frac{3}{8}j(m)$ if $k$ obeys (5.5).

• If $x > n(E) + j(n) > n(E)$, we apply Lemma 5.5 with $x_1 = x - \frac{5}{6}k$ and $x_2 = n(E) - \frac{5}{6}k$. The possible values for $j$ in (5.4) are now $j \in [-2k/3, 0] \cup [k/3, 8k/3]$. For the contradiction, we need $3j(n) < k < \frac{3}{8}j(m)$ and

$$\frac{b^{k/4}(5k/3)^{r(\omega)}}{2c(\omega)} < b^{k/5}. \tag{5.6}$$

• Similarly, if $x < n(E) + j(n) < n(E)$, we apply Lemma 5.5 with $x_1 = x - \frac{5}{6}k$ and $x_2 = n(E) - \frac{k}{2}$. Again, the contradiction follows if $3j(n) < k < \frac{3}{8}j(m)$ and $k$ obeys (5.6).

In the same way as in the proof of Lemma 2.2, we obtain, using (5.5), (5.6), that for

$$|x - n(E)| > \max\left(k(\varepsilon),\ k_2(b),\ -\frac{\ln 2}{\ln m_2},\ \frac{20\ln(2c(\omega))}{\ln b} + 3/2\right)$$

and satisfying $3j(n) < |x - n(E)| < \frac{3}{8}j(m)$ we have that $x$ is $(m_2, |x - n(E)|)$-regular which, as before, proves the statement of Lemma 5.3. Here, in estimating how large $k$ should be to satisfy (5.5), (5.6), we used that $c(\omega) \le 1$ for all $\omega$ obeying (1.5).
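The choice of the constant 29 in the proof above can be examined numerically. The following sketch (an illustration, not part of the paper) assumes the definition $M(E,\lambda) = \frac{1}{\sqrt3}|E + i + \sqrt{(E+i)^2 - \lambda^2}|$ from Section 4 and locates, by plain bisection, the coupling at which $\ln(\lambda/2)/\ln M(\lambda+2,\lambda)$ crosses $5/6$, equivalently the root of $\lambda = 2M(\lambda+2,\lambda)^{5/6}$ mentioned in the remarks to Theorem 5.1. The helper names `excess` and `bisect_root` and the bracketing interval [20, 40] are assumptions of the example.

```python
import cmath
import math

def M(E, lam):
    """M(E, lam) from Section 4; square root taken with positive imaginary part."""
    root = cmath.sqrt((E + 1j) ** 2 - lam ** 2)
    if root.imag < 0:
        root = -root
    return abs(E + 1j + root) / math.sqrt(3)

def excess(lam):
    """ln(lam/2) / ln M(lam + 2, lam) - 5/6; positive when the ratio exceeds 5/6."""
    return math.log(lam / 2) / math.log(M(lam + 2, lam)) - 5 / 6

def bisect_root(f, lo, hi, iterations=60):
    """Plain bisection; assumes f(lo) < 0 < f(hi). Hypothetical helper."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

threshold = bisect_root(excess, 20.0, 40.0)
```

Under these assumptions the crossing point lies between 28 and 29, in line with the remark that the root is slightly smaller than 29.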


6. Measure of the Spectrum: Proof of Theorem 1.3

Once we have established the strong version of continuity of gaps, Theorem 5.1, the proof of Theorem 1.3 simply follows the lines of the measure-of-the-spectrum theorem in [24, 25]. We present the argument here for the reader's convenience. Let $G(\omega, \lambda)$ be the union of the gaps in $S(\omega, \lambda)$, so that

$$|S(\omega, \lambda)| = \max S(\omega, \lambda) - \min S(\omega, \lambda) - |G(\omega, \lambda)|. \tag{6.1}$$

If $\omega = p/q$ is a rational, then $S(\omega, \lambda)$ consists of no more than $q$ bands, and $G(\omega, \lambda)$ of no more than $q - 1$ intervals. It is well known that for any irrational $\omega$, there exists a sequence of rationals $p_n/q_n \to \omega$ such that

$$|\omega - p_n/q_n| < \frac{1}{q_n^2}. \tag{6.2}$$

Avron, van Mouche, and Simon [3] proved that for every $\lambda$ and every sequence $\{p_n/q_n\}$ with $p_n$ and $q_n$ relatively prime and $q_n \to \infty$,

$$\lim_{n\to\infty}|S(p_n/q_n, \lambda)| = |4 - 2|\lambda||. \tag{6.3}$$

Equation (6.3) along with (any) gap continuity implies (see [3, 25]) the lower bound

$$|\sigma(\omega, \lambda)| \ge |4 - 2|\lambda|| \tag{6.4}$$

for any irrational $\omega$. We now obtain from (i) of Corollary 5.2:

$$|G(\omega, \lambda)| > |G(p_n/q_n, \lambda)| - 2D(\lambda)(q_n - 1)\lambda|\omega - p_n/q_n|\ln|\omega - p_n/q_n|. \tag{6.5}$$

By (6.1) and (ii) of Corollary 5.2, this implies:

$$|S(\omega, \lambda)| < |S(p_n/q_n, \lambda)| + 2D(\lambda)q_n\lambda|\omega - p_n/q_n|\ln|\omega - p_n/q_n|.$$

By (6.2) and (6.3), we obtain $|\sigma(\omega, \lambda)| = |S(\omega, \lambda)| \le |4 - 2|\lambda||$, which together with (6.4), completes the proof of Theorem 1.3 for $\lambda > 29$. Since $S(\omega, \lambda)$ is independent of the sign of $\lambda$, it is enough to have $|\lambda| > 29$. The result for $|\lambda| < 4/29$ follows from duality: $S(\omega, \lambda) = (\lambda/2)S(\omega, 4/\lambda)$.

7. Appendix: Proof of Lemma 4.5

We denote

$$B(\theta, E) = \begin{pmatrix} E - \lambda\cos\theta & -1 \\ 1 & 0 \end{pmatrix}, \qquad B_k(\theta, E) = B(\theta + k\pi\omega, E).$$

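It was shown in the proof of Proposition 1 in [16] that for any $k > 0$, we have

$$|P_k(\theta, E)| \le \left(\frac{2}{\sqrt{3}}\right)^{k+1}\prod_{j=0}^{k}\|B_j(\theta, E)\|,$$

where for $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, we use $\|A\| = \max(\sqrt{a^2 + c^2}, \sqrt{b^2 + d^2})$. We now want to find $k(\varepsilon)$ (not dependent on $E$!) such that for any $k \ge k(\varepsilon)$,

$$\left(\frac{2}{\sqrt{3}}\right)^{k+1}\prod_{j=0}^{k}\|B_j(\theta, E)\| \le \varepsilon^k\left(\frac{2}{\sqrt{3}}\right)^{k} e^{\frac{k}{2\pi}\int_0^{2\pi}\ln\|B(\theta,E)\|\,d\theta}, \tag{7.1}$$

which can be rewritten as:

$$\ln\frac{2}{\sqrt{3}} + \sum_{j=0}^{k}\ln\|B_j(\theta, E)\| \le k\ln\varepsilon + \frac{k}{2\pi}\int_0^{2\pi}\ln\|B(\theta, E)\|\,d\theta. \tag{7.2}$$

This will prove Lemma 4.5 since $\frac{1}{2\pi}\int_0^{2\pi}\ln\|B(\theta, E)\|\,d\theta = \ln\left(\frac{\sqrt{3}}{2}M(E, \lambda)\right)$. Let $p_n/q_n$ be the sequence of continued fraction approximants of $\omega$. Let $n(k)$ be such that $q_{n(k)} \le k < q_{n(k)+1}$. We will write $r$ for $r(\omega)$ and $c$ for $c(\omega)$.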
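The identity relating the averaged logarithm of $\|B(\theta, E)\|$ to $M(E, \lambda)$ can be sanity-checked numerically. The sketch below is an illustration under stated assumptions, not part of the paper. It takes the column-norm convention above, so that $\|B(\theta, E)\| = \sqrt{(E - \lambda\cos\theta)^2 + 1}$, normalizes the integral by $1/(2\pi)$, and uses a midpoint rule. The sample values $E = 3$, $\lambda = 16$ and the name `mean_log_norm` are arbitrary choices for the example.

```python
import cmath
import math

def M(E, lam):
    """M(E, lam) from Section 4; square root taken with positive imaginary part."""
    root = cmath.sqrt((E + 1j) ** 2 - lam ** 2)
    if root.imag < 0:
        root = -root
    return abs(E + 1j + root) / math.sqrt(3)

def mean_log_norm(E, lam, samples=20000):
    """Midpoint-rule average of ln ||B(theta, E)|| over [0, 2*pi).

    With the column-norm convention of the text, ||B(theta, E)|| equals
    sqrt((E - lam*cos(theta))**2 + 1), since the second column has norm 1.
    """
    total = 0.0
    for i in range(samples):
        theta = 2 * math.pi * (i + 0.5) / samples
        total += 0.5 * math.log((E - lam * math.cos(theta)) ** 2 + 1.0)
    return total / samples

lhs = mean_log_norm(3.0, 16.0)                         # (1/2pi) * integral of ln||B||
rhs = math.log(math.sqrt(3) / 2 * M(3.0, 16.0))        # ln((sqrt(3)/2) * M(E, lam))
```

With this normalization the two quantities agree to high accuracy, which is what makes the right-hand side of (7.1) equal to $(\varepsilon M(E, \lambda))^k$.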

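Proposition 7.1. For any $f \in C[0, 2\pi)$, $k > 0$ we have:

$$\left|\sum_{j=0}^{k-1} f(\theta + j\pi\omega) - \frac{k}{2\pi}\int_0^{2\pi} f(\theta)\,d\theta\right| \le k\left(c^{-1/r}\,\frac{n(k) + 1}{k^{1/r}}\right)\mathrm{Var}(f). \tag{7.3}$$

Proof. Writing $k = b_nq_n + b_{n-1}q_{n-1} + \cdots + b_1q_1 + b_0$ and using the Denjoy-Koksma inequality (see, e.g., Lemma 4.1, Ch. 3 [4]), we get

$$\left|\sum_{j=0}^{k-1} f(\theta + j\pi\omega) - \frac{k}{2\pi}\int_0^{2\pi} f(\theta)\,d\theta\right| \le (b_0 + \cdots + b_n)\mathrm{Var}(f) \le \left(\sum_{i=0}^{n}\frac{q_{i+1}}{q_i}\right)\mathrm{Var}(f). \tag{7.4}$$

Since (1.5) implies $q_{i+1} < \frac{q_i^r}{c}$, we have $\frac{q_{i+1}}{q_i} < \frac{q_{i+1}}{(cq_{i+1})^{1/r}} = \frac{q_{i+1}^{1-1/r}}{c^{1/r}}$. The right-hand side of (7.4) can now be estimated as

$$\le \left(c^{-1/r}\sum_{i=1}^{n(k)} q_i^{1-1/r} + \frac{k}{q_{n(k)}}\right)\mathrm{Var}(f) < \left(c^{-1/r}n(k)\,q_{n(k)}^{1-1/r} + \frac{k}{q_{n(k)}}\right)\mathrm{Var}(f).$$

Since $k < q_{n(k)+1} \le \frac{q_{n(k)}^r}{c}$, we have $q_{n(k)} \ge (ck)^{1/r}$, and $\frac{k}{q_{n(k)}} \le c^{-1/r}k^{1-1/r}$. Thus we can continue our estimate as

$$\le \left(c^{-1/r}(n(k) + 1)k^{1-1/r}\right)\mathrm{Var}(f).$$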

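Proposition 7.1 implies that

$$\left|\sum_{j=0}^{k}\ln\|B_j(\theta, E)\| - \frac{k}{2\pi}\int_0^{2\pi}\ln\|B(\theta, E)\|\,d\theta\right| \le k\cdot\frac{c^{-1/r}n(k)}{k^{1/r}}\max_{-\lambda-2<E<\lambda+2}\mathrm{Var}(\ln((E - \lambda\cos\theta)^2 + 1)).$$

Denote $\max_{-\lambda-2<E<\lambda+2}\mathrm{Var}(\ln((E - \lambda\cos\theta)^2 + 1)) = A(\lambda)$. Since $q_n \ge (\sqrt{2})^n$, $n \ge 2$, we have that $n(k) \le \frac{\ln k}{\ln\sqrt{2}}$, and we can find $k(\varepsilon)$ such that for any $k > k(\varepsilon)$ we have

$$\frac{\ln(2/\sqrt{3})}{k} + \frac{c^{-1/r}n(k)}{k^{1/r}}A(\lambda) \le \ln\varepsilon,$$

which implies (7.2). This completes the proof of Lemma 4.5.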
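The growth bound $q_n \ge (\sqrt{2})^n$ on continued-fraction denominators, used above to control $n(k)$, can be illustrated with the standard recurrence for convergents. The sketch below is not from the paper. The function name `convergent_denominators` and the indexing $q_0 = 1$ are assumptions of the example, and the golden mean is used because its denominators (the Fibonacci numbers) grow as slowly as possible.

```python
def convergent_denominators(partial_quotients):
    """Denominators q_0, q_1, ... of the continued-fraction convergents.

    Uses the standard recurrence q_n = a_n * q_{n-1} + q_{n-2}
    with q_{-1} = 0 and q_0 = 1; partial_quotients = [a_0, a_1, a_2, ...].
    """
    q_prev, q = 0, 1
    denominators = [q]
    for a in partial_quotients[1:]:
        q_prev, q = q, a * q + q_prev
        denominators.append(q)
    return denominators

# Golden mean (sqrt(5) - 1) / 2 has continued fraction [0; 1, 1, 1, ...].
qs = convergent_denominators([0] + [1] * 20)

# q_n >= sqrt(2)**n is equivalent to q_n**2 >= 2**n, checked here in exact integers.
growth_ok = all(qs[n] ** 2 >= 2 ** n for n in range(2, len(qs)))
```

Since $q_{n(k)} \le k$, a bound of this form immediately gives $n(k) \le \ln k/\ln\sqrt{2}$, which is the only property of the denominators used in the last step of the proof.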


Acknowledgement. S.J. would like to thank J. Avron for the hospitality of the ITP at the Technion, and B. Simon for the hospitality of Caltech, where parts of this work were done. We are also grateful to Ya. Sinai and to B. Simon for enlightening conversations.

References

1. Aubry, G., Andre, G.: Analyticity breaking and Anderson localization in incommensurate lattices. Ann. Israel Phys. Soc. 3, 133–140 (1980)
2. Aizenman, M.: Localization at weak disorder: Some elementary bounds. Rev. Math. Phys. 6, 1163–1182 (1994)
3. Avron, J., van Mouche, P., Simon, B.: On the measure of the spectrum for the almost Mathieu operator. Commun. Math. Phys. 132, 103–118 (1990)
4. Cornfeld, I., Fomin, S., Sinai, Ya.: Ergodic Theory. New York: Springer, 1982
5. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger Operators. Berlin, Heidelberg, New York: Springer, 1987
6. del Rio, R., Jitomirskaya, S., Last, Y., Simon, B.: What is localization? Phys. Rev. Lett. 75, 117–119 (1995)
7. del Rio, R., Jitomirskaya, S., Last, Y., Simon, B.: Operators with singular continuous spectrum, IV. Hausdorff dimensions, rank-one perturbations, and localization. J. d'Analyse Math. 69, 153–200 (1996)
8. del Rio, R., Jitomirskaya, S., Makarov, N., Simon, B.: Singular continuous spectrum is generic. Bull. AMS 31, 208–212 (1994)
9. del Rio, R., Makarov, N., Simon, B.: Operators with singular continuous spectrum, II. Rank one operators. Commun. Math. Phys. 165, 59–67 (1994)
10. Delyon, F., Kunz, H., Souillard, B.: One dimensional wave equations in disordered media. J. Phys. A 16, 25–42 (1983)
11. Figotin, A., Pastur, L.: An exactly solvable model of a multidimensional incommensurate structure. Commun. Math. Phys. 95, 401–425 (1984)
12. Fishman, S., Grempel, D., Prange, R.: Localization in a d-dimensional incommensurate structure. Phys. Rev. B 29, 4272–4276 (1984)
13. Fröhlich, J., Spencer, T., Wittwer, P.: Localization for a class of one dimensional quasi-periodic Schrödinger operators. Commun. Math. Phys. 132, 5–25 (1990)
14. Gordon, A. Y.: Pure point spectrum under 1-parameter perturbations and instability of Anderson localization. Commun. Math. Phys. 164, 489–505 (1994)
15. Gordon, A., Jitomirskaya, S., Last, Y., Simon, B.: Duality and singular continuous spectrum in the almost Mathieu equation. Acta Math. 178, 169–183 (1997)
16. Jitomirskaya, S.: Anderson localization for the almost Mathieu equation; A nonperturbative proof. Commun. Math. Phys. 165, 49–58 (1994)
17. Jitomirskaya, S., Simon, B.: Operators with singular continuous spectrum, III. Almost periodic Schrödinger operators. Commun. Math. Phys. 165, 201–205 (1994)
18. Jitomirskaya, S.: Almost everything about the almost Mathieu operator II. Proceedings of XI International Congress of Mathematical Physics, Paris 1994, Int. Press, 1995, pp. 373–382
19. Jitomirskaya, S., Last, Y.: Dimensional Hausdorff properties of singular continuous spectra. Phys. Rev. Lett. 76, 1765–1769 (1996)
20. Jitomirskaya, S.: Continuous spectrum and uniform localization for ergodic Schrödinger operators. J. Funct. Anal. 145, 312–322 (1997)
21. Jitomirskaya, S.: Almost Mathieu equations with weakly Liouville frequencies: Second critical constants. In preparation
22. Jitomirskaya, S., Last, Y.: Power-Law subordinacy and singular spectra, II. Line operators. In preparation
23. Kunz, H., Souillard, B.: Sur le spectre des operateurs aux differences finies aleatoires. Commun. Math. Phys. 78, 201–246 (1980)
24. Last, Y.: A relation between absolutely continuous spectrum of ergodic Jacobi matrices and the spectra of periodic approximants. Commun. Math. Phys. 151, 183–192 (1993)
25. Last, Y.: Zero measure spectrum for the almost Mathieu operator. Commun. Math. Phys. 164, 421–432 (1994)
26. Last, Y.: Almost everything about the almost Mathieu operator I. Proceedings of XI International Congress of Mathematical Physics, Paris 1994, Int. Press, 1995, pp. 366–372
27. Last, Y.: Quantum dynamics and decompositions of singular continuous spectra. J. Funct. Anal. 142, 406–445 (1996)
28. Martinelli, F., Scoppola, E.: Introduction to the mathematical theory of Anderson localization. Rivista del Nuovo Cimento 10, N. 10 (1987)
29. Prange, R., Grempel, D., Fishman, S.: A solvable model of quantum motion in an incommensurate potential. Phys. Rev. B 29, 6500–6512 (1984)
30. Simon, B.: Almost periodic Schrödinger operators, IV. The Maryland model. Ann. Phys. 159, 157–183 (1985)
31. Simon, B.: Absence of ballistic motion. Commun. Math. Phys. 134, 209–212 (1990)
32. Sinai, Ya.: Anderson localization for one-dimensional difference Schrödinger operator with quasiperiodic potential. J. Stat. Phys. 46, 861–909 (1987)
33. Thouless, D.J.: Bandwidth for a quasiperiodic tight binding model. Phys. Rev. B 28, 4272–4276 (1983)

Communicated by B. Simon

Commun. Math. Phys. 195, 15 – 28 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

On Causal Compatibility of Quantum Field Theories and Space-Times

Michael Keyl*

Institut für Theoretische Physik, Technische Universität Berlin, Hardenbergstraße 36, 10623 Berlin, Germany. E-mail: [email protected]

Received: 30 October 1996 / Accepted: 7 November 1997

* Present address: Inst. für Math. Phys., TU-Braunschweig, Mendelssohnstraße 3, D-38106 Braunschweig, Germany.

Abstract: In [Key96] the notion of causal compatibility is introduced as a method to determine the conformal structure of space-time uniquely by the net of observable algebras of a quantum field theory. In this work some new aspects of causal compatibility are discussed. In particular it is shown that for a given poset of algebras there exists, up to conformal equivalence, at most one Lorentzian manifold which is causally compatible. This enhances previous results, where special properties of the index set of the net are used.

1. Introduction

In the framework of algebraic quantum field theory it is interesting to ask whether the local observable algebras $\mathcal{A}(O)$, $O \subset M$, of a quantum field theory determine uniquely the space-time $(M, g)$ in which they are located. Previous results considering this question can be found in [Ban88, Ban94, Key93, Key96, Wol94] and the references therein. The most satisfactory results concern the causal, and hence the conformal, structure of space-time [Key93, Key96, Wol94], since there are several theorems stating that the local algebras fix the conformal structure of the underlying space-time uniquely. The easiest theorem of this kind assumes that the algebras $\mathcal{A}(O_1)$, $\mathcal{A}(O_2)$ commute iff the space-time regions $O_1$, $O_2$ are space-like separated, i.e. there is no causal curve from $O_1$ to $O_2$ [BW92, Key93]. However this condition excludes physically significant models, like free, massless fields on Minkowski space, where the algebras do not commute iff there is a null geodesic from $O_1$ to $O_2$. To overcome this problem Wollenberg uses algebras $\mathcal{A}(c)$ associated to curves $c : (a, b) \to M$. The key observation is that, at least for the free scalar field, those algebras are abelian if $c$ is space-like and non-abelian if $c$ is time-like (see [Wol94] for details).

A different method to generalize the results of [Key93], called causal compatibility (CC-compatibility in [Key96]), is proposed in [Key96]. The basic idea is not to compare the binary relation $\perp_g$, given by space-like separation of two regions, directly with the relation $\perp_{\max}$, which is defined by $O_1 \perp_{\max} O_2 :\Longleftrightarrow [\mathcal{A}(O_1), \mathcal{A}(O_2)] = \{0\}$, but to use the causal closure operators associated to $\perp_g$ and $\perp_{\max}$ (see e.g. [BW92] or [Key96] for a definition of causal closures). It turns out that we get a uniqueness result similar to the one given in [Key93] which is applicable to a wider class of models.

Another aspect of space-time structure which can be treated quite satisfactorily in the outlined framework is the topology. However our initial question is somewhat ill posed in this context. Since open, relatively compact subsets of space-time are used to index the local algebras, the topology of space-time is needed before the algebras $\mathcal{A}(O)$ can be defined. One way to circumvent this problem is the usage of order structures. In [Key96] elements $\lambda$ of an abstract partially ordered set $(I, \prec)$ are used instead of open relatively compact subsets of a manifold $M$ to index the local algebras. In this way we get an "abstract" net $(\mathcal{A}(\lambda))_{\lambda \in I}$ which is defined without explicit references to space-time. To derive space-time topology from this abstract net we have to search for (or construct) a topological space $M$ such that the elements of $I$ can be realized by open, relatively compact subsets of $M$ and the relation $\prec$ becomes the usual inclusion relation $\subset$. In other words we need an order isomorphism $\lambda$ from $I$ to the set $B(M)$ of all open, relatively compact subsets of $M$. If this ansatz is combined with causal compatibility, outlined in the last paragraph, it can be shown that up to conformal equivalence only one space-time exists such that the derived net $B(M) \ni O \mapsto \mathcal{A}(\lambda^{-1}(O))$ on $(M, g)$ is causally compatible with this space-time (which implies that, in addition to the causal and topological structure, the differentiable structure of space-time is determined by the net $B(M) \ni O \mapsto \mathcal{A}(\lambda^{-1}(O))$).

However this result depends highly on the correct choice of the index set $(I, \prec)$. If we assume that the order isomorphism $\lambda$ from the last paragraph does not map $I$ to $B(M)$ but to the set of all causally closed sets in $B(M)$, or to the set of all double cones, the theorem discussed in the last paragraph is, as it stands, not applicable. This is a substantial drawback, since there is in general much freedom to change the index set without changing the quantum theory described by the corresponding net. It is therefore more appropriate to consider, as is done by Bannier in [Ban94], the set $L_B := \{\mathcal{A}(O) \mid O \in B\}$ of all local algebras from the net as a partially ordered set in itself and to use this structure to determine space-time topology.

The basic idea of this paper is to start with such a set $L_B$, which consists of algebras and which is partially ordered by inclusion, and to search for a space-time $(M, g)$ and a map $B(M) \ni O \mapsto \mathcal{A}(O) \in L_B$ such that the corresponding net on $(M, g)$ is causally compatible with this space-time. One main result is that the uniqueness result of [Key96] just mentioned generalizes to this case (see Sect. 4), i.e. $(M, g)$ is uniquely determined up to conformal equivalence.

The second main topic (Sect. 5) represents a first step towards generalized space-time concepts. Since the conformal structure of a space-time $(M, g)$ is uniquely determined by the set $L_B$ it should be possible to reformulate statements about causality completely in terms of $L_B$ without any reference to $(M, g)$. Following this idea we will characterize "space-like separation" in an algebraic way which makes sense even for sets of algebras $L_B$ to which a "classical" space-time $(M, g)$ can not be associated in the described way.

2. Local Algebras

Before we start with this analysis let us recall some basic facts about local C*-algebras¹ which we will need in the following. Hence we will consider a Lorentzian manifold² $(M, g)$ and a family of C*-algebras $(\mathcal{A}(O))_{O \in B}$ indexed by the elements $O \in B$ of a base $B$ for the topology of $M$. The latter we will denote by $T(M)$ in the following. The selfadjoint elements of the algebra $\mathcal{A}(O)$ should describe physically local observables which are measurable in the region $O$. Hence it is natural to assume that isotony

$$O_1 \subset O_2 \Rightarrow \mathcal{A}(O_1) \subset \mathcal{A}(O_2) \tag{1}$$

holds, i.e. $(\mathcal{A}(O))_{O \in B}$ is a net of C*-algebras. This implies especially that we can define the algebra of quasi-local observables $\mathcal{A}$ by

$$\mathcal{A} := \overline{\bigcup_{O \in B(M)} \mathcal{A}(O)} \tag{2}$$

and more generally for each $Q \in T(M)$ the algebra

$$\tilde{\mathcal{A}}(Q) = C^*\Big(\bigcup_{O \in B \cap T(Q)} \mathcal{A}(O)\Big) \quad \text{with} \quad T(Q) := \{O \mid O \in T(M),\ O \subset Q\}. \tag{3}$$

Since $O \in T(O) \cap B$ for each $O \in B$ we get $\tilde{\mathcal{A}}(O) = \mathcal{A}(O)$. In other words the family $(\tilde{\mathcal{A}}(Q))_{Q \in T(M)}$ of C*-algebras extends the family $(\mathcal{A}(O))_{O \in B}$; therefore we will drop the tilde in the following. Isotony holds as well for the extended family, which therefore defines an extended net which we will call the canonical extension of $(\mathcal{A}(O))_{O \in B}$.

If we consider now another base $B_1$ of $T(M)$ we get another net $(\mathcal{A}(O))_{O \in B_1}$ by restricting the canonical extension $(\mathcal{A}(Q))_{Q \in T(M)}$ to the new base $B_1$. However if we consider the canonical extension $(\mathcal{A}_1(Q))_{Q \in T(M)}$ of this new net (defined as in (3) with $B$ replaced by $B_1$) we get in general $\mathcal{A}_1(O) \neq \mathcal{A}(O)$, unless the original net $(\mathcal{A}(O))_{O \in B}$ is additive, i.e.

$$\mathcal{A}\Big(\bigcup_{O \in N} O\Big) = C^*\Big(\bigcup_{O \in N} \mathcal{A}(O)\Big) \tag{4}$$

holds for each subset $N \subset B$ with $\bigcup_{O \in N} O \in B$. In this case the canonical extension of $(\mathcal{A}(O))_{O \in B}$ is additive as well and we get $\mathcal{A}(Q) = \mathcal{A}_1(Q)$ for all $Q \in T(M)$. In other words an additive net does not depend on the set $B$ used to index the algebras, as long as $B$ is a base for $T(M)$. From the physical point of view it is therefore convenient to choose the set

$$B(M) := \{O \subset M \mid O \text{ open},\ \overline{O} \text{ compact}\}, \tag{5}$$

or an appropriate subset of $B(M)$, as the index set, because its elements represent bounded space-time regions. Hence the observables in $\mathcal{A}(O)$ with $O \in B(M)$ are measurable in a bounded region.

¹ At many places in this paper associative algebras are sufficient. However this generalization does not change the statements and the proofs substantially. Since the $\mathcal{A}(O)$ should describe algebras of local observables, C*-algebras are the physically most reasonable choice.
² We will assume throughout this paper that $(M, g)$ is smooth, Hausdorff, second countable and time-oriented and that the strong causality condition holds on it.

A second significant axiom, which links the structure of $(\mathcal{A}(O))_{O \in B}$ to the causal structure of $(M, g)$, is locality:

$$O_1 \perp_g O_2 \Rightarrow [\mathcal{A}(O_1), \mathcal{A}(O_2)] = \{0\}, \tag{6}$$

where $\perp_g$ denotes space-like separation:

$$O_1 \perp_g O_2 :\Longleftrightarrow O_2 \subset M \setminus \big(J^+(O_1) \cup J^-(O_1)\big). \tag{7}$$

A net $(\mathcal{A}(O))_{O \in B}$ satisfying locality is called a causal net. For a causal net locality also holds for the canonical extension and for each subnet we can define by specifying another base $B_1$. Let us consider some properties of the binary relation $\perp_g$ defined in (7). It is easy to check that it satisfies the conditions:

$$O_1 \perp_g O_2 \Longleftrightarrow O_2 \perp_g O_1, \quad \text{i.e. } \perp_g \text{ is symmetric}, \tag{8}$$

$$O_1 \subset O_2 \wedge O_2 \perp_g O_3 \Rightarrow O_1 \perp_g O_3, \tag{9}$$

and

$$I \subset T(M) \wedge O_1 \perp_g O\ \ \forall O \in I \Rightarrow O_1 \perp_g \Big(\bigcup_{O \in I} O\Big). \tag{10}$$

Following [BW92] Def. 7.1.1 we call each relation $\perp \subset T(M) \times T(M)$ satisfying (8)–(10) a "causal disjointness relation". It is quite easy to see that the relation $\perp_g$ contains all information about the conformal structure of $(M, g)$ (see [Key96]). More precisely, if $\perp_{g_1} = \perp_{g_2}$ holds, then the two metrics $g_1$, $g_2$ are conformally equivalent. Hence, if we want to determine the conformal equivalence class of $g$ by algebraic properties of the net, we have to describe $\perp_g$ by relations between algebras. The easiest way to do this is to define the "maximal³" relation $\perp_{\max}$ (see [BW92], Prop. 7.2.7):

$$O_1 \perp_{\max} O_2 :\Longleftrightarrow [\mathcal{A}(O_1), \mathcal{A}(O_2)] = \{0\}. \tag{11}$$
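Conditions (8)–(10) can be tried out on a finite toy model. The sketch below is only an illustration under stated assumptions and is not part of the construction in the text: it uses a small grid of points of 1+1 dimensional Minkowski space in light-cone coordinates $a = t - x$, $b = t + x$, treats a "region" as a finite set of grid points, and calls two points space-like separated when $\Delta a \cdot \Delta b < 0$. All names (`spacelike`, `perp_g`, `rectangle`, `all_rectangles`, `is_causal_disjointness`) are hypothetical.

```python
from itertools import product

# Toy assumption: a (2N+1) x (2N+1) grid of points in light-cone
# coordinates (a, b) = (t - x, t + x) of 1+1 dimensional Minkowski space.
N = 2


def spacelike(p, q):
    """Two points are space-like separated iff da * db < 0."""
    return (p[0] - q[0]) * (p[1] - q[1]) < 0


def perp_g(region1, region2):
    """Toy analogue of (7): every pair of points across the regions is space-like."""
    return all(spacelike(p, q) for p in region1 for q in region2)


def rectangle(a_lo, a_hi, b_lo, b_hi):
    """A discrete 'double cone': a rectangle in light-cone coordinates."""
    return frozenset(product(range(a_lo, a_hi + 1), range(b_lo, b_hi + 1)))


def all_rectangles():
    """Every rectangle that fits into the grid."""
    bounds = [(lo, hi) for lo in range(-N, N + 1) for hi in range(lo, N + 1)]
    return [rectangle(a0, a1, b0, b1) for a0, a1 in bounds for b0, b1 in bounds]


def is_causal_disjointness(regions):
    """Check symmetry (8), heredity (9) and the union property (10)."""
    for r1 in regions:
        partners = [r2 for r2 in regions if perp_g(r1, r2)]
        # (8): the relation is symmetric.
        if not all(perp_g(r2, r1) for r2 in partners):
            return False
        # (9): every subregion of a partner is again related to r1.
        for r2 in partners:
            if not all(perp_g(sub, r1) for sub in regions if sub <= r2):
                return False
        # (10): r1 is related to the union of all its partners.
        if not perp_g(r1, frozenset().union(*partners)):
            return False
    return True
```

In this toy model `perp_g(rectangle(-2, -1, 1, 2), rectangle(1, 2, -2, -1))` holds, while two time-like separated rectangles such as `rectangle(-2, -1, -2, -1)` and `rectangle(1, 2, 1, 2)` are disjoint as sets but not related by `perp_g`, so the relation carries more information than mere disjointness.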

The condition $\perp_{\max} = \perp_g$ now fixes the conformal structure uniquely (and the differentiable structure as well; see [Key93]). The drawback of this idea is that it excludes physically reasonable cases, e.g. free massless fields on Minkowski space. To get an improved ansatz, we first have to introduce the notions of causal complement and causal closure. The causal complement of an arbitrary open subset $Q \subset M$ with respect to a causal disjointness relation $\perp$ is (see [BW92], Def. 7.1.6):

$$Q^{\perp} := \bigcup_{O \in Q^c} O \quad \text{with} \quad Q^c := \{O \in T(M) \mid O \perp Q\}. \tag{12}$$

Note that we can replace in this definition the set $Q^c$ by $\{O \in B \mid O \perp O_1\ \forall O_1 \in B \text{ with } O_1 \subset Q\}$ if $B$ is a base for $T(M)$. The causal closure of an $O \in T(M)$ is simply given by $O^{\perp\perp}$, and $O \in T(M)$ is called causally closed if $O = O^{\perp\perp}$. Instead of comparing $\perp_g$ and $\perp_{\max}$ directly, the idea is now to compare the closure operators associated to $\perp_g$ and $\perp_{\max}$. Hence we call a net $(\mathcal{A}(O))_{O \in B}$ causally compatible with the space-time $(M, g)$ if $(\mathcal{A}(O))_{O \in B}$ is a causal net⁴ and if

$$O^{\perp_g \perp_g} = O^{\perp_{\max} \perp_{\max}} \quad \forall O \in B \tag{13}$$

holds. Note that causal compatibility of the net $(\mathcal{A}(O))_{O \in B}$ does not imply causal compatibility of the canonical extension, even if additivity holds (in general causal compatibility of $(\mathcal{A}(O))_{O \in B}$ only implies $Q^{\perp_{\max} \perp_{\max}} \subset Q^{\perp_g \perp_g}$ for a general open set $Q \subset M$). Therefore it is frequently necessary to assume that not only $(\mathcal{A}(O))_{O \in B}$ is causally compatible with $(M, g)$ but its canonical extension as well (see Def. 3.6). However this stronger condition is still more general than $\perp_{\max} = \perp_g$. A particular example of a causally compatible net arises from the free scalar field on a globally hyperbolic space-time (see [Dim80] for details). Due to Cor. 14.4 of [Key96] the following proposition holds:

Proposition 2.1. Consider an analytic⁵ globally hyperbolic space-time $(M, g)$ and the net $(\mathcal{A}(O))_{O \in B(M)}$ constructed from the Klein-Gordon equation according to [Dim80]. For each convex open set $U \subset M$ we can now construct the net $(\mathcal{A}(O))_{O \in B(U)}$, which is additive and causally compatible with the space-time $(U, g|_U)$.

Note that it is not clear whether the whole net is causally compatible with $(M, g)$. Therefore causal compatibility might be a local property. This would imply that the structures we are going to develop reflect only local properties of space-time.

³ $\perp_{\max}$ is the maximal element of the set of all causal disjointness relations with which a given net of C*-algebras is a causal net (see [Wol94]).
⁴ Condition (13) does not imply locality. In [Key96] a not necessarily causal net satisfying this equation is therefore called "CC-compatible" (causally closed compatible).
⁵ It is very likely that analyticity is only a restriction of the methods used in the proof and not of the statement itself.

3. Posets of C*-algebras

As we have indicated already in the introduction, the basic idea of this paper is to consider instead of a net $(\mathcal{A}(O))_{O \in B}$ only the set

$$L(\mathcal{A}(O), O \in B) := \{\mathcal{A}(O) \mid O \in B\} \tag{14}$$

of local algebras and to ask whether this structure is sufficient to determine the space-time (up to conformal equivalence). The purpose of this section is to provide some technical tools which are necessary to make this rough idea more precise. Note that our approach is closely related to (and partly inspired by) a previous work of Bannier [Ban94]. We will discuss at the end of this section the differences and similarities between both approaches. The basic objects we have to deal with are sets of C*-algebras or, more precisely, sets of C*-subalgebras of a C*-algebra as in the following definition:

Definition 3.1. A set $L_B$ is called a poset of C*-algebras if the elements of $L_B$ are C*-subalgebras of a C*-algebra⁶ $\mathfrak{I}$ and if $\mathfrak{I}$ is the C*-algebra generated by the union of all $\mathfrak{A} \in L_B$. We will call $\mathfrak{I}$ the maximal algebra of $L_B$. (Note that we have $\mathfrak{I} \notin L_B$ in general.)

⁶ To distinguish local algebras $\mathcal{A}(O)$ and the quasilocal algebra $\mathcal{A}$ on the one hand from elements of $L_B$, which are not a priori related to space-time regions, on the other hand, we will use calligraphic letters in the first case and gothic ones in the latter.
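Before moving on, the difference between the closure condition (13) and the stronger requirement $\perp_{\max} = \perp_g$ from Sect. 2 can be made concrete in a toy model. The following sketch is an illustration under loudly stated assumptions, not a quantum field theory: points live on a finite grid in light-cone coordinates $(a, b)$, regions are finite point sets, and the hypothetical rule `commute` mimics a massless field by letting "observables" at two points fail to commute exactly when the points share a light ray. All function names are hypothetical.

```python
from itertools import product

# Toy assumptions (not from the paper): a finite grid in light-cone
# coordinates (a, b); a "region" is a finite set of grid points.
N = 3
POINTS = frozenset(product(range(-N, N + 1), repeat=2))


def spacelike(p, q):
    """Toy geometric relation: space-like iff da * db < 0."""
    return (p[0] - q[0]) * (p[1] - q[1]) < 0


def commute(p, q):
    """Caricature of a massless field: commutation fails only on light rays."""
    return p[0] != q[0] and p[1] != q[1]


def complement(region, rel):
    """Toy causal complement (12): points related to every point of region."""
    return frozenset(p for p in POINTS if all(rel(p, q) for q in region))


def closure(region, rel):
    """Toy causal closure: the double complement."""
    return complement(complement(region, rel), rel)


def rectangle(a_lo, a_hi, b_lo, b_hi):
    """A discrete 'double cone' in light-cone coordinates."""
    return frozenset(product(range(a_lo, a_hi + 1), range(b_lo, b_hi + 1)))


def interior_rectangles():
    """Rectangles that stay one grid step away from the boundary of the grid."""
    bounds = [(lo, hi) for lo in range(-N + 1, N) for hi in range(lo, N)]
    return [rectangle(a0, a1, b0, b1) for a0, a1 in bounds for b0, b1 in bounds]


def closures_agree(regions):
    """Toy analogue of (13): both closure operators fix every given region."""
    return all(closure(r, spacelike) == r == closure(r, commute) for r in regions)
```

In this sketch the two relations differ, since `commute((0, 0), (1, 1))` holds for a time-like pair that is not `spacelike`, yet the two closure operators agree on every interior rectangle. This mirrors, in a very crude way, why a condition on closures can admit models that the condition $\perp_{\max} = \perp_g$ excludes.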

In terms of posets of algebras we can look at a net $(\mathcal{A}(O))_{O \in B}$ on a space-time $(M, g)$ as a monotone map $\mathcal{A} : B \to L_B$ from a base $B$ of $T(M)$ onto a poset of C*-algebras. Hence if we want to consider the canonical extension of the net $(\mathcal{A}(O))_{O \in B}$ we need an extension $L$ of the poset $L_B$ which contains for each subset $N \subset L$ the supremum

$$\bigvee N := C^*\Big(\bigcup_{\mathfrak{A} \in N} \mathfrak{A}\Big). \tag{15}$$

Therefore we define

Definition 3.2.
(i) A poset $L$ of C*-algebras is called a semi-lattice of C*-algebras if for each subset $N \subset L$ the supremum $\bigvee N$ given in (15) is an element of $L$. (Hence the maximal element $\mathfrak{I} = \bigvee L$ is in $L$.)
(ii) For each poset $L_B$ of C*-algebras there is a smallest semi-lattice $L$ containing $L_B$ and satisfying $\bigvee L = \bigvee L_B$. It will be called the semi-lattice generated by $L_B$.

If we consider now a net $B \ni O \mapsto \mathcal{A}(O) \in L_B \subset L$ we can interpret its canonical extension as the map

$$T(M) \ni Q \mapsto \mathcal{A}(Q) := \bigvee \{\mathcal{A}(O) \mid O \in B,\ O \subset Q\} \in L. \tag{16}$$

If the net is additive and if we consider $T(M)$ as a complete semi-lattice (with $\bigvee N = \bigcup_{Q \in N} Q$), this map is a homomorphism of semi-lattices, i.e. we have

$$\mathcal{A}\Big(\bigcup_{Q \in N} Q\Big) = \bigvee_{Q \in N} \mathcal{A}(Q) \tag{17}$$

for each subset $N \subset T(M)$. Furthermore it is the unique homomorphism of semi-lattices which extends the original map from $B$ to $L_B$. Although we have not introduced new structure, but only reinterpreted definitions already given in the last section, we have changed our point of view significantly. We do not treat the space-time $(M, g)$ as something given, but as something which has to be associated to a given set of local algebras. Hence we will record the ideas just introduced in the following definition.

Definition 3.3. Consider a poset $L_B$ of C*-algebras and the semi-lattice $L$ it generates. A realization of $L_B$ on a space-time $(M, g)$ is a homomorphism $T(M) \ni Q \mapsto \mathcal{A}(Q) \in L$ of semi-lattices which maps a base $B \subset B(M)$ of $T(M)$ to $L_B$.

In terms of this definition our aim is to find conditions which rule out, up to conformal equivalence, all but one realization of a given poset $L_B$. Before we consider this question let us specify what "conformal equivalence" means in this context.

Definition 3.4. Consider a poset $L_B$ of C*-algebras. Two realizations $\mathcal{A}_1 : T(M_1) \to L$ and $\mathcal{A}_2 : T(M_2) \to L$ of $L_B$ on the space-times $(M_1, g_1)$ and $(M_2, g_2)$ are called conformally equivalent if there is a (smooth) conformal transformation $f : (M_1, g_1) \to (M_2, g_2)$ with $\mathcal{A}_1(Q) = \mathcal{A}_2(f(Q))$ for all $Q \in T(M_1)$.

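The main tool for distinguishing a special class of realizations of a given poset $L_B$ are relations between $\perp_g$ and $\perp_{\max}$. The latter has a natural reformulation on the algebraic level by

$$\mathfrak{A} \perp_a \mathfrak{B} :\Longleftrightarrow [\mathfrak{A}, \mathfrak{B}] = \{0\}, \quad \mathfrak{A}, \mathfrak{B} \in L \tag{18}$$

and

$$L \ni \mathfrak{A} \mapsto \mathfrak{A}^{\perp_a} := \bigvee \{\mathfrak{B} \in L \mid [\mathfrak{A}, \mathfrak{B}] = \{0\}\} \in L. \tag{19}$$

(Note that we get an equivalent definition if we replace in the last equation $\mathfrak{B} \in L$ by $\mathfrak{B} \in L_B$.) It is easy to see that we can reformulate locality in terms of this relation by

$$Q_1 \perp_g Q_2 \Rightarrow \mathcal{A}(Q_1) \perp_a \mathcal{A}(Q_2) \quad \forall Q_1, Q_2 \in T(M). \tag{20}$$

In a similar way causal compatibility can be expressed, as is shown by the following proposition.

Proposition 3.5. Consider a poset $L_B$ of C*-algebras and a realization of $L_B$ on a space-time $(M, g)$. The corresponding net $(\mathcal{A}(O))_{O \in B}$ on $(M, g)$ is causally compatible with this space-time if Eq. (20) and

$$\mathcal{A}(O^{\perp_g \perp_g}) = \mathcal{A}(O)^{\perp_a \perp_a} \quad \forall O \in B \tag{21}$$

hold.

Proof. Equation (20) is obviously a reformulation of locality. Hence we have to check whether $\mathcal{A}(O^{\perp_g \perp_g}) = \mathcal{A}(O)^{\perp_a \perp_a}$ holds for all $O \in B$ iff $O^{\perp_g \perp_g} = O^{\perp_{\max} \perp_{\max}}$ for all $O \in B$ is true. Assume first that $O^{\perp_g \perp_g} = O^{\perp_{\max} \perp_{\max}}$ holds. Then we have for $O_1 \in B$,

$$O_1 \subset O^{\perp_g \perp_g} \Longleftrightarrow \big([\mathcal{A}(O), \mathcal{A}(Q)] = \{0\} \Rightarrow [\mathcal{A}(O_1), \mathcal{A}(Q)] = \{0\}\ \ \forall Q \in B\big), \tag{22}$$

or using $\perp_a$

$$O_1 \subset O^{\perp_g \perp_g} \Longleftrightarrow \big(\mathcal{A}(O) \perp_a \mathcal{A}(Q) \Rightarrow \mathcal{A}(O_1) \perp_a \mathcal{A}(Q)\ \ \forall Q \in B\big). \tag{23}$$

But the right-hand side of this equivalence is equivalent to

$$\mathcal{A}(O_1) \perp_a \bigvee \{\mathcal{A}(Q) \mid Q \in T(M),\ \mathcal{A}(Q) \perp_a \mathcal{A}(O)\}, \tag{24}$$

and this is by Eq. (19) equivalent to $\mathcal{A}(O_1) \perp_a \mathcal{A}(O)^{\perp_a}$. Hence applying (19) again we get $O_1 \subset O^{\perp_g \perp_g} \Longleftrightarrow \mathcal{A}(O_1) \subset \mathcal{A}(O)^{\perp_a \perp_a}$, which is what we wanted to show.

Consider now $\mathcal{A}(O^{\perp_g \perp_g}) = \mathcal{A}(O)^{\perp_a \perp_a}$. Then $O_1 \subset O^{\perp_g \perp_g}$ is for $O_1 \in B$ equivalent to $\mathcal{A}(O_1) \perp_a \mathcal{A}(Q)$ for all $Q \in B(M)$ with $\mathcal{A}(Q) \perp_a \mathcal{A}(O)$. But this is equivalent to $O_1 \subset O^{\perp_{\max} \perp_{\max}}$ and we therefore get $O_1 \subset O^{\perp_g \perp_g} \Longleftrightarrow O_1 \subset O^{\perp_{\max} \perp_{\max}}$. This completes the proof.

We have mentioned already in Sect. 2 that Eq. (21) is in general not equivalent to

$$\mathcal{A}(O^{\perp_g \perp_g}) = \mathcal{A}(O)^{\perp_a \perp_a} \quad \forall O \in T(M). \tag{25}$$

However for technical reasons (see Lemma 4.1) it is convenient to use this stronger condition. Therefore we define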

Definition 3.6. A realization $\mathcal{A} : T(M) \to L$ of a poset $L_B$ of C*-algebras is called causally admissible if Eqs. (20) and (25) hold.

In the next section we will show that there is (up to conformal equivalence) at most one causally admissible realization of a poset $L_B$. However before we do this let us have a short look at the related work by Bannier [Ban94], in which a topological space $M$ and a net $(\mathcal{A}(O))_{O \in B}$ on it are explicitly constructed from a partially ordered set of algebras. The basic difference between this ansatz and ours (apart from the fact that our approach is, in contrast to Bannier's, not constructive, since we will prove an abstract uniqueness result) concerns the partial ordering we are using throughout this paper. It is always the usual inclusion relation among algebras, while the construction in [Ban94] depends on the choice of the partial ordering, which is therefore an additional physical input. This can be seen from simple examples. Consider e.g. the set $I$ of all open, bounded intervals of the real line as the partially ordered set. If we use the ordering given by $\overline{\lambda_1} \subset \lambda_2$ ($\lambda_1, \lambda_2 \in I$) the topological space $M$ constructed from $I$ according to [Ban94] is homeomorphic to $\mathbb{R}$ with its natural topology; but if we use the ordering $\lambda_1 \subset \lambda_2$ we get a completely different space (homeomorphic to $\mathbb{R} \times \{0\} \cup \mathbb{R} \times \{1\}$ with topology generated by the base $\{[a, b) \times \{0\} \cup (a, b] \times \{1\} \mid a, b \in \mathbb{R}\}$; see [Key96] Ex. 4.6 for a proof of this remark). This example indicates that ordering by inclusion, which is used in this work, is in general not appropriate for the construction proposed in [Ban94]. Hence we can not use this method directly (i.e. without specifying additional information) to construct a realization of a poset $L$ on a space-time $(M, g)$.

4. Uniqueness of Conformal Structure

In this section we will analyse causally admissible realizations $\mathcal{A} : T(M) \to L$ of posets of C*-algebras. The main result will show that there is, up to conformal equivalence, at most one such realization. The idea of the proof of this theorem is to choose an appropriate subset of $T(M)$ such that the restriction of the map $\mathcal{A}$ is invertible (this is related to "reduced index sets"; see [BW92, 5.1.2]). Hence our first step is the following lemma.

Lemma 4.1. Consider a poset $L_B$ of C*-algebras and a causally admissible realization $\mathcal{A} : T(M) \to L$ of it on a space-time $(M, g)$.
(i) For all

$$O_1, O_2 \in B_{cc}(M, g) := \{O \in T(M) \mid O = O^{\perp_g \perp_g} \text{ and } \exists O_1 \in B(M) \text{ with } O \subset O_1^{\perp_g \perp_g}\} \tag{26}$$

we have $\mathcal{A}(O_1) \subset \mathcal{A}(O_2) \Longleftrightarrow O_1 \subset O_2$. In other words the restriction of the map $\mathcal{A}$ to $B_{cc}(M, g)$ is injective.
(ii) $B_{cc}(M, g)$ is bijectively mapped by $\mathcal{A}$ to

$$L_{cc} := \{\mathfrak{A} \in L \mid \mathfrak{A} = \mathfrak{A}^{\perp_a \perp_a} \text{ and } \exists \{\mathfrak{A}_1, \ldots, \mathfrak{A}_n\} \subset L_B \text{ with } \mathfrak{A} \subset (\mathfrak{A}_1 \vee \cdots \vee \mathfrak{A}_n)^{\perp_a \perp_a}\}. \tag{27}$$

Proof. To prove (i) consider $O_1, O_2 \in B_{cc}(M, g)$ such that $\mathcal{A}(O_2) \subset \mathcal{A}(O_1)$ holds. Obviously we have $[\mathcal{A}(O_2), \mathcal{A}(O_3)] = \{0\}$ for all $O_3$ with $[\mathcal{A}(O_1), \mathcal{A}(O_3)] = \{0\}$. By
definition this implies $O_2 \perp_{\max} O_3$ for all $O_3 \subset O_1^{\perp_{\max}}$. But this means $O_2 \subset O_1^{\perp_{\max} \perp_{\max}} = O_1^{\perp_g \perp_g} = O_1$, which is what we wanted to show.

Consider now (ii). Since we have just proved injectivity, we have to show that the image of $B_{cc}(M, g)$ under the map $\mathcal{A}$ is exactly $L_{cc}$. Hence assume first $O \in B_{cc}(M, g)$. This implies by definition $O = O^{\perp_g \perp_g}$ and the existence of an $O_1 \in B(M)$ with $O \subset O_1^{\perp_g \perp_g}$. By assumption there is a base $B \subset B(M)$ of $T(M)$ which is mapped by $\mathcal{A}$ to $L_B$. Therefore we have finitely many $Q_1, \ldots, Q_n \in B$ such that $O_1 \subset Q_1 \cup \cdots \cup Q_n$ holds. Using the properties of causally admissible space-time realizations this implies $\mathcal{A}(O_1) \subset \mathcal{A}(Q_1) \vee \cdots \vee \mathcal{A}(Q_n)$ and $\mathcal{A}(O) \subset \mathcal{A}(O_1)^{\perp_a \perp_a}$, hence $\mathcal{A}(O) \subset (\mathcal{A}(Q_1) \vee \cdots \vee \mathcal{A}(Q_n))^{\perp_a \perp_a}$, which is by definition equivalent to $\mathcal{A}(O) \in L_{cc}$.

Assume now $\mathfrak{A} \in L_{cc}$. Then we have $\mathfrak{A}^{\perp_a \perp_a} = \mathfrak{A}$ and there are $Q_1, \ldots, Q_n \in B$ such that $\mathfrak{A} \subset (\mathcal{A}(Q_1) \vee \cdots \vee \mathcal{A}(Q_n))^{\perp_a \perp_a}$ holds. Since $\mathcal{A}$ is a realization of $L_B$ we get $\mathfrak{A} \subset \mathcal{A}(Q)^{\perp_a \perp_a}$ with $Q := Q_1 \cup \cdots \cup Q_n \in B(M)$, and due to Prop. 3.5 this implies $\mathfrak{A} \subset \mathcal{A}(Q^{\perp_g \perp_g})$. On the other hand there is an $O \in T(M)$ with $\mathcal{A}(O) = \mathfrak{A}$. By assumption we get $\mathfrak{A} = \mathcal{A}(O) = \mathcal{A}(O)^{\perp_a \perp_a} = \mathcal{A}(O^{\perp_g \perp_g})$ (applying Prop. 3.5 again). Hence we get $\mathcal{A}(O^{\perp_g \perp_g}) \subset \mathcal{A}(Q^{\perp_g \perp_g})$. Due to (i) we therefore have $O^{\perp_g \perp_g} \subset Q^{\perp_g \perp_g}$ with $Q \in B(M)$. This completes the proof.

Consider now two causally admissible realizations $\mathcal{A}_1$, $\mathcal{A}_2$ of $L_B$ on $(M_1, g_1)$ resp. $(M_2, g_2)$. Then we get a map

$$B_{cc}(M_1, g_1) \ni O \mapsto \mathcal{A}_1(O) \in L_{cc} \ni \mathcal{A}_1(O) \mapsto \mathcal{A}_2^{-1}(\mathcal{A}_1(O)) =: F(O) \in B_{cc}(M_2, g_2). \tag{28}$$

We have to show that this map, which is defined on sets, is given by a conformal transformation $f : M_1 \to M_2$ such that $f(O) = F(O)$ holds. A first step in this direction is Thm. 13.4 of [Key96]. We will reformulate it using the terminology developed up to now.

Theorem 4.2. Consider a poset $L_B$ of C*-algebras and two causally admissible realizations $\mathcal{A}_i$ ($i = 1, 2$) of $L_B$ on the strongly causal space-times $(M_i, g_i)$. They are conformally equivalent if there is an order isomorphism $\hat{F} : B(M_1) \to B(M_2)$ such that $\hat{F}(O) = F(O)$ holds for all $O \in B_{cc}(M_1, g_1) \cap B(M_1)$, with the map $F$ from (28).

The drawback of this theorem is the assumption about the extendibility of the map $F$. The main purpose of the rest of this section is to show that we can drop it. The most relevant argument in this direction is the following lemma:

Lemma 4.3. Let $M_i$ ($i = 1, 2$) be two locally compact Hausdorff spaces and $B_i \subset B(M_i)$ bases for their topology satisfying the following condition:

$$(\overline{O})^\circ = O. \tag{29}$$

To each bijective map $F : B_1 \to B_2$ which is monotone in both directions there exists a unique homeomorphism $f : M_1 \to M_2$ with $f(O) = F(O)$.

Proof. Basically we use the proof of Lemma 5.2.1 of [Key94], where (29) is not assumed but implicitly used in the proof (see also Sect. 3 of [Key96]). The first step is to extend $F$ to the set $T_1$ given by

$$T_i := \{O \subset M_i \mid O \text{ open},\ (\overline{O})^\circ = O\}, \quad i = 1, 2. \tag{30}$$

This is done by ($O \in T_1$):
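$$O \mapsto \hat{F}(O) := \bigcup_{\substack{O_1 \subset O \\ O_1 \in B_1}} F(O_1). \tag{31}$$

According to [Key96] Prop. 3.5, $\hat{F}$ is an extension of $F$, i.e. $F(O) = \hat{F}(O)$ for all $O \in B_1$. It is easy to see that $\hat{F}$ is monotone: $O_1 \subset O_2$ implies $\hat{F}(O_1) \subset \hat{F}(O_2)$. But the other implication holds as well if we have $O_1 \in B_1$. To see this, note that $O = (\overline{O})^\circ$ is equivalent to the condition $Q \subset \overline{O} \Rightarrow Q \subset O$ for all open sets $Q$. Hence consider $O_1 \in B_1$ and $O_2 \in T_1$ with $F(O_1) \subset \hat{F}(O_2)$. If $O_1 \not\subset \overline{O_2}$ there is an $O_3 \in B_1$ with $O_3 \cap O_2 = \emptyset$ and $O_3 \subset O_1$. This implies $F(O_3) \subset F(O_1)$ and $\hat{F}(O_2) \cap F(O_3) = \emptyset$, which is a contradiction. Hence $O_1 \subset \overline{O_2}$ and therefore $O_1 \subset O_2$ since $O_2 \in T_1$.

Consider now the dual relation: $O_1 \not\subset O_2 \Longleftrightarrow F(O_1) \not\subset \hat{F}(O_2)$ for $O_1 \in B_1$ and $O_2 \in T_1$. We can use it to show that $\hat{F}(O)$ is in $T_2$. We have to prove that $Q \subset \overline{\hat{F}(O)}$ implies $Q \subset \hat{F}(O)$. Hence assume $Q \subset \overline{\hat{F}(O)}$ but not $Q \subset \hat{F}(O)$. As we have just seen this implies $F^{-1}(Q) \not\subset O$. Since $O \in T_1$ this is only possible if $F^{-1}(Q) \not\subset \overline{O}$. Then there is an $O_1 \in B_1$ with $O_1 \subset F^{-1}(Q)$ and $O_1 \cap O = \emptyset$. But this implies $F(O_1) \subset Q$ and $F(O_1) \cap \hat{F}(O) = \emptyset$, in contradiction to $Q \subset \overline{\hat{F}(O)}$. Therefore $Q \subset \hat{F}(O)$, which shows that $\hat{F}(O)$ is in $T_2$. Hence $\hat{F}$ is a map from $T_1$ to $T_2$. It is invertible with inverse

$$O \mapsto \hat{F}^{-1}(O) := \bigcup_{\substack{O_1 \subset O \\ O_1 \in B_2}} F^{-1}(O_1). \tag{32}$$

To show this let us calculate $\hat{F}^{-1}(\hat{F}(O))$. We have seen that $Q \subset \hat{F}(O)$ implies $F^{-1}(Q) \subset O$. Hence $\hat{F}^{-1}(\hat{F}(O)) \subset O$. The other inclusion follows from

$$O = \bigcup_{\substack{Q \in B_1 \\ Q \subset O}} Q = \bigcup_{\substack{Q \in B_1 \\ Q \subset O}} F^{-1}(F(Q)) \subset \bigcup_{\substack{Q \in B_2 \\ Q \subset \hat{F}(O)}} F^{-1}(Q) \subset \hat{F}^{-1}(\hat{F}(O)). \tag{33}$$

Now we can consider sets of closed subsets:

$$C_i := \{X \subset M_i \mid X \text{ closed},\ \overline{(X^\circ)} = X\}. \tag{34}$$

It is easy to check that the complement of $O \in T_i$ is in $C_i$ and the complement of $X \in C_i$ is in $T_i$. Therefore we can define the map $G$ by:

$$G : C_1 \to C_2; \quad X \mapsto M_2 \setminus \hat{F}(M_1 \setminus X). \tag{35}$$

As in Prop. 3.8 of [Key96] we can show that $G$ is invertible with inverse

$$G^{-1} : C_2 \to C_1; \quad X \mapsto M_1 \setminus \hat{F}^{-1}(M_2 \setminus X). \tag{36}$$

Furthermore we can prove as in Lemma 3.9 of [Key96] the relations:

$$O \subset X \Longleftrightarrow \hat{F}(O) \subset G(X) \tag{37}$$

and

$$X \subset O \Longleftrightarrow G(X) \subset \hat{F}(O). \tag{38}$$

To complete the proof consider now the closures $\overline{O}$ of $O \in B_i$. Since $(\overline{O})^\circ = O$ we have $\overline{O} \in C_i$ and we can apply the map $G$. Assume now $\overline{O_1} \subset O_2$. Our results up to now imply $F(O_1) \subset G(\overline{O_1})$ and consequently $\overline{F(O_1)} \subset G(\overline{O_1})$. Furthermore $G(\overline{O_1}) \subset F(O_2)$ holds and therefore $\overline{F(O_1)} \subset F(O_2)$. Now we can apply Lemma 4.1 of [Key93], which proves the result.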

Now we can state the main result of this section.

Theorem 4.4. All causally admissible space-time realizations of a poset of C*-algebras are pairwise conformally equivalent.

Proof. Consider the set $B_1 := B(M_1) \cap B_{cc}(M_1, g_1)$ and the map $F$ from (28). $B_1$ is a base for $T(M_1)$ (since the strong causality condition holds on both space-times by assumption; see footnote 2) and we can assume without loss of generality that $F(B_1) = B_2 \subset B(M_2) \cap B_{cc}(M_2, g_2)$ holds. (If this is not the case, i.e. if $B_2 \not\subset B(M_2)$, we have to consider $F^{-1}(B_2 \cap B_{cc}(M_2, g_2)) \subset B_1$, which is a base for $T(M_1)$ as well.) Since $B_2$ is a base of $T(M_2)$ the assumptions of Lemma 4.3 are fulfilled if we can show that $(\overline{Q})^\circ = Q$ holds for all $Q \in B_{cc}(M_i, g_i)$, $i = 1, 2$. The statement then follows directly from Thm. 4.2.

Hence consider a point $p \in (\overline{Q})^\circ$ for an arbitrary $Q \in B_{cc}(M_i, g_i)$. There are two points $r, s \in (\overline{Q})^\circ$ with $p \in I(r, s) \subset (\overline{Q})^\circ$. $r, s \in (\overline{Q})^\circ$ implies that there are open neighbourhoods $V_r$, $V_s$ of $r$, $s$ such that $Q \cap V_r$ and $Q \cap V_s$ are not empty and $p \in I(\tilde{r}, \tilde{s})$ holds for $\tilde{r} \in V_r$, $\tilde{s} \in V_s$. Hence consider $\tilde{r} \in Q \cap V_r$ and $\tilde{s} \in Q \cap V_s$; then $\tilde{r}, \tilde{s} \in Q$ imply $I(\tilde{r}, \tilde{s}) \subset Q$, since $Q$ is causally closed by assumption. Hence $p \in Q$ and consequently $(\overline{Q})^\circ = Q$, which completes the proof.

5. Abstract Causality

Since the notion of causally admissible space-time realizations associates, as we have seen in the last section, a unique conformal equivalence class of space-times to the poset of algebras $L_B$, it should be possible to reformulate all statements concerning the causal structure of those space-times in terms of $L_B$. To demonstrate this consider the following proposition:

Proposition 5.1. Consider a poset $L_B$ of C*-algebras and a causally admissible space-time realization $\mathcal{A}$ on a space-time $(M, g)$. There is exactly one binary relation $\perp_c \subset L \times L$ on the semi-lattice generated by $L_B$ which satisfies $\mathcal{A}(O_1) \perp_c \mathcal{A}(O_2) \Longleftrightarrow O_1 \perp_g O_2$ for all $O_1, O_2 \in T(M)$. $\perp_c$ does not depend on the realization $\mathcal{A}$ and the space-time $(M, g)$.

Proof. For each $\mathfrak{A}_1, \mathfrak{A}_2 \in L_{cc}$ there are by Lemma 4.1 unique $O_1, O_2 \in B_{cc}(M, g)$ such that $\mathcal{A}(O_1) = \mathfrak{A}_1$ and $\mathcal{A}(O_2) = \mathfrak{A}_2$. Hence we can define $\mathfrak{A}_1 \perp_c \mathfrak{A}_2 :\Longleftrightarrow O_1 \perp_g O_2$. Due to Thm. 4.4 this relation does not depend on the realization $\mathcal{A}$. To extend $\perp_c$ to the semi-lattice $L$ we define $\mathfrak{B}_1 \perp_c \mathfrak{B}_2 :\Longleftrightarrow \mathfrak{A}_1 \perp_c \mathfrak{A}_2$ for all $\mathfrak{A}_1, \mathfrak{A}_2 \in L_{cc}$ with $\mathfrak{A}_1 \subset \mathfrak{B}_1$ and $\mathfrak{A}_2 \subset \mathfrak{B}_2$. The statement follows from Eq. (10).

The relation $\perp_c \subset L \times L$ thus defined depends only on the algebraic structure of $L_B$ (and $L$). Hence it should be possible to characterize it without explicit use of the realization $\mathcal{A}$. The task of this section is to show how this can be done. The first steps are the following two lemmata:

Lemma 5.2. Let $(M, g)$ be a strongly causal space-time and $O_i \in B_{cc}(M, g)$ with $\overline{O_1} \cap \overline{O_2} = \emptyset$. Then

$$(O_1 \cup O_2)^{\perp_g \perp_g} = O_1 \cup O_2 \tag{39}$$

holds if $O_3 \subset (O_1 \cup O_2)^{\perp_g \perp_g}$ implies for all $O_3 \in B_{cc}(M, g)$ that $O_3 \cap O_1 \neq \emptyset$ or $O_3 \cap O_2 \neq \emptyset$.

Proof. Consider an $O_3 \in B_{cc}(M, g)$ with $O_3 \subset (O_1 \cup O_2)^{\perp_g \perp_g}$. If $O_3 \not\subset \overline{O_1 \cup O_2}$ there is a nonempty $Q \in B_{cc}(M, g)$ with $Q \subset O_3$ and $Q \subset M \setminus \overline{(O_1 \cup O_2)}$. Hence $Q$ is a subset of $(O_1 \cup O_2)^{\perp_g \perp_g}$ but $O_1 \cap Q = \emptyset$ and $O_2 \cap Q = \emptyset$, which contradicts the assumption. This implies $O_3 \subset \overline{O_1 \cup O_2}$, and since $\overline{O_1} \cap \overline{O_2} = \emptyset$ is satisfied by assumption we have $O_3 \subset \overline{O_1 \cup O_2} = \overline{O_1} \cup \overline{O_2}$. Consider now a point $p \in O_3 \cap \overline{O_1}$. There is a neighbourhood $I(r, s) := I^+(r) \cap I^-(s) \subset O_3$ of $p$. Since $\overline{O_1}$ and $\overline{O_2}$ are disjoint we can assume without loss of generality that $I(r, s)$ does not intersect $\overline{O_2}$. This implies $r, s \in \overline{O_1}$. Hence each neighbourhood $U$ of $r$ and $V$ of $s$ intersects $O_1$. On the other hand $r$, $s$ admit neighbourhoods $U$, $V$ such that $p \in I(\tilde{r}, \tilde{s})$ for all $\tilde{r} \in U$ and $\tilde{s} \in V$. In other words $p$ has a neighbourhood $I(\tilde{r}, \tilde{s})$ with $\tilde{r}, \tilde{s} \in O_1$. Since $O_1$ is causally closed it is, due to Lemma 14.4 of [Key96], a globally hyperbolic subset of $M$. Consequently $I(\tilde{r}, \tilde{s}) \subset O_1$ and therefore $p \in O_1$. Since the same arguments apply to $O_2$ we get $O_3 \subset O_1 \cup O_2$, which completes the proof.

Lemma 5.3. Consider a globally hyperbolic space-time $(M, g)$. For two disjoint sets $O_1, O_2 \in B_{cc}(M, g)$ the relation $O_1 \perp_g O_2$ holds iff $O_1 \cup O_2 \in B_{cc}(M, g)$ is satisfied.

Proof. Assume first $O_1 \perp_g O_2$. According to Lemma 14.1 of [Key96] there are acausal subsets $S_1$, $S_2$ of $O_1$, $O_2$ such that $O_i = D(S_i)$ ($i = 1, 2$), where $D(S_i)$ denotes the Cauchy development of $S_i$. Since $O_1$ and $O_2$ are space-like separated, $S := S_1 \cup S_2$ is acausal as well and $D(S) = O_1 \cup O_2$. Hence Proposition 12.2 of [Key96] shows that $O_1 \cup O_2$ is causally closed.

Assume now that $O_1 \cup O_2$ is causally closed. We have to show that $O_1 \perp_g O_2$ holds. Applying again Lemma 14.4 of [Key96] we get an achronal set $S \subset O_1 \cup O_2$ with $D(S) = O_1 \cup O_2$. Since $O_1$ and $O_2$ are disjoint we get $S = S_1 \cup S_2$ with $S_i = O_i \cap S$ and $O_i = D(S_i)$. But since $S$ is acausal there is no causal curve from $S_1$ to $S_2$ and hence no causal curve from $O_1$ to $O_2$, which implies $O_1 \perp_g O_2$.
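The criterion of Lemma 5.3 can be tried out in a finite toy model. The sketch below is a hedged illustration only: it assumes a grid of points in light-cone coordinates $(a, b)$, takes rectangles in these coordinates as stand-ins for causally closed regions, and, as a discrete substitute for regions with disjoint closures, only compares rectangles that are separated by at least one grid point in each coordinate. The function names (`order`, `points`, `check_union_criterion`) are hypothetical.

```python
from itertools import product

# Toy assumptions (not from the paper): finite grid in light-cone
# coordinates (a, b); regions are finite sets of grid points.
N = 3
POINTS = frozenset(product(range(-N, N + 1), repeat=2))


def spacelike(p, q):
    """Space-like iff da * db < 0."""
    return (p[0] - q[0]) * (p[1] - q[1]) < 0


def perp_g(region1, region2):
    """Region-level space-like separation."""
    return all(spacelike(p, q) for p in region1 for q in region2)


def complement(region):
    """Toy causal complement: points space-like to every point of region."""
    return frozenset(p for p in POINTS if all(spacelike(p, q) for q in region))


def closure(region):
    """Toy causal closure: the double complement."""
    return complement(complement(region))


def points(rect):
    """Point set of a rectangle given as (a_lo, a_hi, b_lo, b_hi)."""
    a_lo, a_hi, b_lo, b_hi = rect
    return frozenset(product(range(a_lo, a_hi + 1), range(b_lo, b_hi + 1)))


def order(lo1, hi1, lo2, hi2):
    """+1 or -1 if the intervals are separated by a gap, 0 otherwise."""
    if lo2 - hi1 >= 2:
        return 1
    if lo1 - hi2 >= 2:
        return -1
    return 0


def check_union_criterion():
    """Count well-separated pairs; return 0 if the criterion fails for one."""
    bounds = [(lo, hi) for lo in range(-N + 1, N) for hi in range(lo, N)]
    rects = [(a0, a1, b0, b1) for a0, a1 in bounds for b0, b1 in bounds]
    checked = 0
    for r1, r2 in product(rects, repeat=2):
        if order(r1[0], r1[1], r2[0], r2[1]) == 0:
            continue
        if order(r1[2], r1[3], r2[2], r2[3]) == 0:
            continue
        o1, o2 = points(r1), points(r2)
        union = o1 | o2
        # Toy version of Lemma 5.3: space-like iff the union is closed.
        if perp_g(o1, o2) != (closure(union) == union):
            return 0
        checked += 1
    return checked
```

For two time-like separated points such as `points((-2, -2, -2, -2))` and `points((0, 0, 0, 0))` the closure of the union fills the whole rectangle between them, which is the discrete counterpart of the union failing to be causally closed.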
The last object we need to state the announced theorem is an algebraic equivalent for O3 ∩ O1 = ∅. This is done by the following equation (A1, A2 ∈ L):

$$A_1 \perp_o A_2 \;:\Longleftrightarrow\; \text{the set } \{A_1, A_2\} \text{ admits no lower bound in } L. \tag{40}$$
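The definition in Eq. (40) can be illustrated in a toy model. The sketch below is only an illustration under a strong simplifying assumption: the poset is taken to be the set of all nonempty subsets of a small finite set, ordered by inclusion, instead of a poset of C*-algebras. In this toy model "no lower bound" is exactly disjointness, which is the role Eq. (40) plays for regions. All function names are hypothetical.

```python
from itertools import combinations

def nonempty_subsets(points):
    """Toy poset: all nonempty subsets of `points`, ordered by inclusion."""
    pts = list(points)
    return [frozenset(c)
            for size in range(1, len(pts) + 1)
            for c in combinations(pts, size)]

def has_lower_bound(a, b, poset):
    """True if some element of the poset lies below both a and b."""
    return any(x <= a and x <= b for x in poset)

def perp_o(a, b, poset):
    """Toy analogue of Eq. (40): {a, b} admits no lower bound in the poset."""
    return not has_lower_bound(a, b, poset)

poset = nonempty_subsets(range(4))
left, right, overlap = frozenset({0, 1}), frozenset({2, 3}), frozenset({1, 2})
print(perp_o(left, right, poset))    # disjoint regions
print(perp_o(left, overlap, poset))  # regions sharing the point 1
```

The empty set is deliberately excluded from the toy poset; otherwise every pair would have a common lower bound and the relation would be empty.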

Now we can characterize the relation ⊥c of Prop. 5.1 as follows:

Theorem 5.4. Consider a poset LB of C*-algebras which admits a causally admissible realization A on a space-time (M, g). For A1, A2 ∈ Lcc the relation ⊥c defined in 5.1 has the form

$$A_1 \perp_c A_2 \;:\Longleftrightarrow\; A_1 \perp_o A_2 \ \text{and}\ \Big( A_3 \subset (A_4 \vee A_5)^{\perp_a \perp_a} \Rightarrow \neg\,(A_3 \perp_o A_4 \ \text{and}\ A_3 \perp_o A_5) \quad \forall A_3 \in L_{cc},\ \forall A_4 \in L_{cc},\ A_4 \subset A_1,\ \forall A_5 \in L_{cc},\ A_5 \subset A_2 \Big). \tag{41}$$

Proof. According to Lemma 4.1 there are unique Oi ∈ Bcc(M, g), i = 1, ..., 5, such that A(Oi) = Ai holds. Assume first that O1 ⊥g O2 is satisfied. This implies for all O4, O5 with O4 ⊂ O1 and O5 ⊂ O2 the relation O4 ⊥g O5. Lemma 5.3 now shows that O4 ∪ O5 ∈ Bcc(M, g) holds. Due to additivity and Prop. 4.1 we have:

$$A_4 \vee A_5 = A(O_4 \cup O_5) \in L_{cc}. \tag{42}$$


This shows together with Prop. 4.1 that O3 ⊂ O4 ∪ O5 is satisfied for all O3 ∈ Bcc(M, g) with A(O3) = A3 ⊂ (A4 ∨ A5)^{⊥a⊥a}. But then A3 ⊥o A4 and A3 ⊥o A5 cannot hold simultaneously. This proves one implication. To show the other one consider A1, A2 ∈ Lcc with A1 ⊥c A2. Hence for two regions O4, O5 ∈ Bcc(M, g) with A(O4) ⊂ A1 and A(O5) ⊂ A2 the relation

$$A(O_3) \subset \big(A(O_4) \vee A(O_5)\big)^{\perp_a \perp_a} \tag{43}$$

implies by definition

$$\neg\,\big(A(O_3) \perp_o A(O_4) \ \text{and}\ A(O_3) \perp_o A(O_5)\big). \tag{44}$$

Prop. 4.1 and Eq. (43) imply O3 ⊂ (O1 ∪ O2)^{⊥g⊥g}. On the other hand Eq. (44) is equivalent to O3 ∩ O4 ≠ ∅ or O3 ∩ O5 ≠ ∅. If O4 ∩ O5 = ∅ holds, Lemma 5.2 shows that O4 ∪ O5 ∈ Bcc(M, g) is satisfied and therefore O4 ⊥g O5 due to Lemma 5.3. By definition A1 ⊥o A2 holds and hence O1 ∩ O2 = ∅. This implies that each pair of points p ∈ O1, q ∈ O2 admits neighbourhoods O4, O5 with O4 ∩ O5 = ∅ and O4 ⊂ O1 and O5 ⊂ O2. We have just seen that these neighbourhoods satisfy O4 ⊥g O5 and therefore O1 ⊥g O2, which is what we wanted to show.

6. Conclusions

The last result shows that we can express at least some statements about space-time completely in terms of posets of local algebras, without explicit reference to space-time realizations. The result of Thm. 5.4 can be used to define ⊥c on posets which do not even admit space-time realizations. From a physical point of view it makes sense to abandon the notions “point” and “event” since they are, at least in quantum field theory, overidealized. The replacements offered by the result of the last section are “space-time regions” represented by the elements of the poset LB. To improve this idea it is necessary to reformulate more statements and concepts related to space-time geometry in terms of posets of algebras.

One interesting question in this direction concerns the existence of “time-orderings”. On a time oriented Lorentzian manifold we can say whether an event p which is causally related to a second one q is in the causal past or future of q. If we have a poset LB of C*-algebras and a causally admissible realization A on a time oriented Lorentzian manifold (M, g) we can introduce a binary relation on Lcc by: A1 ≺ A2 :⇐⇒ there is a future pointing causal curve from a point in A⁻¹(A1) to a point in A⁻¹(A2). (According to Lemma 4.1 the map A is invertible on Bcc(M, g).) Note that ≺ is a pre-ordering but not an ordering, since it is reflexive and transitive but not antisymmetric.
However, it is antisymmetric if A1 ⊥o A2 holds (so it is in some sense nearly an ordering). This relation is linked to ⊥c defined in Thm. 5.4 by the condition: A1 ⊥c A2 ⇐⇒ ¬(A1 ≺ A2 or A2 ≺ A1). Due to Theorem 4.4, ≺ depends only on the time-ordering of (M, g), not on the realization A. Hence it is interesting to ask how such a relation can be introduced without the help of causally admissible realizations, and which conditions LB has to satisfy such that it exists. Another interesting point concerns the projective structure of space-time, i.e. the family of geodesics. However, there is no reason to assume that a poset LB is sufficient to determine it uniquely, since we have only shown that causally admissible space-time realizations are unique up to conformal equivalence. It is very likely that we need a


richer structure to describe the projective structure. One possible choice is to consider, in addition to the poset LB, a set S of physically reasonable states on the maximal algebra I. This is very natural, since for a complete description of a quantum field theory we need a set of preferred states in any case. However, even with this additional information it is very hard to get a definitive result, even for a very restricted class of models (e.g. free scalar fields). Hence we have to leave this question open here.

Another question concerns the existence of causally admissible space-time realizations and a procedure to construct them. Basically this point decomposes into two subproblems: the reconstruction of points and the topology, and the reconstruction of the causal relations. The latter is partially solved by the discussion of Sect. 5, and an answer to the question concerning time orientations will be very useful as well. For the first subproblem we can find some considerations in the work of Bannier [Ban94], which we have already discussed in Sect. 3. Other possibilities may arise from lattice theory, since we are basically searching for an embedding of L into a Boolean algebra.

The last point we want to mention here concerns possibilities for further generalizations. We may ask whether it is possible to derive even the poset LB from something more general. The most far-reaching ansatz in this direction is the search for an algorithm which constructs a poset LB from a C*-algebra I and a set S of physically preferred states. This seems to be very difficult; however, there are intermediate questions which are easier to solve. One possibility is to start with the semi-lattice L and a set of states S on the maximal algebra. Since the elements of L belong to open sets, the search for a poset LB which generates L corresponds to the characterization of bounded regions by algebraic methods.
The local quasiequivalence of Hadamard states of the free scalar field on a globally hyperbolic space-time (see [Ver94]) motivates the following idea:

$$L_B := \{A \in L \mid \pi_{\omega_1}|_A \text{ and } \pi_{\omega_2}|_A \text{ are quasiequivalent } \forall \omega_1, \omega_2 \in S\},$$

in other words, LB is the biggest set of elements of L such that all elements of S are locally quasiequivalent.

Acknowledgement. I would like to thank K.-E. Hellwig, F. Lledó, M. Trucks and M. Wollenberg for useful discussions about the subject of this paper. Special thanks are also due to C. Piron and D. Moore for their kind hospitality during a stay at Geneva and for many useful hints and suggestions concerning the relations of my research to lattice theory.

References

[Ban88] Bannier, U.: On generally covariant quantum field theory and generalized causal and dynamical structures. Commun. Math. Phys. 118, 163–170 (1988)
[Ban94] Bannier, U.: Intrinsic algebraic characterization of space-time structure. Int. J. Theor. Phys. 33, 1797–1809 (1994)
[BW92] Baumgärtel, H. and Wollenberg, M.: Causal nets of operator algebras. Akademie Verlag, Berlin, 1992
[Dim80] Dimock, J.: Algebras of local observables on a manifold. Commun. Math. Phys. 77, 219–228 (1980)
[Key93] Keyl, M.: Remarks on the relation between causality and quantum fields. Class. Quant. Grav. 10, 2353–2362 (1993)
[Key94] Keyl, M.: Über die Struktur klassischer Felder und ihre Beziehung zu lokalen Quantenfeldtheorien. Dissertation, TU-Berlin, 1994
[Key96] Keyl, M.: Causal spaces, causal complements and their relations to quantum field theory. Rev. Math. Phys. 8, 229–270 (1996)
[Ver94] Verch, R.: Local definiteness, primarity, and quasiequivalence of quasifree Hadamard quantum states in curved spacetime. Commun. Math. Phys. 160, 507–536 (1994)
[Wol94] Wollenberg, M.: On the relation between conformal structures in space-time and nets of local algebras of observables. Lett. Math. Phys. 31, 195–203 (1994)

Communicated by G. Felder

Commun. Math. Phys. 195, 29 – 65 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Missing Modules, the Gnome Lie Algebra, and E10

O. Bärwald¹,*, R. W. Gebert²,**, M. Günaydin³,***, H. Nicolai⁴,†

¹ Department of Mathematics, King’s College London, Strand, London WC2R 2LS, Great Britain
² Institute for Advanced Study, School of Natural Sciences, Olden Lane, Princeton, NJ 08540, USA
³ Physics Department, Pennsylvania State University, 104 Davey Lab, University Park, PA 16802, USA
⁴ Max-Planck-Institut für Gravitationsphysik, Albert-Einstein-Institut, Schlaatzweg 1, D-14473 Potsdam, Germany. E-mail: [email protected]

Received: 17 March 1997 / Accepted: 1 April 1997

Abstract: We study the embedding of Kac–Moody algebras into Borcherds (or generalized Kac–Moody) algebras which can be explicitly realized as Lie algebras of physical states of some completely compactified bosonic string. The extra “missing states” can be decomposed into irreducible highest or lowest weight “missing modules” w.r.t. the relevant Kac–Moody subalgebra; the corresponding lowest weights are associated with imaginary simple roots whose multiplicities can be simply understood in terms of certain polarization states of the associated string. We analyse in detail two examples where the momentum lattice of the string is given by the unique even unimodular Lorentzian lattice $II_{1,1}$ or $II_{9,1}$, respectively. The former leads to the Borcherds algebra $\mathfrak{g}_{II_{1,1}}$, which we call “gnome Lie algebra”, with maximal Kac–Moody subalgebra $A_1$. By the use of the denominator formula a complete set of imaginary simple roots can be exhibited, while the DDF construction provides an explicit Lie algebra basis in terms of purely longitudinal states of the compactified string in two dimensions. The second example is the Borcherds algebra $\mathfrak{g}_{II_{9,1}}$, whose maximal Kac–Moody subalgebra is the hyperbolic algebra $E_{10}$. The imaginary simple roots at level 1, which give rise to irreducible lowest weight modules for $E_{10}$, can be completely characterized; furthermore, our explicit analysis of two non-trivial level-2 root spaces leads us to conjecture that these are in fact the only imaginary simple roots for $\mathfrak{g}_{II_{9,1}}$.

Contents
1 Introduction
2 The Lie Algebra of Physical States
  2.1 The completely compactified bosonic string
  2.2 The DDF construction
  2.3 Borcherds algebras and Kac–Moody algebras
  2.4 Missing modules
3 The Gnome Lie Algebra
  3.1 The lattice $II_{1,1}$
  3.2 Basic structure of the gnome Lie algebra
  3.3 DDF states and examples
  3.4 Direct sums of lattices
4 Missing Modules for $E_{10}$
  4.1 Basics of $E_{10}$
  4.2 Lowest and highest weight modules of $E_{10}$
  4.3 Examples: $\Lambda_7$ and $\Lambda_1$

* Supported by Gottlieb Daimler- und Karl Benz-Stiftung under Contract No. 02-22/96
** Supported by Deutsche Forschungsgemeinschaft under Contract No. DFG Ge 963/1-1
*** Work supported in part by the National Science Foundation under Grant Number PHY-9631332
† Work supported in part by NATO Collaborative Research Grant CRG.960188
Correspondence to: H. Nicolai

1. Introduction

The main focus of this paper is the interplay between Borcherds algebras¹ and their maximal Kac–Moody subalgebras. The potential importance of these infinite dimensional Lie algebras for string unification is widely recognized, but it is far from clear at this time what their ultimate role will be in the scheme of things (see e.g. [20, 31] for recent overviews and motivation). In addition to their uncertain status with regard to physical applications, these algebras are very incompletely understood and present numerous challenges from the purely mathematical point of view. Because recent advances in string theory have greatly contributed to clarifying some of their mathematical intricacies we believe that the best strategy for making progress is to exploit string technology as far as it can take us. This is the path we will follow in this paper.

As is well known, Kac–Moody and Borcherds algebras can both be defined recursively in terms of a Cartan matrix A (with matrix entries $a_{ij}$) and a set of generating elements $\{e_i, f_i, h_i \mid i \in I\}$ called Chevalley–Serre generators, which are subject to certain relations involving $a_{ij}$ (see e.g. [24, 28]). For Kac–Moody algebras, the matrix A has to satisfy the properties listed on page one of [24]; the resulting Lie algebra is designated as $\mathfrak{g}(A)$.² For Borcherds algebras more general matrices A are possible [4]; in particular, imaginary (i.e., lightlike or timelike) simple roots are allowed, corresponding to zero or negative entries on the diagonal of the Cartan matrix, respectively. The root system of a Kac–Moody algebra is simple to describe, yet for any other but positive or positive semi-definite Cartan matrices (corresponding to finite and affine Lie algebras, resp.), the structure of the algebra itself is exceedingly complicated and not completely known even for a single example.
By contrast, Borcherds algebras can sometimes be explicitly realized as Lie algebras of physical states of some compactified bosonic string. Famous examples are the fake monster Lie algebra $\mathfrak{g}_{II_{25,1}}$ and the (true) monster Lie algebra $\mathfrak{g}^\natural$, arising as the Lie algebra of transversal states of a bosonic string in 26 dimensions fully compactified on a torus or a $\mathbb{Z}_2$-orbifold thereof, respectively [5, 6]. Recently, such algebras were also discovered in vertex operator algebras associated with the compactified heterotic string [22]; likewise, the Borcherds superalgebras constructed in [21] may admit such explicit realizations. However, the root systems are now much more difficult to characterize, because one is confronted with a (generically) infinite tower of imaginary simple roots; in fact, the full system of simple roots is known only in some special cases.

¹ In the literature, these algebras are also referred to as “generalized Kac–Moody algebras.”
² We use the labeling $i, j \in \{1, \ldots, d\}$ for $A > 0$ and $i, j \in \{-1, 0, 1, \ldots, d-2\}$ for Lorentzian A (which have Lorentzian signature), where $d = \mathrm{rank}(A)$. The affine case of positive semi-definite A, which has a slightly different labeling, will not concern us here.

In this paper we exploit the complementarity of these difficulties. As shown some time ago, both Lorentzian Kac–Moody algebras and Borcherds algebras can be conveniently and explicitly represented in terms of a DDF construction [11, 8] adapted to the root lattice in question [17]. More precisely, any Lorentzian algebra $\mathfrak{g}(A)$ can be embedded into a possibly larger, but in some sense simpler Borcherds algebra of physical states $\mathfrak{g}_\Lambda$ associated with the root lattice $\Lambda$ of $\mathfrak{g}(A)$. The DDF construction then provides a complete basis for $\mathfrak{g}_\Lambda$ and thereby also for $\mathfrak{g}(A)$, although the actual determination of the latter is very difficult. A distinctive feature of Lorentzian Kac–Moody algebras of “subcritical” rank (i.e., d < 26) is the occurrence of longitudinal states besides the transversal ones. This result applies in particular to the maximally extended hyperbolic algebra $E_{10}$, which can be embedded into $\mathfrak{g}_{II_{9,1}}$, the Lie algebra of physical states of a subcritical bosonic string fully compactified on the unique 10-dimensional even unimodular Lorentzian lattice $II_{9,1}$. The problem of understanding $E_{10}$ can thus be reduced to the problem of characterizing the “missing states” (alias “decoupled states”), i.e. those physical states in $\mathfrak{g}_{II_{9,1}}$ not belonging to $E_{10}$. The problem of counting these states, in turn, is equivalent to the one of identifying all the imaginary simple roots of $\mathfrak{g}_{II_{9,1}}$ with their multiplicities. In general terms, our proposal is therefore to study the embedding $\mathfrak{g}(A) \subset \mathfrak{g}_\Lambda$, and to group the missing states $\mathcal{M} \equiv \mathfrak{g}_\Lambda \ominus \mathfrak{g}(A)$ into an infinite direct sum of “missing modules”, that is, irreducible highest or lowest weight representations of the subalgebra $\mathfrak{g}(A)$.
This idea of decomposing a Borcherds algebra with respect to its maximal Kac–Moody subalgebra was already used by Kang [26] for deriving formulas for the root multiplicities of Borcherds algebras and was treated in the axiomatic setup in great detail by Jurisich [23]. We present here an alternative approach exploiting special features of the string model. After exposing the general structure of the embedding, we will work out two examples in great detail. The first is $\mathfrak{g}_{II_{1,1}}$, the Lie algebra of physical states of a bosonic string compactified on $II_{1,1}$; because of its kinship with the monster Lie algebra $\mathfrak{g}^\natural$, which has the same root lattice, we will refer to it as the “gnome Lie algebra”. Its maximal Kac–Moody subalgebra $\mathfrak{g}(A) \subset \mathfrak{g}_{II_{1,1}}$ is just the finite Lie algebra $A_1 \equiv sl_2$. The other example which we will investigate is $\mathfrak{g}_{II_{9,1}}$ with the maximal Kac–Moody subalgebra $E_{10} \subset \mathfrak{g}_{II_{9,1}}$. Very little is known about this hyperbolic Lie algebra, and even less is known about its representation theory (see, however, [13] for some recent results on the representations of hyperbolic Kac–Moody algebras). Our main point is that by combining the ill-understood Lie algebra with its representations into the Lie algebra $\mathfrak{g}_{II_{9,1}}$, we arrive at a structure which can be handled much more easily.

The gnome Lie algebra has not yet appeared in the literature, although it is possibly the simplest non-trivial example of a Borcherds algebra for which one has not only a satisfactory understanding of the imaginary simple roots, but also a completely explicit realization of the algebra itself in terms of physical string states. (Readers should keep in mind that so far most investigations of such algebras are limited to counting dimensions of root spaces and studying the modular properties of the associated partition functions.) It is almost “purely Borcherds” since it has only two real roots (and hence


only one real simple root), but infinitely many imaginary (in fact, timelike) simple roots. From the generalized denominator formula we shall derive a generating function for their multiplicities. Even better, the root spaces, and not just their dimensions, can be analyzed in a completely explicit manner using the DDF construction. If the fake monster Lie algebra is extremal in the sense that it contains only transversal, but no longitudinal states, the gnome Lie algebra $\mathfrak{g}_{II_{1,1}}$ is at the extreme opposite end of the classification in that it has only longitudinal but no transversal states. This is of course in accordance with expectations for a d = 2 subcritical string.

For the Borcherds algebra $\mathfrak{g}_{II_{9,1}}$ the analysis is not so straightforward. It has to be performed level by level, where “level” refers to the $\mathbb{Z}$-grading of the Lie algebra induced by the eigenvalue of the central element of the affine subalgebra $E_9$ (which makes up the level-0 piece). At level 1, we exhibit a complete set of missing lowest weight vectors for the hyperbolic Lie algebra $E_{10}$ obtainable from the tachyonic groundstate $|r_{-1}\rangle$ (associated with the overextended real simple root $r_{-1}$) by repeated application of the longitudinal DDF operators. To the best of our knowledge, the corresponding $E_{10}$-modules provide the first examples for explicit realizations of unitary irreducible highest weight representations of a hyperbolic Kac–Moody algebra. We also examine the non-trivial root spaces associated with the two level-2 roots (or fundamental weights w.r.t. the affine subalgebra) $\Lambda_7$ and $\Lambda_1$, which were recently worked out explicitly in [17, 1] and which exemplify the rapidly increasing complications at higher level. An important result of this paper is the explicit demonstration that the missing states for $\Lambda_7$ and $\Lambda_1$ can be completely reproduced by commuting missing level-1 states either with themselves or with other level-1 $E_{10}$ elements.
This calculation not only furnishes a nontrivial check on our previous results, which were obtained in a rather different manner; even more importantly, it shows that the simple multiplicity (i.e., the multiplicity as a simple root) of both $\Lambda_7$ and $\Lambda_1$ is zero. In view of this surprising conclusion and the fact that $E_{10}$ is a “huge” subalgebra of $\mathfrak{g}_{II_{9,1}}$, we conjecture that all missing states of $E_{10}$ should be obtainable in this way. In other words, the “easy” imaginary simple roots of $\mathfrak{g}_{II_{9,1}}$ at level 1 would in fact be the only ones. In spite of the formidable difficulties of verifying (or falsifying) this conjecture at arbitrary levels, we believe that its elucidation would take us a long way towards understanding $E_{10}$ and what is so special about it.

2. The Lie Algebra of Physical States

We shall study one chiral sector of a closed bosonic string moving on a Minkowski torus as spacetime, i.e., with all target space coordinates compactified. Uniqueness of the quantum mechanical wave function then forces the center of mass momenta of the string to form a lattice $\Lambda$ with Minkowskian signature. Upon “old” covariant quantization this system turns out to realize a mathematical structure called vertex algebra [3]. In these models the physical string states form an infinite-dimensional Lie algebra $\mathfrak{g}_\Lambda$ which has the structure of a so-called Borcherds algebra. It is possible to identify a maximal Kac–Moody subalgebra $\mathfrak{g}(A)$ inside $\mathfrak{g}_\Lambda$ which is generically of Lorentzian indefinite type. The physical states not belonging to $\mathfrak{g}(A)$ are called missing states and can be grouped into irreducible highest or lowest weight representations of $\mathfrak{g}(A)$. In principle, the DDF construction allows us to identify the corresponding vacuum states.

2.1. The completely compactified bosonic string. For a detailed account of this topic the reader may wish to consult the review [16]. Here, we will follow closely [17], omitting most of the technical details.


Let $\Lambda$ be an even Lorentzian lattice of rank $d < \infty$, representing the lattice of allowed center-of-mass momenta for the string. To each lattice point we assign a groundstate $|r\rangle$ which plays the role of a highest weight vector for a d-fold Heisenberg algebra $\hat{h}$ of string oscillators $\alpha^\mu_m$ ($m \in \mathbb{Z}$, $0 \le \mu \le d-1$),

$$\alpha^\mu_0 |r\rangle = r^\mu |r\rangle, \qquad \alpha^\mu_m |r\rangle = 0 \quad \forall m > 0,$$

where

$$[\alpha^\mu_m, \alpha^\nu_n] = m\,\eta^{\mu\nu}\,\delta_{m+n,0}.$$

The Fock space is obtained by collecting the irreducible $\hat{h}$-modules built on all possible groundstates, viz.

$$\mathcal{F} := \bigoplus_{r \in \Lambda} \mathcal{F}^{(r)},$$

where

$$\mathcal{F}^{(r)} := \mathrm{span}\{\alpha^{\mu_1}_{-m_1} \cdots \alpha^{\mu_M}_{-m_M} |r\rangle \mid 0 \le \mu_i \le d-1,\ m_i > 0\}.$$

To each state $\psi \in \mathcal{F}$, one assigns a vertex operator

$$\mathcal{V}(\psi, z) = \sum_{n \in \mathbb{Z}} \psi_n z^{-n-1},$$

which is an operator-valued ($\psi_n \in \mathrm{End}\,\mathcal{F}$ $\forall n$) formal Laurent series. For notational convenience we put $\xi(m) \equiv \xi \cdot \alpha_m$ for any $\xi \in \mathbb{R}^{d-1,1}$, and we introduce the current

$$\xi(z) := \sum_{m \in \mathbb{Z}} \xi(m) z^{-m-1}.$$

The vertex operator associated with a single oscillator is defined as

$$\mathcal{V}\big(\xi(-m)|0\rangle, z\big) := \frac{1}{(m-1)!} \left(\frac{d}{dz}\right)^{m-1} \xi(z), \tag{2.1}$$

whereas for a groundstate $|r\rangle$ one puts

$$\mathcal{V}\big(|r\rangle, z\big) := e^{\int r_-(z)\,dz}\, e^{i r \cdot q}\, z^{r \cdot p}\, e^{\int r_+(z)\,dz}\, c_r, \tag{2.2}$$

with $c_r$ denoting some cocycle factor, $r_\pm(z) := \sum_{m>0} r(\pm m) z^{\mp m-1}$, and $q^\mu$ being the position operators conjugate to the momentum operators $p^\mu \equiv \alpha^\mu_0$ ($[q^\mu, p^\nu] = i\eta^{\mu\nu}$). For a general homogeneous element $\psi = \xi^1(-m_1) \cdots \xi^M(-m_M)|r\rangle$, say, the associated vertex operator is then defined by the normal-ordered product

$$\mathcal{V}(\psi, z) := {:}\,\mathcal{V}\big(\xi^1(-m_1)|0\rangle, z\big) \cdots \mathcal{V}\big(\xi^M(-m_M)|0\rangle, z\big)\, \mathcal{V}\big(|r\rangle, z\big)\,{:}. \tag{2.3}$$

This definition can be extended by linearity to the whole of $\mathcal{F}$. The above data indeed fulfill all the requirements of a vertex algebra [3, 15]. The two preferred elements in $\mathcal{F}$, namely the vacuum and the conformal vector, are given here by $\mathbf{1} := |0\rangle$ and $\omega := \frac{1}{2} \alpha_{-1} \cdot \alpha_{-1} |0\rangle$, respectively. Note that the corresponding vertex operators are respectively given by the identity $\mathrm{id}_{\mathcal{F}}$ and the stress–energy tensor $\mathcal{V}(\omega, z) = \sum_{n \in \mathbb{Z}} L_n z^{-n-2}$, where the latter provides the generators $L_n$ of the constraint Virasoro algebra $\mathrm{Vir}_L$ (with central charge c = d), such that the grading of $\mathcal{F}$ is obtained by the eigenvalues of $L_0$ and the role of a translation generator is played by $L_{-1}$ satisfying


$$\mathcal{V}(L_{-1}\psi, z) = \frac{d}{dz} \mathcal{V}(\psi, z).$$

Finally, we mention that among the axioms of a vertex algebra there is a crucial identity relating products and iterates of vertex operators called the Cauchy–Jacobi identity. We denote by $P^h$ the space of (conformal) highest weight vectors or primary states of weight $h \in \mathbb{Z}$, satisfying

$$L_0 \psi = h\psi, \tag{2.4}$$
$$L_n \psi = 0 \quad \forall n > 0. \tag{2.5}$$

We shall call the vectors in $P^1$ physical states from now on. The vertex operators associated with physical states enjoy rather simple commutation relations with the generators of $\mathrm{Vir}_L$. In terms of the mode operators we have $[L_n, \psi_m] = -m\psi_{m+n}$ for $\psi \in P^1$. In particular, the zero modes $\psi_0$ of physical vertex operators commute with the Virasoro constraints and consequently map physical states into physical states. This observation leads to the following definition of a bilinear product on the space of physical states [3]:

$$[\psi, \varphi] := \psi_0 \varphi \equiv \mathrm{Res}_z\big[\mathcal{V}(\psi, z)\varphi\big], \tag{2.6}$$

using an obvious formal residue notation. The Cauchy–Jacobi identity for the vertex algebra immediately ensures that the Jacobi identity $[\xi, [\psi, \varphi]] + [\psi, [\varphi, \xi]] + [\varphi, [\xi, \psi]] = 0$ always holds (even on $\mathcal{F}$). But the antisymmetry property turns out to be satisfied only modulo $L_{-1}$ terms. Hence one is led to introduce the Lie algebra of observable physical states by

$$\mathfrak{g}_\Lambda := P^1 / L_{-1} P^0, \tag{2.7}$$

where “observable” refers to the fact that the subspace $L_{-1}P^0$ consists of (unobservable) null physical states, i.e., physical states orthogonal to all physical states including themselves (w.r.t. the usual string scalar product). Indeed, for $d \ne 26$, $L_{-1}P^0$ accounts for all null physical states.

2.2. The DDF construction. For a detailed analysis of $\mathfrak{g}_\Lambda$ one requires an explicit basis. First, one observes that the natural $\mathfrak{g}_\Lambda$-gradation by momentum already provides a root space decomposition for $\mathfrak{g}_\Lambda$, viz.

$$\mathfrak{g}_\Lambda = \mathfrak{h}_\Lambda \oplus \bigoplus_{r \in \Delta} \mathfrak{g}_\Lambda^{(r)},$$

where the root space $\mathfrak{g}_\Lambda^{(r)}$ consists of all observable physical states with momentum r:

$$\mathfrak{g}_\Lambda^{(r)} := \{\psi \in \mathfrak{g}_\Lambda \mid p^\mu \psi = r^\mu \psi\}.$$

The set of roots, $\Delta$, is determined by the requirement that the roots should represent physically allowed string momenta. Hence we have

$$\Delta \equiv \Delta_\Lambda := \{r \in \Lambda \mid r^2 \le 2,\ r \ne 0\} = \Delta^{re} \cup \Delta^{im},$$

where we have also split the set of roots into two subsets of real and imaginary roots which are respectively given by

$$\Delta^{re} := \{r \in \Delta \mid r^2 > 0\}, \qquad \Delta^{im} := \{r \in \Delta \mid r^2 \le 0\}.$$
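As a small illustration of the split into real and imaginary roots, the following sketch enumerates roots of the rank-2 lattice $II_{1,1}$ inside a finite box. The coordinates are an assumption made here for illustration: lattice vectors are written as integer pairs (l, n) with squared length $-2ln$, which is one standard realization of the even unimodular Lorentzian lattice of rank 2. The function names are hypothetical.

```python
def norm2(r):
    """Squared length r^2 = -2*l*n in the assumed coordinates r = (l, n)."""
    l, n = r
    return -2 * l * n

def classify(r):
    """Apply the root condition r^2 <= 2, r != 0 and the real/imaginary split."""
    if r == (0, 0) or norm2(r) > 2:
        return "not a root"
    return "real" if norm2(r) > 0 else "imaginary"

def roots_in_box(size):
    """All roots (l, n) with |l|, |n| <= size, grouped by type."""
    found = {"real": [], "imaginary": []}
    for l in range(-size, size + 1):
        for n in range(-size, size + 1):
            kind = classify((l, n))
            if kind != "not a root":
                found[kind].append((l, n))
    return found

box = roots_in_box(3)
print(sorted(box["real"]))     # the real roots found in the box
print(len(box["imaginary"]))   # number of imaginary roots in the box
```

Because the lattice is even, a real root must satisfy $r^2 = 2$, i.e. $ln = -1$ in these coordinates, so the search finds exactly the two real roots mentioned in the Introduction no matter how large the box is.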


Zero momentum is by definition not a root but is incorporated into the d-dimensional Cartan subalgebra

$$\mathfrak{h}_\Lambda := \{\xi(-1)|0\rangle \mid \xi \in \mathbb{R}^{d-1,1}\}.$$

Thus the task is to find a basis for each root space. This is achieved by the so-called DDF construction [11, 8] which we will sketch. Given a root $r \in \Delta$, it is always possible to find a DDF decomposition for it,

$$r = a - nk \quad \text{with } n := 1 - \tfrac{1}{2} r^2,$$

where $a, k \in \mathbb{R}^{d-1,1}$ satisfy $a^2 = 2$, $a \cdot k = 1$, and $k^2 = 0$. Having fixed a and k we choose a set of orthonormal polarization vectors $\xi^i \in \mathbb{R}^{d-1,1}$ ($1 \le i \le d-2$) obeying $\xi^i \cdot a = \xi^i \cdot k = 0$. Then the transversal and longitudinal DDF operators are respectively defined by

$$A^i_m = A^i_m(a, k) := \mathrm{Res}_z\Big[\mathcal{V}\big(\xi^i(-1)|mk\rangle, z\big)\Big], \tag{2.8}$$

$$A^-_m = A^-_m(a, k) := \mathrm{Res}_z\Big[-\mathcal{V}\big(a(-1)|mk\rangle, z\big) + \frac{m}{2} \frac{d}{dz} \log k(z)\, \mathcal{V}\big(|mk\rangle, z\big)\Big] - \frac{1}{2} \sum_{n \in \mathbb{Z}} {}^\times_\times A^i_n A^i_{m-n} {}^\times_\times + 2\delta_{m0}\, k \cdot p. \tag{2.9}$$
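The DDF decomposition of a root can be made concrete in the simplest case d = 2. The sketch below is an illustration under assumptions not taken from the text above: vectors are pairs (l, n) with inner product $u \cdot v = -(u_0 v_1 + u_1 v_0)$, and the particular choice of k (a multiple of one of the two lightlike directions) is just one convenient solution of the constraints. The function names are hypothetical, and exact rational arithmetic is used because a and k need not be lattice vectors.

```python
from fractions import Fraction

def dot(u, v):
    """Lorentzian inner product in the assumed light-cone coordinates."""
    return -(u[0] * v[1] + u[1] * v[0])

def ddf_decomposition(r):
    """Return (a, k, n) with r = a - n*k, a^2 = 2, a.k = 1, k^2 = 0."""
    l, m = r
    r2 = dot(r, r)
    if (l, m) == (0, 0) or r2 > 2:
        raise ValueError("r is not a root")
    n = 1 - Fraction(r2, 2)
    # Pick k along a lightlike axis, normalized so that r.k = 1
    # (then a.k = r.k + n*k^2 = 1 automatically).
    if l != 0:
        k = (Fraction(0), Fraction(-1, l))
    else:
        k = (Fraction(-1, m), Fraction(0))
    a = (r[0] + n * k[0], r[1] + n * k[1])
    return a, k, n

a, k, n = ddf_decomposition((2, 3))
print(a, k, n)
print(dot(a, a), dot(a, k), dot(k, k))
```

For d = 2 there are no transversal polarization vectors (d − 2 = 0), so only the longitudinal operators survive, in line with the remark in the Introduction that the gnome Lie algebra has only longitudinal states.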

We shall need to make use of the following important facts about the DDF operators (see e.g. [17]).

Theorem 1. Let $r \in \Delta$. The DDF operators associated with the DDF decomposition $r = a - nk$ enjoy the following properties on the space of physical string states with momentum r, $P^{1,(r)}$:

1. (Physicality) $[L_m, A^i_n] = [L_m, A^-_n] = 0$;
2. (Transversal Heisenberg algebra) $[A^i_m, A^j_n] = m\,\delta^{ij}\,\delta_{m+n,0}$;
3. (Longitudinal Virasoro algebra) $[A^-_m, A^-_n] = (m-n) A^-_{m+n} + \frac{26-d}{12}\, m(m^2-1)\,\delta_{m+n,0}$;
4. (Null states) $A^-_{-1}|a\rangle \propto L_{-1}|a-k\rangle$;
5. (Orthogonality) $[A^-_m, A^i_n] = 0$;
6. (Highest weight property) $A^i_k|a\rangle = A^-_k|a\rangle = 0$ for all $k \ge 0$;
7. (Spectrum generating) $P^{1,(r)} = \mathrm{span}\{A^{i_1}_{-m_1} \cdots A^{i_M}_{-m_M} A^-_{-n_1} \cdots A^-_{-n_N}|a\rangle \mid m_1 + \ldots + n_N = 1 - \tfrac{1}{2} r^2\}$;

for all $m, n \in \mathbb{Z}$ and $1 \le i \le d-2$.

As a simple consequence, we have the following explicit formula for the multiplicity of a root r in $\mathfrak{g}_\Lambda$:

$$\mathrm{mult}_{\mathfrak{g}_\Lambda}(r) \equiv \dim \mathfrak{g}_\Lambda^{(r)} = \pi_{d-1}(n) := p_{d-1}(n) - p_{d-1}(n-1), \tag{2.10}$$

where $n = 1 - \tfrac{1}{2} r^2$ and $\sum_{n \ge 0} p_d(n) q^n = [\phi(q)]^{-d} \equiv \prod_{n \ge 1} (1 - q^n)^{-d}$, so that

$$\sum_{n=0}^{\infty} \pi_{d-1}(n) q^n = \frac{1-q}{[\phi(q)]^{d-1}} = 1 + (d-2) q + \tfrac{1}{2}(d-1)d\, q^2 + \cdots. \tag{2.11}$$
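The multiplicity formula (2.10) and the expansion (2.11) are easy to check numerically. The following is a minimal sketch, with hypothetical function names, that expands the product $\prod_{n \ge 1}(1-q^n)^{-d}$ by repeated in-place multiplication with the geometric series $1/(1-q^n)$ and then forms the difference $p_{d-1}(n) - p_{d-1}(n-1)$.

```python
def colored_partitions(d, nmax):
    """Coefficients p_d(0), ..., p_d(nmax) of prod_{n>=1} (1 - q^n)^(-d)."""
    coeffs = [1] + [0] * nmax
    for part in range(1, nmax + 1):
        for _ in range(d):
            # Multiply the truncated series by 1/(1 - q^part).
            for total in range(part, nmax + 1):
                coeffs[total] += coeffs[total - part]
    return coeffs

def root_multiplicity(d, n):
    """pi_{d-1}(n) = p_{d-1}(n) - p_{d-1}(n-1), as in Eq. (2.10)."""
    p = colored_partitions(d - 1, n)
    return p[n] - (p[n - 1] if n >= 1 else 0)

# Compare with the first terms of Eq. (2.11) for a few ranks d.
for d in (2, 10, 25):
    values = [root_multiplicity(d, n) for n in range(3)]
    expected = [1, d - 2, (d - 1) * d // 2]
    print(d, values, values == expected)
```

For d = 2 the function returns differences of ordinary partition numbers, and for d = 10 it gives the dimensions of the root spaces of $\mathfrak{g}_{II_{9,1}}$ as a function of $n = 1 - \tfrac{1}{2} r^2$. The formula applies to the subcritical case; for d = 26 the text replaces $\pi_{d-1}$ by $p_{24}$, which `colored_partitions(24, nmax)` computes directly.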

The above theorem is also useful for constructing a positive definite symmetric bilinear form on $\mathfrak{g}_\Lambda$ as follows:

$$\langle r|s\rangle := \delta_{r,s} \quad \text{for } r, s \in \Lambda, \qquad (\alpha^\mu_m)^\dagger := \alpha^\mu_{-m}.$$

For the DDF operators this yields

$$(A^i_m)^\dagger = A^i_{-m}, \qquad (A^-_m)^\dagger = A^-_{-m}.$$

In view of the above commutation relations it is then clear that $\langle\ |\ \rangle$ is positive definite on any root space $\mathfrak{g}_\Lambda^{(r)}$ if d < 26. For the critical dimension, d = 26, we redefine $\mathfrak{g}_\Lambda$ by dividing out the additional null states which correspond to the remaining longitudinal DDF states. Thus we have to replace $\pi_{d-1}$ by $p_{24}$ in the multiplicity formula. Note that the scalar product has Minkowskian signature on the Cartan subalgebra. For our purposes we shall also need an invariant symmetric bilinear form on $\mathfrak{g}_\Lambda$ which is defined as $(\psi|\varphi) := -\langle\theta(\psi)|\varphi\rangle$ for $\psi, \varphi \in \mathfrak{g}_\Lambda$, where the Chevalley involution is given by

$$\theta(|r\rangle) := |-r\rangle, \qquad \theta \circ \alpha^\mu_m \circ \theta := -\alpha^\mu_m.$$

Clearly, both bilinear forms are preserved by this involution and they enjoy the invariance and contravariance properties, respectively, viz.

$$([\psi, \chi]|\varphi) = (\psi|[\chi, \varphi]), \qquad \langle[\psi, \chi]|\varphi\rangle = \langle\psi|[\theta(\chi), \varphi]\rangle \qquad \forall \psi, \chi, \varphi \in \mathfrak{g}_\Lambda. \tag{2.12}$$

2.3. Borcherds algebras and Kac–Moody algebras. We now have all ingredients at hand to show that $\mathfrak{g}_\Lambda$ for any d > 0 belongs to a certain class of infinite-dimensional Lie algebras.

Definition 1. Let J be a countable index set (identified with some subset of $\mathbb{Z}$). Let $B = (b_{ij})_{i,j \in J}$ be a real matrix, satisfying the following conditions:

(C1) B is symmetric;
(C2) If $i \ne j$ then $b_{ij} \le 0$;
(C3) If $b_{ii} > 0$ then $\frac{2b_{ij}}{b_{ii}} \in \mathbb{Z}$ for all $j \in J$.

Then the universal Borcherds algebra $\hat{\mathfrak{g}}(B)$ associated with B is defined as the Lie algebra generated by elements $e_i$, $f_i$ and $h_{ij}$ for $i, j \in J$, with the following relations:

(R1) $[h_{ij}, e_k] = \delta_{ij} b_{ik} e_k$, $[h_{ij}, f_k] = -\delta_{ij} b_{ik} f_k$;
(R2) $[e_i, f_j] = h_{ij}$;
(R3) If $b_{ii} > 0$ then $(\mathrm{ad}\, e_i)^{1-2b_{ij}/b_{ii}} e_j = (\mathrm{ad}\, f_i)^{1-2b_{ij}/b_{ii}} f_j = 0$;
(R4) If $b_{ij} = 0$ then $[e_i, e_j] = [f_i, f_j] = 0$.
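Conditions (C1) to (C3) on the matrix B are straightforward to test for a finite index set. The sketch below is an illustration only: it assumes a finite square matrix with integer or rational entries (so that the integrality test in (C3) is exact), and the function name and the sample matrices are hypothetical.

```python
from fractions import Fraction

def is_borcherds_matrix(B):
    """Check conditions (C1)-(C3) for a finite square matrix B
    given as a list of rows with integer or Fraction entries."""
    size = len(B)
    for i in range(size):
        for j in range(size):
            if B[i][j] != B[j][i]:
                return False  # (C1): B must be symmetric
            if i != j and B[i][j] > 0:
                return False  # (C2): off-diagonal entries are <= 0
            if B[i][i] > 0:
                ratio = 2 * Fraction(B[i][j]) / B[i][i]
                if ratio.denominator != 1:
                    return False  # (C3): 2*b_ij/b_ii must be an integer
    return True

print(is_borcherds_matrix([[2]]))                # Cartan matrix of A_1
print(is_borcherds_matrix([[2, -1], [-1, -2]]))  # one real, one imaginary simple root
print(is_borcherds_matrix([[4, -1], [-1, 2]]))   # violates (C3): 2*(-1)/4 = -1/2
```

Note that (C3) constrains only rows with positive diagonal entry, so rows with $b_{ii} \le 0$, which correspond to imaginary simple roots, may have arbitrary non-positive off-diagonal entries.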

The elements $h_{ij}$ span an abelian subalgebra of $\mathfrak{g}(B)$ called the Cartan subalgebra. In fact, the elements $h_{ij}$ with $i \neq j$ lie in the center of $\mathfrak{g}(B)$. It is easy to see that $h_{ij}$ is zero unless the $i$th and $j$th columns of the matrix $B$ are equal.³ A Lie algebra is called a Borcherds algebra if it can be obtained from a universal Borcherds algebra by dividing out a subspace of its center and adding an abelian algebra of outer derivations.

³ Actually, the elements $h_{ij}$ for $i \neq j$ do not play any role and in fact cannot appear in the present context, where $\mathfrak{g}(B) = \mathfrak{g}_\Lambda$ is based on a non-degenerate Lorentzian lattice $\Lambda$. Namely, for the $i$th and $j$th columns of $B$ to be equal the corresponding roots must be equal, and therefore such $h_{ij}$ are always of the form $v_{ij}(-1)|0\rangle$ with $v_{ij} \in \Lambda$. Since furthermore the $h_{ij}$ with $i \neq j$ are central elements, the lattice vectors $v_{ij}$ must be orthogonal to all (real and imaginary) roots. Because $\Lambda$ is non-degenerate, we conclude that $v_{ij} = 0$, and hence $h_{ij} = 0$ for $i \neq j$.

An important property of (universal) Borcherds algebras is the existence of a triangular decomposition

$$\mathfrak{g} = \mathfrak{n}_- \oplus \mathfrak{h} \oplus \mathfrak{n}_+, \tag{2.13}$$

where $\mathfrak{n}_+$ and $\mathfrak{n}_-$ denote the subalgebras generated by the $e_i$'s and the $f_i$'s, respectively. This can be established by the usual methods for Kac–Moody algebras (see [23] for a careful proof). Given the Lie algebra of physical string states, $\mathfrak{g}_\Lambda$, it is extremely difficult to decide whether it is a Borcherds algebra in the sense of the above definition. Luckily, however, there are alternative characterizations of Borcherds algebras which can be readily applied to the case of $\mathfrak{g}_\Lambda$. We start with the following one [4].

Theorem 2. A Lie algebra $\mathfrak{g}$ is a Borcherds algebra if it has an almost positive definite contravariant form $\langle\,|\,\rangle$, which means that $\mathfrak{g}$ has the following properties:
1. (Grading) $\mathfrak{g} = \bigoplus_{n\in\mathbb{Z}} \mathfrak{g}_n$ with $\dim \mathfrak{g}_n < \infty$ for $n \neq 0$;
2. (Involution) $\mathfrak{g}$ has an involution $\theta$ which acts as $-1$ on $\mathfrak{g}_0$ and maps $\mathfrak{g}_n$ to $\mathfrak{g}_{-n}$;
3. (Invariance) $\mathfrak{g}$ carries a symmetric invariant bilinear form $(\,|\,)$ preserved by $\theta$ and such that $(\mathfrak{g}_m | \mathfrak{g}_n) = 0$ unless $m + n = 0$;
4. (Positivity) The contravariant form $\langle x|y\rangle := -(\theta(x)|y)$ is positive definite on $\mathfrak{g}_n$ if $n \neq 0$.

The converse is almost true, which means that, apart from some pathological cases, a Borcherds algebra always satisfies the conditions in the above theorem (cf. [23]). Hence $\mathfrak{g}_\Lambda$ for $d \le 26$ is a Borcherds algebra if we can equip it with an appropriate $\mathbb{Z}$-grading. Note that the grading given by assigning degree $1 - \tfrac12 r^2$ to a root space $\mathfrak{g}_\Lambda^{(r)}$ will not work since there are infinitely many lattice points lying on the hyperboloid $x^2 = \mathrm{const} \in 2\mathbb{Z}$. The solution is to slice the forward (resp. backward) light cone by a family of $(d-1)$-dimensional parallel hyperplanes whose common normal vector is timelike and has integer scalar product with all the roots of $\mathfrak{g}_\Lambda$ (i.e., it is an element of the weight lattice $\Lambda^*$). There is one subtlety here, however.
It might well happen that for a certain choice of the timelike normal vector $t \in \Lambda^*$ there are some real roots $r \in \Lambda$ which are orthogonal to $t$ so that the associated root spaces would have degree zero.⁴ But then we would run into trouble since the Chevalley involution does not act as $-1$ on a root space $\mathfrak{g}_\Lambda^{(r)}$ but rather maps it into $\mathfrak{g}_\Lambda^{(-r)}$. We call a timelike vector $t \in \Lambda^*$ a grading vector if it is “in general position”, which means that it has nonzero scalar product with all roots. So let us fix some grading vector⁵ and define

$$\mathfrak{g}_n := \bigoplus_{\substack{r\in\Delta \\ r\cdot t = n}} \mathfrak{g}_\Lambda^{(r)}, \qquad \mathfrak{g}_0 := \mathfrak{h}_\Lambda.$$

(The associated degree operator is just $t\cdot p$.) Then this yields the grading necessary for $\mathfrak{g}_\Lambda$ to be a Borcherds algebra. Note that the pairing property $(\mathfrak{g}_m | \mathfrak{g}_n) \propto \delta_{m+n,0}$ is fulfilled since $\theta$ is induced from the reflection symmetry of the lattice. Observe also that if the lattice admits a (timelike) Weyl vector $\rho$ we can set $t = \rho$ since this vector has all the requisite properties. We conclude: if the lattice $\Lambda$ has a grading vector and $d \le 26$, then the Lie algebra of physical states, $\mathfrak{g}_\Lambda$, is a Borcherds algebra. This result suggests that above the critical dimension the Lie algebra of physical string states somehow changes in type, as one would also naively expect from a string theoretical point of view. But this impression is wrong. It is an artefact caused by the special choice of the string scalar product. To see this, we recall another characterization of Borcherds algebras [7].

⁴ By choosing $t$ to be timelike it is also assured that it has nonzero scalar product with all imaginary roots.
⁵ Grading vectors always exist since the hyperplanes orthogonal to the real roots cannot exhaust all the points of $\Lambda^*$ inside the lightcone.

Theorem 3. A Lie algebra $\mathfrak{g}$ satisfying the following conditions is a Borcherds algebra:
(B1) $\mathfrak{g}$ has a nonsingular invariant symmetric bilinear form $(\,|\,)$;
(B2) $\mathfrak{g}$ has a self-centralizing subalgebra $\mathfrak{h}$ such that $\mathfrak{g}$ is diagonalizable with respect to $\mathfrak{h}$ and all the eigenspaces are finite-dimensional;
(B3) $\mathfrak{h}$ has a regular element $h^\times$, i.e., the centralizer of $h^\times$ is $\mathfrak{h}$ and there are only a finite number of roots $r \in \mathfrak{h}^*$ such that $|r(h^\times)| < R$ for any $R \in \mathbb{R}$;
(B4) The norms of roots of $\mathfrak{g}$ are bounded above;
(B5) Any two imaginary roots which are both positive or negative have inner product at most 0, and if they are orthogonal their root spaces commute.

Here, as usual, the nonzero eigenvalues of $\mathfrak{h}$ acting on $\mathfrak{g}$ are elements of the dual $\mathfrak{h}^*$ and are called roots of $\mathfrak{g}$. A root is called positive or negative depending on whether its value on the regular element is positive or negative, respectively; and a root is called real if its norm (naturally induced from $(\,|\,)$ on $\mathfrak{g}$) is positive, and imaginary otherwise. Note that the regular element provides a triangular decomposition (2.13) by gathering all root spaces associated with positive (resp. negative) roots into the subalgebra $\mathfrak{n}_+$ (resp. $\mathfrak{n}_-$). For our purposes we shall need a special case of this theorem. Suppose that the bilinear form has Lorentzian signature on $\mathfrak{h}$ (and consequently also on $\mathfrak{h}^*$). For the regular element $h^\times$ we can take any $t(-1)|0\rangle$ associated with a timelike vector $t$ in general position (cf. the above remark about grading vectors).
But the Lorentzian geometry implies more; namely, that two vectors inside or on the forward (or backward) lightcone have to have nonpositive inner product with each other, and they can be orthogonal only if they are multiples of the same lightlike vector. Therefore we have [7]

Corollary 1. A Lie algebra $\mathfrak{g}$ satisfying the following conditions is a Borcherds algebra:
(B1’) $\mathfrak{g}$ has a nonsingular invariant symmetric bilinear form $(\,|\,)$;
(B2’) $\mathfrak{g}$ has a self-centralizing subalgebra $\mathfrak{h}$ such that $\mathfrak{g}$ is diagonalizable with respect to $\mathfrak{h}$ and all the eigenspaces are finite-dimensional;
(B3’) The bilinear form restricted to $\mathfrak{h}$ is Lorentzian;
(B4’) The norms of roots of $\mathfrak{g}$ are bounded above;
(B5’) If two roots are positive multiples of the same norm 0 vector then their root spaces commute.

Apparently, $\mathfrak{g}_\Lambda$ for any $d$ fulfills the conditions (B1’)–(B4’). A straightforward exercise in oscillator algebra also verifies (B5’) (see formula (3.1) in [17]). We conclude that $\mathfrak{g}_\Lambda$ is indeed always a Borcherds algebra. Although we do not know the Cartan matrix $B$ associated to $\mathfrak{g}_\Lambda$ (and so the set of simple roots) we can determine the maximal Kac–Moody subalgebra of $\mathfrak{g}_\Lambda$ given by the submatrix $A$ obtained from $B$ by deleting all rows and columns $j \in J$ such that $b_{jj} \le 0$. A special role is played by the lattice vectors of length 2 which are called the real roots of the lattice and which give rise to tachyonic physical string states. Lightlike or timelike roots are referred to as imaginary roots. We associate with every real root $r \in \Lambda$ a reflection by $w_r(x) := x - (x\cdot r)\,r$ for $x \in \mathbb{R}^{d-1,1}$. The reflecting hyperplanes then


divide the vector space $\mathbb{R}^{d-1,1}$ into regions called Weyl chambers. The reflections in the real roots of $\Lambda$ generate a group called the Weyl group $W$ of $\Lambda$, which acts simply transitively on the Weyl chambers. Fixing one chamber to be the fundamental Weyl chamber $\mathcal{C}$ once and for all, we call the real roots perpendicular to the faces of $\mathcal{C}$ and with inner product at most 0 with elements of $\mathcal{C}$, the simple roots. We denote such a set of real simple roots by $\Pi^{\mathrm{re}} = \Pi^{\mathrm{re}}(\mathcal{C}) = \{r_i \mid i \in I\}$ for a countable index set $I$.⁶ Note that a priori there is no relation between the rank $d$ of the lattice and the number of simple roots, $|I|$.⁷ The main new feature of Borcherds algebras in comparison with ordinary Kac–Moody algebras is the appearance of imaginary simple roots. An important property of Borcherds algebras is the existence of a character formula which generalizes the Weyl–Kac character formula for ordinary Kac–Moody algebras and which leads as a special case to the following Weyl–Kac–Borcherds denominator formula.

Theorem 4. Let $\mathfrak{g}$ be a Borcherds algebra with Weyl vector $\rho$ (i.e., $\rho\cdot r = -\tfrac12 r^2$ for all simple roots) and Weyl group $W$ (generated by the reflections in the real simple roots). Then

$$\prod_{r\in\Delta_+} (1 - e^r)^{\mathrm{mult}(r)} = \sum_{w\in W} (-1)^w e^{w(\rho)-\rho} \sum_s \epsilon(s)\, e^{w(s)}, \tag{2.14}$$

where $\epsilon(s)$ is $(-1)^n$ if $s$ is a sum of $n$ distinct pairwise orthogonal imaginary simple roots and zero otherwise.

Note that the Weyl vector may be replaced by any other vector having inner product $-\tfrac12 r^2$ with all real simple roots since $e^{w(\rho)-\rho}$ involves only inner products of $\rho$ with real simple roots. This will be important for the gnome Lie algebra below where there is no true Weyl vector but the denominator formula nevertheless can be used to determine the multiplicities of the imaginary simple roots. The physical states

$$e_i := |r_i\rangle, \qquad f_i := -|-r_i\rangle, \qquad h_i := r_i(-1)|0\rangle, \tag{2.15}$$

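for $i \in I$, obey the following commutation relations (see [3]):

$$\begin{aligned}
[h_i, h_j] &= 0, & [e_i, f_j] &= \delta_{ij} h_i, \\
[h_i, e_j] &= a_{ij} e_j, & [h_i, f_j] &= -a_{ij} f_j, \\
(\mathrm{ad}\,e_i)^{1-a_{ij}} e_j &= 0, & (\mathrm{ad}\,f_i)^{1-a_{ij}} f_j &= 0 \quad \forall\, i \neq j,
\end{aligned} \tag{2.16}$$

which means that they generate via multiple commutators the Kac–Moody algebra $\mathfrak{g}(A)$ associated with the Cartan matrix $A = (a_{ij})_{i,j\in I}$, $a_{ij} := r_i\cdot r_j$. As usual, we have the triangular decomposition

$$\mathfrak{g}(A) = \mathfrak{n}_-(A) \oplus \mathfrak{h}(A) \oplus \mathfrak{n}_+(A), \tag{2.17}$$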

⁶ $I$ may be identified with a subset of $J$. Note, however, that apart from some special examples, the matrix $B$ for $\mathfrak{g}_\Lambda$ as a Borcherds algebra is not known.
⁷ The extremal case occurs for the lattice $II_{25,1}$ where $d = 26$ but $|I| = \infty$ [9]. We should mention here that in order to get the set of imaginary roots “well-behaved”, one assumes that the semidirect product of the Weyl group with the group of graph automorphisms associated with the Coxeter–Dynkin diagram of $\Pi^{\mathrm{re}}$ has finite index in the automorphism group of the lattice $\Lambda$ (see e.g. [29]).

where $\mathfrak{n}_+(A)$ (resp. $\mathfrak{n}_-(A)$) denotes the subalgebra generated by the $e_i$'s (resp. $f_i$'s) for $i \in I$. This corresponds to a choice of the grading vector $t$ (and the regular element $h^\times := t(-1)|0\rangle$) satisfying $t\cdot r_i > 0$ $\forall\, i \in I$. The Lie algebra $\mathfrak{g}(A)$ is a proper subalgebra of the Lie algebra of physical states $\mathfrak{g}_\Lambda$, $\mathfrak{g}(A) \subset \mathfrak{g}_\Lambda$. If we finally introduce the Kac–Moody root lattice

$$Q(A) := \sum_{i\in I} \mathbb{Z}\, r_i,$$

then obviously $Q(A) \subseteq \Lambda$ and in particular $\mathrm{rank}\, Q(A) \le d$, even though $|I|$ might be larger than $d$.

2.4. Missing modules. Having found the Kac–Moody algebra $\mathfrak{g}(A)$, the idea is now to analyze the “rest” of $\mathfrak{g}_\Lambda$ from the point of view of $\mathfrak{g}(A)$. It is clear that, via the adjoint action, $\mathfrak{g}_\Lambda$ is a representation of $\mathfrak{g}(A)$. Since the contravariant bilinear form is positive definite on the root spaces $\mathfrak{g}_\Lambda^{(r)}$, $r \in \Delta$, it is sensible to consider the direct sum of orthogonal complements of $\mathfrak{g}(A) \cap \mathfrak{g}_\Lambda^{(r)}$ in $\mathfrak{g}_\Lambda^{(r)}$ with respect to $\langle\,|\,\rangle$ and explore its properties under the action of $\mathfrak{g}(A)$. We shall see that the resulting space of so-called missing states is a completely reducible $\mathfrak{g}(A)$-module, decomposable into irreducible highest or lowest weight representations. The issue of zero momentum, however, requires some care. If $Q(A) \neq \Lambda$, then there must be a set of $d - \mathrm{rank}\, Q(A)$ linearly independent imaginary simple roots, $\{r_j \mid j \in H \subset J \setminus I\}$, linearly independent of the set of real simple roots, such that $\mathfrak{h}_\Lambda = \mathfrak{h}(A) \oplus \mathfrak{h}'$ with $\mathfrak{h}' := \mathrm{span}\{h_j \mid j \in H\}$. The latter subspace of the Cartan subalgebra is in general not a $\mathfrak{g}(A)$-module but rather an abelian algebra of outer derivations for $\mathfrak{g}(A)$ in view of the commutation relations (R1). This observation suggests considering an extension of $\mathfrak{g}(A)$ by these derivations. There is also another argument that this is a natural thing to do. Namely, extending $\mathfrak{h}(A)$ to $\mathfrak{h}_\Lambda$ ensures that any root $r$ is a nonzero weight for the extended Lie algebra, while this is not guaranteed for $\mathfrak{g}(A)$ because there might exist roots in $\Delta$ orthogonal to all real simple roots. This procedure is in spirit the same as in the general theory of affine Lie algebras, where one extends the algebra by adjoining outer derivations to the Cartan subalgebra such that the standard invariant form becomes nondegenerate.

Definition 2. The Lie algebra $\hat{\mathfrak{g}}(A) := \mathfrak{g}(A) + \mathfrak{h}_\Lambda = \mathfrak{n}_-(A) \oplus \mathfrak{h}_\Lambda \oplus \mathfrak{n}_+(A)$ is called the extended Kac–Moody algebra associated with $\Lambda$.
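The orthogonal complement of $\hat{\mathfrak{g}}(A)$ in $\mathfrak{g}_\Lambda$ with respect to the contravariant bilinear form $\langle\,|\,\rangle$ is called the space of missing (or decoupled) states, $\mathcal{M}$.

It is clear that $\hat{\mathfrak{g}}(A)$ has the same root system and root space decomposition as $\mathfrak{g}(A)$. Note that $\mathcal{M}$ is the same as the orthogonal complement of $\hat{\mathfrak{g}}(A)$ in $\mathfrak{g}_\Lambda$ with respect to the invariant form $(\,|\,)$. Obviously, $\mathcal{M}$ has zero intersection with the Cartan subalgebra $\mathfrak{h}_\Lambda$ and with all the tachyonic root spaces $\mathfrak{g}_\Lambda^{(r)}$, $r \in \Delta^{\mathrm{re}} = \Delta^{\mathrm{re}}(A)$. Hence we can write

$$\mathcal{M} = \mathcal{M}_- \oplus \mathcal{M}_+, \qquad \mathcal{M}_\pm := \bigoplus_{r\in\Delta^{\mathrm{im}}_\pm} \mathcal{M}^{(r)}, \tag{2.18}$$

where $\Delta^{\mathrm{im}}_\pm$ denotes the set of imaginary roots inside the forward or the backward lightcone, respectively,⁸ and $\mathcal{M}^{(r)}$ is given as the orthogonal complement of the root space for $\mathfrak{g}(A)$ in $\mathfrak{g}_\Lambda$, viz.

$$\mathfrak{g}_\Lambda^{(r)} = \mathfrak{g}(A)^{(r)} \oplus \mathcal{M}^{(r)} \qquad \forall\, r \in \Delta^{\mathrm{im}}. \tag{2.19}$$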

Note that it might (and in some examples does) happen that $\mathfrak{g}(A)^{(r)}$ is empty for some $r \in \Delta^{\mathrm{im}}$, namely when $r$ is not a root for $\mathfrak{g}(A)$. Generically, $\mathfrak{g}(A)$ is an (infinite-dimensional) Lorentzian Kac–Moody algebra about which not much is known. On the other hand we are in the lucky situation of having a root space decomposition with known multiplicities for $\mathfrak{g}_\Lambda$. So the main problem in this string realization of $\mathfrak{g}(A)$ is to understand the space of missing states. The starting point for the analysis presented below is the following theorem [23].

Theorem 5.
1. $\mathcal{M}$ is completely reducible under the adjoint action of $\mathfrak{g}(A)$. It decomposes into an orthogonal (w.r.t. $\langle\,|\,\rangle$) direct sum of irreducible lowest or highest weight modules for $\mathfrak{g}(A)$:

$$\mathcal{M}_\pm = \bigoplus_{r\in\mathcal{B}} m_r\, L(\mp r), \tag{2.20}$$

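where $\mathcal{B} \subset \Lambda \cap (-\mathcal{C})$ denotes some appropriate set of dominant integral weights for $\mathfrak{h}(A)$, $L(r)$ (resp. $L(-r)$) denotes an irreducible highest (resp. lowest) weight module for $\mathfrak{g}(A)$ with highest weight $r$ (resp. lowest weight $-r$), which occurs with multiplicity $m_r$ ($= m_{-r}$) inside $\mathcal{M}_-$ (resp. $\mathcal{M}_+$).
2. Let $\mathcal{H}_\pm \subset \mathcal{M}_\pm$ denote the space of missing lowest and highest weight vectors, respectively. Equipped with the bracket in $\mathfrak{g}_\Lambda$, $\mathcal{H}_+$ and $\mathcal{H}_-$ are (isomorphic) Lie algebras. If there are no pairwise orthogonal imaginary simple roots in $\mathfrak{g}_\Lambda$, then they are free Lie algebras.

Proof. Let $x \in \hat{\mathfrak{g}}(A)$, $m \in \mathcal{M}$. Then we can write $[x, m] = x' + m'$ for some $x' \in \hat{\mathfrak{g}}(A)$, $m' \in \mathcal{M}$. It follows that $(y|x') = (y|[x, m]) = ([y, x]|m) = 0$ for all $y \in \hat{\mathfrak{g}}(A)$ using invariance. Since the radical of the invariant form has been divided out we conclude that $x' = 0$. Thus $[\hat{\mathfrak{g}}(A), \mathcal{M}] \subseteq \mathcal{M}$ and the homomorphism property of $\rho : \mathfrak{g}(A) \to \mathrm{End}\,\mathcal{M}$, $\rho(x)m := [x, m]$, follows from the Jacobi identity in $\mathfrak{g}_\Lambda$. But $\mathcal{M}_\pm$ are already $\hat{\mathfrak{g}}(A)$-modules by themselves. To see this, we exploit the $\mathbb{Z}$-grading of $\mathfrak{g}_\Lambda$ induced by the grading vector $t$. An element of $\mathfrak{g}_\Lambda$ with momentum $r$ is said to have height $r\cdot t$. Then $\mathcal{M}_+$ and $\mathcal{M}_-$ consist of elements of positive and negative height, respectively. Going from positive to negative weight with the action of $\hat{\mathfrak{g}}(A)$ requires missing states of height zero, which cannot exist since $\mathfrak{h}_\Lambda \subset \hat{\mathfrak{g}}(A)$. By applying the Chevalley involution $\theta$, it is sufficient to consider $\mathcal{M}_-$. Let $\mathcal{N} \subset \mathcal{M}_-$ be a $\hat{\mathfrak{g}}(A)$-submodule. Then

$$\mathcal{N} = \bigoplus_{r\in\Delta^{\mathrm{im}}_-} \mathcal{N}^{(r)}, \qquad \mathcal{N}^{(r)} := \mathcal{M}_-^{(r)} \cap \mathcal{N}.$$

Since $\dim \mathcal{M}_-^{(r)} \le \dim \mathfrak{g}_\Lambda^{(r)} < \infty$ and $\langle\,|\,\rangle$ is positive definite on $\mathcal{M}_-^{(r)}$ for all $r \in \Delta$, it follows that we have the decomposition

$$\mathcal{M}_-^{(r)} = \mathcal{N}^{(r)} \oplus \mathcal{N}^{(r)\perp} \qquad \forall\, r \in \Delta^{\mathrm{im}}_-.$$

⁸ This means that we choose the grading vector to lie inside the backward lightcone.

If we define

$$\mathcal{N}^\perp := \bigoplus_{r\in\Delta^{\mathrm{im}}_-} \mathcal{N}^{(r)\perp},$$

then

$$\mathcal{M}_- = \mathcal{N} \oplus \mathcal{N}^\perp, \qquad \text{and} \qquad \langle \mathcal{N} | x(m) \rangle = \langle \theta(x)(\mathcal{N}) | m \rangle = 0$$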

for all $x \in \hat{\mathfrak{g}}(A)$, $m \in \mathcal{N}^\perp$, since $\mathcal{N}$ is a submodule by assumption. Hence $\mathcal{N}^\perp$ is also a $\hat{\mathfrak{g}}(A)$-submodule and $\mathcal{M}_-$ is indeed completely reducible. Finally, it is easy to see that each irreducible $\hat{\mathfrak{g}}(A)$-submodule $\mathcal{N} \subset \mathcal{M}_-$ is of highest-weight type. Indeed, $\mathcal{N}$ inherits the grading of $\mathcal{M}_-$ by height which is bounded from above by zero, whereas the Chevalley generators $e_i$ ($i \in I$) associated with real simple roots increase the height when applied to elements of $\mathcal{N}$. Now we want to show that each irreducible $\hat{\mathfrak{g}}(A)$-module $\mathcal{N} \subset \mathcal{M}_-$ is also irreducible under the action of $\mathfrak{g}(A)$. We shall use an argument similar to the proof of Prop. 11.8 in [24]. Recall that we have the decomposition $\mathfrak{h}_\Lambda = \mathfrak{h}(A) \oplus \mathfrak{h}'$, where $\mathfrak{h}'$ is spanned by suitable elements $h_i = r_i(-1)|0\rangle$ ($i \in H$) associated with imaginary simple roots $r_i$. Obviously, any imaginary simple root $r_i$ satisfies $r_i\cdot r \ge 0$ for all $r \in \Delta^{\mathrm{im}}_-$ and $r_i\cdot r_j \le 0$ for all $r_j \in \Pi^{\mathrm{re}}$. Let us introduce a restricted grading vector by $t' := \sum_{i\in H} r_i$. We shall call the inner product of $t'$ with any root $r$ the restricted height of that root. The subspaces of $\mathcal{N}$ of constant restricted height are then given by

$$\mathcal{N}_h := \bigoplus_{\substack{r\in\Delta^{\mathrm{im}}_- \\ t'\cdot r = h}} \mathcal{N}^{(r)}.$$

Since $t'\cdot r \ge 0$ for all $r \in \Delta^{\mathrm{im}}_-$, there exists some minimal $h_{\min}$ such that $\mathcal{N}_{h_{\min}} \neq 0$ and $\mathcal{N}_h = 0$ for $h < h_{\min}$. We have a decomposition of $\mathfrak{g}(A)$ w.r.t. the restricted height as well, viz.

$$\mathfrak{g}(A) = \mathfrak{g}_- \oplus \mathfrak{g}_0 \oplus \mathfrak{g}_+, \qquad \mathfrak{g}_\pm := \bigoplus_{h \gtrless 0} \mathfrak{g}(A)_h.$$

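Note that this triangular decomposition is different from the previous one encountered in (2.17). In general, they are related by $\mathfrak{n}_\pm(A) \subseteq \mathfrak{g}_\pm \oplus \mathfrak{g}_0$ and $\mathfrak{h}(A) \subseteq \mathfrak{g}_0$. Now, apparently each $\mathcal{N}_h$ is a $\mathfrak{g}_0$-module. In particular, $\mathcal{N}_{h_{\min}}$ must be irreducible, since any $\mathfrak{g}_0$-invariant proper subspace would generate a proper $\hat{\mathfrak{g}}(A)$-submodule of $\mathcal{N}$, contradicting its irreducibility. By the same argument, $\{v \in \mathcal{N}_h \mid \mathfrak{g}_-(v) = 0\} = 0$ for $h > h_{\min}$. Hence $\mathcal{N} = U(\mathfrak{g}_+)\mathcal{N}_{\mathrm{vac}}$, where $\mathcal{N}_{\mathrm{vac}} := \{v \in \mathcal{N} \mid \mathfrak{g}_-(v) = 0\} = \mathcal{N}_{h_{\min}}$ is an irreducible $\mathfrak{g}_0$-module. From this we conclude that $\mathcal{N}$ is indeed an irreducible $\mathfrak{g}(A)$-module. So $\mathcal{M}_-$ decomposes into an orthogonal direct sum

$$\mathcal{M}_- = \bigoplus_{\alpha\in\mathcal{B}} m_\alpha L_\alpha,$$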


where $\mathcal{B}$ denotes some appropriate index set and each $L_\alpha$ is an irreducible $\mathfrak{g}(A)$-module occurring with multiplicity $m_\alpha > 0$. Finally, it is easy to see that each irreducible $\mathfrak{g}(A)$-submodule $L_\alpha \subset \mathcal{M}_-$ is of highest-weight type. Indeed, $L_\alpha$ inherits the grading of $\mathcal{M}_-$ by height which is bounded from above by zero, whereas the Chevalley generators $e_i$ ($i \in I$) associated with real simple roots increase the height when applied to vectors in $L_\alpha$. So there exists an element $v_r \in L_\alpha$ associated with a dominant integral weight $r \in \Lambda \cap (-\mathcal{C})$ such that $e_i(v_r) = 0$ for all $i \in I$ and $L_\alpha \equiv L(r) = U(\mathfrak{n}_-(A))\, v_r$. To prove the second part of the theorem, let $v_1, v_2 \in \mathcal{H}_-$. It follows that $x([v_1, v_2]) := [x, [v_1, v_2]] = [x(v_1), v_2] + [v_1, x(v_2)]$. If we choose $x = e_i$ or $x = h_i$, respectively, it is clear that $[v_1, v_2]$ is again a highest weight vector. To see that it is missing we note that $\langle x | [v_1, v_2] \rangle = \langle x(\theta(v_1)) | v_2 \rangle$ for all $x \in \mathfrak{g}(A)$ by contravariance. But since $x(\theta(v_1)) \in \mathcal{M}_+$ and $v_2 \in \mathcal{M}_- \perp \mathcal{M}_+$ we see that indeed $[v_1, v_2] \in \mathcal{H}_-$. Finally, since $\mathfrak{g}_\Lambda$ is a Borcherds algebra we know that extra Lie algebra relations (in addition to those for $\mathfrak{g}(A)$) can occur only if there are pairwise orthogonal imaginary simple roots in $\mathfrak{g}_\Lambda$. If this is not the case $\mathcal{H}_\pm$ must be free.

So the space of missing states decomposes into an orthogonal direct sum of irreducible $\mathfrak{g}(A)$-multiplets each of which is obtained by repeated application of the raising operators $e_i$ (resp. $f_i$) to some lowest (resp. highest) weight vector. This beautiful structure, however, looks rather messy from the point of view of a single missing root space, $\mathcal{M}^{(r)}$, say. Generically, it decomposes into an orthogonal direct sum of three subspaces with special properties, viz.

$$\mathcal{M}^{(r)} = \mathcal{R}^{(r)} \oplus \mathcal{H}^{(r)} \oplus \mathcal{J}^{(r)} \qquad \text{for } r \in \Delta^{\mathrm{im}}_+, \tag{2.21}$$

where $\mathcal{R}^{(r)}$ consists of states belonging to lower-height $\mathfrak{g}(A)$-multiplets and $\mathcal{H}^{(r)} := [\mathcal{H}_+, \mathcal{H}_+] \cap \mathcal{M}^{(r)}$ is spanned by multiple commutators of appropriate lower-height vacuum vectors. What can we say about the remaining piece, $\mathcal{J}^{(r)}$? Its states are vacuum vectors for $\mathfrak{g}(A)$, which cannot be reached by multiple commutators inside the space of missing lowest weight vectors, $\mathcal{H}_+$. So a basis for $\mathcal{J}^{(r)}$ is part of a basis for $\mathcal{H}_+$. At the level of the Borcherds algebra $\mathfrak{g}_\Lambda$, this just means that the root $r$ is an imaginary simple root of multiplicity $\dim \mathcal{J}^{(r)}$. For this reason we introduce the so-called simple multiplicity $\mu(r)$ of a root $r$ in the fundamental Weyl chamber as

$$\mu(r) := \dim \mathcal{J}^{(r)}. \tag{2.22}$$

Obviously we have $\mu(r) \le \mathrm{mult}(r)$. Once we know the simple multiplicity of a fundamental root, it is clear how to proceed. Recursively by height, we adjoin to $\mathfrak{g}(A)$ for each fundamental root $r$ a set of $\mu(r)$ generators $\{e_j, f_j, h_j\}$. This also explains why it is sufficient to concentrate on fundamental roots. Indeed, by the action of the Weyl group we conclude that the simple multiplicity of any non-fundamental positive imaginary root is zero, while the Chevalley involution tells us that $\mu(r) = \mu(-r)$; this just reflects the fact that we always adjoin the Chevalley generators $e_j$ and $f_j$ in pairs. Let us point out that for ordinary (i.e. not generalized in the sense of Borcherds) Kac–Moody algebras, for which all elements of any root space are obtained as multiple commutators of the Chevalley–Serre generators (by the very definition of a Kac–Moody algebra!), we have $\mu(r) = 0$, and therefore the notion of simple multiplicity is superfluous.


3. The Gnome Lie Algebra

The gnome Lie algebra $\mathfrak{g}_{II_{1,1}}$, which we will investigate in this section, is the simplest example of a Borcherds algebra that can be explicitly described as the Lie algebra of physical states of a compactified string. It is based on the lattice $II_{1,1}$ as the momentum lattice of a fully compactified bosonic string in two space-time dimensions. Since there are no transversal degrees of freedom in $d = 2$ and only longitudinal string excitations occur, the Lie algebra of physical states may be regarded as the precise opposite of the fake monster Lie algebra in 26 dimensions which has only transversal and no longitudinal physical states. It constitutes an example of a generalized Kac–Moody algebra which is almost “purely Borcherds” in that, with one exception, all its simple roots are imaginary (timelike). The gnome Lie algebra is also a cousin of the true monster Lie algebra because they both have the same root lattice, $II_{1,1}$. In fact, we shall see that the gnome Lie algebra is a Borcherds subalgebra not only of the fake monster Lie algebra but also of any Lie algebra of physical states associated with a momentum lattice that can be decomposed in such a way that it contains $II_{1,1}$ as a sublattice.

3.1. The lattice $II_{1,1}$. We start by summarizing some properties of the unique two-dimensional even unimodular Lorentzian lattice $II_{1,1}$. It can be realized as

$$II_{1,1} := \mathbb{Z}(\tfrac12; \tfrac12) \oplus \mathbb{Z}(-1; 1) = \{(\ell/2 - n;\ \ell/2 + n) \mid \ell, n \in \mathbb{Z}\},$$

where for the (Minkowskian) product of two vectors our convention is $(x^1; x^0)\cdot(y^1; y^0) := x^1 y^1 - x^0 y^0$. Alternatively, we will represent the elements of $II_{1,1}$ in a light cone basis, i.e., in terms of pairs $\langle\ell, n\rangle \in \mathbb{Z} \oplus \mathbb{Z}$ with inner product matrix $\begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}$, so that $\langle\ell, n\rangle^2 = -2\ell n$. The lattice points are shown in Fig. 1 below. The main importance of this lattice for us derives from the fact that it is the root lattice of the Lie algebra $\mathfrak{g}_{II_{1,1}}$ we are about to construct.
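The two descriptions of $II_{1,1}$ given above can be compared numerically. The following is a small sketch in exact rational arithmetic; the function names are hypothetical, and the only inputs are the basis vectors $(\tfrac12;\tfrac12)$, $(-1;1)$ and the inner product conventions stated in the text.

```python
from fractions import Fraction


def minkowski(x, y):
    """(x1; x0) . (y1; y0) := x1*y1 - x0*y0, with vectors stored as (x1, x0)."""
    return x[0] * y[0] - x[1] * y[1]


def from_light_cone(l, n):
    """<l, n> = l*(1/2; 1/2) + n*(-1; 1) = (l/2 - n; l/2 + n)."""
    return (Fraction(l, 2) - n, Fraction(l, 2) + n)


def light_cone_product(a, b):
    """Inner product of <l, n> and <l', n'> with matrix [[0, -1], [-1, 0]]."""
    return -(a[0] * b[1] + a[1] * b[0])


def norm_squared(l, n):
    """Norm of <l, n>, computed in the Minkowskian coordinates."""
    v = from_light_cone(l, n)
    return minkowski(v, v)
```

In this sketch `norm_squared(l, n)` agrees with $-2\ell n$ for all integers, which makes the evenness of the lattice visible, and the Gram matrix of the light cone basis has determinant $-1$, as required for unimodularity.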
As already explained in the last section, allowed physical string momenta have norm squared at most two and consequently any root $\Lambda$ for $\mathfrak{g}_{II_{1,1}}$ must obey $\Lambda^2 \le 2$. There are no lightlike roots here: the corresponding root spaces are empty owing to the absence of transversal polarizations in two dimensions. Therefore, imaginary roots for $\mathfrak{g}_{II_{1,1}}$ are all lattice vectors lying in the interior of the lightcone. Real roots satisfy $\Lambda^2 = 2$, and the lattice $II_{1,1}$ possesses only two such roots $\Lambda = \pm r_{-1}$, where $r_{-1} := (\tfrac32; -\tfrac12) = \langle 1, -1\rangle$. Our notation has been chosen so as to make explicit the analogy with $E_{10}$, where $r_{-1}$ is the over-extended root. In addition we need the lightlike vector $\delta := (-1; 1) = \langle 0, 1\rangle$, obeying $r_{-1}\cdot\delta = -1$. Hence it serves as a lightlike Weyl vector for $\mathfrak{g}_{II_{1,1}}$.⁹ It is analogous to the null root of the affine subalgebra $E_9 \subset E_{10}$, but the crucial difference is that for $II_{1,1}$ it is not a root (see the above remark).

⁹ It is, however, only a “real” Weyl vector since it has scalar product $-1$ with all real simple roots, whereas it will not have the correct scalar products with all imaginary simple roots. In fact, there is no true Weyl vector for $\mathfrak{g}_{II_{1,1}}$.

[Fig. 1. The Lorentzian lattice $II_{1,1}$, drawn with axes $x^1$, $x^0$ and light cone coordinates $\ell$, $n$; the vectors $\delta$ and $r_{-1}$ are marked.]

Nonetheless, we can use $\delta$ to introduce the notion of level (again by analogy with $E_{10}$), namely, by assigning to a root $\Lambda$ the integer $\ell := -\delta\cdot\Lambda$. This gives us a $\mathbb{Z}$-grading of the set of roots. The reflection symmetry of the lattice, which gives rise to the Chevalley involution of $\mathfrak{g}_{II_{1,1}}$ and which introduces the splitting of the set of roots into positive and negative roots, apparently changes the level into its negative. Consequently, the sign of the level of a root determines whether it is positive or negative, and for an analysis of $\mathfrak{g}_{II_{1,1}}$ it is sufficient to consider positive roots only. We conclude that the set of positive roots for $\mathfrak{g}_{II_{1,1}}$ consists of the level-1 root $r_{-1}$ and the infinitely many lattice vectors lying inside the forward lightcone. The Weyl group of $II_{1,1}$ is very simple: since we can only reflect with respect to the single root $r_{-1}$, it has only two elements and is thus isomorphic to $\mathbb{Z}_2$ just like the Weyl group of the monster Lie algebra [5]. On any vector $x \in \mathbb{R}^{1,1}$ it acts as $w_{-1}(x) := x - (x\cdot r_{-1})\, r_{-1}$; in light cone coordinates we have the simple formula $w_{-1}\langle\ell, n\rangle = \langle n, \ell\rangle$. Hence the forward lightcone is the union of only two Weyl chambers; the fundamental Weyl chamber leading to our choice of the real simple root has been shaded in Fig. 2. It is given by


$$\mathcal{C} = \{x \in \mathbb{R}^{1,1} \mid x^2 \le 0,\ x\cdot r_{-1} \le 0,\ x\cdot\delta \le 0\}.$$

The imaginary positive roots inside $\mathcal{C}$ will be called fundamental roots. Combining the action of the Weyl group with the reflection symmetry of the lattice, the whole analysis of $\mathfrak{g}_{II_{1,1}}$ is thereby reduced to understanding the root spaces associated with fundamental roots. Obviously, $r_{-1}$ and $\delta$ span $II_{1,1}$, and thus any positive level-$\ell$ root can be written as $\Lambda = \ell r_{-1} + n\delta = \langle\ell, n - \ell\rangle$, where $n > \ell > 0$ because of $\Lambda^2 = 2\ell(\ell - n)$. As explained in [17], the DDF construction necessitates the introduction of fractional momenta which do not belong to the lattice. We define

$$a_\ell := \ell r_{-1} + \Big(\ell - \frac{1}{\ell}\Big)\delta, \qquad k_\ell := -\frac{1}{\ell}\,\delta,$$

such that we can write down the so-called DDF decomposition

$$\Lambda = a_\ell - \big(1 - \tfrac12\Lambda^2\big)\, k_\ell \tag{3.1}$$

for any positive level-$\ell$ root $\Lambda$. The tachyonic momenta $a_\ell$ lie on a mass shell hyperbola $a_\ell^2 = 2$ which has been depicted in Fig. 2 below. This figure also displays the intermediate points (as small circles) “between the lattice” required by the DDF construction, and allows us to visualize how the lattice becomes more and more “fractionalized” with increasing level. We call vectors $a_\ell - m k_\ell$, $0 \le m \le -\tfrac12\Lambda^2$, which are not lattice points fractional roots. Note that fractional roots can only occur for $\ell > 1$. We stress that the physical states associated with these intermediate points are not elements of the Lie algebra $\mathfrak{g}_{II_{1,1}}$, as their operator product expansions will contain fractional powers.

3.2. Basic structure of the gnome Lie algebra. The gnome Lie algebra is by definition the Borcherds algebra $\mathfrak{g}_{II_{1,1}}$ of physical states of a bosonic string fully compactified on the lattice $II_{1,1}$. We would first like to describe its root space decomposition. To do so, we assign the grading $\langle\ell, n\rangle$ to any string state with momentum $\langle\ell, n\rangle = \ell r_{-1} + (n + \ell)\delta \in II_{1,1}$. The no-ghost theorem in the guise of Thm. 1 then implies that the contravariant form $\langle\,|\,\rangle$ is positive definite on the piece of nonzero degree of the gnome Lie algebra $\mathfrak{g}_{II_{1,1}}$. The degree $\langle 0, 0\rangle$ piece of $\mathfrak{g}_{II_{1,1}}$ is isomorphic to $\mathbb{R}^2$, while the tachyonic states $|\pm r_{-1}\rangle$ yield two one-dimensional subspaces of degrees $\langle -1, 1\rangle$ and $\langle 1, -1\rangle$, respectively. With these conventions, the gnome Lie algebra looks schematically like the monster Lie algebra (see Fig. 3 and [6]). Here we have indexed the subspace associated with the root $\Lambda = \langle\ell, n\rangle$ by $[\ell n]$ because the dimension of this root space depends only on the product $\ell n$. Indeed, since $1 - \tfrac12\langle\ell, n\rangle^2 = 1 + \ell n$ we have, according to (2.10),

$$\mathrm{mult}_{\mathfrak{g}_{II_{1,1}}}(\Lambda) \equiv \dim \mathfrak{g}_{II_{1,1}}^{(\Lambda)} = \pi_1(1 + \ell n),$$

where the partition function $\pi_1(n)$ was already defined in (2.11).
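The DDF decomposition (3.1), the mass shell condition $a_\ell^2 = 2$, and the reflection formula $w_{-1}\langle\ell, n\rangle = \langle n, \ell\rangle$ can all be checked with exact rational arithmetic. The sketch below works in light cone coordinates with $r_{-1} = \langle 1, -1\rangle$ and $\delta = \langle 0, 1\rangle$; the helper names are hypothetical and not taken from the paper.

```python
from fractions import Fraction

# Light cone coordinates <l, n>; inner product matrix [[0, -1], [-1, 0]].
R_MINUS_1 = (Fraction(1), Fraction(-1))  # real simple root r_{-1} = <1, -1>
DELTA = (Fraction(0), Fraction(1))       # lightlike vector delta = <0, 1>


def dot(a, b):
    return -(a[0] * b[1] + a[1] * b[0])


def combine(c1, v1, c2, v2):
    """Return the linear combination c1*v1 + c2*v2."""
    return (c1 * v1[0] + c2 * v2[0], c1 * v1[1] + c2 * v2[1])


def ddf_momenta(level):
    """a_l = l r_{-1} + (l - 1/l) delta and k_l = -(1/l) delta."""
    l = Fraction(level)
    a = combine(l, R_MINUS_1, l - 1 / l, DELTA)
    k = combine(Fraction(0), R_MINUS_1, -1 / l, DELTA)
    return a, k


def root(level, n):
    """Lambda = l r_{-1} + n delta = <l, n - l>."""
    return combine(Fraction(level), R_MINUS_1, Fraction(n), DELTA)


def is_lattice_point(v):
    """A vector lies in II_{1,1} iff both light cone coordinates are integers."""
    return all(c.denominator == 1 for c in v)


def weyl_reflect(v):
    """w_{-1}(x) = x - (x . r_{-1}) r_{-1}."""
    return combine(Fraction(1), v, -dot(v, R_MINUS_1), R_MINUS_1)
```

With these definitions, `combine(Fraction(1), a, -M, k)` with `M = 1 - dot(lam, lam) / 2` reproduces `lam = root(l, n)`, and `a - m*k` is a lattice point exactly when $m - 1$ is a multiple of $\ell$; the remaining values of $m$ give the fractional roots.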
While this description of $\mathfrak{g}_{II_{1,1}}$ is rather abstract, we can give a much more concrete realization of this Lie algebra by means of the discrete DDF construction developed in [17]. In fact, the DDF construction provides us with a complete basis for the gnome Lie algebra.

[Fig. 2. Fundamental Weyl chamber, positive and fractional roots for $\mathfrak{g}_{II_{1,1}}$; axes $x^1$, $x^0$ and light cone coordinates $\ell$, $n$.]

The single real simple root $r_{-1}$ of $II_{1,1}$ gives rise to Lie algebra elements (cf. Eq. (2.15))

$$h_{-1} := r_{-1}(-1)|0\rangle, \qquad e_{-1} := |r_{-1}\rangle, \qquad f_{-1} := -|-r_{-1}\rangle, \tag{3.2}$$

which generate the finite Kac–Moody subalgebra $\mathfrak{g}(A) = \mathfrak{sl}_2 \equiv A_1 \subset \mathfrak{g}_{II_{1,1}}$. On the other hand, there are infinitely many imaginary (timelike) roots inside the lightcone. We shall see that out of these all fundamental roots (except for one) will be simple roots as well. We notice that the one-dimensional Cartan subalgebra $\mathfrak{h}(A)$ spanned by $h_{-1}$ does not coincide with the two-dimensional Cartan subalgebra $\mathfrak{h}_{II_{1,1}}$. Hence we need to introduce the Lie algebra $\hat{\mathfrak{g}}(A) := \mathfrak{sl}_2 + \mathfrak{h}_{II_{1,1}} = \mathfrak{sl}_2 \oplus \mathbb{R}\Lambda_0$ by adjoining to $\mathfrak{sl}_2$ the element $\Lambda_0 := (r_{-1} + 2\delta)(-1)|0\rangle$, which commutes with $\mathfrak{sl}_2$ and therefore behaves like a central charge (but notice that the affine extension of $\mathfrak{sl}_2$ is not a subalgebra of $\mathfrak{g}_{II_{1,1}}$). It may be regarded as a remnant of the Cartan subalgebra of the hyperbolic extension of a zero-dimensional (virtual) Lie algebra. We see that in this example the Lie algebra $\mathfrak{g}(A)$ is too small to yield a lot of information (the “smallness” of $\mathfrak{g}(A)$ is due to the absence of transversal physical string states in two dimensions).

[Fig. 3. Root space decomposition of the gnome Lie algebra, drawn on the $(\ell, n)$ grid: $\mathbb{R}^2$ at the origin, $\mathfrak{g}_{[-1]}$ at the two real roots, $\mathfrak{g}_{[\ell n]}$ at the imaginary roots $\langle\ell, n\rangle$, and $0$ at the remaining grid points.]

Nonetheless, there are infinitely many purely longitudinal physical states present which are of the form

$$A^-_{-n_1}(a_\ell) \cdots A^-_{-n_N}(a_\ell)\,|a_\ell\rangle, \tag{3.3}$$

where $n_1 \ge n_2 \ge \ldots \ge n_N \ge 2$ and the longitudinal DDF operators $A^-_{-n_a}$ are associated with a tachyon momentum $a_\ell$ and a lightlike vector $k_\ell$ satisfying $a_\ell\cdot k_\ell = 1$. Of course, not all of these string states belong to $\mathfrak{g}_{II_{1,1}}$; in addition, we must require that (cf. Eq. (3.1)) $\Lambda := a_\ell - M k_\ell$ is a root, i.e. $\Lambda \in II_{1,1}$ with $\Lambda^2 \le 2$, so that

$$M := \sum_{j=1}^{N} n_j = 1 - \tfrac12\Lambda^2 \ge 0.$$

In other words, given a root $\Lambda = \ell r_{-1} + n\delta$, a basis of the associated root space $\mathfrak{g}_{II_{1,1}}^{(\Lambda)}$ is provided by longitudinal DDF states of the above form with total excitation number $M = \ell(n - \ell) + 1$. For momenta of the form $a_\ell - m k_\ell$, $0 \le m < M$, such that $m - 1$ is not a multiple of $\ell$, i.e., for fractional roots “between the lattice points” (cf. Fig. 2), we obtain “intermediate (physical) states” which are not elements of the Lie algebra $\mathfrak{g}_{II_{1,1}}$. In fact, they are not full-fledged states of the string model under consideration but rather states of the uncompactified string model.
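The basis just described can be enumerated explicitly: each DDF state of the form (3.3) is labelled by a partition $n_1 \ge \ldots \ge n_N \ge 2$ of the total excitation number $M$. The following sketch lists these labels for a root $\Lambda = \ell r_{-1} + n\delta$; the function names are hypothetical, and the code only counts labels (it does not construct the DDF operators themselves).

```python
def partitions_min_part(total, min_part=2, max_part=None):
    """Yield partitions of `total` as non-increasing tuples with all parts >= min_part."""
    if max_part is None:
        max_part = total
    if total == 0:
        yield ()
        return
    for first in range(min(total, max_part), min_part - 1, -1):
        for rest in partitions_min_part(total - first, min_part, first):
            yield (first,) + rest


def ddf_basis_labels(level, n):
    """Labels (n_1, ..., n_N) of the DDF states spanning the root space of
    Lambda = level * r_{-1} + n * delta, using M = level * (n - level) + 1."""
    total = level * (n - level) + 1
    return list(partitions_min_part(total))
```

For example, `ddf_basis_labels(1, 4)` gives the two labels `(4,)` and `(2, 2)`, matching a root space of dimension $\pi_1(4) = 2$, while `ddf_basis_labels(1, 1)` is empty, in line with the absence of lightlike roots.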

Missing Modules, the Gnome Lie Algebra, and E10


It is clear that, apart from the subalgebra $\hat{\mathfrak{g}}(A)$, all elements of the gnome Lie algebra are associated with imaginary roots. And since none of the longitudinal states can be obtained by multiple commutation of elements of $\mathfrak{sl}_2$, all of them are missing states. Thus

$$\mathcal{M}_+^{(\Lambda)} = \mathfrak{g}_{II_{1,1}}^{(\Lambda)} = \mathrm{span}\bigl\{ A^-_{-n_1}(a_\ell) \cdots A^-_{-n_N}(a_\ell)\,|a_\ell\rangle \bigm| n_j > 1,\ n_1 + \dots + n_N = 1 - \tfrac12 \Lambda^2 \bigr\}, \qquad (3.4)$$

for all $\Lambda \in \Delta^{\mathrm{im}}_+$ and similarly for $\mathcal{M}_-$. From the point of view of $\mathfrak{sl}_2$, all these states must be added “by hand” to fill up $\mathfrak{sl}_2$ to $\mathfrak{g}_{II_{1,1}}$.

Having a complete basis for the space of missing states, the task is now to determine the complete set of imaginary simple roots. In principle, this can be achieved in two steps. First, we have to identify all the missing lowest weight vectors in $\mathcal{M}_+$. Then we have to determine a basis for the Lie algebra of lowest weight vectors. This provides us with the complete information about the imaginary simple roots and their multiplicities. In the next subsection, this strategy is discussed in more detail and is illustrated by some examples.

For the gnome Lie algebra, the information about the imaginary simple roots and their multiplicities can be determined by means of the Weyl–Kac–Borcherds denominator formula. One reason for this is the simplicity of the Weyl group of $\mathfrak{sl}_2$, which simplifies the denominator formula enormously. It reads

$$(x^{-1} - y^{-1}) \prod_{\ell,n>0} (1 - x^\ell y^n)^{\pi_1(1+\ell n)} = x^{-1} - y^{-1} + \sum_{n \ge \ell > 0} \mu_{\ell,n} \bigl( x^n y^{\ell-1} - x^{\ell-1} y^n \bigr), \qquad (3.5)$$

where we write $x \equiv e^{\langle 1,0\rangle}$ and $y \equiv e^{\langle 0,1\rangle}$ for the generators of the group algebra of $II_{1,1}$ and we put $\mu_{\ell,n} \equiv \mu(\langle \ell, n\rangle)$. Recall that the action of the Weyl group simply interchanges $x$ and $y$. Also note that the fundamental roots have nonzero inner product with each other, so that there is no extra contribution of pairwise orthogonal imaginary simple roots on the right-hand side. Therefore we are in the fortunate situation that the sum on the right-hand side runs only once over the imaginary simple roots and that the relevant coefficients are just the simple multiplicities. Furthermore, the associated Lie algebra of lowest weight vectors, $\mathcal{H}_+$, is a free Lie algebra, which follows from Thm. 5 due to the fact that there are no lightlike roots (cf. [23]).
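Because the right-hand side of the denominator formula is linear in the simple multiplicities, they can be read off by expanding the product on the left-hand side to finite order. The sketch below does this with exact integer arithmetic. It assumes that $\pi_1(n)$ is the number of partitions of $n$ into parts $\ge 2$ (the count of longitudinal states used above); the function names are hypothetical.

```python
from math import comb


def pi1_table(limit):
    """pi1[n] = number of partitions of n into parts >= 2 (assumed meaning of pi_1)."""
    pi1 = [1] + [0] * limit
    for part in range(2, limit + 1):
        for total in range(part, limit + 1):
            pi1[total] += pi1[total - part]
    return pi1


def denominator_product(max_x, max_y):
    """Coefficients P[i][j] of x^i y^j in prod_{ell,n>0} (1 - x^ell y^n)^{pi1(1+ell*n)},
    truncated to i <= max_x and j <= max_y."""
    pi1 = pi1_table(1 + max_x * max_y)
    P = [[0] * (max_y + 1) for _ in range(max_x + 1)]
    P[0][0] = 1
    for ell in range(1, max_x + 1):
        for n in range(1, max_y + 1):
            mult = pi1[1 + ell * n]
            new = [[0] * (max_y + 1) for _ in range(max_x + 1)]
            for i in range(max_x + 1):
                for j in range(max_y + 1):
                    k = 0
                    # Binomial expansion of (1 - x^ell y^n)^mult.
                    while k <= mult and k * ell <= i and k * n <= j:
                        new[i][j] += (-1) ** k * comb(mult, k) * P[i - k * ell][j - k * n]
                        k += 1
            P = new
    return P


def simple_multiplicity(P, ell, n):
    """mu_{ell,n}: coefficient of x^n y^(ell-1) in (x^-1 - y^-1) * P, for n >= ell >= 1."""
    return P[n + 1][ell - 1] - P[n][ell]


P = denominator_product(7, 6)
for ell in range(1, 7):
    print(ell, [simple_multiplicity(P, ell, n) for n in range(ell, 7)])
```

The truncation is harmless because the coefficient of $x^i y^j$ only involves factors with $\ell \le i$ and $n \le j$.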
We summarize: a set of imaginary simple roots for the gnome Lie algebra $\mathfrak{g}_{II_{1,1}}$ is given by the vectors $\{\langle \ell, n\rangle \mid n \ge \ell \ge 1\}$, each with multiplicity $\mu_{\ell,n}$, which is the coefficient of $x^n y^{\ell-1}$ in the left-hand side of Eq. (3.5) regarded as a generating function. Expanding the latter, one readily obtains the results (see Fig. 4)

$$\mu_{1,n} = \pi_1(1+n) \quad \text{for } n \ge 1,$$

$$\mu_{2,n} = \pi_1(1+2n) - \pi_1(2+n) - \tfrac12\, \pi_1(1+\tfrac n2) \bigl[ \pi_1(1+\tfrac n2) - 1 \bigr] - \sum_{1 \le k \le \frac{n-1}{2}} \pi_1(1+k)\, \pi_1(1+n-k) \quad \text{for } n \ge 2, \qquad (3.6)$$

where we have defined $\pi_1(1+\tfrac n2) := 0$ for any odd integer $n$. The first formula tells us that all level-1 longitudinal states are missing states associated with imaginary simple roots; from the second we learn that this is no longer true at higher level since $\mu_{2,n} < \pi_1(1+2n)$ and consequently some of the associated states can be generated by commutation of level-1 states. In fact, one easily sees that not only does $\mu(\Lambda)$ not vanish in general, and hence all higher-level roots are simple with a certain multiplicity, but also that $\mu(\Lambda) < \mathrm{mult}(\Lambda)$ at higher level.

$\mu_{\ell,n}$ (entries exist only for $n \ge \ell$):

ℓ\n    1     2     3     4     5     6
1      1     1     2     2     4     4
2      -     0     1     2     6    10
3      -     -     3     6    20    40
4      -     -     -     5    36   101
5      -     -     -     -    63   239
6      -     -     -     -     -   331

mult $\langle \ell, n\rangle$:

ℓ\n    1     2     3     4     5     6
1      1     1     2     2     4     4
2      1     2     4     8    14    24
3      2     4    12    24    55   105
4      2     8    24    66   165   383
5      4    14    55   165   478  1238
6      4    24   105   383  1238  3660

Fig. 4. Multiplicity of imaginary simple roots vs. dimension of root spaces

This illustrates the point we have already made in the introduction and in the past [17]: while generalized Kac–Moody algebras such as the gnome may have a rather simple structure in terms of the DDF construction, they are usually quite complicated to analyze from the point of view of their root space decompositions. For hyperbolic Kac–Moody algebras, the situation is precisely the reverse: the simple roots can be read off from the Coxeter–Dynkin diagram, but the detailed structure of the root spaces is exceedingly complicated.

Due to the complicated pattern of the imaginary simple roots and their multiplicities, the approach of decomposing $\mathfrak{g}_{II_{1,1}}$ into multiplets of $\mathfrak{sl}_2$ seems to be not very fruitful. One reason for this is that $\mathfrak{sl}_2$ is just “too small” to yield non-trivial information about the full Lie algebra, in stark contrast to the algebra $\mathfrak{g}_{II_{9,1}}$ whose corresponding subalgebra $\mathfrak{g}(A) = E_{10}$ is much bigger. Another reason, which is not so obvious, comes from the observation that for increasing level the dimensions of the root spaces grow much faster than the simple multiplicities. This explains why additional imaginary simple roots are needed at every level.

There is a beautiful example where this situation is rectified. The true monster Lie algebra [6] is a Borcherds algebra which is based on the same lattice $II_{1,1}$ as the root lattice; but the multiplicity of a root $\langle \ell, n\rangle$ is given by $c(\ell n)$ (replacing $\pi_1(1+\ell n)$), which is the coefficient of $q^{\ell n}$ in the elliptic modular function $j(q) - 744 = \sum_{n \ge -1} c_n q^n = q^{-1} + 196884\,q + \dots$. In [6], Borcherds was able to determine a set of imaginary simple roots and their simple multiplicities by establishing an identity for the elliptic modular function which turned out to be precisely the above denominator formula. In that example, the imaginary simple roots are all level-1 vectors $\langle 1, n\rangle$ ($n \ge 1$), each with multiplicity $c(n)$.
Thus the simple multiplicities are large enough so that the level-1 $\mathfrak{sl}_2$ vacuum vectors can generate by multiple commutators the full Lie algebra of missing lowest weight vectors.

Even though the infinite Cartan matrix looks rather messy, the gnome Lie algebra $\mathfrak{g}_{II_{1,1}}$ has now been cast into the form of a Borcherds algebra in the sense of Def. 1. The next step in the analysis would be the calculation of the structure constants. Since we have exhibited an explicit basis of the algebra in terms of the DDF states, this can be done in principle. Practically, however, the calculations still have to be performed by use of the humble oscillator basis $\{\alpha^\mu_m\}$, whereas we would prefer to be able to calculate the commutators of DDF states in a manifestly physical way, i.e., in a formalism based on the DDF operators only. For the transversal DDF operators this problem was solved recently [18]. However, since we are dealing with purely longitudinal excitations here, one would certainly have to consider exponentials of longitudinal DDF operators. This is technically much more delicate, since the operators do not form a Heisenberg algebra but a Virasoro algebra. Let us also point out the evident relation between the gnome Lie algebra and Liouville theory, which remains to be understood in more detail.

3.3. DDF states and examples. We will now perform some explicit checks and for some examples exhibit the split of the root spaces into parts that can be generated by commutation of low-level elements and the remaining states which must be adjoined by hand, and whose number equals the simple multiplicity of the root in question. Since the actual calculations are quite cumbersome it is helpful to use a computer. We would like to emphasize that these examples not only provide completely explicit realizations of the Lie algebra elements, but also enable us to determine the “structure constants”, whereas for other Borcherds algebras (such as the true or the fake monster Lie algebra), investigations so far have been limited to the determination of root space multiplicities and the modular properties of the associated partition functions.

It is natural to investigate the subspace $\mathcal{M}_+$ of missing states of the gnome Lie algebra recursively level by level:

$$\mathcal{M}_+ = \bigoplus_{\ell > 0} \mathcal{M}^{[\ell]}, \qquad \mathcal{M}^{[\ell]} := \bigoplus_{\substack{\Lambda \in \Delta^{\mathrm{im}}_+ \\ \Lambda\cdot\delta = -\ell}} \mathcal{M}_+^{(\Lambda)}. \qquad (3.7)$$

We observe that, already at level 1, we have an infinite tower of missing states; indeed, the states

$$A^-_{-n_1}(r_{-1}) \cdots A^-_{-n_N}(r_{-1})\,|r_{-1}\rangle \qquad (3.8)$$

span $\mathcal{M}^{[1]}$. Adjoining these states to the algebra is therefore tantamount to adjoining infinitely many imaginary simple level-1 roots $r_{-1} + n\delta = \langle 1, n-1\rangle$ ($n > 1$) with multiplicity $\pi_1(n)$.$^{10}$ Although this statement is evident, we would like to demonstrate explicitly that these states are indeed lowest weight vectors for irreducible $\mathfrak{sl}_2$-modules. So let us consider the state $v_\Lambda := A^-_{-n_1}(r_{-1}) \cdots A^-_{-n_N}(r_{-1})|r_{-1}\rangle$, where $\Lambda := r_{-1} + n\delta$, $n := \sum_{j=1}^{N} n_j > 1$. Using the adjoint action in $\mathfrak{g}_{II_{1,1}}$ and the formulas for $\mathfrak{sl}_2$ given in (3.2), we infer that

$$h_{-1}(v_\Lambda) = (2-n)\,v_\Lambda, \qquad f_{-1}(v_\Lambda) \propto L_{-1}|n\delta\rangle \equiv 0, \qquad (e_{-1})^{1 - r_{-1}\cdot\Lambda}(v_\Lambda) \propto L_{-1}|n(r_{-1}+\delta)\rangle \equiv 0.$$

Note that the last two relations (the lowest weight and the null vector condition, respectively) follow from momentum conservation (cf. Eq. (2.2)) and the fact that physical string states in two dimensions are bound to be null states. Hence $v_\Lambda$ is indeed a vacuum vector for an irreducible $\mathfrak{sl}_2$-module with spin $\tfrac12(n-2)$. These multiplets can be constructed by repeated application of the raising operator $e_{-1}$, which each time increases the level by one.

Clearly, the higher-level states belong to irreducible $\mathfrak{sl}_2$-multiplets, but the structure quickly becomes rather messy. As already mentioned, we have to decompose each missing root space $\mathcal{M}_+^{(\Lambda)}$ into an orthogonal direct sum of three subspaces with special properties: one consists of states belonging to lower-level $\mathfrak{sl}_2$-multiplets, the other is made up of appropriate multiple commutators of lower-level vacuum vectors, and the rest comes from states corresponding to imaginary simple roots. We will now illustrate this pattern by a few examples.

$^{10}$ As already mentioned, there are no proper physical states on the lightcone, i.e., with momenta proportional to the lightlike vectors $\delta = \langle 0, 1\rangle$ and $r_{-1} + \delta = \langle 1, 0\rangle$, since these would require transversal polarizations.

So the question is which of the higher-level states can be generated by multiple commutators of the missing level-1 states. As it turns out we will have to add new states at each higher-level root, apart from an exceptional level-2 root which we will exhibit below. We have calculated the following commutators (by means of MAPLE V):

$$\bigl[\,|r_{-1}\rangle,\ A^-_{-3}|r_{-1}\rangle\,\bigr] = A^-_{-3}|a_2\rangle, \qquad (3.9)$$

$$\bigl[\,|r_{-1}\rangle,\ A^-_{-4}|r_{-1}\rangle\,\bigr] = \bigl( -\tfrac38 A^-_{-3}A^-_{-2} - \tfrac58 A^-_{-5} \bigr)|a_2\rangle, \qquad (3.10)$$

$$\bigl[\,|r_{-1}\rangle,\ A^-_{-2}A^-_{-2}|r_{-1}\rangle\,\bigr] = \bigl( -A^-_{-3}A^-_{-2} + A^-_{-5} \bigr)|a_2\rangle, \qquad (3.11)$$

$$\bigl[\,|r_{-1}\rangle,\ A^-_{-5}|r_{-1}\rangle\,\bigr] = \bigl( \tfrac{35}{64} A^-_{-7} + \tfrac{7}{32} A^-_{-5}A^-_{-2} + \tfrac{5}{64} A^-_{-3}A^-_{-2}A^-_{-2} \bigr)|a_2\rangle, \qquad (3.12)$$

$$\bigl[\,|r_{-1}\rangle,\ A^-_{-3}A^-_{-2}|r_{-1}\rangle\,\bigr] = \bigl( -\tfrac{61}{128} A^-_{-7} + \tfrac14 A^-_{-4}A^-_{-3} + \tfrac{7}{64} A^-_{-5}A^-_{-2} + \tfrac{37}{128} A^-_{-3}A^-_{-2}A^-_{-2} \bigr)|a_2\rangle, \qquad (3.13)$$

$$\bigl[\,A^-_{-2}|r_{-1}\rangle,\ A^-_{-3}|r_{-1}\rangle\,\bigr] = \bigl( -\tfrac{83}{128} A^-_{-7} + \tfrac14 A^-_{-4}A^-_{-3} + \tfrac{41}{64} A^-_{-5}A^-_{-2} - \tfrac{21}{128} A^-_{-3}A^-_{-2}A^-_{-2} \bigr)|a_2\rangle, \qquad (3.14)$$

where $a_2 = 2r_{-1} + \tfrac32\delta$ is the tachyonic level-2 root. Furthermore, we have adopted the convention from [17] according to which the DDF operators are always understood to be the ones appropriate for the states on which they act (i.e. $A^-_m(r_{-1})$ on the l.h.s. and $A^-_m(a_2)$ on the r.h.s.).

The first commutator generates an element of the root space associated with $\Lambda = 2r_{-1} + 3\delta$. But since this space is one-dimensional, $\mathrm{mult}(2r_{-1} + 3\delta) = \pi_1(3) = 1$, we infer that we do not need an additional imaginary simple root here (recall that $\mathrm{mult}(2r_{-1} + n\delta) = \mathrm{mult}\,\langle 2, n-2\rangle = \pi_1(2n-3)$). This is, of course, a rather trivial observation because $\langle 2, 1\rangle$ is not a fundamental root anyhow. The next two commutators, leading to states in the root space associated with $\Lambda = 2r_{-1} + 4\delta$, are already more involved. By taking suitable linear combinations we obtain $A^-_{-3}A^-_{-2}|a_2\rangle$ and $A^-_{-5}|a_2\rangle$, which, as one can easily convince oneself, already span the full two-dimensional root space, $\mathrm{mult}(2r_{-1} + 4\delta) = \pi_1(5) = 2$. Consequently, this root space can be entirely generated by commutators of level-1 missing states, which means that $\mu_{2,2} = 0$. This is the only root in the fundamental Weyl chamber which is not simple.

Let us finally consider a generic example. The commutators (3.12)–(3.14) give states with momentum $\Lambda = 2r_{-1} + 5\delta$. Note that the commutators (3.12) and (3.13) are states of spin-3/2 $\mathfrak{sl}_2$-modules built on the vacuum vectors $A^-_{-5}|r_{-1}\rangle$ and $A^-_{-3}A^-_{-2}|r_{-1}\rangle$, respectively. In the notations of the last section (see Eq. (2.21)), they span the two-dimensional space $\mathcal{R}^{(\Lambda)}$, whereas $\mathcal{H}^{(\Lambda)}$ is one-dimensional with basis element given by the commutator (3.14) of two level-1 vacuum vectors. By building suitable linear combinations these states can be simplified somewhat; in this way, we get the three linearly independent states

$$\bigl( A^-_{-7} + \tfrac35 A^-_{-5}A^-_{-2} \bigr)|a_2\rangle, \qquad (3.15)$$

$$\bigl( A^-_{-3}A^-_{-2}A^-_{-2} - \tfrac75 A^-_{-5}A^-_{-2} \bigr)|a_2\rangle, \qquad (3.16)$$

$$\bigl( A^-_{-4}A^-_{-3} + \tfrac{16}{5} A^-_{-5}A^-_{-2} \bigr)|a_2\rangle. \qquad (3.17)$$
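That the simplified states (3.15)–(3.17) span the same three-dimensional space as the commutators (3.12)–(3.14) is a matter of exact linear algebra, which the following sketch checks with rational arithmetic. The coordinate ordering $(A^-_{-7},\ A^-_{-4}A^-_{-3},\ A^-_{-5}A^-_{-2},\ A^-_{-3}A^-_{-2}A^-_{-2})$ and the helper `rank` are choices made here for illustration only; no inner products are computed.

```python
from fractions import Fraction as F

# Coefficients w.r.t. (A_{-7}, A_{-4}A_{-3}, A_{-5}A_{-2}, A_{-3}A_{-2}A_{-2}) acting on |a_2>.
commutators = [
    [F(35, 64), F(0), F(7, 32), F(5, 64)],          # (3.12)
    [F(-61, 128), F(1, 4), F(7, 64), F(37, 128)],   # (3.13)
    [F(-83, 128), F(1, 4), F(41, 64), F(-21, 128)], # (3.14)
]
simplified = [
    [F(1), F(0), F(3, 5), F(0)],   # (3.15)
    [F(0), F(0), F(-7, 5), F(1)],  # (3.16)
    [F(0), F(1), F(16, 5), F(0)],  # (3.17)
]


def rank(rows):
    """Rank of a list of rational vectors via exact Gaussian elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


# Same span: each set has rank 3, and so does their union.
print(rank(commutators), rank(simplified), rank(commutators + simplified))
```

Since the union still has rank 3, each of (3.15)–(3.17) is a linear combination of the three commutators, and one further state is needed to fill a four-dimensional space.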


However, we know that the full root space has dimension $\pi_1(7) = 4$, generated by the longitudinal DDF operators $A^-_{-3}A^-_{-2}A^-_{-2}$, $A^-_{-4}A^-_{-3}$, $A^-_{-5}A^-_{-2}$, $A^-_{-7}$. Hence $\mathcal{J}^{(\Lambda)}$ must be one-dimensional. Indeed, the physical state

$$\bigl( -2457413\,A^-_{-7} + 1354090\,A^-_{-5}A^-_{-2} - 1613422\,A^-_{-4}A^-_{-3} + 157593\,A^-_{-3}A^-_{-2}A^-_{-2} \bigr)|a_2\rangle$$

is orthogonal to the above three states and cannot be generated by commutation. Hence it is a missing state which must be added by hand to arrive at the total count of four. We conclude that $2r_{-1} + 5\delta$ is an imaginary simple root with simple multiplicity $\mu_{2,3} = 1$. Of course, these explicit results are in complete agreement with the Weyl–Kac–Borcherds formula predicting $\mu_{2,2} = 0$ and $\mu_{2,3} = 1$ (cf. Fig. 4).

3.4. Direct sums of lattices. We conclude this section with a remark about direct sums of lattices and how this translates into the associated Lie algebras of physical states. Suppose we have two lattices $\Lambda_1$ and $\Lambda_2$. Then the direct sum $\Lambda := \Lambda_1 \oplus \Lambda_2$ enjoys the following properties (see e.g. [28]):

(i) $\mathrm{rank}\,\Lambda = \mathrm{rank}\,\Lambda_1 + \mathrm{rank}\,\Lambda_2$;
(ii) $\mathrm{sgn}\,\Lambda = \mathrm{sgn}\,\Lambda_1 + \mathrm{sgn}\,\Lambda_2$;
(iii) $\det\Lambda = (\det\Lambda_1)(\det\Lambda_2)$;
(iv) $\Lambda$ is even iff both $\Lambda_1$ and $\Lambda_2$ are even;

where sgn denotes the signature of a lattice. For $\Lambda$ to be even Lorentzian we shall therefore assume that $\Lambda_1$ is even Lorentzian and $\Lambda_2$ is even Euclidean. For example, the root lattice of $E_{10}$ can be decomposed into a direct sum of the unique even selfdual Lorentzian lattice $II_{1,1}$ in two dimensions and the $E_8$ root lattice. More generally, we have $II_{8n+1,1} = II_{1,1} \oplus \Gamma_{8n}$, where $\Gamma_{8n}$ denotes an even selfdual Euclidean lattice of rank $8n$.$^{11}$

$^{11}$ As is well known (see e.g. [10]), there exists only one such lattice for $n = 1$ (associated with $E_8$), two for $n = 2$ (associated with $E_8 \oplus E_8$ and $\mathrm{Spin}(32)/\mathbb{Z}_2$, resp.), and 24 for $n = 3$ (the 24 Niemeier lattices with the famous Leech lattice as one of them). For higher rank, an explicit classification seems impossible. This is due to the explosive growth of the number of even selfdual Euclidean lattices according to the Minkowski–Siegel mass formula which, for example, gives us $8 \times 10^7$ as a lower limit on the number of such lattices with rank 32.

We would like to answer the question how the Lie algebra of physical states in $\mathcal{F}_\Lambda := \mathcal{F}_{\Lambda_1} \otimes \mathcal{F}_{\Lambda_2}$ is built up from the states in $\mathcal{F}_{\Lambda_1}$ and $\mathcal{F}_{\Lambda_2}$. This amounts to rewriting both $\mathcal{P}^1_\Lambda$ and $L_{-1}\mathcal{P}^0_\Lambda$ as direct sums of tensor products of subspaces of $\mathcal{F}_{\Lambda_i}$. Using the facts about tensor products of vertex algebras [14] and that $\mathcal{F}^h_{\Lambda_2} = 0$ for $h < 0$, we deduce that any state $\psi \in \mathcal{P}^1_\Lambda$ is a finite linear combination of the form

$$\psi = \sum_{h=0}^{H} \psi_1^{1-h} \otimes \psi_2^{h},$$

with $\psi_i^h \in \mathcal{F}^h_{\Lambda_i}$ and satisfying

$$\psi_1^{1-h} \otimes L_{2,n}\psi_2^{h} = 0 \quad \text{for } 0 \le h < n,$$

$$L_{1,n}\psi_1^{1-h} \otimes \psi_2^{h} + \psi_1^{1-h-n} \otimes L_{2,n}\psi_2^{h+n} = 0 \quad \text{for } 0 \le h \le H - n, \qquad (3.18)$$

$$L_{1,n}\psi_1^{1-h} \otimes \psi_2^{h} = 0 \quad \text{for } H - n < h \le H,$$

for all $n > 0$. We immediately see that $\psi_2^0 \in \mathcal{P}^0_{\Lambda_2} = \mathbb{R}|0\rangle_2$ and $\psi_1^{1-H} \in \mathcal{P}^{1-H}_{\Lambda_1}$, but it is difficult to extract from the above relations similar information about the other states. Nonetheless, the last two observations are sufficient to pinpoint the gnome Lie algebra inside $\mathfrak{g}_\Lambda$. Namely, by considering the special case $\psi = \psi_1^1 \otimes |0\rangle_2$, we can immediately infer that $\mathfrak{g}_{II_{1,1}} \cong \mathfrak{g}_{II_{1,1}} \otimes |0\rangle_2 \subset \mathfrak{g}_\Lambda$. So the gnome Lie algebra is a Borcherds subalgebra of any Lie algebra of physical states for which the root lattice can be decomposed into a direct sum in such a way that $II_{1,1}$ arises as a sublattice. This in particular holds for the Lie algebras based on the lattices $II_{9,1}$, $II_{17,1}$, and $II_{25,1}$, respectively, the latter being the celebrated fake monster Lie algebra [5].

We can explore the decomposition of $\mathcal{P}^1_\Lambda$ further by the use of the DDF construction. Let us suppose that $\Lambda_1$ is the lattice $II_{1,1}$ and that $\Lambda_2$ has rank $d - 2$ ($> 0$). We shall write vectors in $\Lambda$ as $(r, v)$, where $r \in II_{1,1}$ and $v \in \Lambda_2$, respectively, so that $(r, v)^2 = r^2 + v^2$. We wish to find a tensor product decomposition of the subspace of $\mathcal{P}^1_\Lambda$ which has fixed momentum component $r \in \Lambda_1$, i.e., of the space

$$\mathcal{P}^{1,r}_\Lambda := \mathcal{P}^1_\Lambda \cap \bigoplus_{v \in \Lambda_2} \mathcal{F}_\Lambda^{(r,v)}.$$

The idea is to perform the DDF construction in a clever way such that the $d-2$ transversal directions all belong to the Euclidean lattice $\Lambda_2$ and thus the transversal DDF operators can be identified with the string oscillators in $\mathcal{F}_{\Lambda_2}$. We start from the DDF decomposition $r = a_\ell - (1 - \tfrac12 r^2) k_\ell$ (see Eq. (3.1)), which gives rise to the decomposition

$$(r, v) = \bigl( a_\ell - \tfrac12 v^2 k_\ell,\ v \bigr) - \bigl( 1 - \tfrac12 (r, v)^2 \bigr)(k_\ell, 0)$$

within $\Lambda$. A suitable set of polarization vectors is obtained from any orthonormal basis $\{\xi^i \mid 1 \le i \le d-2\}$ of $\mathbb{R} \otimes_{\mathbb{Z}} \Lambda_2$ by putting $\xi^i \equiv (0, \xi^i)$. From Thm. 1 it follows that

$$\mathcal{P}^{1,r}_\Lambda = \mathrm{span}\bigl\{ A^{i_1}_{-m_1} \cdots A^{i_M}_{-m_M} A^-_{-n_1} \cdots A^-_{-n_N} \,|a_\ell - \tfrac12 v^2 k_\ell,\ v\rangle \bigm| v \in \Lambda_2,\ m_1 + \dots + n_N = 1 - \tfrac12 (r, v)^2 \bigr\}.$$

For fixed $h := \tfrac12 v^2 + \sum_a m_a$, we may identify

$$A^{i_1}_{-m_1} \cdots A^{i_M}_{-m_M} |a_\ell - h k_\ell,\ v\rangle \cong |a_\ell - h k_\ell\rangle_1 \otimes \alpha^{i_1}_{-m_1} \cdots \alpha^{i_M}_{-m_M} |v\rangle_2,$$

or

$$\mathrm{span}\bigl\{ A^{i_1}_{-m_1} \cdots A^{i_M}_{-m_M} |a_\ell - h k_\ell,\ v\rangle \bigr\} \cong |a_\ell - h k_\ell\rangle_1 \otimes \mathcal{F}^h_{\Lambda_2}.$$

If we finally use the fact that $\mathcal{P}^h_{\Lambda_1}$ for any integer $h$ is generated by longitudinal operators we conclude that

$$\mathcal{P}^{1,r}_\Lambda \cong \bigoplus_{h=0}^{1 - \frac12 r^2} \mathcal{P}^{1-h,(r)}_{\Lambda_1} \otimes \mathcal{F}^h_{\Lambda_2}$$

for any $r \in \Lambda_1$. There is one subtlety here concerning the central charge. The longitudinal Virasoro algebra occurring on the right-hand side as spectrum-generating algebra for any $\mathcal{P}^h_{\Lambda_1}$ does not have the naive central charge $c = 24$ (like for the gnome Lie algebra) but rather $c = 26 - d$, the extra contribution coming from the transversal space $\Lambda_2$. So for $d = 26$ we get modulo null states the trivial representation of the longitudinal Virasoro algebra and hence $\mathfrak{g}_\Lambda^{(r)} \cong \mathcal{F}_{\Lambda_2}^{1 - \frac12 r^2}$, in agreement with the literature [6].

4. Missing Modules for $E_{10}$

We now turn to the hyperbolic Kac–Moody algebra $\mathfrak{g}(A) = E_{10}$, which arises as the maximal Kac–Moody subalgebra of the Borcherds algebra $\mathfrak{g}_{II_{9,1}}$ of physical states associated with a subcritical open bosonic string moving in 10-dimensional space-time fully compactified on a torus, so that the momenta lie on the lattice $II_{9,1}$. As such, it plays the same role for $\mathfrak{g}_{II_{9,1}}$ as $\mathfrak{sl}_2$ did for the gnome Lie algebra, but is incomparably more complicated. Again, the central idea is to split the larger algebra $\mathfrak{g}_{II_{9,1}}$ into $E_{10}$ and its orthogonal complement, which can be decomposed into a direct sum of $E_{10}$ lowest and highest weight modules, respectively. Since the root lattice of $E_{10}$ is identical with the momentum lattice $II_{9,1}$, there is no need to extend $E_{10}$ by outer derivations. Thus we start from

$$\mathfrak{g}_{II_{9,1}} = E_{10} \oplus \mathcal{M},$$

where the space of missing states $\mathcal{M}$ decomposes as

$$\mathcal{M} = \mathcal{M}_+ \oplus \mathcal{M}_-, \qquad \mathcal{M}_\pm = \bigoplus_{v \in \mathcal{H}_\pm} \mathcal{U}(E_{10})\,v;$$

each of the (irreducible) $E_{10}$ modules $\mathcal{U}(E_{10})v$ is referred to as a “missing module”. To be sure, this decomposition still does not provide us with an explicit realization of the $E_{10}$ algebra since we know as little about the $E_{10}$ modules as about $E_{10}$ itself (see [13] for some recent progress). On the other hand, we do gain insight by combining the unknown algebra and its unknown modules into something which we understand very well, namely the Lie algebra of physical states $\mathfrak{g}_{II_{9,1}}$, for which a basis is explicitly given in terms of the DDF construction. Moreover, we will formulate a conjecture according to which all higher-level missing states can be obtained by commuting the missing states at level 1, whose structure is completely known. Our explicit tests of this conjecture for the root spaces of $\Lambda_7$ and $\Lambda_1$ constitute highly non-trivial checks, but of course major new insights are required to settle the question for higher levels. We should mention that the results of the previous section immediately show that the conjecture fails for the gnome Lie algebra $\mathfrak{g}_{II_{1,1}}$.

As we have already pointed out, the $\mathfrak{sl}_2$ module structure of the missing states for $\mathfrak{g}_{II_{1,1}}$ is not especially enlightening due to the “smallness” of $\mathfrak{sl}_2$. Here the situation is completely different, because $E_{10}$ and its representations are “huge” (even in comparison with irreducible representations of the affine $E_9$ subalgebra!). If our conjecture were true it would not only take us a long way towards a complete understanding of $E_{10}$ but also provide another hint that $E_{10}$ is very special indeed. Conversely, it would also allow us to understand the Borcherds algebra $\mathfrak{g}_{II_{9,1}}$ by exhibiting its complete set of imaginary simple roots. In addition to the fake monster, the true monster, and the gnome Lie algebra, this would be the fourth example of an explicit realization of a Borcherds algebra.


4.1. Basics of $E_{10}$. As the momentum lattice for the completely compactified string we shall take the unique 10-dimensional even unimodular Lorentzian lattice $II_{9,1}$. It can be defined as the lattice of all points $x = (x_1, \dots, x_9 \mid x_0)$ for which the $x_\mu$'s are all in $\mathbb{Z}$ or all in $\mathbb{Z} + \tfrac12$ and which have integer inner product with the vector $l = (\tfrac12, \dots, \tfrac12 \mid \tfrac12)$, all norms and inner products being evaluated in the Minkowskian metric $x^2 = x_1^2 + \dots + x_9^2 - x_0^2$ (cf. [32]). To identify the maximal Kac–Moody subalgebra of the Borcherds algebra $\mathfrak{g}_{II_{9,1}}$ of physical string states we have to determine a set of real simple roots for the lattice. According to [9], such a set is given by the ten vectors $r_{-1}, r_0, r_1, \dots, r_8$ in $II_{9,1}$ for which $r_i^2 = 2$ and $r_i \cdot \rho = -1$, where the Weyl vector is $\rho = (0, 1, 2, \dots, 8 \mid 38)$ with $\rho^2 = -1240$.$^{12}$ Explicitly,

$r_{-1} = (\ 0,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0,\ 1,-1 \mid 0)$,
$r_0 = (\ 0,\ 0,\ 0,\ 0,\ 0,\ 0,\ 1,-1,\ 0 \mid 0)$,
$r_1 = (\ 0,\ 0,\ 0,\ 0,\ 0,\ 1,-1,\ 0,\ 0 \mid 0)$,
$r_2 = (\ 0,\ 0,\ 0,\ 0,\ 1,-1,\ 0,\ 0,\ 0 \mid 0)$,
$r_3 = (\ 0,\ 0,\ 0,\ 1,-1,\ 0,\ 0,\ 0,\ 0 \mid 0)$,
$r_4 = (\ 0,\ 0,\ 1,-1,\ 0,\ 0,\ 0,\ 0,\ 0 \mid 0)$,
$r_5 = (\ 0,\ 1,-1,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0 \mid 0)$,
$r_6 = (-1,-1,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0 \mid 0)$,
$r_7 = (\tfrac12, \tfrac12, \tfrac12, \tfrac12, \tfrac12, \tfrac12, \tfrac12, \tfrac12, \tfrac12 \mid \tfrac12)$,
$r_8 = (\ 1,-1,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0 \mid 0)$.

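These simple roots indeed generate the reflection group of $II_{9,1}$. The corresponding Coxeter–Dynkin diagram associated with the Cartan matrix $a_{ij} := r_i \cdot r_j$ looks as follows:

[Coxeter–Dynkin diagram of $E_{10}$, ten nodes: $r_{-1}, r_0, r_1, \dots, r_7$ form a chain, and $r_8$ is attached to $r_5$.]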
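The properties quoted for the simple roots can be checked directly from the coordinates listed above. The sketch below uses exact rational arithmetic and verifies $r_i^2 = 2$, $r_i \cdot \rho = -1$, $\rho^2 = -1240$, and the determinant of the Cartan matrix; it also evaluates the combination $r_0 + 2r_1 + 3r_2 + 4r_3 + 5r_4 + 6r_5 + 4r_6 + 2r_7 + 3r_8$ built from the affine $E_8$ marks. The helper names `dot` and `determinant` are introduced here for illustration.

```python
from fractions import Fraction as F

half = F(1, 2)
# Coordinates (x1, ..., x9 | x0); the last entry is the timelike component.
roots = {
    -1: [0, 0, 0, 0, 0, 0, 0, 1, -1, 0],
    0: [0, 0, 0, 0, 0, 0, 1, -1, 0, 0],
    1: [0, 0, 0, 0, 0, 1, -1, 0, 0, 0],
    2: [0, 0, 0, 0, 1, -1, 0, 0, 0, 0],
    3: [0, 0, 0, 1, -1, 0, 0, 0, 0, 0],
    4: [0, 0, 1, -1, 0, 0, 0, 0, 0, 0],
    5: [0, 1, -1, 0, 0, 0, 0, 0, 0, 0],
    6: [-1, -1, 0, 0, 0, 0, 0, 0, 0, 0],
    7: [half] * 10,
    8: [1, -1, 0, 0, 0, 0, 0, 0, 0, 0],
}
rho = [0, 1, 2, 3, 4, 5, 6, 7, 8, 38]
marks = {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 4, 7: 2, 8: 3}


def dot(x, y):
    """Minkowskian inner product x1*y1 + ... + x9*y9 - x0*y0."""
    return sum(a * b for a, b in zip(x[:9], y[:9])) - x[9] * y[9]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[F(v) for v in row] for row in matrix]
    det = F(1)
    size = len(m)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return F(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return det


labels = sorted(roots)
cartan = [[dot(roots[i], roots[j]) for j in labels] for i in labels]
null_root = [sum(marks[i] * roots[i][k] for i in marks) for k in range(10)]

print("norms:", {i: dot(roots[i], roots[i]) for i in labels})
print("r_i . rho:", {i: dot(roots[i], rho) for i in labels})
print("rho^2 =", dot(rho, rho), " det A =", determinant(cartan))
print("sum of marks * roots =", null_root)
```

The pairs with $r_i \cdot r_j = -1$ found this way are exactly the links of the diagram: consecutive roots in the chain $r_{-1}, \dots, r_7$, together with the pair $(r_5, r_8)$.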

The algebra $\mathfrak{g}(A)$ is the hyperbolic Kac–Moody algebra $E_{10}$, defined in terms of generators and relations (2.16). Moreover, from $|\det A| = 1$ we infer that the root lattice $Q(E_{10})$ indeed coincides with $II_{9,1}$, and hence $\hat{\mathfrak{g}}(A) \equiv \mathfrak{g}(A)$ here. The $E_9$ null root is

$$\delta = \sum_{i=0}^{8} n_i r_i = (0, 0, 0, 0, 0, 0, 0, 0, 1 \mid 1),$$

where the marks $n_i$ can be read off from

            3
0 1 2 3 4 5 6 4 2 .

The fundamental Weyl chamber $\mathcal{C}$ of $E_{10}$ is the convex cone generated by the fundamental weights $\Lambda_i$,$^{13}$

$$\Lambda_i = -\sum_{j=-1}^{8} (A^{-1})_{ij}\, r_j \quad \text{for } i = -1, 0, 1, \dots, 8,$$

where $A^{-1}$ is the inverse Cartan matrix. Thus,

$$\Lambda \in \mathcal{C} \iff \Lambda = \sum_{i=-1}^{8} k_i \Lambda_i,$$

for $k_i \in \mathbb{Z}_+$.

$^{12}$ Note that $\rho$ fulfills all the requirements of a grading vector for $\mathfrak{g}_{II_{9,1}}$.

$^{13}$ Notice that our convention is opposite to the one adopted in [25]. The fundamental weights here are positive and satisfy $\Lambda_i \cdot r_j = -\delta_{ij}$.

A special feature of $E_{10}$ is that we need not distinguish between root and weight lattice, since these are the same for self-dual root lattices.$^{14}$ Note also that the null root plays a special role: the first fundamental weight is just $\Lambda_{-1} = \delta$, and all null-vectors in $\mathcal{C}$ must be multiples of $\Lambda_{-1}$ since $\Lambda_i^2 < 0$ for all other fundamental weights.

We can employ the affine null root to introduce a $\mathbb{Z}$-grading of $E_{10}$. If we introduce the so-called level $\ell$ of a root $\Lambda \in \Delta(E_{10})$ by $\ell := -\Lambda \cdot \delta$, then we may decompose the algebra into a direct sum of subspaces of fixed level, viz.

$$E_{10} = \bigoplus_{\ell \in \mathbb{Z}} E_{10}^{[\ell]},$$

where

$$E_{10}^{[0]} \cong E_9, \qquad E_{10}^{[\ell]} := \bigoplus_{\substack{\Lambda \in \Delta(E_{10}) \\ -\Lambda\cdot\delta = \ell}} E_{10}^{(\Lambda)} \quad \text{for } \ell \ne 0.$$

Besides the obvious fact that $\ell$ counts the number of $e_{-1}$ (resp. $f_{-1}$) generators in multiple commutators, the level derives its importance from the fact that it grades the algebra $E_{10}$ with respect to its affine subalgebra $E_9$ [12]. The subspaces belonging to a fixed level can be decomposed into irreducible representations of $E_9$, the level being equal to the eigenvalue of the central term of the $E_9$ algebra on this representation (hence the full $E_{10}$ algebra contains $E_9$ representations of all integer levels!). Let us emphasize that for general hyperbolic algebras there would be a separate grading associated with every regular affine subalgebra, and therefore the graded structure would no longer be unique. Using the Jacobi identity it is possible to represent any subspace of fixed level in the form

$$E_{10}^{[\ell]} = \underbrace{\bigl[ E_{10}^{[1]}, \bigl[ E_{10}^{[1]}, \dots \bigl[ E_{10}^{[1]}, E_{10}^{[1]} \bigr] \dots \bigr] \bigr]}_{\ell \text{ times}},$$

for $\ell > 0$, and in an analogous form for $\ell < 0$. This simple fact turns out to be extremely useful in connection with the DDF construction, as soon as one wishes to effectively construct higher-level elements of $E_{10}$.

Little is known about the general structure of this algebra. Partial progress has been made in determining the multiplicity of certain roots. Although the general form of the multiplicity formulas for arbitrary levels appears to be beyond reach for the moment, the following results for levels $\ell \le 2$ have been established. For $\ell = 0$ and $\ell = 1$, we have $\mathrm{mult}_{E_{10}}(\Lambda) = p_8(1 - \tfrac12 \Lambda^2)$ (see [24]), i.e., the multiplicities are just given by the number of transversal states; as was demonstrated in [17] the corresponding states are indeed transversal. For $\ell = 2$, it was shown in [25] that $\mathrm{mult}_{E_{10}}(\Lambda) = \xi(3 - \tfrac12 \Lambda^2)$, where

$$\sum_n \xi(n)\, q^n = \frac{1}{\phi(q)^8} \left[ 1 - \frac{\phi(q^2)}{\phi(q^4)} \right],$$

$\phi(q)$ denoting the Euler function as before.

$^{14}$ In the remainder, we will consequently denote arbitrary roots by $\Lambda$ and reserve the letter $r$ for real roots.
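The two generating functions above are easy to expand to low order. The following sketch computes the first coefficients $p_8(n)$ of $1/\phi(q)^8$ and $\xi(n)$ with truncated integer power series, assuming $\phi(q) = \prod_{n \ge 1}(1 - q^n)$. The helpers `series_mul`, `euler_phi`, and `series_inverse` are ad hoc names, and `ORDER` is an arbitrary truncation chosen here.

```python
ORDER = 8  # truncate all series at q^ORDER


def series_mul(a, b, order=ORDER):
    """Product of two truncated power series given as coefficient lists."""
    out = [0] * (order + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > order:
                break
            out[i + j] += ai * bj
    return out


def euler_phi(step=1, order=ORDER):
    """Series of phi(q^step) = prod_{n>=1} (1 - q^(step*n)) up to q^order."""
    series = [1] + [0] * order
    n = 1
    while step * n <= order:
        factor = [0] * (order + 1)
        factor[0] = 1
        factor[step * n] = -1
        series = series_mul(series, factor, order)
        n += 1
    return series


def series_inverse(a, order=ORDER):
    """Inverse of a power series with constant term 1."""
    inv = [1] + [0] * order
    for n in range(1, order + 1):
        inv[n] = -sum(a[k] * inv[n - k] for k in range(1, n + 1))
    return inv


inv_phi = series_inverse(euler_phi(1))
p8 = [1] + [0] * ORDER
for _ in range(8):
    p8 = series_mul(p8, inv_phi)          # 1 / phi(q)^8

ratio = series_mul(euler_phi(2), series_inverse(euler_phi(4)))  # phi(q^2)/phi(q^4)
bracket = [-c for c in ratio]
bracket[0] += 1                           # 1 - phi(q^2)/phi(q^4)
xi = series_mul(bracket, p8)

print("p8:", p8)
print("xi:", xi)
```

For a level-2 root with $\Lambda^2 = -4$ or $\Lambda^2 = -6$ the relevant coefficients are $\xi(5)$ and $\xi(6)$, respectively.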


Beyond $\ell = 2$, no general formula seems to be known, although for $\ell = 3$ the multiplicity problem was recently solved [2]. However, the resulting formulas are somewhat implicit and certainly more cumbersome than the above results. Of course, if one is only interested in a particular root, the relevant multiplicity can always be determined by means of the Peterson recursion formula (see e.g. [27]).

4.2. Lowest and highest weight modules of $E_{10}$. We know from Thm. 5 that $\mathcal{M}_+$ (resp. $\mathcal{M}_-$) decomposes into a direct sum of lowest (resp. highest) weight modules for $E_{10}$. As before, $\mathcal{H}_\pm$ denotes the subspace spanned by the corresponding lowest and highest weight states, respectively. Clearly, $\mathcal{H}_\pm$ inherits from $\mathfrak{g}_{II_{9,1}}$ the grading by the level,

$$\mathcal{H}_\pm = \bigoplus_{\ell \gtrless 0} \mathcal{H}^{[\ell]}, \qquad \mathcal{H}^{[\ell]} := \mathcal{H} \cap \mathfrak{g}_{II_{9,1}}^{[\ell]}.$$

Since the Chevalley involution provides an isomorphism between $\mathcal{H}^{[\ell]}$ and $\mathcal{H}^{[-\ell]}$ and since we are ultimately interested in identifying the imaginary simple roots and their multiplicities, it is sufficient to restrict the explicit analysis to $\mathcal{H}_+$. We will first study the structure of the space $\mathcal{H}^{[1]}$ and will explicitly demonstrate how it is made up of purely longitudinal DDF states. Intuitively, this is what one should expect. Recall that the level-1 sector of $E_{10}$ is isomorphic to the basic representation of $E_9$ (cf. [12]); in terms of the DDF construction, it is generated by the transversal states built on $|r_{-1}\rangle$, i.e., it is spanned by all states of the form $A^{j_1}_{-m_1} \cdots A^{j_k}_{-m_k} |r_{-1}\rangle$ and their orbits under the action of the $E_9$ affine Weyl group [17]. Thus the longitudinal states at level 1 do not belong to $E_{10}$ and must be counted as missing states. Furthermore, the level-1 transversal DDF operators can be identified with the adjoint action of appropriate $E_9$ elements (corresponding to multiples of the affine null root). Hence the purely longitudinal DDF states built on the level-1 roots of $E_{10}$ are candidates for missing lowest weight vectors. But apparently this set can be further reduced, because each (real) level-1 root of $E_{10}$ is conjugated to some root of the form $r_{-1} + M\delta$ ($M \ge 0$) under the action of the affine Weyl group. So we end up with purely longitudinal states built on $|r_{-1}\rangle$, the same set we already encountered in Sect. 3.3 for the case of the gnome Lie algebra! And indeed, we have

Proposition 1. The space of missing level-1 lowest weight vectors consists of purely longitudinal DDF states built on $|r_{-1}\rangle$,

$$\mathcal{H}^{[1]} = \mathrm{span}\bigl\{ A^-_{-n_1} \cdots A^-_{-n_N} |r_{-1}\rangle \bigm| n_1 \ge n_2 \ge \dots \ge n_N \ge 2 \bigr\},$$

i.e., it is (modulo null states) the longitudinal Virasoro–Verma module with $|r_{-1}\rangle$ as highest weight vector. In particular, $r_{-1} + n\delta$ for any $n \ge 2$ is an imaginary simple root for $\mathfrak{g}_{II_{9,1}}$ with multiplicity $\mu(r_{-1} + n\delta) = \pi_1(n)$.

Proof.
Let us consider the state

$$v_\Lambda := A^-_{-n_1}(r_{-1}) \cdots A^-_{-n_N}(r_{-1})\,|r_{-1}\rangle$$

with momentum $\Lambda := r_{-1} + M\delta$, $M := \sum_j n_j > 1$. We first check that, under the adjoint action in $\mathfrak{g}_{II_{9,1}}$, it is a lowest weight vector for the basic representation of $E_9$. Acting with either of the affine Chevalley generators $e_i = |r_i\rangle$ and $f_i = -|-r_i\rangle$ ($i = 0, 1, \dots, 8$) on $v_\Lambda$, we can move it through the longitudinal DDF operators by the use of the general “intertwining relation” [18]

$$E^r_n A^-_m(r_{-1}) = A^-_m(a')\, E^r_n,$$

where $a' := r_{-1} + r + \delta$ and $E^r_n$ denotes the step operator associated with the real affine root $r + n\delta$. Thereby we end up with the same state but where the Chevalley generator now acts on $|r_{-1}\rangle$. The latter, however, is just a lowest weight vector for the basic representation of $E_9$, viz.

$$f_i |r_{-1}\rangle = 0, \qquad e_i^{\,1 - r_i\cdot r_{-1}} |r_{-1}\rangle = 0, \qquad \text{for } i = 0, 1, \dots, 8,$$

which is readily seen by inspection of the momenta. Indeed, $(r_{-1} - r_i)^2 = \bigl( r_{-1} + (1 - r_i\cdot r_{-1})\, r_i \bigr)^2 = 2(2 - r_i\cdot r_{-1}) \ge 4$, contradicting the mass shell condition (2.4). Since $h_i(v_\Lambda) = (r_i\cdot\Lambda)\, v_\Lambda = -\delta_{i0}\, v_\Lambda$, we conclude that $v_\Lambda$ is a vacuum vector for the adjoint action of $E_9$ generating the basic representation. According to [17] it is given by the transversal states built on $v_\Lambda$, i.e., $\mathcal{U}(E_9)v_\Lambda$ is spanned by the states $A^{j_1}_{-m_1} \cdots A^{j_k}_{-m_k} v_\Lambda$, where $A^j_m \equiv A^j_m(r_{-1})$.

To show that the state $v_\Lambda$ is a lowest weight vector for the full $E_{10}$ algebra, we have to check the remaining two Chevalley generators. Again by momentum conservation, the state $f_{-1}(v_\Lambda) = -\bigl[\,|-r_{-1}\rangle, v_\Lambda \bigr]$ has momentum $M\delta$. But since the physical states associated with lightlike momentum are purely transversal and are elements of $E_{10}$, the resulting missing state must be a null state (or vanishes identically). Within $\mathfrak{g}_{II_{9,1}}$, we therefore have $f_{-1}(v_\Lambda) = 0$. On the other hand, acting with the Chevalley generator $e_{-1}$ on $v_\Lambda$ repeatedly, say $k$ times, we obtain a state of momentum $\lambda = (1+k)\, r_{-1} + M\delta$. By the mass shell condition, this state identically vanishes for $\lambda^2 = 2(1+k)(1+k-M) > 2$, i.e., $k > M - 1 = 1 - r_{-1}\cdot\Lambda$. For $k = M - 1$, the momentum vector $\lambda$ is lightlike, and by the same reasoning as before we conclude that the state is null also for this value of $k$. Altogether, we have shown that

$$f_i(v_\Lambda) = (e_i)^{1 - r_i\cdot\Lambda}(v_\Lambda) = 0 \quad \text{for } i = -1, 0, 1, \dots, 8.$$

These are the defining conditions for $v_\Lambda$ to be a lowest weight state for $E_{10}$. Since $\mathrm{ad}\,h_i = r_i \cdot p$, it is clear that the lowest weight is just $\Lambda$.

The fact that $f_{-1}$ annihilates the state $v_\Lambda$ in particular implies that we can “only go up” in level (for positive level lowest weight states) and that it is not possible to cross the line $\ell = 0$ by the action of $E_{10}$. In the context of representation theory of hyperbolic Kac–Moody algebras (see [13]), the above result provides the first examples of explicit realizations of unitary irreducible lowest weight representations of the hyperbolic algebra $E_{10}$. More specifically, they are associated with lowest weights $\Lambda_0 + m\Lambda_{-1}$ for any $m \ge 0$. By commutation we even obtain an infinite set of missing lowest-weight vectors with lowest weights $\ell\Lambda_0 + m\Lambda_{-1}$ for any $\ell \ge 1$ and $m \ge 0$, on which we can build irreducible $E_{10}$ modules. Analogous statements can be also made for other hyperbolic algebras when we replace $II_{9,1}$ by the root lattice of the hyperbolic algebra. Due to the string realization this lattice should be even and Lorentzian, conditions which rule out some hyperbolic algebras (see e.g. [30] for a list of them).

The next question is now whether $\mathfrak{g}_{II_{9,1}}$ also provides realizations of other lowest weight representations of $E_{10}$. The results of the following section suggest that this may not be the case. More specifically, we are led to make the following


O. B¨arwald, R. W. Gebert, M. G¨unaydin, H. Nicolai

Conjecture 1. There are no imaginary simple roots for $\mathfrak{g}_{II_{9,1}}$ at level 2 or higher, i.e., the Lie algebra of missing lowest weight states, $H_+$, is a free algebra generated by the states given in Prop. 1.

Note that for the true monster Lie algebra the analogous claim is actually valid: the imaginary simple roots are all of level 1. On the other hand, the conjecture obviously fails for the gnome Lie algebra. The reason for this is that the root spaces in the former example are much bigger (due to the “hidden” extra 24 dimensions of the moonshine module), even though the maximal Kac–Moody algebra in both examples is the same, namely $sl_2$. This appears to suggest that $E_{10}$ has just the right size so that the missing modules built on elements of the free Lie algebra over $H^{[1]}$ precisely fill up $E_{10}$ to the full Lie algebra of physical states. At present, we are not aware of any convincing general argument in favour of the above conjecture. In the next subsection, however, we will verify it for two explicitly constructed non-trivial root spaces. More specifically, we will consider a 201 = 192 + 9 dimensional and a 780 = 727 + 53 dimensional level-2 root space, respectively, where the first contribution in each sum equals the dimension of the $E_{10}$ root space and the second term is the dimension of the space of missing states. We will show for both examples that all the missing states are contained in $E_{10}$ modules built on level-1 missing lowest weight vectors or on commutators of them. Of course, these two zeros could be accidental like in the case of the gnome Lie algebra where we also found a zero at level 2 (see Fig. 4). In the latter example, this was not unexpected since the root multiplicities in this region of the fundamental cone are very low, anyway. For the $E_{10}$ algebra, by contrast, there is no apparent reason why all missing states in certain level-2 root spaces should belong to $E_{10}$ modules of the conjectured type. The fact that they do in the cases we have studied constitutes our primary motivation for the above conjecture.

4.3. Examples: $\Lambda_7$ and $\Lambda_1$. We use the same system of polarization vectors and DDF decomposition as in [17], which we recall here for convenience. Explicitly, $\Lambda_7$ is given by

$$\Lambda_7 = \begin{bmatrix} 7 \\ 2\;\,4\;\,6\;\,8\;\,10\;\,12\;\,14\;\,9\;\,4 \end{bmatrix} = (0, 0, 0, 0, 0, 0, 0, 0, 0 \,|\, 2),$$

so $\Lambda_7^2 = -4$. Its decomposition into two level-1 tachyonic roots is $\Lambda_7 = r + s + 2\delta$, where

$$r := r_{-1} = \begin{bmatrix} 0 \\ 1\,0\,0\,0\,0\,0\,0\,0\,0 \end{bmatrix} = (0, 0, 0, 0, 0, 0, 0, 1, -1 \,|\, 0), \qquad s := \begin{bmatrix} 1 \\ 1\,2\,2\,2\,2\,2\,2\,1\,0 \end{bmatrix} = (0, 0, 0, 0, 0, 0, 0, -1, -1 \,|\, 0).$$

Since $n = 1 - \frac12\Lambda_7^2 = 3$, we have the DDF decomposition $\Lambda_7 = a - 3k$, where $k := -\frac12\delta$ and $a := r + s - k = (0, 0, 0, 0, 0, 0, 0, 0, -\frac32 \,|\, \frac12)$. As for the three sets of polarization vectors associated with the tachyon momenta $|r\rangle$, $|s\rangle$ and $|a\rangle$, respectively, a convenient choice is

Missing Modules, the Gnome Lie Algebra, and E10


$\xi^\alpha \equiv \xi^\alpha(r) = \xi^\alpha(s) = \xi^\alpha(a)$ for $\alpha = 1, \dots, 7$, with

$\xi^1 := (1, 0, 0, 0, 0, 0, 0, 0, 0 \,|\, 0)$, ..., $\xi^7 := (0, 0, 0, 0, 0, 0, 1, 0, 0 \,|\, 0)$;
$\xi^8(r) := (0, 0, 0, 0, 0, 0, 0, 1, 1 \,|\, 1)$,
$\xi^8(s) := (0, 0, 0, 0, 0, 0, 0, -1, 1 \,|\, 1)$,
$\xi^8 \equiv \xi^8(a) := (0, 0, 0, 0, 0, 0, 0, 1, 0 \,|\, 0)$.

The little group is $W(\Lambda_7, \delta) = W(D_8) = S_8 \ltimes (Z_2)^7$ of order $2^{14}\,3^2\,5^1\,7^1$. We only have to evaluate the following commutator, where $\epsilon$ denotes a cocycle-factor:

$$\big[\,|s\rangle, A^-_{-2}|r\rangle\big] = \epsilon\Big\{ -\tfrac12 A^-_{-3} - \tfrac56 A^8_{-1}A^8_{-1}A^8_{-1} + \tfrac13 A^8_{-3} + \tfrac12 \sum_{\mu=1}^{7} A^\mu_{-1}A^\mu_{-1}A^8_{-1} \Big\}|a\rangle.$$

To identify the remaining missing states, we act on this state with the little Weyl group (which leaves the longitudinal contribution invariant): $S_8$ permutes all transversal polarizations, and hence generates another seven states. To see that the longitudinal state can be separated from the transversal ones, we act with $w_0 \cdots w_5 w_8 w_6 w_5 \cdots w_0$ on the above state; this operation switches the relative sign between the transversal and the longitudinal terms. Altogether we can thus isolate the following nine states:

$A^-_{-3}|a\rangle$ (1 state),
$\big\{2A^i_{-3} - 8A^i_{-1}A^i_{-1}A^i_{-1} + 3A^i_{-1}\sum_{j=1}^{8} A^j_{-1}A^j_{-1}\big\}|a\rangle$ (8 states).
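As a consistency check on the two displays above (a reading of the printed formulas, not an additional result), one can set $i = 8$ in the eight-state family and split $\sum_{j=1}^{8}$ into $\sum_{\mu=1}^{7}$ plus the $j = 8$ term. Since the creation operators $A^j_{-1}$ commute with each other,

$$2A^8_{-3} - 8A^8_{-1}A^8_{-1}A^8_{-1} + 3A^8_{-1}\sum_{j=1}^{8} A^j_{-1}A^j_{-1} = 2A^8_{-3} - 5A^8_{-1}A^8_{-1}A^8_{-1} + 3\sum_{\mu=1}^{7} A^\mu_{-1}A^\mu_{-1}A^8_{-1},$$

which is exactly 6 times the transversal part $-\tfrac56 A^8_{-1}A^8_{-1}A^8_{-1} + \tfrac13 A^8_{-3} + \tfrac12\sum_{\mu=1}^{7} A^\mu_{-1}A^\mu_{-1}A^8_{-1}$ of the commutator. The count $192 + 9 = 201$ then matches the dimension quoted for the level-2 root space.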

We use Roman letters $i, j$ running from 1 to 8 to label the transversal indices. These nine states indeed span the orthogonal complement of the 192-dimensional root space $E_{10}^{(\Lambda_7)}$ in $\mathfrak{g}_{II_{9,1}}^{(\Lambda_7)}$, as was already noticed in [19] where the result was derived by a completely different approach based on multistring vertices and overlap identities.

Our second (more involved) example is the fundamental root $\Lambda_1$ given by

$$\Lambda_1 = \begin{bmatrix} 9 \\ 2\;\,4\;\,6\;\,9\;\,12\;\,15\;\,18\;\,12\;\,6 \end{bmatrix} = (0, 0, 0, 0, 0, 0, 1, 1, 1 \,|\, 3),$$

hence $\Lambda_1^2 = -6$ (our conventions used here are the same as in [1]). We have the DDF decomposition $\Lambda_1 = a - 4k$, where $k = -\frac12\delta$ and $a := \Lambda_1 + 4k = (0, 0, 0, 0, 0, 0, 1, 1, -1 \,|\, 1)$. We will need two different decompositions of $\Lambda_1$ into level-1 roots, namely:

1. $\Lambda_1 = r + s + 3\delta$ with

$$r := \begin{bmatrix} 0 \\ 1\,0\,0\,0\,0\,0\,0\,0\,0 \end{bmatrix} = (0, 0, 0, 0, 0, 0, 0, 1, -1 \,|\, 0), \qquad s := \begin{bmatrix} 0 \\ 1\,1\,0\,0\,0\,0\,0\,0\,0 \end{bmatrix} = (0, 0, 0, 0, 0, 0, 1, 0, -1 \,|\, 0);$$

2. $\Lambda_1 = r' + s' + 2\delta$ with

$$r' := \begin{bmatrix} 0 \\ 1\,1\,1\,0\,0\,0\,0\,0\,0 \end{bmatrix} = (0, 0, 0, 0, 0, 1, 0, 0, -1 \,|\, 0), \qquad s' := \begin{bmatrix} 3 \\ 1\,1\,1\,3\,4\,5\,6\,4\,2 \end{bmatrix} = (0, 0, 0, 0, 0, -1, 1, 1, 0 \,|\, 1).$$
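The momenta quoted for the two examples can be cross-checked with exact rational arithmetic. The following is a minimal sketch, not part of the original construction: it assumes the Lorentzian product of signature $(+,\dots,+\,|\,-)$ on the ten coordinates, and it solves for $\delta$ from $\Lambda_7 = r + s + 2\delta$ because the null root is not printed in coordinates here. All function and variable names are illustrative.

```python
from fractions import Fraction as F

def vec(*coords):
    """Ten components: nine spacelike entries followed by the timelike one."""
    return tuple(F(c) for c in coords)

def dot(x, y):
    """Lorentzian product, assumed signature (+,...,+ | -)."""
    return sum(a * b for a, b in zip(x[:9], y[:9])) - x[9] * y[9]

def add(*vs):
    return tuple(sum(c) for c in zip(*vs))

def scale(c, v):
    return tuple(F(c) * a for a in v)

# Momenta as quoted in the text
Lambda7 = vec(0, 0, 0, 0, 0, 0, 0, 0, 0, 2)
r7 = vec(0, 0, 0, 0, 0, 0, 0, 1, -1, 0)
s7 = vec(0, 0, 0, 0, 0, 0, 0, -1, -1, 0)
Lambda1 = vec(0, 0, 0, 0, 0, 0, 1, 1, 1, 3)
r1 = vec(0, 0, 0, 0, 0, 0, 0, 1, -1, 0)
s1 = vec(0, 0, 0, 0, 0, 0, 1, 0, -1, 0)
r1p = vec(0, 0, 0, 0, 0, 1, 0, 0, -1, 0)
s1p = vec(0, 0, 0, 0, 0, -1, 1, 1, 0, 1)

# Assumption: delta is obtained by solving Lambda7 = r + s + 2 delta
delta = scale(F(1, 2), add(Lambda7, scale(-1, r7), scale(-1, s7)))
k = scale(F(-1, 2), delta)

def ddf_momentum(Lam):
    """Return (n, a) with n = 1 - Lam^2 / 2 and a = Lam + n k."""
    n = 1 - dot(Lam, Lam) / 2
    return n, add(Lam, scale(n, k))

n7, a7 = ddf_momentum(Lambda7)
n1, a1 = ddf_momentum(Lambda1)
```

With these definitions both decompositions of $\Lambda_1$ reproduce the same vector, and the tachyon momenta $a$ satisfy $a^2 = 2$ and $a\cdot k = 1$.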

Although we will need several sets of polarization vectors adjusted to these different decompositions, we will present the basis using the following set, which is adjusted to the first decomposition: for α = 1, . . . , 7, ξ α ≡ ξ α (r) = ξ α (s) = ξ α (a) ξ 1 = (1, 0, 0, 0, 0, 0, 0, 0, 0 | 0), .. . ξ 6 = (0, 0, 0, 0, 0, 1, 0, 0, 0 | 0), √ ξ 7 = 21 2(0, 0, 0, 0, 0, 0, 1, 1, 1 | 1), √ ξ 8 (a) = 21 2(0, 0, 0, 0, 0, 0, 1, −1, 0 | 0), √ ξ 8 (r) = 21 2(0, 0, 0, 0, 0, 0, −1, 1, 1 | 1), √ ξ 8 (s) = 21 2(0, 0, 0, 0, 0, 0, 1, −1, 1 | 1). The little Weyl group, W(31 , δ), which is isomorphic to Z2 ×W(E7 ) in this case, acts on this set by permuting ξ 1 , . . . ξ 6 , as a Z2 on ξ 8 and by a more complicated transformation on ξ 7 . We worked out the following commutator equations, denoting some irrelevant cocycle factor: n P √ 8 − 7 µ µ 1 1 ν ν A A A A − 2A−1 A−3 − 43 A8−1 A8−1 A− |ri = |si, A− −1 −1 −1 −1 −2 µ,ν=1 8 2 −3 P 7 5 8 − 43 µ=1 Aµ−1 Aµ−1 A8−1 A8−1 − 24 A−1 A8−1 A8−1 A8−1 o P7 P7 µ µ 1 − 56 A8−1 A8−3 + 41 µ=1 Aµ−1 Aµ−1 A− + A A −2 µ=1 −1 −3 |ai, 2

n |si, A8−1 A− |ri = −2

√ 8 8 8 8 √ P7 5 2A−1 A−1 A−1 A−1 − 32 2 µ=1 Aµ−1 Aµ−1 A8−1 A8−1 √ √ P 7 7 1 + 16 2A8−1 A8−3 − 16 2 µ=1 Aµ−1 Aµ−3 √ √ P 7 − − 1 1 + 16 2A−2 A−2 − 64 2 µ,ν=1 Aµ−1 Aµ−1 Aν−1 Aν−1 o √ P7 − 43 A8−1 A8−1 A8−2 + 41 µ=1 Aµ−1 Aµ−1 A8−2 − 41 2A− −4 |ai, 7 64

n √ − − P7 1 2A−2 A−2 − 43 A8−1 A8−1 A8−2 + 41 µ=1 Aµ−1 Aµ−1 A8−2 − 16 A8−1 |si, A− −2 |ri = √ 8 8 √ 8 8 8 8 7 7 − 16√ 2A−1 A−3 − 64 2A−1 A−1 A−1√ A−1 P7 P7 1 1 + 64 2 µ,ν=1 Aµ−1 Aµ−1 Aν−1 Aν−1 + 16 2 oµ=1 Aµ−1 Aµ−3 √ P7 √ 5 + 32 2 µ=1 Aµ−1 Aµ−1 A8−1 A8−1 + 41 2A− −4 |ai, n √ 8 µ √ µ P7 µ 1 1 ν ν A−3 − 16 2Aµ−1 A8−3 A−1 |si, A− ν=1 A−1 A−1 A−2 + 2 2A−1√ −2 |ri = − 4 µ µ 3 8 1 8 8 8 8 − 21 Aµ−1 A− −3 + 4 A−1 A−1 A−2 +o12 2A−1 A−1 A−1 A−1 √ P 7 µ 1 ν ν 8 − 4 2 ν=1 A−1 A−1 A−1 A−1 |ai,


n √ P7 − 43 √ A8−1 A8−1 Aµ−2 + 41 ν=1 Aν−1 Aν−1 Aµ−2 + 21 2A8−1 Aµ−3 |ri = |si, Aµ−1 A− −2 √ µ 1 8 8 8 − 16 2Aµ−1 A8−3 − 21 Aµ−1 A− −3 + o 12 2A−1 A−1 A−1 A−1 √ P 7 − 41 2 ν=1 Aν−1 Aν−1 Aµ−1 A8−1 |ai.

We need one more commutator, associated with a second DDF decomposition. Namely, n P 0 7 µ µ 1 1 8 8 8 8 8 8 0 |s i, A− µ=1 A−1 A−1 A−1 A−1 + 64 A−1 A−1 A−1 A−1 −2 |r i = 32 P P 7 7 µ µ 1 + 16 + 1 Aµ−1 Aµ−1 Aν−1 Aν−1 µ=1 A−1 A−3 √ 6 7 √ √ 647 µ,ν=1 − − 1 − 1 1 − 16√A−2 A−2 − 3 3A−1 A−3 + 13 A6−1 A− −3 6 + 6 2A−3 A−1 √ + 41 √2A6−1 A7−1 A8−1 A8−1 − 13 2A6−1 A7−1 A7−1 A7−1 1 8 + 16 2A6−1 A7−3 + 16 A−3 A8−1 − 13 A6−3 A6−1 − 41 A6−1 A6−1 A8−1 A8−1 1 7 1 7 7 − 6 A−3 A−1 − 8 A−1 A7−1 A8−1 A8−1 + 13 A6−1 A6−1 A6−1 A6−1 P7 1 7 + 12 A−1 A7−1 A7−1 A7−1 − 41 µ=1 Aµ−1 Aµ−1 A6−1 A6−1 P7 − 18 µ=1 Aµ−1 Aµ−1 A7−1 A7−1 + A6−1 A6−1 A7−1 A7−1 + 41 A− −4o √ √ P7 − 23 2A6−1 A6−1 A6−1 A7−1 + 41 2 µ=1 Aµ−1 Aµ−1 A6−1 A7−1 |ai. We displayed this result using the basis of polarization associated with the first decomposition. Appropriate linear combinations and the little Weyl group action lead to the following 53 states, spanning the orthogonal complement of the 727-dimensional root space E10 (31 ) in gII9,1 (31 ) . We use the following conventions to label the transversal indices: Roman letters i, j, . . . run from 1 to 8, Greek letters from the middle of the alphabet µ, ν, . . . run from 1 to 7 and Greek letters from the beginning of the alphabet α, β, . . . run from 1 to 6. Ai−1 A− −3 |ai

8 states,

− 3 µ=1 Aµ−1 Aµ−1 Ai−2 }|ai − − {A− −2 A−2 − 4A−4 }|ai P7 {Aµ−1 A8−1 A8−1 A8−1 − 3 ν=1 Aν−1 Aν−1 Aµ−1 A8−1 − 2Aµ−1 A8−3 +6A8−1 Aµ−3 }|ai 7 7 α α 7 7 α 7 {Aα A7 − 4Aα Aα −1 A−3 + A−1 A−3 − 2A−1 A−1 A−1 −1 A−1 A−1 P8−1 i −1 3 i α 7 + 2 i=1 A−1 A−1 A−1 A−1 }|ai 3 8 α α 8 α α α {A−3 A−1 − 2 A−3 A−1 + 21 A7−3 A7−1 − Aα −1 A−1 A−1 A−1 P 8 α − 41 A7−1 A7−1 A7−1 A7−1 + 43 i=1 Ai−1 Ai−1 Aα −1 A−1 P8 3 i i 7 7 α α 7 + A A A A − A−1 A−1 A−1 A7−1 P7 8 µi=1 µ −1 8 −1 8 −1 −1 3 + 8 µ=1 A−1 A−1 A−1 A−1 − 38 A8−1 A8−1 A8−1 A8−1 }|ai β β β β β β 1 α 1 α α α α {Aα −1 A−3 + A−1 A −3 + 2 A−1 A−1 A−1 A−1 + 2 A−1 A−1 A−1 A−1 P 6 β γ γ β 3 α 8 8 − 23 Aα γ=1 −1 A−1 A−1 A−1 + 2 A−1 A−1 A−1 A−1 γ6=α,β β γ η 7 7 δ + 23 Aα −1 A−1 A−1 A−1 + 4A−1 A−1 A−1 A−1 }|ai P7 − 3 8 8 { 43 A8−1 A8−3 − 18 µ=1 Aµ−1 Aµ−1 A− −2 + 8 A−1 A−1 A−2 P 7 + 13 A8−1 A8−1 A8−1 A8−1 + 41 µ=1 Aµ−1 Aµ−1 A8−1 A8−1 }|ai P7 {7A8−1 A8−3 + 47 A8−1 A8−1 A8−1 A8−1 − 25 µ=1 Aµ−1 Aµ−1 A8−1 A8−1 P7 P7 − 41 µ,ν=1 Aµ−1 Aµ−1 Aν−1 Aν−1 − µ=1 Aµ−1 Aµ−3 }|ai

8 states,

{A8−1 A8−1 Ai−2

P7 1

1 state, 7 states, 6 states,

6 states,

15 states, 1 state, 1 state.


These are precisely the missing states found in [1]. Acknowledgement. H.N. would like to thank R. Borcherds for discussions related to this work.

Note added in proof. We have meanwhile performed an independent test of Conjecture 1 by means of a modified denominator formula, establishing its validity for all roots of norm ≥ −8. However, the conjecture fails for roots of norm ≤ −10. See O. Bärwald, R. W. Gebert and H. Nicolai, “On the Imaginary Simple Roots of the Borcherds Algebra $\mathfrak{g}_{II_{9,1}}$”. Nuclear Physics B510, 721–738 (1998).

References

1. Bärwald, O., and Gebert, R. W.: Explicit determination of a 727-dimensional root space of the hyperbolic Lie algebra E10. J. Phys. A: Math. and Gen. 30, 2433–2446 (1997)
2. Bauer, M., and Bernard, D.: On root multiplicities of some hyperbolic Kac-Moody algebras. Preprint SPhT-96-145, hep-th/9612210
3. Borcherds, R. E.: Vertex algebras, Kac-Moody algebras, and the monster. Proc. Nat. Acad. Sci. U.S.A. 83, 3068–3071 (1986)
4. Borcherds, R. E.: Generalized Kac-Moody algebras. J. Algebra 115, 501–512 (1988)
5. Borcherds, R. E.: The monster Lie algebra. Adv. in Math. 83, 30–47 (1990)
6. Borcherds, R. E.: Monstrous Lie superalgebras. Invent. Math. 109, 405–444 (1992)
7. Borcherds, R. E.: A characterization of generalized Kac-Moody algebras. J. Algebra 174, 1073–1079 (1995)
8. Brower, R. C.: Spectrum-generating algebra and no-ghost theorem for the dual model. Phys. Rev. D6, 1655–1662 (1972)
9. Conway, J. H.: The automorphism group of the 26-dimensional even unimodular Lorentzian lattice. J. Algebra 80, 159–163 (1983)
10. Conway, J. H., and Sloane, N. J. A.: Sphere Packings, Lattices and Groups. New York: Springer, Second ed., 1993
11. Del Giudice, E., Di Vecchia, P., and Fubini, S.: General properties of the dual resonance model. Ann. Physics 70, 378–398 (1972)
12. Feingold, A. J., and Frenkel, I. B.: A hyperbolic Kac-Moody algebra and the theory of Siegel modular forms of genus 2. Math. Ann. 263, 87–144 (1983)
13. Feingold, A. J., Frenkel, I. B., and Ries, J. F. X.: Representations of hyperbolic Kac-Moody algebras. J. Algebra 156, 433–453 (1993)
14. Frenkel, I. B., Huang, Y.-Z., and Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Mem. Am. Math. Soc. 104 (1993)
15. Frenkel, I. B., Lepowsky, J., and Meurman, A.: Vertex Operator Algebras and the Monster. Pure and Applied Mathematics Volume 134. San Diego, CA: Academic Press, 1988
16. Gebert, R. W.: Introduction to vertex algebras, Borcherds algebras, and the monster Lie algebra. Int. J. Mod. Phys. A8, 5441–5503 (1993)
17. Gebert, R. W., and Nicolai, H.: On E10 and the DDF construction. Commun. Math. Phys. 172, 571–622 (1995)
18. Gebert, R. W., and Nicolai, H.: An affine string vertex operator construction at arbitrary level. J. Math. Phys. 38, 4435–4450 (1997)
19. Gebert, R. W., Nicolai, H., and West, P. C.: Multistring vertices and hyperbolic Kac–Moody algebras. Int. J. Mod. Phys. A11, 429–514 (1996)
20. Giveon, A., Porrati, M., and Rabinovici, E.: Target space duality in string theory. Phys. Rep. 244, 77–202 (1994)
21. Gritsenko, V. A., and Nikulin, V. V.: Siegel automorphic form corrections of some Lorentzian Kac–Moody Lie algebras. Schriftenreihe des SFB “Geometrie und Analysis” Heft 17, Mathematica Gottingensis (1995). Eprint alg-geom/9504006
22. Harvey, J. A., and Moore, G.: Algebras, BPS states, and strings. Nucl. Phys. B463, 315–368 (1996)
23. Jurisich, E.: Generalized Kac-Moody Lie algebras, free Lie algebras and the structure of the monster Lie algebra. J. Pure Appl. Algebra 122 (1997). To appear
24. Kac, V. G.: Infinite dimensional Lie algebras. Cambridge: Cambridge University Press, Third ed., 1990
25. Kac, V. G., Moody, R. V., and Wakimoto, M.: On E10. In: K. Bleuler and M. Werner (eds.), Differential geometrical methods in theoretical physics. Proceedings, NATO advanced research workshop, 16th international conference, Como, Amsterdam: Kluwer, 1988, pp. 109–128
26. Kang, S.-J.: Generalized Kac-Moody algebras and the modular function j. Math. Ann. 298, 373–384 (1994)
27. Kass, S., Moody, R. V., Patera, J., and Slansky, R.: Affine Lie Algebras, Weight Multiplicities, and Branching Rules, Vol. 1. Berkeley, CA: University of California Press, 1990
28. Moody, R. V., and Pianzola, A.: Lie Algebras With Triangular Decomposition. New York: John Wiley & Sons, 1995
29. Nikulin, V. V.: Reflection groups in hyperbolic spaces and the denominator formula for Lorentzian Kac–Moody Lie algebras. Schriftenreihe des SFB “Geometrie und Analysis” Heft 13, Mathematica Gottingensis (1995). Eprint alg-geom/9503003
30. Saçlioğlu, C.: Dynkin diagrams for hyperbolic Kac-Moody algebras. J. Phys. A: Math. and Gen. 22, 3753–3769 (1989)
31. Schwarz, J. H.: Lectures on superstring and M theory dualities. Eprint hep-th/9607201
32. Serre, J.-P.: A Course in Arithmetics. New York: Springer, 1973

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 195, 67 – 77 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Some Properties of Matrix Harmonics on $S^2$

Jens Hoppe¹, Shing-Tung Yau²

¹ Institute for Theoretical Physics, ETH–Hönggerberg, CH-8093 Zürich, Switzerland
² Mathematics Department, Harvard University, Cambridge, MA 02138, USA

Received: 28 July 1997 / Accepted: 7 November 1997

Abstract: We show that matrix harmonics on $S^2$ (obtained from harmonic polynomials in 3 variables by replacing the commuting variables $x_1, x_2, x_3$ by hermitian $N \times N$ matrices $X_1, X_2, X_3$ satisfying $[X_1, X_2] = \frac{2i}{\sqrt{N^2-1}} X_3$, + cycl.) define two sets of families of discrete orthogonal polynomials, dual to each other, one of them having 3-term recurrence relations that, written in tridiagonal matrix form, are the constituents of a discrete Laplacian whose eigenvalues coincide with the first $N^2$ ones of the ordinary Laplacian on $S^2$.

Introduction

Some time ago, it was noticed that the Poisson-algebra of functions on $S^2$ can be obtained as a particular (“twisted”) $N \to \infty$ limit of $u(N)$. In order to prove this relation, one rewrites the spherical harmonics $\{Y_{lm}(\theta, \varphi)\}_{l = 0, 1, \dots;\ m = -l, \dots, +l}$ as harmonic homogeneous polynomials in 3 variables, $x_1, x_2, x_3$ (subject to the condition $\vec{x}^2 = 1$), replaces the 3 commuting variables by 3 hermitian $N \times N$ matrices $X_1, X_2, X_3$ satisfying $[X_a, X_b] = \frac{2i\,\epsilon_{abc}}{\sqrt{N^2-1}} X_c$, $X_1^2 + X_2^2 + X_3^2 = \mathbf{1}$ (thus obtaining an $N \times N$ matrix $T_{lm}^{(N)}$ for each $Y_{lm}$; for $l < N$ they turn out to be independent from one another, while for $l \ge N$, $T_{lm}^{(N)} \equiv 0$) and then notices that, due to the normalization, the effects of non-commutativity will vanish, as $N \to \infty$ [1,2]. In this paper, we would like to explore some properties of the matrices $T_{lm}^{(N)}$.

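Explicit Form of the Matrix Harmonics $T^{(N)}_{lm}$

Start with $X_a = \frac{2}{\sqrt{N^2-1}}\, S_a$, an irreducible (“Spin $s = \frac{N-1}{2}$”) representation of SO(3) by $N \times N$ matrices, $[S_a, S_b] = i\,\epsilon_{abc} S_c$, normalized such that $\vec{X}^2 = X_1^2 + X_2^2 + X_3^2 = \mathbf{1}$.

For definiteness, choose $(X_3)_{M_1M_2} = \frac{2}{\sqrt{N^2-1}}\,\delta_{M_1M_2}\,M_1$ ($M_1, M_2 = -s, \dots, +s$), $(X_1 \pm iX_2)_{M_1M_2} = \frac{2}{\sqrt{N^2-1}}\,\delta_{M_1, M_2 \pm 1}\sqrt{s(s+1) - M_2(M_2 \pm 1)}$, i.e.

$$X_3 = \frac{2}{\sqrt{N^2-1}} \begin{pmatrix} \frac{N-1}{2} & & 0 \\ & \ddots & \\ 0 & & -\frac{N-1}{2} \end{pmatrix},$$

$$X_1 = \frac12 \begin{pmatrix} 0 & \ddots & & 0 \\ \ddots & \ddots & \sqrt{1 - \frac{m(m+1)}{s(s+1)}} & \\ & \sqrt{1 - \frac{m(m+1)}{s(s+1)}} & \ddots & \ddots \\ 0 & & \ddots & 0 \end{pmatrix}, \qquad m = -s, \dots, +s,$$

$$X_2 = -\frac{i}{2} \begin{pmatrix} 0 & \ddots & & 0 \\ \ddots & \ddots & +\sqrt{1 - \frac{m(m+1)}{s(s+1)}} & \\ & -\sqrt{1 - \frac{m(m+1)}{s(s+1)}} & \ddots & \ddots \\ 0 & & \ddots & 0 \end{pmatrix}, \qquad (1)$$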

each of which has $N$ equidistant eigenvalues $(\mu_i)_{i=0}^{N-1}$ (centered around 0) in the interval $(-1, +1) \subset \mathbb{R}$. Define $N^2$ independent (real) $N \times N$ matrices $\hat{T}_{lm}^{(N)}$ by substituting the (non-commuting) matrices $X_a$ for the (commuting) cartesian coordinates $x_a$ in the harmonic homogeneous polynomials

$$Y_{lm}(\vec{x}) = r^l\, Y_{lm}(\theta, \varphi) = \sum c^{(m)}_{a_1 \dots a_l}\, x_{a_1} \cdot x_{a_2} \cdot \ldots \cdot x_{a_l} \underset{(m \ge 0)}{=} r^l\, (-)^m \sqrt{\frac{2l+1}{4\pi}\, \frac{(l-m)!}{(l+m)!}}\; P_l^m(\cos\theta)\, e^{im\varphi} \qquad (2)$$

(the “solid spherical harmonics”) of degree $l < N$ (with the tensor $c^{(m)}_{a_1 \cdots a_l}$ chosen to be totally symmetric) and multiplying by $\sqrt{\frac{4\pi}{N}}\; c_{Nl}$, where $c_{Nl} = \sqrt{\frac{N\,(N^2-1)^l\,(N-1-l)!}{(N+l)!}}$:

$$\hat{T}_{lm}^{(N)} := \sqrt{\tfrac{4\pi}{N}}\; c_{Nl} \sum c^{(m)}_{a_1 \dots a_l}\, X_{a_1} \cdot X_{a_2} \cdot \ldots \cdot X_{a_l}. \qquad (3)$$

The normalization $c_{Nl}$ (approaching 1, for each fixed $l$, as $N \to \infty$) is chosen such that (cf. [1])

$$\mathrm{Tr}\big(\hat{T}_{lm}^\dagger\, \hat{T}_{l'm'}\big) = \delta_{ll'}\,\delta_{mm'} \qquad (4)$$

for all $l, l' < N$, $|m| \le l$, $|m'| \le l$ ($T^\dagger$ denoting the hermitian conjugate of $T$). By definition, the $T_{lm}^\dagger = (-)^m T_{l\,-m}$ are irreducible tensor-operators (see e.g. [3]) satisfying

$$[T_{lm}^{(N)}, X_\pm] = \pm\hbar\,\sqrt{l(l+1) - m(m \pm 1)}\; T_{l\,m\pm1}^{(N)}, \qquad [T_{lm}^{(N)}, X_3] = -\hbar\, m\, T_{lm}^{(N)}, \qquad (5)$$

with $\hbar := (s(s+1))^{-1/2} = \frac{2}{\sqrt{N^2-1}}$, and $X_\pm := \mp(X_1 \pm iX_2)$ (i.e., $X_\pm^\dagger = -X_\mp$, $[X_3, X_\pm] = \pm\hbar X_\pm$, $[X_+, X_-] = -2\hbar X_3$). Due to the Wigner-Eckart theorem (see e.g. [3]) the matrix elements of $T_{lm}$ must be proportional to the Wigner 3j symbol (for which various explicit expressions in the form of algebraic sums are known; the one used below is taken from [4]); with the above normalization,

$$(T_{lm}^{(N)})_{M_1M_2} = \sqrt{2l+1}\,(-)^{s-M_1} \begin{pmatrix} s & l & s \\ -M_1 & m & M_2 \end{pmatrix} = \sqrt{2l+1}\,\sqrt{\frac{(s+M_2)!\,(s-M_2)!\,(s+M_1)!\,(l-m)!}{(N-1-l)!\,(N+l)!\,(s-M_1)!\,(l+m)!}} \cdot \sum_k \frac{(-)^{l+m+k}\,(N-1-k)!\,(l+k)!}{k!\,(k-m)!\,(s+M_1-k)!\,(l-k)!} \cdot \delta^{M_1}_{M_2+m}, \qquad (6)$$

where the sum extends over all integers $k$ such that all appearing factorials have non-negative arguments. (Equivalently, one could write (6) in terms of generalized hypergeometric functions ${}_3F_2$,

$${}_3F_2\!\left(\begin{matrix} a, b, c \\ d, e \end{matrix};\, 1\right) := \sum_{k=0}^{\infty} \frac{1}{k!}\, \frac{(a)_k\,(b)_k\,(c)_k}{(d)_k\,(e)_k},$$

where $(x)_k := x(x+1) \cdot \ldots \cdot (x+k-1)$.)
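The explicit sum in (6) is easy to evaluate numerically, which gives a direct check of the normalization (4). The sketch below is illustrative only: it relabels the magnetic quantum numbers by the integers $a = s + M_1$ and $b = s + M_2$ (so that half-integer spins need no special treatment), restricts to $m \ge 0$, and uses floating-point square roots. The function names are assumptions, not notation from the paper.

```python
from fractions import Fraction
from math import factorial as fac, sqrt

def T(N, l, m, a, b):
    """Matrix element (T_lm)_{M1 M2} from (6), with a = s + M1 and b = s + M2
    (integers 0..N-1, s = (N-1)/2) and 0 <= m <= l < N."""
    if a != b + m:                        # Kronecker delta: M1 = M2 + m
        return 0.0
    pref = sqrt((2 * l + 1) * fac(b) * fac(N - 1 - b) * fac(a) * fac(l - m)
                / (fac(N - 1 - l) * fac(N + l) * fac(N - 1 - a) * fac(l + m)))
    total = Fraction(0)
    # k runs over all values for which every factorial argument is non-negative
    for k in range(m, min(l, a) + 1):
        total += Fraction((-1) ** (l + m + k) * fac(N - 1 - k) * fac(l + k),
                          fac(k) * fac(k - m) * fac(a - k) * fac(l - k))
    return pref * float(total)

def overlap(N, l, lp, m):
    """Tr(T_lm^dagger T_l'm) restricted to the single non-vanishing off-diagonal."""
    return sum(T(N, l, m, b + m, b) * T(N, lp, m, b + m, b) for b in range(N - m))

def is_orthonormal(N, tol=1e-12):
    return all(abs(overlap(N, l, lp, m) - (1.0 if l == lp else 0.0)) < tol
               for m in range(N) for l in range(m, N) for lp in range(m, N))
```

For $N = 3$ the $l = 1$, $m = 0$ matrix comes out proportional to $X_3$, as expected for the lowest non-trivial harmonic; matrices with different $m$ are trivially orthogonal because they live on different off-diagonals.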

Discrete Orthogonal Polynomials and Their Duals

The only non-vanishing elements of the matrix $T_{lm}$ are on the (lower) $m$th off diagonal; in particular (as noted 20 years ago [2]) $T_{l0}^{(N)} = T_l^{(N)}(X_3)$ must be polynomial(s) of degree $l$ in $X_3$, and hence define discrete orthogonal polynomials, as, due to (4),

$$\sum_{j=0}^{N-1} T_l^{(N)}(\mu_j)\, T_{l'}^{(N)}(\mu_j) = \delta_{ll'}, \qquad l, l' = 0 \dots N-1, \quad \mu_j = (-s + j)\hbar. \qquad (7)$$

Using (6), one has

$$T_l^{(N)}(\mu_j) = (-)^l\, \frac{\sqrt{2l+1}\,(N-1)!}{\sqrt{(N+l)!\,(N-1-l)!}}\; f_l(j), \qquad (8)$$

where

$$f_l(y) := \sum_{k=0}^{l} \binom{l}{k} \binom{l+k}{k} \frac{(N-1-k)!}{(N-1)!}\, (-y)_k = {}_3F_2\!\left(\begin{matrix} -l,\ l+1,\ -y \\ 1,\ -(N-1) \end{matrix};\, 1\right) \qquad (9)$$

are discrete (Chebychev-Hahn [5]) polynomials, orthogonal on the points $0, 1, \dots, N-1$ with weight one:

$$\sum_{j=0}^{N-1} f_l(j)\, f_{l'}(j) = \delta_{ll'}\, h_l, \qquad (10)$$

$h_l > 0$ being the inverse square of the numerical factor in (8);

$$f_0(y) = 1, \qquad f_1(y) = 1 - \frac{2}{N-1}\, y, \qquad f_2(y) = 1 - \frac{6}{N-1}\, y + \frac{6}{(N-1)(N-2)}\, y(y-1).$$
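The orthogonality relation (10) can be confirmed with exact rational arithmetic for any fixed $N$. A minimal sketch follows; it reads $(-y)_k$ in (9) as a Pochhammer symbol and takes $h_l = (N+l)!\,(N-1-l)!\,/\,\big((2l+1)\,((N-1)!)^2\big)$, the inverse square of the factor in (8). The helper names `rising`, `f`, `h` and `orthogonality_holds` are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def rising(x, k):
    """Pochhammer symbol (x)_k = x (x+1) ... (x+k-1)."""
    out = Fraction(1)
    for i in range(k):
        out *= x + i
    return out

def f(l, y, N):
    """Chebychev-Hahn polynomial f_l(y) of (9), as the finite sum over k."""
    return sum(comb(l, k) * comb(l + k, k)
               * Fraction(factorial(N - 1 - k), factorial(N - 1))
               * rising(-y, k)
               for k in range(l + 1))

def h(l, N):
    """Squared norm h_l: inverse square of the numerical factor in (8)."""
    return Fraction(factorial(N + l) * factorial(N - 1 - l),
                    (2 * l + 1) * factorial(N - 1) ** 2)

def orthogonality_holds(N):
    """Check (10) on the grid j = 0..N-1 for all pairs l, l' < N."""
    for l in range(N):
        for lp in range(N):
            gram = sum(f(l, j, N) * f(lp, j, N) for j in range(N))
            if gram != (h(l, N) if l == lp else 0):
                return False
    return True
```

Because everything is a `Fraction`, the check is an exact identity rather than a floating-point comparison; it also reproduces the closed forms of $f_1$ and $f_2$ listed above.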

Before going on to discuss the discrete orthogonal polynomials connected with the $T_{lm \neq 0}$, let us make the following important observation: as (7) expresses the orthogonality ($T\,T^{tr} = \mathbf{1}$) of the real $N \times N$ matrix $T := (T_{lj}) := (T_l(\mu_j))$,

$$\sum_l T_{lj}\, T_{lj'} = \sum_{l=0}^{N-1} \frac{1}{h_l}\, f_l(j)\, f_l(j') = \delta_{jj'}, \qquad j, j' = 0 \dots N-1 \qquad (11)$$

immediately follows (as $T^{tr}\,T = \mathbf{1}$ as well). Equation (11), however, looks like the defining property for a new set of orthogonal “functions” $\tilde{f}_j(l) := f_l(j)$ having weight $w_l := \frac{((N-1)!)^2\,(2l+1)}{(N+l)!\,(N-1-l)!} = \frac{1}{h_l} > 0$. While, a priori, the functions $\tilde{f}_j$ have no reason to be polynomials,

$$\tilde{f}_j(\tilde{y}) := {}_3F_2\!\left(\begin{matrix} -\tilde{y},\ \tilde{y}+1,\ -j \\ 1,\ -(N-1) \end{matrix};\, 1\right) = \sum_{i=0}^{j} \frac{(-)^i}{i!} \binom{j}{i} \frac{(N-i-1)!}{(N-1)!}\, (\tilde{y} - i + 1)_{2i} \qquad (12)$$

is polynomial in $\tilde{y}$; moreover, it is polynomial of degree $j$(!) in the “dual variable” $\tilde{y}(\tilde{y}+1)$, resp. $\tilde{\mu}(\tilde{y}) := \alpha + \beta\,\tilde{y}(\tilde{y}+1)$;

$$\tilde{f}_0 = 1, \qquad \tilde{f}_1 = 1 - \frac{\tilde{y}(\tilde{y}+1)}{N-1}, \qquad \tilde{f}_2 = 1 - \frac{2\,\tilde{y}(\tilde{y}+1)}{N-1} + \frac{\tilde{y}(\tilde{y}+1)\,\big(\tilde{y}(\tilde{y}+1) - 2\big)}{2(N-1)(N-2)}, \quad \dots$$

Note that $\big(\tilde{y} - (i+1) + 1\big)_{2(i+1)} = (\tilde{y} - i + 1)_{2i} \cdot \big(\tilde{y}(\tilde{y}+1) - i(i+1)\big)$, hence, by induction,

$$(\tilde{y} - i + 1)_{2i} = \prod_{k=0}^{i-1} \big(\tilde{y}(\tilde{y}+1) - k(k+1)\big) = H_i\big(z := \tilde{y}(\tilde{y}+1)\big).$$
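The duality statement can be tested directly: the finite sum in (12), evaluated at integer $\tilde{y} = l$, should reproduce $f_l(j)$ from (9), and the dual orthogonality (11) should hold with weight $w_l = 1/h_l$. The following sketch uses exact fractions; the function names are illustrative and the Pochhammer reading of $(\cdot)_k$ is assumed throughout.

```python
from fractions import Fraction
from math import comb, factorial

def rising(x, k):
    """Pochhammer symbol (x)_k."""
    out = Fraction(1)
    for i in range(k):
        out *= x + i
    return out

def f(l, y, N):
    """f_l(y) from (9)."""
    return sum(comb(l, k) * comb(l + k, k)
               * Fraction(factorial(N - 1 - k), factorial(N - 1))
               * rising(-y, k) for k in range(l + 1))

def f_dual(j, yt, N):
    """Right-hand side of (12): the dual polynomial in the variable yt."""
    return sum(Fraction((-1) ** i, factorial(i)) * comb(j, i)
               * Fraction(factorial(N - i - 1), factorial(N - 1))
               * rising(yt - i + 1, 2 * i) for i in range(j + 1))

def weight(l, N):
    """w_l = 1 / h_l."""
    return Fraction((2 * l + 1) * factorial(N - 1) ** 2,
                    factorial(N + l) * factorial(N - 1 - l))

def duality_holds(N):
    """f_dual_j(l) == f_l(j) on the whole N x N grid."""
    return all(f_dual(j, l, N) == f(l, j, N) for j in range(N) for l in range(N))

def dual_orthogonality_holds(N):
    """Equation (11): sum_l w_l f_l(j) f_l(j') = delta_{jj'}."""
    return all(sum(weight(l, N) * f_dual(j, l, N) * f_dual(jp, l, N)
                   for l in range(N)) == (1 if j == jp else 0)
               for j in range(N) for jp in range(N))

def product_form_holds(i, yt):
    """(yt - i + 1)_{2i} == prod_{k<i} (yt (yt+1) - k (k+1))."""
    z, prod = yt * (yt + 1), 1
    for k in range(i):
        prod *= z - k * (k + 1)
    return rising(yt - i + 1, 2 * i) == prod
```

The last helper checks the product formula for $H_i$, which is what makes $\tilde{f}_j$ a polynomial of degree $j$ in $\tilde{y}(\tilde{y}+1)$.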

In the context of algebraic combinatorics [6], association schemes corresponding to orthogonal polynomials having (in particular) the above property are called “P and Q polynomial” association schemes, the $\mu_j$ ($\tilde{\mu}_l$) being eigenvalues of the tridiagonal (dual) “intersection-matrix”

$$B = \begin{pmatrix} a_0 & c_1 & & 0 \\ b_0 & a_1 & c_2 & \\ & \ddots & \ddots & c_{N-1} \\ 0 & & b_{N-2} & a_{N-1} \end{pmatrix}, \qquad (13)$$

which (given certain additional properties) belongs to a distance-regular graph (cf. [6], [7] for details). Given orthogonal polynomials $f_l(y)$, the matrix (13) is formed out of the coefficients of the 3-term recursion relations

$$x\,u_l(x) = b_l\,u_{l+1}(x) + a_l\,u_l(x) + c_l\,u_{l-1}(x) \qquad (14)$$

for $u_l(x = \mu(y)) := f_l(y)$; note that $\big(u_0(\mu_j), \dots, u_{N-1}(\mu_j)\big)$ are left-eigenvectors of (13) with eigenvalue $\mu_j$, $j = 0 \dots N-1$. For the case at hand,

$$u_l(x) = f_l\!\left(\frac{x}{\hbar} + s\right) = (-)^l \sqrt{h_l}\; T_l^{(N)}(x),$$

one has

$$x\,u_l(x) = -\frac{\hbar}{2}\,(N-1-l)\,\frac{l+1}{2l+1}\; u_{l+1}(x) - \frac{\hbar}{2}\,\frac{(N+l)\,l}{2l+1}\; u_{l-1}(x) \qquad (15)$$

(note that for $N \to \infty$ these become the 3-term recursion relations of the Legendre polynomials $P_l(-x) = u_l^\infty(x)$, and that, for any $N$, $b_l + c_l = -\hbar s = \mu_0$), while the dual polynomials $\tilde{u}_j\big(\tilde{x} = \alpha + \beta\,\tilde{y}(\tilde{y}+1)\big) = \tilde{f}_j(\tilde{y})$ satisfy

$$\tilde{x}\,\tilde{u}_j(\tilde{x}) = -\beta\,(j+1)(N-1-j)\,\tilde{u}_{j+1}(\tilde{x}) - \beta\,j(N-j)\,\tilde{u}_{j-1}(\tilde{x}) + \tilde{a}_j\,\tilde{u}_j(\tilde{x}) \qquad (16)$$

with $\tilde{a}_j = \tilde{\mu}(0) - \tilde{b}_j - \tilde{c}_j = \alpha + 2\beta\left(j(N-j-1) + \frac{N-1}{2}\right)$. The (left) eigenvectors of the dual “intersection matrix” $\tilde{B}$ ($\equiv \tilde{B}^{tr}$ for (16)),

$$\tilde{B} = \begin{pmatrix} \tilde{a}_0 & \tilde{c}_1 & & 0 \\ \tilde{b}_0 & \tilde{a}_1 & \ddots & \\ & \ddots & \ddots & \tilde{c}_{N-1} \\ 0 & & \tilde{b}_{N-2} & \tilde{a}_{N-1} \end{pmatrix}, \qquad (17)$$

are $\big(\tilde{u}_0(\tilde{\mu}_l), \dots, \tilde{u}_{N-1}(\tilde{\mu}_l)\big)$, which due to $\tilde{u}_j(\tilde{\mu}_l) = u_l(\mu_j)$ are identical to

$$\big(u_l(\mu_0),\, u_l(\mu_1),\, \dots,\, u_l(\mu_{N-1})\big), \qquad (18)$$

with eigenvalue $\tilde{\mu}_l = \alpha + \beta\,l(l+1)$.

Thus, $\tilde{B}$ not only has its eigenvalues corresponding to the Laplace-operator on $S^2$, but also an eigenvalue-equation ($\tilde{B}$ applied to (18)) of a discrete Laplacian. In the context of graph-theory, where a Laplacian may be defined (cp. [8]) as $L = \mathbf{1} - D^{-1/2} A D^{-1/2}$, with $A$ being the adjacency matrix (with non-vanishing entries $A_{ij} = 1$ whenever vertices $i$ and $j$ are connected) and the diagonal matrix $D$ having the valences $d_i$ (i.e., the number of edges having $i$ as one of its 2 vertices) as non-vanishing entries, one would choose $\alpha = +1$, $\beta = -\frac12\hbar^2$, implying $\sum_{l=0}^{N-1} \tilde{\mu}_l \cdot (2l+1) = 0$, $\tilde{\mu}_0 = +1$ and $\tilde{\mu}_{l>0} \in (-1, +1)$ (corresponding to the eigenvalues of $\mathbf{1} - L$). The connection of systems of discrete orthogonal polynomials with distance regular graphs, however, is not straightforward in our case. While the multiplicity of the $\tilde{\mu}_l$, calculated as ([6], [7])

$$\#(\tilde{\mu}_l) = \prod_{k=1}^{l} \frac{b_{k-1}}{c_k},$$

at least comes out (as $2l+1$) when formally taking the $N \to \infty$ limit in (15), other ‘feasibility-criteria’ (cf. [6], [7]) are, in the form (15)/(16), not satisfied.

In any case, let us now discuss the $T^{(N)}_{l, m \ge 0}$: (5), $\prod_{k=0}^{m-1} \big(l(l+1) - k(k+1)\big) = \frac{(l+m)!}{(l-m)!}$, and

$$\big[T_l^{(N)}(X_3),\, X_+\big] = \big((1 - \Gamma)\,T_l^{(N)}(X_3)\big)\, X_+ \qquad (19)$$

(where $\Gamma f(X_3) := f(X_3 - \hbar\mathbf{1})$) can be used to write

$$\sqrt{\frac{(l-m)!}{(l+m)!}}\; D^m\big(T_l^{(N)}(X_3)\big)\, X_+^m = T_{lm}^{(N)} \qquad (20)$$

(with $D := \frac{(1 - \Gamma)}{\hbar}$ approaching the ordinary derivative, as $N \to \infty$), in full analogy to (2): $X_+^m$ corresponds to $(-)^m (\sin\theta)^m e^{im\varphi}$ and, up to normalization, $D^m T_l^{(N)}(X_3)$ to $Q_{lm} := \frac{d^m}{d(\cos\theta)^m} P_l(\cos\theta) = \frac{P_l^m}{(\sin\theta)^m} = \frac{(l+m)!}{l!}\, P_{l-m}^{(m,m)} \cdot \frac{1}{2^m}$, which is a particular case of a Jacobi-polynomial,

$$P_n^{(\alpha,\beta)}(x) = \frac{(\alpha+1)_n}{n!}\; {}_2F_1\!\left(\begin{matrix} -n,\ n+\alpha+\beta+1 \\ \alpha+1 \end{matrix};\, \frac{1-x}{2}\right) := (1-x)^{-\alpha}(1+x)^{-\beta}\, \frac{(-)^n}{2^n\, n!}\, \frac{d^n}{dx^n}\Big[(1-x)^{n+\alpha}(1+x)^{n+\beta}\Big].$$

Observing that

$$X_+ X_- = X_3^2 - \hbar X_3 - \mathbf{1} =: \rho(X_3) = -\hbar^2\, J(N - J) \qquad (21)$$

(where $J := \frac{X_3}{\hbar} + s\mathbf{1}$ has eigenvalues $j = 0, 1, \dots, N-1$) implies

$$X_+^m (-X_-)^m = -X_+^{m-1}\,\rho(X_3)\,(-X_-)^{m-1} = +X_+^{m-2}\,\rho(X_3 - \hbar\mathbf{1})\,X_+ X_-\,(-X_-)^{m-2} = \dots = (-)^m \prod_{r=0}^{m-1} (\Gamma^r \rho) = \hbar^{2m}\, \frac{J!}{(J-m)!}\, \frac{(N-J+m-1)!}{(N-J-1)!} =: \rho_m(J), \qquad (22)$$

one finds that, due to (4), the $(l-m)$th order polynomials $Q_{lm}^{(N)}(J) := D^m T_l^{(N)}(X_3) \cdot \sqrt{\frac{(l-m)!}{(l+m)!}}$ satisfy

$$\sum_{j=m}^{N-1} Q_{lm}^{(N)}(j)\, Q_{l'm}^{(N)}(j)\, \rho_m(j) = \delta_{ll'}, \qquad l, l' = m, \dots, N-1 \qquad (23)$$

(the points $j = 0, \dots, m-1$ don't contribute, as $\rho_m$ vanishes there). For any fixed $m$, the $(N - m)$ polynomials $\{Q_{lm}^{(N)}\}_{l=m}^{N-1}$ therefore constitute a set of polynomials of degree $0, 1, \dots, N-1-m$, orthogonal with respect to the $(N-m)$ points $m, \dots, N-1$ and weight-function $\rho_m$. Using (6), $\big((X_+)^m\big)_{M_1M_2} = (-\hbar)^m\, \delta^{M_1}_{M_2+m} \prod_{M=M_2}^{M_2+m-1} \sqrt{s(s+1) - M(M+1)}$, and writing

$$Q_{lm}^{(N)}(j' + m) = \frac{(-)^{l-m}}{\hbar^m}\, \frac{\sqrt{(l+m)!}\,\sqrt{2l+1}\,(N-1-m)!}{m!\,\sqrt{(l-m)!}\,\sqrt{(N-1-l)!\,(N+l)!}}\; f_{l-m}^{(m)}(j') \qquad (24)$$

yields (generalizing (8)/(9))

$$f_{l-m}^{(m)}(y) = \frac{m!\,(l-m)!}{(l+m)!} \sum_{k=m}^{l} \frac{(N-1-k)!\,(l+k)!}{(N-1-m)!\,(l-k)!\,k!\,(k-m)!}\, (-y)_{k-m} = \sum_{n=0}^{l-m} \frac{(-y)_n}{n!}\, \frac{(l+m+n)!}{(l+m)!}\, \frac{(l-m)!}{(l-m-n)!}\, \frac{m!}{(m+n)!}\, \frac{(N-1-m-n)!}{(N-1-m)!} = {}_3F_2\!\left(\begin{matrix} -(l-m),\ (l+m+1),\ -y \\ (m+1),\ -(N-1-m) \end{matrix};\, 1\right), \qquad (25)$$

which is a more general Chebychev-Hahn [5] polynomial, $Q_{l-m}(y; m, m, N-m-1)$ (approaching $\frac{m!\,(l-m)!}{l!}\, P_{l-m}^{(m,m)}(1 - 2z)$ for $y = zN$, $N \to \infty$). As implied by (23)/(24) one has

$$\sum_{j=0}^{N-1-m} f_{l-m}^{(m)}(j)\, f_{l'-m}^{(m)}(j)\, w_j^{(m)} = \delta_{ll'}\, h_l^{(m)} \qquad \text{with } l, l' = m, m+1, \dots, N-1, \qquad (26)$$

$$h_l^{(m)} = \frac{(N-1-l)!\,(N+l)!\,(l-m)!}{(2l+1)\,\big((N-1-m)!\big)^2\,(l+m)!}, \qquad w_j^{(m)} = \frac{\rho_m(j+m)}{(m!\,\hbar^m)^2} = \frac{(m+1)_j\,(m+1)_{N-1-m-j}}{j!\,(N-1-m-j)!}. \qquad (27)$$
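The orthogonality (26) for general $m$ can be checked exactly for small $N$. The sketch below is an independent check under stated assumptions rather than a transcription: $f^{(m)}_{l-m}$ is evaluated as the terminating ${}_3F_2$ of (25), the weight is the $\hbar$-independent form $(m+1)_j\,(m+1)_{N-1-m-j}/\big(j!\,(N-1-m-j)!\big)$, and the norm is taken as $h_l^{(m)} = (N-1-l)!\,(N+l)!\,(l-m)!/\big((2l+1)\,((N-1-m)!)^2\,(l+m)!\big)$, which contains no power of $\hbar$ since neither the polynomials nor the weights do. All names are illustrative.

```python
from fractions import Fraction
from math import factorial

def rising(x, k):
    """Pochhammer symbol (x)_k."""
    out = Fraction(1)
    for i in range(k):
        out *= x + i
    return out

def f_m(lp, y, m, N):
    """f^{(m)}_{l'}(y) with l' = l - m, as the terminating 3F2 of (25)."""
    return sum(rising(-lp, n) * rising(lp + 2 * m + 1, n) * rising(-y, n)
               / (factorial(n) * rising(m + 1, n) * rising(-(N - 1 - m), n))
               for n in range(lp + 1))

def w(j, m, N):
    """Weight w_j^{(m)} on the grid j = 0..N-1-m."""
    return (rising(m + 1, j) * rising(m + 1, N - 1 - m - j)
            / (factorial(j) * factorial(N - 1 - m - j)))

def rho_over_hbar(J, m, N):
    """rho_m(J) / hbar^(2m) from (22), for m <= J <= N-1."""
    return (factorial(J) // factorial(J - m)) * (factorial(N - J + m - 1) // factorial(N - J - 1))

def h_m(l, m, N):
    """Squared norm appearing on the right-hand side of (26)."""
    return Fraction(factorial(N - 1 - l) * factorial(N + l) * factorial(l - m),
                    (2 * l + 1) * factorial(N - 1 - m) ** 2 * factorial(l + m))

def orthogonality_26_holds(N):
    for m in range(N):
        grid = range(N - m)
        for l in range(m, N):
            for lp in range(m, N):
                gram = sum(f_m(l - m, j, m, N) * f_m(lp - m, j, m, N) * w(j, m, N)
                           for j in grid)
                if gram != (h_m(l, m, N) if l == lp else 0):
                    return False
    return True
```

For $m = 0$ the weight is identically one and the check reduces to (10).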

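Again, one finds that

$$\tilde{f}_j^{(m)}(\tilde{y}) := f_{\tilde{y}}^{(m)}(j) \qquad (28)$$

are orthogonal polynomials (in $\tilde{\mu}^{(m)}(\tilde{y}) = \alpha_m + \beta_m\,\tilde{y}(\tilde{y}+1)$) of degree $j$ ($= 0 \dots N-1-m$), and the dual 3-term recurrence relations will lead to $(N-m)$-dimensional tridiagonal matrices $\tilde{B}^{(m)}$ with eigenvalues $\tilde{\mu}^{(m)}$ and eigenvectors

$$\big(f_{l-m}^{(m)}(0),\, \dots,\, f_{l-m}^{(m)}(N-1-m)\big), \qquad l = m, \dots, N-1. \qquad (29)$$

$$u_{l'}^{(m)}\big(x = \mu^{(m)}(y)\big) := f_{l'}^{(m)}(y) = \sum_{a=0}^{l'} M_a^{(m)} \prod_{p=0}^{a-1} \big(\tilde{\mu}_{l'}^{(m)} - \tilde{\mu}_p^{(m)}\big) \prod_{q=0}^{a-1} \big(x - \mu^{(m)}(q)\big) \qquad (30)$$

with

$$\tilde{\mu}^{(m)}(\tilde{y}) := \alpha^{(m)} + \beta^{(m)}\,\tilde{y}(\tilde{y} + 2m + 1), \qquad \mu^{(m)}(y) := \hbar\,(-s + y + m), \qquad M_a^{(m)} = \frac{(-)^a}{(\beta^{(m)}\hbar)^a}\, \frac{m!\,(N-1-m-a)!}{a!\,(m+a)!\,(N-1-m)!}$$

satisfies

$$x\,u_{l'}^{(m)}(x) = -\frac{\hbar}{2}\, \frac{(l' + 2m + 1)}{2(l'+m)+1}\,(N-1-m-l')\; u_{l'+1}^{(m)}(x) - \frac{\hbar}{2}\, \frac{l'\,(N + l' + m)}{2(l'+m)+1}\; u_{l'-1}^{(m)}(x) + a_{l'}^{(m)}\, u_{l'}^{(m)}(x)$$

with

$$b_{l'=l-m}^{(m)} = -\frac{\hbar}{2}\, \frac{(l+m+1)\,(N-1-l)}{2l+1} \;\underset{N\to\infty}{\longrightarrow}\; -\frac{l+m+1}{2l+1}, \qquad c_{l'=l-m}^{(m)} = -\frac{\hbar}{2}\, \frac{(l-m)\,(N+l)}{2l+1} \;\underset{N\to\infty}{\longrightarrow}\; -\frac{l-m}{2l+1},$$

and

$$a_{l'}^{(m)} = \mu^{(m)}(0) - b_{l'}^{(m)} - c_{l'}^{(m)} = \frac{m}{2}\,\hbar \quad (\to 0 \text{ as } N \to \infty); \qquad (31)$$

$$\tilde{u}_{j'}^{(m)}\big(\tilde{x} = \tilde{\mu}^{(m)}(\tilde{y})\big) := \tilde{f}_{j'}^{(m)}(\tilde{y}) = \sum_{a=0}^{j'} M_a^{(m)} \prod_{q=0}^{a-1} \big(\mu_{j'}^{(m)} - \mu_q^{(m)}\big) \prod_{p=0}^{a-1} \big(\tilde{x} - \tilde{\mu}_p^{(m)}\big) \qquad (32)$$

satisfies

$$\tilde{x}\,\tilde{u}_{j'}^{(m)}(\tilde{x}) = \tilde{b}_{j'}^{(m)}\,\tilde{u}_{j'+1}^{(m)}(\tilde{x}) + \tilde{c}_{j'}^{(m)}\,\tilde{u}_{j'-1}^{(m)}(\tilde{x}) + \tilde{a}_{j'}^{(m)}\,\tilde{u}_{j'}^{(m)}(\tilde{x})$$

with

$$\tilde{b}_{j'}^{(m)} = -\beta^{(m)}\,(j' + 1 + m)(N - 1 - m - j'), \qquad \tilde{c}_{j'}^{(m)} = -\beta^{(m)}\,j'\,(N - j'),$$

$$\tilde{a}_{j'}^{(m)} = \tilde{\mu}(0) - \tilde{b}_{j'}^{(m)} - \tilde{c}_{j'}^{(m)} = \alpha^{(m)} + \beta\,\big\{2j'(N-1-m-j') + (m+1)(N-1-m)\big\}. \qquad (33)$$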
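Both three-term recursions can be verified on the grid with exact arithmetic. The sketch below makes two simplifying assumptions that are not in the text: the recursion (31) is divided through by $\hbar$, so that $x/\hbar = y - s + m$, and the dual recursion (33) is checked for $\alpha^{(m)} = 0$, $\beta^{(m)} = 1$, which loses nothing because the relation is affine in $\tilde{x}$. Setting $m = 0$ gives the corresponding checks of (15), with prefactor $\hbar/2$, and of (16). Terms whose coefficient vanishes at the ends of the range are skipped, since the neighbouring polynomial is then not defined. Function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def rising(x, k):
    """Pochhammer symbol (x)_k."""
    out = Fraction(1)
    for i in range(k):
        out *= x + i
    return out

def f_m(lp, y, m, N):
    """f^{(m)}_{l'}(y) as the terminating 3F2 of (25)."""
    return sum(rising(-lp, n) * rising(lp + 2 * m + 1, n) * rising(-y, n)
               / (factorial(n) * rising(m + 1, n) * rising(-(N - 1 - m), n))
               for n in range(lp + 1))

def recursion_31_holds(N, m):
    """(y - s + m) u_l' = b u_{l'+1} + c u_{l'-1} + (m/2) u_l', all in units of hbar."""
    s, top = Fraction(N - 1, 2), N - 1 - m
    for lp in range(top + 1):
        den = 2 * (2 * (lp + m) + 1)
        b = Fraction(-(lp + 2 * m + 1) * (top - lp), den)
        c = Fraction(-lp * (N + lp + m), den)
        for y in range(top + 1):
            rhs = Fraction(m, 2) * f_m(lp, y, m, N)
            if b:
                rhs += b * f_m(lp + 1, y, m, N)
            if c:
                rhs += c * f_m(lp - 1, y, m, N)
            if (y - s + m) * f_m(lp, y, m, N) != rhs:
                return False
    return True

def recursion_33_holds(N, m):
    """Dual recursion with alpha = 0, beta = 1; u~_j(yt) = f^{(m)}_{yt}(j)."""
    top = N - 1 - m
    for j in range(top + 1):
        b = -(j + 1 + m) * (top - j)
        c = -j * (N - j)
        a = 2 * j * (top - j) + (m + 1) * top
        for yt in range(top + 1):
            rhs = a * f_m(yt, j, m, N)
            if b:
                rhs += b * f_m(yt, j + 1, m, N)
            if c:
                rhs += c * f_m(yt, j - 1, m, N)
            if yt * (yt + 2 * m + 1) * f_m(yt, j, m, N) != rhs:
                return False
    return True
```

A non-symmetric pair $\tilde{b}^{(m)}_{j'-1} \neq \tilde{c}^{(m)}_{j'}$ for $m \neq 0$ is visible directly in `recursion_33_holds`, where `b` and `c` carry different $m$-dependence.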

The Discrete Laplacian

Consider

$$\Delta_N := \sum_{i=1}^{3} [S_i, [S_i,\ \cdot\ ]] = \frac{1}{\hbar^2} \Big( [X_3, [X_3,\ \cdot\ ]] - \tfrac12 [X_+, [X_-,\ \cdot\ ]] - \tfrac12 [X_-, [X_+,\ \cdot\ ]] \Big) \qquad (34)$$

(acting on $gl(N)$), whose eigenfunctions are the $T_{lm}^{(N)}$ ($l = 0 \dots N-1$, $|m| \le l$), just as the spherical harmonics $Y_{lm}(\theta, \varphi)$, satisfying

$$\sqrt{\tfrac{4\pi}{3}}\, \{Y_{lm}, Y_{10}\} = i m\, Y_{lm}, \qquad \sqrt{\tfrac{4\pi}{3}}\, \big\{Y_{lm}, \sqrt{2}\, Y_{1\pm1}\big\} = \mp i \sqrt{l(l+1) - m(m \pm 1)}\; Y_{l\,m\pm1}, \qquad (35)$$

are the eigenfunctions of the ordinary Laplacian on $S^2$,

$$-\Delta = -\Big( \frac{1}{\sin\theta}\, \partial_\theta (\sin\theta\, \partial_\theta) + \frac{1}{\sin^2\theta}\, \partial_\varphi^2 \Big) = \frac{4\pi}{3} \Big( -\{Y_{10}, \{Y_{10},\ \cdot\ \}\} + \tfrac12 \{\sqrt{2}\, Y_{11}, \{\sqrt{2}\, Y_{1-1},\ \cdot\ \}\} + \tfrac12 \{\sqrt{2}\, Y_{1-1}, \{\sqrt{2}\, Y_{1+1},\ \cdot\ \}\} \Big) \qquad (36)$$

(here, $\{\,,\}$ denotes the usual Poisson-bracket for functions on $S^2$, $\{f, g\}(\theta, \varphi) := \frac{1}{\sin\theta} \Big( \frac{\partial g}{\partial\theta} \frac{\partial f}{\partial\varphi} - \frac{\partial f}{\partial\theta} \frac{\partial g}{\partial\varphi} \Big)$). Using (5) to prove $\Delta_N T_{lm}^{(N)} = l(l+1)\, T_{lm}^{(N)}$ is clearly the same as using (35) to verify $-\Delta Y_{lm} = l(l+1)\, Y_{lm}$ (for a discussion of the relation between $\mathrm{Tr}\big(T^\dagger_{l''m''} [T_{lm}, T_{l'm'}]\big)$ and $\int_{S^2} Y^*_{l''m''} \{Y_{lm}, Y_{l'm'}\}$ for general values $l\,l'\,l''$, see [1]).

The associative (non-commutative, ordinary matrix-) multiplication for $gl(N)$, however (to be compared with the ordinary, commutative, multiplication for functions on $S^2$), naturally associates to (34) a (real, symmetric, tridiagonal) $N^2 \times N^2$ matrix ($\hat{\Delta}_N$) by viewing $(T_{lm}^{(N)})_{M_1M_2}$ as matrix elements of an $N^2 \times N^2$ dimensional real orthogonal matrix $\hat{T}$ and writing

$$(\hat{T})_{lm, M_1'M_2'}\, (\hat{\Delta}_N)^{M_1'M_2'}_{M_1M_2} = \Lambda^{l'm'}_{lm}\, \hat{T}_{l'm', M_1M_2}, \qquad (37)$$

with $(\Lambda)^{l'm'}_{lm} = l(l+1)\,\delta_{ll'}\,\delta_{mm'}$ (as the $T_{lm}$ are the eigenvectors of $\Delta_N$, $\hat{T} \hat{\Delta}_N \hat{T}^{-1}$ is just the similarity transformation diagonalizing $\hat{\Delta}_N$; equivalently, $\hat{\Delta}_N$ is obtained from (34) by using the standard basis $E_{ij}$ of the vector space $gl(N)$, i.e., $N \times N$ matrices $E_{ij}$ being zero except having entry $+1$ in the $i$th row and $j$th column).

Calculating $[X_3, [X_3, T_{lm}^{(N)}]]_{M_1M_2}$ and $[X_\pm, [X_\mp, T_{lm}^{(N)}]]_{M_1M_2}$ by using the explicit representation (1) (rather than the Lie algebraic relations (5)), and comparing with (37), results in

$$\Delta^{M_1'M_2'}_{M_1M_2} = 2\,\delta^{M_1'}_{M_1}\,\delta^{M_2'}_{M_2}\, \big(s(s+1) - M_1M_2\big) - \delta^{M_1'}_{M_1+1}\,\delta^{M_2'}_{M_2+1}\, \sqrt{s(s+1) - M_1(M_1+1)}\, \sqrt{s(s+1) - M_2(M_2+1)} - \delta^{M_1'}_{M_1-1}\,\delta^{M_2'}_{M_2-1}\, \sqrt{s(s+1) - M_1(M_1-1)}\, \sqrt{s(s+1) - M_2(M_2-1)}. \qquad (38)$$
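A small exact check of (38): its matrix elements only connect $(M_1, M_2)$ with $(M_1 \pm 1, M_2 \pm 1)$, so for a fixed difference $m = M_1 - M_2 \ge 0$ one obtains a symmetric tridiagonal block whose spectrum should be $l(l+1)$ for $l = m, \dots, N-1$. The sketch below avoids square roots by using the fact that the characteristic polynomial of a symmetric tridiagonal matrix depends only on the squares of the off-diagonal entries. The integer labels $a = s + M_1$, $b = s + M_2$ and the function names are illustrative assumptions.

```python
def block(N, m):
    """Block of (38) with M1 - M2 = m >= 0.

    Index j = s + M2 runs over 0..N-1-m. Returns the diagonal entries
    2 (s(s+1) - M1 M2) and the squared off-diagonal entries, all integers.
    """
    diag, off2 = [], []
    for j in range(N - m):
        a, b = j + m, j                           # a = s + M1, b = s + M2
        two_M1 = 2 * a - (N - 1)
        two_M2 = 2 * b - (N - 1)
        # 4 s (s+1) = N^2 - 1, so 2 (s(s+1) - M1 M2) = (N^2 - 1 - 2M1 * 2M2) / 2
        diag.append((N * N - 1 - two_M1 * two_M2) // 2)
        if j < N - m - 1:
            # (s(s+1) - M1(M1+1)) (s(s+1) - M2(M2+1)) in terms of a and b
            off2.append((N - 1 - a) * (a + 1) * (N - 1 - b) * (b + 1))
    return diag, off2

def char_poly(diag, off2, lam):
    """det(block - lam) via the standard three-term recurrence for tridiagonal matrices."""
    prev, cur = 1, diag[0] - lam
    for i in range(1, len(diag)):
        prev, cur = cur, (diag[i] - lam) * cur - off2[i - 1] * prev
    return cur

def spectrum_is_l_l_plus_1(N):
    """Each block of dimension N - m has the N - m distinct roots l(l+1), l = m..N-1."""
    for m in range(N):
        diag, off2 = block(N, m)
        if any(char_poly(diag, off2, l * (l + 1)) != 0 for l in range(m, N)):
            return False
    return True
```

Since a block of dimension $N - m$ has exactly $N - m$ eigenvalues, finding the $N - m$ distinct roots $l(l+1)$ determines its whole spectrum; negative $m$ gives the same blocks by symmetry.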

Due to the fact that $\Delta^{M_1'M_2'}_{M_1M_2} = 0$ unless $\Delta M := M_1 - M_2$ equals $\Delta M' := M_1' - M_2'$, $\hat{\Delta}_N$ splits into $(2N-1)$ blocks, $\Delta^{(m)}$, each of dimension $N - |m|$ ($m = -(N-1), \dots, +(N-1)$). Each $\Delta^{(m)}$, given explicitly as

$$\Delta^{(m)}_{jj'} = 2\,\delta^{j'}_{j}\, \big(s(2j + 1 + m) - j(j+m)\big) - \delta^{j'}_{j+1}\, \sqrt{(j+m+1)(N-1-j-m)}\, \sqrt{(j+1)(N-1-j)} - \delta^{j'}_{j-1}\, \sqrt{(j+m)(N-j-m)}\, \sqrt{j(N-j)}, \qquad (39)$$

$j, j' = 0 \dots N-1-m$ (if $m \ge 0$), $j, j' = |m| \dots N-1$ (if $m \le 0$) (the double index notation was removed using $M_2 = M = j - s$, $M_1 = M + m = j - s + m$), has eigenvalues $l_m(l_m + 1)$, $l_m = N-1, \dots, |m|$. Note that (39)$_{m=0}$ does reproduce (17)/(16)$_{\alpha=0, \beta=1}$. For $m \neq 0$, $\tilde{B}^{(m)}$ and $\Delta^{(m)}$ are related by a non-trivial (though diagonal) similarity transformation:

$$\tilde{B}^{(m)\,tr}\Big|_{\beta=1,\ \alpha=m(m+1)} = Z^{(m)\,-1}\, \Delta^{(m)}\, Z^{(m)}, \qquad Z^{(m>0)}_{jj'} = \delta_{jj'}\, \sqrt{\frac{(j+m)!\,(N-1-j)!}{j!\,(N-1-m-j)!}}, \qquad (40)$$

corresponding to the fact that $(T_{lm})_{-s+m+j,\,-s+j} = d^N_{lm}\, f_{l-m}^{(m)}(j)\, \sqrt{(j+1)_m\,(N-m-j)_m}$.

Acknowledgement. J.H. would like to thank Jürg Fröhlich and the Institute for Theoretical Physics of ETH Zürich, as well as the Mathematics Departments of Harvard University, and Justus Liebig University, Giessen, for their kind hospitality.

References

1. Hoppe, J.: MIT Ph.D. thesis, 1982
2. Goldstone, J.: unpublished
3. Messiah, A.: Quantum Mechanics, Vol. I+II. North Holland, 1958
4. Majumdar, S. D.: Prog. Theor. Phys. 20, 798–803 (1958)
5. Chebychev, P. L.: Sur les fractions continues, Sur une nouvelle série, Sur l’interpolation des valeurs équidistantes. Oeuvres, T.I., Chelsea, NY, 1961. Hahn, W.: Math. Nachr. 2, 4–34, 263–278 (1949)
6. Bannai, E., Ito, T.: Algebraic Combinatorics. I. Benjamin-Cummings, 1984
7. Brouwer, A. E., Cohen, A. M., Neumaier, A.: Distance regular graphs. Springer, 1989
8. Chung, F. R. K., Yau, S.-T.: A combinatorial trace formula. Tsing Hua Lectures on Geometry and Analysis (ed. S.-T. Yau). Cambridge, MA: International Press, 1997, pp. 107–116

Communicated by G. Felder

Commun. Math. Phys. 195, 79 – 93 (1998)


Mirror Symmetry on K3 Surfaces via Fourier–Mukai Transform

Claudio Bartocci¹, Ugo Bruzzo², Daniel Hernández Ruipérez³, José M. Muñoz Porras³

¹ Dipartimento di Matematica, Università di Genova, Via Dodecaneso 35, 16146 Genova, Italy. E-mail: [email protected]
² Scuola Internazionale Superiore di Studi Avanzati, Via Beirut 2–4, 34014 Trieste, Italy. E-mail: [email protected]
³ Departamento de Matemática Pura y Aplicada, Universidad de Salamanca, Plaza de la Merced 1–4, 37008 Salamanca, Spain. E-mail: [email protected]; [email protected]

Received: 14 May 1997 / Accepted: 7 November 1997

Abstract: We use a relative Fourier–Mukai transform on elliptic K3 surfaces X to describe mirror symmetry. The action of this Fourier–Mukai transform on the cohomology ring of X reproduces relative T-duality and provides an infinitesimal isometry of the moduli space of algebraic structures on X which, in view of the triviality of the quantum cohomology of K3 surfaces, can be interpreted as mirror symmetry. From the mathematical viewpoint the novelty is that we exhibit another example of a Fourier–Mukai transform on K3 surfaces, whose properties are closely related to the geometry of the relative Jacobian of X.

1. Introduction

In a recent approach of Strominger, Yau and Zaslow [20], the phenomenon of mirror symmetry on Calabi–Yau threefolds admitting a $T^3$ fibration is interpreted as T-duality on the $T^3$ fibres. According to this formulation one would like to define the mirror dual to a Calabi–Yau manifold (of any dimension) as a compactification of the moduli space of its special Lagrangian submanifolds (the $T^3$ tori in the above case) endowed with a suitable complex structure [20, 12, 9]. In two dimensions this means that one considers a K3 surface elliptically fibred over the projective line, $p\colon X \to \mathbb P^1$. A mirror dual to $X$ can be identified with the component $\mathcal M$ of the moduli space of stable sheaves on $X$ having Mukai vector $(0, \mu, 0) \in H^\bullet(X, \mathbb Z)$, where $\mu$ is the cohomology class defined by the fibres of $p$. The mirror map between the Hodge lattices of $X$ and $\mathcal M$ should be given by a suitable Fourier–Mukai transform [12, 4, 5].

In this paper we show that a Fourier–Mukai transform on elliptically fibred K3 surfaces indeed provides a description of mirror symmetry. The Fourier–Mukai transform not only maps special Lagrangian 2-cycles to 0-cycles, as noticed by Morrison and others, but also reproduces the correct duality transformations on 4-cycles and on 2-cycles of genus 0. It turns out that the Fourier–Mukai transform does not define an automorphism of the cohomology ring of the K3 surface which swaps the directions corresponding to complex structures with the directions corresponding to complexified Kähler structures. In this sense our treatment is different from other approaches, cf. e.g. [2, 6, 8]. However, we are able to obtain an isometry between the tangent space to the deformations of complex structures on $X$ and the tangent space to the deformations of "complexified Kähler structures" on the mirror manifold. We also note that the map determined by the Fourier–Mukai transform has a correct action on the mass of the so-called BPS states.

In order to describe this "geometric mirror symmetry" two modifications must be introduced in the construction outlined above. First, we regard the mirror dual to the elliptic K3 surface $X$ as its compactified relative Jacobian $\hat X$ (this is actually isomorphic to $\mathcal M$); secondly, we define a Fourier–Mukai transform in a relative setting (cf. [15] for a relative Fourier–Mukai transform for abelian schemes). Moreover, the relative transform we define, once restricted to the smooth fibres, reduces to the usual Fourier–Mukai transform for abelian varieties; in this way the reduction of mirror symmetry to relative T-duality in the spirit of [20] is achieved.

It should be stressed that this analysis shows that the moduli space $\mathcal M$ is isomorphic to the original K3 surface $X$ as an algebraic variety, in accordance with the fact that, under this interpretation of mirror symmetry, a K3 surface is mirror to itself [20]. This fact, together with the existence of an isometry between the above mentioned spaces of deformations, is consistent with the triviality of the quantum cohomology of a K3 surface (in particular, the Weil-Petersson metric on the moduli space of complexified Kähler structures bears no instantonic corrections).
In more detail, the Fourier–Mukai functor $\mathbf T$ we define transforms a torsion-free rank-one zero-degree sheaf concentrated on an elliptic fibre of $X$ to a point of the compactified relative Jacobian $\hat X$; accordingly, $\mathbf T$ enjoys the T-duality property of relating 2-cycles to 0-cycles. Furthermore, $\mathbf T$ induces an isometry

\[
\psi\colon H^{1,1}(\hat X, \mathbb C)/\operatorname{Pic}(\hat X)\otimes\mathbb C \to H^{1,1}(X, \mathbb C)/\operatorname{Pic}(X)\otimes\mathbb C .
\]

The quotient $H^{1,1}(\hat X, \mathbb C)/\operatorname{Pic}(\hat X)\otimes\mathbb C$ can be regarded as the tangent space at $\hat X$ to the space of deformations of algebraic structures on $\hat X$ which preserve the Picard lattice, and similarly, $H^{1,1}(X, \mathbb C)/\operatorname{Pic}(X)\otimes\mathbb C$ is to be identified with the tangent space to the space of deformations of Kähler structures on $X$ preserving the Picard lattice. With these identifications in mind, the isometry $\psi$ can be regarded as an "infinitesimal" mirror map.

From a mathematical viewpoint the transform we define here provides another example of a Fourier–Mukai transform on K3 surfaces in addition to the one given in [3].

The paper is organized as follows. In Sect. 2 we fix notations, define the relative Fourier–Mukai functor and prove its first properties. In Sect. 3 we prove that it is invertible and thus gives rise to an equivalence of derived categories. In Sect. 4 we study the action of the Fourier–Mukai transform on the cohomology ring of the K3 surface $X$. In Sect. 5 we discuss how the Fourier–Mukai transform can be regarded as a mirror duality for string theories compactified on an elliptic K3 surface.

2. The Basic Construction

Let $p\colon X \to \mathbb P^1$ be a minimal projective elliptically fibred K3 surface (all algebraic varieties will be over $\mathbb C$). The fibration $p$ has singular fibres; these have been classified by Kodaira [11]. We assume that $p\colon X \to \mathbb P^1$ has a section $e\colon \mathbb P^1 \hookrightarrow X$ and write $H = e(\mathbb P^1)$. We shall denote by $X_t$ the fibre of $p$ over $t \in \mathbb P^1$, and by $i_t\colon X_t \hookrightarrow X$ the inclusion.

A compactification of the relative Jacobian. Let $\mathcal M$ be the moduli space of stable sheaves on $X$, of pure dimension 1 and Chern character $(0, \mu, 0)$, where $\mu$ is the cohomology class of the fibres of $p$. Results of Simpson [19] imply that $\mathcal M$ is a smooth projective surface (actually, a minimal K3 surface, cf. [16]). One may define a morphism

\[
\gamma\colon X \to \mathcal M, \tag{2.1}
\]
\[
x \mapsto (i_t)_*\bigl(\mathfrak m_x \otimes \mathcal O_{X_t}(e(t))\bigr), \tag{2.2}
\]

where $X_t \ni x$, and $\mathfrak m_x$ is the ideal sheaf of $x$ in $X_t$. Let $U \subset \mathbb P^1$ be the open subset supporting the smooth fibres of $p$, and let $J(X|_U)$ be the relative Jacobian variety. The restriction of $\gamma$ to $X|_U$ factors as

\[
X|_U \xrightarrow{\ \sim\ } J(X|_U) \hookrightarrow \mathcal M ,
\]

where the isomorphism $X|_U \xrightarrow{\sim} J(X|_U)$ is given by $x \mapsto \mathcal O_{X_t}(e(t)-x) = \mathfrak m_x \otimes \mathcal O_{X_t}(e(t))$ and $J(X|_U) \hookrightarrow \mathcal M$ associates with any zero-degree torsion-free sheaf $\mathcal L_t$ over $X_t$ its direct image $(i_t)_*\mathcal L_t$. Then $\gamma$ is birational, and $X \simeq \mathcal M$ since they both are smooth projective surfaces and $X$ is minimal.

We now want to construct a suitable compactification of the relative Jacobian of $p\colon X \to \mathbb P^1$. We denote by $\mathbf{Pic}^-_{X/\mathbb P^1}$ the functor which to any morphism $f\colon S \to \mathbb P^1$ of algebraic varieties associates the space of $S$-flat sheaves on $p_S\colon X\times_{\mathbb P^1} S \to S$, whose restrictions to the fibres of $p_S$ are torsion-free, of rank one and degree zero. (When the fibre is reducible, by "torsion-free" we mean stable of pure dimension 1.) Two such sheaves $\mathcal F$, $\mathcal F'$ are considered to be equivalent if $\mathcal F' \simeq \mathcal F \otimes p_S^*\mathcal L$ for a line bundle $\mathcal L$ on $S$ (cf. [1]). Due to the existence of the section $e$, $\mathbf{Pic}^-_{X/\mathbb P^1}$ is a sheaf functor.

Proposition 2.1. The functor $\mathbf{Pic}^-_{X/\mathbb P^1}$ is represented by an algebraic variety $\hat p\colon \hat X \to \mathbb P^1$, which is isomorphic to $X$.

Proof. If we denote by $h_X$, $h_{\mathcal M}$ the functors of points of $X$, $\mathcal M$ as schemes over $\mathbb P^1$, the isomorphism $\gamma\colon h_X \xrightarrow{\sim} h_{\mathcal M}$ factors as

\[
h_X \xrightarrow{\ \varpi\ } \mathbf{Pic}^-_{X/\mathbb P^1} \xrightarrow{\ \alpha\ } h_{\mathcal M},
\]

where $\varpi$ and $\alpha$ are defined (over the closed points) by $\varpi(x) = \mathfrak m_x \otimes \mathcal O_{X_t}(e(t))$ and $\alpha(\mathcal L_t) = (i_t)_*\mathcal L_t$ for any zero-degree torsion-free sheaf $\mathcal L_t$ over $X_t$. Both morphisms of functors are immersions and their composition is an isomorphism, so that they are isomorphisms as well. Then, $\mathbf{Pic}^-_{X/\mathbb P^1}$ is represented by a fibred algebraic variety $\hat p\colon \hat X \to \mathbb P^1$, and $\varpi\colon X \to \hat X$ is an isomorphism.

We denote by $\hat e\colon \mathbb P^1 \to \hat X$ the canonical section; one has $\hat e = \varpi \circ e$. Moreover, we denote by $\pi$, $\hat\pi$ the projections of the fibred product $X\times_{\mathbb P^1}\hat X$ onto the factors.
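As a quick sanity check on the definition of $\varpi$, one can verify that $\varpi(x)$ has degree zero and that the section $e$ is sent to the trivial sheaf. The computation below is only a sketch and assumes that $X_t$ is a smooth fibre, so that the ideal sheaf of a point is invertible.

```latex
% Assumption: X_t is a smooth fibre, hence \mathfrak m_x = \mathcal O_{X_t}(-x).
\begin{align*}
\varpi(x) &= \mathfrak m_x \otimes \mathcal O_{X_t}(e(t)) \simeq \mathcal O_{X_t}(e(t)-x), \\
\deg \varpi(x) &= \deg e(t) - \deg x = 1 - 1 = 0, \\
\varpi(e(t)) &\simeq \mathcal O_{X_t}(e(t)-e(t)) = \mathcal O_{X_t}.
\end{align*}
```

In particular the composition $\varpi\circ e$ picks out the trivial sheaf on each smooth fibre, that is, the zero section of the relative Jacobian over $U$.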


Remark 2.2. The Picard functor is also representable by an open dense subscheme $J$ of $\hat X$, the relative Jacobian $J \to \mathbb P^1$ of $X \to \mathbb P^1$. If $X^s \subset X$ denotes the complement of the singular points of the fibres of $p$, then $\varpi$ gives an isomorphism $X^s \xrightarrow{\sim} J$ of schemes over $\mathbb P^1$. One should notice that in general the Jacobian variety $J \to \mathbb P^1$ is different from $\operatorname{Pic}^0(X/\mathbb P^1) \to \mathbb P^1$. This scheme can be obtained from $J \to \mathbb P^1$ by removing the images by $\varpi$ of those components of the singular fibres of $p\colon X \to \mathbb P^1$ that do not meet the image $H$ of the section.

The representability of $\mathbf{Pic}^-_{X/\mathbb P^1}$ means there exists a coherent sheaf $\mathcal P$ on $X\times_{\mathbb P^1}\hat X$ flat over $\hat X$, whose restrictions to the fibres of $\hat\pi$ are torsion-free, and of rank one and degree zero, such that

\[
\operatorname{Hom}_{\mathbb P^1}(S, \hat X) \to \mathbf{Pic}^-_{X/\mathbb P^1}(S), \qquad \phi \mapsto [(1\times\phi)^*\mathcal P] \tag{2.3}
\]

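is an isomorphism of functors. $\mathcal P$ is defined up to tensor product by the pullback of a line bundle on $\hat X$, and is called the universal Poincaré sheaf.

To normalize the Poincaré sheaf we notice that the isomorphism $\varpi\colon X \xrightarrow{\sim} \hat X$ is induced, according to the universal property (2.3), by the sheaf $\mathcal I_\Delta \otimes p_1^*\mathcal O_X(H)$ on $X\times_{\mathbb P^1} X$, where $p_1$ is the projection onto the first factor and $\mathcal I_\Delta$ is the ideal sheaf of the diagonal $\delta\colon X \hookrightarrow X\times_{\mathbb P^1} X$; this sheaf is flat over the second factor and has zero relative degree. Then

\[
\mathcal P = (1\times\varpi^{-1})^*\bigl(\mathcal I_\Delta \otimes p_1^*\mathcal O_X(H)\bigr) \otimes \hat\pi^*\mathcal L \tag{2.4}
\]

for a line bundle $\mathcal L$ on $\hat X$. Restriction to $H\times_{\mathbb P^1}\hat X$ gives $\mathcal P|_{H\times_{\mathbb P^1}\hat X} \simeq \mathcal O_{\hat X}(-2)\otimes\mathcal L$, which is a line bundle. We can then normalize $\mathcal P$ by letting

\[
\mathcal P|_{H\times_{\mathbb P^1}\hat X} \simeq \mathcal O_{\hat X}. \tag{2.5}
\]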

We shall henceforth assume that $\mathcal P$ is normalized in this way. We shall denote by $\mathcal P_\xi$ the restriction $\mathcal P|_{\hat\pi^{-1}(\xi)}$. As a consequence of (2.4), $\mathcal P$ is flat over $X$ as well.

Remark 2.3. Since the moduli space $\mathcal M$ is fine, on $X\times\mathcal M \simeq X\times\hat X$ there is a universal sheaf $\mathcal Q$. This is the sheaf that gives rise to the morphism $\gamma$ (cf. Eq. (2.2)). One can show that $\mathcal Q$ is supported on $X\times_{\mathbb P^1}\hat X$, and its restriction to its support is isomorphic to $\mathcal P$ (up to tensoring by a pullback of a line bundle on $\hat X$).

The dual $\mathcal P^*$ of the Poincaré bundle is a coherent sheaf on $X\times_{\mathbb P^1}\hat X$ whose restrictions to the fibres of $\hat\pi$ are torsion-free, rank one, and of degree zero. As $\mathcal P^*$ is flat over $\hat X$ it defines a morphism $\iota\colon \hat X \to \hat X$. Since $\mathcal{E}xt^1_{\mathcal O_{X_t}}(\mathcal P_\xi, \mathcal O_{X_t}) = 0$ for every point $\xi \in \hat X$ (here $t = \hat p(\xi)$) and $\mathcal{E}xt^1_{\mathcal O_{X\times_{\mathbb P^1}\hat X}}(\mathcal P, \mathcal O_{X\times_{\mathbb P^1}\hat X}) = 0$ by (2.4), the base change property for the local Ext's ([1], Theorem 1.9) implies that $(\mathcal P^*)_\xi \simeq (\mathcal P_\xi)^*$. Then, the morphism $\iota\colon \hat X \to \hat X$ maps any rank-one torsion-free and zero-degree coherent sheaf $\mathcal F$ on a fibre $X_t$ to its dual $\mathcal F^*$. By (2.3) one has $(1\times\iota)^*\mathcal P \simeq \mathcal P^*\otimes\hat\pi^*\mathcal N$ for some line bundle $\mathcal N$ on $\hat X$, which turns out to be trivial by (2.5). Then

\[
(1\times\iota)^*\mathcal P \simeq \mathcal P^* . \tag{2.6}
\]


The morphism $\iota\circ\iota\colon \hat X \to \hat X$ is the identity on the Jacobian $J(X|_U) \subset \hat X$; by separateness $\iota\circ\iota = \mathrm{Id}$, and (2.6) implies $\mathcal P \simeq \mathcal P^{**}$. Then, every coherent sheaf $\mathcal F$ on $X\times_{\mathbb P^1} S$ flat over $S$ whose restrictions to the fibres of $X\times_{\mathbb P^1} S \to S$ are torsion-free and of rank one and degree zero is reflexive, $\mathcal F \simeq \mathcal F^{**}$.

Proposition 2.4. The relative Jacobian of the fibration $\hat p\colon \hat X \to \mathbb P^1$ admits a compactification which is isomorphic to $X$ as a fibred variety over $\mathbb P^1$, and the relevant universal Poincaré sheaf may be identified with $\mathcal P^*$.

Proof. By (2.4), the sheaf $\mathcal P^*$ is flat over $X$. Proceeding as above, one proves that $(\mathcal P^*)_x \simeq (\mathcal P_x)^*$ for every point $x \in X$, which means that the restrictions of $\mathcal P^*$ to the fibres of $\pi$ are torsion-free sheaves of rank one and degree zero. So $\mathcal P^*$ defines a morphism $X \to \hat{\hat X}$ of schemes over $\mathbb P^1$. If $U \subset \mathbb P^1$ denotes as above the open subset supporting the smooth fibres of $p$, this morphism restricts to an isomorphism $X|_U \simeq J(\hat X)|_U$ of schemes over $U$. Since $\hat{\hat X}$ is minimal and $X$ is smooth and has no $(-1)$-curves, $X \simeq \hat{\hat X}$.

The roles of $X$ and $\hat X$ are thus completely interchangeable.

The Fourier–Mukai functors. For any morphism $f\colon S \to \mathbb P^1$ let us consider the diagram

\[
\begin{array}{ccc}
(X\times_{\mathbb P^1}\hat X)_S & \xrightarrow{\ \hat\pi_S\ } & \hat X_S \\
{\scriptstyle \pi_S}\big\downarrow & & \big\downarrow{\scriptstyle \hat p_S} \\
X_S & \xrightarrow{\ p_S\ } & S
\end{array}
\]

We shall systematically denote objects obtained by base change to $S$ by a subscript $S$. (Note that $(X\times_{\mathbb P^1}\hat X)_S \simeq X_S\times_S\hat X_S$.) We define the Fourier–Mukai functors $\mathbf S^i_S$, $i = 0, 1$, by associating with every sheaf $\mathcal F$ on $X_S$ flat over $S$ the sheaf on $\hat X_S$

\[
\mathbf S^i_S(\mathcal F) = R^i\hat\pi_{S*}(\pi_S^*\mathcal F \otimes \mathcal P_S) .
\]

The Fourier–Mukai functors mapping sheaves on $X$ to sheaves on $\hat X$ will be denoted by $\mathbf S^i$.

Definition 2.5. We say that a coherent sheaf $\mathcal F$ on $X_S$ flat over $S$ is WIT$_i$ if $\mathbf S^j_S(\mathcal F) = 0$ for $j \neq i$. We say that $\mathcal F$ is IT$_i$ if it is WIT$_i$ and $\mathbf S^i_S(\mathcal F)$ is locally free.

One should notice that, due to the presence of the fibred instead of the cartesian product, the WIT$_0$ and IT$_0$ conditions are not equivalent: for instance $\kappa(x)$ (the skyscraper sheaf concentrated at $x \in X$) is WIT$_0$ but not IT$_0$.

Since the fibres of $\hat\pi_S$ are one-dimensional the first direct image functor commutes with base change.

Proposition 2.6. Let $\mathcal F$ be a sheaf on $X_S$, flat over $S$. For every morphism $g\colon T \to S$ one has $g_{\hat X}^*\,\mathbf S^1_S(\mathcal F) \simeq \mathbf S^1_T(g_X^*\mathcal F)$, where $g_X\colon X_T \to X_S$, $g_{\hat X}\colon \hat X_T \to \hat X_S$ are the morphisms induced by $g$.

The zeroth direct image does not commute with base change; however, a weaker property holds.


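Proposition 2.7. Let $\mathcal F$ be a sheaf on $X_S$, flat over $S$. For every point $\xi \in \hat X_S$, the natural base change morphism

\[
\hat\pi_{S*}(\pi_S^*\mathcal F\otimes\mathcal P_S)\otimes\kappa(\xi) \to H^0(X_s, \mathcal F_s\otimes\mathcal P_\xi)
\]

is injective (here $s = \hat p_S(\xi)$).

Proof. Let $\mathfrak m_\xi$ denote the ideal sheaf of $\xi \in \hat X_S$. Since $\hat\pi_S$ is flat, $\hat\pi_S^*\mathfrak m_\xi$ is the ideal sheaf of the fibre $\hat\pi_S^{-1}(\xi) \simeq X_s$ in $X_S\times_S\hat X_S$. Let us write $\mathcal N = \pi_S^*\mathcal F\otimes\mathcal P_S$ and $\mathcal N_\xi = \mathcal N|_{\hat\pi_S^{-1}(\xi)}$. Since $\mathcal N$ is flat over $\hat X_S$ there is an exact sequence

\[
0 \to \hat\pi_S^*\mathfrak m_\xi\otimes\mathcal N \to \mathcal N \to j_*\mathcal N_\xi \to 0,
\]

where $j\colon \hat\pi_S^{-1}(\xi) \simeq X_s \hookrightarrow X_S\times_S\hat X_S$ is the natural immersion. By taking direct images we obtain

\[
0 \to \hat\pi_{S*}(\hat\pi_S^*\mathfrak m_\xi\otimes\mathcal N) \to \hat\pi_{S*}(\mathcal N) \xrightarrow{\ \eta\ } \hat\pi_{S*}(j_*\mathcal N_\xi) = H^0(X_s, \mathcal N_\xi) .
\]

By the projection formula, $\hat\pi_{S*}(\hat\pi_S^*\mathfrak m_\xi\otimes\mathcal N) \simeq \mathfrak m_\xi\otimes\hat\pi_{S*}\mathcal N$, and then $\ker\eta \simeq \mathfrak m_\xi\cdot\hat\pi_{S*}\mathcal N$; this implies that the base change morphism $\hat\pi_{S*}\mathcal N\otimes\kappa(\xi) \to H^0(X_s, \mathcal N_\xi)$ is injective.

Fourier–Mukai transform of rank 1 sheaves. A first manifestation of geometric mirror symmetry is the fact that the Fourier–Mukai transform of a torsion-free rank-one zero-degree coherent sheaf on a fibre $X_t$ is a skyscraper sheaf concentrated at a point of $\hat X_t$.

By Proposition 2.6 the basic ingredients to compute the functors $\mathbf S^\bullet$ are the Fourier–Mukai transforms $\mathbf S^\bullet_{\hat X}(\mathcal P)$ of the universal Poincaré sheaf $\mathcal P$ on $X_{\hat X} = X\times_{\mathbb P^1}\hat X$. The relevant higher direct images of $\mathcal P$ and $\mathcal P^*$ are computed as follows. (For every algebraic variety $q\colon Y \to \mathbb P^1$ over $\mathbb P^1$ and every coherent sheaf $\mathcal N$ on $Y$ we denote by $\mathcal N(n)$ the sheaf $\mathcal N\otimes q^*\mathcal O_{\mathbb P^1}(n)$.)

Theorem 2.8.
1. $\mathbf S^1_{\hat X}(\mathcal P) \simeq \zeta_*\mathcal O_{\hat X}(-2)$, where $\zeta\colon \hat X \hookrightarrow \hat X\times_{\mathbb P^1}\hat X$ is the graph of the morphism $\iota$.
2. $\mathbf S^0_{\hat X}(\mathcal P) = 0$.
3. $\mathbf S^1_{\hat X}(\mathcal P^*) \simeq \delta_*\mathcal O_{\hat X}(-2)$, $\mathbf S^0_{\hat X}(\mathcal P^*) = 0$, where $\delta\colon \hat X \hookrightarrow \hat X\times_{\mathbb P^1}\hat X$ is the diagonal immersion.
4. $R^1\hat\pi_*\mathcal P \simeq R^1\hat\pi_*\mathcal P^* \simeq \hat e_*\mathcal O_{\mathbb P^1}(-2)$, while the zeroth direct images vanish.

A result similar to the second formula can be found in [15] for the case of relative abelian schemes. To prove Theorem 2.8 we need some preliminary results.

Lemma 2.9. Let $Y$ be a fibre of $p$ and $\mathcal F$ a torsion-free rank-one and zero-degree sheaf on $Y$. Then $H^1(Y, \mathcal F) \neq 0$ if and only if $\mathcal F \simeq \mathcal O_Y$.

Proof. Since $H^1(Y, \mathcal O_Y) \neq 0$, only the "only if" part needs proof, so assume $H^1(Y, \mathcal F) \neq 0$. One has $H^0(Y, \mathcal F) \neq 0$ by Riemann-Roch and $H^0(Y, \mathcal F^*) \neq 0$ by duality. Let $\tau$ and $\sigma$ be nonzero sections of $\mathcal F$ and $\mathcal F^*$ respectively. Let $\rho$ be the composition

\[
\mathcal F \to \mathcal F^{**} \xrightarrow{\ \sigma^*\ } \mathcal O_Y .
\]

Since $\rho\circ\tau \neq 0$, the morphism $\rho\circ\tau$ consists in the multiplication by a nonzero constant, which may be set to 1. Then $\rho\circ\tau = \mathrm{id}$, so that $\mathcal F \simeq \mathcal O_Y\oplus\mathcal M$, where $\mathcal M$ has rank zero; hence $\mathcal M = 0$, and $\mathcal F \simeq \mathcal O_Y$.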
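The appeal to Riemann-Roch in the proof of Lemma 2.9 can be made explicit as follows. This is only a sketch: it assumes that the degree of a torsion-free rank-one sheaf on the fibre $Y$ is normalized by $\chi(Y,\mathcal F) = \deg\mathcal F + \chi(Y,\mathcal O_Y)$, and it uses that a fibre of an elliptic fibration has arithmetic genus 1.

```latex
% Assumptions: \deg F := \chi(Y,F) - \chi(Y,O_Y), and p_a(Y) = 1 so \chi(Y,O_Y) = 0.
\begin{align*}
\chi(Y,\mathcal F) &= \deg\mathcal F + \chi(Y,\mathcal O_Y) = 0 + 0 = 0, \\
h^0(Y,\mathcal F) &= h^1(Y,\mathcal F), \\
H^1(Y,\mathcal F)\neq 0 &\iff H^0(Y,\mathcal F)\neq 0 .
\end{align*}
```

The same count applied to $\mathcal O_Y$ gives $h^1(Y,\mathcal O_Y) = h^0(Y,\mathcal O_Y) \geq 1$, which accounts for the "if" direction of the lemma.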


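Lemma 2.10. Let $\xi, \mu \in \hat X_t$.
1. The sheaf $\mathcal P_\xi\otimes\mathcal P_\mu$ has torsion if and only if $\xi$ is a singular point of the fibre $\hat X_t$ and $\mu = \xi$. In that case, $\iota(\xi) = \xi$.
2. The evaluation morphism $\mathcal P_\xi\otimes\mathcal P_\xi^* \to \mathcal O_{X_t}$ induces an isomorphism $H^1(X_t, \mathcal P_\xi\otimes\mathcal P_\xi^*) \simeq H^1(X_t, \mathcal O_{X_t})$.
3. If $\mu \neq \iota(\xi)$ then $H^1(X_t, \mathcal P_\xi\otimes\mathcal P_\mu) = 0$.

Proof. 1. We have $\mathcal P_\xi = \mathfrak m_x(e(t))$ for a point $x \in X_t$, and $\xi$ is singular in $\hat X_t$ iff $x$ is singular in $X_t$. If $\xi$ or $\mu$ are not singular, then one of the sheaves $\mathcal P_\xi$, $\mathcal P_\mu$ is locally free, and $\mathcal P_\xi\otimes\mathcal P_\mu$ is torsion-free. Otherwise, $\mathcal P_\xi = \mathfrak m_x(e(t))$ and $\mathcal P_\mu = \mathfrak m_y(e(t))$ for singular points $x, y \in X_t$. If $\mu \neq \iota(\xi)$ then $\mathfrak m_x\otimes\mathfrak m_y$ is torsion-free. Finally, if $\mu = \iota(\xi)$ then $\mathfrak m_x(2e(t)) = \mathfrak m_y^*$, so that $x = y$ (because $\mathfrak m_x$, $\mathfrak m_y$ are not locally-free only at $x$, $y$, respectively). Thus $\mu = \xi$, and $\mathfrak m_x\otimes\mathfrak m_x^*$ has torsion at $x$.

2. The only nontrivial case is when $\mathcal P_\xi$ is not locally free. Let $x \in X_t$ be the singular point corresponding to $\xi$. We have an exact sequence

\[
0 \to \mathfrak m_x \to \mathcal P_\xi\otimes\mathcal P_\xi^* \to \mathfrak m_x/\mathfrak m_x^2 \to 0,
\]

which implies that $H^1(X_t, \mathfrak m_x) \to H^1(X_t, \mathcal P_\xi\otimes\mathcal P_\xi^*)$ is an epimorphism. Since the composition $H^1(X_t, \mathfrak m_x) \to H^1(X_t, \mathcal P_\xi\otimes\mathcal P_\xi^*) \to H^1(X_t, \mathcal O_{X_t})$ is an isomorphism, $H^1(X_t, \mathcal P_\xi\otimes\mathcal P_\xi^*) \simeq H^1(X_t, \mathcal O_{X_t})$ is an isomorphism as well.

3. Follows from 1 and Lemma 2.9.

In order to compute the Fourier–Mukai transform $\mathbf S^\bullet_{\hat X}(\mathcal P)$ of the Poincaré sheaf $\mathcal P$ on $X\times_{\mathbb P^1}\hat X$ we consider the diagram

\[
\begin{array}{ccc}
X\times_{\mathbb P^1}\hat X\times_{\mathbb P^1}\hat X & \xrightarrow{\ \pi_{23}\ } & \hat X\times_{\mathbb P^1}\hat X \\
{\scriptstyle \pi_{12}}\big\downarrow & & \big\downarrow{\scriptstyle \hat p_1} \\
X\times_{\mathbb P^1}\hat X & \xrightarrow{\ \hat\pi\ } & \hat X
\end{array}
\]

The Poincaré sheaf on $X\times_{\mathbb P^1}\hat X\times_{\mathbb P^1}\hat X$ is $\mathcal P_{\hat X} = \pi_{13}^*\mathcal P$ and the Fourier–Mukai transforms of $\mathcal P$ are $\mathbf S^\bullet_{\hat X}(\mathcal P) = R^\bullet\pi_{23*}(\pi_{12}^*\mathcal P\otimes\pi_{13}^*\mathcal P)$.

Proof of 1 of Theorem 2.8. We have $\mathcal P\otimes\mathcal P^* = (1\times\zeta)^*(\pi_{12}^*\mathcal P\otimes\pi_{13}^*\mathcal P)$. The composition of the epimorphism $\pi_{12}^*\mathcal P\otimes\pi_{13}^*\mathcal P \to (1\times\zeta)_*(\mathcal P\otimes\mathcal P^*)$ with the evaluation morphism $(1\times\zeta)_*(\mathcal P\otimes\mathcal P^*) \to (1\times\zeta)_*(\mathcal O_{X\times_{\mathbb P^1}\hat X})$ gives a morphism

\[
\pi_{12}^*\mathcal P\otimes\pi_{13}^*\mathcal P \to (1\times\zeta)_*(\mathcal O_{X\times_{\mathbb P^1}\hat X}) .
\]

We have then a morphism

\[
\mathbf S^1_{\hat X}(\mathcal P) = R^1\pi_{23*}(\pi_{12}^*\mathcal P\otimes\pi_{13}^*\mathcal P) \to R^1\pi_{23*}\bigl((1\times\zeta)_*(\mathcal O_{X\times_{\mathbb P^1}\hat X})\bigr) \simeq \zeta_*\mathcal O_{\hat X}(-2) .
\]

Since the first direct image functor commutes with base change, Lemma 2.10 implies that $\mathbf S^1_{\hat X}(\mathcal P)$ is supported on $\zeta(\hat X)$. Moreover the fibre of the previous morphism at a point $\zeta(\xi)$, with $\xi \in \hat X_t$, is $H^1(X_t, \mathcal P_\xi\otimes\mathcal P_\xi^*) \to H^1(X_t, \mathcal O_{X_t})$, which is an isomorphism by Lemma 2.10.

Let $f\colon S \to \mathbb P^1$ be a morphism and $\mathcal F$ a coherent sheaf on $X\times_{\mathbb P^1} S$ flat over $S$ whose restrictions to the fibres of $p_S$ are torsion-free and have rank one and degree zero. Let $\phi\colon S \to \hat X$ be the morphism determined by the universal property (2.3), so that

\[
(1\times\phi)^*\mathcal P \simeq \mathcal F\otimes p_S^*\mathcal L
\]

for a line bundle $\mathcal L$ on $S$. Let $\Gamma\colon S \hookrightarrow \hat X_S$ be the graph of the morphism $\iota\circ\phi\colon S \to \hat X$.

Lemma 2.11.
\[
\mathbf S^1_S(\mathcal F)\otimes\hat p_S^*\mathcal L \simeq \Gamma_*\mathcal O_S(-2), \qquad \mathbf S^0_S(\mathcal F) = 0 .
\]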

Proof. The formula for $\mathbf S^1_S(\mathcal F)$ follows from Proposition 2.6 and 1 of Theorem 2.8 after some standard computations. The second formula is proved as follows. From Proposition 2.7 we have the exact sequence

\[
0 \to \hat\pi_{S*}(\pi_S^*\mathcal F\otimes\mathcal P_S)\otimes\kappa(\xi) \to H^0(X_s, \mathcal F_s\otimes\mathcal P_\xi),
\]

where $s = \hat p_S(\xi)$. If $\xi \notin \Gamma(S)$, $H^0(X_s, \mathcal F_s\otimes\mathcal P_\xi) = 0$ by Lemma 2.10 and $\hat\pi_{S*}(\pi_S^*\mathcal F\otimes\mathcal P_S)\otimes\kappa(\xi) = 0$ as well. If $\xi \in \Gamma(S)$ the first direct image $\mathbf S^1_S(\mathcal F)$ is not locally-free at $\xi$ since it is concentrated on the image of $\Gamma$, and then the second arrow is not surjective; but $H^0(X_s, \mathcal F_s\otimes\mathcal P_\xi)$ is one-dimensional by Lemma 2.10, so that $\hat\pi_{S*}(\pi_S^*\mathcal F\otimes\mathcal P_S)\otimes\kappa(\xi) = 0$.

End of proof of Theorem 2.8. 2 is proved by applying Lemma 2.11 with $S = \hat X$ and $\phi$ the identity, while to prove 3 one chooses $S = \hat X$ and $\phi = \iota$. Taking $S = \mathbb P^1$ and $\phi = \hat e$ one proves the claims of 4 concerning the sheaf $\mathcal P$. To prove the claims for $\mathcal P^*$ one notices that Lemma 2.11 still applies after replacing $\mathcal P$ by $\mathcal P^*$.

We can now compute the Fourier–Mukai transform of sheaves on $X$ corresponding to points in $\hat X$.

Corollary 2.12. Let $\mathcal F$ be a rank-one, zero-degree, torsion-free coherent sheaf on a fibre $X_t$. Then

\[
\mathbf S^0_t(\mathcal F) = 0, \qquad \mathbf S^1_t(\mathcal F) = \kappa([\mathcal F^*]),
\]

where $[\mathcal F^*]$ is the point of $\hat X_t$ defined by $\mathcal F^*$.

3. Inversion of the Fourier–Mukai Transform

The Fourier–Mukai functor defines a functor $D^-(X) \to D^-(\hat X)$ given by $\mathbf S(F) = R\hat\pi_*(\pi^*F \overset{L}{\otimes} \mathcal P)$ (here $D^-(X)$ is the subcategory of the derived category of coherent $\mathcal O_X$-modules consisting of the complexes bounded from above). To state the invertibility properties of this functor in a neat way we define a modified functor $\mathbf T\colon D^-(X) \to D^-(\hat X)$ by $\mathbf T(F) = \mathbf S(F \overset{L}{\otimes} \mathcal O_X(1))$. A natural candidate for the inverse of $\mathbf T$ is the functor $\hat{\mathbf T}\colon D^-(\hat X) \to D^-(X)$ given by

\[
\hat{\mathbf T}(G) = \hat{\mathbf S}\bigl(G \overset{L}{\otimes} \mathcal O_{\hat X}(1)\bigr), \qquad \text{where} \qquad \hat{\mathbf S}(G') = R\pi_*\bigl(\hat\pi^*G' \overset{L}{\otimes} \mathcal P^*\bigr) .
\]

Since the relative dualizing complexes of $\pi$ and $\hat\pi$ are both isomorphic to $\mathcal O_{X\times_{\mathbb P^1}\hat X}(2)[1]$, relative duality gives:


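Proposition 3.1. For every object $F$ in $D^-(X)$ and $G$ in $D^-(\hat X)$ one has functorial isomorphisms

\[
\operatorname{Hom}_{D^-(\hat X)}(G, \mathbf T(F)) \simeq \operatorname{Hom}_{D^-(X)}(\hat{\mathbf T}(G), F[-1]),
\]
\[
\operatorname{Hom}_{D^-(X)}(F, \hat{\mathbf T}(G)) \simeq \operatorname{Hom}_{D^-(\hat X)}(\mathbf T(F), G[-1]) .
\]

Theorem 3.2. For every $G \in D^-(\hat X)$, $F \in D^-(X)$ there are functorial isomorphisms

\[
\mathbf T(\hat{\mathbf T}(G)) \simeq G[-1], \qquad \hat{\mathbf T}(\mathbf T(F)) \simeq F[-1]
\]

in the derived categories $D^-(\hat X)$ and $D^-(X)$, respectively.

Proof. Let $\hat\pi_1$ and $\hat\pi_2$ be the projections onto the two factors of $\hat X\times_{\mathbb P^1}\hat X$. Then $\mathbf T(\hat{\mathbf T}(G)) = R\hat\pi_{2,*}\bigl(\hat\pi_1^*G \overset{L}{\otimes} \tilde{\mathcal P}\otimes\mathcal O_{\hat X\times_{\mathbb P^1}\hat X}(2)\bigr)$ (see [13, 15, 16, 3] for similar statements), with

\[
\tilde{\mathcal P} = R\pi_{23,*}(\pi_{12}^*\mathcal P^*\otimes\pi_{13}^*\mathcal P) .
\]

By Theorem 2.8, $\tilde{\mathcal P} \simeq \delta_*(\mathcal O_{\hat X}(-2))[-1]$ in the derived category, and $\mathbf T(\hat{\mathbf T}(G)) \simeq G[-1]$. The second statement follows from the first by interchanging the roles of $X$ and $\hat X$.

So $\mathbf T$ establishes an equivalence of triangulated categories.

Corollary 3.3. Let $\mathcal F$ be a WIT$_i$ sheaf on $X$. Then its Fourier–Mukai transform $\mathbf T^i(\mathcal F)$ is a WIT$_{1-i}$ sheaf on $\hat X$, whose Fourier–Mukai transform

\[
\hat{\mathbf T}^{1-i}(\mathbf T^i(\mathcal F)) = R^{1-i}\pi_*\bigl(\hat\pi^*\mathbf T^i(\mathcal F)\otimes\mathcal P^*(1)\bigr)
\]

is isomorphic to $\mathcal F$.

We also have a property of preservation of the Hom groups, which is sometimes called "Parseval theorem."

Proposition 3.4. There are functorial isomorphisms

\[
\operatorname{Hom}_{D^-(\hat X)}(G, \bar G) \simeq \operatorname{Hom}_{D^-(X)}(\hat{\mathbf S}(G), \hat{\mathbf S}(\bar G)) \simeq \operatorname{Hom}_{D^-(X)}(\hat{\mathbf T}(G), \hat{\mathbf T}(\bar G)),
\]
\[
\operatorname{Hom}_{D^-(X)}(F, \bar F) \simeq \operatorname{Hom}_{D^-(\hat X)}(\mathbf S(F), \mathbf S(\bar F)) \simeq \operatorname{Hom}_{D^-(\hat X)}(\mathbf T(F), \mathbf T(\bar F))
\]

for $F$, $\bar F$ in $D^-(X)$ and $G$, $\bar G$ in $D^-(\hat X)$.

Corollary 3.5. Let $\mathcal F$, $\mathcal F'$ be coherent sheaves on $X$. If $\mathcal F$ is WIT$_i$ and $\mathcal F'$ is WIT$_j$, we have

\[
\operatorname{Ext}^h(\mathcal F, \mathcal F') \simeq \operatorname{Ext}^{h+i-j}(\mathbf S^i(\mathcal F), \mathbf S^j(\mathcal F')) \simeq \operatorname{Ext}^{h+i-j}(\mathbf T^i(\mathcal F), \mathbf T^j(\mathcal F'))
\]

for $h = 0, 1$. In particular, if $\mathcal F$ is WIT$_i$ there is an isomorphism $\operatorname{Ext}^h(\mathcal F, \mathcal F) \simeq \operatorname{Ext}^h(\mathbf T^i(\mathcal F), \mathbf T^i(\mathcal F))$ for every $h$, so that $\mathbf T^i(\mathcal F)$ is simple if $\mathcal F$ is.

Remark 3.6. Moduli spaces of sheaves on holomorphic symplectic surfaces carry a holomorphic symplectic structure, which is given by the Yoneda pairing $\operatorname{Ext}^1(\mathcal F, \mathcal F)\otimes\operatorname{Ext}^1(\mathcal F, \mathcal F) \to \operatorname{Ext}^2(\mathcal F, \mathcal F) \simeq \mathbb C$ (cf. [14]), where one identifies $\operatorname{Ext}^1(\mathcal F, \mathcal F)$ with the tangent space to the moduli space at the point corresponding to the sheaf $\mathcal F$. Whenever the Fourier–Mukai transform establishes a morphism between such moduli spaces, Corollary 3.5 implies that the morphism is symplectic.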
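The degree bookkeeping that leads from Theorem 3.2 to Corollary 3.3 can be sketched as follows. The shift convention $(K[n])^k = K^{k+n}$ and the identification of the transform of a WIT$_i$ sheaf with a one-term complex are assumptions made here for illustration; they are standard but not spelled out in the text.

```latex
% Assumption: F is WIT_i, so the complex T(F) has cohomology only in degree i.
\begin{align*}
\mathbf T(\mathcal F) &\simeq \mathbf T^i(\mathcal F)[-i], \\
\hat{\mathbf T}\bigl(\mathbf T^i(\mathcal F)\bigr)
  &\simeq \hat{\mathbf T}\bigl(\mathbf T(\mathcal F)\bigr)[i]
   \simeq \mathcal F[-1][i] = \mathcal F[i-1], \\
H^{k}\bigl(\mathcal F[i-1]\bigr) &=
  \begin{cases} \mathcal F, & k = 1-i, \\ 0, & k \neq 1-i. \end{cases}
\end{align*}
```

Hence the transform of $\mathbf T^i(\mathcal F)$ is concentrated in degree $1-i$, which is the WIT$_{1-i}$ condition, and its only nonzero cohomology sheaf is $\mathcal F$.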


4. Action on the Cohomology Ring

The cohomology ring $H^\bullet(X, \mathbb Z)$ carries a bilinear pairing, usually called Mukai pairing, defined as

\[
(a, b, c)\cdot(a', b', c') = (b\cup b' - a\cup c' - a'\cup c)\,\backslash\,[X],
\]

and the same is true for $H^\bullet(\hat X, \mathbb Z)$ (here $\backslash$ denotes the slant product). We define an isomorphism $f\colon H^\bullet(X, \mathbb Q) \to H^\bullet(\hat X, \mathbb Q)$ and want to show that in terms of $f$ one can introduce an isometry between the tangent space to the moduli space of algebraic structures on $\hat X$ and the space of deformations of the complexified Kähler structure on $X$, which can be regarded as a geometric realization of mirror symmetry. We define the map $f$ basically as in [16], but the properties of this map are slightly different, since we are working in a relative setting, and the relative dualizing sheaf is nontrivial. Also, we must take coefficients in $\mathbb Q$ because the relative Todd characters involved in the definition of the $f$ map do not have integral square roots.

The f map. We now define the $f$ map and describe its basic properties. We shall be concerned with varieties fibred over $\mathbb P^1$, $\phi_Y\colon Y \to \mathbb P^1$, with a section $\sigma_Y\colon \mathbb P^1 \hookrightarrow Y$. Since $\sigma_Y^*\circ\phi_Y^* = 1$, there is a decomposition $H^\bullet(Y, \mathbb Q) \simeq \phi_Y^*H^\bullet(\mathbb P^1, \mathbb Q)\oplus H^\bullet_\phi(Y, \mathbb Q)$, where $H^\bullet_\phi(Y, \mathbb Q) = \ker\sigma_Y^*$. One has in particular

\[
H^0_\phi(Y, \mathbb Q) = 0, \qquad H^2(Y, \mathbb Q) = \mathbb Q\mu_Y\oplus H^2_\phi(Y, \mathbb Q), \qquad H^{2i}_\phi(Y, \mathbb Q) = H^{2i}(Y, \mathbb Q) \ \text{for } i \ge 2 .
\]

We define in $H^{\mathrm{even}}(Y, \mathbb Q)$ an involution $*$ by letting

\[
\alpha^* = (-1)^i\alpha \ \text{ if } \ \alpha \in H^{2i}_\phi(Y, \mathbb Q), \qquad (\phi_Y^*\eta)^* = \phi_Y^*\eta \ \text{ if } \ \eta \in H^{2i}(\mathbb P^1, \mathbb Q) .
\]

Turning back to the case where $X$ is an elliptic K3 surface, satisfying all the properties we have so far stated, we define morphisms

\[
f\colon H^\bullet(X, \mathbb Q) \to H^\bullet(\hat X, \mathbb Q), \qquad f'\colon H^\bullet(\hat X, \mathbb Q) \to H^\bullet(X, \mathbb Q)
\]

by letting

\[
f(\alpha) = \hat\pi_*(Z\,\pi^*\alpha), \qquad f'(\beta) = \pi_*(Z^*\,\hat\pi^*\beta),
\]

where

\[
Z = \sqrt{\operatorname{td}\hat\pi}\ \operatorname{ch}\bigl(\mathcal P\otimes\pi^*\mathcal O_X(1)\bigr)\,\sqrt{\operatorname{td}\pi} .
\]

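Lemma 4.1. The maps $f$, $f'$ have the following properties:
1. $f\circ f'(\beta) = -\beta$;
2. $f$ and $f'$ are $H^\bullet(\mathbb P^1, \mathbb Q)$-module isomorphisms;
3. $f(\mu) = -\hat w$, where $\hat w$ is the fundamental class of $\hat X$;
4. $f(H) = 1 + \hat w$;
5. $f(1) = -\hat\mu - \Theta + \hat w$, where $\hat\mu$ is the divisor given by the fibres of $\hat p\colon \hat X \to \mathbb P^1$, and $\Theta = \hat e(\mathbb P^1)$.
6. $\beta\cdot f(\alpha) = -f'(\beta)\cdot\alpha$ for $\alpha \in H^\bullet_p(X, \mathbb Q)$, $\beta \in H^\bullet_{\hat p}(\hat X, \mathbb Q)$.
7. $f$ establishes an isometry between $H^\bullet_p(X, \mathbb Q)$ and $H^\bullet_{\hat p}(\hat X, \mathbb Q)$.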


Proof. Property 1 is proved as in [16], p. 382, provided that suitable adaptations to the relative case are done. One also proves that $f'\circ f(\alpha) = -\alpha$, so that 2 follows. To prove 3, let $\mathcal L$ be a flat line bundle on a smooth fibre $X_t$ of $p$. One knows that $\operatorname{ch} i_{t*}(\mathcal L) = i_{t*}(1) = \mu$ since the normal bundle to $X_t$ is trivial. By Corollary 2.12 we have

\[
\mathbf S^0(i_{t*}\mathcal L) = 0, \qquad \mathbf S^1(i_{t*}\mathcal L) = k([\mathcal L^*]),
\]

where $[\mathcal L] \in \hat X_t$ is the isomorphism class of $\mathcal L$. By Riemann-Roch we get $-\hat w = f(\mu)$. (This implies $f'(\hat w) = \mu$; after swapping $X$ and $\hat X$ we get $f'(\hat\mu) = -w$, which implies $f(w) = \hat\mu$.) To prove 4 we apply Riemann-Roch to

\[
\mathbf S^0(\mathcal O_H) = \mathcal O_{\hat X}, \qquad \mathbf S^1(\mathcal O_H) = 0 .
\]

5 is now straightforward. Using these results one proves 6 as in [16]. 7 follows from 1 and 6.

If one defines a modified, $H^\bullet(\mathbb P^1, \mathbb Q)$-valued Mukai pairing by letting $\alpha\cdot\alpha' = p_*(\alpha^*\cup\alpha')$, then the map $f$ establishes an isometry $H^\bullet(X, \mathbb Q) \xrightarrow{\sim} H^\bullet(\hat X, \mathbb Q)$ as $H^\bullet(\mathbb P^1, \mathbb Q)$-modules.

Proposition 4.2. For all $\alpha \in H^\bullet(X, \mathbb Q)$, the $H^0(\hat X, \mathbb Q)$-component of $f(\alpha)$ is $\mu\cdot\alpha$. As a consequence, $f$ induces an isometry $\tilde f\colon \mu^\perp/\mathbb Q\mu \to H^2(\hat X, \mathbb Q)$.

Proof. We already know that $f(w)_0 = 0$ and $f(1)_0 = 0$, so we may assume $\alpha \in H^2(X, \mathbb Q)$. Then, $f(\alpha)_0 = \pi^*\alpha\,\backslash\,\mu = \alpha\cdot\mu$. Thus $f(\alpha)_0 = 0$ for $\alpha \in \mu^\perp$. We now define $\bar f\colon \mu^\perp \to H^2(\hat X, \mathbb Q)$ by taking $\bar f(\alpha)$ as the $H^2$-component of $f(\alpha)$. One has that $\bar f(\alpha) = 0$ if and only if $f(\alpha) = s\hat w$ ($s \in \mathbb Q$), and then $\alpha = -s\mu$, which proves that $\ker\bar f = \mathbb Q\mu$, and $\bar f$ induces an injective morphism $\tilde f\colon \mu^\perp/\mathbb Q\mu \hookrightarrow H^2(\hat X, \mathbb Q)$. If $\beta \in H^2(\hat X, \mathbb Q)$, $f'(\beta)\cdot\mu = 0$, and $\beta = \tilde f(-f'(\beta))$, thus finishing the proof.

Remark 4.3. The cohomology lattice $H^\bullet(X, \mathbb Z)$ contains a hyperbolic sublattice $U$ generated by $\mu$ and $H$, and the hyperbolic sublattice $V = H^0(X, \mathbb Z)\oplus H^4(X, \mathbb Z)$. From Proposition 4.2 we see that (after identifying $X$ and $\hat X$) the map $f$ swaps the lattices $U$ and $V$.

Topological invariants of the Fourier–Mukai transform. Let us assume at first that the Picard number of $X$ is two; then the Picard group of $X$ reduces to the hyperbolic lattice $U$ (this happens when $X$ has 24 singular fibres consisting of elliptic curves with a nodal singularity). It is then possible to compute the invariants of the Fourier–Mukai transform of a sheaf on $X$ by means of the Riemann-Roch formula, expressed in the form

\[
\operatorname{ch}\mathbf T^\bullet(\mathcal F) = \frac{1}{\sqrt{\operatorname{td}\hat p}}\, f\bigl((\operatorname{ch}\mathcal F)\sqrt{\operatorname{td} p}\bigr) .
\]

In particular, let us assume that $\mathcal F$ is WIT$_i$, and set $\hat{\mathcal F} = \mathbf T^i(\mathcal F)$, and

\[
\operatorname{ch}\mathcal F = r + a\,H + b\,\mu + c\,w, \qquad \text{where } r = \operatorname{rk}\mathcal F .
\]

We then have

\[
(-1)^i\operatorname{rk}\hat{\mathcal F} = a, \qquad (-1)^i c_1(\hat{\mathcal F}) = -r\,\Theta + c\,\hat\mu, \qquad (-1)^i\operatorname{ch}_2(\hat{\mathcal F}) = -b\,\hat w . \tag{4.1}
\]

In the same way, if $\mathcal E$ is a WIT$_i$ sheaf on $\hat X$, with

\[
\operatorname{ch}\mathcal E = r + a\,\Theta + b\,\hat\mu + c\,\hat w,
\]

we have after setting $\hat{\mathcal E} = \hat{\mathbf T}^i\mathcal E$

\[
(-1)^i\operatorname{rk}\hat{\mathcal E} = a, \qquad (-1)^i c_1(\hat{\mathcal E}) = -r\,H + c\,\mu, \qquad (-1)^i\operatorname{ch}_2(\hat{\mathcal E}) = -b\,w . \tag{4.2}
\]

One obtains similar formulae also in the case when the Picard group has higher rank; in the Appendix we treat the case when $X$ has also singular fibres of type $I_n$ (according to Kodaira's classification [11]).
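As a consistency check on (4.1), one can verify that the induced linear map respects the Mukai pairing once Chern characters are replaced by Mukai vectors. The computation below is a sketch under assumptions not stated in the text: the Mukai vector is taken in the usual K3 normalization $v = \operatorname{ch}\cdot\sqrt{\operatorname{td}} = (r, c_1, \operatorname{ch}_2 + r)$, the intersection numbers are $H^2 = \Theta^2 = -2$, $H\cdot\mu = \Theta\cdot\hat\mu = 1$, $\mu^2 = \hat\mu^2 = 0$ (with $\Theta = \hat e(\mathbb P^1)$), and the overall sign $(-1)^i$ is dropped.

```latex
% Assumptions: v = (rk, c_1, ch_2 + rk); H^2 = \Theta^2 = -2, H.\mu = \Theta.\hat\mu = 1, \mu^2 = \hat\mu^2 = 0.
\begin{align*}
v &= (r,\ aH+b\mu,\ c+r), \qquad \hat v = (a,\ -r\Theta+c\hat\mu,\ a-b), \\
\langle v, v'\rangle
  &= (aH+b\mu)\cdot(a'H+b'\mu) - r(c'+r') - r'(c+r) \\
  &= -2aa' + ab' + a'b - rc' - r'c - 2rr', \\
\langle \hat v, \hat v'\rangle
  &= (-r\Theta+c\hat\mu)\cdot(-r'\Theta+c'\hat\mu) - a(a'-b') - a'(a-b) \\
  &= -2rr' - rc' - r'c + ab' + a'b - 2aa'.
\end{align*}
```

The two expressions agree term by term, which is consistent with the preservation of Ext groups in Corollary 3.5.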

5. Fourier–Mukai Functor as Mirror Symmetry

We would like now to examine some facts which pinpoint the relations between the relative Fourier–Mukai transform on elliptic K3 surfaces and mirror symmetry.

(a) The formulae (4.1) and (4.2) establish a morphism

\[
H^0(X, \mathbb Z)\oplus\operatorname{Pic}(X)\oplus H^4(X, \mathbb Z) \to H^0(\hat X, \mathbb Z)\oplus\operatorname{Pic}(\hat X)\oplus H^4(\hat X, \mathbb Z),
\]
\[
r + a\,H + b\,\mu + c\,w \mapsto a - r\,\Theta + c\,\hat\mu - b\,\hat w
\]

together with its inverse. According to these formulae, the cycle corresponding to a 0-brane is mapped to a special Lagrangian 2-cycle of genus 1 (i.e. to the cycle homologous to $\hat\mu$), and vice versa, while a 4-brane is mapped to a special Lagrangian 2-cycle of genus 0, and vice versa. So one recovers the transformation properties of D-branes under T-duality as known from string theory [17]. One should notice that, according to Corollary 2.12, a fibre of $X$, regarded as a supersymmetric 2-cycle, is mapped to a 0-brane (point) lying in the same fibre, thus giving rise to a relative (fibrewise) T-duality.

(b) Mirror symmetry should consist in the identification of the moduli space of complex structures on an $n$-dimensional Calabi–Yau manifold $X$ with the moduli space of "complexified Kähler structures" on the mirror manifold $\hat X$. The tangent spaces to the two moduli spaces are the cohomology groups $H^{n-1,1}(X, \mathbb C)$ and $H^{1,1}(\hat X, \mathbb C)$, respectively. We want to show that when $X$ is an (algebraic) elliptic K3 surface the $f$ map establishes an isometry between the subspaces of these tangent spaces which describe "algebraic deformations," in a sense that we shall clarify below.

We denote by $\phi$ the complexification of $\tilde f$ and by $\psi\colon H^2(\hat X, \mathbb C) \to H^\bullet(X, \mathbb C)$ its inverse.

Proposition 5.1. The map $\psi$ establishes an isometry

\[
\frac{H^{1,1}(\hat X, \mathbb C)}{\operatorname{Pic}(\hat X)\otimes\mathbb C} \xrightarrow{\ \sim\ } \frac{H^{1,1}(X, \mathbb C)}{\operatorname{Pic}(X)\otimes\mathbb C} .
\]

Proof. Let $\Omega$, $\bar\Omega$ be generators of $H^{2,0}(\hat X, \mathbb C)$ and $H^{0,2}(\hat X, \mathbb C)$. Since $\hat X$ is a moduli space of sheaves on $X$, by Remark 3.6 the classes $\psi(\Omega)$ and $\psi(\bar\Omega)$ lie in $H^{2,0}(X, \mathbb C)$ and $H^{0,2}(X, \mathbb C)$, respectively. The result then follows from Lemma 4.1 and Proposition 4.2.


The space $H^{1,1}(\hat X, \mathbb C)/\operatorname{Pic}(\hat X)\otimes\mathbb C$ may be naturally identified with the tangent space at $\hat X$ to the space of deformations of algebraic structures on $\hat X$ which preserve the Picard lattice. Analogously, the space $H^{1,1}(X, \mathbb C)$ can be regarded as the space of deformations of the Kähler structure of $X$, and its quotient $H^{1,1}(X, \mathbb C)/\operatorname{Pic}(X)\otimes\mathbb C$ as the space of deformations of the Kähler structure which preserve the Picard lattice. The map $\psi$ can then be thought of as a mirror transformation in the algebraic setting. Since the Weil-Petersson metrics on both spaces are expressed in terms of the Mukai pairing, which is preserved by $\psi$, we see that $\psi$ establishes an isometry between the tangent spaces to the two moduli spaces, consistently with the fact that the quantum cohomology of a K3 surface is trivial.

(c) The mass of a BPS state, which is represented by a D-brane wrapped around a 2-cycle $\gamma$, is given by the expression [7]

\[
M = \frac{\bigl|\int_\gamma\Omega\bigr|}{\bigl(\int_X\Omega\wedge\bar\Omega\bigr)^{1/2}} = \frac{|\gamma\cdot[\Omega]|}{\bigl([\Omega]\cdot[\bar\Omega]\bigr)^{1/2}},
\]

where $\Omega$ denotes a holomorphic 2-form on $X$, and $[\Omega]$ its cohomology class in $H^{2,0}(X, \mathbb C)$. The map $\phi$ evidently preserves this quantity.

As a final remark, we would like to mention [10], where the authors consider a Fourier–Mukai transform on the cartesian product $X\times\hat X$ given by the ideal sheaf of the diagonal and use it to define a T-duality between $X$ and $\hat X$. A Riemann-Roch computation is then advocated to support an interpretation of the duality of the baryonic phases in $N = 2$ super Yang-Mills theory. Thus, the geometric setting and the physical implications of this construction are different from those of the present paper.

Conclusions. It should be stressed that in this picture, in accordance with [20, 12], and differently from other proposals that have been recently advocated (cf. e.g. [2, 8, 6]), the mirror dual to a given elliptic K3 surface $X$ is isomorphic to $X$. Of course this does not imply that the mirror map is to be trivial, and indeed the Fourier–Mukai transform seems to establish such a map, at least at a cohomological level, and at the "infinitesimal" level as far as the moduli spaces of complex structures and the moduli space of complexified Kähler structures are concerned. It would be now of some interest to develop a similar construction in terms of a generalized Fourier–Mukai transform in higher dimensional cases, where the mirror dual is not expected to be isomorphic to the original variety.

Acknowledgement. We thank C. Gómez, C.-S. Chu, and especially C. Imbimbo for useful discussions. This research was partly supported by the Spanish DGES through the research project PB95-0928, by the Italian Ministry for Universities and Research, and by an Italian-Spanish cooperation project. The first author thanks the Tata Institute for Fundamental Research, Bombay, for the very warm hospitality and for providing support during the final stage of preparation of this paper.

6. Appendix

In order to be able to compute the topological invariants of the Fourier–Mukai transform of a sheaf on $X$ we need to describe the action of the map $f$ on the generators of the Picard group. In this Appendix we assume that the elliptic K3 surface $X$ has singular fibres which are of type $I_n$, $n \ge 3$, or are elliptic nodal curves; every singular fibre of type $I_n$ is a reducible curve whose irreducible components are $n$ smooth rational curves


C. Bartocci, U. Bruzzo, D. H. Ruip´erez, J. M. Mu˜noz Porras

which intersect pairwise. The section $e$ intersects only one irreducible component of each singular fibre. The Picard group $\mathrm{Pic}(X)$ is generated by the divisors $\mu$ and $H$ and by $r$ divisors $\alpha_1, \dots, \alpha_r$ given by the irreducible components $C_i$ of the singular fibres of type $I_n$ which do not meet the section $e$. Since $\alpha_i \cdot \mu = 0$ and $\alpha_i \cdot H = f(\alpha_i) \cdot (1 + \hat{w}) = 0$ we have $f(\alpha_i) = \beta_i \in H^2(\widehat{X}, \mathbb{Q})$.

Proposition 6.1. The sheaf $\mathcal{O}_X(-C_i)$ is WIT$_1$, and $T^1[\mathcal{O}_X(-C_i)] \simeq \mathcal{O}_{\Sigma_i}(-1)$, where $\Sigma_i$ is a section of $\widehat{X}$ whose associated cohomology class is $\Theta + \hat\mu + \beta_i$.

Proof. By base change for every $t \in \mathbb{P}^1$ one has

$$T^i[\mathcal{O}_X(-C_i)] \otimes \mathcal{O}_{\widehat{X}_t} \simeq T^i_t(L_t), \tag{6.1}$$

where $L_t = \mathcal{O}_X(-C_i)|_{X_t}$, so that $\mathcal{O}_X(-C_i)$ is WIT$_1$. By Riemann-Roch one has

$$\mathrm{ch}\, T^1[\mathcal{O}_X(-C_i)] = \Theta + \hat\mu + \beta_i. \tag{6.2}$$

From Eq. (6.1) we see that $T^i[\mathcal{O}_X(-C_i)] \otimes \mathcal{O}_{\widehat{X}_t}$ is concentrated at the point in $\widehat{X}_t$ corresponding to the flat line bundle $L_t^*$ on $X_t$, whence the first claim follows. The second is a consequence of formula (6.2).

References

1. Altman, A., and Kleiman, S.: Compactifying the Picard scheme. Adv. Math. 35, 50–112 (1980)
2. Aspinwall, P.S.: K3 surfaces and string duality. In: Fields, strings and duality. River Edge, N.J.: World Sci. Publ., 1997
3. Bartocci, C., Bruzzo, U., and Hernández Ruipérez, D.: A Fourier–Mukai transform for stable bundles on K3 surfaces. J. reine angew. Math. 486, 1–16 (1997)
4. Bershadsky, M., Johansen, A., Pantev, T., Sadov, V., and Vafa, C.: F-theory, geometric engineering and N=1 dualities. Nucl. Phys. B505, 153–164 (1997)
5. Douglas, M.R., and Moore, G.: D-branes, quivers, and ALE instantons. hep-th/9603167
6. Gómez, C.: D-brane probes and mirror symmetry. hep-th/9612104
7. Greene, B.R., and Kanter, Y.: Small volumes in compactified string theory. Nucl. Phys. B497, 127–145 (1997)
8. Gross, M., and Wilson, P.M.H.: Mirror symmetry via 3-tori for a class of Calabi–Yau threefolds. Math. Ann. 309, 505–531 (1997)
9. Hitchin, N.: The moduli space of special Lagrangian submanifolds. dg-ga/9711002
10. Hori, K., and Oz, Y.: F-theory, T-duality on K3 surfaces and N = 2 supersymmetric gauge theories in four dimensions. Nucl. Phys. B501, 97–108 (1997)
11. Kodaira, K.: On complex analytic surfaces, II. Ann. Math. 77, 563–626 (1963)
12. Morrison, D.R.: The geometry underlying mirror symmetry. In: Proc. European Algebraic Geometry Conference (Warwick, 1996). To appear
13. Mukai, S.: Duality between $D(X)$ and $D(\hat{X})$ with its application to Picard sheaves. Nagoya Math. J. 81, 153–175 (1981)
14. Mukai, S.: Symplectic structure of the moduli space of sheaves on an abelian or K3 surface. Invent. Math. 77, 101–116 (1984)
15. Mukai, S.: Fourier functor and its application to the moduli of bundles on an abelian variety. Adv. Studies Pure Math. 10, 515–550 (1987)
16. Mukai, S.: On the moduli space of bundles on a K3 surface I. In: Vector bundles on algebraic varieties. Bombay and London: Oxford University Press, 1987
17. Ooguri, H., Oz, Y., and Yin, Z.: D-Branes on Calabi–Yau spaces and their mirrors. Nucl. Phys. B477, 407 (1996)
18. Shioda, T.: Elliptic modular surfaces. J. Math. Soc. Japan 24, 20–59 (1972)
19. Simpson, C.T.: Moduli of representations of the fundamental group of a smooth projective variety I. Publ. Math. IHES 79, 47–129 (1994)
20. Strominger, A., Yau, S.-T., and Zaslow, E.: Mirror symmetry is T-duality. Nucl. Phys. B479, 243–259 (1996)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 195, 95 – 111 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

$\mathcal{W}_{1+\infty}$ Algebra, $\mathcal{W}_3$ Algebra, and Friedan–Martinec–Shenker Bosonization

Weiqiang Wang*

Max-Planck Institut für Mathematik, 53225 Bonn, Germany. E-mail: [email protected]

Received: 20 June 1997 / Accepted: 11 November 1997

Abstract: We show that the vertex algebra $\mathcal{W}_{1+\infty}$ with central charge $-1$ is isomorphic to a tensor product of the simple $\mathcal{W}_3$ algebra with central charge $-2$ and a Heisenberg vertex algebra generated by a free bosonic field. We construct a family of irreducible modules of the $\mathcal{W}_3$ algebra with central charge $-2$ in terms of free fields and calculate the full character formulas of these modules with respect to the full Cartan subalgebra of the $\mathcal{W}_3$ algebra.

0. Introduction

In search of a classification of conformal field theories, one is led to study $\mathcal{W}$ algebras, which are extended chiral algebras (vertex algebras or vertex operator algebras in mathematical terminology) containing the Virasoro algebra as a subalgebra. Since the first attempt was made by Zamolodchikov [Z] there has been much further study of $\mathcal{W}$ algebras (see the review paper [BS] and references therein)¹. A particularly interesting example of a $\mathcal{W}$ algebra, the so-called $\mathcal{W}_{1+\infty}$ algebra [PRS], appears to be a universal one among various $\mathcal{W}$ infinite algebras in the $N \to \infty$ limit of $\mathcal{W}(sl_N)$ algebras, see e.g. [Ba, BK, PRS, O]. The $\mathcal{W}(sl_N)$ algebras are often referred to as $\mathcal{W}_N$ algebras in the literature. In mathematics, $\mathcal{W}_{1+\infty}$ is known as the universal central extension $\widehat{\mathcal{D}}$ of the Lie algebra $\mathcal{D}$ of differential operators on the circle. The first systematic study of the representation theory of the Lie algebra $\widehat{\mathcal{D}}$ was undertaken by Kac and Radul in [KR1] and there have been many further developments [M, FKRW, AFMO, KR2, W1] since then, just to name some.

* On leave from Department of Mathematics, Yale University, USA.
¹ We note a less-known fact that $\mathcal{W}_N$ algebras were constructed in [F2] for the particular central charge $c = N - 1$.

In [FKRW], the Lie algebra $\widehat{\mathcal{D}}$ and its representation theory are studied in the framework of vertex algebras [B, FLM, DL, K2, LZ2]. It turns out that the irreducible vacuum $\widehat{\mathcal{D}}$-module with central charge $c$ admits a canonical vertex algebra structure, with infinitely many generating fields of conformal weights $1, 2, 3, \cdots$, which we will denote by $\mathcal{W}_{1+\infty,c}$. The case when the central charge is non-integral is not difficult to understand. The case when the central charge is a positive integer was studied in detail in [FKRW]. The vertex algebra $\mathcal{W}_{1+\infty,N}$ with a positive integral central charge $N$ has redundant symmetries, namely only the first $N$ generating fields are independent. More precisely, $\mathcal{W}_{1+\infty,N}$ is shown to be isomorphic to a $\mathcal{W}$-algebra $\mathcal{W}(gl_N)$ with central charge $N$, and the irreducible modules of $\mathcal{W}_{1+\infty,N}$ are classified [FKRW].

In this paper we will take the first step to clarify the connection between the vertex algebra $\mathcal{W}_{1+\infty,-N}$ and some other $\mathcal{W}$-algebra with finitely many generating fields. We prove that the vertex algebra $\mathcal{W}_{1+\infty,-1}$ is isomorphic to a $\mathcal{W}(gl_3)$ algebra, which is a tensor product of the simple $\mathcal{W}_3$ algebra with central charge $-2$ (denoted below by $\mathcal{W}_{3,-2}$) and a Heisenberg vertex algebra generated by a free bosonic field. We will construct explicitly a number of modules of the $\mathcal{W}_{3,-2}$ algebra parametrized by integers in terms of free fields. We prove the irreducibility of these modules. As a by-product, we obtain full character formulas for these representations. To the best of our knowledge, these seem to be the first known full character formulas of any non-trivial module of the $\mathcal{W}_3$ algebra with any non-generic central charge. We mention a curious fact that a generating function of counting covers of an elliptic curve [Di] appears to be closely related to our character formulas and admits very interesting modular invariance properties [KZ].

The difficulties appearing in the negative integral central charge case in contrast to the positive integral central charge case are roughly the following: In both cases we have free field realizations.
In the case of positive integral central charge we need $bc$ fields, which are free fermions, while in the negative integral central charge case we need $\beta\gamma$ fields, which are free bosonic ghosts. The structure of the $\mathcal{W}_N$ algebra in the realization of $\mathcal{W}_{1+\infty,N}$ in terms of $bc$ fields can be identified relatively easily due to the very fact that the structure of the basic representation of the affine Kac–Moody algebra $\widehat{sl}_N$ is well understood [K1]. However, structures of representations of affine algebras with negative integral central charges are far from being clear.

One of the main techniques we use in relating the $\mathcal{W}_{1+\infty}$ algebra with central charge $-1$ to the $\mathcal{W}_3$ algebra with central charge $-2$ is the bosonization of $\beta\gamma$ fields [FMS]. A similar construction was also given by Kac and van de Leur and used by them for a construction of a super KP hierarchy [KV1, KV2]. More detailed structures in the bosonization of $\beta\gamma$ fields are further worked out in [FF] and used for the computation of semi-infinite cohomology of the Virasoro algebra with coefficients in the module of its adjoint semi-infinite symmetric powers. It is well known that $\beta\gamma$ fields are fundamental ingredients in superstring theory [FMS], in realizations of level $-1$ representations of classical affine algebras [FeF] and in the calculation of BRST cohomology of superVirasoro algebras [LZ1]. They are also closely related to the logarithmic conformal field theories which have recently attracted much attention from physicists, see e.g. [F, Ka, GK]. We hope our results may shed some light on these subjects.

Let us explain in more detail. It is well known [M] that the Fock space $\mathcal{M}_s$ of the $\beta\gamma$ fields, as a module over $\widehat{\mathcal{D}}$, can be decomposed into a direct sum of the modules $\mathcal{M}^l_s$ parametrized by the $\beta\gamma$-charge number $l$. Recall [FMS] that the $\beta\gamma$ fields are expressed in terms of two scalar fields $\psi(z)$ and $\phi(z)$.
So the space $\mathcal{M}_s$ can be identified with some subspace of the Fock space of the Heisenberg algebra of the two scalar fields $\psi(z)$ and $\phi(z)$. Indeed one can identify $\mathcal{M}^l_s$ as $\overline{\mathcal{F}}^l \otimes \mathcal{H}_{i(s+l)}$, $i = \sqrt{-1}$ (cf. e.g. [FF]), where $\overline{\mathcal{F}}^l$ is a certain subspace of the Fock space of the Heisenberg algebra generated by the Fourier components of the field $\psi(z)$ while $\mathcal{H}_{i(s+l)}$ is the Fock space of the Heisenberg algebra of the field $\phi(z)$. $\mathcal{W}_{1+\infty,-1}$ acts on $\mathcal{M}^l_s$ by means of fields

$$J^i(z) = {:}\gamma(z)\partial^i\beta(z){:} + \frac{1}{i+1}\, s(s-1)\cdots(s-i)\, z^{-i-1}, \qquad i \in \mathbb{Z}_+.$$

For the sake of simplicity, the reader may understand the main results of this paper by taking $s = 0$ throughout. By the celebrated boson-fermion correspondence, we have a pair of fermionic fields $b(z)$ and $c(z)$ expressed in terms of the scalar field $\psi(z)$. Furthermore we can construct two particular fields as some normally ordered polynomials of the fields $b(z)$ and $c(z)$ and their derivative fields: a Virasoro field $T(z)$ of conformal weight 2 and a field $W(z)$ of conformal weight 3. These two fields $T(z)$ and $W(z)$ satisfy the operator product expansion of the $\mathcal{W}_3$ algebra with central charge $-2$. The three fields $J^0(z)$, $T(z)$ and $W(z)$ may be regarded as generating fields of a $\mathcal{W}(gl_3)$ algebra.

We will show that all the $J^i(z) = {:}\gamma(z)\partial^i\beta(z){:}$, $i = 0, 1, \dots$, can be expressed (see Lemmas 4.2 and 4.3) as some normally ordered polynomials in terms of $T(z)$, $W(z)$ and $J^0(z)$ and their derivative fields. Since the space $\overline{\mathcal{F}}^l \otimes \mathcal{H}_{i(s+l)}$, being isomorphic to $\mathcal{M}^l_s$, is an irreducible module over the vertex algebra $\mathcal{W}_{1+\infty,-1}$, it is also irreducible as a module over the $\mathcal{W}(gl_3)$ algebra. One can show that $J^0(z) = i\partial\phi(z)$ by using Friedan–Martinec–Shenker bosonization. Note that when the $\mathcal{W}(gl_3)$ algebra acts on $\overline{\mathcal{F}}^l \otimes \mathcal{H}_{i(s+l)}$, the Fourier components of the fields $T(z)$ and $W(z)$ act only on the first factor $\overline{\mathcal{F}}^l$ while $J^0(z)$ acts only on the second factor $\mathcal{H}_{i(s+l)}$. This implies that $\overline{\mathcal{F}}^l$ is irreducible as a module over the $\mathcal{W}_{3,-2}$ algebra. We obtain full character formulas of these irreducible modules $\overline{\mathcal{F}}^l$ of the $\mathcal{W}_{3,-2}$ algebra as a consequence of our explicit free field realization. As a by-product of our free field realization of $\overline{\mathcal{F}}^l$, we find that there exist non-split short exact sequences of modules over the $\mathcal{W}_3$ (resp. $\mathcal{W}_{1+\infty,-1}$) algebra with central charge $-2$ (resp. $-1$).

The plan of this paper is as follows. In Sect. 1, we review the definition of $\widehat{\mathcal{D}}$ and the construction of the vertex algebra $\mathcal{W}_{1+\infty,c}$. We present the free field realization of $\mathcal{W}_{1+\infty,-1}$ in terms of $\beta\gamma$ fields. In Sect. 2, we recall the bosonization of $\beta\gamma$ fields in detail. In Sect. 3 we review the $\mathcal{W}_3$ algebra in the framework of vertex algebras. In Sect. 4 we prove that the vertex algebra $\mathcal{W}_{1+\infty,-1}$ is isomorphic to a tensor product of the simple $\mathcal{W}_{3,-2}$ algebra and a Heisenberg vertex algebra generated by a free bosonic field. We construct a number of irreducible modules of the $\mathcal{W}_{3,-2}$ algebra. In Sect. 5, we calculate the full character formula for the representations of the $\mathcal{W}_{3,-2}$ algebra constructed in Sect. 4.

We will classify the irreducible modules of the $\mathcal{W}_{3,-2}$ algebra in our subsequent paper [W2]. It turns out that these irreducible modules are parametrized by points on a certain rational curve. We will also classify all the irreducible modules of the $\mathcal{W}_{1+\infty,-1}$ algebra based on the relation between the $\mathcal{W}_{3,-2}$ and $\mathcal{W}_{1+\infty,-1}$ algebras found in this paper.

1. Vertex Algebra $\mathcal{W}_{1+\infty,c}$ and Free Fields Realization of $\mathcal{W}_{1+\infty,-1}$

Let $\mathcal{D}$ be the Lie algebra of regular differential operators on the circle. The elements

$$J^l_k = -t^{l+k}(\partial_t)^l, \qquad l \in \mathbb{Z}_+,\ k \in \mathbb{Z},$$

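form a basis of $\mathcal{D}$. $\mathcal{D}$ also has another basis

$$L^l_k = -t^k D^l, \qquad l \in \mathbb{Z}_+,\ k \in \mathbb{Z},$$

where $D = t\partial_t$. Denote by $\widehat{\mathcal{D}}$ the central extension of $\mathcal{D}$ by a one-dimensional center with a generator $C$, with commutation relation (cf. [KR1])

$$\left[t^r f(D),\, t^s g(D)\right] = t^{r+s}\bigl(f(D+s)g(D) - f(D)g(D+r)\bigr) + \Psi\bigl(t^r f(D),\, t^s g(D)\bigr)\, C, \tag{1.1}$$

where

$$\Psi\bigl(t^r f(D),\, t^s g(D)\bigr) = \begin{cases} \displaystyle\sum_{-r \le j \le -1} f(j)\,g(j+r), & r = -s \ge 0, \\[2mm] 0, & r + s \ne 0. \end{cases} \tag{1.2}$$

Letting the weight of $J^l_k$ be $k$ and the weight of $C$ be $0$ defines a principal gradation

$$\mathcal{D} = \bigoplus_{j \in \mathbb{Z}} \mathcal{D}_j, \qquad \widehat{\mathcal{D}} = \bigoplus_{j \in \mathbb{Z}} \widehat{\mathcal{D}}_j. \tag{1.3}$$

Then we have the triangular decomposition

$$\widehat{\mathcal{D}} = \widehat{\mathcal{D}}_+ \oplus \widehat{\mathcal{D}}_0 \oplus \widehat{\mathcal{D}}_-, \tag{1.4}$$

where

$$\widehat{\mathcal{D}}_\pm = \bigoplus_{j \in \pm\mathbb{N}} \widehat{\mathcal{D}}_j, \qquad \widehat{\mathcal{D}}_0 = \mathcal{D}_0 \oplus \mathbb{C}C.$$

Let $\mathcal{P}$ be the distinguished parabolic subalgebra of $\mathcal{D}$, consisting of the differential operators that extend into the whole interior of the circle. $\mathcal{P}$ has a basis $\{J^l_k,\ l \ge 0,\ l + k \ge 0\}$. It is easy to check that the 2-cocycle $\Psi$ defining the central extension $\widehat{\mathcal{D}}$ vanishes when restricted to the parabolic subalgebra $\mathcal{P}$. So $\mathcal{P}$ is also a subalgebra of $\widehat{\mathcal{D}}$. Denote $\widehat{\mathcal{P}} = \mathcal{P} \oplus \mathbb{C}C$.

Fix $c \in \mathbb{C}$. Denote by $\mathbb{C}_c$ the 1-dimensional $\widehat{\mathcal{P}}$-module obtained by letting $C$ act as the scalar $c$ and $\mathcal{P}$ act trivially. Fix a non-zero vector $v_0$ in $\mathbb{C}_c$. The induced $\widehat{\mathcal{D}}$-module

$$M_c(\widehat{\mathcal{D}}) = U(\widehat{\mathcal{D}}) \otimes_{U(\widehat{\mathcal{P}})} \mathbb{C}_c$$

is called the vacuum $\widehat{\mathcal{D}}$-module with central charge $c$. Here we denote by $U(\mathfrak{g})$ the universal enveloping algebra of a Lie algebra $\mathfrak{g}$. $M_c(\widehat{\mathcal{D}})$ admits a unique irreducible quotient, denoted by $\mathcal{W}_{1+\infty,c}$. Denote the highest weight vector $1 \otimes v_0$ in $M_c(\widehat{\mathcal{D}})$ by $|0\rangle$. It is shown in [FKRW] that $\mathcal{W}_{1+\infty,c}$ carries a canonical vertex algebra structure, with vacuum vector $|0\rangle$ and generating fields

$$J^l(z) = \sum_{k \in \mathbb{Z}} J^l_k\, z^{-k-l-1}$$

of conformal weight $l + 1$, $l = 0, 1, \cdots$. The fields $J^l(z)$ correspond to the vector $J^l_{-l-1}|0\rangle$ in $\mathcal{W}_{1+\infty,c}$. Below we will concentrate on the particular case $\mathcal{W}_{1+\infty,-1}$.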
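The claim that the 2-cocycle (1.2) vanishes on the parabolic subalgebra can be checked numerically in small cases. The sketch below is only an illustration under stated assumptions: it uses the standard identity $t^l\partial_t^l = D(D-1)\cdots(D-l+1)$ to write $J^l_k = -t^k [D]_l$, and it extends (1.2) to $r < 0$ by antisymmetry of the cocycle. The function names `falling`, `psi` and `J` are hypothetical helpers, not notation from the paper.

```python
def falling(x, l):
    """Falling factorial [x]_l = x (x-1) ... (x-l+1), with [x]_0 = 1."""
    result = 1
    for i in range(l):
        result *= (x - i)
    return result


def J(l):
    """Coefficient function f with J^l_k = t^k f(D), i.e. f(x) = -[x]_l (assumed identity)."""
    return lambda x: -falling(x, l)


def psi(r, f, s, g):
    """2-cocycle (1.2) evaluated on t^r f(D) and t^s g(D)."""
    if r + s != 0:
        return 0
    if r < 0:
        # Assumption: (1.2) is extended to r < 0 by antisymmetry.
        return -psi(s, g, r, f)
    return sum(f(j) * g(j + r) for j in range(-r, 0))


# J^l_r with r >= 0 and J^{l'}_{-r} with l' >= r both lie in the parabolic subalgebra.
vanishes_on_P = all(
    psi(r, J(l), -r, J(lp)) == 0
    for r in range(0, 5)
    for l in range(0, 5)
    for lp in range(r, r + 4)
)

# Outside the parabolic subalgebra the cocycle need not vanish, e.g. on J^0_1 and J^0_{-1}.
heisenberg_value = psi(1, J(0), -1, J(0))
```

With $C$ acting as $c = -1$, the value of `heisenberg_value` is the central term of $[J^0_1, J^0_{-1}]$ up to the factor $c$.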

Recall that the bosonic $\beta\gamma$ fields are

$$\beta(z) = \sum_{n \in \mathbb{Z}} \beta(n)\, z^{-n+s}, \qquad \gamma(z) = \sum_{n \in \mathbb{Z}} \gamma(n)\, z^{-n-s-1} \qquad (s \in \mathbb{C}) \tag{1.5}$$

with the operator product expansions (OPEs)

$$\beta(z)\gamma(w) \sim \frac{-1}{z-w}\left(\frac{z}{w}\right)^{s}, \qquad \beta(z)\beta(w) \sim 0, \qquad \gamma(z)\gamma(w) \sim 0. \tag{1.6}$$

In other words, we have the following commutation relations:

$$[\gamma(m), \beta(n)] = \delta_{m,-n}, \qquad [\beta(m), \beta(n)] = 0, \qquad [\gamma(m), \gamma(n)] = 0.$$

Let us denote by $\mathcal{M}_s$ the Fock space of the $\beta\gamma$ fields, with the vacuum vector $|s\rangle$, and

$$\beta(n+1)|s\rangle = 0, \qquad \gamma(n)|s\rangle = 0, \qquad n \ge 0. \tag{1.7}$$

One can realize a representation of $\mathcal{W}_{1+\infty,-1}$ on $\mathcal{M}_s$ by letting (cf. [KR2, M]; our convention here is a little different):

$$J^N(z) = {:}\gamma(z)\partial^N\beta(z){:} + \frac{1}{N+1}\, s(s-1)\cdots(s-N)\, z^{-N-1}, \qquad N \in \mathbb{Z}_+. \tag{1.8}$$

The normal ordering $:\ :$ is understood as moving the operators annihilating $|s\rangle$ to the right. Note that $J^0(z) = \sum_{k \in \mathbb{Z}} J^0_k z^{-k-1}$ is a free bosonic field of conformal weight 1 with commutation relations

$$[J^0_m, J^0_n] = -m\,\delta_{m,-n}, \qquad m, n \in \mathbb{Z}.$$

We also have the following commutation relations:

$$[J^0_m, \beta(n)] = \beta(m+n), \qquad [J^0_m, \gamma(n)] = -\gamma(m+n), \qquad m, n \in \mathbb{Z}.$$

Then we have the $\beta\gamma$-charge decomposition of $\mathcal{M}_s$ according to the eigenvalues of the operator $-J^0_0$: $\mathcal{M}_s = \bigoplus_{l \in \mathbb{Z}} \mathcal{M}^l_s$. It is known [M, KR2] that $\mathcal{M}^0_0$ is isomorphic to $\mathcal{W}_{1+\infty,-1}$ as vertex algebras.

2. Bosonizations

In Sect. 2.1 we recall the well-known boson-fermion correspondence (cf. [F1]). In Sect. 2.2 we review the Friedan–Martinec–Shenker bosonization of the $\beta\gamma$ fields and some more detailed structures [FMS, FF].

2.1. Bosonization of fermions. Let $j(z)$ be a free bosonic field of conformal weight 1, namely

$$j(z)j(w) \sim \frac{1}{(z-w)^2},$$

or equivalently, by introducing $j(z) = \sum_{n \in \mathbb{Z}} j(n) z^{-n-1}$, we have

$$[j(m), j(n)] = m\,\delta_{m,-n}.$$

Let us also introduce the free scalar field

$$\phi(z) = q + j(0)\ln z - \sum_{n \ne 0} \frac{j(n)}{n}\, z^{-n},$$

where the operator $q$ satisfies $[q, j(n)] = \delta_{n,0}$. Clearly $j(z) = \partial\phi(z)$. Given $\alpha \in \mathbb{C}$, we denote by $\mathcal{H}_\alpha$ the Fock space of the free field $j(z)$ generated by the vacuum vector $|\alpha\rangle$ satisfying

$$j(n)|\alpha\rangle = \alpha\,\delta_{n,0}|\alpha\rangle, \qquad n \ge 0.$$

It is well known that $\mathcal{H}_0$ is a vertex algebra, which we refer to as a Heisenberg vertex algebra. It is easy to see that $\exp(\eta q)|\alpha\rangle = |\alpha + \eta\rangle$. Introduce the vertex operator

$$X_\eta(z) = \sum_{n \in \mathbb{Z}} X_\eta(n)\, z^{-n}$$

as follows. Let

$$X_\eta(z) = {:}\exp(\eta\phi(z)){:} = \exp(\eta q)\, z^{\eta\alpha} \exp\Bigl(\eta \sum_{n>0} j(-n)\, z^{n}/n\Bigr) \exp\Bigl(\eta \sum_{n<0} j(-n)\, z^{n}/n\Bigr). \tag{2.9}$$

The Fourier components of $X_\eta(z)$ act from $\mathcal{H}_\alpha$ to $\mathcal{H}_{\alpha+\eta}$. Furthermore we have the following OPE:

$$j(z)X_\eta(w) \sim \frac{\eta X_\eta(w)}{z-w} + \frac{1}{\eta}\,\partial X_\eta(w), \tag{2.10}$$

or equivalently we have

$$[j(m), X_\eta(n)] = \eta X_\eta(m+n), \qquad {:}j(z)X_\eta(z){:} = \frac{1}{\eta}\,\partial X_\eta(z).$$

Also we have

$$X_\xi(z)X_\eta(w) \sim (z-w)^{\xi\eta}\, {:}X_\xi(z)X_\eta(w){:}. \tag{2.11}$$

In particular we have a pair of fermionic fields $X_{\pm 1}(z)$ with OPEs:

$$X_1(z)X_{-1}(w) \sim \frac{1}{z-w}, \qquad X_{\pm 1}(z)X_{\pm 1}(w) \sim 0.$$

This is the well-known boson-fermion correspondence.

2.2. Bosonization of bosons. First let us introduce the $bc$ fermionic fields

$$b(z) = \sum_{n \in \mathbb{Z}} b(n)\, z^{-n}, \qquad c(z) = \sum_{n \in \mathbb{Z}} c(n)\, z^{-n-1}$$

with OPEs

$$b(z)c(w) \sim \frac{1}{z-w}, \qquad b(z)b(w) \sim 0, \qquad c(z)c(w) \sim 0. \tag{2.12}$$

In other words, we have

$$[b(m), c(n)]_+ = \delta_{m,-n}, \qquad [b(m), b(n)]_+ = 0, \qquad [c(m), c(n)]_+ = 0.$$

We denote by $\mathcal{F}$ the Fock space of the $bc$ fields, generated by $|bc\rangle$, satisfying

$$b(n+1)|bc\rangle = 0, \qquad c(n)|bc\rangle = 0, \qquad n \ge 0.$$

Then

$$j^{bc}(z) = {:}c(z)b(z){:} = \sum_{n \in \mathbb{Z}} j^{bc}_n\, z^{-n-1}$$

is a free boson of conformal weight 1 with commutation relations

$$[j^{bc}_m, j^{bc}_n] = m\,\delta_{m,-n}, \qquad m, n \in \mathbb{Z}.$$

We further have the following commutation relations:

$$[j^{bc}_m, b(n)] = -b(m+n), \qquad [j^{bc}_m, c(n)] = c(m+n), \qquad m, n \in \mathbb{Z}.$$

Then we have the $bc$-charge decomposition of $\mathcal{F}$ according to the eigenvalues of $j^{bc}_0$:

$$\mathcal{F} = \bigoplus_{l \in \mathbb{Z}} \mathcal{F}^l.$$

Following [FF], we consider the vector space

$$N(s) = \sum_{l \in \mathbb{Z}} \mathcal{F}^l \otimes \mathcal{H}_{i(s+l)},$$

and we define the actions of $\beta(n)$, $\gamma(n)$, $n \in \mathbb{Z}$, on $N(s)$ by letting [FMS]

$$\beta(z) = \sum_{n \in \mathbb{Z}} \beta(n)\, z^{-n+s} = \partial b(z)\, X_{-i}(z), \tag{2.13}$$

$$\gamma(z) = \sum_{n \in \mathbb{Z}} \gamma(n)\, z^{-n-s-1} = c(z)\, X_i(z). \tag{2.14}$$

It can be easily shown that the bosonic fields $\beta(z)$, $\gamma(z)$ defined above indeed satisfy the OPEs (1.6). The vector $|bc\rangle \otimes |is\rangle$ satisfies the vacuum condition (1.7) by means of (2.13) and (2.14). Then we have a homomorphism $\mathcal{M}_s \to N(s)$ of modules of the Heisenberg algebra spanned by $\beta(n)$, $\gamma(n)$, $n \in \mathbb{Z}$, determined by sending $|s\rangle$ to $|bc\rangle \otimes |is\rangle$. This homomorphism is obviously an embedding since $\mathcal{M}_s$ is an irreducible module of the above Heisenberg algebra. The following proposition (cf. [FF]) tells us the precise image of this embedding. We reproduce the proof here since some crucial misprints in the proof in [FF] need to be corrected.

Proposition 2.1. The image of the homomorphism coincides with the kernel of $c(0)$, acting from $N(s)$ to $N(s-1)$.

Proof. The operators $\beta(n)$, $\gamma(n)$, $n \in \mathbb{Z}$, given by (2.13) and (2.14) do not depend on $b(0)$ and therefore commute with $c(0)$. So the image is contained in $\ker c(0)$, since the operator $c(0)$ kills the vacuum vector $|bc\rangle$. It is easy to see that the kernel of $c(0)$ is obtained by applying to $|bc\rangle \otimes |is\rangle$ the operators $j(n)$, $c(n)$, $n \in \mathbb{Z}$, and $b(m)$, $m \in \mathbb{Z} - \{0\}$. So it remains to show that the fields $j(z)$, $c(z)$ and $\partial b(z)$ can be expressed in terms of the fields $\beta(z)$, $\gamma(z)$, $X_{\pm i}(z)$ and their derivative fields. Indeed it is easy to show that

$$\partial b(z) = \partial\beta(z)\, X_i(z), \qquad c(z) = \partial\gamma(z)\, X_{-i}(z).$$

Recall that $J^0(z) = {:}\gamma(z)\beta(z){:}$. It is easy to check by (2.13) and (2.14) that $j(z) \equiv \partial_z\phi(z) = -iJ^0(z)$.

Denote by $\overline{\mathcal{F}}^l$ the kernel of the operator $c(0)$ acting from $\mathcal{F}^l$ to $\mathcal{F}^{l+1}$. We now have a natural isomorphism:

$$\mathcal{M}^l_s \cong \overline{\mathcal{F}}^l \otimes \mathcal{H}_{i(s+l)}. \tag{2.15}$$

Remark 2.1. $\overline{\mathcal{F}}^0$ is a vertex subalgebra of $\mathcal{F}^0$. This is an example of the following well-known fact in the theory of vertex algebras: given a vertex algebra $V$, let $Y(a, z) = \sum_{n \in \mathbb{Z}} a(n) z^{-n-1}$ be the field corresponding to some vector $a \in V$; then the kernel of the operator $a(0)$ acting on $V$ is always a vertex subalgebra of $V$.

3. $\mathcal{W}_3$ Algebra

Denote by $U(\mathcal{W}_{3,c})$ ($c \in \mathbb{C}$ is the central charge) the quotient of the free associative algebra generated by $L_m$, $W_m$, $m \in \mathbb{Z}$, by the two-sided ideal generated by the following commutation relations (cf. e.g. [BMP]):

$$\begin{aligned}
[L_m, L_n] &= (m-n)L_{m+n} + \frac{c}{12}(m^3 - m)\,\delta_{m,-n}, \\
[L_m, W_n] &= (2m-n)W_{m+n}, \\
[W_m, W_n] &= (m-n)\left[\frac{1}{15}(m+n+3)(m+n+2) - \frac{1}{6}(m+2)(n+2)\right]L_{m+n} \\
&\quad + \beta(m-n)\Lambda_{m+n} + \frac{c}{360}\, m(m^2-1)(m^2-4)\,\delta_{m,-n},
\end{aligned} \tag{3.16}$$

with $\beta = 16/(22 + 5c)$ and

$$\Lambda_m = \sum_{k \le -2} L_k L_{m-k} + \sum_{k > -2} L_{m-k} L_k - \frac{3}{10}(m+2)(m+3)L_m.$$

Denote

$$\mathcal{W}_{3,\pm} = \{L_n, W_n,\ \pm n \ge 0\}, \qquad \mathcal{W}_{3,0} = \{L_0, W_0\}.$$

A Verma module $M_c(t, w)$ of $U(\mathcal{W}_{3,c})$ is the induced module

$$M_c(t, w) = U(\mathcal{W}_{3,c}) \otimes_{U(\mathcal{W}_{3,+} \oplus \mathcal{W}_{3,0})} \mathbb{C}_{t,w},$$

where $\mathbb{C}_{t,w}$ is the 1-dimensional module of $U(\mathcal{W}_{3,+} \oplus \mathcal{W}_{3,0})$ such that

$$\mathcal{W}_{3,+}|t, w\rangle = 0, \qquad L_0|t, w\rangle = t\,|t, w\rangle, \qquad W_0|t, w\rangle = w\,|t, w\rangle. \tag{3.17}$$

$M_c(t, w)$ has a unique irreducible quotient which is denoted by $L_c(t, w)$. A singular vector in a $U(\mathcal{W}_{3,c})$-module means a vector killed by $\mathcal{W}_{3,+}$. It is easy to see that $L_{-1}|0, 0\rangle$, $W_{-1}|0, 0\rangle$, and $W_{-2}|0, 0\rangle$ are singular vectors in $M_c(0, 0)$. We denote by $\mathcal{VW}_{3,c}$ the vacuum module, which is by definition the quotient of the Verma module $M_c(0, 0)$ by the $U(\mathcal{W}_{3,c})$-submodule generated by the singular vectors $L_{-1}|0, 0\rangle$, $W_{-1}|0, 0\rangle$, and $W_{-2}|0, 0\rangle$. We call $L_c(0, 0)$ the irreducible vacuum module. Let $I$ be the maximal proper submodule of the vacuum module $\mathcal{VW}_{3,c}$. Clearly $L_c(0, 0)$ is the irreducible quotient $\mathcal{VW}_{3,c}/I$ of $\mathcal{VW}_{3,c}$. It is easy to see that $\mathcal{VW}_{3,c}$ has a linear basis

$$L_{-i_1-2} \cdots L_{-i_m-2}\, W_{-j_1-3} \cdots W_{-j_n-3}|0, 0\rangle, \qquad 0 \le i_1 \le \cdots \le i_m,\ 0 \le j_1 \le \cdots \le j_n,\ m, n \ge 0. \tag{3.18}$$

Introduce the following fields

$$T(z) = \sum_{n \in \mathbb{Z}} L_n\, z^{-n-2}, \qquad W(z) = \sum_{n \in \mathbb{Z}} W_n\, z^{-n-3}. \tag{3.19}$$

It is well known that the vacuum module $\mathcal{VW}_{3,c}$ (resp. the irreducible vacuum module $L_c(0, 0)$) carries a vertex algebra structure with generating fields $T(z)$ and $W(z)$. The $\mathcal{W}_3$ algebra with central charge $-2$ we have been referring to is the vertex algebra $L_{-2}(0, 0)$, which we denote by $\mathcal{W}_{3,-2}$ throughout our paper. The fields $T(z)$ and $W(z)$ correspond to the vectors $L_{-2}|0, 0\rangle$ and $W_{-3}|0, 0\rangle$ respectively. The field corresponding to the vector $L_{-i_1-2} \cdots L_{-i_m-2}\, W_{-j_1-3} \cdots W_{-j_n-3}|0, 0\rangle$ is

$$\partial^{(i_1)}T(z) \cdots \partial^{(i_m)}T(z)\, \partial^{(j_1)}W(z) \cdots \partial^{(j_n)}W(z),$$

where $\partial^{(i)}$ denotes $\frac{1}{i!}\partial_z^i$. We can rewrite (3.16) in terms of the following OPEs in our central charge $-2$ case:

$$\begin{aligned}
T(z)T(w) &\sim \frac{-1}{(z-w)^4} + \frac{2T(w)}{(z-w)^2} + \frac{\partial T(w)}{z-w}, \\
T(z)W(w) &\sim \frac{3W(w)}{(z-w)^2} + \frac{\partial W(w)}{z-w}, \\
W(z)W(w) &\sim \frac{-2/3}{(z-w)^6} + \frac{2T(w)}{(z-w)^4} + \frac{\partial T(w)}{(z-w)^3} \\
&\quad + \frac{1}{(z-w)^2}\left(\frac{8}{3}\,{:}T(w)T(w){:} - \frac{1}{2}\,\partial^2 T(w)\right) \\
&\quad + \frac{1}{z-w}\left(\frac{4}{3}\,\partial\bigl({:}T(w)T(w){:}\bigr) - \frac{1}{3}\,\partial^3 T(w)\right).
\end{aligned} \tag{3.20}$$
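The numerical coefficients in (3.20) can be cross-checked against $\beta = 16/(22+5c)$ from (3.16). The sketch below assumes the standard general-$c$ form of the $W(z)W(w)$ OPE, with second-order pole $2\beta\Lambda + \frac{3}{10}\partial^2 T$ and first-order pole $\beta\,\partial\Lambda + \frac{1}{15}\partial^3 T$, where $\Lambda = {:}TT{:} - \frac{3}{10}\partial^2 T$; that general form is not written out in the text above and is taken here as an assumption. The function name and dictionary keys are hypothetical.

```python
from fractions import Fraction


def w3_ope_coefficients(c):
    """Coefficients of the T T and W W OPEs for central charge c.

    Assumes the standard general-c form of the W3 OPE (see the lead-in);
    the keys are illustrative labels, not notation from the paper.
    """
    c = Fraction(c)
    beta = Fraction(16) / (22 + 5 * c)
    return {
        "TT_pole4": c / 2,                                  # coefficient of (z-w)^-4 in T T
        "WW_pole6": c / 3,                                  # coefficient of (z-w)^-6 in W W
        "WW_pole2_TT": 2 * beta,                            # :TT: at (z-w)^-2
        "WW_pole2_d2T": Fraction(3, 10) - 2 * beta * Fraction(3, 10),  # d^2 T at (z-w)^-2
        "WW_pole1_dTT": beta,                               # d(:TT:) at (z-w)^-1
        "WW_pole1_d3T": Fraction(1, 15) - beta * Fraction(3, 10),      # d^3 T at (z-w)^-1
    }


coeffs = w3_ope_coefficients(-2)
```

Evaluating at $c = -2$ gives the values that appear in (3.20), which is a useful consistency check between the mode relations and the OPE form.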

Representation theory of the vertex algebra $\mathcal{VW}_{3,c}$ is just the same as that of $U(\mathcal{W}_3)$. However, note that $c = -2$ is not a generic central charge [W2], namely the vacuum module $\mathcal{VW}_{3,c}$ with $c = -2$ is reducible, or in other words, the maximal proper submodule $I$ of $\mathcal{VW}_{3,c}$ is not zero. Thus the representation theory of $\mathcal{W}_{3,-2}$ becomes highly non-trivial due to the following constraint: a module $M$ of the vertex algebra $\mathcal{VW}_{3,c}$ can be a module of the vertex algebra $\mathcal{W}_{3,-2}$ if and only if $M$ is annihilated by all the Fourier components of all fields corresponding to vectors in the maximal proper submodule $I \subset \mathcal{VW}_{3,c}$.

4. Relations Between the $\mathcal{W}_3$ Algebra and the Vertex Algebra $\mathcal{W}_{1+\infty,-1}$

Define

$$T(z) \equiv \sum_{n \in \mathbb{Z}} L_n\, z^{-n-2} = {:}\partial b(z)c(z){:}. \tag{4.21}$$

It is easy to check that $T(z)$ is a Virasoro field with central charge $-2$. We also define another field of conformal weight 3:

$$W(z) \equiv \sum_{n \in \mathbb{Z}} W_n\, z^{-n-3} = \frac{1}{\sqrt{6}}\Bigl({:}\partial^2 b(z)c(z){:} - {:}\partial b(z)\partial c(z){:}\Bigr). \tag{4.22}$$

We have the following proposition, whose proof by Wick's theorem is straightforward but tedious.

Proposition 4.1. The fields $T(z)$ and $W(z)$ satisfy the OPEs (3.20) of the $\mathcal{W}_3$ algebra with central charge $-2$.

We note that this $\mathcal{W}_3$ algebra structure in $bc$ fields was also observed in [BCMN]. We rescale $W(z)$ to be $\widetilde{W}(z) = \frac{1}{2}\sqrt{6}\,W(z)$, namely

$$\widetilde{W}(z) \equiv \sum_{n \in \mathbb{Z}} \widetilde{W}_n\, z^{-n-3} = \frac{1}{2}\Bigl({:}\partial^2 b(z)c(z){:} - {:}\partial b(z)\partial c(z){:}\Bigr). \tag{4.23}$$

We will see later that it is more convenient to work with the rescaled field $\widetilde{W}(z)$. Now we can state the first main results of this paper.

Theorem 4.1. 1) The vertex algebra $\overline{\mathcal{F}}^0$ is isomorphic to the simple vertex algebra $\mathcal{W}_{3,-2}$, with generating fields $T(z)$ and $W(z)$. $\overline{\mathcal{F}}^l$ ($l \in \mathbb{Z}$) are irreducible modules of the $\mathcal{W}_{3,-2}$ algebra. 2) The vertex algebra $\mathcal{W}_{1+\infty,-1}$ is isomorphic to a tensor product of the vertex algebra $\mathcal{W}_{3,-2}$ and the Heisenberg vertex algebra $\mathcal{H}_0$ with $J^0(z)$ as a generating field.

Remark 4.1. Proposition 4.1 implies that $\overline{\mathcal{F}}^l$ ($l \in \mathbb{Z}$) are modules of the $\mathcal{W}_{3,-2}$ algebra, also cf. [BCMN]. Theorem 4.1 says that they are indeed irreducible.

The proof of the above theorem relies on the following three lemmas.

Lemma 4.1. $\mathcal{M}^l_s$ is irreducible, regarded as a module of the vertex algebra $\mathcal{W}_{1+\infty,-1}$ via the free field realization (1.8).

Lemma 4.1 was proved in [KR2, M].

Lemma 4.2. The fields $J^n(w) = {:}\gamma(w)\partial^n\beta(w){:}$, $n \ge 0$, acting on the Fock space $\mathcal{M}_0$ can be expressed as a normally ordered polynomial in terms of the fields ${:}\partial^i b(w)\partial^j c(w){:}$, $i + j \le n$, $i > 0$, and $\partial^k j(w)$, $k = 0, 1, \cdots, n$. More precisely, we have

$${:}\gamma(w)\partial^n\beta(w){:} = \sum_{1 \le k \le n} k\binom{n}{k}\, {:}\partial^{n-k+1}b(w)c(w){:}\, P_{k-1}(j) + C_n P_{n+1}(j), \tag{4.24}$$

where $C_n = (n+2)\sum_{m=0}^{n} (-1)^{m+1}\frac{1}{m+2}$ is some constant depending on $n$, and the normally ordered polynomial $P_m(j)$ (also denoted by $P_m(j(w))$ when it is necessary to specify the variable in $j(w)$) in terms of the field $j(w)$ and its derivative fields is defined as (recall that $j(w) = \partial\phi(w)$)

$$P_m(j) = \frac{\partial_w^m\, {:}e^{-i\phi(w)}{:}}{{:}e^{-i\phi(w)}{:}}, \qquad m \ge 0. \tag{4.25}$$

Proof of Lemma 4.2. We will calculate the normally ordered product ${:}\partial^n\beta(w)\gamma(w){:}$ instead of ${:}\gamma(w)\partial^n\beta(w){:}$. These two normally ordered products coincide since both $\beta(w)$ and $\gamma(w)$ are free fields. By formulas (2.13) and (2.14), we have

$$\begin{aligned}
{:}\partial^n\beta(w)\gamma(w){:} &= {:}\partial^n\bigl(\partial b(w)X_{-i}(w)\bigr)\bigl(c(w)X_i(w)\bigr){:} \\
&= \sum_{0 \le k \le n} \binom{n}{k}\, {:}\partial^{n-k+1}b(w)\,\partial^k X_{-i}(w)\, c(w)X_i(w){:}.
\end{aligned} \tag{4.26}$$

It follows from the OPEs (2.12) that

$$\partial_z^{n-k+1}b(z)\,c(w) = \frac{(-1)^{n-k+1}(n-k+1)!}{(z-w)^{n-k+2}} + {:}\partial^{n-k+1}b(w)c(w){:} + \text{higher terms}. \tag{4.27}$$

It follows from the OPEs (2.11) that

$$\begin{aligned}
\partial_z^k X_{-i}(z)X_i(w) &= \partial_z^k\Bigl(\sum_{m \ge 0} \frac{(z-w)^{m+1}}{m!}\, P_m(j(w))\Bigr) \\
&= \sum_{m \ge k-1} \frac{m+1}{(m-k+1)!}\,(z-w)^{m-k+1}\, P_m(j(w)).
\end{aligned} \tag{4.28}$$

Since ${:}\partial^n\beta(w)\gamma(w){:}$ is the constant term in the expansion in powers of $z - w$ of the operator product $\partial^n\beta(z)\gamma(w)$, we see from Eqs. (4.26), (4.27) and (4.28) that the only terms in Eq. (4.28) which contribute non-trivially to ${:}\partial^n\beta(w)\gamma(w){:}$ are the two terms $m = k - 1$ and $m = n + 1$. Namely we have

$$\begin{aligned}
{:}\partial^n\beta(w)\gamma(w){:} &= \sum_{0 \le k \le n} \binom{n}{k}\Bigl[k\, {:}\partial^{n-k+1}b(w)c(w){:}\, P_{k-1}(j) + \frac{n+2}{(n-k+2)!}(-1)^{n-k+1}(n-k+1)!\, P_{n+1}(j)\Bigr] \\
&= \sum_{1 \le k \le n} k\binom{n}{k}\, {:}\partial^{n-k+1}b(w)c(w){:}\, P_{k-1}(j) + C_n P_{n+1}(j),
\end{aligned} \tag{4.29}$$

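where

$$C_n = (n+2)\sum_{m=0}^{n} (-1)^{m+1}\frac{1}{m+2}.$$

Remark 4.2. 1) $P_m(j)$ defined in Eq. (4.25) reads as follows for small $m$:

$$\begin{aligned}
P_0(j) &= 1, \\
P_1(j) &= -ij(w), \\
P_2(j) &= -i\partial j(w) - {:}j(w)^2{:}, \\
P_3(j) &= -i\partial^2 j(w) - 3\,{:}j(w)\partial j(w){:} + i\,{:}j(w)^3{:}.
\end{aligned}$$

2) The formula (4.24) reads as follows for small $n$:

$$\begin{aligned}
{:}\gamma(w)\beta(w){:} &\equiv J^0(w) = ij(w), \\
{:}\gamma(w)\partial\beta(w){:} &= {:}\partial b(w)c(w){:} - \frac{1}{2}\,{:}J^0(w)^2{:} + \frac{1}{2}\,\partial J^0(w), \\
{:}\gamma(w)\partial^2\beta(w){:} &= 2\,{:}\partial^2 b(w)c(w){:} - 2\,{:}\partial b(w)c(w){:}\,J^0(w) + \frac{5}{3}\,\partial^2 J^0(w) + \frac{5}{3}\,{:}J^0(w)^3{:} - 5\,{:}J^0(w)\partial J^0(w){:}.
\end{aligned}$$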
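The small cases listed in Remark 4.2 can be reproduced mechanically. The sketch below evaluates $C_n$ exactly as it is defined in Lemma 4.2, and generates $P_m(j)$ from the recursion $P_{m+1} = \partial P_m - i\,j\,P_m$, which follows from (4.25) if one uses $\partial\,{:}e^{-i\phi}{:} = -i\,{:}j\,e^{-i\phi}{:}$ and treats $j, \partial j, \partial^2 j, \dots$ as commuting symbols inside the normal ordering. That treatment, the monomial encoding (a tuple of exponents of $j, \partial j, \partial^2 j, \dots$), and all function names are assumptions of this illustration.

```python
from fractions import Fraction


def C(n):
    """The constant C_n = (n+2) * sum_{m=0}^{n} (-1)^(m+1) / (m+2) from Lemma 4.2."""
    return (n + 2) * sum(Fraction((-1) ** (m + 1), m + 2) for m in range(n + 1))


def _trim(exps):
    """Drop trailing zero exponents so each monomial has a unique key."""
    exps = list(exps)
    while exps and exps[-1] == 0:
        exps.pop()
    return tuple(exps)


def derivative(poly):
    """Apply d/dw to a polynomial in j, dj, d^2 j, ... (Leibniz rule)."""
    out = {}
    for mono, coeff in poly.items():
        for k, e in enumerate(mono):
            if e:
                new = list(mono) + [0]
                new[k] -= 1
                new[k + 1] += 1
                key = _trim(new)
                out[key] = out.get(key, 0) + coeff * e
    return {m: c for m, c in out.items() if c != 0}


def times_j(poly, factor):
    """Multiply a polynomial by factor * j."""
    out = {}
    for mono, coeff in poly.items():
        new = list(mono) if mono else [0]
        new[0] += 1
        out[tuple(new)] = coeff * factor
    return out


def P(m):
    """P_m(j) via the assumed recursion P_{m+1} = dP_m - i j P_m, with P_0 = 1."""
    poly = {(): 1}
    for _ in range(m):
        d = derivative(poly)
        t = times_j(poly, -1j)
        keys = set(d) | set(t)
        poly = {k: d.get(k, 0) + t.get(k, 0) for k in keys}
        poly = {k: v for k, v in poly.items() if v != 0}
    return poly


small_C = [C(n) for n in range(3)]
small_P = [P(m) for m in range(4)]
```

In this encoding the key `(1, 1)` stands for $j\,\partial j$ and `(0, 0, 1)` for $\partial^2 j$, so `small_P[3]` can be compared term by term with $P_3(j)$ in the remark, and `small_C` with the coefficients of $P_{n+1}(j)$ used there.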

Lemma 4.3. Each field ${:}\partial^i b(z)\partial^j c(z){:}$, $i > 0$, $j \ge 0$, can be expressed as a normally ordered polynomial in terms of $T(z)$ and $W(z)$ defined in (4.21) and (4.22) and their derivative fields.

Proof of Lemma 4.3. We first prove the following statement:

Claim $A_n$: Any field ${:}\partial^i b(z)\partial^{n-i+1}c(z){:}$, $1 \le i \le n+1$, can be written as a linear combination of the following $n + 1$ fields: $\partial\bigl({:}\partial^k b(z)\partial^{n-k}c(z){:}\bigr)$, $1 \le k \le n$, and ${:}T(z)\bigl(\partial^{n-1}b(z)c(z)\bigr){:}$.

Indeed one can calculate directly by using (4.21) and Wick's Theorem that

$${:}T(z)\bigl(\partial^{n-1}b(z)c(z)\bigr){:} = \frac{1}{2}\,{:}\partial^{n-1}b(z)\partial^2 c(z){:} + \frac{1}{n}\,{:}\partial^{n+1}b(z)c(z){:}. \tag{4.30}$$

Also, since the derivation of a normally ordered product satisfies the Leibniz rule, we have

$$\partial\bigl({:}\partial^k b(z)\partial^{n-k}c(z){:}\bigr) = {:}\partial^{k+1}b(z)\partial^{n-k}c(z){:} + {:}\partial^k b(z)\partial^{n-k+1}c(z){:}. \tag{4.31}$$

The $n + 1$ fields $\partial\bigl({:}\partial^k b(z)\partial^{n-k}c(z){:}\bigr)$, $1 \le k \le n$, and ${:}T(z)\bigl(\partial^{n-1}b(z)c(z)\bigr){:}$ can be obtained from the $n + 1$ fields ${:}\partial^i b(z)\partial^{n-i+1}c(z){:}$, $1 \le i \le n+1$, through a linear transformation given by the following $(n+1) \times (n+1)$ matrix

$$\begin{pmatrix}
1 & 1 & & & & \\
0 & 1 & 1 & & & \\
\vdots & & \ddots & \ddots & & \\
\vdots & & & 1 & 1 & 0 \\
0 & & & 0 & 1 & 1 \\
0 & \cdots & & \frac{1}{2} & 0 & \frac{1}{n+1}
\end{pmatrix}$$

It is easy to see that this matrix has determinant $\frac{n+3}{2(n+1)}$, so it is invertible. By inverting the matrix we prove Claim $A_n$.

Now we are ready to prove the following claim by induction on $n$, which is a reformulation of Lemma 4.3.

Claim $B_n$: Any field ${:}\partial^i b(z)\partial^{n-i}c(z){:}$, $1 \le i \le n$, can be written as a normally ordered polynomial in terms of $T(z)$, $W(z)$ and their derivative fields.

When $n = 1$, ${:}\partial b(z)c(z){:}$ is just $T(z)$ itself. When $n = 2$, ${:}\partial^2 b(z)c(z){:}$ and ${:}\partial b(z)\partial c(z){:}$ are clearly linear combinations of the fields

$$\partial T(z) = \partial\bigl({:}\partial b(z)c(z){:}\bigr) = {:}\partial^2 b(z)c(z){:} + {:}\partial b(z)\partial c(z){:}$$

and

$$W(z) = \frac{1}{\sqrt{6}}\Bigl({:}\partial^2 b(z)c(z){:} - {:}\partial b(z)\partial c(z){:}\Bigr).$$

So Claim $B_2$ is true. Assume that the statement $B_n$ is true. Then in particular the field ${:}\partial^{n-1}b(z)c(z){:}$ can be written as a normally ordered polynomial of $T(z)$ and $W(z)$, and so can ${:}T(z)\bigl(\partial^{n-1}b(z)c(z)\bigr){:}$. Then Claim $B_{n+1}$ follows from Claim $A_n$ (cf. Eq. (4.30)).
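The invertibility argument in Claim $A_n$ only needs the determinant of a bidiagonal matrix with one extra row. The sketch below builds that matrix with the two last-row entries left as parameters `a` and `c` (they come from the coefficients in (4.30); passing them in rather than hard-coding them is a choice of this illustration) and computes the determinant exactly. Expanding along the last row gives $a + c$ for any size, which the code confirms in small cases. The function names are hypothetical.

```python
from fractions import Fraction


def claim_matrix(n, a, c):
    """(n+1) x (n+1) matrix: rows 1..n carry 1, 1 on the diagonal and superdiagonal
    (the Leibniz relations (4.31)); the last row has `a` in column n-1 and `c` in
    column n+1 (1-based). Requires n >= 2."""
    size = n + 1
    rows = [[Fraction(0)] * size for _ in range(size)]
    for k in range(n):
        rows[k][k] = Fraction(1)
        rows[k][k + 1] = Fraction(1)
    rows[n][n - 2] = Fraction(a)
    rows[n][n] = Fraction(c)
    return rows


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    size = len(m)
    result = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for j in range(col, size):
                m[r][j] -= factor * m[col][j]
    return result


# Determinants for the last row (..., 1/2, 0, 1/(n+1)) shown in the matrix above.
dets = {n: det(claim_matrix(n, Fraction(1, 2), Fraction(1, n + 1))) for n in range(2, 9)}
```

Since the determinant equals `a + c`, it is non-zero whenever both last-row entries are positive, which is all the claim uses.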

Proof of Theorem 4.1. Lemmas 4.1 and 4.2 imply immediately that F is irreducible under the actions of the Fourier components of fields : ∂ i b(z)∂ j c(z) :, i > 0, j ≥ 0. Together l with Lemma 4.3, this implies that F is irreducible under the actions of Ln , Wn , n ∈ Z. 0 So the vertex algebra F is isomorphic to W3,−2 by Proposition 4.1. The free field l

l

construction of F guarantees that F is a module of the vertex algebra W3,−2 . The second statement of Theorem 4.1 now follows from the isomorphism of vertex algebras 0N M00 ∼ H0 given by (2.15). =F We have the following proposition from the explicit free field realization of modules F of the W3,−2 algebra. Also see Remark 4.3 in [W1] for some further implication. l

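Proposition 4.2. 1) There exists a non-split short exact sequence of modules over the vertex algebra $W_{3,-2}$:

$$0 \longrightarrow \overline{F}^{l} \longrightarrow F^{l} \longrightarrow \overline{F}^{l-1} \longrightarrow 0. \tag{4.32}$$

2) There exists a non-split short exact sequence of modules over the vertex algebra $W_{1+\infty,-1}$:

$$0 \longrightarrow M^{l}_{s-l} \longrightarrow M \longrightarrow M^{l-1}_{s-l+1} \longrightarrow 0. \tag{4.33}$$

Here $M$ is isomorphic to $F^{l}\otimes H_{is}$ as vector spaces.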


Proof. As a vector space we have a direct sum $F^{l} = \overline{F}^{l}\oplus b(0)\overline{F}^{l-1}$. Then it is not hard to see that, as a $W_{3,-2}$-module, $F^{l}/\overline{F}^{l}$ is isomorphic to the irreducible module $\overline{F}^{l-1}$. So the following non-split short exact sequence of modules over the $W_{3,-2}$ algebra

$$0 \longrightarrow \overline{F}^{l} \longrightarrow F^{l} \longrightarrow F^{l}/\overline{F}^{l} \longrightarrow 0$$

is isomorphic to the one in (4.32).

Note that $M^{j}_{s}$ is isomorphic to $\overline{F}^{j}\otimes H_{i(s+j)}$ as modules over the vertex algebra $W_{1+\infty,-1}$ by Theorem 4.1. Then the non-split short exact sequence (4.33) can be obtained by tensoring the one in (4.32) with $H_{is}$.

5. Character Formulas of Modules Over W3 Algebra With Central Charge −2

Denote by

$$\Psi(z,q,p) \equiv \sum_{l\in\mathbb{Z}} z^{l}\psi_l(q,p) = \mathrm{Tr}\big|_{\bigoplus_{l\in\mathbb{Z}}\overline{F}^{l}}\; z^{-j_0^{bc}}\, q^{L_0}\, p^{\widetilde W_0}$$

the full character of $\bigoplus_{l\in\mathbb{Z}}\overline{F}^{l}$, a direct sum of irreducible modules $\overline{F}^{l}$ over the $W_{3,-2}$ algebra. Here $\psi_l(q,p)$ is the full character of $\overline{F}^{l}$, $l\in\mathbb{Z}$. Then the full character formula $\psi_l(q,p)$ of the irreducible $W_{3,-2}$-module $\overline{F}^{l}$ can be recovered from $\Psi(z,q,p)$ by taking the residue

$$\psi_l(q,p) = \mathrm{Res}_{z=0}\, z^{-l-1}\,\Psi(z,q,p).$$

We will need the following lemma.

Lemma 5.1. We have the following OPEs:

$$T(z)b(w) \sim \frac{\partial b(w)}{z-w},$$

$$T(z)c(w) \sim \frac{c(w)}{(z-w)^{2}} + \frac{\partial c(w)}{z-w},$$

$$\widetilde W(z)b(w) \sim \frac{\tfrac12\,\partial b(w)}{(z-w)^{2}} + \frac{\partial^{2}b(w)}{z-w}, \tag{5.34}$$

$$\widetilde W(z)c(w) \sim \frac{-c(w)}{(z-w)^{3}} + \frac{-\tfrac32\,\partial c(w)}{(z-w)^{2}} + \frac{-\partial^{2}c(w)}{z-w}. \tag{5.35}$$

Proof. We will prove the OPE (5.35) only; the other OPEs can be proved similarly by using Wick's Theorem. Since

$$b(z)c(w) \sim \frac{1}{z-w},$$

we have

$$\partial_z^{2}b(z)\,c(w) \sim \frac{2}{(z-w)^{3}}.$$

Since $c(z)c(w)\sim0$ and $b(z), c(z)$ are fermionic fields, we have by Wick's Theorem

$$\bigl(:\partial_z^{2}b(z)c(z):\bigr)\,c(w) \sim -\frac{2c(z)}{(z-w)^{3}} \sim -\frac{2c(w)}{(z-w)^{3}} - \frac{2\,\partial c(w)}{(z-w)^{2}} - \frac{\partial^{2}c(w)}{z-w}. \tag{5.36}$$

We also have by Wick's Theorem

$$\bigl(:\partial_z b(z)\,\partial_z c(z):\bigr)\,c(w) \sim \frac{\partial_z c(z)}{(z-w)^{2}} \sim \frac{\partial_w c(w)}{(z-w)^{2}} + \frac{\partial_w^{2}c(w)}{z-w}. \tag{5.37}$$

Now the OPE (5.35) follows from (5.36), (5.37) and the definition of $\widetilde W(z)$ in (4.23).

In particular Lemma 5.1 implies

Corollary 5.1. We have the following commutation relations ($n\in\mathbb{Z}$):

$$[L_0, b(n)] = -n\,b(n),\qquad [L_0, c(n)] = -n\,c(n),\qquad [\widetilde W_0, b(n)] = n^{2}\,b(n),\qquad [\widetilde W_0, c(n)] = -n^{2}\,c(n).$$
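The passage from the OPEs of Lemma 5.1 to these mode relations can be checked mechanically. The following is a minimal sketch, not part of the original argument: it assumes the zero mode of a weight-$h$ field $X(z)$ is $X_0 = \mathrm{Res}_z\, z^{h-1}X(z)$ and that the singular part of $X(z)\varphi(w)$ is $\sum_{k=1}^{h} a_k\,\partial^{h-k}\varphi(w)/(z-w)^{k}$. The function names are illustrative only.

```python
from fractions import Fraction
from math import comb


def falling(x, j):
    """Falling factorial x(x-1)...(x-j+1); equals 1 when j = 0."""
    result = 1
    for i in range(j):
        result *= x - i
    return result


def zero_mode_eigenvalue(ope_coeffs, m):
    """Scalar by which [X_0, .] acts on the coefficient of w^(-m) in phi(w).

    ope_coeffs[k-1] = a_k, where X(z)phi(w) ~ sum_k a_k d^(h-k)phi(w)/(z-w)^k
    for k = 1..h and h = len(ope_coeffs) is the weight of X.
    """
    h = len(ope_coeffs)
    total = Fraction(0)
    for k, a_k in enumerate(ope_coeffs, start=1):
        # z^(h-1) = sum_j C(h-1, j) w^(h-1-j) (z-w)^j, and the residue at
        # z = w picks the pole of order k = j + 1.
        total += comb(h - 1, k - 1) * Fraction(a_k) * falling(-m, h - k)
    return total


# OPE coefficients (a_1, ..., a_h) read off from Lemma 5.1.
T_ON_B = (1, 0)
T_ON_C = (1, 1)
W_ON_B = (1, Fraction(1, 2), 0)
W_ON_C = (-1, Fraction(-3, 2), -1)

# b(w) = sum b(n) w^(-n), so m = n; c(w) = sum c(n) w^(-n-1), so m = n + 1.
for n in range(-3, 4):
    print(n,
          zero_mode_eigenvalue(T_ON_B, n),
          zero_mode_eigenvalue(T_ON_C, n + 1),
          zero_mode_eigenvalue(W_ON_B, n),
          zero_mode_eigenvalue(W_ON_C, n + 1))
```

Each printed row lists $n$ followed by the four scalars in the order of the corollary, so they can be compared with $-n$, $-n$, $n^{2}$ and $-n^{2}$.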

Proof. By comparing the coefficients of the $z^{-3}$ terms in both sides of the OPE (5.34), we get

$$[\widetilde W_0, b(w)] = w\,\partial b(w) + w^{2}\,\partial^{2}b(w). \tag{5.38}$$

Comparing the coefficients of the $w^{-n}$ terms in both sides of (5.38), we get $[\widetilde W_0, b(n)] = n^{2}b(n)$. Proofs of the other commutation relations in Corollary 5.1 are similar.

The following full character formula now follows from Corollary 5.1 and the characterization of $\overline{F}^{l}$ as the subspace of $F^{l}$ consisting of vectors which do not involve $b(0)$, the zeroth Fourier component of the field $b(z)$.

Theorem 5.1. The full character formula $\Psi(z,q,p)$ is given by

$$\Psi(z,q,p) = \prod_{n\ge1}\bigl(1 + z\,q^{n}p^{n^{2}}\bigr)\bigl(1 + z^{-1}q^{n}p^{-n^{2}}\bigr).$$
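The product formula can be expanded by machine and compared with the closed form for the Virasoro specialization quoted in Remark 5.1, 1). The sketch below is only a numerical cross-check up to a chosen order in $q$. It assumes that this specialization amounts to forgetting the $\widetilde W_0$-grading, that is, setting $p=1$ in the product. All function names are illustrative.

```python
def character_coefficients(order):
    """Expand prod_{n>=1} (1 + z q^n)(1 + z^-1 q^n) up to q^order.

    Returns a dict mapping (l, m) to the coefficient of z^l q^m.
    Factors with n > order cannot contribute below q^(order+1).
    """
    series = {(0, 0): 1}
    for n in range(1, order + 1):
        for dz in (1, -1):
            updated = dict(series)
            for (l, m), coeff in series.items():
                if m + n <= order:
                    key = (l + dz, m + n)
                    updated[key] = updated.get(key, 0) + coeff
            series = updated
    return series


def partition_numbers(order):
    """Coefficients of 1/prod_{n>=1}(1 - q^n) up to q^order."""
    p = [1] + [0] * order
    for part in range(1, order + 1):
        for m in range(part, order + 1):
            p[m] += p[m - part]
    return p


def closed_form(l, order):
    """Coefficients of sum_{k>=|l|} (-1)^(k+l) q^(k(k+1)/2) / prod(1 - q^n)."""
    p = partition_numbers(order)
    coeffs = [0] * (order + 1)
    k = abs(l)
    while k * (k + 1) // 2 <= order:
        shift = k * (k + 1) // 2
        sign = -1 if (k + l) % 2 else 1
        for m in range(shift, order + 1):
            coeffs[m] += sign * p[m - shift]
        k += 1
    return coeffs


ORDER = 12
expansion = character_coefficients(ORDER)
for l in range(-2, 3):
    from_product = [expansion.get((l, m), 0) for m in range(ORDER + 1)]
    print(l, from_product == closed_form(l, ORDER))
```

Agreement up to a finite order is of course not a proof; the Jacobi triple product identity is what establishes the formula in general.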

Remark 5.1. 1) By using the Jacobi triple identity, one can easily show that

$$\psi_l(q,0) = \frac{1}{\prod_{n\ge1}(1-q^{n})}\sum_{k\ge|l|}(-1)^{k+l}\,q^{\frac{k(k+1)}{2}}.$$

This is consistent with the explicit decomposition of $\overline{F}^{l}$ with respect to the Virasoro algebra generated by the Fourier components of the field $T(z)$ [FF].

2) If we consider instead

$$\widetilde\Psi(z,q,p) \equiv \mathrm{Tr}\big|_{\bigoplus_{l\in\mathbb{Z}}F^{l}}\; z^{-j_0^{bc}}\, q^{L_0}\, p^{\widetilde W_0},$$

then we can show similarly that

$$\widetilde\Psi(z,q,p) = (1+z)\prod_{n\ge1}\bigl(1 + z\,q^{n}p^{n^{2}}\bigr)\bigl(1 + z^{-1}q^{n}p^{-n^{2}}\bigr). \tag{5.39}$$

Essentially the same formula as in (5.39), up to some simple changes of variables, appears in [Di] as a generating function counting covers of an elliptic curve. Modular invariance and some other interesting properties of the function $\widetilde\Psi(z,q,p)$ were discussed in detail in [KZ]. It is suggested that $\widetilde\Psi(z,q,p)$ may be an indication of the existence of generalized Jacobi forms involving several (possibly infinitely many) variables. We hope that full character formulas of representations of W-algebras in general may provide further natural examples of generalized Jacobi forms.

Acknowledgement. The results of this paper were presented in the Seminar of Geometry, Symmetry and Physics at Yale University and at the 1997 AMS Meeting in Detroit. I thank the organizers of the meeting, Chongying Dong and Bob Griess, for the invitation. I thank Edward Frenkel, Igor Frenkel, Victor Kac, and Gregg Zuckerman for their interest and comments, and especially Edward Frenkel for stimulating discussions. I also thank Gerd Mersmann for pointing out to me the references [Di, KZ].

References

[AFMO] Awata, H., Fukuma, M., Matsuo, Y. and Odake, S.: Character and determinant formulae of quasifinite representations of the $W_{1+\infty}$ algebra. Commun. Math. Phys. 172, 377–400 (1995)
[Ba] Bakas, I.: The large-N limit of extended conformal symmetries. Phys. Lett. B 228, 57–63 (1989)
[BK] Bakas, I. and Kiritsis, E.: Bosonic realization of a universal W-algebra and $Z_\infty$ parafermions. Nucl. Phys. B 343, 185–204 (1990)
[B] Borcherds, R.: Vertex algebras, Kac–Moody algebras, and the Monster. Proc. Natl. Acad. Sci. USA 83, 3068–3071 (1986)
[BCMN] Bouwknegt, P., Ceresole, A., van Nieuwenhuizen, P. and McCarthy, J.: Extended Sugawara construction for the superalgebras $SU(M+1\,|\,N+1)$. II. The third-order Casimir algebra. Phys. Rev. D 40, 415–421 (1989)
[BMP] Bouwknegt, P., McCarthy, J. and Pilch, K.: The $W_3$ algebra: Modules, semi-infinite cohomology and BV-algebras. hep-th/9509119
[BS] Bouwknegt, P. and Schoutens, K.: W-symmetry in conformal field theory. Phys. Rep. 223, 183–276 (1993)
[Di] Dijkgraaf, R.: Mirror symmetry and elliptic curves. In: The moduli space of curves, R. Dijkgraaf et al. (eds.), Prog. Math. 129, Boston: Birkhäuser, 1995
[DL] Dong, C. and Lepowsky, J.: Generalized vertex algebras and relative vertex operators. Prog. Math. 112, Boston: Birkhäuser, 1993
[FF] Feigin, B. and Frenkel, E.: Semi-infinite Weil complex and the Virasoro algebra. Commun. Math. Phys. 137, 617–639 (1991); Erratum: Commun. Math. Phys. 147, 647–648 (1992)
[FeF] Feingold, A. and Frenkel, I.: Classical affine algebras. Adv. Math. 56, 117–172 (1985)
[F] Flohr, M.: On modular invariant partition functions of conformal field theories with logarithmic operators. hep-th/9509166
[FKRW] Frenkel, E., Kac, V., Radul, A. and Wang, W.: $W_{1+\infty}$ and $W(gl_N)$ with central charge $N$. Commun. Math. Phys. 170, 337–357 (1995)
[F1] Frenkel, I.: Two constructions of affine Lie algebras and boson-fermion correspondence in quantum field theory. J. Funct. Anal. 44, 259–327 (1981)
[F2] Frenkel, I.: Representations of Kac–Moody algebras and dual resonance models. In: Applications of Group Theory in Physics and Mathematical Physics, M. Flato, P. Sally, G. Zuckerman (eds.), Lect. Applied Math. 21, AMS, 325–353 (1985)
[FLM] Frenkel, I., Lepowsky, J. and Meurman, A.: Vertex operator algebras and the Monster. New York: Academic Press, 1988
[FMS] Friedan, D., Martinec, E. and Shenker, S.: Conformal invariance, supersymmetry and string theory. Nucl. Phys. B 271, 93–165 (1986)
[GK] Gaberdiel, M. and Kausch, H.: A rational logarithmic conformal field theory. hep-th/9606050
[K1] Kac, V.: Infinite dimensional Lie algebras. Third edition, Cambridge: Cambridge University Press, 1990
[K2] Kac, V.: Vertex algebras for beginners. Univ. Lect. Series 10, Providence, RI: AMS, 1996
[KV1] Kac, V. and van de Leur, W.: Super boson-fermion correspondence. Ann. Inst. Fourier 37, 99–137 (1987)
[KV2] Kac, V. and van de Leur, W.: Super boson-fermion correspondence of type B. In: Infinite-dimensional Lie algebras and groups, V. Kac (ed.), Singapore: World Scientific, 1989, pp. 369–416
[KR1] Kac, V. and Radul, A.: Quasi-finite highest weight modules over the Lie algebra of differential operators on the circle. Commun. Math. Phys. 157, 429–457 (1993)
[KR2] Kac, V. and Radul, A.: Representation theory of the vertex algebra $W_{1+\infty}$. Transf. Groups 1, 41–70 (1996)
[KZ] Kaneko, M. and Zagier, D.: A generalized Jacobi theta function and quasimodular forms. In: The moduli space of curves, R. Dijkgraaf et al. (eds.), Prog. Math. 129, Boston: Birkhäuser, 1995
[Ka] Kausch, H.: Curiosity at c = −2. hep-th/9510149
[LZ1] Lian, B. and Zuckerman, G.: BRST cohomology of the super-Virasoro algebras. Commun. Math. Phys. 125, 301–335 (1989)
[LZ2] Lian, B. and Zuckerman, G.: Commutative quantum operator algebras. J. Pure Appl. Alg. 100, 117–139 (1995)
[M] Matsuo, Y.: Free fields and quasi-finite representations of $W_{1+\infty}$. Phys. Lett. B 326, 95–100 (1994)
[O] Odake, S.: Unitary representations of W infinity algebras. Inter. J. Mod. Phys. A 7, 6339–6355 (1992)
[PRS] Pope, C., Romans, L. and Shen, X.: A new higher-spin algebra and the lone-star product. Phys. Lett. B 242, 401–406 (1990)
[W1] Wang, W.: Dual pairs and tensor categories of modules over Lie algebras $\widehat{gl}_\infty$ and $W_{1+\infty}$. Preprint, q-alg/9709034
[W2] Wang, W.: Classification of irreducible modules of $W_3$ algebra with central charge −2. q-alg/9708016, to appear in Commun. Math. Phys.
[Z] Zamolodchikov, A.B.: Infinite additional symmetries in two dimensional conformal quantum field theory. Theor. Math. Phys. 65, 1205–1213 (1985)

Communicated by G. Felder

Commun. Math. Phys. 195, 113 – 128 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Classification of Irreducible Modules of W3 Algebra with c = −2

Weiqiang Wang*

Max-Planck Institut für Mathematik, 53225 Bonn, Germany.

Received: 10 July 1997 / Accepted: 11 November 1997

* On leave from Department of Mathematics, Yale University, USA.

Abstract: We construct irreducible modules $V_\alpha$ ($\alpha\in\mathbb{C}$) over the $W_3$ algebra with central charge $c=-2$ in terms of a free bosonic field. We prove that these modules exhaust all the irreducible modules of the $W_3$ algebra with $c=-2$. Highest weights of the modules $V_\alpha$ ($\alpha\in\mathbb{C}$) with respect to the full (two-dimensional) Cartan subalgebra of the $W_3$ algebra are $\bigl(\tfrac12\alpha(\alpha-1),\ \tfrac16\alpha(\alpha-1)(2\alpha-1)\bigr)$. They are parametrized by points $(t,w)$ on a rational curve $w^{2}-\tfrac19t^{2}(8t+1)=0$. Irreducible modules of the vertex algebra $W_{1+\infty}$ with $c=-1$ are also classified.

0. Introduction

In the study of two-dimensional conformal field theories, extensions of conformal symmetry play an important role. The algebraic structures underlying the extended conformal symmetry are usually known as W-algebras in the literature (see [BS, FF] and references therein). Mathematically, W-algebras can be put into the general framework of the theory of vertex algebras formulated first by Borcherds, cf. e.g. [B1, FLM, DL, LZ, FKRW, K, B2]. In contrast to vertex algebras associated to the Virasoro algebra, W-algebras such as the $W_N$ algebras have the feature that non-linear terms appear in the operator product expansion of two generating fields; namely, the commutator of two generators contains non-linear terms expressed by these generators themselves.

Mainly due to the non-linear nature of W-algebras, the study of their representation theory has been difficult and very non-trivial. Even the understanding of the representation theory of the Zamolodchikov $W_3$ algebra [Za], which is the simplest example of a W-algebra beyond the Virasoro algebra, is far from satisfactory (see however [BMP, dVvD]). Apart from the Virasoro generators $L_n$, $n\in\mathbb{Z}$, the $W_3$ algebra has an additional set of generators $W_n$, $n\in\mathbb{Z}$. Denote by $U(W_3)$ the corresponding universal enveloping algebra. Define two generating series $T(z)=\sum_{n\in\mathbb{Z}}L_nz^{-n-2}$ and $W(z)=\sum_{n\in\mathbb{Z}}W_nz^{-n-3}$.

It is well known that the vacuum module $VW_{3,c}$ with central charge $c$ carries a vertex algebra structure. For a generic central charge $c$, $VW_{3,c}$ is an irreducible representation of $U(W_3)$. In this case, the representation theory of the vertex algebra $VW_{3,c}$ is the same as that of $U(W_3)$. For a non-generic central charge $c$, $VW_{3,c}$ is reducible and admits a unique maximal proper $U(W_3)$-submodule $I$ and thus a unique irreducible quotient, which is denoted by $W_{3,c}$. $W_{3,c}$ inherits a vertex algebra structure from $VW_{3,c}$. The representation theory of $W_{3,c}$ with non-generic central charge $c$ becomes highly non-trivial, since a module $M$ of $U(W_3)$ can be regarded as a module of $W_{3,c}$ if and only if the Fourier components of any field corresponding to any vector in $I$ annihilate the whole of $M$.

In [W2], in studying the vertex algebra $W_{1+\infty}$ with central charge $-1$ (denoted by $W_{1+\infty,-1}$), we explicitly constructed a number of irreducible modules of $W_{3,-2}$ parametrized by integers and obtained full character formulas for these modules. We showed that the vertex algebra $W_{1+\infty,-1}$ is isomorphic to a tensor product of $W_{3,-2}$ and a Heisenberg vertex algebra generated by a free bosonic field, by using the Friedan–Martinec–Shenker bosonization technique [FMS].

In this paper, we will continue the study of the representation theory of $W_{3,-2}$ and $W_{1+\infty,-1}$. Note that $-2$ is a non-generic central charge for the $W_3$ algebra. We will explicitly construct irreducible modules $V_\alpha$, $\alpha\in\mathbb{C}$, of $W_{3,-2}$ in terms of a free bosonic field. Then by locating key singular vectors in $VW_{3,-2}$ and applying Zhu's machinery [Z] to our case we are able to prove that $V_\alpha$, $\alpha\in\mathbb{C}$, exhaust all the irreducible modules of $W_{3,-2}$. It turns out that the set of all irreducibles of $W_{3,-2}$ has an elegant description: highest weights of these irreducible modules are parametrized by points of a rational curve defined by $w^{2}-\tfrac19t^{2}(8t+1)=0$.
Combining with our results in [W2] we also construct and classify all the irreducible modules of $W_{1+\infty,-1}$. This latter classification result disproves a conjecture of Kac and Radul [KR2].

Let us explain in more detail. Given a pair of bc fields $b(z)=\sum_{n\in\mathbb{Z}}b(n)z^{-n}$ and $c(z)=\sum_{n\in\mathbb{Z}}c(n)z^{-n-1}$, we construct a Fock space $F$ generated by the vacuum vector $|bc\rangle$, satisfying

$$b(n+1)|bc\rangle = 0,\qquad c(n)|bc\rangle = 0,\qquad n\ge0.$$

Then $j(z) = :b(z)c(z): = \sum_{n\in\mathbb{Z}}j_nz^{-n-1}$ is a free bosonic field. Take a scalar field $\psi(z)$ such that $j(z)=\partial\psi(z)$. Denote by $H_\alpha$ the Fock space of the Heisenberg algebra $\{j_n, n\in\mathbb{Z}\}$ with vacuum vector $|\alpha\rangle$ satisfying

$$j_n|\alpha\rangle = \alpha\,\delta_{n,0}|\alpha\rangle,\qquad n\ge0.$$

It is observed in [BCMN, W2] that the fields

$$T(z) = :\partial b(z)c(z):,\qquad W(z) = \frac{1}{\sqrt6}\bigl(:\partial^{2}b(z)c(z): - :\partial b(z)\partial c(z):\bigr)$$

satisfy the $W_3$ operator product expansions with central charge $-2$. We can rewrite the fields $T(z)$ and $W(z)$ in terms of $j(z)$ by means of the boson-fermion correspondence (Proposition 3.1). $F^{0}$ is thus isomorphic to $H_0$. It is shown in [W2] that the simple vertex algebra $W_{3,-2}$ is a vertex subalgebra of $F^{0}$ and can be identified explicitly inside $F^{0}$. Denote by $V_\alpha$ the irreducible quotient of the $W_{3,-2}$-submodule of $H_\alpha$ generated by the highest weight vector $|\alpha\rangle$ in $H_\alpha$. Let $\widetilde W(z)\equiv\sum_{n\in\mathbb{Z}}\widetilde W_nz^{-n-3}=\tfrac12\sqrt6\,W(z)$. We will show that the highest weight of $V_\alpha$ with respect to the full Cartan subalgebra $\{L_0,\widetilde W_0\}$ of $W_{3,-2}$ is

$$\Bigl(\tfrac12\alpha(\alpha-1),\ \tfrac16\alpha(\alpha-1)(2\alpha-1)\Bigr). \tag{0.1}$$

To show that the above irreducible modules $V_\alpha$, $\alpha\in\mathbb{C}$, exhaust all the irreducible modules of $W_{3,-2}$, we invoke a powerful machinery due to Zhu in the general theory of vertex algebras [Z]. Zhu constructed an associative algebra $A(V)$ for any vertex algebra $V$ such that irreducible modules of the vertex algebra $V$ correspond one-to-one to irreducible modules of the associative algebra $A(V)$ (Zhu's constructions were generalized to vertex superalgebras in [KW]). By construction, the Zhu associative algebra $A(V)$ is a certain quotient of $V$. We denote by $[a]$ the image in $A(V)$ of $a\in V$. By studying the associative algebra $A(V)$, one can often obtain useful information on highest weights of modules over $V$.

We show that the Zhu associative algebra $A(VW_{3,-2})$ is isomorphic to a polynomial algebra $\mathbb{C}[t,w]$, where $t$ and $w$ correspond to the elements $[L_{-2}|0\rangle]$ and $[\widetilde W_{-3}|0\rangle]$ in $A(VW_{3,-2})$ respectively. Using some explicit results on singular vectors in $VW_{3,-2}$, we further show that the Zhu associative algebra $A(W_{3,-2})$ is isomorphic to (some quotient of) the quotient algebra $\mathbb{C}[t,w]/\langle f(t,w)\rangle$, where $\langle f(t,w)\rangle$ is the ideal of $\mathbb{C}[t,w]$ generated by the polynomial $f(t,w)=w^{2}-\tfrac19t^{2}(8t+1)$. This means that a necessary condition for any irreducible module of the vertex algebra $VW_{3,-2}$ to be a module of $W_{3,-2}$ is that its highest weight $(t,w)$ has to satisfy the equation

$$w^{2}-\tfrac19t^{2}(8t+1)=0. \tag{0.2}$$

We observe that all the solutions to the equation above can be written in the form (0.1). But we have already constructed irreducible modules $V_\alpha$ of $W_{3,-2}$ with a highest weight of any such form. This shows that $V_\alpha$ ($\alpha\in\mathbb{C}$) are all the irreducible $W_{3,-2}$-modules and their highest weights are parametrized by points on the rational curve defined by Eq. (0.2). Eq. (0.2) as a necessary constraint on the highest weights of irreducible $W_{3,-2}$-modules was anticipated in [H, EFH²NV] by some other arguments.¹

In the remaining part of this paper we classify all the irreducible modules of the vertex algebra $W_{1+\infty,-1}$. Recall that in our paper [W2] we have shown that the vertex algebra $W_{1+\infty,-1}$ is isomorphic to a tensor product of the vertex algebra $W_{3,-2}$ and a Heisenberg vertex algebra generated by a free bosonic field. Therefore the classification of irreducible modules of $W_{1+\infty,-1}$ follows from our classification of irreducible modules of $W_{3,-2}$ and the well-known description of all irreducible modules of a Heisenberg vertex algebra.

This paper is organized as follows. In Sect. 1, we recall the definition of a vertex algebra and review Zhu's associative algebra theory. In Sect. 2, we recall the $W_3$ algebra and study the case with central charge $-2$ in some detail. In Sect. 3 we construct irreducible modules $V_\alpha$ ($\alpha\in\mathbb{C}$) of $W_{3,-2}$ and determine their highest weights. In Sect. 4 we calculate the Zhu algebra $A(W_{3,-2})$ and show that the list of irreducible modules constructed in Sect. 3 is complete. In Sect. 5 we classify all irreducible modules of $W_{1+\infty,-1}$.

¹ I thank A. Honecker for pointing out these references to me after the completion of the paper.


1. Vertex Algebras and Zhu's Associative Algebra Theory

Our definition of vertex algebras basically follows [FKRW, K]. To the best of our knowledge, the locality as formulated in the axiom (L) below first appeared in [DL]. It was also indicated in [DL] that a similar definition can be made which is essentially equivalent to other formulations in [B1, FLM, LZ]. Though it is not essential to have a gradation in the definition of vertex algebras (cf. [K]), we choose to keep it in this paper in order to present Zhu's associative algebra theory.

Definition 1.1. A vertex algebra consists of the following data: a $\mathbb{Z}_+$-graded vector space $V=\bigoplus_{n\in\mathbb{Z}_+}V_n$; a vector $|0\rangle\in V$ (called the vacuum vector); an operator $L_0$ (called the degree operator) and an operator $T\in\mathrm{End}\,V$ (called the translation operator); a linear map from $V$ to the space of fields $a\mapsto Y(a,z)=\sum_{n\in\mathbb{Z}}a_{(n)}z^{-n-1}\in\mathrm{End}\,V[[z,z^{-1}]]$ (called the state-field correspondence). These data satisfy the following axioms:

(V) $Y(|0\rangle,z)=I_V$, $Y(a,z)|0\rangle|_{z=0}=a$;
(G) $L_0|_{V_n}=nI_{V_n}$, $[L_0,Y(a,z)]=z\,\partial_zY(a,z)+Y(L_0a,z)$;
(T) $[T,Y(a,z)]=\partial_zY(a,z)$, $T|0\rangle=0$;
(L) $(z-w)^{N}[Y(a,z),Y(b,w)]=0$ for $N\gg0$.

For $a\in V_n$, $n$ is called the weight of $a$, denoted by $\mathrm{wt}\,a$. We denote by $o(a)=a_{(\mathrm{wt}\,a-1)}$ for a homogeneous element $a\in V$, which extends by linearity to the whole of $V$.

The results of the remaining part of this section are due to Zhu [Z]. We refer readers to [Z] for more detail.

Definition 1.2. Define two bilinear operations $\ast$ and $\circ$ on $V$ as follows. For $a$ homogeneous, let

$$a\ast b=\mathrm{Res}_z\Bigl(Y(a,z)\frac{(z+1)^{\mathrm{wt}\,a}}{z}\,b\Bigr),\qquad a\circ b=\mathrm{Res}_z\Bigl(Y(a,z)\frac{(z+1)^{\mathrm{wt}\,a}}{z^{2}}\,b\Bigr),$$

then extend to $V\times V$ by bilinearity. Denote by $O(V)$ the subspace of $V$ spanned by the elements $a\circ b$, and by $A(V)$ the quotient space $V/O(V)$.

It is convenient to introduce an equivalence relation $\sim$ as in [W1]. For $a,b\in V$, $a\sim b$ means $a-b\equiv0 \bmod O(V)$. For $f,g\in\mathrm{End}\,V$, $f\sim g$ means $f\cdot c\sim g\cdot c$ for any $c\in V$. Denote by $[a]$ the image of $a\in V$ under the projection of $V$ onto $A(V)$.

Lemma 1.1. 1) $T+L_0\sim0$.
2) For every homogeneous element $a\in V$, and $m\ge n\ge0$, one has

$$\mathrm{Res}_z\Bigl(Y(a,z)\frac{(z+1)^{\mathrm{wt}\,a+n}}{z^{2+m}}\Bigr)\sim0.$$

3) For homogeneous elements $a,b\in V$, one has

$$a\ast b\sim\mathrm{Res}_z\Bigl(Y(b,z)\frac{(z+1)^{\mathrm{wt}\,b-1}}{z}\,a\Bigr).$$
W3 Algebra

117

Theorem 1.1. 1) $O(V)$ is a two-sided ideal of $V$ under the multiplication $\ast$. Moreover, the quotient algebra $(A(V),\ast)$ is associative.
2) $[1]$ is the unit element of the algebra $A(V)$.

In the case that the vertex algebra $V$ contains a Virasoro element $\omega$, i.e. the corresponding field $Y(\omega,z)$ is an energy-momentum tensor field, we have

Lemma 1.2. $[\omega]$ is in the center of the associative algebra $A(V)$.

The following proposition follows from the definition of $A(V)$.

Proposition 1.1. Let $I$ be an ideal of $V$. Then the associative algebra $A(V/I)$ is isomorphic to $A(V)/[I]$, where $[I]$ is the image of $I$ in $A(V)$.

Theorem 1.2. 1) If $M=\bigoplus_{n\in\mathbb{Z}_+}M_n$ is a module of the vertex algebra $V$, then the top level $M_0$ of $M$ is a module of the associative algebra $A(V)$, with action given as follows: for $[a]\in A(V)$, which is the image of $a\in V$, $[a]$ acts on $M_0$ as $o(a)$.
2) Irreducible modules of the vertex algebra $V$ correspond one-to-one to irreducible modules of the associative algebra $A(V)$ as in 1).

We call $A(V)$ the Zhu (associative) algebra of a vertex algebra $V$.

2. W3 Algebra with Central Charge −2

Denote by $U(W_3)$ the quotient of the free associative algebra generated by $L_m, W_m$, $m\in\mathbb{Z}$, by the ideal generated by the following commutation relations (cf. e.g. [BMP]):

$$[L_m,L_n]=(m-n)L_{m+n}+\frac{c}{12}(m^{3}-m)\,\delta_{m,-n},$$

$$[L_m,W_n]=(2m-n)W_{m+n},$$

$$[W_m,W_n]=(m-n)\Bigl(\frac{1}{15}(m+n+3)(m+n+2)-\frac16(m+2)(n+2)\Bigr)L_{m+n}+\beta(m-n)\Lambda_{m+n}+\frac{c}{360}\,m(m^{2}-1)(m^{2}-4)\,\delta_{m,-n}, \tag{2.3}$$

where $c\in\mathbb{C}$ is the central charge, $\beta=16/(22+5c)$ and

$$\Lambda_m=\sum_{n\le-2}L_nL_{m-n}+\sum_{n>-2}L_{m-n}L_n-\frac{3}{10}(m+2)(m+3)L_m.$$

Denote

$$W_{3,\pm}=\{L_n,W_n,\ \pm n>0\},\qquad W_{3,0}=\{L_0,W_0\}.$$

A Verma module $M_c(t,w)$ (or $M(t,w)$ whenever there is no confusion of central charge) of $U(W_3)$ is the induced module

$$M(t,w)=U(W_3)\otimes_{U(W_{3,+}\oplus W_{3,0})}\mathbb{C}_{t,w},$$


where $\mathbb{C}_{t,w}$ is the 1-dimensional module of $U(W_{3,+}\oplus W_{3,0})$ generated by a vector $|t,w\rangle$ such that

$$W_{3,+}|t,w\rangle=0,\qquad L_0|t,w\rangle=t|t,w\rangle,\qquad W_0|t,w\rangle=w|t,w\rangle. \tag{2.4}$$

$M(t,w)$ has a unique irreducible quotient, which is denoted by $L(t,w)$ (or $L_c(t,w)$ when it is necessary to specify the central charge). A singular vector in a $U(W_3)$-module means a vector killed by $W_{3,+}$. For simplicity, we denote the vacuum vector $|0,0\rangle$ by $|0\rangle$ in the case $t=w=0$. It is easy to see that $L_{-1}|0\rangle$, $W_{-1}|0\rangle$, and $W_{-2}|0\rangle$ are singular vectors in $M(0,0)$. We denote by $VW_{3,c}$ the vacuum module, which is by definition the quotient of the Verma module $M(0,0)$ by the $U(W_3)$-submodule generated by the singular vectors $L_{-1}|0\rangle$, $W_{-1}|0\rangle$, and $W_{-2}|0\rangle$. We also call $L(0,0)$ the irreducible vacuum module. Let $I$ be the maximal proper submodule of the vacuum module $VW_{3,c}$. Clearly $L(0,0)$ is the irreducible quotient of $VW_{3,c}$. It is easy to see that $VW_{3,c}$ has a linear basis

$$L_{-i_1-2}\cdots L_{-i_m-2}\,W_{-j_1-3}\cdots W_{-j_n-3}|0\rangle,\qquad 0\le i_1\le\cdots\le i_m,\quad 0\le j_1\le\cdots\le j_n,\quad m,n\ge0. \tag{2.5}$$

The action of $L_0$ on $VW_{3,c}$ gives rise to a principal gradation on $VW_{3,c}$: $VW_{3,c}=\bigoplus_{n\in\mathbb{Z}}(VW_{3,c})_n$. Introduce the following fields

$$T(z)=\sum_{n\in\mathbb{Z}}L_nz^{-n-2},\qquad W(z)=\sum_{n\in\mathbb{Z}}W_nz^{-n-3}. \tag{2.6}$$

It is well known that the vacuum module $VW_{3,c}$ (resp. irreducible vacuum module $L(0,0)$) carries a vertex algebra structure with generating fields $T(z)$ and $W(z)$. The $W_3$ algebra with central charge $-2$ we have been referring to is the vertex algebra $L_{-2}(0,0)$, which we denote by $W_{3,-2}$ in this paper. The fields $T(z)$ and $W(z)$ correspond to the vectors $L_{-2}|0\rangle$ and $W_{-3}|0\rangle$ respectively. The field corresponding to the vector $L_{-i_1-2}\cdots L_{-i_m-2}W_{-j_1-3}\cdots W_{-j_n-3}|0\rangle$ is $\partial^{(i_1)}T(z)\cdots\partial^{(i_m)}T(z)\,\partial^{(j_1)}W(z)\cdots\partial^{(j_n)}W(z)$, where $\partial^{(i)}$ denotes $\frac{1}{i!}\partial_z^{i}$.

From now on we concentrate on the case of the $W_3$ algebra with central charge $c=-2$. We can rewrite (2.3) as the following OPEs in our central charge $-2$ case:

$$T(z)T(w)\sim\frac{-1}{(z-w)^{4}}+\frac{2T(w)}{(z-w)^{2}}+\frac{\partial T(w)}{z-w},$$

$$T(z)W(w)\sim\frac{3W(w)}{(z-w)^{2}}+\frac{\partial W(w)}{z-w},$$

$$W(z)W(w)\sim\frac{-2/3}{(z-w)^{6}}+\frac{2T(w)}{(z-w)^{4}}+\frac{\partial T(w)}{(z-w)^{3}}+\frac{1}{(z-w)^{2}}\Bigl(\frac83:T(w)T(w):-\frac12\partial^{2}T(w)\Bigr)+\frac{1}{z-w}\Bigl(\frac43\partial\bigl(:T(w)T(w):\bigr)-\frac13\partial^{3}T(w)\Bigr). \tag{2.7}$$
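The numerical coefficients in the $W(z)W(w)$ OPE of (2.7) can be cross-checked against the general-$c$ form of that OPE. The sketch below assumes the standard normalization, which is not spelled out in the text: the pole of order 2 carries $2\beta\Lambda+\tfrac{3}{10}\partial^{2}T$ and the simple pole carries $\beta\,\partial\Lambda+\tfrac{1}{15}\partial^{3}T$, with $\Lambda=:TT:-\tfrac{3}{10}\partial^{2}T$ and $\beta=16/(22+5c)$. The function name and dictionary keys are illustrative.

```python
from fractions import Fraction


def ww_ope_coefficients(c):
    """Coefficients of the W(z)W(w) OPE for central charge c.

    Assumes the order-2 pole is 2*beta*Lambda + (3/10) d^2 T and the simple
    pole is beta*dLambda + (1/15) d^3 T, where Lambda = :TT: - (3/10) d^2 T.
    """
    c = Fraction(c)
    beta = Fraction(16) / (22 + 5 * c)
    three_tenths = Fraction(3, 10)
    return {
        "1/(z-w)^6": c / 3,
        ":TT:/(z-w)^2": 2 * beta,
        "d^2T/(z-w)^2": three_tenths - 2 * beta * three_tenths,
        "d(:TT:)/(z-w)": beta,
        "d^3T/(z-w)": Fraction(1, 15) - beta * three_tenths,
    }


for name, value in ww_ope_coefficients(-2).items():
    print(name, value)
```

For $c=-2$ one has $\beta=4/3$, and the printed values can be compared term by term with (2.7).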

Representation theory of the vertex algebra VW 3,c is just the same as that of U (W3 ). We see from the following lemma that VW 3,−2 is reducible so its maximal proper submodule I is not zero. Representation theory of W3,−2 becomes highly non-trivial


due to the following constraints: a module $M$ of the vertex algebra $VW_{3,-2}$ can be a module of $W_{3,-2}$ if and only if $M$ is annihilated by all the Fourier components of all fields corresponding to vectors in $I\subset VW_{3,-2}$. So it is important to find information on (top) singular vectors in the vacuum module $VW_{3,-2}$; cf. [dVvD] for a general approach by means of Kazhdan–Lusztig polynomials. The following lemma can be proved by a tedious but direct calculation.

Lemma 2.1. 1) There is no singular vector in $(VW_{3,-2})_n$, $n\le5$.
2) There are two independent singular vectors in $(VW_{3,-2})_6$, denoted by $v_s$ and $v_s'$:

$$v_s\equiv\Bigl(\frac32W_{-3}^{2}-\frac{19}{36}L_{-3}^{2}-\frac89L_{-2}^{3}-\frac{14}{9}L_{-2}L_{-4}+\frac{44}{9}L_{-6}\Bigr)|0\rangle,$$

$$v_s'\equiv\Bigl(\frac92W_{-6}+9L_{-3}W_{-3}-6L_{-2}W_{-4}\Bigr)|0\rangle.$$

0

3) vs = 27 W (v ), v = 1 W (v ). Equivalently we have 98 0 s 0 s 36 0 s 98 0 4) W0 6vs ± 98 27 vs = ±6 6vs ± 27 vs . 0

Remark 2.1. The vectors $v_s$, $v_s'$ are not singular vectors in the Verma module $M(0,0)$.

3. Irreducible Modules $V_\alpha$ ($\alpha\in\mathbb{C}$) of $W_{3,-2}$

We first recall how we realize the $W_{3,-2}$ algebra in terms of a pair of fermionic bc fields [W2]. Take a pair of bc fields

$$b(z)=\sum_{n\in\mathbb{Z}}b(n)z^{-n},\qquad c(z)=\sum_{n\in\mathbb{Z}}c(n)z^{-n-1}$$

with OPEs

$$b(z)c(w)\sim\frac{1}{z-w},\qquad b(z)b(w)\sim0,\qquad c(z)c(w)\sim0.$$

Equivalently, we have the following commutation relations:

$$[b(m),c(n)]_+=\delta_{m,-n},\qquad [b(m),b(n)]_+=0,\qquad [c(m),c(n)]_+=0.$$

We denote by $F$ the Fock space of the bc fields, generated by $|bc\rangle$, satisfying

$$b(n+1)|bc\rangle=0,\qquad c(n)|bc\rangle=0,\qquad n\ge0.$$

Then

$$j(z)=:b(z)c(z):=\sum_{n\in\mathbb{Z}}j_nz^{-n-1}$$

is a free boson of conformal weight 1 with commutation relations

$$[j_m,j_n]=m\,\delta_{m,-n},\qquad m,n\in\mathbb{Z}.$$

We further have the following commutation relations:

$$[j_m,b(n)]=b(m+n),\qquad [j_m,c(n)]=-c(m+n),\qquad m,n\in\mathbb{Z}. \tag{3.8}$$


Then we have the bc-charge decomposition of $F$ according to the eigenvalues of $j_0$:

$$F=\bigoplus_{l\in\mathbb{Z}}F^{l}.$$

We denote by $H_\alpha$ ($\alpha\in\mathbb{C}$) the Fock space of the Heisenberg algebra generated by $j_n$, $n\in\mathbb{Z}$, with vacuum vector $|\alpha\rangle$ satisfying

$$j_n|\alpha\rangle=\alpha\,\delta_{n,0}|\alpha\rangle,\qquad n\ge0.$$

Denote by $\psi(z)=q+j_0\ln z-\sum_{n\ne0}\frac{j_n}{n}z^{-n}$, where the operator $q$ satisfies $[q,j_n]=\delta_{n,0}$. Clearly $j(z)=\partial\psi(z)$. (Note that our $j(z), j_n,\dots$ are denoted in [W2] by $-j^{bc}(z), -j^{bc}_n,\dots$.) By the well-known boson-fermion correspondence, we have an isomorphism between $F^{l}$ and $H_l$ as representations over the Heisenberg algebra generated by $j_n$, $n\in\mathbb{Z}$. On the other hand, we may regard $b(z)$ and $c(z)$ as

$$b(z)=:e^{\psi(z)}:,\qquad c(z)=:e^{-\psi(z)}:. \tag{3.9}$$

Furthermore we have the following OPEs:

$$b(z)c(w)=\frac{1}{z-w}:b(z)c(w):,\qquad c(z)b(w)=\frac{1}{z-w}:c(z)b(w):. \tag{3.10}$$

In particular it is well known that $F^{0}$ (and so $H_0$) is a vertex algebra. Denote by $\overline{F}^{0}$ the kernel of the screening operator $c(0)$ from $F^{0}$ to $F^{-1}$. It has the structure of a vertex subalgebra of $F^{0}$. Let

$$T(z)\equiv\sum_{n\in\mathbb{Z}}L_nz^{-n-2}=:\partial b(z)c(z):. \tag{3.11}$$

It is easy to check that $T(z)$ is a Virasoro field with central charge $-2$. We also define another field of conformal weight 3:

$$W(z)\equiv\sum_{n\in\mathbb{Z}}W_nz^{-n-3}=\frac{1}{\sqrt6}\bigl(:\partial^{2}b(z)c(z):-:\partial b(z)\partial c(z):\bigr). \tag{3.12}$$

We introduce a rescaled field $\widetilde W(z)=\tfrac12\sqrt6\,W(z)$ for convenience later on, namely

$$\widetilde W(z)\equiv\sum_{n\in\mathbb{Z}}\widetilde W_nz^{-n-3}=\frac12\bigl(:\partial^{2}b(z)c(z):-:\partial b(z)\partial c(z):\bigr). \tag{3.13}$$

The following theorem is proved in [W2].

Theorem 3.1. The vertex algebra $\overline{F}^{0}$ is isomorphic to the simple vertex algebra $W_{3,-2}$ with generating fields $T(z)$ and $W(z)$ defined as in (3.11) and (3.12).

By the boson-fermion correspondence $F^{0}$ and $H_0$ are isomorphic as vertex algebras, so we may view $W_{3,-2}$ as a vertex subalgebra of $H_0$ as well. $H_\alpha$ is a module over the vertex algebra $H_0$ and so can be regarded as a module over $W_{3,-2}$. Denote by $V_\alpha$ the irreducible subquotient of the $W_{3,-2}$-submodule of $H_\alpha$ generated by the highest weight vector $|\alpha\rangle$. We first rewrite the fields $T(z)$ and $W(z)$ defined as in (3.11) and (3.12) in terms of the field $j(z)$ and its derivative fields.


Proposition 3.1. Under the boson-fermion correspondence, the fields $T(z)$ and $W(z)$ in (3.11) and (3.12) can be expressed in terms of $j(z)$ as

$$T(z)=\frac12\bigl(:j(z)^{2}:+\partial j(z)\bigr), \tag{3.14}$$

$$\widetilde W(z)=\frac{1}{12}\bigl(4:j(z)^{3}:+6:j(z)\partial j(z):+\partial^{2}j(z)\bigr). \tag{3.15}$$

Proof. By (3.10), we have

$$b(z)c(w)\sim\frac{1}{z-w}\Bigl(1+(z-w)j(w)+\frac{(z-w)^{2}}{2}\bigl(:j(w)^{2}:+\partial j(w)\bigr)+\frac{(z-w)^{3}}{6}\bigl(:j(w)^{3}:+3:j(w)\partial j(w):+\partial^{2}j(w)\bigr)\Bigr)$$

$$\sim\frac{1}{z-w}+j(w)+\frac12(z-w)\bigl(:j(w)^{2}:+\partial j(w)\bigr)+\frac16(z-w)^{2}\bigl(:j(w)^{3}:+3:j(w)\partial j(w):+\partial^{2}j(w)\bigr).$$

From this we see that

$$:\partial^{2}b(w)c(w):=\frac13\bigl(:j(w)^{3}:+3:j(w)\partial j(w):+\partial^{2}j(w)\bigr). \tag{3.16}$$

On the other hand, by (3.10) we have

$$c(z)b(w)=\frac{1}{z-w}\Bigl(1-(z-w)j(w)+\frac{(z-w)^{2}}{2}\bigl(:j(w)^{2}:-\partial j(w)\bigr)+\frac{(z-w)^{3}}{6}\bigl(-:j(w)^{3}:+3:j(w)\partial j(w):-\partial^{2}j(w)\bigr)+\text{higher terms}\Bigr).$$

This implies that

$$:\partial c(z)b(w):=-\frac{1}{(z-w)^{2}}+\frac12\bigl(:j(w)^{2}:-\partial j(w)\bigr)+\frac13(z-w)\bigl(-:j(w)^{3}:+3:j(w)\partial j(w):-\partial^{2}j(w)\bigr)+\text{higher terms}. \tag{3.17}$$

Equivalently, by switching $z$ and $w$ in (3.17) and reversing the order between $\partial c$ and $b$ (we get a minus sign since the bc fields are fermionic), and then expanding $j(z)$ around $w$, we have

$$:b(z)\partial c(w):\ \sim\ \frac{1}{(z-w)^{2}}-\frac12\bigl(:j(z)^{2}:-\partial j(z)\bigr)+\frac13(z-w)\bigl(-:j(z)^{3}:+3:j(z)\partial j(z):-\partial^{2}j(z)\bigr)$$

$$\sim\ \frac{1}{(z-w)^{2}}-\frac12\bigl(:j(w)^{2}:-\partial j(w)\bigr)-\frac12(z-w)\bigl(2:j(w)\partial j(w):-\partial^{2}j(w)\bigr)+\frac13(z-w)\bigl(-:j(w)^{3}:+3:j(w)\partial j(w):-\partial^{2}j(w)\bigr)$$

$$\sim\ \frac{1}{(z-w)^{2}}-\frac12\bigl(:j(w)^{2}:-\partial j(w)\bigr)+(z-w)\Bigl(-\frac13:j(w)^{3}:+\frac16\partial^{2}j(w)\Bigr). \tag{3.18}$$

It follows from (3.18) that

$$:\partial b(w)\partial c(w):=-\frac13:j(w)^{3}:+\frac16\partial^{2}j(w). \tag{3.19}$$

So by (3.13), (3.16) and (3.19) we have

$$\widetilde W(w)=\frac{1}{12}\bigl(4:j(w)^{3}:+6:j(w)\partial j(w):+\partial^{2}j(w)\bigr).$$

The proof of the identity (3.14) is similar.
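The last step of the proof is a small piece of rational arithmetic, which can be verified as follows. This is only a bookkeeping sketch: each field is represented by its coefficient vector in the (assumed linearly independent) fields $:j^{3}:$, $:j\partial j:$, $\partial^{2}j$, and the variable names are illustrative.

```python
from fractions import Fraction

# Coefficient vectors in the basis (:j^3:, :j dj:, d^2 j).
d2b_c = [Fraction(1, 3) * x for x in (1, 3, 1)]        # right-hand side of (3.16)
db_dc = [Fraction(-1, 3), Fraction(0), Fraction(1, 6)]  # right-hand side of (3.19)

# (3.13): W~ = (1/2) * ( :d^2 b c: - :db dc: )
w_tilde = [Fraction(1, 2) * (x - y) for x, y in zip(d2b_c, db_dc)]

# (3.15): W~ = (1/12) * ( 4 :j^3: + 6 :j dj: + d^2 j )
expected = [Fraction(k, 12) for k in (4, 6, 1)]

print(w_tilde == expected)
```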

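Proposition 3.2. The highest weight of the $W_{3,-2}$-module $V_\alpha$ ($\alpha\in\mathbb{C}$) with respect to $(L_0,\widetilde W_0)$ is $\bigl(\tfrac12\alpha(\alpha-1),\ \tfrac16\alpha(\alpha-1)(2\alpha-1)\bigr)$.

Proof. $L_0$ and $\widetilde W_0$ can be written as infinite sums of monomials in the $j_n$ by Proposition 3.1. Indeed we have

$$L_0=\frac12\Bigl(\sum_{n<0}j_nj_{-n}+\sum_{n\ge0}j_{-n}j_n\Bigr)-\frac12j_0=\frac12(j_0^{2}-j_0)+\sum_{n>0}j_{-n}j_n.$$

Since $j_n|\alpha\rangle=0$ for $n>0$ and $j_0|\alpha\rangle=\alpha|\alpha\rangle$, we have

$$L_0|\alpha\rangle=\frac12(j_0^{2}-j_0)|\alpha\rangle=\frac12\alpha(\alpha-1)|\alpha\rangle.$$

Similarly, a little calculation shows that the only terms in $\widetilde W_0$ which do not annihilate the vacuum vector $|\alpha\rangle$ are $\frac{1}{12}\bigl(4j_0^{3}+6j_0(-j_0)+2j_0\bigr)$. So we have

$$\widetilde W_0|\alpha\rangle=\frac{1}{12}\bigl(4\alpha^{3}+6\alpha(-\alpha)+2\alpha\bigr)|\alpha\rangle=\frac16\alpha(\alpha-1)(2\alpha-1)|\alpha\rangle.$$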
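The two eigenvalues can be spot-checked with exact rational arithmetic. The sketch below simply evaluates the $j_0$-polynomials appearing in the proof at $j_0=\alpha$ and compares them with the factored forms of the proposition. The function name `highest_weight` is illustrative.

```python
from fractions import Fraction


def highest_weight(alpha):
    """Evaluate the j_0-polynomials of L_0 and W~_0 on |alpha>."""
    alpha = Fraction(alpha)
    t = (alpha**2 - alpha) / 2
    w = (4 * alpha**3 - 6 * alpha**2 + 2 * alpha) / 12
    return t, w


for a in (Fraction(0), Fraction(1), Fraction(2), Fraction(1, 2), Fraction(-3, 7)):
    t, w = highest_weight(a)
    factored = (a * (a - 1) / 2, a * (a - 1) * (2 * a - 1) / 6)
    print(a, (t, w) == factored)
```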

Remark 3.1. The irreducible module $V_\alpha$ is isomorphic to the module $\overline{F}^{-\alpha}$ constructed in [W2], by comparing their highest weights, for $\alpha\in\mathbb{Z}$. $V_\alpha$ is a proper subspace of $H_\alpha$ in this case and its full character formula is given in [W2]. For $\alpha\notin\tfrac12\mathbb{Z}$, we know that $H_\alpha$ is irreducible as a module over the Virasoro algebra given by the field $T(z)$ with central charge $-2$ [FeF, KR], and so is irreducible as a module over $W_{3,-2}$. Full character formulas of these $V_\alpha$ with respect to $\{L_0,\widetilde W_0\}$ can also be calculated.


4. Classification of Irreducible Representations of $W_{3,-2}$ Algebra

We will show that the irreducible modules $V_\alpha$, $\alpha\in\mathbb{C}$, exhaust all the irreducible modules of the vertex algebra $W_{3,-2}$ by calculating the Zhu algebra in this case. We break the proof into a sequence of simple lemmas.

Lemma 4.1. The Zhu algebra $A(VW_{3,c})$ is isomorphic to a polynomial algebra $\mathbb{C}[t,w]$, where $t, w$ correspond to $[L_{-2}|0\rangle]$ and $[\widetilde W_{-3}|0\rangle]$ in $A(VW_{3,c})$.

Note that $L_{-2}|0\rangle$ is the Virasoro element in $VW_{3,c}$, so the element $[L_{-2}|0\rangle]$ lies in the center of $A(VW_{3,c})$ by Lemma 1.2. The proof of the above lemma is quite standard. See Lemma 4.1 in [W1] for a proof of a similar result. One can easily modify that proof to give a proof of our present lemma. We will not write it down here since it is not very illuminating.

Now specify $c=-2$. Let us denote by $\sigma$ the isomorphism from $A(VW_{3,-2})$ to $\mathbb{C}[t,w]$.

Lemma 4.2. Keeping the conventions in Lemma 4.1, under the isomorphism $\sigma$ we have

$$\sigma([v_s])=w^{2}-\frac19t^{2}(8t+1),\qquad \sigma([v_s'])=0.$$

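Proof. We will continue using the equivalence convention denoted by $\sim$ in the sense of Sect. 1. It follows from Lemma 1.1 that for any $a \in \mathcal{VW}_{3,-2}$,
$$a * \widetilde{W}_{-3}|0\rangle \sim \big(\widetilde{W}_{-3} + 2\widetilde{W}_{-2} + \widetilde{W}_{-1}\big)\,a, \qquad a * L_{-2}|0\rangle \sim \big(L_{-2} + L_{-1}\big)\,a. \tag{4.20}$$
Recall that the isomorphism $\sigma$ from $A(\mathcal{VW}_{3,-2})$ to $\mathbb{C}[t, w]$ sends elements $[L_{-2}|0\rangle]$ and $[\widetilde{W}_{-3}|0\rangle]$ in $A(\mathcal{VW}_{3,c})$ to $t, w$ respectively. By applying (4.20) to the first two terms of the singular vector $v_s$ given in Lemma 2.1 and then rewriting it in terms of the PBW basis of the form (2.5) by using the commutation relations (2.3), we get
$$\begin{aligned}
v_s &\sim w^2 + \Big\{ -\big(2\widetilde{W}_{-2} + \widetilde{W}_{-1}\big)\widetilde{W}_{-3} - \tfrac{19}{36} L_{-3}^2 - \tfrac89 L_{-2}^3 - \tfrac{14}{9} L_{-2}L_{-4} + \tfrac{44}{9} L_{-6} \Big\}|0\rangle \\
&= w^2 + \Big\{ -3\big(\tfrac83 L_{-2}L_{-3} - \tfrac{10}{3} L_{-5}\big) - \tfrac32\big(\tfrac83 L_{-2}^2 - L_{-4}\big) - \tfrac{19}{36} L_{-3}^2 - \tfrac89 L_{-2}^3 - \tfrac{14}{9} L_{-2}L_{-4} + \tfrac{44}{9} L_{-6} \Big\}|0\rangle \\
&= w^2 + \Big\{ -8L_{-2}L_{-3} + 10L_{-5} - 4L_{-2}^2 + \tfrac32 L_{-4} - \tfrac{19}{36} L_{-3}^2 - \tfrac89 L_{-2}^3 - \tfrac{14}{9} L_{-2}L_{-4} + \tfrac{44}{9} L_{-6} \Big\}|0\rangle .
\end{aligned} \tag{4.21}$$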


It is easy to show by induction and applying Lemma 1.1 that
$$L_{-n} \sim (-1)^n\big((n-1)(L_{-2} + L_{-1}) + L_0\big), \qquad n \ge 1. \tag{4.22}$$
By Eq. (4.20) and repeated uses of (4.22) on the right hand side of (4.21), we get
$$\begin{aligned}
v_s &\sim w^2 + 16t(t+3) - 40t - 4t(t+2) + \tfrac92\, t - \tfrac{19}{18}\, t(2t+3) - \tfrac89\, t(t+2)(t+4) - \tfrac{14}{3}\, t(t+4) + \tfrac{220}{9}\, t \\
&= w^2 - \tfrac19\, t^2(8t+1).
\end{aligned} \tag{4.23}$$
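The arithmetic leading from (4.21) to (4.23) can be spot-checked mechanically. The sketch below is not part of the original argument: it assumes the reading of (4.20) and (4.22) in which $L_{-n}$ acting on a class of weight $\mathrm{wt}$ multiplies its image in $\mathbb{C}[t]$ by $(-1)^n\big((n-1)t + \mathrm{wt}\big)$, and the helper names (`zhu_image`, `virasoro_part`) are hypothetical. Polynomials in $t$ are stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction as F

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Add two polynomials, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [F(0)] * (n - len(p))
    q = q + [F(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def zhu_image(modes):
    """Image in C[t] of L_{-n_1} ... L_{-n_k}|0>, with modes = [n_1, ..., n_k].

    Assumed rule (from (4.20) and (4.22)): L_{-n} on a class of weight wt
    acts as multiplication by (-1)^n ((n - 1) t + wt).
    """
    poly, weight = [F(1)], 0
    for n in reversed(modes):            # the rightmost operator acts first
        sign = F((-1) ** n)
        factor = [sign * weight, sign * (n - 1)]
        poly = poly_mul(poly, factor)
        weight += n
    return poly

# Virasoro monomials in the last line of (4.21): (coefficient, modes).
TERMS = [
    (F(-8), [2, 3]),
    (F(10), [5]),
    (F(-4), [2, 2]),
    (F(3, 2), [4]),
    (F(-19, 36), [3, 3]),
    (F(-8, 9), [2, 2, 2]),
    (F(-14, 9), [2, 4]),
    (F(44, 9), [6]),
]

def virasoro_part():
    """Sum the images of all Virasoro monomials; w^2 is left out."""
    total = [F(0)]
    for coeff, modes in TERMS:
        total = poly_add(total, [coeff * x for x in zhu_image(modes)])
    return total

print(virasoro_part())
```

Under these assumptions the constant and linear coefficients cancel and the remaining polynomial is $-\tfrac19 t^2 - \tfrac89 t^3 = -\tfrac19 t^2(8t+1)$, in agreement with (4.23).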

This completes the proof that $\sigma([v_s]) = w^2 - \tfrac19 t^2(8t + 1)$. Similarly we can prove that $\sigma([v_s']) = 0$.

Denote $f(t, w) = w^2 - \tfrac19 t^2(8t + 1)$. Now the following lemma follows from Proposition 1.1, Lemma 4.1 and Lemma 4.2 (see Corollary 4.1 for a more precise statement).

Lemma 4.3. The Zhu algebra $A(\mathcal{W}_{3,-2})$ is a certain quotient of the quotient algebra $\mathbb{C}[t, w]/\langle f(t, w)\rangle$, where $\langle f(t, w)\rangle$ denotes the ideal of $\mathbb{C}[t, w]$ generated by $f(t, w)$.

We have the following observation.

Lemma 4.4. Solutions to Eq. (0.2) are parametrized as follows:
$$(t(\alpha), w(\alpha)) \equiv \Big(\tfrac12\,\alpha(\alpha-1),\ \tfrac16\,\alpha(\alpha-1)(2\alpha-1)\Big), \qquad \alpha \in \mathbb{C}. \tag{4.24}$$

Proof. First it is clear that $t(\alpha)$ can take any complex value when $\alpha$ ranges over $\mathbb{C}$. Then by substituting $t(\alpha)$ in the Eq. (0.2) we see that $w(\alpha)^2 = \big(\tfrac16\,\alpha(\alpha-1)(2\alpha-1)\big)^2$. We don't lose any generality by letting $w(\alpha) = \tfrac16\,\alpha(\alpha-1)(2\alpha-1)$. The reason is that $t(1-\alpha) = t(\alpha)$ while $w(1-\alpha) = -w(\alpha)$.

Remark 4.1. For different $\alpha, \alpha' \in \mathbb{C}$, $(t(\alpha), w(\alpha)) = (t(\alpha'), w(\alpha'))$ if and only if $\alpha = 0$ (resp. 1), $\alpha' = 1$ (resp. 0). Namely $V_0$ is isomorphic to $V_1$ and this is the only isomorphism among $V_\alpha$, $\alpha \in \mathbb{C}$.

Now we are ready to prove our classification theorem on irreducible modules over the $\mathcal{W}_{3,-2}$ algebra.

Theorem 4.1. $V_\alpha$, $\alpha \in \mathbb{C}$ are all the irreducible modules over the simple $\mathcal{W}_3$ algebra with central charge $-2$. Highest weights of these modules $V_\alpha$ are given by $\big(\tfrac12\,\alpha(\alpha-1),\ \tfrac16\,\alpha(\alpha-1)(2\alpha-1)\big)$, $\alpha \in \mathbb{C}$. They are parametrized by points $(t, w)$ on the rational curve defined by $w^2 = \tfrac19 t^2(8t + 1)$.

Proof. By Lemma 4.3, we see that any irreducible module of the associative algebra $A(\mathcal{W}_{3,-2})$ is one-dimensional since $A(\mathcal{W}_{3,-2})$ is commutative. Given $t, w \in \mathbb{C}$, let $\mathbb{C}_{t,w}$ be the one-dimensional module of $A(\mathcal{W}_{3,-2})$, with $[L_{-2}|0\rangle]$ acting as the scalar $t$ and $[\widetilde{W}_{-3}|0\rangle]$ as the scalar $w$. Then $(t, w)$ has to satisfy $w^2 = \tfrac19 t^2(8t + 1)$. Note that $o(L_{-2}|0\rangle) = L_0$ and $o(\widetilde{W}_{-3}|0\rangle) = \widetilde{W}_0$ by the definition of $o(\cdot)$ in Sect. 1. So by


Theorem 1.2, the highest weight $(t, w)$ of any irreducible module of the vertex algebra $\mathcal{W}_{3,-2}$ with respect to $(L_0, \widetilde{W}_0)$ has to satisfy the equation $w^2 = \tfrac19 t^2(8t + 1)$. By Lemma 4.4, we see all solutions to the above equation can be written in the form $\big(\tfrac12\,\alpha(\alpha-1),\ \tfrac16\,\alpha(\alpha-1)(2\alpha-1)\big)$, $\alpha \in \mathbb{C}$. On the other hand, we have already constructed a family of irreducible modules $V_\alpha$ ($\alpha \in \mathbb{C}$) with highest weight exactly equal to $\big(\tfrac12\,\alpha(\alpha-1),\ \tfrac16\,\alpha(\alpha-1)(2\alpha-1)\big)$. This completes the proof of the theorem.

We think it is remarkable that the set of all irreducible modules of $\mathcal{W}_{3,-2}$ has such a simple and elegant description in terms of a rational curve. It indicates that the nonrational vertex algebras may have very rich representation theory. We have an immediate corollary of Theorem 4.1 which strengthens Lemma 4.3.

Corollary 4.1. The Zhu algebra $A(\mathcal{W}_{3,-2})$ is isomorphic to the commutative associative algebra $\mathbb{C}[t, w]/\langle f(t, w)\rangle$.

Remark 4.2. 1) Based on the results of Theorem 4.1 and Corollary 4.1 it is natural to conjecture that the singular vectors $v_s$ and $v_s'$ generate the maximal proper submodule of the vacuum module $\mathcal{VW}_{3,-2}$. 2) A Virasoro vertex algebra with a certain central charge is rational if and only if the corresponding vacuum module is reducible [W1]. As our results show, the $\mathcal{W}_3$ algebra provides a new possibility, namely the simple vertex algebra $\mathcal{W}_{3,-2}$ is not rational but the corresponding vacuum module $\mathcal{VW}_{3,-2}$ is reducible.

We further comment on why the central charge $c = -2$ is particularly interesting from a different point of view. There is the so-called quantized Drinfeld–Sokolov reduction (cf. e.g. [BH, FKW]) which allows one to establish connections between the $\mathcal{W}_n$ algebra with central charge $c_n^{(k)}$ and the affine Kac-Moody Lie algebra $\widehat{sl}_n$ with central charge $k$. Here
$$c_n^{(k)} = 2n^3 - n - 1 - n(n^2 - 1)\Big(\frac{1}{k+n} + k + n\Big).$$
In particular, for $k = -n + p/q$ one can rewrite $c_n^{(k)}$ as follows:
$$c_n^{(k)} = (n-1)\Big(1 - \frac{n(n+1)(p-q)^2}{pq}\Big).$$
The so-called minimal series central charges of the $\mathcal{W}_n$ algebra are those $c_n^{(k)}$ for $k = -n + p/q$, where $p, q$ are coprime integers satisfying $p, q \ge n$. They correspond to the admissible central charges $k = -n + p/q$ for the affine algebra $\widehat{sl}_n$, with the same conditions imposed on $p, q$ as above. The admissible representations with admissible central charges were first studied by Kac-Wakimoto [KWa].

Thus by means of Drinfeld–Sokolov reduction the central charge $-2$ for the $\mathcal{W}_3$ algebra corresponds to the central charge $k = -\tfrac32$ or $-\tfrac73$ of $\widehat{sl}_3$. Observe that $k = -\tfrac32 = -3 + \tfrac32$ or $-\tfrac73 = -3 + \tfrac23$ corresponds to the “boundary” of the admissible central charges of $\widehat{sl}_3$.

However more than this is true. Consider the “boundary” of the admissible central charges of $\widehat{sl}_n$, i.e. $k = -n + \tfrac{n}{n-1}$ or $-n + \tfrac{n-1}{n}$. The corresponding central charge of the $\mathcal{W}_n$ algebra is $c_n^{(k)} = -2$, which is independent of $n$. In this sense $-2$ is a universal central charge for any $\mathcal{W}_n$ algebra. We expect that the representations of the affine algebra $\widehat{sl}_n$ with central charge equal to the “boundary” of the admissible central charges are of independent interest.
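The claim that the boundary levels always give $c_n^{(k)} = -2$ is easy to confirm with exact rational arithmetic. The following is only an illustrative sketch of the two formulas quoted above; the function names are hypothetical and nothing here goes beyond evaluating those formulas.

```python
from fractions import Fraction

def central_charge(n, k):
    """c_n^{(k)} = 2n^3 - n - 1 - n(n^2 - 1) (1/(k + n) + k + n)."""
    x = Fraction(k) + n
    return 2 * n**3 - n - 1 - n * (n * n - 1) * (1 / x + x)

def central_charge_pq(n, p, q):
    """Same central charge written for k = -n + p/q."""
    ratio = Fraction((p - q) ** 2, p * q)
    return (n - 1) * (1 - n * (n + 1) * ratio)

def boundary_levels(n):
    """The two boundary levels k = -n + n/(n-1) and k = -n + (n-1)/n."""
    return (-n + Fraction(n, n - 1), -n + Fraction(n - 1, n))

for n in range(2, 8):
    print(n, [central_charge(n, k) for k in boundary_levels(n)])
```

For $n = 3$ the two levels returned are $-\tfrac32$ and $-\tfrac73$, matching the values quoted in the text, and the two expressions for $c_n^{(k)}$ agree on every pair $(p, q)$ tried.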


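5. Classification of Irreducible Modules of the Vertex Algebra $\mathcal{W}_{1+\infty,-1}$

Let $\mathcal{D}$ be the Lie algebra of regular differential operators on the circle. The elements
$$J^l_k = -t^{l+k}(\partial_t)^l, \qquad l \in \mathbb{Z}_+,\ k \in \mathbb{Z},$$
form a basis of $\mathcal{D}$. $\mathcal{D}$ has also another basis
$$L^l_k = -t^k D^l, \qquad l \in \mathbb{Z}_+,\ k \in \mathbb{Z},$$
where $D = t\partial_t$. Denote by $\widehat{\mathcal{D}}$ the central extension of $\mathcal{D}$ by a one-dimensional center with a generator $C$, with commutation relation (cf. [KR1])
$$\big[t^r f(D),\, t^s g(D)\big] = t^{r+s}\big(f(D+s)g(D) - f(D)g(D+r)\big) + \Psi\big(t^r f(D),\, t^s g(D)\big)\, C, \tag{5.25}$$
where
$$\Psi\big(t^r f(D),\, t^s g(D)\big) = \begin{cases} \displaystyle\sum_{-r \le j \le -1} f(j)\,g(j+r), & r = -s \ge 0, \\[1ex] 0, & r + s \ne 0. \end{cases} \tag{5.26}$$

Letting weight $J^l_k = k$ and weight $C = 0$ defines a principal gradation
$$\mathcal{D} = \bigoplus_{j \in \mathbb{Z}} \mathcal{D}_j, \qquad \widehat{\mathcal{D}} = \bigoplus_{j \in \mathbb{Z}} \widehat{\mathcal{D}}_j. \tag{5.27}$$
Then we have the triangular decomposition
$$\widehat{\mathcal{D}} = \widehat{\mathcal{D}}_+ \oplus \widehat{\mathcal{D}}_0 \oplus \widehat{\mathcal{D}}_-, \tag{5.28}$$
where
$$\widehat{\mathcal{D}}_\pm = \bigoplus_{j \in \pm\mathbb{N}} \widehat{\mathcal{D}}_j, \qquad \widehat{\mathcal{D}}_0 = \mathcal{D}_0 \oplus \mathbb{C}C.$$

Let $\mathcal{P}$ be the distinguished parabolic subalgebra of $\mathcal{D}$, consisting of the differential operators that extend into the whole interior of the circle. $\mathcal{P}$ has a basis $\{J^l_k,\ l \ge 0,\ l + k \ge 0\}$. It is easy to check that the 2-cocycle $\Psi$ defining the central extension of $\widehat{\mathcal{D}}$ vanishes when restricted to the parabolic subalgebra $\mathcal{P}$. So $\mathcal{P}$ is also a subalgebra of $\widehat{\mathcal{D}}$. Denote $\widehat{\mathcal{P}} = \mathcal{P} \oplus \mathbb{C}C$.

Fix $c \in \mathbb{C}$. Denote by $\mathbb{C}_c$ the 1-dimensional $\widehat{\mathcal{P}}$-module by letting $C$ act as scalar $c$ and $\mathcal{P}$ act trivially. Fix a non-zero vector $v_0$ in $\mathbb{C}_c$. The induced $\widehat{\mathcal{D}}$-module
$$M_c(\widehat{\mathcal{D}}) = U(\widehat{\mathcal{D}}) \otimes_{U(\widehat{\mathcal{P}})} \mathbb{C}_c$$
is called the vacuum $\widehat{\mathcal{D}}$-module with central charge $c$. Here we denote by $U(\mathfrak{g})$ the universal enveloping algebra of a Lie algebra $\mathfrak{g}$. $M_c(\widehat{\mathcal{D}})$ admits a unique irreducible quotient, denoted by $\mathcal{W}_{1+\infty,c}$. Denote the highest weight vector $1 \otimes v_0$ in $M_c(\widehat{\mathcal{D}})$ by $|0\rangle$. It is shown in [FKRW] that $\mathcal{W}_{1+\infty,c}$ carries a canonical vertex algebra structure, with vacuum vector $|0\rangle$ and generating fields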


$$J^l(z) = \sum_{k \in \mathbb{Z}} J^l_k\, z^{-k-l-1},$$
of conformal weight $l + 1$, $l = 0, 1, \cdots$. The field $J^l(z)$ corresponds to the vector $J^l_{-l-1}|0\rangle$ in $\mathcal{W}_{1+\infty,c}$. Below we will concentrate on the particular case $\mathcal{W}_{1+\infty,-1}$.

The relation between the vertex algebras $\mathcal{W}_{1+\infty,-1}$ and $\mathcal{W}_{3,-2}$ is made clear by the following theorem [W2].

Theorem 5.1. The vertex algebra $\mathcal{W}_{1+\infty,-1}$ is isomorphic to a tensor product of the $\mathcal{W}_{3,-2}$ algebra, and the Heisenberg vertex algebra $H_0$ with $J^0(z)$ as a generating field.

Then the classification of irreducible modules over $\mathcal{W}_{1+\infty,-1}$ follows from classification of those over $\mathcal{W}_{3,-2}$ since the classification of irreducible modules over a Heisenberg vertex algebra is well known. Also see Remark 4.1.

Theorem 5.2. There exists a two-parameter family of irreducible modules over $\mathcal{W}_{1+\infty,-1}$. Any irreducible $\mathcal{W}_{1+\infty,-1}$-module can be written uniquely as a tensor product of a module $L(t(\alpha), w(\alpha))$ over $\mathcal{W}_{3,-2}$ with a module $H_s$ over $H_0$ ($\alpha \in \mathbb{C} - \{1\}$, $s \in \mathbb{C}$), with $(t(\alpha), w(\alpha))$ as defined in (4.24).

Remark 5.1. Theorem 5.2 disproves a conjecture of Kac and Radul [KR2]. The list of irreducible modules of $\mathcal{W}_{1+\infty,-1}$ which were conjectured to be complete in [KR2] consists of those with $\alpha = 0$ in Theorem 5.2 (i.e. modules $M_0^s$ in [W2]).

There are several questions to which the author does not know the answers at present but hopes to have a better understanding in the future.

1) What are the fusion rules of $\mathcal{W}_{3,-2}$ (and thus of $\mathcal{W}_{1+\infty,-1}$)? The existence of reducible but indecomposable modules of $\mathcal{W}_{3,-2}$ [W2] seems to be related to the fact that there is a node at $(0, 0)$ on the rational curve $w^2 - \tfrac19 t^2(8t + 1) = 0$. It is likely that we may need to regard some reducible but indecomposable modules as basic objects when studying the fusion rules.

2) Recall the Cartan subalgebra of $\mathcal{W}_{1+\infty}$ is infinite dimensional. In [KR1] the highest weight of an irreducible quasifinite module over $\mathcal{W}_{1+\infty}$ is characterized in terms of a certain generating function $\Delta(x)$. The question is how to identify highest weights of the two-parameter family of irreducible modules of $\mathcal{W}_{1+\infty}$ with central charge $-1$ in Theorem 5.2 in terms of $\Delta(x)$. It would be very interesting to see if these irreducible modules of $\mathcal{W}_{1+\infty,-1}$ we have constructed are the first realizations of $\mathcal{W}_{1+\infty}$-modules with $\Delta(x) = \dfrac{p(x)\,e^{sx}}{e^x - 1} + \cdots$ with some non-constant polynomial $p(x)$ (cf. [KR1] for notations).

Acknowledgement. Some results of this paper were presented in the Seminar of Geometry, Symmetry and Physics at Yale University and in the 1997 AMS Meeting at Detroit. I thank the organizers of the meeting, C. Dong and R. Griess, for the invitation. I also thank E. Frenkel, I. Frenkel, V. Kac and G. Zuckerman for their interest and comments.

References

[BH] Bershadsky, M. and Ooguri, H.: Hidden SL(n) symmetry in conformal field theories. Commun. Math. Phys. 126, 49–83 (1989)

[B1] Borcherds, R.: Vertex algebras, Kac-Moody algebras, and the Monster. Proc. Natl. Acad. Sci. USA 83, 3068–3071 (1986)

[B2] Borcherds, R.: Vertex algebras. Preprint, q-alg/9706008

[BCMN] Bouwknegt, P., Ceresole, A., van Nieuwenhuizen, P. and McCarthy, J.: Extended Sugawara construction for the superalgebras SU(M + 1 | N + 1). II. The third-order Casimir algebra. Phys. Rev. D 40, 415–421 (1989)

[BMP] Bouwknegt, P., McCarthy, J. and Pilch, K.: The W3 algebra: Modules, semi-infinite cohomology and BV-algebras. Lect. Notes in Phys., New Series m: Monographs 42, Berlin–Heidelberg–New York: Springer Verlag, 1996

[BS] Bouwknegt, P. and Schoutens, K.: W-symmetry in conformal field theory. Phys. Rep. 223, 183–276 (1993)

[dVvD] de Vos, K. and van Driel, P.: The Kazhdan–Lusztig conjecture for W algebras. J. Math. Phys. 37, 3587–3610 (1996)

[DL] Dong, C. and Lepowsky, J.: Generalized vertex algebras and relative vertex operators. Prog. Math. 112, Boston: Birkhäuser, 1993

[EFH²NV] Eholzer, W., Flohr, M., Honecker, A., Hübel, R., Nahm, W. and Varnhagen, W.: Representations of W-algebras with two generators and new rational models. Nucl. Phys. B 383, 249–288 (1992)

[FeF] Feigin, B. and Fuchs, D.: Representations of the Virasoro algebra. In: Representations of infinite-dimensional Lie algebras and Lie groups, London: Gordon and Breach, 1990

[FF] Feigin, B. and Frenkel, E.: Integrals of motion and quantum groups. Lect. Notes in Math. 1620, Berlin–Heidelberg–New York: Springer Verlag, 1996

[FKRW] Frenkel, E., Kac, V., Radul, A. and Wang, W.: W1+∞ and W(glN) with central charge N. Commun. Math. Phys. 170, 337–357 (1995)

[FKW] Frenkel, E., Kac, V. and Wakimoto, M.: Characters and fusion rules for W-algebras via quantized Drinfeld–Sokolov reduction. Commun. Math. Phys. 147, 295–328 (1992)

[FLM] Frenkel, I., Lepowsky, J. and Meurman, A.: Vertex operator algebras and the Monster. New York: Academic Press, 1988

[FMS] Friedan, D., Martinec, E. and Shenker, S.: Conformal invariance, supersymmetry and string theory. Nucl. Phys. B 271, 93–165 (1986)

[H] Honecker, A.: Automorphisms of W-algebras and extended rational conformal field theories. Nucl. Phys. B 400, 574–596 (1993)

[K] Kac, V.: Vertex algebras for beginners. Univ. Lect. Series 10, Providence, RI: AMS, 1996

[KR1] Kac, V. and Radul, A.: Quasi-finite highest weight modules over the Lie algebra of differential operators on the circle. Commun. Math. Phys. 157, 429–457 (1993)

[KR2] Kac, V. and Radul, A.: Representation theory of the vertex algebra W1+∞. Transf. Groups 1, 41–70 (1996)

[KR] Kac, V. and Raina, A.: Bombay lectures on highest weight representations of infinite dimensional Lie algebras. Adv. Series Math. Phys. 2, Singapore: World Scientific, 1987

[KWa] Kac, V. and Wakimoto, M.: Modular invariant representations of infinite-dimensional Lie algebras and superalgebras. Proc. Natl. Acad. Sci. USA 85, 4956–4960

[KW] Kac, V. and Wang, W.: Vertex operator superalgebras and their representations. Contemp. Math. 175, 161–191 (1994); Mathematical aspects of conformal and topological field theories and quantum groups. P. Sally et al. (eds)

[LZ] Lian, B. and Zuckerman, G.: Commutative quantum operator algebras. J. Pure Appl. Alg. 100, 117–139 (1995)

[W1] Wang, W.: Rationality of Virasoro vertex operator algebras. Duke Math. J. 71, Inter. Math. Res. Notice, 197–211 (1993)

[W2] Wang, W.: W1+∞ Algebra, W3 Algebra, and Friedan–Martinec–Shenker bosonization. Preprint, q-alg/9708008, to appear in Commun. Math. Phys.

[Za] Zamolodchikov, A.B.: Infinite additional symmetries in two dimensional conformal quantum field theory. Theor. Math. Phys. 65, 1205–1213 (1985)

[Z] Zhu, Y.: Modular invariance of characters of vertex operator algebras. J. AMS 9, 237–302 (1996)

Communicated by G. Felder

Commun. Math. Phys. 195, 129 – 173 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

The Structure of Verma Modules over the N = 2 Superconformal Algebra

A. M. Semikhatov, I. Yu. Tipunin

Tamm Theory Division, Lebedev Physics Institute, Russian Academy of Sciences

Received: 26 April 1997 / Accepted: 12 November 1997

Abstract: We classify degeneration patterns of Verma modules over the N = 2 superconformal algebra in two dimensions. Explicit formulae are given for singular vectors that generate maximal submodules in each of the degenerate cases. The mappings between Verma modules defined by these singular vectors are embeddings; in particular, their compositions never vanish. As a by-product, we also obtain general formulae for N = 2 subsingular vectors.

Contents

1 Introduction
2 Preliminaries
2.1 The N = 2 algebra, spectral flow transform, and Verma modules
2.2 The algebra of continued operators
2.3 Singular vectors in codimension 1
3 Submodules and Singular Vectors in Codimension ≥ 2
3.1 Topological Verma modules
3.2 Massive singular vectors in codimension 2
3.3 Codimension-3 cases
4 Concluding Remarks
Appendix
References

1. Introduction In this paper, we describe the structure of submodules and singular vectors in Verma modules over the N = 2 superconformal algebra in two dimensions – the N = 2 supersymmetric extension of the Virasoro algebra [A]. This algebra underlies the construction of N = 2 strings [A, M, FT, OV] (with its possible role in the M-theory


proposed in [KM]), and, on the other hand, is realized on the world-sheet of any noncritical string theory [GRS, BLNW]. Other (non-exhaustive) references on the N = 2 superconformal algebra and N = 2 models in conformal field theory and string theory are [BFK, EHY, SS, DVPYZ, G, IK, KS, EG, Ga].

The N = 2 algebra, however, hasn’t been very privileged in several respects, first of all because it is not an affine Lie algebra. It does not admit a root system enjoying all the properties of root systems of affine Lie algebras, hence, in particular, there is no canonical triangular decomposition. As a result, there is no canonical way to impose “highest-weight”-type conditions on a vacuum vector (hence, on singular vectors) in representations of the algebra. Trying to follow the formal analogy with the case of affine Lie algebras (e.g., $\widehat{s\ell}(2)$) and imitating the highest-weight conditions imposed there leads to several complications, if not inconsistencies, with the definition and properties of N = 2 Verma modules. These complications are related to the fact that there exist two different types of Verma-like modules, and, while modules of one type can be submodules of the other, the converse is not true. The definition of singular vectors carried over from the affine Lie-algebra case does not distinguish between the two types of submodules.

Among other facts pertaining to the N = 2 algebra, let us note that the different sectors of the algebra (the Neveu–Schwarz and Ramond ones) are isomorphic, which is in contrast to the N = 1 case. This is due to the N = 2 spectral flow [SS]. Thus, there is “the only” N = 2 algebra¹ and its isomorphic images under the spectral flow. However, the basis in the algebra can be chosen in different ways, since the presence of the U(1) current allows one to change the energy-momentum tensor by the derivative of the current; the algebras that appear in different contexts are in fact isomorphic [EY, W] to one and the same N = 2 algebra.
Finally, even the terminology used in the N = 2 representation theory does not appear to be unified, which may again be related to the fact that the situation which is familiar from the affine Lie algebras does not literally carry over to N = 2.

The object of our study is possible degenerations (reducibility patterns) of N = 2 Verma modules, i.e., the structure of submodules in these modules. This is more involved than in more familiar cases of the Virasoro algebra and the standard Verma modules over the affine algebra $\widehat{s\ell}(2)$, due to two main reasons. First, the N = 2 algebra has rank 3, which gives its modules more possibilities to degenerate. Second, as we have already mentioned, there are two different types of N = 2 Verma-like modules that have to be distinguished clearly; we follow refs. [ST2, FST] in calling them the topological and massive Verma modules (in a different terminology, the first ones are chiral, while the second ones are tacitly understood to be “the” N = 2 Verma modules). The topological Verma modules appearing as submodules are twisted, i.e., transformed by the spectral flow.

Ignoring the existence of two types of modules and trying to describe degenerations of N = 2 Verma modules in terms of only massive Verma (sub)modules results in an incorrect picture, e.g. apparent relations in Verma modules would then seem to exist, in contradiction with the definition of Verma modules. In the embedding diagrams known in the literature, similarly, some singular vectors appear to vanish when constructed in a module built on another singular vector – in which case one can hardly talk about embedding diagrams. The actual situation is that topological Verma submodules may exist in massive Verma modules, with one extra annihilation condition being imposed on the highest-weight vector of such submodules. For example, submodules generated by the so-called charged singular vectors [BFK] are always (twisted) topological.
That this fundamental fact about the charged singular vectors has not been widely appreciated is because the nature of submodules is obscured when one employs singular vectors defined using a formal analogy with the case of affine Lie algebras. In fact, any mapping from a massive Verma module into a topological Verma module necessarily has a kernel that contains another topological Verma submodule, which makes a sequence of such mappings look more like a BGG-resolution rather than an embedding diagram.

Another reason why literally copying the definition of singular vectors from the affine Lie algebras complicates the analysis of N = 2 modules is that using such singular vectors entails subsingular vectors. Generally, when considering representations of algebras of rank ≥ 3, one should take care of whether a given singular vector or a set of singular vectors generate a maximal submodule. That a submodule U2 generated from all singular vectors in a Verma module U is not maximal means that there exists a proper submodule U1 ≠ U2 such that U2 ⊂ U1 ⊂ U. Then, the quotient module U/U2 contains a submodule, which in the simplest case would be generated from one or several singular vectors (otherwise, the story repeats). However, these vectors are not singular in U, i.e., before taking the quotient with respect to U2. They are commonly known as subsingular vectors. The question of whether or not singular vector(s) generate a maximal submodule is unambiguous for the affine Lie algebras, where the root system determines a fixed set of operators from the algebra that is required to annihilate a state in order that it be a singular vector. As we have already remarked, such annihilation conditions are not defined uniquely in the N = 2 case. Since the significance of singular vectors consists in providing a description of submodules, one should thus focus one’s attention on the structure of submodules in N = 2 Verma modules.

¹ With the exception of a somewhat exotic “twisted sector”, where one of the fermions has half-integer, and the other, integer, modes, which we do not touch upon in this paper.
As regards singular vectors, then, one has two options: either to fix the convention that singular vectors satisfy some annihilation conditions (for instance, those copied literally from the case of affine Lie algebras) and then to find a system of sub-, subsub-, . . . -singular vectors that “compensate” for the failure of the chosen singular vectors to generate maximal submodules; or to try to define singular vectors in such a way that they generate maximal submodules, in which case the structure of submodules would be described without introducing subsingular vectors.

In what follows, we present a regular way to single out and to explicitly construct those vectors that generate maximal submodules in N = 2 Verma modules². They turn out to satisfy “twisted” annihilation conditions, i.e. those given by a spectral flow transform [SS, LVW] of the annihilation conditions imposed on the highest-weight vector in the module. With this definition of singular vectors, subsingular vectors become redundant. However, in view of controversial statements that have been made regarding “subsingular vectors in N = 2 Verma modules” [D, GRR, EG], we will also show how our description can be adapted to provide a systematic way to construct the states in N = 2 Verma modules that are subsingular vectors once singular vectors are defined by the conventional, “untwisted”, annihilation conditions.

The general picture that emerges in this way is very simple and can be outlined as follows. Recall that, in modules over a Z×Z-graded algebra, any vector that satisfies “highest-weight” conditions is a member of the family of extremal states which make up an extremal diagram³. For the N = 2 algebra, a given singular vector that we consider satisfies twisted highest-weight conditions with a certain integral twist θ; this vector belongs to the extremal diagram that consists of states satisfying twisted highest-weight conditions with all integral twists. The extremal state that satisfies the conventional, “untwisted”, highest-weight conditions is the conventional singular vector. Now, it is the properties of the extremal diagram that are responsible for whether or not all of the extremal states generate the same submodule. Generically, it is irrelevant which of the representatives of the extremal diagram is singled out as “the” singular vector. In the degenerate cases, however, there do exist preferred representatives that generate the maximal possible submodule, while other extremal states generate a smaller submodule. Moreover, there exists a systematic way to divide N = 2 extremal diagrams into those vectors that do, and those that do not, generate maximal submodules.

Using singular vectors that generate maximal submodules, it is not too difficult to classify all possible degenerations of N = 2 Verma modules, since the structure of submodules is still relatively simple. However, it may become quite complicated to describe the same structure in terms of a restricted set of singular vectors that satisfy zero-twist annihilation conditions (and then, necessarily, in terms of subsingular vectors). The upshot is that, in the degenerate cases where several singular vectors exist in the module, their zero-twist representatives may lie in the section of the extremal diagram separated from the vectors generating the maximal submodule by a state that satisfies stronger highest-weight conditions and, thus, is a highest-weight vector of a twisted topological Verma submodule (diagrams (3.7), (3.21), and (3.43)).

² To avoid misunderstanding, let us point out explicitly that we do not claim, of course, that any maximal submodule in any N = 2 Verma module would be generated from one singular vector; this cannot be the case already for the sum of two submodules. What we are saying is that any maximal submodule is necessarily generated from an appropriate number of singular vectors that we work with in this paper. Note that this is not the case whenever subsingular vectors exist.
Depending on the relative positions of such topological singular vectors in the extremal diagram, therefore, a single picture in terms of extremal diagrams breaks into several cases in some of which the conventional singular vectors do, while in others do not, generate maximal submodules. A careful analysis of the type of submodules in N = 2 Verma modules is also crucial for correctly describing one particular degeneration of massive Verma modules where a massive Verma submodule is embedded into the direct sum of two twisted topological Verma submodules. In the conventional terms, the situation is described either as the existence of two linearly independent singular vectors with identical quantum numbers [D] or as the existence of a singular vector and a subsingular vector with identical quantum numbers (the latter case was missed in the conventional approach). In more invariant and in fact, much simpler terms, both these cases are described uniformly, as the existence of two singular vectors that satisfy twisted highest-weight conditions (diagram (3.42)). It follows that linearly independent singular vectors belong then to two twisted topological Verma submodules. To summarize the situation with N = 2 subsingular vectors, they are superfluous when it comes to classifying degenerations of N = 2 Verma modules. Instead, our strategy is as follows. Given a submodule in the N = 2 Verma module, we consider the entire extremal diagram that includes the vectors from which that submodule is generated. On the extremal diagram, then, we point out the states that generate the maximal submodule. These states, which turn out to satisfy twisted highest-weight conditions, are the singular vectors we work with in this paper. 
Restricting oneself to those extremal states (the conventional singular vectors) that fail to generate maximal submodules and classifying the “compensating” subsingular vectors is then an exercise in describing the same structure of submodules in much less convenient terms. However, in order to make contact with the problems raised in the literature, we indicate in each of the degenerate cases⁴ why and how the conventional singular vectors fail to generate maximal submodules; we then explicitly construct the corresponding subsingular vectors.

By choosing the singular vectors that satisfy twisted highest-weight conditions, we sacrifice the formal similarity with the case of Kac–Moody algebras, yet at the end of the day one observes [FST] that the structure of N = 2 Verma modules is equivalent to the structure of certain modules over the affine $\widehat{s\ell}(2)$ algebra: there is a functor from the category of “relaxed” $\widehat{s\ell}(2)$ Verma modules introduced in [FST] to N = 2 Verma modules. Restricted to the standard $\widehat{s\ell}(2)$ Verma modules, this functor gives the twisted topological Verma modules over the N = 2 algebra. The $\widehat{s\ell}(2)$ singular vectors (which do generate maximal submodules) correspond then precisely to the N = 2 singular vectors that we consider in this paper. This gives an intrinsic relation (actually, isomorphism [FST]) between affine $\widehat{s\ell}(2)$ singular vectors and the N = 2 singular vectors satisfying the twisted highest-weight conditions. Note also that the issue of subsingular vectors is normally not considered for the $\widehat{s\ell}(2)$ algebra; combined with the equivalence proved in [FST], this clearly signifies, once again, that N = 2 subsingular vectors are but an artifact of adopting the “zero-twist” definition for singular vectors (as we explain in some detail after diagram (3.7)).

In what follows, we thus define and systematically refer to singular vectors that satisfy twisted “highest-weight” conditions (see Definitions 2.3 and 2.9); the twist-zero singular vectors will be referred to as the conventional, “untwisted”, singular vectors (as we explain below, these can also be characterized as the top-level representatives of the extremal diagrams).

³ In a context similar to that of the present paper, (diagrams of) extremal vectors were introduced in [FS] (see Eqs. (2.6) and (2.9) for the topological and the massive Verma modules, respectively). Their usefulness in representation theory, which was pointed out in [FS], has been demonstrated in [ST, ST2, S2, FST].
As we have already mentioned, the two essentially different types of N = 2 Verma modules are called the massive and the topological ones. “Highest-weight” conditions will be used without quotation marks from now on. Whenever we talk about subsingular vectors, these will of course be understood in the setting where one restricts oneself to the conventional definition of singular vectors. “State” is synonymous to “vector”. In application to representations, the term “twisted” means “transformed by the spectral flow”.

Our main results are the classification of degenerations of N = 2 Verma modules and the general construction of N = 2 singular vectors. We develop a systematic description of all possible degenerations of N = 2 Verma modules using the extremal diagrams. This allows us to describe the structure of submodules without invoking subsingular vectors. The formalism that we develop for the N = 2 algebra (see also [ST2]) is, at the same time, a natural counterpart of the construction of singular vectors of affine Lie algebras (see [MFF, FST]). To make contact with the issues discussed in the literature, we also show how the properties of the extremal diagrams determine whether or not the conventional singular vectors generate maximal submodules; when they do not, we give the general construction of the corresponding subsingular vectors that arise in the conventional approach.

In Sect. 2, we fix our notation and review the properties of the N = 2 algebra and singular vectors in its Verma modules. In Sect. 3, we describe all the degenerate cases where more than one singular vector exists.

⁴ With one exception, where the classification of subsingular vectors would be too long in view of a large number of different cases of relative positions of the extremal diagrams describing the relevant submodules; classifying the subsingular vectors then remains a straightforward, although lengthy and unnecessary, exercise.


A. M. Semikhatov, I. Yu. Tipunin

2. Preliminaries

2.1. The N = 2 algebra, spectral flow transform, and Verma modules. The N = 2 superconformal algebra $\mathcal{A}$ is taken in this paper in the following basis (see [ST2] for a discussion of the choice of various bases (and moddings) in the algebra):
\[
\begin{aligned}
{}[L_m, L_n] &= (m-n)\,L_{m+n}, & [H_m, H_n] &= \tfrac{C}{3}\,m\,\delta_{m+n,0},\\
{}[L_m, G_n] &= (m-n)\,G_{m+n}, & [H_m, G_n] &= G_{m+n},\\
{}[L_m, Q_n] &= -n\,Q_{m+n}, & [H_m, Q_n] &= -Q_{m+n},\\
{}[L_m, H_n] &= -n\,H_{m+n} + \tfrac{C}{6}(m^2+m)\,\delta_{m+n,0},\\
\{G_m, Q_n\} &= 2L_{m+n} - 2n\,H_{m+n} + \tfrac{C}{3}(m^2+m)\,\delta_{m+n,0},
\end{aligned}
\qquad m, n \in \mathbb{Z}. \tag{2.1}
\]

The generators $L_m$, $Q_m$, $H_m$, and $G_m$ are the Virasoro generators, the BRST current, the U(1) current, and the spin-2 fermionic current respectively. H is not primary; instead, the commutation relations for the Virasoro generators are centreless. The element C is central. Since it is diagonalizable in any representation (at least in all those that we are going to consider), we do not distinguish between C and a number $c \in \mathbb{C}$, which it will be convenient to parametrize as
\[
c = 3\,\frac{t-2}{t} \tag{2.2}
\]
with $t \in \mathbb{C} \setminus \{0\}$. The spectral flow transform [SS, LVW] produces isomorphic images of the algebra $\mathcal{A}$. When applied to the generators involved in (2.1), it acts as
\[
U_\theta:\qquad
\begin{aligned}
L_n &\mapsto L_n + \theta H_n + \tfrac{c}{6}(\theta^2+\theta)\,\delta_{n,0}, & H_n &\mapsto H_n + \tfrac{c}{3}\,\theta\,\delta_{n,0},\\
Q_n &\mapsto Q_{n-\theta}, & G_n &\mapsto G_{n+\theta}.
\end{aligned}
\tag{2.3}
\]
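As a small numerical illustration of (2.2) and (2.3), the sketch below follows only the zero modes $L_0$ and $H_0$, treated as plain numbers, and checks that two successive spectral flows with parameters $\theta_1$ and $\theta_2$ shift them in the same way as a single flow with $\theta_1 + \theta_2$. The function names `central_charge` and `flow_cartan` are hypothetical helpers introduced for this example only, and the check is a sanity test on sample values, not a proof.

```python
from fractions import Fraction


def central_charge(t):
    """Parametrization (2.2): c = 3 (t - 2) / t."""
    return 3 * (t - 2) / t


def flow_cartan(L0, H0, theta, c):
    """Images of L_0 and H_0 under U_theta, read off from (2.3) at n = 0.

    L0 and H0 stand for numbers (for instance eigenvalues on a state);
    this is an illustrative simplification, not the full algebra map.
    """
    new_L0 = L0 + theta * H0 + c / 6 * (theta ** 2 + theta)
    new_H0 = H0 + c / 3 * theta
    return new_L0, new_H0


# Sample values, chosen arbitrarily for the illustration.
c = central_charge(Fraction(5, 2))
start = (Fraction(1, 3), Fraction(-2, 7))
theta1, theta2 = Fraction(3), Fraction(-5, 2)

step_by_step = flow_cartan(*flow_cartan(*start, theta1, c), theta2, c)
in_one_go = flow_cartan(*start, theta1 + theta2, c)
assert step_by_step == in_one_go
```

Exact rational arithmetic is used so that the comparison does not depend on floating-point rounding.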

For any $\theta \in \mathbb{C}$, this gives the mapping $U_\theta : \mathcal{A} \to \mathcal{A}_\theta$ of the N = 2 algebra $\mathcal{A} \equiv \mathcal{A}_0$ to an isomorphic algebra $\mathcal{A}_\theta$, whose generators $L^\theta_n$, $Q^\theta_n$, $H^\theta_n$, and $G^\theta_n$ (where $X^\theta_n = U_\theta(X_n)$) satisfy the same relations as those with θ = 0. The family $\mathcal{A}_\theta$ includes the Neveu–Schwarz and Ramond N = 2 algebras, as well as the algebras in which the fermion modes range over $\pm\theta + \mathbb{Z}$. Spectral flow is an automorphism when $\theta \in \mathbb{Z}$.

Next, consider Verma modules over the N = 2 algebra. As already mentioned in the Introduction, there are two different types of N = 2 Verma modules, the topological⁵ and the massive ones. Since each of these can be “twisted” by the spectral flow, we give the definitions of twisted modules, the “untwisted” ones being recovered by setting the twist parameter θ = 0. An important point, however, is that submodules of a given “untwisted” module can be the twisted modules (which is the case with submodules in topological Verma modules and also with submodules determined by the “charged” singular vectors).

Definition 2.1. A vector satisfying the highest-weight conditions⁶
\[
\begin{aligned}
L_m\,|h,t;\theta\rangle_{\mathrm{top}} &= 0, \quad m \ge 1, & Q_\lambda\,|h,t;\theta\rangle_{\mathrm{top}} &= 0, \quad \lambda \in -\theta + \mathbb{N}_0,\\
H_m\,|h,t;\theta\rangle_{\mathrm{top}} &= 0, \quad m \ge 1, & G_\nu\,|h,t;\theta\rangle_{\mathrm{top}} &= 0, \quad \nu \in \theta + \mathbb{N}_0,
\end{aligned}
\qquad \theta \in \mathbb{Z}, \tag{2.4}
\]
with the Cartan generators having the following eigenvalues:

5 The name has to do with the fact that the highest-weight vectors in these modules correspond to primary states existing when the N = 2 algebra is viewed as the topological conformal algebra [EY, W].
6 Here and henceforth, $\mathbb{N} = 1, 2, \ldots$, while $\mathbb{N}_0 = 0, 1, 2, \ldots$.

Structure of Verma Modules over the N = 2 Superconformal Algebra


\[
\begin{aligned}
\bigl(H_0 + \tfrac{c}{3}\,\theta\bigr)\,|h,t;\theta\rangle_{\mathrm{top}} &= h\,|h,t;\theta\rangle_{\mathrm{top}},\\
\bigl(L_0 + \theta H_0 + \tfrac{c}{6}(\theta^2+\theta)\bigr)\,|h,t;\theta\rangle_{\mathrm{top}} &= 0
\end{aligned}
\tag{2.5}
\]
is called the twisted topological highest-weight state. Conditions (2.4) are called the twisted topological highest-weight conditions.

Definition 2.2. The twisted topological Verma module $V_{h,t;\theta}$ is freely generated from a twisted topological highest-weight state $|h,t;\theta\rangle_{\mathrm{top}}$ by
\[
L_{-m},\ m \in \mathbb{N}, \qquad H_{-m},\ m \in \mathbb{N}, \qquad Q_{-m-\theta},\ m \in \mathbb{N}, \qquad G_{-m+\theta},\ m \in \mathbb{N}.
\]

We write $|h,t\rangle_{\mathrm{top}} \equiv |h,t;0\rangle_{\mathrm{top}}$ and $V_{h,t} \equiv V_{h,t;0}$. N = 2 modules are graded with respect to $H_0$ (the charge) and $L_0$ (the level). Extremal vectors in N = 2 modules are those having the minimal level for a fixed $H_0$ charge. Associating a rectangular lattice with the bigrading, we have that increasing the $H_0$-grade by 1 corresponds to shifting to the neighbouring site on the left, while increasing the level corresponds to moving down. The extremal vectors separate the lattice into those sites that are occupied by at least one element of the module and those that are not. The extremal diagram of a topological Verma module reads (in the θ = 0 case for simplicity)

[Extremal diagram (2.6): the state $|h,t\rangle_{\mathrm{top}}$ sits at the top of a “parabola”; the left branch is obtained by the successive action of $G_{-1}, G_{-2}, \ldots$ and the right branch by the successive action of $Q_{-1}, Q_{-2}, \ldots$.]

Then, all the states in the module are inside the “parabola”, while none of the states from the module are associated with the outside part of the plane. A characteristic feature of extremal diagrams of topological Verma modules is the existence of a state that satisfies stronger highest-weight conditions than the other states in the diagram. Geometrically, this is a “cusp” point for the following reasons. Assigning grade −n to Qn and grade n to Gn , we see that every two adjacent arrows in the diagram represent the operators whose grades differ by 1, except at the cusp, where they differ by 2. As we are going to see momentarily, the extremal diagrams of submodules in a (twisted) topological Verma module have “cusps” as well, these “cusp” points being the topological singular vectors: Definition 2.3. A topological singular vector in the (twisted) topological Verma module V is any element of V that is not proportional to the highest-weight vector and satisfies twisted topological highest-weight conditions (i.e., is annihilated by the operators Lm , Hm , m ≥ 1, Qλ , λ ∈ −θ + N0 , and Gν , ν = θ + N0 with some θ ∈ Z). The point is that the twist parameter θ that enters the highest-weight conditions satisfied by the topological singular vector may be different from the twist parameter of the module. One readily shows, of course, that acting with the N = 2 generators on a topological singular vector defined in this way generates a submodule. The next statement follows from the results of [FST]:


Theorem 2.4 ([FST]). Any submodule of a (twisted) topological Verma module is generated from either one or two topological singular vectors.

This is directly parallel to the situation encountered in affine $\widehat{s\ell}(2)$ Verma modules – which, in fact, is the statement of [FST], where a functor was constructed that maps $\widehat{s\ell}(2)$-Verma modules to twisted topological Verma modules. The morphisms in a Verma modules category are singular vectors. The functor maps singular vectors in a $\widehat{s\ell}(2)$-Verma module to topological singular vectors and, thus, the assertion of the Theorem follows from the well known facts in the theory of $\widehat{s\ell}(2)$-Verma modules. Thus, a maximal submodule of a topological Verma module is either a twisted topological Verma module or a sum (not a direct one, of course) of two twisted topological Verma modules. In what follows, we call a submodule primitive if it is not a sum of two or more submodules.

Next, consider the massive N = 2 Verma modules.

Definition 2.5. A twisted massive Verma module $U_{h,\ell,t;\theta}$ is freely generated from a twisted massive highest-weight vector $|h,\ell,t;\theta\rangle$ by the generators
\[
L_{-m},\ m \in \mathbb{N}, \qquad H_{-m},\ m \in \mathbb{N}, \qquad Q_{-\theta-m},\ m \in \mathbb{N}_0, \qquad G_{\theta-m},\ m \in \mathbb{N}. \tag{2.7}
\]
The twisted massive highest-weight vector $|h,\ell,t;\theta\rangle$ satisfies the following set of highest-weight conditions:
\[
\begin{aligned}
Q_{-\theta+m+1}\,|h,\ell,t;\theta\rangle = G_{\theta+m}\,|h,\ell,t;\theta\rangle = L_{m+1}\,|h,\ell,t;\theta\rangle = H_{m+1}\,|h,\ell,t;\theta\rangle &= 0, \quad m \in \mathbb{N}_0,\\
\bigl(H_0 + \tfrac{c}{3}\,\theta\bigr)\,|h,\ell,t;\theta\rangle &= h\,|h,\ell,t;\theta\rangle,\\
\bigl(L_0 + \theta H_0 + \tfrac{c}{6}(\theta^2+\theta)\bigr)\,|h,\ell,t;\theta\rangle &= \ell\,|h,\ell,t;\theta\rangle.
\end{aligned}
\tag{2.8}
\]
Equations (2.8) will be referred to as the twisted massive highest-weight conditions. It is understood that the twisted massive highest-weight vector does not satisfy the twisted topological highest-weight conditions (i.e., $Q_{-\theta}\,|h,\ell,t;\theta\rangle \neq 0$). The ordinary non-twisted case is obtained by setting θ = 0. We identify $|h,\ell,t\rangle \equiv |h,\ell,t;0\rangle$ and $U_{h,\ell,t} \equiv U_{h,\ell,t;0}$. When we say that a state in a Verma module satisfies (twisted) massive highest-weight conditions, we will mean primarily the annihilation conditions from (2.8).

An important property of the above definition is expressed by the following lemma, which underlies all the subsequent analysis. The lemma (which follows by a straightforward calculation in the universal enveloping algebra) is almost trivial, however we formulate it explicitly because of its wide use in what follows. Although we will not refer to the lemma explicitly, the reader should keep in mind that it is implicit in almost all of our constructions.

Lemma 2.6. If a state $|\theta'\rangle$ in a (twisted) massive Verma module satisfies the annihilation conditions (2.8) with the parameter θ equal to $\theta'$ then the states $G_{\theta'-N} \cdots G_{\theta'-1}\,|\theta'\rangle$, $N \ge 1$, and $Q_{-\theta'-N} \cdots Q_{-\theta'}\,|\theta'\rangle$, $N \ge 0$, satisfy annihilation conditions (2.8) with the parameter θ equal to $\theta' - N$ and $\theta' + N + 1$ respectively.

The states referred to in the lemma fill out the extremal diagram of the massive Verma module. For θ = 0, it reads as

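[Extremal diagram (2.9): the state $|h,\ell,t\rangle$ and its image under $Q_0$ sit at the top of a “parabola”; the left branch is obtained by the successive action of $G_{-1}, G_{-2}, \ldots$ and the right branch, starting from $Q_0\,|h,\ell,t\rangle$, by the successive action of $Q_{-1}, Q_{-2}, \ldots$.]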

While $|h,\ell,t\rangle$ satisfies the “untwisted” highest-weight conditions (θ = 0 in (2.8)), the lemma tells us that the other states in the extremal diagram satisfy twisted massive highest-weight conditions with all $\theta \in \mathbb{Z}$. The massive highest-weight vector must not be a “cusp point” (i.e., it should not satisfy topological highest-weight conditions), however other cusp points may appear in the diagram depending on the highest-weight parameters (h, ℓ, t). Whenever this happens, there is a twisted topological submodule in the massive Verma module. The singular vectors appearing in the extremal diagrams of massive Verma modules are called “charged” for historical reasons [BFK].

Definition 2.7. The charged singular vector in a massive Verma module U is any vector that satisfies twisted topological highest-weight conditions (2.4) (with whatever $\theta \in \mathbb{Z}$) and belongs to the extremal diagram of the module.

An example of a charged singular vector is given in the following diagram, where the twisted topological highest-weight conditions (2.4) with θ = 2 are satisfied by the extremal state at the point C:

[Extremal diagram (2.10): the extremal diagram of the massive Verma module built on $|h,\ell,t\rangle$, with the arrows $Q_0, Q_{-1}, Q_{-2}, \ldots$ along the right branch and $G_{-1}, G_{-2}, \ldots$ along the left branch, together with the arrows of the opposite fermion modes pointing back; the state C, reached by $Q_{-2}$, is the cusp of an inner “parabola” that continues with $Q_{-3}, \ldots$ on one side and with modes of G on the other.]

As a result, no operator inverts the action of Q−2 , while each of the other arrows is inverted up to a scalar factor by acting with the opposite mode of the other fermion. Thus, the extremal diagram branches at the “topological points”, and the crucial fact is that, once we are on the inner parabola, we can never leave it: none of the operators from the N = 2 algebra map onto the remaining part of the big parabola from the small one, or, in other words, the inner diagram corresponds to an N = 2 submodule. The general


construction for the charged singular vectors is already obvious from the above remarks, and it will be given in Eqs. (2.40).

The “topological” nature of submodules generated from charged singular vectors can be concealed if one allows submodules to be generated only from the conventional singular vectors, i.e. those that satisfy precisely the same highest-weight conditions as the highest-weight conditions satisfied by the highest-weight vector of the module. These conventional singular vectors do not in general coincide with the “cusp” of the extremal diagram of the topological submodule. However, it is the existence of this “cusp” that determines several crucial properties of the submodule.

Besides (twisted) topological Verma submodules, massive Verma modules may have submodules of the same, “massive”, type. These have to be clearly distinguished from the topological ones. The following definition will allow us to single out massive Verma modules.

Definition 2.8. Let $|Y\rangle$ be a state in an N = 2 Verma module that satisfies twisted massive highest-weight conditions with some $\theta \in \mathbb{Z}$. Then $|X\rangle$ is said to be a dense G/Q-descendant of $|Y\rangle$ if either $|X\rangle = \alpha\, G_{\theta-N} \cdots G_{\theta-1}\,|Y\rangle$, $N \in \mathbb{N}$, or $|X\rangle = \alpha\, Q_{-\theta-M} \cdots Q_{-\theta}\,|Y\rangle$, $M \in \mathbb{N}_0$, where $\alpha \in \mathbb{C}$, $\alpha \neq 0$.

Those extremal states that do not generate the entire massive Verma module necessarily have a vanishing dense G/Q-descendant. In (2.10), for example, a part of the states on the extremal diagram generate only a submodule of the massive Verma module. Thus, in order to correctly define singular vectors that generate massive Verma submodules in a massive Verma module, one has to avoid vanishing dense G/Q-descendants of the singular vector. This is formalized in the following definition.

Definition 2.9. A representative of a massive singular vector in the massive Verma module $U_{h,\ell,t}$ is any element of $U_{h,\ell,t}$ such that
i) it is annihilated by the operators $L_m$, $H_m$, $m \in \mathbb{N}$, $Q_\lambda$, $\lambda \in -\theta + \mathbb{N}$, and $G_\nu$, $\nu \in \theta + \mathbb{N}_0$ with some $\theta \in \mathbb{Z}$,
ii) none of its dense G/Q-descendants vanish,
iii) the highest-weight state $|h,\ell,t\rangle$ is not one of its descendants.

The meaning of the definition is that any representative of a massive singular vector should generate an extremal diagram of the same type as extremal diagram (2.9). On the other hand, vectors that do generate a given massive submodule can be chosen in different ways, and we thus talk about representatives of a massive singular vector.

In the conventional approach, the highest-weight conditions imposed on any singular vector read
\[
Q_{\ge 1} \approx G_{\ge 0} \approx L_{\ge 1} \approx H_{\ge 1} \approx 0. \tag{2.11}
\]
This selects the top-level (in accordance with the diagrams being drawn “upside-down”) representative of the extremal diagram of the submodule. These conventional, “untwisted”, singular vectors will thus be called top-level representatives. In [BFK], [D], conditions (2.11) apply equally to the representatives of the charged and the massive singular vectors in our nomenclature. As regards the charged singular vectors, choosing the top-level representative conceals the fact that the submodule is of a different nature than the module itself; ignoring this then shows up in a number of “paradoxes” when analyzing degenerations of the module. In the general position, the massive singular vectors are equivalent to the “uncharged” singular vectors in the conventional approach, since these generate the same submodule.


In the degenerate cases, however, the conventional, top-level, singular vectors may not generate the entire submodule generated from some other states on the same extremal diagram. This depends on the properties of the extremal diagram of the submodule, which change when there appears a charged singular vector, i.e., when one of the extremal states in the diagram happens to satisfy twisted topological highest-weight conditions. The conventional representatives of singular vectors may then be separated by such topological points from those sections of the extremal diagram which generate the maximal submodule. Our strategy is to define and explicitly construct singular vectors that lie in the “safe” sections of the extremal diagrams (those from which maximal submodules are generated). As we have mentioned, this eliminates the notion of subsingular vectors. Describing the structure of N = 2 modules in this way appears to be more transparent and in any case much more economical, considering a fast proliferation of cases describing the subsingular vectors that have to be introduced whenever the conventional, top-level, singular vectors lie in the “wrong” section of the extremal diagram of the submodule. However, given the analysis that follows, it is a straightforward exercise to classify all such cases (and explicitly construct the subsingular vectors) by looking at how the extremal diagram is divided into different sections by the topological singular vectors. In the next subsection, we develop the algebraic formalism that allows us to construct singular vectors. The reader who is interested only in the degeneration patterns may skip to Subsect. 2 and Sect. 3. 2.2. The algebra of continued operators. In order to explicitly construct singular vectors, we follow ref. [ST2] in making use of “continued” operators that generalize the dense G/Q-descendants to the case of non-integral (in fact, complex) θ. 
The new operators g(a, b) and q(a, b) can be thought of as a continuation of the products of modes $G_a G_{a+1} \cdots G_{a+N}$ and $Q_a Q_{a+1} \cdots Q_{a+N}$, respectively, to a complex number of factors. In particular, whenever the length b − a + 1 of g(a, b) or q(a, b) is a non-negative integer, the corresponding operator becomes, by definition, the product of the corresponding modes:
\[
g(a,b) = \prod_{i=0}^{L-1} G_{a+i}, \qquad q(a,b) = \prod_{i=0}^{L-1} Q_{a+i}, \qquad \text{iff} \quad L \equiv b - a + 1 = 0, 1, 2, \ldots \tag{2.12}
\]
(in the case where L = 0, the product evaluates as 1). We now postulate a number of properties of the new operators in such a way that these properties become identities whenever the operators reduce to elements of the universal enveloping algebra. This is analogous to the well-known story about complex exponents in the construction of [MFF]. To begin with, the idea of a “dense” filling with fermions is formalized in the rules
\[
\begin{aligned}
g(a, b-1)\, g(b, \theta-1)\,|\theta\rangle_g &= g(a, \theta-1)\,|\theta\rangle_g,\\
q(a, b-1)\, q(b, -\theta-1)\,|\theta\rangle_q &= q(a, -\theta-1)\,|\theta\rangle_q,
\end{aligned}
\qquad a, b, \theta \in \mathbb{C}, \tag{2.13}
\]
where $|\theta\rangle_g$ is any state that satisfies $G_{\theta+n}\,|\theta\rangle_g = 0$ for $n \in \mathbb{N}_0$, and $|\theta\rangle_q$, similarly, satisfies $Q_{-\theta+n}\,|\theta\rangle_q = 0$ for $n \in \mathbb{N}_0$.
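In the integer-length case, where (2.12) applies, the gluing rule (2.13) amounts to concatenating two adjacent runs of mode labels. The sketch below makes this explicit; it only tracks the ordered list of mode labels, not the operators themselves, and the helper names `modes` and `glue` are hypothetical, introduced for this illustration only. All arguments are assumed to be integers.

```python
def modes(a, b):
    """Ordered mode labels of g(a, b) or q(a, b) for integer a, b.

    Mirrors (2.12): the operator is the product of the modes a, a+1, ..., b
    when L = b - a + 1 is a non-negative integer; L = 0 gives the empty
    product, which stands for 1.
    """
    length = b - a + 1
    if length < 0:
        raise ValueError("(2.12) only covers non-negative integer length")
    return list(range(a, b + 1))


def glue(a, b, theta):
    """Mode labels of g(a, b - 1) g(b, theta - 1), the left-hand side of (2.13)."""
    return modes(a, b - 1) + modes(b, theta - 1)


# For integer arguments the two sides of (2.13) carry the same modes.
assert glue(-5, -2, 0) == modes(-5, -1)
```

For complex lengths no such list exists, which is why (2.13) has to be postulated rather than derived.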


Under the spectral flow transform (2.3), the operators g(a, b) and q(a, b) behave in the manner that is also inherited from the behaviour of the products (2.12):
\[
U_\theta:\qquad g(a,b) \mapsto g(a+\theta,\, b+\theta), \qquad q(a,b) \mapsto q(a-\theta,\, b-\theta). \tag{2.14}
\]
Further properties of the new operators originate in the fact that, the N = 2 generators Q and G being fermions, they satisfy the vanishing formulae such as, e.g., $G_n \cdot \prod_{i=a}^{a+N} G_i = 0$, $N \in \mathbb{N}_0$, $a \le n \le a+N$. For complex values of the parameters, we impose
\[
G_a\, g(b,c) = 0, \qquad Q_a\, q(b,c) = 0, \qquad a-b \in \mathbb{N}_0 \ \text{and}\ \bigl(a-c \notin \mathbb{N} \ \text{or}\ b-c-1 \in \mathbb{N}\bigr). \tag{2.15}
\]
Similarly, the “left-hand” annihilation properties are expressed by the relations
\[
g(a,b)\, G_c = 0, \qquad q(a,b)\, Q_c = 0, \qquad b-c \in \mathbb{N}_0 \ \text{and}\ \bigl(a-c \notin \mathbb{N} \ \text{or}\ a-b-1 \in \mathbb{N}\bigr). \tag{2.16}
\]
Next, the formulae to commute the continued operators with the bosons $L_{\ge 1}$ and $H_{\ge 1}$ read
\[
\begin{aligned}
{}[K_p,\, g(a,b)] &= \sum_{l=0}^{d(p,a,b)} g(a,\, b-l-1)\,[K_p,\, G_{b-l}]\; G_{b-l+1} \cdots G_b,\\
{}[K_p,\, q(a,b)] &= \sum_{l=0}^{d(p,a,b)} q(a,\, b-l-1)\,[K_p,\, Q_{b-l}]\; Q_{b-l+1} \cdots Q_b,
\end{aligned}
\qquad p \in \mathbb{N}, \tag{2.17}
\]
where K = L or H, and
\[
d(p,a,b) =
\begin{cases}
b-a, & p-b+a \in \mathbb{N}_0 \ \text{and}\ b-a+1 \in \mathbb{N}_0,\\
p-1, & \text{otherwise.}
\end{cases}
\tag{2.18}
\]
The main point here is that, even though the length b − a + 1 may not be an integer, there is always an integral number of terms on the RHS of (2.17). Similarly, applying the g and q operators changes the eigenvalues of $L_0$ and $H_0$, which can be expressed by the commutation relations
\[
\begin{aligned}
{}[L_0,\, g(a,b)] &= -\tfrac{1}{2}(a+b)(b-a+1)\, g(a,b), & [H_0,\, g(a,b)] &= (b-a+1)\, g(a,b),\\
{}[L_0,\, q(a,b)] &= -\tfrac{1}{2}(a+b)(b-a+1)\, q(a,b), & [H_0,\, q(a,b)] &= (-b+a-1)\, q(a,b).
\end{aligned}
\tag{2.19}
\]
Further annihilation properties with respect to the fermionic operators are as follows:
\[
Q_{-\theta+n}\, g(\theta,-1)\,|h,\ell,t\rangle = 0, \qquad \theta \in \mathbb{C}, \quad n \in \mathbb{N}, \tag{2.20}
\]
while
\[
Q_{-\theta}\, g(\theta,-1)\,|h,\ell,t\rangle = 2\bigl(\ell + \theta h - \tfrac{1}{t}(\theta^2+\theta)\bigr)\, g(\theta+1,-1)\,|h,\ell,t\rangle. \tag{2.21}
\]
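For integer lengths, the eigenvalue shifts (2.19) can be cross-checked mode by mode against the basic relations (2.1), which at m = 0 give $[L_0, G_n] = -n\,G_n$, $[H_0, G_n] = G_n$, $[L_0, Q_n] = -n\,Q_n$ and $[H_0, Q_n] = -Q_n$. The sketch below sums these single-mode weights over $a, a+1, \ldots, b$ and compares the result with the closed forms in (2.19). The helper names are hypothetical and the check covers only the integer-length case where (2.12) applies.

```python
from fractions import Fraction


def product_shift(a, b, charge_per_mode):
    """(L_0 shift, H_0 shift) of G_a ... G_b (charge_per_mode = +1)
    or Q_a ... Q_b (charge_per_mode = -1), summed mode by mode using (2.1)."""
    mode_labels = range(a, b + 1)
    return sum(-n for n in mode_labels), charge_per_mode * len(mode_labels)


def closed_form_shift(a, b, charge_per_mode):
    """Right-hand sides of (2.19) evaluated at the same a and b."""
    length = b - a + 1
    return Fraction(-(a + b) * length, 2), charge_per_mode * length


# Compare the two computations over a small grid, including length zero.
for a in range(-4, 3):
    for b in range(a - 1, a + 5):
        for sign in (+1, -1):
            assert product_shift(a, b, sign) == closed_form_shift(a, b, sign)
```

The agreement is the arithmetic-series identity $\sum_{n=a}^{b} n = \tfrac{1}{2}(a+b)(b-a+1)$; relations (2.19) extend the same expressions to complex a and b.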

It is understood here that the operators acting from the left of $g(\theta, \theta'-1)$ or $q(-\theta, -\theta'-1)$ are the N = 2 generators from (2.1) subjected to the spectral flow transform $U_\theta$. Similarly, for the q-operators, we have the following properties:
\[
\begin{aligned}
G_{\theta+n}\, q(-\theta,0)\,|h,\ell,t\rangle &= 0, \qquad n \in \mathbb{N},\\
G_{\theta}\, q(-\theta,0)\,|h,\ell,t\rangle &= 2\bigl(\ell + \theta h - \tfrac{1}{t}(\theta^2+\theta)\bigr)\, q(-\theta+1,0)\,|h,\ell,t\rangle.
\end{aligned}
\tag{2.22}
\]
For the “continued” topological highest-weight states we have, in the same manner,
\[
Q_{-\theta'}\, g(\theta',\theta-1)\,|h,t;\theta\rangle_{\mathrm{top}} = 2(\theta'-\theta)\bigl(h + \tfrac{1}{t}(\theta-\theta'-1)\bigr)\, g(\theta'+1,\theta-1)\,|h,t;\theta\rangle_{\mathrm{top}}, \tag{2.23}
\]
\[
G_{\theta'}\, q(-\theta',-\theta-1)\,|h,t;\theta\rangle_{\mathrm{top}} = 2(\theta'-\theta)\bigl(h + 1 + \tfrac{1}{t}(\theta-\theta'-1)\bigr)\, q(-\theta'+1,-\theta-1)\,|h,t;\theta\rangle_{\mathrm{top}}. \tag{2.24}
\]
The formulae to commute the negative-moded H and L operators through q(a, b) and g(a, b) read
\[
\begin{aligned}
{}[g(a,b),\, K_p] &= \sum_{l=0}^{d(-p,a,b)} G_a \cdots G_{a+l-1}\,[G_{a+l},\, K_p]\; g(a+l+1,\, b),\\
{}[q(a,b),\, K_p] &= \sum_{l=0}^{d(-p,a,b)} Q_a \cdots Q_{a+l-1}\,[Q_{a+l},\, K_p]\; q(a+l+1,\, b),
\end{aligned}
\tag{2.25}
\]

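where d(p, a, b) is given by (2.18). As before, K = H or L. The formulae postulated for g and q make up a consistent set of algebraic rules (in particular, they are consistent with operator associativity and with the positive integral length reduction (2.12)). All the properties listed above make it easy to show the following

Lemma 2.10. I. A massive highest-weight state maps under the action of operators g and q into the states $g(\theta,-1)\,|h,\ell,t\rangle$ and $q(-\theta,0)\,|h,\ell,t\rangle$ that satisfy the following annihilation conditions:
\[
\begin{aligned}
L_m\, g(\theta,-1)\,|h,\ell,t\rangle &= 0, \quad m \in \mathbb{N}, & H_m\, g(\theta,-1)\,|h,\ell,t\rangle &= 0, \quad m \in \mathbb{N},\\
G_a\, g(\theta,-1)\,|h,\ell,t\rangle &= 0, \quad a \in \theta + \mathbb{N}_0, & Q_a\, g(\theta,-1)\,|h,\ell,t\rangle &= 0, \quad a \in -\theta + \mathbb{N},
\end{aligned}
\tag{2.26}
\]
and
\[
\begin{aligned}
L_m\, q(-\theta,0)\,|h,\ell,t\rangle &= 0, \quad m \in \mathbb{N}, & H_m\, q(-\theta,0)\,|h,\ell,t\rangle &= 0, \quad m \in \mathbb{N},\\
G_a\, q(-\theta,0)\,|h,\ell,t\rangle &= 0, \quad a \in \theta + \mathbb{N}, & Q_a\, q(-\theta,0)\,|h,\ell,t\rangle &= 0, \quad a \in -\theta + \mathbb{N}_0.
\end{aligned}
\tag{2.27}
\]
II. The twisted topological highest-weight states are mapped under the action of g and q into the states that satisfy
\[
\begin{aligned}
L_m\, g(\theta',\theta-1)\,|h,t;\theta\rangle_{\mathrm{top}} &= 0, & L_m\, q(\theta',-\theta-1)\,|h,t;\theta\rangle_{\mathrm{top}} &= 0, & m &\in \mathbb{N},\\
H_m\, g(\theta',\theta-1)\,|h,t;\theta\rangle_{\mathrm{top}} &= 0, & H_m\, q(\theta',-\theta-1)\,|h,t;\theta\rangle_{\mathrm{top}} &= 0, & m &\in \mathbb{N},\\
G_a\, g(\theta',\theta-1)\,|h,t;\theta\rangle_{\mathrm{top}} &= 0, & G_a\, q(\theta',-\theta-1)\,|h,t;\theta\rangle_{\mathrm{top}} &= 0, & a &\in -\theta' + \mathbb{N},\\
Q_a\, g(\theta',\theta-1)\,|h,t;\theta\rangle_{\mathrm{top}} &= 0, & Q_a\, q(\theta',-\theta-1)\,|h,t;\theta\rangle_{\mathrm{top}} &= 0, & a &\in \theta' + \mathbb{N}_0.
\end{aligned}
\tag{2.28}
\]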


These equations allow us to relate the states satisfying the highest-weight conditions with different twists, which is necessary for the construction of singular vectors. It is also useful to know the parameters (the corresponding h and ℓ) of the vector obtained from $|h,\ell,t;\theta\rangle$ by the action of a q- or a g-operator. These are described as follows: up to a numerical coefficient, we have
\[
g(\theta',\theta-1)\,|h,\ell,t;\theta\rangle \sim |h',\ell',t;\theta'\rangle, \qquad
\begin{aligned}
h' &= h + \tfrac{2}{t}(\theta-\theta'),\\
\ell' &= \ell + (\theta'-\theta)\bigl(h - \tfrac{1}{t}(\theta'-\theta+1)\bigr),
\end{aligned}
\tag{2.29}
\]
and
\[
q(-\theta',-\theta)\,|h,\ell,t;\theta\rangle \sim |h'',\ell'',t;\theta'+1\rangle, \qquad
\begin{aligned}
h'' &= h + \tfrac{2}{t}(\theta-\theta'-1),\\
\ell'' &= \ell + (\theta'-\theta+1)\bigl(h - \tfrac{1}{t}(\theta'-\theta+2)\bigr).
\end{aligned}
\tag{2.30}
\]
Note that whenever $\ell + (\theta'-\theta)h - \tfrac{1}{t}\bigl((\theta'-\theta)^2 + \theta'-\theta\bigr) = 0$, Eqs. (2.22) allow us to show that, in addition to (2.30),
\[
q(-\theta',-\theta)\,|h,\ell,t;\theta\rangle \sim \bigl|h + \tfrac{2}{t}(\theta-\theta') - 1,\ t;\ \theta'\bigr\rangle_{\mathrm{top}}. \tag{2.31}
\]
In what follows, the above formulae will be used to construct the general expressions for singular vectors in N = 2 Verma modules.

2.3. Singular vectors in codimension 1. In the general position, there are no singular vectors in Verma modules. Singular vectors can appear in codimension 1, when there is 1 relation between parameters of the highest-weight state. This is considered in the present subsection, while the cases of a higher codimension, where several singular vectors coexist in the module, are considered in the next section.

We begin with the topological Verma modules. As we are going to see, this case is also crucial for the massive Verma modules, since the analysis of the latter reduces, to a considerable degree, to the analysis of certain topological Verma modules.

Theorem 2.11. I. A singular vector exists in the topological Verma module $V_{h,t}$ if and only if $h = h^+(r,s,t)$ or $h = h^-(r,s,t)$, where
\[
h^+(r,s,t) = -\frac{r-1}{t} + s - 1, \qquad h^-(r,s,t) = \frac{r+1}{t} - s, \qquad r, s \in \mathbb{N}. \tag{2.32}
\]
II. All singular vectors in the topological Verma module $V_{h^\pm(r,s,t),t}$ over the N = 2 superconformal algebra are given by the explicit construction:
\[
\begin{aligned}
|E(r,s,t)\rangle^+ ={}& g(-r,\,(s-1)t-1)\; q(-(s-1)t,\, r-1-t) \;\cdots\\
& g((s-2)t-r,\, t-1)\; q(-t,\, r-1-t(s-1))\\
& \cdot\, g((s-1)t-r,\, -1)\,|h^+(r,s,t),t\rangle_{\mathrm{top}},
\end{aligned}
\tag{2.33}
\]
\[
\begin{aligned}
|E(r,s,t)\rangle^- ={}& q(-r,\,(s-1)t-1)\; g(-(s-1)t,\, r-t-1) \;\cdots\\
& q((s-2)t-r,\, t-1)\; g(-t,\, r-1-(s-1)t)\\
& \cdot\, q((s-1)t-r,\, -1)\,|h^-(r,s,t),t\rangle_{\mathrm{top}},
\end{aligned}
\tag{2.34}
\]


where $r, s \in \mathbb{N}$ and the factors in the first two lines of each formula are
\[
g(-r-t-mt+st,\, -1+mt)\; q(-mt,\, r-1+mt-st), \qquad s-1 \ge m \ge 1, \tag{2.35}
\]
and
\[
q(-r-t-mt+st,\, -1+mt)\; g(-mt,\, r-1+mt-st), \qquad s-1 \ge m \ge 1, \tag{2.36}
\]
respectively. The $|E(r,s,t)\rangle^\pm$ singular vectors satisfy twisted topological highest-weight conditions with the twist parameter $\theta = \mp r$, are on the level $rs + \tfrac{1}{2}r(r-1)$ over the corresponding topological highest-weight state, and have the relative charge $\pm r$.

In what follows, we will need singular vector operators $E^\pm(r,s,t)$ such that $|E(r,s,t)\rangle^\pm = E^\pm(r,s,t)\,|h^\pm(r,s,t),t\rangle_{\mathrm{top}}$. In a direct analogy with the well-known affine Lie algebra case [MFF, Ma], “all singular vectors” applies literally to non-rational t, while for rational t, a singular vector may be given already by a subformula of Eqs. (2.33), (2.34) as soon as that subformula (obtained by dropping several g- and q-operators from the left) produces an element of the Verma module.

To avoid a possible misunderstanding, let us point out once again that a given submodule may be generated from a state other than the singular vectors we work with (in the present case, other than the topological singular vectors). This is completely similar to the situation in the standard case of (affine) Lie algebras, where it is possible to generate a given Verma submodule from some vectors other than the highest-weight state of the submodule. However, any such vector is a descendant of the highest-weight vector and, in this sense, considering it as a “singular vector” is unnecessary (and, often, inconvenient). An essential point about singular vectors (2.33), (2.34) is that the corresponding submodules can be freely generated from the vectors.

Proof. Part I was conjectured in [S1] and proved in [FST] as an immediate consequence of the equivalence result obtained there. The construction of singular vectors in Part II is borrowed from [ST], while the fact that these are all singular vectors follows again from [FST]. The scheme to evaluate the singular vectors as elements of the topological Verma module can be outlined as follows. Consider, for definiteness, (2.34). This can be rewritten as
\[
|E(r,s,t)\rangle^- = q(-r,\,(s-1)t-1)\; E^{+,\,r-(s-1)t}(r,s-1,t)\; q((s-1)t-r,\,-1)\,|h^-(r,s,t),t\rangle_{\mathrm{top}},
\]
where $E^{+,\theta}(r,s-1,t)$ is the spectral flow transform of the singular vector operator. Now, assuming that this operator is already expressed in terms of modes of L, H, G, and Q, we shall prove that $|E(r,s,t)\rangle^-$ is a polynomial in $L_{\le -1}$, $H_{\le -1}$, $G_{\le -1}$, and $Q_{\le -1}$ acting on $|h^-(r,s,t),t\rangle_{\mathrm{top}}$. To this end, we use (2.13) to rewrite $q(-r,\,(s-1)t-1)\,E^{+,\,r-(s-1)t}(r,s-1,t)$ as $q(-r,\,(s-1)t-r-1)\,Q_{(s-1)t-r} \cdots Q_{(s-1)t-1}\,E^{+,\,r-(s-1)t}(r,s-1,t)$ and observe that all of the operators $Q_{(s-1)t-r}, \ldots, Q_{(s-1)t-1}$ annihilate the state $q((s-1)t-r,\,-1)\,|h^-(r,s,t),t\rangle_{\mathrm{top}}$ in accordance with (2.20)–(2.24). After commuting these operators to the right, Eqs. (2.16) apply to $q(-r,\,(s-1)t-r-1)$ and all of the remaining modes of Q. Finally, we are left with a polynomial in the modes of only L and H between $q(-r,\,(s-1)t-r-1)$ and $q((s-1)t-r,\,-1)$. Then, using Eqs. (2.25), we see that the two q-operators meet each other and are eliminated using Eqs. (2.13) and (2.12). Thus, we are left with a polynomial in $L_{\le -1}$, $H_{\le -1}$, $G_{\le -1}$, and $Q_{\le -1}$ acting on $|h^-(r,s,t),t\rangle_{\mathrm{top}}$.
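The level $rs + \tfrac{1}{2}r(r-1)$ and the relative charge $\pm r$ quoted in Theorem 2.11 can be cross-checked by adding up the $L_0$ and $H_0$ shifts (2.19) of the individual factors in (2.33)–(2.36). The sketch below does this in exact rational arithmetic for sample values of r, s and t; the function names are hypothetical, and the computation only tracks eigenvalue shifts, so it is a consistency check on the formulas rather than an evaluation of the singular vectors.

```python
from fractions import Fraction


def shifts(kind, a, b):
    """(L_0 shift, H_0 shift) of g(a, b) or q(a, b), taken from (2.19)."""
    length = b - a + 1
    level = -Fraction(1, 2) * (a + b) * length
    charge = length if kind == "g" else -length
    return level, charge


def factors_plus(r, s, t):
    """Factors of |E(r, s, t)>^+ in (2.33), left to right, generated from (2.35)."""
    out = []
    for m in range(s - 1, 0, -1):
        out.append(("g", -r - t - m * t + s * t, -1 + m * t))
        out.append(("q", -m * t, r - 1 + m * t - s * t))
    out.append(("g", (s - 1) * t - r, -1))
    return out


def level_and_charge(r, s, t, sign=+1):
    """Total (level, relative charge) of |E(r, s, t)>^+ (sign=+1) or ^- (sign=-1)."""
    swap = {"g": "q", "q": "g"}
    level = charge = Fraction(0)
    for kind, a, b in factors_plus(r, s, t):
        if sign < 0:
            # (2.34) and (2.36) are (2.33) and (2.35) with g and q exchanged.
            kind = swap[kind]
        d_level, d_charge = shifts(kind, a, b)
        level += d_level
        charge += d_charge
    return level, charge


t = Fraction(7, 3)  # an arbitrary non-integer sample value
for r in range(1, 4):
    for s in range(1, 4):
        expected_level = r * s + Fraction(r * (r - 1), 2)
        assert level_and_charge(r, s, t, +1) == (expected_level, r)
        assert level_and_charge(r, s, t, -1) == (expected_level, -r)
```

The dependence on t cancels between the factors, which is why a generic sample value of t suffices for the check.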


This allows us to develop the induction argument, with the starting point being that in the center of each of the formulas (2.33) and (2.34), there is a g- or q-operator of the positive integral length r, which therefore reduces to the product of modes according to (2.12).

We now turn to singular vectors in massive N = 2 Verma modules. To a given massive Verma module $U_{h,\ell,t}$ we associate four twisted topological Verma modules whose highest-weight vectors are the “continued” states of the form of those entering (2.26) and (2.27). Namely, let $\theta'$ and $\theta'' = -\theta' + ht - 1$ be two roots of the equation
\[
\ell = -\theta h + \tfrac{1}{t}(\theta^2 + \theta). \tag{2.37}
\]
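Equation (2.37) is quadratic in θ, $\theta^2 + (1 - ht)\,\theta - \ell t = 0$, so its two roots add up to $ht - 1$, which is the stated relation between $\theta'$ and $\theta''$. The sketch below checks this on sample values, and also checks that an integer root $\theta = -n$ corresponds to $\ell = n\bigl(h + \tfrac{n-1}{t}\bigr)$, the expression that appears in the proof of Lemma 2.12. The function names are hypothetical helpers introduced for this illustration.

```python
from fractions import Fraction


def ell_of_theta(theta, h, t):
    """Right-hand side of (2.37): -theta*h + (theta^2 + theta)/t."""
    return -theta * h + (theta ** 2 + theta) / t


def partner_root(theta1, h, t):
    """The second root theta'' = -theta' + h*t - 1 paired with a root theta'."""
    return -theta1 + h * t - 1


# Arbitrary sample parameters in exact arithmetic.
h, t = Fraction(3, 5), Fraction(7, 2)
theta1 = Fraction(-4, 3)
ell = ell_of_theta(theta1, h, t)

# Both theta' and theta'' solve (2.37) for the same value of ell.
assert ell_of_theta(partner_root(theta1, h, t), h, t) == ell

# An integer root theta = -n gives ell = n (h + (n - 1)/t).
for n in range(-3, 4):
    assert ell_of_theta(Fraction(-n), h, t) == n * (h + Fraction(n - 1) / t)
```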

Then, using Lemma 2.10 and Eqs. (2.29) and (2.30), it is immediately verified that the states g(θ0 , −1)|h, `, ti , q(−θ0 , 0)|h, `, ti , (2.38) g(θ00 , −1)|h, `, ti , q(−θ00 , 0)|h, `, ti , formally satisfy the twisted topological highest-weight conditions (2.4), although possibly with a complex twist parameter. We will say, for brevity, that a highest-weight state admits a singular vector if the corresponding singular vector exists in the module built on that state and that a highestweight state admits no singular vectors if no singular vectors exist in the module. As it turns out, all possible degenerations of the massive Verma module Uh,`,t occur depending on whether and how many of states (2.38) belong to Uh,`,t and/or admit a topological singular vector. We now introduce a stratification of the space of highest weights (h, `, t) controlled by the behaviour of vectors (2.38). In the subsequent sections, we consider each stratum in turn and study the corresponding degenerations of massive Verma modules. The possible cases, whose labels Oxyz indicate the existence of typical (massive or charged) singular vectors, are as follows: codimension 0: 1. none of states (2.38) belong to Uh,`,t and at least one of states (2.38) admits no topological singular vectors; codimension 1: 2. Om : one of states (2.38) admits precisely one topological singular vector, each of the other states (2.38) admits at least one topological singular vector, while none of states (2.38) belong to Uh,`,t ; 3. Oc : one and only one of states (2.38) belongs to Uh,`,t and none of states (2.38) admit a topological singular vector; codimension 2: 4. Omm : each of states (2.38) admits at least two distinct topological singular vectors, while none of states (2.38) belong to Uh,`,t ; 5. Occ : precisely one of the states from each column in (2.38) belongs to the module Uh,`,t and none of these two states admit a topological singular vector; 6. 
Ocm : one of the states from (2.38) belongs to the module Uh,`,t and admits precisely one topological singular vector; none of states (2.38) admit two different topological singular vectors; no two states from different columns in (2.38) belong to Uh,`,t ; codimension 3:

Structure of Verma Modules over the N = 2 Superconformal Algebra

145

7. Ocmm : one of the states from (2.38) belongs to the module Uh,`,t and admits at least two different topological singular vectors; no two states from different columns in (2.38) belong to Uh,`,t ; 8. Occm : precisely one of the states from each column in (2.38) belongs to the module Uh,`,t ; each of these two states admits at least one topological singular vector. In what follows, we will often refer to cases 2–8 by saying that the highest-weight parameters (h, `, t) of Uh,`,t belong to the corresponding Oxyz . Lemma 2.12. The above cases 1–8 divide the space of highest-weight parameters (h, `, t) into a disjoint union. Proof. Observe, first of all, that each case in the above list is singled out by a combination of two conditions or their negations: that one of states (2.38) belongs to the module Uh,`,t and that a (necessarily topological) singular vector exists in the module built on one of states (2.38). The first condition means that the θ parameter is an integer of the appropriate sign such that formulae (2.12) apply and, thus, the corresponding state in an element of Uh,`,t . We find from (2.37) that the condition for this to be the case is ` = n(h + n−1 t ), n ∈ Z. Next, whether or not a state from (2.38) admits a topological singular vector is a matter of whether the corresponding h0 or h00 parameter determined according to (2.29) and (2.30) equals one of the h± from (2.32). We see from (2.37) that this is the case if and only if ` = − 4t (h − h− (r, s, t))(h − h+ (r, s + 1, t)), r, s ∈ N. Note that the two expressions for ` are precisely the zeros of the Kaˇc determinant [BFK]. The cases 1–8 do not overlap by construction; on the other hand, there are no other possible combinations of the two basic conditions, since such combinations (e.g., that three distinct states from (2.38) belong to Uh,`,t , etc.) 
would either lead to an overdetermined system of equations on the parameters $h$, $t$, $\theta'$, and $\theta''$, which admits no solutions, or would contradict the embedding patterns of topological Verma modules, which are isomorphic [FST] to the embedding patterns of $\widehat{s\ell}(2)$ Verma modules.

Unless one considers cases of codimension 2 or 3, there is no discrepancy in the use of the term “charged” between the present paper and the treatment of [BFK] (and similarly with the correspondence “massive”–“uncharged”), in the sense that the top-level representatives of singular vectors generate exactly the same submodules as our singular vectors. In the following two theorems, we take care not to slip down to a higher codimension and recover the “charged” and the “massive” cases:

Theorem 2.13. I. The highest-weight of the massive Verma module $U_{h,\ell,t}$ belongs to the set $O_c$ if and only if $\ell = l_{ch}(n,h,t)$, where
\[
l_{ch}(n,h,t) = n\Bigl(h + \frac{n-1}{t}\Bigr),\qquad
(n,h,t)\in(\mathbb{Z}\times\mathbb{C}\times\mathbb{C})\setminus
\Bigl\{\bigl(n',\, s - \tfrac{2n'-1+r}{t'},\, t'\bigr)\Bigm| r,s\in\mathbb{Z},\ r\neq 0,\ r\cdot s\ge 0,\ n'\in\mathbb{Z},\ t'\in\mathbb{C}\Bigr\}. \tag{2.39}
\]
II. Then, the massive Verma module $U_{h,l_{ch}(n,h,t),t}$ contains precisely one submodule, which is generated from the charged singular vector
\[
|E(n,h,t)\rangle_{ch} =
\begin{cases}
Q_n \dots Q_0\,|h, l_{ch}(n,h,t), t\rangle, & n\le 0,\\
G_{-n}\dots G_{-1}\,|h, l_{ch}(n,h,t), t\rangle, & n\ge 1.
\end{cases} \tag{2.40}
\]


Every such vector satisfies the twisted topological highest-weight conditions (2.4) with $\theta = -n$ and, therefore, the submodule is isomorphic to a twisted topological Verma module.

Proof. The formula for $l_{ch}(n,h,t)$ is obvious from the proof of Lemma 2.12; the condition $\ell = l_{ch}(n,h,t)$ is equivalent to the fact that a solution of Eq. (2.37) is an integer ($\theta'\in\mathbb{Z}$ or $\theta''\in\mathbb{Z}$). This reproduces the “charged” series of zeros of the Kač determinant [BFK]. The excluded set $X(n,t)$ is that where other zeros of the Kač determinant occur. Finally, a straightforward calculation in the universal enveloping algebra shows that the state (2.40) does satisfy the twisted topological highest-weight conditions, which completes the proof.

The top-level representative of (2.40), which reads as
\[
|s(n,h,t)\rangle_{ch} =
\begin{cases}
G_0 \dots G_{-n-1}\,|E(n,h,t)\rangle_{ch}, & n\le 0,\\
Q_1 \dots Q_{n-1}\,|E(n,h,t)\rangle_{ch}, & n\ge 1,
\end{cases} \tag{2.41}
\]

is the conventional charged singular vector satisfying the conditions given in [BFK]. Thus, the conventional charged singular vector necessarily belongs to a twisted topological Verma submodule, and it is the highest-weight vector of this submodule that we call the charged singular vector $|E(n,h,t)\rangle_{ch}$ in this paper. Further, as regards the massive singular vectors, we have

Theorem 2.14. I. The highest-weight of the massive Verma module $U_{h,\ell,t}$ belongs to the set $O_m$ if and only if $\ell = l(r,s,h,t)$, where
\[
l(r,s,h,t) = -\frac{t}{4}\,(h - h^-(r,s,t))(h - h^+(r,s+1,t)), \tag{2.42}
\]
\[
(r,s,h,t)\in \bigl(\mathbb{N}\times\mathbb{N}\times\mathbb{C}\times(\mathbb{C}\setminus\mathbb{Q})\bigr)\cup Y
\;\setminus\;
\Bigl\{\bigl(r',\, s',\, \pm s' - \tfrac{2n-1\pm r'}{t'},\, t'\bigr)\Bigm| n\in\mathbb{Z},\ r',s'\in\mathbb{N},\ t'\in\mathbb{C}\Bigr\},
\]
where
\[
Y = \Bigl\{\bigl(r', s', h', -\tfrac{p}{q}\bigr)\Bigm| 1\le r'\le p,\ 1\le s'\le q,\ p,q\in\mathbb{N},\ h'\in\mathbb{C}\Bigr\}
\cup
\Bigl\{\bigl(r', 1, h', -\tfrac{p}{q}\bigr)\Bigm| p+1\le r'\le 2p,\ p,q\in\mathbb{N},\ h'\in\mathbb{C}\Bigr\}. \tag{2.43}
\]

In this case, $U_{h,\ell,t}$ contains precisely one submodule, which is a massive Verma module. II. Then, the representatives of the massive singular vector in the massive Verma module $U_{h,l(r,s,h,t),t}$ are given by
\[
|S(r,s,h,t)\rangle^- = g(-rs,\ r + \theta^-(r,s,h,t) - 1)\; E^{-,\theta^-(r,s,h,t)}(r,s,t)\; g(\theta^-(r,s,h,t), -1)\,|h, l(r,s,h,t), t\rangle, \tag{2.44}
\]
\[
|S(r,s,h,t)\rangle^+ = q(1-rs,\ r - \theta^+(r,s,h,t) - 1)\; E^{+,\theta^+(r,s,h,t)}(r,s,t)\; q(-\theta^+(r,s,h,t), 0)\,|h, l(r,s,h,t), t\rangle, \tag{2.45}
\]
where $E^{\pm,\theta}(r,s,t)$ are the topological singular vector operators subjected to the spectral flow transform with parameter $\theta$, and
\[
\theta^-(r,s,h,t) = \tfrac{t}{2}\,(h - h^-(r,s,t)),\qquad
\theta^+(r,s,h,t) = \tfrac{t}{2}\,(h - 1 - h^+(r,s,t)). \tag{2.46}
\]
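The relation between (2.42) and (2.46) can be checked with exact rational arithmetic. The sketch below verifies that $\theta^- + \theta^+ = ht - 1$ and $\theta^-\theta^+ = -t\,l(r,s,h,t)$, which is what one expects of the two roots of a single quadratic equation in $\theta$. The closed forms used for $h^\pm(r,s,t)$ are an assumption, since Eq. (2.32) is not reproduced in this part of the text; they are chosen to agree with the $j^\pm$ labels of Sect. 3.1 under $j = -\frac{t}{2}h$. All function names are illustrative.

```python
from fractions import Fraction


def h_plus(r, s, t):
    # ASSUMED closed form of h^+(r, s, t) from Eq. (2.32).
    return -Fraction(r - 1) / Fraction(t) + s - 1


def h_minus(r, s, t):
    # ASSUMED closed form of h^-(r, s, t) from Eq. (2.32).
    return Fraction(r + 1) / Fraction(t) - s


def l_massive(r, s, h, t):
    # Eq. (2.42): l = -(t/4) (h - h^-(r,s,t)) (h - h^+(r,s+1,t)).
    h, t = Fraction(h), Fraction(t)
    return -t / 4 * (h - h_minus(r, s, t)) * (h - h_plus(r, s + 1, t))


def theta_minus(r, s, h, t):
    # Eq. (2.46), first formula.
    h, t = Fraction(h), Fraction(t)
    return t / 2 * (h - h_minus(r, s, t))


def theta_plus(r, s, h, t):
    # Eq. (2.46), second formula.
    h, t = Fraction(h), Fraction(t)
    return t / 2 * (h - 1 - h_plus(r, s, t))


def roots_consistent(r, s, h, t):
    # True when theta^-, theta^+ have sum h t - 1 and product -t l(r,s,h,t).
    h, t = Fraction(h), Fraction(t)
    tm, tp = theta_minus(r, s, h, t), theta_plus(r, s, h, t)
    return tm + tp == h * t - 1 and tm * tp == -t * l_massive(r, s, h, t)
```

For instance, `roots_consistent(2, 3, Fraction(5, 7), Fraction(3, 2))` evaluates both identities without rounding error, because every quantity is kept as a `Fraction`.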

The RHSs of (2.44) and (2.45) evaluate as elements of $U_{h,l(r,s,h,t),t}$ and satisfy the twisted massive highest-weight conditions
\[
\begin{gathered}
Q_{\ge 1\mp rs}\,|S(r,s,h,t)\rangle^\pm = H_{\ge 1}\,|S(r,s,h,t)\rangle^\pm = L_{\ge 1}\,|S(r,s,h,t)\rangle^\pm = G_{\ge \pm rs}\,|S(r,s,h,t)\rangle^\pm = 0,\\
L_0\,|S(r,s,h,t)\rangle^\pm = l^\pm(r,s,h,t)\,|S(r,s,h,t)\rangle^\pm,\\
H_0\,|S(r,s,h,t)\rangle^\pm = (h \mp rs)\,|S(r,s,h,t)\rangle^\pm
\end{gathered} \tag{2.47}
\]
with
\[
l^\pm(r,s,h,t) = l(r,s,h,t) + \tfrac{1}{2}\,rs\,(rs + 2 \mp 1). \tag{2.48}
\]

Either of the $|S(r,s,h,t)\rangle^\pm$ states generates the entire massive Verma submodule; in particular, all of the dense $G/Q$-descendants of (2.44) and (2.45) are on the same extremal subdiagram (the extremal diagram of the submodule) and coincide up to numerical factors whenever they are in the same grade:
\[
\begin{aligned}
c_-(i,h,t)\,Q_{i+1-rs}\dots Q_{rs}\,|S(r,s,h,t)\rangle^- &= c_+(i,h,t)\,G_{rs-i}\dots G_{rs-1}\,|S(r,s,h,t)\rangle^+, & i &= 0,\dots,2rs,\\
c_-(i,h,t)\,G_{-rs+i}\dots G_{-rs-1}\,|S(r,s,h,t)\rangle^- &= c_+(i,h,t)\,G_{-rs+i}\dots G_{rs-1}\,|S(r,s,h,t)\rangle^+, & i &\le -1,\\
c_-(i,h,t)\,Q_{rs-i}\dots Q_{rs}\,|S(r,s,h,t)\rangle^- &= c_+(i,h,t)\,Q_{rs-i}\dots Q_{-rs}\,|S(r,s,h,t)\rangle^+, & i &\ge 2rs+1,
\end{aligned} \tag{2.49}
\]

where $c_\pm(i,h,t)$ are ($r$- and $s$-dependent) polynomials in $h$ and $t$.

Proof. A state $|h',t;\theta'\rangle_{top}$ or $|h'',t;\theta''\rangle_{top}$ from (2.38) admits a singular vector if and only if (2.32) holds for the corresponding $h'$ or $h''$ parameter determined according to (2.29) and (2.31). Using (2.37), we see that this is the case if and only if $\ell = l(r,s,h,t)$, $r,s\in\mathbb{N}$, which gives zeros of the Kač determinant [BFK]. Excluding the set $X(r,s,t)$ guarantees that this is the only zero. Further, a unique submodule can also occur for negative rational $t = -\frac{\tilde p}{q}$ provided $r$ is sufficiently small (the “smallness” of $r$ depends on whether $s = 1$ or $s \ge 1$, since these cases correspond to different degenerations of the auxiliary topological Verma modules; the corresponding embedding diagrams are isomorphic [FST] to embedding diagrams of the $\widehat{s\ell}(2)$ Verma modules with negative rational $k + 2 = -\frac{\tilde p}{q}$ and with the same $r$ and $s$), whence Part I follows. The fact that (2.44) and (2.45) are elements of the Verma module $U_{h,l(r,s,h,t),t}$ follows similarly to how this was described in the proof of Theorem 2.11 (in the present case, one considers the topological singular vector operators $E^\pm(r,s,t)$ as already expressed as polynomials in the modes, then one subjects these operators to the spectral flow transform with $\theta = \theta^\pm(r,s,h,t)$, and, finally, applies the formulae of Sect. 2). Formulae (2.47) follow from (2.26)–(2.30) and (2.46). Equations (2.48) are obtained by applying (2.19) to


explicit expressions (2.44) and (2.45). Two singular vectors (2.44) and (2.45) generate the same submodule because they are descendants of each other, as expressed by Eqs. (2.49), which, in turn, follow by comparing with the theory of $\widehat{s\ell}(2)$ relaxed Verma modules by means of the direct and the inverse functors constructed in [FST].

The structure of (2.44) and (2.45) reflects the property stipulated in item 2 of the list on page 144, that the corresponding topological highest-weight state from (2.38) admit a singular vector. Namely, Eqs. (2.44) and (2.45) mean that we first map from the massive Verma module $U_{h,l(r,s,h,t),t}$ either by $g(\theta^-(r,s,h,t),-1)$ or by $q(-\theta^+(r,s,h,t),0)$ in such a way that the resulting state satisfies twisted topological highest-weight conditions with the twist parameters $\theta^\mp(r,s,h,t)$ respectively, which are the roots of (2.37) with $\ell = l(r,s,h,t)$. Even though $\theta^\mp(r,s,h,t)$ are, in general, complex, we build spectral-flow-transformed topological singular vectors on these states and, finally, map back to the original module $U_{h,l(r,s,h,t),t}$. From the correspondence with the zeros of the Kač determinant, we also see that the massive Verma module $U_{h,\ell,t}$ is irreducible if and only if conditions of item 1 of the list on p. 144 are satisfied.

3. Submodules and Singular Vectors in Codimension ≥ 2

To proceed with the degeneration patterns of $N = 2$ Verma modules, we begin with topological Verma modules, where 2 is the highest codimension, and then consider codimensions 2 and 3 in the massive case.

3.1. Topological Verma modules. A further degeneration in the setting of Theorem 2.11 means that the parameter $t$ is rational, $t = p/q$.
This case is the least interesting one as regards the structure of submodules, since the structure of the topological Verma module $V_{h^\pm(r,s,\frac{p}{q}),\frac{p}{q}}$ is determined [FST] by the well-known structure of the Verma module $M_{j^\pm(r,s,\frac{p}{q}-2),\frac{p}{q}-2}$ over the affine $\widehat{s\ell}(2)$ algebra, where $j^+(r,s,k) = \frac{r-1}{2} - (k+2)\frac{s-1}{2}$ and $j^-(r,s,k) = -\frac{r+1}{2} + (k+2)\frac{s}{2}$. This applies to the BGG resolution [BGG], embedding diagrams [RCW, Ma], etc.

Recall that the Verma module $M_{j,k}$ over the $\widehat{s\ell}(2)$ algebra
\[
[J^0_m, J^\pm_n] = \pm J^\pm_{m+n},\qquad
[J^0_m, J^0_n] = \tfrac{K}{2}\, m\,\delta_{m+n,0},\qquad
[J^+_m, J^-_n] = K\, m\,\delta_{m+n,0} + 2J^0_{m+n},\qquad m,n\in\mathbb{Z} \tag{3.1}
\]
(where the generator $K$ is central) is freely generated by the modes $J^+_{\le -1}$, $J^-_{\le 0}$, and $J^0_{\le -1}$ from the highest-weight vector $|j,k\rangle_{s\ell(2)}$ that satisfies the following highest-weight conditions:
\[
\begin{gathered}
J^+_{\ge 0}\,|j,k\rangle_{s\ell(2)} = J^-_{\ge 1}\,|j,k\rangle_{s\ell(2)} = J^0_{\ge 1}\,|j,k\rangle_{s\ell(2)} = 0,\\
J^0_0\,|j,k\rangle_{s\ell(2)} = j\,|j,k\rangle_{s\ell(2)},\qquad
K\,|j,k\rangle_{s\ell(2)} = k\,|j,k\rangle_{s\ell(2)},\qquad j,k\in\mathbb{C}.
\end{gathered} \tag{3.2}
\]
Singular vectors in $M_{j,k}$ are labelled by $r,s\in\mathbb{N}$ and can be of the “+” or “−” type. General formulae for these singular vectors, which we denote as $|MFF(r,s,k)\rangle^\pm$, can be found in [MFF] or, in our present conventions, in [FST].
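The labels $j^\pm(r,s,k)$ are simple enough to tabulate exactly. The following sketch computes them with rational arithmetic and converts a spin $j$ of $M_{j,t-2}$ into the parameter $h$ of the corresponding topological module through $j = -\frac{t}{2}h$, $k = t-2$ (the relation used in Theorem 3.1). The function names are illustrative and not part of the original text.

```python
from fractions import Fraction


def j_plus(r, s, k):
    # j^+(r, s, k) = (r - 1)/2 - (k + 2)(s - 1)/2
    return Fraction(r - 1, 2) - (Fraction(k) + 2) * Fraction(s - 1, 2)


def j_minus(r, s, k):
    # j^-(r, s, k) = -(r + 1)/2 + (k + 2) s/2
    return -Fraction(r + 1, 2) + (Fraction(k) + 2) * Fraction(s, 2)


def h_from_j(j, t):
    # Invert j = -(t/2) h for the topological Verma module V_{h,t}.
    return -2 * Fraction(j) / Fraction(t)
```

With $t = 3/2$ (so $k = -1/2$), $r = 2$, $s = 3$, one finds $j^+ = -1$ and $j^- = 3/4$, corresponding to $h = 4/3$ and $h = -1$ respectively. In both cases $2j^\pm + 1$ equals $\pm\bigl(r - (s - \tfrac12 \pm \tfrac12 - 1)(k+2)\bigr)$, i.e. $2j^+ + 1 = r - (s-1)(k+2)$ and $2j^- + 1 = -r + s(k+2)$.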


Theorem 3.1 ([FST]). For arbitrary $h\in\mathbb{C}$ and $t\in\mathbb{C}\setminus\{0\}$,
1. the topological $N = 2$ Verma module $V_{h,t}$ is irreducible if and only if the $\widehat{s\ell}(2)$ Verma module $M_{-\frac{t}{2}h,\,t-2}$ is irreducible;
2. the module $V_{h,t}$ has a submodule generated by a singular vector $|E(r,s,t)\rangle^\pm$, Eqs. (2.33) or (2.34), if and only if the module $M_{-\frac{t}{2}h,\,t-2}$ has a submodule generated by the singular vector $|MFF(r,s,t-2)\rangle^\pm$ respectively.
Whenever the singular vector in $M_{-\frac{t}{2}h,\,t-2}$ has relative $J^0_0$-charge $\pm r$, it is clear from formulae (2.33) and (2.34) that the corresponding topological singular vector in $V_{h,t}$ has relative charge $\mp r$ and satisfies the twisted topological highest-weight conditions (2.4) with the twist parameter $\theta = \pm r$.

Thus, the appearance of one or more singular vectors in a topological $N = 2$ Verma module can be read off from the corresponding $\widehat{s\ell}(2)$ Verma module. In view of the correspondence at the level of submodules, it might seem puzzling that one can talk about subsingular vectors in topological $N = 2$ Verma modules, since these are absent in $\widehat{s\ell}(2)$ Verma modules. In fact, this apparent paradox illustrates our general statement that, for the $N = 2$ superconformal algebra, subsingular vectors are an artifact of an “inconvenient” definition of singular vectors. They have to be considered when one restricts oneself to submodules generated only from the conventional, top-level, $N = 2$ singular vectors, which do not always generate maximal submodules. On the other hand, singular vectors (2.33) and (2.34), which satisfy twisted topological highest-weight conditions, allow one to work with maximal submodules, and it is these singular vectors that are in 1 : 1 correspondence [S1, FST] with the $\widehat{s\ell}(2)$ singular vectors.

In the conventional approach, on the other hand, subsingular vectors occur in the topological Verma modules $V_{h^\pm(r,s,t(r,s,n)),\,t(r,s,n)}$, where $t(r,s,n) = \frac{n-r}{s}$ with $r$ greater than $n$. Namely, we have the following

Proposition 3.2.
The quotient of the topological Verma module $V_{h,t}$ over the submodules generated by conventional singular vectors is reducible – i.e., a subsingular vector exists in $V_{h,t}$ – if and only if $t = t(r,s,n)$, $h = h^\pm(r,s,t(r,s,n))$, $1\le n < r$. In the “−” case, for definiteness, the subsingular vector in $V_{h^-(r,s,t(r,s,n)),\,t(r,s,n)}$ is given by
\[
|Sub\rangle = G_0\dots G_{r-n-1}\,G_{r-n+1}\dots G_{r-1}\,\bigl|E(r,s,\tfrac{n-r}{s})\bigr\rangle^- \tag{3.3}
\]
(where $|E(r,s,t)\rangle^-$ is the topological singular vector (2.34)). This becomes singular in the quotient module $V_{h^-(r,s,t(r,s,n)),\,t(r,s,n)}/C$, where $C$ is the submodule generated from the top-level representative of the singular vector in $V_{h^-(r,s,t(r,s,n)),\,t(r,s,n)}$, which is given by $G_0\dots G_{r-1}\,|E(r,s,\frac{n-r}{s})\rangle^-$.

Proof. The parameters are such that the module $V_{h^-(r,s,t(r,s,n)),\,t(r,s,n)}$ contains at least two submodules
\[
C' \xrightarrow{\ |E(n,1,\frac{n-r}{s})\rangle^{+,r}\ } C \xrightarrow{\ |E(r,s,\frac{n-r}{s})\rangle^{-}\ } V_{h^-(r,s,t(r,s,n)),\,t(r,s,n)}, \tag{3.4}
\]
where $C \approx V_{h^-(r,s,t(r,s,n)) - \frac{2r}{t},\,t(r,s,n);\,r}$ and $C' \approx V_{h^-(r,s,t(r,s,n)) + \frac{2(n-r)}{t},\,t(r,s,n);\,r-n}$, the arrows mean embeddings by means of the corresponding singular vectors, and $|E(r,s,t)\rangle^{\pm,\theta}$ are the topological singular vectors subjected to the spectral flow transform with parameter $\theta$. All we have to do in the conventional approach is to describe these submodules in terms of (submodules generated from) the top-level representatives of extremal diagrams. Thus, consider the conventional singular vector


\[
|conv\rangle = G_0\dots G_{r-1}\,\bigl|E(r,s,\tfrac{n-r}{s})\bigr\rangle^- \tag{3.5}
\]
which satisfies highest-weight conditions (2.8) with $\theta = 0$. It is clear that $|conv\rangle$ belongs to the submodule $C'$ iff $n < r$ and belongs to $C$ iff $n > r$. Indeed, the highest-weight vector of $C'$ is $|h.w.'\rangle = G_{r-n}\dots G_{r-1}\,|E(r,s,\frac{n-r}{s})\rangle^-$ and we have $|conv\rangle = G_0\dots G_{r-n-1}\,|h.w.'\rangle$ whenever $n < r$; on the other hand, $|h.w.'\rangle = G_{r-n}\dots G_{-1}\,|conv\rangle$ whenever $n > r$. Hence, in the case where $n < r$, the module $C'$ is generated from $|conv\rangle$ by the action of the $N = 2$ generators, whereas in the $n > r$ case, $|conv\rangle$ generates $C$. There exists the minimal submodule $N$ such that $C \subset N$ (it is possible that $N \approx V_{h^-(r,s,t(r,s,n)),\,t(r,s,n)}$). Now, let us take the quotient of $N$ over the submodule generated from all conventional singular vectors. In the case where $n > r$, this is the quotient over the maximal submodule of $N$, therefore the quotient is irreducible and, thus, there are no subsingular vectors in $V_{h^-(r,s,t(r,s,n)),\,t(r,s,n)}$. On the other hand, in the case where $n < r$, this quotient contains the highest-weight vector of $C$ and, therefore, is reducible. The explicit formula (3.3) is obvious from the analysis of extremal diagrams (see (3.7) and below), however it can also be checked by direct calculations that the vector (3.3) satisfies conventional highest-weight conditions (Eqs. (2.8) with $\theta = 0$) modulo descendants of $|conv\rangle$:
\[
H_1\,|Sub\rangle = G_0\dots G_{r-n-2}\,G_{r-n}\dots G_{r-1}\,\bigl|E(r,s,\tfrac{n-r}{s})\bigr\rangle^-
= (-1)^{r-n-1}\,\frac{r-n}{2s(n+1)}\,Q_{-r+n+1}\,|conv\rangle, \tag{3.6}
\]

and similarly for the other annihilation conditions. Finally, for modules $V_{h,t}$ with $h$ and $t$ not as in the proposition, the topological singular vectors and the conventional one are dense $G/Q$-descendants of each other. Therefore, they generate the same submodule and there are no subsingular vectors in those cases.

The above proof and the construction of subsingular vectors can be illustrated in the following extremal diagram:

[Extremal diagram (3.7) of the module with highest-weight state $|h^-(r,s,t(r,s,n)),\,t(r,s,n)\rangle$, with marked points 1–7 and the conventional singular vector $|conv\rangle$ marked by ×.]

(3.7)


Here, the line 1–2–3–4–5 is the extremal diagram$^7$ of $C$. The “cusp” of this diagram is at 4, i.e., this point represents topological singular vector (2.34) that satisfies the twisted topological highest-weight conditions with the twist parameter $\theta = r$. Further, the topological singular vector $|E(n,1,t(r,s,n))\rangle^{+,r}$ in the submodule is represented by the point 2. Then, consider any state in the section of the extremal diagram of the submodule between 1 and 2 (for instance, the conventional singular vector $|conv\rangle$ marked with a ×). The dense $Q$-descendants of this state terminate at 2: $Q_{n-r}\cdot|E(n,1,t(r,s,n))\rangle^{+,r} = 0$. Applying instead a one-lower mode of $Q$ to $|E(n,1,t(r,s,n))\rangle^{+,r}$ and then constructing dense $Q$-descendants, one spans the line 2–6, as $\dots Q_{n-r-2}\,Q_{n-r-1}\cdot|E(n,1,t(r,s,n))\rangle^{+,r}$. Thus neither the states around 1 on the solid inner parabola, nor $|conv\rangle$ generate the maximal submodule, whereas the state $|E(r,s,t(r,s,n))\rangle^-$ at 4 does (in particular, $|E(n,1,t(r,s,n))\rangle^{+,r}$ is a dense $G$-descendant of $|E(r,s,t(r,s,n))\rangle^-$: $2 = G_{r-n}\dots G_{r-1}\cdot 4$). Thus, the states 3–5–… (along with infinitely many other, non-extremal, states), even though inside a proper submodule of $V_{h^-(r,s,t(r,s,n)),\,t(r,s,n)}$, are not in the submodule generated by the top-level singular vector $|conv\rangle$. In the quotient module over the topological Verma submodule generated by $|conv\rangle$ (or, equivalently, by the state $|E(n,1,t(r,s,n))\rangle^{+,r}$ at 2), the state 3 (such that $2 = G_{r-n}\cdot 3$) satisfies twisted topological highest-weight conditions (with the twist parameter $\theta = r-n$). Acting on this state with $G_{r-n-1}$, $G_{r-n-2}$, …, gives the state $|Sub\rangle$ at 7, which satisfies the untwisted massive highest-weight conditions as long as the state $|conv\rangle$ is factored away.
To avoid a possible misunderstanding, let us point out once again that the structure of submodules described above is in fact the same as that of the $\widehat{s\ell}(2)$ Verma module $M_{j^-(r,s,t(r,s,n)-2),\,t(r,s,n)-2}$, with the $\widehat{s\ell}(2)$ singular vectors corresponding to the topological singular vectors that necessarily satisfy twisted topological highest-weight conditions. On the contrary, the conventional $N = 2$ singular vectors are not the counterparts of the $\widehat{s\ell}(2)$ highest-weight states. This is what leads one to observing subsingular vectors in the conventional approach (whereas in the corresponding $\widehat{s\ell}(2)$ Verma module, there are no reasons altogether to define singular vectors as a counterpart of the $N = 2$ top-level representatives of extremal diagrams, hence no subsingular vectors there).

3.2. Massive singular vectors in codimension 2. We now turn to massive Verma modules. In codimension 2, three cases from the list on page 144 are arranged into the three following theorems (3.4, 3.6, and 3.8), while Propositions 3.5, 3.7, and 3.9 are given in order to make contact with the conventional description in terms of top-level, untwisted, representatives of singular vectors and, accordingly, in terms of subsingular vectors; we show why the subsingular vectors appear and how they can be constructed explicitly. The following observations are central for the subsequent constructions:

Lemma 3.3. Let $U \equiv U_{h,\ell,t}$ be a massive Verma module.

i) Let $U \supset U'$ and $U \supset C$, where $U'$ is a massive Verma submodule and $C$ is a twisted topological Verma module generated from a charged singular vector in $U$ such that for any twisted topological Verma module $C''$, $U \supset C'' \supset C$, it follows that $C'' = C$. Then there exists a twisted topological Verma module $C' = U' \cap C \neq \{0\}$. Moreover, the embeddings $C' \subset C$ and $C' \subset U'$ are given by a topological singular vector in $C$ and by a charged singular vector in $U'$ respectively.

7 Here and in what follows, extremal diagrams of the type of (2.6), (2.9), etc., are shown schematically as parabolas, rather than “discrete approximations” thereof.


ii) Conversely, if $U \supset U'$, where $U'$ is a massive Verma module, and $U' \supset C'$, where $C'$ is a submodule generated from a charged singular vector in $U'$, then $U \supset C$, where $C$ is a submodule generated from a charged singular vector in $U$. Moreover, $C$ is maximal ($U \supset C'' \supset C \implies C'' = C$), $C' \subset U' \cap C$, and $C'$ is generated from a topological singular vector in the topological Verma module $C$.

iii) If $U \supset C'$, where $C'$ is a twisted topological Verma module, there exists a twisted topological Verma submodule $C \subset U$ such that the embedding is given by the charged singular vector, $C$ is maximal ($U \supset C'' \supset C \implies C'' = C$), and $C' \subset C$ (with the embedding given by a topological singular vector).

Proof. The lemma can be illustrated by

[Diagram of embeddings: $C' \to U'$, $C' \to C$, $U' \to U$, $C \to U$.]

As regards item i), let us assume the contrary, namely that $U' \cap C = \{0\}$. We then take the quotient $Q = U/C$, which is a twisted topological Verma module. It should contain all of the extremal states of the massive Verma module $U'$, however some of these states are clearly outside the extremal diagram of $Q$ according to their bigrading. Thus, $U' \cap C = C' \neq \{0\}$. Further, it follows from Theorem 2.4 that the embedding $C' \subset C$ is given by a topological singular vector in $C$. In the extremal diagram of $U'$, choose a state $|\star'\rangle$ from which all of the $U'$ module is generated and consider the dense $G/Q$-descendant $|top'\rangle$ of $|\star'\rangle$ with the minimal number of the $G$ or $Q$ modes among those dense $G/Q$-descendants that belong to $C'$. Such a state necessarily exists, since otherwise the quotient $U/C$ would contain extremal states of $U'$ that lie outside the module $U/C$. The state $|top'\rangle$ satisfies the twisted topological highest-weight conditions and the module $C'$ is generated from $|top'\rangle$. Therefore, $|top'\rangle$ coincides with a topological singular vector in $C$ and is at the same time a charged singular vector in $U'$. This completes the proof of i).

To prove ii), let us fix extremal states $|\star\rangle$ in $U$ and $|\star'\rangle$ in $U'$ such that $U$ and $U'$ are generated from $|\star\rangle$ and $|\star'\rangle$ respectively. The highest-weight vector $|top'\rangle$ of $C'$ is a dense $G/Q$-descendant of $|\star'\rangle$. Now, a charged singular vector exists in the extremal diagram of the submodule whenever $l^\pm(r,s,h,t) = l_{ch}(N, h \mp rs, t)$ (see Eqs. (2.39) and (2.48)), which implies a similar relation for the dimension $l(r,s,h,t)$ of the massive Verma module $U_{h,l(r,s,h,t),t}$. Assuming, for definiteness, that $|top'\rangle$ is a dense $G$-descendant of $|\star'\rangle$, we thus see that there exists a state $|top\rangle$ such that it is a dense $G$-descendant of $|\star\rangle$, satisfies twisted topological highest-weight conditions, and is not a dense $G$-descendant of any other state satisfying twisted topological highest-weight conditions. Let us consider the module $C$ generated from $|top\rangle$ and take the quotient $Q = U/C$, which is a twisted topological Verma module. By analyzing the bigradings, it is easy to see that some of the dense $G$-descendants of $|\star'\rangle$ lie outside the extremal diagram of $Q$, therefore these dense $G$-descendants belong to $C$. Therefore, $C \cap C' \neq \{0\}$. If $C \cap C' \neq C'$, the module $Q$ contains the submodule $C'/(C \cap C')$ which is not a twisted topological Verma module. This contradicts Theorem 2.4. Therefore, $C \cap C' = C'$ and $C' \subset U' \cap C$,


whence also follows the fact that the embedding $C' \subset C$ is given by the topological singular vector. The proof of iii), which is rather tedious, is relegated to the Appendix.

The lemma is used in the following theorem, which describes the occurrence of (at least) two different massive singular vectors in the massive Verma module $U_{h,\ell,t}$. This is case 4 of the list on page 144 and it corresponds to rational $t$:

Theorem 3.4. The highest-weight of the massive Verma module $U_{h,\ell,t}$ belongs to the set $O_{mm}$ if and only if $\ell = l(r,s,h,t)$, where
\[
(r,s,h,t)\in(\mathbb{N}\times\mathbb{N}\times\mathbb{C}\times\mathbb{Q})\setminus
\Bigl(\Bigl\{\bigl(r,\, s,\, \pm s - \tfrac{2n-1\pm r}{p/q},\, \tfrac{p}{q}\bigr)\Bigm| r,s\in\mathbb{Z},\ n\in\mathbb{Z},\ p\in\mathbb{Z},\ q\in\mathbb{N}\Bigr\}\cup Y\Bigr)
\]
with $Y$ as in (2.43). Then,

1. any primitive submodule of $U_{h,\ell,t}$ is a massive Verma module generated from the representative $|S(a,b,h,\frac{p}{q})\rangle^-$ of a massive singular vector, where $a,b\in\mathbb{N}$ is a solution to $\ell = l(a,b,h,\frac{p}{q})$; equivalently, that submodule is also generated from the $|S(a,b,h,\frac{p}{q})\rangle^+$ representative of the massive singular vector with the same $a$ and $b$.

2. The structure of $U_{h,l(r,s,h,\frac{p}{q}),\frac{p}{q}}$ is determined by the structure of the topological Verma module $V_{h^-(r,s,\frac{p}{q}),\frac{p}{q}}$ in the following way:

(a) for any massive Verma submodule $U' \subset U_{h,l(r,s,h,\frac{p}{q}),\frac{p}{q}}$ generated from a massive singular vector, there exists a submodule in $V_{h^-(r,s,\frac{p}{q}),\frac{p}{q}}$ generated from a topological singular vector $e = E^\pm(a,b,\frac{p}{q})\,|h^-(r,s,\frac{p}{q}),\frac{p}{q}\rangle_{top}$, $a,b\in\mathbb{N}$, such that $U'$ is generated from the massive singular vector
\[
g(-ab,\ \mp a + \theta^-(a,b,h,\tfrac{p}{q}) - 1)\; E^{\pm,\theta^-(a,b,h,\frac{p}{q})}(a,b,\tfrac{p}{q})\; g(\theta^-(a,b,h,\tfrac{p}{q}), -1)\,\bigl|h, l(r,s,h,\tfrac{p}{q}), \tfrac{p}{q}\bigr\rangle; \tag{3.8}
\]

(b) conversely, for any singular vector in $V_{h^-(r,s,\frac{p}{q}),\frac{p}{q}}$ of the form $e = |E(a,b,\frac{p}{q})\rangle^+$ with $a \ge 1$, $b \ge 2$, or $e = |E(a,b,\frac{p}{q})\rangle^-$ with $a,b \ge 1$, there exists a massive singular vector constructed as in (3.8) that generates a massive Verma submodule $U' \subset U_{h,l(r,s,h,\frac{p}{q}),\frac{p}{q}}$;

(c) two different singular vectors $e_1$ and $e_2$ in $V_{h^-(r,s,\frac{p}{q}),\frac{p}{q}}$ correspond in this way to the same massive Verma submodule $U' \subset U_{h,l(r,s,h,\frac{p}{q}),\frac{p}{q}}$ if and only if one of the $e_i$ is the $|E(c \ge 1, 1, \frac{p}{q})\rangle^+$ singular vector in the module generated from the other.

(As before, $E^{\pm,\theta}(r,s,t)$ is the spectral flow transform of the topological singular vector operator read off from (2.33) and (2.34); primitive refers to a submodule that is not a sum of other submodules.)

Proof. By definition, the highest-weight of the Verma module $U_{h,\ell,t}$ belongs to $O_{mm}$ whenever each of the states (2.38) admits two topological singular vectors and none of the solutions to Eq. (2.37) is an integer: $\theta', \theta'' \notin \mathbb{Z}$. The last condition reformulates as the constraint $\frac{1}{2}\bigl(1 - ht \pm \sqrt{4\ell t + (ht-1)^2}\bigr) \notin \mathbb{Z}$. On the other hand, analysing all possible cases where the states (2.38) have the specified number of singular vectors for negative rational $t = -\frac{p}{q}$ requires analysing the embedding diagrams of the auxiliary


topological Verma modules, these embedding diagrams being isomorphic to the standard embedding diagrams of $\widehat{s\ell}(2)$ Verma modules. This gives the values of $(r,s,h,t)$ as in the theorem. Further, by the definition of $O_{mm}$, there are no charged singular vectors in $U_{h,\ell,t}$, therefore taking into account Lemma 3.3 we obtain that each submodule is generated from (2.44) as well as from (2.45).

Part 2 becomes obvious from the explicit formulae for massive singular vectors (2.44) and (2.45). Let us consider, for definiteness, Eq. (2.44). The part $g(\theta^-(r,s,h,\frac{p}{q}),-1)\cdot|h, l(r,s,h,\frac{p}{q}), \frac{p}{q}\rangle$ of the formula represents the highest-weight vector of the twisted topological Verma module $V_{h^-(r,s,\frac{p}{q}),\frac{p}{q};\,\theta^-(r,s,h,\frac{p}{q})}$. Now, let us take any element $|\nu\rangle$ from the extremal diagram of the massive submodule which satisfies the twisted massive highest-weight conditions with the twist $\theta = \nu$. The operator $g(\mp a + \theta^-(r,s,h,\frac{p}{q}),\ \nu - 1)$ maps the state $|\nu\rangle$ into the module $V_{h^-(r,s,\frac{p}{q}),\frac{p}{q};\,\theta^-(r,s,h,\frac{p}{q})}$. The image of $|\nu\rangle$ under this mapping is a twisted topological singular vector referred to in Part 2(a). The fact that $e_1$ is the $|E(c \ge 1, 1, \frac{p}{q})\rangle^+$ singular vector built on $e_2 = |E(a,b,\frac{p}{q})\rangle^\pm$ means that $e_1 = g(\mp a - c,\ \mp a - 1)\,|E(a,b,\frac{p}{q})\rangle^\pm$. Then Part 2(c) follows from the identity
\[
\begin{aligned}
&g(-ab,\ \mp a - c + \theta^-(a,b,h,\tfrac{p}{q}) - 1)\cdot g(\mp a - c + \theta^-(a,b,h,\tfrac{p}{q}),\ \mp a + \theta^-(a,b,h,\tfrac{p}{q}) - 1)\; E^{\pm,\theta^-(a,b,h,\frac{p}{q})}(a,b,\tfrac{p}{q})\cdot g(\theta^-(a,b,h,\tfrac{p}{q}), -1)\,\bigl|h, l(r,s,h,\tfrac{p}{q}), \tfrac{p}{q}\bigr\rangle\\
&\qquad = g(-ab,\ \mp a + \theta^-(a,b,h,\tfrac{p}{q}) - 1)\; E^{\pm,\theta^-(a,b,h,\frac{p}{q})}(a,b,\tfrac{p}{q})\; g(\theta^-(a,b,h,\tfrac{p}{q}), -1)\,\bigl|h, l(r,s,h,\tfrac{p}{q}), \tfrac{p}{q}\bigr\rangle,
\end{aligned} \tag{3.9}
\]

where we used Eqs. (2.13). Let us point out once again that the constructions of the type of $g(-ab,\ \mp a + \theta - 1)\,E^{\pm,\theta}(a,b,t)\,g(\theta,-1)\,|h,\ell,t\rangle$ in Eq. (3.8) evaluate as Verma module elements using the formulae of Sect. 2.

If we recall that the structure of topological Verma modules is equivalent [FST] to the structure of the standard $\widehat{s\ell}(2)$ Verma modules, we see that the modules $U_{h,\ell,t}$ with $(h,\ell,t) \in O_{mm}$, too, have essentially the same (familiar) structure as the $\widehat{s\ell}(2)$ Verma modules with a rational level $k = t - 2$. In the present case, restricting oneself to only top-level singular vectors is innocuous$^8$:

Proposition 3.5. Under the conditions of Theorem 3.4, the quotient of $U_{h,l(r,s,h,\frac{p}{q}),\frac{p}{q}}$ with respect to the conventional singular vectors is irreducible, i.e., no subsingular vectors exist in the massive Verma module $U_{h,l(r,s,h,\frac{p}{q}),\frac{p}{q}}$.

Indeed, in this case there are no charged singular vectors in $U_{h,l(r,s,h,\frac{p}{q}),\frac{p}{q}}$, therefore each of the extremal states of the submodule is a dense $G/Q$-descendant of any other extremal state of the same submodule. Thus, each element of the extremal diagram of the submodule generates the same module.

8 We remind the reader that, when we are talking about subsingular vectors, these are understood in the setting where the conventional definition of singular vectors is adopted, i.e., only top-level representatives of extremal diagrams are “allowed” to generate submodules.

Next is the case where $U_{h,\ell,t}$ contains two charged singular vectors neither of which is a descendant of the other, i.e. the extremal diagram contains two states that satisfy twisted


topological highest-weight conditions and lie on the different sides of the highest-weight vector. That is, the extremal diagram has two branching points similar to that in (2.10), but on the different sides of $|h,\ell,t\rangle$. This is case 5 in the list on page 144:

Theorem 3.6. The highest-weight of the massive Verma module $U_{h,\ell,t}$ belongs to the set $O_{cc}$ if and only if $h = h_{cc}(n,m,t)$ and $\ell = l_{cc}(n,m,t)$, where
\[
\begin{gathered}
h_{cc}(n,m,t) = \tfrac{1}{t}\,(1 - m - n),\qquad l_{cc}(n,m,t) = -\frac{mn}{t},\\
(n,m,t)\in\bigl(\mathbb{N}\times(-\mathbb{N}_0)\times(\mathbb{C}\setminus\mathbb{Q})\bigr)\cup
\Bigl\{\bigl(n', m', -\tfrac{p}{q}\bigr)\Bigm| n'\in\mathbb{N},\ m'\in-\mathbb{N}_0,\ p,q\in\mathbb{N},\ 1\le n'-m'\le q\Bigr\}.
\end{gathered} \tag{3.10}
\]
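The parametrization (3.10) can be cross-checked against the charged condition $\ell = n\bigl(h + \frac{n-1}{t}\bigr)$ of Theorem 2.13: with $h = h_{cc}(n,m,t)$, both $n$ and $m$ should give the same value $\ell = l_{cc}(n,m,t)$. The sketch below performs this check in exact arithmetic; the function names are illustrative only.

```python
from fractions import Fraction


def l_ch(n, h, t):
    # Charged series, Eq. (2.39): l_ch(n, h, t) = n (h + (n - 1)/t).
    return n * (Fraction(h) + Fraction(n - 1) / Fraction(t))


def h_cc(n, m, t):
    # Eq. (3.10): h_cc(n, m, t) = (1 - m - n)/t.
    return Fraction(1 - m - n) / Fraction(t)


def l_cc(n, m, t):
    # Eq. (3.10): l_cc(n, m, t) = -m n / t.
    return -Fraction(m * n) / Fraction(t)


def two_charged_conditions_hold(n, m, t):
    # Both integers n >= 1 and m <= 0 satisfy the charged condition at once.
    h = h_cc(n, m, t)
    target = l_cc(n, m, t)
    return l_ch(n, h, t) == target and l_ch(m, h, t) == target
```

The check succeeds for every integer pair $(n, m)$ and every nonzero rational $t$, since $h_{cc} + \frac{n-1}{t} = -\frac{m}{t}$ and $h_{cc} + \frac{m-1}{t} = -\frac{n}{t}$.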

Then the massive Verma module $U_{h_{cc}(n,m,t),\,l_{cc}(n,m,t),\,t}$ contains two twisted topological Verma submodules $C_1 \approx V_{\frac{m-n-1}{t},\,t;\,-m}$ and $C_2 \approx V_{\frac{n+1-m}{t},\,t;\,-n}$ generated from the charged singular vectors $|E(m, h_{cc}(n,m,t), t)\rangle_{ch}$ and $|E(n, h_{cc}(n,m,t), t)\rangle_{ch}$ respectively. The maximal submodule in $U_{h_{cc}(n,m,t),\,l_{cc}(n,m,t),\,t}$ is $C_1 \cup C_2$, and $C_1 \cap C_2 = 0$.

Proof. The definition of the set $O_{cc}$ implies that both solutions $\theta'$ and $\theta'' = -\theta' + ht - 1$ of Eq. (2.37) are integers, whence the conditions $h = h_{cc}(n,m,t)$ and $\ell = l_{cc}(n,m,t)$ follow. Then the singular vectors referred to in the theorem are all singular vectors in the module $U_{h,\ell,t}$, since, in accordance with Lemma 3.3, any other submodule in $U_{h,\ell,t}$ would have non-empty intersections with $C_1$ and $C_2$, which would then be generated from singular vectors in $C_1$ and $C_2$. But by the definition of the set $O_{cc}$, the modules $C_1$ and $C_2$ both are irreducible.

This case is still harmless if one wishes to work with only the top-level representatives of extremal diagrams of submodules. Since there are only two singular vectors in $U_{h_{cc}(n,m,t),\,l_{cc}(n,m,t),\,t}$, we immediately obtain

Proposition 3.7. Under conditions of Theorem 3.6, the quotient of $U_{h_{cc}(n,m,t),\,l_{cc}(n,m,t),\,t}$ with respect to the conventional singular vectors is irreducible, i.e., there are no subsingular vectors in the massive Verma module $U_{h_{cc}(n,m,t),\,l_{cc}(n,m,t),\,t}$.

The third possibility in codimension 2, as described in case 6 of the list on page 144, is when one of the states (2.38) belongs to the original module $U_{h,\ell,t}$, hence there is a charged singular vector in $U_{h,\ell,t}$. The submodule $C$ generated from the charged singular vector contains a singular vector. One of the possibilities is that this is a second charged singular vector in $U_{h,\ell,t}$, situated on the same side from the highest-weight vector as the first charged singular vector.
Otherwise, the submodule generated from the singular vector in $C$ corresponds to a massive Verma submodule in $U_{h,\ell,t}$ in accordance with Lemma 3.3.

Theorem 3.8. The highest-weight of the massive Verma module $U_{h,\ell,t}$ belongs to the set $O_{cm}$ if and only if $h = h^\sigma_{cm}(r,s,n,t)$, $\ell = l^\sigma_{cm}(r,s,n,t)$, where $\sigma\in\{-,+\}$,
\[
\begin{gathered}
h^\pm_{cm}(r,s,n,t) = \frac{1-2n}{t} \pm \Bigl(s - \frac{r}{t}\Bigr),\qquad
l^\pm_{cm}(r,s,n,t) = \frac{n}{t}\,\bigl(-n \pm (st - r)\bigr),\\
(\sigma,r,s,n,t)\in\bigl(\{\pm\}\times\mathbb{N}\times\mathbb{N}\times\mathbb{Z}\times(\mathbb{C}\setminus\mathbb{Q})\bigr)\cup A\cup B,
\end{gathered} \tag{3.11}
\]
where


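\[
\begin{aligned}
A &= \bigl\{(\{+\}, r, 0, n, t)\bigm| r\in\mathbb{N},\ n\in\mathbb{N},\ t\in\mathbb{C}\setminus\mathbb{Q}\bigr\}\cup
\bigl\{(\{-\}, r, 0, n, t)\bigm| r\in\mathbb{N},\ n\in-\mathbb{N}_0,\ t\in\mathbb{C}\setminus\mathbb{Q}\bigr\},\\
B &= \bigl\{(\{+\}, r, s, n, -\tfrac{p}{q})\bigm| n\in\mathbb{N},\ p,q\in\mathbb{N},\ 1\le r\le p,\ 0\le s\le q-1\bigr\}\cup
\bigl\{(\{-\}, r, s, n, -\tfrac{p}{q})\bigm| n\in-\mathbb{N}_0,\ p,q\in\mathbb{N},\ 1\le r\le p,\ 0\le s\le q-1\bigr\}.
\end{aligned} \tag{3.12}
\]
Then, $U_{h^\pm_{cm}(r,s,n,t),\,l^\pm_{cm}(r,s,n,t),\,t}$ contains a twisted topological Verma submodule $C \approx V_{h^\pm(r,s,t),\,t;\,-n}$ generated from the charged singular vector $|E(n, h^\pm_{cm}(r,s,n,t), t)\rangle_{ch}$. For $s \neq 0$, further, $U_{h^\pm_{cm}(r,s,n,t),\,l^\pm_{cm}(r,s,n,t),\,t}$ contains a massive Verma submodule $U'$ generated from the massive singular vector $|S(r,s,h^\pm_{cm}(r,s,n,t),t)\rangle^+$ (if $n \ge 1$) or $|S(r,s,h^\pm_{cm}(r,s,n,t),t)\rangle^-$ (if $n \le 0$), where $r,s \ge 1$. The maximal submodule in $U_{h^\pm_{cm}(r,s,n,t),\,l^\pm_{cm}(r,s,n,t),\,t}$ is $U' \cup C$. The intersection $U' \cap C$ is a twisted topological Verma module generated from the topological singular vector in $C$ given by
\[
|T\rangle^\pm =
\begin{cases}
E^{\pm,-n}\bigl(r,\ s + \tfrac{1}{2} \pm \tfrac{1}{2},\ t\bigr)\,\bigl|E(n, h^\pm_{cm}(r,s,n,t), t)\bigr\rangle_{ch}, & n\in\mathbb{N},\\[2pt]
E^{\pm,-n}\bigl(r,\ s + \tfrac{1}{2} \mp \tfrac{1}{2},\ t\bigr)\,\bigl|E(n, h^\pm_{cm}(r,s,n,t), t)\bigr\rangle_{ch}, & n\in-\mathbb{N}_0.
\end{cases} \tag{3.13}
\]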

When $s = 0$, the corresponding state (3.13) is a topological singular vector in $C$ and, at the same time, a charged singular vector in $U_{h,\ell,t}$.

Proof. The definition of $O_{cm}$ requires that precisely one of the solutions of Eq. (2.37) be an integer, whence the existence of the submodule $C$ follows. Further, by the definition of $O_{cm}$, the module $C$ contains precisely one singular vector. This is the vector (3.13). If this vector were $|E^{+}(r,1,t)\rangle$ for $n > 0$ or $|E^{-}(r,1,t)\rangle$ for $n \le 0$, it would be a second charged singular vector in $U_{h,\ell,t}$. Then, by the conditions of the theorem and Lemma 3.3, there are no other submodules in $U_{h,\ell,t}$. If the singular vector (3.13) is not one of the above, we see from Lemma 3.3 that any other submodule in $U_{h,\ell,t}$ is a massive Verma module that has a nontrivial intersection with $C$. However, any submodule in $U_{h,\ell,t}$ can intersect $C$ only over the submodule generated from the only singular vector (3.13) in $C$. Thus, $U_{h,\ell,t}$ can contain only one massive submodule. Finally, the state $|S(r,s,h^{\pm}_{cm}(r,s,n,t),t)\rangle^{+}$ in the case of $n \ge 1$ or $|S(r,s,h^{\pm}_{cm}(r,s,n,t),t)\rangle^{-}$ in the case of $n \le 0$ generates this submodule, because the respective state satisfies the (twisted) massive highest-weight conditions and does not belong to $C$, since $C$ contains no states with the same grading as that of the respective $|S(\ldots)\rangle^{\pm}$ state$^{9}$.

$^{9}$ On the other hand, for $n > 0$ for example, the vector $|S(r,s,h^{-}_{cm}(r,s,n,t),t)\rangle^{-}$ belongs to the topological Verma submodule $C'$ generated from the highest-weight state (3.13) whenever $n \le r(s+1)$, in which case it is then a dense G-descendant of $|T\rangle^{-}$: $|S(r,s,h^{-}_{cm}(r,s,n,t),t)\rangle^{-} = G_{-rs} \ldots G_{r-n-1}\,|T\rangle^{-}$, and similarly for $n \le 0$ with $+ \leftrightarrow -$.

The situation described in the Theorem is illustrated in the following extremal diagram (choosing, for definiteness, $n > 0$ and the "−" case in (3.11)):

[Extremal diagram (3.14), not recoverable from the extracted text. Labelled states: $|E(n)\rangle_{ch}$, $|T\rangle^{-}$, $|S(r,s)\rangle^{-}$, $|S(r,s)\rangle^{+}$; marked lengths: $n$ and $r$.] (3.14)

Here, the vector $|S(r,s)\rangle^{+} \equiv |S(r,s,h^{-}_{cm}(r,s,n,t),t)\rangle^{+}$ is a representative of the massive singular vector in the sense of Definition 2.9, since its dense G/Q-descendants generate the entire extremal diagram of the massive Verma submodule $U'$. According to (3.13), the twisted topological highest-weight state $|T\rangle^{-}$ is the embedding of the topological singular vector $|E(r,s,t)\rangle^{-}$ into the submodule built on the charged singular vector $|E(n)\rangle_{ch} \equiv |E(n, h^{-}_{cm}(r,s,n,t), t)\rangle_{ch}$. The diagram shows the case where $n \le r(s+1)$ and, thus, the vector $|S(r,s)\rangle^{-} \equiv |S(r,s,h^{-}_{cm}(r,s,n,t),t)\rangle^{-}$ belongs to the topological Verma submodule $C'$ generated from the highest-weight state (3.13). In particular, its dense G/Q-descendants do not generate the same diagram as the dense G/Q-descendants of $|S(r,s,h^{-}_{cm}(r,s,n,t),t)\rangle^{+}$. Together, the vectors $|E(n, h^{-}_{cm}(r,s,n,t), t)\rangle_{ch}$ and $|S(r,s,h^{-}_{cm}(r,s,n,t),t)\rangle^{+}$ generate a maximal submodule. The submodules generated by each of these vectors intersect over the submodule generated from $|T\rangle^{-}$.

In this case, when one wishes to work with only the conventional, top-level, representatives of singular vectors, one has to pay the price of considering subsingular vectors. Their positions and explicit constructions are a direct consequence of the above analysis. In the following proposition, we thus assume the conventional definition of singular vectors, demanding that these always satisfy the "untwisted" highest-weight conditions. Then, the subsingular vectors are as follows:

Proposition 3.9. Under the conditions of Theorem 3.8, the massive Verma module $U_{h,\ell,t}$ contains a subsingular vector if and only if

1. either $r \ge n > 0$ and $h = h^{-}_{cm}(r,s,n,t)$, $s \ge 1$,
2. or $n \le 0$, $r \ge |n| + 1$, and $h = h^{+}_{cm}(r,s,n,t)$, $s \ge 1$.

In the first case, the subsingular vector is given by

$$ \begin{aligned} |\mathrm{Sub}\rangle &= G_0 \ldots G_{-n+r}\; q(1-r+n,\, n+t-1)\; E^{+,-n+r-t}(r,s,t)\; q(n-r+t,\, 0)\; |h^{-}_{cm}(n,r,s,t),\, l^{-}_{cm}(n,r,s,t),\, t\rangle \\ &= G_0 \ldots G_{r-n-1}\, G_{r-n+1} \ldots G_{rs-1}\; |S(r,s,h^{-}_{cm}(n,r,s,t),t)\rangle^{+}. \end{aligned} \tag{3.15} $$

This vector (which has the relative charge 1) becomes singular in the module obtained by taking the quotient over the submodule generated by the (top-level) singular vector

$$ |s\rangle = G_0 \ldots G_{r-n-1}\; E^{-,-n}(r,s,t)\; G_{-n} \ldots G_{-1}\; |h^{-}_{cm}(n,r,s,t),\, l^{-}_{cm}(n,r,s,t),\, t\rangle. \tag{3.16} $$

In the case where $n \le 0$, $r \ge |n| + 1$, similarly,

$$ |\mathrm{Sub}\rangle = Q_1 \ldots Q_{r+n-1}\, Q_{r+n+1} \ldots Q_{rs}\; |S(r,s,h^{+}_{cm}(n,r,s,t),t)\rangle^{-}. \tag{3.17} $$

Let us remind the reader that, as in the general construction (2.44), (2.45) of $N = 2$ singular vectors, the state $q(1-r+n,\, n+t-1)\, E^{+,-n+r-t}(r,s,t)\, q(n-r+t,\, 0)\, |h,\ell,t\rangle$ in (3.15) evaluates as an element of $U_{h,\ell,t}$ using the formulae of Sect. 2, see also [ST2]. Recall also that, as before, $E^{\pm,\theta}(r,s,t)$ are topological singular vector operators transformed by the spectral flow with the parameter $\theta$.

Proof. Consider, for definiteness, case 1 of the proposition. Then, the module $U_{h,\ell,t}$ contains only three submodules $C \subset U_{h,\ell,t}$, $U' \subset U_{h,\ell,t}$, and $C' \subset U'$, $C' \subset C$, where $C$ and $U'$ are as in Theorem 3.8 and $C' \approx V_{h^{-}_{cm}(r,s,n,t) + \frac{2}{t}(n-r),\; t;\; r-n}$. The embeddings are given by the singular vectors described in Theorem 3.8. The submodule $C'$ is embedded by the singular vector

$$ |T\rangle^{-} = E^{-,-n}(r,s,t)\; G_{-n} \ldots G_{-1}\; |h^{-}_{cm}(n,r,s,t),\, l^{-}_{cm}(n,r,s,t),\, t\rangle. \tag{3.18} $$

Obviously, $|T\rangle^{-}$ is inside the submodule generated from the charged singular vector

$$ |E(n, h^{-}_{cm}(n,r,s,t), t)\rangle_{ch} = G_{-n} \ldots G_{-1}\; |h^{-}_{cm}(n,r,s,t),\, l^{-}_{cm}(n,r,s,t),\, t\rangle. \tag{3.19} $$

Further, the top-level representative

$$ |c\rangle = Q_1 \ldots Q_{n-1}\; G_{-n} \ldots G_{-1}\; |h^{-}_{cm}(n,r,s,t),\, l^{-}_{cm}(n,r,s,t),\, t\rangle $$

of this charged singular vector generates the module $C$. In the conventional description, the existence of a subsingular vector in the module $U_{h,\ell,t}$ depends on whether the top-level representative $|s\rangle$ of the extremal diagram connecting $|T\rangle^{-}$ and $|S(r,s,h^{-}_{cm}(n,r,s,t),t)\rangle^{+}$ belongs to the submodule $C'$. It is clear that

$$ \begin{aligned} |s\rangle &= G_0 \ldots G_{r-n-1}\, |T\rangle^{-}, & r \ge n > 0 &\implies |s\rangle \in C', \\ |T\rangle^{-} &= G_{r-n} \ldots G_{-1}\, |s\rangle, & n \ge r > 0 &\implies |s\rangle \notin C', \end{aligned} \tag{3.20} $$

whence we see that the quotient of $U_{h,\ell,t}$ over conventional singular vectors is reducible in the case where $r \ge n > 0$. In terms of extremal diagrams, the conditions relating $n$ and $r$ mean that the twisted topological highest-weight state $|T\rangle^{-}$ in (3.14) has gone past the top of the "massive" parabola (i.e., past the conventional singular vector). Therefore, the extremal diagram actually takes the following form:

[Extremal diagram (3.21), not recoverable from the extracted text. Labelled states: $|c\rangle$, $|s\rangle$, $|T\rangle^{-}$, $|\mathrm{Sub}\rangle$, $|E(n)\rangle_{ch}$, $|S(r,s)\rangle^{-}$, $|S(r,s)\rangle^{+}$ and the state $\ast$; marked lengths: $n$ and $r$.] (3.21)

Here, the crosses denote the conventional, top-level, representatives, the • states satisfy twisted topological highest-weight conditions, $\ast$ is the state (a descendant of $|S(r,s,h^{-}_{cm}(n,r,s,t),t)\rangle^{+}$) such that $|T\rangle^{-} = G_{-n+r} \cdot (\ast)$, and $|E(n)\rangle_{ch} \equiv |E(n, h^{-}_{cm}(n,r,s,t), t)\rangle_{ch}$ (and, as before, we consider the case where $r \ge n > 0$). The arrow in the diagram, which represents the action of $G_{r-n}$, cannot be inverted because of the twisted topological highest-weight conditions at $|T\rangle^{-}$; therefore the dotted line cannot be reached by the action of elements of the $N = 2$ algebra on either $|T\rangle^{-}$ or the top-level representative $|s\rangle$ (nor, in fact, $|c\rangle$). Instead, acting with the highest of the modes of $Q$ that produces a non-vanishing result, one spans out the lower branch originating at $|T\rangle^{-}$, which is shown in the solid line. After taking the quotient with respect to the singular vector $|S(r,s)\rangle^{-} \equiv |S(r,s,h^{-}_{cm}(n,r,s,t),t)\rangle^{-}$ (or, equivalently, $|s\rangle$), we are left with the submodule whose extremal diagram is precisely the dotted line. Then the state $|\mathrm{Sub}\rangle$ (the top-level representative of this diagram) is a subsingular vector.

However, rather than describing the structure of $N = 2$ Verma modules in terms of subsingular vectors, it is much more convenient to construct those vectors that do generate maximal submodules. In (3.21), this is the canonical representative $|S(r,s)\rangle^{+} \equiv |S(r,s,h^{-}_{cm}(n,r,s,t),t)\rangle^{+}$ from (2.45).

3.3. Codimension-3 cases. Now we are going to analyze codimension-3 degenerations. Let us begin with the case when a further degeneration occurs in Theorem 3.4, as described in case 7 of the list on page 144. Namely, one more massive Verma submodule appears in the diagram of the type of (3.14), with its own topological point similar to $|T\rangle^{-}$. All such topological points are at the same time topological singular vectors in the submodule generated from a charged singular vector (Lemma 3.3).
In this way, the structure of submodules in the massive Verma module is still essentially described by that of its topological Verma submodule generated from a charged singular vector (while the structure of the topological Verma module, in turn, is the same as for the corresponding $\widehat{s\ell}(2)$ Verma module).


Theorem 3.10. The highest-weight of the massive Verma module $U_{h,\ell,t}$ belongs to the set $O_{cmm}$ if and only if $h = h^{\sigma}_{cm}(r,s,n,t)$, $\ell = l^{\sigma}_{cm}(r,s,n,t)$, where $\sigma \in \{-,+\}$ and

$$ (\sigma, r, s, n, t) \in \bigl(\{\pm\} \times \mathbb{N} \times \mathbb{N} \times \mathbb{Z} \times \mathbb{Q}\bigr) \cup A' \setminus B, \tag{3.22} $$

where $B$ is as in (3.12),

$$ A' = \bigl\{(\{+\}, r, 0, n, t) \bigm| r \in \mathbb{N},\ n \in \mathbb{N},\ t \in \mathbb{Q}\bigr\} \cup \bigl\{(\{-\}, r, 0, n, t) \bigm| r \in \mathbb{N},\ n \in -\mathbb{N}_0,\ t \in \mathbb{Q}\bigr\}, $$

and, with $t = \frac{p}{q}$,

$$ r - \frac{ps}{q} \notin \{|n|,\, |n|+1,\, |n|+2,\, \ldots\}. \tag{3.23} $$

Then, the structure of $U_{h,\ell,t}$ is described as follows:

1. there exists a twisted topological Verma submodule $C = V_{h^{\pm}(r,s,\frac{p}{q}),\,\frac{p}{q};\,-n} \hookrightarrow U_{h^{\pm}_{cm}(r,s,n,\frac{p}{q}),\,l^{\pm}_{cm}(r,s,n,\frac{p}{q}),\,\frac{p}{q}}$, where the embedding is given by singular vector (2.40);

2. any other primitive submodule in $U_{h^{\pm}_{cm}(r,s,n,\frac{p}{q}),\,l^{\pm}_{cm}(r,s,n,\frac{p}{q}),\,\frac{p}{q}}$ satisfies one of the following:

(a) it is a submodule in $C$ (hence, a twisted topological Verma module);

(b) it is a massive Verma module $U'$ generated from the representative $|S(r,s,h^{\pm}_{cm}(r,s,n,\frac{p}{q}),\frac{p}{q})\rangle^{+}$ (if $n \ge 1$) or $|S(r,s,h^{\pm}_{cm}(r,s,n,\frac{p}{q}),\frac{p}{q})\rangle^{-}$ (if $n \le 0$) of the massive singular vector, where $r, s \ge 1$. Then there exists a vector $|\widetilde{T}\rangle^{\pm} \in U'$ that satisfies twisted massive highest-weight conditions with the twist parameter $\theta = \mp r - n$ (if $n \le 0$) or $\theta = \mp r - n + 1$ (if $n > 0$) and such that the vector

$$ |T\rangle^{\pm} = \begin{cases} Q_{n \pm r}\, |\widetilde{T}\rangle^{\pm} = E^{\pm,-n}(r,\, s + \tfrac12 \mp \tfrac12,\, \tfrac{p}{q})\; Q_n \ldots Q_0\; |h^{\pm}_{cm}(n,r,s,\tfrac{p}{q}),\, l^{\pm}_{cm}(n,r,s,\tfrac{p}{q}),\, \tfrac{p}{q}\rangle, & n \le 0, \\ G_{-n \mp r}\, |\widetilde{T}\rangle^{\pm} = E^{\pm,-n}(r,\, s + \tfrac12 \pm \tfrac12,\, \tfrac{p}{q})\; G_{-n} \ldots G_{-1}\; |h^{\pm}_{cm}(n,r,s,\tfrac{p}{q}),\, l^{\pm}_{cm}(n,r,s,\tfrac{p}{q}),\, \tfrac{p}{q}\rangle, & n \ge 1, \end{cases} \tag{3.24} $$

satisfies twisted topological highest-weight conditions and generates the twisted topological Verma module $C' = U' \cap C$.

3. For any twisted topological Verma submodule $C' \subset C$, there exists a massive Verma submodule $U' \subseteq U_{h,\ell,t}$ such that

(a) $C' \subset U' \cap C$ is a submodule in $U'$ corresponding to a charged singular vector;

(b) there exists a vector $|\widetilde{T}\rangle \in U'$ that generates $U'$ such that the vector $|T\rangle$ defined as in (3.24) generates a twisted topological Verma module that either coincides with $C'$ or contains $C'$ as a submodule generated from the topological singular vector $|E(a,1,\frac{p}{q})\rangle^{+,-n}$, $a \in \mathbb{N}$, or $|E(a,1,\frac{p}{q})\rangle^{-,-n}$, $a \in \mathbb{N}$, for $n \ge 1$ and $n \le 0$ respectively.

In case 3 of the theorem, $U' = U_{h,\ell,t}$ occurs only when $s = 0$ (when there are two charged singular vectors on one side of the highest-weight state); otherwise $U' \subsetneq U_{h,\ell,t}$. For negative rational $t$ and small $r$ and $s$, the excluded cases where only one massive Verma submodule exists are those covered by Theorem 3.8. As in the above, $E^{\pm,\theta}(r,s,t)$ denote topological singular vector operators transformed by the spectral flow with the parameter $\theta$.

Proof. The definition of $O_{cmm}$ yields equations on $h$, $\ell$, and $t$ whose solutions are (3.22)–(3.23). Item 1 of the Theorem is a part of the definition of $O_{cmm}$. Further, by Lemma 3.3, each twisted topological Verma submodule is embedded into the module $C$ by a topological singular vector. Each massive submodule has a non-empty intersection with $C$. This intersection is generated from the topological singular vector written on the RHS of (3.24). The crucial point in the Theorem is the existence of the $|\widetilde{T}\rangle^{\pm}$ states. For definiteness, we choose $n > 0$ and the "−" case in (3.24). We then apply the $g$ operator of length $-1$ to the twisted topological highest-weight state $|T(r,s,n,t)\rangle \equiv |T\rangle^{-}$ from (3.24),

$$ |\widetilde{T}\rangle^{-} \equiv |\widetilde{T}(r,s,n,t)\rangle = g(r-n+1,\, r-n-1)\, |T(r,s,n,t)\rangle = g(r-n+1,\, r-n-1)\; E^{-,-n}(r,s,t)\; |E(n, h^{-}_{cm}(r,s,n,t), t)\rangle_{ch}, \tag{3.25} $$

in accordance with the rules of Sect. 2. The condition for the $|\widetilde{T}(r,s,n,t)\rangle$ state to exist in the module $U_{h^{-}_{cm}(n,r,s,t),\,l^{-}_{cm}(n,r,s,t),\,t}$ is given by the next lemma, from which we see that, under the conditions of the theorem, $f(r,s,n,t) \neq 0$ and therefore the $|\widetilde{T}\rangle$ state does exist.

Lemma 3.11. The state $|\widetilde{T}(r,s,n,t)\rangle$, Eq. (3.25), exists in the massive Verma module $U_{h^{-}_{cm}(r,s,n,t),\,l^{-}_{cm}(r,s,n,t),\,t}$ with $1 \le n \le r(s+1)$ if and only if $f(r,s,n,t) \neq 0$, where

$$ f(r,s,n,t) = \begin{cases} \prod_{i=0}^{2r-n} (st + n - r + i), & n \le 2r, \\ 1, & n \ge 2r + 1. \end{cases} \tag{3.26} $$
This state is then a representative of the massive singular vector; further, the dense Q-descendant of $|S(r,s,h^{-}_{cm}(r,s,n,t),t)\rangle^{+}$ that lies in the same grade as (3.25) is proportional to that vector:

$$ G_{r-n+1} \ldots G_{rs-1}\; |S(r,s,h^{-}_{cm}(r,s,n,t),t)\rangle^{+} = a(r,s,n,t)\; |\widetilde{T}(r,s,n,t)\rangle, \tag{3.27} $$

where $a(r,s,n,t)$ is

2 rs t

times a polynomial of the order r(s + 1) in t.
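The existence criterion of Lemma 3.11 is easy to evaluate for concrete parameters. The following is a minimal sketch in exact rational arithmetic, assuming integer $r, s, n$ and rational $t$; the function names `f` and `excluded_by_3_23` are hypothetical and are introduced here only for illustration. It evaluates the product (3.26) and checks the remark made before the lemma, namely that a vanishing factor $st + n - r + i = 0$ forces $r - st \in \{n, n+1, \ldots\}$, which condition (3.23) forbids.

```python
from fractions import Fraction
from math import prod


def f(r: int, s: int, n: int, t: Fraction) -> Fraction:
    """Existence factor f(r, s, n, t) of Eq. (3.26)."""
    if n >= 2 * r + 1:
        return Fraction(1)
    # Product over i = 0, ..., 2r - n of (s t + n - r + i).
    return prod((s * t + n - r + i for i in range(2 * r - n + 1)),
                start=Fraction(1))


def excluded_by_3_23(r: int, s: int, n: int, t: Fraction) -> bool:
    """True when r - s t lies in {|n|, |n|+1, ...}, i.e. when (3.23) fails.

    Hypothetical helper: it only restates condition (3.23) with t = p/q.
    """
    x = r - s * t
    return x.denominator == 1 and x >= abs(n)


if __name__ == "__main__":
    # r - s t = 1 = n, so (3.23) fails and one factor of f vanishes.
    print(f(2, 1, 1, Fraction(1)), excluded_by_3_23(2, 1, 1, Fraction(1)))
    # A non-integer value of r - s t: (3.23) holds and f is nonzero.
    print(f(2, 1, 1, Fraction(1, 2)))
```

Using `Fraction` rather than floating point keeps the test for a vanishing factor exact, which matters because the lemma is a statement about exact zeros.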

Proof. To evaluate (3.25), one uses formulae (2.25) and then, as negative-length $g$ operators reach the highest-weight state, one applies the formula

$$ g(\theta_1,\, \theta - 1)\, |h,\ell,t;\theta\rangle = \frac{1}{2\,\widehat{\ell}(h,\ell,t,\theta - \theta_1 + 1)}\; Q_{-\theta_1 + 1}\; g(\theta_1 - 1,\, \theta - 1)\, |h,\ell,t;\theta\rangle \tag{3.28} $$

(which also follows from Sect. 2) with

$$ \widehat{\ell}(h,\ell,t,N) = \ell - l_{ch}(N,h,t). \tag{3.29} $$

In the case at hand, we further use the fact that

$$ \widehat{\ell}\bigl(h^{-}_{cm}(r,s,n,t),\, l^{-}_{cm}(r,s,n,t),\, t,\, -i\bigr) = -\frac{1}{t}\,(i+n)(st + n - r + i). \tag{3.30} $$
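The identity (3.30) can be spot-checked numerically. The sketch below is illustrative only and rests on stated assumptions: $h^{\pm}_{cm}$ and $l^{\pm}_{cm}$ are taken as in (3.11), read as $h^{\pm}_{cm} = \frac{1-2n}{t} \pm (s - \frac{r}{t})$ and $l^{\pm}_{cm} = \frac{n}{t}(-n \pm (st - r))$, and the function $l_{ch}$, which is defined in Sect. 2 and not restated in this section, is assumed to have the form $l_{ch}(N,h,t) = N\,(h + \frac{N-1}{t})$. All Python names are hypothetical.

```python
from fractions import Fraction


def h_cm(sign: int, r: int, s: int, n: int, t: Fraction) -> Fraction:
    """h^{+-}_cm(r, s, n, t) as read from Eq. (3.11); sign is +1 or -1."""
    return Fraction(1 - 2 * n) / t + sign * (s - Fraction(r) / t)


def l_cm(sign: int, r: int, s: int, n: int, t: Fraction) -> Fraction:
    """l^{+-}_cm(r, s, n, t) as read from Eq. (3.11); sign is +1 or -1."""
    return Fraction(n) / t * (-n + sign * (s * t - r))


def l_ch(N: int, h: Fraction, t: Fraction) -> Fraction:
    # ASSUMPTION: this form of l_ch is not given in the present section.
    return N * (h + Fraction(N - 1) / t)


def l_hat(h: Fraction, l: Fraction, t: Fraction, N: int) -> Fraction:
    """Eq. (3.29): l_hat(h, l, t, N) = l - l_ch(N, h, t)."""
    return l - l_ch(N, h, t)


def rhs_3_30(r: int, s: int, n: int, t: Fraction, i: int) -> Fraction:
    """Right-hand side of Eq. (3.30)."""
    return -(i + n) * (s * t + n - r + i) / t


if __name__ == "__main__":
    r, s, n, t = 3, 2, 2, Fraction(5, 7)
    h, l = h_cm(-1, r, s, n, t), l_cm(-1, r, s, n, t)
    for i in range(4):
        print(i, l_hat(h, l, t, -i) == rhs_3_30(r, s, n, t, i))
```

Under the same assumption on $l_{ch}$, one also finds $l^{\pm}_{cm}(r,s,n,t) = l_{ch}(n, h^{\pm}_{cm}(r,s,n,t), t)$, which is the statement that the module carries the charged singular vector labelled by $n$.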

A simple analysis of the relative charge of $|\widehat{S}^{-}(r,s,h^{-}_{cm}(r,s,n,t),t)\rangle$ shows that $f(r,s,n,t)$ is precisely the function responsible for the existence of $|\widetilde{T}(r,s,n,t)\rangle$ because, for $t \neq 0$, the relevant factors from the denominators are precisely the above $f(r,s,n,t)$, whence the lemma follows.

While this case is rather straightforward when described in terms of singular vectors that generate maximal submodules, the analysis of the same structure in terms of top-level singular vectors, and of the subsingular vectors that then become necessary, is quite lengthy when it comes to listing all possible occurrences of subsingular vectors. Comparing (3.14) and (3.21), we have seen that the appearance of subsingular vectors in the conventional setting is due to the fact that the twisted topological highest-weight state is shifted to a certain side (depending on the sign, etc., of the parameters) of the top-level vector in the extremal diagram of the submodule. In the present case, however, there are two independent massive subdiagrams in the extremal diagram, each with its own "topological point". The description in terms of conventional singular vectors and subsingular vectors would then amount to classifying all possible relative positions of the topological points and top-level vectors of the parabolas. Although this presents no conceptual difficulties and can be carried out similarly to Proposition 3.9, there is a large number of different cases. We omit this analysis, since it adds nothing to Theorem 3.10 as regards the structure of submodules and is too long to serve as an example.

It remains to consider the case where, in addition to the conditions of Theorem 3.6, the submodules generated from the charged singular vectors, in their own turn, admit topological singular vectors.
Then, the corresponding twisted topological Verma submodules may be such that a given massive Verma submodule may be embedded into a direct sum of two such twisted topological Verma submodules. The corresponding massive singular vector then "splits" into a pair of singular vectors, each of which belongs to the respective twisted topological Verma submodule. In the restricted setting with only top-level representatives allowed to generate submodules, this can be observed as the occurrence of two linearly independent singular vectors in the same grade$^{10}$ or as the appearance of a singular vector and a subsingular vector in the same grade.

$^{10}$ The occurrence of linearly independent singular vectors in the same grade was noticed for the first time in [D]. As we are going to see, they are necessarily elements of twisted topological, not massive, Verma submodules.

As in Theorem 3.6, we assume for definiteness that the integers labelling the charged singular vectors in the massive Verma module $U_{h,\ell,t}$ are such that $n > 0$ and $m \le 0$. By the distance between any two vectors on the same extremal diagram we mean the difference of their $U(1)$ charges. Then the distance between the charged singular vectors in $U_{h_{cc}(n,m,t),\,l_{cc}(n,m,t),\,t}$ is equal to $-m + n + 1$. Now we are ready to describe case 8 of the list on page 144, i.e., the coexistence of a massive singular vector with two charged singular vectors on different sides of the highest-weight vector:

Lemma 3.12. The highest-weight of the massive Verma module $U_{h,\ell,t}$ belongs to the set $O_{ccm}$ if and only if

$$ h = h_{cc}(n,m,t), \qquad \ell = l_{cc}(n,m,t), \qquad t = \frac{r \pm (n-m)}{s}, \qquad r, s, n \in \mathbb{N}, \quad m \in -\mathbb{N}_0. \tag{3.31} $$

Then, the massive Verma module $U_{h,\ell,t}$ contains twisted topological submodules $C_1$ and $C_2$ generated from the charged singular vectors $|E(n, \frac{(1-m-n)s}{r \mp (m-n)}, \frac{r \pm (n-m)}{s})\rangle_{ch}$ and $|E(m, \frac{(1-m-n)s}{r \mp (m-n)}, \frac{r \pm (n-m)}{s})\rangle_{ch}$ respectively. Each of the modules $C_1$ and $C_2$ admits a singular vector; moreover, a singular vector $|E(a,b,t)\rangle^{\mp,-n}$ exists in $C_1$ if and only if $|E(a,b,t)\rangle^{\pm,-m}$ exists in $C_2$ with the same $a, b \in \mathbb{N}$ (and with the above $t$). Any other primitive submodule $U' \subset U_{h,\ell,t}$ satisfies one of the following:

1. it is a twisted topological Verma module, in which case it is a submodule of either $C_1$ or $C_2$;
2. it is a massive Verma module, in which case the non-empty intersections $C_1' = U' \cap C_1$ and $C_2' = U' \cap C_2$ are generated each from a topological singular vector in $C_1$ and $C_2$ respectively. If, then, $C_1'$ is generated from the singular vector $|E(a,b,t)\rangle^{\pm,-n}$, then $C_2'$ is generated from the singular vector $|E(a,b,t)\rangle^{\mp,-m}$ with the same $a, b \in \mathbb{N}$ (and with the above $t$). Each of the submodules $C_1'$ and $C_2'$ is at the same time generated from a charged singular vector in $U'$.

Proof. The definition of $O_{ccm}$ means that both solutions of Eq. (2.37) are integers of different signs and, in addition, the states (2.38) admit a topological singular vector, whence (3.31) follows. Further, each of the modules $C_1$ and $C_2$ generated from the charged singular vectors contains a topological singular vector. Now assuming that the highest-weight vector of $C_1$ is $|h^{\pm}(a,b,t), t; -n\rangle_{top}$, we see that the highest-weight vector of $C_2$ is $|h^{\mp}(a,b,t), t; -m\rangle_{top}$. Therefore the assertion that a singular vector $|E(a,b,t)\rangle^{\mp,-n}$ exists in $C_1$ if and only if $|E(a,b,t)\rangle^{\pm,-m}$ exists in $C_2$ with the same $a, b \in \mathbb{N}$ is obvious. Note further that the quotient of $U_{h,\ell,t}$ over $C_1$ or $C_2$ is a twisted topological Verma module $Q_1$ or $Q_2$ respectively. Let us assume that there exists a twisted topological highest-weight state $|t\rangle$ such that $|t\rangle \notin C_1$, $|t\rangle \notin C_2$. Then this state is a topological singular vector in $Q_1$ and, likewise, in $Q_2$. However, the Verma module generated from $|t\rangle$ contains states in the gradings where there are no states from either the $Q_1$ or the $Q_2$ module. Therefore, the state $|t\rangle$ belongs to a twisted topological Verma module and, at the same time, generates a submodule which is isomorphic to the quotient of a twisted topological Verma module. This contradicts the structure of the topological Verma modules described in Theorem 3.1. Part 2 follows immediately from Lemma 3.3.

In the case described in the lemma, therefore, a given massive Verma submodule $U'$ in $U_{h,\ell,t}$ necessarily has two charged singular vectors lying on different sides of the highest-weight vector of $U'$. It may be useful to recall the diagram (3.14), where the topological singular vector $|T\rangle^{-}$ is, at the same time, a charged singular vector in the massive Verma module whose extremal diagram is the parabola connecting $|S(r,s)\rangle^{-}$ and $|S(r,s)\rangle^{+}$. In the present case, we have two topological points on the extremal diagram of any massive Verma submodule, which are the highest-weight states of the modules $C_1'$ and $C_2'$:

[Embedding diagram (3.32), not recoverable from the extracted text. Modules shown: $U$, $C_1$, $U'$, $C_2$, $C_1'$, $C_2'$.] (3.32)


Conversely, let us be given any topological singular vector in $C_1$,

$$ |e_1\rangle = E^{\pm,-n}(a,b,t)\; G_{-n} \ldots G_{-1}\; |h_{cc}(n,m,t),\, l_{cc}(n,m,t),\, t\rangle \tag{3.33} $$

(with $t$ as in the lemma). Then we find the corresponding singular vector in $C_2$:

$$ |e_2\rangle = E^{\mp,-m}(a,b,t)\; Q_m \ldots Q_0\; |h_{cc}(n,m,t),\, l_{cc}(n,m,t),\, t\rangle. \tag{3.34} $$

Now the question is whether a massive submodule $U'$ exists in the Verma module under consideration such that (3.33) and (3.34) would be charged singular vectors in that massive Verma submodule. This is not always the case. In more detail, the structure of the module $U$ is described in the following

Theorem 3.13. Under the conditions of Lemma 3.12,

1. Whenever $t = \frac{-m+n+r}{s}$, the module $C_1$ contains the state

$$ |e_1\rangle = E^{+,-n}(r,\, s+1,\, t)\; G_{-n} \ldots G_{-1}\; |h_{cc}(n,m,t),\, l_{cc}(n,m,t),\, t\rangle \tag{3.35} $$

that satisfies twisted topological highest-weight conditions. Let then

$$ |e_2\rangle = E^{-,-m}(r,\, s+1,\, t)\; Q_m \ldots Q_0\; |h_{cc}(n,m,t),\, l_{cc}(n,m,t),\, t\rangle \tag{3.36} $$

be the singular vector in $C_2$ whose existence is claimed in the lemma. There exist states $|\widetilde{T}\rangle^{+}$ and $|\widetilde{T}\rangle^{-}$ in $U_{h_{cc}(n,m,t),\,l_{cc}(n,m,t),\,t}$ such that

$$ G_{-r-n}\, |\widetilde{T}\rangle^{+} = |e_1\rangle, \qquad Q_{-r+m}\, |\widetilde{T}\rangle^{-} = |e_2\rangle. \tag{3.37} $$

Each of these two states generates the same massive Verma submodule $U' \subset U_{h,\ell,t}$, in which $|e_1\rangle$ and $|e_2\rangle$ are charged singular vectors.

2. Whenever $t = \frac{m-n+r}{s}$, the module $C_1$ contains the state

$$ |e_1\rangle = E^{-,-n}(r,s,t)\; G_{-n} \ldots G_{-1}\; |h_{cc}(n,m,t),\, l_{cc}(n,m,t),\, t\rangle \tag{3.38} $$

that satisfies twisted topological highest-weight conditions. Let then $|e_2\rangle$ be a singular vector in $C_2$ whose existence is claimed in the lemma. Then,

(a) if $2r + m - n \le -1$, there exists a massive Verma submodule $U' \subset U_{h,\ell,t}$ generated from any of the states $|\widetilde{v}_1\rangle$ or $|\widetilde{v}_2\rangle$ such that

$$ G_{r-n}\, |\widetilde{v}_1\rangle = |e_1\rangle, \qquad Q_{r+m}\, |\widetilde{v}_2\rangle = |e_2\rangle, \tag{3.39} $$

and, further, $|e_1\rangle$ and $|e_2\rangle$ are charged singular vectors in $U'$, the distance between them being $-m + n + 1 - 2r$;

(b) if $2r + m - n \ge 0$, there does not exist a massive submodule $U'$ in $U_{h,\ell,t}$ in which either $|e_1\rangle$ or $|e_2\rangle$ would be a charged singular vector. If, further, $2r + m - n \ge 1$, then,

• for each $i$ from the range $i = 0, \ldots, 2r + m - n - 1$, the states $Q_{r+m-i} \ldots Q_{r+m-1}\, |e_2\rangle$ and $G_{-m-r+i+1} \ldots G_{r-n-1}\, |e_1\rangle$ satisfy twisted massive highest-weight conditions, are in the same grade and are linearly independent;

• the modules $C_1'$ and $C_2'$ generated by $|e_1\rangle$ and $|e_2\rangle$ respectively contain topological singular vectors $G_{-r-m} \ldots G_{r-n-1}\, |e_1\rangle$ and $Q_{-r+n} \ldots Q_{r+m-1}\, |e_2\rangle$, respectively.

Proof. In case 1 of the theorem, the states $|\widetilde{T}\rangle^{+}$ and $|\widetilde{T}\rangle^{-}$ can be written as

$$ |\widetilde{T}\rangle^{+} = g(-r-n+1,\, -r-n-1)\, |e_1\rangle, \qquad |\widetilde{T}\rangle^{-} = q(-r+m+1,\, -r+m-1)\, |e_2\rangle. \tag{3.40} $$

They exist as elements of $U_{h,\ell,t}$ in view of an argument similar to the one used in Lemma 3.11. The fact that $|\widetilde{T}\rangle^{+}$ and $|\widetilde{T}\rangle^{-}$ are dense G/Q-descendants of each other (up to a nonzero factor) is checked by quotienting $U_{h_{cc}(n,m,t),\,l_{cc}(n,m,t),\,t}$ over $C_1$ or $C_2$. The assumption that dense G/Q-descendants of $|\widetilde{T}\rangle^{+}$ and of $|\widetilde{T}\rangle^{-}$ in the same grading are linearly independent leads to a contradiction with the structure of the quotient $U_{h_{cc}(n,m,t),\,l_{cc}(n,m,t),\,t}/C_1$, which is a twisted topological Verma module. In case 2, Lemma 3.11 assures that the states

$$ |\widetilde{v}_1\rangle = g(r-n+1,\, r-n-1)\, |e_1\rangle, \qquad |\widetilde{v}_2\rangle = q(r+m+1,\, r+m-1)\, |e_2\rangle \tag{3.41} $$

exist in case (a) and do not exist in $U_{h,\ell,t}$ in case (b).
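The case distinction in Theorem 3.13 depends only on the integers $r, s, n, m$ and on the sign chosen in $t = \frac{r \pm (n-m)}{s}$, so it can be tabulated mechanically. The following is a minimal sketch of that bookkeeping, assuming $r, s, n \in \mathbb{N}$ and $m \in -\mathbb{N}_0$ as in Lemma 3.12; the function name `classify` and its return convention are hypothetical. Rejecting $t = 0$ is an additional assumption made here because the formulas of this section divide by $t$.

```python
from fractions import Fraction
from typing import Optional, Tuple


def classify(r: int, s: int, n: int, m: int,
             upper_sign: bool) -> Tuple[str, Fraction, Optional[int]]:
    """Return (case, t, number) following Theorem 3.13.

    upper_sign=True  selects t = (-m + n + r)/s  (case 1).
    upper_sign=False selects t = ( m - n + r)/s  (case 2a or 2b).
    The third entry is:
      None                in case 1,
      -m + n + 1 - 2r     in case 2a (distance between |e1> and |e2> in U'),
      2r + m - n          in case 2b (number of values of i in part (b)).
    """
    if not (r >= 1 and s >= 1 and n >= 1 and m <= 0):
        raise ValueError("expected r, s, n >= 1 and m <= 0")
    if upper_sign:
        return "1", Fraction(-m + n + r, s), None
    t = Fraction(m - n + r, s)
    if t == 0:
        # ASSUMPTION: t = 0 is excluded, since the formulas divide by t.
        raise ValueError("t = 0 is not allowed")
    k = 2 * r + m - n
    if k <= -1:
        return "2a", t, -m + n + 1 - 2 * r
    return "2b", t, k


if __name__ == "__main__":
    print(classify(1, 1, 3, 0, upper_sign=False))
    print(classify(2, 1, 1, 0, upper_sign=False))
    print(classify(1, 2, 1, 0, upper_sign=True))
```

In case 2a the returned distance is at least 2, which agrees with the requirement that $|e_1\rangle$ and $|e_2\rangle$ lie on different sides of the highest-weight vector of $U'$.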

Different cases in the theorem are thus distinguished by whether or not there exists a massive Verma submodule $U'$ such that its intersections with $C_1$ and $C_2$ coincide with the submodules $C_1' \subset C_1$ and $C_2' \subset C_2$ generated from the singular vectors $|e_1\rangle$ and $|e_2\rangle$ respectively. In cases 1 and 2a one has the embeddings as in (3.32), whereas in case 2b there is no massive Verma submodule $U'$ in which the highest-weight vectors of $C_1'$ and $C_2'$ would be charged singular vectors. Case 2b of the theorem can be illustrated as

[Extremal diagram (3.42), not recoverable from the extracted text. Labelled objects: the submodules $C_1$, $C_2$, $C_1'$, $C_2'$, the states $|e_1\rangle$ and $|e_2\rangle$, the numbered points 1–9, and crosses marking top-level representatives.] (3.42)


Here, 1 and 2 are the charged singular vectors in the massive Verma module $U$, which read $G_{-n} \ldots G_{-1} \cdot |\frac{(1-m-n)s}{m-n+r}, \frac{mns}{n-m-r}, \frac{m-n+r}{s}\rangle$ and $Q_m \ldots Q_0 \cdot |\frac{(1-m-n)s}{m-n+r}, \frac{mns}{n-m-r}, \frac{m-n+r}{s}\rangle$, respectively. The extremal diagrams of the corresponding twisted topological submodules in $U$ are labelled by $C_1$ and $C_2$ respectively. The top-level representatives (2.41) are marked with crosses. The extremal diagrams of the twisted topological Verma submodules generated from $|e_1\rangle$ and $|e_2\rangle$ respectively are given by 3–$|e_1\rangle$–4, with the cusp at $|e_1\rangle$, and by 5–$|e_2\rangle$–6, with the cusp at $|e_2\rangle$. Thus, neither $|e_1\rangle$ nor $|e_2\rangle$ alone generates all of the states in 3–$|e_2\rangle$–$|e_1\rangle$–5. Those states of the two submodules that lie between $|e_1\rangle$ and $|e_2\rangle$ are in the same grade and are linearly independent. They satisfy twisted massive highest-weight conditions and might thus be taken for two linearly independent massive singular vectors in the same grade (when, e.g., such $|e_1\rangle$ and $|e_2\rangle$ happen to lie on different sides of the top of the parabola, one would observe the pair of conventional singular vectors [D] at the top of the parabola). However, we have seen that each of the two linearly independent states in the same grade belongs in fact to its own twisted topological Verma submodule.

The state at $7 \in C_1'$ is yet another topological singular vector from part 2b of the theorem; in particular, $Q_{r+m}\, 7 = 0$. The state $|s_1\rangle \in C_1'$ (not indicated in the diagram), which is in the same grade as $|e_2\rangle \in C_2'$ but belongs to the other twisted topological submodule, is such that $G_{-r-m}\, |s_1\rangle = 7$. (Similarly, $9 \in C_2'$ is a topological singular vector as well.) The state 8 is the top-level representative of the extremal diagram generated by the topological singular vector $|e_2\rangle$, but is not in the module generated from $|e_1\rangle$.

Similarly, in case 2a of Theorem 3.13, we have the extremal diagram

[Extremal diagram (3.43), not recoverable from the extracted text. Labelled objects: the submodules $C_1$ and $C_2$, the states $|e_1\rangle$ and $|e_2\rangle$, the numbered points 3, 4, 5, 6, and crosses marking top-level representatives; marked lengths: $n$, $-m+1$, and $r$ (twice).] (3.43)

where 3–$|e_1\rangle$–4 is the extremal diagram of the twisted topological Verma submodule $C_1' = C_1 \cap U'$, and 5–$|e_2\rangle$–6, that of $C_2' = C_2 \cap U'$. There are $-m + n - 2r$ states between $|e_1\rangle$ and $|e_2\rangle$ that satisfy twisted massive highest-weight conditions but do not belong to either the $C_1'$ or the $C_2'$ submodule, nor, in fact, to either the $C_1$ or the $C_2$ submodule of $U_{h,\ell,t}$. Thus, these states survive in the quotient module with respect to $C_1$ and $C_2$. It is these states that generate the entire massive submodule 3–$|e_1\rangle$–$|e_2\rangle$–6, in which $|e_1\rangle$ and $|e_2\rangle$ are charged singular vectors.

As regards describing the above pictures in terms of only the top-level representatives of singular vectors and the subsingular vectors, one has to consider submodules of the submodules described above generated by the conventional, top-level, representatives of the extremal diagrams. The missing parts of the submodules will then be generated by subsingular vectors. For the "traditional" reasons, we now briefly describe the subsingular vectors "hidden" in the above pictures.

To begin with case 2b, consider what happens in (3.42) after factoring away the submodule generated from the top-level (×) representative of the extremal diagram 3–$|e_1\rangle$–4 of the $C_1'$ submodule. The vector $|s_1\rangle$ (not shown in (3.42)) that lies in the same grade as $|e_2\rangle \in C_2'$ but belongs to $C_1'$ then satisfies twisted topological highest-weight conditions, thus giving rise to (the extremal diagram of) a twisted topological Verma submodule. The top-level representative of this extremal diagram is then a subsingular vector in the conventional sense. This top-level representative would be at the same point as 8, the top-level representative of the topological singular vector $|e_2\rangle$ in $U$. Thus, the top-level representative of $|s_1\rangle$ does not belong to the submodule generated by the top-level representative of $C_1'$ because of the topological highest-weight conditions at 7. We also see that this subsingular vector lies in the same grade as 8. Thus, continuing with the conventional definition of singular vectors, a singular vector and a subsingular vector are in the same grade in the case under consideration. The crucial feature of this case is that the entire sections 7–$|e_1\rangle$ and $|e_2\rangle$–9 of the extremal diagrams of each of the topological submodules are on one side of the top of the parabola. Had these sections included the top-level representative, one would conclude that two conventional singular vectors exist in the same grade. Whether or not this is the case is determined by the parameters $r$, $s$, $m$, and $n$. As they change, one of these conventional singular vectors "submerges" and becomes subsingular.
From the above discussion we immediately obtain the following three propositions.

Proposition 3.14. Let the conditions of case 2b of Theorem 3.13 hold. Then, subsingular vectors exist in $U \equiv U_{\frac{(1-m-n)s}{m-n+r},\,\frac{mns}{n-m-r},\,\frac{m-n+r}{s}}$ if and only if either $r < -m + 1$ or $r < n$. In the first case, the subsingular vector reads

$$ \begin{aligned} |\mathrm{Sub}\rangle &= G_0 \ldots G_{-r-m-1} \cdot G_{-r-m+1} \ldots G_{r-n-1}\, |e_1\rangle \\ &= G_0 \ldots G_{-r-m-1} \cdot G_{-r-m+1} \ldots G_{r-n-1}\; E^{-,-n}\bigl(r, s, \tfrac{m-n+r}{s}\bigr)\; G_{-n} \ldots G_{-1}\; \bigl|\tfrac{(1-m-n)s}{m-n+r},\, \tfrac{mns}{n-m-r},\, \tfrac{m-n+r}{s}\bigr\rangle \end{aligned} \tag{3.44} $$

(where $E^{-,\theta}(r,s,t)$ is the spectral flow transform of the topological singular vector operator read off from (2.34)). This becomes singular in the quotient module $U/C_1''$, where $C_1''$ is the submodule generated by the top-level representative of the extremal diagram 3–7, which reads

$$ G_0 \ldots G_{r-n-1}\; E^{-,-n}\bigl(r, s, \tfrac{m-n+r}{s}\bigr)\; G_{-n} \ldots G_{-1}\; \bigl|\tfrac{(1-m-n)s}{m-n+r},\, \tfrac{mns}{n-m-r},\, \tfrac{m-n+r}{s}\bigr\rangle. $$

In the other case, the subsingular vector (in the conventional nomenclature) is given by an equally simple construction.

Similarly, describing case 2a of Theorem 3.13 in terms of only top-level representatives of singular vectors (and then, in terms of subsingular vectors), one would conclude that whenever $r \ge |m| + 1$ or $r \ge n$ the states between $|e_1\rangle$ and $|e_2\rangle$ in the diagram (3.43) are not generated from the top-level representatives. The top-level representative of the state $|\widetilde{v}_2\rangle$ (such that $Q_{r+m}\, |\widetilde{v}_2\rangle = |e_2\rangle$) gives a subsingular vector:


Proposition 3.15. Under the conditions of case 2a of Theorem 3.13, the state

$$ \begin{aligned} |\mathrm{Sub}\rangle &= Q_1 \ldots Q_{r+m-1}\, |\widetilde{v}_2\rangle = Q_1 \ldots Q_{r+m-1}\; q(r+m+1,\, r+m-1)\, |e_2\rangle \\ &= Q_1 \ldots Q_{r+m-1}\; q(r+m+1,\, r+m-1)\; E^{+,-m}\bigl(r, s, \tfrac{n-m+r}{s}\bigr)\; Q_m \ldots Q_0\; \bigl|\tfrac{(1-m-n)s}{n-m+r},\, \tfrac{mns}{m-n-r},\, \tfrac{n-m+r}{s}\bigr\rangle \end{aligned} \tag{3.45} $$

is a subsingular vector in the module $U \equiv U_{\frac{(1-m-n)s}{n-m+r},\,\frac{mns}{m-n-r},\,\frac{n-m+r}{s}}$ with $r \ge |m| + 1$. It becomes singular in the quotient module $U/(C_1 \cup C_2)$, where $C_1$ and $C_2$ are submodules in $U$ generated by the charged singular vectors $|E(n, \frac{(1-m-n)s}{n-m+r}, \frac{n-m+r}{s})\rangle_{ch}$ and $|E(m, \frac{(1-m-n)s}{n-m+r}, \frac{n-m+r}{s})\rangle_{ch}$ respectively.

The length-$(-1)$ operator $q(r+m+1,\, r+m-1)$ evaluates according to the rules given in Sect. 2. When $r \ge n$, the subsingular vector is built similarly starting from the vector $|\widetilde{v}_1\rangle = g(r-n+1,\, r-n-1)\, |e_1\rangle$.

Finally, even if one insists on considering only top-level representatives of singular vectors, there would be no subsingular vectors in case 1 of the theorem, because topological states in the extremal diagram of the submodule have relative charges of different signs, hence the entire maximal submodule can be generated from the top-level representative.

Proposition 3.16. Under the conditions of case 1 of Theorem 3.13, no subsingular vectors exist in $U_{h,\ell,t}$.

4. Concluding Remarks

We have analyzed the structure of $N = 2$ Verma modules and classified their degeneration patterns. We considered singular vectors that generate maximal submodules (and satisfy twisted highest-weight conditions), which has allowed us to describe the structure of submodules of $N = 2$ Verma modules in a setting which is free of subsingular vectors. However, in order to make contact with the approach existing in the literature, we have also shown how the description in terms of the conventional, "untwisted", singular vectors and the subsingular vectors, as well as general expressions for the subsingular vectors, follow from our approach and the expressions for the singular vectors satisfying the twisted highest-weight conditions. As we have seen, an important point about the structure of massive $N = 2$ Verma modules is that there are submodules of exactly two different types, the massive and the twisted topological ones (and, obviously, arbitrary sums thereof).
The existence of two types of submodules shows up also in the classification of the patterns describing possible sequences of submodules of submodules of a given N = 2 Verma module, which we have not considered yet. This amounts to finding embedding diagrams of N = 2 Verma modules. Using the singular vectors constructed in this paper, these would be embedding diagrams, i.e., those consisting only of mappings with trivial kernels. The sought sequence in which submodules may follow one another is determined by the degeneration patterns found in this paper. As regards the topological Verma modules, the answer is already known, since the corresponding embedding diagrams are isomorphic to those of $\widehat{s\ell}(2)$ Verma modules. Further, we have seen that some of the degeneration

patterns of massive N = 2 Verma modules are such that, again, the structure of submodules is determined by that of a certain topological (hence, $\widehat{s\ell}(2)$-) Verma module. It thus remains to analyze several cases where the known embedding diagrams are “glued together” to produce somewhat more complicated structures [SSi]. The classification of N = 2 embedding diagrams is, thus, a refinement of the classification of degeneration patterns presented in this paper.

In view of the relation existing between $\widehat{s\ell}(2|1)$ and N = 2 singular vectors [S2], it would also be interesting to see how $\widehat{s\ell}(2|1)$ subsingular vectors behave under the reduction [S2] to N = 2 Verma modules.

Acknowledgement. We are grateful to B. Feigin and V. Sirota for many helpful discussions. We thank I. Todorov for pointing the paper [Ga] out to us and F. Malikov for useful remarks. AMS thanks D. Leites, I. Shchepochkina, and V. Tolstoy for useful remarks. We are grateful to the Editor for the criticism that helped us to improve the presentation. This work was supported in part by the RFFI grant 96-01-00725. IYT was also supported in part by a Landau Foundation grant, and AMS, by grant 93-0633-ext from the European Community.

Appendix: The Proof of Part iii) of Lemma 3.3

This proof exploits heavily the properties of extremal diagrams of both the twisted topological and the massive Verma modules (in fact, the reader would find the proof easier to read if he draws the parabolas, dense descendants, etc., which we deal with in what follows). The idea of the proof is to demonstrate that the converse leads to a contradiction either with the “size” of twisted topological Verma modules (i.e., the appearance of states with bigradings outside the extremal diagram) or with the structure of submodules in twisted topological Verma modules (Theorem 3.1). This argument is applied several times to the topological Verma modules that are the quotients of the massive Verma module with respect to the topological Verma modules whose existence is established at previous steps of the proof.

To begin with, note that the module $C'$ cannot be embedded by charged singular vectors in more than one massive submodule in $U$. Indeed, the assumption that $U' \supset C' \subset U''$, where the embeddings are given by charged singular vectors, leads to the contradiction, because either $U'$ or $U''$ then necessarily has states in the gradings outside the module $U$. To see this, let $|v'\rangle$ be the highest-weight vector of $C'$. Let $|v'\rangle$ have the twist parameter $\theta'$ and lie in the bigrading $(\ell', h')$. Then, the extremal states of the massive Verma submodule that contains $|v'\rangle$ as one of its extremal states have to lie in bigradings $(\ell, h)$, all of which satisfy one and only one of the following equations:

$$\tfrac{1}{2}h^2 - \bigl(\tfrac{1}{2} + h' + \theta'\bigr)h + \tfrac{1}{2}h' + \tfrac{1}{2}h'^2 + \ell' + h'\theta' = \ell, \qquad \text{(A.1)}$$

$$\tfrac{1}{2}h^2 + \bigl(\tfrac{1}{2} - h' - \theta'\bigr)h - \tfrac{1}{2}h' + \tfrac{1}{2}h'^2 + \ell' + h'\theta' = \ell. \qquad \text{(A.2)}$$
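As a quick sanity check on the two parabolas, the following sketch (an illustration only, not part of the original proof) reads (A.1) as ½h² − (½ + h′ + θ′)h + ½h′ + ½h′² + ℓ′ + h′θ′ = ℓ and (A.2) as ½h² + (½ − h′ − θ′)h − ½h′ + ½h′² + ℓ′ + h′θ′ = ℓ. Under that reading, exact rational arithmetic confirms that both curves pass through the bigrading (ℓ′, h′) and that their levels differ by h − h′ at any other charge h. The function names and the sample values are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def level_a1(h, h0, l0, theta0):
    """Level l on the parabola (A.1), as read above, at charge h.

    h0, l0, theta0 stand for h', l', theta' of the state |v'>.
    """
    return (HALF * h * h - (HALF + h0 + theta0) * h
            + HALF * h0 + HALF * h0 * h0 + l0 + h0 * theta0)


def level_a2(h, h0, l0, theta0):
    """Level l on the parabola (A.2), as read above, at charge h."""
    return (HALF * h * h + (HALF - h0 - theta0) * h
            - HALF * h0 + HALF * h0 * h0 + l0 + h0 * theta0)


# Hypothetical sample data for the state |v'>.
h0, l0, theta0 = Fraction(3), Fraction(7), Fraction(-2)

# Both parabolas pass through the bigrading (l', h') of |v'>.
assert level_a1(h0, h0, l0, theta0) == l0
assert level_a2(h0, h0, l0, theta0) == l0

# Away from h = h' the two curves are separated by exactly h - h'.
for h in range(-5, 6):
    assert level_a2(h, h0, l0, theta0) - level_a1(h, h0, l0, theta0) == h - h0
```

Since the two curves meet only at h = h′, any other bigrading can lie on at most one of them, which is consistent with the phrase “one and only one” above.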

Now, the following alternative is satisfied:

– either infinitely many bigradings satisfying (A.1) lie outside the module U, in which case none of the bigradings satisfying (A.2) lie outside the module U,

– or infinitely many bigradings satisfying (A.2) lie outside the module U, in which case none of the bigradings satisfying (A.1) lie outside the module U

(the converse would contradict the fact that U is freely generated). Thus, we can associate with each state $|v'\rangle$ two sets of bigradings $(\ell_{1i}, h_{1i})$ and $(\ell_{2i}, h_{2i})$, each of which satisfies one and only one of Eqs. (A.1) and (A.2). Infinitely many bigradings from one set lie outside U, while all bigradings from the other set lie inside U. We will call bigradings from the latter set admissible with respect to the state $|v'\rangle$. Thus, there may exist at most one massive submodule $U'$ into which $C'$ is embedded by a charged singular vector and, moreover, all of the extremal states of $U'$ have admissible bigradings with respect to the highest-weight vector of $C'$ if such a $U'$ exists. In the case where there is such a massive Verma module, we are in the situation described in Part ii) of the lemma, using which iii) is proved. Consider, therefore, the case where

there does not exist a massive submodule $U' \supset C'$ such that the embedding is given by a charged singular vector. (A.3)

It is easy to see then that there exists a state $|y\rangle$ that satisfies the following properties (the proof of this statement is left to the reader as a useful exercise):

– $|y\rangle$ has an admissible bigrading with respect to $|v'\rangle$;

– $|y\rangle$ satisfies twisted topological highest-weight conditions;

– unless $|y\rangle = |v'\rangle$, the vector $|v'\rangle$ is a dense G/Q-descendant of $|y\rangle$, while $|y\rangle$ is not a dense G/Q-descendant of $|v'\rangle$;

– there are no states $|z\rangle$ with admissible bigradings with respect to $|y\rangle$ such that $|y\rangle$ is a dense G/Q-descendant of $|z\rangle$, while $|z\rangle$ is not a dense G/Q-descendant of $|y\rangle$.

It is clear that $|y\rangle$ generates a twisted topological Verma module $C'' \supseteq C'$ ($C'' = C'$ whenever $|y\rangle = |v'\rangle$) with the condition (A.3) satisfied for $C''$. Now, we will have proved iii) for the module $C'$ as soon as we prove iii) for $C''$. Let $|y\rangle$ have the twist parameter $\theta_y$ and the bigrading $(\ell_y, h_y)$.
Assume, for definiteness, that any admissible bigrading with respect to $|y\rangle$ satisfies (A.1) with $\theta' = \theta_y$ and $(\ell', h') = (\ell_y, h_y)$. Then, the fact that there are no states $|z\rangle$ with the properties as described above is equivalent to the fact that the expression $g(\theta_y + 1, \theta_y - 1)|y\rangle$ cannot be evaluated as a polynomial in the modes of Q, G, L, and H acting on the highest-weight vector of U. By lengthy but direct calculations with formulae from Sect. 2 one can show that $g(\theta_y + 1, \theta_y - 1)|y\rangle$ cannot be evaluated in this way if and only if there exists a twisted topological Verma module $C_1 \subset U$, where the embedding is given by a charged singular vector and such that $C_1$ is maximal ($U \supset C'' \supset C \Longrightarrow C'' = C$) and $C_1 \cap C'' = \{0\}$. Then, consider the quotient $Q_1 = U/C_1$. This is a twisted topological Verma module, which contains the topological singular vector $|y\rangle$. It follows by comparing the highest-weight parameters of $Q_1$ and $C_1$ that $C_1$ contains a topological singular vector $|x\rangle$. The bigrading of $|x\rangle$ is $(\ell_x, h_x) = (\ell_y + \theta_y, h_y - 1)$ and any bigrading admissible with respect to $|y\rangle$ is admissible with respect to $|x\rangle$, and vice versa. Let $C_1'$ be the twisted topological Verma module generated from $|x\rangle$. We have two possibilities:

a) there does not exist a massive submodule $U' \supset C_1'$, where the embedding is given by a charged singular vector;

b) there exists a massive submodule $U' \supset C_1'$, where the embedding is given by a charged singular vector.

In case a), we can apply to $C_1'$ the same reasoning as in the case of the $C''$ module. In this way, we see that a module $C_2 \subset U$ exists, where the embedding is given by a charged singular vector, $C_2$ is maximal ($U \supset C''' \supset C_2 \Longrightarrow C''' = C_2$), and $C_2 \cap C_1' = \{0\}$. It is also easy to see that $C_1 \cap C_2 = \{0\}$. Further, the quotient $U/C_2$ cannot contain all of the

extremal states of $C''$, since $U/C_2$ is a twisted topological Verma module, which already contains all of the extremal states of $C_1'$. Therefore, $C_2 \cap C'' = C_2' \neq \{0\}$ and the quotient $U/C_2$ contains the submodule $C''/C_2'$. However, this can happen only in the case where $C''/C_2' = \{0\}$ or, equivalently, $C'' \subset C_2$, from which iii) follows.

To complete the proof, it remains to consider case b). Let $U' \subset U$ be a massive submodule such that $|x\rangle$ is the charged singular vector in $U'$. This means that there exists a state $|z\rangle$ with an admissible bigrading with respect to $|x\rangle$ such that $|x\rangle$ is a dense G/Q-descendant of $|z\rangle$, whereas $|z\rangle$ is not a dense G/Q-descendant of $|x\rangle$. Since $|z\rangle$ has an admissible bigrading with respect to $|x\rangle$ as well as with respect to $|y\rangle$, and, also, $(\ell_x, h_x) = (\ell_y + \theta_y, h_y - 1)$, the state $|z\rangle$ can be represented in the form $|z\rangle = |w\rangle + a|u\rangle$, where $|u\rangle$ is a dense G/Q-descendant of $|y\rangle$ and $a \in \mathbb{C}$. Further, $|w\rangle = 0$ in the quotient $U/C_1$, since otherwise we are in contradiction with the structure of the twisted topological Verma module $U/C_1$. Thus, we see that either $|w\rangle \in C_1$ or $|w\rangle = 0$. However, $C_1$ cannot contain all of the bigradings that are admissible with respect to $|x\rangle$, therefore there exists $|z'\rangle \in C''$ such that it is a dense G/Q-descendant of $|z\rangle$. We now see that $U' \cap C'' \neq \{0\}$ or, equivalently, there exists $C \subseteq C''$ such that $C \subset U'$, where the embedding is given by a charged singular vector. From part ii) of the lemma, it follows that there exists $C_2 \subset U$, where $C_2$ is a submodule generated from a charged singular vector in U, $C_2$ is maximal ($U \supset C''' \supset C_2 \Longrightarrow C''' = C_2$) and $C \subset U' \cap C_2$. We now have $C_2 \cap C'' \neq \{0\}$, which allows us to repeat the arguments regarding taking the quotient and, thus, to obtain iii).

Note added in proof

The fact that the charged singular vectors do not generate the massive Verma modules was used in the recent paper [D2].
As we saw in Theorem 2.13, one can be considerably more precise by saying that this is a twisted topological Verma module with the twist parameter −n, where n labels the charged singular vector. Similarly with the statement of [D2] regarding the degenerate case with two linearly independent singular vectors in the same grade: as we have seen, such vectors generate a direct sum of two twisted topological Verma modules, which makes the “fermionic uncharged singular vectors” introduced in [D2] excessive. The conditions for the absence of subsingular vectors applied in that paper to the derivation of characters of unitary representations, are a particular case of conditions of Proposition 3.14.

References

[A]

Ademollo, M. et al: Dual String With U(1) Color Symmetry. Nucl. Phys. B111, 77 (1976); M. Ademollo et al, Dual String Models With Nonabelian Color and Flavor Symmetries. Nucl. Phys. B114, 297 (1976)

[BPZ]

Belavin, A.A., Polyakov, A.M. and Zamolodchikov, A.B.: Infinite Conformal Symmetry of Critical Fluctuations In Two-Dimensions. Nucl. Phys. B241, 333 (1984)

[BGG]

Bernshtein, A.B., Gelfand, I. and Gelfand, S.: Funk. An. Prilozh. 10, 1 (1976)

[BLNW]

Bershadsky, M., Lerche, W., Nemeschansky, D. and Warner, N.P.: Extended N = 2 Superconformal Structure of Gravity and W Gravity Coupled to Matter. Nucl. Phys. B401, 304 (1993)

[BFK]

Boucher, W., Friedan, D. and Kent, A.: Determinant Formulae and Unitarity For the N = 2 Superconformal Algebras In Two-Dimensions Or Exact Results On String Compactification. Phys. Lett. B172, 316 (1986)

[DVPYZ]

Di Vecchia, P., Petersen, J.L., Yu, M. and Zheng, H.B.: Phys. Lett. B174, 280 (1986)

[D]

Dörrzapf, M.: Analytic Expressions for the Singular Vectors of the N = 2 Superconformal Algebra. Commun. Math. Phys. 180, 195–232 (1996)


[D2]

Dörrzapf, M.: The embedding structure of unitary N = 2 minimal models. hep-th/9712165

[EY]

Eguchi, T. and Yang, S.-K.: N = 2 Superconformal Models as Topological Field Theories. Mod. Phys. Lett. A4, 1653 (1990)

[EHY]

Eguchi, T., Hosono, S. and Yang, S.-K.: Hidden Fermionic Symmetry in Conformal Topological Field Theories. Commun. Math. Phys. 140, 159 (1991)

[EG]

Eholzer, W. and Gaberdiel, M.R.: Unitarity of rational N = 2 superconformal theories. Commun. Math. Phys. 186, 61–85 (1997)

[FST]

Feigin, B.L., Semikhatov, A.M. and Tipunin, I.Yu.: Equivalence between Chain Categories of Representations of Affine $s\ell(2)$ and N = 2 Superconformal Algebras. hep-th/9701043

[FS]

Feigin, B.L. and Stoianovsky, A.V.: Functional Models of Representations of Current Algebras and Semi-infinite Schubert Cells. Funk. An. i ego Prilozh., 28(1), 68 (1994)

[FT]

Fradkin, E.S. and Tseytlin, A.A.: Quantization of Two-Dimensional Supergravity and Critical Dimensions for String Models. Phys. Lett. B106, 63 (1981); Anomaly Free Two-Dimensional Chiral Supergravity–Matter Models and Consistent String Theories Phys. Lett. B162, 295 (1985)

[FMS]

Friedan, D.H., Martinec, E.J. and Shenker, S.H.: Conformal Invariance, Supersymmetry and String Theory. Nucl. Phys. B271, 93 (1986)

[Ga]

Gannon, T.: U(1)m Modular Invariants, N=2 Minimal Models, and the Quantum Hall Effect. hep-th/9608063

[GRR]

Gato-Rivera, B. and Rosado, J.I.: Families of Singular and Subsingular Vectors of the Topological N = 2 Superconformal Algebra. hep-th/9701041

[GRS]

Gato-Rivera, B. and Semikhatov, A.M.: d ≤ 1 ∪ d ≥ 25 and W Constraints from BRST-Invariance in the c ≠ 3 Topological Algebra. Phys. Lett. B293, 72 (1992)

[G]

Gepner, D.: Phys. Lett. B199, 380 (1987); Nucl. Phys. B 296, 757 (1988)

[IK]

Ito, K. and Kanno, H.: Hamiltonian Reduction and Topological Conformal Algebra in c ≤ 1 Non-Critical Strings. Mod. Phys. Lett. A9, 1377 (1994)

[K2]

Kač, V.G.: Infinite Dimensional Lie Algebras. Cambridge: Cambridge University Press, 1990

[KS]

Kazama, Y. and Suzuki, H.: New N = 2 Superconformal Field Theories and Superstring compactification. Nucl. Phys. B321, 232 (1989)

[KM]

Kutasov, D. and Martinec, E.: New Principles for String/Membrane Unification. Nucl. Phys. B 477, 652 (1996); Kutasov, D., Martinec, E. and O’Loughlin, M.: Vacua of M-theory and N = 2 strings. Nucl. Phys. B 477, 675 (1996)

[LVW]

Lerche, W., Vafa, C. and Warner, N.P.: Chiral Rings In N=2 Superconformal Theories. Nucl. Phys. B324, 427 (1989)

[Ma]

Malikov, F.G.: Verma Modules over Rank-2 Kač–Moody Algebras. Algebra i Analiz, 2 No. 2, 65–84 (1990)

[MFF]

Malikov, F.G., Feigin, B.L. and Fuchs, D.B.: Singular Vectors in Verma Modules over Kač–Moody Algebras. Funk. An. Prilozh. 20 N2, 25 (1986)

[M]

Marcus, N.: A Tour through N = 2 strings. Talk at the Rome String Theory Workshop, 1992, hep-th/9211059

[OV]

Ooguri, H. and Vafa, C.: Geometry of N = 2 Strings. Nucl. Phys. B361, 469–518 (1991); N = 2 Heterotic Strings. Nucl. Phys. B367, 83–104 (1991)

[RCW]

Rocha-Caridi, A. and Wallach, N.R.: Highest Weight Modules over Graded Lie Algebras: Resolutions, Filtrations, and Character Formulas. Trans. Amer. Math. Soc. 277, 133–162 (1983)

[SS]

Schwimmer, A. and Seiberg, N.: Comments On the N = 2, N = 3, N = 4 Superconformal Algebras in Two-Dimensions. Phys. Lett. B184, 191 (1987)

[S1]

Semikhatov, A.M.: The MFF singular vectors in topological conformal theories. Mod. Phys. Lett. A9, 1867 (1994)

[S2]

Semikhatov, A.M.: Verma Modules, Extremal Vectors, and Singular Vectors on the Non-Critical N = 2 String Worldsheet. hep-th/9610084

[SSi]

Semikhatov, A.M. and Sirota, V.A.: Embedding Diagrams of N = 2 and Relaxed-$\widehat{s\ell}(2)$ Verma Modules. hep-th/9712102

[ST]

Semikhatov, A.M. and Tipunin, I.Yu.: Singular Vectors of the Topological Conformal Algebra. Int. J. Mod. Phys. A11, 4597 (1996)

[ST2]

Semikhatov, A.M. and Tipunin, I.Yu.: All Singular Vectors of the N = 2 Superconformal Algebra via the Algebraic Continuation Approach. hep-th/9604176

[W]

Witten, E.: Topological Sigma Models. Commun. Math. Phys. 118, 411 (1988); On the Structure of the Topological Phase of Two-Dimensional Gravity. Nucl. Phys. B340, 281–332 (1990)

Communicated by T. Miwa

Commun. Math. Phys. 195, 175 – 193 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

Diophantine Conditions Imply Critical Points on the Boundaries of Siegel Disks of Polynomials

James T. Rogers, Jr.

Department of Mathematics, Tulane University, New Orleans, LA 70118, USA. E-mail: [email protected]

Received: 10 May 1997 / Accepted: 12 November 1997

Abstract: Let f be a polynomial map of the Riemann sphere of degree at least two. We prove that if f has a Siegel disk G on which the rotation number satisfies a diophantine condition, then either the boundary B of G contains a critical point or B is a Lakes of Wada indecomposable continuum with one of the lakes containing a critical point. Consequently, if the boundary B of G has only 2 complementary domains, then B contains a critical point. We also show, without any assumption on the rotation number, that each proper nondegenerate subcontinuum of the boundary B of G is tree-like, and any other bounded complementary domain of B is a preperiodic component of the grand orbit of G. Finally, we establish some conditions under which B contains no periodic point.

Introduction

The main theorem of this paper is the following:

Theorem 0.1. If the polynomial f has a Siegel disk G with rotation number satisfying a diophantine condition, then either the boundary B of G contains a critical point or B is an indecomposable continuum with three properties:
(1) B has at least three complementary domains, and B is the boundary of each of them,
(2) each bounded complementary domain of B is a component of the grand orbit of G and so a bounded component of the Fatou set, and
(3) one of the bounded complementary domains of B contains a critical point.

An indecomposable continuum in C which is the common boundary of at least three complementary domains is called a Lakes of Wada continuum, and the bounded complementary domains are called lakes. Hence we could rephrase the conclusion of Theorem 0.1 to say that either the boundary B of G contains a critical point or B is a Lakes of Wada continuum with one of the lakes being a preperiodic component of the grand orbit of G that contains a critical point.


The results of this paper were announced in the Bulletin of the American Mathematical Society in 1995 (see [R4]). Theorem 0.1 corrects the statement appearing in [R4], in which we did not mention the second alternative. The possibility that B is such a Lakes of Wada continuum remains open; indeed whether such a Siegel disk can exist is unknown. Here are two corollaries to the main theorem.

Corollary 0.2. If the polynomial f has a Siegel disk G with boundary B, if G is the only bounded complementary domain of B, and if the rotation number of G satisfies a diophantine condition, then the boundary B of G contains a critical point.

Corollary 0.3. If the polynomial f has a Siegel disk G with boundary B, if all critical points (save the one at infinity) are in the Julia set, and if the rotation number of G satisfies a diophantine condition, then the boundary B of G contains a critical point.

This gives a new proof of the cases $z \to z^2 + c$ and $z \to z^n + c$, previously handled by Douady [D2] and Herman [H1].

Here is some history of the problem and an explanation of the essential role played by results of Herman. Fatou showed in 1920 that the boundary of a Siegel disk G of a rational function g of degree at least two is contained in the closure of the orbits of the critical points of g. The following question is natural:

Question. Does the boundary of a Siegel disk contain a critical point?

This question was the first question of the 1982 survey by Douady [D1, p. 140]. See also the survey of Lyubich [L, p. 77], and the works of Ghys [G] and Herman [H1, H2]. In 1986, M. Herman showed the answer is “no” in general by proving that there exists a quadratic polynomial with a Siegel disk whose boundary does not contain the critical point. This unexpected result forces the Julia set of such a quadratic polynomial to fail to be locally connected.
Herman’s example has a Siegel disk whose rotation number α fails to satisfy a diophantine condition: we say α satisfies a diophantine condition if there are numbers r > 0 and k ≥ 2 such that $|\alpha - p/q| > r/q^k$ for every rational number p/q. This means roughly that α is poorly approximated by rational numbers. In 1984, Ghys [G] showed that, under the assumption that α satisfies a diophantine condition, the answer is “yes”, provided the boundary of G is a Jordan curve. Herman [H1] generalized the Ghys result by proving the following theorem in 1985.

Theorem 0.4 (Herman). If the rotation number of G satisfies a diophantine condition, then the boundary of G contains a critical point, provided f is injective when restricted to the boundary of G.

In this paper, we contribute to this program for Siegel disks of certain complex polynomials by proving the following theorem.

Theorem 0.5. If the filled Siegel disk $\hat{G}$ of a polynomial f does not contain a critical point, then f is injective when restricted to the boundary of G.

Thus the following theorem has been proven.


Theorem 0.6. If the polynomial f has a Siegel disk G, and if the rotation number of G satisfies a diophantine condition, then the filled Siegel disk $\hat{G}$ contains a critical point.

The filled Siegel disk $\hat{G}$ is the union of the boundary B of the Siegel disk G with all the bounded complementary domains of B. The diophantine condition in Theorem 0.4 is only needed to apply the following theorem of Herman–Yoccoz:

Theorem 0.7. If g is an R-analytic diffeomorphism of $S^1$ of rotation number α satisfying a diophantine condition, then g is R-analytically conjugate on $S^1$ to the rotation $z \to \exp(2\pi i\alpha)z$.

Hence Theorem 0.6 holds under much weaker forms of the diophantine condition (see [Y]). As a tool to prove these theorems, we obtain the following information on the topology of the boundary of a Siegel disk. Note there is no assumption on the rotation number or on the critical points in this theorem.

Theorem 0.8. If B is the boundary of a Siegel disk G of a polynomial, then each proper nondegenerate subcontinuum of B is tree-like, and any other bounded complementary domain of B is a preperiodic component of the grand orbit of G. Consequently, while B separates C, no proper closed subset separates C. Moreover, if B has more than one bounded complementary domain, then B is a Lakes of Wada continuum.

We also show that if the restriction of f to the boundary B of G is injective, then B does not contain a periodic point. In particular, if a filled Siegel disk of a polynomial does not contain a critical point, then the boundary of the Siegel disk does not contain a periodic point. This gives a partial answer to a question of J. Milnor.
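The diophantine condition used throughout this Introduction, |α − p/q| > r/q^k for every rational p/q, can be probed numerically for a given α. The sketch below is only an illustration under stated assumptions: the function name `worst_scaled_error`, the choice of the golden mean as test number, and the cutoff on denominators are all hypothetical, and a finite scan in floating point gives evidence for a value of r, not a proof.

```python
import math


def worst_scaled_error(alpha, k, max_q):
    """Return the minimum of q**k * |alpha - p/q| over 1 <= q <= max_q.

    For each denominator q only the nearest numerator p = round(alpha * q)
    is tested, since it minimizes |alpha - p/q| for that q.
    """
    worst = math.inf
    for q in range(1, max_q + 1):
        p = round(alpha * q)
        worst = min(worst, q**k * abs(alpha - p / q))
    return worst


# Test number: the golden mean (sqrt(5) - 1) / 2, a standard example of a
# number that is badly approximated by rationals.
golden = (math.sqrt(5) - 1) / 2
print(worst_scaled_error(golden, 2, 2000))
```

Any r strictly below the returned value satisfies the inequality with that k for all denominators up to `max_q`. For the golden mean with k = 2 the returned value stays above 0.38, so r = 0.38 passes the scan over this range.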

1. Siegel Disks with Decomposable Boundaries A component G of the Fatou set of a polynomial f satisfying f (G) = G is a Siegel disk if on the component G, f is analytically conjugate to an irrational rotation through the angle exp(2πiα), where α is an irrational real number called the rotation number of G. The main tool for proving Theorem 1 is the Structure Theorem of the author [R1, R2]. According to that theorem, there are two mutually exclusive possibilities for the boundary of a Siegel disk of a polynomial. Structure Theorem. The boundary of a Siegel disk G of a polynomial satisfies exactly one of the following: (1) The impressions of the prime ends of G are disjoint subsets of the boundary of G, or (2) The boundary of G is an indecomposable continuum. The prime ends of this theorem arise from internal rays emanating from the fixed point of G and are not to be confused with the prime ends arising from external rays emanating from infinity. An indecomposable continuum is a compact connected space X that cannot be written as a union A ∪ E with A and E connected closed proper subsets of X. We will


deal with case (1) in Sect. 6 and cover case (2) in Sect. 7; in both cases, as the reader will see, we will be forced to deal with indecomposable continua in the boundary of G.

The Structure Theorem, together with Sullivan’s No Wandering Domains Theorem, gives one additional piece of information about the impressions of the prime ends of G. In case (1), not only are the impressions disjoint and one-dimensional, but each impression is full; that is, no impression has a bounded complementary domain. This is true because a bounded complementary domain of an impression would have to wander under the “irrational rotation” of f [R2, p. 187].

A tree-like continuum is a full, one-dimensional compact connected subset of C. From the previous paragraph, we see that in the decomposable case each impression of G is a tree-like continuum. We show next how the proof of Theorem 1 in the decomposable case can be reduced to a certain theorem about tree-like continua in the boundary of G.

Let G be a Siegel disk such that the boundary B of G is decomposable. Let x and y be two points of B with the property that f(x) = z = f(y). The point z belongs to exactly one impression I(β), so both x and y belong to the impression I(β − α), for α is the rotation number of G and $f^{-1}(I(\beta)) = I(\beta - \alpha)$. Since f(I(β − α)) = I(β), which is tree-like, the decomposable case of Theorem 1 follows from a theorem in Sect. 6.

Theorem 6.6. Let G be a Siegel disk of a polynomial f, and let X be a subcontinuum of the boundary B of G such that the image f(X) of X is tree-like. If B contains no critical points, then f maps X homeomorphically onto f(X).

The next theorem, which deals with the case that the boundary of G contains no indecomposable continuum at all, has extremely simple proofs. One proof is given in [R4]; the result is also an immediate corollary to Theorem 6.2, and the reader may skip directly to Sect. 6 if this result is the goal.

Theorem 1.1. If the polynomial f has a Siegel disk G, if the rotation number of G satisfies a diophantine condition, and if the boundary of G does not contain an indecomposable continuum, then the boundary of G contains a critical point.

2. Basic Theorems and Definitions

Throughout this paper, f : C → C denotes a complex polynomial of degree d, where d ≥ 2. If $f'(c) = 0$, then c is a critical point of f and f(c) is a critical value of f. The Julia set of f is denoted J(f) and the Fatou set F(f). A continuum is a compact connected nonvoid metric space. A subcontinuum is a continuum that is a subset of another continuum.

If X is a set in C, then its image f(X) is denoted $\tilde{X}$. Hence, for example, if c is a critical point of f, then its image $\tilde{c}$ is a critical value. Occasionally, we shall cheat and define $\tilde{X}$ before we define X. In these cases, we always require that anything called X maps onto $\tilde{X}$.

The polynomial f has two crucial properties. First it is a d-sheeted ramified covering of the Riemann sphere with the critical points as branch points. This has the following consequence.

Theorem 2.1. Let V be a simply connected domain and W a component of the inverse image of V. If W contains no critical points, then f : W → V is a homeomorphism. In particular, if V contains no critical values, then f is univalent on each of the d components of the inverse image of V.


Second, f is confluent. A map is called confluent provided each component of the inverse image of each continuum is mapped onto that continuum. There are two ways to see that f is confluent. By a theorem of G. T. Whyburn, all open maps of compacta are confluent [W, p. 148]. Since each polynomial is an open map of the Riemann sphere, f is confluent. Alternatively, A. F. Beardon [Be] has given a dynamical proof that f is confluent.

A continuum X in C is full if C − X is connected. A continuum is tree-like if it is full and one-dimensional. We need the following facts about tree-like continua.

Theorem 2.2. Each subcontinuum of a tree-like continuum is a tree-like continuum or a point, and the nonempty intersection of two subcontinua of a tree-like continuum is again a tree-like continuum or a point.

A compact set in C is cellular if it is equal to the intersection of a nested sequence of closed 2-cells in C, each of which is contained in the interior of the previous one. If X is a compact subset of C, then the following conditions are equivalent:
(1) X is cellular,
(2) X is a full subcontinuum of C, and
(3) $H^1(X) = 0 = \tilde{H}^0(X)$.

Our cohomology is reduced Alexander–Čech cohomology with coefficients in the integers.

Theorem 2.3. Let f : C → C be a complex polynomial, and let $\tilde{X}$ be a subcontinuum of J(f). If $\tilde{X}$ is tree-like, then each component of $f^{-1}(\tilde{X})$ is a tree-like continuum.

Proof. Let Z be a component of $f^{-1}(\tilde{X})$. Since J(f) is one-dimensional and completely invariant, Z is one-dimensional. If G is a bounded complementary domain of Z, then G is a component of F(f). Hence $\tilde{G}$ is a bounded component of F(f) whose boundary is contained in $\tilde{Z} = \tilde{X}$. Since $\tilde{X}$ is tree-like, this is impossible.

Corollary 2.4. If $\tilde{X}$ is tree-like, then X is tree-like.

Theorem 2.5. Let f : C → C be a complex polynomial and let X be a subcontinuum of J(f). Suppose X is a component of $f^{-1}(\tilde{X})$ and X contains no critical point. If $\tilde{X}$ is tree-like, then f maps X homeomorphically onto $\tilde{X}$.

Proof. Let U be an open set such that X ⊂ U and U contains no critical point of f and no other point of $f^{-1}(\tilde{X})$. Since f is an open map and $\tilde{X}$ is cellular, there is an open 2-cell V such that $\tilde{X} \subset V \subset f(U)$. Let W be the component of $f^{-1}(V)$ containing X. By Theorem 2.1, f : W → V is a homeomorphism. Hence $f : X \to \tilde{X}$ is a homeomorphism.

Lemma 2.6. If f : C → C is holomorphic and K is a compact subset of C, then the image of a bounded complementary domain W of K is an open set which is the union of bounded components of C \ f(K) and (possibly) some subset of f(K).

Proof. The set f(W) ∩ (C \ f(K)) = (f(W) ∪ f(K)) ∩ (C \ f(K)) is both open and closed in C \ f(K) and hence a union of components of C \ f(K). It cannot contain the unbounded component because f(W) is relatively compact.
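The d-sheeted covering property behind Theorem 2.1 can be seen concretely for the unicritical family f(z) = z^d + c, whose only finite critical point is 0, with critical value c. The sketch below is an illustration under that assumption (the choice of family, the function names, and the tolerance `tol` are not from the paper): a value that is not critical has d distinct preimages, one on each sheet, while the critical value has a single preimage.

```python
import cmath


def f(z, d, c):
    """The unicritical polynomial z -> z**d + c (an illustrative choice)."""
    return z**d + c


def preimages(w, d, c, tol=1e-12):
    """Return the distinct solutions z of z**d + c = w."""
    if abs(w - c) < tol:
        # w is the critical value c: all sheets meet at the critical point 0.
        return [0j]
    radius = abs(w - c) ** (1.0 / d)
    angle = cmath.phase(w - c)
    # The d-th roots of w - c are equally spaced on a circle.
    return [cmath.rect(radius, (angle + 2 * cmath.pi * j) / d) for j in range(d)]


d, c = 3, 0.25 + 0.1j
w = 1 + 1j                      # not the critical value
roots = preimages(w, d, c)
print(len(roots))               # number of sheets over w
print(max(abs(f(z, d, c) - w) for z in roots))  # residual of the roots
print(preimages(c, d, c))       # the critical value has one preimage
```

On a small disk V around w that avoids c, each of the d roots sits in its own component of the inverse image of V, and f restricted to that component is the homeomorphism described in Theorem 2.1.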


3. Complementary Domains The goal of this section is to prove Theorem 3.3, which states that if B is the boundary of a Siegel disk G of a polynomial f , then either G is the only bounded complementary domain of B, or there are other bounded complementary domains of B and each is a component of the preimage of G under some iterate of f . No assumption on the critical points is made in this section. We review some ideas of Goldberg and Milnor [GM, Sect. 3]. Assume the Julia set J(f ) is connected. Consider the landing points of the d − 1 external rays which are fixed by f . These rays together with their landing points divide the plane into m basic regions, where 1 ≤ m < d. The landing points are parabolic or repelling fixed points belonging to J(f ), but otherwise the rays lie in the basin of infinity and hence in the Fatou set. The boundaries of two basic regions U and V have at most one point of the Julia set in common; such a point must be one of these landing points. The following is Theorem 3.3 of [GM]. Theorem 3.1 (Goldberg–Milnor). Each of the basic regions contains exactly one interior fixed point or virtual fixed point. We defer to [GM] for precise definitions of these terms; for us it is enough to know that (super-)attracting fixed points and centers of Siegel disks are interior fixed points, and that a virtual fixed point is associated with the immediate basin of a parabolic fixed point in somewhat the same way. A. Poirier [GM, Corollary 3.5] has used this to show there is no Cremer point on the boundary of a Siegel disk. We follow the same ideas to prove the next theorem. Theorem 3.2. Let B be the boundary of a Siegel disk G of a polynomial f . If H is another bounded component of the Fatou set that is invariant under f , then B contains at most one point of the boundary of H. Proof. Let K be the component of J(f ) containing B. It follows from results of C. McMullen [Mc] that there is a polynomial whose Julia set is homeomorphic to K. 
Furthermore, this polynomial has a Siegel disk G′ with boundary B′ homeomorphic to B. Finally, if B ∩ ∂H ≠ ∅, then this polynomial has a bounded invariant component H′ of its Fatou set such that B′ ∩ ∂H′ is homeomorphic to B ∩ ∂H. Hence it is no loss of generality to assume J(f) is connected.

Evidently G is contained in some basic region U. Since U contains the center of G, U cannot contain another interior fixed point or a virtual fixed point. Hence U cannot contain another Siegel disk, nor can it contain the immediate basin of a superattracting or attracting fixed point. Finally, it cannot contain the immediate basin of a parabolic fixed point, since it contains no virtual fixed point. Thus H lies in a basic region V different from U, and since the boundaries of U and V have at most one point of the Julia set in common, B contains at most one point of the boundary of H.

Remark. In case there is such a point, it must be a repelling or parabolic fixed point of f. It seems likely that such a point does not exist, and in Corollary 8.6 we prove this in the case that the filled Siegel disk contains no critical point.

Next we show that each bounded complementary domain of B is a component of the grand orbit of G.

Theorem 3.3. Let G be a Siegel disk of a polynomial f, with boundary B. Then either
(1) G is the only bounded complementary domain of B, or

Diophantine Conditions for Siegel Disks


(2) B has other bounded complementary domains, and each one of them is a preperiodic component of the grand orbit of G.

Proof. Suppose H is a bounded complementary domain of B different from G. Since B is invariant, each iterated image of H is a bounded complementary domain of B. Some iterated image L of H is a periodic component of the Fatou set. Replacing f by an appropriate iterate, we may assume L is a bounded invariant Fatou domain. Theorem 3.2 implies L = G.

4. Proper Subcontinua are Tree-like

The goal of this section is to prove Theorem 4.9, namely, if B is the indecomposable boundary of a Siegel disk of a polynomial, then each proper nondegenerate subcontinuum X of B is tree-like, and the image of X is also a proper subcontinuum of B. The first seven theorems are valid for rational functions. No assumption on the critical points is made in this section.

Let K be a continuum contained in ℂ.

Definition 4.1. $Q_m = \{x \in K : \text{the cardinality of } f^{-1}(\tilde{x}) \cap K \text{ is at least } m\}$.

If K does not contain a critical point, then $Q_m$ is closed.

Definition 4.2. Let n be the largest integer such that $\{x \in K : \tilde{x} \text{ is a critical value}\} \cup Q_n = K$.

Thus 1 ≤ n ≤ d. Notice that n depends on the choice of K.

Theorem 4.3. If x is a point of K that is contained in $\overline{f^{-1}(\tilde{K}) - K}$, then x is in $Q_{n+1}$ or $\tilde{x}$ is a critical value.

Proof. Let $(y_i)$ be a sequence of points of $f^{-1}(\tilde{K}) - K$ that converges to the point x of K such that $\tilde{x}$ is not a critical value of f. We may assume no $\tilde{y}_i$ is a critical value. Then $(\tilde{y}_i)$ converges to $\tilde{x}$, all points of $\tilde{K}$. Since $\tilde{y}_i$ is not a critical value, we may assume that for each i, $f^{-1}(\tilde{y}_i)$ contains at least n points of K. Call them $y_{i1}, \dots, y_{in}$. Choosing a subsequence if necessary, we assume the sequence $(y_{i1})$ converges to a point $x_1$ in K and $f(x_1) = \tilde{x}$. Continuing to take subsequences, we find $x_1, x_2, \dots, x_n$, n points of K belonging to $f^{-1}(\tilde{x})$. All of these points are distinct from each other and from x, since none of them is a critical point. Hence x is in $Q_{n+1}$.

Corollary 4.4. The intersection of K and $\overline{f^{-1}(\tilde{K}) - K}$ is a proper subset of K.

A subcontinuum W of a continuum Y is a continuum of condensation if W has empty interior with respect to Y. It is easily shown that a continuum is indecomposable if and only if each proper subcontinuum is a continuum of condensation. The next two results are thus immediate consequences of Corollary 4.4.

Theorem 4.5. If K is a continuum of condensation of the continuum X, then $\tilde{K}$ is a proper subcontinuum of $\tilde{X}$.

Theorem 4.6. If K is a proper subcontinuum of the indecomposable subcontinuum X, then $\tilde{K}$ is a proper subcontinuum of $\tilde{X}$.
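As a sanity check on Definitions 4.1 and 4.2 and on Theorem 4.3, here is a small worked example. It is not taken from the paper: the map f(z) = z² and the arc K are illustrative assumptions, and we assume, as the surrounding text indicates, that a tilde denotes the image under f.

```latex
% Illustrative example (an assumption, not from the source):
% f(z) = z^2, so d = 2, and K is the closed upper half of the unit circle S^1.
\[
K = \{ e^{it} : 0 \le t \le \pi \}, \qquad
\tilde{K} = f(K) = S^1, \qquad
f^{-1}(\tilde{K}) = S^1 .
\]
% The two preimages of \tilde{x} = x^2 are x and -x,
% and -x lies in K only when x = 1 or x = -1.
\[
Q_1 = K, \qquad Q_2 = \{ 1, -1 \}, \qquad Q_3 = \emptyset .
\]
% The only critical value of f is 0, which is not in \tilde{K},
% so Definition 4.2 gives n = 1 (and indeed 1 <= n <= d = 2).
\[
\overline{f^{-1}(\tilde{K}) - K} \cap K
  = \{ e^{it} : \pi \le t \le 2\pi \} \cap K
  = \{ 1, -1 \}
  = Q_{n+1} .
\]
% This agrees with Theorem 4.3, and {1, -1} is a proper subset of K,
% as Corollary 4.4 predicts.
```

In this toy case the points of K that are limits of preimages lying outside K are exactly the points with an extra preimage inside K, which is the mechanism the proof of Theorem 4.3 exploits.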


Let G be a Siegel disk with indecomposable boundary B.

Theorem 4.7. If K is a proper subcontinuum of B, then $\tilde{K}$ is also a proper subcontinuum of B.

Proof. This is an immediate consequence of Theorem 4.6.

Theorem 4.8. If K is a proper nondegenerate subcontinuum of the indecomposable boundary B of a Siegel disk of a polynomial, then K is tree-like.

Proof. If K is not tree-like, then K bounds a component of the Fatou set of f. This component is a preimage of G under some power g of f, by Theorem 3.3. The result now follows from Theorem 4.7 with f replaced by g.

The next theorem summarizes this section.

Theorem 4.9. If B is the indecomposable boundary of a Siegel disk of a polynomial, then each proper nondegenerate subcontinuum of B is tree-like and the image of each proper nondegenerate subcontinuum of B is tree-like.

Proof. If K is a proper nondegenerate subcontinuum of B, then K is tree-like. Theorem 4.7 implies $\tilde{K}$ is also tree-like.

To see that the conclusion of Theorem 4.9 may still hold while G is not the only bounded complementary domain of B, consider the Lakes of Wada example [HY]. In the last section we give some conditions sufficient to conclude that G is the only bounded complementary domain of B.

Corollary 4.10. If B is the boundary of a Siegel disk of a polynomial, then B is an irreducible separator of ℂ, i.e., B separates ℂ but no proper closed subset of B separates ℂ.

5. Indecomposable Continua

A continuum X is indecomposable if it is not the union of two of its proper subcontinua. Examples of indecomposable continua occurring in dynamics are solenoids, Birkhoff’s remarkable curve, and the closure of the unstable manifold of any periodic point of the Smale horseshoe. These and other examples are discussed in [R1, Sect. 3].

Figure 1 shows a Knaster continuum, an indecomposable continuum in the plane homeomorphic to the closure of the unstable manifold of a periodic point of the Smale horseshoe. What is pictured is actually part of one arc component of this continuum. Each of the uncountably many other arc components in this continuum is contained in the closure of the first one, and each is dense in the continuum. Figure 2 shows another Knaster continuum.
This one has two “endpoints”, and parts of two arc components are shown. Again each of the uncountably many other arc components in this continuum is contained in the closure of either of the pictured arc components, and each is dense in the continuum.

Fig. 1.

Fig. 2.

The union of all proper subcontinua of an indecomposable continuum X containing a point x of X is called a composant of X. In case each proper subcontinuum of X is an arc, the composants of X are precisely the arc components of X. This is the case for solenoids and for the Knaster continua; in general, however, composants can be considerably more complicated sets (for example, the composants of the pseudocircle [R3, H3] contain no arcs). The collection of composants of X forms a partition of X into disjoint sets. The continuum X contains uncountably many composants, and each composant is dense in X [HY, p. 140].

If X is an indecomposable subcontinuum of the complex plane, quite a lot of knowledge is available about various forms of accessibility of composants of X from the complement of X. The most basic form of accessibility is described in the next paragraph.

A point x of an indecomposable continuum X ⊂ ℂ is an accessible point if there exists an arc A in ℂ such that A ∩ X = {x}. The composant of X containing x is called an accessible composant of X. For the Knaster continuum in Fig. 1, there is only one accessible composant; it is the pictured arc component, since in this example, composants and arc components are the same sets. This composant contains many accessible points, but no other composant contains any accessible point. The continuum in Fig. 2 has two accessible composants, namely the pictured arc components. Both of these composants contain many accessible points, but no other composant contains any accessible point. This illustrates the following theorem, due to Mazurkiewicz.

Theorem 5.1. If X is an indecomposable continuum in ℂ, then at most countably many composants of X have the property that the composant contains 2 points accessible from the complement of X.

We now move to more sophisticated forms of accessibility. Let X be an indecomposable subcontinuum of J(f). A composant C of X is hidden in the Julia set if each subcontinuum of the Julia set containing a point of C and a point not in C must contain X. The union of all the hidden composants of X is denoted Hid(X). A composant C of X is internal if every subcontinuum of ℂ containing a point of C and a point not in C must intersect all composants of X. The union of all the internal composants of X is denoted Int(X). J.
Krasinkiewicz [Kr1] introduced the notion of internal composant and proved that each indecomposable continuum in C has uncountably many such composants. Theorem 5.2. If f : C → C is a complex polynomial and X is an indecomposable continuum in the Julia set J(f ), then every internal composant of X is hidden in the Julia set. Proof. Let Z be a subcontinuum of J(f ) that contains a point of an internal composant C of X and a point not in C. Let Y be the component of the filled Julia set that contains X. Thus Y is full, and X is contained in the boundary of Y . Since Z is contained in J(f ) and Z contains a point of X, Z is in the same component of J(f ) as X. In particular, Z is contained in Y . If Z contains a point of Y − X, then Z contains X, by


[Kr2, Theorem 2.3]. If Z ⊂ X, then Z = X, since X is irreducible between points of different composants. Hence each internal composant of X is a hidden composant of X.

Corollary 5.3. If f : ℂ → ℂ is a complex polynomial and X is an indecomposable subcontinuum of the Julia set J(f), then X contains an uncountable collection of hidden composants.

Theorem 5.4. If f : ℂ → ℂ is a complex polynomial and X is an indecomposable continuum in J(f), then $\tilde{X}$ is also an indecomposable continuum in J(f).

Proof. Since J(f) is completely invariant, $\tilde{X}$ is a continuum contained in J(f). If $\tilde{X}$ were decomposable, then $\tilde{X} = \tilde{A} \cup \tilde{E}$, where $\tilde{A}$ and $\tilde{E}$ are two proper subcontinua of $\tilde{X}$. Let $A_1, \dots, A_m$ be the components of $f^{-1}(\tilde{A})$. Each continuum $A_i$ is a subset of J(f). If $A_i$ contains a point of an internal composant of X, then $A_i \subset X$, for otherwise $A_i \supset X$, and so $\tilde{A} \supset \tilde{X}$, contradicting the decomposability of $\tilde{X}$.

Similar remarks apply to the components $E_1, \dots, E_s$ of the inverse image of $\tilde{E}$. Furthermore, $X \subset A_1 \cup \dots \cup A_m \cup E_1 \cup \dots \cup E_s$. It follows that Int(X) can be expressed as the union of a finite number of compact connected sets, namely the union of the $A_i$’s and the $E_j$’s that intersect Int(X). This implies the contradiction that Int(X) is a compact set.

We introduce some definitions analogous to arcwise connected and arc component. A set S is continuumwise connected if each pair of points of S is contained in a continuum that is a subset of S. For example, a composant of an indecomposable continuum is continuumwise connected. A continuum component of a set S is a maximal continuumwise connected subset of S.

Theorem 5.5. Let X be an indecomposable continuum in J(f). Let $\tilde{C}$ be a composant of $\tilde{X}$ such that $\tilde{C}$ does not contain a critical value of f. If each continuum contained in $\tilde{C}$ is tree-like, then
\[
f^{-1}(\tilde{C}) = C_1 \cup \dots \cup C_d ,
\]
where each $C_i$ is a continuumwise connected set in ℂ, and
\[
f|_{C_i} : C_i \to \tilde{C}
\]
is a bijective confluent map.

Proof. Let $\tilde{Y}$ be a subcontinuum of $\tilde{C}$. Then $\tilde{Y}$ is tree-like, so there is a simply connected domain $\tilde{D}$ in ℂ such that $\tilde{Y}$ is contained in $\tilde{D}$. In addition, $\tilde{D}$ can be chosen to contain no critical values of f. If $D_i$ are the connected components of $f^{-1}(\tilde{D})$, then f is univalent on each component $D_i$, and there are d branches in $\tilde{D}$ of the inverse function $f^{-1}$. Hence there are d disjoint preimages of $\tilde{Y}$ (call them $Y_i$) and f is univalent on each of them.

Let $\tilde{c}$ be a point of $\tilde{C}$, and let $c_i$ denote the d preimages of $\tilde{c}$. Let $C_i$ denote the continuum component of $c_i$ in $f^{-1}(\tilde{C})$. If $\tilde{d}$ is a point of $\tilde{C}$, then there is a continuum $\tilde{Y}$ in $\tilde{C}$ containing $\tilde{c}$ and $\tilde{d}$. The previous paragraph implies that f takes $Y_i$ univalently onto $\tilde{Y}$. Hence the restriction of f to $C_i$ is a confluent surjection.

If x and y are points of $C_i$ such that f(x) = f(y), let Z be a continuum in $C_i$ containing x and y. Then $\tilde{Z}$ is a subcontinuum of $\tilde{C}$ and f is not univalent on one of the preimages of $\tilde{Z}$. This contradicts the first paragraph.
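The step "there are d branches of the inverse function" in the proof of Theorem 5.5 can be made concrete in the simplest case. The following is only an illustrative sketch under assumptions not in the source: the map f(z) = z^d and the slit plane are chosen for convenience, and the names g_k and D_k are introduced here.

```latex
% Illustrative example (an assumption, not from the source):
% f(z) = z^d has 0 as its only critical value, and the slit plane
% \tilde{D} = C \setminus (-\infty, 0] is simply connected and avoids 0.
\[
g_k(w) = |w|^{1/d} \exp\!\Big( \frac{i\,(\operatorname{Arg} w + 2\pi k)}{d} \Big),
\qquad k = 0, 1, \dots, d-1, \qquad \operatorname{Arg} w \in (-\pi, \pi).
\]
% Each g_k is a holomorphic branch of f^{-1} on \tilde{D}:
\[
f(g_k(w)) = w \quad \text{for all } w \in \tilde{D} .
\]
% The images D_k = g_k(\tilde{D}) are d pairwise disjoint open sectors
% (for a suitable determination of arg z), and f is univalent on each:
\[
f^{-1}(\tilde{D}) = D_0 \cup \dots \cup D_{d-1},
\qquad
D_k = \Big\{ z \ne 0 : \Big| \arg z - \frac{2\pi k}{d} \Big| < \frac{\pi}{d} \Big\} .
\]
```

A continuum contained in the slit plane therefore has exactly d disjoint preimages, one in each sector, which is the situation the proof uses for the tree-like continuum it starts from.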


6. The Main Theorem for B Decomposable

Let G be a Siegel disk of a polynomial f, and assume the boundary B of G is decomposable and contains no critical points of f. Let K be a continuum of B with the property that K contains two points x and y such that f(x) = f(y) = z, and no proper subcontinuum of K contains such a pair of points. In particular, f is injective when restricted to a proper subcontinuum of K and so K is irreducible from x to y, i.e., no proper subcontinuum of K contains both x and y. The existence of K follows from an argument using Zorn’s Lemma.

Let Q = {k in K : some point k′ of K distinct from k has the property that f(k) = f(k′)}. So Q is the set $Q_2$ discussed in Sect. 4. The set Q contains both x and y, so it is nonempty. Since B contains no critical point, Q is closed.

Theorem 6.1. If $\tilde{K}$ is a tree-like continuum, then Q = K. In other words, f is at least 2-1 over K.

Proof. Consider the long exact sequences of the pairs (K, Q) and $(\tilde{K}, \tilde{Q})$ and the map of cohomology sequences induced by f:
\[
\begin{array}{ccccccccc}
H^1(K) & \longleftarrow & H^1(K, Q) & \longleftarrow & \tilde{H}^0(Q) & \longleftarrow & \tilde{H}^0(K) & \longleftarrow & \tilde{H}^0(K, Q) \\
\big\uparrow & & \big\uparrow & & \big\uparrow f^* & & \big\uparrow & & \big\uparrow \\
H^1(\tilde{K}) & \longleftarrow & H^1(\tilde{K}, \tilde{Q}) & \longleftarrow & \tilde{H}^0(\tilde{Q}) & \longleftarrow & \tilde{H}^0(\tilde{K}) & \longleftarrow & \tilde{H}^0(\tilde{K}, \tilde{Q})
\end{array}
\]
The first vertical arrow is an isomorphism, since both K and $\tilde{K}$ are tree-like, by Corollary 2.4, and so both groups are trivial. Since $f : K - Q \to \tilde{K} - \tilde{Q}$ is a homeomorphism, the second arrow is an isomorphism by the Strong Excision Theorem [Sp, p. 318]. The fourth and fifth arrows are isomorphisms because K is connected. Therefore, the Five Lemma [Sp, p. 185] implies that the middle arrow is also an isomorphism. To say that $f^*$ is an isomorphism is to say that f induces a bijection from the set of components of Q to the set of components of $\tilde{Q}$. This forces x and y to be in the same component of Q. Since K is irreducible from x to y, this component must in fact be equal to K.

Theorem 6.2. K is indecomposable.

Proof. Suppose K = A ∪ E, where A and E are proper subcontinua of K. If k is a point of A ∩ E, then there is no point k′ of K such that f(k) = f(k′), for f is univalent on both A and E. This contradiction implies K is indecomposable.

Corollary 6.3. $\tilde{K}$ is indecomposable.

Proof. This follows from Theorem 5.4.

Theorem 6.4. If C is a composant of K, then $\tilde{C}$ is a subset of a composant of $\tilde{K}$.

Proof. This is straightforward.

Let $Q_m = \{x \in K : \text{the cardinality of } f^{-1}(\tilde{x}) \cap K \text{ is at least } m\}$. Since K contains no critical point, $Q_m$ is closed. Let n be the largest integer such that $Q_n = K$ and $Q_{n+1} \ne K$. Corollary 4.4 says $\overline{f^{-1}(\tilde{K}) - K} \cap K$ is a proper subset of K.
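The Zorn's Lemma argument for the existence of K, mentioned at the start of this section, is not spelled out in the text. One possible way to carry it out is sketched below. This is not the author's proof: the constant δ and the family 𝒦 are notation introduced here for illustration, and the sketch assumes that f is not injective on B.

```latex
% Sketch under stated assumptions (not from the source).
% Step 1. B is compact and contains no critical point, so f is locally
% injective on B; a compactness argument then gives a uniform separation:
\[
\exists\, \delta > 0 : \qquad
x, y \in B,\ \ x \ne y,\ \ f(x) = f(y)
\ \Longrightarrow\ |x - y| \ge \delta .
\]
% Step 2. Consider the family, ordered by inclusion and nonempty
% because it contains B itself:
\[
\mathcal{K} = \{\, L \subset B : L \text{ is a continuum containing points }
x \ne y \text{ with } f(x) = f(y) \,\} .
\]
% Step 3. For a chain (L_\alpha) in \mathcal{K}, the intersection is a
% continuum. Choose such a pair (x_\alpha, y_\alpha) in each L_\alpha.
% Any cluster point (x, y) of this net in B x B lies in every
% L_\beta x L_\beta, and by continuity and Step 1 it satisfies
\[
f(x) = f(y), \qquad |x - y| \ge \delta,
\qquad \text{so} \qquad \bigcap_\alpha L_\alpha \in \mathcal{K} .
\]
% Step 4. Zorn's Lemma yields a minimal element K of \mathcal{K}:
% K contains a pair x, y with f(x) = f(y), and no proper subcontinuum does.
```

The uniform separation in Step 1 is what prevents the chosen pairs from collapsing to a single point in the limit; this is where the hypothesis that B contains no critical point enters.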


Theorem 6.5. If $\tilde{K}$ is tree-like, then the intersection of K and $\overline{f^{-1}(\tilde{K}) - K}$ is the union of a finite number of proper subcontinua of K.

Proof. According to the previous paragraph, $\overline{f^{-1}(\tilde{K}) - K} \cap K$ is a proper subset of K. Choose a composant $\tilde{C}$ of $\tilde{K}$ containing no critical value such that if D is a composant of K contained in the preimage of $\tilde{C}$, then D does not contain 2 points accessible from the complement of K. The choice of $\tilde{C}$ is possible by Theorem 5.1 and the fact that $\tilde{K}$ has uncountably many composants. Theorem 5.5 implies that
\[
\overline{f^{-1}(\tilde{K}) - K}
 = \overline{\textstyle\bigcup \overline{C_i} - K}
 = \overline{\textstyle\bigcup C_i - K}
 = \textstyle\bigcup \overline{C_i - K} .
\]
(Recall there are only a finite number of $C_i$’s.) Both K and $\overline{C_i - K}$ are subsets of a tree-like continuum, namely a component of $f^{-1}(\tilde{K})$. It suffices to show $\overline{C_i - K}$ is a continuum, since this implies $K \cap \overline{C_i - K}$ is a proper subcontinuum of K, by Theorem 2.2.

Suppose $\overline{C_i - K} = A \cup E$, where A and E are 2 disjoint closed sets. Let $a \in (C_i \cap A) - K$ and $e \in (C_i \cap E) - K$. Let Z be a subcontinuum of $C_i$ containing a and e. Then $\overline{Z - K} = (Z \cap A) \cup (Z \cap E)$, the union of two disjoint closed sets. Hence some open set contains a point of Z ∩ K and misses A ∪ E. Hence Z ∩ K contains 2 points accessible from the complement of K. Since Z ∩ K is a proper subcontinuum of K, this contradicts the choice of $\tilde{C}$.

Theorem 6.6. Let G be a Siegel disk of a polynomial f. Let X be a subcontinuum of the boundary B of G such that $\tilde{X}$ is tree-like. If B contains no critical points of f, then f maps X homeomorphically onto $\tilde{X}$.

Proof. If f is not univalent on X, then we can choose a subcontinuum K of X satisfying all the properties of this section. Since $\tilde{X}$ is tree-like, so is $\tilde{K}$. By Corollary 2.4, K is also tree-like.

Let $\tilde{C}$ be a composant of $\tilde{K}$ such that if D is a composant of K contained in the preimage of $\tilde{C}$, then
(1) D is disjoint from $\overline{f^{-1}(\tilde{K}) - K}$, and
(2) $\tilde{C}$ contains no critical value of f.
Since $\tilde{K}$ contains uncountably many composants and only a finite number are excluded by these conditions, such a composant $\tilde{C}$ exists.

Let $\tilde{z}$ be a point of $\tilde{C}$. Since $Q_{n+1}$ is a proper closed subset of K, Theorem 4.3 implies we can choose $\tilde{z}$ such that the preimages $z_1, \dots, z_n$ all belong to K but not to $\overline{f^{-1}(\tilde{K}) - K}$. Therefore K contains n points $z_1, \dots, z_n$ in the preimage of $\tilde{z}$ and misses the other preimages $z_{n+1}, \dots, z_d$ of $\tilde{z}$.

Let U be a simply connected domain containing K such that U contains no critical point of f and such that the closure of U misses $z_{n+1}, \dots, z_d$. Let $\tilde{O}$ be a round ball containing $\tilde{z}$ such that
(3) $f^{-1}(\tilde{O}) = O_1 \cup \dots \cup O_n \cup \dots \cup O_d$, where the $O_i$’s are open sets whose closures are pairwise disjoint and each such closure is mapped homeomorphically onto the closure of $\tilde{O}$,
(4) $z_i$ belongs to $O_i$,
(5) $U \cap f^{-1}(\tilde{O}) = O_1 \cup \dots \cup O_n$,
(6) $O_1 \cup \dots \cup O_n \subset U$,
(7) $O_i \cap \overline{f^{-1}(\tilde{K}) - K} = \emptyset$, i = 1, 2, . . . , n.

Let D be a composant of K containing $z_1$. The composant D intersects each one of $O_1, \dots, O_n$ because $\overline{D} = K$. Recall n ≥ 2 since $K = Q_2$, by Theorem 6.1.

Let $\Sigma_1$ be a (tree-like) continuum in the composant D irreducible from $z_1$ to a point $y_1$ of $\partial(O_2 \cup \dots \cup O_n)$. (See Fig. 3.) Let $[\tilde{y}, \tilde{z}]$ be the radial line segment in the closure of $\tilde{O}$ with endpoints $\tilde{y}$ and $\tilde{z}$. Let $\tilde{I} = [\tilde{y}, \tilde{x}]$ be the shortest subsegment of $[\tilde{y}, \tilde{z}]$ such that $\tilde{x} \ne \tilde{y}$ and $\tilde{x}$ belongs to $\tilde{\Sigma}$. The restriction of f to D is injective, so $\Sigma_1 \cap O_1$ is disjoint from $f^{-1}(\tilde{y})$. Hence $\tilde{x}$ exists.

The continuum $\tilde{\Sigma} \cup \tilde{I}$ bounds a simply connected region V, since $\tilde{\Sigma} \cap \tilde{I} = \{\tilde{x}, \tilde{y}\}$. Let $I_2$ be the lift of $\tilde{I}$ that contains $y_1$. Let $x_2$ be the preimage of $\tilde{x}$ contained in $I_2$, and let $\Sigma_2$ be the preimage of $\tilde{\Sigma}$ that contains $x_2$. Condition (1) implies $\Sigma_2$ is a subcontinuum of K. Continue this process until, when choosing $x_{k+1}$, we find that $x_{k+1} = x_1$. Thus $I_{k+1}$ has endpoints $y_k$ and $x_1$.

Let $M = \Sigma_1 \cup I_2 \cup \dots \cup \Sigma_k$. Then M is a tree-like continuum, and $I_{k+1} \cap M = \{x_1, y_k\}$. Hence $M \cup I_{k+1}$ bounds a simply connected domain W in U. Thus W does not contain a critical point. By Lemma 2.6, W is a simply connected component of $f^{-1}(V)$. Theorem 2.1 implies f restricted to W is univalent. This is a contradiction, since the inverse image of a point of V near $\tilde{I}$ contains more than one point of W.

The following corollary to the proof of Theorem 6.2 is of independent interest in the theory of continua.

Corollary 6.7. Let f : X → Y be a surjective map between tree-like continua. Assume X does not contain an indecomposable continuum. If f is locally injective, then f is injective.

In fact, by using a slightly different argument, it suffices to assume only Y is tree-like.

7. The Main Theorem for B Indecomposable

Let G be a Siegel disk of a polynomial f, and assume the boundary B of G is indecomposable and contains no critical points of f. Let K and Q be chosen as in the beginning of Sect. 6.
Q is closed because B does not contain a critical point. Theorem 6.6 implies that $\tilde{K}$ is not tree-like. Hence Theorem 4.9 implies K = B. Thus we have the following lemma.

Lemma 7.1. The map f is injective when restricted to a proper subcontinuum of B.

Theorem 7.2. Q = B. In other words, f is at least 2-1 over B.

The proof divides into two cases.

Case 1. B has a bounded complementary domain H different from G.


Proof. Theorem 3.3 implies H is a preperiodic component of the grand orbit of G. Theorem 4.9 implies B is the boundary of H. Without loss of generality, H is a component of the preimage of G under f distinct from G. Let w be a point of B. Let $(w_i)$ be a sequence of points of G converging to w. Since f maps each of G and H onto G, there are sequences $(y_i)$ in G and $(z_i)$ in H such that $f(y_i) = w_i = f(z_i)$. We may assume $y_i$ converges to y in B and $z_i$ converges to z in B, the common boundary of G and H. Since B contains no critical point, y is distinct from z, and hence w has two preimages in B.

Case 2. G is the only bounded complementary domain of B.

Proof. Let E be the union of G and B. Let R = {e ∈ E : some point e′ of E distinct from e has the property that f(e) = f(e′)}. Since f is injective on G and B has no other bounded complementary domain, R is contained in B, and so R = Q.

Since E is forward invariant and acyclic, its image $\tilde{E} = E$ is also acyclic. An argument similar to the proof of Theorem 6.1 shows that the points x and y are in the same component L of R. Thus this component L is a subcontinuum of B containing x and y. Since f(x) = f(y), Lemma 7.1 implies L = B. Therefore, Q = B.

We have now established the following facts. The boundary B of the Siegel disk G is an indecomposable continuum consisting of uncountably many composants. These composants are pairwise disjoint sets, and each composant is dense in B. The restriction of f to each of these composants is injective, while the restriction of f to B is at least 2-1. Furthermore, each subcontinuum of such a composant is tree-like, being a proper subcontinuum of B.

Theorem 7.3. If the indecomposable boundary B of a Siegel disk G of a polynomial does not contain a critical point and no bounded complementary domain of B contains a critical point, then f is injective when restricted to the boundary of G.

Proof. Each open subset of B contains uncountably many points of B accessible from G.
Furthermore, no composant of B contains more than one point of B accessible from G [R5, Proposition 11]. Thus, in any open set of B, we have uncountably many choices of points accessible from G and hence composants of B.

Assume f is not injective on B. Since $Q_{n+1}$ is a proper closed subset of K = B, the paragraph above and Theorem 4.3 imply we can choose a composant $\tilde{C}$ not containing a critical value and containing a point $\tilde{z}$ in B accessible from G such that the preimages $z_1, \dots, z_n$ of $\tilde{z}$ all belong to B but not to $\overline{f^{-1}(B) - B}$. Therefore B contains n points $z_1, \dots, z_n$ in the preimage of $\tilde{z}$ and misses the other preimages $z_{n+1}, \dots, z_d$ of $\tilde{z}$. We can also choose $\tilde{C}$ so that no composant D of B that maps into $\tilde{C}$ has more than one point accessible from the complement of B. The last restriction eliminates only a countable number of choices, by Theorem 5.1, so uncountably many choices for $\tilde{C}$ remain.

Next we proceed as in the proof of Theorem 6.6, beginning with the fourth paragraph. Since the filled Siegel disk contains no critical point, we can choose a simply connected domain U containing K = B such that U contains no critical point of f and such that the closure of U misses $z_{n+1}, \dots, z_d$. Everything proceeds as before, with one exception: to complete the proof, we must verify that $\Sigma_2$ is a subset of B, since this is where we used Condition (1), which depended on Theorem 6.5.

Assume $\Sigma_2$ contains a point p not in B. First we note the immediate basin of infinity is a subset of the unbounded complementary domain of B. Hence p is a point of the unbounded complementary domain of B, since p must be a boundary point of the immediate basin of infinity.

Fig. 3. [Figure omitted: the domain U, the sets $O_i$ with the points $z_i$ and the continua $\Sigma_i$, together with their images $\tilde{O}$, $\tilde{z}$, $\tilde{x}$, $\tilde{y}$, $\tilde{I}$, $\tilde{\Sigma}$ under f.]

Since $\tilde{z}$ is accessible from G, $z_2$ is accessible from some component H of the preimage of G. The remainder of the proof is divided into 2 cases.

Case 1. The component H is a subset of the unbounded complementary domain of B.

Proof of Case 1. First we show that $\Sigma_2 \cap B$ contains a second point (besides $z_2$) accessible from H. Second we show that $\Sigma_2 \cap B$ is a continuum and hence a subset of a composant of B. Together these 2 claims show that the composant of B containing $z_2$ contains 2 points of B accessible from the unbounded complementary domain of B. This contradicts our choice of the composant $\tilde{C}$.

Claim 1. $\Sigma_2 \cap B$ contains another point (besides $z_2$) accessible from H.


Proof of Claim 1. Since $z_2$ belongs to B but not to the closed set $\overline{f^{-1}(B) - B}$, there exists a small circular disk K centered at $z_2$ such that $K \cap f^{-1}(B) \subset B$. Since H is simply connected, there is a conformal map of the unit disk onto H mapping the origin to a point w of H. At least one prime end of H has $z_2$ as its only principal point, so this conformal map takes some radial ray onto a ray $R_1$ in H beginning at w and landing on $z_2$. Theorem 9.3 of [CL] implies there is a circular arc $[y_1, y_2]$ contained in the boundary of K such that $[y_1, y_2]$ is a cross cut associated with this prime end. In particular, the endpoints $y_1$ and $y_2$ belong to the boundary ∂H of H, the open arc $(y_1, y_2)$ is contained in H, and the ray $R_1$ intersects $[y_1, y_2]$ transversally at the point v. Note that the points $y_1$ and $y_2$ belong to B because $\partial H \subset f^{-1}(B)$ and $K \cap f^{-1}(B) \subset B$. Also $y_1$ and $y_2$ are accessible from H.

Fig. 4. [Figure omitted: the component H with the rays R, R′ landing at $z_2$, $z_2'$, the continua $\Sigma_2$, $\Sigma_2'$, the arcs S, S′ from ∞, and the points v, v′.]

We complete the proof by showing either $y_1$ or $y_2$ belongs to $\Sigma_2$. We choose K small enough that there exists an arc $R_2$ in the unbounded complementary domain of B from w to a point p′ of $\Sigma_2$ such that $R_2 \cap R_1 = \{w\}$, $R_2 \cap \Sigma_2 = \{p'\}$, and $R_2 \cap [y_1, y_2] = \emptyset$. Let $M = R_1 \cup R_2 \cup \Sigma_2$. Since $\Sigma_2$ is tree-like, the continuum M has exactly 2 complementary domains; call them $D_1$ and $D_2$. One of them, say $D_2$, contains G. Thus $B = \partial G \subset \overline{D_2}$. In fact $B \subset D_2 \cup \Sigma_2$, so each of the points $y_1$ and $y_2$ belongs to $D_2 \cup \Sigma_2$.

If both $y_1$ and $y_2$ belong to $D_2$, then there is an arc A in $D_2$ with endpoints $y_1$ and $y_2$ but otherwise disjoint from $[y_1, y_2]$. The simple closed curve $A \cup [y_1, y_2]$ lies entirely in $D_2$ except for the point v. In particular, the simple closed curve $A \cup [y_1, y_2]$ can be homotoped off $R_1$ by an arbitrarily small homotopy. This contradicts the fact that $R_1$ intersects $[y_1, y_2]$ transversally at v. Hence either $y_1$ or $y_2$ belongs to $\Sigma_2$. The proof of Claim 1 is complete.

Claim 2. $\Sigma_2 \cap B$ is connected.

Proof of Claim 2. Suppose $\Sigma_2 \cap B$ is not connected. Then there exists a domain L such that L is a bounded complementary domain of $\Sigma_2 \cup B$ but L is not a complementary domain of B or of $\Sigma_2$. Since the immediate basin of ∞ contains the Julia set in its closure, L is in fact a component of the Fatou set and is simply connected. Since $f(\Sigma_2 \cup B) \subset B$, the polynomial f maps L to a bounded complementary domain of B.

Let y be a point of $\Sigma_2 - B$ accessible from L. At least one prime end β of L has y as its only principal point. The impression I(β) of β is a subset of $\Sigma_2$. Hence the image f(I(β)) of this impression, which is the impression of a prime end of the domain f(L), is a subset of the tree-like continuum $\tilde{\Sigma}$ and hence tree-like or a point. By Theorem 4.9, for each n, $f^n(I(\beta))$ is a tree-like continuum or a point and hence a proper subcontinuum of B. On the other hand, by Theorem 3.3, there exists an n such that $f^n(L) = G$. Hence $f^n(I(\beta))$ is the impression of a prime end of G. Since the impression of each prime end of G is B [R2, Theorem 7.1], we have a contradiction. The proof of Case 1 is completed.

Case 2. The component H is a bounded complementary domain of B.

Proof of Case 2. We claim that if we choose a different composant $\tilde{C}'$ with accessible point $\tilde{z}'$ and do the argument again, then the point $z_2'$ will not be accessible from H but from a different preimage of G. If this claim is proved, then, by discarding at most d choices of the composant $\tilde{C}$, we then find a choice for the composant $\tilde{C}$ for which no point p exists. Such a contradiction would complete the proof.

Thus, we assume that we choose another composant $\tilde{C}'$ containing an accessible point $\tilde{z}'$ as above, and this leads, by following the proof above, to a point $z_2'$ accessible from the same component H as $z_2$ and a continuum $\Sigma_2'$ containing both $z_2'$ and a point of the unbounded complementary domain of B. We seek a contradiction.

At this point, see Fig. 4. The continua $\Sigma_2$ and $\Sigma_2'$ are disjoint. Let R and R′ be (internal) rays of H landing at $z_2$ and $z_2'$, respectively.
Let S and S′ be disjoint arcs in the unbounded complementary domain of $B \cup \Sigma_2 \cup \Sigma_2'$ from infinity to an accessible point of $\Sigma_2$ and $\Sigma_2'$, respectively. Let Y denote the continuum $R \cup \Sigma_2 \cup S \cup S' \cup \Sigma_2' \cup R'$. Thus Y separates an accessible point v of H from another accessible point v′ of H (see Fig. 4). We can choose v such that the composant E of B containing v is disjoint from both $\Sigma_2$ and $\Sigma_2'$. Hence the composant E must contain points arbitrarily close to v′ (since E is dense in B), but it must be contained in the complementary domain of Y that contains v. This contradiction completes the proof.

8. Applications

The goal of this section is to determine some additional properties of the boundary B of a Siegel disk G of a polynomial f. No result of this section is used in the proof of the main theorem.

Theorem 8.1. If the restriction of f to the boundary B of a Siegel disk G of a polynomial is injective, then B contains at most a finite number of points of the boundary of any other component of the preimage of G. Furthermore, each such point is a critical point.

Proof. Let x belong to both B and the boundary of another component H of the preimage of G. Let $(z_i)$ be a sequence in H such that $z_i$ converges to x. Then $\tilde{z}_i$ converges to $\tilde{x}$. Since $f|_G : G \to G$ is a homeomorphism, there exists a sequence $(y_i)$ in G such that $\tilde{y}_i = \tilde{z}_i$. By hypothesis, $y_i$ converges to x. Hence x is a critical point.
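The last step of the proof of Theorem 8.1, and the finiteness claim in its statement, rest on two standard facts that the text leaves implicit. A brief sketch of the reasoning, assuming as elsewhere in the paper that d denotes the degree of f:

```latex
% The sequences lie in disjoint Fatou components G and H, so they never coincide:
\[
y_i \in G, \quad z_i \in H, \quad G \cap H = \emptyset
\ \Longrightarrow\ y_i \ne z_i ,
\qquad f(y_i) = f(z_i), \qquad y_i \to x, \quad z_i \to x .
\]
% Hence f is not injective on any neighborhood of x. If f'(x) were nonzero,
% the inverse function theorem would make f injective near x. Therefore
\[
f'(x) = 0 .
\]
% Finiteness: f' is a nonzero polynomial of degree d - 1, so
\[
\#\{\, x \in \mathbb{C} : f'(x) = 0 \,\} \le d - 1 ,
\]
% and B can meet the boundary of H in at most d - 1 points.
```

This also makes the bound in Theorem 8.1 explicit: the number of common boundary points is at most the number of critical points of f.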


Corollary 8.2. If the filled Siegel disk $\hat{G}$ of a polynomial does not contain a critical point, then the boundary B of G contains no point of the boundary of any other component of the preimage of G.

Proof. This follows from Theorem 1.

Corollary 8.3. If the restriction of f to the boundary B of a Siegel disk G of a polynomial is locally injective and B does not contain an indecomposable continuum, then B contains at most a finite number of points of the boundary of any other component of the preimage of G. Each such point is a critical point. Proof. This follows from Theorem 6.2 or the proof of [R4].

Theorem 8.4. If the restriction of f to the boundary B of a Siegel disk G of a polynomial is injective, then G is the only bounded complementary domain of B.

Proof. Let H be another bounded complementary domain of B. According to Theorem 3.3, H is a preperiodic component of the grand orbit of G. Each iterated image of H is also a bounded complementary domain of B, since B is invariant under f. Hence we may assume H is a component of the preimage of G. This contradicts Theorem 8.1 and completes the proof.

Theorem 8.5. If the restriction of f to the boundary B of a Siegel disk G of a polynomial is injective, then the boundary B does not contain a periodic point.

Proof. First recall that any periodic point that belongs to the Julia set must be a Cremer point, a parabolic periodic point, or a repelling periodic point. A. Poirier [GM, Corollary 3.5] has shown there cannot be a Cremer point in B. (The proof in [GM] assumes the Julia set is connected, but Poirier has pointed out that the assumption of connectivity can be removed by using a result of McMullen [Mc].) Thus any periodic point on the boundary must be either parabolic or repelling. C. Petersen [P, Corollary B.2] has shown that such a point must also belong to the boundary of some preperiodic component of the grand orbit of G. Again, since B is invariant, we may assume this component is in fact a component of the preimage of G. This contradiction of Theorem 8.1 completes the proof.

Corollary 8.6. If the filled Siegel disk $\hat{G}$ of a polynomial does not contain a critical point, then the boundary B does not contain a periodic point, and G is the only bounded complementary domain of B.

Recently R. Perez-Marco [PM] has shown that if G is a Siegel disk of a rational function f and if f is univalent on a simply connected neighborhood of G, then B does not contain a periodic point.

Acknowledgement. We thank John Mayer, Jack Milnor, and Lex Oversteegen for helpful comments and Julien Doucet for providing Figs. 1 and 2.

References

[Be] Beardon, A.F.: The components of a Julia set. Annal. Acad. Scient. Fenn. 16, 173–177 (1991)
[B] Blanchard, P.: Complex analytic dynamics on the Riemann sphere. Bull. Am. Math. Soc. 11, 85–141 (1984)

[CL] Collingwood, E.F. and Lohwater, A.J.: Theory of Cluster sets. Cambridge Tracts in Math. and Math. Physics 56, Cambridge: Cambridge University Press, 1966
[D1] Douady, A.: Systèmes dynamiques holomorphes. Séminaire Bourbaki, exposé 599, Astérisque 105–106, 39–63 (1983)
[D2] Douady, A.: Disques de Siegel et anneaux de Herman. Séminaire Bourbaki, exposé 677, Astérisque, 1986–87
[DH] Douady, A. and Hubbard, J.H.: Étude dynamique des complexes (deuxième partie). Publications Mathématiques d'Orsay 4, 1–154 (1985)
[G] Ghys, E.: Transformation holomorphe au voisinage d'une courbe de Jordan. C. R. Acad. Sc. Paris 289, 385–388 (1984)
[GM] Goldberg, Lisa and Milnor, John: Fixed points of polynomial maps. Part II. Fixed point portraits. Ann. Scient. Ec. Norm. Sup. 26, 51–98 (1993)
[H1] Herman, M.R.: Are there critical points on the boundary of singular domains? Commun. Math. Phys. 99, 593–612 (1985)
[H2] Herman, M.R.: Recent results and some open questions on Siegel's linearization theorems of germs of complex analytic diffeomorphisms of C^n over a fixed point. In: Proceedings of the Eighth Int. Cong. Math. Phys., Singapore: World Sci. 1986, pp. 138–198
[H3] Herman, M.R.: Construction of some curious diffeomorphisms of the Riemann sphere. J. London Math. Soc. 34, 375–384 (1986)
[HY] Hocking, J. and Young, G.: Topology. Reading, Mass.: Addison-Wesley Publishing Co., 1961
[Kr1] Krasinkiewicz, J.: On the composants of indecomposable plane continua. Bull. Pol. Acad. Sci. 20, 935–940 (1972)
[Kr2] Krasinkiewicz, J.: Boundaries of plane continua and the fixed point property. Bull. Pol. Acad. Sci. 21, 427–431 (1973)
[L] Lyubich, M.: The dynamics of rational transforms: The topological picture. Russian Math. Surveys 41:4, 43–117 (1986)
[MR] Mayer, J.C. and Rogers, J.T., Jr.: Indecomposable continua and the Julia sets of polynomials. Proc. Am. Math. Soc. 117, 795–802 (1993)
[Mc] McMullen, Curt: Automorphisms of rational maps. Holomorphic Functions and Moduli. New York: Springer-Verlag, 1988, pp. 31–60
[M] Milnor, J.: Dynamics in one complex variable: Introductory lectures. Preprint #1990/5, Institute for Mathematical Sciences, SUNY-Stony Brook
[Mo] Moeckel, R.: Rotations of the closures of some simply connected domains. Complex Variables Theory Appl. 4, 233–232 (1985)
[P] Petersen, C.L.: On the Pommerenke–Levin–Yoccoz inequality. Ergodic Theory and Dynamical Systems 13, 785–806 (1993)
[PM] Perez-Marco, R.: Fixed points and circle maps. Technical Report 67, Université de Paris-Sud, 1994
[R1] Rogers, J.T., Jr.: Is the boundary of a Siegel disk a Jordan curve? Bull. Am. Math. Soc. 27, 284–287 (1992)
[R2] Rogers, J.T., Jr.: Singularities in the boundaries of local Siegel disks. Ergodic Theory Dynamical Systems 12, 803–821 (1992)
[R3] Rogers, J.T., Jr.: The pseudo-circle is not homogeneous. Trans. Am. Math. Soc. 148, 417–428 (1970)
[R4] Rogers, J.T., Jr.: Critical points on the boundaries of Siegel disks. Bull. Am. Math. Soc. 32, 317–321 (1995)
[R5] Rogers, J.T., Jr.: Intrinsic rotations of simply connected regions and their boundaries. Complex Variables Applications 23, 17–23 (1993)
[Sp] Spanier, E.H.: Algebraic Topology. New York: McGraw-Hill, 1967
[W] Whyburn, G.T.: Analytic Topology. New York 1942
[Y] Yoccoz, J.-C.: Linéarisation des germes de difféomorphismes holomorphes de (C, 0). C. R. Acad. Sci. Paris 306, 55–58 (1988)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 195, 195 – 211 (1998)

Communications in Mathematical Physics © Springer-Verlag 1998

Quantum Symmetry Groups of Finite Spaces

Shuzhou Wang

Department of Mathematics, University of California, Berkeley, CA 94720, USA. E-mail: [email protected]

Received: 29 September 1997 / Accepted: 13 November 1997

Dedicated to Marc A. Rieffel on the occasion of his sixtieth birthday

Abstract: We determine the quantum automorphism groups of finite spaces. These are compact matrix quantum groups in the sense of Woronowicz.

1. Introduction

At Les Houches Summer School on Quantum Symmetries in 1995, Alain Connes posed the following problem: What is the quantum automorphism group of a space? Here the notion of a space is taken in the sense of noncommutative geometry [4], hence it can be either commutative or noncommutative. To put this problem in a proper context, let us recall that the notion of a group arises most naturally as symmetries of various kinds of spaces. As a matter of fact, this is how the notion of a group was discovered historically. However, the notion of a quantum group was discovered from several different points of view [10, 11, 8, 28, 29, 30, 31, 9], the most important of which is to view quantum groups as deformations of ordinary Lie groups or Lie algebras, instead of viewing them as quantum symmetry objects of noncommutative spaces. In [13], an important first step was made by Manin in this latter direction, where quantum groups are described as quantum symmetry objects of quadratic algebras.

In this paper, we solve the problem above for finite spaces (viz. finite dimensional C*-algebras). That is, we explicitly determine the quantum automorphism groups of such spaces. These spaces do not carry the additional geometric (Riemannian) structures in the sense of [4, 5]. The quantum automorphism groups for the latter geometric finite spaces can be termed quantum isometry groups. At the end of his book [4], Connes poses the problem of finding a finite quantum symmetry group for the finite geometric space used in his formulation of the Standard Model in particle physics. This problem is clearly related to the problem above he posed at Les Houches Summer School. We expect that the results in our paper will be useful for this problem. As a matter of fact, the quantum symmetry group for the finite geometric space of [4] should be a quantum subgroup of an appropriate quantum automorphism group described in this paper. The main difficulty is to find the natural quantum finite subgroup of the latter that deserves to be called the quantum isometry group.

This paper can be viewed as a continuation of the work of Manin [13] in the sense that the quantum groups we consider here are also quantum symmetry objects. However, it differs from the work of Manin in three main aspects. First, the noncommutative spaces on which Manin considers symmetries are quadratic algebras and are infinite, while the spaces on which we consider symmetries are not quadratic and are finite. Second, Manin's quantum groups are generated by infinitely many multiplicative matrices and admit many actions on the spaces in question, one action for each multiplicative matrix (for the notion of multiplicative matrices, see Manin [13]), while our quantum groups are generated by a single multiplicative matrix and they act on the spaces in question in one natural manner. Finally, Manin's quantum groups do not give rise to natural structures of C*-algebras in general (see [18]), while our quantum groups, besides having a purely algebraic formulation, are compact matrix quantum groups in the sense of Woronowicz [30]. Consequently we need to invoke some basic results of Woronowicz [30]. Loosely speaking, Manin's quantum groups are noncompact quantum groups. But to the best of the author's knowledge, it is not known how one can make this precise in the strict sense of Woronowicz [32]. On the other hand, it is natural to expect that quantum automorphism groups of finite spaces are compact quantum groups without knowing their explicit descriptions in this paper. The ideas in our earlier papers [19, 20, 18] on universal quantum groups play an important role in this paper.
Note that finite spaces are just finite dimensional C*-algebras; no deformation is involved. Moreover, as in [19, 20, 18], the quantum groups considered in this paper are intrinsic objects, not deformations of groups, so they are different from the quantum groups obtained by the traditional method of deformations of Lie groups (cf. [8, 9, 29, 31, 12, 16, 23]).

We summarize the contents of this paper. In Sect. 2, we recall some basic notions concerning actions of quantum groups and define the notion of a quantum automorphism group of a space. The most natural way to define a quantum automorphism group is by a categorical method, viz., to define it as a universal object in a certain category of quantum transformation groups. Sections 3, 4, 5 are devoted to the explicit determination of quantum automorphism groups for several categories of quantum transformation groups of the spaces $X_n$, $M_n(\mathbb{C})$, and $\bigoplus_{k=1}^m M_{n_k}(\mathbb{C})$, respectively. Though the main idea in the construction of quantum automorphism groups is the same for each of the spaces $X_n$, $M_n(\mathbb{C})$ and $\bigoplus_{k=1}^m M_{n_k}(\mathbb{C})$, the two special cases $X_n$ and $M_n$ offer interesting phenomena in their own right. Hence we deal with them separately and begin by considering the simplest case $X_n$. In Sect. 6, using the results of Sects. 3, 4, 5, we prove that a finite space has a quantum automorphism group in the category of all compact quantum transformation groups if and only if the finite space is $X_n$, and that a measured finite space (i.e. a finite space endowed with a positive functional) always has a quantum automorphism group.

A convention on terminology: In the following, we will use interchangeably both the term compact quantum groups and the term Woronowicz Hopf C*-algebras. The meaning should be clear from the context (cf. [19, 20, 23, 18]).

Notation. For every natural number n, and every *-algebra A, $M_n(A)$ denotes the *-algebra of n × n matrices with entries in A. We also use $M_n$ to denote $M_n(\mathbb{C})$, where $\mathbb{C}$ is the algebra of complex numbers. For every matrix $u = (a_{ij}) \in M_n(A)$, $u^t$ denotes the transpose of u; $\bar{u} = (a_{ij}^*)$ denotes the conjugate matrix of u; $u^* = \bar{u}^t$ denotes the adjoint matrix of u (this defines the ordinary *-operation on $M_n(A)$). The symbol X(A) denotes the set of all unital *-homomorphisms from A to $\mathbb{C}$. Finally, $X_n = \{x_1, \cdots, x_n\}$ is the finite space with n letters.

2. The Notion of Quantum Automorphism Groups

Part of the problem of Connes mentioned in the introduction is to make precise the notion of a quantum automorphism group, which we address in this section. First recall that the usual automorphism group Aut(X) of a space X consists of the set of all transformations on X that preserve the structure of X. A quantum group is not a set of transformations in general. Thus a naive imitation of the above definition of Aut(X) for quantum automorphisms will not work. However, we can recapture the definition of Aut(X) from the following universal property of Aut(X) in the category of transformation groups of X: If G is any group acting on X, then there is a unique morphism of transformation groups from G to Aut(X). This motivates our Definition 2.3 of quantum automorphism groups below.

The automorphism groups of finite spaces are compact Lie groups (e.g. $\mathrm{Aut}(X_n) = S_n$, the symmetric group on n letters, and $\mathrm{Aut}(M_n) = SU(n)$). For this reason, it is natural to expect that the quantum automorphism groups of such spaces are compact quantum groups, viz., Woronowicz Hopf C*-algebras. We will consider only such quantum groups in this paper. For basic notions on compact quantum groups, we refer the reader to [30, 19, 20]. Note that for every compact quantum group, there corresponds a full Woronowicz Hopf C*-algebra and a reduced Woronowicz Hopf C*-algebra [1, 22]. We will assume that all the Woronowicz Hopf C*-algebras in this paper are full, as morphisms behave well only with such algebras (see the discussions in III.7 of [22]).

Let A be a compact quantum group. Let $\epsilon$ be the unit of this quantum group (or counit of the full Woronowicz Hopf C*-algebra). Let $\mathcal{A}$ denote the canonical dense Hopf *-subalgebra of A consisting of coefficients of finite dimensional representations of the quantum group A.

Definition 2.1 (cf. [1, 3, 14]). A left action of a compact quantum group A on a C*-algebra B is a unital *-homomorphism $\alpha$ from B to $B \otimes A$ such that
(1) $(\mathrm{id}_B \otimes \Phi)\alpha = (\alpha \otimes \mathrm{id}_A)\alpha$, where $\Phi$ is the coproduct on A;
(2) $(\mathrm{id}_B \otimes \epsilon)\alpha = \mathrm{id}_B$;
(3) There is a dense *-subalgebra $\mathcal{B}$ of B, such that $\alpha$ restricts to a right coaction of the Hopf *-algebra $\mathcal{A}$ on $\mathcal{B}$.

We also call $(A, \alpha)$ a left quantum transformation group of B. Let $(\tilde{A}, \tilde{\alpha})$ be another left quantum transformation group of B. We define a morphism from $(\tilde{A}, \tilde{\alpha})$ to $(A, \alpha)$ to be a morphism $\pi$ of quantum groups from $\tilde{A}$ to A (which is the same thing as a morphism of Woronowicz Hopf C*-algebras from A to $\tilde{A}$, see [20]), such that
\[ \tilde{\alpha} = (\mathrm{id}_B \otimes \pi)\alpha. \]
It is easy to see that left quantum transformation groups of B form a category with the morphisms defined above. We call it the category of left quantum transformation groups of B.

Our definition of an action of a quantum group above appears to be different from the one in [14], but it is equivalent to the latter. More precisely, conditions (2) and (3) above are equivalent to the following density requirement, which is used in [1, 3, 14] for the definition of an action:
\[ (I \otimes A)\alpha(B) \text{ is norm dense in } B \otimes A, \]
but they are more natural and convenient for our purposes. It is not clear whether the injectivity condition on $\alpha$ imposed in [1, 3] is implied by the three conditions in the definition above. Our definition coincides with the notion of actions of groups on spaces when the quantum group A is a group and B is an ordinary space (simply by reversing the arrows).

The above definition is commonly called the right coaction of a unital Hopf C*-algebra. Note that for the Hopf C*-algebra A = C(G) of continuous functions over a compact group G, the notion of right coaction of A corresponds to the notion of left action of G on a C*-algebra B. For this reason, when we are dealing with a compact quantum group A, we call a right coaction of the underlying Woronowicz Hopf C*-algebra of A a left action of the quantum group A. In the following, we will omit the word left for actions of quantum transformation groups. This should not cause confusion.

Definition 2.2. Let $(A, \alpha)$ be a quantum transformation group of B. An element b of B is said to be fixed under $\alpha$ (or invariant under $\alpha$) if
\[ \alpha(b) = b \otimes 1_A. \]
The fixed point algebra $A^\alpha$ of the action $\alpha$ is
\[ \{ b \in B \mid \alpha(b) = b \otimes 1_A \}. \]
The quantum transformation group $(A, \alpha)$ is said to be ergodic if $A^\alpha = \mathbb{C}I$. A (continuous) functional $\phi$ on B is said to be invariant under $\alpha$ if
\[ (\phi \otimes \mathrm{id}_A)\alpha(b) = \phi(b) I_A \]
for all $b \in B$. For a given functional $\phi$ on B, we define the category of quantum transformation groups of the pair $(B, \phi)$ to be the category with objects that leave invariant the functional $\phi$. This is a subcategory of the category of all quantum transformation groups.

Besides the two categories of quantum transformation groups mentioned above, we also have the category of quantum transformation groups of Kac type for B, which is a full subcategory of the category of quantum transformation groups of B.

Definition 2.3. Let $\mathcal{C}$ be a category of quantum transformation groups of B. The quantum automorphism group of B in $\mathcal{C}$ is a universal final object $(A, \alpha)$ in the category $\mathcal{C}$. That is, if $(\tilde{A}, \tilde{\alpha})$ is an object in this category, then there is a unique morphism $\pi$ of quantum transformation groups from $(\tilde{A}, \tilde{\alpha})$ to $(A, \alpha)$. Let $\phi$ be a continuous functional on the algebra B. We define the quantum automorphism group of the pair $(B, \phi)$ to be the universal object in the category of quantum transformation groups of the pair $(B, \phi)$ (cf. Definition 2.1).

From categorical abstract nonsense, the quantum automorphism group of B (in a given category) is unique (up to isomorphism) if it exists. We emphasize in particular that the notion of a quantum automorphism group depends on the category of quantum transformation groups of B, not only on B. As a matter of fact, for a finite space B other than $X_n$, we will show in Theorem 6.1 that the quantum automorphism group does not exist for the category of all quantum transformation groups. In the subcategory of quantum transformation groups of B with objects consisting of compact transformation groups, the universal object is precisely the ordinary automorphism group Aut(B), as mentioned in the beginning of this section.

We will also use the following notion, which generalizes the usual notion of a faithful group action.

Definition 2.4. Let $(A, \alpha)$ be a quantum transformation group of B. We say that the action $\alpha$ is faithful if there is no proper Woronowicz Hopf C*-subalgebra $A_1$ of A such that $\alpha$ is an action of $A_1$ on B.

If $(A, \alpha)$ is a quantum automorphism group in some category of quantum transformation groups on B, then the action $\alpha$ is faithful. We leave the verification of this to the reader as an exercise.

3. Quantum Automorphism Group of Finite Space $X_n$

By the Gelfand–Naimark theorem, we can identify $X_n = \{x_1, \cdots, x_n\}$ with the C*-algebra $B = C(X_n)$ of continuous functions on $X_n$. The algebra B has the following presentation,
\[ B = C^*\Big\{ e_i \;\Big|\; e_i^2 = e_i = e_i^*,\ \sum_{r=1}^{n} e_r = 1,\ i = 1, \cdots, n \Big\}. \]

The ordinary automorphism group $\mathrm{Aut}(X_n) = \mathrm{Aut}(B)$ of $X_n$ is the symmetric group $S_n$ on n symbols. We can put the group $S_n$ in the framework of Woronowicz as follows. As a transformation group, $S_n$ can be thought of as the collection of all permutation matrices
\[ g = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}. \]
When g varies in $S_n$, the $a_{ij}$'s (i, j = 1, · · · , n) are functions on the group $S_n$ satisfying the following relations:
\begin{align}
a_{ij}^2 = a_{ij} = a_{ij}^*, &\qquad i, j = 1, \cdots, n, \tag{3.1}\\
\sum_{j=1}^{n} a_{ij} = 1, &\qquad i = 1, \cdots, n, \tag{3.2}\\
\sum_{i=1}^{n} a_{ij} = 1, &\qquad j = 1, \cdots, n. \tag{3.3}
\end{align}
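As a sanity check on relations (3.1)–(3.3) at the level of scalar values, the following minimal sketch enumerates the permutation matrices for n = 3 and confirms that they are exactly the 0/1 matrices satisfying the three relations. The function names and the choice n = 3 are illustrative assumptions, not notation from the paper.

```python
from itertools import permutations, product

def permutation_matrix(perm):
    """Matrix g with g[i][j] = 1 exactly when the permutation sends j to i."""
    n = len(perm)
    return [[1 if perm[j] == i else 0 for j in range(n)] for i in range(n)]

def satisfies_relations(g):
    """Scalar version of (3.1)-(3.3): idempotent entries, unit row and column sums."""
    n = len(g)
    idempotent = all(g[i][j] ** 2 == g[i][j] for i in range(n) for j in range(n))
    row_sums = all(sum(g[i]) == 1 for i in range(n))
    col_sums = all(sum(g[i][j] for i in range(n)) == 1 for j in range(n))
    return idempotent and row_sums and col_sums

n = 3  # illustrative size
matrices = [permutation_matrix(p) for p in permutations(range(n))]
all_ok = all(satisfies_relations(g) for g in matrices)

# Conversely: a scalar with x^2 = x is 0 or 1, so it suffices to scan 0/1 matrices.
candidates = ([list(bits[r * n:(r + 1) * n]) for r in range(n)]
              for bits in product((0, 1), repeat=n * n))
solutions = [g for g in candidates if satisfies_relations(g)]

print(len(matrices), all_ok, sorted(solutions) == sorted(matrices))
# prints: 6 True True
```

This only probes commuting (scalar) solutions of the relations; it says nothing about non-commuting ones.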

It is easy to see that the commutative C*-algebra defined by the above relations is the Woronowicz Hopf C*-algebra $C(S_n)$. In other words, the group $S_n$ is completely determined by these relations. The following theorem shows that we have obtained much more: If we remove the condition that the $a_{ij}$'s commute with each other, these relations define the quantum automorphism group of $X_n$.

Theorem 3.1. Let A be the C*-algebra with generators $a_{ij}$ (i, j = 1, · · · , n) and defining relations (3.1)–(3.3). Then
(1) A is a compact quantum group of Kac type;
(2) The formula
\[ \alpha(e_j) = \sum_{i=1}^{n} e_i \otimes a_{ij}, \qquad j = 1, \cdots, n \]
defines a quantum transformation group $(A, \alpha)$ of B. It is the quantum automorphism group of B in the category of all compact quantum transformation groups (hence also in the category of compact quantum groups of Kac type) of B, and it contains the ordinary automorphism group $\mathrm{Aut}(X_n) = S_n$ (in fact, $\{(\chi(a_{ij})) \mid \chi \in X(A)\}$ is precisely the set of permutation matrices).

Because of (2) above, we will denote the quantum group above by $A_{aut}(X_n)$. We will call it the quantum permutation group on n symbols.

Proof. (1) It is easy to check that there is a well-defined homomorphism $\Phi$ from A to $A \otimes A$ with the property
\[ \Phi(a_{ij}) = \sum_{k=1}^{n} a_{ik} \otimes a_{kj}, \qquad i, j = 1, \cdots, n. \]

Using (3.1)–(3.3), it is also easy to check that $u = (a_{ij})$ is an orthogonal matrix. Hence (A, u) is a quantum subgroup of $A_o(n)$, so it is of Kac type (cf. [19, 20, 18]).

To prove (2), note that the generators $\{e_i\}_{i=1}^n$ form a basis of the vector space B, so an action $\tilde{\alpha}$ of any quantum group $\tilde{A}$ on B is uniquely determined by its effect on the $e_i$'s:
\[ \tilde{\alpha}(e_j) = \sum_{i=1}^{n} e_i \otimes \tilde{a}_{ij}, \qquad j = 1, \cdots, n. \]
The condition that $\tilde{\alpha}$ is a *-homomorphism together with the equations
\[ e_i^2 = e_i = e_i^*, \qquad i = 1, \cdots, n \]
shows that the $\tilde{a}_{ij}$'s satisfy the relations (3.1). The condition that $\tilde{\alpha}$ is a unital homomorphism together with the equation
\[ \sum_{i=1}^{n} e_i = 1 \]
shows that the $\tilde{a}_{ij}$'s satisfy (3.2). Let $\tilde{u} = (\tilde{a}_{ij})$. Then we have
\[ \tilde{u}\tilde{u}^* = I_n. \]
The condition in Definition 2.1 (2) means that
\[ \epsilon(\tilde{a}_{ij}) = \delta_{ij}, \qquad i, j = 1, \cdots, n. \]

By condition (3) of Definition 2.1, the $\tilde{a}_{ij}$'s are in $\tilde{\mathcal{A}}$. Hence by Proposition 3.2 of [30], it follows that $\tilde{u} = (\tilde{a}_{ij})$ is a non-degenerate smooth representation of the quantum group $\tilde{A}$. In particular, $\tilde{u}$ is also left invertible,
\[ \tilde{u}^*\tilde{u} = I_n. \]
This implies that the $\tilde{a}_{ij}$'s satisfy the relations (3.3). From these we see that $(A, \alpha)$ is a universal quantum transformation group of B: there is a unique morphism $\pi$ of quantum transformation groups from $(\tilde{A}, \tilde{\alpha})$ to $(A, \alpha)$ such that
\[ \pi(a_{ij}) = \tilde{a}_{ij}, \qquad i, j = 1, \cdots, n. \]

It is clear that the maximal subgroup of the quantum group A is $S_n$, that is, the set $\{(\chi(a_{ij})) \mid \chi \in X(A)\}$ is precisely the set of permutation matrices.

Remarks. (1) For each pair i, j, let $A_{ij}$ be the group C*-algebra $C^*(\mathbb{Z}/2\mathbb{Z})$ with generator $p_{ij}$, $p_{ij}^2 = p_{ij} = p_{ij}^*$ (i, j = 1, · · · , n). Then the C*-algebra A is isomorphic to the following quotient C*-algebra of the free product of the $A_{ij}$'s:
\[ \big(\ast_{i,j=1}^{n} A_{ij}\big) \Big/ \Big\langle \sum_{r=1}^{n} p_{rj} = 1 = \sum_{s=1}^{n} p_{is},\ i, j = 1, \cdots, n \Big\rangle. \]

(2) Let $\phi$ be the unique $S_n$-invariant probability measure on $X_n$. Then it is easy to see that $\phi$ is a fixed functional under the action of the quantum group $A_{aut}(X_n)$ defined in Theorem 3.1. Hence $A_{aut}(X_n)$ is also the quantum automorphism group for the pair $(X_n, \phi)$.

(3) Let Q > 0 be a positive n × n matrix. Let $A^Q_{aut}(X_n)$ be the C*-algebra with generators $a_{ij}$ (i, j = 1, · · · , n) and the defining relations given by (3.1)–(3.2) along with the following set of relations:
\[ u^t Q u Q^{-1} = I_n = Q u Q^{-1} u^t, \tag{3.4} \]

where $u = (a_{ij})$. Then it is not hard to verify that $(A^Q_{aut}(X_n), \alpha)$ is a compact quantum transformation subgroup of the one defined in Theorem 3.1 (hence the $a_{ij}$'s also satisfy the relations (3.3)); here $\alpha$ is as in Theorem 3.1. Note also that for $Q = I_n$, $A^Q_{aut}(X_n) = A_{aut}(X_n)$.

4. Quantum Automorphism Group of Finite Space $M_n(\mathbb{C})$

Notation. Let $u = (a^{kl}_{ij})_{i,j,k,l=1}^{n}$ and $v = (b^{kl}_{ij})_{i,j,k,l=1}^{n}$ with entries from a *-algebra. Define uv to be the matrix whose entries are given by
\[ (uv)^{kl}_{ij} = \sum_{r,s=1}^{n} a^{kl}_{rs} b^{rs}_{ij}, \qquad i, j, k, l = 1, \cdots, n. \]
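The four-index product above is ordinary matrix multiplication of $n^2 \times n^2$ matrices once the pair (k, l) is read as a row index and (i, j) as a column index. The sketch below checks this for n = 2 with integer entries; the dictionary storage, the helper names, and the sample entries are assumptions made only for illustration.

```python
from itertools import product

def multiply(u, v, n):
    """(uv)^{kl}_{ij} = sum over r, s of u^{kl}_{rs} v^{rs}_{ij}; entries keyed by (k, l, i, j)."""
    idx = range(n)
    return {
        (k, l, i, j): sum(u[(k, l, r, s)] * v[(r, s, i, j)] for r in idx for s in idx)
        for k, l, i, j in product(idx, repeat=4)
    }

def flatten(u, n):
    """Ordinary matrix with row index k*n + l and column index i*n + j."""
    m = [[0] * (n * n) for _ in range(n * n)]
    for (k, l, i, j), value in u.items():
        m[k * n + l][i * n + j] = value
    return m

def matmul(x, y):
    size = len(x)
    return [[sum(x[a][c] * y[c][b] for c in range(size)) for b in range(size)]
            for a in range(size)]

n = 2
indices = list(product(range(n), repeat=4))
u = {(k, l, i, j): k + 2 * l + 3 * i + 5 * j for k, l, i, j in indices}  # arbitrary sample
v = {(k, l, i, j): k * l - i + j for k, l, i, j in indices}              # arbitrary sample
identity = {(k, l, i, j): int((k, l) == (i, j)) for k, l, i, j in indices}

agrees_with_flattening = flatten(multiply(u, v, n), n) == matmul(flatten(u, n), flatten(v, n))
identity_is_neutral = multiply(u, identity, n) == u and multiply(identity, u, n) == u
print(agrees_with_flattening, identity_is_neutral)
```

The matrix called `identity` here plays the role of $I_n^{\otimes 2}$, with entries $\delta_{ki}\delta_{lj}$.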

Let $\psi = \mathrm{Tr}$ be the trace functional on $M_n$ (so $\phi = \frac{1}{n}\psi$ is the unique $\mathrm{Aut}(M_n)$-invariant state on $M_n$). The C*-algebra $M_n$ has the following presentation:
\[ B = C^*\Big\{ e_{ij} \;\Big|\; e_{ij}e_{kl} = \delta_{jk}e_{il},\ e_{ij}^* = e_{ji},\ \sum_{r=1}^{n} e_{rr} = 1,\ i, j, k, l = 1, \cdots, n \Big\}. \]

Theorem 4.1. Let A be the C*-algebra with generators $a^{kl}_{ij}$ and the following defining relations (4.1)–(4.5):
\begin{align}
\sum_{v=1}^{n} a^{kv}_{ij} a^{vl}_{rs} = \delta_{jr} a^{kl}_{is}, &\qquad i, j, k, l, r, s = 1, \cdots, n, \tag{4.1}\\
\sum_{v=1}^{n} a^{sr}_{lv} a^{ji}_{vk} = \delta_{jr} a^{si}_{lk}, &\qquad i, j, k, l, r, s = 1, \cdots, n, \tag{4.2}\\
(a^{kl}_{ij})^* = a^{lk}_{ji}, &\qquad i, j, k, l = 1, \cdots, n, \tag{4.3}\\
\sum_{r=1}^{n} a^{kl}_{rr} = \delta_{kl}, &\qquad k, l = 1, \cdots, n, \tag{4.4}\\
\sum_{r=1}^{n} a^{rr}_{kl} = \delta_{kl}, &\qquad k, l = 1, \cdots, n. \tag{4.5}
\end{align}
Then
(1) A is a compact quantum group of Kac type;
(2) The formula
\[ \alpha(e_{ij}) = \sum_{k,l=1}^{n} e_{kl} \otimes a^{kl}_{ij}, \qquad i, j = 1, \cdots, n \]
defines a quantum transformation group $(A, \alpha)$ of $(M_n, \psi)$. It is the quantum automorphism group of $(M_n, \psi)$ in the category of compact quantum transformation groups (hence also in the category of compact quantum groups of Kac type) of $(M_n, \psi)$, and it contains the ordinary automorphism group $\mathrm{Aut}(M_n) = SU(n)$.

We will denote the quantum group above by $A_{aut}(M_n)$.
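The shape of relations (4.1) and (4.5) can be read off by expanding what it means for a map of the form $\alpha(e_{ij}) = \sum_{k,l} e_{kl} \otimes a^{kl}_{ij}$ to be multiplicative and to preserve $\psi$. The following is a sketch of these two computations, using only the matrix-unit rule $e_{ij}e_{kl} = \delta_{jk}e_{il}$ and $\psi(e_{kl}) = \delta_{kl}$; the dummy index $v'$ is introduced here for bookkeeping only.

```latex
\begin{align*}
\alpha(e_{ij})\,\alpha(e_{rs})
  &= \sum_{k,v,v',l} e_{kv}e_{v'l} \otimes a^{kv}_{ij}a^{v'l}_{rs}
   = \sum_{k,l} e_{kl} \otimes \sum_{v} a^{kv}_{ij}a^{vl}_{rs}, \\
\alpha(e_{ij}e_{rs})
  &= \delta_{jr}\,\alpha(e_{is})
   = \sum_{k,l} e_{kl} \otimes \delta_{jr}\,a^{kl}_{is}, \\
(\psi \otimes \mathrm{id})\,\alpha(e_{kl})
  &= \sum_{r,s} \psi(e_{rs})\,a^{rs}_{kl}
   = \sum_{r} a^{rr}_{kl},
  \qquad \psi(e_{kl})\,1 = \delta_{kl}\,1 .
\end{align*}
```

Comparing the coefficients of $e_{kl}$ in the first two lines gives (4.1), and equating the two expressions in the third line gives (4.5).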

Proof. (1) It is easy to check that the matrix u = (akl ¯ = (akl ij ) as well as its conjugate u ij ) are both unitary matrices, and that the formulas 8(akl ij ) =

n X

rs akl rs ⊗ aij ,

i, j, k, l = 1, · · · , n

r,s=1

gives a well-defined map from A to A ⊗ A (this is the coproduct). Hence A is a quantum subgroup of Au (m) (with m = n2 ), so it is of Kac type (cf. [19, 20, 18]). ˜ α) (2) Let (A, ˜ be any quantum transformation group of Mn . Being a basis for the vector space Mn , the eij ’s uniquely determine the action α: ˜ α(e ˜ ij ) =

n X

ekl ⊗ a˜ kl ij ,

i, j = 1, · · · , n.

k,l=1

The condition that α˜ is a homomorphism together with the equations eij ekl = δjk eil ,

i, j, k, l = 1, · · · , n

shows that the a˜ kl ˜ preserves the *-operation together ij ’s satisfy (4.1). The condition that α with the equations

Quantum Symmetry Groups of Finite Spaces

203

e∗ij = eji ,

i, j = 1, · · · , n

˜ preserves the units together with shows that the a˜ kl ij ’s satisfy (4.3). The condition that α the identity X err = 1 r

shows that the a˜ kl ˜ leaves the trace ψ ij ’s satisfy (4.4). The condition that α that the a˜ kl ’s satisfy (4.5). ij To show that the a˜ kl ij ’s satisfy (4.2), first it is an easy check that

invariant shows

u˜ ∗ u˜ = In⊗2 , n ˜ where u˜ = (˜akl ˜ kl ij )i,j,k,l=1 . By condition (3) of Definition 2.1, the a ij ’s are in A. Hence by Proposition 3.2 of [30], we see that u˜ is a non-degenerate smooth representation of the ˜ In particular, u˜ is also right invertible, quantum group A.

u˜ u˜ ∗ = In⊗2 , which means that n X

a˜ kl ˜ sr ij a ji = δkr δls ,

k, l, r, s = 1, · · · , n.

i,j=1

From these relations and the relations (4.1), (4.3)-(4.5), we deduce that both matrices u˜ and u˜ t are unitary. This shows that the quantum group A1 generated by the coefficients ˜ is a bounded a˜ kl ij is a compact quantum group of Kac type. That is, the antipode κ *-antihomomorphism when restricted to A1 . Put ˜ akl aji v = (bkl ij ) = (κ(˜ ij )) = (˜ lk ). Then in the opposite algebra A1 op (which has the same elements as A1 with multiplica˜ kl tion reserved), the bkl ij ’s satisfy the relations (4.1), which means that the a ij ’s satisfy the ˜ relations (4.2) in the algebra A. From the above consideration we see that (A, α) is a quantum transformation group of Mn , and that there is a unique morphism π of quantum groups from A˜ to A such that π(akl ˜ kl ij ) = a ij ,

i, j, k, l = 1, · · · , n.

It is routine to check that π is the unique morphism π of quantum transformation groups ˜ α) from (A, ˜ to (A, α). From the relations (4.1)–(4.5), one can show that each matrix (χ(akl ij )) (χ ∈ X(Aaut (Mn ))) defines an automorphism of Mn by the formulas in Theorem 4.1 (2). This means that the maximal subgroup X(Aaut (Mn )) is naturally embedded in Aut(Mn ). Conversely, it is clear that Aut(Mn ) can be embedded as a subgroup of the maximal subgroup X(Aaut (Mn )) of Aaut (Mn ). ∗ Remark. Consider the quantum group (Au (n), (aij )) (cf. [20, 18]). Put a˜ kl ij = aki alj . Then the a˜ kl ˜ kl ij ’s satisfies the relations (4.1)–(4.5). From this we see that the a ij ’s determines a quantum subgroup of Aaut (Mn ). Hence the Woronowicz Hopf C ∗ -algebra Aaut (Mn ) is noncommutative and noncocommutative. How big is the subalgebra of Au (n) generated by the a˜ kl ij ? An answer to this question will shed light on the structure of the C ∗ -algebra Aaut (Mn ).
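The substitution $\tilde{a}^{kl}_{ij} = a_{ki}a_{lj}^*$ from the Remark can be tested numerically at the scalar level, that is, with the generators $a_{ij}$ of $A_u(n)$ replaced by the entries of one fixed unitary matrix. The sketch below does this for n = 2; the sample unitary, the tolerance, and the helper names are assumptions made for illustration, and a scalar check of this kind does not probe the noncommutative structure.

```python
from itertools import product
from math import sqrt

n = 2
c = 1 / sqrt(2)
# A sample 2x2 unitary matrix (arbitrary choice for illustration).
u = [[complex(c, 0), complex(c, 0)],
     [complex(0, c), complex(0, -c)]]

def a(k, l, i, j):
    """Scalar value of a~^{kl}_{ij} = u_{ki} * conj(u_{lj})."""
    return u[k][i] * u[l][j].conjugate()

def delta(x, y):
    return 1 if x == y else 0

def close(x, y):
    return abs(x - y) < 1e-12

idx = range(n)
# (4.1): sum_v a^{kv}_{ij} a^{vl}_{rs} = delta_{jr} a^{kl}_{is}
rel_41 = all(close(sum(a(k, v, i, j) * a(v, l, r, s) for v in idx), delta(j, r) * a(k, l, i, s))
             for i, j, k, l, r, s in product(idx, repeat=6))
# (4.2): sum_v a^{sr}_{lv} a^{ji}_{vk} = delta_{jr} a^{si}_{lk}
rel_42 = all(close(sum(a(s, r, l, v) * a(j, i, v, k) for v in idx), delta(j, r) * a(s, i, l, k))
             for i, j, k, l, r, s in product(idx, repeat=6))
# (4.3): (a^{kl}_{ij})^* = a^{lk}_{ji}
rel_43 = all(close(a(k, l, i, j).conjugate(), a(l, k, j, i))
             for i, j, k, l in product(idx, repeat=4))
# (4.4): sum_r a^{kl}_{rr} = delta_{kl}
rel_44 = all(close(sum(a(k, l, r, r) for r in idx), delta(k, l))
             for k, l in product(idx, repeat=2))
# (4.5): sum_r a^{rr}_{kl} = delta_{kl}
rel_45 = all(close(sum(a(r, r, k, l) for r in idx), delta(k, l))
             for k, l in product(idx, repeat=2))

print(rel_41, rel_42, rel_43, rel_44, rel_45)
```

Each relation reduces to the unitarity of the sample matrix, which is why the checks succeed up to rounding error.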

Proposition 4.2. Let Q > 0 be a positive matrix in $M_n(\mathbb{C}) \otimes M_n(\mathbb{C})$. Let A be the C*-algebra with generators $a^{kl}_{ij}$ and defining relations given by (4.1), (4.3), (4.4), along with the following set of relations:
\[ u^* Q u Q^{-1} = I_n^{\otimes 2} = Q u Q^{-1} u^*, \tag{4.6} \]
where $u = (a^{kl}_{ij})$. Then A is a compact quantum group that acts faithfully on $M_n$ in the following manner,
\[ \alpha(e_{ij}) = \sum_{k,l=1}^{n} e_{kl} \otimes a^{kl}_{ij}, \qquad i, j = 1, \cdots, n, \]

and its maximal subgroup is isomorphic to a subgroup of $\mathrm{Aut}(M_n) \cong SU(n)$. Any faithful compact quantum transformation group of $M_n$ is a quantum subgroup of $(A, \alpha)$ for some positive Q.

Proof. First we show that A is a compact quantum group. Let $v = Q^{1/2} u Q^{-1/2}$. Then (4.6) is equivalent to
\[ v^*v = I_n^{\otimes 2} = vv^*. \]
Hence the C*-algebra A is well defined. The set of relations in (4.6) shows that u is invertible. We claim that $u^t$ is also invertible. For simplicity of notation in the following computation, let $\tilde{Q} = (\tilde{q}^{kl}_{ij}) = Q^{-1}$. Then, using (4.3) to write the entries of $u^*$, (4.6) becomes
\[ \sum_{k,l,r,s,x,y=1}^{n} a^{lk}_{ji}\, q^{kl}_{rs}\, a^{rs}_{xy}\, \tilde{q}^{xy}_{ef} = \delta^{ij}_{ef} = \sum_{k,l,r,s,x,y=1}^{n} q^{ij}_{kl}\, a^{kl}_{rs}\, \tilde{q}^{rs}_{xy}\, a^{fe}_{yx}, \]
where i, j, e, f = 1, · · · , n. Put $P = (p^{kl}_{ij})$ and $\tilde{P} = (\tilde{p}^{kl}_{ij})$, where
\[ p^{kl}_{ij} = q^{lk}_{ij}, \qquad \tilde{p}^{kl}_{ij} = \tilde{q}^{kl}_{ji}, \qquad i, j, k, l = 1, \cdots, n. \]
Then $P^{-1} = \tilde{P}$, and the relations (4.6) become
\[ u^t P u P^{-1} = I_n^{\otimes 2} = P u P^{-1} u^t. \]
This proves our claim. Now it is easy to check that A is a compact matrix quantum group with coproduct $\Phi$ given by the same formulas as in the proof of Theorem 4.1 (1).

Let $(\tilde{A}, \tilde{\alpha})$ be a faithful quantum transformation group of $M_n$. We saw in the proof of Theorem 4.1 that there are elements $\tilde{a}^{kl}_{ij}$ (i, j, k, l = 1, · · · , n) in the C*-algebra $\tilde{A}$ that satisfy the relations (4.1), (4.3) and (4.4). The condition in Definition 2.1 (2) means that
\[ \epsilon(\tilde{a}^{kl}_{ij}) = \delta^{kl}_{ij}, \qquad i, j, k, l = 1, \cdots, n. \]
By condition (3) of Definition 2.1, the $\tilde{a}^{kl}_{ij}$'s are in $\tilde{\mathcal{A}}$. Hence by Proposition 3.2 of [30], this implies that $\tilde{u} = (\tilde{a}^{kl}_{ij})$ is a non-degenerate smooth representation of the quantum group $\tilde{A}$. From the proof of Theorem 5.2 of [30], with
\[ Q = (\mathrm{id} \otimes \tilde{h})(\tilde{u}^*\tilde{u}), \]
we have Q > 0 and $\tilde{u}$ satisfies (4.6). The assumption that $(\tilde{A}, \tilde{\alpha})$ is faithful implies that $\tilde{A}$ is generated by the elements $\tilde{a}^{kl}_{ij}$ (i, j, k, l = 1, · · · , n). This shows that $(A, \alpha)$ is a well defined faithful quantum transformation group of $M_n$ and that the compact quantum transformation group $(\tilde{A}, \tilde{\alpha})$ is a quantum subgroup of $(A, \alpha)$.

Let $\chi \in X(A)$. From the defining relations for A, we see that $(\chi(a^{kl}_{ij}))$ defines an ordinary transformation for $M_n$ via the formulas in Proposition 4.2. Hence the maximal subgroup X(A) is embedded in $\mathrm{Aut}(M_n)$.

Note. We will denote the quantum group above by $A^Q_{aut}(M_n)$. If $Q = I_n^{\otimes 2}$, then it is easy to see that the square of the coinverse (i.e. antipode) map is the identity map. From this one can show that this quantum group reduces to the quantum group $A_{aut}(M_n)$ in Theorem 4.1.

5. Quantum Automorphism Group of Finite Space $\bigoplus_{k=1}^{m} M_{n_k}(\mathbb{C})$

Notation. Let $u = (a^{kl}_{rs,xy})$ and $v = (b^{kl}_{rs,xy})$ be two matrices with entries from a *-algebra, where
\[ k, l = 1, \cdots, n_x, \quad r, s = 1, \cdots, n_y, \quad x, y = 1, \cdots, m. \]
Define uv to be the matrix whose entries are given by
\[ (uv)^{kl}_{rs,xy} = \sum_{p=1}^{m} \sum_{i,j=1}^{n_p} a^{kl}_{ij,xp}\, b^{ij}_{rs,py}. \]

Using the same method Lmas above, we now study the quantum automorphism group of the finite space B = k=1 Mnk , where nk is a positive integer. The C ∗ -algebra B has the following presentation: B = C ∗ {ekl,i | ekl,i ers,j = δij δlr eks , e∗kl,i = elk,i ,

nq m X X

epp,q = 1,

q=1 p=1

k, l = 1, · · · , ni , r, s = 1, · · · , nj , i, j = 1, · · · , m}. Let ψ be the positive functional on B defined by ψ(ekl,i ) = T r(ekl,i ) = δkl , k, l = 1, · · · , ni , i = 1, · · · , m. The defining relations for the quantum group of (B, ψ) are obtained as a combination of the relations of the quantum automorphism groups Aaut (Xn ) and Aaut (Mn ). Theorem 5.1. Let A be the C ∗ -algebra with generators akl rs,xy k, l = 1, · · · , nx , r, s = 1, · · · , ny , x, y = 1, · · · , m, and the following defining relations (5.1)–(5.5): nx X

vl kl akv ij,xy ars,xz = δjr δyz ais,xy ,

v=1

i, j = 1, · · · , ny , r, s = 1, · · · , nz , k, l = 1, · · · , nx , x, y, z = 1, · · · , m,

(5.1)

206

S. Wang nx X v=1

ji si asr lv,yx avk,zx = δjr δyz alk,yx ,

(5.2)

i, j = 1, · · · , nz , r, s = 1, · · · , ny , k, l = 1, · · · , nx , x, y, z = 1, · · · , m, ∗

lk akl ij,yz = aji,yz ,

(5.3)

i, j = 1, · · · , nz , k, l = 1, · · · , ny , y, z = 1, · · · , m, nz m X X

akl rr,yz = δkl ,

z=1 r=1 ny m X X

arr kl,yz = δkl ,

k, l = 1, · · · , ny ,

y = 1, · · · , m,

(5.4)

k, l = 1, · · · , nz ,

z = 1, · · · , m.

(5.5)

y=1 r=1

Then (1) A is a compact quantum group of Kac type; (2) The formulas α(ers,j ) =

ni m X X

ekl,i ⊗ akl rs,ij ,

r, s = 1, · · · , nj ,

j = 1, · · · , m

i=1 k,l

define a quantum transformation group (A, α) of (B, ψ). This is the quantum automorphism group of (B, ψ) in the category of compact quantum transformation groups (hence also in the category of compact quantum groups of Kac type) of (B, ψ), and it contains the ordinary automorphism group Aut(B).

We will denote the quantum group above by $A_{aut}(B)$.

Proof. The proof of this theorem follows the lines of the proof of Theorem 4.1. The coproduct is given by

$$\Phi\big(a^{kl}_{ij,xy}\big) = \sum_{p=1}^{m}\sum_{r,s=1}^{n_p} a^{kl}_{rs,xp}\otimes a^{rs}_{ij,py}, \qquad k,l = 1,\dots,n_x,\quad x,y = 1,\dots,m.$$
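The statement that $A_{aut}(B)$ contains the ordinary automorphism group can be illustrated numerically in the simplest case m = 1, $B = M_n$. The sketch below is an illustration only and is not taken from the paper: it assumes an inner automorphism $e_{rs} \mapsto U e_{rs} U^*$ with a hypothetical 2×2 unitary U, so that the scalars $a^{kl}_{rs} = U_{kr}\overline{U_{ls}}$ play the role of the generators, and it checks relations (5.1)–(5.5) up to rounding error.

```python
import itertools

# Hypothetical illustration (not from the paper): a fixed 2x2 unitary U.
S = 2 ** -0.5
U = [[S, S], [1j * S, -1j * S]]
N = len(U)

def a(k, l, r, s):
    """Scalar a^{kl}_{rs} in  U e_rs U^* = sum_{k,l} a^{kl}_{rs} e_kl."""
    return U[k][r] * U[l][s].conjugate()

def delta(i, j):
    return 1.0 if i == j else 0.0

def max_violation():
    """Largest absolute deviation from relations (5.1)-(5.5) with m = 1."""
    idx = range(N)
    worst = 0.0
    for i, j, r, s, k, l in itertools.product(idx, repeat=6):
        # (5.1): sum_v a^{kv}_{ij} a^{vl}_{rs} = delta_{jr} a^{kl}_{is}
        lhs = sum(a(k, v, i, j) * a(v, l, r, s) for v in idx)
        worst = max(worst, abs(lhs - delta(j, r) * a(k, l, i, s)))
        # (5.2): sum_v a^{sr}_{lv} a^{ji}_{vk} = delta_{jr} a^{si}_{lk}
        lhs = sum(a(s, r, l, v) * a(j, i, v, k) for v in idx)
        worst = max(worst, abs(lhs - delta(j, r) * a(s, i, l, k)))
    for i, j, k, l in itertools.product(idx, repeat=4):
        # (5.3): (a^{kl}_{ij})^* = a^{lk}_{ji}
        worst = max(worst, abs(a(k, l, i, j).conjugate() - a(l, k, j, i)))
    for k, l in itertools.product(idx, repeat=2):
        # (5.4): sum_r a^{kl}_{rr} = delta_{kl}
        worst = max(worst, abs(sum(a(k, l, r, r) for r in idx) - delta(k, l)))
        # (5.5): sum_r a^{rr}_{kl} = delta_{kl}
        worst = max(worst, abs(sum(a(r, r, k, l) for r in idx) - delta(k, l)))
    return worst

print(max_violation())
```

Because the $a^{kl}_{rs}$ here are commuting scalars, this only probes a classical point of the quantum group; it says nothing about the noncommutative part of the universal algebra.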

Note that when $n_k = 1$ for all k, the quantum group $A_{aut}(B)$ reduces to the quantum group $A_{aut}(X_n)$ in Theorem 3.1, and when m = 1, $A_{aut}(B)$ reduces to the quantum group $A_{aut}(M_n)$ in Theorem 4.1.

Let $Q = (q^{kl}_{rs,xy}) > 0$ ($k,l = 1,\dots,n_x$, $r,s = 1,\dots,n_y$, $x,y = 1,\dots,m$) be a positive matrix with complex entries. Define $\delta^{kl}_{rs,xy}$ to be 1 if k = r, l = s, x = y and 0 otherwise, and let I be the matrix with entries $\delta^{kl}_{rs,xy}$, where $k,l = 1,\dots,n_x$, $r,s = 1,\dots,n_y$, $x,y = 1,\dots,m$.


Proposition 5.2. Let Q and I be as above. Let A be the C*-algebra with generators $a^{kl}_{rs,xy}$, $k,l = 1,\dots,n_x$, $r,s = 1,\dots,n_y$, $x,y = 1,\dots,m$, and defining relations (5.1), (5.3), (5.4), along with the following set of relations:

$$u^* Q u Q^{-1} = I = Q u Q^{-1} u^*, \tag{5.6}$$

where $u = (a^{kl}_{rs,xy})$. Then A is a compact quantum group that acts faithfully on B in the following manner:

$$\alpha(e_{rs,j}) = \sum_{i=1}^{m}\sum_{k,l=1}^{n_i} e_{kl,i}\otimes a^{kl}_{rs,ij}, \qquad r,s = 1,\dots,n_j,\quad j = 1,\dots,m.$$

Any faithful compact quantum transformation group of B is a quantum subgroup of (A, α) for some positive Q.

Proof. The proof follows the lines of Theorem 4.2.

We will denote the quantum group above by $A^Q_{aut}(B)$, or simply by $A^Q_{aut}$. When $Q = I_n^{\otimes 2}$, then $A^Q_{aut}(B)$ is just $A_{aut}(B)$. Note that for distinct $n_k$'s, the automorphism group $\mathrm{Aut}(\oplus_{k=1}^{m} M_{n_k})$ is isomorphic to the group $\times_{k=1}^{m}\mathrm{Aut}(M_{n_k})$. A natural problem related to this is

Problem 5.3. For distinct $n_k$'s, is the quantum automorphism group $A_{aut}(\oplus_{k=1}^{m} M_{n_k})$ isomorphic to the quantum group $\otimes_{k=1}^{m} A_{aut}(M_{n_k})$ (cf. [21])?

For each fixed $1 \le k_0 \le m$, $A_{aut}(M_{n_{k_0}})$ as defined in the last section is a quantum subgroup of $A_{aut}(B)$. (This is seen as follows. Let $\tilde a^{kl}_{rs,xy} = \delta_{xk_0}\delta_{yk_0}\,a^{kl}_{rs}$, where the $a^{kl}_{rs}$'s are generators of $A_{aut}(M_{n_{k_0}})$. Then the $\tilde a^{kl}_{rs,xy}$'s satisfy the defining relations for $A_{aut}(B)$.) Note also that if $n_k = n$ for all k, then $A_{aut}(X_m)$ is a quantum subgroup of $A_{aut}(B)$. (This is seen as follows. Let $\tilde a^{kl}_{rs,xy} = \delta_{kr}\delta_{ls}\,a_{xy}$, where the $a_{xy}$'s are generators of $A_{aut}(X_m)$. Then the $\tilde a^{kl}_{rs,xy}$'s satisfy the defining relations for $A_{aut}(B)$.) In view of the fact that the ordinary automorphism group $\mathrm{Aut}(\oplus_1^m M_n)$ is isomorphic to the semi-direct product $SU(n) \rtimes S_m$, it would be interesting to solve the following problem.

Problem 5.4. Is it possible to express $A_{aut}(\oplus_1^m M_n)$ in terms of $A_{aut}(M_n)$ and $A_{aut}(X_m)$ as a certain semi-direct product that generalizes [21]?

6. The Main Result

Summarizing the previous sections, we can now state the main result of this paper.

Theorem 6.1. Let B be a finite space of the form $\oplus_{k=1}^{m} M_{n_k}$.

(1) The quantum automorphism group of B exists in the category of (left) quantum transformation groups if and only if B is the finite space $X_m$.

(2) The quantum automorphism group for (B, ψ) exists and is defined as in Theorem 5.1 (see also Theorem 3.1, Theorem 4.1).


Proof. (1) If B is $X_m$, we saw in Theorem 3.1 that $A_{aut}(X_m)$ is the quantum automorphism group of $X_m$ in the category of all quantum transformation groups. Now assume that $B \neq C(X_m)$, and assume that the quantum automorphism group of B exists in the category of all quantum transformation groups. Call it $(A_0, \alpha_0)$. As in Theorem 5.1 and Proposition 5.2, $\alpha_0$ is determined by its effect on the basis $e_{rs,j}$ of B,

$$\alpha_0(e_{rs,j}) = \sum_{i=1}^{m}\sum_{k,l=1}^{n_i} e_{kl,i}\otimes \tilde a^{kl}_{rs,ij}, \qquad r,s = 1,\dots,n_j,\quad j = 1,\dots,m.$$

Since $(A_0, \alpha_0)$ is the quantum automorphism group of B, the action $\alpha_0$ is faithful (cf. Definition 2.4). This implies that the $\tilde a^{kl}_{rs,ij}$'s generate the C*-algebra $A_0$. As in Proposition 5.2 (see also Theorem 4.2), there is a positive $Q_0$ such that the $\tilde a^{kl}_{rs,xy}$'s satisfy the relations (5.1), (5.3), (5.4), along with the following set of relations:

$$\tilde u^* Q_0 \tilde u Q_0^{-1} = I = Q_0 \tilde u Q_0^{-1} \tilde u^*, \tag{6.1}$$

where $\tilde u = (\tilde a^{kl}_{rs,xy})$. By the universal property of $(A_0, \alpha_0)$, we conclude that $A_0 = A^{Q_0}_{aut}$ (see also the last statement in Proposition 5.2). For every positive Q, the unique morphism from $(A^{Q}_{aut}, \alpha)$ to $(A_0, \alpha_0)$ sends the generators $\tilde a^{kl}_{rs,xy}$ of $A^{Q_0}_{aut}$ to the corresponding generators $a^{kl}_{rs,xy}$ of $A^{Q}_{aut}$ (again because of faithfulness of the quantum transformation group $A^{Q}_{aut}$ and the universality of $A^{Q_0}_{aut}$). Hence the generators $a^{kl}_{rs,xy}$ also satisfy the relations (6.1). This is impossible because we can choose Q so that $A^{Q}_{aut}$ and $A^{Q_0}_{aut}$ have different classical points in the vector space with coordinates $a^{kl}_{rs,xy}$ ($k,l = 1,\dots,n_x$, $r,s = 1,\dots,n_y$, $x,y = 1,\dots,m$).

(2) This is proved in the previous sections.

Concluding Remarks.

(1) In this paper, we only described the quantum automorphism group of (B, ψ) for the special choice of functional ψ, because this quantum automorphism group is closest to the ordinary automorphism group Aut(B) of B, and it contains the latter. One can also use the same method to describe quantum automorphism groups of B endowed with other functionals or a collection of functionals.

(2) For each $1 \le k \le n$, consider the delta measure $\chi_k$ on $X_n$ corresponding to the point $x_k$. Then the quantum automorphism group of $(X_n, \chi_k)$ is isomorphic to the quantum permutation group of the space $X_{n-1}$, just as in the case of ordinary permutation groups.

(3) If we remove condition (3) in Definition 2.1, then we obtain the notion of an action of a quantum semi-group on a C*-algebra. The relations (5.1), (5.3), (5.4) define the universal quantum semi-group E(B) acting on B, even though B is not a quadratic algebra in the sense of Manin [13]. From the main theorem of this paper, the Hopf envelope H(B) of this quantum semi-group in the sense of Manin cannot be a compact quantum group (see also the last section of [18]).

After this paper was submitted for publication, we received the papers [6, 7], where a finite quantum group symmetry A(F) for $M_3$ is described, following the work of Connes [5]. The finite quantum group A(F) in these papers is not a finite quantum group in the sense of [30] (because it does not have a compatible C*-norm), so it cannot be a quantum subgroup of the COMPACT quantum symmetry groups $A_{aut}(M_3)$ and $A^{Q}_{aut}(M_3)$ in our paper; but it is a quantum subgroup of the Hopf envelope H(B) of the quantum semi-group E(B) mentioned in the last paragraph.


Our paper gives solutions to the "intricate problem" mentioned at the end of Sect. 2 of the paper [7]: find the biggest quantum group acting on $M_3$. This "intricate problem" has two solutions: the first, Theorem 6.1, solves the problem in the category of compact quantum groups; the second, the remarks in the last two paragraphs, solves the problem in the category of all quantum groups, that is, Hopf algebras that need not have C*-norms.

(4) In [13], the quantum group $SU_q(2)$ is described as the quantum automorphism group of the quantum plane (i.e. the deformed plane). In view of the fact that the automorphism group $\mathrm{Aut}(M_2)$ is SU(2), one might be able to describe $SU_q(2)$ as a quantum automorphism group of the non-deformed space $M_2$ endowed with a collection of functionals.

Appendix

In [18], we introduced a compact matrix quantum group $A_o(Q)$ for each non-singular matrix Q. It has the following presentation:

$$\bar u = u, \qquad u u^t = I_m = u^t u, \qquad u^t Q u Q^{-1} = I_m = Q u Q^{-1} u^t,$$

where $u = (a_{ij})$. As a matter of fact, it is more appropriate to use the notation $A_o(Q)$ (and we will do so from now on) for the compact matrix quantum group with the following sets of relations (where Q is positive):

$$\bar u = u, \qquad u^t Q u Q^{-1} = I_m = Q u Q^{-1} u^t.$$

(Let $v = Q^{1/2} u Q^{-1/2}$. Then v is a unitary matrix. Hence the C*-algebra A exists. From this it is easy to see that $A_o(Q)$ is a compact matrix quantum group.) This quantum group has all the properties listed in [18] for the old $A_o(Q)$. The old $A_o(Q)$ is the intersection of the quantum groups $A_o(n)$ and the new $A_o(Q)$ defined above. Moreover, if Q is a real matrix, the new $A_o(Q)$ is a compact quantum group of Kac type. Finally, we note that the quantum group denoted by $A_o(F)$ in [2] is the same as the quantum group $B_u(Q)$ in [24, 26] with $Q = F^*$, so it is different from the quantum group $A_o(Q)$ above unless F is the trivial matrix $I_n$.

Acknowledgement. The author wishes to thank Alain Connes for several helpful discussions and for his interest in this work. He is also indebted to Marc Rieffel for his support, which enabled the author to finish writing up this paper. He thanks T. Hodges, G. Nagy, A. Sheu, and S.L. Woronowicz for their comments during the AMS summer research conference on Quantization in July, 1996, at which the author reported preliminary results of this paper. The main results of this paper were obtained while the author was a visiting member at the IHES during the year July, 1995-Aug, 1996. He is grateful for the financial support of the IHES during this period. He would like to thank the Director Professor J.-P. Bourguignon and the staff of the IHES for their hospitality. The author also wishes to thank the Department of Mathematics at UC-Berkeley for its support and hospitality while the author held an NSF Postdoctoral Fellowship there during the final stage of this paper.

References

1. Baaj, S. and Skandalis, G.: Unitaires multiplicatifs et dualité pour les produits croisés de C*-algèbres. Ann. Sci. Ec. Norm. Sup. 26, 425–488 (1993)
2. Banica, T.: Théorie des représentations du groupe quantique compact libre O(n). C. R. Acad. Sci. Paris t. 322, Serie I, 241–244 (1996)
3. Boca, F.: Ergodic actions of compact matrix pseudogroups on C*-algebras. In: Recent Advances in Operator Algebras. Astérisque 232, 93–109 (1995)
4. Connes, A.: Noncommutative Geometry. London: Academic Press, 1994
5. Connes, A.: Gravity coupled with matter and the foundation of non commutative geometry. Commun. Math. Phys. 182, 155–176 (1996)
6. Dabrowski, L. and Hajac, P.M. and Siniscalco, P.: Explicit Hopf Galois description of $SL_{e^{2\pi i/3}}(2)$ induced Frobenius homomorphisms. Preprint DAMPT-97-93, SISSA 43/97/FM (q-alg/9708031)
7. Dabrowski, L. and Nesti, F. and Siniscalco, P.: A finite quantum symmetry of M(3, C). Preprint SISSA 63/97/FM (hep-th/9705204), to appear in Int. J. Mod. Phys.
8. Drinfeld, V. G.: Quantum groups. In: Proc. ICM-1986, Berkeley, Vol I, Providence, R.I.: Amer. Math. Soc., 1987, pp. 798–820
9. Faddeev, L. D. and Reshetikhin, N. Y. and Takhtajan, L. A.: Quantization of Lie groups and Lie algebras. Algebra and Analysis 1, 193–225 (1990)
10. Kac, G.: Ring groups and the duality principle I, II. Proc. Moscow Math. Soc. 12, 259–303 (1963)
11. Kac, G. and Palyutkin, V.: An example of a ring group generated by Lie groups. Ukrain. Math. J. 16, 99–105 (1964)
12. Levendorskii, S. and Soibelman, Y.: Algebra of functions on compact quantum groups, Schubert cells, and quantum tori. Commun. Math. Phys. 139, 141–170 (1991)
13. Manin, Y.: Quantum Groups and Noncommutative Geometry. Publications du C.R.M. 1561, Univ de Montreal, 1988
14. Podles, P.: Symmetries of quantum spaces. Subgroups and quotient spaces of quantum SU(2) and SO(3) groups. Commun. Math. Phys. 170, 1–20 (1995)
15. Podles, P. and Woronowicz, S. L.: Quantum deformation of Lorentz group. Commun. Math. Phys. 130, 381–431 (1990)
16. Rieffel, M.: Compact quantum groups associated with toral subgroups. Contemp. Math. 145, 465–491 (1993)
17. Van Daele, A.: Discrete quantum groups. J. Alg. 180, 431–444 (1996)
18. Van Daele, A. and Wang, S. Z.: Universal quantum groups. International J. Math 7:2, 255–264 (1996)
19. Wang, S. Z.: General Constructions of Compact Quantum Groups. Ph.D Thesis, University of California at Berkeley, March, 1993
20. Wang, S. Z.: Free products of compact quantum groups. Commun. Math. Phys. 167, 671–692 (1995)
21. Wang, S. Z.: Tensor products and crossed products of compact quantum groups. Proc. London Math. Soc. 71, 695–720 (1995)
22. Wang, S. Z.: Krein duality for compact quantum groups. J. Math. Phys. 38:1, 524–534 (1997)
23. Wang, S. Z.: Deformations of compact quantum groups via Rieffel's quantization. Commun. Math. Phys. 178, 747–764 (1996)
24. Wang, S. Z.: New classes of compact quantum groups. Lecture notes for talks at the University of Amsterdam and the University of Warsaw, January and March, 1995
25. Wang, S. Z.: Classification of quantum groups $SU_q(n)$. To appear in J. London Math. Soc.
26. Wang, S. Z.: Problems in the theory of quantum groups. In: Quantum Groups and Quantum Spaces, Banach Center Publication 40, Inst. of Math., Polish Acad. Sci., Editors: R. Budzynski, W. Pusz, and S. Zakrzewski, 1997, pp. 67–78
27. Wang, S. Z.: Ergodic actions of universal quantum groups on operator algebras. Preprint, March 1998
28. Woronowicz, S. L.: Pseudospaces, pseudogroups and Pontryagin duality. Proc. of the International Conference on Mathematics and Physics, Lausanne, Lecture Notes in Phys. Vol. 116, 1979, pp. 407–412
29. Woronowicz, S. L.: Twisted SU(2) group. An example of noncommutative differential calculus. Publ. RIMS, Kyoto Univ. 23, 117–181 (1987)
30. Woronowicz, S. L.: Compact matrix pseudogroups. Commun. Math. Phys. 111, 613–665 (1987)
31. Woronowicz, S. L.: Tannaka–Krein duality for compact matrix pseudogroups. Twisted SU(N) groups. Invent. Math. 93, 35–76 (1988)
32. Woronowicz, S. L.: Unbounded elements affiliated with C*-algebras and non-compact quantum groups. Commun. Math. Phys. 136, 399–432 (1991)

Communicated by A. Connes

Commun. Math. Phys. 195, 213 – 232 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Non-Bernoullian Quantum K-Systems Valentin Ya. Golodets, Sergey V. Neshveyev Institute for Low Temperature Physics & Engineering, Lenin Ave 47, Kharkov 310164, Ukraine Received: 20 March 1997 / Accepted: 18 November 1997

Dedicated to Professor Walter Thirring on his 70th birthday.

Abstract: We construct an uncountable family of pairwise non-conjugate non-Bernoullian K-systems of type III1 with the same finite CNT-entropy. We also investigate clustering properties of multiple channel entropies for strong asymptotically abelian systems of type II and III. We prove that a wide enough class of systems has the K-property. In particular, such systems as the space translations of a one-dimensional quantum lattice with the Gibbs states of Araki, the space translations of the CCR-algebra and the even part of the CAR-algebra with the quasi-free states of Park and Shin, noncommutative Markov shifts in the Accardi sense are entropic K-systems.

Introduction

Entropy for transformations of a measure space, introduced by Kolmogorov and Sinai, is an important invariant in ergodic theory. Connes, Narnhofer, Størmer and Thirring [11, 9, 10] defined and investigated dynamical entropy for automorphisms of an operator algebra. A more detailed bibliography and applications of dynamical entropy (or CNT-entropy) to mathematical physics can be found in the monographs [21, 3].

In recent years many interesting results on the computation of CNT-entropy in models of mathematical physics have been obtained. Let us consider some of them. Araki [2] studied Gibbs states for a one-dimensional quantum lattice. The dynamical entropy of the lattice translation for this state was investigated in [10]. Størmer and Voiculescu [31] found a nice formula (predicted by A. Connes for the tracial state) for the entropy of Bogoliubov automorphisms of the CAR-algebra preserving a quasi-free state whose modular operator has pure point spectrum. Bezuglyi and Golodets [5] obtained the same formula for the entropy of Bogoliubov actions on the CAR-algebra of the groups $\mathbb{Z}^n$, $n \in \mathbb{N}$, and $\mathbb{Z} \oplus \mathbb{Z} \oplus \dots$. Important results are due to Park and Shin. They proved that the CNT-entropy of the space translation of CAR- and CCR-algebras in n-dimensional (n < ∞) continuous spaces with respect to an invariant quasi-free state is equal to the mean entropy and derived a simple formula for the CNT-entropy [24]. Similar results were obtained by Petz for quantum spin lattices with Markov states [21, 25]. Pimsner and Popa [26], Yin [32] and Choda [8] computed the CNT-entropy of the shifts of Temperley-Lieb algebras. Golodets and Størmer [14] and Price [27] computed the entropy for a wide enough class of binary shifts. Narnhofer, Størmer and Thirring [18] proved the existence of a binary shift with zero entropy (see [14] for a bibliography about binary shifts).

The progress in computation of CNT-entropy makes it possible to investigate new problems. The concept of K-system introduced by Rohlin and Sinai [28] is very important in classical theory. Narnhofer and Thirring [19] suggested a non-commutative, or quantum, version of K-systems as systems with "complete memory loss" (see Definition 1.2 below). It is natural to expect that these systems should have interesting properties and applications. They were studied in [19, 20, 3] (see [3] for a more detailed bibliography). In particular, Benatti and Narnhofer [4] proved that K-systems of type II$_1$ are asymptotically abelian. In [14] a description of K-systems defined by bitstreams was obtained.

The simplest examples of K-systems can be constructed as follows. Let N be a von Neumann algebra and ψ be a normal faithful state of N. For each integer n let $(N_n, \psi_n)$ be a copy of (N, ψ). Denote by (M, φ) the W*-tensor product of $(N_n, \psi_n)_n$, that is $(M, \phi) = \otimes_{n\in\mathbb{Z}}(N_n, \psi_n)$, and by γ the right shift automorphism of M. Then (M, φ, γ) is a K-system (see Theorem 3.1 below). We shall call such systems Bernoullian systems. A natural problem is to prove the existence of K-systems which are not isomorphic to a Bernoulli shift. In the commutative case the problem was solved by Ornstein [22] and Ornstein and Shields [23].
In this paper we construct a quasi-free state ω of the CCR-algebra U and an uncountable family of Bogoliubov automorphisms $\tau_\theta$, $\theta \in [0, 2\pi)$, of U such that (see Theorem 5.5 below)

(i) if $x \mapsto \pi_\omega(x)$ is the GNS-representation of U with respect to ω (x ∈ U), then $M = \pi_\omega(U)''$ is the injective factor of type III$_1$;
(ii) $\omega \circ \tau_\theta = \omega$, $\theta \in [0, 2\pi)$;
(iii) $(M, \omega, \tau_\theta)$ is a non-Bernoullian K-system;
(iv) the CNT-entropy $h_\omega(\tau_\theta)$ of $\tau_\theta$ is finite, positive and does not depend on $\theta \in [0, 2\pi)$;
(v) the systems $(M, \omega, \tau_{\theta_1})$ and $(M, \omega, \tau_{\theta_2})$ are non-conjugate for $\theta_1 \neq \theta_2$ (see Definition 1.1).

These results are based on the properties of quasi-free states of CCR-algebras and their modular groups [7]. We also use the results of [24]. Let us note that the problem is still open for systems of type II$_1$. As we mentioned, K-systems of type II$_1$ are asymptotically abelian according to [4]. More exactly, if (M, τ, α) is a K-system, M is an algebra of type II$_1$, τ is a faithful normal trace on M and $\alpha \in \mathrm{Aut}\,M$, $\tau \circ \alpha = \tau$, then $H_\tau(A, \alpha^n(A)) \to 2H_\tau(A)$ as $n \to \infty$ for any finite dimensional subalgebra A of M. It was shown in [4] that the strong asymptotic abelianness of the system (M, τ, α) follows from this clustering property. We prove here (see Sect. 2 below) the reverse statement. It is natural to ask whether asymptotic abelianness is equivalent to the K-property for systems of type II and III. The answer is positive for the dynamical systems defined by bitstreams [14]. In the general case the question is open.
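The clustering property $H(A, \alpha^n(A)) \to 2H(A)$ can be seen in a purely classical toy model. The sketch below is an illustration under stated assumptions and is not part of the paper: it takes a hypothetical two-state stationary Markov chain, lets A be the algebra generated by the time-zero partition, and uses the standard commutative fact that the quantity in question is then the Shannon entropy of the joint law of $(X_0, X_n)$.

```python
import math

def eta(x):
    """eta(x) = -x log x, with eta(0) = 0."""
    return 0.0 if x <= 0.0 else -x * math.log(x)

def mat_mul(p, q):
    """Product of two square matrices given as nested lists."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical two-state Markov chain, chosen only for illustration.
P = [[0.9, 0.1], [0.2, 0.8]]
PI = [2.0 / 3.0, 1.0 / 3.0]   # stationary distribution: PI P = PI

def marginal_entropy():
    """Entropy of the time-zero partition under the stationary law."""
    return sum(eta(p) for p in PI)

def joint_entropy(n):
    """Entropy of the joint law of (X_0, X_n) under the stationary chain."""
    pn = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        pn = mat_mul(pn, P)
    return sum(eta(PI[i] * pn[i][j]) for i in range(2) for j in range(2))

for n in (1, 5, 20, 60):
    print(n, joint_entropy(n), 2.0 * marginal_entropy())
```

For small n the joint entropy is strictly below twice the marginal entropy because $X_0$ and $X_n$ are correlated; the gap closes as the chain mixes.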


In Sect. 3 we present a sufficient condition for the K-property. Using this condition we prove in Sects. 4 and 5 that most of the systems mentioned above are entropic K-systems. In particular, such systems as the space translations of a one-dimensional quantum spin lattice with the Gibbs state of Araki, the space translations of the CCR-algebra and the even part of the CAR-algebra with the quasi-free states of Park and Shin, and non-commutative Markov shifts in the Accardi sense are entropic K-systems. Thus it is true for the space translations of ideal Fermi (the even part) and Bose gases.

1. Preliminaries

A quantum dynamical system is a triple (M, ω, α), where M is a C*-algebra (or W*-algebra), α is a *-automorphism, and ω is an α-invariant state of M (supposed to be normal in the W*-case).

Definition 1.1. The systems $(M_1, \omega_1, \alpha_1)$ and $(M_2, \omega_2, \alpha_2)$ are said to be conjugate (or isomorphic), if there exists a *-isomorphism $\theta: M_1 \to M_2$ such that $\omega_2 \circ \theta = \omega_1$ and $\theta \circ \alpha_1 = \alpha_2 \circ \theta$.

Recall the definition of CNT-entropy [10]. Let A be a finite dimensional C*-algebra, φ and ψ positive linear functionals on A. The relative entropy is given by

$$S(\phi, \psi) = \mathrm{Tr}\big(Q_\psi(\log Q_\psi - \log Q_\phi)\big),$$

where $Q_\phi$ and $Q_\psi$ are the density operators corresponding to φ and ψ. The quantity $S(\phi) = \mathrm{Tr}\,\eta(Q_\phi)$, where $\eta(x) = -x\log x$, is called the von Neumann entropy of φ.

Let $\gamma_i: A_i \to M$, $1 \le i \le n$, be a unital completely positive map of a finite dimensional C*-algebra $A_i$ to M. The quantity $H_\omega(\gamma_1, \dots, \gamma_n)$ is defined as follows:

$$H_\omega(\gamma_1, \dots, \gamma_n) = \sup\Big[\sum_{i_1,\dots,i_n} \eta\,\omega_{i_1\dots i_n}(1) + \sum_{k=1}^{n}\sum_{i_k} S\big(\omega\circ\gamma_k,\ \omega^{(k)}_{i_k}\circ\gamma_k\big)\Big]$$

$$= \sup\Big[\sum_{i_1,\dots,i_n} \eta\,\omega_{i_1\dots i_n}(1) - \sum_{k=1}^{n}\sum_{i_k} \eta\,\omega^{(k)}_{i_k}(1) + \sum_{k=1}^{n}\sum_{i_k} \omega^{(k)}_{i_k}(1)\,S\big(\omega\circ\gamma_k,\ \hat\omega^{(k)}_{i_k}\circ\gamma_k\big)\Big],$$

where the supremum is taken over all finite decompositions $\omega = \sum_{i_1,\dots,i_n}\omega_{i_1\dots i_n}$ of ω into a sum of positive linear functionals, and

$$\omega^{(k)}_{i_k} = \sum_{\substack{i_1,\dots,i_n \\ i_k\ \text{fixed}}} \omega_{i_1\dots i_n}, \qquad \hat\omega^{(k)}_{i_k} = \omega^{(k)}_{i_k}(1)^{-1}\,\omega^{(k)}_{i_k}.$$
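To see how the two expressions for $H_\omega$ agree, it may help to evaluate them in the simplest commutative situation. The following sketch is an illustration only and rests on assumptions not made in the paper: n = 1, γ is the identity, and all functionals are given by diagonal densities, so the traces reduce to finite sums. The particular vectors `OMEGA`, `MIXED` and `MINIMAL` are hypothetical sample data.

```python
import math

def eta(x):
    """eta(x) = -x log x, with eta(0) = 0."""
    return 0.0 if x <= 0.0 else -x * math.log(x)

def entropy(q):
    """S(phi) = Tr eta(Q_phi) for a diagonal density Q_phi = diag(q)."""
    return sum(eta(x) for x in q)

def rel_entropy(phi, psi):
    """S(phi, psi) = Tr Q_psi (log Q_psi - log Q_phi), diagonal case."""
    return sum(s * (math.log(s) - math.log(p))
               for p, s in zip(phi, psi) if s > 0.0)

def h_value(omega, parts):
    """First form: sum_i eta(omega_i(1)) + sum_i S(omega, omega_i)."""
    return sum(eta(sum(w)) + rel_entropy(omega, w) for w in parts)

def h_value_normalized(omega, parts):
    """Second form for n = 1: sum_i omega_i(1) S(omega, omega_i / omega_i(1))."""
    total = 0.0
    for w in parts:
        mass = sum(w)
        total += mass * rel_entropy(omega, [x / mass for x in w])
    return total

OMEGA = [0.5, 0.3, 0.2]
MIXED = [[0.4, 0.1, 0.0], [0.1, 0.2, 0.2]]          # some decomposition of OMEGA
MINIMAL = [[0.5, 0, 0], [0, 0.3, 0], [0, 0, 0.2]]   # decomposition along minimal projections

print(h_value(OMEGA, MIXED), h_value_normalized(OMEGA, MIXED))
print(h_value(OMEGA, MINIMAL), entropy(OMEGA))
```

The agreement of the two forms comes from the identity $S(\omega, \lambda\hat\omega) = -\eta(\lambda) + \lambda S(\omega, \hat\omega)$ for $\lambda > 0$; in this abelian example the decomposition along minimal projections attains the von Neumann entropy of ω.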

If M is a W*-algebra and ω is faithful, then any positive linear functional $\phi \le \omega$ can be uniquely represented in the form $\omega(\cdot\,\sigma^{\omega}_{-i/2}(x))$ for some $x \in M$, $x \ge 0$. Thus decompositions $\omega = \sum \omega_{i_1\dots i_n}$ are in one-to-one correspondence with decompositions $1 = \sum x_{i_1\dots i_n}$, $x_{i_1\dots i_n} \ge 0$.

The properties of $H_\omega$ ([10, 19, 21]):

1. $H_\omega(\gamma_1\circ\theta_1, \dots, \gamma_n\circ\theta_n) \le H_\omega(\gamma_1, \dots, \gamma_n)$ for any completely positive unital map $\theta_i: B_i \to A_i$, $1 \le i \le n$.
2. If α is an automorphism of M preserving ω, then $H_\omega(\alpha\circ\gamma_1, \dots, \alpha\circ\gamma_n) = H_\omega(\gamma_1, \dots, \gamma_n)$.
3. $H_\omega(\gamma_1, \gamma_1, \dots, \gamma_n) = H_\omega(\gamma_1, \dots, \gamma_n)$.
4. $H_\omega(\gamma_1, \dots, \gamma_p, \gamma_{p+1}, \dots, \gamma_n) \le H_\omega(\gamma_1, \dots, \gamma_p) + H_\omega(\gamma_{p+1}, \dots, \gamma_n)$.
5. If A is a subalgebra of the centralizer $M_\omega$ of the state ω, then $H_\omega(A) = S(\omega|_A)$ and an optimal decomposition is given by $\omega = \sum_i \omega(\cdot\,p_i)$, where $\{p_i\}_i$ is a set of mutually orthogonal minimal projections of A, $\sum_i p_i = 1$.
6. If subalgebras $A_1, \dots, A_n$ of M pairwise commute, and there exists an ω-preserving conditional expectation $M \to A_i$, $1 \le i \le n$, then $H_\omega(A_1, \dots, A_n) = S(\omega|_A)$, where A is the algebra generated by $A_1, \dots, A_n$.
7. If M is a von Neumann algebra and ω is its faithful normal state, then $H_\omega(N) > 0$ unless $N = \mathbb{C}1$.

The properties 2 and 4 imply that the limit

$$h_\omega(\gamma, \alpha) = \lim_{n\to\infty} \frac{1}{n} H_\omega(\gamma, \alpha\circ\gamma, \dots, \alpha^{n-1}\circ\gamma)$$

exists for any γ. The dynamical entropy (or the CNT-entropy) $h_\omega(\alpha)$ is the supremum of $h_\omega(\gamma, \alpha)$ over all γ.

For a commutative W*-dynamical system (M, ω, α), where ω is a normal faithful state, the following properties are equivalent ([12, 19]).

1. For any finite dimensional subalgebra A of M, $\lim_{n\to\infty} h_\omega(A, \alpha^n) = H_\omega(A)$.
2. For any finite dimensional subalgebra A of M, $A \neq \mathbb{C}1$, we have $h_\omega(A, \alpha) > 0$.
3. There exists a von Neumann subalgebra A of M such that (i) $A \subset \alpha(A)$; (ii) $\cap_n \alpha^n(A) = \mathbb{C}1$; (iii) $\cup_n \alpha^n(A)$ is weakly dense in M.

Definition 1.2 ([19]). A W*-dynamical system (M, ω, α) is an entropic K-system (resp. has completely positive entropy, is an algebraic K-system), if Property 1 (resp. 2, 3) is satisfied.

In the non-commutative case Properties 1-3 are not equivalent. It is easy to show that an entropic K-system has completely positive entropy. The existence of a system having Property 3 and zero entropy was proved in [18]. A system with completely positive entropy and without the K-property was constructed in [14]. It should be noted that neither of these two systems is asymptotically abelian. So the problem of equivalence of Properties 1-3 for asymptotically abelian systems has not been solved yet.

Remark 1.3. Let N be an α-invariant W*-subalgebra of M, $\gamma = \alpha|_N$, $\phi = \omega|_N$. Suppose there exists an ω-preserving conditional expectation $M \to N$. Then $H_\phi(A_1, \dots, A_n) = H_\omega(A_1, \dots, A_n)$ for any subalgebras $A_1, \dots, A_n$ of N. Hence, if (M, ω, α) is an entropic K-system or has completely positive entropy, then (N, φ, γ) has the same property.


2. Asymptotic Abelianness and Clustering of Entropic Functions

In this section we consider asymptotically abelian systems. In particular we reverse the Benatti-Narnhofer theorem [4, 3.1.3].

Theorem 2.1. Let (M, ω, α) be a strongly asymptotically abelian W*-dynamical system. Suppose ω is faithful and either ω is tracial or M is approximately finite dimensional. Suppose also that, for given $k \in \mathbb{N}$, for any $x_0, \dots, x_k \in Z(M)$ (the center of M),

$$\lim_{n\to\infty} \omega\big(x_0\,\alpha^n(x_1)\cdots\alpha^{kn}(x_k)\big) = \omega(x_0)\cdots\omega(x_k).$$

Then, for any finite dimensional subalgebra A of M, we have

$$\lim_{n\to\infty} H_\omega\big(A, \alpha^n(A), \dots, \alpha^{kn}(A)\big) = (k+1)H_\omega(A).$$

In particular, if (Z(M), ω, α) is a commutative K-system, then the above convergence holds for any $k \in \mathbb{N}$.

To prove the theorem we need the following technical result. Let M be a von Neumann algebra and φ a state of M. For any von Neumann subalgebra Q of M, we introduce a semi-norm

$$\|x\|^{\#}_{\phi,Q} = \sup_{\substack{y_1, y_2 \in Q \\ \|y_1\|, \|y_2\| \le 1}} \big(\phi(y_1^* x^* x y_1) + \phi(y_2^* x x^* y_2)\big)^{1/2}.$$

For δ > 0 and subalgebras Q and P of M, we write $Q \overset{\delta}{\underset{\phi}{\subset}} P$ if, for any $x \in Q$, $\|x\| \le 1$, there exists an element $y \in P$, $\|y\| \le 1$, such that $\|x - y\|^{\#}_{\phi,Q} < \delta$.

Lemma 2.2. Let n > 0 and ε > 0 be given. Then there exists $\delta = \delta_n(\varepsilon) > 0$ such that, for any pair of von Neumann subalgebras Q and P of M with $Q \overset{\delta}{\underset{\phi}{\subset}} P$, dim Q = n, and any system of matrix units $\{e^{(m)}_{kl}\}_{k,l=1,\dots,n_m,\ m=1,\dots,s}$, $\sum_{k,m} e^{(m)}_{kk} = 1$, of Q, there exists a system of matrix units $\{p^{(m)}_{kl}\}_{k,l=1,\dots,n_m,\ m=1,\dots,s}$ in P such that

$$\|e^{(m)}_{kl} - p^{(m)}_{kl}\|^{\#}_{\phi,Q} < \varepsilon \qquad \forall k, l, m.$$

This lemma was used in a similar form in [10]. It was first formulated and proved for the tracial case in [11]. The same proof holds in the general case.

Proof of Theorem 2.1. Under the assumptions of the theorem, for any ε > 0, we can find a finite dimensional subalgebra B of M and positive elements $x_1, \dots, x_l$ in B, $\sum_i x_i = 1$, such that

$$H_\omega(A) \le \varepsilon + \sum_j \eta\,\omega(x_j) + \sum_j S\big(\omega|_A,\ \omega(\cdot\,\sigma_{-i/2}(x_j))|_A\big).$$

We construct subalgebras $B(0,n), \dots, B(k,n)$ of M and *-homomorphisms $F_{in}: B \to B(i,n)$, $0 \le i \le k$, such that

(i) $B(i,n) = F_{in}(B) + \mathbb{C}(1 - F_{in}(1))$, $0 \le i \le k$;
(ii) $B(0,n), \dots, B(k,n)$ pairwise commute;
(iii) for any $x \in B$, $F_{in}(x) - \alpha^{in}(x) \to 0$ in s-topology, as $n \to \infty$.

Let $B(0,n) = B$ and $F_{0n} = \mathrm{Id}_B$. Suppose algebras $B(0,n), \dots, B(i,n)$ and *-homomorphisms $F_{0n}, \dots, F_{in}$ are constructed for any n. Let $\{e^{(m)}_{kl}\}_{k,l=1,\dots,n_m,\ m=1,\dots,s}$ be a system of matrix units of B. Define $G_{i+1,n}: M \to M$ by (see [17])

$$G_{i+1,n}(x) = \sum_{\substack{m_0,\dots,m_i \\ k_0,\dots,k_i}} F_{0n}\big(e^{(m_0)}_{k_0 1}\big)\cdots F_{in}\big(e^{(m_i)}_{k_i 1}\big)\, x\, F_{in}\big(e^{(m_i)}_{1 k_i}\big)\cdots F_{0n}\big(e^{(m_0)}_{1 k_0}\big).$$

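The map $G_{i+1,n}$ has the following properties:

(i) $\|G_{i+1,n}\| \le 1$;
(ii) $G_{i+1,n}(x) \in \big(\cup_{0\le j\le i} B(j,n)\big)' \cap M$.

We assert that, for any $x \in B$,

$$\big\|\alpha^{(i+1)n}(x) - G_{i+1,n}\big(\alpha^{(i+1)n}(x)\big)\big\|^{\#}_{\omega,\ \alpha^{(i+1)n}(B)} \to 0, \quad \text{as } n \to \infty.$$

In other words, for any $x, y \in B$,

$$\big\|\big(\alpha^{(i+1)n}(x) - G_{i+1,n}(\alpha^{(i+1)n}(x))\big)\,\alpha^{(i+1)n}(y)\,\xi_\omega\big\| \to 0, \quad \text{as } n \to \infty,$$

where $\xi_\omega$ is the cyclic vector in the GNS-representation corresponding to ω.

First, we note that if a bounded sequence $\{x_n\}_n$ in M converges to zero in s-topology, then, for any $y_1, \dots, y_l \in M$,

$$\|x_n\,\alpha^{m_1}(y_1)\cdots\alpha^{m_l}(y_l)\,\xi_\omega\| \to 0, \quad \text{as } n \to \infty,$$

uniformly on $(m_1, \dots, m_l) \in \mathbb{Z}^l$. Indeed, for any sequences $\{m^{(j)}_n\}_n$, $1 \le j \le l$, of integers, we have

$$x_n\xi_\omega \to 0 \ \Rightarrow\ \alpha^{-m^{(1)}_n}(x_n)\xi_\omega \to 0 \ \Rightarrow\ \alpha^{-m^{(1)}_n}(x_n)\,y_1\xi_\omega \to 0 \ \Rightarrow\ x_n\,\alpha^{m^{(1)}_n}(y_1)\xi_\omega \to 0 \ \Rightarrow\ \dots \ \Rightarrow\ x_n\,\alpha^{m^{(1)}_n}(y_1)\cdots\alpha^{m^{(l)}_n}(y_l)\xi_\omega \to 0.$$

Second, $F_{jn}(x)\,\alpha^{(i+1)n}(y) - \alpha^{(i+1)n}(y)\,\alpha^{jn}(x) \to 0$ in s-topology for $j < i+1$, since $F_{jn}(x) - \alpha^{jn}(x) \to 0$, $[x, \alpha^{(i-j+1)n}(y)] \to 0$, and

$$F_{jn}(x)\,\alpha^{(i+1)n}(y) - \alpha^{(i+1)n}(y)\,\alpha^{jn}(x) = \alpha^{(i+1)n}\big(\alpha^{-(i+1)n}(F_{jn}(x) - \alpha^{jn}(x))\,y\big) + \alpha^{jn}\big([x, \alpha^{(i-j+1)n}(y)]\big).$$

Using these two observations we conclude that

$$\lim_n \big\|\big(G_{i+1,n}(\alpha^{(i+1)n}(x))\,\alpha^{(i+1)n}(y) - \alpha^{(i+1)n}(xy)\big)\xi_\omega\big\|$$

$$= \lim_n \Big\|\Big(\sum_{\substack{m_0,\dots,m_i \\ k_0,\dots,k_i}} F_{0n}(e^{(m_0)}_{k_0 1})\cdots F_{in}(e^{(m_i)}_{k_i 1})\,\alpha^{(i+1)n}(x)\,F_{in}(e^{(m_i)}_{1 k_i})\cdots F_{0n}(e^{(m_0)}_{1 k_0})\,\alpha^{(i+1)n}(y) - \alpha^{(i+1)n}(xy)\Big)\xi_\omega\Big\|$$

$$= \lim_n \Big\|\Big(\sum F_{0n}(e^{(m_0)}_{k_0 1})\cdots F_{in}(e^{(m_i)}_{k_i 1})\,\alpha^{(i+1)n}(x)\,F_{in}(e^{(m_i)}_{1 k_i})\cdots F_{1n}(e^{(m_1)}_{1 k_1})\,\alpha^{(i+1)n}(y)\,e^{(m_0)}_{1 k_0} - \alpha^{(i+1)n}(xy)\Big)\xi_\omega\Big\| = \dots$$

$$= \lim_n \Big\|\Big(\sum F_{0n}(e^{(m_0)}_{k_0 1})\cdots F_{in}(e^{(m_i)}_{k_i 1})\,\alpha^{(i+1)n}(xy)\,\alpha^{in}(e^{(m_i)}_{1 k_i})\cdots e^{(m_0)}_{1 k_0} - \alpha^{(i+1)n}(xy)\Big)\xi_\omega\Big\| = \dots$$

$$= \lim_n \Big\|\Big(\sum \alpha^{(i+1)n}(xy)\,e^{(m_0)}_{k_0 1}\cdots\alpha^{in}(e^{(m_i)}_{k_i 1})\,\alpha^{in}(e^{(m_i)}_{1 k_i})\cdots e^{(m_0)}_{1 k_0} - \alpha^{(i+1)n}(xy)\Big)\xi_\omega\Big\| = 0,$$

and our assertion is proved.

By Lemma 2.2 there exists a system of matrix units $\{p^{(m)}_{kl}(n)\}$ in $\big(\cup_{0\le j\le i} B(j,n)\big)' \cap M$ such that

$$\alpha^{(i+1)n}\big(e^{(m)}_{kl}\big) - p^{(m)}_{kl}(n) \underset{n\to\infty}{\longrightarrow} 0$$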

in s-topology.

We define a homomorphism $F_{i+1,n}: B \to M$ by $F_{i+1,n}(e^{(m)}_{kl}) = p^{(m)}_{kl}(n)$ and an algebra $B(i+1,n)$ by $B(i+1,n) = F_{i+1,n}(B) + \mathbb{C}(1 - F_{i+1,n}(1))$.

Then, denoting $F_{in}(x_j) + \omega(x_j)(1 - F_{in}(1))$ by $x^{(i)}_j(n)$, we obtain

$$H_\omega\big(A, \alpha^n(A), \dots, \alpha^{kn}(A)\big) \ge \sum_{j_0,\dots,j_k} \eta\,\omega\big(x^{(0)}_{j_0}(n)\cdots x^{(k)}_{j_k}(n)\big) + \sum_{m=0}^{k}\sum_j S\big(\omega|_{\alpha^{mn}(A)},\ \omega(\cdot\,\sigma_{-i/2}(x^{(m)}_j(n)))|_{\alpha^{mn}(A)}\big)$$

$$= \sum_{j_0,\dots,j_k} \eta\,\omega\big(x^{(0)}_{j_0}(n)\cdots x^{(k)}_{j_k}(n)\big) + \sum_{m=0}^{k}\sum_j S\big(\omega|_A,\ \omega(\cdot\,\sigma_{-i/2}(\alpha^{-mn}(x^{(m)}_j(n))))|_A\big).$$

Hence

$$\lim_n H_\omega\big(A, \alpha^n(A), \dots, \alpha^{kn}(A)\big) \ge \lim_n \Big[\sum_{j_0,\dots,j_k} \eta\,\omega\big(x_{j_0}\alpha^n(x_{j_1})\cdots\alpha^{kn}(x_{j_k})\big) + \sum_{m=0}^{k}\sum_j S\big(\omega|_A,\ \omega(\cdot\,\sigma_{-i/2}(x_j))|_A\big)\Big]$$

$$\ge \lim_n \sum_{j_0,\dots,j_k} \Big[\eta\,\omega\big(x_{j_0}\alpha^n(x_{j_1})\cdots\alpha^{kn}(x_{j_k})\big) - \eta\big(\omega(x_{j_0})\cdots\omega(x_{j_k})\big)\Big] + (k+1)\big(H_\omega(A) - \varepsilon\big).$$

It remains to show that, for any $y_0, \dots, y_k$, we have

$$\lim_n \omega\big(y_0\,\alpha^n(y_1)\cdots\alpha^{kn}(y_k)\big) = \omega(y_0)\cdots\omega(y_k).$$

Let $E: M \to Z(M)$ be an ω-preserving conditional expectation. Then, for any central sequence $\{x_n\}_n$ in M and any $y \in M$,

$$\lim_n \big(\omega(y x_n) - \omega(E(y) x_n)\big) = 0.$$

Indeed, if $z \in Z(M)$ is a w-limit point for $\{x_n\}_n$, then $\omega(yz) = \omega(E(y)z)$ is the corresponding limit point for $\{\omega(y x_n)\}_n$ and $\{\omega(E(y) x_n)\}_n$. Since the sequence $\{\alpha^n(x_1)\alpha^{2n}(x_2)\cdots\alpha^{ln}(x_l)\}_n$ is central for any $l \in \mathbb{N}$ and any $x_1, \dots, x_l \in M$, we obtain

$$\lim_n \omega\big(y_0\,\alpha^n(y_1)\cdots\alpha^{kn}(y_k)\big) = \lim_n \omega\big(E(y_0)\,\alpha^n(y_1)\cdots\alpha^{kn}(y_k)\big)$$

$$= \lim_n \omega\big(y_1\,\alpha^{-n}(E(y_0))\,\alpha^n(y_2)\cdots\alpha^{(k-1)n}(y_k)\big) = \dots = \lim_n \omega\big(E(y_k)\,\alpha^{-n}(E(y_{k-1}))\cdots\alpha^{-kn}(E(y_0))\big)$$

$$= \omega(y_0)\cdots\omega(y_k).$$

The last assertion of the theorem follows from the fact that any commutative K-system is mixing of multiplicity k for any $k \in \mathbb{N}$.

3. Sufficient Condition for the K-Property

We present a sufficient condition for the K-property. This condition allows us to show that many well-known quantum systems are entropic K-systems.

Theorem 3.1. Let (M, ω, α) be a W*-dynamical system. Suppose ω is faithful, and there exists a W*-subalgebra $M_0$ in M such that

(i) $M_0 \subset \alpha(M_0)$;
(ii) $\cap_n \alpha^n(M_0) = \mathbb{C}1$;
(iii) $\cup_{n\in\mathbb{N}} \big(\alpha^{-n}(M_0)' \cap \alpha^n(M_0)\big)$ is weakly dense in M.

Then the system (M, ω, α) is an entropic K-system.

First, we need the following known result. We prove it for the reader's convenience.

Lemma 3.2. Let (X, µ) be a Lebesgue space, ξ and η its measurable partitions, $\xi = (X_1, \dots, X_d)$. Suppose

$$\Big|\int_{X_i} g\,d\mu - \mu(X_i)\int_X g\,d\mu\Big| \le \varepsilon\,\|g\|_\infty \qquad \forall g \in L^\infty(X/\eta),\quad i = 1, \dots, d.$$

Then $H(\xi|\eta) \ge H(\xi) - \delta(\varepsilon, d)$, where

$$\delta(\varepsilon, d) = (\varepsilon d)^{1/2}\Big(\tfrac{3}{2} + 2\log d + 3\log\big(1 + (d/\varepsilon)^{1/2}\big)\Big) \underset{\varepsilon\to 0}{\longrightarrow} 0.$$
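The error term δ(ε, d) of Lemma 3.2 is explicit, so its decay can be inspected numerically. The sketch below is only an illustration: the function name `delta` follows the lemma's notation, while `choose_eps` is a hypothetical helper (not from the paper) that halves ε until δ(ε, d) falls below a prescribed target, which is possible precisely because δ(ε, d) → 0 as ε → 0 for fixed d.

```python
import math

def delta(eps, d):
    """delta(eps, d) = (eps d)^{1/2} (3/2 + 2 log d + 3 log(1 + (d/eps)^{1/2}))."""
    root = math.sqrt(eps * d)
    return root * (1.5 + 2.0 * math.log(d)
                   + 3.0 * math.log(1.0 + math.sqrt(d / eps)))

def choose_eps(d, target):
    """Hypothetical helper: halve eps until delta(eps, d) < target."""
    eps = 1.0
    while delta(eps, d) >= target:
        eps /= 2.0
    return eps

for eps in (1e-2, 1e-4, 1e-6, 1e-8):
    print(eps, delta(eps, 4))
```

The logarithmic factor grows as ε shrinks, but it is dominated by the square-root prefactor, so the product still tends to zero.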

Proof. Let $Y = X/\eta$, let $\nu$ be the measure on $Y$ induced by $\mu$, and let $\mu = \int_Y \mu_y \, d\nu(y)$ be the disintegration of $\mu$ with respect to $\nu$. If we denote by $\omega$ (resp. $\omega_y$) the state on $L^\infty(X/\xi)$ determined by $\mu$ (resp. $\mu_y$), then
\[ H(\xi) = S(\omega), \qquad H(\xi|\eta) = \int_Y S(\omega_y) \, d\nu(y), \]
and the assumption of the lemma means that
\[ \left| \int_Y \omega_y(p_i) g(y) \, d\nu(y) - \omega(p_i) \int_Y g(y) \, d\nu(y) \right| \le \varepsilon \|g\|_\infty, \]
where $p_i$ is the characteristic function of the set $X_i$, hence
\[ \int_Y |\omega_y(p_i) - \omega(p_i)| \, d\nu(y) \le \varepsilon, \]
so that
\[ \int_Y \|\omega_y - \omega\| \, d\nu(y) \le \varepsilon d. \]

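Let $Z = \{y \in Y \mid \|\omega_y - \omega\| \ge (\varepsilon d)^{1/2}\}$. Then $\nu(Z) \le (\varepsilon d)^{1/2}$, $|S(\omega_y) - S(\omega)| \le 2 \log d$ for any $y \in Z$, and $|S(\omega_y) - S(\omega)| \le 3(\varepsilon d)^{1/2}(1/2 + \log(1 + d^{1/2}/\varepsilon^{1/2}))$ for any $y \in Y \setminus Z$ by [10, Lemma IV.1]. Thus we obtain the desired result.

Proof of Theorem 3.1. Let $N$ be a finite dimensional subalgebra of $M$. For any $\varepsilon > 0$ there exist $m \in \mathbb{N}$ and elements $x_1, \dots, x_d \in \alpha^{-m}(M_0)' \cap \alpha^m(M_0)$, $\sum_i x_i = 1$, such that
\[ H_\omega(N) < \varepsilon + \sum_j \eta\omega(x_j) + \sum_j S\bigl(\omega|_N, \omega(\cdot\,\sigma_{-i/2}(x_j))|_N\bigr). \]
Choose $\varepsilon_1 > 0$ such that $\delta(\varepsilon_1, d) < \varepsilon$. By [6, 2.6.1] there exists $n_0 \ge 2m$ such that
\[ |\omega(x_j y) - \omega(x_j)\omega(y)| \le \varepsilon_1 \|y\| \quad \forall y \in \alpha^{m-n_0}(M_0), \ j = 1, \dots, d. \]
Let us fix $n \ge n_0$. For each $j \in \mathbb{Z}$, let $A_j$ be a copy of a finite dimensional abelian $C^*$-algebra $A_0$ with minimal projections $p_1, \dots, p_d$, and for a finite subset $J = \{j_1, \dots, j_m\}$ of $\mathbb{Z}$, $A_J = A_{j_1} \otimes \dots \otimes A_{j_m}$. We define a unital positive map $F_J: A_J \to M$ by
\[ F_J(p_{i_1} \otimes \dots \otimes p_{i_m}) = \alpha^{n j_1}(x_{i_1}) \cdots \alpha^{n j_m}(x_{i_m}). \]
Let $A$ be the infinite $C^*$-tensor product $\otimes_{j \in \mathbb{Z}} A_j$. Since $A$ is the inductive limit of $\{A_J\}_J$, the coherent system $\{F_J\}_J$ defines a positive unital map $F: A \to M$. Let $\mu = \omega \circ F$, $\gamma$ be the right shift automorphism of $A$, $\pi_\mu$ the GNS-representation corresponding to $\mu$, $\bar{A} = \pi_\mu(A)''$, $\bar{\mu}$ and $\bar{\gamma}$ the state and the automorphism of $\bar{A}$ corresponding to $\mu$ and $\gamma$ respectively.

Since $\omega$ is faithful, $F$ induces a normal unital positive map $\bar{F}: \bar{A} \to M$. Indeed, any bounded linear map of $A$ (in particular $F$, $\pi_\mu$, $\mu$) can be uniquely extended to a normal map of the $W^*$-enveloping algebra $A^{**}$ of the algebra $A$ which we denote by the same letter. Then $\bar{A} = \pi_\mu(A^{**})$, and we only have to show that $\operatorname{Ker} \pi_\mu \subset \operatorname{Ker} F$. This follows from the Schwarz inequality $F(x)^* F(x) \le F(x^* x)$:
\[ \operatorname{Ker} \pi_\mu = \{x \in A^{**} \mid \mu(x^* x) = 0\} = \{x \in A^{**} \mid F(x^* x) = 0\} \subset \{x \in A^{**} \mid F(x) = 0\} = \operatorname{Ker} F. \]
For any subset $J$ of $\mathbb{Z}$, we denote by $\bar{A}_J$ the von Neumann subalgebra of $\bar{A}$ generated by $\pi_\mu(A_j)$, $j \in J$. Then
1) $\bar{F}(\bar{A}_{(-\infty,k]}) \subset \alpha^{m+nk}(M_0)$;
2) if $a \in \bar{A}_{J_1}$, $b \in \bar{A}_{J_2}$ and $J_1 \cap J_2 = \emptyset$, then $\bar{F}(ab) = \bar{F}(a)\bar{F}(b)$;
3) $\bar{\mu} = \omega \circ \bar{F}$ and $\bar{F} \circ \bar{\gamma} = \alpha^n \circ \bar{F}$.

For any $a \in \bar{A}_{(-\infty,-1]}$ and $i \in \{1, \dots, d\}$ we have
\[ |\bar{\mu}(p_i a) - \bar{\mu}(p_i)\bar{\mu}(a)| = |\omega(x_i \bar{F}(a)) - \omega(x_i)\omega(\bar{F}(a))| \le \varepsilon_1 \|\bar{F}(a)\| \le \varepsilon_1 \|a\|. \]
By Lemma 3.2, $H_{\bar{\mu}}(\bar{A}_0 | \bar{A}_{(-\infty,-1]}) \ge H_{\bar{\mu}}(\bar{A}_0) - \delta(\varepsilon_1, d)$. On the other hand, $H_{\bar{\mu}}(\bar{A}_0) = \sum_j \eta\omega(x_j)$ and
\[
\begin{aligned}
H_{\bar{\mu}}(\bar{A}_0 | \bar{A}_{(-\infty,-1]}) = h_{\bar{\mu}}(\bar{A}_0, \gamma)
&= \lim_{k \to \infty} \frac{1}{k} H_{\bar{\mu}}(\bar{A}_{[0,k-1]}) \\
&= \lim_{k \to \infty} \frac{1}{k} \sum_{i_1, \dots, i_k} \eta\omega\bigl(x_{i_1} \alpha^n(x_{i_2}) \cdots \alpha^{n(k-1)}(x_{i_k})\bigr).
\end{aligned}
\]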


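By the definition of $H_\omega$ we have
\[
\begin{aligned}
H_\omega(N, \alpha^n(N), \dots, \alpha^{n(k-1)}(N)) \ge{}& \sum_{i_1, \dots, i_k} \eta\omega\bigl(x_{i_1} \alpha^n(x_{i_2}) \cdots \alpha^{n(k-1)}(x_{i_k})\bigr) \\
&+ \sum_{l=1}^{k} \sum_{i_l} S\Bigl(\omega|_{\alpha^{n(l-1)}(N)}, \ \omega\bigl(\cdot\,\sigma_{-i/2}(\alpha^{n(l-1)}(x_{i_l}))\bigr)|_{\alpha^{n(l-1)}(N)}\Bigr),
\end{aligned}
\]
so that
\[
\begin{aligned}
h_\omega(N, \alpha^n)
&\ge \lim_{k \to \infty} \frac{1}{k} \sum_{i_1, \dots, i_k} \eta\omega\bigl(x_{i_1} \alpha^n(x_{i_2}) \cdots \alpha^{n(k-1)}(x_{i_k})\bigr) + \sum_j S\bigl(\omega|_N, \omega(\cdot\,\sigma_{-i/2}(x_j))|_N\bigr) \\
&\ge \sum_j \eta\omega(x_j) - \delta(\varepsilon_1, d) + \sum_j S\bigl(\omega|_N, \omega(\cdot\,\sigma_{-i/2}(x_j))|_N\bigr) \\
&\ge H_\omega(N) - \varepsilon - \delta(\varepsilon_1, d) \ge H_\omega(N) - 2\varepsilon.
\end{aligned}
\]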

4. Entropic Properties of Quantum Systems with Markov States

An example of a system for which the conditions of Theorem 3.1 are satisfied is the space translation for the Gibbs state of a one-dimensional quantum lattice system corresponding to a finite range interaction. Such a state is always factorial and has exponential decay of correlations [2]. In this section we study entropic properties of a quantum spin system with a Markov state and its subsystem given by the restriction to the centralizer of the Markov state. We prove that these systems are entropic K-systems too.

So, let $B = \mathrm{Mat}_s(\mathbb{C})$ be a full matrix algebra. For every $i \in \mathbb{Z}$ a copy $A_i$ of $B$ is associated and $A$ is the infinite $C^*$-tensor product $\otimes_i A_i$. The right shift automorphism of the algebra $A$ will be denoted by $\gamma$. For each subset $J$ of $\mathbb{Z}$, let $A_J$ be the $C^*$-subalgebra of $A$ generated by $A_i$, $i \in J$. Recall that a state $\varphi$ of $A$ is called locally faithful provided its restriction to $A_J$ is faithful for any finite $J$. We restrict ourselves to locally faithful states.

According to the definition of Accardi [1] a translation invariant state $\varphi$ of $A$ is called the Markov state if the following condition is satisfied: For every $n \in \mathbb{N}$ there exists a completely positive unital mapping $F_n: A_{[0,n+2]} \to A_{[0,n+1]}$ which preserves the state $\varphi$ and leaves $A_{[0,n]}$ pointwise invariant. Petz proved that the latter condition is equivalent to the next one:
\[ S(\varphi|_{A_{[0,n+2]}}) + S(\varphi|_{A_{n+1}}) = S(\varphi|_{A_{[0,n+1]}}) + S(\varphi|_{A_{[n+1,n+2]}}). \]
This equality implies that the mean entropy $s(\varphi)$ of a Markov state $\varphi$ is equal to $S(\varphi|_{A_{[0,1]}}) - S(\varphi|_{A_0})$.

Theorem 4.1. Let $\varphi$ be a Markov state. Then


1) $\varphi$ is separating, i.e. the cyclic vector $\xi_\varphi$ is separating for $M = \pi_\varphi(A)''$;
2) $M$ is a factor;
3) the centralizer $M_\varphi$ of the state $\varphi$ is the hyperfinite II$_1$-factor.

Proof. 1) Define $\varphi_0 = \varphi|_{A_{[0,\infty)}}$. The state $\varphi_0$ is separating and, for any $n \ge 1$, there exists a $\sigma_t^{\varphi_0}$-invariant *-subalgebra $N_n$ of $A_{[0,n+1]}$ such that $A_{[0,n]} \subset N_n$ (see [15]). Let $E_n$ be a $\varphi_0$-preserving conditional expectation of $A_{[0,\infty)}$ onto $N_n \cap N_1' \subset A_{[0,n+1]} \cap A_{[0,1]}' = A_{[2,n+1]}$. The map $\gamma^{-k} \circ E_{m+2k} \circ \gamma^k: A_{[-k,\infty)} \to A_{[-k+2,\infty)}$ leaves $N_m \cap N_1'$ pointwise invariant for any $k \ge 0$, since $\gamma^k(N_m \cap N_1') \subset A_{[k+2,k+m+1]} \subset N_{2k+m} \cap N_1'$ for $k \ge 1$. Hence the formula
\[ E_{n,m} = E_m \circ \gamma^{-1} \circ E_{m+2} \circ \gamma^{-1} \circ \dots \circ E_{m+2(n-1)} \circ \gamma^{-1} \circ E_{m+2n} \circ \gamma^n \]
defines a $\varphi$-preserving conditional expectation $A_{[-n,\infty)} \to N_m \cap N_1'$. Then $\{E_{n,m}\}_n$ defines a $\varphi$-preserving conditional expectation of $A$ onto $N_m \cap N_1'$. So, the algebra $A_\infty = \cup_n A_{[-n,n]}$ is the union of such finite dimensional subalgebras that there exists a $\varphi$-preserving conditional expectation onto each of them. Hence $\varphi$ is separating [15].

2) Since $A_{[0,n]}$ is a type I subfactor of $N_n$, the algebra $N_n$ is generated by $A_{[0,n]}$ and its relative commutant $N_n \cap A_{n+1}$ in $N_n$. So, taking $\tilde{N}_n = \gamma^{-n-1}(N_n \cap A_{n+1}) \subset A_0$, we have $N_n = A_{[0,n]} \otimes \gamma^{n+1}(\tilde{N}_n)$. A $\varphi_0$-preserving conditional expectation $A_{[0,\infty)} \to N_n$ maps $A_{[m,\infty)}$ to $A_{[0,m-1]}' \cap N_n = A_{[m,n]} \otimes \gamma^{n+1}(\tilde{N}_n)$ for $m \le n$, and $A_{[n+1,\infty)}$ to $\gamma^{n+1}(\tilde{N}_n)$. Hence the algebras $\tilde{N}_n$ and $A_{[0,m]} \otimes \gamma^{m+1}(\tilde{N}_n)$, $m \le n$, are the images of $\varphi_0$-preserving conditional expectations.

Let $N = \cup_{n=1}^{\infty} \cap_{m=n}^{\infty} \tilde{N}_m$. Then $N$ is a subalgebra of $A_0$, and there exist $\varphi_0$-preserving conditional expectations onto $N$ and $A_{[0,n]} \otimes \gamma^{n+1}(N)$, $n \ge 0$. Let $E: A_{[0,\infty)} \to N$ be a $\varphi_0$-preserving conditional expectation. Since, for any $n$, $\gamma^{n+1} \circ E \circ \gamma^{-n-1}: A_{[n+1,\infty)} \to \gamma^{n+1}(N)$ is a $\varphi_0$-preserving conditional expectation, the unique $\varphi_0$-preserving conditional expectation $A_{[0,\infty)} = A_{[0,n]} \otimes A_{[n+1,\infty)} \to A_{[0,n]} \otimes \gamma^{n+1}(N)$ coincides with $\mathrm{Id}_{A_{[0,n]}} \otimes (\gamma^{n+1} \circ E \circ \gamma^{-n-1})$.

Hence, for $a_0, \dots, a_n \in A_0$, we have
\[
\begin{aligned}
E(a_0 \gamma(a_1) \cdots \gamma^n(a_n))
&= \bigl(E \circ (\mathrm{Id}_{A_0} \otimes \gamma \circ E \circ \gamma^{-1})\bigr)(a_0 \cdots \gamma^n(a_n)) \\
&= E\bigl(a_0 \gamma\bigl(E(a_1 \cdots \gamma^{n-1}(a_n))\bigr)\bigr) \\
&= \dots = (E_{a_0} \circ \dots \circ E_{a_n})(1),
\end{aligned}
\tag{4.1}
\]
where $E_a: N \to N$, $a \in A_0$, maps $b \in N$ to $E(a\gamma(b))$. (In other words, $\varphi$ is a $C^*$-finitely correlated state, see [13].)

$E_1$ maps $N$ to the center $Z(N)$ of the algebra $N$. If $p_1, \dots, p_n$ is the list of minimal projections of $Z(N)$, then the matrix $\bigl(\varphi(p_i)^{-1} \varphi(p_i \gamma(p_j))\bigr)_{ij}$ of the mapping $E_1|_{Z(N)}$ with respect to this basis is a stochastic matrix with strictly positive elements. The probability distribution $(\varphi(p_1), \dots, \varphi(p_n))$ is invariant for the corresponding Markov process, and the Markov dynamical system so obtained is simply $(Z, \varphi, \gamma)$, where $Z = (\cup_{n \in \mathbb{Z}} \gamma^n(Z(N)))''$. We want to use mixing properties of this system. We need two lemmas to do it.
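The classical picture behind the system $(Z, \varphi, \gamma)$ can be illustrated with a small numerical sketch. It assumes a hypothetical $3 \times 3$ stochastic matrix with strictly positive entries standing in for $\bigl(\varphi(p_i)^{-1}\varphi(p_i\gamma(p_j))\bigr)_{ij}$; the numbers and function names are illustrative only. The sketch finds the invariant distribution by iteration and shows that any initial distribution converges to it, which is the mixing property used below.

```python
def step(dist, P):
    """One step of the Markov chain: the row vector dist multiplied by P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def invariant_distribution(P, iterations=200):
    """Approximate the invariant distribution of a positive stochastic matrix
    by iterating from the uniform distribution."""
    n = len(P)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = step(dist, P)
    return dist


# Hypothetical stochastic matrix with strictly positive entries (rows sum to 1).
P = [
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
]

pi = invariant_distribution(P)

# Mixing: a point mass on the first state converges to the invariant distribution.
dist = [1.0, 0.0, 0.0]
for _ in range(60):
    dist = step(dist, P)

print(pi)
print(max(abs(x - y) for x, y in zip(dist, pi)))
```

Strict positivity of the entries is what makes the iteration converge geometrically from every starting distribution; with zero entries the chain could be periodic or reducible and the same loop need not converge.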


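Lemma 4.2. Let $z$ be a minimal projection of $Z(N)$, $a \in A_{(-\infty,-1]}$, $b \in A_{[1,\infty)}$. Then $\varphi(z)\varphi(azb) = \varphi(az)\varphi(zb)$.

Proof. Suppose
\[ a = \gamma^{-n}(a_n) \cdots \gamma^{-1}(a_1) \quad \text{and} \quad b = \gamma(b_1) \cdots \gamma^n(b_n), \]
where $a_1, \dots, a_n, b_1, \dots, b_n \in A_0$. Then
\[ \varphi(azb) = \varphi\bigl((E_{a_n} \circ \dots \circ E_{a_1} \circ E_z \circ E_{b_1} \circ \dots \circ E_{b_n})(1)\bigr). \]
Since $z$ is minimal in $Z(N)$, the element $E_z(n) = zE_1(n)$ is a scalar multiple of $z$ for any $n \in N$, so
\[ (E_z \circ E_{b_1} \circ \dots \circ E_{b_n})(1) = \frac{\varphi(zb)}{\varphi(z)} z. \]
Then
\[ \varphi(azb) = \frac{\varphi(zb)}{\varphi(z)} \varphi\bigl((E_{a_n} \circ \dots \circ E_{a_1})(z)\bigr) = \frac{\varphi(zb)}{\varphi(z)} \varphi(az). \]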
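Lemma 4.2 is a non-commutative form of the statement that, for a Markov chain, past and future are conditionally independent given the present. The sketch below checks the classical analogue only, under stated assumptions: a hypothetical stationary two-state chain, with $a$ and $b$ replaced by indicator functions of cylinder sets on the sites $-2, -1$ and $1, 2$, and $z$ by the indicator of a state at site $0$. None of the names or numbers come from the paper.

```python
from itertools import product

# Hypothetical two-state chain; pi is its invariant distribution (pi P = pi).
P = [[0.9, 0.1],
     [0.4, 0.6]]
pi = [0.8, 0.2]


def cylinder(word):
    """Stationary probability of seeing the states in `word` on consecutive sites."""
    prob = pi[word[0]]
    for s, t in zip(word, word[1:]):
        prob *= P[s][t]
    return prob


def markov_defect(z, length=2):
    """Largest value of |phi(z) phi(azb) - phi(az) phi(zb)| over all cylinder
    words a (past) and b (future) of the given length, for present state z."""
    worst = 0.0
    for a in product(range(2), repeat=length):
        for b in product(range(2), repeat=length):
            lhs = pi[z] * cylinder(a + (z,) + b)
            rhs = cylinder(a + (z,)) * cylinder((z,) + b)
            worst = max(worst, abs(lhs - rhs))
    return worst


print(markov_defect(0), markov_defect(1))
```

Both printed values are zero up to floating-point rounding, which mirrors the identity $\varphi(z)\varphi(azb) = \varphi(az)\varphi(zb)$ in the commutative case.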

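Lemma 4.3. The subalgebra $Z$ of $M$ lies in the centralizer $M_\varphi$ of $\varphi$. In particular, there exists a $\varphi$-preserving conditional expectation $G: M \to Z$. We have:
(i) if $a \in A_{[n,m]}$, then $G(a) \in A_{[n-1,m+1]}$;
(ii) if $a \in A_{(-\infty,n]}$, $b \in A_{[n+2,\infty)}$, then $G(ab) = G(a)G(b)$.

Proof. If $a = \gamma^{-n}(a_{-n}) \cdots \gamma^n(a_n)$, $z \in Z(N)$, where $a_{-n}, \dots, a_n \in A_0$, then by (4.1),
\[ \varphi(az) = \varphi\bigl((E_{a_{-n}} \circ \dots \circ E_{a_{-1}} \circ E_{a_0 z} \circ E_{a_1} \circ \dots \circ E_{a_n})(1)\bigr). \]
Since $E_{a_0 z} = E_{z a_0}$, this implies that $Z(N)$ lies in the centralizer of $\varphi$.

Let $a \in A_{[-n,-1]}$, $b \in A_{[1,n]}$. It is sufficient to prove that if $\tilde{a}$ (resp. $\tilde{b}$) is the image of $a$ (resp. $b$) under a $\varphi$-preserving conditional expectation $A_{[-n-1,0]} \to \vee_{k=0}^{n+1} \gamma^{-k}(Z)$ (resp. $A_{[0,n+1]} \to \vee_{k=0}^{n+1} \gamma^k(Z)$), then $G(ab) = \tilde{a}\tilde{b}$. For this it is enough to show that, for any system $z_{-k}, \dots, z_0, \dots, z_k$, $k \ge n+1$, of minimal projections of $Z$, we have
\[ \varphi\bigl(ab\,\gamma^{-k}(z_{-k})\gamma^{-k+1}(z_{-k+1}) \cdots \gamma^k(z_k)\bigr) = \varphi\bigl(\tilde{a}\tilde{b}\,\gamma^{-k}(z_{-k}) \cdots \gamma^k(z_k)\bigr). \]
Apply Lemma 4.2:
\[
\begin{aligned}
\varphi\bigl(ab\,\gamma^{-k}(z_{-k}) \cdots \gamma^k(z_k)\bigr)
&= \frac{1}{\varphi(z_0)} \varphi\bigl(a\gamma^{-k}(z_{-k}) \cdots z_0\bigr) \varphi\bigl(b z_0 \cdots \gamma^k(z_k)\bigr) \\
&= \frac{1}{\varphi(z_0)} \varphi\bigl(\gamma^{-k}(z_{-k}) \cdots \gamma^{-n-1}(z_{-n-1})\bigr) \times \frac{1}{\varphi(z_{-n-1})} \varphi\bigl(a\gamma^{-n-1}(z_{-n-1}) \cdots z_0\bigr) \\
&\quad \times \frac{1}{\varphi(z_{n+1})} \varphi\bigl(b z_0 \cdots \gamma^{n+1}(z_{n+1})\bigr) \times \varphi\bigl(\gamma^{n+1}(z_{n+1}) \cdots \gamma^k(z_k)\bigr).
\end{aligned}
\]
Since $\varphi(a\gamma^{-n-1}(z_{-n-1}) \cdots z_0) = \varphi(\tilde{a}\gamma^{-n-1}(z_{-n-1}) \cdots z_0)$ and $\tilde{a}\gamma^{-n-1}(z_{-n-1}) \cdots z_0$ is a scalar multiple of $\gamma^{-n-1}(z_{-n-1}) \cdots z_0$, we obtain the same result for $\varphi(\tilde{a}\tilde{b}\,\gamma^{-k}(z_{-k}) \cdots \gamma^k(z_k))$.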


For a subset $J$ of $\mathbb{Z}$, let $Z_J$ be the $W^*$-subalgebra of $Z$ generated by $\gamma^j(Z(N))$, $j \in J$. Since $(Z, \varphi, \gamma)$ is a classical mixing Markov dynamical system, $\cap_{n \in \mathbb{N}} Z_{(-\infty,-n] \cup [n,\infty)} = \mathbb{C}1$. By [6, 2.6.1] it is equivalent to the convergence
\[ \sup_{b \in Z_{(-\infty,-n] \cup [n,\infty)}} \frac{|\varphi(ab) - \varphi(a)\varphi(b)|}{\|b\|} \to 0 \quad \text{as } n \to \infty, \quad \forall a \in Z. \]
By virtue of Lemma 4.3 it follows that
\[ \sup_{b \in A_{(-\infty,-n] \cup [n,\infty)}} \frac{|\varphi(ab) - \varphi(a)\varphi(b)|}{\|b\|} \to 0 \quad \text{as } n \to \infty, \quad \forall a \in \cup_{m \in \mathbb{N}} A_{[-m,m]}. \]
Hence $\varphi$ is factorial by [6, 2.6.10].

3) We prove that the center $Z(M_\varphi)$ of the centralizer $M_\varphi$ is contained in the center of the algebra $M$. As we showed above, the *-algebra $A_\infty$ is the union of the finite dimensional $\sigma_t^\varphi$-invariant subalgebras. Hence the linear span of elements $b \in A_\infty$ such that $\Delta_\varphi^{1/2} b\xi_\varphi = \lambda^{1/2} b\xi_\varphi$ for some $\lambda > 0$ is s-dense in $M$. So, it is sufficient to prove that any such element commutes with $Z(M_\varphi)$.

First, we prove that $[\gamma^n(b), a] \to 0$ in s-topology for any $a \in M$. Let $\varepsilon > 0$. There exists an $a_\varepsilon \in A_\infty$ such that $\|(a - a_\varepsilon)\xi_\varphi\| < \varepsilon$. Then
\[ (a - a_\varepsilon)\gamma^n(b)\xi_\varphi = \lambda^{-1/2}(a - a_\varepsilon) j\gamma^n(b^*)\xi_\varphi = \lambda^{-1/2}(j\gamma^n(b^*)j)(a - a_\varepsilon)\xi_\varphi, \]
so that
\[ \|[a, \gamma^n(b)]\xi_\varphi\| \le (1 + \lambda^{-1/2})\|b\|\varepsilon + \|[a_\varepsilon, \gamma^n(b)]\xi_\varphi\|. \]
Since $[a_\varepsilon, \gamma^n(b)] = 0$ for $n$ sufficiently large, our assertion is proved.

For any $n$, $\gamma^n(b^*)b \in M_\varphi$. Hence, for $z \in Z(M_\varphi)$, $z\gamma^n(b^*)b = \gamma^n(b^*)bz$. Then
\[ \gamma^n(b) z \gamma^n(b^*) b = \gamma^n(bb^*) bz. \tag{4.2} \]
Since $M$ is a factor and $\{\gamma^n(bb^*)\}_n$ is central, letting $n \to \infty$, at the right hand side of (4.2) we obtain $\varphi(bb^*)bz$. The left hand side of (4.2) is equal to $\gamma^n(b)[z, \gamma^n(b^*)]b + \gamma^n(bb^*)zb$, so it weakly converges to $\varphi(bb^*)zb$. Hence $z$ lies in the center of $M$, which is trivial.

Hyperfiniteness of the factor $M_\varphi$ is evident: if $\{M_n\}_n$ is an increasing sequence of $\sigma_t^\varphi$-invariant finite dimensional subalgebras of $M$ such that $\cup_n M_n$ is weakly dense in $M$, then $\cup_n (M_n \cap M_\varphi)$ is weakly dense in $M_\varphi$.

Remark 4.4. It follows from the Perron-Frobenius theorem and the above considerations that a Markov state has exponential decay of correlations. More precisely, if $\lambda$ is the maximum of $|\mu|$ over all eigenvalues $\mu$ of $E_1$ different from 1, then there exists a constant $C > 0$ such that
\[ |\varphi(ab) - \varphi(a)\varphi(b)| \le C\lambda^n \varphi(|G(a)|)\varphi(|G(b)|), \quad \forall a \in A_{[k,l]} \ \forall b \in A_{(-\infty,k-n] \cup [l+n,\infty)}. \]

Theorem 4.5. Let $(M, \varphi, \gamma)$ be as in Theorem 4.1, then the systems $(M, \varphi, \gamma)$ and $(M_\varphi, \varphi|_{M_\varphi}, \gamma|_{M_\varphi})$ are entropic K-systems and
\[ h_\varphi(\gamma|_{M_\varphi}) = h_\varphi(\gamma) = s(\varphi) = S(\varphi|_{A_{[0,1]}}) - S(\varphi|_{A_0}). \]


Proof. Since $\varphi$ is factorial, the K-property follows from Theorem 3.1. The equality $h_\varphi(\gamma) = s(\varphi)$ was obtained by Petz [25]. Equality of $h_\varphi(\gamma|_{M_\varphi})$ and $h_\varphi(\gamma)$ also follows from his proof, but for the sake of completeness we give a proof.

The inequalities $h_\varphi(\gamma|_{M_\varphi}) \le h_\varphi(\gamma) \le s(\varphi)$ always hold. We will use the notations of the proof of Theorem 4.1. Let $M_n$ be a $\sigma_t^\varphi$-invariant subalgebra of $M$ such that $A_{[0,n]} \subset M_n \subset A_{[-1,n+1]}$ (see the proof of Theorem 4.1, 1)), and let $\tilde{M}_n = M_n \cap M_\varphi$. Then $h_\varphi(\gamma|_{M_\varphi}) = \lim_{n \to \infty} h_\varphi(\tilde{M}_n, \gamma)$ by a Kolmogorov–Sinai type theorem [10]. For any $k$, we have
\[ H_\varphi(\tilde{M}_n, \gamma(\tilde{M}_n), \dots, \gamma^{k(n+3)}(\tilde{M}_n)) \ge H_\varphi(\tilde{M}_n, \gamma^{n+3}(\tilde{M}_n), \dots, \gamma^{k(n+3)}(\tilde{M}_n)) = S(\varphi|_{M_{n,k}}), \]
where $M_{n,k}$ is the algebra generated by $M_n, \gamma^{n+3}(M_n), \dots, \gamma^{k(n+3)}(M_n)$ [10]. We need the following lemma to estimate $S(\varphi|_{M_{n,k}})$:

Lemma 4.6. Let $A \subset N \subset B$ be finite dimensional $C^*$-algebras, $A = \mathrm{Mat}_p(\mathbb{C})$, $B = \mathrm{Mat}_q(\mathbb{C})$, $\psi$ be a state of $B$. Then
\[ S(\psi|_A) + \log q/p \ge S(\psi|_N) \ge S(\psi) - \log q/p. \]

Proof. Let $\tau = \mathrm{Tr}_B(1)^{-1} \mathrm{Tr}_B$ be the unique tracial state of $B$. Let $Q_\tau \in N$ be the density matrix of $\tau|_N$, i.e. $\tau|_N = \mathrm{Tr}_N(\cdot\,Q_\tau)$. Then
\[ S(\psi|_N) = -\psi(\log Q_\tau) - S(\tau|_N, \psi|_N). \]
Every minimal projection $e$ of $N$ majorizes a minimal projection of $B$ and is equivalent to a projection which is majorized by a minimal projection of $A$. Hence $1/q \le \tau(e) \le 1/p$, so that $1/q \le Q_\tau \le 1/p$. Using monotonicity of the relative entropy, we obtain
\[ S(\psi|_N) \ge \log p - S(\tau, \psi) = S(\psi) - \log q/p. \]
Analogously
\[ S(\psi|_N) \le \log q - S(\tau|_A, \psi|_A) = S(\psi|_A) + \log q/p. \]

Applying the lemma to $A = A_{[0,n]} \otimes A_{[n+3,2n+3]} \otimes \dots \otimes A_{[k(n+3),k(n+3)+n]}$, $N = M_{n,k}$ and $B = A_{[-1,k(n+3)+n+1]}$, we obtain
\[ S(\varphi|_{M_{n,k}}) \ge S(\varphi|_{A_{[-1,k(n+3)+n+1]}}) - \log \frac{s^{(k+1)(n+3)}}{s^{(k+1)(n+1)}} \]
(recall that $A_0 = \mathrm{Mat}_s(\mathbb{C})$). Hence
\[ h_\varphi(\tilde{M}_n, \gamma) \ge \lim_{k \to \infty} \frac{1}{k(n+3)} \Bigl( S(\varphi|_{A_{[-1,k(n+3)+n+1]}}) - 2(k+1) \log s \Bigr) = s(\varphi) - \frac{2}{n+3} \log s. \]

This ends the proof of the theorem.
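The last limit can be unpacked as follows. This is a worked sketch of the arithmetic only, assuming the standard definition of the mean entropy as the limit of entropy per site; the symbol $L_k$ is introduced here for illustration.

```latex
% Dimension count: A consists of k+1 blocks of n+1 sites each, B of L_k sites.
\[
  p = s^{(k+1)(n+1)}, \qquad
  q = s^{L_k}, \quad L_k = k(n+3) + n + 3 = (k+1)(n+3),
  \qquad \log\frac{q}{p} = 2(k+1)\log s .
\]
% Mean entropy: entropy per site of an interval of length L_k tends to s(\varphi).
\[
  \frac{1}{k(n+3)}\, S\bigl(\varphi|_{A_{[-1,\,k(n+3)+n+1]}}\bigr)
  = \frac{L_k}{k(n+3)} \cdot \frac{1}{L_k}\, S\bigl(\varphi|_{A_{[-1,\,L_k-2]}}\bigr)
  \;\longrightarrow\; 1 \cdot s(\varphi) \quad (k \to \infty),
\]
\[
  \frac{2(k+1)\log s}{k(n+3)} \;\longrightarrow\; \frac{2\log s}{n+3}
  \quad (k \to \infty).
\]
```

Letting $n \to \infty$ afterwards removes the correction term $\frac{2}{n+3}\log s$, which gives $h_\varphi(\gamma|_{M_\varphi}) \ge s(\varphi)$ and, together with the inequalities $h_\varphi(\gamma|_{M_\varphi}) \le h_\varphi(\gamma) \le s(\varphi)$, the claimed equalities.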


Remark 4.7. It is not difficult to construct a system $(M, \varphi, \gamma)$, where $\varphi$ is a Markov state and $Z$ is a Cartan subalgebra of $M$. In [11] non-commutative Bernoulli shifts are defined. Analogously it is natural to call the system $(M_\varphi, \varphi, \gamma)$ a non-commutative Markov shift. It is well-known that a classical mixing Markov system is conjugate to a Bernoulli shift with the same entropy. Is this true in the non-commutative case?

5. Non-Isomorphic Entropic K-Systems

In this section we obtain an uncountable family of non-conjugate K-systems on the injective III$_1$-factor all having the same finite entropy.

Besides the space translation of a one-dimensional quantum lattice system, the examples of systems for which the conditions of Theorem 3.1 are satisfied are also the space translations of the CCR-algebra over the pre-Hilbert space $L^2_0(\mathbb{R})$ and the space translations of the even part of the CAR-algebra, when all the systems are in factor states.

Let us consider the case of space translation of the CCR-algebra in more detail. So, let $\mathcal{U}$ be the CCR-algebra over $L^2(\mathbb{R})$, $\tau$ the Bogoliubov automorphism of $\mathcal{U}$ corresponding to the space translation of 1, i.e. $\tau(W(f)) = W(Vf)$, $(Vf)(x) = f(x-1)$. (For all the facts and the definitions concerning the CCR-algebra we refer the reader to [7].) Let $A$ be a positive bounded operator on $L^2(\mathbb{R})$ that commutes with $V$, and $\omega$ be the quasi-free state corresponding to $A$. If $\operatorname{Ker} A = 0$, then the state $\omega$ is separating and
\[ \sigma_t^\omega(W(f)) = W(B^{it} f), \quad \text{where } B = \frac{A}{1+A}. \tag{5.1} \]
The GNS-triple $(H_\omega, \pi_\omega, \xi_\omega)$ corresponding to $\omega$ can be expressed in terms of the Fock representation as follows:
\[ H_\omega = F_+ \otimes F_+, \quad \xi_\omega = \Omega \otimes \Omega, \quad \pi_\omega(a^*(f)) = a^*((1+A)^{1/2} f) \otimes 1 + 1 \otimes a(JA^{1/2} f), \tag{5.2} \]
where $F_+$ is the symmetric Fock space over $L^2(\mathbb{R})$, $\Omega$ is the vacuum vector, and $J$ is an anti-linear isometric involution on $L^2(\mathbb{R})$. Then the automorphism $\tau$ is implemented by the unitary
\[ \Gamma(V) \otimes \Gamma(JVJ), \tag{5.3} \]
where $\Gamma$ is the operator of second quantization. Using (5.1) and (5.2) one also concludes that
\[ \Delta_\omega^{it} = \Gamma(B^{it}) \otimes \Gamma(JB^{it}J) = \Gamma(B^{it}) \otimes \Gamma((JBJ)^{-it}), \tag{5.4} \]
equivalently $\Delta_\omega = \Gamma(B) \otimes \Gamma((JBJ)^{-1}) = \Gamma(B) \otimes \Gamma(JB^{-1}J)$. So we see that the discrete part of the spectrum of $\Delta_\omega$ is the group generated by the eigenvalues of $B$. Moreover, if the spectrum of $B$ is continuous then $\xi_\omega$ is the unique eigenvector of $\Delta_\omega$. In the latter case the centralizer of the state $\omega$ is trivial, and hence


$M = \pi_\omega(\mathcal{U})''$ is a type III$_1$ factor (see [30, Theorem 29.9]). This factor is injective, since $\mathcal{U}$ is nuclear.

For a subset $\Lambda$ of $\mathbb{R}$, let $\mathcal{U}_\Lambda$ be the $C^*$-subalgebra of $\mathcal{U}$ generated by $W(f)$, $\operatorname{supp} f \subset \Lambda$. Then $\mathcal{U}_\Lambda$ is the CCR-algebra over $L^2(\Lambda)$, $\mathcal{U}_{\Lambda_1}$ and $\mathcal{U}_{\Lambda_2}$ commute for $\Lambda_1 \cap \Lambda_2 = \emptyset$, and $\cup_{\Lambda \text{ compact}} \pi_\omega(\mathcal{U}_\Lambda)$ is weakly dense in $M$ (though $\cup_\Lambda \mathcal{U}_\Lambda$ is not norm-dense in $\mathcal{U}$). So that the assumptions of Theorem 3.1 are satisfied with $M_0 = \mathcal{U}_{(-\infty,0]}''$.

Let us summarize what we have proved:

Proposition 5.1. Under the above notations, let $A$ have pure continuous spectrum. Then $M = \pi_\omega(\mathcal{U})''$ is the injective III$_1$-factor, the cyclic vector $\xi_\omega$ is the unique eigenvector of the modular operator $\Delta_\omega$, and the system $(M, \omega, \tau)$ is an entropic K-system.

It is worth noting that the same result holds for the even part of the CAR-algebra.

Park and Shin considered the situation when $A$ is the operator of convolution with a function $K$. Under certain conditions on $K$ they proved that
\[ h_\omega(\tau) = \frac{1}{2\pi} \int \bigl[ \eta(\hat{K}(x)) - \eta(1 + \hat{K}(x)) \bigr] \, dx, \tag{5.5} \]
where $\hat{K}(x) = \int K(y) e^{iyx} \, dy$ is the Fourier transform of $K$. The operator $A$ can be considered via the Fourier transform as the operator of multiplication by the function $\hat{K}$. Let us suppose that
\[ K(x) = o(e^{-\alpha|x|}), \text{ as } |x| \to \infty, \text{ for certain } \alpha > 0. \tag{5.6} \]
Then $\hat{K}$ is analytic in the strip $|\operatorname{Im} z| < \alpha$, and hence $A$ has pure continuous spectrum and Proposition 5.1 can be applied. The next theorem shows that such systems are usually non-conjugate.

Theorem 5.2. Let $K_i$ be a function satisfying (5.6) such that $\hat{K}_i \ge 0$, and $\omega_i$ be the state corresponding to $K_i$, $i = 1, 2$. Suppose the systems $(M, \omega_1, \tau)$ and $(M, \omega_2, \tau)$ are isomorphic. It follows that
\[ \hat{K}_2(x) = \hat{K}_1(x + 2\pi n) \]
for certain $n \in \mathbb{Z}$, equivalently $K_2(x) = e^{i2\pi n x} K_1(x)$.

Proof. It is more convenient for us to pass to the Fourier transform, i.e. we consider the automorphism $\tau$ as the Bogoliubov automorphism corresponding to the operator of multiplication by the function $e^{ix}$ and the states $\omega_1$ and $\omega_2$ as the quasi-free states corresponding to the operators of multiplication by the functions $\hat{K}_1$ and $\hat{K}_2$ respectively. The space of the GNS-representation corresponding to $\omega_i$, $i = 1, 2$, is identified with $F_+ \otimes F_+$ as described above, and we choose $J$ to be the usual pointwise conjugation on $L^2(\mathbb{R})$.

An isomorphism of our systems is implemented by a unitary $U$ on $F_+ \otimes F_+$. This operator maps $\xi_{\omega_1}$ to $\xi_{\omega_2}$, conjugates the modular operators and the operators implementing the automorphisms. In view of the identities (5.3), (5.4) this means that $U(\Omega \otimes \Omega) = \Omega \otimes \Omega$,
\[ U \, \Gamma(e^{iX}) \otimes \Gamma(e^{-iX}) \, U^* = \Gamma(e^{iX}) \otimes \Gamma(e^{-iX}), \tag{5.7} \]
\[ U \, \Gamma(B_1^{it}) \otimes \Gamma(B_1^{-it}) \, U^* = \Gamma(B_2^{it}) \otimes \Gamma(B_2^{-it}), \tag{5.8} \]


where $X$ is the operator of multiplication by $x$ and $B_j$ is the operator of multiplication by the function $D_j = \hat{K}_j(1 + \hat{K}_j)^{-1}$, $j = 1, 2$.

For non-negative integers $l, m$ let $P_{l,m}$ be the projection onto the subspace $(F_+)_l \otimes (F_+)_m$ of $F_+ \otimes F_+$. There exist $l$ and $m$ such that $l + m \ge 1$ and $T = P_{l,m} U P_{1,0} \ne 0$. Then, identifying $(F_+)_p \otimes (F_+)_q$ with the subspace of $L^2(\mathbb{R}^{p+q})$ consisting of functions $f(x_1, \dots, x_p, y_1, \dots, y_q)$ symmetric on each group of variables, we can rewrite the identities (5.7) and (5.8) as follows:
\[ T e^{iX} = e^{i(X_1 + \dots + X_l - Y_1 - \dots - Y_m)} T, \tag{5.7$'$} \]
\[ T D_1(X)^{it} = \left( \frac{D_2(X_1) \cdots D_2(X_l)}{D_2(Y_1) \cdots D_2(Y_m)} \right)^{it} T. \tag{5.8$'$} \]
The identity (5.7$'$) implies that
\[ T f(X) = f(X_1 + \dots + X_l - Y_1 - \dots - Y_m) T \tag{5.9} \]
for any bounded measurable $2\pi$-periodic function $f$ on $\mathbb{R}$. Indeed, the algebra of functions for which (5.9) holds is closed under pointwise limits of bounded sequences and contains $e^{ix}$.

Now we take an integer $k$ such that $T|_{L^2(2\pi k, 2\pi(k+1))} \ne 0$. Denoting by $D$ the $2\pi$-periodic function that coincides with $D_1$ on $(2\pi k, 2\pi(k+1))$ and taking a function $\xi \in L^2(2\pi k, 2\pi(k+1))$ for which $T\xi \ne 0$, we obtain:
\[
\left( \frac{D_2(X_1) \cdots D_2(X_l)}{D_2(Y_1) \cdots D_2(Y_m)} \right)^{it} T\xi = T D_1(X)^{it} \xi = T D(X)^{it} \xi = D(X_1 + \dots + X_l - Y_1 - \dots - Y_m)^{it} T\xi.
\]
Then
\[ \left( \frac{D_2(x_1) \cdots D_2(x_l)}{D_2(y_1) \cdots D_2(y_m)} \right)^{it} = D(x_1 + \dots + x_l - y_1 - \dots - y_m)^{it} \tag{5.10} \]
for any $t \in \mathbb{R}$ for almost all $(x_1, \dots, y_m)$ belonging to the support of $T\xi$. Hence there exists a set $\Lambda \subset \mathbb{R}^{l+m}$ of positive measure such that
\[ \frac{D_2(x_1) \cdots D_2(x_l)}{D_2(y_1) \cdots D_2(y_m)} = D(x_1 + \dots + x_l - y_1 - \dots - y_m) \]
on $\Lambda$ (we can take $\Lambda$ to be the set of $(x_1, \dots, y_m)$ for which (5.10) holds for any rational $t$). Replacing, if necessary, $\Lambda$ by a subset of positive measure we find an integer $n$ such that $(x_1 + \dots + x_l - y_1 - \dots - y_m) \in (2\pi(k-n), 2\pi(k-n+1))$ for any $(x_1, \dots, y_m) \in \Lambda$. Then
\[ \frac{D_2(x_1) \cdots D_2(x_l)}{D_2(y_1) \cdots D_2(y_m)} = D_1(x_1 + \dots + x_l - y_1 - \dots - y_m + 2\pi n) \tag{5.11} \]
on $\Lambda$. Using the Fubini theorem and the uniqueness theorem for meromorphic functions we conclude that (5.11) holds on $\mathbb{R}^{l+m}$. (Indeed, there exists a subset $\tilde{\Lambda}$ of $\mathbb{R}^{l+m-1}$ of positive measure such that, for any $(x_1, \dots, x_l, y_1, \dots, y_{m-1}) \in \tilde{\Lambda}$, the identity (5.11) holds on a set of positive measure on $y_m$. Then (5.11) holds on $\tilde{\Lambda} \times \mathbb{R}$, and so on.)


The following cases are possible:

1) $l + m \ge 2$. Comparing the level sets of the functions in (5.11) corresponding to the values $0$ and $\infty$, we see that $D_1$ and $D_2$ have neither roots nor poles on the real axis. Taking the logarithm and comparing the power-series expansions for $\log D_i$, $i = 1, 2$, one concludes that the functions $\log D_1$ and $\log D_2$ are linear. So,
\[ D_2(x) = c e^{ax}, \qquad D_1(x) = c^{l-m} e^{a(x - 2\pi n)} \]
for certain real $c$ and $a$. This contradicts the fact that $D_i(x) = \hat{K}_i(x)(1 + \hat{K}_i(x))^{-1} \to 0$, as $|x| \to \infty$.

2) $l = 0$, $m = 1$. Then we have $D_2(y)^{-1} = D_1(2\pi n - y)$. This is impossible by the same reason as above.

3) $l = 1$, $m = 0$. Then $D_2(x) = D_1(x + 2\pi n)$. Hence $\hat{K}_2(x) = D_2(x)(1 - D_2(x))^{-1} = \hat{K}_1(x + 2\pi n)$.

The simplest example of an entropic K-system is the shift automorphism of an infinite tensor product algebra with a faithful product-state. We shall call such systems Bernoullian.

Theorem 5.3. Let $N$ be a von Neumann algebra and $\psi$ be a normal faithful state of $N$. For each integer $n$, let $(N_n, \psi_n)$ be a copy of $(N, \psi)$ and $(M, \varphi)$ be the $W^*$-tensor product $\otimes_n (N_n, \psi_n)$. The right shift automorphism of $M$ is denoted by $\gamma$. Suppose that $h_\varphi(\gamma) < \infty$. Then $N$ is at most a countable sum of factors of type I.

Proof. Let $\tilde{S}(\psi) = \sup \sum_i \lambda_i S(\psi|_A, \psi_i|_A)$, where the supremum is taken over all finite convex decompositions $\psi = \sum_i \lambda_i \psi_i$ into states, over all finite dimensional abelian subalgebras $A$ of $N$. The proof of Theorem 6.10 in [21] shows that if $\tilde{S}(\psi) < \infty$, then $N$ is at most a countable sum of factors of type I. On the other hand, for any finite dimensional subalgebra $A$ of $N_0$, we have
\[ h_\varphi(A, \gamma) \ge \sup_{\psi_0 = \sum_i \lambda_i \psi_i} \sum_i \lambda_i S(\psi_0|_A, \psi_i|_A), \]
so that $h_\varphi(\gamma) \ge \tilde{S}(\psi)$.

Corollary 5.4. Let $(M, \omega, \alpha)$ be an entropic K-system. Suppose $h_\omega(\alpha) < \infty$, $\omega$ is faithful, and the modular operator $\Delta_\omega$ is not diagonalizable. Then the system $(M, \omega, \alpha)$ is non-Bernoullian.

Now we return to the Park–Shin systems considered above. Let $U$ and $V_\theta$, $\theta \in \mathbb{R}$, be the unitary operators on $L^2(\mathbb{R})$ defined by
\[ (Uf)(x) = f(x - 1), \qquad (V_\theta f)(x) = e^{i\theta x} f(x), \]
and $\tau_\theta$ be the Bogoliubov automorphism of $\mathcal{U}$ corresponding to the operator $V_\theta U V_{-\theta}$. (Note that $\tau_{\theta + 2\pi} = \tau_\theta$.)

Theorem 5.5. Let $L$ be a non-zero smooth compactly supported function such that $\hat{L} \ge 0$, $K = L * L$, $\omega$ the quasi-free state of $\mathcal{U}$ corresponding to $K$, $M = \pi_\omega(\mathcal{U})''$. Then $M$ is the injective III$_1$-factor and


1) for any $\theta$, the system $(M, \omega, \tau_\theta)$ is a non-Bernoullian entropic K-system with the entropy
\[ h_\omega(\tau_\theta) = \frac{1}{2\pi} \int \bigl[ \eta(\hat{K}(x)) - \eta(1 + \hat{K}(x)) \bigr] \, dx; \]
2) for $0 \le \theta < 2\pi$, the systems $(M, \omega, \tau_\theta)$ are pairwise non-conjugate.

Proof. $M$ is the injective III$_1$-factor and $(M, \omega, \tau_0)$ is a K-system by Proposition 5.1. $(M, \omega, \tau_0)$ is non-Bernoullian by virtue of Corollary 5.4. Let $A$ be the operator of convolution with the function $K$. Since $V_{-\theta} A V_\theta$ is the operator of convolution with the function $e^{-i\theta x} K(x)$, the Bogoliubov automorphism corresponding to $V_{-\theta}$ conjugates the systems $(\mathcal{U}, \omega, \tau_\theta)$ and $(\mathcal{U}, \omega_{-\theta}, \tau_0)$, where $\omega_{-\theta}$ is the quasi-free state corresponding to the operator of convolution with the function $e^{-i\theta x} K(x)$. Hence
\[
\begin{aligned}
h_\omega(\tau_\theta) = h_{\omega_{-\theta}}(\tau_0)
&= \frac{1}{2\pi} \int \bigl[ \eta(\hat{K}(x - \theta)) - \eta(1 + \hat{K}(x - \theta)) \bigr] \, dx \\
&= \frac{1}{2\pi} \int \bigl[ \eta(\hat{K}(x)) - \eta(1 + \hat{K}(x)) \bigr] \, dx = h_\omega(\tau_0).
\end{aligned}
\]
Thus our theorem follows from what we have proved and Theorem 5.2.
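The key step in the proof, that the entropy integral does not change when $\hat{K}$ is translated by $\theta$, can be checked numerically. The sketch below makes two assumptions that are not taken from the paper: $\eta(t) = -t \log t$ (the usual convention in this entropy theory) and a Gaussian $\hat{K}(x) = e^{-x^2}$ as a hypothetical stand-in for the Fourier transform of an admissible $K$. The function names and the integration window are illustrative.

```python
import math


def eta(t):
    """eta(t) = -t log t, with eta(0) = 0 (assumed convention)."""
    return 0.0 if t <= 0 else -t * math.log(t)


def k_hat(x):
    """Hypothetical non-negative, rapidly decaying stand-in for K-hat."""
    return math.exp(-x * x)


def entropy_integral(theta, half_width=20.0, steps=4000):
    """Trapezoidal approximation of
    (1 / 2 pi) * integral of [eta(K-hat(x - theta)) - eta(1 + K-hat(x - theta))] dx
    over [-half_width, half_width]."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        k = k_hat(x - theta)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * (eta(k) - eta(1 + k))
    return total * h / (2 * math.pi)


print(entropy_integral(0.0), entropy_integral(1.3))
```

The two printed values agree to within the quadrature error, as expected from the change of variables $x \mapsto x + \theta$. The integrand $-\hat{K}\log\hat{K} + (1 + \hat{K})\log(1 + \hat{K})$ is non-negative, so the value is positive whenever $\hat{K}$ is not identically zero.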

Remark 5.6. Under the assumptions of Theorem 5.5 $K$ has a compact support, and if $\operatorname{supp} f_i \cap (\operatorname{supp} f_j + \operatorname{supp} K) = \emptyset$ and $\operatorname{supp} f_i \cap \operatorname{supp} f_j = \emptyset$ for $i \ne j$, then
\[ \omega(W(f_1) \cdots W(f_n)) = \omega(W(f_1)) \cdots \omega(W(f_n)). \]
Recalling the proof of Theorem 3.1 one sees that such a clustering property simplifies the proof crucially. So that the K-property for the systems in Theorem 5.5 (as well as for Bernoullian systems) is rather evident.

Acknowledgement. One of the authors (V.G.) would like to thank Prof. E. Kissin and the University of North London for hospitality.

References

1. Accardi, L.: A noncommutative Markov property, Functional. Anal. i Prilozen. 9, 1–8 (1975) (in Russian)
2. Araki, H.: Gibbs states of a one dimensional quantum lattice. Commun. Math. Phys. 14, 120–157 (1969)
3. Benatti, F.: Deterministic Chaos in Infinite Quantum Systems. Berlin–Heidelberg–New York: Springer, 1993
4. Benatti, F., Narnhofer H.: Strong asymptotic abelianness for entropic K-systems. Commun. Math. Phys. 136, 231–250 (1991)
5. Bezuglyi, S.I., Golodets, V.Ya.: Dynamical entropy for Bogoliubov actions of free abelian groups on the CAR-algebra. Ergod. Th. and Dynam. Sys. 17, 757–782 (1997)
6. Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics I. New-York: Springer, 1987
7. Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics II. New-York: Springer, 1987
8. Choda, M.: Entropy for *-endomorphisms and relative entropy for subalgebras. J. Operator Theory, 25, 125–140 (1991)
9. Connes, A.: Entropie de Kolmogoroff-Sinai et mechanique statistique quantique. C. R. Acad. Sc. 301, 1–6 (1985)
10. Connes, A., Narnhofer, H., Thirring, W.: Dynamical entropy of C*-algebras and von Neumann algebras. Commun. Math. Phys. 112, 691–719 (1987)
11. Connes, A., Størmer, E.: Entropy for automorphisms of II1 von Neumann algebras. Acta Math. 134, 289–306 (1975)
12. Cornfeld, I.P., Fomin, S.V., Sinai, Ya.G.: Ergodic Theory. New-York: Springer, 1980
13. Fannes, M., Nachtergaele, B., Werner, R.F.: Finitely correlated states on quantum spin chains. Commun. Math. Phys. 144, 443–490 (1992)
14. Golodets, V.Ya., Størmer, E.: Entropy of C*-dynamical systems defined by bitstreams. Ergod. Th. & Dynam Sys. 18, 1–16 (1998)
15. Golodets, V.Ya., Zholtkevich, G.N.: Markovian Kubo–Martin–Schwinger states. Teoret. Mat. Fiz. 56, 80–86 (1983) (in Russian)
16. Kolmogorov, A.: A new metric invariant of transitive dynamical systems and automorphisms of Lebesgue space. Dokl. Akad. Nauk., 119, 861–864 (1958) (in Russian)
17. McDuff, D.: Central sequences and the hyperfinite factor. Proc. London Math. Soc. 21, 443–461 (1970)
18. Narnhofer, H., Størmer, E., Thirring, W.: C*-dynamical systems for which the tensor product formula for entropy fails. Ergod. Th. and Dynam. Sys. 15, 961–968 (1995)
19. Narnhofer, H., Thirring, W.: Quantum K-systems. Commun. Math. Phys. 125, 564–577 (1989)
20. Narnhofer, H., Thirring, W.: Clustering for algebraic K-system. Lett. Math. Phys. 30, 307–316 (1994)
21. Ohya, M., Petz, D.: Quantum Entropy and Its Use. Berlin–Heidelberg–New York: Springer, 1993
22. Ornstein, D.: An example of a K-automorphism that is not a Bernoulli shift. Adv. in Math. 10, 49–62 (1973)
23. Ornstein, D., Shields, P.C.: An uncountable family of K-automorphisms. Adv. in Math. 10, 63–88 (1973)
24. Park, Y.M., Shin, H.H.: Dynamical entropy of space translations of CAR and CCR algebras with respect to quasi-free states. Commun. Math. Phys. 152, 497–537 (1993)
25. Petz, D.: Entropy of Markov states. Math. Pura ed Appl. 14, 33–42 (1994)
26. Pimsner, M., Popa, S.: Entropy and index for subfactors. Ann. Sci. Ecole Norm. Sup. 19, 57–106 (1986)
27. Price, G.: The entropy of rational Powers shifts. USNA preprint to appear in AMS Proc.
28. Rohlin, V.A., Sinai, Ya.G.: Constructions and properties of invariant measurable partitions. Dokl. Akad. Nauk. 141, 1038–1041 (1961) (in Russian)
29. Sinai, Ya.G.: On the concept of entropy for dynamical systems. Dokl. Akad. Nauk, 124, 768–771 (1959) (in Russian)
30. Strătilă, S.: Modular Theory in Operator Algebras. Abacus Press, 1981
31. Størmer, E., Voiculescu, D.: Entropy of Bogoliubov automorphisms of the Canonical Anticommutation Relations. Commun. Math. Phys. 133, 521–542 (1990)
32. Yin, H.S.: Entropy of certain noncommutative shifts. Rocky Mountain J. Math. 20, 651–656 (1990)

Communicated by A. Connes

Commun. Math. Phys. 195, 233 – 247 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Quantization of Infinitely Reducible Generalized Chern–Simons Actions in Two Dimensions

Noboru Kawamoto, Kazuhiko Suehiro, Takuya Tsukioka, Hiroshi Umetsu

Department of Physics, Hokkaido University, Sapporo, 060, Japan.

Received: 4 April 1997 / Accepted: 19 November 1997

Abstract: We investigate the quantization of the two-dimensional version of the generalized Chern–Simons actions which were proposed previously. The models turn out to be infinitely reducible and thus we need an infinite number of ghosts, antighosts and the corresponding antifields. The quantized minimal actions which satisfy the master equation of Batalin and Vilkovisky have the same Chern–Simons form. The infinite fields and antifields are successfully controlled by the unified treatment of generalized fields with quaternion algebra. This is a universal feature of generalized Chern–Simons theory and thus the quantization procedure can be naturally extended to arbitrary even dimensions.

1. Introduction

The Chern–Simons action has many applications for physical mechanisms and formalisms. In particular it was used to formulate three-dimensional Einstein gravity [42]. Two possible reasons why three-dimensional Einstein gravity was successfully formulated by the Chern–Simons action are based on the facts that the action is formulated by differential forms on the one hand and the three-dimensional Einstein gravity has no dynamical degrees of freedom on the other hand.

One of the authors (N.K.) and Watabiki have proposed a new type of topological actions in arbitrary dimensions which have the Chern–Simons form [29, 30, 31, 32]. The actions have the same algebraic structure as the ordinary Chern–Simons action and are formulated by differential forms. It was shown that two-dimensional topological gravities [31] and a four-dimensional topological conformal gravity [32] were formulated by the even-dimensional version of the generalized Chern–Simons actions.

It is interesting to ask if the models defined by the generalized Chern–Simons actions are well-defined at the quantum level and thus lead to the quantization of topological gravity. It turns out that the quantization of the generalized Chern–Simons action is highly nontrivial. The reasons are twofold: Firstly the action has a zero form square term multiplied by the highest form and thus breaks the regularity condition. Secondly the theory is highly reducible, in fact infinitely reducible, as we show in this paper. Thus the models formulated by the generalized Chern–Simons actions provide their own interesting problems for the known quantization procedures such as the Batalin and Vilkovisky formulation of the master equation [5, 6], the Batalin, Fradkin and Vilkovisky Hamiltonian formulation [17, 4, 16, 2, 3] and the quantization procedure of cohomological perturbation [24].

It was shown in the quantization of the simplest abelian version of the generalized Chern–Simons action that the particular type of regularity violation does not cause serious problems for the quantization [27]. In this paper we investigate the nonabelian version of Chern–Simons actions which turn out to be infinitely reducible. We show that the quantization of this infinitely reducible system can be treated successfully by the unified treatment of fields and antifields of the generalized Chern–Simons theory. It is interesting to note that the nonabelian version of the generalized Chern–Simons actions provides the most fruitful examples for the quantization of infinitely reducible systems among the known examples such as the Brink–Schwarz superparticle [13, 36, 12, 26, 35, 18, 9], Green–Schwarz superstring [19, 20, 25, 15, 26] and covariant string field theories [41, 22, 23, 43, 33, 37, 11].

2. Generalized Chern–Simons Theory

The generalized Chern–Simons theory is a generalization of the ordinary three dimensional Chern–Simons theory into arbitrary dimensions [29–32]. The main point of the generalization is to extend a one form gauge field to a quaternion valued generalized gauge field $\mathcal{A}$ which contains forms of all possible degrees. Correspondingly a gauge symmetry is extended and it is described by a quaternion valued gauge parameter $\mathcal{V}$. It was shown that this formulation can naturally incorporate fermionic gauge fields and parameters as well. In the most general form, a generalized gauge field $\mathcal{A}$ and a gauge parameter $\mathcal{V}$ are defined by the following component form,

$$\mathcal{A} = \mathbf{1}\psi + \mathbf{i}\hat{\psi} + \mathbf{j}A + \mathbf{k}\hat{A}, \tag{2.1}$$

$$\mathcal{V} = \mathbf{1}\hat{a} + \mathbf{i}a + \mathbf{j}\hat{\alpha} + \mathbf{k}\alpha, \tag{2.2}$$

where $(\psi, \alpha)$, $(\hat{\psi}, \hat{\alpha})$, $(A, a)$ and $(\hat{A}, \hat{a})$ are direct sums of fermionic odd forms, fermionic even forms, bosonic odd forms and bosonic even forms, respectively, and they take values on a gauge algebra. The boldface symbols $\mathbf{1}$, $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ are elements of the quaternion. The two types of component expansions (2.1) and (2.2), which belong to $\Lambda_-$ and $\Lambda_+$ classes, can be regarded as generalizations of odd forms and even forms, respectively. In the case of even-dimensional formulation a gauge algebra can simply be chosen as such an algebra as is closed within commutators and anticommutators. In this case the elements in $\Lambda_-$ and $\Lambda_+$ classes fulfill the following $Z_2$ grading structure:

$$\lambda_+\lambda_+ \in \Lambda_+, \qquad \lambda_-\lambda_+ \in \Lambda_-, \qquad \lambda_-\lambda_- \in \Lambda_+, \tag{2.3}$$

where $\lambda_+ \in \Lambda_+$, $\lambda_- \in \Lambda_-$. In general, a graded Lie algebra is necessary to accommodate odd-dimensional formulation. The even-dimensional version of actions proposed by Kawamoto and Watabiki possess the following Chern–Simons form [29, 30]:

Quantization of Infinitely Reducible Generalized Chern–Simons Actions

$$S = \frac{1}{2}\int_M \mathrm{Tr}_{\mathbf{k}}\left(\mathcal{A}Q\mathcal{A} + \frac{2}{3}\mathcal{A}^3\right), \tag{2.4}$$

where $Q = \mathbf{j}d \in \Lambda_-$ is the exterior derivative and $\mathrm{Tr}_{\mathbf{k}}(\cdots)$ is defined so as to pick up only the coefficient of $\mathbf{k}$ from $(\cdots)$ and take the trace of the gauge algebra. The $\mathbf{k}$ component of an element in the $\Lambda_-$ class includes only bosonic even forms and thus the action (2.4) leads to an even-dimensional one. We then need to pick up $d$-form terms corresponding to the $d$-dimensional manifold $M$. Since this action has the same structure as the ordinary three-dimensional Chern–Simons action, it is invariant under the following gauge transformation,

$$\delta\mathcal{A} = [\,Q + \mathcal{A},\ \mathcal{V}\,]. \tag{2.5}$$

It should be noted that this symmetry is much larger than the usual gauge symmetry since the gauge parameter $\mathcal{V}$ contains many parameters of various forms. Since anticommutators as well as commutators for elements of the gauge algebra appear in the explicit form of the gauge transformations, we need to use an algebra which is closed within commutators and anticommutators. A specific example of the algebra is realized by Clifford algebra. In general a generalized gauge theory can be formulated for a graded Lie algebra which includes a supersymmetry algebra as a special example [30]. The equation of motion of this theory is

$$\mathcal{F} = 0, \tag{2.6}$$

where $\mathcal{F}$ is a generalized curvature, given by

$$\mathcal{F} = (Q + \mathcal{A})^2 = Q\mathcal{A} + \mathcal{A}^2. \tag{2.7}$$
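The closure requirement on the gauge algebra can be checked explicitly for a concrete case. The following is a minimal Python sketch, assuming the four generators 1 and iσ_k (k = 1, 2, 3) built from the Pauli matrices, which is the Clifford-algebra example taken up in Sect. 3. The helper names (`matmul`, `components`, `is_closed`) are illustrative and not part of the paper. The sketch only tests that every commutator and anticommutator of two generators is again a real linear combination of the generators.

```python
# Generators T^a = {1, i*sigma_k} as 2x2 complex matrices (nested tuples).
I2 = ((1, 0), (0, 1))
SX = ((0, 1), (1, 0))
SY = ((0, -1j), (1j, 0))
SZ = ((1, 0), (0, -1))


def scale(c, a):
    return tuple(tuple(c * x for x in row) for row in a)


def add(a, b):
    return tuple(tuple(x + y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))


def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def trace(a):
    return a[0][0] + a[1][1]


T = [I2] + [scale(1j, s) for s in (SX, SY, SZ)]
# Tr(T^a T^b) is diagonal in this basis: 2 for the unit, -2 for each i*sigma_k.
ETA = [trace(matmul(t, t)) for t in T]


def components(m):
    """Coefficients c_a such that m = sum_a c_a T^a."""
    return [trace(matmul(T[a], m)) / ETA[a] for a in range(4)]


def in_real_span(m, tol=1e-12):
    c = components(m)
    rebuilt = scale(0, I2)
    for ca, ta in zip(c, T):
        rebuilt = add(rebuilt, scale(ca, ta))
    same = all(abs(rebuilt[i][j] - m[i][j]) < tol for i in range(2) for j in range(2))
    real = all(abs(complex(x).imag) < tol for x in c)
    return same and real


def is_closed():
    """True if all brackets of generators are real combinations of generators."""
    for a in T:
        for b in T:
            ab, ba = matmul(a, b), matmul(b, a)
            commutator = add(ab, scale(-1, ba))
            anticommutator = add(ab, ba)
            if not (in_real_span(commutator) and in_real_span(anticommutator)):
                return False
    return True


print(is_closed())
```

The check is deliberately restricted to the generators themselves; it says nothing about the hermiticity conditions imposed on the fields, which are a separate choice.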

3. Infinite Reducibility of Two-Dimensional Models

Hereafter we consider the action (2.4) in two dimensions with a nonabelian gauge algebra as a concrete example although we will see that models in arbitrary even dimensions can be treated in the similar way. A simple example for nonabelian gauge algebras is given by the Clifford algebra $c(0, 3)$ generated by $\{T^a\} = \{1, i\sigma^k;\ k = 1, 2, 3\}$, where $\sigma^k$'s are Pauli matrices [31]. For simplicity we omit fermionic gauge fields and parameters in the starting action and gauge transformations. It is, however, easy to recover them in the subsequent formulation. Then the action expanded into components is given by

$$S_0 = -\int d^2x\, \mathrm{Tr}\left\{ \epsilon^{\mu\nu}(\partial_\mu\omega_\nu + \omega_\mu\omega_\nu)\phi + \frac{1}{2}\epsilon^{\mu\nu}B_{\mu\nu}\phi^2 \right\}, \tag{3.1}$$

where $\phi$, $\omega_\mu$ and $B_{\mu\nu}$ are scalar, vector and antisymmetric tensor fields, respectively, and $\epsilon^{01} = 1$.¹ This Lagrangian possesses gauge symmetries corresponding to (2.5),

$$\delta\phi = [\phi, v_1], \tag{3.2}$$

$$\delta\omega_\mu = \partial_\mu v_1 + [\omega_\mu, v_1] - \{\phi, u_{1\mu}\}, \tag{3.3}$$

$$\delta B = \epsilon^{\mu\nu}(\partial_\mu u_{1\nu} + [\omega_\mu, u_{1\nu}]) + [B, v_1] + [\phi, b_1], \tag{3.4}$$

¹ Throughout this paper we impose $\phi^\dagger = -\phi$, $\omega_\mu^\dagger = -\omega_\mu$ and $B_{\mu\nu}^\dagger = B_{\mu\nu}$ to make the classical action hermitian.


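where $B$ is defined by $B \equiv \frac{1}{2}\epsilon^{\mu\nu}B_{\mu\nu}$ and $b_1$ by $b_1 \equiv \frac{1}{2}\epsilon^{\mu\nu}b_{1\mu\nu}$. Equations of motion of this theory are given by

$$\phi:\quad -\epsilon^{\mu\nu}(\partial_\mu\omega_\nu + \omega_\mu\omega_\nu) - \{\phi, B\} = 0, \tag{3.5}$$

$$\omega_\mu:\quad -\epsilon^{\mu\nu}(\partial_\nu\phi + [\omega_\nu, \phi]) = 0, \tag{3.6}$$

$$B:\quad -\phi^2 = 0. \tag{3.7}$$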

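This system is on-shell reducible since the gauge transformations (3.2)–(3.4) are invariant under the transformations

$$\delta v_1 = \{\phi, v_2\},$$

$$\delta u_{1\mu} = \partial_\mu v_2 + [\omega_\mu, v_2] - [\phi, u_{2\mu}],$$

$$\delta b_1 = \epsilon^{\mu\nu}(\partial_\mu u_{2\nu} + [\omega_\mu, u_{2\nu}]) + \{B, v_2\} + \{\phi, b_2\},$$

with the on-shell conditions. However this is not the end of the story. Indeed this system is infinitely on-shell reducible, i.e., successive reducibilities are given by the following relations:

$$\delta v_n = [\phi, v_{n+1}]_{(-)^{n+1}}, \tag{3.8}$$

$$\delta u_{n\mu} = \partial_\mu v_{n+1} + [\omega_\mu, v_{n+1}] - [\phi, u_{n+1\,\mu}]_{(-)^{n}}, \tag{3.9}$$

$$\delta b_n = \epsilon^{\mu\nu}(\partial_\mu u_{n+1\,\nu} + [\omega_\mu, u_{n+1\,\nu}]) + [B, v_{n+1}]_{(-)^{n+1}} + [\phi, b_{n+1}]_{(-)^{n+1}}, \qquad n = 1, 2, 3, \cdots, \tag{3.10}$$

where $[\ ,\ ]_{(-)^n}$ is a commutator for odd $n$ and an anticommutator for even $n$. This fact is more easily understood by using compact notations such as the generalized gauge field $\mathcal{A}$ and parameter $\mathcal{V}$. We define $\mathcal{V}_n$ from $v_n$, $u_{n\mu}$ and $b_n$ by

$$\mathcal{V}_{2n} = \mathbf{j}\,u_{2n\,\mu}dx^\mu + \mathbf{k}\left(v_{2n} + \frac{1}{2}b_{2n\,\mu\nu}dx^\mu\wedge dx^\nu\right) \in \Lambda_-, \tag{3.11}$$

$$\mathcal{V}_{2n+1} = \mathbf{1}\left(v_{2n+1} + \frac{1}{2}b_{2n+1\,\mu\nu}dx^\mu\wedge dx^\nu\right) - \mathbf{i}\,u_{2n+1\,\mu}dx^\mu \in \Lambda_+, \qquad n = 0, 1, 2, \cdots, \tag{3.12}$$

where $v_0 = \phi$, $u_{0\mu} = \omega_\mu$ and $b_0 = B$ and thus $\mathcal{V}_0 = \mathcal{A}$. Then Eqs. (3.2)–(3.4) and (3.8)–(3.10) can be described in the following compact form,

$$\delta\mathcal{V}_n = (-)^n[\,Q + \mathcal{A},\ \mathcal{V}_{n+1}\,]_{(-)^{n+1}}, \qquad n = 0, 1, 2, \cdots. \tag{3.13}$$

Using these notations, it is easy to see the on-shell reducibility

$$\begin{aligned}
\delta\mathcal{V}_n\Big|_{\mathcal{V}_{n+1}\to\mathcal{V}_{n+1}+\delta\mathcal{V}_{n+1}}
&= (-)^n[\,Q + \mathcal{A},\ \mathcal{V}_{n+1} + \delta\mathcal{V}_{n+1}\,]_{(-)^{n+1}} \\
&= \delta\mathcal{V}_n + (-)^n\Big[\,Q + \mathcal{A},\ (-)^{n+1}[\,Q + \mathcal{A},\ \mathcal{V}_{n+2}\,]_{(-)^{n+2}}\Big]_{(-)^{n+1}} \\
&= \delta\mathcal{V}_n - [\,\mathcal{F},\ \mathcal{V}_{n+2}\,] \\
&= \delta\mathcal{V}_n,
\end{aligned} \tag{3.14}$$

where we have used the equation of motion (2.6). Actually the infinite on-shell reducibility is a common feature of generalized Chern–Simons theories with nonabelian gauge algebras in arbitrary dimensions, which can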


be understood by the fact that (3.14) is the relation among the generalized gauge fields and parameters. Thus generalized Chern–Simons theories add another category of infinitely reducible systems to known examples like Brink–Schwarz superparticle [13, 36, 12, 26, 35, 18, 9], Green–Schwarz superstring [19, 20, 25, 15, 26] and covariant string field theories [41, 22, 23, 43, 33, 37, 11]. It should be noted that this theory is infinitely reducible though it contains only a finite number of fields of finite rank antisymmetric tensors. The Brink–Schwarz superparticle and Green–Schwarz superstring are similar examples in the sense that they contain only a finite number of fields yet are infinitely reducible. In the present case the infinite reducibility is understood from the following facts: Firstly, the highest form degrees of Vn are unchanged from that of Vn−1 in Eq. (3.13) since the generalized gauge field A contains the zero form gauge field φ. Secondly, the generalized Chern–Simons actions possess the same functional form (2.4) as the ordinary Chern–Simons action and thus have the vanishing curvature condition as the equation of motion; F = 0 (2.6). Thus Eqs. (3.13) representing the infinite reducibilities have the same form at any stage n, except for the difference between commutators and anticommutators. Algebraically, the structure of infinite reducibility resembles that of string field theories of a Chern–Simons form. Before closing this section, we compare the generalized Chern–Simons theory of the abelian gl(1, R) algebra, which was investigated previously [27], with the model of nonabelian algebra. In the abelian case commutators in the gauge algebra vanish while only anticommutators remain. Then we can consistently put all transformation parameters to be zero except for v1 , u1µ and v2 . This leads to the previous analysis that the abelian version was quantized as a first stage reducible system. 
In nonabelian cases, however, infinite reducibility is the universal and inevitable feature of the generalized Chern–Simons theories.

4. Minimal Sector

In this section we present a construction of the minimal part of quantized action based on the Lagrangian formulation given by Batalin and Vilkovisky [5, 6]. In the construction of Batalin and Vilkovisky, ghosts and ghosts for ghosts and the corresponding antifields are introduced according to the reducibility of the theory. We denote a minimal set of fields by $\Phi^A$ which include classical fields and ghost fields, and the corresponding antifields by $\Phi^*_A$. If a field has ghost number $n$, its antifield has ghost number $-n-1$. Then a minimal action is obtained by solving the classical master equation,

$$(S_{\min}(\Phi, \Phi^*),\ S_{\min}(\Phi, \Phi^*)) = 0, \tag{4.1}$$

$$(X, Y) = \frac{\partial_r X}{\partial\Phi^A}\frac{\partial_l Y}{\partial\Phi^*_A} - \frac{\partial_r X}{\partial\Phi^*_A}\frac{\partial_l Y}{\partial\Phi^A}, \tag{4.2}$$

with the following boundary conditions,

$$S_{\min}\Big|_{\Phi^*_A = 0} = S_0, \tag{4.3}$$

$$\frac{\partial S_{\min}}{\partial\Phi^*_{a_n}}\bigg|_{\Phi^*_A = 0} = Z^{a_n}_{a_{n+1}}\Phi^{a_{n+1}}, \qquad n = 0, 1, 2, \cdots, \tag{4.4}$$

where $S_0$ is the classical action and $Z^{a_n}_{a_{n+1}}\Phi^{a_{n+1}}$ represents the $n$-th reducibility transformation where the reducibility parameters are replaced by the corresponding ghost


fields. In this notation, the relation with $n = 0$ in Eq. (4.4) corresponds to the gauge transformation. The BRST transformations of $\Phi^A$ and $\Phi^*_A$ are given by the following equations:

$$s\Phi^A = (\Phi^A,\ S_{\min}(\Phi, \Phi^*)), \qquad s\Phi^*_A = (\Phi^*_A,\ S_{\min}(\Phi, \Phi^*)). \tag{4.5}$$

Equations (4.1) and (4.5) assure that the BRST transformation is nilpotent and the minimal action is invariant under the transformation. In the present case it is difficult to solve the master equation (4.1) order by order with respect to the ghost number because the theory we consider is infinitely reducible. We need to solve an infinite set of equations according to the introduction of an infinite set of ghost fields; ghosts, ghosts for ghosts, $\cdots$ and the corresponding antifields. There is, however, a way to circumvent the difficulties by using the characteristics of generalized Chern–Simons theory in which fermionic and bosonic fields, and odd and even forms, can be treated in a unified manner. First we introduce infinite fields

$$C_n, \qquad C_{n\mu}, \qquad \tilde{C}_n = \frac{1}{2}\epsilon^{\mu\nu}C_{n\mu\nu}, \qquad n = 0, \pm 1, \pm 2, \cdots, \pm\infty, \tag{4.6}$$

where the index $n$ indicates the ghost number of the field. The fields with ghost number 0 are the classical fields

$$C_0 = \phi, \qquad C_{0\mu} = \omega_\mu, \qquad \tilde{C}_0 = B. \tag{4.7}$$

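The fields with even (odd) ghost numbers are bosonic (fermionic). It is seen from Eqs. (3.2)–(3.4) and (3.8)–(3.10) that fields content for ghosts and ghosts for ghosts in the minimal set is completed in the sector for $n > 0$ while the necessary degrees of freedom for antifields are saturated for $n < 0$. We will later identify fields with negative ghost numbers as antifields. We now define a generalized gauge field $\tilde{\mathcal{A}}$ in such a form of (2.1) as it contains these infinite fields according to their Grassmann parities and form degrees,

$$\psi = \sum_{n=-\infty}^{\infty} C_{2n+1\,\mu}\,dx^\mu, \tag{4.8}$$

$$\hat{\psi} = \sum_{n=-\infty}^{\infty}\left(C_{2n+1} + \frac{1}{2}C_{2n+1\,\mu\nu}\,dx^\mu\wedge dx^\nu\right), \tag{4.9}$$

$$A = \sum_{n=-\infty}^{\infty} C_{2n\,\mu}\,dx^\mu, \tag{4.10}$$

$$\hat{A} = \sum_{n=-\infty}^{\infty}\left(C_{2n} + \frac{1}{2}C_{2n\,\mu\nu}\,dx^\mu\wedge dx^\nu\right). \tag{4.11}$$

We then introduce a generalized action for $\tilde{\mathcal{A}}$ as

$$\tilde{S} = \frac{1}{2}\int \mathrm{Tr}^0_{\mathbf{k}}\left(\tilde{\mathcal{A}}Q\tilde{\mathcal{A}} + \frac{2}{3}\tilde{\mathcal{A}}^3\right) \tag{4.12}$$

$$\begin{aligned}
= -\int d^2x\, \mathrm{Tr}^0 \sum_{n=-\infty}^{\infty}\Bigg\{\,
& C_{2n}\bigg(\epsilon^{\mu\nu}\partial_\mu C_{-2n\,\nu} + \sum_{m=-\infty}^{\infty}\Big(\epsilon^{\mu\nu}C_{2m\,\mu}C_{-2(m+n)\,\nu} + \{C_{2m-1}, \tilde{C}_{-2(m+n)+1}\} - \epsilon^{\mu\nu}C_{2m-1\,\mu}C_{-2(m+n)+1\,\nu}\Big)\bigg) \\
& + \sum_{m=-\infty}^{\infty}\tilde{C}_{2n}\Big(C_{2m-1}C_{-2(m+n)+1} + C_{2m}C_{-2(m+n)}\Big) \\
& + C_{2n-1\,\mu}\,\epsilon^{\mu\nu}\bigg(\partial_\nu C_{-2n+1} + \sum_{m=-\infty}^{\infty}[C_{2m\,\nu}, C_{-2(m+n)+1}]\bigg)\Bigg\},
\end{aligned} \tag{4.13}$$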

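where the upper index 0 on Tr indicates to pick up only the part with ghost number 0. This action is invariant under the following transformation

$$\delta_\lambda\tilde{\mathcal{A}} = -\tilde{\mathcal{F}}\,\mathbf{i}\lambda, \tag{4.14}$$

where $\tilde{\mathcal{F}}$ is the generalized curvature (2.7) constructed of $\tilde{\mathcal{A}}$ and $\lambda$ is a fermionic scalar parameter with ghost number $-1$. It should be understood that the same ghost number sectors must be equated in Eq. (4.14). Since $\tilde{\mathcal{F}}$ and $\mathbf{i}\lambda$ belong to $\Lambda_+$ and $\Lambda_-$, respectively, their product in the right hand side of Eq. (4.14) belongs to the same $\Lambda_-$ class as $\tilde{\mathcal{A}}$. The invariance of the action $\tilde{S}$ under the transformation (4.14) can be checked by the following manipulation,

$$\begin{aligned}
\delta_\lambda\tilde{S} &= -\int \mathrm{Tr}^0_{\mathbf{k}}\left\{(Q\tilde{\mathcal{A}} + \tilde{\mathcal{A}}^2)\,\tilde{\mathcal{F}}\,\mathbf{i}\lambda\right\} \\
&= \int \mathrm{Tr}^0_{\mathbf{j}}\,(\tilde{\mathcal{F}}\tilde{\mathcal{F}})\cdot\lambda \\
&= \int \mathrm{Tr}^0_{\mathbf{j}}\left\{Q\left(\tilde{\mathcal{A}}Q\tilde{\mathcal{A}} + \frac{2}{3}\tilde{\mathcal{A}}^3\right)\right\}\cdot\lambda \\
&= 0,
\end{aligned} \tag{4.15}$$

where the subscript $\mathbf{j}$ plays the similar role as the subscript $\mathbf{k}$, i.e., to pick up only the coefficient of $\mathbf{j}$ in the trace. The change of the subscript $\mathbf{k}$ to $\mathbf{j}$ is necessary to take $\mathbf{i}$ into account in the trace in accordance with $\mathbf{j}\mathbf{i} = -\mathbf{k}$. Here we have simply ignored the boundary term and thus the invariance is valid up to the surface term. We now show that a right variation $s$ defined by $\delta_\lambda\tilde{\mathcal{A}} = s\tilde{\mathcal{A}}\,\lambda$ is the BRST transformation. First of all this transformation is nilpotent,

$$s^2\tilde{\mathcal{A}}\,\lambda_2\lambda_1 = \delta_{\lambda_2}\delta_{\lambda_1}\tilde{\mathcal{A}} = -\delta_{\lambda_2}\tilde{\mathcal{F}}\,\mathbf{i}\lambda_1 = -[\,Q + \tilde{\mathcal{A}},\ \tilde{\mathcal{F}}\,]\lambda_2\lambda_1 = 0, \tag{4.16}$$

where the generalized Bianchi identity is used,

$$[\,Q + \tilde{\mathcal{A}},\ \tilde{\mathcal{F}}\,] = [\,Q + \tilde{\mathcal{A}},\ (Q + \tilde{\mathcal{A}})^2\,] = 0. \tag{4.17}$$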
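The Bianchi identity (4.17) and the double-bracket step in (3.14) both rest on associativity alone. The following is a minimal Python sketch of that algebraic pattern, under a strong simplifying assumption: ordinary integer matrices stand in for the quaternion- and form-valued objects, so the Grassmann and form-degree signs of the actual theory are not modelled. The names `bracket` and `check_identities` are illustrative.

```python
import random


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(a, b, sign=1):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def bracket(a, b, sign):
    """sign = -1: commutator ab - ba; sign = +1: anticommutator ab + ba."""
    return add(matmul(a, b), matmul(b, a), sign)


def random_matrix(n, rng):
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]


def check_identities(n=3, trials=20, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        d = random_matrix(n, rng)   # stands in for Q + A
        v = random_matrix(n, rng)   # stands in for V_{n+2}
        f = matmul(d, d)            # stands in for F = (Q + A)^2
        target = bracket(f, v, -1)  # [F, V]
        # Commutator of an anticommutator, and anticommutator of a commutator:
        # both nested brackets of alternating type collapse to [F, V].
        if bracket(d, bracket(d, v, +1), -1) != target:
            return False
        if bracket(d, bracket(d, v, -1), +1) != target:
            return False
        # Bianchi-type identity [D, D^2] = 0.
        if any(x != 0 for row in bracket(d, f, -1) for x in row):
            return False
    return True


print(check_identities())
```

In (3.14) the two prefactors combine to $(-)^n(-)^{n+1} = -1$, which is why the nested bracket appears there as $-[\mathcal{F}, \mathcal{V}_{n+2}]$ and vanishes once $\mathcal{F} = 0$ is imposed.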

Next we need to show that the transformation $s$ is realized as the antibracket form of (4.5). The invariance of $\tilde{S}$ under (4.14) implies that $\tilde{S}$ is indeed the minimal action if we make a proper identification of fields of negative ghost numbers with antifields. It is straightforward to see that the BRST transformations (4.5), both for fields and antifields, are realized under the following identifications with $S_{\min} = \tilde{S}$:

$$\begin{aligned}
C_{-2n+1\,\mu} &= \epsilon^{-1}_{\mu\nu}C^{\nu *}_{2(n-1)}, & C_{-2n\,\mu} &= \epsilon^{-1}_{\mu\nu}C^{\nu *}_{2n-1}, \\
C_{-2n+1} &= \tilde{C}^*_{2(n-1)}, & C_{-2n} &= -\tilde{C}^*_{2n-1}, \\
\tilde{C}_{-2n+1} &= C^*_{2(n-1)}, & \tilde{C}_{-2n} &= -C^*_{2n-1},
\end{aligned} \qquad n = 1, 2, 3, \cdots, \tag{4.18}$$

where $\epsilon^{-1}_{\mu\nu}$ is the inverse of $\epsilon^{\mu\nu}$, $\epsilon^{\mu\rho}\epsilon^{-1}_{\rho\nu} = \delta^\mu_\nu$.² This shows that we have obtained a solution for the master equation (4.1),

$$\delta_\lambda S_{\min} = (S_{\min}, S_{\min})\cdot\lambda = 0. \tag{4.19}$$
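The identification (4.18) can be cross-checked against the rule that the antifield of a field with ghost number $n$ carries ghost number $-n-1$. The short Python sketch below does only this bookkeeping. It assumes nothing beyond the index pattern of (4.18), and the function names are illustrative.

```python
def antifield_ghost_number(n):
    """Ghost number of the antifield of a field with ghost number n."""
    return -n - 1


def identification_pairs(k):
    """For the two index families in (4.18): (ghost number of the
    negative-ghost-number field, ghost number of the field whose antifield
    it is identified with)."""
    return [(-2 * k + 1, 2 * (k - 1)), (-2 * k, 2 * k - 1)]


def ghost_numbers_match(max_k=50):
    return all(
        lhs == antifield_ghost_number(rhs)
        for k in range(1, max_k + 1)
        for lhs, rhs in identification_pairs(k)
    )


def covers_each_sector_once(max_k=50):
    """Every negative ghost number and every non-negative ghost number
    in the checked range appears exactly once."""
    pairs = [p for k in range(1, max_k + 1) for p in identification_pairs(k)]
    negatives = sorted(lhs for lhs, _ in pairs)
    partners = sorted(rhs for _, rhs in pairs)
    return (negatives == list(range(-2 * max_k, 0))
            and partners == list(range(0, 2 * max_k)))


print(ghost_numbers_match(), covers_each_sector_once())
```

Both checks pass for the range tested, which is consistent with the statement that the sector $n < 0$ supplies exactly the antifields of the sector $n \geq 0$.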

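It is easy to see that this solution satisfies the boundary conditions (4.3) and (4.4), by comparing the gauge transformation (3.2)–(3.4) and the reducibilities (3.8)–(3.10) with the following expansion of $S_{\min}$:

$$\begin{aligned}
S_{\min} = \int d^2x\, \mathrm{Tr}\Bigg\{
& -\epsilon^{\mu\nu}(\partial_\mu\omega_\nu + \omega_\mu\omega_\nu)\phi - \frac{1}{2}\epsilon^{\mu\nu}B_{\mu\nu}\phi^2 \\
& + \sum_{n=0}^{\infty}\bigg( C^*_n\,[\phi, C_{n+1}]_{(-)^{(n+1)}}
+ C^{\mu *}_n\Big(\partial_\mu C_{n+1} + [\omega_\mu, C_{n+1}] - [\phi, C_{n+1\,\mu}]_{(-)^{n}}\Big) \\
& \qquad + \tilde{C}^*_n\Big(\epsilon^{\mu\nu}(\partial_\mu C_{n+1\,\nu} + [\omega_\mu, C_{n+1\,\nu}]) + [B, C_{n+1}]_{(-)^{(n+1)}} + [\phi, \tilde{C}_{n+1}]_{(-)^{(n+1)}}\Big)\bigg) + \cdots\cdots \Bigg\}.
\end{aligned}$$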

Thus the action $S_{\min} = \tilde{S}$ with the identification (4.18) is the correct solution of the classical master equation for the generalized Chern–Simons theory. For completeness we give explicit forms of the BRST transformations of the minimal fields:

$$sC_{2n} = -\sum_{m=-\infty}^{\infty}[C_{2m+1}, C_{2(n-m)}], \tag{4.20}$$

$$sC_{2n-1} = \sum_{m=-\infty}^{\infty}\left(\frac{1}{2}\{C_{2m}, C_{2(n-m)}\} + \frac{1}{2}\{C_{2m-1}, C_{2(n-m)+1}\}\right), \tag{4.21}$$

$$sC_{2n\,\mu} = \partial_\mu C_{2n+1} + \sum_{m=-\infty}^{\infty}\Big([C_{2m\,\mu}, C_{2(n-m)+1}] - \{C_{2m+1\,\mu}, C_{2(n-m)}\}\Big), \tag{4.22}$$

$$sC_{2n-1\,\mu} = \partial_\mu C_{2n} + \sum_{m=-\infty}^{\infty}\Big([C_{2m\,\mu}, C_{2(n-m)}] + \{C_{2m-1\,\mu}, C_{2(n-m)+1}\}\Big), \tag{4.23}$$

$$s\tilde{C}_{2n} = \epsilon^{\mu\nu}\partial_\mu C_{2n+1\,\nu} + \sum_{m=-\infty}^{\infty}\Big(\epsilon^{\mu\nu}[C_{2m\,\mu}, C_{2(n-m)+1\,\nu}] - [\tilde{C}_{2m+1}, C_{2(n-m)}] - [C_{2m+1}, \tilde{C}_{2(n-m)}]\Big), \tag{4.24}$$

$$s\tilde{C}_{2n-1} = \epsilon^{\mu\nu}\partial_\mu C_{2n\,\nu} + \sum_{m=-\infty}^{\infty}\Big(\frac{1}{2}\epsilon^{\mu\nu}[C_{2m\,\mu}, C_{2(n-m)\,\nu}] - \frac{1}{2}\epsilon^{\mu\nu}[C_{2m-1\,\mu}, C_{2(n-m)+1\,\nu}] + \{C_{2m}, \tilde{C}_{2(n-m)}\} + \{C_{2m-1}, \tilde{C}_{2(n-m)+1}\}\Big), \tag{4.25}$$

where the identification (4.18) should be understood.

² To be precise the antifields are defined as $C^*_n = C^{*a}_n\,\eta^{-1}_{ab}\,T^b$, $\cdots$, with $\mathrm{Tr}\,T^aT^b = \eta^{ab}$.

It is critical in our construction of the minimal action that the action of the generalized theory possesses the same structure as the Chern–Simons action and fermionic and bosonic fields are treated in a unified manner. It is interesting to note that the starting classical action, which includes only bosonic fields, and the quantized minimal action, which includes the infinite series of bosonic and fermionic fields, have the same form of (2.4) with the replacement $\mathcal{A} \to \tilde{\mathcal{A}}$. This is reminiscent of string field theories whose actions have the Chern–Simons form: A string field contains infinite series of ghost fields and antifields. The quantized minimal action also takes the same Chern–Simons form [41, 23, 37, 11]. It is also worth mentioning that there are other examples where classical fields and ghost fields are treated in a unified way [8, 40, 1, 14, 7]. It is obvious that the minimal action for generalized Chern–Simons theory in arbitrary even dimensions can be constructed in the same way as in the two-dimensional case because the classical action (2.4), symmetries (2.5), reducibilities (3.13), the minimal action (4.12) and BRST transformations $s\tilde{\mathcal{A}} = -\tilde{\mathcal{F}}\,\mathbf{i}$ are described by using generalized fields and parameters. It is worth mentioning the applicability to the quantum master equation. The full BRST invariance of the theory at the quantum level is assured by the quantum master equation

$$\frac{1}{2}\,(S(\Phi, \Phi^*),\ S(\Phi, \Phi^*)) = i\hbar\,\Delta S(\Phi, \Phi^*), \tag{4.26}$$

$$\Delta S(\Phi, \Phi^*) = \frac{\partial_r}{\partial\Phi^A}\frac{\partial_l S}{\partial\Phi^*_A}, \tag{4.27}$$

which differs from the classical master equation by the order $\hbar$ term. As we mentioned above, there is a highly nontrivial similarity between the string field theory and the generalized Chern–Simons theory: necessity of infinite ghost towers, the same Chern–Simons form in the classical and the quantized minimal actions. It is a well-known, though delicate, issue that the solution to the classical master equation of the string field theory does not satisfy the quantum master equation (4.26) [38, 21]. This is related to the following facts: In the case of open string field theory closed string degrees of freedom are not incorporated in the classical action while they appear at the loop level. In the case of closed string field theory loop amplitudes evaluated from the minimal action with a suitable gauge fixing do not correctly reproduce the fundamental region in the moduli space, which leads to the violation of unitarity unless the minimal action is modified at the quantum level. In contrast with the string case, the minimal action $S_{\min}$ of the present model satisfies (4.26) in the naive calculation. We say naive here due to the fact that in local field theories $\Delta S$ always includes $\delta(0)$ owing to the second functional derivative at the same space-time point and thus the right-hand side of (4.26) must be regularized. In the naive calculation ignoring this subtlety,

$$\Delta S_{\min} = \frac{\partial_r}{\partial\Phi^A}\frac{\partial_l S_{\min}}{\partial\Phi^*_A} = \frac{\partial_r}{\partial\Phi^A}\,s\Phi^A$$

can be evaluated and each contribution from the six types of fields in Eqs. (4.20)–(4.25) vanishes separately. For example the contribution from $C_{2n}$ in (4.20) vanishes,

$$\frac{\partial_r}{\partial C^a_{2n}}\,sC^a_{2n} = -\eta^{-1}_{ab}\,\frac{\partial_r}{\partial C^a_{2n}}\sum_{m=-\infty}^{\infty}\mathrm{Tr}\,T^b\,[C_{2m+1}, C_{2(n-m)}] = -\eta^{-1}_{ab}\,\mathrm{Tr}\,[T^a, T^b]\,C_1 = 0,$$

where the footnote of Eq. (4.18) is taken into account. Other contributions vanish in just the same way. In the string field theory, $\Delta S_{\min}$ does not vanish at this naive level. In local field theories, $i\hbar\,\Delta S_{\min}$ is interpreted as the term which breaks BRST invariance at the quantum level and thus leads to the breakdown of the gauge invariance, the anomaly term [39]. Instead of the naive calculation, we need to calculate the anomaly term by using a suitable regularization to evaluate the ill-defined singular term $\delta(0)$. Although in the present model there is neither a chiral fermion nor a selfdual antisymmetric tensor field, which is a usual source for anomaly, we cannot simply resort to the dimensional regularization to deal with $\delta(0)$ due to the presence of $\epsilon^{\mu\nu}$. It is left for future investigations whether the minimal action itself is the solution to the quantum master equation.

5. Gauge-Fixed Action

The gauge degrees of freedom are fixed by introducing a nonminimal action which must be added to the minimal one, and choosing a suitable gauge fermion. Though the number of gauge-fixing conditions is determined in accordance with the “real” gauge degrees of freedom, we can prepare a redundant set of gauge-fixing conditions and then compensate the redundancy by introducing extraghosts. Indeed Batalin and Vilkovisky gave a general prescription to construct a nonminimal sector by this procedure [6]. This prescription is, however, inconvenient in the present case since it leads to a doubly infinite number of fields; antighosts, extraghosts, $\cdots$, where “doubly infinite” means the infinities both in the vertical direction and the horizontal direction in the triangular tableau of ghosts. We can instead adopt gauge-fixing conditions so that such extra infinite series do not appear while propagators for all fields are well-defined. The type of gauge-fixing prescription which is unconventional for the Batalin-Vilkovisky formulation is known, for example, in a quantization of topological Yang-Mills theory [34]. In the present case, we found that in each sector of the ghost number the standard Landau type gauge-fixing for the vector and antisymmetric tensor fields is sufficient to make a complete gauge-fixing. After taking into account the above points, we introduce the following nonminimal action,

$$S_{\rm nonmin} = \int d^2x \sum_{n=1}^{\infty}\mathrm{Tr}\left(\bar{C}^*_n\,b_{n-1} + \bar{C}^*_{n\mu}\,b^\mu_{n-1} + \eta^*_{n-1}\,\pi_n\right), \tag{5.1}$$

where the ghost number of nonminimal fields is $n$ for $\eta_n$ and $\pi_n$ and $-n$ for $\bar{C}_n$, $\bar{C}^\mu_n$, $b_n$ and $b^\mu_n$ and the corresponding antifields possess ghost number $-n-1$ and $n-1$, respectively. Even (odd) ghost number fields are bosonic (fermionic), as usual. The BRST transformations of these fields are defined by this nonminimal action,

$$\begin{aligned}
s\bar{C}_n &= b_{n-1}, & s b_{n-1} &= 0, \\
s\bar{C}^\mu_n &= b^\mu_{n-1}, & s b^\mu_{n-1} &= 0, \\
s\eta_{n-1} &= \pi_n, & s\pi_n &= 0, \\
s\bar{C}^*_n &= 0, & s b^*_{n-1} &= (-)^n\,\bar{C}^*_n, \\
s\bar{C}^*_{n\mu} &= 0, & s b^*_{n-1\,\mu} &= (-)^n\,\bar{C}^*_{n\mu}, \\
s\eta^*_{n-1} &= 0, & s\pi^*_n &= (-)^{n+1}\,\eta^*_{n-1}.
\end{aligned} \tag{5.2}$$

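Next we adopt the following gauge fermion $\Psi$ which leads to a Landau type gauge fixing,

$$\Psi = \int d^2x \sum_{n=1}^{\infty}\mathrm{Tr}\left(\bar{C}_n\,\partial^\mu C_{n-1\,\mu} + \bar{C}^\mu_n\,\epsilon^{-1}_{\mu\nu}\partial^\nu\tilde{C}_{n-1} + \bar{C}^\mu_n\,\partial_\mu\eta_{n-1}\right), \tag{5.3}$$

where we assume a flat metric for simplicity. Then the antifields can be eliminated by equations $\Phi^*_A = \dfrac{\partial\Psi}{\partial\Phi^A}$,

$$C^*_n = 0, \tag{5.4}$$

$$C^{\mu *}_n = -\partial^\mu\bar{C}_{n+1}, \tag{5.5}$$

$$\tilde{C}^*_n = \epsilon^{-1}_{\mu\nu}\partial^\mu\bar{C}^\nu_{n+1}, \tag{5.6}$$

$$\bar{C}^*_{n+1} = \partial^\mu C_{n\mu}, \tag{5.7}$$

$$\bar{C}^*_{n+1\,\mu} = \epsilon^{-1}_{\mu\nu}\partial^\nu\tilde{C}_n + \partial_\mu\eta_n, \tag{5.8}$$

$$\eta^*_{n-1} = -\partial_\mu\bar{C}^\mu_n, \qquad n = 0, 1, 2, \cdots. \tag{5.9}$$

The complete gauge-fixed action $S_{\rm tot}$ is

$$S_{\rm tot} = S_{\min}\big|_\Sigma + S_{\rm nonmin}\big|_\Sigma, \tag{5.10}$$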

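where $\Sigma$ is a surface defined by Eqs. (5.4)–(5.9). This action is invariant under the on-shell nilpotent BRST transformations (4.20)–(4.25) and (5.2) in which the antifields are eliminated by substituting Eqs. (5.4)–(5.9). It can be seen that the propagators of all fields are well-defined, by writing the kinetic terms and the gauge-fixing terms in $S_{\rm tot}$,

$$\begin{aligned}
S_{\rm tot} = \int d^2x\, \mathrm{Tr}\Bigg\{
& -\phi\,\epsilon^{\mu\nu}\partial_\mu\omega_\nu + \partial^\mu\omega_\mu\,b_0 + \epsilon^{-1}_{\mu\nu}\partial^\nu B\,b^\mu_0 \\
& + \sum_{n=1}^{\infty}\left(-\partial^\mu\bar{C}_n\,\partial_\mu C_n - \frac{1}{2}(\partial^\mu\bar{C}^\nu_n - \partial^\nu\bar{C}^\mu_n)(\partial_\mu C_{n\nu} - \partial_\nu C_{n\mu})\right) \\
& + \sum_{n=1}^{\infty}\left(\partial^\mu C_{n\mu}\,b_n + \epsilon^{-1}_{\mu\nu}\partial^\nu\tilde{C}_n\,b^\mu_n + \partial_\mu\eta_{n-1}\,b^\mu_{n-1} - \partial_\mu\bar{C}^\mu_n\,\pi_n\right) \\
& + \text{interaction terms}\Bigg\}.
\end{aligned}$$

Thus the gauge fermion (5.3) is a correct choice and the gauge degrees of freedom are fixed completely. We can consistently determine the hermiticity of the fields with a convention $\lambda^\dagger = -\lambda$ in Eq. (4.14).³

³ Hermiticity conditions: $C_n^\dagger = -C_n$, $C_{n\mu}^\dagger = (-)^{n+1}C_{n\mu}$, $\tilde{C}_n^\dagger = \tilde{C}_n$, $\bar{C}_n^\dagger = (-)^{n+1}\bar{C}_n$, $\bar{C}^{\mu\dagger}_n = -\bar{C}^\mu_n$, $b_n^\dagger = -b_n$, $b^{\mu\dagger}_n = (-)^n b^\mu_n$, $\eta_n^\dagger = \eta_n$, $\pi_n^\dagger = (-)^{n+1}\pi_n$.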


An important comment is in order here. Some models of infinitely reducible systems share a common feature: when the number of reducibility parameters at each level is the same as that of gauge parameters, the number of the “real” gauge degrees of freedom is half of the original degrees of freedom [26]. The known examples of this type, the Brink–Schwarz superparticle and the Green–Schwarz superstring, have these characteristics [25, 15, 26, 36, 12, 35, 18, 9]. In the present two-dimensional model, there are four parameters $v_n$, $u_{n\mu}$ and $b_n$ for each stage of the reducibility. The “real” number of gauge-fixing conditions is $3 - 1 = 2$, where three gauge-fixing conditions $\partial^\mu C_{n-1\,\mu} = 0$, $\epsilon^{-1}_{\mu\nu}\partial^\nu\tilde{C}_{n-1} = 0$ are linearly dependent due to $\partial^\mu(\epsilon^{-1}_{\mu\nu}\partial^\nu\tilde{C}_{n-1}) = 0$ and thus we needed to impose an extra condition $\partial_\mu\bar{C}^\mu_n = 0$.
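The linear dependence quoted above follows from the antisymmetry of $\epsilon^{-1}_{\mu\nu}$ alone. The following minimal Python sketch illustrates it in momentum space, where each derivative is replaced by a component of a vector $k$. This replacement, the integer test vectors and the names `EPS`, `EPS_INV` and `contracted` are assumptions made for illustration, and the metric signature plays no role in the check.

```python
import random

EPS = [[0, 1], [-1, 0]]       # epsilon^{mu nu} with epsilon^{01} = 1
EPS_INV = [[0, -1], [1, 0]]   # epsilon^{-1}_{mu nu}, fixed by eps . eps_inv = identity


def is_inverse():
    prod = [
        [sum(EPS[m][r] * EPS_INV[r][n] for r in range(2)) for n in range(2)]
        for m in range(2)
    ]
    return prod == [[1, 0], [0, 1]]


def contracted(k):
    """k^mu eps^{-1}_{mu nu} k^nu, the momentum-space image of
    d^mu (eps^{-1}_{mu nu} d^nu) acting on a scalar."""
    return sum(k[m] * EPS_INV[m][n] * k[n] for m in range(2) for n in range(2))


def dependence_holds(trials=100, seed=1):
    rng = random.Random(seed)
    return all(
        contracted([rng.randint(-9, 9), rng.randint(-9, 9)]) == 0
        for _ in range(trials)
    )


# Three Landau-type conditions, minus one linear relation among them.
independent_conditions = 3 - 1

print(is_inverse(), dependence_holds(), independent_conditions)
```

The contraction vanishes for every test vector, which is the statement that the two components of $\epsilon^{-1}_{\mu\nu}\partial^\nu\tilde{C}_{n-1} = 0$ are not independent and that only two of the three conditions count.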

6. Conclusions and Discussions

We have investigated the quantization of the two-dimensional version of the generalized Chern–Simons theory with a nonabelian gauge algebra by the Lagrangian formalism [5, 6]. We have found that models formulated by the generalized Chern–Simons theory are in general infinitely reducible and thus the quantization is highly nontrivial. We have derived the on-shell nilpotent BRST transformation and the BRST invariant gauge-fixed action for this infinitely reducible system. We have confirmed that the propagators of all fields are well-defined in the gauge-fixed action. It is important to recognize that the starting classical action includes only bosonic fields, while the quantized minimal action includes infinite series of both bosonic and fermionic ghost fields, which are treated in a unified way by the generalized Chern–Simons formulation. It is a characteristic of the generalized Chern–Simons theory that the quantized minimal action has the same Chern–Simons form as the classical action. The quantization is successfully carried out, but other possible problems appear in connection with the introduction of the infinitely many fields. It is then an important question whether we can treat the quantum effects of the infinitely many ghost fields consistently. We have obtained some evidence that quantum effects of the infinitely many ghost fields can be treated in a systematic way and lead to a finite contribution. To be specific, as a related example, the classical action is independent of the space-time metric, but it is not obvious that the quantized theory is topological because of the on-shell reducibility. A similar situation occurs in the nonabelian BF theories [10]. We can, however, prove the metric independence of the partition function by regularizing the quantum effects of the contributions of infinitely many ghosts in a specific but natural way.
It is also important to analyze quantum effects on correlation functions for physical operators and the evaluation of the anomaly term in the quantum master equation. The details of these points will be given in a subsequent publication [28]. It is interesting to consider physical aspects of an introduction of the infinite number of ghost fields. An immediate consequence is a democracy of ghosts and classical fields, i.e., the classical fields are simply the zero ghost number sector among infinitely many ghost fields. The classical gauge fields and ghost fields have no essential difference in the quantized minimal action. In the present paper we have not introduced fermionic gauge fields in the starting action but it is straightforward to introduce fermionic gauge fields [29, 30] and carry out quantization. The classical fermionic fields are just zero ghost number sector among infinitely many ghost fields in a quantized action, just the same as in the bosonic sector. It is tempting to speculate that fermionic matter fields may be identified as a special and possibly infinite combination of ghost fields because


the fermionic and bosonic sectors couple in the standard covariant form in the quantized minimal action of the generalized Chern–Simons theory. In the analyses of the quantization of the generalized Chern–Simons theory with abelian gl(1, R) algebra, it was pointed out that a physical degree of freedom which did not exist at the classical level appeared in the constant part of the zero form field $\phi$ at the quantum level due to the violation of the regularity [27]. We know that a zero form field plays an important role in the generalized Chern–Simons theories as emphasized in the classical discussion [31, 32]. In particular a constant component of the zero form field played the role of a physical order parameter between the gravity and nongravity phases. We find it important to clarify the mechanism by which the physical constant mode of the zero form field plays the role of a possible order parameter at the quantum level. This question is essentially related to the regularity violation in the nonabelian version of the generalized Chern–Simons theory. It is, however, expected that this question will be better clarified in the Hamiltonian formalism quantization. We have already found that the BRST invariant gauge-fixed action obtained from the Hamiltonian formalism coincides with that of the Lagrangian formulation. These points will also be discussed in a subsequent publication [28]. Finally we point out that the quantization procedure of the generalized Chern–Simons theories given in this paper is universal and thus extends naturally to arbitrary even dimensions. To derive nonminimal actions, however, we need to count the genuine independent degrees of freedom in the gauge transformation and impose a gauge-fixing by choosing an adequate gauge fermion. In case the number of reducibility parameters at each level is the same as that of gauge parameters, it seems to be a general feature that the number of independent gauge degrees of freedom is just half of the original degrees of freedom. In the Hamiltonian formalism we found a reasoning that this should be the case.

Acknowledgement. One of the authors (N.K.) wishes to thank M.A. Vasiliev for useful comments. The work by N.K. and K.S. is supported in part by the Grant-in-Aid for Scientific Research from the Ministry of Education, Science and Culture (No. 07044048). One of the authors (H.U.) is partially supported by Nukazawa Science Foundation.

References

1. Abud, M., Ader, J.-P. and Cappiello, L.: A BRST lagrangian quantization of reducible gauge theories: Non-abelian p-forms and string field theories. Nuovo Cimento 105A, 1507–1537 (1992)
2. Batalin, I.A. and Fradkin, E.S.: A generalized canonical formalism and quantization of reducible gauge theories. Phys. Lett. B122, 157–164 (1983)
3. Batalin, I.A. and Fradkin, E.S.: Operator quantization of relativistic dynamical systems subject to first class constraints. Phys. Lett. B128, 303–308 (1983)
4. Batalin, I.A. and Vilkovisky, G.A.: Relativistic S-matrix of dynamical systems with boson and fermion constraints. Phys. Lett. B69, 309–312 (1977)
5. Batalin, I.A. and Vilkovisky, G.A.: Gauge algebra and quantization. Phys. Lett. B102, 27–31 (1981)
6. Batalin, I.A. and Vilkovisky, G.A.: Quantization of gauge theories with linearly dependent generators. Phys. Rev. D28, 2567–2582 (1983), Errata: D30, 508 (1984)
7. Baulieu, L.: Field anti-field duality, p-form gauge fields and topological field theories. hep-th/9512026
8. Baulieu, L., Bergshoeff, E. and Sezgin, E.: Open BRST algebras, ghost unification and string field theory. Nucl. Phys. B307, 348–364 (1988)
9. Bergshoeff, E., Kallosh, R. and Van Proeyen, A.: Superparticle actions and gauge fixings. Class. Quant. Grav. 9, 321–360 (1992)
10. For a review see Birmingham, D., Blau, M., Rakowski, M. and Thompson, G.: Topological field theory. Phys. Rep. 209, 129–340 (1991) and references therein
11. Bochicchio, M.: String field theory in the Siegel gauge. Phys. Lett. B188, 330–334 (1987)
12. Brink, L., Henneaux, M. and Teitelboim, C.: Covariant Hamiltonian formulation of the superparticle. Nucl. Phys. B293, 505–540 (1987)
13. Brink, L. and Schwarz, J.H.: Quantum superspace. Phys. Lett. B100, 310–312 (1981)
14. Dayi, Ö.F.: A general solution of the BV-master equation and BRST field theories. Mod. Phys. Lett. A8, 2087–2097 (1993)
15. Diaz, A.H. and Toppan, F.: Towards the quantization of the Green–Schwarz heterotic string. Phys. Lett. B211, 285–292 (1988)
16. Fradkin, E.S. and Fradkina, T.E.: Quantization of relativistic systems with boson and fermion first- and second-class constraints. Phys. Lett. B72, 343–348 (1978)
17. Fradkin, E.S. and Vilkovisky, G.A.: Quantization of relativistic systems with constraints. Phys. Lett. B55, 224–226 (1975)
18. Green, M.B. and Hull, C.M.: The covariant quantization of the superparticle. In: Arnowitt, R., Bryan, R., Duff, M.J., Nanopoulos, D. and Pope, C.N. (eds.) Strings 89. Proceedings. Singapore: World Scientific, 1990, pp. 478–503
19. Green, M.B. and Schwarz, J.H.: Covariant description of superstrings. Phys. Lett. B136, 367–370 (1984)
20. Green, M.B. and Schwarz, J.H.: Properties of the covariant formulation of superstring theories. Nucl. Phys. B243, 285–306 (1984)
21. Hata, H.: BRS invariance and unitarity in closed string field theory. Nucl. Phys. B329, 698–722 (1990)
22. Hata, H., Itoh, K., Kugo, T., Kunitomo, H. and Ogawa, K.: Covariant string field theory. Phys. Rev. D34, 2360–2429 (1986)
23. Hata, H., Itoh, K., Kugo, T., Kunitomo, H. and Ogawa, K.: Covariant string field theory. 2. Phys. Rev. D35, 1318–1355 (1987)
24. Henneaux, M. and Teitelboim, C.: Quantization of gauge systems. Princeton, NJ: Princeton University Press, 1992
25. Kallosh, R.E.: Quantization of the Green–Schwarz superstring. Phys. Lett. B195, 369–376 (1987)
26. Kallosh, R., Troost, W. and Van Proeyen, A.: Quantization of superparticle and superstring with Siegel’s modification. Phys. Lett. B212, 428–436 (1988)
27. Kawamoto, N., Ozawa, E. and Suehiro, K.: Quantization of gl(1,R) generalized Chern–Simons theory in 1+1 dimensions. Mod. Phys. Lett. A12, 219–231 (1997)
28. Kawamoto, N., Suehiro, K., Tsukioka, T. and Umetsu, H. To appear
29. Kawamoto, N. and Watabiki, Y.: Even dimensional generalization of Chern–Simons action and new gauge symmetry. Commun. Math. Phys. 144, 641–648 (1992)
30. Kawamoto, N. and Watabiki, Y.: Graded Lie algebra and generalized Chern–Simons actions in arbitrary dimensions. Mod. Phys. Lett. A7, 1137–1147 (1992)
31. Kawamoto, N. and Watabiki, Y.: Two-dimensional gravity as the gauge theory of the Clifford algebra for an even-dimensional generalized Chern–Simons action. Phys. Rev. D45, 605–617 (1992)
32. Kawamoto, N. and Watabiki, Y.: Four-dimensional topological conformal gravity from even-dimensional generalized Chern–Simons action. Nucl. Phys. B396, 326–364 (1993)
33. Kugo, T. and Suehiro, K.: Nonpolynomial closed string field theory: Action and its gauge invariance. Nucl. Phys. B337, 434–466 (1990)
34. Labastida, J.M.F. and Pernici, M.: A gauge invariant action in topological quantum field theory. Phys. Lett. B212, 56–62 (1988)
35. Lindström, U., Roček, M., Siegel, W., Van Nieuwenhuizen, P. and Van De Ven, A.E.: Lorentz covariant quantization of the superparticle. Phys. Lett. B224, 285–287 (1989)
36. Nissimov, E.R. and Pacheva, S.J.: Quantization of N = 1, 2 superparticle with irreducible constraints. Phys. Lett. B189, 57–62 (1987)
37. Thorn, C.B.: Perturbation theory for quantized string fields. Nucl. Phys. B287, 61–92 (1987)
38. Thorn, C.B.: String field theory. Phys. Rep. 174, 1–101 (1989)
39. Troost, W., Van Nieuwenhuizen, P. and Van Proeyen, A.: Anomalies and the Batalin-Vilkovisky lagrangian formalism. Nucl. Phys. B333, 727–770 (1990)
40. Wallet, J.C.: Algebraic set-up for the gauge-fixing of BF and super BF systems. Phys. Lett. B235, 71–78 (1990)

Quantization of Infinitely Reducible Generalized Chern–Simons Actions

247

41. Witten, E.: Non-commutative geometry and string field theory. Nucl. Phys. B268, 253–294 (1986) 42. Witten, E.: 2+1 dimensional gravity as an exactly soluble system. Nucl. Phys. B311, 46–78 (1988/89) 43. Zwiebach, B.: Closed string field theory: Quantum action and the B-V master equation. Nucl. Phys. B390, 33–152 (1993) Communicated by T. Miwa

Commun. Math. Phys. 195, 249 – 265 (1998)


Smooth Irrotational Flows in the Large to the Euler–Poisson System in $\mathbb{R}^{3+1}$

Yan Guo*

Department of Mathematics, Princeton University, Princeton, NJ 08544, USA, and Brown University, Division of Applied Mathematics, Providence, RI 02912, USA

Received: 27 January 1997 / Accepted: 19 November 1997

* The research is supported in part by NSF grant 96-23253 and an NSF Postdoctoral Fellowship.

Abstract: A simple two-fluid model to describe the dynamics of a plasma is the Euler–Poisson system, where the compressible electron fluid interacts with its own electric field against a constant charged ion background. The plasma frequency produced by the electric field plays the role of a “mass” term in the linearized system. Based on this “Klein–Gordon” effect, we construct global smooth irrotational flows with small velocity for the electron fluid.

1. Introduction

A plasma is a collection of moving electrons and ions. At high frequencies, a simple-fluid model for a plasma breaks down. The electrons and ions tend to move independently, and charge separations occur. The greater inertia of the ions implies that they will be unable to follow the rapid fluctuation of the fluid; only electrons partake in the motion. The ions merely provide a uniform background of positive charge. One of the simplest two-fluid models for a plasma is the Euler–Poisson system

$$\partial_t n + \nabla\cdot(nu) = 0, \qquad \partial_t u + u\cdot\nabla u + \frac{1}{m_e n}\nabla p(n) = \frac{e}{m_e}\nabla\phi, \tag{1}$$

with the electric field $\nabla\phi$, which satisfies the Poisson system

$$\Delta\phi = 4\pi e(n - n_0), \quad \text{with } |\phi| \to 0 \text{ as } |x| \to \infty. \tag{2}$$

Here, the electrons of charge $e$ and mass $m_e$ are described by a density $n(t, x)$ and an average velocity $u(t, x)$. The constant equilibrium-charged density of ions and electrons is $\pm e n_0$. We assume the pressure is $p(n) = An^{\gamma}$ with $\gamma > 1$, where $A$ is a constant (the case $\gamma = 1$ can be treated too). Throughout this paper, we consider an irrotational flow, that is,

$$\nabla\times u \equiv 0, \tag{3}$$

which is invariant for all time. Therefore, we deduce, from (2), (1) and (3), that the dynamic equation of $\nabla\phi$ is

$$\partial_t\nabla\phi = -4\pi e\,\nabla\Delta^{-1}\nabla\cdot[nu] = -4\pi e\{n_0 u + \nabla\Delta^{-1}\nabla\cdot[(n - n_0)u]\}. \tag{4}$$

We study the dynamic problem of (1) and (4), together with the constraint (2) at time $t = 0$. It follows that the Poisson equation (2) holds for all time. There is a quiet fluid equilibrium for (1) and (2) of electrons: $n \equiv n_0$, $u \equiv 0$ ($E \equiv 0$). In the absence of the electric field $\nabla\phi$, the Euler–Poisson system reduces to the well-known Euler equations for compressible fluids. Despite much important progress over the years (especially in 1-D), the existence and uniqueness of global solutions in 3+1 dimensions remains an outstanding open problem. Consider smooth, irrotational initial data which are small perturbations of a quiet fluid $n \equiv n_0$, $u \equiv 0$. In general, it is well known that singularities (shock waves) develop in finite time for the pure Euler equations [Si1]. In contrast, although the Euler–Poisson system seems more complicated, we demonstrate that these initial data lead to smooth, irrotational solutions to the Euler–Poisson system for all time.

Theorem 1. Let $\rho(x) \in C_c^{\infty}(\mathbb{R}^3)$ and the vector-valued function $\upsilon(x) \in C_c^{\infty}(\mathbb{R}^3)$ with

$$\nabla\times\upsilon = 0 \ \text{(irrotationality)}, \qquad \int_{\mathbb{R}^3}\rho\,dx = 0 \ \text{(neutrality)}.$$

Then there exists $\epsilon_0 > 0$ such that for $0 < \epsilon < \epsilon_0$, there exist unique smooth solutions $(n^{\epsilon}(t, x), u^{\epsilon}(t, x))$ to the Euler–Poisson system (1), (4) and (2) for $0 \le t < \infty$ with initial data $(\epsilon\rho + n_0, \epsilon\upsilon)$. Moreover, $n^{\epsilon}(t, x) - n_0$ and $u^{\epsilon}(t, x)$ decay, uniformly in $t \ge 0$, $x \in \mathbb{R}^3$, as $(1 + t)^{-p}$ for any $1 < p < \frac{3}{2}$.

In particular, (3) holds for radially symmetric data $\rho = \rho(|x|)$, $\upsilon(x) = \upsilon(|x|)\frac{x}{|x|}$, with scalar functions $\rho \in C_c^{\infty}(\mathbb{R}^3)$ and $\upsilon \in C_c^{\infty}(\mathbb{R}^3\setminus\{0\})$. The neutral condition is sufficient for some non-local requirements for the data. More general statements can be found in Theorem 9. For the pure Euler equations, the life span of classical irrotational flow with initial data $(\epsilon\rho + n_0, \epsilon\upsilon)$ is of the order $O(\exp(\frac{1}{\epsilon}))$ [Si2], since irrotational solutions of the linearized Euler equations decay like $(1 + t)^{-1}$, which is not integrable. On the other hand, due to the interaction with its own electric field, the linearized Euler–Poisson system for irrotational flows takes the form

$$\partial_{tt} n - \frac{1}{m_e}p'(n_0)\Delta n + \frac{4\pi e^2 n_0}{m_e}\,n = 0, \qquad \partial_{tt} u - \frac{1}{m_e}p'(n_0)\Delta u + \frac{4\pi e^2 n_0}{m_e}\,u = 0.$$

The new “mass” term $\frac{4\pi e^2 n_0}{m_e}$, which is absent in the pure Euler system, comes from (3) and (4). Notice that $\frac{4\pi e^2 n_0}{m_e} = \omega_p^2$, where $\omega_p$, the plasma frequency for plasma oscillations, characterizes one of the fundamental features of a plasma. It is well known [MSW, Str] that the linear Klein–Gordon equation has a better decay rate of $(1 + t)^{-3/2}$. In 1985, global smooth solutions with small amplitude to general scalar, quasi-linear Klein–Gordon equations were constructed independently by Klainerman [K] and Shatah [Sh] via two different methods. Due to the non-local complications in (2), it is difficult to directly employ the vector field method of Klainerman. Instead, we modify Shatah’s method of a normal form in an $L^p$ ($p > 1$) setting, and use an $L^p$–$L^{\infty}$ estimate of [N] to construct global solutions.

For pure Euler equations for compressible fluids in $\mathbb{R}^{3+1}$, global weak solutions with radial symmetry and large amplitude have been constructed [CG, MMU] outside the ball, $|x| \ge 1$. There are some works for the Euler–Poisson system (1), (4) and (2). Cordier et al. [CDMS] constructed interesting steady-state solutions in one space dimension. On the other hand, many mathematicians have made contributions to the related Euler–Poisson model in semiconductor physics with a momentum relaxation. See [CW, DM, G, Pe, PRV, WC and Z] for more references on that subject.

2. Reformulation of the Problem

For notational simplicity, we set all physical constants $e$, $m_e$, $4\pi$ and $A$ to be one. It is convenient to introduce new variables to simplify the forthcoming energy estimates. We consider smooth irrotational flows near the equilibrium $n \equiv n_0$, $u \equiv 0$ and $\nabla\phi \equiv 0$. As in [Si2], if $(n, u)$ is a smooth solution of the Euler–Poisson system, we define

$$m(t, x) = \frac{2}{\gamma-1}\Big[\Big(\frac{n(\frac{t}{c_0}, x)}{n_0}\Big)^{(\gamma-1)/2} - 1\Big], \qquad v(t, x) = \frac{1}{c_0}\,u\Big(\frac{t}{c_0}, x\Big), \qquad \psi(t, x) = \phi\Big(\frac{t}{c_0}, x\Big),$$

with the sound speed $c_0 = \sqrt{\gamma}\,n_0^{(\gamma-1)/2}$, and $n = n_0(\frac{\gamma-1}{2}m + 1)^{2/(\gamma-1)}$. Notice that $\nabla\times v \equiv 0$ from (3). In terms of the new variables, the Euler–Poisson system (1), (4) takes the form

$$\begin{aligned}
&\partial_t m + \nabla\cdot v + v\cdot\nabla m + \frac{\gamma-1}{2}\,m\nabla\cdot v = 0,\\
&\partial_t v + \nabla m + v\cdot\nabla v + \frac{\gamma-1}{2}\,m\nabla m = c_0^{-2}\nabla\psi,\\
&\partial_t\nabla\psi = -n_0 v - n_0\nabla\Delta^{-1}\nabla\cdot\Big\{\Big[\Big(\frac{\gamma-1}{2}m + 1\Big)^{2/(\gamma-1)} - 1\Big]v\Big\},
\end{aligned} \tag{5}$$

with the constraint

$$\Delta\psi = n_0\Big[\Big(\frac{\gamma-1}{2}m + 1\Big)^{2/(\gamma-1)} - 1\Big] \equiv n_0[m - h(m)],$$

where $h$, as defined, is a smooth function satisfying $h(0) = 0$ and $h'(0) = 0$. Since $\nabla\times v \equiv 0$, by using the Poisson equation and taking one more derivative of (5), we obtain

$$\begin{aligned}
(\partial_{tt} - \Delta + m_0)m &= \nabla\cdot\Big[v\cdot\nabla v + \frac{\gamma-1}{2}\,m\nabla m\Big] - \partial_t\Big[v\cdot\nabla m + \frac{\gamma-1}{2}\,m\nabla\cdot v\Big] + m_0 h(m),\\
(\partial_{tt} - \Delta + m_0)v &= \nabla\Big[v\cdot\nabla m + \frac{\gamma-1}{2}\,m\nabla\cdot v\Big] - \partial_t\Big[v\cdot\nabla v + \frac{\gamma-1}{2}\,m\nabla m\Big] - m_0\nabla\Delta^{-1}\nabla\cdot\{[m - h(m)]v\},
\end{aligned} \tag{6}$$


where $m_0 = c_0^{-2}n_0$. Notice that the right-hand side in (6) is formally second order in $m$ and $v$. In order to further simplify the notation, we define $w = (w^0, w^1, w^2, w^3)^T = (m, v)^T$, where $T$ is the transpose. Let $\partial_0 = \partial_t = \partial_{x_0}$ and $\partial_j = \partial_{x_j}$ for $1 \le j \le 3$. We use the standard convention that the Latin letters $i, j, k$ run from 1 to 3, while the Greek letters $\mu, \nu$ run from 0 to 3. In terms of $w$, (5) takes the form

$$\partial_0 w + A_j(w)\partial_j w = (0, c_0^{-2}\nabla\psi)^T, \tag{7}$$

where the $A_j$ are the symmetric matrices

$$A_j(w) = \begin{pmatrix} w^j & (\frac{\gamma-1}{2}w^0 + 1)\,e_j^T \\ (\frac{\gamma-1}{2}w^0 + 1)\,e_j & w^j I \end{pmatrix},$$

see [Si1]. We now rewrite (6) in terms of $w$, and separate the non-local term $m_0\nabla\Delta^{-1}\nabla\cdot\{[m - h(m)]v\}$ from the others. We obtain

$$(\partial_{tt} - \Delta + m_0)w = f(w, \partial w, \partial^2 w) \equiv s(w, \partial w, \partial^2 w) + g(w, \partial w, \partial^2 w). \tag{8}$$

Here the singular, non-local function is $s = (s^{\mu})$, with $s^0 = 0$ and $s^l = -m_0\Delta^{-1}\partial_{lk}\{[w^0 - h(w^0)]w^k\}$, while $g = (g^{\mu})$ is the rest of the nonlinear terms. Notice that $g$ is a smooth function of $w, \partial w, \partial^2 w$, with $g(0, 0, 0) = 0$ and no dependence on $\partial_{00}w$. We define $|w(t)|_{k,p}$ as the standard spatial Sobolev norm of order $k$. We use, for simplicity, $\partial^{\alpha}$ and $\partial^{i}$ to denote multi-space-time and space derivatives, with lengths $\alpha$ and $i$ respectively. We also use the Einstein summation convention from time to time. We also define

$$\|w(t)\|_{k,p} = |w(t)|_{k,p} + |\partial_0 w(t)|_{k-1,p} \tag{9}$$

for $1 \le p \le \infty$. We first prove

Lemma 2. Let $k \ge 3$ and $w(t)$ be a solution to (8) with $\|w(t)\|_{k,\infty} \le 1$. Then

$$\|f(w(t))\|_{k,p} \le C\|w\|_{[k/2]+2,\infty}\big(\|w\|_{k+2,p} + \|w\|^2_{k+2,2p}\big), \tag{10}$$

where $1 < p < \infty$, and $[k/2]$ is the largest integer that does not exceed $k/2$.

Proof. Recall the non-local term $s$ is a sum of products of Riesz transforms of $m_0[w^0 - h(w^0)]w$. By the $L^p$ boundedness of the Riesz transform [Ste], for $1 < p < \infty$, we have

$$|f(w)|_{k,p} \le C|[w^0 - h(w^0)]w|_{k,p} + |g(w, \partial w, \partial^2 w)|_{k,p}.$$

Notice that $g$ and $[w^0 - h(w^0)]w$ are smooth and second order. By the product rule, the above is majorized by

$$C\big(|w|_{[k/2],\infty}|w|_{k,p} + \|w\|_{[k/2]+2,\infty}\|w\|_{k+2,p}\big) \le C\|w\|_{[k/2]+2,\infty}\|w\|_{k+2,p}, \tag{11}$$

since $g$ does not depend on $\partial_{00}w$, and $\|w(t)\|_{k,\infty} \le 1$, for $k \ge 3$. Moreover,

$$|\partial_0 f(w(t))|_{k-1,p} \le C|\partial_0\{[w^0 - h(w^0)]w\}|_{k-1,p} + |\partial_0 g|_{k-1,p} \le C\|w\|_{[k/2]+1,\infty}\|w\|_{k,p} + |g_{\nu}\partial_0\partial^{\nu}w|_{k-1,p}.$$


Here $\nu$ is a multi-index with $|\nu| \le 2$, and $g_{\nu}$ is the partial derivative of $g(w, \partial w, \partial^2 w)$ with respect to $\partial^{\nu}w$. We estimate the last term. If $\nu$ does not contain any $t$ (or $x_0$) derivative, we apply (11) with $g_{\nu}$:

$$|g_{\nu}\partial_0\partial^{\nu}w|_{k-1,p} = |\partial^{k-1-j}(g_{\nu})\,\partial^{j}\partial_0\partial^{\nu}w|_{p} \le C\|w\|_{[k/2]+2,\infty}\|w\|_{k+2,p},$$

by separating two cases $0 \le j \le [\frac{k}{2}] - 1$ and $j \ge [\frac{k}{2}]$ (i.e. $k - 1 - j \le [\frac{k}{2}]$). On the other hand, if $\nu$ contains one $x_0$ derivative, then $\partial^{\nu} = \partial_0\partial^{i}$, with $0 \le i \le 1$. Substituting (8), we get

$$|g_{\nu}\partial_0\partial^{\nu}w|_{k-1,p} = |g_{\nu}\partial^{i}\partial_{00}w|_{k-1,p} \le |g_{\nu}\partial^{i}\{(\Delta - m_0)w\}|_{k-1,p} + |g_{\nu}\partial^{i}f(w)|_{k-1,p}.$$

Similarly, separating two cases $0 \le j \le [\frac{k}{2}] - 1$ and $k - 1 - j \le [\frac{k}{2}]$ yields

$$|g_{\nu}\partial^{i}\{(\Delta - m_0)w\}|_{k-1,p} \le C|\partial^{k-1-j}(g_{\nu})\,\partial^{j+i}\{(\Delta - m_0)w\}|_{p} \le C\|w\|_{[k/2]+2,\infty}\|w\|_{k+2,p}.$$

We use Hölder’s inequality and (11) to estimate the second term by

$$|g_{\nu}\partial^{i}f(w)|_{k-1,p} \le C|g_{\nu}|_{k-1,2p}|f(w)|_{k,2p} \le C\|w\|_{k+1,2p}\|w\|_{[k/2]+2,\infty}\|w\|_{k+2,2p} \le C\|w\|_{[k/2]+2,\infty}\|w\|^2_{k+2,2p}.$$

The lemma thus follows from (11) and the above two estimates.
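The two-case split on $j$ used in the proof of Lemma 2 can be checked mechanically. The following is only an illustrative sketch: the function names `sup_norm_factor` and `cases_cover_all` are hypothetical helpers, not part of the paper. It records which factor of $\partial^{k-1-j}(g_\nu)\,\partial^{j}\partial_0\partial^{\nu}w$ carries few enough derivatives to be measured in $L^\infty$, and confirms that the two cases exhaust $0 \le j \le k-1$.

```python
def sup_norm_factor(k, j):
    """Decide which factor of d^(k-1-j)(g_nu) * d^j(d_0 d^nu w) goes in L^infinity.

    Mirrors the two cases in the proof of Lemma 2 (names are illustrative):
      0 <= j <= [k/2] - 1  -> the second factor carries few derivatives
      j >= [k/2]           -> the first factor does, since k-1-j <= [k/2]
    """
    if not (k >= 3 and 0 <= j <= k - 1):
        raise ValueError("expected k >= 3 and 0 <= j <= k - 1")
    half = k // 2  # [k/2], the largest integer not exceeding k/2
    if j <= half - 1:
        return "second"
    if k - 1 - j <= half:
        return "first"
    raise AssertionError("uncovered case")  # not reached for valid (k, j)


def cases_cover_all(k_max=60):
    """Check that the two cases exhaust 0 <= j <= k-1 for 3 <= k <= k_max."""
    return all(
        sup_norm_factor(k, j) in ("first", "second")
        for k in range(3, k_max + 1)
        for j in range(k)
    )
```

The second case works because $j \ge [k/2]$ gives $k - 1 - j \le k - 1 - [k/2] \le [k/2]$, which is what the loop verifies for each $k$ in the tested range.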

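3. The Normal Forms

Now we define a normal form transformation for (8). The goal is to construct a new variable $\omega = (\omega^{\mu})$, such that $(\partial_{tt} - \Delta + m_0)\omega$ is cubic in $w$. Thus we can apply the linear $L^{\infty}$ decay estimate for $\omega$. We follow the construction of Shatah [Sh] to define

$$\omega^{\mu} = w^{\mu} + [(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T], \tag{12}$$

with summation over $0 \le \alpha, \beta \le 3$. Here $B^{\mu}_{\alpha\beta}$ is a $2\times 2$ matrix to be determined, and

$$\begin{aligned}
[V_1, B^{\mu}_{\alpha\beta}, V_2^T](x) &\equiv \int_{\mathbb{R}^3\times\mathbb{R}^3} V_1(z)\,B^{\mu}_{\alpha\beta}(x - y, x - z)\,V_2^T(y)\,dy\,dz\\
&\equiv \frac{1}{(2\pi)^6}\int_{\mathbb{R}^3\times\mathbb{R}^3} e^{ix\cdot(\xi+\eta)}\,F(V_1)(\eta)\,F(B^{\mu}_{\alpha\beta})(\xi, \eta)\,F(V_2^T)(\xi)\,d\xi\,d\eta
\end{aligned} \tag{13}$$

for any two $1\times 2$ functions $V_1$ and $V_2$. The second equation in (13) follows directly from a Fourier transform $F$ with respect to $z$, $y$ as well as both $x - y$ and $x - z$. Here the Fourier transform $F$ is defined as

$$F(V)(\sigma) = \int_{\mathbb{R}^l} e^{-ix\cdot\sigma}V(x)\,dx, \qquad F^{-1}(V)(x) = \frac{1}{(2\pi)^l}\int_{\mathbb{R}^l} e^{ix\cdot\sigma}V(\sigma)\,d\sigma$$

for any integer $l > 0$ and $V \in \mathcal{S}(\mathbb{R}^l)$. We compute $(\partial_{tt} - \Delta + m_0)\omega^{\mu}$ for a smooth solution $w$ of (8). We first compute the most complicated term $\partial_{00}\omega^{\mu}$. By (13), (8) and (12), we have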

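$$\begin{aligned}
\partial_0[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]
&= [(\partial_0 w^{\alpha}, \partial_{00}w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T] + [(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (\partial_0 w^{\beta}, \partial_{00}w^{\beta})^T]\\
&= [(\partial_0 w^{\alpha}, (\Delta - m_0)w^{\alpha} + f^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]\\
&\quad + [(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (\partial_0 w^{\beta}, (\Delta - m_0)w^{\beta} + f^{\beta})^T].
\end{aligned}$$

By taking one more $t$ derivative, we obtain

$$\begin{aligned}
\partial_{00}[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]
&= [((\Delta - m_0)w^{\alpha} + f^{\alpha}, (\Delta - m_0)\partial_0 w^{\alpha} + \partial_0 f^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]\\
&\quad + 2[(\partial_0 w^{\alpha}, (\Delta - m_0)w^{\alpha} + f^{\alpha}), B^{\mu}_{\alpha\beta}, (\partial_0 w^{\beta}, (\Delta - m_0)w^{\beta} + f^{\beta})^T]\\
&\quad + [(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, ((\Delta - m_0)w^{\beta} + f^{\beta}, (\Delta - m_0)\partial_0 w^{\beta} + \partial_0 f^{\beta})^T].
\end{aligned}$$

We now separate second order terms from higher order terms. Notice that

$$\nabla_{y,z}B^{\mu}_{\alpha\beta}(x - y, x - z) = -\nabla_{1,2}B^{\mu}_{\alpha\beta}(x - y, x - z). \tag{14}$$

Integrating by parts over the $y, z$ variables, we simplify the above from (13) ($\Delta - m_0$ is self-adjoint):

$$\begin{aligned}
\partial_{00}[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]
&= [(w^{\alpha}, \partial_0 w^{\alpha}), (\Delta_2 - m_0)B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]\\
&\quad + 2[(w^{\alpha}, \partial_0 w^{\alpha}), CB^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]\\
&\quad + [(w^{\alpha}, \partial_0 w^{\alpha}), (\Delta_1 - m_0)B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T] + R_1^{\mu}.
\end{aligned}$$

Here

$$CB^{\mu}_{\alpha\beta} \equiv \begin{pmatrix} (\Delta_1 - m_0)(\Delta_2 - m_0)B^{\mu}_{\alpha\beta 22} & (\Delta_2 - m_0)B^{\mu}_{\alpha\beta 21} \\ (\Delta_1 - m_0)B^{\mu}_{\alpha\beta 12} & B^{\mu}_{\alpha\beta 11} \end{pmatrix}$$

with entries $B^{\mu}_{\alpha\beta ij}$, and

$$\begin{aligned}
R_1^{\mu} &= [(f^{\alpha}, \partial_0 f^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]\\
&\quad + 2[(0, f^{\alpha}), B^{\mu}_{\alpha\beta}, (\partial_0 w^{\beta}, (\Delta - m_0)w^{\beta} + f^{\beta})^T]\\
&\quad + 2[(\partial_0 w^{\alpha}, (\Delta - m_0)w^{\alpha} + f^{\alpha}), B^{\mu}_{\alpha\beta}, (0, f^{\beta})^T]\\
&\quad + [(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (f^{\beta}, \partial_0 f^{\beta})^T]
\end{aligned}$$

is the third order remainder. From the definition of $[\cdot, \cdot]$ in (13),

$$\Delta_x[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T] = [(w^{\alpha}, \partial_0 w^{\alpha}), (\nabla_1 + \nabla_2)^2 B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T],$$

where $\nabla_i$ denotes the gradient operator with respect to the $i$th argument of $B^{\mu}_{\alpha\beta}$, $i = 1, 2$. From (7) and (12), $\omega$ satisfies:

$$\begin{aligned}
(\partial_{tt} - \Delta + m_0)\omega^{\mu} &= (\partial_{tt} - \Delta + m_0)w^{\mu} + [(w^{\alpha}, \partial_0 w^{\alpha}), LB^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T] + R_1^{\mu}\\
&= f^{\mu} + [(w^{\alpha}, \partial_0 w^{\alpha}), LB^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T] + R_1^{\mu}.
\end{aligned} \tag{15}$$

Here $LB^{\mu}_{\alpha\beta}$ is defined as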


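$$\begin{aligned}
LB^{\mu}_{\alpha\beta} &\equiv \{(\Delta_2 - m_0) + (\Delta_1 - m_0) - (\nabla_1 + \nabla_2)^2 + m_0\}B^{\mu}_{\alpha\beta} + 2CB^{\mu}_{\alpha\beta}\\
&= \{\Delta_2 + \Delta_1 - m_0 - (\nabla_1 + \nabla_2)^2\}B^{\mu}_{\alpha\beta} + 2CB^{\mu}_{\alpha\beta}.
\end{aligned}$$

We now expand $f^{\mu}$ as a sum of a quadratic and a higher order part:

$$f^{\mu} = [(w^{\alpha}, \partial_0 w^{\alpha}), f^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T] + R_2^{\mu}. \tag{16}$$

Here $f^{\mu}_{\alpha\beta} = s^{\mu}_{\alpha\beta} + g^{\mu}_{\alpha\beta}$ is the kernel of the quadratic part. Moreover, $g^{\mu}_{\alpha\beta}(y, z) = g^{\mu}_{\alpha\beta kl}\,\partial_y^{l}\delta(y)\,\partial_z^{k}\delta(z)$, where the $g^{\mu}_{\alpha\beta kl}$ are constant $2\times 2$ matrices, $k + l \le 2$ and $\delta$ is the Dirac mass; from (8), the non-local part $s^{\mu}_{\alpha\beta}$ is

$$[V_1, s^{l}_{0k}, V_2^T](x) = -m_0\,\partial_{lk}\Delta^{-1}\{V_1^0 V_2^k\} = m_0\,\partial_{lk}(-\Delta)^{-1}[V_1, e_{0k}\delta(y)\delta(z), V_2^T]$$

for $1 \le k, l \le 3$, and $s^{\mu}_{\alpha\beta} \equiv 0$ otherwise. From (6), the third order term $R_2^{\mu}$ in (16) is

$$R_2^0 = m_0\{h(w^0) - \tfrac{1}{2}h''(0)(w^0)^2\} \equiv h_1(w^0), \qquad R_2^l = m_0\Delta^{-1}\partial_{lk}[h(w^0)w^k].$$

Let $I_a = (-\Delta)^{-a/2}$ be the Riesz potential of order $-\infty < a < 3$ [Ste], which is defined in terms of the Fourier transform $F(I_a(\cdot))(\sigma) = |\sigma|^{-a}F(\cdot)$. As a function of $x$, it follows from the Fourier transform that for $a < 3$,

$$I_a(e^{ix\cdot(\xi+\eta)}) = |\xi + \eta|^{-a}e^{ix\cdot(\xi+\eta)}. \tag{17}$$

From the second equation in (13), and (17) with $a = 2$, $[V_1, s^{l}_{0k}, V_2^T](x)$ equals

$$-\frac{m_0}{(2\pi)^6}\int_{\mathbb{R}^3\times\mathbb{R}^3} e^{ix\cdot(\xi+\eta)}\,F(V_1)(\eta)\,\frac{(\xi_k + \eta_k)(\xi_l + \eta_l)}{|\xi + \eta|^2}\,e_{0k}\,F(V_2^T)(\xi)\,d\xi\,d\eta. \tag{18}$$

In order to eliminate all second order terms of $w$ in (15), we let

$$LB^{\mu}_{\alpha\beta} \equiv -f^{\mu}_{\alpha\beta}. \tag{19}$$

Therefore (15) becomes

$$(\partial_{tt} - \Delta + m_0)\omega^{\mu} = R_1^{\mu} + R_2^{\mu}, \tag{20}$$

with only third order terms of $w$ left. In order to achieve this goal, we need to solve (19) for $B^{\mu}_{\alpha\beta}$. Recall the fundamental theorem due to Shatah [Sh]:

Theorem 3. (a) Let $D(y, z)$, $F(D)(\xi, \eta)$ be $2\times 2$ matrices of distributions; then there exist distributions $B(y, z)$ such that $LB \equiv D$ in the sense of distributions. Moreover, the Fourier transform of $B$ satisfies

$$F(B)(\xi, \eta) = Q(\xi, \eta)F(D)(\xi, \eta),$$

where $Q(\xi, \eta)$ is $C^{\infty}$ and $|Q(\xi, \eta)| \le C(1 + |\xi| + |\eta|)^6$.

(b) Furthermore, assume $D = D_{kl}\,\partial_y^{l}\delta(y)\,\partial_z^{k}\delta(z)$, with multi-indices $k, l$, and constant matrices $D_{kl}$. Then there exists an integer $N > 0$ such that

$$|[\partial^{i}V_1, B, \partial^{j}V_2^T]|_{p} \le C|\partial^{i}V_1|_{4N,p_1}|\partial^{j}V_2|_{4N,p_2}$$

for any $V_1, V_2$, $1 \le p \le \infty$, $\frac{1}{p} = \frac{1}{p_1} + \frac{1}{p_2}$, and any two multi-indices $i, j$.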


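We cannot directly apply part (b) of Shatah’s Theorem to solve (19), since the non-local term $F(s^{\mu}_{\alpha\beta})$ in (18) has a singularity at $\xi + \eta = 0$. We use the Riesz potential $I_a$ to smooth out near $\xi + \eta = 0$. We first observe

Lemma 4. Let $\langle\xi\rangle = \sqrt{1 + |\xi|^2}$, $0 < a \le \frac{3}{2}$, and $3 + a < N$. Then

$$\int_{\mathbb{R}^3}\frac{\xi_k\xi_l}{|\xi|^{2-a}}\langle\xi\rangle^{-N}e^{iy\cdot\xi}\,d\xi \in L^1(\mathbb{R}^3_y).$$

Proof. For $N > 3 + a$, since $\frac{\xi_k\xi_l}{|\xi|^{2-a}}\langle\xi\rangle^{-N} \in L^1(\mathbb{R}^3_{\xi})$, we have $\int\frac{\xi_k\xi_l}{|\xi|^{2-a}}\langle\xi\rangle^{-N}e^{iy\cdot\xi}\,d\xi \in L^{\infty}(\mathbb{R}^3_y)$. Hence it suffices to show $\int\frac{\xi_k\xi_l}{|\xi|^{2-a}}\langle\xi\rangle^{-N}e^{iy\cdot\xi}\,d\xi \in L^1\{|y| \ge 1\}$. Without loss of generality, we may assume $|y_1| = \max_{1\le i\le 3}\{|y_i|\}$. Integrating by parts over the $\xi$ variable, we have

$$\Big|y_1^3\int_{\mathbb{R}^3}\frac{\xi_k\xi_l}{|\xi|^{2-a}}\langle\xi\rangle^{-N}e^{iy\cdot\xi}\,d\xi\Big| = c\,\Big|\int_{\mathbb{R}^3}\partial^3_{\xi_1}\Big\{\frac{\xi_k\xi_l}{|\xi|^{2-a}}\langle\xi\rangle^{-N}\Big\}e^{iy\cdot\xi}\,d\xi\Big| = c\,\Big|\int_{\mathbb{R}^3}\langle\xi\rangle^{-N}|\xi|^{a-3}\theta(\xi)e^{iy\cdot\xi}\,d\xi\Big|,$$

where $\theta(\xi)$ is a bounded, smooth function, and $c$ is some numerical constant. Since $a > 0$, $\langle\xi\rangle^{-N}|\xi|^{a-3}\theta(\xi) \in L^p(\mathbb{R}^3_{\xi})$ for $1 \le p < \frac{3}{3-a} \le 2$. Hence from the Hausdorff–Young inequality,

$$\Big|\int\langle\xi\rangle^{-N}|\xi|^{a-3}\theta(\xi)e^{iy\cdot\xi}\,d\xi\Big| \in L^{p'}(\mathbb{R}^3_y)$$

for $\frac{1}{p} + \frac{1}{p'} = 1$. Finally, since $|y_1|^{-3} \le 3^{3/2}|y|^{-3} \in L^p\{|y| \ge 1\}$ for any $p > 1$,

$$\begin{aligned}
\Big|\int\frac{\xi_k\xi_l}{|\xi|^{2-a}}\langle\xi\rangle^{-N}e^{iy\cdot\xi}\,d\xi\Big|_{1,\{|y|\ge 1\}}
&= c\,\Big|\,y_1^{-3}\int\langle\xi\rangle^{-N}|\xi|^{a-3}\theta(\xi)e^{iy\cdot\xi}\,d\xi\Big|_{1,\{|y|\ge 1\}}\\
&\le C\Big\{\int_{|y|\ge 1}|y|^{-3p}\,dy\Big\}^{1/p}\Big|\int\langle\xi\rangle^{-N}|\xi|^{a-3}\theta(\xi)e^{iy\cdot\xi}\,d\xi\Big|_{p'} < \infty
\end{aligned}$$

from Hölder’s inequality.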
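The last step of the proof of Lemma 4 rests on some exponent bookkeeping in dimension three: $|\xi|^{a-3}$ is locally in $L^p$ exactly when $p(3-a) < 3$, and the Hausdorff–Young step needs $p \le 2$. The sketch below only illustrates that arithmetic; the helper name `lemma4_exponents` and the choice of a sample $p$ at the midpoint of the admissible range are assumptions made for the example, not something the paper prescribes.

```python
from fractions import Fraction


def lemma4_exponents(a):
    """Exponent bookkeeping for the last step of Lemma 4 (dimension 3).

    Returns (threshold, p, p_conj) where
      threshold = 3/(3-a): |xi|^(a-3) is locally in L^p iff p < threshold,
      p         = a sample admissible exponent (illustrative midpoint choice),
      p_conj    = the Hoelder conjugate p' with 1/p + 1/p' = 1.
    """
    a = Fraction(a)
    if not (0 < a <= Fraction(3, 2)):
        raise ValueError("Lemma 4 assumes 0 < a <= 3/2")
    threshold = Fraction(3) / (3 - a)  # at most 2 because a <= 3/2
    p = (1 + threshold) / 2            # strictly between 1 and threshold
    p_conj = p / (p - 1)
    return threshold, p, p_conj
```

For instance, `lemma4_exponents(Fraction(3, 2))` gives the threshold 2, so any $1 < p < 2$ is admissible; the restriction $a \le \frac{3}{2}$ is what keeps the threshold within the range where Hausdorff–Young applies.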

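Now we generalize Theorem 3 to solve (19) for $B^{\mu}_{\alpha\beta}$. Notice now $p \ne 1, \infty$.

Theorem 5. There exist distributions $B^{\mu}_{\alpha\beta}$ such that $LB^{\mu}_{\alpha\beta} \equiv -f^{\mu}_{\alpha\beta}$ for $0 \le \mu, \alpha, \beta \le 3$, with

$$F(B^{\mu}_{\alpha\beta})(\xi, \eta) = Q(\xi, \eta)F(f^{\mu}_{\alpha\beta}),$$

$|Q(\xi, \eta)| \le C(1 + |\xi| + |\eta|)^6$ as in Theorem 3. Moreover, for any $1 < p < \infty$, there exists an integer $N > 0$ such that

$$|[\partial^{i}V_1, B^{\mu}_{\alpha\beta}, \partial^{j}V_2^T]|_{p} \le C\{|\partial^{i}V_1|_{4N,p_1}|\partial^{j}V_2|_{4N,p_2} + |\partial^{i}V_1|_{4N,r_1}|\partial^{j}V_2|_{4N,r_2}\}$$

with $\frac{1}{p} = \frac{1}{p_1} + \frac{1}{p_2}$, $\frac{1}{r} = \frac{1}{r_1} + \frac{1}{r_2}$, $\frac{1}{r} = \frac{1}{p} + \frac{a}{n}$, for any $0 < a \le \frac{3}{2}$, and multi-indices $i, j$.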


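Proof. From part (a) of Theorem 3, there is $B^{\mu}_{\alpha\beta}$ such that $LB^{\mu}_{\alpha\beta} \equiv -f^{\mu}_{\alpha\beta}$. Taking the Fourier transform, by (18) we get

$$F(B^{\mu}_{\alpha\beta})(\xi, \eta) = Q(\xi, \eta)\Big\{m_0\frac{(\xi_k + \eta_k)(\xi_l + \eta_l)}{|\xi + \eta|^2}e_{0k} - F(g^{\mu}_{\alpha\beta})\Big\}.$$

By Theorem 3, for $\frac{1}{p} = \frac{1}{p_1} + \frac{1}{p_2}$, the $g$-part is easily estimated as

$$|[\partial^{i}V_1, F^{-1}\{QF(g^{\mu}_{\alpha\beta})\}, \partial^{j}V_2^T]|_{p} \le C|\partial^{i}V_1|_{4N,p_1}|\partial^{j}V_2|_{4N,p_2}.$$

It suffices to estimate the first singular term. By the Hardy–Littlewood–Sobolev inequality, for $\frac{1}{r} = \frac{1}{p} + \frac{a}{n}$, $1 < p < \infty$, $0 < a \le \frac{3}{2}$,

$$\Big|\Big[\partial^{i}V_1, F^{-1}\Big\{Q\frac{(\xi_k + \eta_k)(\xi_l + \eta_l)}{|\xi + \eta|^2}\Big\}e_{0k}, \partial^{j}V_2^T\Big]\Big|_{p} \le C\,\Big|I_{-a}\Big[\partial^{i}V_1, F^{-1}\Big\{Q\frac{(\xi_k + \eta_k)(\xi_l + \eta_l)}{|\xi + \eta|^2}\Big\}e_{0k}, \partial^{j}V_2^T\Big]\Big|_{r}.$$

From (17), (18) and the second equation of (13), the above is equivalent to

$$\begin{aligned}
&\Big|\frac{1}{(2\pi)^6}\int_{\mathbb{R}^3\times\mathbb{R}^3} e^{ix\cdot(\xi+\eta)}\,F(\partial^{i}V_1)(\eta)\,\frac{(\xi_k + \eta_k)(\xi_l + \eta_l)}{|\xi + \eta|^{2-a}}\,Q(\xi, \eta)\,e_{0k}\,F(\partial^{j}V_2^T)(\xi)\,d\xi\,d\eta\Big|_{r}\\
&\qquad = |[\partial^{i}V_1, F^{-1}\{|\xi + \eta|^{a-2}(\xi_k + \eta_k)(\xi_l + \eta_l)Q\}e_{0k}, \partial^{j}V_2^T]|_{r}.
\end{aligned} \tag{21}$$

It thus suffices to bound (21) by $C|\partial^{i}V_1|_{4N,r_1}|\partial^{j}V_2|_{4N,r_2}$. Let $\langle\sigma\rangle = \sqrt{1 + \sigma^2}$. From (14) and (13), we rewrite the right-hand side of (21) as

$$\begin{aligned}
&\Big|\Big[\partial^{i}V_1, \{\langle\partial_1 + \partial_2\rangle^{2N}\langle\partial_1 - \partial_2\rangle^{2N}\}F^{-1}\Big\{\frac{(\xi_k + \eta_k)(\xi_l + \eta_l)Q}{\langle\xi + \eta\rangle^{2N}\langle\xi - \eta\rangle^{2N}|\xi + \eta|^{2-a}}\Big\}e_{0k}, \partial^{j}V_2^T\Big]\Big|_{r}\\
&\quad = \Big|\Big[\partial^{i}V_1, \{\langle\partial_y + \partial_z\rangle^{2N}\langle\partial_y - \partial_z\rangle^{2N}\}F^{-1}\Big\{\frac{(\xi_k + \eta_k)(\xi_l + \eta_l)Q}{\langle\xi + \eta\rangle^{2N}\langle\xi - \eta\rangle^{2N}|\xi + \eta|^{2-a}}\Big\}e_{0k}, \partial^{j}V_2^T\Big]\Big|_{r}\\
&\quad \le C_{n_1,n_2}\Big|\Big[\partial^{i+n_1}V_1, F^{-1}\Big\{\frac{(\xi_k + \eta_k)(\xi_l + \eta_l)Q}{\langle\xi + \eta\rangle^{2N}\langle\xi - \eta\rangle^{2N}|\xi + \eta|^{2-a}}\Big\}e_{0k}, \partial^{j+n_2}V_2^T\Big]\Big|_{r}
\end{aligned}$$

(integration by parts), with summation over $0 \le n_1, n_2 \le 4N$. By a change of variables $x - y = y'$, $x - z = z'$ in (13), the above is majorized by ($\frac{1}{r} = \frac{1}{r_1} + \frac{1}{r_2}$)

$$C_{n_1,n_2}\,|\partial^{i+n_1}V_1|_{r_1}\,\Big|F^{-1}\Big\{\frac{(\xi_k + \eta_k)(\xi_l + \eta_l)Q}{\langle\xi + \eta\rangle^{2N}\langle\xi - \eta\rangle^{2N}|\xi + \eta|^{2-a}}\Big\}e_{0k}\Big|_{1}\,|\partial^{j+n_2}V_2^T|_{r_2} \quad \text{(Hölder)}.$$

We now only need to show that for some integer $N > 0$, $F^{-1}\big\{\frac{(\xi_k + \eta_k)(\xi_l + \eta_l)Q}{\langle\xi + \eta\rangle^{2N}\langle\xi - \eta\rangle^{2N}|\xi + \eta|^{2-a}}\big\} \in L^1(\mathbb{R}^3_y\times\mathbb{R}^3_z)$. Notice that $F^{-1}\{Q\langle\xi + \eta\rangle^{-N}\langle\xi - \eta\rangle^{-N}\} \in L^1(\mathbb{R}^3_y\times\mathbb{R}^3_z)$ for $N$ large. By Young’s inequality, it thus suffices to show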


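$$\Big|F^{-1}\Big\{\frac{(\xi_k + \eta_k)(\xi_l + \eta_l)}{|\xi + \eta|^{2-a}\langle\xi + \eta\rangle^{N}\langle\xi - \eta\rangle^{N}}\Big\}\Big|_{L^1(\mathbb{R}^3_y\times\mathbb{R}^3_z)} < \infty.$$

But from the definition of $F^{-1}$, this is equivalent to

$$\begin{aligned}
&\Big|\int_{\mathbb{R}^3\times\mathbb{R}^3}\frac{(\xi_k + \eta_k)(\xi_l + \eta_l)}{|\xi + \eta|^{2-a}\langle\xi + \eta\rangle^{N}\langle\xi - \eta\rangle^{N}}\,e^{iy\cdot\xi + iz\cdot\eta}\,d\xi\,d\eta\Big|_{L^1(\mathbb{R}^3_y\times\mathbb{R}^3_z)}\\
&\quad = c\,\Big|\int_{\mathbb{R}^3\times\mathbb{R}^3}\frac{\xi'_k\xi'_l}{|\xi'|^{2-a}\langle\xi'\rangle^{N}\langle\eta'\rangle^{N}}\,e^{i(y+z)\cdot\xi' + i(y-z)\cdot\eta'}\,d\xi'\,d\eta'\Big|_{L^1(\mathbb{R}^3_y\times\mathbb{R}^3_z)}
\end{aligned}$$

($\xi + \eta = 2\xi'$, $\xi - \eta = 2\eta'$). By further changing variables $y + z = y'$ and $y - z = z'$, we estimate the above by

$$\begin{aligned}
&C\int_{\mathbb{R}^3\times\mathbb{R}^3}\Big|\int_{\mathbb{R}^3\times\mathbb{R}^3}\frac{\xi'_k\xi'_l}{|\xi'|^{2-a}\langle\xi'\rangle^{N}\langle\eta'\rangle^{N}}\,e^{iy'\cdot\xi' + iz'\cdot\eta'}\,d\xi'\,d\eta'\Big|\,dy'\,dz'\\
&\quad = C\int_{\mathbb{R}^3}\Big|\int_{\mathbb{R}^3}\frac{\xi'_k\xi'_l}{|\xi'|^{2-a}\langle\xi'\rangle^{N}}\,e^{iy'\cdot\xi'}\,d\xi'\Big|\,dy' \times \int_{\mathbb{R}^3}\Big|\int_{\mathbb{R}^3}\langle\eta'\rangle^{-N}e^{iz'\cdot\eta'}\,d\eta'\Big|\,dz' \quad \text{(Fubini)}\\
&\quad \le C\int_{\mathbb{R}^3}\Big|\int_{\mathbb{R}^3}\langle\eta'\rangle^{-N}e^{iz'\cdot\eta'}\,d\eta'\Big|\,dz' < \infty,
\end{aligned}$$

where we have used Lemma 4. We thus deduce our theorem.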
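The exponent relations stated in Theorem 5, $\frac{1}{r} = \frac{1}{p} + \frac{a}{n}$ and $\frac{1}{r} = \frac{1}{r_1} + \frac{1}{r_2}$, are easy to mis-track by hand. The following sketch does the arithmetic exactly; `hls_exponent` and `split_evenly` are hypothetical helper names, and the default $n = 3$ is an assumption reflecting the three space dimensions of the paper.

```python
from fractions import Fraction


def hls_exponent(p, a, n=3):
    """Return r defined by 1/r = 1/p + a/n (relation stated in Theorem 5)."""
    p, a = Fraction(p), Fraction(a)
    inv_r = 1 / p + a / n
    if not (0 < inv_r < 1):
        raise ValueError("the relation must give 1 < r < infinity")
    return 1 / inv_r


def split_evenly(r):
    """Symmetric Hoelder split r1 = r2 = 2r, so that 1/r = 1/r1 + 1/r2."""
    return 2 * r, 2 * r
```

For example, `hls_exponent(4, Fraction(1, 4))` returns 3, and `split_evenly(3)` returns the pair (6, 6), which matches the symmetric kind of choice $r_1 = r_2 = 2r$ made when Theorem 5 is applied in the next section.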

4. The $L^{\infty}$ Decay Estimate

In this section, we derive the $L^{\infty}$ decay estimate for the solution $w$ of (8). We first state the $L^{\infty}$–$L^{p}$ estimates for the linear Klein–Gordon equation.

Lemma 6. Let $(\partial_{tt} - \Delta + 1)\omega = 0$. Then for $1 \le p \le 2$, $l \ge 1$,

$$\|\omega(t)\|_{l,\infty} \le C(1 + t)^{-3/2+3/p'}\|\omega(0)\|_{4+l,p}.$$

Proof. Recall the space-time norm in (9). The $L^{\infty}$–$L^{1}$ estimate is standard, see [MSW]. Let $\langle\xi\rangle = \sqrt{1 + \xi^2}$; we have

$$\begin{aligned}
F(\omega(t))(\xi) &= F(\omega(0))\cos\langle\xi\rangle t + F(\partial_0\omega(0))\frac{\sin\langle\xi\rangle t}{\langle\xi\rangle},\\
F(\partial_0\omega(t))(\xi) &= F(\partial_0\omega(0))\cos\langle\xi\rangle t - F(\omega(0))\langle\xi\rangle\sin\langle\xi\rangle t.
\end{aligned}$$

For $2 \le p' < \infty$, from Corollary 5.1, 5.2 of [N],

$$\Big|F^{-1}\Big(\frac{\exp\{-i\langle\xi\rangle t\}}{\langle\xi\rangle^3}\Big)\Big|_{p'} \le Ct^{-3/2+3/p'}.$$

Hence by Young’s inequality for convolutions ($\frac{1}{p'} + \frac{1}{p} = 1$), for $t \ge 1$,

$$\begin{aligned}
\|\omega(t)\|_{1,\infty} &= |\omega(t)|_{1,\infty} + |\partial_0\omega(t)|_{\infty}\\
&\le C\,\Big|F^{-1}\Big(\frac{\exp\{-i\langle\xi\rangle t\}}{\langle\xi\rangle^3}\Big)\Big|_{p'}\{|\langle\xi\rangle^4 F(\omega(0))|_{p} + |\langle\xi\rangle^3 F(\partial_0\omega(0))|_{p}\}\\
&\le Ct^{-3/2+3/p'}\|\omega(0)\|_{4,p}.
\end{aligned}$$

On the other hand, for $t \le 1$, since $F^{-1}\big(\frac{\exp\{-i\langle\xi\rangle t\}}{\langle\xi\rangle^4}\big) \in L^1\cap L^{\infty}$, it follows that $\|\omega(t)\|_{1,\infty} \le C\|\omega(0)\|_{5,p}$. Hence the lemma is valid when $l = 1$. We deduce the lemma by taking $l - 1$ more spatial derivatives. This lemma can also be proven by an interpolation between the $L^{\infty}$–$L^{1}$ estimate and the energy estimate for high derivatives.

We now define, for a positive integer $l > 0$,

$$|w|_X \equiv \|w\|_{2l,2}, \qquad |w|_Y \equiv \|w\|_{l,\infty}, \qquad |w|_Z \equiv \|w\|_{l+4,p},$$

$$|||w||| \equiv \sup_{t>0}\big[|w|_X + |\nabla\psi|_X + (1 + t)^{3/2-3/p'}|w|_Y\big],$$

$$|||w|||_{T_*} \equiv \sup_{0<t<T_*}\big[|w|_X + |\nabla\psi|_X + (1 + t)^{3/2-3/p'}|w|_Y\big],$$

with $1 < p < \frac{6}{5}$, or $6 < p' < \infty$. Notice that $|w|_Y \le C|w|_X$ for $l$ large, and $|w|_Y \le C|w|_Z$, both from the Sobolev Imbedding Theorem. The solutions $w$ which we construct satisfy $|||w||| < \infty$. We first derive an a priori bound for $|w|_Y$.

Theorem 7 ($L^{\infty}$ decay estimate). Let $|w(0)|_Y + |w(0)|_Z = \epsilon_0$, and $|||w|||_{T_*} \le 1$. There is $l_0 > 0$ such that if $l \ge l_0$, $0 \le t \le T_*$,

$$(1 + t)^{3/2-3/p'}|w(t)|_Y \le C(\epsilon_0 + |||w|||^2_{T_*}).$$

Proof. By the standard Duhamel principle, we apply Lemma 6 (with 1 replaced by $m_0$) to (20) to get

$$|\omega(t)|_Y \le C(1 + t)^{-3/2+3/p'}|\omega(0)|_Z + \int_0^t(1 + t - \tau)^{-3/2+3/p'}|R_1 + R_2|_{l+3,p}(\tau)\,d\tau.$$

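Step 1. Estimate $|w(t)|_Y$. Notice that from the normal form transformation (12),

$$|w^{\mu}(t)|_Y \le |\omega^{\mu}(t)|_Y + |[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_Y.$$

We estimate the quadratic term above. By the Sobolev Imbedding Theorem,

$$|[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_{l,\infty} \le C|[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_{l+1,q}$$

for some $q > n > 2$. Choose $\frac{1}{r} + \frac{a}{n} = \frac{1}{q}$, $a$ small, and $r > 2$. Repeatedly using (14) as well as integrating by parts over $y$ and $z$, from Theorem 5, we obtain

$$\begin{aligned}
|[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_{l+1,q} &= C_{ij}|[\partial^{i}(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, \partial^{j}(w^{\beta}, \partial_0 w^{\beta})^T]|_{q}\\
&\le C_{ij}\{\|\partial^{i}w^{\alpha}\|_{4N+1,q_1(i)}\|\partial^{j}w^{\beta}\|_{4N+1,q_2(j)} + \|\partial^{i}w^{\alpha}\|_{4N+1,r_1(i)}\|\partial^{j}w^{\beta}\|_{4N+1,r_2(j)}\},
\end{aligned}$$

with summation over multi-indices $i$ and $j$ with $0 \le i + j \le l + 1$, where the $C_{ij}$ are constants. We choose

$$\frac{1}{q_1(i)} = \frac{i}{(i + j)q}, \quad \frac{1}{q_2(j)} = \frac{j}{(i + j)q}; \qquad \frac{1}{r_1(i)} = \frac{i}{(i + j)r}, \quad \frac{1}{r_2(j)} = \frac{j}{(i + j)r}. \tag{22}$$

Apply the Nirenberg–Gagliardo inequality (interpolation between $q, \infty$ and $r, \infty$) to each $i, j$ to get

$$|[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_{l+1,q} \le C_{ij}\|w\|_{4N+1,\infty}[\|w\|_{4N+l+2,q} + \|w\|_{4N+l+2,r}] \le C|w|_Y|w|_X, \tag{23}$$

where $\|w\|_{4N+l+2,q} + \|w\|_{4N+l+2,r} \le C|w|_X$, for $q, r > 2$, $l \ge l_0$, from the Sobolev Theorem. Now we estimate, for $q > n$,

$$\begin{aligned}
|\partial_0[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_{l,\infty}
&\le C|\partial_0[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_{l+1,q}\\
&\le C|[(\partial_0 w^{\alpha}, (\Delta - m_0)w^{\alpha} + f^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_{l+1,q}\\
&\quad + C|[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (\partial_0 w^{\beta}, (\Delta - m_0)w^{\beta} + f^{\beta})^T]|_{l+1,q}.
\end{aligned}$$

It suffices to estimate the first term only, which is bounded by

$$C\{|[(\partial_0 w^{\alpha}, (\Delta - m_0)w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_{l+1,q} + |[(0, f^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_{l+1,q}\} \equiv I_1 + I_2.$$

Repeating the same proof of (23) with the same choice of (22), we obtain

$$\begin{aligned}
I_1 &\le C_{ij}\{\|\partial^{i}w^{\alpha}\|_{4N+2,q_1(i)}\|\partial^{j}w^{\beta}\|_{4N+2,q_2(j)} + \|\partial^{i}w^{\alpha}\|_{4N+2,r_1(i)}\|\partial^{j}w^{\beta}\|_{4N+2,r_2(j)}\}\\
&\le C\|w\|_{4N+2,\infty}[\|w\|_{4N+l+3,q} + \|w\|_{4N+l+3,r}] \le C|w|_Y|w|_X,
\end{aligned}$$

since $q, r > 2$. Applying Theorem 5 with $p_1 = p_2 = 2q$ and $r_1 = r_2 = 2r$, we estimate $I_2$ as

$$\begin{aligned}
I_2 &\le C_{ij}\{|\partial^{i}f^{\alpha}|_{4N,2q}\|\partial^{j}w^{\beta}\|_{4N+1,2q} + |\partial^{i}f^{\alpha}|_{4N,2r}\|\partial^{j}w^{\beta}\|_{4N+1,2r}\}\\
&\le C\{|f|_{4N+l+1,2q}\|w\|_{4N+l+1,2q} + |f|_{4N+l+1,2r}\|w\|_{4N+l+1,2r}\} \le C|w|_Y|w|_X,
\end{aligned}$$

where $|f|_{4N+l+1,2q} + |f|_{4N+l+1,2r} \le C|w|_Y$ from Lemma 2. In summary, we combine $I_1$ and $I_2$ to get

$$|w^{\mu}(t)|_Y \le |\omega^{\mu}(t)|_Y + C|w(t)|_Y|w(t)|_X. \tag{24}$$

Step 2. The estimate of $|\omega(0)|_Z$. Notice that from (12),

$$|\omega^{\mu}(0)|_Z \le |w^{\mu}(0)|_Z + |[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_Z(0).$$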

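Applying Theorem 5 with $p_1 = p_2 = 2p$ and $r_1 = r_2 = 2r$, we get

$$\begin{aligned}
|[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_{l+4,p}(0)
&\le C_{ij}\|\partial^{i}w^{\alpha}(0)\|_{4N+1,2p}\|\partial^{j}w^{\beta}(0)\|_{4N+1,2p}\\
&\quad + C_{ij}\|\partial^{i}w^{\alpha}(0)\|_{4N+1,2r}\|\partial^{j}w^{\beta}(0)\|_{4N+1,2r}\\
&\le C|w(0)|^2_X,
\end{aligned} \tag{25}$$

since $2p, 2r > 2$, for $l \ge l_0$ and $0 \le i + j \le l + 1$. Here $\frac{1}{r} + \frac{a}{n} = \frac{1}{q}$, $r > 2$, and $a$ small. Similarly, with the same choice of $p_1$ and $r_1$, we estimate

$$\begin{aligned}
|\partial_0[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_{l+3,p}(0)
&\le C|[(\partial_0 w^{\alpha}, (\Delta - m_0)w^{\alpha} + f^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_{l+3,p}(0)\\
&\quad + C|[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (\partial_0 w^{\beta}, (\Delta - m_0)w^{\beta} + f^{\beta})^T]|_{l+3,p}(0)\\
&\le C\{\|w(0)\|_{4N+l+5,2p} + |f(0)|_{4N+l+3,2p} + \|w(0)\|_{4N+l+5,2r} + |f(0)|_{4N+l+3,2r}\}^2\\
&\le C|w(0)|^2_X
\end{aligned} \tag{26}$$

for $l$ large, where we have used the estimate for $f$ in Lemma 2. In summary, from (25) and (26),

$$|\omega^{\mu}(0)|_Z \le |w^{\mu}(0)|_Z + C|w(0)|^2_X. \tag{27}$$

Step 3. The estimate of $|R_1 + R_2|_{l+3,p}$. We first estimate $R_2$. Recall $R_2$ consists of terms of $\partial_{ij}\Delta^{-1}[h(w^0)w^j]$ and $h_1(w^0)$. By the $L^p$ boundedness of the Riesz transform [Ste] ($1 < p < \infty$),

$$|\partial_{ij}\Delta^{-1}[h(w^0)w^j]|_{l+3,p} \le C|h(w^0)w^j|_{l+3,p} \le C|w|_Y|w|^2_X$$

for $l$ large. Similarly, $|h_1(w^0)|_{l+3,p} \le C|w|_Y|w|^2_X$. Hence $|R_2|_{l+3,p} \le C|w|_Y|w|^2_X$. Now we estimate $R_1$. From Theorem 5 with $p_i = 2p$, $r_i = 2r$ for $i = 1, 2$,

$$\begin{aligned}
|R_1^{\mu}|_{l+3,p} &\le |[(f^{\alpha}, \partial_0 f^{\alpha}), B^{\mu}_{\alpha\beta}, (w^{\beta}, \partial_0 w^{\beta})^T]|_{l+3,p}\\
&\quad + 2|[(0, f^{\alpha}), B^{\mu}_{\alpha\beta}, (\partial_0 w^{\beta}, (\Delta - m_0)w^{\beta} + f^{\beta})^T]|_{l+3,p}\\
&\quad + 2|[(\partial_0 w^{\alpha}, (\Delta - m_0)w^{\alpha} + f^{\alpha}), B^{\mu}_{\alpha\beta}, (0, f^{\beta})^T]|_{l+3,p}\\
&\quad + |[(w^{\alpha}, \partial_0 w^{\alpha}), B^{\mu}_{\alpha\beta}, (f^{\beta}, \partial_0 f^{\beta})^T]|_{l+3,p}\\
&\le C\{\|f\|_{4N+l+4,2s}\|w\|_{4N+l+4,2s} + |f|_{4N+l+3,2s}[\|w\|_{4N+l+5,2s} + |f|_{4N+l+3,2s}]\},
\end{aligned}$$

where $s = 2p$ or $s = 2r > 2$. Now from Lemma 2, $\|f\|_{4N+l+4,2s} \le C|w|_X|w|_Y$. Applying the Sobolev Imbedding Theorem ($2s > 2$) to terms with $w$ yields ($l \ge l_0$)

$$|R_1^{\mu}|_{l+3,p} \le C|w|_Y|w|^2_X. \tag{28}$$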

Combining (27) and (28), we now have

$$\begin{aligned}
|\omega(t)|_Y &\le C(1 + t)^{-3/2+3/p'}|\omega(0)|_Z + \int_0^t(1 + t - \tau)^{-3/2+3/p'}|R_1 + R_2|_{l+3,p}(\tau)\,d\tau\\
&\le C\Big\{(1 + t)^{-3/2+3/p'}\epsilon_0 + \int_0^t(1 + t - \tau)^{-3/2+3/p'}|w(\tau)|_Y|w(\tau)|^2_X\,d\tau\Big\}\\
&\le C\Big\{(1 + t)^{-3/2+3/p'}\epsilon_0 + |||w|||^3_{T_*}\int_0^t(1 + t - \tau)^{-3/2+3/p'}(1 + \tau)^{-3/2+3/p'}\,d\tau\Big\}\\
&\le C(1 + t)^{-3/2+3/p'}[\epsilon_0 + |||w|||^3_{T_*}],
\end{aligned}$$

since $p' > 6$. Notice that $(1 + t)^{3/2-3/p'}|w(t)|_X|w(t)|_Y \le C|||w|||^2_{T_*}$; from (24), we conclude, for $0 \le t \le T_*$,

$$(1 + t)^{3/2-3/p'}|w(t)|_Y \le (1 + t)^{3/2-3/p'}[|\omega(t)|_Y + C|w(t)|_X|w(t)|_Y] \le C[\epsilon_0 + |||w|||^2_{T_*}]. \tag{29}$$
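The last step of the proof uses the fact that, for an exponent $\alpha = \frac{3}{2} - \frac{3}{p'} > 1$ (which is what $p' > 6$ guarantees), the convolution $\int_0^t(1 + t - \tau)^{-\alpha}(1 + \tau)^{-\alpha}\,d\tau$ is bounded by a constant times $(1 + t)^{-\alpha}$. The sketch below is a numerical sanity check of that statement only, not a proof; the function names, the midpoint-rule quadrature and the sample value $\alpha = 1.25$ (corresponding to $p' = 12$) are choices made for the illustration.

```python
def decay_convolution(t, alpha, steps=20000):
    """Midpoint-rule value of int_0^t (1+t-s)^(-alpha) * (1+s)^(-alpha) ds."""
    if t <= 0:
        return 0.0
    h = t / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += (1.0 + t - s) ** (-alpha) * (1.0 + s) ** (-alpha)
    return total * h


def weighted_ratio(t, alpha):
    """(1+t)^alpha times the convolution; stays bounded in t when alpha > 1."""
    return (1.0 + t) ** alpha * decay_convolution(t, alpha)
```

Splitting the integral at $\tau = t/2$ gives the bound $2\cdot 2^{\alpha}/(\alpha - 1)$ for `weighted_ratio` when $\alpha > 1$. For $\alpha = 1$, the rate of the pure Euler case mentioned in the introduction, the integral equals $2\ln(1 + t)/(2 + t)$ in closed form, so the weighted ratio grows logarithmically instead of staying bounded.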


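5. The Energy Estimate and Global Existence

We now derive the high-order energy estimate.

Lemma 8 (Energy Estimate). Let $|w(0)|_X \le \epsilon_0$ and $|||w|||_{T_*} < \infty$ for $0 \le t \le T_*$; then

$$|w(t)|^2_X + |\nabla\psi(t)|^2_X \le \epsilon_0^2 + C|||w|||^3_{T_*}. \tag{30}$$

Proof. We take the derivative in (7) with $\partial^{\alpha} = \partial_0\partial^{l-1}$ or $\partial^{\alpha} = \partial^{l}$. Taking the vector inner product with $\partial^{\alpha}w$, we have

$$\langle\partial_0\partial^{\alpha}w, \partial^{\alpha}w\rangle + \langle A_j(w)\partial_j\partial^{\alpha}w, \partial^{\alpha}w\rangle = c_0^{-2}\langle\partial^{\alpha}\nabla\psi, \partial^{\alpha}v\rangle + c_{\alpha\beta}\langle\partial^{\beta}A_j(w)\partial^{\alpha-\beta}\partial_j w, \partial^{\alpha}w\rangle,$$

where the summation is over $|\beta| < |\alpha|$, and $\langle w_1, w_2\rangle = w_1^T w_2$, for some constants $c_{\alpha\beta}$. Since $A_j$ is symmetric, we rewrite the above as

$$\begin{aligned}
\partial_0\langle\partial^{\alpha}w, \partial^{\alpha}w\rangle + \partial_j\langle A_j(w)\partial^{\alpha}w, \partial^{\alpha}w\rangle
&= \frac{2}{c_0^2}\langle\partial^{\alpha}\nabla\psi, \partial^{\alpha}v\rangle + 2c_{\alpha\beta}\langle\partial^{\beta}A_j(w)\partial^{\alpha-\beta}\partial_j w, \partial^{\alpha}w\rangle\\
&\quad + \langle\partial_j A_j(w)\partial^{\alpha}w, \partial^{\alpha}w\rangle.
\end{aligned} \tag{31}$$

We now take $\partial^{\alpha}$ in the last equation in (5),

$$\partial_t\partial^{\alpha}\nabla\psi = -n_0\partial^{\alpha}v - n_0\nabla\Delta^{-1}\nabla\cdot\partial^{\alpha}\Big\{\Big[\Big(\frac{\gamma-1}{2}m + 1\Big)^{2/(\gamma-1)} - 1\Big]v\Big\}.$$

Plugging $\partial^{\alpha}v$ back into (31) and integrating over $[0, t]\times\mathbb{R}^3$, we obtain

$$\begin{aligned}
|\partial^{\alpha}w(t)|_2^2 + \frac{1}{c_0^2 n_0}|\partial^{\alpha}\nabla\psi(t)|_2^2
&\le \epsilon_0^2 + C\int_0^t|\partial^{\alpha}\nabla\psi|_2\,\Big|\nabla\Delta^{-1}\nabla\cdot\partial^{\alpha}\Big\{\Big[\Big(\frac{\gamma-1}{2}m + 1\Big)^{2/(\gamma-1)} - 1\Big]v\Big\}\Big|_2\,d\tau\\
&\quad + C\int_0^t|\partial^{\beta}A_j(w)\partial^{\alpha-\beta}\partial_j w|_2\,|\partial^{\alpha}w|_2\,d\tau\\
&\le \epsilon_0^2 + C\int_0^t|w(\tau)|_Y|w(\tau)|^2_X\,d\tau,
\end{aligned} \tag{32}$$

by the $L^2$ boundedness of the Riesz transform and $|\partial^{\beta}A_j(w)\partial^{\alpha-\beta}\partial_j w|_2 \le C|w|_Y|w|_X$ for $|\alpha| > |\beta|$. We thus further estimate the above by

$$\epsilon_0^2 + C|||w|||^3_{T_*}\int_0^t(1 + \tau)^{-3/2+3/p'}\,d\tau \le \epsilon_0^2 + C|||w|||^3_{T_*}.$$

Now we are ready for the global existence theorem.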


Theorem 9 (Global Existence). Let $l \ge l_0$ and $l_0$ be large enough. Let $|w(0)|_Z + |w(0)|_X = \epsilon_0$, and $\partial_j w_i(0) = \partial_i w_j(0)$ (irrotationality). There exists a unique global solution $w(t)$ of the Euler–Poisson system (5) with
\[
|||w(t)||| = \sup_{t>0}\big[\,|w|_X + |\nabla\psi|_X + (1+t)^{3/2-3/p}|w|_Y\,\big] < \infty,
\]
provided $\epsilon_0$ is small enough.

Proof. For $l \ge 3/2 + 1$, based on the energy estimate (32) and standard approximations, the existence of a local regular solution $w(t) \in C([0,T_*),X)$ of (6) follows by the standard method of [Ka]. We then combine (29) and (30) to obtain
\[
|||w(t)|||_{T_*} \le C\epsilon_0 + C\big(|||w(t)|||_{T_*}^{2} + |||w(t)|||_{T_*}^{3/2}\big),
\]
provided $|||w|||_{T_*} \le 1$. It follows from the bootstrap argument that $|||w(t)|||_{T_*} \le C\epsilon_0$ if $\epsilon_0$ is small, and $T_* = \infty$.

We remark that the condition $|w(0)|_Z + |w(0)|_X = \epsilon_0$ is implicit for the first order Euler–Poisson system. Not explicitly given as an initial condition, $\partial_0 w(0)$ is determined by (5). This implies that the non-local term $\nabla\psi$ is not only in $L^2$, but also has to be in $L^p$ for $p$ near 1. Equivalently, $\nabla\Delta^{-1}(n(0,x) - n_0) \in L^p$. This, however, can be achieved by a natural "neutral condition" for $n - n_0$ as follows.

Proof of Theorem 1. Let $\rho = n(0,x) - n_0 \in C_c^\infty$ and $\int\rho = 0$; we show $\nabla\Delta^{-1}\rho \in L^p$ for any $1 < p < 3/2$. Then $|w(0)|_Z < \infty$ follows from (5). Equivalently, we show that $I_1(\rho) \in L^p$. Here $I_1$ is the Riesz potential of order one. By [Ste],
\[
\begin{aligned}
I_1(\rho)(x) &= c\int_{\mathbb{R}^3}\rho(x-y)\,|y|^{-2}\,dy \\
&= c\int_0^\infty\Big\{\int_{|\xi|=1} r^2\rho(x-r\xi)\,d\xi\Big\}\,r^{-2}\,dr \quad\text{(spherical coordinates)} \\
&= 2c\int_0^\infty\Big\{\int_0^r \tau^2\int_{|\xi|=1}\rho(x-\tau\xi)\,d\xi\,d\tau\Big\}\,r^{-3}\,dr
\end{aligned}
\]

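via an integration by parts over the radial variable $r$. Here $c$ is some numerical constant. Let $\operatorname{supp}\rho(z) \subseteq \{|z| \le d\}$. We claim that for $|x| > d$,
\[
I_1(\rho)(x) = 2c\int_{|x|-d}^{|x|+d}\Big\{\int_0^r \tau^2\int_{|\xi|=1}\rho(x-\tau\xi)\,d\xi\,d\tau\Big\}\,r^{-3}\,dr. \tag{33}
\]
To prove (33), we show that, as a function of $r$, the support of $\int_0^r \tau^2\int_{|\xi|=1}\rho(x-\tau\xi)\,d\xi\,d\tau$ is in $[|x|-d,\,|x|+d]$. In fact, if $r \le |x|-d$, then for $\tau \le r$ we have $|x-\tau\xi| \ge |x|-\tau \ge |x|-(|x|-d) = d$, hence $\rho(x-\tau\xi) = 0$. It follows that
\[
\int_0^r \tau^2\int_{|\xi|=1}\rho(x-\tau\xi)\,d\xi\,d\tau = 0.
\]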


On the other hand, if $r \ge |x|+d$, then for $\tau \ge r$, $|x-\tau\xi| \ge \tau-|x| \ge d$. Hence $\rho(x-\tau\xi) = 0$. Therefore
\[
\begin{aligned}
\int_0^r \tau^2\int_{|\xi|=1}\rho(x-\tau\xi)\,d\xi\,d\tau
&= \int_0^r \tau^2\int_{|\xi|=1}\rho(x-\tau\xi)\,d\xi\,d\tau + \int_r^\infty \tau^2\int_{|\xi|=1}\rho(x-\tau\xi)\,d\xi\,d\tau \\
&= \int_0^\infty \tau^2\int_{|\xi|=1}\rho(x-\tau\xi)\,d\xi\,d\tau \\
&= \int_{\mathbb{R}^3}\rho(x-y)\,dy = 0
\end{aligned}
\]
for $r \ge |x|+d$, from the neutral condition $\int\rho = 0$. We have therefore proved (33). Now from (33) and
\[
\int_0^r \tau^2\int_{|\xi|=1}\rho(x-\tau\xi)\,d\xi\,d\tau = \int_{|y|\le r}\rho(x-y)\,dy \le C d^3|\rho|_\infty,
\]
we deduce that $|I_1(\rho)|(x) \le C|x|^{-3}$ for $|x|$ large, thus $I_1(\rho) \in L^p(\{|x| \ge 1\})$ for $1 < p < 3/2$. By the Hardy–Littlewood–Sobolev inequality, $I_1(\rho) \in L^2_{loc}(\mathbb{R}^3)$. Hence $I_1(\rho) \in L^p(\mathbb{R}^3)$ for any $1 < p < 3/2$.

Acknowledgement. This article was inspired by stimulating conversations on elasticity with S. Tahvildar-Zadeh. The author also wishes to thank D. Christodoulou, S. Klainerman and W. Strauss for helpful comments.

References

[CDMS] Cordier, S., Degond, P., Markowich, P., Schmeiser, C.: Travelling wave analysis and jump relations for Euler–Poisson model in the quasineutral limit. Asymptotic Analysis 11, 209–240 (1995)
[CG] Chen, G.-Q., Glimm, J.: Global solutions to the compressible Euler equations with geometrical structure. Commun. Math. Phys. 180, 153–193 (1996)
[CW] Chen, G.-Q., Wang, D.: Convergence of shock capturing schemes for the compressible Euler–Poisson equations. Preprint 1996
[DM] Degond, P., Markowich, P.: A steady state potential flow model for semiconductors. Annali di Matematica pura ed applicata (IV), vol. CLXV, 87–98 (1993)
[G] Gamba, I.M.: Stationary transonic solutions of a one-dimensional hydrodynamic model for semiconductors. Commun. in PDE 17 (3&4), 553–577 (1992)
[K] Klainerman, S.: Global existence of small amplitude solutions to nonlinear Klein–Gordon equations in four space-time dimensions. Comm. Pure Appl. Math. 38, 631–641 (1985)
[Ka] Kato, T.: The Cauchy problem for quasilinear symmetric systems. Arch. Ration. Mech. Anal. 58, 181–205 (1975)
[MMU] Makino, T., Mizohata, K., Ukai, S.: The global weak solutions of compressible Euler equations with spherical symmetry. Japan J. Industrial Appl. Math. 9, 431–449 (1992)
[MSW] Marshall, B., Strauss, W., Wainger, S.: $L^p$–$L^q$ estimates for the Klein–Gordon equations. J. Math. Pures Appl. (9) 59, 417–440 (1980)
[N] Nelson, S.: On some solutions to the Klein–Gordon equations related to an integral of Sonine. Trans. A. M. S. 154, 227–237 (1971)
[Pe] Perthame, B.: Nonexistence of global solutions to Euler–Poisson equations for repulsive forces. Japan J. Appl. Math. 7, no. 2, 363–367 (1990)
[PRV] Poupaud, F., Rasche, M., Vila, J.P.: Global solutions to the isothermal Euler–Poisson system with arbitrarily large data. J. Diff. Equ. 123, 93–121 (1995)
[Sh] Shatah, J.: Normal forms and quadratic nonlinear Klein–Gordon equations. Comm. Pure Appl. Math. 38, 685–696 (1985)
[Si1] Sideris, T.: Formation of singularities in three-dimensional compressible fluids. Commun. Math. Phys. 101, 475–485 (1985)
[Si2] Sideris, T.: The lifespan of smooth solutions to the three-dimensional compressible Euler equations and the incompressible limit. Indiana Univ. Math. J. 40, no. 2, 536–550 (1991)
[Ste] Stein, E.: Singular Integrals and Differentiability. Princeton, NJ: Princeton Univ. Press, 1970
[Str] Strauss, W.: Nonlinear Wave Equations. Providence, RI: AMS, 1989
[WC] Wang, D., Chen, G.-Q.: Formation of singularities in compressible Euler–Poisson fluids with heat diffusion and damping relaxation. Preprint 1996
[Z] Zhang, B.: Convergence of the Godunov scheme for a simplified one-dimensional hydrodynamic model for semiconductor devices. Commun. Math. Phys. 157, 1–22 (1993)

Communicated by J. L. Lebowitz

Commun. Math. Phys. 195, 267 – 293 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Unstable BGK Solitary Waves and Collisionless Shocks*

Yan Guo^{1,**}, Walter A. Strauss^2

1 Division of Applied Mathematics, Brown University, Providence, RI 02912, USA
2 Department of Mathematics, Brown University, Providence, RI 02912, USA

Received: 20 February 1997 / Accepted: 19 November 1997

Abstract: Consider a collisionless relativistic neutral plasma. We generalize the Penrose condition for linearized instability to the relativistic case. Then we consider a general one-dimensional equilibrium (a BGK wave) that is a collisionless shock or a solitary wave. The electric potential undergoes a transition from one constant to another as x runs from −∞ to +∞. We prove that if one of these constants satisfies the relativistic version of the Penrose condition, then the BGK wave is nonlinearly unstable. We also prove that the periodic relativistic BGK waves of small amplitude are nonlinearly unstable.

* The research of the first author is supported in part by NSF grant 96-23253 and an NSF Postdoctoral Fellowship; the second author is supported in part by NSF grants 93-22146 and 97-03695.
** This work was completed while Y.G. was at the Department of Mathematics, Princeton University.

1. Introduction

A collisionless plasma of ions and electrons is described by the Vlasov–Maxwell system. In such a plasma collisions are relatively rare; here we assume no collisions at all. In many plasmas some of the particles are expected to travel at relativistic speeds. However, in a nonrelativistic Vlasov model particles can travel at arbitrarily great speeds. We avoid this anomaly by assuming a relativistic model. Thus we consider the one-dimensional relativistic Vlasov–Maxwell system (RVM),
\[
\partial_t f_\pm(t,x,v) + \hat v\,\partial_x f_\pm(t,x,v) \pm \frac{e_\pm}{m_\pm}\,E(t,x)\,\partial_v f_\pm(t,x,v) = 0, \tag{1}
\]
\[
\partial_t E(t,x) = -4\pi j(t,x) = -4\pi\int_{-\infty}^{\infty}\hat v\,[e_+ f_+(t,x,v) - e_- f_-(t,x,v)]\,dv \tag{2}
\]
with the constraint
\[
\partial_x E(t,x) = 4\pi\rho(t,x) = 4\pi\int_{-\infty}^{\infty}[e_+ f_+(t,x,v) - e_- f_-(t,x,v)]\,dv. \tag{3}
\]
Here $f_+(t,x,v)$ is the distribution function of the ions, $f_-(t,x,v)$ the distribution of the electrons, at time $t$, position $x$, momentum $v$, and velocity $\hat v$. The mass and charge of an ion are $m_+$ and $e_+$, while the mass and charge of an electron are $m_-$ and $-e_-$. For notational simplicity, we will take all constants to be 1. Therefore the velocity is $\hat v = v/\sqrt{1+v^2}$. The electric field is $E(t,x)$, the charge density is $\rho(t,x)$, and the current density is $j(t,x)$. We will generally be assuming the neutrality condition
\[
\iint f_+\,dx\,dv = \iint f_-\,dx\,dv,
\]
which means that the total negative and positive charges are the same. There is considerable redundancy in the equations since (1) and (2) imply $\partial_t(\partial_x E - 4\pi\rho) = 0$ while (1) and (3) imply $\partial_x(\partial_t E + 4\pi j) = 0$.

A fundamental feature of this collisionless model is the multiplicity of its steady states. Their dynamical stability has been one of the important problems in plasma physics, with potential applications to plasma control (stability) and turbulence (microscopic instabilities). The simplest stationary solutions are those distributions $f_\pm(v)$ that only depend on the velocity variable $v$ and satisfy the charge neutrality condition with $E \equiv 0$. The distributions $f_\pm$ need not be Gaussians; in fact, they can have arbitrary shapes. In 1960 Penrose [P] found necessary and sufficient conditions on homogeneous (spatially independent) distributions $f_\pm$ for their linear instability. In Lemma 4 of Sect. 2 we extend Penrose's conditions to the relativistic case.

In 1957 Bernstein, Greene and Kruskal [BGK] introduced the general 1-D equilibria, now known as the BGK waves. Typical one-dimensional equilibria have the form $f_\pm(t,x,v) = \mu_\pm(\langle v\rangle \mp \beta(x))$ with $\langle v\rangle = \sqrt{1+v^2}$, where the electric potential $\beta$ solves
\[
\beta_{xx} = \int_{-\infty}^{\infty}[\mu_+(\langle v\rangle - \beta(x)) - \mu_-(\langle v\rangle + \beta(x))]\,dv. \tag{4}
\]

For a given pair of distribution functions $\mu_\pm$, (4) is a Hamiltonian system with respect to $x$ in the phase plane $(\beta,\beta_x)$. Three important types of BGK waves are periodic waves, solitary waves and collisionless shocks. They correspond to closed, homoclinic and heteroclinic orbits in the phase space $(\beta,\beta_x)$ respectively. It is the goal of this article to investigate the nonlinear dynamical instabilities of these three types of BGK waves. Theorem 1 asserts the nonlinear instability of solitary BGK waves and collisionless shocks. There is no size restriction on the amplitudes of these waves.

Theorem 1 (Instability of solitary waves and collisionless shocks). Let $0 \le \mu_\pm(\cdot) \in C^2$ satisfy (5), (31) and (37). Let $\beta \in C^2$ solve (4) and $\lim_{x\to-\infty}\beta(x) = 0$, and consider the stationary solution $[\mu_\pm(\langle v\rangle \mp \beta), \beta_x]$. Assume that the linearized Vlasov–Maxwell system (11) around the homogeneous state $[\mu_\pm(\langle v\rangle), 0]$ has a growing exponential plane wave solution. Then there are positive constants $\epsilon_0$ and $C_1$ and a family of BV solutions $u^\delta(t) = [f_\pm^\delta(t), E^\delta(t)]$ of RVM, defined for $\delta$ sufficiently small, with $f_\pm^\delta$ non-negative, such that
\[
\sum_\pm \|f_\pm^\delta(0) - \mu_\pm(\langle v\rangle \mp \beta)\|_{W^{1,1}(\mathbb{R}\times\mathbb{R})} + |E^\delta(0) - \beta_x|_{W^{1,1}(\mathbb{R})} \le \delta,
\]
and
\[
\sup_{0\le t\le C_1|\ln\delta|}\int_{\mathbb{R}}\Big\{\sum_\pm\int_{\mathbb{R}}|f_\pm^\delta(t) - \mu_\pm(\langle v\rangle \mp \beta)|\,dv + |E^\delta(t) - \beta_x|\Big\}\,dx + \sup_x|\partial_x E^\delta(t) - \beta_{xx}| \ge \epsilon_0 .
\]
Notice that the last term is bounded by the $W^{1,1}$ norm of $f - \mu$ by (3) and (4). Sufficient conditions for the hypotheses are given in Theorem 3. The solutions may be chosen as smooth as we wish. Theorem 2 asserts the nonlinear instability of small-amplitude periodic BGK waves.

Theorem 2 (Instability of periodic waves). Let $0 \le \mu_\pm \in C^2$ satisfy (5), (7), (31) and (37). Let $\beta$ be a solution of (4) of period $P_\beta$ with $\|\beta\|_{C^2}$ sufficiently small, as in Lemma 1. Then the conclusion of Theorem 1 is valid for solutions $[f_\pm^\delta, E^\delta]$ of period $2P_\beta$, where the norms are taken over the period $2P_\beta$.

Thus instability is measured in the $L^1$ norm together with the $L^\infty$ norm of the charge. Notice that in both theorems the solution escapes from a $\delta$-neighborhood of the equilibrium in a time $O(|\ln\delta|)$. This property characterizes an exponential instability, and the escape time is much shorter than the escape time $O(1/\delta)$ of the trivial instability due to Lorentz transformations. Conditions (5), (31) and (37) require that $\mu_\pm$ be smooth and non-negative and satisfy certain decay conditions at infinity. In Theorem 1 we assume that the given equilibrium is asymptotic to zero (a constant would do equally well) in one direction ($x \to -\infty$). In this case the origin in the phase space is a saddle and the equilibrium could be a collisionless shock or solitary wave. Sufficient conditions for such an equilibrium to exist, and for the relativistic Penrose condition for the corresponding homogeneous wave (with $\beta \equiv 0$) to hold, are given in Theorem 3. In Theorem 2 the condition (7) means that the origin is a center of the Hamiltonian system, so that arbitrarily small periodic equilibria exist. The same condition automatically provides a growing mode for the corresponding homogeneous wave. There are three levels of instability that appear in this paper. The simplest level is that of RVM linearized around a homogeneous state.
This level is easily reduced to a dispersion relation for which explicit sufficient (and almost necessary) conditions are found following [P]. The second level is that of the system RVM linearized around the inhomogeneous (BGK) equilibrium. The third level is that of the full nonlinear system RVM. Normally it is expected that exponential growth of the linearized system (the second level) should imply the nonlinear instability of the equilibrium (the third level). We are aware of no work other than [GS2], [GS3] and the present one that proves the instability of BGK equilibria either on the linearized or the nonlinear level. Although this paper is closely related to our previous one [GS2], there are some major differences. On the one hand, in [GS2] we only treated the small periodic BGK equilibria while here our main purpose is to treat solitary waves and collisionless shocks. On the other hand, we treated in [GS2] the Vlasov–Poisson system (VP), which consists of Eqs. (1) and (3) only. Instead of (2), in [GS2] we imposed the condition that the electric field have zero average over a period. In the present paper the additional equation (2) is naturally treated as an evolution equation for the electric field. A third difference is that the present paper is relativistic, which of course implies the causality of the system in space-time. The relativistic character of RVM requires many changes in the details of the calculations. Finally, at the suggestion of J. Vukadinović and G. Rein, we now explicitly exhibit the non-negativity of the distribution function of our unstable solutions, which was omitted in [GS2].


The most important difference of the solitary and shock cases from the periodic case is that the spatial variable is now unbounded. So the growing plane wave solutions, which are not in any natural function space like Lp , p < ∞, correspond to continuous spectrum and the method of [GS2] fails. In fact, it remains an open problem to establish the instability of solitary waves and shocks in the classical VP system. Instead, we employ the causality property of RVM to estimate solutions in a region by their initial data in the domain of dependence. Inside the domain of dependence we replace the original problem by a periodic problem that has unstable point spectrum. We carefully relate the sizes of these domains to the exponential escape times. We believe that the same approach may be able to overcome the difficulties in some other hyperbolic instability problems that are associated with continuous spectra [GS4]. Section 2 is devoted to sufficient conditions for the existence of periodic, solitary and shock-like BGK equilibria (Lemmas 1 and 2) and for the existence of exponentially growing modes for the corresponding homogeneous system (Lemmas 3 and 4), called the first level in the discussion above. Theorem 3 summarizes all the conditions required in the case of solitary and shock equilibria. In Sect. 3 we prove the linearized instability in the periodic case (second level). Here the main results parallel those of [GS2] but with some important differences. Explicitly using the Vlasov characteristics, the existence of a growing mode is reduced to finding a fixed point of a rather unwieldy operator C. The main technical point is an estimate that implies that the operator C depends continuously on β in an appropriate norm (Lemma 8). It follows that C is a compact operator on L1 that depends analytically on the eigenvalue parameter, so that standard compact operator theory is applicable (Theorem 4). In Sect. 
4 some properties of the periodic eigenfunctions are derived, including their regularity (Lemma 11) and their pointwise behavior (Lemma 13). Certain approximate eigenfunctions are estimated in Lemma 14. Section 5 is devoted to the periodic nonlinear problem. After a discussion of well-posedness, it is proved that an exponential bound on the difference of a solution from the equilibrium implies a corresponding bound on the first derivatives (Lemma 16). Then the dominant linear growing mode, together with a bootstrap method employing Lemma 16, is used to prove Theorem 2. In Sect. 6 we treat the instability of the solitary and shock BGK waves. After a brief discussion of well-posedness, we prove Theorem 1. While we believe it is also possible to prove the linear instability of these equilibria, we prefer a much briefer and more direct proof of the nonlinear instability. First we construct an approximate periodic problem with a very large period. By Theorem 2 there is an exact solution of the periodic nonlinear problem that is growing. Then we use the fact that β(x) → 0 as x → −∞ to deduce the existence of an exact solution of the full-line nonlinear problem that is also growing locally. Finally in Sect. 7 we complete the earlier paper [GS2] by stating sufficient conditions for the positivity of the distribution functions and making some minor corrections. Some related references are given in [GS2]. 3D homogeneous equilibria of VP are treated in [GS1]. Various magnetic equilibria of RVM in 1.5D are constructed in [GR]. The nonlinear stability of some of these magnetic equilibria are proven in [G] and the instability of others in [GS4]. At the suggestion of one of the referees we have omitted some of the proofs that are analogous to proofs in [GS2]; cf. [GS3].


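2. The BGK Equilibria

In this section, we construct various time-independent, spatially inhomogeneous BGK equilibria. They include periodic waves, solitary waves and collisionless shocks. A BGK wave is a steady state of the form $f_\pm = \mu_\pm(\langle v\rangle \mp \beta(x))$ and $E = \beta_x$, with $\beta$ satisfying
\[
\beta_{xx} = \int_{-\infty}^{\infty}[\mu_+(\langle v\rangle - \beta) - \mu_-(\langle v\rangle + \beta)]\,dv.
\]
Here $\mu_\pm$ are two given nonnegative functions. This is a simple Hamiltonian system with respect to $x$ in the phase space $(\beta,\beta_x)$. We always assume
\[
\mu_\pm(\cdot) \text{ are positive } C^2 \text{ functions on } \mathbb{R},\quad \mu_\pm(\langle v\rangle) \text{ and } \mu_\pm'(\langle v\rangle) \text{ are integrable functions of } v \in \mathbb{R};\quad \int_{-\infty}^{\infty}[\mu_+(\langle v\rangle) - \mu_-(\langle v\rangle)]\,dv = 0. \tag{5}
\]
We let the potential function $H(\beta)$ satisfy
\[
-H'(\beta) = \int_{-\infty}^{\infty}[\mu_+(\langle v\rangle - \beta) - \mu_-(\langle v\rangle + \beta)]\,dv
\]
and we define $H(0) = 0$. Substituting $s = \langle v\rangle \mp \beta$, we obtain formally:
\[
\begin{aligned}
-H'(\beta) &= \int_{-\infty}^{\infty}[\mu_+(\langle v\rangle - \beta) - \mu_-(\langle v\rangle + \beta)]\,dv \\
&= 2\int_{1-\beta}^{\infty}\mu_+(s)\,\frac{(s+\beta)\,ds}{\sqrt{(s+\beta)^2-1}} - 2\int_{1+\beta}^{\infty}\mu_-(s)\,\frac{(s-\beta)\,ds}{\sqrt{(s-\beta)^2-1}} \\
&= 2\,\frac{\partial}{\partial\beta}\int_{1-\beta}^{\infty}\mu_+(s)\sqrt{(s+\beta)^2-1}\,ds + 2\,\frac{\partial}{\partial\beta}\int_{1+\beta}^{\infty}\mu_-(s)\sqrt{(s-\beta)^2-1}\,ds.
\end{aligned}
\]
Therefore,
\[
H(\beta) = C - 2\int_{1-\beta}^{\infty}\mu_+(s)\sqrt{(s+\beta)^2-1}\,ds - 2\int_{1+\beta}^{\infty}\mu_-(s)\sqrt{(s-\beta)^2-1}\,ds, \tag{6}
\]
where $C = 2\int_1^\infty[\mu_+(s)+\mu_-(s)]\sqrt{s^2-1}\,ds$. We deduce that as $\beta \to \pm\infty$, $H(\beta) \to -\infty$ unless $\mu_\pm \equiv 0$. The same conclusion is easily justified under assumptions (5) by noting that
\[
\lim_{\beta\to+\infty}H'(\beta) = -\lim_{\beta\to+\infty}\int_{-\infty}^{\infty}\mu_+(\langle v\rangle - \beta)\,dv = -\lim_{\beta\to+\infty}\int_{1-\beta}^{\infty}\mu_+(s)\,\frac{s+\beta}{\sqrt{(s+\beta)^2-1}}\,ds = -\int_{-\infty}^{\infty}\mu_+(s)\,ds < 0.
\]
This implies, in particular, that no BGK wave satisfies $\lim_{x\to\infty}\beta_x = \text{const} \ne 0$. We also have
\[
H''(0) = \int_{-\infty}^{\infty}[\mu_+'(\langle v\rangle) + \mu_-'(\langle v\rangle)]\,dv.
\]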


Lemma 1 (Periodic BGK equilibria). Let $\mu_\pm$ satisfy (5) and let
\[
\Big(\frac{2\pi}{P_0}\Big)^2 = \int_{-\infty}^{\infty}[\mu_+'(\langle v\rangle) + \mu_-'(\langle v\rangle)]\,dv > 0. \tag{7}
\]
Then there exists $\delta_0 > 0$ such that for all $\delta < \delta_0$, there exists a periodic function $\beta(x)$ with period $P_\beta$ satisfying (4), and
\[
|\beta|_\infty = \delta,\qquad \lim_{\delta\to 0}P_\beta = P_0,\qquad \beta(0) = \beta(P_\beta) = \min_{0\le x\le P_\beta}\beta(x),\qquad \beta(P_\beta/2) = \max_{0\le x\le P_\beta}\beta(x). \tag{8}
\]
Here $P_0$ is defined by (7) and we can take $\delta_0 = \sup\{s : H''(s) > 0\}$.

The proof is similar to that of Lemma 1.1 in [GS2]. For fixed $\beta$, we sometimes drop the subscript on $P_\beta$. We now construct the solitary waves and collisionless shocks. Unlike the case of periodic solutions, these solutions are associated with saddle points of (4). From elementary ODE theory, the following lemma is well known.

Lemma 2 (Solitary waves and collisionless shocks). Assume (5) as well as
\[
0 > H''(0) = \int_{-\infty}^{\infty}[\mu_+'(\langle v\rangle) + \mu_-'(\langle v\rangle)]\,dv, \tag{9}
\]
\[
\text{there is } p > 0 \text{ such that } 0 = H(p). \tag{10}
\]
Then there is a solution of (4) such that
\[
\lim_{x\to-\infty}\beta(x) = 0,\qquad \lim_{x\to+\infty}\beta(x) = \text{constant} = a.
\]
(These two limits can be arbitrary constants.) If we let $p$ be the first positive zero of $H$, then $H'(p) \ge 0$. If $H'(p) > 0$, then $a = 0$ and $\beta_{\max} = p$, and the solution is called a solitary wave (or homoclinic orbit). If $H'(p) = 0$, then $a = p$ and $\beta$ is monotone and the solution is called a collisionless shock (or heteroclinic orbit).

Our goal is to study the instabilities of these BGK equilibria. As a first step, we construct growing plane wave solutions to the linearized Vlasov–Maxwell system around spatially homogeneous waves. These homogeneous waves correspond to the centers and saddles associated with the periodic waves and solitary waves (or collisionless shocks) respectively. The linearized Vlasov–Maxwell system around $(\mu_\pm(\langle v\rangle), \beta \equiv 0)$ is
\[
\begin{aligned}
(\partial_t + \hat v\,\partial_x)g_\pm &= \mp E\,\partial_v\mu_\pm(\langle v\rangle), \\
\partial_t E &= -\int_{-\infty}^{\infty}\hat v\,(g_+ - g_-)\,dv = -j, \\
\partial_x E &= \int_{-\infty}^{\infty}(g_+ - g_-)\,dv = \rho.
\end{aligned}
\tag{11}
\]
We emphasize that system (11) is not (RVM) linearized around the BGK equilibrium but only around the corresponding homogeneous state.
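The dichotomy in Lemma 2 can be illustrated on two toy potentials. The sketch below is only an illustration under stated assumptions: the potentials `H` are hypothetical polynomials chosen so that $H(0) = H'(0) = 0$ and $H''(0) = -1 < 0$; they are not derived from any pair $\mu_\pm$ through (6), and they do not have the behaviour $H \to -\infty$ at both ends. The code checks by central differences that the explicit profiles solve $\beta_{xx} = -H'(\beta)$, one being homoclinic ($H'(p) > 0$) and the other heteroclinic ($H'(p) = 0$).

```python
import math

# Toy potentials (hypothetical; NOT derived from any mu_+/- via (6)).
# Both satisfy H(0) = H'(0) = 0 and H''(0) = -1 < 0, so the origin is a saddle.

def dH_solitary(b):
    # H(b) = -b^2/2 + b^3/3: first positive zero p = 3/2, H'(p) = 3/4 > 0
    return -b + b * b

def beta_solitary(x):
    # homoclinic orbit: beta -> 0 as x -> +-infinity, maximum value p = 3/2
    return 1.5 / math.cosh(0.5 * x) ** 2

def dH_shock(b):
    # H(b) = -b^2 (1-b)^2 / 2: first positive zero p = 1, H'(p) = 0
    return -b * (1.0 - b) * (1.0 - 2.0 * b)

def beta_shock(x):
    # heteroclinic orbit: beta -> 0 at -infinity, beta -> p = 1 at +infinity
    return 1.0 / (1.0 + math.exp(-x))

def max_residual(beta, dH, xs, h=1e-3):
    """Largest |beta_xx + H'(beta)| over xs, with beta_xx from central differences."""
    worst = 0.0
    for x in xs:
        bxx = (beta(x + h) - 2.0 * beta(x) + beta(x - h)) / (h * h)
        worst = max(worst, abs(bxx + dH(beta(x))))
    return worst

xs = [0.25 * i for i in range(-40, 41)]   # grid on [-10, 10]
res_solitary = max_residual(beta_solitary, dH_solitary, xs)
res_shock = max_residual(beta_shock, dH_shock, xs)
print(res_solitary, res_shock)
```

Both residuals should be at the level of the finite-difference error, which is consistent with the two profiles being exact orbits of the toy systems leaving the saddle at the origin.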


Lemma 3 (Homogeneous growing modes for centers). Let $\mu_\pm$ satisfy conditions (5) and (7). Then there exists a growing exponential solution for (11) of period $2P_0$:
\[
g_\pm(t,x,v) = \pm\frac{\hat v\,\mu_\pm'(\langle v\rangle)}{\hat v - \omega_0/k}\,e^{ikx-i\omega_0 t},\qquad E(t,x) = -ik\,e^{ikx-i\omega_0 t},\qquad \int_0^{2P_0}E(t,x)\,dx = \int_0^{2P_0}j(t,x)\,dx = 0. \tag{12}
\]
Here $k = \pi/P_0 > 0$, and $\omega_0$ is a pure imaginary number. Moreover, $\operatorname{Im}\omega_0 > 0$.

Proof. Notice that the function
\[
Z(i\lambda) = \int_{-\infty}^{\infty}\frac{\hat v\,[\mu_+'(\langle v\rangle) + \mu_-'(\langle v\rangle)]}{\hat v - i\lambda}\,dv
\]
is real and continuous for $0 \le \lambda < \infty$ by integration by parts. Moreover, $Z(0) = (2\pi/P_0)^2$ and $\lim_{\lambda\to\infty}Z(i\lambda) = 0$. Hence there exists $\lambda > 0$ such that $Z(i\lambda) = (\pi/P_0)^2$. It follows directly that the following triple is a solution of (11):
\[
g_\pm = \pm\frac{\hat v\,\mu_\pm'(\langle v\rangle)}{\hat v - i\lambda}\,e^{k[ix+\lambda t]},\qquad E = -ik\,e^{k[ix+\lambda t]}.
\]
Here $k = \pi/P_0$. We deduce the lemma by letting $\omega_0 = ik\lambda$, $k \ne 0$. Clearly this is a growing mode since it has the factor $\exp(k\lambda t)$ with $k\lambda > 0$.

Remark. If $Z(i\lambda_0) \ge (2\pi/P_0)^2$ for some $\lambda_0 > 0$, then there is a growing mode with period $P_0$ instead of $2P_0$.

Although centers are always formally unstable, saddles are only unstable under certain assumptions. The following lemma gives a more general condition for a homogeneous growing mode.

Lemma 4 (Homogeneous growing modes). Let $F = \mu_+ + \mu_-$, $|F'(s)| = O(|s|^{-\gamma})$ as $|s| \to \infty$ for some $\gamma > 1$. Assume (5) is valid. There is a growing mode for the linearized Vlasov–Maxwell system provided there exists $b \in \mathbb{R}$ such that
\[
F'(\langle b\rangle) = 0,\qquad F''(\langle b\rangle) > 0,\qquad \int_{\mathbb{R}}\frac{F(\langle v\rangle) - F(\langle b\rangle)}{(\hat v - \hat b)^2\langle v\rangle^3}\,dv > 2\langle b\rangle^2F(\langle b\rangle). \tag{13}
\]
Proof. Following [P], we look for an exponential plane-wave solution $g_\pm = e^{i(kx-\omega t)}g(v)$, $E = e^{i(kx-\omega t)}$ with $k$ real and $\omega$ complex, $\operatorname{Im}\omega > 0$. Plugging into the linearized equation, we obtain the dispersion relation
\[
Z(z) = \int_{\mathbb{R}}\frac{\partial_vF(\langle v\rangle)}{\hat v - z}\,dv = k^2
\]
with $z = \omega/k$, $k \ne 0$. The function $Z$ maps $\mathbb{C}$ to $\mathbb{C}$, and is analytic for $\operatorname{Im}z \ne 0$. As in [P], in order to find an $\omega$ with $\operatorname{Im}\omega > 0$ satisfying this dispersion relation, it suffices to show that $\{Z(z) \mid \operatorname{Im}z = 0\}$, which is the image of the real axis, intersects the positive real axis. To this end, we compute $Z(z)$ on the real axis. We claim that

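\[
\operatorname{Im}Z(\hat b + i0) = \pi b\langle b\rangle^2F'(\langle b\rangle),\qquad \operatorname{Re}Z(\hat b + i0) = P\!\int_{-\infty}^{\infty}\frac{\partial_vF(\langle v\rangle)}{\hat v - \hat b}\,dv, \tag{14}
\]
where $P$ denotes the Cauchy principal value. To prove this, let $z = \hat b + i\eta$, so that
\[
Z(z) = \int_{-\infty}^{\infty}\frac{\hat v\,F'(\langle v\rangle)(\hat v - \hat b + i\eta)}{(\hat v - \hat b)^2 + \eta^2}\,dv.
\]
Letting $u = (\hat v - \hat b)/\eta$, we have $dv = \eta\,(1 - (\hat b + \eta u)^2)^{-3/2}\,du$ and
\[
\begin{aligned}
Z(z) &= \int_{-(1+\hat b)/\eta}^{(1-\hat b)/\eta}\frac{(\hat b + \eta u)\,F'\big([1-(\hat b+\eta u)^2]^{-1/2}\big)(u+i)\,[1-(\hat b+\eta u)^2]^{-3/2}}{u^2+1}\,du \\
&= \int_{-(1+\hat b)/\eta}^{(1-\hat b)/\eta}\frac{(\hat b + \eta u)\,F'(\langle v\rangle)(u+i)\langle v\rangle^3}{u^2+1}\,du
\end{aligned}
\]
with $\langle v\rangle = [1-(\hat b+\eta u)^2]^{-1/2}$. Since $\lim_{\eta\to 0}\langle v\rangle = \langle b\rangle$,
\[
\operatorname{Im}Z(\hat b + i0) = \lim_{\eta\to 0}\operatorname{Im}Z(z) = \int_{-\infty}^{\infty}\frac{\hat b\,F'(\langle b\rangle)\langle b\rangle^3}{u^2+1}\,du = \pi b\langle b\rangle^2F'(\langle b\rangle).
\]
Now we compute $\lim_{\eta\to 0}\operatorname{Re}Z(z)$, where
\[
\operatorname{Re}Z(z) = \int_{-\infty}^{\infty}\frac{\partial_vF(\langle v\rangle)(\hat v - \hat b)}{(\hat v - \hat b)^2 + \eta^2}\,dv.
\]
We split $\operatorname{Re}Z(z)$ as
\[
\begin{aligned}
\operatorname{Re}Z(z) &= \int_{|v-b|>\epsilon}\frac{\partial_vF(\langle v\rangle)(\hat v - \hat b)}{(\hat v - \hat b)^2+\eta^2}\,dv + \int_{|v-b|<\epsilon}\frac{[\partial_vF(\langle v\rangle) - \partial_vF(\langle b\rangle)](\hat v - \hat b)}{(\hat v - \hat b)^2+\eta^2}\,dv \\
&\quad + \partial_vF(\langle b\rangle)\int_{|v-b|<\epsilon}\frac{\hat v - \hat b}{(\hat v - \hat b)^2+\eta^2}\,dv \\
&= I_1 + I_2 + I_3\,\partial_vF(\langle b\rangle).
\end{aligned}
\]
Clearly $I_2 = O(\epsilon)$ uniformly in $\eta$ for $\eta \ne 0$. We claim that the same is true for $I_3$. To prove the claim, let $w = \hat v - \hat b$, $dw = \langle v\rangle^{-3}\,dv = (1-(w+\hat b)^2)^{3/2}\,dv$. We thus have
\[
I_3 = \int_{b-\epsilon}^{b+\epsilon}\frac{\hat v - \hat b}{(\hat v - \hat b)^2+\eta^2}\,dv = \int_{\widehat{(b-\epsilon)}-\hat b}^{\widehat{(b+\epsilon)}-\hat b}\frac{w}{w^2+\eta^2}\,[1-(w+\hat b)^2]^{-3/2}\,dw.
\]
Notice that since $|\hat b| < 1$, it follows that $[1-(w+\hat b)^2]^{-3/2} = (1-\hat b^2)^{-3/2} + O(w)$ for $\epsilon$ small enough. Hence
\[
I_3 = (1-\hat b^2)^{-3/2}\int_{\widehat{(b-\epsilon)}-\hat b}^{\widehat{(b+\epsilon)}-\hat b}\frac{w}{w^2+\eta^2}\,dw + O\Big(\int_{\widehat{(b-\epsilon)}-\hat b}^{\widehat{(b+\epsilon)}-\hat b}\frac{w^2}{w^2+\eta^2}\,dw\Big).
\]
Moreover, since $\frac{w}{w^2+\eta^2}$ is odd and $\widehat{(b\pm\epsilon)} - \hat b = \pm\epsilon\langle b\rangle^{-3} + O(\epsilon^2)$,
\[
I_3 \le (1-\hat b^2)^{-3/2}\int_{-\widehat{(b-\epsilon)}+\hat b}^{\widehat{(b+\epsilon)}-\hat b}\frac{1}{w}\,dw + O\Big(\int_{\widehat{(b-\epsilon)}-\hat b}^{\widehat{(b+\epsilon)}-\hat b}dw\Big).
\]
The first interval has length $O(\epsilon^2)$, the second $O(\epsilon)$. Our claim thus follows. Now we can take the limit first as $\eta \to 0$ and then as $\epsilon \to 0$ to obtain
\[
\operatorname{Re}Z(\hat b + i0) = P\!\int_{-\infty}^{\infty}\frac{\partial_vF(\langle v\rangle)}{\hat v - \hat b}\,dv.
\]
We now rewrite $\operatorname{Re}Z(\hat b + i0)$ in terms of $F$ itself:
\[
\begin{aligned}
\int_{b+\epsilon}^{\infty}\frac{\partial_vF(\langle v\rangle)}{\hat v - \hat b}\,dv &= \int_{b+\epsilon}^{\infty}\frac{\partial_vF(\langle v\rangle) - \partial_vF(\langle b\rangle)}{\hat v - \hat b}\,dv \\
&= \int_{b+\epsilon}^{\infty}\frac{F(\langle v\rangle) - F(\langle b\rangle)}{(\hat v - \hat b)^2\langle v\rangle^3}\,dv + \frac{F(\langle v\rangle) - F(\langle b\rangle)}{\hat v - \hat b}\Big|_{b+\epsilon}^{\infty} \\
&= \int_{b+\epsilon}^{\infty}\frac{F(\langle v\rangle) - F(\langle b\rangle)}{(\hat v - \hat b)^2\langle v\rangle^3}\,dv - \frac{F(\langle b\rangle)}{1-\hat b} - \frac{F(\langle b+\epsilon\rangle) - F(\langle b\rangle)}{\widehat{(b+\epsilon)} - \hat b}.
\end{aligned}
\]
Similarly,
\[
\int_{-\infty}^{b-\epsilon}\frac{\partial_vF(\langle v\rangle)}{\hat v - \hat b}\,dv = \int_{-\infty}^{b-\epsilon}\frac{F(\langle v\rangle) - F(\langle b\rangle)}{(\hat v - \hat b)^2\langle v\rangle^3}\,dv - \frac{F(\langle b\rangle)}{1+\hat b} + \frac{F(\langle b-\epsilon\rangle) - F(\langle b\rangle)}{\widehat{(b-\epsilon)} - \hat b}.
\]
In the limit as $\epsilon \to 0$, the last terms in each expression cancel because
\[
\lim_{\epsilon\to 0}\frac{F(\langle b+\epsilon\rangle) - F(\langle b\rangle)}{\widehat{(b+\epsilon)} - \hat b} = \lim_{\epsilon\to 0}\frac{F(\langle b-\epsilon\rangle) - F(\langle b\rangle)}{\widehat{(b-\epsilon)} - \hat b}.
\]
The other terms in (14) give
\[
\operatorname{Re}Z(\hat b + i0) = \int\frac{F(\langle v\rangle) - F(\langle b\rangle)}{(\hat v - \hat b)^2\langle v\rangle^3}\,dv - 2\langle b\rangle^2F(\langle b\rangle).
\]
By condition (13), there exists $\hat b$ real such that $Z(\hat b + i0) > 0$. If $\hat b = 0$, we deduce, using only the last condition in (13), that there exists $\eta_0 > 0$ such that $Z(i\eta_0) > 0$, since $Z(i\eta)$ is real and continuous in $\eta$. If $\hat b \ne 0$, we calculate $\frac{d}{d\theta}\operatorname{Im}Z(\theta + i0)|_{\theta=\hat b} > 0$ from (13) and (14). This implies that the curve $\{Z(z) \mid \operatorname{Im}z = 0\}$ crosses the positive real axis. Hence in any case there exists a point in the upper half plane $z = \hat b + i\eta$ with $\eta > 0$ such that $Z(\hat b + i\eta) > 0$. This is exactly the dispersion relation for the existence of the plane wave solution
\[
g_\pm = \pm\frac{\hat v\,\mu_\pm'(\langle v\rangle)}{\hat v - \hat b - i\eta}\,e^{ik(x-\hat bt)}e^{\eta kt},\qquad E = -ik\,e^{ik(x-\hat bt)}e^{\eta kt}
\]
to (11). Since $\eta > 0$, this is a growing mode.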
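The intermediate-value argument in the proof of Lemma 3 can be carried out numerically. The following is a minimal sketch under stated assumptions, not the authors' computation: the profile $F(s) = (s-1)^2e^{-(s-1)}$ is hypothetical (it is non-negative but vanishes at $s = 1$, so it is not strictly positive), and the quadrature parameters `vmax` and `n` are arbitrary choices. On the imaginary axis the imaginary part of the integrand is odd in $v$, so $Z(i\lambda) = \int \hat v^2F'(\langle v\rangle)/(\hat v^2+\lambda^2)\,dv$; the code finds $\lambda > 0$ with $Z(i\lambda) = k^2 = (\pi/P_0)^2 = Z(0)/4$ by bisection.

```python
import math

def dF(s):
    """F'(s) for the hypothetical profile F(s) = (s-1)^2 exp(-(s-1)), s >= 1."""
    u = s - 1.0
    return (2.0 * u - u * u) * math.exp(-u)

def Z(lam, vmax=60.0, n=6000):
    """Z(i*lam) = int vhat^2 F'(<v>) / (vhat^2 + lam^2) dv by Simpson's rule (n even)."""
    h = vmax / n
    total = 0.0
    for i in range(n + 1):
        v = i * h
        s = math.sqrt(1.0 + v * v)          # <v>
        vhat2 = (v / s) ** 2                # vhat^2
        weight = 1.0 if lam == 0.0 else vhat2 / (vhat2 + lam * lam)
        coef = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += coef * dF(s) * weight
    return 2.0 * total * h / 3.0            # the integrand is even in v

Z0 = Z(0.0)                 # plays the role of (2*pi/P0)^2; positive for a center
k = 0.5 * math.sqrt(Z0)     # k = pi/P0
target = k * k              # dispersion relation: Z(i*lam) = k^2

lo, hi = 0.0, 1.0
while Z(hi) >= target:      # Z(i*lam) -> 0 as lam -> infinity, so this terminates
    hi *= 2.0
for _ in range(50):         # bisection keeps Z(lo) > target > Z(hi)
    mid = 0.5 * (lo + hi)
    if Z(mid) > target:
        lo = mid
    else:
        hi = mid
lam = 0.5 * (lo + hi)
growth_rate = k * lam       # Im(omega_0) = k * lam, since omega_0 = i*k*lam
print(Z0, lam, growth_rate)
```

Bisection only needs the sign change between $\lambda = 0$ and large $\lambda$, which is exactly what the proof uses; it returns one root and makes no claim that the root is unique.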

We now combine this lemma with Lemma 2 to get sufficient conditions for the existence of solitary waves or collisionless shocks with growing modes for the corresponding homogeneous problems.


Theorem 3 (Homogeneous growing modes for saddles). There are plenty of distributions $\mu_\pm$ that are $C^\infty$, strictly positive, and satisfy (5), (9), (10) and (13). That is, there are collisionless shocks and solitary wave solutions to (4) such that (11) has a growing mode.

Proof. We begin by choosing any $b > 0$ and any smooth distribution $F_1(\langle v\rangle)$ such that $F_1(1) = F_1(\langle b\rangle) = 0$, $F_1''(\langle b\rangle) > 0$, $F_1(\langle v\rangle) > 0$ for $\langle v\rangle \notin \{1, \langle b\rangle\}$, and $F_1$ decays at infinity like $O(s^{-\lambda})$. Thus $F_1$ satisfies (13). The distributions $\mu_{1+}$ and $\mu_{1-}$ are chosen to be neutral (i.e. satisfying (5)) so that $F_1 = \mu_{1+} + \mu_{1-}$. Then $H_1$ is defined by (6). Now $H_1'(0) = 0$ and
\[
H_1''(0) = \int_{-\infty}^{\infty}F_1'(\langle v\rangle)\,dv = \int_{-\infty}^{\infty}\frac{\partial_v(F_1(\langle v\rangle))}{\hat v}\,dv = Z(0+i0) = \int_{-\infty}^{\infty}\frac{F_1(\langle v\rangle)}{\hat v^2\langle v\rangle^3}\,dv > 0
\]
since $F_1(1) = 0$. This is the opposite condition from (9). Because $H_1(0) = H_1'(0) = 0$, $H_1''(0) > 0$ and $H_1(\beta) \to -\infty$ as $\beta \to \pm\infty$, it follows that $H_1(\beta)$ must vanish at some positive value $\beta = p > 0$. Thus $H_1$ satisfies (10).

In order to obtain (9), we perturb $H_1$. Choose a smooth function $g \ge 0$ with $g(0) = 1$, $g' < 0$ in $(0,1)$, and $g \equiv 0$ in $[1,\infty)$. Define $F_2(\beta) = F_1(\beta) + g(\frac{\beta-1}{\epsilon})$ for some small $\epsilon$. Let $g = g_+ + g_-$ (for instance, we may choose $g_+ = g_- = \frac12 g$). Defining $H_2$ as in (6), we have
\[
\begin{aligned}
H_2''(0) &= \int_{-\infty}^{\infty}F_2'(\langle v\rangle)\,dv = \int_{-\infty}^{\infty}F_1'(\langle v\rangle)\,dv + \frac{1}{\epsilon}\int_{-\infty}^{\infty}g'\Big(\frac{\langle v\rangle - 1}{\epsilon}\Big)\,dv \\
&= \int_{-\infty}^{\infty}F_1'(\langle v\rangle)\,dv + \frac{2}{\sqrt{\epsilon}}\int_0^1 g'(r)\,\frac{1+\epsilon r}{\sqrt{2r+\epsilon r^2}}\,dr.
\end{aligned}
\]
The last integral is negative and we have
\[
H_2''(0) < \int_{-\infty}^{\infty}F_1'(\langle v\rangle)\,dv - C\epsilon^{-1/2} < 0
\]
for sufficiently small $\epsilon$. Thus $H_2$ satisfies (9). Furthermore, by (6),
\[
H_2(\beta) = H_1(\beta) - 2\sum_\pm\Big\{\int_{1\mp\beta}^{1+\epsilon}g_\pm\Big(\frac{s-1}{\epsilon}\Big)\sqrt{(s\pm\beta)^2-1}\,ds - \int_1^{1+\epsilon}g_\pm\Big(\frac{s-1}{\epsilon}\Big)\sqrt{s^2-1}\,ds\Big\}.
\]
All four of these integrands are bounded and the intervals have length at most $2\epsilon$. Hence $H_2(\beta) = H_1(\beta) + O(\epsilon)$. Therefore, for $\epsilon$ sufficiently small, there exists $p > 0$ such that $H_2(p) > 0$. Then (10) is true for $H_2$. It is obvious that the conditions (13) are also true for $F_2$. Thus $F_2$ satisfies all the required conditions except strict positivity. We have $F_2(\beta) > 0$ for $\beta \ne \langle b\rangle$.

Finally, let $F(\beta) = F_2(\beta) + k(\beta)$ where $k(\beta) = \delta$ in $[\langle b\rangle - \delta, \langle b\rangle + \delta]$ and $k(\beta) = 0$ in $(-\infty, \langle b\rangle - 2\delta] \cup [\langle b\rangle + 2\delta, \infty)$. Also take a neutral pair $k_\pm$ such that $k = k_+ + k_-$. Then $\mu_\pm = \mu_{1\pm} + g_\pm + k_\pm$ satisfies all the required conditions for $\delta$ sufficiently small.


3. Linear Instability for Periodic BGK Waves

In this section, we shall prove the instability for the linearized Vlasov–Maxwell system around periodic BGK waves, by using a perturbation method. We formulate the linearized problem equivalently in terms of the Poisson equation and a complicated operator $C$ involving the Vlasov characteristics. Then through detailed estimates along the trajectories, we conclude that the linear operator is a nice perturbation of the homogeneous case, whereby it indeed has a growing mode. Let $\beta = \beta(x)$ be any given periodic BGK wave with period $P$. We study the linearized Vlasov–Maxwell system around the BGK wave $[\mu_\pm(\langle v\rangle \mp \beta), \beta_x]$, namely,
\[
\begin{aligned}
(\partial_t + \hat v\,\partial_x \pm \beta'\partial_v)g_\pm \pm E\,\partial_v\mu_\pm(\langle v\rangle \mp \beta) &= 0, \\
\partial_t E = -\int_{\mathbb{R}}\hat v\,[g_+ - g_-]\,dv &= -j
\end{aligned}
\tag{15}
\]
with the constraint $\partial_x E = \int_{\mathbb{R}}(g_+ - g_-)\,dv = \rho$, and the $P$-periodic boundary condition. We will consider pairs of functions $g = [g_+(x,v), g_-(x,v)]$ and triples $u = [g_+(x,v), g_-(x,v), E(x)]$.

Definition. Let $M$ be the Banach space of triples $u = [g_+(x,v), g_-(x,v), E(x)]$ of finite measures on $\mathbb{R}_P\times\mathbb{R}$, $\mathbb{R}_P\times\mathbb{R}$, and $\mathbb{R}_P$, respectively, which are periodic in $x$ with period $P$, and satisfy
\[
\begin{aligned}
\int_0^P\!\!\int_{\mathbb{R}}g_+\,dv\,dx &= \int_0^P\!\!\int_{\mathbb{R}}g_-\,dv\,dx &&\text{(neutrality)}, \\
\partial_x E &= \int_{\mathbb{R}}[g_+ - g_-]\,dv &&\text{(Poisson equation)}.
\end{aligned}
\tag{16}
\]
We denote the norm $\|u\|_m = \|g_+\|_m + \|g_-\|_m + |E|_m$, where $\|\cdot\|_m$ and $|\cdot|_m$ are the corresponding measure norms in $\mathbb{R}_P\times\mathbb{R}$ and $\mathbb{R}_P$.

Definition. We define the operator $A$ acting on pairs $g = [g_+, g_-]$ into pairs, and the operator $K$ acting on $E$ into pairs, by
\[
Ag = \begin{pmatrix}A_+(g_+)\\ A_-(g_-)\end{pmatrix} = \begin{pmatrix}\hat v\,\partial_xg_+ + \beta'\partial_vg_+\\ \hat v\,\partial_xg_- - \beta'\partial_vg_-\end{pmatrix},\qquad
KE = \begin{pmatrix}K_+E\\ K_-E\end{pmatrix} = \begin{pmatrix}\partial_v\mu_+(\langle v\rangle - \beta)E\\ -\partial_v\mu_-(\langle v\rangle + \beta)E\end{pmatrix}. \tag{17}
\]
Furthermore, we define $L$ from triples to triples by
\[
Lu = \begin{pmatrix}A_+g_+ + K_+E\\ A_-g_- + K_-E\\ \int_{\mathbb{R}}\hat v\,[g_+ - g_-]\,dv\end{pmatrix}. \tag{18}
\]
Lemma 5 (Linearized well-posedness). Let $\mu_\pm$ satisfy (5) and let $\beta$ be any solution of (4) of period $P$. If $u_0 \in M$, there is a unique solution $u(t) \in M$ of
\[
\frac{du}{dt} + Lu = 0,\qquad u(0) = u_0.
\]


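Sketch of the proof. We split the operator $L$ as
\[
L = \begin{pmatrix}A(g)\\ \int_{\mathbb{R}}\hat v\,[g_+ - g_-]\,dv\end{pmatrix} + \begin{pmatrix}K(E)\\ 0\end{pmatrix} \equiv L_1 + L_2. \tag{19}
\]
Notice that the Vlasov operator $e^{-At}$ has norm 1, and $|j|_m \le \|g_+\|_m + \|g_-\|_m$. The operator $L_1$ thus generates a strongly continuous semigroup on $M$ with
\[
\|e^{-L_1t}u_0\|_m \le C(1+t)\|u_0\|_m. \tag{20}
\]
Now $|\partial_xE|_m = \|\rho\|_m \le \|g_+\|_m + \|g_-\|_m$, so that $L_2$ is compact on $M$ and our lemma thus follows.

We introduce the characteristics $X^\pm(t;0,x_0,v_0)$ and $V^\pm(t;0,x_0,v_0)$ as the solutions of
\[
\frac{dX^\pm}{dt} = \hat V^\pm,\qquad \frac{dV^\pm}{dt} = \pm\beta'(X^\pm),\qquad X^\pm(0) = x_0,\quad V^\pm(0) = v_0.
\]
Let $L^1(\mathbb{R}_P)$ be the space of $P$-periodic integrable functions of $x$ and let $L^1(\mathbb{R}_P\times\mathbb{R})$ be the similar space of functions of $x$ and $v$ with the norms
\[
|h(\cdot)|_1 = \int_0^P|h(x)|\,dx,\qquad \|h(\cdot,\cdot)\|_1 = \int_0^P\!\!\int_{-\infty}^{\infty}|h(x,v)|\,dx\,dv.
\]
Let $W^{1,1}(\mathbb{R}_P)$ and $W^{1,1}(\mathbb{R}_P\times\mathbb{R})$ be the subspaces of $L^1(\mathbb{R}_P)$ and $L^1(\mathbb{R}_P\times\mathbb{R})$ with the norms $|h|_{1,1} = |\partial_xh|_1 + |h|_1$ and $\|h\|_{1,1} = \|\partial_xh\|_1 + \|\partial_vh\|_1 + \|h\|_1$.

Definition. For $\operatorname{Im}\omega > 0$, we define
\[
R_\pm = -\int_0^\infty e^{-sA_\pm}e^{i\omega s}K_\pm E\,ds, \tag{21}
\]
\[
\rho(x) = \int_{\mathbb{R}}[R_+(x,v) - R_-(x,v)]\,dv, \tag{22}
\]
\[
j(x) = \int_{\mathbb{R}}\hat v\,[R_+(x,v) - R_-(x,v)]\,dv, \tag{23}
\]
\[
[C(\omega,\beta)E](x) = \int_0^x\rho(y)\,dy + \frac{1}{P}\int_0^P\Big\{\frac{1}{i\omega}\,j(y) - \int_0^y\rho(z)\,dz\Big\}\,dy. \tag{24}
\]
Lemma 6. Let $\beta$ be any solution of (4) of period $P$.
(a) If $\operatorname{Im}\omega > 0$, then $C(\omega,\beta)$ is a bounded linear operator on $L^1(\mathbb{R}_P)$.
(b) Suppose that $E \in L^1(\mathbb{R}_P)$ satisfies the equation $E = C(\omega,\beta)E$ for some $\operatorname{Im}\omega > 0$. Then there exist $R_\pm(x,v) \in L^1(\mathbb{R}_P\times\mathbb{R})$ such that
\[
\hat v\,\partial_xR_\pm \pm \beta'\partial_vR_\pm \pm \partial_v\mu_\pm(\langle v\rangle \mp \beta)E(x) = i\omega R_\pm,\qquad j = i\omega E,\qquad \partial_xE = \rho. \tag{25}
\]
That is, $-i\omega$ is an eigenvalue with a positive real part of the generator $-L$.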

(c) In terms of the characteristics, we have
\[
\int_0^x \rho(y)\,dy = \int_{\mathbb{R}} K(x,x')E(x')\,dx', \tag{26}
\]
where $K = K^+ + K^-$,
\[
K^\pm = -\int_0^\infty\!\!\int_{-\infty}^{\infty} H\,\partial_v\mu_\pm(x',v')\,e^{is\omega}\,dv'\,ds,
\]
where $H = H(x - X^\pm(s;0,x',v')) - H(-X^\pm(s;0,x',v'))$ and $H(\cdot)$ is the Heaviside function.
(d) We also have
\[
\int_0^x j(y)\,dy = \int_{\mathbb{R}} J(x,x')E(x')\,dx', \qquad J = J^+ + J^-,
\]
\[
J^\pm = -\int_0^\infty\!\!\int_{\mathbb{R}} \hat V^\pm(s;0,x,v)\,H\,\partial_v\mu_\pm(x',v')\,e^{is\omega}\,dv'\,ds .
\]

The proof is similar to that of Lemma 2.1 in [GS2]. In order to analyze the operator $C$, we have to estimate along the trajectories. Consider the particle paths given by
\[
\frac{dx}{dt} = \hat v, \qquad \frac{dv}{dt} = \pm\beta'(x), \tag{27}
\]
whose solutions are $X^\pm(t;0,x',v')$, $V^\pm(t;0,x',v')$. Define the untrapped region of the '+' flows as
\[
F^+ = \{(x',v') \mid \langle v'\rangle - \beta(x') > 1 - \min\beta = a\}.
\]
In $F^+$ the trajectories go from $-\infty$ to $+\infty$. We also define the trapped region of the '+' flows as
\[
T^+ = \{(x',v') \mid \langle v'\rangle - \beta(x') \le 1 - \min\beta\},
\]
where the flows will never move out of each interval $[nP, (n+1)P]$, by our choice of $\beta$ in (8). Similarly, we define the untrapped region of the '−' flows as
\[
F^- = \{(x',v') \mid \langle v'\rangle + \beta(x') > 1 + \max\beta = b\},
\]
where the flows can go from $-\infty$ to $+\infty$, and the trapped region of the '−' flows as
\[
T^- = \{(x',v') \mid \langle v'\rangle + \beta(x') \le 1 + \max\beta\},
\]
where the flows never move out of each interval $[nP - P/2, nP + P/2]$, by (8). Let
\[
\Sigma^\pm(t,x,x') = \{v' \in \mathbb{R} \mid X^\pm(t;0,x',v') = x\}
\]
be the set of initial velocities of a particle travelling from $x'$ to $x$ in time $t$. Notice that $\Sigma^\pm$ could, inside the trapped region $T^\pm$, consist of more than one point. The flows with different initial velocities could come back to the same position at the same time, as long as the elapsed time is a common multiple of their different periods. However, in the untrapped region $\Sigma^\pm$ is a single point.

Lemma 7. (a) If $(x',v') \in F^\pm$ and $v' \in \Sigma^\pm(t,x,x')$ (both + or both −), then $\Sigma^\pm(t,x,x')$ consists of a unique point $V_\pm(t,x,x')$. Moreover
\[
V_\pm(t, x+P, x'+P) = V_\pm(t,x,x'). \tag{28}
\]


(b) $C(\omega,\beta)$ maps $P$-periodic functions to $P$-periodic functions.

Proof. For part (a), without loss of generality we may just consider the '+' part, since similar arguments apply to the '−' part. If $v' \in \Sigma^+(t,x,x')$, then
\[
\langle V^+\rangle - \beta(X^+) = \langle v'\rangle - \beta(x'), \tag{29}
\]
where $X^+ = X^+(0;t,x',v')$, $V^+ = V^+(0;t,x',v')$. Notice that $\langle v\rangle^2 - 1 = v^2$, $\hat v = \pm\sqrt{\langle v\rangle^2 - 1}/\langle v\rangle$. Therefore from the characteristic ordinary differential equations (27) and from (29), we have
\[
\frac{dX^+}{dt} = \hat V^+ = \pm\frac{[(\langle v'\rangle - \beta(x') + \beta(X^+))^2 - 1]^{1/2}}{\langle v'\rangle - \beta(x') + \beta(X^+)} .
\]
Because $v' \in \Sigma^+(t,x,x')$, we have
\[
t = \pm\int_{x'}^{x} \frac{\langle v'\rangle - \beta(x') + \beta(y)}{[(\langle v'\rangle - \beta(x') + \beta(y))^2 - 1]^{1/2}}\,dy . \tag{30}
\]

Here we let $x = X^+(t;0,x',v')$ and use + if $t > 0$ and $x > x'$. If we take $x > x'$, then $t > 0$ and $(x',v') \in F^+$. Then $v' > 0$. As we choose the plus sign in (30), $t$ is a strictly decreasing function of $\langle v'\rangle$ and a strictly increasing function of $x$. Hence $v'$ is uniquely determined and we write $v' = V_+(t,x,x')$. The proof of periodicity is similar to that in Lemma 2.2 in [GS2].

Denote the individual particle energies by
\[
W_\pm(t,x,x') = \langle V_\pm(t,x,x')\rangle \mp \beta(x'), \qquad W_0(t,x,x') = [1 - (x-x')^2/t^2]^{-1/2} = \langle V_0\rangle,
\]
where $V_\pm(t,x,x')$ are defined in (28) within the free regions and $\hat V_0 = (x-x')/t$. Let $X^\pm(\tau;0,x',v')$ be the trajectories in (27) and $X^0(\tau;0,x',v') = x' + \tau\hat v'$, $V^0(\tau;0,x',v') = v'$ be the unperturbed trajectories (straight lines). Recall the definition of $k^\pm$ and also define
\[
k_0^\pm(x,x') = \mp\int_0^\infty \Big[\int_{\mathbb{R}} \delta(x - x' - \hat v'\tau)\,\partial_{v'}\mu_\pm(\langle v'\rangle)\,dv'\Big] e^{i\omega\tau}\,d\tau
= \mp\int_0^\infty \mu_\pm'(\langle V_0\rangle)\,\hat V_0\,\langle V_0\rangle^3\,\tau^{-1} e^{i\omega\tau}\,d\tau .
\]
Recalling the definitions (17), (21), (22), (23), we similarly define
\[
A^0 = \hat v\,\partial_x, \qquad K^0E = \begin{pmatrix} K^0_+E \\ K^0_-E \end{pmatrix} = \begin{pmatrix} \partial_v\mu_+(\langle v\rangle)E \\ -\partial_v\mu_-(\langle v\rangle)E \end{pmatrix},
\]
\[
R^0_\pm = -\int_0^\infty e^{-\tau A^0} e^{i\omega\tau} K^0_\pm E\,d\tau, \qquad
\rho^0 = \int_{\mathbb{R}} (R^0_+ - R^0_-)\,dv, \qquad j^0 = \int_{\mathbb{R}} \hat v\,(R^0_+ - R^0_-)\,dv .
\]
Our key estimates are the following.

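Lemma 8. Let $\mu_\pm$ satisfy (5) and let $\beta$ be any solution of (4) of period $P$. Let $\operatorname{Im}\omega > 0$. Assume for some $\gamma > 1$ that
\[
|\mu_\pm'(\theta)| \le \frac{C}{1 + |\theta|^\gamma}, \qquad |\mu_\pm''(\theta)| \le \frac{C}{1 + |\theta|^{\gamma+1}} . \tag{31}
\]
With $\rho$ and $j$ defined by (22), (23), we have
\[
|\rho - \rho^0|_1 + |j - j^0|_1 \le C\|\beta\|^{1/2}|E|_1
\]
for all $E \in L^1(\mathbb{R}_P)$, where $\|\beta\| = |\beta|_{C^1}$.

Lemma 9. Under the same conditions,
\[
\int_0^P \Big| \int_{\mathbb{R}}\int_0^\infty\!\!\int_{\mathbb{R}} [\delta(x - X^\pm)\,\partial_v\mu_\pm(x',v') - \delta(x - X^0)\,\partial_v\mu_\pm(\langle v'\rangle)]\, e^{-\operatorname{Im}\omega\,\tau} E(x')\,dv'\,d\tau\,dx' \Big|\,dx \le C\|\beta\|^{1/2}|E|_1 \tag{32}
\]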

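for all $E \in L^1(\mathbb{R}_P)$, $X^\pm = X^\pm(\tau;0,x',v')$ and $X^0 = X^0(\tau;0,x',v')$.

Remark. It is easy to estimate the integral in Lemma 9 by $C|E|_1$, but we will require the small constant $\|\beta\|$. We illustrate our technique by estimating the free part of the integral as
\[
\begin{aligned}
&\int_{\mathbb{R}}\int_0^P\!\!\int_0^\infty\!\!\int_{\mathbb{R}} |\delta(x - X^0)\,\partial_v\mu_\pm(\langle v'\rangle)|\, e^{-\operatorname{Im}\omega\,\tau} |E(x')|\,dv'\,d\tau\,dx\,dx' \\
&\quad= \int_0^\infty\!\!\int_0^P\!\!\int_{\mathbb{R}}\int_{\mathbb{R}} e^{-\operatorname{Im}\omega\,\tau} |E(x')|\,\delta(x - x' - \hat v'\tau)\,|\mu_+'(\langle v'\rangle)\hat v'|\,dv'\,dx'\,dx\,d\tau \\
&\quad= \int_0^\infty\!\!\int_0^P\!\!\int_{x-\tau}^{x+\tau} e^{-\operatorname{Im}\omega\,\tau} |E(x')\,\mu_+'(\langle V_0\rangle)\hat V_0|\,\langle V_0\rangle^3\,\tau^{-1}\,dx'\,dx\,d\tau,
\end{aligned} \tag{33}
\]
where we have integrated $v'$ first, and computed $\partial_{v'}[x - x' - \hat v'\tau] = -\tau\langle v'\rangle^{-3}$. We notice that
\[
\int_{x'-\tau}^{x'+\tau} \frac{\langle V_0\rangle^{3-\gamma}}{\tau}\,dx
= \int_{x'-\tau}^{x'+\tau} \Big(1 - \Big|\frac{x - x'}{\tau}\Big|^2\Big)^{\frac{\gamma-3}{2}} \tau^{-1}\,dx
= \int_{-1}^{1} (1 - y^2)^{\frac{\gamma-3}{2}}\,dy < \infty
\]
since $\gamma > 1$. Thus the free part is bounded by
\[
\begin{aligned}
&\int_0^\infty e^{-\operatorname{Im}\omega\,\tau} \int_0^P\!\!\int_{x-\tau}^{x+\tau} |E(x')|\,\langle V_0\rangle^{3-\gamma}\,\tau^{-1}\,dx'\,dx\,d\tau \\
&\quad= C\int_0^\infty e^{-\operatorname{Im}\omega\,\tau} \int_{-\tau}^{P+\tau} |E(x')|\,dx'\,d\tau
\le C\int_0^\infty e^{-\operatorname{Im}\omega\,\tau}(\tau + 1)\,d\tau\,|E|_1 \le C|E|_1,
\end{aligned} \tag{34}
\]
where we have used the fact that $\int_{-\tau}^{P+\tau} |E(x')|\,dx' \le C(\tau + 1)|E|_1$.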
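Two elementary facts used in the remark can be checked numerically. The sketch below is only an illustration, not part of the argument: it assumes the standard relativistic convention $\langle v\rangle = \sqrt{1+v^2}$, $\hat v = v/\langle v\rangle$, and the function names are hypothetical. It compares a finite-difference derivative of $\hat v$ with $\langle v\rangle^{-3}$, and evaluates $\int_{-1}^{1}(1-y^2)^{(\gamma-3)/2}dy$ through the Beta function $B(1/2,(\gamma-1)/2)$, which is finite exactly when $\gamma > 1$.

```python
import math


def bracket(v):
    """Relativistic energy <v> = sqrt(1 + v^2) (assumed convention)."""
    return math.sqrt(1.0 + v * v)


def v_hat(v):
    """Relativistic velocity v_hat = v / <v>; its modulus is always below 1."""
    return v / bracket(v)


def dv_hat_numeric(v, eps=1e-6):
    """Central finite-difference approximation of d(v_hat)/dv."""
    return (v_hat(v + eps) - v_hat(v - eps)) / (2.0 * eps)


def beta_integral(gamma):
    """Closed form of int_{-1}^{1} (1 - y^2)^((gamma - 3)/2) dy.

    The integral equals the Beta function B(1/2, (gamma - 1)/2) and
    diverges for gamma <= 1.
    """
    if gamma <= 1.0:
        raise ValueError("the integral diverges for gamma <= 1")
    return (math.sqrt(math.pi) * math.gamma((gamma - 1.0) / 2.0)
            / math.gamma(gamma / 2.0))


def midpoint_integral(gamma, n=20000):
    """Midpoint-rule check after the substitution y = sin(theta).

    With y = sin(theta) the integrand becomes cos(theta)^(gamma - 2).
    """
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi / 2.0 + (k + 0.5) * h
        total += math.cos(theta) ** (gamma - 2.0)
    return total * h


if __name__ == "__main__":
    for gamma in (1.1, 2.0, 3.0, 5.0):
        print(gamma, beta_integral(gamma))
    print(dv_hat_numeric(0.7), bracket(0.7) ** -3)
```

The closed form grows without bound as $\gamma \to 1^+$, which is why the hypothesis $\gamma > 1$ in (31) cannot be dropped in this estimate.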


The proofs of Lemma 8 and Lemma 9 may be found in [GS3]. Now we are ready for our main theorem about the linear operator $C$. Recall the definition of $C(\omega,\beta)$ in (21). We shall write $P = 2P_\beta$. We define $C(\omega,0)$ from $L^1(\mathbb{R}_P)$ into itself by
\[
C(\omega,0)E(x) = \int_0^x \rho^0(y)\,dy + \frac{1}{i\omega P}\int_0^P j^0(y)\,dy - \frac{1}{P}\int_0^P\!\!\int_0^z \rho^0(y)\,dy\,dz . \tag{35}
\]

Theorem 4 (Growing mode for periodic BGK equilibria). Let $P = 2P_\beta$, $\operatorname{Im}\omega > 0$ and $\gamma > 1$, and let $\mu_\pm$ satisfy (5), (7) and (31). Then
(a) $C(\omega,\beta)$ and $C(\omega,0)$ are compact operators from $L^1(\mathbb{R}_P)$ to $L^1(\mathbb{R}_P)$ such that
\[
\|C(\omega,\beta) - C(\omega,0)\|_{L^1(\mathbb{R}_P)\to L^1(\mathbb{R}_P)} \le C\|\beta\|_{C^1}^{1/2},
\]
where $C(\omega,0)$ is the unperturbed linearized operator. The constant $C$ is uniform for $\operatorname{Im}\omega > \text{constant} > 0$.
(b) $C(\omega,\beta)$ is analytic in $\omega$ for $\operatorname{Im}\omega > 0$.
(c) There exists $\eta > 0$ such that if $\|\beta\|_{C^1} < \eta$, there exists a growing mode $[g^\pm, E]$ with period $P$ for the linearized Vlasov–Maxwell system (15) around $[\mu_\pm(\langle v\rangle \mp \beta(x)), \beta']$.

The proof is similar to that of Theorem 2.4 of [GS2] and is found in [GS3].

4. Properties of Periodic Eigenfunctions

By Theorem 4 (c) and Vidav's Lemma [V], we have

Lemma 10 (Linear Vlasov–Maxwell). Let $\mu_\pm$ satisfy (5) and let $\beta$ be any solution of (4) of period $P$. Then for all $\delta > 0$, the spectrum of $-L$ in $\{\operatorname{Re}\lambda > \delta\}$ consists of a finite number of eigenvalues of finite multiplicity. If $\lambda_1$ denotes an eigenvalue with maximal real part, and $\Lambda > \max\{0, \operatorname{Re}\lambda_1\}$, then there exists $C_\Lambda > 0$ such that
\[
\|e^{-tL}u_0\|_m \le C_\Lambda e^{\Lambda t}\|u_0\|_m .
\]

Lemma 11 (Regularity of eigenfunctions). Let $\mu_\pm$ satisfy (5) and (31) and let $\beta$ be any solution of (4) of period $P$. Let $\lambda$ be any eigenvalue of $-L$ with $\operatorname{Re}\lambda > 0$ and $[R_\pm, E_0]$ its eigenfunction triple. Assume $\|\beta\|_{C^2} < (\operatorname{Re}\lambda)^2$. Then $R_\pm \in W^{1,1}(\mathbb{R}_P \times \mathbb{R})$ and there exists a constant $C$ depending only on $\lambda$ and $\mu_\pm$ such that
\[
|E_0|_{1,1} + \|R\|_{1,1} \le C\|R\|_1 .
\]

Proof. We begin with $u = [R_\pm, E_0] \in M$. We first claim that
\[
R = -\int_0^\infty e^{-tA} e^{-\lambda t} K(E_0)\,dt . \tag{36}
\]

In order to prove this, notice that $g(t) = e^{\lambda t}R$ and $E(t) = e^{\lambda t}E_0$ satisfy $\partial_t g + Ag = -KE$. Hence
\[
e^{\lambda t}R = e^{-(t-s)A} e^{\lambda s}R - \int_s^t e^{-(t-\tau)A} K e^{\lambda\tau}E_0\,d\tau .
\]
Letting $s \to -\infty$, we get
\[
e^{\lambda t}R = -\int_{-\infty}^t e^{-(t-\tau)A} KE_0\,e^{\lambda\tau}\,d\tau = -\int_0^\infty e^{-\tau A} KE_0\,e^{\lambda(t-\tau)}\,d\tau,
\]

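which is the same as (36). The integral converges because $\operatorname{Re}\lambda > 0$. Since $[R_\pm, E_0] \in M$, we have $\partial_x E_0 = \rho = \int_{\mathbb{R}} (R_+ - R_-)\,dv$ so that $E_0 \in L^1(\mathbb{R}_P)$ and $K(E_0) \in L^1(\mathbb{R}_P \times \mathbb{R})$. So (36) implies that $R \in L^1(\mathbb{R}_P \times \mathbb{R})$. Hence $\rho \in L^1(\mathbb{R}_P)$ and $|E_0|_{1,1} \le C\|R\|_1$.

Next we let $h(t) = \exp(-tA)(KE_0)$. Thus $(\partial_t + A)h = 0$, $h(0) = KE_0$. Differentiating this equation with respect to $x$, we get
\[
(\partial_t + A)(\partial_x h) = [A, \partial_x]h = \begin{pmatrix} -\beta'' & 0 \\ 0 & \beta'' \end{pmatrix}\partial_v h, \qquad \partial_x h(0) = \partial_x KE_0,
\]
where $[A, \partial_x]$ is the commutator. Hence
\[
\partial_x h(t) = e^{-tA}\partial_x KE_0 + \int_0^t e^{-(t-\tau)A} \begin{pmatrix} -\beta'' & 0 \\ 0 & \beta'' \end{pmatrix}\partial_v h(\tau)\,d\tau .
\]
Similarly, $\partial_v h(t) = e^{-tA}\partial_v KE_0 - \int_0^t e^{-(t-\tau)A}\langle v\rangle^{-3}\partial_x h(\tau)\,d\tau$. Taking $L^1$-norms, we get
\[
\|\partial_x h(t)\|_1 + \|\partial_v h(t)\|_1 \le (\|\partial_x(KE_0)\|_1 + \|\partial_v(KE_0)\|_1)\,e^{t\|\beta\|^{1/2}} .
\]
We put this estimate into the integrand of (36) to get
\[
\|\partial_x R\|_1 + \|\partial_v R\|_1 \le \Big(\int_0^\infty e^{[\|\beta\|^{1/2} - \lambda]t}\,dt\Big)(\|\partial_x KE_0\|_1 + \|\partial_v KE_0\|_1).
\]
By the definition (17) of $K$, the decay condition (31) on $\mu_\pm$, and the boundedness of $\beta$, we deduce that $KE_0 \in W^{1,1}$, and we have the desired estimate.

Lemma 12. Let $u$ be an eigenvector of $-L$ with eigenvalue $\lambda$, $\operatorname{Re}\lambda > 0$. If $\lambda$ is not real, then there is a constant $\zeta > 0$ such that for all $t > 0$,
\[
\|e^{-Lt}(\operatorname{Im}u)\|_1 \ge \zeta e^{\operatorname{Re}\lambda\,t}\|\operatorname{Im}u\|_1 > 0 .
\]

Proof. We prove it by contradiction. Notice that
\[
e^{-Lt}(\operatorname{Im}u) = \operatorname{Im}(e^{-Lt}u) = e^{\operatorname{Re}\lambda\,t}(\sin[\operatorname{Im}\lambda\,t]\operatorname{Re}u + \cos[\operatorname{Im}\lambda\,t]\operatorname{Im}u).
\]
If the lemma were false, by passing to a convergent subsequence of $\sin[\operatorname{Im}\lambda\,t_n]$ and $\cos[\operatorname{Im}\lambda\,t_n]$, with $n \to \infty$, we would have $a\operatorname{Im}u + b\operatorname{Re}u = 0$, with $a^2 + b^2 = 1$. Therefore either $\operatorname{Im}u$ or $\operatorname{Re}u$ would be a real eigenvector and $\lambda$ would be real, a contradiction.

Lemma 13 (Pointwise estimate of eigenfunctions). Let $\mu_\pm$ satisfy (5) and (31) and let $\beta$ be any solution of (4) of period $P$. Let $[R_+, R_-, E_0]$ be an eigenvector with $\|R\|_{1,1} + |E_0|_1 = 1$ with eigenvalue $\lambda$ satisfying $\operatorname{Re}\lambda > 0$. Let $h(v)$ satisfy $|h'| \le C_1 h$ for some constant $C_1$. If $|\mu_\pm'(\langle v\rangle \mp \beta)| \le C_2 h(|v|)\mu_\pm(\langle v\rangle \mp \beta)$ and $|\beta|_{C^1}$ is sufficiently small, then
\[
|R_\pm(x,v)| \le C_3 h(|v|)\mu_\pm(\langle v\rangle \mp \beta),
\]
where $C_3$ depends only on $C_1$, $C_2$, $\operatorname{Re}\lambda$ and $|\beta'|_\infty$.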


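Proof. Omit the subscripts ±. The eigenfunction satisfies
\[
[\hat v\,\partial_x \pm \beta'(x)\partial_v]R \pm E_0\,\partial_v\mu = -\lambda R,
\]
where $\mu = \mu_\pm(\langle v\rangle \mp \beta(x))$. Then $S = (h\mu)^{-1}R$ satisfies
\[
[\hat v\,\partial_x \pm \beta'(x)\partial_v]S \pm E_0\frac{\partial_v\mu}{h\mu} \pm \beta'\frac{\partial_v h}{h}S = -\lambda S,
\]
since $\hat v\,\partial_x\mu_\pm(x,v) \pm \beta'\partial_v\mu_\pm(x,v) = 0$. As in (21), this may be written as
\[
S = \mp\int_0^\infty e^{-tA} e^{-\lambda t}\Big[E_0\frac{\partial_v\mu}{h\mu} + \beta'\frac{\partial_v h}{h}S\Big]\,dt .
\]
Since $\exp(-tA)$ has norm one on $L^\infty$, for $\operatorname{Re}\lambda > 0$ we have
\[
\|S\|_\infty \le \Big[|E_0|_\infty\Big\|\frac{\partial_v\mu}{h\mu}\Big\|_\infty + |\beta'|_\infty\Big\|\frac{\partial_v h}{h}\Big\|_\infty\|S\|_\infty\Big](\operatorname{Re}\lambda)^{-1}
\le [C_2|E_0|_\infty + C_1|\beta'|_\infty\|S\|_\infty](\operatorname{Re}\lambda)^{-1} .
\]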

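Since $|E_0|_\infty \le |E_0|_{1,1} \le C\|R\|_1$, the lemma thus follows if $|\beta'|_\infty$ is small.

The following lemma gives an improved bound for a cutoff eigenfunction.

Lemma 14 (Approximate eigenfunctions). Let $[R_\pm, E_0]$ and $\beta$ be as in the preceding lemma. Let $h(s)$ be either $\langle s\rangle^\sigma$ or $\exp(l|s|)$, for some $\sigma > 0$ or $l > 0$. Assume that for sufficiently large $s$, we have
\[
|\mu_\pm'(s)| \le C_4 h(s)\mu_\pm(s) \quad\text{and}\quad \mu_\pm(s) \le C_5 h'(s)[h(s)]^{-(2+m_0)} \tag{37}
\]
for some $m_0 > 0$. Then there exists $\delta_0 > 0$ such that for $0 < \delta < \delta_0$, there exist approximate eigenfunctions $R^\delta_\pm \in L^1(\mathbb{R}_P \times \mathbb{R})$, $E^\delta_0 \in L^1(\mathbb{R}_P)$ and $\langle v\rangle^\gamma R^\delta_\pm \in L^\infty(\mathbb{R}_P \times \mathbb{R})$, such that all of the following hold:
\[
\delta|R^\delta_\pm(x,v)| \le \mu_\pm(\langle v\rangle \mp \beta(x)), \tag{38}
\]
\[
\|R^\delta_\pm - R_\pm\|_1 + |E^\delta_0 - E_0|_1 \le \delta^m, \tag{39}
\]
\[
\int_0^P\!\!\int_{\mathbb{R}} (R^\delta_+ - R^\delta_-)\,dv\,dx = 0, \tag{40}
\]
\[
\partial_x E^\delta_0 = \int_{\mathbb{R}} (R^\delta_+ - R^\delta_-)\,dv, \tag{41}
\]
\[
\|R^\delta\|_{1,1} \le C\|R\|_1, \tag{42}
\]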

where $0 < m < m_0$.

Proof. We prove this lemma in two steps.

Cut-off approximation. Let $\eta(v)$ be a smooth cutoff function, $\eta(v) = 1$ for $|v| \le w$, $\eta(v) = 0$ for $|v| \ge w + 1$, with $w$ to be chosen later. By Lemma 13,
\[
|\eta(v)R_\pm(x,v)| \le |R_\pm(x,v)| \le C_3 h(|v|)\mu_\pm(\langle v\rangle \mp \beta(x)).
\]
Define $w$ by the equation $\delta = [2C_3 h(w+1)]^{-1}$. Then $\delta\eta(v)|R_\pm(x,v)| \le \tfrac12\mu_\pm(\langle v\rangle \mp \beta(x))$ since $h$ is an increasing function of $|v|$. Now from (37) we have $\mu_\pm(s) = o[h'(s)(h(s))^{-(2+m)}]$ as $s \to \infty$. Hence
\[
\int_0^P \mu_\pm(\langle v\rangle \mp \beta(x))\,dx = o\Big\{\frac{h'(v+1)}{[h(v)]^{2+m}}\Big\}
\]
as $|v| \to \infty$. Integrating this inequality, we get
\[
\int_{|v|\ge w}\int_0^P h(v)\mu_\pm(\langle v\rangle \mp \beta(x))\,dx\,dv = o\Big\{\int_{|v|\ge w}\int_0^P \frac{h'(v+1)}{[h(v)]^{1+m}}\,dx\,dv\Big\} \le \delta^m
\]
for sufficiently small $\delta$, by the definition of $w$. Hence
\[
\int_{-\infty}^{\infty}\!\!\int_0^P |\eta R_\pm - R_\pm|\,dx\,dv \le \int_{|v|\ge w}\int_0^P |R_\pm|\,dx\,dv \le C\int_{|v|\ge w}\int_0^P h(v)\mu_\pm(\langle v\rangle \mp \beta)\,dx\,dv \le C_6\delta^m
\]
for sufficiently large $w$. Reducing $m$ slightly eliminates the constant $C_6$.

Neutrality and Poisson conditions. We now further perturb the cut-off eigenfunctions. Let $0 \le Q(v) \in C^1_0(-1,1)$, $P\int_{-1}^{1} Q(v)\,dv = 1$. By Step 1, we define for every $\delta > 0$,
\[
R^\delta_+ = \eta R_+ + aQ, \qquad R^\delta_- = \eta R_-,
\]
where $a$ is a complex number satisfying (40),
\[
\int_0^P\!\!\int_{\mathbb{R}} (R^\delta_+ - R^\delta_-)\,dx\,dv = a + \int_0^P\!\!\int_{\mathbb{R}} \eta(R_+ - R_-)\,dv\,dx = 0 .
\]
By Step 1 and the neutrality condition (16),
\[
|a| = \Big|\int\!\!\int (1 - \eta)(R_+ - R_-)\,dv\,dx\Big| \le \delta^m .
\]
We also deduce (38) from Step 1 because $\delta$ is sufficiently small and $\mu_+ > 0$. By an easy calculation and the bound on $a$, $\|R^\delta_\pm\|_{1,1} \le C\delta^m + C\|R_\pm\|_{1,1} \le C'\|R\|_1$. The last inequality follows from Lemma 11 and the normalization. We finally define $E^\delta_0$ to satisfy
\[
\partial_x E^\delta_0 = \int_{\mathbb{R}} (R^\delta_+ - R^\delta_-)\,dv
\]
with the same average as $E_0$: $\int_0^P E^\delta_0\,dx = \int_0^P E_0\,dx$. Hence
\[
|E^\delta_0 - E_0|_1 \le C|\partial_x[E^\delta_0 - E_0]|_1 \le C\delta^m .
\]
We deduce (39) and (42) for small $\delta$ and the lemma follows.

Remark. Condition (37) is very general. It allows $\mu_\pm$ to go to zero at a polynomial, exponential or even super-exponential rate, but it excludes $\mu_\pm$ of compact support. An example is $\mu(s) = \exp[-s^\alpha]$ with $\alpha \ge 1$ and $h(s) = s^{\alpha-1}$. Another example is $\mu(s) = \exp[-\exp s]$ and $h(s) = \exp s$.
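The two inequalities in (37) can be spot-checked numerically for the examples in the remark. The sketch below is an illustration under stated assumptions, not a proof: the helper name `satisfies_37`, the sample grid, and the constants $C_4$, $C_5$, $m_0$ are hypothetical choices, and the first example is taken with $\alpha = 2$, so that $h(s) = s$ and $h'(s) = 1$.

```python
import math


def satisfies_37(mu, dmu, h, dh, c4, c5, m0, s_values):
    """Return True if both inequalities of (37) hold at every sample point.

    First inequality:  |mu'(s)| <= C4 * h(s) * mu(s)
    Second inequality: mu(s)    <= C5 * h'(s) * h(s)^(-(2 + m0))
    """
    tol = 1.0 + 1e-12  # relative slack for cases of exact equality
    for s in s_values:
        first = abs(dmu(s)) <= c4 * h(s) * mu(s) * tol
        second = mu(s) <= c5 * dh(s) * h(s) ** (-(2.0 + m0)) * tol
        if not (first and second):
            return False
    return True


# Hypothetical sample grid: s in [2, 4].
samples = [2.0 + 0.25 * k for k in range(9)]

# Example 1 (alpha = 2): mu(s) = exp(-s^2), h(s) = s.
gaussian_ok = satisfies_37(
    mu=lambda s: math.exp(-s * s),
    dmu=lambda s: -2.0 * s * math.exp(-s * s),
    h=lambda s: s,
    dh=lambda s: 1.0,
    c4=2.0, c5=1.0, m0=1.0, s_values=samples)

# Example 2: mu(s) = exp(-exp(s)), h(s) = exp(s).
double_exp_ok = satisfies_37(
    mu=lambda s: math.exp(-math.exp(s)),
    dmu=lambda s: -math.exp(s) * math.exp(-math.exp(s)),
    h=lambda s: math.exp(s),
    dh=lambda s: math.exp(s),
    c4=1.0, c5=1.0, m0=1.0, s_values=samples)

if __name__ == "__main__":
    print("Gaussian example satisfies (37) on the grid:", gaussian_ok)
    print("Double-exponential example satisfies (37) on the grid:", double_exp_ok)
```

In both examples the first inequality holds with equality for the chosen $C_4$, which is why a small relative tolerance is used in the comparison.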


5. Nonlinear Instability of Periodic BGK Waves

We begin by stating the uniqueness and existence of BV solutions of the nonlinear relativistic Vlasov–Maxwell system with periodic boundary conditions. The proof is standard. We define BV as the space of integrable functions of bounded variation.

Lemma 15. Let $P > 0$ and $\gamma > 1$. Let initial data $[f^0_+, f^0_-, E^0]$ be given such that
\[
0 \le f^0_\pm \in BV(\mathbb{R}_P \times \mathbb{R}), \qquad E^0 \in L^1(\mathbb{R}_P), \qquad \langle v\rangle^\gamma f^0_\pm \in L^\infty(\mathbb{R}_P \times \mathbb{R}),
\]
\[
\int_0^P\!\!\int_{\mathbb{R}} [f^0_+ - f^0_-]\,dv\,dx = 0, \qquad \partial_x E^0 = \int_{\mathbb{R}} [f^0_+ - f^0_-]\,dv .
\]
Then there exists a unique solution of (RVM) with initial data $(f^0_\pm, E^0)$ of period $P$ in $x$, such that $0 \le f_\pm \in L^\infty_{loc}(\mathbb{R}; BV(\mathbb{R}_P \times \mathbb{R}))$, $\langle v\rangle^\gamma f_\pm$ is bounded for bounded time, and $E \in L^\infty_{loc}(\mathbb{R}; W^{1,\infty}(\mathbb{R}_P))$.

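Let us abbreviate $f = [f_+, f_-]$, $\mu_\beta = [\mu_+(\langle v\rangle - \beta), \mu_-(\langle v\rangle + \beta)]$, $u = [f_+, f_-, E]$, $\nu_\beta = [\mu_+(\langle v\rangle - \beta), \mu_-(\langle v\rangle + \beta), \partial_x\beta]$. We define the norm
\[
\|u\|_X = \int_0^P\!\!\int_{\mathbb{R}} (|f_+| + |f_-|)\,dx\,dv + \int_0^P |E|\,dx + \sup_{-\infty\le x\le\infty}\Big|\int_{\mathbb{R}} [f_+ - f_-]\,dv\Big|
\equiv \|u\|_1 + \sup_{-\infty\le x\le\infty}\Big|\int_{\mathbb{R}} [f_+ - f_-]\,dv\Big| . \tag{43}
\]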

Our goal is to show that the $2P_\beta$-periodic BGK equilibrium $\nu_\beta$ is nonlinearly unstable under $\|\cdot\|_X$ with $P = 2P_\beta$. As in the proof of Lemma 3.3 in [GS2], we deduce

Lemma 16. Let $\mu_\pm$ satisfy (5), (7), (31) and (37). Let $[f_+, f_-, E]$ be a BV solution of the nonlinear Vlasov–Maxwell system as in Lemma 15. Let $T > 0$ and $\omega > 0$. Assume that
\[
\|f(t) - \mu_\beta\|_1 \le C_0 e^{\omega t}\|u(0) - \nu_\beta\|_1 \tag{44}
\]
in $[0,T]$. If $\sup_{0\le t\le T}(\|u(t) - \nu_\beta\|_X + \|\beta\|_{C^2}) < \min(\omega^2, 1)$, then there is a constant $C$ (independent of $t$ and $T$ and the initial data) such that
\[
\|\partial_v[f(t) - \mu_\beta]\|_m \le Ce^{\omega t}[\|f(0) - \mu_\beta\|_{BV} + |E(0) - \partial_x\beta|_1] \tag{45}
\]
in $[0,T]$. Here $m$ denotes the measure norm.

We now are ready to prove the nonlinear instability of periodic BGK waves.

Proof of Theorem 2. We are given non-negative $\mu_\pm$ that satisfy (5), (7), (31) and (37). Furthermore, $\beta$ is a solution of (4) of period $P_\beta$ with $\|\beta\|_{C^2}$ sufficiently small as in Lemma 1 and $\nu_\beta = [\mu_\beta, \beta_x]$. We must find a family of solutions $u^\delta(t) = [f^\delta_\pm(t), E^\delta(t)]$ of the nonlinear Vlasov–Maxwell system satisfying the conclusions of Lemma 15, such that
\[
\|f^\delta(0) - \mu_\beta\|_{1,1} + |E^\delta(0) - \beta_x|_{1,1} \le \delta, \qquad \sup_{0\le t}\|u^\delta(t) - \nu_\beta\|_X \ge \epsilon_0 > 0 \tag{46}
\]

with $\|\cdot\|_X$ defined by (43). By Lemma 1, the BGK equilibria exist because of (5) and (7). By Theorem 4 and Lemma 11 and because of (31), we may choose $\Xi = [R_+, R_-, E_0]$ to be an eigenvector of $-L$ satisfying (25) with $\|R\|_{1,1} + |E_0|_1 = 1$ such that its eigenvalue $\lambda$ has the largest (positive) real part. If $\lambda$ is not real, then $\|\operatorname{Im}\Xi\|_1 \equiv r > 0$ by Lemma 12. We choose an approximation $\Xi^\delta = [\operatorname{Im}R^\delta_\pm, \operatorname{Im}E^\delta_0]$ to the imaginary part of $\Xi$ by Lemma 14. In case $\lambda$ is real we simply do not take the imaginary parts; but without loss of generality, we will assume $\lambda$ is not real.

We choose the family of solutions $u^\delta(t,x,v) = [f^\delta_\pm(t,x,v), E^\delta(t,x)]$ by specifying the initial data $u^\delta(0,x,v) = \nu_\beta + \delta\Xi^\delta$. That is,
\[
f^\delta_\pm(0,x,v) = \mu_\pm(\langle v\rangle \mp \beta(x)) + \delta\operatorname{Im}R^\delta_\pm(x,v), \qquad E^\delta(0,x) = \beta_x(x) + \delta\operatorname{Im}E^\delta_0(x).
\]
Because of (38), $f^\delta_\pm(0,x,v) \ge 0$ for all $x$, $v$ and for all sufficiently small $\delta$. Because of Lemma 14, all of the conditions of Lemma 15 are satisfied. Note that
\[
|\|u(0) - \nu_\beta\|_1 - \delta r| = \delta(\|\Xi^\delta\|_1 - r) \le \delta\|\Xi - \Xi^\delta\|_1 \le \delta^{m+1} \le \delta r/2 \tag{47}
\]
by (39) for $\delta$ sufficiently small. Hence by (42),
\[
\|f(0) - \mu_\beta\|_{1,1} + |E(0) - \beta_x|_1 = \delta\|\operatorname{Im}R^\delta\|_{1,1} + \delta|\operatorname{Im}E^\delta_0|_1 \le C\delta r . \tag{48}
\]

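Let $u^\delta(t) = u(t) = [f_+(t), f_-(t), E(t)]$ denote the solution, where we drop the superscript $\delta$. By the nonlinear Vlasov–Maxwell system
\[
u(t) - \nu_\beta = \delta e^{-Lt}\Xi^\delta + \int_0^t e^{-L(t-\tau)} \begin{pmatrix} \mp(E(\tau) - \beta_x)\,\partial_v(f_\pm(\tau) - \mu_\pm) \\ 0 \end{pmatrix} d\tau . \tag{49}
\]
We choose $\Lambda$ such that
\[
\operatorname{Re}\lambda < \Lambda < (1 + m)\operatorname{Re}\lambda . \tag{50}
\]
Let
\[
T^\delta = \frac{1}{\Lambda - \operatorname{Re}\lambda}\Big|\ln\frac{\zeta r}{2C_\Lambda\delta^m}\Big|, \tag{51}
\]
where $C_\Lambda$ is the constant in Lemma 10 and $\zeta$ is the constant in Lemma 12. Let
\[
T = \sup\Big\{s : \|u(t) - \nu_\beta - \delta e^{-Lt}\Xi^\delta\|_1 \le \frac{\zeta}{4}\,\delta e^{\operatorname{Re}\lambda\,t}r, \ \text{for } 0 \le t \le s\Big\}. \tag{52}
\]
For $0 \le t \le \min\{T^\delta, T\}$, from Lemma 10, (39) and (51),
\[
\|e^{-Lt}(\operatorname{Im}\Xi - \Xi^\delta)\|_1 \le C_\Lambda e^{\Lambda t}\delta^m \le \tfrac12\zeta r e^{\operatorname{Re}\lambda\,t} .
\]
Hence by (52) for such $t$
\[
\begin{aligned}
\|u(t) - \nu_\beta\|_1 &\le \delta e^{\operatorname{Re}\lambda\,t}\|\operatorname{Im}\Xi\|_1 + \delta\|e^{-Lt}(\operatorname{Im}\Xi - \Xi^\delta)\|_1 + \frac{\zeta}{4}\,\delta e^{\operatorname{Re}\lambda\,t}r \\
&\le (1 + 3\zeta/4)\,\delta e^{\operatorname{Re}\lambda\,t}r \le (2 + 3\zeta/2)\,e^{\operatorname{Re}\lambda\,t}\|u(0) - \nu_\beta\|_1
\end{aligned} \tag{53}
\]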


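by (47). Hence for such $t$, from $\partial_x(E - \beta_x) = \rho - \beta_{xx} = \int_{\mathbb{R}} [(f_+ - \mu_+) - (f_- - \mu_-)]\,dv$, we deduce
\[
|E(t) - \beta_x|_\infty \le \frac{1}{P}|E(t) - \beta_x|_1 + \sum_\pm \|f_\pm(t) - \mu_\pm\|_1 \le C\|u(t) - \nu_\beta\|_1 \le C\delta e^{t\operatorname{Re}\lambda} . \tag{54}
\]
Let $\epsilon$ be small enough that $\epsilon + \|\beta\|_{C^2} < \min\{1, (\operatorname{Re}\lambda)^2\}$ and let
\[
T^* = \sup\{s : \|u(t) - \nu_\beta\|_X \le \epsilon, \ \text{for } 0 \le t \le s\} \le \infty . \tag{55}
\]
By Lemma 16 with $\omega = \operatorname{Re}\lambda$ and $C_0 = 2 + 3\zeta/2$, we have for $0 \le t \le \min\{T, T^*, T^\delta\}$,
\[
\|\partial_v[f(t) - \mu_\beta]\|_m \le Ce^{\operatorname{Re}\lambda\,t}(\|f(0) - \mu_\beta\|_{1,1} + |E(0) - \beta_x|_1) \le C\delta e^{\operatorname{Re}\lambda\,t}
\]
by (48). Hence for such $t$, by Lemma 10 and (49) and (54),
\[
\begin{aligned}
\|u(t) - \nu_\beta - \delta e^{-Lt}\Xi^\delta\|_1
&\le C\int_0^t e^{\Lambda(t-\tau)}|E(\tau) - \partial_x\beta|_\infty \sum_\pm \|\partial_v[f_\pm(\tau) - \mu_\pm]\|_m\,d\tau \\
&\le C\int_0^t e^{\Lambda(t-\tau)}(\delta e^{\tau\operatorname{Re}\lambda})^2\,d\tau \le C_2(\delta e^{\operatorname{Re}\lambda\,t})^2
\end{aligned} \tag{56}
\]
with a constant $C_2$ independent of $\epsilon$, $\delta$ and $t$. Thus for $0 \le t \le \min\{T, T^*, T^\delta\}$, we also have
\[
\begin{aligned}
\|u(t) - \nu_\beta\|_1 &\ge \delta\|e^{-Lt}\Xi^\delta\|_1 - \|u(t) - \nu_\beta - \delta e^{-Lt}\Xi^\delta\|_1 \\
&\ge \delta\|e^{-Lt}\operatorname{Im}\Xi\|_1 - \delta\|e^{-Lt}(\operatorname{Im}\Xi - \Xi^\delta)\|_1 - C_2(\delta e^{\operatorname{Re}\lambda\,t})^2 \\
&\ge \tfrac12\delta r\zeta e^{\operatorname{Re}\lambda\,t} - C_2(\delta e^{\operatorname{Re}\lambda\,t})^2
\end{aligned} \tag{57}
\]
by Lemma 12. Choose $T^{**}$ so that
\[
\delta e^{T^{**}\operatorname{Re}\lambda} = \zeta r/(4C_2). \tag{58}
\]
Notice that $T^{**} \le C|\ln\delta|$ since $\lambda$, $r$ and $\zeta$ are fixed. By definitions (51) and (58),
\[
T^{**} = \frac{1}{\operatorname{Re}\lambda}\Big\{\ln\frac{1}{\delta} + \ln[\zeta r/(4C_2)]\Big\} < \frac{1}{\Lambda - \operatorname{Re}\lambda}\Big\{\ln\frac{1}{\delta^m} + \ln[\zeta r/(2C_\Lambda)]\Big\} = T^\delta,
\]
for $\delta$ sufficiently small, by (50). Also let
\[
0 < \epsilon_0 < \min\Big\{\epsilon, \frac{\zeta^2r^2}{16C_2}\Big\}. \tag{59}
\]
We now consider which of the three numbers $T$, $T^*$, $T^{**}$ is the smallest. If $T < \min(T^*, T^{**})$, then by (56) and (58), we have
\[
\|u(t) - \nu_\beta - \delta e^{-Lt}\Xi^\delta\|_1 \le C_2(\delta e^{\operatorname{Re}\lambda\,T})^2 < C_2(\delta e^{\operatorname{Re}\lambda\,T})(\delta e^{T^{**}\operatorname{Re}\lambda}) = \frac{\zeta}{4}\,r\delta e^{T\operatorname{Re}\lambda} .
\]
This contradicts (52). On the other hand, if $T^{**} \le \min\{T, T^*\}$, then by (57), (58) and (59) we have $\|u(T^{**}) - \nu_\beta\|_1 \ge \frac{\zeta^2r^2}{16C_2} > \epsilon_0$. Finally, if $T^* \le \min(T, T^{**})$, then by (55) we have $\|u(t) - \nu_\beta\|_X \ge \epsilon > \epsilon_0$ for some $t \le T^{**}$. Thus in any of these three cases we have $\|u(t) - \nu_\beta\|_X > \epsilon_0$ for some $t \le T^{**}$.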
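The ordering $T^{**} < T^\delta$ for small $\delta$ rests only on (50): the coefficient of $\ln(1/\delta)$ is $1/\operatorname{Re}\lambda$ in $T^{**}$ and $m/(\Lambda - \operatorname{Re}\lambda)$ in $T^\delta$, and the first is smaller exactly when $\Lambda < (1+m)\operatorname{Re}\lambda$. The following sketch evaluates both times; all numerical constants are hypothetical and chosen only so that (50) holds, and the function names are not from the paper.

```python
import math


def t_double_star(delta, re_lambda, zeta_r, c2):
    """T** from (58): delta * exp(T** * Re(lambda)) = zeta*r / (4*C2)."""
    return (math.log(1.0 / delta) + math.log(zeta_r / (4.0 * c2))) / re_lambda


def t_delta(delta, re_lambda, big_lambda, m, zeta_r, c_lambda):
    """T^delta from (51), in the form used in the displayed comparison."""
    numerator = m * math.log(1.0 / delta) + math.log(zeta_r / (2.0 * c_lambda))
    return numerator / (big_lambda - re_lambda)


# Hypothetical constants with Re(lambda) < Lambda < (1 + m) * Re(lambda).
RE_LAMBDA, BIG_LAMBDA, M = 1.0, 1.2, 0.5
ZETA_R, C2, C_LAMBDA = 0.1, 10.0, 10.0

if __name__ == "__main__":
    for delta in (1e-4, 1e-6, 1e-8, 1e-12):
        a = t_double_star(delta, RE_LAMBDA, ZETA_R, C2)
        b = t_delta(delta, RE_LAMBDA, BIG_LAMBDA, M, ZETA_R, C_LAMBDA)
        print(f"delta={delta:.0e}  T**={a:8.3f}  T^delta={b:8.3f}  ordered: {a < b}")
```

With these particular constants the ordering fails for the largest $\delta$ in the list and holds for the smaller ones, which matches the qualifier "for $\delta$ sufficiently small".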


6. Nonlinear Instabilities of BGK Solitary Waves and Collisionless Shocks

In this section, we study the instabilities of BGK solitary waves and collisionless shocks. The new difficulty we encounter lies in the unboundedness of the spatial variable, so that the plane wave growing modes do not decay as $x \to \infty$. They do not belong to any $L^p$ space, and they correspond to continuous spectrum. We shall overcome this by employing the finite propagation speed property of the relativistic model. We approximate the original problem on the whole line by a family of cutoff periodic problems. Consider the 1-D Vlasov–Maxwell system of ions and electrons for $-\infty < x < \infty$,
\[
\partial_t f_\pm + \hat v\,\partial_x f_\pm \pm E\,\partial_v f_\pm = 0, \qquad \partial_t E = -j = -\int_{-\infty}^{\infty} \hat v\,[f_+ - f_-]\,dv
\]
with the constraint $\partial_x E = \rho = \int_{-\infty}^{\infty} [f_+ - f_-]\,dv$. First we prove the well-posedness of the initial-value problem. It is convenient to do so in the space $BV(\mathbb{R} \times \mathbb{R})$.

Lemma 17 (Cauchy problem). Let initial data $[f^0_+, f^0_-, E^0]$ be given for $(x,v) \in \mathbb{R} \times \mathbb{R}$ such that, for any bounded open set $B_x \subset \mathbb{R}$, $0 \le f^0_\pm \in BV(B_x \times \mathbb{R})$, $\langle v\rangle^\gamma f^0_\pm \in L^\infty(B_x \times \mathbb{R})$ for some fixed $\gamma > 1$, $E^0 \in L^1_{loc}(\mathbb{R})$ and $\partial_x E^0 = \int_{\mathbb{R}} (f^0_+ - f^0_-)\,dv$. Then there exists a unique solution of (RVM) with initial data $[f^0_+, f^0_-, E^0]$ with $f_\pm \ge 0$ such that, for any bounded open sets $B_x \subset \mathbb{R}$ and $B_t \subset \mathbb{R}$, we have $f_\pm \in L^\infty(B_t; BV(B_x \times \mathbb{R}))$, $\langle v\rangle^\gamma f_\pm \in L^\infty(B_t \times B_x \times \mathbb{R})$ and $E \in L^\infty(B_t; W^{1,\infty}(B_x))$. Furthermore, this unique solution depends causally on its initial data with speed of propagation at most one.

Proof. We may assume $f^0_\pm \in W^{1,1}$ rather than BV and we define an approximating sequence $[f^n_\pm, E^n]$ satisfying $L_\pm(E^n, f^{n+1}_\pm) = 0$, $\partial_t E^n = -j^n$ and we deduce $\partial_x E^n = \rho^n$. Let $K$ be the piece of solid cone $\{|x - x_0| \le t_0 - t,\ 0 \le t \le t_1\}$ with top $T$, bottom $B$ and lateral surface $K$ in $(t,x)$ space. Integrating the Vlasov equation over $K$, we obtain
\[
\int_T\!\int_{\mathbb{R}} f^{n+1}_\pm\,dv\,dx \le \int_B\!\int_{\mathbb{R}} f^0_\pm\,dv\,dx < \infty .
\]

Let $I_t = \{x : |x - x_0| < t_0 - t\}$ be the cross-section of $K$. Integrating the equation $\partial_t E^n = -j^n$ over a piece of the solid cone, we deduce
\[
\int_{I_t} |E^n|\,dx \le \int_{I_0} |E^0|\,dx + \int\!\!\int_K\!\int_{\mathbb{R}} (|f^n_+| + |f^n_-|)\,dv\,dx\,d\tau .
\]
Furthermore $\partial_x E^n = \rho^n$ is also integrable over the cone. Hence the product $\chi(t,\cdot)\cdot E^n(t,\cdot)$ is uniformly bounded in $L^\infty(\mathbb{R})$, where $\chi(t,x)$ denotes the characteristic function of $K$. So $E^n$ is uniformly bounded in $L^\infty(K)$. Now the same arguments as in Lemma 15 give $\langle v\rangle^\gamma f^n_\pm$ bounded in $L^\infty(K)$ and $E^n \in L^\infty(W^{1,\infty})$, restricted to the cone. The existence proof concludes as before with a bound on the $W^{1,1}$ norm of $f^n_\pm(t)$ within the cone $K$ and a subsequent passage to the limit.

To prove the causality, and consequently the uniqueness, let $[f_\pm, E]$ and $[\bar f_\pm, \bar E]$ be two solutions with the given properties. Proceeding as in Lemma 15 but integrating only over $K$, we have
\[
\int_T\!\int_{\mathbb{R}} |f_\pm - \bar f_\pm|\,dv\,dx \le \int_0^{t_1} |\chi(E - \bar E)|_\infty\,d\tau\ \|f\|_{BV(K\times\mathbb{R})} .
\]


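From $\partial_t(E - \bar E) = -j + \bar j$ we get $|\chi(E - \bar E)(t)|_1 \le C\|\chi(f - \bar f)(t)\|_1$, while from $\partial_x(E - \bar E) = \rho - \bar\rho$ we get $|\chi\,\partial_x(E - \bar E)(t)|_1 \le C\|\chi(f - \bar f)(t)\|_1$. Hence $|\chi(E - \bar E)(t)|_\infty \le C\|\chi(f - \bar f)(t)\|_1$. By Gronwall we deduce $f_\pm \equiv \bar f_\pm$ and hence $E \equiv \bar E$.

Corollary 1. If $f^0_\pm \in BV(\mathbb{R} \times \mathbb{R})$, $\langle v\rangle^\gamma f^0_\pm \in L^\infty(\mathbb{R} \times \mathbb{R})$ and $E^0 \in L^1(\mathbb{R})$, then the solution satisfies $f_\pm \in L^\infty(B_t; BV(\mathbb{R} \times \mathbb{R}))$, $\langle v\rangle^\gamma f_\pm \in L^\infty(B_t \times \mathbb{R} \times \mathbb{R})$, $E \in L^\infty(B_t; W^{1,\infty}(\mathbb{R}))$, and $\int\!\!\int_{\mathbb{R}\times\mathbb{R}} f_\pm\,dv\,dx$ are independent of $t$.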

We now prove the instability of solitary waves and collisionless shocks.

Proof of Theorem 1. We are given non-negative $\mu_\pm$ that satisfy (5), (31) and (37), as well as a solution $\beta(x)$ of (4) on the whole line such that $\beta(x) \to 0$ as $x \to -\infty$. We assume that the system (11) linearized around the homogeneous state $[\mu_\pm(\langle v\rangle), 0]$ has a growing exponential plane wave solution (see Theorem 3 for sufficient conditions). We must find a family of solutions $u^\delta(t) = [f^\delta_\pm(t), E^\delta(t)]$ of RVM that satisfy
\[
\|f^\delta(0) - \mu_\beta\|_{1,1} + |E^\delta(0) - \beta_x|_{1,1} \le |\ln\delta|\,\delta, \quad\text{and}\quad \sup_{0\le t\le C_1|\ln\delta|} \|u^\delta(t) - \nu_\beta\|_X \ge \epsilon_0 > 0,
\]
where all the norms are taken over $-\infty < x < \infty$. We prove this theorem in three steps.

The approximate periodic problem. Let $\mu_0$ = the pair $[\mu_+(\langle v\rangle), \mu_-(\langle v\rangle)]$ and $\nu_0$ = the triple $[\mu_0, 0]$; that is, with vanishing electric field. Let $\beta \equiv 0$ in Lemmas 10 to 14. By assumption the linearized RVM (11) around $\nu_0$ has a growing mode which is an exponential plane wave. This mode has a certain period $P > 0$. Consider now the RVM system with boundary conditions of period $P$ in $x$. We apply Theorem 2 in the case $\beta \equiv 0$ to obtain constants $C_1 > 0$, $\epsilon_0 > 0$ and a family of solutions $u^\delta_P = [f^\delta_{P\pm}, E^\delta_P]$ of the nonlinear $P$-periodic problem such that $f^\delta_{P\pm}(0) \ge 0$,
\[
\|f^\delta_P(0) - \mu_0\|_{W^{1,1}(\mathbb{R}_P\times\mathbb{R})} + |E^\delta_P(0)|_{L^1(\mathbb{R}_P)} \le \delta, \qquad
\sup_{0\le t\le C_1|\ln\delta|} \|u^\delta_P(t) - \nu_0\|_{X(\mathbb{R}_P\times\mathbb{R})} \ge \epsilon_0 > 0 . \tag{60}
\]
Here $X = X(\mathbb{R}_P \times \mathbb{R})$ is exactly the space defined by the norm (43). Similarly, we define
\[
\|u\|_{X(I)} = \sum_\pm \int_{\mathbb{R}}\!\int_I |f_\pm|\,dx\,dv + |E|_{L^1(I)} + \sup_{x\in I}\Big|\int_{\mathbb{R}} (f_+ - f_-)\,dv\Big|
\]
for $I \subset \mathbb{R}$. It follows as in (54) that $|E^\delta_P|_\infty \le C|E^\delta_P|_{1,1} \le C\delta$.

The whole-line problem. It suffices to restrict $\delta$ to a sequence $\delta = \delta_N \to 0$. We choose, for any positive integer $N$,
\[
\delta = \delta_N = \exp\Big[-\frac{NP}{C_1}\Big].
\]
Let $-a$ be chosen sufficiently large, as specified later to depend on $N$. We define nonperiodic initial data as follows. Let $I = \{|x - a| < (N+2)P\}$. For $J = \{|x - a| \le (N+1)P\}$, we define


\[
f^\delta_\pm(0,x,v) = f^\delta_{P\pm}(0,x,v), \qquad E^\delta(0,x) = E^\delta_P(0,x).
\]
For $|x - a| \ge (N+2)P$, we define
\[
f^\delta_\pm(0,x,v) = \mu_\pm(\langle v\rangle \mp \beta(x)), \qquad E^\delta(0,x) = \beta'(x).
\]
In the remaining intervals, $L = L_+ \cup L_-$, $L_\pm = \{(N+1)P < \pm(x - a) < (N+2)P\}$, we define $\bar f^\delta(0,x,v)$ as the linear interpolate between $f^\delta_P$ and $\mu_\beta$. By (60) and (31), there is a constant $C$ independent of $N$ such that
\[
\|\bar f^\delta(0,x,v) - \mu_0\|_{W^{1,1}(L)} \le C\delta + C\|\beta\|_{C^1(L)} .
\]
We then define for $x \in L_+$,
\[
f^\delta_\pm(0,x,v) = \bar f^\delta_\pm(0,x,v) + \alpha_\pm Q(x,v),
\]
where $0 \le Q \in C^\infty_c(L_+ \times \mathbb{R})$ with $\int_{L_+}\!\int_{\mathbb{R}} Q = 1$. The constants $\alpha_\pm \ge 0$ are chosen so that
\[
\int_{L_+}\!\int_{\mathbb{R}} [f^\delta_+(0) - f^\delta_-(0)]\,dv\,dx = \beta'(a + (N+2)P) - E^\delta_P(0, a + (N+1)P). \tag{61}
\]
This requires
\[
\alpha_+ - \alpha_- = -\int_{L_+}\!\int_{\mathbb{R}} [\bar f^\delta_+(0) - \bar f^\delta_-(0)]\,dv\,dx + \beta'(a + (N+2)P) - E^\delta_P(0, a + (N+1)P)
\]
so that
\[
|\alpha_+ - \alpha_-| \le C\delta + C\|\beta\|_{C^1(L_+)} . \tag{62}
\]
We then define $E^\delta(0,x)$ in $L_+$ as
\[
E^\delta(0,x) = E^\delta_P(0, a + (N+1)P) + \int_{a+(N+1)P}^{x}\!\int_{\mathbb{R}} (f^\delta_+(0) - f^\delta_-(0))\,dy\,dv .
\]
It follows from (60) that $E^\delta(0,x)$ is continuous at $a + (N+2)P$ and that $\partial_x E^\delta(0,x) = \rho^\delta(0,x)$ for $x \in L_+$. We have
\[
|E^\delta(0)|_{L^1(L_+)} + \|f^\delta(0) - \mu_0\|_{W^{1,1}(L_+\times\mathbb{R})} \le C\delta + \|\bar f^\delta(0) - \mu_0\|_{W^{1,1}(L_+\times\mathbb{R})} + \|\alpha_\pm Q\|_{W^{1,1}(L_+\times\mathbb{R})} \le C\delta + C\|\beta\|_{C^1(L_+)},
\]
where $C$ is independent of $N$ and $\delta$. We define $[f^\delta_\pm(0,x,v), E^\delta(0,x)]$ on $L_-$ by the same method. We then deduce for $x \in I = J \cup L$ that
\[
\begin{aligned}
\|f^\delta(0) - \mu_0\|_{W^{1,1}(I)} + |E^\delta(0)|_{L^1(I)}
&\le \|f^\delta_P(0) - \mu_0\|_{W^{1,1}(J)} + |E^\delta_P(0)|_{L^1(J)} + \|f^\delta(0) - \mu_0\|_{W^{1,1}(L)} + |E^\delta(0)|_{L^1(L)} \\
&\le C(N+1)\delta + C(\delta + \|\beta\|_{C^1(L)}) \le C\delta|\ln\delta| + C\|\beta\|_{C^1(L)}
\end{aligned} \tag{63}
\]
because of the periodicity within $\{|x - a| < (N+1)P\} = J$. We apply Lemma 17 and its corollary to this initial data. By causality and because $NP = C_1|\ln\delta|$,
\[
\sup_{0\le t\le NP} \|u^\delta(t) - \nu_0\|_{X(K)} = \sup_{0\le t\le C_1|\ln\delta|} \|u^\delta_P(t) - \nu_0\|_{X(K)} \ge \epsilon_0,
\]


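where $K = [a - P/2, a + P/2]$.

Instability with $\beta \ne 0$. Let $\beta \in C^2$ solve the ODE (4) and let $\lim_{x\to-\infty}\beta(x) = 0$. It follows that all the derivatives of $\beta$ tend to zero as $x \to -\infty$. Let $\nu_\beta(x,v) = [\mu_+(\langle v\rangle - \beta(x)), \mu_-(\langle v\rangle + \beta(x)), \beta'(x)]$. Then
\[
\begin{aligned}
\sup_{0\le t\le C_1|\ln\delta|} \|u^\delta(t) - \nu_\beta\|_{X(K)}
&\ge \epsilon_0 - \|\nu_\beta - \nu_0\|_{X(K)} \\
&= \epsilon_0 - \sum_\pm \int_K\!\int_{\mathbb{R}} |\mu_\pm(\langle v\rangle \mp \beta) - \mu_\pm(\langle v\rangle)|\,dv\,dx - \int_K |\beta'|\,dx - \sup_K |\beta''(x)| \\
&\ge \epsilon_0 - C\sup_K(|\beta(x)| + |\beta'(x)| + |\beta''(x)|) \ge \epsilon_0/2
\end{aligned}
\]
by choosing $a$ sufficiently near $-\infty$. Furthermore, by definition of $f^\delta(0)$ and $E^\delta(0)$,
\[
\|f^\delta(0) - \mu_\beta\|_{W^{1,1}(\mathbb{R}\times\mathbb{R})} + |E^\delta(0) - \beta'|_{L^1(\mathbb{R})}
\le \|f^\delta(0) - \mu_0\|_{W^{1,1}(I\times\mathbb{R})} + |E^\delta(0)|_{L^1(I)} + \|\mu_0 - \mu_\beta\|_{W^{1,1}(I\times\mathbb{R})} + |\beta'|_{L^1(I)} . \tag{64}
\]
The first two terms on the right are $O[\delta|\ln\delta| + \|\beta\|_{C^1(I)}]$ by (63). The last term is $|\beta'|_{L^1(I)} = \int_I |\beta'|\,dx \le 2(N+2)P\|\beta\|_{C^1(I)}$. The third term in (64) is
\[
\sum_\pm \int_I\!\int_{\mathbb{R}} \big\{|\mu_\pm(\langle v\rangle \mp \beta(x)) - \mu_\pm(\langle v\rangle)| + |\hat v[\mu_\pm'(\langle v\rangle \mp \beta(x)) - \mu_\pm'(\langle v\rangle)]| + |\beta'(x)\,\mu_\pm'(\langle v\rangle \mp \beta(x))|\big\}\,dv\,dx \le C(N+2)\|\beta\|_{C^1(I)} .
\]
For each $N$ and for $\delta = \delta_N$, we choose $a$ so near $-\infty$ that $\|\beta\|_{C^1(I)} < \delta$. Thus from (64) we have
\[
\|f^\delta(0) - \mu_\beta\|_{W^{1,1}(\mathbb{R}\times\mathbb{R})} + |E^\delta(0) - \beta_x|_{W^{1,1}(\mathbb{R})} \le C\delta|\ln\delta| .
\]
The factor $C|\ln\delta|$ can be eliminated by a scale change. This completes the proof.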

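Remark. The relativistic Vlasov–Maxwell system is invariant under any Lorentz transformation:
\[
(t, x, v, \hat v) \to \Big(ct + sx,\ st + cx,\ s\langle v\rangle + cv,\ \frac{s\langle v\rangle + cv}{c\langle v\rangle + sv}\Big), \tag{65}
\]
where
\[
\begin{pmatrix} \cosh\alpha & \sinh\alpha \\ \sinh\alpha & \cosh\alpha \end{pmatrix} \equiv \begin{pmatrix} c & s \\ s & c \end{pmatrix}, \qquad c^2 - s^2 = 1 .
\]
This leads to a trivial "instability" of BGK waves. In fact, if $[\mu_\beta, \beta']$ is a BGK wave, then
\[
L_\alpha[\mu_\beta, \beta'] = [\mu_\pm(\langle s\langle v\rangle + cv\rangle \mp \beta(st + cx)),\ \beta'(st + cx)]
\]
is also an exact solution. It follows, when $\alpha$ is small, that
\[
\|L_\alpha[\mu_\beta, \beta']|_{t=0} - [\mu_\beta, \beta']|_{t=0}\|_{1,1} = O(\alpha), \qquad \|L_\alpha[\mu_\beta, \beta'] - [\mu_\beta, \beta']\|_X = O(\alpha t).
\]
Therefore, the escape time of this trivial instability is $O(\delta^{-1})$, which is much longer than the exponential escape times in our theorem.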
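The velocity component of (65) is consistent with the energy relation $\langle v\rangle^2 = 1 + v^2$: since $(c\langle v\rangle + sv)^2 - (s\langle v\rangle + cv)^2 = (c^2 - s^2)(\langle v\rangle^2 - v^2) = 1$, the boosted momentum $s\langle v\rangle + cv$ has energy $c\langle v\rangle + sv$. The sketch below checks this numerically. It assumes the convention $\langle v\rangle = \sqrt{1+v^2}$, and the function names are hypothetical.

```python
import math


def bracket(v):
    """Relativistic energy <v> = sqrt(1 + v^2) (assumed convention)."""
    return math.sqrt(1.0 + v * v)


def boost(v, alpha):
    """Momentum and velocity after the boost (65).

    Uses c = cosh(alpha), s = sinh(alpha), so that c^2 - s^2 = 1.
    Returns (v_new, v_hat_new) as written in (65).
    """
    c, s = math.cosh(alpha), math.sinh(alpha)
    v_new = s * bracket(v) + c * v
    v_hat_new = v_new / (c * bracket(v) + s * v)
    return v_new, v_hat_new


def energy_after_boost(v, alpha):
    """The expression c<v> + sv, which should equal <s<v> + cv>."""
    c, s = math.cosh(alpha), math.sinh(alpha)
    return c * bracket(v) + s * v


if __name__ == "__main__":
    for v in (-2.0, 0.0, 0.3, 5.0):
        v_new, v_hat_new = boost(v, 0.4)
        print(v, bracket(v_new), energy_after_boost(v, 0.4), v_hat_new)
```

The check also confirms that the fourth component of (65) is the relativistic velocity of the third, so the map preserves the relation $\hat v = v/\langle v\rangle$.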


7. Appendix

In this section, we give sufficient conditions for the positivity of the distribution functions in our previous papers. We thank J. Vukadinović and G. Rein for pointing out the importance of the positivity. The proof follows exactly as in the current paper. For the paper [GS1], the positivity can be incorporated into the Main Theorem as follows.

Theorem 5 (Main Theorem of [GS1]). Let $\mu(v)$ be an even function in each coordinate that satisfies (2.1) and (2.2). Let $h \in C^1([0,\infty))$, $h > 0$ and $h' > 0$, $h(s) \to \infty$ as $s \to \infty$. Assume
\[
|\nabla\mu(v)| \le h(|v|)\mu(v), \qquad \mu^p(v) = o\{|v|^{-p\alpha-2}\,h(|v|+1)^{-p(1+m)-1}\,h'(|v|+1)\}
\]
for $|v|$ large, and $m > 0$. Then there are initial data $f_0^n(x,v) \ge 0$ and times $t_n \ge 0$ such that $\|f_0^n - \mu\|_X \to 0$, but $\|f^n(t_n) - \mu\|_X$ does not go to 0.

For the paper [GS2] we correct some errors.
[i] The natural condition $\int_0^{2P_\beta} E(t,x)\,dx = 0$ is missing in (VP), (1.1), (2.55), (3.3).
[ii] In (2.9), delete the term $E(x')$.
[iii] The correct inequality in Lemma 3.1 and in (3.7) should be $\|\beta\| < |\lambda|^2$.
[iv] In Lemma 3.2, the condition $\langle v\rangle^\gamma f^0_\pm \in L^\infty$ for $\gamma > 1$ is missing.
[v] In (1.2) add: $\mu_\pm'(v^2/2)$ are integrable.

Now we state the improved main result. In the Main Theorem of [GS2] we have $f^\delta(0) \ge 0$ if $h(s)$ is either $\langle s\rangle^\sigma$ or $\exp(l\langle s\rangle)$ with $\sigma \ge 0$, $l \ge 0$, and
\[
|v\mu_\pm'(v^2/2)| \le h(|v|)\mu_\pm(v^2/2), \qquad \mu_\pm(v^2/2) \le h'(|v|)[h(|v|)]^{-2-m}
\]
for $|v|$ large and some $m > 0$.

References

[BGK] Bernstein, I., Greene, J., Kruskal, M.: Exact nonlinear plasma oscillations. Phys. Rev. 108 3, 546–550 (1957)
[G] Guo, Y.: Stable magnetic equilibria in collisionless plasmas. Comm. Pure Appl. Math., Vol. L, 0891–0933 (1997)
[GR] Guo, Y., Ragazzo, C. G.: On steady states in a collisionless plasma. Comm. Pure Appl. Math., Vol. XLIX, 1145–1174 (1996)
[GS1] Guo, Y., Strauss, W.: Nonlinear instability of double-humped equilibria. Ann. IHP, Analyse Nonlineaire, 12, 339–352 (1995)
[GS2] Guo, Y., Strauss, W.: Instability of periodic BGK equilibria. Comm. Pure Appl. Math. Vol XLVIII, 861–894 (1995)
[GS3] Guo, Y., Strauss, W.: Relativistic unstable periodic BGK waves. Comp. Appl. Math., to appear
[GS4] Guo, Y., Strauss, W.: Unstable oscillatory-tail waves in collisionsless plasmas. To appear
[P] Penrose, O.: Electrostatic instability of a non-Maxwellian plasma. Phys. Fluids. 3, 258–265 (1960)
[Sh] Shizuta, Y.: On the classical solutions of the Boltzmann equation. Comm. Pure Appl. Math. 36, 705–754 (1983)
[St] Steinberg, S.: Meromorphic families of compact operators. Arch. Rat. Mech. Anal. 31, 372–379 (1968)
[V] Vidav, I.: Spectra of perturbed semigroups with applications to transport theory. J. Math. Anal. Appl. 30, 264–279 (1970)

Communicated by J. L. Lebowitz

Commun. Math. Phys. 195, 295 – 308 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

On the Dynamics of n-Dimensional Quadratic Endomorphisms

N. Romero1, A. Rovella2, F. Vilamajó3

1 Decanato de Ciencias, Universidad Centro Occidental Lisandro Alvarado, Apdo. 400, Barquisimeto, Venezuela.
2 Centro de Matemática, Universidad de la República, Ed. Acevedo 1139, Montevideo, Uruguay.
3 Departament de Matemática Aplicada 2, Escola Tecnica Superior d'Enginyers Industrials, Colom 11, 08222 Terrassa, Barcelona, Espanya.

Received: 13 May 1997 / Accepted: 24 November 1997

Dedicated to the memory of Ricardo Mañé and Wieslav Szlenk.

Abstract: Considering a convex endomorphism $F$ (its $n$ coordinates are convex functions) and the one parameter family $F_\mu = F - \mu\nu$, where $\nu$ is any vector of $\mathbb{R}^n$, we find sufficient conditions in order that for large values of the parameter, the dynamical behavior of $F_\mu$ is completely described: either the nonwandering set $\Omega(F_\mu)$ is empty or $F_\mu$ restricted to $\Omega(F_\mu)$ is an expanding map. These conditions are shown to be generic in the space of quadratic endomorphisms.

1. Introduction

Convexity seems to be a condition which, when imposed on higher dimensional endomorphisms, permits generalization of some parts of the theory of one dimensional dynamics. This occurs for delay equations (see [RV]) and in a more general context will be the subject of this work.

A real function $f$ defined on $\mathbb{R}^n$ is $C^2$-convex if it is $C^2$ and there exists $\alpha > 0$ such that $q_x(v) = \langle Hf(x)v, v\rangle \ge \alpha$ for every unit vector $v \in \mathbb{R}^n$, where $Hf(x)$ denotes the Hessian matrix of $f$ at the point $x$ and $\langle\cdot,\cdot\rangle$ denotes the usual scalar product in $\mathbb{R}^n$. An endomorphism of $\mathbb{R}^n$ is called $C^2$-convex when all its coordinates are $C^2$-convex functions. The set of $C^2$-convex functions defined on $\mathbb{R}^n$ will be denoted by $CC^2(\mathbb{R}^n)$.

Next define the class $H_0$ of $C^1$ endomorphisms of $\mathbb{R}^n$ containing the maps $F$ which satisfy the following properties:
1. $\infty$ is an attractor for $F$ (i.e. there exists $R > 0$ such that $\|x\| > R$ implies that $F^k(x) \to \infty$ when $k \to \infty$). Denote by $B_\infty$ the basin of attraction of $\infty$.
2. The nonwandering set $\Omega(F)$ is either empty or a Cantor set which coincides with the complement of the basin of $\infty$, and $F$ restricted to $\Omega(F)$ is an expanding map.

Endomorphisms in $H_0$ are always Axiom A (see Mañé and Pugh [MP]); by a theorem of Przytycki (see [P]) adapted to this case of noncompact manifolds, the structural stability of the endomorphisms in $H_0$ also follows.


Let F = (f_1, ..., f_n) be a C²-convex endomorphism; for ν ∈ Rⁿ fixed, consider the one parameter family F_µ = F − µν. We will find sufficient conditions on the geometry of intersections of the level sets of the functions f_i such that for large values of µ, the map F_µ belongs to H_0 (see Proposition 1 in Sect. 3). We define G_ν as the set of C² endomorphisms F of Rⁿ for which there exists µ_0 ∈ R such that F_µ belongs to H_0 for every |µ| > µ_0. We will show in Sect. 3 that the intersection of G_ν with the space of C²-convex endomorphisms is open in the C²-strong topology. However, in Example 3 of the last section we will show that there exists F ∈ G_ν (F is not C²-convex) which is not an interior point of G_ν in the C^r-strong topology for any r ≥ 2.

Observe that if f : R → R is a C²-convex function then f_µ belongs to H_0 for every µ large. We are trying to understand the situation in higher dimensions. Actually the same result does not hold in dimension n ≥ 2; in fact, we will show in Sect. 5 that there are open sets of C²-convex endomorphisms for which the families {F_µ : µ > 0} do not intersect H_0 (see Examples 1 and 2 of the last section).

However, the situation for quadratic maps is quite different. Any quadratic endomorphism in Rⁿ is determined by symmetric matrices A_1, ..., A_n, vectors v_1, ..., v_n of Rⁿ, and real numbers a_1, ..., a_n, and is given by

\[ F(x) = (\langle A_1 x, x\rangle + \langle v_1, x\rangle + a_1, \; \ldots, \; \langle A_n x, x\rangle + \langle v_n, x\rangle + a_n). \]

Obviously the endomorphism F is C²-convex if and only if each of the matrices A_i is positive. We will show that if at least one of the matrices A_i is positive, then ∞ is an attractor for F. There are quadratic endomorphisms for which this does not occur, as will soon become clear. In the space of quadratic endomorphisms it is more natural to consider the weak (compact-open) topology, since the strong topology becomes discrete when induced in this space. Moreover, the weak topology coincides with the natural topology given by the immersion (via coefficients) of the quadratic space in euclidean space. With this topology, we will prove the following result:

Theorem 1. For every ν ∈ Rⁿ \ {0}, G_ν is open and dense in the space of quadratic endomorphisms of Rⁿ.

This kind of situation is also found in [BSV] and [RV], where delay endomorphisms were studied; these endomorphisms, which fail to be C²-convex because they have n − 1 linear coordinates, "generically" display hyperbolic dynamics (including that of H_0) when one parameter families are considered. In this sentence, "generically" has a different meaning, because the delay is required to be maintained. This will be explained in the first example of the last section.
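The objects introduced above (a quadratic endomorphism, the family F_µ = F − µν, and the notion that ∞ is an attractor) can be explored numerically. The following is a minimal sketch, not part of the paper: the helper names `quadratic_endomorphism`, `shifted` and `escapes`, the sample map F(x) = (‖x‖², ‖x‖²) on R², and the escape radius are all illustrative assumptions, and the escape test is only a finite-time heuristic.

```python
import math

def quadratic_endomorphism(mats, vecs, consts):
    """Return F(x) = (<A_i x, x> + <v_i, x> + a_i), i = 1..n, as a function."""
    n = len(mats)

    def F(x):
        out = []
        for A, v, a in zip(mats, vecs, consts):
            quad = sum(A[r][c] * x[r] * x[c] for r in range(n) for c in range(n))
            lin = sum(v[r] * x[r] for r in range(n))
            out.append(quad + lin + a)
        return out

    return F

def shifted(F, mu, nu):
    """Return the member F_mu = F - mu * nu of the one parameter family."""
    return lambda x: [fi - mu * ni for fi, ni in zip(F(x), nu)]

def escapes(F_mu, x, radius=1e6, max_iter=50):
    """Heuristic only: True if the orbit of x leaves the ball of the given radius."""
    for _ in range(max_iter):
        if math.sqrt(sum(c * c for c in x)) > radius:
            return True
        x = F_mu(x)
    return False

# Hypothetical sample: both coordinates equal to ||x||^2 (both matrices positive).
identity = [[1.0, 0.0], [0.0, 1.0]]
F = quadratic_endomorphism([identity, identity], [[0.0, 0.0]] * 2, [0.0, 0.0])
nu = [1.0, 1.0]
```

For this sample, `escapes(shifted(F, 2.0, nu), [3.0, 0.0])` reports an escaping orbit, while the point (1, 1) is fixed by F_1 and therefore never escapes.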

2. Preliminaries

In this section we will describe some properties of a single C²-convex function f : Rⁿ → R. For each i, j = 1, ..., n we denote the partial derivatives ∂f/∂x_i and ∂²f/∂x_i∂x_j by ∂_i f and ∂_ij f respectively, the gradient vector of f at x by ∇f(x), and we define the sets C_i(f) = {x ∈ Rⁿ : ∂_j f(x) = 0 for j ≠ i}, i ∈ {1, ..., n}. Let α > 0 be such that for every v, x ∈ Rⁿ:


q_x(v) = ⟨H_f(x)v, v⟩ ≥ α‖v‖², where H_f(x) is the Hessian matrix of f at x. Next we comment on the fundamental properties:

1. There exists R > 0 such that f(x) ≥ (α/3)‖x‖² if ‖x‖ ≥ R.
Proof. Fix x ∈ Rⁿ with norm 1 and define ϕ_x(t) = f(tx) for positive t. Then ϕ_x″(t) = ⟨H_f(tx)x, x⟩ ≥ α for every t ≥ 0. It follows that

\[ \varphi_x(t) \ge \frac{\alpha}{2} t^2 + \varphi_x'(0)\, t + \varphi_x(0). \]

As |ϕ_x′(0)| is bounded above independently of x, this implies the assertion. It also follows that f is a proper function: preimages of bounded sets are bounded.

2. We claim that f has a unique critical point.
Proof. The first item implies that f has an absolute minimum in the region ‖x‖ ≤ R, which must be a critical point. Let x_0 be a point where f takes its absolute minimum, fix x with ‖x − x_0‖ = 1, and define ψ_x(t) = f(x_0 + t(x − x_0)) for t ≥ 0. Then, as above, ψ_x″(t) ≥ α for t > 0, which implies that ψ_x(t) ≥ (α/2)t² + f(x_0) for t > 0, and the claim follows.

3. For s ∈ R the level sets f^{−1}(s) are always compact; furthermore, when s < min f, f^{−1}(s) = ∅; when s = min f, f^{−1}(s) is the critical point of f; and if s > min f, then f^{−1}(s) is a compact set that separates Rⁿ into two components, the bounded one being the strictly convex set {x ∈ Rⁿ : f(x) < s}, denoted in the sequel by i(f^{−1}(s)). The unbounded component will be denoted by e(f^{−1}(s)). Another simple consequence of the convexity is that every nonempty level set f^{−1}(s) with s > min f has exactly two points of tangency with hyperplanes x_i = constant, i = 1, ..., n; these are the points of intersection of f^{−1}(s) and C_i(f).

4. The set C_i(f) is the graph of a function defined on the i-th axis; that is, we claim that there exists x̃_i : R → R^{n−1} such that ∂_j f(x_1, ..., x_n) = 0 for every j ≠ i if and only if there exists t ∈ R satisfying x_i = t and (x_1, ..., x_{i−1}, x_{i+1}, ..., x_n) = x̃_i(t).
Proof. Take i = n to simplify the notation, and consider the map g_n(x) = (∂_1 f(x), ..., ∂_{n−1} f(x)), where x = (x̃, x_n) ∈ Rⁿ and x̃ = (x_1, ..., x_{n−1}). It is easy to verify that ∂_x̃ g_n(x̃, x_n) = Ĥ_f(x), with Ĥ_f(x) the matrix obtained from H_f(x) if the last row and column are taken off. Since H_f(x) is a positive matrix, Ĥ_f(x) is nonsingular. As g_n(x^0) = 0, where x^0 = (x^0_1, ..., x^0_n) is the critical point of f, the implicit function theorem implies that there is a neighborhood V of x^0_n and a function x̃_n defined on V such that g_n(x̃_n(x_n), x_n) = 0 for every x_n ∈ V. Moreover,

\[ \hat{H}_f(\tilde{x}_n(x_n), x_n)\, \tilde{x}_n'(x_n) = -\hat{\nabla}\partial_n f(\tilde{x}_n(x_n), x_n), \tag{1} \]

where ∇̂∂_n f = (∂_{1n} f, ..., ∂_{(n−1)n} f). As C_n(f) is the set of points where the level sets of f are tangent to the hyperplanes x_n = const, it follows that the domain of x̃_n is all of R.

The sets C_i(f), i = 1, ..., n, are called the critical lines of f.
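As a hedged illustration of the critical lines and of Eq. (1), consider the following worked example. The function f below is an illustrative choice and is not taken from the paper.

```latex
% Illustrative example (assumed, not from the paper): n = 2.
\[
  f(x_1, x_2) = x_1^2 + x_1 x_2 + x_2^2, \qquad
  H_f = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix},
\]
% The eigenvalues of H_f are 1 and 3, so f is C^2-convex with \alpha = 1.
\[
  C_2(f) = \{\partial_1 f = 0\} = \{2 x_1 + x_2 = 0\}, \qquad
  \tilde{x}_2(t) = -\tfrac{t}{2},
\]
\[
  C_1(f) = \{\partial_2 f = 0\} = \{x_1 + 2 x_2 = 0\}, \qquad
  \tilde{x}_1(t) = -\tfrac{t}{2}.
\]
% Check of Eq. (1) for i = n = 2: \hat{H}_f = (2) and \hat{\nabla}\partial_2 f = (\partial_{12} f) = (1), so
\[
  2\, \tilde{x}_2'(t) = -1 .
\]
% Restriction of f to the critical line C_2(f):
\[
  f\bigl(-\tfrac{t}{2},\, t\bigr) = \tfrac{t^2}{4} - \tfrac{t^2}{2} + t^2 = \tfrac{3}{4}\, t^2 .
\]
```

Both critical lines pass through the unique critical point (0, 0), and each is a graph over the corresponding axis, as property 4 asserts.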


Now we separate in a lemma the main result of this section; it says that if µ is sufficiently large, then for each 1 ≤ i ≤ n there is a level set S_i of f_µ = f − µ, tangent to the hyperplane x_i = f_µ(S_i).

Lemma 1. Let f_µ = f − µ, where f : Rⁿ → R is a C²-convex function and µ ∈ R. Then there exists µ_0 such that for any i = 1, ..., n and µ ≥ µ_0 there are defined functions s_i(µ) and s̃_i(µ) with the following properties:
1. f_µ^{−1}(s_i(µ)) is tangent to x_i = s_i(µ) and to x_i = s̃_i(µ).
2. s_i(µ) → +∞, s̃_i(µ) → −∞, s_i(µ)/µ → 0 and s̃_i(µ)/µ → 0 as µ → +∞.
3. f_µ^{−1}(s) ⊂ {(x_1, ..., x_n) : x_i < s} if s > s_i(µ); and f_µ^{−1}(s) ∩ {(x_1, ..., x_n) : x_i > s} ≠ ∅ if s < s_i(µ) and f_µ^{−1}(s) is not empty.

Proof. We assume i = n; the proof for i < n is similar. We denote by x^0 = (x^0_1, ..., x^0_n) the point where f takes its minimum a. Fix µ large enough and define ϕ_µ(t) = f_µ(x̃_n(t), t), where (x̃_n(t), t) = (u_1(t), ..., u_{n−1}(t), t) is the parametrization of C_n(f) given above. Observe that ϕ_µ′(t) = ∂_n f(x̃_n(t), t), because for 1 ≤ j < n, ∂_j f = 0 at points in C_n(f). It follows that

\[ \varphi_\mu''(t) = \sum_{i=1}^{n-1} \partial_{in} f_\mu(\tilde{x}_n(t), t)\, u_i'(t) + \partial_{nn} f_\mu(\tilde{x}_n(t), t). \]

Next we prove that ϕ_µ″ is bounded away from 0. Expanding the determinant of H_f(x̃_n(t), t) by cofactors along the last row gives

\[ \det(H_f(\tilde{x}_n(t), t)) = \sum_{i=1}^{n} (-1)^{n-i}\, \partial_{in} f(\tilde{x}_n(t), t)\, A_i(t), \tag{2} \]

where A_n(t) = det(Ĥ_f(x̃_n(t), t)) and A_i(t), for i = 1, ..., n − 1, is the determinant of the matrix obtained from H_f(x̃_n(t), t) by taking off the i-th column and the n-th row. Equation (1) says that

\[ \hat{H}_f(\tilde{x}_n(t), t)\, \tilde{x}_n'(t) = -\hat{\nabla}\partial_n f(\tilde{x}_n(t), t). \]

Consider this as a linear system with unknowns x̃_n′(t) = (u_1′(t), ..., u_{n−1}′(t)). By Cramer's rule, u_i′(t) times the determinant of Ĥ_f(x̃_n(t), t) is equal to the determinant of the matrix obtained by substituting the i-th column of Ĥ_f(x̃_n(t), t) by −∇̂∂_n f(x̃_n(t), t) = −(∂_{1n} f(x̃_n(t), t), ..., ∂_{(n−1)n} f(x̃_n(t), t)). Up to the sign of that column, this last matrix is obtained from H_f(x̃_n(t), t) by taking off the last row and the i-th column and moving the last column to the i-th place, which costs n − 1 − i transpositions of adjacent columns. It follows that

\[ A_i(t) = (-1)^{n-i}\, u_i'(t)\, \det(\hat{H}_f(\tilde{x}_n(t), t)). \]

In this way, from (2) we have

\[ \det(H_f(\tilde{x}_n(t), t)) = \det(\hat{H}_f(\tilde{x}_n(t), t)) \left( \sum_{i=1}^{n-1} \partial_{in} f(\tilde{x}_n(t), t)\, u_i'(t) + \partial_{nn} f(\tilde{x}_n(t), t) \right); \]

therefore

\[ \varphi_\mu''(t) = \frac{\det(H_f(\tilde{x}_n(t), t))}{\det(\hat{H}_f(\tilde{x}_n(t), t))}; \]

it is an exercise of linear algebra to prove that then ϕ_µ″ ≥ α. On the other hand, it is clear that ϕ_µ′(x^0_n) = 0 and ϕ_µ(x^0_n) = a − µ. From this we conclude that for every large value of µ there exist s_n(µ) > 0 and s̃_n(µ) < 0 with x^0_n ∈ (s̃_n(µ), s_n(µ)) such that ϕ_µ(s_n(µ)) = s_n(µ), ϕ_µ(s̃_n(µ)) = s_n(µ), ϕ_µ(s) < s if x^0_n < s < s_n(µ) and ϕ_µ(s) > s if s > s_n(µ). The lemma follows easily.

Remark 1.
– As an immediate consequence of the above lemma we have the following fact: if F_µ : Rⁿ → Rⁿ is any endomorphism such that at least one of its coordinates (suppose the last one) is f_µ = f − µ, where f : Rⁿ → R is a C²-convex function, then ∞ is an attractor for F_µ if µ is large enough. (This is also a consequence of the first property of C²-convex functions stated above.) Moreover, if we define C_n(µ) = {x ∈ Rⁿ : f_µ(x) ∈ [s̃_n(µ), s_n(µ)]}, with s̃_n(µ) and s_n(µ) being as in the lemma, and if B_∞ is the basin of infinity, then

\[ B_\infty = \mathbb{R}^n \setminus \bigcap_{k \ge 0} F_\mu^{-k}(C_n(\mu)). \]

Now suppose that each coordinate f_i is C²-convex and let s̃_i(µ), s_i(µ) be as in the previous lemma when the i-th coordinate is considered. If C_i(µ) = {x : f_i(x) − µ ∈ [s̃_i(µ), s_i(µ)]} and C(µ) = ⋂_{i=1}^{n} C_i(µ), then

\[ B_\infty(\mu) = \mathbb{R}^n \setminus \bigcap_{k \ge 0} F_\mu^{-k}(C(\mu)). \]

– Observe that diminishing µ we can find a value µ̃ such that s_n(µ̃) = s̃_n(µ̃). If µ < µ̃, then the basin of infinity for F_µ is equal to Rⁿ. Therefore, for the one parameter family of C²-convex endomorphisms F_µ = (f_1, ..., f_n) − µν, if any of the entries of the vector ν is negative, then for every large positive µ, the F_µ-orbit of any point goes to ∞.

3. ε-Transversality

Now we will find conditions expressed in terms of the intersections of the level curves of f_1, ..., f_n which will be sufficient to obtain that F_µ belongs to H_0 for large values of µ. The precise statement is Proposition 1. First we introduce some notation. By [{v_1, ..., v_k}] we denote the linear subspace generated by {v_1, ..., v_k} ⊂ Rⁿ, and P_V (resp. P_V^⊥) denotes the orthogonal projection of Rⁿ onto the linear subspace V (resp. onto the orthogonal complement of V).

Lemma 2. If {v_1, ..., v_k} is a linearly independent set of vectors in Rⁿ and V = [{v_1, ..., v_k}], then for every ε > 0 there exists δ > 0 such that if w_1, ..., w_k are linearly independent vectors in Rⁿ, W = [{w_1, ..., w_k}] and ‖w_i − v_i‖ < δ for every i = 1, ..., k, then for any unit vector v ∈ Rⁿ it holds that ‖P_V(v) − P_W(v)‖ < ε.

Proof. Let {v_1′, ..., v_k′} and {w_1′, ..., w_k′} be orthonormal bases of the linear subspaces V and W obtained from v_1, ..., v_k and w_1, ..., w_k by the Gram-Schmidt orthogonalization method. So for every vector v ∈ Rⁿ we can write


\[ P_V(v) = \sum_{i=1}^{k} \langle v, v_i' \rangle\, v_i' \quad \text{and} \quad P_W(v) = \sum_{i=1}^{k} \langle v, w_i' \rangle\, w_i'. \]

By continuity of the scalar product, ‖w_j′ − v_j′‖ is small if ‖w_i − v_i‖ is small for every i ≤ j; so the lemma follows.

Definition 1. Given ε > 0 we say that {v_1, ..., v_n} ⊂ Rⁿ \ {0} is ε-transverse if for each V_i = [{v_1, ..., v_n} \ {v_i}] with i = 1, ..., n, it holds that ‖P^⊥_{V_i} v_i‖ ≥ ε‖v_i‖.

Definition 2. We say that a set of n smooth hypersurfaces S_1, ..., S_n in Rⁿ is transverse if at each point of intersection x ∈ ⋂_{i=1}^{n} S_i the set of n normal vectors to the tangent spaces of the hypersurfaces is linearly independent. For any ε > 0, the set {S_1, ..., S_n} is ε-transverse if at each point x ∈ ⋂_{i=1}^{n} S_i, the set of n normal vectors at x to the respective tangent spaces is ε-transverse.

The following is an immediate corollary of Lemma 2.

Corollary 1. If {v_1, ..., v_n} is a set of unit vectors of Rⁿ which is not ε-transverse, then there exists δ > 0 such that if w_1, ..., w_n are unit vectors satisfying ‖w_i − v_i‖ < δ for every i = 1, ..., n, then {w_1, ..., w_n} is not ε-transverse.

The following lemma is the basic tool to obtain expansivity.

Lemma 3 (ε-transversality). Given ε > 0 there exists c(ε) > 0 such that if the set of unit vectors {v_1, ..., v_n} ⊂ Rⁿ is ε-transverse, then the n × n matrix A whose rows are the vectors v_1, ..., v_n satisfies ‖Av‖ ≥ c(ε)‖v‖ for every v ∈ Rⁿ.

Proof. Suppose by contradiction that there exists ε > 0 such that for every positive integer k and all i = 1, ..., n there exist unit vectors v_i^k and v^k such that the set {v_1^k, ..., v_n^k} is ε-transverse and, if A_k is the matrix whose rows are the vectors v_1^k, ..., v_n^k, then

\[ \|A_k v^k\| \le \frac{1}{k}. \tag{3} \]

We can assume without loss of generality that the sequences {v_i^k : k ≥ 1} with i = 1, ..., n and {v^k : k ≥ 1} converge to the unit vectors v_1, ..., v_n and v. From the corollary above it follows that {v_1, ..., v_n} is ε-transverse, hence linearly independent; but, on the other hand, if A is the matrix whose rows are the vectors v_1, ..., v_n, then passing to the limit in Eq. (3) we have Av = 0. This contradiction proves the lemma.

Remark 2. It can be proved that the number c(ε) in the preceding lemma can be chosen as a constant depending only on the dimension n times ε^{n−1}. We will not need this stronger version.
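Definition 1 and Lemma 3 can be probed numerically. The sketch below is only an illustration under stated assumptions: the helper names (`perp_norm`, `transversality`, `min_stretch_2d`) are hypothetical, the projection is computed with a plain Gram-Schmidt procedure, and the constant of Lemma 3 is merely estimated by sampling unit vectors in R², which gives an approximation of the minimum from above rather than a proof of a bound.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def perp_norm(v, others):
    """Norm of the component of v orthogonal to span(others), via Gram-Schmidt."""
    basis = []
    for w in others:
        for e in basis:
            c = dot(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        length = norm(w)
        if length > 1e-12:              # skip numerically dependent vectors
            basis.append([wi / length for wi in w])
    r = list(v)
    for e in basis:
        c = dot(r, e)
        r = [ri - c * ei for ri, ei in zip(r, e)]
    return norm(r)

def transversality(vectors):
    """Largest eps for which the set is eps-transverse in the sense of Definition 1."""
    return min(
        perp_norm(v, vectors[:i] + vectors[i + 1:]) / norm(v)
        for i, v in enumerate(vectors)
    )

def min_stretch_2d(rows, samples=3600):
    """Sampled minimum of ||A v|| over unit vectors v in R^2 (A has the given rows)."""
    best = float("inf")
    for k in range(samples):
        t = 2 * math.pi * k / samples
        v = (math.cos(t), math.sin(t))
        best = min(best, norm([dot(r, v) for r in rows]))
    return best

# Two unit vectors in the plane forming an angle of 30 degrees.
theta = math.pi / 6
rows = [[1.0, 0.0], [math.cos(theta), math.sin(theta)]]
eps = transversality(rows)      # equals the sine of the angle between the rows
c = min_stretch_2d(rows)        # sampled estimate of the constant in Lemma 3
```

For two unit vectors in the plane the transversality is the sine of the angle between them, so `eps` is 0.5 here, and the sampled lower bound `c` stays positive, in line with Lemma 3.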


Proposition 1. Let F_µ = (f_1 − µν_1, ..., f_n − µν_n) (each ν_i > 0) be a C²-convex endomorphism of Rⁿ satisfying the following: Given ε > 0 there exists µ_0 such that, if µ > µ_0, then {f_i^{−1}(µν_i + s_i) : i = 1, ..., n} is ε-transverse whenever s_i ∈ [s̃_i(µν_i), s_i(µν_i)] for each i = 1, ..., n. Then F_µ belongs to H_0 for every µ sufficiently large.

Proof. Suppose first that ν_i = 1 for each i = 1, ..., n. Since each component of F_µ is a C²-convex function, Remark 1 implies that

\[ \mathbb{R}^n \setminus B_\infty = \bigcap_{k \ge 0} F_\mu^{-k}(C(\mu)). \]

Take any x ∈ Rⁿ \ B_∞. For each i = 1, ..., n there exists s_i ∈ [s̃_i(µ), s_i(µ)] such that x ∈ ⋂_{i=1}^{n} f_i^{−1}(s_i + µ). The normal vector to f_i^{−1}(s_i + µ) at x is ∇f_i(x), so the hypothesis implies that the set {∇f_1(x), ..., ∇f_n(x)} is ε-transverse. On the other hand, it is clear that

\[ \|(DF_\mu)_x(v)\|^2 = \langle \nabla f_1(x), v\rangle^2 + \cdots + \langle \nabla f_n(x), v\rangle^2 \ge \min_{1 \le i \le n} \|\nabla f_i(x)\|^2 \sum_{i=1}^{n} \left\langle \frac{\nabla f_i(x)}{\|\nabla f_i(x)\|}, v \right\rangle^2 . \]

The sum in the last member is equal to the square of the norm of A(x)v, where A(x) is the matrix whose rows are the vectors ∇f_i(x)/‖∇f_i(x)‖. These are ε-transverse, so the ε-transversality lemma implies that

\[ \|(DF_\mu)_x(v)\|^2 \ge c(\varepsilon)^2 \min_{1 \le i \le n} \|\nabla f_i(x)\|^2\, \|v\|^2 . \]

Therefore, if we prove that for every µ large c(ε) min_{1≤i≤n} ‖∇f_i(x)‖ > 1 for every x ∈ C(µ), then the result follows. Let x ∈ C_i(µ); then f_i(x) − µ ≥ s̃_i(µ), and Lemma 1 implies that when µ → ∞, µ + s̃_i(µ) → ∞. Then it follows that ‖x‖ → ∞ and, as f_i is a C²-convex function, ‖∇f_i(x)‖ → ∞ as µ → ∞. This proves the proposition in case ν_i = 1 for each i = 1, ..., n.

For the general case, define, instead of C(µ), the set

\[ C_\nu(\mu) = \bigcap_{i=1}^{n} \{x : f_i(x) - \mu\nu_i \in [\tilde{s}_i(\mu\nu_i), s_i(\mu\nu_i)]\}, \]

and then proceed as above.

Remark 3.
– Observe that if any ν_i ≤ 0 then for every µ large the nonwandering set of F_µ is empty. This is a consequence of Lemma 1.
– To give a simple example in which the conditions of the above proposition hold, take any C²-convex endomorphism F = (f_1, f_2) of R² such that, for any i ∈ {1, 2} and x ∈ R², ∂_ii f_i(x) > ∂_jj f_i(x) for every j ≠ i. Then the level curves of f_1 are more vertical than horizontal, and those of f_2 are more horizontal than vertical. This gives an idea why the level curves have to be transverse. The proof is similar to the one we give in the next section.
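The functions s_i(µ) and s̃_i(µ) of Lemma 1, which enter Proposition 1 and Remark 3, can be written down explicitly in the simplest case. The following worked example is an illustration only; the choice f(x) = ‖x‖² is an assumption made here and does not appear in the paper.

```latex
% Illustrative example (assumed, not from the paper): f(x) = \|x\|^2 on R^n, i = n.
% Here H_f = 2\,\mathrm{Id}, so \alpha = 2; the critical line C_n(f) is the x_n-axis,
% \tilde{x}_n(t) = 0, and
\[
  \varphi_\mu(t) = f_\mu(0, \dots, 0, t) = t^2 - \mu .
\]
% s_n(\mu) is the positive solution of \varphi_\mu(s) = s:
\[
  s_n(\mu) = \frac{1 + \sqrt{1 + 4\mu}}{2},
\]
% and \tilde{s}_n(\mu) < 0 solves \varphi_\mu(\tilde{s}) = s_n(\mu), i.e.
% \tilde{s}^2 = \mu + s_n(\mu) = s_n(\mu)^2:
\[
  \tilde{s}_n(\mu) = -\, s_n(\mu).
\]
% The level set f_\mu^{-1}(s_n(\mu)) is the sphere of radius \sqrt{\mu + s_n(\mu)} = s_n(\mu),
% which is tangent to x_n = s_n(\mu) and to x_n = \tilde{s}_n(\mu) (item 1 of Lemma 1), and
\[
  \frac{s_n(\mu)}{\mu} \to 0, \qquad \frac{\tilde{s}_n(\mu)}{\mu} \to 0
  \qquad \text{as } \mu \to +\infty
\]
% (item 2 of Lemma 1), since s_n(\mu) grows like \sqrt{\mu}.
```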


Now we make a digression to discuss some topologies in the space of C²-convex functions of Rⁿ. The C²-weak topology given by uniform convergence on compact subsets does not seem useful, because any C²-convex function has arbitrarily small perturbations which are not even convex functions. This represents a difficulty since we are dealing with the behaviour at infinity. A C²-Whitney or strong neighborhood of a function f is given by continuous functions ε_i(x) > 0, i = 0, 1, 2, and is defined by:

\[ \mathcal{V}(f; \varepsilon_0, \varepsilon_1, \varepsilon_2) = \{ g \in C^2(\mathbb{R}^n) : \|H_f(x) - H_g(x)\| \le \varepsilon_2(x);\ \|\nabla f(x) - \nabla g(x)\| \le \varepsilon_1(x) \text{ and } |f(x) - g(x)| < \varepsilon_0(x) \text{ for every } x \}. \]

It is clear that CC²(Rⁿ) is open in C²(Rⁿ) when the strong topology is considered. This makes this topology more interesting in CC²(Rⁿ). Moreover, as C²(Rⁿ) is a Baire space (see [H]), it follows that CC²(Rⁿ) is also a Baire space. However, induced in the set of quadratic convex functions the Whitney topology is discrete, while the weak topology induces the natural topology of the norm, which we will use in the next section. In the space of C²-convex endomorphisms of Rⁿ we will use product topologies. This means that a strong small perturbation of an endomorphism F of Rⁿ is an endomorphism G such that each coordinate is close to the corresponding coordinate of F.

Remark 4. G_ν is open under the strong topology in the space of C²-convex endomorphisms of Rⁿ.

Proof. Let F be a C²-convex endomorphism in G_ν. Then F_µ = F − µν belongs to H_0 for every |µ| > µ_0. By Remark 1, there is a continuous and increasing function b(µ) such that b(µ) → +∞ as µ → +∞ and the nonwandering set of F_µ is contained in the complementary set of the ball centered at 0 and with radius b(µ). As H_0 is open, each F_µ has a neighborhood contained in H_0. The family {F_µ : µ ≥ µ_0} is not compact, but the nonwandering set of F_µ is determined by the restriction of F to a set of the form {x : b(µ) ≤ ‖x‖ ≤ const·√µ}, and there the values of a C²-strong perturbation G can be chosen close to F. Then the nonwandering set of G_µ must be conjugate to that of F_µ.

It is important to note that the C²-convexity is crucial, because it makes the nonwandering set go to ∞ when F and G are arbitrarily close. Compare this with the situation in Example 3 of the last section, where the distance from the nonwandering set of F_µ to 0 tends to 0 when µ → +∞.

In the following sections we will need to describe some perturbations of C²-convex endomorphisms and the effect of these perturbations on the level sets of the functions. Recall that if L is the level set of a C²-convex function, then i(L) denotes the convex bounded region of the complementary set of L. If a is any point in i(L) and S^{n−1} denotes the unit sphere of Rⁿ, then there exists a function ϕ_L : S^{n−1} → R⁺ such that {a + ϕ_L(θ)θ : θ ∈ S^{n−1}} = L. To prove the above, observe that each ray starting at a ∈ i(L) must intersect L because L is compact. This intersection must be unique because i(L) is strictly convex. We will call this function ϕ_L the parametrization of L. In this way it is clear that for each g ∈ CC²(Rⁿ), t_0 > min g and a ∈ i(g^{−1}(t_0)) there exists a function ϕ^g : S^{n−1} × (t_0, ∞) → R⁺


such that for each t > t_0, the function ϕ^g_t : S^{n−1} → R⁺ given by ϕ^g_t(θ) = ϕ^g(θ, t) defines the parametrization of g^{−1}(t). In other words, ϕ^g is the unique function satisfying g(a + ϕ^g(θ, t)θ) = t for every θ ∈ S^{n−1} and t > t_0. (Here we used polar coordinates in the domain of g.)

Suppose that g is as above and take a strong C²-neighborhood V of g such that every h ∈ V is C²-convex and satisfies a ∈ i(h^{−1}(t_0)). Then, for every t > t_0, we can define the parametrization ϕ^h_t of h^{−1}(t). This defines an operator ϕ from V into C²(S^{n−1} × (t_0, +∞)); i.e. ϕ(h) = ϕ^h. Considering the C²-strong topology also in this space of functions we have:

Lemma 4. The operator ϕ : V → C²(S^{n−1} × (t_0, +∞)) is continuous.

Proof. Let d be the distance from a to h^{−1}(t_0) and define Φ_h : S^{n−1} × (d, +∞) × (t_0, +∞) → R by Φ_h(θ, s, t) = h(a + sθ) − t. Observe that

\[ \frac{\partial^2}{\partial s^2} h(a + s\theta) = \langle H_h(a + s\theta)\theta, \theta \rangle \ge \alpha, \]

where H_h(a + sθ) is the Hessian matrix of h at the point a + sθ. It follows that

\[ \frac{\partial \Phi_h}{\partial s}(\theta, s, t) = \frac{\partial h}{\partial s}(a + s\theta) > 0 \]

for every s > d. (Geometrically, ∂h/∂s (a + sθ) is positive because for s > d and any θ the line a + sθ is transverse to the level sets of h, and when s increases, a + sθ cuts higher level sets of h.) Thus the implicit function theorem provides a C² function ϕ^h : S^{n−1} × (t_0, +∞) → R⁺ such that Φ_h(θ, ϕ^h(θ, t), t) = 0, and the dependence of ϕ^h on h is continuous because Φ_h depends continuously on h, by the parametrized implicit function theorem. (This follows from the parametrized version of the Inverse Mapping Theorem: Let X be a topological space, M a manifold and ψ : X × M → M such that for each x ∈ X, ψ_x is C^r and the map x → ψ_x is continuous. Fix x ∈ X, p ∈ M and suppose that the differential D_p ψ_x is invertible. Then there is a neighborhood N of x in X such that for every y ∈ N, ψ_y is locally C^r-invertible and the inverses depend continuously on y.) This proves the lemma.

The advantage in considering ϕ^g instead of g is that the high level sets of g are images of the compact set S^{n−1} under ϕ^g_t, simplifying the work with level curves.

Corollary 2. Let g_1, ..., g_n be C²-convex functions such that the set {g_i^{−1}(µ) : i = 1, ..., n} is ε(µ)-transverse for every µ > µ_0, where ε(µ) is a continuous function of µ with range contained in an open interval I bounded away from 0. Then there exists a small neighborhood of (g_1, ..., g_n) in the C²-strong topology, such that for every (h_1, ..., h_n) in that neighborhood, the set {h_i^{−1}(µ) : 1 ≤ i ≤ n} is ε′(µ)-transverse for every µ, where ε′(µ) belongs to I for every µ.


Proof. Let h = (h_1, ..., h_n) be a small C²-strong perturbation of g = (g_1, ..., g_n); each level curve g_i^{−1}(µ) of g_i is the image under ϕ^{g_i}_µ of S^{n−1}. By continuity of ϕ, the functions h_i can be chosen so that ϕ^{h_i}(S^{n−1} × {µ}) and ϕ^{g_i}(S^{n−1} × {µ}) are located at a distance that converges to 0 arbitrarily fast when µ → ∞. Therefore, as ε-transversality for ε ∈ I is open, the result follows.

4. Proof of Theorem 1

Consider F = (f_1, ..., f_n) where each component is given by f_i(x) = ⟨A_i x, x⟩ + L_i(x) + a_i, with A_i a symmetric matrix, L_i a linear function and a_i ∈ R. We are not supposing that the matrices A_i are positive, so F is not necessarily convex. Assume first that
1. {x : ⟨A_i x, x⟩ = 0, i = 1, ..., n} ∩ S^{n−1} = ∅,
2. {⟨A_i x, x⟩ = ±ν_i}, i = 1, ..., n, is transverse for all possible choices of + and −,
3. A_i is invertible, i = 1, ..., n.

Under these conditions (that will be shown to be open and dense), we will show that:
(a) ∞ is an attractor for F.
(b) F_µ = F − µν belongs to H_0 for every large value of |µ|.

Proof of (a). Condition 1 and continuity imply that there exists δ > 0 such that

\[ \bigcap_{i=1}^{n} \{x : |\langle A_i x, x\rangle| < \delta\} \cap S^{n-1} = \emptyset. \]

Using Condition 1 we see that for every x ∈ Rⁿ \ {0} there exists some index i such that |⟨A_i x/‖x‖, x/‖x‖⟩| ≥ δ; then we will have:

\[ \|F(x)\|^2 = \sum_{j=1}^{n} (\langle A_j x, x\rangle + B_j(x))^2 \ge (\|x\|^2 \delta - |B_i(x)|)^2 . \]

As each B_i = L_i + a_i is a polynomial of degree ≤ 1, it follows that there exist constants b_1, b_2 such that |B_i(x)| ≤ b_1‖x‖ + b_2 for every x. Then there exists δ_0 > 0 such that

\[ \|F(x)\| \ge \delta_0 \|x\|^2 \tag{4} \]

for every ‖x‖ large; this implies (a).

Proof of (b). Observe first that in the proof of (a) we use only Condition 1 and not the others, so ∞ is an attractor for every F_µ. Let D(r) be the open ball in Rⁿ of radius r and centered at the origin.

Claim. There exist numbers 0 < r_1 < r_2 such that

\[ \mathbb{R}^n \setminus B_\infty(\mu) \subset D(r_2\sqrt{|\mu|}) \setminus D(r_1\sqrt{|\mu|}) \]

for every |µ| large.


Proof of the Claim. Take x ∉ D(r_2√|µ|), r_2 to be fixed. Then using Condition 1 as in the proof of (a) we find that for some 1 ≤ i ≤ n:

\[ \|F_\mu(x)\| \ge \delta_0\|x\|^2 - |\mu\nu_i| \ge \delta_0\|x\|^2 - |\mu| \max|\nu_i| \ge \delta_0\|x\|^2 - \max|\nu_i| \frac{\|x\|^2}{r_2^2} \ge \delta_1 \|x\|^2 \]

for some δ_1 > 0 and every x large, if r_2² is taken > max|ν_i|/δ_0. (We used (4), where ‖x‖ was required to be large; so begin by taking |µ| large to assure this condition.) This implies that ‖F_µ(x)‖ ≥ 2‖x‖ if x ∉ D(r_2√|µ|) and µ is large. Now suppose that x ∈ D(r_1√|µ|), r_1 to be fixed. It is clear that |f_i(x)| ≤ K_1‖x‖² + K_2 for some positive constants K_1, K_2, every 1 ≤ i ≤ n and x ∈ Rⁿ. Then

\[ \|F_\mu(x)\|^2 = \sum_{i=1}^{n} (f_i(x) - \mu\nu_i)^2 \ge (f_i(x) - \mu\nu_i)^2 \]

for each 1 ≤ i ≤ n; in particular,

\[ \|F_\mu(x)\| \ge \max|\nu_i|\,|\mu| - K_1 r_1^2 |\mu| - K_2 \ge r_2\sqrt{|\mu|}, \]

if r_1 is small and |µ| large. Then, by the first part of the proof of the claim, it follows that F_µ(x) ∈ B_∞(µ) and so x ∈ B_∞(µ). The claim is proved.

Consequently, if C(µ) = D(r_2√|µ|) \ D(r_1√|µ|), then:

\[ \mathbb{R}^n \setminus B_\infty(\mu) = \bigcap_{k=1}^{\infty} F_\mu^{-k}(C(\mu)). \]

As each A_i is invertible by Condition 3, there exists a constant d > 0 such that ‖A_i x‖ ≥ d‖x‖ for every 1 ≤ i ≤ n and x ∈ Rⁿ. Now fix x_0 ∉ B_∞(µ) and let us prove that (DF_µ)_{x_0} expands every nonzero vector v uniformly in x_0.

For every 1 ≤ i ≤ n the level s_i defined by s_i := f_i(x_0) − µν_i belongs to (−r_2√|µ|, r_2√|µ|), because the contrary assumption implies ‖F_µ(x_0)‖ ≥ r_2√|µ| and then x_0 ∈ B_∞(µ). By Condition 2 plus continuity, it follows that there exists ε > 0 such that {x : ⟨A_i x, x⟩ = ν_i} for 1 ≤ i ≤ n is an ε-transverse set. Also, the intersection of these sets is compact, by Condition 1 and the proof of (a). This gives the ingredients necessary to apply the transversality lemma, as we did in Proposition 1. First observe that the level sets {x : f_i(x) = µν_i + s_i} for 1 ≤ i ≤ n form an ε/2-transverse set if µ is large, and

\[
\begin{aligned}
\{x : f_i(x) - \mu\nu_i = s_i\} &= \{x : f_i(x) - s_i = \mu\nu_i\} = \Bigl\{x : \frac{f_i(x) - s_i}{\mu} = \nu_i\Bigr\} \\
&= \Bigl\{x : \operatorname{sgn}(\mu)\Bigl(\frac{\langle A_i x, x\rangle}{|\mu|} + \frac{L_i(x)}{|\mu|} + \frac{a_i - s_i}{|\mu|}\Bigr) = \nu_i\Bigr\} \\
&= \sqrt{|\mu|}\,\Bigl\{x : \langle A_i x, x\rangle + \frac{L_i(x)}{\sqrt{|\mu|}} + \frac{a_i - s_i}{|\mu|} = \operatorname{sgn}(\mu)\,\nu_i\Bigr\},
\end{aligned}
\]

where sgn(µ) is the sign of µ. As the functions x → L_i(x)/√|µ| + (a_i − s_i)/|µ| are very small in the C²-topology on compact sets when |µ| is large (recall that s_i(µ)/µ and s̃_i(µ)/µ go to 0 as µ goes to +∞), and the level sets {⟨A_i x, x⟩ = ν_i} are regular and ε-transverse, the family of level sets {f_i(x) = µν_i + s_i} is ε/2-transverse for every µ large, as was claimed.

Finally, for x ∉ B_∞(µ) and 1 ≤ i ≤ n, ‖A_i(x)‖ ≥ d r_1 √|µ|; then, as a consequence of the ε-transversality lemma, F_µ is expanding outside B_∞(µ). This proves (b).

It remains to prove that Conditions 1, 2 and 3 are open and dense in the topology of the norm of the matrices (which corresponds with the weak topology). The first and third conditions come from the fact that eigenvalues and eigenvectors depend continuously on the matrix, and for the second, take first generically a matrix A_2 such that the level sets corresponding to A_1 and A_2 are transverse (thus the intersection will be a manifold of dimension n − 2 or else the empty set). Then proceed by induction.
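The Claim and the expansion estimate of this section can be illustrated on a classical planar example. The sketch below rests on assumptions that are not in the paper: identifying R² with C, the map F(x, y) = (x² − y², 2xy) is z ↦ z², and with ν = (1, 0) the family is F_µ(z) = z² − µ. Here A_1 = diag(1, −1) and A_2 = [[0, 1], [1, 0]] are invertible and their quadratic forms have no common zero on the unit circle, so Conditions 1 and 3 hold; Condition 2 appears to hold as well at the intersection points (±1, 0) and (0, ±1). The function name `backward_orbit` is hypothetical. Iterated preimages of a fixed point never escape, so they lie in the complement of B_∞(µ).

```python
import cmath
import math

def backward_orbit(mu, depth):
    """Points of F_mu^{-depth}(z*) for F_mu(z) = z**2 - mu, z* the larger fixed point."""
    z_star = (1 + math.sqrt(1 + 4 * mu)) / 2      # fixed point, real for mu > -1/4
    level = [complex(z_star)]
    for _ in range(depth):
        nxt = []
        for w in level:
            root = cmath.sqrt(w + mu)             # the two preimages of w are +root, -root
            nxt.extend([root, -root])
        level = nxt
    return level

mu = 10.0
points = backward_orbit(mu, 6)
ratios = [abs(z) / math.sqrt(mu) for z in points]   # to compare with r_1, r_2 of the Claim
min_expansion = min(2 * abs(z) for z in points)     # |F_mu'(z)| = 2|z|
```

For µ = 10 all 64 sampled points have modulus between 0.5·√µ and 1.5·√µ, in agreement with the annulus of the Claim, and the derivative modulus 2|z| exceeds 1 at each of them, which is consistent with F_µ being expanding on the complement of B_∞(µ).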

5. Examples

Example 1 (Delay endomorphisms). An endomorphism of R² of the form F(x, y) = (y, f(x, y)) is called a delay endomorphism. Suppose that f(x, y) = ax² + by², with a, b > 0, and let ν = (0, 1). The function f is C²-convex, so ∞ is an attractor for every F_µ = F − µ(0, 1). If b ≫ a, it follows from [RV] that for every large µ > 0, F_µ has 2 saddle type fixed points. The stable manifolds of these fixed points play an important role in the understanding of the dynamics of F_µ. (For a recent work on invariant manifolds of endomorphisms see [S].) Moreover the complement of B_∞ is the closure of the stable manifold of these fixed points, which turns out to be homeomorphic to the product of a Cantor set and a circle. These endomorphisms are hyperbolic, and satisfy the conditions of Przytycki [P], so they are also structurally stable. It follows that for every strong perturbation G of F, the family G_µ has the same dynamical behavior as F_µ. This shows that G_ν is not dense in the strong topology. In addition, if only the second coordinate of F is perturbed within the quadratic functions, then the same results of [RV] can be applied, and the perturbed family is again not in H_0. In view of Theorem 1 we conclude that both coordinates should be perturbed to obtain an endomorphism in G_ν. Moreover, Theorem 1 also gives sufficient conditions (1 to 3 at the beginning of Sect. 4) that are easy to check in general. For example, G(x, y) = (y + ε_1 x² + ε_2 y², ax² + by²) belongs to G_ν whenever ε_1/ε_2 ≠ a/b.

Example 2. Next we will construct an example of a C²-convex endomorphism whose level curves are not transverse enough to obtain expansivity. Furthermore, every C²-strong perturbation of this transformation gives rise to a one parameter family which is also nonexpanding for all parameters µ large. This should be compared with the situation in quadratic endomorphisms, where genericity holds but a different topology is considered. There exists a C²-convex endomorphism F in R² such that for every small C²-strong perturbation G of F, the family {G_µ : µ > µ_0} does not intersect H_0. In fact, let b : R → R be any C² function satisfying:


1. b(0) = b′(0) = 0,
2. b″(x) > 1 for every x ∈ R, and
3. 1/2 < 2x − b′(x) < 3/4 for every x ≥ 1.

First we exhibit a function b satisfying the conditions above. Take b such that b(0) = b′(0) = 0, b″(x) = 3/2 for |x| ≤ 1, b″(x) = 2 for |x| ≥ 3/2 and b″(x) ∈ (3/2, 2) for |x| ∈ (1, 3/2). Then

\[ 2x - b'(x) = \int_0^x (2 - b''(t))\,dt \in (1/2, 3/4) \]

for every x > 1. Define f_µ(x, y) = x² + b(y) − µ and g_µ(x, y) = f_µ(y, x). It follows that each element of the family F_µ = (f_µ, g_µ) is a C²-convex endomorphism of R². The functions x → φ_µ(x) = x² + b(x) − µ verify φ_µ(0) = −µ, φ_µ′(0) = 0 and φ_µ″(x) ≥ 3 for every x. It follows that φ_µ has a fixed point x_µ > 0 such that x_µ → +∞ when µ → +∞. It is clear that the point P_µ = (x_µ, x_µ) is fixed for F_µ. Observe that {∇f(P_µ), ∇g(P_µ)} is ε-transverse if and only if

\[ \varepsilon < \frac{4x_\mu^2 - b'^2(x_\mu)}{4x_\mu^2 + b'^2(x_\mu)}. \]

Using the third condition of the definition of b it follows that 4x_µ² − b′²(x_µ) < 4x_µ − 1. Thus the set {∇f(P_µ), ∇g(P_µ)} is not ε-transverse for ε = (4x_µ − 1)/(4x_µ² + b′²(x_µ)). P_µ is a saddle type fixed point, with one eigenvalue in (0, 1).

Now consider the C²-convex functions given by f̃(x, y) = f(x, y) − x and g̃(x, y) = g(x, y) − y. Observe that f̃^{−1}(µ) ∩ g̃^{−1}(µ) is the set of fixed points of F_µ. In addition, if F̃ = (f̃, g̃), then DF̃_{P_µ} = DF_{P_µ} − Id has an eigenvalue in (−1, 0); so it follows that the transversality of {f̃^{−1}(µ), g̃^{−1}(µ)} is

\[ \varepsilon(\mu) \in \left(0, \frac{|\det D\tilde{F}_{P_\mu}|}{\|\nabla \tilde{f}\|\,\|\nabla \tilde{g}\|}\right). \]

This, by Corollary 2, is preserved by small perturbations, and it follows that the perturbed family must have a saddle type fixed point P′(µ) for every µ. This prevents the new family from belonging to H_0.

Example 3. G_ν is not open in C^r(R, R) with the strong C^r-topology. Let f be an even function having derivative f′(x) > 2 for every x > 1, having a unique critical point at x = 0, f(0) = 0 and negative Schwarzian derivative. Suppose also that for the family f_µ = f − µ, the following conditions hold:
(i) x_µ > 1 is a fixed point of f_µ,
(ii) f_µ²(0) > x_µ and f_µ²(0) − x_µ → 0 as µ → ∞.
Then, as f has negative Schwarzian derivative and the critical orbit intersects (x_µ, +∞) ⊂ B_∞, it follows that f ∈ G_1. Now, if g is a small perturbation of f such that f = g outside |x| ≤ 1, g has its unique critical point at 0 and g(0) = f(0) + ε, then g_µ²(0) < x_µ for every µ > 0 large. This implies that the whole interval [−x_µ, x_µ] is invariant and g ∉ G_1. To


construct f, begin with f(x) = 2x² in [−1, 1] and choose f′ decreasing to 2 at infinity. It is easy to see that f can be taken C^∞ with negative Schwarzian (f‴ ≤ 0 for x > 0). The items are satisfied if a careful choice of the first derivative of f is made outside [−1, 1]. Observe that if f′ were constant and equal to 2 the first item would not hold, and if f′ is constant > 2 then the second one is not true. Take for example f(x) = 2x + 1/x for x > β > 1.

Acknowledgement. We thank CDCHT-UCLA (Venezuela) and Pedeciba (Uruguay) for partial financial support. We also thank the referee for many helpful suggestions and comments.

References

[BSV] Bofill, F., Szlenk, W. and Vilamajó, F.: Discrete time delayed dynamical systems. An example. European Conference on Iteration Theory, ECIT 91
[H] Hirsch, M.: Differential Topology. New York: Springer-Verlag, 1976
[MP] Mañé, R. and Pugh, C.: Stability of Endomorphisms. Symp. Warwick Dynamical Systems. Lecture Notes in Mathematics Vol. 468, Berlin–Heidelberg–New York: Springer-Verlag, 1975, pp. 175–184
[P] Przytycki, F.: On Ω-stability and structural stability of endomorphisms satisfying Axiom A. Studia Mathematica LX, 61–77 (1977)
[RV] Rovella, A. and Vilamajó, F.: Convex Delay Endomorphisms. Commun. Math. Phys. 174, 393–407 (1995)
[S] Sander, E.: Hyperbolic sets for noninvertible maps and relations. Ph. D. Thesis, University of Minnesota, 1996 and to appear at ZAMP

Communicated by Ya. G. Sinai

Commun. Math. Phys. 195, 309 – 319 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

On Pentagon, Ten-Term, and Tetrahedron Relations

R. M. Kashaev¹,*,**, S. M. Sergeev²,***

¹ Laboratoire de Physique Théorique ENSLAPP†, ENSLyon, 46 Allée d'Italie, 69007 Lyon, France
² Scientific Center of Institute of Nuclear Physics SB RAS, Protvino, Moscow Region 142 284, Russia

Received: 8 August 1996 / Accepted: 25 November 1997

Abstract: It is shown that the tetrahedron equation under the substitution $R_{123} = \bar S_{13} P_{23} S_{13}$, where $P_{23}$ is the permutation operator, is reduced to a pair of pentagon and one ten-term equations on operators $S$ and $\bar S$. Examples of infinite dimensional solutions are found. O-doubles of Novikov, which generalize the Heisenberg double of a Hopf algebra, provide a particular algebraic solution to the problem.

1. Introduction

The Yang–Baxter equation (YBE) [25, 3] can be considered as a tool for both constructing and solving integrable two-dimensional models of statistical mechanics and quantum field theory [2, 9]. Recent progress in understanding the algebraic structure lying behind the YBE has led to the theory of quasi-triangular Hopf algebras [7]. The tetrahedron (or three-simplex) equation (TE) [26] has been introduced as a three-dimensional generalization of the YBE. Before describing it in the algebraic form, first consider an associative unital algebra $A$, and define an important notation to be used throughout the paper. Namely, for each set of distinct integers $\{i_1, i_2, \ldots, i_m\} \subset \{1, 2, \ldots, n\}$, for $m < n$, define the algebra homomorphism $\tau_{i_1,i_2,\ldots,i_m}: A^{\otimes m} \to A^{\otimes n}$ such that

$$a \otimes b \otimes \ldots \otimes c \mapsto 1 \otimes \ldots \otimes a \otimes \ldots \otimes b \otimes \ldots \otimes c \otimes \ldots \otimes 1,$$

* On leave of absence from St. Petersburg Branch of the Steklov Mathematical Institute
** Supported by MAE-MICECO-CNRS Fellowship
*** Partially supported by INTAS Grant 93-2492 and RFFR Grant 95-01-00249
† URA 14-36 du CNRS, associée à l'E.N.S. de Lyon, et à l'Université de Savoie


where $a, b, \ldots, c$ on the r.h.s. stand in the $i_1$-th, $i_2$-th, $\ldots$, $i_m$-th positions, respectively, with unit elements in the others. The notation to be used follows:

$$u_{i_1 i_2 \ldots i_m} := \tau_{i_1,i_2,\ldots,i_m}(u), \qquad u \in A^{\otimes m}, \tag{1.1}$$

i.e. the subscripts indicate the way an element of the algebra $A^{\otimes m}$ is interpreted as an element of the algebra $A^{\otimes n}$. Next, we shall find it convenient to use the “permutation operator” $P$, which is an (additional) element in $A^{\otimes 2}$, defined by

$$P\, a \otimes b = b \otimes a\, P, \qquad P^2 = 1 \otimes 1, \qquad a, b \in A. \tag{1.2}$$

The (constant) TE is a nonlinear relation in $A^{\otimes 6}$ on an invertible element $R \in A^{\otimes 3}$:

$$R_{123} R_{145} R_{246} R_{356} = R_{356} R_{246} R_{145} R_{123}, \tag{1.3}$$

where we use notation (1.1). One can also introduce the higher-dimensional counterparts of the YBE [6], the $n$-simplex equations. For example, the (constant) four-simplex equation (FSE) is a relation in $A^{\otimes 10}$ on an invertible element $B \in A^{\otimes 4}$:

$$B_{0123} B_{0456} B_{1478} B_{2579} B_{3689} = B_{3689} B_{2579} B_{1478} B_{0456} B_{0123}. \tag{1.4}$$

Many solutions have already been found for the TE, see e.g. [26, 4, 5, 14, 15, 16, 10, 19], though an adequate algebraic framework (an analog of quasi-triangular Hopf algebras) is still missing. Practically nothing is known for the higher-simplex equations. The purpose of this paper is to make a step towards the algebraic theory of the TE. Our main result is that one and the same system of equations

$$S_{12} S_{13} S_{23} = S_{23} S_{12}, \tag{1.5}$$

$$\bar S_{23} \bar S_{13} \bar S_{12} = \bar S_{12} \bar S_{23}, \tag{1.6}$$

in $A^{\otimes 3}$ and

$$\bar S_{12} S_{13} \bar S_{14} S_{24} \bar S_{34} = S_{24} \bar S_{34} S_{14} \bar S_{12} S_{13}, \tag{1.7}$$

in $A^{\otimes 4}$ on elements $S, \bar S \in A^{\otimes 2}$ implies both the TE for the combination

$$R^S_{123} = \bar S_{13} P_{23} S_{13}, \tag{1.8}$$

and the FSE for the combination

$$B^S_{0123} = \bar S_{13} P_{01} P_{23} S_{13}. \tag{1.9}$$

We should warn, however, that formula (1.9) is too restrictive to give genuinely four-dimensional models. As a matter of fact, it corresponds to a non-interacting system of three-dimensional models. The manifest symmetry properties of Eqs. (1.5)–(1.7) are given by the following transformations of $S$ and $\bar S$:

$$S_{12} \leftrightarrow \bar S_{21}, \qquad S_{12} \leftrightarrow (\bar S_{12})^{-1}. \tag{1.10}$$

Equations (1.5) and (1.6) are the two forms of the celebrated pentagon equation (PE), which appears in various forms in
• representation theory of (quantum) groups as the Biedenharn–Elliott identity for 6j-symbols;
• quantum conformal field theory as the identity for the fusion matrices [20];
• quasi-Hopf algebras as the consistency equation for the associator [8].

In the form of Eqs. (1.5), (1.6) the PE first appeared in the geometric approach to three-dimensional integrable systems [17, 18]. In [18] a reduction of the TE to the PE has been suggested. Finally, the PE in the form (1.5), (1.6) was shown in [12] to be intimately related with the Heisenberg double of a Hopf algebra [22, 1, 23]. In particular, using the inclusion of the Drinfeld double into the tensor product of two Heisenbergs, one can reduce the YBE to the PE. Equation (1.7), the “ten-term” relation on two different solutions of the PE, appears to be satisfied by canonical elements in the O-double of a Hopf algebra, introduced in [21] as a generalization of the Heisenberg double. Thus, the O-double is, probably, the simplest algebraic framework for the TE.

The paper is organized as follows. In Sect. 2 particular solutions for the PE, which generalize solutions associated with the Heisenberg doubles of group algebras, are considered. The results of this section are used in the next one for construction of particular solutions (typically infinite dimensional) for system (1.5)–(1.7). In Sect. 3 the latter system is derived from the TE and FSE, and examples of solutions are described. In Sect. 4 the O-double construction of a special class of solutions is presented.

2. Pentagon Relation for a Rational Transformation

2.1. Notation. In this and the next sections we shall use the following notation. For two sets $X, Y$ the symbol $Y^X$ denotes the space of mappings from $X$ to $Y$. If $Y$ is a linear space over a field $k$, the space $Y^X$ is also considered as a $k$-linear space:

$$k \times Y^X \ni (a, f) \mapsto af \in Y^X, \qquad Y^X \times Y^X \ni (f, g) \mapsto f + g \in Y^X,$$

where

$$(af)(x) := a f(x), \qquad (f + g)(x) := f(x) + g(x), \qquad \forall x \in X.$$

The composition $UV$ of two linear operators $U, V$ in $Y^X$ will be understood in the standard way

$$UVf := UV(f) := U(V(f)). \tag{2.1}$$

With each injective mapping

$$i: \{1, \ldots, m\} \ni k \mapsto i_k \in \{1, \ldots, n\}, \qquad 0 < m \le n,$$

and any mapping

$$\alpha: X^{\{1,\ldots,m\}} \to X^{\{1,\ldots,m\}}$$

associate the mapping

$$\alpha_{i_1,\ldots,i_m}: X^{\{1,\ldots,n\}} \to X^{\{1,\ldots,n\}}$$

defined as follows:

$$\alpha_{i_1,\ldots,i_m}(f)(k) := \begin{cases} \alpha(f \circ i)(s), & \text{if } k = i_s; \\ f(k), & \text{otherwise.} \end{cases}$$

Let $U$ be the linear operator associated with $\alpha$ in $\mathbb{C}^{X^{\{1,\ldots,m\}}}$ defined by

$$Uf := f \circ \alpha, \qquad f \in \mathbb{C}^{X^{\{1,\ldots,m\}}}. \tag{2.2}$$

Then, the linear operator corresponding to $\alpha_{i_1,\ldots,i_m}$ in $\mathbb{C}^{X^{\{1,\ldots,n\}}}$ will be denoted as $U_{i_1 \ldots i_m}$:

$$U_{i_1 \ldots i_m} f := f \circ \alpha_{i_1,\ldots,i_m}, \qquad f \in \mathbb{C}^{X^{\{1,\ldots,n\}}}. \tag{2.3}$$

Finally note that the space $X^{\{1,\ldots,m\}}$ is naturally identified with the $m$th Cartesian power of $X$:

$$X^{\{1,\ldots,m\}} \equiv \underbrace{X \times \ldots \times X}_{m \text{ times}}.$$

2.2. From the results of the paper [12] it follows that for a group $G$ the operator

$$(S\varphi)(x, y) = \varphi(xy, y), \qquad \varphi \in \mathbb{C}^{G \times G}, \qquad x, y \in G, \tag{2.4}$$

satisfies PE (1.5), where $S_{ij}$ is defined in Eq. (2.3) with $m = 2$, $n = 3$, $U = S$, and $\alpha$ being the mapping of $G \times G$ to $G \times G$ associated with $S$ through Eq. (2.2). This is the “coordinate” representation for the canonical element in the Heisenberg double of the group algebra. In this section we generalize this result. Let $M$ be some set. Define the operator $S$

$$(S\varphi)(x, y) = \varphi(x \cdot y, x * y), \qquad \varphi \in \mathbb{C}^{M \times M}, \tag{2.5}$$

for some mappings

$$M \times M \xrightarrow{\ \cdot\ } M, \qquad M \times M \xrightarrow{\ *\ } M. \tag{2.6}$$

We shall call these the dot- and star-mapping, respectively. Again defining $S_{ij}$ in accordance with Eq. (2.3), and imposing PE (1.5), we obtain the following equations:

$$(x \cdot y) \cdot z = x \cdot (y \cdot z), \tag{2.7}$$

$$(x * y) \cdot ((x \cdot y) * z) = x * (y \cdot z), \tag{2.8}$$

$$(x * y) * ((x \cdot y) * z) = y * z. \tag{2.9}$$
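Conditions (2.7)–(2.9) can be tested mechanically on a finite example. The following is a minimal sketch, assuming the symmetric group $S_3$ as the set $M$ with the group product as dot-mapping; the helper names (`mul`, `pentagon_conditions_hold`) are illustrative and not taken from the paper. It checks the group case of Eq. (2.4), where $x * y = y$, and contrasts it with a deliberately wrong star-mapping.

```python
from itertools import permutations, product

# Elements of the symmetric group S3 as tuples p, with p[i] the image of i.
ELEMENTS = list(permutations(range(3)))

def mul(p, q):
    """Group product in S3: (p q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(3))

def pentagon_conditions_hold(dot, star, elements):
    """Check Eqs. (2.7)-(2.9) for every triple (x, y, z) of elements."""
    for x, y, z in product(elements, repeat=3):
        xy = dot(x, y)
        if dot(xy, z) != dot(x, dot(y, z)):                      # (2.7)
            return False
        if dot(star(x, y), star(xy, z)) != star(x, dot(y, z)):   # (2.8)
            return False
        if star(star(x, y), star(xy, z)) != star(y, z):          # (2.9)
            return False
    return True

# Group case behind Eq. (2.4): x * y := y.
group_case = pentagon_conditions_hold(mul, lambda x, y: y, ELEMENTS)
# A deliberately wrong star-mapping, x * y := x, for contrast.
wrong_case = pentagon_conditions_hold(mul, lambda x, y: x, ELEMENTS)
print(group_case, wrong_case)
```

The same checker accepts any finite set with any pair of binary operations, so it can be reused to probe candidate star-mappings before attempting a proof.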

Proposition 1. (i) Let $M$ be a subset of a group $G$ closed under multiplication, with the dot-mapping given by the multiplication in $G$, and let mappings $\lambda, \mu: M \to G$ be such that $\forall x, y \in M$:

$$x * y := \mu(x)^{-1}\mu(xy) \in M, \qquad \mu(x * y) = \lambda(x)\mu(y); \tag{2.10}$$

then Eqs. (2.8)–(2.9) are satisfied; (ii) if furthermore $1 \in M$, then $x * y = y$ is the only solution to the system (2.8)–(2.9).

Proof. The proof of (i) is straightforward. Let us prove (ii). Putting $x = 1$ in Eq. (2.8), we immediately obtain the first equation from (2.10), where $\mu(x) = 1 * x$. The invertibility condition for the operator $S$ implies invertibility of the function $\mu$. Putting $y = 1$ in Eq. (2.9), noting that $x * 1 = 1$, one gets the second equation in (2.10), where $\lambda(x) = 1$.¹ Together with the invertibility property of the function $\mu$ this implies $x * y = y$.

Recall that the $S$-operator, given by Eq. (2.4), in the case where $M$ is a group, is associated with the Heisenberg double of the group algebra. There are, however, other solutions if the set $M$ is not a group.

¹ We thank the referee for pointing out this substitution.

Proposition 2. Let $M$ be a subset of an associative unital ring where the definitions

$$x \cdot y := xy, \qquad x * y := (1 - x^{\epsilon})^{-\epsilon}\,(1 - (xy)^{\epsilon})^{\epsilon}, \qquad \epsilon = \pm 1, \tag{2.11}$$

make sense. Then, Eqs. (2.7)–(2.9) are satisfied.

The proof is straightforward. Note that solution (2.11) is such that Eqs. (2.10) are satisfied as well with

$$\mu(x) = (1 - x^{\epsilon})^{\epsilon}, \qquad \lambda(x) = (1 - x^{-\epsilon})^{-\epsilon}.$$
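The statement of Proposition 2 can be spot-checked in exact arithmetic. The sketch below assumes the commutative ring of rational numbers and reads the sign in (2.11) as a parameter `eps` equal to $\pm 1$; the function names are illustrative. It verifies (2.8), (2.9) and the second relation of (2.10) on a few sample points, which is a sanity check rather than a proof.

```python
from fractions import Fraction
from itertools import product

def mu(x, eps):
    """mu(x) = (1 - x^eps)^eps, with eps = +1 or -1."""
    return (1 - x ** eps) ** eps

def lam(x, eps):
    """lambda(x) = (1 - x^(-eps))^(-eps)."""
    return (1 - x ** (-eps)) ** (-eps)

def star(x, y, eps):
    """Star-mapping of (2.11) in the commutative case: mu(x)^(-1) mu(xy)."""
    return mu(x * y, eps) / mu(x, eps)

def check(eps, samples):
    """Return True if (2.8), (2.9) and mu(x*y) = lambda(x) mu(y) hold on all sample triples."""
    for x, y, z in product(samples, repeat=3):
        a = star(x, y, eps)
        b = star(x * y, z, eps)
        if a * b != star(x, y * z, eps):            # (2.8)
            return False
        if star(a, b, eps) != star(y, z, eps):      # (2.9)
            return False
        if mu(a, eps) != lam(x, eps) * mu(y, eps):  # second relation in (2.10)
            return False
    return True

# Sample points in ]0, 1[ keep every denominator away from zero.
SAMPLES = [Fraction(1, 2), Fraction(2, 3), Fraction(3, 5)]
plus_ok = check(+1, SAMPLES)
minus_ok = check(-1, SAMPLES)
print(plus_ok, minus_ok)
```

Using `Fraction` avoids floating-point tolerance questions, so equality tests are exact.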

Proposition 3. Let $M = \,]0, 1[\, \subset \mathbb{R}$ be the open unit interval with the dot-mapping given by the multiplication in $\mathbb{R}$, and let the star-mapping be continuously differentiable. Then, system (2.8)–(2.9) is satisfied iff

$$x * y = y \left( \frac{1 - x^{1/\alpha}}{1 - (xy)^{1/\alpha}} \right)^{\alpha}, \tag{2.12}$$

where real $\alpha \ge 0$, and the case $\alpha = 0$ is understood as the limit $\alpha \to 0^+$.

Proof. Solution (2.12) is described by part (i) of Proposition 1, where $G = \mathbb{R}_+$ with the group structure given by the multiplication in $\mathbb{R}$, and

$$\mu(x) = (x^{-1/\alpha} - 1)^{-\alpha}, \qquad \lambda(x) = (1 - x^{1/\alpha})^{\alpha},$$

so formula (2.12) does satisfy system (2.8)–(2.9). Let us prove that it is the only continuously differentiable solution. First, there exists a strictly increasing and continuously differentiable function

$$\mu: \,]0, 1[\, \to \mathbb{R}_+ \tag{2.13}$$

such that Eqs. (2.10) are satisfied. Indeed, differentiating Eq. (2.8) with respect to $z$ we obtain

$$f(xy, z) = f(x, yz), \qquad \forall x, y, z \in \,]0, 1[, \tag{2.14}$$

where

$$f(x, y) := \partial \log(x * y) / \partial \log y. \tag{2.15}$$

The r.h.s. of Eq. (2.14) is well defined also for $z = 1$, therefore so is the l.h.s. Thus we obtain

$$f(x, y) = 1/w(xy), \qquad w(x) := 1/f(x, 1) \in \mathbb{R}_+, \qquad \forall x, y \in \,]0, 1[.$$

By definition the function $w(x)$ is continuous. Define the function (2.13) as a solution to the following differential equation

$$w(x) = d \log x / d \log \mu(x). \tag{2.16}$$

Solving now Eq. (2.15) w.r.t. $x * y$ we come to the first equation in (2.10), where the integration constant is fixed by Eq. (2.8). The second equation in (2.10) is a simple consequence of the first one and Eq. (2.9). Next, let us differentiate the second equation in (2.10) w.r.t. $y$. The result can be written as

$$w\big(\mu(xy)/\mu(x)\big) = w(y)/w(xy), \tag{2.17}$$

while differentiation of the former equation w.r.t. $x$ combined with Eq. (2.17) gives


$$w(xy) = w(x) - w(y)\,w(x)\, d \log \lambda(x) / d \log x. \tag{2.18}$$

Consistency of the last equation under the permutation $x \leftrightarrow y$ fixes the derivative of the $\lambda$-function up to a real constant $c$:

$$d \log \lambda(x) / d \log x = c - 1/w(x).$$

Plugging this back into Eq. (2.18), we obtain the closed functional equation

$$w(xy) = w(x) + w(y) - c\,w(x)\,w(y). \tag{2.19}$$

Here $c \ne 0$ (otherwise $w(x) \sim \log x$, and Eqs. (2.17) and (2.16) are in contradiction), therefore, Eq. (2.19) can be rewritten in the form

$$1 - c\,w(xy) = (1 - c\,w(x))(1 - c\,w(y)),$$

the general continuous solution of which is well known:

$$1 - c\,w(x) = x^{1/\alpha}, \qquad 1/\alpha \in \mathbb{R}.$$
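As a numerical sanity check of this step, the sketch below evaluates $w(x) = (1 - x^{1/\alpha})/c$ at a few points and confirms that it satisfies (2.19), and that for $c = 1$ it also satisfies (2.17) with the star-mapping (2.12). The sample points, the chosen values of $\alpha$ and $c$, and the tolerance are assumptions made for illustration only.

```python
import math

def w(x, alpha, c=1.0):
    """Candidate solution of (2.19): 1 - c w(x) = x^(1/alpha)."""
    return (1.0 - x ** (1.0 / alpha)) / c

def star(x, y, alpha):
    """Star-mapping of Eq. (2.12), equal to mu(xy)/mu(x)."""
    ratio = (1.0 - x ** (1.0 / alpha)) / (1.0 - (x * y) ** (1.0 / alpha))
    return y * ratio ** alpha

def satisfies_2_19(alpha, c, points):
    """Check w(xy) = w(x) + w(y) - c w(x) w(y) on all pairs of points."""
    for x in points:
        for y in points:
            lhs = w(x * y, alpha, c)
            rhs = w(x, alpha, c) + w(y, alpha, c) - c * w(x, alpha, c) * w(y, alpha, c)
            if not math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-12):
                return False
    return True

def satisfies_2_17(alpha, points):
    """Check w(x * y) = w(y) / w(xy) for c = 1 on all pairs of points."""
    for x in points:
        for y in points:
            lhs = w(star(x, y, alpha), alpha)
            rhs = w(y, alpha) / w(x * y, alpha)
            if not math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-12):
                return False
    return True

POINTS = [0.2, 0.5, 0.9]
ok_2_19 = all(satisfies_2_19(a, c, POINTS) for a in (0.5, 1.0, 3.0) for c in (0.5, 1.0, 2.0))
ok_2_17 = all(satisfies_2_17(a, POINTS) for a in (0.5, 1.0, 3.0))
print(ok_2_19, ok_2_17)
```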

Compatibility of Eqs. (2.17) and (2.16) fixes $c = 1$. Thus,

$$w(x) = 1 - x^{1/\alpha} > 0 \implies \alpha > 0,$$

where the first inequality follows from definition (2.16) of $w(x)$ and the fact that $\mu(x)$ is a strictly increasing function. Finally, solving the differential equation (2.16), we complete the proof.

In conclusion note that any two non-zero parameters $\alpha, \alpha' \ne 0$ in Eq. (2.12) give equivalent $S$-operators (2.5); consequently, we have only two inequivalent solutions for $M = \,]0, 1[$, corresponding to $\alpha = 0$ and $\alpha = 1$.

3. Pentagon, Ten-Term, Three-, and Four-Simplex Relations

Consider the following “ansatz” for solution of Eq. (1.3):

$$R^T_{123} = \bar T_{13} P_{23} T_{13}, \tag{3.1}$$

for invertible elements $T, \bar T \in A^{\otimes 2}$, with $P$ being defined in Eq. (1.2).

Proposition 4. The TE (1.3) for element $R$ of the form (3.1) is equivalent to the existence of invertible elements $S, \bar S \in A^{\otimes 2}$ such that the following equations are satisfied:

$$S_{12} T_{13} T_{23} = T_{23} T_{12}, \qquad \bar T_{23} \bar T_{13} \bar S_{12} = \bar T_{12} \bar T_{23}, \tag{3.2}$$

$$\bar S_{12} T_{13} \bar T_{14} T_{24} \bar T_{34} = T_{24} \bar T_{34} T_{14} \bar T_{12} S_{13}. \tag{3.3}$$

Proof. Substituting Eq. (3.1) into Eq. (1.3), moving all $P$-elements to the right, one can remove all of them from both sides of the equality simultaneously. The resulting identity can be rewritten in the form

$$(\bar T_{16}^{-1} \bar T_{36}^{-1} \bar T_{13} \bar T_{36})\, T_{12} \bar T_{15} T_{35} \bar T_{25} = T_{35} \bar T_{25} T_{15} \bar T_{13}\, (T_{24} T_{12} T_{24}^{-1} T_{14}^{-1}).$$

Here nontrivial elements in the subspaces 4 and 6 are contained only in the expressions enclosed in parentheses in the r.h.s. and the l.h.s., respectively. Consequently, these expressions should be trivial in the subspaces 4 and 6. In this way, we immediately come to the statements of the proposition.

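Proposition 5. Equations (1.5)–(1.7) follow from Eqs. (3.2) and (3.3).

Proof. Equation (1.5) follows from the identity

$$S_{12} S_{13} S_{23} T_{14} T_{24} T_{34} = S_{23} S_{12} T_{14} T_{24} T_{34},$$

which is proved by successive applications of the first identity from Eq. (3.2). Eq. (1.6) is proved similarly. As for Eq. (1.7), it is a consequence of the identity

$$\bar S_{12} S_{13} \bar S_{14} S_{24} \bar S_{34}\, T_{15} T_{35} \bar T_{16} T_{26} \bar T_{36} T_{46} \bar T_{56} = S_{24} \bar S_{34} S_{14} \bar S_{12} S_{13}\, T_{15} T_{35} \bar T_{16} T_{26} \bar T_{36} T_{46} \bar T_{56},$$

which is proved through the following sequence of transformations (each step transforms one fragment according to either the first equation from (3.2), Eq. (3.3), or Eq. (1.5)):

$$\begin{aligned}
& \bar S_{12} S_{13} \bar S_{14} S_{24} \bar S_{34} T_{15} T_{35} \bar T_{16} T_{26} \bar T_{36} T_{46} \bar T_{56} \\
&= \bar S_{12} S_{13} \bar S_{14} S_{24} T_{15} \bar T_{16} T_{26} T_{46} \bar T_{56} T_{36} \bar T_{34} S_{35} \\
&= \bar S_{12} S_{13} \bar S_{14} T_{15} \bar T_{16} T_{46} T_{24} \bar T_{56} T_{36} \bar T_{34} S_{35} \\
&= \bar S_{12} S_{13} T_{46} \bar T_{56} T_{16} \bar T_{14} S_{15} T_{24} T_{36} \bar T_{34} S_{35} \\
&= \bar S_{12} T_{46} \bar T_{56} T_{36} T_{13} \bar T_{14} S_{15} T_{24} \bar T_{34} S_{35} \\
&= T_{46} \bar T_{56} T_{36} T_{24} \bar T_{34} T_{14} \bar T_{12} S_{13} S_{15} S_{35} \\
&= T_{46} \bar T_{56} T_{36} T_{24} \bar T_{34} T_{14} \bar T_{12} S_{35} S_{13} \\
&= S_{24} T_{26} T_{46} \bar T_{56} T_{36} \bar T_{34} T_{14} \bar T_{12} S_{35} S_{13} \\
&= S_{24} T_{26} \bar S_{34} T_{35} \bar T_{36} T_{46} \bar T_{56} T_{14} \bar T_{12} S_{13} \\
&= S_{24} T_{26} \bar S_{34} T_{35} \bar T_{36} S_{14} T_{16} T_{46} \bar T_{56} \bar T_{12} S_{13} \\
&= S_{24} \bar S_{34} T_{35} S_{14} \bar S_{12} T_{13} \bar T_{16} T_{26} \bar T_{36} T_{46} \bar T_{56} \\
&= S_{24} \bar S_{34} S_{14} \bar S_{12} S_{13} T_{15} T_{35} \bar T_{16} T_{26} \bar T_{36} T_{46} \bar T_{56}.
\end{aligned}$$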

The next proposition is about a similar statement for the FSE.

Proposition 6. The FSE (1.4) for element $B$ of the form (1.9) is equivalent to Eqs. (1.5)–(1.7).

The proof is similar to that of Proposition 4. Thus, Eqs. (1.5)–(1.7) enable us to construct a special class of solutions for the TE and FSE.
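The mechanism can be illustrated at the level of the underlying maps. The sketch below is an illustration under stated assumptions, not an example from the paper: it takes the group operator of Eq. (2.4) on the symmetric group $S_3$, sets $\bar S := S^{-1}$, and checks by brute force that the corresponding maps satisfy (1.5), (1.7) and the tetrahedron equation (1.3) for the combination (1.8). That this pair satisfies the ten-term relation is tested by the code itself rather than assumed. All function names are hypothetical.

```python
from itertools import permutations, product

G = list(permutations(range(3)))  # elements of S3; g[i] is the image of i

def mul(a, b):
    """Group product: (a b)(i) = a[b[i]]."""
    return tuple(a[b[i]] for i in range(3))

def inv(a):
    """Inverse permutation."""
    r = [0, 0, 0]
    for i, image in enumerate(a):
        r[image] = i
    return tuple(r)

def s(i, j):
    """Map behind S_ij for Eq. (2.4): x_i -> x_i x_j (1-based labels)."""
    def apply(v):
        v = list(v)
        v[i - 1] = mul(v[i - 1], v[j - 1])
        return tuple(v)
    return apply

def s_bar(i, j):
    """Map behind the inverse operator: x_i -> x_i x_j^(-1)."""
    def apply(v):
        v = list(v)
        v[i - 1] = mul(v[i - 1], inv(v[j - 1]))
        return tuple(v)
    return apply

def swap(i, j):
    """Map behind the permutation operator P_ij: exchange slots i and j."""
    def apply(v):
        v = list(v)
        v[i - 1], v[j - 1] = v[j - 1], v[i - 1]
        return tuple(v)
    return apply

def word(*maps):
    """Operator product U1 U2 ... Un. Since U f = f o alpha and Eq. (2.1) holds,
    the product acts by f -> f o alpha_n o ... o alpha_1, so the underlying
    maps are applied in the order written."""
    def apply(v):
        for m in maps:
            v = m(v)
        return v
    return apply

def r(i, j, k):
    """Map behind R_ijk = S-bar_ik P_jk S_ik, the combination (1.8)."""
    return word(s_bar(i, k), swap(j, k), s(i, k))

def same_map(a, b, n):
    """Compare two maps on all n-tuples of group elements."""
    return all(a(v) == b(v) for v in product(G, repeat=n))

pentagon = same_map(word(s(1, 2), s(1, 3), s(2, 3)), word(s(2, 3), s(1, 2)), 3)
ten_term = same_map(
    word(s_bar(1, 2), s(1, 3), s_bar(1, 4), s(2, 4), s_bar(3, 4)),
    word(s(2, 4), s_bar(3, 4), s(1, 4), s_bar(1, 2), s(1, 3)), 4)
tetrahedron = same_map(
    word(r(1, 2, 3), r(1, 4, 5), r(2, 4, 6), r(3, 5, 6)),
    word(r(3, 5, 6), r(2, 4, 6), r(1, 4, 5), r(1, 2, 3)), 6)
print(pentagon, ten_term, tetrahedron)
```

The tetrahedron check runs over all $6^6$ six-tuples, which takes a few seconds in pure Python; a smaller group or random sampling would make it faster at the cost of completeness.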


Example 1. Take for the algebra $A$ the space of birational transformations $\mathrm{Aut}(\mathbb{C}(x))$ of the field of rational expressions in one indeterminate $\mathbb{C}(x)$, and identify $A \otimes A \otimes \ldots$ with $\mathrm{Aut}(\mathbb{C}(x, y, \ldots))$. Interpreting $\mathbb{C}(x, y, \ldots)$ as the space of rational functions on $\mathbb{C} \times \mathbb{C} \times \ldots$, define

$$(T\varphi)(x, y) := \varphi(xy, y - xy), \qquad \bar T = T^{-1}.$$

Then, Eqs. (3.2), (3.3) as well as Eqs. (1.5)–(1.7) are satisfied with

$$(S\varphi)(x, y) := \varphi(xy, (y - xy)/(1 - xy)), \qquad \bar S := S^{-1}.$$
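Two of the relations claimed in Example 1 can be spot-checked with exact rational arithmetic. The sketch below assumes the operator convention of Eqs. (2.1)–(2.3), so that an operator product corresponds to applying the underlying maps in the order written; it evaluates both sides of the first identity in (3.2) and of the pentagon equation (1.5) at a few rational points. The helper names are illustrative, and agreement at sample points is evidence, not a proof.

```python
from fractions import Fraction
from itertools import product

def t(i, j):
    """Map behind T_ij: (x_i, x_j) -> (x_i x_j, x_j - x_i x_j)."""
    def apply(v):
        v = list(v)
        xi, xj = v[i - 1], v[j - 1]
        v[i - 1], v[j - 1] = xi * xj, xj - xi * xj
        return tuple(v)
    return apply

def s(i, j):
    """Map behind S_ij: (x_i, x_j) -> (x_i x_j, (x_j - x_i x_j)/(1 - x_i x_j))."""
    def apply(v):
        v = list(v)
        xi, xj = v[i - 1], v[j - 1]
        v[i - 1], v[j - 1] = xi * xj, (xj - xi * xj) / (1 - xi * xj)
        return tuple(v)
    return apply

def word(*maps):
    """Operator product U1 ... Un: the underlying maps are applied left to right."""
    def apply(v):
        for m in maps:
            v = m(v)
        return v
    return apply

# Points in ]0, 1[^3 keep all intermediate values in ]0, 1[, so no division by zero occurs.
SAMPLES = [Fraction(1, 2), Fraction(2, 3), Fraction(3, 7)]
POINTS = list(product(SAMPLES, repeat=3))

# First identity of (3.2): S12 T13 T23 = T23 T12.
first_of_3_2 = all(
    word(s(1, 2), t(1, 3), t(2, 3))(v) == word(t(2, 3), t(1, 2))(v) for v in POINTS)
# Pentagon equation (1.5): S12 S13 S23 = S23 S12.
pentagon = all(
    word(s(1, 2), s(1, 3), s(2, 3))(v) == word(s(2, 3), s(1, 2))(v) for v in POINTS)
print(first_of_3_2, pentagon)
```

With $\bar T = T^{-1}$ and $\bar S = S^{-1}$, the second identity in (3.2) is the inverse of the first, so it does not need a separate check.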

One can show that the corresponding element $R^T$ is equivalent to the solution 80 for the TE from [13], which in turn is associated with the three-dimensional Hirota equation of the discrete Toda system [11].

Example 2. Let now $x = (x_1, x_2)$ be a pair of indeterminates, and put $A = \mathrm{Aut}(\mathbb{C}(x))$, $A \otimes A \otimes \ldots$ being identified with $\mathrm{Aut}(\mathbb{C}(x, y, \ldots))$. Consider the following rational mappings:

$$x \cdot y := (x_1 y_1, x_2 y_1 + y_2), \qquad x * y := \mu(x)^{-1} \cdot \mu(x \cdot y),$$

where

$$x^{-1} := (1/x_1, -x_2/x_1), \qquad \mu(x) := (1 - x_1 + \epsilon x_2, x_2).$$

Operators

$$(S\varphi)(x, y) := \varphi(x \cdot y, x * y), \qquad \bar S := S^{-1},$$

satisfy Eqs. (1.5)–(1.7) for any $\epsilon \in \mathbb{C}$. Note, however, that any two non-zero $\epsilon$ and $\epsilon'$ give equivalent solutions. Thus, we have only two essentially different solutions corresponding to $\epsilon = 0$ and $\epsilon \ne 0$.

Example 3. Let algebra $A$ be the Heisenberg algebra, generated by elements $\{H, \Lambda, 1\}$, satisfying the Heisenberg commutation relation $\Lambda H - H\Lambda = 1/h$, with $h$ being a complex parameter with a positive real part. Define the function

$$(x; q)_\infty := \prod_{n=0}^{\infty} (1 - x q^n),$$

where $q = \exp(-h)$, and put

$$S := q^{H \otimes \Lambda}\, (-q^{\Lambda} \otimes q^{-H} q^{-\Lambda}; q)_\infty^{-1}. \tag{3.4}$$
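For scalar arguments the infinite product $(x; q)_\infty$ is easy to approximate numerically. The sketch below assumes a real parameter $h > 0$ (the text allows complex $h$ with positive real part) and a fixed truncation length; `q_pochhammer` is an illustrative name. It also checks the elementary functional equation $(x; q)_\infty = (1 - x)(xq; q)_\infty$, which follows directly from the product formula.

```python
import math

def q_pochhammer(x, q, terms=200):
    """Truncated product prod_{n=0}^{terms-1} (1 - x q^n); converges for |q| < 1."""
    result = 1.0
    qn = 1.0  # holds q^n
    for _ in range(terms):
        result *= 1.0 - x * qn
        qn *= q
    return result

h = 0.7               # assumed real and positive for this sketch
q = math.exp(-h)      # q = exp(-h), so 0 < q < 1
x = 0.3

lhs = q_pochhammer(x, q)
rhs = (1.0 - x) * q_pochhammer(x * q, q)
print(lhs, rhs)
```

Because the factors approach 1 geometrically fast, a few hundred terms are more than enough at this value of $q$; values of $q$ close to 1 would need a longer truncation.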

This operator can be shown to be a “quantization” of Example 2 with $\epsilon \ne 0$. Note also that it is a specialization of the canonical element in the Heisenberg double of the Borel subalgebra of the $U_q(sl(2))$ quantum group, see [12]. Now, both choices of $\bar S$, either

$$\bar S := S^{-1}, \qquad \text{or} \qquad \bar S := q^{-H \otimes \Lambda},$$

solve system (1.5)–(1.7). The corresponding solutions (1.8) and (1.9) to the TE and FSE were first found in [24].


4. O-Double Construction

One particular class of solutions to system (1.5)–(1.7) is connected with the O-doubles [21], which generalize the Heisenberg double of a Hopf algebra [22, 1, 23]. Consider elements $S$ and $\bar S$, satisfying Eqs. (1.5), (1.6), and

$$S_{13} \bar S_{23} = \bar S_{23} S_{13}, \qquad S_{12} \bar S_{13} \bar S_{23} = \bar S_{23} S_{12}, \qquad S_{23} S_{13} \bar S_{12} = \bar S_{12} S_{23}, \tag{4.1}$$

which imply also Eq. (1.7). It appears that there is a general algebraic structure underlying Eqs. (1.5), (1.6), and (4.1). Let $X$ be a Hopf algebra. In a linear basis $\{e_i\}$ the product, the coproduct, the unit, the counit, and the antipode take the form

$$e_i e_j = m_{ij}^{k} e_k, \qquad 1 = \varepsilon^i e_i, \qquad \Delta(e_i) = \mu_i^{jk}\, e_j \otimes e_k, \qquad \varepsilon(e_i) = \varepsilon_i, \qquad \gamma(e_i) = \gamma_i^{j} e_j, \tag{4.2}$$

where summation over repeated indices is implied. Here $m_{ij}^{k}$, $\mu_i^{jk}$, $\varepsilon^i$, $\varepsilon_i$, and $\gamma_i^{j}$ are numerical structure constants of the algebra. Let $X^*$ be the dual Hopf algebra. Following [21], consider algebra $X^* X X^*$ (O-double), generated by right derivations $R^*_x$, $x \in X$:

$$R^*_x: X^* \to X^*, \qquad \langle R^*_x(f), y \rangle = \langle f, R_x(y) \rangle = \langle f, yx \rangle, \tag{4.3}$$

left, $L_f$, and right, $R_{\gamma^{-1}(g)}$, multiplications, $f, g \in X^*$:

$$L_f, R_g: X^* \to X^*, \qquad L_f(g) = R_g(f) = fg. \tag{4.4}$$

Proposition 7. Algebra $X^* X X^*$ is an associative algebra, generated by elements $\{e_i, e^j, \tilde e^k\}$, subject to the following defining relations:

$$e_i e_j = m_{ij}^{k} e_k, \qquad e^i e^j = \mu_k^{ij} e^k, \qquad e_i e^j = m_{kl}^{j} \mu_i^{lm}\, e^k e_m,$$

$$\tilde e^i \tilde e^j = \mu_k^{ij} \tilde e^k, \qquad \tilde e^i e_j = \mu_j^{kl} m_{lm}^{i}\, e_k \tilde e^m, \qquad e^i \tilde e^j = \tilde e^j e^i. \tag{4.5}$$

Proof. One has just to write the compositions of the operations, defined in Eqs. (4.3), (4.4), for elements of the linear basis, using Eqs. (4.2) and the corresponding relations for the dual algebra.

Proposition 8. Two canonical elements $S = e_i \otimes e^i$, $\bar S = e_i \otimes \tilde e^i$ in $X^* X X^*$ satisfy Eqs. (1.5), (1.6), and (4.1).

The proof is straightforward through the substitution of the canonical elements into the relations to be proved, and application of formulae (4.5). Thus, we have obtained a particular class of general algebraic solutions to system (1.5)–(1.7), which in turn imply the TE for the element (1.8) and the FSE for the element (1.9).


5. Summary

Solutions for the system of Eqs. (1.5)–(1.7) provide us with solutions for both the three- and four-simplex Eqs. (1.3) and (1.4) through formulae (1.8) and (1.9), respectively. The O-double of a Hopf algebra provides an algebraic structure underlying the system (1.5), (1.6) and (4.1), which implies also Eq. (1.7). Nevertheless, examples of solutions to system (1.5)–(1.7), described in Sect. 3, do not come from the O-double construction. This suggests that the latter is a particular case of a more general algebraic structure lying behind system (1.5)–(1.7) itself.

Acknowledgement. The authors are indebted to A.Yu. Volkov for reading the manuscript and helpful suggestions. It is a pleasure to thank also V.O. Tarasov, Yu.G. Stroganov, J.M. Maillet, L. Freidel, H.E. Boos for discussions.

References

1. Alekseev, A.Yu., Faddeev, L.D.: $(T^*G)_t$: A toy model for conformal field theory. Commun. Math. Phys. 141, 413–422 (1991)
2. Baxter, R.J.: Exactly solved models in statistical mechanics. London: Academic Press 1982
3. Baxter, R.J.: Partition function of the eight-vertex lattice model. Ann. Phys. 70, 193–228 (1972)
4. Bazhanov, V.V., Baxter, R.J.: New solvable lattice models in three dimensions. J. Stat. Phys. 69, 453–485 (1992)
5. Bazhanov, V.V., Baxter, R.J.: Star-triangle relation for a three-dimensional model. J. Stat. Phys. 71, 839–864 (1993)
6. Bazhanov, V.V., Stroganov, Yu.G.: Commutativity conditions for transfer matrices on a multidimensional lattice. Theor. Mat. Fiz. 52, 105–113 (1982) [English transl.: Theor. and Math. Phys. 52, 685–691 (1983)]
7. Drinfeld, V.G.: Quantum groups. In: Proc. Int. Cong. Math., Berkeley 1987, pp. 798–820
8. Drinfeld, V.G.: Quasi-Hopf algebras. Algebra and Analysis 1, 114–148 (1989)
9. Faddeev, L.D.: Quantum completely integrable models in field theory. Sov. Sci. Rev. C1, 107–155 (1980)
10. Hietarinta, J.: Labelling schemes for tetrahedron equations and dualities between them. J. Phys. A27, 5727–5748 (1994)
11. Hirota, R.: Discrete analogue of a generalized Toda equation. J. Phys. Soc. Jpn. 50, 3785–3791 (1981)
12. Kashaev, R.M.: The Heisenberg double and the pentagon relation. Algebra i Analiz, Vol. 8, No. 4, 63–74 (1996)
13. Kashaev, R.M.: On discrete three-dimensional equations associated with the local Yang–Baxter relation. Lett. Math. Phys. 35, 389–397 (1996)
14. Kashaev, R.M., Mangazeev, V.V., Stroganov, Yu.G.: Spatial symmetry, local integrability, and tetrahedron equation in the Baxter–Bazhanov model. Int. J. Mod. Phys. A8, 587–601 (1993)
15. Kashaev, R.M., Mangazeev, V.V., Stroganov, Yu.G.: Star-square and tetrahedron equations in the Baxter–Bazhanov model. Int. J. Mod. Phys. A8, 1399–1409 (1993)
16. Korepanov, I.G.: Tetrahedral Zamolodchikov algebras corresponding to Baxter’s L-operator. Commun. Math. Phys. 154, 85–97 (1993)
17. Maillet, J.M.: Integrable systems and gauge theories. Nucl. Phys. (Proc. Suppl.) B18, 212–241 (1990)
18. Maillet, J.M.: On Pentagon and Tetrahedron equations. Algebra and Analysis 6, 375–383 (1994)
19. Mangazeev, V.V., Sergeev, S.M., Stroganov, Yu.G.: New solutions of vertex type tetrahedron equations. Mod. Phys. Lett. A10, 279–287 (1995)
20. Moore, G., Seiberg, N.: Classical and quantum conformal field theory. Commun. Math. Phys. 123, 177–255 (1989)
21. Novikov, S.P.: Various doublings of Hopf algebras. Algebras of operators on quantum groups. Complex cobordisms. Usp. Math. Nauk 47, No. 5(287), 189–190 (1992), transl. in Russ. Math. Surv. 47, No. 5, 198–199 (1992)
22. Reshetikhin, N.Yu., Semenov-Tian-Shansky, M.A.: Central extensions of quantum current groups. Lett. Math. Phys. 19, 133–142 (1990)
23. Semenov-Tian-Shansky, M.A.: Poisson-Lie groups. The quantum duality principle and the twisted quantum double. Theor. Math. Phys. 93, 302–329 (1992), transl. in: Theoret. and Math. Phys. 93, No. 2, 1292–1307 (1992)
24. Sergeev, S.M., Bazhanov, V.V., Boos, H.E., Mangazeev, V.V., Stroganov, Yu.G.: Quantum dilogarithm and tetrahedron equation. Preprint IHEP 95-129
25. Yang, C.N.: Some exact results for the many-body problem in one dimension with repulsive delta-function interaction. Phys. Rev. Lett. 19, 1312–1314 (1967)
26. Zamolodchikov, A.B.: Tetrahedron equations and the relativistic S-matrix of straight-strings in 2 + 1 dimensions. Commun. Math. Phys. 79, 489–505 (1981)

Communicated by T. Miwa

Commun. Math. Phys. 195, 321 – 352 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Quantum Weyl Reciprocity and Tilting Modules*

Jie Du¹, Brian Parshall², Leonard Scott²

¹ School of Mathematics, University of New South Wales, Sydney 2052, Australia
² Department of Mathematics, University of Virginia, Charlottesville, VA 22903-3199, USA

Received: 6 January 1997 / Accepted: 25 November 1997

Abstract: Quantum Weyl reciprocity relates the representation theory of Hecke algebras of type A with that of q-Schur algebras. This paper establishes that Weyl reciprocity holds integrally (i.e., over the ring $\mathbb{Z}[q, q^{-1}]$ of Laurent polynomials) and that it behaves well under base-change. A key ingredient in our approach involves the theory of tilting modules for q-Schur algebras. New results obtained in that direction include an explicit determination of the Ringel dual algebra of a q-Schur algebra in all cases. In particular, in the most interesting situation, the Ringel dual identifies with a natural quotient algebra of the Hecke algebra.

* Research supported by the National Science Foundation and the Australian Research Council.

0. Introduction

Weyl reciprocity refers to the connection between the representation theories of the general linear group $GL_n(k)$ and the symmetric group $S_r$. Let $V$ be a vector space (over a field $k$) of dimension $n$ and form the tensor space $V^{\otimes r}$. The natural (left) action of $GL_n(k)$ on $V^{\otimes r}$ commutes with the (right) permutation action of $S_r$. Let $A$ (resp., $R$) be the algebra generated by the image of $GL_n(k)$ (resp., $S_r$) in the algebra $\mathrm{End}(V^{\otimes r})$ of linear operators on $V^{\otimes r}$. Classically [We], when $k = \mathbb{C}$, these algebras satisfy the double centralizer property

$$\text{a)}\ \ A = \mathrm{End}_R(V^{\otimes r}) \qquad \text{and} \qquad \text{b)}\ \ R = \mathrm{End}_A(V^{\otimes r}). \tag{1}$$

Further, the set $\Lambda^+(n, r)$ of partitions of $r$ into at most $n$ nonzero parts indexes both the irreducible $A$-modules $L(\lambda)$ and the irreducible $R$-modules $S_\lambda$. The $L(\lambda)$ are the irreducible polynomial representations of $GL_n(\mathbb{C})$ of homogeneous degree $r$, while the $S_\lambda$ are Specht modules for $S_r$. Weyl reciprocity also entails the decomposition

$$V^{\otimes r} = \bigoplus_{\lambda \in \Lambda^+(n, r)} L(\lambda) \otimes S_\lambda \tag{2}$$

of the tensor space into irreducible $(A, R^{op})$-bimodules. When $k$ has positive characteristic $p$, property (1) remains true, but it is more difficult to establish; see [CL, (3.1)] for the equality (1a) and [dCP, (4.1)] or [D2, Sect. 2 Cor.] for (1b). (The latter is easy when $n \ge r$.) The set $\Lambda^+(n, r)$ still indexes $\mathrm{Irr}(A)$, while $\mathrm{Irr}(R)$ is indexed by the subset $\Lambda^+(n, r)_{p\text{-reg}}$ of $p$-regular partitions. The decomposition (2) no longer holds, in general.

The Hecke algebra $H$ of type $A_{r-1}$ arises as a $q$-deformation of the group algebra $kS_r$. Motivated by physics, Jimbo [Ji] gave a corresponding action of $H$ on $V^{\otimes r}$ deforming the permutation action of $S_r$. In the generic case ($k = \mathbb{C}$ and $q \in \mathbb{C}$ is “general”, i.e., not a root of 1), he gave a quantum version of Weyl reciprocity, similar to the classical situation. In his theory, the quantum enveloping algebra $U_q(\mathfrak{gl}_n)$ played the role of $GL_n(k)$ above (while $A$ is the $q$-Schur algebra $S_q(n, r)$). Later, entirely different considerations in finite group representation theory led Dipper and James [DJ1, DJ2] independently to make similar constructions; the name “q-Schur algebra” is due to them.¹

¹ If $k$ has positive characteristic $p$, the $q$-Schur algebras play a central role in the representation theory of the finite general linear groups $GL_n(\mathbb{F}_q)$ when $p$ and $q$ are relatively prime; see [DJ2].

This paper considers quantum Weyl reciprocity when $k$ and $q$ are arbitrary, and the related theory of tilting modules for $q$-Schur algebras $S_q(n, r)$. There is a surjective homomorphism $U_{q^{1/2}} = U_{q^{1/2}}(\mathfrak{gl}_n) \to S_q(n, r)$. A self-dual $S_q(n, r)$-module $X$ is a tilting module if, when regarded as a $U_{q^{1/2}}$-module, it has a filtration with sections isomorphic to $q$-Weyl modules. In the more general context of quasi-hereditary algebras, Ringel [R] established the existence of a rich supply of tilting modules. Tilting modules have remarkable homological properties, giving rise, for example, to interesting equivalences $D^b(A) \xrightarrow{\ \sim\ } D^b(B)$ of derived categories if $B$ is the endomorphism algebra of a full tilting module for $A$. In this case, $B$ is called the Ringel dual of $A$ (though it is only determined up to Morita equivalence); it is also a quasi-hereditary algebra.

In [DPS1, DPS2], we used Kazhdan-Lusztig cell theory methods to study Hecke endomorphism algebras of importance in the representation theory of finite groups $G$ of Lie type over fields of positive characteristic distinct from the defining characteristic of $G$. Those methods remain effective in the present paper. In particular, the “strong homological property” of cell filtrations, discovered in [DPS1] and reviewed in (3.5) below, plays an essential role in several places, e.g., in our generalization of (2) above in Theorem 6.6. Also, new and simple proofs of a number of results (from [dCP, D1, E1]) involving the representation theory of $GL_n$ and $S_r$ result by setting $q = 1$. Because we largely work in an integral, or characteristic-free, setting, many of our results are new even in the $q = 1$ case.

We now outline the contents of this paper. Section 1 discusses general representation theoretic facts for Hecke algebras associated to a finite Coxeter system. We apply these in Sect. 2 to study certain Hecke endomorphism algebras $A$. In particular, Theorem 2.3 presents a candidate for a full tilting module for $A$. Section 3 treats various “cell modules” which play an important role in the theory of $q$-Schur algebras in Sect. 5. Section 4 collects information concerning quasi-hereditary algebras and their tilting modules. In Proposition 5.4, we give a direct proof that $V^{\otimes r}$ is a tilting module for $S_q(n, r)$. A consequence, Theorem 5.5, establishes an important base change property for $q$-tensor space which plays an essential role in Sect. 6.

Section 6 takes up quantum Weyl reciprocity. Making use of an interesting new basis for the $q$-Specht modules given in Lemma 6.1, Theorem 6.2 describes $R^{op}$ very explicitly as a quotient algebra of the Hecke algebra $H$. The proof is almost self-contained, using only Theorem 5.5 mentioned above. The double centralizer property (1) then follows easily in Theorem 6.3. In the special case $q = 1$, we obtain a new proof of the double centralizer result of [dCP]. Although the decomposition (2) fails in general, Theorem 6.6 establishes that $V^{\otimes r}$ does have an $(A, R^{op})$-bimodule filtration analogous to (2). The existence of such a filtration plays a crucial role in our determination in Theorem 7.7 of Sect. 7 of the Ringel duals of $q$-Schur algebras.

In the classical $q = 1$ case, Donkin [D1, (3.6)] observed a connection between tilting modules and “twisted” permutation modules. [CPS2, (5.2)] used a similar idea to give a combinatorial² approach to Schur algebra tilting modules, realizing the latter in characteristic $\ne 2$ as modules of intertwining operators between “twisted” permutation modules and permutation modules. While we find that this description generally remains valid here, a second (and perhaps equally interesting) realization of tilting modules given in Theorem 2.3 proves essential in treating all specializations of $q$ and all characteristics; see Proposition 7.3, Theorem 7.6. Also, Remarks 7.8 describe connections with more recent work of Donkin.

Section 8 first recasts, in a $q$-setting, recent work of Erdmann [E1] concerning decomposition numbers for symmetric groups and Weyl modules. As in the classical case, the decomposition numbers of the Hecke algebra $H$ are determined in terms of filtration multiplicities for tilting modules; see Proposition 8.2. Finally, suppose that $q$ is a primitive $l$th root of unity satisfying $n < l$. Then Theorem 8.4 establishes that

H(n, r) = EndSq (n,r) (V ⊗r )op is quasi-hereditary; in fact, H(n, r) identifies with the Ringel dual of Sq (n, r). Some notation. Unless otherwise stated, Z = Z[q, q −1 ], the ring of integral Laurent polynomials in a variable q, and Q(q) is its field of fractions. If Z 0 is a commutative f⊗Z Z 0 . When clear from context, denote f is a Z-module, put M fZ 0 = M Z-algebra and M 0 f f MZ 0 by M . The reader should take careful note of the “tilde” notation used thoughout this paper: f denotes a Z 0 -module, where Z 0 is some fixed commutative ZUsually, the notation M 0 algebra (e. g., Z = Z). However, if a field k is a Z-algebra, we often write M for fk . M Given a ring R, R C (resp., CR ) is the category of finitely generated left (resp., right) R-modules. If an R-module M has a composition series, then [M : L] is the multiplicity of the irreducible module L as a composition factor of M . Let Rop be the opposite ring of R. If M is a left (resp., right) R-module, let M op be the right (resp., left) Rop -module obtained from the action of R on M . For an R-module M , we consider (finite) filtrations F• : 0 = F0 ⊂ F1 ⊂ · · · ⊂ Ft = M of M by submodules Fi . The Fi /Fi−1 are called the sections of F• , with F1 = F1 /F0 the bottom section and Ft /Ft−1 the top section. If 1 is a fixed family of R-modules and if each section Fi /Fi−1 ∈ 1, then F• is called a 1-filtration. Let R C(1) denote the subclass of Ob(R C) consisting of objects which have a 1-filtration. If e ∈ R is an idempotent and M ∈ Ob(R C), HomR (Re, M ) identifies with the eRe-module eM by means of f 7→ f (e), f ∈ HomR (Re, M ). It will be convenient to have a general characterization of this property in the following way: Given a ∈ R and M ∈ Ob(R C), let TrR (a, M ) = {f (a) | f ∈ HomR (Ra, M )}. Since any module map f : Ra → M is determined by the image f (a), f 7→ f (a) defines an additive ∼ group isomorphism HomR (Ra, M ) → TrR (a, M ). 
For $m \in M$, the map $Ra \to M$,

2 The method of [D1] involved algebraic groups and Hopf algebra structures, whereas [CPS2] and the present paper are essentially combinatorial in nature.


J. Du, B. Parshall, L. Scott

$ra \mapsto ram$, is an R-module map (the restriction to Ra of an obvious map $R \to M$). Hence, aM is a subgroup of $\mathrm{Tr}_R(a, M)$. By definition, $a \in R$ is almost-idempotent w.r.t. M if $\mathrm{Tr}_R(a, M) = aM$, or, equivalently, if $f(a) \in aM$ for every $f \in \mathrm{Hom}_R(Ra, M)$. The usefulness of such a property was perhaps first suggested, though not formalized, by [DJ2, (2.7)].

Lemma 0.1. Let R be a ring, $e \in R$ an idempotent, and $M \in \mathrm{Ob}({}_R\mathcal{C})$. Let $a \in eRe$ be almost-idempotent w.r.t. the eRe-module eM. Then a is almost-idempotent w.r.t. the R-module M, and restriction $\mathrm{Hom}_R(Ra, N) \xrightarrow{\mathrm{res}} \mathrm{Hom}_{eRe}(eRa, eN)$ is a natural isomorphism of abelian groups for every R-submodule $N \subseteq M$. In particular, for M = Re and $a, b \in eRe$ with a almost-idempotent w.r.t. eM, we have an isomorphism
$$\mathrm{Hom}_R(Ra, Rb) \cong \mathrm{Hom}_{eRe}(eRa, eRb). \qquad (0.1)$$

We leave the proof to the reader as an exercise, using the definition above. Observe in the second assertion that $\mathrm{Hom}_{eRe}(eRa, eN)$ identifies with $eN \cap aeM = N \cap aM$. Although the above has been cast for ${}_R\mathcal{C}$, an evident version holds for $\mathcal{C}_R$.

Remark 0.2. In the context of [DJ2, (2.7)], or any situation in which M is a faithful R-module, the property that a is almost-idempotent is equivalent to the double annihilator property for aM in M:
$$aM = \{m \in M \mid rm = 0 \text{ for every } r \in R \text{ with } raM = 0\}. \qquad (0.2)$$

In general, even if M is not faithful, the double annihilator property for aM in M holds when a is almost-idempotent with respect to M. Yet a third property, which also implies the double annihilator property, is the annihilator property, which says that for some subset $S \subseteq R$, we have
$$aM = \{m \in M \mid sm = 0 \text{ for all } s \in S\}. \qquad (0.3)$$

This property is often the most natural one to check, cf. the proof of Lemma 1.1e below, but sometimes Lemma 0.1 is more natural, in the presence of a natural idempotent e. (See the proof of Theorem 2.3b.) With regard to Lemma 0.1, observe that if (0.3) applies for the ring eRe for some $S \subseteq eRe$ and $a \in eRe$, then it also applies in R w.r.t. M, using the (somewhat unnatural) subset $S + (1 - e)$.

1. Hecke Algebras

Let (W, S) be a finite Coxeter system. Let $\leq$ be the Bruhat-Chevalley poset structure on W and let $\ell : W \to \mathbb{Z}$ be the usual length function. Let $P(S)$ denote the power set of S. Consider a finite poset $\Lambda$ and a fixed (arbitrary) function $J : \Lambda \to P(S)$. The poset $\Lambda$ will serve as a “weight” poset in our theory and the role of the function J is to adjust the multiplicities of direct summands in a “tensor” space. We will assume for simplicity that J is surjective. For $\lambda \in \Lambda$, $W_\lambda = \langle s \mid s \in J(\lambda) \rangle$ is the associated parabolic subgroup; thus, $(W_\lambda, J(\lambda))$ is again a finite Coxeter system. For convenience in the sequel, we will usually identify $\lambda \in \Lambda$ with the subset $J(\lambda)$.

Fix a system $\{c_s \in \mathbb{Z}\}_{s \in S}$ of index parameters, i.e., integers $c_s$ satisfying $c_s = c_t$ if s and t are W-conjugate. For $w \in W$, let $w = s_1 \cdots s_m$ be a reduced expression and put $q_w = q^{c_{s_1}} \cdots q^{c_{s_m}} \in Z$. The definition of $q_w$ is independent of the reduced expression chosen for w. If all $c_s = 1$, then $q_w = q^{\ell(w)}$. Put $\varepsilon_w = (-1)^{\ell(w)}$.


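The generic Hecke algebra $\tilde H$ over Z is the algebra with Z-basis $\{\tau_w\}_{w \in W}$, satisfying the relations (for $s \in S$, $w \in W$):
$$\tau_s \tau_w = \begin{cases} \tau_{sw} & \text{if } sw > w; \\ q_s \tau_{sw} + (q_s - 1)\tau_w & \text{if } sw < w. \end{cases} \qquad (1.1)$$
Suppose that $Z'$ is a commutative Z-algebra and put $\tilde H' = \tilde H_{Z'}$. Thus, $\tilde H'$ has a basis $\tau_w \otimes 1$, $w \in W$, satisfying relations like those in (1.1). To simplify notation, continue to denote $\tau_w \otimes 1$ by $\tau_w$; we will follow this same convention for other bases of $\tilde H$ which arise below. Also, when no confusion results, let $q_w$ denote the image of $q_w$ in $Z'$.

The algebra $\tilde H'$ admits a $Z'$-automorphism $\Phi$ (resp., $Z'$-anti-automorphism $\iota$) of order 2 defined on basis elements by $\Phi(\tau_w) = \varepsilon_w q_w \tau_{w^{-1}}^{-1}$ (resp., $\iota(\tau_w) = \tau_{w^{-1}}$); see [L1, p. 138]. Given an $\tilde H'$-module $\tilde M$, let $\tilde M^\Phi$ denote the $\tilde H'$-module obtained from $\tilde M$ by twisting the action of $\tilde H'$ on $\tilde M$ by $\Phi$. Similarly, if $\tilde M$ is a left (resp., right) $\tilde H'$-module, then $\tilde M^\iota$ is the right (resp., left) $\tilde H'$-module obtained from $\tilde M$ by twisting the action by $\iota$. In general, writing $\tilde M^* = \mathrm{Hom}_{Z'}(\tilde M, Z')$, we define a contravariant “duality” functor $d_{\tilde H'} : \mathcal{C}_{\tilde H'} \to \mathcal{C}_{\tilde H'}^{op}$ by setting, for $\tilde M, \tilde N \in \mathrm{Ob}(\mathcal{C}_{\tilde H'})$ and $f \in \mathrm{Hom}_{\tilde H'}(\tilde M, \tilde N)$:
$$d_{\tilde H'} \tilde M = (\tilde M^*)^\iota \quad \text{and} \quad d_{\tilde H'} f = \mathrm{Hom}_{Z'}(f, Z'). \qquad (1.2)$$
If $\tilde M$ is a projective $Z'$-module, $d^2_{\tilde H'} \tilde M \cong \tilde M$. Whenever $\tilde M \in \mathrm{Ob}(\mathcal{C}_{\tilde H})$ is Z-projective, there is an isomorphism
$$d_{\tilde H}(\tilde M)_{Z'} \cong (d_{\tilde H} \otimes Z')\tilde M_{Z'} = d_{\tilde H'} \tilde M_{Z'}, \qquad (1.3)$$
which is natural in $\tilde M$. Since $\Phi$ and $\iota$ commute, $d_{\tilde H'}(\tilde M^\Phi) \cong d_{\tilde H'}(\tilde M)^\Phi$.

For $\lambda \in \Lambda$, let $x_\lambda = \sum_{w \in W_\lambda} \tau_w$, $y_\lambda = \sum_{w \in W_\lambda} \varepsilon_w q_w^{-1} \tau_w \in \tilde H'$. (If $\lambda \mapsto J \subseteq S$ under the surjection $\Lambda \to P(S)$, we sometimes write $x_J$ for $x_\lambda$ and $y_J$ for $y_\lambda$. For example, $x_\emptyset = y_\emptyset = 1$.) The q-permutation modules $x_\lambda \tilde H'$, $y_\lambda \tilde H'$ and $\tilde H' x_\lambda$, $\tilde H' y_\lambda$ play an important role. These modules are all free $Z'$-modules. If $\tilde H'_\lambda = \langle \tau_w \mid w \in W_\lambda \rangle$, then $\tau_w x_\lambda = q_w x_\lambda = x_\lambda \tau_w$ and $\tau_w y_\lambda = \varepsilon_w y_\lambda = y_\lambda \tau_w$, for $w \in W_\lambda$, and so $x_\lambda \tilde H'$, $y_\lambda \tilde H'$, etc. can be interpreted as induced modules (from $\tilde H'_\lambda$); cf. [DPS1, (2.1.5)].

Let $\mathrm{tr} : \tilde H' \to Z'$, $\tau_w \mapsto \delta_{w,1}$, be the $Z'$-linear “trace” map. Then $\langle a, b \rangle = \mathrm{tr}(ab)$ defines a non-degenerate, symmetric, associative pairing on $\tilde H'$ (see, e.g., [L1, (5.1.9)]). This pairing satisfies $\langle \tau_w, \tau_v \rangle = q_w \delta_{w,v^{-1}}$.

Lemma 1.1. Let $\lambda, \mu \in \Lambda$. Then:
(a) $(x_\lambda \tilde H')^* \cong \tilde H' x_\lambda$ and $(y_\lambda \tilde H')^* \cong \tilde H' y_\lambda$ in ${}_{\tilde H'}\mathcal{C}$, where $(-)^* = \mathrm{Hom}_{Z'}(-, Z')$. Thus, $d_{\tilde H'} x_\lambda \tilde H' \cong x_\lambda \tilde H'$ and $d_{\tilde H'} y_\lambda \tilde H' \cong y_\lambda \tilde H'$. There is a non-degenerate form $( \, , \, )$ defined on $x_\lambda \tilde H'$ satisfying $(ah, b) = (a, bh^\iota)$ for $a, b \in x_\lambda \tilde H'$, $h \in \tilde H'$.
(b) $x_\lambda \tilde H' = \{h \in \tilde H' \mid \tau_s h = q_s h, \ \forall s \in \lambda\}$ and $y_\lambda \tilde H' = \{h \in \tilde H' \mid \tau_s h = -h, \ \forall s \in \lambda\}$.
(c) $(x_\lambda \tilde H')^\Phi \cong y_\lambda \tilde H'$ and $\mathrm{Hom}_{\tilde H'}(y_\lambda \tilde H', y_\mu \tilde H') \cong \mathrm{Hom}_{\tilde H'}(x_\lambda \tilde H', x_\mu \tilde H')$.
(d) If the subgroups $W_\lambda$ and $W_\mu$ are W-conjugate, then $x_\mu \tilde H' \cong x_\lambda \tilde H'$ and $y_\mu \tilde H' \cong y_\lambda \tilde H'$.
(e) Both $x_\lambda$ and $y_\lambda$ are almost idempotents w.r.t. the left or right $\tilde H'$-module $\tilde H'$ (in the sense of Lemma 0.1).
(f) Let $D^0_{\lambda,\mu} = \{d \in W \mid \ell(udv) = \ell(u) + \ell(d) + \ell(v), \ \forall u \in W_\lambda, v \in W_\mu\}$.$^3$ Then $y_\lambda \tilde H' x_\mu$ is a free $Z'$-module with basis $\{y_\lambda \tau_d x_\mu\}_{d \in D^0_{\lambda,\mu}}$. Similarly, $x_\mu \tilde H' y_\lambda$ is a free $Z'$-module with basis $\{x_\mu \tau_d y_\lambda\}_{d \in D^0_{\mu,\lambda}}$. Both $x_\mu \tilde H' y_\lambda$ and $y_\lambda \tilde H' x_\mu$ are $Z'$-direct summands of $\tilde H'$.
(g) If $q_s + 1$ is not a zero divisor in $Z'$, $\forall s \in \lambda$, then $y_\lambda$ is an almost-idempotent (in the sense of Lemma 0.1) w.r.t. the left modules $\tilde H' x_\mu$, as well as for the right modules $x_\mu \tilde H'$.
(h) Assume, for all $s \in S$, that $q_s + 1$ is not a zero divisor in $Z'$. Then $\mathrm{Hom}_{\tilde H'}(\tilde H' y_\lambda, \tilde H' x_\mu)$ is a free $Z'$-module with basis $\{\varphi_d\}_{d \in D^0_{\lambda,\mu}}$, where $\varphi_d(y_\lambda) = y_\lambda \tau_d x_\mu$. Similarly, $\mathrm{Hom}_{\tilde H'}(y_\lambda \tilde H', x_\mu \tilde H')$ is a free $Z'$-module with basis $\{\varphi_d\}_{d \in D^0_{\mu,\lambda}}$, where now $\varphi_d(y_\lambda) = x_\mu \tau_d y_\lambda$.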

Proof. (a) can be proved easily by using the pairing $\langle \, , \, \rangle$ defined just before the statement of the lemma. For details, see [DPS1, (2.1.9)] for the first assertion and [DJ1, (4.4)] for the last assertion. For (b), see [Cu, (1.9)] or [DPS1, (2.1.6)]. Next, (c) is proved in [DJ2, (2.1)]. A proof for (d) can be found in [DJ1, (4.3)]. Also, (e) follows from (b), noting (0.3) is a consequence, while (f) follows from [DJ1, (4.1)] (or directly from [C, (2.7.5)]). (Another proof of (e) could be based on (a), rather than (b), using the fact that $x_\lambda \tilde H'$ and $y_\lambda \tilde H'$ are $Z'$-direct summands of $\tilde H'$.)

If $q_s + 1$ is not a zero divisor in $Z'$ for $s \in \lambda$, the properties of distinguished double coset representatives given in [C, (2.7.4), (2.7.5)] easily imply that $y_\lambda \tilde H' \cap \tilde H' x_\mu = y_\lambda \tilde H' x_\mu$. Now (g) follows from this fact using (e). More explicitly, we have that
$$\mathrm{Hom}_{\tilde H'}(\tilde H' y_\lambda, \tilde H' x_\mu) = y_\lambda \tilde H' \cap \tilde H' x_\mu = y_\lambda \tilde H' x_\mu.$$
With this observation, (h) follows from (f).
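As a small sanity check on the relations used above, the following Python sketch verifies, for $W = S_3$ with all $c_s = 1$, that $\tau_s x_\lambda = q x_\lambda$ and $\tau_s y_\lambda = -y_\lambda$ for $s \in J(\lambda)$, which is the easy inclusion in Lemma 1.1(b). It is only an illustration under stated assumptions: q is specialized to a rational number rather than kept as an indeterminate, permutations are written in one-line notation, and all function names (`tau_s_times`, `parabolic`, `check_relations`, and so on) are hypothetical helpers, not notation from the paper.

```python
from fractions import Fraction


def length(w):
    """Coxeter length of a permutation in one-line notation (number of inversions)."""
    return sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])


def s_times(i, w):
    """Left product s_i * w, where s_i swaps the values i and i+1 (0-based)."""
    swap = {i: i + 1, i + 1: i}
    return tuple(swap.get(v, v) for v in w)


def tau_s_times(i, elem, q):
    """Left-multiply elem = {w: coeff} by tau_{s_i}, using relation (1.1) with q_s = q."""
    out = {}
    for w, c in elem.items():
        sw = s_times(i, w)
        if length(sw) > length(w):  # case sw > w
            out[sw] = out.get(sw, 0) + c
        else:  # case sw < w
            out[sw] = out.get(sw, 0) + q * c
            out[w] = out.get(w, 0) + (q - 1) * c
    return out


def parabolic(gens, r):
    """Subgroup of S_r generated by the simple reflections s_i, i in gens."""
    identity = tuple(range(r))
    group, frontier = {identity}, [identity]
    while frontier:
        w = frontier.pop()
        for i in gens:
            sw = s_times(i, w)
            if sw not in group:
                group.add(sw)
                frontier.append(sw)
    return group


def x_elem(gens, r):
    """x_lambda: the sum of tau_w over the parabolic subgroup."""
    return {w: Fraction(1) for w in parabolic(gens, r)}


def y_elem(gens, r, q):
    """y_lambda: the sum of (-1)^{l(w)} q^{-l(w)} tau_w over the parabolic subgroup."""
    return {w: Fraction((-1) ** length(w)) / q ** length(w) for w in parabolic(gens, r)}


def check_relations(r, q):
    """Check tau_s x = q x and tau_s y = -y for every s in J, for several subsets J."""
    for gens in ([0], [1], [0, 1]):
        x, y = x_elem(gens, r), y_elem(gens, r, q)
        for i in gens:
            if tau_s_times(i, x, q) != {w: q * c for w, c in x.items()}:
                return False
            if tau_s_times(i, y, q) != {w: -c for w, c in y.items()}:
                return False
    return True


print(check_relations(3, Fraction(3)))
```

Using exact `Fraction` arithmetic avoids rounding issues with the $q_w^{-1}$ coefficients of $y_\lambda$; a symbolic q would need a polynomial type instead.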

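2. Hecke Endomorphism Algebras

Given $\lambda \in \Lambda$, write $\tilde T_\lambda = x_\lambda \tilde H$, so that, by Lemma 1.1c, $\tilde T_\lambda^\Phi = y_\lambda \tilde H$. For $\Gamma \subseteq \Lambda$, put $\tilde T(\Gamma) = \bigoplus_{\lambda \in \Gamma} \tilde T_\lambda \in \mathrm{Ob}(\mathcal{C}_{\tilde H})$, and form the Hecke endomorphism algebra
$$\tilde A(\Gamma, Z') = \mathrm{End}_{\tilde H'}(\tilde T(\Gamma)_{Z'}) \cong \mathrm{End}_{\tilde H'}(\tilde T(\Gamma)^\Phi_{Z'}) \qquad (2.1)$$
for any commutative Z-algebra $Z'$. The isomorphism in (2.1) follows from Lemma 1.1c.

Lemma 2.1. Let $Z'$ be a commutative Z-algebra, and, if $\tilde M \in \mathrm{Ob}(\mathcal{C}_Z)$, abbreviate $\tilde M_{Z'}$ by $\tilde M'$. Let $\lambda, \mu \in \Lambda$, and $\Gamma \subseteq \Lambda$. Then:
(a) $\mathrm{Hom}_{\tilde H}(\tilde T_\lambda, \tilde T_\mu)' \cong \mathrm{Hom}_{\tilde H'}(\tilde T'_\lambda, \tilde T'_\mu)$ and $\mathrm{Hom}_{\tilde H}(\tilde T^\Phi_\lambda, \tilde T^\Phi_\mu)' \cong \mathrm{Hom}_{\tilde H'}(\tilde T^{\Phi\prime}_\lambda, \tilde T^{\Phi\prime}_\mu)$.
(b) $\tilde A(\Gamma, Z)' \cong \tilde A(\Gamma, Z')$.

3 $D^0_{\lambda,\mu}$ is the set of distinguished $W_\lambda$, $W_\mu$-double coset representatives with trivial intersection property. See [C, Sect. 2.7].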


(c) Any $\tilde H'$-homomorphism $\tilde T'_\mu \to \tilde T'_\lambda$ is obtained by left multiplication by an element $h \in \tilde H'$ satisfying $h x_\mu = x_\lambda h'$ for some $h' \in \tilde H'$. Thus, the algebra $\tilde A(\Gamma, Z')$ is a homomorphic image of the algebra of all $\Gamma \times \Gamma$ matrices $(h_{\lambda,\mu})$, where $h_{\lambda,\mu} \in \tilde H'$ satisfies $h_{\lambda,\mu} x_\mu \in x_\lambda \tilde H'$.

Proof. The first assertion in (a) is well-known [DJ1, (3.3)]. (Another argument can be based on [DPS1, (2.3.4), (2.3.5)].) Next, (a) implies (b) from Definition (2.1). To prove (c), let $f : \tilde T'_\mu = x_\mu \tilde H' \to \tilde T'_\lambda = x_\lambda \tilde H'$ be a right $\tilde H'$-module morphism. Then $f(x_\mu) = x_\lambda h'$ for some $h' \in \tilde H'$. By Lemma 1.1b, we have $f(x_\mu) = h x_\mu$ for some $h \in \tilde H'$ and it follows that f is left multiplication by h. Now the second assertion follows immediately from this fact.

In (2.1), we write $\tilde A(\Gamma)$ for $\tilde A(\Gamma, Z)$, so that Lemma 2.1 implies that $\tilde A(\Gamma)' \cong \tilde A(\Gamma, Z')$ for any commutative Z-algebra $Z'$. For $\Gamma \subseteq \Lambda$, let $e_\Gamma : \tilde T(\Lambda)' \to \tilde T(\Gamma)'$ be the idempotent projection. Then $e_\Gamma \in \tilde A(\Lambda)'$ and $\tilde A(\Gamma)' \cong e_\Gamma \tilde A(\Lambda)' e_\Gamma$. Because the map $\Lambda \to P(S)$ is surjective, there exists $\gamma \in \Lambda$ with $x_\gamma = 1$. Then $e = e_{\{\gamma\}}$ is an idempotent projection of $\tilde T(\Lambda)'$ onto a summand isomorphic to $\tilde H' = x_\gamma \tilde H'$. (According to our convention, we could also write $x_\emptyset$ for $x_\gamma$ here.) This implies that
$$\begin{cases} (1) \ \tilde T(\Lambda)' \cong \tilde A(\Lambda)' e \ \text{in } {}_{\tilde A(\Lambda)'}\mathcal{C}; \\ (2) \ e \tilde A(\Lambda)' e \cong \tilde H' \ \text{as algebras.} \end{cases} \qquad (2.2)$$
We now show that $d_{\tilde H'}$ (see (1.2)) induces a duality on ${}_{\tilde A(\Gamma)'}\mathcal{C}$. The following result is motivated by [CPS2, (1.2.1)].

Lemma 2.2. Let $Z'$ be a commutative Z-algebra. For any $\Gamma \subseteq \Lambda$, there is a contravariant “duality” functor $d_{\tilde A(\Gamma)'} : {}_{\tilde A(\Gamma)'}\mathcal{C} \to {}_{\tilde A(\Gamma)'}\mathcal{C}^{op}$ satisfying:
(a) $d_{\tilde A(\Gamma)'} \tilde T(\Gamma)' \cong \tilde T(\Gamma)'$; in fact, if $\tilde\varphi : \tilde T(\Gamma)^* \to \tilde T(\Gamma)$ is a Z-isomorphism satisfying $\tilde\varphi(hf) = \tilde\varphi(f)h^\iota$ for all $f \in \tilde T^*$, $h \in \tilde H$, then there exists an anti-automorphism $\tilde\beta$ of $\tilde A(\Gamma)$ such that $\tilde\beta^2 = 1$ and $\tilde\varphi(f a^{\tilde\beta}) = a\tilde\varphi(f)$ for all $f \in \tilde T^*$, $a \in \tilde A(\Gamma)$.
(b) $d^2_{\tilde A(\Gamma)'} \tilde M \cong \tilde M$ for any $\tilde M \in \mathrm{Ob}({}_{\tilde A(\Gamma)'}\mathcal{C})$ which is $Z'$-projective;
(c) If $\tilde M \in \mathrm{Ob}({}_{\tilde A(\Gamma)}\mathcal{C})$ is Z-projective, then $(d_{\tilde A(\Gamma)}(\tilde M))' \cong d_{\tilde A(\Gamma)'}(\tilde M')$.

Proof. By Lemma 1.1a, fix an isomorphism $\tilde\varphi : d_{\tilde H}(\tilde T(\Gamma)) \xrightarrow{\sim} \tilde T(\Gamma)$ of $\tilde H$-modules. Such a $\tilde\varphi$ satisfies the condition “$\tilde\varphi(hf) = \tilde\varphi(f)h^\iota$” stated in (a). Define an anti-automorphism $\tilde\beta$ of $\tilde A(\Gamma)$ by $\tilde\beta(f) = \tilde\varphi \circ d_{\tilde H}(f) \circ \tilde\varphi^{-1}$, $f \in \tilde A(\Gamma)$. For a commutative Z-algebra $Z'$, set $\tilde\beta' = \tilde\beta_{Z'}$. By (1.2), one sees easily that $\tilde\beta^2 = 1$. Given a left $\tilde A(\Gamma)'$-module $\tilde M$, put $d_{\tilde A(\Gamma)'} \tilde M = \mathrm{Hom}_{Z'}(\tilde M, Z')$ with the right action of $\tilde A(\Gamma)'$ converted to a left action by means of $\tilde\beta'$. The required properties for $d_{\tilde A(\Gamma)'}$ follow formally (see [CPS2, (1.2.1)] for (a), (b)), while (c) follows from the definitions.

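As explained in [CPS2, (1.2.2a)], the above result recasts in the present set-up a familiar phenomenon in the theory of permutation groups, where the role of $\tilde A(\Gamma)$ is played by the centralizer algebra of a permutation module for a finite group, and $\tilde\beta'$ corresponds to matrix transpose. In fact, by Lemma 2.1c, if $f \in \tilde A(\Gamma)'$ is the image of the matrix $(h_{\lambda,\mu})$, then $\tilde\beta'(f)$ is the image of the transpose of the matrix $(h^\iota_{\lambda,\mu})$.

Despite its simple description, the module $\tilde X$ (= $\tilde X'$ for the case $Z' = Z$) introduced below in Theorem 2.3c plays a central role in the representation theory of $\tilde A(\Lambda)$.

Theorem 2.3. Let $Z'$ be a commutative Z-algebra, and write $\tilde A' = \tilde A(\Lambda)_{Z'}$, $\tilde T' = \tilde T(\Lambda)_{Z'}$, and $\tilde T^{\Phi\prime} = \tilde T^\Phi(\Lambda)_{Z'}$. Then:
(a) For $\lambda \in \Lambda$, the left $\tilde A'$-module $\tilde T' y_\lambda$ satisfies $d_{\tilde A'} \tilde T' y_\lambda \cong \tilde T' y_\lambda$. If, in addition, $q_s + 1$ is not a zero divisor in $Z'$, $\forall s \in \lambda$, then $\tilde T' y_\lambda \cong \mathrm{Hom}_{\tilde H'}(\tilde T^{\Phi\prime}_\lambda, \tilde T') \in \mathrm{Ob}({}_{\tilde A'}\mathcal{C})$.
(b) For $\lambda, \mu \in \Lambda$, $\mathrm{Hom}_{\tilde A'}(\tilde T' y_\lambda, \tilde T' y_\mu) \cong \mathrm{Hom}_{\tilde H'}(y_\mu \tilde H', y_\lambda \tilde H')$ in $\mathcal{C}_{Z'}$.
(c) Put $\tilde X' = \bigoplus_{\lambda \in \Lambda} \tilde T' y_\lambda$. Then $\mathrm{End}_{\tilde A'}(\tilde X') \cong \tilde A'^{op}$.

Proof. We have $\tilde T' y_\lambda \in \mathrm{Ob}({}_{\tilde A'}\mathcal{C})$, since $\tilde T'$ is an $(\tilde A', \tilde H')$-bimodule. Next, observe that the natural map $y_\lambda \tilde T'^* \to (\tilde T' y_\lambda)^*$, sending $y_\lambda f$ to the linear function $t y_\lambda \mapsto f(t y_\lambda)$, $f \in \tilde T'^*$, $t \in \tilde T'$, is a well-defined isomorphism of right $\tilde A'$-modules. (The surjectivity of this map is immediate from Lemma 1.1f.) Let $\tilde\varphi : \tilde T^* \xrightarrow{\sim} \tilde T$ be the Z-isomorphism as in Lemma 2.2a. Since $y_\lambda^\iota = y_\lambda$, $\tilde\varphi$ defines (by restriction) an isomorphism of $y_\lambda \tilde T^*$, viewed as a left $\tilde A$-module by means of the anti-automorphism $\tilde\beta$, to $\tilde T y_\lambda$. Base-changing to $Z'$ gives a similar isomorphism $y_\lambda \tilde T'^* \cong \tilde T' y_\lambda$. Now the first assertion in (a) is clear.

By Lemma 1.1g, if $q_s + 1$ is not a zero divisor in $Z'$, $\forall s \in \lambda$, $y_\lambda$ is an almost-idempotent for any $x_\mu \tilde H'$, and hence for $\tilde T'$. Hence,
$$\mathrm{Hom}_{\tilde H'}(\tilde T^{\Phi\prime}_\lambda, \tilde T') = \mathrm{Hom}_{\tilde H'}(y_\lambda \tilde H', \tilde T') \cong \tilde T' y_\lambda,$$
completing the proof of (a). Using (2.2), Lemma 1.1e and (0.1) give
$$\mathrm{Hom}_{\tilde A'}(\tilde T' y_\lambda, \tilde T' y_\mu) \cong \mathrm{Hom}_{\tilde H'}(\tilde H' y_\lambda, \tilde H' y_\mu).$$
Now (b) follows from Lemma 1.1a.

If $f \in \mathrm{Hom}_{\tilde A'}(\tilde T' y_\lambda, \tilde T' y_\mu)$, $g \in \mathrm{Hom}_{\tilde A'}(\tilde T' y_\mu, \tilde T' y_\nu)$ map to $\bar f \in \mathrm{Hom}_{\tilde H'}(y_\mu \tilde H', y_\lambda \tilde H')$, $\bar g \in \mathrm{Hom}_{\tilde H'}(y_\nu \tilde H', y_\mu \tilde H')$, respectively, under the isomorphism given in (b), then $gf$ is sent to $\bar f \bar g$. Thus, $\mathrm{End}_{\tilde A'}(\tilde X') \cong \mathrm{End}_{\tilde H'}(\tilde T^{\Phi\prime})^{op} \cong \mathrm{End}_{\tilde H'}(\tilde T')^{op} \cong \tilde A'^{op}$, and (c) follows.

Note that Theorem 2.3a holds if $\Lambda$ is replaced by a subset $\Gamma \subseteq \Lambda$. We point out that the advantage of using $\tilde T' y_\lambda$ instead of $\mathrm{Hom}_{\tilde H'}(\tilde T^{\Phi\prime}_\lambda, \tilde T')$ as direct summands of $\tilde X'$ is the following base change property: $\tilde T' y_\lambda \cong (\tilde T y_\lambda)_{Z'}$, and hence, $\tilde X' \cong \tilde X_{Z'}$, for all commutative Z-algebras $Z'$ without restriction on any $q_s$.

In Corollary 2.4 below, we denote either of the contravariant functors $\mathrm{Hom}_{\tilde H'}(-, \tilde T') : \mathcal{C}_{\tilde H'} \to {}_{\tilde A'}\mathcal{C}$ and $\mathrm{Hom}_{\tilde A'}(-, \tilde T') : {}_{\tilde A'}\mathcal{C} \to \mathcal{C}_{\tilde H'}$ by $(-)^\diamond$. Thus, $\tilde X' \cong \tilde T^{\Phi\prime\diamond}$ provided that $q_s + 1$ is not a zero divisor in $Z'$, $\forall s \in S$. For any $\tilde A'$- or $\tilde H'$-module $\tilde M$, there is a natural “evaluation map” $\mathrm{Ev}_{\tilde M} : \tilde M \to \tilde M^{\diamond\diamond}$. (See [CPS2, (1.1)].)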


Corollary 2.4. With the above notation, we have:
(a) The evaluation maps $\mathrm{Ev}_{\tilde T^{\Phi\prime}_\lambda}$ and $\mathrm{Ev}_{\tilde T^{\Phi\prime\diamond}_\lambda}$ are isomorphisms, provided $q_s + 1$ is not a zero divisor in $Z'$, $\forall s \in \lambda$. Thus, if $q_s + 1$ is not a zero divisor in $Z'$, $\forall s \in S$, $(-)^\diamond$ induces an algebra isomorphism
$$\tilde f' : \tilde A'^{op} = \mathrm{End}_{\tilde H'}(\tilde T^{\Phi\prime})^{op} \xrightarrow{\sim} \mathrm{End}_{\tilde A'}(\tilde T^{\Phi\prime\diamond}) \overset{\mathrm{def}}{=} \tilde E'.$$
(b) Assume that $q_s + 1$ is not a zero divisor in $Z'$, $\forall s \in S$, and let $\tilde F' : {}_{\tilde E'}\mathcal{C} \to {}_{\tilde A'^{op}}\mathcal{C}$ be the equivalence of categories defined using pull-back through the isomorphism $\tilde f'$ in (a). Suppose that $\tilde N$ is a right $\tilde H'$-module such that $\mathrm{Ev}_{\tilde N}$ is an isomorphism. Then, in ${}_{\tilde A'^{op}}\mathcal{C}$,
$$\tilde F'(\mathrm{Hom}_{\tilde A'}(\tilde N^\diamond, \tilde T^{\Phi\prime\diamond})) \cong \mathrm{Hom}_{\tilde H'}(\tilde T^{\Phi\prime}, \tilde N). \qquad (2.3)$$

Proof. Taking $\mu = \emptyset$, we conclude from Theorem 2.3a,b, tracing through the maps involved, that $\mathrm{Ev}_{\tilde T^{\Phi\prime}_\lambda}$ is an isomorphism. It then follows formally that $\mathrm{Ev}_{\tilde T^{\Phi\prime\diamond}_\lambda}$ is an isomorphism. This proves (a). Then (b) follows by the functoriality of $(-)^\diamond$. (Observe in (2.3) that the natural left action of $\tilde A'$ on $\tilde T^{\Phi\prime}$ defines a right action of $\tilde A'$ on $\mathrm{Hom}_{\tilde H'}(\tilde T^{\Phi\prime}, \tilde N)$, while the left $\tilde E'$-module structure of $\tilde T^{\Phi\prime\diamond}$ defines a left action of $\tilde E'$ on $\mathrm{Hom}_{\tilde A'}(\tilde N^\diamond, \tilde T^{\Phi\prime\diamond})$.)

3. Cell Modules

In this section, we collect together a number of results from Kazhdan-Lusztig cell theory that will be needed later. The strong homological property described below in (3.5) will play a key role in what follows.

In [KL1] and [L2], certain bases $\{C_w\}_{w \in W}$ and $\{C'_w\}_{w \in W}$ were presented for the algebra $\tilde H_0 = \tilde H \otimes_Z \mathbb{Z}[q^{1/2}, q^{-1/2}]$. Following [DPS1, DPS2],$^4$ for $w \in W$, define $C_w^+ = q_w^{1/2} C'_w$ and $C_w^- = q_w^{-1/2} C_w$. Both $\{C_w^+\}_{w \in W}$ and $\{C_w^-\}_{w \in W}$ form a Z-basis for $\tilde H$. There are preorders $\leq_L$ and $\leq_R$ defined on W which satisfy:
$$\tilde H C_w^\varepsilon \subseteq \sum_{y \leq_L w} Z C_y^\varepsilon, \quad \text{and} \quad C_w^\varepsilon \tilde H \subseteq \sum_{y \leq_R w} Z C_y^\varepsilon \qquad (3.1)$$
for each choice $\varepsilon = \pm$. In particular, $\tau_s C_w^+ = q_s C_w^+$ (resp., $\tau_s C_w^- = -C_w^-$) if $sw < w$, $s \in S$.

Let $\leq_{LR}$ be the preorder generated by $\leq_L$, $\leq_R$. The equivalence classes in W defined by $\leq_L$, $\leq_R$ and $\leq_{LR}$ are the left cells, right cells and two-sided cells, respectively. For example, $w \sim_L y$ if and only if $w \leq_L y$ and $y \leq_L w$. Let $\Omega$ (resp., $\Xi$) be the set of left (resp., two-sided) cells in W. For $\omega \in \Omega$, $\omega^{-1} = \{y^{-1} \mid y \in \omega\}$ is a right cell. Also, the preorder $\leq_L$ (resp., $\leq_{LR}$) defines a poset structure on $\Omega$ (resp., $\Xi$).

Given $x \in W$, its right-set R(x) (resp., left-set L(x)) is the subset of S consisting of those $s \in S$ such that $xs < x$ (resp., $sx < x$). It is well-known that, if $x \leq_R y$ (resp., $x \leq_L y$), then $L(x) \supseteq L(y)$ (resp., $R(x) \supseteq R(y)$) [KL1, (2.4)], [X, (1.20)].

4 Although it would be simpler to work with the $C'_w$-basis (or the $C_w$-basis), and therefore with $\tilde H_0$, stronger results are obtained using the Hecke algebra $\tilde H$ over the smaller ring $\mathbb{Z}[q, q^{-1}]$; see, e.g., the discussion in [DPS2]. Some results below, such as Lemma 6.4 and the proof of Theorem 6.6, would simplify somewhat over $\mathbb{Z}[q^{1/2}, q^{-1/2}]$.


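Further, for any $\lambda \in \Lambda$, $\{C_y^+ \mid J(\lambda) \subseteq L(y)\}$ is a Z-basis for $\tilde T_\lambda = x_\lambda \tilde H$. Similarly, $\{C_y^+ \mid J(\lambda) \subseteq R(y)\}$ is a Z-basis for $\tilde H x_\lambda$. See [DPS1, (2.3.5)].

Let $\omega \in \Omega$ be a left cell. Because of (3.1), $\omega$ determines a left cell module $\tilde E_\omega$:
$$\tilde E_\omega = \Big(\sum_{y \leq_L w} Z C_y^+\Big) \Big/ \Big(\sum_{y <_L w} Z C_y^+\Big) \in \mathrm{Ob}({}_{\tilde H}\mathcal{C}) \quad (w \in \omega \text{ fixed}). \qquad (3.2)$$
Then $\tilde S_\omega = \mathrm{Hom}_Z(\tilde E_\omega, Z) \in \mathrm{Ob}(\mathcal{C}_{\tilde H})$ is the corresponding dual left cell module. The right cell $\gamma = \omega^{-1}$ defines, in the same way, a right cell module
$$\tilde K_\gamma = \Big(\sum_{y \leq_L w} Z C^+_{y^{-1}}\Big) \Big/ \Big(\sum_{y <_L w} Z C^+_{y^{-1}}\Big) \in \mathrm{Ob}(\mathcal{C}_{\tilde H}) \quad (w^{-1} \in \gamma \text{ fixed}). \qquad (3.3)$$
Since $\iota(C_y^+) = C^+_{y^{-1}}$ (from the uniqueness in [KL1, (1.1)]), the definitions imply that
$$d_{\tilde H} \tilde S_\omega \cong \tilde K_\gamma, \quad \gamma = \omega^{-1}. \qquad (3.4)$$
By an $\tilde S$-filtration of a right $\tilde H$-module $\tilde M$, we mean a filtration $\tilde F_\bullet : 0 = \tilde F_0 \subset \tilde F_1 \subset \cdots \subset \tilde F_t = \tilde M$ by $\tilde H$-submodules such that each nonzero section $\tilde F_i/\tilde F_{i-1}$ is isomorphic to $\tilde S_{\omega_i}$ for some $\omega_i \in \Omega$. For example, [DPS1, (2.3.9)] establishes that the modules $\tilde T_\lambda = x_\lambda \tilde H$, $\lambda \in \Lambda$, have an $\tilde S$-filtration $\tilde F_{\lambda\bullet}$ which satisfies, in addition, the strong homological property that
$$\mathrm{Ext}^1_{\tilde H'}(\tilde T'_\lambda/\tilde F'_{\lambda i}, \tilde T'_\mu) = 0, \quad \forall i, \ \mu \in \Lambda, \qquad (3.5)$$
whenever $Z'$ is a Z-algebra which is an integral domain with the property that $\tilde H_K$ is semisimple over the fraction field K of $Z'$. Further, $\tilde F_{\lambda\bullet}$ has bottom section $\tilde S_{\omega_\lambda}$, where $\omega_\lambda$ is the left cell containing the longest word $w_{0,\lambda} \in W_\lambda$.

Theorem 3.1. Let $\Gamma \subseteq \Lambda$. Define $\tilde\Delta(\omega, \Gamma) = \mathrm{Hom}_{\tilde H}(\tilde S_\omega, \tilde T(\Gamma)) \in \mathrm{Ob}({}_{\tilde A(\Gamma)}\mathcal{C})$, $\omega \in \Omega$. Then:
(a) The Z-module $\tilde\Delta(\omega, \Gamma)$ is free.
(b) The $\tilde A(\Gamma)$-module $\tilde T(\Gamma)$ has a $\{\tilde\Delta(\omega, \Gamma)\}_{\omega \in \Omega}$-filtration.

Proof. For (a), see [DPS1, (2.5.1)]. As described above, $\tilde H = x_\emptyset \tilde H$ has an $\tilde S$-filtration $\tilde F_\bullet = \tilde F_{\emptyset\bullet}$ satisfying (3.5). Thus,
$$\mathrm{Ext}^1_{\tilde H}(\tilde H/\tilde F_i, \tilde T(\Gamma)) \cong \bigoplus_{\mu \in \Gamma} \mathrm{Ext}^1_{\tilde H}(\tilde H/\tilde F_i, \tilde T_\mu) = 0. \qquad (3.6)$$
For an integer i, define $\tilde G_i = \mathrm{Hom}_{\tilde H}(\tilde H/\tilde F_i, \tilde T(\Gamma))$. Then $\tilde T(\Gamma) = \tilde G_0 \supset \tilde G_1 \supset \cdots \supset \tilde G_t = 0$ for some t, so that $\tilde G_\bullet$ is a filtration of $\tilde T(\Gamma) = \mathrm{Hom}_{\tilde H}(\tilde H, \tilde T(\Gamma))$. Furthermore, for any i, applying the functor $\mathrm{Hom}_{\tilde H}(-, \tilde T(\Gamma))$ to the short exact sequence $0 \to \tilde F_i/\tilde F_{i-1} \to \tilde H/\tilde F_{i-1} \to \tilde H/\tilde F_i \to 0$ and using (3.6) yields an exact sequence
$$0 \to \mathrm{Hom}_{\tilde H}(\tilde H/\tilde F_i, \tilde T(\Gamma)) \to \mathrm{Hom}_{\tilde H}(\tilde H/\tilde F_{i-1}, \tilde T(\Gamma)) \to \mathrm{Hom}_{\tilde H}(\tilde F_i/\tilde F_{i-1}, \tilde T(\Gamma)) \to 0$$
by the long exact sequence of $\mathrm{Ext}^\bullet_{\tilde H}$. For some $\omega_i \in \Omega$, we have $\tilde F_i/\tilde F_{i-1} \cong \tilde S_{\omega_i}$. Therefore,
$$\tilde G_{i-1}/\tilde G_i \cong \mathrm{Hom}_{\tilde H}(\tilde S_{\omega_i}, \tilde T(\Gamma)) = \tilde\Delta(\omega_i, \Gamma),$$
as required.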

Let $Z'$ be a commutative Z-algebra. To simplify notation, we will as before continue to denote the bases $\{C_w^\varepsilon \otimes 1\}_{w \in W}$ ($\varepsilon = \pm$) for $\tilde H' = \tilde H_{Z'}$ by $\{C_w^\varepsilon\}_{w \in W}$. In Sect. 6, we use the basis for $\tilde H$ dual to the $C_w^+$-basis under the “trace form” $\langle \, , \, \rangle$ introduced above Lemma 1.1:

Lemma 3.2. Let $\{D_w^+\}_{w \in W}$ be the dual basis for $\tilde H'$ with respect to $\{C_w^+\}_{w \in W}$ in the sense that $\langle C_w^+, D_y^+ \rangle = \delta_{w,y^{-1}}$, $\forall y, w \in W$. Then:
(a) If $C_x^+ D^+_{y^{-1}} \neq 0$, then $y \leq_L x$.
(b) For any $y \in W$, $\tilde H' D_y^+ \subseteq \sum_{z \geq_L y} Z' D_z^+$ and $D_y^+ \tilde H' \subseteq \sum_{z \geq_R y} Z' D_z^+$.

Proof. For $Z' = Z$, (a) follows from [L1, (5.1.14)]. (In fact, this is an easy calculation using (3.1).) Also, (b) is immediate from (3.1). Now for a homomorphism $Z \to Z'$, the lemma follows by applying the base change functor $- \otimes_Z Z'$.

4. Quasi-Hereditary Algebras

From Sect. 5 on, we will restrict to the Hecke algebras associated to symmetric groups and will mainly consider the q-Schur algebras as the Hecke endomorphism algebras defined in (2.1). Since q-Schur algebras are quasi-hereditary (see below), we briefly review the general theory of quasi-hereditary algebras in this section. For more details of the elementary theory, see [CPS1].

Let A be a finite dimensional algebra over a field k whose irreducible modules are absolutely irreducible. Let $\Lambda^+$ be a set indexing the representatives from the distinct isomorphism classes of irreducible (left) A-modules: for $\lambda \in \Lambda^+$, $L(\lambda) \in \mathrm{Ob}({}_A\mathcal{C})$ is the associated irreducible. Let $P(\lambda) \in \mathrm{Ob}({}_A\mathcal{C})$ (resp., $I(\lambda) \in \mathrm{Ob}({}_A\mathcal{C})$) be the projective cover (resp., injective envelope) of $L(\lambda)$. Suppose that $\Lambda^+$ has a fixed poset structure $\leq$ such that for $\lambda \in \Lambda^+$ there exists $\Delta(\lambda) \in \mathrm{Ob}({}_A\mathcal{C})$ satisfying the following two conditions:
(i) $L(\lambda)$ is the head of $\Delta(\lambda)$ and all other composition factors $L(\mu)$ satisfy $\mu < \lambda$;
(ii) $P(\lambda)$ has a filtration with top section $\Delta(\lambda)$ and lower sections $\Delta(\mu)$ for some $\mu > \lambda$.
Then ${}_A\mathcal{C}$ is a highest weight category (HWC) with poset $\Lambda^+$. The algebra A is quasi-hereditary if and only if ${}_A\mathcal{C}$ is a HWC for a poset structure on $\Lambda^+$. If ${}_A\mathcal{C}$ is a HWC, there exist $\nabla(\lambda) \in \mathrm{Ob}({}_A\mathcal{C})$, $\lambda \in \Lambda^+$, such that
(i°) $\nabla(\lambda)$ has socle $L(\lambda)$ and all other composition factors $L(\mu)$ satisfy $\mu < \lambda$;
(ii°) $I(\lambda)$ has a filtration with bottom section $\nabla(\lambda)$ and higher sections $\nabla(\mu)$ for some $\mu > \lambda$.
Of course, these conditions are just dual to conditions (i), (ii) above. In case A needs to be explicitly mentioned, write $\Delta(\lambda, {}_A\mathcal{C})$, $P(\lambda, {}_A\mathcal{C})$, etc. for $\Delta(\lambda)$, $P(\lambda)$, etc.

The modules in $\Delta = \{\Delta(\lambda)\}$ and $\nabla = \{\nabla(\lambda)\}$, called the standard and costandard modules, respectively, of the HWC ${}_A\mathcal{C}$, satisfy strong homological properties: for $\lambda, \mu \in \Lambda^+$, $n \in \mathbb{Z}^+$,
$$\begin{cases} (1) \ \dim \mathrm{Ext}^n_A(\Delta(\lambda), \nabla(\mu)) = \delta_{n0}\delta_{\lambda\mu}; \\ (2) \ n > 0 \ \& \ \mathrm{Ext}^n_A(\Delta(\lambda), \Delta(\mu)) \neq 0 \implies \lambda < \mu; \\ (3) \ n > 0 \ \& \ \mathrm{Ext}^n_A(\nabla(\lambda), \nabla(\mu)) \neq 0 \implies \lambda > \mu. \end{cases} \qquad (4.1)$$

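(This well known fact is immediate from the proof of [CPS1, (3.11)].) Any $M \in \mathrm{Ob}({}_A\mathcal{C})$ with both a $\Delta$-filtration and a $\nabla$-filtration is a tilting module, i.e., $M \in {}_A\mathcal{C}(\mathrm{tilt}) \overset{\mathrm{def}}{=} {}_A\mathcal{C}(\Delta) \cap {}_A\mathcal{C}(\nabla)$. If $M \in {}_A\mathcal{C}(\Delta)$ and $N \in {}_A\mathcal{C}(\nabla)$, the n = 1 case of (4.1(1)) implies that $\mathrm{Ext}^n_A(M, N) = 0$ for n > 0.

Ringel [R] has obtained some results on ${}_A\mathcal{C}(\mathrm{tilt})$: For $\lambda \in \Lambda^+$, there exists a unique indecomposable $X(\lambda) \in {}_A\mathcal{C}(\mathrm{tilt})$ such that $\lambda$ is the maximal $\mu \in \Lambda^+$ for which $[X(\lambda) : L(\mu)] \neq 0$. In fact, $[X(\lambda) : L(\lambda)] = 1$. By (4.1(2)), $X(\lambda)$ has a $\Delta$-filtration with bottom section $\Delta(\lambda)$ and “higher” sections $\Delta(\mu)$ for some $\mu < \lambda$. Similarly, $X(\lambda)$ has a $\nabla$-filtration with top section $\nabla(\lambda)$ and “lower” sections $\nabla(\mu)$ for some $\mu < \lambda$. Every $X \in {}_A\mathcal{C}(\mathrm{tilt})$ has a decomposition $X = \bigoplus_{\lambda \in \Lambda^+} X(\lambda)^{\oplus m_\lambda(X)}$. If each integer $m_\lambda(X) > 0$, then X is a full tilting module. In turn, the $X(\lambda)$, $\lambda \in \Lambda^+$, are often called the indecomposable partial tilting modules for ${}_A\mathcal{C}$.

Given a full tilting module X, “the” Ringel dual $A^\star = \mathrm{End}_A(X)$ of A is quasi-hereditary. If Y is another full tilting module, the algebras $\mathrm{End}_A(X)$ and $\mathrm{End}_A(Y)$ are Morita equivalent; thus, ${}_A\mathcal{C}^\star \overset{\mathrm{def}}{=} {}_{A^\star}\mathcal{C}$ (the Ringel dual of ${}_A\mathcal{C}$) is a HWC (with poset the opposite poset $(\Lambda^+, \leq^{op})$). (In other words, $\lambda \leq^{op} \mu$ if and only if $\mu \leq \lambda$.) Also, X is a full tilting module for $A^\star$ and $\mathrm{End}_{A^\star}(X) \cong A$. For $\lambda \in \Lambda^+$, $\Delta^\star(\lambda) = \mathrm{Hom}_A(\Delta(\lambda), X)$ is the $\Delta$-object in ${}_A\mathcal{C}^\star$ corresponding to $\lambda$.$^5$

Let $\Gamma^+$ be an ideal in the poset $\Lambda^+$ of a HWC ${}_A\mathcal{C}$ (i.e., $\omega \leq \gamma \in \Gamma^+$ implies that $\omega \in \Gamma^+$). The full subcategory ${}_A\mathcal{C}[\Gamma^+]$ of ${}_A\mathcal{C}$ consisting of objects having composition factors $L(\gamma)$, $\gamma \in \Gamma^+$, is a HWC with poset $\Gamma^+$. Its standard objects $\Delta(\gamma, {}_A\mathcal{C}[\Gamma^+])$ and costandard objects $\nabla(\gamma, {}_A\mathcal{C}[\Gamma^+])$, $\gamma \in \Gamma^+$, are defined by: $\Delta(\gamma, {}_A\mathcal{C}[\Gamma^+]) = \Delta(\gamma, {}_A\mathcal{C})$ and $\nabla(\gamma, {}_A\mathcal{C}[\Gamma^+]) = \nabla(\gamma, {}_A\mathcal{C})$. For $\gamma \in \Gamma^+$, $X(\gamma, {}_A\mathcal{C}) \in \mathrm{Ob}({}_A\mathcal{C}[\Gamma^+])$, and so $X(\gamma, {}_A\mathcal{C})$ identifies with $X(\gamma, {}_A\mathcal{C}[\Gamma^+])$.

The quotient category ${}_A\mathcal{C}(\Omega^+) = {}_A\mathcal{C}/{}_A\mathcal{C}[\Gamma^+]$ is a HWC with poset $\Omega^+ = \Lambda^+ \setminus \Gamma^+$ (using the induced poset structure $\leq$). For some idempotent $e \in A$, we have ${}_A\mathcal{C}(\Omega^+) \cong {}_{eAe}\mathcal{C}$. The exact functor $j^* : {}_A\mathcal{C} \to {}_{eAe}\mathcal{C}$, $M \mapsto j^*(M) = eM$, carries $L(\omega)$ (resp., $\Delta(\omega)$, $\nabla(\omega)$), for $\omega \in \Omega^+$, to the corresponding objects in ${}_{eAe}\mathcal{C}$. If $\gamma \in \Gamma^+$, however, we have that $j^*L(\gamma) \cong j^*\Delta(\gamma) \cong j^*\nabla(\gamma) = 0$. Hence,
$$\begin{cases} j^* \, {}_A\mathcal{C}(\Delta) \subseteq {}_{eAe}\mathcal{C}(\Delta) \\ j^* \, {}_A\mathcal{C}(\nabla) \subseteq {}_{eAe}\mathcal{C}(\nabla) \\ j^* \, {}_A\mathcal{C}(\mathrm{tilt}) \subseteq {}_{eAe}\mathcal{C}(\mathrm{tilt}). \end{cases} \qquad (4.2)$$

Lemma 4.1. Let $\Omega^+ = \Lambda^+ \setminus \Gamma^+$ be a coideal in the poset $\Lambda^+$ of the HWC ${}_A\mathcal{C}$. Then:
(a) For $\omega \in \Omega^+$, we have $j^*X(\omega) \cong X(\omega, {}_{eAe}\mathcal{C})$. Hence, $j^* \, {}_A\mathcal{C}(\mathrm{tilt}) = {}_{eAe}\mathcal{C}(\mathrm{tilt})$.
(b) We have
$${}_A\mathcal{C}(\Omega^+)^\star \cong {}_A\mathcal{C}^\star[\Omega^+], \qquad (4.3)$$
where, on the right-hand side, $\Omega^+$ is regarded as an ideal in the opposite poset $(\Lambda^+, \leq^{op})$.

Proof. By (4.2), $j^*X(\omega) \in {}_{eAe}\mathcal{C}(\mathrm{tilt})$ for $\omega \in \Omega^+$. Since $[j^*X(\omega) : j^*L(\omega)] = 1$ and $\omega$ is the maximal element $\tau \in \Omega^+$ such that $j^*L(\tau)$ is a composition factor of $j^*X(\omega)$, (a) follows if $j^*X(\omega)$ is indecomposable. By an (essentially elementary) homological argument [CPS2, (3.4.6.2)] using adjoints and $\nabla$-filtrations, the restriction map $\mathrm{End}_A(X(\gamma)) \to \mathrm{End}_{eAe}(j^*X(\gamma))$ is surjective.$^6$ Since $X(\gamma)$ is indecomposable, $\mathrm{End}_A(X(\gamma))$ is a local algebra. Hence, $\mathrm{End}_{eAe}(j^*X(\gamma))$ is also local, so $j^*X(\gamma)$ is indecomposable. This proves (a). Finally, (b) follows from [CPS2, (3.4.6)].

5 The reader can also consult [CPS2, Sect. 3.4] for another treatment of these results.

We conclude this section with the following result. It will be particularly useful to know that tilting modules satisfy the base change property (4.4); see (c) below. This property was originally suggested by a corresponding well-known property of projective modules (which tilting modules become, under appropriate functorial transformations).

Lemma 4.2. Let $Z = \mathbb{Z}[q, q^{-1}]$ and let $\tilde A$ be an arbitrary Z-algebra which is finitely generated and projective as a Z-module. Suppose $\tilde M, \tilde N \in \mathrm{Ob}({}_{\tilde A}\mathcal{C})$ are Z-projective. Then:
(a) Suppose $\mathrm{Ext}^i_{\tilde A}(\tilde M, \tilde N) = 0$ for i = 1, 2. For any commutative Z-algebra $Z'$, we have:
$$\mathrm{Hom}_{\tilde A}(\tilde M, \tilde N)_{Z'} \cong \mathrm{Hom}_{\tilde A_{Z'}}(\tilde M_{Z'}, \tilde N_{Z'}). \qquad (4.4)$$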

(b) Let $n \geq 1$ and suppose that $\mathrm{Ext}^n_{\tilde A}(\tilde M, \tilde N) \neq 0$. Then there exists $\mathfrak{p} \in \mathrm{Supp}(\mathrm{Ext}^n_{\tilde A}(\tilde M, \tilde N))$ such that, if $k = Z_{\mathfrak{p}}/\mathfrak{p}Z_{\mathfrak{p}}$, then $\mathrm{Ext}^n_{\tilde A_k}(\tilde M_k, \tilde N_k) \neq 0$.
(c) Suppose for each field k which is a Z-algebra, $\tilde A_k$ is quasi-hereditary and that $\tilde M_k \in {}_{\tilde A_k}\mathcal{C}(\Delta)$, $\tilde N_k \in {}_{\tilde A_k}\mathcal{C}(\nabla)$. (In particular, this holds if $\tilde M_k, \tilde N_k \in {}_{\tilde A_k}\mathcal{C}(\mathrm{tilt})$.) Then (4.4) holds for any commutative Z-algebra $Z'$.

Proof. (a) Let $\tilde P_\bullet \to \tilde M \to 0$ be a resolution of $\tilde M$ by projective $\tilde A$-modules $\tilde P_i$. The hypotheses of (a) imply that the complex
$$0 \to \mathrm{Hom}_{\tilde A}(\tilde M, \tilde N) \to \mathrm{Hom}_{\tilde A}(\tilde P_0, \tilde N) \to \mathrm{Hom}_{\tilde A}(\tilde P_1, \tilde N) \to \mathrm{Hom}_{\tilde A}(\tilde P_2, \tilde N) \to \mathrm{Hom}_{\tilde A}(\tilde P_3, \tilde N)$$
is exact. Also, because the ring $Z = \mathbb{Z}[q, q^{-1}]$ has global dimension 2, [DPS2, (0.1)] implies that the terms in the above complex are Z-projective. (The result [DPS2, (0.1)] is an elementary linear algebra argument based on a commutative algebra result of Auslander-Goldman [AG, Cor., p. 17].) Thus, the kernel $\tilde X$ of the map $\mathrm{Hom}_{\tilde A}(\tilde P_2, \tilde N) \to \mathrm{Hom}_{\tilde A}(\tilde P_3, \tilde N)$ is also Z-projective (again since Z has global dimension 2). It follows that the acyclic complex
$$0 \to \mathrm{Hom}_{\tilde A}(\tilde M, \tilde N) \to \mathrm{Hom}_{\tilde A}(\tilde P_0, \tilde N) \to \mathrm{Hom}_{\tilde A}(\tilde P_1, \tilde N) \to \tilde X \to 0 \qquad (4.5)$$
splits as a complex of Z-modules (in the sense that the kernels and cokernels of the various maps are Z-direct summands). Hence, (4.5) remains acyclic after applying the functor $- \otimes_Z Z'$. Since $\tilde P_0$ and $\tilde P_1$ are $\tilde A$-projective, $\mathrm{Hom}_{\tilde A}(\tilde Q, \tilde N)_{Z'} \cong \mathrm{Hom}_{\tilde A_{Z'}}(\tilde Q_{Z'}, \tilde N_{Z'})$ for $\tilde Q = \tilde P_0, \tilde P_1$. Thus, $\mathrm{Hom}_{\tilde A}(\tilde M, \tilde N)_{Z'} \cong \mathrm{Hom}_{\tilde A_{Z'}}(\tilde M_{Z'}, \tilde N_{Z'})$ as claimed in (4.4). This completes the proof of (a).

6 We point out that line 9 of [CPS2, p.58] contains a misprint. The last isomorphism should be j j∗ T ∼ = ∗ j∗ j ∗ T .


When n = 1, (b) is proved in [DPS2, (2.9)]. Because $\tilde M$ is Z-projective, (b) follows by induction on n, using dimension shifting.

To see (c), (4.1) implies that $\mathrm{Ext}^n_{\tilde A_k}(\tilde M_k, \tilde N_k) = 0$ for all n > 0. Therefore, $\mathrm{Ext}^n_{\tilde A}(\tilde M, \tilde N) = 0$ for all n > 0 by (b). Now we can apply (a) to conclude that (c) holds.

An alternate argument for Lemma 4.2a can be based on the convergent homology spectral sequence
$$E^2_{pq} = \mathrm{Tor}^Z_p(\mathrm{Ext}^q_{\tilde A}(\tilde M, \tilde N), Z') \Rightarrow \mathrm{Ext}^{q-p}_{\tilde A_{Z'}}(\tilde M_{Z'}, \tilde N_{Z'})$$
in which $d^2_{pq} : E^2_{pq} \to E^2_{p-2,q-1}$. Assuming this fact, (a) follows, since Z has global dimension 2 so that $\mathrm{Tor}^Z_p = 0$ for p > 2. The above spectral sequence is given in [DS, (2.9)] in a somewhat different context, but also follows (after reindexing) from the Künneth spectral sequence given in [Wi, (5.6.4)]. (The boundedness assumption required in [Wi, (5.6.4)] can be achieved by truncating the complex P there, or by observing that it is not necessary since Z has finite global dimension.)

5. q-Schur Algebras

For the rest of this paper, let $W = S_r$, the symmetric group of degree r, and $S = \{(1, 2), \cdots, (r-1, r)\}$. For a positive integer n, let $\Lambda(n, r)$ (resp., $\Lambda^+(n, r)$) be the set of compositions (resp., partitions) of r with n (resp., at most n non-zero) parts. (A composition of r with n parts is a sequence $\lambda = (\lambda_1, \cdots, \lambda_n)$ of non-negative integers with $\sum_i \lambda_i = r$; $\lambda$ is a partition if $\lambda_1 \geq \lambda_2 \geq \cdots$.) Let $\Lambda^+(r) = \Lambda^+(r, r)$ and $\Lambda(r) = \Lambda(r, r)$. For any $n \geq r$, $\Lambda^+(n, r) = \Lambda^+(r)$. The set $\Lambda(n, r)$ has a poset structure $\trianglelefteq$ defined by setting $\lambda \trianglelefteq \mu$ if and only if $\lambda_1 + \cdots + \lambda_i \leq \mu_1 + \cdots + \mu_i$ for all i. Then $\Lambda^+(n, r)$ is a coideal in $\Lambda^+(r)$ for all n, r. If $n \leq r$, any $\lambda \in \Lambda(n, r)$ can be regarded as an element in $\Lambda(r)$ by adding a string of $r - n$ zeros to $\lambda$. With this identification, $\Lambda(n, r)$ is a coideal in $\Lambda(r)$.

There is a natural map J from $\Lambda(n, r)$ to the power set $P(S)$ of S: For $\lambda \in \Lambda(n, r)$, let $Y(\lambda)$ be the Young diagram of shape $\lambda$ and let $t^\lambda$ be the tableau of shape $\lambda$ obtained by filling in the boxes in the first row of $Y(\lambda)$ consecutively with the integers $1, 2, \cdots, \lambda_1$, the second row with the integers $\lambda_1 + 1, \cdots, \lambda_1 + \lambda_2$, etc. Then we define $J(\lambda)$ as the subset of S consisting of those s which stabilize the integers in the rows of $t^\lambda$. Every subset of S has the form $J(\lambda)$ for some $\lambda \in \Lambda(n, r)$ if $n \geq r$.

In the set-up of the previous sections, we worked with a pair $(\Gamma, \Lambda)$ in which $\Lambda$ is a poset, together with a surjective map $J : \Lambda \to P(S)$, and $\Gamma \subseteq \Lambda$. In the present section, we will put $\Gamma = \Lambda(n, r)$, with $\Lambda = \Gamma$ if $n \geq r$ and with $\Lambda = \Lambda(r)$ if $n < r$. Now, the parabolic subgroup $W_\lambda$ of W is precisely the row stabilizer of $t^\lambda$. By rearranging terms, every composition $\lambda \in \Lambda(n, r)$ determines a unique partition $\lambda^+ \in \Lambda^+(n, r)$. Clearly, $\lambda \trianglelefteq \lambda^+$ and $\lambda^+$ is the minimal partition with this property. Also, $W_\lambda$ and $W_\mu$ are W-conjugate if and only if $\lambda^+ = \mu^+$.
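The combinatorial objects just introduced are easy to experiment with. The following Python sketch enumerates $\Lambda(n, r)$, computes $\lambda^+$, tests the dominance order, and computes $J(\lambda)$ as a list of indices i standing for the simple reflections $(i, i+1)$. The function names (`compositions`, `to_partition`, `dominated_by`, `J`) are hypothetical helpers chosen for this illustration and are not notation from the paper.

```python
def compositions(n, r):
    """All compositions of r with exactly n non-negative parts (the set Lambda(n, r))."""
    if n == 1:
        return [(r,)]
    return [
        (first,) + rest
        for first in range(r + 1)
        for rest in compositions(n - 1, r - first)
    ]


def to_partition(lam):
    """lambda^+: the parts of lam rearranged in weakly decreasing order."""
    return tuple(sorted(lam, reverse=True))


def dominated_by(lam, mu):
    """True if every partial sum of lam is at most the matching partial sum of mu."""
    n = max(len(lam), len(mu))
    lam = tuple(lam) + (0,) * (n - len(lam))
    mu = tuple(mu) + (0,) * (n - len(mu))
    sum_lam = sum_mu = 0
    for a, b in zip(lam, mu):
        sum_lam += a
        sum_mu += b
        if sum_lam > sum_mu:
            return False
    return True


def J(lam):
    """Indices i such that s_i = (i, i+1) preserves the rows of the tableau t^lambda."""
    gens, end = [], 0
    for part in lam:
        # a row holding end+1, ..., end+part contributes s_i for i = end+1, ..., end+part-1
        gens.extend(range(end + 1, end + part))
        end += part
    return gens


if __name__ == "__main__":
    lams = compositions(3, 3)
    print(len(lams))                                          # size of Lambda(3, 3)
    print(all(dominated_by(l, to_partition(l)) for l in lams))  # lambda is dominated by lambda^+
    print(sorted({tuple(J(l)) for l in lams}))                # subsets of S that occur as J(lambda)
```

For n = r = 3 the last line lists all four subsets of S, in line with the remark that every subset of S has the form $J(\lambda)$ when $n \geq r$.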
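The generic algebra $\tilde H$ over $\mathcal{Z} = \mathbb{Z}[q, q^{-1}]$ for $W = S_r$ is defined in (1.1) with each $c_s = 1$: thus, $q_s = q$ for all $s \in S$. Abbreviate $\tilde T(\Lambda(n,r))$ as $\tilde T(n,r)$. Explicitly,

$$\tilde T(n,r) = \bigoplus_{\lambda \in \Lambda(n,r)} \tilde T_\lambda \cong \bigoplus_{\lambda \in \Lambda^+(n,r)} \tilde T_\lambda^{\oplus d_\lambda},$$

where $d_\lambda = \#\{\mu \mid \mu^+ = \lambda\}$. (By Lemma 1.1d, $\tilde T_\lambda \cong \tilde T_{\lambda^+}$.) For $\lambda \in \Lambda(r)$, let $\lambda' \in \Lambda^+(r)$ be the dual partition: thus, $\lambda'_i = \#\{\lambda_j \ge i\}$. We require the following purely combinatorial result, closely related to Lemma 1.1h.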
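Before stating it, we note that the multiplicities $d_\lambda$ and the dual partition $\lambda'$ are easy to compute in small cases. The sketch below is illustrative only; the function names `dual` and `d` are our own, and $\mu$ is taken to range over $\Lambda(n,r)$.

```python
from itertools import product

def dual(lam):
    """Dual partition: the i-th part counts the parts of lam that are >= i."""
    parts = [x for x in lam if x > 0]
    return tuple(sum(1 for x in parts if x >= i)
                 for i in range(1, max(parts, default=0) + 1))

def d(lam, n):
    """Number of compositions mu with n parts whose rearrangement mu^+ equals lam."""
    r = sum(lam)
    target = tuple(sorted(lam, reverse=True)) + (0,) * (n - len(lam))
    return sum(1 for mu in product(range(r + 1), repeat=n)
               if sum(mu) == r and tuple(sorted(mu, reverse=True)) == target)
```

With $n = r = 3$ this gives $d_{(3)} = 3$, $d_{(2,1)} = 6$ and $d_{(1,1,1)} = 1$, and the three values add up to the $10$ compositions in $\Lambda(3,3)$, as the direct sum decomposition of $\tilde T(3,3)$ requires.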


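Lemma 5.1. ([DJ1, (4.3)]) Suppose that $\mathcal{Z}'$ is a commutative $\mathcal{Z}$-algebra in which $q + 1$ is not a zero divisor. Then:
(a) For $\lambda, \mu \in \Lambda(r)$, if $\operatorname{Hom}_{\tilde H'}(\tilde T'^{\Phi}_{\lambda'}, \tilde T'_\mu) \ne 0$, then $\lambda \trianglerighteq \mu$.
(b) For $\lambda \in \Lambda(r)$, $\operatorname{Hom}_{\tilde H'}(\tilde T'^{\Phi}_{\lambda'}, \tilde T'_\lambda)$ is a free $\mathcal{Z}'$-module of rank 1.

We will now make use of the Kazhdan-Lusztig cell theory for $W$. First, there is a poset isomorphism $\alpha : (\Lambda^+(r), \trianglelefteq) \xrightarrow{\sim} (\Xi, \le^{\mathrm{op}}_{LR})$. Explicitly, $\alpha(\lambda)$ is the two-sided cell $\xi \in \Xi$ containing the longest word $w_{0,\lambda} \in W_\lambda$; see [DPS2, Sect. 2]. Two left cell modules $\tilde E_\omega$ and $\tilde E_{\omega'}$ are isomorphic if and only if $\omega, \omega'$ are contained in the same two-sided cell $\xi \in \Xi$ (and this occurs if and only if $\tilde E_{\omega\mathbb{Q}(q)} \cong \tilde E_{\omega'\mathbb{Q}(q)}$ are isomorphic irreducible modules).⁷ Thus, for $\xi \in \Xi$, let $\tilde S_\xi = \tilde S_\omega$ for any left cell $\omega \subseteq \xi$. If $\alpha(\lambda) = \xi$, we also denote $\tilde S_\xi$ by $\tilde S_\lambda$. We follow a similar convention, relative to the modules $\tilde\Delta(\omega, \Gamma)$ in Theorem 3.1. Thus, for $\Gamma = \Lambda(n,r)$ and $\lambda \in \Gamma^+ = \Lambda^+(n,r)$, $\tilde\Delta(\lambda, \Gamma) = \tilde\Delta(\omega, \Gamma)$ if $\omega \subseteq \alpha(\lambda)$. The $\tilde S_{\lambda\mathbb{Q}(q)}$, $\lambda \in \Lambda^+(r)$, are the distinct irreducible (right) modules for the (split) semisimple algebra $\tilde H_{\mathbb{Q}(q)}$. We record the following fact.

Lemma 5.2. For $\lambda \in \Lambda^+(r)$,

$$\tilde T_{\lambda\mathbb{Q}(q)} \cong \tilde S_{\lambda\mathbb{Q}(q)} \oplus \bigoplus_{\mu \triangleright \lambda} \tilde S_{\mu\mathbb{Q}(q)}^{\oplus e_{\mu\lambda}} \quad\text{and}\quad \tilde T^{\Phi}_{\lambda'\mathbb{Q}(q)} \cong \tilde S_{\lambda\mathbb{Q}(q)} \oplus \bigoplus_{\mu \triangleleft \lambda} \tilde S_{\mu\mathbb{Q}(q)}^{\oplus f_{\mu\lambda}}$$

for some positive integers $e_{\mu\lambda}$, $f_{\mu\lambda}$.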
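For orientation, the classical counterpart of the first decomposition is Young's rule for the permutation modules of $\mathbb{Q}S_r$, where the multiplicities are the Kostka numbers $K_{\mu\lambda}$. The sketch below computes Kostka numbers by peeling off horizontal strips and can be used to check, in small cases, that $K_{\lambda\lambda} = 1$ and that $K_{\mu\lambda} > 0$ exactly when $\mu \trianglerighteq \lambda$. Identifying $e_{\mu\lambda}$ with $K_{\mu\lambda}$ is an assumption made here for illustration; the lemma itself only asserts positivity. All function names are our own.

```python
from functools import lru_cache
from math import factorial

def partitions(r, max_part=None):
    """Yield the partitions of r as weakly decreasing tuples without zeros."""
    if max_part is None:
        max_part = r
    if r == 0:
        yield ()
        return
    for first in range(min(r, max_part), 0, -1):
        for rest in partitions(r - first, first):
            yield (first,) + rest

def dominated_by(lam, mu):
    """True when every partial sum of lam is <= the matching partial sum of mu."""
    length = max(len(lam), len(mu))
    lam = tuple(lam) + (0,) * (length - len(lam))
    mu = tuple(mu) + (0,) * (length - len(mu))
    s_lam = s_mu = 0
    for a, b in zip(lam, mu):
        s_lam += a
        s_mu += b
        if s_lam > s_mu:
            return False
    return True

def horizontal_strips(mu, k):
    """Yield partitions nu inside mu such that mu/nu is a horizontal strip of size k."""
    n = len(mu)

    def rec(i, remaining, acc):
        if i == n:
            if remaining == 0:
                yield tuple(x for x in acc if x > 0)
            return
        lower = mu[i + 1] if i + 1 < n else 0  # interlacing: mu[i+1] <= nu[i] <= mu[i]
        for nu_i in range(mu[i], lower - 1, -1):
            removed = mu[i] - nu_i
            if removed > remaining:
                break
            yield from rec(i + 1, remaining - removed, acc + [nu_i])

    yield from rec(0, k, [])

@lru_cache(maxsize=None)
def kostka(mu, lam):
    """Number of semistandard tableaux of shape mu and content lam."""
    lam = tuple(x for x in lam if x > 0)
    if not lam:
        return 1 if sum(mu) == 0 else 0
    # The cells holding the largest entry form a horizontal strip of size lam[-1].
    return sum(kostka(nu, lam[:-1]) for nu in horizontal_strips(mu, lam[-1]))
```

A dimension count gives a further sanity check: $\sum_\mu K_{\mu\lambda} K_{\mu,(1^r)}$ equals $r!/\prod_i \lambda_i!$, the rank of the permutation module attached to $\lambda$.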

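Proof. The analogous decomposition of permutation modules for $\mathbb{Q}S_r$ is well-known (going back to Frobenius at the character level), so that the lemma follows from elementary "equal characteristic" Brauer theory; see the discussion for [DPS2, (2.6)].

For any commutative $\mathcal{Z}$-algebra $\mathcal{Z}'$, we call

$$\tilde S_q(n,r,\mathcal{Z}') = \operatorname{End}_{\tilde H'}(\tilde T(n,r)') \cong \operatorname{End}_{\tilde H'}(\tilde T(n,r)'^{\Phi}) \tag{5.1}$$

the q-Schur algebra of bidegree $(n,r)$. By Lemma 2.1b, $\tilde S_q(n,r,\mathcal{Z}') \cong \tilde S_q(n,r,\mathcal{Z})_{\mathcal{Z}'}$. Also, $\tilde S_q(n,r,\mathcal{Z}')$ is Morita equivalent to $\operatorname{End}_{\tilde H'}(\tilde T(\Lambda^+(n,r))')$; cf. Lemma 1.1d. For convenience, write $\tilde S_q(n,r)$ for $\tilde S_q(n,r,\mathcal{Z})$.

Lemma 5.3. For $\lambda \in \Lambda^+(n,r)$, we have $\operatorname{Hom}_{\tilde S_q(n,r)}(\tilde\Delta(\lambda), \tilde T) \cong \tilde S_\lambda$, where $\tilde T = \tilde T(n,r)$ and $\tilde\Delta(\lambda) = \operatorname{Hom}_{\tilde H}(\tilde S_\lambda, \tilde T)$.

Proof. As in Cor. 2.4, denote either of the functors $\operatorname{Hom}_{\tilde S_q(n,r)}(-, \tilde T)$ or $\operatorname{Hom}_{\tilde H}(-, \tilde T)$ by $(-)^{\diamond}$. As described in Sect. 3, $\tilde T_\lambda$ has a $\tilde S$-filtration $\tilde F^\lambda_\bullet$ satisfying the vanishing condition (3.5) and having bottom section $\tilde S_\lambda$. By Lemma 5.2, the other sections $\tilde S_\mu$ satisfy $\mu \triangleright \lambda$. Therefore, $\tilde F^{\lambda\diamond}_\bullet$ is a $\tilde\Delta$-filtration of the projective $\tilde S_q(n,r)$-module $\tilde T_\lambda^{\diamond}$ with top section $\tilde\Delta(\lambda)$. Hence, $\tilde S_\lambda^{\diamond\diamond} \cong \tilde\Delta(\lambda)^{\diamond}$ is an $\tilde H$-submodule of $\tilde T_\lambda^{\diamond\diamond} \cong \tilde T_\lambda$. By naturality, the "evaluation map" $\tilde S_\lambda \to \tilde S_\lambda^{\diamond\diamond}$ factors through the isomorphism $\tilde T_\lambda \cong \tilde T_\lambda^{\diamond\diamond}$, and defines an inclusion $\tilde S_\lambda \hookrightarrow \tilde\Delta(\lambda)^{\diamond}$. Since $\tilde T_\lambda/\tilde S_\lambda$ is $\mathcal{Z}$-torsion free and $\tilde\Delta(\lambda)^{\diamond}_K \cong \tilde S_{\lambda K}$, the desired result follows from Lemma 5.2.

⁷ This fact is proved in [KL1, (1.4)] for the generic Hecke algebra $\tilde H'$ over $\mathbb{Z}[q^{1/2}, q^{-1/2}]$. The assertion for $\tilde H$ then follows from general principles (see [DPS2, (2.3)]).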


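For $\lambda \in \Lambda^+(n,r)$, write $\tilde\Delta(\lambda) = \operatorname{Hom}_{\tilde H}(\tilde S_\lambda, \tilde T)$, where $\tilde T = \tilde T(n,r)$ as above. Put $\Delta(\lambda) = \tilde\Delta(\lambda)_k$. By [DPS2, Theorem 1], for every field $k$ which is a $\mathcal{Z}$-algebra, the module category ${}_{\tilde S_q(n,r)_k}\mathcal{C}$ is a HWC with poset $(\Lambda^+(n,r), \trianglelefteq)$ and with standard objects $\Delta(\lambda)$, $\lambda \in \Lambda^+(n,r)$. (This result is established, in fact, using little more than the machinery discussed in Sect. 3.) Let $\tilde\nabla(\lambda) = d_{\tilde S_q(n,r)}\tilde\Delta(\lambda)$ and $\nabla(\lambda) = \tilde\nabla(\lambda)_k$. By [DJ4, (4.11)], for any field $k$ which is a $\mathcal{Z}$-algebra, $d_{\tilde S_q(n,r)_k}$ is a strong duality in the sense that $d_{\tilde S_q(n,r)_k} L(\lambda) \cong L(\lambda)$ for any irreducible $\tilde S_q(n,r)_k$-module $L(\lambda)$.⁸ Hence, by [CPS2, (1.2)], $\nabla(\lambda)$ is the $\nabla$-object corresponding to $\lambda$ for the HWC ${}_{S_q(n,r)}\mathcal{C}$. Of course, ${}_{\tilde S_q(n,r)}\mathcal{C}(\mathrm{tilt})$ denotes the class of $\tilde S_q(n,r)$-modules with both a $\tilde\Delta$- and a $\tilde\nabla$-filtration. By Theorem 3.1, $\tilde T(n,r) \in {}_{\tilde S_q(n,r)}\mathcal{C}(\tilde\Delta)$. Also, Lemma 2.2a says that $d_{\tilde S_q(n,r)}\tilde T(n,r) \cong \tilde T(n,r)$. We conclude:

Proposition 5.4. $\tilde T(n,r) \in {}_{\tilde S_q(n,r)}\mathcal{C}(\mathrm{tilt})$. For $k$ as above, $T(n,r) \in {}_{S_q(n,r)}\mathcal{C}(\mathrm{tilt})$. Further, by (4.1) and Lemma 4.2, we have

$$\begin{cases} (1)\ \operatorname{Ext}^m_{\tilde S_q(n,r)}(\tilde\Delta(\lambda), \tilde\nabla(\mu)) \cong \delta_{m0}\,\delta_{\lambda\mu}\,\mathcal{Z}; \\ (2)\ m > 0 \ \&\ \operatorname{Ext}^m_{\tilde S_q(n,r)}(\tilde\Delta(\lambda), \tilde\Delta(\mu)) \ne 0 \implies \lambda < \mu; \\ (3)\ m > 0 \ \&\ \operatorname{Ext}^m_{\tilde S_q(n,r)}(\tilde\nabla(\lambda), \tilde\nabla(\mu)) \ne 0 \implies \lambda > \mu. \end{cases} \tag{5.2}$$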

Now we can prove:

Theorem 5.5. We have $\operatorname{End}_{\tilde S_q(n,r)}(\tilde T(n,r))_{\mathcal{Z}'} \cong \operatorname{End}_{\tilde S_q(n,r)_{\mathcal{Z}'}}(\tilde T(n,r)_{\mathcal{Z}'})$ for any commutative $\mathcal{Z}$-algebra $\mathcal{Z}'$. Also, $\dim \operatorname{End}_{\tilde S_q(n,r)_k}(\tilde T(n,r)_k)$ is constant on residue fields $k$ of $\mathcal{Z}$.

Proof. By Proposition 5.4, $\tilde T(n,r)_k$ is a tilting module for $\tilde S_q(n,r)_k$ for every field $k$ which is a $\mathcal{Z}$-algebra. (In case $\tilde H_k$ is semisimple, this assertion is obvious!) The first assertion of the theorem follows from Lemma 4.2c. The long exact sequence of Ext, Proposition 5.4 and (5.2) imply that

$$\operatorname{Ext}^1_{\tilde S_q(n,r)}(\tilde\Delta(\lambda), \tilde T(n,r)) = 0 \quad \forall \lambda \in \Lambda^+(n,r).$$

Therefore, a $\tilde\Delta$-filtration of the left $\tilde S_q(n,r)$-module $\tilde T(n,r)$ induces, after applying the functor $\operatorname{Hom}_{\tilde S_q(n,r)}(-, \tilde T(n,r))$ and using Lemma 5.3, a $\tilde S$-filtration of the right $\tilde H$-module

$$\operatorname{End}_{\tilde S_q(n,r)}(\tilde T(n,r)) = \operatorname{Hom}_{\tilde S_q(n,r)}(\tilde T(n,r), \tilde T(n,r)).$$

Since any $\tilde S_\lambda$ is $\mathcal{Z}$-free by definition, $\operatorname{End}_{\tilde S_q(n,r)}(\tilde T(n,r))$ is also $\mathcal{Z}$-free with rank $d$, say. So,

$$\dim \operatorname{End}_{\tilde S_q(n,r)_k}(\tilde T(n,r)_k) = \dim \operatorname{End}_{\tilde S_q(n,r)}(\tilde T(n,r))_k = d$$

for any field $k$ which is a $\mathcal{Z}$-algebra.

⁸ From the point of view of the present paper, this fact is easy to see directly; see Lemma 7.2 below.


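6. Quantum Weyl Reciprocity, I

In this section, we first obtain the q-version of the double centralizer property (1) of the introduction. We maintain the notation of Sects. 3 and 5.

Let $\gamma_\lambda$ be the right cell in $W$ containing the longest word $w_{0,\lambda}$ of $W_\lambda$. For $x \in \gamma_\lambda$, let $\omega_x$ be the left cell containing $x$. By [KL1, (1.4)], the $\omega_x$ are distinct for distinct $x$; the union $\bigcup_{x \in \gamma_\lambda} \omega_x$ is the two-sided Kazhdan-Lusztig cell $\xi_\lambda$ containing $w_{0,\lambda}$. Using the basis $\{D^+_w\}_{w \in W}$ introduced in Lemma 3.2, we give an alternative description of the submodule $\tilde S_\lambda$ of $\tilde T_\lambda$.

Lemma 6.1. Let $\lambda \in \Lambda^+(r)$. For any $x \in \gamma_\lambda$, form the $\mathcal{Z}$-submodule $\tilde M_x = \sum_{y \in \omega_x} \mathcal{Z}\, C^+_x D^+_{y^{-1}}$ of $\tilde T_\lambda = x_\lambda \tilde H$. Then $\tilde M_x = \tilde S_\lambda$ for all $x \in \gamma_\lambda$. The set $\{C^+_x D^+_{y^{-1}}\}_{y \in \omega_x}$ forms a $\mathcal{Z}$-basis for $\tilde S_\lambda$ and is part of a basis for $\tilde T_\lambda$ as well.

Proof. For $x \in \gamma_\lambda$, $\lambda = J(\lambda)$ is equal to the left-set $\{s \in S \mid sx < x\}$ of $x$ (see [KL1, (2.4)] for an easy argument), so $\tau_s C^+_x = q C^+_x$ if $s \in \lambda$. Hence, $\tilde M_x \subseteq \tilde T_\lambda$ by Lemma 1.1b. By Lemma 3.2, $\tilde M_x$ is an $\tilde H$-submodule of $\tilde T_\lambda$. Given $h \in \tilde H$, the matrix of $h$ as a right operator on the dual left cell module $\tilde E^*_{\omega_x}$ relative to the basis $\{\zeta_y\}_{y \in \omega_x}$ dual to that defined by the $C^+_y$, $y \in \omega_x$, is $(\langle h C^+_w, D^+_{y^{-1}} \rangle)_{(y,w) \in \omega_x \times \omega_x}$. Since $\langle h C^+_w, D^+_{y^{-1}} \rangle = \langle C^+_w, D^+_{y^{-1}} h \rangle$, the map $f : \tilde E^*_{\omega_x} \to \tilde M_x$, $\zeta_y \mapsto C^+_x D^+_{y^{-1}}$, defines a surjective $\tilde H$-module homomorphism.⁹ Also, $1 = \langle C^+_x, D^+_{x^{-1}} \rangle = \operatorname{tr}(C^+_x D^+_{x^{-1}})$, so $\tilde M_x \ne 0$. But $\tilde E^*_{\omega_x \mathbb{Q}(q)}$ is an irreducible $\tilde H_{\mathbb{Q}(q)}$-module (see [DPS2, (2.3)]), so $f$ must be an isomorphism onto its image.

For any $x \in \gamma_\lambda$, $\tilde E^*_{\omega_x} \cong \tilde S_\lambda$. By Lemma 5.2, the image $\operatorname{Im} f$ of $f$, viewed as a subspace of $\tilde T_\lambda$, is contained in $\tilde S_\lambda$. (Recall that $\tilde S_\lambda$ identifies with the bottom section in a $\tilde S$-filtration of $\tilde T_\lambda$.) Since $\operatorname{End}_{\tilde H}(\tilde S_\lambda) \cong \mathcal{Z}$, $\operatorname{Im} f = r\tilde S_\lambda$ for some $0 \ne r \in \mathcal{Z}$. Thus, $\mathcal{Z} = \operatorname{tr}(\tilde M_x) = \operatorname{tr}(r\tilde S_\lambda) \subseteq r\mathcal{Z}$, so $r$ is a unit in $\mathcal{Z}$, and we can take $r = 1$. Finally, using the dual left cell filtration [DPS1, (2.3.7)], we see that $\tilde T_\lambda/\tilde S_\lambda$ is $\mathcal{Z}$-free. Therefore, every basis for $\tilde S_\lambda$ is part of a basis for $\tilde T_\lambda$, proving the last assertion.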

⁹ Alternatively, the map $f$ is obtained naturally from $\tilde H^* \to \tilde H \to C^+_x \tilde H$, checking with Lemma 3.2 that the latter is well-defined on the section of $\tilde H^*$ defined by $\tilde S_\lambda$.

Let $n$ be some positive integer. Since $\Lambda^+(n,r)$ is a coideal in $\Lambda^+(r)$, we can use the isomorphism $\alpha : (\Lambda^+(r), \trianglelefteq) \to (\Xi, \le^{\mathrm{op}}_{LR})$ defined in [DPS2] and discussed after Lemma 5.1 above to fix a listing $\xi_1, \dots, \xi_m$ of $\Xi$ satisfying $\xi_i \le_{LR} \xi_j \implies i \le j$ and such that $\xi_1 \cup \cdots \cup \xi_{m'} = \alpha(\Lambda^+(n,r))$ for some integer $m'$. Denote the ideal $\alpha(\Lambda^+(n,r))$ of $\Xi$ by $\Xi(n,r)$. Let $\tilde J(n,r)$ be the $\mathcal{Z}$-span of the $D^+_w$, $w \notin \bigcup_{\xi \in \Xi(n,r)} \xi$. By Lemma 3.2b, $\tilde J(n,r)$ is an ideal in $\tilde H$; let $\tilde H(n,r) = \tilde H/\tilde J(n,r)$.

Theorem 6.2. The natural map $\psi : \tilde H^{\mathrm{op}} \to \operatorname{End}_{\tilde S_q(n,r)}(\tilde T(n,r))$ induces an isomorphism

$$\tilde H(n,r)^{\mathrm{op}} \cong \operatorname{End}_{\tilde S_q(n,r)}(\tilde T(n,r)).$$

(In particular, $\psi$ is surjective.) Furthermore, if $\mathcal{Z}'$ is a commutative $\mathcal{Z}$-algebra, then

$$\psi_{\mathcal{Z}'} : \tilde H(n,r)^{\mathrm{op}}_{\mathcal{Z}'} \xrightarrow{\sim} \operatorname{End}_{\tilde S_q(n,r)_{\mathcal{Z}'}}(\tilde T(n,r)_{\mathcal{Z}'})$$

is an isomorphism.

Proof. For $\lambda \in \Lambda^+(r)$, let (as before) $w_{0,\lambda}$ be the longest word in $W_\lambda$. As defined in [DPS2], $\alpha(\lambda)$ is the two-sided cell $\xi_\lambda \in \Xi$ containing $w_{0,\lambda}$. From the definition of the Kazhdan-Lusztig basis $\{C'_x\}$ for $\tilde H_{\mathbb{Z}[q^{1/2}, q^{-1/2}]}$ given in [KL1, (1.1c)], $C^+_{w_{0,\lambda}} = q^{1/2}_{w_{0,\lambda}} C'_{w_{0,\lambda}} = \sum_{y \le w_{0,\lambda}} P_{y,w_{0,\lambda}} \tau_y$, where $P_{y,w_{0,\lambda}}$ is the Kazhdan-Lusztig polynomial for the pair $(y, w_{0,\lambda})$. By the standard formula [KL1, (2.3g)], we have that $P_{y,w_{0,\lambda}} = 1$ for all $y \le w_{0,\lambda}$. Thus, $C^+_{w_{0,\lambda}} = x_\lambda$. It follows that any element in $\tilde T_\lambda = x_\lambda \tilde H$ is a $\mathcal{Z}$-linear combination of $C^+_x$ satisfying $x \le_R w_{0,\lambda}$. Therefore, if $y \in W$ does not lie in any $\xi \in \Xi(n,r)$ and $\lambda \in \Lambda^+(n,r)$, then Lemma 3.2a implies that $\tilde T_\lambda D^+_{y^{-1}} = 0$. By Lemma 1.1d, we conclude that $\psi(\tilde J(n,r)) = 0$, yielding a homomorphism $\bar\psi : \tilde H(n,r) \to \operatorname{End}_{\tilde S_q(n,r)}(\tilde T(n,r))$.

Let $\mathcal{Z}'$ be a commutative $\mathcal{Z}$-algebra. The $D^+_w$, $w \in \xi$ for $\xi \in \Xi(n,r)$, define a $\mathcal{Z}'$-basis for $\tilde H(n,r)_{\mathcal{Z}'}$. Suppose that $f = \sum_{w \in \xi,\, \xi \in \Xi(n,r)} a_w D^+_w$ acts as the zero operator on $\tilde T(n,r)_{\mathcal{Z}'}$ with some $a_w \ne 0$. Choose a left cell $\omega \in \Omega$ minimal (w.r.t. $\le_L$) for which $a_{w^{-1}} \ne 0$ with $w \in \omega$. For some $\lambda \in \Lambda^+(n,r)$, $\omega \subseteq \xi = \alpha(\lambda)$. There exists $x \in \gamma_\lambda$ so that $\omega = \omega_x$. Since $x$ and $w_{0,\lambda}$ have the same left-set, we conclude that $C^+_x \in \tilde T'_\lambda$. (See the remarks above (3.2).) By Lemma 3.2a and the minimality of $\omega$, $0 = C^+_x f = \sum_{y \in \omega_x} a_{y^{-1}} C^+_x D^+_{y^{-1}}$. Since $\{C^+_x D^+_{y^{-1}}\}_{y \in \omega_x}$ is linearly independent over $\mathcal{Z}'$ by Lemma 6.1, it follows that $a_{w^{-1}} = 0$, a contradiction. Hence, $\bar\psi_{\mathcal{Z}'}$ is injective.

Taking $\mathcal{Z}' = \mathbb{Q}(q)$, the semisimplicity of $\tilde H_{\mathbb{Q}(q)}$ implies that $\tilde H(n,r)$ has rank equal to $\dim \operatorname{End}_{\tilde S_q(n,r)_{\mathbb{Q}(q)}}(\tilde T(n,r)_{\mathbb{Q}(q)})$. (Note that $\tilde H(n,r)$ is $\mathcal{Z}$-free.) Hence, by Theorem 5.5 and the previous paragraph, $\bar\psi_k$ is an isomorphism for every field $k$ which is a $\mathcal{Z}$-algebra. If $\tilde M \to \tilde N$ is a morphism of finitely generated $\mathcal{Z}$-modules which becomes an isomorphism upon passage to every residue field $k$ of $\mathcal{Z}$, then an elementary commutative algebra argument establishes that $\tilde M \to \tilde N$ is an isomorphism of $\mathcal{Z}$-modules. Thus, $\bar\psi : \tilde H(n,r)^{\mathrm{op}} \to \operatorname{End}_{\tilde S_q(n,r)}(\tilde T(n,r))$ is an isomorphism, proving the first assertion. The second assertion follows from this and Theorem 5.5.

The above result is the q-version of a theorem of De Concini–Procesi [dCP, (4.1)] for Schur algebras. (Another proof in that case is given in [D2, Sect. 2 Cor.].) As an application, we establish a double centralizer property (1), independent of the field $k$ and the parameter $q$. Let $\tilde V$ be a free $\mathcal{Z}$-module of rank $n$. For any positive integer $r$, there is a natural right action of $\tilde H$ on $\tilde V^{\otimes r}$ so that $\tilde V^{\otimes r} \cong \tilde T(n,r)$ [DD, (3.1.5)]. Let $\tilde U_{q^{1/2}} = \tilde U_{q^{1/2}}(\mathfrak{gl}_n)$ be the divided power $\mathcal{Z}$-form of the quantized enveloping algebra of $\mathfrak{gl}_n$. We have two natural algebra homomorphisms $\varphi : \tilde U_{q^{1/2}} \to \operatorname{End}_{\tilde H}(\tilde V^{\otimes r})$ and $\psi : \tilde H^{\mathrm{op}} \to \operatorname{End}_{\tilde U_{q^{1/2}}}(\tilde V^{\otimes r})$. (The map $\varphi$ is defined in [Du], based on [BLM].)
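The rank of the quotient $\tilde H(n,r)$ in Theorem 6.2 can be made explicit in small cases. Since the $D^+_w$ with $w$ in a cell of $\Xi(n,r)$ form a basis, the rank is the total size of those cells. The sketch below assumes the standard fact that the two-sided cell of $S_r$ attached to $\lambda$ has $(f^\lambda)^2$ elements, where $f^\lambda$ is the number of standard tableaux of shape $\lambda$, computed here by the hook length formula. The function names are our own and this count is an illustration, not part of the proof.

```python
from math import factorial

def partitions_at_most(r, n, max_part=None):
    """Yield the partitions of r with at most n non-zero parts."""
    if max_part is None:
        max_part = r
    if r == 0:
        yield ()
        return
    if n == 0:
        return
    for first in range(min(r, max_part), 0, -1):
        for rest in partitions_at_most(r - first, n - 1, first):
            yield (first,) + rest

def num_standard_tableaux(lam):
    """Hook length formula: r! divided by the product of all hook lengths."""
    r = sum(lam)
    # col[j] is the number of rows of lam that reach column j.
    col = [sum(1 for x in lam if x > j) for j in range(lam[0])] if lam else []
    hooks = 1
    for i, row in enumerate(lam):
        for j in range(row):
            arm = row - j - 1
            leg = col[j] - i - 1
            hooks *= arm + leg + 1
    return factorial(r) // hooks

def rank_truncated_hecke(n, r):
    """Sum of (f^lam)^2 over partitions lam of r with at most n parts."""
    return sum(num_standard_tableaux(lam) ** 2
               for lam in partitions_at_most(r, n))
```

For $n \ge r$ the sum runs over all partitions of $r$ and returns $r!$, so that $\tilde J(n,r) = 0$; for $n = 2$, $r = 3$ only the shapes $(3)$ and $(2,1)$ contribute and the rank drops from $6$ to $5$.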

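Theorem 6.3. Both maps $\varphi$ and $\psi$ are surjective. Therefore, for any specialization of $\mathcal{Z}$ into a field $k$, we have

$$\operatorname{im}(\varphi_k) = \operatorname{End}_{\tilde H_k}(\tilde V_k^{\otimes r}) \quad\text{and}\quad \operatorname{im}(\psi_k) = \operatorname{End}_{\tilde U_{qk}}(\tilde V_k^{\otimes r}).$$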


Proof. The surjectivity of $\varphi$ is proved in [Du, (3.4)] based on work of [BLM] over $\mathbb{Q}(q)$. With this, Theorem 6.2 and the discussion above on $\tilde S_q(n,r)$ imply the surjectivity of $\psi$. The last assertion follows by base change.

When $k = \mathbb{C}$ and $q \in \mathbb{C}$ is not a root of unity, we recover the quantized Weyl reciprocity established in [Ji]. See [Du, (1.2)]. When $k = \mathbb{C}$ and $q \in \mathbb{C}$ is a root of unity, then the fact that $\tilde H(2,r)_{\mathbb{C}} \cong \operatorname{End}_{\tilde S_q(2,r)_{\mathbb{C}}}(\tilde V^{\otimes r})$ has been proved by Martin [M, Sect. 4] by different methods, while for general $n$, the kernel of $\psi_{\mathbb{C}}$ was explicitly described in [M, Sect. 3] in terms of Young symmetrizers. We emphasize, however, that Theorem 6.2 and Theorem 6.3 are much stronger. We have proved that Theorem 6.2 holds over the ring $\mathcal{Z} = \mathbb{Z}[q, q^{-1}]$, and it behaves well under base change. The proof of the latter fact, as well as of the surjectivity of $\psi$, requires the tilting module theory for quasi-hereditary algebras.

We next consider a filtration version of (2) in the introduction. For this, we require the $*$-operations on the elements of $W$. These operations are defined in [KL1, (4.1)] which gives all the needed properties.¹⁰ The following result is implicit in [KL1, Sects. 4, 5]; an explicit proof is presented in [DPS2, (2.3)].

Lemma 6.4. (a) Let $\omega$ (resp., $\gamma$) be a left (resp., right) cell contained in the two-sided cell $\xi \in \Xi$. Then $\omega^* = \{w^* \mid w \in \omega\}$ (resp., ${}^*\gamma = \{{}^*w \mid w \in \gamma\}$) is a left (resp., right) cell in $\xi$. Every left (resp., right) cell in $\xi$ can be obtained from $\omega$ (resp., $\gamma$) by applying a sequence of $*$-operations.

(b) Let $w \in W$ and $h \in \tilde H$. Suppose

$$hC^+_w \equiv \sum_{x \sim_L w} \alpha_x(h,w)\, C^+_x \mod \sum_{z <_L w} \mathcal{Z} C^+_z, \qquad C^+_w h \equiv \sum_{y \sim_R w} \beta_y(h,w)\, C^+_y \mod \sum_{z <_R w} \mathcal{Z} C^+_z.$$

Then

$$\alpha_{x^*}(h, w^*) = q^{1/2}_{w^*}\, q^{-1/2}_{w}\, q^{-1/2}_{x^*}\, q^{1/2}_{x}\, \alpha_x(h,w), \qquad \beta_{{}^*y}(h, {}^*w) = q^{1/2}_{{}^*w}\, q^{-1/2}_{w}\, q^{-1/2}_{{}^*y}\, q^{1/2}_{y}\, \beta_y(h,w).$$

(c) For any left cell $\omega \in \Omega$,

$$C^+_w + \sum_{z <_L w} \mathcal{Z} C^+_z \;\mapsto\; q^{1/2}_{w}\, q^{-1/2}_{w^*}\, q^{1/2}\, C^+_{w^*} + \sum_{z <_L w^*} \mathcal{Z} C^+_z, \qquad w \in \omega,$$

defines an isomorphism $\tilde E_\omega \xrightarrow{\sim} \tilde E_{\omega^*}$ of left cell modules. Similarly, for any right cell $\gamma$,

$$C^+_w + \sum_{z <_R w} \mathcal{Z} C^+_z \;\mapsto\; q^{1/2}_{w}\, q^{-1/2}_{{}^*w}\, q^{1/2}\, C^+_{{}^*w} + \sum_{z <_R {}^*w} \mathcal{Z} C^+_z, \qquad w \in \gamma,$$

defines an isomorphism $\tilde K_\gamma \xrightarrow{\sim} \tilde K_{{}^*\gamma}$ of right cell modules.

¹⁰ Given $w \in W$, the definition of $w^*$ depends on a choice of elements $s, t \in S$ such that $st$ has order 3. It will not be necessary to explicitly mention $s, t$. Also, $w^*$ is defined only if $\#(R(w) \cap \{s,t\}) = 1$, where $R(w) = \{s \in S \mid ws < w\}$. Similar remarks hold for ${}^*w$.


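For $\lambda \in \Lambda(r)$, let $D^+_\lambda = \{w \in W \mid sw < w,\ \forall s \in \lambda\}$. By [DPS1, (2.3.5)], $\tilde T_\lambda$ has basis $\{C^+_w\}_{w \in D^+_\lambda}$. Putting $C^\lambda_w = C^+_w$, we obtain a basis $\{C^\lambda_w\}_{\lambda \in \Lambda(n,r),\, w \in D^+_\lambda}$ for $\tilde T(n,r)$. As above, let $\omega_\lambda \in \Omega$ contain $w_{0,\lambda}$. Define

$$\begin{cases} \tilde N_\lambda = \operatorname{span}\{C^\mu_w \mid \mu \in \Lambda(n,r),\ w \in D^+_\mu,\ w \le_L w_{0,\lambda}\}, \\ \tilde N^-_\lambda = \operatorname{span}\{C^\mu_w \mid \mu \in \Lambda(n,r),\ w \in D^+_\mu,\ w <_L w_{0,\lambda}\}. \end{cases} \tag{6.1}$$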
$$x \sim_L y \ \ (\text{resp. } x \sim_R y) \iff s(x) = s(y) \ \ (\text{resp. } r(x) = r(y)). \tag{6.2}$$
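The criterion (6.2) is easy to experiment with. The sketch below assumes that $(r(x), s(x))$ is the pair of standard tableaux attached to $x$ by the Robinson-Schensted correspondence, as is standard for $S_r$; which of the two tableaux plays the role of $r$ and which of $s$ depends on conventions, so the code simply groups permutations by either one. The function names are our own, and permutations are written in one-line notation.

```python
from bisect import bisect_right
from collections import defaultdict
from itertools import permutations

def robinson_schensted(word):
    """Row insertion: return the insertion and recording tableaux (P, Q)."""
    P, Q = [], []
    for step, x in enumerate(word, start=1):
        row = 0
        while True:
            if row == len(P):
                # Start a new row at the bottom.
                P.append([x])
                Q.append([step])
                break
            line = P[row]
            pos = bisect_right(line, x)  # first entry strictly larger than x
            if pos == len(line):
                line.append(x)
                Q[row].append(step)
                break
            x, line[pos] = line[pos], x  # bump the larger entry to the next row
            row += 1
    return tuple(map(tuple, P)), tuple(map(tuple, Q))

def classes(r, index):
    """Group the permutations of 1..r by P (index 0) or Q (index 1)."""
    groups = defaultdict(list)
    for w in permutations(range(1, r + 1)):
        groups[robinson_schensted(w)[index]].append(w)
    return groups

def inverse(w):
    """Inverse of a permutation given in one-line notation."""
    inv = [0] * len(w)
    for i, value in enumerate(w, start=1):
        inv[value - 1] = i
    return tuple(inv)
```

In $S_3$ either grouping produces four classes, two of size one and two of size two, which matches the known left (or right) cells. Passing from $w$ to $w^{-1}$ exchanges the two tableaux, which is the tableau-level counterpart of exchanging left and right cells.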

In the proof below, it will be useful to keep in mind the trivial fact that r(w∗ ) = r(w) and λ for Crλ−1 (r,s) . s(∗ w) = s(w), whenever w∗ or ∗ w are defined. For convenience, write C(r,s) −1 Also, let `(r, s) = `(r (r, s)) and q(r,s) = qr−1 (r,s) . ee Theorem 6.6. List 3+ (n, r) as ν1 , · · · , νm , where νi D νj =⇒ i ≤ j. The R e e e e e e e module T (n, r) has a ϒ -filtration 0 = T0 ⊂ T1 ⊂ · · · ⊂ Tm = T (n, r) in which e i ) ⊗ d Seνi , for i = 1, · · · , m. e e (νi ) = 1(ν Tei /Tei−1 ∼ =ϒ e H Proof. Let γi be the right cell containing w0,νi , and consider the right cell module λ + eγ ∼ e e K i = dH e Sνi ; see (3.3), (3.4). Define Ti = span{Cw | λ ∈ 3(n, r), w ∈ Dλ , w ∼L x ∈ e e e γ1 ∪ · · · ∪ γi }. Since ≤op LR =E, Lemma 2.1c and (3.1) imply that Ti is an (Sq (n, r), H)λ sub-bimodule of Te. The section Tei /Tei−1 has basis consisting of cosets Cw + Tei−1 for 11 It is clear that (6.1) and this lemma are valid in the context of 1 e (ωλ , 0) in Theorem 3.1 for any Coxeter system.


w ∼L x ∼R w0,νi , w ∈ Dλ+ , and λ ∈ 3(n, r). By (6.2), these basis vectors can be λ described as C(r,s) + Tei−1 , letting (r, s) run over all pairs of standard tableau of shape e i ) has a basis νi which satisfy r−1 (r, s) ∈ Dλ+ . Let ti = r(w0,νi ) = s(w0,νi ). Then 1(ν λ λ − −1 + e ¯ consisting of those C(r,ti ) = C(r,ti ) + Nνi , where r (r, ti ) ∈ Dλ for λ ∈ 3(n, r). Also, + e γ has a basis consisting of the C¯ + = C + + P the right cell module K i (ti ,s) (ti ,s) x
e is a linear isomorphism. Using Lemma 6.4b, we see that f is an (Seq (n, r), H)-bimodule map. Remarks 6.7. (a) The duality functors dH e and dSeq (n,r) defined in (1.2) and Lemma 2.2 ee formally induce a duality functor dR ee on the category of R -modules; see [CPS2, ee (1.2.2c)]. Clearly, d T (n, r) ∼ = Te(n, r). Hence, by Theorem 6.6, Te(n, r) also has a e ee e ⊗ Seλ , λ ∈ 3+ (n, r). filtration with sections d ϒ (λ) ∼ = ∇(λ) (b) For any field k which is a Z-algebra, Theorem 6.6 implies that Te(n, r)k has a Re = Sq (n, r) ⊗ H(n, r)op -filtration with sections 1(λ) ⊗ dH Sλ , λ ∈ 3+ (n, r). e Q(q) , λ ∈ 3+ (n, r), are the distinct irreducible modules for the semisimple (c) The 1(λ) ∼ e e algebra Seq (n, r)Q(q) . Also, dH eQ(q) SλQ(q) = SλQ(q) . Hence, Theorem 6.6 agrees with the decomposition (2) in the introduction. 7. Young Modules and Tilting q-Schur Algebras Maintain the notation of the previous two sections. We first give a direct development of the basic theory of Young modules. These modules were first developed for Hecke algebras of type A by Dipper and James in [DJ2]. For example, they prove Proposition 7.3c and (part of) Proposition 7.5c. However, using the results of the present paper, we are able to establish these results (and more) very quickly. Then we apply this work to describe explicitly the partial tilting modules X(λ) for the q-Schur algebras Sq (n, r) and determine the Ringel dual of Sq (n, r). Since this paper was written, Donkin has given us a copy of his paper [D4] which also calculates the Ringel dual of Sq (n, r) in the special case n ≥ r. For the rest of this section, fix a field k which is a Z-algebra. To avoid trivialities, we e k is not semisimple – hence, we assume concentrate on the case in which the algebra H 1 + q + · · · + q r−1 = 0 in k (see, e. g. [DPS1, (4.2.2)]). 
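There exists a $\mathcal{Z}$-algebra $\mathcal{O}$ with the following two properties: (i) $\mathcal{O}$ is a discrete valuation ring with residue field $k$ and fraction field $K$; (ii) the image of $q$ in $K$ is not a root of unity. Thus, $\tilde H_K$ is a split semisimple algebra. Note that $q + 1 \ne 0$ in the domain $\mathcal{O}$.

Denote $\tilde S_q(n,r)_k$ (resp., $\tilde H_k$, $\tilde T(n,r)_k$, $\tilde T_{\lambda k}$, etc.) by $S_q(n,r)$ (resp., $H$, $T(n,r)$, $T_\lambda$, etc.). The Krull-Schmidt theorem holds for finitely generated $\tilde H_{\mathcal{O}}$-modules. Also, if $\tilde T(n,r)_{\mathcal{O}} = \bigoplus_i \tilde Y_i^{\oplus n_i}$ is a decomposition of $\tilde T(n,r)_{\mathcal{O}}$ into distinct, indecomposable $\tilde H_{\mathcal{O}}$-summands $\tilde Y_i$, then $T(n,r) = \bigoplus_i Y_i^{\oplus n_i}$ is a decomposition of $T(n,r)$ into distinct indecomposable $H$-summands $Y_i = \tilde Y_{ik}$. (Use "Heller's theorem", see, e.g., [CPS2, (1.5.6)], and Lemma 2.1b.)

By Lemma 5.1b, $\operatorname{Hom}_{\tilde H'}(\tilde T'^{\Phi}_{\lambda'}, \tilde T'_\lambda) \cong \mathcal{Z}'$ if $\lambda \in \Lambda(r)$ and $q + 1$ is not a zero divisor in the commutative $\mathcal{Z}$-algebra $\mathcal{Z}'$. Thus, taking $\mathcal{Z}' = \mathcal{O}$, for $\lambda \in \Lambda^+(r)$, there exist unique indecomposable $\tilde H_{\mathcal{O}}$-summands $\tilde Y^{\natural}_\lambda$ of $\tilde T^{\Phi}_{\lambda'\mathcal{O}}$ and $\tilde Y_\lambda$ of $\tilde T_{\lambda\mathcal{O}}$ such that

$$\operatorname{Hom}_{\tilde H_{\mathcal{O}}}(\tilde Y^{\natural}_\lambda, \tilde Y_\lambda) \cong \operatorname{Hom}_{\tilde H_{\mathcal{O}}}(\tilde T^{\Phi}_{\lambda'\mathcal{O}}, \tilde T_{\lambda\mathcal{O}}) \cong \mathcal{O}. \tag{7.1}$$

Then $\tilde Y_\lambda$ (resp., $\tilde Y^{\natural}_\lambda$) is the Young module (resp., twisted Young module) associated to $\lambda$. Let $Y^{\natural}_\lambda = \tilde Y^{\natural}_{\lambda k}$ and $Y_\lambda = \tilde Y_{\lambda k}$.

Lemma 7.1. (a) For $\lambda \in \Lambda^+(r)$, we have $\tilde Y_{\lambda K} \cong \tilde S_{\lambda K} \oplus \bigoplus_{\mu \triangleright \lambda} \tilde S_{\mu K}^{\oplus n_{\mu\lambda}}$ and $\tilde Y^{\natural}_{\lambda K} \cong \tilde S_{\lambda K} \oplus \bigoplus_{\mu \triangleleft \lambda} \tilde S_{\mu K}^{\oplus m_{\mu\lambda}}$.
(b) The $\tilde Y_\lambda$, $\lambda \in \Lambda^+(n,r)$, are the distinct indecomposable $\tilde H_{\mathcal{O}}$-summands of $\tilde T(n,r)_{\mathcal{O}}$.
(c) For $\lambda \in \Lambda^+(r)$, let $\tilde P(\lambda) = \operatorname{Hom}_{\tilde H_{\mathcal{O}}}(\tilde Y_\lambda, \tilde T_{\mathcal{O}})$. Each $\tilde P(\lambda)$ is a projective indecomposable $\tilde S_q(r,r)_{\mathcal{O}}$-module. The modules $P(\lambda) = \tilde P(\lambda)_k \cong \operatorname{Hom}_H(Y_\lambda, T)$, $\lambda \in \Lambda^+(r)$, are the projective indecomposable $S_q(r,r)$-modules.

Proof. Because $\tilde H_K$ is a split semisimple algebra, the irreducible $\tilde H_{\mathbb{Q}(q)}$- and $\tilde H_K$-modules correspond bijectively, $\tilde S_{\lambda\mathbb{Q}(q)} \mapsto \tilde S_{\lambda K}$. (This assertion follows immediately from elementary Brauer theory for algebras over a regular ring of Krull dim. $\le 2$; cf. [DPS1, (1.1.3)], or use [B, (1.9.6)].) Thus, the decomposition given in Lemma 5.2 holds with $\mathbb{Q}(q)$ replaced throughout by $K$. Now (a) follows from this fact.

By Lemma 5.1, the $\tilde Y_\lambda$ are distinct for distinct $\lambda \in \Lambda^+(r)$. By general principles, the $\tilde P(\lambda) = \operatorname{Hom}_{\tilde H_{\mathcal{O}}}(\tilde Y_\lambda, \tilde T(n,r)_{\mathcal{O}})$, $\lambda \in \Lambda^+(n,r)$, are the distinct indecomposable, projective $\tilde S_q(n,r)_{\mathcal{O}}$-modules. Since $\tilde S_q(n,r)_K$ is split semisimple, the functor $- \otimes_{\mathcal{O}} k$ takes distinct projective indecomposable $\tilde S_q(n,r)_{\mathcal{O}}$-modules to distinct projective indecomposable $S_q(n,r)$-modules (see [CR, Ex. 16, p. 142]), proving (c). Since $\#\Lambda^+(n,r)$ is the number of distinct irreducible $S_q(n,r)$-modules, (b) follows.¹²

We remark that a version of Lemma 7.1b over fields is implicit in [DJ2, (2.6),(3.11)] and [DJ4, (8.8)]. (As noted in [DPS2], it is easy to check that the modules $\tilde S_\lambda$, $\tilde S_{\lambda k}$, $\lambda \in \Lambda^+(r)$, in this paper identify with the Specht modules as considered in [DJ1, DJ2].)

We consider next how the duality functors in (1.2) and in Lemma 2.2 behave.

Lemma 7.2. (a) For $\lambda \in \Lambda^+(r)$, $d_{\tilde H_{\mathcal{O}}} \tilde Y_\lambda \cong \tilde Y_\lambda$ and $d_H Y_\lambda \cong Y_\lambda$. Also, $d_{\tilde H_{\mathcal{O}}} \tilde Y^{\natural}_\lambda \cong \tilde Y^{\natural}_\lambda$ and $d_H Y^{\natural}_\lambda \cong Y^{\natural}_\lambda$.
(b) For $\lambda \in \Lambda^+(r)$, $\tilde Y^{\natural\Phi}_\lambda \cong \tilde Y_{\lambda'}$ and $Y^{\natural\Phi}_\lambda \cong Y_{\lambda'}$.
(c) $d_{S_q(n,r)}$ is a strong duality on ${}_{S_q(n,r)}\mathcal{C}$, i.e., $d_{S_q(n,r)} L(\lambda) \cong L(\lambda)$ for all irreducible $S_q(n,r)$-modules $L(\lambda)$.

¹² Using Lemma 7.1c, it can be shown that $n_{\mu\lambda} = m_{\mu'\lambda'}$. Similarly, in Lemma 5.2, $e_{\mu\lambda} = f_{\mu'\lambda'}$.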


e ∼ e e 8 ∼ e8 e Proof. By Lemma 1.1a, dH eO TλO = TλO and dH eO TλO = TλO . Hence, dH eO Yλ is an indecomposable summand of TeλO which has a nonzero homomorphism to Teλ80 O . Thus, e8 ∼ e8 e e HomH eK YλK , Tλ0 K ) = HomH eO (dH eO YλO , Tλ0 O )K 6= 0. eK (dH e K is a semisimple algebra, it follows that Hom (Te80 , d YeλK ) 6 0, so Since H eK λ K H eK H e80 , d Yeλ ) 6= 0 also. Therefore, d Yeλ ∼ eλ . Similarly, d Ye \ ∼ e\ HomH ( T Y = eO λ O H eO eO eO λ = Yλ . H H This proves two of the isomorphisms in (a); the other two follow by base change and (1.3). e\ e Applying dH eO and then 8 to a nonzero morphism Yλ → Yλ , (a) and Lemma 1.1c 8 8 imply that Yeλ is an indecomposable summand of TeλO for which there exists a nonzero \ morphism to Teλ0 O . Hence, Yeλ8 ∼ = Yeλ0 . Applying 8 again gives the first assertion in (b). Now base change gives the second assertion in (b). Finally, since the duality dH fixes each Yλ , [CPS2, (1.2.1c)] states that dSq (n,r) is a strong duality, so (c) holds. The following result provides an essential key for understanding the partial tilting modules X(λ) for q-Schur algebras. The proof makes crucial use of the isomorphisms e8 e Teyλ ∼ = HomH e (Tλ , T ) given in Theorem 2.3a and the filtration results Theorem 6.6. Compare [DJ1, (4.12)] and [DJ3, (3.5)] for parts (b) and (c). e8 e e Proposition 7.3. Let Te = Te(r, r) and put Ve (λ) = HomH e (Tλ0 , T ) and X(λ) = \ + e e HomH eO (Yλ , TO ) for λ ∈ 3 (r). Then: (a) Ve (λ) ∈ Se (r,r) C(tilt). (See above Proposition 5.4 for a definition.) q e filtration with top section Seλ and lower sections Seµ for µ / λ. (b) Teλ80 has a H-module 8 ∼ e e (c) dH e Sλ = Sλ0 . (d) There is an isomorphism ∼ e HomSe (r,r) (X(λ), TeO ) → Yeλ\ q

O

e O -modules. Also, X(λ) e e k ∈ of H ∈ Se (r,r) C(tilt). Finally, X(λ) q O highest weight λ.

(7.2) Sq (r,r) C(tilt)

has

e e -filtration Te• of Te defined in Proof. By Theorem 2.3a, Ve (λ) ∼ = Teyλ0 . Consider the ϒ e Theorem 6.6. By its construction, it can be refined to a 1-filtration Fe• in which every µ e Fi is spanned by certain Kazhdan-Lusztig basis elements Cw , µ ∈ 3(r) and w ∈ Dµ+ . µ e If Fei /Fei−1 ∼ ν ∈ 3+ (r), then Fei /Fei−1 has basis {Cw + Fei−1 | µ ∈ 3(r), w ∼L = 1(ν), x, for some fixed x ∼R w0,ν }. + For λ ∈ 3+ (r), let Dλ = {w ∈ W | sw > w, ∀s ∈ J(λ)}. Then Cw yλ 6= 0 if −1 + −1 and only if w ∈ Dλ . If Cw yλ 6= 0, then w ∈ Dλ , since otherwise, there exists + + τs = qCw , while τs yλ = −yλ , a contradiction. s ∈ J(λ) so that ws < w. Then Cw + Conversely, any Cw = τw + terms involving τv for `(v) < `(w), so, for w−1 ∈ Dλ , + −1 Cw yλ = w0,λ qw τ + terms involving τu , `(u) < `(ww0,λ ). Hence, the elements 0,λ ww0,λ + −1 Cw yλ , w ∈ Dλ , are Z-linearly independent.


µ Thus, Fei yλ has a basis consisting of various products Cw yλ , for µ ∈ 3(r), −1 + e e w ∈ Dµ ∩ Dλ . Hence, any non-zero Fi yλ /Fi−1 yλ is Z-torsion free. For such i, ∼ e Q(q) is an irreducible Seq (r, r)Q(q) -module. Fei /Fei−1 → Fei yλ /Fei−1 yλ , since each 1(λ) 0 e e e Therefore, V (λ ) = T yλ has a 1-filtration. By Theorem 2.3a, dSe (r,r) Teyλ ∼ = Teyλ ; so q e This proves (a). Teyλ also has a ∇-filtration. f, Te) = 0 for any M f ∈ e (M Since Te has a ∇-filtration, (5.2(1)) implies Ext 1 eq (r,r) S e e e e e e eq (n,r) C(1). Hence, if F• is a 1-filtration of V (λ), then HomSeq (r,r) (F• , T ) is a filtration S e Te) (by Lemma 5.3) for of HomSe (r,r) (Ve (λ), Te) with sections Seν ∼ = HomSeq (r,r) (1(ν), q e every ν for which 1(ν) is a section of Fe• . By Corollary 2.4a, Teλ80 ∼ = HomSeq (r,r) (Ve (λ), Te). e Thus, Lemma 5.2 implies that 1(λ) appears as a section in Ve (λ) and all other sections e 1(µ) satisfy µ / λ. Also, (5.2(2)) says that Fe• can be assumed to have bottom section e 1(λ). Clearly, (b) follows from these observations. e8 e Let φλ generate the free rank one Z-module HomH e (Tλ0 , Tλ ); cf. Lemma 5.1b. By 8 ∼ e (b) and Lemma 5.2, Im φλ ∼ = Seλ . By Lemma 1.1b, dH e (φλ ) = φλ0 . Since Im φλ0 = Sλ0 , (c) follows. e O simply by 1(λ), e e e O , ∇(λ) e In proving (d), denote 1(λ) ∇(λ). Now X(λ) is a di8 1 e e e e ( T , T ) ∈ C(tilt), so, Ext ( X(λ), ∇(µ)) and rect summand of HomH 0 O eq (r,r)O eO λ O S eq (r,r) S e e Ext 1 (1(µ), X(λ)) vanish ∀µ ∈ 3+ (r) by (5.2(1,2)). By “Donkin’s criterion” eq (r,r) S e [CPS2, (4.5.1)], X(λ) ∈ Se (r,r) C(tilt). The isomorphism q

O

Teλ80 O ∼ = HomSeq (r,r)O (Ve (λ)O , TeO ) immediately implies (7.2). (Observe that, in the notation just above Corollary 2.4, if e EvM e is an isomorphism, then EvNe is also an isomorphism for any direct summand N f.) Thus, by Lemma 7.1a, λ is the maximal ν for which 1(ν) e e O is a section of X(λ). of M Remark 7.4. If we localize Z to Z 0 in which q + 1 is invertible, then the analog of Proposition 7.3a over Z 0 is much easier to prove: One can show Ext 1e 0 (Seλ0 , Te0 ) = 0, H

(7.3)

using the arguments from [CPS2, (1.5.2)] and [DPS1, (1.2.13)]. Then one gets a filtration e 0 , and then duality may be applied to complete the proof. of Ve (λ)0 by modules 1(λ) e f has a S-filtration e If an H-module M Fe• , then the multiplicity of any Seλ as a section in f e fQ(q) : SeλQ(q) ]). Hence, the multiplicity is e F• equals [MK : SλK ] (which in turn equals [M f : Seλ ]. Similarly, for a Seq (n, r)independent of the filtration chosen; we denote it by [M f f e e e module M with a 1-filtration, the multiplicity [M : 1(λ)] of 1(λ) as a section of a f is well-defined. Similar remarks apply for H e O and Seq (n, r)O . e 1-filtration of M e e Proposition 7.5. For λ ∈ 3+ (r), let Pe (λ) = HomH eO (Yλ , TO ) be as in Lemma 7.1. Then:


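(a) Each $\tilde P(\lambda)$ is in ${}_{\tilde S_q(r,r)_{\mathcal{O}}}\mathcal{C}(\tilde\Delta)$.
(b) For $\lambda \in \Lambda^+(r)$, $\tilde Y_\lambda$ (resp., $Y_\lambda$) has a $\tilde S$-filtration (resp., $S$-filtration) with bottom section $\tilde S_{\lambda\mathcal{O}}$ (resp., $S_\lambda$) and higher sections $\tilde S_{\mu\mathcal{O}}$ (resp., $S_\mu$) for $\mu \triangleright \lambda$. Also, for any $\mu \in \Lambda^+(r)$,

$$[\tilde Y_\lambda : \tilde S_{\mu\mathcal{O}}] = [\tilde P(\lambda) : \tilde\Delta(\mu)_{\mathcal{O}}] = [P(\lambda) : \Delta(\mu)], \tag{7.4}$$

where $P(\lambda) = \tilde P(\lambda)_k$.
(c) Similarly, for $\lambda \in \Lambda^+(r)$, $\tilde Y^{\natural}_\lambda$ (resp., $Y^{\natural}_\lambda$) has a $\tilde S$-filtration (resp., $S$-filtration) with top section $\tilde S_{\lambda\mathcal{O}}$ (resp., $S_\lambda$) and lower sections $\tilde S_{\mu\mathcal{O}}$ (resp., $S_\mu$) for $\mu \trianglelefteq \lambda$.
(d) Given $\lambda, \mu \in \Lambda^+(r)$, there is an equality

$$[\tilde X(\lambda) : \tilde\Delta(\mu)_{\mathcal{O}}] = [\tilde Y^{\natural}_\lambda : \tilde S_{\mu\mathcal{O}}] \tag{7.5}$$

of multiplicities relative to any $\tilde\Delta$-filtration of $\tilde X(\lambda)$ and $\tilde S$-filtration of $\tilde Y^{\natural}_\lambda$.

Proof. (a) follows from the quasi-heredity of $\tilde S_q(r,r)$ (or, as we have done earlier, use [CPS2, (4.5.1)]). Since Proposition 5.4 implies that $\tilde T = \tilde T(r,r) \in {}_{\tilde S_q(r,r)}\mathcal{C}(\tilde\nabla)$, Lemma 5.3 and (5.2(1)) imply each section $\tilde\Delta(\mu)_{\mathcal{O}}$ of $\tilde P(\lambda)$ contributes a section $\tilde S_{\mu\mathcal{O}} = \operatorname{Hom}_{\tilde S_q(r,r)_{\mathcal{O}}}(\tilde\Delta(\mu)_{\mathcal{O}}, \tilde T_{\mathcal{O}})$ in a filtration of $\tilde Y_\lambda \cong \operatorname{Hom}_{\tilde S_q(r,r)_{\mathcal{O}}}(\tilde P(\lambda), \tilde T_{\mathcal{O}})$, proving (b). Finally, both (c), (d) follow immediately from Proposition 7.3d.

Recall from Sect. 4 that, given a HWC ${}_A\mathcal{C}$ with poset $\Lambda^+$, there is an associated partial tilting module $X(\lambda)$ defined for every $\lambda \in \Lambda^+$. Our next result gives an explicit description of these modules for q-Schur algebras.

Theorem 7.6. For $\lambda \in \Lambda^+(n,r)$, $\tilde X(\lambda)_k = \operatorname{Hom}_{\tilde H_{\mathcal{O}}}(\tilde Y^{\natural}_\lambda, \tilde T(n,r)_{\mathcal{O}})_k$ identifies with the partial tilting module $X(\lambda)$ of highest weight $\lambda$ for the HWC ${}_{S_q(n,r)}\mathcal{C}$.

Proof. By Lemma 4.1a, we can assume $r = n$. Since $\tilde Y^{\natural}_\lambda$ is a direct summand of $\tilde T^{\Phi}_{\mathcal{O}} = \tilde T(r,r)^{\Phi}_{\mathcal{O}}$, the evaluation map $\tilde Y^{\natural}_\lambda \to \operatorname{Hom}_{\tilde S_q(r,r)_{\mathcal{O}}}(\operatorname{Hom}_{\tilde H_{\mathcal{O}}}(\tilde Y^{\natural}_\lambda, \tilde T_{\mathcal{O}}), \tilde T_{\mathcal{O}})$ is an isomorphism by Corollary 2.4a. Thus, the indecomposability of $\tilde Y^{\natural}_\lambda$ implies that of $\tilde X(\lambda)$. Now [CPS2, (1.5.6b)] and Lemma 4.2c imply that $\tilde X(\lambda)_k$ is indecomposable. The theorem follows by Proposition 7.3d.

As an easy consequence of our results, we describe the Ringel dual of $S_q(n,r)$. Let $\Lambda^+(n,r)'$ be the ideal in $\Lambda^+(r)$ consisting of all dual partitions $\lambda'$, $\lambda \in \Lambda^+(n,r)$. Also, following Theorem 2.3, let

$$\tilde X = \bigoplus_{\lambda \in \Lambda(r,r)} \tilde T(r,r)\, y_\lambda \cong \operatorname{Hom}_{\tilde H}(\tilde T(r,r)^{\Phi}, \tilde T(r,r)). \tag{7.6}$$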

ek is a full tilting module for Sq (r, r) for any field k which is a By Theorem 7.6, X e = End e e ∼ e Z-algebra. So, if E eq (r,r) (X), then Ek = EndSq (r,r) (Xk ) by Lemma 4.2c. As S ? discussed in Sect. 4, the Ringel dual Sq (r,r) C of Sq (r,r) C identifies with Ee C. It is a k highest weight category with weight poset (3+ (r), Eop ) and standard objects 1? (λ) = ek ). Now we can prove the following important result. HomSq (r,r) (1(λ), X Theorem 7.7. Let k be a field which is a Z-algebra.


e as in (7.6), we have EndS (n,r) (X ek ) ∼ (a) Then, for X = Sq (n, r). Moreover, there is q ? ∼ an equivalence G : Sq (r,r) C → Sq (r,r) C of highest weight categories such that G(L? (λ)) ∼ = L(λ0 ), λ ∈ 3+ (r). (Here L? (λ) denotes the simple object in Sq (r,r) C ? corresponding to λ ∈ 3+ (r).) (b) For n ≥ r, Sq (n,r) C ? ∼ = Sq (n,r) C. (c) For n < r, Sq (n,r) C ? ∼ = Sq (r,r) C[3+ (n, r)0 ]. Proof. We will use the notation immediately above the statement of the theorem. By ∼ e Let βe be the antiCorollary 2.4a, there is an isomorphism fe : Seq (r, r)op → E. automorphism in the proof of Lemma 2.2, and form the isomorphism ge = fe ◦ βe : ∼ e ek is an algebra isomorphism, proving the Seq (r, r) → E. Then g = gek : Sq (r, r) → E first assertion of (a). ∼ Next, as in Corollary 2.4b, fe defines a category equivalence Fe : EeC → Se (r,r)op C. For q e = Seλ in Corollary 2.4b, so N e = 1(λ). e By Lemma 5.4 and its proof, the λ ∈ 3+ (r), let N e . If we put 1 e e ? (λ) = Hom e hypothesis of Corollary 2.3b holds for N eq (r,r) (1(λ), X) and S ? ? e (λ)k ∼ e (λ)) ∼ define 1? (λ) similarly, then 1 = 1? (λ) by Lemma 4.2c. By (2.3), Fe(1 = 8 e e e e ( T , S ), where T = T (r, r). Thus, the isomorphism g e defines, by pull-back, HomH λ e ∼ ∼ e e? e8 e e : C → a category equivalence G eq (r,r) C in which G(1 (λ)) = HomH e (T , Sλ ). In e E S e8 e this isomorphism, the natural right action of Seq (r, r) on HomH e (T , Sλ ), through its left e Now action on Te8 , is converted to a left action by means of β. e8 e ∼ e∗ e8∗ ∼ e e8 HomH e (Sλ , T ) = HomH e (dH e Sλ , T ) e (T , Sλ ) = HomH ∼ e8 e8 ∼ e e = HomH e ( Sλ 0 , T ) e (Sλ0 , T ) = HomH ∼ e 0) = 1(λ as left Seq (r, r)-modules. In the above display, the second isomorphism follows from Lemma 2.2a, and the third isomorphism follows from Proposition 7.3c. Now the ∼ isomorphism g above defines, by pull-back, an equivalence G : Ee C → Sq (r,r) C. k e1 e ? (λ)k ) ∼ e ? (λ))k ∼ e 0 )k ∼ We have G(1? (λ)) ∼ = G(1 = G( = 1(λ = 1(λ0 ). 
It follows that $G(L^\star(\lambda)) \cong L(\lambda')$. Since $(\Lambda^+(r), \trianglelefteq^{\mathrm{op}}) \xrightarrow{\sim} (\Lambda^+(r), \trianglelefteq)$, $\lambda \mapsto \lambda'$, is a poset isomorphism, $G$ is an equivalence of highest weight categories. So, (a) is proved. Next, (b) follows since for $n \ge r$, $S_q(n, r)$ is Morita equivalent to $S_q(r, r)$. In the notation just above (2.2), we have $S_q(n, r) = e_{\Lambda^+(n,r)} S_q(r, r) e_{\Lambda^+(n,r)}$. Thus, ${}_{S_q(n,r)}\mathcal{C} \cong {}_{S_q(r,r)}\mathcal{C}(\Lambda^+(n, r))$, so (4.3) implies that

$$ {}_{S_q(n,r)}\mathcal{C}^\star \cong {}_{S_q(r,r)}\mathcal{C}^\star[\Lambda^+(n, r)], $$

where, on the right-hand side, 3+ (n, r) is an ideal in the poset (3+ (r), Eop ). Now (c) follows since the equivalence G defined in (a) carries Sq (r,r) C ? [3+ (n, r)] to + 0 Sq (r,r) C[3 (n, r) ]. Remarks 7.8. (a) The determination of the Ringel dual of the Schur algebra S(n, r) (i. e., the algebra Sq (n, r) when q = 1 in k) is due to Donkin [D1], using algebraic group/Hopf algebra methods. Donkin has also announced [D3, p. 236] a result similar to (the first part of) Theorem 7.7a for q-Schur algebras. (In fact, the proof, which is quite different from our methods, has now been circulated in preprint form [D4, Sect. 4.1].) [CPS2, Sect. 5.2]


gave a different, more combinatorial development of tilting modules for $S(n, r)$, valid under the assumption that char $k \neq 2$. The above arguments, besides being short and independent of any representation theory of quantum groups, extend those methods to q-Schur algebras. The methods also give integral versions of the classical q = 1 results, and, specializing to q = 1, give efficient proofs of those results. (b) In characteristic 2, when q = 1, the treatment here confirms and deepens the tilting module description proposed in [CPS2, (5.2.8)], via reduction of an integral intertwining module, and does the same for q-Schur algebras over any field in which q + 1 = 0; see Theorem 7.6. These exceptional cases are quite important for the nondescribing characteristic representation theory of the finite general linear groups. It is interesting to note that the tilting module description Theorem 2.3c remains “the same” over any field or $\mathcal{Z}$-algebra, though that is not true of the intertwining module description. (c) In Theorem 8.4c below, we will present another description of the Ringel dual of $\widetilde{S}_q(n, r)_k$ in some cases.

8. Quantum Weyl Reciprocity, II

In this section, we maintain the notation of the previous section. In particular, $\widetilde{H}$ is the generic Hecke algebra over $\mathcal{Z} = \mathbb{Z}[q, q^{-1}]$ for $W = \mathfrak{S}_r$ and the field $k$ is a $\mathcal{Z}$-algebra so that $H = \widetilde{H}_k$ is not semisimple. Thus, the image of $q$ is a primitive $l$th root of 1 for some integer $l$ satisfying $1 \le l \le r$. If $l = 1$, then $0 < \operatorname{char} k = p \le r$. As at the beginning of Sect. 7, fix a local triple $(\mathcal{O}, k, K)$, where the discrete valuation ring $\mathcal{O}$ is a $\mathcal{Z}$-algebra and $\widetilde{H}_K$ is a (split) semisimple algebra.

A partition $\lambda$ is (row) $a$-regular ($a \in \mathbb{Z}^+$) if $\lambda$ contains no part $\lambda_i$ which is repeated $a$ or more times. Let $\Lambda^+(n, r)_{a\text{-reg}}$ be the subset of $a$-regular partitions. Put

$$ \Lambda^+(n, r)_{\mathrm{reg}} = \begin{cases} \Lambda^+(n, r)_{l\text{-reg}} & \text{if } l > 1; \\ \Lambda^+(n, r)_{p\text{-reg}} & \text{if } l = 1, \end{cases} \qquad \text{and} \qquad \Lambda^+(r)_{\mathrm{reg}} = \Lambda^+(r, r)_{\mathrm{reg}}. \tag{8.1} $$

Lemma 8.1. 
(a) For λ ∈ 3+ (r), the restriction of the pairing ( , ) in Lemma 1.1a to Sλ ⊆ Tλ is non-zero if and only if λ ∈ 3+ (r)reg . (b) For λ ∈ 3+ (r)reg , let Dλ be the head of Sλ . The algebra H has #3+ (r)reg distinct irreducible modules. Each irreducible H-module is isomorphic to some Dλ . e O -module) if and (c) For λ ∈ 3, Yλ (resp., Yeλ ) is projective as an H-module (resp., H + 0 only if λ ∈ 3 (r)reg . Proof. It has already been remarked (above Lemma 7.2) that Sλ identifies with the Specht module associated to λ in the sense of [DJ1]. For this reason, (a) follows from [DJ1, p. 42]. (b) and (c) are also proved in [DJ1, (7.7)], but, using (a), follow from general principles, see [CPS2, (4.5.1)]. Proposition 8.2. (a) For any λ, µ ∈ 3+ (r), the number of occurrences [X(λ) : 1(µ)] of 1(µ) as a section in a 1-filtration of X(λ) equals [1(µ0 ) : L(λ0 )]. (b) If λ ∈ 3+ (r)reg , then [X(λ) : 1(µ)] = [Sµ : Dλ ]. e e O ] = [Ye \ : SeµO ]. But Lemma 7.2a,b Proof. By (7.5), [X(λ) : 1(µ)] = [X(λ) : 1(µ) λ and Proposition 7.3c imply that [Yeλ\ : SeµO ] = [Yeλ0 : Seµ0 ], which in turn equals [P (λ0 ) : 1(µ0 )] by (7.4). By Brauer–Humphreys reciprocity, [P (λ0 ) : 1(µ0 )] = [∇(µ0 ) : L(λ0 )]


which also equals $[\Delta(\mu') : L(\lambda')]$ since $S_q(r, r)$ has a strong duality. If $\lambda \in \Lambda^+(r)_{\mathrm{reg}}$, then $Y_\lambda^\natural \cong Y_{\lambda'}^{\Phi}$ is projective by Lemma 8.1c. By elementary Brauer theory (see [DPS1, (1.1.3)]), $[\widetilde{Y}_\lambda^\natural : \widetilde{S}_\mu] = [S_\mu : D_\lambda]$.

Remarks 8.3. (a) Using Lemma 4.1a, we could have taken $\lambda, \mu \in \Lambda^+(n, r)$ in Proposition 8.2. We could also obtain the first assertion in Proposition 8.2a from Theorem 7.7a, but have given the above argument since it is very concrete. (b) Taking q = 1 in Proposition 8.2 yields relations between the decomposition numbers for symmetric groups and those for Weyl modules for Schur algebras. In particular, (a) is in [D1], and (b) is in [E1]. Putting (a) and (b) together gives a result of James [J1] on embedding the decomposition matrix of $\mathfrak{S}_r$ into that for $S(r, r)$. (c) Erdmann [E2] considered whether every decomposition number for $S(n, r)$ occurs as a decomposition number for some symmetric group $\mathfrak{S}_{r'}$. She proved that for $\lambda, \mu \in \Lambda^+(n, r)$:

$$ [\Delta(\mu) : L(\lambda)] = [S_{t(\mu')} : D_{t(\lambda')}], \tag{8.2} $$

where $t(\mu') = p\mu' + (p-1)(n-1, n-2, \dots, 1, 0) \in \Lambda^+\big(pr + (p-1)\binom{n}{2}\big)_{p\text{-reg}}$. The proof of (8.2) follows from the q = 1 case of Proposition 8.2 using the $GL_n(k)$-isomorphisms

$$ X(\lambda)^{(1)} \otimes \mathrm{St} \cong X(t(\lambda)), \qquad \Delta(\lambda)^{(1)} \otimes \mathrm{St} \cong \Delta(t(\lambda)). \tag{8.3} $$

Here St is the Steinberg module, and $M^{(1)}$ denotes the Frobenius twist of $M$. (d) Let $q \in k$ be a primitive $l$th root of unity, $l > 1$. There is no evident quantization of (8.2), since now the Frobenius morphism $\mathrm{Fr} : GL_{n,q}(k) \to GL_n(k)$ goes from the quantum group $GL_{n,q}(k)$ to the classical algebraic group $GL_n(k)$. (See [PW, Sect. 7].) Thus, although there exist isomorphisms as in (8.3), the twisted modules $X(\lambda)^{(1)}$ and $\Delta(\lambda)^{(1)}$ are the pull-backs through Fr of the tilting module $X(\lambda)$ and Weyl module $\Delta(\lambda)$ for $GL_n(k)$, and not for $GL_{n,q}(k)$. (e) Suppose in Proposition 8.2 that char $k = 0$, so that $l > 1$. By Proposition 8.2 and Lemma 6.4, the decomposition matrix $D$ for $H$ embeds into the decomposition matrix $D_q$ for $U_{q^{1/2}}(\mathfrak{sl}_n)$. When $l$ is odd at least, $D_q$ is described in terms of inverse Kazhdan–Lusztig polynomials, using [KL3, KT]. Another approach to determining $D$ in terms of the crystal bases for the affine quantum enveloping algebra $U_{q^{1/2}}(\widehat{\mathfrak{gl}}_n)$ is described in [LLT]. Perhaps our remark explains the “in principle” comment [LLT, p. 205].

Now assume that

$$ \begin{cases} n < l, & \text{if } l > 1; \\ n < p, & \text{if } l = 1. \end{cases} \tag{8.4} $$
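To see concretely what (8.4) provides, note that a partition with at most n parts cannot repeat a part more than n times, so under (8.4) every element of Λ+(n, r) is regular in the sense of (8.1). The following Python sketch checks this on a small case. It assumes that Λ+(n, r) is the set of partitions of r with at most n nonzero parts and that regularity refers to nonzero parts only; the function names are illustrative and do not come from the paper.

```python
from collections import Counter


def partitions(r, max_parts, max_part=None):
    """Yield partitions of r as weakly decreasing tuples with at most max_parts parts."""
    if max_part is None:
        max_part = r
    if r == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(r, max_part), 0, -1):
        for rest in partitions(r - first, max_parts - 1, first):
            yield (first,) + rest


def is_a_regular(lam, a):
    """True if no (nonzero) part of lam is repeated a or more times."""
    return all(count < a for count in Counter(lam).values())


def regular_subset(n, r, a):
    """Assumed model of Lambda^+(n, r)_{a-reg}: the a-regular partitions of r with <= n parts."""
    return [lam for lam in partitions(r, n) if is_a_regular(lam, a)]
```

For n = 3 and r = 6, taking a = 4 (so that n < a, as in (8.4)) returns every partition, whereas a = 2 keeps only the partitions with distinct parts.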

Since $n$ is the Coxeter number of $GL_n(k)$ (or the quantum general linear group $GL_{n,q}(k)$), (8.4) covers all cases (except $n = l$ or $p$, which we omit for simplicity) relevant for the Lusztig conjecture relating the characters of $U_{q^{1/2}}(\mathfrak{sl}_n)$ with those for $GL_n(k)$, when $k$ has positive characteristic, etc. James [J2, Sect. 4] has formulated a conjecture (in the case $n = r$ and $k$ has positive characteristic $p$) asserting that, when $r < lp$, the irreducible $S_q(r, r)$-modules arise by “reduction mod p” from the corresponding irreducible modules for the complex q-Schur algebra $S_q(r, r)_{\mathbb{C}}$ at a primitive $l$th root of unity in $\mathbb{C}$. Clearly, this conjecture can be made for $n < r$ (and its validity for $n = r$ implies the validity for $n < r$). Also, if $l = p$, the condition $r < lp$ is similar in spirit to the weight restrictions in the modular Lusztig conjecture (for the irreducible characters of reductive groups $G$).


Thus, the conditions $n < l = p$ and $r < lp$ represent an overlap between the Lusztig and James conjectures.13

13 For further discussion of the James conjecture, see [GH] and [CPS3]. In particular, these papers discuss the fact that the conjecture is true “generically”.

We prove the following result, first observed by Erdmann for q = 1 [E1] (building on work of Donkin). We will make use of the algebras $\widetilde{H}(n, r)$ defined above Theorem 6.2. Also, we set $H(n, r) = \widetilde{H}(n, r)_k$.

Theorem 8.4. (a) As an $S_q(n, r)$-module, $T(n, r) = \bigoplus_{\lambda \in \Lambda^+(n,r)_{\mathrm{reg}}} X(\lambda)^{\oplus r_\lambda}$, where each $r_\lambda > 0$. (b) Assume (8.4) holds. Then $T(n, r)$ is a full tilting module for $S_q(n, r)$. In particular, $H(n, r)^{\mathrm{op}} \cong \operatorname{End}_{S_q(n,r)}(T(n, r))$ (see Theorem 6.2) is quasi-hereditary, and ${}_{H(n,r)^{\mathrm{op}}}\mathcal{C}$ identifies with the Ringel dual of ${}_{S_q(n,r)}\mathcal{C}$. (c) If (8.4) holds, then ${}_{H(n,r)^{\mathrm{op}}}\mathcal{C}$ is a HWC with poset $(\Lambda^+(n, r), \trianglelefteq^{\mathrm{op}})$, standard objects $S_\lambda$, partial tilting modules $Y_\lambda$, and projective indecomposable modules $Y_\lambda^\natural$, $\lambda \in \Lambda^+(n, r)$.

Proof. We first prove (a). First, by Theorem 7.6, the partial tilting module $X(\lambda)$ is isomorphic to $\operatorname{Hom}_{\widetilde{H}_{\mathcal{O}}}(\widetilde{Y}_\lambda^\natural, \widetilde{T}(n, r)_{\mathcal{O}})_k$. If $\lambda \in \Lambda^+(n, r)_{\mathrm{reg}}$, then Lemma 8.1c and Lemma 7.2b imply that $Y_\lambda^\natural$ is projective, so

$$ \operatorname{Hom}_H(Y_\lambda^\natural, T(n, r)) \cong \operatorname{Hom}_{\widetilde{H}_{\mathcal{O}}}(\widetilde{Y}_\lambda^\natural, \widetilde{T}(n, r)_{\mathcal{O}})_k. $$

Therefore, in this case, $X(\lambda)$ is a direct summand of $T(n, r)$. Since $T(n, r)$ is a faithful $H(n, r)$-module, the irreducible $H(n, r)$-modules are precisely those $D_\lambda$ which appear as an $H$-composition factor of $T(n, r)$. Thus, using Proposition 8.2b, it follows that $\operatorname{Irr}(H(n, r)) = \{D_\lambda \mid \lambda \in \Lambda^+(n, r)_{\mathrm{reg}}\}$. Since $H(n, r) = \operatorname{End}_{S_q(n,r)}(T(n, r))$ (by Theorem 6.2 and Lemma 4.2), $\operatorname{Irr}(H(n, r))$ has cardinality equal to the number of distinct (up to isomorphism) indecomposable summands of the $S_q(n, r)$-module $T(n, r)$. In view of the previous paragraph, (a) is established.

If (8.4) holds, then $\Lambda^+(n, r)_{\mathrm{reg}} = \Lambda^+(n, r)$, so that (b) follows from (a). Then by Sect. 4, ${}_{H(n,r)^{\mathrm{op}}}\mathcal{C}$ is a HWC with poset $(\Lambda^+(n, r), \trianglelefteq)$ and standard objects $\Delta^\star(\lambda) = \operatorname{Hom}_{S_q(n,r)}(\Delta(\lambda), T(n, r)) \cong S_\lambda$. (The second isomorphism here follows from Lemma 5.3, Lemma 4.2 and the discussion above and below Proposition 5.4.) Since Ringel's theory establishes that $T(n, r)$ is a full tilting module for $H(n, r)^{\mathrm{op}}$, it follows that the $Y_\lambda$, $\lambda \in \Lambda^+(n, r)$, are the distinct partial tilting modules for $H(n, r)$. Finally, any $Y_\lambda^\natural$, $\lambda \in \Lambda^+(n, r)$, is a projective $H$-module and an $H(n, r)$-module; thus, it is also a projective $H(n, r)$-module. Because the $Y_\lambda^\natural$ are distinct for distinct $\lambda$, (c) follows.

We remark that the proof of Theorem 8.4a above has used Theorem 6.2. However, another proof, independent of Theorem 6.2, exists. We may first prove the result for $n \ge r$, then apply Lemma 4.1a and Theorem 7.6. (See [E1, 4.2].) However, Theorem 8.4b,c require Theorem 6.2 in an essential way.

Now let $\Lambda^e = \Lambda^+(n, r) \times \Lambda^+(n, r)$ and define $(\lambda, \mu) \le^e (\tau, \sigma)$ if and only if $\lambda \trianglelefteq \tau$ and $\mu \trianglerighteq \sigma$.
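The order on pairs of partitions defined above is a product of a partial order on the first factor with its opposite on the second. A minimal Python sketch follows, assuming that the partial order on partitions is the usual dominance order (compare partial sums of partitions of the same size). The helper names `dominates` and `leq_e` are hypothetical and not taken from the paper.

```python
from itertools import accumulate, zip_longest


def dominates(tau, lam):
    """True if lam is dominated by tau: every partial sum of lam is <= that of tau."""
    if sum(lam) != sum(tau):
        return False
    total = sum(lam)
    # Pad the shorter sequence of partial sums with the common total.
    pairs = zip_longest(accumulate(lam), accumulate(tau), fillvalue=total)
    return all(a <= b for a, b in pairs)


def leq_e(pair1, pair2):
    """(lam, mu) <=_e (tau, sigma) iff lam is dominated by tau and mu dominates sigma."""
    (lam, mu), (tau, sigma) = pair1, pair2
    return dominates(tau, lam) and dominates(mu, sigma)
```

For instance, `leq_e(((2, 2), (3, 1)), ((3, 1), (2, 2)))` holds, since (2, 2) is dominated by (3, 1) in the first slot and (3, 1) dominates (2, 2) in the second. The partitions (3, 1, 1, 1) and (2, 2, 2) are incomparable, which shows that the product order is genuinely partial.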


Corollary 8.5. Assume that (8.4) holds. Then the algebra Re = Sq (n, r) ⊗ H(n, r)op is a quasi-hereditary algebra. The module category Re C is a HWC with poset (3e , ≤e ) and standard objects 1(λ) ⊗ Sµ , (λ, µ) ∈ 3e . Proof. At least for an algebraically closed field, the result is immediate from Theorem 8.4 and [W], where tensor products of quasi-hereditary algebras are studied. We e ee = Seq (n, r)O ⊗ H(n, r)op give a different proof, applying [DPS2, (1.6)] to the algebra R O e e µ)O = 1(λ) e O ⊗ SeµO ∈ Ob( e C) is a finitely genand the poset 3 . First, each 1(λ, e R erated projective O-module. Let Pe (λ, µ) = Pe(λ) ⊗ Yeµ\ , where Pe(λ) is the projective cover in Se (n,r) C of the irreducible Sq (n, r)-module L(λ). Then Pe(λ, µ) is a projective q O ee -module. In the Grothendieck group of Seq (n, r)K -modules, we have [Pe(λ, µ)K ] = R P ∼ e e e µ)K ] + e e [1(λ, (σ,τ )>e (λ,µ) m(σ,τ ),(λ,µ) [1(σ, µ)K ]. Each 1(λ, µ)K = 1(λ)K ⊗ SµK is L op e e e = Seq (n, r)K ⊗ H(n, r)K . Finally, δ∈3e Pe(δ) is clearly absolutely irreducible for R K a projective generator for R C. Thus, the hypotheses of [DPS2, (1.6)] holds, and (d) ee follows. Remarks 8.6. (a) We emphasize that, even without the assumption (8.4), Theorem 6.6 gives a filtration of tensor space by Sq (n, r) ⊗ H(n, r)op -modules 1(λ) ⊗ dH Sλ , λ ∈ 3+ (n, r), in analogy with the classical decomposition (2) described at the beginning of the paper. In the presence of (8.4), this filtration is even by modules which make sense in terms of quasi-hereditary algebras. However, note that the 1(λ) ⊗ dH Sλ are not (generally) standard modules for Sq (n, r) ⊗ H(n, r)op . The standard modules are the modules 1(λ) ⊗ Sλ by Corollary 8.5. (b) We mention without detailed proof one further aspect of Weyl reciprocity, namely the correspondence between Weyl and Specht modules. One requires here the set-up of Remark 7.4, where we localize Z to Z 0 in which q + 1 is invertible. 
Then there is a contravariant equivalence of exact categories eq (n,r)0 S

0

∼

e ) → C 0 (Se0 )op . C(1 e H

The proof is obtained using the approach given in [CPS2, (4.6.4)]. A version at q = 1 of this result is due to Erdmann [E1] over a field, where stronger assumptions (p > n is sufficient) are required. Acknowledgement. This paper was written while the first (resp., second) author was on leave from the University of New South Wales (resp., the University of Virginia) at the University of Virginia (resp., Northwestern University). Thanks are due to the University of Virginia, the University of Chicago, and Northwestern University for their hospitality during the preparation of this paper.

References

[AG] Auslander, M. and Goldman, O.: Maximal orders. Trans. Am. Math. Soc. 97, 1–24 (1960)
[BV] Barbasch, D. and Vogan, D.: Primitive ideals and orbital integrals in complex classical groups. Math. Ann. 259, 153–199 (1982)
[BLM] Beilinson, A.A., Lusztig, G. and MacPherson, R.: A geometric setting for the quantum deformation of GLn. Duke Math. J. 61, 655–677 (1990)
[B] Benson, D.: Representations and cohomology, I: Basic representation theory of finite groups and associative algebras. Cambridge: Cambridge U. Press, 1991
[C] Carter, R.: Finite Groups of Lie type: Conjugacy classes and complex characters. New York: Wiley, 1985
[CL] Carter, R. and Lusztig, G.: On the modular representations of the general linear and symmetric groups. Math. Zeit. 136, 193–242 (1974)
[CPS1] Cline, E., Parshall, B. and Scott, L.: Finite dimensional algebras and highest weight categories. J. reine angew. Math. 391, 85–99 (1988)
[CPS2] Cline, E., Parshall, B. and Scott, L.: Stratifying endomorphism algebras. Memoirs Am. Math. Soc. 591 (1996)
[CPS3] Cline, E., Parshall, B. and Scott, L.: Generic and q-rational representation theory. To appear
[dCP] de Concini, C. and Procesi, C.: A characteristic free approach to invariant theory. Adv. Math. 21, 330–354 (1976)
[Cu] Curtis, C.: On Lusztig's isomorphism theorem for Hecke algebras. J. Algebra 92, 348–365 (1985)
[CR] Curtis, C. and Reiner, I.: Methods of Representation Theory I. New York: Wiley, 1981
[DD] Dipper, R. and Donkin, S.: Quantum GLn. Proc. London Math. Soc. 63, 165–211 (1991)
[DJ1] Dipper, R. and James, G.: Representations of Hecke algebras of general linear groups. Proc. London Math. Soc. 52, 20–52 (1986)
[DJ2] Dipper, R. and James, G.: The q-Schur algebra. Proc. London Math. Soc. 59, 23–50 (1989)
[DJ3] Dipper, R. and James, G.: Blocks and idempotents of Hecke algebras of general linear groups. Proc. London Math. Soc. 54, 57–82 (1987)
[DJ4] Dipper, R. and James, G.: q-Tensor space and q-Weyl modules. Trans. Am. Math. Soc. 327, 251–282 (1991)
[D1] Donkin, S.: On tilting modules for algebraic groups. Math. Zeit. 212, 39–60 (1993)
[D2] Donkin, S.: Invariants of several matrices. Invent. Math. 110, 389–401 (1992)
[D3] Donkin, S.: Standard homological properties for quantum GLn. J. Algebra 181, 235–266 (1996)
[D4] Donkin, S.: Schur algebras and related algebras: The q-Schur algebra. To appear
[Du] Du, J.: A note on quantized Weyl reciprocity at roots of unity. Algebra Colloq. 2, 363–372 (1995)
[DPS1] Du, J., Parshall, B. and Scott, L.: Stratifying endomorphism algebras associated to Hecke algebras. J. Algebra 203, 169–210 (1998)
[DPS2] Du, J., Parshall, B. and Scott, L.: Cells and q-Schur algebras. J. Transformation Groups 3, 33–44 (1998)
[DS] Du, J. and Scott, L.: Lusztig conjectures, old and new, I. J. reine angew. Math. 455, 141–182 (1994)
[E1] Erdmann, K.: Symmetric groups and quasi-hereditary algebras. In: Finite dimensional algebras and related topics, ed. V. Dlab and L. Scott, NATO ASI Series 424, 123–161 (1994)
[E2] Erdmann, K.: Decomposition numbers for symmetric groups and composition factors of Weyl modules. J. Algebra 180, 316–320 (1996)
[F] Fulton, W.: Young tableaux. Cambridge: Cambridge U. Press, 1997
[J1] James, G.D.: The decomposition of tensors over fields of prime characteristic. Math. Zeit. 172, 161–178 (1980)
[J2] James, G.D.: The decomposition matrices of GLn(q) for n ≤ 10. Proc. London Math. Soc. 60, 225–265 (1990)
[Ji] Jimbo, M.: A q-analogue of U(gl(N + 1)), Hecke algebra, and the Yang-Baxter equation. Letters in Math. Phys. 11, 247–252 (1986)
[GH] Gruber, J. and Hiss, G.: Decomposition numbers of finite classical groups for linear primes. J. reine angew. Math. 485, 55–91 (1997)
[KT] Kashiwara, M. and Tanisaki, T.: Kazhdan–Lusztig conjecture for affine Lie algebras with negative level. Duke Math. J. 77, 21–62 (1995); II. Nonintegral case. Duke Math. J. 84, 771–813 (1996)
[KL1] Kazhdan, D. and Lusztig, G.: Representations of Coxeter groups and Hecke algebras. Invent. Math. 53, 165–184 (1979)
[KL2] Kazhdan, D. and Lusztig, G.: Schubert varieties and Poincaré duality. Proc. Sympos. Pure Math. 36, 185–203. Providence, RI: Am. Math. Soc., 1980
[KL3] Kazhdan, D. and Lusztig, G.: Tensor structures arising from affine Lie algebras I–II, III–IV. J. Am. Math. Soc. 6, 905–1011 (1994); 7, 335–453 (1994)
[LLT] Lascoux, A., Leclerc, B. and Thibon, J-Y.: Hecke algebras at roots of unity and crystal bases of quantum affine algebras. Commun. Math. Physics 181, 205–263 (1996)
[L1] Lusztig, G.: Characters of reductive groups over a finite field. Princeton, NJ: Princeton U. Press, 1984
[L2] Lusztig, G.: Cells in the affine Weyl groups. Algebraic Groups and Related Topics, Adv. Stud. Pure Math. 6, 255–287 (1985)
[L3] Lusztig, G.: Modular representations and quantum groups. Contemp. Math. 82, 59–77 (1989)
[M] Martin, P.: On Schur-Weyl duality, $A_n$ Hecke algebras and quantum sl(N) on $\otimes^{n+1}\mathbb{C}^N$. Inter. J. Mod. Phys. 7, (1B) 645–673 (1992)
[PW] Parshall, B. and Wang, J.-p.: Quantum linear groups. Memoirs Am. Math. Soc. 439 (1991)
[R] Ringel, C.: The category of modules with good filtrations over a quasi-hereditary algebra has almost split sequences. Math. Zeit. 208, 209–223 (1991)
[S] Shi, J.-Y.: The Kazhdan–Lusztig cells in certain affine Weyl groups. Springer Lecture Notes in Math. 1179, Berlin: Springer-Verlag, 1986
[We] Weyl, H.: The Classical Groups. Princeton, NJ: Princeton U. Press, 1973
[Wi] Weibel, C.: An introduction to homological algebra. Cambridge studies in advanced mathematics 38, Cambridge: Cambridge Univ. Press, 1994
[W] Wiedemann, A.: On stratifications of derived module categories. Can. Bull. Math. 34, 275–280 (1991)
[X] Xi, N.: Representations of affine Hecke algebras. Springer Lecture Notes in Math. 1587, Berlin: Springer-Verlag, 1994

Communicated by T. Miwa

Commun. Math. Phys. 195, 353 – 372 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

O∞ Realized on Bose Fock Space

Carsten Binnenhei⋆

Institut für Theoretische Physik der Freien Universität, Arnimallee 14, 14195 Berlin, Germany. E-mail: [email protected]

Received: 21 January 1997 / Accepted: 26 November 1997

Abstract: We study the semigroup of Bogoliubov endomorphisms of the canonical commutation relations which give rise to representations of the Cuntz algebra $\mathcal{O}_\infty$ on Fock space and describe the corresponding Cuntz algebra generators in detail.

1. Introduction

The appearance of the Cuntz algebras $\mathcal{O}_d$ [6] is a generic feature of quantum field theory. This fact has been discovered, within the algebraic approach [10], by Doplicher and Roberts [8] who associated with each localized morphism $\varrho$ of dimension $d$ and obeying permutation group statistics a multiplet $\Psi_1, \dots, \Psi_d$ of local field operators, acting on a Hilbert space which contains each superselection sector with some multiplicity, such that Cuntz' relations hold

$$ \Psi_j^* \Psi_k = \delta_{jk} \mathbf{1}, \qquad \sum_j \Psi_j \Psi_j^* = \mathbf{1} \tag{1.1} $$

and such that, for any local observable $A$,

$$ \varrho(A) = \sum_j \Psi_j A \Psi_j^*. \tag{1.2} $$

Regarding the $\Psi_j$ as an orthonormal basis for the closure $H(\varrho)$ of their linear span, one says that “$\varrho$ is implemented by the Hilbert space $H(\varrho)$ of isometries”. These observations motivated a study of endomorphisms of the CAR algebra which can be implemented by Hilbert spaces of isometries on Fock space [5]. Here we present a similar analysis for endomorphisms of the canonical commutation relations (CCR) or,

⋆ Supported by the Deutsche Forschungsgemeinschaft (Sfb 288 “Differentialgeometrie und Quantenphysik”)


more precisely, of the Weyl relations. As in [5], we find it convenient to use Araki's “selfdual” formulation [2, 4] which is briefly introduced in Sect. 2. We discuss a class of natural endomorphisms of the CCR algebra (“Bogoliubov endomorphisms”) in Sect. 3. As a generalization of Shale's condition for automorphisms [17], we state a necessary and sufficient condition for Bogoliubov endomorphisms to be implementable in a fixed Fock representation. The derivation of this result is based on known criteria for quasi-equivalence of quasi-free states [2, 7, 4] and can, at one point, be reduced to the CAR case, by using an inequality due to Araki and Yamagami [3]. The topological semigroup of implementable endomorphisms is the subject of Sect. 4. It can be written as a product of a subgroup consisting of automorphisms which are close to the identity, and the sub-semigroup of endomorphisms which leave the given Fock state invariant. This decomposition enables us to determine the connected components of the semigroup. It also plays a role in Sect. 5 which is concerned with the construction of orthonormal bases for the Hilbert spaces $H(\varrho)$. These Hilbert spaces themselves carry the structure of symmetric Fock spaces and thus are, for genuine endomorphisms, infinite-dimensional. The C*-algebra generated by a single $H(\varrho)$ is $\mathcal{O}_\infty$. Implementers are constructed by an adaptation of Ruijsenaars' formulas for unitary implementers of automorphisms [15]. They can be written as products of certain isometries belonging to the commutant of the range of $\varrho$ times a Wick ordered exponential of an expression which is bilinear in creation and annihilation operators. The connection with the aforementioned product decomposition is that, roughly speaking, the first factor carries the exponential term, whereas the second is responsible for the additional isometries. The proof of completeness of implementers can thereby be reduced to the case of endomorphisms which leave the given Fock state invariant.

In an application to quantum field theory, where the CCR algebra would correspond to the field algebra and the observables would be singled out as the invariants under the (second-quantized) action of a compact gauge group $G$ of the first kind, one would restrict to gauge invariant endomorphisms $\varrho$ (endomorphisms which commute with the gauge action). Then one finds that the representation of $G$ on $H(\varrho)$ decomposes according to the Fock space structure of $H(\varrho)$. Consequently, these endomorphisms are always reducible (unless they are automorphisms), and in fact infinite direct sums of irreducibles. This explains the generic occurrence of infinite statistics. Details will be given elsewhere.

2. Basic Notions

Let $\mathcal{K}^0$ be an infinite-dimensional complex linear space, equipped with a nondegenerate hermitian sesquilinear form $\gamma$ and an antilinear involution $f \mapsto f^*$, such that

$$ \gamma(f^*, g^*) = -\gamma(g, f), \qquad f, g \in \mathcal{K}^0. $$

(The reader who is unfamiliar with Araki's approach [2, 4] should think of $\mathcal{K}^0$ as being the complexification of the real linear space $\operatorname{Re} \mathcal{K}^0 \equiv \{f \in \mathcal{K}^0 \mid f^* = f\}$, together with its canonical conjugation. $-i\gamma$ should be viewed as the sesquilinear extension of a nondegenerate symplectic form on $\operatorname{Re} \mathcal{K}^0$. Hopefully, the reader will not be confused in the following by the appearance of too many stars with different meanings.) The selfdual CCR algebra $\mathcal{C}(\mathcal{K}^0, \gamma)$ [2, 4] over $(\mathcal{K}^0, \gamma)$ is the simple *-algebra which is generated by $\mathbf{1}$ and elements $f \in \mathcal{K}^0$, subject to the commutation relation

$$ f^* g - g f^* = \gamma(f, g)\mathbf{1}, \qquad f, g \in \mathcal{K}^0. \tag{2.1} $$


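We henceforth assume the existence of a distinguished Fock state $\omega_{P_1}$. Here $P_1$ is a basis projection of $\mathcal{K}^0$, i.e. a linear operator, defined on the whole of $\mathcal{K}^0$, which satisfies

$$ P_1^2 = P_1, \qquad P_1 f + P_1(f^*)^* = f, \qquad \gamma(f, P_1 g) = \gamma(P_1 f, g), \qquad \gamma(f, P_1 f) > 0 \ \text{ if } P_1 f \neq 0 \tag{2.2} $$

for $f, g \in \mathcal{K}^0$. Let

$$ P_2 \equiv \mathbf{1} - P_1, \qquad C \equiv P_1 - P_2, \qquad \langle f, g \rangle \equiv \gamma(f, Cg). $$

The positive definite inner product $\langle\ ,\ \rangle$ turns $\mathcal{K}^0$ into a pre-Hilbert space. We assume its completion $\mathcal{K}$ to be separable. By continuity, the involution $*$ extends to a conjugation on $\mathcal{K}$, $P_1$ and $P_2$ to orthogonal projections, $C$ to a self-adjoint unitary, and $\gamma$ to a nondegenerate hermitian form. These extensions will be denoted by the same symbols. Setting $\mathcal{K}_n \equiv P_n(\mathcal{K})$, $n = 1, 2$, we get a direct sum decomposition $\mathcal{K} = \mathcal{K}_1 \oplus \mathcal{K}_2$ which is orthogonal with respect to both $\gamma$ and $\langle\ ,\ \rangle$. The following notations will frequently be used for $A \in B(\mathcal{K})$, where $B(\mathcal{K})$ is the algebra of bounded linear operators on $\mathcal{K}$:

$$ A_{mn} \equiv P_m A P_n, \quad m, n = 1, 2, \qquad A^\dagger \equiv C A^* C, \qquad \bar{A} f \equiv A(f^*)^*, \quad f \in \mathcal{K}. $$

The components $A_{mn}$ of $A$ are regarded as operators from $\mathcal{K}_n$ to $\mathcal{K}_m$, and $A$ will sometimes be written as a matrix $\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$. $A^\dagger$ is the adjoint of $A$ relative to $\gamma$, whereas $A^*$ is the Hilbert space adjoint. $\bar{A}$ may be viewed as the complex conjugate of $A$. Thus one has relations like

$$ \overline{P_2} = P_1 = P_1^\dagger = P_1^*, \qquad \bar{C} = -C, \qquad {A_{12}}^\dagger = {A^\dagger}_{21} = -{A_{12}}^*, \qquad \overline{A_{11}} = \bar{A}_{22}, \qquad \text{etc.} $$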
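The relations above can be checked numerically in a toy model. The sketch below is an illustration only, under the assumption that K is replaced by ℂ² = K1 ⊕ K2 with conjugation (u, v)* = (conj v, conj u), so that P1, P2 and C become diagonal matrices. Infinite-dimensionality plays no role for these purely algebraic identities. All function names are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def hermitian(X):
    """Hilbert space adjoint X^* (conjugate transpose)."""
    return [[X[j][i].conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]


def entrywise_conj(X):
    return [[x.conjugate() for x in row] for row in X]


def neg(X):
    return [[-x for x in row] for row in X]


# Toy model (assumption): K = C^2, K1 = first coordinate, K2 = second.
P1 = [[1, 0], [0, 0]]
P2 = [[0, 0], [0, 1]]
C = [[1, 0], [0, -1]]      # C = P1 - P2
J = [[0, 1], [1, 0]]       # f^* = J conj(f) swaps the two coordinates


def dagger(A):
    """Adjoint relative to gamma: A^dagger = C A^* C."""
    return matmul(C, matmul(hermitian(A), C))


def bar(A):
    """Complex conjugate: bar(A) f = (A f^*)^*, i.e. bar(A) = J conj(A) J."""
    return matmul(J, matmul(entrywise_conj(A), J))


def block(A, m, n):
    """Component A_mn = P_m A P_n, kept as an operator on all of K."""
    P = {1: P1, 2: P2}
    return matmul(P[m], matmul(A, P[n]))


A = [[1 + 2j, 3 - 1j], [2j, 4 + 0j]]   # arbitrary test operator
A12 = block(A, 1, 2)
checks = [
    bar(P2) == P1,
    dagger(P1) == P1,
    bar(C) == neg(C),
    dagger(A12) == block(dagger(A), 2, 1) == neg(hermitian(A12)),
    bar(block(A, 1, 1)) == block(bar(A), 2, 2),
]
```

Every entry of `checks` evaluates to `True` for this choice of `A`; the entries of `A` are small Gaussian integers, so the comparisons are exact.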

The Fock state $\omega_{P_1}$ is the unique state¹ which is annihilated by all $f \in \operatorname{ran} P_2$:

$$ \omega_{P_1}(f^* f) = 0 \quad \text{if } P_1 f = 0. $$

(In the conventional setting mentioned above, $\omega_{P_1}$ is the Fock state corresponding to the complex structure $iC$ on $\operatorname{Re} \mathcal{K}$.) Let $\mathcal{F}_s(\mathcal{K}_1)$ be the symmetric Fock space over $\mathcal{K}_1$ and let $\mathcal{D}$ be the dense subspace of algebraic tensors. A GNS representation $\pi_{P_1}$ for $\omega_{P_1}$ is provided by

$$ \pi_{P_1}(f) = a^*(P_1 f) + a\big(P_1(f^*)\big), \qquad f \in \mathcal{K}, $$

where $a^*(g)$ and $a(g)$, $g \in \mathcal{K}_1$, are the usual creation and annihilation operators on $\mathcal{D}$. The cyclic vector inducing the state $\omega_{P_1}$ is $\Omega_{P_1}$, the Fock vacuum. The operators $\pi_{P_1}(a)$, $a \in \mathcal{C}(\mathcal{K}, \gamma)$, have invariant domain $\mathcal{D}$, are closable, and $\pi_{P_1}(a^*) \subset \pi_{P_1}(a)^*$. In particular, if $f \in \operatorname{Re} \mathcal{K}$, then $\pi_{P_1}(f)$ is essentially self-adjoint on $\mathcal{D}$, and the unitary Weyl operator $w(f)$ is defined as the exponential of the closure of $i\pi_{P_1}(f)$. Its vacuum expectation value is

$$ \omega_{P_1}(w(f)) \equiv \langle \Omega_{P_1}, w(f)\Omega_{P_1} \rangle = e^{-\frac{1}{2}\|P_1 f\|^2}, $$

and the Weyl relations hold

$$ w(f)w(g) = e^{-\frac{1}{2}\gamma(f,g)} w(f + g), \qquad f, g \in \operatorname{Re} \mathcal{K}. $$

¹ A state $\omega$ over $\mathcal{C}(\mathcal{K}, \gamma)$ is a linear functional with $\omega(\mathbf{1}) = 1$ and $\omega(a^* a) \ge 0$, $a \in \mathcal{C}(\mathcal{K}, \gamma)$.

The Weyl operators generate a simple C*-algebra $\mathcal{W}(\mathcal{K}, \gamma)$ which acts irreducibly on $\mathcal{F}_s(\mathcal{K}_1)$. If $\mathcal{H}$ is a subspace of $\mathcal{K}$ with $\mathcal{H} = \mathcal{H}^*$, then the C*-algebra generated by all $w(f)$ with $f \in \operatorname{Re} \mathcal{H}$ is denoted by $\mathcal{W}(\mathcal{H})$. If $\mathcal{H}'$ is the orthogonal complement of $\mathcal{H}$ with respect to $\gamma$, then duality holds [1, 2, 4]:

$$ \mathcal{W}(\mathcal{H})' = \mathcal{W}(\mathcal{H}')'' \tag{2.3} $$

(a prime denotes the commutant).

Lemma 2.1. For $f \in \mathcal{K}$, let $\mathcal{H}_f$ be the subspace spanned by $f$ and $f^*$. Then the closure of $\pi_{P_1}(f)$ is affiliated with $\mathcal{W}(\mathcal{H}_f)''$.

Proof. Let $T$ be the closure of $\pi_{P_1}(f)$, with domain $D(T)$. We have to show that, for any $A \in \mathcal{W}(\mathcal{H}_f)'$,

$$ A(D(T)) \subset D(T), \qquad AT = TA \ \text{ on } D(T). $$

Now by virtue of the CCR (2.1), $\|T\phi\|^2 = \|T^*\phi\|^2 + \gamma(f, f)\|\phi\|^2$ for $\phi \in \mathcal{D}$. Hence, for a given Cauchy sequence $\phi_n \in \mathcal{D}$, $T\phi_n$ converges if and only if $T^*\phi_n$ does. This implies that $D(T) = D(T^*)$.

Let $f^\pm \in \operatorname{Re} \mathcal{H}_f$ be defined as $f^+ \equiv \frac{1}{2}(f + f^*)$, $f^- \equiv \frac{i}{2}(f - f^*)$, and let $T^\pm$ be the (self-adjoint) closure of $\pi_{P_1}(f^\pm)$. We claim that

$$ D(T) = D(T^+) \cap D(T^-), \qquad T = T^+ - iT^- \ \text{ on } D(T). $$

For if $\phi \in D(T)$, then there exists a sequence $\phi_n \in \mathcal{D}$ converging to $\phi$ such that $\pi_{P_1}(f)\phi_n$ and $\pi_{P_1}(f^*)\phi_n$ converge. Thus $\phi$ belongs to the domain of the closure of $\pi_{P_1}(f^\pm)$. Conversely, if $\phi \in D(T^+) \cap D(T^-)$, then there exists a sequence $\phi_n \in \mathcal{D}$ converging to $\phi$ such that both $\pi_{P_1}(f + f^*)\phi_n$ and $\pi_{P_1}(f - f^*)\phi_n$ converge (cf. [15]). Therefore $\pi_{P_1}(f)\phi_n$ is also convergent, i.e. $\phi$ is contained in $D(T)$, and $T\phi = (T^+ - iT^-)\phi$. Now if $A \in \mathcal{W}(\mathcal{H}_f)'$, then $A$ commutes with the one-parameter unitary groups $w(tf^\pm) = \exp(itT^\pm)$. As a consequence, $A$ leaves $D(T^\pm)$ invariant and commutes with $T^\pm$ on $D(T^\pm)$. It follows that $A(D(T)) \subset D(T)$ and $AT = TA$ on $D(T)$ as was to be shown.

3. Implementability of Endomorphisms

Bogoliubov endomorphisms are the unital *-endomorphisms of $\mathcal{C}(\mathcal{K}, \gamma)$ which map $\mathcal{K}$, viewed as a subspace of $\mathcal{C}(\mathcal{K}, \gamma)$, into itself. They are completely determined by their restrictions to $\mathcal{K}$ which are called Bogoliubov operators. Hence $V \in B(\mathcal{K})$ is a Bogoliubov operator if and only if it commutes with complex conjugation and preserves the hermitian form $\gamma$.² Bogoliubov operators form a unital semigroup denoted by

$$ \mathcal{S}(\mathcal{K}, \gamma) \equiv \{V \in B(\mathcal{K}) \mid \bar{V} = V,\ V^\dagger V = \mathbf{1}\}. $$

² We may disregard unbounded Bogoliubov operators $V$ (defined on $\mathcal{K}^0$) since the topologies induced by the corresponding states $\omega_{P_1} \circ \varrho_V$ on $\mathcal{K}^0$ differ from the one induced by $\omega_{P_1}$. Hence these states cannot be quasi-equivalent to $\omega_{P_1}$ (cf. [2, 4]), and $\varrho_V$ cannot be implemented.

Each $V \in \mathcal{S}(\mathcal{K}, \gamma)$ extends to a unique Bogoliubov endomorphism of $\mathcal{C}(\mathcal{K}, \gamma)$ and to a unique *-endomorphism of $\mathcal{W}(\mathcal{K}, \gamma)$. By abuse of notation, both endomorphisms are denoted by $\varrho_V$, so that $\varrho_V(f) = Vf$, $f \in \mathcal{K}$, and $\varrho_V(w(g)) = w(Vg)$, $g \in \operatorname{Re} \mathcal{K}$. The condition $V^\dagger V = \mathbf{1}$ entails that $V$ is injective and $V^*$ surjective; hence $\operatorname{ran} V$ is closed, and $V$ is a semi-Fredholm operator [11]. We claim that the Fredholm index $-\operatorname{ind} V = \dim \ker V^\dagger$ cannot be odd, in contrast to the CAR case [5]. For let $f \in \ker V^\dagger$ such that $0 = \gamma(f, g) \equiv \langle f, Cg \rangle$ for all $g \in \ker V^\dagger$. Then $f \in (C \ker V^\dagger)^\perp = (\ker V^*)^\perp = \operatorname{ran} V$, but $\operatorname{ran} V \cap \ker V^\dagger = \{0\}$ due to $V^\dagger V = \mathbf{1}$, so $f$ has to vanish. This shows that the restriction of $\gamma$ to $\ker V^\dagger$ stays nondegenerate. It follows that $\dim \ker V^\dagger$ cannot be odd (there is no nondegenerate symplectic form on an odd-dimensional space). On the other hand, each even number (and $\infty$) occurs as $\dim \ker V^\dagger$ for some $V$. Hence we have an epimorphism of semigroups

$$ \mathcal{S}(\mathcal{K}, \gamma) \to \mathbb{N} \cup \{\infty\}, \qquad V \mapsto -\tfrac{1}{2} \operatorname{ind} V = \tfrac{1}{2} \dim \ker V^\dagger $$

($0 \in \mathbb{N}$ by convention). Let

$$ \mathcal{S}^n(\mathcal{K}, \gamma) \equiv \{V \in \mathcal{S}(\mathcal{K}, \gamma) \mid \operatorname{ind} V = -2n\}, \qquad n \in \mathbb{N} \cup \{\infty\}. $$
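As a sanity check on the two defining conditions of a Bogoliubov operator, here is a small Python sketch in a two-dimensional toy model (K = ℂ², conjugation swapping the two coordinates, C = diag(1, −1)). This finite-dimensional setting is an assumption made purely for illustration: in it every Bogoliubov operator is invertible, so only the index-zero component (the automorphisms) is visible. The matrix V below has hyperbolic entries a = 5/3, b = 4/3 with a² − b² = 1, and all helper names are hypothetical.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def hermitian(X):
    """Hilbert space adjoint (conjugate transpose)."""
    return [[X[j][i].conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]


def entrywise_conj(X):
    return [[x.conjugate() for x in row] for row in X]


C = [[1, 0], [0, -1]]     # C = P1 - P2 in the toy model
J = [[0, 1], [1, 0]]      # conjugation: f^* = J conj(f)
I2 = [[1, 0], [0, 1]]


def dagger(V):
    """Adjoint relative to gamma: V^dagger = C V^* C."""
    return matmul(C, matmul(hermitian(V), C))


def bar(V):
    """Complex conjugate of V: bar(V) = J conj(V) J."""
    return matmul(J, matmul(entrywise_conj(V), J))


def is_bogoliubov(V):
    """Check bar(V) = V and V^dagger V = 1 exactly."""
    return bar(V) == V and matmul(dagger(V), V) == I2


a, b = Fraction(5, 3), Fraction(4, 3)   # a^2 - b^2 = 1
V = [[a, b], [b, a]]
```

With exact rational arithmetic, `is_bogoliubov(V)` and `is_bogoliubov(matmul(V, V))` both hold, which illustrates the semigroup property. The scalar matrix 2·1 commutes with conjugation but fails to preserve γ, so it is rejected.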

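$\mathcal{S}^0(\mathcal{K}, \gamma)$ is the group of Bogoliubov automorphisms (isomorphic to the symplectic group of $\operatorname{Re} \mathcal{K}$). It acts on $\mathcal{S}(\mathcal{K}, \gamma)$ by left multiplication. Analogous to the CAR case, the orbits under this action are the subsets $\mathcal{S}^n(\mathcal{K}, \gamma)$, and the stabilizer of $V \in \mathcal{S}^n(\mathcal{K}, \gamma)$ is isomorphic to the symplectic group $Sp(n)$. We are interested in endomorphisms $\varrho_V$ which can be implemented by Hilbert spaces of isometries on $\mathcal{F}_s(\mathcal{K}_1)$. This means that there exist isometries $\Psi_j$ on $\mathcal{F}_s(\mathcal{K}_1)$ which fulfill the Cuntz algebra relations (1.1) and implement $\varrho_V$ according to (1.2),

$$ \varrho_V(w(f)) = \sum_j \Psi_j w(f) \Psi_j^*, \qquad f \in \operatorname{Re} \mathcal{K}. $$

As explained in [5], such isometries exist if and only if $\varrho_V$, viewed as a representation of $\mathcal{W}(\mathcal{K}, \gamma)$ on $\mathcal{F}_s(\mathcal{K}_1)$, is quasi-equivalent to the defining (Fock) representation. To study $\varrho_V$ as a representation, for fixed $V \in \mathcal{S}(\mathcal{K}, \gamma)$, let us decompose it into cyclic subrepresentations. Let $e_1, e_2, \dots$ be an orthonormal basis in $\mathcal{K}_1 \cap \ker V^\dagger$ and let $\alpha = (\alpha_1, \dots, \alpha_l)$ be a multi-index with $\alpha_j \le \alpha_{j+1}$. Such $\alpha$ has the form

$$ \alpha = (\underbrace{\alpha'_1, \dots, \alpha'_1}_{l_1}, \underbrace{\alpha'_2, \dots, \alpha'_2}_{l_2}, \dots, \underbrace{\alpha'_r, \dots, \alpha'_r}_{l_r}) \tag{3.1} $$

with $\alpha'_1 < \alpha'_2 < \cdots < \alpha'_r$ and $l_1 + \cdots + l_r = l$. Let

$$ \phi_\alpha \equiv (l_1! \cdots l_r!)^{-\frac{1}{2}} a^*(e_{\alpha_1}) \cdots a^*(e_{\alpha_l}) \Omega_{P_1}, \qquad \mathcal{F}_\alpha \equiv \overline{\mathcal{W}(\operatorname{ran} V)\phi_\alpha}, \qquad \pi_\alpha \equiv \varrho_V|_{\mathcal{F}_\alpha}. $$

Lemma 3.1. $\varrho_V = \oplus_\alpha \pi_\alpha$, where the sum extends over all multi-indices $\alpha$ as above, including $\alpha = 0$ ($\phi_0 \equiv \Omega_{P_1}$). Each $(\pi_\alpha, \mathcal{F}_\alpha, \phi_\alpha)$ is a GNS representation for $\omega_{P_1} \circ \varrho_V$ (regarded as a state over $\mathcal{W}(\mathcal{K}, \gamma)$).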


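Proof. By definition, the $\phi_\alpha$ constitute an orthonormal basis for $F_s(K_1 \cap \ker V^\dagger)$, and $(\pi_\alpha, F_\alpha, \phi_\alpha)$ is a cyclic representation of $W(K,\gamma)$. Since the closures of $a^*(e_j)$ and $a(e_j)$ are affiliated with $W(\ker V^\dagger)'' = W(\operatorname{ran} V)'$ (see Lemma 2.1 and (2.3)), there holds for $f \in \operatorname{Re} K$, with $N_\alpha \equiv (l_1! \cdots l_r!)^{-1}$,
$$\begin{aligned}
\langle\phi_\alpha, \pi_\alpha(w(f))\phi_\alpha\rangle
&= N_\alpha \langle a^*(e_{\alpha_1}) \cdots a^*(e_{\alpha_l})\Omega_{P_1},\ w(Vf)\,a^*(e_{\alpha_1}) \cdots a^*(e_{\alpha_l})\Omega_{P_1}\rangle \\
&= N_\alpha \langle \Omega_{P_1},\ w(Vf)\,\underbrace{a(e_{\alpha_l}) \cdots a(e_{\alpha_1})\,a^*(e_{\alpha_1}) \cdots a^*(e_{\alpha_l})\Omega_{P_1}}_{N_\alpha^{-1}\Omega_{P_1}}\rangle \\
&= \langle \Omega_{P_1}, w(Vf)\Omega_{P_1}\rangle.
\end{aligned}$$
This proves that $(\pi_\alpha, F_\alpha, \phi_\alpha)$ is a GNS representation for $\omega_{P_1}\circ\varrho_V$. Similarly, one finds that $\langle\phi_\alpha, w(Vf)\phi_{\alpha'}\rangle = 0$ for $\alpha \ne \alpha'$, so the $F_\alpha$ are mutually orthogonal.

It remains to show that $\oplus_\alpha F_\alpha = F_s(K_1)$. We claim that $F_0$ equals $F_s(\overline{\operatorname{ran} P_1V})$, the symmetric Fock space over the closure of $\operatorname{ran} P_1V$. The inclusion $F_0 \subset F_s(\overline{\operatorname{ran} P_1V})$ holds because vectors of the form
$$w(Vf)\Omega_{P_1} = \exp\bigl(i\,(a^*(P_1Vf) + a(P_1Vf))\bigr)\Omega_{P_1} \in F_s(\overline{\operatorname{ran} P_1V})$$
are total in $F_0$. The converse inclusion may be proved inductively. Assume that $a^*(g_1) \cdots a^*(g_m)\Omega_{P_1}$ is contained in $F_0$ for all $m \le n$, $g_1, \dots, g_m \in \operatorname{ran} P_1V$. Then, for $f \in V(\operatorname{Re} K)$ and $g_1, \dots, g_n \in \operatorname{ran} P_1V$,
$$\frac{1}{it}\bigl(w(tf) - 1\bigr)\,a^*(g_1) \cdots a^*(g_n)\Omega_{P_1}$$
has a limit
$$a^*(P_1f)\,a^*(g_1) \cdots a^*(g_n)\Omega_{P_1} + a(P_1f)\,a^*(g_1) \cdots a^*(g_n)\Omega_{P_1}$$
in $F_0$ as $t \searrow 0$. By assumption, the second term lies in $F_0$, and so does the first. Since each $g \in \operatorname{ran} P_1V$ is a linear combination of such $P_1f$, it follows that $a^*(g_1) \cdots a^*(g_{n+1})\Omega_{P_1}$ is contained in $F_0$ for arbitrary $g_j \in \operatorname{ran} P_1V$, and, by induction, for arbitrary $n \in \mathbb N$. But such vectors span a dense subspace in $F_s(\overline{\operatorname{ran} P_1V})$, so $F_0 = F_s(\overline{\operatorname{ran} P_1V})$ as claimed.

Finally, $K_1 \cap \ker V^\dagger$ equals $\ker V^*P_1$, where $V^*P_1$ is regarded as an operator from $K_1$ to $K$. Thus we have $K_1 = \overline{\operatorname{ran} P_1V} \oplus (K_1 \cap \ker V^\dagger)$ and $F_s(K_1) \cong F_0 \otimes F_s(K_1 \cap \ker V^\dagger)$. Under this isomorphism, $F_\alpha$ is identified with $F_0 \otimes (\mathbb C\phi_\alpha)$. Since the $\phi_\alpha$ form an orthonormal basis for $F_s(K_1 \cap \ker V^\dagger)$, the desired result $\oplus_\alpha F_\alpha = F_s(K_1)$ follows.

As a consequence, the representation $\varrho_V$ is quasi-equivalent to the GNS representation associated with the quasi-free state $\omega_{P_1}\circ\varrho_V$. So $\varrho_V$ is implementable if and only if $\omega_{P_1}\circ\varrho_V$ and $\omega_{P_1}$ (i.e. their GNS representations) are quasi-equivalent. Now the two-point function of $\omega_{P_1}\circ\varrho_V$ (as a state over $C(K,\gamma)$) is given by
$$\omega_{P_1}\circ\varrho_V(f^*g) = \gamma(f, Sg) = \langle f, \tilde S g\rangle, \qquad f, g \in K,$$
with
$$S \equiv V^\dagger P_1 V, \qquad \tilde S \equiv V^* P_1 V.$$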

The latter operators contain valuable information about $\omega_{P_1}\circ\varrho_V$. For example, it can be shown (cf. [13]) that $\omega_{P_1}\circ\varrho_V$ is a pure state over $W(K,\gamma)$ if and only if $S$ is a basis projection, that is, if and only if $S$ is idempotent (the remaining conditions in (2.2) are automatically fulfilled). This is further equivalent to $[P_1, VV^\dagger] = 0$, by the following chain of equivalences:
$$\begin{aligned}
S^2 = S &\iff 0 = S\bar S && (\text{since } \bar S = 1 - S) \\
&\iff 0 = V^*P_1VCV^*P_2V \\
&\iff 0 = P_1VCV^*P_2 && (\text{since } \operatorname{ran} V^*P_2V = \operatorname{ran} V^*P_2 \text{ and } \ker V^*P_1V = \ker P_1V) \\
&\iff 0 = P_1VV^\dagger P_2 \\
&\iff 0 = [P_1, VV^\dagger].
\end{aligned}$$


On the other hand, the criterion for quasi-equivalence of quasi-free states, in the form given by Araki and Yamagami [4], states that $\omega_{P_1}\circ\varrho_V$ is quasi-equivalent to $\omega_{P_1}$ if and only if $P_1 - \tilde S^{1/2}$ is a Hilbert–Schmidt operator on $K$. This condition can be simplified in the present context, as the following result shows.

Theorem 3.2. Let a Bogoliubov operator $V \in \mathcal S(K,\gamma)$ be given. Then there exists a Hilbert space of isometries $H(\varrho_V)$ which implements the endomorphism $\varrho_V$ in the Fock representation determined by the basis projection $P_1$ if and only if $[P_1, V]$ (or, equivalently, $V_{12}$) is a Hilbert–Schmidt operator. The dimension of $H(\varrho_V)$ is 1 if $\operatorname{ind} V = 0$, otherwise $\infty$.

Proof. First note that $[P_1, V] = V_{12} - V_{21} = V_{12} - \bar V_{12}$ is Hilbert–Schmidt (HS) if and only if $V_{12}$ is.

By the preceding discussion, $\varrho_V$ is implementable if and only if $P_1 - \tilde S^{1/2}$ is HS. In this case, $P_2(P_1 - \tilde S^{1/2})^2P_2 = P_2\tilde SP_2 = V_{12}^*V_{12}$ is of trace class, hence $V_{12}$ is HS.

Conversely, assume $V_{12}$ to be HS. Let $V = V'|V|$ be the polar decomposition of $V$. Then $|V| = \overline{|V|}$ is a bounded bijection with a bounded inverse, and
$$|V| - 1 = (|V|^2 - 1)(|V| + 1)^{-1} = (V^* - V^\dagger)V(|V| + 1)^{-1} = 2(V_{12}^* + V_{21}^*)V(|V| + 1)^{-1}$$
is HS. Thus, by a corollary [4] of an inequality of Araki and Yamagami [3], $(|V|A|V|)^{1/2} - A^{1/2}$ is HS for any positive $A \in B(K)$. Applying this to $A = V'^*P_1V'$, we get that
$$\tilde S^{1/2} - (V'^*P_1V')^{1/2} \ \text{is HS}. \tag{3.2}$$
Now $V'$ is an isometry with $V' = \bar V'$, i.e. a CAR Bogoliubov operator [5]. Since $[P_1, V]$ and
$$[P_1, |V|^{-1}] = |V|^{-1}\bigl[|V|, P_1\bigr]|V|^{-1} = |V|^{-1}\bigl[|V| - 1, P_1\bigr]|V|^{-1}$$
are HS, the same holds true for $[P_1, V'] = [P_1, V|V|^{-1}]$. So $V'$ fulfills the implementability condition for CAR Bogoliubov operators derived in [5], and, as shown there, this forces $P_1 - (V'^*P_1V')^{1/2}$ to be HS. This, together with (3.2), implies that $P_1 - \tilde S^{1/2}$ is HS as claimed.

It remains to prove the statement about $\dim H(\varrho_V)$. Let $\tilde\varrho_V$ be the normal extension of $\varrho_V$ to $B(F_s(K_1))$. Then
$$B(H(\varrho_V)) \cong \tilde\varrho_V(B(F_s(K_1)))' = \varrho_V(W(K,\gamma))' = W(\operatorname{ran} V)' = W(\ker V^\dagger)''.$$
The latter (and hence $H(\varrho_V)$) is one-dimensional if $\ker V^\dagger = \{0\}$ and infinite-dimensional if $\ker V^\dagger \ne \{0\}$.

Remark. Shale's original result [17] asserts that a Bogoliubov automorphism $\varrho_V$, $V \in \mathcal S^0(K,\gamma)$, is implementable if and only if $|V| - 1$ is HS. This condition is equivalent to $[P_1, V]$ being HS, not only for $V \in \mathcal S^0(K,\gamma)$, but for all $V \in \mathcal S(K,\gamma)$ with $-\operatorname{ind} V < \infty$. However, the two conditions are not equivalent for $V \in \mathcal S^\infty(K,\gamma)$, as the following example shows. Let $K_1 = \mathcal H \oplus \mathcal H'$ be a decomposition into infinite-dimensional subspaces. Choose an operator $V_{12}$ from $K_2$ to $\mathcal H$ with $\operatorname{tr}|V_{12}|^4 < \infty$, but $\operatorname{tr}|V_{12}|^2 = \infty$. Let $V_{21} \equiv \bar V_{12}$ and $|V_{11}| \equiv (P_1 + |V_{21}|^2)^{1/2}$. Choose an isometry $v_{11}$ from $K_1$ to $\mathcal H'$ and set $V_{11} \equiv v_{11}|V_{11}|$, $V_{22} \equiv \bar V_{11}$. These components define a Bogoliubov operator $V \in \mathcal S^\infty(K,\gamma)$ (cf. (3.4a)–(3.4d) below) which violates the condition of Theorem 3.2. But it fulfills Shale's condition since $|V|^2 - 1 = 2(|V_{12}|^2 + |V_{21}|^2)$ is HS and since $|V| - 1 = (|V|^2 - 1)(|V| + 1)^{-1}$.

Let $V \in \mathcal S(K,\gamma)$ with $V_{12}$ compact. Due to stability under compact perturbations [11], $V_{11}$ and $V_{22} = \bar V_{11}$ are semi-Fredholm with


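$$\operatorname{ind} V_{11} = \operatorname{ind} V_{22} = \tfrac12 \operatorname{ind} V. \tag{3.3}$$

We will occasionally use the relation $V^\dagger V = 1$ componentwise:
$$\begin{aligned}
V_{11}^*V_{11} - V_{21}^*V_{21} &= P_1, && \text{(3.4a)} \\
V_{22}^*V_{22} - V_{12}^*V_{12} &= P_2, && \text{(3.4b)} \\
V_{11}^*V_{12} - V_{21}^*V_{22} &= 0, && \text{(3.4c)} \\
V_{22}^*V_{21} - V_{12}^*V_{11} &= 0. && \text{(3.4d)}
\end{aligned}$$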
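The componentwise relations can be checked numerically in the simplest finite-dimensional toy model. The sketch below assumes a one-mode model that is not part of the text: $K = \mathbb C^2$ with $K_1$ and $K_2$ spanned by the two standard basis vectors, conjugation $(x, y)^* = (\bar y, \bar x)$, and $C = \operatorname{diag}(1, -1)$. All function names are hypothetical.

```python
import cmath
import math

C = [[1, 0], [0, -1]]  # C = P1 - P2 in the one-mode toy model
J = [[0, 1], [1, 0]]   # exchanges K1 and K2


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def adjoint(A):
    """Hilbert space adjoint A^* (conjugate transpose)."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]


def gamma_adjoint(A):
    """Adjoint with respect to gamma: A^dagger = C A^* C."""
    return matmul(C, matmul(adjoint(A), C))


def bar(A):
    """Complex conjugate operator: bar(A) f = (A f^*)^*."""
    conj = [[complex(x).conjugate() for x in row] for row in A]
    return matmul(J, matmul(conj, J))


def one_mode_bogoliubov(t, phi):
    """V = [[a, b], [conj(b), conj(a)]] with |a|^2 - |b|^2 = 1."""
    a = math.cosh(t) * cmath.exp(1j * phi)
    b = complex(math.sinh(t))
    return [[a, b], [b.conjugate(), a.conjugate()]]


def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))


V = one_mode_bogoliubov(0.7, 0.3)
V11, V12, V21, V22 = V[0][0], V[0][1], V[1][0], V[1][1]
print(close(bar(V), V))                                          # bar(V) = V
print(close(matmul(gamma_adjoint(V), V), [[1, 0], [0, 1]]))      # V^dagger V = 1
print(abs(abs(V11) ** 2 - abs(V21) ** 2 - 1) < 1e-12)            # (3.4a)
print(abs(V11.conjugate() * V12 - V21.conjugate() * V22) < 1e-12)  # (3.4c)
```

In this finite-dimensional model every such $V$ is invertible, so only the automorphism case ($\operatorname{ind} V = 0$) is illustrated.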

Since $V_{11}$ is injective by (3.4a) and has closed range, we may define a bounded operator $V_{11}^{-1}$ as the inverse of $V_{11}$ on $\operatorname{ran} V_{11}$ and as zero on $\ker V_{11}^*$ (the same applies to $V_{22}$). These operators will be needed later. Note that $\dim\ker V_{11}^* = -\tfrac12\operatorname{ind} V$.

4. On the Semigroup of Implementable Endomorphisms

According to Theorem 3.2, the semigroup of implementable Bogoliubov endomorphisms is isomorphic to the following semigroup of Bogoliubov operators:
$$\mathcal S_{P_1}(K,\gamma) \equiv \{V \in \mathcal S(K,\gamma) \mid V_{12} \text{ is Hilbert–Schmidt}\}.$$
$\mathcal S_{P_1}(K,\gamma)$ is a topological semigroup with respect to the metric $d_{P_1}(V, V') \equiv \|V - V'\| + \|V_{12} - V'_{12}\|_{\rm HS}$, where $\|\cdot\|_{\rm HS}$ denotes the Hilbert–Schmidt norm. It contains the closed subsemigroup of diagonal Bogoliubov operators
$$\mathcal S_{\rm diag}(K,\gamma) = \{V \in \mathcal S(K,\gamma) \mid [P_1, V] = 0\},$$
which is isomorphic to the semigroup of isometries of the Hilbert space $K_1$, via the map $V \mapsto V_{11}$. The Fredholm index yields a decomposition
$$\mathcal S_{P_1}(K,\gamma) = \bigcup_{n \in \mathbb N \cup \{\infty\}} \mathcal S^n_{P_1}(K,\gamma), \qquad \mathcal S^n_{P_1}(K,\gamma) \equiv \mathcal S_{P_1}(K,\gamma) \cap \mathcal S^n(K,\gamma).$$

The group $\mathcal S^0_{P_1}(K,\gamma)$ is usually called the restricted symplectic group [17, 16]. It has a natural normal subgroup
$$\mathcal S_{\rm HS}(K,\gamma) \equiv \{V \in \mathcal S(K,\gamma) \mid V - 1 \text{ is Hilbert–Schmidt}\} \subset \mathcal S^0_{P_1}(K,\gamma).$$
We will eventually show that each $V \in \mathcal S_{P_1}(K,\gamma)$ can be written as a product $V = UW$ with $U \in \mathcal S_{\rm HS}(K,\gamma)$ and $W \in \mathcal S_{\rm diag}(K,\gamma)$. Assume that such $U$ and $W$ exist. Then $P_V \equiv UP_1U^\dagger$ is a basis projection such that
$$P_1 - P_V \ \text{is Hilbert–Schmidt}, \qquad V^\dagger P_V V = P_1, \tag{4.1}$$
so the corresponding Fock state $\omega_{P_V}$ is unitarily equivalent to $\omega_{P_1}$ and fulfills $\omega_{P_V}\circ\varrho_V = \omega_{P_1}$.

In order to construct such basis projections, let us investigate the set $\mathfrak P_{P_1}$ of basis projections of $K$ which differ from $P_1$ only by a Hilbert–Schmidt operator. Let $\mathfrak E_{P_1}$ be the infinite-dimensional analogue of the open unit disk [18, 16], consisting of all Hilbert–Schmidt operators $Z$ from $K_1$ to $K_2$ which are symmetric in the sense that
$$Z = \bar Z^* \tag{4.2}$$
and have norm less than 1 (the latter condition is equivalent to $P_1 + Z^\dagger Z$ being positive definite on $K_1$, since $Z^\dagger = -Z^*$ and $Z$ is compact). Then the following is more or less well-known (cf. [16]).
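In the one-mode toy model the set of symmetric contractions is just the open unit disk, and formula (4.3) and the action (4.4') of Proposition 4.1 can be checked by direct computation. The sketch below assumes $K = \mathbb C^2$ with conjugation $(x, y)^* = (\bar y, \bar x)$ and $C = \operatorname{diag}(1, -1)$; this model and all function names are illustrative assumptions, not part of the text.

```python
import cmath
import math

C = [[1, 0], [0, -1]]
J = [[0, 1], [1, 0]]
P1 = [[1, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(A, B, sign=1):
    return [[A[i][j] + sign * B[i][j] for j in range(2)] for i in range(2)]


def gamma_adjoint(A):
    star = [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]
    return matmul(C, matmul(star, C))


def bar(A):
    conj = [[complex(x).conjugate() for x in row] for row in A]
    return matmul(J, matmul(conj, J))


def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))


def basis_projection(z):
    """P_Z = (P1 + Z)(P1 + Z^dagger Z)^(-1)(P1 + Z^dagger) for Z = z : K1 -> K2."""
    Z = [[0, 0], [z, 0]]
    Zd = gamma_adjoint(Z)
    M = add(P1, matmul(Zd, Z))        # equals (1 - |z|^2) P1
    Y = [[1 / M[0][0], 0], [0, 0]]    # inverse on K1, zero on K2
    return matmul(add(P1, Z), matmul(Y, add(P1, Zd)))


def moebius(U, z):
    """Action (4.4'): z -> (U21 + U22 z)(U11 + U12 z)^(-1)."""
    return (U[1][0] + U[1][1] * z) / (U[0][0] + U[0][1] * z)


z = 0.3 + 0.4j
P = basis_projection(z)
a = math.cosh(0.5) * cmath.exp(0.2j)
b = math.sinh(0.5) * cmath.exp(-0.9j)
U = [[a, b], [b.conjugate(), a.conjugate()]]
z_new = moebius(U, z)

print(close(matmul(P, P), P))                 # P is idempotent
print(close(bar(P), add(ONE, P, sign=-1)))    # bar(P) = 1 - P
print(abs(P[1][0] / P[0][0] - z) < 1e-12)     # P21 P11^(-1) recovers z
print(abs(z_new) < 1)                         # the disk is preserved
print(close(matmul(U, matmul(P, gamma_adjoint(U))), basis_projection(z_new)))
```

The last line checks that conjugating $P_Z$ by $U$ gives the projection attached to the transformed point, i.e. that the two actions are compatible in this model.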


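Proposition 4.1. $P \mapsto P_{21}P_{11}^{-1}$ defines a bijection from $\mathfrak P_{P_1}$ onto $\mathfrak E_{P_1}$, with inverse given by
$$Z \mapsto P_Z \equiv (P_1 + Z)(P_1 + Z^\dagger Z)^{-1}(P_1 + Z^\dagger). \tag{4.3}$$
The restricted symplectic group $\mathcal S^0_{P_1}(K,\gamma)$ acts transitively on either set, in a way compatible with the above bijection, through the formulas
$$P \mapsto UPU^\dagger, \tag{4.4}$$
$$Z \mapsto (U_{21} + U_{22}Z)(U_{11} + U_{12}Z)^{-1}. \tag{4.4'}$$
The restrictions of these actions to the subgroup $\mathcal S_{\rm HS}(K,\gamma)$ remain transitive, as follows from the fact that, for $Z \in \mathfrak E_{P_1}$,
$$U_Z \equiv (P_1 + Z)(P_1 + Z^\dagger Z)^{-1/2} + (P_2 - Z^\dagger)(P_2 + ZZ^\dagger)^{-1/2} \tag{4.5}$$
lies in $\mathcal S_{\rm HS}(K,\gamma)$ and fulfills $U_ZP_1U_Z^\dagger = P_Z$ (equivalently, under the action (4.4'), $U_Z$ takes $0 \in \mathfrak E_{P_1}$ to $Z$).

Proof. Having made $K$ into a Hilbert space, the conditions on $P$ to be a basis projection (2.2) may be rewritten as
$$P = P^\dagger = 1 - \bar P = P^2, \qquad CP \text{ is positive definite on } \operatorname{ran} P; \tag{4.6}$$
or, in components:
$$\begin{aligned}
& P_{11} = P_{11}^* = P_1 - \bar P_{22}, && \text{(4.7a)} \\
& P_{22} = P_{22}^* = P_2 - \bar P_{11}, && \text{(4.7b)} \\
& P_{21} = \bar P_{21}^* = -P_{12}^*, && \text{(4.7c)} \\
& P_{11}^2 - P_{11} = P_{21}^*P_{21}, && \text{(4.7d)} \\
& P_{22}^2 - P_{22} = P_{12}^*P_{12}, && \text{(4.7e)} \\
& (P_1 - P_{11})P_{12} = P_{12}P_{22}, && \text{(4.7f)} \\
& (P_2 - P_{22})P_{21} = P_{21}P_{11}, && \text{(4.7g)} \\
& \begin{pmatrix} P_{11} & P_{12} \\ -P_{21} & -P_{22} \end{pmatrix} \text{ is positive definite on } \operatorname{ran} P. && \text{(4.7h)}
\end{aligned}$$
Moreover, $P_1 - P$ is Hilbert–Schmidt if and only if $P_2P$ is.

Now let $P \in \mathfrak P_{P_1}$. Then $P_{22} \le 0$ by (4.7h), hence, by (4.7a), $P_{11} = P_1 - \bar P_{22} \ge P_1$ has a bounded inverse. Thus $Z \equiv P_{21}P_{11}^{-1}$ is a well-defined Hilbert–Schmidt operator. By (4.7a)–(4.7c) and (4.7g),
$$Z - \bar Z^* = P_{21}P_{11}^{-1} - \bar P_{11}^{-1}\bar P_{21}^* = \bar P_{11}^{-1}\bigl((P_2 - P_{22})P_{21} - P_{21}P_{11}\bigr)P_{11}^{-1} = 0,$$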


so $Z$ is symmetric in the sense of (4.2). Furthermore, by (4.7d),
$$P_1 - Z^*Z = P_1 - P_{11}^{-1}P_{21}^*P_{21}P_{11}^{-1} = P_1 - P_{11}^{-1}(P_{11}^2 - P_{11})P_{11}^{-1} = P_{11}^{-1} \tag{4.8}$$
is positive definite on $K_1$, which proves $Z \in \mathfrak E_{P_1}$.

Next let $Z \in \mathfrak E_{P_1}$ and let $P_Z$ be given by (4.3). We associate with $Z$ an operator
$$Y \equiv (P_1 + Z^\dagger Z)^{-1} = (P_1 - Z^*Z)^{-1} \tag{4.9}$$
which is bounded by assumption. Then $P_Z = P_Z^\dagger = P_Z^2$ since $(P_1 + Z^\dagger)(P_1 + Z) = Y^{-1}$. To prove that $P_Z + \bar P_Z = 1$ holds, note that $ZY^{-1} = \bar Y^{-1}Z$ and therefore $\bar YZ = ZY$, $YZ^\dagger = Z^\dagger\bar Y$. It follows that
$$\begin{aligned}
P_Z + \bar P_Z &= (P_1 + Z)Y(P_1 + Z^\dagger) + (P_2 - Z^\dagger)\bar Y(P_2 - Z) \\
&= Y + ZY + YZ^\dagger + ZZ^\dagger\bar Y + \bar Y - YZ^\dagger - ZY + Z^\dagger ZY \\
&= Y^{-1}Y + \bar Y^{-1}\bar Y \\
&= P_1 + P_2 = 1.
\end{aligned}$$
Since $P_2P_Z$ is clearly HS and since
$$CP_Z = (P_1 - Z)Y(P_1 - Z^*) \tag{4.10}$$
is positive definite on $\operatorname{ran} P_Z = \operatorname{ran}(P_1 + Z)$, we get that $P_Z \in \mathfrak P_{P_1}$ as desired.

To show that these two maps are mutually inverse, let first $Z \in \mathfrak E_{P_1}$. Then $(P_Z)_{21}(P_Z)_{11}^{-1} = ZYY^{-1} = Z$. Conversely, let $P \in \mathfrak P_{P_1}$ be given and set $Z \equiv P_{21}P_{11}^{-1}$. Then $ZP_{11} = P_{21}$ and $P_{11}Z^\dagger = P_{21}^\dagger = P_{12}$. By (4.8) and (4.9), $Y = P_{11}$, hence $P_{11}Z^\dagger = Z^\dagger\bar P_{11}$. Thus we get
$$\begin{aligned}
P - P_Z &= P - (P_1 + Z)P_{11}(P_1 + Z^\dagger) \\
&= P - P_{11} - ZP_{11} - P_{11}Z^\dagger - ZP_{11}Z^\dagger \\
&= P - P_{11} - P_{21} - P_{12} - ZZ^\dagger\bar P_{11} \\
&= P_{22} - ZZ^\dagger\bar P_{11} \\
&= P_2 - (P_2 + ZZ^\dagger)\bar P_{11} && (\text{by (4.7b)}) \\
&= 0.
\end{aligned}$$

It remains to prove the statements about the group actions. It is fairly obvious that $\mathcal S^0_{P_1}(K,\gamma)$ acts on $\mathfrak P_{P_1}$ via (4.4). The proof that $U_Z$ is a Bogoliubov operator which takes $P_1$ to $P_Z$ is also straightforward. To show that $U_Z \in \mathcal S_{\rm HS}(K,\gamma)$, let $Y$ be given by (4.9). Then
$$Y^{1/2} - P_1 = Y^{1/2}(P_1 - Y^{-1})(P_1 + Y^{-1/2})^{-1} = Y^{1/2}Z^*Z(P_1 + Y^{-1/2})^{-1}$$
is of trace class. Therefore
$$(U_Z - 1)P_1 = (P_1 + Z)Y^{1/2} - P_1 = Y^{1/2} - P_1 + ZY^{1/2}$$
is HS, which implies $U_Z \in \mathcal S_{\rm HS}(K,\gamma)$.


Finally we have to show that the action (4.4) on $\mathfrak P_{P_1}$ carries over to an action (4.4') on $\mathfrak E_{P_1}$. Thus, for given $Z \in \mathfrak E_{P_1}$ and $U \in \mathcal S^0_{P_1}(K,\gamma)$, we have to compute the operator $Z' = P'_{21}P'^{-1}_{11}$ which corresponds to $P' = UP_ZU^\dagger$. By definition,
$$P'_{21} = (U_{21} + U_{22}Z)Y(U_{11} + U_{12}Z)^*, \qquad P'_{11} = (U_{11} + U_{12}Z)Y(U_{11} + U_{12}Z)^*. \tag{4.11}$$
Suppose that $(U_{11} + U_{12}Z)f = 0$ for some $f \in K_1$. Then $\|f\| = \|U_{11}^{-1}U_{12}Zf\|$. Since
$$\|U_{11}^{-1}U_{12}\|^2 = \|U_{12}^*U_{11}^{-1*}U_{11}^{-1}U_{12}\| = \|U_{12}^*(P_1 + U_{12}U_{12}^*)^{-1}U_{12}\| = \frac{\|U_{12}\|^2}{1 + \|U_{12}\|^2} < 1$$
and $\|Z\| < 1$, it follows that $f = 0$. Hence $U_{11} + U_{12}Z$ is injective, and, as a Fredholm operator with vanishing index (3.3), it has a bounded inverse. So we get from (4.11) that $Z' = P'_{21}P'^{-1}_{11} = (U_{21} + U_{22}Z)(U_{11} + U_{12}Z)^{-1}$ as claimed.

The following construction will enable us to assign, in an unambiguous way, to each Bogoliubov operator $V \in \mathcal S_{P_1}(K,\gamma)$ a basis projection $P_V$ such that (4.1) holds.

Lemma 4.2. Let $\mathcal H \subset K$ be a closed *-invariant subspace such that $\gamma|_{\mathcal H \times \mathcal H}$ is nondegenerate and such that $[P_1, E]$ is Hilbert–Schmidt, where $E$ is the orthogonal projection onto $\mathcal H$. Let $A \equiv ECE$ be the self-adjoint operator, invertible on $\mathcal H$, such that $\gamma(f,g) = \langle f, Ag\rangle$, $f, g \in \mathcal H$, and let $A_\pm$ be the unique positive operators such that $A = A_+ - A_-$ and $A_+A_- = 0$. Further let $A^{-1}$ be defined as the inverse of $A$ on $\mathcal H$ and as zero on $\mathcal H^\perp$, and similarly for $A_\pm^{-1}$. Then $A^{-1}C$ is the $\gamma$-orthogonal projection onto $\mathcal H$, $P_+ \equiv A_+^{-1}C$ is a basis projection of $\mathcal H$, and $P_2P_+$ is Hilbert–Schmidt. Moreover, $P_+ = P_1E$ if and only if $[P_1, E] = 0$.

Proof. Let $E' \equiv 1 - E$. Since $ECE'$ and $E'CE$ are compact by assumption, $C - ECE' - E'CE = A + E'CE'$ is a Fredholm operator on $K$ with vanishing index. Hence $A$ is Fredholm on $\mathcal H$ with $\operatorname{ind} A = 0$. $A$ is injective since $\gamma$ is nondegenerate on $\mathcal H$. It is therefore a bounded bijection on $\mathcal H$ with a bounded inverse (the same holds true for $A_\pm$ as operators on $\operatorname{ran} A_\pm$). Thus $Q \equiv A^{-1}C$ is well-defined. It fulfills $Q^2 = A^{-1}(ECE)A^{-1}C = Q$ and $Q^\dagger = C(CA^{-1})C = Q$. So $Q$ is a projection, self-adjoint with respect to $\gamma$. Since its range equals $\operatorname{ran} A^{-1} = \mathcal H$, it is the $\gamma$-orthogonal projection onto $\mathcal H$.

By a similar argument, $P_+$ is also a $\gamma$-orthogonal projection. It is straightforward to see that $P_+ = P_1E$ if and only if $[P_1, E] = 0$. To show that $P_+$ is actually a basis projection of $\mathcal H$ (cf. (4.6)), note that $\bar A_+ = A_-$ because of $\bar A = -A$ (and uniqueness of $A_\pm$). This implies $P_+ + \bar P_+ = A_+^{-1}C - A_-^{-1}C = A^{-1}C = 1_{\mathcal H}$. Positive definiteness of $CP_+$ on $\operatorname{ran} P_+$ follows from $\langle f, CP_+f\rangle = \|A_+^{-1/2}Cf\|^2$.

To prove that $P_2P_+$ is HS, let $D \equiv EP_1E - A_+$. Since $EP_1E - EP_2E = A = A_+ - A_-$, we have $\bar D = D$. We claim that $D$ is of trace class. Since $ECE'$ is HS, $ECE'CE = EC(1 - E)CE = E - (ECE)^2 = E - A^2 = (E + |A|)(E - |A|)$ is of trace class. Since $E + |A|$ has a bounded inverse (as an operator on $\mathcal H$) and since $|A| = A_+ + A_-$, it follows that $E - |A| = EP_1E + EP_2E - A_+ - A_- = D + \bar D = 2D$ is


of trace class as claimed. As a consequence, $A_+P_2 = (EP_1E - D)P_2$ is HS ($P_1EP_2$ is HS by assumption). By boundedness of $A_+^{-1}$, $P_+P_2 = -A_+^{-2}(A_+P_2)$ and $P_2P_+ = (P_+P_2)^\dagger$ are also HS. This completes the proof.

Now let $V \in \mathcal S_{P_1}(K,\gamma)$. We already showed in Sect. 3 that the restriction of $\gamma$ to $\ker V^\dagger$ is nondegenerate. We also showed in the proof of Theorem 3.2 that $[P_1, V']$ is Hilbert–Schmidt, where $V'$ is the isometry arising from polar decomposition of $V$. Hence $[P_1, E]$ is Hilbert–Schmidt, where $E = C(1 - V'V'^*)C$ is the orthogonal projection onto $\ker V^\dagger$. Thus Lemma 4.2 applies to $\mathcal H = \ker V^\dagger$.

Definition 4.3. For $V \in \mathcal S_{P_1}(K,\gamma)$, let $P_{V+}$ be the basis projection of $\ker V^\dagger$ given by Lemma 4.2, and set
$$P_V \equiv VP_1V^\dagger + P_{V+} \in \mathfrak P_{P_1}, \qquad Z_V \equiv (P_V)_{21}(P_V)_{11}^{-1} \in \mathfrak E_{P_1}$$
(cf. Proposition 4.1). Further let $U_V \in \mathcal S_{\rm HS}(K,\gamma)$ be the Bogoliubov operator associated with $Z_V$ according to (4.5), and define $W_V \equiv U_V^\dagger V \in \mathcal S_{\rm diag}(K,\gamma)$.

$P_V$ clearly is a basis projection which satisfies (4.1). Actually, any basis projection $P$ fulfilling $V^\dagger PV = P_1$ or, equivalently, $PV = VP_1$, is of the form $P = VP_1V^\dagger + P'$, where $P'$ is some basis projection of $\ker V^\dagger$. What had to be proved above is that $P'$ can be chosen such that $P_2P'$ is Hilbert–Schmidt, in the case $\dim\ker V^\dagger = \infty$. In fact, any such choice would suffice for what follows.

The condition $V^\dagger P_VV = P_1$ translates into the condition
$$Z_VV_{11} = V_{21} \tag{4.12}$$
for $Z_V$. Again, each $Z \in \mathfrak E_{P_1}$ fulfilling (4.12) would do, but we prefer to have a definite choice. It follows from symmetry (4.2) that any $Z$ which solves (4.12) must have the form
$$Z = V_{21}V_{11}^{-1} + V_{22}^{-1*}V_{12}^*P_{\ker V_{11}^*} + Z', \tag{4.13}$$
where $P_{\mathcal H}$ denotes the orthogonal projection onto some closed subspace $\mathcal H \subset K$, $V_{11}^{-1}$ and $V_{22}^{-1}$ have been defined below (3.4), and $Z'$ is a symmetric Hilbert–Schmidt operator from $\ker V_{11}^*$ to $\ker V_{22}^*$. The freedom in the choice of $Z'$ corresponds to the freedom in the choice of $P'$. Note that $Z$ can be written, with respect to the decompositions $K_1 = \operatorname{ran} V_{11} \oplus \ker V_{11}^*$, $K_2 = \operatorname{ran} V_{22} \oplus \ker V_{22}^*$, as
$$Z = \begin{pmatrix} P_{\operatorname{ran} V_{22}}V_{21}V_{11}^{-1} & V_{22}^{-1*}V_{12}^*P_{\ker V_{11}^*} \\ P_{\ker V_{22}^*}V_{21}V_{11}^{-1} & Z' \end{pmatrix}. \tag{4.13'}$$
The Hilbert–Schmidt norm of $Z$ is minimized by choosing $Z' = 0$, but there are examples in which this choice violates the condition $\|Z\| < 1$, i.e. it does not always define an element of $\mathfrak E_{P_1}$. This is in contrast to the CAR case, where the choice analogous to $Z' = 0$ appears to be natural [5]. As we shall see in Sect. 5, $Z_V$ describes the values of implementers on the Fock vacuum.

The operators $U_V$ and $W_V$ constitute the product decomposition of $V$ that was announced earlier. $W_V$ is diagonal because $P_1W_V = P_1U_V^\dagger V = U_V^\dagger P_VV = U_V^\dagger VP_1 = W_VP_1$. Explicitly, one computes that

$$W_V = \begin{pmatrix} (P_1 + Z_V^\dagger Z_V)^{1/2}V_{11} & 0 \\ 0 & (P_2 + Z_VZ_V^\dagger)^{1/2}V_{22} \end{pmatrix}$$

with respect to the decomposition $K = K_1 \oplus K_2$. Let us summarize the properties of these operators.

Proposition 4.4. Definition 4.3 establishes a decomposition of $V \in \mathcal S_{P_1}(K,\gamma)$, $V = U_VW_V$, where $U_V \in \mathcal S_{\rm HS}(K,\gamma)$ and $W_V \in \mathcal S_{\rm diag}(K,\gamma)$ have the properties
$$\begin{aligned}
\operatorname{ind} U_V &= 0, & Z_{U_V} &= Z_V, & P_{U_V} &= P_V; \\
\operatorname{ind} W_V &= \operatorname{ind} V, & Z_{W_V} &= 0, & P_{W_V} &= P_1.
\end{aligned}$$
In particular, if $V \in \mathcal S^0_{P_1}(K,\gamma)$, then
$$U_V = \begin{pmatrix} |V_{11}^*| & V_{12}v_{22}^* \\ V_{21}v_{11}^* & |V_{22}^*| \end{pmatrix}, \qquad W_V = \begin{pmatrix} v_{11} & 0 \\ 0 & v_{22} \end{pmatrix},$$
where $v_{11} \equiv V_{11}|V_{11}|^{-1}$ and $v_{22} = \bar v_{11}$ are the unitary parts of $V_{11}$ and $V_{22}$; whereas if $V \in \mathcal S_{\rm diag}(K,\gamma)$, then $U_V = 1$ and $W_V = V$.

Remark. The product decomposition described above is the generalization to the infinite-dimensional case of a construction given by Maaß [12]. The exact analogue of the construction given in [5] in the fermionic case would be to define $W' \in \mathcal S_{\rm diag}(K,\gamma)$ through $W'_{11} \equiv V_{11}|V_{11}|^{-1}$ (the isometric part of $V_{11}$), to choose a Bogoliubov operator $u'$ from $\ker W'^\dagger$ to $\ker V^\dagger$ such that $u'P_1 = P_Vu'$, and to set $U' \equiv VW'^\dagger + u' \in \mathcal S^0_{P_1}(K,\gamma)$. Then $U'$ and $W'$ would also have the properties listed in Proposition 4.4, with the exception that $U' - 1$ is not necessarily Hilbert–Schmidt. On the other hand, this choice has the merit that the definition of $W'$ is completely canonical (independent of the choice of $Z$). Though it was not shown in [5], it holds true also in the CAR case that each implementable Bogoliubov operator can be written as a product of two factors where the first differs from 1 only by a Hilbert–Schmidt part, and the second is diagonal.
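For an automorphism, the explicit formulas of Proposition 4.4 can be checked in a one-mode toy model. The sketch assumes $K = \mathbb C^2$, $C = \operatorname{diag}(1, -1)$ and $V = \begin{pmatrix} a & b \\ \bar b & \bar a \end{pmatrix}$ with $|a|^2 - |b|^2 = 1$; the model and the function names `decompose` and `u_from_z` are illustrative assumptions, not the author's notation.

```python
import cmath
import math

C = [[1, 0], [0, -1]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def gamma_adjoint(A):
    star = [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]
    return matmul(C, matmul(star, C))


def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))


def decompose(V):
    """U_V and W_V from the formulas for V in S^0_{P1}, specialised to one mode."""
    V11, V12, V21, V22 = V[0][0], V[0][1], V[1][0], V[1][1]
    v11 = V11 / abs(V11)        # unitary part of V11
    v22 = v11.conjugate()       # v22 = bar(v11)
    U = [[abs(V11), V12 * v22.conjugate()],
         [V21 * v11.conjugate(), abs(V22)]]
    W = [[v11, 0], [0, v22]]
    return U, W


def u_from_z(z):
    """U_Z of (4.5) in one mode: (1 - |z|^2)^(-1/2) [[1, conj(z)], [z, 1]]."""
    s = (1 - abs(z) ** 2) ** -0.5
    return [[s, s * z.conjugate()], [s * z, s]]


a = math.cosh(0.8) * cmath.exp(1.1j)
b = math.sinh(0.8) * cmath.exp(-0.4j)
V = [[a, b], [b.conjugate(), a.conjugate()]]
U, W = decompose(V)
z_V = V[1][0] / V[0][0]         # Z_V = V21 V11^(-1), cf. (4.12)

print(close(matmul(U, W), V))                                # V = U_V W_V
print(close(matmul(gamma_adjoint(U), U), [[1, 0], [0, 1]]))  # U_V is Bogoliubov
print(close(U, u_from_z(z_V)))                               # U_V = U_Z with Z = Z_V
```

In this model $W_V$ is a pure phase rotation, while all of the mixing between $K_1$ and $K_2$ sits in $U_V$.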

Corollary 4.5. $\mathcal S_{P_1}(K,\gamma) = \mathcal S_{\rm HS}(K,\gamma)\cdot\mathcal S_{\rm diag}(K,\gamma)$. The orbits of the action of $\mathcal S^0_{P_1}(K,\gamma)$ on $\mathcal S_{P_1}(K,\gamma)$ are the subsets $\mathcal S^n_{P_1}(K,\gamma)$, $n \in \mathbb N \cup \{\infty\}$. They coincide with the connected components of $\mathcal S_{P_1}(K,\gamma)$.

5. Normal Form of Cuntz Algebra Generators

The first step in the construction of implementers consists in a generalization of the definition of “bilinear Hamiltonians” [2] from the finite rank case to the case of bounded operators. If $H$ is a finite rank operator on $K$ such that $\bar H = H^* = -H$, then $e^{HC}$ belongs to $\mathcal S_{\rm HS}(K,\gamma)$. Expanding $H = \sum f_j\langle g_j, \cdot\rangle$, one obtains a skew-adjoint element $b_0(H) \equiv \sum f_jg_j^*$ of $C(K,\gamma)$ which is a linear function of $H$, independent of the choice of $f_j, g_j \in K$. Then $\pi_{P_1}(b_0(H))$ is essentially skew-adjoint on $D$, and, if $b(H)$ denotes


its closure, $\exp(\tfrac12 b(H))$ is a unitary which implements the automorphism induced by $e^{HC}$ [2, 4].

Using Wick ordering, the definition of bilinear Hamiltonians can be extended to arbitrary bounded operators $H$ which are symmetric in the sense of (4.2) (see footnote 3):
$$H_{11} = \bar H_{22}^*, \qquad H_{12} = \bar H_{12}^*, \qquad H_{21} = \bar H_{21}^*. \tag{5.1}$$
Footnote 3: The bilinear Hamiltonian corresponding to an antisymmetric operator ($H = -\bar H^*$) vanishes.

Without loss of generality, we henceforth assume that $K_1 = L^2(\mathbb R^d)$. Then let $\mathscr S \subset F_s(K_1)$ be the dense subspace consisting of finite particle vectors $\phi$ with $n$-particle wave functions $\phi^{(n)}$ in the Schwartz space $\mathscr S(\mathbb R^{dn})$. The unsmeared annihilation operator $a(p)$ with (invariant) domain $\mathscr S$ is defined as usual,
$$(a(p)\phi)^{(n)}(p_1, \dots, p_n) \equiv \sqrt{n+1}\;\phi^{(n+1)}(p, p_1, \dots, p_n).$$
Let $a^*(p)$ be its quadratic form adjoint on $\mathscr S \times \mathscr S$. Then Wick ordered monomials $a^*(q_1) \cdots a^*(q_m)a(p_1) \cdots a(p_n)$ make sense as quadratic forms on $\mathscr S \times \mathscr S$ [9, 14], and, for $\phi, \phi' \in \mathscr S$,
$$\langle\phi, a^*(q_1) \cdots a^*(q_m)a(p_1) \cdots a(p_n)\phi'\rangle \equiv \langle a(q_1) \cdots a(q_m)\phi,\ a(p_1) \cdots a(p_n)\phi'\rangle$$
is a Schwartz function to which tempered distributions may be applied. In particular, the distributions $H_{jk}(p,q)$, $j, k = 1, 2$, given by
$$\begin{aligned}
\langle f, H_{11}g\rangle &= \int \overline{f(p)}\,H_{11}(p,q)\,g(q)\,dp\,dq, \\
\langle f, H_{12}g^*\rangle &= \int \overline{f(p)}\,H_{12}(p,q)\,\overline{g(q)}\,dp\,dq, \\
\langle f^*, H_{21}g\rangle &= \int f(p)\,H_{21}(p,q)\,g(q)\,dp\,dq, \\
\langle f^*, H_{22}g^*\rangle &= \int f(p)\,H_{22}(p,q)\,\overline{g(q)}\,dp\,dq
\end{aligned}$$
for $f, g \in \mathscr S(\mathbb R^d) \subset K_1$, give rise to the following quadratic forms on $\mathscr S \times \mathscr S$:
$$\begin{aligned}
H_{11}a^*a &\equiv \int a(p)^*H_{11}(p,q)\,a(q)\,dp\,dq, \\
H_{12}a^*a^* &\equiv \int a(p)^*H_{12}(p,q)\,a(q)^*\,dp\,dq, \\
H_{21}aa &\equiv \int a(p)\,H_{21}(p,q)\,a(q)\,dp\,dq, \\
{:}H_{22}aa^*{:} &\equiv \int a(q)^*H_{22}(p,q)\,a(p)\,dp\,dq = H_{11}a^*a.
\end{aligned}$$
Wick ordering of $H_{22}aa^*$ (indicated by colons) is necessary to make this expression well-defined. The last equality follows from symmetry of $H$:
$$H_{11}(p,q) = H_{22}(q,p), \qquad H_{12}(p,q) = H_{12}(q,p), \qquad H_{21}(p,q) = H_{21}(q,p).$$
We next define ${:}b(H){:}$ and its Wick ordered powers as quadratic forms on $\mathscr S \times \mathscr S$:


$$\begin{aligned}
{:}b(H){:} &\equiv H_{12}a^*a^* + 2H_{11}a^*a + H_{21}aa, \\
{:}b(H)^l{:} &\equiv l! \sum_{\substack{l_1, l_2, l_3 = 0 \\ l_1 + l_2 + l_3 = l}}^{l} \frac{2^{l_2}}{l_1!\,l_2!\,l_3!}\,H_{l_1,l_2,l_3}, \qquad l \in \mathbb N,
\end{aligned}$$
with
$$\begin{aligned}
H_{l_1,l_2,l_3} \equiv \int\ & H_{12}(p_1,q_1) \cdots H_{12}(p_{l_1},q_{l_1})\,H_{11}(p'_1,q'_1) \cdots H_{11}(p'_{l_2},q'_{l_2})\,H_{21}(p''_1,q''_1) \cdots H_{21}(p''_{l_3},q''_{l_3}) \\
&\cdot a^*(p_1) \cdots a^*(p_{l_1})\,a^*(q_1) \cdots a^*(q_{l_1})\,a^*(p'_1) \cdots a^*(p'_{l_2}) \\
&\cdot a(q'_1) \cdots a(q'_{l_2})\,a(p''_1) \cdots a(p''_{l_3})\,a(q''_1) \cdots a(q''_{l_3}) \\
&\cdot dp_1\,dq_1 \dots dp_{l_1}\,dq_{l_1}\,dp'_1\,dq'_1 \dots dp'_{l_2}\,dq'_{l_2}\,dp''_1\,dq''_1 \dots dp''_{l_3}\,dq''_{l_3}.
\end{aligned}$$
The Wick ordered exponential of $\tfrac12 b(H)$ is also well-defined on $\mathscr S \times \mathscr S$, since only a finite number of terms contributes when applied to vectors from $\mathscr S$:
$${:}\exp\bigl(\tfrac12 b(H)\bigr){:} \equiv \sum_{l=0}^\infty \frac{1}{l!\,2^l}\,{:}b(H)^l{:}.$$

The important point is that these quadratic forms are actually the forms of uniquely determined linear operators, defined on the dense subspace $D$ and mapping $D$ into the domain of (the closure of) any creation or annihilation operator, provided that [15]
$$\|H_{12}\| < 1, \qquad H_{12} \text{ is Hilbert–Schmidt}. \tag{5.2}$$

These operators will be denoted by the same symbols as the quadratic forms.

Lemma 5.1. Let $H \in B(K)$ satisfy (5.1) and (5.2). Then the following commutation relations hold on $D$, for $f \in K_1$:
$$\begin{aligned}
[H_{l_1,l_2,l_3}, a(f)^*] &= l_2\,a(H_{11}f)^*H_{l_1,l_2-1,l_3} + 2l_3\,H_{l_1,l_2,l_3-1}\,a\bigl((H_{21}f)^*\bigr), \\
[a(f), H_{l_1,l_2,l_3}] &= 2l_1\,a(H_{12}f^*)^*H_{l_1-1,l_2,l_3} + l_2\,H_{l_1,l_2-1,l_3}\,a(H_{11}^*f),
\end{aligned}$$
implying that
$$\begin{aligned}
\bigl[{:}\exp(\tfrac12 b(H)){:},\ a(f)^*\bigr] &= a(H_{11}f)^*\,{:}\exp(\tfrac12 b(H)){:} + {:}\exp(\tfrac12 b(H)){:}\,a\bigl((H_{21}f)^*\bigr), \\
\bigl[a(f),\ {:}\exp(\tfrac12 b(H)){:}\bigr] &= a(H_{12}f^*)^*\,{:}\exp(\tfrac12 b(H)){:} + {:}\exp(\tfrac12 b(H)){:}\,a(H_{11}^*f).
\end{aligned}$$

Proof. Compute as in [15, 5].

For given $V \in \mathcal S_{P_1}(K,\gamma)$, we are now looking for symmetric bounded operators $H$ which satisfy (5.2) and the following intertwiner relation on $D$:
$${:}\exp\bigl(\tfrac12 b(H)\bigr){:}\;\pi_{P_1}(f) = \pi_{P_1}(Vf)\;{:}\exp\bigl(\tfrac12 b(H)\bigr){:}, \qquad f \in K \tag{5.3}$$
(taking the closure of $\pi_{P_1}(Vf)$ is tacitly assumed here). This problem turns out to be equivalent to the determination of the operators $Z$ done in (4.12), (4.13).


Lemma 5.2. Each $Z \in \mathfrak E_{P_1}$ fulfilling (4.12) gives rise to a unique solution $H$ of the above problem through the formula
$$H = \begin{pmatrix} V_{11} - P_1 + Z^\dagger V_{21} & Z^\dagger \\ (V_{22}^* + V_{12}^*Z^\dagger)V_{21} & V_{22}^* - P_2 + V_{12}^*Z^\dagger \end{pmatrix},$$
and each solution arises in this way.

Proof. Let us abbreviate $\eta_H \equiv {:}\exp(\tfrac12 b(H)){:}$. Choosing $f \in K_2$ resp. $f \in K_1$ and inserting the definition of $\pi_{P_1}$, one finds that (5.3) is equivalent to
$$\begin{aligned}
\eta_H\,a(g) &= \bigl(a(V_{11}g) + a^*(V_{12}g^*)\bigr)\eta_H, \\
\eta_H\,a^*(g) &= \bigl(a^*(V_{11}g) + a(V_{12}g^*)\bigr)\eta_H
\end{aligned}$$
for $g \in K_1$. Using the commutation relations from Lemma 5.1, these equations may be brought into Wick ordered form:
$$\begin{aligned}
0 &= a^*\bigl((V_{12} + H_{12}V_{22})g^*\bigr)\eta_H + \eta_H\,a\bigl(((P_1 + H_{11}^*)V_{11} - P_1)g\bigr), \\
0 &= a^*\bigl((P_1 + H_{11} - V_{11} - H_{12}V_{21})g\bigr)\eta_H + \eta_H\,a\bigl((\bar H_{21} - (P_1 + H_{11}^*)V_{12})g^*\bigr).
\end{aligned}$$
As in the CAR case [5], these equations hold for all $g \in K_1$ if and only if
$$\begin{aligned}
0 &= V_{12} + H_{12}V_{22}, && \text{(5.4a)} \\
0 &= P_1 + H_{11} - V_{11} - H_{12}V_{21}, && \text{(5.4b)} \\
0 &= H_{21} - (P_2 + H_{22})V_{21}, && \text{(5.4c)} \\
0 &= P_2 - (P_2 + H_{22})V_{22} && \text{(5.4d)}
\end{aligned}$$
(we applied complex conjugation and used $H_{11}^* = \bar H_{22}$).

Now assume that $H$ solves the above problem. It is then obvious from (5.1), (5.2) and (5.4a) that $Z \equiv H_{12}^\dagger$ belongs to $\mathfrak E_{P_1}$ and fulfills (4.12). Conversely, let $Z \in \mathfrak E_{P_1}$ satisfy (4.12). If there exists a solution $H$ with $H_{12} = Z^\dagger$, then $H_{11}$ is fixed by (5.4b), $H_{22}$ must equal $\bar H_{11}^*$, and $H_{21}$ is determined by (5.4c). Thus there can be at most one solution corresponding to $Z$, and it is necessarily of the form stated in the lemma. It remains to prove that the so-defined $H$ has all desired properties, i.e. that $H_{21}$ is symmetric and that (5.4d) holds, the rest being clear by construction. The first claim follows from (3.4d):
$$H_{21} - \bar H_{21}^* = (V_{22}^* + V_{12}^*Z^\dagger)V_{21} - V_{12}^*(V_{11} + Z^\dagger V_{21}) = 0,$$
and the second from (4.12) and (3.4b):
$$(P_2 + H_{22})V_{22} = (V_{22}^* - V_{12}^*\bar Z)V_{22} = V_{22}^*V_{22} - V_{12}^*V_{12} = P_2.$$

Inserting the formula (4.13) for $Z$, one obtains
$$\begin{aligned}
H_{11} &= V_{11}^{-1*} - P_1 - P_{\ker V_{11}^*}V_{12}V_{22}^{-1}V_{21} + Z'^\dagger V_{21}, \\
H_{12} &= -V_{12}V_{22}^{-1} - V_{11}^{-1*}V_{21}^*P_{\ker V_{22}^*} + Z'^\dagger, \\
H_{21} &= (V_{22}^{-1} - V_{12}^*V_{11}^{-1*}V_{21}^*P_{\ker V_{22}^*})V_{21} + V_{12}^*Z'^\dagger V_{21}, \\
H_{22} &= V_{22}^{-1} - P_2 - V_{12}^*V_{11}^{-1*}V_{21}^*P_{\ker V_{22}^*} + V_{12}^*Z'^\dagger.
\end{aligned}$$


$H$ corresponds to Ruijsenaars' operator $\Lambda$ [15]. If one compares the above formula for $H$ with Ruijsenaars' formula for $\Lambda$ in the case of automorphisms ($\ker V_{jj}^* = \{0\}$, $j = 1, 2$, $Z' = 0$), one finds that the off-diagonal components carry opposite signs. This is due to the fact that Ruijsenaars actually constructs implementers for the transformation induced by $CVC$ rather than $V$, cf. (3.27) and (3.29) in [15].

Note that ${:}\exp(\tfrac12 b(H)){:}\,\Omega_{P_1} = \exp(\tfrac12 H_{12}a^*a^*)\Omega_{P_1}$. By Ruijsenaars' computation [15] (see also [16]), the norm of such vectors is
$$\bigl\|{:}\exp(\tfrac12 b(H)){:}\,\Omega_{P_1}\bigr\| = \det(P_1 + H_{12}H_{12}^\dagger)^{-1/4}.$$

Definition 5.3. Let $V \in \mathcal S_{P_1}(K,\gamma)$, and let $P_V$, $Z_V$ and $H_V$ be the operators associated with $V$ according to Definition 4.3 and Lemma 5.2. Choose a $\gamma$-orthonormal basis $f_1, f_2, \dots$ in $P_V(\ker V^\dagger)$, i.e. a basis such that $\gamma(f_j, f_k) = \delta_{jk}$ (this is possible because the restriction of $\gamma$ to $P_V(\ker V^\dagger)$ is positive definite). Let $\psi_j$ be the isometry obtained by polar decomposition of the closure of $\pi_{P_1}(f_j)$. Then define operators $\Psi_\alpha(V)$ on $D$, for any multi-index $\alpha = (\alpha_1, \dots, \alpha_l)$ with $\alpha_j \le \alpha_{j+1}$ (or $\alpha = 0$) as in (3.1), as
$$\Psi_\alpha(V) \equiv \det(P_1 + Z_V^\dagger Z_V)^{1/4}\,\psi_{\alpha_1} \cdots \psi_{\alpha_l}\,{:}\exp\bigl(\tfrac12 b(H_V)\bigr){:}. \tag{5.5}$$
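The determinant factor in (5.5) is exactly what normalizes the vector $\exp(\tfrac12 H_{12}a^*a^*)\Omega_{P_1}$. In a single-mode toy model (an assumption of this sketch, with $H_{12}$ replaced by a complex number $h$, $|h| < 1$) one has $\|(a^*)^{2n}\Omega\|^2 = (2n)!$, so the squared norm is the series $\sum_n \binom{2n}{n}(|h|^2/4)^n = (1 - |h|^2)^{-1/2}$, while $\det(P_1 + H_{12}H_{12}^\dagger)$ reduces to $1 - |h|^2$. The function names below are hypothetical.

```python
import math


def squeezed_norm_squared(h, terms=400):
    """Partial sum of sum_n C(2n, n) (|h|^2 / 4)^n, the squared norm of
    exp(h (a*)^2 / 2) applied to the one-mode vacuum."""
    x = abs(h) ** 2 / 4.0
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # Ratio of consecutive terms: C(2n+2, n+1) / C(2n, n) = (2n+1)(2n+2)/(n+1)^2
        term *= x * (2 * n + 1) * (2 * n + 2) / ((n + 1) ** 2)
    return total


def normalization(h):
    """One-mode analogue of det(P1 + Z^dagger Z)^(1/4), namely (1 - |h|^2)^(1/4)."""
    return (1.0 - abs(h) ** 2) ** 0.25


h = 0.6
series = squeezed_norm_squared(h)
closed_form = (1.0 - abs(h) ** 2) ** -0.5
print(series, closed_form)                   # both approximately 1.25
print(normalization(h) * math.sqrt(series))  # approximately 1
```

The series converges only for $|h| < 1$, which mirrors the condition $\|H_{12}\| < 1$ in (5.2).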

Theorem 5.4. The $\Psi_\alpha(V)$ extend continuously to isometries (denoted by the same symbols) on the symmetric Fock space $F_s(K_1)$ such that
$$\Psi_\alpha(V)^*\Psi_\beta(V) = \delta_{\alpha\beta}\,1, \qquad \sum_\alpha \Psi_\alpha(V)\Psi_\alpha(V)^* = 1 \tag{5.6}$$
and, for any element $w$ of the Weyl algebra $W(K,\gamma)$,
$$\varrho_V(w) = \sum_\alpha \Psi_\alpha(V)\,w\,\Psi_\alpha(V)^*. \tag{5.7}$$

Proof. By (2.1) we have $\pi_{P_1}(f_j)^*\pi_{P_1}(f_j) = 1 + \pi_{P_1}(f_j)\pi_{P_1}(f_j)^*$ on $D$, so the closure of $\pi_{P_1}(f_j)$ is injective, and $\psi_j$ is isometric. It is also easy to see, using (5.3), the CCR and $\|\Psi_\alpha(V)\Omega_{P_1}\| = 1$, that
$$\langle\Psi_\alpha(V)\pi_{P_1}(g_1 \cdots g_m)\Omega_{P_1},\ \Psi_\alpha(V)\pi_{P_1}(h_1 \cdots h_n)\Omega_{P_1}\rangle = \langle\pi_{P_1}(g_1 \cdots g_m)\Omega_{P_1},\ \pi_{P_1}(h_1 \cdots h_n)\Omega_{P_1}\rangle.$$
Hence $\Psi_\alpha(V)$ is isometric on $D$ and has a continuous extension to an isometry on $F_s(K_1)$.

Let $\mathcal H_j \equiv \operatorname{span}(f_j, f_j^*)$, so that $\psi_j \in W(\mathcal H_j)''$ by virtue of Lemma 2.1. Since $\mathcal H_j \subset \ker V^\dagger$, there holds $W(\mathcal H_j) \subset W(\operatorname{ran} V)'$ by duality (2.3). Now let $f \in \operatorname{Re} K$ and $\phi \in D$. Since $\phi$ is an entire analytic vector for $\pi_{P_1}(f)$ [2], since $D$ is invariant under $\pi_{P_1}(f)$, and since $\overline{\pi_{P_1}(Vf)}$ is affiliated with $W(\operatorname{ran} V)$ by Lemma 2.1 (the bar denotes closure), it follows from (5.3) that


$$\begin{aligned}
\Psi_\alpha(V)\,w(f)\phi &= \sum_{n=0}^\infty \frac{i^n}{n!}\,\Psi_\alpha(V)\bigl(\pi_{P_1}(f)\bigr)^n\phi \\
&= \sum_{n=0}^\infty \frac{i^n}{n!}\,\psi_{\alpha_1} \cdots \psi_{\alpha_l}\,\overline{\pi_{P_1}(Vf)}^{\,n}\,\Psi_0(V)\phi \\
&= \sum_{n=0}^\infty \frac{i^n}{n!}\,\overline{\pi_{P_1}(Vf)}^{\,n}\,\Psi_\alpha(V)\phi \\
&= w(Vf)\,\Psi_\alpha(V)\phi.
\end{aligned}$$
By continuity, this entails
$$\Psi_\alpha(V)\,w = \varrho_V(w)\,\Psi_\alpha(V), \qquad w \in W(K,\gamma). \tag{5.8}$$
We next claim that
$$\psi_j^*\Psi_0(V) = 0 \tag{5.9}$$
or, equivalently, that $\pi_{P_1}(f_j)^*\Psi_0(V) = 0$. To see this, apply Lemma 5.1 and write $\pi_{P_1}(f_j)^*\Psi_0(V)$ in Wick ordered form:
$$\pi_{P_1}(f_j)^*\Psi_0(V) = a\bigl((P_1 + H_{12})f_j^*\bigr)^*\Psi_0(V) + \Psi_0(V)\,a\bigl((P_1 + H_{11}^*)f_j\bigr)$$
on $D$, with $H \equiv H_V$. Then (5.9) holds if and only if
$$(P_1 + H_{12})f_j^* = 0, \qquad (P_1 + H_{11}^*)f_j = 0. \tag{5.9'}$$
Now $f_j \in \operatorname{ran} P_V$ is equivalent to $f_j^* \in \ker P_V = \ker CP_V = \ker(P_1 + H_{12})$ (we used (4.10)). This proves the first equation in (5.9'). It also shows that $H_{12}^*f_j = -P_2f_j$. Hence by Lemma 5.2,
$$(P_1 + H_{11}^*)f_j = (V_{11}^* + V_{21}^*H_{12}^*)f_j = (V_{11}^* - V_{21}^*)f_j = P_1V^\dagger f_j = 0,$$
which proves the second equation in (5.9') and therefore (5.9).

The orthogonality relation $\Psi_\alpha(V)^*\Psi_\beta(V) = 0$ ($\alpha \ne \beta$) now follows from (5.9) and from $W(\mathcal H_j) \subset W(\mathcal H_k)'$ ($j \ne k$), which in turn is a consequence of $\gamma(\mathcal H_j, \mathcal H_k) = 0$ and (2.3).

The proof of the completeness relation $\sum_\alpha \Psi_\alpha(V)\Psi_\alpha(V)^* = 1$ is facilitated by invoking the product decomposition $V = U_VW_V$ from Proposition 4.4. Set $e_j \equiv U_V^\dagger f_j$ to obtain a $\gamma$-orthonormal basis $e_1, e_2, \dots$ in $P_1(\ker W_V^\dagger) = K_1 \cap \ker W_V^\dagger$. Let $\psi'_j$ be the isometric part of $a(e_j)^*$. An application of Definition 5.3 to $W_V$ yields implementers $\Psi_\alpha(W_V) = \psi'_{\alpha_1} \cdots \psi'_{\alpha_l}\Psi_0(W_V)$ for $W_V$. $Z_{W_V} = 0$ entails that $\Psi_\alpha(W_V)\Omega_{P_1} = \psi'_{\alpha_1} \cdots \psi'_{\alpha_l}\Omega_{P_1}$. One computes, using the CCR, that $\psi'_{\alpha_1} \cdots \psi'_{\alpha_l}\Omega_{P_1} = \phi'_\alpha$, where the $\phi'_\alpha$ are the cyclic vectors associated with the pure state $\omega_{P_1}\circ\varrho_{W_V} = \omega_{P_1}$ as in Lemma 3.1. Let $F'_\alpha$ be the closure of $W(\operatorname{ran} W_V)\phi'_\alpha$. Since the $F'_\alpha$ are irreducible subspaces for $W(\operatorname{ran} W_V)$ by Lemma 3.1, they must coincide with the irreducible subspaces $\operatorname{ran}\Psi_\alpha(W_V)$. $\oplus F'_\alpha = F_s(K_1)$ then implies completeness of the $\Psi_\alpha(W_V)$. The proof will be completed by showing that


$$\Psi_\alpha(V) = \Psi(U_V)\Psi_\alpha(W_V) \tag{5.10}$$
holds, where $\Psi(U_V)$ is the unitary implementer for $U_V$ given by Definition 5.3. It suffices to show that (5.10) holds on $\Omega_{P_1}$, since any bounded operator fulfilling (5.8) is already determined by its value on $\Omega_{P_1}$. Because of $Z_{U_V} = Z_V$ we have
$$\Psi_0(V)\Omega_{P_1} = \Psi(U_V)\Omega_{P_1}, \tag{5.11}$$
so it remains to show that $\psi_{\alpha_1} \cdots \psi_{\alpha_l}\Psi(U_V)\Omega_{P_1} = \Psi(U_V)\psi'_{\alpha_1} \cdots \psi'_{\alpha_l}\Omega_{P_1}$. We claim that $\psi_j\Psi(U_V) = \Psi(U_V)\psi'_j$. For let $T$ (resp. $T'$) be the closure of $\pi_{P_1}(f_j)$ (resp. $\pi_{P_1}(e_j)$), and let $T^\pm$ (resp. $T'^\pm$) be the corresponding self-adjoint operators as in the proof of Lemma 2.1, so that
$$D(T) = D(T^+) \cap D(T^-), \qquad T = T^+ - iT^-,$$
and similarly for $T'$. Then there holds
$$\Psi(U_V)\exp(itT'^\pm)\Psi(U_V)^* = \exp(itT^\pm), \qquad t \in \mathbb R.$$
Therefore $\Psi(U_V)$ maps $D(T'^\pm)$ onto $D(T^\pm)$, and one has $\Psi(U_V)T'^\pm\Psi(U_V)^* = T^\pm$. Consequently, $\Psi(U_V)D(T') = D(T)$ and $\Psi(U_V)T'\Psi(U_V)^* = T$. This implies that $\Psi(U_V)\psi'_j\Psi(U_V)^* = \psi_j$ as claimed. The proof is complete since (5.6) and (5.8) together imply (5.7).

Corollary 5.5. There is a unitary isomorphism from $H(\varrho_V)$, the Hilbert space generated by the $\Psi_\alpha(V)$, onto the symmetric Fock space $F_s(P_V(\ker V^\dagger))$ over $P_V(\ker V^\dagger)$, which maps $\Psi_\alpha(V)$ to $(l_1! \cdots l_r!)^{-1/2}\,a^*(f_{\alpha_1}) \cdots a^*(f_{\alpha_l})\Omega$, where the notation is as in (3.1), and $a^*(f_j)$ and $\Omega$ are now creation operators and the Fock vacuum in $F_s(P_V(\ker V^\dagger))$.

Acknowledgement. It is a pleasure to thank Dr. M. Schmidt for discussions and for bringing refs. [18, 16, 12] to the author's attention.

References

1. Araki, H.: A lattice of von Neumann algebras associated with the quantum theory of a free Bose field. J. Math. Phys. 4, 1343 (1963)
2. Araki, H., Shiraishi, M.: On quasifree states of the canonical commutation relations (I). Publ. RIMS Kyoto Univ. 7, 105 (1971/72); Araki, H.: On quasifree states of the canonical commutation relations (II). Publ. RIMS Kyoto Univ. 7, 121 (1971/72)
3. Araki, H., Yamagami, S.: An inequality for Hilbert–Schmidt norm. Commun. Math. Phys. 81, 89 (1981)
4. Araki, H., Yamagami, S.: On quasi-equivalence of quasifree states of the canonical commutation relations. Publ. RIMS Kyoto Univ. 18, 283 (1982)
5. Binnenhei, C.: Implementation of endomorphisms of the CAR algebra. Rev. Math. Phys. 7, 833 (1995)
6. Cuntz, J.: Simple C*-algebras generated by isometries. Commun. Math. Phys. 57, 173 (1977)
7. van Daele, A.: Quasi-equivalence of quasi-free states on the Weyl algebra. Commun. Math. Phys. 21, 171 (1971)
8. Doplicher, S., Roberts, J. E.: Fields, statistics and non-abelian gauge groups. Commun. Math. Phys. 28, 331 (1972); Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics. Commun. Math. Phys. 131, 51 (1990)
9. Glimm, J., Jaffe, A.: Quantum field theory models. In: DeWitt, C., Stora, R. (eds.): Statistical Mechanics and Quantum Field Theory. New York: Gordon and Breach, 1971
10. Haag, R.: Local Quantum Physics. Berlin–Heidelberg–New York: Springer-Verlag, 1992
11. Kato, T.: Perturbation Theory for Linear Operators. Berlin–Heidelberg–New York: Springer-Verlag, 1966
12. Maaß, H.: Siegel's Modular Forms and Dirichlet Series. Berlin–Heidelberg–New York: Springer-Verlag, 1971
13. Manuceau, J., Verbeure, A.: Quasi-free states of the C.C.R.-algebra and Bogoliubov transformations. Commun. Math. Phys. 9, 293 (1968)
14. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol. II. New York: Academic Press, 1975
15. Ruijsenaars, S. N. M.: On Bogoliubov transformations. II. The general case. Ann. Phys. 116, 105 (1978)
16. Segal, G.: Unitary representations of some infinite dimensional groups. Commun. Math. Phys. 80, 301 (1981)
17. Shale, D.: Linear symmetries of free boson fields. Trans. Am. Math. Soc. 103, 149 (1962)
18. Siegel, C. L.: Symplectic Geometry. New York, London: Academic Press, 1964

Communicated by H. Araki

Commun. Math. Phys. 195, 373 – 403 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

An Elliptic Algebra $U_{q,p}(\widehat{sl}_2)$ and the Fusion RSOS Model

Hitoshi Konno

Department of Mathematics, Faculty of Integrated Arts and Sciences, Hiroshima University, Higashi-Hiroshima 739, Japan. E-mail: [email protected]

Received: 7 September 1997 / Accepted: 26 November 1997

Abstract: We introduce an elliptic algebra $U_{q,p}(\widehat{sl}_2)$ with $p = q^{2r}$ ($r \in \mathbb{R}_{>0}$) and present its free boson representation at generic level $k$. We show that this algebra governs a structure of the space of states in the $k$-fusion RSOS model specified by a pair of positive integers $(r, k)$, or equivalently a $q$-deformation of the coset conformal field theory $SU(2)_k \times SU(2)_{r-k-2}/SU(2)_{r-2}$. Extending the work by Lukyanov and Pugai corresponding to the case $k = 1$, we give a full set of screening operators for $k > 1$. The algebra $U_{q,p}(\widehat{sl}_2)$ has two interesting degeneration limits, $p \to 0$ and $p \to 1$. The former limit yields the quantum affine algebra $U_q(\widehat{sl}_2)$ whereas the latter yields the algebra $A_{\hbar,\eta}(\widehat{sl}_2)$, the scaling limit of the elliptic algebra $A_{q,p}(\widehat{sl}_2)$. Using this correspondence, we also obtain the highest component of two types of vertex operators which can be regarded as $q$-deformations of the primary fields in the coset conformal field theory.

1. Introduction

In studying exactly solvable models, especially in calculating correlation functions, the algebraic analysis method has proved to be extremely powerful [1]. The method is based on the infinite dimensional quantum group symmetry possessed by a model (here we assume the model to have an infinite number of degrees of freedom) and its representation theory. In particular, the intertwining operators between the infinite dimensional representation spaces play an important role. There are two types of intertwiners, called type I and type II. Remarkably, in solvable lattice models, the type I intertwining operator can be identified with a certain composition of the Boltzmann weights so that its product behaves as a local operator acting on the lattice. One can thus combine this with Baxter's corner transfer matrix (CTM) method [2]. As a consequence, the type I intertwiner allows us to identify the infinite dimensional irreducible representation with the space of states of the model. Furthermore, the properties of the

Boltzmann weights such as the Yang-Baxter equation, the inversion relation and the crossing symmetry yield some universal relations among the type I intertwiners and the CTM. Based on these relations, one can derive $q$-difference equations for correlation functions of local operators.

It is usually a difficult problem to solve such equations. The advantage of the algebraic analysis is that it allows us to derive the correlation functions directly. Correlation functions are formulated as traces of the product of type I intertwiners over the irreducible representation space. Moreover, in many cases the infinite dimensional quantum group symmetries admit a free boson realization. This enables us to construct the infinite dimensional representations as well as the intertwining operators. The calculation of the correlation functions, i.e. the traces of the intertwiners, is then a straightforward task. Needless to say, the same spirit was already applied to two dimensional conformal field theory (CFT) [3, 4].

In [1, 5, 6], the XXZ model, or equivalently the six vertex model, in the antiferromagnetic regime was solved by applying the representation theory of the quantum affine algebra $U_q(\widehat{sl}_2)$. Following this work, its higher spin extension [7, 8, 9, 10] and a higher rank extension [11] were discussed. The XYZ model, or equivalently the eight vertex model in the principal regime, was also treated in this approach [12, 13]. There the elliptic algebra $A_{q,p}(\widehat{sl}_2)$ [14] was proposed as the basic symmetry of the model. However, its free field realizations still remain to be obtained. It should also be remarked that the central extension of the Yangian double $DY(\widehat{sl}_2)$, the rational limit of $U_q(\widehat{sl}_2)$, was properly defined [15, 16]. Its free field realization and application to physical problems were discussed [15, 16, 17].

In the recent work [18], we discussed two degeneration limits of the elliptic algebra $A_{q,p}(\widehat{sl}_2)$, namely the limits $p \to 0$ and $p \to 1$. The first limit gives $U_q(\widehat{sl}_2)$, whereas the second limit yields a new algebra. The new algebra turned out to be a relevant symmetry for the XXZ model in the gapless regime [18] as well as for the sine-Gordon theory [19, 20]. This new algebra was later reformulated by using the Drinfeld currents and called $A_{\hbar,\eta}(\widehat{sl}_2)$ [21].

On the other hand, it is known as the vertex-face correspondence that there exists an interaction-round-a-face model corresponding to a vertex model. The eight vertex model and the corresponding eight vertex solid-on-solid (SOS) model, also called the Andrews-Baxter-Forrester (ABF) model, are well-known examples [22]. The higher spin analogues of the ABF model were constructed by a fusion procedure [23]. The $k$-fusion SOS model is obtained by fusing the Boltzmann weights of the ABF model $k$ times in both horizontal and vertical directions. We are interested in their restricted versions, i.e. the $k$-fusion restricted SOS (RSOS) models [22, 23]. The model is labeled by a pair of positive integers $(r, k)$. At each lattice site $a$, one places a random variable (local height) $m_a$ taking values in the set $S = \{1, 2, \ldots, r-1\}$. One further imposes a restriction that for all adjacent sites $a$ and $b$, the local heights $m_a$ and $m_b$ satisfy the admissible conditions

$m_a - m_b = -k, -k+2, \ldots, k$,    $k + 1 \le m_a + m_b \le 2r - k - 1$.

The Boltzmann weight $W\!\left(\left.\begin{matrix} m_a & m_b \\ m_d & m_c \end{matrix}\,\right|\, u\right)$ of the model is attached to each configuration $(m_a, m_b, m_c, m_d)$ on the NW, NE, SE, SW corners of an elementary face, with $u$ being the spectral parameter. The following two facts shown in [23] for the regime III, which is treated throughout this paper, are fundamental.


1. The one point local height probability (LHP), i.e. the probability that the center site has a given value of the local height, is given by the so-called branching coefficient appearing in the decomposition of the product of the two irreducible characters of level $k$ and $r-k-2$ of the affine Lie algebra $A_1^{(1)}$ into level $r-2$ irreducible characters of the diagonal $A_1^{(1)}$.
2. The critical behavior is described by the conformal field theory with the Virasoro central charge $c = \frac{3k}{k+2}\left(1 - \frac{2(k+2)}{r(r-k)}\right)$ and the primary fields of conformal dimensions $h_{J;n',n}$ (2.8). The corresponding CFT is known as the coset minimal model $SU(2)_k \times SU(2)_{r-k-2}/SU(2)_{r-2}$.

The cases $k = 1, 2$ were known before the fusion RSOS model [24], whereas the $k > 2$ cases were realized inspired by the model [25, 26, 27]. The coset minimal model possesses the extended Virasoro algebra symmetry generated by the Virasoro generator and some extra generators. The super Virasoro algebra is contained as the case $k = 2$. The irreducible characters of the extended Virasoro algebras are given by the same branching coefficients as the LHP above.

The first attempt to apply the algebraic analysis to the fusion RSOS model was carried out in [28]. There, in the regime III, the space of states was described based on the representation of $U_q(\widehat{sl}_2)$, and the creation and annihilation operators of the quasi-particles were obtained as a tensor product of certain type I and type II intertwiners in $U_q(\widehat{sl}_2)$. Later, the type I vertex operator in the restricted ABF model in the regime III was recognized as a lattice operator and a $q$-difference equation for the correlation function was derived [29]. Recently, Lukyanov and Pugai have succeeded in realizing this type I vertex operator by using a free boson [30]. This enables us to construct a solution of the above $q$-difference equation exactly.
However, the most important contribution of this work is not this but the discovery of a symmetry of the restricted ABF model, namely the symmetry generated by the $q$-deformation of the Virasoro algebra [31]. In the same way as the Virasoro algebra, the $q$-Virasoro algebra admits a singular representation corresponding to the minimal series [3]. In such a representation, screening operators play an essential role. Constructing screening operators and $q$-deformations of the primary fields, Lukyanov and Pugai discovered that their $q$-primary fields are nothing but the above type I vertex operator in the restricted ABF model.

The purpose of this paper is to extend their result to the $k$-fusion RSOS model in the regime III. Since the Virasoro algebra is realized as the case $k = 1$ in the coset CFT, we expect that a certain $q$-deformation of the extended Virasoro algebra corresponding to the coset $SU(2)_k \times SU(2)_{r-k-2}/SU(2)_{r-2}$ ($k > 1$) exists and that it provides a basic symmetry of the $k$-fusion RSOS models [32].

Our strategy is based on the following observation. The screening currents found by Lukyanov and Pugai satisfy an elliptic deformation of the quantum affine algebra $U_q(\widehat{sl}_2)$ at level one. This elliptic algebra has another degeneration limit to $A_{\hbar,\eta}(\widehat{sl}_2)$ at level one [18, 31]. Picking up this algebraic nature, we carry out the extension in the following two steps.

1. We extend the elliptic algebra of the screening currents to generic level $k$. We call this algebra $U_{q,p}(\widehat{sl}_2)$ [32]. Realizing it by using free bosons, we show that the conformal limit of the generating currents for $U_{q,p}(\widehat{sl}_2)$ coincides with those known in the above coset CFT. Hence these currents give a full set of screening currents for the $q$-deformed coset theory for arbitrary $k$.
2. The elliptic algebra $U_{q,p}(\widehat{sl}_2)$ has two desired degeneration limits, $U_q(\widehat{sl}_2)$ and $A_{\hbar,\eta}(\widehat{sl}_2)$. The Hopf algebra (like) structures are known both in $U_q(\widehat{sl}_2)$ and $A_{\hbar,\eta}(\widehat{sl}_2)$. Using this knowledge, we obtain a free field realization of, at least, the highest component of the type I and type II vertex operators.

The free field realization of the screening currents and the type I, II intertwiners allows us to make a characterization of the $q$-deformation of the coset CFT $SU(2)_k \times SU(2)_{r-k-2}/SU(2)_{r-2}$.

In order to identify our $q$-deformed coset theory with the $k$-fusion RSOS model, we investigate the following two things. The first one is the partition function per site. We show that the correct partition function per site is obtained from the commutation of the two type I vertex operators. This allows us to regard the type I vertex operator as a proper lattice operator for the $k$-fusion RSOS model. The second one is the structure of the Fock modules for the $q$-deformed coset theory. The Fock modules are reducible due to the existence of singular vectors, which can be constructed by the screening operators on the modules. We consider a resolution of the modules and obtain a space which can be regarded as the irreducible highest weight representation of the conjectural $q$-deformation of the extended Virasoro algebra. The character of the space coincides with the desired branching coefficient. Hence one can identify the space with the space of states for the $k$-fusion RSOS model.

This paper is organized as follows. In the next section, we briefly review the free field representation of the coset minimal model $SU(2)_k \times SU(2)_{r-k-2}/SU(2)_{r-2}$.
The formulae summarized in this section should be compared with those of the $q$-deformed ones obtained in Sect. 4 and 5. In Sect. 3, we introduce the elliptic algebra $U_{q,p}(\widehat{sl}_2)$ and discuss its properties. In Sect. 4, we present a free field representation of $U_{q,p}(\widehat{sl}_2)$. As a corollary, the free field representation of the algebra $A_{\hbar,\eta}(\widehat{sl}_2)$ for arbitrary level is obtained. We also derive the type I and type II vertex operators and their commutation relations for the highest components. We argue that the correct partition function per site is obtained from these relations. In Sect. 5, based on these results, we propose a $q$-deformation of the coset minimal model. We show that the algebra $U_{q,p}(\widehat{sl}_2)$ provides a full set of screening operators which are sufficient for making a resolution of the Fock modules. Then a characterization of the irreducible highest weight modules of the conjectural $q$-deformed extended Virasoro algebras is obtained. The evaluation of the character of the space supports the identification of the space with the space of states for the $k$-fusion RSOS model. The final section is devoted to discussions on the results and some future problems.

2. Coset Conformal Field Theory

In this section we briefly review the coset minimal model $SU(2)_k \times SU(2)_l/SU(2)_{k+l}$, $k, l \in \mathbb{Z}_{>0}$ [24, 25, 26, 27].

The symmetry of the theory is an extended Virasoro algebra generated by the energy-momentum (EM) tensor $T(z)$ and the extra generators $A_k^j(z)$ ($j = 1, 2, \ldots$). The number of extra generators depends on the value of $k$. For example, in the case $k = 1$ and $l \in \mathbb{Z}_{>0}$, the theory is the Virasoro minimal model and there are no extra generators [3, 24]. For $k = 2$ with $l \in \mathbb{Z}_{>0}$, the theory is the $N = 1$ super Virasoro minimal model [24, 33, 34]. There is one extra generator of conformal dimension 3/2, which is nothing but the super generator $A_2(z) = G(z)$. The $k = 4$ case with $l \in \mathbb{Z}_{>0}$ is known as the $S_3$ symmetric model [35]. There are two extra generators $A_4^1(z)$ and $A_4^2(z)$ of conformal dimension 4/3.

The extended Virasoro algebra is defined by the following operator product expansions (OPE) [27]:

$T(z)T(w) = \frac{c/2}{(z-w)^4} + \frac{2T(w)}{(z-w)^2} + \frac{\partial T(w)}{z-w} + O(1)$,    (2.1)

$T(z)A_k^j(w) = \frac{\sigma A_k^j(w)}{(z-w)^2} + \frac{\partial A_k^j(w)}{z-w} + O(1)$,    (2.2)

$A_k^i(z)A_k^j(w) = \alpha^{ij}(z-w)^{-2\sigma}\left\{\frac{c}{\sigma} + 2T(w)(z-w)^2 + O((z-w)^3)\right\}$    (2.3)

$\qquad + \beta^{ijm}(z-w)^{-\sigma}\left\{A_k^m(w) + \frac{1}{2}(z-w)\partial A_k^m(w) + O((z-w)^2)\right\}$,    (2.4)

where $\sigma = \frac{k+4}{k+2}$ is the conformal dimension of $A_k^j(z)$, and $\alpha^{ij}$, $\beta^{ijm}$ are the structure constants. The central charge $c$ of the Virasoro algebra is given by

$c = \frac{3k}{k+2}\left(1 - \frac{2(k+2)}{(l+2)(l+k+2)}\right)$.    (2.5)

Let $r = l + k + 2$, $1 \le n \le r - k - 1$, $1 \le n' \le r - 1$ and $J = |n' - n \ (\mathrm{mod}\ 2k)|$, $0 \le J \le k$. Define the Virasoro generators $L_m$ by $T(z) = \sum_{m \in \mathbb{Z}} L_m z^{-m-2}$. The Virasoro highest weight state $|J; n', n\rangle$ is the state satisfying

$L_0 |J; n', n\rangle = h_{J;n',n} |J; n', n\rangle$,    (2.6)

$L_m |J; n', n\rangle = 0$  for $m \ge 1$.    (2.7)

The highest weights are given by

$h_{J;n',n} = \frac{J(k-J)}{2k(k+2)} + \frac{(nr - n'(r-k))^2 - k^2}{4kr(r-k)}$.    (2.8)
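The formulas (2.5) and (2.8) are easy to evaluate exactly. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and $J$ is passed in explicitly rather than derived from $(n', n)$, so that no interpretation of the mod $2k$ rule is built in.

```python
from fractions import Fraction


def central_charge(r: int, k: int) -> Fraction:
    """Central charge (2.5), written with r = l + k + 2 so that (l+2)(l+k+2) = (r-k) r."""
    return Fraction(3 * k, k + 2) * (1 - Fraction(2 * (k + 2), r * (r - k)))


def conformal_weight(J: int, n_prime: int, n: int, r: int, k: int) -> Fraction:
    """Highest weight h_{J;n',n} of (2.8); J is supplied by the caller."""
    parafermion_part = Fraction(J * (k - J), 2 * k * (k + 2))
    boson_part = Fraction((n * r - n_prime * (r - k)) ** 2 - k ** 2,
                          4 * k * r * (r - k))
    return parafermion_part + boson_part


if __name__ == "__main__":
    r, k = 4, 1  # k = 1: Virasoro minimal series; J(k - J) vanishes for J = 0, 1
    print("c =", central_charge(r, k))
    weights = sorted({conformal_weight(0, n_p, n, r, k)
                      for n in range(1, r - k)      # 1 <= n  <= r - k - 1
                      for n_p in range(1, r)})      # 1 <= n' <= r - 1
    print("weights:", weights)
```

For $(r, k) = (4, 1)$ this gives $c = 1/2$ and the weights $0, 1/16, 1/2$, i.e. the Ising values, which is consistent with the statement that $k = 1$ reproduces the Virasoro minimal models.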

One can realize the theory in terms of three boson fields [25, 26, 36]. Let us introduce the three independent free boson fields $\phi_0(z)$, $\phi_1(z)$ and $\phi_2(z)$ satisfying the OPE $\langle\phi_0(z)\phi_0(w)\rangle = \langle\phi_1(z)\phi_1(w)\rangle = \log(z-w) = -\langle\phi_2(z)\phi_2(w)\rangle$. Then the EM tensor of the coset theory is realized as

$T(z) = T_{Z_k}(z) + \frac{1}{2}(\partial\phi_0(z))^2 + \sqrt{2}\,\alpha_0\,\partial^2\phi_0(z)$,    (2.9)

$T_{Z_k}(z) = \frac{1}{2}(\partial\phi_1(z))^2 - \frac{1}{2}\sqrt{\frac{2}{k+2}}\,\partial^2\phi_1(z) - \frac{1}{2}(\partial\phi_2(z))^2$.    (2.10)

Here $2\alpha_0 = \sqrt{\frac{k}{r(r-k)}}$, and $T_{Z_k}(z)$ is the EM tensor of the $Z_k$ parafermion theory [35]. The realization of the extra generators can be found in [25, 26, 27].

The expression (2.9) indicates that the coset theory is realized as a composition of the $Z_k$ parafermion theory and the boson theory $\phi_0(z)$. Indeed, the primary field of the coset theory is realized as

$\Psi_{J;n',n}(z) = \Psi_{J,M;n',n}(z)\big|_{J = M\ (\mathrm{mod}\ 2k)}$,    (2.11)

$\Psi_{J,M;n',n}(z) = \phi_{J,M}(z) : \exp\left\{\sqrt{2}\,\alpha_{n',n}\phi_0(z)\right\} :$,    (2.12)

$\phi_{J,M}(z) = \ : \exp\left\{\frac{J}{\sqrt{2(k+2)}}\phi_1(z) + \frac{M}{\sqrt{2k}}\phi_2(z)\right\} :$,    (2.13)

where $M = -J, -J+2, \ldots, J$ (mod $2k$), $\alpha_{n',n} = \frac{1-n}{2}\alpha_+ + \frac{1-n'}{2}\alpha_-$, $\alpha_+ = \sqrt{\frac{r}{k(r-k)}}$, $\alpha_- = -\sqrt{\frac{r-k}{kr}}$. Note $2\alpha_0 = \alpha_+ + \alpha_-$. Using $\Psi_{J;n',n}(z)$, the highest weight state $|J; n', n\rangle$ is obtained as

$|J; n', n\rangle = \lim_{z \to 0} \Psi_{J;n',n}(z)|0\rangle$,    (2.14)

where $|0\rangle$ denotes the $SL(2, \mathbb{C})$ invariant vacuum state. The highest weight representation space is then given by the Fock module $F_{J;n',n}$ constructed by the action of the creation operators of the fields $\phi_j(z)$, $j = 0, 1, 2$, on $|J; n', n\rangle$. These modules are reducible due to the existence of singular vectors. The singular vectors can be constructed by using the screening operators on some highest weight states. In the minimal coset theory, the screening currents, whose contour integrals yield the screening operators, are given by

$S_+(z) = \Psi(z) : \exp\left\{\sqrt{2}\,\alpha_+\phi_0(z)\right\} :$,    (2.15)

$S_-(z) = \Psi^\dagger(z) : \exp\left\{\sqrt{2}\,\alpha_-\phi_0(z)\right\} :$,    (2.16)

$S(z) = - : \left(\sqrt{\frac{k+2}{2}}\,\partial\phi_1(z) + \sqrt{\frac{k}{2}}\,\partial\phi_2(z)\right) \exp\left\{-\sqrt{\frac{2}{k+2}}\,\phi_1(z)\right\} :$,    (2.17)

$\eta(z) = \ : \exp\left\{\sqrt{\frac{k+2}{2}}\,\phi_1(z) + \sqrt{\frac{k}{2}}\,\phi_2(z)\right\} :$,    (2.18)

where $\Psi$ and $\Psi^\dagger$ are the parafermion currents given by

$\Psi(z) = \ : \left(\sqrt{\frac{k+2}{k}}\,\partial\phi_1(z) + \partial\phi_2(z)\right) \exp\left\{\sqrt{\frac{2}{k}}\,\phi_2(z)\right\} :$,    (2.19)

$\Psi^\dagger(z) = \ : \left(\sqrt{\frac{k+2}{k}}\,\partial\phi_1(z) - \partial\phi_2(z)\right) \exp\left\{-\sqrt{\frac{2}{k}}\,\phi_2(z)\right\} :$.    (2.20)
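The Coulomb-gas data of this section can be cross-checked numerically. The sketch below is an illustration only. It assumes that $:\exp\{\sqrt{2}\alpha\phi_0\}:$ has conformal dimension $\alpha(\alpha - 2\alpha_0)$, which is what the OPE $\langle\phi_0\phi_0\rangle = \log(z-w)$ together with (2.9) gives, and it uses the standard dimension $1 - 1/k$ of the $Z_k$ parafermion currents. All function names are hypothetical.

```python
import math


def coulomb_gas_data(r: float, k: float):
    """Return (alpha_plus, alpha_minus, 2*alpha_0) as defined after (2.13) and (2.10)."""
    alpha_plus = math.sqrt(r / (k * (r - k)))
    alpha_minus = -math.sqrt((r - k) / (k * r))
    two_alpha0 = math.sqrt(k / (r * (r - k)))
    return alpha_plus, alpha_minus, two_alpha0


def vertex_dimension(alpha: float, two_alpha0: float) -> float:
    """Assumed dimension of :exp(sqrt(2) alpha phi_0): for the EM tensor (2.9)."""
    return alpha * (alpha - two_alpha0)


def alpha_label(n_prime: int, n: int, r: float, k: float) -> float:
    """alpha_{n',n} = (1-n)/2 alpha_+ + (1-n')/2 alpha_-."""
    a_p, a_m, _ = coulomb_gas_data(r, k)
    return 0.5 * (1 - n) * a_p + 0.5 * (1 - n_prime) * a_m


def screening_dimension(sign: int, r: float, k: float) -> float:
    """Dimension of S_+ (sign > 0) or S_- (sign < 0): parafermion current plus vertex part."""
    a_p, a_m, two_a0 = coulomb_gas_data(r, k)
    alpha = a_p if sign > 0 else a_m
    return (1 - 1 / k) + vertex_dimension(alpha, two_a0)


if __name__ == "__main__":
    r, k = 7, 3
    a_p, a_m, two_a0 = coulomb_gas_data(r, k)
    print("alpha_+ + alpha_- - 2 alpha_0 =", a_p + a_m - two_a0)
    print("dim S_+ =", screening_dimension(+1, r, k))
    print("dim S_- =", screening_dimension(-1, r, k))
    n_p, n = 2, 3
    boson_part = ((n * r - n_p * (r - k)) ** 2 - k ** 2) / (4 * k * r * (r - k))
    print("vertex dimension minus boson part of (2.8):",
          vertex_dimension(alpha_label(n_p, n, r, k), two_a0) - boson_part)
```

Under these assumptions the script reproduces $2\alpha_0 = \alpha_+ + \alpha_-$, gives dimension one for $S_\pm(z)$, and matches the second term of (2.8) with the dimension of the $\phi_0$ vertex part of (2.12).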

These currents are characterized by the properties (i) the conformal dimension is one, (ii) their contour integrals commute with the extended Virasoro algebra. These screening currents define the nilpotent operators called the BRST charges. One can use these charges to make a resolution of the Fock modules [37, 38]. One should note that the screening currents $S(z)$ and $\eta(z)$ act only on the $Z_k$ parafermion theory and that the screening operators obtained from $S_\pm(z)$ commute with those from $S(z)$ and $\eta(z)$. Therefore, in order to make a resolution of the Fock modules of the coset theory, one may take the following two steps. First make a resolution of the $Z_k$ parafermion


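theory and then consider the coset theory [39] (footnote 2). In Sect. 5, we discuss a $q$-analogue of this resolution.

3. The Elliptic Algebra $U_{q,p}(\widehat{sl}_2)$

In this section we define a new elliptic algebra $U_{q,p}(\widehat{sl}_2)$ [32] and discuss its relation to the known algebras $U_q(\widehat{sl}_2)$ and $A_{\hbar,\eta}(\widehat{sl}_2)$.

3.1. Definition. Let $r \in \mathbb{R}_{>0}$ and $q \in \mathbb{C}$, $|q| < 1$. We set $p = q^{2r}$ and $p^* = pq^{-2c}$.

Definition 3.1. The associative algebra $U_{q,p}(\widehat{sl}_2)$ is generated by the operator valued functions (currents) $k(z)$, $E(z)$, $F(z)$ of the complex variable $z$ and the central element $c$ with the following relations (footnote 3).

$k(z)k(w) = \left(\frac{z}{w}\right)^{\frac{c}{2r(r-c)}} \frac{\xi(w/z; p, q)}{\xi(w/z; p^*, q)} \frac{\xi(z/w; p^*, q)}{\xi(z/w; p, q)}\, k(w)k(z)$,    (3.1)

$k(z)E(w) = \left(\frac{z}{w}\right)^{\frac{1}{r-c}} \frac{\Theta_{p^*}(q^{-1}p^{*\frac{1}{2}}w/z)}{\Theta_{p^*}(q\,p^{*\frac{1}{2}}w/z)}\, E(w)k(z)$,    (3.2)

$k(z)F(w) = \left(\frac{z}{w}\right)^{-\frac{1}{r}} \frac{\Theta_p(q\,p^{\frac{1}{2}}w/z)}{\Theta_p(q^{-1}p^{\frac{1}{2}}w/z)}\, F(w)k(z)$,    (3.3)

$E(z)E(w) = q^2 \left(\frac{z}{w}\right)^{\frac{2}{r-c}} \frac{\Theta_{p^*}(q^{-2}w/z)}{\Theta_{p^*}(q^2 w/z)}\, E(w)E(z)$,    (3.4)

$F(z)F(w) = q^{-2} \left(\frac{z}{w}\right)^{-\frac{2}{r}} \frac{\Theta_p(q^2 w/z)}{\Theta_p(q^{-2}w/z)}\, F(w)F(z)$,    (3.5)

$[E(z), F(w)] = \frac{1}{q - q^{-1}}\left\{\delta(q^{-c}z/w)H^+(q^{-\frac{c}{2}}z) - \delta(q^{c}z/w)H^-(q^{-\frac{c}{2}}w)\right\}$,    (3.6)

where

$H^\pm(z) = \kappa(z)\, k(q^{\pm(r-\frac{c}{2})+1}z)\, k(q^{\pm(r-\frac{c}{2})-1}z)$,    (3.7)

$\kappa(z) = \left(q^{\pm(r-\frac{c}{2})+1}z\right)^{-\frac{c}{2r(r-c)}} \left.\frac{\xi(x; p^*, q)}{\xi(x; p, q)}\right|_{x=q^{-2}}$,    (3.8)

$\xi(z; p, q) = \frac{(q^2 z; p, q^4)_\infty (pq^2 z; p, q^4)_\infty}{(q^4 z; p, q^4)_\infty (pz; p, q^4)_\infty}$    (3.9)

and $[A, B] = AB - BA$, $\delta(z) = \sum_{n \in \mathbb{Z}} z^n$, $\Theta_s(z) = (z; s)_\infty (s/z; s)_\infty (s; s)_\infty$, $(z; s)_\infty = \prod_{n=0}^{\infty}(1 - zs^n)$.

Footnote 2. The boson fields $\Phi(z)$, $\varphi(z)$, $\chi(z)$ used in [39] are related to $\phi_0(z)$, $\phi_1(z)$, $\phi_2(z)$ as follows: $\Phi(z) = \phi_0(z)$, $\varphi(z) = -\sqrt{\frac{k}{2}}\,\phi_1(z) - \sqrt{\frac{k+2}{2}}\,\phi_2(z)$, $\chi(z) = -\sqrt{\frac{k+2}{2}}\,\phi_1(z) - \sqrt{\frac{k}{2}}\,\phi_2(z)$.

Footnote 3. We are indebted to Jimbo for introducing the generator $k(z)$ and giving its relation to $H^\pm(z)$ in (3.7).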
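The building blocks $(z; s)_\infty$ and $\Theta_s(z)$ of Definition 3.1 are simple to evaluate by truncating the infinite products. The following sketch is illustrative only (hypothetical function names, truncation order chosen ad hoc). It checks the quasi-periodicity $\Theta_s(sz) = -z^{-1}\Theta_s(z)$, which follows directly from the product form, and the value $\Theta_0(z) = 1 - z$ that is relevant for the limit $p \to 0$.

```python
def qpochhammer(z, s, terms=200):
    """Truncated product (z; s)_inf = prod_{n>=0} (1 - z s^n), valid for |s| < 1."""
    prod = 1.0
    power = 1.0
    for _ in range(terms):
        prod *= 1 - z * power
        power *= s
    return prod


def theta(z, s, terms=200):
    """Theta_s(z) = (z; s)_inf (s/z; s)_inf (s; s)_inf as in Definition 3.1."""
    return (qpochhammer(z, s, terms)
            * qpochhammer(s / z, s, terms)
            * qpochhammer(s, s, terms))


if __name__ == "__main__":
    z, s = 0.7 + 0.2j, 0.3
    # Quasi-periodicity: Theta_s(s z) + Theta_s(z) / z should vanish.
    print(abs(theta(s * z, s) + theta(z, s) / z))
    # Degenerate case s = 0: Theta_0(z) = 1 - z.
    print(abs(theta(z, 0.0) - (1 - z)))
```

With $\Theta_0(x) = 1 - x$, the structure function of (3.4) reduces to $q^2(1 - q^{-2}w/z)/(1 - q^2 w/z) = (q^2 z - w)/(z - q^2 w)$ once the fractional power is dropped, which has the familiar form of the Drinfeld current relation of $U_q(\widehat{sl}_2)$.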


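Remark 3.1. Let us set $p = e^{-2\pi i/\tau}$, $p^* = e^{-2\pi i/\tau^*}$, $z = q^{2u}$ and denote $E(z)$, $F(z)$ and $k(z)$ by the same letters $E(u)$, $F(u)$ and $k(u)$. Then the relations in (3.2)–(3.5) are rewritten in more compact form,

$k(u)E(v) = \frac{\vartheta_1\!\left(\frac{u-v+\frac{1}{2}}{r-c} - \frac{1}{2}\,\Big|\,\tau^*\right)}{\vartheta_1\!\left(\frac{u-v-\frac{1}{2}}{r-c} - \frac{1}{2}\,\Big|\,\tau^*\right)}\, E(v)k(u)$,    (3.10)

$k(u)F(v) = \frac{\vartheta_1\!\left(\frac{u-v-\frac{1}{2}}{r} - \frac{1}{2}\,\Big|\,\tau\right)}{\vartheta_1\!\left(\frac{u-v+\frac{1}{2}}{r} - \frac{1}{2}\,\Big|\,\tau\right)}\, F(v)k(u)$,    (3.11)

$E(u)E(v) = \frac{\vartheta_1\!\left(\frac{u-v+1}{r-c}\,\big|\,\tau^*\right)}{\vartheta_1\!\left(\frac{u-v-1}{r-c}\,\big|\,\tau^*\right)}\, E(v)E(u)$,    (3.12)

$F(u)F(v) = \frac{\vartheta_1\!\left(\frac{u-v-1}{r}\,\big|\,\tau\right)}{\vartheta_1\!\left(\frac{u-v+1}{r}\,\big|\,\tau\right)}\, F(v)F(u)$,    (3.13)

$H^+(u)H^-(v) = \frac{\vartheta_1\!\left(\frac{u-v-(1+\frac{c}{2})}{r}\,\big|\,\tau\right)}{\vartheta_1\!\left(\frac{u-v+(1-\frac{c}{2})}{r}\,\big|\,\tau\right)} \frac{\vartheta_1\!\left(\frac{u-v+(1+\frac{c}{2})}{r-c}\,\big|\,\tau^*\right)}{\vartheta_1\!\left(\frac{u-v-(1-\frac{c}{2})}{r-c}\,\big|\,\tau^*\right)}\, H^-(v)H^+(u)$,    (3.14)

$H^\pm(u)H^\pm(v) = \frac{\vartheta_1\!\left(\frac{u-v-1}{r}\,\big|\,\tau\right)}{\vartheta_1\!\left(\frac{u-v+1}{r}\,\big|\,\tau\right)} \frac{\vartheta_1\!\left(\frac{u-v+1}{r-c}\,\big|\,\tau^*\right)}{\vartheta_1\!\left(\frac{u-v-1}{r-c}\,\big|\,\tau^*\right)}\, H^\pm(v)H^\pm(u)$,    (3.15)

where $\vartheta_1(u|\tau)$ is the Jacobi elliptic theta function

$\vartheta_1(u|\tau) = i \sum_{n \in \mathbb{Z}} (-)^n e^{\pi i (n-1/2)^2 \tau} e^{2\pi i (n-1/2) u}$.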
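As a small numerical illustration of the theta function just defined (a sketch with hypothetical names and an ad hoc cutoff), one can sum the series directly and compare it with the equivalent sine form $2\sum_{m \ge 0} (-1)^m e^{\pi i (m+1/2)^2 \tau} \sin((2m+1)\pi u)$, obtained by pairing the terms $n$ and $1-n$. The quasi-periodicity relations checked below are the standard ones for $\vartheta_1$.

```python
import cmath
import math


def theta1_series(u, tau, cutoff=30):
    """theta_1(u|tau) = i * sum_n (-1)^n exp(pi i (n-1/2)^2 tau) exp(2 pi i (n-1/2) u)."""
    total = 0
    for n in range(-cutoff + 1, cutoff + 1):  # symmetric under n -> 1 - n
        h = n - 0.5
        total += ((-1) ** n
                  * cmath.exp(1j * math.pi * h * h * tau)
                  * cmath.exp(2j * math.pi * h * u))
    return 1j * total


def theta1_sine(u, tau, cutoff=30):
    """Equivalent sine series 2 * sum_{m>=0} (-1)^m exp(pi i (m+1/2)^2 tau) sin((2m+1) pi u)."""
    total = 0
    for m in range(cutoff):
        total += ((-1) ** m
                  * cmath.exp(1j * math.pi * (m + 0.5) ** 2 * tau)
                  * cmath.sin((2 * m + 1) * math.pi * u))
    return 2 * total


if __name__ == "__main__":
    u, tau = 0.3 + 0.1j, 0.1 + 0.5j  # Im(tau) > 0 is required for convergence
    print(abs(theta1_series(u, tau) - theta1_sine(u, tau)))
    print(abs(theta1_series(u + 1, tau) + theta1_series(u, tau)))
    factor = -cmath.exp(-1j * math.pi * tau - 2j * math.pi * u)
    print(abs(theta1_series(u + tau, tau) - factor * theta1_series(u, tau)))
```

Oddness in $u$ and the sign change under $u \to u + 1$ are what make the ratios in (3.10)–(3.15) well defined functions of $u - v$ up to the stated shifts.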

These expressions suggest that the algebra $U_{q,p}(\widehat{sl}_2)$ is related to some elliptic curves in a way similar to the theory of Enriquez and Felder [40].

3.2. Degeneration limit. There are two interesting degeneration limits: $p \to 0$ and $p \to 1$. The limit $p \to 0$ is taken by letting $r \to \infty$. Then the relations (3.1)–(3.6) are reduced to those of the Drinfeld currents in the quantum affine algebra $U_q(\widehat{sl}_2)$ (see for example [41]). In the later section (Sect. 5), we will make the identification that $U_{q,p}(\widehat{sl}_2)$ is the algebra of the screening currents in the $q$-deformation of the coset theory $SU(2)_k \times SU(2)_{r-k-2}/SU(2)_{r-2}$. The limit $r \to \infty$ to $U_q(\widehat{sl}_2)$ is then consistent with the well-known fact in CFT and the perturbation of the coset CFT [42], i.e.

$\lim_{r \to \infty} \frac{SU(2)_k \times SU(2)_{r-k-2}}{SU(2)_{r-2}}$ theory $\to$ $SU(2)_k$ Wess-Zumino-Witten model.

The second limit $p \to 1$ is taken by setting $q = e^{\frac{\hbar\varepsilon}{2}}$ and $z = e^{-i\alpha\varepsilon}$, $w = e^{-i\beta\varepsilon}$ and letting $\varepsilon \to 0$. In this limit, the relations in Definition 3.1 tend to those of the currents in the algebra $A_{\hbar,\eta}(\widehat{sl}_2)$ [21] (see the Appendix), i.e. the degeneration (or scaling) limit of the elliptic algebra $A_{q,p}(\widehat{sl}_2)$ [14], under the identification $1/\eta = \hbar r$, $1/\eta' = \hbar(r-k)$ and the interchanges $H^+ \leftrightarrow H^-$, $E \leftrightarrow F$. This limit and the results in [20] are also consistent


with the known fact that the scaling limit of the RSOS model gives the restricted sine-Gordon model and the latter model is obtained by an integrable perturbation of the coset minimal model (see, for example, [42]).

Our algebra $U_{q,p}(\widehat{sl}_2)$ hence has the same degeneration limits as the elliptic algebra $A_{q,p}(\widehat{sl}_2)$ [18]. The direct relationship between $U_{q,p}(\widehat{sl}_2)$ and $A_{q,p}(\widehat{sl}_2)$ is unknown at this stage (footnote 4). However, see [18] Sect. 3, where one can find a discussion which suggests a deep relation between the $q$-Virasoro algebra and the elliptic algebra $A_{q,p}(\widehat{sl}_2)$ at level one.

4. Free Field Realization of $U_{q,p}(\widehat{sl}_2)$ at Level $k$

We here consider a realization of the algebra $U_{q,p}(\widehat{sl}_2)$ at arbitrary level $k \neq 0, -2$ in terms of three bosonic fields (footnote 5).

4.1. Bosonization of $U_{q,p}(\widehat{sl}_2)$. Let $a_{j,m}$ ($m \in \mathbb{Z}_{\neq 0}$, $j = 0, 1, 2$) be bosons satisfying the commutation relations

$[a_{0,m}, a_{0,n}] = \frac{[2m][km]}{m} \frac{[rm]}{[(r-k)m]}\, \delta_{m+n,0}$,    (4.1)

$[a_{1,m}, a_{1,n}] = \frac{[2m][(k+2)m]}{m}\, \delta_{m+n,0}$,    (4.2)

$[a_{2,m}, a_{2,n}] = -\frac{[2m][km]}{m}\, \delta_{m+n,0}$,    (4.3)

where $[m] = \frac{q^m - q^{-m}}{q - q^{-1}}$. We also define the primed boson $a'_{0,m}$ and the zero-mode operators $Q_j$ and $P_j$ ($j = 0, 1, 2$) satisfying

$a'_{0,m} = \frac{[(r-k)m]}{[rm]}\, a_{0,m}$,    (4.4)

$[P_0, Q_0] = -i$,  $[P_1, Q_1] = 2(k+2)$,  $[P_2, Q_2] = -2k$.    (4.5)
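The $q$-numbers and the right-hand sides of (4.1)–(4.3) are straightforward to evaluate. The sketch below uses hypothetical helper names and assumes real $q > 0$, $q \neq 1$. It also illustrates the limit $q \to 1$, in which $[x] \to x$ and the three coefficients tend to $2krm/(r-k)$, $2(k+2)m$ and $-2km$.

```python
def q_number(x: float, q: float) -> float:
    """[x] = (q^x - q^(-x)) / (q - q^(-1)), for real q > 0 and q != 1."""
    return (q ** x - q ** (-x)) / (q - 1 / q)


def boson_norms(m: int, q: float, r: float, k: float):
    """Coefficients of delta_{m+n,0} in (4.1), (4.2), (4.3), in that order."""
    a0 = (q_number(2 * m, q) * q_number(k * m, q) * q_number(r * m, q)
          / (m * q_number((r - k) * m, q)))
    a1 = q_number(2 * m, q) * q_number((k + 2) * m, q) / m
    a2 = -q_number(2 * m, q) * q_number(k * m, q) / m
    return a0, a1, a2


if __name__ == "__main__":
    print(q_number(3, 2.0))  # equals q^2 + 1 + q^(-2) at q = 2
    m, r, k = 2, 7.5, 3
    print(boson_norms(m, 1 + 1e-6, r, k))
    print((2 * k * r * m / (r - k), 2 * (k + 2) * m, -2 * k * m))
```

The limiting values $2(k+2)$ and $-2k$ per unit $m$ are the same constants that normalize the zero modes in (4.5), so the zero modes behave like the $m \to 0$ continuation of the oscillators.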

In order to make the expression of the currents simple, we introduce the following boson fields $\phi_j(A; B, C|z; D)$ ($j = 0, 1, 2$; $A, B, C, D \in \mathbb{R}$),

$\phi_0(A; B, C|z; D) = \frac{A}{BC}\sqrt{\frac{2kr}{r-k}}\,(iQ_0 + P_0 \log z) + \sum_{m \in \mathbb{Z}_{\neq 0}} \frac{[Am]}{[Bm][Cm]}\, a_{0,m} z^{-m} q^{D|m|}$,    (4.6)

$\phi'_0(A; B, C|z; D) = \frac{A}{BC}\sqrt{\frac{2k(r-k)}{r}}\,(iQ_0 + P_0 \log z) + \sum_{m \in \mathbb{Z}_{\neq 0}} \frac{[Am]}{[Bm][Cm]}\, a'_{0,m} z^{-m} q^{D|m|}$,    (4.7)

$\phi_j(A; B, C|z; D) = -\frac{A}{BC}(Q_j + P_j \log z) + \sum_{m \in \mathbb{Z}_{\neq 0}} \frac{[Am]}{[Bm][Cm]}\, a_{j,m} z^{-m} q^{D|m|}$  ($j = 1, 2$),    (4.8)

$\phi_j^{(\pm)}(A; B|z; C) = \frac{P_j}{2}\log q + (q - q^{-1}) \sum_{m \in \mathbb{Z}_{>0}} \frac{[Am]}{[Bm]}\, a_{j,\pm m} z^{\mp m} q^{Cm}$  ($j = 1, 2$).    (4.9)

We often use the abridgment

$\phi_j(C|z; D) = \phi_j(A; A, C|z; D)$,  $\phi_j(C|z) = \phi_j(C|z; 0)$.    (4.10)

Footnote 4. In the trigonometric limit, the corresponding problem has been discussed by Hou et al. [43].

Footnote 5. In preparing this paper, we have noticed that Shiraishi has obtained another free field realization for a similar algebra.

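We denote by $: \ :$ the usual normal ordered product. For example,

$: \exp\{-\phi_0(k|z)\} : \ = \exp\left\{\sum_{m \in \mathbb{Z}_{>0}} \frac{a_{0,-m}}{[km]} z^{m}\right\} \exp\left\{-\sum_{m \in \mathbb{Z}_{>0}} \frac{a_{0,m}}{[km]} z^{-m}\right\} \times e^{-i\sqrt{\frac{2r}{k(r-k)}}\,Q_0}\, z^{-\sqrt{\frac{2r}{k(r-k)}}\,P_0}$.    (4.11)

Then we have

Theorem 4.1. The algebra $U_{q,p}(\widehat{sl}_2)$ has the following free field realization at $c = k$.

$k(z) = \ : \exp\{-\phi_0(1; 2, r|z)\} :$,    (4.12)

$E(z) = \Psi(z) : \exp\{-\phi_0(k|z)\} :$,    (4.13)

$F(z) = \Psi^\dagger(z) : \exp\{\phi'_0(k|z)\} :$,    (4.14)

where $\Psi(z)$ and $\Psi^\dagger(z)$ are the $q$-deformed $Z_k$ parafermion currents given by $\Psi(z) = \Psi^-(z)$, $\Psi^\dagger(z) = \Psi^+(z)$ [44] (footnote 6),

$\Psi^\pm(z) = \mp\frac{1}{q - q^{-1}} : \exp\left\{\pm\phi_2\!\left(k\,\Big|\,z; \pm\frac{k}{2}\right)\right\}$
$\qquad \times \left[\exp\left\{-\phi_2^{(+)}\!\left(1; 2\,\Big|\,z; \mp\frac{k+2}{2}\right) \pm \phi_1^{(+)}\!\left(1; 2\,\Big|\,z; \mp\frac{k}{2}\right)\right\} - \exp\left\{\phi_2^{(-)}\!\left(1; 2\,\Big|\,z; \mp\frac{k+2}{2}\right) \mp \phi_1^{(-)}\!\left(1; 2\,\Big|\,z; \mp\frac{k}{2}\right)\right\}\right] :$.    (4.15)

Proof. Straightforward calculation.

Remark 4.1. In the CFT limit $q \to 1$ with $z$ fixed, the currents $E(z)$ and $F(z)$ coincide with the screening currents $S_+(z)$ (2.15), $S_-(z)$ (2.16) in the coset CFT.

Footnote 6. The relation of our notations to those of Matsuo's is as follows: $\beta_m = a_{1,m}$, $\bar\alpha_m = a_{2,m}$, $\beta_0 = P_1$, $\bar\alpha_0 = P_2$, $2(k+2)\beta = Q_1$, $2k\bar\alpha = Q_2$.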


Remark 4.2. In the limit $r \to \infty$, the expressions (4.12) with (3.7) and (4.13)–(4.14) tend to Matsuo's bosonization of $U_q(\widehat{sl}_2)$ [44]. Namely, $H^+(z) \to K_+(z)$, $H^-(z) \to K_-(z)$, $E(z) \to X^+(z)$ and $F(z) \to X^-(z)$.

4.2. The $A_{\hbar,\eta}(\widehat{sl}_2)$ limit. Let us consider the $A_{\hbar,\eta}(\widehat{sl}_2)$ limit. The algebra $A_{\hbar,\eta}(\widehat{sl}_2)$ [21] is defined in the Appendix. The limit is taken by the following procedure [18]: set $q = e^{\frac{\hbar\varepsilon}{2}}$, $z = e^{-i\alpha\varepsilon}$, $r = \xi + k$, $m\varepsilon = t \in \mathbb{R}$ and let $\varepsilon \to 0$. Then we have from (4.1)–(4.3),

$[a_0(t), a_0(t')] = \frac{1}{\hbar^2} \frac{\sinh\hbar t\, \sinh\frac{\hbar k t}{2}\, \sinh\frac{\hbar(\xi+k)t}{2}}{t\, \sinh\frac{\hbar\xi t}{2}}\, \delta(t + t')$,    (4.16)

$[a_1(t), a_1(t')] = \frac{1}{\hbar^2} \frac{\sinh\hbar t\, \sinh\frac{\hbar(k+2)t}{2}}{t}\, \delta(t + t')$,    (4.17)

$[a_2(t), a_2(t')] = -\frac{1}{\hbar^2} \frac{\sinh\hbar t\, \sinh\frac{\hbar k t}{2}}{t}\, \delta(t + t')$,    (4.18)

and

$a'_0(t) = \frac{\sinh\frac{\hbar\xi t}{2}}{\sinh\frac{\hbar(\xi+k)t}{2}}\, a_0(t)$.    (4.19)
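The substitution $q = e^{\hbar\varepsilon/2}$, $m\varepsilon = t$ turns $q$-numbers into hyperbolic sines exactly, not just asymptotically: $(q - q^{-1})[Am] = 2\sinh(\hbar A t/2)$ for every $\varepsilon$. The following sketch uses hypothetical names and does not track the overall rescaling of the bosons. It checks this identity and the ratio appearing in (4.4) and (4.19).

```python
import math


def q_number(x: float, q: float) -> float:
    """[x] = (q^x - q^(-x)) / (q - q^(-1))."""
    return (q ** x - q ** (-x)) / (q - 1 / q)


def scaling_point(hbar: float, eps: float, t: float):
    """Return (q, m) with q = exp(hbar*eps/2) and m = t/eps (m need not be an integer here)."""
    return math.exp(hbar * eps / 2), t / eps


if __name__ == "__main__":
    hbar, t, xi, k = 0.8, 0.7, 2.5, 3
    for eps in (0.5, 0.05, 0.005):
        q, m = scaling_point(hbar, eps, t)
        # (q - 1/q) [k m] versus 2 sinh(hbar k t / 2)
        lhs = (q - 1 / q) * q_number(k * m, q)
        rhs = 2 * math.sinh(hbar * k * t / 2)
        # [(r-k) m] / [r m] with r = xi + k, versus the prefactor in (4.19)
        ratio_q = q_number(xi * m, q) / q_number((xi + k) * m, q)
        ratio_sinh = math.sinh(hbar * xi * t / 2) / math.sinh(hbar * (xi + k) * t / 2)
        print(eps, lhs - rhs, ratio_q - ratio_sinh)
```

Only the explicit factors $1/m$ and $(q - q^{-1})^{-1}$ in (4.1)–(4.3) depend on $\varepsilon$, which is why the limit amounts to a rescaling of the modes together with the replacement $[Am] \to \sinh(\hbar A t/2)$.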

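In this limit, all the zero-mode operators $Q_j$ and $P_j$ ($j = 0, 1, 2$) are dropped. The boson fields $\phi_j(A; B, C|z; D)$, $j = 0, 1, 2$, are now reduced to

$\tilde\phi_j(A; B, C|\alpha; D) = \hbar \int_{-\infty}^{\infty} dt\, \frac{\sinh\frac{\hbar A t}{2}}{\sinh\frac{\hbar B t}{2}\, \sinh\frac{\hbar C t}{2}}\, a_j(t)\, e^{i\alpha t + \frac{\hbar D}{2}|t|}$  ($j = 0, 1, 2$),    (4.20)

$\tilde\phi'_0(A; B, C|\alpha; D) = \hbar \int_{-\infty}^{\infty} dt\, \frac{\sinh\frac{\hbar A t}{2}}{\sinh\frac{\hbar B t}{2}\, \sinh\frac{\hbar C t}{2}}\, a'_0(t)\, e^{i\alpha t + \frac{\hbar D}{2}|t|}$,    (4.21)

$\tilde\phi_j^{(\pm)}(A; B|\alpha; C) = 2\hbar \int_{0}^{\infty} dt\, \frac{\sinh\frac{\hbar A t}{2}}{\sinh\frac{\hbar B t}{2}}\, a_j(\pm t)\, e^{\pm i\alpha t + \frac{\hbar C}{2}t}$  ($j = 1, 2$).    (4.22)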

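Under these notations, we have

Theorem 4.2. The level $k$ currents in $A_{\hbar,\eta}(\widehat{sl}_2)$ with $1/\eta = \hbar\xi$, $1/\eta' = \hbar(\xi + k)$ are realized as follows.

$k(\alpha) = \ : \exp\{-\tilde\phi_0(1; 2, \xi + k|\alpha)\} :$,    (4.23)

$E(\alpha) = \Psi(\alpha) : \exp\{-\tilde\phi_0(k|\alpha)\} :$,    (4.24)

$F(\alpha) = \Psi^\dagger(\alpha) : \exp\{\tilde\phi'_0(k|\alpha)\} :$,    (4.25)

where $\Psi(\alpha) = \Psi^-(\alpha)$, $\Psi^\dagger(\alpha) = \Psi^+(\alpha)$ are given by

$\Psi^\pm(\alpha) = \mp\frac{1}{\hbar} : \exp\left\{\pm\tilde\phi_2\!\left(k\,\Big|\,\alpha; \pm\frac{k}{2}\right)\right\}$
$\qquad \times \left[\exp\left\{-\tilde\phi_2^{(+)}\!\left(1; 2\,\Big|\,\alpha; \mp\frac{k+2}{2}\right) \pm \tilde\phi_1^{(+)}\!\left(1; 2\,\Big|\,\alpha; \mp\frac{k}{2}\right)\right\} - \exp\left\{\tilde\phi_2^{(-)}\!\left(1; 2\,\Big|\,\alpha; \mp\frac{k+2}{2}\right) \mp \tilde\phi_1^{(-)}\!\left(1; 2\,\Big|\,\alpha; \mp\frac{k}{2}\right)\right\}\right] :$.    (4.26)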

In the algebra $A_{\hbar,\eta}(\widehat{sl}_2)$, the Hopf algebra like structure and the level zero representation are known [21]. We summarize them in the Appendix. From this knowledge, one can construct the intertwining operators between the level $k$ infinite dimensional representations. Let us define the following four vertex operators:

$\Phi_l^{(l)}(\zeta) = \phi_{l,l}(\zeta) : \exp\{-\tilde\phi'_0(l; 2, k|\zeta)\} :$,    (4.27)

$\Psi_l^{(l)*}(\zeta) = \phi_{l,-l}(\zeta) : \exp\{\tilde\phi_0(l; 2, k|\zeta)\} :$,    (4.28)

$\tilde\Phi_l^{(l)}(\zeta) = \phi_{k-l,-(k-l)}(\zeta) : \exp\{-\tilde\phi'_0(l; 2, k|\zeta)\} :$,    (4.29)

$\tilde\Psi_l^{(l)*}(\zeta) = \phi_{k-l,k-l}(\zeta) : \exp\{\tilde\phi_0(l; 2, k|\zeta)\} :$  ($l = 0, 1, 2, \ldots, k$),    (4.30)

where $\phi_{l,\pm l}(\zeta)$ are the analogues of the $Z_k$-parafermion primary fields (2.13) given by

$\phi_{l,\pm l}(\zeta) = \ : \exp\left\{-\tilde\phi_2\!\left(\pm l; 2, k\,\Big|\,\zeta; \pm\frac{k}{2}\right) - \tilde\phi_1\!\left(l; 2, k+2\,\Big|\,\zeta; \pm\frac{k+2}{2}\right)\right\} :$.    (4.31)

Then we have

Proposition 4.3. The vertex operators $\Phi_l^{(l)}(\zeta)$ and $\Psi_l^{(l)*}(\zeta)$ satisfy the intertwining relations of the type I (A.29)–(A.31) and the type II (A.32)–(A.34), respectively, whereas the vertex operators $\tilde\Phi_l^{(l)}(\zeta)$ and $\tilde\Psi_l^{(l)*}(\zeta)$ satisfy the twisted intertwining relations of the type I (A.29), (A.38)–(A.39) and the type II (A.32), (A.40)–(A.41), respectively.

Proof. Straightforward.

Using the relations (A.31)–(A.34), one can obtain the other (lower) components of the intertwiners $\Phi_m^{(l)}(\zeta)$, $\Psi_m^{(l)*}(\zeta)$, $\tilde\Phi_m^{(l)}(\zeta)$, $\tilde\Psi_m^{(l)*}(\zeta)$ ($m = 0, 1, 2, \ldots, l-1$). We omit them here.

4.3. The type I and type II vertex operators. Although we do not have any results on the Hopf algebra structure of $U_{q,p}(\widehat{sl}_2)$, the $(q, p)$-analogue of the operators (4.27)–(4.30) can be obtained by the following requirements.

1. The procedure taking the limit to $A_{\hbar,\eta}(\widehat{sl}_2)$ from $U_{q,p}(\widehat{sl}_2)$ makes the vertex operators reduce to those in (4.27)–(4.30).
2. The zero-mode factors are determined by requiring that the CFT limit ($q \to 1$ with $z$ fixed) of the vertex operators should be expressed as the exponential of the boson fields.


Using the notations in (4.6)–(4.9), we find that the desired vertex operators are given as follows.

$\Phi_l^{(l)}(z) = \phi_{l,l}(z) : \exp\{-\phi'_0(l; 2, k|z)\} :$,    (4.32)

$\Psi_l^{(l)*}(z) = \phi_{l,-l}(z) : \exp\{\phi_0(l; 2, k|z)\} :$,    (4.33)

$\tilde\Phi_l^{(l)}(z) = \phi_{k-l,-(k-l)}(z) : \exp\{-\phi'_0(l; 2, k|z)\} :$,    (4.34)

$\tilde\Psi_l^{(l)*}(z) = \phi_{k-l,k-l}(z) : \exp\{\phi_0(l; 2, k|z)\} :$  ($l = 0, 1, 2, \ldots, k$),    (4.35)

where

$\phi_{l,\pm l}(z) = \ : \exp\left\{-\phi_2\!\left(\pm l; 2, k\,\Big|\,z; \pm\frac{k}{2}\right) - \phi_1\!\left(l; 2, k+2\,\Big|\,z; \pm\frac{k+2}{2}\right)\right\} :$.    (4.36)

We also have some conjectural expressions for the lower components of these vertex operators. We will discuss them and their commutation relations in a separate paper.

Remark 4.3. In the $U_q(\widehat{sl}_2)$ limit, the type I vertex operator $\Phi_l^{(l)}(z)$ (4.32) coincides with the result in [44]. On the other hand, in the CFT limit, the same vertex operator coincides with the primary field $\Psi_{l;l+1,1}(z)$ in (2.12), whereas the type II vertex operator $\Psi_l^{(l)*}(z)$ coincides with $\phi_{l,-l}(z) : \exp\{\sqrt{2}\,\alpha_{1,l+1}\phi_0(z)\} :$. Hence one can regard $\Phi_l^{(l)}(z)$ and $\Psi_l^{(l)*}(z)$ as the $q$-deformation of the primary fields in the coset theory.

Remark 4.4. At level one, the vertex operator $\tilde\Phi_1^{(1)}(z)$ coincides with $\Psi^+(z)$ obtained by Lukyanov and Pugai in the $q$-Virasoro algebra [30]. On the other hand, the degeneration limits (4.29) and (4.30), at level one, coincide with those found in the massless XXZ model [18]. In this way, in all the known cases, the vertex operators relevant for the physical applications obey the twisted intertwining relations.

The vertex operators (4.32)–(4.35) satisfy interesting commutation relations with the currents, which allows us to expect the existence of the $(q, p)$-analogue of the (twisted) intertwining relations (A.29)–(A.41) and the existence of the Hopf algebra structure in $U_{q,p}(\widehat{sl}_2)$ (see also Sect. 6). We here list them only for the vertices $\tilde\Phi_l^{(l)}(z)$ and $\tilde\Psi_l^{(l)*}(z)$.

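Proposition 4.4. The vertex operators $\tilde\Phi^{(l)}_l(z)$ and $\tilde\Psi^{(l)*}_l(z)$ satisfy the following relations:
\[ H^{(\pm)}(w)\tilde\Phi^{(l)}_l(z) = q^{l}\Bigl(q^{\pm k/2}\frac{z}{w}\Bigr)^{-l/r}\frac{\Theta_p(q^{-l\pm k/2}z/w)}{\Theta_p(q^{l\pm k/2}z/w)}\,\tilde\Phi^{(l)}_l(z)H^{(\pm)}(w), \tag{4.37} \]
\[ E(w)\tilde\Phi^{(l)}_l(z) + \tilde\Phi^{(l)}_l(z)E(w) = 0, \tag{4.38} \]
\[ F(w)\tilde\Phi^{(l)}_l(z) = -q^{l}\Bigl(\frac{z}{w}\Bigr)^{-l/r}\frac{\Theta_p(q^{-l}z/w)}{\Theta_p(q^{l}z/w)}\,\tilde\Phi^{(l)}_l(z)F(w), \tag{4.39} \]
\[ H^{(\pm)}(w)\tilde\Psi^{(l)*}_l(z) = q^{-l}\Bigl(q^{\mp k/2}\frac{z}{w}\Bigr)^{l/(r-k)}\frac{\Theta_{p^*}(q^{l\mp k/2}z/w)}{\Theta_{p^*}(q^{-l\mp k/2}z/w)}\,\tilde\Psi^{(l)*}_l(z)H^{(\pm)}(w), \tag{4.40} \]
\[ F(w)\tilde\Psi^{(l)*}_l(z) + \tilde\Psi^{(l)*}_l(z)F(w) = 0, \tag{4.41} \]
\[ E(w)\tilde\Psi^{(l)*}_l(z) = -q^{-l}\Bigl(\frac{z}{w}\Bigr)^{l/(r-k)}\frac{\Theta_{p^*}(q^{l}z/w)}{\Theta_{p^*}(q^{-l}z/w)}\,\tilde\Psi^{(l)*}_l(z)E(w). \tag{4.42} \]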


Finally, we present the commutation relations among the type I and type II vertex operators. For application to physics, we are interested in those among $\tilde\Phi^{(k)}_k(z)$ and $\tilde\Psi^{(1)*}_1(z)$.

Proposition 4.5.
\[ \tilde\Phi^{(k)}_k(z)\tilde\Phi^{(k)}_k(w) = r_k(w/z)\,\tilde\Phi^{(k)}_k(w)\tilde\Phi^{(k)}_k(z), \tag{4.43} \]
\[ \tilde\Psi^{(1)*}_1(z)\tilde\Psi^{(1)*}_1(w) = s_1(z/w)\,\tilde\Psi^{(1)*}_1(w)\tilde\Psi^{(1)*}_1(z), \tag{4.44} \]
\[ \tilde\Phi^{(k)}_k(z)\tilde\Psi^{(1)*}_1(w) = \chi(w/z)\,\tilde\Psi^{(1)*}_1(w)\tilde\Phi^{(k)}_k(z), \tag{4.45} \]

where (q 2r−2k+2 /z; q 4 , p)∞ (q 2k+2 /z; q 4 , p)∞ (q 2r+2 z; q 4 , p)∞ (q 2 z; q 4 , p)∞ , (q 2r+2 /z; q 4 , p)∞ (q 2 /z; q 4 , p)∞ (q 2r−2k+2 z; q 4 , p)∞ (q 2k+2 z; q 4 , p)∞ (4.46) 2 (k−1) r σ(w/z) s1 (z) = z 2k(r−k) − k(k+2) (4.47) σ(z/w)

rk (z) = z

k(r−k) 2r

with σ(z) = (q 2(k+1) z; q 2k , q 2(k+2) )2∞ (q 2(k+2) z; q 4 , q 2k )∞ (q 2k z; q 4 , q 2k )∞ (q 4 z; q 4 , p∗ )∞ (z; q 4 , p∗ )∞ (q 4k z; q 2k , q 2(k+2) )2∞ (q 4 z; q 2k , q 2(k+2) )2∞ (q 2(k+1) z; q 4 , q 2k )2∞ (q 2 z; q 4 , p∗ )2∞ (4.48) and χ(z) = z 1/2

2q4 (q/z) . 2q4 (qz)

(4.49)

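The function $r_k(z) \equiv 1/\kappa(u)$ with $z = q^{2u}$ satisfies the following inversion relations:
\[ \kappa(u)\kappa(-u) = 1, \tag{4.50} \]
\[ \kappa(u)\kappa(-2-u) = \frac{\vartheta_1\bigl(\frac{k+1+u}{r}\bigr)\vartheta_1\bigl(\frac{k-1-u}{r}\bigr)}{\vartheta_1\bigl(\frac{1+u}{r}\bigr)\vartheta_1\bigl(\frac{-1-u}{r}\bigr)}. \tag{4.51} \]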

Therefore, according to Appendix D in the third reference in [23], one can identify $\kappa(u)$ with the partition function per site for the k-fusion RSOS model in the regime III. On the other hand, the logarithmic derivative of the function $\chi(z)$ gives the excitation energy of the kink [45].

5. q-Deformation of the Coset Theory

In this section, we discuss a q-deformation of the coset conformal field theory based on the algebra $U_{q,p}(\widehat{sl}_2)$ and make an identification of it with the k-fusion RSOS model, $k \in \mathbb{Z}_{>0}$.

5.1. Definition. The Virasoro central charge of the $Z_k$ parafermion theory is $c_{PF} = \frac{2(k-1)}{k+2}$. Hence the level one ($k = 1$) parafermion theory has zero central charge and gives


a trivial contribution to the coset theory. This should be true in the q-deformed theory, too. Noting this, one can see that at level one, the currents $E(z)$ and $F(z)$ in (4.13) and (4.14) coincide with the screening currents $\bar V(z)$ and $\bar U(z)$ in the q-Virasoro algebra [30]. In addition, the type I vertex $\tilde\Phi^{(1)}_1(z)$ coincides with the vertex $\Psi^+(z)$ in [30, 31]. In this sense, the algebra $U_{q,p}(\widehat{sl}_2)$ at level one governs the structure of the q-Virasoro algebra. In fact, at level one, the q-Virasoro algebra generator $T(z)$ is obtained by [31]
\[ T(z) = \Lambda(zq^{-1}) + \Lambda^{-1}(zq), \tag{5.1} \]
\[ \Lambda(z) = :\tilde\Phi^{(1)}_1(zq^{-r})^{-1}\,\tilde\Phi^{(1)}_1(zq^{r}):. \tag{5.2} \]

For generic level k, as mentioned in Remarks 4.1 and 4.3, one can regard the currents $E(z)$, $F(z)$ and the type I vertex $\Phi^{(l)}_l(z)$ ($l = 0, 1, \ldots, k$) as the q-deformation of the screening currents $S_+(z)$, $S_-(z)$ and the primary field $\Psi_{l;l+1,1}$ in the coset CFT. These observations lead us to the following characterization of a q-deformation of the coset theory $SU(2)_k \times SU(2)_{r-k-2}/SU(2)_{r-2}$, or equivalently a q-deformation of the extended Virasoro algebra in the free boson realization:

1. The theory is obtained as a composition of the q-deformed $Z_k$ parafermion theory and the $\phi_0$ boson theory.
2. The screening currents satisfy the algebra $U_{q,p}(\widehat{sl}_2)$.
3. The q-deformations of the primary fields are the intertwiners between the infinite dimensional representations of $U_{q,p}(\widehat{sl}_2)$.

We have not yet succeeded in obtaining the q-Virasoro generators with central charge c (2.5) or any extra generators. However, the free boson realization of the screening operators and the type I, type II vertex operators enables us to analyze the structure of the highest weight representation of the q-Virasoro algebras [30]. In the following section, we carry out such an analysis for the case $k > 1$. The resultant irreducible representation turns out to be identified with the space of states of the k-fusion RSOS model.

5.2. Fock modules. Let $J = |n' - n \pmod{2k}|$ and $M = n' - n \pmod{2k}$. Define the highest weight state $|J, M; n', n\rangle$ by
\[ |J, M; n', n\rangle = |J, M\rangle_{PF} \otimes |n', n\rangle_0, \tag{5.3} \]
\[ |J, M\rangle_{PF} = e^{\frac{J}{2(k+2)}Q_1 + \frac{M}{2k}Q_2}\,|0\rangle_{PF}, \tag{5.4} \]
\[ |n', n\rangle_0 = e^{-i\sqrt{2}\,\alpha_{n',n}Q_0}\,|0\rangle_0. \tag{5.5} \]

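Here $|0\rangle_{PF}$ and $|0\rangle_0$ denote the SL(2, C) invariant vacuum states defined by
\[ a_{j,m}|0\rangle_{PF} = 0 = P_j|0\rangle_{PF} \quad (j = 1, 2;\ m \in \mathbb{Z}_{>0}), \tag{5.6} \]
\[ a_{0,m}|0\rangle_0 = 0 = P_0|0\rangle_0 \quad (m \in \mathbb{Z}_{>0}). \tag{5.7} \]
Note that the highest weight states can be obtained by making the vertex operator act on the SL(2, C) invariant vacuum state in the same way as (2.14) in CFT. The Fock modules $\mathcal{F}_{J,M;n',n} = \mathcal{F}^{PF}_{J,M} \otimes \mathcal{F}^{\phi_0}_{n',n}$ are defined by
\[ \mathcal{F}^{PF}_{J,M} = \mathbb{C}[a_{1,-m_1}, a_{2,-m_2}\ (m_1, m_2 \in \mathbb{Z}_{>0})]\,|J, M\rangle_{PF}, \tag{5.8} \]
\[ \mathcal{F}^{\phi_0}_{n',n} = \mathbb{C}[a_{0,-m}\ (m \in \mathbb{Z}_{>0})]\,|n', n\rangle_0. \tag{5.9} \]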


We also define the degree of the vector in the Fock modules as an eigenvalue of the operator $L_0$ given by
\[ L_0 = L_0^{PF} + L_0^{\phi_0}, \tag{5.10} \]
\[ L_0^{PF} = \sum_{m>0}\frac{m^2}{[2m][(k+2)m]}\,a_{1,-m}a_{1,m} + \frac{P_1(P_1+2)}{4(k+2)} - \sum_{m>0}\frac{m^2}{[2m][km]}\,a_{2,-m}a_{2,m} - \frac{P_2^2}{4k}, \tag{5.11} \]
\[ L_0^{\phi_0} = \sum_{m>0}\frac{m^2[(r-k)m]}{[2m][km][rm]}\,a_{0,-m}a_{0,m} + \frac{1}{2}P_0\Bigl(P_0 - \sqrt{\frac{2k}{r(r-k)}}\Bigr). \tag{5.12} \]
For a vector $u \in \mathcal{F}_{J,M;n',n}$,
\[ u = \prod_{i=0,1,2} a_{i,-m_{i,1}}a_{i,-m_{i,2}}\cdots a_{i,-m_{i,N_i}}\,|J, M; n', n\rangle, \tag{5.13} \]
its degree N is given by
\[ L_0 u = (h_{J,M;n',n} + N)\,u, \qquad N = \sum_{i=0,1,2}\sum_{j=1}^{N_i} m_{i,j}, \tag{5.14} \]
where
\[ h_{J,M;n',n} = h_{J,M} + \frac{(nr - n'(r-k))^2 - k^2}{4kr(r-k)}, \tag{5.15} \]
\[ h_{J,M} = \frac{J(J+2)}{4(k+2)} - \frac{M^2}{4k}. \tag{5.16} \]
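As an illustration of (5.14)–(5.16), the weights and degrees can be evaluated with exact rational arithmetic. The following is a minimal sketch; the function names `h_pf`, `h_total` and `degree` are hypothetical helpers introduced here, not notation from the text, and the labels are assumed to be integers with $r > k > 0$.

```python
from fractions import Fraction


def h_pf(J, M, k):
    """Parafermion weight h_{J,M} of (5.16). Hypothetical helper name."""
    return Fraction(J * (J + 2), 4 * (k + 2)) - Fraction(M * M, 4 * k)


def h_total(J, M, n_prime, n, k, r):
    """Weight h_{J,M;n',n} of (5.15): parafermion part plus phi_0 boson part."""
    boson = Fraction((n * r - n_prime * (r - k)) ** 2 - k * k,
                     4 * k * r * (r - k))
    return h_pf(J, M, k) + boson


def degree(modes):
    """Degree N of (5.14).

    `modes` holds one list of positive integers m_{i,j} per boson i = 0, 1, 2.
    """
    return sum(sum(ms) for ms in modes)


if __name__ == "__main__":
    # Example labels chosen only for illustration: k = 2, r = 5.
    print(h_total(1, 1, 2, 1, k=2, r=5))
    print(degree([[1, 2], [3], []]))
```

It follows directly from (5.15) that replacing $(n', n)$ by $(-n', -n)$ leaves the weight unchanged, while flipping only the sign of $n$ at fixed $M$ shifts it by $nn'/k$ (up to sign); the sketch makes such checks easy to repeat for other labels.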

5.3. Screening currents. In Sect. 5.1, we identified the currents $E(z)$ and $F(z)$ with the q-deformation of the screening currents $S_+(z)$ and $S_-(z)$ in the coset CFT. As mentioned in Sect. 2, we need two more screening currents, which govern the structure of the Fock representation space of the q-deformed $Z_k$ parafermion theory. Such screening currents $S(z): \mathcal{F}_{J,M;n,n'} \to \mathcal{F}_{J-2,M;n,n'}$ and $\eta(z): \mathcal{F}_{J,M;n,n'} \to \mathcal{F}_{J+k+2,M+k;n,n'}$ were already realized by Matsuo [44]. They are given by the following formulae.

k + 2 −1 : exp φ1 k + 2 z; − S(z) = (q − q −1 ) 2 k + 2 k (+) (+) × exp φ2 1; 2 z; + φ1 1; 2 z; 2 2 k + 2 k (−) (−) −exp −φ2 1; 2 z; − φ1 1; 2 z; :, (5.17) 2 2 k k + 2 η(z) =: exp −φ1 2 z; − φ2 2 z; :. (5.18) 2 2

Noting that $S(z)$ and $\eta(z)$ depend only on the boson fields $\phi_1$ and $\phi_2$, the following relations are direct consequences of Lemmas 4.1 and 4.5 in [44].


Proposition 5.1. E(z)S(w) = S(w)E(z) = O(1),

i 1 : e8F S (z) : + O(1), z−w i : e8Eη (z) : + O(1),

h

F (z)S(w) = S(w)F (z) = [k + 2]k+2 ∂w h 1 E(z)η(w) = −η(w)E(z) = 1 ∂w z−w F (z)η(w) = −η(w)F (z) = O(1), i h 1 : e8Sη (z) : + O(1), S(z)η(w) = η(w)S(z) = 1 ∂w z−w

(5.19) (5.20) (5.21) (5.22) (5.23)

with k+2 k + φ2 k z; , + φ1 k + 2 z; 8F S (z) = 2 2 k+2 k+4 k 8Eη (z) = − φ0 (k | z) − φ1 2 z; − φ2 k z; − − φ2 2 z; , 2 2 2 k−2 k k+2 8Sη (z) = φ1 k + 2 z; − − φ1 2 z; − φ2 2 z; . 2 2 2 φ00 (k | z)

Here the difference of the function $f(z)$ is defined by
\[ {}_a\partial_z f(z) = \frac{f(q^a z) - f(q^{-a} z)}{q - q^{-1}}. \]
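To make the difference operator concrete, here is a small sketch that implements it exactly as written above and applies it to $f(z) = z$. The names `q_number` and `q_difference` are hypothetical, and the q-number $[a] = (q^a - q^{-a})/(q - q^{-1})$ is assumed to be the convention behind the brackets $[2m]$, $[km]$ used in (5.11); that convention is defined outside this section, so treat it as an assumption.

```python
from fractions import Fraction


def q_number(a, q):
    """Assumed q-number convention: [a] = (q^a - q^-a) / (q - q^-1)."""
    return (q ** a - q ** (-a)) / (q - 1 / q)


def q_difference(f, a, q, z):
    """Difference of f at z with step a: (f(q^a z) - f(q^-a z)) / (q - q^-1)."""
    return (f(q ** a * z) - f(q ** (-a) * z)) / (q - 1 / q)


if __name__ == "__main__":
    q = Fraction(2)  # exact arithmetic; any q != 0, +1, -1 works
    z = Fraction(5)
    # For f(z) = z the difference reduces to [a] * z.
    print(q_difference(lambda w: w, 3, q, z), q_number(3, q) * z)
```

For the linear function the operator simply multiplies by the q-number $[a]$, which is why it plays the role of a derivative in the $q \to 1$ limit.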

In addition, from Theorem 4.1, the following commutation relations hold.

Proposition 5.2.
\[ E(z)E(w) = \frac{[u-v+1]_{r-k}}{[u-v-1]_{r-k}}\,E(w)E(z), \tag{5.24} \]
\[ F(z)F(w) = \frac{[u-v-1]_{r}}{[u-v+1]_{r}}\,F(w)F(z), \tag{5.25} \]
\[ S(z)S(w) = \frac{[u-v+1]_{k+2}}{[u-v-1]_{k+2}}\,S(w)S(z). \tag{5.26} \]

Here the notation $[u]_x$ ($x = r,\ r-k,\ k+2$) stands for the theta function
\[ [u]_x = \vartheta_1\Bigl(\frac{u}{x}\Bigm|\tau_x\Bigr), \tag{5.27} \]
where we set $q^{2x} = e^{-2\pi i/\tau_x}$. Hence $\tau_r = \tau$ and $\tau_{r-k} = \tau^*$. One should note the following quasi-periodicities:
\[ [u+x]_x = -[u]_x, \qquad [u + x\tau_x]_x = -e^{-\pi i(2u/x+\tau_x)}\,[u]_x. \tag{5.28} \]
Now let us define a set of screening operators.
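As a numerical aside, the two quasi-periodicities of $[u]_x$ can be checked with a truncated series. This is a sketch under an assumption: it takes $\vartheta_1(v|\tau) = 2\sum_{n\ge 0}(-1)^n e^{i\pi\tau(n+1/2)^2}\sin((2n+1)\pi v)$, the standard convention with period 1 in $v$, which may differ in normalization from the one fixed earlier in the paper. The names `theta1` and `bracket` are hypothetical.

```python
import cmath


def theta1(v, tau, terms=30):
    """Truncated series for theta_1(v | tau) in the period-1 convention (assumed)."""
    total = 0j
    for n in range(terms):
        weight = cmath.exp(1j * cmath.pi * tau * (n + 0.5) ** 2)
        total += (-1) ** n * weight * cmath.sin((2 * n + 1) * cmath.pi * v)
    return 2 * total


def bracket(u, x, tau_x):
    """[u]_x = theta_1(u / x | tau_x), following (5.27)."""
    return theta1(u / x, tau_x)


if __name__ == "__main__":
    # Sample values chosen only for illustration; Im(tau_x) > 0 is required.
    x, tau_x, u = 5, 0.8j, 1.3 + 0.2j
    b = bracket(u, x, tau_x)
    shift_x = bracket(u + x, x, tau_x)
    shift_tau = bracket(u + x * tau_x, x, tau_x)
    factor = -cmath.exp(-1j * cmath.pi * (2 * u / x + tau_x))
    print(abs(shift_x + b), abs(shift_tau - factor * b))
```

Both printed differences should be at the level of floating-point rounding. These are the properties that make the integrands of the screening operators single valued in z.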


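Definition 5.3.
\[ Q_+ = \oint\frac{dz}{2\pi i z}\,E(z)\,\frac{[u - \frac{1}{2} + \hat\Pi]_{r-k}}{[u - \frac{1}{2}]_{r-k}}, \tag{5.29} \]
\[ Q_- = \oint\frac{dz}{2\pi i z}\,F(z)\,\frac{[u + \frac{1}{2} - \hat\Pi']_{r}}{[u + \frac{1}{2}]_{r}}, \tag{5.30} \]
\[ Q = \oint\frac{dz}{2\pi i z}\,S(z)\,\frac{[u - \frac{1}{2} + P_1]_{k+2}}{[u - \frac{1}{2}]_{k+2}}, \tag{5.31} \]
\[ \eta_0 = \oint\frac{dz}{2\pi i}\,\eta(z), \tag{5.32} \]
where
\[ \hat\Pi = \sqrt{\frac{2r(r-k)}{k}}\,P_0 - \frac{r-k}{k}\,P_2, \tag{5.33} \]
\[ \hat\Pi' = \sqrt{\frac{2r(r-k)}{k}}\,P_0 - \frac{r}{k}\,P_2. \tag{5.34} \]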

Due to the quasi-periodicity (5.28), the integrands in $Q_+$, $Q_-$ and $Q$ are single valued in z. In addition the integrand in $\eta_0$ is single valued on $\mathcal{F}_{J,M;n',n}$. Therefore all the integrations in Definition 5.3 can be taken over a closed contour on $\mathcal{F}_{J,M;n',n}$. The following commutation relations hold.

Lemma 5.4.
\[ [Q_\pm, Q] = 0, \tag{5.35} \]
\[ [Q_\pm, \eta_0] = 0, \tag{5.36} \]
\[ \{Q, \eta_0\} = 0, \tag{5.37} \]

where {A, B} = AB + BA. ˆ S(w)] = 0, [P1 , E(z)] = 0, the relation Proof of (5.35). From Proposition 5.1 and [5, [Q+ , Q] = 0 is trivial. For the commutation [Q− , Q], we have from Proposition 5.1 and ˆ 0 , S(w)] = 0, [P1 , F (z)] = 0, [5 I I h 1 i dz dw [Q− , Q] = −[k + 2] : e8F S (z) : k+2 ∂w z−w |z|=1 2πi Cz 2πi ×

ˆ 0 ]r [v − 1 + P1 ]k+2 [u + 21 − 5 2 . [u + 21 ]r [v − 21 ]k+2

(5.38)

Here $C_z$ denotes a closed contour enclosing the points $q^{\pm(k+2)}z$. After taking the integral in w, the quasi-periodicity of the theta function $[u]_{k+2}$ makes the right hand side of (5.38) vanish. The other statements can be proved in a similar way.

Remark 5.1. Due to the theta function factor in Definition 5.3, the screening operators $Q_+$ and $Q_-$ do not commute with each other. However, the relation $[Q_+^{n}, Q_-^{n'}] = 0$ holds on $\mathcal{F}_{J,M;n',n}$.


Let us set $Q^+_n = Q_+^{\,n}$, $Q^-_n = Q_-^{\,n}$, and $Q_n = Q^n$. Then we claim

Theorem 5.5. The screening operators $Q_+$, $Q_-$, $Q$ and $\eta_0$ are nilpotent:
\[ Q^+_n Q^+_{r-k-n} = Q^+_{r-k-n} Q^+_n = Q^+_{r-k} = 0, \tag{5.39} \]
\[ Q^-_n Q^-_{r-n} = Q^-_{r-n} Q^-_n = Q^-_r = 0, \tag{5.40} \]
\[ Q_n Q_{k+2-n} = Q_{k+2-n} Q_n = Q_{k+2} = 0, \tag{5.41} \]
\[ \eta_0^2 = 0. \tag{5.42} \]

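The proof is due to the following lemma$^7$.

Lemma 5.6.
\[ Q^+_n = \frac{[n]_{r-k}!}{n!\,[1]^n_{r-k}}\prod_{j=1}^{n}\oint\frac{dz_j}{2\pi i z_j}\,E(z_j)\ \times\ \prod_{i<j}\frac{[u_i - u_j]_{r-k}}{[u_i - u_j - 1]_{r-k}}\prod_{i=1}^{n}\frac{[u_i + \frac{1}{2} + \hat\Pi - n]_{r-k}}{[u_i - \frac{1}{2}]_{r-k}}, \tag{5.43} \]
\[ Q^-_n = \frac{[n]_{r}!}{n!\,[1]^n_{r}}\prod_{j=1}^{n}\oint\frac{dz_j}{2\pi i z_j}\,F(z_j)\ \times\ \prod_{i<j}\frac{[u_i - u_j]_{r}}{[u_i - u_j - 1]_{r}}\prod_{i=1}^{n}\frac{[u_i - \frac{1}{2} - \hat\Pi' + n]_{r}}{[u_i + \frac{1}{2}]_{r}}, \tag{5.44} \]
\[ Q_n = \frac{[n]_{k+2}!}{n!\,[1]^n_{k+2}}\prod_{j=1}^{n}\oint\frac{dz_j}{2\pi i z_j}\,S(z_j)\ \times\ \prod_{i<j}\frac{[u_i - u_j]_{k+2}}{[u_i - u_j - 1]_{k+2}}\prod_{i=1}^{n}\frac{[u_i + \frac{1}{2} + P_1 - n]_{k+2}}{[u_i - \frac{1}{2}]_{k+2}}. \tag{5.45} \]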

This lemma is proved by using the commutation relations in Proposition 5.2 and the following theta function identity [48].

Lemma 5.7.
\[ \frac{1}{n!}\sum_{\sigma\in S_n}\prod_{i=1}^{n}[u_{\sigma(i)} - 2i + 2]_x\prod_{\substack{i<j\\ \sigma(i)>\sigma(j)}}\frac{[u_{\sigma(i)} - u_{\sigma(j)} - 1]_x}{[u_{\sigma(i)} - u_{\sigma(j)} + 1]_x} = \frac{[n]_x!}{n!\,[1]^n_x}\prod_{i<j}\frac{[u_i - u_j]_x}{[u_i - u_j - 1]_x}\prod_{i=1}^{n}[u_i - n + 1]_x \quad (x = r,\ r-k,\ k+2). \tag{5.46} \]
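The structure of the identity (5.46) can be explored in a degenerate limit. The sketch below replaces every bracket $[u]_x$ by $u$ itself (a rational limit, which is an assumption made here for illustration and not a statement from the text); both sides are homogeneous of the same degree in the brackets, so the prefactor $[n]_x!/(n!\,[1]^n_x)$ becomes 1. The function names `lhs` and `rhs` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def lhs(u):
    """Left side of (5.46) with [u]_x replaced by u (rational limit, assumed)."""
    n = len(u)
    total = Fraction(0)
    for sigma in permutations(range(n)):
        term = Fraction(1)
        for i in range(n):
            # With 1-based i the factor is u_{sigma(i)} - 2i + 2; here i is 0-based.
            term *= u[sigma[i]] - 2 * i
        for i in range(n):
            for j in range(i + 1, n):
                if sigma[i] > sigma[j]:
                    d = u[sigma[i]] - u[sigma[j]]
                    term *= (d - 1) / (d + 1)
        total += term
    return total / factorial(n)


def rhs(u):
    """Right side of (5.46) in the same limit; the q-factorial prefactor becomes 1."""
    n = len(u)
    value = Fraction(1)
    for i in range(n):
        for j in range(i + 1, n):
            d = u[i] - u[j]
            value *= d / (d - 1)
        value *= u[i] - n + 1
    return value


if __name__ == "__main__":
    sample = [Fraction(3), Fraction(1, 2)]
    print(lhs(sample), rhs(sample))
```

For $n = 2$ the two sides agree identically in this limit, as a short computation with $d = u_1 - u_2$ shows. The functions accept any n, so other cases can be tried at points where no denominator vanishes.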

5.4. Resolution of the Fock modules $\mathcal{F}_{J,M;n',n}$. The Fock modules $\mathcal{F}_{J,M;n',n} = \mathcal{F}^{PF}_{J,M} \otimes \mathcal{F}^{\phi_0}_{n',n}$ are reducible due to the existence of the singular vectors, which are constructed by the screening operators on some highest weight states [10, 31]. In order to obtain irreducible spaces, we consider a resolution of the modules $\mathcal{F}_{J,M;n',n}$ following the method by Felder [37].

$^7$ The screening operator $S(z)$ is equivalent to the one in $U_q(\widehat{sl}_2)$ discussed in [10]. We hence have proved the nilpotency of the BRST charge in $U_q(\widehat{sl}_2)$ in the improved form as (5.31).


As mentioned in Sect. 2, Lemma 5.4 allows us to divide the consideration into the following two steps. First consider the resolution of the $Z_k$ parafermion Fock modules $\mathcal{F}^{PF}_{J,M}$ and get spaces $\mathcal{H}^{PF}_{J,M}$ as irreducible representations. Then consider the resolution of the modules $\mathcal{F}^{PF}_{J,M} \otimes \mathcal{F}^{\phi_0}_{n',n}$.

1) Resolution of the $Z_k$ q-parafermion Fock modules $\mathcal{F}^{PF}_{J,M}$. Let us first remind the reader of the following two facts.

1. The $Z_k$ parafermion theory is obtained as the coset $SU(2)_k/U(1)$. In the q-deformed case, especially in our case, this means that the q-deformed $Z_k$ parafermion theory is obtained from Matsuo's bosonization of $U_q(\widehat{sl}_2)$ by dropping the $\{\alpha_m\}$ boson.
2. The screening currents $S(z)$ and $\eta(z)$ are independent of the $\{\alpha_m\}$ bosons. Therefore these screening currents are common to the $U_q(\widehat{sl}_2)$ theory and the q-parafermion theory, so that the structure of the singular vectors in the q-parafermion Fock modules is the same as in the corresponding Fock modules for $U_q(\widehat{sl}_2)$.

The structure of the Fock modules for $U_q(\widehat{sl}_2)$ was investigated in [44, 10]. The modules are given by $\mathcal{F}_J = \bigoplus_{M\in\mathbb{Z}} \mathcal{F}^{PF}_{J,M} \otimes \mathcal{F}^{\alpha}_M$. Here $\mathcal{F}^{\alpha}_M$ denotes the Fock module of the $\{\alpha_m\}$ boson. In [10], we showed, using the equivalent boson realization, that the Fock modules $\mathcal{F}_J$ are reducible for the highest weight $\lambda_{a,a'} = J_{a,a'}\Lambda_1 + (k - J_{a,a'})\Lambda_0$ due to the existence of the singular vectors. Here $J_{a,a'} = a - (k+2)a' - 1$, the level k being, in general, a rational number $k + 2 = P/P'$, $P, P' \ge 1$, $\mathrm{GCD}(P, P') = 1$ and $1 \le a \le P-1$, $0 \le a' \le P'-1$. The case $P' = 1$ is relevant for the parafermion theory. It was then observed that the Fock module structure, i.e. the degree of the singular vectors and the cosingular vectors as well as the multiplicities in each degree, seems to be the same as in the CFT case [38]. This was checked by calculating the characters of the irreducible representation spaces obtained by a resolution of the Fock modules.

Following the procedure given for $U_q(\widehat{sl}_2)$ [44, 10] and the analysis in CFT [46, 47], we make the resolution of the q-parafermion Fock modules $\mathcal{F}^{PF}_{J,M}$ as follows. We first consider the following restriction:
\[ \tilde{\mathcal{F}}^{PF}_{J,M} = \mathrm{Ker}\bigl(\eta_0 : \mathcal{F}^{PF}_{J,M} \to \mathcal{F}^{PF}_{J+k+2,M+k}\bigr). \tag{5.47} \]

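Since $\eta_0^2 = 0$, the complex of the Fock modules associated with the map $\eta_0 : \mathcal{F}^{PF}_{J,M} \to \mathcal{F}^{PF}_{J+k+2,M+k}$ has trivial cohomology. We hence have
\[ \mathrm{tr}_{\tilde{\mathcal{F}}^{PF}_{J,M}}\,O = \sum_{u\ge 0}(-)^u\,\mathrm{tr}_{\mathcal{F}^{PF[u]}_{J,M}}\,O \tag{5.48} \]
and
\[ 0 = \sum_{u\in\mathbb{Z}}(-)^u\,\mathrm{tr}_{\mathcal{F}^{PF[u]}_{J,M}}\,O, \tag{5.49} \]
where $\mathcal{F}^{PF[u]}_{J,M} \equiv \mathcal{F}^{PF}_{J+(k+2)u,\,M+ku}$. On the other hand, the operator Q generates the following complex of the restricted Fock modules $\tilde{\mathcal{F}}^{PF}_{J,M}$:
\[ \cdots \xrightarrow{Q^{[-2]}_{J+1}} \tilde{\mathcal{F}}^{PF[-1]}_{J,M} \xrightarrow{Q^{[-1]}_{J+1}} \tilde{\mathcal{F}}^{PF[0]}_{J,M} \xrightarrow{Q^{[0]}_{J+1}} \tilde{\mathcal{F}}^{PF[1]}_{J,M} \xrightarrow{Q^{[1]}_{J+1}} \tilde{\mathcal{F}}^{PF[2]}_{J,M} \xrightarrow{Q^{[2]}_{J+1}} \cdots. \tag{5.50} \]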


Here we introduced the notations $Q^{[2s]}_{J+1} = Q_{J+1}$ and $Q^{[2s+1]}_{J+1} = Q_{(k+2)-J-1}$. These are operators acting on the modules $\mathcal{F}^{PF[2s]}_{J,M} = \mathcal{F}^{PF}_{J-2s(k+2),M}$ and $\mathcal{F}^{PF[2s+1]}_{J,M} = \mathcal{F}^{PF}_{-J-2-2s(k+2),M}$, respectively. The results in CFT [38, 46, 47, 39] and the investigation in the quantum affine algebra $U_q(\widehat{sl}_2)$ [10] lead us to the following conjecture.

Conjecture 5.8. The cohomology of the complex (5.50) is given by
\[ \mathrm{Ker}\,Q^{[s]}_{J+1}/\mathrm{Im}\,Q^{[s-1]}_{J+1} = \begin{cases} 0 & \text{for } s \ne 0, \\ \mathcal{H}^{PF}_{J,M} & \text{for } s = 0, \end{cases} \tag{5.51} \]
where $\mathcal{H}^{PF}_{J,M}$ is the irreducible highest weight module of the q-deformed $Z_k$ parafermion theory with the highest weight $h_{J,M}$ (5.16).

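As a consequence, we obtain the following trace formula.

Corollary 5.9.
\[ \mathrm{tr}_{\mathcal{H}^{PF}_{J,M}}\,O = \sum_{s\in\mathbb{Z}}(-)^s\,\mathrm{tr}_{\tilde{\mathcal{F}}^{PF[s]}_{J,M}}\,O^{[s]}, \tag{5.52} \]
where $O^{[s]}$ is an operator on $\mathcal{F}^{PF[s]}_{J,M}$ obtained by the recursion formula
\[ Q^{[s]}_{J+1}\,O^{[s]} = O^{[s+1]}\,Q^{[s]}_{J+1} \tag{5.53} \]
with $O^{[0]} = O$.

Combining (5.52) and (5.48), we get
\[ \mathrm{tr}_{\mathcal{H}^{PF}_{J,M}}\,O = \sum_{s\in\mathbb{Z}}\sum_{u\ge 0}(-)^{s+u}\,\mathrm{tr}_{\mathcal{F}^{PF[s,u]}_{J,M}}\,O^{[s]}, \tag{5.54} \]
where
\[ \mathcal{F}^{PF[2s,u]}_{J,M} = \mathcal{F}^{PF}_{J+(k+2)(u-2s),\,M+ku}, \qquad \mathcal{F}^{PF[2s+1,u]}_{J,M} = \mathcal{F}^{PF}_{-J-2+(k+2)(u-2s),\,M+ku}. \]
To support the conjecture, let us apply this formula to the calculation of the character of the space $\mathcal{H}^{PF}_{J,M}$. We obtain
\[ \chi^{PF}_{J,M}(\omega) = \mathrm{tr}_{\mathcal{H}^{PF}_{J,M}}\,y^{L_0^{PF} - c_{PF}/24} = \eta(\omega)\,c^J_M(\omega) \tag{5.55} \]
with $y = e^{2\pi i\omega}$. Here the function $c^J_M(\omega)$ denotes the string function
\[ c^J_M(\omega) = \eta(\omega)^{-3}\sum_{s\in\mathbb{Z}}\sum_{u\ge 0}(-1)^u\Bigl(y^{B^{[s,u]}_{J,M}} - y^{B^{[s,u]}_{-J-2,M}}\Bigr), \tag{5.56} \]
\[ B^{[s,u]}_{J,M} = \frac{(J + 1 - 2(k+2)(s - u/2))^2}{4(k+2)} - \frac{(M + ku)^2}{4k}, \tag{5.57} \]
with $\eta(\omega)$ being the Dedekind eta function
\[ \eta(\omega) = y^{1/24}\prod_{n=1}^{\infty}(1 - y^n). \tag{5.58} \]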
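The exponents in (5.56)–(5.57) can be cross-checked against the weights of Sect. 5.2. The sketch below uses exact rationals to verify two consistency statements that follow from the formulas above: the leading exponent satisfies $B^{[0,0]}_{J,M} - \frac{1}{12} = h_{J,M} - \frac{c_{PF}}{24}$ (the shift $\frac{1}{12}$ comes from the factor $\eta(\omega)^{-2}$), and $B^{[s,u]}_{J,M}$ is $B^{[0,0]}$ evaluated at the shifted labels of $\mathcal{F}^{PF[2s,u]}_{J,M}$. The function names are hypothetical helpers.

```python
from fractions import Fraction


def c_pf(k):
    """Central charge of the Z_k parafermion theory, 2(k-1)/(k+2)."""
    return Fraction(2 * (k - 1), k + 2)


def h_pf(J, M, k):
    """Parafermion weight h_{J,M} of (5.16)."""
    return Fraction(J * (J + 2), 4 * (k + 2)) - Fraction(M * M, 4 * k)


def b_exponent(J, M, k, s, u):
    """Exponent B^{[s,u]}_{J,M} of (5.57); note 2(k+2)(s - u/2) = (k+2)(2s - u)."""
    first = Fraction((J + 1 - (k + 2) * (2 * s - u)) ** 2, 4 * (k + 2))
    second = Fraction((M + k * u) ** 2, 4 * k)
    return first - second


if __name__ == "__main__":
    k, J, M = 3, 2, 0
    print(b_exponent(J, M, k, 0, 0) - Fraction(1, 12),
          h_pf(J, M, k) - c_pf(k) / 24)
```

The first printed pair agrees for every choice of labels, which is the statement that the leading term of (5.55) carries the power $y^{h_{J,M} - c_{PF}/24}$.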

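The result (5.55) is precisely the irreducible character of the $Z_k$ parafermion theory.

2) Resolution of the Fock modules $\mathcal{F}_{J,M;n',n}$. Replacing the space $\mathcal{F}^{PF}_{J,M}$ with $\mathcal{H}^{PF}_{J,M}$ in $\mathcal{F}_{J,M;n',n}$, now we consider the Fock modules $\tilde{\mathcal{F}}_{J,M;n',n} = \mathcal{H}^{PF}_{J,M} \otimes \mathcal{F}^{\phi_0}_{n',n}$ with $J = |n' - n \pmod{2k}|$, $M = n' - n \pmod{2k}$, $1 \le n \le r-k-1$, $1 \le n' \le r-1$, $0 \le J \le k$. Let us introduce the notations $Q^{+[2t]}_n = Q^+_n$, $Q^{+[2t+1]}_n = Q^+_{r-k-n}$, $Q^{-[2t]}_{n'} = Q^-_{n'}$, $Q^{-[2t+1]}_{n'} = Q^-_{r-n'}$ ($t \in \mathbb{Z}$) and
\[ \tilde{\mathcal{F}}^{[2t]}_{J,M;n',n} = \tilde{\mathcal{F}}_{J,\,M+2(r-k)t;\,n',\,n-2(r-k)t}, \tag{5.59} \]
\[ \tilde{\mathcal{F}}^{[2t+1]}_{J,M;n',n} = \tilde{\mathcal{F}}_{J,\,M+2n+2(r-k)t;\,n',\,-n-2(r-k)t}, \tag{5.60} \]
\[ \tilde{\mathcal{F}}^{[2t]'}_{J,M;n',n} = \tilde{\mathcal{F}}_{J,\,M+2rt;\,n'-2rt,\,n}, \tag{5.61} \]
\[ \tilde{\mathcal{F}}^{[2t+1]'}_{J,M;n',n} = \tilde{\mathcal{F}}_{J,\,M-2n'+2rt;\,-n'-2rt,\,n}. \tag{5.62} \]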

Then, the screening operators $Q_+$, $Q_-$ generate the following infinite sequences of the Fock modules $\tilde{\mathcal{F}}_{J,M;n',n}$:
\[ \cdots \xrightarrow{Q^{+[-2]}_n} \tilde{\mathcal{F}}^{[-1]}_{J,M;n',n} \xrightarrow{Q^{+[-1]}_n} \tilde{\mathcal{F}}^{[0]}_{J,M;n',n} \xrightarrow{Q^{+[0]}_n} \tilde{\mathcal{F}}^{[1]}_{J,M;n',n} \xrightarrow{Q^{+[1]}_n} \cdots, \tag{5.63} \]
\[ \cdots \xrightarrow{Q^{-[-2]}_{n'}} \tilde{\mathcal{F}}^{[-1]'}_{J,M;n',n} \xrightarrow{Q^{-[-1]}_{n'}} \tilde{\mathcal{F}}^{[0]'}_{J,M;n',n} \xrightarrow{Q^{-[0]}_{n'}} \tilde{\mathcal{F}}^{[1]'}_{J,M;n',n} \xrightarrow{Q^{-[1]}_{n'}} \cdots. \tag{5.64} \]
Due to Theorem 5.5, these are complexes. As in the level one case [30], it is enough to consider one of them. Let us consider the complex (5.63). The following observation suggests the existence of the singular vectors in $\tilde{\mathcal{F}}_{J,M;n',n}$ similar to the CFT case [39]. Let $M_{n',n} = n' - n \pmod{2k}$ and consider the vector
\[ |\chi_{-n',n}\rangle = Q^+_n\,|J, M_{-n',n}; -n', n\rangle \in \mathcal{F}_{J,M_{-n',-n};-n',-n}. \tag{5.65} \]

Using Lemma 5.6 and the operator product 1 2 2 (q 2(k+1) z /z ; q 2k ) − 2 1 z1 k q −2 9(z1 )9(z2 ) = q − q −1 (q z2 /z1 ; q 2k ) × (1 − z2 /z1 ) : 9I (z1 )9I (z2 ) : −(1 − q −2 z2 /z1 ) : 9I (z1 )9II (z2 ) :

−q −2 (1 − q 2 z2 /z1 ) : 9II (z1 )9I (z2 ) : +(1 − z2 /z1 ) : 9II (z1 )9II (z2 ) : , (5.66)

where we set the RHS of (4.15) as

1 q−q −1 (9I (z)

− 9II (z)), we have

Q+n |J, M−n0 ,n ; −n0 , ni I I √ r n αn0 ,n − M [n]r−k ! −1 n dz1 dzn Y 2 k(r−k) k · · · z = i n −1 n![1]r−k q − q 2πiz1 2πizn ×

Y i<j

i=1

2 r−k

zi

2(k+2)

q

2k

(q zj /zi ; q ) (q −2 zj /zi ; q 2k )


× sum of (polynomials of zj /zi (1 ≤ i < j ≤ n)) × exp {polynomials of a0,−m , a1,−m0 , a2,−m00 ,

zj (m, m0 , m00 ∈ Z > 0, j = 1, .., n)} ×|J, M−n0 ,−n ; −n0 , −ni.

(5.67)

Setting $z_1 = z$, $z_j = z w_j$ ($j = 2, \ldots, n$) and collecting all the z-dependence in the integrand, one can factor the integral $\oint dz\, z^{-1-N-\frac{n(n'+n)}{k}-\frac{n}{k}M_{-n',n}}$, where $N \in \mathbb{Z}_{\ge 0}$ comes from the exponent in the fourth line of (5.67). Let us evaluate N. From (5.14), the non-vanishing term in (5.67) has degree $N + h_{J,M_{-n',-n};-n',-n}$. On the other hand, the BRST charge $Q^+_n$ commutes with $L_0$. Hence the degree of the same term should be equal to $h_{J,M_{-n',n};-n',n}$. Hence we have $N = h_{J,M_{-n',n};-n',n} - h_{J,M_{-n',-n};-n',-n} = (4n'n + M^2_{-n',-n} - M^2_{-n',n})/4k$. This value of N is consistent with the non-vanishing of the integral $\oint dz$. We hence conjecture the following statement [39].

Conjecture 5.10. The complex (5.63) has non-trivial cohomology only at $t = 0$, i.e.
\[ \mathrm{Ker}\,Q^{+[t]}_n/\mathrm{Im}\,Q^{+[t-1]}_n = \begin{cases} 0 & \text{for } t \ne 0, \\ \mathcal{H}_{J;n',n} & \text{for } t = 0. \end{cases} \tag{5.68} \]

Here the space $\mathcal{H}_{J;n',n}$ is the conjectural irreducible highest weight representation space of the q-Virasoro algebra with central charge c (2.5) with the highest weight $h_{J;n',n} = h_{J,J;n',n}$. As a consequence, we can derive a trace formula, which relates the trace over $\mathcal{H}_{J;n',n}$ to those over the Fock spaces $\tilde{\mathcal{F}}^{[t]}_{J,M;n',n}$, $t \in \mathbb{Z}$.

Corollary 5.11.
\[ \mathrm{tr}_{\mathcal{H}_{J;n',n}}\,O = \sum_{t\in\mathbb{Z}}(-)^t\,\mathrm{tr}_{\tilde{\mathcal{F}}^{[t]}_{J,M;n',n}}\,O^{[t]}, \tag{5.69} \]
where the operator $O^{[t]}$ on $\tilde{\mathcal{F}}^{[t]}_{J,M;n',n}$ is defined recursively by
\[ Q^{+[t]}_n\,O^{[t]} = O^{[t+1]}\,Q^{+[t]}_n \tag{5.70} \]
with $O^{[0]} = O$.

Combining (5.54) and (5.69), we finally get the formula
\[ \mathrm{tr}_{\mathcal{H}_{J;n',n}}\,O = \sum_{s\in\mathbb{Z}}\sum_{t\in\mathbb{Z}}\sum_{u\ge 0}(-)^{s+t+u}\,\mathrm{tr}_{\mathcal{F}^{[s,t,u]}_{J,M}\otimes\mathcal{F}^{\phi_0[t]}_{n',n}}\,O^{[s,t]}, \tag{5.71} \]
where the Fock modules $\mathcal{F}^{[s,t,u]}_{J,M;n',n}$ denote the following:

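\[ \mathcal{F}^{[2s,2t,u]}_{J,M;n',n} = \mathcal{F}^{PF[u]}_{J-2s(k+2),\,M+2t(r-k)} \otimes \mathcal{F}^{\phi_0}_{n',\,n-2(r-k)t}, \tag{5.72} \]
\[ \mathcal{F}^{[2s+1,2t,u]}_{J,M;n',n} = \mathcal{F}^{PF[u]}_{-J-2-2s(k+2),\,M+2t(r-k)} \otimes \mathcal{F}^{\phi_0}_{n',\,n-2(r-k)t}, \tag{5.73} \]
\[ \mathcal{F}^{[2s,2t+1,u]}_{J,M;n',n} = \mathcal{F}^{PF[u]}_{J-2s(k+2),\,M+2n+2t(r-k)} \otimes \mathcal{F}^{\phi_0}_{n',\,-n-2(r-k)t}, \tag{5.74} \]
\[ \mathcal{F}^{[2s+1,2t+1,u]}_{J,M;n',n} = \mathcal{F}^{PF[u]}_{-J-2-2s(k+2),\,M+2n+2t(r-k)} \otimes \mathcal{F}^{\phi_0}_{n',\,-n-2(r-k)t}. \tag{5.75} \]
The operator $O^{[s,t]}$ acting on the space $\mathcal{F}^{[s,t,u]}_{J,M;n',n}$ is defined recursively by
\[ O^{[s,0]} = O^{[s]}, \tag{5.76} \]
\[ Q^{+[t]}_n\,O^{[s,t]} = O^{[s,t+1]}\,Q^{+[t]}_n. \tag{5.77} \]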

Applying the formula (5.71), we obtain the character of the space $\mathcal{H}_{J;n',n}$:
\[ \chi_{J;n',n}(\omega) = \mathrm{tr}_{\mathcal{H}_{J;n',n}}\,y^{L_0 - c/24} = \sum_{\bar M=-k+1}^{k} c^J_{\bar M}(\omega)\Bigl[\sum_{t\in\mathbb{Z}}\delta_{\bar M,\,M_{n',n-2(r-k)t}}\,y^{B^{[2t]}_{\phi_0}} - \sum_{t\in\mathbb{Z}}\delta_{\bar M,\,M_{n',-n-2(r-k)t}}\,y^{B^{[2t+1]}_{\phi_0}}\Bigr], \tag{5.78} \]
where c is given by (2.5) and
\[ B^{[2t]}_{\phi_0} = \frac{1}{4kr(r-k)}\bigl(n'(r-k) - nr + 2r(r-k)t\bigr)^2, \tag{5.79} \]
\[ B^{[2t+1]}_{\phi_0} = \frac{1}{4kr(r-k)}\bigl(n'(r-k) + nr + 2r(r-k)t\bigr)^2. \tag{5.80} \]
The result (5.78) precisely gives the branching coefficient in the formula
\[ \chi^{(k)}_J(\omega)\,\chi^{(l)}_{n-1}(\omega) = \sum_{n'}\chi_{J;n',n}(\omega)\,\chi^{(k+l)}_{n'-1}(\omega), \tag{5.81} \]
where $\chi^{(k)}_L(\omega)$ ($L = J,\ n-1,\ n'-1$) are the irreducible characters of the affine Lie algebra $\widehat{sl}_2$ labeled by the level k and spin L/2.

The one point local height probability of the k-fusion RSOS model in the regime III is given by the same branching coefficient as (5.78). Hence one may regard the space $\mathcal{H}_{J;n',n}$ as the space of states of the k-fusion RSOS model in the regime III, on which Baxter's CTM acts [2, 23].
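The exponents (5.79) and (5.80) are tied to the $\phi_0$ part of the weight (5.15) evaluated on the shifted modules (5.59) and (5.60). A minimal sketch of that bookkeeping follows, with hypothetical function names and exact rational arithmetic; the constant offset $k/(4r(r-k))$ is just the $-k^2$ term of (5.15).

```python
from fractions import Fraction


def boson_weight(n_prime, n, k, r):
    """phi_0 part of h_{J,M;n',n} in (5.15)."""
    return Fraction((n * r - n_prime * (r - k)) ** 2 - k * k,
                    4 * k * r * (r - k))


def b_phi0(n_prime, n, k, r, t, odd=False):
    """B^{[2t]}_{phi_0} of (5.79), or B^{[2t+1]}_{phi_0} of (5.80) when odd=True."""
    sign = 1 if odd else -1
    numerator = (n_prime * (r - k) + sign * n * r + 2 * r * (r - k) * t) ** 2
    return Fraction(numerator, 4 * k * r * (r - k))


if __name__ == "__main__":
    k, r, n_prime, n, t = 2, 7, 3, 2, 1
    offset = Fraction(k, 4 * r * (r - k))
    # Even step: label n is shifted to n - 2(r-k)t, as in (5.59).
    print(b_phi0(n_prime, n, k, r, t) - offset,
          boson_weight(n_prime, n - 2 * (r - k) * t, k, r))
    # Odd step: label n is replaced by -n - 2(r-k)t, as in (5.60).
    print(b_phi0(n_prime, n, k, r, t, odd=True) - offset,
          boson_weight(n_prime, -n - 2 * (r - k) * t, k, r))
```

Each printed pair coincides, which shows how the alternating sum in (5.78) runs over the $\phi_0$ weights of the modules appearing in the complex (5.63).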

6. Discussion

In summary, we have introduced the elliptic algebra $U_{q,p}(\widehat{sl}_2)$. Based on this algebra, we have extended the bosonization of the ABF model in [30] to the k-fusion RSOS model. We have obtained it as the q-deformation of the coset conformal field theory $SU(2)_k \times SU(2)_{r-k-2}/SU(2)_{r-2}$. A full set of screening currents and the highest component of the two types of vertex operators have been derived. We have observed that these operators give a proper characterization of the Fock spaces as the space of states in the k-fusion RSOS model.


We have conjectured that there exists a corresponding q-deformation of the extended Virasoro algebra in such a way that its screening currents satisfy the algebra $U_{q,p}(\widehat{sl}_2)$ and the q-deformed primary fields are determined as the intertwiners between the infinite dimensional representation spaces of $U_{q,p}(\widehat{sl}_2)$. In order to establish this point, we need to clarify the following.

1. The realization of the q-Virasoro generator with the central charge (2.5) and the extra generators.
2. The Hopf algebra structure of $U_{q,p}(\widehat{sl}_2)$.

Recently, Jimbo has succeeded in deriving the algebra $U_{q,p}(\widehat{sl}_2)$ by the Gauss decomposition of an RLL-relation related to a centrally extended dynamical RLL-relation with the R-matrix introduced by Enriquez and Felder [40]. This result allows us to clarify a Hopf algebra structure of $U_{q,p}(\widehat{sl}_2)$. The work along this line is now in progress [49].

For the completion of the identification of our bosonization with the k-fusion RSOS model, we need further to show that the commutation relations among the type I and the type II vertices are given precisely by the k-fused RSOS Boltzmann weights [29]. We have checked this point both in the $k = 1$ case and in the case of arbitrary k but for the highest components of the type I and type II vertex operators (Prop. 4.5).

As the scaling limit, we also have obtained a bosonization of the algebra $A_{\hbar,\eta}(\widehat{sl}_2)$ at arbitrary level k. The level one case has been shown to be a relevant symmetry for deriving the form factors in the sine-Gordon theory [20]. In this respect, one may relate the higher level case, especially the case $k = 2$, to the super sine-Gordon theory. Our bosonization should be useful for deriving the form factors of such a theory.

It is also an interesting problem to extend our results to the higher rank case as well as to the case of other types of Lie algebras. For these extensions, the corresponding SOS models are known [50, 51]. We expect that these SOS models could be bosonized based on the corresponding extension of our $U_{q,p}(\widehat{sl}_2)$.

A. The Algebra $A_{\hbar,\eta}(\widehat{sl}_2)$

We here give a brief review of the results in [21].

Definition A.1. The algebra $A_{\hbar,\eta}(\widehat{sl}_2)$ is generated by the symbols $\hat e_\lambda$, $\hat f_\lambda$, $\hat t_\lambda$, $\lambda \in \mathbb{R}$ and the central element c with the following relations:
\[ H^+(\alpha)H^-(\beta) = \frac{\mathrm{sh}\,\pi\eta(\alpha-\beta-i\hbar(1-\frac{c}{2}))}{\mathrm{sh}\,\pi\eta(\alpha-\beta+i\hbar(1+\frac{c}{2}))}\,\frac{\mathrm{sh}\,\pi\eta'(\alpha-\beta+i\hbar(1-\frac{c}{2}))}{\mathrm{sh}\,\pi\eta'(\alpha-\beta-i\hbar(1+\frac{c}{2}))}\,H^-(\beta)H^+(\alpha), \tag{A.1} \]
\[ H^\pm(\alpha)H^\pm(\beta) = \frac{\mathrm{sh}\,\pi\eta(\alpha-\beta-i\hbar)}{\mathrm{sh}\,\pi\eta(\alpha-\beta+i\hbar)}\,\frac{\mathrm{sh}\,\pi\eta'(\alpha-\beta+i\hbar)}{\mathrm{sh}\,\pi\eta'(\alpha-\beta-i\hbar)}\,H^\pm(\beta)H^\pm(\alpha), \tag{A.2} \]
\[ H^\pm(\alpha)E(\beta) = \frac{\mathrm{sh}\,\pi\eta(\alpha-\beta-i\hbar(1\mp\frac{c}{2}))}{\mathrm{sh}\,\pi\eta(\alpha-\beta+i\hbar(1\pm\frac{c}{2}))}\,E(\beta)H^\pm(\alpha), \tag{A.3} \]
\[ H^\pm(\alpha)F(\beta) = \frac{\mathrm{sh}\,\pi\eta'(\alpha-\beta+i\hbar(1\mp\frac{c}{2}))}{\mathrm{sh}\,\pi\eta'(\alpha-\beta-i\hbar(1\pm\frac{c}{2}))}\,F(\beta)H^\pm(\alpha), \tag{A.4} \]
\[ E(\alpha)E(\beta) = \frac{\mathrm{sh}\,\pi\eta(\alpha-\beta-i\hbar)}{\mathrm{sh}\,\pi\eta(\alpha-\beta+i\hbar)}\,E(\beta)E(\alpha), \tag{A.5} \]
\[ F(\alpha)F(\beta) = \frac{\mathrm{sh}\,\pi\eta'(\alpha-\beta+i\hbar)}{\mathrm{sh}\,\pi\eta'(\alpha-\beta-i\hbar)}\,F(\beta)F(\alpha), \tag{A.6} \]
\[ E(\alpha)F(\beta) = F(\beta)E(\alpha) = \frac{2\pi}{\hbar}\Bigl[\frac{1}{\alpha-\beta-\frac{i\hbar c}{2}}\,H^+\Bigl(\alpha-\frac{i\hbar c}{4}\Bigr) - \frac{1}{\alpha-\beta+\frac{i\hbar c}{2}}\,H^-\Bigl(\beta-\frac{i\hbar c}{4}\Bigr)\Bigr] + O(1), \tag{A.7} \]

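where $\hbar$ and $\eta\,(>0)$ are real parameters and $1/\eta' - 1/\eta = \hbar c > 0$. The generating functions (currents) $E(\alpha)$, $F(\alpha)$ and $H^\pm(\alpha)$ are defined by the following formulae.
\[ E(\alpha) = \int_{-\infty}^{\infty} d\lambda\, e^{i\lambda\alpha}\,\hat e_\lambda, \tag{A.8} \]
\[ F(\alpha) = \int_{-\infty}^{\infty} d\lambda\, e^{i\lambda\alpha}\,\hat f_\lambda, \tag{A.9} \]
\[ H^\pm(\alpha) = -\frac{\hbar}{2}\int_{-\infty}^{\infty} d\lambda\, e^{i\lambda\alpha}\,\hat t_\lambda\, e^{\mp\lambda/2\eta''}, \tag{A.10} \]
with $\eta'' = \frac{2\eta\eta'}{\eta+\eta'}$.

Define other generating functions $e^\pm(\alpha)$, $f^\pm(\alpha)$ and $h^\pm(\alpha)$ by
\[ e^\pm(\alpha) = \sin\pi\eta\hbar\int_C\frac{d\gamma}{2\pi i}\,\frac{E(\gamma)}{\mathrm{sh}\,\pi\eta(\alpha-\gamma\pm i\hbar\frac{c}{4})}, \tag{A.11} \]
\[ f^\pm(\alpha) = \sin\pi\eta'\hbar\int_{C'}\frac{d\gamma}{2\pi i}\,\frac{F(\gamma)}{\mathrm{sh}\,\pi\eta'(\alpha-\gamma\mp i\hbar\frac{c}{4})}, \tag{A.12} \]
\[ h^\pm(\alpha) = \frac{\sin\pi\eta\hbar}{\pi\eta\hbar}\,H^\pm(\alpha). \tag{A.13} \]
The following relations hold.
\[ e^-(\alpha) = -e^+(\alpha - i/\eta''), \qquad f^-(\alpha) = -f^+(\alpha - i/\eta''), \qquad h^-(\alpha) = h^+(\alpha - i/\eta''). \tag{A.14} \]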
c2 ) is given for e+ (α, ξ) = e+ (α), f + (α, ξ) = The comultiplication of the algebra A~,η (sl + + f (α), h (α, ξ) = h (α) by the formulae +

1c = c0 + c00 = c ⊗ 1 + 1 ⊗ c, 1e+ (α, ξ) = e+ (α0 , ξ) ⊗ 1 ∞ X + (−1)p (f + (α0 − i~, ξ 0 ))p h+ (α0 , ξ 0 ) ⊗ (e+ (α00 , ξ 00 ))p+1 ,

(A.15)

(A.16)

p=0

1f + (α, ξ) = 1 ⊗ f + (α0 , ξ) ∞ X + (−1)p (f + (α0 , ξ 0 ))p+1 ⊗ h˜ + (α00 , ξ 00 )(e+ (α00 − i~, ξ 00 ))p , (A.17) p=0 +

1h (α, ξ) = h (α0 , ξ 0 ) ⊗ h+ (α00 , ξ 00 ) ∞ X + (−1)p [p + 1]η (f + (α0 − i~, ξ 0 ))p h+ (α0 , ξ 0 ) +

p=0

⊗h+ (α00 , ξ 00 )(e+ (α00 − i~, ξ 00 ))p ,

(A.18)

b2 ) and the Fusion RSOS Model Elliptic Algebra Uq,p (sl

399

where ξ = 1/η = ξ 0 , ξ 00 = ξ + ~c, α0 = α + i~c00 /4, α00 = α − i~c0 /4 and [p]η =

η sinπη 0 ~ + h˜ + (α) = 0 h (α). η sinπη~

sinπη~p , sinπη~

Let V (l) be the l + 1-dimensional representation of Uρ (sl2 ), ρ = eiπη~ with a basis vm (m = 0, 1, 2, .., l). The l + 1-dimensional evaluation representation V (l) (ζ) = V (l) ⊗ C[[eiλζ ]], λ ∈ R, ζ ∈ C is given for the base vm (ζ) ∈ V (l) (ζ) by sh iπη~m vm−1 (ζ), sh πη(α − ζ + i~ l−2m−1 ) 2 sh πη~(l − m) vm+1 (ζ), f + (α)vm (ζ) = − sh πη(α − ζ + i~ l−2m−1 ) 2

e+ (α)vm (ζ) = −

h+ (α)vm (ζ) =

(A.19) (A.20)

l+1 sh πη(α − ζ − i~ l+1 2 )sh πη(α − ζ + i~ 2 )

sh πη(α − ζ + i~ l−2m+1 )sh πη(α − ζ + i~ l−2m−1 ) 2 2

vm (ζ).

(A.21)

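An infinite dimensional highest weight representation of the algebra $A_{\hbar,\eta}(\widehat{sl}_2)$ is constructed as a Fock space $\mathcal{F}$ of free bosons [18, 21]. See also Sect. 6. Here the highest weight property means that
\[ \hat e_\lambda|\mathrm{h.w.}\rangle = 0, \qquad \hat f_\lambda|\mathrm{h.w.}\rangle = 0, \qquad \lambda \in \mathbb{R}. \tag{A.22} \]
There are two types of intertwining operators (vertex operators):
\[ \text{Type I}\quad \Phi^{(l)}(\zeta) : \mathcal{F} \to \mathcal{F} \otimes V^{(l)}\Bigl(\zeta + \frac{i\hbar}{2}\Bigr), \tag{A.23} \]
\[ \text{Type II}\quad \Psi^{(l)*}(\zeta) : V^{(l)}\Bigl(\zeta + \frac{i\hbar}{2}\Bigr) \otimes \mathcal{F} \to \mathcal{F} \tag{A.24} \]
satisfying
\[ \Phi^{(l)}(\zeta)\,x = \Delta(x)\,\Phi^{(l)}(\zeta), \tag{A.25} \]
\[ \Psi^{(l)*}(\zeta)\,\Delta(x) = x\,\Psi^{(l)*}(\zeta). \tag{A.26} \]
The components of the intertwiners are defined as follows:
\[ \Phi^{(l)}(\zeta)\,u = \sum_{m=0}^{l}\Phi^{(l)}_m(\zeta)\,u \otimes v_m, \tag{A.27} \]
\[ \Psi^{(l)*}(\zeta)(v_m \otimes u) = \Psi^{(l)*}_m(\zeta)\,u, \qquad u \in \mathcal{F}. \tag{A.28} \]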

Using the comultiplication (A.15)–(A.18) and the finite dimensional evaluation representation (A.19)–(A.21), the intertwining relations are rewritten as follows. For type I,


sh πη 0 (α − ζ +

± 8(l) l (ζ)h (α) =

i~ 2 (l ∓ i~ 2 (l ±

sh πη 0 (α − ζ −

k 2 )) ± h (α)8(l) l (ζ), k )) 2

[e± (α), 8(l) l (ζ)] = 0, sh iπη~(l − m +

(A.29) (A.30)

1)8(l) m−1 (ζ)

k h i~ ± = −sh πη 0 (α − ζ + (l − 2m + 2 ∓ )) 8(l) m (ζ)f (α) 2 2 i A − f ± (α)8(l) m (ζ) , B (A.31) where k i~ (l + 2 ± ))sh πη 0 (α − ζ + i~(l ∓ k/2)), 2 2

A = sh πη 0 (α − ζ − B = sh πη 0 (α − ζ +

k i~ k (l − 2m ∓ ))sh πη 0 (α − ζ + i~(l − 2m − 2 ∓ )), 2 2 2

and for type II, h± (α)9(l)∗ l (ζ) =

sh πη(α − ζ + sh πη(α − ζ −

i~ 2 (l ± i~ 2 (l ∓

k 2 )) (l)∗ 9l (ζ)h± (α), k )) 2

[f ± (α), 9(l)∗ l (ζ)] = 0,

(A.32) (A.33)

sh iπη~m9(l)∗ m−1 (ζ) k i~ (l − 2m − 2 ± ))e± (α)9(l)∗ m (ζ) 2 2

= −sh πη(α − ζ + +

C sh πη(α − ζ +

i~ 2 (l

− 2m ± k2 ))

± 9(l)∗ m (ζ)e (α),

(A.34)

where C = sh πη(α − ζ −

k i~ k i~ (l + 2 ∓ ))sh πη(α − ζ + (l ± )). 2 2 2 2

For some unknown reason, the intertwiners relevant for the physical problems such as the massless XXZ model [18] and the sine-Gordon theory [19, 20] prefer the twisted intertwining relations. We denote by $\tilde\Phi^{(l)}(\zeta)$ and $\tilde\Psi^{(l)*}(\zeta)$ the intertwiners of the same types as (A.23) and (A.24) but obeying the following twisted intertwining relations.
\[ \tilde\Phi^{(l)}(\zeta)\,\iota(x) = \Delta(x)\,\tilde\Phi^{(l)}(\zeta), \tag{A.35} \]
\[ \tilde\Psi^{(l)*}(\zeta)\,\Delta(x) = \iota(x)\,\tilde\Psi^{(l)*}(\zeta), \tag{A.36} \]
where $\iota$ is the following involution of $A_{\hbar,\eta}(\widehat{sl}_2)$:
\[ \iota(\hat e_\lambda) = -\hat e_\lambda, \qquad \iota(\hat f_\lambda) = -\hat f_\lambda, \qquad \iota(\hat t_\lambda) = \hat t_\lambda. \tag{A.37} \]

Then the relations (A.29) and (A.32) remain the same but (A.30), (A.31), (A.33) and (A.34) are replaced with

b2 ) and the Fusion RSOS Model Elliptic Algebra Uq,p (sl

401

˜ (l) {e± (α), 8 l (ζ)} = 0,

(A.38)

˜ (l) sh iπη~(l − m + 1)8 m−1 (ζ) i k h ˜ (l) i~ D ± (l) ± ˜ f (ζ)f (α) + (α) 8 (ζ) , = sh πη 0 (α − ζ + (l − 2m + 2 ∓ )) 8 m m 2 2 E (A.39) ˜ (l)∗ {f (α), 9 l (ζ)} = 0, ±

(A.40)

˜ (l)∗ sh iπη~m9 m−1 (ζ) k i~ ˜ (l)∗ (l − 2m − 2 ± ))e± (α)9 m (ζ) 2 2 k i~ k sh πη(α − ζ − i~ 2 (l + 2 ∓ 2 ))sh πη(α − ζ + 2 (l ± 2 ))

= sh πη(α − ζ + +

sh πη(α − ζ +

i~ 2 (l

− 2m ± k2 ))

± ˜ (l)∗ 9 m (ζ)e (α), (A.41)

where D = sh πη 0 (α − ζ − E = sh πη 0 (α − ζ +

k i~ (l + 2 ± ))sh πη 0 (α − ζ + i~(l ∓ k/2)), 2 2

k i~ k (l − 2m ∓ ))sh πη 0 (α − ζ + i~(l − 2m − 2 ∓ )). 2 2 2

Acknowledgement. The author would like to thank Michio Jimbo for stimulating discussions and various suggestions. He is also grateful to Kenji Iohara, Atsuo Kuniba, Tetsuji Miwa, Stanislav Pakuliak and Junichi Shiraishi for valuable discussions. This work is supported in part by the Ministry of Education Contract No.09740028.

Note added. In [49], we have recently clarified that $U_{q,p}(\widehat{sl}_2)$ can be regarded as a composition of a quasi-Hopf algebra and a Heisenberg algebra. The RLL relation of $U_{q,p}(\widehat{sl}_2)$ yields a central extension of the dynamical RLL relation by removing the Heisenberg algebra. The quasi-Hopf subalgebra allows us to construct the two types of intertwiners of infinite dimensional representations of the subalgebra. We have completed the construction of the vertex operators of $U_{q,p}(\widehat{sl}_2)$ by attaching the Heisenberg algebra to them. At level one, the vertex operators thus obtained coincide with those obtained in [30, 45].

References

1. Jimbo, M. and Miwa, T.: Algebraic Analysis of Solvable Lattice Models. CBMS Regional Conference Series in Mathematics vol. 85, Providence, RI: AMS, 1994
2. Baxter, R. J.: Exactly Solved Models in Statistical Mechanics. London: Academic, 1982
3. Belavin, A. A., Polyakov, A. M. and Zamolodchikov, A. B.: Infinite conformal symmetry in two dimensional quantum field theory. Nucl. Phys. B 241, 333–380 (1984)
4. Dotsenko, V. S. and Fateev, V. A.: Conformal algebra and multipoint correlation functions in two-dimensional statistical models. Nucl. Phys. B 240 [FS], 312 (1984); Four point correlation functions and the operator algebra in the two-dimensional conformal invariant theories with the central charge c < 1. Nucl. Phys. B 251 [FS], 691 (1985)
5. Davies, B., Foda, O., Jimbo, M., Miwa, T. and Nakayashiki, A.: Diagonalization of the XXZ Hamiltonian by vertex operators. Commun. Math. Phys. 151, 89–153 (1993)
6. Jimbo, M., Miki, K., Miwa, T. and Nakayashiki, A.: Correlation functions of the XXZ model for $\Delta < -1$. Phys. Lett. A 168, 256–263 (1992)


7. Idzumi, M., Iohara, K., Jimbo, M., Miwa, T., Nakashima, T. and Tokihiro, T.: Quantum affine symmetry in vertex models. Int. J. Mod. Phys. A 8, 1479–1511 (1993)
8. Idzumi, M.: Level two irreducible representations of $U_q(\widehat{sl}_2)$, vertex operators, and their correlations. Int. J. Mod. Phys. A 9, 4449–4484 (1994)
9. Bougourzi, A.H. and Weston, R.A.: N point correlation functions of the spin 1 XXZ model. Nucl. Phys. B 417, 439–462 (1994)
10. Konno, H.: BRST cohomology in quantum affine algebra $U_q(\widehat{sl}_2)$. Mod. Phys. Lett. A 9, 1253–1265 (1994); Free field representation of the quantum affine algebra $U_q(\widehat{sl}_2)$ and form factors in the higher-spin XXZ model. Nucl. Phys. B 432 [FS], 457–486 (1994)
11. Koyama, Y.: Staggered polarization of vertex models with $U_q(\widehat{sl}_N)$ symmetry. Commun. Math. Phys. 164, 277–292 (1994)
12. Jimbo, M., Miwa, T. and Nakayashiki, A.: Difference equations for the correlation functions of the eight-vertex model. Jour. of Phys. A 26, 2199 (1993)
13. Jimbo, M., Kedem, R., Konno, H., Miwa, T. and Weston, R.: Difference equations in spin chains with a boundary. Nucl. Phys. B 448 [FS], 429–456 (1995)
14. Foda, O., Iohara, K., Jimbo, M., Kedem, R., Miwa, T. and Yan, H.: An elliptic quantum algebra for $\widehat{sl}_2$. Lett. Math. Phys. 32, 259–268 (1994); Notes on highest weight modules of the elliptic algebra $A_{q,p}(\widehat{sl}_2)$. Prog. Theor. Phys., Supplement 118, 1–34 (1995)
15. Iohara, K. and Kohno, M.: A central extension of $DY_\hbar(gl_2)$ and its vertex representations. Lett. Math. Phys. 37, 319–328 (1996); Iohara, K.: Bosonic representations of Yangian double $DY_\hbar(\mathfrak{g})$ with $\mathfrak{g} = gl_n, sl_n$. J. Phys. A 29, 4593–4621 (1996)
16. Khoroshkin, S.: Central Extension of the Yangian Double. q-alg/9602031; Khoroshkin, S., Lebedev, D. and Pakuliak, S.: Intertwining operators for the central extension of the Yangian double. Phys. Lett. A 222, 381–392 (1996)
17. Konno, H.: Free field representation of level-k Yangian double $DY(\widehat{sl}_2)_k$ and deformation of Wakimoto modules. Lett. Math. Phys. 40, 321–336 (1997)
18. Jimbo, M., Konno, H. and Miwa, T.: Massless XXZ model and degeneration of the Elliptic Algebra $A_{q,p}(\widehat{sl}_2)$. In: Deformation Theory and Symplectic Geometry, Dordrecht: Kluwer Academic Publishers, 1997, pp. 117–138
19. Lukyanov, S.: Free field representation for massive integrable models. Commun. Math. Phys. 167, 183–226 (1995)
20. Konno, H.: Degeneration of the Elliptic algebra $A_{q,p}(\widehat{sl}_2)$ and form factors in the sine-Gordon theory. Preprint Hiroshima Univ. hep-th/9701034, to appear in the CRM series in Mathematical Physics, Springer Verlag.
21. Khoroshkin, S., Lebedev, D. and Pakuliak, S.: Elliptic algebra $A_{q,p}(\widehat{sl}_2)$ in the scaling limit. Preprint q-alg/9702002
22. Andrews, G.E., Baxter, R.J. and Forrester, P.J.: Eight vertex SOS model and generalized Rogers-Ramanujan-type identities. J. Stat. Phys. 35, 193–266 (1984)
23. Date, E., Jimbo, M., Miwa, T. and Okado, M.: Fusion of the eight vertex SOS model. Lett. Math. Phys. 12, 209–215 (1986); Date, E., Jimbo, M., Kuniba, A., Miwa, T. and Okado, M.: Exactly solvable SOS models. Nucl. Phys. B 290 [FS20], 231–273 (1987); Exactly solvable SOS models II. Adv. Stud. Pure Math. 16, 17–122 (1988)
24. Goddard, P., Kent, A. and Olive, D.: Virasoro algebras and coset space models. Phys. Lett. B 152, 88 (1985); Unitary representations of the Virasoro and super-Virasoro algebras. Commun. Math. Phys. 103, 105–119 (1986)
25. Kastor, D., Martinec, E. and Qiu, Z.: Current algebra and conformal discrete series. Phys. Lett. B 200, 434–440 (1988)
26. Bagger, J., Nemeschansky, D. and Yankielowicz, S.: Virasoro algebras with central charge $c > 1$. Phys. Rev. Lett. 60, 389–392 (1988)
27. Ravanini, F.: An infinite class of new conformal field theories with extended algebras. Mod. Phys. Lett. A 3, 397–412 (1988)
28. Jimbo, M., Miwa, T. and Ohta, Y.: Structure of the space of states in RSOS models. Int. J. Mod. Phys. A 8, 1457–1477 (1993)
29. Foda, O., Jimbo, M., Miki, K., Miwa, T. and Nakayashiki, A.: Vertex operators in solvable lattice models. J. Math. Phys. 35, 13–46 (1994)


30. Lukyanov, S. and Pugai, Y.: Multi-point Local Height Probabilities in the Integrable RSOS Model. Nucl. Phys. B 473, 631–658 (1996)
31. Shiraishi, J., Kubo, H., Awata, H. and Odake, S.: A Quantum Deformation of the Virasoro Algebra and the Macdonald Symmetric Functions. Lett. Math. Phys. 38, 33–51 (1996); Virasoro-type symmetries in solvable models. Preprint EFI-96-44, DPSU-96-18, UT-764, hep-th/9612233
32. Konno, H.: q-deformation of the coset conformal field theory and the fusion RSOS model. Talks given at the XIIth International Congress of Mathematical Physics, 13–19 July, 1997, Brisbane and the International Workshop on “Statistical Mechanics and Integrable Systems”, 28 July–8 August, 1997, ANU, Canberra, Australia
33. Bershadsky, M.A., Knizhnik, V.G. and Teitelman, M.G.: Super conformal symmetry in two dimensions. Phys. Lett. B 151, 31–36 (1985)
34. Friedan, D., Qiu, Z. and Shenker, S.: Super conformal invariance in two dimensions and the tricritical Ising model. Phys. Lett. B 151, 37–43 (1985)
35. Zamolodchikov, A.B. and Fateev, V.A.: Representations of the algebra of “parafermion currents” of spin 4/3 in two-dimensional conformal field theory. Minimal models and the tricritical Potts $Z_3$ model. Theor. Math. Phys. 71, 451–462 (1987)
36. Gerasimov, A., Marshakov, A. and Morozov, A.: Free field representation of parafermions and related coset models. Nucl. Phys. B 328, 664 (1989)
37. Felder, G.: BRST approach to minimal model. Nucl. Phys. B 317, 215–236 (1989)
38. Bernard, D. and Felder, G.: Fock Representations and BRST cohomology in SL(2) current algebra. Commun. Math. Phys. 127, 145–168 (1990)
39. Konno, H.: $SU(2)_k \times SU(2)_l / SU(2)_{k+l}$ coset conformal field theory and topological minimal model on higher genus Riemann surface. Int. J. Mod. Phys. A 8, 5537–5561 (1993)
40. Enriquez, B. and Felder, G.: Elliptic quantum groups $E_{\tau,\eta}(sl_2)$ and quasi-Hopf algebras. q-alg/9703018
41. Frenkel, I.B. and Jing, N.H.: Vertex Representations of Quantum Affine Algebras. Proc. Nat’l. Acad. Sci. USA 85, 9373–9377 (1988)
42. Ahn, C., Bernard, D. and LeClair, A.: Fractional supersymmetries in perturbed coset CFTs and integrable soliton theory. Nucl. Phys. B 346, 409–439 (1990)
43. Hou, B.-Y. and Yang, W.-L.: $\hbar$ (Yangian) deformed Virasoro algebra as a dynamically twisted $A_{\hbar,\eta}$ algebra. Talk given at the XIIth International Congress of Mathematical Physics, 13–19 July, 1997, Brisbane, Australia
44. Matsuo, A.: A q-deformation of Wakimoto modules, primary fields and screening operators. Commun. Math. Phys. 161, 33–48 (1994)
45. Miwa, T. and Weston, R.: Boundary ABF models. Nucl. Phys. B 486, 517–545 (1997)
46. Distler, J. and Qiu, Z.: BRST cohomology and a Feigin–Fuchs representation of Kac–Moody and parafermion theories. Nucl. Phys. B 336, 533–546 (1990)
47. Frau, M., Lerda, A., McCarthy, J.G., Sciuto, S. and Sidenius, J.: Free field representation for $\widehat{sl}_2$ WZNW models on Riemann surfaces. Phys. Lett. B 245, 453–464 (1990)
48. Jimbo, M., Lashkevich, M., Miwa, T. and Pugai, Y.: Lukyanov’s Screening Operators for the Deformed Virasoro Algebra. Preprint RIMS-1087, July 1996, hep-th/9607177
49. Jimbo, M., Konno, H., Odake, S. and Shiraishi, J.: Elliptic algebra $U_{q,p}(\widehat{sl}_2)$: Drinfeld currents and vertex operators. q-alg/9802002
50. Jimbo, M., Miwa, T. and Okado, M.: Solvable lattice models whose states are dominant integral weights of $A^{(1)}_{n-1}$. Lett. Math. Phys. 14, 123–131 (1987); Solvable lattice models related to the vector representation of classical simple Lie algebras. Commun. Math. Phys. 116, 507–525 (1988)
51. Gepner, D.: On RSOS models associated to Lie algebras and RCFT. Phys. Lett. B 313, 45–54 (1993)

Communicated by T. Miwa

Commun. Math. Phys. 195, 405 – 416 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Central Limit Theorem for the Adjacency Operators on the Infinite Symmetric Group

Akihito Hora⋆

Department of Environmental and Mathematical Sciences, Faculty of Environmental Science and Technology, Okayama University, Tsushima Okayama 700, Japan. E-mail: [email protected]

Received: 16 October 1997 / Accepted: 29 November 1997

Abstract: An adjacency operator on a group is a formal sum of (left) regular representations over a conjugacy class. For such adjacency operators on the infinite symmetric group which are parametrized by the Young diagrams, we discuss the correlation of their powers with respect to the vacuum vector state. We compute exactly the correlation function under suitable normalization and through the infinite volume limit. This approach is viewed as a central limit theorem in quantum probability, where the operators are interpreted as random variables via spectral decomposition. In [K], Kerov showed the corresponding result for one-row Young diagrams. Our formula provides an extension of Kerov’s theorem to the case of arbitrary Young diagrams.

1. Introduction

Let $L$ denote the left regular representation of the infinite symmetric group

$$S_\infty := \bigcup_{n=1}^{\infty} S_n = \{\text{bijection } g : \mathbb{N} \to \mathbb{N} \mid g(k) = k \text{ except for finitely many } k\},$$

$\phi := \langle \delta_e, \cdot\, \delta_e \rangle_{\ell^2(S_\infty)}$ the vacuum vector state determined by the delta function on the unit element $e$ of $S_\infty$, and $D$ the set of Young diagrams which have no rows consisting of one box. (Hence each $\lambda \in D$ contains at least 2 boxes.) The conjugacy classes in $S_\infty$ (except the trivial one $\{e\}$) are parametrized by the diagrams in $D$. We assign the operator $A_\lambda$ to each $\lambda \in D$ which is formally defined by

$$A_\lambda f := \sum_{x \in C_\lambda} L(x) f = I_{C_\lambda} * f,$$

⋆ Supported by Grant-in-Aid for Scientific Research (No. 09740108), The Ministry of Education, Science, Sports and Culture, Japan.


where $C_\lambda$ is the conjugacy class in $S_\infty$ corresponding to $\lambda$ and $I_{C_\lambda}$ is the indicator function of $C_\lambda$. Let us consider describing how $A_{\lambda_1}, \cdots, A_{\lambda_m}$ are correlated with respect to $\phi$ for distinct Young diagrams $\lambda_1, \cdots, \lambda_m \in D$, in other words, what one can say about $\phi(A_{\lambda_1}^{p_1} \cdots A_{\lambda_m}^{p_m})$ for $p_1, \cdots, p_m \in \mathbb{N}$. Of course $\phi(A_{\lambda_1}^{p_1} \cdots A_{\lambda_m}^{p_m})$ is meaningless in this expression itself. We express

$$A_\lambda = \sum_{x \in C_\lambda,\ x^2 = e} L(x) + \frac{1}{2} \sum_{x \in C_\lambda,\ x^2 \neq e} \bigl(L(x) + L(x^{-1})\bigr) \tag{1}$$

as a sum of (small) observables where each term is regarded as a quantum random variable by probabilistic interpretation through spectral decomposition. Using the idea of a central limit theorem in quantum probability, we will give a precise formulation and a solution to our problem.

Let $\lambda_1, \cdots, \lambda_m \in D$ be given. $|\lambda|$ denotes the number of boxes contained in $\lambda \in D$. First we consider $S_n$ for sufficiently large $n$ ($> |\lambda_1| + \cdots + |\lambda_m|$) and then we let $n \to \infty$. Set

$$C_\lambda^{(n)} := C_\lambda \cap S_n, \qquad A_\lambda^{(n)} := \sum_{x \in C_\lambda^{(n)}} L(x). \tag{2}$$

(The representation matrix of $A_\lambda^{(n)}|_{\ell^2(S_n)}$ with respect to the canonical basis $\{\delta_x \mid x \in S_n\}$ is an adjacency matrix of the group association scheme $\mathcal{X}(S_n)$ of $S_n$. According to this terminology, we refer to $A_\lambda$ as an adjacency operator.) In a typical form of classical central limit theorems, one subtracts the mean from a sum of independent random variables, divides it by the standard deviation (the square root of the variance), and considers weak convergence of its distribution. Since we have

$$\phi(A_\lambda^{(n)}) = 0, \qquad \phi(A_\lambda^{(n)2}) = \sum_{x, y \in C_\lambda^{(n)}} \phi(L(xy)) = \sharp C_\lambda^{(n)}$$

as the mean and the variance of $A_\lambda^{(n)}$ respectively, the correct normalized form will be $A_\lambda^{(n)} / \sqrt{\sharp C_\lambda^{(n)}}$. We are thus concerned with the limit of

$$\phi\Bigl(\bigl(A_{\lambda_1}^{(n)} / \sqrt{\sharp C_{\lambda_1}^{(n)}}\bigr)^{p_1} \cdots \bigl(A_{\lambda_m}^{(n)} / \sqrt{\sharp C_{\lambda_m}^{(n)}}\bigr)^{p_m}\Bigr) \tag{3}$$

as $n \to \infty$, which describes the correlation of $A_{\lambda_1}, \cdots, A_{\lambda_m}$. Although $A_{\lambda_1}^{(n)}, \cdots, A_{\lambda_m}^{(n)}$ in Eq. (3) are mutually commuting, each $A_\lambda^{(n)}$ is a sum of non-commuting and non-independent elements as is seen from Eq. (2).

Central limit theorems for noncommutative random variables have been studied by von Waldenfels, Accardi, and other authors in several aspects. Closely related to our problem are [G-W, A-B, S-W, A-L1, A-L2, B, Ha] and [Ho]. We mention also [Sp, V-D-N, M] and the latest [A-H-O]. However, it seems that the computation of the limit of Eq. (3) is not covered by the existing theory, e.g. that based on a certain independence argument. Our method is direct combinatorial analysis of Eq. (3). (Is it possible to describe “independence” of the terms in Eq. (1) fully enough to be able to see the limit of Eq. (3)?)

Let us state our main result. We use the cycle notation like $\lambda = (2^{k^{(2)}} 3^{k^{(3)}} \cdots j^{k^{(j)}} \cdots)$ if $\lambda \in D$ contains $k^{(j)}$ rows of length $j$. $H_r(x)$ denotes the Hermite polynomial of degree $r$ obeying the recurrence formula:

$$H_{r+1}(x) = x H_r(x) - r H_{r-1}(x) \quad (r \geq 1), \qquad H_0(x) = 1, \qquad H_1(x) = x. \tag{4}$$
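As an illustration of the recurrence in Eq. (4), the following minimal Python sketch builds the coefficient list of $H_r$. The function name `hermite` and the coefficient-list representation (lowest degree first) are choices made here for illustration only; they are not part of the paper.

```python
def hermite(r):
    """Coefficients of H_r, lowest degree first, built from Eq. (4)."""
    prev, cur = [1], [0, 1]          # H_0 = 1, H_1 = x
    if r == 0:
        return prev
    for k in range(1, r):
        # H_{k+1}(x) = x * H_k(x) - k * H_{k-1}(x)
        nxt = [0] + cur              # multiply H_k by x
        for i, c in enumerate(prev):
            nxt[i] -= k * c          # subtract k * H_{k-1}
        prev, cur = cur, nxt
    return cur


if __name__ == "__main__":
    for r in range(5):
        print(r, hermite(r))
```

For instance, `hermite(2)` gives the coefficients of $x^2 - 1$ and `hermite(4)` those of $x^4 - 6x^2 + 3$, the two polynomials used in the Examples of Sect. 2.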


Theorem 1. For all $\lambda_1, \cdots, \lambda_m \in D$ and all $p_1, \cdots, p_m \in \mathbb{N}$, we have

$$\lim_{n \to \infty} \phi\left(\left(\frac{A_{\lambda_1}^{(n)}}{\sqrt{\sharp C_{\lambda_1}^{(n)}}}\right)^{p_1} \cdots \left(\frac{A_{\lambda_m}^{(n)}}{\sqrt{\sharp C_{\lambda_m}^{(n)}}}\right)^{p_m}\right) = \prod_{j \geq 2} \int_{\mathbb{R}} \frac{e^{-x^2/2}}{\sqrt{2\pi}} \left(\frac{H_{k_1^{(j)}}(x)}{\sqrt{k_1^{(j)}!}}\right)^{p_1} \cdots \left(\frac{H_{k_m^{(j)}}(x)}{\sqrt{k_m^{(j)}!}}\right)^{p_m} dx, \tag{5}$$

where $\lambda_i = (2^{k_i^{(2)}} 3^{k_i^{(3)}} \cdots j^{k_i^{(j)}} \cdots)$ for $i = 1, \cdots, m$.

Since $k_1^{(j)} = \cdots = k_m^{(j)} = 0$ holds for sufficiently large $j$’s, the right hand side of Eq. (5) is actually a finite product. One of the remarkable features of quantum central limit theorems is a wide variety of resulting limit distributions in comparison with classical ones. See Corollary 1 in Sect. 2 in our case. Corollary 2 in Sect. 2 gives the condition for asymptotic independence. In particular, Corollaries 1 and 2 yield Kerov’s result in [K]. Section 3 is devoted to the proof of Theorem 1. It will turn out that Hermite polynomials in Eq. (5) come from the matching polynomials of the complete graphs and that an integral formula on the number of perfect matchings plays an important role.
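The right hand side of Eq. (5) can be evaluated mechanically, since every factor is a Gaussian integral of a polynomial. The sketch below is only an illustration under assumptions made here: a diagram is encoded as a dictionary `{row length j: multiplicity k^(j)}`, and the names `limit_moment`, `poly_mul` and `gauss_moment` are hypothetical helpers, not notation from the paper. It uses the standard fact that the $n$-th moment of the standard Gaussian law is $0$ for odd $n$ and $(n-1)!!$ for even $n$.

```python
from math import factorial, sqrt


def hermite(r):
    """Coefficients of H_r (lowest degree first) from the recurrence Eq. (4)."""
    prev, cur = [1], [0, 1]
    if r == 0:
        return prev
    for k in range(1, r):
        nxt = [0] + cur
        for i, c in enumerate(prev):
            nxt[i] -= k * c
        prev, cur = cur, nxt
    return cur


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def gauss_moment(n):
    """E[X^n] for a standard Gaussian X: 0 if n is odd, (n-1)!! if n is even."""
    if n % 2:
        return 0
    m = 1
    for i in range(1, n, 2):
        m *= i
    return m


def limit_moment(diagrams, powers):
    """Right hand side of Eq. (5) for diagrams lambda_i and exponents p_i."""
    lengths = set().union(*diagrams)      # all row lengths j that occur
    result = 1.0
    for j in lengths:
        poly, norm = [1], 1.0
        for lam, p in zip(diagrams, powers):
            k = lam.get(j, 0)             # k_i^{(j)}
            for _ in range(p):
                poly = poly_mul(poly, hermite(k))
            norm *= sqrt(factorial(k)) ** p
        integral = sum(c * gauss_moment(i) for i, c in enumerate(poly))
        result *= integral / norm
    return result


if __name__ == "__main__":
    print(limit_moment([{2: 1}], [4]))            # one transposition, p = 4
    print(limit_moment([{2: 2}], [2]))            # lambda = (2^2), p = 2
    print(limit_moment([{2: 1}, {3: 1}], [2, 2])) # two one-row diagrams
```

With a single one-row diagram the value reduces to a Gaussian moment, in line with the remark that Theorem 1 extends Kerov’s result.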

2. Remarks on Theorem 1

Theorem 1 readily yields the following corollaries. $\gamma$ denotes the one-dimensional standard Gaussian distribution. $\Phi_* \mu$ denotes the image measure of the measure $\mu$ by the map $\Phi$; $\Phi_* \mu(B) := \mu(\Phi^{-1} B)$.

Corollary 1. For $\lambda = (2^{k^{(2)}} 3^{k^{(3)}} \cdots j^{k^{(j)}} \cdots) \in D$, the spectral distribution of $A_\lambda^{(n)} / \sqrt{\sharp C_\lambda^{(n)}}$ with respect to $\phi$ converges weakly to $\bigl(\prod_{j \geq 2} H_{k^{(j)}}(x_j) / \sqrt{k^{(j)}!}\bigr)_* \gamma^{\otimes \infty}$ as $n \to \infty$.

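Proof. Let $J$ be the maximal length of rows in $\lambda$. Equation (5) implies

$$\lim_{n \to \infty} \phi\left(\left(\frac{A_\lambda^{(n)}}{\sqrt{\sharp C_\lambda^{(n)}}}\right)^{p}\right) = \prod_{j=2}^{J} \int_{\mathbb{R}} \left(\frac{H_{k^{(j)}}(x)}{\sqrt{k^{(j)}!}}\right)^{p} \gamma(dx) = \int_{\mathbb{R}^{J-1}} \left(\prod_{j=2}^{J} \frac{H_{k^{(j)}}(x_j)}{\sqrt{k^{(j)}!}}\right)^{p} \gamma^{\otimes J-1}(dx_2 \cdots dx_J) \quad (=: m_p)$$

for all $p \in \mathbb{N}$. Thus every $p$th moment of the spectral distribution of $A_\lambda^{(n)} / \sqrt{\sharp C_\lambda^{(n)}}$ with respect to $\phi$ converges to $m_p$ (= the $p$th moment of $\bigl(\prod_{j \geq 2} H_{k^{(j)}}(x_j) / \sqrt{k^{(j)}!}\bigr)_* \gamma^{\otimes \infty}$). Since $m_p$ obviously satisfies $\limsup_{p \to \infty} m_{2p}^{1/2p} / 2p < \infty$, we get the desired weak convergence. (See e.g. [D] Chapter 2 for the relation between weak convergence of probabilities and convergence of their moments.)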


Corollary 2. If no two diagrams in $\lambda_1, \cdots, \lambda_m \in D$ contain rows of equal length, then $A_{\lambda_1}^{(n)} / \sqrt{\sharp C_{\lambda_1}^{(n)}}, \cdots, A_{\lambda_m}^{(n)} / \sqrt{\sharp C_{\lambda_m}^{(n)}}$ are “asymptotically independent” as $n \to \infty$, namely, their joint spectral distribution converges weakly to the tensor product of individual limit spectral distributions.

Proof. Since $k_1^{(j)}, \cdots, k_m^{(j)}$ contain at most one nonzero member for every $j \geq 2$, the right hand side of Eq. (5) is

$$\prod_{j \geq 2} \int_{\mathbb{R}} \left(\frac{H_{k_1^{(j)}}(x)}{\sqrt{k_1^{(j)}!}}\right)^{p_1} \gamma(dx) \cdots \prod_{j \geq 2} \int_{\mathbb{R}} \left(\frac{H_{k_m^{(j)}}(x)}{\sqrt{k_m^{(j)}!}}\right)^{p_m} \gamma(dx)$$
$$= \int_{\mathbb{R}} x^{p_1} \left(\prod_{j \geq 2} \frac{H_{k_1^{(j)}}(x_j)}{\sqrt{k_1^{(j)}!}}\right)_{\!*} \gamma^{\otimes \infty}(dx) \cdots \int_{\mathbb{R}} x^{p_m} \left(\prod_{j \geq 2} \frac{H_{k_m^{(j)}}(x_j)}{\sqrt{k_m^{(j)}!}}\right)_{\!*} \gamma^{\otimes \infty}(dx). \tag{6}$$

Corollary 1 implies that each factor of Eq. (6) is a moment of the corresponding limit spectral distribution.

Examples. We mention a few concrete limit distributions in Corollary 1 which are not Gaussian.

(i) Let $\lambda = (a^2)$ (= two rows of length $a$ ($\geq 2$)). Since $H_2(x) = x^2 - 1$, we get

$$\frac{1}{\sqrt{\pi}} (\sqrt{2} u + 1)^{-1/2} e^{-(\sqrt{2} u + 1)/2} I_{(-1/\sqrt{2}, \infty)}(u)\, du.$$

(ii) Let $\lambda = (a^4)$. Since $H_4(x) = x^4 - 6x^2 + 3$, we get

$$\sqrt{\frac{3}{\pi}}\, e^{-3/2} (6 + 2\sqrt{6} u)^{-1/2} \Bigl\{ \bigl(3 + (6 + 2\sqrt{6} u)^{1/2}\bigr)^{-1/2} e^{-(6 + 2\sqrt{6} u)^{1/2}/2} I_{(-\sqrt{6}/2, \infty)}(u) + \bigl(3 - (6 + 2\sqrt{6} u)^{1/2}\bigr)^{-1/2} e^{(6 + 2\sqrt{6} u)^{1/2}/2} I_{(-\sqrt{6}/2,\, 3/2\sqrt{6})}(u) \Bigr\}\, du.$$

In general, let $\lambda = (a^k) \in D$. Then the support of the limit spectral distribution of $A_\lambda^{(n)} / \sqrt{\sharp C_\lambda^{(n)}}$ with respect to $\phi$ is $\mathbb{R}$ if $k$ is odd, while it is an open semi-infinite interval $(c, \infty)$ if $k$ is even, where $c$ is the minimal value of $H_k(x) / \sqrt{k!}$.

Now let us discuss the connection with Kerov’s result in [K]. Let $C_k^{(n)} := C_{(k^1)}^{(n)}$ be the conjugacy class in $S_n$ consisting of the cycles of length $k$ ($\geq 2$). For $\alpha \in \hat{S}_n$, $\chi_\alpha^{(n)}$ and $d_\alpha^{(n)}$ denote the irreducible character and its dimension respectively. Set

$$\varphi_k^{(n)}(\alpha) := n^{k/2} \chi_\alpha^{(n)}(C_k^{(n)}) / d_\alpha^{(n)}, \qquad M^{(n)}(\{\alpha\}) := d_\alpha^{(n)2} / n!, \tag{7}$$

where $\chi_\alpha^{(n)}(C_k^{(n)})$ is the value at any $x \in C_k^{(n)}$. $M^{(n)}$ is the Plancherel measure on $\hat{S}_n$. In [K], Kerov showed the following limit behavior of the joint distribution of $\varphi_k^{(n)}$ with respect to $M^{(n)}$.

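Kerov’s Theorem. For all $m \geq 2$ and all $x_2, \cdots, x_m \in \mathbb{R}$,

$$\lim_{n \to \infty} M^{(n)}\bigl(\{\alpha \in \hat{S}_n \mid \varphi_k^{(n)}(\alpha) \leq x_k,\ 2 \leq k \leq m\}\bigr) = \prod_{k=2}^{m} \frac{1}{\sqrt{2\pi k}} \int_{-\infty}^{x_k} e^{-y^2/2k}\, dy$$

holds.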


In general, let $G$ be a finite group. As in Eq. (2), we associate the operator $A_C := \sum_{x \in C} L(x)$ with each conjugacy class $C$ in $G$. $\mathcal{A}$ denotes the $\mathbb{C}$-algebra generated by $\{A_C \mid C : \text{conjugacy class in } G\}$ (the Bose-Mesner algebra of the group association scheme $\mathcal{X}(G)$ of $G$). $\mathcal{A}$ is commutative. Note that $\phi|_{\mathcal{A}} = \phi_{tr}|_{\mathcal{A}}$ holds where $\phi_{tr} := (\sharp G)^{-1} \mathrm{Tr}$ is the tracial state. Set $M(\{\alpha\}) := d_\alpha^2 / \sharp G$ for $\alpha \in \hat{G}$ (the Plancherel measure on $\hat{G}$). Using the basis consisting of the matrix elements of the irreducible representations of $G$, we see

$$\phi(A_{C_1} \cdots A_{C_p}) = \phi_{tr}(I_{C_1} * \cdots * I_{C_p} * \,\cdot\,) = \int_{\hat{G}} \frac{(\sharp C_1) \chi_\alpha(C_1)}{d_\alpha} \cdots \frac{(\sharp C_p) \chi_\alpha(C_p)}{d_\alpha}\, M(d\alpha) \tag{8}$$

for conjugacy classes $C_1, \cdots, C_p$ in $G$. (See e.g. [Si] Chapter III.)

We apply Corollary 1 and Corollary 2 as well as Eq. (8) to deduce Kerov’s theorem. The conjugacy classes involved there are only the ones consisting of cycles $C_k^{(n)} = C_{(k^1)}^{(n)}$. Corollary 1 says that the limit distribution of $A_k^{(n)} / \sqrt{\sharp C_k^{(n)}}$ is $\gamma$. Corollary 2 implies asymptotic independence of $A_2^{(n)} / \sqrt{\sharp C_2^{(n)}}, \cdots, A_m^{(n)} / \sqrt{\sharp C_m^{(n)}}$. Combining these with Eq. (8), we get

$$\lim_{n \to \infty} \int_{\hat{S}_n} \left(\frac{\sqrt{\sharp C_2^{(n)}}\, \chi_\alpha^{(n)}(C_2^{(n)})}{d_\alpha^{(n)}}\right)^{p_2} \cdots \left(\frac{\sqrt{\sharp C_m^{(n)}}\, \chi_\alpha^{(n)}(C_m^{(n)})}{d_\alpha^{(n)}}\right)^{p_m} M^{(n)}(d\alpha)$$
$$= \lim_{n \to \infty} \phi\Bigl(\bigl(A_2^{(n)} / \sqrt{\sharp C_2^{(n)}}\bigr)^{p_2} \cdots \bigl(A_m^{(n)} / \sqrt{\sharp C_m^{(n)}}\bigr)^{p_m}\Bigr) = \int_{\mathbb{R}^{m-1}} x_2^{p_2} \cdots x_m^{p_m}\, \gamma^{\otimes m-1}(dx_2 \cdots dx_m) \tag{9}$$

for all $p_2, \cdots, p_m \in \mathbb{N}$. Since $\sharp C_k^{(n)} = n(n-1) \cdots (n-k+1)/k \sim n^k / k$ ($n \to \infty$), Eq. (9) yields

$$\lim_{n \to \infty} \int_{\hat{S}_n} \bigl(\varphi_2^{(n)} / \sqrt{2}\bigr)^{p_2} \cdots \bigl(\varphi_m^{(n)} / \sqrt{m}\bigr)^{p_m} M^{(n)}(d\alpha) = \int_{\mathbb{R}^{m-1}} x_2^{p_2} \cdots x_m^{p_m}\, \gamma^{\otimes m-1}(dx_2 \cdots dx_m),$$

which implies weak convergence of the distributions. We thus obtain Kerov’s theorem.

We proceed to the proof of Theorem 1 in the next section. Our approach is combinatorial and rather elementary. We first expand the left hand side of Eq. (5) and then investigate what kind of terms can survive in the limit of $n \to \infty$. A restricted case was treated also in [Ho]. For comparison, let us refer to a few other possible approaches. One is algebraic, in which one uses the structure of the Bose-Mesner algebra of $\mathcal{X}(S_n)$, more precisely, information on the structure constant $p(n)_{\lambda\mu}^{\nu}$ in

$$A_\lambda^{(n)} A_\mu^{(n)} = \sum_{\nu} p(n)_{\lambda\mu}^{\nu} A_\nu^{(n)} \qquad (\lambda, \mu, \nu \in D;\ |\lambda|, |\mu|, |\nu| \leq n).$$

It can be said that Kerov’s strategy goes along this line. Another may be rather analytic, in which one notices Eq. (8). All information needed for computation of the right hand


side of Eq. (8) for $G = S_n$ is obtained, in principle, from Frobenius’ character formula. However, it seems to be too hard a task to compute the left hand side of Eq. (5) along this line.

3. Proof of Theorem 1

3.1. A reduction. Let $n > p_1 |\lambda_1| + \cdots + p_m |\lambda_m|$. For simplicity, the $r$ consecutive product $n(n-1) \cdots (n-r+1)$ is denoted by $n^{\underline{r}}$. Since the cardinality of the conjugacy class associated with $\lambda = (2^{k^{(2)}} 3^{k^{(3)}} \cdots j^{k^{(j)}} \cdots) \in D$ is

$$\sharp C_\lambda^{(n)} = n^{\underline{|\lambda|}} \Big/ \prod_{j \geq 2} j^{k^{(j)}} k^{(j)}! \ \asymp\ n^{|\lambda|} \quad (n \to \infty), \tag{10}$$

the order of the denominator in the left hand side of Eq. (5) is

$$n^{(p_1 |\lambda_1| + \cdots + p_m |\lambda_m|)/2}. \tag{11}$$
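The cardinality formula in Eq. (10) is easy to check by brute force for small $n$. The following Python sketch is an illustration only: the dictionary encoding `{row length j: multiplicity k^(j)}` of a diagram and the function names are assumptions made here, and the enumeration over all of $S_n$ is feasible only for very small $n$.

```python
from itertools import permutations
from math import factorial


def class_size(n, diagram):
    """Cardinality of the conjugacy class of S_n for a diagram, as in Eq. (10)."""
    boxes = sum(j * k for j, k in diagram.items())   # |lambda|
    falling = 1
    for i in range(boxes):                           # n(n-1)...(n-|lambda|+1)
        falling *= n - i
    denom = 1
    for j, k in diagram.items():
        denom *= j ** k * factorial(k)
    return falling // denom


def cycle_type(perm):
    """Multiplicities of the cycle lengths >= 2 of a permutation tuple."""
    seen, counts = set(), {}
    for start in range(len(perm)):
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        if length >= 2:
            counts[length] = counts.get(length, 0) + 1
    return counts


def brute_force_class_size(n, diagram):
    """Count the permutations of {0, ..., n-1} with the given cycle type."""
    return sum(1 for p in permutations(range(n)) if cycle_type(p) == diagram)


if __name__ == "__main__":
    for diagram in ({2: 1}, {2: 2}, {2: 1, 3: 1}, {5: 1}):
        print(diagram, class_size(5, diagram), brute_force_class_size(5, diagram))
```

Fixed points are ignored by `cycle_type`, which matches the convention that diagrams in $D$ have no rows consisting of one box.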

In the expanded expression

$$\phi(A_{\lambda_1}^{(n)p_1} \cdots A_{\lambda_m}^{(n)p_m}) = \sum_{g_i^{(l)} \in C_{\lambda_i}^{(n)},\ l = 1, \cdots, p_i,\ i = 1, \cdots, m} \phi(g_1^{(1)} \cdots g_1^{(p_1)} \cdots g_m^{(1)} \cdots g_m^{(p_m)}), \tag{12}$$

each term has value 1 or 0 according to whether $g_1^{(1)} \cdots \cdots g_m^{(p_m)} = e$ or not. For each term in the right hand side of Eq. (12), we pay attention to the number of appearing letters

$$\nu := \sharp \bigcup_{i=1}^{m} \bigcup_{l=1}^{p_i} \{\text{letters which move by } g_i^{(l)}\}.$$

We classify the terms in Eq. (12) into three types.

(i) In a term such that $2\nu > p_1 |\lambda_1| + \cdots + p_m |\lambda_m|$, there exists a letter which appears exactly once in the term. Since the letter is then moved by $g_1^{(1)} \cdots \cdots g_m^{(p_m)}$, the value of $\phi$ is 0.

(ii) The number of the terms such that $2\nu < p_1 |\lambda_1| + \cdots + p_m |\lambda_m|$ is of a smaller order than (11). Indeed, consider a sequence of $p_1$ tableaux of shape $\lambda_1$, $p_2$ tableaux of shape $\lambda_2$, $\cdots$, $p_m$ tableaux of shape $\lambda_m$. The number of such tableaux sequences made of $\nu$ letters is majorized by

$$\binom{n}{\nu} \times (\text{constant independent of } n) \leq \text{constant} \times n^{\nu}.$$

The number under consideration is clearly smaller than the number of such tableaux sequences. Hence it vanishes under normalization by (11) as $n \to \infty$.

(iii) There remain the terms such that $2\nu = p_1 |\lambda_1| + \cdots + p_m |\lambda_m|$. If such a term contains a letter which appears only once, the value of $\phi$ at the term is 0 as in (i). Hence we have only to take into account the terms such that

$$\text{every letter appearing in } g_1^{(1)} \cdots g_1^{(p_1)} \cdots g_m^{(1)} \cdots g_m^{(p_m)} \text{ appears exactly twice there} \tag{13}$$

in the right hand side of Eq. (12).
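For the simplest case $m = 1$, $\lambda_1 = (2^1)$, $p_1 = 4$, the sum in Eq. (12) can be counted directly on a computer. The Python sketch below is a brute-force illustration, not the method of the paper; the function names and the encoding of permutations as tuples are choices made here. It counts the 4-tuples of transpositions in $S_n$ whose product is $e$ and divides by $(\sharp C^{(n)}_{\lambda_1})^{2}$; by Theorem 1 the ratio should approach $\int x^4 \gamma(dx) = 3$ as $n$ grows.

```python
from collections import Counter
from itertools import combinations


def transpositions(n):
    """All transpositions of {0, ..., n-1}, each as a permutation tuple."""
    perms = []
    for a, b in combinations(range(n), 2):
        p = list(range(n))
        p[a], p[b] = b, a
        perms.append(tuple(p))
    return perms


def compose(g, h):
    """The product gh, acting as x -> g(h(x))."""
    return tuple(g[h[x]] for x in range(len(g)))


def inverse(g):
    """Inverse permutation of g."""
    inv = [0] * len(g)
    for x, y in enumerate(g):
        inv[y] = x
    return tuple(inv)


def surviving_terms(n):
    """Number of (g1, g2, g3, g4) of transpositions in S_n with g1 g2 g3 g4 = e."""
    ts = transpositions(n)
    # pairs[g] = number of ways to write g as a product of two transpositions
    pairs = Counter(compose(g, h) for g in ts for h in ts)
    # g1 g2 g3 g4 = e  if and only if  g3 g4 = (g1 g2)^{-1}
    return sum(count * pairs[inverse(g)] for g, count in pairs.items())


def normalized_fourth_moment(n):
    """Left hand side of Eq. (5) at finite n for lambda = (2^1), p = 4."""
    class_size = n * (n - 1) // 2
    return surviving_terms(n) / class_size ** 2


if __name__ == "__main__":
    for n in (4, 6, 10, 20):
        print(n, surviving_terms(n), normalized_fourth_moment(n))
```

The excess over 3 at finite $n$ comes from the terms of type (ii), which involve fewer letters and are of lower order in $n$.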


The following observation enables us to see which terms really survive among those which satisfy condition (13).

Lemma 1. In a product $g_1 \cdots g_q$ ($g_i \in S_n$, $g_i \neq e$), express each $g_i$ as a product of cycles. Assume that every letter appearing in $g_1 \cdots g_q$ appears exactly twice there. Then we have

$$g_1 \cdots g_q = e \iff \forall \text{ cycle } S \text{ in } g_1 \cdots g_q,\ \exists\, S^{-1} \text{ in } g_1 \cdots g_q \text{ cleared of } S.$$

Proof. $\Longleftarrow$: Under the assumption, every letter in cycle $S$ appears again in and only in $S^{-1}$. Then, two cycles $S$ and $S'$ share no common letters, and hence commute, if $S' \neq S^{-1}$. Since the cycles in $g_1 \cdots g_q$ are thus commutative, their product is clearly $e$.

$\Longrightarrow$: Take an arbitrary cycle $S$ in $g_1 \cdots g_q$; $e = \cdots S \cdots \in S_n$. First look at the left side from $S$ in $\cdots S \cdots$. If we find there a cycle in which a letter contained in $S$, say $l$, appears for the first time, let $T$ denote the cycle. Set $e = \cdots S \cdots =: Z T Y S X$. We show $T = S^{-1}$. Set $S =: (l\ h_1 \cdots h_s)$ and $T =: (l\ k_t \cdots k_1)$. $l, h_1, \cdots, h_s$ never appear in $Y$ in view of the definition of $T$. Since

$$l \xrightarrow{X} l \xrightarrow{S} h_1 \xrightarrow{Y} h_1 \xrightarrow{T} l \xrightarrow{Z} l$$

holds, we get $h_1 = k_1$. Then, $h_1 = k_1$ appears only in $S$ and $T$. Since

$$h_1 \xrightarrow{X} h_1 \xrightarrow{S} h_2 \xrightarrow{Y} h_2 \xrightarrow{T} h_1 (= k_1) \xrightarrow{Z} h_1$$

holds, we get $h_2 = k_2$. Thus we get $h_1 = k_1, \cdots, h_s = k_s$. These letters appear only in $S$ and $T$. Since

$$h_s \xrightarrow{X} h_s \xrightarrow{S} l \xrightarrow{Y} l \xrightarrow{T} h_s \xrightarrow{Z} h_s$$

holds, we get $h_s = k_t$. Hence $T = S^{-1}$ is shown.

If we find no cycles sharing a common letter with $S$ in the left side from $S$, then we find a cycle $T$ which contains a letter in $S$ for the first time in the right side from $S$. Hence we have $e = X S Y T Z$, where $Y$ does not share any letters with $S$. Taking the inverse $e = Z^{-1} T^{-1} Y^{-1} S^{-1} X^{-1}$, we see $T = S^{-1}$ from the same discussion as in the previous paragraph.

Under condition (13), Lemma 1 implies that a term in the right hand side of Eq. (12) survives (i.e. the value of $\phi$ is 1) if and only if

$$\text{the cycles in } g_1^{(1)} \cdots g_1^{(p_1)} \cdots g_m^{(1)} \cdots g_m^{(p_m)} \text{ consist of “cycle vs inverse cycle” pairs.} \tag{14}$$

3.2. Computation of the limit. It is obvious that two cycles forming a “cycle vs inverse cycle” pair in Eq. (12) should be of equal length. Two cycles contained in the same $g_i^{(l)}$ cannot form a pair. Each term in Eq. (12) has $p_1 k_1^{(j)} + \cdots + p_m k_m^{(j)}$ cycles of length $j$. Here consider a complete $(p_1 + \cdots + p_m)$-partite graph $K_{k_1^{(j)}, \cdots, k_1^{(j)}, \cdots, k_m^{(j)}, \cdots, k_m^{(j)}} =: \Gamma^{(j)}$ and set $\Gamma := \Gamma^{(2)} \cup \Gamma^{(3)} \cup \cdots$. The vertices of $\Gamma^{(j)}$ are divided into $p_1 + \cdots + p_m$ parts, where the first $p_1$ parts contain $k_1^{(j)}$ vertices respectively, the next $p_2$ parts contain $k_2^{(j)}$ vertices respectively, and so on. Arbitrary two vertices in different parts are joined by an edge in $\Gamma^{(j)}$, while two vertices in the same part are not. The vertices and the edges of $\Gamma$ are the union of those of the $\Gamma^{(j)}$’s respectively. Of course, $\Gamma$ is actually a finite union of $\Gamma^{(j)}$’s.

Fig. 1. The vertices of $\Gamma^{(j)}$

Let $\mathrm{pm}(\Gamma)$ denote the number of perfect matchings in $\Gamma$. A perfect matching $M$ in $\Gamma$ is by definition a subset of the edges of $\Gamma$ such that every vertex of $\Gamma$ lies in exactly one edge in $M$.

Lemma 2. Under the notation in Theorem 1, we have

$$\lim_{n \to \infty} \phi\left(\left(\frac{A_{\lambda_1}^{(n)}}{\sqrt{\sharp C_{\lambda_1}^{(n)}}}\right)^{p_1} \cdots \left(\frac{A_{\lambda_m}^{(n)}}{\sqrt{\sharp C_{\lambda_m}^{(n)}}}\right)^{p_m}\right) = \mathrm{pm}(\Gamma) \Big/ \prod_{j \geq 2} (k_1^{(j)}!)^{p_1/2} \cdots (k_m^{(j)}!)^{p_m/2}. \tag{15}$$

Proof. Step 1. For $g_i^{(l)}$ in Eq. (12), the following one-to-one correspondence holds:

$g_i^{(l)} \longleftrightarrow$ Young tableau of shape $\lambda_i$ mod
(A) cyclic permutation within a row,
(B) permutation among rows of equal length.

For a while, we forget the identification of (B). (Later in Step 4 we adjust the redundancy.) Let us count up the number of the sequences of $p_1 + \cdots + p_m$ tableaux mod (A) under conditions (13) and (14).

Step 2. We assign a perfect matching in $\Gamma$ to a sequence of $p_1 + \cdots + p_m$ tableaux mod (A) satisfying (13) and (14) through row-vertex correspondence in the way that two vertices corresponding to a “cycle vs inverse cycle” pair are joined by an edge in the matching. In this way a perfect matching in $\Gamma$ is uniquely determined by a tableaux sequence. Indeed, two distinct perfect matchings in $\Gamma$ give two distinct perfect matchings in $\Gamma^{(j)}$ for at least one $j$. Then there exists a vertex in $\Gamma^{(j)}$ which forms pairs with different vertices in the two matchings. Hence the two perfect matchings cannot be assigned to the same tableaux sequence.

Step 3. Let a perfect matching $M$ in $\Gamma$ be given. How many tableaux sequences does $M$ admit? Choose a vertex in each pair. As for the rows to which these chosen vertices are assigned, the $(p_1 |\lambda_1| + \cdots + p_m |\lambda_m|)/2$ boxes contained there can be filled by arbitrary distinct letters with adjustment of mod (A). Then, the row to which the other vertex in each pair is assigned is uniquely determined mod (A) by condition (14). Hence the inquired number is

$$n^{\underline{(p_1 |\lambda_1| + \cdots + p_m |\lambda_m|)/2}} \Big/ \prod_{j \geq 2} j^{(p_1 k_1^{(j)} + \cdots + p_m k_m^{(j)})/2}. \tag{16}$$

Step 4. From Step 2 and Step 3, we see that the number of the sequences of $p_1 + \cdots + p_m$ tableaux mod (A) satisfying (13) and (14) is $\mathrm{pm}(\Gamma)$ times Eq. (16). Now this quantity must be divided by $\prod_{j \geq 2} (k_1^{(j)}!)^{p_1} \cdots (k_m^{(j)}!)^{p_m}$ in order for mod (B) in Step 1 to be taken into account. Hence we see that the number of the terms which really survive in the right hand side of Eq. (12) is

$$\mathrm{pm}(\Gamma)\, n^{\underline{(p_1 |\lambda_1| + \cdots + p_m |\lambda_m|)/2}} \Big/ \prod_{j \geq 2} j^{(p_1 k_1^{(j)} + \cdots + p_m k_m^{(j)})/2} (k_1^{(j)}!)^{p_1} \cdots (k_m^{(j)}!)^{p_m}.$$

Combining this with

$$(\sharp C_{\lambda_1}^{(n)})^{p_1/2} \cdots (\sharp C_{\lambda_m}^{(n)})^{p_m/2} \sim \frac{n^{|\lambda_1| p_1/2} \cdots n^{|\lambda_m| p_m/2}}{\prod_{j \geq 2} (j^{k_1^{(j)}} k_1^{(j)}!)^{p_1/2} \cdots (j^{k_m^{(j)}} k_m^{(j)}!)^{p_m/2}} \quad (n \to \infty)$$

(seen from Eq. (10)), we obtain Eq. (15).

To complete the proof of Theorem 1, we use an integral formula giving the number of perfect matchings. Let us review the materials quickly by following Godsil’s book [G]. The complement of a graph $G$, denoted by $\overline{G}$, is the graph such that the vertex set of $\overline{G}$ is that of $G$ and two vertices are joined by an edge in $\overline{G}$ if and only if they are not in $G$. An $r$-matching in graph $G$ is by definition an $r$-subset $\{e_1, \cdots, e_r\}$ of edges of $G$ such that no $e_i$ and $e_j$ ($i \neq j$) share a common vertex. Set

$$p(G, r) := \sharp\{r\text{-matching in } G\} \quad (r \geq 1), \qquad p(G, 0) := 1,$$
$$\mu(G, x) := \sum_{r \geq 0} (-1)^r p(G, r)\, x^{n - 2r},$$

where $n$ is the number of the vertices of $G$. $\mu(G, x)$ is called the matchings polynomial of $G$. One sees that

$$\mu(G_1 \cup G_2, x) = \mu(G_1, x)\, \mu(G_2, x), \tag{17}$$
$$\mu(K_r, x) = H_r(x), \text{ where } K_r \text{ is the complete graph with } r \text{ vertices}, \tag{18}$$

hold. See [G] Chapter 1. ($\mu(K_r, x)$ satisfies the recurrence formula Eq. (4) for Hermite polynomials.)

Formula. ([G] Theorem 2.2)

$$\mathrm{pm}(G) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-x^2/2} \mu(\overline{G}, x)\, dx. \tag{19}$$

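Now we apply Eq. (19) to $G = \Gamma^{(j)}$. Since

$$\overline{\Gamma^{(j)}} = K_{k_1^{(j)}} \cup \cdots \cup K_{k_1^{(j)}} \cup \cdots \cdots \cup K_{k_m^{(j)}} \cup \cdots \cup K_{k_m^{(j)}}$$

holds, Eq. (17) and Eq. (18) yield

$$\mu(\overline{\Gamma^{(j)}}, x) = \mu(K_{k_1^{(j)}}, x)^{p_1} \cdots \mu(K_{k_m^{(j)}}, x)^{p_m} = H_{k_1^{(j)}}(x)^{p_1} \cdots H_{k_m^{(j)}}(x)^{p_m}.$$

Hence Eq. (19) yields

$$\mathrm{pm}(\Gamma^{(j)}) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-x^2/2} H_{k_1^{(j)}}(x)^{p_1} \cdots H_{k_m^{(j)}}(x)^{p_m}\, dx.$$

Combining this with $\mathrm{pm}(\Gamma) = \mathrm{pm}(\cup_{j \geq 2} \Gamma^{(j)}) = \prod_{j \geq 2} \mathrm{pm}(\Gamma^{(j)})$ and then using Lemma 2, we consequently obtain Eq. (5).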
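The identity used in this last step, namely that the number of perfect matchings of a complete multipartite graph equals a Gaussian integral of a product of Hermite polynomials, can be tested on small cases. The Python sketch below is an illustration under assumptions made here: a complete multipartite graph is described only by its list of part sizes, and all function names are hypothetical. The left side is counted by brute force; the right side uses Eq. (4) together with the standard Gaussian moments $0$ (odd order) and $(n-1)!!$ (even order $n$).

```python
def hermite(r):
    """Coefficients of H_r (lowest degree first) from the recurrence Eq. (4)."""
    prev, cur = [1], [0, 1]
    if r == 0:
        return prev
    for k in range(1, r):
        nxt = [0] + cur
        for i, c in enumerate(prev):
            nxt[i] -= k * c
        prev, cur = cur, nxt
    return cur


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def gauss_moment(n):
    """E[X^n] for a standard Gaussian X."""
    if n % 2:
        return 0
    m = 1
    for i in range(1, n, 2):
        m *= i
    return m


def count_perfect_matchings(parts):
    """Brute-force count for the complete multipartite graph with given part sizes."""
    labels = [i for i, size in enumerate(parts) for _ in range(size)]

    def rec(remaining):
        if not remaining:
            return 1
        first, rest = remaining[0], remaining[1:]
        total = 0
        for idx, other in enumerate(rest):
            if other != first:            # edges join different parts only
                total += rec(rest[:idx] + rest[idx + 1:])
        return total

    return rec(labels)


def hermite_integral(parts):
    """Gaussian integral of the product of H_k over the part sizes k, cf. Eq. (19)."""
    poly = [1]
    for k in parts:
        poly = poly_mul(poly, hermite(k))
    return sum(c * gauss_moment(i) for i, c in enumerate(poly))


if __name__ == "__main__":
    for parts in ([2, 2], [1, 1, 1, 1], [2, 1, 1], [3, 3], [2, 2, 2]):
        print(parts, count_perfect_matchings(parts), hermite_integral(parts))
```

In the notation of Sect. 3.2, the part sizes are $k_1^{(j)}$ repeated $p_1$ times, $\cdots$, $k_m^{(j)}$ repeated $p_m$ times, so each call corresponds to one factor $\mathrm{pm}(\Gamma^{(j)})$.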


Remark. In our proof, the asymptotic independence among different j’s, j being length of a row of a diagram, results from disjoint union structure of graph 0, which yields that pm(0) splits into product of pm(0(j) )’s. 3.3. Another approach. The referee suggested another illuminating way to prove our result. This approach would help simplify the presentation, while it uses Kerov’s theorem and therefore is not self-contained. This subsection is based on the referee comments. Let us consider a sequence (Xj )j=2,3,··· of independent √ random variables obeying Q the standard Gaussian law. Set Yλ := j≥2 Hk(j) (Xj )/ k (j) ! for each Young diagram (2)

(3)

(j)

λ = (2k 3k · · · j k · · ·). Our goal (i.e. Theorem 1) is to prove that the joint distribution of q q (n) (n) A(n) / ]C , · · · , A / ]Cλ(n) (20) λ1 λ1 λm m converges to the joint distribution of Yλ1 , · · · , Yλm as n → ∞. On the q other hand, Kerov’s (n) theorem assures that every finite joint distribution of {Hk(j) (A(n) j / ]Cj )}j=2,3,··· con-

(n) verges to that of {Hk(j) (Xj )}j=2,3,··· as n → ∞ for any diagram, where A(n) j [resp. Cj ] denotes the adjacency operator [resp. the conjugacy class] associated with the j-cycles. Hence, the joint distribution of q q q q Y Y (j) (j) (n) (n) (n) (n) (j) (A Hk(j) (Aj / ]Cj )/ k1 ! , · · · , H km (21) j / ]Cj )/ km ! 1

j≥2

j≥2

converges to that of $Y_{\lambda_1}, \dots, Y_{\lambda_m}$ as $n \to \infty$. The essential part of this approach thus consists of proving the asymptotic coincidence of the distributions of Eq. (20) and Eq. (21). (This part is not trivial and needs some similar counting arguments in the spirit of our original proof.) An advantage of this approach is that it may be in a sense explanatory of the appearance of Hermite polynomials.

Let us consider, in particular, the diagram $\mu_r := (2^r)$ in Eq. (20) and Eq. (21). One can see the connection between $A_{\mu_r}^{(n)}/\sqrt{\sharp C_{\mu_r}^{(n)}}$ and $H_r\bigl(A_{\mu_1}^{(n)}/\sqrt{\sharp C_{\mu_1}^{(n)}}\bigr)/\sqrt{r!}$ as follows. In the expression

$$2^{r+1}\, r!\, A_{\mu_1}^{(n)} A_{\mu_r}^{(n)} = \sum_{i \ne j} (i\ j) \sum_{k_h, l_h\ (1\le h\le r)\,:\ \text{distinct}} (k_1\ l_1)\cdots(k_r\ l_r),$$

we divide the sum on the RHS into 4 pieces: $\sum^{(1)} + \sum^{(2)} + \sum^{(3)} + \sum^{(4)}$, where each sum is taken over:

$\sum^{(1)}$: $i, j, k_1, l_1, \dots, k_r, l_r$ are all distinct,
$\sum^{(2)}$: $k_1, l_1, \dots, k_r, l_r$ are distinct and $\sharp(\{i,j\} \cap \{k_h, l_h \mid 1 \le h \le r\}) = 1$,
$\sum^{(3)}$: $k_1, l_1, \dots, k_r, l_r$ are distinct and $\exists h$ such that $\{i,j\} = \{k_h, l_h\}$,
$\sum^{(4)}$: $k_1, l_1, \dots, k_r, l_r$ are distinct and $\exists h, h'$ such that $h \ne h'$, $i \in \{k_h, l_h\}$, $j \in \{k_{h'}, l_{h'}\}$.

Note that we consider normalization by $\sqrt{\sharp C_{\mu_1}^{(n)}\, \sharp C_{\mu_r}^{(n)}} \asymp n^{r+1}$. (Recall Eq. (10) and Eq. (11).) Since $\sum^{(2)}$ is a finite (depending only on $r$) sum of such expressions as

$$\sum_{i, k_h, l_h\ (1\le h\le r)\,:\ \text{distinct}} (i\ k_1\ l_1)(k_2\ l_2)\cdots(k_r\ l_r),$$

it has moments of the same order with $A_{(2^{r-1}3^1)}$. Hence the correct normalization is by $n^{(2(r-1)+3)/2} = o(n^{r+1})$ as was seen in Sect. 3.1. Similarly $\sum^{(4)}$, being normalized by $n^{r+1}$, has the vanishing moments as $n \to \infty$. Combining these with $\sum^{(1)} = 2^{r+1}(r+1)!\,A^{(n)}_{\mu_{r+1}}$ and $\sum^{(3)} = 2^r r!\, n(n-1)\, A^{(n)}_{\mu_{r-1}}$, we get

$$2^{r+1}\, r!\, A^{(n)}_{\mu_1} A^{(n)}_{\mu_r} = 2^{r+1}(r+1)!\, A^{(n)}_{\mu_{r+1}} + 2^r r!\, n(n-1)\, A^{(n)}_{\mu_{r-1}} + (\text{remainder}),$$

and finally

$$(1+o(1))\,\sqrt{(r+1)!}\;\frac{A^{(n)}_{\mu_{r+1}}}{\sqrt{\sharp C^{(n)}_{\mu_{r+1}}}} = \frac{A^{(n)}_{\mu_1}}{\sqrt{\sharp C^{(n)}_{\mu_1}}}\,\sqrt{r!}\;\frac{A^{(n)}_{\mu_r}}{\sqrt{\sharp C^{(n)}_{\mu_r}}} - r\sqrt{(r-1)!}\;\frac{A^{(n)}_{\mu_{r-1}}}{\sqrt{\sharp C^{(n)}_{\mu_{r-1}}}}\,(1+o(1)) + \frac{(\text{remainder})}{n^{r+1}}. \qquad (22)$$
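As a sanity check on the shape of Eq. (22), drop the $o(1)$ factors and the remainder and write $h_r$ for the normalized quantity $A_{\mu_r}^{(n)}/\sqrt{\sharp C_{\mu_r}^{(n)}}$; the relation then reads $\sqrt{(r+1)!}\,h_{r+1} = h_1\sqrt{r!}\,h_r - r\sqrt{(r-1)!}\,h_{r-1}$. The Python sketch below is only an illustration. It assumes that the Hermite polynomials meant here are the monic ones with $H_{r+1}(x) = xH_r(x) - rH_{r-1}(x)$, and it checks numerically that $h_r = H_r/\sqrt{r!}$ satisfies the same three-term relation. The function names are hypothetical.

```python
import math


def hermite_coeffs(r_max):
    """Coefficient lists (lowest degree first) of H_0 .. H_{r_max}.

    Assumed convention: H_0 = 1, H_1 = x, H_{r+1} = x*H_r - r*H_{r-1}.
    """
    polys = [[1], [0, 1]]
    for r in range(1, r_max):
        shifted = [0] + polys[r]  # coefficients of x * H_r
        prev = polys[r - 1] + [0] * (len(shifted) - len(polys[r - 1]))
        polys.append([s - r * p for s, p in zip(shifted, prev)])
    return polys[: r_max + 1]


def evaluate(coeffs, x):
    """Evaluate a polynomial given by its coefficient list at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))


def normalized_defect(r, x, polys):
    """LHS minus RHS of the truncated Eq. (22) with h_m = H_m / sqrt(m!)."""
    def h(m):
        return evaluate(polys[m], x) / math.sqrt(math.factorial(m))

    lhs = math.sqrt(math.factorial(r + 1)) * h(r + 1)
    rhs = (h(1) * math.sqrt(math.factorial(r)) * h(r)
           - r * math.sqrt(math.factorial(r - 1)) * h(r - 1))
    return lhs - rhs


if __name__ == "__main__":
    polys = hermite_coeffs(6)
    print(polys[3])  # coefficients of H_3
    print(max(abs(normalized_defect(r, 0.7, polys)) for r in range(1, 6)))
```

The defect is zero up to floating-point rounding, which is all the correspondence $A_{\mu_r}/\sqrt{\sharp C_{\mu_r}} \leftrightarrow H_r/\sqrt{r!}$ requires at this formal level.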

Equation (22) recovers the recurrence formula Eq. (4) of Hermite polynomials under the correspondence $A_{\mu_r}/\sqrt{\sharp C_{\mu_r}} \longleftrightarrow H_r/\sqrt{r!}$.

Acknowledgement. I express my sincere gratitude to Prof. Y. Yamasaki for his comments and encouragement. I express deep appreciation to Prof. L. Accardi for kind instruction in several references.

References

[A-B] Accardi, L., Bach, A.: Quantum central limit theorems for strongly mixing random variables. Z. Wahr. verw. Geb. 68, 393–402 (1985)
[A-H-O] Accardi, L., Hashimoto, Y., Obata, N.: Notions of independence related to the free group. Preprint 1997
[A-L1] Accardi, L., Lu, Y.G.: Quantum central limit theorems for weakly dependent maps I. Acta Math. Hungar. 63 (2), 183–212 (1994)
[A-L2] Accardi, L., Lu, Y.G.: Quantum central limit theorems for weakly dependent maps II. Acta Math. Hungar. 63 (3), 249–282 (1994)
[B] Biane, P.: Permutation model for semi-circular systems and quantum random walks. Pacific J. Math. 171, 373–387 (1995)
[D] Durrett, R.: Probability: Theory and examples. Belmont, California: Duxbury Press 1991
[G-W] Giri, N., von Waldenfels, W.: An algebraic version of the central limit theorem. Z. Wahr. verw. Geb. 42, 129–134 (1978)
[G] Godsil, C.D.: Algebraic combinatorics. New York: Chapman & Hall, 1993
[Ha] Hashimoto, Y.: A combinatorial approach to limit distributions of random walks: On discrete groups. Preprint 1996
[Ho] Hora, A.: Central limit theorems and asymptotic spectral analysis on large graphs. Submitted 1997
[K] Kerov, S.: Gaussian limit for the Plancherel measure of the symmetric group. C. R. Acad. Sci. Paris 316, Série I, 303–308 (1993)
[M] Muraki, N.: Noncommutative Brownian motion in monotone Fock space. Commun. Math. Phys. 183, 557–570 (1997)
[Si] Simon, B.: Representations of finite and compact groups. Providence, RI: Amer. Math. Soc., 1996
[Sp] Speicher, R.: A new example of "independence" and "white noise". Probab. Th. Rel. Fields 84, 141–159 (1990)
[S-W] Speicher, R., von Waldenfels, W.: A general central limit theorem and invariance principle. In: Accardi, L. et al. (eds.) Quantum probability and related topics IX, Singapore: World Scientific, 1994, pp. 371–387
[V-D-N] Voiculescu, D.V., Dykema, K.J., Nica, A.: Free random variables. Providence, RI: Amer. Math. Soc., 1992

Communicated by H. Araki

Commun. Math. Phys. 195, 417 – 434 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Simple Facts Concerning Nambu Algebras

Philippe Gautheron^{1,2}

1 Laboratoire Gevrey de Mathématique Physique, Université de Bourgogne, BP 400, F-21011 Dijon Cedex, France. E-mail: [email protected]
2 Collège Henri Dunant, Netreville, Evreux, France

Received: 24 October 1997 / Accepted: 1 December 1997

Abstract: A class of substitution equations arising in the extension of Jacobi identity for n-gebras is studied and solved. Graded bracket and cohomology adapted to the study of formal deformations are presented. New identities in the case of Nambu-Lie algebras are proved. The triviality in the Gerstenhaber sense of certain deformed n-skew-symmetric brackets, satisfying the Leibniz rule with respect to a star-product, is shown for n ≥ 3.

1. Introduction

Recently, there have been several works dealing with various generalizations of Poisson structures by extending the binary bracket to an $n$-bracket. The main point for these generalizations is to look for the corresponding identity which would play the rôle of the Jacobi identity for the usual Poisson bracket. Among the ways to present the Jacobi identity in view of generalizations, essentially two were considered so far:

a) The sum over the symmetric group $S_3$ of the composed brackets $[[\cdot,\cdot],\cdot]$ is zero.
b) The adjoint map $b \mapsto [a,b]$ is a Lie algebra derivation.

Formulation a), when extended to $n$-brackets, leads to the notion of generalized Poisson structures studied in [1]. The corresponding identity is obtained by complete skew-symmetrization of the $2n-1$ composed brackets when $n$ is even. This is equivalent to requiring that the Schouten bracket of the $n$-tensor defining the $n$-bracket with itself vanishes. For completeness, one should note that there is another way to present the Jacobi identity, by requiring a graded bracket of the Lie operation with itself to vanish. This might, in a more general context, lead to still different generalizations; this is not easy to achieve and in fact has not been considered so far.


A more interesting generalization from the dynamical point of view is found by considering an identity which is the analogue of formulation b). In this case one gets the Fundamental Identity of Nambu Mechanics [7, 12]:

$$[x_1,\dots,x_{n-1},[y_1,\dots,y_n]] - \sum_{1\le i\le n} [y_1,\dots,y_{i-1},y_{i+1},\dots,y_n,[x_1,\dots,x_{i-1},y_i,x_i,\dots,x_{n-1}]] = 0.$$

Actually there are other ways, intermediate between the previous two, to extend the Jacobi identity for $n \ge 3$. One may consider:

1) The length of the symmetric group as annihilator for $[[\cdot,\cdot],\cdot]$, from $n+1$ to $2n-1$.
2) The length $p$ of the argument of an adjoint map and the properties it has to verify as an $(n-p)$-multilinear map.

In the case $n = 2$ the above two formulations coincide. One should also note that for $n \ge 3$ two a priori different formulations can give the same structure. For example, the Fundamental Identity is in fact equivalent to

$$[x_1,\dots,x_{n-2},y_{n+1},[y_1,\dots,y_n]] - \sum_{1\le i\le n} [x_1,\dots,x_{n-2},y_i,[y_1,\dots,y_{i-1},y_{n+1},y_{i+1},\dots,y_n]] = 0,$$

i.e. the skew-symmetrization in the last $n+1$ variables of the composed brackets is 0. This unexpected result can be proved using some substitutions of the arguments of the identity and linear combinations of the equalities so obtained. In the present paper, we study all the structures arising from different choices of the length of the symmetric group. For each of them we give an equivalent formulation in terms of properties of the adjoint map and an appropriate algebraic auto-bracket providing a cohomology adapted to deformation theory.

In Sect. 2 we shall study substitution equations of the form $\sum_i a_i\,\sigma_i.F = 0$, where $F$ is a function in $r$ variables and the sum is performed over a subset $\{\sigma_i\}$ of the symmetric group of $r$ elements. Here we shall be concerned with the case where $F$ is a function in $p+q$ variables, skew-symmetric in the first $p$ and in the last $q$ variables, as those coming from the composition of two skew-symmetric operators $A$, $B$, namely $A(x_1,\dots,x_p,B(y_1,\dots,y_q))$. This study will allow us to obtain equivalent formulations of the structure condition for $n$-gebras as stated above. For some substitution equations, the general solutions are given and we shall be in a position to propose some higher analogue of the notion of associativity. Notice that the functions $F$ are not necessarily linear.

In Sect. 3, after presenting in a simple way the Gerstenhaber bracket, we shall present a hierarchized family of extended Jacobi identities and their different interpretations and introduce the corresponding auto-brackets. For the structures of even order, cohomological operations classifying the equivalence classes and obstructions to deformation follow straightforwardly from these brackets. In this family the first structure is given by the requirement that the action of the symmetric group $S_{n+1}$ on its composed brackets is zero. The second structure is obtained from $S_{n+3}$ and so on.
In some cases, we show that intermediate situations are equivalent to a lower order one. The first structure is equivalent to the Fundamental Identity and therefore the Nambu-Poisson structures appear to be the richest ones. To illustrate this fact we shall present for this last structure a set of remarkable properties. It would be interesting to find some concrete finite-dimensional examples of these $n$-Lie brackets in order to apply directly the previous properties. A first step would be to consider the cases studied by Filippov [6], where these properties seem to provide directly nontrivial multilinear identities. All the results obtained in the present paper are of algebraic nature, hence they are valid for any kind of $n$-skew-symmetric linear maps on any vector space $V$ (finite or infinite dimensional, over a field of characteristic zero).

In Sect. 4, we shall deal with the situation where $V$ is the space of smooth functions on a finite-dimensional manifold. The $n$-brackets verify the Leibniz rule for the usual pointwise product, so that we shall refer to them as Poisson brackets. The extended Jacobi identities presented in Sect. 2 give rise to algebraic and differential equations. In the Nambu-Poisson case the algebraic equations imply the decomposability of Nambu-Poisson tensors [8]. For other weaker generalizations of the Jacobi identity we ask two questions concerning the form of the solutions. Finally, we shall consider Gerstenhaber deformations of $n$-skew-symmetric brackets. It is not known yet whether the second cohomology space classifying this kind of deformation is trivial or not. There is some strong evidence that there is no nontrivial deformation of $n$-brackets for $n \ge 3$. In fact we shall study the special case where the deformed $n$-skew-symmetric bracket verifies the Leibniz rule with respect to a star-product (the expression star-product will mean associative deformation of the pointwise product of functions). We will show that for $n \ge 3$ such a star-product must be an Abelian deformation and, according to [11], it must be a trivial deformation (in the Gerstenhaber sense).
Notice that this triviality result about Abelian deformations is no longer valid in the more general setting of Zariski quantization [4, 5], where nontrivial Abelian deformations of the usual product appear in a generalized deformation framework not of Gerstenhaber type.
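The Fundamental Identity quoted in the Introduction can be checked on a small concrete example. The sketch below is not taken from the paper. It assumes the 3-bracket on $\mathbb{R}^4$ whose $i$-th component is $\det(e_i,a,b,c)$ (the generalized vector product, a standard example of a bracket satisfying the Fundamental Identity) and evaluates, for $n = 3$ and integer vectors, the left-hand sides of the Fundamental Identity and of its skew-symmetrized form, following the index conventions of the two displayed identities. All function names are illustrative.

```python
from itertools import permutations
import random


def perm_sign(p):
    """Sign of a permutation given as a tuple of 0..n-1."""
    p = list(p)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign


def det(rows):
    """Exact integer determinant via the Leibniz formula (small sizes only)."""
    n = len(rows)
    total = 0
    for p in permutations(range(n)):
        term = perm_sign(p)
        for r in range(n):
            term *= rows[r][p[r]]
        total += term
    return total


def bracket(a, b, c):
    """Assumed example 3-bracket on R^4: i-th component is det(e_i, a, b, c)."""
    basis = [[1 if k == i else 0 for k in range(4)] for i in range(4)]
    return [det([e, a, b, c]) for e in basis]


def sub(u, v):
    return [s - t for s, t in zip(u, v)]


def fundamental_identity_defect(x1, x2, y1, y2, y3):
    """Left-hand side of the Fundamental Identity for n = 3."""
    out = bracket(x1, x2, bracket(y1, y2, y3))
    out = sub(out, bracket(y2, y3, bracket(y1, x1, x2)))  # i = 1
    out = sub(out, bracket(y1, y3, bracket(x1, y2, x2)))  # i = 2
    out = sub(out, bracket(y1, y2, bracket(x1, x2, y3)))  # i = 3
    return out


def skew_form_defect(x1, y1, y2, y3, y4):
    """Left-hand side of the skew-symmetrized form for n = 3."""
    out = bracket(x1, y4, bracket(y1, y2, y3))
    out = sub(out, bracket(x1, y1, bracket(y4, y2, y3)))  # i = 1
    out = sub(out, bracket(x1, y2, bracket(y1, y4, y3)))  # i = 2
    out = sub(out, bracket(x1, y3, bracket(y1, y2, y4)))  # i = 3
    return out


if __name__ == "__main__":
    random.seed(1)
    vecs = [[random.randint(-3, 3) for _ in range(4)] for _ in range(5)]
    print(fundamental_identity_defect(*vecs))
    print(skew_form_defect(*vecs))
```

Both defects vanish for every choice of integer vectors. A numerical check of this kind only illustrates the identities on one example; it says nothing about their equivalence in general, which is the content of Sect. 2.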

2. Substitution Equations on (p, q)-Skew-Symmetric Functions

2.1. Definition of the spaces $S_{p,q}$ and some of their operators. In this section, we present the definitions of various function spaces and maps that will be used throughout the paper. Several operations on finite sequences appear in the paper and, for simplicity, the notation

$$(x_1,\dots,\overset{i}{a},\dots,\overset{j}{b},\dots,x_n)$$

will stand for $(x_1,\dots,x_{i-1},a,x_{i+1},\dots,x_{j-1},b,x_{j+1},\dots,x_n)$.

Definition 1 (Space of functions $S_{p,q}$). Let $p$ and $q$ be two integers, $E$ be any set and $M$ a vector space (of arbitrary dimension) over a field of characteristic 0. We shall denote by $S_{p,q}$ the space of functions in $p+q$ variables of $E$ taking values in $M$: $F(x_1,\dots,x_p;y_1,\dots,y_q)$, skew-symmetric in $x_1,\dots,x_p$ and in $y_1,\dots,y_q$. $S_{0,0}$ is identified with $M$ and for convenience we define $S_{p,q}$ to be 0 for negative $p$ or $q$.

Definition 2 (The operators $T_r(p,q)$ and $T_l(p,q)$). For every pair of integers $p$, $q$, we define a linear map $T_r(p,q)\colon S_{p,q} \to S_{p+1,q-1}$ by the following formulas:
a) On $S_{0,q}$, $q \ge 1$: $(T_r(0,q)F)(x_1;y_1,\dots,y_{q-1}) = F(x_1,y_1,\dots,y_{q-1})$;
b) On $S_{p,0}$, $p \ge 0$: $T_r(p,0) = 0$;
c) Otherwise:

$$(T_r(p,q)F)(x_1,\dots,x_{p+1};y_1,\dots,y_{q-1}) = F(x_1,\dots,x_p;x_{p+1},y_1,\dots,y_{q-1}) - \sum_{1\le i\le p} F(x_1,\dots,\overset{i}{x_{p+1}},\dots,x_p;x_i,y_1,\dots,y_{q-1}).$$

Similarly, $T_l(p,q)$ maps $S_{p,q}$ into $S_{p-1,q+1}$ by $T_l(p,q)F = (T_r(q,p)\hat F)\hat{\ }$, where $\hat{\ }\colon S_{p,q} \to S_{q,p}$ is given by $\hat F(y_1,\dots,y_q;x_1,\dots,x_p) = F(x_1,\dots,x_p;y_1,\dots,y_q)$.

Definition 3 (The operators $\Delta_{p,q}$ and $\sigma_{p,q}$). Let $G_{p,q}$ be the vector space of symmetric maps from $E \times E$ to $S_{p,q}$. For integers $p$, $q$, we define a linear map $\Delta_{p,q}\colon S_{p,q} \to G_{p-1,q-1}$ by:

$$(\Delta_{p,q}F(a,b))(x_1,\dots,x_{p-1};y_1,\dots,y_{q-1}) = F(x_1,\dots,x_{p-1},a;y_1,\dots,y_{q-1},b) + F(x_1,\dots,x_{p-1},b;y_1,\dots,y_{q-1},a).$$

Conversely, we define a linear map $\sigma_{p,q}\colon G_{p,q} \to S_{p+1,q+1}$ by:

$$(\sigma_{p,q}G)(a,x_1,\dots,x_p;b,y_1,\dots,y_q) = G(a,b;x_1,\dots,x_p;y_1,\dots,y_q) - \sum_{1\le i\le p} G(x_i,b;x_1,\dots,\overset{i}{a},\dots,x_p;y_1,\dots,y_q)$$
$$\qquad - \sum_{1\le j\le q} G(a,y_j;x_1,\dots,x_p;y_1,\dots,\overset{j}{b},\dots,y_q) + \sum_{1\le i\le p,\ 1\le j\le q} G(x_i,y_j;x_1,\dots,\overset{i}{a},\dots,x_p;y_1,\dots,\overset{j}{b},\dots,y_q),$$

where $G(a,b;x_1,\dots,x_p;y_1,\dots,y_q)$ stands for $G(a,b)(x_1,\dots,x_p;y_1,\dots,y_q)$.

Definition 4 (The operators $A_s(p,q)$). Let $p$, $q$ and $s$ be three integers verifying $0 \le s \le \min(p,q)$. $A_s(p,q)$ will denote the linear map from $S_{p,q}$ into itself defined by:

$$(A_s(p,q)F)(x_1,\dots,x_p;y_1,\dots,y_q) = \sum F(x_1,\dots,\overset{i_1}{y_{j_1}},\dots,\overset{i_s}{y_{j_s}},\dots,x_p;y_1,\dots,\overset{j_1}{x_{i_1}},\dots,\overset{j_s}{x_{i_s}},\dots,y_q),$$

where the sum is performed over all $s$-tuples $(i_1,\dots,i_s)$ and $(j_1,\dots,j_s)$ satisfying $1 \le i_1 < \cdots < i_s \le p$ and $1 \le j_1 < \cdots < j_s \le q$. Each term in the sum is obtained by interchanging $x_{i_l}$ and $y_{j_l}$, $1 \le l \le s$, in $F(x_1,\dots,x_p;y_1,\dots,y_q)$. $A_0(p,q)$ is the identity and $A_s(p,q)$ is 0 if $s > p$ or $s > q$.

Remark 1. Whenever it is clear from the context or there is no danger of confusion, we shall simply write $T_r$ (resp. $T_l$, $\Delta$, $\sigma$, $A_s$) for $T_r(p,q)$ (resp. $T_l(p,q)$, $\Delta_{p,q}$, $\sigma_{p,q}$, $A_s(p,q)$). We shall also make a slight abuse of notation by viewing the operator $\Delta$ as a map from $S_{p,q}$ into $S_{p-1,q-1}$ by implicitly fixing $a, b \in E$ (cf. Def. 3). When an identity involves $\Delta$, it means that it is valid for any $a, b \in E$ once written in more strict notations.
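To make Definition 4 concrete, the following Python sketch implements $A_s$ literally as a sum over increasing index tuples and applies it to a totally skew-symmetric function. The choice of test function (a determinant of $p+q$ integer vectors) and all names are assumptions made for illustration. For such a function every interchange $x_{i_l} \leftrightarrow y_{j_l}$ contributes a sign, so $A_sF = (-1)^s C_p^s C_q^s F$, where $C_m^n$ is the binomial coefficient.

```python
from itertools import combinations, permutations
from math import comb


def perm_sign(p):
    """Sign of a permutation given as a tuple of 0..n-1."""
    p = list(p)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign


def det(rows):
    """Exact integer determinant via the Leibniz formula (small sizes only)."""
    n = len(rows)
    total = 0
    for p in permutations(range(n)):
        term = perm_sign(p)
        for r in range(n):
            term *= rows[r][p[r]]
        total += term
    return total


def apply_A(s, F, xs, ys):
    """(A_s F)(xs; ys): sum over i_1 < ... < i_s and j_1 < ... < j_s of F
    with x_{i_l} and y_{j_l} interchanged for every l."""
    total = 0
    for I in combinations(range(len(xs)), s):
        for J in combinations(range(len(ys)), s):
            new_x, new_y = list(xs), list(ys)
            for i, j in zip(I, J):
                new_x[i], new_y[j] = ys[j], xs[i]
            total += F(new_x, new_y)
    return total


def skew_F(xs, ys):
    """Illustrative totally skew-symmetric element of S_{p,q}."""
    return det(list(xs) + list(ys))


def skew_eigenvalue(s, p, q):
    """Expected factor (-1)^s C_p^s C_q^s on totally skew-symmetric functions."""
    return (-1) ** s * comb(p, s) * comb(q, s)


if __name__ == "__main__":
    rows = [[1, 2, 0, 0, 1], [0, 1, 3, 0, 0], [0, 0, 1, 1, 0],
            [0, 0, 0, 1, 0], [0, 0, 0, 0, 1]]
    xs, ys = rows[:2], rows[2:]  # p = 2, q = 3
    for s in range(4):
        print(s, apply_A(s, skew_F, xs, ys), skew_eigenvalue(s, 2, 3))
```

Note that `apply_A` returns the function itself for $s = 0$ and 0 for $s > \min(p,q)$, matching the conventions stated in Definition 4.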

2.2. Properties. We summarize basic properties of the maps defined in Sect. 2.1 in the following lemma:

Lemma 1. The following properties hold on $S_{p,q}$ for all $p$, $q$, $s$:

$$[\Delta, A_{s+1}] = A_s\Delta, \qquad (1)$$
$$\sigma\Delta = A_1 + pqI, \qquad (2)$$
$$[T_r, \Delta] = [T_l, \Delta] = 0, \qquad (3)$$
$$T_rT_l = pI - A_1, \qquad T_lT_r = qI - A_1, \qquad (4)$$
$$[T_r, A_1] = (q-p-1)T_r, \qquad (5)$$
$$[T_l, A_1] = (p-q-1)T_l, \qquad (6)$$
$$A_1A_s = (s+1)^2A_{s+1} - s(p-s+q-s)A_s + (p-s+1)(q-s+1)A_{s-1}. \qquad (7)$$

Proof. Tedious computations give all these results. As an example, we give the proof for Eq. (7) for $s = p$ and $p \le q$. $A_pF$ is the sum of the following terms:

$$F(y_{j_1},\dots,y_{j_p};y_1,\dots,\overset{j_1}{x_1},\dots,\overset{j_k}{x_k},\dots,\overset{j_p}{x_p},\dots,y_q), \qquad (8)$$

over $1 \le j_1 < \cdots < j_p \le q$. Now apply $A_1$ to the previous sum, i.e. sum over all the substitutions $x_i \leftrightarrow y_j$, $1 \le i \le p$, $1 \le j \le q$. Each term gives:
– for $j$ not in the $j_1,\dots,j_p$, $p(q-p)$ times itself with an extra minus sign;
– otherwise, one gets terms like:

$$F(y_{j'_1},\dots,x_i,\dots,y_{j'_{p-1}};y_1,\dots,x_1,\dots,y_j,\dots,x_p,\dots,y_q)$$

with $j'_1 < j'_2 < \cdots < j'_{p-1}$.

This term can arise from $q-p+1$ original expressions (8) (the number of possibilities to add another $j$ to $(j'_1,j'_2,\dots,j'_{p-1})$). By recollecting the terms obtained, one gets the identity: $A_1A_p = (q-p+1)A_{p-1} - p(q-p)A_p$. Moreover, the case considered above allows the generalization of Eq. (7) to the cases $s \ge p$ or $s \ge q$ by taking into account the convention $A_{s+1} = 0$ on $S_{p,q}$.

2.3. Theorems. We denote by $C_m^n$ the binomial coefficient $\frac{m!}{n!\,(m-n)!}$.

Theorem 1 (Spectral decomposition of $A_s$). For $0 \le p \le q$, $A_s$ admits on $S_{p,q}$ the following spectral decomposition: $(V_k, a_s^k)_{0\le k\le p}$, with

$$V_k = T_r^{\,p-k}(\ker T_l(k, q+p-k)),$$

and

$$a_s^k(p,q) = \sum_{0\le i\le k} (-1)^{s-i}\, C_k^i\, C_{p-k}^{s-i}\, C_{q-k}^{s-i},$$

with $C_m^n = 0$ for $n > m$ or $n < 0$.
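The eigenvalue formula of Theorem 1 is easy to tabulate. The Python sketch below (function names are illustrative, not from the paper) evaluates $a_s^k(p,q)$ with the stated convention for vanishing binomial coefficients, and can be used to confirm special values that appear later in the text: $a_1^k(p,q) = -(p-k)(q-k)+k$, $a_n^k(n,n+1) = (-1)^{n-k}(n+1-k)$ and $a_n^k(n,n) = (-1)^{n-k}$.

```python
from math import comb


def binom(m, n):
    """C_m^n = m! / (n! (m-n)!), set to 0 for n > m or n < 0."""
    if n < 0 or m < 0 or n > m:
        return 0
    return comb(m, n)


def eigenvalue(k, s, p, q):
    """a_s^k(p, q) = sum_{0<=i<=k} (-1)^(s-i) C_k^i C_{p-k}^{s-i} C_{q-k}^{s-i}."""
    total = 0
    for i in range(k + 1):
        sign = -1 if (s - i) % 2 else 1
        total += sign * binom(k, i) * binom(p - k, s - i) * binom(q - k, s - i)
    return total


if __name__ == "__main__":
    p, q = 2, 3
    for s in range(p + 1):
        print(s, [eigenvalue(k, s, p, q) for k in range(p + 1)])
```

For fixed $p \le q$ the values $a_1^k(p,q)$, $0 \le k \le p$, come out strictly increasing in $k$, which is the simplicity of the roots used in the proof of Lemma 3.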

Theorem 2 (Criterion for the image of $T_r$ in the case $p \le q$). Let $F$ be in $S_{p,q}$ with $p \le q$. Then $F$ is coming from some $G$ in $S_{p-k,q+k}$ by applying $T_r^k$ on $G$, if and only if $\Delta^{p-k+1}F = 0$.

Theorem 3 (Solutions of $T_l^s = 0$). Let $0 < p \le q$. The kernel of $T_l^s$ on $S_{p,q}$ is equal to the image of $\sigma^{p-s+1}\Delta^{p-s+1}$.

Theorem 4. Let $F$ be in $S_{n,n+1}$ for $n \ge 0$ and $s \ge 0$. $F$ verifies:

$$T_l^s\bigl((s+1)F - (-1)^sA_nF\bigr) = 0,$$

if and only if $T_l^{s+1}F = 0$.

Corollary 1 (Alternative form for the Fundamental Identity). Let $F$ be in $S_{n,n+1}$ for $n \ge 0$. Then $F$ verifies the Fundamental Identity substitution equation, i.e.:

$$F(x_1,\dots,x_n;y_1,\dots,y_{n+1}) - \sum_{1\le i\le n+1} F(y_1,\dots,y_{i-1},y_{i+1},\dots,y_{n+1};x_1,\dots,x_{i-1},y_i,x_i,\dots,x_n) = 0,$$

if and only if $T_lF = 0$.

2.4. Some lemmas.

Lemma 2 (Polynomial annihilator for $A_1$). For $0 \le p \le q$, the polynomial:

$$H_{p,q}(X) = \prod_{0\le k\le p}\bigl(X - (k - (p-k)(q-k))\bigr),$$

verifies $H_{p,q}(A_1) = 0$ on $S_{p,q}$.

Proof. For $0 \le l \le p+1$ and $p \le q$, we define $R^{(l)}_{p,q}\colon S_{p,q} \to S_{p,q}$ by $R^{(l)} = \sigma^l\Delta^l$ (with shortened notations). Note that, since $\Delta^s = 0$ on $S_{p,q}$ for $s \ge p+1$, we have $R^{(p+1)}_{p,q} = 0$. Using (2), i.e. $\sigma\Delta = A_1 + pqI$ (on $S_{p,q}$), and by noting that $\Delta^l$ maps $S_{p,q}$ into $S_{p-l,q-l}$, we have:

$$R^{(l+1)} = \sigma^{l+1}\Delta^{l+1} = \sigma^l(\sigma\Delta)\Delta^l = \sigma^lA_1\Delta^l + (p-l)(q-l)\sigma^l\Delta^l = \sigma^l[A_1,\Delta^l] + R^{(l)}A_1 + (p-l)(q-l)R^{(l)}.$$

Now (1) with $s = 0$ gives $[\Delta, A_1] = \Delta$ and implies that $[A_1,\Delta^l] = -l\Delta^l$. Hence we have the following relation:

$$R^{(l+1)} = R^{(l)}\bigl(A_1 + ((p-l)(q-l) - l)I\bigr),$$

which by induction gives:

$$R^{(l+1)} = \prod_{0\le k\le l}\bigl(A_1 + ((p-k)(q-k) - k)I\bigr).$$

For $l = p$, we have $R^{(p+1)}_{p,q} = H_{p,q}(A_1)$, hence $H_{p,q}(A_1) = 0$.

Lemma 3 (Spectral decomposition of $A_1$). If $0 \le p \le q$, $A_1$ is diagonalizable on $S_{p,q}$ and its eigenvalues are $a_1^k(p,q) = -(p-k)(q-k) + k$, $0 \le k \le p$.

Proof. Since $a_1^{k+1}(p,q) - a_1^k(p,q) = p + q - 2k > 0$ for $0 \le k < p$, the $p+1$ roots of $H_{p,q}(X)$ are simple. Then

$$S_{p,q} = \ker H_{p,q}(A_1) = \bigoplus_{0\le i\le p} V_i,$$

where $V_i = \ker(A_1 - a_1^i(p,q)I)$.

Lemma 4. On $S_{p,q}$, we have:
a) If $0 \le p < q$, then $T_rF = 0$ implies $F = 0$.
b) If $0 \le p \le q$, then $\ker T_l = \ker(A_1 - pI)$.
c) d) If 0 p (p being the highest eigenvalue of $A_1$, cf. Lemma 3), we have that $T_lT_r$ and $T_r$ are injective, hence statement a). For $0 \le p \le q$, $T_r$ is injective on $S_{p-1,q+1}$ by statement a), and since $T_rT_l = -A_1 + pI$ on $S_{p,q}$, statement b) follows. Moreover, on each eigenspace of $A_1$ except $V_p$, the property $T_rT_l = -A_1 + pI$ becomes $T_rT_l = (p-\lambda)I$ with $p - \lambda \ne 0$. This shows that statements c) and d) can be deduced from it by iteration.

Lemma 5. If $F$ is an eigenvector of $A_1$ with associated eigenvalue $a_1^k(p,q)$ on $S_{p,q}$ for $0 \le p < q$, then $T_rF$ is an eigenvector of $A_1$ in $S_{p+1,q-1}$ with associated eigenvalue $a_1^k(p+1,q-1)$.

Proof. This result comes from the injectivity of $T_r$, from property (5), $T_rA_1 = A_1T_r + (q-p-1)T_r$, and from the identity: $a_1^k(p,q) + 1 + p - q = a_1^k(p+1,q-1)$.

Lemma 6 (Eigenspaces of $A_1$). The spectral decomposition of $A_1$ is given by $(V_k, a_1^k)_{0\le k\le p}$ with

$$V_k = T_r^{\,p-k}(\ker T_l(k, q+p-k)) \quad\text{and}\quad a_1^k = -(p-k)(q-k) + k.$$

Proof. This lemma is a consequence of Lemmas 4 and 5.

Lemma 7. Let $V_k(p,q)$, $0 \le k \le p \le q$, be the subspace of $S_{p,q}$ defined by: $V_k(p,q) = T_r^{\,p-k}(\ker T_l(k, q+p-k))$. Then for $1 \le i \le k+1$ and $\forall F \in V_k$, we have $\Delta^iF = 0$. For $k+1 < i \le q+1$ and $\forall F \in V_k$, $\Delta^iF = 0$ implies $F = 0$.

Proof. This lemma is proved with the help of Lemma 6 and of the explicit form of $R^{(s)} = \sigma^s\Delta^s$ introduced in Lemma 2.

2.5. Proofs of the Theorems. Theorem 1 for $s = 1$ is proved by Lemma 6. Property (7) proves that the subspaces $V_k$ are eigenspaces of $A_s$. To conclude, property (1) and Lemma 7 give for every $F$ in $V_k$ the relation: $\Delta A_{s+1}F = A_{s+1}(\Delta F) + A_s(\Delta F)$, with $\Delta F \ne 0$ if $k > 0$. But $\Delta F$ is in $V_{k-1}(p-1,q-1)$ and then the $a_s^k(p,q)$'s verify:

$$a_{s+1}^k(p,q) = a_{s+1}^{k-1}(p-1,q-1) + a_s^{k-1}(p-1,q-1).$$

For $k = 0$, $V_0$ is the space of totally skew-symmetric functions and $a_s^0(p,q) = (-1)^sC_p^sC_q^s$. By induction on $k$, we obtain now:

$$a_s^k(p,q) = \sum_{0\le i\le k} (-1)^{s-i}\, C_k^i\, C_{p-k}^{s-i}\, C_{q-k}^{s-i}.$$

Theorems 2 and 3 are proved by Lemma 7 and by the decomposition of $S_{p,q}$. Theorem 4 is a direct consequence of the decomposition of $A_n$ on $S_{n,n+1}$. We have:

$$a_n^k(n,n+1) = \sum_{0\le i\le k} (-1)^{n-i}\, C_k^i\, C_{n-k}^{n-i}\, C_{n+1-k}^{n-i} = (-1)^{n-k}\, C_k^k\, C_{n-k}^{n-k}\, C_{n+1-k}^{n-k} = (-1)^{n-k}(n+1-k)$$

(for fixed $n$, note that the values $a_n^k$ are distinct). On each $V_k$, the equation

$$T_l^s\bigl((s+1)F - (-1)^sA_nF\bigr) = 0,$$

becomes

$$T_l^s\bigl((s+1) - (-1)^s(-1)^{n-k}(n+1-k)\bigr)F = 0.$$

Then either $T_l^sF = 0$ or $k = n - s$. Then $F \in T_r^s(\ker T_l)$ and $T_l^{s+1}F = 0$. Now for $s = 0$ we get the Fundamental Identity, which can be written as $A_n = \mathrm{Id}$, and then its solution is exactly $V_n = \ker T_l$.

Remark 2 (More general solutions to $T_l = 0$ and $T_l^s = 0$). Denote by $S_{x_1,\dots,x_r}$, shortened to $S_rx$, the skew-symmetrization of any function in the variables $(x_1,\dots,x_r)$. The solutions of $T_l^{k+1} = 0$ are then described by $\sigma^{p-k}\Delta^{p-k}F$ with $F$ in $S_{p,q}$, in fact by $(S_px)(S_qy)\Delta^{p-k}F$ up to a constant. We shall prove now that for any function (without symmetry conditions), the same process gives a solution. Like in any substitution problem of $p+q$ variables, it is sufficient to demonstrate it over a vector space of dimension $p+q$ with $F = dx_1 \otimes \cdots \otimes dx_p \otimes dy_1 \otimes \cdots \otimes dy_q$, constructed with any basis. The characteristic of the field is still zero. The result of $\Delta^s$ is the sum over the $2^s$-group generated by $(dx_1,dy_1),\dots,(dx_s,dy_s)$. The action of $(S_px)(S_qy)$ on each $da_1 \otimes \cdots \otimes da_p \otimes db_1 \otimes \cdots \otimes db_q$ gives $da_1 \wedge \cdots \wedge da_p \otimes db_1 \wedge \cdots \wedge db_q$ up to a constant, which is equal by duality to $\det(b_1,\dots,b_q,\cdots).\det(a_1,\dots,a_p,\cdots)$. We shall

use the interior product to write $\det(a_1,\dots,a_p,\cdots) = I_{a_1,\dots,a_p}\det$. The result of all these operations reads then:

$$(\Delta^sG)((x_1,y_1;x_2,y_2;\cdots;x_s,y_s);y_{s+1},\dots,y_q,\dots;x_{s+1},\cdots,x_p)$$

with $G = \det.\det$ in $S_{p+q,p+q}$. But $T_l^{\,p-s+1}\, I_{y_{s+1},\dots,y_q}\det.I_{x_{s+1},\dots,x_p}\det = 0$, because the dimension is $p+q$ and of the linearity, and since $\Delta^s$ commutes with $T_l$.

Remark 3 (Extended associativity). Here we would like to briefly stress how to apply the preceding results to find some analogue of associativity conditions for $n$-ary operations. Consider an $n$-bracket obtained by skew-symmetrization of some given $n$-structure $A(x_1,\dots,x_n)$, without any symmetry properties. One may consider what kind of conditions should be imposed on $A$ for the associated $n$-bracket to satisfy some given identity (e.g. $T_r^k[x_1,\dots,x_{n-1},[y_1,\dots,y_n]] = 0$). For $n = 2$, it is well known that an associative product $A(x,A(y,z)) = A(A(x,y),z)$ gives by skew-symmetrization a Lie bracket satisfying the Jacobi identity ($T_r[x,[y,z]] = 0$). Notice that there exist many other relations leading to the Jacobi identity. The result of the composition of $n$-skew-symmetric brackets can be written:

$$S_{n-1}.S_n\Bigl(A(A(y_1,\dots,y_n),x_1,\dots,x_{n-1}) - \sum_{1\le i\le n} A(x_i,x_1,\dots,\overset{i}{A(y_1,\dots,y_n)},\dots,x_{n-1})\Bigr).$$

Then the $n$-bracket will verify $T_r^k[x_1,\dots,x_{n-1},[y_1,\dots,y_n]] = 0$ if the following:

$$A(A(y_1,\dots,y_n),x_1,\dots,x_{n-1}) - \sum_{1\le i\le n} A(x_i,x_1,\dots,\overset{i}{A(y_1,\dots,y_n)},\dots,x_{n-1}),$$

contains at least $(n-k)$-invariance in $x_i \leftrightarrow y_j$ up to the action of $S_{n-1}.S_n$ or, in a more precise way, it can be decomposed into several parts having such invariance (cf. Remark 2). For example, a 4-product verifying:

$$A(A(y_1,y_2,y_3,y_4),x_1,x_2,x_3) = A(x_2,A(x_1,y_2,y_3,y_4),y_1,x_3),$$
$$A(A(y_1,y_2,y_3,y_4),x_1,x_2,x_3) = A(x_3,x_1,A(y_1,x_2,y_3,y_4),y_2),$$
$$A(A(y_1,y_2,y_3,y_4),x_1,x_2,x_3) = A(x_3,y_2,y_1,A(x_1,x_2,y_3,y_4))$$

gives rise to a 4-bracket satisfying $T_l^2[x_1,x_2,x_3,[y_1,y_2,y_3,y_4]] = 0$. Moreover, if the 4-bracket admits a nondegenerate invariant bilinear form $\langle\cdot,\cdot\rangle$ expressed in some orthogonal basis $\{X_i\}$ by $\langle X_i,X_j\rangle = a_i\delta_{ij}$ (for the finite dimensional case), then $\langle[x_1,x_2,x_3,x_4],x_5\rangle$ is a 5-skew-symmetric form $\omega$ verifying

$$T_l^2\sum_i a_i\,\omega(X_i,x_1,x_2,x_3,x_4)\,\omega(X_i,y_1,y_2,y_3,y_4) = 0.$$

It will in fact verify $T_l = 0$ and $T_l[x_1,x_2,x_3,[y_1,y_2,y_3,y_4]] = 0$ as well (see the next remark).
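The $n = 2$ case recalled in Remark 3 is easy to illustrate numerically. The sketch below is an illustration under stated assumptions: it takes integer matrices with the matrix product as the associative operation $A$, forms the bracket by skew-symmetrization, and evaluates the cyclic sum of the Jacobi identity. The function names are hypothetical.

```python
import random


def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[s + t for s, t in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[s - t for s, t in zip(ra, rb)] for ra, rb in zip(a, b)]


def bracket(x, y):
    """Skew-symmetrization of the associative product: [x, y] = xy - yx."""
    return sub(matmul(x, y), matmul(y, x))


def jacobiator(x, y, z):
    """[x,[y,z]] + [y,[z,x]] + [z,[x,y]]; zero when the Jacobi identity holds."""
    return add(add(bracket(x, bracket(y, z)), bracket(y, bracket(z, x))),
               bracket(z, bracket(x, y)))


def random_matrix(n, rng):
    return [[rng.randint(-4, 4) for _ in range(n)] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    x, y, z = (random_matrix(3, rng) for _ in range(3))
    print(jacobiator(x, y, z))  # the zero matrix
```

Associativity is sufficient here but, as Remark 3 notes, it is not the only relation on $A$ that leads to the Jacobi identity.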

Remark 4. The computation of the eigenvalues of $A_n$ for $S_{n,n}$ yields $a_n^k = (-1)^{n-k}$. Now the decomposition: $\ker T_l^2 = V_n \oplus V_{n-1}$, provides another proof of Weitzenböck's trick [13]: $T_l^2F = 0$ and $F = \hat F$ if and only if $T_lF = 0$. In general, if $F = \hat F$, $T_l^{2k}F = 0$ implies $T_l^{2k-1}F = 0$. The original statement of Weitzenböck's trick involves a product of an $n$-vector $P$ with itself and provides a criterion for decomposability: $P$ is decomposable if and only if $I_\beta P \wedge P = 0$ for any $(n-2)$-form $\beta$ ($I_\beta$ denotes the interior product by $\beta$). This is related to Nambu tensors which are known to be decomposable for $n \ge 3$ [8]. It is a consequence of some quadratic equations on the Nambu tensor imposed by the Fundamental Identity.

3. Applications to n-gebras

3.1. Extended action of a multilinear map on the exterior algebras and graded bracket.

3.1.1. Extended action. Let $V$ be a vector space of arbitrary dimension over a field of characteristic 0. Let $T(V)$ (resp. $\bigwedge(V)$) be the tensor (resp. exterior) algebra of $V$. The usual injection of $\bigwedge(V)$ into $T(V)$ will be denoted by:

$$n!\; x_1 \wedge \cdots \wedge x_n = \sum_{\sigma\in S_n} \varepsilon(\sigma)\, x_{\sigma(1)} \otimes \cdots \otimes x_{\sigma(n)},$$

where $S_n$ denotes the symmetric group and $\varepsilon(\sigma)$ is the sign of $\sigma$. Let $A$ be a skew-symmetric $n$-multilinear map on $V$ taking values in $V$. We define an action of $A$, denoted by $[A]$, on $T(V)$ taking values in $\bigwedge(V)$, by the following formula for $m \ge n$: $[A](x_1 \otimes \cdots \otimes x_m) = A(x_1,\dots,x_n) \wedge x_{n+1} \wedge \cdots \wedge x_m$, and $[A](x_1 \otimes \cdots \otimes x_m) = 0$ for $m < n$. By using the injection of $\bigwedge(V)$ into $T(V)$, the restriction of $[A]$ to $\bigwedge(V)$ can be written for $m \ge n$ as:

$$(m-n)!\; n!\; [A](x_1 \wedge \cdots \wedge x_m) = \sum_{\sigma\in S_m} \varepsilon(\sigma)\, A(x_{\sigma(1)},\dots,x_{\sigma(n)}) \wedge x_{\sigma(n+1)} \wedge \cdots \wedge x_{\sigma(m)},$$

(for $m < n$, it is 0). Let $n - 1 = a$ be the degree of such an operator $A$ and of its action. Then $[A]$ maps $\bigwedge^m(V)$ into $\bigwedge^{m-a}(V)$. Moreover the actions of these operators are completely determined by induction on their degree and on the lengths of tensorial spaces by the Cartan formula: $[I_xA] = [A](x \wedge \cdots) - (-1)^a\, x \wedge [A]$, with initial conditions $[A] = 0$ if the lengths of the tensors are less than the length of its natural argument, $[A] = A$ at the first admissible step and $[x] = x \wedge \cdots$ for all vectors $x \in V$.

3.1.2. Graded bracket. Let $A$ and $B$ be two maps on $\bigwedge^{a+1}V$ and $\bigwedge^{b+1}V$ taking values in $V$. The operator $A \wedge B$ taking values in $\bigwedge^2V$ verifies: $A \wedge B = (-1)^{ab}\, B \wedge A$, on $\bigwedge^{a+b+2}V$. Then for every $A$, $B$, there exists some $F$ on $\bigwedge^{a+b+1}V$, also of degree $a+b$, such that: $[A][B] - (-1)^{ab}[B][A] = [F]$. $F$ will be called the graded bracket of $A$ and $B$ and will be denoted by $[A,B]$. Its expression is:

$$[A,B] = \frac{S_{a+b+1}}{(a+1)!\,(b+1)!}\Bigl((a+1)\,A \circ B - (-1)^{ab}(b+1)\,B \circ A\Bigr),$$

where $A \circ B(x_1,\dots,x_{a+b+1}) = A(B(x_1,\dots,x_{b+1}),x_{b+2},\dots,x_{a+b+1})$. This bracket verifies the following two properties:

$$[A,B] = -(-1)^{ab}[B,A],$$
$$[A,[B,C]] = [[A,B],C] + (-1)^{ab}[B,[A,C]],$$

which can be easily proved through the extended actions on $\bigwedge(V)$ and their associative composition.

3.2. k-Nambu-Lie structures. The purpose of this section is to translate in terms of the graded bracket some properties of $S_{n,n+1}$-functions $F$ of the form $F(x_1,\dots,x_n;y_1,\dots,y_{n+1}) = A(x_1,\dots,x_n,B(y_1,\dots,y_{n+1}))$.

Theorem 5. Let $A$, $B$ be $(n+1)$-multilinear skew-symmetric maps from $V$ into $V$ and $k$ be an even integer. The $S_{n,n+1}$-function $F = A(\cdots,B(\cdots)) + B(\cdots,A(\cdots))$ verifies $T_l^{k+1}F = 0$ if and only if, for all $x_1,\dots,x_{n-k}$:

$$[I_{x_1,\dots,x_{n-k}}A, B] + [I_{x_1,\dots,x_{n-k}}B, A] = 0.$$

Proof. For simplicity, we shall only write the first part of the symmetric expressions in $A$, $B$ and recall this fact when it will be important. From the definition of $A_n$, we have for some $F \in S_{n,n+1}$:

$$A_nF(x_1,\dots,x_n;y_1,\dots,y_{n+1}) = F(y_1,\dots,y_n;x_1,\dots,x_n,y_{n+1}) + \sum_{1\le i\le n} F(y_1,\dots,y_{i-1},y_{i+1},\dots,y_{n+1};x_1,\dots,x_{i-1},y_i,x_i,\dots,x_n),$$

and from the definition of $T_r$:

$$T_rF(y_1,\dots,y_{n+1};x_1,\dots,x_n) = F(y_1,\dots,y_n;y_{n+1},x_1,\dots,x_n) - \sum_{1\le i\le n} F(y_1,\dots,\overset{i}{y_{n+1}},\dots,y_n;y_i,x_1,\dots,x_n).$$

By comparing the right-hand sides of the preceding equations, we deduce that $A_nF = (-1)^n(T_rF)\hat{\ } = (-1)^nT_l\hat F$ ($\hat F$ was defined at the end of Def. 2). Now we will apply Theorem 4 to $F = A(\cdots B(\cdots))$, i.e. $T_l^{k+1}F = 0$ if and only if the condition

$$(1)\qquad T_l^k\bigl((k+1)F - (-1)^kA_nF\bigr) = 0,$$

is verified. Consider $G = \hat F$, that is: $G(x_1,\dots,x_{n+1};y_1,\dots,y_n) = A(y_1,\dots,y_n,B(x_1,\dots,x_{n+1}))$, and since $A_nF = (-1)^nT_lG$, condition (1) becomes:

$$(k+1)T_l^kF - (-1)^{n+k}T_l^{k+1}G = 0.$$

The action of $T_l^k$ on some $H$ in $S_{p,q}$ can be expressed in terms of a partial skew-symmetrization of the arguments of $H$. One can verify that:

$$(T_l^kH)(x_1,\dots,x_{p-k};y_1,\dots,y_{q+k}) = \frac{1}{q!}\sum_{\sigma\in S_{q+k}} \varepsilon(\sigma)\, H(x_1,\dots,x_{p-k},y_{\sigma(1)},\dots,y_{\sigma(k)};y_{\sigma(k+1)},\dots,y_{\sigma(q+k)}).$$

Using the notation introduced earlier for partial skew-symmetrization, the condition (1) reads:

$$\frac{(k+1)\,S_{n+1+k}a}{(n+1)!}\,A(x_1,\dots,x_{n-k},a_1,\dots,B(a_{k+1},\dots,a_{n+k+1})) - \frac{(-1)^{n+k}\,S_{n+1+k}a}{n!}\,A(a_{k+2},\dots,a_{n+k+2},B(x_1,\dots,x_{n-k},a_1,\dots,a_{k+1})) = 0.$$

By performing some permutations of the variables $a$'s, we find:

$$\frac{S_{n+1+k}a}{(n+1)!}\Bigl((-1)^{k(n+k)+k}(k+1)\,A(x_1,\dots,x_{n-k},B(a_1,\dots),\dots,a_{n+k+1}) - (-1)^{n+k}(-1)^n(n+1)\,A(B(x_1,\dots,x_{n-k},a_1,\dots),\dots,a_{n+k+1})\Bigr) = 0.$$

For the preceding equation to be identified with the graded bracket, we see that $k$ must be even. Then after symmetrization in $A$, $B$, we get: $(-1)^{nk}(k+1)!\,[I_{x_1,\dots,x_{n-k}}A, B] = 0$.

Definition 5. Let $A$ be an $(n+1)$-multilinear skew-symmetric map from $V$ to $V$. $A$ is called a k-Nambu-Lie bracket of order $n$ if and only if

$$S_{n+1+k+1}.A(x_1,\dots,x_{n-k},\cdots A(\cdots)) = 0 \qquad \forall x_1,\dots,x_{n-k} \in V.$$

This definition is equivalent to the following:

Structure condition. Let $k$ be even. $A$ is a k-Nambu-Lie bracket of order $n+1$ if and only if: $[I_{x_1,\dots,x_{n-k}}A, A] = 0$.

3.3. Deformation and cohomology for $n+1$ even.

Theorem 6. $\forall h, a, b \in \mathbb{N}$ and for any $(h+a+1)$-, $(h+b+1)$-, $(h+c+1)$-multilinear skew-symmetric maps $A$, $B$, $C$ and an $(h+f+1)$-multilinear $F$ in $S_{h,f+1}$, the brackets:

$$\{B,C\}_{x_1,\dots,x_h} = (-1)^{h(h+b)}\,\frac{(h+c)!}{(c+1)!}\,[I_{x_1,\dots,x_h}B, C] - (-1)^{(h+b)(h+c)}(-1)^{h(h+c)}\,\frac{(h+b)!}{(b+1)!}\,[I_{x_1,\dots,x_h}C, B], \qquad (2)$$

and

$$\langle A,F\rangle_{x_1,\dots,x_h} = (-1)^{h(h+a)}\,\frac{1}{(f+2)!}\,[I_{x_1,\dots,x_h}A, S_{h+f+1}.F] - (-1)^{(h+f)(h+a)}(-1)^{h(h+f)}\,\frac{(h+a)!}{(a+1)!}\,[F(x_1,\dots,x_h;\cdots), A],$$

verify a kind of "Jacobi identity" given by:

$$\langle A,\{B,C\}\rangle = (-1)^{(h+a)(h+b)+h(a+b)}\,\langle B,\{A,C\}\rangle - (-1)^{(h+a)(h+c)+h(a+c)}\,\langle C,\{A,B\}\rangle. \qquad (3)$$

Proof. This result follows from the property:

$$S_{h+a+b+1}.[I_{x_1,\dots,x_h}A, B] = (-1)^{h(h+b)}\,\frac{(h+a+b+1)!\,(h+a)!}{(a+1)!}\Bigl((a+1)[A,B] - (-1)^{(h+a)(h+b)}\,h\,\frac{1}{(h+b)!\,(h+a+1)!}\,S_{h+a+b+1}.B(A(\cdots)\cdots)\Bigr).$$

The bracket $\{\cdot,\cdot\}$ is defined by (2) so that

$$S_{h+a+b+1}.\{A,B\} = (h+a+b+2)!\,(-1)^{h(a+b)}\Bigl[\frac{(h+a)!\,A}{(a+1)!}, \frac{(h+b)!\,B}{(b+1)!}\Bigr],$$

and then the bracket $\langle\cdot,\cdot\rangle$ satisfies

$$\langle A,\{B,C\}\rangle = (-1)^{h(b+c)}\Bigl(\Bigl[(-1)^{h(h+a)}I_{x_1,\dots,x_h}A, \Bigl[\frac{(h+b)!\,B}{(b+1)!}, \frac{(h+c)!\,C}{(c+1)!}\Bigr]\Bigr] - (-1)^{(h+a)(b+c)}\Bigl[\{B,C\}, \frac{(h+a)!\,A}{(a+1)!}\Bigr]\Bigr).$$

Now straightforward calculations using graded bracket properties prove the theorem.

3.3.1. Deformation. Let $A_0$ be a k-Nambu-Lie bracket of even order $n$. Let $A_t$ be a formal deformation of $A_0$, i.e. $A_t = A_0 + tA_1 + t^2A_2 + \cdots$. Since $n$ is even, the structure condition for $A_t$ can be written $\{A_t,A_t\} = 0$ with $h = n-k$. Then the three following operators:

$d_1 = [A_0,\cdot]$ on $L(V)$,
$d_2 = \{A_0,\cdot\}$ on $\bigwedge^nV^* \otimes V$,
$d_3 = \langle A_0,\cdot\rangle$ on $\bigwedge^{n-k}V^* \otimes \bigwedge^nV^* \otimes V$,

verify $d_2d_1 = 0$ and $d_3d_1 = 0$ because $[A_0,A_0] = 0$ and of (3), and the fact that $n$ is even. Hence it defines two cohomology spaces which classify the nontrivial infinitesimal deformations and the obstructions to extend it (cf. [9, 10]).

Remark 5. The equation $S_{n+1+k+1}.A(x_1,\dots,x_{n-k},\cdots A(\cdots)) = 0$ depends only on the length of the symmetric group, that is, it is equivalent to $S_{n+1+k+1}.A(x_1,\dots,x_{n-k-p},\dots,A(x_{n-k-p+1},\dots,x_{n-k}),\dots) = 0$, for $p \le n-k$. This can easily be seen by considering the action of $S_{n+3+k}$ on $F = A(\cdots A(\cdots))$ which is 0. This can be decomposed in

$$(k+2)\,S_{n+1+k+1}.A(x_1,\dots,x_{n-k},\cdots A(\cdots)) \pm (n+1)\,S_{n+1+k+1}.A(x_1,\dots,x_{n-k-1},\cdots A(x_{n-k}\cdots)).$$

The second term vanishes and by iteration one gets the statement above.

3.4. Generalized adjoint properties for 0-Nambu-Lie Brackets.

Theorem 7. Let $A$ be a 0-Nambu-Lie bracket. Then for all $x_1,\dots,x_k$ and $y_1,\dots,y_l$, we have:

$$[I_{x_1,\dots,x_k}A, I_{y_1,\dots,y_l}A] = (-1)^{n(n-k-1)}\, I_{[I_{x_1,\dots,x_k}A](y_1\wedge\cdots\wedge y_l)}A.$$

Proof. By definition the theorem is true for $k = n$ and $l = 0$. By Remark 5, one proves the theorem for $l = 0$ and all $k$, by using the definition of the graded bracket. For $k = n$ and all $l$ we also have: $[A(x_1,\dots,x_n), I_{y_1,\dots,y_l}A] = (-1)^n\, I_{A(x_1,\dots,x_n),y_1,\dots,y_l}A$. These initial conditions and the identity:

$$[[I_{x_1,\dots,x_k}A, I_{y_1,\dots,y_l}A], z] = (-1)^{n-l-1}[I_{x_1,\dots,x_k,z}A, I_{y_1,\dots,y_l}A] + [I_{x_1,\dots,x_k}A, I_{y_1,\dots,y_l,z}A],$$

show by induction that there exists some action $\varphi$ of $I_{x_1,\dots,x_k}A$ such that:


[Ix1 ,...,xk A, Iy1 ,...,yl A] = Iφ(y1 ∧···∧yl ) A,

with the following recursive equation and by (8):

φ(y1 ∧ · · · ∧ yl ) ∧ z = (−1)^(n−l−1) Iz φ(y1 ∧ · · · ∧ yl ) + φ(y1 ∧ · · · ∧ yl ∧ z),

better written as:

(−1)^n Iz φ(y1 ∧ · · · ∧ yl ) = φ(z ∧ y1 ∧ · · · ∧ yl ) − (−1)^(n−k−1) z ∧ φ(y1 ∧ · · · ∧ yl )

which characterizes the action of (−1)^(n(n−k−1)) Ix1 ,...,xk A.

4. The C^∞ (M)-Case

It is a well-known fact that any n-multilinear skew-symmetric bracket {·, ·} on the space of smooth functions A(M) on a manifold M, verifying the Leibniz property for the usual pointwise product:

{f1 , . . . , fn−1 , g.h} = {f1 , . . . , fn−1 , g}.h + g.{f1 , . . . , fn−1 , h},

is given by an n-skew-symmetric tensor ω on ⋀^n T M such that

{f1 , . . . , fn } = ω(df1 , . . . , dfn ).

Then ω has to verify other required properties. For the n-bracket to define some k-Lien-gebra, it has to satisfy both algebraic and differential equations. As far as the algebraic part is concerned, notice that Tlk {f1 , . . . , fn−1 , {g1 , . . . , gn }} does not verify the Leibniz property for f1 , . . . , fn−k−1 . By replacing, for example, f1 by a product ab in Tlk {f1 , . . . , fn−1 , {g1 , . . . , gn }} = 0, we obtain an algebraic constraint. A straightforward computation yields:

1Tlk ω · ω = 0.

One has to distinguish between four cases:
a) n is even and k = n − 1, then the preceding relation is always satisfied;
b) n is even and if k < n/2, then one can prove that the relation is equivalent to Tlk ω · ω = 0;
c) n is even and k ≥ n/2, this case is open;
d) n is odd and the relation is equivalent to Tlk ω · ω = 0.

The case d) can be easily proved because of ω ∧ ω = 0 and then Tlk ω · ω cannot be totally skew-symmetric for any k. The case b) is not straightforward. If Tlk ω·ω is a 2n-skew-symmetric form, it is equal to zero if and only if it vanishes on all 2n-dimensional subspaces. By 1Tlk ω · ω = 0, the k + 1-vector

Sn+k ω(a, x1 , . . . , xn−1 )a ⊗ xn ⊗ · · · ⊗ xn+k+1 ,

constructed with any xi and a, annihilates the form ω and, if k + 1 ≤ n − k, also the form (Iy1 ,...,yn−k ω) ∧ ω that is Tlk ω · ω. But a 2n-form does not admit such an annihilator on a 2n-dimensional subspace. One can use the Weitzenböck trick to pass from an even integer k to k − 1. The problem for k = 1 and n > 2 is solved: ω has to be decomposable [8]. For the other cases, one can consider:


Question 1. Does there exist some even form ω verifying 1Tlk ω · ω = 0, without satisfying Tlk ω · ω = 0, for k ≥ n/2?

Question 2. Is the condition Tlk ω · ω = 0 for k odd verified if and only if ω admits a decomposition in at least n − k + 1 odd factors? Is ω of the form: ω = η1 ∧ · · · ∧ ηn−k+1 ∧ α, where the ηi ’s are odd and α is an arbitrary form?

4.1. Leibniz rule and triviality of star-product. In the framework of deformation theory in the sense of Gerstenhaber [10], we shall now consider the following question: Is it possible to construct a deformation of an n-skew-symmetric bracket conserving the Leibniz property for a deformed associative product of the product of functions? The answer is given for n ≥ 3 by the following theorem (see [2, 3] for general references on star-products).

Theorem 8. Let n ≥ 3 and [· · ·]t be a skew-symmetric deformation of a nonzero n-bracket {· · ·} and ∗ an associative deformation of the pointwise product (a star-product). If [· · ·]t verifies the Leibniz property for ∗, then ∗ is commutative and equivalent to the pointwise product.

Proof. Let P be the first nonzero term of a ∗ b − b ∗ a = t^k P(a, b) + · · · . If ∗ is associative then P is a Poisson bracket given by a 2-vector α. The Leibniz condition

[f1 , . . . fn−1 , a ∗ b]t = [f1 , . . . fn−1 , a]t ∗ b + a ∗ [f1 , . . . fn−1 , b]t ,

becomes at the first step in t:

{f1 , . . . , fn−1 , g.h} = {f1 , . . . , fn−1 , g}.h + g.{f1 , . . . , fn−1 , h}.

Then there exists an n-vector ω (we limit ourselves to the open set where ω ≠ 0) such that:

{f1 , . . . , fn } = ω(df1 , . . . , dfn ).

If the bracket is a derivation for a star-product, it is also a derivation for the Lie bracket constructed by skew-symmetrization of the star-product. For the first term appearing in the expression of the deformation this gives

ω(df1 , . . . , dfn−1 , α(dg1 , dg2 )) = α(dg1 , ω(df1 , . . . , dfn−1 , dg2 )) + α(ω(df1 , . . . , dfn−1 , dg1 ), dg2 ).

Substituting fn−1 = a.b in the preceding identity, only the bilinear terms in da and db remain and we have after simplifications:


ω(df1 , . . . , dfn−2 , da, dg1 )α(dg2 , db)
−ω(df1 , . . . , dfn−2 , da, dg2 )α(dg1 , db)
+ω(df1 , . . . , dfn−2 , db, dg1 )α(dg2 , da)
−ω(df1 , . . . , dfn−2 , db, dg2 )α(dg1 , da) = 0.

We recognize the relation: 1Tr (α.ω) = 0. Since n ≥ 3, Tr is injective; then 1(α.ω) = 0 and (α.ω) is completely skew-symmetric. On any (n + 2)-dimensional subspace V0 there exists a vector α(x, y)y (x, y ∈ V0 arbitrary) in the kernel of ω and thus in the kernel of (α.ω). This implies (α.ω) = 0 on V0 , thus everywhere. Since ω is nonzero, α = 0 and thus ∗ is Abelian. It is known that the second Harrison cohomology space classifies Abelian deformations of the pointwise product and it is trivial for the algebra of polynomials ([9] and references therein). This result has been extended recently for the algebra of smooth functions [11], hence we conclude that ∗ is equivalent to the pointwise product.

Acknowledgement. The author is grateful to Moshé Flato, Daniel Sternheimer and especially Giuseppe Dito for helpful comments and critical readings of the manuscript. I am grateful to Mr. and Mrs. Thomazeau for their amicable support.

References

1. de Azcárraga, J. A., Perelomov, A. M., Pérez Bueno, J. C.: New generalized Poisson structures. J. Phys. A 29, L151–L157 (1996)
2. Bayen, F., Flato, M., Frønsdal, C., Lichnerowicz, A., Sternheimer, D.: Deformation Theory and Quantization: I. Deformations of Symplectic Structures. Ann. Phys. 111, 61–110 (1978)
3. Bayen, F., Flato, M., Frønsdal, C., Lichnerowicz, A., Sternheimer, D.: Deformation Theory and Quantization: II. Physical Applications. Ann. Phys. 111, 111–151 (1978)
4. Dito, G., Flato, M., Sternheimer, D., Takhtajan, L.: Deformation Quantization and Nambu Mechanics. Commun. Math. Phys. 183, 1–22 (1997)
5. Flato, M., Dito, G., Sternheimer, D.: Nambu Mechanics, N-ary Operations and their Quantization. In: Sternheimer, D., Rawnsley, J. and Gutt, S. (eds.), Deformation Theory and Symplectic Geometry. Mathematical Physics Studies 20, Dordrecht: Kluwer, 1997, pp. 43–66
6. Filippov, V. T.: n-Lie algebras. Siberian Math. J. 26 (6), 875–879 (1985)
7. Flato, M., Frønsdal, C.: Unpublished (1992)
8. Gautheron, Ph.: Some Remarks Concerning Nambu Mechanics. Lett. Math. Phys. 37, 103–116 (1996)
9. Gerstenhaber, M., Schack, S.: Algebraic Cohomology and Deformation Theory. In: Hazewinkel, M. and Gerstenhaber, M. (eds.), Deformation Theory of Algebras and Structures and Applications. NATO ASI Series C, 247. Dordrecht: Kluwer, 1988, pp. 11–264. [See also: Barr, M.: Harrison Homology, Hochschild Homology, and Triples. J. Alg. 8, 314–323 (1968)]
10. Gerstenhaber, M., Schack, S.: Algebras, Bialgebras, Quantum Groups, and Algebraic Deformations. In: Gerstenhaber, M. and Stasheff, J. (eds.), Deformation Theory and Quantum Groups with Applications to Mathematical Physics. Contemporary Mathematics 134, Providence (RI): American Mathematical Society, 1992, pp. 51–92
11. Pinczon, G.: On the equivalence between continuous and differential deformations. Lett. Math. Phys. 39, 143–156 (1997)


12. Takhtajan, L.: On Foundation of the Generalized Nambu Mechanics. Commun. Math. Phys. 160, 295–315 (1994)
13. Weitzenböck, R.: Invariantentheorie. Groningen: P. Noordhoff, 1923

Communicated by H. Araki

This article was processed by the author using the LaTeX style file cljour from Springer-Verlag.

Commun. Math. Phys. 195, 435 – 464 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Spinodal Decomposition for the Cahn–Hilliard Equation in Higher Dimensions. Part I: Probability and Wavelength Estimate

Stanislaus Maier-Paape, Thomas Wanner⋆

Institut für Mathematik, Universität Augsburg, D-86135 Augsburg, Germany

Received: 23 May 1997 / Accepted: 2 December 1997

Abstract: This paper is the first in a series of two papers addressing the phenomenon of spinodal decomposition for the Cahn–Hilliard equation

\[
u_t = -\Delta(\varepsilon^2 \Delta u + f(u)) \ \text{in } \Omega, \qquad
\frac{\partial u}{\partial \nu} = \frac{\partial \Delta u}{\partial \nu} = 0 \ \text{on } \partial\Omega,
\]

where $\Omega \subset \mathbb{R}^n$, $n \in \{1, 2, 3\}$, is a bounded domain with sufficiently smooth boundary, and $f$ is cubic-like, for example $f(u) = u - u^3$. We will present the main ideas of our approach and explain in what way our method differs from known results in one space dimension due to Grant [26]. Furthermore, we derive certain probability and wavelength estimates. The probability estimate is needed to understand why in a neighborhood of a homogeneous equilibrium $u_0 \equiv \mu$ of the Cahn–Hilliard equation, with mass $\mu$ in the spinodal region, a strongly unstable manifold has dominating effects. This is demonstrated for the linearized equation, but will be essential for the nonlinear setting in the second paper [37] as well. Moreover, we introduce the notion of a characteristic wavelength for the strongly unstable directions.

⋆ While this work was done the author was visiting the Center for Dynamical Systems and Nonlinear Studies, Georgia Institute of Technology, supported by the Deutsche Forschungsgemeinschaft, “Forschungsstipendium” Wa 960/3-1, Wa 960/3-2.

1. Introduction

The Cahn–Hilliard equation

\[
u_t = -\Delta(\varepsilon^2 \Delta u + f(u)) \ \text{in } \Omega, \qquad
\frac{\partial u}{\partial \nu} = \frac{\partial \Delta u}{\partial \nu} = 0 \ \text{on } \partial\Omega
\tag{1}
\]

was introduced in [12, 15] as a model for phase separation in binary alloys. Here $\Omega \subset \mathbb{R}^n$ is a suitable bounded domain for $n \in \{1, 2, 3\}$, and the function $-f$ is the derivative of a double-well potential, the standard example being the cubic function $f(u) = u - u^3$. Furthermore, $\varepsilon$ is a small positive parameter.

The underlying physical context can be described as follows. If a high-temperature homogeneous mixture of two metallic components is rapidly quenched to a certain lower temperature, a process of phase separation may set in which occurs in two stages. During the initial phase of spinodal decomposition the mixture quickly becomes inhomogeneous, forming a fine-grained structure which exhibits a characteristic length scale, cf. for example Cahn [13, 14], Elder, Desai [17], Elder, Rogers, Desai [18], and Hyde et al. [31]. After that, a slow coarsening process can be observed, during which the above-mentioned characteristic length scale grows. Whether or not this whole process of phase separation sets in at all depends crucially on the masses of the two metallic components in the alloy: The mass of one of the components has to be contained in the so-called spinodal interval.

In the Cahn–Hilliard equation (1) the variable $u$ represents the mass of one of the two components of the alloy subject to some affine transformation. Since (1) conserves the total mass $\int_\Omega u\,dx$, this also determines the mass of the other component. Furthermore, the spinodal interval is the (usually connected) set of all $v \in \mathbb{R}$ for which $f'(v) > 0$. Numerous numerical simulations have shown that (1) exhibits both spinodal decomposition and coarsening, as described above in the physical context. We refer the interested reader to Elliott, French [20], Elliott [19], and Bai et al. [8], just to name a few. As for mathematical results describing the physical phenomena mentioned above, the situation depends on the dimension of the domain $\Omega$.
For the one-dimensional case a fairly complete picture of the process can be given: The process of spinodal decomposition has been explained by Grant [26]; his results will be described in more detail below. The literature on the coarsening process is extensive; we refer the reader to Alikakos, Bates, Fusco [2], Bates, Xun [9, 10], Bronsard, Hilhorst [11], and Grant [27]. More general results, which also cover the one-dimensional Cahn–Hilliard equation, can be found in Sandstede [41] and Kalies, VanderVorst, Wanner [32]. Furthermore, the set of equilibria of (1) is completely known in the one-dimensional case, compare Grinfeld, Novick-Cohen [28] and the references therein.

For the physically more relevant two- and three-dimensional cases there are fewer results available, and the whole separation process is far from being understood completely. While there are results on the coarsening process (cf. Alikakos, Bates, Chen [1], Alikakos, Bronsard, Fusco [3], Alikakos, Fusco [4, 5, 6], Pego [38], Stoth [42], and the references therein) and the set of equilibria in two dimensions (cf. Fife et al. [22], Kielhöfer [34], and Maier-Paape, Wanner [36]), there are (to the best of our knowledge) no results on spinodal decomposition. As the title of our paper indicates, we will address this phenomenon below and in [37].

Before presenting our results, let us briefly describe Grant’s approach [26] to prove spinodal decomposition in one space dimension (cf. also Fig. 1). Assume that $\mu \in \mathbb{R}$ is contained in the spinodal interval, i.e., suppose that $f'(\mu) > 0$. Obviously, the constant function $u_0 \equiv \mu$ is an equilibrium for the Cahn–Hilliard equation. From an analysis of the linearization $A_\varepsilon$ of (1) at $u_0$ Grant deduced that for sufficiently small generic $\varepsilon > 0$ the operator $A_\varepsilon$ has a largest eigenvalue $\lambda_\varepsilon^+ > 0$ which is simple and whose corresponding one-dimensional eigenspace $X_\varepsilon^+$ is spanned by a periodic function with wavelength of order $O(\varepsilon)$.
Tangent to this eigenspace there exists a pseudo-unstable invariant manifold $W_\varepsilon^+$. All other full orbits which originate near $u_0$ and converge to $u_0$ as $t \to -\infty$ do this tangentially to the orthogonal complement $X_\varepsilon^-$ of $X_\varepsilon^+$. Qualitatively the situation near $u_0$ is depicted in the left half of Fig. 1. Since the Cahn–Hilliard equation is a gradient system, the ω-limit set of non-stationary points on the pseudo-unstable manifold $W_\varepsilon^+$ contains only equilibria of (1). Grant proved that these equilibria are periodic with wavelengths of order $O(\varepsilon)$ and that their $L^\infty$-norms are bounded away from 0 as $\varepsilon \to 0$. Hence they can be interpreted as spinodally decomposed states. Altogether, he gets the following result: If we choose a small neighborhood $V_\varepsilon$ of the ω-limit set related to $W_\varepsilon^+$ and a certain “probability” $p \in (0, 1)$, then there exists a neighborhood $U_\varepsilon$ of $u_0$ such that with probability at least $p$ a randomly chosen initial condition in $U_\varepsilon$ leads to an orbit of (1) which will intersect $V_\varepsilon$. This is due to the tangency statement above. (Observe that in Fig. 1 only part of the neighborhood $V_\varepsilon$ is depicted.) Thus, with high probability orbits originating in $U_\varepsilon$ will at some point be close to one of the equilibria in the ω-limit set related to $W_\varepsilon^+$, and thus also represent spinodally decomposed states.

Fig. 1. Grant’s approach to spinodal decomposition

Large parts of Grant’s argument remain valid in higher dimensions. For example, let us consider the case of the square $\Omega = (0, 1)^2 \subset \mathbb{R}^2$. Again, for generic small values of $\varepsilon > 0$ there is a unique largest eigenvalue of the linearization, with corresponding eigenfunction $\varphi_\varepsilon(x_1, x_2) = \cos(k_\varepsilon \pi x_1) \cos(\ell_\varepsilon \pi x_2)$. We can also find a one-dimensional invariant manifold $W_\varepsilon^+$ as above, and it can be shown that all equilibria in the ω-limit set of non-stationary points on $W_\varepsilon^+$ have at least the same symmetries as the eigenfunction $\varphi_\varepsilon$.
If we now assume that the remaining results of Grant’s approach remain valid, too, the following result is obtained: With very high probability an initial condition chosen randomly from a sufficiently small neighborhood $U_\varepsilon$ of the equilibrium $u_0 \equiv \mu$ will lead to an orbit which gets close to one of the equilibria in the ω-limit set related to $W_\varepsilon^+$. In particular, this would imply that at some point the orbit exhibits a nearly doubly periodic, very regular structure. Yet, neither physical experiments nor numerical calculations back these predictions. Rather than regular, doubly periodic structures one observes somewhat irregular, almost “snake-like” patterns, cf. for example Cahn [13, 14], Elder, Desai [17], Elder, Rogers, Desai [18], Hyde et al. [31], or Bai et al. [8]. These patterns seem to have only some common characteristic wavelength associated with them, which appears to be proportional to $\varepsilon$.

Although this could indicate that Grant’s results are not valid for higher-dimensional domains $\Omega$, we do believe that they are true, even if this might seem contradictory at first glance. The reason for this difference between what his results imply and what can be observed is of a quantitative nature. Let us go back to our above description of Grant’s approach. He proves that the boundary of the set of initial conditions leading to orbits which get close to the ω-limit set related to $W_\varepsilon^+$ satisfies a “power law”, as indicated in Fig. 1. More precisely, decompose initial conditions $u \in U_\varepsilon$ as $u =: u_0 + u^- + u^+$, where the components $u^\pm$ are uniquely determined by $u^\pm \in X_\varepsilon^\pm$. Then there are constants $C > 0$ and $\gamma_\varepsilon > 1$ such that if the initial condition $u \in U_\varepsilon$ satisfies

\[
\|u^+\| \ge C \, \|u^-\|^{\gamma_\varepsilon},
\tag{2}
\]

then the orbit originating in $u$ will intersect the neighborhood $V_\varepsilon$ of the ω-limit set related to $W_\varepsilon^+$. (Obviously, the inequality (2) is a quantitative way of describing the tangency statement from above, since $\gamma_\varepsilon > 1$.) Grant also shows that the exponent $\gamma_\varepsilon$ determining the power law depends on the size of the spectral gap between the largest and second-largest eigenvalues of the linearization; more precisely, this power is roughly the ratio of the largest and the second-largest eigenvalue. This ratio, however, converges to 1 as $\varepsilon \to 0$, thus we also have $\gamma_\varepsilon \to 1$ as $\varepsilon \to 0$. A straightforward formal calculation (cf. Fife [21]) then yields that in order for the largest eigenvalue to dominate the behavior of orbits starting nearby, the initial data has to be chosen from a neighborhood $U_\varepsilon$ whose size is exponentially small, i.e., of the order $O(\exp(-c/\varepsilon^2))$ for some $c > 0$. For reasonable values of the small parameter $\varepsilon$ it is therefore impossible to actually generate initial data within $U_\varepsilon$, either experimentally or numerically.

This discussion leads to the following conclusion: To capture the typical behavior of orbits originating near the homogeneous equilibrium $u_0 \equiv \mu$ of the Cahn–Hilliard equation (1), it does not suffice to consider only the eigendirection corresponding to the largest eigenvalue of the linearization of (1) at $u_0$. Rather a certain percentage of the largest eigenvalues has to be taken into account. Due to the close spacing of these eigenvalues, however, none of them will dominate alone. Every linear superposition of the corresponding eigenfunctions describes a possible direction for typical orbits starting near $u_0$, and none of them is really preferred. Thus one would expect that these superpositions exhibit the patterns of spinodally decomposed states as predicted by experiments and numerical simulations. This relation was already conjectured by Cahn [13], but had not been established rigorously.
In this paper and its sequel [37] we present a new approach for explaining spinodal decomposition which is based on the above intuition and which will work in one, two, and three space dimensions. While the discussion of the nonlinear Cahn–Hilliard equation (1) will be the subject of [37], the present paper considers only the semigroup generated by the linearized equation. Although this might seem redundant at first glance, it will allow us to describe our main ideas more clearly, as for example the dominating effects of a strongly unstable subspace. Also, it is more natural to describe the characteristic wavelength of spinodally decomposed states in the linear setting. Once this has been accomplished, the results can easily be carried over to the nonlinear situation as well, cf. [37].

This paper is organized as follows. In Sect. 2 we discuss the linearization of the Cahn–Hilliard equation (1) at the homogeneous equilibrium $u_0 \equiv \mu$. In particular, the above-mentioned dominating eigendirections determining the dominating subspace are defined in Subsect. 2.2.1, and our main results are explained using the linear framework in Subsects. 2.2.2, 2.2.3, and 2.2.4. This is done mainly to provide the reader with some intuition for what may be expected in the nonlinear equation, cf. the forthcoming sequel [37]. For the linear discussion in Sect. 2, as well as for the nonlinear equation considered in [37], we need certain probability and wavelength estimates. These will be derived in Sects. 3 and 4, respectively. The probability estimates quantitatively describe the dominance of the strongly unstable directions in a small neighborhood of the homogeneous equilibrium $u_0 \equiv \mu$, with $\mu$ in the spinodal interval. The wavelength estimates are a first step towards an interpretation of the notion of a characteristic wavelength for the spinodally decomposed states. The analytical results concerning the wavelength are only valid under additional assumptions. They are, however, satisfied “on average”, thus providing a “typical” characteristic wavelength of the order $O(\varepsilon)$ for spinodally decomposed states.

2. The Linearized Equation

In this section we will provide the information on the spectrum of the linearization of the Cahn–Hilliard equation at a spatially homogeneous equilibrium which will be needed subsequently. Furthermore, we will present a “linear version” of our main results to provide the reader with some geometric intuition of what can be expected in the nonlinear case considered in [37].

2.1. The spectrum of the linearization. Assume that $\mu$ is contained in the spinodal interval, i.e., suppose $f'(\mu) > 0$. The linearization of (1) around the spatially homogeneous equilibrium $u_0 \equiv \mu$ is given by

\[
v_t = -\Delta(\varepsilon^2 \Delta v + f'(\mu) v) \ \text{in } \Omega, \qquad
\frac{\partial v}{\partial \nu} = \frac{\partial \Delta v}{\partial \nu} = 0 \ \text{on } \partial\Omega.
\tag{3}
\]

We are interested in the spectrum of the linear operator associated with problem (3). Due to the mass constraint in the Cahn–Hilliard equation this operator will be considered as an operator on

\[
X := \left\{ u \in L^2(\Omega) : \int_\Omega u \, dx = 0 \right\},
\tag{4}
\]

rather than on $L^2(\Omega)$. Recall the following well-known result.

Lemma 2.1. Let $\Omega \subset \mathbb{R}^n$, $n \in \{1, 2, 3\}$, denote a bounded domain with piecewise $C^1$-boundary. Then the spectrum of the operator $-\Delta : X \to X$ with domain $D(-\Delta) = \{u \in X \cap H^2(\Omega) : \partial u/\partial\nu(x) = 0,\ x \in \partial\Omega\}$ consists of an infinite sequence $0 < \kappa_1 \le \kappa_2 \le \kappa_3 \le \ldots \to +\infty$ of real eigenvalues. The corresponding normalized eigenfunctions $\psi_1, \psi_2, \psi_3, \ldots$ form a complete $L^2(\Omega)$-orthonormal set in $X$. Furthermore, if $N_n(\lambda)$ denotes the number of eigenvalues less than $\lambda \in \mathbb{R}$ (counting multiplicities), then we have

\[
\lim_{\lambda \to \infty} \frac{N_n(\lambda)}{\lambda^{n/2}} = c_n \, \mathrm{vol}^{(n)}(\Omega).
\]

Here $\mathrm{vol}^{(n)}(\Omega)$ denotes the $n$-dimensional Lebesgue measure of $\Omega$, and the constants $c_n$ are given by $c_1 = 1/\pi$, $c_2 = 1/(4\pi)$, and $c_3 = 1/(6\pi^2)$.


Proof. Compare Courant and Hilbert [16, p.442].

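Of particular interest are the special cases of a one-dimensional interval $(0, a)$, a rectangle $(0, a) \times (0, b)$, and a box $(0, a) \times (0, b) \times (0, c)$. In these cases the eigenvalues are respectively given by

\[
\pi^2 \cdot \frac{k^2}{a^2}, \qquad
\pi^2 \cdot \left( \frac{k^2}{a^2} + \frac{\ell^2}{b^2} \right), \qquad
\pi^2 \cdot \left( \frac{k^2}{a^2} + \frac{\ell^2}{b^2} + \frac{m^2}{c^2} \right),
\]

with corresponding normalized eigenfunctions

\[
\sqrt{\frac{2}{a}} \cdot \cos\frac{k \pi x_1}{a}, \qquad
\sqrt{\frac{4}{ab}} \cdot \cos\frac{k \pi x_1}{a} \cdot \cos\frac{\ell \pi x_2}{b}, \qquad
\sqrt{\frac{8}{abc}} \cdot \cos\frac{k \pi x_1}{a} \cdot \cos\frac{\ell \pi x_2}{b} \cdot \cos\frac{m \pi x_3}{c}.
\]

In each of these cases we have $k, \ell, m \in \mathbb{N}_0$, subject to the inequalities $k > 0$, $k + \ell > 0$, or $k + \ell + m > 0$, respectively.

As for the eigenvalues of the linear operator associated with the linearized Cahn–Hilliard equation we have the following result.

Lemma 2.2. Again, let $\Omega \subset \mathbb{R}^n$, $n \in \{1, 2, 3\}$, denote a bounded domain with piecewise $C^1$-boundary. Consider the operator $A_\varepsilon : X \to X$ defined by $A_\varepsilon v = -\Delta(\varepsilon^2 \Delta v + f'(\mu) v)$, with domain

\[
D(A_\varepsilon) = \left\{ u \in X \cap H^4(\Omega) : \frac{\partial u}{\partial \nu}(x) = \frac{\partial \Delta u}{\partial \nu}(x) = 0,\ x \in \partial\Omega \right\}.
\]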

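Then $-A_\varepsilon$ is a selfadjoint and sectorial operator. The spectrum of $A_\varepsilon$ consists of real eigenvalues $\lambda_{1,\varepsilon} \ge \lambda_{2,\varepsilon} \ge \ldots \to -\infty$ with corresponding eigenfunctions $\varphi_{1,\varepsilon}, \varphi_{2,\varepsilon}, \ldots$. If $\kappa_i$ and $\psi_i$ denote the eigenvalues and eigenfunctions of Lemma 2.1, then the eigenvalues $\lambda_{i,\varepsilon}$ are obtained by ordering the numbers

\[
\tilde\lambda_{i,\varepsilon} := \kappa_i \, (f'(\mu) - \varepsilon^2 \kappa_i), \qquad i \in \mathbb{N}.
\]

The eigenfunctions $\varphi_{i,\varepsilon}$ are obtained from the eigenfunctions $\psi_i$ through this ordering procedure in the obvious way.

Proof. Compare Henry [29, p. 19] and Riesz, Sz.-Nagy [39].

Remark 2.3. In the following, the numbers $\tilde\lambda_{i,\varepsilon}$ will always denote the unordered eigenvalues of the operator $A_\varepsilon$ given by

\[
\tilde\lambda_{i,\varepsilon} = \kappa_i \, (f'(\mu) - \varepsilon^2 \kappa_i), \qquad i \in \mathbb{N},
\]

where the $\kappa_i$ are the ordered eigenvalues of the operator $-\Delta$ from Lemma 2.1. Similarly, the numbers $\lambda_{i,\varepsilon}$ will always denote the ordered eigenvalues of the operator $A_\varepsilon$ and $\tilde\kappa_{i,\varepsilon}$ the corresponding unordered eigenvalues of $-\Delta$ such that

\[
\lambda_{i,\varepsilon} = \tilde\kappa_{i,\varepsilon} \, (f'(\mu) - \varepsilon^2 \tilde\kappa_{i,\varepsilon}), \qquad i \in \mathbb{N}.
\]

Note that for large $i \in \mathbb{N}$ (depending on $\varepsilon$) we have both $\tilde\lambda_{i,\varepsilon} = \lambda_{i,\varepsilon}$ and $\tilde\kappa_{i,\varepsilon} = \kappa_i$.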
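The ordering procedure of Lemma 2.2 and Remark 2.3 can be illustrated numerically. The following is a minimal sketch under assumptions that are not taken from the paper: the domain is the unit square, $f(u) = u - u^3$ with $\mu = 0$ (so that $f'(\mu) = 1$), $\varepsilon = 0.01$, and all function names are hypothetical. The sketch compares the largest ordered eigenvalue with $f'(\mu)^2/(4\varepsilon^2)$, which is the maximum of the parabola $\kappa \mapsto \kappa (f'(\mu) - \varepsilon^2 \kappa)$ over real $\kappa$ and hence an upper bound for every eigenvalue.

```python
import math

def neumann_eigs_square(kmax):
    """Eigenvalues kappa = pi^2 (k^2 + l^2) of -Laplace on (0,1)^2 with
    Neumann boundary conditions, excluding (k, l) = (0, 0), in increasing order."""
    return sorted(math.pi ** 2 * (k * k + l * l)
                  for k in range(kmax + 1) for l in range(kmax + 1)
                  if k + l > 0)

def linearized_eigs(kappas, eps, fprime):
    """Ordered eigenvalues of A_eps: sort kappa * (f'(mu) - eps^2 kappa) descending."""
    return sorted((kap * (fprime - eps ** 2 * kap) for kap in kappas), reverse=True)

def parabola_max(eps, fprime):
    """Maximum of kappa * (f'(mu) - eps^2 kappa) over real kappa."""
    return fprime ** 2 / (4.0 * eps ** 2)

# Illustrative parameters (assumptions of this sketch, not values from the paper).
eps, fprime = 0.01, 1.0
kappas = neumann_eigs_square(kmax=60)   # covers all kappa < f'(mu) / eps^2
lams = linearized_eigs(kappas, eps, fprime)

print("largest eigenvalue / parabola maximum:", lams[0] / parabola_max(eps, fprime))
print("second largest / largest eigenvalue:  ", lams[1] / lams[0])
```

Both printed ratios are close to 1 for these parameters, which is the close spacing of the leading eigenvalues that the surrounding discussion relies on.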


Since we assumed the inequality $f'(\mu) > 0$, the operator $A_\varepsilon$ has positive eigenvalues for all $0 < \varepsilon < \sqrt{f'(\mu)/\kappa_1}$, i.e., the homogeneous equilibrium $u_0 \equiv \mu$ of the Cahn–Hilliard equation (1) is unstable. From Lemma 2.1 we can further deduce that the dimension of the (finite-dimensional) unstable manifold is asymptotically of the order

\[
\frac{f'(\mu)^{n/2} \, c_n \, \mathrm{vol}^{(n)}(\Omega)}{\varepsilon^n}
\tag{5}
\]

as $\varepsilon \to 0$. Furthermore, the largest eigenvalue $\lambda_{1,\varepsilon}$ of $A_\varepsilon$ is of the order

\[
\lambda_{1,\varepsilon} \sim \lambda_\varepsilon^{\max} := \frac{f'(\mu)^2}{4 \varepsilon^2},
\tag{6}
\]

regardless of the dimension $n$ of $\Omega$. The above discussion shows that for small $\varepsilon > 0$ the linearization $A_\varepsilon$ has many very large positive eigenvalues, and many of them are almost equal in size. Thus it is natural to ask which of these eigenvalues will dominate the behavior of orbits originating near 0. This will be the subject of the next subsection.

2.2. Discussion of the linearized equation. To give a rough outline of our arguments for the nonlinear Cahn–Hilliard equation, let us briefly discuss the dynamics of the linearized equation (3), written in the abstract form

\[
v_t = A_\varepsilon v, \qquad v(0) = \bar v.
\tag{7}
\]
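The asymptotic count (5) can be checked against a direct enumeration. The sketch below assumes the unit square with $f'(\mu) = 1$ (illustrative choices, as are the function names) and uses the fact that $\kappa (f'(\mu) - \varepsilon^2 \kappa) > 0$ exactly when $0 < \kappa < f'(\mu)/\varepsilon^2$, together with $c_2 = 1/(4\pi)$ from Lemma 2.1.

```python
import math

def unstable_dimension_square(eps, fprime):
    """Number of pairs (k, l) != (0, 0), k, l >= 0, whose eigenvalue
    pi^2 (k^2 + l^2) lies below f'(mu) / eps^2 (positive eigenvalue of A_eps)."""
    bound = fprime / (eps ** 2 * math.pi ** 2)   # k^2 + l^2 must stay below this
    kmax = int(math.sqrt(bound))
    return sum(1 for k in range(kmax + 1) for l in range(kmax + 1)
               if 0 < k * k + l * l < bound)

def weyl_prediction_2d(eps, fprime, area=1.0):
    """Formula (5) for n = 2: f'(mu) * c_2 * vol(Omega) / eps^2 with c_2 = 1/(4 pi)."""
    return fprime * area / (4.0 * math.pi * eps ** 2)

for eps in (0.01, 0.005):
    count = unstable_dimension_square(eps, fprime=1.0)
    predicted = weyl_prediction_2d(eps, fprime=1.0)
    print(eps, count, predicted, count / predicted)
```

The ratio of the exact count to the prediction is near 1; the remaining discrepancy comes from lattice points on the coordinate axes, a lower-order boundary effect that (5) does not capture.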

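Due to Lemma 2.2 this equation generates an analytic semigroup $S_\varepsilon(t)$ on $X$, and the solution to the initial value problem is given by $v_\varepsilon(t) = S_\varepsilon(t) \bar v$. Furthermore, if we denote the Fourier series representation of $\bar v$ in $X$ with respect to the complete orthonormal set $\{\varphi_{k,\varepsilon},\ k \ge 1\}$ by $\bar v = \sum_{k=1}^\infty \xi_k \varphi_{k,\varepsilon}$, i.e., if we define $\xi_k := (\bar v, \varphi_{k,\varepsilon})$ (where $(\cdot, \cdot)$ denotes the standard scalar product in $L^2(\Omega)$), then

\[
v_\varepsilon(t) = S_\varepsilon(t) \bar v = \sum_{k=1}^\infty e^{\lambda_{k,\varepsilon} \cdot t} \, \xi_k \, \varphi_{k,\varepsilon}
\qquad \text{for all } t \ge 0.
\tag{8}
\]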

This explicit spectral representation of the semigroup will be useful later on. (Note that the Fourier coefficients $\xi_k$ also depend on $\varepsilon$. However, the dependence is not crucial in the following and therefore we will suppress the subscript $\varepsilon$.)

Suppose that $\bar v$ is an initial condition for (7) which is close to the constant solution $v \equiv 0$. We already mentioned that $f'(\mu) > 0$ implies the instability of the trivial solution. Thus, it is very likely that the orbit $v_\varepsilon(\cdot)$ will quickly leave a small neighborhood $U$ of $v \equiv 0$. We are interested in understanding the geometry of the function $v_\varepsilon(t)$ when this happens. More precisely, suppose that we randomly choose initial conditions from a smaller neighborhood $V \subset U \subset X$. Is it possible to describe where “most” of these orbits exit the neighborhood $U$? Is there a preferred direction at all? How can we formalize “most”?

2.2.1. Definition of the dominating subspace $Y_\varepsilon$. Let us introduce some notation. Fix constants $\gamma^{--} \ll 0 \ll \gamma^- < \gamma^+ < 1$. In order to keep this linear discussion as simple as possible, we will consider from now on only those values of $\varepsilon > 0$ for which the spectrum of the operator $A_\varepsilon$ is disjoint from the set $\{\gamma^{--}, \gamma^-, \gamma^+\} \cdot \lambda_\varepsilon^{\max}$, where $\lambda_\varepsilon^{\max}$ is defined as in (6). (It is possible to obtain similar results if this condition does not hold, but since the condition is satisfied for generic $\varepsilon$, we refrain from presenting the general case here.) Under this assumption we can divide the spectrum of the operator $A_\varepsilon$ into four parts,

\[
\sigma(A_\varepsilon) = \sigma_\varepsilon^{--} \cup \sigma_\varepsilon^{-} \cup \sigma_\varepsilon^{+} \cup \sigma_\varepsilon^{++},
\]

where the four sets on the right-hand side denote the intersections of the spectrum of $A_\varepsilon$ with the intervals $(-\infty, \gamma^{--}) \cdot \lambda_\varepsilon^{\max}$, $(\gamma^{--}, \gamma^-) \cdot \lambda_\varepsilon^{\max}$, $(\gamma^-, \gamma^+) \cdot \lambda_\varepsilon^{\max}$, and $(\gamma^+, 1] \cdot \lambda_\varepsilon^{\max}$, respectively. This splitting induces a decomposition of $X$ into four subspaces $X_\varepsilon^{--}$, $X_\varepsilon^{-}$, $X_\varepsilon^{+}$, and $X_\varepsilon^{++}$, which are generated by the eigenfunctions of $A_\varepsilon$ corresponding to eigenvalues in $\sigma_\varepsilon^{--}$, $\sigma_\varepsilon^{-}$, $\sigma_\varepsilon^{+}$, and $\sigma_\varepsilon^{++}$, respectively. Each of these spaces is invariant with respect to the semigroup $S_\varepsilon(t)$ and we denote the corresponding restrictions of $S_\varepsilon(t)$ by the appropriate superscripts.

Fig. 2. The spectral decomposition of X

Assume further that the neighborhood $U = B_R(0)$ is a fixed ball in $X$ with radius $R$ (with respect to the $X$-norm, i.e., the $L^2(\Omega)$-norm), and define

\[
Y_\varepsilon := X_\varepsilon^{+} \oplus X_\varepsilon^{++}.
\tag{9}
\]

The subspace $Y_\varepsilon$ is generated by all eigenfunctions of $A_\varepsilon$ corresponding to eigenvalues bigger than $\gamma^- \cdot \lambda_\varepsilon^{\max}$. Although its dimension is proportional to $\varepsilon^{-n}$, it is considerably less than that of the unstable subspace, whose dimension is of the same order (cf. (5)). This is due to the fact that $0 \ll \gamma^- < 1$, for example $\gamma^- = 0.99$. In what follows, we will demonstrate that $Y_\varepsilon$ dominates the behavior of most orbits of (7) originating near the origin in the sense that upon leaving $U$ they will be close to $Y_\varepsilon$. More precisely, choose a constant $0 < \varrho \ll R$. Then for suitable $r \in (0, \varrho)$ we will show that most orbits starting in $V := B_r(0) \subset X$ will leave $U$ within a $\varrho$-neighborhood of $Y_\varepsilon$ (cf. Fig. 3). This will be done in several steps.

Fig. 3. The dominating subspace $Y_\varepsilon$

2.2.2. Reduction to a finite-dimensional problem. In this first step it will be shown that we may neglect the projection onto $X_\varepsilon^{--}$ of an orbit originating in $V$, provided $r$ is sufficiently small. In other words, only the projection onto the finite-dimensional subspace $Z_\varepsilon := X_\varepsilon^{-} \oplus X_\varepsilon^{+} \oplus X_\varepsilon^{++}$ determines the behavior of the orbit. Since $\gamma^{--} \ll 0$, its dimension $N(\varepsilon) := \dim Z_\varepsilon$ is considerably bigger than the dimension of the unstable subspace, but still proportional to $\varepsilon^{-n}$.

Choose an initial condition $\bar v$ in $V$ (cf. Fig. 3). Let us further suppose that there exists a full orbit $v_\varepsilon(\cdot) : \mathbb{R} \to X$ of (7) through $\bar v$ with $v_\varepsilon(0) = \bar v$, and that this orbit exits the ball $U$ at time $T_\varepsilon^* > 0$. Set $v_\varepsilon^* := v_\varepsilon(T_\varepsilon^*)$ (cf. again Fig. 3). Since the subspaces $X_\varepsilon^{--}$, $X_\varepsilon^{-}$, $X_\varepsilon^{+}$, and $X_\varepsilon^{++}$ are invariant with respect to $S_\varepsilon$ we may decompose $v_\varepsilon(\cdot)$ as

\[
v_\varepsilon(t) =: v_\varepsilon^{--}(t) + v_\varepsilon^{-}(t) + v_\varepsilon^{+}(t) + v_\varepsilon^{++}(t) \in X_\varepsilon^{--} \oplus X_\varepsilon^{-} \oplus X_\varepsilon^{+} \oplus X_\varepsilon^{++},
\]

and analogously for both the initial condition $\bar v =: \bar v^{--} + \bar v^{-} + \bar v^{+} + \bar v^{++}$ and the exit point $v_\varepsilon^* =: v_\varepsilon^{*,--} + v_\varepsilon^{*,-} + v_\varepsilon^{*,+} + v_\varepsilon^{*,++}$. Using these notations we can verify the following claim:

(C1) If $r < (\varrho R^{-\gamma^{--}})^{1/(1-\gamma^{--})}$, then $\|v_\varepsilon^{*,--}\| < \varrho$ (note that $\|\cdot\|$ denotes the norm of $X$, i.e., the standard $L^2(\Omega)$-norm). In other words, every orbit originating in $V = B_r(0)$ has a small $X_\varepsilon^{--}$-component upon leaving $U = B_R(0)$, provided the orbit leaves $U$ at all.

Proof. According to $\bar v \in V$ and the fact that the largest eigenvalue of $A_\varepsilon$ is bounded above by $\lambda_\varepsilon^{\max}$, we immediately get $T_\varepsilon^* \cdot \lambda_\varepsilon^{\max} \ge \ln(R/r)$, and therefore

\[
\|v_\varepsilon^{*,--}\| = \|S_\varepsilon^{--}(T_\varepsilon^*) \bar v^{--}\|
\le r \cdot e^{\gamma^{--} \cdot \lambda_\varepsilon^{\max} \cdot T_\varepsilon^*}
\le r \cdot \left( \frac{R}{r} \right)^{\gamma^{--}} < \varrho,
\]

due to the choice of $r$. Here we have used the explicit spectral representation of the semigroup $S_\varepsilon(\cdot)$ given in (8).
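The last inequality in the proof of (C1) is elementary: since $1 - \gamma^{--} > 0$, the bound $r \cdot (R/r)^{\gamma^{--}} = r^{1-\gamma^{--}} R^{\gamma^{--}} < \varrho$ is equivalent to the condition on $r$ stated in (C1). The following sketch checks this arithmetic for sample values; the numbers $\varrho = 0.1$, $R = 1$, $\gamma^{--} = -1$ and the function names are illustrative assumptions, not values from the paper.

```python
def c1_radius_bound(rho, R, gamma_mm):
    """Right-hand side of the condition in (C1): (rho * R^(-gamma)) ^ (1 / (1 - gamma)).
    Assumes gamma_mm < 0 < rho < R."""
    return (rho * R ** (-gamma_mm)) ** (1.0 / (1.0 - gamma_mm))

def stable_component_bound(r, R, gamma_mm):
    """Upper bound r * (R / r)^gamma for the X^{--}-component at the exit time."""
    return r * (R / r) ** gamma_mm

rho, R, gamma_mm = 0.1, 1.0, -1.0          # illustrative values only
r_max = c1_radius_bound(rho, R, gamma_mm)
for r in (0.5 * r_max, 0.9 * r_max, r_max):
    # The bound stays below rho for r < r_max and reaches rho at r = r_max.
    print(r, stable_component_bound(r, R, gamma_mm))
```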


Fig. 4. The linear power law

Fig. 5. The main result

2.2.3. A quantitative estimate for the $Z_\varepsilon$-component. The above claim implies that only the projection of the orbit $v_\varepsilon$ onto the finite-dimensional subspace $Z_\varepsilon$ has to be considered in detail, provided $r$ satisfies the inequality in (C1), which we will assume from now on. The main observation of this second step is contained in the following claim:

(C2) Choose $r$ as in (C1). If $\bar v \in V = B_r(0)$ is such that

\[
\|\bar v^{++}\| > R \, \varrho^{-\gamma^+/\gamma^-} \cdot \|\bar v^{-}\|^{\gamma^+/\gamma^-},
\tag{10}
\]

then the orbit through $\bar v$ exits the neighborhood $U = B_R(0)$ near the dominating subspace $Y_\varepsilon$ (cf. Fig. 4, some orbits exiting close to $Y_\varepsilon$ are marked with arrows). More precisely, we not only have $\|v_\varepsilon^{*,--}\| < \varrho$, but also $\|v_\varepsilon^{*,-}\| < \varrho$.

Proof. Recall that for all $t < T_\varepsilon^*$ the inequalities

\[
\|v_\varepsilon^{++}(t)\| \le \|v_\varepsilon^{*,++}\| \cdot e^{\gamma^+ \cdot \lambda_\varepsilon^{\max} \cdot (t - T_\varepsilon^*)} \le R \cdot e^{\gamma^+ \cdot \lambda_\varepsilon^{\max} \cdot (t - T_\varepsilon^*)},
\qquad
\|v_\varepsilon^{-}(t)\| \ge \|v_\varepsilon^{*,-}\| \cdot e^{\gamma^- \cdot \lambda_\varepsilon^{\max} \cdot (t - T_\varepsilon^*)}
\]

hold, cf. the spectral representation (8). For $t = 0$ this implies, together with the choice of $\bar v$, the inequality

\[
\|v_\varepsilon^{*,-}\| \le \|\bar v^{-}\| \cdot e^{\gamma^- \cdot \lambda_\varepsilon^{\max} \cdot T_\varepsilon^*}
< \varrho \cdot \left( \frac{\|\bar v^{++}\|}{R} \right)^{\gamma^-/\gamma^+} \cdot e^{\gamma^- \cdot \lambda_\varepsilon^{\max} \cdot T_\varepsilon^*}
\le \varrho,
\]

which completes the verification of (C2).

2.2.4. Interpretation of the results. The two claims from the previous steps already imply that “most” orbits originating in $V$ will leave $U$ within a $\varrho$-neighborhood of the dominating subspace $Y_\varepsilon$. This will be explained in the following.


Fix a number $p \in (0, 1)$ close to 1. We want to get a statement of the form: With probability $p$, orbits starting in $V$ will leave $U$ close to $Y_\varepsilon$. But what is the right notion of probability? Since $V$ is a subset of the infinite-dimensional space $X$ there is no canonical choice of a probability measure on $V$ which would correspond to the Lebesgue measure in finite dimensions. For this reason, Grant [26] considered general Gaussian measures on Hilbert spaces, but there are certain disadvantages to this approach, cf. for example the discussion in Hunt, Sauer, Yorke [30]. Thus, we will not pursue this approach here, but rather adopt the following point of view.

It was pointed out in the first step that the projection of an orbit onto $X_\varepsilon^{--}$ decays exponentially and can therefore be neglected. Thus, if we observe an orbit $v_\varepsilon$, we actually observe only its projection onto the $N(\varepsilon)$-dimensional subspace $Z_\varepsilon$. Yet, for bounded subsets of $Z_\varepsilon$ there is a canonical probability measure, namely the one induced by $N(\varepsilon)$-dimensional Lebesgue measure through the isometry
$$\mathbb{R}^{N(\varepsilon)} \ni (\xi_1, \ldots, \xi_{N(\varepsilon)}) \mapsto \sum_{k=1}^{N(\varepsilon)} \xi_k \varphi_{k,\varepsilon} \in Z_\varepsilon ,$$
where $\mathbb{R}^{N(\varepsilon)}$ is equipped with the Euclidean norm and $Z_\varepsilon$ with the $L^2(\Omega)$-norm of the surrounding space $X$. We therefore consider only the finite-dimensional flow obtained by projecting $S_\varepsilon(t)$ onto $Z_\varepsilon$. The above notion of probability will be realized in terms of the Lebesgue measure on the space $\mathbb{R}^{N(\varepsilon)}$.

Although this idea is very simple, one has to exercise extreme caution. Since the subspace $Z_\varepsilon$ depends on the choice of $\varepsilon$, it is not clear whether different choices of $\varepsilon$ lead to the same probabilities $p$. This requirement, however, has to be met in order for the results to be meaningful. Fortunately, in the above situation it is possible to get a statement which is independent of $\varepsilon$. Consider the set
$$G_{\varepsilon,r} := \left\{ v \in Z_\varepsilon : \|v^{++}\| > R\varrho^{-\gamma^+/\gamma^-} \cdot \|v^{-}\|^{\gamma^+/\gamma^-} \right\} \cap B_{\varepsilon,r} \subset Z_\varepsilon \cong \mathbb{R}^{N(\varepsilon)} ,$$
where $B_{\varepsilon,r} \subset Z_\varepsilon$ denotes the finite-dimensional closed ball with center 0 and radius $r$. Due to (C2) all initial conditions in $V \subset X$ whose $Z_\varepsilon$-component is contained in $G_{\varepsilon,r}$ will lead to orbits leaving $U$ $\varrho$-close to $Y_\varepsilon$. Furthermore, the boundary of the set $G_{\varepsilon,r}$ satisfies the power law given in (10), and the exponent in this power law is given by the $\varepsilon$-independent number $\gamma^+/\gamma^- > 1$. Together with Theorem 3.1 from Sect. 3 this finally implies the following. There exists an $r_0 \in (0, \varrho)$ which is independent of $\varepsilon > 0$ such that for all $0 < r \le r_0$ and all $\varepsilon > 0$ the estimate
$$\frac{\mathrm{vol}^{(N(\varepsilon))}(G_{\varepsilon,r})}{\mathrm{vol}^{(N(\varepsilon))}(B_{\varepsilon,r})} \ge p$$
holds, where $\mathrm{vol}^{(N(\varepsilon))}(\cdot)$ denotes $N(\varepsilon)$-dimensional Lebesgue measure. (Note that assumption (11) in Theorem 3.1 is satisfied in our situation, since the dimensions of $X_\varepsilon^{-}$ and $X_\varepsilon^{++}$ are both of the order $\varepsilon^{-n}$.) In other words, if we consider $0 < r \le r_0$, and the above finite-dimensional measure as a notion for probability, then with probability at least $p$ an orbit originating in the neighborhood $V \cap Z_\varepsilon = B_{\varepsilon,r}$ will leave $U \cap Z_\varepsilon = B_{\varepsilon,R}$ close to $Y_\varepsilon$. Compare also Fig. 5 which illustrates the above discussion. In this figure we use the definition $M_{\varepsilon,r} := B_{\varepsilon,r} \setminus G_{\varepsilon,r}$.

Note that the value of the constant $r_0$ from above remains (basically) unchanged if we consider slightly different values for $\gamma^{--}$, $\gamma^{-}$, or $\gamma^{+}$. This is needed for obtaining


the result for arbitrarily small $\varepsilon > 0$, and not only for the generic values of $\varepsilon$ considered in this section for demonstration purposes.

2.2.5. Nodal domain patterns of functions in $Y_\varepsilon$. Let us close this discussion with some remarks on the nodal domain patterns of the orbits at the time of exit from $U$. Since most of the orbits will be close to $Y_\varepsilon$, their nodal domains will exhibit patterns which resemble those of functions in $Y_\varepsilon$. These functions, however, exhibit nodal domains as predicted by physical experiments.

We first consider the one-dimensional domain $\Omega = (0, 1)$ with some $\gamma^- < 1$ close to 1. A straightforward calculation shows that the space $Y_\varepsilon$ is spanned by the functions $\cos(k\pi x)$, where $k \in \mathbb{N}$ satisfies
$$\frac{1}{\varepsilon} \cdot \sqrt{\frac{f'(\mu)}{2\pi^2}} \cdot \sqrt{1 - \sqrt{1 - \gamma^-}} \le k \le \frac{1}{\varepsilon} \cdot \sqrt{\frac{f'(\mu)}{2\pi^2}} \cdot \sqrt{1 + \sqrt{1 - \gamma^-}} ,$$
so the size of their nodal domains is of the order $\varepsilon$. Now let $\zeta \ne 0$ denote an arbitrary function in $Y_\varepsilon$. Then according to Karlin [33, Theorem 6.2, p. 35] the number of zeros of $\zeta$ satisfies the same inequalities as $k$ above. Hence, the “average size” of the nodal domains is again of the order $\varepsilon$. This characteristic length has been observed in experiments.

On the other hand, in two space dimensions we may observe snake-like nodal domain patterns. As an example, let us consider the case $\Omega = (0, 1)^2$, $\varepsilon = 1/100$, and $\gamma^- = 0.99$. Then the operator $A_\varepsilon$ has 826 positive eigenvalues and the dimension of $Y_\varepsilon$ is 83, i.e., $Y_\varepsilon$ is spanned by the eigenfunctions corresponding to the 83 largest eigenvalues of $A_\varepsilon$. Fig. 6 depicts a random superposition $\zeta$ of these eigenfunctions. For demonstration purposes we only show the restriction of $\zeta$ to the domain $(0, 1/2)^2 \subset \Omega$. The left diagram contains the nodal lines of this restriction of $\zeta$, i.e., the set $\{x \in (0, 1/2)^2 : \zeta(x) = 0\}$. This pattern resembles both numerically and physically observed patterns. Note also that the nodal domains exhibit some characteristic “thickness”, which appears to be of the same order of magnitude as $\varepsilon$. This will be made precise in Sect. 4 where we will prove the following result (cf. Theorem 4.8 and the discussion of Sect. 4.3): Let $x_0 \in \Omega$ be a “typical” point, and let $G \subset \Omega$ denote the nodal domain of the function $\zeta \in Y_\varepsilon$ which contains $x_0$. Then for any ball contained in $G$ with radius $r$ and center $x_0$ the estimate $r \le C \cdot \varepsilon$ holds with an $\varepsilon$-independent constant $C$.

We close this section with a three-dimensional example. Let $\Omega = (0, 1)^3$, and consider again $\varepsilon = 1/100$ and $\gamma^- = 0.99$. Now the operator $A_\varepsilon$ has 18,096 positive eigenvalues and the dimension of $Y_\varepsilon$ is 1,939, i.e., $Y_\varepsilon$ is spanned by the eigenfunctions corresponding to the 1,939 largest eigenvalues of $A_\varepsilon$. The lower left diagram in Fig. 7 depicts a cross-section of a random superposition $\zeta$ of these eigenfunctions; more precisely, it shows the intersection of the nodal surfaces of $\zeta$ with the horizontal plane $x_3 = 0.2$, or in other words, the nodal lines of $\zeta(\cdot, \cdot, 0.2)$. In the lower right diagram the nodal lines of $\zeta(\cdot, 0.4, \cdot)$ are depicted, and in the upper diagram of Fig. 7 the restriction of the function $\zeta$ to the cube $(0, 0.4)^3$ is sketched. (In this diagram black corresponds to points $x$ in the cube where the inequality $\zeta(x) \le -0.3$ holds, white corresponds to $\zeta(x) \ge 0.3$, and function values in $(-0.3, 0.3)$ are represented by different shades of grey.) Again, these patterns resemble actually observed ones, cf. for example Hyde et al. [31]. Moreover, the nodal domains exhibit a characteristic wavelength, cf. Theorem 4.8.
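The eigenvalue counts quoted in the two- and three-dimensional examples can be explored with a short script. The sketch below is only an illustration under assumptions that are not spelled out in this section: the eigenvalues of $A_\varepsilon$ are taken to be $\lambda = \kappa\,(f'(\mu) - \varepsilon^2\kappa)$ with Neumann eigenvalues $\kappa = \pi^2 |k|^2$ on the unit cube (this is consistent with $\kappa_\varepsilon^{\max} = f'(\mu)/(2\varepsilon^2)$ in (24)), the constant mode is excluded, $f'(\mu) = 1$ is used as a default, and $Y_\varepsilon$ is taken to collect the eigenvalues $\lambda \ge \gamma^- \lambda_\varepsilon^{\max}$. The function name `count_modes` is hypothetical.

```python
import math
from itertools import product


def count_modes(eps, gamma_minus, fprime=1.0, dim=2):
    """Return (number of positive eigenvalues, assumed dimension of Y_eps) on (0,1)^dim.

    Assumption: eigenvalues are lam = kappa * (fprime - eps^2 * kappa) with
    kappa = pi^2 * |k|^2, k a nonzero vector of non-negative integers.
    """
    lam_max = fprime ** 2 / (4.0 * eps ** 2)  # maximum of lam over kappa
    # lam > 0 requires |k| < sqrt(fprime) / (eps * pi); add 1 as a safety margin
    kmax = int(math.sqrt(fprime) / (eps * math.pi)) + 1
    n_positive = 0
    n_dominating = 0
    for k in product(range(kmax + 1), repeat=dim):
        kappa = math.pi ** 2 * sum(j * j for j in k)
        if kappa == 0.0:
            continue  # skip the constant mode
        lam = kappa * (fprime - eps ** 2 * kappa)
        if lam > 0.0:
            n_positive += 1
        if lam >= gamma_minus * lam_max:
            n_dominating += 1
    return n_positive, n_dominating


if __name__ == "__main__":
    # Parameters of the 2D example in the text; f'(mu) = 1 is our assumption.
    print(count_modes(1.0 / 100.0, 0.99, dim=2))
```

Calling `count_modes(1/100, 0.99, dim=2)` or `dim=3` gives numbers that can be compared with the counts stated above; agreement depends on the assumed value of $f'(\mu)$ and on the threshold convention.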

Fig. 6. Random superposition of some significant eigenfunctions in 2D

Fig. 7. Random superposition of some significant eigenfunctions in 3D

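3. The Probability Estimate

In this section we prove a result which is used in Sect. 2 and in [37] to justify the fact that with very high probability orbits corresponding to randomly chosen initial data near a constant solution behave similarly to solutions on the dominating subspace $Y_\varepsilon$, or on the dominating manifold in [37]. Using the notation
$$B_r^i := \{x \in \mathbb{R}^i : \|x\|_2 \le r\} \quad\text{and}\quad \tau_r^i := \mathrm{vol}^{(i)}(B_r^i) = \frac{\pi^{i/2}}{\Gamma(i/2 + 1)} \cdot r^i ,$$
where $\|\cdot\|_2$ denotes the standard Euclidean norm and $\mathrm{vol}^{(i)}$ the Lebesgue measure on $\mathbb{R}^i$, the following result holds.

Theorem 3.1. Let $\beta, \eta \in (0, 1)$ be fixed and define $h(s) = h_{\beta,\eta}(s) := (s/\beta)^{1/\eta}$. Furthermore, fix a constant $C_* > 0$ and let $k, \ell \in \mathbb{N}$ and $m \in \mathbb{N}_0$ be arbitrary natural numbers such that
$$\ell \le C_* \cdot k . \tag{11}$$
Let $i := k + \ell + m$ and for $\varrho > 0$ consider the set
$$M_\varrho := \{(x, y, z) \in B_\varrho^i \subset \mathbb{R}^k \times \mathbb{R}^\ell \times \mathbb{R}^m : \|x\|_2 \le h(\|y\|_2)\} ,$$
cf. Fig. 8. Then we have
$$\frac{\mathrm{vol}^{(i)}(M_\varrho)}{\mathrm{vol}^{(i)}(B_\varrho^i)} \le \sqrt{\frac{h(\varrho)}{\varrho}} = \beta^{-\frac{1}{2\eta}} \cdot \varrho^{\frac{1}{2}\left(\frac{1}{\eta} - 1\right)} \quad\text{for all}\quad 0 < \varrho \le \varrho_0 , \tag{12}$$
where
$$\varrho_0 := \beta^{\frac{1}{1-\eta}} \cdot \left( \min\left\{ \frac{1}{1 + 2C_*} , \left( \int_0^1 (1 - s^2)^{C_*/2} \, ds \right)^2 \right\} \right)^{\frac{\eta}{1-\eta}} > 0 . \tag{13}$$
In particular, the quotient in (12) approaches zero as $\varrho \to 0$.

Remark 3.2. It can be shown that asymptotically
$$\int_0^1 (1 - s^2)^{C_*/2} \, ds \sim \sqrt{\frac{\pi}{2C_* + 2}} \quad\text{as}\quad C_* \to \infty .$$
Hence, we have
$$\varrho_0 = \beta^{\frac{1}{1-\eta}} \cdot (1 + 2C_*)^{-\frac{\eta}{1-\eta}}$$
for sufficiently large $C_* > 0$.

Remark 3.3. Instead of considering $M_\varrho$ as a subset of the ball $B_\varrho^i$ one could also consider cubes, i.e.,
$$M_\varrho^Q = \{(x, y, z) \in Q_\varrho : \|x\|_2 \le h(\|y\|_2)\} ,$$
$$Q_\varrho = \{(x, y, z) \in \mathbb{R}^k \times \mathbb{R}^\ell \times \mathbb{R}^m : \|x\| < \varrho, \ \|y\| < \varrho, \ \|z\| < \varrho\} .$$
In this case, an easy calculation furnishes the identity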


Fig. 8. The orthogonal projection of M% onto Rk × R` × {0}

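$$\frac{\mathrm{vol}^{(i)}(M_\varrho^Q)}{\mathrm{vol}^{(i)}(Q_\varrho)} = \frac{1}{1 + k/(\ell\eta)} \cdot \beta^{-k \cdot \frac{1}{\eta}} \cdot \varrho^{k \cdot \left(\frac{1}{\eta} - 1\right)} \quad\text{for all}\quad \varrho \le \beta^{1/(1-\eta)} , \tag{14}$$
which readily implies
$$\frac{\mathrm{vol}^{(i)}(M_\varrho^Q)}{\mathrm{vol}^{(i)}(Q_\varrho)} \le \sqrt{\frac{h(\varrho)}{\varrho}} = \beta^{-\frac{1}{2\eta}} \cdot \varrho^{\frac{1}{2}\left(\frac{1}{\eta} - 1\right)} \quad\text{for all}\quad \varrho \le \beta^{1/(1-\eta)} .$$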
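The identity (14) lends itself to a quick numerical sanity check. The following sketch is an illustration only and rests on an assumption of ours: the norms defining $Q_\varrho$ are read as Euclidean norms, so that $Q_\varrho$ is a product of balls and only the radii $\|x\|_2$ and $\|y\|_2$ matter (the $z$-component does not enter the condition). The function names and the sample parameters are hypothetical.

```python
import random


def cube_ratio_exact(k, l, beta, eta, rho):
    """Right-hand side of (14); meaningful for rho <= beta**(1/(1-eta))."""
    return beta ** (-k / eta) * rho ** (k * (1.0 / eta - 1.0)) / (1.0 + k / (l * eta))


def cube_ratio_mc(k, l, beta, eta, rho, samples=200_000, seed=0):
    """Monte Carlo estimate of vol(M_rho^Q) / vol(Q_rho).

    The norm of a uniformly distributed point in a d-dimensional ball of
    radius rho has the same law as rho * U**(1/d) with U uniform on [0, 1).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        norm_x = rho * rng.random() ** (1.0 / k)
        norm_y = rho * rng.random() ** (1.0 / l)
        if norm_x <= (norm_y / beta) ** (1.0 / eta):  # ||x|| <= h(||y||)
            hits += 1
    return hits / samples


if __name__ == "__main__":
    args = dict(k=2, l=3, beta=0.5, eta=0.5, rho=0.2)  # rho <= beta^(1/(1-eta)) = 0.25
    print(cube_ratio_exact(**args), cube_ratio_mc(**args))
```

For the sample parameters the closed form evaluates to $16 \cdot 0.04 \cdot 3/7 \approx 0.274$, which is below $\sqrt{h(\varrho)/\varrho} = \sqrt{0.8}$, in line with the bound stated after (14).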

Note that in this case no restriction on $\ell$ and $k$ is necessary. Yet, although the result for cubes is more straightforward, in our situation it is more natural to work with Euclidean balls, since these correspond to $L^2(\Omega)$-balls in our application, compare Subsect. 2.2.4.

Proof of Theorem 3.1. We only give the proof for $m \ge 1$. In the case $m = 0$ it is even simpler. Also, we restrict ourselves to $k \ge 2$ and $\ell \ge 2$, since these are the relevant cases for our later applications. The cases when $k = 1$ or $\ell = 1$ can be treated by similar calculations. Introducing $\theta := (\|x\|_2^2 + \|y\|_2^2)^{1/2}$ we have
$$M_\varrho = \bigcup_{\theta \in [0, \varrho]} \tilde M_\theta \times B^m_{\sqrt{\varrho^2 - \theta^2}} ,$$
with
$$\tilde M_\theta := \{(x, y) \in \partial B_\theta^{k+\ell} : \|x\|_2 \le h(\|y\|_2)\} .$$
Then a generalization of Fubini’s theorem implies
$$\mathrm{vol}^{(i)}(M_\varrho) = \int_0^\varrho \mathrm{vol}_S^{(k+\ell)}(\tilde M_\theta) \cdot \mathrm{vol}^{(m)}\!\left(B^m_{\sqrt{\varrho^2 - \theta^2}}\right) d\theta , \tag{15}$$
where we have used the canonical surface Lebesgue volume on the sphere $\partial B_\theta^{k+\ell}$, i.e., $\mathrm{vol}_S^{(k+\ell)}(M)$ is defined for measurable subsets $M \subset \partial B_\theta^{k+\ell}$. Since $0 \le \|x\|_2 \le \theta$, we deduce for arbitrary $(x, y) \in \tilde M_\theta$ the estimate
$$0 \le \|x\|_2 \le h(\|y\|_2) = h\!\left(\sqrt{\theta^2 - \|x\|_2^2}\right) \le h(\theta) ,$$


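hence
$$\tilde M_\theta \subset \hat M_\theta := \{(x, y) \in \partial B_\theta^{k+\ell} : \|x\|_2 < h(\theta)\} \quad\text{and}\quad \mathrm{vol}_S^{(k+\ell)}(\tilde M_\theta) \le \mathrm{vol}_S^{(k+\ell)}(\hat M_\theta) . \tag{16}$$
To perform the integration $\mathrm{vol}_S^{(k+\ell)}$ on $\partial B_\theta^{k+\ell}$ explicitly, we introduce new coordinates. Let $\hat n := k + \ell$ and set
$$(t_1, \ldots, t_k, t_{k+1}, \ldots, t_{\hat n - 1}, y_\ell) := (x_1, \ldots, x_k, y_1, \ldots, y_\ell) .$$
Standard integration techniques for surfaces yield for $\hat M_\theta \subset \partial B_\theta^{k+\ell}$,
$$\frac{1}{2} \cdot \mathrm{vol}_S^{(k+\ell)}(\hat M_\theta) = \int_{\substack{0 < \|t\|_2 < \theta \\ 0 < \sqrt{\sum_{j=1}^k t_j^2} < h(\theta)}} \frac{\theta}{\sqrt{\theta^2 - \|t\|_2^2}} \, d^{\hat n - 1} t .$$
For this we assume $h(\theta) \le \theta$, which is certainly satisfied for all $\theta$ with $0 < \theta \le \varrho_0$, cf. (13). For $\ell \ge 2$ Fubini’s Theorem and a standard integration formula for rotationally symmetric functions then imply
$$\mathrm{vol}_S^{(k+\ell)}(\hat M_\theta) = 2k\tau_1^k (\ell - 1)\tau_1^{\ell-1} \theta \int_0^{h(\theta)} \int_0^{\sqrt{\theta^2 - s^2}} \frac{\sigma^{\ell-2} s^{k-1}}{\sqrt{\theta^2 - s^2 - \sigma^2}} \, d\sigma \, ds . \tag{17}$$
An analogous calculation gives
$$\mathrm{vol}_S^{(k+\ell)}(\partial B_\theta^{k+\ell}) = 2k\tau_1^k (\ell - 1)\tau_1^{\ell-1} \theta \int_0^{\theta} \int_0^{\sqrt{\theta^2 - s^2}} \frac{\sigma^{\ell-2} s^{k-1}}{\sqrt{\theta^2 - s^2 - \sigma^2}} \, d\sigma \, ds . \tag{18}$$
Thus, we have to estimate (17) in terms of (18). With the definition
$$\Theta : \mathbb{N}_0 \to \mathbb{R}^+ , \quad \Theta(j) := \int_0^1 \frac{s^j}{\sqrt{1 - s^2}} \, ds = \frac{1}{r^j} \cdot \int_0^r \frac{s^j}{\sqrt{r^2 - s^2}} \, ds ,$$
we derive using $r = \sqrt{\theta^2 - s^2}$
$$\int_0^{h(\theta)} \int_0^{\sqrt{\theta^2 - s^2}} \frac{\sigma^{\ell-2} s^{k-1}}{\sqrt{\theta^2 - s^2 - \sigma^2}} \, d\sigma \, ds = \theta^{k+\ell-2} \, \Theta(\ell - 2) \int_0^{h(\theta)/\theta} \left(\sqrt{1 - s^2}\right)^{\ell-2} s^{k-1} \, ds .$$
Now $\ell \ge 2$ implies
$$\mathrm{vol}_S^{(k+\ell)}(\hat M_\theta) = 2k\tau_1^k (\ell - 1)\tau_1^{\ell-1} \theta^{k+\ell-1} \, \Theta(\ell - 2) \int_0^{h(\theta)/\theta} \left(\sqrt{1 - s^2}\right)^{\ell-2} s^{k-1} \, ds , \tag{19}$$
and similarly, using $b(j_1, j_2) := \int_0^1 (\sqrt{1 - s^2})^{j_1} s^{j_2} \, ds$, $j_1, j_2 \in \mathbb{N}_0$,
$$\mathrm{vol}_S^{(k+\ell)}(\partial B_\theta^{k+\ell}) = 2k\tau_1^k (\ell - 1)\tau_1^{\ell-1} \theta^{k+\ell-1} \, \Theta(\ell - 2) \cdot b(\ell - 2, k - 1) . \tag{20}$$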

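It remains to estimate $\int_0^{h(\theta)/\theta} (\sqrt{1 - s^2})^{\ell-2} s^{k-1} \, ds$ against $b(\ell - 2, k - 1)$. A variable substitution yields
$$\int_0^{h(\theta)/\theta} s^{k-1} \left(\sqrt{1 - s^2}\right)^{\ell-2} ds = \frac{h(\theta)}{\theta} \int_0^1 \left(\frac{h(\theta)}{\theta} s\right)^{k-1} \left(\sqrt{1 - \left(\frac{h(\theta)}{\theta} s\right)^2}\right)^{\ell-2} ds . \tag{21}$$
We embed the integrand of the last integral in a one-parameter family in the following way. For fixed $s \in [0, 1]$ define
$$g_s : \left[\frac{h(\theta)}{\theta}, \sqrt{\frac{h(\theta)}{\theta}}\right] \to \mathbb{R}_0^+ , \quad \xi \mapsto (\xi s)^{k-1} \cdot \left(\sqrt{1 - (\xi s)^2}\right)^{\ell-2} , \tag{22}$$
so that the last integrand in (21) obviously is $g_{\cdot}(h(\theta)/\theta)$. Let $0 < \theta \le \varrho_0$ be fixed for the moment. Then $0 < h(\theta)/\theta < \sqrt{h(\theta)/\theta} < 1$. Using (11) and (13) one can show that for $k \ge 2$, $\ell \ge 2$, and $s \in [0, 1]$ the function $g_s$ is monotonically increasing with respect to $\xi \in [h(\theta)/\theta, \sqrt{h(\theta)/\theta}]$. This furnishes
$$\int_0^1 \left(\frac{h(\theta)}{\theta} s\right)^{k-1} \left(\sqrt{1 - \left(\frac{h(\theta)}{\theta} s\right)^2}\right)^{\ell-2} ds \le \left(\sqrt{\frac{h(\theta)}{\theta}}\right)^{-1} b(\ell - 2, k - 1) .$$
Together with (19), (20), and (21) we obtain
$$\frac{\mathrm{vol}_S^{(k+\ell)}(\hat M_\theta)}{\mathrm{vol}_S^{(k+\ell)}(\partial B_\theta^{k+\ell})} \le \sqrt{\frac{h(\theta)}{\theta}} \quad\text{for all}\quad 0 < \theta \le \varrho_0 .$$
Finally, (15) and (16) imply
$$\mathrm{vol}^{(i)}(M_\varrho) \le \int_0^\varrho \mathrm{vol}_S^{(k+\ell)}(\hat M_\theta) \cdot \mathrm{vol}^{(m)}\!\left(B^m_{\sqrt{\varrho^2 - \theta^2}}\right) d\theta \le \sqrt{\frac{h(\varrho)}{\varrho}} \cdot \mathrm{vol}^{(i)}(B_\varrho^i) , \quad\text{for all } 0 < \varrho \le \varrho_0 ,$$
and the claim is proved.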
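The estimate (12) and the radius $\varrho_0$ from (13) can be illustrated numerically. The sketch below is not part of the proof; it approximates the integral in (13) by a midpoint rule and estimates the volume quotient by sampling uniformly from the ball $B_\varrho^i$ (a normalized Gaussian direction times a radius $\varrho\,U^{1/i}$). The function names, the quadrature, and the sample parameters $k = \ell = 2$, $m = 1$, $\beta = \eta = 1/2$, $C_* = 1$ are our own choices.

```python
import math
import random


def h(s, beta, eta):
    """h(s) = (s / beta)^(1 / eta) as in Theorem 3.1."""
    return (s / beta) ** (1.0 / eta)


def rho0(beta, eta, c_star, steps=20_000):
    """Evaluate (13); the integral is approximated by the midpoint rule."""
    ds = 1.0 / steps
    integral = ds * sum(
        (1.0 - ((j + 0.5) * ds) ** 2) ** (c_star / 2.0) for j in range(steps)
    )
    inner = min(1.0 / (1.0 + 2.0 * c_star), integral ** 2)
    return beta ** (1.0 / (1.0 - eta)) * inner ** (eta / (1.0 - eta))


def ball_ratio_mc(k, l, m, beta, eta, rho, samples=100_000, seed=0):
    """Monte Carlo estimate of vol(M_rho) / vol(B_rho^i) with i = k + l + m."""
    rng = random.Random(seed)
    i = k + l + m
    hits = 0
    for _ in range(samples):
        g = [rng.gauss(0.0, 1.0) for _ in range(i)]
        norm = math.sqrt(sum(c * c for c in g))
        scale = rho * rng.random() ** (1.0 / i) / norm  # uniform point in the ball
        norm_x = scale * math.sqrt(sum(c * c for c in g[:k]))
        norm_y = scale * math.sqrt(sum(c * c for c in g[k:k + l]))
        if norm_x <= h(norm_y, beta, eta):
            hits += 1
    return hits / samples


if __name__ == "__main__":
    beta, eta, c_star = 0.5, 0.5, 1.0          # l <= c_star * k holds for k = l = 2
    rho = 0.08                                 # below rho0 = 1/12 for these values
    bound = math.sqrt(h(rho, beta, eta) / rho)
    print(rho0(beta, eta, c_star), ball_ratio_mc(2, 2, 1, beta, eta, rho), bound)
```

For these parameters the minimum in (13) is attained by $1/(1 + 2C_*) = 1/3$, so $\varrho_0 = 1/12$, and the sampled quotient should stay below $\sqrt{h(\varrho)/\varrho} = \sqrt{0.32}$.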

One of the assumptions in Theorem 3.1 was that “bad” versus “good” dimensions had to be bounded, cf. assumption (11). In our next theorem we will see that this assumption is indeed somehow necessary to get a good estimate on the quotient $\mathrm{vol}^{(i)}(M_{\varrho_0})/\mathrm{vol}^{(i)}(B_{\varrho_0}^i)$ for varying dimensions and fixed (though arbitrarily small) radius $\varrho_0 > 0$.

Theorem 3.4. Let $\beta, \eta \in (0, 1)$ be fixed and define again $h(s) := (s/\beta)^{1/\eta}$. Consider natural numbers $k_0, \ell \in \mathbb{N}$ and $m_0 \in \mathbb{N}_0$, where $k_0$ and $m_0$ are assumed to be fixed, whereas $\ell$ may vary. Let $i := k_0 + \ell + m_0$ and for fixed $\varrho_0 > 0$ set
$$M_{\varrho_0} := \{(x, y, z) \in B_{\varrho_0}^i \subset \mathbb{R}^{k_0} \times \mathbb{R}^{\ell} \times \mathbb{R}^{m_0} : \|x\|_2 \le h(\|y\|_2)\} .$$
Then
$$\frac{\mathrm{vol}^{(i)}(M_{\varrho_0})}{\mathrm{vol}^{(i)}(B_{\varrho_0}^i)} \to 1 \quad\text{as}\quad i \to \infty \quad\text{(which implies } \ell \to \infty\text{)} .$$


Proof. Notice that we can restrict ourselves to the case $m_0 = 0$ without loss of generality. We omit the proof of Theorem 3.4, since the required estimates are very similar to the ones used for the proof of Theorem 3.1.

Remark 3.5. The above theorem also holds for any continuous function $h$ satisfying $h(0) = 0$ and $h(s) > 0$ for $s > 0$. It shows that for fixed $k_0$ and $m_0$ the set $B_{\varrho_0}^i \setminus M_{\varrho_0}$ will certainly not dominate in the ball $B_{\varrho_0}^i$ for all dimensions $i$ and for some fixed, but arbitrarily small $\varrho_0 > 0$. This is particularly remarkable since at first glance the set $M_{\varrho_0}$ seems to be very small. Moreover, the analogous statement for cubes $Q_\varrho$ in $\mathbb{R}^i$ instead of balls $B_{\varrho_0}^i$ turns out to be wrong: In the situation of Remark 3.3 we obtain with (14) the relation
$$\frac{\mathrm{vol}^{(i)}(M_{\varrho_0}^Q)}{\mathrm{vol}^{(i)}(Q_{\varrho_0})} \to \beta^{-k_0 \cdot \frac{1}{\eta}} \cdot \varrho_0^{k_0 \cdot \left(\frac{1}{\eta} - 1\right)} < 1 \quad\text{as}\quad i \to \infty \quad\text{(which implies } \ell \to \infty\text{)} ,$$
provided $\varrho_0 < \beta^{1/(1-\eta)}$.

Remark 3.6. Theorem 3.1 shows that our approach to spinodal decomposition as outlined in Subsect. 2.2 with variable dimension of the dominating subspace $Y_\varepsilon = X_\varepsilon^+ \oplus X_\varepsilon^{++}$ is sufficient: The dimensions of the spaces $X_\varepsilon^{++}$ and $X_\varepsilon^{-}$ satisfy an estimate of the form (11) for an $\varepsilon$-independent constant $C_*$, and the exponent $1/\eta$ in the function $h$ is given by the also $\varepsilon$-independent number $\gamma^+/\gamma^-$. Therefore, the probability estimate (12) implies that the set $G_{\varepsilon,r}$ introduced in Subsect. 2.2.4 indeed is dominating in the ball $B_{\varepsilon,r}$ (these two sets correspond to the sets $B_{\varrho_0}^i \setminus M_{\varrho_0}$ and $B_{\varrho_0}^i$ in Theorem 3.1, respectively). For more details we refer the reader to Subsect. 2.2.4. On the other hand, the above Theorem 3.4 says that a fixed dimension of the dominating subspace $Y_\varepsilon$ (which corresponds to the dimension $k_0 + m_0$) would not have sufficed. In particular, the one-dimensional strongly unstable manifold used in Grant [26] cannot be sufficient.

4. Wavelength of Spinodally Decomposed States

In Subsect. 2.2 we described our approach to spinodal decomposition for the linearized Cahn–Hilliard equation (3). The main observation was that most orbits originating near the equilibrium 0 will leave a given neighborhood $U = B_R(0)$ close to the dominating subspace $Y_\varepsilon = X_\varepsilon^+ \oplus X_\varepsilon^{++}$ defined in (9), which is generated by all eigenfunctions of the operator $A_\varepsilon$ corresponding to eigenvalues in $\sigma_\varepsilon^+ \cup \sigma_\varepsilon^{++}$, cf. Subsect. 2.2.1. This result will be carried over to the nonlinear Cahn–Hilliard equation (1) in [37]. For this, however, the linear dominating subspace $Y_\varepsilon$ has to be replaced by a nonlinear dominating invariant manifold $\mathcal{M}_\varepsilon$ through the homogeneous equilibrium $u_0 \equiv \mu$ tangent to the affine space $\mu + Y_\varepsilon$. In other words, every $\zeta \in \mathcal{M}_\varepsilon$ close to the equilibrium $\mu$ is of the form
$$\zeta = \mu + \psi + o(\|\psi\|) , \quad \psi \in Y_\varepsilon = X_\varepsilon^+ \oplus X_\varepsilon^{++} . \tag{23}$$
We will show in Sect. 3 of [37] that upon leaving a neighborhood $U$ of the homogeneous equilibrium $u_0 \equiv \mu$ most solutions of the Cahn–Hilliard equation (1) starting near $u_0$ will be close to $\mathcal{M}_\varepsilon$, and therefore the nodal domain patterns of functions on this dominating manifold will describe the possible patterns of spinodally decomposed states shortly after quenching. Due to physical experiments and numerical simulations one therefore expects that these functions will exhibit some form of common wavelength which should


be proportional to $\varepsilon$. The present section is devoted to mathematically describing this wavelength. Since $\mathcal{M}_\varepsilon$ is tangent to $\mu + Y_\varepsilon$, the linear dominating subspace $Y_\varepsilon$ will certainly be important for our discussion. Let us recall some facts from Sect. 2. According to its definition in Subsect. 2.2.1 the space $Y_\varepsilon$ is generated by all eigenfunctions of $A_\varepsilon$ corresponding to eigenvalues in $\sigma_\varepsilon^+ \cup \sigma_\varepsilon^{++}$. According to Lemma 2.2 these eigenfunctions are also eigenfunctions $\psi_i$ of $-\Delta : X \to X$ (cf. Lemma 2.1), now corresponding to eigenvalues $\kappa_i$ which are sufficiently close to
$$\kappa_\varepsilon^{\max} := \frac{f'(\mu)}{2\varepsilon^2} . \tag{24}$$
To fix notation, choose the interval $I_\varepsilon$ in such a way that
$$Y_\varepsilon = X_\varepsilon^+ \oplus X_\varepsilon^{++} = \mathrm{span}\,\{\psi_i : i \in I_\varepsilon\} . \tag{25}$$
Furthermore, choose constants $\gamma_1$ and $\gamma_2$ with $0 \ll \gamma_1 < 1$ and $0 \ll 1/\gamma_2 < 1$ which are independent of $\varepsilon$ such that
$$0 < \gamma_1 \le \frac{\kappa_\varepsilon^{\max}}{\kappa_i} \le \gamma_2 \quad\text{for all}\quad i \in I_\varepsilon . \tag{26}$$
Both $\gamma_1$ and $\gamma_2$ may be calculated explicitly from $\gamma^- < 1$, cf. Subsect. 2.2.1 and Lemma 2.2. If $P_\kappa : X \to X$ denotes the $L^2(\Omega)$-orthogonal projection of $X$ onto the eigenspace of $-\Delta : X \to X$ corresponding to the eigenvalue $\kappa > 0$, then $Y_\varepsilon$ consists of all functions $\zeta \in X$ for which $P_\kappa \zeta = 0$ whenever $\kappa \notin \{\kappa_i : i \in I_\varepsilon\}$. Similarly, due to (23) any function $\zeta$ on the invariant manifold $\mathcal{M}_\varepsilon$ has the following two properties:

1. There exists at least one $\kappa_i$, $i \in I_\varepsilon$, with $P_{\kappa_i}(\zeta - \mu) \ne 0$.
2. For all eigenvalues $\kappa \notin \{\kappa_i : i \in I_\varepsilon\}$ we have $P_\kappa(\zeta - \mu) \approx 0$. (The exact meaning of “$\approx 0$” can be deduced from (23).)

Due to the dominating effects of the strongly unstable manifold $\mathcal{M}_\varepsilon$ (cf. [37]), the above two statements remain true for most solutions of the nonlinear Cahn–Hilliard equation (1) originating near the homogeneous equilibrium $u_0 \equiv \mu$. Thus, the value $\kappa_\varepsilon^{\max}$ is in a way characteristic for a typical spinodally decomposed state of the Cahn–Hilliard equation. Yet it is still not clear in what way there is a common notion of wavelength for these states, which agrees with the one observed in numerical simulations and physical experiments.

4.1. Wavelength in an eigenspace. We begin by recalling the notion of wavelength used in physics. Let $\Omega = [0, \pi]^3$ and consider the function $\psi : \Omega \to \mathbb{R}$ defined by $\psi(x, y, z) = \cos(kx)\cos(\ell y)\cos(mz)$, where $(0, 0, 0) \ne (k, \ell, m) \in \mathbb{N}_0^3$. The vector $(k, \ell, m)$ is called the wave vector associated with $\psi$, and the corresponding wavelength is defined by
$$\text{wavelength of } \psi := \frac{2\pi}{\text{length of the wave vector}} = \frac{2\pi}{\sqrt{k^2 + \ell^2 + m^2}} = \frac{2\pi}{\sqrt{\kappa}} .$$
Here $\kappa := k^2 + \ell^2 + m^2$ denotes the eigenvalue of the operator $-\Delta : X \to X$ corresponding to the eigenfunction $\psi$, i.e., we have


$$-\Delta\psi = \kappa \cdot \psi \ \text{ in } \Omega , \qquad \frac{\partial\psi}{\partial\nu} = 0 \ \text{ on } \partial\Omega . \tag{27}$$
Similarly, if we choose $\Omega = [0, \pi]^n$, then there is a notion of wavelength for eigenfunctions of $-\Delta : X \to X$ which are of a product cosine structure. Again this wavelength depends only on the eigenvalue $\kappa$ and is given by $2\pi/\sqrt{\kappa}$. The situation is different if sums of eigenfunctions are considered, since in this case the wave vectors have to be superposed. Now the notion of wavelength as “length of a wave” does not make sense anymore. (A nice and elementary analysis of the problem of superposing waves from a physical point of view may be found in Gerthsen, Kneser, Vogel [23, Chapter 4].)

To the best of our knowledge, and although there is a vast amount of literature on the asymptotic behavior of eigenvalues and eigenfunctions of the Laplacian (cf. for example Babič and Buldyrev [7], Safarov and Vassiliev [40], or Karlin [33]), there seems to be no geometrical explanation of the wavelength in general. At least for arbitrary eigenfunctions of (27) on general domains $\Omega \subset \mathbb{R}^n$, the next lemma provides a reasonable explanation of the above statement “wavelength of $\psi$ is equal to $2\pi/\sqrt{\kappa}$”.

Lemma 4.1. Let $G \subset \mathbb{R}^n$, $n \in \mathbb{N}$, be a bounded domain and let $u \in C^2(\bar G)$ be a smooth solution of
$$-\Delta u = \kappa \cdot u \ \text{ in } G , \qquad u > 0 \ \text{ in } G .$$
Then the following assertions hold:
(a) For any $x_0 \in G$ and $r > 0$ such that $B_r(x_0) \subset G$ the estimate $r \le (\kappa_1^{(n)}/\kappa)^{1/2}$ necessarily holds. Here $\kappa_1^{(n)}$ denotes the first eigenvalue of $-\Delta$ on the unit ball $B_1(0) \subset \mathbb{R}^n$ subject to Dirichlet boundary conditions.
(b) Suppose additionally that $u = 0$ on $\partial G$. Then for any $x_0 \in \mathbb{R}^n$ and $R > 0$ such that $G \subset B_R(x_0)$ the estimate $R \ge (\kappa_1^{(n)}/\kappa)^{1/2}$ holds.

Remark 4.2. Before proving the lemma, let us add the following remarks.
1. This lemma seems to be well-known, but since we were not able to find a good reference, we will sketch the proof.
2. The above lemma may be applied to eigenfunctions $\psi$ of the Neumann problem (27) on any nodal domain $G \subset \Omega$. The $C^2$-smoothness is obvious since in that case $u$ is the restriction of a smooth eigenfunction $\psi$ on $\Omega$ (for example if $\partial\Omega$ is of class $C^2$, or if $\Omega = [0, \pi]^n$). Essential for the application is, however, that no smoothness of $\partial G$ is required.
3. Lemma 4.1(a) does not rule out nodal domains which are “arbitrarily” large, but if that is the case they necessarily have to be slim. Part (b) on the other hand rules out nodal domains which are too small.

Proof of Lemma 4.1. (a) Let $\kappa > 0$ and $x_0 \in G$ be given, and consider a first positive eigenfunction $w = w_\kappa : B_s(x_0) \to \mathbb{R}$ of $-\Delta w = \kappa w$ in $B_s(x_0)$ subject to Dirichlet boundary conditions. Here $s = s(\kappa)$ is uniquely determined by $\kappa$ and given explicitly by $s(\kappa) = (\kappa_1^{(n)}/\kappa)^{1/2}$. Now assume that $r > s$. Then we have $B_s(x_0) \subset G$, and both $w$ and $u$ are positive eigenfunctions of $-\Delta$ on $B_s(x_0)$ corresponding to the eigenvalue $\kappa$. (Here and in the following the term eigenfunction is used for solutions of the eigenequation, without assuming any boundary constraints.) Thus, for suitable $\alpha > 0$ the linear combination


$v(x) := -u(x) + \alpha \cdot w(x)$, $x \in B_s(x_0)$, is a nontrivial eigenfunction of $-\Delta$ satisfying $v \le 0$ in $B_s(x_0)$ and $v < 0$ on $\partial B_s(x_0)$, but $v(x_1) = 0$ for at least one $x_1 \in B_s(x_0)$. This, however, contradicts the maximum principle, cf. for example Gidas, Ni, Nirenberg [24, 1.3]. So our assumption $r > s$ was wrong and (a) is proved.
(b) This second assertion of the lemma can be proved analogously. Assuming the inequality $R < s$ and comparing $w$ with $u$ on $G \subset B_R(x_0)$ again contradicts the maximum principle, this time on $G$.

For eigenfunctions of $-\Delta$ Lemma 4.1 provides a satisfactory and very intuitive generalization of the physical notion of wavelength described in the beginning of this subsection. Furthermore, $2\pi/\sqrt{\kappa}$ as wavelength would still be meaningful, since it differs only by a constant factor from the expression $(\kappa_1^{(n)}/\kappa)^{1/2}$. If we consider in particular the eigenfunctions $\psi_i$ which are contained in the dominating subspace $Y_\varepsilon$, i.e., if we have $i \in I_\varepsilon$, then $\kappa_i$ will be close to $\kappa_\varepsilon^{\max}$ and therefore the wavelength observed for $\psi_i$ is approximately given by
$$\frac{2\pi}{\sqrt{\kappa_\varepsilon^{\max}}} = \frac{2\pi\sqrt{2}}{\sqrt{f'(\mu)}} \cdot \varepsilon , \tag{28}$$
in accordance with physical observations.

4.2. Wavelength in the dominating subspace. While in the last subsection we gave a satisfactory interpretation for the wavelength of eigenfunctions, we actually are interested in obtaining a similar result for superpositions of eigenfunctions to “almost equal” eigenvalues, in particular, for the functions in the dominating subspace $Y_\varepsilon$. Unfortunately, although we do believe that an analogue of Lemma 4.1 also holds in this case, we will obtain the result only under additional assumptions.

Let $\{\psi_i\}_{i \in \mathbb{N}}$ denote the complete $L^2(\Omega)$-orthonormal set of eigenfunctions of the operator $-\Delta : X \to X$ with ordered eigenvalues $\{\kappa_i\}_{i \in \mathbb{N}}$ according to Lemma 2.1, i.e., for all $i \in \mathbb{N}$ we have
$$-\Delta\psi_i = \kappa_i \cdot \psi_i \ \text{ in } \Omega \subset \mathbb{R}^n , \qquad \frac{\partial\psi_i}{\partial\nu} = 0 \ \text{ on } \partial\Omega .$$
Due to (25) the dominating subspace $Y_\varepsilon$ is generated by the eigenfunctions $\psi_i$ for which $i \in I_\varepsilon$, and the inequalities in (26) together with the abbreviations $\eta := \gamma_1/\gamma_2$ and $\bar\kappa(\varepsilon) := \kappa_\varepsilon^{\max}/\gamma_2$ imply that
$$0 < \eta \le \frac{\bar\kappa(\varepsilon)}{\kappa_i} \le 1 \quad\text{for all}\quad i \in I_\varepsilon . \tag{29}$$
The definition of $\eta$ readily implies that it is independent of $\varepsilon$ and close to 1. The latter statement follows from the fact that $\eta$ can be calculated explicitly from $\gamma^-$, which is also close to 1. The estimate (29) shows that although the dimension of the dominating subspace $Y_\varepsilon$ grows without bounds as $\varepsilon \to 0$, this space is generated by eigenfunctions $\psi_i$ which correspond to eigenvalues having “almost” the same value.

As a first step towards establishing a notion of wavelength of the order $O(\varepsilon)$ for functions in $Y_\varepsilon$ the following proposition reduces the problem to a problem on a one-dimensional domain.

Proposition 4.3. Let $\psi = \sum_{i \in I_\varepsilon} \beta_i \psi_i$, $\beta_i \in \mathbb{R}$, denote an arbitrary element of the dominating subspace $Y_\varepsilon$. Furthermore, suppose that for some $x_0 \in \Omega$ and $r > 0$ we


have $\psi > 0$ in $B_r(x_0) \subset \Omega \subset \mathbb{R}^n$, $n \in \mathbb{N}$, i.e., assume that $B_r(x_0)$ is contained in a nodal domain of $\psi$. Then
$$\tilde\psi(t) := \sum_{i \in I_\varepsilon} d_i \cdot v_i(t) > 0 \quad\text{for all}\quad t \in [0, r) , \tag{30}$$
where $d_i = d_i(x_0) := \beta_i \cdot \psi_i(x_0)$, and $v = v_i$ denotes the unique solution of the problem
$$v'' + \frac{n-1}{t} \cdot v' + \kappa_i v = 0 \ \text{ for } t \in (0, \infty) , \qquad v(0) = 1 , \quad v'(0) = 0 ,$$
for every $i \in I_\varepsilon$.

Proof. Define $\bar\psi(x) := \psi(x + x_0)$ for $x \in B_r(0) \subset \mathbb{R}^n$ and (using the Haar integral, cf. [25]; $O(n)$ denotes the orthogonal group)
$$\hat\psi(x) := \frac{1}{|O(n)|} \cdot \int_{O(n)} \bar\psi(g^{-1}x) \, dg \quad\text{for all}\quad x \in B_r(0) . \tag{31}$$
Obviously, $\hat\psi$ is radially symmetric, so it actually depends only on the Euclidean norm $|x|$, and we may define a function $\tilde\psi : [0, r) \to \mathbb{R}$ by $\tilde\psi(|x|) := \hat\psi(x)$, for arbitrary $x \in B_r(0)$. Due to $\bar\psi > 0$ on $B_r(0)$ we have $\tilde\psi(t) > 0$ for all $t \in [0, r)$. Similarly, we may symmetrize the eigenfunctions $\psi_i$ separately. As above this furnishes radially symmetric functions $\hat\psi_i : B_r(0) \to \mathbb{R}$, as well as corresponding real functions $\tilde\psi_i : [0, r) \to \mathbb{R}$ defined by $\tilde\psi_i(|x|) := \hat\psi_i(x)$. Furthermore, the radially symmetric functions $\hat\psi_i$ are still eigenfunctions of $-\Delta$ on $B_r(0)$ (in the sense that they satisfy the differential equation and no boundary conditions are specified). Thus, $z = \tilde\psi_i(\cdot)$ satisfies the equation
$$z'' + \frac{n-1}{t} \cdot z' + \kappa_i z = 0 \quad\text{for all}\quad t \in (0, r) ,$$
together with $z(0) = \psi_i(x_0)$ and $z'(0) = 0$. This implies $\tilde\psi_i = \psi_i(x_0) \cdot v_i$, and the linearity of the integral in the symmetrization procedure of (31) immediately furnishes the representation of $\tilde\psi$ used in (30).

In this subsection we want to show that if $B_r(x_0)$ is contained in a nodal domain of a function $\psi \in Y_\varepsilon$, then $r$ is bounded above by some number of the order $O(\varepsilon)$ as $\varepsilon \to 0$. The above proposition allows us to do this in terms of the positive real function $\tilde\psi : [0, r) \to \mathbb{R}$. As the next step towards this goal we need a different representation of $\tilde\psi$. Let $w$ denote the unique solution of
$$w'' + \frac{n-1}{t} \cdot w' + w = 0 \ \text{ for } t \in (0, \infty) , \qquad w(0) = 1 , \quad w'(0) = 0 .$$
It is well-known (cf. for example [35]) that
$$w(t) = B_n(t) = t^{-\frac{n-2}{2}} \cdot J_{\frac{n-2}{2}}(t) \quad\text{for}\quad t \ge 0 ,$$
where $J_\nu(t)$ denotes the $\nu$-th Bessel function of the first kind, normalized in such a way that $t^{-\nu} J_\nu(t) \to 1$ as $t \to 0$. For example, for $n = 1, 2, 3$ one obtains
$$B_1(t) = \cos(t) , \qquad B_2(t) = J_0(t) , \qquad B_3(t) = \frac{\sin t}{t} .$$
Using a simple rescaling argument it can be shown that the functions $v_i$ introduced in Proposition 4.3 are explicitly given by $v_i(t) = B_n(\sqrt{\kappa_i} \cdot t)$. Therefore, the inequality $\tilde\psi(t) > 0$ for all $t \in [0, r)$ (cf. (30)) is equivalent to
$$\sum_{i \in I_\varepsilon} d_i \cdot B_n\!\left(\sqrt{\kappa_i} \cdot t\right) > 0 \quad\text{for all}\quad t \in [0, r) . \tag{32}$$
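The functions $B_n$ for $n = 1, 2, 3$ are easy to evaluate, and their first positive zeros give the radii appearing in Lemma 4.1: since the first Dirichlet eigenfunction of $-\Delta$ on the unit ball is radial and of the form $B_n(\sqrt{\kappa}\,|x|)$, one has $\kappa_1^{(n)} = z_n^2$ with $z_n$ the first positive zero of $B_n$ (this identification is a remark of ours, not a statement from the text). The sketch below uses the power series of $J_0$ and plain bisection; all function names are hypothetical.

```python
import math


def bessel_j0(t, terms=60):
    """J_0(t) via its power series sum_m (-1)^m (t/2)^(2m) / (m!)^2 (moderate |t| only)."""
    q = (t / 2.0) ** 2
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -q / ((m + 1) ** 2)
    return total


def B(n, t):
    """B_1 = cos, B_2 = J_0, B_3 = sin(t)/t, normalized so that B_n(0) = 1."""
    if n == 1:
        return math.cos(t)
    if n == 2:
        return bessel_j0(t)
    if n == 3:
        return math.sin(t) / t if t != 0.0 else 1.0
    raise ValueError("only n = 1, 2, 3 are covered in this sketch")


def v(n, kappa, t):
    """v_i(t) = B_n(sqrt(kappa_i) * t) as in Proposition 4.3."""
    return B(n, math.sqrt(kappa) * t)


def first_zero(n, lo=1e-9, hi=4.0, iters=100):
    """First positive zero of B_n by bisection; B_n > 0 at lo and B_n < 0 at hi = 4."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if B(n, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lemma_radius(n, kappa):
    """(kappa_1^(n) / kappa)^(1/2), using kappa_1^(n) = first_zero(n)^2."""
    return first_zero(n) / math.sqrt(kappa)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, first_zero(n), lemma_radius(n, 4.0))
```

The zeros are $\pi/2$, approximately $2.4048$, and $\pi$ for $n = 1, 2, 3$, so the bound in Lemma 4.1 differs from the physical wavelength $2\pi/\sqrt{\kappa}$ only by a dimension-dependent constant factor.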

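We believe that functions of this form tend to oscillate.

Conjecture 4.4. Let $n \in \{1, 2, 3\}$. Then there exists an $\eta_0 < 1$ close to 1 such that the following holds. If $\eta \in (\eta_0, 1)$, if the index set $I_\varepsilon \subset \mathbb{N}$ is such that the set $\{\kappa_i\}_{i \in I_\varepsilon}$ satisfies (29), and if the $d_i \in \mathbb{R}$ are arbitrary, then the validity of (32) necessarily implies
$$r \le \frac{2\pi}{\sqrt{\bar\kappa(\varepsilon)}} \tag{33}$$
for sufficiently small $\varepsilon > 0$. In other words, $r$ is of the order $O(1/\sqrt{\kappa_\varepsilon^{\max}})$ as $\varepsilon \to 0$.

If this conjecture were true, Proposition 4.3 and the estimate (33) would imply that if $\psi \in Y_\varepsilon$ is arbitrary and if the ball $B_r(x_0) \subset \Omega$ is contained in a nodal domain of $\psi$, then necessarily
$$r \le \frac{2\pi}{\sqrt{\bar\kappa(\varepsilon)}} = \frac{2\pi\sqrt{2\gamma_2}}{\sqrt{f'(\mu)}} \cdot \varepsilon .$$
This would be analogous to Lemma 4.1(a) and (28), yet this time it would hold for arbitrary functions in $Y_\varepsilon$ rather than only for eigenfunctions. Although we cannot prove Conjecture 4.4 in full generality, we will establish its validity under certain additional conditions on the coefficients $d_i = \beta_i \cdot \psi_i(x_0)$, where $i \in I_\varepsilon$. For this, let us introduce new variables
$$\tilde t = t \cdot \sqrt{\bar\kappa(\varepsilon)} \quad\text{and}\quad \tilde r = r \cdot \sqrt{\bar\kappa(\varepsilon)} .$$
In these rescaled variables (32) is equivalent to
$$\sum_{i \in I_\varepsilon} d_i \cdot B_n\!\left(\sqrt{\frac{\kappa_i}{\bar\kappa(\varepsilon)}} \cdot \tilde t\right) > 0 \quad\text{for all}\quad \tilde t \in [0, \tilde r) . \tag{34}$$
In this linear combination of rescaled versions of $B_n$ all the scaling factors are close to 1, more precisely we have
$$\sqrt{\frac{\kappa_i}{\bar\kappa(\varepsilon)}} \in \left[1, \frac{1}{\sqrt{\eta}}\right] .$$
In order to emphasize this fact we introduce new variables
$$\alpha(\varepsilon) = (\alpha_i(\varepsilon))_{i \in I_\varepsilon} , \quad \alpha_i(\varepsilon) \ge 0 , \quad\text{and}\quad s(\varepsilon) \in \left[0, \frac{1}{\sqrt{\eta}} - 1\right]$$
such that
$$\sqrt{\frac{\kappa_i}{\bar\kappa(\varepsilon)}} = 1 + s(\varepsilon) \cdot \alpha_i(\varepsilon) \quad\text{for all } i \in I_\varepsilon ,$$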


as well as $\|\alpha(\varepsilon)\|_\infty = 1$. Thus, if we take $I = I_\varepsilon \subset \mathbb{N}$, $s_n = 1/\sqrt{\eta} - 1$, and drop both the tildes and $\varepsilon$’s, (34) leads to studying the following problem.

Problem 4.5 (Normalized Oscillation Problem). Let $I \subset \mathbb{N}$ be an arbitrary index set, let $s_n > 0$, let $(d_i)_{i \in I} \in \mathbb{R}^I$, and choose an $\alpha \in \mathbb{R}^I$ such that
$$\alpha_i \ge 0 \ \text{ for all } i \in I , \quad\text{and}\quad \|\alpha\|_\infty = 1 . \tag{35}$$
Now fix an arbitrary $s \in [0, s_n]$ and consider for fixed $n \in \mathbb{N}$ the function
$$v_s(t) := \sum_{i \in I} d_i \cdot B_n((1 + s \cdot \alpha_i) \cdot t) \quad\text{for } t \in [0, r) . \tag{36}$$
Problem: Is there a constant $r_0 > 0$ such that the following holds? If for any choice of the parameters $I$, $(d_i)_{i \in I}$, $\alpha$, and $s$ (which is possible in the above sense) we have $v_s(t) > 0$ for all $t \in (0, r)$, then automatically $r \le r_0$.

Obviously, Problem 4.5 reduces to Conjecture 4.4 if we use the notation introduced above and replace $r_0$ by $2\pi$. The main advantage of the formulation of Problem 4.5 is that the $\varepsilon$-dependence has disappeared. It is clear that for $s = 0$ the function $v_0(t) = \sum_{i \in I} d_i \cdot B_n(t)$ always has zeros in the interval $[0, 2\pi]$. But is it possible to retain this statement for all $s \in [0, s_n]$, for some $s_n > 0$? Also, the above problem has a positive answer for the special cases $|I| = 1$ and $|I| = 2$, provided $s_n > 0$ is chosen appropriately. This follows easily from the intermediate value theorem. However, the application to the Cahn–Hilliard equation we have in mind makes it necessary to study Problem 4.5 for arbitrarily large values of $|I|$.

The following result will give a partial answer to the normalized oscillation problem. In the theorem we use the notation
$$d_i^+ = \max\{0, d_i\} \ge 0 , \qquad d_i^- = \max\{0, -d_i\} \ge 0 ,$$
so that $d_i = d_i^+ - d_i^-$.

Theorem 4.6. Let $n \in \{1, 2, 3\}$ be given and choose a small constant $s_0 \in (0, s_n]$, where $s_n > 0$ will be determined in the proof. Furthermore, let $I \subset \mathbb{N}$ denote an arbitrary index set. Then there exists a positive constant $\delta > 0$ (depending on $s_0$ and approaching 0 as $s_0 \to 0$) such that for any choice of the coefficients $(d_i)_{i \in I} \ne 0$ satisfying
$$\infty > \sum_{i \in I} d_i^+ \ge (1 + \delta) \cdot \sum_{i \in I} d_i^- \ge 0 , \tag{37}$$
and for arbitrary $\alpha \in \mathbb{R}^I$ satisfying (35) the functions $v_s$, $s \in [0, s_0]$, defined by
$$v_s(t) = \sum_{i \in I} d_i \cdot B_n((1 + s \cdot \alpha_i) \cdot t) , \quad t \ge 0 ,$$
have a sign change at some point $t_s \in [0, 2\pi]$.


Proof. Without loss of generality we assume $\sum_{i \in I} d_i^+ = 1$. Letting
\[ w_s^\pm(t) := \sum_{i \in I} d_i^\pm \cdot B_n((1 + s \cdot \alpha_i) \cdot t) , \]
we have $v_s(t) = w_s^+(t) - w_s^-(t)$. Furthermore, for arbitrary $s \in [0, s_0]$ and $t \ge 0$ we define
\[ B_n^{\min,s_0}(t) := \min_{\xi \in [0, s_0]} B_n((1 + \xi) \cdot t) \le B_n((1 + s \cdot \alpha_i) \cdot t) \le \max_{\xi \in [0, s_0]} B_n((1 + \xi) \cdot t) =: B_n^{\max,s_0}(t) . \]
Due to $\sum_{i \in I} d_i^+ = 1$ this yields
\[ B_n^{\min,s_0}(t) \le w_s^+(t) \le B_n^{\max,s_0}(t) , \tag{38} \]
and we also have
\[ \sum_{i \in I} d_i^- \cdot B_n^{\min,s_0}(t) \le w_s^-(t) \le \sum_{i \in I} d_i^- \cdot B_n^{\max,s_0}(t) , \tag{39} \]
for all $t \ge 0$ and $s \in [0, s_0]$. Since $v_s(0) = w_s^+(0) - w_s^-(0) = 1 - \sum_{i \in I} d_i^- > 0$, we only have to find some $t_n \in (0, 2\pi)$ with $v_s(t_n) < 0$ in order to prove the theorem. Consider

$t_n$ := first positive local minimum of $B_n$.

Then obviously we have $0 < t_n < 2\pi$ and $B_n(t_n) < 0$ for all $n \in \{1, 2, 3\}$. For our following arguments we further need both $B_n^{\min,s_0}(t_n) < 0$ and $B_n^{\max,s_0}(t_n) < 0$. But these inequalities are certainly satisfied for all $s_0 \in (0, s_n]$, for some sufficiently small $s_n > 0$. Now the validity of (37) furnishes together with (38) and (39) for all $s \in [0, s_0]$ the estimate
\[ v_s(t_n) = w_s^+(t_n) - w_s^-(t_n) \le B_n^{\max,s_0}(t_n) - \sum_{i \in I} d_i^- \cdot B_n^{\min,s_0}(t_n) \le -B_n^{\min,s_0}(t_n) \cdot \left( \frac{1}{1 + \delta} - \frac{B_n^{\max,s_0}(t_n)}{B_n^{\min,s_0}(t_n)} \right) . \]
Because of $B_n^{\max,s_0}(t_n) / B_n^{\min,s_0}(t_n) \in (0, 1)$ we can choose $\delta > 0$ in such a way that always $v_s(t_n) < 0$. More precisely, any $\delta > B_n^{\min,s_0}(t_n) / B_n^{\max,s_0}(t_n) - 1$ will do. Furthermore,
\[ \| B_n^{\max,s_0} - B_n \|_{C[0, 2\pi]} \to 0 \qquad \text{and} \qquad \| B_n^{\min,s_0} - B_n \|_{C[0, 2\pi]} \to 0 \qquad \text{as } s_0 \to 0 \]
imply that $\delta = \delta(s_0)$ may be chosen such that $\delta(s_0) \to 0$ as $s_0 \to 0$.
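The quantities appearing in the proof can be evaluated numerically in a simple case. The following sketch assumes $n = 1$ and $B_1(t) = \cos t$ (the definition of $B_n$ is not restated in this section, so this form is an assumption made only for illustration); the function names and the sample coefficients are hypothetical. It approximates the threshold $B_n^{\min,s_0}(t_n)/B_n^{\max,s_0}(t_n) - 1$ for $\delta$ on a grid and checks that $v_s(0) > 0 > v_s(t_n)$ for coefficients satisfying (37).

```python
import math

def delta_threshold(B, t_n, s0, samples=2001):
    """Grid approximation of B^{min,s0}(t_n) / B^{max,s0}(t_n) - 1."""
    values = [B((1.0 + s0 * k / (samples - 1)) * t_n) for k in range(samples)]
    b_min, b_max = min(values), max(values)
    if b_max >= 0:
        raise ValueError("s0 too large: B^{max,s0}(t_n) must be negative")
    return b_min / b_max - 1.0

def v(B, d, alpha, s, t):
    """v_s(t) = sum_i d_i * B((1 + s * alpha_i) * t)."""
    return sum(di * B((1.0 + s * ai) * t) for di, ai in zip(d, alpha))

B1 = math.cos          # assumed form of B_1 (illustration only)
t1 = math.pi           # first positive local minimum of cos
s0 = 0.1
threshold = delta_threshold(B1, t1, s0)
delta = 1.01 * threshold   # any delta above the threshold is admissible

# Hypothetical coefficients: sum of d_i^+ is 1, sum of d_i^- is 0.8.
d = [0.6, 0.4, -0.5, -0.3]
alpha = [0.0, 1.0, 0.3, 1.0]   # alpha_i >= 0 and max alpha_i = 1, as in (35)
d_plus = sum(x for x in d if x > 0)
d_minus = sum(-x for x in d if x < 0)
condition_37 = d_plus >= (1.0 + delta) * d_minus

# v_s(t_1) on a grid of s in [0, s0]; the proof predicts negative values.
values_at_t1 = [v(B1, d, alpha, s0 * k / 50, t1) for k in range(51)]
print(threshold, condition_37, max(values_at_t1))
```

For this choice of $B_1$ the minimum over $\xi$ is attained at $\xi = 0$ and the maximum at $\xi = s_0$, so the threshold equals $1/\cos(\pi s_0) - 1$, which tends to 0 as $s_0 \to 0$.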

Remark 4.7. If some of the αi , i ∈ I, are equal, then condition (37) may easily be improved by adding those di which correspond to equal αi ’s, and then replacing the di by this sum. A similar improvement is possible in both (40) and (41) below. Theorem 4.6 partially answers the normalized oscillation Problem 4.5, yet under the additional condition (37). It is not clear to us at the moment whether or not a condition like that is actually necessary for the result to hold. The next result applies the above one to functions in the dominating subspace Yε .


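Theorem 4.8. Let $\psi \in Y_\varepsilon$ be arbitrary, i.e., assume that $\psi = \sum_{i \in I_\varepsilon} \beta_i \psi_i$, with coefficients $\beta_i \in \mathbb{R}$. Here the functions $\psi_i$ are the $L^2(\Omega)$-orthonormal Neumann eigenfunctions of $-\Delta$ on $\Omega \subset \mathbb{R}^n$, $n \in \{1, 2, 3\}$, with corresponding eigenvalues $\kappa_i$ satisfying (29). Then there exists a constant $\delta = \delta(\eta) > 0$ depending on $0 \ll \eta < 1$ (cf. (29)), which converges to 0 as $\eta \to 1$, such that the following holds. For any point $x_0 \in \Omega$ at which either the angle between the vectors $\beta := (\beta_i)_{i \in I_\varepsilon}$ and $\psi_{x_0} := (\psi_i(x_0))_{i \in I_\varepsilon}$ satisfies
\[ | \cos \angle(\beta, \psi_{x_0}) | \ge \delta > 0 , \tag{40} \]
or
\[ \left| \sum_{i \in I_\varepsilon} \beta_i \psi_i(x_0) \right| \ge \delta \cdot \min \left\{ - \sum_{i \in I_\varepsilon^-(x_0)} \beta_i \psi_i(x_0) , \ \sum_{i \in I_\varepsilon^+(x_0)} \beta_i \psi_i(x_0) \right\} , \tag{41} \]
we have that for any ball $B_r(x_0)$ which is contained in a nodal domain of $\psi$ its radius $r$ necessarily satisfies the inequality
\[ r \le \frac{2\pi}{\sqrt{\bar\kappa(\varepsilon)}} = \frac{2\pi \sqrt{2\gamma_2}}{\sqrt{f'(\mu)}} \cdot \varepsilon . \]
Here $I_\varepsilon^+(x_0)$ and $I_\varepsilon^-(x_0)$ denote the subsets of $I_\varepsilon$ on which the estimates $\beta_i \psi_i(x_0) > 0$ or $\beta_i \psi_i(x_0) < 0$ hold, respectively.

Proof. According to Proposition 4.3 and the following reduction to the normalized oscillation problem (cf. (32), (34), and Problem 4.5), we only have to show that Theorem 4.6 is applicable with $I = I_\varepsilon$ and $s_0 = 1/\sqrt{\eta} - 1$. To begin with, assume that condition (41) is satisfied. Letting $d_i := \beta_i \psi_i(x_0)$ as in Proposition 4.3 it can easily be verified that condition (37) of Theorem 4.6 is satisfied for either $\psi$ or $-\psi$, and therefore an application of Theorem 4.6 immediately furnishes the upper bound on $r$.

Now assume that condition (40) is satisfied. As above we only have to verify the validity of (37). By assumption the estimate
\[ | \cos \angle(\beta, \psi_{x_0}) | = \frac{ | (\beta, \psi_{x_0})_{\mathbb{R}^{I_\varepsilon}} | }{ \|\beta\|_{\mathbb{R}^{I_\varepsilon}} \cdot \|\psi_{x_0}\|_{\mathbb{R}^{I_\varepsilon}} } \ge \delta > 0 \tag{42} \]
holds. Now assume without loss of generality that $\|\beta\|_{\mathbb{R}^{I_\varepsilon}} = 1$. Then
\[ (\beta, \psi_{x_0})_{\mathbb{R}^{I_\varepsilon}} = \sum_{i \in I_\varepsilon^+(x_0)} \beta_i \psi_i(x_0) - \sum_{i \in I_\varepsilon^-(x_0)} (-\beta_i \psi_i(x_0)) =: \Sigma_\beta^+ - \Sigma_\beta^- , \tag{43} \]
where both $\Sigma_\beta^+$ and $\Sigma_\beta^-$ are nonnegative. Again without loss of generality we assume further that the inequality $\Sigma_\beta^+ > \Sigma_\beta^-$ holds, otherwise we consider $-\psi$ instead of $\psi$. Due to $\Sigma_\beta^+ = \sum_{i \in I_\varepsilon} d_i^+$ and $\Sigma_\beta^- = \sum_{i \in I_\varepsilon} d_i^-$ it remains to verify $\Sigma_\beta^+ \ge (1 + \delta) \cdot \Sigma_\beta^-$. This is trivially satisfied if $\Sigma_\beta^- = 0$. If on the other hand we have $\Sigma_\beta^- \ne 0$, then first (42) and (43) imply $\Sigma_\beta^+ - \Sigma_\beta^- = (\beta, \psi_{x_0})_{\mathbb{R}^{I_\varepsilon}} \ge \delta \cdot \|\psi_{x_0}\|_{\mathbb{R}^{I_\varepsilon}}$, and this in turn yields
\[ \Sigma_\beta^+ \ge \delta \cdot \|\psi_{x_0}\|_{\mathbb{R}^{I_\varepsilon}} + \Sigma_\beta^- = \Sigma_\beta^- \cdot \left( 1 + \frac{ \delta \cdot \|\psi_{x_0}\|_{\mathbb{R}^{I_\varepsilon}} }{ \Sigma_\beta^- } \right) . \]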


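So in order to finish the proof we only have to establish the estimate $\Sigma_\beta^- \le \|\psi_{x_0}\|_{\mathbb{R}^{I_\varepsilon}}$. Let $P : \mathbb{R}^{I_\varepsilon} \to \mathbb{R}^{I_\varepsilon^-(x_0)} \subset \mathbb{R}^{I_\varepsilon}$ denote the orthogonal projection. Then
\[ \Sigma_\beta^- = -(P(\beta), P(\psi_{x_0}))_{\mathbb{R}^{I_\varepsilon}} \le | (P(\beta), P(\psi_{x_0}))_{\mathbb{R}^{I_\varepsilon}} | \le \|P(\beta)\|_{\mathbb{R}^{I_\varepsilon}} \cdot \|P(\psi_{x_0})\|_{\mathbb{R}^{I_\varepsilon}} \le \|\psi_{x_0}\|_{\mathbb{R}^{I_\varepsilon}} , \]
since $\|\beta\|_{\mathbb{R}^{I_\varepsilon}} = 1$. This completes the proof.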
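The last two steps of the proof are elementary inequalities for finite vectors and can be spot-checked numerically. The sketch below is only an illustration: the helper names, the random test vectors, and the value $\delta = 0.1$ are hypothetical. It normalizes $\beta$, flips its sign when the inner product is negative (the passage from $\psi$ to $-\psi$), and then checks both $\Sigma_\beta^- \le \|\psi_{x_0}\|$ and, whenever the cosine is at least $\delta$, the inequality $\Sigma_\beta^+ \ge (1+\delta) \cdot \Sigma_\beta^-$.

```python
import math
import random

def sigma_plus_minus(beta, psi):
    """Return (Sigma^+, Sigma^-) for the products beta_i * psi_i."""
    products = [b * p for b, p in zip(beta, psi)]
    plus = sum(x for x in products if x > 0)
    minus = sum(-x for x in products if x < 0)
    return plus, minus

def check_step(beta, psi, delta, tol=1e-12):
    """Check the two inequalities used in the proof for one pair of vectors."""
    norm_beta = math.sqrt(sum(b * b for b in beta))
    beta = [b / norm_beta for b in beta]
    if sum(b * p for b, p in zip(beta, psi)) < 0:
        beta = [-b for b in beta]      # corresponds to replacing psi by -psi
    plus, minus = sigma_plus_minus(beta, psi)
    norm_psi = math.sqrt(sum(p * p for p in psi))
    assert minus <= norm_psi + tol     # Cauchy-Schwarz after projection
    cosine = (plus - minus) / norm_psi
    if cosine >= delta:
        assert plus >= (1.0 + delta) * minus - tol   # condition (37)
    return plus, minus, cosine

rng = random.Random(0)                 # fixed seed for reproducibility
hits = 0
for _ in range(1000):
    size = rng.randint(2, 8)
    beta = [rng.uniform(-1, 1) for _ in range(size)]
    psi = [rng.uniform(-1, 1) for _ in range(size)]
    _, _, cosine = check_step(beta, psi, delta=0.1)
    hits += cosine >= 0.1
print("pairs with cosine >= delta:", hits)
```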

Remark 4.9. We already mentioned that Theorem 4.8 is not quite the result we wanted. But improvements should be possible. For example, a generalization of Theorem 4.8 could assume that either (40) or (41) hold for only one $x_0 \in B_{r/m}(y_0)$ for some $m \in \mathbb{N}$ with $m \ge 2$. Then if $B_r(y_0)$ is contained in a nodal domain of $\psi$ we get
\[ r \le \frac{m}{m - 1} \cdot \frac{2\pi}{\sqrt{\bar\kappa(\varepsilon)}} . \]

Remark 4.10. The proofs of this subsection can be modified to also establish the following lower bound for $\psi \in Y_\varepsilon$: Suppose that $x_0 \in \Omega$ is such that either (40) or (41) holds. Furthermore, assume that both $\psi(x_0) > 0$ and $\int_{\partial B_R(x_0)} \psi \, dS < 0$ are satisfied. Then the estimate $R \ge \pi / (4 \sqrt{\bar\kappa(\varepsilon)})$ holds. In particular, this can be used to show that nodal domains of $\psi$ around some $x_0 \in \Omega$ cannot be too small.

4.3. Discussion of the results. In the last subsection we derived our main result on the wavelength of functions in the dominating subspace $Y_\varepsilon$, yet only under either assumption (40) or (41). We will close this paper by discussing these two assumptions to some extent below, and by relating the results of Theorem 4.8 to the wavelength of spinodally decomposed states.

Let us begin with the first assumption (40) of Theorem 4.8. Given some $x_0 \in \Omega$, this condition states that the angle between the vector $\beta \in \mathbb{R}^{I_\varepsilon}$ and the vector $\psi_{x_0}$ is not close to a right angle, i.e., its cosine is at least $\delta$. (Recall that $\beta$ consists of the $L^2(\Omega)$-Fourier coefficients of the function $\psi$, whereas $\psi_{x_0}$ consists of the function values of the eigenfunctions in this Fourier series evaluated at $x_0$.) Since the constant $\delta > 0$ will be very small in general, the assumption (40) might appear to be a minor one, yet in fact it is not. A calculation similar to the one in the proof of Theorem 3.4 implies
\[ \frac{ \mathrm{vol}_S^{(|I_\varepsilon|)} \left( \{ \beta \in \partial B_1(0) \subset \mathbb{R}^{I_\varepsilon} : \beta \text{ satisfies (40)} \} \right) }{ \mathrm{vol}_S^{(|I_\varepsilon|)} ( \partial B_1(0) ) } \to 0 \qquad \text{as } \varepsilon \to 0 . \]
Fortunately, the second condition (41) seems to be satisfied "on average", at least from a heuristic point of view. Consider again $\psi = \sum_{i \in I_\varepsilon} \beta_i \psi_i$. If $D \subset \Omega$ is a subdomain and $I \subset I_\varepsilon$, then "on average" we will have approximately
\[ \frac{1}{|D|} \cdot \int_D \left( \sum_{i \in I} \beta_i \psi_i(x) \right)^2 dx \approx \frac{1}{|\Omega|} \cdot \int_\Omega \left( \sum_{i \in I} \beta_i \psi_i(x) \right)^2 dx = \frac{1}{|\Omega|} \cdot \sum_{i \in I} \beta_i^2 . \tag{44} \]
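The exact equality on the right of (44) is just orthonormality, while the approximate equality on the left is heuristic. The following sketch illustrates both in one space dimension, assuming $\Omega = (0, 1)$ with the Neumann eigenfunctions $\psi_0 = 1$ and $\psi_k(x) = \sqrt{2} \cos(k \pi x)$; the domain, the coefficients, and the function names are hypothetical choices made for illustration, not taken from the paper.

```python
import math

def psi(k, x):
    """L^2(0,1)-orthonormal Neumann eigenfunctions of -d^2/dx^2 on (0, 1)."""
    return 1.0 if k == 0 else math.sqrt(2.0) * math.cos(k * math.pi * x)

def mean_square(beta, a, b, n=20000):
    """Midpoint-rule value of (1/|D|) * int_D (sum_k beta_k psi_k)^2 dx, D = (a, b)."""
    h = (b - a) / n
    total = 0.0
    for j in range(n):
        x = a + (j + 0.5) * h
        value = sum(bk * psi(k, x) for k, bk in beta.items())
        total += value * value
    return total * h / (b - a)

# Hypothetical coefficients beta_k, indexed by the eigenfunction number k.
beta = {3: 0.5, 4: -1.0, 5: 0.25}

whole = mean_square(beta, 0.0, 1.0)          # average over Omega, |Omega| = 1
sum_sq = sum(b * b for b in beta.values())   # right-hand side of (44)
part = mean_square(beta, 0.2, 0.5)           # average over a subdomain D

print(whole, sum_sq, part)
```

The first two printed numbers agree up to rounding. The third need not agree exactly with them, which is why (44) is only claimed "on average".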

Now assume that (41) does not hold for any point $x_0 \in B_{r/2}(y_0)$ (cf. Remark 4.9, with $m = 2$). Then integration furnishes
\[ \int_{B_{r/2}(y_0)} \left( \sum_{i \in I_\varepsilon} \beta_i \psi_i(x) \right)^2 dx \le \delta^2 \cdot \max \left\{ \int_{B_{r/2}(y_0)} \left( \sum_{i \in I_\varepsilon^-(x)} \beta_i \psi_i(x) \right)^2 dx , \ \int_{B_{r/2}(y_0)} \left( \sum_{i \in I_\varepsilon^+(x)} \beta_i \psi_i(x) \right)^2 dx \right\} . \tag{45} \]

Decompose the ball $B_{r/2}(y_0)$ into pairwise disjoint subsets $K_1, \dots, K_J$ such that for all $j = 1, \dots, J$ and all $x, \tilde{x} \in K_j$ we have $I_\varepsilon^-(x) = I_\varepsilon^-(\tilde{x}) =: I_j^- \subset I_\varepsilon$. Applying the approximate identity (44) to each of these subsets one obtains
\[ \int_{B_{r/2}(y_0)} \left( \sum_{i \in I_\varepsilon^-(x)} \beta_i \psi_i(x) \right)^2 dx = \sum_{j=1}^{J} \int_{K_j} \left( \sum_{i \in I_j^-} \beta_i \psi_i(x) \right)^2 dx \approx \sum_{j=1}^{J} \frac{|K_j|}{|\Omega|} \cdot \left( \sum_{i \in I_j^-} \beta_i^2 \right) \le \frac{|B_{r/2}(y_0)|}{|\Omega|} \cdot \left( \sum_{i \in I_\varepsilon} \beta_i^2 \right) , \]
as well as an analogous estimate if the sum is taken over $i \in I_\varepsilon^+(x)$. Since the left-hand side of (45) is given approximately by $( \sum_{i \in I_\varepsilon} \beta_i^2 ) \cdot |B_{r/2}(y_0)| / |\Omega|$ this yields
\[ \frac{|B_{r/2}(y_0)|}{|\Omega|} \cdot \left( \sum_{i \in I_\varepsilon} \beta_i^2 \right) \le \delta^2 \cdot \frac{|B_{r/2}(y_0)|}{|\Omega|} \cdot \left( \sum_{i \in I_\varepsilon} \beta_i^2 \right) , \]
in contrast to $\delta \ll 1$. In other words, we do expect Theorem 4.8 to be applicable at "typical" points $x_0 \in \Omega$, thus implying an upper bound for the thickness of nodal domains of functions in $Y_\varepsilon$ which is of the order $O(\varepsilon)$ as $\varepsilon \to 0$.

Let us close this section by adding some remarks on the wavelength of spinodally decomposed states. It was demonstrated in Subsect. 2.2 that most orbits of the linearized Cahn–Hilliard equation originating near the origin leave a given neighborhood $U$ close to the dominating subspace $Y_\varepsilon$. Since in this context "closeness" is measured in the $L^2(\Omega)$-norm, the results of the last subsection concerning functions in $Y_\varepsilon$ do not automatically imply similar results for functions close by. Fortunately, this problem can be solved by employing different spaces. For example, if we restrict our attention to the subspace $H^2(\Omega) \subset L^2(\Omega)$ and use its topology, then one easily obtains analogous statements to the ones presented in Subsect. 2.2. Thus the space $Y_\varepsilon$ will still be dominating. But now any function which is $H^2(\Omega)$-close to some function $\psi$ in $Y_\varepsilon$ will also be close with respect to the $C(\bar\Omega)$-norm. Therefore its nodal domains will be similar to the ones of $\psi$, and the results of the last subsection on the wavelength will apply. It will turn out in [37] that in the nonlinear setting the space $H^2(\Omega)$ will be the natural space to work with, so the results of this subsection will automatically apply.

Acknowledgement. The authors thank Peter Bates, Norman Dancer, Paul Fife, Christopher Grant, Marco Holzmann, Hansjörg Kielhöfer, and Klaus Schmitt for useful discussions. The second author thanks the Center for Dynamical Systems and Nonlinear Equations and the DFG for giving him the opportunity to visit.


References

1. Alikakos, N.D., Bates, P.W. and Chen, X.: Convergence of the Cahn–Hilliard equation to the Hele-Shaw model. Arch. Rat. Mech. and Anal. 128, 165–205 (1994)
2. Alikakos, N.D., Bates, P.W. and Fusco, G.: Slow motion for the Cahn-Hilliard equation in one space dimension. J. Differ. Eq. 90, 81–135 (1991)
3. Alikakos, N.D., Bronsard, L. and Fusco, G.: Slow motion in the gradient theory of phase transitions via energy and spectrum. Calc. Var. Partial Differential Equations 6, 39–66 (1998)
4. Alikakos, N.D. and Fusco, G.: Equilibrium and dynamics of bubbles for the Cahn-Hilliard equation. In: International Conference on Differential Equations, Vol. I, Barcelona 1991, 1993, pp. 59–67
5. Alikakos, N.D. and Fusco, G.: Slow dynamics for the Cahn-Hilliard equation in higher space dimensions. Part I: Spectral estimates. Commun. Part. Differ. Eqs. 19, 1397–1447 (1994)
6. Alikakos, N.D. and Fusco, G.: Slow dynamics for the Cahn–Hilliard equation in higher space dimensions: The motion of bubbles. Arch. Rat. Mech. and Anal. 141, 1–61 (1998)
7. Babič, V.M. and Buldyrev, V.S.: Short-Wavelength Diffraction Theory. Berlin–Heidelberg–New York: Springer-Verlag, 1991
8. Bai, F., Elliott, C.M., Gardiner, A., Spence, A. and Stuart, A.M.: The viscous Cahn-Hilliard equation. Part I: Computations. Nonlinearity 8, 131–160 (1995)
9. Bates, P.W. and Xun, J.: Metastable patterns for the Cahn-Hilliard equation. I. J. Differ. Eq. 111, 421–457 (1994)
10. Bates, P.W. and Xun, J.: Metastable patterns for the Cahn-Hilliard equation. II. Layer dynamics and slow invariant manifold. J. Differ. Eq. 117, 165–216 (1995)
11. Bronsard, L. and Hilhorst, D.: On the slow dynamics for the Cahn-Hilliard equation in one space dimension. Proc. Royal Soc. Lond. Series A 439, 669–682 (1992)
12. Cahn, J.W.: Free energy of a nonuniform system. II. Thermodynamic basis. J. Chem. Phys. 30, 1121–1124 (1959)
13. Cahn, J.W.: Phase separation by spinodal decomposition in isotropic systems. J. Chem. Phys. 42, 93–99 (1965)
14. Cahn, J.W.: Spinodal decomposition. Trans. Metallurgical Soc. of AIME 242, 166–180 (1968)
15. Cahn, J.W. and Hilliard, J.E.: Free energy of a nonuniform system I. Interfacial free energy. J. Chem. Phys. 28, 258–267 (1958)
16. Courant, R. and Hilbert, D.: Methods of Mathematical Physics. New York: Interscience, 1953
17. Elder, K.R. and Desai, R.C.: Role of nonlinearities in off-critical quenches as described by the Cahn-Hilliard model of phase separation. Phys. Rev. B 40, 243–254 (1989)
18. Elder, K.R., Rogers, T.M. and Desai, R.C.: Early stages of spinodal decomposition for the Cahn-Hilliard-Cook model of phase separation. Phys. Rev. B 38, 4725–4739 (1988)
19. Elliott, C.M.: The Cahn-Hilliard model for the kinetics of phase separation. In: J. F. Rodrigues, editor, Mathematical Models for Phase Change Problems, Basel: Birkhäuser, 1989, pp. 35–73
20. Elliott, C.M. and French, D.A.: Numerical studies of the Cahn-Hilliard equation for phase separation. IMA J. Appl. Math. 38, 97–128 (1987)
21. Fife, P.C.: Pattern dynamics for parabolic PDE's. Preprint, 1989
22. Fife, P.C., Kielhöfer, H., Maier-Paape, S. and Wanner, T.: Perturbation of doubly periodic solution branches with applications to the Cahn-Hilliard equation. Physica D 100 (3–4), 257–278 (1997)
23. Gerthsen, C., Kneser, H.O., Vogel, H.: Physik. 15th edition, Berlin–Heidelberg–New York: Springer-Verlag, 1989
24. Gidas, B., Ni, W.-M. and Nirenberg, L.: Symmetry and related properties via the maximum principle. Commun. Math. Phys. 68, 209–243 (1979)
25. Golubitsky, M., Stewart, I. and Schaeffer, D.G.: Singularities and Groups in Bifurcation Theory, Volume II. New York–Berlin–Heidelberg: Springer-Verlag, 1988
26. Grant, C.P.: Spinodal decomposition for the Cahn-Hilliard equation. Commun. Part. Differ. Eqs. 18, 453–490 (1993)
27. Grant, C.P.: Slow motion in one-dimensional Cahn–Morral systems. SIAM J. Math. Anal. 26, 21–34 (1995)
28. Grinfeld, M. and Novick-Cohen, A.: Counting stationary solutions of the Cahn-Hilliard equation by transversality arguments. Proc. Royal Soc. of Edinburgh 125 A, 351–370 (1995)
29. Henry, D.: Geometric Theory of Semilinear Parabolic Equations. Lecture Notes in Mathematics, Vol. 840, Berlin–Heidelberg–New York: Springer-Verlag, 1981
30. Hunt, B.R., Sauer, T., and Yorke, J.A.: Prevalence: A translation-invariant "almost every" on infinite-dimensional spaces. Bull. Am. Math. Soc. 27, 217–238 (1992)
31. Hyde, J.M., Miller, M.K., Hetherington, M.G., Cerezo, A., Smith, G.D.W. and Elliott, C.M.: Spinodal decomposition in Fe-Cr alloys: Experimental study at the atomic level and comparison with computer models. Acta Metall. Mat. 43, 3385–3426 (1995)
32. Kalies, W.D., VanderVorst, R.C.A.M. and Wanner, T.: Slow motion in higher-order systems and Γ-convergence in one space dimension. Submitted for publication, 1997
33. Karlin, S.: Total Positivity. Stanford: Stanford University Press, 1968
34. Kielhöfer, H.: Pattern formation in the stationary Cahn-Hilliard model. Proc. Royal Soc. of Edinburgh 127 A, 1219–1243 (1997)
35. Lauterbach, R. and Maier, S.: Symmetry-breaking at non-positive solutions of semilinear elliptic equations. Arch. Rat. Mech. and Anal. 126 (4), 299–331 (1994)
36. Maier-Paape, S. and Wanner, T.: Solutions of nonlinear planar elliptic problems with triangle symmetry. J. Differ. Eq. 136 (1), 1–34 (1997)
37. Maier-Paape, S. and Wanner, T.: Spinodal decomposition for the Cahn-Hilliard equation in higher dimensions. Part II: Nonlinear dynamics. Submitted to Arch. Rat. Mech. Anal. (1997)
38. Pego, R.L.: Front migration in the nonlinear Cahn-Hilliard equation. Proc. Royal Soc. Lond. Series A 422, 261–278 (1989)
39. Riesz, F. and Sz.-Nagy, B.: Functional Analysis. New York: Dover Publications, 1990
40. Safarov, Y. and Vassiliev, D.: The Asymptotic Distribution of Eigenvalues of Partial Differential Operators. Translations of Mathematical Monographs Vol. 155, Providence, RI: Am. Math. Soc., 1997
41. Sandstede, B.: Interaction of pulses in dissipative systems. In preparation, 1998
42. Stoth, B.E.E.: Convergence of the Cahn-Hilliard equation to the Mullins-Sekerka problem in spherical symmetry. J. Differ. Eq. 125, 154–183 (1996)

Communicated by J. L. Lebowitz

Commun. Math. Phys. 195, 465 – 493 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

A Representation for Fermionic Correlation Functions

Joel Feldman 1,*, Horst Knörrer 2, Eugene Trubowitz 2

1 Department of Mathematics, University of British Columbia, Vancouver, B.C., Canada V6T 1Z2
2 Mathematik, ETH-Zentrum, CH-8092 Zürich, Switzerland

Received: 5 August 1997 / Accepted: 10 December 1997

* Research supported in part by the Natural Sciences and Engineering Research Council of Canada, the Schweizerischer Nationalfonds zur Förderung der wissenschaftlichen Forschung and the Forschungsinstitut für Mathematik, ETH Zürich

Abstract: Let $d\mu_S(a)$ be a Gaussian measure on the finitely generated Grassmann algebra $A$. Given an even $W(a) \in A$, we construct an operator $R$ on $A$ such that
\[ \frac{1}{Z} \int f(a) \, e^{W(a)} \, d\mu_S(a) = \int (\mathbb{1} - R)^{-1}(f) \, d\mu_S \]
for all $f(a) \in A$. This representation of the Schwinger functional iteratively builds up Feynman graphs by successively appending lines farther and farther from $f$. It allows the Pauli exclusion principle to be implemented quantitatively by a simple application of Gram's inequality.

1. Introduction

Let $A(a_1, \dots, a_n)$ be the finite dimensional, complex Grassmann algebra freely generated by $a_1, \dots, a_n$. Let $M_r = \{ (i_1, \dots, i_r) \mid 1 \le i_1, \dots, i_r \le n \}$ be the set of all multi indices of degree $r \ge 0$. For each multi index $I = (i_1, \dots, i_r)$, set $a_I = a_{i_1} \cdots a_{i_r}$. By convention, $a_\emptyset = 1$. Let
\[ \mathcal{I} = \bigcup_{r \ge 0} \{ (i_1, \dots, i_r) \mid 1 \le i_1 < \cdots < i_r \le n \} \]
be the family of all strictly increasing multi indices. The set of monomials $\{ a_I \mid I \in \mathcal{I} \}$ is a basis for $A(a_1, \dots, a_n)$. Let $S = (S_{ij})$ be a skew symmetric matrix of even order $n$. Recall that the Grassmann, Gaussian integral with covariance $S$ is the unique linear map


\[ f(a, b) \in A(a_1, \dots, a_n, b_1, \dots, b_n) \longmapsto \int f(a, b) \, d\mu_S(a) \in A(b_1, \dots, b_n) \]
satisfying
\[ \int e^{\Sigma a_i b_i} \, d\mu_S(a) = e^{-\frac{1}{2} \Sigma b_i S_{ij} b_j} . \]
To manipulate Grassmann, Gaussian integrals, we can "integrate by parts" with respect to the generator $a_k$, $k = 1, \dots, n$,
\[ \int a_k \, f(a_1, \dots, a_n) \, d\mu_S = \sum_{\ell=1}^{n} S_{k\ell} \int \frac{\partial}{\partial a_\ell} f(a_1, \dots, a_n) \, d\mu_S . \]
The left partial derivative
\[ \frac{\partial}{\partial a_\ell} f(a) = \sum_{I \in \mathcal{I}} f_I \, \frac{\partial}{\partial a_\ell} a_I \]
is determined by
\[ \frac{\partial}{\partial a_\ell} a_I = \begin{cases} 0 , & \ell \notin I \\ (-1)^{|J|} \, a_J a_K , & a_I = a_J a_\ell a_K . \end{cases} \]
Here, $|J|$ is the degree of $J$. Integrating by parts with respect to $a_{i_1}$ and then arguing by induction on $r$ we find
\[ \int a_I \, d\mu_S(a) = \mathrm{Pf}\, S_I \]
where, for any multi index $I = (i_1, \dots, i_r)$ in $M_r$, $S_I = (S_{i_k i_\ell})$ is the skew symmetric matrix with elements $S_{i_k i_\ell}$, $k, \ell = 1, \dots, r$, and where
\[ \mathrm{Pf}\, S_I = \frac{1}{2^s s!} \sum_{k_1, \dots, k_r = 1}^{r} \varepsilon^{k_1 \cdots k_r} S_{i_{k_1} i_{k_2}} \cdots S_{i_{k_{r-1}} i_{k_r}} \]
is its Pfaffian when $r = 2s$ is even. By convention, the Pfaffian of a skew symmetric matrix of odd order is zero. As usual,
\[ \varepsilon^{i_1 \cdots i_r} = \begin{cases} 1 , & i_1, \dots, i_r \text{ is an even permutation of } 1, \dots, r \\ -1 , & i_1, \dots, i_r \text{ is an odd permutation of } 1, \dots, r \\ 0 , & i_1, \dots, i_r \text{ are not all distinct.} \end{cases} \]
Let $W(a)$ be an element of the commutative subalgebra $A_0(a_1, \dots, a_n)$ of all "even" Grassmann polynomials in $A(a_1, \dots, a_n)$. That is,
\[ W(a) = \sum_{r \ge 0} \sum_{j_1, \dots, j_r} w_r(j_1, \dots, j_r) \, a_{j_1} \cdots a_{j_r} , \]
where $w_r(j_1, \dots, j_r)$ is an antisymmetric function of its arguments $1 \le j_1, \dots, j_r \le n$ that vanishes identically when $r$ is odd. By definition, the "Schwinger functional" $S(f)$ on $A(a_1, \dots, a_n)$ corresponding to the "interaction" $W(a)$ and the "propagator" $S = (S_{ij})$ is
\[ S(f) = \frac{1}{Z} \int f(a) \, e^{W(a)} \, d\mu_S(a) , \]


where $Z = \int e^{W(a)} \, d\mu_S$. The associated "correlation functions" $S_m(j_1, \dots, j_m)$, $m \ge 0$, are given by
\[ S_m(j_1, \dots, j_m) = \frac{1}{Z} \int a_{j_1} \cdots a_{j_m} \, e^{W(a)} \, d\mu_S(a) . \]
In this paper we introduce an operator $R$ on $A(a_1, \dots, a_n)$ such that
\[ S(f) = \int (\mathbb{1} - R)^{-1}(f) \, d\mu_S \]
holds for all $f$ in $A_0(a_1, \dots, a_n)$. The utility of this representation of the Schwinger functional is demonstrated in Sect. III, where an elementary, but archetypical, bound on the correlation functions $S_m(j_1, \dots, j_m)$, $m \ge 1$, is obtained by bounding the operator norm of $R$ in terms of a "naive power counting" norm on $W(a)$. The tools developed here will be used to simplify the rigorous construction of a class of two dimensional Fermi liquids outlined in [FKLT].

The representation of the Schwinger functional derived in this paper grew out of the "integration by parts expansion" of [FMRT, FKLT] which, in turn, was developed as a replacement for the traditional "cluster/Mayer expansion". In fact, the integration by parts expansion can be obtained by first expanding the inverse $(\mathbb{1} - R)^{-1}$ in a Neumann series and then selectively expanding, by repeated partial integrations, the Grassmann, Gaussian integrals appearing in the definition of $R$. Apart from its conciseness, the advantage of the representation of the Schwinger functional given in this paper lies in the fact that the Pauli exclusion principle can be implemented quantitatively by a simple application of Gram's inequality. This is in contrast to the "integration by parts expansion", where the Pauli exclusion principle is implemented by a more physical, but more complicated, approach that involves carefully counting the number of fields in position space cubes whose dimensions are matched to the decay of the free propagator. There is a large literature devoted to the mathematical structure of Fermionic field theories, for example [C, FMRS, GK, BW].

One more notion is required for the detailed formulation of our results. For this purpose, let $S^\sharp$ be the complex, skew symmetric matrix of order $2n$ given by
\[ S^\sharp = \begin{pmatrix} 0 & S \\ S & S \end{pmatrix} . \]
Also, for all multi indices $I$ and $J$ in $M$, let $S_{:I:J}$ be the skew symmetric matrix of order $|I| + |J|$ defined by
\[ S_{:I:J} = \begin{pmatrix} 0 & S_{I,J} \\ S_{J,I} & S_J \end{pmatrix} = S_{IJ} - \begin{pmatrix} S_I & 0 \\ 0 & 0 \end{pmatrix} . \]
Here, $IJ$ is the juxtaposition of $I$ and $J$ and $S_{I,J} = (S_{i_k j_\ell})$ is the matrix with elements $S_{i_k j_\ell}$, $k = 1, \dots, r$ and $\ell = 1, \dots, s$. Now, suppose $a^\sharp_I a_J$ is the monomial in the Grassmann algebra $A(a^\sharp_1, \dots, a^\sharp_n, a_1, \dots, a_n)$ corresponding to $I$ and $J$ in $M$. Then, by construction,
\[ \int a^\sharp_I \, a_J \, d\mu_{S^\sharp}(a^\sharp, a) = \mathrm{Pf}\, S_{:I:J} . \]
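The Pfaffian can be computed in two ways that mirror the text: from the permutation sum with the prefactor $1/(2^s s!)$, and from the recursion obtained by integrating by parts with respect to the first generator. The following sketch compares the two on a small integer matrix and also evaluates the Pfaffian of a matrix of the form $S_{:I:J}$, in which the $I$–$I$ block is set to zero. The function names and the sample matrix are hypothetical and serve only as an illustration (indices are 0-based in the code).

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def perm_sign(p):
    """Sign of the permutation p, computed from its number of inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def pfaffian_sum(S):
    """Pfaffian from the permutation sum with prefactor 1 / (2^s s!)."""
    r = len(S)
    if r % 2:
        return Fraction(0)
    s = r // 2
    total = 0
    for p in permutations(range(r)):
        term = perm_sign(p)
        for i in range(0, r, 2):
            term *= S[p[i]][p[i + 1]]
        total += term
    return Fraction(total, 2 ** s * factorial(s))

def pfaffian_by_parts(S, idx=None):
    """Pfaffian from the integration by parts recursion on the first index."""
    if idx is None:
        idx = tuple(range(len(S)))
    if not idx:
        return 1
    if len(idx) % 2:
        return 0
    first, rest = idx[0], idx[1:]
    total = 0
    for pos, ell in enumerate(rest):
        # Differentiating with respect to a_ell costs (-1)^pos, the degree
        # of the monomial standing to the left of a_ell.
        total += (-1) ** pos * S[first][ell] * pfaffian_by_parts(
            S, rest[:pos] + rest[pos + 1:])
    return total

# Hypothetical skew symmetric covariance of order 4.
S = [[0, 1, 2, 3],
     [-1, 0, 4, 5],
     [-2, -4, 0, 6],
     [-3, -5, -6, 0]]

# Matrix of the form S_{:I:J} with I = (0, 1), J = (2, 3): the I-I block is zero.
S_blocked = [row[:] for row in S]
S_blocked[0][1] = S_blocked[1][0] = 0

print(pfaffian_sum(S), pfaffian_by_parts(S), pfaffian_by_parts(S_blocked))
```

For this matrix the full Pfaffian is $1 \cdot 6 - 2 \cdot 5 + 3 \cdot 4 = 8$, while zeroing the $I$–$I$ block removes the term $S_{01} S_{23}$ and leaves 2.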


In other words, the Grassmann, Gaussian integral with covariance $S^\sharp$ excludes "contractions" between any pair of the generators $a^\sharp_1, \dots, a^\sharp_n$ in $A(a^\sharp_1, \dots, a^\sharp_n, a_1, \dots, a_n)$.

For any $f(a)$ in $A(a_1, \dots, a_n)$, let $f(a)^\sharp$ be the Grassmann polynomial belonging to $A(a^\sharp_1, \dots, a^\sharp_n)$ defined by $f(a)^\sharp = f(a^\sharp)$. If $S$ is an invertible skew symmetric matrix, then there is a unique linear map, called the Wick map with respect to $S$, from $f(a)$ in the Grassmann algebra $A(a_1, \dots, a_n)$ to $:f(a):_S$ in $A(a_1, \dots, a_n)$ such that
\[ \int :f(a):_S \, g(a) \, d\mu_S(a) = \int f(a)^\sharp \, g(a) \, d\mu_{S^\sharp}(a^\sharp, a) \]
for all $f(a)$, $g(a)$ in $A(a_1, \dots, a_n)$. The Wick map has a unique extension from $A(a_1, \dots, a_n)$ to the Grassmann algebra $A(a_1, \dots, a_n, b_1, \dots, b_n)$ that is left linear over the subalgebra $A(b_1, \dots, b_n)$. For example, we can apply the Wick map to the exponential $e^{\Sigma a_i b_i}$ as a Grassmann polynomial in the generators $a_1, \dots, a_n$ with coefficients in $A(b_1, \dots, b_n)$. By definition,
\[ \int :e^{\Sigma a_i b_i}:_S \, e^{\Sigma a_i b^\sharp_i} \, d\mu_S(a) = \int e^{\Sigma a^\sharp_i b_i} \, e^{\Sigma a_i b^\sharp_i} \, d\mu_{S^\sharp}(a^\sharp, a) = e^{-\Sigma b_i S_{ij} b^\sharp_j} \, e^{-\frac{1}{2} \Sigma b^\sharp_i S_{ij} b^\sharp_j} . \]
On the other hand,
\[ \int e^{\frac{1}{2} \Sigma b_i S_{ij} b_j} \, e^{\Sigma a_i b_i} \, e^{\Sigma a_i b^\sharp_i} \, d\mu_S(a) = e^{-\Sigma b_i S_{ij} b^\sharp_j} \, e^{-\frac{1}{2} \Sigma b^\sharp_i S_{ij} b^\sharp_j} \]
and consequently,
\[ :e^{\Sigma a_i b_i}:_S = e^{\frac{1}{2} \Sigma b_i S_{ij} b_j} \, e^{\Sigma a_i b_i} . \]
It follows from the last identity that the Wick map $: \cdot :_S$ depends continuously on $S$ and has a unique continuous extension to the vector space of all skew symmetric matrices.

Definition 1.1. For any Grassmann polynomial
\[ h(a, b) = \sum_{I, J \in \mathcal{I}} g_{IJ} \, a_I b_J \]
in $A(a_1, \dots, a_n, b_1, \dots, b_n)$,
\[ :h(a, b):_{S,a} = \sum_{I, J \in \mathcal{I}} g_{IJ} \, :a_I:_S \, b_J , \qquad :h(a, b):_{S,b} = \sum_{I, J \in \mathcal{I}} g_{IJ} \, a_I \, :b_J:_S . \]
In other words, the Wick map with respect to $S$ is applied to $h(a, b)$ as a Grassmann polynomial in the generators $a_1, \dots, a_n$ with coefficients in $A(b_1, \dots, b_n)$ to obtain $:h(a, b):_{S,a}$. Similarly, it is applied to $h(a, b)$ as a Grassmann polynomial in the generators $b_1, \dots, b_n$ with coefficients in $A(a_1, \dots, a_n)$ to obtain $:h(a, b):_{S,b}$.

Definition 1.2. The linear map $R$ from $A(a_1, \dots, a_n)$ to itself is (consciously suppressing the dependence on $W(a)$ and $S$) given by
\[ R(f)(a) = \int :e^{W(a+b) - W(a)} - 1:_{S,b} \, f(b) \, d\mu_S(b) . \]


Our first result is

Theorem 1.3. Suppose 1 is not in the spectrum of $R$. Then,
\[ S(f) = \int (\mathbb{1} - R)^{-1}(f) \, d\mu_S \]
for every $f$ in $A_0(a_1, \dots, a_n)$.

To exploit the representation of Theorem 1.3 we decompose $R$ by expanding the exponential in $:e^{W(a+b) - W(a)} - 1:_{S,b}$.

Definition 1.4. For each pair $r, s \in \mathbb{N}^\ell$, $\ell \ge 1$, and every polynomial $f$ in $A(a_1, \dots, a_n)$, the complex valued kernel $R_{r\,s}(f)(K_1, \dots, K_\ell)$ on $M_{r_1 - s_1} \times \cdots \times M_{r_\ell - s_\ell}$ is (consciously suppressing the dependence on $W(a)$ and $S$) given by
\[ R_{r\,s}(f)(K_1, \dots, K_\ell) = \pm(s) \, \frac{1}{\ell!} \sum_{J_1 \in M_{s_1}} \cdots \sum_{J_\ell \in M_{s_\ell}} \int : \prod_{i=1}^{\ell} \binom{r_i}{s_i} \, w_{r_i}(J_i, K_i) \, b_{J_i} :_S \, f(b) \, d\mu_S(b) \]
when $r_i \ge s_i \ge 1$, $i = 1, \dots, \ell$. Otherwise, $R_{r\,s}(f)(K_1, \dots, K_\ell) = 0$. Here,
\[ \pm(s) = \prod_{i=1}^{\ell} (-1)^{s_i (s_{i+1} + \cdots + s_\ell)} . \]
The corresponding linear map $R_{r\,s}$ from $A(a_1, \dots, a_n)$ to itself is
\[ R_{r\,s}(f) = \sum_{K_1 \in M_{r_1 - s_1}} \cdots \sum_{K_\ell \in M_{r_\ell - s_\ell}} R_{r\,s}(f)(K_1, \dots, K_\ell) \prod_{i=1}^{\ell} a_{K_i} . \]

Theorem 1.5. For every $f$ in $A_0(a_1, \dots, a_n)$,
\[ R(f) = \sum_{\ell \ge 1} \sum_{r, s \in \mathbb{N}^\ell} R_{r\,s}(f) . \]

Remark 1.6. At the end of the introduction, we use Theorem 1.5 to interpret Definition 1.4 and Theorem 1.3 in terms of Feynman graphs.

We can combine Theorem 1.3 and Theorem 1.5 to obtain analytic control over the Schwinger functional $S(f)$. To do this, choose a nondecreasing function $\Phi$ on $\mathbb{N}$ and a $\Lambda > 0$ such that
\[ \left| \int a_I \, :a_J:_S \, d\mu_S(a) \right| \le \begin{cases} \Phi(|I|) \, \Lambda^{\frac{1}{2}(|I| + |J|)} , & |J| \le |I| \\ 0 , & |J| > |I| \end{cases} \]
for all multi indices $I$ and $J$. The number $\Lambda$ is morally the supremum $\|S\|_\infty = \sup_{i, j \in \{1, \dots, n\}} |S_{ij}|$ of the covariance $S$. The number $\Phi(|I|)$ is intuitively the degree of cancellation between the at most $|I|!$ nonzero terms contributing to the Pfaffian
\[ \mathrm{Pf} \begin{pmatrix} S_I & S_{I,J} \\ S_{J,I} & 0 \end{pmatrix} = \int a_I \, :a_J:_S \, d\mu_S(a) . \]


At one extreme (see Example 3.4), we can always choose $\Phi(|I|) = |I|!$ for all multi indices $I$ and $\Lambda = \|S\|_\infty$. At the other extreme (see Example 3.5), suppose that
\[ S = \begin{pmatrix} 0 & \Sigma \\ -\Sigma^T & 0 \end{pmatrix} \]
for some matrix $\Sigma = (\Sigma_{ij})$ of order $\frac{n}{2}$, and further that there is a complex Hilbert space $H$, elements $v_i, w_i \in H$, $i = 1, \dots, \frac{n}{2}$, and a constant $L > 0$ with
\[ \langle v_i, w_j \rangle_H = \Sigma_{ij} , \qquad \|v_i\|_H , \ \|w_j\|_H \le L^{\frac{1}{2}} \]
for all $i, j = 1, \dots, \frac{n}{2}$. Then, by a variant of Gram's inequality, we can choose $\Phi(|I|) = 1$ for all multi indices $I$ and $\Lambda = 2L$. For example, the Grassmann algebra associated to a many fermion system has an equal number of "annihilation" $a_1, \dots, a_m$ and "creation" $\bar{a}_1, \dots, \bar{a}_m$ generators. Furthermore, the physical covariance $C$ cannot pair two annihilation or two creation generators. That is,
\[ C = \begin{pmatrix} 0 & C_{i \bar\jmath} \\ C_{\bar\imath j} & 0 \end{pmatrix} . \]
It is also often possible to write $C_{i \bar\jmath}$ as an inner product between vectors in an appropriate Hilbert space with "naturally" bounded norms so that $\Phi(|I|) = 1$, $I \in M$, can be achieved in models of physical interest. See [FMRT, p. 682].

Now, let
\[ f(a) = \sum_{m \ge 0} f^{(m)}(a) \]
be a Grassmann polynomial in $A(a_1, \dots, a_n)$ where, for each $m \ge 0$,
\[ f^{(m)}(a) = \sum_{j_1, \dots, j_m} f_m(j_1, \dots, j_m) \, a_{j_1} \cdots a_{j_m} \]
and the kernel $f_m(j_1, \dots, j_m)$ is an antisymmetric function of its arguments.

Definition 1.7. For all $\alpha \ge 2$, the "external" and "internal" naive power counting norms $\|f\|_\alpha$ and $|||f|||_\alpha$ of the Grassmann polynomial $f(a)$ are
\[ \|f\|_\alpha = \sum_{m \ge 0} \|f^{(m)}\|_\alpha = \sum_{m \ge 0} \alpha^m \Lambda^{\frac{1}{2} m} \|f_m\|_1 \]
and
\[ |||f|||_\alpha = \sum_{m \ge 0} |||f^{(m)}|||_\alpha = \sum_{m \ge 0} \alpha^m \, \|S\|_{1,\infty} \, \Lambda^{\frac{1}{2}(m - 2)} \, \|f_m\|_{1,\infty} , \]
where
\[ \|f_m\|_1 = \sum_{j_1, \dots, j_m} |f_m(j_1, \dots, j_m)| , \qquad \|f_m\|_{1,\infty} = \sup_{j_1 \in \{1, \dots, n\}} \sum_{j_2, \dots, j_m} |f_m(j_1, j_2, \dots, j_m)| \]
are the $L^1$ and "mixed $L^1$, $L^\infty$" norms of the antisymmetric kernels $f_m(j_1, \dots, j_m)$.

In Sect. 3 we prove


Theorem 1.8. Suppose $2 \, |||W|||_{\alpha+1} \le 1$. Then, for all polynomials $f$ in the Grassmann algebra $A(a_1, \dots, a_n)$,
\[ \|R(f)\|_\alpha \le 2 \, \Phi(n) \, |||W|||_{\alpha+1} \, \|f\|_\alpha . \]
In particular, the spectrum of $R$ is bounded away from 1 uniformly in the number of degrees of freedom $n$ when $\Phi$ is constant and $|||W|||_{\alpha+1}$ is small enough.

A simple consequence of Theorem 1.8 is the archetypical bound on correlation functions

Theorem 1.9. Suppose $2 \, (1 + \Phi(n)) \, |||W|||_{\alpha+1} < 1$. Then, for each $m \ge 0$ and all sequences of indices $1 \le j_1, \dots, j_m \le n$,
\[ |S_m(j_1, \dots, j_m)| \le \frac{\Phi(n)}{1 - 2 \, \Phi(n) \, |||W|||_{\alpha+1}} \, \alpha^m \Lambda^{\frac{1}{2} m} . \]
In particular, the correlation functions are bounded uniformly in the number of degrees of freedom $n$ when $\Phi$ is constant and $|||W|||_{\alpha+1}$ is small enough.

1.1. Decomposition of Feynman graphs into annuli. In the rest of the introduction we motivate and interpret Definition 1.4 and Theorem 1.3 "graphically". However, we emphasize that the purely algebraic proof of Theorem 1.3 given in the next section is completely independent of this discussion and, in particular, does not refer to graphs.

Recall that, for $I = (i_1, \dots, i_r)$ in $M_r$ with $r$ even,
\[ \int a_I \, d\mu_S(a) = \mathrm{Pf}\, S_I = \sum_{\substack{k_1, \dots, k_r = 1 \\ k_{2i-1} < \cdots}}^{r} \varepsilon^{k_1 \cdots k_r} S_{i_{k_1} i_{k_2}} \cdots S_{i_{k_{r-1}} i_{k_r}} \]
can be thought of as the sum of the amplitudes of all graphs having $r$ vertices labelled $i_1, \dots, i_r$ and having precisely one line attached to each vertex. The amplitude of the graph having lines $\{i_{k_1}, i_{k_2}\}, \dots, \{i_{k_{r-1}}, i_{k_r}\}$ is $\varepsilon^{k_1 \cdots k_r} S_{i_{k_1} i_{k_2}} \cdots S_{i_{k_{r-1}} i_{k_r}}$. This graphical representation has an immediate extension to
\[ \int a_H \, e^{W(a)} \, d\mu_S(a) = \sum_{\ell \ge 0} \frac{1}{\ell!} \sum_{r \in \mathbb{N}^\ell} \sum_{I_1 \in M_{r_1}} \cdots \sum_{I_\ell \in M_{r_\ell}} \int a_H \prod_{i=1}^{\ell} w_{r_i}(I_i) \, a_{I_i} \, d\mu_S(a) \]
with $H \in M_m$. One merely has to substitute $H I_1 \cdots I_\ell$ for $I$. In general, a "Feynman graph $\Gamma$" with $m$ external legs and $\ell$ internal vertices $w_{r_1}, \dots, w_{r_\ell}$ (the vertex $w_{r_i}$ having $r_i$ legs) is a partition of the $m + r_1 + \cdots + r_\ell$ legs into disjoint pairs that are represented as lines. The amplitude of the graph $\Gamma$ is defined to be
\[ \mathrm{Am}(\Gamma)(H) = \frac{\varepsilon}{\ell!} \sum_{I_1 \in M_{r_1}} \cdots \sum_{I_\ell \in M_{r_\ell}} \prod_{i=1}^{\ell} w_{r_i}(I_i) \prod_{\text{lines } c \text{ in } \Gamma} S_{i_c j_c} , \]
where we choose any ordering $(i_c, j_c)$ of the pairs in the partition determined by the lines $c$ of $\Gamma$ and where $\varepsilon$ is the signature of the permutation that brings the juxtaposition of these pairs to the juxtaposition $H I_1 \cdots I_\ell$ of the multi indices $H, I_1, \dots, I_\ell$. Then, for $H \in M_m$,


\[ \int a_H \, e^{W(a)} \, d\mu_S(a) = \sum_{\text{Feynman graphs } \Gamma} \mathrm{Am}(\Gamma)(H) . \]
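In the simplest case, with no interaction vertices, the graphical picture says that $\int a_I \, d\mu_S$ is the signed sum over all ways of joining the $r$ vertices in pairs. The sketch below enumerates these pairings for a small example and sums their amplitudes. The function names and the sample covariance are hypothetical, and the convention of listing each pair and the sequence of pairs in increasing order is one possible choice of representative for each graph, used here only for illustration.

```python
def pairings(vertices):
    """Yield every partition of the vertices into unordered pairs ("lines")."""
    if not vertices:
        yield []
        return
    first = vertices[0]
    for pos in range(1, len(vertices)):
        partner = vertices[pos]
        rest = vertices[1:pos] + vertices[pos + 1:]
        for tail in pairings(rest):
            yield [(first, partner)] + tail

def sign_of(sequence):
    """Sign of the permutation that sorts the sequence (parity of inversions)."""
    inversions = sum(1 for i in range(len(sequence))
                     for j in range(i + 1, len(sequence))
                     if sequence[i] > sequence[j])
    return -1 if inversions % 2 else 1

def amplitude(S, lines):
    """Signed product of propagators S[i][j] over the lines of one graph."""
    flat = [vertex for line in lines for vertex in line]
    value = sign_of(flat)
    for i, j in lines:
        value *= S[i][j]
    return value

# Hypothetical skew symmetric covariance of order 4 (0-based indices).
S = [[0, 1, 2, 3],
     [-1, 0, 4, 5],
     [-2, -4, 0, 6],
     [-3, -5, -6, 0]]

graphs = list(pairings(list(range(4))))
total = sum(amplitude(S, lines) for lines in graphs)
print(len(graphs), total)
```

For four vertices there are three graphs, with amplitudes $6$, $-10$ and $12$, and their sum $8$ is the Pfaffian of the sample matrix. For $r$ vertices the number of graphs is $(r-1)!!$.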

A Feynman graph 0 is “externally connected” when each connected component contains a line with an external leg. In other words, there are no “vacuum components”. It is well known (and can also be R derived from Theorem 1.3 and Theorem 1.5) that the correlation function S(aH ) = Z1 aH eW(a) dµS (a) is the sum of the amplitudes of all externally ∞ R P connected graphs. Roughly speaking, the representation S(f ) = Rn (f )dµS genn=0

erates these graphs a bit at a time with the mth application of R adding those lines that are of distance m − 1 from f and those vertices that are of distance m from f . To make this more precise, we return to Definition 1.4 and choose r, s ∈ N` satisfying rs ri ≥si ≥1 , i=1,···,`. We first explain how to visualize the action of the operator R on a homogeneous Grassmann polynomial X X fm (j1 ,···,jm ) aj1 · · · ajm = fm (H) aH f (m) (a) = H∈Mm

j1 ,···,jm

of degree $m$. Suppose $f(a)=a_H$ is a monomial, where $H=(h_1,\cdots,h_m)$, and select a multi index $K_i\in\mathcal{M}_{r_i-s_i}$, $i=1,\cdots,\ell$. To graphically interpret the coefficient

\[
R_{rs}(a_H)(K_1,\cdots,K_\ell) \;=\; \pm(s)\,\frac{1}{\ell!}\sum_{J_1\in\mathcal{M}_{s_1}}\cdots\sum_{J_\ell\in\mathcal{M}_{s_\ell}} \int {:}\prod_{i=1}^{\ell}\binom{r_i}{s_i}\, w_{r_i}(J_i,K_i)\, b_{J_i}{:}_S\; b_H\, d\mu_S(b)
\]

of $a_{K_1}\cdots a_{K_\ell}$ in $R_{rs}(a_H)$, we imagine an annulus that has, for each generator in the incoming monomial $b_H$, an “external leg” entering some point on its outer boundary and that has, for each generator in the outgoing product $\prod_{i=1}^{\ell}a_{K_i}$ of $R_{rs}$, an “external leg” leaving some point on its inner boundary, and that furthermore contains the $\ell$ “vertices” $w_{r_1},\cdots,w_{r_\ell}$ in its interior. There are no other external legs or vertices. The vertex $w_{r_i}$, $i=1,\cdots,\ell$, has $r_i$ “legs”, one leg for each generator in the monomial $w_{r_i}(J_i,K_i)\,b_{J_i}a_{K_i}$. The leg attached to $w_{r_i}$ for a given generator in $a_{K_i}$ is joined by a “passive” line to the corresponding external leg leaving the inner boundary of the annulus.

[Figure: an annulus with $\ell=4$, $m=10$, $r=(4,2,4,2)$, $s=(2,1,3,2)$]

Representation for Fermionic Correlation Functions


We construct “annular graphs of type $m,r,s$” out of our annulus, when $m\ge s_1+\cdots+s_\ell$ and $m+s_1+\cdots+s_\ell$ is even, by connecting each leg of $w_{r_i}$, $i=1,\cdots,\ell$, representing a generator in $b_{J_i}$, $|J_i|=s_i$, by an “active” line with an external leg representing a generator in $b_H$ entering the outer boundary or connecting, again by an “active” line, two external legs entering the outer boundary corresponding to a pair of generators in $b_H$. Each vertex is connected to at least one external leg entering the outer boundary because $s_i\ge 1$, $1\le i\le\ell$. Note that there is a bijection between annular graphs and partitions $P$ of the disjoint union $H\,\dot\cup\,\dot{\bigcup}_{1\le i\le\ell}J_i$ into disjoint unordered pairs such that each element of $\dot{\bigcup}_{1\le i\le\ell}J_i$ is paired with an element of $H$.
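The bijection just described makes annular graphs easy to count. The following is a minimal sketch, not taken from the paper: it assumes that all legs are distinguishable, so that the number of annular graphs of type $m,r,s$ equals the number of injective assignments of the $s_1+\cdots+s_\ell$ active vertex legs to legs of $H$, times the number of ways to pair up the remaining legs of $H$. The function names `count_annular_graphs` and `brute_force_count` are hypothetical, and the brute-force routine only serves as a cross-check on small cases.

```python
from math import factorial


def pairings_of(k):
    """Number of ways to split k distinguishable objects into pairs: (k-1)!! for even k."""
    if k % 2:
        return 0
    result = 1
    for j in range(k - 1, 0, -2):
        result *= j
    return result


def count_annular_graphs(m, s):
    """Closed-form count of pair partitions in which every J-leg is paired with an H-leg."""
    total = sum(s)
    if m < total or (m + total) % 2 != 0:
        return 0
    # Pick an injective partner in H for each of the `total` vertex legs,
    # then pair the m - total leftover legs of H among themselves.
    injections = factorial(m) // factorial(m - total)
    return injections * pairings_of(m - total)


def perfect_matchings(items):
    """Yield every partition of the list `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in perfect_matchings(remaining):
            yield [(first, partner)] + tail


def brute_force_count(m, s):
    """Enumerate all pair partitions of H and the J_i and keep those with no J-J pair."""
    legs = [("H", k) for k in range(m)] + [("J", k) for k in range(sum(s))]
    if len(legs) % 2:
        return 0
    return sum(
        1
        for matching in perfect_matchings(legs)
        if all(not (a[0] == "J" and b[0] == "J") for a, b in matching)
    )
```

Under this counting convention, `count_annular_graphs(10, (2, 1, 3, 2))` gives the number of pair partitions compatible with the type shown in the figure; the vector $r$ does not enter because passive lines are fixed by the type.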

[Figure: two annular graphs of type $10$, $(4,2,4,2)$, $(2,1,3,2)$]

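For each sequence $(H,K_1,\cdots,K_\ell)$ of multi indices in $\mathcal{M}_m\times\mathcal{M}_{r_1-s_1}\times\cdots\times\mathcal{M}_{r_\ell-s_\ell}$, the amplitude $\mathrm{Am}(A)(H,K_1,\cdots,K_\ell)$ of an annular graph $A$ of type $m,r,s$ is

\[
\mathrm{Am}(A)(H,K_1,\cdots,K_\ell) \;=\; \pm(s)\,\frac{\varepsilon}{\ell!}\sum_{J_1\in\mathcal{M}_{s_1}}\cdots\sum_{J_\ell\in\mathcal{M}_{s_\ell}}\ \prod_{i=1}^{\ell}\binom{r_i}{s_i}\, w_{r_i}(J_i,K_i) \prod_{\text{active lines } c \text{ in } A} S_{i_c j_c},
\]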

where we choose any ordering $(i_c,j_c)$ of the pairs in the partition $P_A$ determined by the active lines $c$ of $A$ and where $\varepsilon$ is the signature of the permutation that brings the juxtaposition of these pairs to the juxtaposition $HJ_1\cdots J_\ell$ of the multi indices $H,J_1,\cdots,J_\ell$. The amplitude $\mathrm{Am}(A)(H,K_1,\cdots,K_\ell)$ is a function of the external legs of the annular graph $A$. Recall that the Grassmann Gaussian integral

\[
\int {:}\prod_{i=1}^{\ell} b_{J_i}{:}_S\; b_H\, d\mu_S(b) \;=\; \int \prod_{i=1}^{\ell} b^{\sharp}_{J_i}\; b_H\, d\mu_{S^\sharp}(b^\sharp,b)
\]

is equal to a Pfaffian that is the sum over all the partitions of the product $\prod_{i=1}^{\ell}b^{\sharp}_{J_i}\,b_H$ into disjoint pairs such that each generator in $\prod_{i=1}^{\ell}b^{\sharp}_{J_i}$ contracts, via a matrix element of $S$, to a generator in $b_H$. Therefore, by construction,

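\[
\pm(s)\,\frac{1}{\ell!}\sum_{J_1\in\mathcal{M}_{s_1}}\cdots\sum_{J_\ell\in\mathcal{M}_{s_\ell}} \int {:}\prod_{i=1}^{\ell}\binom{r_i}{s_i}\, w_{r_i}(J_i,K_i)\, b_{J_i}{:}_S\; b_H\, d\mu_S(b) \;=\; \sum_{\substack{\text{annular graphs } A\\ \text{of type } |H|,r,s}} \mathrm{Am}(A)(H,K_1,\cdots,K_\ell).
\]

That is,

\[
R_{rs}(f^{(m)})(K_1,\cdots,K_\ell) \;=\; \sum_{H\in\mathcal{M}_m} f_m(H) \sum_{\substack{\text{annular graphs } A\\ \text{of type } m,r,s}} \mathrm{Am}(A)(H,K_1,\cdots,K_\ell).
\]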

Suppose $|\lambda|\ll 1$. Then the partition function $Z(\lambda)=\int e^{\lambda W(a)}\,d\mu_S\ne 0$ and all the eigenvalues of $R=O(\lambda)$ lie strictly inside the unit disc. In this case, the Neumann series $(\mathbb{1}-R)^{-1}=\sum_{p\ge 0}R^p$ converges. Writing out all the terms,

\[
\int (\mathbb{1}-R)^{-1}(f^{(m)})\, d\mu_S \;=\; \sum_{\substack{p\ge 0\\ \ell_1,\cdots,\ell_p\ge 1}}\ \sum_{r_1,s_1\in\mathbb{N}^{\ell_1}}\cdots\sum_{r_p,s_p\in\mathbb{N}^{\ell_p}} \int R_{r_p s_p}\cdots R_{r_1 s_1}(f^{(m)})\, d\mu_S .
\]

To convey the intuition that leads to the statement of Theorem 1.3, we examine both the action of a product $R_{r_p s_p}\cdots R_{r_1 s_1}$ contributing to $R^p$ on $f^{(m)}$ and the corresponding Grassmann Gaussian integrals $\int R_{r_p s_p}\cdots R_{r_1 s_1}(f^{(m)})\,d\mu_S$. For $p=2$,

\[
R_{r_2 s_2}R_{r_1 s_1}(f^{(m)})(L_1,\cdots,L_{\ell_2}) \;=\; \sum_{H\in\mathcal{M}_{m_1}} f_{m_1}(H) \sum_{K_1,\cdots,K_{\ell_1}}\ \sum_{\substack{\text{annular graphs } A_i \text{ of type}\\ m_i,r_i,s_i \text{ for } i=1,2}} \mathrm{Am}(A_1)(H,K_1,\cdots,K_{\ell_1})\; \mathrm{Am}(A_2)(K_1\cdots K_{\ell_1},L_1,\cdots,L_{\ell_2}).
\]

The degree $m_1=m$ and the degree $m_2=r_{1,1}-s_{1,1}+\cdots+r_{1,\ell_1}-s_{1,\ell_1}$, the second sum is over all sequences of multi indices $(K_1,\cdots,K_{\ell_1})$ in $\mathcal{M}_{r_{1,1}-s_{1,1}}\times\cdots\times\mathcal{M}_{r_{1,\ell_1}-s_{1,\ell_1}}$, and the multi index $K_1\cdots K_{\ell_1}$ occurring in the amplitude $\mathrm{Am}(A_2)$ is the juxtaposition of the multi indices $K_1,\cdots,K_{\ell_1}$. Now,

\[
\mathrm{Am}(A_1A_2)(H,L_1,\cdots,L_{\ell_2}) \;=\; \sum_{K_1,\cdots,K_{\ell_1}} \mathrm{Am}(A_1)(H,K_1,\cdots,K_{\ell_1})\; \mathrm{Am}(A_2)(K_1\cdots K_{\ell_1},L_1,\cdots,L_{\ell_2})
\]

is the “amplitude of the double annular graph $A_1A_2$ of type $m,r_1,s_1,r_2,s_2$” obtained by inserting $A_2$ just inside the inner boundary of $A_1$ and then, for each generator in $\prod_{i=1}^{\ell_1}a_{K_i}$, joining the associated external leg (at the end of a passive line) leaving the inner boundary of $A_1$ to its mate (at the beginning of an active line) entering the outer boundary of $A_2$. Notice that, by construction, each vertex in $A_2$ is connected by a line to at least one vertex in $A_1$ that, in turn, is connected to at least one external leg entering the outer boundary of $A_1$.


[Figure: a double annular graph $A_1A_2$ with $A_1$ of type $8$, $(2,6,2,4,4)$, $(1,3,1,2,1)$ and $A_2$ of type $10$, $(4,2,4,2)$, $(2,1,2,2)$]

For $p\ge 3$, one obtains the amplitude of the completely analogous $p$-annular graph $A_1\cdots A_p$ of type $m,r_1,s_1,\cdots,r_p,s_p$ in which each vertex in the annular graph $A_i$, $i=2,\cdots,p$, of type $m_i=\sum_{j=1}^{\ell_{i-1}}(r_{i-1,j}-s_{i-1,j})$, $r_i$, $s_i$, is connected by a line to at least one vertex in $A_{i-1}$ and ultimately to at least one external leg entering the outer boundary of $A_1$.

An “externally connected Feynman graph $\Gamma$ of type $m,r_1,s_1,\cdots,r_p,s_p$” is a $p$-annular graph $A_1\cdots A_p$ of type $m,r_1,s_1,\cdots,r_p,s_p$, as in the last paragraph, together with a partition $P$ of the legs emanating from the inner boundary of $A_p$ into disjoint pairs that are joined to form lines.

Suppose $\Gamma$ is an externally connected Feynman graph with $m$ external legs and the internal vertices $w_{r_1},\cdots,w_{r_n}$. Set

\[
A_1(\Gamma) \;=\; \big\{\, w_{r_j} \;\big|\; \text{at least one leg of } w_{r_j} \text{ is connected by a line in } \Gamma \text{ to an external leg} \,\big\}
\]

and then define $A_i(\Gamma)$, $i\ge 2$, inductively by

\[
A_i(\Gamma) \;=\; \Big\{\, w_{r_j}\notin\bigcup_{h=1}^{i-1}A_h(\Gamma) \;\Big|\; \text{at least one leg of } w_{r_j} \text{ is connected by a line in } \Gamma \text{ to a vertex in } A_{i-1}(\Gamma) \,\Big\}.
\]

Also, set $\ell_i=|A_i(\Gamma)|$ for $i\ge 1$. There is a $1\le p\le n$ such that $A_p(\Gamma)\ne\emptyset$, while $A_{p+1}(\Gamma)=\emptyset$, and

\[
\{w_{r_1},\cdots,w_{r_n}\} \;=\; \bigcup_{i=1}^{p} A_i(\Gamma).
\]
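The inductive definition of the sets $A_i(\Gamma)$ is a breadth-first layering of the vertices by their distance from the external legs. The sketch below illustrates this under an assumed, purely hypothetical encoding: a graph is a list of lines, each line a pair of endpoints, and an endpoint is either the string `"ext"` (an external leg) or an integer vertex label. Vertices that never appear in a layer belong to vacuum components, so the graph is externally connected exactly when every vertex shows up.

```python
def vertex_layers(num_vertices, lines):
    """Return [A_1, A_2, ...] as sorted lists of vertex labels 0..num_vertices-1.

    `lines` is a list of pairs (p, q); each endpoint is "ext" or a vertex label.
    """
    neighbours = {v: set() for v in range(num_vertices)}
    first_layer = set()
    for p, q in lines:
        if p == "ext" and q == "ext":
            continue  # a line joining two external legs touches no vertex
        if p == "ext":
            first_layer.add(q)
        elif q == "ext":
            first_layer.add(p)
        else:
            neighbours[p].add(q)
            neighbours[q].add(p)

    layers = []
    seen = set()
    current = first_layer
    while current:
        layers.append(sorted(current))
        seen |= current
        # A_{i+1}: vertices not yet assigned that are joined to a vertex of A_i
        nxt = set()
        for v in current:
            nxt |= neighbours[v] - seen
        current = nxt
    return layers


def is_externally_connected(num_vertices, lines):
    """True when every vertex lies in some layer, i.e. there are no vacuum components."""
    return sum(len(layer) for layer in vertex_layers(num_vertices, lines)) == num_vertices
```

The number of layers returned plays the role of $p$, and the layer sizes are the $\ell_i$.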

Let $w_{r_{j_1}},\cdots,w_{r_{j_{\ell_i}}}$ be the vertices in $A_i(\Gamma)$. For each $k=1,\cdots,\ell_i$, let $s_{j_k}$ be the number of legs of the vertex $w_{r_{j_k}}$ that are attached by lines to a vertex in $A_{i-1}(\Gamma)$. Also, let $m'_i$ be the total number of legs emanating from the vertices in $A_{i-1}(\Gamma)$ that are not attached by lines to vertices in $A_{i-2}(\Gamma)$. For each $i=1,\cdots,p$, set $r'_i=(r_{j_1},\cdots,r_{j_{\ell_i}})$ and $s'_i=(s_{j_1},\cdots,s_{j_{\ell_i}})$. Observe that the graph $\Gamma$ induces a unique annular graph structure on $A_i(\Gamma)$ of type $m'_i,r'_i,s'_i$ for each $i=1,\cdots,p$, and a partition $P_\Gamma$ of the legs leaving the inner boundary of $A_p(\Gamma)$ into pairs. Thus, every externally connected Feynman graph $\Gamma$ with $m$ external legs corresponds to a unique externally connected Feynman graph of type $m,r'_1,s'_1,\cdots,r'_p,s'_p$.

For each multi index $H$ in $\mathcal{M}_m$, the amplitude $\mathrm{Am}(\Gamma)(H)$ of the externally connected Feynman graph $\Gamma$ of type $m,r_1,s_1,\cdots,r_p,s_p$ is defined by

\[
\mathrm{Am}(\Gamma)(H) \;=\; \sum_{M_1,\cdots,M_{\ell_p}} \varepsilon(P_\Gamma)\; \mathrm{Am}(A_1\cdots A_p)(H,M_1,\cdots,M_{\ell_p}) \prod_{(i,j)\in P_\Gamma} S_{ij},
\]

where $\varepsilon(P_\Gamma)$ is the signature of the permutation that brings the juxtaposition of the pairs in the partition $P_\Gamma$ to the juxtaposition $M_1\cdots M_{\ell_p}$. The sum is over all sequences of multi indices $(M_1,\cdots,M_{\ell_p})$ in $\mathcal{M}_{r_{p,1}-s_{p,1}}\times\cdots\times\mathcal{M}_{r_{p,\ell_p}-s_{p,\ell_p}}$. By definition, the amplitude $\mathrm{Am}(\Gamma)(H)$ of an externally connected Feynman graph $\Gamma$ is the amplitude of the corresponding externally connected Feynman graph of type $m,r'_1,s'_1,\cdots,r'_p,s'_p$. We can now write

\[
\begin{aligned}
\int R_{r_2 s_2}R_{r_1 s_1}(f^{(m)})\, d\mu_S
&= \sum_{H\in\mathcal{M}_m} f_m(H) \int R_{r_2 s_2}R_{r_1 s_1}(a_H)\, d\mu_S \\
&= \sum_{H\in\mathcal{M}_m} f_m(H) \sum_{L_1,\cdots,L_{\ell_2}}\ \sum_{\substack{\text{annular graphs } A_i \text{ of type}\\ m_i,r_i,s_i \text{ for } i=1,2}} \mathrm{Am}(A_1A_2)(H,L_1,\cdots,L_{\ell_2}) \int \prod_{i=1}^{\ell_2} a_{L_i}\, d\mu_S \\
&= \sum_{H\in\mathcal{M}_m} f_m(H) \sum_{\substack{\text{externally connected Feynman graphs } \Gamma\\ \text{of type } m_1,r_1,s_1,m_2,r_2,s_2}} \mathrm{Am}(\Gamma)(H)
\end{aligned}
\]

since the integral $\int\prod_{i=1}^{\ell_2}a_{L_i}\,d\mu_S$ is equal to a Pfaffian that is the sum over all partitions of the product $\prod_{i=1}^{\ell_2}a_{L_i}$ into disjoint pairs that are contracted via matrix elements of $S$. Similarly,

\[
\int R_{r_p s_p}\cdots R_{r_1 s_1}(f^{(m)})\, d\mu_S \;=\; \sum_{H\in\mathcal{M}_m} f_m(H) \sum_{\substack{\text{externally connected Feynman graphs } \Gamma\\ \text{of type } m_1,r_1,s_1,\cdots,m_p,r_p,s_p}} \mathrm{Am}(\Gamma)(H)
\]

for all $p\ge 3$. It follows that

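\[
\int (\mathbb{1}-R)^{-1}(f^{(m)})\, d\mu_S \;=\; \sum_{H\in\mathcal{M}_m} f_m(H) \sum_{\substack{p\ge 0\\ \ell_1,\cdots,\ell_p\ge 1}}\ \sum_{r_1,s_1\in\mathbb{N}^{\ell_1}}\cdots\sum_{r_p,s_p\in\mathbb{N}^{\ell_p}}\ \sum_{\substack{\text{externally connected Feynman graphs } \Gamma\\ \text{of type } m_1,r_1,s_1,\cdots,m_p,r_p,s_p}} \mathrm{Am}(\Gamma)(H)
\]

or, equivalently,

\[
\int (\mathbb{1}-R)^{-1}(f^{(m)})\, d\mu_S \;=\; \sum_{H\in\mathcal{M}_m} f_m(H) \sum_{\substack{\text{externally connected}\\ \text{Feynman graphs } \Gamma}} \mathrm{Am}(\Gamma)(H),
\]

where the last sum is over all externally connected Feynman graphs with $m$ external legs and any finite set $w_{r_1},\cdots,w_{r_n}$ of vertices chosen from $w_r$, $r\ge 1$. As mentioned before, it is “a well known fact” that

\[
S(f^{(m)}) \;=\; \sum_{H\in\mathcal{M}_m} f_m(H) \sum_{\substack{\text{externally connected}\\ \text{Feynman graphs } \Gamma}} \mathrm{Am}(\Gamma)(H).
\]

Therefore,

\[
S(f^{(m)}) \;=\; \int (\mathbb{1}-R)^{-1}(f^{(m)})\, d\mu_S .
\]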

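This completes the graphical interpretation of Theorem 1.3.

2. The Proofs of Theorem 1.3 and Theorem 1.5

Again, let $S=(S_{ij})$ be a skew symmetric matrix of even order $n$.

Lemma 2.1. For each $h(b,c)$ in $A(b_1,\cdots,b_n,c_1,\cdots,c_n)$,

\[
\int\!\!\int h(b,a+b^\sharp)\, d\mu_{S^\sharp}(b^\sharp,b)\, d\mu_S(a) \;=\; \int h(b,c)\, d\mu_{\left(\begin{smallmatrix} S & S\\ S & S\end{smallmatrix}\right)}(b,c).
\]

Proof. Let $\Gamma$ be the linear functional on the Grassmann algebra $A(b_1,\cdots,b_n,c_1,\cdots,c_n)$ given by

\[
\Gamma(h) \;=\; \int\!\!\int h(b,a+b^\sharp)\, d\mu_{S^\sharp}(b^\sharp,b)\, d\mu_S(a).
\]

Then,

\[
\begin{aligned}
\Gamma\big(e^{\Sigma\, b_i d_i + c_i e_i}\big)
&= \int\!\!\int e^{\Sigma\, b_i d_i + (a+b^\sharp)_i e_i}\, d\mu_{S^\sharp}(b^\sharp,b)\, d\mu_S(a) \\
&= \int e^{\Sigma\, a_i e_i} \int e^{\Sigma\, b^\sharp_i e_i + b_i d_i}\, d\mu_{S^\sharp}(b^\sharp,b)\, d\mu_S(a) \\
&= e^{-\Sigma\, e_i S_{ij} d_j}\, e^{-\frac12 \Sigma\, d_i S_{ij} d_j} \int e^{\Sigma\, a_i e_i}\, d\mu_S(a) \\
&= e^{-\frac12 \Sigma\, e_i S_{ij} e_j}\, e^{-\Sigma\, e_i S_{ij} d_j}\, e^{-\frac12 \Sigma\, d_i S_{ij} d_j}.
\end{aligned}
\]

By uniqueness,

\[
\Gamma(h) \;=\; \int h(b,c)\, d\mu_{\left(\begin{smallmatrix} S & S\\ S & S\end{smallmatrix}\right)}(b,c).
\]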


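The main ingredient required for the proof of Theorem 1.3 is

Proposition 2.2. For all $f$ and $g$ in $A(a_1,\cdots,a_n)$,

\[
\int\!\!\int f(b)\, {:}g(a+b){:}_{S,b}\; d\mu_S(b)\, d\mu_S(a) \;=\; \int f(a)\, g(a)\, d\mu_S(a).
\]

Proof. By Lemma 2.1,

\[
\begin{aligned}
\int\!\!\int f(b)\, {:}g(a+b){:}_{S,b}\; d\mu_S(b)\, d\mu_S(a)
&= \int\!\!\int f(b)\, g(a+b^\sharp)\, d\mu_{S^\sharp}(b^\sharp,b)\, d\mu_S(a) \\
&= \int f(b)\, g(c)\, d\mu_{\left(\begin{smallmatrix} S & S\\ S & S\end{smallmatrix}\right)}(b,c).
\end{aligned}
\]

Now, observe that for all multi indices $I=(i_1,\cdots,i_r)$ and $J=(j_1,\cdots,j_s)$ in $\mathcal{M}$, the juxtaposition $I(J+n)=(i_1,\cdots,i_r,j_1+n,\cdots,j_s+n)$ is a multi index in $\{1,\cdots,2n\}^{r+s}$ and, by construction,

\[
\int b_I\, c_J\, d\mu_{\left(\begin{smallmatrix} S & S\\ S & S\end{smallmatrix}\right)}(b,c) \;=\; \mathrm{Pf}\Big(\big(\begin{smallmatrix} S & S\\ S & S\end{smallmatrix}\big)_{I(J+n)}\Big) \;=\; \mathrm{Pf}\big(S_{IJ}\big) \;=\; \int a_I\, a_J\, d\mu_S(a).
\]

It follows that

\[
\int\!\!\int f(b)\, {:}g(a+b){:}_{S,b}\; d\mu_S(b)\, d\mu_S(a) \;=\; \int f(a)\, g(a)\, d\mu_S(a).
\]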

Again, let

\[
W(a) \;=\; \sum_{r\ge 0}\ \sum_{j_1,\cdots,j_r} w_r(j_1,\cdots,j_r)\, a_{j_1}\cdots a_{j_r}
\]

be an even Grassmann polynomial, where $w_r(j_1,\cdots,j_r)$ is an antisymmetric function of its arguments $1\le j_1,\cdots,j_r\le n$ that vanishes identically when $r$ is odd. Let $R$ be the linear map on $A(a_1,\cdots,a_n)$, corresponding to $W(a)$ and $S$, that was introduced in Definition I.2.

Theorem 2.3. For all $f$ in $A(a_1,\cdots,a_n)$,

\[
\int f(a)\, e^{W(a)}\, d\mu_S(a) \;=\; \int (R_0+R)(f)(a)\, e^{W(a)}\, d\mu_S(a),
\]

where $R_0(f)(a)=\int f(b)\,d\mu_S(b)\; a_\emptyset$.

Proof. By Proposition 2.2,

\[
\begin{aligned}
\int (R_0+R)(f)(a)\, e^{W(a)}\, d\mu_S(a)
&= \int\!\!\int {:}e^{W(a+b)-W(a)}{:}_{S,b}\; f(b)\, d\mu_S(b)\; e^{W(a)}\, d\mu_S(a) \\
&= \int\!\!\int {:}e^{W(a+b)}{:}_{S,b}\; f(b)\, d\mu_S(b)\, d\mu_S(a) \\
&= \int f(a)\, e^{W(a)}\, d\mu_S(a).
\end{aligned}
\]


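It is now easy to give the

Proof of Theorem 1.3. If $|\lambda|\le r\ll 1$, then

\[
Z(\lambda) \;=\; \int e^{\lambda W(a)}\, d\mu_S \;\ne\; 0,
\]

and all the eigenvalues of $R=O(\lambda)$ lie strictly inside the unit disc. In this case,

\[
\frac{1}{Z}\int a_H\, e^{\lambda W(a)}\, d\mu_S \;=\; \int a_H\, d\mu_S + \frac{1}{Z}\int R(a_H)\, e^{\lambda W(a)}\, d\mu_S
\]

and

\[
\sum_{s\ge 0} R^s(a_H) \;=\; (\mathbb{1}-R)^{-1} a_H .
\]

Iterating,

\[
\begin{aligned}
\frac{1}{Z}\int a_H\, e^{\lambda W(a)}\, d\mu_S
&= \int a_H\, d\mu_S + \int R(a_H)\, d\mu_S + \frac{1}{Z}\int R^2(a_H)\, e^{\lambda W(a)}\, d\mu_S \\
&= \sum_{s=0}^{t} \int R^s(a_H)\, d\mu_S + \frac{1}{Z}\int R^{t+1}(a_H)\, e^{\lambda W(a)}\, d\mu_S
\end{aligned}
\]

for all $t\ge 0$. In the limit,

\[
\frac{1}{Z}\int a_H\, e^{\lambda W(a)}\, d\mu_S \;=\; \int (\mathbb{1}-R)^{-1} a_H\, d\mu_S
\]

when $|\lambda|\le r\ll 1$. To complete the proof, observe that both sides of the last identity are rational functions of $\lambda\in\mathbb{C}$.

To prove Theorem 1.5 we make

Convention 2.4. Let $I=(i_1,\cdots,i_r)$ be any multi index. A “sub multi index” $J\subset I$ is a multi index $J=(j_1,\cdots,j_s)$ together with a strictly increasing map $\nu_J$ from $\{1,\cdots,s\}$ to $\{1,\cdots,r\}$ such that $j_k=i_{\nu_J(k)}$, $k=1,\cdots,s$. If the multi indices $I$ and $J$ belong to $\mathcal{I}$ and, in addition, $J\subset I$ as sets, then $J$ is uniquely determined as a sub multi index by the inclusion map of $\{j_1,\cdots,j_s\}$ into $\{i_1,\cdots,i_r\}$. For every sub multi index $J\subset I$, there is a unique complementary sub multi index $I\setminus J\subset I$ such that the image of $\nu_{I\setminus J}$ is the complement of $\nu_J(\{1,\cdots,s\})$ in $\{1,\cdots,r\}$. The “relative sign” $\rho(J,I)$ of the pair $J\subset I$ is the signature of the permutation that brings the sequence $(1,\cdots,r-s,r-s+1,\cdots,r)$ to $(\nu_J(1),\cdots,\nu_J(s),\nu_{I\setminus J}(1),\cdots,\nu_{I\setminus J}(r-s))$. By construction, $a_I=\rho(J,I)\,a_J\,a_{I\setminus J}$. The relative sign is defined on all of $\mathcal{I}\times\mathcal{I}$ by

\[
\rho(J,I) \;=\; \begin{cases} \rho(J,I), & J\subset I \\ 0, & J\not\subset I. \end{cases}
\]

Proof of Theorem 1.5. Observe that for each $r\ge 1$,

\[
\begin{aligned}
\sum_{I\in\mathcal{M}_r} w_r(I)\,\big((a+b)_I - a_I\big)
&= \sum_{I\in\mathcal{M}_r} w_r(I) \sum_{1\le s\le r}\ \sum_{\substack{J \text{ a subindex}\\ \text{of } I \text{ in } \mathcal{M}_s}} \rho(J,I)\, b_J\, a_{I\setminus J} \\
&= \sum_{1\le s\le r}\ \sum_{I\in\mathcal{M}_r}\ \sum_{\substack{J \text{ a subindex}\\ \text{of } I \text{ in } \mathcal{M}_s}} \rho(J,I)\,\rho(J,I)\, w_r(J,I\setminus J)\, b_J\, a_{I\setminus J} \\
&= \sum_{1\le s\le r}\ \sum_{\substack{J\in\mathcal{M}_s\\ K\in\mathcal{M}_{r-s}}} \binom{r}{s}\, w_r(J,K)\, b_J\, a_K
\end{aligned}
\]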

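and consequently, for each $r\in\mathbb{N}^\ell$,

\[
\begin{aligned}
\prod_{i=1}^{\ell}\ \sum_{I_i\in\mathcal{M}_{r_i}} w_{r_i}(I_i)\,\big((a+b)_{I_i}-a_{I_i}\big)
&= \prod_{i=1}^{\ell}\ \sum_{1\le s_i\le r_i}\ \sum_{\substack{J_i\in\mathcal{M}_{s_i}\\ K_i\in\mathcal{M}_{r_i-s_i}}} \binom{r_i}{s_i}\, w_{r_i}(J_i,K_i)\, b_{J_i}\, a_{K_i} \\
&= \sum_{\substack{s\in\mathbb{N}^\ell\\ r_i\ge s_i\ge 1}}\ \sum_{\substack{J_1\in\mathcal{M}_{s_1}\\ K_1\in\mathcal{M}_{r_1-s_1}}}\cdots\sum_{\substack{J_\ell\in\mathcal{M}_{s_\ell}\\ K_\ell\in\mathcal{M}_{r_\ell-s_\ell}}}\ \prod_{i=1}^{\ell}\binom{r_i}{s_i}\, w_{r_i}(J_i,K_i)\, b_{J_i}\, a_{K_i} \\
&= \sum_{\substack{s\in\mathbb{N}^\ell\\ r_i\ge s_i\ge 1}}\ \sum_{\substack{J_1\in\mathcal{M}_{s_1}\\ K_1\in\mathcal{M}_{r_1-s_1}}}\cdots\sum_{\substack{J_\ell\in\mathcal{M}_{s_\ell}\\ K_\ell\in\mathcal{M}_{r_\ell-s_\ell}}} \pm(s)\ \prod_{i=1}^{\ell}\binom{r_i}{s_i}\, w_{r_i}(J_i,K_i)\, b_{J_i}\ \prod_{i=1}^{\ell} a_{K_i}.
\end{aligned}
\]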

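Now, we can expand the exponential to obtain

\[
\begin{aligned}
{:}e^{W(a+b)-W(a)}-1{:}_{S,b}
&= \sum_{\ell\ge 1}\frac{1}{\ell!}\sum_{r\in\mathbb{N}^\ell}\ {:}\prod_{i=1}^{\ell}\ \sum_{I_i\in\mathcal{M}_{r_i}} w_{r_i}(I_i)\,\big((a+b)_{I_i}-a_{I_i}\big){:}_{S,b} \\
&= \sum_{\ell\ge 1}\frac{1}{\ell!}\sum_{\substack{r,s\in\mathbb{N}^\ell\\ r_i\ge s_i\ge 1}}\ \sum_{\substack{J_1\in\mathcal{M}_{s_1}\\ K_1\in\mathcal{M}_{r_1-s_1}}}\cdots\sum_{\substack{J_\ell\in\mathcal{M}_{s_\ell}\\ K_\ell\in\mathcal{M}_{r_\ell-s_\ell}}} \pm(s)\ {:}\prod_{i=1}^{\ell}\binom{r_i}{s_i}\, w_{r_i}(J_i,K_i)\, b_{J_i}{:}_S\ \prod_{i=1}^{\ell} a_{K_i} \\
&= \sum_{\ell\ge 1}\ \sum_{r,s\in\mathbb{N}^\ell}\ \sum_{K_1\in\mathcal{M}_{r_1-s_1}}\cdots\sum_{K_\ell\in\mathcal{M}_{r_\ell-s_\ell}} Q_{rs}(K_1,\cdots,K_\ell,b)\ \prod_{i=1}^{\ell} a_{K_i},
\end{aligned}
\]

where

\[
Q_{rs}(K_1,\cdots,K_\ell,b) \;=\; \begin{cases} \displaystyle \pm(s)\,\frac{1}{\ell!}\sum_{J_1\in\mathcal{M}_{s_1}}\cdots\sum_{J_\ell\in\mathcal{M}_{s_\ell}} {:}\prod_{i=1}^{\ell}\binom{r_i}{s_i}\, w_{r_i}(J_i,K_i)\, b_{J_i}{:}_S, & r_i\ge s_i\ge 1,\ i=1,\cdots,\ell \\[2mm] 0, & \text{otherwise.} \end{cases}
\]

Integrating,

\[
\begin{aligned}
\int {:}e^{W(a+b)-W(a)}-1{:}_{S,b}\; f(b)\, d\mu_S(b)
&= \sum_{\ell\ge 1}\ \sum_{r,s\in\mathbb{N}^\ell}\ \sum_{K_1\in\mathcal{M}_{r_1-s_1}}\cdots\sum_{K_\ell\in\mathcal{M}_{r_\ell-s_\ell}} \int Q_{rs}(K_1,\cdots,K_\ell,b)\, f(b)\, d\mu_S(b)\ \prod_{i=1}^{\ell} a_{K_i} \\
&= \sum_{\ell\ge 1}\ \sum_{r,s\in\mathbb{N}^\ell}\ \sum_{K_1\in\mathcal{M}_{r_1-s_1}}\cdots\sum_{K_\ell\in\mathcal{M}_{r_\ell-s_\ell}} R_{rs}(f)(K_1,\cdots,K_\ell)\ \prod_{i=1}^{\ell} a_{K_i} \\
&= \sum_{\ell\ge 1}\ \sum_{r,s\in\mathbb{N}^\ell} R_{rs}(f).
\end{aligned}
\]

That is,

\[
R(f) \;=\; \sum_{\ell\ge 1}\ \sum_{r,s\in\mathbb{N}^\ell} R_{rs}(f).
\]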
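The relative sign $\rho(J,I)$ of Convention 2.4 and the binomial factor $\binom{r}{s}$ that appears when the sum over sub multi indices is reorganised can both be checked mechanically. The sketch below is an illustration only; the function names are hypothetical, and a sub multi index is represented simply by the strictly increasing tuple of positions $(\nu_J(1),\cdots,\nu_J(s))$, which is an assumed encoding rather than the paper's.

```python
from itertools import combinations
from math import comb


def relative_sign(positions_J, r):
    """Signature of the permutation (1..r) -> (nu_J(1..s), nu_{I\\J}(1..r-s)).

    positions_J: strictly increasing 1-based positions of J inside I; r = |I|.
    """
    chosen = list(positions_J)
    rest = [p for p in range(1, r + 1) if p not in chosen]
    seq = chosen + rest
    # The signature is (-1)^(number of inversions) of the rearranged sequence.
    inversions = sum(
        1 for a in range(r) for b in range(a + 1, r) if seq[a] > seq[b]
    )
    return -1 if inversions % 2 else 1


def relative_sign_closed_form(positions_J):
    """Each chosen position nu_J(k) jumps over nu_J(k) - k complementary positions."""
    shift = sum(p - k for k, p in enumerate(positions_J, start=1))
    return -1 if shift % 2 else 1


def count_sub_multi_indices(r, s):
    """Number of sub multi indices of length s in a multi index of length r."""
    return sum(1 for _ in combinations(range(1, r + 1), s))
```

For example, with $I=(i_1,i_2,i_3)$ and $J=(i_2)$ one has $a_{i_2}a_{i_1}a_{i_3}=-a_{i_1}a_{i_2}a_{i_3}$, which matches `relative_sign((2,), 3) == -1`.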

3. An Archetypical Bound and “Naive Power Counting”

Fix a complex, skew symmetric matrix $S=(S_{ij})$ of order $n$ and an even Grassmann polynomial

\[
W(a) \;=\; \sum_{r\ge 0}\ \sum_{j_1,\cdots,j_r} w_r(j_1,\cdots,j_r)\, a_{j_1}\cdots a_{j_r},
\]

where $w_r(j_1,\cdots,j_r)$ is an antisymmetric function of its arguments $1\le j_1,\cdots,j_r\le n$ that vanishes identically when $r$ is odd. In this section we introduce a family of norms on $A(a_1,\cdots,a_n)$ and then derive an archetypical bound on

\[
R(f) \;=\; \sum_{\ell\ge 1}\ \sum_{r,s\in\mathbb{N}^\ell} R_{rs}(f)
\]

for every $f$ in $A(a_1,\cdots,a_n)$. Recall that

\[
R_{rs}(f) \;=\; \sum_{K_1\in\mathcal{M}_{t_1}}\cdots\sum_{K_\ell\in\mathcal{M}_{t_\ell}} R_{rs}(f)(K_1,\cdots,K_\ell)\ \prod_{i=1}^{\ell} a_{K_i}
\]

for all $r,s\in\mathbb{N}^\ell$ with the convention $t=r-s$, where

\[
R_{rs}(f)(K_1,\cdots,K_\ell) \;=\; \pm\,\frac{1}{\ell!}\sum_{J_1\in\mathcal{M}_{s_1}}\cdots\sum_{J_\ell\in\mathcal{M}_{s_\ell}} \int {:}\prod_{i=1}^{\ell}\binom{r_i}{s_i}\, w_{r_i}(J_i,K_i)\, b_{J_i}{:}_S\; f(b)\, d\mu_S
\]

when $r_i\ge s_i\ge 1$, $i=1,\cdots,\ell$, and $R_{rs}(f)(K_1,\cdots,K_\ell)=0$ otherwise. The sign $\pm=\pm(s)$ is given by

\[
\pm(s) \;=\; \prod_{i=1}^{\ell} (-1)^{s_i(s_{i+1}+\cdots+s_\ell)} .
\]

A first prerequisite for introducing an appropriate family of norms on $A(a_1,\cdots,a_n)$ is to define the “$L^1$ norm” $\|u\|_1$ and the “mixed $L^1,L^\infty$ norm” $\|u\|_{1,\infty}$ of a function $u(j_1,\cdots,j_r)$ on $\{1,\cdots,n\}^r$ by

\[
\|u\|_1 \;=\; \sum_{j_1,\cdots,j_r} |u(j_1,\cdots,j_r)|
\]

and

\[
\|u\|_{1,\infty} \;=\; \sup_{i=1,\cdots,r}\ \sup_{j_i\in\{1,\cdots,n\}}\ \sum_{\substack{j_1,\cdots,j_{i-1}\\ j_{i+1},\cdots,j_r}} |u(j_1,\cdots,j_{i-1},j_i,j_{i+1},\cdots,j_r)| .
\]

If $u(j_1,\cdots,j_r)$ is an antisymmetric function of its arguments, then

\[
\|u\|_{1,\infty} \;=\; \sup_{j_1\in\{1,\cdots,n\}}\ \sum_{j_2,\cdots,j_r} |u(j_1,j_2,\cdots,j_r)| .
\]

For example,

\[
\|S\|_{1,\infty} \;=\; \sup_{i\in\{1,\cdots,n\}}\ \sum_{j} |S_{ij}| .
\]
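The two norms just defined are straightforward to evaluate for small kernels. The sketch below is a hypothetical illustration: it assumes a kernel $u$ on $\{1,\cdots,n\}^r$ is stored as a dictionary from index tuples to values, with missing keys meaning zero, and the names `l1_norm` and `l1_linf_norm` are not from the paper.

```python
def l1_norm(u):
    """||u||_1: sum of |u| over all index tuples."""
    return sum(abs(value) for value in u.values())


def l1_linf_norm(u, n, r):
    """||u||_{1,infty}: fix one argument, sum |u| over the others, take the sup."""
    best = 0.0
    for i in range(r):                 # which argument is held fixed
        for ji in range(1, n + 1):     # the value it is fixed to
            total = sum(abs(v) for idx, v in u.items() if idx[i] == ji)
            best = max(best, total)
    return best


def l1_linf_norm_first_argument(u, n):
    """For antisymmetric u it suffices to hold the first argument fixed."""
    return max(
        sum(abs(v) for idx, v in u.items() if idx[0] == j1)
        for j1 in range(1, n + 1)
    )


# A 3 x 3 skew symmetric matrix S, stored as a kernel on {1,2,3}^2.
S = {(1, 2): 1.0, (2, 1): -1.0, (1, 3): -3.0, (3, 1): 3.0, (2, 3): 2.0, (3, 2): -2.0}
```

For this `S`, the row sums of absolute values are 4, 3 and 5, so the mixed norm is the largest row sum, in agreement with the formula for $\|S\|_{1,\infty}$ above.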


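Remark 3.1. Let $u(j_1,\cdots,j_r)$ be a function on $\{1,\cdots,n\}^r$ and set

\[
\mathrm{Alt}\, u(j_1,\cdots,j_r) \;=\; \frac{1}{r!}\sum_{\pi\in S_r} \mathrm{sgn}(\pi)\ \pi\cdot u(j_1,\cdots,j_r),
\]

where $\pi\cdot u(j_1,\cdots,j_r)=u(j_{\pi(1)},\cdots,j_{\pi(r)})$. Observe that $\|\pi\cdot u\|_1=\|u\|_1$ for all $\pi\in S_r$ and consequently,

\[
\|\mathrm{Alt}\, u\|_1 \;\le\; \frac{1}{r!}\sum_{\pi\in S_r}\|\pi\cdot u\|_1 \;=\; \|u\|_1 .
\]

That is, $\|\mathrm{Alt}\,u\|_1\le\|u\|_1$. Similarly, $\|\mathrm{Alt}\,u\|_{1,\infty}\le\|u\|_{1,\infty}$.

Proposition 3.2 (Tree Bound). Let $f(h_1,\cdots,h_m)$ and $u_i(j,K_i)=u_i(j,k_{i1},\cdots,k_{it_i})$, $i=1,\cdots,\ell$, be antisymmetric functions of their arguments with $m\ge\ell$. Let

\[
T(K_1,\cdots,K_\ell) \;=\; \sum_{h_1,\cdots,h_m} |f(h_1,\cdots,h_m)|\ \prod_{i=1}^{\ell}\Big(\sum_{j=1}^{n} |S_{h_i j}|\,|u_i(j,K_i)|\Big).
\]

Then,

\[
\|T\|_1 \;\le\; \|f\|_1 \prod_{i=1}^{\ell} \|S\|_{1,\infty}\,\|u_i\|_{1,\infty} .
\]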

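Proof. We have

\[
\begin{aligned}
\|T\|_1 &= \sum_{K_1,\cdots,K_\ell} |T(K_1,\cdots,K_\ell)| \\
&= \sum_{K_1,\cdots,K_\ell}\ \sum_{h_1,\cdots,h_m} |f(h_1,\cdots,h_m)|\ \prod_{i=1}^{\ell}\Big(\sum_{j=1}^{n} |S_{h_i j}|\,|u_i(j,K_i)|\Big) \\
&\le \sum_{h_1,\cdots,h_m} |f(h_1,\cdots,h_m)|\ \prod_{i=1}^{\ell}\Big(\sum_{j=1}^{n} |S_{h_i j}|\Big)\ \prod_{i=1}^{\ell}\|u_i\|_{1,\infty} \\
&\le \sum_{h_1,\cdots,h_m} |f(h_1,\cdots,h_m)|\ \prod_{i=1}^{\ell}\|S\|_{1,\infty}\ \prod_{i=1}^{\ell}\|u_i\|_{1,\infty} \\
&= \|f\|_1 \prod_{i=1}^{\ell} \|S\|_{1,\infty}\,\|u_i\|_{1,\infty} .
\end{aligned}
\]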
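The chain of inequalities in this proof can be spot-checked numerically. The following sketch draws random kernels for a small $n$ and compares $\|T\|_1$ with the right-hand side of the Tree Bound. It is an illustration under simplifying assumptions that are not in the paper: real entries, each $K_i$ a single index, zero-based indices, and no antisymmetrisation of $f$ or $u_i$ (the two estimates used in the proof do not need it). The function name `tree_bound_check` is hypothetical.

```python
import random
from itertools import product


def tree_bound_check(n=3, m=3, ell=2, seed=0):
    """Return (||T||_1, ||f||_1 * prod_i ||S||_{1,inf} ||u_i||_{1,inf}) for random data."""
    rng = random.Random(seed)
    idx = range(n)

    # Random skew symmetric matrix S.
    S = [[0.0] * n for _ in idx]
    for i in idx:
        for j in range(i + 1, n):
            S[i][j] = rng.uniform(-1.0, 1.0)
            S[j][i] = -S[i][j]

    f = {h: rng.uniform(-1.0, 1.0) for h in product(idx, repeat=m)}
    u = [{(j, k): rng.uniform(-1.0, 1.0) for j in idx for k in idx} for _ in range(ell)]

    # Left-hand side: sum over K_1..K_ell of T(K_1..K_ell).
    lhs = 0.0
    for K in product(idx, repeat=ell):
        for h, fh in f.items():
            term = abs(fh)
            for i in range(ell):
                term *= sum(abs(S[h[i]][j]) * abs(u[i][(j, K[i])]) for j in idx)
            lhs += term

    # Right-hand side, using the mixed norm (sup over which argument is fixed).
    s_norm = max(sum(abs(x) for x in row) for row in S)
    rhs = sum(abs(v) for v in f.values())
    for i in range(ell):
        fix_first = max(sum(abs(u[i][(j, k)]) for k in idx) for j in idx)
        fix_second = max(sum(abs(u[i][(j, k)]) for j in idx) for k in idx)
        rhs *= s_norm * max(fix_first, fix_second)
    return lhs, rhs
```

Calling `tree_bound_check(seed=k)` for a few values of `k` should always return a pair with the first entry no larger than the second.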

A second prerequisite for introducing a family of norms on $A(a_1,\cdots,a_n)$ that “correctly measures” the size of $R(f)$ is to choose a nondecreasing function $\Phi$ on $\mathbb{N}$ and a $\Lambda>0$ satisfying the

Hypothesis 3.3. For all multi indices $I$ and $J$,

\[
\Big|\int a_I\, {:}a_J{:}_S\, d\mu_S\Big| \;\le\; \begin{cases} \Phi(|I|)\,\Lambda^{\frac12(|I|+|J|)}, & |J|\le|I| \\ 0, & |J|>|I|. \end{cases}
\]

The form of this hypothesis is motivated by two examples.

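Example 3.4 (Global Factorial). For any complex, skew symmetric matrix $S=(S_{ij})$, the bound

\[
\Big|\int a_I\, {:}a_J{:}_S\, d\mu_S\Big| \;\le\; \begin{cases} |I|!\ \|S\|_\infty^{\frac12(|I|+|J|)}, & |J|\le|I| \\ 0, & |J|>|I| \end{cases}
\]

holds for all multi indices $I$ and $J$. The proof of this crude inequality is by induction on $|J|$. Suppose $|J|=0$. If $I=\{i_1,\cdots,i_r\}$, then

\[
\int a_I\, d\mu_S \;=\; \begin{cases} \mathrm{Pf}\big(S_{i_k i_\ell}\big), & r \text{ is even} \\ 0, & r \text{ is odd,} \end{cases}
\]

where $\mathrm{Pf}(S_{i_k i_\ell})$ is the Pfaffian of the matrix with elements $S_{i_k i_\ell}$, $k,\ell=1,\cdots,r$. We have

\[
\begin{aligned}
\Big|\int a_I\, d\mu_S\Big| &\le \sum_{k_1,\cdots,k_r=1}^{r} |\varepsilon^{k_1\cdots k_r}|\ |S_{i_{k_1}i_{k_2}}|\cdots|S_{i_{k_{r-1}}i_{k_r}}| \\
&\le \|S\|_\infty^{\frac12 r} \sum_{k_1,\cdots,k_r=1}^{r} |\varepsilon^{k_1\cdots k_r}| \\
&= \|S\|_\infty^{\frac12 r}\ r! .
\end{aligned}
\]

Suppose $|J|>0$. Integration by parts with respect to $a^\sharp_{j_1}$ gives

\[
\begin{aligned}
\int a_I\, {:}a_J{:}_S\, d\mu_S &= \int a_I\, a^\sharp_J\, d\mu_{S^\sharp} \\
&= (-1)^{|I|} \int a^\sharp_{j_1}\, a_I\, a^\sharp_{J\setminus\{j_1\}}\, d\mu_{S^\sharp} \\
&= (-1)^{|I|} \sum_{\ell=1}^{|I|} (-1)^{\ell-1}\, S_{j_1 i_\ell} \int a_{I\setminus\{i_\ell\}}\, a^\sharp_{J\setminus\{j_1\}}\, d\mu_{S^\sharp} \\
&= (-1)^{|I|} \sum_{\ell=1}^{|I|} (-1)^{\ell-1}\, S_{j_1 i_\ell} \int a_{I\setminus\{i_\ell\}}\, {:}a_{J\setminus\{j_1\}}{:}_S\, d\mu_S .
\end{aligned}
\]

Our induction hypothesis implies that

\[
\Big|\int a_{I\setminus\{i_\ell\}}\, {:}a_{J\setminus\{j_1\}}{:}_S\, d\mu_S\Big| \;\le\; \|S\|_\infty^{\frac12(|I|+|J|-2)}\ (|I|-1)!
\]

for each $\ell=1,\cdots,|I|$. Now,

\[
\begin{aligned}
\Big|\int a_I\, {:}a_J{:}_S\, d\mu_S\Big| &\le \sum_{\ell=1}^{|I|} |S_{j_1 i_\ell}|\ \Big|\int a_{I\setminus\{i_\ell\}}\, {:}a_{J\setminus\{j_1\}}{:}_S\, d\mu_S\Big| \\
&\le \|S\|_\infty^{\frac12(|I|+|J|-2)}\ (|I|-1)!\ \sum_{\ell=1}^{|I|} |S_{j_1 i_\ell}| \\
&\le \|S\|_\infty^{\frac12(|I|+|J|)}\ (|I|-1)!\ |I| .
\end{aligned}
\]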
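To see how crude the factorial bound is in the case $|J|=0$, one can compute a Pfaffian directly as a signed sum over pairings and compare its size with $r!\,\|S\|_\infty^{r/2}$. The sketch below is an illustration only: it assumes $\|S\|_\infty$ means the largest absolute value of a matrix element, it uses the standard expansion of the Pfaffian along the first row, and the function names are hypothetical.

```python
from math import factorial


def pfaffian(A, idx=None):
    """Pfaffian of the skew symmetric matrix A restricted to the index list idx."""
    if idx is None:
        idx = list(range(len(A)))
    if len(idx) % 2:
        return 0
    if not idx:
        return 1
    first, rest = idx[0], idx[1:]
    total = 0
    for p, partner in enumerate(rest):
        # Pair `first` with `partner`; the sign alternates with the partner's position.
        sign = -1 if p % 2 else 1
        total += sign * A[first][partner] * pfaffian(A, rest[:p] + rest[p + 1:])
    return total


def count_pairings(r):
    """Number of terms in the expansion: (r-1)!! for even r, 0 for odd r."""
    if r % 2:
        return 0
    count = 1
    for j in range(r - 1, 0, -2):
        count *= j
    return count


def factorial_bound(A):
    """The crude bound r! * ||A||_inf^(r/2), with ||A||_inf taken as max |A_ij|."""
    r = len(A)
    sup = max(abs(x) for row in A for x in row)
    return factorial(r) * sup ** (r / 2)


A = [
    [0, 1, 2, 3],
    [-1, 0, 4, 5],
    [-2, -4, 0, 6],
    [-3, -5, -6, 0],
]
```

For the $4\times 4$ matrix `A`, the expansion has `count_pairings(4) == 3` terms, $a_{12}a_{34}-a_{13}a_{24}+a_{14}a_{23}$, whereas the bound counts $4!=24$.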

This “perturbative bound” is obtained by ignoring all potential cancellations between the at most $|I|!$ nonzero terms appearing in the Pfaffian equal to $\int a_I\,a^\sharp_J\,d\mu_{S^\sharp}$.

Example 3.5 (Gram’s Inequality). Suppose that $S=(S_{ij})$ is a complex, skew symmetric matrix of the form

\[
S \;=\; \begin{pmatrix} 0 & \Sigma \\ -\Sigma^t & 0 \end{pmatrix},
\]

where $\Sigma=(\Sigma_{ij})$ is a matrix of order $\frac n2$. Suppose, in addition, that there is a complex Hilbert space $\mathcal{H}$, elements $v_i,w_i\in\mathcal{H}$, $i=1,\cdots,\frac n2$, and a constant $\Lambda>0$ with

\[
\Sigma_{ij} \;=\; \langle v_i,w_j\rangle_{\mathcal{H}}
\qquad\text{and}\qquad
\|v_i\|_{\mathcal{H}},\ \|w_j\|_{\mathcal{H}} \;\le\; \Big(\frac{\Lambda}{2}\Big)^{\frac12}
\]

for all $i,j=1,\cdots,\frac n2$. Then, the “nonperturbative bound”

\[
\Big|\int a_I\, {:}a_J{:}_S\, d\mu_S\Big| \;\le\; \begin{cases} \Lambda^{\frac12(|I|+|J|)}, & |J|\le|I| \\ 0, & |J|>|I| \end{cases}
\]

holds for all multi indices $I$ and $J$. The proof is presented in the Appendix.

Now, let

\[
f(a) \;=\; \sum_{m\ge 0} f^{(m)}(a)
\]

be a Grassmann polynomial in $A(a_1,\cdots,a_n)$ where, for each $m\ge 0$,

\[
f^{(m)}(a) \;=\; \sum_{j_1,\cdots,j_m} f_m(j_1,\cdots,j_m)\, a_{j_1}\cdots a_{j_m}
\]

and the kernel $f_m(j_1,\cdots,j_m)$ is an antisymmetric function of its arguments. Fix a complex, skew symmetric matrix $S=(S_{ij})$ of order $n$ satisfying Hypothesis 3.3. We recall

Definition 1.4. For all $\alpha\ge 2$, the “external” and “internal” naive power counting norms $\|f\|_\alpha$ and $|||f|||_\alpha$ of the Grassmann polynomial $f(a)$ are

\[
\|f\|_\alpha \;=\; \sum_{m\ge 0}\|f^{(m)}\|_\alpha \;=\; \sum_{m\ge 0} \alpha^m\, \Lambda^{\frac12 m}\, \|f_m\|_1
\]

and

\[
|||f|||_\alpha \;=\; \sum_{m\ge 0}|||f^{(m)}|||_\alpha \;=\; \sum_{m\ge 0} \alpha^m\, \|S\|_{1,\infty}\, \Lambda^{\frac12(m-2)}\, \|f_m\|_{1,\infty} .
\]

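By the triangle inequality,

\[
\|R(f)\|_\alpha \;\le\; \sum_{\ell\ge 1}\ \sum_{r,s\in\mathbb{N}^\ell} \|R_{rs}(f)\|_\alpha \;\le\; \sum_{m\ge 0}\ \sum_{\ell\ge 1}\ \sum_{r,s\in\mathbb{N}^\ell} \|R_{rs}(f^{(m)})\|_\alpha
\]

and consequently,

\[
\|R(f)\|_\alpha \;\le\; \sum_{m\ge 1}\ \sum_{\ell=1}^{m}\ \sum_{r,s\in\mathbb{N}^\ell} \|R_{rs}(f^{(m)})\|_\alpha
\]

since $R_{rs}(f^{(m)})=0$ for all $r,s\in\mathbb{N}^\ell$ when $\ell>m$. Furthermore,

\[
\|R_{rs}(f^{(m)})\|_\alpha \;=\; \alpha^{\sum_{i=1}^{\ell}(r_i-s_i)}\ \Lambda^{\frac12\sum_{i=1}^{\ell}(r_i-s_i)}\ \|\mathrm{Alt}\, R_{rs}(f^{(m)})\|_1
\]

since

\[
R_{rs}(f^{(m)}) \;=\; \sum_{j_1,\cdots,j_M} \mathrm{Alt}\, R_{rs}(f^{(m)})(j_1,\cdots,j_M)\ a_{j_1}\cdots a_{j_M}
\]

with $M=(r_1-s_1)+\cdots+(r_\ell-s_\ell)$. Altogether,

\[
\|R(f)\|_\alpha \;\le\; \sum_{m\ge 1}\ \sum_{\ell=1}^{m}\ \sum_{r,s\in\mathbb{N}^\ell} \alpha^{\sum_{i=1}^{\ell}(r_i-s_i)}\ \Lambda^{\frac12\sum_{i=1}^{\ell}(r_i-s_i)}\ \|R_{rs}(f^{(m)})\|_1 .
\]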

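Proposition 3.2 will now be used to obtain a bound on the norm $\|R_{rs}(f^{(m)})\|_1$ of the kernel $R_{rs}(f^{(m)})(K_1,\cdots,K_\ell)$.

Lemma 3.6. Let $H,J_1,\cdots,J_\ell$ be multi indices with $|H|=m\ge\ell$. Then,

\[
\Big|\int {:}\prod_{i=1}^{\ell} a_{J_i}{:}_S\; a_H\, d\mu_S\Big| \;\le\; M(H,J_1,\cdots,J_\ell) \sum_{\substack{1\le\mu_1,\cdots,\mu_\ell\le m\\ \text{pairwise different}}}\ \prod_{i=1}^{\ell} |S_{h_{\mu_i} j_{i1}}|,
\]

where

\[
M(H,J_1,\cdots,J_\ell) \;=\; \sup_{\substack{1\le\mu_1,\cdots,\mu_\ell\le m\\ \text{pairwise different}}} \Big|\int a_{H\setminus\{h_{\mu_1},\cdots,h_{\mu_\ell}\}}\ {:}\prod_{i=1}^{\ell} a_{J_i\setminus\{j_{i1}\}}{:}_S\, d\mu_S\Big| .
\]

Proof. For convenience, set $k_i=j_{i1}$, $i=1,\cdots,\ell$. By antisymmetry, the integrand can be rewritten so that

\[
\int a_H\ {:}\prod_{i=1}^{\ell} a_{J_i}{:}_S\, d\mu_S \;=\; \pm\int a^\sharp_{k_\ell}\cdots a^\sharp_{k_1}\ a_H \prod_{i=1}^{\ell} a^\sharp_{J_i\setminus\{k_i\}}\, d\mu_{S^\sharp} .
\]

Now, integrate by parts successively with respect to $a^\sharp_{k_\ell},\cdots,a^\sharp_{k_1}$, and then apply Leibniz’s rule to obtain

\[
\begin{aligned}
\int a_H\ {:}\prod_{i=1}^{\ell} a_{J_i}{:}_S\, d\mu_S
&= \pm\int \Big(\prod_{i=1}^{\ell}\ \sum_{m=1}^{n} S_{k_i m}\frac{\partial}{\partial a_m}\Big) a_H\ \prod_{i=1}^{\ell} a^\sharp_{J_i\setminus\{k_i\}}\, d\mu_{S^\sharp} \\
&= \sum_{\substack{1\le\mu_1,\cdots,\mu_\ell\le m\\ \text{pairwise different}}} \pm\ \prod_{i=1}^{\ell} S_{k_i h_{\mu_i}} \int a_{H\setminus\{h_{\mu_1},\cdots,h_{\mu_\ell}\}}\ \prod_{i=1}^{\ell} a^\sharp_{J_i\setminus\{k_i\}}\, d\mu_{S^\sharp}
\end{aligned}
\]

since

\[
\begin{aligned}
\Big(\prod_{i=1}^{\ell}\ \sum_{m=1}^{n} S_{k_i m}\frac{\partial}{\partial a_m}\Big) a_H
&= \sum_{\substack{1\le\mu_1,\cdots,\mu_\ell\le m\\ \text{pairwise different}}}\ \Big(\prod_{i=1}^{\ell} S_{k_i h_{\mu_i}}\frac{\partial}{\partial a_{h_{\mu_i}}}\Big) a_H \\
&= \sum_{\substack{1\le\mu_1,\cdots,\mu_\ell\le m\\ \text{pairwise different}}} \pm\ \prod_{i=1}^{\ell} S_{k_i h_{\mu_i}}\ a_{H\setminus\{h_{\mu_1},\cdots,h_{\mu_\ell}\}} .
\end{aligned}
\]

It follows immediately that

\[
\begin{aligned}
\Big|\int a_H\ {:}\prod_{i=1}^{\ell} a_{J_i}{:}_S\, d\mu_S\Big|
&\le \sum_{\substack{1\le\mu_1,\cdots,\mu_\ell\le m\\ \text{pairwise different}}}\ \prod_{i=1}^{\ell} |S_{h_{\mu_i} k_i}|\ \Big|\int a_{H\setminus\{h_{\mu_1},\cdots,h_{\mu_\ell}\}}\ {:}\prod_{i=1}^{\ell} a_{J_i\setminus\{k_i\}}{:}_S\, d\mu_S\Big| \\
&\le M(H,J_1,\cdots,J_\ell) \sum_{\substack{1\le\mu_1,\cdots,\mu_\ell\le m\\ \text{pairwise different}}}\ \prod_{i=1}^{\ell} |S_{h_{\mu_i} k_i}| .
\end{aligned}
\]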

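Proposition 3.7. Let

\[
f^{(m)}(a) \;=\; \sum_{h_1,\cdots,h_m} f_m(h_1,\cdots,h_m)\, a_{h_1}\cdots a_{h_m}
\]

be a homogeneous Grassmann polynomial of degree $m$, where $f_m(h_1,\cdots,h_m)$ is an antisymmetric function of its arguments. Let $J_1,\cdots,J_\ell$ be multi indices with $m\ge\ell$. Then,

\[
\Big|\int {:}\prod_{i=1}^{\ell} b_{J_i}{:}_S\; f^{(m)}(b)\, d\mu_S\Big| \;\le\; \ell!\,\binom{m}{\ell}\ M(m,J_1,\cdots,J_\ell) \sum_{h_1,\cdots,h_m} |f_m(h_1,\cdots,h_m)|\ \prod_{i=1}^{\ell} |S_{h_i j_{i1}}|,
\]

where

\[
M(m,J_1,\cdots,J_\ell) \;=\; \sup_{|H|=m} M(H,J_1,\cdots,J_\ell) .
\]

Proof. For convenience, set $k_i=j_{i1}$, $i=1,\cdots,\ell$. By the preceding lemma,

\[
\begin{aligned}
\Big|\int {:}\prod_{i=1}^{\ell} b_{J_i}{:}_S\; f^{(m)}(b)\, d\mu_S\Big|
&\le \sum_{|H|=m} |f_m(H)|\ M(H,J_1,\cdots,J_\ell) \sum_{\substack{1\le\mu_1,\cdots,\mu_\ell\le m\\ \text{pairwise different}}}\ \prod_{i=1}^{\ell} |S_{h_{\mu_i} k_i}| \\
&\le M(m,J_1,\cdots,J_\ell) \sum_{|H|=m} |f_m(H)| \sum_{\substack{1\le\mu_1,\cdots,\mu_\ell\le m\\ \text{pairwise different}}}\ \prod_{i=1}^{\ell} |S_{h_{\mu_i} k_i}| .
\end{aligned}
\]

Observe that, by the antisymmetry of $f_m$,

\[
\begin{aligned}
\sum_{|H|=m} |f_m(H)| \sum_{\substack{1\le\mu_1,\cdots,\mu_\ell\le m\\ \text{pairwise different}}}\ \prod_{i=1}^{\ell} |S_{h_{\mu_i} k_i}|
&= \sum_{\substack{1\le\mu_1,\cdots,\mu_\ell\le m\\ \text{pairwise different}}}\ \sum_{h_1,\cdots,h_m} |f_m(h_1,\cdots,h_m)|\ \prod_{i=1}^{\ell} |S_{h_{\mu_i} k_i}| \\
&= \sum_{\substack{1\le\mu_1,\cdots,\mu_\ell\le m\\ \text{pairwise different}}}\ \sum_{h_1,\cdots,h_m} |f_m(h_1,\cdots,h_m)|\ \prod_{i=1}^{\ell} |S_{h_i k_i}| \\
&= \ell!\,\binom{m}{\ell} \sum_{h_1,\cdots,h_m} |f_m(h_1,\cdots,h_m)|\ \prod_{i=1}^{\ell} |S_{h_i k_i}| .
\end{aligned}
\]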

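Proposition 3.8. Let

\[
f^{(m)}(a) \;=\; \sum_{h_1,\cdots,h_m} f_m(h_1,\cdots,h_m)\, a_{h_1}\cdots a_{h_m}
\]

be as above. Let $r,s\in\mathbb{N}^\ell$ with $m\ge\ell$. Then, the $L^1$ norm $\|R_{rs}(f^{(m)})\|_1$ of the kernel $R_{rs}(f^{(m)})(K_1,\cdots,K_\ell)$ is bounded by

(a)
\[
\|R_{rs}(f^{(m)})\|_1 \;\le\; \binom{m}{\ell}\ M(m,s)\ \|f_m\|_1 \prod_{i=1}^{\ell} \binom{r_i}{s_i}\, \|S\|_{1,\infty}\, \|w_{r_i}\|_{1,\infty},
\]
where
\[
M(m,s) \;=\; \sup_{\substack{|J_i|=s_i\\ i=1,\cdots,\ell}} M(m,J_1,\cdots,J_\ell) .
\]

(b)
\[
\|R_{rs}(f^{(m)})\|_1 \;\le\; \binom{m}{\ell}\ \Phi(m-\ell)\ \Lambda^{\frac12\left(m-\sum_{i=1}^{\ell}(r_i-s_i)\right)}\ \|f_m\|_1 \prod_{i=1}^{\ell} \binom{r_i}{s_i}\, \|S\|_{1,\infty}\, \Lambda^{\frac12(r_i-2)}\, \|w_{r_i}\|_{1,\infty}
\]
when, in addition, Hypothesis 3.3 is satisfied.

Proof. To verify (a), set

\[
u_i(j,K_i) \;=\; \sum_{|J'_i|=s_i-1} |w_{r_i}(j,J'_i,K_i)|
\]

for each $i=1,\cdots,\ell$. By construction, $\|u_i\|_{1,\infty}=\|w_{r_i}\|_{1,\infty}$, $i=1,\cdots,\ell$. Also, set

\[
T(K_1,\cdots,K_\ell) \;=\; \sum_{h_1,\cdots,h_m} |f_m(h_1,\cdots,h_m)|\ \prod_{i=1}^{\ell}\Big(\sum_{j=1}^{n} |S_{h_i j}|\,|u_i(j,K_i)|\Big).
\]

By Proposition 3.7,

\[
|R_{rs}(f)(K_1,\cdots,K_\ell)| \;\le\; \binom{m}{\ell}\ M(m,s) \prod_{i=1}^{\ell}\binom{r_i}{s_i}\ T(K_1,\cdots,K_\ell)
\]

since

\[
\begin{aligned}
\sum_{\substack{|J_i|=s_i\\ i=1,\cdots,\ell}} M(m,J_1,\cdots,J_\ell) &\sum_{h_1,\cdots,h_m} |f_m(h_1,\cdots,h_m)|\ \prod_{i=1}^{\ell} |S_{h_i j_{i1}}|\,|w_{r_i}(J_i,K_i)| \\
&\le M(m,s) \sum_{h_1,\cdots,h_m} |f_m(h_1,\cdots,h_m)|\ \prod_{i=1}^{\ell}\Big(\sum_{j=1}^{n}\ \sum_{|J'_i|=s_i-1} |S_{h_i j}|\,|w_{r_i}(j,J'_i,K_i)|\Big) \\
&= M(m,s) \sum_{h_1,\cdots,h_m} |f_m(h_1,\cdots,h_m)|\ \prod_{i=1}^{\ell}\Big(\sum_{j=1}^{n} |S_{h_i j}|\,|u_i(j,K_i)|\Big).
\end{aligned}
\]

It follows from Proposition 3.2 that

\[
\begin{aligned}
\|R_{rs}(f)\|_1 &\le \binom{m}{\ell}\ M(m,s) \prod_{i=1}^{\ell}\binom{r_i}{s_i}\ \|T\|_1 \\
&\le \binom{m}{\ell}\ M(m,s)\ \|f_m\|_1 \prod_{i=1}^{\ell}\binom{r_i}{s_i}\ \prod_{i=1}^{\ell} \|S\|_{1,\infty}\,\|w_{r_i}\|_{1,\infty} .
\end{aligned}
\]

For (b), simply observe that, by Hypothesis 3.3,

\[
\Big|\int a_{H\setminus\{h_{\mu_1},\cdots,h_{\mu_\ell}\}}\ {:}\prod_{i=1}^{\ell} a_{J_i\setminus\{j_{i1}\}}{:}_S\, d\mu_S\Big| \;\le\; \Phi(m-\ell)\ \Lambda^{\frac12\left(m+\sum_{i=1}^{\ell}(|J_i|-2)\right)}
\]

for any multi indices $H,J_1,\cdots,J_\ell$ with $|H|=m\ge\ell$ and any pairwise different sequence of indices $1\le\mu_1,\cdots,\mu_\ell\le m$. Consequently,

\[
M(m,s) \;=\; \sup_{\substack{|J_i|=s_i\\ i=1,\cdots,\ell}}\ \sup_{|H|=m} M(H,J_1,\cdots,J_\ell) \;\le\; \Phi(m-\ell)\ \Lambda^{\frac12\left(m+\sum_{i=1}^{\ell}(s_i-2)\right)} .
\]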

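We have developed all the material required for a useful bound on the operator $R$. For the rest of this section we assume Hypothesis 3.3.

Lemma 3.9. Let

\[
f^{(m)}(a) \;=\; \sum_{h_1,\cdots,h_m} f_m(h_1,\cdots,h_m)\, a_{h_1}\cdots a_{h_m}
\]

be as above and let $m\ge\ell$. Then, for all $\alpha\ge 2$,

\[
\sum_{r,s\in\mathbb{N}^\ell} \|R_{rs}(f^{(m)})\|_\alpha \;\le\; \Phi(m-\ell)\ \|f^{(m)}\|_\alpha\ |||W|||^{\ell}_{\alpha+1} .
\]

Proof. By Proposition 3.8 (b),

\[
\|R_{rs}(f^{(m)})\|_\alpha \;\le\; \alpha^{\sum_{i=1}^{\ell}(r_i-s_i)}\ \Lambda^{\frac12\sum_{i=1}^{\ell}(r_i-s_i)}\ \|R_{rs}(f^{(m)})\|_1 \;\le\; \Phi(m-\ell)\ \binom{m}{\ell}\ \Lambda^{\frac12 m}\ \|f_m\|_1\ P_{rs},
\]

where, for convenience,

\[
P_{rs} \;=\; \prod_{i=1}^{\ell} \binom{r_i}{s_i}\ \alpha^{r_i-s_i}\ \|S\|_{1,\infty}\ \Lambda^{\frac12(r_i-2)}\ \|w_{r_i}\|_{1,\infty} .
\]

However,

\[
\binom{m}{\ell}\ \Lambda^{\frac12 m}\ \|f_m\|_1 \;=\; \frac{1}{\alpha^m}\binom{m}{\ell}\ \alpha^m\, \Lambda^{\frac12 m}\ \|f_m\|_1 \;\le\; \|f^{(m)}\|_\alpha
\]

when $\alpha\ge 2$, because $\binom{m}{\ell}\le 2^m\le\alpha^m$, and consequently,

\[
\sum_{r,s\in\mathbb{N}^\ell} \|R_{rs}(f^{(m)})\|_\alpha \;\le\; \Phi(m-\ell)\ \|f^{(m)}\|_\alpha \sum_{\substack{r_i\ge s_i\ge 1\\ i=1,\cdots,\ell}} P_{rs} .
\]

Observe that

\[
\sum_{\substack{r_i\ge s_i\ge 1\\ i=1,\cdots,\ell}} P_{rs} \;=\; \Big(\sum_{r\ge s\ge 1} \binom{r}{s}\ \alpha^{r-s}\ \|S\|_{1,\infty}\ \Lambda^{\frac12(r-2)}\ \|w_r\|_{1,\infty}\Big)^{\ell} \;\le\; |||W|||^{\ell}_{\alpha+1}
\]

since

\[
\begin{aligned}
\sum_{r\ge s\ge 1} \binom{r}{s}\ \alpha^{r-s}\ \|S\|_{1,\infty}\ \Lambda^{\frac12(r-2)}\ \|w_r\|_{1,\infty}
&\le \sum_{r\ge s\ge 0} \binom{r}{s}\ \alpha^{r-s}\ \|S\|_{1,\infty}\ \Lambda^{\frac12(r-2)}\ \|w_r\|_{1,\infty} \\
&= \sum_{r\ge 0} (\alpha+1)^r\ \|S\|_{1,\infty}\ \Lambda^{\frac12(r-2)}\ \|w_r\|_{1,\infty} \\
&= |||W|||_{\alpha+1} .
\end{aligned}
\]

Therefore,

\[
\sum_{r,s\in\mathbb{N}^\ell} \|R_{rs}(f^{(m)})\|_\alpha \;\le\; \Phi(m-\ell)\ \|f^{(m)}\|_\alpha\ |||W|||^{\ell}_{\alpha+1} .
\]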
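Two elementary facts carry the proof of Lemma 3.9: the binomial theorem $\sum_{s=0}^{r}\binom{r}{s}\alpha^{r-s}=(\alpha+1)^r$, which converts the weight $\alpha$ into $\alpha+1$, and the estimate $\binom{m}{\ell}\le 2^m\le\alpha^m$ for $\alpha\ge 2$. The short sketch below checks both for small integer values; the helper name `binomial_sum` is hypothetical.

```python
from math import comb


def binomial_sum(r, alpha, s_min=0):
    """Sum over s_min <= s <= r of C(r, s) * alpha^(r - s)."""
    return sum(comb(r, s) * alpha ** (r - s) for s in range(s_min, r + 1))


def binomial_is_dominated(m, alpha):
    """Check C(m, l) <= 2^m <= alpha^m for every 0 <= l <= m (needs alpha >= 2)."""
    return all(comb(m, l) <= 2 ** m <= alpha ** m for l in range(m + 1))
```

Dropping the $s=0$ term only removes $\alpha^r$, so the sum over $r\ge s\ge 1$ used in the lemma is bounded by the full sum: `binomial_sum(r, alpha, 1) == (alpha + 1) ** r - alpha ** r`.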

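We can now prove

Theorem 1.8. Suppose $2\,|||W|||_{\alpha+1}\le 1$. Then, for all polynomials $f$ in the Grassmann algebra $A(a_1,\cdots,a_n)$,

\[
\|R(f)\|_\alpha \;\le\; 2\,\Phi(n)\ |||W|||_{\alpha+1}\ \|f\|_\alpha .
\]

Proof. By Lemma 3.9,

\[
\begin{aligned}
\|R(f)\|_\alpha &\le \sum_{m\ge 1}\ \sum_{\ell=1}^{m}\ \sum_{r,s\in\mathbb{N}^\ell} \|R_{rs}(f^{(m)})\|_\alpha \\
&\le \Phi(n) \sum_{m\ge 1} \|f^{(m)}\|_\alpha \sum_{\ell=1}^{m} |||W|||^{\ell}_{\alpha+1} \\
&\le \Phi(n)\ \|f\|_\alpha\ \frac{|||W|||_{\alpha+1}}{1-|||W|||_{\alpha+1}} \\
&\le 2\,\Phi(n)\ \|f\|_\alpha\ |||W|||_{\alpha+1} .
\end{aligned}
\]

Corollary 3.10. Suppose $2\,(1+\Phi(n))\,|||W|||_{\alpha+1}<1$. Then, for all polynomials $f$ in the Grassmann algebra $A(a_1,\cdots,a_n)$,

\[
\|(\mathbb{1}-R)^{-1}(f)\|_\alpha \;\le\; \frac{1}{1-2\,\Phi(n)\,|||W|||_{\alpha+1}}\ \|f\|_\alpha .
\]

Lemma 3.11. For all Grassmann polynomials $f$ in $A(a_1,\cdots,a_n)$,

\[
\Big|\int f(a)\, d\mu_S\Big| \;\le\; \Phi(n)\ \|f\|_\alpha .
\]

Proof. As usual, write

\[
f(a) \;=\; \sum_{m\ge 0} f^{(m)}(a),
\]

where, for each $m\ge 0$, $f^{(m)}(a)=\sum_{j_1,\cdots,j_m}f_m(j_1,\cdots,j_m)\,a_{j_1}\cdots a_{j_m}$ and the kernel $f_m(j_1,\cdots,j_m)$ is an antisymmetric function of its arguments. Then, by Hypothesis 3.3,

\[
\begin{aligned}
\Big|\int f(a)\, d\mu_S\Big| &\le \sum_{m\ge 0} \Big|\int f^{(m)}(a)\, d\mu_S\Big| \\
&\le \sum_{m\ge 0}\ \sum_{j_1,\cdots,j_m} |f_m(j_1,\cdots,j_m)|\ \Big|\int a_{j_1}\cdots a_{j_m}\, d\mu_S\Big| \\
&\le \sum_{m\ge 0}\ \sum_{j_1,\cdots,j_m} |f_m(j_1,\cdots,j_m)|\ \Phi(m)\ \Lambda^{\frac12 m} \\
&\le \sum_{m\ge 0} \|f_m\|_1\ \Phi(m)\ \alpha^m\ \Lambda^{\frac12 m} \\
&\le \Phi(n)\ \|f\|_\alpha .
\end{aligned}
\]

Recall that the correlation functions $S_m(j_1,\cdots,j_m)$, $m\ge 0$, corresponding to the interaction $W(a)$ and the propagator $S=(S_{ij})$ are given by

\[
S_m(j_1,\cdots,j_m) \;=\; \frac{1}{Z}\int a_{j_1}\cdots a_{j_m}\, e^{W(a)}\, d\mu_S(a).
\]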

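Theorem 1.9. Suppose $2\,(1+\Phi(n))\,|||W|||_{\alpha+1}<1$. Then, for each $m\ge 0$ and all sequences of indices $1\le j_1,\cdots,j_m\le n$,

\[
|S_m(j_1,\cdots,j_m)| \;\le\; \frac{\Phi(n)}{1-2\,\Phi(n)\,|||W|||_{\alpha+1}}\ \alpha^m\ \Lambda^{\frac12 m} .
\]

Proof. Fix $1\le j_1,\cdots,j_m\le n$ and rewrite the monomial $a_J=a_{j_1}\cdots a_{j_m}$ as

\[
a_J \;=\; \sum_{k_1,\cdots,k_m} \mathrm{Alt}\big(\delta_{k_1,j_1}\cdots\delta_{k_m,j_m}\big)(k_1,\cdots,k_m)\ a_{k_1}\cdots a_{k_m} .
\]

Then,

\[
\|a_J\|_\alpha \;=\; \alpha^m\ \Lambda^{\frac12 m}\ \|\mathrm{Alt}(\delta_{\cdot,j_1}\cdots\delta_{\cdot,j_m})\|_1 \;\le\; \alpha^m\ \Lambda^{\frac12 m} .
\]

By Theorem 1.3, Lemma 3.11 and Corollary 3.10,

\[
\begin{aligned}
|S_m(j_1,\cdots,j_m)| &= \Big|\int (\mathbb{1}-R)^{-1}(a_J)\, d\mu_S\Big| \\
&\le \Phi(n)\ \|(\mathbb{1}-R)^{-1}(a_J)\|_\alpha \\
&\le \frac{\Phi(n)}{1-2\,\Phi(n)\,|||W|||_{\alpha+1}}\ \|a_J\|_\alpha,
\end{aligned}
\]

so that

\[
|S_m(j_1,\cdots,j_m)| \;\le\; \frac{\Phi(n)}{1-2\,\Phi(n)\,|||W|||_{\alpha+1}}\ \alpha^m\ \Lambda^{\frac12 m} .
\]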
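The arithmetic behind Theorem 1.8, Corollary 3.10 and Theorem 1.9 is a geometric series: if $x=|||W|||_{\alpha+1}\le\frac12$ then $\sum_{\ell=1}^{m}x^\ell\le\frac{x}{1-x}\le 2x$, and if $q=2\Phi(n)\,x<1$ the Neumann series for $(\mathbb{1}-R)^{-1}$ is dominated by $\frac{1}{1-q}$. The sketch below evaluates the resulting right-hand side for given numbers. The function names and the sample values are hypothetical; in particular `phi_n` and `w_norm` stand for $\Phi(n)$ and $|||W|||_{\alpha+1}$, which the code takes as inputs and does not compute.

```python
def geometric_tail(x, m):
    """Sum of x^l for l = 1..m."""
    return sum(x ** l for l in range(1, m + 1))


def neumann_factor(phi_n, w_norm):
    """1 / (1 - 2 Phi(n) |||W|||), valid under the hypothesis 2 (1 + Phi(n)) |||W||| < 1."""
    if not 2.0 * (1.0 + phi_n) * w_norm < 1.0:
        raise ValueError("hypothesis 2 (1 + Phi(n)) |||W||| < 1 is violated")
    return 1.0 / (1.0 - 2.0 * phi_n * w_norm)


def correlation_bound(m, alpha, lam, phi_n, w_norm):
    """Right-hand side of the bound on |S_m| in Theorem 1.9."""
    return phi_n * neumann_factor(phi_n, w_norm) * alpha ** m * lam ** (m / 2.0)
```

With the Gram-type situation of Example 3.5 one may take $\Phi\equiv 1$, in which case the hypothesis reads $4\,|||W|||_{\alpha+1}<1$.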

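Appendix: Gram’s Inequality for Pfaffians

Proposition. Suppose that $S=(S_{ij})$ is a complex, skew symmetric matrix of the form

\[
S \;=\; \begin{pmatrix} 0 & \Sigma \\ -\Sigma^t & 0 \end{pmatrix},
\]

where $\Sigma=(\Sigma_{ij})$ is a matrix of order $\frac n2$. Suppose, in addition, that there is a complex Hilbert space $\mathcal{H}$, elements $v_i,w_i\in\mathcal{H}$, $i=1,\cdots,\frac n2$, and a constant $\Lambda>0$ with

\[
\Sigma_{ij} \;=\; \langle v_i,w_j\rangle_{\mathcal{H}}
\qquad\text{and}\qquad
\|v_i\|_{\mathcal{H}},\ \|w_j\|_{\mathcal{H}} \;\le\; \Big(\frac{\Lambda}{2}\Big)^{\frac12}
\]

for all $i,j=1,\cdots,\frac n2$. Then, for all multi indices $I$ and $J$,

\[
\Big|\int a_I\, d\mu_S\Big| \;\le\; \Big(\frac{\Lambda}{2}\Big)^{\frac12|I|}
\]

and

\[
\Big|\int a_I\, {:}a_J{:}_S\, d\mu_S\Big| \;\le\; \begin{cases} \Lambda^{\frac12(|I|+|J|)}, & |J|\le|I| \\ 0, & |J|>|I|. \end{cases}
\]

Proof. To prove the first inequality, suppose $i_1<\cdots$ …

\[
\int a_{i_1}\cdots a_{i_r}\, d\mu_S \;=\; \mathrm{Pf}\begin{pmatrix} 0 & U \\ -U^t & 0 \end{pmatrix},
\]

where $U=(U_{k\ell})$ is the $\rho=\max\{k \mid i_k\le\frac n2\}$ by $r-\rho$ matrix with elements

\[
U_{k\ell} \;=\; \Sigma_{i_k,\, i_{\ell+\rho}-\frac n2} \;=\; \big\langle v_{i_k},\, w_{i_{\ell+\rho}-\frac n2}\big\rangle_{\mathcal{H}} .
\]

By direct inspection,

\[
\mathrm{Pf}\begin{pmatrix} 0 & U \\ -U^t & 0 \end{pmatrix} \;=\; \begin{cases} 0, & \rho\ne r-\rho \\ (-1)^{\frac12\rho(\rho-1)}\det(U), & \rho=r-\rho. \end{cases}
\]

If $r=2\rho$, then by Gram’s inequality for determinants

\[
\Big|\int a_{i_1}\cdots a_{i_r}\, d\mu_S\Big| \;=\; \Big|\det\big(\big\langle v_{i_k},\, w_{i_{\ell+\rho}-\frac n2}\big\rangle_{\mathcal{H}}\big)\Big| \;\le\; \prod_{k=1}^{\rho} \|v_{i_k}\|_{\mathcal{H}}\ \|w_{i_{k+\rho}-\frac n2}\|_{\mathcal{H}} \;\le\; \Big(\frac{\Lambda}{2}\Big)^{\rho} .
\]

Finally, by antisymmetry,

\[
\Big|\int a_I\, d\mu_S\Big| \;\le\; \Big(\frac{\Lambda}{2}\Big)^{\frac12|I|}
\]

for any multi index $I$. To prove the second inequality, set

\[
\Sigma^\sharp \;=\; \begin{pmatrix} 0 & \Sigma \\ \Sigma & \Sigma \end{pmatrix}.
\]

The matrix

\[
S^\sharp \;=\; \begin{pmatrix} 0 & S \\ S & S \end{pmatrix} \;=\; \begin{pmatrix} 0 & 0 & 0 & \Sigma \\ 0 & 0 & -\Sigma^t & 0 \\ 0 & \Sigma & 0 & \Sigma \\ -\Sigma^t & 0 & -\Sigma^t & 0 \end{pmatrix}
\]

is conjugated by the permutation matrix

\[
\begin{pmatrix} \mathbb{1} & 0 & 0 & 0 \\ 0 & 0 & \mathbb{1} & 0 \\ 0 & \mathbb{1} & 0 & 0 \\ 0 & 0 & 0 & \mathbb{1} \end{pmatrix}
\]

to

\[
\begin{pmatrix} 0 & \Sigma^\sharp \\ -{\Sigma^\sharp}^t & 0 \end{pmatrix} \;=\; \begin{pmatrix} 0 & 0 & 0 & \Sigma \\ 0 & 0 & \Sigma & \Sigma \\ 0 & -\Sigma^t & 0 & 0 \\ -\Sigma^t & -\Sigma^t & 0 & 0 \end{pmatrix}.
\]

Also, define the vectors $v_i^\sharp,w_i^\sharp$, $i=1,\cdots,n$, in the Hilbert space $\mathcal{H}\oplus\mathcal{H}$ by $v_i^\sharp=(0,v_i)$ for $1\le i\le\frac n2$ and $v_i^\sharp=(v_i,v_i)$ for $\frac n2$ …

\[
\Sigma^\sharp_{ij} \;=\; \big\langle v_i^\sharp,\, w_j^\sharp\big\rangle_{\mathcal{H}\oplus\mathcal{H}},
\qquad
\|v_i^\sharp\|_{\mathcal{H}\oplus\mathcal{H}},\ \|w_j^\sharp\|_{\mathcal{H}\oplus\mathcal{H}} \;\le\; \Lambda^{\frac12}
\]

for all $i,j=1,\cdots,n$. The second inequality has now been reduced to the first for the matrix

\[
\begin{pmatrix} 0 & \Sigma^\sharp \\ -{\Sigma^\sharp}^t & 0 \end{pmatrix},
\]

the Hilbert space $\mathcal{H}\oplus\mathcal{H}$ and the vectors $v_i^\sharp,w_i^\sharp$, $i=1,\cdots,n$.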
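The determinant form of Gram's inequality used in this proof, $|\det(\langle v_k,w_\ell\rangle)|\le\prod_k\|v_k\|\,\|w_k\|$, can be spot-checked numerically. The following sketch is an illustration under simplifying assumptions that are not in the paper: the Hilbert space is taken to be real and finite dimensional, the vectors are random, and the determinant is computed with the Leibniz formula, which is only sensible for small matrices. The function names are hypothetical.

```python
import random
from itertools import permutations


def det(M):
    """Determinant by the Leibniz formula (adequate for the small matrices used here)."""
    n = len(M)
    total = 0.0
    for perm in permutations(range(n)):
        inversions = sum(
            1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b]
        )
        term = -1.0 if inversions % 2 else 1.0
        for row, col in enumerate(perm):
            term *= M[row][col]
        total += term
    return total


def norm(x):
    """Euclidean norm of a real vector."""
    return sum(c * c for c in x) ** 0.5


def gram_check(rho=3, dim=5, seed=0):
    """Return (|det <v_k, w_l>|, prod_k ||v_k|| ||w_k||) for random real vectors."""
    rng = random.Random(seed)
    v = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(rho)]
    w = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(rho)]
    U = [[sum(a * b for a, b in zip(v[k], w[l])) for l in range(rho)] for k in range(rho)]
    lhs = abs(det(U))
    rhs = 1.0
    for k in range(rho):
        rhs *= norm(v[k]) * norm(w[k])
    return lhs, rhs
```

If every vector has norm at most $(\Lambda/2)^{1/2}$, the right-hand side is at most $(\Lambda/2)^{\rho}$, which is the estimate used for the first inequality of the Proposition.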

Acknowledgement. It is a pleasure to thank Detlef Lehmann for many stimulating discussions.

Note added in proof. After this article was accepted for publication, we were informed that Abdesselam and Rivasseau propose a combination of Gram’s inequality and “Trees, forests and jungles: A botanical garden for cluster expansions”, Constructive Physics, Lecture Notes in Physics 446, Berlin–Heidelberg–New York: Springer-Verlag, 1995, for controlling fermionic systems.

References

[BW] Brydges, D.C., Wright, J.D.: Mayer Expansions and the Hamilton-Jacobi Equation, II. Fermions, Dimensional Reduction Formulas. J. Stat. Phys. 51, 435–456 (1988)
[C] Caianello, E.R.: Number of Feynman Diagrams and Convergence. Il Nuovo Cimento 3, 223–225 (1956)
[FKLT] Feldman, J., Knörrer, H., Lehmann, D., Trubowitz, E.: Fermi Liquids in Two Space Dimensions. Constructive Physics. V. Rivasseau (ed.), Springer Lecture Notes in Physics, Berlin–Heidelberg–New York: Springer-Verlag, 1995
[FMRS] Feldman, J., Magnen, J., Rivasseau, V., Sénéor, R.: Massive Gross-Neveu model: A Rigorous Perturbative Construction. Phys. Rev. Lett. 54, 1479–1481 (1985)
[FMRT] Feldman, J., Magnen, J., Rivasseau, V., Trubowitz, E.: An Infinite Volume Expansion for Many Fermion Green’s Functions. Helv. Phys. Acta 65, 679–721 (1992)
[GK] Gawedzki, K., Kupiainen, A.: Gross-Neveu Model Through Convergent Expansions. CMP 102, 1–30 (1985)

Communicated by G. Felder

Commun. Math. Phys. 195, 495 – 507 (1998)


Anderson Localization for Random Schrödinger Operators with Long Range Interactions

Werner Kirsch¹, Peter Stollmann², Günter Stolz³,⋆

¹ Institut für Mathematik, Ruhr-Universität Bochum, D-44780 Bochum, Germany. E-mail: [email protected]
² Fachbereich Mathematik, Johann Wolfgang Goethe-Universität, D-60054 Frankfurt am Main, Germany. E-mail: [email protected]
³ University of Alabama at Birmingham, Department of Mathematics, Birmingham, Alabama 35294, USA. E-mail: [email protected]

Received: 9 April 1997 / Accepted: 5 December 1997

Abstract: We prove pure point spectrum with exponentially decaying eigenfunctions at all band edges for Schrödinger operators with a periodic potential plus a random potential of the form $V_\omega(x) = \sum q_i(\omega) f(x-i)$, where $f$ decays at infinity like $|x|^{-m}$ for $m > 4d$ resp. $m > 3d$ depending on the regularity of $f$. The random variables $q_i$ are supposed to be independent and identically distributed. We assume that their distribution has a bounded density of compact support.

1. Introduction

At least since the groundbreaking work of Lifshitz in the early 60s it is widely accepted among physicists that random models of solid state physics should exhibit pure point spectrum near fluctuation boundaries. The latter are those parts of the spectrum which are determined by rather rare events. To present a more concrete picture consider a Schrödinger operator of the form $-\Delta + V_{\rm per} + V_\omega$ describing a periodic solid with additional impurities given by the random perturbation. The spectrum will typically consist of a union of closed intervals, bands, whose edges correspond to the unlikely events that the random perturbation takes on its maximal respectively minimal value. Thus these band edges form the fluctuation boundaries.

Mathematically rigorous proofs of localization in multidimensional models have so far been restricted to Anderson type models with additional technical restrictions. They go back to the pioneering works for the discrete case ([12, 11, 5, 20]) with $V_{\rm per} = 0$, which found considerable simplification in [6]. The first paper which treated the continuum case was [13], which was extended and simplified substantially in [3, 14, 17]. In the latter paper the case of a non-trivial periodic background potential was considered for the first time; however, as in the other articles mentioned so far, the results showed localization near the bottom of the spectrum only.

⋆ Research supported by NSF-Grants DMS-9401417 and DMS-9706076


Localization near arbitrary band edges was investigated in a series of papers [8, 9, 1, 16, 21]: In [8] a discrete Schrödinger operator was considered, while [9] treats a divergence form model for acoustic waves and contains most of the technique necessary for the Schrödinger case (in [21] an extension of the results for anisotropic models is given). In [1] a fairly general situation is considered, including Schrödinger operators of the above type under some mild assumptions on the coefficients. In our recent paper [16] we also studied the latter case (allowing more general $V_{\rm per}$ and $f$) using among other ingredients a multi-scale analysis which is based on a variation of ideas from [3, 9].

To point out the progress achieved in the present note we recall that alloy type models have the form

$$V_\omega(x) = \sum_{i \in \mathbb{Z}^d} q_i(\omega) f(x-i), \qquad (1.1)$$

where we will assume that the $q_i$ for $i \in \mathbb{Z}^d$ are independent identically distributed random variables which have a bounded density $g$. We suppose that $\operatorname{supp}(g)$ is an interval and denote by $q_-$ and $q_+$ its infimum and supremum respectively. The function $f \ge 0$, $f \ge c > 0$ on an open set, is assumed to belong to $\ell^1(L^p)$ with $p = 2$ for $d \le 3$, $p > d/2$ for $d \ge 4$. Also let $H_0 = -\Delta + V_{\rm per}$, where $V_{\rm per} \in L^p_{\rm loc}$ is a $\mathbb{Z}^d$-periodic potential with $p$ as above. We will study the random Schrödinger operator $H_\omega = H_0 + V_\omega$.

Despite some discussion in [3, 14], localization had so far been settled for compactly supported $f$ only, even in the case $V_{\rm per} = 0$. Here we will prove localization for long range interaction $f$ with

$$|f(x)| \le C|x|^{-m} \quad \text{for } |x| \text{ large},$$

where $m$ will be chosen suitably. We will follow the general strategy for proving localization which was used for discrete models in [6] and adapted in [9] to the continuous case. A few changes will be implemented into this strategy, which, as a by-product, lead to streamlined proofs of some of the results of [9] (see the comment at the end of Sect. 5). In addition, we will use some results from [16] respectively [3, 1]. Peter Hislop has announced results (joint work with Combes and Mourre, in preparation) on the problem of localization for long range $f$ as well.

The difficulties one has to overcome are caused by the fact that $V_\omega(x)$ and $V_\omega(y)$ are statistically correlated even if the distance between $x$ and $y$ is large. For other results on localization for correlated potentials, see e.g. [7, 10].

A simple extension of the argument given in [16] shows that $\sigma(H_\omega) = \Sigma$ for almost every $\omega$, where

$$\Sigma = \bigcup_{q \in [q_-, q_+]} \sigma\Bigl(H_0 + q \cdot \sum_k f(\cdot - k)\Bigr).$$

If $m > 4d$ we will prove pure point spectrum with exponentially decaying eigenfunctions near the bottom $\inf \Sigma$ of the spectrum. If we impose the conditions of [1] or [3] we may even have $m > 3d$ since their Wegner estimate is stronger. Note that [1, 3] in particular assume boundedness of $f$. Under slightly stronger assumptions on the density $g$ we can prove localization near all band edges of the spectrum. Let us formulate the main results of this paper.

Theorem 1.1. If $m > 4d$ then for almost every $\omega$ the spectrum of $H_\omega$ is pure point in a neighborhood of $\inf \Sigma$ with exponentially decaying eigenfunctions.


Theorem 1.2. Assume that $m > 4d$ and that there exists $\tau > d/2$ such that $g$ satisfies

$$\int_{q_-}^{q_- + h} g(s)\,ds + \int_{q_+ - h}^{q_+} g(s)\,ds \le h^\tau \quad \text{for small } h. \qquad (1.2)$$

Then for almost every $\omega$ the spectrum of $H_\omega$ is pure point in a neighborhood of $\partial\Sigma$ with exponentially decaying eigenfunctions.

In our proof the particular form of the Wegner bound turns out to be crucial. In the Wegner estimate that we take from [14] and [16], the bound is quadratic in the volume term. The above results rely on this type of Wegner estimate. Barbaroux, Combes, Hislop and Mourre ([1, 3] and [4]) prove a Wegner estimate that is linear in the volume. In their case our proof below requires only $m > 3d$ instead of $m > 4d$. One of us has found a simple proof of a Wegner estimate for Hölder continuous distribution of the coupling constant [22]. This implies that the above theorems remain valid under this more general assumption. For discrete models, a Wegner estimate was proven in [2] which allows for Hölder continuous measures and even certain measures with non-zero discrete parts.

We mention here without proof that the assumption on $m$ can be weakened at the cost of a somewhat worse result on eigenfunction decay. In fact, by using the ideas described below in a somewhat different type of multiscale analysis (as used in [16] and based on methods from [20] and [3]) it can be shown that Theorems 1.1 and 1.2 hold for $m > 3d$ (respectively $m > 2d$ under the conditions of [1, 3]), but merely with eigenfunctions which decay more rapidly than any inverse polynomial. Of course, the optimal assumption, which we were not able to get, should be $m > d$.

The additional assumption (1.2) in Theorem 1.2 could be dropped if Lifshitz tail behavior of the integrated density of states were known at internal band edges of $H_\omega$ (a property known at $\inf \Sigma$ and used in the proof of Theorem 1.1, compare in particular the proof of Proposition 4.2 in [16]). Klopp has recently shown in [18] that internal Lifshitz tails appear if and only if the integrated density of states at the corresponding band edge of the periodic operator $H_0$ is "non-degenerate" (see [18]).

2. Preliminaries and Strategy of Proof

In this paper we use the structure of the probability space of the random variables, which still has product structure although the random potential itself might be (and typically will be) deterministic in the sense of the technical definition of this term (see [15]). The strategy of our proof is to make probabilistic estimates for the box-hamiltonians $H_{\Lambda_l(x)}$ (see below) which are valid uniformly in $q_i$ for those $i$ with $\operatorname{dist}(i, \Lambda_l(x)) \ge r_l$ for some "security distance" $r_l$.

To define this procedure more precisely note that our probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is given by

$$\Omega = \bigotimes_{i \in \mathbb{Z}^d} S, \quad S \subset \mathbb{R}, \qquad \mathcal{F} = \bigotimes_{i \in \mathbb{Z}^d} \mathcal{B}(S), \qquad \mathbb{P} = \bigotimes_{i \in \mathbb{Z}^d} P_0,$$

where $P_0$ is the distribution of $q_0$ and $S$ its (compact) support. For any subset $\Lambda$ of $\mathbb{R}^d$ we define the projection $\Pi_\Lambda : \Omega \to \bigotimes_{i \in \Lambda \cap \mathbb{Z}^d} S$ by $(\Pi_\Lambda \omega)_i = \omega_i$ for $i \in \Lambda \cap \mathbb{Z}^d$.

Definition 2.1. If $A \in \mathcal{F}$ is an event and $\Lambda \subset \mathbb{R}^d$ we define

$$A^*_\Lambda = \{\omega \mid \exists\, \omega' \in A : \Pi_\Lambda \omega' = \Pi_\Lambda \omega\} = \Pi_\Lambda^{-1}(\Pi_\Lambda A). \qquad (2.1)$$

Proposition 2.2. If $\Lambda_1 \cap \Lambda_2 \cap \mathbb{Z}^d = \emptyset$ and $A, B \in \mathcal{F}$, then $A^*_{\Lambda_1}$ and $B^*_{\Lambda_2}$ are independent events.


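Proof. The event $A^*_{\Lambda_1}$ is a $(\Lambda_1 \cap \mathbb{Z}^d)$-cylinder set, while $B^*_{\Lambda_2}$ is a $(\Lambda_2 \cap \mathbb{Z}^d)$-cylinder. Thus they are independent if $\Lambda_1 \cap \Lambda_2 \cap \mathbb{Z}^d = \emptyset$.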
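The independence statement of Proposition 2.2 can be checked by brute force on a finite toy model. The following sketch assumes four sites in place of $\mathbb{Z}^d$ and $S = \{0, 1\}$ with the uniform product measure; the events `A` and `B` and the helper names `prob` and `star` are illustrative choices, not objects from the paper.

```python
from fractions import Fraction
from itertools import product

SITES = range(4)  # toy index set standing in for Z^d (assumption)
OMEGA = list(product((0, 1), repeat=len(SITES)))  # product space S^4, S = {0, 1}


def prob(event):
    """Uniform product measure of a set of configurations."""
    return Fraction(len(event), len(OMEGA))


def star(event, region):
    """A*_region: every omega that agrees on `region` with some omega' in the event."""
    seen = {tuple(w[i] for i in region) for w in event}
    return {w for w in OMEGA if tuple(w[i] for i in region) in seen}


# Two events that each depend on all four coordinates and are correlated.
A = {w for w in OMEGA if sum(w) >= 3}
B = {w for w in OMEGA if sum(w) >= 2 and w[3] == 1}

A_star = star(A, (0, 1))  # cylinder set over sites {0, 1}
B_star = star(B, (2, 3))  # cylinder set over sites {2, 3}

# A and B themselves are not independent, but their enlargements are.
print(prob(A & B) == prob(A) * prob(B))
print(prob(A_star & B_star) == prob(A_star) * prob(B_star))
```

The enlargement only remembers the coordinates inside the chosen region, which is why disjoint regions give independent events even when the original events are correlated.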

In order to explain the key role of this elementary fact in extending known multiscale methods to the case of long range interactions we first introduce some notation and terminology. We denote by $H_\Lambda = H_\Lambda(\omega)$ the operator $H_\omega = H_0 + V_\omega$ restricted to the cube $\Lambda$ with periodic boundary conditions and by $R_\Lambda(z)$ its resolvent. By $\Lambda_l(x)$ we denote the cube of sidelength $l \in 2\mathbb{N} + 1$ around the point $x \in \mathbb{Z}^d$. We will drop the letter "$x$" whenever it is understood which center is meant or if $x$ is the origin. For fixed $x$ denote

$$\Lambda_l^{\rm inn} := \Lambda_{l/3}, \qquad \Lambda_l^{\rm out} := \Lambda_l \setminus \Lambda_{l-2}.$$

Also let $\chi_l^{\rm inn} := \chi_{\Lambda_l^{\rm inn}}$ and $\chi_l^{\rm out} := \chi_{\Lambda_l^{\rm out}}$ be the corresponding indicator functions.

Definition 2.3. A cube $\Lambda_l$ is called $(\gamma, E)$-good for $\omega \in \Omega$, if

$$\|\chi_l^{\rm out} R_{\Lambda_l}(E) \chi_l^{\rm inn}\| \le e^{-\gamma l},$$

where $E \in \mathbb{R} \setminus \sigma(H_{\Lambda_l}(\omega))$ is understood.

If the potential $f$ is compactly supported then the events $\{\omega : \Lambda_l \text{ is not } (\gamma, E)\text{-good}\}$ and $\{\omega : \tilde\Lambda_l \text{ is not } (\gamma, E)\text{-good}\}$ are independent if the distance of the cubes $\Lambda_l$ and $\tilde\Lambda_l$ is sufficiently large. This has been crucial in multiscale techniques as for example in [6] and [9], but does not hold in the long range case. Our basic idea to solve this problem is to do a multiscale analysis with "uniformly good" rather than "good" cubes:

Definition 2.4. Let $G_l(\gamma, E) := \{\omega \mid \Lambda_l \text{ is } (\gamma, E)\text{-good}\}$. The cube $\Lambda_l$ is called uniformly $(\gamma, E)$-good for a certain $\omega$, if $\omega' \in G_l(\gamma, E)$ for all $\omega'$ with $\Pi_M \omega' = \Pi_M \omega$, where $M = \Lambda_{4l}$.

Thus

$$\{\omega \mid \Lambda_l \text{ is not uniformly } (\gamma, E)\text{-good}\} = \{\omega \mid \Lambda_l \text{ is not } (\gamma, E)\text{-good}\}^*_{\Lambda_{4l}},$$

which implies independence of $\{\omega \mid \Lambda_l(x) \text{ is not uniformly } (\gamma, E)\text{-good}\}$ and $\{\omega \mid \Lambda_l(y) \text{ is not uniformly } (\gamma, E)\text{-good}\}$ if $d(x, y) > 4l$ by Proposition 2.2. Here $d(\cdot, \cdot)$ denotes the $\infty$-metric.

In order to make a multiscale analysis work with the stronger requirement of uniformly good cubes (see Sect. 4), we also need uniform versions of the two basic ingredients into the multiscale technique, the Wegner estimate and the initial scale estimate. These are provided in the next section, where only the Wegner estimate needs a little extra thought.

To understand how the concept of good cubes respectively uniformly good cubes will be used in the proof of localization in Sect. 5, we record the following eigenfunction decay inequality in part a) of the following Lemma. For this we recall that a function $\psi \in W^{1,2}_{\rm loc}$ is called a generalized eigenfunction for $H = -\Delta + V$ ($V \in L^p_{\rm loc}$, $p$ as above) to $E \in \mathbb{R}$, if for every $\varphi \in C_0^\infty(\mathbb{R}^d)$,

$$\langle \nabla\psi, \nabla\varphi \rangle + \langle V\psi, \varphi \rangle = E \langle \psi, \varphi \rangle.$$

We use the notation $\|V\|_{p,{\rm loc,unif}} := \sup_x \bigl( \int_{\Lambda_1(x)} |V(y)|^p \, dy \bigr)^{1/p}$. Part b) is a geometric resolvent inequality used to compare $R_\Lambda$ and $R_{\Lambda'}$ for different cubes in the multiscale analysis of Sect. 4.


Lemma 2.5. Let $H = -\Delta + V$ be such that $\|V\|_{p,{\rm loc,unif}} \le M$. Then for every bounded set $U \subset \mathbb{R}$ there is a constant $C_{U,M}$ such that

a) every generalized eigenfunction $\psi$ of $H$ to $E \in U$ satisfies

$$\|\chi_\Lambda^{\rm inn} \psi\| \le C_{U,M} \|\chi_\Lambda^{\rm out} R_\Lambda(E) \chi_\Lambda^{\rm inn}\| \, \|\chi_\Lambda^{\rm out} \psi\|, \qquad (2.2)$$

b) if $\Lambda \subset \Lambda'$ are cubes with centers in $\mathbb{Z}^d$ and sidelengths in $2\mathbb{N} + 1$, and if $A \subset \Lambda^{\rm inn}$, $B \subset \Lambda' \setminus \Lambda$ and $E \in U$, then

$$\|\chi_B R_{\Lambda'}(E) \chi_A\| \le C_{U,M} \|\chi_B R_{\Lambda'}(E) \chi_\Lambda^{\rm out}\| \, \|\chi_\Lambda^{\rm out} R_\Lambda(E) \chi_A\|. \qquad (2.3)$$

The proofs of these results involve the resolvent identity, the introduction of smooth cut-off functions and gradient estimates. Very similar results are provided in [9, Lemma 26 and Lemma 27]. An elementary discussion can also be found in [23], so we omit the proof.

3. Wegner Estimates and Initial Scale Estimate

Let us set

$$A = A(E, l, \varepsilon) = \{\omega \mid \operatorname{dist}(E, \sigma(H_{\Lambda_l})) < \varepsilon\}.$$

The Wegner Lemma tells us that this event has small probability, more precisely

Proposition 3.1. ([14, 16]) Under the assumptions of Theorem 1.1 let $a \in \partial\Sigma$. Then there exists a neighborhood $U$ of $a$ and a constant $C > 0$ such that for $E \in U$,

$$\mathbb{P}(A(E, l, \varepsilon)) \le C \varepsilon |\Lambda_l|^2. \qquad (3.1)$$

As usual $|M|$ denotes the Lebesgue-measure of the set $M$. Combes-Hislop [3] and Barbaroux-Combes-Hislop [1] have a better Wegner estimate which gives $|\Lambda_l|$ instead of $|\Lambda_l|^2$ on the right-hand side of (3.1):

Proposition 3.2. ([3, 1]) Under the assumptions in [3] resp. [1] we have for $E$ in a neighborhood $U$ of $a \in \partial\Sigma$ that

$$\mathbb{P}(A(E, l, \varepsilon)) \le C \varepsilon |\Lambda_l|. \qquad (3.2)$$

We will need a "uniform" Wegner-Lemma, i.e. an estimate of $A^*_{l+r}$ (where $A = A(E, l, \varepsilon)$).

Proposition 3.3 (uniform Wegner-estimate). For $E \in U$ as in Proposition 3.1 it holds that

$$\mathbb{P}(A^*_{l+r}) \le C (\varepsilon + r^{-(m-d)}) |\Lambda_l|^2. \qquad (3.3)$$

Moreover in the case of Proposition 3.2 we have

$$\mathbb{P}(A^*_{l+r}) \le C (\varepsilon + r^{-(m-d)}) |\Lambda_l|. \qquad (3.4)$$

Note that the estimates (3.3) respectively (3.4) remain true if the event $A$ is replaced by $\{\omega \mid \operatorname{dist}(M, \sigma(H_{\Lambda_l})) < \varepsilon\}$ and $M \subset U$ is such that $\operatorname{diam} M < \varepsilon$.


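Proof. Suppose $\omega \in A^*_{l+r}$; then there exists an $\omega' \in A$ such that $\Pi_{\Lambda_{l+r}} \omega' = \Pi_{\Lambda_{l+r}} \omega$. Consequently, for any $x \in \Lambda_l$ we have

$$|V_\omega(x) - V_{\omega'}(x)| \le \sum_{i \notin \Lambda_{l+r}} \bigl( |q_i(\omega)| + |q_i(\omega')| \bigr) f(x-i) \le C \sum_{i \notin \Lambda_{l+r}} |x-i|^{-m} \le C' r^{-(m-d)}.$$

Thus $\operatorname{dist}(E, \sigma(H_{\Lambda_l}(\omega'))) < \varepsilon$ implies $\operatorname{dist}(E, \sigma(H_{\Lambda_l}(\omega))) < \varepsilon + c' r^{-(m-d)}$, and hence

$$\mathbb{P}(A^*_{l+r}) \le \mathbb{P}\bigl(A(E, l, \varepsilon + c' r^{-(m-d)})\bigr).$$

The latter probability can be estimated by (3.1) and (3.2) respectively.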
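The last inequality in the chain above is a lattice tail sum. A sketch of one way to see it, assuming that $\Lambda_l$ and $\Lambda_{l+r}$ are concentric (so that $|x - i|_\infty \ge r/2$ for $x \in \Lambda_l$, $i \notin \Lambda_{l+r}$), that $r$ is large enough for the decay bound on $f$ to apply, and with constants $C_d$, $C'$ that are not taken from the paper:

```latex
\sum_{i \notin \Lambda_{l+r}} |x-i|^{-m}
  \;\le\; \sum_{k \ge \lfloor r/2 \rfloor}
      \#\bigl\{ i \in \mathbb{Z}^d : k \le |x-i|_\infty < k+1 \bigr\}\, k^{-m}
  \;\le\; C_d \sum_{k \ge \lfloor r/2 \rfloor} k^{d-1-m}
  \;\le\; C'\, r^{-(m-d)}
  \qquad (m > d).
```

Each shell $\{k \le |x-i|_\infty < k+1\}$ contains at most $C_d k^{d-1}$ lattice points, and the remaining one-dimensional sum is comparable to $\int_{r/2}^\infty t^{d-1-m}\,dt$, which is where the exponent $m - d$ comes from.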

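The induction in the multiscale method of [6] does not directly use a Wegner estimate but a consequence on the distance of the spectra of hamiltonians on disjoint boxes, see the estimate (4.4) in [6]. In the following we adapt this estimate to our situation. Let $I$ be an interval such that $I \subset U$ for the neighborhood $U$ found in Proposition 3.1 respectively Proposition 3.2. For a $\Lambda = \Lambda_l(x)$ define

$$\sigma_I(H_\Lambda) = \bigl( \sigma(H_\Lambda) \cap I \bigr) + \bigl( -\tfrac12 l^{-(m-d)}, \tfrac12 l^{-(m-d)} \bigr).$$

Note that $\sigma_I(H_\Lambda) \subset U$ for $l$ sufficiently large.

Let $\Lambda_1 = \Lambda_{l_1}(x_1)$ and $\Lambda_2 = \Lambda_{l_2}(x_2)$ be cubes such that $\Lambda_{4l_1}(x_1) \cap \Lambda_{4l_2}(x_2) = \emptyset$. For fixed $\omega \in \Omega$ and $\omega_i := \Pi_{\Lambda_{4l_i}(x_i)} \omega$, $i = 1, 2$, we introduce the "uniform distance"

$$\tilde d(\sigma_I(H_{\Lambda_1}), \sigma_I(H_{\Lambda_2})) := \inf_{\substack{\tilde\omega_1 \in \Pi_{\mathbb{R}^d \setminus \Lambda_{4l_1}(x_1)} \Omega \\ \tilde\omega_2 \in \Pi_{\mathbb{R}^d \setminus \Lambda_{4l_2}(x_2)} \Omega}} \operatorname{dist}\bigl( \sigma_I(H_{\Lambda_1}(\omega_1, \tilde\omega_1)), \sigma_I(H_{\Lambda_2}(\omega_2, \tilde\omega_2)) \bigr).$$

We consider the event

$$A := \bigl\{ \omega \mid \tilde d(\sigma_I(H_{\Lambda_1}), \sigma_I(H_{\Lambda_2})) \le \min\{l_1, l_2\}^{-(m-d)} \bigr\}.$$

Lemma 3.4. There is a constant $C > 0$ independent of $l_1, l_2$, such that under the assumptions of Proposition 3.1,

$$\mathbb{P}(A) \le C \frac{\max\{l_1, l_2\}^d}{\min\{l_1, l_2\}^{m-3d}}, \qquad (3.5)$$

and under the assumptions of Proposition 3.2,

$$\mathbb{P}(A) \le C \frac{\max\{l_1, l_2\}^d}{\min\{l_1, l_2\}^{m-2d}}. \qquad (3.6)$$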


Proof. For subsets $\Lambda$ of $\mathbb{R}^d$ we write $\mathbb{P}_\Lambda$ for the probability $\bigotimes_{i \in \Lambda \cap \mathbb{Z}^d} P_0$ on $\Pi_\Lambda \Omega$ and $\mathbb{E}_\Lambda$ for its expectation. We may assume $l_1 \le l_2$. The event $A$ is a $(\Lambda_{4l_1} \cup \Lambda_{4l_2})$-cylinder and thus

$$\mathbb{P}(A) = \mathbb{P}_{\Lambda_{4l_1} \cup \Lambda_{4l_2}}(A) = \mathbb{E}_{\Lambda_{4l_2}} \mathbb{P}_{\Lambda_{4l_1}}(A). \qquad (3.7)$$

Keeping $\omega_2$ fixed for a moment, we pick an arbitrary $\tilde\omega_2' \in \Pi_{\mathbb{R}^d \setminus \Lambda_{4l_2}} \Omega$. From Weyl's asymptotic formula we conclude that $\sigma_I(H_{\Lambda_2}(\omega_2, \tilde\omega_2'))$ has at most $C l_2^d$ elements (uniformly in $\omega_2$). The decay assumption on $f$ implies that for $x \in \Lambda_2$ and all $\tilde\omega_2 \in \Pi_{\mathbb{R}^d \setminus \Lambda_{4l_2}} \Omega$,

$$\bigl| V_{(\omega_2, \tilde\omega_2')}(x) - V_{(\omega_2, \tilde\omega_2)}(x) \bigr| \le C l_2^{-(m-d)}.$$

Thus $\sigma_I(H_{\Lambda_2}(\omega_2, \tilde\omega_2)) \subset S_{\omega_2}$, where $S_{\omega_2}$ is independent of $\tilde\omega_2$ and a union of at most $C l_2^d$ intervals, each of length $\le C l_2^{-(m-d)}$. We conclude

$$\mathbb{P}_{\Lambda_{4l_1}}(A) \le \mathbb{P}_{\Lambda_{4l_1}} \Bigl\{ \omega_1 : \inf_{\tilde\omega_1 \in \Pi_{\mathbb{R}^d \setminus \Lambda_{4l_1}} \Omega} \operatorname{dist}\bigl( \sigma_I(H_{\Lambda_1}(\omega_1, \tilde\omega_1)), S_{\omega_2} \bigr) \le l_1^{-(m-d)} \Bigr\} = \mathbb{P} \Bigl( \bigl\{ \omega : \operatorname{dist}\bigl( \sigma_I(H_{\Lambda_1}(\omega)), S_{\omega_2} \bigr) \le l_1^{-(m-d)} \bigr\}^*_{4l_1} \Bigr). \qquad (3.8)$$

The set $S_{\omega_2}$ can be covered by at most $C l_2^d$ intervals of length $\le \tfrac12 l_2^{-(m-d)} \le \tfrac12 l_1^{-(m-d)}$. Thus an additivity argument and the uniform Wegner estimate (3.3) with $\varepsilon = l_1^{-(m-d)}$ (respectively the remark following Proposition 3.3) show that the r.h.s. of (3.8) can be further estimated by

$$\le C l_2^d \, l_1^{-(m-d)} \, l_1^{2d} = C \frac{l_2^d}{l_1^{m-3d}}.$$

Since all constants are uniform in $\omega_2$, (3.5) is now a direct consequence of (3.7). Using (3.4) instead of (3.3) we get (3.6).

The other important tool for the multiscale analysis is an initial scale estimate, a result saying that with high probability the distance of the spectrum of finite box hamiltonians to $\partial\Sigma$ is not too small. If, as assumed here, $f \ge 0$, then proofs of this result (e.g. [6, 3, 9]) directly extend to the long range case by a monotonicity argument. It is more difficult to incorporate that we only want to assume that $f \ge c > 0$ on some open set (rather than $f \ge c \chi_{\Lambda_1(0)}$). Under this assumption a proof of the following result was given in Propositions 4.1 and 4.2 of [16]. It again extends to the long range case considered here by monotonicity. In fact, this shows that the following results hold for the uniform events from Definition 2.1.

Proposition 3.5. (a) Under the assumptions of Theorem 1.1 let $a = \inf \Sigma$. Then for any $\xi > 0$ and $\beta_0 \in (0, 2)$ there is an $l^* = l^*(\xi, \beta_0)$ such that

$$\mathbb{P} \Bigl( \bigl\{ \operatorname{dist}(\sigma(H_\Lambda(\omega)), a) \le l^{\beta_0 - 2} \bigr\}^*_{4l} \Bigr) \le l^{-\xi}$$

for $\Lambda = \Lambda_l(0)$ and $l \ge l^*$.


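(b) Under the assumptions of Theorem 1.2 let $a \in \partial\Sigma$. Then for any $\xi \in (0, 2\tau - d)$ there is a $\beta_0 > 0$ and $l^* = l^*(\tau, \xi)$ such that

$$\mathbb{P} \Bigl( \bigl\{ \operatorname{dist}(\sigma(H_\Lambda(\omega)) \cap (a, \infty), a) \le l^{\beta_0 - 2} \bigr\}^*_{4l} \Bigr) \le l^{-\xi} \quad \text{if } a \text{ is a lower band edge}$$

and

$$\mathbb{P} \Bigl( \bigl\{ \operatorname{dist}(\sigma(H_\Lambda(\omega)) \cap (-\infty, a), a) \le l^{\beta_0 - 2} \bigr\}^*_{4l} \Bigr) \le l^{-\xi} \quad \text{if } a \text{ is an upper band edge}$$

for $\Lambda = \Lambda_l(0)$ and $l \ge l^*$.

Note that $\sigma(H_\Lambda(\omega)) \subset \Sigma$ holds for every cube $\Lambda = \Lambda_l(0)$ (see e.g. the proof of $\sigma(H_\omega) = \Sigma$ a.s. in [16]) and thus part b) implies that $\mathbb{P}\bigl( \{ \operatorname{dist}(\sigma(H_\Lambda(\omega)), a) \le l^{\beta_0 - 2} \}^*_{4l} \bigr) \le l^{-\xi}$ for every band edge $a$ and sufficiently large $l$.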

4. Multiscale Analysis

In this section we adapt the type of multiscale analysis which was used in [23] (which in turn is based upon [6] and [9]) to our situation. With the preparations from Sects. 2 and 3 at hand, the detailed proofs of the following results are very similar to the considerations in [6]. Thus we will only outline the main ideas and refer to [23] for a more detailed account. In this and the next section we prove Theorems 1.1 and 1.2 under the assumption $m > 4d$. Simple modifications show that only $m > 3d$ is needed if the stronger form of the Wegner estimate is available.

For an interval $I \subset \mathbb{R}$, $l \in 2\mathbb{N} + 1$, $\gamma > 0$, and $\xi > 0$, we say that the estimate $G(I, l, \gamma, \xi)$ is satisfied if for all pairs $x, y \in \mathbb{Z}^d$ with $d(x, y) \ge 4l$ it holds that

$$\mathbb{P} \{ \forall E \in I \text{ the cube } \Lambda_l(x) \text{ or } \Lambda_l(y) \text{ is uniformly } (\gamma, E)\text{-good for } \omega \} \ge 1 - l^{-2\xi}.$$

Note that by our general assumptions there is a bound $\|V_{\rm per} + V_\omega\|_{p,{\rm loc,unif}} \le M$ uniformly in $\omega \in \Omega$.

Theorem 4.1. Let $U$ be the neighborhood of $a \in \partial\Sigma$ as given in Proposition 3.3. Fix $\xi \in (0, \tfrac{m}{4} - d]$ and $\beta \in (0, 1)$. Then there exist $\alpha = \alpha(d, \xi) \in (1, 2)$, $l^* = l^*(d, \xi, \beta, m)$, $c_1 = c_1(d, C_{U,M})$ (with $C_{U,M}$ from (2.2) respectively (2.3)), $c_2 = c_2(d, m)$ and a universal constant $c$ such that the following implication holds: If $I$ is an open interval with $I \subset U$ and for $l \ge l^*$ and $\gamma \ge l^{\beta - 1}$ the estimate $G(I, l, \gamma, \xi)$ is satisfied, then also $G(I, L, \gamma_L, \xi)$ is true, where

(i) $l^\alpha \le L \le l^\alpha + 6$,
(ii) $\gamma_L \ge \gamma (1 - c\, l^{1-\alpha}) - c_1 l^{-1} - c_2 \dfrac{\ln L}{L} \ge L^{\beta - 1}$. (4.1)

Moreover, for $\alpha$ and $\xi$ we have that $(\alpha - 1) d < 2\xi$.
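The induction step (i), (ii) of Theorem 4.1 can be iterated numerically to see why the decay rate does not collapse along the scales. The sketch below is only an illustration: the function names `next_scale` and `iterate_rates` are made up here, the constants `c`, `c1`, `c2` are hypothetical placeholders (the paper does not give their values), and the choice of $L$ as the smallest odd multiple of 3 above $l^\alpha$ is one admissible way to satisfy $l^\alpha \le L \le l^\alpha + 6$ with $L \in 3\mathbb{N} \setminus 6\mathbb{N}$.

```python
import math


def next_scale(l, alpha):
    """Smallest odd multiple of 3 that is >= l**alpha, so L is in 3N but not in 6N."""
    target = math.ceil(l ** alpha)
    k = -(-target // 3)  # ceil(target / 3) in integer arithmetic
    if k % 2 == 0:       # 3*k would be a multiple of 6: move to the next odd k
        k += 1
    return 3 * k


def iterate_rates(l1, gamma1, alpha, steps, c=1.0, c1=1.0, c2=1.0):
    """Iterate the lower bound in (ii) of Theorem 4.1.

    c, c1, c2 are hypothetical placeholder constants, not values from the paper.
    """
    scales, rates = [l1], [gamma1]
    for _ in range(steps):
        l, gamma = scales[-1], rates[-1]
        L = next_scale(l, alpha)
        lower = gamma * (1 - c * l ** (1 - alpha)) - c1 / l - c2 * math.log(L) / L
        scales.append(L)
        rates.append(lower)
    return scales, rates


# Example run with beta = 1/2, i.e. initial rate gamma_1 = l_1**(beta - 1).
scales, rates = iterate_rates(l1=999, gamma1=999 ** -0.5, alpha=1.5, steps=3)
for l, g in zip(scales, rates):
    print(l, g)
```

Because the scales grow super-exponentially, the corrections $c\,l^{1-\alpha}$, $c_1 l^{-1}$ and $c_2 \ln L / L$ are summable along the sequence, so the iterated lower bounds stay away from zero once the initial scale is large enough.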


The proof of Theorem 4.1 starts by picking $\alpha = \alpha(d, \xi) \in (1, 2)$ such that $4d \frac{\alpha - 1}{2 - \alpha} \le \xi$ and $L \in 3\mathbb{N} \setminus 6\mathbb{N}$ such that $l^\alpha \le L \le l^\alpha + 6$. With $\Gamma_x := x + \frac{l}{3} \mathbb{Z}^d$ we now define the event

$$G(x) := \bigl\{ \omega \in \Omega : \forall E \in I \text{ there are no four cubes } \Lambda_l(b_i) \subset \Lambda_L(x) \text{ with } b_i \in \Gamma_x,\ i = 1, \ldots, 4, \text{ and } d(b_i, b_j) > 4l \text{ for } i \ne j \text{ such that } \Lambda_l(b_i) \text{ is not uniformly } (\gamma, E)\text{-good for } i = 1, \ldots, 4 \bigr\}.$$

Using the remark on independence following Definition 2.4 and $4d(\alpha - 1)/(2 - \alpha) \le \xi \le m/4 - d$ it can be seen that

$$\mathbb{P}(G(x)) \ge 1 - \tfrac13 L^{-2\xi} \qquad (4.2)$$

for $l$ sufficiently large. Simple geometrical considerations also show that for $\omega \in G(x)$ and $E \in I$ there are disjoint cubes $\Lambda_{l_i}(b_i) \subset \Lambda_L(x)$, $i = 1, 2, 3$, such that

(i) $l_i \in \mathcal{L} := \{0, 5l, 10l, (10 + \tfrac13) l, 15l, (15 + \tfrac13) l, (15 + \tfrac23) l\}$,
(ii) $\sum_{i=1}^3 l_i \le (15 + \tfrac23) l$,
(iii) if $b \in \Gamma_x$ and $\Lambda_l(b)$ is not uniformly $(\gamma, E)$-good, then $\Lambda_l(b) \subset \bigcup_i \Lambda_{l_i}(b_i)$,
(iv) $d(\Lambda_{l_i}(b_i), \Lambda_{l_j}(b_j)) \ge 3l$ for $i \ne j$.

Now fix $x, y \in \mathbb{Z}^d$ with $d(x, y) \ge 4L$ and define

$$W := \bigl\{ \omega \in \Omega : \exists\, \Lambda_1 = \Lambda_{l_1}(z_1),\ \Lambda_2 = \Lambda_{l_2}(z_2) \text{ with } \Lambda_1 \subset \Lambda_L(x),\ z_1 \in \Gamma_x,\ \Lambda_2 \subset \Lambda_L(y),\ z_2 \in \Gamma_y,\ l_i \in \mathcal{L} \cup \{L\} \text{ and } \tilde d(\sigma_I(H_{\Lambda_1}), \sigma_I(H_{\Lambda_2})) \le \min\{l_1, l_2\}^{-(m-d)} \bigr\}.$$

$\mathbb{P}(W)$ can be estimated by counting the number of possible centers and sidelengths in question, and by using Lemma 3.4 for a fixed pair of boxes $\Lambda_1$ and $\Lambda_2$. Applying the definition of $\alpha$, $L$ and $\xi$ one gets that

$$\mathbb{P}(W) \le \tfrac13 L^{-2\xi} \qquad (4.3)$$

for $l$ sufficiently large. Using (4.2) and (4.3) the theorem now follows from

Lemma 4.2. If $l$ is sufficiently large and $\omega \in G(x) \cap G(y) \cap W^c$, then for arbitrary $E \in I$ there is $z \in \{x, y\}$ such that $\Lambda_L(z)$ is uniformly $(\gamma_L, E)$-good, where $\gamma_L$ satisfies (4.1).

Proof. The proof of this result is very similar to the arguments in Sect. 4 of [6] respectively their adaptation to the continuum in Sect. 6 of [9]. A detailed account of the proof of this result for the case of compactly supported single site potential $f$ is also given in [23]. However, to adjust these arguments to the long range case, it has to be shown that the required estimate for $\|\chi_L^{\rm out} R_{\Lambda_L}(E) \chi_L^{\rm inn}\|$ with $\Lambda_L = \Lambda_L(z)$ holds uniformly for all $\omega' \in \Omega$ with $\Pi_{\mathbb{R}^d \setminus \Lambda_{4L}} \omega = \Pi_{\mathbb{R}^d \setminus \Lambda_{4L}} \omega'$. To this end one starts by using $\omega \notin W$ to show that for either $z = x$ or $z = y$ all cubes $\Lambda_{\tilde l}(u) \subset \Lambda_L(z)$ with $u \in \Gamma_z$ and $\tilde l \in \mathcal{L}$ are uniformly non-resonant, i.e. satisfy


$$\operatorname{dist}(\sigma(H_{\Lambda_{\tilde l}(u)}), E) \ge \tfrac12 \tilde l^{-(m-d)}$$

uniformly for all $\omega'$ with $\Pi_{\mathbb{R}^d \setminus \Lambda_{4\tilde l}(u)} \omega = \Pi_{\mathbb{R}^d \setminus \Lambda_{4\tilde l}(u)} \omega'$. For this $z$ it can now be shown that $\Lambda_L(z)$ is uniformly $(\gamma_L, E)$-good. This uses $\omega \in G(z)$ and an iteration procedure similar to the argument in [6], where the geometric resolvent inequality (2.3) is applied repeatedly. We omit further details and refer to either [6] or [23].

We now combine Proposition 3.5 and Theorem 4.1 to get

Theorem 4.3. Let $a = \inf \Sigma$ under the assumptions of Theorem 1.1 or $a \in \partial\Sigma$ under the assumptions of Theorem 1.2. Then there exists an open interval $I$ containing $a$, $\xi > 0$, $\gamma > 0$, $\alpha \in (1, 2)$, and a sequence $(l_k)_{k \in \mathbb{N}}$ of length scales such that

(i) $(\alpha - 1) d < 2\xi$,
(ii) $l_k^\alpha \le l_{k+1} \le l_k^\alpha + 6$ for all $k$,
(iii) $G(I, l_k, \gamma, \xi)$ is satisfied for all $k$.

Proof. Proposition 3.5 shows the existence of $\beta_0 > 0$ and $\xi > 0$ (which may be picked smaller than $m/4 - d$) such that for $l$ sufficiently large and $E \in I := (a - \tfrac12 l^{\beta_0 - 2}, a + \tfrac12 l^{\beta_0 - 2})$ one has

$$\mathbb{P} \Bigl( \bigl\{ \operatorname{dist}(E, \sigma(H_{\Lambda_l})) \le \tfrac12 l^{\beta_0 - 2} \bigr\}^*_{4l} \Bigr) \le l^{-\xi}.$$

Here, $\sigma(H_{\Lambda_l})$ can get closer than $\tfrac12 l^{\beta_0 - 2}$ to $E$ from only one side since from the fact that $\sigma(H_{\Lambda_l}) \subset \Sigma$ it follows that there is a gap $(r, s)$ of $\sigma(H_{\Lambda_l})$ such that $E \in (r, s)$ and either $\operatorname{dist}(r, E) \ge c$ or $\operatorname{dist}(s, E) \ge c$ for a constant which does not depend on $l$ and $\omega$. An improved version of the Combes-Thomas estimate introduced in [1] (see also [16, Lemma A.1]) can be used to conclude that

$$\|\chi_l^{\rm out} R_{\Lambda_l}(E) \chi_l^{\rm inn}\| \le \frac{c'}{l^{\beta_0 - 2}} \, e^{-c'' l^{\frac12 (\beta_0 - 2)} \cdot l}$$

uniformly in the components of $\omega$ outside $\Lambda_{4l}$ with probability $\mathbb{P} \ge 1 - l^{-\xi}$. Choosing $\beta \in (0, \beta_0/2)$ this implies that $G(I, l, \gamma_l, \xi)$ is satisfied for $l \ge l^*$ and $\gamma_l = l^{\beta - 1}$, where independence was used. We pick $l_1 = l$ and can now complete the proof of Theorem 4.3 by an inductive application of Theorem 4.1. If $l_1$ was picked sufficiently large, it can be checked by using (4.1) that there is a $\gamma > 0$ with $\gamma_{l_k} \ge \gamma$ for all $k$.

5. Proof of Localization

In this section we complete the proof of localization by adapting the line of reasoning from [23], where an improved continuum version of the arguments from Sect. 3 of [6] is presented. For an operator $H = -\Delta + V$, where $V \in L^p_{\rm loc,unif}$, $p$ as above, it is true that for almost every $E$ with respect to a spectral measure for $H$ there exists a polynomially bounded generalized eigenfunction for $H$, e.g. [19]. Therefore the proof of Theorems 1.1 and 1.2 is completed once we have shown


Proposition 5.1. Let $I$ be an interval as provided by Theorem 4.3. Then it holds with probability one that every polynomially bounded generalized eigenfunction $\psi$ for $H_\omega$ to an $E \in I$ is exponentially decaying, in fact

$$\limsup_{x \to \infty} \frac{\log |\psi(x)|}{d(x, 0)} \le -3\gamma, \qquad (5.1)$$

where $\gamma > 0$ is the decay rate found in Theorem 4.3.

(Note that the $\gamma$ given in Theorem 4.3 describes the decay of the "averaged Green's function" $\|\chi_l^{\rm out} R_{\Lambda_l}(E) \chi_l^{\rm inn}\|$ between the cube $\Lambda_{l/3}$ and the boundary of $\Lambda_l$, i.e. over a distance $l/3$ in $\infty$-metric. Thus the factor 3 on the r.h.s. of (5.1) shows that $\psi$ decays at the same rate as the Green's function.)

Proof. With small modifications we follow the lines of the proof of Lemma 3.1 in [6]. For $x_0 \in \mathbb{Z}^d$ let $A_{k+1}(x_0) = \Lambda_{8b l_{k+1}}(x_0) \setminus \Lambda_{8 l_k}(x_0)$ for $b > 1$ to be chosen later, and consider the event

$$E_k(x_0) := \bigl\{ \text{There is some } E \in I \text{ and } x \in A_{k+1}(x_0) \cap \Gamma_k \text{ such that } \Lambda_{l_k}(x_0) \text{ and } \Lambda_{l_k}(x) \text{ are not } (\gamma, E)\text{-good} \bigr\},$$

where $\Gamma_k = x_0 + \frac{l_k}{3} \mathbb{Z}^d$. Since $G(I, l_k, \gamma, \xi)$ holds and $A_{k+1}(x_0) \cap \Gamma_k$ has $\le C_b (l_{k+1}/l_k)^d$ elements, we can estimate

$$\mathbb{P}(E_k(x_0)) \le c_b \, l_k^{-2\xi} \, l_k^{d(\alpha - 1)}.$$

This is summable over $k$ by Theorem 4.3 and thus by Borel-Cantelli and stationarity we get that $\mathbb{P}(\Omega_0) = 1$ for

$$\Omega_0 := \bigl\{ \omega : \forall x \in \mathbb{Z}^d \ \exists k_x \in \mathbb{N} \text{ such that } \omega \notin E_k(x) \text{ for } k \ge k_x \bigr\}.$$

If $\omega \in \Omega_0$ and $\psi \ne 0$ is a polynomially bounded generalized eigenfunction for $H$ to $E \in I$, then it is shown as in [6] that $\Lambda_{l_k}(x)$ is $(\gamma, E)$-good for all $k \ge k_0(\omega)$ and $x \in A_{k+1}(x_0) \cap \Gamma_k$ ($\|\chi_{x_0} \psi\|$ replaces $|\psi(x_0)|$ from the discrete case and (2.2) is used). Continuing the argument as in [6] and with $b > \frac{1 + \rho}{1 - \rho}$ for some $\rho \in (0, 1)$ it can be shown that for $y \in \tilde A_{k+1}(x_0)$ (defined as in [6] with factors 2 replaced by 8) and some $n \ge \frac{3}{l_k} \rho \, d(y, x_0)$ it holds that

$$\|\chi_y \psi\| \le \|\chi^{\rm inn}_{y_0, l_k} \psi\| \le \bigl( C_\psi (3^d - 1) e^{-\gamma l_k} \bigr)^j \|\chi^{\rm out}_{l_k, y_j} \psi\|, \qquad j = 1, \ldots, n.$$

Here $y_0, \ldots, y_n$ is a sequence in $\Gamma_k \cap A_{k+1}(x_0)$ with $y \in \Lambda_{l_k/3}(y_0)$ and $d(y_j, y_{j+1}) \le l_k/3$. The lower estimate for $n$ and the polynomial bound for $\psi$ imply that for every $\tilde\gamma < 3\gamma$ we have

$$\|\chi_y \psi\| \le C(\psi, d, \rho, \tilde\gamma) \, e^{-\tilde\gamma \rho \, d(y, x_0)}.$$

Since $\rho \in (0, 1)$ and $\tilde\gamma < 3\gamma$ were arbitrary, (5.1) follows from a subsolution estimate $|\psi(y)| \le C_{I,M} \|\chi_y \psi\|$ (e.g. [19]).
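The summability used for the Borel-Cantelli step in the proof above can be spelled out as follows. This is a sketch under the properties listed in Theorem 4.3; the abbreviation $\delta$ is introduced here for convenience and does not appear in the paper.

```latex
\delta := 2\xi - d(\alpha - 1) > 0, \qquad
l_{k+1} \ge l_k^{\alpha} \;\Longrightarrow\; l_k \ge l_1^{\alpha^{k-1}},
\\[4pt]
\sum_{k \ge 1} \mathbb{P}(E_k(x_0))
  \;\le\; c_b \sum_{k \ge 1} l_k^{-\delta}
  \;\le\; c_b \sum_{k \ge 1} l_1^{-\delta \alpha^{k-1}}
  \;\le\; c_b \sum_{k \ge 1} l_1^{-\delta (1 + (k-1)(\alpha - 1))}
  \;<\; \infty .
```

The third inequality uses Bernoulli's inequality $\alpha^{k-1} \ge 1 + (k-1)(\alpha - 1)$, so the series is dominated by a geometric series with ratio $l_1^{-\delta(\alpha - 1)} < 1$. Only $\delta > 0$ is needed, which is exactly condition (i) of Theorem 4.3.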


We point out that some of the changes in the above argument compared to [6] and [9] can also be used to streamline the proof of Theorem 6 in [9] as was already observed in [23]: Note that our result in Theorem 4.3 is actually somewhat stronger than the corresponding results in [6] respectively [9], even in the case of compactly supported $f$. The difference is that in the definition of $\gamma$-good cubes we work with $\Lambda_l^{\rm inn} = \Lambda_{l/3}$ as the inner cube, while [6] and [9] use unit cubes around the center in the same context. This has the effect that in the estimate for $\mathbb{P}(E_k(x_0))$ above we can work with the counting factor $(l_{k+1}/l_k)^d$, where the proof of [6, Lemma 3.1] needs $l_{k+1}^d$. A consequence of this fact is that we can now directly complete the proof of exponential decay with $\xi > 0$ as provided in Theorem 4.3, while the reasoning in [6] needs $\xi > d$ in this context. In the proof of their Theorem 6 on band edge localization in [9] (which in the case of compactly supported $f$ corresponds to our Theorem 1.2), Figotin and Klein provide an alternative argument to show that, roughly speaking, a result as Theorem 4.3 with $\xi > 0$ implies that the same result holds for some $\xi > d$. This requires an additional multiscale analysis, which can be dropped by using our more direct argument.

Acknowledgement. We thank the referee for pointing out to us that the results in an earlier version of this paper could be improved to lead to exponential decay of the eigenfunctions.

References

1. Barbaroux, J.M., Combes, J.M. and Hislop, P.D.: Localization near band edges for random Schrödinger operators. Helv. Phys. Acta 70, 16–43 (1997)
2. Carmona, R., Klein, A. and Martinelli, F.: Anderson Localization for Bernoulli and Other Singular Potentials. Commun. Math. Phys. 108, 41–66 (1987)
3. Combes, J.M. and Hislop, P.D.: Localization for some continuous, random Hamiltonians in d-dimensions. J. Funct. Anal. 124, 149–180 (1994)
4. Combes, J.M., Hislop, P.D. and Mourre, E.: Spectral averaging, perturbation of singular spectrum, and localization. Trans. Amer. Math. Soc. 348, 4883–4894 (1996)
5. Delyon, F., Levy, Y. and Souillard, B.: Approach à la Borland to multidimensional localisation. Phys. Rev. Lett. 55, 618–621 (1985)
6. von Dreifus, H. and Klein, A.: A new proof of localization in the Anderson tight binding model. Commun. Math. Phys. 124, 285–299 (1989)
7. von Dreifus, H. and Klein, A.: Localization for Random Schrödinger Operators with Correlated Potentials. Commun. Math. Phys. 140, 133–147 (1991)
8. Figotin, A. and Klein, A.: Localization phenomenon in gaps of the spectrum of random lattice operators. J. Stat. Phys. 75, 997–1021 (1994)
9. Figotin, A. and Klein, A.: Localization of classical waves I: Acoustic waves. Commun. Math. Phys. 180, 439–482 (1996)
10. Fischer, W.: Zur elektronischen Lokalisierung durch gaußsche zufällige Potentiale. PhD-Thesis, Erlangen 1996
11. Fröhlich, J., Martinelli, F., Scoppola, E. and Spencer, T.: Constructive Proof of Localization in the Anderson Tight Binding Model. Commun. Math. Phys. 101, 21–46 (1985)
12. Fröhlich, J. and Spencer, T.: Absence of Diffusion in the Anderson Tight Binding Model for Large Disorder or Low Energy. Commun. Math. Phys. 88, 151–184 (1983)
13. Holden, H. and Martinelli, F.: On the absence of diffusion for a Schrödinger operator on $L^2(\mathbb{R}^\nu)$ with a random potential. Commun. Math. Phys. 93, 197–217 (1984)
14. Kirsch, W.: Wegner estimates and Anderson localization for alloy-type potentials. Math. Z. 221, 507–512 (1996)
15. Kirsch, W., Kotani, S. and Simon, B.: Absence of absolutely continuous spectrum for some one-dimensional random but deterministic Schrödinger operators. Ann. Inst. H. Poincaré Phys. Théor. 42, 383–406 (1985)
16. Kirsch, W., Stollmann, P. and Stolz, G.: Localization for random perturbations of periodic Schrödinger operators. Random Operators and Stochastic Equations, to appear
17. Klopp, F.: Localization for some continuous random Schrödinger operators. Commun. Math. Phys. 167, 553–569 (1995)
18. Klopp, F.: Internal Lifshits tails for random perturbations of periodic Schrödinger operators. Preprint
19. Simon, B.: Schrödinger semigroups. Bull. Amer. Math. Soc. 7, 447–526 (1981)
20. Simon, B. and Wolff, T.: Singular continuous spectrum under rank one perturbations and localization for random Hamiltonians. Comm. Pure Appl. Math. 39, 75–90 (1986)
21. Stollmann, P.: Localization for acoustic waves in random perturbations of periodic media. Israel J. Math., to appear
22. Stollmann, P.: A short proof of a Wegner estimate and localization. Preprint 1997
23. Stollmann, P.: Caught by Disorder: Lectures on Bound States in Random Media. In preparation

Communicated by B. Simon

Commun. Math. Phys. 195, 509 – 523 (1998)


Instability of the Periodic Motion of a Particle Interacting with a Scalar Wave Field

Markus Kunze

Mathematisches Institut der Universität Köln, Weyertal 86, D-50931 Köln, Germany. E-mail: [email protected].

Received: 18 August 1997 / Accepted: 9 December 1997

Abstract: We consider the system of a point particle, accelerated through a confining potential, and interacting with a scalar wave field. We prove that the periodic solutions of the linearized system (which has an eigenvalue embedded in the continuous spectrum) do not survive perturbation through the nonlinearity, if some “mode” of the wave field does not couple to the particle.

1. Introduction and Main Results

We consider the physical system of a mechanical point particle coupled in a translation invariant manner to a scalar wave field, described through the equations

\[
\begin{aligned}
\dot\phi(t,x) &= \pi(t,x), & \dot\pi(t,x) &= \Delta\phi(t,x) - \rho(x-q(t)),\\
\dot q(t) &= (1+p(t)^2)^{-1/2}\,p(t), & \dot p(t) &= -\nabla V(q(t)) - \int \nabla\phi(t,x)\,\rho(x-q(t))\,dx,
\end{aligned}
\tag{1}
\]

for $t \in \mathbb{R}$ and $x \in \mathbb{R}^3$. Here $\phi = \phi(t,x)$ is the scalar wave field, and $q = q(t) \in \mathbb{R}^3$ is the position of the particle, whereas $\pi = \pi(t,x)$ and $p = p(t)$ represent the velocity field and the momentum of the particle, respectively. The motion of the particle is subjected to a potential $V = V(q)$, and the coupling is realized through a radially symmetric and compactly supported form factor $\rho$, thus allowing only finite-distance interactions. We assume that the potential $V$ satisfies

\[
V \in C^3(\mathbb{R}^3), \qquad \lim_{|q|\to\infty} V(q) = \infty, \tag{V}
\]

and for the charge distribution $\rho$ we require

\[
\rho \in C_c^\infty(\mathbb{R}^3), \qquad \rho(x) = \rho_r(|x|). \tag{C}
\]


System (1) describes radiative damping, where the accelerated charge generates fields which partially are radiated to infinity, cf. [?, ?, ?, ?]. There has also been a lot of work on the question whether there is a consistent, relativistically covariant theory of point charges and field, cf. [?, ?, ?, ?], a question which on the classical level seems to be undecided, see [?]. It should also be remarked that instead of the scalar wave field in (1), an analogous system with the full Maxwell field was looked at in [?], whereas the asymptotics of soliton solutions to (1) was considered in [?].

In this paper we are interested in the qualitative behaviour of solutions to the Hamiltonian system (1) for large times. For this, we first note that the stationary solutions of (1) are

\[
\mathcal{S} = \{(\phi, q, \pi, p) = (\phi_{q^*}, q^*, 0, 0) =: Y_{q^*} \mid q^* \in S\},
\]

where $S = \{q \in \mathbb{R}^3 : \nabla V(q) = 0\}$ is the set of critical points for $V$, and

\[
\phi_{q^*}(x) = \Delta^{-1}\rho(x - q^*) = -\frac{1}{4\pi}\int \frac{d^3y}{|x-y|}\,\rho(y - q^*).
\]

In [?] we supposed that the Wiener condition

\[
\hat\rho(k) = (2\pi)^{-3/2}\int e^{-ik\cdot x}\rho(x)\,d^3x \neq 0, \qquad k \in \mathbb{R}^3, \tag{W}
\]

holds, and under this condition we proved that solutions of finite energy converge to $\mathcal{S}$ in suitable local energy norms. In addition, we detected the rate of convergence of solutions $Y(t) = (\phi(t), q(t), \pi(t), p(t)) \to Y_{q^*}$, provided that $Y(0)$ decays sufficiently fast in $x$-space, and provided that $Y(0)$ is close enough to $Y_{q^*}$, with $q^*$ being “stable” for $V$, i.e., $D^2V(q^*)$ is positive definite.

So the natural question arises whether periodic solutions are generated in the neighborhood of some equilibrium, if (W) is violated on some sphere $|k| = |k_0|$, a condition which can be interpreted to say that some “mode” of the wave field does not couple to the particle. Since (W) is used mainly to exclude purely imaginary eigenvalues of the linearization of (1) around an equilibrium, we ask more precisely if the periodic solutions of the linearized system generated by an eigenvalue (due to a zero of $\hat\rho$) create periodic solutions of the full non-linear system.

To explain this in greater detail, we restrict ourselves to the case $q^* = 0 \in S$ and $D^2V(0) = \omega_0^2 E_3$ ($E_3$ being the unit matrix), i.e., $q^* = 0$ is “stable” for the linearization. We consider the system around the corresponding equilibrium $Y_{q^*} = Y_0 \in \mathcal{S}$. With $u = \phi - \phi_{q^*} = \phi - \phi_0$ the second order system for $\psi = (u, q)$ reads as

\[
\ddot\psi = -A\psi + F(\psi) \tag{2}
\]

with linearization

\[
A\psi = A(u,q) = \begin{pmatrix} -\Delta u(x) - \nabla\rho(x)\cdot q \\[2pt] (\omega_0^2 + \omega_1^2)\,q - \int \nabla\rho(x)\,u(x)\,dx \end{pmatrix} \tag{3}
\]

in $D(A) = H^2 \oplus \mathbb{C}^3$, $H^m = H^m(\mathbb{R}^3) = W^{m,2}(\mathbb{R}^3)$ being the usual Sobolev space of functions having $m$th order distributional derivatives in $L^2 = L^2(\mathbb{R}^3)$, and nonlinearity

\[
F(\psi) = \begin{pmatrix} \rho(x) - \rho(x-q) - \nabla\rho(x)\cdot q \\[2pt] \nabla V(0) - \nabla V(q) + \omega_0^2 q + \int \big[\rho(x) - \rho(x-q) - \nabla\rho(x)\cdot q\big]\nabla\phi_0(x)\,dx + \int \big[\nabla\rho(x-q) - \nabla\rho(x)\big]u(x)\,dx \end{pmatrix} \in C_c^\infty(\mathbb{R}^3) \oplus \mathbb{C}^3. \tag{4}
\]

Here we let $\omega_1^2 = \frac{1}{3}|\hat\rho|_{L^2}^2$, and for simplicity of notation we considered (1) with $\dot q = p$ instead of $\dot q = (1+p^2)^{-1/2}p$; note also that $\nabla V(0) = 0 = \int \rho(x)\nabla\phi_0(x)\,dx$. Suppose now that the Wiener condition (W) is violated, in that

\[
\hat\rho(k) = 0 \quad\text{for}\quad |k| = |k_0| > 0, \qquad\text{but}\qquad \hat\rho(k) \neq 0 \quad\text{for}\quad |k| \neq |k_0|. \tag{5}
\]

Then $[0,\infty[$ is the spectrum of $A$, with continuous spectrum $[0,\infty[\,\setminus\{|k_0|^2\}$ and an isolated embedded eigenvalue $\lambda_0 = |k_0|^2$, cf. Lemma 3 below. The corresponding eigenspace is

\[
E = \{\psi : A\psi = |k_0|^2\psi\} = \Big\{\psi = \psi_q := \begin{pmatrix} u_q \\ q \end{pmatrix} : q \in \mathbb{C}^3,\ \hat u_q(k) = \frac{i\,(k\cdot q)\,\hat\rho(k)}{k^2 - |k_0|^2}\Big\}. \tag{6}
\]

Since $A\psi = |k_0|^2\psi$ consists of two equations, in order to obtain an eigenvalue we have to assume additionally that $\rho$ is adjusted in the right way, i.e., $\rho = \gamma\rho_0$ with some parameter $\gamma$ which can be chosen such that

\[
\alpha(\pm i|k_0|) = 0, \qquad\text{with}\qquad \alpha(\lambda) = \omega_0^2 + \lambda^2\Big(1 + \frac{\gamma^2}{3}\int \frac{|\hat\rho_0(k)|^2}{k^2 + \lambda^2}\,dk\Big), \tag{7}
\]

see again Sect. 2, since otherwise $A$ will have empty point spectrum. Note that it was exactly the fact that $\alpha$ had no zeros on the imaginary axis which allowed us to prove in [?] that $(u,q) = (0,0)$ was exponentially attracting (in a suitable sense), if (W) is assumed to hold.

Returning to the problem at hand, we hence find that the linear equation $\ddot\psi + A\psi = 0$ has gained the $2\pi/|k_0|$-periodic solutions $\sin(|k_0|t)\,\psi_q$ with $\psi_q \in E$, or stated differently, the operator

\[
K_0\psi = |k_0|^2\frac{\partial^2}{\partial t^2}\psi + A\psi
\]

has kernel

\[
\ker K_0 = \{e^{-it}\psi_- + e^{it}\psi_+ : \psi_\pm \in E\}. \tag{8}
\]

In general, when investigating non-linear equations, often it can be shown that the fully non-linear problem is somehow dominated by the linearized system, at least near to the equilibria. Thus we fix $q \in \mathbb{R}^3$, small $\varepsilon > 0$, and the initial value $\varepsilon(0, |k_0|)\psi_q$ of the periodic solution $(\psi(t), \dot\psi(t)) = \varepsilon(\sin(|k_0|t), |k_0|\cos(|k_0|t))\psi_q$ of the linearized equation, and we ask whether there are initial values $(\psi_\varepsilon^0, \dot\psi_\varepsilon^0)$ for the full system (2) close to $\varepsilon(0,|k_0|)\psi_q$ which generate periodic solutions of (2) with period $\tau_\varepsilon = 2\pi/\omega_\varepsilon$ close to $2\pi/|k_0|$. We fix this in more mathematical terms in the following

Definition 1. Let $E$ be defined through (6). A $\psi_q \in E$ is called stable for (2), if there are initial values $(\psi_\varepsilon^0, \dot\psi_\varepsilon^0)$ such that the corresponding solutions $(\psi_\varepsilon(t), \dot\psi_\varepsilon(t))$ of (2) are periodic with period $\tau_\varepsilon = 2\pi/\omega_\varepsilon$, and
(a) $\|\psi_\varepsilon^0\|_{L^2\oplus\mathbb{C}^3} + \|\dot\psi_\varepsilon^0 - \varepsilon|k_0|\psi_q\|_{L^2\oplus\mathbb{C}^3} = o(\varepsilon)$ as $\varepsilon \to 0^+$,
(b) $|\omega_\varepsilon - |k_0|| = o(\varepsilon)$ as $\varepsilon \to 0^+$, and
(c) $\|\psi_\varepsilon^0\|_{H^2_{1,1}\oplus\mathbb{C}^3} + \|\dot\psi_\varepsilon^0\|_{H^1_1\oplus\mathbb{C}^3} = O(\varepsilon)$ as $\varepsilon \to 0^+$.

Here (c) is a technical assumption, which appears due to the fact that we have to derive certain bounds in weighted $L^2$-spaces in order to apply a suitable limiting absorption principle, cf. the remarks below; the spaces $H^2_{1,1}$ and $H^1_1$ are defined in (20) resp. (19) below. Our main result is that no $\psi_q$ with $q \neq 0$ is stable for (2).
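The adjustment $\rho = \gamma\rho_0$ required by (7) can be made concrete in a small numerical sketch. Everything specific here is an assumption chosen for illustration and not taken from the paper: the radial profile $\hat\rho_0(k) = (k^2 - k_0^2)e^{-k^2/2}$ (which vanishes on $|k| = k_0$ but is not the transform of a compactly supported $\rho_0$), the values $\omega_0 = 2$, $k_0 = 1$, and the function names. The code evaluates the integral in (7) at $\lambda = i k_0$ in polar coordinates and solves $\alpha(i k_0) = 0$ for $\gamma$.

```python
import math

def coupling_integral(k0, r_max=12.0, steps=24000):
    """Integral of |rho0_hat(k)|^2 / (k^2 - k0^2) over R^3 for the hypothetical
    profile rho0_hat(k) = (k^2 - k0^2) * exp(-k^2 / 2).
    The quotient simplifies to (k^2 - k0^2) * exp(-k^2), so no singularity remains."""
    h = r_max / steps
    total = 0.0
    for j in range(1, steps + 1):
        r = j * h
        total += r * r * (r * r - k0 * k0) * math.exp(-r * r)
    # 4*pi*r^2 dr is the volume element for a radial integrand
    return 4.0 * math.pi * total * h

def adjusted_gamma(omega0, k0):
    """Solve alpha(i*k0) = omega0^2 - k0^2 * (1 + gamma^2 / 3 * I) = 0 for gamma > 0."""
    integral = coupling_integral(k0)
    gamma_squared = 3.0 * (omega0 ** 2 / k0 ** 2 - 1.0) / integral
    if gamma_squared <= 0.0:
        raise ValueError("no real gamma exists for these parameters")
    return math.sqrt(gamma_squared)

def alpha_on_axis(omega0, k0, gamma):
    """alpha(lambda) from (7) evaluated at lambda = i*k0, where lambda^2 = -k0^2."""
    return omega0 ** 2 - k0 ** 2 * (1.0 + gamma ** 2 / 3.0 * coupling_integral(k0))

if __name__ == "__main__":
    gamma = adjusted_gamma(omega0=2.0, k0=1.0)
    print("gamma =", gamma, " alpha(i k0) =", alpha_on_axis(2.0, 1.0, gamma))
```

For this toy profile the integral has the closed form $\pi^{3/2}(3/2 - k_0^2)$, so a real $\gamma$ exists only when $\omega_0^2/k_0^2 - 1$ and $3/2 - k_0^2$ have the same sign; the sketch raises an error otherwise.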


Theorem 1. Let assumptions (C) on $\rho$ and (V) on $V$ be satisfied. Suppose furthermore that (W) is violated, i.e., (5), and that $\rho = \gamma\rho_0$ can be adjusted to fulfill (7). Then every $\psi_q \in E$ with $q \neq 0$ is unstable for (2), in the sense of Definition 1.

Remarks.

(i) With regard to the original system (1), Theorem 1 may be summarized as saying that the periodic solutions of the linear system near $Y_{q^*} = Y_0$ do not survive the perturbation through the nonlinearity, even if $q^* = 0$ is assumed to be stable, in the sense that $D^2V(0) > 0$ as a quadratic form (we had $D^2V(0) = \omega_0^2 E_3$ only for simplicity). This transfers to the other equilibria $Y_{q^*} \in \mathcal{S}$ with $0 \neq q^* \in S$ under the respective assumptions.

(ii) To simplify some technicalities, during the proof we will assume $V \in C^4(\mathbb{R}^3)$, but in fact $V \in C^3(\mathbb{R}^3)$ is enough.

(iii) It is also not important that we considered $\dot q = p$ instead of $\dot q = (1+p^2)^{-1/2}p$ in (1), since in case of a relativistic kinetic energy the linearization remains unchanged, while in the second component of $F$ there appears an additional term of order four, which will not affect the function $f_1(\psi) = (\partial/\partial\varepsilon)\big(\varepsilon^{-1}F(\varepsilon\psi)\big)\big|_{\varepsilon=0}$, and it is mainly the properties of this function that enter the proof of the theorem.

(iv) We prove that the periodic solutions of the linear system do not generate periodic solutions of the non-linear system, but this a priori does not exclude the existence of periodic solutions to (1) in case that (W) is violated. However, we expect that there are no such periodic solutions, at least for some potentials.

(v) For the proof of this theorem we rely heavily on the ideas of Sigal [?]. There it was shown that for non-linear wave equations $-\frac{\partial^2 u}{\partial t^2} = Hu + f_\varepsilon(u)$ with a Schrödinger-like operator $H$ having an isolated eigenvalue $\lambda_0 > 0$, periodic solutions of the linearized equation are unstable to non-linear perturbations in a sense similar to Definition 1. The main assumptions were that $H$ satisfies the Mourre estimate, and that $n^2\lambda_0 \notin \text{disc spec}\,H \cup \text{thresholds}\,H$ for $n \geq 2$. Additionally, it was important to impose a further condition in order to exclude the possibility of $\delta(H - n^2\lambda_0)f_n = 0$ for all $n \geq 2$, cf. [?, Eq. (2.15)]. Here $f_n$ are the Fourier coefficients of $\partial f_\varepsilon/\partial\varepsilon$ at $\varepsilon = 0$, evaluated at the periodic solution of the linearized system, and $\delta$ is related to a certain limiting absorption principle explained in more detail below. We note that the analogue of this latter condition will already be guaranteed by

\[
\hat\rho(k) \neq 0 \quad\text{on the sphere}\quad |k| = 2|k_0| \tag{9}
\]

in our present case, since this will imply $\delta(A - 4|k_0|^2)f_2 \neq 0$. In (5) it is assumed more restrictively that $\hat\rho(k) \neq 0$ for $|k| \neq |k_0|$ only to avoid additional eigenvalues besides $\lambda_0 = |k_0|^2$ and to ensure the validity of the limiting absorption principle for $A$ for all $\lambda \in\, ]0,\infty[\,\setminus\{|k_0|^2\}$.

(vi) In (b) of Definition 1 we impose the condition $|\omega_\varepsilon - |k_0|| = o(\varepsilon)$ as $\varepsilon \to 0^+$, which is stronger than $|\omega_\varepsilon - |k_0|| \to 0$, as was assumed in (i) of [?, Def. 2.1]. This is compensated through the fact that we only have to require $L^2 \oplus \mathbb{C}^3$-convergence of the initial data in (a). Theorem 1 remains true, if (b) is relaxed to $|\omega_\varepsilon - |k_0|| \to 0$, and the convergence in (a) is strengthened to convergence in $H^2_{1,1} \oplus \mathbb{C}^3$-norm for $\psi_\varepsilon^0$ resp. in $H^1_1 \oplus \mathbb{C}^3$-norm for $\dot\psi_\varepsilon^0$. We preferred the conditions chosen, since they fit more into the framework considered earlier in [?].


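Next we describe the strategy of the proof of Theorem 1, following [?]. Consider $\tilde\psi_\varepsilon(t,x) = \varepsilon^{-1}\psi_\varepsilon(t/\omega_\varepsilon, x)$ with a sequence $\psi_\varepsilon$ according to Definition 1. Then $\tilde\psi_\varepsilon$ is $2\pi$-periodic in $t$ and satisfies, dropping the tilde again for notational simplicity,

\[
\omega_\varepsilon^2\,\ddot\psi_\varepsilon + A\psi_\varepsilon = \varepsilon^{-1}F(\varepsilon\psi_\varepsilon) \tag{10}
\]

as well as $(\psi_\varepsilon^0, \dot\psi_\varepsilon^0) \in (H^2_{1,1}\oplus\mathbb{C}^3)\oplus(H^1_1\oplus\mathbb{C}^3)$,

\[
\left.
\begin{aligned}
&\text{(a)}\quad \|\psi_\varepsilon^0\|_{L^2\oplus\mathbb{C}^3} + \|\dot\psi_\varepsilon^0 - \psi_q\|_{L^2\oplus\mathbb{C}^3} \to 0 \quad\text{as}\quad \varepsilon \to 0^+,\\
&\text{(b)}\quad |\omega_\varepsilon - |k_0|| = o(\varepsilon) \quad\text{as}\quad \varepsilon \to 0^+, \quad\text{and}\\
&\text{(c)}\quad \sup_{\varepsilon\in]0,\varepsilon_0]}\|\psi_\varepsilon^0\|_{H^2_{1,1}\oplus\mathbb{C}^3} + \sup_{\varepsilon\in]0,\varepsilon_0]}\|\dot\psi_\varepsilon^0\|_{H^1_1\oplus\mathbb{C}^3} \leq C.
\end{aligned}
\right\} \tag{11}
\]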

We write (10) as

\[
0 = \Big[|k_0|^2\frac{\partial^2}{\partial t^2}\psi_\varepsilon + A\psi_\varepsilon\Big] + \Big[(\omega_\varepsilon^2 - |k_0|^2)\frac{\partial^2}{\partial t^2}\psi_\varepsilon + f_\varepsilon(\psi_\varepsilon)\Big] =: K_0\psi_\varepsilon + I_\varepsilon(\psi_\varepsilon) \tag{12}
\]

with $f_\varepsilon(\psi) = \varepsilon^{-1}F(\varepsilon\psi)$ and try to “invert” $K_0$. For this we have to avoid $\ker K_0$ from (8) and thus introduce the orthogonal projections

\[
P_0 : \mathcal{H}^2 \to \ker K_0 \qquad\text{and}\qquad \bar P_0 = 1 - P_0 : \mathcal{H}^2 \to (\ker K_0)^\perp
\]

with $\mathcal{H}^2 = L^2(S^1\times\mathbb{R}^3)\oplus L^2(S^1;\mathbb{C}^3)$. By e.g. Fourier series expansion it may be seen that some “invertibility” of $\bar P_0 K_0 \bar P_0$, i.e., of $K_0$ outside $\ker K_0$, will be an immediate consequence of some kind of “invertibility” of $(A - n^2|k_0|^2)$ for $n \geq 2$; note that in our case $n^2|k_0|^2 \notin \text{disc spec}\,A$ for $n \geq 2$, as is a basic assumption of [?, Thm. 2.2]. Hence an adaptation of the limiting absorption principle, cf. [?, Sect. 4], is needed for $\lambda = n^2|k_0|^2$, $n \geq 2$, and this works as follows. By formally solving $(u_1, q_1) = (A - \lambda)(u, q)$ for $(u,q)$ with given $(u_1,q_1)$, we obtain from the first equation by Fourier transform

\[
\hat u(k) = \frac{\hat u_1(k) + i(k\cdot q)\hat\rho(k)}{k^2 - \lambda}.
\]

Because of the singularity we hence cannot expect a solution $u \in L^2$, but if, arguing very heuristically, $\hat u_1$ had a derivative, then in integrals the $1/(k^2-\lambda)$ singularity could be improved to a logarithmic singularity through integration by parts, and this logarithmic singularity is harmless. The property of $\hat u_1$ to have a derivative corresponds to $u_1$ being an element of the weighted $L^2$-space $L^2_1(\mathbb{R}^3)$, with

\[
L^2_s = L^2_s(\mathbb{R}^3) = \Big\{u = u(x) : \|u\|^2_{L^2_s(\mathbb{R}^3)} = \int (1+x^2)^s\,|u(x)|^2\,dx < \infty\Big\}, \qquad s \in \mathbb{R}.
\]

By using the classical limiting absorption principle for $-\Delta$ from [?, Thm. 4.1], this can be made precise to yield (cf. Lemma 6 below) the following

Lemma 1. For $\lambda \in\, ]0,\infty[$ such that $\hat\rho(k) \neq 0$ on the sphere $|k| = \sqrt{\lambda}$, the limits

\[
R_A^\pm(\lambda) = \lim_{\eta\to 0^+}(A - \lambda \pm i\eta)^{-1}
\]

exist in the strong operator topology $B(L^2_1\oplus\mathbb{C}^3, L^2_{-1}\oplus\mathbb{C}^3)$.

As noted above this in turn implies


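Lemma 2. The limits

\[
R^\pm_{K_0}(0) = (\bar P_0 K_0 \bar P_0 \pm i0)^{-1} := \lim_{\eta\to 0^+}(\bar P_0 K_0 \bar P_0 \pm i\eta)^{-1}
\]

exist in the strong operator topology $B(\bar P_0\mathcal{H}^2_1, \bar P_0\mathcal{H}^2_{-1})$.

Here we defined $\mathcal{H}^2_s = L^2_s(S^1\times\mathbb{R}^3)\oplus L^2(S^1;\mathbb{C}^3)$ with

\[
L^2_s(S^1\times\mathbb{R}^3) = \Big\{u = u(t,x) : \|u\|^2_{L^2_s(S^1\times\mathbb{R}^3)} = \int_0^{2\pi}\!\!\int (1+x^2)^s\,|u(t,x)|^2\,dx\,dt < \infty\Big\}, \qquad s \in \mathbb{R}.
\]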

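This can be used as follows. Projection of (12) yields

\[
\bar P_0 K_0 \bar P_0\,\psi_\varepsilon = -\bar P_0 I_\varepsilon(\psi_\varepsilon),
\]

and thus

\[
(\bar P_0 K_0 \bar P_0 \pm i\eta)\bar P_0\psi_\varepsilon = \bar P_0\big(\pm i\eta\,\psi_\varepsilon - I_\varepsilon(\psi_\varepsilon)\big).
\]

Hence if our assumptions (11) on the initial values $(\psi_\varepsilon^0, \dot\psi_\varepsilon^0)$ can be used to ensure

\[
\psi_\varepsilon,\ I_\varepsilon(\psi_\varepsilon) \in \mathcal{H}^2_1, \tag{13}
\]

then Lemma 2 gives

\[
(\bar P_0 K_0 \bar P_0 \pm i0)^{-1}\big(\bar P_0 I_\varepsilon(\psi_\varepsilon)\big) = -\bar P_0\psi_\varepsilon,
\]

and hence

\[
\delta(\bar P_0 K_0 \bar P_0)\big(\bar P_0 I_\varepsilon(\psi_\varepsilon)\big) = 0, \tag{14}
\]

where

\[
\delta(\bar P_0 K_0 \bar P_0) = \frac{1}{2\pi i}\Big[(\bar P_0 K_0 \bar P_0 - i0)^{-1} - (\bar P_0 K_0 \bar P_0 + i0)^{-1}\Big].
\]

Suppose moreover that we can show from (11) that

\[
\frac{1}{\varepsilon}I_\varepsilon(\psi_\varepsilon) = \frac{(\omega_\varepsilon^2 - |k_0|^2)}{\varepsilon}\frac{\partial^2}{\partial t^2}\psi_\varepsilon + \varepsilon^{-2}F(\varepsilon\psi_\varepsilon) \to f_1(\psi_q) \quad\text{in}\quad \mathcal{H}^2_1, \tag{15}
\]

with

\[
f_1(\psi) = \frac{\partial}{\partial\varepsilon}f_\varepsilon(\psi)\Big|_{\varepsilon=0}, \qquad\text{and}\qquad f_\varepsilon(\psi) = \varepsilon^{-1}F(\varepsilon\psi).
\]

Then by continuity of $\delta(\bar P_0 K_0 \bar P_0) : \bar P_0\mathcal{H}^2_1 \to \bar P_0\mathcal{H}^2_{-1}$ we obtain from (14) as $\varepsilon \to 0^+$,

\[
\delta(\bar P_0 K_0 \bar P_0)\big(\bar P_0 f_1(\psi_q)\big) = 0.
\]

This corresponds to Eq. (5.8) of [?]. Thus exactly as in that paper we derive

\[
\delta(A - n^2|k_0|^2)\,f_n = 0, \qquad n \geq 2, \tag{16}
\]

where

\[
f_n(x) = \int_0^{2\pi} f_1(\sin t\,\psi_q)(x)\,e^{-int}\,dt = \int_0^{2\pi} f_1\Big(\sin t\begin{pmatrix} u_q \\ q \end{pmatrix}\Big)(x)\,e^{-int}\,dt.
\]

It will turn out, cf. Sect. ?? below, that $f_1(\sin t\,\psi_q)(x) = f_1(\psi_q)(x)\sin^2 t$, and because, for $n \geq 2$, $\int_0^{2\pi}\sin^2 t\,e^{-int}\,dt \neq 0$ iff $n = 2$, (16) is equivalent to

\[
\delta(A - 4|k_0|^2)\,f_1(\psi_q) = 0. \tag{17}
\]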
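The reduction from (16) to (17) rests only on the Fourier coefficients of $\sin^2 t$. The following sketch (the function name and sample count are choices made here, not part of the paper) evaluates $\int_0^{2\pi}\sin^2 t\,e^{-int}\,dt$ with the periodic trapezoid rule, which is exact up to rounding for trigonometric polynomials of low degree. Since $\sin^2 t = \tfrac12 - \tfrac14 e^{2it} - \tfrac14 e^{-2it}$, the coefficient equals $-\pi/2$ for $n = 2$ and vanishes for $n \geq 3$.

```python
import cmath
import math

def sin2_fourier_coefficient(n, samples=64):
    """Integral over [0, 2*pi] of sin(t)^2 * exp(-i*n*t) dt.

    Uses the trapezoid rule on a uniform periodic grid, which integrates
    trigonometric polynomials of degree below `samples` exactly up to rounding."""
    h = 2.0 * math.pi / samples
    total = 0j
    for j in range(samples):
        t = j * h
        total += math.sin(t) ** 2 * cmath.exp(-1j * n * t)
    return total * h

if __name__ == "__main__":
    for n in range(2, 7):
        print(n, sin2_fourier_coefficient(n))
```

Only the mode $n = 2$ survives among $n \geq 2$, which is why the single condition (9) on the sphere $|k| = 2|k_0|$ suffices.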

Contrary to this we will incorporate (9) to complete the proof of the theorem, in that we can show

\[
[R_A^-(\lambda) - R_A^+(\lambda)]\,f_1(\psi_q) = 2\pi i\,\delta(A - \lambda)\,f_1(\psi_q) \neq 0 \tag{18}
\]

with $\lambda = 4|k_0|^2$, thus contradicting (17). Note that (18) is the key point of the whole argument, being a kind of non-linear version of the Fermi Golden Rule from quantum mechanics, cf., e.g., [?, Sect. 4] or [?]. Since $\lambda = 4|k_0|^2$ lies in the continuous spectrum, (18) may be interpreted as saying that there is a non-vanishing coupling of the periodic solutions of the linear system to this continuous spectrum. This allows energy to be transported to infinity by radiation, thus excluding the possibility of periodic solutions to persist under the influence of the nonlinearity.

The paper is organized as follows. In Sect. 2 we first analyze the spectrum of $A$. Afterwards in Sect. 3, the limiting absorption principle for $A$ is shown, while in Sect. ?? the outline of proof from above is completed. Since in [?] we didn’t treat (1) in weighted $L^2$-spaces, we added an appendix Sect. ?? where we derive existence and useful estimates for the dynamics of (1) in such a setting.

We end this section by introducing some further notation. Let

\[
H^1_1 = H^1_1(\mathbb{R}^3) = \{u = u(x) \in H^1 : u, \nabla u \in L^2_1\}, \qquad \|u\|^2_{H^1_1} = \|u\|^2_{L^2_1} + \|\nabla u\|^2_{L^2_1}, \tag{19}
\]

and

\[
H^2_{1,1} = H^2_{1,1}(\mathbb{R}^3) = \{u = u(x) \in H^2 : u, \nabla u, \Delta u \in L^2_1\}, \qquad \|u\|^2_{H^2_{1,1}} = \|u\|^2_{H^2} + \|u\|^2_{H^1_1} + \|\Delta u\|^2_{L^2_1}. \tag{20}
\]

Recall also that the Fourier transform in $L^2$ is an isometric isomorphism from $H^s$ to $L^2_s$, $s \geq 0$, cf. [?, Sect. 10.2]. In general, only standard notations will be used, as may be found in many basic references like [?, ?]. E.g., we define $\langle u, v\rangle_{L^2} = \int u(x)\overline{v(x)}\,dx$ to be the usual inner product in $L^2$, with conjugation in the second argument.

2. On the Spectrum of the Linearization

From (3) above we recall that

\[
A\psi = A(u,q) = \begin{pmatrix} -\Delta u(x) - \nabla\rho(x)\cdot q \\[2pt] (\omega_0^2 + \omega_1^2)\,q - \int dx\,\nabla\rho(x)\,u(x) \end{pmatrix}
\]

in $D(A) = H^2\oplus\mathbb{C}^3 \subset L^2\oplus\mathbb{C}^3$. Thus, trying to solve $\begin{pmatrix} u_1 \\ q_1 \end{pmatrix} = (A - \lambda)\begin{pmatrix} u \\ q \end{pmatrix}$ with given $(u_1, q_1) \in L^2\oplus\mathbb{C}^3$ for $(u,q) \in D(A)$, we obtain by Fourier transform

\[
\hat u(k) = \frac{\hat u_1(k) + i(k\cdot q)\hat\rho(k)}{k^2 - \lambda}, \tag{21}
\]

and also, recalling $\omega_1^2 = \frac{1}{3}|\hat\rho|^2_{L^2}$, due to the spherical symmetry of $\hat\rho$,


\[
\begin{aligned}
q_1 &= (\omega_0^2 - \lambda)\,q + i\int k\,\hat\rho(k)\,\hat u(k)\,dk + \omega_1^2\,q\\
&= (\omega_0^2 - \lambda)\,q + \Big(-\frac{1}{3}\int \frac{k^2}{k^2-\lambda}\,|\hat\rho(k)|^2\,dk + \frac{1}{3}\int |\hat\rho(k)|^2\,dk\Big)q + i\int k\,\hat\rho(k)\,\frac{\hat u_1(k)}{k^2-\lambda}\,dk\\
&= \beta(-\lambda)\,q + i\int k\,\hat\rho(k)\,\frac{\hat u_1(k)}{k^2-\lambda}\,dk,
\end{aligned} \tag{22}
\]

where

\[
\beta(\lambda) = \omega_0^2 + \lambda\Big(1 + \frac{1}{3}\int \frac{|\hat\rho(k)|^2}{k^2+\lambda}\,dk\Big); \tag{23}
\]

note that $\alpha(\lambda) = \beta(\lambda^2)$ with $\alpha$ as in (7), hence $\beta(-|k_0|^2) = 0$. From (21) and (22) it is seen that $\mathbb{C}\setminus[0,\infty[$ is contained in the resolvent set of $A$. Concerning the point spectrum of $A$, (22) and (21) imply that $\lambda \in [0,\infty[$ is an eigenvalue iff $\beta(-\lambda) = 0$ and $\hat\rho(k) = 0$ for $|k| = \sqrt{\lambda}$. Thus (5) and $\beta(-|k_0|^2) = 0$ yield that $\lambda_0 = |k_0|^2 > 0$ is the only eigenvalue of $A$, and consequently it follows from (21) and (22) that under our assumptions we have the following

Lemma 3. If (5) and (7) hold, then $A$ has resolvent set $\mathbb{C}\setminus[0,\infty[$, continuous spectrum $[0,\infty[\,\setminus\{|k_0|^2\}$, and an embedded eigenvalue $\lambda_0 = |k_0|^2$ with eigenspace $E$ from (6).

Note that corresponding to this, the linearization of the original problem (1) around the equilibrium $Y_{q^*} = Y_0 \in \mathcal{S}$ has continuous spectrum $i\mathbb{R}\setminus\{\pm i|k_0|\}$ and two conjugate embedded eigenvalues $\pm i|k_0|$. A further property of the eigenelements $\psi_q = (u_q, q) \in E$, where $\hat u_q(k) = \frac{i(k\cdot q)\hat\rho(k)}{k^2 - |k_0|^2}$, is given in

Lemma 4. $E \subset H^2_{1,1}\oplus\mathbb{C}^3$.

Proof. We have to show $u_q \in H^2\cap L^2_1$, $\nabla u_q \in L^2_1$, and $\Delta u_q \in L^2_1$, or equivalently $\hat u_q \in L^2_2\cap H^1$, $k_i\hat u_q \in H^1$, and $k_i^2\hat u_q(k) \in H^1$ for $i = 1, 2, 3$. Since $\hat\rho(k) = 0$ for $|k| = |k_0|$, obviously $\hat u_q \in L^2_2$. To prove the rest, we apply [?, Thm. 3.2] with $f(k) = \widehat{\nabla\rho\cdot q}(k) = i(k\cdot q)\hat\rho(k)$. Then $f \in \mathcal{S}$, the Schwartz space of rapidly decreasing functions, and hence the quoted theorem can be used with $\alpha = 0$, $\alpha = \delta_i$, and $\alpha = 2\delta_i$ (in the notation used there) to imply $\hat u_q = v_0 \in H^1$, $k_i\hat u_q = v_{\delta_i} \in H^1$, and $k_i^2\hat u_q = v_{2\delta_i} \in H^1$. Note that by this theorem in fact even $u_q, \nabla u_q, \Delta u_q \in L^2_s$ for every $s \geq 1$.

3. Limiting Absorption Principle for A

In this section we prove a limiting absorption principle for $A$. This is readily obtained by means of the following lemma, which is a special case of a theorem of Agmon [?, Thm. 4.1].


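Lemma 5 (Limiting Absorption Principle for $-\Delta$). Let $R_0(\lambda) = (-\Delta - \lambda)^{-1}$ for $\lambda \in \mathbb{C}\setminus[0,\infty[$. Then for $\lambda \in\, ]0,\infty[$
(a) the limits

\[
R_0^\pm(\lambda) = \lim_{\eta\to 0^+}R_0(\lambda \pm i\eta)
\]

exist in the strong operator topology $B(L^2_1, L^2_{-1})$,
(b) the identities

\[
\operatorname{Im}\,\langle R_0^\pm(\lambda)u, u\rangle_{L^2} = \pm\frac{\pi}{2\sqrt{\lambda}}\int_{|k|=\sqrt{\lambda}}|\hat u(k)|^2\,d\sigma
\]

hold.

This can be used to derive the corresponding limiting absorption principle for $A$, which was partially stated already in Lemma 1 from the introduction.

Lemma 6. Let $A$ be defined through (3). Then for $\lambda \in\, ]0,\infty[$ such that $\hat\rho(k) \neq 0$ on the sphere $|k| = \sqrt{\lambda}$ the following holds.
(a) The limits

\[
R_A^\pm(\lambda) = \lim_{\eta\to 0^+}(A - \lambda \pm i\eta)^{-1}
\]

exist in the strong operator topology $B(L^2_1\oplus\mathbb{C}^3, L^2_{-1}\oplus\mathbb{C}^3)$.
(b) If additionally $\hat u_1(-k) = \hat u_1(k)$ ($k \in \mathbb{R}^3$), we have the identities

\[
\Big\langle R_A^\pm(\lambda)\begin{pmatrix} u_1 \\ q_1 \end{pmatrix}, \begin{pmatrix} u_1 \\ q_1 \end{pmatrix}\Big\rangle_{L^2\oplus\mathbb{C}^3} = \langle R_0^\mp(\lambda)u_1, u_1\rangle_{L^2} + \beta(-\lambda \pm i0)^{-1}\,|q_1|^2,
\]

with $R_0^\mp(\lambda)$ from Lemma 5 and

\[
\beta(-\lambda \pm i0) = \omega_0^2 - \lambda\Big(1 + \frac{1}{3}\langle R_0^\mp(\lambda)\rho, \rho\rangle_{L^2}\Big). \tag{24}
\]

Proof. Let $(u_1, q_1) \in L^2_1\oplus\mathbb{C}^3$. Solving $\begin{pmatrix} u_1 \\ q_1 \end{pmatrix} = (A - \lambda \pm i\eta)\begin{pmatrix} u_\eta^\pm \\ q_\eta^\pm \end{pmatrix}$ for $(u_\eta^\pm, q_\eta^\pm)$, we obtain from the first equation

\[
u_\eta^\pm = R_0(\lambda \mp i\eta)(u_1 + \nabla\rho\cdot q_\eta^\pm). \tag{25}
\]

Moreover, plugging the Fourier transform

\[
\hat u_\eta^\pm(k) = \frac{\hat u_1(k) + i(k\cdot q_\eta^\pm)\hat\rho(k)}{k^2 - \lambda \pm i\eta}
\]

into the second equation $q_1 = (\omega_0^2 + \omega_1^2)q_\eta^\pm + i\int dk\,k\,\hat\rho(k)\,\hat u_\eta^\pm(k) - (\lambda \mp i\eta)q_\eta^\pm$ and taking into account $\omega_1^2 = \frac{1}{3}|\hat\rho|^2_{L^2}$, we compute

\[
\begin{aligned}
q_\eta^\pm &= \beta(-\lambda \pm i\eta)^{-1}\Big[q_1 - i\int dk\,k\,\hat\rho(k)\,\frac{\hat u_1(k)}{k^2 - \lambda \pm i\eta}\Big]\\
&= \beta(-\lambda \pm i\eta)^{-1}\Big[q_1 + \int dk\,\overline{\widehat{\nabla\rho}(k)}\,\frac{\hat u_1(k)}{k^2 - \lambda \pm i\eta}\Big]\\
&= \beta(-\lambda \pm i\eta)^{-1}\Big[q_1 + \langle R_0(\lambda \mp i\eta)u_1, \nabla\rho\rangle_{L^2(\mathbb{R}^3,\mathbb{C}^3)}\Big],
\end{aligned} \tag{26}
\]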


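cf. formula (4.6) of [?]. Here we had

\[
\beta(\lambda) = \omega_0^2 + \lambda\Big(1 + \frac{1}{3}\int \frac{|\hat\rho(k)|^2}{k^2+\lambda}\,dk\Big) = \omega_0^2 + \lambda\Big(1 + \frac{1}{3}\langle R_0(-\lambda)\rho, \rho\rangle_{L^2}\Big), \qquad \lambda \in \mathbb{C}\setminus\,]-\infty, 0],
\]

recall (23) above. Hence Lemma 5(a) implies that

\[
\lim_{\eta\to 0^+}\beta(-\lambda \pm i\eta) = \omega_0^2 - \lambda\Big(1 + \frac{1}{3}\langle R_0^\mp(\lambda)\rho, \rho\rangle_{L^2}\Big) =: \beta(-\lambda \pm i0) \in \mathbb{C}
\]

exist, and Lemma 5(b) gives

\[
\operatorname{Im}\,\beta(-\lambda \pm i0) = \pm\frac{\pi}{6}\sqrt{\lambda}\int_{|k|=\sqrt{\lambda}}|\hat\rho(k)|^2\,d\sigma \neq 0,
\]

by assumption on $\lambda$. Using again Lemma 5 we thus conclude from (26) and (25) that (a) holds, with

\[
R_A^\pm(\lambda)\begin{pmatrix} u_1 \\ q_1 \end{pmatrix} =: \begin{pmatrix} u_0^\pm \\ q_0^\pm \end{pmatrix} = \begin{pmatrix} R_0^\mp(\lambda)(u_1 + \nabla\rho\cdot q_0^\pm) \\[2pt] \beta(-\lambda \pm i0)^{-1}\big[q_1 + \langle R_0^\mp(\lambda)u_1, \nabla\rho\rangle_{L^2(\mathbb{R}^3,\mathbb{C}^3)}\big] \end{pmatrix} \in L^2_{-1}\oplus\mathbb{C}^3.
\]

Concerning (b), it hence suffices to show that

\[
\lim_{\eta\to 0^+}\langle R_0(\lambda \mp i\eta)(\nabla\rho\cdot q_0^\pm), u_1\rangle_{L^2} = 0 = \lim_{\eta\to 0^+}\langle R_0(\lambda \mp i\eta)u_1, \nabla\rho\rangle_{L^2(\mathbb{R}^3,\mathbb{C}^3)}\cdot q_1,
\]

but this follows from the additional assumption $\hat u_1(-k) = \hat u_1(k)$, since also $\hat\rho(-k) = \hat\rho(k)$, and the Fourier transform of $\nabla\rho$ produces an extra $k$, so that the integrand is an odd function of $k$; here we use again formula (4.6) of [?].

4. Proof of Theorem 1

In this section we fill in some of the details missing in the sketch of proof in Sect. 1. Let $f_1(\psi) = f_1(u,q) = \frac{\partial}{\partial\varepsilon}\big(\varepsilon^{-1}F(\varepsilon\psi)\big)\big|_{\varepsilon=0} \in C_c^\infty(\mathbb{R}^3)\oplus\mathbb{C}^3$ be given through

\[
f_1(\psi) = \begin{pmatrix} -\frac{1}{2}D^2\rho(x)(q,q) \\[2pt] -\frac{1}{2}D^3V(0)(q,q) - \frac{1}{2}\int \nabla\phi_0(x)\,D^2\rho(x)(q,q)\,dx - \int D^2\rho(x)(q)\,u(x)\,dx \end{pmatrix} \tag{27}
\]

for $\psi \in L^2\oplus\mathbb{C}^3$, with $D^2$ denoting the Hessian. In the appendix we will show that (13) and (15) hold, cf. Corollaries ??, ??, and ??. By (??) we have $f_1(\sin t\,\psi_q)(x) = f_1(\psi_q)(x)\sin^2 t$, and therefore we only need to prove

\[
[R_A^-(\lambda) - R_A^+(\lambda)]\,f_1(\psi_q) = 2\pi i\,\delta(A - \lambda)\,f_1(\psi_q) \neq 0
\]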


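with $\lambda = 4|k_0|^2$, cf. (18). To see this, let $(u_1, q_1) = f_1(\psi_q)$. Then $\hat u_1(k) = \frac{1}{2}(k\cdot q)^2\hat\rho(k)$, and hence $\hat u_1(-k) = \hat u_1(k)$ implies that Lemma 6(b) can be used to yield

\[
\begin{aligned}
\operatorname{Im}\,\langle R_A^\pm(\lambda)f_1(\psi_q), f_1(\psi_q)\rangle_{L^2\oplus\mathbb{C}^3} &= \operatorname{Im}\,\langle R_0^\mp(\lambda)u_1, u_1\rangle_{L^2} + \operatorname{Im}\,\beta(-\lambda \pm i0)^{-1}\,|q_1|^2\\
&= \operatorname{Im}\,\langle R_0^\mp(\lambda)u_1, u_1\rangle_{L^2} - \frac{|q_1|^2}{|\beta(-\lambda \pm i0)|^2}\,\operatorname{Im}\,\beta(-\lambda \pm i0).
\end{aligned}
\]

We have

\[
\operatorname{Im}\,\langle R_0^\mp(\lambda)u_1, u_1\rangle_{L^2} = \mp\frac{\pi}{8\sqrt{\lambda}}\int_{|k|=\sqrt{\lambda}}|(k\cdot q)^2\hat\rho(k)|^2\,d\sigma
\]

and

\[
\operatorname{Im}\,\beta(-\lambda \pm i0) = -\frac{\lambda}{3}\operatorname{Im}\,\langle R_0^\mp(\lambda)\rho, \rho\rangle_{L^2} = \pm\frac{\pi}{6}\sqrt{\lambda}\int_{|k|=\sqrt{\lambda}}|\hat\rho(k)|^2\,d\sigma
\]

by (24) and Lemma 5(b), whence

\[
\operatorname{Im}\,\langle R_A^\pm(\lambda)f_1(\psi_q), f_1(\psi_q)\rangle_{L^2\oplus\mathbb{C}^3} = \mp\frac{\pi}{8\sqrt{\lambda}}\int_{|k|=\sqrt{\lambda}}|(k\cdot q)^2\hat\rho(k)|^2\,d\sigma \mp \frac{\pi}{6}\frac{|q_1|^2\sqrt{\lambda}}{|\beta(-\lambda \pm i0)|^2}\int_{|k|=\sqrt{\lambda}}|\hat\rho(k)|^2\,d\sigma.
\]

Therefore

\[
\begin{aligned}
\operatorname{Im}\,\big\langle [R_A^-(\lambda) - R_A^+(\lambda)]f_1(\psi_q), f_1(\psi_q)\big\rangle_{L^2\oplus\mathbb{C}^3}
&= \frac{\pi}{8\sqrt{\lambda}}\int_{|k|=\sqrt{\lambda}}|(k\cdot q)^2\hat\rho(k)|^2\,d\sigma + \frac{\pi}{6}\frac{|q_1|^2\sqrt{\lambda}}{|\beta(-\lambda - i0)|^2}\int_{|k|=\sqrt{\lambda}}|\hat\rho(k)|^2\,d\sigma\\
&\quad - \Big(-\frac{\pi}{8\sqrt{\lambda}}\int_{|k|=\sqrt{\lambda}}|(k\cdot q)^2\hat\rho(k)|^2\,d\sigma - \frac{\pi}{6}\frac{|q_1|^2\sqrt{\lambda}}{|\beta(-\lambda + i0)|^2}\int_{|k|=\sqrt{\lambda}}|\hat\rho(k)|^2\,d\sigma\Big)\\
&= \frac{\pi}{4\sqrt{\lambda}}\int_{|k|=\sqrt{\lambda}}|(k\cdot q)^2\hat\rho(k)|^2\,d\sigma\\
&\quad + \frac{\pi}{6}|q_1|^2\sqrt{\lambda}\Big(\frac{1}{|\beta(-\lambda + i0)|^2} + \frac{1}{|\beta(-\lambda - i0)|^2}\Big)\int_{|k|=\sqrt{\lambda}}|\hat\rho(k)|^2\,d\sigma.
\end{aligned}
\]

As $\hat\rho(k) \neq 0$ on the sphere $|k| = \sqrt{\lambda} = 2|k_0|$, we find

\[
\operatorname{Im}\,\big\langle [R_A^-(\lambda) - R_A^+(\lambda)]f_1(\psi_q), f_1(\psi_q)\big\rangle_{L^2\oplus\mathbb{C}^3} \neq 0,
\]

and therefore it is impossible to have $\delta(A - 4|k_0|^2)f_1(\psi_q) = 0$, cf. (18). This completes the proof of Theorem 1.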
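The computation above is driven by the identity of Lemma 5(b). As a plausibility check, the following sketch compares both sides numerically for a radial function, at small but positive $\eta$. The Gaussian profile $\hat u(k) = e^{-|k|^2/2}$, the values $\lambda = 1$ and $\eta = 2\cdot 10^{-3}$, the cutoff, the step size, and all names are assumptions made here for illustration; since $\eta > 0$, the two sides agree only up to a correction of order $\eta$.

```python
import math

def gaussian_profile(r):
    """Hypothetical radial Fourier transform u_hat(k) = exp(-|k|^2 / 2)."""
    return math.exp(-0.5 * r * r)

def imag_resolvent_form(lam, eta, profile, r_max=8.0, step=5e-5):
    """Im <R_0(lam + i*eta) u, u> for radial u_hat, in polar coordinates.

    The Fourier multiplier of R_0(lam + i*eta) is 1 / (k^2 - lam - i*eta),
    whose imaginary part is eta / ((k^2 - lam)^2 + eta^2)."""
    n = int(round(r_max / step))
    total = 0.0
    for j in range(1, n + 1):  # the integrand vanishes at r = 0
        r = j * step
        weight = 0.5 if j == n else 1.0  # trapezoid end-point weight
        lorentzian = eta / ((r * r - lam) ** 2 + eta * eta)
        total += weight * r * r * profile(r) ** 2 * lorentzian
    return 4.0 * math.pi * total * step

def sphere_side(lam, profile):
    """(pi / (2 sqrt(lam))) times the integral of |u_hat|^2 over |k| = sqrt(lam)."""
    radius = math.sqrt(lam)
    sphere_integral = 4.0 * math.pi * lam * profile(radius) ** 2
    return math.pi / (2.0 * radius) * sphere_integral

if __name__ == "__main__":
    lhs = imag_resolvent_form(1.0, 2e-3, gaussian_profile)
    rhs = sphere_side(1.0, gaussian_profile)
    print("Im <R_0(lam + i eta) u, u> =", lhs)
    print("sphere side of Lemma 5(b)  =", rhs)
```

The step size is chosen well below the width $\eta/(2\sqrt{\lambda})$ of the Lorentzian peak in the radial variable; a coarser grid would miss the peak and spoil the comparison.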


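5. Appendix: Existence of the Dynamics in $L^2_1$

In this appendix we give some additional remarks concerning existence and estimates for the dynamics of (10) in weighted $L^2$-spaces. We mainly sketch the proofs instead of giving full details, since most of the arguments are, though lengthy, rather straightforward. We write $\Psi = (\psi_1, \psi_2) = (\psi, \dot\psi)$, $\psi_i = (u_i, q_i)$ ($i = 1, 2$), and consider (10) as

\[
\frac{d}{dt}\begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \begin{pmatrix} 0 & \mathrm{id} \\ -\frac{1}{\omega_\varepsilon^2}A & 0 \end{pmatrix}\begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} + \begin{pmatrix} 0 \\ \frac{1}{\omega_\varepsilon^2}f_\varepsilon(\psi_1) \end{pmatrix} =: \mathcal{A}_\varepsilon\Psi + \mathcal{F}_\varepsilon(\Psi) \tag{A.1}
\]

on

\[
\mathcal{D} = D(\mathcal{A}_\varepsilon) := (H^2_{1,1}\oplus\mathbb{C}^3)\oplus(H^1_1\oplus\mathbb{C}^3) \subset (H^1_1\oplus\mathbb{C}^3)\oplus(L^2_1\oplus\mathbb{C}^3) =: \mathcal{X},
\]

with $A$ from (3), $f_\varepsilon(\psi) = \varepsilon^{-1}F(\varepsilon\psi)$, and $F$ from (4). In the following, when considering (??), we always will include the case $\varepsilon = 0$, where we let $\omega_0 := |k_0|$ and $\mathcal{F}_0 := 0$. First we have

Lemma A.1. The operator $\mathcal{A}_\varepsilon$ generates a $C_0$ semigroup $T_\varepsilon(t)$, $t \geq 0$, on $\mathcal{X}$. We have the estimate

\[
\sup_{t\in[0,t^*],\ \varepsilon\in[0,\varepsilon_0]}\|T_\varepsilon(t)\Psi\|_{\mathcal{D}} \leq C(t^*)\,\|\Psi\|_{\mathcal{D}} \quad\text{for}\quad \Psi \in \mathcal{D} \quad\text{and}\quad t^* \geq 0. \tag{A.2}
\]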

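Proof. By classical semigroup theory, cf. [?], it is sufficient to show that for $\Psi^* \in \mathcal{X}$ and $|\lambda| < 1$, the equation $\Psi - \lambda\mathcal{A}_\varepsilon\Psi = \Psi^*$ has a solution $\Psi \in \mathcal{D}$ with $\|\Psi\|_{\mathcal{X}} \leq C(1 - |\lambda|)^{-1}\|\Psi^*\|_{\mathcal{X}}$. Written in components, we obtain from (21)

\[
\hat u_1(k) = \frac{\omega_\varepsilon^2\,\hat u^*(k) + i(k\cdot q_1)\hat\rho(k)}{k^2 + \omega_\varepsilon^2/\lambda^2} \quad\text{with}\quad u^* = \frac{u_1^* + \lambda u_2^*}{\lambda^2} \in L^2_1, \qquad\text{and}\qquad \psi_2 = \frac{\psi_1 - \psi_1^*}{\lambda}.
\]

We first need to show $u_1 \in H^2_{1,1}$, or equivalently $\hat u_1 \in L^2_2\cap H^1$, $k\hat u_1 \in H^1$, and $k^2\hat u_1(k) \in H^1$, but this is straightforward from $\hat u^* \in H^1$, $\hat\rho \in \mathcal{S}$, and the additional power $k^2$ in the denominator. Thus also $\psi_2 \in H^1_1\oplus\mathbb{C}^3$, since $\psi_1^* \in H^1_1\oplus\mathbb{C}^3$, and therefore $\Psi = (\psi_1, \psi_2) \in \mathcal{D}$. Through direct calculations it is then also possible to verify $\|\Psi\|_{\mathcal{X}} \leq C(1 - |\lambda|)^{-1}\|\Psi^*\|_{\mathcal{X}}$, always switching from $H^1$ to $L^2_1$ and vice versa, via Fourier transform.

The estimate (??) uniformly in $\varepsilon \in [0,\varepsilon_0]$ also follows by direct calculation. For this, note that for $\Psi = \big((u_1,q_1),(u_2,q_2)\big) \in \mathcal{X}$ we have

\[
T_\varepsilon(t)\Psi =: \big((u_{1,\varepsilon}(t,\cdot), q_{1,\varepsilon}(t)), (u_{2,\varepsilon}(t,\cdot), q_{2,\varepsilon}(t))\big),
\]

implicitly given through

\[
\begin{aligned}
\hat u_{1,\varepsilon}(t,k) &= \hat u_1(k)\cos(|k|t/\omega_\varepsilon) + \hat u_2(k)\,\frac{\omega_\varepsilon}{|k|}\sin(|k|t/\omega_\varepsilon) + \frac{i\hat\rho(k)}{\omega_\varepsilon|k|}\int_0^t \sin(|k|(t-s)/\omega_\varepsilon)\,k\cdot q_{1,\varepsilon}(s)\,ds,\\
q_{1,\varepsilon}(t) &= q_1\cos\Big(\sqrt{\omega_0^2+\omega_1^2}\,t/\omega_\varepsilon\Big) + q_2\,\frac{\omega_\varepsilon}{\sqrt{\omega_0^2+\omega_1^2}}\sin\Big(\sqrt{\omega_0^2+\omega_1^2}\,t/\omega_\varepsilon\Big)\\
&\quad + \frac{1}{\omega_\varepsilon\sqrt{\omega_0^2+\omega_1^2}}\int_0^t \sin\Big(\sqrt{\omega_0^2+\omega_1^2}\,(t-s)/\omega_\varepsilon\Big)\int dx\,\nabla\rho(x)\,u_{1,\varepsilon}(s,x)\,ds,\\
\hat u_{2,\varepsilon}(t,k) &= \frac{\partial}{\partial t}\hat u_{1,\varepsilon}(t,k) \qquad\text{and}\qquad q_{2,\varepsilon}(t) = \frac{dq_{1,\varepsilon}}{dt}(t),
\end{aligned} \tag{A.3}
\]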

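so that a Gronwall estimate may be derived for $t \mapsto \|u_{1,\varepsilon}(t)\|_{L^2}$, which in turn yields that also $|q_{1,\varepsilon}(t)|$ is bounded on bounded $t$-intervals. If $\Psi \in \mathcal{D}$, then this boundedness and the regularity of $\Psi$ can be used in the implicit representation of $T_\varepsilon(t)\Psi$ to ensure (??).

By (4), since $D^2V(0) = \omega_0^2 E_3$, and because $\rho \in C_c^\infty(\mathbb{R}^3)$, it is also more or less obvious that we have the following

Lemma A.2. The nonlinearity $\mathcal{F}_\varepsilon$ in (??) satisfies

\[
\mathcal{F}_\varepsilon\big((L^2\oplus\mathbb{C}^3)\oplus(L^2\oplus\mathbb{C}^3)\big) \subset \{0\}\oplus(C_c^\infty(\mathbb{R}^3)\oplus\mathbb{C}^3) \subset \mathcal{D},
\]

and $\mathcal{F}_\varepsilon$ is locally Lipschitz continuous, if considered $\mathcal{X} \to \mathcal{X}$, and also, if considered $\mathcal{D} \to \mathcal{D}$, where $\mathcal{D}$ is equipped with the graph norm of $\mathcal{A}_\varepsilon$. In addition,

\[
\begin{aligned}
\|\mathcal{F}_\varepsilon(\Psi)\|_{\mathcal{D}} &\leq C\,\varepsilon^{-1}\|F(\varepsilon\psi_1)\|_{H^1_1\oplus\mathbb{C}^3}\\
&\leq C\,\varepsilon|q_1|^2\Big(1 + \varepsilon|q_1| + \sup_{\tau\in[0,1]}|D^3V(\tau\varepsilon q_1)|\Big) + C\,\varepsilon|q_1|\,\|u_1\|_{L^2}
\end{aligned} \tag{A.4}
\]

for $\Psi = (\psi_1, \psi_2) \in (L^2\oplus\mathbb{C}^3)\oplus(L^2\oplus\mathbb{C}^3)$, with any norm for $D^3V$, the third derivatives of $V$.

According to Lemmas ?? and ?? we therefore find local solutions of (??) for $\varepsilon \in [0,\varepsilon_0]$. To be more precise, the remark following [?, Thm. 6.1.7] implies that for $\Psi_\varepsilon^0 := (\psi_\varepsilon^0, \dot\psi_\varepsilon^0) \in \mathcal{D}$ [with $(\psi_\varepsilon^0, \dot\psi_\varepsilon^0)$ from condition (11)], there exist $t_\varepsilon^{\max} \in\, ]0,\infty]$ and a unique

\[
\Psi_\varepsilon \in C^1([0, t_\varepsilon^{\max}[\,; \mathcal{X}) \cap C([0, t_\varepsilon^{\max}[\,; \mathcal{D})
\]

being a classical solution of (??) with $\Psi_\varepsilon(0) = \Psi_\varepsilon^0$. Writing $\Psi_\varepsilon = (\psi_{1,\varepsilon}, \psi_{2,\varepsilon}) = (\psi_\varepsilon, \dot\psi_\varepsilon)$ and $\psi_{i,\varepsilon} = (u_{i,\varepsilon}, q_{i,\varepsilon})$, $i = 1, 2$, we therefore obtain in particular that $\psi_\varepsilon \in C^2([0, t_\varepsilon^{\max}[\,; L^2_1\oplus\mathbb{C}^3)$. In fact, $t_\varepsilon^{\max} = \infty$, by the following

Lemma A.3. We have

\[
\sup_{t\in[0,t^*],\ \varepsilon\in]0,\varepsilon_0]}\|\Psi_\varepsilon(t)\|_{\mathcal{D}} \leq C(t^*) \tag{A.5}
\]

for every $t^* \geq 0$, and hence in particular $t_\varepsilon^{\max} = \infty$ for $\varepsilon \in\, ]0,\varepsilon_0]$. In addition,

\[
\sup_{t\in[0,t^*],\ \varepsilon\in]0,\varepsilon_0]}\Big\|\frac{d^2\psi_\varepsilon}{dt^2}(t)\Big\|_{L^2_1\oplus\mathbb{C}^3} \leq C(t^*). \tag{A.6}
\]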


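This can be proved by using the variation of constants formula for $\Psi_\varepsilon$ and by deriving a bound for $\varepsilon|q_{1,\varepsilon}(t)|$, uniformly in $\varepsilon \in\, ]0,\varepsilon_0]$ and $t$ in bounded intervals. The latter bound is obtained from the second-order Hamiltonian system for $\psi_{1,\varepsilon} = (u_{1,\varepsilon}, q_{1,\varepsilon})$, cf. the arguments in [?, Sect. 2]. Here it is needed that $V(q) \to \infty$ as $|q| \to \infty$. Identifying now $\psi_\varepsilon(t,x) := (\psi_\varepsilon(t))(x)$, we thus obtain

Corollary A.1. For $\varepsilon \in\, ]0,\varepsilon_0]$,

\[
\psi_\varepsilon,\quad I_\varepsilon(\psi_\varepsilon) = \Big[(\omega_\varepsilon^2 - |k_0|^2)\frac{\partial^2}{\partial t^2}\psi_\varepsilon + \varepsilon^{-1}F(\varepsilon\psi_\varepsilon)\Big] \in \mathcal{H}^2_1 = L^2_1(S^1\times\mathbb{R}^3)\oplus L^2(S^1;\mathbb{C}^3).
\]

Additionally, we find from (??) and condition (11)(b)

Corollary A.2. As $\varepsilon \to 0^+$,

\[
\frac{(\omega_\varepsilon^2 - |k_0|^2)}{\varepsilon}\frac{\partial^2}{\partial t^2}\psi_\varepsilon \to 0 \quad\text{in}\quad \mathcal{H}^2_1 = L^2_1(S^1\times\mathbb{R}^3)\oplus L^2(S^1;\mathbb{C}^3).
\]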
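The variation of constants representation used for these bounds, cf. the formula for $q_{1,\varepsilon}$ in (A.3), can be illustrated on a scalar model. The sketch below treats $\omega^2\ddot q + \Omega^2 q = g(t)$, where $\Omega$ plays the role of $\sqrt{\omega_0^2+\omega_1^2}$ and $g$ stands in for the field term; the forcing $g(t) = \cos t$, the parameter values, and the function names are assumptions made here. It evaluates the Duhamel formula by quadrature and compares it with the closed-form solution for this forcing.

```python
import math

def duhamel(t, q1, q2, omega, Omega, forcing, steps=2000):
    """Solution at time t of omega^2 q'' + Omega^2 q = forcing, q(0) = q1, q'(0) = q2,
    written in the variation of constants form of (A.3)."""
    nu = Omega / omega  # oscillation frequency of the homogeneous equation
    homogeneous = q1 * math.cos(nu * t) + q2 * (omega / Omega) * math.sin(nu * t)
    h = t / steps
    # composite trapezoid rule for the convolution integral over [0, t]
    values = [math.sin(nu * (t - j * h)) * forcing(j * h) for j in range(steps + 1)]
    integral = h * (sum(values) - 0.5 * (values[0] + values[-1]))
    return homogeneous + integral / (omega * Omega)

def exact_cosine_forcing(t, q1, q2, omega, Omega):
    """Closed-form solution for forcing(t) = cos(t), valid when Omega != omega."""
    nu = Omega / omega
    amplitude = 1.0 / (Omega ** 2 - omega ** 2)  # particular solution amplitude * cos(t)
    return ((q1 - amplitude) * math.cos(nu * t)
            + (q2 / nu) * math.sin(nu * t)
            + amplitude * math.cos(t))

if __name__ == "__main__":
    params = (0.5, -0.2, 1.3, 2.0)  # q1, q2, omega, Omega (illustrative values)
    for t in (0.7, 3.0):
        print(t, duhamel(t, *params, math.cos), exact_cosine_forcing(t, *params))
```

In the paper the forcing itself depends on the unknown $u_{1,\varepsilon}$, which is why the representation is only implicit and a Gronwall argument is needed; the scalar model shows just the structure of the formula.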

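Having obtained the necessary bounds, next we investigate the convergence of $\Psi_\varepsilon$ as $\varepsilon \to 0^+$. For $\varepsilon = 0$ and $\psi_q \in E$ we have $\Psi_{0,q}(t) = (\sin t, \cos t)\psi_q$ as a global solution with initial value $\Psi_{0,q}(0) = \Psi_{0,q}^0 := (0, \psi_q)$. Thus condition (11)(a) means

\[
\|\Psi_\varepsilon^0 - \Psi_{0,q}^0\|_{(L^2\oplus\mathbb{C}^3)\oplus(L^2\oplus\mathbb{C}^3)} \to 0 \quad\text{as}\quad \varepsilon \to 0^+. \tag{A.7}
\]

Let

\[
\mathcal{F}_1(\Psi) = \frac{1}{|k_0|^2}\begin{pmatrix} 0 \\ f_1(\psi_1) \end{pmatrix} \in \{0\}\oplus(C_c^\infty(\mathbb{R}^3)\oplus\mathbb{C}^3) \subset \mathcal{D}, \qquad \Psi = (\psi_1, \psi_2) \in (L^2\oplus\mathbb{C}^3)\oplus(L^2\oplus\mathbb{C}^3), \tag{A.8}
\]

with $f_1$ from (??) above. By Taylor expansion up to order three the following result can be shown, where $\Psi_{0,q}(t) = (\sin t, \cos t)\psi_q = T_0(t)(0, \psi_q)$, with the semigroup $T_0(t)$ generated by $\mathcal{A}_0$.

Lemma A.4. For $t^* \geq 0$,

\[
\sup_{t\in[0,t^*]}\|\varepsilon^{-1}\mathcal{F}_\varepsilon(\Psi_\varepsilon(t)) - \mathcal{F}_1(\Psi_{0,q}(t))\|_{\mathcal{D}} \to 0 \quad\text{as}\quad \varepsilon \to 0^+.
\]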
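The Taylor expansion behind this lemma can be seen in a one-dimensional toy version of the first component of $F$ from (4). The sketch below assumes a Gaussian $\rho(x) = e^{-x^2}$ (not compactly supported, so only an illustration) and checks that $\varepsilon^{-2}\big[\rho(x) - \rho(x - \varepsilon q) - \varepsilon\rho'(x)q\big]$ approaches $-\tfrac12\rho''(x)q^2$, the one-dimensional analogue of the first component of $f_1$ in (27), with an error of order $\varepsilon$. The function names are chosen here.

```python
import math

def rho(x):
    """Hypothetical one-dimensional charge profile (Gaussian)."""
    return math.exp(-x * x)

def d_rho(x):
    """First derivative of rho."""
    return -2.0 * x * math.exp(-x * x)

def d2_rho(x):
    """Second derivative of rho."""
    return (4.0 * x * x - 2.0) * math.exp(-x * x)

def first_component_F(x, q):
    """One-dimensional analogue of rho(x) - rho(x - q) - grad rho(x) . q from (4)."""
    return rho(x) - rho(x - q) - d_rho(x) * q

def scaled_nonlinearity(x, q, eps):
    """eps^{-2} F(eps * psi), first component, at the point x."""
    return first_component_F(x, eps * q) / eps ** 2

def limit_f1(x, q):
    """One-dimensional analogue of -(1/2) D^2 rho(x)(q, q) from (27)."""
    return -0.5 * d2_rho(x) * q * q

if __name__ == "__main__":
    x, q = 0.3, 1.0
    for eps in (1e-1, 1e-2, 1e-3):
        error = abs(scaled_nonlinearity(x, q, eps) - limit_f1(x, q))
        print(eps, error)
```

The printed errors shrink roughly in proportion to $\varepsilon$, reflecting the third-order Taylor remainder.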

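Recalling that $\Psi_\varepsilon(t) = (\psi_\varepsilon(t), \dot\psi_\varepsilon(t))$, by definition of $\mathcal{F}_\varepsilon$ and $\mathcal{F}_1$, cf. (??) resp. (??), this particularly implies

Corollary A.3. As $\varepsilon \to 0^+$,

\[
\varepsilon^{-2}F(\varepsilon\psi_\varepsilon) \to f_1(\psi_q) \quad\text{in}\quad \mathcal{H}^2_1 = L^2_1(S^1\times\mathbb{R}^3)\oplus L^2(S^1;\mathbb{C}^3),
\]

where $f_1(\psi_q)(t,x) := f_1(\sin t\,\psi_q)(x)$, with the first component of $f_1(\sin t\,\psi_q)$ being applied to $x$, and similarly $F(\varepsilon\psi_\varepsilon)(t,x) := F(\varepsilon\psi_\varepsilon(t))(x)$.

Acknowledgement. I’m grateful to C.E. Wayne for pointing out the paper [?] to me.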


References

1. Agmon, S.: Spectral properties of Schrödinger operators and scattering theory. Ann. Scuola Norm. Sup. Pisa 2, 151–218 (1975)
2. Bambusi, D.: A Nekhoroshev-type theorem for the Pauli-Fierz model of classical electrodynamics. Ann. Inst. H. Poincaré Phys. Théor. 60, 339–371 (1994)
3. Bambusi, D., Galgani, L.: Some rigorous results on the Pauli-Fierz model of classical electrodynamics. Ann. Inst. H. Poincaré Phys. Théor. 58, 155–171 (1993)
4. Barut, A.O.: Electrodynamics and Classical Theory of Fields and Particles. New York: Dover Publications, 1980
5. Dirac, P.A.M.: Classical theory of radiating electrons. Proc. Roy. Soc. London Ser. A 167, 148–169 (1938)
6. Gittel, H.P., Kijowski, J., Zeidler, E.: The relativistic dynamics of the combined particle-field system in nonlinear renormalized electrodynamics. Preprint 1997
7. Hislop, P.D., Sigal, I.M.: Introduction to Spectral Theory. With Applications to Schrödinger Operators. Berlin–Heidelberg–New York: Springer, 1996
8. Komech, A., Spohn, H.: Soliton-like asymptotics for a classical particle interacting with a scalar wave field. To appear in Nonlinear Anal.
9. Komech, A., Spohn, H.: Long-time asymptotics for the coupled Maxwell-Lorentz equations. Preprint 1997
10. Komech, A., Spohn, H., Kunze, M.: Long-time asymptotics for a classical particle interacting with a scalar wave field. Comm. Partial Differential Equations 22, 307–335 (1997)
11. Lorentz, H.A.: Theory of Electrons. 2nd edition 1915. Reprint: New York: Dover, 1952
12. Pazy, A.: Semigroups of Linear Operators and Applications to Partial Differential Equations, 2nd ed. Berlin–Heidelberg–New York: Springer, 1983
13. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, IV. Analysis of Operators. New York: Academic Press, 1978
14. Rohrlich, F.: Classical Charged Particles. Foundations of Their Theory. Reading: Addison-Wesley, 1965
15. Sigal, I.M.: Non-linear wave and Schrödinger equations I. Instability of periodic and quasiperiodic solutions. Commun. Math. Phys. 153, 297–320 (1993)
16. Simon, B.: Resonances in n-body quantum systems with dilatation analytic potentials and the foundations of time-dependent perturbation theory. Ann. of Math. 93, 247–274 (1973)
17. Weidmann, J.: Linear Operators in Hilbert Spaces. Berlin–Heidelberg–New York: Springer, 1980
18. Wheeler, J.A., Feynman, R.P.: Interaction with the absorber as the mechanism of radiation. Rev. Mod. Phys. 17, 157–181 (1945)
19. Yaghjian, A.D.: Relativistic Dynamics of a Charged Sphere. Berlin–Heidelberg–New York: Springer, 1992

Communicated by A. Kupiainen

Commun. Math. Phys. 195, 525 – 547 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Structure and Representations of the Quantum General Linear Supergroup

R. B. Zhang

Department of Pure Mathematics, University of Adelaide, Adelaide, Australia

Received: 27 November 1996 / Accepted: 10 December 1997

Abstract: The structure and representations of the quantum general linear supergroup $GL_q(m|n)$ are studied systematically by investigating the Hopf superalgebra $G_q$ of its representative functions. $G_q$ is factorized into $G_q^{\pi} G_q^{\bar\pi}$, and a Peter–Weyl basis is constructed for each factor. Parabolic induction for the quantum supergroup is developed. The underlying geometry of induced representations is discussed, and an analog of Frobenius reciprocity is obtained. A quantum Borel–Weil theorem is proven for the irreducible covariant and contravariant tensorial representations, and explicit realizations are given for classes of irreducible tensorial representations in terms of sections of quantum super vector bundles over quantum projective superspaces.

1. Introduction

Quantized universal enveloping superalgebras [1, 2] (which will be called quantum superalgebras for simplicity) represent the most important generalizations of the Drinfeld–Jimbo [3] quantized universal enveloping algebras. Their origin can be traced back to the Perk–Schultz solution of the Yang–Baxter equation and also the work of Bazhanov and Shadrikov [4]. However, systematic investigations of such algebraic structures only started about six years ago, though in an intensive manner. By now the subject has been developed quite extensively: the quasi-triangular Hopf superalgebraic structure of the quantum superalgebras was investigated [5]; the representation theory of large classes of quantum (affine) superalgebras and super Yangians was developed [6, 7]; applications of quantum superalgebras to two dimensional integrable models in statistical mechanics and quantum field theory were extensively explored [1, 8]. Quantum superalgebras have also been applied to the study of knot theory and 3-manifolds [9, 10], yielding many new topological invariants, notably, the multi-parameter generalizations of Alexander–Conway polynomials.

Closely related to the Drinfeld–Jimbo algebras are the quantum groups introduced by Woronowicz and Faddeev–Reshetikhin–Takhtajan [11], which are, in the spirit of Tannaka–Krein duality theory, the "groups" associated with the quantized universal enveloping algebras. One very important aspect of quantum groups is their geometrical significance: they provide a concrete framework for developing noncommutative geometry [12], in particular, for investigating notions such as quantum flag varieties [13] and quantum fibre bundles.

Our aim here is to study the structure and representations of the quantum general linear supergroup $GL_q(m|n)$ in a systematic fashion by investigating the algebra of its representative functions. We start in Sect. 2 with a concise treatment of finite dimensional unitary representations of $U_q(gl(m|n))$. The results will be used repeatedly in the remainder of the paper. In Sect. 3 we define the quantum general linear supergroup $GL_q(m|n)$, or more exactly, the superalgebra $G_q$ of functions on it. This is done by first defining the bi-superalgebras $G_q^{\pi}$ and $G_q^{\bar\pi}$, which are respectively generated by the matrix elements of the vector representation and its dual irreducible representation. Peter–Weyl type bases for these bi-superalgebras are constructed. $G_q$ is defined to be generated by $G_q^{\pi}$ and $G_q^{\bar\pi}$ with some extra relations. It has the structure of a ∗-Hopf superalgebra, which separates points of $U_q(gl(m|n))$, and factorizes into $G_q^{\pi} G_q^{\bar\pi}$. Section 4 treats the representation theory of the quantum supergroup, and in particular, parabolic induction. The geometrical interpretation of induced representations is discussed, leading naturally to the concepts of quantum homogeneous spaces and quantum super vector bundles. A quantum analog of Frobenius reciprocity is obtained; and a quantum version of the Borel–Weil theorem is proven for the irreducible covariant and contravariant tensorial representations. Section 5 gives the explicit realizations of two infinite classes of irreducible tensorial representations in terms of sections of quantum super vector bundles over the quantum projective superspace.
In doing this, we also treat the quantum projective superspace in some detail.

2. Unitary Representations of $U_q(gl(m|n))$

The finite dimensional unitary representations of $U_q(gl(m|n))$ were classified in [15]. Here we will reformulate the results on the covariant and contravariant tensor representations so that they can be readily used in the remainder of the paper. The material presented here also relies heavily on references [6] and [14].

2.1. Hopf ∗-superalgebras and unitary representations. Let $A$ be a $\mathbb{Z}_2$-graded associative algebra over the complex field $\mathbb{C}$. Its underlying $\mathbb{Z}_2$-graded vector space is the direct sum $A = A_0 \oplus A_1$ of the even subspace $A_0$ and the odd subspace $A_1$. We introduce the grading index $[\ ] : A_0 \cup A_1 \to \mathbb{Z}_2$ such that $[a] = \theta$ if $a \in A_\theta$. We will call $A$ a $\mathbb{Z}_2$-graded ∗-algebra, or ∗-superalgebra, if there exists an even anti-linear anti-automorphism $* : A \to A$ such that $* \circ * = \mathrm{id}_A$. We will denote $*(a)$ by $a^*$. Needless to say, $*(ab) = b^* a^*$, $a, b \in A$. An important new feature of the $\mathbb{Z}_2$-graded case is that for a given ∗-operation of $A$, there exists an associated $*'$ such that

$$ *'(a) = (-1)^{[a]} a^*, \tag{1} $$

for $a$ being homogeneous, and extends to the whole of $A$ anti-linearly. There also exist the so-called graded ∗-operations, which, however, are not useful for this paper, and thus will not be discussed any further. Let $A$ and $B$ be two $\mathbb{Z}_2$-graded ∗-algebras. Then $A \otimes_{\mathbb{C}} B$ has a natural $\mathbb{Z}_2$-graded ∗-algebra structure, with the ∗-operation defined for homogeneous elements by

$$ *(a \otimes b) = (-1)^{[a][b]} a^* \otimes b^*, $$

and for all the elements by extending this anti-linearly.

Consider a $\mathbb{Z}_2$-graded Hopf algebra (also called Hopf superalgebra) $H$, with multiplication $m$, unit $1_H$, co-multiplication $\Delta$, co-unit $\epsilon$ and antipode $S$. We emphasize that the antipode is a linear anti-automorphism of the underlying algebra of $H$. In particular, for homogeneous $a, b \in H$, we have $S(ab) = (-1)^{[a][b]} S(b) S(a)$. $H$ will be called a $\mathbb{Z}_2$-graded Hopf ∗-algebra, or Hopf ∗-superalgebra, if the underlying algebra of $H$ is a ∗-superalgebra such that $\Delta$ and $\epsilon$ are ∗-homomorphisms, i.e., $* \circ \Delta = \Delta \circ *$, $* \circ \epsilon = \epsilon \circ *$. These properties together with the defining relations of the antipode

$$ m \circ (S \otimes \mathrm{id})\Delta = m \circ (\mathrm{id} \otimes S)\Delta = 1_H\, \epsilon $$

imply that $S \circ * \circ S \circ * = \mathrm{id}_H$.

Let $V$ be a left $H$-module. If there exists a non-degenerate sesquilinear form $( \ , \ ) : V \times V \to \mathbb{C}$ such that (i) $(av, u) = (v, a^* u)$, $\forall u, v \in V$, $a \in H$; (ii) $(v, v) \ge 0$, and $(v, v) = 0$ iff $v = 0$, we call $V$ and the associated representation of $H$ unitary. Unitary representations have the following important properties: i) A unitary representation is completely reducible; ii) The tensor product of two unitary (with respect to the same ∗-operation) representations is again unitary; iii) If a representation is unitary with respect to $*$, then its dual is unitary with respect to $*'$.

All three assertions are well known, but there are some related matters worth discussing. One is concerned with the requirement that two representations must be unitary with respect to the same ∗-operation in order for their tensor product to be unitary as well. The tensor product $V \otimes_{\mathbb{C}} W$ of two $H$-modules has a natural $H$-module structure

$$ a\{v \otimes w\} = \Delta(a)\{v \otimes w\} = \sum_{(a)} (-1)^{[a_{(2)}][v]} a_{(1)} v \otimes a_{(2)} w. $$

If both $V$ and $W$ are equipped with sesquilinear forms $( \ , \ ) : V \times V \to \mathbb{C}$ and $( \ , \ ) : W \times W \to \mathbb{C}$, we can define a sesquilinear form $(( \ , \ )) : (V \otimes_{\mathbb{C}} W)^{\times 2} \to \mathbb{C}$ by $((v_1 \otimes w_1, v_2 \otimes w_2)) = (v_1, v_2)(w_1, w_2)$. Now if both $V$ and $W$ are unitary with respect to the same ∗-operation, then $(( \ , \ ))$ is clearly positive definite and nondegenerate. Furthermore,

$$ ((v_1 \otimes w_1, a\{v_2 \otimes w_2\})) = \sum_{(a)} (-1)^{[a_{(2)}][v_2]} ((a^*_{(1)} v_1 \otimes a^*_{(2)} w_1, v_2 \otimes w_2)) = ((a^*\{v_1 \otimes w_1\}, v_2 \otimes w_2)). $$

Therefore, $V \otimes_{\mathbb{C}} W$ indeed furnishes a unitary $H$-module. On the other hand, if, say, $V$ is $*$-unitary, while $W$ is $*'$-unitary, then one can easily see that the above calculations will fail to go through.

The other concerns the third assertion, the validity of which actually requires some qualification, namely, the Hopf ∗-superalgebra $H$ in question must admit an even group-like element $K_{2\rho}$ satisfying

$$ K_{2\rho}^* = K_{2\rho}, \qquad S^2(a) = K_{2\rho}\, a\, K_{2\rho}^{-1}, \quad \forall a \in H. \tag{2} $$

Let $V$ be a locally finite module over $H$, which is unitary with respect to the sesquilinear form $( \ , \ ) : V \times V \to \mathbb{C}$. For every $v \in V$, we define $v^\dagger$ by $v^\dagger(w) = (v, w)$, $\forall w \in V$, and denote the linear span of all such $v^\dagger$ by $V^\dagger$, which is a subspace of the dual vector space of $V$. Then $V^\dagger$ has a natural $H$-module structure, with the action of $H$ given by

$$ (a v^\dagger)(w) = (-1)^{[a][v^\dagger]} v^\dagger(S(a) w), \quad w \in V. $$

Unitarity of $V$ leads to $a v^\dagger = (-1)^{[a][v]} (*S(a) v)^\dagger$. We define a sesquilinear form $( \ , \ )' : V^\dagger \times V^\dagger \to \mathbb{C}$ by $(v^\dagger, w^\dagger)' = (K_{2\rho} w, v)$. It follows from the properties of the original form on $V$ that $( \ , \ )'$ is positive definite and nondegenerate. A straightforward calculation shows that $(a v^\dagger, w^\dagger)' = (v^\dagger, *'(a) w^\dagger)'$, where $*'$ is defined by (1).

2.2. $U_q(gl(m|n))$. Throughout the paper, we will denote by $\mathfrak{g}$ the complex Lie superalgebra $gl(m|n)$, and by $U(\mathfrak{g})$ its universal enveloping algebra. As is well known, there are the Drinfeld and Jimbo versions of the quantized universal enveloping algebra $U_q(\mathfrak{g})$ of $\mathfrak{g}$, which, though, have very similar properties at generic $q$. It is the Jimbo version of $U_q(\mathfrak{g})$ that will be used in this paper. Now $U_q(\mathfrak{g})$ is a $\mathbb{Z}_2$-graded unital associative algebra over $\mathbb{C}(q, q^{-1})$, $q$ being an indeterminate, generated by $\{K_a, K_a^{-1}, a \in \mathbf{I};\ E_{b\,b+1}, E_{b+1\,b}, b \in \mathbf{I}'\}$, $\mathbf{I} = \{1, 2, ..., m+n\}$, $\mathbf{I}' = \{1, 2, ..., m+n-1\}$, subject to the following relations:

$$
\begin{aligned}
& K_a K_a^{-1} = 1, \qquad K_a^{\pm 1} K_b^{\pm 1} = K_b^{\pm 1} K_a^{\pm 1}, \\
& K_a E_{b\,b\pm1} K_a^{-1} = q_a^{\delta_{ab} - \delta_{a\,b\pm1}} E_{b\,b\pm1}, \\
& [E_{a\,a+1}, E_{b+1\,b}\} = \delta_{ab}\, \frac{K_a K_{a+1}^{-1} - K_a^{-1} K_{a+1}}{q_a - q_a^{-1}}, \\
& (E_{m\,m+1})^2 = (E_{m+1\,m})^2 = 0, \\
& E_{a\,a+1} E_{b\,b+1} = E_{b\,b+1} E_{a\,a+1}, \qquad E_{a+1\,a} E_{b+1\,b} = E_{b+1\,b} E_{a+1\,a}, \qquad |a - b| \ge 2, \\
& \mathcal{S}^{(+)}_{a\,a\pm1} = \mathcal{S}^{(-)}_{a\,a\pm1} = 0, \qquad a \ne m, \\
& \{E_{m-1\,m+2}, E_{m\,m+1}\} = \{E_{m+2\,m-1}, E_{m+1\,m}\} = 0,
\end{aligned}
\tag{3}
$$

where $q_a = q^{(-1)^{[a]}}$,

$$
\begin{aligned}
\mathcal{S}^{(+)}_{a\,a\pm1} &= (E_{a\,a+1})^2 E_{a\pm1\,a+1\pm1} - (q + q^{-1}) E_{a\,a+1} E_{a\pm1\,a+1\pm1} E_{a\,a+1} + E_{a\pm1\,a+1\pm1} (E_{a\,a+1})^2, \\
\mathcal{S}^{(-)}_{a\,a\pm1} &= (E_{a+1\,a})^2 E_{a+1\pm1\,a\pm1} - (q + q^{-1}) E_{a+1\,a} E_{a+1\pm1\,a\pm1} E_{a+1\,a} + E_{a+1\pm1\,a\pm1} (E_{a+1\,a})^2,
\end{aligned}
$$

and $E_{m-1\,m+2}$ and $E_{m+2\,m-1}$ are the $a = m-1$, $b = m+2$ cases of the following elements [16, 6]:

$$ E_{a\,b} = E_{a\,c} E_{c\,b} - q_c^{-1} E_{c\,b} E_{a\,c}, \qquad E_{b\,a} = E_{b\,c} E_{c\,a} - q_c E_{c\,a} E_{b\,c}, \qquad a < c < b. $$

The $\mathbb{Z}_2$ grading of the algebra is specified such that the elements $K_a^{\pm 1}$, $\forall a \in \mathbf{I}$, and $E_{b\,b+1}$, $E_{b+1\,b}$, $b \ne m$, are even, while $E_{m\,m+1}$ and $E_{m+1\,m}$ are odd. Above, we have also used the notation

$$ [a] = \begin{cases} 0, & \text{if } a \le m, \\ 1, & \text{if } a > m. \end{cases} $$

On the other hand, the Drinfeld version of $U_q(\mathfrak{g})$ is defined over $\mathbb{C}[[\hbar]]$, $q = \exp(\hbar)$, and is completed with respect to the $\hbar$-adic topology of $\mathbb{C}[[\hbar]]$. It is generated by $\{E_{a\,a}, a \in \mathbf{I};\ E_{b\,b+1}, E_{b+1\,b}, b \in \mathbf{I}'\}$, subject to the same relations (3) with $K_a = q_a^{E_{a\,a}}$.

It is well known that $U_q(\mathfrak{g})$ has the structure of a $\mathbb{Z}_2$-graded Hopf algebra, with a co-multiplication

$$
\begin{aligned}
\Delta(E_{a\,a+1}) &= E_{a\,a+1} \otimes K_a K_{a+1}^{-1} + 1 \otimes E_{a\,a+1}, \\
\Delta(E_{a+1\,a}) &= E_{a+1\,a} \otimes 1 + K_a^{-1} K_{a+1} \otimes E_{a+1\,a}, \\
\Delta(K_a^{\pm 1}) &= K_a^{\pm 1} \otimes K_a^{\pm 1},
\end{aligned}
$$

co-unit

$$ \epsilon(E_{a\,a+1}) = \epsilon(E_{a+1\,a}) = 0, \ \forall a \in \mathbf{I}', \qquad \epsilon(K_b^{\pm 1}) = 1, \ \forall b \in \mathbf{I}, $$

and antipode

$$
\begin{aligned}
S(E_{a\,a+1}) &= -E_{a\,a+1} K_a^{-1} K_{a+1}, \\
S(E_{a+1\,a}) &= -K_a K_{a+1}^{-1} E_{a+1\,a}, \\
S(K_a^{\pm 1}) &= K_a^{\mp 1}.
\end{aligned}
$$

At generic $q$, the Jimbo version of $U_q(\mathfrak{g})$ has more or less the same representation theory as that of the Drinfeld version [6].

Let $\{\epsilon_a \mid a \in \mathbf{I}\}$ be the basis of a vector space with a bilinear form $(\epsilon_a, \epsilon_b) = (-1)^{[a]} \delta_{ab}$. The roots of the classical Lie superalgebra $gl(m|n)$ can be expressed as $\epsilon_a - \epsilon_b$, $a \ne b$, $a, b \in \mathbf{I}$. For later use, we define

$$ 2\rho = \sum_{a \le b} (-1)^{[a]+[b]} (\epsilon_a - \epsilon_b). $$
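As a quick consistency check on the Hopf structure quoted above (a sketch only; the shorthand $k_a := K_a K_{a+1}^{-1}$ is introduced here for brevity and is not notation from the text), the antipode axiom $m \circ (S \otimes \mathrm{id})\Delta = 1\,\epsilon$ can be verified directly on the generators. The last line shows why the antipode of $K_a^{\pm 1}$ has to be the single element $K_a^{\mp 1}$ of $U_q(\mathfrak{g})$.

```latex
\begin{aligned}
m \circ (S \otimes \mathrm{id})\Delta(E_{a\,a+1})
  &= S(E_{a\,a+1})\, k_a + S(1)\, E_{a\,a+1}
   = -E_{a\,a+1}\, k_a^{-1} k_a + E_{a\,a+1} = 0 = \epsilon(E_{a\,a+1}), \\
m \circ (S \otimes \mathrm{id})\Delta(E_{a+1\,a})
  &= S(E_{a+1\,a}) + S(k_a^{-1})\, E_{a+1\,a}
   = -k_a E_{a+1\,a} + k_a E_{a+1\,a} = 0 = \epsilon(E_{a+1\,a}), \\
m \circ (S \otimes \mathrm{id})\Delta(K_a^{\pm 1})
  &= S(K_a^{\pm 1})\, K_a^{\pm 1} = K_a^{\mp 1} K_a^{\pm 1} = 1 = \epsilon(K_a^{\pm 1}).
\end{aligned}
```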

From [6] we know that every finite dimensional irreducible $U_q(\mathfrak{g})$ module is of highest weight type and is essentially uniquely characterized by a highest weight. Let $W(\lambda)$ be an irreducible $U_q(\mathfrak{g})$ module with highest weight $\lambda = \sum_a \lambda_a \epsilon_a$, $\lambda_a \in \mathbb{C}$. There exists a unique (up to scalar multiples) vector $v_+^\lambda \ne 0$ in $W(\lambda)$, called the highest weight vector, such that

$$ E_{a\,a+1} v_+^\lambda = 0, \ a \in \mathbf{I}', \qquad K_b v_+^\lambda = q_b^{\lambda_b} v_+^\lambda, \ b \in \mathbf{I}. $$

$W(\lambda)$ is finite dimensional if and only if $\lambda$ satisfies $\lambda_a - \lambda_{a+1} \in \mathbb{Z}_+$, $a \ne m$, and in that case, it has the same weight space decomposition as that of the corresponding irreducible $gl(m|n)$ module with the same highest weight.

2.3. Unitarity of covariant and contravariant tensor representations. From this section on, we will assume that $U_q(\mathfrak{g})$ is obtained from the Jimbo algebra by specializing $q$ to a real positive parameter different from 1. To construct a ∗-operation for $U_q(\mathfrak{g})$, we first consider the Hopf subalgebra generated by $e = E_{a\,a+1}$, $f = E_{a+1\,a}$, and $k = K_a K_{a+1}^{-1}$, for a fixed $a \ne m$. It is not difficult to show that $*(e) = f k$, $*(f) = k^{-1} e$, $*(k^{\pm 1}) = k^{\pm 1}$ defines a ∗-operation for this $U_q(sl(2))$ subalgebra. Possible generalizations of this to $U_q(\mathfrak{g})$ are

$$
\begin{aligned}
*(E_{a\,a+1}) &= (-1)^{(\theta+1)\delta_{ma}} E_{a+1\,a} K_a K_{a+1}^{-1}, \\
*(E_{a+1\,a}) &= (-1)^{(\theta+1)\delta_{ma}} K_a^{-1} K_{a+1} E_{a\,a+1}, \\
*(K_a^{\pm 1}) &= K_a^{\pm 1},
\end{aligned}
\tag{4}
$$

where $\theta = 1$ or 2. It is quite obvious that the "quadratic" relations of (3) are preserved by the ∗-operations, and we have also explicitly checked that the "Serre relations" are preserved as well. We will call the ∗-operations type 1 and type 2 respectively when $\theta = 1$ and 2. It is also well known that

$$ K_{2\rho} = \prod_{a < b} \left( K_a K_b^{-1} \right)^{(-1)^{[a]+[b]}} $$

satisfies Eq. (2).

Now we consider the irreducible covariant and contravariant tensor representations of $U_q(\mathfrak{g})$. The vector representation $\pi$ of $U_q(\mathfrak{g})$ is of highest weight $\epsilon_1$. The corresponding module $E$ has the standard basis $\{v_a \mid a \in \mathbf{I}\}$, such that

$$ K_a v_b = q_a^{\delta_{ab}} v_b, \qquad E_{a\,a\pm1} v_b = \delta_{b\,a\pm1} v_a. $$

Define a sesquilinear form on $E \times E$ by

$$ (v_a, v_b) = \delta_{ab} \prod_{c=1}^{a-1} q_c^{-1}. $$


Then it is straightforward to show that with respect to the type 1 ∗-operation, we have

$$ (E_{a\,a\pm1} v_b, v_c) = (v_b, E_{a\,a\pm1}^* v_c), \qquad (K_a v_b, v_c) = (v_b, K_a v_c). $$

Therefore, the vector representation is unitary of type 1. The $U_q(\mathfrak{g})$ modules $E^{\otimes k}$, $k \in \mathbb{Z}_+$ ($E^{0} = \mathbb{C}$), obtained by repeated tensor products of the vector module with itself can be decomposed into direct sums of irreducible type 1 unitary modules, and we will call each direct summand an irreducible contravariant tensor module, and the corresponding irreducible representation an irreducible contravariant tensor representation.

The irreducible contravariant tensor representations can be characterized in the following way. Let $\mathbb{Z}_+$ be the set of nonnegative integers. Define a subset $P$ of $\mathbb{Z}_+^{\otimes(m+n)}$ by

$$ P = \{ p = (p_1, p_2, ..., p_{m+n}) \in \mathbb{Z}_+^{\otimes(m+n)} \mid p_{m+1} \le n, \ p_a \ge p_{a+1}, \ a \in \mathbf{I}' \}. $$

We associate with each $p \in P$ a $\lambda^{(p)} = \sum_{a=1}^{m+n} \lambda_a \epsilon_a$ defined by

$$ \lambda_a = p_a, \quad a \le m, \qquad \sum_{\mu=1}^{n} \lambda_{m+\mu}\, \epsilon_{m+\mu} = \sum_{\nu=1}^{n} \sum_{\mu=1}^{p_{m+\nu}} \epsilon_{m+\mu}. $$

Introduce the set

$$ \Lambda^{(1)} = \{ \lambda^{(p)} \mid p \in P \}. \tag{5} $$
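The map $p \mapsto \lambda^{(p)}$ can be made concrete with a short sketch. It assumes the reading of the defining formula in which $\lambda_a = p_a$ for $a \le m$ and $\lambda_{m+\mu}$ counts the indices $\nu$ with $p_{m+\nu} \ge \mu$ (that is, the odd part of $\lambda^{(p)}$ is the transpose of the partition $(p_{m+1}, ..., p_{m+n})$). The function name `highest_weight_from_p` is hypothetical and not taken from the paper.

```python
def highest_weight_from_p(m, n, p):
    """Return [lambda_1, ..., lambda_{m+n}] for p in P (illustrative helper).

    Assumed reading: lambda_a = p_a for a <= m, and lambda_{m+mu} is the
    number of nu in 1..n with p_{m+nu} >= mu.
    """
    if len(p) != m + n:
        raise ValueError("p must have m + n entries")
    if any(x < 0 for x in p) or any(p[i] < p[i + 1] for i in range(m + n - 1)):
        raise ValueError("p must be non-increasing with non-negative entries")
    if n > 0 and p[m] > n:
        raise ValueError("p_{m+1} must not exceed n")
    even_part = list(p[:m])          # lambda_a = p_a for a <= m
    odd_rows = p[m:]                 # (p_{m+1}, ..., p_{m+n})
    odd_part = [sum(1 for row in odd_rows if row >= mu) for mu in range(1, n + 1)]
    return even_part + odd_part
```

For instance, with $m = 2$, $n = 1$ the tuple $p = (1, 0, 0)$ gives back the highest weight $\epsilon_1$ of the vector representation.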

From results of [6, 14] we know that an irreducible representation of $U_q(\mathfrak{g})$ is a contravariant tensor if and only if its highest weight belongs to $\Lambda^{(1)}$. Needless to say, all such irreducible representations are type 1 unitary.

Let $W(\lambda)$ be an irreducible contravariant tensor $U_q(\mathfrak{g})$ module with highest weight $\lambda \in \Lambda^{(1)}$. We define $\bar\lambda$ to be its lowest weight, and set $\lambda^\dagger = -\bar\lambda$. An explicit formula for $\lambda^\dagger$ was given in [14] (Sect. III. B.), where a more compact characterization was also given for the sets $\Lambda^{(1)}$ and

$$ \Lambda^{(2)} := \{ \lambda^\dagger \mid \lambda \in \Lambda^{(1)} \}. \tag{6} $$

We refer to that paper for details. Now the dual module $W(\lambda)^\dagger$ of $W(\lambda)$, which we will call a covariant tensor module, has highest weight $\lambda^\dagger$. All irreducible covariant tensor modules are unitary of type 2. The most important example is the covariant vector module $E^\dagger$, which is the dual of the vector module $E$. Its highest weight is given by $-\epsilon_{m+n}$. We summarize our discussions in the following

Proposition 1.
1. Each $U_q(\mathfrak{g})$ module $E^{\otimes k}$ (resp. $(E^\dagger)^{\otimes k}$), $k \in \mathbb{Z}_+$, can be decomposed into a direct sum of irreducible modules with highest weights belonging to $\Lambda^{(1)}$ (resp. $\Lambda^{(2)}$).
2. Every irreducible $U_q(\mathfrak{g})$ module with highest weight belonging to $\Lambda^{(1)}$ (resp. $\Lambda^{(2)}$) is contained in some repeated tensor products of $E$ (resp. $E^\dagger$) as an irreducible component.
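The statements about the vector representation that underlie Proposition 1 can be checked numerically for small $m$, $n$. The sketch below is an illustration under stated assumptions, not the paper's method: it builds the matrices of $K_a$, $E_{a\,a+1}$, $E_{a+1\,a}$ in the standard basis, then tests the mixed relation $[E_{a\,a+1}, E_{b+1\,b}\}$ of (3) and the type 1 unitarity conditions with respect to the form $(v_a, v_b) = \delta_{ab}\prod_{c<a} q_c^{-1}$. Exact rational arithmetic is used, $q$ is taken real so that no complex conjugation is needed, and all function names are hypothetical.

```python
from fractions import Fraction

def build(m, n, q):
    """Vector representation of U_q(gl(m|n)); indices are 0-based (a here is a+1 in the text)."""
    N = m + n
    par = [0] * m + [1] * n                      # grading [a]
    qa = [q if p == 0 else 1 / q for p in par]   # q_a = q^{(-1)^{[a]}}

    def unit(i, j):                              # matrix unit e_{ij}
        return [[Fraction(int((r, c) == (i, j))) for c in range(N)] for r in range(N)]

    def diag(d):
        return [[d[r] if r == c else Fraction(0) for c in range(N)] for r in range(N)]

    one = Fraction(1)
    K = [diag([qa[a] if b == a else one for b in range(N)]) for a in range(N)]
    Kinv = [diag([1 / qa[a] if b == a else one for b in range(N)]) for a in range(N)]
    E = [unit(a, a + 1) for a in range(N - 1)]   # pi(E_{a,a+1})
    F = [unit(a + 1, a) for a in range(N - 1)]   # pi(E_{a+1,a})
    return N, qa, K, Kinv, E, F

def mul(A, B):
    N = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

def lin(A, B, s):
    """Return A + s*B entrywise."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def check_mixed_relation(m, n, q):
    """[E_{a,a+1}, E_{b+1,b}} = delta_ab (K_a K_{a+1}^{-1} - K_a^{-1} K_{a+1}) / (q_a - q_a^{-1})."""
    N, qa, K, Kinv, E, F = build(m, n, q)
    zero = [[Fraction(0)] * N for _ in range(N)]
    for a in range(N - 1):
        for b in range(N - 1):
            both_odd = (a == m - 1) and (b == m - 1)   # only E_{m,m+1}, E_{m+1,m} are odd
            sign = -1 if both_odd else 1
            lhs = lin(mul(E[a], F[b]), mul(F[b], E[a]), -sign)   # graded commutator
            if a == b:
                num = lin(mul(K[a], Kinv[a + 1]), mul(Kinv[a], K[a + 1]), -1)
                rhs = [[x / (qa[a] - 1 / qa[a]) for x in row] for row in num]
            else:
                rhs = zero
            if lhs != rhs:
                return False
    return True

def check_type1_unitarity(m, n, q):
    """(X v_b, v_c) = (v_b, X^* v_c) for X = E_{a,a+1}, E_{a+1,a} and the type 1 *-operation."""
    N, qa, K, Kinv, E, F = build(m, n, q)
    g = [Fraction(1)]
    for c in range(N - 1):
        g.append(g[-1] / qa[c])                        # (v_a, v_a) = prod_{c<a} q_c^{-1}
    for a in range(N - 1):
        star_E = mul(F[a], mul(K[a], Kinv[a + 1]))     # *(E_{a,a+1}) = E_{a+1,a} K_a K_{a+1}^{-1}
        star_F = mul(mul(Kinv[a], K[a + 1]), E[a])     # *(E_{a+1,a}) = K_a^{-1} K_{a+1} E_{a,a+1}
        for X, Xs in ((E[a], star_E), (F[a], star_F)):
            for b in range(N):
                for c in range(N):
                    if X[c][b] * g[c] != Xs[b][c] * g[b]:
                        return False
    return True
```

Calling `check_mixed_relation(2, 1, Fraction(3, 2))` and `check_type1_unitarity(2, 1, Fraction(3, 2))` exercises both an even and an odd simple root; any rational `q` other than 0 and ±1 can be used.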


More detailed structures of the irreducible covariant and contravariant tensor representations can be understood, e.g., their characters and super characters can be computed, and the Clebsch–Gordan problem of irreducible representations within a given tensor type can also be resolved by using the supersymmetric Young diagram method. Here we elucidate some general aspects of the Clebsch–Gordan problem, which will play an important role in the remainder of the paper.

Denote by $[\lambda]$ the equivalence class of irreducible representations with highest weight $\lambda$. For $\lambda$ and $\lambda'$ both belonging to $\Lambda^{(1)}$, we interpret $[\lambda] + [\lambda']$ as the equivalence class of the direct sum representations, and $[\lambda] \cdot [\lambda']$ as that of the direct products. Let $\mathbb{Z}_+[\Lambda^{(1)}]$ be the $\mathbb{Z}_+$ module with a basis $\{[\lambda] \mid \lambda \in \Lambda^{(1)}\}$. Then the "$\cdot$" operation defines a multiplication on $\mathbb{Z}_+[\Lambda^{(1)}]$. Clearly $[\lambda] \cdot [0] = [0] \cdot [\lambda] = [\lambda]$. Furthermore, from Sect. V of [14] we can deduce that if

$$ [\lambda] \cdot [\lambda'] = [\lambda_1] + [\lambda_2] + ... + [\lambda_k], $$

then none of the $\lambda_i$ is zero unless both $\lambda$ and $\lambda'$ are zero. This is in agreement with the fact that

$$ \Lambda^{(1)} \cap \Lambda^{(2)} = \{0\}. $$

The discussions above can be repeated word by word for the irreducible representations with highest weights belonging to $\Lambda^{(2)}$.

3. Quantum General Linear Supergroup $GL_q(m|n)$

For compact Lie groups in the classical setting, there exists the celebrated Tannaka–Krein duality theory [18], which enables the reconstruction of a group from the Hopf algebra of its representative functions. The theory of quantum groups [11] makes essential use of a quantum analog of the duality [17], and is formulated entirely in terms of the algebra of functions. We will adopt the same philosophy here to formulate and study quantum supergroups. However, we should mention that Lie supergroups are much more complicated in structure than ordinary compact Lie groups; at best, the Tannaka–Krein duality holds in a restricted sense for Lie supergroups even in the classical situation, though we have not come across any treatment of the problem in the literature.

3.1. Subalgebra of functions associated with the vector representation. As before, we denote by $\pi$ the vector representation of $U_q(\mathfrak{g})$ relative to the standard basis $\{v_a \mid a \in \mathbf{I}\}$ of $E$. Then

$$ x v_a = \sum_b \pi(x)_{b\,a} v_b, \quad x \in U_q(\mathfrak{g}). $$

Let $(U_q(\mathfrak{g}))^0$ be the finite dual of $U_q(\mathfrak{g})$. Consider the elements $t_{a\,b}$, $a, b \in \mathbf{I}$, of $(U_q(\mathfrak{g}))^0$ satisfying

$$ t_{a\,b}(x) = \pi(x)_{a\,b}, \quad \forall x \in U_q(\mathfrak{g}). $$

It is easy to show that the $t_{a\,b}$ indeed belong to $(U_q(\mathfrak{g}))^0$. Also note that $t_{a\,b}$ is even if $[a] + [b] \equiv 0 \pmod 2$, and odd otherwise. Standard Hopf algebra theory asserts that $(U_q(\mathfrak{g}))^0$ is a $\mathbb{Z}_2$-graded Hopf algebra with its structures dualizing those of $U_q(\mathfrak{g})$. Consider the subalgebra $G_q^{\pi}$ of $(U_q(\mathfrak{g}))^0$ generated by $t_{a\,b}$, $a, b \in \mathbf{I}$. The multiplication which $G_q^{\pi}$ inherits from $(U_q(\mathfrak{g}))^0$ is given by

$$
\langle t\, t', x \rangle = \sum_{(x)} \langle t \otimes t', x_{(1)} \otimes x_{(2)} \rangle
= \sum_{(x)} (-1)^{[t'][x_{(1)}]} \langle t, x_{(1)} \rangle \langle t', x_{(2)} \rangle,
\quad \forall t, t' \in G_q^{\pi}, \ x \in U_q(\mathfrak{g}). \tag{7}
$$

To better understand the algebraic structure of $G_q^{\pi}$, we recall that the Drinfeld version of $U_q(\mathfrak{g})$ admits a universal R-matrix, which in particular satisfies

$$ R \Delta(x) = \Delta'(x) R, \quad \forall x \in U_q(\mathfrak{g}). \tag{8} $$

Applying $\pi \otimes \pi$ to both sides of the equation yields

$$ R^{\pi\pi} (\pi \otimes \pi)\Delta(x) = (\pi \otimes \pi)\Delta'(x) R^{\pi\pi}, \tag{9} $$

where $R^{\pi\pi} := (\pi \otimes \pi)R$. The universal R-matrix of $U_q(\mathfrak{g})$ can be extracted from the Khoroshkin–Tolstoy paper of [5] by appropriately adjusting the conventions. We can then apply $\pi \otimes \pi$ to $R$ to get $R^{\pi\pi}$. The matrices $R^{\bar\pi\pi}$ and $R^{\bar\pi\bar\pi}$, which will be used later, can also be obtained similarly. However, the explicit form of these matrices can be extracted more easily from the results of [16]. Here we copy $R^{\pi\pi}$ from that reference:

$$ R^{\pi\pi} = q^{\sum_{a \in \mathbf{I}} e_{a\,a} \otimes e_{a\,a} (-1)^{[a]}} + (q - q^{-1}) \sum_{a < b} e_{a\,b} \otimes e_{b\,a} (-1)^{[b]}. $$

We may also mention that $R$ is the infinite spectral parameter limit of the Perk–Schultz R-matrix. It is important to realize that Eq. (9) makes perfect sense within the Jimbo formulation of the quantized universal enveloping algebra $U_q(\mathfrak{g})$, even when $q$ is specialized to a real parameter. We can re-interpret the equation in terms of the $t_{a\,b}$. Then by setting $t = \sum_{a,b} e_{a\,b} \otimes t_{a\,b}$, we have

$$ R_{12}^{\pi\pi}\, t_1 t_2 = t_2 t_1\, R_{12}^{\pi\pi}. \tag{10} $$
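The intertwining relation (9) lends itself to a direct numerical check for small $m$, $n$. The following sketch is illustrative only and rests on several assumptions: the graded tensor product of matrices acts by $(A \otimes B)(v_a \otimes v_b) = (-1)^{[B][a]} A v_a \otimes B v_b$, the opposite co-multiplication $\Delta'$ is obtained from $\Delta$ by the graded flip, the off-diagonal sum in $R^{\pi\pi}$ runs over $a < b$, and only the generators $E_{a\,a+1}$, $E_{a+1\,a}$ are tested. The function name `intertwiner_check` is hypothetical.

```python
from fractions import Fraction

def intertwiner_check(m, n, q):
    """Check R (pi x pi)Delta(x) = (pi x pi)Delta'(x) R for x = E_{a,a+1}, E_{a+1,a}."""
    N = m + n
    par = [0] * m + [1] * n                      # grading [a], 0-based indices
    qa = [q if p == 0 else 1 / q for p in par]   # q_a = q^{(-1)^{[a]}}

    def unit(i, j):
        return [[Fraction(int((r, c) == (i, j))) for c in range(N)] for r in range(N)]

    def diag(d):
        return [[d[r] if r == c else Fraction(0) for c in range(N)] for r in range(N)]

    ident = diag([Fraction(1)] * N)

    def gtensor(A, B):
        """Graded tensor product: (A x B)(v_a x v_b) = (-1)^{[B][a]} A v_a x B v_b."""
        T = [[Fraction(0)] * (N * N) for _ in range(N * N)]
        for c in range(N):
            for d in range(N):
                for a in range(N):
                    for b in range(N):
                        # the entry B[d][b] has parity [d] + [b]
                        sign = -1 if ((par[d] + par[b]) * par[a]) % 2 else 1
                        T[c * N + d][a * N + b] = sign * A[c][a] * B[d][b]
        return T

    def mul(A, B):
        M = len(A)
        return [[sum(A[i][k] * B[k][j] for k in range(M)) for j in range(M)] for i in range(M)]

    def add(*Ms):
        return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

    def scale(s, A):
        return [[s * x for x in row] for row in A]

    # R = q^{sum_a (-1)^[a] e_aa x e_aa} + (q - 1/q) sum_{a<b} (-1)^[b] e_ab x e_ba
    R = [[Fraction(0)] * (N * N) for _ in range(N * N)]
    for a in range(N):
        for b in range(N):
            R[a * N + b][a * N + b] = qa[a] if a == b else Fraction(1)
    for a in range(N):
        for b in range(a + 1, N):
            R = add(R, scale((q - 1 / q) * (-1) ** par[b], gtensor(unit(a, b), unit(b, a))))

    for a in range(N - 1):
        E, F = unit(a, a + 1), unit(a + 1, a)
        kd = [Fraction(1)] * N
        kd[a], kd[a + 1] = qa[a], 1 / qa[a + 1]          # pi(K_a K_{a+1}^{-1})
        k, kinv = diag(kd), diag([1 / x for x in kd])
        pairs = [
            # (Delta(E), Delta'(E)) and (Delta(F), Delta'(F)) in the representation
            (add(gtensor(E, k), gtensor(ident, E)), add(gtensor(k, E), gtensor(E, ident))),
            (add(gtensor(F, ident), gtensor(kinv, F)), add(gtensor(ident, F), gtensor(F, kinv))),
        ]
        for D, Dop in pairs:
            if mul(R, D) != mul(Dop, R):
                return False
    return True
```

Exact fractions avoid rounding issues, so the comparison is an equality of rational matrices; for example `intertwiner_check(1, 1, Fraction(3, 2))` tests the case of $gl(1|1)$.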

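The co-multiplication $\Delta$ of $G_q^{\pi}$ is also defined in the standard way by

$$ \langle \Delta(t_{a\,b}), x \otimes y \rangle = \langle t_{a\,b}, xy \rangle = \pi(xy)_{a\,b}, \quad \forall x, y \in U_q(\mathfrak{g}). $$

We have

$$ \Delta(t_{a\,b}) = \sum_{c \in \mathbf{I}} (-1)^{([a]+[c])([c]+[b])}\, t_{a\,c} \otimes t_{c\,b}. \tag{11} $$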
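The sign in (11) can be traced to the graded pairing in (7). As a sketch, take homogeneous $x, y \in U_q(\mathfrak{g})$ and use the fact that $\pi(x)_{a\,c}$ can be nonzero only when $[x] \equiv [a] + [c] \pmod 2$, together with $[t_{c\,b}] = [c] + [b]$:

```latex
\begin{aligned}
\sum_{c} (-1)^{([a]+[c])([c]+[b])} \langle t_{a\,c} \otimes t_{c\,b},\, x \otimes y \rangle
  &= \sum_{c} (-1)^{([a]+[c])([c]+[b])} (-1)^{[t_{c\,b}][x]}\, \pi(x)_{a\,c}\, \pi(y)_{c\,b} \\
  &= \sum_{c} (-1)^{([a]+[c])([c]+[b]) + ([c]+[b])([a]+[c])}\, \pi(x)_{a\,c}\, \pi(y)_{c\,b} \\
  &= \sum_{c} \pi(x)_{a\,c}\, \pi(y)_{c\,b} = \pi(xy)_{a\,b}.
\end{aligned}
```

The two sign factors cancel term by term, which is exactly what the defining property of the co-multiplication requires.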

$G_q^{\pi}$ also has the unit $\epsilon$, and the co-unit $1_{U_q(\mathfrak{g})}$. Therefore, $G_q^{\pi}$ has the structure of a $\mathbb{Z}_2$-graded bi-algebra. However, it does not admit an antipode, as we will explain later.

Let $\pi^{(\lambda)}$ be an arbitrary irreducible contravariant tensor representation of $U_q(\mathfrak{g})$. We may also regard $\pi^{(\lambda)}$ as a representative of $[\lambda]$, where $\lambda \in \Lambda^{(1)}$. Define the elements $t^{(\lambda)}_{i\,j}$, $i, j = 1, 2, ..., \dim_{\mathbb{C}} \pi^{(\lambda)}$, of $(U_q(\mathfrak{g}))^0$ by

$$ t^{(\lambda)}_{i\,j}(x) = \pi^{(\lambda)}(x)_{i\,j}, \quad \forall x \in U_q(\mathfrak{g}). $$

It is an immediate consequence of Proposition 1 that $t^{(\lambda)}_{i\,j} \in G_q^{\pi}$, for all $i, j$ and $\lambda \in \Lambda^{(1)}$, and every $f \in G_q^{\pi}$ can be expressed as a linear sum of these elements. From the representation theory of $U_q(\mathfrak{g})$ we can deduce that these elements are also linearly independent. Introduce the vector spaces

$$ T^{(\lambda)} = \bigoplus_{i,j=1}^{\dim \pi^{(\lambda)}} \mathbb{C}\, t^{(\lambda)}_{i\,j}. $$

Then

Proposition 2. As a vector space,

$$ G_q^{\pi} = \bigoplus_{\lambda \in \Lambda^{(1)}} T^{(\lambda)}. $$

To return to the question why $G_q^{\pi}$ admits no antipode, we consider an arbitrary $t^{(\lambda)}_{i\,j} \in G_q^{\pi}$ with $\lambda \ne 0$. Denote also by $S$ the antipode of $(U_q(\mathfrak{g}))^0$. Then

$$ S(t^{(\lambda)}_{i\,j})(x) = t^{(\lambda)}_{i\,j}(S(x)), \quad \forall x \in U_q(\mathfrak{g}). $$

That is, the $S(t^{(\lambda)}_{i\,j})$ are the matrix elements of the dual irreducible representation of $\pi^{(\lambda)}$, the highest weight of which is not contained in $\Lambda^{(1)}$ unless the irreducible representation $\pi^{(\lambda)}$ is trivial, i.e., $\lambda = 0$. Therefore, $S(t^{(\lambda)}_{i\,j}) \notin G_q^{\pi}$.

Let us now recapitulate that $G_q^{\pi}$ is defined as the sub bi-superalgebra of $(U_q(\mathfrak{g}))^0$ generated by the matrix elements of the vector representation of $U_q(\mathfrak{g})$. Equation (10) is a set of relations satisfied by the $t_{a\,b}$ as elements of $(U_q(\mathfrak{g}))^0$. However, we may also consider a bi-superalgebra $T_q$ generated by $\tau_{a\,b}$, $a, b \in \mathbf{I}$, subject to the same relations as (10) but with $t$ replaced by $\sum_{a,b} e_{a\,b} \otimes \tau_{a\,b}$, and with a similar co-multiplication as (11). Clearly, the bi-superalgebra map

$$ \psi : T_q \to G_q^{\pi}, \quad \tau_{a\,b} \mapsto t_{a\,b}, $$

is surjective. Now a natural and important question is whether $\psi$ is also injective. The answer to this question is affirmative, as can be shown by adapting the method of Takeuchi [19] to the present situation. The question is also closely related to the problem of "connectedness" of quantum supergroups (see [19] about the corresponding problem for ordinary $GL_q(n)$), which will be treated in detail on another occasion.

3.2. Subalgebra of functions associated with the dual vector representation. Let $\{\bar v_a \mid a \in \mathbf{I}\}$ be the basis of $E^\dagger$ dual to the standard basis of $E$, i.e., $\bar v_a(v_b) = \delta_{a\,b}$. Denote by $\bar\pi$ the covariant vector representation relative to this basis. Let $\bar t_{a\,b}$, $a, b \in \mathbf{I}$, be the elements of $(U_q(\mathfrak{g}))^0$ such that

$$ \bar t_{a\,b}(x) = \bar\pi(x)_{a\,b}, \quad \forall x \in U_q(\mathfrak{g}). $$

Note that $\bar t_{a\,b}$ is even if $[a] + [b] \equiv 0 \pmod 2$, and odd otherwise. These elements generate a $\mathbb{Z}_2$-graded bi-subalgebra $G_q^{\bar\pi}$ of $(U_q(\mathfrak{g}))^0$ in the standard fashion. Here we merely point out that they obey the relation

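$$ R_{12}^{\bar\pi\bar\pi}\, \bar t_1 \bar t_2 = \bar t_2 \bar t_1\, R_{12}^{\bar\pi\bar\pi}, \tag{12} $$

with

$$ \bar t := \sum_{a,b} e_{a\,b} \otimes \bar t_{b\,a}, \qquad R^{\bar\pi\bar\pi} := (\bar\pi \otimes \bar\pi)R. $$

The following explicit form of $R^{\bar\pi\bar\pi}$ is obtained from [16]:

$$ R^{\bar\pi\bar\pi} = q^{\sum_{a \in \mathbf{I}} e_{a\,a} \otimes e_{a\,a} (-1)^{[a]}} + (q - q^{-1}) \sum_{a > b} e_{a\,b} \otimes e_{b\,a} (-1)^{[b]}. $$

Also, the co-multiplication is given by

$$ \Delta(\bar t_{a\,b}) = \sum_{c \in \mathbf{I}} (-1)^{([a]+[c])([c]+[b])}\, \bar t_{a\,c} \otimes \bar t_{c\,b}. $$

Denote by $\bar\pi^{(-\bar\lambda)}$ the irreducible representation dual to $\pi^{(\lambda)}$, $\lambda \in \Lambda^{(1)}$, in a given homogeneous basis. Introduce the elements $\bar t^{(-\bar\lambda)}_{i\,j}$, $i, j = 1, 2, ..., \dim_{\mathbb{C}} \pi^{(\lambda)}$, of $(U_q(\mathfrak{g}))^0$ such that

$$ \bar t^{(-\bar\lambda)}_{i\,j}(x) = \bar\pi^{(-\bar\lambda)}_{i\,j}(x), \quad \forall x \in U_q(\mathfrak{g}). $$

Then it follows from Proposition 1 that these elements form a basis of $G_q^{\bar\pi}$. Set $\bar T^{(\mu)} = \bigoplus_{i,j} \mathbb{C}\, \bar t^{(\mu)}_{i\,j}$. We have

Proposition 3.

$$ G_q^{\bar\pi} = \bigoplus_{\mu \in \Lambda^{(2)}} \bar T^{(\mu)}. $$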

3.3. Algebra $G_q$ of functions on $GL_q(m|n)$. We define the algebra $G_q$ of functions on the quantum general linear supergroup $GL_q(m|n)$ to be the $\mathbb{Z}_2$-graded subalgebra of $(U_q(\mathfrak{g}))^0$ generated by $\{t_{a\,b}, \bar t_{a\,b} \mid a, b \in \mathbf{I}\}$. The $t_{a\,b}$ and $\bar t_{a\,b}$, besides obeying the relations (10) and (12), also satisfy

$$ R_{12}^{\bar\pi\pi}\, \bar t_1 t_2 = t_2 \bar t_1\, R_{12}^{\bar\pi\pi}, \tag{13} $$

where $R^{\bar\pi\pi} := (\bar\pi \otimes \pi)R$. Equation (13) arises by first applying $\bar\pi \otimes \pi$ to both sides of (8), then interpreting the resulting equation in terms of the $t_{a\,b}$ and $\bar t_{a\,b}$. The following explicit expression of $R^{\bar\pi\pi}$ is extracted from [16]:

$$ R^{\bar\pi\pi} = q^{-\sum_{a \in \mathbf{I}} e_{a\,a} \otimes e_{a\,a} (-1)^{[a]}} - (q - q^{-1}) \sum_{a < b} e_{b\,a} \otimes e_{b\,a} (-1)^{[a]+[b]+[a][b]}. $$

Equation (13) enables us to factorize $G_q$ into

$$ G_q = G_q^{\pi} G_q^{\bar\pi}. \tag{14} $$

As both $G_q^{\pi}$ and $G_q^{\bar\pi}$ are $\mathbb{Z}_2$-graded bi-algebras, $G_q$ inherits a natural bi-algebra structure. It also admits an antipode. By considering

$$ (x \bar v_a)(v_b) = (-1)^{[x][a]}\, \bar v_a(S(x) v_b), \quad x \in U_q(\mathfrak{g}), $$

where $\{v_a\}$ is the standard basis of the vector representation, and $\{\bar v_a\}$ is the basis of the covariant vector representation dual to $\{v_a\}$, we arrive at

Lemma 1. The antipode $S : G_q \to G_q$ is a linear anti-automorphism given by

$$ S(t_{a\,b}) = (-1)^{[a][b]+[a]}\, \bar t_{b\,a}, \qquad S(\bar t_{a\,b}) = (-1)^{[a][b]+[b]}\, q^{(2\rho,\ \epsilon_a - \epsilon_b)}\, t_{b\,a}. \tag{15} $$

Therefore, $G_q$ has the structure of a $\mathbb{Z}_2$-graded Hopf algebra. Furthermore, ∗-operations can also be constructed for $G_q$, thus turning it into a Hopf ∗-superalgebra. We have

$$ *(t_{a\,b}) = (-1)^{(\theta+[a])([a]+[b])}\, \bar t_{a\,b}, \qquad *(\bar t_{a\,b}) = (-1)^{(\theta+[a])([a]+[b])}\, t_{a\,b}, $$

where $\theta \in \mathbb{Z}_2$.

An important property of $G_q$ is that it separates points of $U_q(\mathfrak{g})$, that is, for any nonvanishing $x \in U_q(\mathfrak{g})$, there exists $f \in G_q$ such that $f(x) \ne 0$. As a matter of fact, $G_q^{\pi}$ by itself separates points of $U_q(\mathfrak{g})$. Put differently, for any $u \in U_q(\mathfrak{g})$, if $u \ne 0$, then $\pi^{\otimes p}(u) \ne 0$ for some $p \in \mathbb{Z}_+$. To verify our assertion, we first consider the corresponding proposition in the classical situation of $U(\mathfrak{g})$ in detail.

Let $E^{(0)}_{a\,b}$, $a, b \in \mathbf{I}$, be the standard generators of $\mathfrak{g}$ embedded in its universal enveloping algebra. In the vector representation $\pi^{(0)}$, one has $\pi^{(0)}(E^{(0)}_{a\,b}) = e_{a\,b}$. We isolate the $u(1)$ subalgebra of $\mathfrak{g}$ with the generator

$$ Z^{(0)} = \sum_{a \in \mathbf{I}} E^{(0)}_{a\,a}, $$

and denote by $X^{(0)}_A$, $A = 1, ..., (m+n)^2 - 1$, the elements $E^{(0)}_{c\,c} - E^{(0)}_{c+1\,c+1}$, $c \in \mathbf{I}'$, and $E^{(0)}_{a\,b}$, $a \ne b$, in any fixed ordering. Then a Poincaré–Birkhoff–Witt basis for $U(\mathfrak{g})$ is given by

$$ \{ B^{(0)}_{k,\,A_1 ... A_l} = (Z^{(0)})^k X^{(0)}_{A_1} ... X^{(0)}_{A_l} \mid k, l \in \mathbb{Z}_+, \ A_i \le A_{i+1}, \ A_i \ne A_{i+1} \text{ if } [X^{(0)}_{A_i}] = 1 \}. $$

Set $\pi^{(0)}(X^{(0)}_A) = e_A$. Denote by $M$ the vector space of $(m+n) \times (m+n)$ matrices, and define

$$ R_k = \sum_{i=0}^{k-1} \underbrace{M \otimes ... \otimes M}_{i} \otimes I \otimes \underbrace{M \otimes ... \otimes M}_{k-1-i}. $$

Let

$$ b_{A_1 ... A_k} = \sum_{\sigma \in S_k} (-1)^{|\sigma\{A\}|}\, e_{A_{\sigma(1)}} \otimes e_{A_{\sigma(2)}} \otimes ... \otimes e_{A_{\sigma(k)}}, $$

where $|\sigma\{A\}|$ is the number of permutations required amongst odd elements in order to change $X_{A_1} \otimes X_{A_2} \otimes ... \otimes X_{A_k}$ to $X_{A_{\sigma(1)}} \otimes X_{A_{\sigma(2)}} \otimes ... \otimes X_{A_{\sigma(k)}}$. Clearly, the elements

$$ \{ b_{A_1 ... A_k} \mid k \in \mathbb{Z}_+, \ A_i \le A_{i+1}, \ A_i \ne A_{i+1} \text{ if } [X_{A_i}] = 1 \} $$

are linearly independent in $M^{\otimes k}$, and we will denote by $L_k$ their linear span. By considering the trace (not the supertrace!) on each factor of $M^{\otimes k}$, we can easily see that $L_k$ intersects $R_k$ trivially. Therefore,

$$
(\pi^{(0)})^{\otimes(k+p)}\big(B^{(0)}_{0,\,A_1 ... A_k}\big)
= \big((\pi^{(0)})^{\otimes k} \otimes (\pi^{(0)})^{\otimes p}\big)\big(B^{(0)}_{0,\,A_1 ... A_k}\big)
= b_{A_1 ... A_k} \otimes I^{\otimes p} + r_{k,p}, \qquad r_{k,p} \in R_k \otimes M^{\otimes p},
$$

are linearly independent as elements of $M^{\otimes(k+p)}$. Consider $u \in U(\mathfrak{g})$ given by

$$ u = \sum_{k=0}^{K} \sum_{l=0}^{L} \sum_{\{A\}} C_{k,\,A_1 ... A_l}\, B^{(0)}_{k,\,A_1 ... A_l}, \qquad C_{k,\,A_1 ... A_l} \in \mathbb{C}. $$

Using

$$ (\pi^{(0)})^{\otimes p}\big((Z^{(0)})^k\big) = p^k I^{\otimes p}, $$

we immediately see that $(\pi^{(0)})^{\otimes p}(u) = 0$, $\forall p > L$, requires

$$ \sum_{k=0}^{K} p^k\, C_{k,\,A_1 ... A_l} = 0, \quad \forall p > L, $$

which forces all the $C_{k,\,A_1 ... A_l}$ to vanish. This completes the proof for the classical case.

Remarks. There is something slightly unnatural about our proof, that is, the combination $E^{(0)}_{m\,m} - E^{(0)}_{m+1\,m+1}$ does not belong to $sl(m|n) \subset \mathfrak{g}$, and this in turn forced us to consider the ordinary trace instead of the supertrace in proving $L_k \cap R_k = \{0\}$. We can avoid this unnaturalness when $m \ne n$ by using $E^{(0)}_{m\,m} + E^{(0)}_{m+1\,m+1}$ instead, but not when $m = n$.

With the above preparations we can now readily prove our assertion for the quantum superalgebra. We first consider the Drinfeld version of $U_q(\mathfrak{g})$. Similar to the classical case, we set

$$ Z = \sum_{a \in \mathbf{I}} E_{a\,a}, $$

and denote by $X_A$, $A = 1, ..., (m+n)^2 - 1$, the elements $E_{c\,c} - E_{c+1\,c+1}$, $c \in \mathbf{I}'$, and $E_{a\,b}$, $a \ne b$, in a fixed ordering. Then

$$ \{ B_{k,\,A_1 ... A_l} = Z^k X_{A_1} ... X_{A_l} \mid k, l \in \mathbb{Z}_+, \ A_i \le A_{i+1}, \ A_i \ne A_{i+1} \text{ if } [X_{A_i}] = 1 \} $$

forms a Poincaré–Birkhoff–Witt basis for $U_q(\mathfrak{g})$ [6]. Consider $u = \hbar^k (u_0 + \hbar u_1 + \hbar^2 u_2 + ...)$, where each $u_i$ is a finite $\mathbb{C}$-combination of some $B_{k,\,A_1 ... A_l}$, and $u_0$ is assumed to be nonzero. Then it follows from the classical case that there exist infinitely many $p \in \mathbb{Z}_+$ such that $\pi^{\otimes p}(u) \not\equiv 0 \pmod{\hbar^{k+1}}$.

For the Jimbo algebra, we observe that ordered monomials in $E_{a\,b}$, $a \ne b$, and $K_a^{\pm 1}$ form a basis of $U_q(\mathfrak{g})$. Given $u \in U_q(\mathfrak{g})$ and a positive integer $p$, we consider the matrix elements of $\pi^{\otimes p}(u)|_{q = \exp(\hbar)}$ as power series in $\hbar$. Then $\pi^{\otimes p}(u) \ne 0$ if and only if some of these power series do not vanish identically. Now for the purpose of computing $\pi^{\otimes p}(u)|_{q = \exp(\hbar)}$, we can make the identification

$$ \pi^{\otimes p}(K_a) = \sum_{k=0}^{\infty} \frac{(-1)^{k[a]} \hbar^k}{k!}\, e_{a\,a}(p)^k, \qquad e_{a\,a}(p) = \sum_{i=0}^{p-1} \underbrace{I \otimes ... \otimes I}_{i} \otimes e_{a\,a} \otimes \underbrace{I \otimes ... \otimes I}_{p-i-1}. $$

This takes us back to the Drinfeld algebra situation, and we have already shown that in that situation the $\pi^{\otimes p}$, $p \in \mathbb{Z}_+$, separate points of $U_q(\mathfrak{g})$.

We summarize the discussions of this section into a proposition, points (ii) and (iii) of which may be considered as a partial generalization of the classical Peter–Weyl theorem to the quantum supergroup in an algebraic setting:

Proposition 4.
(i) $G_q$ is a ∗-Hopf superalgebra;
(ii) $G_q$ separates points of $U_q(\mathfrak{g})$;
(iii) The following elements span $G_q$:

$$ t^{(\lambda)}_{i\,j}\, \bar t^{(\mu)}_{i'\,j'}, \qquad i, j = 1, 2, ..., \dim \pi^{(\lambda)}, \ \lambda \in \Lambda^{(1)}, \qquad i', j' = 1, 2, ..., \dim \bar\pi^{(\mu)}, \ \mu \in \Lambda^{(2)}. $$

However, we should point out that these elements are not linearly independent.

4. Induced Representations of $G_q$

We will develop parabolic induction for representations of $GL_q(m|n)$ in this section. Recall that corresponding to every locally finite right co-module $\omega : W \to W \otimes G_q$ over $G_q$, there exists a unique left $U_q(\mathfrak{g})$ module $U_q(\mathfrak{g}) \otimes W \to W$ with the module action defined by

$$ x\, w = \omega(w)(x), \quad x \in U_q(\mathfrak{g}), \ w \in W. $$

A similar correspondence exists for left $G_q$ co-modules and right $U_q(\mathfrak{g})$ modules. Therefore, we can describe the representation theory of $G_q$ in both the $G_q$ co-module language and the $U_q(\mathfrak{g})$ module language, depending on which one is more convenient in a given situation. We will largely use the latter here.

4.1. Parabolic subalgebras of $U_q(gl(m|n))$. Let $\Theta$ be a subset of $\mathbf{I}'$. Introduce the following sets of elements of $U_q(\mathfrak{g})$:

$$
\begin{aligned}
S_{\mathfrak{l}} &= \{ K_a^{\pm 1}, a \in \mathbf{I}; \ E_{c\,c+1}, E_{c+1\,c}, c \in \Theta \}; \\
S_{\mathfrak{p}_+} &= S_{\mathfrak{l}} \cup \{ E_{c\,c+1}, c \in \mathbf{I}' \backslash \Theta \}; \\
S_{\mathfrak{p}_-} &= S_{\mathfrak{l}} \cup \{ E_{c+1\,c}, c \in \mathbf{I}' \backslash \Theta \}.
\end{aligned}
$$

The elements of each set generate a $\mathbb{Z}_2$-graded Hopf subalgebra of $U_q(\mathfrak{g})$. We denote by $U_q(\mathfrak{l})$ the Hopf subalgebra generated by the elements of $S_{\mathfrak{l}}$, and by $U_q(\mathfrak{p}_\pm)$ the Hopf subalgebras respectively generated by the elements of $S_{\mathfrak{p}_\pm}$. In the classical limit, the Hopf subalgebras $U_q(\mathfrak{p}_\pm)$ coincide with the universal enveloping algebras of parabolic subalgebras of the Lie superalgebra $\mathfrak{g}$. Therefore, we will call $U_q(\mathfrak{p}_\pm)$ parabolic subalgebras of $U_q(\mathfrak{g})$.

Let $V_\mu$ be a finite dimensional irreducible $U_q(\mathfrak{l})$ module. Then $V_\mu$ is of highest weight type. Let $\mu$ be the highest weight and $\tilde\mu$ the lowest weight of $V_\mu$ respectively. We can extend $V_\mu$ in a unique fashion to a $U_q(\mathfrak{p}_+)$ module, which we still denote by $V_\mu$, such that the elements of $S_{\mathfrak{p}_+} \backslash S_{\mathfrak{l}}$ act by zero. Similarly, $V_\mu$ also leads to a $U_q(\mathfrak{p}_-)$ module, on which the elements of $S_{\mathfrak{p}_-} \backslash S_{\mathfrak{l}}$ act by zero. It is not difficult to see that all finite dimensional irreducible $U_q(\mathfrak{p}_\pm)$ modules are of this kind.

Consider a finite dimensional irreducible $U_q(\mathfrak{g})$ module $W(\lambda)$ with highest weight $\lambda$ and lowest weight $\bar\lambda$. $W(\lambda)$ can be restricted into a $U_q(\mathfrak{p}_+)$ or $U_q(\mathfrak{p}_-)$ module in a natural way, and the resultant module is always indecomposable, but not irreducible in general. Consider first the case of $U_q(\mathfrak{p}_+)$. We wish to examine the $\mathbb{Z}_2$-graded vector space $\mathrm{Hom}_{U_q(\mathfrak{p}_+)}(W(\lambda), V_\mu)$, which graded-commutes with $U_q(\mathfrak{p}_+)$, namely,

$$ p\, \phi - (-1)^{[p][\phi]} \phi\, p = 0, \quad p \in U_q(\mathfrak{p}_+), \ \phi \in \mathrm{Hom}_{U_q(\mathfrak{p}_+)}(W(\lambda), V_\mu). $$

Because of the irreducibility of $V_\mu$, every non-zero $\phi \in \mathrm{Hom}_{U_q(\mathfrak{p}_+)}(W(\lambda), V_\mu)$ must be surjective, and thus $V_\mu \cong W(\lambda)/\mathrm{Ker}\,\phi$. As a $U_q(\mathfrak{p}_+)$ module, $W(\lambda)$ is indecomposable, and contains a unique maximal proper submodule $M$ such that the lowest weight vector $w_-$ of $W(\lambda)$ does not belong to $M$. Therefore, $\mathrm{Ker}\,\phi = M$, and $V_\mu = \phi(U_q(\mathfrak{l}) w_-)$. This forces $\bar\lambda = \tilde\mu$, and all elements of $\mathrm{Hom}_{U_q(\mathfrak{p}_+)}(W(\lambda), V_\mu)$ are scalar multiples of one another. It is worth observing that the map $\phi$ may be odd. In fact its degree is given by $[\phi] \equiv [w_-] + [\phi(w_-)] \pmod 2$. The case of $U_q(\mathfrak{p}_-)$ can be studied in exactly the same way. To summarize, we have

Lemma 2.

$$
\dim_{\mathbb{C}} \mathrm{Hom}_{U_q(\mathfrak{p}_+)}(W(\lambda), V_\mu) = \begin{cases} 1, & \bar\lambda = \tilde\mu, \\ 0, & \bar\lambda \ne \tilde\mu, \end{cases}
\qquad
\dim_{\mathbb{C}} \mathrm{Hom}_{U_q(\mathfrak{p}_-)}(W(\lambda), V_\mu) = \begin{cases} 1, & \lambda = \mu, \\ 0, & \lambda \ne \mu. \end{cases}
$$

4.2. Induced representations and quantum superbundles. Let us first introduce two types of left actions of $U_q(\mathfrak g)$ on $G_q$, which correspond to the left and right translations in the classical situation. Define a bilinear map $\cdot : U_q(\mathfrak g) \otimes G_q \to G_q$ by

\[ x \otimes f \mapsto x \cdot f = \sum_{(f)} \langle f_{(1)}, S^{-1}(x) \rangle f_{(2)}, \tag{16} \]

which can be easily shown to satisfy

\[ (x \cdot f)(y) = (-1)^{[x][y]} f(S^{-1}(x) y), \qquad x \cdot (y \cdot f) = (xy) \cdot f, \qquad x, y \in U_q(\mathfrak g),\ f \in G_q. \]

(We assume that the elements $x, y \in U_q(\mathfrak g)$ and $g, f \in G_q$ are homogeneous for the sake of simplicity. All the statements below generalize to inhomogeneous elements in the obvious way.) Therefore, this defines a left action of $U_q(\mathfrak g)$ on $G_q$, which corresponds to the left translation of Lie groups in the classical situation. It is worth observing that we may replace $S^{-1}$ in the above definition, and arrive at a different left action. Another left action “◦” of $U_q(\mathfrak g)$ on $G_q$ can be defined by

\[ x \circ f = \sum_{(f)} (-1)^{[x]([f]+[x])} f_{(1)} \langle f_{(2)}, x \rangle. \tag{17} \]

Straightforward calculations can show that

\[ x \circ (y \circ f) = (xy) \circ f; \qquad (x \circ f)(y) = f(yx), \qquad (\mathrm{id}_{G_q} \otimes x \circ)\Delta(f) = \Delta(x \circ f). \]

This corresponds to the right translation in the classical theory. It graded-commutes with the action “·”, namely, $x \circ (y \cdot f) = (-1)^{[x][y]}\, y \cdot (x \circ f)$.

Let $U_q(\mathfrak p)$ denote either $U_q(\mathfrak p_+)$ or $U_q(\mathfrak p_-)$. Given any finite dimensional left $U_q(\mathfrak p)$ module $V$, we form the tensor product $V \otimes_{\mathbb C} G_q$, which is a subspace of functions $U_q(\mathfrak g) \to V$:

\[ \zeta = \sum v_i \otimes f_i \in V \otimes G_q, \qquad \zeta(x) = \sum f_i(x) v_i, \qquad x \in U_q(\mathfrak g). \]

The left actions “·” and “◦” of $U_q(\mathfrak g)$ on $G_q$ can be extended in an obvious way to actions on $V \otimes_{\mathbb C} G_q$,

\[ x \cdot \zeta = \sum (-1)^{[x][v_i]} v_i \otimes x \cdot f_i, \qquad x \circ \zeta = \sum (-1)^{[x][v_i]} v_i \otimes x \circ f_i, \qquad x \in U_q(\mathfrak g). \]

Furthermore, there also exists a co-action $\omega$ of $G_q$ on $V \otimes_{\mathbb C} G_q$ defined by $\omega = \mathrm{id}_V \otimes \Delta'$, where $\Delta'$ represents the opposite co-multiplication of $G_q$. Consider the subspace of $V \otimes_{\mathbb C} G_q$ defined by

\[ \mathcal{O}_V = \{ \zeta \in V \otimes_{\mathbb C} G_q \mid p \circ \zeta = (S(p) \otimes \mathrm{id}_{G_q})\zeta,\ \forall\, p \in U_q(\mathfrak p) \}. \tag{18} \]

Lemma 3. $\mathcal{O}_V$ furnishes a left $U_q(\mathfrak g)$ module under “·”, and at the same time a right $G_q$ co-module under $\omega$.

Proof. The lemma can be confirmed by direct calculations. For $x \in U_q(\mathfrak g)$, $p \in U_q(\mathfrak p)$, $\zeta \in \mathcal{O}_V$, we have

\[ p \circ (x \cdot \zeta) = (-1)^{[x][p]}\, x \cdot (p \circ \zeta) = (S(p) \otimes \mathrm{id}_{G_q})(x \cdot \zeta); \]

\[ (p \circ \otimes\, \mathrm{id}_{G_q})\,\omega(\zeta) = (p \circ \otimes\, \mathrm{id}_{G_q})(\mathrm{id}_V \otimes \Delta')\zeta = (\mathrm{id}_V \otimes \tau)(\mathrm{id}_V \otimes \mathrm{id}_{G_q} \otimes p\circ)(\mathrm{id}_V \otimes \Delta)\zeta = (\mathrm{id}_V \otimes \tau)(\mathrm{id}_V \otimes \Delta')(p \circ \zeta) = \omega(S(p) \otimes \mathrm{id}_{G_q})\zeta, \]

where $\tau$ is the flip mapping.


We call $\mathcal{O}_V$ the induced $U_q(\mathfrak g)$ module, and also the induced $G_q$ co-module, which gives rise to a co-representation of $G_q$. A conceptual understanding of $\mathcal{O}_V$ can be gained by considering its classical analog, which was investigated by Manin [20] and Penkov [21]. Very briefly (precise and extensive treatments can be found in the references just given), if $P$ is a parabolic subgroup of the complex Lie supergroup $SL(m|n)$, and $E$ a finite dimensional representation of $P$, then $SL(m|n) \times_P E$, the quotient space of $SL(m|n) \times E$ under the equivalence relation $(g, v) \sim (gp, p^{-1} v)$ for all $p \in P$, defines a super vector bundle over the supermanifold $SL(m|n)/P$. A function $f : SL(m|n) \to E$ satisfying $f(gp) = p^{-1} f(g)$, $\forall p \in P$, defines a section of the bundle $s_f : SL(m|n)/P \to SL(m|n) \times_P E$. Analogously, we may regard $\mathcal{O}_V$ as the vector space of sections of a quantum super vector bundle over the quantum counterpart of $SL(m|n)/P$.

It is of great importance to systematically develop the theory of quantum homogeneous super vector bundles, and we hope to return to the subject in the future. In this paper, we will restrict ourselves to issues directly related to representation theory, and will not further ponder noncommutative geometry, except for the last section, where we will discuss in some detail quantum projective superspaces when dealing with explicit realizations of the irreducible skew supersymmetric tensor representations and their duals.

We have the following quantum analog of Frobenius reciprocity.

Proposition 5. Let $W$ be a quotient $U_q(\mathfrak g)$ module of $\bigoplus_{k,l=0}^{\infty} E^{\otimes k} \otimes (E^*)^{\otimes l}$ (the restriction of which furnishes a $U_q(\mathfrak p)$ module in a natural way). Then there is a canonical isomorphism

\[ \mathrm{Hom}_{U_q(\mathfrak g)}(W, \mathcal{O}_V) \cong \mathrm{Hom}_{U_q(\mathfrak p)}(W, V). \tag{19} \]

Proof. We prove the proposition by explicitly constructing the isomorphism, which we claim to be the linear map

\[ F : \mathrm{Hom}_{U_q(\mathfrak g)}(W, \mathcal{O}_V) \to \mathrm{Hom}_{U_q(\mathfrak p)}(W, V), \qquad \psi \mapsto \psi(1_{U_q(\mathfrak g)}), \]

with the inverse map

\[ \bar F : \mathrm{Hom}_{U_q(\mathfrak p)}(W, V) \to \mathrm{Hom}_{U_q(\mathfrak g)}(W, \mathcal{O}_V), \qquad \phi \mapsto \bar\phi, \]

where $\bar\phi$ is defined by

\[ \bar\phi(w)(x) = (-1)^{[x]([w]+1)} \phi(S(x) w), \qquad x \in U_q(\mathfrak g),\ w \in W. \]

As for $F$, we need to show that its image is contained in $\mathrm{Hom}_{U_q(\mathfrak p)}(W, V)$. This is indeed the case, as

\[ p(F\psi(w)) = (p \cdot \psi(w))(1_{U_q(\mathfrak g)}) = (-1)^{[\psi][p]} F\psi(pw), \qquad p \in U_q(\mathfrak p),\ w \in W. \]

In order to show that $\bar F$ is the inverse of $F$, we first need to demonstrate that the image $\mathrm{Im}(\bar F)$ of $\bar F$ is contained in $\mathrm{Hom}_{U_q(\mathfrak g)}(W, \mathcal{O}_V)$. Note that $\mathrm{Im}(\bar F) \subset \mathrm{Hom}_{\mathbb C}(W, V \otimes G_q)$, since $W$ is a subquotient of $\bigoplus_{k,l=0}^{\infty} E^{\otimes k} \otimes (E^*)^{\otimes l}$. Some relatively simple manipulations lead to


\[ (y \cdot \bar\phi(w))(x) = (-1)^{[y][\phi] + [x]([w]+[x]+[y])} \phi(S(x) y w) = (-1)^{[y][\phi]} \bar\phi(yw)(x), \]

\[ (p \circ \bar\phi(w))(x) = (-1)^{[x]([w]+1) + [p][\phi]} \phi(S(p) S(x) w) = S(p)(\bar\phi(w)(x)), \qquad x, y \in U_q(\mathfrak g),\ p \in U_q(\mathfrak p),\ w \in W. \]

Therefore, $\mathrm{Im}(\bar F) \subset \mathrm{Hom}_{U_q(\mathfrak g)}(W, \mathcal{O}_V)$. Now we show that $F$ and $\bar F$ are inverse to each other. For $\psi \in \mathrm{Hom}_{U_q(\mathfrak g)}(W, \mathcal{O}_V)$ and $\phi \in \mathrm{Hom}_{U_q(\mathfrak p)}(W, V)$, we have

\[ (F \bar F \phi)(w) = (\bar F \phi)(w)(1_{U_q(\mathfrak g)}) = \phi(w), \]

\[ (\bar F F \psi)(w)(x) = (-1)^{[x]([w]+1)} (F\psi)(S(x) w) = (-1)^{[x]([w]+1)} \psi(S(x) w)(1_{U_q(\mathfrak g)}) = (-1)^{[x]([\psi(w)]+1)} (S(x) \cdot \psi(w))(1_{U_q(\mathfrak g)}) = \psi(w)(x), \qquad x \in U_q(\mathfrak g),\ w \in W. \]

This completes the proof of the proposition.

4.3. Quantum Borel–Weil theorem for the irreducible covariant and contravariant tensor representations. In this subsection we study in detail the irreducible covariant and contravariant tensor representations of $U_q(\mathfrak g)$ within the framework of parabolic induction. Our main result here will be a quantum version of the Borel–Weil theorem for these irreducible representations. For the classical Lie supergroups, the program of developing a Bott–Borel–Weil theory was initiated and extensively investigated by Penkov and co-workers [21, 22], although much remains to be done on the subject. Their program has also revealed a very rich content and various interesting new phenomena. It appears that the Hopf algebraic approach to the Bott–Borel–Weil theory developed here is also worth exploring at the classical level, and is likely to provide a new method complementary to the geometric approach of [21].

Let $V$ be a finite dimensional irreducible $U_q(\mathfrak p)$ module, with the $U_q(\mathfrak l)$ highest weight $\mu$ and $U_q(\mathfrak l)$ lowest weight $\tilde\mu$. For the purpose of studying the tensor representations, we need to consider

\[ \mathcal{O}(\mu) = \mathcal{O}_V \cap V \otimes G_q^{\pi}, \qquad \bar{\mathcal{O}}(\mu) = \mathcal{O}_V \cap V \otimes G_q^{\bar\pi}. \tag{20} \]

Let us study $\bar{\mathcal{O}}(\mu)$ first. A typical element of $\bar{\mathcal{O}}(\mu)$ is of the form

\[ \zeta = \sum_{\lambda \in \Lambda^{(1)}} \sum_{\alpha, \beta, i} c^{\lambda}_{\alpha\beta,\, i}\; v_i \otimes \bar t^{(\lambda^\dagger)}_{\alpha\beta}, \]

where $\{v_i\}$ is a basis of $V$, and the $c^{\lambda}_{\alpha\beta,\, i}$ are complex numbers. The $\bar t^{(\lambda^\dagger)}_{\alpha\beta}$ are elements of the Peter-Weyl basis for $G_q^{\bar\pi}$, which, needless to say, are polynomials in $\bar t_{ab}$, $a, b \in I$. The property that $(p \circ \zeta) = (S(p) \otimes \mathrm{id}_{G_q})\zeta$, $\forall p \in U_q(\mathfrak p)$, leads to

\[ \sum_{\gamma, i} (-1)^{[p]([\gamma]+[v_i])}\, c^{\lambda}_{\alpha\gamma,\, i}\; t^{(\lambda)}_{\gamma\beta}(p)\, v_i = \sum_{i} c^{\lambda}_{\alpha\beta,\, i}\; p\, v_i, \qquad \forall p \in U_q(\mathfrak p). \tag{21} \]


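Let $W(\lambda)$ with the basis $\{w_\alpha\}$ be the irreducible $U_q(\mathfrak g)$ module associated with the irreducible representation $t^{(\lambda)}$. We define the linear maps between $\mathbb{Z}_2$-graded vector spaces

\[ \phi^{(\alpha)}_\lambda : W(\lambda) \to V, \qquad w_\beta \mapsto \sum_{i} c^{\lambda}_{\alpha\beta,\, i}\, v_i. \]

There is no particular significance attached to the maps at this stage, apart from the mere fact that they can be employed to re-express Eq. (21) as

\[ \sum_{\gamma} (-1)^{[p][\phi^{(\alpha)}_\lambda]}\; t^{(\lambda)}_{\gamma\beta}(p)\, \phi^{(\alpha)}_\lambda(w_\gamma) = p\, \phi^{(\alpha)}_\lambda(w_\beta). \]

We emphasize that this equation is entirely equivalent to (21). Now something of crucial importance appears: this equation requires that each $\phi^{(\alpha)}_\lambda$ be a $U_q(\mathfrak p)$ module homomorphism of degree $[\phi^{(\alpha)}_\lambda]$. Lemma 2 forces $\phi^{(\alpha)}_\lambda = c_\alpha \phi_\lambda$, $c_\alpha \in \mathbb{C}$, and $\phi_\lambda$ may be nonzero only when

i) $\bar\lambda = \tilde\mu$, if $U_q(\mathfrak p) = U_q(\mathfrak p_+)$,
ii) $\lambda = \mu$, if $U_q(\mathfrak p) = U_q(\mathfrak p_-)$.

In these cases, $\bar{\mathcal{O}}(\mu)$ is spanned by

\[ \zeta_\alpha = \sum_{\beta} \phi_\lambda(w_\beta) \otimes \bar t^{(\lambda^\dagger)}_{\alpha\beta}, \]

which are obviously linearly independent. Furthermore,

\[ x \cdot \zeta_\alpha = (-1)^{[x][\phi_\lambda]} \sum_{\beta} t^{(\lambda)}_{\beta\alpha}(x)\, \zeta_\beta, \qquad x \in U_q(\mathfrak g). \tag{22} \]

The case of $\mathcal{O}(\mu)$ can be studied in exactly the same way. To summarize, we have the following quantum analog of the Borel–Weil theorem for the irreducible covariant and contravariant tensor representations.

Proposition 6. As $U_q(\mathfrak g)$ modules,

\[ \bar{\mathcal{O}}(\mu) \cong \begin{cases} W((-\tilde\mu)^\dagger), & \text{if } \tilde\mu \in -\Lambda^{(2)},\ U_q(\mathfrak p) = U_q(\mathfrak p_+), \\ W(\mu), & \text{if } \mu \in \Lambda^{(1)},\ U_q(\mathfrak p) = U_q(\mathfrak p_-), \\ \{0\}, & \text{otherwise}; \end{cases} \tag{23} \]

\[ \mathcal{O}(\mu) \cong \begin{cases} W((-\tilde\mu)^\dagger), & \text{if } \tilde\mu \in -\Lambda^{(1)},\ U_q(\mathfrak p) = U_q(\mathfrak p_+), \\ W(\mu), & \text{if } \mu \in \Lambda^{(2)},\ U_q(\mathfrak p) = U_q(\mathfrak p_-), \\ \{0\}, & \text{otherwise}. \end{cases} \tag{24} \]

In the proposition, the notation $W(\lambda)$ signifies the irreducible $U_q(\mathfrak g)$ module with highest weight $\lambda$.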


Remarks. $\mathcal{O}(\mu)$ and $\bar{\mathcal{O}}(\mu)$, which form irreducible $U_q(\mathfrak g)$-modules, are proper subspaces of $\mathcal{O}_V$. Although $\mathcal{O}_V$ itself also furnishes a left $U_q(\mathfrak g)$-module, it is not irreducible in general. This fact differs drastically from the ordinary quantum group case, where the counterpart of $\mathcal{O}_V$, which may be regarded as the quantum analog of the sheaf of holomorphic sections of a homogeneous vector bundle, forms an irreducible module over the corresponding quantized universal enveloping algebra.

5. Quantum Projective Superspaces and Skew Supersymmetric Tensors

We will apply the general theory developed in the last section to study two infinite classes of irreducible representations, namely, the irreducible skew supersymmetric tensor representations and their duals. Explicit realizations of these irreducible representations will be given in terms of sections of quantum super vector bundles over quantum projective superspaces.

5.1. Quantum projective superspaces. Let $U_q(\mathfrak g')$, $\mathfrak g' = gl(m|n-1)$, be the subalgebra of $U_q(\mathfrak g)$ generated by the following elements: $\{K_a,\ a \in I';\ E_{c\,c+1},\ E_{c+1\,c},\ c \in I' \setminus \{m+n-1\}\}$. Clearly $U_q(\mathfrak g')$ is a Hopf subalgebra. Define

\[ A_+ = \{ f \in G_q^{\pi} \mid f(xp) = \epsilon(p) f(x),\ \forall x \in U_q(\mathfrak g),\ p \in U_q(\mathfrak g') \}, \qquad A_- = \{ f \in G_q^{\bar\pi} \mid f(xp) = \epsilon(p) f(x),\ \forall x \in U_q(\mathfrak g),\ p \in U_q(\mathfrak g') \}. \]

The Hopf algebra structure of $U_q(\mathfrak g')$ implies that both $A_+$ and $A_-$ are subalgebras of $G_q$. Together they generate another subalgebra of $G_q$, which we will denote by $S_q^{m|n-1}$. Set

\[ z_a = t_{a\, m+n}, \qquad \bar z_a = \bar t_{a\, m+n}, \qquad a \in I. \]

Then $z_a$ and $\bar z_a$ are conjugate to each other under the $*$-operation with $\theta = 0$. More explicitly, $*(z_a) = \bar z_a$, $\forall a \in I$.

Now $S_q^{m|n-1}$ is generated by the $z$'s and $\bar z$'s, which satisfy the following commutation relations:

\[ \begin{aligned} z_a z_b &= (-1)^{[z_a][z_b]}\, q\, z_b z_a, && a < b, \\ (z_c)^2 &= 0, && c \le m; \\ \bar z_a \bar z_b &= (-1)^{[\bar z_a][\bar z_b]}\, q^{-1}\, \bar z_b \bar z_a, && a < b, \\ (\bar z_c)^2 &= 0, && c \le m; \\ \bar z_a z_b &= q\, (-1)^{[\bar z_a][z_b]} z_b \bar z_a + \delta_{ab} \Big( (1 - q_a^{-1})\, \bar z_a z_a - (-1)^{[\bar z_a]} (q - q^{-1}) \sum_{c} \bar z_c z_c \Big), && \forall a, b \in I, \\ \sum_{c \in I} \bar z_c z_c &= 1. \end{aligned} \]


It can be shown that the last two equations imply that

\[ \sum_{c \in I} q^{(2\rho,\, \epsilon_c)} z_c \bar z_c = q^{(2\rho,\, \epsilon_{m+n})}. \]

$S_q^{m|n-1}$ furnishes a right $G_q$ co-module algebra, with the co-module action $\omega : S_q^{m|n-1} \to S_q^{m|n-1} \otimes G_q$ defined by

\[ \omega(z_a) = \sum_{c \in I} z_c \otimes t_{a c}, \qquad \omega(\bar z_a) = \sum_{c \in I} \bar z_c \otimes \bar t_{a c}. \]

Also, $S_q^{m|n-1}$ gives rise to a right $U_q(\mathfrak g)$ module algebra with the module action “◦” defined by (17). This module algebra structure restricts naturally to a module algebra structure over $U_q(\mathfrak g') \otimes U_q(gl(1))$, where $U_q(gl(1))$ is generated by $K_{m+n}^{\pm 1}$. The action of $U_q(\mathfrak g')$ on $S_q^{m|n-1}$ is trivial following the definitions of $A_\pm$; $U_q(gl(1))$ also acts in a very simple manner. To be explicit, we introduce the notations that for $L = (\theta_1, \dots, \theta_m;\ l_1, \dots, l_n) \in \{0, 1\}^{\otimes m} \otimes \mathbb{Z}_+^{\otimes n}$, $|L| = \sum_{i=1}^{m} \theta_i + \sum_{\mu=1}^{n} l_\mu$. Set

\[ Z^L = z_1^{\theta_1} \cdots z_m^{\theta_m}\, z_{m+1}^{l_1} \cdots z_{m+n}^{l_n}, \qquad \bar Z^L = \bar z_1^{\theta_1} \cdots \bar z_m^{\theta_m}\, \bar z_{m+1}^{l_1} \cdots \bar z_{m+n}^{l_n}. \tag{25} \]

Then for any $k \in \mathbb{Z}$, and $p \in U_q(\mathfrak g')$, we have

\[ (p K_{m+n}^{k}) \circ (Z^L \bar Z^{L'}) = \epsilon(p)\, q^{k(|L'| - |L|)}\, Z^L \bar Z^{L'}. \tag{26} \]

We will define the quantum projective superspace $CP_q^{m|n-1}$ to be the $U_q(gl(1))$ invariant subalgebra of $S_q^{m|n-1}$, namely,

\[ CP_q^{m|n-1} = \big( S_q^{m|n-1} \big)^{U_q(gl(1))}. \tag{27} \]

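5.2. Irreducible skew supersymmetric tensor representations and their duals. We specialize $U_q(\mathfrak p_+)$ and $U_q(\mathfrak p_-)$ to the case with $\Theta = I' \setminus \{m+n-1\}$. Consider a one-dimensional irreducible $U_q(\mathfrak p_+)$ module $V_+ = \mathbb{C}v$ such that

\[ E_{b\,b+1} v = E_{c+1\,c} v = 0, \qquad K_b v = v, \qquad K_{m+n} v = q^{-k} v, \qquad k \in \mathbb{Z}_+,\ b, c \in I',\ c < m+n-1, \]

and denote the associated representation by $\phi$. Define

\[ \mathcal{O}_k = \{ \zeta \in V_+ \otimes G_q^{\pi} \mid (p \circ \zeta)(x) = \phi(S(p))\zeta(x),\ \forall x \in U_q(\mathfrak g),\ p \in U_q(\mathfrak p_+) \}. \]

Direct calculations can show that

\[ \mathcal{O}_k = \bigoplus_{|L| = k} \mathbb{C} v \otimes Z^L, \tag{28} \]

where $Z^L$ is defined by (25). Then $\mathcal{O}_k$ gives rise to the rank $k$ irreducible skew supersymmetric tensor representation of $U_q(\mathfrak g)$, with the highest weight

\[ \lambda = \begin{cases} \sum_{i=1}^{k} \epsilon_i, & k \le m, \\ \sum_{i=1}^{m} \epsilon_i + (k - m)\,\epsilon_{m+1}, & k > m. \end{cases} \]

Now let $V_- = \mathbb{C}w$ be a one-dimensional irreducible $U_q(\mathfrak p_-)$ module such that

\[ E_{c\,c+1} w = E_{b+1\,b} w = 0, \qquad K_b w = w, \qquad K_{m+n} w = q^{k} w, \qquad k \in \mathbb{Z}_+,\ b, c \in I',\ c < m+n-1, \]

and denote the corresponding irreducible representation by $\psi$. Define

\[ \bar{\mathcal{O}}_k = \{ \zeta \in V_- \otimes G_q^{\bar\pi} \mid (p \circ \zeta)(x) = \psi(S(p))\zeta(x),\ \forall x \in U_q(\mathfrak g),\ p \in U_q(\mathfrak p_-) \}. \]

Then

\[ \bar{\mathcal{O}}_k = \bigoplus_{|L| = k} \mathbb{C} w \otimes \bar Z^L. \tag{29} \]

This time $\bar{\mathcal{O}}_k$ yields an irreducible representation with highest weight $\lambda = -k\,\epsilon_{m+n}$, which is dual to the rank $k$ irreducible skew supersymmetric tensor representation.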
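As an illustration of the decompositions (28) and (29), the following sketch counts the labels $L = (\theta_1, \dots, \theta_m; l_1, \dots, l_n)$ with $|L| = k$, which by (28) index a basis of the rank $k$ module. It assumes that $\mathbb{Z}_+$ includes $0$ and that the monomials $Z^L$ are linearly independent, as the direct sum in (28) suggests. The helper names (`compositions`, `labels`, `multichoose`, `dim_closed_form`) are hypothetical and introduced only for this example; the closed form is the elementary count of $j$ distinct odd generators times a multiset of $k - j$ even ones.

```python
from itertools import product
from math import comb


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 0:
        if total == 0:
            yield ()
        return
    for first in range(total + 1):
        for tail in compositions(total - first, parts - 1):
            yield (first,) + tail


def labels(m, n, k):
    """Yield all L = (theta; l) with theta in {0,1}^m, l in Z_+^n and |L| = k."""
    for theta in product((0, 1), repeat=m):
        rest = k - sum(theta)
        if rest < 0:
            continue
        for l in compositions(rest, n):
            yield theta, l


def multichoose(n, r):
    """Number of multisets of size r drawn from n symbols."""
    if r == 0:
        return 1
    if n == 0:
        return 0
    return comb(n + r - 1, r)


def dim_closed_form(m, n, k):
    """Choose j of the m nilpotent generators, then a degree k - j monomial in the other n."""
    return sum(comb(m, j) * multichoose(n, k - j) for j in range(min(m, k) + 1))


if __name__ == "__main__":
    for m, n, k in [(2, 1, 2), (1, 1, 3), (2, 2, 3)]:
        brute = sum(1 for _ in labels(m, n, k))
        print((m, n, k), brute, dim_closed_form(m, n, k))
```

For $m = 2$, $n = 1$, $k = 2$ the enumeration gives the four labels $(0,0;2)$, $(1,0;1)$, $(0,1;1)$, $(1,1;0)$, in agreement with the closed form.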

References

1. Bracken, A.J., Gould, M.D. and Zhang, R.B.: Quantum supergroups and solutions of the Yang–Baxter equation. Mod. Phys. Lett. A5, 831–840 (1990); Zhang, R.B., Bracken, A.J. and Gould, M.D.: Solution of the graded Yang–Baxter equation associated with the vector representation of Uq(osp(M/2n)). Phys. Lett. B257, 133–139 (1991)
2. Chaichian, M. and Kulish, P.P.: Phys. Lett. B234, 72 (1990); Floreanini, R., Spiridonov, V.P. and Vinet, L.: Commun. Math. Phys. 137, 149 (1991); Scheunert, M.: Serre-type relations for special linear Lie superalgebras. Lett. Math. Phys. 34, 320 (1993); Yamane, H.: A Serre type theorem for affine Lie superalgebras and their quantized universal enveloping superalgebras. Proc. Japan Acad. A70, 31 (1994)
3. Drinfeld, V.G.: Quantum groups. Proc. Internl. Cong. Math., Berkeley, 1, 789 (1986); Jimbo, M.: A q-difference analog of U(g) and the Yang–Baxter equation. Lett. Math. Phys. 10, 63 (1985)
4. Perk, J.H.H. and Schultz, C.L.: Phys. Lett. A84, 407 (1981); Bazhanov, V.V. and Shadrikov, A.G.: Theor. Math. Phys. 73, 1302 (1987)
5. Gould, M.D., Zhang, R.B. and Bracken, A.J.: Quantum double construction for graded Hopf algebras. Bull. Australian Math. Soc. 47, 353–375 (1993); Yamane, H.: Universal R-matrices for quantum groups associated to Lie superalgebras. Proc. Japan Acad. A67, 108 (1991); Khoroshkin, S.M. and Tolstoy, V.N.: Universal R-matrix for quantized (super)algebras. Commun. Math. Phys. 141, 599 (1991)
6. Zhang, R.B.: Finite dimensional irreducible representations of the quantum supergroup Uq(gl(m/n)). J. Math. Phys. 34, 1236–1254 (1993)
7. Zhang, R.B.: Finite dimensional representations of Uq(C(n)) at arbitrary q. J. Phys. A26, 7041–7059 (1993); Palev, T.D., Stoilova, N.I. and Van der Jeugt, J.: Finite-dimensional representations of the quantum superalgebra Uq[gl(m|n)] and related q-identities. Commun. Math. Phys. 166, 367 (1994); Zhang, R.B.: The gl(M|N) super Yangian and its finite dimensional representations. Lett. Math. Phys. 37, 419–434 (1996); Zhang, R.B.: Symmetrizable quantum affine superalgebras and their representations. J. Math. Phys. 38, 535 (1997)
8. Bracken, A.J., Gould, M.D., Zhang, Y.Z. and Delius, G.W.: Solutions to the quantum Yang–Baxter equation with extra non-additive parameters. J. Phys. A27, 6551–6561 (1994); Bracken, A.J., Gould, M.D., Links, J.R. and Zhang, Y.Z.: A new supersymmetric and exactly solvable model of correlated electrons. Phys. Rev. Lett. 74, 2768–2771 (1995)
9. Zhang, R.B.: Braid group representations arising from quantum supergroups with arbitrary q and link polynomials. J. Math. Phys. 33, 3918–3930 (1992); Gould, M.D., Tsohantjis, I. and Bracken, A.J.: Quantum supergroups and link polynomials. Rev. Math. Phys. 5, 533–549 (1993); Links, J.R., Gould, M.D. and Zhang, R.B.: Quantum supergroups, link polynomials and representations of the braid generators. Rev. Math. Phys. 5, 345–361 (1993)
10. Zhang, R.B.: Quantum supergroups and topological invariants of three-manifolds. Rev. Math. Phys. 7, 809–831 (1995); Zhang, R.B. and Lee, H.C.: Lickorish invariant and quantum OSP(1|2). Mod. Phys. Lett. A11, 2397 (1996)
11. Woronowicz, S.L.: Compact matrix pseudo groups. Commun. Math. Phys. 111, 613 (1987); Faddeev, L.D., Reshetikhin, N.Yu. and Takhtajan, L.A.: Quantization of Lie groups and Lie algebras. Leningrad Math. J. 1, 193 (1990)
12. Manin, Yu.I.: Quantum Groups and Noncommutative Geometry. Universite de Montreal, Centre de Recherches Mathematiques, Montreal, PQ (1988); Woronowicz, S.L.: Differential calculus on compact matrix pseudo groups (quantum groups). Commun. Math. Phys. 122, 125 (1989); Chu, C.S., Ho, P.M. and Zumino, B.: Some complex quantum manifolds and their geometry. Preprint
13. Lakshmibai, V. and Reshetikhin, N.Yu.: Quantum deformation of flag and Schubert schemes. C. R. Acad. Sci. Paris. Ser. I. Math. 313, No 3, 121–126 (1991)
14. Gould, M.D. and Zhang, R.B.: Classification of all star irreps of gl(m|n). J. Math. Phys. 31, 2552–2559 (1990)
15. Gould, M.D. and Scheunert, M.: Classification of finite dimensional unitary irreps for Uq[gl(m|n)]. J. Math. Phys. 36, 435–452 (1995)
16. Zhang, R.B.: Universal L-operator and invariants of the quantum supergroup Uq(gl(m|n)). J. Math. Phys. 33, 1970–1979 (1992)
17. Woronowicz, S.L.: Tannaka–Krein duality for compact matrix pseudo groups. Inv. Math. 93, 35 (1988)
18. Hewitt, E. and Ross, K.: Abstract harmonic analysis, vol. 2. 2nd edition, New York: Springer-Verlag, 1979
19. Takeuchi, M.: Some topics on GLq(n). J. Algebra 147, 379 (1992)
20. Manin, Y.I.: Gauge field theory and complex geometry. Berlin: Springer-Verlag, 1988
21. Penkov, I.B.: Borel–Weil–Bott Theory for Classical Lie Supergroups. Sovr. Probl. Math. 32, Moscow: VINITI, 1988, pp. 71–124
22. Penkov, I.B. and Serganova, V.: Cohomology of G/P for Classical Complex Lie Groups of G and Characters of Some Typical G-Modules. Annales de L’institut Fourier 39, 846–873 (1989)

Communicated by T. Miwa

Commun. Math. Phys. 195, 549 – 583 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Formal GNS Construction and States in Deformation Quantization

Martin Bordemann, Stefan Waldmann

Fakultät für Physik, Universität Freiburg, Hermann-Herder-Str. 3, 79104 Freiburg i. Br., Germany.

Received: 14 August 1996 / Accepted: 15 December 1997

Abstract: In this paper we develop a method of constructing Hilbert spaces and the representation of the formal algebra of quantum observables in deformation quantization which is an analog of the well-known GNS construction for complex $C^*$-algebras: in this approach the corresponding positive linear functionals (“states”) take their values not in the field of complex numbers, but in (a suitable extension field of) the field of formal complex Laurent series in the formal parameter. By using the algebraic and topological properties of these fields we prove that this construction makes sense and show in physical examples that standard representations such as the Bargmann and Schrödinger representation come out correctly, both formally and in a suitable convergence scheme. For certain Hamiltonian functions (contained in the Gel’fand ideal of the positive functional) a formal solution to the time-dependent Schrödinger equation is shown to exist. Moreover, we show that for every Kähler manifold equipped with the Fedosov star product of Wick type all the classical delta functionals are positive and give rise to some formal Bargmann representation of the deformed algebra.

1. Introduction

In the programme of deformation quantization introduced by Bayen, Flato, Fronsdal, Lichnerowicz and Sternheimer [6] the algebra of quantum observables is considered as an associative local formal deformation (a so-called star product $*$) of the associative commutative algebra of smooth complex-valued functions $C^\infty(M)$ on a given symplectic manifold $M$, such that the first order commutator equals $i\lambda$ times the Poisson bracket and such that complex conjugation is an antilinear involution of the deformed algebra. The latter is equal to $C^\infty(M)[[\lambda]]$, the space of formal power series in the deformation parameter $\lambda$ with coefficients in $C^\infty(M)$, and the associative noncommutative multiplication $*$ is bilinear with respect to the ring $\mathbb{C}[[\lambda]]$ of formal power series in $\lambda$ with complex coefficients. On one hand the rather difficult question of existence of these deformations for general symplectic manifolds has positively been answered (DeWilde-Lecomte 1983 [18]; Fedosov 1985 [21, 22]; Omori-Maeda-Yoshioka 1991 [41]). Moreover their classification (up to equivalence) in terms of formal power series with coefficients in the second de Rham cohomology class of the underlying symplectic manifold has recently been achieved ([8, 40, p. 204]). On the other hand star products have the, at first sight rather unpleasant, feature of lacking convergence uniform in the formal parameter which is due to the fact that they depend on the infinite jets of the two functions which in turn can be made as divergent as possible by Borel’s Theorem (see e.g. [49]; see also the article by Rubio 1984 [44] for the commutativity of local associative products on $C^\infty(M)$). Moreover, the deformed algebras do not seem to have obvious representations in some complex separable Hilbert space which is unsatisfactory from the physical point of view.

In the past decade, however, several people have attacked this problem: Cahen, Gutt, and Rawnsley [14, 15] start from the finite-dimensional operator algebras of geometric quantization in tensor powers of a very ample regular prequantum line bundle over a compact Kähler manifold and use coherent states (see [7, 43]) to first construct star products for the Berezin-Rawnsley symbols ([7, 43]) for each tensor power separately. In a second step an asymptotic expansion of these star products in the inverse tensor power is shown to define a local star product on the manifold where the formal parameter appears as a sort of interpolation of the inverse tensor powers. For compact Hermitian symmetric spaces they showed that the subspace of representative functions has a convergent star product. See also [9, 10] for an elementary algebraic approach in the particular case of complex projective space. Pflaum has studied star products and their convergence on suitable subspaces on cotangent bundles in his thesis [42].
In flat $\mathbb{R}^{2n}$ a non-formal analogue (“twisted products”) of star products using integral formulas can be defined on the Schwartz test function space (which is thereby made into an associative topological complex algebra) and extended to larger spaces by functional analytic techniques. The usual Weyl-Moyal star product is recovered as an asymptotic expansion in $\hbar$ (which can be introduced as a dilatation parameter), see [30, 34, 37] for more details. Moreover, Fedosov has shown that the deformed algebra allows a so-called asymptotic operator representation in a complex Hilbert space if and only if certain integrality properties of a formal index are satisfied which he defines as a formal analogue of the Atiyah-Singer index theorem and which is an invariant of the underlying symplectic manifold (see e. g. Fedosov’s book [22] for a detailed exposition).

The approach of this paper is motivated by the following consideration: in the theory of complex $C^*$-algebras (which form one of the main mathematical pillars of algebraic quantum field theory (see Haag’s book [29] for details) and, in a more algebraic context, of Connes’ noncommutative geometry (see Connes’ book [16])) the representing complex Hilbert spaces for a given complex $C^*$-algebra $A$ are constructed by means of the so-called GNS representation (see e. g. [13]): roughly speaking, any positive complex-valued linear functional on $A$ (i.e. which maps positive elements of $A$ to nonnegative real numbers) gives rise to a left ideal $I$ of $A$ (the Gel’fand ideal), and $A$ is canonically represented on the quotient space $A$ modulo $I$. Due to the positivity of the initial functional this quotient is equipped with a positive definite sesquilinear form and thus becomes a complex pre-Hilbert space whose completion yields the desired representation of the algebra. The initial functional can be regarded as a vacuum expectation value (functional) or a state on $A$ if $A$ has a unit element.
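The classical GNS recipe just recalled can be made concrete in a finite-dimensional toy case. The sketch below is an illustration under stated assumptions, not part of the paper: it takes $A$ to be the algebra of $2 \times 2$ matrices (with rational entries, so that the adjoint is just the transpose), chooses the hypothetical positive functional $\omega(a) = a_{00}$, and computes the Gram matrix $\omega(e_i^* e_j)$ on the basis of matrix units. The radical of this positive semidefinite form is the Gel’fand ideal, and its rank is the dimension of the quotient $A/I$ on which $A$ is represented. All function names are invented for the example.

```python
from fractions import Fraction
from itertools import product

N = 2  # size of the toy matrix algebra M_N; an assumption made for illustration


def unit(a, b):
    """Matrix unit E_ab with exact rational entries."""
    return [[Fraction(int((i, j) == (a, b))) for j in range(N)] for i in range(N)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(N)) for j in range(N)] for i in range(N)]


def adjoint(x):
    """Adjoint of a real matrix, i.e. its transpose (no conjugation needed here)."""
    return [[x[j][i] for j in range(N)] for i in range(N)]


def omega(x):
    """Hypothetical positive functional: the (0, 0) matrix entry."""
    return x[0][0]


def rank(rows):
    """Rank by exact Gauss-Jordan elimination over the rationals."""
    rows = [r[:] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


basis = [unit(a, b) for a, b in product(range(N), repeat=2)]
# Gram matrix of the sesquilinear form (x, y) -> omega(x^* y) on the basis.
gram = [[omega(matmul(adjoint(x), y)) for y in basis] for x in basis]

gns_dim = rank(gram)                # dimension of the quotient A / I
ideal_dim = len(basis) - gns_dim    # dimension of the Gel'fand ideal I

if __name__ == "__main__":
    print(gns_dim, ideal_dim)
```

In this toy case the Gel’fand ideal consists of the matrices whose first column vanishes, and the quotient is the two-dimensional space of first columns, that is, the defining representation of the matrix algebra.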
A natural question which has been brought to our attention by K. Fredenhagen is the following: can the GNS construction be extended to the associative algebras occurring in deformation quantization? At first sight, this seems almost impossible: firstly, for any complex-valued linear functional on $C^\infty(M)[[\lambda]]$ one would at once have an obvious convergence problem. Secondly, although the deformed algebra does have an antilinear involution it does not seem to share the remaining defining analytic properties of a complex $C^*$-algebra such as the existence of a $C^*$-norm or any obvious uniform structure in which it is complete. This picture, however, will change drastically, as we should like to show in the sequel, when the linear functionals take their values not only in the field of complex numbers but can have values in (some suitable field extension of) the ring $\mathbb{C}[[\lambda]]$ and are not only linear with respect to $\mathbb{C}$, but $\mathbb{C}[[\lambda]]$-linear: This makes sense since $C^\infty(M)[[\lambda]]$ is a $\mathbb{C}[[\lambda]]$-module in a natural way, a fact which lies at the heart of algebraic deformation theory (see e.g. [25, 19]). Thereby one stays in the formal category which avoids the convergence problems. Moreover, in the subring $\mathbb{R}[[\lambda]]$ of formal power series with real coefficients there is an algebraic sense of “asymptotic positivity”: a power series in $\mathbb{R}[[\lambda]]$ is defined to be positive (resp. negative) if and only if among its nonvanishing coefficients the one with lowest order is a positive (resp. negative) real number. This defines a ring ordering (which is preserved under sums and products). This structure allows us to speak of a positive linear functional $\omega$ on the deformed algebra if and only if $\omega(\bar f * f)$ is nonnegative in $\mathbb{R}[[\lambda]]$ for all $f \in C^\infty(M)[[\lambda]]$. For technical reasons it will be advantageous to replace the ring $\mathbb{R}[[\lambda]]$ by its quotient field $\mathbb{R}((\lambda))$ of formal Laurent series (with finite principal part) and likewise for $\mathbb{C}[[\lambda]]$ and $C^\infty(M)[[\lambda]]$. The traces considered in deformation quantization (see [17, p. 3]; [39, p. 229], [40, p. 155] where the Laurent field is denoted by $\mathbb{C}[\hbar^{-1}, \hbar]]$; and [22, p. 171]) are particular examples for $\mathbb{C}((\lambda))$-linear functionals.
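The “asymptotic positivity” ordering described above is easy to experiment with. The following is a minimal sketch, not taken from the paper, which represents a formal Laurent series with finitely many nonzero terms as a dictionary mapping the order in $\lambda$ to a real coefficient; the function names are hypothetical. It illustrates that the sign is decided by the lowest-order nonvanishing coefficient alone, so that $\lambda$ behaves as a positive infinitesimal.

```python
from fractions import Fraction


def sign(series):
    """Sign of a series {order: coefficient}: that of its lowest-order nonzero coefficient."""
    nonzero = {k: c for k, c in series.items() if c != 0}
    if not nonzero:
        return 0
    return 1 if nonzero[min(nonzero)] > 0 else -1


def add(a, b):
    out = dict(a)
    for k, c in b.items():
        out[k] = out.get(k, 0) + c
    return out


def neg(a):
    return {k: -c for k, c in a.items()}


def mul(a, b):
    out = {}
    for i, x in a.items():
        for j, y in b.items():
            out[i + j] = out.get(i + j, 0) + x * y
    return out


def less_than(a, b):
    """a < b in the ring ordering if and only if b - a is positive."""
    return sign(add(b, neg(a))) == 1


lam = {1: 1}                 # the formal parameter lambda
a = {0: 1, 1: -1000}         # 1 - 1000*lambda
b = {-1: -1, 0: 10**6}       # -1/lambda + 10^6

if __name__ == "__main__":
    print(sign(a), sign(b))                         # prints: 1 -1
    print(less_than(lam, {0: Fraction(1, 1000)}))   # prints: True
```

Here $1 - 1000\lambda$ is positive although its first-order coefficient is large and negative, $-\lambda^{-1} + 10^6$ is negative, and $\lambda$ is smaller than the real number $1/1000$ (indeed smaller than any positive real number), which is the sense in which the ordering is asymptotic.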
By this simple replacement of the complex numbers by (field extensions of) formal power series we get the following principal results:

– With the above notion of positive linear functionals with values in (a field extension of) $\mathbb{C}((\lambda))$ we can imitate the algebraic part of the classical GNS construction for the algebras in deformation quantization and arrive at a representation of these algebras in a pre-Hilbert space over (a suitable field extension of) $\mathbb{C}((\lambda))$ (Sect. 3).

– The above “asymptotic ordering” of the field defines a metric topology on the field which can be used to define a norm on the above pre-Hilbert space with values in (a suitable extension of) $\mathbb{R}((\lambda))$. This norm serves to define a Cauchy completion of the pre-Hilbert space (Appendix A, B).

– Apart from this algebraic construction this “formal” GNS construction gives rise to the geometric interpretation of these positive linear functionals as a deformation (quantization) of classical positive linear functionals: these classical “states” in classical mechanics are often given by certain measures (e.g. the Boltzmann distribution) having support on certain submanifolds of $M$. In this paper we shall deal with the simplest examples: a) Dirac measures (whose support is one point in $M$) for which we shall show in the case of the standard Wick star product in $\mathbb{C}^n$ that the above formal GNS construction gives rise to the formal Wick quantization with the correct formal Bargmann Hilbert space (Sect. 6), and b) the integral over configuration space $\mathbb{R}^n$ in $\mathbb{R}^{2n}$ for the standard star product of Weyl type which will give rise to the usual symmetrization rule and the correct formal $L^2$-space of (certain) square-integrable functions of Schrödinger quantization on configuration space (Sect. 8).
– Moreover, we show that for the Fedosov star product of Wick type on an arbitrary Kähler manifold (see [11] for details) the Dirac delta functionals are positive functionals on the deformed algebra which allows a representation of Bargmann type. The geometry of the Kähler manifold enters this representation by mapping the function $f \in C^\infty(M)((\lambda))$ on its Fedosov-Taylor series at the support of the delta functional (Sect. 7).

– Finally, in all the above examples we could also find a solution to the problem of convergence of formal power series: either by using a given formal Hilbert base (in the Kähler examples) or a set of suitable formal linear functionals (in the case of the formal Schrödinger representation) we define a complex Hilbert space out of the formal GNS Hilbert space for the value $\lambda = \hbar$ in the following way: we first single out the subset H($\hbar$) of all those elements $\psi$ of the formal Hilbert space which satisfy an infinite number of suitable convergence conditions for all those formal series with complex coefficients which arise when all the scalar products of $\psi$ with the elements of the Hilbert base or the values of all the given linear functionals on $\psi$ are considered. This turns out to be a complex vector space and the quotient space H($\hbar$) := H($\hbar$)/N($\hbar$) becomes a complex Hilbert space where N($\hbar$) is the subspace of those elements of H($\hbar$) for which all the above series vanish when $\lambda$ is replaced by the real number $\hbar$. A certain complex subspace of the formal algebra will respect this construction and can thus be represented on (a subspace of) H($\hbar$).

These examples seem to support the point of view that the convergence problem in deformation quantization should be treated only after a formal GNS representation has been chosen (by the choice of a suitable positive linear functional), i.e. one should “stay in the formal category as long as possible”.
All these exemplary results seem to suggest that the concept of formal GNS construction sketched above may be a suitable candidate to formulate reasonable “Hilbert space” representations of the observable algebra within the framework of deformation quantization in a uniform manner: the observable algebra is the primary object and the representing GNS (pre-) Hilbert spaces are subordinate objects parametrized by (often geometrically motivated) positive linear functionals (as opposed to other approaches to quantization where the observable algebra is defined as an operator algebra on a given Hilbert space). The paper is organized as follows: Sect. 2 recalls the necessary notation and definitions in deformation quantization, introduces the use of the field of formal Laurent series and gives some first elementary examples and counterexamples of formal positive functionals. The main flaw of the field of formal Laurent series is the fact that it is not algebraically closed. Section 3 is a bit technical: here we first describe the properties of general ordered fields and define pre-Hilbert spaces over such fields. This enables us to define the GNS pre-construction for associative algebras with involution on preHilbert spaces. Further properties as topologies and absolute values of ordered fields including their algebraic closures and Cauchy completions are discussed in Appendix A. In Appendix B we finally describe Hilbert spaces and the general GNS construction. The ideas of the proofs of many of these results can be found in algebra text books, or in the literature on p-adic functional analysis (see e.g. [38]), or in the works on (nonclassical) Hilbert spaces (see e.g. [28, 36, 27]). In Sect. 4 we apply the results of Sect. 3 and Appendices A and B to the concrete fields we need for the GNS construction in deformation quantization, i.e. the algebraic closure of the Laurent field, the so-called field of Newton–Puiseux series, and its Cauchy completion. 
The GNS construction is applied to the formal algebra $C^\infty(M)\langle\langle\lambda\rangle\rangle$ of completed Newton–Puiseux series with coefficients in $C^\infty(M)$ which contains the original deformed algebra $C^\infty(M)[[\lambda]]$ as a subring. In Sect. 5 we prove that a formal solution to the time-dependent Schrödinger equation exists as a well-defined curve of vectors in the formal GNS Hilbert space in case the classical Hamiltonian does not contain any negative powers of $\lambda$ and is contained in the Gel’fand ideal of the considered positive linear functional $\omega$. Section 6 contains the above-mentioned example of a GNS construction by means of the delta functional with support at the origin for the Wick star product in $\mathbb{C}^n$. In addition we obtain the correct spectrum of the harmonic oscillator by an entirely formal deduction. Section 7 generalizes this to the delta functional of an arbitrary point of an arbitrary Kähler manifold equipped with the Fedosov star product of Wick type. In Sect. 8 the above-mentioned example of the formal Schrödinger representation in $\mathbb{R}^n$ is dealt with. Section 9 is an outlook on some open problems related to this approach.

2. Motivation and Basic Concepts

In this section we shall give a first heuristic motivation how one can define positive linear functionals for a star product algebra without leaving the formal description. First we have to introduce some notation: Let $M$ be a $2n$-dimensional symplectic manifold. The observable algebra in deformation quantization is given by the formal power series $C^\infty(M)[[\lambda]]$ in the formal parameter $\lambda$ with coefficients in the smooth complex-valued functions $C^\infty(M)$. Then the addition in $C^\infty(M)[[\lambda]]$ is the pointwise addition order by order in $\lambda$ of the functions and the multiplication is a star product for $M$. A star product of two functions $f, g \in C^\infty(M)$ is a formal power series

$$f * g = \sum_{r=0}^{\infty} \lambda^r M_r(f, g)$$

with bilinear operators $M_r : C^\infty(M) \times C^\infty(M) \to C^\infty(M)$ such that $*$ extends to a $\mathbb{C}[[\lambda]]$-bilinear associative product for $C^\infty(M)[[\lambda]]$ with the following properties: The lowest order is the pointwise multiplication, i.e. $M_0(f, g) = fg$, and the first order commutator is given by $i\lambda$ times the Poisson bracket, i.e. $M_1(f, g) - M_1(g, f) = i\{f, g\}$. The constant function $1$ is the unit element with respect to $*$, i.e. $f * 1 = 1 * f = f$. We assume for simplicity that all the operators $M_r$ are not only local but bidifferential operators. We demand in addition that the complex conjugation of elements in $C^\infty(M)[[\lambda]]$ (where $\bar\lambda := \lambda$) is an antilinear algebra involution: $\overline{f * g} = \bar g * \bar f$. Then $C^\infty(M)[[\lambda]]$ together with a star product is not only a $^*$-algebra over the field $\mathbb{C}$ but also a $\mathbb{C}[[\lambda]]$-module. Note that $\lambda$ is to be identified directly with $\hbar$. It is now an easy step to pass from the ring of formal power series to the corresponding quotient field which turns out to be the field of formal Laurent series. The field of formal Laurent series with real (and analogously complex) coefficients is defined by

$$\mathbb{R}((\lambda)) := \left\{ \sum_{r=-N}^{\infty} \lambda^r a_r \;\middle|\; a_r \in \mathbb{R},\ N \in \mathbb{Z} \right\}. \tag{1}$$

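Then $\mathbb{R}((\lambda))$ has the structure of a field if we define the addition order by order in $\lambda$ and the multiplication by

$$\left( \sum_{r=-N}^{\infty} \lambda^r a_r \right) \left( \sum_{s=-M}^{\infty} \lambda^s b_s \right) := \sum_{t=-N-M}^{\infty} \lambda^t \sum_{\substack{r+s=t \\ r \ge -N,\ s \ge -M}} a_r b_s. \tag{2}$$
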
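Then one easily shows that $\mathbb{R}((\lambda))$ and $\mathbb{C}((\lambda))$ are fields and with $\bar\lambda := \lambda$ we have the following natural inclusions: $\mathbb{R} \subset \mathbb{R}[[\lambda]] \subset \mathbb{R}((\lambda))$, $\mathbb{C} \subset \mathbb{C}[[\lambda]] \subset \mathbb{C}((\lambda))$ and $\mathbb{R}((\lambda)) \subset \mathbb{C}((\lambda))$. Moreover $\mathbb{R}((\lambda))$ is the quotient field of the ring $\mathbb{R}[[\lambda]]$ and now $\lambda$ has an inverse, namely $\lambda^{-1}$.

Remark. From the physical point of view it will be necessary to consider negative powers of Planck’s constant anyway since $\hbar$ will appear in the denominator in important situations as for example in the energy eigenvalues of the hydrogen atom.

More generally, if $V$ is a $K$-vector space ($K = \mathbb{R}, \mathbb{C}$) then we define the corresponding vector space of formal Laurent series $V((\lambda))$ by

$$V((\lambda)) := \left\{ \sum_{r=-N}^{\infty} \lambda^r v_r \;\middle|\; v_r \in V,\ N \in \mathbb{Z} \right\} \tag{3}$$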

and notice that $V((\lambda))$ is a $K((\lambda))$-vector space in the obvious way. Moreover the field $\mathbb{R}((\lambda))$ is known to be an ordered field with a unique ordering (e.g. [47, p. 73]):

Lemma 1. $\mathbb{R}((\lambda))$ is an ordered field with a unique non-archimedian ordering relation such that $\lambda > 0$. The positive elements are given by

$$\sum_{r=-N}^{\infty} \lambda^r a_r > 0 \iff a_{-N} > 0, \tag{4}$$

where $a_{-N}$ denotes the lowest non-vanishing coefficient.
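The multiplication rule (2) and the ordering (4) can be tried out on truncated series. The following is only a minimal sketch: the class name `Laurent`, the helper `const`, and the restriction to finitely many rational coefficients are assumptions made for illustration and are not part of the paper.

```python
from fractions import Fraction


class Laurent:
    """Finite stand-in for an element of R((lambda)): {exponent: coefficient}."""

    def __init__(self, coeffs):
        # Drop vanishing coefficients so that min(self.c) is the order o(a).
        self.c = {int(r): Fraction(a) for r, a in coeffs.items() if a != 0}

    def __add__(self, other):
        out = dict(self.c)
        for r, a in other.c.items():
            out[r] = out.get(r, Fraction(0)) + a
        return Laurent(out)

    def __neg__(self):
        return Laurent({r: -a for r, a in self.c.items()})

    def __sub__(self, other):
        return self + (-other)

    def __mul__(self, other):
        # Rule (2): the coefficient of lambda^t is the sum of a_r * b_s over r + s = t.
        out = {}
        for r, a in self.c.items():
            for s, b in other.c.items():
                out[r + s] = out.get(r + s, Fraction(0)) + a * b
        return Laurent(out)

    def __eq__(self, other):
        return self.c == other.c

    def sign(self):
        # Rule (4): the sign of the lowest non-vanishing coefficient decides.
        if not self.c:
            return 0
        lead = self.c[min(self.c)]
        return 1 if lead > 0 else -1

    def __lt__(self, other):
        return (other - self).sign() > 0


def const(x):
    """Embed a rational number as a constant series."""
    return Laurent({0: x})


lam = Laurent({1: 1})       # lambda
lam_inv = Laurent({-1: 1})  # lambda^{-1}

# lambda is invertible in the field of Laurent series.
assert lam * lam_inv == const(1)
# Non-archimedian behaviour: no integer multiple of lambda exceeds 1,
# and no integer exceeds lambda^{-1}.
assert all(const(n) * lam < const(1) for n in range(1, 1000))
assert all(const(n) < lam_inv for n in range(1, 1000))
```

The two final assertions only test finitely many integers, so they illustrate the non-archimedian ordering rather than prove it.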

A symbolic picture of this non-archimedian ordering is given by

$$\cdots < \lambda^{-1}\mathbb{R}^- < \mathbb{R}^- < \lambda\mathbb{R}^- < \cdots < 0 < \cdots < \lambda\mathbb{R}^+ < \mathbb{R}^+ < \lambda^{-1}\mathbb{R}^+ < \cdots.$$

Apart from this ordering relation the fields $\mathbb{R}((\lambda))$ and $\mathbb{C}((\lambda))$ allow an absolute value $\varphi : \mathbb{C}((\lambda)) \to \mathbb{R}$ defined for $a \in \mathbb{C}((\lambda))$ by $\varphi(a) := 2^{-o(a)}$, where $o(a) \in \mathbb{Z}$ is the order of the lowest non-vanishing term in the formal Laurent series $a$. Both structures, the order and the absolute value, can be used to define topologies on the field $\mathbb{R}((\lambda))$ (see Appendix A for definitions) which will turn out to be the same (see Sect. 4). An important observation will be that in $\mathbb{R}((\lambda))$ every Cauchy sequence defined with respect to the order converges in $\mathbb{R}((\lambda))$.

Using the fields $\mathbb{R}((\lambda))$ and $\mathbb{C}((\lambda))$ instead of $\mathbb{R}$ and $\mathbb{C}$ we can generalize the $\mathbb{C}$-algebra of observables $C^\infty(M)[[\lambda]]$ in the obvious way to a $\mathbb{C}((\lambda))$-algebra $C^\infty(M)((\lambda))$, where we extend the star product to a $\mathbb{C}((\lambda))$-bilinear product of $C^\infty(M)((\lambda))$. Now we can consider $\mathbb{C}((\lambda))$-linear functionals $\omega : C^\infty(M)((\lambda)) \to \mathbb{C}((\lambda))$ and define positive linear functionals using the complex conjugation as antilinear algebra involution and the ordering of $\mathbb{R}((\lambda))$ completely analogously to the case of $C^*$-algebras over $\mathbb{C}$ [13]:

Definition 1. A $\mathbb{C}((\lambda))$-linear functional $\omega : C^\infty(M)((\lambda)) \to \mathbb{C}((\lambda))$ is called positive iff

$$\omega(\bar f * f) \ge 0 \qquad \forall f \in C^\infty(M)((\lambda))$$

and a state iff $\omega$ is positive and $\omega(1) = 1$.

We will now give simple examples for positive linear functionals. Using the locality of the star product we can prove the following lemma:

Lemma 2. Let $0 \le \varrho \in C^0(M)$ with compact support. Then $\omega_\varrho$ defined by

$$\omega_\varrho(f) := \int_M f \varrho\, \Omega \qquad (\Omega \text{ symplectic volume form})$$

is a positive linear functional of $C^\infty(M)((\lambda))$ with respect to any star product and $\omega_\varrho(\bar f * f) = 0 \iff f\varrho \equiv 0$.

A counterexample is given by the delta-functionals in $\mathbb{R}^2$ and the Weyl-Moyal star product (51). We easily compute that the evaluation functional $\delta_{(q_0,p_0)}$ at $(q_0, p_0) \in \mathbb{R}^2$ is not positive since

$$\delta_{(q_0,p_0)}\Big( \big((q - q_0)^2 + (p - p_0)^2\big) * \big((q - q_0)^2 + (p - p_0)^2\big) \Big) = -\frac{1}{2}\lambda^2 < 0$$

in spite of the fact that the delta functional is a positive functional with respect to the pointwise product and hence a classical state. Now we can construct a representation of the $\mathbb{C}((\lambda))$-algebra $C^\infty(M)((\lambda))$ using a given positive linear functional in the same way as the usual GNS representation of a $C^*$-algebra is constructed. But this will be done in a more general context in the next section and then it will turn out that there is even a better choice than the formal Laurent series since the field $\mathbb{R}((\lambda))$ is not real closed and $\mathbb{C}((\lambda))$ is not algebraically closed since for example the positive element $\lambda$ is no square.

3. Ordered Fields and General GNS Pre-Construction

In this rather technical section we shall present some general properties of ordered fields and vector spaces and algebras over such fields. All statements can in principle be proved in a fashion very similar to the case of real or complex numbers, and for the proofs only the ordering axioms will be used. In order to make this exposition not too long we shall omit most of the proofs. As general references we have used the textbooks by Jacobson [31] (in particular Chapter 5) and [32] (Chapters 9, 11), by Ruiz [47], Kelley [35], Rudin [45], Yosida [50], Bratteli-Robinson [13], and Haag [29].

Definition 2 ([31, Def. 5.1]). An ordered field $(\mathsf{R}, P)$ is a field $\mathsf{R}$ with a subset $P$ of positive elements of $\mathsf{R}$ such that $0 \notin P$ and if $a \in \mathsf{R}$ then either $a = 0$, $a \in P$ or $-a \in P$ and if $a, b \in P$ then $a + b \in P$ and $ab \in P$. Then an ordering relation is defined by $a < b$ iff $b - a \in P$, and the symbols $>$, $\le$ and $\ge$ will be used as in the case $\mathsf{R} = \mathbb{R}$.
The elements with $a < 0$ will be called negative. The set of positive elements will also be denoted by $\mathsf{R}^+$ and the negative elements by $\mathsf{R}^-$. Clearly, every square $a^2$ is positive or $0$ and hence $-1 < 0 < 1$. Since $n1 = 1 + \cdots + 1$ is a sum of positive numbers for all $n \in \mathbb{N}$ we have $n1 > 0$ and hence $\mathsf{R}$ has characteristic zero. We define $|a| := a$ if $a \ge 0$ and $|a| := -a$ if $a < 0$. Then $|\cdot|$ clearly satisfies

$$|a| \ge 0 \ \text{ and } \ |a| = 0 \iff a = 0, \qquad |ab| = |a||b|, \qquad \big||a| - |b|\big| \le |a + b| \le |a| + |b|. \tag{5}$$

The ordering relation $<$ is called archimedian iff there is for any $0 < a$ and $b \in \mathsf{R}$ a natural number $n \in \mathbb{N}$ such that $na > b$. Otherwise the ordering will be called non-archimedian.

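Now we fix once and for all the quadratic field extension $\mathsf{C} := \mathsf{R}(i)\ (= \mathsf{R} \oplus i\mathsf{R})$ of an ordered field $\mathsf{R}$, where $i^2 := -1$. In $\mathsf{C}$ we define complex conjugation as usual by $a + ib \in \mathsf{C} \mapsto \overline{a + ib} := a - ib$, where $a, b \in \mathsf{R}$. Then the complex conjugation is an involutive field automorphism. The elements of the subfield $\mathsf{R}$ of $\mathsf{C}$ will be called real and are characterised as usual by $a \in \mathsf{R} \subset \mathsf{C} \iff a = \bar a$. Clearly $\bar a a \in \mathsf{R}$, $\bar a a \ge 0$ and $\bar a a = 0 \iff a = 0$ for any $a \in \mathsf{C}$ and we define $|a|^2 := \bar a a$.

Now we transfer definitions and some simple algebraic results from the theory of complex vector spaces with Hermitian products to the general case of a vector space over an arbitrary ordered field $\mathsf{R}$ and its quadratic field extension $\mathsf{C}$.

Definition 3. Let $\mathsf{C} = \mathsf{R}(i)$ be the quadratic field extension of an ordered field $\mathsf{R}$ and $H_{\mathsf{C}}$ a $\mathsf{C}$-vector space. A map $\langle \cdot, \cdot \rangle : H_{\mathsf{C}} \times H_{\mathsf{C}} \to \mathsf{C}$ is called a Hermitian product iff for all $\phi, \psi, \chi \in H_{\mathsf{C}}$ and for all $a, b \in \mathsf{C}$,
i) $\langle \cdot, \cdot \rangle$ is antilinear in the first argument, i.e. $\langle a\phi + b\psi, \chi \rangle = \bar a \langle \phi, \chi \rangle + \bar b \langle \psi, \chi \rangle$,
ii) $\langle \phi, \psi \rangle = \overline{\langle \psi, \phi \rangle}$,
iii) $\langle \phi, \phi \rangle \ge 0$ and $\langle \phi, \phi \rangle = 0 \iff \phi = 0$.
A $\mathsf{C}$-vector space with a Hermitian product $\langle \cdot, \cdot \rangle$ is called a pre-Hilbert space. A linear map $U : H_{\mathsf{C}} \to K_{\mathsf{C}}$ from one pre-Hilbert space over $\mathsf{C}$ to another is called an isometry iff $\langle U\phi, U\psi \rangle = \langle \phi, \psi \rangle$ for all $\phi, \psi \in H_{\mathsf{C}}$ and unitary iff $U$ is a bijective isometry.

For a pre-Hilbert space we have the Cauchy-Schwarz inequality for the Hermitian product which is a simple imitation of the standard arguments over the complex numbers (cf. e.g. [45, p. 77]):

Lemma 3 (Cauchy-Schwarz inequality). Let $H_{\mathsf{C}}$ be a pre-Hilbert space with Hermitian product $\langle \cdot, \cdot \rangle$ and $\phi, \psi \in H_{\mathsf{C}}$. Then

$$\langle \phi, \psi \rangle \langle \psi, \phi \rangle \le \langle \phi, \phi \rangle \langle \psi, \psi \rangle \tag{6}$$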

with equality if and only if $\psi$ and $\phi$ are linearly dependent.

Furthermore, for a linear operator from $H_{\mathsf{C}}$ into $H_{\mathsf{C}}$ we want to define an adjoint operator. In general it is rather complicated to see whether an adjoint operator exists or not. Without further assumptions we can only state the following definition for everywhere defined linear operators:

Definition 4 (Adjoint operator). Let $H_{\mathsf{C}}$ be a pre-Hilbert space and $A : H_{\mathsf{C}} \to H_{\mathsf{C}}$ a linear map. Then a linear map $B : H_{\mathsf{C}} \to H_{\mathsf{C}}$ is called an adjoint operator to $A$ iff $\langle \phi, A\psi \rangle = \langle B\phi, \psi \rangle$ for all $\phi, \psi \in H_{\mathsf{C}}$. In this case we write $B = A^*$.

Lemma 4. Let $H_{\mathsf{C}}$ be a pre-Hilbert space and $A, B : H_{\mathsf{C}} \to H_{\mathsf{C}}$ linear operators. If $A^*$ exists then it is unique and if $A^*$ and $B^*$ exist then $(aA + bB)^*$, $(A^*)^*$ and $(AB)^*$ exist for $a, b \in \mathsf{C}$ and $(aA + bB)^* = \bar a A^* + \bar b B^*$, $(A^*)^* = A$ and $(AB)^* = B^* A^*$.

Let $\mathcal{A}$ be an associative and not necessarily commutative algebra over the quadratic field extension $\mathsf{C} = \mathsf{R}(i)$ of an ordered field $\mathsf{R}$. We shall only consider algebras with an antilinear involution $^*$ compatible with the complex conjugation in $\mathsf{C}$. This means that there is a map $^* : \mathcal{A} \to \mathcal{A}$ such that for all $A, B \in \mathcal{A}$ and $a \in \mathsf{C}$,

$$(A + B)^* = A^* + B^*, \qquad (aA)^* = \bar a A^*, \qquad (AB)^* = B^* A^*, \qquad (A^*)^* = A. \tag{7}$$

If the algebra $\mathcal{A}$ has a unit element $1$ then necessarily $1^* = 1$. By means of the algebra involution $^*$ and the ordering relation in $\mathsf{R}$ we can define positive linear functionals analogously to the case of $C^*$-algebras over the complex numbers $\mathbb{C}$ [13, 29]:

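Definition 5. Let $\mathcal{A}$ be a $\mathsf{C}$-algebra with involution $^*$ and $\omega : \mathcal{A} \to \mathsf{C}$ a linear functional where $\mathsf{C} = \mathsf{R}(i)$ and $\mathsf{R}$ is an ordered field. Then $\omega$ is called positive iff for all $A \in \mathcal{A}$, $\omega(A^* A) \ge 0$. If in addition $\mathcal{A}$ has a unit element $1$ then $\omega$ is called a state iff $\omega$ is positive and $\omega(1) = 1$.

For a positive linear functional we can prove the Cauchy-Schwarz inequality:

Lemma 5 (Cauchy-Schwarz inequality). Let $\mathcal{A}$ be a $\mathsf{C}$-algebra with involution $^*$ and $\omega : \mathcal{A} \to \mathsf{C}$ a positive linear functional. Then we have for all $A, B \in \mathcal{A}$:

$$\overline{\omega(A^* B)} = \omega(B^* A), \tag{8}$$

$$\omega(A^* B)\,\overline{\omega(A^* B)} \le \omega(A^* A)\,\omega(B^* B). \tag{9}$$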

Moreover $\omega(1) = 0$ implies $\omega \equiv 0$ and $\omega(A^*) = \overline{\omega(A)}$ and if $\omega(A^* A) = 0$ then $\omega(A^* B) = 0$ for all $B \in \mathcal{A}$.

For an algebra $\mathcal{A}$ with involution $^*$ over the quadratic field extension $\mathsf{C}$ of an ordered field $\mathsf{R}$ we are able to transfer the well-known GNS construction of representations for $C^*$-algebras: We construct a representation of $\mathcal{A}$ in a pre-Hilbert space over $\mathsf{C}$. Firstly, we shall only deal with the algebraic properties of this construction and examine analytic properties in Appendix B. We are mainly using the notation of [13]. Let $\omega : \mathcal{A} \to \mathsf{C}$ be a positive linear functional. Then we consider the following subspace of $\mathcal{A}$:

$$J_\omega := \{ A \in \mathcal{A} \mid \omega(A^* A) = 0 \}. \tag{10}$$

This subspace $J_\omega$ is called the Gel’fand ideal of $\omega$ and by means of Lemma 5 it is easily proved that $J_\omega$ is indeed a left ideal of $\mathcal{A}$. In the next step one considers the quotient vector space

$$H_\omega := \mathcal{A}/J_\omega, \tag{11}$$

where the equivalence classes in $H_\omega$ are denoted by

$$\psi_A := \{ A' \in \mathcal{A} \mid A' = A + I,\ I \in J_\omega \}. \tag{12}$$

On the quotient space $H_\omega$ we can define a Hermitian product by

$$\langle \psi_A, \psi_B \rangle := \omega(A^* B) \tag{13}$$

which is well-defined since $J_\omega$ is a left ideal. Furthermore this definition leads indeed to a non-degenerate Hermitian product for $H_\omega$ which will make $H_\omega$ a pre-Hilbert space over $\mathsf{C}$. In a third step one defines a representation $\pi_\omega$ of $\mathcal{A}$ on $H_\omega$ by

$$\pi_\omega(A)\psi_B := \psi_{AB}. \tag{14}$$

To prove that this is well-defined we need once more the fact that $J_\omega$ is a left ideal. Then we notice that $\pi_\omega$ is a $^*$-representation: $\pi_\omega(AB) = \pi_\omega(A)\pi_\omega(B)$, $\pi_\omega$ is linear and $\pi_\omega(A^*) = \pi_\omega(A)^*$. In this case it is checked directly that $\pi_\omega(A^*)$ is the adjoint operator of $\pi_\omega(A)$ in the sense of Definition 4. If $\mathcal{A}$ has a unit element $1$ and $\omega$ is a state then the representation $\pi_\omega$ is cyclic with the cyclic (vacuum) vector $\psi_1$ since every vector $\psi_A \in H_{\mathsf{C}}$ can be written as $\psi_A = \pi_\omega(A)\psi_1$ and we have

$$\omega(A) = \langle \psi_1, \pi_\omega(A)\psi_1 \rangle. \tag{15}$$

In this case we show analogously to the case of $C^*$-algebras that the representation $\pi_\omega$ is unique up to unitary equivalence: If $(H', \pi', \psi')$ is another cyclic $^*$-representation with cyclic unit vector $\psi'$ such that $\omega(A) = \langle \psi', \pi'(A)\psi' \rangle$ for all $A \in \mathcal{A}$ then we have a unitary map $U : H_\omega \to H'$ such that $U^{-1}\pi'(A)U = \pi_\omega(A)$ and $U\psi_1 = \psi'$. We shall summarize this in the following proposition:

Proposition 1 (General GNS pre-construction). Let $\mathcal{A}$ be a $\mathsf{C}$-algebra with involution $^*$ where $\mathsf{C} = \mathsf{R}(i)$ is the quadratic field extension of an ordered field $\mathsf{R}$. For any positive linear functional $\omega$ there exists a $^*$-representation $\pi_\omega$ on a pre-Hilbert space $H_\omega$ as constructed above which is called the GNS representation on $H_\omega$. If $\mathcal{A}$ in addition has a unit element $1$ and $\omega$ is a state then this representation is cyclic and we have $\omega(A) = \langle \psi_1, \pi_\omega(A)\psi_1 \rangle$ and this property defines this representation up to unitary equivalence.

The following obvious generalization will be very useful if the positive linear functional is only defined on a proper ideal of $\mathcal{A}$, a situation which will occur in Sect. 8.

Corollary 1. Let $\mathcal{A}$ be a $\mathsf{C}$-algebra with involution $^*$ and $\mathcal{B} \subset \mathcal{A}$ a two-sided ideal of $\mathcal{A}$ with $\mathcal{B} = \mathcal{B}^*$. Let $\omega : \mathcal{B} \to \mathsf{C}$ be a positive linear functional and denote by $\pi_\omega$ the GNS representation of $\mathcal{B}$ on $H_\omega = \mathcal{B}/J_\omega$, where $J_\omega \subset \mathcal{B}$ is the Gel’fand ideal of $\omega$. Then $J_\omega$ is a left ideal of $\mathcal{A}$ and $\pi_\omega$ can be extended canonically to a representation of $\mathcal{A}$ on $H_\omega$.

Proof. Let $a \in \mathcal{A}$ and $b \in J_\omega$. Since $\mathcal{B}$ is an ideal it follows that $ab$ and $a^* ab$ are contained in $\mathcal{B}$. Moreover

$$\omega((ab)^* ab)\,\overline{\omega((ab)^* ab)} = \omega(b^*(a^* ab))\,\overline{\omega(b^*(a^* ab))} \le \omega(b^* b)\,\omega((a^* ab)^* a^* ab) = 0$$

thanks to the Cauchy-Schwarz inequality, which implies that $ab \in J_\omega$. The rest of the corollary follows easily.
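The three steps of the GNS pre-construction, (10) to (15), can be checked by hand in a very small case. The sketch below is an illustration only and makes several assumptions that are not in the paper: it works over the ordinary complex numbers instead of a general ordered field, takes the algebra of $2 \times 2$ matrices, and uses the state $\omega(A) = A_{11}$. In this toy case the Gel’fand ideal consists of the matrices with vanishing first column, so a class $\psi_A$ can be identified with the first column of $A$. All function names are hypothetical.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def adjoint(A):
    """Involution A -> A^*: conjugate transpose."""
    return tuple(
        tuple(complex(A[j][i]).conjugate() for j in range(2)) for i in range(2)
    )


def omega(A):
    """Assumed state: the upper-left matrix entry, so omega(1) = 1."""
    return complex(A[0][0])


def psi(A):
    """Class of A modulo the Gel'fand ideal, identified with the first column of A."""
    return (complex(A[0][0]), complex(A[1][0]))


def inner(u, v):
    """Hermitian product, antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


def rep(A, v):
    """pi_omega(A) acting on a class: ordinary matrix-vector multiplication."""
    return tuple(sum(A[i][k] * v[k] for k in range(2)) for i in range(2))


ONE = ((1, 0), (0, 1))
A = ((1 + 2j, 3), (0, 1j))
B = ((2, 1j), (1 - 1j, 4))
N = ((0, 5), (0, 2 - 1j))  # first column vanishes: an element of J_omega

# (10): N lies in the Gel'fand ideal, and the ideal is stable under left multiplication.
assert omega(matmul(adjoint(N), N)) == 0
assert psi(matmul(A, N)) == (0, 0)
# (13): the Hermitian product on classes reproduces omega(A^* B).
assert inner(psi(A), psi(B)) == omega(matmul(adjoint(A), B))
# (14): pi_omega(A) psi_B = psi_{AB}.
assert rep(A, psi(B)) == psi(matmul(A, B))
# (15): omega is recovered from the cyclic vector psi_1.
assert omega(A) == inner(psi(ONE), rep(A, psi(ONE)))
```

The entries are small Gaussian integers, so the floating-point comparisons are exact here; for general entries a tolerance would be needed.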

4. The GNS Construction in Deformation Quantization

Since the Laurent field $\mathbb{R}((\lambda))$ (Definition (1)) is not real closed we are searching for a field extension of $\mathbb{R}((\lambda))$ such that this extension will be real closed and Cauchy complete with respect to the ordering relation since this is the important property we need if we want to consider Hilbert spaces as in Appendix B. First we define the following systems of subsets of the rational numbers $\mathbb{Q}$:

Definition 6. $S \subset \mathbb{Q}$ is called NP-admissible iff $S$ has a smallest element and there is a positive integer $N$ such that $N \cdot S \subset \mathbb{Z}$ and $S$ is called CNP-admissible iff $S$ has a smallest element and $S \cap [i, j]$ is finite for any $i, j \in \mathbb{Q}$. Then we define $\mathcal{S}_{\mathrm{CNP}} := \{ S \subset \mathbb{Q} \mid S \text{ is CNP-admissible} \}$ and analogously $\mathcal{S}_{\mathrm{NP}}$.

Let $V$ be a vector space over $\mathbb{R}$ (or $\mathbb{C}$) and $f : \mathbb{Q} \to V$. Then we define the support of $f$ by $\operatorname{supp} f := \{ q \in \mathbb{Q} \mid f(q) \ne 0 \}$.

Definition 7. Let $V$ be a vector space over $\mathbb{R}$ (or $\mathbb{C}$). Then we define the formal Newton–Puiseux series (NP series) with coefficients in $V$ by

$$V\langle\langle\lambda^*\rangle\rangle := \{ f : \mathbb{Q} \to V \mid \operatorname{supp} f \in \mathcal{S}_{\mathrm{NP}} \} \tag{16}$$

and the completed Newton–Puiseux series (CNP series) with coefficients in $V$ by

$$V\langle\langle\lambda\rangle\rangle := \{ f : \mathbb{Q} \to V \mid \operatorname{supp} f \in \mathcal{S}_{\mathrm{CNP}} \}. \tag{17}$$
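To see how the two admissibility conditions of Definition 6 differ, one may look at a support whose denominators are unbounded but which has only finitely many points in every bounded interval. The set $S = \{k - 1/k \mid k = 1, 2, \ldots\}$ used below is a hypothetical example chosen for this sketch and does not appear in the paper; the helper names are assumptions as well. Since only finitely many exponents can be inspected, the code illustrates the definitions and proves nothing.

```python
from fractions import Fraction
from math import gcd


def support(K):
    """First K exponents of the example support S = {k - 1/k : k >= 1}."""
    return [Fraction(k) - Fraction(1, k) for k in range(1, K + 1)]


def common_denominator(S):
    """Smallest positive integer N with N * S contained in Z (lcm of the denominators)."""
    N = 1
    for q in S:
        N = N * q.denominator // gcd(N, q.denominator)
    return N


def count_in_interval(S, i, j):
    """Number of exponents q in S with i <= q <= j."""
    return sum(1 for q in S if i <= q <= j)


# The exponent k - 1/k has denominator k, so the integer N required for
# NP-admissibility must grow with the truncation and no single N works for all of S.
assert common_denominator(support(5)) == 60
assert common_denominator(support(10)) == 2520

# The set has a smallest element and each unit interval contains few exponents,
# as required for CNP-admissibility.
assert min(support(50)) == 0
assert all(count_in_interval(support(50), i, i + 1) <= 1 for i in range(0, 49))
```

A series with this support would therefore be a CNP series that is not an NP series, which is the kind of element that appears only after the Cauchy completion.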

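Clearly $V\langle\langle\lambda^*\rangle\rangle$ and $V\langle\langle\lambda\rangle\rangle$ are both vector spaces over $\mathbb{R}$ (or $\mathbb{C}$) and elements in $V\langle\langle\lambda^*\rangle\rangle$ and $V\langle\langle\lambda\rangle\rangle$ are written in the form

$$f = \sum_{q \in \operatorname{supp} f} \lambda^q f_q, \qquad f_q := f(q),$$

and since $\mathcal{S}_{\mathrm{NP}} \subset \mathcal{S}_{\mathrm{CNP}}$ we notice that $V \subset V((\lambda)) \subset V\langle\langle\lambda^*\rangle\rangle \subset V\langle\langle\lambda\rangle\rangle$. If $f \in V\langle\langle\lambda^*\rangle\rangle$ then either the support is finite or $f$ can be written as $f = \sum_{r=-M}^{\infty} \lambda^{r/N} f_{r/N}$, where $M \in \mathbb{Z}$ and $N \in \mathbb{N}$. If $f \in V\langle\langle\lambda\rangle\rangle$ then the support is again either finite or can be written as a sequence $q_0 < q_1 < \cdots \in \mathbb{Q}$ without any accumulation points. In both cases we will write

$$f = \sum_{r=0}^{\infty} \lambda^{q_r} f_{q_r}.$$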

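For $f \in V\langle\langle\lambda\rangle\rangle$ we define the order $o(f) := \min(\operatorname{supp} f)$ for $f \ne 0$ and set $o(0) := +\infty$ and we define

$$\varphi(f) := 2^{-o(f)} \ \text{ for } f \ne 0, \qquad \varphi(0) := 0, \tag{18}$$

which leads to an ultra-metric $d_\varphi(f, g) := \varphi(f - g)$ for the vector space $V\langle\langle\lambda\rangle\rangle$.

Proposition 2. Let $V$ be a vector space over $\mathbb{R}$ (or $\mathbb{C}$). Then $V\langle\langle\lambda\rangle\rangle$ is a complete metric space with respect to the metric $d_\varphi$ and $V\langle\langle\lambda^*\rangle\rangle$ is dense in $V\langle\langle\lambda\rangle\rangle$.

Proof. Let $(f^{(n)})_{n \in \mathbb{N}}$ be a Cauchy sequence in $V\langle\langle\lambda\rangle\rangle$ with respect to the metric $d_\varphi$. Then there are natural numbers $N_0 \le N_1 \le N_2 \le \cdots \in \mathbb{N}$ such that $d_\varphi(f^{(n)}, f^{(m)}) < 2^{-k}$ for all $n, m \ge N_k$. Then we define $f_q := f_q^{(N_0)}$ for $-\infty < q \le 0$ and $f_q := f_q^{(N_k)}$ for $k - 1 < q \le k$, where $k \in \mathbb{N}$. Then $f = \sum_q \lambda^q f_q$ is a well-defined series in $V\langle\langle\lambda\rangle\rangle$ since $\operatorname{supp} f$ has a smallest element and in any interval of the form $[i, i+1]$ with $i \in \mathbb{Z}$ there are only finitely many $f_q \ne 0$ for $q \in [i, i+1]$ and hence $\operatorname{supp} f \in \mathcal{S}_{\mathrm{CNP}}$. Since the $f^{(n)}$ coincide with $f$ up to a sufficiently increasing order thanks to the Cauchy condition it follows that $f^{(n)} \to f$ which implies that the metric space $V\langle\langle\lambda\rangle\rangle$ is complete. Now let $f = \sum_q \lambda^q f_q$ be an arbitrary element in $V\langle\langle\lambda\rangle\rangle$. Then every element of the sequence $f^{(m)} := \sum_{q \le m} \lambda^q f_q$, where $m \in \mathbb{N}$, has finite support and hence $f^{(m)} \in V\langle\langle\lambda^*\rangle\rangle$. Moreover clearly $f^{(m)} \to f$ which proves the proposition.

Corollary 2. Let $V$ be a vector space over $\mathbb{R}$ (or $\mathbb{C}$) and $f = \sum_q \lambda^q f_q \in V\langle\langle\lambda\rangle\rangle$. Then $f^{(m)} := \sum_{q \le m} \lambda^q f_q$ converges to $f$.

We will now concentrate on the vector spaces $\mathbb{R}\langle\langle\lambda^*\rangle\rangle$ and $\mathbb{R}\langle\langle\lambda\rangle\rangle$ (and analogously on $\mathbb{C}\langle\langle\lambda^*\rangle\rangle$ and $\mathbb{C}\langle\langle\lambda\rangle\rangle$) themselves. Analogously to the case of formal Laurent series we define a multiplication of two elements $a, b \in \mathbb{R}\langle\langle\lambda\rangle\rangle$ by

$$ab = \left( \sum_{p \in \operatorname{supp} a} \lambda^p a_p \right) \left( \sum_{q \in \operatorname{supp} b} \lambda^q b_q \right) := \sum_{t \in \operatorname{supp} a + \operatorname{supp} b} \lambda^t \sum_{p+q=t} a_p b_q, \tag{19}$$

where $\operatorname{supp} ab = \operatorname{supp} a + \operatorname{supp} b := \{ p + q \mid p \in \operatorname{supp} a,\ q \in \operatorname{supp} b \}$. Then $\operatorname{supp} ab \in \mathcal{S}_{\mathrm{CNP}}$ and in any order $t \in \operatorname{supp} ab$ the sum over $p + q = t$ with $p \in \operatorname{supp} a$ and $q \in \operatorname{supp} b$ is finite. Hence $ab$ is again a well-defined CNP series. A proof in a more general context can be found in [38, p. 81]. Moreover the product is clearly associative and commutative and we can find for any $0 \ne a \in \mathbb{R}\langle\langle\lambda\rangle\rangle$ an inverse $a^{-1} \in \mathbb{R}\langle\langle\lambda\rangle\rangle$ and hence $\mathbb{R}\langle\langle\lambda\rangle\rangle$ becomes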
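a field. Furthermore one can show that $\mathbb{R}\langle\langle\lambda^*\rangle\rangle$ is a subfield of $\mathbb{R}\langle\langle\lambda\rangle\rangle$. A similar result holds for $\mathbb{C}\langle\langle\lambda\rangle\rangle$. If $V$ is a vector space over $K$ ($K = \mathbb{R}$ or $\mathbb{C}$) then $V\langle\langle\lambda^*\rangle\rangle$ is a vector space over $K\langle\langle\lambda^*\rangle\rangle$ and $V\langle\langle\lambda\rangle\rangle$ is a vector space over $K\langle\langle\lambda\rangle\rangle$. We call these vector spaces the canonical extensions of $V$ to vector spaces over the NP series and the CNP series with real (or complex) coefficients. Let $V^*$ be the (algebraic) dual vector space of $V$; then any $K$-linear functional $\omega \in V^*$ has an obvious canonical $K\langle\langle\lambda\rangle\rangle$-linear extension to a functional of $V\langle\langle\lambda\rangle\rangle$ by applying $\omega$ order by order. This extension will always be understood. A particular subspace of all $K\langle\langle\lambda\rangle\rangle$-linear functionals $\omega : V\langle\langle\lambda\rangle\rangle \to K\langle\langle\lambda\rangle\rangle$ is given by $V^*\langle\langle\lambda\rangle\rangle$, i.e. the linear functionals of the form $\omega = \sum_{q \in \operatorname{supp} \omega} \lambda^q \omega_q$, where $\omega_q \in V^*$ and $\operatorname{supp} \omega \in \mathcal{S}_{\mathrm{CNP}}$. Note that $V^*\langle\langle\lambda\rangle\rangle$ is in general a proper subspace of all $K\langle\langle\lambda\rangle\rangle$-linear functionals of $V\langle\langle\lambda\rangle\rangle$, i.e. $V^*\langle\langle\lambda\rangle\rangle \subsetneq (V\langle\langle\lambda\rangle\rangle)^*$ iff $V$ is infinite-dimensional. The field of CNP series $\mathbb{R}\langle\langle\lambda\rangle\rangle$ (or $\mathbb{C}\langle\langle\lambda\rangle\rangle$) has a canonical non-archimedian absolute value, namely $\varphi$, and $\mathbb{R}\langle\langle\lambda\rangle\rangle$ has a unique order which turns out to be compatible with the absolute value:

Proposition 3. The map $\varphi$ defined as in (18) is a non-archimedian and non-trivial absolute value for the fields $\mathbb{R}\langle\langle\lambda\rangle\rangle$ resp. $\mathbb{C}\langle\langle\lambda\rangle\rangle$. The field $\mathbb{R}\langle\langle\lambda\rangle\rangle$ is an ordered field and this unique order is compatible with the absolute value $\varphi$ and hence $T = T_\varphi$ and $U = U_\varphi$. The positive elements are given by $a > 0$ iff $a_{q_0} > 0$, where $q_0 = \min(\operatorname{supp} a)$.

Theorem 1. i) $\mathbb{C}\langle\langle\lambda^*\rangle\rangle \cong \mathbb{R}\langle\langle\lambda^*\rangle\rangle(i)\ (= \mathbb{R}\langle\langle\lambda^*\rangle\rangle \oplus i\,\mathbb{R}\langle\langle\lambda^*\rangle\rangle)$ is algebraically closed and $\mathbb{R}\langle\langle\lambda^*\rangle\rangle$ is real closed. ii) $\mathbb{C}\langle\langle\lambda\rangle\rangle \cong \mathbb{R}\langle\langle\lambda\rangle\rangle(i)\ (= \mathbb{R}\langle\langle\lambda\rangle\rangle \oplus i\,\mathbb{R}\langle\langle\lambda\rangle\rangle)$ is algebraically closed and $\mathbb{R}\langle\langle\lambda\rangle\rangle$ is real closed and both are Cauchy complete with respect to the ordering relation of $\mathbb{R}\langle\langle\lambda\rangle\rangle$.

Proof. The first part is the Newton–Puiseux theorem ([47, p. 61] or [32, p. 595] for a more general case).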
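To prove the second part we notice that $\mathbb{C}\langle\langle\lambda\rangle\rangle$ is the completion of $\mathbb{C}\langle\langle\lambda^*\rangle\rangle$ with respect to the metric $d_\varphi$ induced by the absolute value $\varphi$ according to Proposition 2. Hence Kürschák’s theorem ensures that $\mathbb{C}\langle\langle\lambda\rangle\rangle$ is again algebraically closed, see [32, p. 584]. Furthermore, since $\mathbb{C}\langle\langle\lambda\rangle\rangle \cong \mathbb{R}\langle\langle\lambda\rangle\rangle \oplus i\,\mathbb{R}\langle\langle\lambda\rangle\rangle$ it follows e.g. from the theorem of Artin and Schreier [32, p. 674] that $\mathbb{R}\langle\langle\lambda\rangle\rangle$ is real closed.

This theorem ensures that the fields $\mathbb{R}\langle\langle\lambda\rangle\rangle$ and $\mathbb{C}\langle\langle\lambda\rangle\rangle$ have indeed the algebraic and analytic properties needed for the GNS construction in deformation quantization and the definition of Hilbert spaces over $\mathbb{C}\langle\langle\lambda\rangle\rangle$. Note that the NP series would not suffice since they are not Cauchy complete. Using the field $\mathbb{C}\langle\langle\lambda\rangle\rangle$ we consider the $\mathbb{C}\langle\langle\lambda\rangle\rangle$-vector space $C^\infty(M)\langle\langle\lambda\rangle\rangle$ and extend the star product to a $\mathbb{C}\langle\langle\lambda\rangle\rangle$-bilinear product on $C^\infty(M)\langle\langle\lambda\rangle\rangle$. Hence the observable algebra in deformation quantization becomes a $\mathbb{C}\langle\langle\lambda\rangle\rangle$-algebra. To define states and GNS representations we need positive linear functionals of this algebra and first we notice that the results of Sect. 2 not only hold for the field of formal Laurent series but also for the CNP case. Then we formulate the GNS construction as a corollary to Theorem 1 and 10:

Corollary 3 (GNS (pre-)construction in deformation quantization). Let $M$ be a symplectic manifold and $*$ a star product for $M$ with $\overline{f * g} = \bar g * \bar f$. Let $K = \mathbb{C}((\lambda))$ or $\mathbb{C}\langle\langle\lambda^*\rangle\rangle$ or $\mathbb{C}\langle\langle\lambda\rangle\rangle$, and let $\mathcal{A}_K := C^\infty(M)((\lambda))$, $C^\infty(M)\langle\langle\lambda^*\rangle\rangle$, $C^\infty(M)\langle\langle\lambda\rangle\rangle$, respectively. Then for any positive linear functional $\omega : \mathcal{A}_K \to K$ there exists a pre-Hilbert space $H_\omega$ over $K$ carrying a $^*$-representation $\pi_\omega$ of $\mathcal{A}_K$ as constructed in Theorem 1. We shall call this representation of $\mathcal{A}_K$ the GNS representation on $H_\omega$.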

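In the case $K = \mathbb{C}\langle\langle\lambda\rangle\rangle$ the pre-Hilbert space can be completed to a Hilbert space $\hat H_\omega$ over $\mathbb{C}\langle\langle\lambda\rangle\rangle$ as in Theorem 10. Then we shall call $\pi_\omega$ the GNS representation of $C^\infty(M)\langle\langle\lambda\rangle\rangle$ in $\hat H_\omega$. If $\omega$ is a state then the GNS representation is cyclic and we have $\omega(f) = \langle \psi_1, \pi_\omega(f)\psi_1 \rangle$ and this property defines this representation up to unitary equivalence.

If we consider positive functionals of the particular form $\omega = \sum_{q \in \operatorname{supp} \omega} \lambda^q \omega_q$, where $\omega_q : C^\infty(M)\langle\langle\lambda\rangle\rangle \to \mathbb{C}$ and $\operatorname{supp} \omega \in \mathcal{S}_{\mathrm{CNP}}$ as mentioned above, then we can show that for a state no negative powers in the formal parameter can occur, using the Cauchy-Schwarz inequality for the state:

Lemma 6. Let $\omega : C^\infty(M)\langle\langle\lambda\rangle\rangle \to \mathbb{C}\langle\langle\lambda\rangle\rangle$ be a positive linear functional of the form $\omega = \sum_{q \in \operatorname{supp} \omega} \lambda^q \omega_q$ such that $\omega(1) = 1$. Then $q_0 := \min(\operatorname{supp} \omega) = 0$ and $\omega_0$ is a classical state, i.e. $\omega_0(\bar f f) \ge 0$ for $f \in C^\infty(M)$.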

Proof. Let $f \in C^\infty(M)\langle\langle\lambda\rangle\rangle$ be a function such that $\omega_{q_0}(f) \ne 0$, where we can assume that $f \in C^\infty(M)$. Then $o(\omega(f)\overline{\omega(f)}) = 2q_0$ and $o(\omega(\bar f * f)) \ge q_0$ which is in contradiction to the Cauchy-Schwarz inequality $\omega(f)\overline{\omega(f)} \le \omega(\bar f * f)$ if $q_0 < 0$, where we used $\omega(1) = 1$. Moreover this implies $\omega_{q_0}(\bar f f) \ge 0$ since $\bar f * f = \bar f f + \ldots$.

This lemma has the following physical interpretation: Not only the classical observable algebra $C^\infty(M)$ is deformed but also the classical states are deformed to obtain the states for the deformed algebra. Hence the restriction to these particular linear functionals in $(C^\infty(M)^*)\langle\langle\lambda\rangle\rangle$ fits well into the general concept of deformation. As we shall see in the examples the positive linear functionals are typically formal CNP series with coefficients even in a smaller space, namely in $C^\infty(M)'$ resp. $C_0^\infty(M)'$, where $C^\infty(M)'$ resp. $C_0^\infty(M)'$ denotes the topological duals with respect to the natural locally convex topologies of the smooth functions resp. of the smooth functions with compact support. Which one of these topological duals one could or should choose in general depends much on the domain of definition of the positive linear functional (which need not be the whole star product algebra $C^\infty(M)\langle\langle\lambda\rangle\rangle$ as e.g. in Sect. 8). Hence we shall concentrate in the following mainly on the purely algebraic properties and postpone these functional analytical questions as an interesting open problem to some future work.

This scheme opens up a lot of possibilities to study candidates for positive linear functionals from a purely geometrical point of view: one can always start with a classical measure $\omega_0$ having support in an interesting submanifold of the classical phase space. The simplest possibility consists in studying single points, that is the evaluation or Dirac delta functionals, which we shall do in Sect. 6 and 7.
One might also think about larger submanifolds such as certain energy surfaces of suitable classical Hamiltonian functions, or Lagrangian submanifolds. However, as we had already seen in Sect. 2 the positivity of these functionals for the deformed algebra is in general no longer true, and higher order terms in the deformation parameter have to be added. One might also think about a deformed version of statistical mechanics where $\omega_0$ is a classical statistical measure such as e.g. the Boltzmann measure $\exp(-\beta H_0)$ for a Hamiltonian function $H_0$ on the phase space such that the Liouville integral over this Boltzmann function converges. In a context where convergence in $\lambda$ in a distributional sense is (at least partially) assumed one can find some of these ideas already in [3, 4] where among other things KMS conditions are studied. In the more particular case of flat $\mathbb{R}^{2n}$ and the convergent setting, as mentioned in the introduction, Hansen has proved convergence of star exponentials (Eq. (5.2) in [30]) (of the above Boltzmann form) of tempered distributions satisfying a certain boundedness condition (see Eq. (5.5) and
Corollary 5.2 in [30]). Note that in that paper GNS representations with vanishing Gel’fand ideal are considered, see Proposition 3.1.

5. A Remark on the Time Development in the GNS (pre-)Hilbert Space

In this section we should like to discuss the time development in the observable algebra and in a representation generated by some Hamiltonian. All results in this section are also valid (after the obvious modifications) when CNP series are replaced by formal Laurent series. Let $H_0 \in C^\infty(M)$ be a classical real Hamiltonian and let $h \in C^\infty(M)\langle\langle\lambda\rangle\rangle$ be a real CNP series with $o(h) > 0$ and define $H := H_0 + h$. Then the Heisenberg equation of motion with respect to the Hamiltonian $H$ is given by

$$\frac{d}{dt} f_t = \frac{1}{i\lambda}[f_t, H] = \frac{1}{i\lambda}(f_t * H - H * f_t), \tag{20}$$

where t 7→ ft ∈ C ∞ (M )hhλii is the trajectory through f0 ∈ C ∞ (M )hhλii and t ∈ R. We ask now for solutions of (20) for a given initial condition f0 . Let XH0 be the Hamiltonian vector field of H0 and assume for simplicity that XH0 is complete. In this case the flow φt : M → M of XH0 is a one-parameter group of symplectic diffeomorphisms of M. Proposition 4. With the above notations we have: The Heisenberg equation of motion has a unique solution ft defined for all t ∈ R for any given initial condition f0 ∈ C ∞ (M )hhλii and ft is given by Z t ∗ ∗ ∗ b ft = φt ◦ T exp φ−τ ◦ H ◦ φτ dτ f0 , (21) 0

b where T exp means the time ordered exponential and H(g) := ∞ b satisfies o(H(g)) > o(g) for all g ∈ C (M )hhλii.

1 iλ [g, H]

− {g, H0 }

Remarks. Fedosov has proved a similar result in [22, Sect. 5.4.]. In the case where XH0 is not complete one can prove the existence and uniqueness of solutions of (20) for example for initial conditions f0 with compact support and sufficiently small times. We will denote the solutions of (20) in the symbolical way it

ft = e λ ad(H) f0 , where ad(H)f := [H, f ]. We shall now turn to the time development of states in a specific GNS (pre-)Hilbert space: We consider a positive linear functional ω of C ∞ (M )hhλii and the related GNS representation πω on Hω := C ∞ (M )hhλii/Jω . Then the Schr¨odinger equation is given by d iλ ψ(t) = πω (H)ψ(t), (22) dt where ψ(t) ∈ Hω . Specifying an initial condition ψ(0) one can rewrite the Schr¨odinger equation into an integral equation Z 1 t πω (H)ψ(τ )dτ, ψ(t) = ψ(0) + iλ 0


but now iterating this equation will generate in general arbitrarily high negative powers of $\lambda$ and hence it will not converge in the formal sense! A rather simple example is provided by taking $H$ equal to the constant function 1. But under certain conditions the well-defined time development of the observables can be used to find solutions of (22):

Theorem 2. Let $\omega : C^\infty(M)\langle\langle\lambda\rangle\rangle \to \mathbb{C}\langle\langle\lambda\rangle\rangle$ be a positive linear functional and $H = H_0 + h \in C^\infty(M)\langle\langle\lambda\rangle\rangle$ as above. If $H$ is contained in the Gel’fand ideal $\mathcal{J}_\omega$ of $\omega$ then the Schrödinger equation (22) has a unique solution for $t \in \mathbb{R}$ and any initial condition $\psi(0) \in \mathfrak{H}_\omega$. Moreover, if $f \in C^\infty(M)\langle\langle\lambda\rangle\rangle$ and $f_t$ is the solution of (20) with initial condition $f$, then

$$\psi(t) := \psi_{f_{-t}} = \psi_{\exp(-\frac{it}{\lambda}\mathrm{ad}(H)) f}$$

is the unique solution with initial condition $\psi(0) = \psi_f$.

Proof. Let $f_t$ be the solution of (20) with initial condition $f$. Since $H \in \mathcal{J}_\omega$ we have

$$\frac{d}{dt}\psi(t) = \frac{d}{dt} f_{-t} \bmod \mathcal{J}_\omega = -\frac{1}{i\lambda}\,(f_{-t} * H - H * f_{-t}) \bmod \mathcal{J}_\omega = \frac{1}{i\lambda}\,\pi_\omega(H)\psi(t).$$

Remarks. With the GNS representation we can define a time development of the states in a purely formal way without considering any convergence properties for $\lambda = \hbar$. There are other methods to define a time development, as for example the star exponential [6], which is the starting point for a spectral analysis. In this approach the time development operator is viewed as a certain distribution if one substitutes the formal parameter $\lambda$ by $\hbar \in \mathbb{R}$. If $\omega$ is a state then the condition $H \in \mathcal{J}_\omega$ implies that the vacuum vector $\psi_1$ is invariant under the time development and hence $\omega$ could be called an invariant state under $H$. The search for invariant states under certain group actions is a well-studied problem in algebraic quantum field theory [29]. In our approach it raises the question whether there is a positive linear functional such that the left ideal generated by some Hamiltonian is contained in the Gel’fand ideal of that functional.
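This condition would determine a preferred choice of a state if one wants to consider the quantum theory of a certain classical Hamiltonian.

6. Example I: The Wick Product and δ-Functionals in $\mathbb{C}^n$

We consider the Wick product (see for example [9]) in $\mathbb{C}^n$, which is defined for $f, g \in C^\infty(\mathbb{C}^n)\langle\langle\lambda\rangle\rangle$ by

$$f * g := \sum_{r=0}^\infty \frac{(2\lambda)^r}{r!}\, \frac{\partial^r f}{\partial z^{i_1} \cdots \partial z^{i_r}}\, \frac{\partial^r g}{\partial \bar{z}^{i_1} \cdots \partial \bar{z}^{i_r}}, \tag{23}$$

where $z^1, \ldots, z^n$ are the canonical holomorphic coordinates in $\mathbb{C}^n$ and summation over repeated indices is always understood. Moreover we consider the evaluation functional (“delta-functional”) $\delta_p$ at the point $p \in \mathbb{C}^n$,

$$\delta_p[f] := f(p) \tag{24}$$

as a $\mathbb{C}\langle\langle\lambda\rangle\rangle$-linear functional $\delta_p : C^\infty(\mathbb{C}^n)\langle\langle\lambda\rangle\rangle \to \mathbb{C}\langle\langle\lambda\rangle\rangle$. Then $\delta_p$ turns out to be positive with respect to the Wick product: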


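Lemma 7. Let $p \in \mathbb{C}^n$. Then $\delta_p$ is a positive linear functional with respect to the Wick product (23). Moreover it is clearly a state. The Gel’fand ideal of $\delta_p$ is given by

$$\mathcal{J}_p := \left\{ f \in C^\infty(\mathbb{C}^n)\langle\langle\lambda\rangle\rangle \;\middle|\; \frac{\partial^{|I|} f}{\partial \bar{z}^I}(p) = 0 \text{ for all multiindices } I \right\}. \tag{25}$$

This lemma is proved by observing that $\bar{f} * f$ evaluated at a point is just a series of non-negative elements in $\mathbb{R}\langle\langle\lambda\rangle\rangle$. Using Borel’s lemma [49] we easily find the following isomorphism:

Proposition 5. We have the following isomorphism of $\mathbb{C}\langle\langle\lambda\rangle\rangle$-vector spaces

$$\mathfrak{H}_p := C^\infty(\mathbb{C}^n)\langle\langle\lambda\rangle\rangle / \mathcal{J}_p \cong \left( \mathbb{C}[[\bar{y}^1, \ldots, \bar{y}^n]] \right)\langle\langle\lambda\rangle\rangle \tag{26}$$

and an isomorphism is given by the formal $\bar{z}$-Taylor series at $p$,

$$\mathfrak{H}_p \ni \psi_f \mapsto \sum_{r=0}^\infty \frac{1}{r!}\, \frac{\partial^r f}{\partial \bar{z}^{i_1} \cdots \partial \bar{z}^{i_r}}(p)\; \bar{y}^{i_1} \cdots \bar{y}^{i_r}, \tag{27}$$

and the Hermitian product induced by $\delta_p$ is given by

$$\langle \psi_f, \psi_g \rangle = \sum_{r=0}^\infty \frac{(2\lambda)^r}{r!}\, \frac{\partial^r \bar{f}}{\partial z^{i_1} \cdots \partial z^{i_r}}(p)\, \frac{\partial^r g}{\partial \bar{z}^{i_1} \cdots \partial \bar{z}^{i_r}}(p). \tag{28}$$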
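As a small consistency check of (23), (25) and (28), one may work out the case $n = 1$, $p = 0$ for the two coordinate functions. The computation below is only an illustrative sketch. It assumes the convention, suggested by (25), that the second factor in (23) is differentiated with respect to $\bar{z}$ and that the Gel’fand ideal consists of those $f$ with $\delta_0[\bar{f} * f] = 0$.

```latex
% n = 1: Wick products of the coordinate functions, using (23)
z * \bar{z} = z\bar{z} + 2\lambda, \qquad \bar{z} * z = \bar{z} z .

% f = \bar{z}:  \bar{f} * f = z * \bar{z}, hence
\langle \psi_{\bar{z}}, \psi_{\bar{z}} \rangle = \delta_0[z * \bar{z}] = 2\lambda ,
% which is the r = 1 term of (28).

% f = z:  \bar{f} * f = \bar{z} * z, hence
\langle \psi_z, \psi_z \rangle = \delta_0[\bar{z} * z] = 0 ,
% so z lies in the Gel'fand ideal J_0, as (25) predicts, since
% z(0) = 0 and \partial^r z / \partial \bar{z}^r = 0 for all r \ge 1.
```

In particular $z * \bar{z} - \bar{z} * z = 2\lambda$, so the holomorphic coordinate acts as an annihilation-type element in the representation induced by $\delta_0$.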

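In the following we shall identify $\psi_f$ with the formal $\bar{z}$-Taylor series and consider only $p = 0$ for simplicity. Then we can determine the GNS representation $\pi_0$ induced by $\delta_0$ by calculating the formal $\bar{z}$-Taylor series of $f * g$ in order to obtain $\pi_0(f)\psi_g$.

Lemma 8 (The formal Wick representation). For $f \in C^\infty(\mathbb{C}^n)\langle\langle\lambda\rangle\rangle$ we find

$$\pi_0(f) = \sum_{r,s=0}^\infty \frac{(2\lambda)^r}{r!\,s!}\, \frac{\partial^{r+s} f}{\partial z^{i_1} \cdots \partial z^{i_r}\, \partial \bar{z}^{j_1} \cdots \partial \bar{z}^{j_s}}(0)\; \bar{y}^{j_1} \cdots \bar{y}^{j_s}\, \frac{\partial}{\partial \bar{y}^{i_1}} \cdots \frac{\partial}{\partial \bar{y}^{i_r}}. \tag{29}$$

In particular we have for polynomials the Wick ordering (normal ordering):

$$\pi_0(z^{i_1} \cdots z^{i_r}\, \bar{z}^{j_1} \cdots \bar{z}^{j_s}) = (2\lambda)^r\, \bar{y}^{j_1} \cdots \bar{y}^{j_s}\, \frac{\partial}{\partial \bar{y}^{i_1}} \cdots \frac{\partial}{\partial \bar{y}^{i_r}}. \tag{30}$$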
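To see how (30) arises from the Wick product, it may help to carry out the computation for $n = 1$ and $f = z\bar{z}$. This is a hedged illustration only, and it again assumes that in (23) the second factor carries the $\bar{z}$-derivatives.

```latex
% n = 1, f = z\bar{z}, g arbitrary. In (23) only r = 0 and r = 1 contribute:
(z\bar{z}) * g = z\bar{z}\, g + 2\lambda\, \bar{z}\, \frac{\partial g}{\partial \bar{z}} .

% Formal \bar{z}-Taylor series at 0: the first term carries a factor z and
% therefore vanishes at z = 0; in the second term \bar{z} becomes \bar{y}:
\pi_0(z\bar{z})\, \psi_g = 2\lambda\, \bar{y}\, \frac{\partial \psi_g}{\partial \bar{y}} ,
% which is (30) for r = s = 1.

% Action on monomials:
\pi_0(z\bar{z})\, \bar{y}^k = 2\lambda k\, \bar{y}^k , \qquad k = 0, 1, 2, \ldots
```

The monomials $\bar{y}^k$ are thus eigenvectors of $\pi_0(z\bar{z})$, which is the mechanism behind the harmonic oscillator spectrum discussed later in this section.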

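Now we will consider the completion of $\mathfrak{H}_0$ to a $\mathbb{C}\langle\langle\lambda\rangle\rangle$-Hilbert space.

Proposition 6. The completion $\hat{\mathfrak{H}}_0$ of $\mathfrak{H}_0$ is given by

$$\hat{\mathfrak{H}}_0 = \left\{ \phi = \sum_{r=0}^\infty \frac{1}{r!}\, a_{i_1 \cdots i_r}(\lambda)\, \bar{y}^{i_1} \cdots \bar{y}^{i_r} \;\middle|\; a_{i_1 \cdots i_r}(\lambda) \in \mathbb{C}\langle\langle\lambda\rangle\rangle \text{ such that } \sum_{r=0}^\infty \frac{(2\lambda)^r}{r!}\, \overline{a_{i_1 \cdots i_r}(\lambda)}\, a_{i_1 \cdots i_r}(\lambda) \text{ converges in } \mathbb{C}\langle\langle\lambda\rangle\rangle \right\}. \tag{31}$$

Let $K = (k_1, \ldots, k_n)$ be a multiindex. Then the vectors

$$e_K := \frac{1}{\sqrt{(2\lambda)^{|K|}\, K!}}\, (\bar{y}^1)^{k_1} \cdots (\bar{y}^n)^{k_n} \in \hat{\mathfrak{H}}_0, \qquad \text{where } |K| := k_1 + \cdots + k_n, \quad K! := k_1! \cdots k_n!, \tag{32}$$

form a Hilbert base for $\hat{\mathfrak{H}}_0$ and hence $\hat{\mathfrak{H}}_0$ is isometric to $\ell^2(\mathbb{C}\langle\langle\lambda\rangle\rangle)$.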


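Remark. An element in $\mathfrak{H}_0$ has a lowest order in $\lambda$ according to Eq. (26), but the coefficients $a_{i_1 \cdots i_r}(\lambda)$ of an element in $\hat{\mathfrak{H}}_0$ written as in (31) could have decreasing orders in $\lambda$. Assigning to $\bar{y}^1, \ldots, \bar{y}^n$ the total degree 1 and to $\lambda$ the total degree 2, the elements in $\hat{\mathfrak{H}}_0$ could be understood as formal CNP series in this total degree.

As an example we want to consider the harmonic oscillator with classical Hamiltonian $H := \frac{1}{2}\omega|z|^2$, where $\omega \in \mathbb{R}^+$ is the oscillator frequency. First we notice that $H$ is an element of the Gel’fand ideal $\mathcal{J}_0$ and $\pi_0(H) = \lambda\omega\, \bar{y}^k\, \partial/\partial \bar{y}^k$. Note that this operator is clearly defined on all of $\hat{\mathfrak{H}}_0$ as a bounded and hence continuous operator. We can ask for its GNS-spectrum in the following sense:

$$\mathrm{spec}(\pi_0(H)) := \{ \mu \in \mathbb{C}\langle\langle\lambda\rangle\rangle \mid (\pi_0(H) - \mu 1) \text{ is not bijective} \}.$$

By an easy calculation we find that the GNS-spectrum is purely discrete, namely $\mathrm{spec}(\pi_0(H)) = \{0, \lambda\omega, 2\lambda\omega, \ldots\}$. Moreover we notice that the vectors $e_K$ are eigenvectors to the eigenvalue $\lambda\omega|K|$ and hence they are a Hilbert base of eigenvectors to the harmonic oscillator. Note that the GNS-spectrum is still formal (i.e. a “CNP-number”) by definition, but after the substitution $\lambda = \hbar$ we arrive at the well-known real oscillator spectrum (minus the ground state energy, which is due to the Wick product) which, in the context of star products, had directly been computed by analytic methods using the star exponential in [6, pp. 125–132].

At last we want to discuss a way how one can get back convergence in $\lambda$ if we substitute $\lambda$ by $\hbar \in \mathbb{R}$. The main idea is that we ask for convergence in the representation and not in the algebra. We define for a fixed $\hbar \in \mathbb{R}^+$,

$$\mathcal{H}(\hbar) := \Big\{ \phi \in \hat{\mathfrak{H}}_0 \;\Big|\; \sum_{r=0}^\infty \hbar^{q_r} a^{(K)}_{q_r} := \langle e_K, \phi \rangle \big|_{\lambda=\hbar} \text{ converges} \tag{33}$$

$$\text{absolutely } \forall K \text{ and } \sum_{|K|=0}^\infty \Big( \sum_{r=0}^\infty \hbar^{q_r} \big| a^{(K)}_{q_r} \big| \Big)^2 < \infty \Big\}, \tag{34}$$

$$\mathcal{N}(\hbar) := \Big\{ \phi \in \hat{\mathfrak{H}}_0 \;\Big|\; \langle e_K, \phi \rangle \big|_{\lambda=\hbar} = 0 \Big\}. \tag{35}$$

Lemma 9. Let $\phi = \sum_{r=0}^\infty \frac{1}{r!}\, a_{i_1 \ldots i_r}(\lambda)\, \bar{y}^{i_1} \cdots \bar{y}^{i_r} \in \hat{\mathfrak{H}}_0$. If $\phi \in \mathcal{H}(\hbar)$ then we have:

i) $a_{i_1 \ldots i_r}(\lambda)$ converges absolutely for $\lambda = \hbar$.

ii) The Hermitian product $\langle \phi, \phi \rangle$ is absolutely convergent for $\lambda = \hbar$ and positive semidefinite in $\mathbb{C}$:

$$\langle \phi, \phi \rangle \big|_{\lambda=\hbar} = \sum_{r=0}^\infty \frac{(2\hbar)^r}{r!}\, \overline{a_{i_1 \ldots i_r}(\hbar)}\, a_{i_1 \ldots i_r}(\hbar) \ge 0.$$

iii) $\mathcal{H}(\hbar)$ is a $\mathbb{C}$-vector space and $\mathcal{N}(\hbar)$ is a $\mathbb{C}$-subvector space of $\mathcal{H}(\hbar)$. For $\phi, \psi \in \mathcal{H}(\hbar)$ the Hermitian product $\langle \phi, \psi \rangle$ converges absolutely in $\mathbb{C}$ for $\lambda = \hbar$.

iv) $F(\bar{z}) := \sum_{r=0}^\infty \frac{1}{r!}\, a_{i_1 \ldots i_r}(\hbar)\, \bar{z}^{i_1} \cdots \bar{z}^{i_r}$ is an entire anti-holomorphic function on $\mathbb{C}^n$.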


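Parts ii) and iii) will be proved in a more general context in Theorem 4. Let $\phi$ be given as in the above lemma and let $\psi = \sum_{r=0}^\infty \frac{1}{r!}\, b_{i_1 \ldots i_r}(\lambda)\, \bar{y}^{i_1} \cdots \bar{y}^{i_r}$ be another element in $\mathcal{H}(\hbar)$. Then we have the following theorem:

Theorem 3. The quotient $\mathbb{C}$-vector space $\mathfrak{H}(\hbar) := \mathcal{H}(\hbar)/\mathcal{N}(\hbar)$ with the Hermitian product

$$\langle \phi \bmod \mathcal{N}(\hbar),\, \psi \bmod \mathcal{N}(\hbar) \rangle := \sum_{r=0}^\infty \frac{(2\hbar)^r}{r!}\, \overline{a_{i_1 \ldots i_r}(\hbar)}\, b_{i_1 \ldots i_r}(\hbar) \tag{36}$$

is a Hilbert space over $\mathbb{C}$ which is isometric to the Hilbert space of anti-holomorphic functions $f : \mathbb{C}^n \to \mathbb{C}$,

$$\mathfrak{H}(\hbar) \cong \left\{ f : \mathbb{C}^n \to \mathbb{C} \;\middle|\; \sum_{r=0}^\infty \frac{(2\hbar)^r}{r!} \sum_{i_1, \ldots, i_r} \left| \frac{\partial^r f}{\partial \bar{z}^{i_1} \cdots \partial \bar{z}^{i_r}}(0) \right|^2 < \infty \right\} \tag{37}$$

with the Gaussian product

$$\langle f, g \rangle = \frac{1}{(2\pi\hbar)^n} \int_{\mathbb{C}^n} e^{-\frac{|z|^2}{2\hbar}}\, \overline{f(z)}\, g(z)\, dz^1 \cdots dz^n\, d\bar{z}^1 \cdots d\bar{z}^n, \tag{38}$$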

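and the canonical isometry is given by $\phi \mapsto F$, where $F$ is the anti-holomorphic function defined by $\phi$ as in Lemma 9 iv).

Moreover the GNS representation of $C^\infty(\mathbb{C}^n)\langle\langle\lambda\rangle\rangle$ induces a representation $Q$ on the $\mathbb{C}$-Hilbert space $\mathfrak{H}(\hbar)$ at least for “many” functions. We consider only those elements $f \in C^\infty(\mathbb{C}^n)\langle\langle\lambda\rangle\rangle$ with $\pi_0(f)\mathcal{N}(\hbar) \subseteq \mathcal{N}(\hbar)$. Then we define

$$\mathcal{D}_f(\hbar) := \{ \phi \in \mathcal{H}(\hbar) \mid \pi_0(f)\phi \in \mathcal{H}(\hbar) \}; \qquad \mathfrak{D}_f(\hbar) := \mathcal{D}_f(\hbar)/\mathcal{N}(\hbar) \tag{39}$$

and call $\mathfrak{D}_f(\hbar) \subseteq \mathfrak{H}(\hbar)$ the domain of $f$, since we can define an operator $Q(f)$ with domain of definition $\mathfrak{D}_f(\hbar)$ by

$$Q(f)(\phi \bmod \mathcal{N}(\hbar)) := \pi_0(f)\phi \bmod \mathcal{N}(\hbar) \qquad \text{for } \phi \in \mathcal{D}_f(\hbar). \tag{40}$$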

In general $\mathfrak{D}_f(\hbar)$ may be very small but there are “many” functions (i.e. all polynomials in $z$ and $\bar{z}$) such that $\mathfrak{D}_f(\hbar)$ is dense in $\mathfrak{H}(\hbar)$ or even equal to $\mathfrak{H}(\hbar)$. In this case we clearly have

$$Q(f * g) = Q(f) \circ Q(g), \tag{41}$$

and hence we call $Q$ a quantization map. If we write $\mathfrak{H}(\hbar)$ in the form (37) this is of course just the well-known Bargmann quantization.

7. General Results for Kähler Manifolds

In this section we shall derive some general results for Kähler manifolds with the Fedosov star product of Wick type which had been constructed in [11]. First we define the formal Wick algebra in $2n$ parameters by

$$\mathcal{W}_n := \left( \mathbb{C}[[y^1, \ldots, y^n, \bar{y}^1, \ldots, \bar{y}^n]] \right)\langle\langle\lambda\rangle\rangle, \tag{42}$$

then $\mathcal{W}_n$ is clearly a $\mathbb{C}\langle\langle\lambda\rangle\rangle$-vector space and we define a pointwise multiplication of the formal power series in $y^1, \ldots, \bar{y}^n$ as usual. Then $\mathcal{W}_n$ becomes an associative and


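commutative algebra over $\mathbb{C}\langle\langle\lambda\rangle\rangle$ with unit element. Moreover $\mathcal{W}_n$ becomes a $*$-algebra if we extend the complex conjugation such that $y^k$ is mapped to $\bar{y}^k$ and vice versa. This formal Wick algebra could be deformed by an analogue of the usual Wick product. We define

$$a \circ b := \sum_{r=0}^\infty \frac{(2\lambda)^r}{r!}\, \frac{\partial^r a}{\partial y^{i_1} \cdots \partial y^{i_r}}\, \frac{\partial^r b}{\partial \bar{y}^{i_1} \cdots \partial \bar{y}^{i_r}} \tag{43}$$

for $a, b \in \mathcal{W}_n$. This leads to an associative deformation of the pointwise multiplication such that the complex conjugation is still an antilinear algebra involution and $a \circ 1 = 1 \circ a = a$. Every element $a \in \mathcal{W}_n$ could be written as

$$a = \sum_{r,s=0}^\infty \frac{1}{r!\,s!}\, a_{i_1 \ldots i_r \bar{j}_1 \ldots \bar{j}_s}\, y^{i_1} \cdots y^{i_r}\, \bar{y}^{j_1} \cdots \bar{y}^{j_s}.$$

Then we define the delta functional as the projection onto the part without explicit powers of the formal parameters $y^1, \ldots, \bar{y}^n$: $\delta(a) := a_{00} \in \mathbb{C}\langle\langle\lambda\rangle\rangle$. This will also be written in the following symbolical way: $\delta(a) = a_{00} = a|_{y=0}$. Then it is easy to see that the delta functional is a positive linear functional:

Proposition 7. i) The delta functional $\delta : \mathcal{W}_n \to \mathbb{C}\langle\langle\lambda\rangle\rangle$ is a state of $\mathcal{W}_n$ with respect to the Wick product and the Gel’fand ideal of $\delta$ is given by:

$$\mathcal{J} = \left\{ a \in \mathcal{W}_n \;\middle|\; \frac{\partial^{|I|} a}{\partial \bar{y}^I}\Big|_{y=0} = 0 \text{ for all multiindices } I \right\}. \tag{44}$$

ii) The quotient space $\mathcal{W}_n/\mathcal{J}$ is isomorphic to $\mathfrak{H}_{0n} := \left( \mathbb{C}[[\bar{y}^1, \ldots, \bar{y}^n]] \right)\langle\langle\lambda\rangle\rangle$ with the “formal $\bar{y}$-Taylor series” as isomorphism

$$a \bmod \mathcal{J} \mapsto \sum_{s=0}^\infty \frac{1}{s!}\, \frac{\partial^s a}{\partial \bar{y}^{i_1} \cdots \partial \bar{y}^{i_s}}\Big|_{y=0}\, \bar{y}^{i_1} \cdots \bar{y}^{i_s},$$

and the Hermitian product induced by $\delta$ is given by

$$\left\langle \sum_{r=0}^\infty \frac{1}{r!}\, a_{i_1 \ldots i_r}\, \bar{y}^{i_1} \cdots \bar{y}^{i_r},\; \sum_{s=0}^\infty \frac{1}{s!}\, b_{j_1 \ldots j_s}\, \bar{y}^{j_1} \cdots \bar{y}^{j_s} \right\rangle = \sum_{r=0}^\infty \frac{(2\lambda)^r}{r!}\, \overline{a_{i_1 \ldots i_r}}\, b_{i_1 \ldots i_r}.$$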

The properties of $\mathfrak{H}_{0n}$ and its completion $\mathfrak{H}_n$ were already discussed in Proposition 6 where $\mathfrak{H}_{0n} \cong \mathfrak{H}_0$. Now we want to investigate the convergence of the Hermitian products in a Hilbert space $\mathfrak{H}$ over $\mathbb{C}\langle\langle\lambda\rangle\rangle$ with a countable Hilbert base $\{e_k\}_{k \in \mathbb{N}}$. As we will see in Sect. 8 a $\mathbb{C}\langle\langle\lambda\rangle\rangle$-Hilbert space has in general no Hilbert base, but we will see important examples if we consider Kähler manifolds.

First we will mention some easy convergence properties of CNP series. Let $\mathbb{C}^- := \mathbb{C} \setminus \{x \in \mathbb{R} \mid x \le 0\}$ and $B_r^- := B_r(0) \cap \mathbb{C}^-$, where $B_r(0)$ is the open disk around $0 \in \mathbb{C}$ of radius $r$. For $q \in \mathbb{Q}$ let $z^q$ be the holomorphic root defined on $\mathbb{C}^-$. For a formal CNP series $a = \sum_{r=0}^\infty \lambda^{q_r} a_{q_r} \in \mathbb{C}\langle\langle\lambda\rangle\rangle$ we define the radius of convergence by $R := \sup\{ t \in \mathbb{R}^+ \mid \sum_{r=0}^\infty t^{q_r} |a_{q_r}| < \infty \}$; then we can prove the following proposition:


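Proposition 8. Let $a \in \mathbb{C}\langle\langle\lambda\rangle\rangle$ with convergence radius $R > 0$. Then $f(z) := \sum_{r=0}^\infty z^{q_r} a_{q_r}$ converges absolutely and normally in $B_R^-$. Hence $f : z \in B_R^- \mapsto f(z) \in \mathbb{C}$ is a holomorphic function.

Now we define $\mathcal{H}(\hbar)$ and $\mathcal{N}(\hbar)$ in the following way:

$$\mathcal{H}(\hbar) := \Big\{ \phi \in \mathfrak{H} \;\Big|\; \forall k \in \mathbb{N} : \langle e_k, \phi \rangle \big|_{\lambda=\hbar} = \sum_{r=0}^\infty \hbar^{q_r} a^{(k)}_{q_r} \text{ converges absolutely and } \sum_{k \in \mathbb{N}} |F_k(\hbar)|^2 < \infty, \text{ where } F_k(\hbar) := \sum_{r=0}^\infty \hbar^{q_r} \big| a^{(k)}_{q_r} \big| \Big\}, \tag{45}$$

$$\mathcal{N}(\hbar) := \Big\{ \phi \in \mathcal{H}(\hbar) \;\Big|\; \forall k \in \mathbb{N} : \langle e_k, \phi \rangle \big|_{\lambda=\hbar} = 0 \Big\}. \tag{46}$$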

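First we notice that $F_k(\hbar)$ is well-defined and extends to a holomorphic function $F_k$ on $B_\hbar^-$. Using Cauchy’s theorem for iterated series and Weierstraß’ theorem for convergence of holomorphic functions we easily get the following theorem:

Theorem 4. Let $\mathfrak{H}$ be a $\mathbb{C}\langle\langle\lambda\rangle\rangle$-Hilbert space with countable Hilbert base $\{e_k\}_{k \in \mathbb{N}}$, let $\mathcal{H}(\hbar)$ and $\mathcal{N}(\hbar)$ be defined as above, and let $\phi, \psi \in \mathcal{H}(\hbar)$.

i) $\mathcal{H}(\hbar)$ is a $\mathbb{C}$-vector space and $\mathcal{N}(\hbar)$ is a $\mathbb{C}$-subvector space of $\mathcal{H}(\hbar)$.

ii) The Hermitian product of $\mathfrak{H}$ induces a semidefinite sesquilinear form on $\mathcal{H}(\hbar)$ by

$$\langle \phi, \psi \rangle \big|_{\lambda=\hbar} = \langle \phi, \psi \rangle_\hbar := \sum_{k=1}^\infty \langle \phi, e_k \rangle \big|_{\lambda=\hbar}\, \langle e_k, \psi \rangle \big|_{\lambda=\hbar} \tag{47}$$

and $\langle \phi, \phi \rangle_\hbar = 0$ iff $\phi \in \mathcal{N}(\hbar)$.

iii) $\langle \phi, \psi \rangle |_{\lambda=z}$ is a holomorphic function on $B_\hbar^-$ and

$$\langle \phi, \psi \rangle \big|_{\lambda=z} = \sum_{k=1}^\infty \langle \phi, e_k \rangle \big|_{\lambda=z}\, \langle e_k, \psi \rangle \big|_{\lambda=z}$$

converges normally on $B_\hbar^-$.

iv) The quotient space $\mathfrak{H}(\hbar) := \mathcal{H}(\hbar)/\mathcal{N}(\hbar)$ with the induced Hermitian product is a $\mathbb{C}$-Hilbert space with Hilbert base $\{e_k \bmod \mathcal{N}(\hbar)\}_{k \in \mathbb{N}}$ and hence $\mathfrak{H}(\hbar)$ is isometric to $\ell^2(\mathbb{C})$.

Note that the resulting $\mathbb{C}$-Hilbert space $\mathfrak{H}(\hbar)$ is already a Hilbert space and not only a pre-Hilbert space.

Remark. If one defines $F_k$ with $a^{(k)}_{q_r}$ instead of its absolute value then one would obtain again two $\mathbb{C}$-subvector spaces and again the quotient would be a Hilbert space. But now $\langle \phi, \psi \rangle |_{\lambda=\hbar}$ is in general not convergent to $\langle \phi, \psi \rangle_\hbar$ as in Eq. (47). Consider for example $\phi_1 := \sum_{k=0}^\infty \lambda^k \sin(\frac{\pi\lambda}{2})\, e_k$ or $\phi_2 := \sum_{k=0}^\infty k!\, \lambda^k \sin(\frac{\pi\lambda}{2})\, e_k$, which are both in $\mathcal{H}(2)$, but clearly $\langle \phi_1, \phi_1 \rangle |_{\lambda=\hbar}$ has only convergence radius 1 and $\langle \phi_2, \phi_2 \rangle |_{\lambda=\hbar}$ has convergence radius 0, while $\langle \phi_1, \phi_1 \rangle_\hbar = \langle \phi_2, \phi_2 \rangle_\hbar = 0$.

Now we want to consider the Fedosov star product of Wick type for a Kähler manifold $M$ of real dimension $2n$ constructed as in [11]. We will use the same notation as in [11] with the only difference that the formal parameter is denoted by $\lambda$ and $\hbar$ is reserved for the real number corresponding to the value of Planck’s constant in a chosen unit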


system. Let $\omega$ be the symplectic Kähler form on $M$ which is given in local holomorphic coordinates by $\omega = \frac{i}{2}\, \omega_{k\bar{l}}\, dz^k \wedge d\bar{z}^l$. The Fedosov algebra is defined by

$$\mathcal{W} \otimes \Lambda := \Big( \prod_{s=0}^\infty \mathbb{C}\big( \Gamma( \textstyle\bigvee^s T^*M \otimes \bigwedge T^*M ) \big) \Big)\langle\langle\lambda\rangle\rangle \tag{48}$$

together with the pointwise multiplication induced by the symmetric and antisymmetric product of forms. The elements $a \in \mathcal{W} \otimes \Lambda$ are of the form $a = \sum_{r=0}^\infty \lambda^{q_r} a_{q_r}$, where $a_{q_r} = \sum_{s=0}^\infty a_s^{(q_r)}$ and $a_s^{(q_r)} \in \mathbb{C}(\Gamma(\bigvee^s T^*M \otimes \bigwedge T^*M))$ are smooth sections. The fibrewise Wick product as a deformation of the pointwise product is defined by

$$a \circ b := \sum_{r=0}^\infty \Big( \frac{i\lambda}{2} \Big)^r \Lambda^{(r)}(a, b), \qquad \Lambda^{(r)}(a, b) := \frac{1}{r!} \Big( \frac{4}{i} \Big)^r \omega^{k_1 \bar{l}_1} \cdots \omega^{k_r \bar{l}_r}\; i_s(Z_{k_1}) \cdots i_s(Z_{k_r})\, a\;\; i_s(\bar{Z}_{l_1}) \cdots i_s(\bar{Z}_{l_r})\, b, \tag{49}$$

and since this product is defined fibrewise we can define the Fedosov algebra at a point $p \in M$ by

$$\mathcal{W} \otimes \Lambda_p := \Big( \prod_{s=0}^\infty \mathbb{C}\big( \textstyle\bigvee^s T_p^*M \otimes \bigwedge T_p^*M \big) \Big)\langle\langle\lambda\rangle\rangle \tag{50}$$

together with the product $\circ$. Then the restriction $a_p$ of a section $a \in \mathcal{W} \otimes \Lambda$ to the point $p \in M$ is an element in $\mathcal{W} \otimes \Lambda_p$. The sections in $\mathcal{W} \otimes \Lambda$ without any antisymmetric part are denoted by $\mathcal{W}$ and analogously we define $\mathcal{W}_p$.

In [11] we have shown that there exists a Fedosov derivation $D$ for $\mathcal{W} \otimes \Lambda$ such that $D^2 = 0$ and $\mathcal{W}_D := \ker D \cap \mathcal{W}$ is isomorphic to $C^\infty(M)\langle\langle\lambda\rangle\rangle$. The isomorphism is given by the Fedosov-Taylor series $\tau : C^\infty(M)\langle\langle\lambda\rangle\rangle \to \mathcal{W}_D$, which was constructed recursively, and the inverse map $\sigma : \mathcal{W}_D \to C^\infty(M)\langle\langle\lambda\rangle\rangle$ is simply the projection on the elements with symmetric degree zero, i.e. $\sigma = \pi^{(0,0)}$, where $\pi^{(r,s)}$ is the projection onto the symmetric forms of type $(r, s)$. Then the Fedosov star product is given by $f * g = \sigma(\tau(f) \circ \tau(g))$ and several properties of this star product were shown in [11]. The most important property for our purpose is the reality of the Fedosov-Taylor series, $\overline{\tau(f)} = \tau(\bar{f})$.

Lemma 10. Let $M$ be a $2n$-dimensional Kähler manifold and $z^1, \ldots, z^n$ a holomorphic chart around $p \in M$ such that $\omega_{k\bar{l}}|_p = \delta_{kl}$ in this chart. Then $\varphi(1) := 1$, $\varphi(dz^k) := y^k$, $\varphi(d\bar{z}^k) := \bar{y}^k$ induces a $\mathbb{C}\langle\langle\lambda\rangle\rangle$-algebra isomorphism $\varphi : \mathcal{W}_p \to \mathcal{W}_n$ with respect to the fibrewise Wick product in $\mathcal{W}_p$ and the Wick product (43) in $\mathcal{W}_n$.

Note that it is always possible to find such a holomorphic chart [26, Sect. 0.7]. With such an isomorphism we easily find the following proposition analogously to Lemma 7:

Proposition 9. Let $p \in M$ and $\delta_p$ the delta functional at $p$. Then $\delta_p : C^\infty(M)\langle\langle\lambda\rangle\rangle \to \mathbb{C}\langle\langle\lambda\rangle\rangle$ is a state with respect to the Fedosov star product of Wick type and the Gel’fand ideal is given by

$$\mathcal{J}_p := \big\{ f \in C^\infty(M)\langle\langle\lambda\rangle\rangle \;\big|\; \forall r \ge 0 : \pi^{(0,r)} \tau_p(f) = 0 \big\},$$

where $\tau_p(f)$ is the Fedosov-Taylor series of $f$ evaluated at $p$.

In the next proposition we shall prove that the Fedosov-Taylor series $\tau_p$ at a point $p \in M$ is surjective on $\mathcal{W}_p$, where we use Borel’s lemma and the recursion formula [11, Eq. (19)] for $\tau$.

Proposition 10. The Fedosov-Taylor series $\tau_p : C^\infty(M)\langle\langle\lambda\rangle\rangle \to \mathcal{W}_p$ is surjective for all $p \in M$.


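We define for $p \in M$,

$$\tilde{\mathcal{J}}_p := \big\{ a \in \mathcal{W}_p \;\big|\; \forall r \ge 0 : \pi^{(0,r)} a = 0 \big\}, \qquad \tilde{\mathfrak{H}}_{0p} := \Big( \prod_{s=0}^\infty \textstyle\bigvee^{(0,s)} T_p^*M \Big)\langle\langle\lambda\rangle\rangle,$$

and then we can describe the GNS representation induced by $\delta_p$ in the following way:

Theorem 5. With the notations from above we have:

i) The quotient space $\mathfrak{H}_{0p} := C^\infty(M)\langle\langle\lambda\rangle\rangle/\mathcal{J}_p$ is canonically isomorphic to $\tilde{\mathfrak{H}}_{0p}$, where the isomorphism is induced by $\tau_p$ (and also denoted by $\tau_p$):

$$\psi_f \mapsto \tau_p(\psi_f) := \tau_p(f) \bmod \tilde{\mathcal{J}}_p = \sum_{r=0}^\infty \frac{1}{r!}\, \pi^{(0,0)}\big( i_s(\bar{Z}_{k_1}) \cdots i_s(\bar{Z}_{k_r})\, \tau_p(f) \big)\, d\bar{z}^{k_1} \vee \cdots \vee d\bar{z}^{k_r},$$

where $\psi_f \in \mathfrak{H}_{0p}$, and the Hermitian product is given by

$$\langle \psi_f, \psi_g \rangle = \sum_{r=0}^\infty \frac{(2\lambda)^r}{r!}\, \omega_p^{k_1 \bar{l}_1} \cdots \omega_p^{k_r \bar{l}_r}\; \pi^{(0,0)}\big( i_s(Z_{k_1}) \cdots i_s(Z_{k_r})\, \tau_p(\bar{f})\;\; i_s(\bar{Z}_{l_1}) \cdots i_s(\bar{Z}_{l_r})\, \tau_p(g) \big),$$

and the GNS representation induced by $\delta_p$ is given by

$$\tau_p \circ \pi_p(f) \circ \tau_p^{-1} = \sum_{r,s=0}^\infty \frac{(2\lambda)^r}{r!\,s!}\, \pi^{(0,0)}\big( i_s(Z_{k_1}) \cdots i_s(Z_{k_r})\, i_s(\bar{Z}_{l_1}) \cdots i_s(\bar{Z}_{l_s})\, \tau_p(f) \big)\; \omega^{k_1 \bar{t}_1} \cdots \omega^{k_r \bar{t}_r}\; d\bar{z}^{l_1} \vee \cdots \vee d\bar{z}^{l_s}\; i_s(\bar{Z}_{t_1}) \cdots i_s(\bar{Z}_{t_r}).$$

ii) Let $z^1, \ldots, z^n$ be a holomorphic chart such that $\omega_{k\bar{l}}|_p = \delta_{kl}$ and let $\varphi : \mathcal{W}_p \to \mathcal{W}_n$ be defined as in Lemma 10. Then $\varphi \circ \tau_p$ induces an isomorphism (also denoted by $\varphi \circ \tau_p$) of the $\mathbb{C}\langle\langle\lambda\rangle\rangle$-pre-Hilbert space $\mathfrak{H}_{0p}$ to $\mathfrak{H}_{0n}$. Hence the completion $\mathfrak{H}_p$ of $\mathfrak{H}_{0p}$ is isometric to $\mathfrak{H}_n$ via $\varphi \circ \tau_p$, has a countable Hilbert base, and is isometric to $\ell^2(\mathbb{C}\langle\langle\lambda\rangle\rangle)$. Moreover we have for $\psi_f \in \mathfrak{H}_{0p}$,

$$\varphi \circ \tau_p(\psi_f) = \sum_{r=0}^\infty \frac{1}{r!}\, \frac{\partial^r\, \varphi \circ \tau_p(f)}{\partial \bar{y}^{k_1} \cdots \partial \bar{y}^{k_r}}\Big|_{y=0}\, \bar{y}^{k_1} \cdots \bar{y}^{k_r}.$$

As in the Wick case we can ask for convergence. First we have to choose a Hilbert base of $\mathfrak{H}_p$. In a holomorphic chart $z^1, \ldots, z^n$ with $\omega_{k\bar{l}}|_p = \delta_{kl}$ we can use for example the vectors $\hat{e}_K := (\varphi \circ \tau_p)^{-1} e_K$ with $e_K$ as in Proposition 6. Then we construct $\mathcal{H}_p(\hbar)$ and $\mathcal{N}_p(\hbar)$ with respect to this Hilbert base as in Theorem 4 and get a $\mathbb{C}$-Hilbert space $\mathfrak{H}_p(\hbar)$ and a representation of “many” functions $f \in C^\infty(M)\langle\langle\lambda\rangle\rangle$ in the same way as in the Wick case. We consider again functions $f$ with $\pi_p(f)\mathcal{N}_p(\hbar) \subseteq \mathcal{N}_p(\hbar)$ and define $\mathcal{D}_f(\hbar)$ and $\mathfrak{D}_f(\hbar)$ as in the Wick case. Then we define the quantization map $Q$ analogously: $Q(f)$ is an operator on $\mathfrak{D}_f(\hbar)$ defined by

$$Q(f)(\psi \bmod \mathcal{N}_p(\hbar)) := \pi_p(f)\psi \bmod \mathcal{N}_p(\hbar), \qquad \psi \in \mathcal{D}_f(\hbar),$$

which leads to the representation property $Q(f) \circ Q(g) = Q(f * g)$ on suitable domains. Since according to Proposition 10 the Fedosov-Taylor series is surjective, there are indeed “many” suitable functions, such as the pre-images under $\tau_p$ of the polynomials in $dz^k$ and $d\bar{z}^k$ (where $z^1, \ldots, z^n$ is the chart from above).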


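8. Example II: The Weyl-Moyal Product for $\mathbb{R}^{2n}$

Now we want to consider the phase space $T^*\mathbb{R}^n \cong \mathbb{R}^{2n}$ with the standard symplectic form and the Weyl-Moyal product defined for $f, g \in C^\infty(\mathbb{R}^{2n})\langle\langle\lambda\rangle\rangle$ by

$$f * g := \sum_{r=0}^\infty \Big( \frac{i\lambda}{2} \Big)^r \Lambda^{(r)}(f, g), \qquad \Lambda^{(r)}(f, g) := \frac{1}{r!}\, \Lambda^{i_1 j_1} \cdots \Lambda^{i_r j_r}\, \frac{\partial^r f}{\partial x^{i_1} \cdots \partial x^{i_r}}\, \frac{\partial^r g}{\partial x^{j_1} \cdots \partial x^{j_r}}, \tag{51}$$

where $x^1, \ldots, x^{2n} = q^1, \ldots, q^n, p_1, \ldots, p_n$ and $\Lambda^{ij}$ are the components of the Poisson tensor $\Lambda$ with respect to the coordinates $x^1, \ldots, x^{2n}$, i.e. $\Lambda = \frac{1}{2}\Lambda^{ij}\, \partial_{x^i} \wedge \partial_{x^j} = \partial_{q^i} \wedge \partial_{p_i}$. The smooth functions on $\mathbb{R}^{2n}$ with compact support are denoted by $C_0^\infty(\mathbb{R}^{2n})\langle\langle\lambda\rangle\rangle$.

Lemma 11. Let $f, g \in C_0^\infty(\mathbb{R}^{2n})\langle\langle\lambda\rangle\rangle$; then the integral over $\mathbb{R}^{2n}$ is a trace [17],

$$\int_{\mathbb{R}^{2n}} f * g\; d^{2n}x = \int_{\mathbb{R}^{2n}} g * f\; d^{2n}x = \int_{\mathbb{R}^{2n}} f g\; d^{2n}x, \tag{52}$$

and clearly a positive linear functional.

To obtain Schrödinger’s quantization we need a different positive linear functional, namely the integration over the configuration space $Q := \mathbb{R}^n$ of $T^*Q \cong \mathbb{R}^{2n}$ for a fixed value $p_0$ of the momenta. But first we have to define a suitable subalgebra of $C^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$ such that the integration is well-defined. For $p_0 \in \mathbb{R}^n$ we define

$$C_{p_0}^\infty(T^*Q) := \big\{ f \in C^\infty(T^*Q) \;\big|\; \mathrm{supp} f \cap Q_{p_0} \text{ is compact} \big\}, \tag{53}$$

where $Q_{p_0} := \{(q, p_0) \in T^*Q \mid q \in Q\}$, and notice that due to the locality of the Weyl-Moyal product $C_{p_0}^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$ is not only a subalgebra but also a two-sided ideal of $C^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$. As we will see in Proposition 12 it is not sufficient to consider only the functions with compact support in $T^*Q$. For $f \in C_{p_0}^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$ we define

$$\omega_{p_0}(f) := \int_{\mathbb{R}^n} f(q, p_0)\, d^n q, \tag{54}$$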

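which is obviously well-defined for any fixed momentum $p_0$. Moreover it is a positive linear functional of $C_{p_0}^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$:

Proposition 11. Let $f, g \in C_{p_0}^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$ and define $\Delta := \frac{\partial^2}{\partial q^k\, \partial p_k}$ and $S := \exp(-\frac{i\lambda}{2}\Delta)$; then

$$\omega_{p_0}(\bar{f} * g) = \int_{\mathbb{R}^n} \big( S^{-1}\bar{f} \big)\, (S g)\, \Big|_{p=p_0} d^n q \tag{55}$$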

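and $\omega_{p_0}$ is positive, i.e. $\omega_{p_0}(\bar{f} * f) \ge 0$.

Proof. A straightforward induction on $r$ shows that

$$\int_{\mathbb{R}^n} \Lambda^{(r)}(f, g)\big|_{p=p_0}\, d^n q = \frac{1}{r!} \int_{\mathbb{R}^n} \Big( \sum_{s=0}^r \binom{r}{s}\, \Delta^s f\; (-\Delta)^{r-s} g \Big)\Big|_{p=p_0} d^n q;$$

then Eq. (55) follows easily, and since $S^{-1}\bar{f} = \overline{S f}$ this clearly implies the positivity.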
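The first step of the induction in the proof of Proposition 11 can be checked by hand. The following sketch treats $r = 1$ for $\lambda$-independent $f, g$. It assumes $\Lambda = \partial_{q^k} \wedge \partial_{p_k}$ as in (51), $\Delta = \partial^2/\partial q^k \partial p_k$, and $S = \exp(-\frac{i\lambda}{2}\Delta)$; these conventions are read off from the surrounding text and should be taken as assumptions.

```latex
% r = 1, using \Lambda = \partial_{q^k} \wedge \partial_{p_k}:
\Lambda^{(1)}(f, g) = \frac{\partial f}{\partial q^k}\frac{\partial g}{\partial p_k}
                    - \frac{\partial f}{\partial p_k}\frac{\partial g}{\partial q^k} .

% Integrating by parts in q (no boundary terms, by compact support on Q_{p_0}):
\int_{\mathbb{R}^n} \Lambda^{(1)}(f, g)\big|_{p=p_0}\, d^n q
  = \int_{\mathbb{R}^n} \big( (\Delta f)\, g - f\, (\Delta g) \big)\big|_{p=p_0}\, d^n q .

% Hence, up to first order in \lambda,
\omega_{p_0}(f * g) = \int_{\mathbb{R}^n} \Big( f g
  + \frac{i\lambda}{2} \big( (\Delta f)\, g - f\, (\Delta g) \big) \Big)\Big|_{p=p_0} d^n q
  + \text{(higher orders in } \lambda) ,
% which agrees with the first-order expansion of
% (S^{-1} f)(S g) = \big( f + \tfrac{i\lambda}{2}\Delta f \big)\big( g - \tfrac{i\lambda}{2}\Delta g \big) + \ldots
```

Replacing $f$ by $\bar{f}$ gives the first-order part of (55).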


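Remark. A similar but non-formal positivity result of such integrals was obtained in [20, pp. 205-206, Eq. 5.11] in the context of complex $C^*$-algebras.

Now we shall specialize to $p_0 = 0$ for simplicity. We define for $k = 1, \ldots, n$ the functions $P_k(q, p) := p_k$ and denote by $\mathcal{J}_0$ the left ideal of $C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle := C_{p_0=0}^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$ generated by the functions $P_k$,

$$\mathcal{J}_0 := \Big\{ f \in C^\infty(T^*Q)\langle\langle\lambda\rangle\rangle \;\Big|\; f = \sum_k g^k * P_k, \; g^k \in C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle \Big\}, \tag{56}$$

which is indeed a left ideal of $C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$ since $C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$ is a two-sided ideal of the whole algebra $C^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$. By direct calculation we get:

Lemma 12. Let $f_1, \ldots, f_n \in C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$; then we have

$$\omega_0\big( \overline{(f_k * P_k)} * (f_l * P_l) \big) = 0 \tag{57}$$

and $\mathcal{J}_0$ is contained in the Gel’fand ideal of $\omega_0$.

The next proposition will show that $\mathcal{J}_0$ is already equal to the Gel’fand ideal of $\omega_0$ and the quotient $\mathfrak{H}_0 := C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle/\mathcal{J}_0$ is canonically isomorphic to $C_0^\infty(Q)\langle\langle\lambda\rangle\rangle$. Let $r_0 : C^\infty(T^*Q)\langle\langle\lambda\rangle\rangle \to C^\infty(Q)\langle\langle\lambda\rangle\rangle$ be the restriction to momentum $p = 0$, i.e. $r_0(f)(q) := f(q, 0)$, and let $\pi_Q : T^*Q \to Q$ be the projection to the configuration space. Then we clearly have $r_0(C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle) = C_0^\infty(Q)\langle\langle\lambda\rangle\rangle$ and we define $i_0 := \pi_Q^* \circ r_0$, i.e. $i_0(f)(q, p) := f(q, 0)$. Moreover we need the operator $I : C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle \to C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$ defined for $f \in C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$ by

$$I(f)(q, p) := \int_0^1 \frac{f(q, tp) - f(q, 0)}{t}\, dt \tag{58}$$

and for $k = 1, \ldots, n$ we define $T^k : C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle \to C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$ by

$$T^k(f) := \Big( \frac{\partial}{\partial p_k} \circ I \circ \frac{1}{1 + \frac{i\lambda}{2}\Delta \circ I} \Big)(f). \tag{59}$$

Note that $I(f), T^k(f) \in C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$ if $f \in C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$.

Proposition 12. With the above notation we have for $f \in C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$:

i) $f = i_0 \circ S(f) + T^k(f) * P_k$.

ii) This decomposition is unique, i.e.

$$C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle \cong i_0 \circ S\big( C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle \big) \oplus \mathcal{J}_0. \tag{60}$$

iii) The Gel’fand ideal of $\omega_0$ is given by $\mathcal{J}_0$.

Proof. First we use Hadamard’s trick to obtain $f = i_0(f) + P_k\, \partial_{p_k} I(f)$, which could be rewritten as $f = i_0(f) + (\partial_{p_k} I(f)) * P_k - \frac{i\lambda}{2}\Delta \circ I(f)$ using the Weyl-Moyal product. Iterating this equation leads to $f = i_0 \circ \frac{1}{1 + \frac{i\lambda}{2}\Delta \circ I}(f) + T^k(f) * P_k$. An easy induction shows that $r!\, i_0 \circ (\Delta \circ I)^r = i_0 \circ \Delta^r$, which proves the first part. The second part follows from Lemma 12, and the last statement is a consequence of parts one and two.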
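The decomposition of Proposition 12 i) can be made explicit in a simple case. The sketch below takes $n = 1$ and $f(q, p) = u(q)\,p$ with a hypothetical test function $u \in C_0^\infty(\mathbb{R})$, chosen only so that $f$ satisfies the support condition in (53). It assumes $S = \exp(-\frac{i\lambda}{2}\Delta)$ with $\Delta = \partial_q \partial_p$ and the Weyl-Moyal product (51).

```latex
% n = 1, f(q, p) = u(q) p with u compactly supported (illustrative choice)
I(f) = \int_0^1 \frac{u(q)\, t p - 0}{t}\, dt = u(q)\, p , \qquad
(\Delta \circ I)(f) = u'(q) , \qquad I(u') = 0 .

% The geometric series for (1 + \tfrac{i\lambda}{2}\Delta \circ I)^{-1} stops after one step:
\frac{1}{1 + \frac{i\lambda}{2}\Delta \circ I}(f) = u\, p - \frac{i\lambda}{2}\, u' ,
\qquad
T(f) = \frac{\partial}{\partial p}\, I\Big( u\, p - \frac{i\lambda}{2}\, u' \Big) = u .

% The two pieces of Proposition 12 i):
i_0 \circ S(f) = \Big( u\, p - \frac{i\lambda}{2}\, u' \Big)\Big|_{p=0} = -\frac{i\lambda}{2}\, u' ,
\qquad
T(f) * P = u * p = u\, p + \frac{i\lambda}{2}\, u' ,

% and their sum reproduces f:
i_0 \circ S(f) + T(f) * P = u\, p = f .
```

The example also shows why $S$ appears: the naive restriction $i_0(f) = 0$ misses the term $-\frac{i\lambda}{2}u'$ produced when $u\,p$ is rewritten as a star product with $P$ on the right.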


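Corollary 4. The quotient space $\mathfrak{H}_0 := C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle/\mathcal{J}_0$ with the Hermitian product $\langle \cdot, \cdot \rangle$ induced by $\omega_0$ is canonically isometric to $C_0^\infty(Q)\langle\langle\lambda\rangle\rangle$ with the Hermitian product

$$\langle \psi, \phi \rangle := \int_{\mathbb{R}^n} \overline{\psi(q)}\, \phi(q)\, d^n q, \qquad \psi, \phi \in C_0^\infty(Q)\langle\langle\lambda\rangle\rangle, \tag{61}$$

and the canonical isometry $r$ is induced by $r_0$: for $\psi_f \in \mathfrak{H}_0$ we set $r(\psi_f) := r_0 \circ S(f)$, and the inverse of $r$ is simply the pull-back $\pi_Q^* \bmod \mathcal{J}_0$.

Now we consider the GNS representation $\pi_0$ induced by $\omega_0$ on $\mathfrak{H}_0$. First we notice that according to Corollary 5 not only $C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$ could be represented but the whole algebra $C^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$, since $C_Q^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$ is a two-sided ideal and $\mathcal{J}_0$ is a left ideal of $C^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$. Let $\varrho$ be the corresponding representation on $C_0^\infty(Q)\langle\langle\lambda\rangle\rangle$, i.e. $\varrho(f) := r \circ \pi_0(f) \circ \pi_Q^*$.

Theorem 6 (Formal Schrödinger Quantization). Let $Q = \mathbb{R}^n$, let $\psi \in C_0^\infty(Q)\langle\langle\lambda\rangle\rangle$ be a “formal wave function” on the configuration space $Q$, and let $f \in C^\infty(T^*Q)\langle\langle\lambda\rangle\rangle$.

i)
$$\varrho(f)\psi = \sum_{r=0}^\infty \frac{1}{r!} \Big( \frac{\lambda}{i} \Big)^r \frac{\partial^r (S f)}{\partial p_{i_1} \cdots \partial p_{i_r}}\Big|_{p=0}\, \frac{\partial^r \psi}{\partial q^{i_1} \cdots \partial q^{i_r}}. \tag{62}$$

ii) For polynomials in $q^1, \ldots, p_n$ the representation $\varrho$ is the canonical quantization rule, i.e.

$$\varrho(q^k) = q^k, \qquad \varrho(p_k) = \frac{\lambda}{i} \frac{\partial}{\partial q^k}, \tag{63}$$

and the polynomials are mapped to the Weyl ordered polynomials of the corresponding operators (symmetrization rule).

iii) The $\mathbb{C}\langle\langle\lambda\rangle\rangle$-pre-Hilbert space $C_0^\infty(Q)\langle\langle\lambda\rangle\rangle$ with the Hermitian product (61) is already Cauchy complete and hence a $\mathbb{C}\langle\langle\lambda\rangle\rangle$-Hilbert space.

iv) $\mathfrak{H}_0$ does not admit a (countable or uncountable) Hilbert base (for $n \ge 1$).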
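Parts i) and ii) of Theorem 6 can be illustrated on the simplest polynomial that is sensitive to ordering. The following sketch takes $n = 1$ and $f = qp$. It assumes $S = \exp(-\frac{i\lambda}{2}\Delta)$ with $\Delta = \partial_q \partial_p$, which is the sign convention suggested by Proposition 12, and it uses only (62) and (63).

```latex
% n = 1, f(q, p) = q p:
S f = q p - \frac{i\lambda}{2} .

% Only r = 0 and r = 1 contribute in (62):
\varrho(q p)\, \psi = -\frac{i\lambda}{2}\, \psi + \frac{\lambda}{i}\, q\, \frac{\partial \psi}{\partial q}
                   = \frac{\lambda}{i} \Big( q\, \frac{\partial}{\partial q} + \frac{1}{2} \Big) \psi .

% Symmetrized product of \varrho(q) = q and \varrho(p) = \tfrac{\lambda}{i}\partial_q from (63):
\frac{1}{2} \big( \varrho(q)\varrho(p) + \varrho(p)\varrho(q) \big) \psi
  = \frac{\lambda}{i} \Big( q\, \frac{\partial}{\partial q} + \frac{1}{2} \Big) \psi .
```

Both expressions coincide, so in this case $\varrho(qp)$ is the Weyl ordered operator, while the standard ordered operator $q\,\varrho(p)$ alone would miss the term $\frac{\lambda}{2i}\psi$ supplied by $S$.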

Proof. The first statement is a straightforward computation. For the second one notes first that for each nonnegative integer $k$ and $2n$ formal parameters $\alpha_1, \ldots, \alpha_n, \beta^1, \ldots, \beta^n$ the function $(\alpha_r q^r + \beta^r p_r)^k$ is assigned the operator $(\alpha_r \varrho(q^r) + \beta^r \varrho(p_r))^k$ by the Weyl symmetrization rule. Hence the formal exponential function $e_{\alpha,\beta}(q, p) := \exp(\alpha_r q^r + \beta^r p_r)$ is assigned the operator $\exp(\alpha_r \varrho(q^r) + \beta^r \varrho(p_r))$. On the other hand, the right-hand side of (62) is easily seen to be equal to the standard ordering prescription of the function $S f$, i.e. where after applying the rule (63) all the derivatives are put on the right-hand side first. Clearly, the function $S e_{\alpha,\beta} = \exp(-i\lambda\alpha\beta/2)\, e_{\alpha,\beta}$ is assigned the operator $\exp(-i\lambda\alpha\beta/2) \exp(\alpha_i \varrho(q^i)) \exp(\beta^i \varrho(p_i))$ by standard ordering. Now the Baker-Campbell-Hausdorff formula for the $(2n+1)$-dimensional Heisenberg Lie algebra easily implies that the Weyl ordered operator for $e_{\alpha,\beta}$ and the standard-ordered operator for $S e_{\alpha,\beta}$ coincide, which proves this statement. The third and fourth statements are proved in a more general context in the next theorem.

Theorem 7. Let $M$ be an orientable manifold with a volume form. Then $\mathfrak{H} := C_0^\infty(M)\langle\langle\lambda\rangle\rangle$ together with the integral (taken with respect to this volume form)

$$\langle f, g \rangle := \int_M \bar{f} g, \qquad f, g \in \mathfrak{H},$$

as Hermitian product is a $\mathbb{C}\langle\langle\lambda\rangle\rangle$-Hilbert space. Moreover $\mathfrak{H}$ does not admit a (countable or uncountable) Hilbert base if $\dim M \ge 1$.


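Proof. It is easy to see that every Cauchy sequence in $\mathfrak{H}$ with respect to the norm induced by the scalar product is a Cauchy sequence in the metric sense of Proposition 2, and vice versa, which proves completeness. Suppose there would exist a Hilbert base $\{\psi^{(\alpha)}\}_{\alpha \in I}$, $I$ some index set, for $\mathfrak{H}$. Because of its orthonormality every element is of the form $\psi^{(\alpha)} = \psi_0^{(\alpha)} + \lambda^{a_\alpha} \psi_1^{(\alpha)}$, where the $\psi_0^{(\alpha)}$ form an orthonormal family in $C_0^\infty(M)$, $a_\alpha$ is a positive rational number, and $\psi_1^{(\alpha)} \in C_0^\infty(M)\langle\langle\lambda\rangle\rangle$ with vanishing coefficients of negative $\lambda$-powers. By the usual complex $L^2$-theory for manifolds of dimension greater than or equal to 1 there are countably many different $\psi_0^{(\alpha)}$, and there is a function $f \in C_0^\infty(M)$ which is not a finite linear combination of the $\psi_0^{(\alpha)}$. If $f$ were approximated by finite sums of the $\psi^{(\alpha)}$ there would be a positive integer $N$ and $\alpha_1, \ldots, \alpha_N \in I$ such that

$$\lambda > \Big\| f - \sum_{i=1}^N \psi^{(\alpha_i)} \big\langle \psi^{(\alpha_i)}, f \big\rangle \Big\|^2 = \langle f, f \rangle - \sum_{i=1}^N \big\langle f, \psi^{(\alpha_i)} \big\rangle \big\langle \psi^{(\alpha_i)}, f \big\rangle.$$

In particular it would follow that $\langle f, f \rangle = \sum_{i=1}^N \langle f, \psi_0^{(\alpha_i)} \rangle \langle \psi_0^{(\alpha_i)}, f \rangle$, implying by Parseval’s equality for complex Hilbert spaces that $f$ is a finite linear combination of the $\psi_0^{(\alpha_i)}$, which is a contradiction.

At last we want to describe again the way back to convergence if we substitute the formal parameter $\lambda$ by $\hbar \in \mathbb{R}^+$, and again we want to ask for convergence in the representation. Let $\psi = \sum_{r=0}^\infty \lambda^{q_r} \psi_{q_r} \in C_0^\infty(Q)\langle\langle\lambda\rangle\rangle$ be a formal wave function and define the truncated series $\psi_N := \sum_{r=0}^N \lambda^{q_r} \psi_{q_r}$. Then we ask for convergence of the sequence $\psi_N$, in the sense of the locally convex topology of the smooth functions with compact support $\mathcal{D}(Q) := C_0^\infty(Q)$, if we substitute $\lambda$ by $\hbar$, and define

$$\mathcal{H}(\hbar) := \big\{ \psi \in C_0^\infty(\mathbb{R}^n)\langle\langle\lambda\rangle\rangle \;\big|\; \psi_N|_{\lambda=\hbar} \xrightarrow{\;\mathcal{D}\;} \Psi \in \mathcal{D}(Q) \text{ as } N \to \infty \big\},$$

$$\mathcal{N}(\hbar) := \big\{ \psi \in \mathcal{H}(\hbar) \;\big|\; \psi_N|_{\lambda=\hbar} \xrightarrow{\;\mathcal{D}\;} 0 \text{ as } N \to \infty \big\}.$$

Now the same procedure as in the Wick case can be done and we get, using the well-known completeness (see e.g. [46, Theorem 6.5.(g)]) of $\mathcal{D}(Q)$:

Theorem 8. With the notations of above we have:

i) $\mathcal{H}(\hbar)$ is a $\mathbb{C}$-vector space and $\mathcal{N}(\hbar)$ is a $\mathbb{C}$-subvector space of $\mathcal{H}(\hbar)$.

ii) If $\psi, \phi \in \mathcal{H}(\hbar)$ then $\langle \psi, \phi \rangle$ converges for $\lambda = \hbar$ and

$$\langle \psi, \phi \rangle \big|_{\lambda=\hbar} = \int_{\mathbb{R}^n} \overline{\Psi(q)}\, \Phi(q)\, d^n q$$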

and defines a positive semidefinite sesquilinear form for $\mathcal{H}(\hbar)$.

iii) The quotient $\mathfrak{H}(\hbar) := \mathcal{H}(\hbar)/\mathcal{N}(\hbar)$ is canonically isometric to the $\mathbb{C}$-pre-Hilbert space $\mathcal{D}(Q)$ where $\psi \bmod \mathcal{N}(\hbar) \mapsto \Psi$ is the isomorphism.

Then the usual completion of $\mathcal{D}(Q)$ with respect to the $L^2$-norm leads to the Hilbert space $L^2(\mathbb{R}^n)$. Again the GNS representation $\varrho$ induces a representation of at least “many” elements $f \in C^\infty(\mathbb{R}^{2n})\langle\langle\lambda\rangle\rangle$ on $\mathfrak{H}(\hbar) \cong \mathcal{D}(Q)$. For $f \in C^\infty(\mathbb{R}^{2n})\langle\langle\lambda\rangle\rangle$ such that $\varrho(f)\mathcal{N}(\hbar) \subseteq \mathcal{N}(\hbar)$ we define again $\mathcal{D}_f(\hbar)$ and $\mathfrak{D}_f(\hbar)$ as in the Wick case and obtain a quantization map $Q$ defined by

$$Q(f)(\psi \bmod \mathcal{N}(\hbar)) := \varrho(f)\psi \bmod \mathcal{N}(\hbar) \tag{64}$$

for $\psi \in \mathcal{D}_f(\hbar)$. Then $Q(f)$ is a linear operator defined on some domain $\mathfrak{D}_f(\hbar) \subseteq \mathfrak{H}(\hbar) \cong \mathcal{D}(\mathbb{R}^n)$ and again there are “many” elements $f$ with dense domain $\mathfrak{D}_f(\hbar)$, for example those functions which are polynomial in the momentum variables, which have $\mathfrak{D}_f(\hbar) = \mathcal{D}(Q)$, and in particular the polynomials in $q$ and $p$. In this case the polynomials are represented via $Q$ by differential operators obtained from (62) after substituting $\lambda$ by $\hbar$. Note that an analogous result holds if one substitutes “$\mathcal{D}$” by “$\mathcal{S}$” denoting Schwartz’s test function space with its locally convex topology, resp. by “$L^2$” denoting the usual square integrable functions. In the case of e.g. polynomials in $q$ and $p$ the above quantization map is the Weyl transformation, which is discussed e.g. in [34, 37], where among other things the image of the Weyl transformation applied to certain tempered distributions on $\mathbb{R}^{2n}$ is computed.

9. Open Problems

In this section we list some open problems arising with our approach:

i) It would be very interesting to see whether the approach of the WKB approximation via Lagrangian submanifolds (see e.g. [5]) contained in the energy surface of a fixed Hamiltonian function on the symplectic manifold is related to some variant of GNS construction with respect to a suitable positive linear functional whose support is contained in that Lagrangian submanifold. In the particular case of an arbitrary projectable Lagrangian submanifold of $T^*\mathbb{R}^n$ we have been able to derive a suitable GNS construction for the usual WKB expansion in [12]. A further problem arising in this context is the fact that the discrete energy eigenvalues in quantum mechanics are dependent on $\hbar$, and it is not obvious how this can be encoded in a formal way in the energy surface.

ii) We have seen in Sect. 6 that the spectrum of the harmonic oscillator can be computed in a purely formal manner before the convergence scheme is performed. It may be interesting to develop a kind of “formal spectral theory” in order to formally compute e.g. discrete spectra depending on $\lambda$ and then deal with the convergence.

iii) Related to this question one may ask more generally to what extent there is some reasonable functional analysis in these formal Hilbert spaces: i.e. which (possibly weaker) topologies are more suited for the definition and calculation of spectra and convergence properties. The usual literature on p-adic functional analysis (see e.g. [38, 33]) and p-adic formulations of quantum physics (see e.g. [23, 24, 2, 1]) does unfortunately not seem to deal with Hilbert spaces in the sense described in our approach, but exclusively with fields carrying absolute values such as the field $\mathbb{Q}_p$ of p-adic numbers (which is non-orderable, see Appendix A), and seems to avoid ordered fields and hence the notion of positivity we need (cf. also the remark in [1, p. 5517]).

iv) Finally, it may be interesting to see whether the prequantum line bundles of geometric quantization over, say, a compact prequantizable Kähler manifold (see e.g. [43, 14, 15]) are related to this construction (perhaps after employing a convergence scheme) and if yes, whether they can be constructed that way.


A. Some Analytic Properties of Ordered Fields

In this appendix we shall examine some standard definitions of topology and calculus and transfer them from the case of the real numbers ℝ to the more general case of an arbitrary ordered field R. Again all the statements can be proved analogously using only the ordering axioms. First of all we define ε-balls around any point in R by means of the ordering relation and notice that the ordering induces a topology and a uniform structure for R similar to the topology and the uniform structure of a metric space [35, Chap. 6].

Definition 8. Let R be an ordered field. For any x ∈ R and any 0 < ε ∈ R we define the ε-ball around x by

B_ε(x) := {y ∈ R | |x − y| < ε}. (65)

The set of all ε-balls is denoted by B := {B_ε(x) ⊆ R | 0 < ε ∈ R, x ∈ R}. Furthermore we define for 0 < ε ∈ R,

U_ε := {(x, y) ∈ R × R | |x − y| < ε} ⊆ R × R, (66)

and U′ := {U_ε ⊆ R × R | 0 < ε ∈ R}.

Proposition 13. The set of ε-balls B is a base of a topology T for the field R such that R becomes a normal space. The set U′ is a base for a uniform structure U for R which induces the same topology T as B. Addition and multiplication are uniformly continuous maps from R × R → R.

We shall call this topology and the uniform structure induced by the ordering relation the standard topology and the standard uniform structure of the ordered field R. Using this topology we can define continuous functions f : R → R as usual.

Remark. We notice that in the topology induced by the ordering relation the intervals may neither be connected nor compact in general. As an example we consider the field of formal Laurent series ℝ((λ)) over the real numbers and the open balls B_{nλ}(r) with r ∈ ℝ and n ∈ ℕ. Then for real a < b the closed interval [a, b] := {x | a ≤ x ≤ b} ⊂ ℝ((λ)) in the Laurent field is contained in

[a, b] ⊂ ⋃_{n∈ℕ, r∈[a,b]⊂ℝ} B_{nλ}(r),

but B_{nλ}(r) ∩ B_{n′λ}(r′) = ∅ if r ≠ r′. Hence [a, b] ⊂ ℝ((λ)) is neither compact nor connected.

Another important concept is the supremum and infimum of a bounded subset in an ordered field. We define them as usual: A subset U ⊆ R is called bounded iff there is a C ∈ R such that |x| ≤ C for all x ∈ U. Then C is called a bound for U. Analogously we define upper and lower bounds. C is called the supremum (infimum) of U iff C is the smallest upper bound (largest lower bound), i.e. for all 0 < ε ∈ R there exists x ∈ U such that x > C − ε (x < C + ε). In an archimedian ordered field the existence of a supremum or infimum of a bounded set can be proved iff the field is Cauchy complete. On the other hand, in a non-archimedian ordered field neither the supremum nor the infimum of a bounded set exist in general [48, p. 245].
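The Remark above on the Laurent field can be checked by hand. The following is only a sketch under the assumption, not restated in this appendix, that the ordering of ℝ((λ)) makes λ a positive infinitesimal (so that nλ is smaller than every positive real number); the coefficients x_0, x_1, … are names introduced here for illustration.

```latex
% Assumption: \lambda > 0 is infinitesimal in \mathbb{R}((\lambda)), i.e.
% 0 < n\lambda < s for all integers n \ge 1 and all real s > 0.
\begin{align*}
x \in [a,b] \;&\Longrightarrow\; x = x_0 + x_1\lambda + x_2\lambda^2 + \cdots,
  \qquad x_0 \in [a,b] \subset \mathbb{R}, \\
n > |x_1| \;&\Longrightarrow\; |x - x_0| = |x_1\lambda + x_2\lambda^2 + \cdots| < n\lambda
  \;\Longrightarrow\; x \in B_{n\lambda}(x_0), \\
y \in B_{n\lambda}(r) \cap B_{n'\lambda}(r') \;&\Longrightarrow\;
  |r - r'| \le |r - y| + |y - r'| < (n + n')\lambda
  \;\Longrightarrow\; r = r'.
\end{align*}
```

Under this assumption the open sets O_r := ⋃_n B_{nλ}(r), with r ranging over the real points of [a, b], are pairwise disjoint, each contains the point r of [a, b], and together they cover [a, b]. No finite subfamily can cover, and separating one O_r from the union of the others splits [a, b] into two non-empty relatively open pieces.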


Definition 9 (Convergence, Cauchy sequences, completeness). Let R be an ordered field and (a_n)_{n∈ℕ} a sequence in R. Then (a_n) is called convergent to a ∈ R iff

∀ 0 < ε ∈ R ∃ N ∈ ℕ such that ∀ n > N : |a_n − a| < ε.

A sequence (a_n) is called a Cauchy sequence iff

∀ 0 < ε ∈ R ∃ N ∈ ℕ such that ∀ m, n > N : |a_m − a_n| < ε.

An ordered field is called Cauchy complete iff every Cauchy sequence converges in R.

It is a well-known result that for any ordered field R there exists a unique ordered field R̂ such that R is a dense subfield of R̂, the orderings are compatible and R̂ is Cauchy complete [48, p. 238]. For certain approximation statements we shall need to know whether the topology or the uniform structure is first countable or not (a topology for a set is called first countable iff any point has a countable base of neighbourhoods [35, p. 50]). In our case the existence of such a countable base of neighbourhoods is equivalent to the existence of non-trivial zero-sequences:

Proposition 14. Let R be an ordered field. Then the following properties are equivalent:

i) There is a sequence (ε_n)_{n∈ℕ} such that ε_n > 0 for all n ∈ ℕ and ε_n → 0.
ii) The standard uniform structure U has a countable base.
iii) The standard topology T is first countable.
iv) The standard uniform structure U and the standard topology T can be induced by a metric.

Proof. The implications i) ⇒ ii) ⇒ iii) are obvious and iii) ⇒ i) is proved by constructing the zero sequence using a given countable neighbourhood base. The fourth part is equivalent to the second one for general reasons since T is Hausdorff according to Proposition 13 (cf. e.g. [35, p. 186]).

Although the supremum of a bounded subset does not exist in general we can easily prove the existence of a sequence in a bounded subset with supremum which converges to the supremum if the field is first countable:

Lemma 13. Let R be an ordered field such that the standard topology is first countable. If for A ⊂ R the supremum sup A exists then there is a sequence (a_n) with elements in A such that a_n → sup A.

In the theory of fields another possibility to define a metric topology is an absolute value:

Definition 10 (Absolute value [31, p. 558]). Let R be a field. An absolute value ϕ is a map ϕ : R → ℝ⁺ ∪ {0} such that for all a, b ∈ R,

i) ϕ(a) = 0 ⇐⇒ a = 0.
ii) ϕ(ab) = ϕ(a)ϕ(b).
iii) ϕ(a + b) ≤ ϕ(a) + ϕ(b).

An absolute value ϕ is called non-archimedian iff ϕ(a + b) ≤ max(ϕ(a), ϕ(b)) and archimedian if this is not the case. An absolute value is called trivial iff ϕ(0) = 0 and ϕ(a) = 1 for all 0 ≠ a ∈ R.
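As an illustration of Definition 10, here is a sketch of a non-trivial, non-archimedian absolute value on the field of formal Laurent series ℝ((λ)) used in the Remark above. The order function o and the base 2 are choices made here for the example and are not taken from the text.

```latex
% Hypothetical illustration: o(a) denotes the lowest exponent of \lambda
% occurring with non-zero coefficient in a \neq 0; the base 2 is arbitrary.
\begin{align*}
\varphi(a) &:= 2^{-o(a)} \quad (a \neq 0), \qquad \varphi(0) := 0, \\
o(ab) = o(a) + o(b) \;&\Longrightarrow\; \varphi(ab) = \varphi(a)\,\varphi(b), \\
o(a+b) \ge \min\bigl(o(a), o(b)\bigr) \;&\Longrightarrow\;
  \varphi(a+b) \le \max\bigl(\varphi(a), \varphi(b)\bigr), \\
\varphi(\lambda) &= \tfrac12, \qquad \varphi(\lambda^n) = 2^{-n} \longrightarrow 0 .
\end{align*}
```

The second implication is stated for a + b ≠ 0; for a + b = 0 the ultrametric inequality holds trivially. Since ϕ(λ) is neither 0 nor 1, this absolute value is non-trivial.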


With the help of an absolute value one can define a metric on R for a, b ∈ R by

d_ϕ(a, b) := ϕ(a − b). (67)

Recall that if the absolute value is non-archimedian then the metric d_ϕ is an ultrametric, i.e. d_ϕ(a, b) ≤ max(d_ϕ(a, c), d_ϕ(c, b)) [38, p. 6]. If ϕ : R → ℝ is an absolute value we denote by T_ϕ and U_ϕ the topology and the uniform structure induced by the corresponding metric d_ϕ. The open metric balls around x ∈ R with radius 0 < ε ∈ ℝ are denoted by B^ϕ_ε(x) and analogously we define U^ϕ_ε := {(x, y) ∈ R × R | d_ϕ(x, y) < ε} for 0 < ε ∈ ℝ. Then the metric balls B^ϕ_ε(x) and the U^ϕ_ε form a base for the topology T_ϕ and the uniform structure U_ϕ. As in the case of an ordered field the field R with absolute value ϕ can be completed with respect to d_ϕ in the usual sense of metric completion.

Remark. It is well-known that the field ℚ_p of p-adic numbers (which is the metric completion of the field ℚ of rational numbers by means of the p-adic absolute value [32, p. 558]) is not orderable: note that the series

z := 1/√(1 − 4p) := Σ_{k=0}^{∞} \binom{2k}{k} p^k

converges in ℚ_p and −1 = (4p − 1)z², hence −1 is a sum of squares in ℚ_p.

In the case of an ordered field R we ask now for an absolute value ϕ such that ϕ : R → ℝ is not only continuous with respect to T but such that T_ϕ = T and U_ϕ = U. If this is the case we call the absolute value compatible with the ordering of R.

Lemma 14. Let R be an ordered field and ϕ : R → ℝ an absolute value. Then T = T_ϕ ⇐⇒ U = U_ϕ.

Proof. The inverse implication being trivial, assume that T = T_ϕ. Then for all 0 < ε ∈ ℝ there exists 0 < δ ∈ R such that B_δ(0) ⊂ B^ϕ_ε(0). This implies U_δ ⊂ U^ϕ_ε and vice versa.

If there is such a compatible absolute value then the topology T = T_ϕ of R is first countable since it is induced by a metric and hence we can apply Proposition 14. Furthermore ϕ is continuous which implies that ϕ is non-trivial. This implies that there are elements ε ∈ R such that 0 < ϕ(ε) < 1 and hence ϕ(ε)ⁿ → 0. Thus εⁿ → 0 in the field R since the topologies T_ϕ and T are the same. Hence we have proved the following lemma:

Lemma 15. Let R be an ordered field and ϕ : R → ℝ a compatible absolute value. Then there exists 0 < ε ∈ R such that εⁿ → 0.

At last we shall consider ordered fields and their real closure. A field R is called real closed iff it is ordered and any positive element has a square root in R and every polynomial of odd degree with coefficients in R has a root in R [31, p. 308]. Let R be an ordered field. An extension field R̂ of R is called real closure of R iff R̂ is real closed and algebraic over R and the (unique) order of R̂ is an extension of the order in R [32, p. 655]. Such a real closure exists for every ordered field and is unique up to isomorphisms [32, p. 656]. Furthermore for a real closed field one can prove the “fundamental theorem of algebra”: If R is a real closed field then the quadratic field extension C = R(i) with i² := −1 is algebraically closed [31, p. 309]. For the quadratic field extension C = R(i) of a real closed field R we define

|z| := √(z z̄) = √(a² + b²) (68)

for z = a + ib ∈ C where a, b ∈ R using the square root. Then |z| ∈ R and for the real elements a ∈ R this definition coincides with the previous definition since √(a²) = |a|.


Furthermore we have |z| = 0 ⇐⇒ z = 0, |zw| = |z||w| and |z + w| ≤ |z| + |w| for z, w ∈ C and hence | · | induces a topology on C which is compatible with the standard topology of the ordered field R. Again we will call this topology for C the standard topology. Then the field R is Cauchy complete with respect to the standard topology iff its quadratic extension C is Cauchy complete with respect to the standard topology.

B. Hilbert Space Formulation Based on Ordered Fields and the Generalized GNS Construction

In this appendix we shall give a generalization of the usual definition of a Hilbert space over ℂ which leads to the definition of a Hilbert space over C = R(i), where R is an ordered field which is Cauchy complete and real closed, and formulate the generalized GNS construction based thereon. Let H_C be a pre-Hilbert space over C = R(i) where R is a real closed field. In this case we can define an R-norm for vectors in H_C that takes values in R by

‖φ‖ := √⟨φ, φ⟩ (69)

for φ ∈ H_C. Then we clearly have for φ, ψ ∈ H_C and a ∈ C:

‖φ‖ ≥ 0 and ‖φ‖ = 0 ⇐⇒ φ = 0,
‖aφ‖ = |a| ‖φ‖,
‖φ + ψ‖ ≤ ‖φ‖ + ‖ψ‖,
‖φ + ψ‖² + ‖φ − ψ‖² = 2‖φ‖² + 2‖ψ‖². (70)

With this R-norm the pre-Hilbert space H_C becomes a topological vector space where we define a base for the topology by the ε-balls with respect to ‖·‖. Note that H_C is not a normed vector space in the usual sense since ‖·‖ takes values in R and not in ℝ (in contrast, for example, to Kalisch’s definition of p-adic Hilbert spaces [33, p. 181]). Note also that in order to define the Hilbert norm (69) we need the existence of square roots of every element of C which motivates the use of algebraically closed fields. An easy consequence is the following lemma:

Lemma 16. Let H_C be a pre-Hilbert space over C = R(i) where R is a real closed field. Then the Hermitian product ⟨· , ·⟩ : H_C × H_C → C is continuous with respect to ‖·‖.

By means of ‖·‖ we can define Cauchy sequences in H_C and this leads to the definition of a Hilbert space:

Definition 11 (Hilbert space). Let C = R(i) be the quadratic field extension of a real closed field R and H_C a pre-Hilbert space over C with R-norm ‖·‖. Then H_C is called a Hilbert space over C iff every Cauchy sequence in H_C converges in H_C with respect to ‖·‖.

Lemma 17. Let {0} ≠ H_C be a Hilbert space over C = R(i) where R is a real closed field. Then C and R are Cauchy complete.

Due to this lemma we should always assume that C is not only algebraically closed but also Cauchy complete if we want to consider Hilbert spaces over C. In a next step we want to construct the completion of a pre-Hilbert space to a Hilbert space.
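Lemma 17 is stated without proof. One possible argument, given here only as a sketch and not taken from the text, fixes a vector φ ≠ 0 in H_C and a Cauchy sequence (a_n) in C; it assumes the Hermitian product is linear in its second argument, which matches the expansions used in Theorem 9.

```latex
% Sketch; assumes the Hermitian product is linear in its second argument.
\begin{align*}
\|a_n\phi - a_m\phi\| &= |a_n - a_m|\,\|\phi\|
  && \text{so } (a_n\phi) \text{ is Cauchy in } H_C, \\
a_n\phi &\longrightarrow \psi \in H_C
  && \text{by completeness of } H_C, \\
a_n \langle\phi,\phi\rangle = \langle\phi, a_n\phi\rangle
  &\longrightarrow \langle\phi,\psi\rangle
  && \text{by Lemma 16}, \\
a_n &\longrightarrow \frac{\langle\phi,\psi\rangle}{\langle\phi,\phi\rangle} \in C .
\end{align*}
```

For a Cauchy sequence in R the same argument yields a limit α + iβ in C with α, β ∈ R. Since |a_n − (α + iβ)|² = (a_n − α)² + β² ≥ β², one gets β² < ε² for every 0 < ε ∈ R, hence β = 0 and the limit lies in R.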


Proposition 15. Let H_C be a pre-Hilbert space over C = R(i) where R is real closed and Cauchy complete. Then up to unitary equivalence there is one Hilbert space Ĥ_C such that there is an isometry i : H_C → Ĥ_C and i(H_C) is dense in Ĥ_C.

Let H_C and K_C be Hilbert spaces over C and T : H_C → K_C a linear map. Then clearly T is continuous iff it is continuous at some point φ₀ ∈ H_C since T is linear.

Lemma 18. Let H_C and K_C be Hilbert spaces over C = R(i) where R is a real closed and Cauchy complete field. For a linear map T : H_C → K_C we have: T is continuous iff there exists C ∈ R such that ‖Tφ‖ ≤ C ‖φ‖ for all φ ∈ H_C (T is bounded).

Remark. For continuous linear maps T between Hilbert spaces over ℂ one usually defines an operator norm by the supremum of ‖Tφ‖ / ‖φ‖, where φ ≠ 0 ranges over the Hilbert space. But in the case of a non-archimedian ordered field such a supremum does not exist in general though T is bounded.

In the case that the standard topology of the field R is first countable we have the following important property of C-Hilbert spaces:

Lemma 19. Let H_C be a C-Hilbert space where C = R(i) and R is real closed, Cauchy complete and first countable. Let W be a subspace of H_C. If W is dense in H_C then for any φ ∈ H_C there exists a sequence φ_n ∈ W with φ_n → φ.

An important concept in the usual theory of Hilbert spaces are the Hilbert bases. This leads to the following generalizations: Let H_C be a C-Hilbert space where C = R(i) and R is a real closed and Cauchy complete field. Let I be an index set and let ℱ be the set of all finite subsets of I. Let {e_k}_{k∈I} be a set of vectors in H_C; then {e_k}_{k∈I} is called an orthonormal system iff ⟨e_k, e_{k′}⟩ = δ_{kk′} for all k, k′ ∈ I. In this case the set {e_k}_{k∈I} is clearly linearly independent in H_C. This leads to the definition of a Hilbert base:

Definition 12 (Hilbert base). Let H_C be a Hilbert space and let {e_k}_{k∈I} be an orthonormal set where C = R(i) and R is real closed, Cauchy complete and first countable. Then {e_k}_{k∈I} is called a Hilbert base for H_C iff C-span{e_k}_{k∈I} is dense in H_C with respect to the topology induced by ‖·‖.

The following theorem is proved in a completely analogous fashion to the usual proofs in textbooks on functional analysis (see e.g. [50, p. 86]).

Theorem 9. Let H_C be a C-Hilbert space where C = R(i) and R is a real closed and Cauchy complete field. Let {e_k}_{k∈I} be an orthonormal set. Then we have Bessel’s inequality

Σ_{k∈F} |⟨e_k, φ⟩|² ≤ ⟨φ, φ⟩ ∀φ ∈ H_C, ∀F ∈ ℱ (71)

and for all φ ∈ H_C the following “best approximation property”:

‖φ − Σ_{k∈F} α_k e_k‖ ≥ ‖φ − Σ_{k∈F} ⟨e_k, φ⟩ e_k‖ (72)

for all α_k ∈ C, where equality holds iff α_k = ⟨e_k, φ⟩. Now let in addition R be first countable, let I = ℕ, and let {e_k}_{k∈ℕ} be a Hilbert base for H_C. Then there exists a sequence φ_n = Σ_{k=1}^{N_n} α_k^{(n)} e_k ∈ C-span{e_k}_{k∈I} with φ_n → φ for any φ ∈ H_C and

φ = lim_{N→∞} Σ_{k=1}^{N} ⟨e_k, φ⟩ e_k =: Σ_{k=1}^{∞} ⟨e_k, φ⟩ e_k, (73)

and we have Parseval’s equation:

‖φ‖² = Σ_{k=1}^{∞} |⟨e_k, φ⟩|², ⟨φ, ψ⟩ = Σ_{k=1}^{∞} ⟨φ, e_k⟩ ⟨e_k, ψ⟩. (74)

Moreover if α_k ∈ C then Σ_k α_k e_k converges to a vector in H_C iff Σ_k |α_k|² converges in R. An orthonormal system {e_k}_{k∈I} is a Hilbert base for H_C iff Parseval’s equation holds for any φ ∈ H_C.

As a (generic) example we will consider the ℓ²-space of a field C = R(i) where R is real closed, Cauchy complete and first countable:

ℓ²(C) := { (a_k)_{k∈ℕ} | a_k ∈ C such that Σ_{k=1}^{∞} |a_k|² converges in R } (75)

together with the ℓ²-product

⟨a, b⟩ := Σ_{k=1}^{∞} ā_k b_k. (76)
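That the series in (76) converges for a, b ∈ ℓ²(C) can be sketched as follows, assuming the Cauchy–Schwarz inequality for finite sums over C (it holds in any pre-Hilbert space over C = R(i) by the usual algebraic argument). The partial sums S_N are a name introduced here for illustration.

```latex
% Sketch; S_N denotes the N-th partial sum of (76), a name introduced here.
\begin{align*}
S_N &:= \sum_{k=1}^{N} \overline{a_k}\, b_k, \\
|S_N - S_M|^2 &= \Bigl|\sum_{k=M+1}^{N} \overline{a_k}\, b_k\Bigr|^2
  \le \Bigl(\sum_{k=M+1}^{N} |a_k|^2\Bigr)\Bigl(\sum_{k=M+1}^{N} |b_k|^2\Bigr)
  \qquad (M < N).
\end{align*}
```

Since the partial sums of Σ|a_k|² and Σ|b_k|² converge in R, they are Cauchy, so for any 0 < ε ∈ R both factors on the right are smaller than ε once M and N are large enough. Then |S_N − S_M|² < ε², hence |S_N − S_M| < ε, so (S_N) is a Cauchy sequence in C and converges because C is Cauchy complete.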

Proposition 16. Let C = R(i) be the quadratic field extension of R where R is real closed, Cauchy complete and first countable. Then ℓ²(C) is a C-Hilbert space and the vectors ê_k := (0, . . . , 0, 1, 0, . . .) (where 1 is the k-th entry) form a countable Hilbert base for ℓ²(C).

Proposition 17. Let H_C and K_C be Hilbert spaces where C = R(i) and R is real closed, Cauchy complete and first countable. Let {e_k}_{k∈I} be a Hilbert base for H_C.

i) If U : H_C → K_C is a unitary map then {f_k}_{k∈I} with f_k := U e_k is a Hilbert base for K_C.
ii) If {f_k}_{k∈I} is a Hilbert base of K_C then there exists a unique unitary map U : H_C → K_C such that U e_k = f_k.

For later use we finally mention the following corollary:

Corollary 5. Let H_C be a pre-Hilbert space over C = R(i) where R is real closed, Cauchy complete and first countable and let {e_k}_{k∈I} be an orthonormal system such that C-span{e_k}_{k∈I} is dense in H_C. Denote the completion of H_C by Ĥ_C. Then {e_k}_{k∈I} is a Hilbert base of Ĥ_C and Ĥ_C is unitary equivalent to ℓ²(C), i.e. there is a unitary map U : Ĥ_C → ℓ²(C).

We are now in the position to formulate a generalized GNS construction (analogous to the GNS pre-construction formulated in Proposition 1) taking into account the analytical properties of the field C. Using Proposition 15 to complete the GNS pre-Hilbert space (11) we get the following easy consequence of the GNS pre-construction (Proposition 1):


Theorem 10 (General GNS construction). Let R be a real closed and Cauchy complete field and C = R(i) and let A be a C-algebra with involution *. Then for any positive linear functional ω : A → C there exists a Hilbert space Ĥ_ω with a dense subspace H_ω carrying a *-representation π_ω of A as constructed in Proposition 1. We shall call this representation the GNS representation in Ĥ_ω. If A in addition has a unit element 1 and ω is a state, then this representation is cyclic and we have ω(A) = ⟨ψ_1, π_ω(A)ψ_1⟩, and this property defines this representation up to unitary equivalence.

Acknowledgement. The authors would like to thank K. Fredenhagen for asking the question of the GNS construction in deformation quantization and suggesting to us to concentrate on the algebraic properties of C*-algebras. Moreover we would like to thank O. Kegel for pointing out references [28, 27, 36]. Furthermore we would like to thank the referee for many useful and detailed comments and making us aware of references [30, 34, 37, 2, 23, 24]. Finally, we would like to thank M. Flato, K. Fredenhagen, J. Huebschmann, H. Rehren, and A. Weinstein for valuable discussions, and J. Hoppe and M. Lledo for a charming table-football match.

References

1. Albeverio, S., Khrennikov, A.: Representations of the Weyl group in spaces of square integrable functions with respect to p-adic valued Gaussian distributions. J. Phys. A 29, 5515–5527 (1996)
2. Aref’eva, I. Y., Volovich, I. V.: Quantum group particles and non-archimedian geometry. Phys. Lett. B 268, 179–187 (1991)
3. Basart, H., Flato, M., Lichnerowicz, A., Sternheimer, D.: Deformation Theory applied to Quantization and Statistical Mechanics. Lett. Math. Phys. 8, 483–494 (1984)
4. Basart, H., Lichnerowicz, A.: Conformal Symplectic Geometry, Deformations, Rigidity and Geometrical (KMS) Conditions. Lett. Math. Phys. 10, 167–177 (1985)
5. Bates, S., Weinstein, A.: Lectures on the Geometry of Quantization. Berkeley Mathematics Lecture Notes, Vol. 8 (1995)
6. Bayen, F., Flato, M., Fronsdal, C., Lichnerowicz, A., Sternheimer, D.: Deformation Theory and Quantization. Ann. Phys. 111, part I: 61–110, part II: 111–151 (1978)
7. Berezin, F.: Quantization. Izv. Mat. NAUK 38, 1109–1165 (1974)
8. Bertelson, M., Cahen, M., Gutt, S.: Equivalence of Star Products. Université Libre de Bruxelles, Travaux de Mathématiques, Fascicule 1, 1–15 (1996)
9. Bordemann, M., Brischle, M., Emmrich, C., Waldmann, S.: Phase Space Reduction for Star-Products: An Explicit Construction for CPⁿ. Lett. Math. Phys. 36, 357–371 (1996)
10. Bordemann, M., Brischle, M., Emmrich, C., Waldmann, S.: Subalgebras with Converging Star Products in Deformation Quantization: An Algebraic Construction for CPⁿ. J. Math. Phys. 37 (12), 6311–6323 (1996)
11. Bordemann, M., Waldmann, S.: A Fedosov Star Product of Wick Type for Kähler manifolds. Lett. Math. Phys. 41, 243–253 (1997)
12. Bordemann, M., Waldmann, S.: Formal GNS Construction and WKB Expansion in Deformation Quantization. In: Sternheimer, D., Rawnsley, J., Gutt, S. (eds.): Deformation Theory and Symplectic Geometry. Dordrecht: Kluwer, 1997, 315–319
13. Bratteli, O., Robinson, D. W.: Operator Algebras and Quantum Statistical Mechanics 1, Second edition. Berlin–Heidelberg–New York: Springer Verlag, 1987
14. Cahen, M., Gutt, S., Rawnsley, J.: Quantization of Kähler Manifolds I. J. Geom. Phys. 7, 45–62 (1990)
15. Cahen, M., Gutt, S., Rawnsley, J.: Quantization of Kähler Manifolds. II. Trans. Am. Math. Soc. 337, 73–98 (1993)
16. Connes, A.: Noncommutative Geometry. San Diego CA: Academic Press, Inc. 1994
17. Connes, A., Flato, M., Sternheimer, D.: Closed Star Products and Cyclic Cohomology. Lett. Math. Phys. 24, 1–12 (1992)
18. DeWilde, M., Lecomte, P. B. A.: Existence of star-products and of formal deformations of the Poisson Lie Algebra of arbitrary symplectic manifolds. Lett. Math. Phys. 7, 487–496 (1983)
19. DeWilde, M., Lecomte, P. B. A.: Formal Deformations of the Poisson Lie Algebra of a Symplectic Manifold and Star Products. Existence, Equivalence, Derivations. In: Hazewinkel, M., Gerstenhaber, M. (eds): Deformation Theory of Algebras and Structures and Applications. Dordrecht: Kluwer, 1988.


20. Doplicher, S., Fredenhagen, K., Roberts, J. E.: The Quantum Structure of Spacetime at the Planck Scale and Quantum Fields. Commun. Math. Phys. 172, 187–220 (1995)
21. Fedosov, B.: A Simple Geometrical Construction of Deformation Quantization. J. Diff. Geom. 40, 213–238 (1994)
22. Fedosov, B.: Deformation Quantization and Index Theory. Berlin: Akademie Verlag, 1996
23. Freund, P. G. O., Olson, M.: Non-Archimedian strings. Phys. Lett. B 199, 186–190 (1987)
24. Freund, P. G. O., Witten, E.: Adelic string amplitudes. Phys. Lett. B 199, 191–195 (1987)
25. Gerstenhaber, M., Schack, S.: Algebraic Cohomology and Deformation Theory. In: Hazewinkel, M., Gerstenhaber, M. (eds): Deformation Theory of Algebras and Structures and Applications. Dordrecht: Kluwer, 1988
26. Griffiths, P., Harris, J.: Principles of Algebraic Geometry. New York: John Wiley, 1978
27. Gross, H.: Quadratic Forms in Infinite Dimensional Vector Spaces. Boston, Basel, Stuttgart: Birkhäuser, 1979
28. Gross, H., Keller, A.: On the Definition of Hilbert Space. Manuscripta Math. 23, 67–90 (1977)
29. Haag, R.: Local Quantum Physics. Second edition. Berlin: Springer, 1993
30. Hansen, F.: Quantum Mechanics in Phase Space. Rep. Math. Phys. 19, 361–381 (1984)
31. Jacobson, N.: Basic Algebra I. Second ed. New York: Freeman and Co., 1985
32. Jacobson, N.: Basic Algebra II. Second ed. New York: Freeman and Co., 1985
33. Kalisch, G. K.: On p-adic Hilbert Spaces. Ann. Math. 48, 180–192 (1947)
34. Kammerer, J. B.: Analysis of the Moyal product in a flat space. J. Math. Phys. 27 (2), 529–535 (1986)
35. Kelley, J. L.: General Topology. GTM 27, New York: Springer (Reprint of the 1955 edition)
36. Keller, H. A.: Ein nicht-klassischer Hilbertscher Raum. Math. Z. 172, 41–49 (1980)
37. Maillard, J. M.: On the twisted convolution product and the Weyl transformation of tempered distributions. J. Geom. Phys. 3, 231–261 (1986)
38. Narici, N., Beckenstein, E., Bachman, G.: Functional Analysis and Valuation Theory. New York: Marcel Dekker, 1971
39. Nest, R., Tsygan, B.: Algebraic Index Theorem. Commun. Math. Phys. 172, 223–262 (1995)
40. Nest, R., Tsygan, B.: Algebraic Index Theorem for Families. Adv. Math. 113, 151–205 (1995)
41. Omori, H., Maeda, Y., Yoshioka, A.: Weyl manifolds and deformation quantization. Adv. Math. 85, 224–255 (1991)
42. Pflaum, M. J.: Local Analysis of Deformation Quantization. Ph.D. thesis, Fakultät für Mathematik der Ludwig-Maximilians-Universität, München, 1995
43. Rawnsley, J.: Coherent States and Kähler Manifolds. Quart. J. Oxford (2), 28, 403–415 (1977)
44. Rubio, R.: Algèbres associatives locales sur l’espace des sections d’un fibré en droites. C.R.A.S. t.299, Série I, 699–701 (1984)
45. Rudin, W.: Real and Complex Analysis. 3rd edition. New York: McGraw-Hill, 1987
46. Rudin, W.: Functional Analysis. 2nd edition. New York: McGraw-Hill, 1991
47. Ruiz, J. M.: The Basic Theory of Power Series. Braunschweig: Vieweg-Verlag, 1993
48. v. d. Waerden: Algebra I. Berlin–Heidelberg–New York: Springer-Verlag, 1993
49. Whitney, H.: Analytic extensions of differentiable functions defined on closed sets. Trans. Am. Math. Soc. 36, 63–89 (1934)
50. Yosida, K.: Functional Analysis. Berlin: Springer, 1980

Communicated by A. Connes

This article was processed by the author using the LaTeX style file cljour from Springer-Verlag.

Commun. Math. Phys. 195, 585 – 612 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Resolvent Estimates for a Diatomic Molecule in the Born–Oppenheimer Approximation

Thierry Jecko*

Fachbereich Mathematik MA 7-2, Technische Universität Berlin, Straße des 17. Juni 136, D-10623 Berlin, Germany. E-mail: [email protected]

Received: 11 July 1997 / Accepted: 18 December 1997

Abstract: Making use of an adiabatic operator that takes several electronic states into account, we derive a Born–Oppenheimer approximation of the resolvent for a diatomic molecule. This improves a result in [KMW1]. Such a resolvent approximation is useful for obtaining an adiabatic approximation of total cross-sections (see [Jec2]). The strategy we use, based on Mourre’s commutator method and on a new kind of global escape function, may be carried over to control the resolvent of some matrix-valued Schrödinger operators. In the same way, we obtain a semiclassical estimate for the resolvent of the semiclassical Dirac operator with scalar electric potential, extending a result of [Ce].

Contents

1 Introduction
2 Preliminaries
3 Semiclassical estimate of the resolvent of P
4 Semiclassical approximation of the channel wave operators
5 Other uses of a global escape or “multi-escape” function
References

1. Introduction

In 1927, M. Born and R. Oppenheimer introduced what was to become the Born–Oppenheimer approximation (cf. [BO]), an essential approach in molecular chemistry. In order to describe molecular energy levels, their idea was to exploit the fact that the ratios of the electron mass to the nuclear masses are very small, and to carry out expansions in powers of h, a small parameter related to these ratios.

* Previous address: Département de mathématiques, Université de Nantes, 44072 Nantes cedex 03, France. E-mail: [email protected]


In scattering theory, one can take up this idea, reduce the problem to a semiclassical study, and thereby benefit from recent developments in that field. The present work is motivated by the study of the Born–Oppenheimer approximation of total cross-sections for a diatomic molecule (cf. [Jec2]). In order to be able to consider inelastic scattering processes, we improve here the approximation of the resolvent obtained in [KMW1]. By means of an adiabatic operator that takes several electronic states into account, and under a “non-crossing” assumption on the corresponding electronic levels, we establish the same approximation of the resolvent as in [KMW1], in a more general situation. As in [KMW1], we deduce from it an adiabatic approximation for certain channel wave operators.

These semiclassical resolvent estimates hold near an energy that is non-trapping for certain classical Hamiltonians, which are not, as in [W1], the sub-Hamiltonians of the N-body system. They are obtained by Mourre’s method (cf. [Mo]). In contrast to the resolvent estimates (cf. [PSS]) used in [Ra] for the study of wave operators and in [CT] for that of the scattering amplitude, the weights appearing in the present estimates do not contain all the variables (as in [KMW1]). On the other hand, we restrict ourselves here to the case where the potentials are regular, whereas Coulomb singularities are allowed in [KMW2] and the potentials are Coulombic in [CT] and [Ra].

Concerning the Born–Oppenheimer approximation, introduced in [BO], we refer the reader to the references given in [KMSW] and [Jec1]. For the time-dependent Born–Oppenheimer approximation, much work has been done. On this subject one may consult [H] as well as the references cited in that memoir.
The details of Mourre’s commutator method can be found in [Mo, JMP]. Information on the Dirac operator can be found in [T, W2]. Finally, let us mention a recent work (cf. [No]) on the scattering amplitude for the Dirac operator with magnetic field, in which semiclassical resolvent estimates are established.

Let us now describe the system under study. We consider a diatomic molecule with N electrons. Quantum mechanics predicts that the behaviour of this molecule is governed by its energy operator, the self-adjoint operator acting in L²(ℝ^{3(N+2)}),

H̃ = −(1/(2m₁)) Δ_{x₁} − (1/(2m₂)) Δ_{x₂} + Σ_{j=3}^{N+2} (−(1/2) Δ_{x_j}) + Σ_{l<j} V_{lj}(x_l − x_j),

where the mass of the electrons (3 ≤ j ≤ N + 2) and Planck’s constant have been set to 1. The respective masses of the two nuclei, m₁ and m₂, are thus large compared with 1, and the real functions V_{lj} represent the pairwise interactions between particles. More generally, we assume that the configuration space in which the particles move has dimension n ≥ 2. The preceding operator therefore acts in L²(ℝ^{n(N+2)}). Let a = (A₁, A₂) be a decomposition of {1, . . . , N + 2} into two clusters such that j ∈ A_j for j ∈ {1, 2}. By a suitable change of variables and after removing the centre-of-mass motion, we are reduced to studying the operator

P(h) = −h² Δ_x + P^a(h) + I_a(h),

acting in L²(ℝ^{n(N+1)}) (see Section 2 for the precise expressions of P^a(h) and I_a(h)). The positive real number h will be a small parameter (cf. (2.1)), the internal Hamiltonian P^a(h) is the sum of the energy operators of each cluster, considered as isolated, and the inter-cluster potential I_a(h) collects the interactions between particles belonging to two different clusters. The variable x ∈ ℝⁿ represents the relative position of the centres of mass of the clusters. Thus the operator −h² Δ_x corresponds to the kinetic energy of the relative motion of these centres of mass.

Since the motion of the light particles, the electrons, should be much faster than that of the heavy particles, the nuclei, it is natural to introduce an electronic Hamiltonian corresponding to the energy operator of the system consisting of the N electrons, interacting with one another and placed in the external field created by the nuclei, whose relative position is tied to the parameter x. We therefore consider the family of operators {P_e(x; h), x ∈ ℝⁿ, h ≤ h₀} defined by

P_e(x; h) = P^a(h) + I_a(x; h), ∀x ∈ ℝⁿ, ∀h ≤ h₀.

We have

P(h) = −h² Δ_x + P_e(h).

The pairwise interactions appearing in these operators will be functions V ∈ C^∞(ℝⁿ; ℝ) satisfying, for some ρ > 0,

∀α ∈ ℕⁿ, ∃C_α > 0; ∀x ∈ ℝⁿ, |∂_x^α V(x)| ≤ C_α ⟨x⟩^{−ρ−|α|} (D_ρ)

(with ⟨x⟩ = (1 + |x|²)^{1/2}). We are interested in the states of the system whose evolution asymptotically resembles the free evolution of bound states in A₁ and A₂. Such an evolution is given by the restriction to an eigenspace of P^a(h) of the propagator of the operator P_a(h) ≡ −h² Δ_x + P^a(h). Let E₁ < . . . < E_r be the first r eigenvalues of the discrete spectrum of P^a(0), each E_j having multiplicity m_j, for j ∈ {1, · · · , r}. For each j, we assume that there are exactly m_j “curves” x ↦ λ_{jl}(x; 0), for l ∈ {1, · · · , m_j}, of eigenvalues λ_{jl}(x; 0) of P_e(x; 0) (repeated according to their multiplicity) which tend to E_j as |x| → ∞. Moreover, we assume that these maps x ↦ λ_{jl}(x; 0), with values in the spectrum σ(P_e(x; 0)) of P_e(x; 0), are globally defined on ℝⁿ. Let Π_{j0}(0) be the spectral projector of P^a(0) associated with E_j and, for every x ∈ ℝⁿ, let Π_j(x; 0) be that of P_e(x; 0) associated with the λ_{jl}(x; 0), for l ∈ {1, · · · , m_j}.

Definition 1.1. For δ > 0 and for every j ∈ {1, · · · , r}, we consider the following condition (H_{j,δ}): there exist functions e_{j,±} and E_{j,±} and real numbers h_{j,δ} > 0 such that, for all h ≤ h_{j,δ},

inf_{x∈ℝⁿ} (E_{j,±}(x) − e_{j,±}(x)) ≥ δ,
∀x ∈ ℝⁿ, e_{j,−}(x) < E_{j,−}(x) < e_{j,+}(x) < E_{j,+}(x),
∀x ∈ ℝⁿ, λ_{j1}(x; 0), . . . , λ_{jm_j}(x; 0) ∈ ]E_{j,−}(x); e_{j,+}(x)[,
∀x ∈ ℝⁿ, σ(P_e(x; h)) ∩ ([e_{j,−}(x); E_{j,−}(x)] ∪ [e_{j,+}(x); E_{j,+}(x)]) = ∅. (H_{j,δ})

Under the preceding condition, for the same δ and for h_δ = min_{1≤j≤r} h_{j,δ}, denoting by 𝟙_{]E_{j,−}(x), e_{j,+}(x)[} the characteristic function of the interval ]E_{j,−}(x), e_{j,+}(x)[, we introduce the following condition. There exists R₀ > 0 such that, for all |x| ≥ R₀ and all h ∈ [0, h_δ],

dim Im 𝟙_{]E_{j,−}(x), e_{j,+}(x)[}(P_e(x; h)) = m_j. (H_{j,δ})′

If these conditions $(H_{j,\delta})$ and $(H_{j,\delta})'$ hold for some $\delta > 0$, we say that $E_j$ satisfies the semiclassical stability hypothesis. Under this hypothesis, for every $j$, one can find a family $(\Gamma(x))_{x\in\mathbb{R}^n}$ of complex contours, independent of $h$, enclosing all the eigenvalues $\lambda_{jl}(x;0)$ ($1 \le l \le m_j$, $1 \le j \le r$). Using these contours one can express, by means of a Cauchy formula, the spectral projector $\Pi_0(0)$ of $P^a(0)$ associated with the eigenvalues $\{E_1,\dots,E_r\}$, and the spectral projector $\Pi(x;0)$ of $P_e(x;0)$ associated with the eigenvalues $\{\lambda_{jl}(x;0),\ 1 \le j \le r,\ 1 \le l \le m_j\}$. For $h$ small enough, one can define a spectral projector $\Pi_0(h)$ (respectively $\Pi(x;h)$) of $P^a(h)$ (respectively $P_e(x;h)$) by the same Cauchy formula (cf. Section 2). By means of a direct integral, we define a fibred operator $\Pi(h)$ by

\[ \Pi(h) = \int^{\oplus}_{\mathbb{R}^n} \Pi(x;h)\,dx. \]

We can now introduce the adiabatic part of $P(h)$,

\[ P^{AD}(h) = \Pi(h)P(h)\Pi(h). \]

Under hypothesis $(D_\rho)$, for $\rho > 0$, the operator $P^{AD}(h)$ is self-adjoint (cf. [CDS]). We denote its resolvent by $R^{AD}(z;h)$ and that of $P(h)$ by $R(z;h)$. The main goal of this work is to approximate the boundary value of the resolvent $R(z;h)$ by that of $R^{AD}(z;h)$. Before stating the main result, we introduce some important notions.

Definition 1.2. For $j \in \{1,\dots,r\}$, we say that $E_j$ satisfies the "non-crossing" condition if the eigenvalues that tend to $E_j$ as $|x| \to \infty$ satisfy

\[ \forall x \in \mathbb{R}^n,\quad \lambda_{j1}(x;0) < \dots < \lambda_{jl(j)}(x;0) \]

for some $l(j) \in \{1,\dots,m_j\}$.

Definition 1.3. Given a classical Hamiltonian $p : \mathbb{R}^{2n} \to \mathbb{R}$ and an energy $E \in \mathbb{R}$, we denote by $p^{-1}(E)$ the energy surface

\[ p^{-1}(E) \equiv \bigl\{(x,\xi) \in \mathbb{R}^{2n};\ p(x,\xi) = E\bigr\}. \]

We say that the energy $E$ is non-trapping for the classical Hamiltonian $p$ if

\[ \forall (x,\xi) \in p^{-1}(E),\quad \lim_{t\to+\infty} \|\Phi^t(x,\xi)\| = \infty \quad\text{and}\quad \lim_{t\to-\infty} \|\Phi^t(x,\xi)\| = \infty, \]

where $\Phi^t$ denotes the Hamiltonian flow associated with $p$ and $\|\cdot\|$ the Euclidean norm on $\mathbb{R}^{2n}$. An interval $I$ is said to be non-trapping for the classical Hamiltonian $p$ if every energy $E \in I$ is. A function $a : \mathbb{R}^{2n} \to \mathbb{R}$, of class $C^1$, is a global escape function for the classical Hamiltonian $p$ at energy $E \in \mathbb{R}$ if there exist $\varepsilon > 0$ and $C > 0$ such that

\[ \forall (x,\xi) \in p^{-1}\bigl(]E-\varepsilon, E+\varepsilon[\bigr),\ \forall t \in \mathbb{R},\quad \frac{d}{dt}(a \circ \Phi^t)(x,\xi) \ge C. \]

Note that, in this case, the energy $E$ must be non-trapping for $p$.
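As an illustration of Definition 1.3 (not taken from the paper), the following sketch works in dimension $n = 1$ with the hypothetical potential $V(x) = \langle x\rangle^{-1}$ and the Hamiltonian $p(x,\xi) = \xi^2 + V(x)$. For $a(x,\xi) = x\xi$ one has $\frac{d}{dt}(a\circ\Phi^t) = \{p,a\} = 2\xi^2 - xV'(x)$, and since $-xV'(x) \ge 0$ and $V \le 1$ here, this is bounded below by $2(1-\varepsilon)$ on $p^{-1}(]E-\varepsilon,E+\varepsilon[)$ for $E = 2$ and $\varepsilon < 1$. The potential, the energy, the grid and the integrator are all assumptions chosen for the example.

```python
import math

def V(x):
    """Hypothetical repulsive potential V(x) = <x>^{-1} (assumption for this sketch)."""
    return 1.0 / math.sqrt(1.0 + x * x)

def dV(x):
    """Derivative V'(x)."""
    return -x / (1.0 + x * x) ** 1.5

def bracket(x, xi):
    """Poisson bracket {p, a} for p = xi^2 + V(x) and a = x * xi."""
    return 2.0 * xi * xi - x * dV(x)

def rk4_step(x, xi, dt):
    """One Runge-Kutta step for Hamilton's equations x' = 2 xi, xi' = -V'(x)."""
    def f(q, p):
        return 2.0 * p, -dV(q)
    k1 = f(x, xi)
    k2 = f(x + 0.5 * dt * k1[0], xi + 0.5 * dt * k1[1])
    k3 = f(x + 0.5 * dt * k2[0], xi + 0.5 * dt * k2[1])
    k4 = f(x + dt * k3[0], xi + dt * k3[1])
    x_new = x + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    xi_new = xi + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x_new, xi_new

def min_bracket_on_shell(E, xs):
    """Minimum of {p, a} over sampled points of the energy surface p = E."""
    return min(bracket(x, s * math.sqrt(E - V(x)))
               for x in xs for s in (-1.0, 1.0))

def trajectory_a(x, xi, dt, steps):
    """Values of a = x * xi along the (numerically integrated) Hamiltonian flow."""
    values = [x * xi]
    for _ in range(steps):
        x, xi = rk4_step(x, xi, dt)
        values.append(x * xi)
    return values

E = 2.0
grid = [0.05 * k for k in range(-400, 401)]
lower_bound = min_bracket_on_shell(E, grid)          # attained at x = 0
a_values = trajectory_a(-5.0, math.sqrt(E - V(-5.0)), 1e-3, 10000)
```

In this toy case `lower_bound` plays the role of the constant $C$ on the surface $p = E$, and `a_values` increases along the trajectory, which is the mechanism that forces the trajectory to leave every compact set.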


Theorem 1.4. Assume that the potentials satisfy $(D_\rho)$ for some $\rho > 0$. Let $E_1 < \dots < E_r \in \sigma_{disc}(P^a(0))$ be the first $r$ eigenvalues of the discrete spectrum of $P^a(0)$. Assume that the semiclassical stability hypothesis (cf. Definition 1.1) and the "non-crossing" condition (cf. Definition 1.2) hold for every $j$. Let $E \notin \{0\} \cup \{E_j,\ 1 \le j \le r\}$ be a non-trapping energy for each classical Hamiltonian $|\xi|^2 + \lambda_{jl}(x;0)$, for $1 \le j \le r$ and $1 \le l \le l(j)$ (cf. Definition 1.3).

1. For every sufficiently small compact neighbourhood $\Lambda$ of $E$ and every $s > 1/2$,

\[ \bigl\|\langle x\rangle^{-s} R^{AD}(\lambda \pm i0; h)\langle x\rangle^{-s}\bigr\| = O(h^{-1}), \tag{1.1} \]

for $h$ small enough, uniformly for $\lambda \in \Lambda$.

2. Assume that

\[ E < E^{AD} \equiv \inf_{x\in\mathbb{R}^n} \inf\Bigl(\sigma\bigl(P_e(x;0)\bigr) \setminus \{\lambda_{jl}(x;0),\ \forall j,l\}\Bigr). \tag{1.2} \]

For every sufficiently small compact neighbourhood $\Lambda$ of $E$ and every $s > 1/2$, we have, for $h$ small enough and uniformly for $\lambda \in \Lambda$,

\[ \bigl\|\langle x\rangle^{-s}\bigl(R(\lambda \pm i0; h) - R^{AD}(\lambda \pm i0; h)\Pi(h)\bigr)\langle x\rangle^{-s}\bigr\| = O(1), \tag{1.3} \]

for $\rho > 1$, and

\[ \bigl\|\langle x\rangle^{-s} R(\lambda \pm i0; h)\langle x\rangle^{-s}\bigr\| = O(h^{-1}), \tag{1.4} \]

for $\rho > 0$.

Remark 1.5. In [KMW1] it is established that estimate (1.1) implies the approximation (1.3) (and hence also estimate (1.4)) for short-range potentials ($\rho > 1$). Moreover, in Theorem 3.2 of [KMW1], sufficient conditions on a conjugate operator are identified for obtaining estimate (1.1), and the latter is proved in the case $r = 1$. Here we obtain these sufficient conditions in the case where the eigenvalues $E_j$, for $j > 1$, are not (a priori) simple. On the other hand, note that the non-trapping condition on $E$ imposes, for every $j$,

\[ E \notin \Bigl[\inf_{x,l} \lambda_{jl}(x;0);\ E_j\Bigr]. \]

For the semiclassical study of the channel wave operators (cf. Theorem 4.1) and of the total cross sections (cf. [Jec2]), when $\rho > 1$, one needs to be able to consider an energy $E$ such that $E > E_r$. In this case, we need the condition

\[ E_r < E^{AD} \equiv \inf_{x\in\mathbb{R}^n} \inf\Bigl(\sigma\bigl(P_e(x;0)\bigr) \setminus \{\lambda_{jl}(x;0),\ \forall j,l\}\Bigr) \tag{1.5} \]

in order to have the approximation (1.3) and the estimate (1.4).

Remark 1.6. Comparing this result with the semiclassical estimates of [W1], one may legitimately ask whether this "non-crossing" condition is superfluous. If a crossing occurs in the classically forbidden region, that is, at an energy strictly greater than $E$, i.e.

\[ \exists k \ne l;\quad \mathcal{C} \equiv \bigl\{x \in \mathbb{R}^n;\ \lambda_{jk}(x;0) = \lambda_{jl}(x;0)\bigr\} \ne \emptyset \]

and there exists $\varepsilon > 0$ such that, for all $x \in \mathcal{C}$, $\lambda_{jk}(x;0) \ge E + \varepsilon$, then it does not change the result (even though some modifications must be introduced in the proof of Theorem 1.4, cf. Remark 3.5). By contrast, if some crossing energies are lower than $E$, it seems that one cannot always construct a global "multi-escape" function. The problem could be clarified by studying resonances for Schrödinger operators with matrix-valued potential whose eigenvalues $\lambda_j(x)$ cross, as in [Ne], but near an energy that is non-trapping for each classical Hamiltonian $|\xi|^2 + \lambda_j(x)$. For another aspect of this problem, see Remark 1.3 in [Jec2].

Starting from the approximation (1.3) of Theorem 1.4 (hence for $\rho > 1$), we establish, in the energy band $[E_r; E^{AD}]$, the approximation of the channel wave operators

\[ \Omega_\pm(h) = \operatorname*{s-lim}_{t\to\pm\infty} e^{ih^{-1}tP(h)}\, e^{-ih^{-1}tP_a(h)}\, \Pi_0(h) \]

(which exist, cf. [RS3]) by the adiabatic wave operators

\[ \Omega^{AD}_\pm(h) = \operatorname*{s-lim}_{t\to\pm\infty} e^{ih^{-1}tP^{AD}(h)}\, e^{-ih^{-1}tP_a(h)}\, \Pi_0(h). \]

This result (cf. Theorem 4.1) improves the corresponding result in [KMW1]. Note that another improvement was obtained in [KMW2], where Coulomb singularities in the potentials are allowed. For the proof of Theorem 1.4, we adopt the strategy followed in [KMW1]. This strategy was first used in [GM] to obtain a short proof of the semiclassical resolvent estimate for two-body Schrödinger operators, originally established in [RT]. To obtain a semiclassical estimate of the resolvent of a Schrödinger operator $H$, one uses Mourre's commutator method (cf. [Mo, JMP]). The task is therefore to find a conjugate operator $A$ such that the commutator $ih^{-1}[H, A]$ is strictly positive in a suitable sense. The idea developed in [GM] is to construct a global escape function (cf. Definition 1.3) for the classical Hamiltonian associated with $H$ and, roughly speaking, to choose as conjugate operator the Weyl pseudodifferential operator whose symbol is the global escape function.

In the present work we introduce a particular type of global escape function. Given a family of classical Hamiltonians, we want to construct a single function that is a global escape function for each of them: a global "multi-escape" function. For the classical Hamiltonians $|\xi|^2 + \lambda_{jl}(x;0)$ ($1 \le j \le r$, $1 \le l \le l(j)$) appearing in Theorem 1.4, we use the "non-crossing" condition introduced in Definition 1.2 to prove the existence of such a function. Why not simply use one escape function for each classical Hamiltonian? Because, in that case, a particular term appearing in the semiclassical Mourre estimate seems impossible to control (cf. Remark 3.3). On this subject, see also Remark 5.2.

Such a global "multi-escape" function appears to be a good tool for obtaining semiclassical resolvent estimates for matrix-valued operators. For example, we establish such an estimate for the resolvent of a two-body Schrödinger operator with a long-range matrix-valued potential. For the Dirac operator with a long-range scalar electric field, we also obtain semiclassical control of the resolvent, as in [Ce], but under weaker hypotheses on the potential. This is the content of the following theorem.


Theorem 1.7. We use the notation of Section 5. In particular, we denote the Dirac operator by $D$. Assume that the scalar electric field $V$ is real and satisfies condition $(D_\rho)$ for some $\rho > 0$. Let $\lambda_0 > 1$ be a non-trapping energy (cf. Definition 1.3) for the classical Hamiltonian $\langle\xi\rangle + V(x)$ such that

\[ \lambda_0 > \sup_{x\in\mathbb{R}^3} V(x) - 1. \tag{1.6} \]

For every sufficiently small compact neighbourhood $\Lambda$ of $\lambda_0$ and every $s > 1/2$,

\[ \Bigl\|\langle x\rangle^{-s}\bigl((\lambda \pm i0)I_4 - D\bigr)^{-1}\langle x\rangle^{-s}\Bigr\| = O(h^{-1}), \]

for $h$ small enough and uniformly for $\lambda \in \Lambda$.

Remark 1.8. Theorem 1.7 remains valid if $\lambda_0 < -1$ is non-trapping for the classical Hamiltonian $-\langle\xi\rangle + V(x)$ and satisfies

\[ \lambda_0 < \inf_{x\in\mathbb{R}^3} V(x) + 1. \]

This paper is organized as follows. In Section 2 we recall basic properties, obtained in [KMW1], of the projectors $\Pi(x;h)$. Section 3 is devoted to the proof of Theorem 1.4. We obtain the adiabatic approximation of the channel wave operators in Section 4. Finally, in Section 5, we consider the case of a matrix-valued Schrödinger operator and improve the result of [Ce] on the Dirac operator.

2. Preliminaries

In this section we introduce notation and establish some properties of the electronic Hamiltonian. In particular, these properties make it possible to define $P^{AD}$ as a self-adjoint operator (cf. [CDS]). The arguments used here essentially come from [KMW1]. First, let us give the exact expression of the operators $P^a(h)$ and $I_a(x;h)$. While the variable $x \in \mathbb{R}^n$ gives the relative position of the centres of mass of the clusters, we take atomic coordinates in each cluster. We thus obtain $N$ internal variables, denoted by $y \in \mathbb{R}^{nN}$. For $k \in \{1,2\}$, denote by $A'_k$ the set of electrons of the cluster $A_k$, by $|A'_k|$ the cardinality of this set, and by $M_k = m_k + |A'_k|$ the total mass of the cluster $A_k$. The small parameter $h$ and the operators $P^a(h)$ and $I_a(x;h)$ are then

\[ h = \Bigl(\frac{1}{2M_1} + \frac{1}{2M_2}\Bigr)^{1/2}, \tag{2.1} \]

\[ P^a(h) = \sum_{k=1}^{2} \Biggl[\, \sum_{j\in A'_k} \Bigl(-\frac{1}{2}\Delta_{y_j} + V_{kj}(y_j)\Bigr) - \frac{1}{2m_k} \sum_{l,j\in A'_k} \nabla_{y_l}\cdot\nabla_{y_j} + \frac{1}{2} \sum_{l,j\in A'_k} V_{lj}(y_l - y_j) \Biggr], \]

\[ I_a(x;h) = \sum_{l\in A'_1,\, j\in A'_2} V_{lj}(y_l - y_j + x + f_2 - f_1) + \sum_{l\in A'_1} V_{l2}(x - f_1 + f_2 - y_l) + \sum_{j\in A'_2} V_{1j}(x - f_1 + f_2 - y_j) + V_{12}(x - f_1 + f_2), \]

where the quantities $f_k = \frac{1}{M_k}\sum_{j\in A'_k} y_j$, for $k \in \{1,2\}$, depend on $h$ (the "$\cdot$" denotes the scalar product of the gradients). For $l < j$, we have set $V_{jl}(z) = V_{lj}(-z)$. Denote by $P_{HE}$ the following Hughes-Eckart term:

\[ P_{HE} \equiv -\sum_{k=1}^{2} \frac{1}{2m_k} \sum_{l,j\in A'_k} \nabla_{y_l}\cdot\nabla_{y_j}. \]

Since the operator $P^a(h)$ converges to $P^a(0)$ as $h \to 0$ in the norm resolvent sense, there exist, for every $j$, $m_j$ eigenvalues $E_{jl}(h)$ of $P^a(h)$ that converge to $E_j$ as $h \to 0$. Let $\Pi_{j0}(h)$ be the spectral projector of $P^a(h)$ associated with these $m_j$ eigenvalues. Recall that, in the introduction, we assumed that, for every $j$, $m_j$ eigenvalues $\lambda_{jl}(x;0)$ of $P_e(x;0)$ tend to $E_j$. Under the semiclassical stability hypothesis (cf. Definition 1.1), for every $j$, one can find a family $(\Gamma_j(x))_{x\in\mathbb{R}^n}$ of complex contours, independent of $h$, which enclose all the $\lambda_{jl}(x;0)$ ($1 \le l \le m_j$) in such a way that

\[ \inf_{x\in\mathbb{R}^n} d\bigl(\sigma(P_e(x;h)),\, \Gamma_j(x)\bigr) \ge \delta/2, \]

for all $h \in [0; h_\delta]$. Since $\lambda_{jl}(x;0) \to E_j$ ($|x| \to \infty$), we can choose

\[ \Gamma_j(x) = \bigl\{z \in \mathbb{C};\ |z - E_j| = \delta/2\bigr\} \equiv \Gamma_j(\infty), \tag{2.2} \]

for $|x| \ge R_1$, for some $R_1 \ge R_0$ independent of $j$. Thus, for $\delta$ and $h_\delta$ small enough, we can write, for all $h \in [0; h_\delta]$,

\[ \Pi_{j0}(h) = \frac{1}{2i\pi} \oint_{\Gamma_j(\infty)} \bigl(z - P^a(h)\bigr)^{-1}\,dz \tag{2.3} \]

and, for every $x \in \mathbb{R}^n$, the spectral projector of $P_e(x;0)$ associated with the eigenvalues $\lambda_{jl}(x;0)$ ($1 \le l \le m_j$) is given, thanks to $(H_{j,\delta})$, by

\[ \Pi_j(x;0) = \frac{1}{2i\pi} \oint_{\Gamma_j(x)} \bigl(z - P_e(x;0)\bigr)^{-1}\,dz. \]

Relying again on $(H_{j,\delta})$, we define, for every $x \in \mathbb{R}^n$ and every $h \in [0; h_\delta]$,

\[ \Pi_j(x;h) = \frac{1}{2i\pi} \oint_{\Gamma_j(x)} \bigl(z - P_e(x;h)\bigr)^{-1}\,dz. \tag{2.4} \]

By $(H_{j,\delta})'$, the operator $\Pi_j(x;h)$ is, for $|x|$ large enough, the spectral projector of $P_e(x;h)$ associated with certain eigenvalues $\lambda_{j1}(x;h),\dots,\lambda_{jm_j}(x;h)$ whose total multiplicity is $m_j$. In Proposition 2.3 we show that these eigenvalues are globally defined and close to $\lambda_{j1}(x;0),\dots,\lambda_{jm_j}(x;0)$.
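The Cauchy formulas above can be tried out on a finite-dimensional toy model. The sketch below is only an illustration under stated assumptions: a hypothetical $2\times 2$ symmetric matrix `M` stands in for $P_e(x;h)$, the contour is the circle $|z - 1| = 1$ around the eigenvalue $1$, and the contour integral is approximated by the trapezoidal rule. None of these choices comes from the paper.

```python
import cmath
import math

def resolvent2(M, z):
    """Return (z - M)^{-1} for a 2x2 matrix M given as nested tuples."""
    a, b = z - M[0][0], -M[0][1]
    c, d = -M[1][0], z - M[1][1]
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def riesz_projector(M, center, radius, n=200):
    """Approximate (1 / (2 i pi)) * contour integral of (z - M)^{-1} dz
    over the circle |z - center| = radius, using n trapezoidal nodes."""
    acc = [[0j, 0j], [0j, 0j]]
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)
        z = center + radius * w
        R = resolvent2(M, z)
        # dz = i * radius * w * dtheta with dtheta = 2 pi / n,
        # so dz / (2 i pi) = radius * w / n.
        for i in range(2):
            for j in range(2):
                acc[i][j] += R[i][j] * radius * w / n
    return tuple(tuple(row) for row in acc)

def matmul2(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

# Hypothetical stand-in for P_e(x; h): eigenvalues 1 and 3.
M = ((2.0, 1.0), (1.0, 2.0))
# The circle |z - 1| = 1 encloses the eigenvalue 1 only.
P = riesz_projector(M, center=1.0, radius=1.0)
P_squared = matmul2(P, P)
```

Here `P` approximates the orthogonal projector onto the eigenvector $(1,-1)/\sqrt{2}$, and `P_squared` agrees with `P` up to rounding, as a projector should. The construction works because the contour stays at a positive distance from the spectrum, which is the role of the lower bound $\delta/2$ above.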


Similarly, one can find a family $(\Gamma(x))_{x\in\mathbb{R}^n}$ of complex contours, independent of $h$, which enclose all the $\lambda_{jl}(x;0)$ ($1 \le l \le m_j$, $1 \le j \le r$). They can also be chosen so that, for $|x|$ large enough,

\[ \Gamma(x) = \Bigl\{z \in \mathbb{C};\ \inf_{1\le j\le r} |z - E_j| = \delta/2\Bigr\} \equiv \Gamma(\infty). \tag{2.5} \]

As in the introduction, we set, for $h$ small enough,

\[ \Pi_0(h) = \frac{1}{2i\pi} \oint_{\Gamma(\infty)} \bigl(z - P^a(h)\bigr)^{-1}\,dz, \tag{2.6} \]

\[ \Pi(x;h) = \frac{1}{2i\pi} \oint_{\Gamma(x)} \bigl(z - P_e(x;h)\bigr)^{-1}\,dz. \tag{2.7} \]

In particular, for all $(x;h)$,

\[ \Pi_0(h) = \sum_{j=1}^{r} \Pi_{j0}(h) \quad\text{and}\quad \Pi(x;h) = \sum_{j=1}^{r} \Pi_j(x;h). \]

We now give regularity and decay-at-infinity properties of these projectors. From the preceding Cauchy formulas, we see that they are $C^\infty$ functions because the potentials are. Thanks to the exponential decay of the eigenfunctions of the operator $P^a(0)$ associated with the eigenvalues $E_j$ (cf. [A]), these projectors have the following properties.

Proposition 2.1. Under condition $(D_\rho)$ ($\rho > 0$) and under the semiclassical stability hypothesis (cf. Definition 1.1), the following estimates hold for all $1 \le j \le r$:

\[ \forall \alpha \in \mathbb{N}^n,\ \exists D_\alpha > 0;\ \forall x \in \mathbb{R}^n,\quad \bigl\|\partial_x^\alpha\bigl(\Pi_j(x;h) - \Pi_{j0}(h)\bigr)\bigr\| \le D_\alpha \langle x\rangle^{-\rho-|\alpha|}, \]

\[ \bigl\|(\partial_x^\alpha I_a)(x;h)\Pi_{j0}(h)\bigr\| + \bigl\|(\partial_x^\alpha I_a)(x;h)\Pi_j(x;h)\bigr\| \le D_\alpha \langle x\rangle^{-\rho-|\alpha|}, \tag{2.8} \]

\[ \bigl\|\Pi_j(x;h)P_e(x;h)\Pi_j(x;h) - \Pi_j(x;h)P^a(h)\Pi_{j0}(h)\bigr\| = O(\langle x\rangle^{-\rho}), \tag{2.9} \]

uniformly for $h \in [0, h_\delta]$, $h_\delta$ small enough. Here $\|\cdot\|$ denotes the norm of $\mathcal{L}(L^2(\mathbb{R}^{nN}_y))$. The operators $\Pi(x;h)$ satisfy analogous properties.

Proof. We essentially follow the arguments of [KMW1]. See [Jec1].

Remark 2.2. Let us point out an important property shared by all smooth projector-valued functions. For all $(x;h)$,

\[ \Pi(x;h)(\nabla_x\Pi)(x;h)\Pi(x;h) = 0. \]

Indeed, since these are projectors, $\Pi^2 - \Pi = 0$ and we can write

\[ 0 = \Pi(x;h)\bigl[\nabla_x\bigl(\Pi^2(x;h) - \Pi(x;h)\bigr)\bigr]\Pi(x;h) = \Pi(x;h)(\nabla_x\Pi)(x;h)\Pi(x;h). \]

As already mentioned, we will see that the eigenvalues $\lambda_{j1}(x;h),\dots,\lambda_{jm_j}(x;h)$ extend to $\mathbb{R}^n$ and that they can be localized. This is the content of the following proposition.


Proposition 2.3. For all $1 \le j \le r$, assume that $E_j$ satisfies the semiclassical stability hypothesis (cf. Definition 1.1) for some real $\delta > 0$. For $h \in [0, h_\delta]$, $h_\delta$ small enough, and for every $x \in \mathbb{R}^n$, there then exist exactly $m_j$ eigenvalues of $P_e(x;h)$, $\lambda_{j1}(x;h),\dots,\lambda_{jm_j}(x;h)$, each repeated according to its multiplicity, such that

\[ \forall l \in \{1,\dots,m_j\},\quad \lambda_{jl}(x;h) = \lambda_{jl}(x;0) + O(h^2), \]

uniformly in $x$. The spectrum of the operator $P_e(x;h)$ therefore has the following property. For all $1 \le j \le r$, all $x \in \mathbb{R}^n$ and all $h \in [0, h_\delta]$,

\[ d\Bigl(\{\lambda_{j1}(x;h),\dots,\lambda_{jm_j}(x;h)\},\ \sigma\bigl(P_e(x;h)\bigr) \setminus \{\lambda_{j1}(x;h),\dots,\lambda_{jm_j}(x;h)\}\Bigr) \ge \delta/3. \tag{2.10} \]

Proof. We follow the proof of [KMW1] and refer to Appendix D of [Jec1] for some of the steps. This proposition is based on the estimate

\[ \|\Pi_j(x;h) - \Pi_j(x;0)\| = O(h^2), \quad\text{uniformly in } x. \tag{2.11} \]

By relation (2.4), we are reduced to studying, for $z \in \Gamma_j(x)$, the operator

\[ \bigl(P_e(x;h) - P_e(x;0)\bigr)\bigl(z - P_e(x;h)\bigr)^{-1}. \]

Note at once that, uniformly in $x$ and $z \in \Gamma_j(x)$,

\[ \bigl\|P_{HE}\bigl(z - P_e(x;h)\bigr)^{-1}\bigr\| = O(h^2). \]

It therefore remains to examine the contribution of the inter-cluster potential $I_a$, which is of the type

\[ \bigl[V\bigl(x + L_h(y) + L(y)\bigr) - V\bigl(x + L(y)\bigr)\bigr]\bigl(z - P_e(x;h)\bigr)^{-1}, \]

where $L_h$, $L$ are linear maps from $\mathbb{R}^{nN}$ to $\mathbb{R}^n$ with $\|L_h\| = O(h^2)$ and $L$ independent of $h$, and where $V$ satisfies $(D_\rho)$. The idea is to use a Taylor formula in $h$. A term $L_h(y)$ then appears, which is "absorbed" by projecting onto $\operatorname{Im}\Pi_j(x;h)$, thanks to the exponential decay of the eigenfunctions of $P^a(0)$ associated with the eigenvalue $E_j$ (cf. [A]). We thus obtain

\[ \bigl\|\bigl(\Pi_j(x;h) - \Pi_j(x;0)\bigr)\Pi_j(x;h)\bigr\| + \bigl\|\bigl(\Pi_j(x;h) - \Pi_j(x;0)\bigr)\Pi_j(x;0)\bigr\| = O(h^2), \]

from which (2.11) follows, since we are dealing with projectors. For $h$ small enough, $\Pi_j(x;h)$ and $\Pi_j(x;0)$ have the same rank, for every $x \in \mathbb{R}^n$. The projector $\Pi_j(x;h)$ is therefore the spectral projector of $P_e(x;h)$ associated with $\lambda_{j1}(x;h),\dots,\lambda_{jm_j}(x;h)$, $m_j$ eigenvalues each repeated according to its multiplicity. The semiclassical stability hypothesis implies property (2.10) for $P_e(x;h)$. We now show that the $\lambda_{jl}(x;h)$ are close to the $\lambda_{jl}(x;0)$. The preceding arguments also yield the estimate

\[ \bigl\|\bigl(P_e(x;h) - P_e(x;0)\bigr)\Pi_j(x;0)\bigr\| + \bigl\|\bigl(P_e(x;h) - P_e(x;0)\bigr)\Pi_j(x;h)\bigr\| = O(h^2), \]

which, together with (2.11), allows us to write

\[ \bigl\|\Pi_j(x;h)P_e(x;h)\Pi_j(x;h) - \Pi_j(x;0)P_e(x;0)\Pi_j(x;0)\bigr\| = O(h^2). \]

Using a "min-max" formula, we obtain the desired result (cf. [Jec1]).
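The last step of this proof rests on a standard consequence of the min-max principle (Weyl's inequality): if two symmetric matrices differ by a perturbation of norm at most $Ch^2$, their ordered eigenvalues differ by at most $Ch^2$. The sketch below checks this on hypothetical $2\times 2$ matrices `A` and `B`, which merely stand in for $\Pi_j P_e \Pi_j$ at $h = 0$ and for the $O(h^2)$ correction; they are assumptions made for the illustration, not objects from the paper.

```python
import math

def eig_sym2(a, b, d):
    """Ordered eigenvalues of the symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    rad = math.hypot(0.5 * (a - d), b)
    return (mean - rad, mean + rad)

def op_norm_sym2(a, b, d):
    """Operator norm of a symmetric 2x2 matrix (largest |eigenvalue|)."""
    lo, hi = eig_sym2(a, b, d)
    return max(abs(lo), abs(hi))

def max_eigenvalue_shift(A, B, h):
    """Largest shift of the ordered eigenvalues between A and A + h^2 B.
    A and B are triples (a, b, d) encoding [[a, b], [b, d]]."""
    lam0 = eig_sym2(*A)
    lamh = eig_sym2(*(x + h * h * y for x, y in zip(A, B)))
    return max(abs(u - v) for u, v in zip(lam0, lamh))

# Hypothetical data: A plays the role of the h = 0 operator,
# h^2 * B the role of the O(h^2) difference.
A = (1.0, 0.3, 2.0)
B = (0.5, -1.0, 0.2)
shifts = {h: max_eigenvalue_shift(A, B, h) for h in (0.5, 0.1, 0.01)}
bounds = {h: h * h * op_norm_sym2(*B) for h in shifts}
```

For each value of `h`, `shifts[h]` does not exceed `bounds[h]`, which is the finite-dimensional analogue of $\lambda_{jl}(x;h) = \lambda_{jl}(x;0) + O(h^2)$.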


Under the "non-crossing" hypothesis for each $j$, there are $l(j)$ functions $\lambda_{jl}(\cdot;0)$ that tend to $E_j$ at infinity, and each $\lambda_{jl}(\cdot;0)$ has constant multiplicity $m_{jl}$. One can therefore construct a family $\{\Gamma_{jl}(x),\ x \in \mathbb{R}^n\}$ of contours in $\mathbb{C}$ such that

\[ \Pi_{jl}(x;0) = \frac{1}{2i\pi} \oint_{\Gamma_{jl}(x)} \bigl(z - P_e(x;0)\bigr)^{-1}\,dz \]

is the spectral projector associated with $\lambda_{jl}(x;0)$. These projectors are also of class $C^\infty$, as are the corresponding eigenvalues, by virtue of the relation

\[ \lambda_{jl}(x;0) = \frac{1}{m_{jl}} \operatorname{Tr}\bigl(\Pi_{jl}(x;0)P_e(x;0)\bigr), \]

where $\operatorname{Tr}$ denotes the trace. Unlike the preceding contours, the perimeter of these tends to $0$ as $x$ tends to infinity (if the eigenvalue $E_j$ is multiple). Despite this, the eigenvalues $\lambda_{jl}(x;0)$ satisfy the following condition $(D'_\rho)$:

\[ \forall \alpha \in \mathbb{N}^n,\ |\alpha| \le 1,\ \exists C_\alpha > 0;\ \forall x \in \mathbb{R}^n,\quad \bigl|\partial_x^\alpha\bigl(\lambda_{jl}(x;0) - E_j\bigr)\bigr| \le C_\alpha \langle x\rangle^{-\rho-|\alpha|} \tag{$D'_\rho$} \]

(here $\rho$ is the one from hypothesis $(D_\rho)$ satisfied by the potentials). To confirm this point, we rely on [K]. As in the proof of Proposition 2.3, the $\lambda_j(x;0)$ are, for $|x|$ large enough, the eigenvalues of the symmetric matrix $M(x)$ representing $P_e(x;0)$ in the basis $(\Pi(x;0)\phi_k)_{1\le k\le m}$, where $(\phi_k)_{1\le k\le m}$ is an orthonormal basis of $\operatorname{Im}\Pi_0(0)$. The coefficients $a_{kl}(x)$ of $M(x)$ satisfy

\[ \bigl|\partial_x^\alpha\bigl(a_{kl}(x) - \delta_{kl}\bigr)\bigr| = O(\langle x\rangle^{-\rho-|\alpha|}) \]

for all $\alpha \in \mathbb{N}^n$, thanks to Proposition 2.1. By [K], p. 111, the $\lambda_j(x;0)$ satisfy condition $(D'_\rho)$.

Finally, the operator $P(h)$ is self-adjoint on the domain of the Laplacian in $L^2(\mathbb{R}^{n(N+1)})$, denoted by $D(P(h))$. Using Proposition 2.1 and the arguments of [CDS], one checks (cf. [Jec1]) that the operators

\[ P^{AD}(h) = \Pi(h)P(h)\Pi(h) \quad\text{and}\quad P^{AD}_j(h) = \Pi_j(h)P(h)\Pi_j(h), \]

for every $j$, are self-adjoint on the respective domains

\[ \bigl\{\phi \in L^2(\mathbb{R}^{n(N+1)});\ \Pi(h)\phi \in D\bigl(P(h)\bigr)\bigr\}, \qquad \bigl\{\phi \in L^2(\mathbb{R}^{n(N+1)});\ \Pi_j(h)\phi \in D\bigl(P(h)\bigr)\bigr\}. \]

3. Semiclassical Estimate of the Resolvent of P

The purpose of this section is to establish, under the conditions of Section 2, the adiabatic approximation of the total resolvent for certain energies (cf. Theorem 1.4). Unlike [KMW1], where the projector $\Pi$ has rank 1, we are led here to construct a global "multi-escape" function, that is, a function which is simultaneously a global escape function for several classical Hamiltonians. To obtain Theorem 1.4, we prove the following Mourre estimate.


Proposition 3.1. Assume that the potentials satisfy $(D_\rho)$ for some real $\rho > 0$. Let $E_1 < \dots < E_r \in \sigma_{disc}(P^a(0))$ each satisfy the semiclassical stability hypothesis (cf. Definition 1.1) and the "non-crossing" hypothesis (cf. Definition 1.2). For every energy $E \notin \{0\} \cup \{E_j,\ 1 \le j \le r\}$ that is non-trapping for each classical Hamiltonian $|\xi|^2 + \lambda_{jl}(x;0)$ (cf. Definition 1.3), $1 \le j \le r$ and $1 \le l \le l(j)$, there exists an operator $F^{AD}(h)$ such that

\[ \Bigl\|\bigl[\bigl[P^{AD}(h), F^{AD}(h)\bigr], F^{AD}(h)\bigr]\bigl(P^{AD}(h) + i\bigr)^{-1}\Bigr\| = O(h^2), \tag{3.1} \]

and which satisfies the Mourre estimate

\[ \chi\bigl(P^{AD}(h)\bigr)\, i\bigl[P^{AD}(h), F^{AD}(h)\bigr]\, \chi\bigl(P^{AD}(h)\bigr) \ge \alpha h\, \chi^2\bigl(P^{AD}(h)\bigr) \tag{3.2} \]

with $\chi = \mathbf{1}_{]E-\delta,E+\delta[}$, and $\delta > 0$ and $\alpha > 0$ independent of $h$.

Proof. The non-trapping hypothesis imposes, for every $j$,

\[ E \notin \Bigl[\inf_{x,l} \lambda_{jl}(x;0);\ E_j\Bigr]. \]

Let $s$ be the largest integer $j$ such that $E_j < E$. We adopt the strategy used in [KMW1]. For each classical Hamiltonian $|\xi|^2 + \lambda_{jl}(x;0) - E_j$, it is natural to construct a global escape function (cf. Definition 1.3) as in [GM]. However, we prefer here to construct a function $a$ that is a global escape function for each classical Hamiltonian $|\xi|^2 + \lambda_{jl}(x;0) - E_j$ ($j \le s$) at energy $E - E_j > 0$. Under the hypotheses of Proposition 3.1, the existence of such a global "multi-escape" function is given by the following lemma.

Lemma 3.2. Consider potentials $(V_j)_{1\le j\le q}$ in $C^\infty(\mathbb{R}^n;\mathbb{R})$ satisfying $(D'_\rho)$ for some real $\rho > 0$. For every $j$, let $\lambda_j > 0$ be a non-trapping energy for the classical Hamiltonian $p_j(x,\xi) = |\xi|^2 + V_j(x)$. Under the "non-crossing" hypothesis

\[ j \ne k \implies \forall x \in \mathbb{R}^n,\quad V_j(x) - \lambda_j \ne V_k(x) - \lambda_k \tag{3.3} \]

there exists a function which is, simultaneously for all $j$, a global escape function at energy $\lambda_j$ for the Hamiltonian $p_j$ (cf. Definition 1.3).

Proof. By [GM], there exist $R > 0$, $C_0 > 0$ and $\varepsilon > 0$ such that, for every $j$,
• the Poisson bracket satisfies $\{p_j, x\cdot\xi\} \ge \lambda_j/2$ for $|x| \ge R$ and $(x,\xi) \in p_j^{-1}(]\lambda_j - \varepsilon; \lambda_j + \varepsilon[)$,
• there exists $a_j \in C^\infty(\mathbb{R}^{2n})$ of the form $a_j(x,\xi) = x\cdot\xi + C_j\chi_j(x)f_j(x,\xi)$, where $C_j > 0$, $\chi_j \in C_0^\infty(\mathbb{R}^n)$, $\chi_j f_j \in C_0^\infty(\mathbb{R}^{2n})$, and such that $\{p_j, a_j\} \ge C_0$ on $p_j^{-1}(]\lambda_j - \varepsilon; \lambda_j + \varepsilon[)$.


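Let $R' \equiv \max_j \sup\{|x|;\ x \in \operatorname{supp}\chi_j\} > R$ and $R'' > R'$. Thanks to the "non-crossing" hypothesis (3.3), we will show that there exists $\alpha > 0$ such that, for all $\varepsilon$ small enough and for $(V_j,\lambda_j) \ne (V_k,\lambda_k)$, the distance satisfies

\[ d\Bigl(\bigl\{(x,\xi);\ |x| \le R''\bigr\} \cap p_j^{-1}\bigl(]\lambda_j - \varepsilon; \lambda_j + \varepsilon[\bigr),\ \bigl\{(x,\xi);\ |x| \le R''\bigr\} \cap p_k^{-1}\bigl(]\lambda_k - \varepsilon; \lambda_k + \varepsilon[\bigr)\Bigr) \ge \alpha. \tag{3.4} \]

Set

\[ K = \max_j \sup_{|x|\le R''} |V_j(x)| \quad\text{and}\quad K_j = \sup_{|x|\le R''} \|V'_j(x)\|. \]

By (3.3), there exists $\delta > 0$ such that

\[ |x| \le R'' \implies \forall j \ne k,\quad \bigl|\bigl(V_j(x) - \lambda_j\bigr) - \bigl(V_k(x) - \lambda_k\bigr)\bigr| \ge \delta. \]

Take $\varepsilon < \delta/2$ and $\eta, r > 0$. Suppose that there exist pairs

\[ (x,\xi) \in \bigl\{(y,\zeta);\ |y| \le R''\bigr\} \cap p_j^{-1}\bigl(]\lambda_j - \varepsilon; \lambda_j + \varepsilon[\bigr), \qquad (x',\xi') \in \bigl\{(y,\zeta);\ |y| \le R''\bigr\} \cap p_k^{-1}\bigl(]\lambda_k - \varepsilon; \lambda_k + \varepsilon[\bigr), \]

for $j \ne k$, such that $|\xi - \xi'| < \eta$ and $|x - x'| < r$. We then have

\[ \bigl||\xi'|^2 - |\xi|^2\bigr| \le |\xi' - \xi|\cdot|\xi' + \xi| \le \eta\bigl((\lambda_j + \varepsilon + K)^{1/2} + (\lambda_k + \varepsilon + K)^{1/2}\bigr). \]

On the one hand,

\[ \bigl|\bigl(V_j(x) - \lambda_j\bigr) - \bigl(V_k(x') - \lambda_k\bigr)\bigr| \le \bigl|\bigl(p_j(x,\xi) - \lambda_j\bigr) - \bigl(p_k(x',\xi') - \lambda_k\bigr)\bigr| + \bigl||\xi'|^2 - |\xi|^2\bigr| \le 2\varepsilon + \eta\bigl((\lambda_j + \varepsilon + K)^{1/2} + (\lambda_k + \varepsilon + K)^{1/2}\bigr), \]

and on the other hand,

\[ \bigl|\bigl(V_j(x) - \lambda_j\bigr) - \bigl(V_k(x') - \lambda_k\bigr)\bigr| \ge \bigl|\bigl(V_j(x') - \lambda_j\bigr) - \bigl(V_k(x') - \lambda_k\bigr)\bigr| - |V_j(x) - V_j(x')| \ge \delta - rK_j. \]

Since $2\varepsilon < \delta$, this is a contradiction for $r$ and $\eta$ small enough. Consequently, there exists $\alpha > 0$ for which inequality (3.4) holds, for all $\varepsilon$ small enough. Thanks to this property, one can construct, for $\varepsilon$ small enough, a partition of unity $(t_j)_{1\le j\le q}$ on $\mathbb{R}^{2n}$ (i.e. $\sum_j t_j = 1$) such that, for every $j$, $0 \le t_j \le 1$, $t_j \in C^\infty(\mathbb{R}^{2n})$ and
• $t_j = 1$ on $\{(x,\xi);\ |x| \le R''\} \cap p_j^{-1}(]\lambda_j - \varepsilon; \lambda_j + \varepsilon[)$,
• $t_j = 0$ on $\{(x,\xi);\ |x| \le R''\} \cap p_k^{-1}(]\lambda_k - \varepsilon; \lambda_k + \varepsilon[)$ for all $k \ne j$.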


We set $a = \sum_j t_j a_j$ and check that this function has the required property. On $p_j^{-1}(]\lambda_j - \varepsilon; \lambda_j + \varepsilon[)$,

\[ \{p_j, a\} = \sum_{k=1}^{q} \{p_j, C_k t_k \chi_k f_k\} + \{p_j, x\cdot\xi\}. \]

If $|x| \le R''$ then $\{p_j, a\} = \{p_j, C_j t_j \chi_j f_j\} + \{p_j, x\cdot\xi\} = \{p_j, a_j\} \ge C_0$. If $|x| > R''$ then $\{p_j, a\} = \{p_j, x\cdot\xi\} \ge \lambda_j/2$, because $R'' > R'$ and, for $|x| > R'$, $x \notin \operatorname{supp}\chi_k$ for all $k$.

Let us continue the proof of Proposition 3.1. Thanks to the "non-crossing" hypothesis (cf. Definition 1.2) and the semiclassical stability hypothesis (cf. Definition 1.1), for all $j \le s$,

\[ l \ne l' \implies \forall x \in \mathbb{R}^n,\quad \lambda_{jl}(x;0) - E_j - (E - E_j) \ne \lambda_{jl'}(x;0) - E_j - (E - E_j), \]

\[ j \ne j' \implies \forall x \in \mathbb{R}^n,\ \forall l, l',\quad \lambda_{jl}(x;0) - E_j - (E - E_j) \ne \lambda_{j'l'}(x;0) - E_{j'} - (E - E_{j'}). \]

We can therefore apply Lemma 3.2 to the classical Hamiltonians $p_{jl}(x,\xi) - E_j = |\xi|^2 + \lambda_{jl}(x;0) - E_j$ at the energies $E - E_j > 0$, $j \le s$. Let $a$ be the corresponding global "multi-escape" function. By (2.8) and (2.9), the $R$ in the proof of Lemma 3.2 can be chosen so that

\[ \bigl\|\Pi_j(x;h)\, x\cdot(\nabla_x I_a)(x;h)\, \Pi_j(x;h)\bigr\| + 2\bigl\|\Pi_j(x;h)\bigl(P_e(x;h) - E_j\bigr)\Pi_j(x;h)\bigr\| \le E - E_j, \tag{3.5} \]

for every $j$ and for $|x| \ge R$. Let $\tau_1, \tau_2 \in C^\infty(\mathbb{R}^n)$ be such that $\tau_1^2 + \tau_2^2 = 1$. We require that $\tau_1$ equal $1$ on $\{x;\ |x| \le R'\}$ and be supported in $\{x;\ |x| \le R''\}$. We set

\[ F^{AD} = \sum_{j=1}^{r} \Biggl( \sum_{l=1}^{l(j)} \tau_1 \Pi_{jl}\, a^w\, \Pi_{jl} \tau_1 + \tau_2 \Pi_j\, a^w\, \Pi_j \tau_2 \Biggr), \tag{3.6} \]

where $a^w$ is the $h$-pseudodifferential operator with Weyl symbol $a$. Note that the functions $x \mapsto \tau_1(x)\Pi_{jl}(x;h) \in \mathcal{L}(L^2(\mathbb{R}^{nN}_y))$ are bounded symbols.

To establish (3.1), suppose for a moment that all the projectors $\Pi$, $\Pi_j$, $\Pi_{jl}$ are equal to the identity. In this case, by the usual $h$-pseudodifferential calculus, the double commutator equals $h^2$ times an $h$-pseudodifferential operator of order 1, where the gain $h^2$ comes from the two commutations. In the case at hand, the same computation is valid because the projectors $\Pi(x;h)$, $\Pi_j(x;h)$, $\Pi_{jl}(x;h)$ commute with one another and with $P_e(x;h)$, as well as with the scalar symbols $|\xi|^2$ and $a(x,\xi)$ regarded as elements of $\mathcal{L}(L^2(\mathbb{R}^{nN}_y))$. Hence (3.1) holds.


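We compute the commutator $i[P^{AD}(h), F^{AD}(h)]$ modulo "$O(h^2)$", where "$O(h^2)$" (respectively "$O(h)$") denotes $P^{AD}$-bounded operators $B$ such that the norm of $B(P^{AD} + i)^{-1}$ is $O(h^2)$ (respectively $O(h)$). On the one hand, for every $j$,

\[
\begin{aligned}
i\bigl[P^{AD}, \tau_2\Pi_j a^w \Pi_j\tau_2\bigr]
&= \tau_2\Pi_j\, i[P, a^w]\, \Pi_j\tau_2 + \Pi\, i\bigl[-h^2\Delta_x, \tau_2\Pi_j\bigr] a^w \Pi_j\tau_2 + \tau_2\Pi_j a^w\, i\bigl[-h^2\Delta_x, \tau_2\Pi_j\bigr]\Pi \\
&= \tau_2\Pi_j\, i[P, a^w]\, \Pi_j\tau_2 + \text{“}O(h^2)\text{”} - 2i\,\Pi\, h\nabla_x\cdot h\bigl(\nabla_x(\tau_2\Pi_j)\bigr) a^w \Pi_j\tau_2 - 2i\,\tau_2\Pi_j a^w\, h\bigl(\nabla_x(\tau_2\Pi_j)\bigr)\cdot h\nabla_x\, \Pi.
\end{aligned}
\]

The sum, for $1 \le j \le r$, of the terms involving $(\nabla_x\Pi_j)$ forms an $h$-pseudodifferential operator with principal symbol

\[ -2\tau_2^2(x)\, a(x,\xi)\, \xi\cdot\sum_{j=1}^{r} \Pi(x)\bigl((\nabla_x\Pi_j)(x)\Pi_j(x) + \Pi_j(x)(\nabla_x\Pi_j)(x)\bigr)\Pi(x), \]

which vanishes because

\[ \sum_{j=1}^{r} \Pi(x)\bigl((\nabla_x\Pi_j)(x)\Pi_j(x) + \Pi_j(x)(\nabla_x\Pi_j)(x)\bigr)\Pi(x) = \sum_{j=1}^{r} \Pi(x)(\nabla_x\Pi_j)(x)\Pi(x) = \Pi(x)(\nabla_x\Pi)(x)\Pi(x) = 0, \tag{3.7} \]

by Remark 2.2. Since the commutators $[\Pi_j, a^w]$ and $[\Pi_j, h\nabla_x]$ are "$O(h)$",

\[
\begin{aligned}
\sum_{j=1}^{r} i\bigl[P^{AD}, \tau_2\Pi_j a^w \Pi_j\tau_2\bigr]
&= \sum_{j=1}^{r} \tau_2\Pi_j\, i[P, a^w]\, \Pi_j\tau_2 + \text{“}O(h^2)\text{”} \\
&\quad - 2i \sum_{j=1}^{r} \Bigl(\Pi\, h\nabla_x\cdot(h\nabla\tau_2)\,\Pi_j a^w \Pi_j\tau_2 + \tau_2\Pi_j a^w \Pi_j\,(h\nabla\tau_2)\cdot h\nabla_x\, \Pi\Bigr) \\
&= \sum_{j=1}^{r} \tau_2\Pi_j\, i[P, a^w]\, \Pi_j\tau_2 + \text{“}O(h^2)\text{”} \\
&\quad - 2i \sum_{j=1}^{r} \Bigl(\Pi\, h\nabla_x\cdot(h\nabla\tau_2)\,\tau_2\Pi_j a^w \Pi_j + \Pi_j a^w \Pi_j\,\tau_2(h\nabla\tau_2)\cdot h\nabla_x\, \Pi\Bigr)
\end{aligned}
\]

because the commutator $[\tau_2, a^w]$ is "$O(h)$". Since $\tau_2$ vanishes in a neighbourhood of the support of $\chi_j$ (cf. Lemma 3.2), for every $j$, one can in fact replace $a^w$ by $\operatorname{Op}^w_h(x\cdot\xi)$, the $h$-pseudodifferential operator with Weyl symbol $x\cdot\xi$. On the other hand, we can write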


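\[
\begin{aligned}
\sum_{j=1}^{r}\sum_{l=1}^{l(j)} i\bigl[P^{AD}, \tau_1\Pi_{jl} a^w \Pi_{jl}\tau_1\bigr]
&= \sum_{j=1}^{r}\sum_{l=1}^{l(j)} \tau_1\Pi_{jl}\, i[P, a^w]\, \Pi_{jl}\tau_1 + \text{“}O(h^2)\text{”} \\
&\quad + \sum_{j=1}^{r}\sum_{l=1}^{l(j)} \Bigl(-2i\,\Pi\, h\nabla_x\cdot h\bigl(\nabla_x(\tau_1\Pi_{jl})\bigr) a^w \Pi_{jl}\tau_1 - 2i\,\tau_1\Pi_{jl} a^w\, h\bigl(\nabla_x(\tau_1\Pi_{jl})\bigr)\cdot h\nabla_x\, \Pi\Bigr).
\end{aligned}
\]

The terms containing the $(\nabla_x\Pi_{jl})$ form an admissible $h$-pseudodifferential operator with principal symbol

\[ -2ih \sum_{j=1}^{r}\sum_{l=1}^{l(j)} \Bigl(\tau_1(x)\Pi(x)\, h(\nabla_x\Pi_{jl})(x)\cdot\xi\, a(x,\xi)\, \Pi_{jl}(x)\tau_1(x) + \tau_1(x)\Pi_{jl}(x)\, a(x,\xi)\, \xi\cdot h(\nabla_x\Pi_{jl})(x)\, \Pi(x)\tau_1(x)\Bigr). \tag{3.8} \]

Since, by (3.7),

\[ \sum_{j=1}^{r} \Pi_j \Biggl(\sum_{l=1}^{l(j)} \bigl((\nabla_x\Pi_{jl})\Pi_{jl} + \Pi_{jl}(\nabla_x\Pi_{jl})\bigr)\Biggr) \Pi_j = \sum_{j=1}^{r} \Pi(\nabla_x\Pi_j)\Pi = 0, \]

we obtain

\[
\begin{aligned}
\sum_{j=1}^{r}\sum_{l=1}^{l(j)} i\bigl[P^{AD}, \tau_1\Pi_{jl} a^w \Pi_{jl}\tau_1\bigr]
&= \sum_{j=1}^{r}\sum_{l=1}^{l(j)} \tau_1\Pi_{jl}\, i[P, a^w]\, \Pi_{jl}\tau_1 + \text{“}O(h^2)\text{”} \\
&\quad + \sum_{j=1}^{r}\sum_{l=1}^{l(j)} \Bigl(-2i\,\Pi\, h\nabla_x\cdot(h\nabla\tau_1)\,\Pi_{jl} a^w \Pi_{jl}\tau_1 - 2i\,\tau_1\Pi_{jl} a^w \Pi_{jl}\,(h\nabla\tau_1)\cdot h\nabla_x\, \Pi\Bigr).
\end{aligned}
\]

Using the fact that $(\nabla\tau_1)[a^w, \Pi_{jl}]\tau_1 = \text{“}O(h)\text{”}$,

\[ (\nabla\tau_1)\Pi_j a^w \Pi_j\tau_1 = (\nabla\tau_1)\sum_{l,k} \Pi_{jl} a^w \Pi_{jk}\tau_1 = (\nabla\tau_1)\sum_{l} \Pi_{jl} a^w \Pi_{jl}\tau_1 + \text{“}O(h)\text{”}, \]

for every $j$. A similar relation holds for $\tau_1\Pi_j a^w \Pi_j(\nabla\tau_1)$. We deduce that

\[
\begin{aligned}
\sum_{j=1}^{r}\sum_{l=1}^{l(j)} i\bigl[P^{AD}, \tau_1\Pi_{jl} a^w \Pi_{jl}\tau_1\bigr]
&= \sum_{j=1}^{r}\sum_{l=1}^{l(j)} \tau_1\Pi_{jl}\, i[P, a^w]\, \Pi_{jl}\tau_1 + \text{“}O(h^2)\text{”} \\
&\quad - 2i \sum_{j=1}^{r} \Bigl(\Pi\, h\nabla_x\cdot(h\nabla\tau_1)\,\Pi_j a^w \Pi_j\tau_1 + \tau_1\Pi_j a^w \Pi_j\,(h\nabla\tau_1)\cdot h\nabla_x\, \Pi\Bigr) \\
&= \sum_{j=1}^{r}\sum_{l=1}^{l(j)} \tau_1\Pi_{jl}\, i[P, a^w]\, \Pi_{jl}\tau_1 + \text{“}O(h^2)\text{”} \\
&\quad - 2i \sum_{j=1}^{r} \Bigl(\Pi_j\, h\nabla_x\cdot(h\nabla\tau_1)\,\tau_1\Pi_j a^w \Pi_j + \Pi_j a^w \Pi_j\,\tau_1(h\nabla\tau_1)\cdot h\nabla_x\, \Pi_j\Bigr)
\end{aligned}
\]

because the commutator $[\tau_1, a^w]$ is "$O(h)$". Combining the two preceding computations and using the identity $\tau_1\nabla\tau_1 + \tau_2\nabla\tau_2 = 0$, we therefore find

\[ i\bigl[P^{AD}, F^{AD}\bigr] = \sum_{j=1}^{r} \Biggl(\sum_{l=1}^{l(j)} \tau_1\Pi_{jl}\, i[P, a^w]\, \Pi_{jl}\tau_1 + \tau_2\Pi_j\, i\bigl[P, \operatorname{Op}^w_h(x\cdot\xi)\bigr]\, \Pi_j\tau_2\Biggr) + \text{“}O(h^2)\text{”}. \]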

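Let $\eta > 0$ be small enough that $0, E_j \notin\ ]E - \eta; E + \eta[$ for every $j$, and let $\theta \in C^\infty(\mathbb{R};\mathbb{R})$ be equal to $1$ near $E$ and supported in $]E - \eta; E + \eta[$. For every $j$,

\[ P^{AD}\tau_2\Pi_j - \tau_2\Pi_j P^{AD}_j = \Pi\bigl[-h^2\Delta_x, \tau_2\Pi_j\bigr]\Pi_j = \text{“}O(h)\text{”}. \]

By the Helffer-Sjöstrand functional calculus (cf. [HS]),

\[ \theta(P^{AD})\tau_2\Pi_j - \tau_2\Pi_j\theta(P^{AD}_j) = O(h) \tag{3.9} \]

with $\theta(P^{AD}_j) = 0$ if $j > s$. Since, for all $j$, $l$ and for $P_{jl} = -h^2\Delta_x + \lambda_{jl}(x;0)$,

\[
\begin{aligned}
P^{AD}\tau_1\Pi_{jl} - \tau_1\Pi_{jl}P_{jl}
&= \Pi\bigl(P_e(\cdot;h) - P_e(\cdot;0)\bigr)\tau_1\Pi_{jl} + \Pi\bigl[-h^2\Delta_x, \tau_1\Pi_{jl}\bigr] \\
&= \Pi(\cdot;h)\bigl(P_e(\cdot;h)\Pi_{jl}(\cdot;h) - P_e(\cdot;0)\Pi_{jl}(\cdot;0)\bigr)\tau_1 - \Pi(\cdot;h)P_e(\cdot;0)\bigl(\Pi_{jl}(\cdot;h) - \Pi_{jl}(\cdot;0)\bigr)\tau_1 + \text{“}O(h)\text{”} \\
&= \text{“}O(h)\text{”},
\end{aligned}
\]

we have

\[ \theta(P^{AD})\tau_1\Pi_{jl} - \tau_1\Pi_{jl}\theta(P_{jl}) = O(h), \tag{3.10} \]

by the same functional calculus, with $\theta(P_{jl}) = 0$ if $j > s$, and

\[
\begin{aligned}
\theta(P^{AD})\tau_1\Pi_{jl}\, i[P, a^w]\, \Pi_{jl}\tau_1\theta(P^{AD})
&= \theta(P^{AD})\tau_1\Pi_{jl}\, i[P_{jl}, a^w]\, \Pi_{jl}\tau_1\theta(P^{AD}) \\
&= \tau_1\Pi_{jl}\theta(P_{jl})\, i[P_{jl}, a^w]\, \theta(P_{jl})\Pi_{jl}\tau_1 + O(h^2) \\
&\ge ch\,\tau_1\Pi_{jl}\theta^2(P_{jl})\Pi_{jl}\tau_1 + O(h^2)
\end{aligned}
\]

for some $c > 0$, because $a$ is a global escape function for $p_{jl} - E_j$ at energy $E - E_j > 0$, for $j \le s$. Using (3.10) again, this term is therefore bounded below by

\[ ch\,\theta(P^{AD})\tau_1\Pi_{jl}\tau_1\theta(P^{AD}) + O(h^2), \]

for all $j$ and all $l$. On the other hand, for every $j$,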


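\[
\begin{aligned}
\theta(P^{AD})\tau_2\Pi_j\, i\bigl[P, \operatorname{Op}^w_h(x\cdot\xi)\bigr]\, \Pi_j\tau_2\theta(P^{AD})
&= h\,\theta(P^{AD})\tau_2\Pi_j\bigl(-2h^2\Delta_x - x\cdot(\nabla_x I_a)\bigr)\Pi_j\tau_2\theta(P^{AD}) \\
&= h\,\tau_2\theta(P^{AD}_j)\Pi_j\bigl(2(P^{AD}_j - E_j) - x\cdot(\nabla_x I_a) - 2(P_e - E_j)\bigr)\Pi_j\theta(P^{AD}_j)\tau_2 + O(h^2) \\
&\ge h\,\tau_2\Pi_j\theta(P^{AD}_j)\bigl(3(E - E_j)/2 - x\cdot(\nabla_x I_a) - 2(P_e - E_j)\bigr)\theta(P^{AD}_j)\Pi_j\tau_2 + O(h^2) \\
&\ge h\,\theta(P^{AD})\tau_2\Pi_j\bigl(3(E - E_j)/2 - x\cdot(\nabla_x I_a) - 2(P_e - E_j)\bigr)\Pi_j\tau_2\theta(P^{AD}) + O(h^2) \\
&\ge h\,\frac{E - E_j}{2}\,\theta(P^{AD})\tau_2\Pi_j\tau_2\theta(P^{AD}) + O(h^2)
\end{aligned}
\]

by (3.5) for $j \le s$ and (3.9). Combining these inequalities, we can write

\[
\begin{aligned}
\theta(P^{AD})\, i\bigl[P^{AD}, F^{AD}\bigr]\, \theta(P^{AD})
&\ge c'h\,\theta(P^{AD}) \Biggl(\sum_{j=1}^{r}\Biggl(\sum_{l=1}^{l(j)} \tau_1\Pi_{jl}\tau_1 + \tau_2\Pi_j\tau_2\Biggr)\Biggr) \theta(P^{AD}) + O(h^2) \\
&\ge c'h\,\theta(P^{AD})^2 + O(h^2)
\end{aligned}
\]

for some $c' > 0$, which completes the proof of Proposition 3.1.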

Remark 3.3. In this proof we used the "multi-escape" function, that is, a global escape function common to the different classical Hamiltonians, to assert that the principal symbol given by (3.8) vanishes. This is no longer clear if one takes a priori distinct functions $a_{jl}$, each $a_{jl}$ being a global escape function for $|\xi|^2 + \lambda_{jl}(x;0) - E_j$ at energy $E - E_j$, and sets

\[ F^{AD} = \sum_{j=1}^{r} \Biggl(\sum_{l=1}^{l(j)} \tau_1\Pi_{jl}\, a^w_{jl}\, \Pi_{jl}\tau_1 + \tau_2\Pi_j \operatorname{Op}^w_h(x\cdot\xi)\Pi_j\tau_2\Biggr) \]

(where $\operatorname{Op}^w_h(x\cdot\xi)$ denotes the $h$-pseudodifferential operator with Weyl symbol $x\cdot\xi$).

Proof (of Theorem 1.4). Starting from the semiclassical Mourre estimate of Proposition 3.1, one can rerun Mourre's method while keeping track of the parameter $h$ and obtain (1.1). This is precisely what Theorem 3.2 of [KMW1] asserts. Under condition (1.2), we use Theorem 3.4 of [KMW1], which gives the approximation (1.3) of Theorem 1.4. It therefore remains to establish estimate (1.4) for $\rho > 0$. We first establish the Mourre estimate. As conjugate operator for $P$, we consider the operator $F^{AD}$ used in the proof of Proposition 3.1, which is given by (3.6). By that same proof, the double commutator $[[P^{AD}, F^{AD}], F^{AD}]$ satisfies estimate (3.1). We exploit condition (1.2) on the energy $E$ in the following lemma.

Lemma 3.4 ([KMW1], Lemma 4.3). Let $\theta \in C_0^\infty(\mathbb{R};\mathbb{R})$ be such that the supremum of its support is strictly less than $E^{AD}$, i.e. $\sup(\operatorname{supp}\theta) < E^{AD}$. Then

\[ \bigl\|\theta\bigl(P(h)\bigr) - \theta\bigl(P^{AD}(h)\bigr)\bigr\| = O(h). \]


Proof. It suffices to observe that the proof of Lemma 4.3 of [KMW1] remains valid under hypothesis $(D_\rho)$ with $\rho > 0$.

Let $\eta > 0$ be small enough that $0, E_j \notin\, ]E-\eta; E+\eta[$ for all $j$, and let $\theta \in C^\infty(\mathbb{R};\mathbb{R})$ be equal to 1 near $E$ with support in $]E-\eta; E+\eta[$. By Lemma 3.4,

$$\theta(P)\, i[P, F^{AD}]\,\theta(P) = \theta(P^{AD})\, i[P, F^{AD}]\,\theta(P^{AD}) + O(h^2).$$

Since the kernel of $P^{AD}$ is contained in the range of $1 - \Pi$ and $\theta(0) = 0$, we have $\theta(P^{AD}) = \Pi\theta(P^{AD}) = \theta(P^{AD})\Pi$ and

$$\theta(P^{AD})\, i[P, F^{AD}]\,\theta(P^{AD}) = \theta(P^{AD})\, i[P^{AD}, F^{AD}]\,\theta(P^{AD}).$$

Consequently, by the proof of Proposition 3.1, we can write

$$\theta(P)\, i[P, F^{AD}]\,\theta(P) \ge c h\,\theta^2(P^{AD}) + O(h^2) \ge c h\,\theta^2(P) + O(h^2),$$

using Lemma 3.4 once more. We thus obtain the semiclassical Mourre estimate for $P$. Mourre's method with parameter then gives estimate (1.4).

Remark 3.5. If an eigenvalue crossing occurs at an energy strictly greater than $E$ (cf. Remark 1.6), Theorem 1.4 remains valid. The preceding arguments must be modified because the projectors $\Pi_{jl}$ may have singularities at the crossing. But since the "escape" properties of the function $a$ are localized on energy surfaces that are far from the crossing, one can require that $a$ vanish near the crossing. In that case the operator $F^{AD}$, defined by (3.6), is still a pseudodifferential operator and the preceding proof can be repeated.

4. Semiclassical Approximation of the Channel Wave Operators

Thanks to Theorem 1.4 above, Theorem 4.1 of [KMW1] allows us to assert, for short-range potentials, that the wave operators

$$\Omega^{NAD}_\pm(h) = \operatorname*{s-lim}_{t\to\pm\infty} e^{ih^{-1}tP(h)}\, e^{-ih^{-1}tP^{AD}(h)}\, E_{ac}\big(P^{AD}(h)\big),$$

where $E_{ac}(P^{AD}(h))$ denotes the projection onto the absolutely continuous subspace of $P^{AD}(h)$, are semiclassically close to the identity. This implies that the wave operators

$$\Omega^{AD}_\pm(h) = \operatorname*{s-lim}_{t\to\pm\infty} e^{ih^{-1}tP^{AD}(h)}\, e^{-ih^{-1}tP_a(h)}\, \Pi_0(h)$$

provide a semiclassical approximation, in a certain energy band, of the channel wave operators

$$\Omega_\pm(h) = \operatorname*{s-lim}_{t\to\pm\infty} e^{ih^{-1}tP(h)}\, e^{-ih^{-1}tP_a(h)}\, \Pi_0(h).$$

With the notation of Section 3, we have the following result.


Theorem 4.1. Under the hypotheses of Theorem 1.4 with $\rho > 1$ and $E_r < E^{AD}$, let $I$ be a non-trapping interval for the classical Hamiltonians $|\xi|^2 + \lambda_{jl}(x;0)$ (cf. Definition 1.3), $1 \le j \le r$ and $1 \le l \le l(j)$, such that $I \subset\, ]E_r, E^{AD}[$ or else such that, for some $j \le r$ (with the convention "$E_0 = -\infty$"),

$$I \subset\, \big]E_{j-1},\ \inf_{x,l}\lambda_{jl}(x;0)\big[.$$

Then, for every function $\chi \in C_0^\infty(I)$, there exists a constant $C_\chi > 0$ such that

$$\big\|\big(\Omega^{NAD}_\pm(h) - 1\big)\,\chi\big(P^{AD}(h)\big)\big\| \le C_\chi h,$$

$$\big\|\big(\Omega_\pm(h) - \Omega^{AD}_\pm(h)\big)\,\chi\big(P_a(h)\big)\big\| \le C_\chi h.$$

Remark 4.2. This theorem generalizes, for regular potentials, the result of [KMW1], where the operator $P^{AD}$ took into account only one simple eigenvalue of $P^a(0)$. In [KMW2], the result of [KMW1] is obtained in the presence of Coulomb singularities.

Proof. The first estimate follows from the fact that Theorem 4.1 of [KMW1] can be applied, since estimate (1.1) of Theorem 1.4 is available. To obtain the second, recall that

$$\Omega_\pm = \Omega^{NAD}_\pm\, \Omega^{AD}_\pm$$

(cf. [KMW1, Jec1]). As in [KMW1], one observes that, if $\varphi$ is a function in $C_0^\infty(I)$ satisfying $\varphi\chi = \chi$, then, thanks to the intertwining property of the wave operators,

$$\big(\Omega_\pm - \Omega^{AD}_\pm\big)\chi(P_a) = \big(\Omega^{NAD}_\pm - 1\big)\,\varphi(P^{AD})\,\Omega^{AD}_\pm\,\chi(P_a).$$

The second estimate then follows from the first.

One can be more precise about this semiclassical approximation of the channel wave operators. Consider, for $1 \le j \le r$, the wave operators

$$\Omega^{AD}_{j,\pm} = \operatorname*{s-lim}_{t\to\pm\infty} e^{ih^{-1}tP_j^{AD}}\, e^{-ih^{-1}tP_a}\, \Pi_{j0}.$$

By the arguments of [KMW1], these operators $\Omega^{AD}_{j,\pm}$ exist and are complete (see also Section 2.3 of [Jec1]). Note that, in general,

$$\Omega^{AD}_\pm \ne \sum_{j=1}^{r}\Omega^{AD}_{j,\pm}.$$

The operators $P_j^{AD}$ do commute pairwise but, in general,

$$P^{AD} \ne \sum_{j=1}^{r} P_j^{AD}.$$

In fact, the evolution $e^{ih^{-1}tP^{AD}}$ "mixes the levels". However, we have the following Proposition 4.3, which sharpens Theorem 4.1.


Proposition 4.3. Let $\chi \in C_0^\infty(\mathbb{R})$ have as support an interval such that $0, E_j \notin \operatorname{supp}\chi$ for all $j$. Set

$$r_\chi = \max\big\{1 \le j \le r;\ E_j < \inf(\operatorname{supp}\chi)\big\}$$

(with the convention $r_\chi = 0$ if $E_1 > \sup(\operatorname{supp}\chi)$). If the support of $\chi$ is non-trapping for the classical Hamiltonians $|\xi|^2 + \lambda_j(x;0)$, for $1 \le j \le r_\chi$, then

$$\Big\|\Big(\Omega^{AD}_\pm(h) - \sum_{j=1}^{r_\chi}\Omega^{AD}_{j,\pm}(h)\Big)\chi\big(P_a(h)\big)\Big\| = O(h).$$

Proof. Note first that, for $j > r_\chi$, $\Pi_{j0}\,\chi(P_a) = \Pi_{j0}\,\chi(-h^2\Delta_x + P^a) = 0$ because, for $h$ small enough, the eigenvalues of $P^a(h)$ that tend to $E_j$ lie above the support of $\chi$. Hence $\Omega^{AD}_{j,\pm}\chi(P_a) = 0$. To obtain the result, it suffices to have

$$\Big\|\Big(\Omega^{AD}_\pm - \sum_{j=1}^{r}\Omega^{AD}_{j,\pm}\Big)\chi(P_a)\Big\| = O(h). \qquad (4.1)$$

For such a $j$, we introduce the wave operators

$$\Omega^{NAD}_{j,\pm} = \operatorname*{s-lim}_{t\to\pm\infty} e^{ih^{-1}tP^{AD}}\, e^{-ih^{-1}tP_j^{AD}}\, E_{ac}(P_j^{AD}).$$

Note that $\Pi_j E_{ac}(P_j^{AD}) = E_{ac}(P_j^{AD})$. Since

$$
\begin{aligned}
(P^{AD} - P_j^{AD})\Pi_j &= \Pi(-h^2\Delta_x)\Pi_j - \Pi_j(-h^2\Delta_x)\Pi_j = -\sum_{k\ne j}\Pi\big[-h^2\Delta_x, \Pi_k\big]\Pi_j \\
&= O\big(h\langle x\rangle^{-\rho-1}\big)\cdot h\nabla_x\Pi_j + O\big(h^2\langle x\rangle^{-\rho-2}\big)\Pi_j,
\end{aligned}
$$

we see, by Cook's method, that these wave operators exist. Thanks to the intertwining property of these wave operators, the estimates (4.1) follow from

$$\big\|\big(\Omega^{NAD}_{j,\pm} - 1\big)\chi(P_j^{AD})\big\| = O(h). \qquad (4.2)$$

To obtain (4.2), let us check that the proof of Theorem 4.1 of [KMW1] can be repeated. First, for $1 \le j \le r_\chi$, we have the semiclassical resolvent estimate for $P_j^{AD}$ on the support of $\chi$ (cf. Theorem 1.4), thanks to the non-trapping and "non-crossing" hypotheses made on $E_j$. By the theory of locally smooth operators (cf. [RS4]), for every $s > 1/2$ there is therefore a constant $C > 0$ such that

$$\int_{-\infty}^{\infty} \big\|\langle x\rangle^{-s}\, e^{-ih^{-1}tP_j^{AD}}\,\chi(P_j^{AD})\, f\big\|^2\, dt \le C\|f\|^2,$$

for every function $f \in L^2(\mathbb{R}^{n(N+1)}_{x,y})$, uniformly for $h$ small enough (cf. Lemma 4.2 of [KMW1]). Writing, with $Q_j^{AD} = \hat\Pi_j P \hat\Pi_j$ and $\hat\Pi_j = \Pi - \Pi_j$,

$$P^{AD} = P_j^{AD} + Q_j^{AD} + V_j,$$

we can repeat the proof of Lemma 4.3 of [KMW1] and obtain

$$\Big\|\Big(\chi(P^{AD}) - \sum_{j=1}^{r}\chi(P_j^{AD})\Big)\langle x\rangle^{\rho}\Big\| = O(h).$$

Finally, since

$$
\begin{aligned}
V_j\Pi_j &= \sum_{k\ne j}\big(\Pi_k P\Pi_j + \Pi_j P\Pi_k\big)\Pi_j = \sum_{k\ne j}\Pi_k(-h^2\Delta_x)\Pi_j = -\sum_{k\ne j}\big[-h^2\Delta_x, \Pi_k\big]\Pi_j \\
&= O\big(h\langle x\rangle^{-\rho-1}\big)\cdot h\nabla_x\Pi_j + O\big(h^2\langle x\rangle^{-\rho-2}\big)\Pi_j,
\end{aligned}
$$

the arguments of the proof of Theorem 4.1 of [KMW1] yield (4.2).

5. Other Uses of a Global Escape or "Multi-Escape" Function

The construction of a global escape or "multi-escape" function makes it possible, in other situations, to obtain semiclassical limiting absorption theorems. We shall establish such theorems for a matrix Schrödinger operator and, as in [Ce], for the Dirac operator with a scalar electric field. Let us begin with a case very similar to the one treated in Section 3. Take a matrix Schrödinger operator of the form

$$H = -h^2\Delta_x I_m + M(x),$$

acting on $\big(L^2(\mathbb{R}^n)\big)^m$, where $I_m$ is the $m \times m$ identity matrix (with $m > 1$) and $M(x)$ is a real symmetric matrix of the same size, satisfying the following properties:

• $\mathbb{R}^n \ni x \mapsto M(x)$ is of class $C^\infty$,
• there exist a real symmetric matrix $M_0$ and a real number $\rho > 0$ such that

$$\forall\alpha \in \mathbb{N}^n,\ \exists C_\alpha > 0;\ \forall x \in \mathbb{R}^n,\quad \big\|\partial_x^\alpha\big(M(x) - M_0\big)\big\| \le C_\alpha\langle x\rangle^{-\rho-|\alpha|}. \qquad (5.1)$$

Let $E_1 < \dots < E_r$ be the eigenvalues of $M_0$, each $E_j$ having multiplicity $m_j$ (so that $m = \sum_j m_j$). By property (5.1) for $\alpha = 0$, there exist, for every $j$, $m_j$ functions $\mathbb{R}^n \ni x \mapsto \lambda_{jl}(x) \in \sigma(M(x))$ that tend to $E_j$ at infinity. We make the "non-crossing" hypothesis

$$(j,l) \ne (j',l') \implies \forall x \in \mathbb{R}^n,\ \lambda_{jl}(x) \ne \lambda_{j'l'}(x). \qquad (5.2)$$

In particular, for every $j$, there are $l(j)$ functions $\lambda_{jl}$ that tend to $E_j$ at infinity, and each $\lambda_{jl}$ has constant multiplicity. Under hypothesis (5.2), one can find, for every $j$, a family $\{\Gamma_j(x),\ x \in \mathbb{R}^n\}$ of contours in $\mathbb{C}$ such that

$$\Pi_j(x) = \frac{1}{2i\pi}\oint_{\Gamma_j(x)} \big(z - M(x)\big)^{-1}\, dz$$

is the spectral projector associated with $\{\lambda_{jl}(x),\ 1 \le l \le l(j)\}$ and, for every $1 \le l \le l(j)$, a family $\{\Gamma_{jl}(x),\ x \in \mathbb{R}^n\}$ of contours in $\mathbb{C}$ such that

$$\Pi_{jl}(x) = \frac{1}{2i\pi}\oint_{\Gamma_{jl}(x)} \big(z - M(x)\big)^{-1}\, dz$$

is the spectral projector associated with $\lambda_{jl}(x)$. Note that, for $|x|$ large enough, one may impose $\Gamma_j(x) = \Gamma_j$, a contour enclosing $E_j$ but none of the $E_k$ for $k \ne j$. The spectral projector of $M_0$ associated with the eigenvalue $E_j$ is given by

$$\Pi_{j0} = \frac{1}{2i\pi}\oint_{\Gamma_j} (z - M_0)^{-1}\, dz.$$
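The contour-integral formula for a spectral projector can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical $2 \times 2$ real symmetric matrix with eigenvalues 1 and 3 and a circular contour discretized by the trapezoidal rule; the helper names `resolvent` and `riesz_projector` are illustrative and do not come from the paper.

```python
import cmath
import math


def resolvent(z, m):
    """Return (z*I - m)^(-1) for a 2x2 matrix m given as nested lists."""
    a, b = z - m[0][0], -m[0][1]
    c, d = -m[1][0], z - m[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def riesz_projector(m, center, radius, nodes=200):
    """Approximate (1 / (2*pi*i)) * integral of (z - m)^(-1) dz over a circle."""
    proj = [[0j, 0j], [0j, 0j]]
    for k in range(nodes):
        phase = cmath.exp(2j * math.pi * k / nodes)
        r = resolvent(center + radius * phase, m)
        # With z = center + radius * e^{it}: dz / (2*pi*i) = radius * e^{it} dt / (2*pi),
        # and the trapezoidal rule uses dt = 2*pi / nodes.
        weight = radius * phase / nodes
        for i in range(2):
            for j in range(2):
                proj[i][j] += r[i][j] * weight
    return proj


# Hypothetical symmetric matrix with eigenvalues 1 and 3.
M = [[2.0, 1.0], [1.0, 2.0]]
# The circle of radius 1 around 3 encloses the eigenvalue 3 only.
P3 = riesz_projector(M, center=3.0, radius=1.0)
```

In this example `P3` should be close to the orthogonal projector onto the eigenvector $(1,1)/\sqrt{2}$, whose entries are all $1/2$. Moving the centre of the circle to 1 would instead select the other eigenvalue.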

Thanks to property (5.1) and to the "non-crossing" hypothesis (5.2), the projectors $\Pi_j$ and $\Pi_{jl}$ are of class $C^\infty$ and, for every $\alpha \in \mathbb{N}^n$, there exists $D_\alpha > 0$ such that

$$\forall x \in \mathbb{R}^n,\quad \big\|\partial_x^\alpha\big(\Pi_j(x) - \Pi_{j0}\big)\big\| \le D_\alpha\langle x\rangle^{-\rho-|\alpha|}.$$

Note that, as in Section 3, control of the derivatives of the $\Pi_{jl}$ is not clear. On the other hand, the eigenvalues $\lambda_{jl}(x)$ satisfy condition $(D'_\rho)$ introduced in Section 3. To obtain Theorem 5.3, we prove the following.

Proposition 5.1. Assume that the function $\mathbb{R}^n \ni x \mapsto M(x)$ satisfies the preceding properties (cf. (5.1)) and the "non-crossing" hypothesis (5.2). Let $E \notin \{0\} \cup \{E_j,\ 1 \le j \le r\}$ be a non-trapping energy for each classical Hamiltonian $|\xi|^2 + \lambda_{jl}(x)$, $1 \le j \le r$ and $1 \le l \le l(j)$ (cf. Definition 1.3). For $h$ small enough, there exists a conjugate operator $F$ satisfying

$$\big\|[[H, F], F]\,(H + i)^{-1}\big\| = O(h^2)$$

and the Mourre estimate

$$\chi(H)\, i[H, F]\,\chi(H) \ge \alpha h\,\chi^2(H)$$

with $\chi = \mathbf{1}_{]E-\delta, E+\delta[}$, and $\delta > 0$ and $\alpha > 0$ independent of $h$.

Proof. The proof is analogous to that of Proposition 3.1, taking as conjugate operator

$$F = \sum_{j=1}^{r}\Big(\sum_{l=1}^{l(j)} \tau_1\Pi_{jl}\,(a I_m)^w\,\Pi_{jl}\tau_1 + \tau_2\Pi_j\,(a I_m)^w\,\Pi_j\tau_2\Big),$$

where $a$ is a global "multi-escape" function, constructed with Lemma 3.2, for the classical Hamiltonians $|\xi|^2 + \lambda_{jl}(x)$, with $j$ such that $E_j < E$, at energy $E - E_j$.


Remark 5.2. Imagine for a moment that we have a matrix-valued symbol $(x,\xi) \mapsto A(x,\xi)$ which, together with its first derivatives, commutes with $M(x)$ and its first derivatives, and such that there exists $c > 0$ with

$$\big\{|\xi|^2 I_m + M(x),\ A(x,\xi)\big\} \ge c$$

($\{\cdot,\cdot\}$ denotes the Poisson bracket, matrix-valued this time, and the inequality is also understood in the matrix sense) on the energy shell

$$\big\{(x,\xi) \in \mathbb{R}^{2n};\ \exists E' \in\, ]E-\delta, E+\delta[;\ \det\big(|\xi|^2 I_m + M(x) - E' I_m\big) = 0\big\}.$$

In that case, one can prove Proposition 5.1 without the "non-crossing" hypothesis, using the conjugate operator $A^w$, the $h$-pseudodifferential operator with Weyl symbol $A(x,\xi)$. The whole problem lies in the construction of such an operator. In the absence of "crossings", when the eigenvalues $E_j$ are simple, one looks for scalar symbols $a_j$ with

$$A(x,\xi) = \sum_{j=1}^{m} a_j(x,\xi)\,\Pi_j(x)$$

and, by Remark 3.3, it is better to take $a_1 = \dots = a_m = a$, where $a$ is a global "multi-escape" function for the eigenvalues of $|\xi|^2 I_m + M(x)$.

Thanks to Proposition 5.1, we obtain, by Mourre's method, the following.

Theorem 5.3. Let $E$ satisfy the hypotheses of Proposition 5.1. For every sufficiently small compact neighbourhood $\Lambda$ of $E$ and for every $s > 1/2$,

$$\big\|\langle x\rangle^{-s}(\lambda \pm i0 - H)^{-1}\langle x\rangle^{-s}\big\| = O(h^{-1}),$$

for $h$ small enough and uniformly for $\lambda \in \Lambda$.

Let us now turn to a genuinely different situation. By definition, the semiclassical Dirac operator with scalar electric field, in the Weyl representation, is the first-order matrix operator

$$D = \sum_{j=1}^{3} h\alpha_j D_j + \alpha_4 + V I_4$$

acting on $L^2(\mathbb{R}^3;\mathbb{C}^4)$, where $D_j = \frac{1}{i}\frac{\partial}{\partial x_j}$ and $V$ is the electric potential, which we assume to be smooth and long-range. The $\alpha_j$ are the $4 \times 4$ matrices

$$\alpha_4 = \begin{pmatrix} 0 & I_2 \\ I_2 & 0 \end{pmatrix}, \qquad \alpha_j = \begin{pmatrix} \sigma_j & 0 \\ 0 & -\sigma_j \end{pmatrix}, \quad 1 \le j \le 3,$$

where $I_2$ is the $2 \times 2$ identity matrix and the $\sigma_j$ are the Pauli matrices

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

(cf. [T, W2]). The $h$-symbol of the free operator (that is, for $V = 0$),

$$\sum_{j=1}^{3}\alpha_j\xi_j + \alpha_4,$$

which for every $(x,\xi)$ is a $4 \times 4$ matrix, has two double eigenvalues $\pm\langle\xi\rangle$ with $\langle\xi\rangle = \sqrt{1 + |\xi|^2}$. Denote by $\Pi_\pm(\xi)$ the associated spectral projectors. These projectors, which can be written explicitly using the Foldy-Wouthuysen operator (cf. [T], [W2]), are smooth in $\xi$. The $h$-symbol of the operator $D$,

$$d(x,\xi) = \sum_{j=1}^{3}\alpha_j\xi_j + \alpha_4 + V(x) I_4,$$

therefore also has two double eigenvalues $p_\pm(x,\xi) = \pm\langle\xi\rangle + V(x)$. Let us try to apply the preceding method to this operator $D$. One should thus construct a global escape function for $p_+$ and one for $p_-$. But, under the hypotheses made on $V$, no such functions exist! Indeed, for $\lambda_0 \in \mathbb{R}$, the energy surfaces $p_+^{-1}(\lambda_0)$ and $p_-^{-1}(\lambda_0)$ are never simultaneously non-compact. It is nevertheless possible to obtain a semiclassical limiting absorption theorem in the case where one of these surfaces is empty. We thus recover the result of [Ce], with a less restrictive hypothesis on $V$.

Proof (of Theorem 1.7). By (1.6), for $\varepsilon > 0$ small enough,

$$]\lambda_0 - \varepsilon; \lambda_0 + \varepsilon[\ \cap\ \sigma(P_-) = \emptyset, \qquad (5.3)$$

where $P_- = \mathrm{Op}^w_h(p_-)$. The contribution of $P_-$ to the Mourre estimate will disappear. It therefore suffices to construct a global escape function for $p_+$. This is the purpose of the following lemma.

Lemma 5.4. Assume that the potential $V$ satisfies condition $(D'_\rho)$ for $\rho > 0$. For every energy $\lambda_0 > 1$ that is non-trapping for $p_+$ (cf. Definition 1.3), there exist a function $a \in C^\infty(\mathbb{R}^6)$ and real numbers $\varepsilon > 0$, $C_0 > 0$ such that, on $p_+^{-1}\big(]\lambda_0 - \varepsilon; \lambda_0 + \varepsilon[\big)$,

$$\{p_+, a\} \ge C_0.$$

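Proof. We again follow [GM]. On the shell $p_+^{-1}\big(]\lambda_0 - \varepsilon; \lambda_0 + \varepsilon[\big)$,

$$\{p_+, x\cdot\xi\}(x,\xi) = \frac{|\xi|^2}{\langle\xi\rangle} - x\cdot\nabla V(x) \ge \lambda_0 - \varepsilon - \frac{1}{\langle\xi\rangle} - x\cdot\nabla V(x) - V(x).$$

By $(D'_\rho)$, one can find a real number $R > 0$ such that, for $\varepsilon > 0$ small enough,

$$(x,\xi) \in p_+^{-1}\big(]\lambda_0 - \varepsilon; \lambda_0 + \varepsilon[\big) \cap \{(x,\xi);\ |x| \ge R\} \implies \{p_+, x\cdot\xi\}(x,\xi) \ge \frac{\lambda_0 - 1}{2}. \qquad (5.4)$$

Let $x \mapsto g(x)$ be a function in $C_0^\infty\big(\{y;\ |y| \le R_1\}\big)$ with $R < R_1$, equal to 1 for $|x| \le R$ and satisfying $0 \le g \le 1$. Set

$$f_+(x,\xi) = -\int_0^{+\infty} g\circ\varphi^t_{1,+}(x,\xi)\, dt,$$

where $\varphi^t_{1,+}(x,\xi)$ is the spatial component of the flow $\varphi^t_+(x,\xi)$ associated with $p_+$. By the non-trapping hypothesis, on $p_+^{-1}\big(]\lambda_0 - \varepsilon; \lambda_0 + \varepsilon[\big)$ this function $f_+$ is of class $C^\infty$ and $\{p_+, f_+\}(x,\xi) = g(x)$. Moreover, it is uniformly bounded by $T_{R_1,+}$, the sojourn time in the ball $\{x;\ |x| \le R_1\}$ for the dynamics associated with $p_+$. Let $\chi_+ \in C_0^\infty(\mathbb{R}^n)$ be equal to 1 on the support of $g$ and let $C_+ > 0$; we set

$$a(x,\xi) = x\cdot\xi + C_+\chi_+(x) f_+(x,\xi).$$

The function $a$ is $C^\infty$ on $p_+^{-1}\big(]\lambda_0 - \varepsilon; \lambda_0 + \varepsilon[\big)$ and

$$\{p_+, a\} = C_+ g + C_+ f_+\{p_+, \chi_+\} + \{p_+, x\cdot\xi\}.$$

For $C_+$ large enough and thanks to (5.4), on $p_+^{-1}\big(]\lambda_0 - \varepsilon; \lambda_0 + \varepsilon[\big)$,

$$C_+ g + \{p_+, x\cdot\xi\} \ge \frac{\lambda_0 - 1}{2}.$$

Choosing now the variations of $\chi_+$ small enough (which is possible by enlarging its support), we ensure that, on $p_+^{-1}\big(]\lambda_0 - \varepsilon; \lambda_0 + \varepsilon[\big)$,

$$\big|C_+ f_+\{p_+, \chi_+\}\big| \le \frac{\lambda_0 - 1}{4}.$$

Combining the last two bounds gives $\{p_+, a\} \ge (\lambda_0 - 1)/4$ on this shell, so one may take $C_0 = (\lambda_0 - 1)/4$.

Let us return to the proof of Theorem 1.7. Set $F = \Pi_+ a^w \Pi_+$. This operator satisfies

$$\big\|[[D, F], F]\,(D + i)^{-1}\big\| = O(h^2).$$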
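The identity $\{p_+, x\cdot\xi\} = |\xi|^2/\langle\xi\rangle - x\cdot\nabla V$ used in the proof of Lemma 5.4 can be checked numerically. The sketch below is only an illustration under stated assumptions: the potential $V(x) = 0.3\,(1+|x|^2)^{-1/2}$ is a hypothetical smooth decaying example, the bracket convention $\{f,g\} = \partial_\xi f\cdot\partial_x g - \partial_x f\cdot\partial_\xi g$ is assumed because it reproduces the formula of the text, and derivatives are approximated by central differences.

```python
import math


def bracket(xi):
    """Japanese bracket <xi> = sqrt(1 + |xi|^2)."""
    return math.sqrt(1.0 + sum(c * c for c in xi))


def potential(x):
    """Hypothetical smooth decaying potential V(x) = 0.3 / sqrt(1 + |x|^2)."""
    return 0.3 / math.sqrt(1.0 + sum(c * c for c in x))


def p_plus(x, xi):
    """Symbol p_+(x, xi) = <xi> + V(x)."""
    return bracket(xi) + potential(x)


def dilation(x, xi):
    """Symbol x . xi."""
    return sum(a * b for a, b in zip(x, xi))


def central_diff(f, x, xi, slot, i, step=1e-5):
    """Central difference of f in coordinate i of x (slot 0) or of xi (slot 1)."""
    hi, lo = [list(x), list(xi)], [list(x), list(xi)]
    hi[slot][i] += step
    lo[slot][i] -= step
    return (f(*hi) - f(*lo)) / (2 * step)


def poisson(f, g, x, xi):
    """{f, g} = sum_i (df/dxi_i * dg/dx_i - df/dx_i * dg/dxi_i)."""
    return sum(
        central_diff(f, x, xi, 1, i) * central_diff(g, x, xi, 0, i)
        - central_diff(f, x, xi, 0, i) * central_diff(g, x, xi, 1, i)
        for i in range(len(x))
    )


def closed_form(x, xi):
    """|xi|^2 / <xi> - x . grad V(x) for the potential above."""
    r2 = sum(c * c for c in x)
    x_dot_grad_v = -0.3 * r2 / (1.0 + r2) ** 1.5
    return sum(c * c for c in xi) / bracket(xi) - x_dot_grad_v


X, XI = (1.0, -2.0, 0.5), (0.3, 1.2, -0.7)
NUMERIC = poisson(p_plus, dilation, X, XI)
EXACT = closed_form(X, XI)
```

The two values `NUMERIC` and `EXACT` should agree up to the finite-difference error. The sample point is arbitrary and is not required to lie on a particular energy shell.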

We now aim to obtain the semiclassical Mourre estimate near $\lambda_0$ with this operator $F$. Let $\theta \in C^\infty(\mathbb{R};\mathbb{R})$ be equal to 1 near $\lambda_0$ with support in $]\lambda_0 - \varepsilon; \lambda_0 + \varepsilon[$ (with the $\varepsilon$ of Lemma 5.4). Setting $P_+ = \mathrm{Op}^w_h(p_+)$,

$$\theta(D) = \theta(P_+)\Pi_+ + \theta(P_-)\Pi_- = \theta(P_+)\Pi_+ \qquad (5.5)$$

by (5.3). Consequently, we can write

$$\theta(D)\, i[D, F]\,\theta(D) = \theta(P_+)\Pi_+\, i[D, F]\,\Pi_+\theta(P_+) = \theta(P_+)\Pi_+\, i\big[P_+, \Pi_+ a^w \Pi_+\big]\,\Pi_+\theta(D).$$

The operators $\Pi_+$ and $P_+$ do not necessarily commute, but

$$\Pi_+\, i\big[P_+, \Pi_+ a^w \Pi_+\big]\,\Pi_+ = \Pi_+\, i\big[P_+, a^w\big]\,\Pi_+ + \Pi_+ a^w\, i[V I_4, \Pi_+]\,\Pi_+ + \Pi_+\, i[V I_4, \Pi_+]\, a^w \Pi_+.$$

The principal Weyl $h$-symbol of the last two terms is

$$h\,\Pi_+(\xi)\{V, \Pi_+\}(x,\xi)\, a(x,\xi)\,\Pi_+(\xi) = -h\, a(x,\xi)\,(\nabla V)(x)\cdot\Pi_+(\xi)(\nabla_\xi\Pi_+)(\xi)\Pi_+(\xi) = 0$$

by Remark 2.2. We deduce that

$$
\begin{aligned}
\theta(D)\, i[D, F]\,\theta(D) &= \theta(P_+)\Pi_+\, i\big[P_+, a^w\big]\,\Pi_+\theta(P_+) + O(h^2) \\
&\ge c h\,\theta(P_+)^2\Pi_+ + O(h^2) \ge c h\,\theta(D)^2 + O(h^2)
\end{aligned}
$$

for some $c > 0$, by (5.5). Once again, the proof is completed by Mourre's method with parameter.
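The statement that the free Dirac symbol $\sum_j \alpha_j\xi_j + \alpha_4$ has the two double eigenvalues $\pm\langle\xi\rangle$ can be verified by a short computation: its square is $\langle\xi\rangle^2 I_4$ and its trace vanishes. The following sketch checks this in plain Python. It assumes the usual Weyl-representation sign convention $\alpha_j = \operatorname{diag}(\sigma_j, -\sigma_j)$; the helper names are illustrative only.

```python
I2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def negate(m):
    """Entrywise negative of a matrix given as nested lists."""
    return [[-v for v in row] for row in m]


def blocks(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


# Weyl representation (assumed sign convention): alpha_j = diag(sigma_j, -sigma_j).
ALPHA = [blocks(s, ZERO2, ZERO2, negate(s)) for s in PAULI]
ALPHA4 = blocks(ZERO2, I2, I2, ZERO2)


def free_symbol(xi):
    """Matrix sum_j alpha_j * xi_j + alpha_4 for xi in R^3."""
    return [
        [sum(xi[j] * ALPHA[j][r][c] for j in range(3)) + ALPHA4[r][c] for c in range(4)]
        for r in range(4)
    ]


XI = (0.3, -1.1, 2.0)  # arbitrary sample momentum
D0 = free_symbol(XI)
D0_SQUARED = matmul(D0, D0)
BRACKET_SQUARED = 1.0 + sum(c * c for c in XI)
TRACE = sum(D0[i][i] for i in range(4))
```

Since `D0_SQUARED` equals `BRACKET_SQUARED` times the identity, every eigenvalue of the Hermitian matrix `D0` is $\pm\langle\xi\rangle$, and the vanishing trace forces each sign to occur twice. Adding $V(x) I_4$ only shifts both eigenvalues by $V(x)$.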


Acknowledgements. The author expresses his deep gratitude to X.P. Wang for his constant support and his many pieces of advice (in particular, he pointed out that the method used in Section 3 should apply to the Dirac operator). The author thanks J.M. Combes and A. Martinez for their remarks and for the interest they have shown in this work. The author also wishes to thank the members of the mathematics department of the TU Berlin for their hospitality, and in particular V. Bach. The author is financially supported by the European TMR programme of the European Commission entitled "Network Postdoctoral training programme in partial differential equations and application in quantum mechanics".

References

[A] Agmon, S.: Lectures on Exponential Decay of Solutions of Second-Order Elliptic Equations. Princeton, NJ: Princeton University Press, 1982
[BO] Born, M., Oppenheimer, R.: Zur Quantentheorie der Molekeln. Ann. der Phys. 84, 457 (1927)
[Ce] Cerbah, S.: Principe d'absorption limite semi-classique pour l'opérateur de Dirac. Preprint
[CDS] Combes, J.M., Duclos, P., Seiler, R.: The Born–Oppenheimer Approximation. Rigorous Atomic and Molecular Physics, eds. Wightman and Velo, New York: Plenum, 1981
[CT] Combes, J.M., Tip, A.: Properties of the scattering amplitude for electron-atom collisions. Ann. I.H.P. 40, no. 2, 117–139 (1984)
[GM] Gérard, C., Martinez, A.: Principe d'absorption limite pour des opérateurs de Schrödinger à longue portée. C.R. Acad. Sci. 306, 121–123 (1988)
[H] Hagedorn, G.A.: Molecular propagation through electron energy level crossings. Memoirs AMS 536, 111 (1994)
[HS] Helffer, B., Sjöstrand, J.: Opérateurs de Schrödinger avec champs magnétiques faibles et constants. Exposé No. XII, Séminaire EDP, février 1989, Ecole Polytechnique
[Jec1] Jecko, Th.: Sections efficaces totales d'une molécule diatomique dans l'approximation de Born–Oppenheimer. Thèse de doctorat, Université de Nantes, 1996
[Jec2] Jecko, Th.: Approximation de Born–Oppenheimer de sections efficaces totales diatomiques. Preprint
[JMP] Jensen, A., Mourre, E., Perry, P.: Multiple commutator estimates and resolvent smoothness in quantum scattering theory. Ann. IHP 41, 207–225 (1984)
[K] Kato, T.: Perturbation Theory for Linear Operators. Berlin: Springer, 1976
[KMSW] Klein, M., Martinez, A., Seiler, A., Wang, X.P.: On the Born–Oppenheimer Expansion for Polyatomic Molecules. Commun. Math. Phys. 143, 606–639 (1992)
[KMW1] Klein, M., Martinez, A., Wang, X.P.: On the Born–Oppenheimer Approximation of Wave Operators in Molecular Scattering Theory. Commun. Math. Phys. 152, 73–95 (1993)
[KMW2] Klein, M., Martinez, A., Wang, X.P.: On the Born–Oppenheimer Approximation of Wave Operators II: Singular Potentials. J. Math. Phys. 38, 3 (1997)
[Mo] Mourre, E.: Absence of singular continuous spectrum for certain self-adjoint operators. Commun. Math. Phys. 78, 391–408 (1981)
[Ne] Nédélec, L.: Résonances pour l'opérateur de Schrödinger matriciel. Ann. IHP 65, no. 2, 129–162 (1996)
[No] Nourrigat, J.: Amplitude de diffusion pour l'opérateur de Dirac. Exposé aux journées semi-classiques à Lille, 22-24 janvier 1997
[PSS] Perry, P., Sigal, I.M., Simon, B.: Spectral analysis of N-body Schrödinger operators. Ann. Math. 114, 519–567 (1981)
[Ra] Raphaelian, A.: Ion-Atom Scattering within a Born–Oppenheimer Framework. Dissertation, TU Berlin, 1986
[RS3] Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol. III: Scattering Theory. New York: Academic Press, 1979
[RS4] Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol. IV: Analysis of Operators. New York: Academic Press
[RT] Robert, D., Tamura, H.: Semiclassical estimates for resolvents and asymptotics for total cross-section. Ann. IHP 46, 415–442 (1987)
[T] Thaller, B.: The Dirac Equation. Berlin–Heidelberg–New York: Springer-Verlag, 1992
[W1] Wang, X.P.: Semiclassical Resolvent Estimates for N-body Schrödinger Operators. J. Funct. Anal. 97, 466–483 (1991)
[W2] Wang, X.P.: Puits multiples pour l'opérateur de Dirac. Ann. IHP 43, 269–319 (1985)

Communicated by B. Simon

Commun. Math. Phys. 195, 613 – 626 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Densities, Minimal Distances, and Coverings of Quasicrystals*

R. V. Moody¹, J. Patera²

¹ Department of Mathematical Sciences, University of Alberta, Edmonton, T6G 2G1, Canada. E-mail: [email protected]
² Centre de recherches mathématiques, Université de Montréal, CP 6128-A, Montréal (Québec) H3C 3J7, Canada. E-mail: [email protected]

Received: 11 February 1997 / Accepted: 19 December 1997

Abstract: Quasicrystals are generalizations of lattices. In this spirit, we consider here three kinds of geometric properties of mathematical models of quasicrystals: minimal distances between quasicrystal points, covering radii of quasicrystals, and the densities of quasicrystal points. A closed formula for the density of points in quasicrystals with local symmetries of types $H_2$, $H_3$, and $H_4$ is derived. Using this we determine the planes of maximal density in 3-dimensional icosahedral (i.e. $H_3$) quasicrystals. Under fairly mild conditions on the acceptance window, these turn out to be the planes orthogonal to the 5-fold axes of the $H_3$-symmetry, revealing that the symmetry is already implicit in the quasicrystal, even when it is not explicit. We show how to determine the minimal distances and covering radii in particular quasicrystals. In particular we derive the covering radius of the canonical $H_2$ quasicrystal $\Sigma_2$ and derive an interesting covering result and estimate for the quasicrystal $\Sigma_4$.

* Work supported in part by the Natural Sciences and Engineering Research Council of Canada and by FCAR of Quebec.

1. Introduction

Curiosity about the unusual geometric properties of aperiodic systems with long range order has been the driving force behind the theoretical study of such systems since their discovery in 1974 [P74], with a great deal of new impetus added to it by the experimental discovery of quasicrystals ten years later [SBC84, J94, S95]. Quasicrystals are generalizations of lattices. In that spirit, we consider here three kinds of geometric properties of mathematical models of quasicrystals: minimal distances between quasicrystal points, covering radii of quasicrystals, and the densities of quasicrystal points. We have concentrated our attention on the physically most important and representative form of aperiodic symmetry, namely that built around fivefold / icosahedral symmetry and the golden ratio $\tau$. The interesting thing about this particular form of symmetry is its close relation to the three non-crystallographic Coxeter groups that are built on 5-fold symmetry: namely those of types $H_2$, $H_3$, and $H_4$. Quasicrystals built around these groups and their corresponding root systems are not only relevant but also have an especially attractive and amenable theory. In particular they are closely related to cut and project sets based on the root lattices $A_4$, $D_6$, $E_8$, and their sublattices.

The new results in the article are the following: (i) We derive a closed formula for the density of points in quasicrystals with local symmetries of types $H_2$, $H_3$, and $H_4$. The existence of, and a general formula for, the densities of point sets arising out of the cut and project scheme have been derived in [H98] and [Sch98]. Here we offer a proof tailored to our situation which provides at the same time an explicit version of the density formula. Using the density formula, (ii) we determine the planes of maximal density in 3-dimensional icosahedral (i.e. $H_3$) quasicrystals. Under fairly mild conditions on the acceptance window, these turn out to be the planes orthogonal to the 5-fold axes of the $H_3$-symmetry. This leads to the interesting and useful result that even if the quasicrystal carries no global $H_3$-symmetry, the orientation of the implicit $H_3$-symmetry is canonically determined by the local symmetry of the quasicrystal. In addition we show how to determine (iii) minimal distances and (iv) covering radii in particular quasicrystals. The former is fairly straightforward in any particular case, as our method shows. The latter, as in the case of lattices, is more difficult. Here we have derived the covering radius of the canonical $H_2$ quasicrystal $\Sigma_2$ and derived an interesting covering result and estimate for the quasicrystal $\Sigma_4$. In the follow-up article to this one, minimal distances are described in a general setting [MPP98/1].

Apart from this Introduction, the paper consists of five sections. In Sect. 2, while introducing some notation, we recall the three non-crystallographic root systems $H_2$, $H_3$, $H_4$, and their $\mathbb{Z}$-spans $M$ as the stage for building the quasicrystals, define the star map of $M$, and using that define the quasicrystals following [MP93]. In Sect. 3 the minimal distances, as defined by the diameter of the acceptance region of a quasicrystal, are considered. Sect. 4 contains a derivation of the closed density formula. In Sect. 5 the planar subquasicrystals of the $H_3$-quasicrystals are considered and the planes of maximal density are found. In the last section some results on covering radii are presented.

2. Notation and Preliminaries

1. The root systems are familiar attributes of the semisimple Lie algebras and the Kac-Moody algebras, though they only appear properly in Lie theory in the crystallographic cases. However, they are in fact defined for any finite Coxeter group [D82, H90], and it is the non-crystallographic ones that have proven to be of great value in the study of quasicrystals and aperiodic order. Here we are interested in the three non-crystallographic root systems $\Delta_2$, $\Delta_3$, $\Delta_4$ of types $H_2$, $H_3$, $H_4$, and the corresponding Coxeter groups of orders 10, 120, and 14 400. These have the following Coxeter-Dynkin diagrams:

◦ --5-- ◦

◦ ----- ◦ --5-- ◦

◦ ----- ◦ ----- ◦ --5-- ◦

respectively. A detailed development of quasicrystals from this perspective can be found in [CMP98, P97]. Each of these root systems has a rather explicit description in terms of coordinates, called the standard model, which we now describe.


Throughout the paper we frequently use the constants $\tau$ and $\tau'$:

$$\tau = \tfrac12(1 + \sqrt5) = -\zeta^2 - \zeta^3 \simeq 1.618, \qquad \tau' = \tfrac12(1 - \sqrt5) = -\zeta - \zeta^4 \simeq -0.618, \qquad \tau\tau' = -1, \qquad \tau + \tau' = 1.$$
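These identities are easy to confirm in floating point. The following minimal sketch takes $\zeta = e^{2\pi i/5}$, the fifth root of unity used in the standard model of $H_2$, and collects the checks in a dictionary; the names `TAU`, `TAU_PRIME`, `ZETA` and `CHECKS` are illustrative only.

```python
import cmath
import math

TAU = (1 + math.sqrt(5)) / 2
TAU_PRIME = (1 - math.sqrt(5)) / 2
ZETA = cmath.exp(2j * math.pi / 5)  # primitive fifth root of unity


def close(u, v, tol=1e-12):
    """True if two (possibly complex) numbers agree up to tol."""
    return abs(u - v) < tol


CHECKS = {
    "tau * tau' = -1": close(TAU * TAU_PRIME, -1),
    "tau + tau' = 1": close(TAU + TAU_PRIME, 1),
    "tau^2 = tau + 1": close(TAU ** 2, TAU + 1),
    "tau = -zeta^2 - zeta^3": close(TAU, -ZETA ** 2 - ZETA ** 3),
    "tau' = -zeta - zeta^4": close(TAU_PRIME, -ZETA - ZETA ** 4),
    "1 + zeta + ... + zeta^4 = 0": close(sum(ZETA ** j for j in range(5)), 0),
}

for name, ok in CHECKS.items():
    print(f"{name}: {ok}")
```

Each entry of `CHECKS` should come out `True`. The relation $\tau^2 = \tau + 1$ is the one that makes exact integer arithmetic in $\mathbb{Z}[\tau]$ possible.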

The standard model of the root system $\Delta_2$ of $H_2$ is given by the ten points $\pm\zeta^j$, $j = 0, \dots, 4$, in the complex plane $\mathbb{C}$, where $\zeta = e^{2\pi i/5}$. The root polytope $\langle\Delta_k\rangle$ is the convex hull of the roots. Thus $\langle\Delta_2\rangle$ is a regular decagon. As the simple roots one may choose $\alpha_1 = 1$ and $\alpha_2 = \zeta^2$, which make $\zeta = \tau(\alpha_1 + \alpha_2)$ the highest root.

The standard model [CMP98] of the root system $\Delta_3$, given relative to an orthonormal basis in $\mathbb{R}^3$, consists of the following 30 vectors:

$$(\pm1, 0, 0) \text{ and permutations}, \qquad \tfrac12(\pm1, \pm\tau', \pm\tau) \text{ and even permutations}. \qquad (2.1)$$

The root polytope $\langle\Delta_3\rangle$ is the icosidodecahedron described, for example, in [C63, CKPS95].

The standard model [CMP98] of the root system $\Delta_4$, given relative to an orthonormal basis in $\mathbb{R}^4$, consists of the following 120 vectors:

$$\tfrac12(\pm1, \pm1, \pm1, \pm1), \qquad (\pm1, 0, 0, 0) \text{ and all permutations}, \qquad \tfrac12(0, \pm1, \pm\tau', \pm\tau) \text{ and all even permutations}. \qquad (2.2)$$
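The vector counts in (2.1) and (2.2) can be reproduced by direct enumeration. The sketch below builds both standard models in floating point, then checks that every root has length 1 and that each set is mapped to itself by the reflections in its own roots. The function names, and the rounding used to compare floating-point vectors, are choices of this illustration rather than anything prescribed by the text.

```python
import itertools
import math

TAU = (1 + math.sqrt(5)) / 2
TAU_PRIME = (1 - math.sqrt(5)) / 2


def key(vec):
    """Rounded copy of a vector, usable as a dict key (adding 0.0 turns -0.0 into 0.0)."""
    return tuple(round(c, 9) + 0.0 for c in vec)


def is_even(perm):
    """True if the permutation (a tuple of indices) has an even number of inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return inversions % 2 == 0


def orbit(base, scale=1.0, even_only=False):
    """All sign changes and (even) coordinate permutations of scale * base, keyed by key()."""
    n = len(base)
    vectors = {}
    for perm in itertools.permutations(range(n)):
        if even_only and not is_even(perm):
            continue
        for signs in itertools.product((1, -1), repeat=n):
            vec = tuple(scale * s * base[p] for s, p in zip(signs, perm))
            vectors[key(vec)] = vec
    return vectors


def reflect(beta, alpha):
    """Reflection of beta in the hyperplane orthogonal to the unit vector alpha."""
    dot = sum(a * b for a, b in zip(alpha, beta))
    return tuple(b - 2 * dot * a for a, b in zip(alpha, beta))


def closed_under_reflections(roots):
    """True if every reflection in a root maps the root set into itself."""
    vecs = list(roots.values())
    return all(key(reflect(beta, alpha)) in roots for alpha in vecs for beta in vecs)


DELTA3 = {**orbit((1, 0, 0)), **orbit((1, TAU_PRIME, TAU), 0.5, even_only=True)}
DELTA4 = {
    **orbit((1, 1, 1, 1), 0.5),
    **orbit((1, 0, 0, 0)),
    **orbit((0, 1, TAU_PRIME, TAU), 0.5, even_only=True),
}
```

With these definitions `len(DELTA3)` and `len(DELTA4)` should be 30 and 120. The restriction to even permutations matters: allowing all permutations in the last family would double its size and destroy the closure under reflections.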

Choosing for the simple roots

$$\alpha_1 = \tfrac12(-\tau', -\tau, 0, -1), \qquad \alpha_2 = \tfrac12(0, -\tau', -\tau, 1), \qquad \alpha_3 = \tfrac12(0, 1, -\tau', -\tau), \qquad \alpha_4 = \tfrac12(0, -1, -\tau', \tau), \qquad (2.3)$$

we have the highest root as $(1, 0, 0, 0)$. The subset $\{\alpha_2, \alpha_3, \alpha_4\}$ serves as a basis of simple roots for the root system $\Delta_3$. The root polytope $\langle\Delta_4\rangle$ is the regular polytope called the 600-cell. (Its 3-dimensional surface is formed by 600 regular tetrahedra.) For its detailed description, see for example [C63, CKPS95].

2. One of the features that distinguishes non-crystallographic and crystallographic root systems is the existence in the case of the former of non-trivial root system maps, which we call star mappings. The theory of these is developed in [CMP98], but a good way to think of them in our context is as the mappings, from the points of a cut and project set in physical space to points inside the window of internal space, determined by the projection maps of a cut and project scheme. Indeed we will implicitly use this idea to construct our quasicrystals.

A root system is normalized if all its roots have length 1. A star map $*$ is normalized if also $\Delta_k^*$ is normalized. Each root system has, up to Coxeter group symmetry, only one normalized star map (Proposition 6.1 of [CMP98]). The star map for the root system $\Delta_2$ can be defined by

$$(\pm\zeta^j)^* = \pm\zeta^{2j}, \qquad \text{for all } j.$$

Consequently $\Delta_2^* = \Delta_2$.


An instructive way to look at it is to consider the algebraic number field

$$\mathbb{Q}(\zeta) = \sum_{j=0}^{\infty}\mathbb{Q}\zeta^j = \bigoplus_{j=0}^{3}\mathbb{Q}\zeta^j$$

(note that $1 + \zeta + \zeta^2 + \zeta^3 + \zeta^4 = 0$). There is a unique Galois automorphism $*$ of this field which sends $\zeta \mapsto \zeta^2$, and hence $\sum q_j\zeta^j \mapsto \sum q_j\zeta^{2j}$ for all rational numbers $q_j$, and it is this that induces $*$ on $\Delta_2$. It is not hard to show that

$$\mathbb{Q}(\zeta) \cap \mathbb{R} = \mathbb{Q}(\tau) = \sum_{j=0}^{\infty}\mathbb{Q}\tau^j = \mathbb{Q} \oplus \mathbb{Q}\tau.$$

The restriction of $*$ to $\mathbb{Q}(\tau)$ is the automorphism of $\mathbb{Q}(\tau)$ that is determined by $\tau \mapsto \tau'$. Indeed,

$$\tau^* = -\zeta^4 - \zeta^6 = \tau', \qquad (\tau')^* = -\zeta^2 - \zeta^8 = \tau.$$

We recall the norm and trace maps on $\mathbb{Q}[\tau]$: $N(x) = xx'$ and $\operatorname{tr}(x) = x + x'$.

The sets $\Delta_3^*$ and $\Delta_4^*$ are defined by applying the star map to each of the coordinates of the corresponding standard model, which amounts to the interchange of $\tau$ and $\tau'$ in (2.1) and (2.2) respectively. In both cases the image of the star map does not coincide with $\Delta_k$, but is another root system of the same type. In all cases, it is apparent that star maps extend beyond the bare point set of the root system itself. We come to this point in the next section.

3. Let $M_k := \mathbb{Z}\Delta_k$ be the $\mathbb{Z}$-span of $\Delta_k$, i.e. the set of all linear combinations of the roots of $\Delta_k$, $k = 2, 3, 4$, with integer coefficients. It is a dense set of points in $\mathbb{R}^k$. In the case of $k = 2$, $M_2 = \sum_{j=0}^{\infty}\mathbb{Z}\zeta^j = \bigoplus_{j=0}^{3}\mathbb{Z}\zeta^j \subset \mathbb{Q}(\zeta)$. Given the geometry of the non-crystallographic root systems, it follows that $M_k$ is closed under multiplication by $\tau$. Thus we have also $M_k = \mathbb{Z}[\tau]\Delta_k$, where $\mathbb{Z}[\tau]$ stands for the numbers $a + b\tau$, $a, b \in \mathbb{Z}$ (which are the integers from the quadratic extension $\mathbb{Q}(\sqrt5)$ of the rational numbers). We say that $M_k$ is a $\mathbb{Z}[\tau]$-lattice. In general a subset $M \subset \mathbb{R}^k$ is a $\mathbb{Z}[\tau]$-lattice if (1) $M$ is a $\mathbb{Z}[\tau]$-module of rank $k$ (under the additive structure of $\mathbb{R}^k$ and the scalar multiplication determined by $\mathbb{Z}[\tau] \subset \mathbb{R}$); $M$ has rank $2k$ as a $\mathbb{Z}$-module; (2) the $\mathbb{R}$-span of $M$ is $\mathbb{R}^k$. A second $\mathbb{Z}[\tau]$-lattice $L$ in $\mathbb{R}^k$ is commensurate with $M$ if $\mathbb{Q}M = \mathbb{Q}L$. This occurs if and only if one of (hence both) $[M : L \cap M]$ and $[L : L \cap M]$ is finite. The determinant $\det(M)$ of a $\mathbb{Z}[\tau]$-lattice $M$ is the determinant of its Gram matrix.

We can extend a star map by $\mathbb{Z}$-linearity to a mapping on all of $M_k$. Its image then gives us $M_k^* \subset \mathbb{R}^k$. By definition for any $X \in M_k$ one has

$$X^* = \big((a + b\tau)\beta_1 + (c + d\tau)\beta_2\big)^* = (a + b\tau')\beta_1^* + (c + d\tau')\beta_2^*, \qquad a, b, c, d \in \mathbb{Z};\ \beta_1, \beta_2 \in \Delta_k, \qquad (2.5)$$

which shows that $*$ is semilinear (i.e. conjugates the coefficients) on $M_k$ as a $\mathbb{Z}[\tau]$-module. In the case of $k = 2$ this definition of $*$ agrees with the one that exists on $M_2$ as a subset of $\mathbb{Q}(\zeta)$. Formally a star map on $M_k$ is a semilinear map for which the image of $\Delta_k$ is another root system of the same type.
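The conjugation $a + b\tau \mapsto a + b\tau'$ on the coefficient ring, together with the norm and trace, can be carried out in exact integer arithmetic. The class below is a minimal sketch: the name `ZTau` and its methods are illustrative, and the only facts used are $\tau^2 = \tau + 1$ and $\tau' = 1 - \tau$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZTau:
    """The number a + b*tau with integers a, b, where tau^2 = tau + 1."""

    a: int
    b: int

    def __add__(self, other):
        return ZTau(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b*tau)(c + d*tau) = ac + (ad + bc)*tau + bd*(tau + 1)
        return ZTau(
            self.a * other.a + self.b * other.b,
            self.a * other.b + self.b * other.a + self.b * other.b,
        )

    def star(self):
        """Conjugation tau -> tau' = 1 - tau, i.e. a + b*tau -> (a + b) - b*tau."""
        return ZTau(self.a + self.b, -self.b)

    def norm(self):
        """N(x) = x * x', an ordinary integer."""
        return (self * self.star()).a

    def trace(self):
        """tr(x) = x + x', an ordinary integer."""
        return (self + self.star()).a


TAU = ZTau(0, 1)
TAU_PRIME = TAU.star()      # ZTau(1, -1), i.e. 1 - tau
PRODUCT = TAU * TAU_PRIME   # ZTau(-1, 0), i.e. tau * tau' = -1
```

In closed form the norm of $a + b\tau$ is $a^2 + ab - b^2$ and its trace is $2a + b$. Applying `star` coefficientwise to a vector written in a $\mathbb{Z}[\tau]$-basis of roots is exactly the semilinearity expressed by (2.5).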


4. There is no standard definition of quasicrystals, but almost every definition of a $k$-dimensional quasicrystal $\Lambda$ displaying local $H_k$ symmetries can be (re)formulated in the following way for some bounded region $\Omega \subset \mathbb{R}^k$ [MP93, CMP98] with non-empty interior:

$$\Lambda := \{x \in L_k \mid x^* \in \Omega\}, \qquad (2.6)$$

where $L_k$ is a $\mathbb{Z}[\tau]$-sublattice of $M_k$, or as a simple combination of such point sets. Sets of this form are special examples of what are called cut and project sets or model sets. In particular the aperiodicity, infinite repetitions of any finite patch, various inflation symmetries, etc. can be readily deduced from this definition [P97]. Of special prominence we have the families of quasicrystals

$$\Lambda_r := \{x \in L_k \mid x^* \in B(0, r)\},$$

where $B(0, r)$ is the ball of radius $r$ about 0, and also the important special cases $\Sigma_k$ which are obtained by choosing for $\Omega$ the root polytope $\langle\Delta_k^*\rangle$ formed from the convex hull of the root system $\Delta_k^*$,

$$\Sigma_k := \{x \in M_k \mid x^* \in \langle\Delta_k^*\rangle\}. \qquad (2.7)$$

These latter are called the canonical quasicrystals of type Hk because they are completely determined by the properties of the corresponding root system. If we form the set gk := {(x, x∗ ) | x ∈ Mk } M then we obtain a lattice in R2k . Furthermore, if we provide it with the inner product, (e x | ye) := 2 tr(δx · y) = 2{δx · y + δ 0 (x · y)0 } = 2{δx · y + δ 0 x∗ · y ∗ }, √ where δ := (τ 5)−1 , then this lattice is an even integral lattice [CMP98]. For M2 , M3 , M4 we obtain respectively A4 , D6 , E8 . gk For any subset S of M Se := {(x, x∗ ) | x ∈ S} gk which we call the lift of S. The sets is a subset of M fk ∪ τg 1 1k are the root systems of types A4 , D6 , E8 respectively. 5. A subset 3 of the Euclidean space Rk is a Delone (or Delaunay) set if it satisfies a) There is an r > 0 such that every ball of radius r meets at most one point of 3; b) There is an R > 0 such that every ball of radius R in Rk contains at least one point of 3.

(2.8)

Clearly any Delone set is countably infinite. The quasicrystals 3 of (2.6) are always Delone sets. The opposite implication also holds under certain additional conditions [MPP98/2].


3. Minimal Distances

Let Δ be a normalized root system of type Δ_k and let ∗ be a normalized star map. Let M be the Z-span of Δ.

Proposition 3.1. Let P be an open convex region in R^k of diameter d ≤ τ². Let Σ = {x ∈ ZΔ_k | x* ∈ P}. Then the minimum distance between any two points of Σ is at least 1/τ.

Proof. Let x, y ∈ Σ, x ≠ y. Then u := x − y ∈ M and |u*| = |x* − y*| < diam(P) ≤ τ². Let |u|² = n + mτ, so |u*|² = n + mτ′. We have

\[ n + m\tau > 0, \qquad 0 < n + m\tau' < \tau^4 . \]

Then

\[ m > n\tau', \qquad n\tau > m > (n - \tau^4)\tau . \tag{3.1} \]

These inequalities are incompatible with n ≤ 0, hence n > 0. From (3.1) and n > 0 we have

\[ n + m\tau > n + (n - \tau^4)\tau^2 = (2n - 5) + (n - 8)\tau . \tag{3.2} \]

We now refer to Table 3.1. For n ≤ 4, the right hand side of (3.2) is negative and so is irrelevant. For n = 5, it is 5 − 3τ > 0 and so eliminates 5 − 3τ from the table. For n > 5, the minimum possible value n + mτ is already larger than 2 − τ. Thus the minimum value of n + mτ is 2 − τ = 1/τ², resulting in the value |u| ≥ 1/τ as required.

Table 1. The permissible values of n + mτ derived from the inequalities nτ′ < m < nτ for n = 1, . . . , 7

n    nτ′ < m < nτ              n + mτ
1    0, 1                      1, 1 + τ
2    −1, 0, 1, 2, 3            2 − τ, 2, . . . , 2 + 3τ
3    −1, 0, 1, 2, 3, 4         3 − τ, 3, . . . , 3 + 4τ
4    −2, −1, 0, . . . , 6      4 − 2τ, 4 − τ, . . . , 4 + 6τ
5    −3, −2, −1, . . . , 8     5 − 3τ, 5 − 2τ, . . . , 5 + 8τ
6    −3, −2, −1, . . . , 9     6 − 3τ, 6 − 2τ, . . . , 6 + 9τ
7    −4, −3, −2, . . . , 11    7 − 4τ, 7 − 3τ, . . . , 7 + 11τ
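The case analysis in the proof of Proposition 3.1 can be checked by brute force. The sketch below is illustrative only: the function names are hypothetical, and it assumes (as the proof does) that |u|² has the form n + mτ with integers n, m. Signs are decided in exact integer arithmetic, because the borderline value 5 − 3τ corresponds to n + mτ′ = τ⁴ exactly and floating point could misclassify it.

```python
import math

TAU = (1 + math.sqrt(5)) / 2


def is_positive(a, b):
    """Exact test for a + b*tau > 0 with integers a, b."""
    p, q = 2 * a + b, b              # a + b*tau = (p + q*sqrt(5)) / 2
    if p >= 0 and q >= 0:
        return p > 0 or q > 0
    if p <= 0 and q <= 0:
        return False
    if p > 0:                        # here q < 0
        return p * p > 5 * q * q
    return 5 * q * q > p * p         # here p < 0 < q


def admissible(n, m):
    """Conditions of the proof: n + m*tau > 0 and 0 < n + m*tau' < tau**4."""
    star = (n + m, -m)                  # n + m*tau' written in the basis (1, tau)
    gap = (2 - star[0], 3 - star[1])    # tau**4 - (n + m*tau'), with tau**4 = 2 + 3*tau
    return is_positive(n, m) and is_positive(*star) and is_positive(*gap)


def minimal_squared_length(n_max=20):
    """Smallest admissible value n + m*tau over a finite search box."""
    best = None
    for n in range(-n_max, n_max + 1):
        for m in range(-2 * n_max, 2 * n_max + 1):
            if admissible(n, m):
                value = n + m * TAU
                if best is None or value < best[0]:
                    best = (value, n, m)
    return best
```

Within the search box the minimum is attained at (n, m) = (2, −1), that is at 2 − τ = 1/τ², in agreement with the proof; the pair (5, −3) is rejected because the inequality n + mτ′ < τ⁴ is strict.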

Since we know that the interatomic distances in physical quasicrystals are typically around 2.5 – 3 Angstroms [J94], this information gives some idea about the size of the acceptance region for this type of model.

Corollary 3.1. (i) The minimal distance between two points of Σ^{τ²/2} = {x ∈ ZΔ | x* ∈ B(0, τ²/2)} is 1/τ. (ii) The minimum distance between the points of Σ_k is 1/τ.

Proof. (ii) diam(Δ_k*) = 2. Also for α ∈ Δ_k, τα and α are in Σ_k and |τα − α| = 1/τ.


It is evident that the elementary procedures used here are quite suitable for determining the minimal distances for many types of specific quasicrystals based on Z[τ]. For more general results one would need to go much deeper into the arithmetic of Z[τ]. For more extensive information on the subject of minimal distances the reader may consult [MPP98/1].

4. Density of a Quasicrystal

Since a quasicrystal is a Delone set, it is reasonable that one might be able to speak of the density of its points. Indeed the existence and general formulas for densities have been established for cut and project sets in [H98, Sch98]. Here we derive a convenient formula for the density of our quasicrystals, based on the mathematics of the objects at hand.

Let Λ ⊂ R^k be a Delone set. For each r > 0 and u = (u_1, . . . , u_k) ∈ R^k let C(u, r) be the cube {x = (x_1, . . . , x_k) ∈ R^k | |x_i − u_i| ≤ r/2} of side length r around u. Let #_{u,r}Λ := card(Λ ∩ C(u, r)). Then the density of Λ is defined by

\[ d(\Lambda) := \lim_{r \to \infty} \frac{\#_{u,r}\Lambda}{\mathrm{vol}(C(u, r))}, \tag{4.1} \]

if this limit exists. It is easy to see that d(Λ) is independent of u. We recall that the norm mapping N : Q[τ] → Q is defined by N(x) := xx′.

Proposition 4.1. Suppose that L ⊂ R^k is a Z[τ]-lattice commensurate with M_k and

\[ \Lambda := \{ x \in L \mid x^* \in \Omega \} \tag{4.2} \]

is the quasicrystal in L defined by some bounded Riemann integrable region Ω ⊂ R^k. Then Λ has a well-defined density

\[ d(\Lambda) = \frac{\mathrm{vol}(\Omega)}{(\sqrt{5})^k \sqrt{N(\det L)}} . \tag{4.3} \]

Proof. We suppose first that Ω is a cube C(0, s) of side length s centered on 0 and that sτ < 1. Let {e_1, . . . , e_k} be a Z[τ]-basis for L and let {e_1*, . . . , e_k*} be the corresponding basis for L*. Then the number #_r Λ of points of Λ in the cube C(0, r) in R^k is the number of points of L̃ in the region

\[ R_r := \Bigl\{ \Bigl( \sum_i a_i e_i , \sum_i b_i e_i^* \Bigr) \Bigm| -\tfrac{r}{2} \le a_i \le \tfrac{r}{2},\ -\tfrac{s}{2} \le b_i \le \tfrac{s}{2} \Bigr\} \tag{4.4} \]

in R^{2k}. Now L̃ = {(x, x*) | x ∈ L} so it follows that

\[ R_r \cap \widetilde{L} = \Bigl\{ \Bigl( \sum_i a_i e_i , \sum_i a_i' e_i^* \Bigr) \Bigm| a_i \in \mathbb{Z}[\tau],\ -\tfrac{r}{2} \le a_i \le \tfrac{r}{2},\ -\tfrac{s}{2} \le a_i' \le \tfrac{s}{2} \Bigr\} . \tag{4.5} \]

For a = c + dτ ∈ Z[τ] we have

\[ -\frac{s}{2} \le a' \le \frac{s}{2} \iff -\frac{s\tau}{2} + d \le c\tau \le \frac{s\tau}{2} + d . \tag{4.6} \]

Thus as c ranges over Z we are required to determine when cτ mod Z lies in the interval [−sτ/2, sτ/2]. Since τ is irrational, according to Weyl's theorem on uniform distribution [W16, H98, Sch98], the values cτ mod Z fall in the interval [−1/2, 1/2] uniformly and for any N consecutive integers, the number of these integers m such that mτ mod Z lies in [−sτ/2, sτ/2] is Nsτ + s·o(N) as N → ∞. Here o(N) is independent of s.

Notice that for a = c + dτ ∈ Z[τ] to satisfy the equivalent conditions (4.6) it is necessary that d = [cτ] := the closest integer to cτ (since sτ < 1). The set of integers c such that c + [cτ]τ ∈ [−r/2, r/2] is a consecutive range of r/(1 + τ²) + o(r) integers of which rsτ/(1 + τ²) + s·o(r) satisfy (4.6). Thus

\[ \#_r\Lambda = \mathrm{card}(R_r \cap \widetilde{L}) = \Bigl( \frac{r s \tau}{1 + \tau^2} \Bigr)^k + r^{k-1} s^k\, o(r) = r^k s^k\, 5^{-k/2} + r^{k-1} s^k\, o(r) . \tag{4.7} \]

If C(0, s) is moved to any other center u = (u_1, . . . , u_k) ∈ R^k, the effect is simply to shift the interval [−sτ/2, sτ/2] by u_iτ, which does not affect the estimate (4.7). In particular the constant in o(r) can be taken to be independent of u and s.

Now allow the acceptance window Ω to be any bounded Riemann integrable region. By definition this means that the boundary of Ω has measure 0 and the volume of Ω can be determined as the limiting sum of volumes of small cubes. In terms of the coordinate basis {e_1*, . . . , e_k*} with determinant det(L*), vol(Ω) = √det(L*) ∫_Ω ds_1 · · · ds_k. In the same way the volume of the cube C(0, r) in R^k is √det(L) r^k. Partitioning Ω into small ε-cubes we obtain the limit

\[ \#_r\Lambda = \frac{\mathrm{vol}(\Omega)\, r^k}{\sqrt{\det(L^*)}} \Bigl( 5^{-k/2} + \frac{o(r)}{r} \Bigr) \tag{4.8} \]

and hence the density (4.3).
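The one-coordinate count behind (4.7) can be tested numerically. The following sketch is illustrative and rests on assumptions not made in the text: it takes the rank one case L = Z[τ] ⊂ R with window [−s/2, s/2], counts the points a = c + dτ with |a| ≤ r/2 and |a′| ≤ s/2, and compares the count per unit length with the predicted value s/√5. The function names are hypothetical.

```python
import math

TAU = (1 + math.sqrt(5)) / 2


def count_points(r, s):
    """Count a = c + d*tau in Z[tau] with |a| <= r/2 and |a'| <= s/2."""
    count = 0
    # a - a' = d*sqrt(5), so |d| <= (r + s) / (2*sqrt(5))
    d_max = int((r + s) / (2 * math.sqrt(5))) + 1
    for d in range(-d_max, d_max + 1):
        # a' = c - d/tau must lie in [-s/2, s/2]
        lo = math.ceil(d / TAU - s / 2)
        hi = math.floor(d / TAU + s / 2)
        for c in range(lo, hi + 1):
            if abs(c + d * TAU) <= r / 2:
                count += 1
    return count


def empirical_density(r, s):
    """Number of counted points per unit length of the interval [-r/2, r/2]."""
    return count_points(r, s) / r
```

For example, `empirical_density(20000, 0.5)` should be close to 0.5/√5 ≈ 0.2236, which is the k = 1 instance of the leading term r^k s^k 5^{−k/2} in (4.7) divided by r.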

Example. For L = M_4 and Ω the ball B(0, τ²/2) in R⁴ we have N(det(M_4)) = det(M_4)(det(M_4))′ = 1/16² [CMP98], and

\[ d(\Sigma^{\tau^2/2}) = \frac{16}{25} \cdot \frac{\pi^2}{2} \Bigl( \frac{\tau^2}{2} \Bigr)^4 = 9.273\ldots . \tag{4.9} \]

From Corollary 3.1 the minimum distance between the points of this quasicrystal Σ is 1/τ. Thus the packing density (i.e. the fraction of space occupied by spheres of radius 1/(2τ) packed around points of Σ) is

\[ \frac{\pi^2}{2} \Bigl( \frac{1}{2\tau} \Bigr)^4 \cdot d(\Sigma^{\tau^2/2}) = 0.41728\ldots . \tag{4.10} \]

The highest density known for R⁴ is based on the D4 root lattice and is 0.61685 . . . [CS88].

Remark. If L is a lattice then the discriminant of L is det(L), the fundamental parallelepiped has volume √det(L), and the density of lattice points of L in R^n is 1/√det(L). This suggests that we might define the discriminant disc(Λ) of a quasicrystal Λ by

\[ \mathrm{disc}(\Lambda) := \frac{1}{d(\Lambda)^2} . \tag{4.11} \]
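The numbers in (4.9) and (4.10) follow directly from formula (4.3) and the volume π²R⁴/2 of a 4-dimensional ball of radius R. A short sketch of the arithmetic (the function names are hypothetical, chosen only for this illustration):

```python
import math

TAU = (1 + math.sqrt(5)) / 2


def ball_volume_4d(radius):
    """Volume of a ball of the given radius in R^4."""
    return math.pi ** 2 * radius ** 4 / 2


def quasicrystal_density(window_volume, k, norm_det):
    """Formula (4.3): vol(Omega) / (sqrt(5)^k * sqrt(N(det L)))."""
    return window_volume / (math.sqrt(5) ** k * math.sqrt(norm_det))


# L = M_4 with N(det M_4) = 1/16^2, window = ball of radius tau^2 / 2
density = quasicrystal_density(ball_volume_4d(TAU ** 2 / 2), k=4, norm_det=1 / 16 ** 2)

# Spheres of radius 1/(2 tau) around each point (minimum distance 1/tau)
packing = ball_volume_4d(1 / (2 * TAU)) * density

# Packing density of the D4 lattice quoted from [CS88]: pi^2 / 16
d4_packing = math.pi ** 2 / 16
```

The three values agree with the figures 9.273, 0.41728 and 0.61685 quoted in the text to the digits shown there.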


5. Densities of Planar Subquasicrystals of 3-Dimensional Quasicrystals

In any quasicrystal Λ = {x ∈ L | x* ∈ Ω}, where L is commensurate with M_k, there are many subquasicrystals of lower dimension lying on affine subspaces of R^k determined by Z[τ]-submodules of L. As an application of the formula for densities we will show that for L = M_3 and for any acceptance window Ω that is 'close enough' to a sphere, there are special parallel families of planar subquasicrystals which are of the highest density. Furthermore they are oriented in a way that explicitly shows the implicit icosahedral symmetry. More precisely, there is a family 𝒫 of affine planes in R³ with the property that, for all P ∈ 𝒫, the intersection P ∩ Λ is a subquasicrystal of density d (depending only on Ω and 𝒫 but not on P) such that any other planar subquasicrystal of Λ has significantly lower density. These planes of maximal density are oriented at right angles to the 5-fold axes of the group H3. Thus even if Λ carries no global H3-symmetry, the orientation of the symmetries of H3 is canonically determined by the local symmetry of Λ.

To see why this is true consider any quasicrystal (4.2) with k = 3, where Ω is assumed for the present only to be bounded with non-empty interior. Let P_R be any affine plane in R³. In order that Λ ∩ P_R should support P_R, i.e. span it as an affine space, P := P_R ∩ M_3 must be a translate Q + a of a Z[τ]-module Q of M_3 of rank 2 where a ∈ M_3. Then

\[ \Lambda(\Omega) \cap P_R = \{ x \in Q \mid x^* \in (\Omega - a^*) \} + a \tag{5.1} \]

and hence, as far as densities are concerned, we can suppose that Q is P and a = 0. So we assume now that P is a Z[τ]-module and we are interested in the density of

\[ \Lambda_P = \{ x \in P \mid x^* \in \Omega \} . \tag{5.2} \]

Let P̃ be the lift of P to the lattice M̃_3 in R⁶. Then P̃ is a rank 4 even integral lattice inside the root lattice D6. We are interested in the density 1/√det(P̃) of this lattice. Indeed, det P̃ = 2⁴ N(det P) (Proposition 6.6 of [CMP98]) and the density of Λ_P is vol(Ω_P)/(5√N(det P)) by Proposition 4.1, where Ω_P is the cross-section of Ω by the plane of R³ defined by the points of P*.

Amongst the even lattices in R⁴ the densest are D4, A4, A3 + A1, A2 + A2, . . . with determinants 4, 5, 8, 9, . . . [CS88]. In order that a Z[τ]-module in M_3 have a lift in R⁶ that is a root lattice (generated by a root system Φ, say), it is necessary that there be a subroot system Φ_0 of Δ_3 for which Φ̃_0 ∪ (τΦ_0)~ = Φ [CMP98, Proposition 6.5] (see 2.4 for notation). Using the classification of rank 2 subroot systems of H3 [CMP98],

[Diagram: H3 and its rank 2 subroot systems of types A1 ∨ A1, H2, A2]

we see that D4 cannot arise from this construction and nor does A3 + A1. On the other hand A4 can arise and does so in exactly 6 ways. In fact by Proposition 6.5 of [CMP98] this root system lies in P̃ if and only if a copy of the noncrystallographic root system Δ_2 lies in our Δ_3. Now each subroot system of type Δ_2 in Δ_3 gives rise to a set of 4 rotations of order 5 in W(Δ_3) about one of the 5-fold axes of symmetry of Δ_3. These axes are the 6 translates of Rω_1 under W(Δ_3), where ω_1 is defined by 2ω_1 · α_i = δ_{1i} relative to a basis of simple roots {α_1, α_2, α_3} corresponding to the diagram

  ◦ --- ◦ -5- ◦
  α1    α2    α3

Any copy of Δ_2 is a W(Δ_3)-translate of the root system of type Δ_2 generated by {α_2, α_3} in Δ_3. Thus for planes P of R³ which are supported by Λ(Ω), the density of Λ(Ω) ∩ P is

\[ \frac{4\,\mathrm{vol}(\Omega_P)}{5\sqrt{5}} \tag{5.3} \]

if P is orthogonal to a 5-fold axis and at most

\[ \frac{4\,\mathrm{vol}(\Omega_P)}{5\sqrt{9}} \tag{5.4} \]

in any other case.

Any complete family F of parallel planes P in R³, each of which is supported by M_3*, is in fact dense in R³. Hence the supremum of the values vol(Ω_P), where P runs over each family F, is determined by the maximum plane cross-section of Ω by all planes in R³ in this orientation. The above argument shows that if, as we run through all orientations, this supremum varies by a ratio of less than 3/√5 then the planes orthogonal to the 5-fold axes are unequivocally the densest. More explicitly, we have the following proposition:

Proposition 5.1. Let Λ(Ω) = {x ∈ M_3 | x* ∈ Ω} be a quasicrystal where Ω is a bounded Riemann integrable region with non-empty interior. For each vector u on the unit sphere S let F(u) be the complete family of planes in R³ orthogonal to u. Let

\[ f(u) = \sup\{ \mathrm{vol}(\Omega \cap P) \mid P \in F(u) \} \tag{5.5} \]

and suppose that

\[ \sigma := \sup_{u \in S} f(u) \Big/ \inf_{u \in S} f(u) < \frac{3}{\sqrt{5}} . \tag{5.6} \]

Then the planar subquasicrystals Λ(Ω ∩ P) determined by the planes P orthogonal to 5-fold axes u that are supported by their intersections with Λ have density d = 4f(u)/(5√5). The maximum density of any other planar subquasicrystal is at most dσ√5/3.

6. Coverings

For x ∈ R^k let B(x, r) be the ball of radius r about x. The covering radius of a Delone set X is

\[ \inf\{ r \in \mathbb{R}_{>0} \mid \text{for all } y \in \mathbb{R}^k,\ B(y, r) \cap X \ne \emptyset \} \]

or equivalently

\[ \inf\{ r \in \mathbb{R}_{>0} \mid \cup_{x \in X} B(x, r) = \mathbb{R}^k \} . \]


The covering radius of a lattice is an important invariant. Unfortunately, in general, its determination is quite a difficult task that is closely related to determining the structure of its Voronoi region. The situation with quasicrystals is, of course, more difficult still. In this section we give two results about covering radii for the canonical quasicrystals Σ_k. The first one is an exact result about Σ_2; the second is an upper bound for the covering radius of Σ_4. In the case of Σ_4 the primary result is a covering theorem using scaled root polytopes rather than spheres and is interesting because of its connection with the lattice E8. It would be very nice to know the exact covering radius for each of the three canonical quasicrystals.

Proposition 6.1. The covering radius of Σ_2 := {x ∈ M_2 | x* ∈ ⟨Δ_2*⟩} is τ/√(τ + 2).

Proof. Let x ∈ Σ_2. We claim that there exist roots α, β_1, β_2 ∈ Δ_2 with α · β_i = τ′/2, i = 1, 2, β_1 · β_2 = −τ/2 and x + {0, α, β_1, β_2} ⊂ Σ_2. To see this we can assume that x* lies in the fundamental domain F = {y | y · α_i* ≥ 0, i = 1, 2} of the Weyl group W(Δ_2*), where {α_1*, α_2*} is a base of Δ_2*. Let −α* ∈ F be the highest root and let −β_1*, −β_2* be the two roots closest to −α*. Then F + {0, α*, β_1*, β_2*} ⊂ ⟨Δ_2*⟩ and hence x + {0, α, β_1, β_2} ⊂ Σ_2 (see Fig. 6.1).

Fig. 1. The relative positions of −α*, −β_1*, −β_2*, and F used in the proof of Proposition 6.1

Fig. 2. The Voronoi region used in the proof of Proposition 6.1 (the cell of x with respect to x + α, x + β_1, x + β_2 and x − τα, with vertices p and q)

Now let y ∈ R² be arbitrary and let x be an element of Σ_2 that is closest to y. Find roots α, β_1, β_2 as above, so that x + {0, α, β_1, β_2} ⊂ Σ_2. Then x + α, x + β_1, x + β_2 are vertices of a triangle of which x is an interior point. Using τ-inflation around x applied to x + α we also obtain x − τα ∈ Σ_2 ([BM94] and Proposition 1 of [P97]). The assumption on x is that y lies in the Voronoi cell (Fig. 6.2) of x with respect to the point set {x, x + α, x + β_1, x + β_2, x − τα}. Straightforward calculation shows that

\[ |p - x| = |q - x| = \frac{\tau}{\sqrt{\tau + 2}} . \]

On the other hand Σ_2 contains empty pentagons with side length 1 and these have circumradius τ/√(τ + 2), showing that this is the covering radius.
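The last step of the proof uses the circumradius of a pentagon of side length 1. Assuming the empty pentagons are regular (which the five-fold symmetry suggests, though the text does not say so explicitly), the standard formula side/(2 sin(π/5)) can be compared with τ/√(τ + 2). The function name below is hypothetical.

```python
import math

TAU = (1 + math.sqrt(5)) / 2


def regular_polygon_circumradius(sides, side_length):
    """Circumradius of a regular polygon: side / (2 sin(pi / sides))."""
    return side_length / (2 * math.sin(math.pi / sides))


claimed_radius = TAU / math.sqrt(TAU + 2)
pentagon_radius = regular_polygon_circumradius(5, 1.0)
```

Both expressions evaluate to approximately 0.8507, so under the regularity assumption the lower bound from the empty pentagons matches the upper bound |p − x| = |q − x| obtained from the Voronoi cell.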


Let Δ_4 be a root system of type H4 in R⁴. Let M = ZΔ_4 and let ∗ : M_4 → R⁴ be any star map. Let Σ_4 := {x ∈ M | x* ∈ ⟨Δ_4*⟩} be the corresponding canonical quasicrystal.

Proposition 6.2. For all x ∈ R⁴,

\[ (x + \langle \tau \Delta_4 \rangle) \cap \Sigma_4 \ne \emptyset . \]

In particular the covering radius of Σ_4 is at most τ.

In view of Proposition 6.2 of [CMP98] it will suffice to prove this for any particular model of the H4 quasicrystal. We choose the standard model of Δ_4 given in Sect. 2. We begin by proving a covering result for the E8 root lattice, assuming that root lengths are normalized to 2.

Proposition 6.3. Let Q be the root lattice of E8, let Φ_8 be the root system of type E8, and let ⟨Φ_8⟩ denote its convex hull. Then

(i)
\[ \bigcup_{\alpha \in Q} (\alpha + \langle \Phi_8 \rangle) = \mathbb{R}^8 . \]

(ii) For all c with 0 < c < 1,
\[ \bigcup_{\alpha \in Q} (\alpha + c \langle \Phi_8 \rangle) \subsetneq \mathbb{R}^8 . \]

Proof. The vertices of ⟨Φ_8⟩ are the roots of E8. Let ξ be the highest root, so Φ_8 = Wξ, where W is the Weyl group of E8. Then

[Diagram: the E8 Coxeter–Dynkin diagram with nodes numbered 1, . . . , 8 and one marked node]

is the Wythoff diagram for ⟨Φ_8⟩ (since ξ is on all reflecting hyperplanes except that of r_1). For more on Wythoff polytopes see [C63, MP92]. Then ⟨Φ_8⟩ has two kinds of faces

[Diagrams: the two subdiagrams of the diagram above on the nodes 1, . . . , 7 and on the nodes 1, . . . , 6, 8]

The centers of these faces are stabilized by ⟨r_1, r_2, . . . , r_7⟩ and ⟨r_1, . . . , r_6, r_8⟩ respectively and hence are c_8ω_8 and c_7ω_7 for some positive real numbers c_8 and c_7. Now

\[ c_7 \omega_7 = \frac{\omega_7 \cdot \xi}{|\omega_7|} \, \frac{\omega_7}{|\omega_7|} \]

and so |c_7ω_7| = |ω_7 · ξ|/|ω_7|. We have the marks of E8 and thus ω_7 · ξ = 2 and ω_8 · ξ = 3. From the inverse Cartan matrix of E8 we also have |ω_7| = 2 and |ω_8| = 2√2. Hence |c_7ω_7| = 1 and similarly |c_8ω_8| = 3/(2√2).


Consider the holes of the lattice Q around 0. These are the vertices of the Voronoi cell around 0 and are the W-orbits of (1/2)ω_7 and (1/3)ω_8 as we see from the diagram of the Voronoi cell [CS88, MP92]. They are of lengths 2/2 = 1 and 2√2/3 respectively. Thus we see that these two holes are in ⟨Φ_8⟩, though (1/2)ω_7 is actually on the boundary. Consequently the entire orbit of holes around 0 is inside ⟨Φ_8⟩, i.e. the Voronoi region neighbouring 0 lies in ⟨Φ_8⟩. Thus

\[ \bigcup_{\alpha \in Q} (\alpha + \langle \Phi_8 \rangle) = \mathbb{R}^8 \]

but if 0 < c < 1 then

\[ \bigcup_{\alpha \in Q} (\alpha + c \langle \Phi_8 \rangle) \subsetneq \mathbb{R}^8 . \]
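The comparison made in the proof of Proposition 6.3 is simple arithmetic once the quoted values ω_7 · ξ = 2, ω_8 · ξ = 3, |ω_7| = 2 and |ω_8| = 2√2 are accepted. The sketch below only re-does that arithmetic with the numbers taken from the text; the variable names are hypothetical and nothing is recomputed from the Cartan matrix.

```python
import math

# Values quoted in the proof (roots normalized to squared length 2).
omega7_dot_xi, omega8_dot_xi = 2, 3
len_omega7, len_omega8 = 2.0, 2 * math.sqrt(2)

# Distances from 0 to the centers of the two kinds of faces of the hull.
face7 = omega7_dot_xi / len_omega7        # |c7 omega7|
face8 = omega8_dot_xi / len_omega8        # |c8 omega8|

# Lengths of the two kinds of holes, (1/2) omega7 and (1/3) omega8.
hole7 = len_omega7 / 2
hole8 = len_omega8 / 3

# Largest scaling c for which each hole still lies within c times the hull.
critical_scale = max(hole7 / face7, hole8 / face8)
```

Here `hole7 == face7`, so the hole (1/2)ω_7 sits exactly on the boundary, while `hole8 < face8`. The resulting `critical_scale` equals 1, which is the numerical content of parts (i) and (ii).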

Now we return to the proof of the covering proposition for Σ_4.

Proof of Proposition 6.2. Using Proposition 6.4 of [CMP98] write Φ_8 = Δ̃_4 ∪ (τΔ_4)~, where

\[ \widetilde{\Delta_4} = \{ (x, x^*) \mid x \in \Delta_4 \} \subset \mathbb{R}^4 \times \mathbb{R}^4 \simeq \mathbb{R}^8 . \]

Let π_∥ and π_⊥ be the two (scaled) projections from R⁸ → R⁴ determined by this decomposition:

\[ \pi_\parallel : (x, y) \mapsto x, \qquad \pi_\perp : (x, y) \mapsto y . \]

We have

\[ \pi_\parallel \langle \Phi_8 \rangle = \langle \tau \Delta \rangle, \qquad \pi_\perp \langle \Phi_8 \rangle = \langle \Delta^* \rangle . \]

Let z ∈ R⁴ and consider the point (z, 0) ∈ R⁸. Then ((z, 0) + ⟨Φ_8⟩) ∩ Q ≠ ∅. Let q be in the intersection. Thus π_∥(q) ∈ z + ⟨τΔ⟩ and π_⊥(q) ∈ 0 + ⟨Δ*⟩. Thus π_∥(q) ∈ Σ and we are done.

Acknowledgement. We are grateful for the partial support by the Natural Sciences and Engineering Research Council of Canada and by FCAR of Quebec, and for the hospitality of the Fields Institute for Mathematical Research where most of the work was done and to the Aspen Center for Physics where this work was completed. The authors also appreciate the comments of the referee.

References

[BM94] Berman, S. and Moody, R.V.: The algebraic theory of quasicrystals with five-fold symmetry. J. Phys. A: Math. Gen. 27, 115–130 (1994)
[CMP98] Chen, L., Moody, R.V. and Patera, J.: Non-crystallographic root systems. In: Quasicrystals and Discrete Geometry, Fields Institute Monograph Series, Vol. 10, ed. J. Patera, Providence, RI: AMS, 1998, pp. 135–178
[CKPS95] Champagne, B., Kjiri, M., Patera, J., and Sharp, R.T.: Description of reflection generated polytopes using decorated Coxeter diagrams. Can. J. Phys. 73, 566–584 (1995)
[CS88] Conway, J. and Sloane, N.J.A.: Sphere Packings, Lattices and Groups. Berlin–Heidelberg–New York: Springer-Verlag, 1988
[C63] Coxeter, H.S.M.: Regular Polytopes. New York: Macmillan, 1963
[D82] Deodhar, V.V.: On the root system of a Coxeter group. Comm. Alg. 10, 611–630 (1982)
[H98] Hof, A.: Uniform distribution and the projection method. In: Quasicrystals and Discrete Geometry, Fields Institute Monograph Series, Vol. 10, ed. J. Patera, Providence, RI: AMS, 1998
[H90] Humphreys, J.E.: Reflection groups and Coxeter groups. Cambridge: Cambridge Univ. Press, 1990
[J94] Janot, C.: Quasicrystals: A primer. Oxford, UK: Oxford Univ. Press, Second Edition, 1994
[MPP98/1] Masáková, Z., Patera, J., Pelantová, E.: Minimal distances in quasicrystals. J. Phys. A: Math. Gen. 31, 1539–1552 (1998)
[MPP98/2] Masáková, Z., Patera, J., Pelantová, E.: Selfsimilar Delone sets. J. Phys. A: Math. Gen. 31, 4927–4946 (1998)
[MP92] Moody, R.V. and Patera, J.: Voronoi and Delone cells of root lattices: Classification of their faces and facets by Coxeter-Dynkin diagram. J. Phys. A: Math. Gen. 25, 5089–5134 (1992)
[MP93] Moody, R.V. and Patera, J.: Quasicrystals and icosians. J. Phys. A: Math. Gen. 26, 2829–2853 (1993)
[P97] Patera, J.: Noncrystallographic root systems and quasicrystals. In: Mathematics of Long Range Aperiodic Order, Proc. NATO ASI, Waterloo, 1995, ed. R. V. Moody, Dordrecht: Kluwer, 1997
[P74] Penrose, R.: Bull. Inst. Math. Appl. 10, 266 (1974)
[SBC84] Shechtman, D., Blech, I. and Cahn, J.W.: Phys. Rev. Lett. 53, 1951 (1984)
[Sch98] Schlottmann, M.: Cut and project sets in locally compact Abelian groups. In: Quasicrystals and Discrete Geometry, Fields Institute Monograph Series, Vol. 10, ed. J. Patera, Providence, RI: AMS, 1998
[S95] Senechal, M.: Quasicrystals and geometry. Cambridge, UK: Cambridge Univ. Press, 1995
[W16] Weyl, H.: Über die Gleichverteilung von Zahlen mod. Eins. Math. Ann. 77, 313–352 (1916)

Communicated by A. Jaffe

Commun. Math. Phys. 195, 627 – 642 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

A Phase Transition for Hyperbolic Branching Processes*

F. I. Karpelevich1,2, E. A. Pechersky2, Yu. M. Suhov2,3

1 Moscow Transport University (MIIT), The Russian Ministry of Railways, Moscow 101475, Russia
2 The Dobrushin Mathematical Laboratory, Institute for Problems of Information Transmission, The Russian Academy of Sciences, GSP-4 Moscow 101447, Russia
3 Statistical Laboratory, DPMMS, University of Cambridge, Cambridge CB2 1SB, England, UK

Received: 22 September 1997 / Accepted: 19 December 1997

Abstract: Consider a time- and space-homogeneous random branching Markov process on a d-dimensional Lobachevsky space H^d. Its asymptotic behaviour can be described in terms of the Hausdorff dimension of the (random) set Λ of the accumulation points (on the absolute ∂H^d). The simplest and most well-known example is the Laplace–Beltrami branching diffusion; in the case d = 2 the Hausdorff dimension of Λ was calculated in [LS]. In this paper we extend the formula for the Hausdorff dimension to d ≥ 3 and a larger class of branching processes. It turns out that the Hausdorff dimension of Λ takes either a value from (0, (d − 1)/2) or equals d − 1, the Euclidean dimension of ∂H^d, which gives an interesting example of a "geometric" phase transition.

* This work was supported in part by the Russian Foundation for Fundamental Research (Grants 96-01-00150 and 97-01-00747); the Russian Ministry of Railways Fund "NIOKR"; The London Mathematical Society; St John's College, Cambridge; Dublin Institute for Advanced Studies; I.H.E.S., Bures-sur-Yvette; the EC Grant "Training Mobility and Research" (Contracts CHRX-CT 930411 and ERBMRXT-CT 960075A) and the INTAS Grant "Mathematical Methods for Stochastic Discrete Event Systems" (INTAS 93-820).

1. The Main Result

Introduction. This paper deals with the Hausdorff dimension of the limiting set Λ of a homogeneous branching Markov process on the d-dimensional hyperbolic, or Lobachevsky, space H^d. The problem was put forward by Lalley and Sellke, and we refer the reader to the introduction to [LS], where it is discussed in the case of a homogeneous branching diffusion on a Lobachevsky plane H². (In this model, the generator of the individual Markov process is one half of the standard Laplace–Beltrami operator.) The main result of Lalley and Sellke is that the Hausdorff dimension of Λ (which is a subset of the absolute ∂H²) for the branching diffusion with the offspring number two and fission rate λ equals (1 − √(1 − 8λ))/2 if λ ≤ 1/8 and 1 if λ > 1/8. A phase transition discovered in [LS] and manifested in the discontinuity of the Hausdorff dimension at λ = 1/8 caused some consternation and a desire to check whether it is a specific feature of the branching diffusion. Besides, a crucial fact, Proposition 4 from [LS], was proved by a method that seems to work only in dimension two, leaving open the question of extending the results to higher dimensions. These were the main incentives for looking further at the matter.

In this paper, we generalise the Lalley–Sellke paper in two directions. One is a general class of branching processes, both with discrete and continuous time. The other is an arbitrary dimension d ≥ 2 of the Lobachevsky space under consideration. Within such a setting, we confirm the picture discovered in [LS]: the Hausdorff dimension of the limiting set Λ (which again lies on the absolute) changes continuously from 0 to (d − 1)/2 and then jumps to d − 1. See Theorem 1.1 below. The fact that the interval from (d − 1)/2 to d − 1 is not available is conjectured in a more general situation where the Lobachevsky space is replaced by an arbitrary symmetric space of rank one, but at the moment this remains open. Another interesting development could be a generalisation of these results to an inhomogeneous case. For example, the moving particles can walk in a random environment; we hope that methods of papers [KKS] and [KS1], [KS2] may be of some use here.

Description of the results. We work with the so-called projective model of the d-dimensional Lobachevsky space H^d which is defined as the interior of the unit ball in the Euclidean space R^{d+1}; the group G of the motions is the connected component, containing the identity, of the group of the real unimodular (d + 1) × (d + 1) matrices preserving Minkowski's bilinear form, with a projective action. The absolute ∂H^d is the unit sphere and the geodesics are the straight chords of the ball. See Sect. 3 for the formal summary of basic facts.
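The Lalley–Sellke formula quoted in the Introduction, and the jump it exhibits at λ = 1/8, can be written out as a small function. This is only a transcription of the d = 2 formula stated above for the branching diffusion with offspring number two; the function name is hypothetical.

```python
import math


def lalley_sellke_dimension(fission_rate):
    """Hausdorff dimension of the limit set for d = 2, as quoted from [LS]."""
    if fission_rate <= 0:
        raise ValueError("the fission rate must be positive")
    if fission_rate <= 1 / 8:
        return (1 - math.sqrt(1 - 8 * fission_rate)) / 2
    return 1.0
```

At λ = 1/8 the function returns 1/2 = (d − 1)/2, while for any λ slightly above 1/8 it returns 1 = d − 1, so no value strictly between 1/2 and 1 is taken.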
A homogeneous branching process on H^d is determined by a) an individual homogeneous Markov process θ on H^d, with discrete or continuous time, and b) the branching mechanism, i.e., a (possibly random) offspring number and, in the continuous-time case, a fission rate (i.e., the rate of the exponential time between subsequent divisions). That is, a discrete-time branching process is specified by fixing a) a transition probability measure {Q(x, ·), x ∈ H^d}, invariant under group G, and b) a probability distribution {µ_k, k = 0, 1, . . .} of the number of offspring produced in a single division. In the continuous-time case, a branching process is specified by fixing a) a family of transition probability measures {Q_t(x, ·), t ≥ 0, x ∈ H^d} invariant under group G, b1) fission rate λ > 0 and b2) probability distribution {µ_k}. In both cases we assume that µ_0, the probability of the zero offspring number, is 0 and the mean offspring number κ = Σ_k kµ_k is > 1. We also impose technical conditions on θ (see, e.g., (4.1) and (4.5)). Some of the above conditions can be weakened or even removed at the expense of the length of the exposition.

An important role in this paper is played by the Laplace transform Φ_ho(p), p ∈ R, of the jump distribution of the so-called horospheric projection θ_ho of process θ (in the continuous-time case we mean the distribution of the displacement at the end of the lifetime of θ_ho). See Lemma 4.1. It turns out (see Lemma 4.2) that function Φ_ho is symmetric about point p = (d − 1)/2 and its minimum is attained at this point. In the case when Φ_ho((d − 1)/2) < 1/κ the branching process is called subcritical, when Φ_ho((d − 1)/2) > 1/κ supercritical and when the equality occurs critical. Furthermore, in the subcritical case there are precisely two solutions, p_0 and p_1, to the equation Φ_ho(p) = 1/κ, 0 < p_0 < (d − 1)/2 < p_1.
We show that with probability 1 in the sub- and critical case the process eventually leaves any compact C ⊂ H^d, while in the supercritical case it continues visiting any horoball; in [LS] a similar fact was proved for the homogeneous branching diffusion on H². Following the opinions of the authors of [LS], one can consider the main result of this paper to be the following

Theorem 1.1. The Hausdorff dimension of the limiting set Λ for a homogeneous branching process on H^d is equal: (1) in the subcritical case to p_0 ∈ (0, (d − 1)/2), and (2) in the supercritical case to d − 1, the dimension of the absolute ∂H^d.

Pictorially, if one fixes a homogeneous Markov process θ and increases κ (and/or λ, in the continuous-time setting) from 0 to ∞, the Hausdorff dimension of Λ increases continuously from 0 to (d − 1)/2 and then jumps to d − 1. In the critical case, the Hausdorff dimension is conjectured to equal (d − 1)/2; in this paper such a result is derived only for the branching diffusion (which extends the result from [LS] to the case of space H^d, d ≥ 2). See Remark 5.11.

2. Branching Processes on a Line

Discrete-time processes. Let ξ = (ξ_n, n ≥ 0) be a Markov process on R; at this point it is not supposed to be space-homogeneous. Denote by Q(y, ·) the transition probability measure of ξ, y ∈ R, and assume that Q(y, {y′ ∈ R : y′ > y}) > 0 and ∃ r_0 > 0 such that

\[ Q\bigl(y, \{ y' \in \mathbb{R} : |y - y'| < r_0 \}\bigr) = 1, \qquad y \in \mathbb{R} . \tag{2.1} \]

Let Ξ = (Ξ_n, n ≥ 0) be the branching process generated by ξ and offspring number probabilities {µ_k, k ≥ 0}. The distribution of Ξ, with the single-particle initial position at x, is denoted by P_x, and the expectation with respect to Q(y, ·) by E_y. As before, we assume that µ_0 = 0 and κ > 1. Set M_n = max(y : y ∈ supp Ξ_n) and

\[ M = \sup_n M_n, \qquad F(y, b) = P_y(M \le b), \qquad y, b \in \mathbb{R},\ y < b . \tag{2.2} \]

Lemma 2.1. Let f > 0 be a non-decreasing function on R such that, for a given b ∈ R,

\[ \kappa E_y f = f(y) \quad \text{for } y \le b . \tag{2.3} \]

Then

\[ F(y, b) \ge 1 - f(y)/f(b), \qquad y \le b . \]

Proof. It is simple to check that F(·, b) is a maximal solution to the equation u = Lu, where (Lu)(y) = 1(y < b)E_y Z(u), in the class of functions u : (−∞, b + r_0) → [0, 1] such that u(y) = 0 for b ≤ y < b + r_0. The proof is based on the fact that the operator L preserves the pointwise inequality between functions: (Lu_1)(y) ≤ (Lu_2)(y), if u_1(y) ≤ u_2(y), y < b. Cf. [KS1], Theorem 3.1. Set u_0(y) = max(0, 1 − f(y)/f(b)), y < b + r_0; then u_0 belongs to the above class. Furthermore, by using the bound Z(z) ≥ κ(z − 1) + 1, |z| ≤ 1, it is easy to check that Lu_0(y) ≥ u_0(y), y < b ([KS1], Theorem 4.1). The assertion of Lemma 2.1 then follows from the above property of monotonicity of operator L.

Corollary 2.2. Process Ξ spends a finite time on each half-line (b, ∞) if ∃ a non-decreasing function f > 0 on R satisfying (2.3) ∀ y ∈ R such that lim_{x→∞} f(x) = ∞. [Here, "if" can be replaced by "iff"; see [KS1], Theorem 1.]


Homogeneous discrete-time processes. In the homogeneous case probabilities Q(y, A + y) do not depend on y. Set Q(0, A) = q(A) and call measure q the jump distribution of ξ; set G(y) = q((−∞, y)), y ∈ R. Denote:

\[ \Phi(p) = \int q(dy)\, e^{py} . \tag{2.4} \]

Function Φ is continuous and convex, and Φ(0) = 1. Furthermore, the infimum inf_{p≥0} Φ(p) is attained (this is a corollary of the Lévy Theorem on the limit of the integral for a monotone sequence of non-negative functions). It is also easy to check (see [KS1], Sect. 1.4) that the conditions of Corollary 2.2 are equivalent to the inequality

\[ \min_{p > 0} \Phi(p) \le 1/\kappa . \tag{2.5} \]

Under this inequality, equation 8(p) = 1/κ has at least one non-negative solution; as in Sect. 1, we denote by p0 the smallest and by p1 the largest of them. Then function f figuring in Corollary 2.2 can be f : x 7→ ep0 x . By virtue of (2.1), (2.5), it is easy to see that Z y −p0 y ep0 z q(dz), y < 0, (2.6) G(y) ≤ c0 e −∞

p 0 r0

where c0 = e . Following [LS], we consider a “modified” process 4+ = (4+n , n ≥ 0), arising under “absorbing” conditions on the half-axis (−∞, 0]. Pictorially, the particles participating in process 4+ are those from 4 whose ancestors never visited (−∞, 0]; for completeness we assume that the particles from 4 hitting (−∞, 0] are frozen at their positions, no longer producing descendants. Denote by m+ (x) the expected number of particles frozen at any time n ≥ 0 calculated under Px . Lemma 2.3. Assume that (2.5) is fulfilled. Then m+ (x) ≤ c0 ep0 x , x ∈ (0, ∞),

(2.7)

where $c_0$ is the constant from (2.6).

Proof. As in the proof of Lemma 2.1, it is easy to see that $m^+$ gives a minimal solution to the equation $v = L^+ v$, where $L^+ v(y) = \mathbf{1}(y \le 0) + \mathbf{1}(y > 0)\,\kappa E_y(v)$, in the class of non-negative functions $u$ on $\mathbb{R}$ such that $u(y) = 1$, $y \le 0$. The proof is again based on the fact that the operator $L^+$ preserves the pointwise inequality between functions. Consider the following function $v_0$ from the above class: $v_0(y) = \mathbf{1}(y \le 0) + \mathbf{1}(y > 0)\, c_0 e^{p_0 y}$, and compare $v_0(y)$ and $L^+ v_0(y)$. For $y \le 0$, both are equal. For $y > 0$ it is easy to check, by using the equality $\Phi(p_0) = 1/\kappa$, that
$$L^+ v_0(y) - v_0(y) = \kappa \left[ G(-y) - c_0 e^{p_0 y} \int_{-\infty}^{-y} e^{p_0 z}\, q(dz) \right].$$
Condition (2.6) guarantees that the RHS of this equality is non-positive, i.e., $L^+ v_0 \le v_0$. This implies that the sequence $v_i = (L^+)^i v_0$ decreases pointwise as $i$ grows; the limiting function $v = \lim_{i \to \infty} v_i$ is $\le v_0$ and gives a solution to the equation $v = L^+ v$ in the above class. Thus, $m^+ \le v_0$.
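To make condition (2.5) and the roots $p_0$, $p_1$ concrete, here is a minimal Python sketch for a lattice random walk. The jump distribution (step $-1$ with probability 0.8, step $+1$ with probability 0.2) and the value $\kappa = 1.2$ are illustrative assumptions, not taken from the paper; the helper names are hypothetical.

```python
import math

# Assumed example data (not from the paper): jumps on the integer lattice.
JUMPS = {-1: 0.8, 1: 0.2}   # q({-1}) = 0.8, q({+1}) = 0.2
KAPPA = 1.2                 # mean offspring number


def phi(p):
    """Phi(p) = sum_y q({y}) e^{p y}, the lattice form of (2.4)."""
    return sum(prob * math.exp(p * y) for y, prob in JUMPS.items())


def bisect(f, lo, hi, tol=1e-13):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def excess(p):
    """Phi(p) - 1/kappa; its zeros are the roots p0 and p1."""
    return phi(p) - 1.0 / KAPPA


# Phi is convex; for this q the minimiser solves 0.2 e^{p} = 0.8 e^{-p}.
p_min = 0.5 * math.log(0.8 / 0.2)
condition_2_5 = phi(p_min) <= 1.0 / KAPPA

p0 = bisect(excess, 0.0, p_min)   # smallest root
p1 = bisect(excess, p_min, 5.0)   # largest root


def recursion_residual(j, p):
    """m_j - kappa * sum_r a_r m_{j+r} for the trial solution m_j = e^{p j}."""
    return math.exp(p * j) - KAPPA * sum(
        a * math.exp(p * (j + r)) for r, a in JUMPS.items()
    )
```

For these assumed values the roots are $p_0 = \log(3/2)$ and $p_1 = \log(8/3)$, and `recursion_residual(j, p0)` vanishes up to rounding, in line with the comparison function $e^{p_0 y}$ used in the proof.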


Lemma 2.4. Assume that (2.5) is fulfilled. Then
$$\lim_{x \to \infty} \frac{1}{x} \log m^+(x) = p_0. \tag{2.8}$$

Proof. In view of Lemma 2.3, it suffices to check that $\liminf_{x \to \infty} (1/x) \log m^+(x) \ge p_0$. Consider an auxiliary homogeneous Markov process $\tilde\xi$ starting at $x$ whose jumps are confined to the set $\{-\infty\} \cup \{r\epsilon : r = -N_-, \ldots, N_+\}$, where $N_\pm > 0$ are integers and $\epsilon > 0$. [An agreement is that when $\tilde\xi$ reaches the state $-\infty$ it remains there forever.] We assume that the jump distribution $\tilde q$ of process $\tilde\xi$ is stochastically majorized by $q$, the jump distribution of $\xi$, and $\tilde q(\{-\infty\}) < 1/\kappa$. Let $\tilde\Xi$ and $\tilde\Xi^+$ be the branching processes generated by $\tilde\xi$ in the same fashion as above (adding state $-\infty$ to the absorbing half-axis $(-\infty, 0]$). Then, $\forall x \in \mathbb{R}$, $\tilde m^+(x)$, the expected number of particles frozen (at any time $n \ge 0$) in $\tilde\Xi^+$, is $\le m^+(x)$, the corresponding expected number for the original process. In fact, after a natural identification of particles in processes $\tilde\Xi$ and $\Xi$ before they freeze, the particles in $\tilde\Xi$ lie stochastically “more leftwards” than their counterparts in $\Xi$. Thus, the particles in $\tilde\Xi$ freeze stochastically not later than in $\Xi$. Consequently, the number of frozen particles in $\tilde\Xi$ is stochastically not more than that in $\Xi$.

Our aim is to show that $\lim_{x \to \infty} (1/x) \log \tilde m^+(x) = \tilde p_0$, where $\tilde p_0$ is the least positive root of the equation $\tilde\Phi(p) = 1/\kappa$ and $\tilde\Phi(p) = \int \tilde q(dx)\, e^{px}$ (cf. (2.4)). Clearly, $\tilde p_0 \le p_0$ and by choosing appropriate $\epsilon$ and $N_\pm$, $\tilde p_0$ may be made arbitrarily close to $p_0$, so the assertion of the lemma will follow.

Plainly, $\tilde m^+(x)$ is constant on each interval $((j - 1)\epsilon, j\epsilon]$, $j \in \mathbb{Z}$. It is convenient to set $m_j = \tilde m^+(x)$, $x \in ((j - 1)\epsilon, j\epsilon]$. Denote by $a_r$ the probability of jump $r\epsilon$ in process $\tilde\xi$: $a_r = \tilde q(\{r\epsilon\})$, $r = -N_-, \ldots, N_+$. The equation $\tilde m^+ = \tilde L^+ \tilde m^+$ (cf. the proof of Lemma 2.3) takes the form
$$m_j = \begin{cases} 1, & \text{if } j \le 0, \\ \kappa \sum_{-N_- \le r \le N_+} a_r m_{j+r}, & \text{if } j > 0. \end{cases} \tag{2.9}$$
A general solution to (2.9) is $m_j = \sum_{-N_- \le r \le N_+} c_r z_r^j$, $j > 0$, where $c_r$ are arbitrary constants and $z_r$ are the roots of the characteristic equation $1/\kappa = \sum_{r=-N_-}^{N_+} a_r z^r$. As $\tilde m^+ > 0$, the maximal (in terms of absolute value) root $z_{\max}$ taking part in the above representation must be positive, as well as the corresponding coefficient $c_{\max}$. It is easy to see that $z_{\max} = e^{\epsilon \tilde p_0}$, where $\tilde p_0$ was specified before. Thus, $m_j/(z_{\max})^j \to c_{\max} > 0$. The equality $\lim_{x \to \infty} (1/x) \log \tilde m^+(x) = \tilde p_0$ now follows.

Continuous-time processes. In this section we establish assertions analogous to Lemma 2.1 for continuous-time b.p.'s. We will assume that the (in general, not space-homogeneous) individual Markov process $\xi = (\xi_t, t \ge 0)$ is strong Markov; its transition probabilities at point $y \in \mathbb{R}$ are denoted by $Q_t(y, \cdot)$. We again assume that $Q_t(y, \{y' \in \mathbb{R} : y' > y\}) > 0$. Fix a fission rate $\lambda > 0$ and offspring number probabilities $\mu_k$ with $\mu_0 = 0$ and mean $\kappa > 1$. As before, denote by $\Xi = (\Xi_t, t \ge 0)$ the branching process generated by $\xi$ and $(\lambda, \{\mu_k\})$. The distribution of process $\Xi$, with a single particle placed at $x$ at time 0, is again denoted by $P_x$. It is convenient to introduce Green's measure $G(y, A) = \lambda \int_0^\infty dt\, e^{-\lambda t} Q_t(y, A)$; denote by $E_y$ the expectation


wrt $G(y, \cdot)$. Denote by $N(t)$ the number of particles in $\Xi$ at time $t$ and by $X_i(t)$ their positions, $i = 1, \ldots, N(t)$. Set $M_t = \max_{1 \le i \le N(t)} X_i(t)$ and (cf. (2.2))
$$M = \sup_{t \ge 0} M_t, \quad F(y, b) = P_y(M < b), \quad y, b \in \mathbb{R},\ y \le b. \tag{2.10}$$

Lemma 2.5. Let $f > 0$ be a non-decreasing function on $\mathbb{R}$ such that, for a given $b \in \mathbb{R}$,
$$\kappa E_y(f) = f(y), \quad y \le b. \tag{2.11}$$
Then
$$F(y, b) \ge 1 - f(y)/f(b), \quad y \le b. \tag{2.12}$$

The proof of Lemma 2.5 is similar to that of Lemma 2.1 (see also [KS2], Theorem 2.3), and we therefore omit it. Corollary 2.2 holds by replacing (2.3) with (2.12) (the “iff” cases are discussed in [KS2]).

There are two interesting classes of the Markov processes $\xi$ where (2.11) may be simplified: diffusions and jump processes (j.p.'s). In the case of a diffusion, $\xi$ is determined by a generator
$$\tfrac{1}{2}\sigma^2(y)\, d^2/dy^2 + a(y)\, d/dy \tag{2.13}$$

(possibly with boundary conditions) in a suitable functional space $D_\xi$ on $\mathbb{R}$. See [IM] for the extensive theory. For our purposes it suffices to set $\sigma = 1$ and assume that there is no singular point, and that the drift coefficient $a(y)$ is bounded and $C^1$. In the case of a j.p., $\xi$ is determined by a function $\beta(y) > 0$ (the rate of jumping from $y$) and a probability measure $\pi(y, \cdot)$ (the jump distribution from $y$), $y \in \mathbb{R}$. Here, we assume that $\beta(y)$ is bounded.

Lemma 2.6. In the case of a branching diffusion, the condition of Lemma 2.5 is equivalent to the existence of a positive non-decreasing $C^2$-function $f > 0$ on $\mathbb{R}$ such that $\tfrac{1}{2} f''(x) + a(x) f'(x) = -\lambda(\kappa - 1) f(x)$ for $x < b$. In the case of a branching jump process, the condition of Lemma 2.5 is equivalent to the existence of a non-decreasing function $f > 0$ on $\mathbb{R}$ such that $-\beta(x) f(x) + \beta(x) \int \pi(x, dy) f(y) = -\lambda(\kappa - 1) f(x)$ for $x < b$.

The proof of Lemma 2.6 consists in passing to the generators of the corresponding processes. Cf. [KS2], Sections 3.1, 3.2 and 4.1.

Homogeneous continuous-time processes. In the homogeneous case, $Q_t(y, A + y)$ and thus $G(y, A + y)$ do not depend on $y \in \mathbb{R}$. We now set $G(0, A) = q(A)$, $G(y) = q((-\infty, y))$ and then define $\Phi(p)$ by (2.4). Under condition (2.5) process $\Xi$ again spends a finite time on each half-line $(x, \infty)$ and the equation $\Phi(p) = 1/\kappa$ has at most two non-negative roots $p_0$, $p_1$. In general, we also have to assume that condition (2.6) holds for some constant $c_0 > 0$. For the branching diffusion case, (2.6) will hold automatically (and is in fact not required). For the branching jump process, (2.6) will follow from the assumption $\pi(y, \{y' : |y' - y| < r_0\}) = 1$, $y \in \mathbb{R}$. The modified process is again denoted by $\Xi^+ = (\Xi^+_t, t \ge 0)$ and the corresponding expected value by $m^+(x)$. An analogue of Lemmas 2.3 and 2.4 is Lemma 2.7 below. Its proof is a repetition of the discrete-time argument and is omitted.

Lemma 2.7. Under condition (2.5), bound (2.7) holds, with the constant $c_0$ from (2.6). Furthermore, (2.8) also holds.
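As a hedged illustration of condition (2.5) in continuous time, the sketch below treats Brownian motion with unit diffusion coefficient and constant drift $a$. For this process $E\,e^{p\xi_t} = e^{(p^2/2 + ap)t}$, so the Laplace transform of Green's measure is $\Phi(p) = 2\lambda/(2\lambda - p^2 - 2ap)$ wherever the denominator is positive. The function names and the numerical values are assumptions chosen for the example.

```python
import math


def phi_drift(p, a, lam):
    """Phi(p) = 2*lam / (2*lam - p^2 - 2*a*p) for Brownian motion with drift a."""
    denom = 2.0 * lam - p * p - 2.0 * a * p
    if denom <= 0:
        raise ValueError("p lies outside the domain (p + a)^2 < a^2 + 2*lam")
    return 2.0 * lam / denom


def roots(a, lam, kappa):
    """Roots (p0, p1) of Phi(p) = 1/kappa, or None when no real root exists.

    Phi(p) = 1/kappa is the quadratic p^2 + 2*a*p + 2*lam*(kappa - 1) = 0.
    """
    disc = a * a - 2.0 * lam * (kappa - 1.0)
    if disc < 0:
        return None
    r = math.sqrt(disc)
    return (-a - r, -a + r)


# Assumed example values: drift to the left, fission rate 1, mean offspring 1.5.
example = roots(a=-1.5, lam=1.0, kappa=1.5)
```

With these values both roots are positive and `phi_drift` returns $1/\kappa$ at each of them; with a weaker drift such as `a=-0.5` the discriminant is negative and `roots` returns `None`, which corresponds to the failure of (2.5).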


In the case of a homogeneous branching diffusion, the drift $a$ (see (2.13)) is a constant. We are particularly interested in the case $a < 0$. Definition (2.4) takes the form
$$\Phi(p) = \frac{2\lambda}{2\lambda - p^2 - 2ap}, \quad (p + a)^2 < a^2 + 2\lambda, \tag{2.14}$$
and the analogue of (2.5) is $a^2 \ge 2\lambda(\kappa - 1)$. Under this condition, $p_{0,1} = -a \mp \sqrt{a^2 - 2\lambda(\kappa - 1)}$.

Remark 2.8. A direct calculation shows that for the branching diffusion, $m^+(x) = e^{p_0 x}$, $x > 0$.

3. Basic Facts of Lobachevsky Spaces

Preliminaries. This summary of properties of the Lobachevsky spaces mainly follows [GGV]; other classical references (far exceeding our needs) are [He1] and [He2]. Given a $(d + 1) \times (d + 1)$ real matrix $g = (g_{ij})$ and a $(d + 1)$ real vector $x = (x_i)$, define $gx$ by $(gx)_i = \sum_j g_{ij} x_j$. Matrix $g$ generates on the hyperplane $\{x \in \mathbb{R}^{d+1} : x_{d+1} = 1\}$ the projective transformation $x \mapsto g * x$, where $(g * x)_i = (gx)_i/(gx)_{d+1}$, $i = 1, \ldots, d + 1$. Denote by $[\cdot, \cdot]$ the Minkowski bilinear form on $\mathbb{R}^{d+1} \times \mathbb{R}^{d+1}$ defined by $[x, y] = x_{d+1} y_{d+1} - \sum_{1 \le j \le d} x_j y_j$. In the projective model, Lobachevsky space $H_d$ is represented as the interior of the Euclidean unit ball $\{x \in \mathbb{R}^{d+1} : [x, x] > 0,\ x_{d+1} = 1\}$. The group $G$ of the motions of $H_d$ is the connected component, containing the identity, of the group of $(d + 1) \times (d + 1)$ real matrices of determinant one preserving the form $[\cdot, \cdot]$, with action $x \mapsto g * x$. The $G$-invariant Riemannian metric $\rho$ on $H_d$ is given by $(\cosh \rho(x, y))^2 = [x, y]^2/([x, x][y, y])$; if $y$ coincides with $y^0 = (0, \ldots, 0, 1)$, we have that
$$\tanh \rho(x, y^0) = \|x\| \tag{3.1}$$
($\|\cdot\|$ stands for the Euclidean norm). The Euclidean rotations of $\mathbb{R}^{d+1}$ about the axis $Ox_{d+1}$ preserving $[\cdot, \cdot]$ generate a subgroup $K$ of $G$ which is the stabiliser of $y^0$.

The absolute $\partial H_d$ coincides with the sphere $\{x \in \mathbb{R}^{d+1} : [x, x] = 0,\ x_{d+1} = 1\}$. The action of group $G$ is extended to $\partial H_d$ so that the map $x \mapsto g * x$ is continuous in $(g, x) \in G \times \bar H_d$, where $\bar H_d = H_d \cup \partial H_d$, with the topology generated by the norm $\|\cdot\|$. Note that $K$ acts on $\partial H_d$ transitively. The subscript $d$ in the notation $H_d$ will henceforth be systematically omitted.

The (directed) geodesics of $H$ in this model are the (directed) chords of the unit ball. Thus, a point $w \in \partial H$ is associated with the sheaf $\Sigma_w$ of the geodesics that enter this point. For any geodesic $\gamma$ there exists a one-parameter subgroup $A_\gamma = \{a_\gamma(t),\ t \in \mathbb{R}\} \subseteq G$ of the shifts along $\gamma$. Here, $a_\gamma(t)$ performs the shift by distance $|t|$ in the positive or negative direction, depending on $\operatorname{sign} t$. By $\gamma_0$ we denote the geodesic through $y^0$ going to the point $w^0 = (0, \ldots, 0, 1, 1) \in \partial H$. The action of group $A_{\gamma_0}$ is specified by
$$a_{\gamma_0}(t) * x = \frac{1}{x_d \sinh t + \cosh t} \bigl( x_1, \ldots, x_{d-1},\ x_d \cosh t + \sinh t,\ x_d \sinh t + \cosh t \bigr), \quad x = (x_1, \ldots, x_{d-1}, x_d, 1) \in \bar H. \tag{3.2}$$
The subscript $\gamma_0$ in the notation $A_{\gamma_0}$, $a_{\gamma_0}(t)$, etc. is henceforth omitted. Note that if $\gamma' = g * \gamma$ then $A_{\gamma'} = g A_\gamma g^{-1}$.


Consider the kernel $k : H \times \partial H \to (0, \infty)$ given by
$$k(x, w) = \sqrt{[x, x]}\,/\,[x, w], \quad x \in H,\ w \in \partial H. \tag{3.3}$$
Note that
$$k(g * x, g * w) = k(x, w)\,(gw)_{d+1}, \quad g \in G. \tag{3.4}$$
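The covariance rule (3.4) can be checked numerically for the shifts $a(t)$ of (3.2). The following Python sketch represents points as lists $(x_1, \ldots, x_d, 1)$; the helper names and the sample points are hypothetical choices made for illustration, with $d = 3$.

```python
import math


def form(x, y):
    """Minkowski form [x, y] = x_{d+1} y_{d+1} - sum_{j <= d} x_j y_j."""
    return x[-1] * y[-1] - sum(a * b for a, b in zip(x[:-1], y[:-1]))


def shift_linear(t, x):
    """Linear action of a(t): a hyperbolic rotation in the (x_d, x_{d+1}) plane."""
    *head, xd, xlast = x
    return head + [xd * math.cosh(t) + xlast * math.sinh(t),
                   xd * math.sinh(t) + xlast * math.cosh(t)]


def shift(t, x):
    """Projective action a(t) * x of (3.2): rescale so the last coordinate is 1."""
    gx = shift_linear(t, x)
    return [c / gx[-1] for c in gx]


def kernel(x, w):
    """k(x, w) = sqrt([x, x]) / [x, w] for x in H and w on the absolute, as in (3.3)."""
    return math.sqrt(form(x, x)) / form(x, w)


# Assumed sample data for d = 3.
x = [0.2, -0.3, 0.1, 1.0]      # interior point: [x, x] > 0
w = [0.6, 0.0, 0.8, 1.0]       # boundary point: [w, w] = 0
y0 = [0.0, 0.0, 0.0, 1.0]
w0 = [0.0, 0.0, 1.0, 1.0]
t = 0.7

lhs = kernel(shift(t, x), shift(t, w))
rhs = kernel(x, w) * shift_linear(t, w)[-1]   # k(x, w) * (gw)_{d+1}
```

Here `lhs` and `rhs` agree up to rounding. The same helpers give `kernel(shift(s, y0), w0)` equal to $e^s$, which is the special case of the exponential formula for $k$ along $\gamma_0$.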

Given $w \in \partial H$ and $c > 0$, the level surface $\{x \in H : k(x, w) = c\}$ is called a horosphere (with the pole at $w$). In Euclidean terms, a horosphere is an ellipsoidal surface touching $\partial H$ at point $w$. The ellipsoid $\{x \in H : k(x, w) \ge c\}$ is called a horoball. A geodesic $\gamma \in \Sigma_w$ and a horosphere with the pole at $w$ intersect at a single point and orthogonally. The horospheres are the limiting surfaces for the Riemannian spheres in $H$ when their centres approach $w \in \partial H$ along a geodesic $\gamma \in \Sigma_w$. This implies that if $x_0$, $y$ are points on $\gamma \in \Sigma_w$ and $k(x_0, w) < k(y, w)$ then any Riemannian sphere centred at $y$ and passing through $x_0$ is contained in the horoball $\{x \in H : k(x, w) \ge k(x_0, w)\}$.

For any $w \in \partial H$ there exists a group $N_w \subset G$ called a horospheric group, whose orbits are the horospheres $O_w$ with the pole at $w$. For $w, w' \in \partial H$ the groups $N_w$, $N_{w'}$ are conjugated: if $w' = g * w$ then $N_{w'} = g N_w g^{-1}$. Furthermore, for any $\gamma \in \Sigma_w$ and $t \in \mathbb{R}$, $a_\gamma(t) N_w a_\gamma(-t) = N_w$. This implies that the elements $a_\gamma(t)$ transform the family of horospheres $O_w$ into itself.

Given a geodesic $\gamma$ through $y^0$, we can introduce the Cartesian co-ordinate along $\gamma$, with origin at $y^0$, compatible with the direction on $\gamma$. If $x \in H$, $w \in \partial H$, $\gamma \in \Sigma_w$ and $O_w$ is the horosphere through $x$ with the pole at $w$, we denote by $\Pi^{\mathrm{ho}}_\gamma(x)$ the intersection point of $O_w$ and $\gamma$, and by $\pi^{\mathrm{ho}}_\gamma(x)$ its Cartesian co-ordinate along $\gamma$. It is easy to see that
$$k(x, w) = \exp\bigl(\pi^{\mathrm{ho}}_\gamma(x)\bigr), \quad \gamma \in \Sigma_w. \tag{3.5}$$
Thus, $O_w = \{y \in H : \pi^{\mathrm{ho}}_\gamma(y) = \pi^{\mathrm{ho}}_\gamma(x)\}$.

The equidistant projection. Now consider the manifold $H_0 = \{x \in H : x_d = 0\}$. Consider the subgroup $G_0 \subset G$ consisting of matrices $g = (g_{ij})$ such that $g_{id} = g_{di} = \delta_{id}$, $i = 1, \ldots, d + 1$. Group $G_0$ takes $H_0$ to itself (and determines on $H_0$ the geometry of $H_{d-1}$). The orbits of $G_0$ in $H$ are called equidistants (with base $H_0$). Each equidistant intersects geodesic $\gamma_0$ at a single point; thus for any $x \in H$ there is a uniquely determined point $\Pi^{\mathrm{ed}}(x) \in \gamma_0$ called the equidistant projection of $x$. The Cartesian co-ordinate of $\Pi^{\mathrm{ed}}(x)$ (along $\gamma_0$) is denoted by $\pi^{\mathrm{ed}}(x)$.

On $H_0$ there exists a Riemannian volume invariant under $G_0$. It is given by $dx_1 \cdots dx_{d-1}/(1 - \sum_{i=1}^{d-1} x_i^2)^{d/2}$. The absolute is decomposed into three orbits of group $G_0$, namely: $\partial H_\pm = \{w \in \partial H : \pm w_d > 0\}$ and $\{w \in \partial H : w_d = 0\}$. The map $\partial H_+ \to H_0$ defined by $(w_1, \ldots, w_{d-1}, w_d, 1) \mapsto (w_1, \ldots, w_{d-1}, 0, 1)$ commutes with $G_0$. This fact immediately leads to the following lemma.

Lemma 3.1. The measure $\nu$ on $\partial H_+$, with $\nu(dw) = dw_1 \cdots dw_{d-1}/w_d^d$, is invariant under $G_0$: $\nu(d(g * w)) = \nu(dw)$, $w \in \partial H_+$, $g \in G_0$.

Given $x \in H$ and $p \in \mathbb{C}$, set
$$\psi_p(x) = \frac{1}{\Gamma(p + 2 - d)} \int_{\partial H_+} k^p(x, w)\, w_d^p\, \nu(dw). \tag{3.6}$$
Here and below, $\Gamma(\cdot)$ denotes Euler's gamma-function. It is not hard to check that, for $\operatorname{Re} p > d - 2$, the last integral converges.


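Lemma 3.2. For $\operatorname{Re} p > d - 2$ the function $\psi_p$ is constant on the orbits of $G_0$.

Proof. Given $x \in H$, $w \in \partial H$, $g \in G_0$, set $\tilde x = g * x$ and $\tilde w = g * w$. Then $(gw)_d = w_d = (gw)_{d+1} \tilde w_d$. Thus,
$$\psi_p(x) = \frac{1}{\Gamma(p + 2 - d)} \int_{\partial H_+} \left( \frac{w_d}{(gw)_{d+1}} \right)^p k^p(x, \tilde w)\, \nu(d\tilde w),$$
$$\psi_p(\tilde x) = \frac{1}{\Gamma(p + 2 - d)} \int_{\partial H_+} \left( \frac{w_d}{(gw)_{d+1}} \right)^p k^p(\tilde x, \tilde w)\, \nu(d\tilde w),$$
and the assertion follows from (3.4) and Lemma 3.1.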

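Lemma 3.3. Given $x \in H$, set $s = \pi^{\mathrm{ed}}(x)$, the Cartesian co-ordinate of the equidistant projection of $x$ to $\gamma_0$. The following equality holds true:
$$\psi_p(x) = \frac{a_{d-2}\, 2^{\frac{d-3}{2}}\, \Gamma\bigl(\frac{d-1}{2}\bigr)}{\Gamma\bigl(p - \frac{d-3}{2}\bigr)}\, (1 - \tanh s)^{\frac{d-1}{2}} \exp(ps) \times F\Bigl( \frac{d-1}{2}, \frac{3-d}{2}, p - \frac{d-3}{2};\ \frac{1 + \tanh s}{2} \Bigr), \quad x \in H. \tag{3.7}$$
Here and below, $F(\cdot, \cdot, \cdot\,; \cdot)$ is the hypergeometric function and $a_m$ is the area of the sphere $\{x \in \mathbb{R}^{m+1} : \|x\| = 1\}$.

Proof. As $\psi_p$ is constant on the orbits of $G_0$, we can replace $x \in H$ by $\Pi^{\mathrm{ed}}(x)$, the equidistant projection to $\gamma_0$. By virtue of (3.1), we can assume that $x_i = 0$, $1 \le i \le d - 1$, $x_d = \tanh s$ and $x_{d+1} = 1$. Then $[x, x] = (\cosh s)^{-2}$ and $[x, w] = 1 - w_d \tanh s$, $w \in \partial H$. As $w_d = (1 - \sum_{i=1}^{d-1} w_i^2)^{1/2}$, we have, in the polar co-ordinate $r = \sqrt{1 - w_d^2}$,
$$\psi_p(x) = \frac{a_{d-2}}{\Gamma(p + 2 - d) \cosh^p s} \int_0^1 \frac{w_d^{p-d}\, r^{d-2}}{(1 - w_d \tanh s)^p}\, dr,$$
or, after the change of variable $t = \frac{1 - w_d}{w_d}$,
$$\psi_p(x) = \frac{a_{d-2}}{\Gamma(p + 2 - d) \cosh^p s} \int_0^\infty t^{\frac{d-3}{2}} (1 + t - \tanh s)^{-p} (t + 2)^{\frac{d-3}{2}}\, dt.$$
Now use the well-known formula
$$\int_0^\infty x^{\alpha - 1} (x + y)^{-\tau} (x + w)^{-\omega}\, dx = w^{-\omega} y^{\alpha - \tau} B(\alpha, \tau + \omega - \alpha)\, F\Bigl( \alpha, \omega, \tau + \omega;\ 1 - \frac{y}{w} \Bigr),$$
$y, w, \tau, \omega \in \mathbb{C}$, $|\arg y|, |\arg w| < \pi$, $0 < \operatorname{Re} \alpha < \operatorname{Re}(\tau + \omega)$. See, e.g., [DP], Chapter 10, No. 10.14. Here, $B(x, y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x + y)}$ is Euler's beta-function. Then
$$\psi_p(x) = \frac{a_{d-2}\, 2^{\frac{d-3}{2}} (1 - \tanh s)^{\frac{d-1}{2} - p}}{(\cosh s)^p\, \Gamma(p + 2 - d)}\, B\Bigl( \frac{d-1}{2},\ p + 2 - d \Bigr) \times F\Bigl( \frac{d-1}{2}, \frac{3-d}{2}, p - \frac{d-3}{2};\ \frac{1 + \tanh s}{2} \Bigr), \quad x \in H,$$
and (3.7) follows.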
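Formula (3.7) can be sanity-checked numerically in the simplest case $d = 3$, where the second hypergeometric parameter $(3 - d)/2$ vanishes, so that $F \equiv 1$ and the right-hand side reduces to $2\pi (1 - \tanh s) e^{ps}/\Gamma(p)$. The Python sketch below compares this with the polar-coordinate integral from the proof, rewritten in the variable $w_d$ (using $r\,dr = -w_d\,dw_d$). The Simpson quadrature, the function names and the test values of $p$ and $s$ are assumptions made for this illustration.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def psi_integral(p, s):
    """Integral representation of psi_p on gamma_0 for d = 3, in the variable w_d."""
    tau = math.tanh(s)
    a1 = 2.0 * math.pi   # a_{d-2} for d = 3: length of the unit circle

    def integrand(w):
        # w_d^{p-3} * r dr becomes w_d^{p-2} dw_d on [0, 1]
        return w ** (p - 2) / (1.0 - w * tau) ** p

    prefactor = a1 / (math.gamma(p - 1) * math.cosh(s) ** p)
    return prefactor * simpson(integrand, 0.0, 1.0)


def psi_closed(p, s):
    """Right-hand side of (3.7) for d = 3, where the hypergeometric factor is 1."""
    return 2.0 * math.pi * (1.0 - math.tanh(s)) * math.exp(p * s) / math.gamma(p)
```

For integer $p \ge 3$ the integrand is smooth and the two functions agree to high relative accuracy, for example at $(p, s) = (3, 0.7)$ and $(4, -0.5)$. Non-integer $p$ close to $d - 2$ would need a quadrature adapted to the endpoint behaviour at $w_d = 0$.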

It is well known that for $z \in \mathbb{C}$, $|z| < 1$, the ratio $F(a, b, c; z)/\Gamma(c)$ is an entire function of $a, b, c \in \mathbb{C}$ (see, e.g., [E]). Therefore, for fixed $s \in \mathbb{R}$, the RHS of (3.7) is an entire function of $p \in \mathbb{C}$. Thus we can define the value $\psi_p(x)$ for any $p \in \mathbb{C}$ as the analytic continuation from the domain $\operatorname{Re} p > d - 2$. We thereby obtain the following Lemma 3.4.

Lemma 3.4. The quantity $\psi_p(x)$, $x \in H$, $p \in \mathbb{C}$, defined by the RHS of (3.7), is an entire function of $p$ for any fixed $x$. For a fixed $p$, $\psi_p(x)$ is constant on the orbits of group $G_0$; if $\operatorname{Re} p > d - 2$, (3.6) holds true.

Function $\psi_p(x)$ and the Laplace–Beltrami operator. Useful concepts are the radial horospheric and equidistant parts $\Delta^{\mathrm{ho}}$ and $\Delta^{\mathrm{ed}}$ of the Laplace–Beltrami operator $\Delta$. These are the differential operators induced by $\Delta$ on the functions that are constant, respectively, on the horospheres (with a fixed pole) and on the equidistants. We treat $\Delta^{\mathrm{ho}}$ and $\Delta^{\mathrm{ed}}$ as operators on $\mathbb{R}$. It is possible to check (see [K, HC]) that
$$\Delta^{\mathrm{ho}} = \frac{d^2}{ds^2} - (d - 1) \frac{d}{ds}, \quad \Delta^{\mathrm{ed}} = \frac{d^2}{ds^2} + (d - 1) \tanh s\, \frac{d}{ds}, \quad s \in \mathbb{R}. \tag{3.8}$$

Lemma 3.5. Function $\psi_p(\cdot)$, $p \in \mathbb{C}$, is an eigenfunction of $\Delta$ on $H$ with the eigenvalue $p(p - d + 1)$.

Proof. By direct computation based on (3.5), it can be checked that $\forall\, x \in H$, $w \in \partial H$, $\Delta k^p(x, w) = p(p - d + 1) k^p(x, w)$. Therefore, for $\operatorname{Re} p > d - 2$, $\Delta \psi_p(x) = p(p - d + 1) \psi_p(x)$. Both the left and right hand sides of the last equality are entire functions of $p$; hence the equality holds $\forall\, p \in \mathbb{C}$.

The metrics on the absolute. A natural metric on $\partial H$ is the angular one, where the distance between $w, w' \in \partial H$ equals the Riemannian angle between the geodesics $\gamma \in \Sigma_w$ and $\gamma' \in \Sigma_{w'}$ through the point $y^0$. We denote this metric by $\varphi$. If instead of $y^0$ we choose another point $y \in H$ we get a metric, $\varphi_y$, which is equivalent to $\varphi$. Another metric, $\chi$, on $\partial H$ which we will use is given by $\chi(w, w') = \|\Pi^{\mathrm{st}}(w) - \Pi^{\mathrm{st}}(w')\|$, where $\Pi^{\mathrm{st}}$ is the standard stereographic projection $\partial H \to \{x \in \mathbb{R}^{d+1} : x_d = 0,\ x_{d+1} = 1\}$ from the point $w^0$. In this metric $w^0$ is an infinitely distant point. Note the following properties of metric $\chi$:

(i) If $F \subset \partial H$ is a closed set not containing $w^0$, $\varphi$ and $\chi$ are equivalent on $F \times F$.
(ii) $\chi$ is invariant under the horospheric group $N_{w^0}$.
(iii) Let geodesic $\gamma \in \Sigma_{w^0}$. Then $\forall\, w, w' \in \partial H$ and $t \in \mathbb{R}$, $\chi(a_\gamma(t) * w, a_\gamma(t) * w') = e^t \chi(w, w')$.

The verification of these properties is elementary and we omit it. Now consider the set $\mathcal{D}_s = \{w \in \partial H : w_d \le \tanh s\}$, $s \in \mathbb{R}$. Let $D_s$ stand for the diameter of $\mathcal{D}_s$ measured in metric $\chi$.
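Property (iii) of the metric $\chi$ can be illustrated numerically for the shift $a(t)$ along $\gamma_0$. The Python sketch below stores a boundary point as $(w_1, \ldots, w_d)$ with the last coordinate $1$ dropped, and assumes the usual formula $w \mapsto (w_1, \ldots, w_{d-1})/(1 - w_d)$ for the stereographic projection from $w^0$; the helper names and sample points are hypothetical.

```python
import math


def shift_boundary(t, w):
    """Action of a(t) on a boundary point w = (w_1, ..., w_d), following (3.2)."""
    *head, wd = w
    denom = wd * math.sinh(t) + math.cosh(t)
    return [c / denom for c in head] + [(wd * math.cosh(t) + math.sinh(t)) / denom]


def stereo(w):
    """Stereographic projection from w^0 (w_d = 1) onto the hyperplane w_d = 0."""
    *head, wd = w
    return [c / (1.0 - wd) for c in head]


def chi(w, v):
    """Euclidean distance between the stereographic images of w and v."""
    return math.dist(stereo(w), stereo(v))


# Assumed sample points on the unit sphere (d = 3), both different from w^0.
w = [0.6, 0.0, 0.8]
v = [0.0, -0.6, -0.8]
t = 0.9

scaled = chi(shift_boundary(t, w), shift_boundary(t, v))
expected = math.exp(t) * chi(w, v)
```

In this sketch `scaled` and `expected` coincide up to rounding: since $\cosh t - \sinh t = e^{-t}$, the shift multiplies every stereographic image by $e^t$, which is the scaling stated in (iii).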


Lemma 3.6. $D_s = D_0 e^s$.

Proof. Under the shift $a(s)$ the set $\mathcal{D}_0$ is taken to $\mathcal{D}_s$. The assertion now follows from property (iii).

4. Homogeneous Processes on H: The Criticality Condition

Projections of homogeneous Markov process. Discrete time. Let $Q(x, \cdot)$, $x \in H$, be a transition probability measure on $H$, and $\theta = (\theta_n, n \ge 0)$ the corresponding discrete-time Markov process. The process $\theta$ is called (space) homogeneous if $Q(x, B) = Q(g * x, g * B)$, $g \in G$, $x \in H$, $B \subseteq H$. An equivalent fact is that the operator $Q$ with $Qf(x) = E_x f$ acting on measurable functions $f : H \to \mathbb{C}$ commutes with the shift operators $T_g$, where $T_g f(x) = f(g^{-1} * x)$, $g \in G$. In the sequel we assume that $\theta$ is homogeneous and satisfies the condition analogous to (2.1):
$$Q(x, \{\rho(\cdot, x) > 0\}) > 0, \quad \text{and} \quad Q\bigl(x, \{\rho(\cdot, x) < r_0\}\bigr) = 1, \quad x \in H, \tag{4.1}$$
for some $r_0 > r_1 > 0$. The symbol $E_x$ will be used for the expectation wrt $Q(x, \cdot)$.

Let $\mathsf{H}$ be a subgroup of $G$. If a function $f$ on $H$ is constant on the orbits of $\mathsf{H}$ then so is $Qf$. That is, $\theta$ induces a Markov process on the orbits of $\mathsf{H}$. Take as $\mathsf{H}$ the group $G_0$; the corresponding process is denoted by $\theta^{\mathrm{ed}}$ and called the equidistant projection of $\theta$. Since the orbits of $G_0$ are parametrised by the points of $\gamma_0$, $\theta^{\mathrm{ed}}$ may be considered as a process on $\mathbb{R}$; it is plain that $\theta^{\mathrm{ed}} = \Pi^{\mathrm{ed}}(\theta)$. On the other hand, if we take as $\mathsf{H}$ the group $N_w$, $w \in \partial H$, we obtain a process $\theta^{\mathrm{ho}}$ called the horospheric projection of $\theta$. For any geodesic $\gamma \in \Sigma_w$, the orbits of $N_w$ may be parametrised by the points of $\gamma$, i.e., $\theta^{\mathrm{ho}}$ may again be considered as a process on $\mathbb{R}$. Obviously, $\theta^{\mathrm{ho}} = \Pi^{\mathrm{ho}}_\gamma(\theta)$.

Lemma 4.1. $\theta^{\mathrm{ho}}$ is a homogeneous Markov process on $\mathbb{R}$. The jump distribution $q^{\mathrm{ho}}$ of process $\theta^{\mathrm{ho}}$ does not depend on the choice of $w \in \partial H$ and $\gamma \in \Sigma_w$. Function $\Phi^{\mathrm{ho}}(p) = \int q^{\mathrm{ho}}(dy)\, e^{py}$ (cf. (2.4)) satisfies
$$\Phi^{\mathrm{ho}}(p) = E_{y^0}\bigl( k^p(\cdot, w) \bigr), \tag{4.2}$$
where $k$ is the kernel defined in (3.3) (the RHS of (4.2) does not depend on $w \in \partial H$ and $\gamma \in \Sigma_w$).

Proof. The assertions of Lemma 4.1 follow directly from formula (3.5) and the fact that (a) $A_\gamma$, $\gamma \in \Sigma_w$, takes a horosphere $O_w$ with the pole at $w$ to another horosphere with the pole at $w$ and (b) $K$ acts transitively on $\partial H$.

Lemma 4.2. Let $E_\pm$ be the operator of multiplication by the function $x \in \mathbb{R} \mapsto \exp(\pm (d - 1)x/2)$ and $Jf(x) = f(-x)$, $x \in \mathbb{R}$. Then the operator $E_- Q^{\mathrm{ho}} E_+$ commutes with $J$: $J E_- Q^{\mathrm{ho}} E_+ = E_- Q^{\mathrm{ho}} E_+ J$. Consequently, function $\Phi^{\mathrm{ho}}$ possesses the following symmetry property:
$$\Phi^{\mathrm{ho}}\bigl( (d - 1)/2 + p \bigr) = \Phi^{\mathrm{ho}}\bigl( (d - 1)/2 - p \bigr), \quad p \in \mathbb{R}, \tag{4.3}$$
and the value $\Phi^{\mathrm{ho}}((d - 1)/2)$ is the minimum of $\Phi^{\mathrm{ho}}(p)$, $p \in \mathbb{R}$.


Proof. Equality (4.3) follows from Harish-Chandra's Theorem (Theorem 10.1.15, [He1]). As $\Phi^{\mathrm{ho}}$ is a convex function, (4.3) implies that $\Phi^{\mathrm{ho}}$ attains its minimum at $p = (d - 1)/2$.

Homogeneous branching processes. Discrete time. Given a homogeneous Markov process $\theta$ and offspring number probabilities $\mu_k$, $k \ge 1$, with $\kappa = \sum_k k\mu_k > 1$, let $\Theta$ be the corresponding branching process on $H$. In view of Lemma 4.2, $\inf_{p \ge 0} \Phi^{\mathrm{ho}}(p) \le 1/\kappa$ iff $\Phi^{\mathrm{ho}}((d - 1)/2) \le 1/\kappa$. As in Sect. 1, we denote by $p_0$ the least and by $p_1$ the largest positive root of the equation $\Phi^{\mathrm{ho}}(p) = 1/\kappa$. As before we introduce the equidistant and horospheric projections $\Theta^{\mathrm{ed}}$ and $\Theta^{\mathrm{ho}}$. In particular, $\Theta^{\mathrm{ho}}$ is a space-homogeneous branching process on $\mathbb{R}$. We call process $\Theta$ subcritical, critical or supercritical when this is true of $\Theta^{\mathrm{ho}}$; this is equivalent to the classification given in Sect. 1. A direct corollary of Lemma 2.4 is the following.

Theorem 4.3. In the subcritical and critical cases, process $\Theta$ spends a finite time in any horoball. Consequently, it spends a finite time in any compact $C \subset H$. On the other hand, in the supercritical case process $\Theta$ visits any horoball infinitely many times.

In the supercritical case a similar statement holds for the set-theoretical difference of two horoballs with the same pole $w \in \partial H$ (in which case one of them contains the other). More precisely, such a set is called a horoball layer (with the pole at $w$): it is described by inequalities $c_1 \le \pi^{\mathrm{ho}}_\gamma(x) \le c_2$, where $\gamma \in \Sigma_w$ and $c_1, c_2 \in \mathbb{R}$ are constants, $c_1 < c_2$. The difference $c_2 - c_1$ is called the width of the layer; the exact statement is that any horoball layer of width $> r_0$ (see (4.1)) is visited infinitely many times. By using the argument from [LS], Sect. 3.3, we obtain

Theorem 4.4. In the supercritical case, with probability one there exists an infinite path of process $\Theta$ remaining in a compact region of $H$.

Continuous time. Let $\theta = (\theta_t, t \ge 0)$ be a homogeneous continuous-time process on $H$. That is, its transition probabilities $Q_t(x, B)$ obey $Q_t(x, B) = Q_t(g * x, g * B)$, $t \ge 0$, $g \in G$, $x \in H$, $B \subseteq H$, or, equivalently, the operators $Q_t$, where $Q_t f(x) = E_{t,x} f$, commute with $T_g$, $g \in G$. Here, $E_{t,x}$ denotes the expectation wrt $Q_t(x, \cdot)$. As in the discrete-time case, one can consider the equidistant and horospheric projections $\theta^{\mathrm{ed}}$ and $\theta^{\mathrm{ho}}$ which are Markov processes on $\mathbb{R}$. Process $\theta^{\mathrm{ho}}$ is space-homogeneous: its transition probabilities $q^{\mathrm{ho}}_t(A) = Q^{\mathrm{ho}}_t(y, A + y)$ do not depend on $y \in \mathbb{R}$. Other assertions of Lemma 4.1 are also carried through, with the function $\Phi^{\mathrm{ho}}(p) = \lambda \int_0^\infty dt\, e^{-\lambda t} \int q^{\mathrm{ho}}_t(dy)\, e^{py}$, $p \in \mathbb{R}$, and the equality
$$\Phi^{\mathrm{ho}}(p) = \lambda \int_0^\infty dt\, e^{-\lambda t} E_{t,y^0}\bigl( k^p(\cdot, w) \bigr) \tag{4.4}$$
replacing (4.2) (here $\lambda$ is an arbitrary positive number). Lemma 4.2 holds intact.

Given $\lambda > 0$ and offspring number probabilities $\mu_k$, $k \ge 1$, with $\kappa > 1$, we again consider the corresponding branching process $\Theta$ and its projections $\Theta^{\mathrm{ed}}$ and $\Theta^{\mathrm{ho}}$. Adopting the same classification and notation as in the discrete-time case, we can extend the assertions of Theorems 4.3 and 4.4 to continuous time. Here, the condition
$$G^{\mathrm{ho}}(y) \le c_0 e^{-p_0 y} \int_{-\infty}^{y} e^{p_0 z}\, q^{\mathrm{ho}}(dz), \quad y < 0, \tag{4.5}$$


is assumed to hold, where $G^{\mathrm{ho}}(y) = \lambda \int_0^\infty dt\, e^{-\lambda t} q^{\mathrm{ho}}_t((-\infty, y))$. It is automatically valid in the case of the branching diffusion; for the branching jump process it follows from the condition on the jump distribution $\pi(x, \{x' \in H : \rho(x, x') < r_0\}) = 1$.

In the case of a branching diffusion, $\theta$ is generated by $\tfrac{1}{2}\Delta$. Processes $\theta^{\mathrm{ho}}$ and $\theta^{\mathrm{ed}}$ are diffusions on $\mathbb{R}$; it is easy to see that their generators are, respectively, $\tfrac{1}{2}\Delta^{\mathrm{ho}}$ and $\tfrac{1}{2}\Delta^{\mathrm{ed}}$: see (3.8). Observe that the branching diffusion is sub-, super- or critical when $2\lambda(\kappa - 1)$ is, respectively, $<$, $>$ or $= (d - 1)^2/4$.

5. The Limiting Set on ∂H

Discrete-time case. As in [LS], the limiting (random) set $\Lambda$ of a homogeneous branching process $\Theta$ is defined as the set of accumulation points on $\partial H$ for $\bigcup_{n \ge 0} \operatorname{supp} \Theta_n$, in the topology of $\bar H$. It is easy to see that $\Lambda$ is a closed continuum. The “distribution” of $\Lambda$ depends on the initial point $x$; by the continuity of the map $(g, x) \in G \times \bar H \mapsto g * x$, the distribution of $g * \Lambda$ is the same as that of the limiting set for the branching process from point $g * x$.

Theorem 5.1. In the supercritical case the set $\Lambda$ coincides a.s. with $\partial H$.

Proof. The assertion follows from the fact that (i) in the supercritical case $\Theta$ visits any horoball infinitely many times (see Theorem 4.3), and (ii) any neighbourhood of a point $w \in \partial H$ contains a horoball with the pole at $w$.

Lemma 5.2. Function $\psi_p$ obeys
$$E_x \psi_p = \Phi^{\mathrm{ho}}(p)\, \psi_p(x), \quad x \in H,\ p \in \mathbb{C}. \tag{5.1}$$

Proof. It follows from (3.5) and (4.2) that $E_x k^p(\theta_1, w) = \Phi^{\mathrm{ho}}(p)\, k^p(x, w)$. Using (3.6) and Lemma 3.4 completes the proof.

Denote by $A_s = \{x \in H : x_d > \tanh s\}$, $s > 0$, and set $h(x) = P_x(\Theta \text{ does not reach } A_0)$. In Lemmas 5.3–5.7 below we assume that $\Theta$ is either subcritical or critical.

Lemma 5.3. Set $s = \pi^{\mathrm{ed}}(x)$ and assume that $s < 0$ and $|s|$ is large enough. Then
$$h(x) \ge 1 - c e^{s p_1}, \tag{5.2}$$
where $c > 0$ is a constant and $p_1$ the largest positive root of $\Phi^{\mathrm{ho}}(p) = 1/\kappa$.

Proof. As $A_0$ is invariant under group $G_0$, $h$ is constant on the orbits of $G_0$. Thus, $h(x)$ coincides with $P_x\bigl(\Theta^{\mathrm{ed}} \text{ does not reach } (0, \infty)\bigr)$. On the other hand, the function $\psi_p$ is also invariant under $G_0$ (see Lemma 3.4). Denote by $f$ the function on $\mathbb{R}$ obtained by restricting $\psi_{p_1}$ to the geodesic $\gamma_0$. We can then rewrite (5.1) as $\kappa E^{\mathrm{ed}}_y f = f(y)$, $y \in \mathbb{R}$. Here, $E^{\mathrm{ed}}_y$ denotes the expectation wrt $Q^{\mathrm{ed}}(y, \cdot)$, the transition probability of $\theta^{\mathrm{ed}}$. Then $f$ satisfies the conditions of Lemma 2.1 which together with (3.7) completes the proof.


Lemma 5.4. For $s > 0$ large enough the following assertion holds true:
$$P_{y^0}(\Theta \text{ does not reach } A_s) \ge 1 - c e^{-s p_1}. \tag{5.3}$$

Proof. As seen from (3.2), the shift $a(s)$ along $\gamma_0$ takes the set $A_0$ to $A_s$ and the point $x \in \gamma_0$ with Cartesian co-ordinate $-s$ to $y^0$. The assertion of the lemma then follows.

Lemma 5.5. $\forall\, w \in \partial H$,
$$P_{y^0}(w \in \Lambda) = 0.$$

Proof. The assertion follows from Lemma 5.4 and the transitivity of the action of group $K$ on $\partial H$; the homogeneity of process $\Theta$ wrt group $K$ is used here in a crucial way.

As $K$ acts on $\partial H$ transitively, $\exists$ a measure $\nu_0$ on $\partial H$ invariant under $K$. Then, with probability one, $\nu_0(\Lambda) = 0$. Cf. Proposition 5 from [LS]. For the definition and properties of the Hausdorff dimension, see, e.g., [F].

Lemma 5.6. The Hausdorff dimensions of $\Lambda$ calculated in metrics $\varphi$ and $\chi$ coincide a.s.

Proof. The assertion follows from Lemma 5.5 (with $w = w^0$), the fact that $\Lambda$ is closed and property (i) of metric $\chi$.

Denote by $D(\Lambda)$ the diameter of $\Lambda$ measured in metric $\chi$.

Lemma 5.7. For $t > 0$ large enough,
$$P_{y^0}\bigl( D(\Lambda) > t \bigr) \le c D_0^{p_1} t^{-p_1}. \tag{5.4}$$
Here, $c$ is the constant from Lemma 5.4 and $D_0$ the constant from Lemma 3.6.

Proof. For the set $\mathcal{D}_s^c = \{w \in \partial H : w_d > \tanh s\}$ we have, by Lemma 5.4, that $P_{y^0}(\Lambda \cap \mathcal{D}_s^c \ne \emptyset) \le c e^{-s p_1}$. As in Lemma 3.6, let $D_s$ stand for the diameter of $\mathcal{D}_s$. The event $\{D(\Lambda) > D_s\}$ implies $\{\Lambda \cap \mathcal{D}_s^c \ne \emptyset\}$. Applying Lemma 3.6 with $t = D_0 e^s$ completes the proof.

Lemma 5.7 is similar to Proposition 4 from [LS]. Repeating the argument from [LS], Sections 6.1, 6.2, we obtain

Theorem 5.8. In the subcritical case, the Hausdorff dimension of $\Lambda$ calculated in metric $\chi$ is $\le p_0$ a.s.

The lower bound is established in Theorem 5.9:

Theorem 5.9. Assume that $\Theta$ is either subcritical or critical. Then the Hausdorff dimension of $\Lambda$ calculated in metric $\chi$ is $\ge p_0$ a.s.

Proof. We argue as in Sect. 7 of [LS], focussing on the changes needed to cover the case under consideration. We use the process $X_1, \ldots, X_n, \ldots$ in $H$ analogous to that introduced in [LS], Sect. 7.2. Namely, fix $r > r_0$ and impose an absorbing condition in the domain $B_1 = \{x \in H : \pi^{\mathrm{ho}}(x) \le -r\}$ (the complement of a horoball). That is, particles from $\Theta$ are considered frozen where they hit $B_1$. Among the particles frozen in $B_1$ we randomly choose one; its position is denoted by $X_1$. Then revive the chosen particle and create an absorbing condition for its descendants in the domain $B_2 = \{x \in H : \pi^{\mathrm{ho}}(x) \le -r + \pi^{\mathrm{ho}}(X_1)\}$. Among the descendants frozen in $B_2$ again choose one at random; its position is denoted by $X_2$, etc. As in [LS], it is possible to prove that a.s. $\exists \lim_{n \to \infty} X_n$ (in the topology of $\bar H$), giving a random point $W \in \partial H$.

The distribution of $W$ has a bounded probability density wrt the measure on $\partial H$ generated by the Lebesgue measure on the linear manifold $\{x \in \mathbb{R}^{d+1} : x_d = 0,\ x_{d+1} = 1\}$ under the stereographic projection $\Pi^{\mathrm{st}}$. This follows from the fact that the process $X_0 = y^0, X_1, X_2, \ldots$ is distributed as process $\theta$ starting at $y^0$ and observed at the times $T_1, T_2, \ldots$, when it subsequently hits the domains $\{x \in H : \pi^{\mathrm{ho}}(x) \le -r\}$, $\{x \in H : \pi^{\mathrm{ho}}(x) \le -r + \pi^{\mathrm{ho}}(X_1)\}$, etc. Using this fact and Lemma 2.4, the proof of Theorem 5.9 is completed as in [LS].

Corollary 5.10. In the subcritical case, the Hausdorff dimension of $\Lambda$ coincides with $p_0$.

Continuous-time case. After straightforward modifications, the assertions of Lemmas 5.3–5.7 and Theorems 5.8, 5.9 hold also for the continuous-time branching processes. Their derivation is similar to the discrete-time case, and the details can be omitted.

Remark 5.11. For the branching diffusion, the Hausdorff dimension of $\Lambda$ equals, in the subcritical case, $\bigl(d - 1 - \sqrt{(d - 1)^2 - 8\lambda(\kappa - 1)}\bigr)/2$. The equality is extended to the critical case (which gives the value $(d - 1)/2$). This is because the assertion of Proposition 3 from [LS] holds for the Lobachevsky space in any dimension whereas Proposition 4 from [LS] is replaced for $d \ge 3$ by our Lemma 5.4.

Acknowledgement. The authors would like to thank D. Gatzouras for an early copy of paper [LS]. F.I.K. thanks the London Mathematical Society and St John's College, Cambridge, for support and hospitality. E.A.P. thanks the Russian Foundation for Fundamental Research, the Dublin Institute for Advanced Studies, the London Mathematical Society and St John's College, Cambridge, for support and hospitality. Y.M.S. thanks I.H.E.S., Bures-sur-Yvette, for support and hospitality. Special thanks go to S. Shea-Simonds for checking the style of the paper.

References

[DP] Ditkin V.A. and Prudnikov A.P.: Integral Transforms of Operational Calculus. Oxford: Clarendon Press, 1965
[E] Erdelyi A.: Asymptotic Expansions. New York: 1956
[F] Falconer K.J.: Fractal Geometry: Mathematical Foundations and Applications. Chichester: Wiley, 1990
[GGV] Gelfand I.M., Graev M.I. and Vilenkin N.Ya.: Generalized Functions, vol. 5. New York: 1966
[HC] Harish-Chandra: Spherical functions on a semi-simple Lie group. Am. J. Math. 80, 241–310 (1958)
[He1] Helgason S.: Differential Geometry and Symmetric Spaces. New York: Academic Press, 1962
[He2] Helgason S.: Differential Geometry, Lie Groups and Symmetric Spaces. New York: Academic Press, 1978
[IM] Ito K. and McKean H.P.: Diffusion Processes and Their Sample Paths. Berlin: Springer-Verlag, 1965
[K] Karpelevich F.I.: Geometry of geodesics and eigenfunctions of the Beltrami–Laplace operator on symmetric spaces [in Russian]. Trudy Mosk. Matem. Ob-va 44, 48–185 (1965)
[KKS] Karpelevich F.I., Kelbert M.Ya. and Suhov Yu.M.: The boundedness of branching Markov processes. In: The Dynkin Festschrift. Markov Processes and Their Applications (ed. M. Freidlin). Progress in Probability, Vol. 34, Boston: Birkhäuser, 1994, pp. 143–153
[KS1] Karpelevich F.I. and Suhov Yu.M.: A criterion of boundedness of discrete branching random walk. In: Classical and Modern Branching Processes (eds. K.B. Athreya and P. Jagers). IMA Volumes in Mathematics and its Applications, Vol. 84, New York: Springer-Verlag, 1997, pp. 141–156
[KS2] Karpelevich F.I. and Suhov Yu.M.: Boundedness of one-dimensional branching Markov processes. J. Appl. Math. and Stochastic Anal. 10, 307–332 (1997)
[LS] Lalley S. and Sellke T.: Hyperbolic branching Brownian motion. Prob. Th. Rel. Fields 108, 171–192 (1997)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 195, 643 – 650 (1998)


On the Laplace Operator Penalized by Mean Curvature*,**

Evans M. Harrell II, Michael Loss

School of Mathematics, Georgia Institute of Technology, Atlanta GA 30332-0160, USA

Received: 11 December 1997 / Accepted: 22 December 1997

Abstract: Let $h = \sum_{j=1}^{d} \kappa_j$, where the $\kappa_j$ are the principal curvatures of a $d$-dimensional hypersurface immersed in $\mathbb{R}^{d+1}$, and let $-\Delta$ be the corresponding Laplace–Beltrami operator. We prove that the second eigenvalue of $-\Delta - \frac{1}{d} h^2$ is strictly negative unless the surface is a sphere, in which case the second eigenvalue is zero. In particular this proves conjectures of Alikakos and Fusco.

1. Introduction

This article is concerned with linear differential operators of the form $H = -\Delta - q$ defined on curves, surfaces, and hypersurfaces, where $q$ is a quadratic expression in the principal curvatures $\kappa_j$. The particular case where
$$q = \sum_j \kappa_j^2$$
has arisen in a number of previous articles, e.g., [AlFu, Ha]. There, $H$ is the operator applied to a function $f$ that one obtains by linearizing a distortion of the surface according to
$$\frac{dx}{d\varepsilon} = f(x)\, N,$$
where $N$ is the unit normal vector.

* © 1997 by the authors. Reproduction of this article, in its entirety, by any means is permitted for noncommercial purposes.
** Work supported by N.S.F. grant DMS-9622730 and N.S.F. grant DMS-9500840 and the MSRI


As such H plays a role in the evolution of phase interfaces in materials, and stability analyses of these interfaces led Alikakos and Fusco [AlFu] to formulate a spectral-geometric conjecture:

Conjecture (Alikakos and Fusco). a) Suppose that $\Omega$ is a simply connected, smooth, compact surface in $\mathbb{R}^3$. The second eigenvalue of H with $q = \sum_j \kappa_j^2$ is maximized at 0 precisely when $\Omega$ is a sphere. b) Suppose that $\Omega$ is a simple, closed, smooth curve in the plane. The second eigenvalue of H with $q = \kappa^2$ is maximized at 0 precisely when $\Omega$ is a circle.

Note that the potential has the dimension (length$^{-2}$), the same as the differential operator. As a consequence the result is independent of the area of the surface or the length of the curve.

One of us [Ha] recently proved the conjecture in the two-dimensional case. In this case the claim follows immediately from the corresponding one where $q = 2\kappa_1\kappa_2$, i.e., when q is twice the Gauss curvature. The proof in that case is achieved in a very natural way, namely using a variational characterization of the eigenvalues of H with the help of conformal transplantations [Ha]. This approach certainly works in the case where the surface is of the same topological type as the sphere.

Curiously, the one-dimensional conjecture b), for the ordinary differential operator
$$H(\kappa) := -\frac{d^2}{ds^2} - \kappa^2,$$
has stubbornly resisted attacks along these lines. The second eigenvalue must be estimated with the min-max principle, requiring an orthogonalization, and the estimates to show that it is strictly negative except for the circle have been too delicately balanced to resolve the conjecture without restrictions. There has been progress: Alikakos [Al] was able to resolve the conjecture under certain symmetry assumptions, and Papanicolaou [Pa] resolved it locally, in the sense that if C is a sufficiently small perturbation of a circle without being an exact circle, then there are at least two negative eigenvalues. Papanicolaou interpreted $H(\kappa)$ as a Hill operator, that is, he took $\kappa(s)$ as a periodic function with integral $2\pi$, disregarding whether it is the curvature function of a closed curve. He also exhibited an example with a nonconstant $\kappa(s)$, for which the second eigenvalue of $H(\kappa)$ is positive. Thus the assumption that $\Omega$ is a closed curve is crucial for the theorem which will be proved in the following section.

In dimensions greater than two, there have been no results available until now. One barrier has been the proper choice of q; e.g., it is clear for dimensional reasons that the Gauss curvature is not a natural choice for q, except in two dimensions. The “good” choice for q (and this is a crucial insight of this article) is
$$q = \frac{1}{d}\, h^2.$$
Here d is the dimension of the surface and h is d times the mean curvature, i.e.,
$$h = \sum_{j=1}^{d} \kappa_j.$$
Our main result is the following:

Theorem 1. Let $\Omega$ be a smooth compact oriented hypersurface of dimension d immersed in $\mathbb{R}^{d+1}$; in particular self-intersections are allowed. The metric on that surface is the standard Euclidean metric inherited from $\mathbb{R}^{d+1}$. Then the second eigenvalue $\lambda_2$ of the operator
$$H = -\Delta - \frac{1}{d}\, h^2$$
is strictly negative unless $\Omega$ is a sphere, in which case $\lambda_2$ equals zero.

Remarks. (i) Using the Cauchy–Schwarz inequality, which gives $h^2 \le d \sum_j \kappa_j^2$, the Alikakos–Fusco conjecture is an immediate consequence of Theorem 1. (ii) Because under a change of length scale the operator simply picks up a constant factor, we are free to normalize so that the d-dimensional volume of $\Omega$ equals 1. We do this henceforth. (iii) No assumptions have to be made about the topology of the surface. Moreover, the theorem holds also if $\Omega$ is immersed in $\mathbb{R}^n$ for any n > d. (iv) Although obvious, it is worth pointing out that Theorem 1 is not an intrinsic statement about the surface; it contains also information about the embedding of $\Omega$ in Euclidean space $\mathbb{R}^{d+1}$. In particular, Theorem 1 and its proof do not make any claims about the two-dimensional case when q is twice the Gauss curvature.

The point of view which led us to a solution uses rather different ideas than normally used in differential geometry, where one usually deals with intrinsic quantities, independent of the embedding. Hence it may be of some value to give a general description of the strategy of proof. The key technical idea is to count the negative eigenvalues rather than estimating them. There is a method (which is now standard in mathematical quantum mechanics) for counting eigenvalues, due to Birman [Bi] and Schwinger [Sc], which provides the first step. We recall it here, following the review article of Simon [Si]:

Lemma (The Birman–Schwinger principle). Consider the self-adjoint operator $-\Delta - W^2(x)$, where $W^2$ is relatively bounded with respect to $-\Delta$ with bound less than 1 (for definition see [ReSi, p. 162]). A number $-\mu < 0$ is an eigenvalue of $-\Delta - W^2$ if and only if 1 is an eigenvalue of the bounded, positive operator
$$K_\mu := W(-\Delta + \mu)^{-1} W.$$

Remarks. (i) The multiplicities are also equal. (ii) The eigenvalues of $K_\mu$ are monotonically decreasing continuous functions of $\mu$ and tend to 0 as $\mu \to \infty$. Therefore, if we can locate an eigenvalue > 1, we can be sure that there is an eigenvalue = 1 for some larger value of $\mu$. (iii) In contrast to most uses of the Birman–Schwinger principle, where W is the positive root of the potential $W^2$, we do not assume $W \ge 0$.

Thus, the original problem, which is about an inequality, has been reduced to a problem about the asymptotics of $K_\mu$ as $\mu$ tends to zero. The analysis of that problem is relatively straightforward. In Sect. 2 we consider the case of a space curve. Not only does this provide most of the ideas in Sect. 3, i.e., for hypersurfaces of dimension d embedded in $\mathbb{R}^{d+1}$, but it also shows that the techniques work for embeddings of higher codimension.
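The equivalence asserted in the Birman–Schwinger lemma can be traced in a few lines. The following is only a formal sketch, assuming $\mu > 0$ and ignoring domain questions; the symbols $\psi$ and $\varphi$ are introduced here for illustration and do not appear in the text above.

```latex
% Formal sketch; \psi \neq 0 is an eigenfunction, \mu > 0 (assumptions).
\begin{align*}
(-\Delta - W^2)\psi = -\mu\psi
  &\iff (-\Delta + \mu)\psi = W^2\psi \\
  &\iff \psi = (-\Delta + \mu)^{-1} W\,(W\psi).
\end{align*}
% Put \varphi := W\psi and multiply the last identity by W:
\[
  \varphi = W(-\Delta + \mu)^{-1} W \varphi = K_\mu \varphi .
\]
% \varphi \neq 0, since otherwise (-\Delta + \mu)\psi = 0,
% which is impossible because -\Delta \ge 0 and \mu > 0.
% Conversely, if K_\mu\varphi = \varphi, set \psi := (-\Delta + \mu)^{-1} W\varphi;
% then W\psi = \varphi and (-\Delta + \mu)\psi = W\varphi = W^2\psi.
```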


2. An Extremal Property of the Circle

In this section we consider the problem of determining the closed curve C in $\mathbb{R}^3$, of fixed length normalized to 1, which maximizes the second eigenvalue of the self-adjoint differential operator
$$H(\kappa) := -\frac{d^2}{ds^2} - \kappa^2.$$
Here, s is the arc-length and $\kappa$ is the curvature, regarded as a given function of s. The domain of self-adjointness for this operator on the Hilbert space $L^2(C, ds)$ consists of periodic functions with absolutely continuous derivatives. If $\kappa(s)$ is a constant, then the curve is a circle (with $\kappa = 2\pi$), and it is an elementary observation that the first two eigenvalues are $-4\pi^2$ and 0 (degenerate). We prove the one-dimensional Alikakos–Fusco conjecture, with the less restrictive assumption that C is a space curve.

Theorem 2. Let C be a smooth curve in $\mathbb{R}^3$ with curvature $\kappa$. Then the second eigenvalue of
$$H(\kappa) := -\frac{d^2}{ds^2} - \kappa^2$$
is less than or equal to 0, with equality if and only if C is a circle.

Proof. We may normalize the length of the curve to 1. In the first step we show that whether or not H has two negative eigenvalues depends on the analysis of a simple functional independent of $\mu$, Eq. (1), below. The Birman–Schwinger operator in the case of a curve is
$$K_\mu := \kappa \left(-\frac{d^2}{ds^2} + \mu\right)^{-1} \kappa.$$
As remarked in the introduction, since the operator norm of $K_\mu$ tends to 0 as $\mu \to \infty$, we can show the existence of (at least) 2 negative eigenvalues of $H(\kappa)$ by showing that for sufficiently small $\mu > 0$, $K_\mu$ has 2 eigenvalues larger than 1, except in the case of the circle. By the min-max principle as applied to $K_\mu$, if we have two linearly independent functions $f_{1,2}$ such that the 2 × 2 matrix
$$M := \langle f_j, K_\mu f_k \rangle - \langle f_j, f_k \rangle$$
is strictly positive for sufficiently small $\mu$, then $K_\mu$ has the two desired eigenvalues > 1. We choose $f_1(s) = \kappa(s)$, and seek $f_2(s)$ bounded and orthogonal to $\kappa(s)$. For any such function, $\kappa f_2$ is orthogonal to 1, which is the eigenfunction of $-\frac{d^2}{ds^2}$ with eigenvalue 0. From the spectral theorem it follows that the operator $R_\mu := \left(-\frac{d^2}{ds^2} + \mu\right)^{-1}$ acts boundedly on $\kappa f_2$ as $\mu \to 0$. The limit $R_0$ exists on the set of functions orthogonal to 1 and could be written explicitly as an integral expression (with constants of integration ensuring periodicity and orthogonality to 1). With these choices of $f_{1,2}$, the matrix M is found to have the form
$$\begin{pmatrix} \dfrac{1}{\mu} \oint \kappa^2\, ds & O(1) \\[2mm] O(1) & \langle \kappa f_2, R_0 \kappa f_2 \rangle - \|f_2\|^2 + O(\mu) \end{pmatrix},$$
which will be positive provided that both its determinant and trace are positive. The trace is clearly positive for sufficiently small $\mu$, while the determinant will also be positive for sufficiently small $\mu$ if the functional
$$\Phi(f_2) := \frac{\langle \kappa f_2, R_0 \kappa f_2 \rangle}{\|f_2\|^2} > 1. \tag{1}$$
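Why condition (1) suffices for a positive determinant can be spelled out as follows. This is a sketch only; the abbreviations $a_\mu$, $b_\mu$, $c_\mu$ for the entries of M are introduced here and are not the authors' notation, and the upper-left entry is used only through its order of magnitude.

```latex
% a_\mu > 0 of order 1/\mu, b_\mu = O(1), c_\mu = lower-right entry (notation assumed here).
\begin{align*}
c_\mu &= \langle \kappa f_2, R_0 \kappa f_2\rangle - \|f_2\|^2 + O(\mu)
       = \|f_2\|^2\bigl(\Phi(f_2) - 1\bigr) + O(\mu), \\
\det M &= a_\mu c_\mu - b_\mu^2 .
\end{align*}
% If \Phi(f_2) > 1, then c_\mu \ge c > 0 for small \mu, so a_\mu c_\mu grows like 1/\mu
% while b_\mu^2 stays bounded; hence \det M > 0 for all sufficiently small \mu > 0.
% The trace a_\mu + c_\mu is positive for small \mu for the same reason.
```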

Synopsis: H has at least two negative eigenvalues if we can find $f_2(s)$ bounded, orthogonal to $\kappa(s)$, and satisfying (1). Let us therefore define
$$\Lambda := \sup \left\{ \Phi(f) : f \in L^2,\ f \not\equiv 0,\ \oint f(s)\, \kappa(s)\, ds = 0 \right\}. \tag{2}$$
Next we show that $\Lambda \ge 1$ by choosing as a test function f(s) any of the coordinates of the normal vector N to the curve. It will be convenient to recall the Frenet–Serret formulae for space curves: Let x denote the position of a point on C, embedded in $\mathbb{R}^3$. Then
$$\frac{dx}{ds} = T, \qquad \frac{dT}{ds} = \kappa N, \qquad \frac{dN}{ds} = -\kappa T + \tau B, \qquad \frac{dB}{ds} = -\tau N. \tag{3}$$
Since T is periodic, the formula for dT/ds guarantees that each component of N is orthogonal to $\kappa$, and is thus a suitable choice for (1). Since
$$\frac{d^2 x}{ds^2} = \kappa N, \tag{4}$$
we calculate:
$$R_0\, \kappa N_j = -\bigl(x_j(s) - y_j\bigr), \tag{5}$$
where $x_j(s)$ is the j-th coordinate of x and $y_j$ is the constant needed so that $x_j(s) - y_j$ is orthogonal to 1. Hence
$$\langle N_j, \kappa R_0 \kappa N_j \rangle = \left\langle \frac{dT_j}{ds}, R_0 \frac{dT_j}{ds} \right\rangle = -\left\langle \frac{dT_j}{ds}, x_j(s) - y_j \right\rangle = \langle T_j, T_j \rangle.$$
In the final step the boundary term in the integration by parts vanishes because the curve is closed. Summing on j, we obtain
$$\sum_{j=1}^{3} \langle N_j, \kappa R_0 \kappa N_j \rangle = \int_0^1 |T|^2 = 1 = \int_0^1 |N|^2 = \sum_{j=1}^{3} \langle N_j, N_j \rangle.$$
Either $\Phi(N_j) > 1$ for some j, or else $\Phi(N_j) = 1$ for all j = 1, 2, 3 (strictly speaking, in the case of a planar curve one of the $N_j$ might vanish identically, and $\Phi(N_j) = 1$ for the other two coordinates). This establishes that $\Lambda \ge 1$.
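The dichotomy above can be illustrated numerically for a planar ellipse. The sketch below is not part of the proof: it relies on the identity $\langle N_j, \kappa R_0 \kappa N_j \rangle = \langle T_j, T_j \rangle$ derived above, so that the functional of (1) evaluated at $N_j$ reduces to a ratio of two arc-length integrals. The function name `phi_values`, the ellipse parametrization and the midpoint quadrature are assumptions made for illustration; the ratio is scale invariant, so the curve is not normalized to length 1.

```python
import math

def phi_values(a, b, n=20000):
    """Return (Phi(N_1), Phi(N_2)) for the ellipse x = a cos t, y = b sin t.

    Hypothetical helper: uses <N_j, kappa R_0 kappa N_j> = <T_j, T_j>
    and, for a planar curve, N = (-T_2, T_1) up to orientation.
    """
    t1_sq = 0.0  # approximates the arc-length integral of T_1^2
    t2_sq = 0.0  # approximates the arc-length integral of T_2^2
    dt = 2.0 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt            # midpoint rule on [0, 2*pi]
        dx = -a * math.sin(t)
        dy = b * math.cos(t)
        speed = math.hypot(dx, dy)    # ds = speed * dt, T = (dx, dy) / speed
        t1_sq += dx * dx / speed * dt
        t2_sq += dy * dy / speed * dt
    # <N_1, N_1> = <T_2, T_2> and <N_2, N_2> = <T_1, T_1>
    return t1_sq / t2_sq, t2_sq / t1_sq

if __name__ == "__main__":
    print(phi_values(1.0, 1.0))  # circle: both values close to 1
    print(phi_values(2.0, 1.0))  # ellipse: one value above 1, the other below
```

For the circle both quotients equal 1, while for a genuine ellipse one of them exceeds 1, which is the situation in which H has two negative eigenvalues.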


If $\Lambda > 1$ we are done, so we now assume that $\Lambda = 1$, which means that each $N_j$ which does not vanish identically is an optimizer for the variational problem (2). We shall now demonstrate that this possibility implies that C is a circle. To this end we calculate the first variation of $\Phi$, and discover that a necessary condition for maximality is
$$\kappa R_0 \kappa N = N + \Gamma \kappa.$$
Here, $\Gamma$ is a vector of Lagrange multipliers, and the vectorial notation of this equation indicates that the operator $R_0$ operates on each Cartesian component. (In case some component $N_j$ vanishes identically, the equation holds trivially.) Using (5), the condition for maximality reads
$$-\kappa\,(x(s) - y + \Gamma) = N,$$
which implies among other things that $\kappa$ is bounded away from 0. Divided by $\kappa$, the equation becomes
$$-x(s) + y - \Gamma = \frac{N}{\kappa}, \tag{6}$$
and when we differentiate using the Frenet–Serret equations, we find
$$-T(s) = \frac{d}{ds}\left(\frac{N}{\kappa}\right) = \left(\frac{d}{ds}\frac{1}{\kappa}\right) N(s) - T(s) + \frac{\tau}{\kappa}\, B(s).$$
By comparing components we learn that $\tau = 0$ and $\kappa$ = constant. This implies that C is a circle, the formula for which is obtained by taking the magnitude of both sides of (6).

3. An Extremal Property of $S^d$

The higher-dimensional Theorem 1 hinges on the generalization of (4), that
$$-\Delta x = h N. \tag{7}$$
Here the vector x is simply the position of a point on $\Omega$ as embedded in $\mathbb{R}^{d+1}$. The vector notation in this equation indicates that the Laplace–Beltrami operator $\Delta$ acts on each of the d + 1 components of x independently as scalar functions; no Christoffel symbols are introduced. The useful identity (7) results from a direct, elementary calculation. Observe that the unit normal for a hypersurface is conventionally defined as outward, which will lead to some differences of sign from the ones used in (7) or for space curves, where N may be inward. We also remark for future purposes that none of the functions h(x) or $N_j(x)$ can vanish identically on a compact hypersurface.

Proof of Theorem 1. The proof will follow the conceptual outline of the one for space curves rather closely. As before, we look at the Birman–Schwinger operator, which in this case is
$$K_\mu := \frac{1}{d}\, h\, (-\Delta + \mu)^{-1} h.$$
We shall show that $K_\mu$ has two eigenvalues $\ge 1$ by projecting it onto the two-dimensional space spanned by two trial functions h and f, restricted so that
$$\int_\Omega h(x)\, f(x)\, d\mathrm{Vol} = 0.$$

Precisely the same argument as in Sect. 2 shows that the original operator H has two negative eigenvalues provided that the functional
$$\Phi(f) := \frac{\langle h f, R_0\, h f \rangle}{d\, \|f\|^2} > 1, \tag{8}$$
where the reduced resolvent $R_0$ is the limit as $\mu \downarrow 0$ of $(-\Delta + \mu)^{-1}$. This is well-defined on the space of functions of mean 0. The variational problem now concerns
$$\Lambda := \sup \left\{ \Phi(f) : f \in L^2(\Omega),\ f \not\equiv 0,\ \int_\Omega h(x)\, f(x)\, d\mathrm{Vol} = 0 \right\}. \tag{9}$$
In order to show that $\Lambda \ge 1$, we choose $f(x) = N_j(x)$, and sum over all j, to compute:
$$\frac{1}{d} \sum_{j=1}^{d+1} \langle N_j, h R_0 h N_j \rangle = \frac{1}{d} \sum_{j=1}^{d+1} \langle -\Delta x_j, R_0 (-\Delta x_j) \rangle = \frac{1}{d} \sum_{j=1}^{d+1} \int_\Omega |\nabla x_j|^2\, d\mathrm{Vol} = 1.$$
Summing the denominators of $\Phi(N_j)$,
$$\sum_{j=1}^{d+1} \langle N_j, N_j \rangle = 1;$$
we conclude as in Sect. 2 that either $\Phi(N_j) > 1$ for some j, or else $\Phi(N_j) = 1$ for all j. This establishes that $\Lambda \ge 1$. If $\Lambda = 1$, then each $N_j$ is an optimizer, and we next show that this implies that $\Omega$ is a sphere. The Euler–Lagrange equation (again using vector notation) now states that
$$\frac{1}{d}\, h R_0 h N = N + h\, \Gamma,$$
where $\Gamma$ is a (d + 1)-tuple of Lagrange multipliers. Using (7), this reads
$$h\,(x - y - d\, \Gamma) = d\, N,$$
which clearly shows that h cannot vanish. Dividing by h:
$$x - y - d\, \Gamma = \frac{d\, N}{h}. \tag{10}$$
If we now differentiate (10) along any curve in $\Omega$, the left side is a tangential vector, so the normal component of the derivative of the right side, i.e., the derivative of d/h, is 0. Thus h is constant. Together with (10), the constancy of h implies that $\Omega$ is a sphere.
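The equality case of Theorem 1 can be checked directly on a round sphere. The following worked computation is a sketch under the assumption that the hypersurface is the sphere of radius r centred at the origin (the volume normalization only rescales all eigenvalues by a common factor); it uses the standard spectrum of the Laplace–Beltrami operator on the sphere.

```latex
% Assumption: the hypersurface is the round sphere S^d_r of radius r centred at 0.
\begin{align*}
-\Delta x_j &= \frac{d}{r^2}\, x_j , &
h^2 &= \frac{d^2}{r^2} , &
H &= -\Delta - \frac{1}{d}\, h^2 = -\Delta - \frac{d}{r^2} .
\end{align*}
% Spectrum of -\Delta on S^d_r: \ell(\ell + d - 1)/r^2, \ell = 0, 1, 2, \dots
\[
\lambda_1 = -\frac{d}{r^2} \quad (\ell = 0), \qquad
\lambda_2 = \frac{d}{r^2} - \frac{d}{r^2} = 0 \quad (\ell = 1,\ \text{multiplicity } d+1).
\]
% The \ell = 1 eigenfunctions are the coordinate functions x_j, i.e. multiples of the N_j.
```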


4. Concluding Remarks

Two natural questions have not been fully addressed here. One of them is how a nontrivial topology can increase the number of negative eigenvalues of H beyond 2. This seems to be within reach for the case of a planar curve, where the topology is given by the winding number, and we believe that for winding number n there are at least 2n negative eigenvalues, except in the case of a (multiply traversed) circle. For space curves and spheres, however, it is not at all clear how the topology controls the number of negative eigenvalues. The second question has to do with the larger categories of potentials depending on curvature, as in the operator
$$-\Delta - \alpha \sum_j \kappa_j^2$$
for $0 \le \alpha < 1$. Such potentials were allowed in two dimensions in [Ha], which thus connects Theorem 1 in two dimensions with the one with $\alpha = 0$ of [He]. Since the second eigenvalue of the Laplace–Beltrami operator is known not to be maximized by the sphere in certain higher-dimensional settings [Ur], the result of this article will not extend to all $\alpha \ge 0$.

References

[Al] Alikakos, N.D.: Private communication
[AlFu] Alikakos, N.D. and Fusco, G.: The spectrum of the Cahn–Hilliard operator for generic interface in higher space dimensions. Indiana U. Math. J. 4, 637–674 (1993)
[Bi] Birman, M.S.: The spectrum of singular boundary problems. Mat. Sbornik 55, 125–174 (1961) (Am. Math. Soc. Trans. 53, 23–80 (1966))
[Ha] Harrell II, E.M.: On the second eigenvalue of the Laplace operator penalized by curvature. J. Diff. Geom. and Appl. 6, 397–400 (1996)
[He] Hersch, J.: Quatre propriétés isopérimétriques de membranes sphériques homogènes. C.R. Acad. Sci. Paris, sér. A-B 270, A1645–1648 (1970)
[Pa] Papanicolaou, V.G.: The second periodic eigenvalue and the Alikakos–Fusco conjecture. J. Diff. Eqns. 130, 321–332 (1996)
[ReSi] Reed, M. and Simon, B.: Methods of modern mathematical physics, II: Fourier analysis, self-adjointness. New York: Academic Press, 1975, p. 162
[Sc] Schwinger, J.: On the bound states of a given potential. Proc. Nat. Acad. Sci. U.S.A. 47, 122–129 (1961)
[Si] Simon, B.: On the number of bound states of two body Schrödinger operators – A review. In: Studies in mathematical physics, E. H. Lieb, B. Simon, A. S. Wightman eds. Princeton, NJ: Princeton Univ. Press, 1976, pp. 305–326
[Ur] Urakawa, H.: On the least positive eigenvalue of the Laplacian for compact group manifolds. J. Math. Soc. Japan 31, 209–226 (1979)

Communicated by B. Simon

Commun. Math. Phys. 195, 651 – 689 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Elliptic Quantum Groups $E_{\tau,\eta}(sl_2)$ and Quasi-Hopf Algebras

B. Enriquez¹, G. Felder²

¹ Centre de Mathématiques, URA 169 du CNRS, Ecole Polytechnique, 91128 Palaiseau, France
² D-Math, ETH-Zentrum, HG G46, CH-8092 Zürich, Switzerland

Received: 20 March 1997 / Accepted: 30 December 1997

Abstract: We construct an algebra morphism from the elliptic quantum group Eτ,η (sl2 ) to a certain elliptic version of the “quantum loop groups in higher genus” studied by V. Rubtsov and the first author. This provides an embedding of Eτ,η (sl2 ) in an algebra “with central extension”. In particular we construct L± -operators obeying a dynamical version of the Reshetikhin–Semenov-Tian-Shansky relations. To do that, we construct the factorization of a certain twist of the quantum loop algebra, that automatically satisfies the “twisted cocycle equation” of O. Babelon, D. Bernard and E. Billey, and therefore provides a solution of the dynamical Yang–Baxter equation.

Introduction

The aim of this paper is to compare the sl2-version of the elliptic quantum groups introduced by the second author ([14]) with quasi-Hopf algebras introduced by V. Rubtsov and the first author ([11, 12]). Elliptic quantum groups are presented by exchange (or “RLL”) relations, whereas the algebras of [11] are “quantum loop algebras”. Our result can be viewed as an elliptic version of the results of J. Ding and I. Frenkel ([4]) and of S. Khoroshkin ([21]), where Drinfeld’s quantum current algebra ([6]) was shown to be isomorphic with the Reshetikhin–Semenov L-operator algebra of [23, 13], in the trigonometric and rational case respectively.

Elliptic quantum groups are based on a matrix solution R(z, λ) of the dynamical Yang–Baxter equation (YBE). Here “dynamical” means that in addition to the spectral parameter z (belonging to an elliptic curve E), R depends on a parameter λ, which undergoes certain shifts in the Yang–Baxter equation. The RLL relations defining the elliptic quantum groups Eτ,η(sl2) are then an algebraic variant of the dynamical YBE.

In [3], O. Babelon, D. Bernard and E. Billey studied the relation between the dynamical and quasi-Hopf Yang–Baxter equations. They showed that given a family of twists of a quasi-triangular Hopf algebra, satisfying a certain “twisted cocycle equation”,


the quasi-Hopf YBE satisfied by the twisted R-matrices was indeed equivalent to the dynamical YBE.

The quantum loop algebras of [11] are generally associated with complex curves and rational differentials. As it was shown in [12], they can be endowed with a quasi-Hopf structure, quantizing Drinfeld’s “higher genus Manin pairs” ([7]). To make precise which quantum loop algebra should be associated with elliptic quantum groups, we first make a quasi-classical study (Sect. 1.1). The classical r-matrix $r_\lambda(z, w)$ associated with R(z − w, λ) corresponds to what we may call a “dynamical Manin triple”, that is to a family $g_\lambda$ of maximal isotropic complements of a fixed maximal isotropic subalgebra $g_O$ in a Lie algebra g, endowed with a nondegenerate inner product. Here g is a double extension of the Lie algebra $sl_2 \otimes K$, where K is the local field at the origin of the elliptic curve E of modulus τ, $g_O$ is a cocentral extension of $sl_2 \otimes O$, where O is the local ring of E at the same point, and $g_\lambda$ is an extension of the sum $(n_+ \otimes L_\lambda) \oplus (h \otimes L_0) \oplus (n_- \otimes L_{-\lambda})$, where the $L_\lambda$ are the sets of expansions at the origin of E of functions on its universal cover with certain transformation properties.

According to [12], this Manin pair $(g, g_O)$ defines quantum loop algebras $U_\hbar g_O \subset U_\hbar g(\tau)$; $U_\hbar g(\tau)$ is endowed with coproducts $\Delta$ and $\bar\Delta$, which are conjugated by a certain twist F. These algebras are presented in Sect. 1.2 (analogous relations can be found in [5]). Our aim is to find a solution of the dynamical YBE in this algebra, quantizing $r_\lambda(z, w)$. To do that, we will construct twisted cocycles (in the sense of [3]). For that, we follow the method of [9].
In that paper, we gave the construction of a Hopf algebra cocycle in the double Yangian algebra $DY(sl_2)$, by factorizing the Yangian analogue $F_{Yg}$ of F as a product $F_{Yg} = F_2 F_1$, with $F_1 \in A^{<0} \otimes A^{\ge 0}$ and $F_2 \in A^{\ge 0} \otimes A^{<0}$, with $A^{\ge 0}$ and $A^{<0}$ the subalgebras of $DY(sl_2)$ generated by the nonnegative and negative Fourier modes of the quantum currents. This method does not apply directly here: we look for a family of twists $(F^1_\lambda, F^2_\lambda)$. Moreover, we indeed have an analogue of $A^{\ge 0}$, that is $U_\hbar g_O$, but no analogue of $A^{<0}$. Our idea is to construct subalgebras of the algebra of families of elements of $U_\hbar g(\tau)^{\otimes 2}$ depending on λ, which play the role of $A^{<0} \otimes A^{\ge 0}$ and $A^{\ge 0} \otimes A^{<0}$, and to which $F^1_\lambda$ and $F^2_\lambda$ should belong (Sect. 3). The properties of these algebras $A^{-+}$ and $A^{+-}$ are based on properties of relations between “half-currents” (generating series for elements of deformations of $n_\pm \otimes L_{\pm\lambda}$ and $n_\pm \otimes O$), following from the vertex relations for $U_\hbar g(\tau)$ (Sects. 1.4, 1.5).

The decomposition of F as a product $F^2_\lambda F^1_\lambda$ is carried out in Sect. 4. It imitates the similar decomposition in [9]: applying certain projections to the decomposition identity leads us to guess the values of $F^{1,2}_\lambda$. We then show that the decomposition identity is indeed satisfied. The proof uses the Hopf algebra duality results of Sect. 2, and results on coproducts of Sect. 1.7.

Next we prove that the $F^1_\lambda$ obtained that way indeed satisfies a twisted cocycle equation. For that, we introduce subalgebras of $U_\hbar g(\tau)^{\otimes 3}$, analogous to $A^{\ge 0, <0} \otimes DY(sl_2)^{\otimes 3}$ and $DY(sl_2)^{\otimes 3} \otimes A^{\ge 0, <0}$, and show that they contain images by $\Delta$ and $\bar\Delta$ of the $A^{+-}$, $A^{-+}$ (Props. 3.2, 3.3). This shows that the ratio $\Phi_\lambda$ between the two sides of the twisted cocycle equation has a special form ($\sum_i 1 \otimes a_i \otimes h^i$, where h is the “Cartan element” of $U_\hbar g(\tau)$). We then prove a “twisted pentagon equation” for $\Phi_\lambda$ (Prop. 5.3), which in fact shows that it is equal to 1.
After that we can apply the result of [3], and obtain a solution $R_\lambda$ of the dynamical YBE in $U_\hbar g(\tau)^{\otimes 2}$ (Thm. 6.1). We next study level zero representations of $U_\hbar g(\tau)$; this study was carried out in the general case in [11]. We obtain a family $\pi_\zeta$ of 2-dimensional evaluation modules (Prop. 7.1), indexed by a point ζ of the formal neighborhood of the origin in E; we compute the image of $R_\lambda$ by these representations, and find an answer closely connected to R(z, λ). This enables us to prove that the L-operators $(\pi_\zeta \otimes 1)(R_\lambda)$ and $(1 \otimes \pi_\zeta)(R_\lambda)$ satisfy the following dynamical version of the Reshetikhin–Semenov-Tian-Shansky relations (Thm. 7.1):
$$R^\pm(\zeta - \zeta', \lambda)\, L^{\pm(1)}_{\lambda + \hbar h^{(2)}}(\zeta)\, L^{\pm(2)}_{\lambda}(\zeta') = L^{\pm(2)}_{\lambda + \hbar h^{(1)}}(\zeta')\, L^{\pm(1)}_{\lambda}(\zeta)\, R^\pm(\zeta - \zeta', \lambda + \hbar h),$$
$$L^{-(1)}_{\lambda}(\zeta)\, R^-(\zeta - \zeta', \lambda + \hbar h)\, L^{+(2)}_{\lambda}(\zeta') = L^{+(2)}_{\lambda + \hbar h^{(1)}}(\zeta')\, R^-(\zeta - \zeta' - K\hbar, \lambda)\, L^{-(1)}_{\lambda + \hbar h^{(2)}}(\zeta)\, \frac{A(\zeta, \zeta' + K\hbar)}{A(\zeta, \zeta')},$$
where $R^\pm(z, \lambda)$ are elliptic R-matrices and A is a certain elliptic version of the usual ratio of gamma-functions. These relations extend the RLL-relations of the elliptic quantum group, which we recover in Thm. 9.1.

Let us now say some words about problems connected to the present work. In [14, 15], quantum Knizhnik–Zamolodchikov–Bernard equations were defined as difference equations involving the dynamical R-matrix R(z, λ). It would be interesting to derive these equations from considerations involving coinvariants of $U_\hbar g(\tau)$-modules. This may also shed light on the question of what the equation for dependence in the moduli should be in the quantum situation. There is some indication that this equation is the Ruijsenaars–Schneider (RS) equation at the critical level. At that level, one may apply the Reshetikhin–Semenov method for expressing elements of the center of $U_\hbar g(\tau)$, and then explicitly compute their actions on coinvariant spaces. Recent work of A. Varchenko and the second author suggests that the situation is more complicated outside the critical level. It might be interesting to connect such an approach to the RS models with those of [1, 2].

Finally, it would be interesting to find an analogue of the theory developed in the present work, for the situations of higher genus ([11]). Recall that the classical r-matrices underlying the elliptic quantum groups are dynamical r-matrices for the Hitchin system associated with an elliptic curve. Dynamical r-matrices for Hitchin systems in higher genus have been introduced in [18, 8]. In this respect, it seems that a dynamical version of the Poisson–Lie theory would be of interest. Let us also mention here the work [20], where a dynamical approach to other “elliptic quantum groups” (those of [19]) is presented. After we finished this work, we learnt that S. Lukyanov and Ya. Pugai made some computations in the same direction, in connection with their work [22].

1. Quantum Loop Algebras Associated with Elliptic Curves

This section is organized as follows. In 1.1, we present the classical objects: the Lie quasi-bialgebra structures on g depending on λ ∈ C and the associated dynamical r-matrix $r_\lambda(z, w)$. In 1.2, we present the Hopf algebras introduced in [12], that quantize the Lie bialgebra structures associated with the decompositions
$$g = \bigl((n_\pm \otimes K) \oplus (h \otimes L_0) \oplus \mathbb{C}K\bigr) \oplus \bigl((n_\mp \otimes K) \oplus (h \otimes L_0) \oplus \mathbb{C}D\bigr).$$
This construction yields an algebra $U_\hbar g(\tau)$, endowed with conjugated coproducts $\Delta$ and $\bar\Delta$. We then study some properties of this algebra.


$U_\hbar g(\tau)$ contains a subalgebra $U_\hbar g_O$, deforming $U g_O$. The aim of 1.4 is to present relations for this subalgebra. For that, we split the generating currents e(z) and f(z) into “half-currents” corresponding to the decompositions $K = O \oplus L_{\pm\lambda}$, and we find relations for these half-currents. This allows us to give bases for $U_\hbar n_\pm(\tau)$ (Prop. 1.2). In 1.6, we derive the analogous relations, where λ is replaced by $\lambda + \hbar h$, where h = h[1] is a Cartan generator. In 1.7, we study the transformation properties of the field $K^+(z)$ with respect to z; this field occurs in the definitions of the coproducts $\Delta$ and $\bar\Delta$. Finally, in Sect. 1.7 we show that the images by $\Delta$ and $\bar\Delta$ of the generators x[r], r ∈ O and x[ρ], ρ ∈ $L_\lambda$, x = e, f, have particular forms.

1.1. The classical situation. In [12], we constructed quasi-Hopf algebras, associated to the general data of a Frobenius algebra, a maximal isotropic subalgebra of it, and an invariant derivation. An example of such data is the following. Let us fix a complex number τ, with Im(τ) > 0. Let $\Gamma \subset \mathbb{C}$ be the lattice $\mathbb{Z} + \tau\mathbb{Z}$; call E the elliptic curve $\mathbb{C}/\Gamma$. Let z be the coordinate on $\mathbb{C}$, and let ω be the differential form on E, equal to dz. Let $K = \mathbb{C}((z))$ be the completed local field of E at its origin 0, and $O = \mathbb{C}[[z]]$ the completed local ring at the same point. Endow K with the scalar product $\langle\,,\rangle_K$ defined by $\langle f, g \rangle_K = \mathrm{res}_0(f g\, \omega)$. Define on K the derivation ∂ to be equal to d/dz. Then ∂ is invariant w.r.t. $\langle\,,\rangle_K$, and O is a maximal isotropic subring of K.

Let us set $a = sl_2(\mathbb{C})$, and denote by $\langle\,,\rangle_a$ an invariant scalar product on a. Let us set $g = (a \otimes K) \oplus \mathbb{C}D \oplus \mathbb{C}K$; let us define on g the Lie algebra structure defined by the central extension of $a \otimes K$, $c(x \otimes f, y \otimes g) = \langle x, y \rangle_a \langle \partial f, g \rangle_K\, K$, and by the derivation $[D, x \otimes f] = x \otimes \partial f$. Let $g_O$ be the Lie subalgebra of g equal to $(a \otimes O) \oplus \mathbb{C}D$.
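The two properties of the residue pairing used here, isotropy of O and invariance of ∂, follow from elementary facts about formal Laurent series. The short computation below is a sketch added for illustration; it assumes only the definitions of the pairing, of ∂ and of the cocycle c given above.

```latex
% Isotropy of O = C[[z]]: for f, g in O the product fg has no z^{-1} term.
\[
  \langle f, g\rangle_K = \operatorname{res}_0(f g\, dz) = 0
  \qquad (f, g \in O).
\]
% Invariance of \partial = d/dz: a derivative of a Laurent series has no z^{-1} term.
\[
  \langle \partial f, g\rangle_K + \langle f, \partial g\rangle_K
  = \operatorname{res}_0\bigl(\partial(fg)\, dz\bigr) = 0
  \qquad (f, g \in K).
\]
% Consequently the cocycle is antisymmetric, using the symmetry of <,>_a:
\[
  c(x\otimes f, y\otimes g)
  = \langle x, y\rangle_a \langle \partial f, g\rangle_K\, K
  = -\langle y, x\rangle_a \langle \partial g, f\rangle_K\, K
  = -c(y\otimes g, x\otimes f).
\]
```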
Define $\langle\,,\rangle_{a \otimes K}$ as the tensor product of $\langle\,,\rangle_a$ and $\langle\,,\rangle_K$, and $\langle\,,\rangle_g$ as the scalar product on g defined by $\langle\,,\rangle_g|_{a \otimes K} = \langle\,,\rangle_{a \otimes K}$, $\langle D, a \otimes K \rangle_g = \langle K, a \otimes K \rangle_g = 0$, and $\langle D, K \rangle_g = 1$. Then $g_O$ is a maximal isotropic Lie subalgebra of g.

To define maximal isotropic supplementaries of $g_O$ in g, we first define certain subspaces of K. For λ ∈ $\mathbb{C}$, define $L_\lambda$ as follows. If λ does not belong to Γ, define $L_\lambda$ to be the set of expansions near 0 of all holomorphic functions f on $\mathbb{C} - \Gamma$, 1-periodic and such that $f(z + \tau) = e^{-2i\pi\lambda} f(z)$. For λ = 0, define $L_\lambda$ as the maximal isotropic subspace of K containing all holomorphic functions f on $\mathbb{C} - \Gamma$, Γ-periodic, such that $\oint_a f(z)\, dz = 0$, where a is the cycle $(i\epsilon, i\epsilon + 1)$ (with ε small and > 0). Finally, define $L_\lambda = e^{-2i\pi m z} L_0$ for λ = n + mτ.

Let θ be the holomorphic function defined on $\mathbb{C}$ by the conditions that $\theta'(0) = 1$, the only zeroes of θ are the points of Γ, $\theta(z + 1) = -\theta(z)$, and $\theta(z + \tau) = -e^{-i\pi\tau} e^{-2i\pi z} \theta(z)$. θ is then odd. We then have
$$L_\lambda = \bigoplus_{j \ge 0} \mathbb{C} \left(\frac{\theta'}{\theta}\right)^{(j)} e^{-2i\pi m z}, \quad \text{if } \lambda = n + m\tau, \tag{1}$$
$$L_\lambda = \bigoplus_{i \ge 0} \mathbb{C} \left(\frac{\theta(\lambda + z)}{\theta(z)}\right)^{(i)}, \quad \text{if } \lambda \in \mathbb{C} - \Gamma, \tag{2}$$
where for g ∈ K, we let $g' = \partial g$ and $g^{(i)} = \partial^i g$. Moreover, the orthogonal of $L_\lambda$ for the scalar product $\langle\,,\rangle_K$ is equal to $L_{-\lambda}$. Consider now the decomposition
$$g = g_O \oplus g_\lambda, \tag{3}$$
where
$$g_\lambda = (h \otimes L_0) \oplus (n_+ \otimes L_{-\lambda}) \oplus (n_- \otimes L_\lambda) \oplus \mathbb{C}K. \tag{4}$$
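That the generator of $L_\lambda$ appearing in (2) has the transformation properties required in the definition of $L_\lambda$ can be verified from the two functional equations of θ. The following lines are a sketch of that check; the name $f_\lambda$ is introduced here for illustration only.

```latex
% Notation assumed here: f_\lambda(z) := \theta(\lambda + z)/\theta(z), \lambda \notin \Gamma.
\begin{align*}
f_\lambda(z+1) &= \frac{-\theta(\lambda+z)}{-\theta(z)} = f_\lambda(z), \\
f_\lambda(z+\tau)
  &= \frac{-e^{-i\pi\tau}\, e^{-2i\pi(\lambda+z)}\, \theta(\lambda+z)}
          {-e^{-i\pi\tau}\, e^{-2i\pi z}\, \theta(z)}
   = e^{-2i\pi\lambda} f_\lambda(z).
\end{align*}
% Both properties are preserved by \partial = d/dz, so every derivative
% f_\lambda^{(i)} satisfies them as well, in agreement with (2).
```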

$g_O$ is a maximal isotropic subalgebra of g, and $g_\lambda$ is a maximal isotropic subspace of it. Therefore, (3) defines a Lie quasi-bialgebra structure on $g_O$, and (as in [12]) a double Lie quasi-bialgebra structure on g. Its classical r-matrix is given by the formula
$$r_\lambda = D \otimes K + \sum_i \left( \frac{1}{2}\, h[e^i] \otimes h[e_{i;0}] + e[e^i] \otimes f[e_{i;\lambda}] + f[e^i] \otimes e[e_{i;-\lambda}] \right),$$
for $(e^i)_{i \ge 0}$, $(e_{i;\lambda})_{i \ge 0}$ dual bases of O and $L_\lambda$, with the valuation of $e^i$ tending to infinity with i, and we denote $x \otimes f$ by x[f]; in other terms,
$$r_\lambda(z, w) = \frac{1}{2} (h \otimes h)\, \frac{\theta'}{\theta}(z - w) + (e \otimes f)\, \frac{\theta(z - w + \lambda)}{\theta(z - w)\, \theta(\lambda)} + (f \otimes e)\, \frac{\theta(z - w - \lambda)}{\theta(z - w)\, \theta(-\lambda)} + D \otimes K; \tag{5}$$

this formula (without $D \otimes K$) coincides with that of the classical r-matrix arising in the elliptic versions of the KZB equations (see [17]) and of the Hitchin system (see [10]).

Remark 1. O plays the role of the ring R, in the notation of [12].

1.2. Relations for $U_\hbar g(\tau)$. The quasi-Hopf algebras associated in [12] to the Lie quasi-bialgebras g and $g_O$ are twists of the Hopf algebra $(U_\hbar g(\tau), \Delta)$ and of its subalgebra $U_\hbar g_O$, that we now present. We will sometimes denote $U_\hbar g(\tau)$ by A(τ) or simply by A, and $U_\hbar g_O$ by $A^+$.

Let $\hbar$ be a formal parameter. Generators of $U_\hbar g(\tau)$ are D, K and the $x[\epsilon]$, x = e, f, h, $\epsilon \in K$; they are subject to the relations
$$x[\alpha\epsilon] = \alpha\, x[\epsilon], \qquad x[\epsilon + \epsilon'] = x[\epsilon] + x[\epsilon'], \qquad \alpha \in \mathbb{C},\ \epsilon, \epsilon' \in K.$$
They serve to define the generating series
$$x(z) = \sum_{i \in \mathbb{Z}} x[\epsilon^i]\, \epsilon_i(z), \qquad x = e, f, h,$$
$(\epsilon^i)_{i \in \mathbb{Z}}$, $(\epsilon_i)_{i \in \mathbb{Z}}$ dual bases of K; recall that $(e^i)_{i \in \mathbb{N}}$, $(e_{i;0})_{i \in \mathbb{N}}$ are dual bases of O and $L_0$ and set
$$h^+(z) = \sum_{i \in \mathbb{N}} h[e^i]\, e_{i;0}(z), \qquad h^-(z) = \sum_{i \in \mathbb{N}} h[e_{i;0}]\, e^i(z).$$
We will also use the series
$$K^+(z) = e^{\left(\frac{q^{\partial} - q^{-\partial}}{2\partial}\, h^+\right)(z)}, \qquad K^-(z) = q^{h^-(z)},$$

where q = e~ . The relations presenting U~ g(τ ) are then [K + (z), K + (w)] = [K − (z), K − (w)] = 0,

(6)

θ(z − w − ~)θ(z − w + ~ + ~K)K + (z)K − (w) −

(7)

= θ(z − w + ~)θ(z − w − ~ + ~K)K (w)K (z), θ(z − w + ~) e(w), θ(z − w − ~)

(8)

θ(w − z + ~K + ~) e(w), θ(w − z + ~K − ~)

(9)

K + (z)e(w)K + (z)−1 =

K − (z)e(w)K − (z)−1 =

K + (z)f (w)K + (z)−1 = −

−

−1

K (z)f (w)K (z)

+

θ(w − z + ~) f (w), θ(w − z − ~)

θ(z − w + ~) f (w), = θ(z − w − ~)

(10)

θ(z − w − ~)e(z)e(w) = θ(z − w + ~)e(w)e(z),

(11)

θ(w − z − ~)f (z)f (w) = θ(w − z + ~)f (w)f (z),

(12)

1 δ(z, w)K + (z) − δ(z, w − ~K)K − (w)−1 . ~ P Here δ denotes, as usual, the formal series i∈Z i (z)i (w). Let us introduce the generating series k + (z) and k − (z), defined by [e(z), f (w)] =

k + (z) = e(

q ∂ −1 + 2∂ h )(z)

,

k − (z) = q

(

1 1+q −∂

h− )(z)

;

(13)

(14)

they satisfy the relations K + (z) = k + (z)k + (z − ~),

K − (z) = k − (z)k − (z − ~).

(15)
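As a consistency check, relations (15) can be read off from the definitions (14). The sketch below assumes only that the modes of $h^+$ commute among themselves (and likewise for $h^-$), as relations (6) suggest, and that $q^{\pm\partial} = e^{\pm\hbar\partial}$ acts on series in $z$ as the shift $z\mapsto z\pm\hbar$.

```latex
% Exponents add because the modes of h^+ (resp. h^-) are assumed to commute.
\begin{align*}
k^+(z)k^+(z-\hbar)
  &= \exp\Big( \big(1+q^{-\partial}\big)\frac{q^{\partial}-1}{2\partial}\,h^+ \Big)(z)
   = \exp\Big( \frac{q^{\partial}-q^{-\partial}}{2\partial}\,h^+ \Big)(z) = K^+(z), \\
k^-(z)k^-(z-\hbar)
  &= q^{\left( (1+q^{-\partial})\frac{1}{1+q^{-\partial}}\,h^- \right)(z)}
   = q^{h^-(z)} = K^-(z),
\end{align*}
% using (1+q^{-\partial})(q^{\partial}-1) = q^{\partial}-q^{-\partial}.
```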

Equations (8), (9) and (10) may be replaced by

\[
k^+(z)e(w)k^+(z)^{-1} = \frac{\theta(z-w+\hbar)}{\theta(z-w)}\,e(w), \qquad k^-(z)e(w)k^-(z)^{-1} = \frac{\theta(w-z+\hbar K)}{\theta(w-z+\hbar K-\hbar)}\,e(w), \tag{16}
\]

and

Elliptic Quantum Groups Eτ,η (sl2 ) and Quasi-Hopf Algebras

\[
k^+(z)f(w)k^+(z)^{-1} = \frac{\theta(w-z)}{\theta(w-z-\hbar)}\,f(w), \qquad k^-(z)f(w)k^-(z)^{-1} = \frac{\theta(z-w+\hbar)}{\theta(z-w)}\,f(w); \tag{17}
\]
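The following short computation illustrates why (16) and (17) can replace (8), (9) and (10): combining them with the factorization (15) gives back the conjugation relations for $K^\pm(z)$. It uses nothing beyond (15), (16), (17) and the oddness of $\theta$; it is a sketch of the forward implication only.

```latex
\begin{align*}
K^+(z)e(w)K^+(z)^{-1}
  &= \frac{\theta(z-w+\hbar)}{\theta(z-w)}\cdot\frac{\theta(z-\hbar-w+\hbar)}{\theta(z-\hbar-w)}\,e(w)
   = \frac{\theta(z-w+\hbar)}{\theta(z-w-\hbar)}\,e(w), \\
K^-(z)e(w)K^-(z)^{-1}
  &= \frac{\theta(w-z+\hbar K)}{\theta(w-z+\hbar K-\hbar)}\cdot\frac{\theta(w-z+\hbar+\hbar K)}{\theta(w-z+\hbar K)}\,e(w)
   = \frac{\theta(w-z+\hbar K+\hbar)}{\theta(w-z+\hbar K-\hbar)}\,e(w), \\
K^+(z)f(w)K^+(z)^{-1}
  &= \frac{\theta(w-z)}{\theta(w-z-\hbar)}\cdot\frac{\theta(w-z+\hbar)}{\theta(w-z)}\,f(w)
   = \frac{\theta(w-z+\hbar)}{\theta(w-z-\hbar)}\,f(w), \\
K^-(z)f(w)K^-(z)^{-1}
  &= \frac{\theta(z-w+\hbar)}{\theta(z-w)}\cdot\frac{\theta(z-w)}{\theta(z-w-\hbar)}\,f(w)
   = \frac{\theta(z-w+\hbar)}{\theta(z-w-\hbar)}\,f(w).
\end{align*}
```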

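moreover, we will have

\[
[k^\pm(z),k^\pm(w)] = 0, \qquad k^+(z)k^-(w) = \varphi_K(z-w)\,k^-(w)k^+(z), \tag{18}
\]

where $\varphi_K(\zeta)$ is the formal series of $1+\hbar\,\mathbb{C}[[\zeta]][[\hbar]]$ defined by the functional equation

\[
\varphi_K(\zeta)\varphi_K(\zeta-\hbar) = \frac{\theta(\zeta-\hbar+\hbar K)}{\theta(\zeta-\hbar)}\cdot\frac{\theta(\zeta)}{\theta(\zeta+\hbar K)}; \tag{19}
\]

if $K = -2p$, with $p$ integer, we have

\[
\varphi_K(\zeta) = \frac{\theta(\zeta+\hbar)\,\theta(\zeta+\hbar-2p\hbar)\prod_{j=1}^{p-1}\theta(\zeta+\hbar-2j\hbar)^2}{\prod_{j=0}^{p-1}\theta(\zeta-2j\hbar)^2}. \tag{20}
\]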
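One way to see where the functional equation (19) comes from is to compare (18) with (7) through the factorization (15). The sketch below assumes that $K$ is central, so that the scalar factors $\varphi_K$ can be moved freely, and reads the right-hand side of (19) as a product of the two displayed ratios; the shorthand $g$ and $\zeta = z-w$ are introduced here only for the computation.

```latex
% Exchange each k^+ factor of K^+(z) with each k^- factor of K^-(w), using (18).
\begin{align*}
K^+(z)K^-(w)
  &= k^+(z)k^+(z-\hbar)\,k^-(w)k^-(w-\hbar) \\
  &= \varphi_K(\zeta-\hbar)\,\varphi_K(\zeta)^2\,\varphi_K(\zeta+\hbar)\;K^-(w)K^+(z)
   = g(\zeta)\,g(\zeta+\hbar)\;K^-(w)K^+(z),
\end{align*}
% where g(\zeta) := \varphi_K(\zeta)\varphi_K(\zeta-\hbar). Inserting (19):
\begin{align*}
g(\zeta)\,g(\zeta+\hbar)
  &= \frac{\theta(\zeta-\hbar+\hbar K)\,\theta(\zeta)}{\theta(\zeta-\hbar)\,\theta(\zeta+\hbar K)}
     \cdot
     \frac{\theta(\zeta+\hbar K)\,\theta(\zeta+\hbar)}{\theta(\zeta)\,\theta(\zeta+\hbar+\hbar K)}
   = \frac{\theta(\zeta+\hbar)\,\theta(\zeta-\hbar+\hbar K)}{\theta(\zeta-\hbar)\,\theta(\zeta+\hbar+\hbar K)},
\end{align*}
% which is exactly the ratio prescribed by relation (7).
```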

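The algebra $U_\hbar\mathfrak{g}(\tau)$ is endowed with a Hopf structure given by the coproduct $\Delta$ defined by

\[
\Delta(k^+(z)) = k^+(z)\otimes k^+(z), \qquad \Delta(k^-(z)) = k^-(z)\otimes k^-(z+\hbar K_1), \tag{21}
\]

\[
\Delta(e(z)) = e(z)\otimes K^+(z) + 1\otimes e(z), \tag{22}
\]

\[
\Delta(f(z)) = f(z)\otimes 1 + K^-(z)^{-1}\otimes f(z+\hbar K_1), \tag{23}
\]

\[
\Delta(D) = D\otimes 1 + 1\otimes D, \qquad \Delta(K) = K\otimes 1 + 1\otimes K, \tag{24}
\]

the counit $\varepsilon$, and the antipode $S$ defined by them; we set $K_1 = K\otimes 1$, $K_2 = 1\otimes K$.

$U_\hbar\mathfrak{g}(\tau)$ is also endowed with another Hopf structure given by the coproduct $\bar\Delta$ defined by

\[
\bar\Delta(k^+(z)) = k^+(z)\otimes k^+(z), \qquad \bar\Delta(k^-(z)) = k^-(z)\otimes k^-(z+\hbar K_1), \tag{25}
\]

\[
\bar\Delta(e(z)) = e(z-\hbar K_2)\otimes K^-(z-\hbar K_2)^{-1} + 1\otimes e(z), \tag{26}
\]

\[
\bar\Delta(f(z)) = f(z)\otimes 1 + K^+(z)\otimes f(z), \tag{27}
\]

\[
\bar\Delta(D) = D\otimes 1 + 1\otimes D, \qquad \bar\Delta(K) = K\otimes 1 + 1\otimes K, \tag{28}
\]

the counit $\varepsilon$, and the antipode $\bar S$ defined by them.

In [12], we showed that there is some Hopf duality within $U_\hbar\mathfrak{g}$. $(\alpha^i)$, $(\alpha_i)$ being dual bases of its subalgebras $U_\hbar\mathfrak{n}_+(\tau)$ and $U_\hbar\mathfrak{n}_-(\tau)$ generated by the $e[\epsilon]$ and $f[\epsilon]$, $\epsilon$ in $\mathcal{K}$, set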

\[
F = \sum_i \alpha^i\otimes\alpha_i. \tag{29}
\]

Then we have $\bar\Delta = \mathrm{Ad}(F)\circ\Delta$. $F$ satisfies the cocycle equation

\[
(F\otimes 1)(\Delta\otimes 1)(F) = (1\otimes F)(1\otimes\Delta)(F) \tag{30}
\]

(see [12], Prop. 3.1).

Remark 2. In the notation of [12], 8.2, we have

\[
q(z,w) = \frac{\theta(z-w+\hbar)}{\theta(z-w-\hbar)},
\]

and $\sigma(z) = z-\hbar K$. We also have the relation

\[
[h^+(z),h^-(w)] = \frac{2}{\hbar}\left(\frac{\theta'}{\theta}(z-w) - \frac{\theta'}{\theta}(z-w+\hbar K)\right).
\]

1.3. Completions. The algebras and vector spaces introduced above possess natural topologies: the field $\mathcal{K}$ and the ring $\mathcal{O}$ are given the formal series topology; on the other hand, the spaces $L_\lambda$ are given the discrete topology. For $V$, $W$ two topological vector spaces, with bases of neighborhoods of the origin $(V_a)_{a\in\mathbb{Z}}$ and $(W_b)_{b\in\mathbb{Z}}$, define $V\hat\otimes W$ as the inverse limit of the $V\otimes W/V_a\otimes W_b$, and $V\bar\otimes W$ as the inverse limit of the $(V\otimes W)/(V_a\otimes W + V\otimes W_b)$.

Define then the completed tensor algebra $\hat T(V)$ of $V$ as the direct sum $\oplus_{i\ge 0}V^{\hat\otimes i}$. Then $U_\hbar\mathfrak{g}(\tau)$ is viewed as a quotient of $\hat T(\mathfrak{g})$, and is endowed with the corresponding topology. $U_\hbar\mathfrak{g}_{\mathcal{O}}$ is then a closed subspace of $U_\hbar\mathfrak{g}(\tau)$. On the other hand, the fields $x(z)$ belong to the completed tensor product $U_\hbar\mathfrak{g}(\tau)\bar\otimes\mathcal{K}$. The coproduct $\Delta$ maps $U_\hbar\mathfrak{g}(\tau)$ to the completion of its tensor square defined as the suitable quotient of $\hat T(\mathfrak{g}\oplus\mathfrak{g})$. In the sequel we will consider tensor products of subspaces of $U_\hbar\mathfrak{g}(\tau)$ to be completed w.r.t. the topology of $\hat T(\mathfrak{g}\oplus\mathfrak{g})$, $\hat T(\mathfrak{g}\oplus\mathfrak{g}\oplus\mathfrak{g})$, etc.

1.4. Relations for half-currents. Fix a complex number $\lambda$ and set for $x = e, f, K^+$,

\[
x^+_\lambda(z) = \sum_i x[e^i]\,e_{i;\lambda}(z), \tag{31}
\]

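and for $x = e, f, K^-$,

\[
x^-_\lambda(z) = \sum_i x[e_{i;-\lambda}]\,e^i(z); \tag{32}
\]

recall that $(e^i)$, $(e_{i;\lambda})$ are dual bases of $\mathcal{O}$ and $L_\lambda$. The fields $e(z)$ and $f(z)$ are then split according to

\[
e(z) = e^+_\lambda(z) + e^-_\lambda(z), \qquad f(z) = f^+_{-\lambda}(z) + f^-_{-\lambda}(z); \tag{33}
\]

we call the expressions $x^\pm_\lambda(z)$ "half-currents". In the above equality, we made use of the continuous inclusions of $U_\hbar\mathfrak{g}_{\mathcal{O}}\bar\otimes L_\lambda$ and of $U_\hbar\mathfrak{g}\bar\otimes\mathcal{O}$ into $U_\hbar\mathfrak{g}\bar\otimes\mathcal{K}$.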


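For $x = e, f$, we have $x^-_\lambda(z)\in U_\hbar\mathfrak{g}(\tau)\bar\otimes\mathcal{O}$, so that $x^-_\lambda(z)$ can be viewed as a formal series in $z$, regular at $0$, and $x^+_\lambda(z)\in U_\hbar\mathfrak{g}(\tau)\bar\otimes L_\lambda$, so that $x^+_\lambda(z)$ can be viewed as a function of $z$, satisfying

\[
x^+_\lambda(z+1) = x^+_\lambda(z), \qquad x^+_\lambda(z+\tau) = e^{-2i\pi\lambda}x^+_\lambda(z). \tag{34}
\]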
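A short sanity check relating (34) to the r-matrix (5). It assumes (this normalization is not restated in this section) that $\theta$ is the odd theta function with $\theta(u+1) = -\theta(u)$ and $\theta(u+\tau) = -e^{-i\pi\tau-2i\pi u}\theta(u)$. Under that assumption the coefficient of $e\otimes f$ in (5), viewed as a function of $z$, has the same multipliers as in (34); the name $g_\lambda$ is introduced here only for the check.

```latex
% g_\lambda(u) is the coefficient of e \otimes f in (5), with u = z - w.
\[
g_\lambda(u) := \frac{\theta(u+\lambda)}{\theta(u)\,\theta(\lambda)},
\qquad
g_\lambda(u+1) = \frac{-\theta(u+\lambda)}{-\theta(u)\,\theta(\lambda)} = g_\lambda(u),
\]
\[
g_\lambda(u+\tau)
 = \frac{-e^{-i\pi\tau-2i\pi(u+\lambda)}\,\theta(u+\lambda)}{-e^{-i\pi\tau-2i\pi u}\,\theta(u)\,\theta(\lambda)}
 = e^{-2i\pi\lambda}\,g_\lambda(u).
\]
```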

We then have: ± Proposition 1.1. The generating series e± λ (z), fλ (z) satisfy the following relations: 0 0 θ(w − z − λ)θ(−~) 0 θ(z − w − ~) e (z)eλ−~ (w) + 0 e (w)eλ−~ (w) θ(z − w) λ+~ θ(w − z)θ(−λ) λ+~

=

(35)

θ(z − w + ~) 0 θ(z − w − λ)θ(−~) e (w)eλ−~ (z) + 0 e (z)eλ−~ (z), θ(z − w) λ+~ θ(z − w)θ(−λ) λ+~

θ(z − w + ~) θ(w − z − λ)θ(~) 0 0 0 f f (z)fλ+~ (w) + 0 (w)fλ+~ (w) θ(z − w) λ−~ θ(w − z)θ(−λ) λ−~ θ(z − w − ~) 0 θ(z − w − λ)θ(~) = f f (w)fλ+~ (z) + 0 (z)fλ+~ (z), θ(z − w) λ−~ θ(z − w)θ(−λ) λ−~

(36)

where $\epsilon$, $\epsilon'$ take the values $+$, $-$. In these relations, the expressions of the form $\frac{1}{z-w}\big(g_1(z,w)x^+_\lambda(z) - g_2(z,w)x^+_\lambda(w)\big)$, resp. $\frac{1}{z-w}\big(g_1(z,w)x^-_\lambda(z) - g_2(z,w)x^-_\lambda(w)\big)$, where $g_1$, $g_2$ are formal series in $z$, $w$ coinciding for $z = w$, and $x = e, f, K^+$, should be understood as the sums

\[
\sum_{i\ge 0} x[e^i]\,\frac{g_1(z,w)e_{i,\lambda}(z) - g_2(z,w)e_{i,\lambda}(w)}{z-w},
\]

resp.

\[
\sum_{i<0} x[e_{i,\lambda}]\,\frac{g_1(z,w)e^i(z) - g_2(z,w)e^i(w)}{z-w},
\]

which belong to $U_\hbar\mathfrak{g}(\tau)\otimes(\mathcal{K}\otimes\mathcal{K})$.

Proof. Let us show relation (35) in the case $\epsilon = \epsilon' = +$. Let us denote by $Z_{++}(z,w)$ the difference of the left- and right-hand sides of this equation, and by $\ell$ any continuous linear form on $A(\tau)$. Clearly, $\ell(Z_{++}(z,w))$ belongs to $L_\lambda\otimes L_\lambda$ (recall that $\ell[e^ie^j]$ is equal to zero when $i$ or $j$ is large enough), and is antisymmetric. (We attach coordinates $z$, $w$ to the first and second factors of the tensor product.) On the other hand, the difference of $\ell(\theta(z-w)Z_{++}(z,w))$ and of $\ell(\theta(z-w-\hbar)e(z)e(w) - \theta(z-w+\hbar)e(w)e(z))$ can be expressed as a sum of quadratic monomials in the $e^\pm_{\lambda\pm\hbar}(z,w)$ using at least one $e^-_{\lambda\pm\hbar}(z,w)$, and therefore belongs to $\mathcal{K}\bar\otimes\mathcal{O} + \mathcal{O}\bar\otimes\mathcal{K}$. Therefore, the same is true for $(z-w)\ell(Z_{++}(z,w))$. Let us set $(z-w)\ell(Z_{++}(z,w)) = Y_1 + Y_2$, with $Y_1\in\mathcal{K}\bar\otimes\mathcal{O}$, $Y_2\in\mathcal{O}\bar\otimes\mathcal{K}$. Let us denote by a tilde the exchange of arguments $z$ and $w$, and set $Y = (Y_1+\tilde Y_2)/2$. We have $Y\in\mathcal{K}\bar\otimes\mathcal{O}$ and

\[
(z-w)\,\ell(Z_{++}(z,w)) = Y + \tilde Y.
\]

Set now $Z = Y\sum_{i\ge 0}z^{-i-1}w^i$; since $Y$ belongs to $\mathcal{K}\bar\otimes\mathcal{O} = \mathcal{K}[[w]]$, so does $Z$. Since we have

\[
(z-w)\big[\ell(Z_{++}(z,w)) - Z + \tilde Z\big] = 0,
\]

it follows that for some $g$ in $\mathcal{K}$, we have

\[
\ell(Z_{++}(z,w)) - Z + \tilde Z = g(z)\delta(z,w). \tag{37}
\]

Consider both sides of (37) as the kernel of some operator, defined by $(Tg_1)(z) = \mathrm{res}_0(g_1(z)\delta(z,w)dw)$. Since the l.h.s. of (37) is antisymmetric in $z$ and $w$, this operator should be antisymmetric, i.e. satisfy $\langle Tg_1,g_2\rangle_{\mathcal{K}} + \langle g_1,Tg_2\rangle_{\mathcal{K}} = 0$, for $g_1, g_2\in\mathcal{K}$. On the other hand, $T$ coincides with the multiplication operator by $g$. It follows that $g = 0$. Therefore $\ell(Z_{++}(z,w)) = Z - \tilde Z$. Since the left and right hand sides of this equality belong to $L_\lambda\otimes L_\lambda$ and to $\mathcal{O}\bar\otimes\mathcal{K} + \mathcal{K}\bar\otimes\mathcal{O}$ respectively, $\ell(Z_{++}(z,w)) = 0$. This proves (35) in the case where $\epsilon = \epsilon' = +$.

Let us now show (35) in the case where $\epsilon = \epsilon' = -$. Let us denote by $Z_{--}(z,w)$ the difference of the left and right hand sides of this equation. Clearly, $\ell(Z_{--}(z,w))$ belongs to $(\mathcal{O}\hat\otimes\mathcal{O})[[\hbar]]$, and is antisymmetric. On the other hand, the difference of $\theta(z-w)\ell(Z_{--}(z,w))$ with $\ell(\theta(z-w-\hbar)e(z)e(w) - \theta(z-w+\hbar)e(w)e(z))$ belongs to $F_{\lambda,*} + F_{*,\lambda}$, where $F_{\lambda,*}$ is the subspace of $\mathrm{Hol}(\mathbb{C}-0)((w))$ formed by the functions $g(z,w)$ such that

\[
g(z+1,w) = g(z,w), \qquad g(z+\tau,w) = -e^{-i\pi\tau}e^{-2i\pi\lambda}e^{-2i\pi(z-w)}g(z,w),
\]

and $F_{*,\lambda}$ is the subspace of $\mathrm{Hol}(\mathbb{C}-0)((z))$ formed by the functions $g(z,w)$ such that

\[
g(z,w+1) = g(z,w), \qquad g(z,w+\tau) = -e^{-i\pi\tau}e^{-2i\pi\lambda}e^{-2i\pi(w-z)}g(z,w);
\]

here $\mathrm{Hol}(\mathbb{C}-0)$ is the space of holomorphic functions defined on $\mathbb{C}-0$. Set $\theta(z-w)\ell(Z_{--}(z,w)) = Y_1' + Y_2'$, with $Y_1'\in F_{\lambda,*}$ and $Y_2'\in F_{*,\lambda}$. Set $Y' = (Y_1' + \tilde Y_2')/2$. We have $Y'\in F_{\lambda,*}$ and $\theta(z-w)\ell(Z_{--}(z,w)) = Y' + \tilde Y'$. Set now

\[
Z' = Y'\sum_{i\ge 0}(\theta^{-1})^{(i)}(z)\frac{(-w)^i}{i!};
\]

then $Z'$ belongs to $\mathrm{Hol}(\mathbb{C}-0)((w))$, and we have as before $\ell(Z_{--}(z,w)) = Z' - \tilde Z' + g(z)\delta(z,w)$, for some $g\in\mathcal{K}$. The same reasoning as above shows that $g = 0$ and then that $\ell(Z_{--}(z,w)) = 0$.

Let us now prove (35) in the case where $\epsilon = +$, $\epsilon' = -$. Let us denote by $Z_{+-}(z,w)$ the difference of the left and right hand sides of this equality. Let us subtract from (11) the sum of equalities (35) with $\epsilon = \epsilon' = +$ and $\epsilon = \epsilon' = -$. We obtain that $Z_{+-}(z,w) + Z_{+-}(w,z) = 0$. Therefore $Z_{+-}(z,w)$ is antisymmetric in $z$ and $w$. On the other hand, we have for any linear functional $\ell$ on $A(\tau)$, $\ell(Z_{+-}(z,w))\in L_\lambda\bar\otimes\mathcal{O}$. Since the intersection of $L_\lambda$ and $\mathcal{O}$ is zero, $Z_{+-}(z,w)$ is equal to zero. Therefore (35) is valid in the case $\epsilon = +$, $\epsilon' = -$. The case $\epsilon = -$, $\epsilon' = +$ is obtained from $\epsilon = +$, $\epsilon' = -$, by exchanging $z$ and $w$. Relations (36) can be obtained in a similar way.


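Let us define $U_\hbar\mathfrak{n}_+(\tau)$ and $U_\hbar\mathfrak{n}_-(\tau)$ as the subalgebras of $A(\tau)$ generated by the $e[\epsilon]$, $\epsilon\in\mathcal{K}$ and the $f[\epsilon]$, $\epsilon\in\mathcal{K}$. Let us denote by $U_+^{(e)}(\tau)$, resp. $U_+^{(f)}(\tau)$ the subalgebras of $U_\hbar\mathfrak{n}_\pm(\tau)$ generated by the $e[r]$, $r\in\mathcal{O}$, resp. $f[r]$, $r\in\mathcal{O}$. For $\mu\in\mathbb{C}-0$, and $x = e, f, h$, denote by $x^-_\mu[\epsilon]$ the element $x[p_{-\mu}(\epsilon)]$, where $p_{-\mu}$ is the projection on $L_{-\mu}$ parallel to $\mathcal{O}$. For $\beta\in\mathbb{C}$, define also $x^-_{\mu+\beta\hbar}[\epsilon]$ as $\sum_{i\ge 0}\frac{\partial^i x^-_\mu[\epsilon]}{\partial\mu^i}\frac{(\beta\hbar)^i}{i!}$.

Let us denote by $U^{(e)}_{\lambda,-}(\tau)$, $U^{(f)}_{\lambda,-}(\tau)$ the subspaces of $U_\hbar\mathfrak{n}_\pm(\tau)$ linearly spanned by the products $e^-_{-\lambda+2n\hbar}[\eta_0]\cdots e^-_{-\lambda}[\eta_n]$, resp. by the products $f^-_\lambda[\eta_0]\cdots f^-_{\lambda+2n\hbar}[\eta_n]$, $n\ge 0$, $\eta_i\in\mathcal{K}$. Let $(\epsilon^i)_{i\in\mathbb{Z}}$ be a basis of $\mathcal{K}$ such that $\epsilon^i = e^i$ for $i\ge 0$. We can now formulate a Poincaré–Birkhoff–Witt result for $U_\hbar\mathfrak{n}_\pm(\tau)$.

Proposition 1.2. 1) Bases of $U_+^{(e)}(\tau)$ and of $U_+^{(f)}(\tau)$ are respectively given by the monomials $(e[\epsilon^{i_1}]\cdots e[\epsilon^{i_p}])_{0\le i_1\le\cdots\le i_p}$ and $(f[\epsilon^{i_1}]\cdots f[\epsilon^{i_p}])_{0\le i_1\le\cdots\le i_p}$; bases of $U^{(e)}_{\lambda,-}(\tau)$ and $U^{(f)}_{\lambda,-}(\tau)$ are respectively given by the $(e^-_{-\lambda+2n\hbar}[\epsilon^{i_0}]\cdots e^-_{-\lambda}[\epsilon^{i_n}])_{i_0\le\cdots\le i_n<0}$, resp. by the $(f^-_\lambda[\epsilon^{i_0}]\cdots f^-_{\lambda+2n\hbar}[\epsilon^{i_n}])_{i_0\le\cdots\le i_n<0}$.

2) The maps $U^{(e)}_{\lambda,-}(\tau)\otimes U_+^{(e)}\to U_\hbar\mathfrak{n}_+(\tau)$, $U_+^{(e)}\otimes U^{(e)}_{\lambda,-}(\tau)\to U_\hbar\mathfrak{n}_+(\tau)$, $U^{(f)}_{\lambda,-}(\tau)\otimes U_+^{(f)}\to U_\hbar\mathfrak{n}_-(\tau)$, $U_+^{(f)}\otimes U^{(f)}_{\lambda,-}(\tau)\to U_\hbar\mathfrak{n}_-(\tau)$, induced by the multiplication, define vector space isomorphisms.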

Proof. We should first derive identities expressing the e[i ]e[j ], i > j in terms of combinations of the e[k ]e[l ], k ≤ l, for i, j, k, l ≥ 0; the e−λ+2~ [i ]e−λ [j ], i > j in terms of the e−λ+2~ [k ]e−λ [l ], k ≤ l, for i, j, k, l < 0; the e−λ [i ]e[j ], i < 0 ≤ j in terms of the e[k ]e−λ [l ], l < 0 ≤ k, and the e[i ]e−λ [j ], j < 0 ≤ i in terms of the e−λ [k ]e[j ], k < 0 ≤ l. For this, we may first assume that i = z i . Then we multiply (35) by z − w and combine the Fourier coefficients as in [12], Sect. 4. This proves that (e) the families of 1) generate U+(e) and Uλ,− , and that the two first maps of 2) are surjective. The facts that these families are free, and that these maps are injective, follow from [12], Lemma 4.4. Remark 3. An informal way to derive (35) is the following one. For example, if = 0 = +, we have I I θ(ζ − z − λ) θ(ζ − z − λ) − + e(ζ)dζ, eλ (z) = e(ζ)dζ, (38) eλ (z) = − θ(ζ − z)θ(−λ) θ(ζ − z)θ(−λ) C0 C0,z C0,z , resp. C0− being a contour encircling 0 and z (resp. 0) counterclockwise (resp. clockwise), and I I θ(ζ − z + λ) θ(ζ − z + λ) − + f (ζ)dζ, f−λ f (ζ)dζ. (39) (z) = (z) = f−λ C0− θ(ζ − z)θ(λ) C0,z θ(ζ − z)θ(λ) Multiply the identity θ(ζ − ζ 0 − ~) θ(ζ − ζ 0 + ~) 0 0 e(ζ)e(ζ e(ζ )e(ζ) ) = θ(ζ − ζ 0 ) θ(ζ − ζ 0 )


θ(ζ −w−λ) θ(ζ−z−λ) − 0− 0 by θ(ζ−z)θ(−λ) θ(ζ 0 −w)θ(−λ) , and integrate it over the cycles C0 for ζ, and C0 for ζ , 0− − 0 where C0 is a deformation of C0 , such that |ζ| < |ζ |. In the resulting identity, replace − + 0 in the l.h.s., e(ζ) by e+λ+~ (ζ) + e− λ+~ (ζ) and in the r.h.s. by eλ−~ (ζ) + eλ−~ (ζ ). The contributions to the integral of the terms in e− vanish, because these terms are regular at 0. We then obtain I θ(ζ 0 − w − λ) θ(z − ζ 0 − ~) + { e (z)e+λ−~ (ζ 0 ) 0 θ(z − ζ 0 ) λ+~ C00− θ(ζ − w)θ(−λ)

θ(ζ 0 − z − λ)θ(−~) + e (ζ 0 )e(ζ 0 )}dζ 0 θ(ζ 0 − z)θ(−λ) λ+~ I θ(ζ 0 − w − λ) θ(z − ζ 0 + ~) + { eλ+~ (ζ 0 )e+λ−~ (z) = 0) 0− θ(ζ 0 − w)θ(−λ) θ(z − ζ C0

+

+ that is

θ(ζ 0 − z − λ)θ(~) 0 + e(ζ )eλ−~ (ζ 0 )}dζ 0 , θ(ζ 0 − z)θ(−λ)

θ(z − w − ~) + θ(z − w − λ)θ(~) + e e (z) + (w) (40) θ(z − w)θ(−λ) λ−~ θ(z − w) λ−~ θ(z − w + ~) + θ(z − w − λ)θ(−~) + − e (z) + e (w) e+λ−~ (z) θ(z − w)θ(−λ) λ+~ θ(z − w) λ+~ I θ(ζ 0 − z − λ) θ(ζ 0 − w − λ) + eλ+~ (ζ 0 )e(ζ 0 ) + e(ζ 0 )e+λ−~ (ζ 0 ) dζ 0 . θ(−~) 0 =− 0 − w)θ(−λ) 0− θ(ζ − z)θ(−λ) θ(ζ C0

e+λ+~ (z)

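Specializing (11) for $\zeta = \zeta'$, we find $e(\zeta')^2 = 0$. Therefore

\[
e^+_{\lambda+\hbar}(\zeta')e(\zeta') + e(\zeta')e^+_{\lambda-\hbar}(\zeta') = e^+_{\lambda+\hbar}(\zeta')e^+_{\lambda-\hbar}(\zeta') - e^-_{\lambda+\hbar}(\zeta')e^-_{\lambda-\hbar}(\zeta').
\]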
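For the reader's convenience, here is one way to spell out this step. It uses only the splitting (33), applied once with parameter $\lambda+\hbar$ and once with $\lambda-\hbar$, together with $e(\zeta')^2 = 0$; all fields are evaluated at $\zeta'$, and the argument is suppressed.

```latex
% Write e = e^+_{\lambda+\hbar} + e^-_{\lambda+\hbar} = e^+_{\lambda-\hbar} + e^-_{\lambda-\hbar}.
\begin{align*}
e^+_{\lambda+\hbar}\,e + e\,e^+_{\lambda-\hbar}
  &= e^+_{\lambda+\hbar}\big(e^+_{\lambda-\hbar}+e^-_{\lambda-\hbar}\big)
   + \big(e^+_{\lambda+\hbar}+e^-_{\lambda+\hbar}\big)e^+_{\lambda-\hbar} \\
  &= 2\,e^+_{\lambda+\hbar}e^+_{\lambda-\hbar}
   + e^+_{\lambda+\hbar}e^-_{\lambda-\hbar} + e^-_{\lambda+\hbar}e^+_{\lambda-\hbar}, \\
0 = e\cdot e
  &= e^+_{\lambda+\hbar}e^+_{\lambda-\hbar} + e^+_{\lambda+\hbar}e^-_{\lambda-\hbar}
   + e^-_{\lambda+\hbar}e^+_{\lambda-\hbar} + e^-_{\lambda+\hbar}e^-_{\lambda-\hbar}.
\end{align*}
% Subtracting the second line from the first gives
% e^+_{\lambda+\hbar}e^+_{\lambda-\hbar} - e^-_{\lambda+\hbar}e^-_{\lambda-\hbar}.
```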

The contribution of the terms in $e^-$ to the r.h.s. of (40) is zero, since these terms are regular at $0$, and the contribution of the terms in $e^+$ is evaluated by the residue formula. (Note that this method apparently cannot be adapted to derive relations between $e^+_\lambda(z)$ and $e^+_{\lambda'}(w)$ when $\lambda\ne\lambda'\pm 2\hbar$, because then the term in $e^+$ in the r.h.s. of (40) would no longer be a meromorphic function on $E$.)

As we will see from Thm. 7.1, it is also possible to derive relations between the fields $k^\pm(z)$ and $e^\pm_\lambda(w)$, $f^\pm_\lambda(w)$; for example, we have

\[
k^+(z)e^+_\lambda(w)k^+(z)^{-1} = \frac{\theta(z-w+\hbar)}{\theta(z-w)}\,e^+_{\lambda+\hbar}(w) - \frac{\theta(z-w-\lambda)\theta(\hbar)}{\theta(z-w)\theta(-\lambda)}\,e^+_{\lambda+\hbar}(z), \tag{41}
\]

etc.

1.5. Shifts in $h$. In what follows, we will simply denote $h[1]$ by $h$. We will also use the following notation. Let us define in each tensor power $A^{\otimes n}$, $a^{(i)}$ as the element $1^{\otimes(i-1)}\otimes a\otimes 1^{\otimes(n-i)}$ for $a\in A$, and for $\beta\in\mathbb{C}$, $x^{-(j)}_{\mu+\hbar\beta h^{(i)}}[\epsilon]$ as $\sum_{\alpha\ge 0}\frac{\partial^\alpha x^{-(j)}_\mu[\epsilon]}{\partial\mu^\alpha}\frac{(\hbar\beta h^{(i)})^\alpha}{\alpha!}$. If $n = 1$, let us denote also $x^{-(1)}_{\mu+\hbar\beta h^{(1)}}[\epsilon]$ simply by $x^-_{\mu+\hbar\beta h}[\epsilon]$. Let us also set for $\epsilon = +, -$,

\[
x^\epsilon_{\mu+\hbar\beta h}(z) = \sum_{\alpha\ge 0}(\partial/\partial\mu)^\alpha x^\epsilon_\mu(z)\,(\hbar\beta h)^\alpha/\alpha!;
\]

we have then

\[
x(z) = x^+_{\mu+\hbar\beta h}(z) + x^-_{\mu+\hbar\beta h}(z), \qquad x^-_{\mu+\hbar\beta h}(z) = \sum_{i\in\mathbb{Z}} x^-_{\mu+\hbar\beta h}[e_{i;-\mu}]\,e^i(z).
\]

Lemma 1.1. We have the relations θ(z − w + ~) 0 fλ+~h (z)fλ+~h (w) θ(z − w) θ(w − z − λ − ~h − 3~)θ(~) 0 0 + 0 f (w)fλ+~h (w) θ(w − z)θ(−λ − ~h − 3~) λ+~h θ(z − w − ~) 0 = f (w)fλ+~h (z) θ(z − w) λ+~h θ(z − w − λ − ~h − 3~)θ(~) + 0 f (z)fλ+~h (z), θ(z − w)θ(−λ − ~h − 3~) λ+~h

(42)

where , 0 take the values + and −. Proof. The identity (~h)n X 0 (∂/∂λ)n φ(λ)fλ (z1 )fλ (z2 ) n! n≥0

=

X

(∂/∂λ)r φ(λ)

p,q,r≥0

0 (~(h + 4))r (~(h + 2))q (~h)p (∂/∂λ)q fλ (z1 ) (∂/∂λ)p fλ (z2 ) r! q! p!

implies that X n≥0

(~h)n 0 0 = φ(λ + ~(h + 4))fλ+~h+~ (∂/∂λ)n φ(λ)fλ (z1 )fλ (z2 ) (z1 )fλ+~h+~ (z2 ). n!

Equation (42) then follows from (36), after the change of $\lambda$ into $\lambda-\hbar$.

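1.6. Properties of $K^+(z)$. Since

\[
h^+(z) = \frac{\theta'}{\theta}(z)\,h^+[1] + \sum_{i>0}h^+[e^i]\,e_i(z),
\]

where the $e_i$ are elliptic functions, we have

\[
\frac{q^{\partial}-q^{-\partial}}{2\partial}h^+(z) = \frac{1}{2}\ln\frac{\theta(z+\hbar)}{\theta(z-\hbar)}\,h^+[1] + \sum_{i>0}h^+[e^i]\,e_i'(z),
\]

with $e_i'$ again elliptic functions. Therefore,

\[
K^+(z) = \left(\frac{\theta(z+\hbar)}{\theta(z-\hbar)}\right)^{h/2}\sum_{i\ge 0}\wp^{(i)}(z)\alpha_i, \qquad \alpha_i\in U_\hbar\mathfrak{h}(\tau); \tag{43}
\]

here $\left(\frac{\theta(z+\hbar)}{\theta(z-\hbar)}\right)^{h/2}$ is defined as $\exp\left(\frac{h}{2}\ln\frac{\theta(z+\hbar)}{\theta(z-\hbar)}\right)$, where the argument of the exponential is considered as a formal power series in $\hbar$, and we define $\wp$ by $\wp = -\partial^2(\ln\theta)$.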
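The logarithmic term above can be obtained in one line. The sketch below assumes that $q^{\pm\partial} = e^{\pm\hbar\partial}$ acts on functions of $z$ by Taylor's formula, i.e. as the shift $z\mapsto z\pm\hbar$, and that the quotient by $\partial$ is understood as a formal power series in $\hbar$.

```latex
% (q^\partial - q^{-\partial})/(2\partial) = \sum_{k\ge 0} \hbar^{2k+1}\partial^{2k}/(2k+1)!
\[
\frac{q^{\partial}-q^{-\partial}}{2\partial}\,\frac{\theta'}{\theta}(z)
 = \frac{q^{\partial}-q^{-\partial}}{2\partial}\,\partial\ln\theta(z)
 = \frac{1}{2}\big(\ln\theta(z+\hbar)-\ln\theta(z-\hbar)\big)
 = \frac{1}{2}\ln\frac{\theta(z+\hbar)}{\theta(z-\hbar)} .
\]
% The same operator sends an elliptic function e_i to a series whose
% coefficients are derivatives of e_i, hence again elliptic functions.
```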


Remark 4. Equation (43) implies that $K^+(z)$ has the properties

\[
K^+(z+1) = K^+(z), \qquad K^+(z+\tau) = e^{-2i\pi\hbar h}K^+(z); \tag{44}
\]

informally, we can write $K^+(z) = K^+_{\hbar h}(z)$.
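The properties (44) can be checked directly on the prefactor in (43). The computation below is a sketch under the assumption (not restated in this section) that $\theta(u+1) = -\theta(u)$ and $\theta(u+\tau) = -e^{-i\pi\tau-2i\pi u}\theta(u)$; the functions $\wp^{(i)}$ are elliptic and do not contribute.

```latex
\[
\frac{\theta(z+\tau+\hbar)}{\theta(z+\tau-\hbar)}
 = \frac{e^{-2i\pi(z+\hbar)}}{e^{-2i\pi(z-\hbar)}}\cdot\frac{\theta(z+\hbar)}{\theta(z-\hbar)}
 = e^{-4i\pi\hbar}\,\frac{\theta(z+\hbar)}{\theta(z-\hbar)},
\]
\[
\left(\frac{\theta(z+\tau+\hbar)}{\theta(z+\tau-\hbar)}\right)^{h/2}
 = \exp\Big(\frac{h}{2}\Big(\ln\frac{\theta(z+\hbar)}{\theta(z-\hbar)} - 4i\pi\hbar\Big)\Big)
 = e^{-2i\pi\hbar h}\left(\frac{\theta(z+\hbar)}{\theta(z-\hbar)}\right)^{h/2}.
\]
% Comparing with (34), this is the multiplier of a half-current x^+_\lambda
% with \lambda formally replaced by \hbar h.
```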

1.7. Properties of the coproducts. Lemma 1.2. Let us fix λ ∈ C − 0. For ∈ K, we have X −(1) 1(e− e−λ−~h(2) [i ](1 ⊗ aiλ ()) + e−(2) −λ []) = −λ [],

(45)

i≥0

and []) (1 ⊗ 1)(e−(1) −λ−~h(2) X X −(1) ∂ α aiλ () (−~h)α )+e−(2) e−λ−~h(2) −~h(3) [i ] (1⊗ ⊗ [], = −λ−~h(3) ∂λα α!

(46)

α≥0

i∈Z

where aiλ are linear maps from K to the subalgebra U~ h+ (τ ) of A(τ ), generated by the h+ [r], r ∈ O, depending holomorphically on λ ∈ C − 0. Proof. We have 1(e(z)) = e(z) ⊗ K + (z) + 1 ⊗ e(z), so that − − + 1(e− −λ []) = he(z) ⊗ K (z), pλ ()iK + 1 ⊗ e−λ [].

By (43), we have he(z) ⊗ K + (z), p− λ ()iK = he(z) ⊗

θ(z + ~) θ(z − ~)

h/2 X

℘(i) (z)αi , p− λ ()iK .

i≥0

Note now that the map associating to ∈ K, the series Aλ () =

θ(z + ~) θ(z − ~)

h/2 X i≥0

i ∂ i p− λ () (−~h) , ∂λi i!

is a linear map from K to Lλ [[~h]]. Therefore

and

θ(z + ~) θ(z − ~)

h/2

p− λ () =

he(z) ⊗ K + (z), p− λ ()iK =

X (~h)i ∂ i Aλ () i≥0

X i≥0

i!

∂λi

,

e(1) [℘(i) Aλ ()](1 ⊗ αi ). −λ−~h(2)

P Equation (45) follows, if we set aiλ () = j≥0 h℘(j) Aλ (), ρi iαj , where (ρj )j≥0 is the dual basis to (j )j≥0 . Equation (46) then follows directly from (45).


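Lemma 1.3. There exists a family of linear maps $(b_i)_{i\ge 0}$ from $\mathcal{O}$ to the subalgebra $U_\hbar\mathfrak{h}(\tau)$ of $A(\tau)$ generated by the $h[\epsilon]$, $\epsilon\in\mathcal{K}$, and $K$, such that

\[
\Delta(f[r]) = f[r]\otimes 1 + \sum_{i\ge 0}b_i(r)\otimes f[\epsilon^i], \tag{47}
\]

for $r\in\mathcal{O}$; recall that $(\epsilon^i)_{i\ge 0}$ is a basis of $\mathcal{O}$.

Proof. We have $\Delta(f(z)) = f(z)\otimes 1 + q^{-h^-(z)}\otimes f(z+\hbar K_1)$, so that

\[
\Delta(f[r]) = f[r]\otimes 1 + \langle q^{-h^-(z)}\otimes f(z+\hbar K_1), r\rangle_{\mathcal{K}};
\]

since $q^{-h^-(z)} = \sum_{i\ge 0}\beta_i\epsilon^i(z)$, for certain $\beta_i\in U_\hbar\mathfrak{h}(\tau)$, we have

\[
\langle q^{-h^-(z)}\otimes f(z+\hbar K_1), r\rangle_{\mathcal{K}} = \Big\langle\sum_{i\ge 0}\beta_i\epsilon^i(z)\otimes f(z+\hbar K_1), r\Big\rangle_{\mathcal{K}} = \sum_{i\ge 0}\beta_i\otimes f[q^{K_1\partial}(r\epsilon^i)].
\]

We then set for $i\ge 0$, $b_i(r) = \sum_{j\ge 0}\beta_j\langle q^{K\partial}(r\epsilon^j),\rho_i\rangle$, where $(\rho_i)_{i\ge 0}$ is the dual basis to $(\epsilon^i)_{i\ge 0}$.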

Lemma 1.4. There exist families of linear maps (cλi )i≥0 from O to U~ h(τ ) and (di )i∈Z from K to U~ h+ (τ ), such that the dependence of cλi in λ ∈ C − 0 is holomorphic, and X ¯ 1(e[r]) = 1 ⊗ e[r] + e[i ] ⊗ cλi (r), (48) i≥0

for r ∈ O, ¯ − []) = f − [] ⊗ 1 + 1(f λ λ

X

−(2) (di () ⊗ 1)fλ−~h (1) [i ]

(49)

i∈Z

and −(1) ¯ −(1) (1) []) = fλ+~(h 1(f (1) +h(2) )−2~ [] + λ+~h −2~

X i∈Z

−(2) (di () ⊗ 1)fλ+~h (2) −2~ [i ].

(50)

Proof. Equation (48) is proved in the same way as (47). Let us prove (49). We have for ∈ K, ¯ − []) = f − [] ⊗ 1 + hK + (z) ⊗ f (z), p− ()iK ; 1(f λ λ −λ h/2 P θ(z+~) (i) recall that K + (z) = θ(z−~) i≥0 αi ℘ (z), with αi ∈ U~ h+ (τ ), therefore hK (z) ⊗ +

f (z), p− −λ ()iK

since each ℘(i) (z)p− −λ (z)

=

θ(z+~) θ(z−~)

X

αi ⊗ hf (z), ℘

i≥0

(i)

(z)p− λ (z)

θ(z + ~) θ(z − ~)

h/2 can be expressed as an expansion

h/2 iK ;


X ∂ j Bλ (z) (~h(1) )j ∂λj

j≥0

j!

,

with Bλ (z) ∈ L−λ [[~h(1) ]], we have 1 ⊗ hf (z), ℘(i) (z)p− −λ (z)

θ(z + ~h(1) ) −(2) iK = fλ−~h (1) (λi ()), θ(z)

where λi are certain linear endomorphisms of K. This shows (49). Equation (50) can be deduced from (49) by using the expansion −(1) fλ+~h (1) [] =

X ∂ j f −(1) [] (~h(1) )j λ

j≥0

∂λj

j!

,

¯ the identity 1(h) = h(1) + h(2) and by replacing λ by λ − 2~.

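2. Duality

In this section, we study the pairing between $U_\hbar\mathfrak{n}_+(\tau)$ and $U_\hbar\mathfrak{n}_-(\tau)$, and compute some annihilators. As in [12], these results will serve (Sect. 4) to decompose the twist $F$.

Let $U_\hbar\mathfrak{g}_+(\tau)$ be the subalgebra of $A(\tau)$ generated by $D$, the $h[r]$, $r\in\mathcal{O}$, and $U_\hbar\mathfrak{n}_+(\tau)$, and let $U_\hbar\mathfrak{g}_-(\tau)$ be the subalgebra of $A(\tau)$ generated by $K$, the $h[\lambda]$, $\lambda\in L_0$, and $U_\hbar\mathfrak{n}_-(\tau)$. $(U_\hbar\mathfrak{g}_\pm(\tau),\Delta)$ are Hopf subalgebras of $(A(\tau),\Delta)$; $(U_\hbar\mathfrak{g}_+(\tau),\Delta)$ and $(U_\hbar\mathfrak{g}_-(\tau),\Delta')$ are dual to each other; the duality $\langle\,,\rangle$ is expressed by the rules

\[
\langle e[\epsilon],f[\epsilon']\rangle = \frac{1}{\hbar}\langle\epsilon,\epsilon'\rangle_{\mathcal{K}}, \qquad \langle h[r],h[\lambda]\rangle = \frac{2}{\hbar}\langle r,\lambda\rangle_{\mathcal{K}}, \qquad \langle D,K\rangle = \frac{1}{\hbar},
\]

the other pairings between generators being trivial. We denote by $\langle\,,\rangle_{U_\hbar\mathfrak{n}_\pm(\tau)}$ the restriction of $\langle\,,\rangle$ to $U_\hbar\mathfrak{n}_+(\tau)\times U_\hbar\mathfrak{n}_-(\tau)$.

On the other hand, let $U_\hbar\bar{\mathfrak{g}}_+(\tau)$ and $U_\hbar\bar{\mathfrak{g}}_-(\tau)$ be the subalgebras of $A(\tau)$ respectively generated by $U_\hbar\mathfrak{n}_+(\tau)$, $K$ and the $h[\lambda]$, $\lambda\in L_0$, and by $U_\hbar\mathfrak{n}_-(\tau)$, $D$ and the $h[r]$, $r\in\mathcal{O}$. $(U_\hbar\bar{\mathfrak{g}}_\pm(\tau),\bar\Delta)$ are Hopf subalgebras of $(A(\tau),\bar\Delta)$; $(U_\hbar\bar{\mathfrak{g}}_+(\tau),\bar\Delta')$ and $(U_\hbar\bar{\mathfrak{g}}_-(\tau),\bar\Delta)$ are dual to each other; the duality $\langle\,,\rangle'$ is expressed by the rules

\[
\langle e[\epsilon],f[\epsilon']\rangle' = \frac{1}{\hbar}\langle\epsilon,\epsilon'\rangle_{\mathcal{K}}, \qquad \langle h[\lambda],h[r]\rangle' = \frac{2}{\hbar}\langle r,\lambda\rangle_{\mathcal{K}}, \qquad \langle K,D\rangle' = \frac{1}{\hbar},
\]

the other pairings between generators being trivial. The restriction of $\langle\,,\rangle'$ to $U_\hbar\mathfrak{n}_+(\tau)\times U_\hbar\mathfrak{n}_-(\tau)$ coincides with $\langle\,,\rangle_{U_\hbar\mathfrak{n}_\pm(\tau)}$.

Let us also denote by $U_\hbar\mathfrak{n}_\pm(\tau)^{[n]}$ the homogeneous components of degree $n$ (in the $e[\epsilon]$ or $f[\epsilon]$) of $U_\hbar\mathfrak{n}_\pm(\tau)$, and by $U^{(f);n}_{\lambda,-}$ the intersections $U^{(f)}_{\lambda,-}\cap U_\hbar\mathfrak{n}_\pm(\tau)^{[n]}$ (and similarly for $U^{(e);n}_{\lambda,-}$). Then

Lemma 2.1. 1) The annihilator of $U_+^{(f)}$ for $\langle\,,\rangle_{U_\hbar\mathfrak{n}_\pm(\tau)}$ is $\sum_{r\in\mathcal{O}}e[r]\,U_\hbar\mathfrak{n}_+(\tau)$.

2) The annihilator of $U^{(e);n}_{\lambda,-}$ for $\langle\,,\rangle_{U_\hbar\mathfrak{n}_\pm(\tau)}$ is $\sum_{\epsilon\in\mathcal{K}}f^-_{\lambda-2(n-1)\hbar}[\epsilon]\,U_\hbar\mathfrak{n}_-(\tau)^{[n-1]}$.

3) The annihilator of $U_+^{(e)}$ for $\langle\,,\rangle_{U_\hbar\mathfrak{n}_\pm(\tau)}$ is $\sum_{r\in\mathcal{O}}U_\hbar\mathfrak{n}_-(\tau)f[r]$.

4) The annihilator of $U^{(f);n}_{\lambda,-}$ for $\langle\,,\rangle_{U_\hbar\mathfrak{n}_\pm(\tau)}$ is $\sum_{\epsilon\in\mathcal{K}}U_\hbar\mathfrak{n}_+(\tau)^{[n-1]}e^-_{-\lambda-2(n-1)\hbar}[\epsilon]$.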


Proof. 1) and 3) are consequences of [12],P Prop. 6.2. − []U~ n− (τ )[n] is orthogonal to Let us show 2). Let us first prove that ∈K fλ−2n~ (e);n+1 . Let Uλ,− − ηi ∈ K a = e− −λ+2n~ [ηn ] · · · e−λ [η0 ], (e);n+1 belong to Uλ,− ; let b belong to U~ n− (τ )[n] , belong to K and let us compute − []bi. This is equal to ha, f−λ X − hai , f−λ []iha0i , bi, (51) i

P

a0i .

where 1(a) = i ai ⊗ From (45) follows that 1(a) is the product of the terms X −(1) e−λ+2p~−~h(2) [i ](1 ⊗ aiλ−2p~ )(ηp ) + e−(2) −λ+2p~ [ηp ],

(52)

i∈Z

for p = n, . . . , 0. This product belongs to U~ n+ (τ ) ⊗ U~ g+ (τ ). To evaluate (51), we may as well project the first factor of 1(a) on U~ n+ (τ )[1] parallel to all other homogeneous components. The contribution of the (n − p)th term (52) is X −(2) −(1) i e−λ+2n~ [ηn ] · · · e−(2) −λ+2(p+1)~ [ηp+1 ]e−λ+2p~−~h(2) [i ](1 ⊗ aλ−2p~ (ηp )) i∈Z

−(2) e−(2) −λ+2(p−1)~ [ηp−1 ] · · · e−λ [η0 ],

that is, using the fact that −(2) (h − 2p)e−(2) −λ+2n~ [ηn ] · · · e−λ+2(p+1)~ [ηp+1 ] −(2) = e−(2) −λ+2n~ [ηn ] · · · e−λ+2(p+1)~ [ηp+1 ](h − 2n),

XX i∈Z α≥0

(−∂/∂λ)α (e− −λ−2n~ [i ]) ⊗

hα −(2) e [ηn ] α! −λ+2n~

(53)

−(2) −(2) i · · · e−(2) −λ+2(p+1)~ [ηp+1 ]aλ−2p~ (ηp )e−λ+2(p−1)~ [ηp−1 ] · · · e−λ [η0 ].

Note now that for any x ∈ U~ g+ (τ ) and y ∈ U~ n− (τ ), we have hhx, yi = 0.

(54)

0 Indeed, hhx, yi = hh ⊗ x, 10 (y)i(2) (denoting by h, ith (2) e tensor square of h, i); but 1 (y) belongs to U~ n− (τ ) ⊗ U~ g− (τ ), and hh, U~ n− (τ )i = 0, so that (54) holds. Now the pairing of (53) with fλ− [] ⊗ b is equal to zero either (for α = 0) because − he−λ [η], fλ− []i = 0 for any , η ∈ K (Lλ and L−λ being orthogonal to each other) or by (54) for α > 0. Then standard deformation arguments (see [12], proof of Prop. 6.2) show that the P (e);n+1 − is exactly ∈K fλ−2n~ []U~ n− (τ )[n] . orthogonal of Uλ,− Let us now prove 4). Let us first show that for ∈ K, U~ n+ (τ )[n] e− −λ−2n~ [] is (f );n+1 orthogonal to Uλ,− . Let

\[
a = f^-_\lambda[\eta_0]\cdots f^-_{\lambda+2n\hbar}[\eta_n], \qquad \eta_i\in\mathcal{K},
\]

belong to U~ n+ (τ )[n] , let b belong to U~ n+ (τ )[n] , and let us compute hbe− −λ−2n~ [], ai. This is equal to ¯ hb ⊗ e− −λ−2n~ [], 1(a)i(2) .

(55)

¯ From (49) follows that 1(a) is the product of the terms X −(2) − [ηp ] ⊗ 1 + (di (ηp ) ⊗ 1)fλ−~h fλ+2p~ (1) +2p~ [i ],

(56)

i∈Z

p = 0, . . . , n. Assign degrees −1 to terms of the form f [], ∈ K, and zero to those belonging to U~ h(τ ); then in the expansion of the product of the terms (56), only those of degree −1 will contribute to (55). Therefore (55) is equal to hb ⊗ e− −λ−2n~ [], n Y

n p−1 X Y

− (fλ+2k~ [ηk ] ⊗ 1)

p=0 k=0

X i∈Z

−(2) (di (ηp ) ⊗ 1)fλ−~(h (1) −2p) [i ]

(57)

− (fλ+2k~ [ηk ] ⊗ 1)i(2) .

k=p+1

Using the identity (h − 2p)

n Y

 − fλ+2k~ [ηk ] = 

k=p+1

n Y

 − fλ+2k~ [ηk ] (h − 2n),

k=p+1

we rewrite (57) as hb ⊗  

e− −λ−2n~ [],

n Y k=p+1

n X XX

p−1 Y

α≥0 p=0 i∈Z

k=0



− fλ+2k~ [ηk ]

! − fλ+2k~ [ηk ]

di (ηp )

(58)

(−~h)α − ⊗ (∂/∂λ)α fλ+2n~ [i ]i(2) . α!

Note now that for b ∈ U~ n+ (τ ), c ∈ U~ g¯ − (τ ), hb, chi = 0.

(59)

¯ 0 (b), c ⊗ hi(2) = 0 because the second factors of the expanIndeed, hb, chi = h1 0 ¯ (b) belong to U~ n+ (τ ). Therefore (58) vanishes, either by (59) or because sion of 1 − he− −λ−2n~ [], fλ+2n~ [i ]i = 0. P (f );n+1 It follows that ∈K U~ n+ (τ )[n] e− . The same −λ−2n~ [] is orthogonal to Uλ,− deformation arguments as above show that these spaces are in fact the orthogonals of each other. We will also use the following lemma:


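Lemma 2.2. 1) For $x\in U_\hbar\mathfrak{n}_+(\tau)$, we have

\[
\langle F, \mathrm{id}\otimes x\rangle = x.
\]

2) For $y\in U_\hbar\mathfrak{n}_-(\tau)$, we have

\[
\langle F, y\otimes\mathrm{id}\rangle = y.
\]

3) Let $\pi$ be the linear map from $U_\hbar\mathfrak{g}_+(\tau)$ to $U_\hbar\mathfrak{n}_+(\tau)$, defined by $\pi(tx) = \varepsilon(t)x$, for $x\in U_\hbar\mathfrak{n}_+(\tau)$, $t\in U_\hbar\mathfrak{h}_+(\tau)$; then we have for $x'\in U_\hbar\mathfrak{g}_+(\tau)$, $\langle F, \mathrm{id}\otimes x'\rangle = \pi(x')$.

4) Let $\pi'$ be the linear map from $U_\hbar\bar{\mathfrak{g}}_+(\tau)$ to $U_\hbar\mathfrak{n}_+(\tau)$, defined by $\pi'(yt) = y\varepsilon(t)$, for $y\in U_\hbar\mathfrak{n}_+(\tau)$, $t\in U_\hbar\mathfrak{h}_-(\tau)$; then we have for $y'\in U_\hbar\bar{\mathfrak{g}}_+(\tau)$, $\langle F, \mathrm{id}\otimes y'\rangle = \pi'(y')$.

Proof. 1) and 2) are direct consequences of [12], (66) and (68). The proof of 3) and 4) is similar to that of Lemma 2.5 of [9].

3. Algebras $A^{+-}$ and $A^{-+}$ and Their Properties

In this section, we introduce the algebras $A^{+-}$ and $A^{-+}$, to which the two parts of the decomposition of $F$ should belong. In view of proving that these parts satisfy the twisted cocycle property, we also study their behaviour with respect to the coproducts $\Delta$ and $\bar\Delta$.

For $X$ a vector space, we denote by $\mathrm{Hol}(\mathbb{C}-0,X)$ the space of holomorphic functions from $\mathbb{C}-0$ to $X$ and set $\mathrm{Hol}(\mathbb{C}-0) = \mathrm{Hol}(\mathbb{C}-0,\mathbb{C})$.

Definition 3.1. Let us define $A^{-+}$ to be the subalgebra of $\mathrm{Hol}(\mathbb{C}-0,A\otimes A)$ generated (over $\mathrm{Hol}(\mathbb{C}-0)$) by $h^{(2)}$ and the $e^{-(1)}_{-\lambda-\hbar h^{(2)}}[\epsilon]f^{(2)}[r]$, with $\epsilon\in\mathcal{K}$ and $r\in\mathcal{O}$, and $A^{+-}$ as the subalgebra of $\mathrm{Hol}(\mathbb{C}-0,A\otimes A)$ generated (over $\mathrm{Hol}(\mathbb{C}-0)$) by $h^{(2)}$ and the $e^{(1)}[r]f^{-(2)}_{\lambda+\hbar h^{(2)}-2\hbar}[\epsilon]$, with $r\in\mathcal{O}$, $\epsilon\in\mathcal{K}$.

Proposition 3.1. The intersection of $A^{+-}$ and $A^{-+}$ is equal to $\mathrm{Hol}(\mathbb{C}-0, 1\otimes\mathbb{C}[h][[\hbar]])$.

Proof. Since we have $[h,f[r]] = -2f[r]$ for $r\in\mathcal{O}$, we have the relations

\[
f^{(2)}[r]\,e^{-(1)}_{-\lambda-\hbar h^{(2)}}[\epsilon] = e^{-(1)}_{-\lambda-\hbar h^{(2)}-2\hbar}[\epsilon]\,f^{(2)}[r],
\]

for $\epsilon\in\mathcal{K}$, $r\in\mathcal{O}$; therefore $A^{-+}$ is linearly spanned by $1$ and the

\[
\xi = e^{-(1)}_{-\lambda-\hbar h^{(2)}}[\eta_0]\cdots e^{-(1)}_{-\lambda-\hbar h^{(2)}-2n\hbar}[\eta_n]\,f^{(2)}[r_0]\cdots f^{(2)}[r_n]\,(h^{(2)})^p, \tag{60}
\]

with $n, p\ge 0$, $\eta_i\in\mathcal{K}$, $r_i\in\mathcal{O}$. On the other hand, $A^{+-}$ is linearly spanned by $1$ and the

\[
\eta = e^{(1)}[r_0]\cdots e^{(1)}[r_n]\,f^{-(2)}_{\lambda+\hbar h^{(2)}-2\hbar}[\eta_0]\cdots f^{-(2)}_{\lambda+\hbar h^{(2)}-2\hbar}[\eta_n]\,(h^{(2)})^p, \tag{61}
\]

$n, p\ge 0$, $\eta_i\in\mathcal{K}$, $r_i\in\mathcal{O}$. Suppose that some combination of elements of the form (61) belongs to $A^{-+}$. The image of this combination by $\ell\otimes 1$, $\ell$ any linear form on $A$, is some combination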

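\[
\sum_{p\ge 0}\lambda_{0;p}(h^{(2)})^p + \sum_{n\ge 0}\sum_i\lambda_{i;n;p}\,f^-_{\lambda+\hbar h-2\hbar}[\eta_0^{(i)}]\cdots f^-_{\lambda+\hbar h-2\hbar}[\eta_n^{(i)}]\,(h^{(2)})^p, \tag{62}
\]

$\lambda_{0;p}, \lambda_{i;n;p}\in\mathrm{Hol}(\mathbb{C}-0)$, that should belong to $A_+$. By (36) with $\epsilon = \epsilon' = -$, a basis of the linear span of all elements of the form

\[
f^-_{\lambda+\hbar h-2\hbar}[\eta_0^{(i)}]\cdots f^-_{\lambda+\hbar h-2\hbar}[\eta_n^{(i)}]\,(h^{(2)})^p,
\]

with $\eta_i\in\mathcal{K}$, is

\[
f^-_{\lambda+\hbar h-2\hbar}[\epsilon^{i_0}]\cdots f^-_{\lambda+\hbar h-2\hbar}[\epsilon^{i_n}]\,(h^{(2)})^p, \qquad i_0\le\cdots\le i_n<0.
\]

On the other hand, from (42) follows that a basis of the linear span of all elements of the form $f[\eta_0]\cdots f[\eta_n]h^p$, with $\eta_i\in\mathcal{K}$, is given by

\[
f^-_{\lambda+\hbar h-2\hbar}[\epsilon^{i_0}]\cdots f^-_{\lambda+\hbar h-2\hbar}[\epsilon^{i_k}]\,f[\epsilon^{i_{k+1}}]\cdots f[\epsilon^{i_n}]\,h^p,
\]

$i_0\le\cdots\le i_k<0\le i_{k+1}\le\cdots\le i_n$. A basis of the intersection of $A_+$ with this linear span is

\[
f[\epsilon^{i_0}]\cdots f[\epsilon^{i_n}]\,h^p, \qquad i_0\ge\cdots\ge i_n\ge 0.
\]

Therefore the only possibility that (62) belongs to $A_+$ is that $\lambda_{i;n;p} = 0$ for all $i, n, p$.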

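Definition 3.2. Let us define $A^{-,\cdot,\cdot}$ as the subspace of the algebra $\mathrm{Hol}(\mathbb{C}-0,A^{\otimes 3})$, linearly spanned (over $\mathrm{Hol}(\mathbb{C}-0)$) by the elements of the form

\[
\xi' = e^{-(1)}_{-\lambda-\hbar(h^{(2)}+h^{(3)})}[\eta_1]\cdots e^{-(1)}_{-\lambda-\hbar(h^{(2)}+h^{(3)})-2(n-1)\hbar}[\eta_n]\,(1\otimes a\otimes b), \tag{63}
\]

$n\ge 0$ (recall that the empty product is equal to $1$), where $\eta_i\in\mathcal{K}$, and $a, b\in A$ are such that $[h^{(1)}+h^{(2)}+h^{(3)},\xi'] = 0$; and $A^{\cdot,\cdot,+}$ as the subspace of $\mathrm{Hol}(\mathbb{C}-0,A^{\otimes 3})$ spanned (over $\mathrm{Hol}(\mathbb{C}-0)$) by the elements of the form

\[
\eta' = (a'\otimes b'\otimes 1)\,f^{(3)}[r_1]\cdots f^{(3)}[r_n]\,(h^{(3)})^s, \qquad n, s\ge 0,
\]

where $a', b'\in A$, $r_i\in\mathcal{O}$, and such that $[h^{(1)}+h^{(2)}+h^{(3)},\eta'] = 0$.

Proposition 3.2. $A^{-,\cdot,\cdot}$ and $A^{\cdot,\cdot,+}$ are subalgebras of $\mathrm{Hol}(\mathbb{C}-0,A^{\otimes 3})$. We have

\[
(\Delta\otimes 1)(A^{-+})\subset A^{-,\cdot,\cdot}\cap A^{\cdot,\cdot,+}, \qquad (1\otimes\Delta)(A^{-+})\subset A^{-,\cdot,\cdot}\cap A^{\cdot,\cdot,+}. \tag{64}
\]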

Proof. That A·,·,+ is a subalgebra of Hol(C − 0, A⊗3 ) follows easily from its definition. Let us show now that A−,·,· is a subalgebra of Hol(C − 0, A⊗3 ). Let ξ 0 and ξ 00 be elements of Hol(C − 0, A⊗3 ) of the form (65), that is [η ] · · · e−(1) [η ](1 ⊗ a ⊗ b), ξ 0 = e−(1) −λ−~(h(2) +h(3) ) 1 −λ−~(h(2) +h(3) )−2(n−1)~ n and

[η 0 ] · · · e−(1) [η 0 ](1 ⊗ a0 ⊗ b0 ), ξ 00 = e−(1) −λ−~(h(2) +h(3) ) 1 −λ−~(h(2) +h(3) )−2(n0 −1)~ n0

with ηi , ηi0 ∈ K, a, b, a0 , b0 ∈ A are such that ξ 0 and ξ 00 commute with h(1) + h(2) + h(3) . Since [h(1) , ξ 0 ] = 2nξ, we have [h(2) +h(3) , ξ 0 ] = −2nξ 0 , so that [h(2) +h(3) , 1⊗a⊗b] = −2n(1 ⊗ a ⊗ b). It follows that for any p,


(1 ⊗ a ⊗ b)e−(1) [η 0 ] −λ−~(h(2) +h(3) )−2(p−1)~ p [η 0 ](1 ⊗ a ⊗ b). = e−(1) −λ−~(h(2) +h(3) )−2n~−2(p−1)~ p The product ξ 0 ξ 00 can then be written as e−(1) [η ] · · · e−(1) [η ], −λ−~(h(2) +h(3) ) 1 −λ−~(h(2) +h(3) )−2(n−1)~ n e−(1) [η 0 ] · · · e−(1) [η 0 ](1 ⊗ aa0 ⊗ bb0 ), −λ−~(h(2) +h(3) )−2n~ 1 −λ−~(h(2) +h(3) )−2(n+n0 −1)~ n0 which is of the form (63). Since we also have [h(1) + h(2) + h(3) , ξ 0 ξ 00 ] = 0, ξ 0 ξ 00 belongs to A−,·,· . Let us now prove the first part of (64). From (46) follows that for ∈ K, r ∈ O, []f (2) [r]) is equal to (1 ⊗ 1)(e−(1) −λ−~h(2) XX i≥0 α≥0

e−(1) [ ] −λ−~(h(2) +h(3) ) i

∂ α aiλ ()(2) (−~h(3) )α (3) f [r] + e−(2) []f (3) [r], −λ−~h(3) ∂λα α!

and so belongs to A−,·,· ∩ A·,·,+ . We also have (1 ⊗ 1)(h(2) ) ∈ A−,·,· ∩ A·,·,+ . Since f (2) [r], ∈ K, r ∈ O, generate A−+ , and that A−,·,· ∩ A·,·,+ is an h(2) and the e−(1) −λ+~h(2) algebra, we have (1 ⊗ 1)(A−+ ) ⊂ A−,·,· ∩ A·,·,+ . Let us now prove the second part of (64). Clearly, (1 ⊗ 1)(h(2) ) belongs to A−,·,· ∩ ·,·,+ [] f (2) [r]) A . From Lemma 1.3 follows that for any ∈ K, r ∈ O, (1 ⊗ 1)(e−(1) −λ−~h(2) is equal to   X [] f (2) [r] + bi (r)(2) f (3) [i ] , e−(1) −λ−~(h(2) +h(3) ) i≥0

[]f (2) [r], ∈ K, r ∈ and therefore belongs to A−,·,· ∩A·,·,+ . Since h(2) and the e−(1) −λ−~h(2) −+ −,·,· ·,·,+ O, generate A , and that A ∩A is an algebra, this shows that (1 ⊗ 1)(A−+ ) ⊂ −,·,· ·,·,+ A ∩A . We now define analogues A+,·,· and A·,·,− of A−,·,· and A·,·,+ . Definition 3.3. A+,·,· is the subspace of the algebra Hol(C − 0, A⊗3 ) linearly spanned (over Hol(C − 0)) by the elements of the form ξ 0 = e(1) [r1 ] . . . e(1) [rn ](1 ⊗ a ⊗ b),

n ≥ 0,

(65)

where ri ∈ O, and a, b ∈ A are such that [h(1) + h(2) + h(3) , ξ 0 ] = 0. A·,·,− is the subspace of Hol(C − 0, A⊗3 ) linearly spanned (over Hol(C − 0)) by the elements of the form −(3) −(3) (3) s η 0 = (a0 ⊗ b0 ⊗ 1)fλ+~h ) , (3) −2~ [η1 ] . . . fλ+~h(3) −2~ [ηn ](h

n, s ≥ 0,

(66)

where ηi ∈ K, and a0 , b0 ∈ A are such that [h(1) + h(2) + h(3) , η 0 ] = 0. We have then: Proposition 3.3. A+,·,· and A·,·,− are subalgebras of Hol(C − 0, A⊗3 ). We have ¯ ⊗ 1)(A+− ) ⊂ A+,·,· ∩ A·,·,− , (1

+− ¯ (1 ⊗ 1)(A ) ⊂ A+,·,·, ∩ A·,·,− .

(67)


Proof. That A+,·,· is a subalgebra of Hol(C − 0, A⊗3 ) follows easily from its definition. Let us now prove that A·,·,− is an algebra. Let us consider elements of the form (61), −(3) −(3) (3) p ) , η = (a ⊗ b ⊗ 1)fλ+~h (3) −2~ [η1 ] . . . fλ+~h(3) −2~ [ηn ](h

and 0

−(3) −(3) 0 0 (3) p ) , η 0 = (a0 ⊗ b0 ⊗ 1)fλ+~h (3) −2~ [η1 ] . . . fλ+~h(3) −2~ [ηn0 ](h

where ηi , ηi0 ∈ K, and a, b, a0 , b0 ∈ A are such that [h(1) + h(2) + h(3) , η] = [h(1) + h(2) + h(3) , η 0 ] = 0. −(3) 0 0 0 Since each fλ+~h (3) −2~ [ηp ] commutes with a ⊗ b ⊗ 1, ηη can be written as −(3) ηη 0 = (aa0 ⊗ bb0 ⊗ 1)fλ+~h (3) −2~ [η1 ] . . . 0

−(3) −(3) −(3) 0 0 (3) − 2n)p (h(3) )p , fλ+~h (3) −2~ [ηn ]fλ+~h(3) −2~ [η1 ] . . . fλ+~h(3) −2~ [ηn0 ](h

which is of the form (66); on the other hand, ηη 0 clearly commutes with h(1) + h(2) + h(3) , which implies that it belongs to A·,·,− . The proof of (67) is similar to that of Prop. 3.2: Lemma 1.4 implies that for ∈ ¯ ⊗ 1)(e(1) [r]f −(2) (2) K, r ∈ O, (1 []) is equal to λ+~h −2~  e(2) [r] +

X i≥0

  −(3) e(1) [i ]ci(2) λ (r) fλ+~h(3) −2~ []

¯ (1) [r]f −(2) (2) []) and therefore belongs to A+,·,· ∩ A·,·,− . On the other hand, (1 ⊗ 1)(e λ+~h −2~ is equal to  −(2) e(1) [r] fλ+~(h (2) +h(3) )−2~ [] +

X i≥0

 −(3)  d(2) i ()fλ+~h(3) −2~ [i ] ,

¯ ⊗ 1)(h(2) ) and (1 ⊗ 1)(h ¯ (2) ) also belong to and so belongs to A+,·,· ∩ A·,·,− . Finally, (1 A+,·,· ∩ A·,·,− . −(2) +− Since the e(1) [r]fλ+~h , and A+,·,· ∩ A·,·,− is an algebra, this (2) −2~ [] generate A proves (67). Proposition 3.4. We have A+,·,· ∩ A−,·,· = Hol(C − 0, 1 ⊗ (A⊗2 )h ), A·,·,+ ∩ A·,·,− = Hol(C − 0, (A⊗2 )h ⊗ C[h]), where (A⊗2 )h are the elements of A⊗2 commuting with h(1) + h(2) . Proof. The proof is similar to that of Prop. 3.1.

Elliptic Quantum Groups Eτ,η (sl2 ) and Quasi-Hopf Algebras


4. Decomposition of F

In this section, we use the duality results of Sect. 2 to decompose $F$ into its $A^{+-}$ and $A^{-+}$ pieces.

4.1. Notation. We will use the following general notation. For $(X_\lambda)_{\lambda\in\mathbb{C}-\Gamma}$ a family of maps from an algebra $A$ to $A^{\otimes m}$, depending on $\lambda \in \mathbb{C} - \Gamma$ in a holomorphic way, for $n \ge m$ and any injection $k \mapsto i_k$ of $\{1,\dots,m\}$ into $\{1,\dots,n\}$, and for any complex numbers $\alpha_s$, $s = 1,\dots,n$, we set
\[ X^{(i_1,\dots,i_m)}_{\lambda+\sum_{s=1}^n \hbar\alpha_s h^{(s)}}(a) = \sum_{i\ge 0} \frac{\partial^i X^{(i_1,\dots,i_m)}_\lambda}{\partial\lambda^i}(a)\, \frac{\bigl(\sum_{k=1}^n \hbar\alpha_k h^{(k)}\bigr)^i}{i!}, \qquad a \in A, \]
where $X_\lambda^{(i_1,\dots,i_m)}(a)$ denotes the image of $X_\lambda(a)$ in $A^{\otimes n}$ by the map sending the $k$-th factor to the $i_k$-th one. If the $X_\lambda$ are algebra morphisms and each $\alpha_{i_k}$ vanishes, then $X^{(i_1,\dots,i_m)}_{\lambda+\sum_{s=1}^n \hbar\alpha_s h^{(s)}}$ is an algebra morphism.

This notation applies in particular if A = C (then Xλ is a family of elements of A⊗m ).

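4.2. Decomposition of F. Let us denote by $U_\hbar\mathfrak{n}_\pm(\tau)^{[n]}$ the homogeneous part of $U_\hbar\mathfrak{n}_\pm(\tau)$ of degree $n$. Let us set
\[ F = \sum_{n\ge 0} F_n, \quad \text{with } F_n \in U_\hbar\mathfrak{n}_+(\tau)^{[n]} \otimes U_\hbar\mathfrak{n}_-(\tau)^{[n]}. \]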

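Proposition 4.1. There exist families $(F^{i;p}_\lambda)_{\lambda\in\mathbb{C}-\Gamma;\,p\ge 0}$, $i = 1, 2$, of elements of $A^{\otimes 2}$, where $F^{1;p}_\lambda$ is a linear combination with coefficients in $\mathrm{Hol}(\mathbb{C} - \Gamma)$ of the
\[ e^-_{-\lambda+2p\hbar}[\epsilon_{i_1}] \cdots e^-_{-\lambda+2\hbar}[\epsilon_{i_p}] \otimes f[\epsilon_{j_1}] \cdots f[\epsilon_{j_p}], \qquad i_\alpha, j_\alpha \ge 0, \]
$F^{2;q}_\lambda$ is a similar combination of the
\[ e[\epsilon_{i_1}] \cdots e[\epsilon_{i_q}] \otimes f^-_{\lambda-2q\hbar}[\epsilon_{j_1}] \cdots f^-_{\lambda-2\hbar}[\epsilon_{j_q}], \]
and $F^{i;0}_\lambda = 1$, $i = 1, 2$, such that
\[ F_n = \sum_{p+q=n} F^{2;q}_{\lambda-2p\hbar} F^{1;p}_\lambda. \tag{68} \]

Proof. Let us define the linear maps $\Pi^{(e)}_{+,\lambda}$ and $\Pi^{(e)}_{-,\lambda}$, from $U_\hbar\mathfrak{n}_+(\tau)$ to $U^{(e)}_+$, resp. to $U^{(e)}_{-,\lambda}$, by
\[ \Pi^{(e)}_{+,\lambda}(ab) = a\,\varepsilon(b), \qquad \Pi^{(e)}_{-,\lambda}(ab) = \varepsilon(a)\,b, \qquad a \in U^{(e)}_+,\ b \in U^{(e)}_{-,\lambda}, \]
and $\Pi^{(f)}_{+,\lambda}$ and $\Pi^{(f)}_{-,\lambda}$, from $U_\hbar\mathfrak{n}_-(\tau)$ to $U^{(f)}_+$, resp. to $U^{(f)}_{-,\lambda}$, by
\[ \Pi^{(f)}_{+,\lambda}(ab) = \varepsilon(a)\,b, \qquad \Pi^{(f)}_{-,\lambda}(ab) = a\,\varepsilon(b), \qquad a \in U^{(f)}_{-,\lambda},\ b \in U^{(f)}_+. \]
Note that $\Pi^{(e)}_{+,\lambda}$ is a left $U^{(e)}_+$-module map, and $\Pi^{(f)}_{+,\lambda}$ is a right $U^{(f)}_+$-module map. From Prop. 1.2 it also follows that the kernels of $\Pi^{(e)}_{+,\lambda}$, $\Pi^{(e)}_{-,\lambda}$, $\Pi^{(f)}_{+,\lambda}$ and of $\Pi^{(f)}_{-,\lambda}$ are respectively $\sum_{\epsilon\in K} U_\hbar\mathfrak{n}_+(\tau)\, e^-_{-\lambda}[\epsilon]$, $\sum_{r\in O} e[r]\, U_\hbar\mathfrak{n}_+(\tau)$, $\sum_{\epsilon\in K} f^-_\lambda[\epsilon]\, U_\hbar\mathfrak{n}_-(\tau)$ and $\sum_{r\in O} U_\hbar\mathfrak{n}_-(\tau)\, f[r]$.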


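Lemma 4.1. 1) For each $n \ge 0$, $(1 \otimes \Pi^{(f)}_{+,\lambda-2n\hbar})(F_n)$ and $(\Pi^{(e)}_{-,\lambda-2\hbar} \otimes 1)(F_n)$ both belong to $U^{(e)}_{\lambda-2\hbar,-} \otimes U^{(f)}_+$.

2) For each $n \ge 0$, $(1 \otimes \Pi^{(f)}_{-,\lambda-2n\hbar})(F_n)$ and $(\Pi^{(e)}_{+,\lambda-2\hbar} \otimes 1)(F_n)$ both belong to $U^{(e)}_+ \otimes U^{(f)}_{\lambda-2n\hbar,-}$.

3) We have the equalities
\[ (1 \otimes \Pi^{(f)}_{+,\lambda-2n\hbar})(F_n) = (\Pi^{(e)}_{-,\lambda-2\hbar} \otimes 1)(F_n) \]
and
\[ (\Pi^{(e)}_{+,\lambda-2\hbar} \otimes 1)(F_n) = (1 \otimes \Pi^{(f)}_{-,\lambda-2n\hbar})(F_n). \]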

) Proof of the lemma. Let us prove the first part of 1). (1 ⊗ 5(f +,λ−2n~ )(Fn ) clearly belongs P (f ) − to U~ n+ (τ ) ⊗ U+ . Let now x belong to ∈K fλ−2n~ []U~ n− (τ )[n−1] . Consider h(1 ⊗ ) 5(f +,λ−2n~ )(Fn ), x ⊗ idi; this is equal to 5+,λ−2n~ (x) by Lemma 2.2, 1), and therefore to ) (e) zero. By Lemma 2.1, 2), it follows that (1 ⊗ 5(f +,λ−2n~ )(Fn ) also belongs to Uλ−2~,− ⊗ U~ n− (τ ). The proof of the second part of 1) and of 2) is similar, and uses the other statements (f ) of Lemma 2.1, and the above description of the kernels of 5(e) −,λ−2~ , 5−,λ−2n~ , and (e) 5+,λ−2~ . (e);n Let us now prove the first part of 3). Let us fix a+ in U+(f );n and a− in Uλ−2~,− . Then ) (e) h(1 ⊗ 5(f +,λ−2n~ )(Fn ) − (5−,λ−2~ ⊗ 1)(Fn ), a+ ⊗ a− i(2) ) (e) = ha− , 5(f +,λ−2n~ (a+ )i − h5−,λ−2~ (a− ), a+ i

= ha− , a+ i − ha− , a+ i = 0. ) (f );n (e) (e);n , and the Since (1 ⊗ 5(f +,λ−2n~ )(Fn ) − (5−,λ−2~ ⊗ 1)(Fn ) belongs to Uλ−2~,− ⊗ U+

(e);n pairing h, i induces injections from Uλ−2~,− in the dual of U+(f );n and from U+(f );n in the ) (e);n (e) dual of Uλ−2~,− , (1 ⊗ 5(f +,λ−2n~ )(Fn ) − (5−,λ−2~ ⊗ 1)(Fn ) is equal to zero. The proof of the second part of 3) is similar.

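Let us set now
\[ F^{1;n}_\lambda = (1 \otimes \Pi^{(f)}_{+,\lambda-2n\hbar})(F_n) = (\Pi^{(e)}_{-,\lambda-2\hbar} \otimes 1)(F_n) \]
and
\[ F^{2;n}_\lambda = (\Pi^{(e)}_{+,\lambda-2\hbar} \otimes 1)(F_n) = (1 \otimes \Pi^{(f)}_{-,\lambda-2n\hbar})(F_n). \tag{69} \]

To prove (68), let us consider the family (indexed by $\lambda \in \mathbb{C} - \Gamma$) of linear endomorphisms $\ell_\lambda$ of $U_\hbar\mathfrak{n}_+(\tau)$, defined by
\[ \ell_\lambda(x) = \Bigl\langle \sum_{p+q=n} F^{2;q}_{\lambda-2p\hbar} F^{1;p}_\lambda,\ \mathrm{id} \otimes x \Bigr\rangle, \]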

for x in U~ n+ (τ )[n] . Assign in U~ g+ (τ ), the degree 0 to the elements of U~ h+ (τ ), and the degree 1 to each e[], ∈ K. Let us denote by U~ g+ (τ )[q] the subspace of U~ g+ (τ ) formed by the elements of degree q. Then

\[ \Delta\bigl(U_\hbar\mathfrak{n}_+(\tau)^{[n]}\bigr) \subset \sum_{p+q=n} U_\hbar\mathfrak{n}_+(\tau)^{[p]} \otimes U_\hbar\mathfrak{g}_+(\tau)^{[q]}. \]

Let us now fix x in U~ n+ (τ )[n] , and let us set 1(x) = 0 xp,i ∈ U~ n+ (τ )[p] , x00q,i ∈ U~ g+ (τ )[q] . Expand `λ (x) as X

`λ (x) =

P i,p+q=n

x0p,i ⊗ x00q,i , with

2;q hFλ−2p~ , id ⊗ x0q,i ihFλ1;p , id ⊗ x00p,i i

i,p+q=n

X

=

i,p+q=n

X

=

i,p+q=n

(e) 0 00 5(e) +,λ−2~−2p~ (xq,i )5−,λ−2~ (hF, id ⊗ xq,i i) (e) 0 00 5(e) +,λ−2~−2p~ (xq,i )(5−,λ−2~ ◦ π)(xq,i );

(70)

the first equality follows from the Hopf algebra pairing rules, the second one from Lemma 2.2, 1), 2), and the third one from the same Lemma, 3). This formula enables us to show: Lemma 4.2. `λ is a left U+(e) (τ )-module map. Proof. Recall that the product map defines a linear isomorphism from U~ h+ (τ ) ⊗ U+(e) ⊗ (e) onto U~ g+ (τ ). Define U~ b+ (τ ) as the image of U~ h+ (τ ) ⊗ U+(e) ⊗ 1 by this map. Uλ−2~,− U~ b+ (τ ) is then a subalgebra of U~ g+ (τ ). On the other hand, since for r ∈ O, 1(e[r]) is equal to he(z) ⊗ K + (z), riK + 1 ⊗ e[r], it belongs to U~ n+ (τ ) ⊗ U~ b+ (τ ). It follows that 1(U+(e) ) ⊂ U~ n+ (τ ) ⊗ U~ b+ (τ ). 5(e) −,λ−2~ ◦ π is now defined as follows: to x ∈ U~ g+ (τ ) decomposed as X

hi x+i x− i ,

with

(e) hi ∈ U~ h+ (τ ), x+i ∈ U+(e) , x− i ∈ Uλ−2~,− ,

i

it associates 5(e) −,λ−2~ (

P i

ε(hi )x+i x− i ), that is

P i

ε(hi x+i )x− i . Therefore, it satisfies

(e) (5(e) −,λ−2~ ◦ π)(bx) = ε(b)(5−,λ−2~ ◦ π)(x),

(71)

for x ∈ U~ g+ (τ ), b ∈ U~ b+ (τ ). P Let us fix now x ∈ U~ n+ (τ )[n] , b ∈ U+(e),m . Let us set 1(x) = i,p+q=n x0p,i ⊗ x00q,i , P 0 00 0 [p0 ] , b00q0 ,j ∈ x0p,i , x00q,i as above, and 1(b) = j,p0 +q 0 =m bp0 ,j ⊗ bq 0 ,j , bp0 ,j ∈ U~ n+ (τ ) 0 0 0 U~ b+ (τ )[q ] (where U~ b+ (τ )[q ] is the intersection of U~ b+ (τ ) and U~ g+ (τ )[q ] ). Then X (e) 0 0 00 00 `λ (bx) = 5(e) +,λ−2~−2(p+p0 )~ (bq 0 ,j xq,i )(5−,λ−2~ ◦ π)(bp0 ,j xp,i ) i,j,p+q=n,p0 +q 0 =m

=

X

i,j,p+q=n,p0 +q 0 =m

(e) 0 0 00 00 5(e) +,λ−2~−2(p+p0 )~ (bq 0 ,j xq,i )ε(bp0 ,j )(5−,λ−2~ ◦ π)(xp,i ),

where the first equality follows from (70), and the second one from (71). If p0 6= 0, ε(b00p0 ,j ) P P 0 00 vanishes, so that j b0m,j ε(b000,j ) = j,p0 +q 0 =m bq 0 ,j ε(bp0 ,j ) = b. Therefore `λ (bx) is equal to


X i,j,p+q=n

=

(e) 0 00 5(e) +,λ−2~−2p~ (bxq,i )(5−,λ−2~ ◦ π)(xp,i )

X

i,j,p+q=n

(e) 0 00 b5(e) +,λ−2~−2p~ (xq,i )(5−,λ−2~ ◦ π)(xp,i ) = b`λ (x),

(e) where the first equality follows from the fact that 5(e) −,λ−2~ is a left U+ -module map. P ¯ = i,p+q=n x¯ 0p,i ⊗ x¯ 00q,i , with x¯ 0p,i ∈ U~ n+ (τ )[p] , Set now for x in U~ n+ (τ )[n] , 1(x) x¯ 00q,i ∈ U~ g¯ + (τ )[q] . Then X 2;q `λ (x) = hFλ−2p~ , id ⊗ x¯ 00q,i i0 hFλ1;p , id ⊗ x¯ 0p,i i0 i,p+q=n

=

X

i,p+q=n

=

X

i,p+q=n

5(e) ¯ 00q,i i0 )5(e) ¯ 0p,i ) +,λ−2p~−2~ (hF, id ⊗ x −,λ−2~ (x 0 (5(e) ¯ 00q,i )5(e) ¯ 0p,i ), +,λ−2p~−2~ ◦ π )(x −,λ−2~ (x

(72)

using the Hopf algebra pairing rules, and Lemma 2.2, 3). Before we use this identity to derive a result analogous to Lemma 2.7 of [9], we will show the following results. −

0 Lemma 4.3. For any , 0 ∈ K, the product (q −h )[]e− λ [ ] is equal to some combinaP − − tion i eλ+2~ [ηi0 ](q −h )[ηi ], for certain ηi , ηi0 ∈ K.

Proof. Recall that in (9), the ratio of theta-functions should be expanded for the argument of K − near 0; therefore, we have X θ(w + ~K + ~) (−z)α −h− h− α e(w). )(z)e(w)(q )(z) = (∂/∂w) (q θ(w + ~K − ~) α! α≥0

It follows that for 0 ∈ K, we have −

−

(q −h )(z)e[0 ](q h )(z) =

X (−z)α θ(w + ~K + ~) e 0 (w)(∂/∂w)α , α! θ(w + ~K − ~)

α≥0

therefore we have for ∈ K, X − − (−z)α θ(w + ~K + ~) ]. e 0 (w)(∂/∂w)α (q h )[(z) (q −h )[]e[0 ] = α! θ(w + ~K − ~) α≥0

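Now if $\epsilon'$ belongs to $L_\lambda$, the products $\epsilon'(w)(\partial/\partial w)^\alpha \frac{\theta(w+\hbar K+\hbar)}{\theta(w+\hbar K-\hbar)}$ belong to $L_{\lambda+2\hbar}$. The lemma follows.

Lemma 4.4. For $x \in U_\hbar\mathfrak{n}_+(\tau)$, $\epsilon \in K$, we have:
1) $\Pi^{(e)}_{-,\lambda}(x e^-_{-\lambda}[\epsilon]) = \Pi^{(e)}_{-,\lambda}(x)\, e^-_{-\lambda}[\epsilon]$;
2) $\Pi^{(e)}_{+,\lambda}(x e^-_{-\lambda}[\epsilon]) = 0$.

Proof. This follows directly from the definitions of $\Pi^{(e)}_{\pm,\lambda}$.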

Lemma 4.5. We have the identities − `λ+2~ (xe− −λ []) = `λ (x)e−λ [],

for x ∈ U~ n+ (τ ), ∈ K.

P ¯ Proof. Let us fix x ∈ U~ n+ (τ )[n] , and set as above 1(x) = i,p+q=n x¯ 0p,i ⊗ x¯ 00q,i , with x¯ 0p,i ∈ U~ n+ (τ )[p] , x¯ 00q,i ∈ U~ g¯ + (τ )[q] . From formula (26) follows that x¯ 00q,i can be P decomposed as a sum j yq,i,j hq,i,j , where yq,i,j belongs to U~ n+ (τ )[q] and hq,i,j is −

−

a linear combination of products of the form (q −h )[1 ] · · · (q −h )[p ], with i ∈ K − − (where we denote hq −h (z) , η(z)iK by (q −h )[η]). Let now belong to K; we have X − 0 −h− (z) ¯ 0 (xe− []) = y h ⊗ x ¯ ⊗ e(z), p ()i + e [] ⊗ 1 . hq 1 q,i,j q,i,j −λ K p,i −λ −λ i,j,p+q=n

Let us set

hq −h

−

(z)

⊗ e(z), p−λ ()iK =

X

−

(q −h )[s ] ⊗ e[ρs ()],

s∈Z

where ρs is a family of linear endomorphisms of K. According to (72), we have `λ+2~ (xe− −λ []) X (e) − = ¯ 0p,i e[ρs ()]) 5+,lλ−2p~−2~ ◦ π 0 (yq,i,j hq,i,j (q −h )[s ])5(e) −,λ (x s,i,j,p+q=n

+

X

s,i,j,p+q=n

(73)

(e) 0 5(e) (yq,i,j hq,i,j e− ¯ 0p,i ). −λ [])5−,λ (x +,lλ−2p~ ◦ π

Using the property of π 0 that π 0 (xt) = π 0 (x)ε(t), for x ∈ U~ g− (τ ), t ∈ U~ h− (τ ), we write the first sum of the r.h.s. of (73) as X (e) − 5+,λ−2p~−2~ ◦ π 0 (yq,i,j hq,i,j )ε((q −h )[s ])5(e) ¯ 0p,i e[ρs ()]) −,λ (x (74) s,i,j,p+q=n X (e) (e) − 0 0 5+,λ−2p~−2~ ◦ π (yq,i,j hq,i,j )5−,λ (x¯ p,i e−λ []) = i,j,p+q=n

=

X

i,j,p+q=n

0 ◦ π ¯ 0p,i )e− 5(e) (yq,i,j hq,i,j )5(e) +,λ−2p~−2~ −λ [] −,λ (x

= `λ (x)e− −λ [];

(75)

here the first equality follows from the properties of ε, and the second one from Lemma 4.4, 1). According to Lemma 4.3, each product hq,i,j e− −λ [] can be written as a sum X e− −λ+2p~ [t ]hq,i,j,t , t∈Z

with hq,i,j,t ∈ U~ h− (τ ). It follows that the second sum of the r.h.s. of (73) can be written as


X t,s,i,j,p+q=n

But

(e) 0 ◦ π ¯ 0p,i ). 5(e) (yq,i,j e− +,λ−2p~ −λ+2p~ [t ]hq,i,j,t )5−,λ (x

(76)

0 5(e) ◦ π (yq,i,j e− −λ+2p~ [t ]hq,i,j,t ) +,λ−2p~ − = 5(e) +,λ−2p~ (yq,i,j e−λ+2p~ [t ])ε(hq,i,j,t ) = 0

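by Lemma 4.4, 2). Therefore (76) vanishes. The lemma follows from this and (74).

We are now in position to show that for any $\lambda \in \mathbb{C} - \Gamma$, $x \in U_\hbar\mathfrak{n}_+$,
\[ \ell_\lambda(x) = x. \tag{77} \]
Using Prop. 1.2, decompose $x$ as a sum $\sum_{i,p,q} x^+_{p,i} x^-_{q,i}$, with $x^+_{p,i}$ in $U^{(e);p}_+$, $x^-_{q,i}$ in $U^{(e);q}_{\lambda,-}$. By Lemma 4.2, $\ell_\lambda(x)$ is equal to $\sum_{i,p,q} x^+_{p,i}\, \ell_\lambda(x^-_{q,i})$; and by Lemma 4.5, this last expression is equal to $\sum_{i,p,q} x^+_{p,i}\, \ell_{\lambda-2q\hbar}(1)\, x^-_{q,i}$. We easily check that for any $\lambda \in \mathbb{C} - \Gamma$, $\ell_\lambda(1) = 1$. Equation (77) follows. The proposition now follows from the comparison of (77) and Lemma 2.2, and from the fact that $\langle\ ,\ \rangle_{U_\hbar\mathfrak{n}_\pm(\tau)}$ is non-degenerate.

We can now obtain another decomposition of $F$:

Corollary 4.1. There is a unique decomposition of $F$ as
\[ F = F^2_\lambda F^1_\lambda, \quad \text{with } F^1_\lambda \in A^{-+} \text{ and } F^2_\lambda \in A^{+-}, \tag{78} \]
with $(\varepsilon \otimes 1)(F^i_\lambda) = (1 \otimes \varepsilon)(F^i_\lambda) = 1$, $i = 1, 2$.

Proof. Set
\[ F^2_\lambda = \sum_{q\ge 0}\sum_{\alpha\ge 0} (\partial/\partial\lambda)^\alpha (F^{2;q}_\lambda)\, \frac{(\hbar h^{(2)})^\alpha}{\alpha!}, \]
and
\[ F^1_\lambda = \sum_{p\ge 0}\sum_{\alpha\ge 0} (\partial/\partial\lambda)^\alpha (F^{1;p}_\lambda)\, \frac{(\hbar h^{(2)})^\alpha}{\alpha!}; \tag{79} \]
$F^1_\lambda$ and $F^2_\lambda$ belong respectively to $A^{-+}$ and $A^{+-}$. Since we have also
\[ F^2_\lambda = \sum_{q\ge 0}\sum_{\alpha\ge 0} \partial_\lambda^\alpha (F^{2;q}_{\lambda-2p\hbar})\, \frac{(\hbar(h^{(2)} + 2p))^\alpha}{\alpha!}, \]
we can write
\[ F^2_\lambda F^1_\lambda = \sum_{p,q\ge 0}\sum_{\alpha\ge 0} \partial_\lambda^\alpha (F^{2;q}_{\lambda-2p\hbar} F^{1;p}_\lambda)\, \frac{(\hbar h^{(2)})^\alpha}{\alpha!} = F. \]

Let us now prove the unicity of the decomposition (78). Let $(F'^1_\lambda, F'^2_\lambda)$ be some other solution to (78). Then, by Prop. 3.1, we will have $F'^2_\lambda = F^2_\lambda u$, $F'^1_\lambda = u^{-1} F^1_\lambda$, with $u$ some invertible element of $\mathrm{Hol}(\mathbb{C} - \Gamma, 1 \otimes \mathbb{C}[h][[\hbar]])$. On the other hand, $(\varepsilon \otimes 1)(F^2_\lambda) = (\varepsilon \otimes 1)(F'^2_\lambda) = 1$ implies that $u = 1$.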


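Lemma 4.6. We have an expansion
\[ F^1_\lambda \in 1 + \hbar \sum_{i\ge 0} e^{-(1)}_{-\lambda-\hbar h^{(2)}}[e_{i;-\lambda}]\, f^{(2)}[e^i] + U_\hbar\mathfrak{n}_+(\tau)^{\ge 2} \otimes U_\hbar\mathfrak{n}_-(\tau)^{\ge 2}\, \mathbb{C}[h], \]
where $U_\hbar\mathfrak{n}_\pm(\tau)^{\ge 2} = \oplus_{i\ge 2} U_\hbar\mathfrak{n}_\pm(\tau)^{[i]}$.

Proof. This follows from formulas (79), (69), the expansion $F \in 1 + \hbar \sum_i e[\epsilon^i] \otimes f[\epsilon_i] + U_\hbar\mathfrak{n}_+(\tau)^{\ge 2} \otimes U_\hbar\mathfrak{n}_-(\tau)^{\ge 2}$ (see [12]), and from the fact that $\Pi^{(e)}_{-,\lambda-2\hbar}$ maps each $U_\hbar\mathfrak{n}_+(\tau)^{[i]}$ to itself.

5. Twisted Cocycle Property

The main result of this section is that $F^1_\lambda$ satisfies the twisted cocycle equation. Let us define, for $\lambda \in \mathbb{C} - \Gamma$,
\[ \Phi_\lambda = F^{1(12)}_{\lambda+\hbar h^{(3)}}\, (\Delta \otimes 1)(F^1_\lambda)\, \bigl[ F^{1(23)}_\lambda\, (1 \otimes \Delta)(F^1_\lambda) \bigr]^{-1}. \]

Proposition 5.1. The family $(\Phi_\lambda)_{\lambda\in\mathbb{C}-\Gamma}$ belongs to $A^{-,\cdot,\cdot} \cap A^{\cdot,\cdot,+}$.

Proof. First observe that if $(\varphi_\lambda)_{\lambda\in\mathbb{C}-\Gamma}$ belongs to $A^{-+}$, then $(\varphi^{(12)}_{\lambda+\hbar h^{(3)}})_{\lambda\in\mathbb{C}-\Gamma}$ and $(\varphi^{(23)}_\lambda)_{\lambda\in\mathbb{C}-\Gamma}$ belong to $A^{-,\cdot,\cdot} \cap A^{\cdot,\cdot,+}$; this follows easily from the definitions of these algebras. It follows that $(F^{1(12)}_{\lambda+\hbar h^{(3)}})_{\lambda\in\mathbb{C}-\Gamma}$ and $(F^{1(23)}_\lambda)_{\lambda\in\mathbb{C}-\Gamma}$ also belong to $A^{-,\cdot,\cdot} \cap A^{\cdot,\cdot,+}$. By (64), the families $(\Delta \otimes 1)(F^1_\lambda)$ and $(1 \otimes \Delta)(F^1_\lambda)$ also belong to $A^{-,\cdot,\cdot} \cap A^{\cdot,\cdot,+}$. Since $A^{-,\cdot,\cdot} \cap A^{\cdot,\cdot,+}$ is an algebra with the property that any $x \in A^{-,\cdot,\cdot} \cap A^{\cdot,\cdot,+}$, invertible in $\mathrm{Hol}(\mathbb{C} - \Gamma, A^{\otimes 3})$, is such that $x^{-1}$ belongs to $A^{-,\cdot,\cdot} \cap A^{\cdot,\cdot,+}$, $(\Phi_\lambda)_{\lambda\in\mathbb{C}-\Gamma}$ belongs to $A^{-,\cdot,\cdot} \cap A^{\cdot,\cdot,+}$.

Using (30), we may rewrite $\Phi_\lambda$ as
\[ \Phi_\lambda = \bigl[ (\bar\Delta \otimes 1)(F^2_\lambda)\, F^{2(12)}_{\lambda+\hbar h^{(3)}} \bigr]^{-1} (1 \otimes \bar\Delta)(F^2_\lambda)\, F^{2(23)}_\lambda. \]

Proposition 5.2. $\Phi_\lambda \in A^{+,\cdot,\cdot} \cap A^{\cdot,\cdot,-}$.

Proof. We now remark that if $(\psi_\lambda)_{\lambda\in\mathbb{C}-\Gamma}$ belongs to $A^{+-}$, then $(\psi^{(12)}_{\lambda+\hbar h^{(3)}})_{\lambda\in\mathbb{C}-\Gamma}$ and $(\psi^{(23)}_\lambda)_{\lambda\in\mathbb{C}-\Gamma}$ both belong to $A^{+,\cdot,\cdot} \cap A^{\cdot,\cdot,-}$. It follows that $(F^{2(12)}_{\lambda+\hbar h^{(3)}})_{\lambda\in\mathbb{C}-\Gamma}$ and $(F^{2(23)}_\lambda)_{\lambda\in\mathbb{C}-\Gamma}$ also belong to $A^{+,\cdot,\cdot} \cap A^{\cdot,\cdot,-}$. By (67), the families $(\bar\Delta \otimes 1)(F^2_\lambda)$ and $(1 \otimes \bar\Delta)(F^2_\lambda)$ also belong to $A^{+,\cdot,\cdot} \cap A^{\cdot,\cdot,-}$. Since $A^{+,\cdot,\cdot} \cap A^{\cdot,\cdot,-}$ also has the property that any $x \in A^{+,\cdot,\cdot} \cap A^{\cdot,\cdot,-}$, invertible in $\mathrm{Hol}(\mathbb{C} - \Gamma, A^{\otimes 3})$, is such that $x^{-1}$ belongs to $A^{+,\cdot,\cdot} \cap A^{\cdot,\cdot,-}$, $(\Phi_\lambda)_{\lambda\in\mathbb{C}-\Gamma}$ belongs to $A^{+,\cdot,\cdot} \cap A^{\cdot,\cdot,-}$.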

From the two above propositions it follows that we have
\[ \Phi_\lambda = \sum_{i\ge 0} 1 \otimes a^{(i)}_\lambda \otimes h^i, \tag{80} \]
for a certain family $a^{(i)}_\lambda$ of elements of $A(\tau)$, commuting with $h$. Let us define now


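$\Delta_\lambda = \mathrm{Ad}(F^1_\lambda) \circ \Delta$; this is a family of algebra morphisms from $A(\tau)$ to $A(\tau)^{\otimes 2}$, depending on $\lambda \in \mathbb{C} - \Gamma$ in a holomorphic way. Then we have the twisted quasi-Hopf condition
\[ (\Delta_{\lambda+\hbar h^{(3)}} \otimes 1) \circ \Delta_\lambda = \mathrm{Ad}(\Phi_\lambda) \circ (1 \otimes \Delta_\lambda) \circ \Delta_\lambda. \tag{81} \]

Proposition 5.3. $(\Delta_\lambda)_{\lambda\in\mathbb{C}-\Gamma}$ and $(\Phi_\lambda)_{\lambda\in\mathbb{C}-\Gamma}$ satisfy the compatibility condition
\[ (\Delta_{\lambda+\hbar(h^{(3)}+h^{(4)})} \otimes 1 \otimes 1)(\Phi_\lambda)\, (1 \otimes 1 \otimes \Delta_\lambda)(\Phi_\lambda) = \Phi^{(123)}_{\lambda+\hbar h^{(4)}}\, (1 \otimes \Delta_{\lambda+\hbar h^{(4)}} \otimes 1)(\Phi_\lambda)\, \Phi^{(234)}_\lambda. \tag{82} \]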

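Proof. The proof is a straightforward computation. We can understand it in the following way. For $(V_i, \rho_i)_{i=1,2}$ two $A(\tau)$-modules (that is, we are given algebra morphisms $\rho_i$ from $A(\tau)$ to $\mathrm{End}(V_i)$), define the family of $A(\tau)$-modules $(V_1 \otimes V_2, \rho_1 \otimes_\lambda \rho_2)_{\lambda\in\mathbb{C}-\Gamma}$ (sometimes simply denoted by $V_1 \otimes_\lambda V_2$) by $\rho_1 \otimes_\lambda \rho_2 = (\rho_1 \otimes \rho_2) \circ \Delta_\lambda$. Then for $(V_i, \rho_i)_{i=1,2,3}$ three $A(\tau)$-modules, (81) implies that the image of $\Phi_\lambda$ by $\otimes_{i=1}^3 \rho_i$ is an intertwiner of $A(\tau)$-modules from $V_1 \otimes_\lambda (V_2 \otimes_\lambda V_3)$ to $(V_1 \otimes_{\lambda+\hbar h^{(3)}} V_2) \otimes_\lambda V_3$. For $(V_i, \rho_i)_{1\le i\le 4}$ four $A(\tau)$-modules, we then have the sequence of $A(\tau)$-module morphisms
\[ V_1 \otimes_\lambda (V_2 \otimes_\lambda (V_3 \otimes_\lambda V_4)) \to V_1 \otimes_\lambda ((V_2 \otimes_{\lambda+\hbar h^{(4)}} V_3) \otimes_\lambda V_4) \to (V_1 \otimes_{\lambda+\hbar h^{(4)}} (V_2 \otimes_{\lambda+\hbar h^{(4)}} V_3)) \otimes_\lambda V_4 \to ((V_1 \otimes_{\lambda+\hbar(h^{(3)}+h^{(4)})} V_2) \otimes_{\lambda+\hbar h^{(4)}} V_3) \otimes_\lambda V_4 \]
given by the image by $\otimes_{i=1}^4 \rho_i$ of the r.h.s. of (82), and the sequence
\[ V_1 \otimes_\lambda (V_2 \otimes_\lambda (V_3 \otimes_\lambda V_4)) \to (V_1 \otimes_{\lambda+\hbar(h^{(3)}+h^{(4)})} V_2) \otimes_\lambda (V_3 \otimes_\lambda V_4) \to ((V_1 \otimes_{\lambda+\hbar(h^{(3)}+h^{(4)})} V_2) \otimes_{\lambda+\hbar h^{(4)}} V_3) \otimes_\lambda V_4 \]
given by the image by $\otimes_{i=1}^4 \rho_i$ of the l.h.s. of the same equation; these sequences of morphisms coincide.

We are ready to conclude:

Theorem 5.1. We have $\Phi_\lambda = 1$. Therefore
\[ F^{1(12)}_{\lambda+\hbar h^{(3)}}\, (\Delta \otimes 1)(F^1_\lambda) = F^{1(23)}_\lambda\, (1 \otimes \Delta)(F^1_\lambda). \tag{83} \]

Proof. After substitution of (80), the l.h.s. of (82) becomes the product of
\[ \sum_{i\ge 0} 1 \otimes 1 \otimes a^{(i)}_\lambda \otimes h^i \quad \text{and} \quad \sum_{i\ge 0} 1 \otimes a^{(i)}_\lambda \otimes (h \otimes 1 + 1 \otimes h)^i; \]
since $h$ commutes with the $a^{(i)}_\lambda$, it follows that these two terms commute. Therefore (82) simplifies to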


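\[ (1 \otimes 1 \otimes \Delta_\lambda)(\Phi_\lambda) = (\Phi_{\lambda+\hbar h^{(4)}} \otimes 1)\, (1 \otimes \Delta_{\lambda+\hbar h^{(4)}} \otimes 1)(\Phi_\lambda). \]
Apply now $1 \otimes \varepsilon \otimes 1 \otimes 1$ to this identity. Since $(1 \otimes \varepsilon \otimes 1)(\Phi_\lambda) = 1$, the first and second term map to 1. On the other hand, since $(\varepsilon \otimes 1) \circ \Delta_\lambda = \mathrm{id}$, the last term maps to $\Phi_\lambda$. Therefore, $\Phi_\lambda = 1$.

Remark 5. It would be interesting to find some analogue for the $M(\lambda)$ of [3], that is some family of elements of $A(\tau)$ whose twisted coboundary would be $F_\lambda$. This element should belong to some completion of $A(\tau)$, and for $\hbar = 0$ coincide with a “longest Weyl group element of the affine algebra”.

6. Dynamical Yang–Baxter Equation

The paper [3], Sect. 2, contains the following result:

Proposition 6.1 (see [3]). Let $(A, \Delta^A_\infty, R^A_\infty)$ be a quasi-triangular Hopf algebra, with a fixed element $\tilde h$. Let $F(\lambda)$ be a family of invertible elements of $A \otimes A$, parametrized by some subset $U \subset \mathbb{C}$. Set $\Delta(\lambda) = \mathrm{Ad}(F(\lambda)) \circ \Delta^A_\infty$. Suppose that the identity
\[ F^{(12)}(\lambda + \hbar\tilde h^{(3)})\, (\Delta^A_\infty \otimes 1)(F(\lambda)) = F^{(23)}(\lambda)\, (1 \otimes \Delta^A_\infty)(F(\lambda)) \tag{84} \]
is satisfied. Then we have
\[ (\Delta(\lambda + \hbar\tilde h^{(3)}) \otimes 1) \circ \Delta(\lambda) = (1 \otimes \Delta(\lambda)) \circ \Delta(\lambda), \tag{85} \]
and if we set $R(\lambda) = F^{(21)}(\lambda)\, R^A_\infty\, F(\lambda)^{-1}$, we have the identity
\[ R^{(12)}(\lambda)\, R^{(13)}(\lambda + \hbar\tilde h^{(2)})\, R^{(23)}(\lambda) = R^{(23)}(\lambda + \hbar\tilde h^{(1)})\, R^{(13)}(\lambda)\, R^{(12)}(\lambda + \hbar\tilde h^{(3)}). \tag{86} \]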

e [12] (λ) as the linear map from A⊗2 to A⊗3 defined by Proof. Define 1 (12) e [12] (λ)(x ⊗ y) = F (12) (λ)(1A 1 (λ + ~e h(3) )−1 , ∞ ⊗ 1)(x ⊗ y)F

(87)

e [21] (λ)(x ⊗ y) = 1 e [12] (λ)(x ⊗ y)(213) . Then we have and 1 e [12] (λ)(x) = 1 e [21] (λ)(x)R(12) (λ + ~e h(3) ), R(12) (λ)1

x ∈ A⊗2

(88)

and e [12] (λ)(R(λ)) = R(13) (λ + ~e h(2) )R(23) (λ). 1

(89)

Applying (88) to x = R(λ) yields (86). e [23] (λ) by We could also define 1 (32) e [23] (x ⊗ y) = F (32) (λ + ~e 1 h(1) )(1 ⊗ 1A0 (λ)−1 , ∞ )(x ⊗ y)F

e [32] (T ) = 1 e [23] (T )(132) , for T ∈ A⊗2 ; then we have 1 e [23] (λ)(x)R(23) (λ) = R(23) (λ + 1 e [32] (T ), and ~e h(1) )1 e [23] (λ)(R(λ)) = R(12) (λ)R(13) (λ + ~e 1 h(2) ). e [23] (T ) = (1 e [12] (T (21)−1 ))(312)−1 . Note also the identity 1


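Identities (84), (85) and (86) are respectively called the twisted cocycle condition for the family $F(\lambda)$, the twisted coassociativity condition for $\Delta(\lambda)$, and the dynamical Yang–Baxter equation for $R(\lambda)$.

Theorem 6.1. Let us set in $A(\tau)^{\otimes 2}$,
\[ R_\infty = q^{D\otimes K}\, q^{\frac12 \sum_{i\ge 0} h[e^i] \otimes h[e_{i;0}]}\, F, \]
and for $\lambda \in \mathbb{C} - \Gamma$, $R_\lambda = (F^1_\lambda)^{(21)}\, R_\infty\, (F^1_\lambda)^{-1}$. Then the family $(R_\lambda)_{\lambda\in\mathbb{C}-\Gamma}$ satisfies the dynamical Yang–Baxter relation
\[ R^{(12)}_\lambda\, R^{(13)}_{\lambda+\hbar h^{(2)}}\, R^{(23)}_\lambda = R^{(23)}_{\lambda+\hbar h^{(1)}}\, R^{(13)}_\lambda\, R^{(12)}_{\lambda+\hbar h^{(3)}}. \tag{90} \]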

Proof. This follows directly from the above proposition and the fact that $(A(\tau), \Delta, R_\infty)$ is a quasi-triangular Hopf algebra (see [12]).

Remark 6. Let $\tilde\Delta^{[12]}_\lambda$ be the linear map from $A^{\otimes 2}$ to $A^{\otimes 3}$ defined by (87), and $\Delta^{[12]}_\lambda$ the map $\tilde\Delta^{[12]}_\lambda \otimes \mathrm{id}$ from $A^{\otimes 2} \otimes \mathrm{Diff}(\mathbb{C} - \Gamma)$ to $A^{\otimes 3} \otimes \mathrm{Diff}(\mathbb{C} - \Gamma)$. Here $\mathrm{Diff}(\mathbb{C} - \Gamma)$ is the ring of differential operators in $\lambda \in \mathbb{C} - \Gamma$. Let us set
\[ R = e^{-\hbar h^{(1)}\partial_\lambda}\, R_\lambda. \]
Then we have the simple relation $\Delta^{[12]}_\lambda(R) = R^{13} R^{23}$. The form of $R$ indicates that the algebra element $h[\theta'/\theta]$ can be naturally added with the derivative $\partial/\partial\lambda$. This indication could be useful for the study of twisted conformal blocks: in such a theory we need to add differential operators to the elements of $\mathfrak{g}_\lambda$.

7. Level 0 Representations of A(τ ), L-Operators and RLL Relations In [11], we studied the 2-dimensional representations, at level 0, of the quantum groups introduced there. In the case of the algebra A(τ ), these representations can be described as follows. Let us denote by Kζ the local field C((ζ)), by ∂ζ its derivation d/dζ, and by Kζ [∂ζ ] the associated ring of differential operators. Let (v1 , v−1 ) be the standard basis of C2 , and Eij the endomorphism of C2 defined by Eij (vα ) = δα,j vi . Proposition 7.1 (see [11], Prop. 9). There is a morphism of algebras πζ : A(τ ) → End(C2 ) ⊗ Kζ [∂ζ ][[~]], defined by the formulas πζ (K) = 0,

πζ (D) = Id ⊗∂ζ , C2

2 2 r (ζ) − E ⊗ r (ζ), r ∈ O, −1−1 1 + q∂ 1 + q −∂ ∂ 1 − q −∂ q −1 πζ (h[λ]) = E11 ⊗ λ (ζ) − E−1−1 ⊗ λ (ζ), λ ∈ L0 , ~∂ ~∂ πζ (h[r]) = E11 ⊗

πζ (e[]) =

θ(~) E1,−1 ⊗ (ζ), ~

πζ (f []) = E−1,1 ⊗ (ζ),

∈ K.


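Lemma 7.1. The image of $R_{\lambda+\hbar}$ by $\pi_\zeta \otimes \pi_{\zeta'}$ is
\[ (\pi_\zeta \otimes \pi_{\zeta'})(R_{\lambda+\hbar}) = A(\zeta, \zeta')\, R^-(\zeta - \zeta', \lambda), \tag{91} \]
where
\begin{align*}
R^-(z, \lambda) = {} & E_{11} \otimes E_{11} + E_{-1,-1} \otimes E_{-1,-1} + \frac{\theta(z)}{\theta(z-\hbar)}\, E_{11} \otimes E_{-1,-1} \\
& + \frac{\theta(\lambda+\hbar)\theta(\lambda-\hbar)}{\theta(\lambda)^2}\, \frac{\theta(z)}{\theta(z-\hbar)}\, E_{-1,-1} \otimes E_{11} - \frac{\theta(z+\lambda)\theta(\hbar)}{\theta(z-\hbar)\theta(\lambda)}\, E_{1,-1} \otimes E_{-1,1} \\
& + \frac{\theta(z-\lambda)\theta(\hbar)}{\theta(z-\hbar)\theta(\lambda)}\, E_{-1,1} \otimes E_{1,-1}, \tag{92}
\end{align*}
and $A(\zeta, \zeta')$ is equal to $\exp\Bigl( \sum_{i\ge 0} \bigl( \frac{1}{\partial} \frac{q^\partial - 1}{q^\partial + 1}\, e^i \bigr)(\zeta)\, e_{i;0}(\zeta') \Bigr)$.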

Proof. Since the image by πζ and πζ 0 of U~ n± (τ )≥2 is zero, and by Lemma 4.6, this image is the same as that of ! P X −(2) 1 h[ei ]⊗h[ei;0 ] (1) i 1+~ e−λ−~h(1) [i ]f [ ] q D⊗K q 2 i∈Z i∈Z

 1 + ~

X i≥0

 −(2) . e(1) [i ]fλ+~h (2) −2~ [i ]

After we use the expansions X θ(z − w + λ) , for λ ∈ C − 0, ei;λ (z)ei (w) = θ(z − w)θ(λ) i≥0

and the identities

X

(g(∂)ei )(ζ)ei;0 (ζ 0 ) =

i≥0

X

X

ei;0 (z)ei (w) =

i≥0

θ0 (z − w), θ

ei (ζ)(g(−∂)ei;0 )(ζ 0 ),

i≥0

for g any polynomial in ∂, and ∂z θ(z − w + ~) q − 1 θ0 exp , (z − w) = ∂z θ(z − w) θ we find (πζ ⊗ πζ 0 )(Rλ ) =

θ(ζ 0 − ζ + λ − ~) A(ζ, ζ ) 1 + θ(~)(E−1,1 ⊗ E1,−1 ) 0 θ(ζ − ζ)θ(λ − ~) 0

·

E1,1 ⊗ E1,1 + E−1,−1 ⊗ E−1,−1 θ(ζ 0 − ζ) θ(ζ 0 − ζ − ~) E1,1 ⊗ E−1,−1 + E−1,−1 ⊗ E1,1 + 0 θ(ζ − ζ + ~) θ(ζ 0 − ζ) θ(ζ − ζ 0 + λ − ~) 1 − θ(~)(E1,−1 ⊗ E−1,1 ) ; θ(ζ − ζ 0 )θ(λ − ~) the lemma follows.

·


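Define $R^+(z, \lambda)$ as $R^-(z, \lambda)^{-1}$. We then have
\begin{align*}
R^+(z, \lambda) = {} & E_{11} \otimes E_{11} + E_{-1,-1} \otimes E_{-1,-1} + \frac{\theta(z)}{\theta(z+\hbar)}\, \frac{\theta(\lambda+\hbar)\theta(\lambda-\hbar)}{\theta(\lambda)^2}\, E_{11} \otimes E_{-1,-1} \\
& + \frac{\theta(z)}{\theta(z+\hbar)}\, E_{-1,-1} \otimes E_{11} + \frac{\theta(z+\lambda)\theta(\hbar)}{\theta(z+\hbar)\theta(\lambda)}\, E_{1,-1} \otimes E_{-1,1} \\
& - \frac{\theta(z-\lambda)\theta(\hbar)}{\theta(z+\hbar)\theta(\lambda)}\, E_{-1,1} \otimes E_{1,-1}. \tag{93}
\end{align*}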
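The relation between (92) and (93) can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: it models $\theta$ by a truncated series for the odd Jacobi theta function $\theta_1$ (only oddness and the three-term theta identity matter, since every matrix entry is a ratio), and the modular parameter `TAU` and the sample points are arbitrary choices, not values from the paper. On the basis vectors $v_1 \otimes v_1$ and $v_{-1} \otimes v_{-1}$ both matrices act as the identity, so it suffices to multiply the $2 \times 2$ blocks on the span of $v_1 \otimes v_{-1}$ and $v_{-1} \otimes v_1$.

```python
import cmath

TAU = 0.3 + 1.1j  # hypothetical modular parameter, Im(TAU) > 0


def theta(z, terms=12):
    """Truncated series for the odd Jacobi theta function theta_1(pi*z | TAU).

    Assumption: the paper's theta agrees with this up to normalization,
    which is irrelevant here because only ratios of theta values appear.
    """
    total = 0j
    for n in range(terms):
        weight = cmath.exp(1j * cmath.pi * TAU * (n + 0.5) ** 2)
        total += (-1) ** n * weight * cmath.sin((2 * n + 1) * cmath.pi * z)
    return 2 * total


def block_minus(z, lam, hbar):
    """Block of R^-(z, lam) from (92) on span(v1 (x) v-1, v-1 (x) v1)."""
    d = theta(z - hbar)
    mu = theta(lam + hbar) * theta(lam - hbar) / theta(lam) ** 2
    return [
        [theta(z) / d, -theta(z + lam) * theta(hbar) / (d * theta(lam))],
        [theta(z - lam) * theta(hbar) / (d * theta(lam)), mu * theta(z) / d],
    ]


def block_plus(z, lam, hbar):
    """Block of R^+(z, lam) from (93) on the same two basis vectors."""
    d = theta(z + hbar)
    mu = theta(lam + hbar) * theta(lam - hbar) / theta(lam) ** 2
    return [
        [mu * theta(z) / d, theta(z + lam) * theta(hbar) / (d * theta(lam))],
        [-theta(z - lam) * theta(hbar) / (d * theta(lam)), theta(z) / d],
    ]


def rplus_rminus_defect(z, lam, hbar):
    """Largest entry of |R^+ R^- - 1| on the 2x2 block."""
    p, m = block_plus(z, lam, hbar), block_minus(z, lam, hbar)
    prod = [[sum(p[i][k] * m[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
    return max(abs(prod[i][j] - (1 if i == j else 0))
               for i in range(2) for j in range(2))


if __name__ == "__main__":
    print(rplus_rminus_defect(0.31 + 0.07j, 0.23 - 0.05j, 0.11 + 0.02j))
```

The diagonal entries of the product reduce to the three-term identity $\theta(z+\hbar)\theta(z-\hbar)\theta(\lambda)^2 = \theta(z)^2\theta(\lambda+\hbar)\theta(\lambda-\hbar) + \theta(z+\lambda)\theta(z-\lambda)\theta(\hbar)^2$, so the printed defect should be at the level of floating-point rounding.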

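Let us define now the L-operators as follows. Set
\[ L^+_\lambda(\zeta) = (1 \otimes \pi_\zeta)(R_{\lambda+\hbar}), \qquad L^-_\lambda(\zeta) = (1 \otimes \pi_\zeta)(R^{(21)}_{\lambda+\hbar}). \]
Using again the fact that $U_\hbar\mathfrak{n}_\pm(\tau)^{\ge 2}$ is mapped to zero by $\pi_\zeta$, we compute
\begin{align*}
L^+_\lambda(\zeta) &= \bigl[ 1 + \theta(\hbar) f^+_{\lambda+\hbar h-\hbar}(\zeta) \otimes E_{1,-1} \bigr] \bigl[ k^+(\zeta-\hbar) \otimes E_{1,1} + k^+(\zeta)^{-1} \otimes E_{-1,-1} \bigr] \bigl[ 1 + \hbar e^+_{-\lambda}(\zeta) \otimes E_{-1,1} \bigr] \\
&= \begin{pmatrix} 1 & \theta(\hbar) f^+_{\lambda+\hbar h-\hbar}(\zeta) \\ 0 & 1 \end{pmatrix} \begin{pmatrix} k^+(\zeta-\hbar) & 0 \\ 0 & k^+(\zeta)^{-1} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \hbar e^+_{-\lambda}(\zeta) & 1 \end{pmatrix}, \tag{94}
\end{align*}
and
\begin{align*}
L^-_\lambda(\zeta) &= \bigl[ 1 + \hbar e^-_{-\lambda}(\zeta) \otimes E_{-1,1} \bigr]\, q^{K\partial_\zeta}\, \bigl[ k^-(\zeta-\hbar) \otimes E_{1,1} + k^-(\zeta)^{-1} \otimes E_{-1,-1} \bigr] \bigl[ 1 + \theta(\hbar) f^-_{\lambda+\hbar h-\hbar}(\zeta) \otimes E_{1,-1} \bigr] \\
&= q^{K\partial_\zeta}\, \mathcal{L}^-_\lambda(\zeta),
\end{align*}
where
\[ \mathcal{L}^-_\lambda(\zeta) = \begin{pmatrix} 1 & 0 \\ \hbar e^-_{-\lambda}(\zeta - K\hbar) & 1 \end{pmatrix} \begin{pmatrix} k^-(\zeta-\hbar) & 0 \\ 0 & k^-(\zeta)^{-1} \end{pmatrix} \begin{pmatrix} 1 & \theta(\hbar) f^-_{\lambda+\hbar h-\hbar}(\zeta) \\ 0 & 1 \end{pmatrix}. \tag{95} \]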

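Theorem 7.1. The matrices $L^\pm_\lambda(\zeta)$ defined by (94) and (95) satisfy the relations
\[ R^+(\zeta - \zeta', \lambda + \hbar h)\, L^{+(1)}_\lambda(\zeta)\, L^{+(2)}_{\lambda+\hbar h^{(1)}}(\zeta') = L^{+(2)}_\lambda(\zeta')\, L^{+(1)}_{\lambda+\hbar h^{(2)}}(\zeta)\, R^+(\zeta - \zeta', \lambda), \tag{96} \]
\[ R^-(\zeta - \zeta', \lambda)\, L^{-(1)}_{\lambda+\hbar h^{(2)}}(\zeta)\, L^{-(2)}_\lambda(\zeta') = L^{-(2)}_{\lambda+\hbar h^{(1)}}(\zeta')\, L^{-(1)}_\lambda(\zeta)\, R^-(\zeta - \zeta', \lambda + \hbar h), \tag{97} \]
\[ L^{-(1)}_\lambda(\zeta)\, R^-(\zeta - \zeta', \lambda + \hbar h)\, L^{+(2)}_\lambda(\zeta') = L^{+(2)}_{\lambda+\hbar h^{(1)}}(\zeta')\, R^-(\zeta - \zeta' - K\hbar, \lambda)\, L^{-(1)}_{\lambda+\hbar h^{(2)}}(\zeta)\, \frac{A(\zeta, \zeta' + K\hbar)}{A(\zeta, \zeta')}. \tag{98} \]

Proof. It suffices to apply $\mathrm{id} \otimes \pi_\zeta \otimes \pi_{\zeta'}$, $\pi_\zeta \otimes \pi_{\zeta'} \otimes \mathrm{id}$ and $\pi_\zeta \otimes \mathrm{id} \otimes \pi_{\zeta'}$ to (90), after the change of $\lambda$ into $\lambda + \hbar$, to simplify the coefficient $A(\zeta, \zeta')$ of Lemma 7.1 (which is independent of $\lambda$), and to transfer the factors $q^{K\partial_\zeta}$ and $q^{K\partial_{\zeta'}}$ to the left.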


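Remark 7 (Connection of $A$ with $\varphi_K$). The function $A(\zeta, \zeta')$ of Lemma 7.1 satisfies the functional equation
\[ A(\zeta, \zeta')\, A(\zeta + \hbar, \zeta') = \frac{\theta(\zeta - \zeta')}{\theta(\zeta - \zeta' + \hbar)}; \]
after analytic continuation, we see that $A$ only depends on $\zeta - \zeta'$. The ratio $A(\zeta, \zeta' + K\hbar)/A(\zeta, \zeta')$ of relation (98) is then simply connected with the function $\varphi_K$ expressing the commutator $k^+(z) k^-(w) k^+(z)^{-1} k^-(w)^{-1}$ (relation (18)) by
\[ \frac{A(\zeta - K\hbar, \zeta')}{A(\zeta, \zeta')} = \varphi_K(\zeta' - \zeta - \hbar)^{-1}. \]
After analytic continuation in the arguments, $A(\zeta, \zeta')$ is a ratio of elliptic gamma-functions in the difference $\zeta - \zeta'$.

8. Elliptic Quantum Group $E_{\tau,\eta}(sl_2)$

8.1. Definition. Let us set $\eta = -\hbar/2$ and define $E_{\tau,\eta}(sl_2)$ as the algebra generated by $h$ and the $a_i(\lambda), b_i(\lambda), c_i(\lambda), d_i(\lambda)$, $i \ge 0$, $\lambda \in \mathbb{C} - \Gamma$, subject to the relations
\[ [h, a_i(\lambda)] = [h, d_i(\lambda)] = 0, \qquad [h, b_i(\lambda)] = -2 b_i(\lambda), \qquad [h, c_i(\lambda)] = 2 c_i(\lambda), \]
and if we set
\[ a(z, \lambda) = \sum_{i\ge 0} a_i(\lambda)\, e_{i,+\hbar h/2}(z), \qquad b(z, \lambda) = \sum_{i\ge 0} b_i(\lambda)\, e_{i,\lambda+\hbar(h-2)/2}(z), \]
\[ c(z, \lambda) = \sum_{i\ge 0} c_i(\lambda)\, e_{i,-\lambda-\hbar(h+2)/2}(z), \qquad d(z, \lambda) = \sum_{i\ge 0} d_i(\lambda)\, e_{i,-\hbar h/2}(z), \]
and
\[ L(z, \lambda) = \begin{pmatrix} a(z, \lambda) & b(z, \lambda) \\ c(z, \lambda) & d(z, \lambda) \end{pmatrix}, \tag{99} \]
the relations
\[ R^{+(12)}(z_1 - z_2, \lambda + \hbar h)\, L^{(1)}(z_1, \lambda)\, L^{(2)}(z_2, \lambda + \hbar h^{(1)}) = L^{(2)}(z_2, \lambda)\, L^{(1)}(z_1, \lambda + \hbar h^{(2)})\, R^{+(12)}(z_1 - z_2, \lambda) \tag{100} \]
and
\[ \mathrm{Det}(z, \lambda) = d(z - \hbar, \lambda)\, a(z, \lambda - \hbar) - b(z - \hbar, \lambda)\, c(z, \lambda - \hbar)\, \frac{\theta(\lambda + \hbar h + \hbar)}{\theta(\lambda + \hbar h)} = 1. \tag{101} \]
Here $R^+(z, \lambda) \in \mathrm{End}(\mathbb{C}^2 \otimes \mathbb{C}^2)$ is given by (93); we define $h^{(1)}$ as $(E_{11} - E_{-1,-1}) \otimes 1$ and $h^{(2)}$ as $1 \otimes (E_{11} - E_{-1,-1})$. We also define as before $g(\lambda + \hbar h)$ as $\sum_{\alpha\ge 0} (\partial_\lambda^\alpha g)(\lambda)\, (\hbar h)^\alpha/\alpha!$, and $g(\lambda + \hbar h^{(i)})$ as $\sum_{\alpha\ge 0} (\partial_\lambda^\alpha g)(\lambda)\, (\hbar h^{(i)})^\alpha/\alpha!$.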
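The shift notation $g(\lambda + \hbar h)$ is a Taylor series in the operator $\hbar h$: on a vector where $h$ acts by a scalar weight $\mu$, the series sums to the ordinary translate $g(\lambda + \hbar\mu)$. The following minimal sketch illustrates this for a polynomial $g$, where the series terminates and can be checked exactly with rational arithmetic. The function names and the sample polynomial are illustrative assumptions, not objects from the paper.

```python
from fractions import Fraction
from math import factorial


def derivative(coeffs):
    """Coefficients of g' for g(x) = sum_k coeffs[k] * x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of the polynomial with the given coefficients at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def taylor_shift(coeffs, lam, hbar, weight):
    """sum_{alpha >= 0} g^(alpha)(lam) * (hbar*weight)**alpha / alpha!.

    `weight` stands for the eigenvalue of h on a weight vector
    (hypothetical input; the paper treats h as an algebra element).
    """
    total = Fraction(0)
    current = list(coeffs)
    alpha = 0
    while current:  # derivatives of a polynomial eventually vanish
        total += evaluate(current, lam) * (hbar * weight) ** alpha / factorial(alpha)
        current = derivative(current)
        alpha += 1
    return total


if __name__ == "__main__":
    g = [1, 0, -2, 1]  # g(x) = 1 - 2x^2 + x^3, an arbitrary example
    lam, hbar = Fraction(1, 3), Fraction(1, 5)
    for mu in (-2, 0, 2):
        # Both numbers agree: the series is the translate g(lam + hbar*mu).
        print(mu, taylor_shift(g, lam, hbar, mu), evaluate(g, lam + hbar * mu))
```

For non-polynomial $g$, such as ratios of theta functions, the same series is understood as a formal power series in $\hbar$.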


Remark 8. The L-operator defined by (99) is 1-periodic in the variables $z$ and $\lambda$, and satisfies
\[ L(z + \tau, \lambda) = t_{\lambda+\hbar h}\, L(z, \lambda)\, t_\lambda^{-1}, \tag{102} \]
where $t_\lambda = \begin{pmatrix} e^{-i\pi\lambda} & 0 \\ 0 & e^{i\pi\lambda} \end{pmatrix}$. On the other hand, the R-matrix (93) satisfies the conditions
\[ R^+(z + \tau, \lambda) = t^{(1)}_\lambda\, R^+(z, \lambda)\, \bigl( t^{(1)}_{\lambda+\hbar h^{(2)}} \bigr)^{-1}. \tag{103} \]
The fact that the periodicity conditions (102) and (103) seem compatible leads us to conjecture that the algebra $E_{\tau,\eta}(sl_2)$ is a flat deformation of the function algebra of the group of holomorphic maps $L_{cl}$ from $(\mathbb{C} - \Gamma)^2$ to $SL_2(\mathbb{C})$, such that $L_{cl}(z + \tau, \lambda) = \mathrm{Ad}(t_\lambda)(L_{cl}(z, \lambda))$. Since the morphism $\Psi$ defined in Thm. 9.1 is obviously surjective, this algebra has “at least the size” of $U_\hbar\mathfrak{g}_O(\tau)$.

8.2. Connection with the usual formulas. The formulas defining the elliptic quantum groups in [16] involve an R-matrix different from (92). Let us explain their connection with the above formalism. Consider the ring $F[[\hbar]]$ of formal series in $\hbar$, with coefficients meromorphic functions in $\lambda$. Let us adjoin to it a square root $\theta^{1/2}(\lambda)$ of $\theta(\lambda)$. The new ring $F_{1/2}[[\hbar]]$ then contains a solution $\phi$ of the functional equation
\[ \frac{\phi(\lambda - \hbar)}{\phi(\lambda + \hbar)} = \frac{\theta(\lambda)}{\theta(\lambda + \hbar)}; \]
we have
\[ \phi(\lambda) = \theta^{1/2}(\lambda)\, \exp\Bigl( \frac{1}{2\partial} \tanh(\hbar\partial/2)\, \frac{\theta'}{\theta} \Bigr)(\lambda). \]
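The closed formula for $\phi$ can be checked by a short formal computation. The sketch below assumes the normalization $\phi(\lambda)\phi(\lambda-\hbar) = \theta(\lambda)$ (an assumption made here for illustration; it is one way to solve the functional equation and matches the identity used in Remark 10), writes $\partial = d/d\lambda$, and treats all operators as formal power series in $\hbar$.

```latex
\begin{align*}
\phi(\lambda)\,\phi(\lambda-\hbar) = \theta(\lambda)
 &\iff (1 + e^{-\hbar\partial}) \log\phi = \log\theta \\
 &\iff \log\phi = \frac{1}{1 + e^{-\hbar\partial}} \log\theta
   = \Bigl( \tfrac12 + \tfrac12 \tanh(\hbar\partial/2) \Bigr) \log\theta \\
 &\iff \phi(\lambda) = \theta^{1/2}(\lambda)\,
   \exp\Bigl( \frac{\tanh(\hbar\partial/2)}{2\partial}\, \frac{\theta'}{\theta} \Bigr)(\lambda),
\end{align*}
% using 1/(1+e^{-x}) = 1/2 + (1/2) tanh(x/2) and theta'/theta = \partial \log\theta.
% Dividing the normalization at \lambda by the same relation at \lambda + \hbar:
\[
\frac{\phi(\lambda-\hbar)}{\phi(\lambda+\hbar)}
 = \frac{\phi(\lambda)\phi(\lambda-\hbar)}{\phi(\lambda+\hbar)\phi(\lambda)}
 = \frac{\theta(\lambda)}{\theta(\lambda+\hbar)} .
\]
```

Since $\tanh$ is odd, $\tanh(\hbar\partial/2)/\partial$ is a genuine power series in $\partial$, so the exponent is a well-defined formal series in $\hbar$ with coefficients derivatives of $\theta'/\theta$.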

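Define for $u, v \in \mathbb{C}$, the quantities $\phi(\lambda + \hbar(uh + v))$ as
\[ \phi(\lambda)\, \exp\Bigl( \sum_{k\ge 1} \frac{(\hbar(uh + v))^k}{k!} \Bigl( \frac{\phi'}{\phi} \Bigr)^{(k-1)}(\lambda) \Bigr). \]
Note that any of the ratios $\frac{\phi(\lambda - \hbar(uh+v))}{\phi(\lambda - \hbar(u'h+v'))}$, for $u, v, u', v' \in \mathbb{C}$, belong to $F[h][[\hbar]]$.

Lemma 8.1. Let us set for $\lambda \in \mathbb{C} - \Gamma$,
\[ \bar a(z, \lambda) = \frac{\phi(\lambda + \hbar h)}{\phi(\lambda + \hbar)}\, a(z, \lambda), \qquad \bar b(z, \lambda) = \frac{\phi(\lambda + \hbar h)}{\phi(\lambda - \hbar)}\, b(z, \lambda), \tag{104} \]
\[ \bar c(z, \lambda) = \frac{\phi(\lambda + \hbar h)}{\phi(\lambda + \hbar)}\, c(z, \lambda), \qquad \bar d(z, \lambda) = \frac{\phi(\lambda + \hbar h)}{\phi(\lambda - \hbar)}\, d(z, \lambda); \tag{105} \]
let us set $\bar L(z, \lambda) = \begin{pmatrix} \bar a(z, \lambda) & \bar b(z, \lambda) \\ \bar c(z, \lambda) & \bar d(z, \lambda) \end{pmatrix}$. Define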


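\begin{align*}
\bar R(z, \lambda) = {} & E_{11} \otimes E_{11} + E_{-1,-1} \otimes E_{-1,-1} + \frac{\theta(\lambda-\hbar)\theta(z)}{\theta(\lambda)\theta(z+\hbar)}\, E_{11} \otimes E_{-1,-1} \\
& + \frac{\theta(\lambda+\hbar)\theta(z)}{\theta(\lambda)\theta(z+\hbar)}\, E_{-1,-1} \otimes E_{11} + \frac{\theta(\lambda+z)\theta(\hbar)}{\theta(\lambda)\theta(z+\hbar)}\, E_{1,-1} \otimes E_{-1,1} \\
& + \frac{\theta(-\lambda+z)\theta(\hbar)}{\theta(-\lambda)\theta(z+\hbar)}\, E_{-1,1} \otimes E_{1,-1}; \tag{106}
\end{align*}
then we have the relations (see [16])
\[ h\,\bar a(z, \lambda) = \bar a(z, \lambda)\,h, \qquad h\,\bar d(z, \lambda) = \bar d(z, \lambda)\,h, \tag{107} \]
\[ h\,\bar b(z, \lambda) = \bar b(z, \lambda)(h - 2), \qquad h\,\bar c(z, \lambda) = \bar c(z, \lambda)(h + 2), \tag{108} \]
\[ \bar R^{(12)}(z_1 - z_2, \lambda + \hbar h)\, \bar L^{(1)}(z_1, \lambda)\, \bar L^{(2)}(z_2, \lambda + \hbar h^{(1)}) = \bar L^{(2)}(z_2, \lambda)\, \bar L^{(1)}(z_1, \lambda + \hbar h^{(2)})\, \bar R^{(12)}(z_1 - z_2, \lambda). \tag{109} \]

Proof. We have
\[ \bar R(z, \lambda) = \phi(\lambda + \hbar h^{(2)})\, R^+(z, \lambda)\, \phi(\lambda + \hbar h^{(1)})^{-1}, \]
and
\[ \bar L(z, \lambda) = \phi(\lambda + \hbar h)\, L(z, \lambda)\, \phi(\lambda + \hbar h^{(1)})^{-1}. \]
Substitute these expressions in (100); simplifications show that the $\bar R(z, \lambda)$ and $\bar L(z, \lambda)$ satisfy (109).

Remark 9. The formulas of Lemma 8.1 only use functions of $F[[\hbar]]$, although their proof uses the extension to $F_{1/2}[[\hbar]]$.

Remark 10. In [16], the determinant is defined by the formula
\[ \mathrm{Det}(z, \lambda) = \frac{\theta(\lambda)}{\theta(\lambda + \hbar h)} \bigl( \bar d(z - \hbar, \lambda)\, \bar a(z, \lambda - \hbar) - \bar b(z - \hbar, \lambda)\, \bar c(z, \lambda - \hbar) \bigr). \]
This formula is equivalent to the first equation of (101), as one sees by inserting the expressions for $\bar a, \dots, \bar d$ in terms of $a, \dots, d$ and using the identity
\[ \frac{\phi(\lambda + \hbar h)\, \phi(\lambda + \hbar h - \hbar)}{\phi(\lambda)\, \phi(\lambda - \hbar)} = \frac{\theta(\lambda + \hbar h)}{\theta(\lambda)}. \]

Remark 11. By tensoring them with 1-dimensional representations, we can view the evaluation representations studied in [16] as representations of the factor algebra introduced in this paper by the relation $\mathrm{Det}(z, \lambda) = 1$. After expansion in series in $\eta = -\hbar/2$, the formulas defining the evaluation representations of [16] only have singularities for $\lambda \in \Gamma$ or $z - w \in \Gamma$. The effect of the tensoring with 1-dimensional representations is to multiply the matrix $L(w, \lambda)$ by a function $g_z(w)$ satisfying
\[ g_z(w - \hbar)\, g_z(w) = \frac{\theta(z - w + (\Lambda - 1)\eta)}{\theta(z - w - (\Lambda + 1)\eta)}, \qquad \eta = -\hbar/2; \]
this equation can be solved in a similar way to that for $\phi$, and we will find for $g_z(w)$ a formal series in $\hbar$ with coefficients functions of $z - w$ with only singularities for $z - w \in \Gamma$. Therefore the final representations can be viewed as representations of the algebras $E_{\tau,\eta}(sl_2)$, provided $w$ is considered as a formal variable at the origin (as is the case in [11]).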


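9. Quantum Currents for $E_{\tau,\eta}(sl_2)$

Theorem 9.1. There is a morphism $\Psi$ from $E_{\tau,\eta}(sl_2)$ to $U_\hbar\mathfrak{g}_O(\tau)$, defined by the formulas $\Psi(h) = h$,
\[ \Psi(a(z, \lambda)) = \hbar\theta(\hbar)\, f^+_{\lambda-\hbar+\hbar h}(z)\, k^+(z)^{-1}\, e^+_{-\lambda}(z) + k^+(z - \hbar), \tag{110} \]
\[ \Psi(b(z, \lambda)) = \theta(\hbar)\, f^+_{\lambda-\hbar+\hbar h}(z)\, k^+(z)^{-1}, \tag{111} \]
\[ \Psi(d(z, \lambda)) = k^+(z)^{-1}, \qquad \Psi(c(z, \lambda)) = \hbar\, k^+(z)^{-1}\, e^+_{-\lambda}(z). \tag{112} \]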

Proof. Let us first show that these formulas are generating series for images of the $a_i, b_i, c_i, d_i$, $i \ge 0$. For this, we note that their right-hand sides are holomorphic functions on $(\mathbb{C} - \Gamma)^2$ with values in $U_\hbar\mathfrak{g}_O(\tau)$, 1-periodic in $z$ and $\lambda$ and with the quasi-periodicity properties in $z$ and $\lambda$ described by (102). For the periodicity properties in $z$, this follows directly from (34), the equations
\[ k^+(z + 1) = k^+(z), \qquad k^+(z + \tau) = e^{-i\pi\hbar h}\, k^+(z), \]
which are proved in the same way as (44), and the commutation relations between $h$ and the $e^+_\lambda(z)$, $f^+_\lambda(z)$. By Thm. 7.1, $\Psi(a(z, \lambda))$, $\Psi(b(z, \lambda))$, $\Psi(c(z, \lambda))$ and $\Psi(d(z, \lambda))$ satisfy the relations (100). Finally, one computes that the image by $\Psi$ of the middle term of Eq. (101) is equal to 1. This ends the proof of the theorem.

Remark 12. Since $\Psi(d(z, \lambda))$ is independent of $\lambda$, it should be clear that $\Psi$ is not injective.

Remark 13. There is an algebra morphism from the tensor product $E_{\tau,\eta}(sl_2) \otimes \mathrm{Diff}(\mathbb{C} - \Gamma)$ to $E_{\tau,\eta}(sl_2)^{\otimes 2} \otimes \mathrm{Diff}(\mathbb{C} - \Gamma)$; the formulas for it are $\Delta(L(z, \lambda)) = L^{(13)}(z + \hbar h^{(2)}, \lambda)\, L^{(23)}(z, \lambda)$. It would be interesting to understand better the relation of this formula with (89).

Remark 14. Relations (96), (97) and (98) suggest defining double elliptic quantum groups generated by the matrices $L^\pm(z, \lambda)$, the derivation $D$ and the central element $K$, with the following functional properties: the $L^\pm(z, \lambda)$ are holomorphic functions in the variable $\lambda \in \mathbb{C} - \Gamma$; $L^+(z, \lambda)$ also depends holomorphically on $z \in \mathbb{C} - \Gamma$, and $L^-(z, \lambda)$ is a regular power series in $z$; the periodicity conditions for $L^\pm(z, \lambda)$ in $\lambda$ are the same as those for $L(z, \lambda)$, and the periodicity conditions for $L^+(z, \lambda)$ in $z$ are the same as those for $L(z)$; and satisfying relations (96), (97), (98) and $[D, L^\pm(z, \lambda)] = \partial L^\pm(z, \lambda)/\partial z$. This algebra should be, as is the case for $E_{\tau,\eta}(sl_2)$ with respect to $U_\hbar\mathfrak{g}_O(\tau)$, somewhat larger than $U_\hbar\mathfrak{g}(\tau)$.

Acknowledgement. This work was done during our stay at the “Semestre systèmes intégrables” organized at the Centre Emile Borel, Paris, UMS 839, CNRS/UPMC. We would like to express our thanks to its organizers for their invitation to this very stimulating meeting. We also would like to acknowledge discussions with O. Babelon, D. Bernard, C. Fronsdal, M. Jimbo, Ya. Pugai, V. Rubtsov and A. Varchenko on the subject of this work, and the referee for his careful reading of our manuscript.


References

1. Arutyunov, G.E., Chekhov, L.O., Frolov, S.A.: R-matrix quantization of the elliptic Ruijsenaars–Schneider model. q-alg/9612032
2. Avan, J., Babelon, O., Billey, E.: The Gervais–Neveu–Felder equation and quantum Calogero-Moser system. Commun. Math. Phys. 178, 281–99 (1996)
3. Babelon, O., Bernard, D., Billey, E.: A quasi-Hopf algebra interpretation of quantum 3j and 6j symbols and difference equations. q-alg/9511019, Phys. Lett. B. 375, 89–97 (1996)
4. Ding, J., Frenkel, I.B.: Isomorphism of two realizations of quantum affine algebras $U_q(\widehat{gl}_n)$. Commun. Math. Phys. 156, 277–300 (1993)
5. Ding, J., Iohara, K.: Generalization and deformation of Drinfeld quantum affine algebras. Preprint q-alg/9608002
6. Drinfeld, V.G.: A new realization of Yangians and quantized affine algebras. Sov. Math. Dokl. 36 (1988)
7. Drinfeld, V.G.: Quasi-Hopf algebras. Leningrad Math. J. 1:6, 1419–1457 (1990)
8. Enriquez, B.: Dynamical r-matrices for Hitchin systems in the Schottky parametrization. Preprint
9. Enriquez, B., Felder, G.: A construction of Hopf algebra cocycles for double Yangians. Preprint
10. Enriquez, B., Rubtsov, V.N.: Hitchin systems, higher Gaudin operators and r-matrices. Math. Res. Lett. 3, 343–357 (1996)
11. Enriquez, B., Rubtsov, V.N.: Quantum groups in higher genus and Drinfeld’s new realizations method ($sl_2$ case). Ann. Sci. Ec. Norm. Sup. 30, sér. 4, 821–846 (1997)
12. Enriquez, B., Rubtsov, V.N.: Quasi-Hopf algebras associated with $sl_2$ and complex curves. q-alg/9608005, to appear in Israel Jour. of Math.
13. Faddeev, L., Reshetikhin, N., Takhtajan, L.: Quantization of Lie groups and Lie algebras. Leningrad Math. J. 1, 178–201 (1989)
14. Felder, G.: Conformal field theory and integrable systems associated to elliptic curves. Proc. ICM Zürich 1994, Basel–Boston: Birkhäuser, 1994, pp. 1247–55; Elliptic quantum groups. Proc. ICMP Paris 1994, Cambridge, Ma: International Press, 1995, pp. 211–218
15. Felder, G., Tarasov, V., Varchenko, A.: Solutions of the QKZB equations and Bethe Ansatz I. Preprint q-alg/9606005
16. Felder, G., Varchenko, A.: On representations of the elliptic quantum group $E_{\tau,\eta}(sl_2)$, q-alg/9601003; Commun. Math. Phys. 181, 741–761 (1996)
17. Felder, G., Wieczerkowski, C.: Conformal field theory on elliptic curves and Knizhnik–Zamolodchikov–Bernard equations. hep-th/9411004, Commun. Math. Phys. 176, 133–162 (1996)
18. Felder, G.: The KZB equations on Riemann surfaces. Preprint hep-th/9609153, to appear in the proceedings of the 1995 Les Houches Summer School
19. Foda, O., Iohara, K., Jimbo, M., Miwa, T., Yan, H.: An elliptic algebra for $sl_2$. Preprint RIMS 974
20. Fronsdal, C.: Quasi-Hopf deformations of quantum groups. q-alg/9611028
21. Khoroshkin, S.: Central extension of the Yangian double. Preprint q-alg/9602031
22. Lukyanov, S., Pugai, Ya.: Multi-point local height probabilities in the integrable RSOS model. hep-th/9602074
23. Reshetikhin, N., Semenov-Tian-Shansky, M.: Central extensions of quantum current groups. Lett. Math. Phys. 19, 133–142 (1990)

Communicated by T. Miwa

Commun. Math. Phys. 195, 691 – 697 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

A Global Uniqueness Theorem for Stationary Black Holes

Gábor Etesi

Institute for Theoretical Physics, Eötvös University, Puskin u. 5–7, Budapest, H-1088, Hungary. E-mail: [email protected]

Received: 12 June 1997 / Accepted: 5 January 1998

Abstract: A global uniqueness theorem for stationary black holes is proved as a direct consequence of the Topological Censorship Theorem and the topological classification of compact, simply connected four-manifolds.

1. Introduction

There is a remarkable interplay between differential geometry, the theory of differential equations and the physics of gravitation in the famous proof of the uniqueness of stationary black holes. The first proof was given in a series of papers by Carter, Hawking, Israel and Robinson (for a survey see [8, 12]). In the eighties a very elegant shorter proof was discovered by Mazur [10], who found a hidden symmetry of the electromagnetic and gravitational fields. These very deep and difficult investigations were all devoted to the uniqueness problem of the metric on a suitable four-manifold carrying a Lorentzian asymptotically flat structure in the spirit of Penrose's description of infinity of spacetimes.

More recently physicists' efforts have addressed the topology of the event horizons of general (i.e. non-stationary) black holes. The first theorems were proven by Hawking [8, 9] and later by Gannon, Galloway [6, 7] and others. Based on the celebrated Topological Censorship Theorem of Friedman, Schleich and Witt [5] and using energy conditions, Chruściel and Wald gave a short proof that the event horizon of a stationary black hole in a "moment" is always a sphere. The question naturally arises of what one can say about the topology of the space-time itself in this case.

On the other hand, the final step in the understanding of four-manifolds making use of "classical" (i.e. non-physical) methods was done by Freedman in 1981, who gave a complete (topological) classification of compact simply connected four-manifolds.

In this paper, referring to the results of Chruściel–Wald and Freedman, we prove that a global uniqueness also holds for stationary black holes and more generally stationary

space-times, i.e. not only the metric but even the topology of the space-time in question is unique. Our method is based on a natural compactification of the space-time manifold and a careful study of a vector field extended to this compact manifold. Truly speaking, this is not a very surprising result in light of the local uniqueness. However, it demonstrates the power of the theorems mentioned above.

2. Vector Fields

First we define the precise notion of a stationary, asymptotically flat space-time containing a black hole, collecting the standard definitions. Let us summarize the properties of an asymptotically flat and empty space-time that we need; for the full definitions see [12], and the notion of an asymptotically empty and flat space-time can be found in [8].

Let $(M, g)$ be a space-time manifold and $x \in M$. Then $J^\pm(x)$ is called the causal future and past of $x$ respectively. If a space-time manifold $(M, g)$ is asymptotically flat and empty then there exists a conformal inclusion $i : (M, g) \to (\tilde{M}, \tilde{g})$ such that

• $\partial i(M) = \{i_0\} \cup I^+ \cup I^-$, where $i_0$ is the space-like infinity and the future and past null-infinities $I^\pm$ satisfy $I^\pm := \partial J^\pm(i_0) \setminus \{i_0\} = S^2 \times \mathbf{R}$;
• $\tilde{M} \setminus i(M) = J^+(i_0) \cup J^-(i_0)$;
• There exists a function $\Omega : \tilde{M} \to \mathbf{R}^+$ which is smooth everywhere (except possibly at $i_0$) satisfying $\tilde{g} = \Omega^2 i_* g$ and $\Omega|_{\partial i(M)} = 0$;
• Every null geodesic on $(\tilde{M}, \tilde{g})$ has future and past end-points on $I^\pm$ respectively.

Definition. Let $(M, g)$ be asymptotically flat. $(M, g)$ is called strongly asymptotically predictable if there exists an open region $\tilde{V} \subset \tilde{M}$ such that $\tilde{V} \supset i(M) \cap J^-(I^+)$ and $(\tilde{V}, \tilde{g}|_{\tilde{V}})$ is globally hyperbolic.

Remark. This definition provides that no singularities are visible for an observer in $\tilde{M} \cap \tilde{V}$. Moreover one can prove that $\tilde{M} \cap \tilde{V}$ can be foliated by Cauchy-surfaces $S_t$ ($t \in \mathbf{R}$), see [8], [12].

Definition. Let $(M, g)$ be a strongly asymptotically predictable space-time manifold. If $B := M \setminus J^-(I^+)$ is not empty, then $B$ is called a black hole region and $H := \partial B$ is its event horizon.

Remarks. Moreover we require $(M, g)$ to be stationary, i.e. there exists a future-directed time-like Killing field $K$ on $(M, g)$. In this case $H$ is a three-dimensional null-surface in $M$, hence for each $t \in \mathbf{R}$, $H_t := H \cap S_t$ ($S_t$ is a Cauchy-surface) is a two-dimensional surface in $M$. We shall assume that $H_t$ (the event horizon in a "moment") is a two-dimensional embedded, orientable, smooth, compact surface in $M$ without boundary. This is the requirement for the event horizon to be "regular". This condition is satisfied by physically relevant black hole solutions of Einstein's equations but there is no a priori reason to assume it.

Let $(M, g)$ be a maximally extended space-time manifold as above. In the following considerations we shall focus on one outer, asymptotically flat region of it, i.e. a part of $(M, g)$ whose boundary at infinity in $\tilde{M}$ is connected. To get such an (incomplete) manifold we simply cut up $(M, g)$ along one connected component of its event horizon. We shall continue to denote this separated part also by $(M, g)$ ($g$ is the original metric restricted to our domain). We can see that under the conformal inclusion $i$ the manifold $(M, g)$ has boundary $\partial i(M) = \tilde{H} \cup I^+ \cup \{i_0\} \cup I^-$, where $\tilde{H} = i(H)$ and $\tilde{H}$, $I^\pm$ are connected now.

Proposition 1. Let $(M, g)$ be a space-time manifold. Then $K|_H \in \Gamma(TH)$.

Proof. Assuming the existence of a point $p \in H$ such that $K_p \notin T_p H$, let $\gamma : (-1, 1) \to M$ be a smooth integral curve of $K$ satisfying $\gamma(0) = p$ and $\dot{\gamma}(0) = K_p$. Hence there is an $\varepsilon \in (-1, 1)$ such that $\gamma(-\varepsilon) \in B$ and $\gamma(\varepsilon) \notin B$. But this means that $\gamma(-\varepsilon) \in J^-(I^+)$, since it can be connected by an integral curve of $K$ to $\gamma(\varepsilon) \in J^-(I^+)$. Hence this assumption led us to a contradiction.

Corollary. $H$ is invariant under the flow generated by $K$ on $M$ and, with $H$ a null-surface, $K|_H$ is a null vector field.

Remarks. Using a heuristic argument here we can identify $H_t$ up to homeomorphism as follows. Since the boundary of $(M, g)$ at infinity is homeomorphic to $S^2 \times \mathbf{R}$ we may assume due to the stationarity that $H_t$ is homeomorphic to $S^2$. According to recent articles one can prove that this is indeed the case for a stationary black hole [6–9]. For our purposes it is more important to refer to a stronger result of Chruściel and Wald [3]. They prove that an outer asymptotically flat region of a stationary space-time manifold satisfying the null energy condition is simply connected as a consequence of the Topological Censorship Theorem [5]. Under suitable additional hypotheses (e.g. the compactness of $H_t$) it follows that $H_t$ is homeomorphic to a sphere. Hence, with the aid of these results we get that $H$ is homeomorphic to $S^2 \times \mathbf{R}$.

Now let us study the behaviour of the Killing field near the infinity! Let $\tilde{K} := i_* K$ be induced by the inclusion $i$.

Proposition 2. $\tilde{K}$ becomes a null vector field at infinity.

Proof. Let $\tilde{\gamma} : \mathbf{R} \to \tilde{M}$ be an inextendible integral curve of $\tilde{K}$! We may write:

$$\|\tilde{K}_{\tilde{\gamma}(t)}\|^2 = \tilde{g}(\tilde{K}_{\tilde{\gamma}(t)}, \tilde{K}_{\tilde{\gamma}(t)}) = \Omega^2(\tilde{\gamma}(t))\, i_* g(i_* K_{\gamma(t)}, i_* K_{\gamma(t)}) = \Omega^2(\tilde{\gamma}(t))\, g(K_{\gamma(t)}, K_{\gamma(t)}).$$

We have used the third property of asymptotic flatness. However, using the fact that $K$ is a Killing field on $M$, we can write

$$\|\tilde{K}_{\tilde{\gamma}(t)}\|^2 = a\, \Omega^2(\tilde{\gamma}(t)),$$

where $a := g(K_{\gamma(t_0)}, K_{\gamma(t_0)})$ is a constant for an arbitrary $t_0 \in \mathbf{R}$. But

$$\lim_{t \to \pm\infty} \Omega^2(\tilde{\gamma}(t)) = 0$$

because of the asymptotic flatness.

In the light of Propositions 1 and 2 we can see that $K$ approaches a null vector field near the boundary of $M$. Hence it is straightforward to study the behaviour of null vector fields on $M$ and $\tilde{M}$. Applying a smooth deformation to $K$ on $(M, g)$ we can produce a smooth, nowhere vanishing (but highly non-unique!) null vector field $K_0$ on $(M, g)$ whose integral curves are inextendible geodesics. Denoting by $\tilde{K}_0$ the image of this field under $i$, i.e. $\tilde{K}_0 = i_* K_0$, the integral curves of $\tilde{K}_0$ have future and past end-points on $I^\pm$ respectively by the fourth property of asymptotically flat space-times.

Remark. Of course, we could have started with this null vector field instead of the Killing field $K$. The reason for dealing with the naturally given Killing field was the attempt to exploit as much as possible the structure of a stationary, asymptotically flat space-time manifold.

Now let $\tilde{X}$ be an inextendible null vector field on $(\tilde{M}, \tilde{g})$, i.e. its integral curves are inextendible. We would like to study the extension of this field to the null infinities, hence first we have to extend the domain of its integral curves, which is $\mathbf{R}$ at this moment. Let us suppose that this extended domain is the circle $S^1$.

Proposition 3. Let $\tilde{X}$ be a null vector field on $(\tilde{M}, \tilde{g})$ with extended domain whose integral curves are geodesics. Then $\tilde{X}$ can be extended to $I^\pm$ if and only if $\tilde{X}|_{I^\pm} = 0$.

Proof. Let $i(S) \cup \{i_0\} =: \tilde{S} \subset \tilde{M}$ be a space-like hypersurface and let $x \in \tilde{S} \setminus (\tilde{S} \cap \tilde{B})$ ($x \neq i_0$). Then there exists an integral curve $\tilde{\gamma} : \mathbf{R} \to \tilde{M}$ such that $\tilde{\gamma}(0) = x$. So we can find a point $q \in I^+$ possessing the property

$$q = \lim_{t \to +\infty} \tilde{\gamma}(t).$$

Let us define a map $\varphi : \tilde{S} \setminus (\tilde{S} \cap \tilde{B}) \to I^+$ by $\varphi(x) = q$. Let us assume that we have extended $\tilde{X}$ to $I^+$ in a smooth manner and there is a $q \in I^+$ such that $\tilde{X}_q \neq 0$! Hence there is a smooth curve $\tilde{\beta} : (-\varepsilon, \varepsilon) \to I^+$ for a suitable small positive number $\varepsilon$ satisfying

$$\tilde{\beta}(0) = q, \qquad \dot{\tilde{\beta}}(0) = \tilde{X}_q.$$

Obviously we can find a $\delta < \varepsilon$ such that for $q \neq q' \in U_q$,

$$\tilde{\beta}(\delta) = q', \qquad \dot{\tilde{\beta}}(\delta) = \tilde{X}_{q'} \neq 0.$$

But in this case there is an $x' \in U_x$ ($U_x$ denotes a small neighbourhood of $x$) and an integral curve $\tilde{\gamma}'$ of $\tilde{X}$ such that $\varphi(x') = q'$. In other words $q'$ satisfies

$$q' \in \mathrm{im}\, \tilde{\gamma}', \qquad q' \in \mathrm{im}\, \tilde{\beta},$$

and

$$q \in \mathrm{im}\, \tilde{\beta}, \qquad q \notin \mathrm{im}\, \tilde{\gamma}'.$$

This means that $q'$ is a branching point of an integral curve of $\tilde{X}$. But in this case even if $\tilde{X}_{q'}$ is well defined, $-\tilde{X}_{q'}$ is not, and this is a contradiction. A similar argument holds for $I^-$.

Hence we are naturally forced to find a null vector field tending to zero on the null infinities. However note that on $(\tilde{M}, \tilde{g})$ there is a natural cut-off function, namely $\Omega$. Certainly there exists a $k \in \mathbf{N}$ such that the vector field defined by

$$\tilde{X}_0 := \Omega^k \tilde{K}_0$$

is a zero vector field restricted to $I^\pm$ because of the third property of asymptotic flatness. Note that this new vector field can be extended as zero to $i_0$ as well: surrounding $i_0$ by a small neighbourhood $U$ one can see (since $U \cap (I^+ \cup I^-) \neq \emptyset$) that $\tilde{X}_0$ is arbitrarily small in $U$.

It is straightforward that $\tilde{X}_0$ is not transversal to $I^\pm$ in the sense of [1] since it approaches the null-infinities as a tangential field. This would cause some difficulties later on in our construction. Fortunately one can overcome this non-transversality phenomenon by a general method. Due to standard transversality arguments [1], applying a generic small perturbation to $\tilde{X}_0$ in a suitable neighbourhood of $I^\pm$ we can achieve that the perturbed field $\tilde{X}$ will be transversal to the submanifolds of null-infinities (and even remains zero on them, of course).

3. The Compactification Procedure

Now let $N_1$ be a smooth four-manifold. We call a subset $C \subset N_1$ a domain of $N_1$ if it is diffeomorphic to a closed four-ball $B^4$. Let $V, U^\pm \subset N_1$ be domains and $j : \tilde{M} \to N_1$ a smooth embedding satisfying the following conditions:

• $N_1 \setminus j(\tilde{M}) = \mathrm{int}\, V$;
• There exists a point $p_0 \in N_1$ and domains $U^\pm$ satisfying $U^+ \cap U^- = \{p_0\}$ such that $j(i_0) = p_0$ and $j(I^\pm) \subset \partial U^\pm$ and $j(J^\pm(i_0)) \subset U^\pm$;
• $\partial U^\pm \setminus j(I^\pm) =: A^\pm \subset \partial V$;
• $j(\tilde{H}) \subset \partial V$;
• $j(\tilde{H}) \cap A^\pm = \emptyset$.

It is not difficult to see that such a $j$ exists due to the topology of an outer region of an asymptotically flat stationary space-time containing a black hole, and $N_1$ has no boundaries.

Consider the vector field $Y_1 := j_* \tilde{X}$. We wish to extend this field into the interior of $V$. To be explicit, take a coordinate map from $N_1$ to $\mathbf{R}^4$ under which $V$ maps to the cylinder $V := B^3_1 \times [-1, 1]$, where $B^3_r$ denotes a closed three-ball of radius $r$ centered at the origin. Moreover let $W := D^3_2 \times (-1, 1)$ be another open neighbourhood (where $D^3_r$ denotes the open ball respectively).

We assume that $j(\tilde{H})$ is given by the "belt" $\partial B^3_1 \times (-\frac{1}{2}, \frac{1}{2})$ and $A^\pm$ are represented by the top- and bottom-balls $B^3_1 \times \{\pm 1\}$. Note that this picture is consistent with the conditions for $j$ (except that $V$ is not a four-ball but a cylinder). Note that by Proposition 1

$$Y_1|_{j(\tilde{H})} \in \Gamma(T j(\tilde{H})).$$

In other words $Y_1$ has the form $(0, 0, 0, 1)$ on $j(\tilde{H})$. Hence we define a smooth vector field $Z$ in $W$ as follows: let $q \in D^3_2$ and

• for $t \in [-1, -\frac{1}{2}]$, $Z_{(q,t)} := (0, 0, 0, \cos \pi(t + 1/2))$;
• for $t \in (-\frac{1}{2}, \frac{1}{2})$, $Z_{(q,t)} := (0, 0, 0, 1)$;
• for $t \in [\frac{1}{2}, 1]$, $Z_{(q,t)} := (0, 0, 0, \cos \pi(t - 1/2))$.
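The piecewise definition of the last component of $Z$ can be checked numerically. The following is a minimal sketch, not part of the original construction; the function name `z_profile` is a hypothetical label for the fourth component of $Z$ as a function of $t$. It only verifies that the pieces match at $t = \pm\frac{1}{2}$ and that the component vanishes at $t = \pm 1$ (the levels representing $A^\pm$); it makes no claim about higher-order smoothness.

```python
import math


def z_profile(t: float) -> float:
    """Fourth component of Z at height t in [-1, 1] (hypothetical helper name)."""
    if not -1.0 <= t <= 1.0:
        raise ValueError("t must lie in [-1, 1]")
    if t <= -0.5:
        # Lower cap: rises from 0 at t = -1 to 1 at t = -1/2.
        return math.cos(math.pi * (t + 0.5))
    if t < 0.5:
        # Belt region: the field is the constant (0, 0, 0, 1).
        return 1.0
    # Upper cap: falls from 1 at t = 1/2 to 0 at t = 1.
    return math.cos(math.pi * (t - 0.5))


if __name__ == "__main__":
    for t in (-1.0, -0.75, -0.5, 0.0, 0.5, 0.75, 1.0):
        print(f"t = {t:+.2f}  ->  Z_4 = {z_profile(t):.6f}")
```

The profile is even in $t$, which mirrors the symmetric roles of the top and bottom balls in the cylinder picture.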

Clearly $Z$ is a smooth vector field in $W$ and $Z|_{A^\pm} = 0$. Now take a smooth cut-off function $\rho : \mathbf{R}^4 \to \mathbf{R}^+$ satisfying $\rho|_V = 1$ and being zero on the complement of $W$. Define the extension of $Y_1$ by $\rho Z + (1 - \rho) Y_1$. It is obvious that the extended field (also denoted by $Y_1$) is a smooth vector field on $N_1$ due to the transversality of the original field to $I^\pm$ (more precisely it has not been defined in $U^\pm$ yet) and satisfies $Y_1|_{\partial U^\pm} = 0$.

As a final step let us apply a smooth homotopy for $N_1$ contracting the four-balls $U^\pm$ to the point $p_0$. In this way we get a smooth compact manifold $N_0$ without boundaries and, due to the transversality conditions on $Y_1$, a well-defined smooth vector field $Y_0$ on it. It is clear that $Y_0$ has only one (degenerate) isolated singular point, namely $p_0$. Its index is $+2$, as is easy to see due to the fourth property of asymptotic flatness.

Theorem. $N_0$ is homeomorphic to the four-sphere $S^4$.

Proof. First it is not difficult to see that $N_0$ is simply connected. Note that $N_0$ is a union of an outer region of the original stationary, asymptotically flat space-time $M$ and a solid torus-like space $T$ homeomorphic to $B^3 \times S^1$ with one $B^3$ pinched into a point (namely the one which corresponds to $p_0$). Choosing $p_0$ as a base point, consider the loop $l$ in $N_0$ representing the generator of the fundamental group $\pi_1(T) = \mathbf{Z}$. Clearly this loop can be deformed continuously into $M \cup \{p_0\}$. But referring again to the theorem of Chruściel and Wald [3] the deformed loop $l$ is homotopically trivial since $M$ is simply connected. It is obvious that every other loop in $N_0$ is contractible, proving $\pi_1(N_0) = 0$.

Secondly, we have constructed a smooth vector field $Y_0$ on $N_0$ having one isolated singular point $p_0$ with index $+2$. Taking into account that $N_0$ is a smooth, compact manifold without boundaries this means that $\chi(N_0) = 2$ according to the classical Poincaré–Hopf Theorem [2, 11].
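As a low-dimensional illustration of the index count used above (an analogy only, not part of the original argument), the index of an isolated zero of a planar vector field can be computed as the winding number of the field along a small circle around the zero. The sketch below assumes the zero sits at the origin; the function name `index_at_origin` and the sample fields are hypothetical. The field $(x^2 - y^2, 2xy)$ has a single degenerate zero of index $+2$, which on the two-sphere would be consistent with $\chi(S^2) = 2$.

```python
import math


def index_at_origin(field, radius=1.0, samples=720):
    """Winding number of `field` along a circle of given radius around the origin.

    `field` maps (x, y) to a pair (vx, vy) and must not vanish on the circle.
    """
    total = 0.0
    previous = None
    for k in range(samples + 1):
        theta = 2.0 * math.pi * k / samples
        vx, vy = field(radius * math.cos(theta), radius * math.sin(theta))
        angle = math.atan2(vy, vx)
        if previous is not None:
            # Wrap each increment into [-pi, pi) so jumps of atan2 do not count.
            total += (angle - previous + math.pi) % (2.0 * math.pi) - math.pi
        previous = angle
    return round(total / (2.0 * math.pi))


def dipole(x, y):
    # Real and imaginary parts of z^2: one degenerate zero at the origin.
    return (x * x - y * y, 2.0 * x * y)


def source(x, y):
    return (x, y)


def saddle(x, y):
    return (x, -y)


if __name__ == "__main__":
    for name, f in (("dipole", dipole), ("source", source), ("saddle", saddle)):
        print(name, index_at_origin(f))
```

The sampling density only needs to be fine enough that the field direction turns by less than $\pi$ between consecutive sample points.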
The Euler characteristic of a simply-connected compact four-manifold $S$ always has the form $\chi(S) = 2 + b_2$, where $b_2$ denotes the rank of its intersection matrix (or second Betti number). In our case this rank is zero, so the intersection matrix is of even type; hence by the uniqueness of this trivial matrix and referring to the deep theorem of Freedman [4], which gives a full classification of simply connected compact topological four-manifolds in terms of their intersection matrices, we deduce that $N_0$ is homeomorphic to the four-sphere since its intersection form is given by the same zero-matrix.

Corollary. The uniqueness of $N_0$ implies the uniqueness of the original space-time manifold (more precisely one connected piece of its outer region) $M$, since we simply have to remove a singular solid torus $T$ from $N_0$ and this can be done in a unique way since $N_0$ is simply connected. Hence, taking into account the explicit example of the Kerr-solution, this outer region is always homeomorphic to $S^2 \times \mathbf{R}^2$.

4. Conclusion

We have proved that topological uniqueness holds for space-times carrying a stationary black hole. Note that we could prove only a topological equivalence although our constructed manifold $N_0$ carries a smooth structure, too. It would be interesting to know if

this smooth structure was identical to the standard one in light of the unsolved problem of the four-dimensional Poincaré conjecture in the smooth category.

From the physical point of view this uniqueness is important if we are interested in problems concerning the whole structure of such space-time manifolds. For example one may deal with the description of the vacuum structure of Yang–Mills fields on the background of a gravitational configuration containing a black hole or some other singularity (taking the singularity theorems into consideration this question is very general and natural). One would expect that a black hole may have a strong influence on these structures, and using the theorem presented here we can study this problem effectively in our following paper.

Moreover we hope that our construction works for non-stationary, i.e. higher genus, black holes as well, giving insight into the structure of such more general space-time manifolds.

Acknowledgement. The work was partially supported by the Hungarian Ministry of Culture and Education (FKFP 0125/1997).

References

1. Arnold, V.I.: Geometrical Methods in the Theory of Ordinary Differential Equations. Berlin–Heidelberg–New York: Springer-Verlag, 1988
2. Bott, R., Tu, L.: Differential Forms in Algebraic Topology. Berlin–Heidelberg–New York: Springer-Verlag, 1982
3. Chruściel, P.T., Wald, R.M.: On the Topology of Stationary Black Holes. Class. Quant. Grav. 11, L147–152 (1994)
4. Freedman, M.H.: The Topology of Four-Manifolds. J. Diff. Geom. 17, 357–454 (1982)
5. Friedman, J.L., Schleich, K., Witt, D.M.: Topological Censorship. Phys. Rev. Lett. 71, 1486–1489 (1993)
6. Galloway, G.: On the Topology of Black Holes. Commun. Math. Phys. 151, 53–66 (1993)
7. Gannon, D.: On the Topology of Space-like Hypersurfaces, Singularities and Black Holes. Gen. Rel. Grav. 7, 219–232 (1976)
8. Hawking, S.W., Ellis, G.F.R.: The Large Scale Structure of Space-Time. Cambridge: Cambridge Univ. Press, 1973
9. Hawking, S.W.: Black Holes in General Relativity. Commun. Math. Phys. 25, 152–166 (1972)
10. Mazur, P.O.: Black Hole Uniqueness from a Hidden Symmetry of Einstein's Gravity. Gen. Rel. Grav. 16, 211–215 (1984)
11. Thorpe, J.A.: Elementary Topics in Differential Geometry. Berlin–Heidelberg–New York: Springer-Verlag, 1979
12. Wald, R.M.: General Relativity. Chicago, Il: Univ. of Chicago Press, 1984

Communicated by H. Nicolai

Commun. Math. Phys. 195, 699 – 723 (1998)

Communications in

Mathematical Physics

Equilibrium Shapes for Planar Crystals in an External Field

Robert J. McCann*

Department of Mathematics, Brown University, Providence, RI 02912, USA. E-mail: [email protected]

Received: 8 July 1997 / Accepted: 5 January 1998

Dedicated to the memory of Frederick Justin Almgren, Jr.

Abstract: The equilibrium shape of a two-dimensional crystal in a convex background potential g(x) is analyzed. For g = 0 the shape of minimum energy may be deduced from surface tension via the Wulff construction, but if g is not constant, little is known beyond the case of a crystal sitting in a uniform field. Only an unpublished result of Okikiolu shows each connected component of the equilibrium crystal to be convex. Here it will be shown that any such component minimizes energy uniquely among convex sets of its area. If the Wulff shape and g(x) are symmetric under x ↔ −x, it follows that the equilibrium crystal is unique, convex and connected. This last result leads to a new proof that convex crystals away from equilibrium remain convex as they evolve by curvature-driven flow. Subsequent work with Felix Otto shows – without assuming symmetry – that no equilibrium crystal has more than two convex components.

Contents

1 Overview of Methods and Results
2 Surface Energy and Isoperimetry
3 Convexity and Monotonicity of the Potential Energy
4 Curvature-Driven Flows Preserve Convexity
A Convexity, Size and Number of Components
References

* © 1997 by the author. All rights reserved.

Crystals in an External Field

The purpose of this article is to provide a characterization for the equilibrium shape of a crystal in an external field. When the field-strength is negligible, the crystal will form a convex set given by Wulff's construction [66] of 1901. Effects introduced by a

(uniform) gravitational field have long been studied in the context of, e.g., the sessile or pendant liquid drop – see Finn's book [21] for a review – while some of the analogous problems for anisotropic solids have been addressed by Avron, Taylor and Zia [4] as well as Taylor and Almgren [60]. Concerning the effects of non-uniform fields, much less is known. Even when the field is the (negative) gradient of a convex potential, the equilibrium crystal is not known to be connected, much less convex or unique. This problem, formulated as a variational minimization in d dimensions, was suggested to us by Almgren. Here it is addressed in the plane d = 2, where an unpublished result of Okikiolu [48] shows that any energy minimizing solution consists of a countable disjoint union of closed convex sets. Our main result – also limited to the plane – states that each convex set in this union is the unique energy minimizer among convex sets of its area. If the system is reflection symmetric under x ↔ −x, it follows that the crystal formed must be unique, convex and connected. In the context of curvature-driven flow, this last result leads to a new proof that a non-equilibrium crystal K = −K, initially convex, will remain convex and balanced as it melts or relaxes. In a subsequent article with Felix Otto, we go beyond these results to conclude – without assuming reflection symmetry – that no equilibrium crystal has more than two convex components.

A crystal in contact with its vapor or melt may be modeled as a subset $U \subset \mathbf{R}^2$ having area determined by the ambient thermodynamic variables. The variational principle formulated by Gibbs [30] and Curie [13] asserts that the equilibrium shape of such a crystal must minimize its free energy – in our case an interfacial surface energy plus a bulk potential energy measuring interaction with the external field – among all sets of equal area; see alternately Herring [36] or Cahn and Hoffman [11]. The interfacial energy will typically be anisotropic: an initial quantity of the condensate breaks the symmetry of the underlying space, establishing preferred directions. Though microscopic in origin, this anisotropy is represented macroscopically by a surface tension $f : S^1 \to (0, \infty)$ which depends continuously on the outward unit normal $\hat{n}_U(x)$ at each boundary point $x$ of the set $U$. Presuming $f$ to be known from experiment or theory, the surface energy $F(U)$ of any crystal shape may be calculated as the integral

$$F(U) := \int_{\partial_* U} f(\hat{n}_U(x))\, d\mathcal{H}^1(x) \qquad (1)$$

of the surface tension with respect to Hausdorff one-dimensional measure $\mathcal{H}^1$ on the measure theoretic boundary $\partial_* U$ (Definition A.1) of $U$. Here $U \subset \mathbf{R}^2$ is assumed to be bounded, Borel, and have finite perimeter $\mathcal{H}^1(\partial_* U) < +\infty$ so that its surface energy is finite; the collection of such sets is denoted by $\mathcal{F}(\mathbf{R}^2)$. To each set $U \in \mathcal{F}(\mathbf{R}^2)$, a construction of geometric measure theory (Definition A.3) associates a Borel map $\hat{n}_U : \partial_* U \to S^1$ defining the outward unit normal $\hat{n}_U(x)$ for $\mathcal{H}^1$-almost every $x$ in $\partial_* U$; of course, $\hat{n}_U(x)$ coincides with the Gauss map and $\partial_* U = \partial U$ when $U$ has a smooth boundary.

If $U$ is dilated by $\lambda \ge 0$, its surface energy scales: $F(\lambda U) = \lambda F(U)$. Among sets of a fixed area, $F$ is uniquely minimized by the Wulff shape $W \subset \mathbf{R}^2$ and its translates. In the isotropic case $f(\hat{n}_U(x)) := 1$, the surface energy measures the boundary of $U$ so the Wulff shape is a ball in view of the isoperimetric inequality; this corresponds to the shape of a small drop of liquid in equilibrium with its vapour. In the anisotropic case, the set constructed by Wulff is obtained by Legendre transforming the surface tension, after extending $f$ from the unit circle $S^1 = \{|x| = 1\} \subset \mathbf{R}^2$ to the plane by taking it to be positively homogeneous: $f(\lambda x) = \lambda f(x)$ whenever $x \in \mathbf{R}^2$ and $\lambda \ge 0$. A convex dual function to $f$, also positively homogeneous, is defined by

$$f^*(y) := \sup_{x \in \mathbf{R}^2} \langle y, x \rangle / f(x); \qquad (2)$$

the Wulff shape is the compact convex set $W := \{f^* \le 1\}$. This construction extends immediately to higher dimensions, and is interpreted geometrically in, e.g., [35, 36] and [56, 58].

Working under the assumption that $W$ was a polyhedron (a polygon in dimension d = 2), what Wulff had attempted to show was that the surface energy attains a local minimum at $W$ among nearby polyhedra having parallel faces and the same size. Apparently his argument was flawed [36], but correct proofs were given by Hilton [39], Liebmann [42] and Laue [63]. However, it was only later that Dinghas would show $W$ to represent the global minimum of the surface energy among all convex polyhedral shapes of its size [15]. His idea exploited the Brunn–Minkowski inequality, and was subsequently extended to prove that – polyhedral or not – $W$ minimizes $F(U)$ uniquely among a much larger class of sets $U$: the extension to sets with piecewise smooth boundary was accomplished by Herring [36] and Taylor [56]; to bounded sets of finite perimeter $\mathcal{F}(\mathbf{R}^2)$ by Taylor [57, 58]; and to all sets of finite perimeter by Fonseca and Müller [23]. A completely different proof of the Wulff theorem was found by Dacorogna and Pfister [14], but is restricted to sets in the plane (d = 2).

If the crystalline material interacts with an external potential $g$ on $\mathbf{R}^2$, then – in addition to its surface energy – $U \in \mathcal{F}(\mathbf{R}^2)$ carries potential energy

$$G(U) := \int_U g(x)\, d\mathcal{H}^2(x). \qquad (3)$$

Here $\mathcal{H}^2$ denotes two-dimensional Lebesgue measure. At equilibrium, the shape of the crystal in the field $-\nabla g$ will minimize the energy

$$E(U) := F(U) + G(U), \qquad (4)$$

subject to the constraint of fixed area $\mathcal{H}^2(U) = m$. The competition between surface and bulk energy makes this variational problem more complicated than the Wulff problem, so that – apart from special cases – one no longer expects exact solutions. One nonetheless wants to know that with the area constraint, the energy (4) attains its minimum on $\mathcal{F}(\mathbf{R}^2)$, and to deduce what properties one can about the minimizing shape $U_0 \subset \mathbf{R}^2$.

One special case which has received some attention corresponds to the sessile crystal in a uniform gravitational field:

$$g(x) := \begin{cases} \langle \mathbf{g}, x \rangle & \text{where } \langle \mathbf{g}, x \rangle \ge 0, \\ +\infty & \text{otherwise.} \end{cases}$$

Here $\mathbf{g} \in \mathbf{R}^2$ specifies the strength of the field and its orientation relative to the crystal. In two dimensions and subject to a wetting condition (substrate prefers crystal to melt), Avron, Taylor and Zia [4] obtain the equilibrium shape by quadrature, showing it to be convex and unique. They also explore conditions for a large enough field to induce roughened facetting in the direction normal to $\mathbf{g}$. In higher dimensions d ≥ 3, neither convexity nor uniqueness is known for the equilibrium shape, except in the isotropic case $f(\hat{n}) := 1$ (the sessile drop) studied by Finn [20] and Wente [64].

In this paper we study the minimization problem for potentials $g(x)$ which are merely convex. The surface tension also is presumed to extend to a convex function $f(x) = |x| f(\hat{x})$ on the plane, though this restriction costs no generality. Convexity of $f(x)$ means merely that the most economical interface connecting two points will be a straight

line; see e.g. Lemma 2.1. When this condition fails, there are interfacial directions so unfavorable that surfaces will develop microscopic zig-zags to avoid them. In this case a minimizing solution may not exist except in the varifold sense. However, the variational problem can always be solved after replacing the surface tension with its convex hull $f^{**}$ (which by duality of the Legendre transform shares the same Wulff shape as $f$). A prescription for recovering the varifold structure of the minimizing set $U_0 \subset \mathbf{R}^2$ is then given by Taylor in [58, Theorem 3.3].

With the convexity of $f$ and $g$, existence of an energy minimizer $U_0 \subset B_R(0)$ with given area follows from a standard continuity-compactness argument summarized in Appendix A. Kate Okikiolu's theorem is also stated there: apart from sets of measure zero (which, being irrelevant to the energy, are ignored throughout), any minimizer $U_0$ can be expressed as a countable disjoint union of closed convex sets. Finally, the appendix includes two a priori bounds needed for the existence and uniqueness arguments concerning energy minimizing crystals in $\mathcal{F}(\mathbf{R}^2)$. These bounds control the radius of $U_0$ and the number of convex components in terms of $f$, $g$, and the area constraint.

The main result of the present manuscript applies to this equilibrium crystal $U_0$. Formulated precisely in the next section, it asserts that each connected component of $U_0$ is the unique energy minimizer among convex sets of its area (Theorems 1.1 and 1.3). The underlying idea is evident in the uniqueness proof for the more restricted (and simpler) problem of minimizing energy among convex sets $K \subset \mathbf{R}^2$ having area $\mathcal{H}^2(K) = m$. The possibility of two minimizers for the restricted problem is precluded by defining a family of convex sets $K(t) \subset \mathbf{R}^2$ which interpolate between any pair of sets $K(0) = K$ and $K(1) = K'$ while satisfying energy estimates of the form

$$E(K(t)) \le (1 - t)\, E(K) + t\, E(K'). \qquad (5)$$

Here $K(t) := [(1 - t)K + tK']_m$ is the convex (Minkowski) combination of $K$ and $K'$, intersected with the smallest level set of $g(x)$ for which $\mathcal{H}^2(K(t)) \ge m$. Apart from exceptional cases (which occur when $g$ is not strictly convex and are analyzed separately), $\mathcal{H}^2(K(t)) = m$ and (5) holds strictly for $t \in (0, 1)$ unless the minimizer $K = K'$ is unique.

The proof of the central estimate (5) is divided into a series of convexity and monotonicity lemmas concerning the surface and potential energies. Section 2 contains the results concerning surface energy, while Section 3 relies crucially on the interpolation technique introduced in [45, 47] to address potential energy. A final section shows how convexity of the minimizer $U_0$ for our statical problem can be used to deduce that convexity is preserved by the dynamical process of curvature-driven flow [1]. It is followed by an appendix with background material and some technical propositions.

Since crystal formation and evolution are of the most physical interest in dimension d = 3, it is regrettable that two of the key estimates do not extend beyond the plane. Proposition 2.4, which controls the surface energy $F((1 - t)K + tK')$ for convex sets $K, K' \subset \mathbf{R}^2$, is false in higher dimensions. And Corollary 2.8, which asserts that the surface energy of a connected open set dominates that of its convex hull, also fails when d > 2. The latter is used in the proof of Okikiolu's theorem. Counterexamples to both are easily constructed in the isotropic case $f(x) = |x|$.

Regarding the physical applicability of this theory, we caution that equilibrium is attained only with great difficulty in the laboratory – and then only for small crystals (typically with sizes measured in microns) or for exotic systems such as solid helium in contact with the superfluid. Examples of the former include the metal experiments of


Heyraud and Métois [37, 38], and the studies of Pavlovska and Nenow on negative crystals (inclusions) in organic materials [50, 49, 51]. The solid-liquid interface of helium is described in reviews by Keshishev, Parshin and Shal’nikov [41], Balibar and Castaing [5] and Maris and Andreev [44]. On the theoretical side, a statistical mechanical derivation of the surface tension and justification for the variational approach may be provided by the monograph of Dobrushin, Kotecky and Shlosman and the references quoted therein [16].

1. Overview of Methods and Results

Our first results concern the minimization of the energy E(K) when restricted to convex sets K ⊂ R² of fixed area. This restriction is not physical, but simplifies the variational problem considerably. Moreover, it turns out to provide the key to the structure of the energy minimizing crystal for the physical problem in which all sets of finite perimeter compete. Therefore, let K(R²) consist of all compact convex sets in the plane, and define K_m(R²) := {K ∈ K(R²) | H²(K) = m}. The first theorem asserts that the shape of the energy minimizing set in K_m(R²) is unique. Whether or not its position is also uniquely determined depends on the external potential: strict convexity of g(x) will select a position uniquely, while if the convexity of g is not strict then a convex set of translations may be allowed. The proof of this theorem relies on the monotonicity and convexity estimates established in the following sections for surface and potential energies. Like all results in this paper, these estimates are obtained under the assumption that the energy (4) be defined through a pair of convex integrands f, g : R² → [0, ∞) satisfying the normalization conventions:

(E1) f is convex with f(λx) = λf(x) for λ > 0; f(x) > 0 unless x = 0;
(E2) g is convex, and the set where it vanishes is bounded but non-empty.
The positivity restrictions on f and on g are assumed for convenience only: any linear function ⟨v, x⟩ can be added to f(x) without changing closed surface energies; similarly, adding constants to g merely shifts the total energy. The results could also be extended to the case in which the potential g(x) takes the value +∞ on the complement of a convex set C ⊂ R²; then finite energy E < +∞ must be added as a hypothesis, while the proof of Proposition A.9 changes slightly. This models the case in which crystal and melt are constrained by a container C in the complete non-wetting regime (where separation of the crystal from the container walls by a layer of fluid is favored energetically), and was used by Schonmann and Shlosman to study boundary effects in the Ising model [54]. The “summertop” trick [67] of exchanging crystal with melt can be used in the opposite regime (complete wetting), where container walls will be covered by crystal condensate. The intermediate regime (partial wetting) can sometimes be modelled by modifying the surface tension f appropriately: e.g. if C is a convex cone and the problem shares its symmetry, see Zia, Avron and Taylor [67, 4] and Winterbottom [65].

Taken together, the estimates of Sects. 2 and 3 show the total energy to be “convex” in a suitable sense. More specifically, given λ > 0 and two sets K, K′ ⊂ R², define λK := {λx | x ∈ K} and K + K′ := {x + x′ | x ∈ K, x′ ∈ K′} in the usual way. If a Borel set U ⊂ R² has area H²(U) ≥ m, then its truncation [U]_m to size m is defined by intersection with level sets of g: use U_λ := {x ∈ U | g(x) ≤ λ} to set

\[ [U]_m := \bigcap_{\{\lambda \in \mathbb{R} \,\mid\, H^2(U_\lambda) \ge m\}} U_\lambda. \tag{6} \]
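The truncation (6) can be illustrated on a grid. The following Python sketch is not part of the original argument: it assumes a hypothetical discretisation of U into equal square cells, and the names `truncate_to_area`, `cells`, `cell_area` and `g_quadratic` are illustrative only. It selects the smallest level λ whose sublevel set U_λ has area at least m and returns that sublevel set; on a grid the resulting area can slightly exceed m because of ties, whereas in the continuum the area equals m unless g vanishes throughout [U]_m.

```python
import math

def truncate_to_area(cells, g, cell_area, m):
    """Grid analogue of the truncation [U]_m from (6).

    cells     -- centres (x, y) of equal square cells covering U (assumed discretisation)
    g         -- potential, called as g(x, y)
    cell_area -- area of one cell
    m         -- target area, at most the area of U
    Returns (level, kept): the smallest level lambda whose sublevel set
    U_lambda has area >= m, together with the cells of U_lambda.
    """
    values = sorted(g(x, y) for x, y in cells)
    if len(values) * cell_area < m:
        raise ValueError("U has area smaller than m")
    # Number of cells needed to reach area m (small tolerance for rounding).
    needed = max(1, math.ceil(m / cell_area - 1e-9))
    level = values[needed - 1]
    kept = [(x, y) for x, y in cells if g(x, y) <= level]
    return level, kept

# Example: U = [-1, 1]^2 on a grid of spacing 0.1, g(x, y) = x^2 + y^2, m = 1.
H = 0.1
CELLS = [(-1 + H * (i + 0.5), -1 + H * (j + 0.5))
         for i in range(20) for j in range(20)]

def g_quadratic(x, y):
    # Convex potential vanishing only at the origin, so (E2) holds.
    return x * x + y * y

LEVEL, KEPT = truncate_to_area(CELLS, g_quadratic, H * H, 1.0)
```

In this example the kept cells approximate a disc of area 1 centred at the minimum of g, which is the shape one expects when truncating a large square by level sets of a radial potential.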


The continuous dependence of H²(U_λ) on λ shows the set (6) to have area m unless g(x) vanishes throughout [U]_m. Finally, given K, K′ ∈ K_m(R²), the “convexity” proved for the energy is with respect to the interpolation K(t) := [(1 − t)K + tK′]_m. Thus t ∈ [0, 1] implies (5), with conditions for strict inequality yielding uniqueness in the theorem below. Existence of a minimizer among convex sets is well-known: it follows from the compactness and lower semi-continuity results of Appendix A since K_m(R²) embeds as a closed subset of L¹(R²) when the set K is identified with its characteristic function

\[ \chi_K(x) := \begin{cases} 1 & \text{if } x \in K, \\ 0 & \text{otherwise.} \end{cases} \]

Theorem 1.1 (Uniqueness of Energy Minimizer Among Convex Sets). Fix 0 < m < +∞. Among closed convex sets K ⊂ R² with area H²(K) = m, the energy E(K) is minimized by a set K₀(m) which is unique up to possible translations.

Proof. Uniqueness will be established first. Suppose two convex sets K and K′ in K_m(R²) share the same energy: E(K) = E(K′). Unless K′ is a translate of K, Theorem 3.5 provides a path K(t) ∈ K_m(R²) joining K to K′ along which inequality (5) is strict for t ∈ (0, 1). But then E(K(t)) < E(K), which precludes the possibility that both K and K′ minimize energy in K_m(R²). To address existence of the minimizing set K₀(m), let K ∈ K_m(R²) be an arbitrary convex set with energy E(K) ≤ λ < +∞. The potential energy G is non-negative, so the surface energy F(K) ≤ λ bounds the diameter of K by (37). Thus K is contained in a ball B_r(y) whose radius depends only on f and λ. This ball cannot be centered too far from the origin since g(x), being convex, grows linearly away from the bounded set where g = 0, yet

\[ \inf_{|x-y| \le r} g(x) \le m^{-1} G(K) \le m^{-1} \lambda. \]

Thus K also lies in some larger ball B_R(0) depending only on λ, m, f and g. Now recall that the subsets of B_R(0) with bounded perimeter embed compactly in L¹(R²) (Theorem A.2). Any energy minimizing sequence in K_m(R²) has bounded perimeter F(K_i) ≤ λ and must eventually lie in B_R(0), hence admits an L¹(R²) convergent subsequence whose limit is (the characteristic function of) a convex set K₀ with area m since K_m(R²) is also known to be closed [53, Theorem 1.8.5 and Note 1.8.8]. Theorem A.4 couples with the obvious lower semi-continuity of G to ensure that K₀ has minimum energy on K_m(R²).

Remark 1.2. If the position of the energy minimizing set K₀(m) fails to be unique, then there is a closed convex set V ⊂ R² of possible translations such that K₀(m) + v has minimum energy if and only if v ∈ V. Convexity of V follows by an easy estimate (or Theorem 3.5), while V is closed because of lower semi-continuity of the energy (Theorem A.4). The size of V is controlled by g.

If the surface tension is smooth, say f ∈ C²(R² \ {0}), and uniformly elliptic, meaning tr D²f(n̂) > 0 or equivalently {x | f(x) ≤ 1} is uniformly convex, then it is possible to prove that the energy minimizer U₀ among sets in F(R²) or K(R²) with area m has a C²-smooth boundary; cf. Almgren, Taylor, Wang [1, Theorem 3.10]. The curvature κ(x) along this boundary satisfies the Euler-Lagrange equation

\[ \kappa(x)\, D^2_{tt} f(\hat n_{U_0}(x)) + g(x) = \lambda, \tag{7} \]


where x ∈ ∂U₀, D²_tt f := tr D²f, and λ is a Lagrange multiplier conjugate to the area constraint. Okikiolu [48] has argued from (7) that each connected component of U₀ is convex: κ(x) ≥ 0. Indeed, convexity of g’s level sets forces κ(x₀) ≥ 0 at that point x₀ on the boundary of U₀ which has maximum potential energy g(x₀) ≥ sup_{∂U₀} g(x). Using (7) to compare x₀ with any other point x ∈ ∂U₀ yields κ(x) D²_tt f(n̂_{U₀}(x)) ≥ 0. In a subsequent article with Felix Otto, we shall prove that any convex domain U₀ ⊂ R² whose C²-smooth boundary satisfies (7) has minimum energy in K_m(R²). Thus the only connected solutions to the Euler-Lagrange equation are the sets K₀(m) of Theorem 1.1. This observation motivates the theorem to follow, which holds even in the absence of smoothness and uniform ellipticity of f.

Theorem 1.3 (Classification of Connected Crystal Components). Assume (E1–E2), and suppose U0 minimizes E(U ) among sets U ∈ F(R2 ) of finite perimeter and unit area. Then U0 is a finite disjoint union of closed convex sets K0 (m), each with distinct area m and minimizing E(K) among convex sets of area m. Proof. Okikiolu’s result is adapted in Theorem A.7 to show that U0 consists of a countable disjoint union of closed convex components. Proposition A.9 bounds the number of such components. Let K be a convex component of U0 , and let K 0 := K0 (m) denote the set with minimum energy among convex sets with area m := H2 (K). We first prove that E(K) = E(K 0 ). Theorem 3.5 defines a curve K(t) ⊂ (1 − t)K + tK 0 in Km (R2 ) joining K to K 0 along which the energy satisfies (5) for t ∈ (0, 1). Thus E(K(t)) < E(K) holds unless K has the same energy as K 0 . Since K is one of finitely many compact convex components of U0 , it enjoys a neighbourhood which is disjoint from U0 \ K. Compactness of K 0 ensures that for t > 0 small enough, K(t) ⊂ (1 − t)K + tK 0 must lie in this neighbourhood; it can be substituted for K without disturbing the remainder of U0 . The energy of U0 would be lowered and its area unchanged, contradicting the fact that U0 is a minimizer and establishing the main assertion. The uniqueness result of Theorem 1.1 shows that K = K0 (m) + v. The proof is concluded by showing that even if translates of the minimizer K0 (m) share its energy, no two such translates K and K 0 can occur as components in U0 . Otherwise, K may be translated toward K 0 using K(t) := (1 − t)K + tK 0 ; for t ∈ [0, 1] the energy E(K(t)) remains constant by Remark 1.2. As long as K(t) remains disjoint from U0 \ K, the set U := K(t) ∪ (U0 \ K)

(8)

is a minimizer sharing the area and energy of U₀. As soon as K(t) touches K′ or some other component of U₀, a contradiction is reached: either these two components share an edge, in which case the surface energy has been reduced and E(U) < E(U₀), or else they meet at a point, in which case U has a non-convex component violating Theorem A.7.

A corollary shows the equilibrium crystal to consist of a single convex component whenever the surface and potential energy integrands f(n̂) = f(−n̂) and g(x) = g(−x) are even. When the position of each energy minimizer K₀(m) among convex sets is unique, this follows immediately from Theorems 1.1 and 1.3: each K₀(m) must be convex and balanced, hence contain the origin; no two of these sets are disjoint.

Corollary 1.4 (Convex Equilibrium Crystals for Even Energies). Assume reflection symmetry E(U) = E(−U) as well as (E1–E2). Among sets U ∈ F(R²) of finite perimeter and area m, the minimizer U₀ of E(U) is convex: it is unique up to translation, and may be taken to be balanced U₀ = −U₀.


Proof. If U0 is convex, the first claim is proved. If not, choose two convex components K and K 0 of U0 ; each is the energy minimizer among convex sets of its area by Theorem 1.3. Since −K has the same energy and area as K, Theorem 1.1 forces them to be translates: −K = K + v. Similarly, −K 0 = K 0 + v0 . Define K(t) := K + t v and K 0 (t) := K 0 + tv0 for t ∈ [0, 1]. The energy E(K(t)) is independent of t by Remark 1.2. As in the proof of the preceding theorem, a contradiction would be reached if K(t) and K 0 (t) intersected each other or U0 \ (K ∪ K 0 ) at any t ∈ (0, 1). But these sets cannot remain disjoint up to t = 1/2, for K(1/2) and K 0 (1/2) are both convex and balanced and their intersection includes the origin. Thus U0 consists of a single convex component K. Theorem 1.1 then asserts U0 = K to be unique apart from translations. The translate K(1/2) defined as above also minimizes E(U ) and is balanced.

Fig. 1. Can these two crystals be in equilibrium?

Even without additional symmetry, the minimizer U0 of Theorem 1.3 can be shown to consist of at most two convex components. Obtained in collaboration with Felix Otto, this result will be published separately. The heuristic idea underlying its proof is that the physics divides the problem into two different regimes: for small crystals surface tension plays the dominant role in determining equilibrium shape, while for large crystals the dominant influence is potential energy. As a result, there is some critical size mcr such that the energy E0 (m) := E(K0 (m)) is concave on the interval m ≤ mcr but convex on the interval m ≥ mcr . If two convex crystals K0 (m1 ) and K0 (m2 ) coexist in equilibrium, then the Lagrange multipliers λ(m1 ) = λ(m2 ) in (7) must agree. But λ(m) = dE0 /dm decreases to the left of mcr while increasing to its right. Thus m1 < mcr selects m2 > mcr uniquely, and m2 selects m1 conversely. Of course, this begs the question of whether two crystalline components can coexist at all (without getting in each other’s way). One situation where we anticipate this may occur is for a crystal and its melt inside a conical container – Fig. 1. Assuming the


container wall prefers melt to solid, surface energetics will prevent the crystal from penetrating all the way to the tip of the cone [67]; thus the crystal rests above a bubble of fluid even in the presence of gravity. Under suitable conditions, we predict a tiny, second crystal can be formed inside this bubble in stable equilibrium with the fluid and the first crystal. Though surface energetics favor expansion of the larger crystal at the expense of the smaller one, this would be penalized by an increase in gravitational potential energy. The criticality argument shows these two effects can balance exactly, but fails to address disjointness of K₀(m₁) and K₀(m₂) or the energy comparison E₀(m₁ + m₂) < E₀(m₁) + E₀(m₂) required for stability. Such questions remain a challenge, especially tantalizing in simple settings like the cone with gravity, where disconnected equilibria (should they exist) might be accessible experimentally.

2. Surface Energy and Isoperimetry

This section lays out the basic convexity and monotonicity properties of the surface energy. In particular, the surface energy F(U) is shown not to increase when U ∈ F(R²) is replaced by (i) its intersection U ∩ K with any convex set, or, in the case of a connected domain U with H²(∂U) = 0, (ii) its convex hull conv[U]. Moreover, the surface energy of (1 − t)K + tK′ proves to be convex as a function of t ∈ [0, 1] for convex sets K, K′ ⊂ R². These estimates suggest a proof of the isoperimetric inequality in which uniqueness is shown before the minimizer is known to be round. Regrettably, the second and third estimates rely crucially on the one-dimensional nature of domain boundaries in the plane.

A continuous curve σ : [a, b] → R² is said to be rectifiable if it has finite arc length L(σ). The latter is defined by polygonal approximation using a supremum over finite partitions Π = {s₀ < s₁ < · · · < s_n} ⊂ [a, b]:

\[ L(\sigma) := \sup_{\Pi \subset [a,b]} \sum_{i=1}^{n} |\sigma(s_i) - \sigma(s_{i-1})|. \tag{9} \]

In its arc length reparameterization τ : [0, L(σ)] → R², the rectifiable curve σ is seen to be Lipschitz. As a consequence, the tangent τ′(s) exists for almost all s, and it is natural to define the surface energy F(σ) associated with the interface σ by

\[ F(\sigma) := \int_0^{L(\sigma)} \tilde f(\tau'(s))\, ds, \tag{10} \]

where f̃(x, y) := f(y, −x) employs a rotation by 90° to express the surface tension as a function of the oriented tangent vector rather than the outward normal. If the measure theoretic boundary ∂∗U of a set U ∈ F(R²) coincides (up to sets of H¹ measure zero) with a rectifiable simple closed curve σ oriented positively, then definitions (1) and (10) coincide; this is certainly the case when U is a convex domain. F(σ) may also be computed by polygonal approximation in a manner analogous to the arc length (9):

\[ F(\sigma) := \sup_{\Pi \subset [a,b]} \sum_{i=1}^{n} \tilde f\big(\sigma(s_i) - \sigma(s_{i-1})\big). \tag{11} \]

Expression (11) is manifestly invariant under reparameterization; in the arc length parameterization it can be seen to coincide with (10) by the dominated convergence theorem.
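The polygonal sums in (11) are easy to evaluate for a closed polygon. The sketch below is an illustration only, written under stated assumptions: the tension `f_anisotropic` is a hypothetical example satisfying (E1), and the function names are not from the paper. It forms f̃(x, y) = f(y, −x) and sums f̃ over the edges; for a polygon, refining the partition leaves the sum unchanged by positive homogeneity, while for a circle the inscribed polygons give sums increasing toward the supremum.

```python
import math

def rotate_tension(f):
    """Return f~(x, y) := f(y, -x), the surface tension as a function of the tangent."""
    return lambda x, y: f(y, -x)

def polygonal_energy(vertices, f_tilde):
    """Sum of f~ over the edges of the closed polygon through `vertices`, as in (11)."""
    closed = list(vertices) + [vertices[0]]
    return sum(f_tilde(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(closed, closed[1:]))

def f_isotropic(x, y):
    return math.hypot(x, y)

def f_anisotropic(x, y):
    # Hypothetical tension: convex, positively homogeneous, positive away from 0.
    return abs(x) + 2 * abs(y)

SQUARE = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]  # counterclockwise
# Same curve with edge midpoints inserted: a finer partition.
REFINED = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.0, 0.5),
           (1.0, 1.0), (0.5, 1.0), (0.0, 1.0), (0.0, 0.5)]

def inscribed_polygon(n):
    """Vertices of a regular n-gon inscribed in the unit circle."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]
```

With the isotropic tension the unit square has energy equal to its perimeter; with the anisotropic example the two horizontal edges (normals (0, ±1)) each cost 2 and the two vertical edges each cost 1.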


Note that finiteness of (11) is equivalent to the rectifiability of σ: the surface tension f̃(x) is positively homogeneous, bounded away from zero and infinity on the unit circle. Thus (11) extends the definition of interfacial energy to all continuous curves in the plane. The first lemma it yields is well-known:

Lemma 2.1 (Triangle Inequality). Among all continuous curves σ : [0, 1] → R² with endpoints σ(0) = x and σ(1) = y, the interfacial energy F(σ) is minimized by the line segment σ(s) = (1 − s)x + sy.

Proof. Let σ : [0, 1] → R² be a rectifiable curve joining σ(0) = x to σ(1) = y; otherwise F(σ) = +∞. Choosing any partition 0 = s₀ < s₁ < . . . < s_n = 1 of the interval, convexity of the surface tension yields

\[ \tilde f\Big(\frac{y-x}{n}\Big) \le \frac{1}{n} \sum_{i=1}^{n} \tilde f\big(\sigma(s_i) - \sigma(s_{i-1})\big). \tag{12} \]

Multiplying (12) by n, the positive homogeneity (E1) of f̃ shows the energy f̃(y − x) of the line segment to be no greater than the energy F(σ) of the curve (11).

Remark 2.2. When the surface tension is elliptic, meaning {x ∈ R² | f(x) ≤ 1} is strictly convex, strict inequality in (12) shows the line segment to minimize surface energy uniquely (up to reparameterization) among curves connecting its endpoints.

Lemma 2.3 (Convexity for Curves). Let σ and σ′ : [a, b] → R² be continuous curves, and define σ_t(s) := (1 − t) σ(s) + t σ′(s) on [a, b]. Then the interfacial energy F(σ_t) is convex as a function of t ∈ [0, 1].

Proof. Writing σ₀ = σ and σ₁ = σ′, the definition (11) of the interfacial energy gives

\begin{align*}
F(\sigma_t) &= \sup_{\Pi \subset [a,b]} \sum_i \tilde f\big(\sigma_t(s_i) - \sigma_t(s_{i-1})\big) \\
&= \sup_{\Pi \subset [a,b]} \sum_i \tilde f\big((1-t)[\sigma_0(s_i) - \sigma_0(s_{i-1})] + t[\sigma_1(s_i) - \sigma_1(s_{i-1})]\big) \\
&\le (1-t)\, F(\sigma_0) + t\, F(\sigma_1).
\end{align*}

The final inequality was obtained using the convexity of the surface tension f̃.
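Lemma 2.3 can be checked numerically for a fixed partition, since each term of the polygonal sum is a convex function of t. The following Python sketch is illustrative only: the integrand `f_tilde` and the two sampled curves are hypothetical choices, not taken from the paper, and a fixed partition of 200 points stands in for the supremum over partitions.

```python
import math

def f_tilde(x, y):
    # Hypothetical convex, positively homogeneous integrand (tangent form).
    return math.hypot(x, y) + 0.5 * abs(x)

def polygonal_energy(points):
    """Polygonal sum (11) over the closed polygon through `points`."""
    closed = points + [points[0]]
    return sum(f_tilde(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(closed, closed[1:]))

N = 200
PARAMS = [2 * math.pi * k / N for k in range(N)]
SIGMA0 = [(math.cos(s), math.sin(s)) for s in PARAMS]                     # circle
SIGMA1 = [(2 * math.cos(s) + 1, 0.5 * math.sin(2 * s)) for s in PARAMS]  # figure-eight

def interpolate(t):
    """Pointwise interpolation sigma_t(s) = (1 - t) sigma_0(s) + t sigma_1(s)."""
    return [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(SIGMA0, SIGMA1)]

def energy(t):
    return polygonal_energy(interpolate(t))
```

Evaluating `energy(t)` for several t in [0, 1] and comparing with the chord (1 − t)·energy(0) + t·energy(1) shows the chord lying on or above the curve, as the lemma predicts. Note that the curves need not be simple or convex for this to hold.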

Proposition 2.4 (Convexity for Sets). Let K, K′ ∈ K(R²) be closures of two convex domains. Then the surface energy F((1 − t)K + tK′) is convex as a function of t ∈ [0, 1].

Proof. First assume K and K′ ∈ K(R²) to be strictly convex, so that both are compact sets with non-empty interiors. Given n̂ ∈ {x ∈ R² | |x| = 1}, strict convexity ensures the existence of a unique point σ(n̂) ∈ ∂K, where K is supported by a half-plane with outward unit normal n̂. Let σ′(n̂) parameterize the boundary of K′ in a similar way. Then the curves σ and σ′ are continuous on the unit circle S¹. Now observe that σ_t(n̂) := (1 − t)σ(n̂) + tσ′(n̂) must be the unique point at which (1 − t)K + tK′ has outward unit normal n̂. Thus σ_t gives a positively oriented parameterization for the boundary of (1 − t)K + tK′. Convexity of F((1 − t)K + tK′) follows from Lemma 2.3, and the equivalence between our definitions of surface energy for sets (1) and for curves (11). Should K or K′ (or both) fail to be strictly convex, the argument must be modified since σ and σ′ will no longer be continuous. However, strict convexity of K and K′ can


fail in at most countably many directions n̂_i ∈ S¹. By inserting an interval I_i of length 2^{−i} into the unit circle at n̂_i, and extending σ (and σ′) linearly to I_i, we parameterize ∂K and ∂K′ continuously and consistently in the sense that the outward unit normals agree: n̂_K(σ(s)) = n̂_{K′}(σ′(s)). The preceding argument can then be applied.

The spirit of our approach is illustrated by the following proof of the isoperimetric inequality. Like several other proofs, it relies on the Brunn-Minkowski theorem [34]; what is unusual is that uniqueness is shown before the minimizer is known to be round.

Remark 2.5 (Isoperimetric Inequality). The shortest Jordan curve enclosing unit area is unique (up to translation and parameterization), and therefore must be a circle.

Proof. Setting f(x) = |x|, Lemma 2.1 implies that only the curves bounding convex regions need be considered: any simple closed curve may be replaced by the boundary of its convex hull, which is no longer and encloses at least as much area. This makes it easy to deduce existence of a shortest closed curve from standard compactness and continuity results. We prove only that this convex curve is unique. Suppose two convex sets K and K′ in K₁(R²) both have minimum perimeter. Proposition 2.4 shows the perimeter of (K + K′)/2 to be no longer than that of K. The Brunn-Minkowski theorem (or Proposition 3.1 and Lemma 3.2) shows (K + K′)/2 to have area m > 1 unless K′ is a translate of K. Thus K and K′ must be translates: otherwise the convex set (K + K′)/(2m^{1/2}) would have unit area but perimeter less than H¹(∂K) = H¹(∂K′). This contradiction establishes uniqueness up to translation. Now there is a unique convex set K ∈ K₁(R²) with shortest perimeter having the origin as its center of mass. This set is invariant under rotations, therefore a disk. The shortest curve enclosing unit area must parameterize its boundary.

The next lemma extends a result of Vogel [62] to anisotropic surface energies.
For elliptic f, the inequality is strict unless U ⊂ K (up to sets of measure zero).

Lemma 2.6 (Convex Intersections). If U ∈ F(R²) has finite perimeter and K ⊂ R² is convex, then the surface energy F(U ∩ K) ≤ F(U).

Proof. If K is a half-space H = {x | ⟨x, y⟩ ≤ λ} with y ∈ R² and λ ∈ R, the desired inequality is well-known [1, §3.1.9]; when the boundary of U is a simple closed curve, the result also follows directly from Lemma 2.1. Now consider general convex K ⊂ R². Assume K to be closed, since the difference is a set of measure zero and therefore irrelevant. Choose a countable dense set of points x_i from ∂K, so that K is the intersection of the supporting half-spaces H_i ⊃ K at these points. Define U_{i+1} := H_i ∩ U_i inductively starting with U₁ := U. The preceding paragraph asserts F(U_i) to be non-increasing. Moreover, χ_{U_i} → χ_{U∩K} in L¹(R²). The lower semi-continuity of F from Theorem A.4 then yields

\[ F(U \cap K) \le \lim_{i} F(U_i) \le F(U). \]

Proposition 2.7 (Obstacles in the Plane). Assume f is elliptic and fix a bounded domain Ω ⊂ R² whose boundary has measure zero: H²(∂Ω) = 0. If U ⊃ Ω has the least surface energy F(U) among all sets in F(R²) containing Ω, then H²(∂U) = 0 and U is given by a union of convex domains K_i ⊂ R² whose closures K̄_i are disjoint.

Proof. Modifying U on a set of measure zero, one may take ∂U = ∂∗U without loss [31, Theorem 4.4]. The first step will be to prove H²(∂U) = 0, so that U can be assumed to be the interior of its closure Ū. By hypothesis, H²(∂U ∩ ∂Ω) = 0, so it remains to


control the boundary of U outside Ω. A standard estimate derived from the comparison F(U) ≤ F(U \ B_r(x)) yields the lower density ratio bound

\[ H^2(U \cap B_r(x)) \ge \frac{\pi r^2}{(1 + f_{\max}/f_{\min})^2} \tag{13} \]

for every x ∈ U and r < dist(x, Ω); the argument parallels Proposition A.6, which follows Giusti [31, Proposition 5.14] except that the ratio f_max/f_min of the maximum and minimum values of the surface tension f(n̂) appears because of the anisotropy. On the other hand, R² \ U minimizes the surface energy F(−U) among sets avoiding the obstacle Ω, so it enjoys a similar bound (33) outside of Ω. This implies an upper density ratio bound for U along ∂U, forcing ∂U \ Ω̄ ⊂ ∂∗U from Definition A.1. Since U has finite perimeter, H¹(∂∗U) < +∞ concludes the proof that H²(∂U) = 0.

The next step is to prove convexity for each connected component K_i of the open set U. Should K_i fail to be convex, one can find a pair of points a, c ∈ K_i such that b := (1 − t)a + tc lies a positive distance from K_i for some 0 < t < 1. The points a and c can be connected by a smooth Jordan arc σ in K_i. By choosing a and c closer to b if necessary, the arc σ may be assumed not to intersect the interior of the line segment [a, c]. Then the arc σ together with the segment [a, c] forms a simple closed curve enclosing some region V. A contradiction will be obtained by showing F(U ∪ V) < F(U). Clearly b ∉ K_i forces W := V \ U to have positive area, and

\[ F(U \cup W) = F(U) + \int_{\partial^* W \setminus \partial^* U} f(\hat n_W(x))\, dH^1(x) - \int_{\partial^* W \cap \partial^* U} f(\hat n_U(x))\, dH^1(x). \tag{14} \]

Now n̂_W(x) = n̂⊥ is constant on x ∈ ∂*W \ ∂*U, where it is perpendicular to c − a. On the other hand, ∂*W ∩ ∂*U has positive H¹ measure, and n̂_U(x) = −n̂_W(x) cannot be constant there. Stokes’ theorem ∫_{∂*W} n̂_W = 0 couples with Jensen’s inequality, (F1) and ellipticity of f to yield a strict inequality F(U ∪ W) < F(U) from (14), as in Almgren, Taylor and Wang [1, §3.1.9]. This contradiction proves that K_i is convex.

Finally, it remains to establish that the closures K̄_i and K̄_j are disjoint for i ≠ j. Since these two convex sets have disjoint interiors, their intersection can be at most a line segment or a point. They cannot intersect along a line segment since U was chosen as the interior of Ū. Nor can they intersect in a single point x, for then it would be possible to find a ∈ K_i and c ∈ K_j with b = (1 − t)a + tc not in K_i ∪ K_j. The Jordan arc [a, x] ∪ [x, c] in K̄_i ∪ K̄_j would connect a to c, so the preceding argument again yields a contradiction. This shows K̄_i ∩ K̄_j = ∅ to finish the proof.

A corollary will be used to extend Okikiolu’s result to non-smooth, degenerate elliptic surface tensions in Theorem A.7. The same observation has been exploited by Gage [26, Corollary 2.6] in a slightly less technical setting.

Corollary 2.8 (Convex Hulls). Assume f satisfies (E1), and let Ω ⊂ R² be a bounded, connected domain with H²(∂Ω) = 0. Its convex hull K := conv[Ω] has the least surface energy F(K) among all sets in F(R²) containing Ω. The same result holds if Ω = Ω₁ ∪ Ω₂ is a union of two convex sets with connected closure Ω̄.

Proof. First assume that f is elliptic, and define U ⊃ Ω to be a surface energy minimizer among all sets U ∈ F(R²) containing Ω. Proposition 2.7 shows U = ∪_i K_i to consist of a countable union of convex domains whose closures K̄_i are disjoint. Whether Ω is connected or consists of two connected components whose closures intersect, it is clear


that Ω must lie in a single connected component of U. Thus U ⊃ conv[Ω] = K. Lemma 2.6 shows that K = U ∩ K must also minimize surface energy among all sets containing Ω. If the surface tension f is not elliptic but satisfies (F1), it may be expressed as a pointwise limit f_i(x) → f(x) of a sequence of elliptic f_i. (These are obtained as the gauge functions for a sequence of strictly convex approximants to the unit ball {x | f(x) ≤ 1}.) Since F_i(K) ≤ F_i(U) holds for each U ⊃ Ω, the limit yields F(K) ≤ F(U). Thus K again minimizes surface energy, though in contrast to the elliptic case it need no longer be unique.

3. Convexity and Monotonicity of the Potential Energy

Choose two convex sets K and K′ from K_m(R²). The primary purpose of the present section is to establish the convexity estimate

\[ G\big([(1-t)K + tK']_m\big) \le (1-t)\,G(K) + t\,G(K'), \tag{15} \]

together with conditions for strict inequality which imply the uniqueness results outlined earlier. Here [K]_m denotes the truncation (6) of K to size m, without which (15) would typically fail. This estimate, together with those of the previous section, is summarized in Theorem 3.5. The argument will be presented in two steps. First the idea introduced in [45, 47] is recalled, which defines an interpolant ρ_t ∈ L¹(R²) between ρ₀ = χ_K and ρ₁ = χ_K′ for t ∈ [0, 1]. The salient features of this displacement interpolation are that 0 ≤ ρ_t ≤ χ_{(1−t)K+tK′} has total mass m, while its potential energy G(ρ_t) is convex as a function of t, when the definition (3) of potential energy is extended to ρ ∈ L¹(R²) by

\[ G(\rho) := \int_{\mathbb{R}^2} g(y)\,\rho(y)\,dH^2(y). \tag{16} \]

The second step is to replace ρ_t by a convex set K(t) ⊂ R² having area m. Truncating the support of ρ_t using level sets of g(x) assures G(K(t)) ≤ G(ρ_t). The displacement interpolant is defined in [45, 47] using a result of Brenier [6] which guarantees the existence of a convex function ψ : K → R whose gradient ∇ψ gives an area-preserving map between K and K′; see alternatively Brenier [7], Gangbo [29] or McCann [46]. The function ψ is unique up to additive constant. When the sets are convex, Caffarelli has shown ∇ψ to be a homeomorphism from K to K′: it is a smooth diffeomorphism of their interiors and Hölder continuous up to the boundary [8, 9, 10]. Viewing dρ₀(x) := χ_K(x)dx as a Borel measure on R², the displacement interpolant ρ_t is defined for t ∈ [0, 1] to be the density of this measure pushed forward through the mapping x → (1 − t)x + t∇ψ(x). Thus for x from the interior of K, the value of ρ_t at (1 − t)x + t∇ψ(x) is given by the Jacobian determinant

\[ \rho_t\big((1-t)x + t\nabla\psi(x)\big) = \det\big[(1-t)I + t\nabla^2\psi(x)\big]^{-1}, \tag{17} \]

while ρ_t(y) vanishes elsewhere; here I is the identity matrix while ∇²ψ(x) is the derivative of ∇ψ at x, and non-negative definite by convexity of ψ. Unlike ρ₀ = χ_K and ρ₁ = χ_K′, ρ_t will not generally be the characteristic function of any set, though the first proposition shows 0 ≤ ρ_t(y) ≤ χ_{(1−t)K+tK′}(y). This proposition may also be deduced


from [47, Theorems 2.2–2.3], but the two-dimensional setting and smoothness of ψ facilitate a direct proof in the present context. The notation ρ₀ →ᵗ ρ₁ (an arrow labelled by t) is used instead of ρ_t to emphasize explicit dependence on the endpoints ρ₀ and ρ₁.

Proposition 3.1 (Displacement Interpolation). Given K, K′ ∈ K_m(R²) convex, let ρ_t := χ_K →ᵗ χ_K′ be the displacement interpolant between them. Then 0 ≤ ρ_t ≤ χ_{(1−t)K+tK′}. Unless K and K′ are translates, 0 < ρ_t(x) < 1 holds on a set of positive area whenever t ∈ (0, 1).

Proof. Let ψ : K → R be the convex function whose gradient is an area-preserving diffeomorphism between the interiors of K and K′. Define ψ_t(x) := (1 − t)|x|²/2 + tψ(x) and y_t(x) := ∇ψ_t(x) on the interior K° of K. For t ∈ (0, 1), the uniform convexity of ψ_t ensures that y_t is injective: the inverse map y_t⁻¹ satisfies a Lipschitz condition with constant (1 − t)⁻¹. Thus ρ_t is well-defined by (17), together with the condition that ρ_t vanish outside the image of y_t. Now fix x ∈ K° and set Λ = ∇²ψ(x). Then

\begin{align*}
h(t) &:= \det[(1-t)I + t\Lambda] \tag{18} \\
&= 1 + t\,\mathrm{tr}[\Lambda - I] + t^2\,(1 + \det[\Lambda] - \mathrm{tr}[\Lambda]) \tag{19}
\end{align*}

varies continuously between h(0) = 1 and h(1) = det[Λ] = 1. Moreover, h(t) is concave on [0, 1]: this follows from (19) since Λ has positive eigenvalues and tr[Λ]/2 ≥ √det[Λ] = 1. Unless Λ = I, the two eigenvalues are distinct, the arithmetic mean dominates the geometric mean strictly, and h(t) is strictly concave. In any case, h(t) ≥ 1 on [0, 1] and hence 0 < ρ_t(y_t(x)) = h(t)⁻¹ ≤ 1. Since y_t(x) ∈ (1 − t)K + tK′ this establishes 0 ≤ ρ_t ≤ χ_{(1−t)K+tK′}. Next choose x₀ ∈ K° for which ∇²ψ(x₀) ≠ I; if no such point exists then ∇ψ(x) = x + v reduces to translation by some v ∈ R² and K′ = K + v. Otherwise ∇²ψ(x) ≠ I holds on a neighbourhood of x₀. For t < 1 the image of this neighbourhood under the diffeomorphism y_t has positive measure, and 0 < ρ_t(y) < 1 there.

Lemma 3.2 (Displacement Convexity of the Potential Energy). Let K, K′ ∈ K_m(R²) be two convex sets, and define the displacement interpolant ρ_t := χ_K →ᵗ χ_K′ between them. Then ∫ρ_t = m, while the potential energy G(ρ_t) is convex as a function of t on [0, 1].

Proof. Let ψ : K → R be the convex function whose gradient is an area-preserving diffeomorphism between the interiors of K and K′. Define ψ_t(x) := (1 − t)|x|²/2 + tψ(x) and y_t(x) := ∇ψ_t(x) on the interior of K as before. Then the change of variables y = y_t(x) combines with (17) to express the potential energy (16) by

\[ G(\rho_t) = \int_K g\big((1-t)x + t\nabla\psi(x)\big)\,dH^2(x). \tag{20} \]

The integrand is manifestly convex as a function of t, so the integral must be as well. Setting g(x) = 1 in (20) yields conservation of mass: ∫ρ_t = H²(K) = m.

Remark 3.3. The fact that the endpoints ρ₀ and ρ₁ were characteristic functions of convex sets played no role in the preceding proof. The displacement interpolant ρ_t may in fact be defined between any pair of probability measures on R^d given by densities ρ₀ and ρ₁ with respect to Lebesgue, and in any dimension [45, 47]. Displacement convexity of G (convexity of G(ρ_t) as a function of t) holds true in this generality.
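The determinant identity (18)–(19) used in the proof of Proposition 3.1, and the resulting bound h(t) ≥ 1, can be checked on a concrete matrix. The sketch below is an illustration under stated assumptions: the matrix `LAMBDA` is a hypothetical symmetric positive definite matrix with determinant 1 (as ∇²ψ(x) must have for an area-preserving map), and the function names are not from the paper.

```python
def h_det(t, lam):
    """det[(1 - t) I + t Lambda] for a 2x2 matrix Lambda = ((a, b), (c, d)), as in (18)."""
    (a, b), (c, d) = lam
    return ((1 - t) + t * a) * ((1 - t) + t * d) - (t * b) * (t * c)

def h_expanded(t, lam):
    """Right-hand side of (19): 1 + t tr[Lambda - I] + t^2 (1 + det - tr)."""
    (a, b), (c, d) = lam
    trace = a + d
    det = a * d - b * c
    return 1 + t * (trace - 2) + t * t * (1 + det - trace)

# Hypothetical example: symmetric, positive definite, det = 2*1 - 1*1 = 1, trace = 3.
LAMBDA = ((2.0, 1.0), (1.0, 1.0))
TS = [k / 10 for k in range(11)]
```

For this matrix (19) reads h(t) = 1 + t − t², which equals 1 at both endpoints, is concave, and stays at or above 1 on [0, 1]; the interpolated density 1/h(t) therefore lies in (0, 1], dipping strictly below 1 for 0 < t < 1 because Λ ≠ I.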


Lemma 3.4 (Monotonicity of the Potential Energy). Let ρ ∈ L¹(R²) and a Borel set U ⊂ R² satisfy 0 ≤ ρ ≤ χ_U. Fix m ≤ ‖ρ‖₁. Then G([U]_m) ≤ G(ρ), and strict inequality holds unless G(ρ) = 0 or ρ = χ_{[U]_m}.

Proof. Recall that [U]_m = {x ∈ U | g(x) ≤ λ}, where λ ≥ 0 is chosen (6) so that either (i) H²([U]_m) = m or (ii) λ = 0. In either case,

\[ G(\rho) - G([U]_m) \ge \int_U \big(g(x) - \lambda\big)\big(\rho(x) - \chi_{[U]_m}(x)\big)\,dH^2(x). \tag{21} \]

Moreover, the integrand of (21) is non-negative for all $x \in U$ so the desired inequality is established. The integral will be strictly positive unless $\rho(x)$ vanishes a.e. outside $[U]_m$; in the first case (i) above $\mathcal{H}^2([U]_m) \le \|\rho\|_1$ then implies $\rho = \chi_{[U]_m}$, while in the second case (ii) $\lambda = 0$ ($= \inf_x g(x)$) couples with $\rho \le \chi_{[U]_m}$ to imply $G(\rho) = 0$. These are the only exceptions to strict inequality in the lemma.

Theorem 3.5 (Interpolation Between Convex Crystals). Let $K, K' \in \mathcal{K}_m(\mathbf{R}^2)$ be convex sets. Then there is a curve $K(t) \subseteq [(1 - t)K + tK']_m$ in $\mathcal{K}_m(\mathbf{R}^2)$ joining $K(0) = K$ to $K(1) = K'$ along which the energy satisfies
\[ E(K(t)) \le (1 - t)\, E(K) + t\, E(K'). \tag{22} \]
Strict inequality holds for $t \in (0, 1)$ unless $K'$ is a translate of $K$.

Proof. Given $t \in [0, 1]$, let $\rho_t := \chi_K \xrightarrow{\,t\,} \chi_{K'}$ denote the displacement interpolant (17) between $\chi_K$ and $\chi_{K'}$. Proposition 3.1 and the conservation of mass shown in Lemma 3.2 make it clear that the set $U = (1 - t)K + tK'$ has area no smaller than $m = \int \rho_t$; we denote its truncation to size $m$ by $K(t) := [(1 - t)K + tK']_m$. The surface energy of $K(t)$ is controlled by the convexity estimate of Proposition 2.4 and the monotonicity of $F$ proved in Lemma 2.6 for intersections with convex sets, and more particularly with level sets (6) of the potential $g(x)$:
\[ F(K(t)) \le F\big((1 - t)K + tK'\big) \le (1 - t)\, F(K) + t\, F(K'). \tag{23} \]
At the same time, the convexity $G(\rho_t) \le (1 - t)\, G(\chi_K) + t\, G(\chi_{K'})$ of Lemma 3.2 combines with the monotonicity of Lemma 3.4 to control the potential energy
\[ G(K(t)) \le G(\rho_t) \le (1 - t)\, G(K) + t\, G(K'), \tag{24} \]
since $0 \le \rho_t \le \chi_U$ by Proposition 3.1. Summing the energy estimates for $F$ and for $G$ yields (22). Moreover, Lemma 3.4 gives a strict inequality $G(K(t)) < G(\rho_t)$ in (24) and (22), unless (i) $\rho_t = \chi_{K(t)}$ or (ii) $G(\rho_t) = 0$. If $K$ is a translate of $K'$ then the proof is complete. If not, the inequality $0 < \rho_t < \chi_{(1-t)K+tK'}$ of Proposition 3.1 holds strictly on a subset with positive measure, precluding possibility (i). Should (22) fail to be strict, meaning (ii) has occurred, then $K(t) \supset \{x \in U \mid g(x) = 0\}$ contains a set $\{x \mid \rho_t > 0\}$ whose area exceeds $m = \int \rho_t$. Then $K(t)$ can be replaced by a contraction $\lambda K(t)$ towards its center of mass with $\lambda = 1/\mathcal{H}^2(K(t)) < 1$ chosen to yield area $m$. This contraction does not increase potential energy, yet it lowers the surface energy $F(\lambda K(t)) = \lambda F(K(t))$ to make inequality (22) strict. Even when (ii) does not occur, it


Robert J. McCann

may happen that $K(t)$ has area greater than $m$ if $G(K(t)) = 0$; in these cases also $K(t)$ is replaced by $\lambda K(t)$. Either way, the new set $K(t) \subseteq [(1 - t)K + tK']_m$ is manifestly convex and has area $m$, completing the proof of the theorem.

We close by remarking that estimate (22) should really be viewed as a statement of sublinearity rather than convexity: the theorem does not assert that the curve joining, e.g., $K(1/3)$ to $K(2/3)$, coincides with $K((1 - t)/3 + 2t/3)$.

4. Curvature-Driven Flows Preserve Convexity

This final section gives an application of our results to the dynamical process of curvature-driven flow. Curvature-driven flow, or motion by weighted mean curvature, is a geometrical model for the time evolution of a non-equilibrium crystal under the influence of its surface tension. In this model, the normal velocity $v$ of each point $x$ on the crystalline interface is presumed proportional to the local change of surface energy with volume, or rather, area in $\mathbf{R}^2$. Thus it depends on the curvature $\kappa$ of the interface at $x$, and the normal direction $\hat{n}$ via tangential derivatives of the surface tension $f$:
\[ v = -\kappa\, D^2_{tt} f(\hat{n}); \tag{25} \]

a further discussion and references are given by Taylor, Cahn and Handwerker in the review articles [61, 59]. If the interface and its evolution are both smooth, the flow may be referred to as classical. For curves in the plane, the resulting motion – and its generalization (28) – have been studied by Angenent and Gurtin. Here a connection will be established between one of their results [2, §7.3, 3, §10] and the statical problem that we have analyzed: Corollary 1.4 will be used to provide an alternative proof that a convex crystal K0 ∈ K(R2 ) away from equilibrium, remains convex for all time under curvature-driven flow. This result was proved first in the isotropic case f (x) = |x| by Gage and Hamilton [24, 27] for curves in the plane, and by Huisken [40] for surfaces in higher dimensions. Evans and Spruck provided a different proof [18] using the viscosity solutions which they introduced simultaneously with Chen, Giga and Goto [12] to the level set approach. Furthermore, in the plane it is true that non-convex curves eventually become convex, as shown by Grayson [32] for isotropic motion and by Angenent and Gurtin [2, 3] and Gage [25] in the anisotropic case. The asymptotic shape of the vanishing curve has also been determined by Gage [26] in joint work with Li [28], as well as by Soner [55], Angenent and Gurtin [3] and Gurtin, Soner and Souganidas [33] in non-smooth formulations with varying degrees of generality. Despite this plethora of techniques and results, our approach – currently limited to the case K0 = −K0 – may be of interest since it requires neither smoothness nor uniform ellipticity of f or K0 , while reducing the anisotropic question in higher dimensions to the study of a statical problem. 
The relationship between the dynamical and statical problems was established by Almgren, Taylor and Wang [1] and Luckhaus and Sturzenhecker [43], who showed that a curvature-driven flow starting from $K_0 \in \mathcal{F}(\mathbf{R}^2)$ can be approximated by a discrete time flow in which the evolved crystal $K$ after time $\Delta t$ is the minimizer on $\mathcal{F}(\mathbf{R}^2)$ of a functional (4). The potential $g(x)$, which need not be convex, represents the tendency of the flow to remain near its initial condition $K_0$ for short times: it is proportional to the signed distance to the boundary $\partial^* K_0$ and decays with elapsed time $\Delta t$:
\[ g(x)\,\Delta t := \operatorname{dist}_\pm(x, K_0) := \begin{cases} \operatorname{dist}(x, \partial^* K_0) & \text{if } x \notin K_0, \\ -\operatorname{dist}(x, \partial^* K_0) & \text{if } x \in K_0. \end{cases} \tag{26} \]


A discrete evolution is generated by repeated minimization, replacing $K_0$ by $K$ at each step. A continuous time flow, dubbed flat $F$ curvature flow, is extracted in the limit $\Delta t \to 0$. Under additional restrictions this flat flow coincides with the classical curvature flow (25) provided the latter exists.

If the initial configuration $K_0$ is a convex set, Lemma 4.2 shows that the corresponding potential (26) is convex. If $K_0$ is balanced and $f(\hat{n}) = f(-\hat{n})$ is even, an application of Corollary 1.4 then shows that the crystal remains balanced and convex at all subsequent times. It is of interest to note that the distance function appearing in (26) need not be Euclidean distance: it is sufficient that
\[ \operatorname{dist}(x, \partial^* K) := \inf_{k \in \partial^* K} M(x - k) \tag{27} \]

for any norm $M(x)$ on $\mathbf{R}^2$. Non-Euclidean norms (or rather their duals (2)) model certain non-isotropic mobilities: direction dependent responses of the crystalline interface to applied force [1, §2.13, 2, 61]. The potential $g(x)$ may also be shifted by a constant $\omega \in \mathbf{R}$ representing the difference in bulk energy between the crystal and its melt. Heuristically at least, (25) then becomes
\[ v = -M^*(\hat{n})\big(\omega + \kappa\, D^2_{tt} f(\hat{n})\big). \tag{28} \]

Theorem 4.1 (Flat F Curvature Flows Preserve Balanced Convex Sets). Let $K(t) \in \mathcal{F}(\mathbf{R}^2)$ be a flat $F$ curvature flow [1] on some interval $t \in [0, T]$. If the initial condition $K(0) = -K(0)$ is convex and the surface tension $f(\hat{n}) = f(-\hat{n})$ is even, then the crystal $K(t)$ will be convex at each subsequent time. $K(t)$ will also have reflection symmetry through some point $x(t) \in \mathbf{R}^2$: $K(t) - x(t) = x(t) - K(t)$.

Proof. Since $K_0 = K(0)$ is convex, balanced and bounded, Lemma 4.2 shows the potential $g(x) + \omega$ from (26–27) to be convex and balanced. Since $g(x)$ is positive except in $K_0$, it assumes its minimum on a subset of the bounded set $K_0$. Let $K$ minimize $E(U)$ over $\mathcal{F}(\mathbf{R}^2)$, i.e. among sets of all areas. $K$ exists by results of Appendix A, though it may coincide with the null set; it is a fortiori an energy minimizer among sets of its area. Although $g$ takes both signs, a constant may be added to yield $g \ge 0$ without affecting the constrained minimization of Corollary 1.4. Thus $K$ will be convex, and if not balanced, then reflection symmetric through some other point $x \in \mathbf{R}^2$. $K$ is an approximant to $K(\Delta t)$. The approximant to $K(2\Delta t)$ is obtained by repeating the procedure, starting from $K$ instead of $K_0$. Since the problem is translation invariant, and a translate of $K$ satisfies the hypotheses on $K_0$, the approximants to $K(n\Delta t)$ must all be convex and symmetric for $n > 1$. The flat $F$ curvature flow $\chi_{K(t)}$ at time $t$ is obtained [1, 2.6] as a limit of such approximants in $L^1(\mathbf{R}^2)$. $K(t)$ is convex since the characteristic functions of convex sets form a closed subset of $L^1(\mathbf{R}^2)$; it has a balanced translate for a similar reason.

Lemma 4.2 (Convexity of Signed Distance to a Convex Set). Let $K \subset \mathbf{R}^d$ be a convex set, and $M(x)$ a norm on $\mathbf{R}^d$. The signed distance $\operatorname{dist}_\pm(x, K)$ of (26–27) is a convex function of $x$ on $\mathbf{R}^d$. If $K$ is balanced then $\operatorname{dist}_\pm(x, K) = \operatorname{dist}_\pm(-x, K)$.

Proof. Choose any supporting hyperplane to $K$, and let $H \supset K$ be the corresponding half-space. The first observation
\[ \operatorname{dist}_\pm(x, K) \ge \operatorname{dist}_\pm(x, H) \tag{29} \]


is seen from three cases: if $x \notin H$, the boundary of $H$ lies $M$-closer to $x$ than the boundary of $K$; if $x \in K$ the situation is reversed, and (29) holds since both distances are negated; if $x \in H \setminus K$, (29) holds on the basis of sign. Now fix $x \in \mathbf{R}^d$. There is some $k$ in the boundary of $K$ such that $\operatorname{dist}(x, \partial^* K) = M(x - k)$. A supporting half-space $H \supset K$ exists with $k \in \partial H$ and with $\operatorname{dist}(x, H) = M(x - k)$: if $x \in K$ this is obvious, while if $x \notin K$ the hyperplane $\partial H$ must be slipped between the convex sets $K$ and $\{y \mid M(x - y) < \operatorname{dist}(x, K)\}$. Thus (29) will be saturated for this $H$, and
\[ \operatorname{dist}_\pm(x, K) = \sup_{H \supset K} \operatorname{dist}_\pm(x, H), \]
where the supremum is over half-spaces $H \supset K$. Convexity of $\operatorname{dist}_\pm(x, K)$ is manifest since each $\operatorname{dist}_\pm(x, H)$ is linear (or at least affine) in $x$, and a supremum of affine functions is convex. $\operatorname{dist}_\pm(x, K)$ is obviously even when $K = -K$.
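The supremum formula in the proof of Lemma 4.2 can be illustrated on the simplest case. The example below is a sketch under assumptions chosen here for illustration: $M$ is the Euclidean norm, $K$ is the closed unit ball of $\mathbf{R}^d$, and $H_{\hat{n}}$ is a hypothetical name for the supporting half-space with outward unit normal $\hat{n}$.

```latex
\begin{align*}
K &= \{y \in \mathbf{R}^d : |y| \le 1\}, \qquad
H_{\hat n} = \{y \in \mathbf{R}^d : \langle \hat n, y\rangle \le 1\}, \quad |\hat n| = 1, \\
\operatorname{dist}_\pm(x, H_{\hat n}) &= \langle \hat n, x\rangle - 1
  && \text{(affine in } x\text{)}, \\
\operatorname{dist}_\pm(x, K) &= \sup_{|\hat n| = 1}\big(\langle \hat n, x\rangle - 1\big) = |x| - 1
  && \text{(convex and even in } x\text{)}.
\end{align*}
```

Both signs agree with (26): for $|x| > 1$ the value $|x| - 1$ is the distance to the unit sphere, while for $|x| \le 1$ it is minus that distance.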

A. Convexity, Size and Number of Components

This appendix begins by recalling a few definitions concerning sets of finite perimeter (see Federer [19], Giusti [31], Fonseca [22] or Evans and Gariepy [17]) with related compactness and continuity results. Further background material is then developed from the literature, including a proof of the existence of an energy minimizing crystal $U_0$ among $\mathcal{F}(\mathbf{R}^2)$ sets having area $m$, and of Okikiolu's theorem expressing $U_0$ as a countable disjoint union of closed convex sets. The appendix concludes with two a priori estimates: the first constrains the support of $U_0$, while the second provides a lower bound for the area of its convex components, implying an upper bound on their number.

A Borel set $U \subset \mathbf{R}^2$ is said to have finite perimeter if $\mathcal{H}^1(\partial_* U) < +\infty$, where:

Definition A.1. The measure theoretic boundary $\partial_* U$ of a (Borel) set $U \subset \mathbf{R}^2$ consists of those points $x \in \mathbf{R}^2$ such that
\[ \limsup_{r \to 0} \fint_{B_r(x)} \chi_U > 0 \quad \text{while} \quad \liminf_{r \to 0} \fint_{B_r(x)} \chi_U < 1. \tag{30} \]

The bar denotes an average over the given domain (in this case the ball $B_r(x)$) with respect to Lebesgue measure $\mathcal{H}^2$. The sets of finite perimeter coincide with those sets $U \subset \mathbf{R}^2$ whose characteristic functions have bounded variation: $\|\chi_U\|_{BV} = \mathcal{H}^2(U) + \mathcal{H}^1(\partial_* U)$. The basic compactness result for these sets is stated in Giusti [31, Theorem 1.19]:

Theorem A.2 (Compactness). Fix $R < +\infty$, and consider the collection of sets $U \subset B_R(0)$ with uniformly bounded perimeters $\mathcal{H}^1(\partial_* U) < R$. Their characteristic functions $\chi_U$ form a compact subset of $L^1(\mathbf{R}^2)$.

Definition A.3. The unit vector $\hat{n} \in \mathbf{R}^2$ is said to be a measure theoretic outward normal to $U \subset \mathbf{R}^2$ at $x \in \partial_* U$ if setting $H^+ := \{y \in \mathbf{R}^2 \mid \langle \hat{n}, y - x \rangle \ge 0\}$ yields
\[ \lim_{r \to 0} \fint_{B_r(x) \cap H^+} \chi_U = 0 \quad \text{and} \quad \lim_{r \to 0} \fint_{B_r(x) \setminus H^+} \chi_U = 1. \tag{31} \]


The measure theoretic outward normal $\hat{n}_U(x)$ can be defined at each point $x$ in a subset $\partial^* U$ of $\partial_* U$, called the reduced boundary. However, when $U$ has finite perimeter, the difference between $\partial^* U$ and $\partial_* U$ has $\mathcal{H}^1$ measure zero and the Gauss map $\hat{n}_U : \partial^* U \to S^1$ is Borel. Then the surface energy $F(U)$ given by (1) is well-defined, and will satisfy the inclusion-exclusion estimate [1, §3.1.4]:
\[ F(U \cup V) + F(U \cap V) \le F(U) + F(V). \tag{32} \]

Moreover, a theorem of Reshetnyak [52, Theorem 2] shows $F(U)$ to be lower semi-continuous since $f$ is convex and positively homogeneous (E1):

Theorem A.4 (Lower Semi-continuity of the Surface Energy). Let $\chi_{U_i} \to \chi_U$ in $L^1(\mathbf{R}^2)$ for a sequence of sets $U_i \in \mathcal{F}(\mathbf{R}^2)$. Then $F(U) \le \liminf_{i \to \infty} F(U_i)$.

Taken together with the obvious lower semi-continuity of $G$, these results imply existence of a minimizer for $E(U)$ among sets in the ball of radius $R < +\infty$: $\mathcal{F}_m(B_R) := \{U \in \mathcal{F}(\mathbf{R}^2) \mid U \subset B_R(0) \text{ and } \mathcal{H}^2(U) = m\}$.

Corollary A.5 (Existence of Energy Minimizing Crystals in the Disk). Among sets in $\mathcal{F}_m(B_R)$, the energy $E(U)$ attains its minimum.

Proof. Choose a sequence $U_i \in \mathcal{F}_m(B_R)$ whose energy $E(U_i) = F(U_i) + G(U_i)$ tends to the infimum of $E$ on $\mathcal{F}_m(B_R)$. This infimum is presumed to be finite and non-negative since $E \ge 0$. Since $G \ge 0$ and the sequence is minimizing, the surface energies $F(U_i) \ge f_{\min}\, \mathcal{H}^1(\partial_* U_i)$ must be bounded above. The perimeters are also bounded in terms of the minimum value $f_{\min} > 0$ for the surface tension $f(\hat{n})$ on $|\hat{n}| = 1$. Thus Theorem A.2 provides a convergent subsequence, also denoted $U_i$, with limit $\chi_{U_i} \to \chi_U$ in $L^1(\mathbf{R}^2)$. Here $U \subset B_R(0)$ is a set of finite perimeter, and clearly $\mathcal{H}^2(U) = m$. Moreover, Theorem A.4 yields $F(U) \le \liminf F(U_i)$, and a similar inequality holds for $E$ since $G$ is lower semi-continuous as well. Thus $U \in \mathcal{F}_m(B_R)$ is the desired energy minimizing set.

The next proposition provides an upper density ratio bound for the minimizing crystal $U_0$ along its boundary. It is based on Giusti [31, Proposition 5.14].

Proposition A.6 (Upper Density Ratio Bounds). Let $U_0 \subset \mathbf{R}^2$ minimize the energy $E(U)$ among sets in $\mathcal{F}_m(B_R)$. For each $x \notin U_0$ and $0 < r < R - |x|$ one has
\[ \mathcal{H}^2(B_r(x) \setminus U_0) \ge \frac{\pi r^2}{\big(1 + f_{\max}/f_{\min}\big)^2}. \tag{33} \]
Here $f_{\max}$ and $f_{\min}$ denote the maximum and minimum values of $f(\hat{n})$ on $|\hat{n}| = 1$.

Proof. The first step is to show that for $s < r$, the surface energy of $U = U_0 \cup B_s(x)$ is no smaller than that of $U_0$ alone. To produce a contradiction, assume $F(U) < F(U_0)$. Although $U$ may have area greater than $m$, Lemma 2.6 shows that intersection with the convex level sets of $g$ can only reduce its surface energy: $F([U]_m) \le F(U)$. The potential energy is controlled by applying Lemma 3.4 to $\rho = \chi_{U_0}$ to conclude $G([U]_m) \le G(U_0)$. These estimates combine to yield $E([U]_m) < E(U_0)$ and contradict the hypothesis that $U_0$ minimize energy. Our construction ensures $[U]_m \subset B_R(0)$ has area $m$ (or greater); if strictly greater, a contraction of $[U]_m$ toward its center of mass yields the correct area while lowering its total energy.


Having accomplished step one, let $L \subset \mathbf{R}^2$ denote the Lebesgue points of $\chi_{U_0}$ in the plane. Since this set has full measure, Fubini's theorem guarantees $\mathcal{H}^1(\partial B_s(x) \setminus L) = 0$ for almost every $s \in (0, r)$. Fix such an $s$ and observe $L$ is disjoint from $\partial_* U_0$ to conclude
\[ F(U) - F(U_0) = \int_{\partial B_s(x) \setminus U_0} f\big(\hat{n}_{B_s(x)}(y)\big)\, d\mathcal{H}^1(y) - \int_{B_s(x) \cap \partial_* U_0} f\big(\hat{n}_{U_0}(y)\big)\, d\mathcal{H}^1(y). \]
The inequality $F(U_0) \le F(U)$ established above then yields $f_{\min}\, \mathcal{H}^1(B_s(x) \cap \partial_* U_0) \le f_{\max}\, \mathcal{H}^1(\partial B_s(x) \setminus U_0)$. The remainder of the argument to control $a(s) := \mathcal{H}^2(B_s(x) \setminus U_0)$ now follows Giusti [31, Proposition 5.14]. By Fubini's theorem, the function $a(s)$ is Lipschitz continuous. Its derivative is given almost everywhere on $(0, r)$ by $a'(s) = \mathcal{H}^1(\partial B_s(x) \setminus U_0)$. For almost all $s \in (0, r)$
\[ \mathcal{H}^1(\partial_* [B_s(x) \setminus U_0]) = \mathcal{H}^1(\partial B_s(x) \setminus U_0) + \mathcal{H}^1(B_s(x) \cap \partial_* U_0) \le (1 + f_{\max}/f_{\min})\, \mathcal{H}^1(\partial B_s(x) \setminus U_0), \]
so one may estimate $a'(s)$ by the isoperimetric inequality:
\[ a'(s) \ge 2\,(1 + f_{\max}/f_{\min})^{-1} \sqrt{\pi\, a(s)}. \tag{34} \]
Integrating (34) from $s = 0$ to $r$ yields the desired inequality (33) for $a(r)$.
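The final integration step in the proof of Proposition A.6 can be spelled out as follows. This is a sketch, assuming $a(s) > 0$ for $s \in (0, r)$ so that $\sqrt{a(s)}$ may be differentiated almost everywhere; the abbreviation $c := f_{\max}/f_{\min}$ is introduced here for illustration only.

```latex
\begin{align*}
\frac{d}{ds}\sqrt{a(s)} &= \frac{a'(s)}{2\sqrt{a(s)}} \ge \frac{\sqrt{\pi}}{1+c}
  && \text{by (34), for almost every } s \in (0,r), \\
\sqrt{a(r)} &= \sqrt{a(r)} - \sqrt{a(0)} \ge \frac{\sqrt{\pi}\, r}{1+c}
  && \text{since } a(0) = 0, \\
a(r) &\ge \frac{\pi r^2}{(1+c)^2}
  && \text{which is (33).}
\end{align*}
```

In the isotropic case $f_{\max} = f_{\min}$ the bound reads $a(r) \ge \pi r^2/4$, a quarter of the area of the ball $B_r(x)$.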

Okikiolu's theorem, which she proved for $f(\hat{n})$ smooth and uniformly elliptic [48], may now be extended to all convex surface tensions $f$:

Theorem A.7 (Energy Minimizing Crystals have Convex Components). If $U_0$ minimizes the energy $E(U)$ among sets in $\mathcal{F}_m(B_R)$, then $U_0$ coincides (up to sets of zero area) with a countable disjoint union of closed convex sets $K_i \subset \mathbf{R}^2$. Moreover, $\partial^* U_0$ coincides with $\bigcup_{i=1}^{\infty} \partial^* K_i$, up to a set of $\mathcal{H}^1$ measure zero.

Proof. Modifying $U_0$ on a set of measure zero, one may take $\partial U_0 = \partial^* U_0$ without loss [31, Theorem 4.4]. Moreover, lower density ratio bounds can be proved for energy minimizers as in Almgren, Taylor and Wang [1, §3.4]; these force the topological boundary of $U_0$ to coincide with its reduced boundary up to a set of $\mathcal{H}^1$ measure zero: $\mathcal{H}^1(\partial U_0 \setminus \partial^* U_0) = 0$. Thus $U_0$ may be assumed to be the interior of its closure $\overline{U_0}$. Decompose the open set $U_0$ into connected components $K_i \subset B_R(0)$. Since $\chi_{K_1 \cup \cdots \cup K_n} \to \chi_{U_0}$ in $L^1(\mathbf{R}^2)$, applying the lower semi-continuity of Theorem A.4 with $f(\hat{n}) = 1$ yields
\[ \mathcal{H}^1(\partial^* U_0) \le \liminf_{n \to \infty} \mathcal{H}^1(\partial^* [K_1 \cup \cdots \cup K_n]) \le \sum_{i=1}^{\infty} \mathcal{H}^1(\partial^* K_i). \tag{35} \]
Here the second inequality follows from (32) or the fact that $\|\cdot\|_{BV}$ is a norm.


To obtain the reverse inequality, we shall need to establish that $\partial^* K_i$ and $\partial^* K_j$ are disjoint when $i \ne j$. From Definition A.3, $\hat{n}_{K_i}(x) = -\hat{n}_{K_j}(x)$ and $U_0$ has Lebesgue density one at $x \in \partial^* K_i \cap \partial^* K_j$. Since $x \in \partial K_i$ must lie outside of $U_0$, this contradicts Proposition A.6. Thus $\partial^* K_i$ and $\partial^* K_j$ are disjoint. Since $\partial^* K_i \subset \partial K_i \subset \partial U_0$, it then follows from $\mathcal{H}^1(\partial U_0 \setminus \partial^* U_0) = 0$ that equality holds in (35). Apart from a set of $\mathcal{H}^1$ measure zero, we have shown $\partial^* U_0 = \bigcup_{i=1}^{\infty} \partial^* K_i$.

The next step is to establish convexity of $K_i$. Should convexity fail, Corollary 2.8 shows that $K_i$ can be replaced by its convex hull $\operatorname{conv}[K_i] \subset B_R(0)$ without increasing its surface energy. Observe that $U = U_0 \cup \operatorname{conv}[K_i]$ will have area greater than $m$ in this case. The inclusion-exclusion estimate (32) and disjointness of the $\partial^* K_j$ yield
\[ F(U) \le F(U_0 \setminus K_i) + F(\operatorname{conv}[K_i]) \le F(U_0 \setminus K_i) + F(K_i) = F(U_0). \]
Furthermore, $F([U]_m) \le F(U)$ by Lemma 2.6 while $G([U]_m) \le G(U_0)$ by Lemma 3.4. Thus $E([U]_m) \le E(U_0)$, the last two inequalities being strict unless (i) $G(U_0) = 0$ or (ii) $U_0 = [U]_m$. Since $U_0$ is an energy minimizer, we must be in case (i) or case (ii). In the first case both $U_0$ and hence $U \subset \operatorname{conv}[U_0]$ lie in the convex set $\{g = 0\}$. If $U$ had area greater than $m$, contracting $U$ by a factor $\lambda = \sqrt{m/\mathcal{H}^2(U)}$ toward its center of mass would violate minimality: $E(\lambda U) = \lambda E(U) < E(U_0)$. In case (ii), $U_0 = [U]_m = U \cap \{g \le \lambda\}$ and hence $U \subset \operatorname{conv}[U_0]$ both lie in the convex set $\{g \le \lambda\}$. Either way, $U = U_0$, in which case $K_i$ was convex.

Finally, if each pair of convex components $K_i$ and $K_j$ are separated by positive distance, the proof of the theorem will be complete. Since $\partial^* K_i$ and $\partial^* K_j$ are disjoint, the two convex sets cannot share any edges. Being convex, the intersection of their closures can consist of no more than a point. If it is a point, then Corollary 2.8 demonstrates that replacing $K_i \cup K_j$ by its convex hull increases area without increasing surface energy. Comparing the energy of $U = U_0 \cup \operatorname{conv}[K_i \cup K_j]$ with $U_0$ would then yield the same contradiction as before. Thus $K_i$ and $K_j$ are disjoint.

The next proposition implies that $E(U)$ attains its minimum among all sets in $\mathcal{F}(\mathbf{R}^2)$ having area $m$, and that the minimizers all lie in a single large ball of radius $R_0$. To prove this, it is useful to have scaled copies of the Wulff shape $W$ available for comparison. Recall that $W := \{f^* \le 1\}$ minimizes $F(U)$ for its area, and is given through the dual function (2) to the surface tension. Given $m > 0$, define $W_m$ to be the dilate of $W$ having $\mathcal{H}^2(W_m) = m$. Then if $U \in \mathcal{F}(\mathbf{R}^2)$,
\[ F(U) \ge F(W_1) \sqrt{\mathcal{H}^2(U)}. \tag{36} \]
In two dimensions, a simple estimate (following, e.g., from Corollary 2.8 and Lemma 2.6) controls the diameter of a connected open set $U \in \mathcal{F}(\mathbf{R}^2)$ in terms of its surface energy:
\[ \operatorname{diam} U := \sup_{x, y \in U} |x - y| \le (2 f_{\min})^{-1} F(U); \tag{37} \]
here again, $f_{\min} > 0$ is the minimum of the surface tension $f(\hat{n})$ on $|\hat{n}| = 1$.

Proposition A.8 (Bound on the Location of Energy Minimizing Crystals). There is some radius $R_0 < +\infty$ (given by (38)) depending only on $f$, $g$ and $m$, such that $U_0 \subset B_{R_0}(0)$ whenever $U_0$ minimizes $E(U)$ on $\mathcal{F}_m(B_R)$ for some $R < +\infty$.

Proof. Assume the Wulff shape $W_m$ is translated so that $G(W_m) \le G(W_m + x)$ for all $x \in \mathbf{R}^2$, and define $\lambda_m := \sup\{g(x) \mid x \in W_m\}$ and $V_m := \{x \mid g(x) \le \lambda_m\}$. This set is bounded since the convex potential $g$ attains its minimum on a bounded set $\{g = 0\}$.


Taking $R_0$ from (38) ensures $W_m \subset B_{R_0}(0)$, hence the energy $E(U_0) \le E(W_m)$ is controlled in all cases of interest: $R \ge R_0$. The surface energy $F(U_0)$ is controlled a fortiori, leading through (37) to a bound $r := (2 f_{\min})^{-1} E(W_m)$ on the diameter of each of the convex components which make up $U_0$ according to Theorem A.7. Enlarging $V_m$ by radius $r$, the result will be proved by showing that unless $U_0 \subset V_m + B_r(0)$, some $U \in \mathcal{F}_m(B_R)$ has lower energy. It will therefore be sufficient to take
\[ R_0 := (2 f_{\min})^{-1} E(W_m) + \sup_{x \in V_m} |x|. \tag{38} \]

Suppose that a connected component $K$ of $U_0$ fails to be contained in $V_m + B_r(0)$; it must then be disjoint from $V_m$. This observation, together with (36), shows that the energy gained by removing $K$ from $U_0$ to leave $U' := U_0 \setminus K$ is at least
\[ E(U_0) - E(U') = F(K) + G(K) > F(W_1) \sqrt{\mathcal{H}^2(K)} + \lambda_m \mathcal{H}^2(K), \tag{39} \]

where the equality follows from (32) and Theorem A.7. $U'$ will not satisfy the area constraint, but there is room inside $W_m \setminus U'$ to restore the excess mass $\mathcal{H}^2(K)$ since $\mathcal{H}^2(U') + \mathcal{H}^2(K) = m = \mathcal{H}^2(W_m)$. Because $W_m \subset V_m$, the potential energy cost for introducing this mass will be $G(U) - G(U') \le \lambda_m \mathcal{H}^2(K)$. If $U$ can be formed without paying too great a price in surface energy, the gain (39) will dominate. Choose a scaled copy of the Wulff shape $W_\mu \subset W_m$ for which $\mathcal{H}^2(W_\mu \setminus U') = \mathcal{H}^2(K)$, and define $U := W_\mu \cup U'$. Certainly $U \in \mathcal{F}(B_R)$ if $R \ge R_0$. The surface energy of $U$ is controlled by (32), (36) and $\mu \ge \mathcal{H}^2(K)$:
\[ F(U) - F(U') \le F(W_\mu) - F(W_\mu \cap U') \le F(W_1) \Big( \mu^{1/2} - \sqrt{\mu - \mathcal{H}^2(K)} \Big) \le F(W_1) \sqrt{\mathcal{H}^2(K)}. \]
The three preceding inequalities yield $E(U) < E(U_0)$, contradicting the fact that $U_0$ is an energy minimizer. We must therefore conclude that $K \subset B_{R_0}(0)$.

Proposition A.9 (Bound on the Smallness of Convex Components). Let $U_0$ minimize the energy $E(U)$ among sets in $\mathcal{F}(\mathbf{R}^2)$ having area $m$. If $K$ is one of the disjoint convex components of $U_0$ then $\mathcal{H}^2(K) \ge \mu$, the bound $\mu > 0$ depending only on $m$ and the integrands $f$ and $g$.

Proof. Choose the origin of $\mathbf{R}^2$ to lie somewhere in $K$. Since $K$ is convex, it may be contracted by a factor $0 < \eta < 1$ without intersecting $U := U_0 \setminus K$ or indeed any dilation of $U$ by factor $\lambda > 1$. Moreover, $\eta$ and $\lambda$ may be chosen to depend on each other in such a way that $\eta K \cup \lambda U$ has area $m$. Then the energy of this configuration is bounded below by $E(U_0)$, which will lead to a lower bound on $\mathcal{H}^2(K)$. Before the origin was shifted, $U_0$ was contained in the ball $B_{R_0}(0)$ by Proposition A.8; $R_0$ depended only on $f$, $g$ and $m$. It will still be true that $U_0$ is contained in a ball of radius $2R_0$ about the new origin. The infinitesimal increase in $E(\lambda U) = \lambda F(U) + \lambda^2 \int_U g(\lambda x)\, d\mathcal{H}^2(x)$ with $\lambda$ is given by
\[ \frac{d}{d\lambda} E(\lambda U) \Big|_{\lambda = 1} = F(U) + \int_U \big( 2g + \langle x, \nabla g \rangle \big)\, d\mathcal{H}^2(x); \tag{40} \]
being convex, $g(x)$ is Lipschitz on $|x| < 2R_0$ and the dominated convergence theorem has been applied. Since $F(U) \le E(U_0)$ (by Theorem A.7) the cost (40) of dilating $U$ can be controlled by a constant depending only on $f$, $g$ and $m$. On the other hand, both the surface and potential energy of $K$ will decrease as it is contracted, the latter because $\eta K \subset K$. The proposition is proved by the next estimate, which shows that unless $\mathcal{H}^2(K)$ is bounded below, the gain in $F(\eta K)$ with a small change in $\lambda$ outweighs the cost (40); this would be inconsistent with $E(U_0)$ a global minimum. The estimate relies on (36) and the area constraint $\eta^2 \mathcal{H}^2(K) + \lambda^2 (m - \mathcal{H}^2(K)) = m$; when $\mathcal{H}^2(K)$ is small, a slight change in $\lambda$ results in a huge change in $\eta$, since differentiating the constraint at $\eta = \lambda = 1$ gives $d\eta/d\lambda = -(m - \mathcal{H}^2(K))/\mathcal{H}^2(K)$. Thus
\begin{align}
-\frac{d}{d\lambda} F(\eta K) \Big|_{\lambda = 1} &= -F(K)\, \frac{d\eta}{d\lambda} \Big|_{\eta = \lambda = 1} \tag{41} \\
&\ge \mathcal{H}^2(K)^{-1/2}\, \big(m - \mathcal{H}^2(K)\big)\, F(W_1) \tag{42}
\end{align}
diverges with $\mathcal{H}^2(K) \to 0$. The proposition is concluded by choosing $\mu$ small enough so $\mathcal{H}^2(K) < \mu$ implies the gain (42) in surface energy alone outweighs the $U_0$ independent bound on the cost (40).

Acknowledgement. This work is based on a Princeton University Ph.D. thesis. Funding was provided by a 1967 Scholarship from the Natural Sciences and Engineering Research Council of Canada, and a Proctor Fellowship of the Princeton Graduate School. While preparing this manuscript, the author enjoyed the hospitality of the Courant Institute of Mathematical Sciences and was supported by an American Mathematical Society Centennial Fellowship and National Science Foundation grant DMS 9622997. It is the author's pleasure to express his deep gratitude to his advisor and mentor, Elliott Lieb. He is also grateful to the many others who provided fruitful discussions and vital remarks: Ivan Blank, Kate Okikiolu, Felix Otto, Janet Rankin, Stephen Semmes, Jan-Philip Solovej and Jean Taylor. Frederick Almgren proposed this problem for investigation, and was an ongoing source of advice and inspiration. It is with great sadness and fond remembrance that this paper is dedicated to him.

References

1. Almgren, F., Taylor, J.E. and Wang, L.: Curvature-driven flows: A variational approach. SIAM J. Control Optim. 31, 387–438 (1993)
2. Angenent, S. and Gurtin, M.E.: Multiphase thermomechanics with interfacial structure 2. Evolution of an isothermal interface. Arch. Rational Mech. Anal. 108, 323–391 (1989)
3. Angenent, S. and Gurtin, M.E.: Anisotropic motion of a phase interface: Well-posedness of the initial value problem and qualitative properties of the interface. J. Reine Angew. Math. 446, 1–47 (1994)
4. Avron, J.E., Taylor, J.E. and Zia, R.K.P.: Equilibrium shapes of crystals in a gravitational field: Crystals on a table. J. Stat. Phys. 33, 493–522 (1983)
5. Balibar, S. and Castaing, B.: Helium: Solid-liquid interfaces. Surface Science Reports 5, 87–144 (1985)
6. Brenier, Y.: Décomposition polaire et réarrangement monotone des champs de vecteurs. C.R. Acad. Sci. Paris Sér. I Math. 305, 805–808 (1987)
7. Brenier, Y.: Polar factorization and monotone rearrangement of vector-valued functions. Comm. Pure Appl. Math. 44, 375–417 (1991)
8. Caffarelli, L.A.: The regularity of mappings with a convex potential. J. Am. Math. Soc. 5, 99–104 (1992)
9. Caffarelli, L.A.: Boundary regularity of maps with convex potentials. Comm. Pure Appl. Math. 45, 1141–1151 (1992)
10. Caffarelli, L.A.: Boundary regularity of maps with convex potentials – II. Ann. of Math. (2) 144, 453–496 (1996)


11. Cahn, J.W. and Hoffman, D.W.: A vector thermodynamics for anisotropic surfaces – II. Curved and faceted surfaces. Acta Metallurgica 22, 1205–1215 (1974)
12. Chen, Y.-G., Giga, Y. and Goto, S.: Uniqueness and existence of viscosity solutions of generalized mean curvature flow equations. J. Differ. Geom. 33, 749–786 (1991)
13. Curie, P.: Sur la formation des cristaux et sur les constantes capillaires de leurs différentes faces. Bulletin de la Société Française de Minéralogie et de Cristallographie 8, 145–150 (1885)
14. Dacorogna, B. and Pfister, C.E.: Wulff theorem and best constant in Sobolev inequality. J. Math. Pures Appl. 71, 97–118 (1992)
15. Dinghas, A.: Über einen geometrischen Satz von Wulff für die Gleichgewichtsform von Kristallen. Z. Krist. 105, 304–314 (1944)
16. Dobrushin, R., Kotecky, R. and Shlosman, S.: Wulff Construction: a Global Shape from Local Interaction. Providence, RI: American Mathematical Society, 1992
17. Evans, L.C. and Gariepy, R.F.: Measure Theory and Fine Properties of Functions. Boca Raton, FL: CRC Press, 1992
18. Evans, L.C. and Spruck, J.: Motion of level sets by mean curvature. I. J. Differ. Geom. 33, 635–681 (1991)
19. Federer, H.: Geometric Measure Theory. New York: Springer-Verlag, 1969
20. Finn, R.: The sessile liquid drop, 1. Symmetric case. Pacific J. Math. 88, 549–587 (1980)
21. Finn, R.: Equilibrium Capillary Surfaces. New York: Springer-Verlag, 1986
22. Fonseca, I.: The Wulff theorem revisited. Proc. Roy. Soc. London Ser. A 432, 125–145 (1991)
23. Fonseca, I. and Müller, S.: A uniqueness proof for the Wulff theorem. Proc. Roy. Soc. Edinburgh Sect. A 119, 125–136 (1991)
24. Gage, M.E.: Curve shortening makes convex curves circular. Invent. Math. 76, 357–364 (1984)
25. Gage, M.E.: Minkowski plane geometry and anisotropic curvature flow of curves. In: A. Damlamian, J. Spruck and A. Visintin, editors, Curvature Flows and Related Topics, volume 5 of Internat. Ser. Math. Sci. Appl., Tokyo: GAKUTO, 1995, pp. 83–97
26. Gage, M.E.: Evolving plane curves by curvature in relative geometries. Duke Math. J. 72, 441–466 (1993)
27. Gage, M.E. and Hamilton, R.S.: The heat equation shrinking convex plane curves. J. Differential Geom. 23, 69–96 (1986)
28. Gage, M.E. and Li, Y.: Evolving plane curves by curvature in relative geometries. II. Duke Math. J. 75, 79–98 (1994)
29. Gangbo, W.: An elementary proof of the polar factorization of vector-valued functions. Arch. Rational Mech. Anal. 128, 381–399 (1994)
30. Gibbs, J.W.: On the equilibrium of heterogeneous substances. 1878. In: Collected Works, Volume 1, esp. footnote p. 325. New York: Longmans, Green and Company, 1898, pp. 55–353
31. Giusti, E.: Minimal Surfaces and Functions of Bounded Variation. Boston: Birkhäuser, 1984
32. Grayson, M.A.: The heat equation shrinks embedded plane curves to round points. J. Differ. Geom. 26, 285–314 (1987)
33. Gurtin, M.E., Soner, H.M. and Souganidas, P.E.: Anisotropic motion of an interface relaxed by the formation of infinitesimal wrinkles. J. Differ. Eqs. 119, 54–108 (1995)
34. Hadwiger, H. and Ohmann, D.: Brunn–Minkowskischer Satz und Isoperimetrie. Math. Z. 66, 1–8 (1956)
35. Herring, C.: Some theorems on the free energy of crystal surfaces. Phys. Rev. 82, 87–93 (1951)
36. Herring, C.: The use of classical macroscopic concepts in surface-energy problems. In: R. Gomer and C.S. Smith, editors, Structure and Properties of Solid Surfaces, Chicago, IL: University of Chicago Press, 1953, pp. 5–72
37. Heyraud, J.C. and Métois, J.J.: Establishment of the equilibrium shape of metal crystallites on a foreign substrate: Gold on graphite. J. Crystal Growth 50, 571–574 (1980)
38. Heyraud, J.C. and Métois, J.J.: Equilibrium shape and temperature: Lead on graphite. Surface Science 128, 334–342 (1983)
39. Hilton, H.: Mathematical Crystallography. Oxford: Clarendon Press, 1903
40. Huisken, G.: Flow by mean curvature of convex surfaces into spheres. J. Differ. Geom. 20, 237–266 (1984)
41. Keshishev, K.O., Parshin, A.Ya. and Shal'nikov, A.I.: Surface phenomena in quantum crystals. Soviet Sci. Rev. Sect. A Phys. Rev. 4, 155–218 (1982)
42. Liebmann, H.: Der Curie-Wulff'sche Satz über Combinationsformen von Krystallen. Z. Krist. 53, 171–177 (1914)


43. Luckhaus, S. and Sturzenhecker, T.: Implicit time discretization for the mean curvature flow equation. Calc. Var. Partial Differ. Eqs. 3, 253–271 (1995)
44. Maris, H.J. and Andreev, A.F.: The surface of crystalline helium-4. Physics Today, 25–30, February (1987)
45. McCann, R.J.: A Convexity Theory for Interacting Gases and Equilibrium Crystals. PhD thesis, Princeton University, 1994
46. McCann, R.J.: Existence and uniqueness of monotone measure-preserving maps. Duke Math. J. 80, 309–323 (1995)
47. McCann, R.J.: A convexity principle for interacting gases. Adv. Math. 128, 153–179 (1997)
48. Okikiolu, K.: Personal communication
49. Pavlovska, A. and Nenow, D.: Experimental investigation of the surface melting of equilibrium form faces of diphenyl. Surface Science 27, 211–217 (1971)
50. Pavlovska, A. and Nenow, D.: Experimental study of the equilibrium form of negative crystals in diphenyl. J. Crystal Growth 8, 209–212 (1971)
51. Pavlovska, A. and Nenow, D.: Les surfaces non-singulières sur la forme d'équilibre du naphtalène. J. Crystal Growth 12, 9–12 (1971)
52. Reshetnyak, Yu.G.: Weak convergence of completely additive vector functions on a set. Siberian Math. J. 9, 1039–1045 (1968)
53. Schneider, R.: Convex Bodies: The Brunn–Minkowski Theory. Cambridge: Cambridge University Press, 1993
54. Schonmann, R.H. and Shlosman, S.B.: Constrained variational problem with applications to the Ising model. J. Stat. Phys. 83, 867–905 (1996)
55. Soner, H.M.: Motion of a set by the mean curvature of its boundary. J. Differ. Eqs. 101, 313–372 (1993)
56. Taylor, J.E.: Existence and structure of solutions to a class of nonelliptic variational problems. Volume 14 of Symposia Mathematica, London: Academic Press, 1974, pp. 499–508
57. Taylor, J.E.: Unique structure of solutions to a class of nonelliptic variational problems. In: Differential geometry. Volume 27 of Proc. Sympos. Pure. Math., Providence, RI: American Mathematical Society, 1975, pp. 419–427
58. Taylor, J.E.: Crystalline variational problems. Bull. Am. Math. Soc. 84, 568–588 (1978)
59. Taylor, J.E.: II–Mean curvature and weighted mean curvature. Acta Metallurgica et Materiala 40, 1475–1485 (1992)
60. Taylor, J.E. and Almgren, F.J., Jr.: Optimal crystal shapes. In: P. Concus and R. Finn, editors, Variational Methods for Free Surface Interfaces, New York: Springer-Verlag, 1987, pp. 1–11
61. Taylor, J.E., Cahn, J.W. and Handwerker, C.A.: I–Geometric models of crystal growth. Acta Metallurgica et Materiala 40, 1443–1474 (1992)
62. Vogel, T.I.: Unbounded parametric surfaces of prescribed mean curvature. Indiana Univ. Math. J. 31, 281–288 (1982)
63. von Laue, M.: Der Wulffsche Satz für die Gleichgewichtsform von Kristallen. Z. Krist. 105, 124–133 (1943)
64. Wente, H.C.: The symmetry of sessile and pendant drops. Pacific J. Math. 88, 307–397 (1980)
65. Winterbottom, W.L.: Equilibrium shape of a small particle in contact with a foreign substrate. Acta Metallurgica 15, 303–310 (1967)
66. Wulff, G.: Zur Frage der Geschwindigkeit des Wachsthums und der Auflösung der Krystallflächen. Z. Krist. 34, 449–530 (1901)
67. Zia, R.K.P., Avron, J.E. and Taylor, J.E.: Crystals in corners: the summertop construction. J. Stat. Phys. 50, 727–736 (1988)

Communicated by J. L. Lebowitz

Commun. Math. Phys. 195, 725 – 740 (1998)

Communications in

Mathematical Physics © Springer-Verlag 1998

Quantized Affine Algebras and Crystals with Core

Seok-Jin Kang (1,*), Masaki Kashiwara (2)

(1) Department of Mathematics, Seoul National University, Seoul 151-742, Korea
(2) Research Institute for Mathematical Sciences, Kyoto University, Kyoto 606, Japan

Received: 6 October 1997 / Accepted: 9 January 1998

Abstract: Motivated by the work of Nakayashiki on the inhomogeneous vertex models of 6-vertex type, we introduce the notion of crystals with core. We show that the tensor product of the highest weight crystal $B(\lambda)$ of level $k$ and the perfect crystal $B_l$ of level $l$ is isomorphic to the tensor product of the perfect crystal $B_{l-k}$ of level $l-k$ and the highest weight crystal $B(\lambda')$ of level $k$.

(*) Supported in part by Basic Science Research Institute Program, Ministry of Education of Korea, BSRI97-1414, and GARC-KOSEF at Seoul National University.

1. Introduction

In [9], Nakayashiki studied the inhomogeneous vertex models of 6-vertex type, and he explained the degeneration of the ground states from the point of view of representation theory as follows. Let $V(\Lambda_i)$ be the irreducible $U'_q(\widehat{\mathfrak{sl}}_2)$-module with highest weight $\Lambda_i$ ($i = 0, 1$) of level 1, and let $V_s$ be the $(s+1)$-dimensional $U'_q(\widehat{\mathfrak{sl}}_2)$-module. Then there exists an intertwiner

$$\Phi(z) : (V_{s-1})_z \otimes V(\Lambda_i) \to V(\Lambda_{i+1}) \otimes (V_s)_z .$$

He identified $(V_{s-1})_z$ with the degeneration of the ground states. The $q = 0$ limit can be described in terms of crystal bases. Let $B_s$ be the crystal base of $V_s$, and let $B(\Lambda_i)$ be the crystal base of $V(\Lambda_i)$. Then, as shown by Nakayashiki ([9], [10]), we have an isomorphism of crystals

$$B_{s-1} \otimes B(\Lambda_i) \cong B(\Lambda_{i+1}) \otimes B_s .$$

The purpose of this paper is to generalize the above result on crystals to a more general situation, replacing $U'_q(\widehat{\mathfrak{sl}}_2)$ with quantized affine algebras $U'_q(\mathfrak{g})$, $B(\Lambda_i)$ with the crystals of the integrable highest weight representations of arbitrary positive level, and $B_s$ with perfect crystals.

The crystal of an integrable highest weight representation has a unique highest weight vector. Namely, it contains a unique vector $b$ such that $\tilde e_i b = 0$ for all $i$, and all the other vectors can be obtained from $b$ by applying the $\tilde f_i$'s successively. However, neither $B_{s-1} \otimes B(\Lambda_i)$ nor $B(\Lambda_{i+1}) \otimes B_s$ has this property. Instead, they satisfy a weaker property: the highest weight vector has to be replaced with a subset consisting of several vectors, which we call the core. This is a combinatorial phenomenon corresponding to the degeneration of the ground states in the exactly solvable models.

Let $B$ be a crystal. For $b \in B$, let $E(b)$ be the smallest subset of $B$ containing $b$ and stable under the $\tilde e_i$'s. We say that $B$ has a core if $E(b)$ is a finite set for any $b \in B$. For such a crystal, we define its core $C(B)$ to be $\{b \in B \mid E(b') = E(b) \text{ for every } b' \in E(b)\}$. Then the core takes over the role of the highest weight vectors: all the vectors in $B$ can be obtained from vectors in the core by applying the $\tilde f_i$'s successively.

If $D$ is a finite regular crystal and $B(\lambda)$ is the crystal of the integrable highest weight representation with highest weight $\lambda$ of level $k$, then $D \otimes B(\lambda)$ has a core and its core is given by $D \otimes u_\lambda$, where $u_\lambda$ is the highest weight vector of $B(\lambda)$. However, if we change the order of the tensor product, the situation is completely different. The crystal $B(\lambda) \otimes D$ has a core, but $u_\lambda \otimes D$ is not the core in general.

In this paper, we prove that, for a perfect crystal $B_l$ of level $l > k$, $B(\lambda) \otimes B_l$ is isomorphic to the crystal $B_{l-k} \otimes B(\lambda')$ for another dominant integral weight $\lambda'$ of level $k$ and the perfect crystal $B_{l-k}$ of level $l - k$ (see Theorem 5.4 for more precise statements). The proof is based on the theory of coherent families of perfect crystals developed in [5] and the characterization of crystals of the form $D \otimes B(\lambda)$.

We introduce the notion of regular core (Definition 4.1), and we prove that any connected regular crystal with regular core is isomorphic to a crystal of the form $C(B) \otimes B(\lambda)$ for some dominant integral weight $\lambda$ (see Theorem 4.7 for more precise statements). Then, we check the regularity condition for the coherent families of perfect crystals.

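2. Quantized Affine Algebras

Let $I$ be a finite index set and $A = (a_{ij})_{i,j \in I}$ a generalized Cartan matrix of affine type. We choose a vector space $\mathfrak{t}$ of dimension $|I| + 1$, and let $\Pi = \{\alpha_i \mid i \in I\}$ and $\Pi^\vee = \{h_i \mid i \in I\}$ be linearly independent subsets of $\mathfrak{t}^*$ and $\mathfrak{t}$, respectively, satisfying $\langle h_i, \alpha_j \rangle = a_{ij}$ for all $i, j \in I$. The $\alpha_i$ (resp. $h_i$) are called the simple roots (resp. simple coroots), and the free abelian group $Q = \bigoplus_{i \in I} \mathbb{Z}\alpha_i$ (resp. $Q^\vee = \bigoplus_{i \in I} \mathbb{Z}h_i$) is called the root lattice (resp. dual root lattice). We denote by $\delta = \sum_{i \in I} a_i \alpha_i \in Q$ the smallest positive imaginary root and by $c = \sum_{i \in I} a_i^\vee h_i \in Q^\vee$ the canonical central element (cf. [2, Chapter 6]). Set $\mathfrak{t}^*_{cl} = \mathfrak{t}^* / \mathbb{C}\delta$ and let $cl : \mathfrak{t}^* \to \mathfrak{t}^*_{cl}$ be the canonical projection. We write $\mathfrak{t}^{*0} = \{\lambda \in \mathfrak{t}^* \mid \langle c, \lambda \rangle = 0\}$ and $\mathfrak{t}^{*0}_{cl} = cl(\mathfrak{t}^{*0})$.

Let $P = \{\lambda \in \mathfrak{t}^* \mid \langle h_i, \lambda \rangle \in \mathbb{Z} \text{ for all } i \in I\}$ be the weight lattice and $P^\vee = \{h \in \mathfrak{t} \mid \langle h, \alpha_i \rangle \in \mathbb{Z} \text{ for all } i \in I\}$ the dual weight lattice. Note that $\alpha_i, \Lambda_i \in P$ and $h_i \in P^\vee$, where the $\Lambda_i \in \mathfrak{t}^*$ are linear forms satisfying $\langle h_j, \Lambda_i \rangle = \delta_{ij}$ ($i, j \in I$). Set $P_{cl} = cl(P) = \mathrm{Hom}(Q^\vee, \mathbb{Z}) \subset \mathfrak{t}^*_{cl}$, $P^0 = \{\lambda \in P \mid \langle c, \lambda \rangle = 0\} \subset \mathfrak{t}^{*0}$, and $P^0_{cl} = cl(P^0) \subset \mathfrak{t}^{*0}_{cl}$.

Since the generalized Cartan matrix $A$ is symmetrizable, there is a non-degenerate symmetric bilinear form $(\ ,\ )$ on $\mathfrak{t}^*$ satisfying

$$\langle h_i, \lambda \rangle = \frac{2(\alpha_i, \lambda)}{(\alpha_i, \alpha_i)} \quad \text{for all } i \in I,\ \lambda \in \mathfrak{t}^* .$$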


We normalize the bilinear form so that we have $(\delta, \lambda) = \langle c, \lambda \rangle$. Note that $\mathfrak{t}^{*0}_{cl}$ has a non-degenerate symmetric bilinear form induced by that on $\mathfrak{t}^*$. We take the smallest positive integer $\gamma$ such that $\gamma(\alpha_i, \alpha_i)/2$ is a positive integer for all $i \in I$.

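Definition 2.1. The quantized affine algebra $U_q(\mathfrak{g})$ is the associative algebra with 1 over $\mathbb{C}(q^{1/\gamma})$ generated by the elements $e_i$, $f_i$ ($i \in I$) and $q(h)$ ($h \in \gamma^{-1}P^\vee$) satisfying the following defining relations:

$$
\begin{aligned}
& q(0) = 1, \quad q(h)q(h') = q(h + h') \quad (h, h' \in \gamma^{-1}P^\vee), \\
& q(h)\, e_i\, q(-h) = q^{\langle h, \alpha_i \rangle} e_i, \quad q(h)\, f_i\, q(-h) = q^{-\langle h, \alpha_i \rangle} f_i \quad (h \in \gamma^{-1}P^\vee,\ i \in I), \\
& [e_i, f_j] = \delta_{ij}\, \frac{t_i - t_i^{-1}}{q_i - q_i^{-1}} \quad (i, j \in I), \\
& \sum_{k=0}^{1-a_{ij}} (-1)^k e_i^{(k)} e_j e_i^{(1-a_{ij}-k)} = \sum_{k=0}^{1-a_{ij}} (-1)^k f_i^{(k)} f_j f_i^{(1-a_{ij}-k)} = 0 \quad (i \neq j),
\end{aligned}
\tag{2.1}
$$

where $q_i = q^{(\alpha_i, \alpha_i)/2}$, $t_i = q\big(\tfrac{(\alpha_i, \alpha_i)}{2} h_i\big)$, $e_i^{(k)} = e_i^k / [k]_i!$, $f_i^{(k)} = f_i^k / [k]_i!$, $[k]_i = \dfrac{q_i^k - q_i^{-k}}{q_i - q_i^{-1}}$, and $[k]_i! = [1]_i [2]_i \cdots [k]_i$ for all $i \in I$.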
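The $q$-integers $[k]_i$ and $q$-factorials $[k]_i!$ appearing in the divided powers of Definition 2.1 can be sanity-checked numerically. The following is only a minimal sketch: it evaluates them at a rational value of $q_i$ rather than treating $q$ as an indeterminate, and the function names `q_int` and `q_factorial` are illustrative, not notation from the paper.

```python
from fractions import Fraction

def q_int(k, q):
    """[k] = (q^k - q^-k) / (q - q^-1), evaluated at a rational q (q not 0, 1, -1)."""
    q = Fraction(q)
    return (q**k - q**(-k)) / (q - 1 / q)

def q_factorial(k, q):
    """[k]! = [1][2]...[k], with the empty product [0]! = 1."""
    result = Fraction(1)
    for j in range(1, k + 1):
        result *= q_int(j, q)
    return result

# Example at q_i = 2: [2] = q + 1/q and [3] = q^2 + 1 + q^-2.
two = q_int(2, 2)
three = q_int(3, 2)
fact3 = q_factorial(3, 2)
```

Exact rational arithmetic is used so that the identities $[2]_i = q_i + q_i^{-1}$ and $[k]_i(q) = [k]_i(q^{-1})$ can be compared without rounding.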

The quantized affine algebra $U_q(\mathfrak{g})$ has a Hopf algebra structure with comultiplication $\Delta$, counit $\varepsilon$, and antipode $S$ defined by

$$
\begin{aligned}
& \Delta(q(h)) = q(h) \otimes q(h), \quad \Delta(e_i) = e_i \otimes t_i^{-1} + 1 \otimes e_i, \quad \Delta(f_i) = f_i \otimes 1 + t_i \otimes f_i, \\
& \varepsilon(q(h)) = 1, \quad \varepsilon(e_i) = \varepsilon(f_i) = 0, \\
& S(q(h)) = q(-h), \quad S(e_i) = -e_i t_i, \quad S(f_i) = -t_i^{-1} f_i
\end{aligned}
\tag{2.2}
$$

for all $h \in \gamma^{-1}P^\vee$, $i \in I$.

We denote by $U'_q(\mathfrak{g})$ the subalgebra of $U_q(\mathfrak{g})$ generated by $e_i$, $f_i$ ($i \in I$) and $q(h)$ ($h \in \gamma^{-1}Q^\vee$), which will also be called the quantized affine algebra. A $U'_q(\mathfrak{g})$-module $M$ is called integrable if it has the weight space decomposition $M = \bigoplus_{\lambda \in P_{cl}} M_\lambda$, where $M_\lambda = \{u \in M \mid q(h)u = q^{\langle h, \lambda \rangle} u \text{ for all } h \in \gamma^{-1}Q^\vee\}$, and $M$ is $U'_q(\mathfrak{g})_i$-locally finite (i.e., $\dim U'_q(\mathfrak{g})_i u < \infty$ for all $u \in M$) for all $i \in I$, where $U'_q(\mathfrak{g})_i$ denotes the subalgebra of $U'_q(\mathfrak{g})$ generated by $e_i$, $f_i$, and $t_i^{\pm 1}$.

3. Crystals with Core

In studying the structure of integrable representations of quantized affine algebras, the crystal base theory developed in [3] provides a very powerful combinatorial method. In this section, we develop the theory of crystals with core. We first recall the definition of crystals given in [4].


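Definition 3.1. A crystal $B$ is a set together with the maps $\mathrm{wt} : B \to P$, $\varepsilon_i : B \to \mathbb{Z} \sqcup \{-\infty\}$, $\varphi_i : B \to \mathbb{Z} \sqcup \{-\infty\}$, $\tilde e_i : B \to B \sqcup \{0\}$, $\tilde f_i : B \to B \sqcup \{0\}$ ($i \in I$) satisfying the axioms:

$$
\begin{aligned}
& \langle h_i, \mathrm{wt}(b) \rangle = \varphi_i(b) - \varepsilon_i(b) \quad \text{for all } b \in B, \\
& \mathrm{wt}(\tilde e_i b) = \mathrm{wt}(b) + \alpha_i \quad \text{for } b \in B \text{ with } \tilde e_i b \in B, \\
& \mathrm{wt}(\tilde f_i b) = \mathrm{wt}(b) - \alpha_i \quad \text{for } b \in B \text{ with } \tilde f_i b \in B, \\
& \tilde f_i b = b' \text{ if and only if } b = \tilde e_i b' \quad \text{for } b, b' \in B, \\
& \tilde e_i b = \tilde f_i b = 0 \quad \text{if } \varepsilon_i(b) = -\infty .
\end{aligned}
\tag{3.1}
$$

Definition 3.2. For two crystals $B_1$ and $B_2$, a morphism of crystals from $B_1$ to $B_2$ is a map $\psi : B_1 \sqcup \{0\} \to B_2 \sqcup \{0\}$ such that

$$
\begin{aligned}
& \psi(0) = 0, \\
& \psi(\tilde e_i b) = \tilde e_i \psi(b) \quad \text{for } b \in B_1 \text{ with } \tilde e_i b \in B_1,\ \psi(b) \in B_2,\ \psi(\tilde e_i b) \in B_2, \\
& \psi(\tilde f_i b) = \tilde f_i \psi(b) \quad \text{for } b \in B_1 \text{ with } \tilde f_i b \in B_1,\ \psi(b) \in B_2,\ \psi(\tilde f_i b) \in B_2, \\
& \mathrm{wt}(\psi(b)) = \mathrm{wt}(b) \quad \text{for } b \in B_1 \text{ with } \psi(b) \in B_2, \\
& \varepsilon_i(\psi(b)) = \varepsilon_i(b), \quad \varphi_i(\psi(b)) = \varphi_i(b) \quad \text{for } b \in B_1 \text{ with } \psi(b) \in B_2 .
\end{aligned}
\tag{3.2}
$$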

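A morphism $\psi : B_1 \to B_2$ is called an embedding if the map $\psi : B_1 \sqcup \{0\} \to B_2 \sqcup \{0\}$ is injective. In this case, we call $B_1$ a subcrystal of $B_2$.

For two crystals $B_1$ and $B_2$, we define their tensor product $B_1 \otimes B_2$ as follows. The underlying set is $B_1 \times B_2$. For $b_1 \in B_1$, $b_2 \in B_2$, we write $b_1 \otimes b_2$ for $(b_1, b_2)$ and we understand $b_1 \otimes 0 = 0 \otimes b_2 = 0$. We define the maps $\mathrm{wt} : B_1 \otimes B_2 \to P$, $\varepsilon_i : B_1 \otimes B_2 \to \mathbb{Z} \sqcup \{-\infty\}$, $\varphi_i : B_1 \otimes B_2 \to \mathbb{Z} \sqcup \{-\infty\}$, $\tilde e_i : B_1 \otimes B_2 \to B_1 \otimes B_2 \sqcup \{0\}$, $\tilde f_i : B_1 \otimes B_2 \to B_1 \otimes B_2 \sqcup \{0\}$ ($i \in I$) as follows:

$$
\begin{aligned}
\mathrm{wt}(b_1 \otimes b_2) &= \mathrm{wt}(b_1) + \mathrm{wt}(b_2), \\
\varepsilon_i(b_1 \otimes b_2) &= \max\big(\varepsilon_i(b_1),\ \varepsilon_i(b_2) - \langle h_i, \mathrm{wt}(b_1) \rangle\big), \\
\varphi_i(b_1 \otimes b_2) &= \max\big(\varphi_i(b_2),\ \varphi_i(b_1) + \langle h_i, \mathrm{wt}(b_2) \rangle\big), \\
\tilde e_i(b_1 \otimes b_2) &=
\begin{cases}
\tilde e_i b_1 \otimes b_2 & \text{if } \varphi_i(b_1) \ge \varepsilon_i(b_2), \\
b_1 \otimes \tilde e_i b_2 & \text{if } \varphi_i(b_1) < \varepsilon_i(b_2),
\end{cases} \\
\tilde f_i(b_1 \otimes b_2) &=
\begin{cases}
\tilde f_i b_1 \otimes b_2 & \text{if } \varphi_i(b_1) > \varepsilon_i(b_2), \\
b_1 \otimes \tilde f_i b_2 & \text{if } \varphi_i(b_1) \le \varepsilon_i(b_2).
\end{cases}
\end{aligned}
\tag{3.3}
$$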
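The tensor product rule (3.3) is easy to experiment with in the rank-one case. The sketch below is illustrative only: it models a single index $i$, uses a hypothetical class `String` for the crystal with $s+1$ elements arranged in one $i$-string, and recovers $\langle h_i, \mathrm{wt}(b) \rangle$ as $\varphi_i(b) - \varepsilon_i(b)$ from the first axiom in (3.1). The names `String`, `Tensor` and `string_length` are assumptions of this example, not notation from the paper.

```python
class String:
    """Rank-one crystal with s+1 elements 0..s; b counts the f's applied to the top element."""
    def __init__(self, s):
        self.s = s
    def elements(self):
        return list(range(self.s + 1))
    def eps(self, b):
        return b
    def phi(self, b):
        return self.s - b
    def e(self, b):
        return b - 1 if b > 0 else None      # None plays the role of 0
    def f(self, b):
        return b + 1 if b < self.s else None

class Tensor:
    """B1 (x) B2 with the maps of rule (3.3), for one fixed index i."""
    def __init__(self, B1, B2):
        self.B1, self.B2 = B1, B2
    def elements(self):
        return [(b1, b2) for b1 in self.B1.elements() for b2 in self.B2.elements()]
    @staticmethod
    def _h(B, b):
        return B.phi(b) - B.eps(b)           # <h_i, wt(b)>
    def eps(self, b):
        b1, b2 = b
        return max(self.B1.eps(b1), self.B2.eps(b2) - self._h(self.B1, b1))
    def phi(self, b):
        b1, b2 = b
        return max(self.B2.phi(b2), self.B1.phi(b1) + self._h(self.B2, b2))
    def e(self, b):
        b1, b2 = b
        if self.B1.phi(b1) >= self.B2.eps(b2):
            x = self.B1.e(b1)
            return None if x is None else (x, b2)
        x = self.B2.e(b2)
        return None if x is None else (b1, x)
    def f(self, b):
        b1, b2 = b
        if self.B1.phi(b1) > self.B2.eps(b2):
            x = self.B1.f(b1)
            return None if x is None else (x, b2)
        x = self.B2.f(b2)
        return None if x is None else (b1, x)

def string_length(T, b):
    """Number of times e can be applied to b before reaching 0."""
    n = 0
    while T.e(b) is not None:
        b = T.e(b)
        n += 1
    return n

T = Tensor(String(1), String(1))
highest = [b for b in T.elements() if T.e(b) is None]
T2 = Tensor(String(2), String(3))
```

In the two-by-two example the elements killed by $\tilde e_i$ are `(0, 0)` and `(0, 1)`, which generate strings of lengths 3 and 1, as expected for a rank-one tensor product.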

In the sequel, we will only consider crystals over the quantized affine algebra $U'_q(\mathfrak{g})$. Hence the weights of crystals will be elements of $P_{cl}$. For example, for $\lambda \in P_{cl}$, consider the set $T_\lambda = \{t_\lambda\}$ with one element. Define $\mathrm{wt}(t_\lambda) = \lambda$, $\varepsilon_i(t_\lambda) = \varphi_i(t_\lambda) = -\infty$, and $\tilde e_i(t_\lambda) = \tilde f_i(t_\lambda) = 0$ ($i \in I$). Then $T_\lambda$ is a crystal and we have $T_\lambda \otimes T_{\lambda'} \cong T_{\lambda + \lambda'}$. For a dominant integral weight $\lambda$, we denote by $B(\lambda)$ the crystal associated with the integrable highest weight representation with highest weight $\lambda$, and by $u_\lambda$ the highest weight vector of $B(\lambda)$. The highest weight vector $u_\lambda$ is the unique element of $B(\lambda)$ with weight $\lambda$ satisfying $\tilde e_i u_\lambda = 0$ for all $i \in I$.

For a subset $J$ of $I$, we denote by $U'_q(\mathfrak{g}_J)$ the subalgebra of $U'_q(\mathfrak{g})$ generated by $e_i$, $f_i$, and $t_i^{\pm 1}$ ($i \in J$). Note that if $J \subsetneq I$, then $\mathfrak{g}_J$ is a finite-dimensional semisimple Lie algebra. Similarly, for a subset $J$ of $I$, we denote by $B_J$ the crystal $B$ equipped with the maps $\mathrm{wt}$, $\varepsilon_i$, $\varphi_i$, $\tilde e_i$, and $\tilde f_i$ for $i \in J$. We say that a crystal $B$ over $U'_q(\mathfrak{g})$ is regular if, for any $J \subsetneq I$, $B_J$ is isomorphic to the crystal associated with an integrable $U'_q(\mathfrak{g}_J)$-module. This condition is equivalent to saying that the same assertion holds for any $J \subsetneq I$ with one or two elements (see [6, Proposition 2.4.4]).

Let $B$ be a regular crystal. For $b \in B$, let $\tilde e_i^{\max} b = \tilde e_i^k b$, where $k$ is such that $\tilde e_i^k b \neq 0$ and $\tilde e_i^{k+1} b = 0$, and define

$$
\begin{aligned}
E(b) &= \{\tilde e_{i_1} \cdots \tilde e_{i_l} b \mid l \ge 0 \text{ and } i_1, \ldots, i_l \in I\} \setminus \{0\}, \\
E^{\max}(b) &= \{\tilde e_{i_1}^{\max} \cdots \tilde e_{i_l}^{\max} b \mid l \ge 0 \text{ and } i_1, \ldots, i_l \in I\}.
\end{aligned}
\tag{3.4}
$$

It follows that

$$
\begin{aligned}
& E^{\max}(b) \subset E(b), \\
& E(b') \subset E(b) \quad \text{for all } b' \in E(b), \\
& E^{\max}(b') \subset E^{\max}(b) \quad \text{for all } b' \in E^{\max}(b).
\end{aligned}
\tag{3.5}
$$

Recall that the Weyl group $W$ acts on the regular crystals ([4]). For each $i \in I$, the simple reflection $s_i$ acts on the regular crystal $B$ by

$$
S_i(b) =
\begin{cases}
\tilde f_i^{\langle h_i, \mathrm{wt}(b) \rangle} b & \text{if } \langle h_i, \mathrm{wt}(b) \rangle \ge 0, \\
\tilde e_i^{-\langle h_i, \mathrm{wt}(b) \rangle} b & \text{if } \langle h_i, \mathrm{wt}(b) \rangle \le 0.
\end{cases}
\tag{3.6}
$$

For $w = s_{i_r} s_{i_{r-1}} \cdots s_{i_1} \in W$, its action is given by $S_w = S_{i_r} S_{i_{r-1}} \cdots S_{i_1}$. We first prove:

Lemma 3.3. Let $B$ be a finite regular crystal.
(a) We have $E(S_w(b)) = E(b)$ for all $b \in B$, $w \in W$.
(b) $E(b)$ is a connected component of $B$ for any $b \in B$.

Proof. (a) It suffices to show that $S_i(b) \in E(b)$ for all $b \in B$, $i \in I$. If $\lambda = \mathrm{wt}(b)$ satisfies $\langle h_i, \lambda \rangle \le 0$, then by (3.6), our assertion is obvious. If $\langle h_i, \lambda \rangle > 0$, take $w = s_{i_l} \cdots s_{i_1} \in W$ such that $\langle h_{i_k}, s_{i_{k-1}} \cdots s_{i_1} \lambda \rangle < 0$ for $k = 1, \ldots, l$ and $s_i \lambda = w\lambda$ (see [1, Lemma 1.4]). Then for each $n \ge 1$, we have $S_w (S_i S_w)^n b \in E(b)$. Since $S_i S_w$ has finite order, there exists $n > 0$ such that $(S_i S_w)^n b = b$. Hence $S_i b = S_w (S_i S_w)^{n-1} b \in E(b)$.

(b) Note that for any $b \in B$ with $\tilde f_i b \neq 0$, we have $\tilde f_i b = \tilde e_i^{\varphi_i(b) - 1} S_i(\tilde e_i^{\max} b)$. By (a), this implies that $E(b)$ is stable under $\tilde f_i$ for all $i \in I$. Hence we have the desired result.

Definition 3.4. We say that a regular crystal $B$ has a core if $E(b)$ is a finite set for any $b \in B$. In this case, we define the core $C(B)$ of $B$ to be

$$C(B) = \{b \in B \mid E(b') = E(b) \text{ for every } b' \in E(b)\}, \tag{3.7}$$

and $B$ is called a crystal with core.

In the following, we prove some of the basic properties of the crystals with core.

Lemma 3.5. Suppose that $B$ has a core $C(B)$.
(a) The core $C(B)$ is stable under the $\tilde e_i$'s ($i \in I$).
(b) $E(b) \cap C(B) \neq \emptyset$ for all $b \in B$.
(c) If $b \in C(B)$, then either $\tilde e_i b = 0$ for all $i \in I$ or there exist $i_1, \ldots, i_l \in I$ ($l \ge 1$) such that $b = \tilde e_{i_l} \cdots \tilde e_{i_1} b$.

Proof. (a) If $b \in C(B)$, then $E(b) \subset C(B)$, since for $b' \in E(b)$ and $b'' \in E(b') \subset E(b)$, we have $E(b'') = E(b) = E(b')$.
(b) For any $b \in B$, take $b' \in E(b)$ such that $E(b')$ has the smallest cardinality. Then, since $E(b'') \subset E(b')$ for any $b'' \in E(b') \subset E(b)$, we have $E(b'') = E(b')$, which implies that $b'$ belongs to $C(B)$.
(c) If $b \in C(B)$ and $\tilde e_{i_1} b \neq 0$ for some $i_1 \in I$, then by definition we have $E(b) = E(\tilde e_{i_1} b)$. Then $b \in E(b) = E(\tilde e_{i_1} b)$ implies $b = \tilde e_{i_l} \cdots \tilde e_{i_1} b$ for some $i_2, \ldots, i_l \in I$.

Lemma 3.6. Let $B$ be a regular crystal with core and $H$ a subset of $B$.
(a) If $H$ is stable under the $\tilde e_i$'s ($i \in I$) and $E(b) \cap H \neq \emptyset$ for any $b \in B$, then $C(B)$ is contained in $H$.
(b) If, in addition, $E(b) = E(b')$ for any $b \in H$ and $b' \in E(b)$, then $H = C(B)$.

Proof. (a) If $b \in C(B)$, take $b' \in E(b) \cap H$. Then $b \in E(b) = E(b') \subset H$.
(b) If $b \in H$ and $b' \in E(b) \cap C(B)$, then $b \in E(b) = E(b') \subset C(B)$.

Corollary 3.7. If $B$ is a finite regular crystal, then $C(B) = B$.

Proof. We may assume that $B$ is connected. By Lemma 3.3, we have $E(b) = B$ for all $b \in B$. Hence $C(B) = B$.
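The sets $E(b)$ of (3.4) and the core of Definition 3.4 depend only on the $\tilde e_i$-arrows, so on a finite set they can be computed by a graph search. The sketch below is illustrative: each $\tilde e_i$ is given as a Python dictionary (a missing key meaning $\tilde e_i b = 0$), and the helper names `E` and `core` are choices of this example. The first data set is the two-element crystal $B_1$ for $\widehat{\mathfrak{sl}}_2$ in the coordinates $(x_1, x_2)$, with arrows assumed from the standard description in the literature; the second is a purely hypothetical toy arrow diagram, not claimed to be a regular crystal, chosen so that the core is a proper subset.

```python
def E(b, e_maps):
    """Smallest set containing b and stable under every e_i (given as dicts)."""
    seen = {b}
    stack = [b]
    while stack:
        x = stack.pop()
        for e in e_maps:
            y = e.get(x)                     # None means e_i x = 0
            if y is not None and y not in seen:
                seen.add(y)
                stack.append(y)
    return frozenset(seen)

def core(elements, e_maps):
    """C(B) = {b : E(b') = E(b) for every b' in E(b)}."""
    return {
        b for b in elements
        if all(E(b2, e_maps) == E(b, e_maps) for b2 in E(b, e_maps))
    }

# Two-element level-one crystal for affine sl_2 (arrows assumed from the literature).
e0 = {(1, 0): (0, 1)}
e1 = {(0, 1): (1, 0)}
core_B1 = core([(1, 0), (0, 1)], [e0, e1])

# Hypothetical toy diagram: c -> a -> b -> a, so E(c) strictly contains E(a) = E(b).
t0 = {"c": "a", "a": "b"}
t1 = {"b": "a"}
core_toy = core(["a", "b", "c"], [t0, t1])
```

In the first example the core is the whole set, in line with Corollary 3.7; in the toy example the core is `{"a", "b"}`, and $E(\texttt{c})$ meets it, as Lemma 3.5 (b) predicts.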

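4. Structure of Crystals with Core

Let $\psi : C(B) \hookrightarrow B$ denote the inclusion map.

Definition 4.1. We say that $B$ has a regular core if the core $C(B)$ of $B$ becomes a regular crystal with the maps $\mathrm{wt}$, $\varepsilon_i$, $\varphi_i$, $\tilde e_i$, $\tilde f_i$ ($i \in I$) defined by

$$
\begin{aligned}
\tilde e_i b &= \psi^{-1}(\tilde e_i \psi(b)), \\
\tilde f_i b &=
\begin{cases}
\psi^{-1}(\tilde f_i \psi(b)) & \text{if } \tilde f_i \psi(b) \in C(B), \\
0 & \text{otherwise},
\end{cases} \\
\varepsilon_i(b) &= \varepsilon_i(\psi(b)), \\
\varphi_i(b) &= \max\{k \ge 0 \mid \tilde f_i^k b \in C(B)\} = \max\{k \ge 0 \mid b \in \tilde e_i^k C(B)\}, \\
\mathrm{wt}(b) &= \sum_i (\varphi_i(b) - \varepsilon_i(b)) \Lambda_i \in P_{cl}.
\end{aligned}
\tag{4.1}
$$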

Let $b \in C(B)$. Then $E(b) \subset C(B)$. If $b' \in E(b)$ satisfies $\tilde f_i b' \in C(B)$, then $\tilde f_i b' \in E(\tilde f_i b') = E(\tilde e_i \tilde f_i b') = E(b') = E(b)$, and hence $E(b)$ is stable under the $\tilde f_i$'s ($i \in I$). Therefore the connected components of $C(B)$ are of the form $E(b)$.

Let $E(b_0)$ be a connected component of $C(B)$ and set $W(b) = \mathrm{wt}(\psi(b)) - \mathrm{wt}(b)$ for $b \in E(b_0)$, where $\psi : C(B) \hookrightarrow B$ is the inclusion map. Note that, for all $i, j \in I$, we have

$$
\begin{aligned}
\langle h_i, W(\tilde e_j b) \rangle
&= \langle h_i, \mathrm{wt}(\psi(\tilde e_j b)) \rangle - \langle h_i, \mathrm{wt}(\tilde e_j b) \rangle \\
&= \langle h_i, \mathrm{wt}(\psi(b)) + \alpha_j \rangle - \langle h_i, \mathrm{wt}(b) + \alpha_j \rangle \\
&= \langle h_i, \mathrm{wt}(\psi(b)) - \mathrm{wt}(b) \rangle = \langle h_i, W(b) \rangle .
\end{aligned}
$$

Hence, $W(\tilde e_j b) = W(b)$ for all $j \in I$, which implies that $W(b)$ is constant on $E(b_0)$. Let $\lambda_0 = \mathrm{wt}(\psi(b_0)) - \mathrm{wt}(b_0)$. Since

$$\langle h_i, \lambda_0 \rangle = \langle h_i, \mathrm{wt}(\psi(b_0)) - \mathrm{wt}(b_0) \rangle = \varphi_i(\psi(b_0)) - \varphi_i(b_0) \ge 0,$$

$\lambda_0$ is dominant integral. We will show that there exists a unique embedding of regular crystals $E(b_0) \otimes B(\lambda_0) \to B$ sending $b \otimes u_{\lambda_0}$ to $\psi(b)$ for all $b \in E(b_0)$, where $u_{\lambda_0}$ is the highest weight vector of $B(\lambda_0)$.

Let $D$ be a finite regular crystal, and let $\lambda$ be a dominant integral weight. We denote by $B(\lambda)$ the crystal associated with the integrable highest weight $U'_q(\mathfrak{g})$-module $V(\lambda)$ with highest weight $\lambda$, and let $u_\lambda$ be the highest weight vector of $B(\lambda)$.

Lemma 4.2. For any $b \in D \otimes B(\lambda)$, we have $E^{\max}(b) \cap (D \otimes u_\lambda) \neq \emptyset$.

Proof. If it were not true, there would exist $b = b_1 \otimes b_2 \in D \otimes B(\lambda)$ such that $E^{\max}(b) \subset D \otimes b_2$ and $b_2 \in B(\lambda) \setminus \{u_\lambda\}$. By the tensor product rule, this implies $E^{\max}(b) = E^{\max}(b_1) \otimes b_2$. Since $b_2 \neq u_\lambda$, there exists $i \in I$ such that $\varepsilon_i(b_2) > 0$. Take $b' \in E^{\max}(b_1)$ such that $\varphi_i(b') = 0$. Such a $b'$ exists by [1, Lemma 1.5]. Then we have $\tilde e_i(b' \otimes b_2) = b' \otimes (\tilde e_i b_2)$, which contradicts $\tilde e_i^{\max}(b' \otimes b_2) \in D \otimes b_2$.

Lemma 4.3. The regular crystal $D \otimes B(\lambda)$ has a regular core and $C(D \otimes B(\lambda)) = D \otimes u_\lambda$, which is isomorphic to $D$ as a crystal.

Proof. Since $E(b_1 \otimes b_2) \subset E(b_1) \otimes E(b_2)$, $D \otimes B(\lambda)$ has a core. The second assertion follows from Lemma 3.3, Lemma 3.6 and Lemma 4.2. Hence $D \otimes B(\lambda)$ has a regular core.

The following proposition asserts that successive applications of the $\tilde e_i^{\max}$'s in the complement of $D \otimes u_\lambda$ do not produce a loop. Note that if we replace $\tilde e_i^{\max}$ with $\tilde e_i$, the assertion fails. This is the difference between our case and the highest weight module case.

Proposition 4.4. Let $D$ be a finite regular crystal, and let $\lambda$ be a dominant integral weight. Then for every $b \in D \otimes B(\lambda)$, there exists a positive integer $N$ such that for any sequence $i_1, \ldots, i_N$ in $I$, $\tilde e_{i_N}^{\max} \cdots \tilde e_{i_1}^{\max} b \in D \otimes u_\lambda$ whenever $\tilde e_{i_k}^{\max} \cdots \tilde e_{i_1}^{\max} b \neq \tilde e_{i_{k-1}}^{\max} \cdots \tilde e_{i_1}^{\max} b$ for $1 \le k \le N$.

Proof. If the proposition were false, there would exist $b \in (D \otimes B(\lambda)) \setminus (D \otimes u_\lambda)$ and $l > 0$ such that

$$b = \tilde e_{i_l}^{\max} \cdots \tilde e_{i_1}^{\max} b \quad \text{and} \quad \tilde e_{i_k}^{\max} \cdots \tilde e_{i_1}^{\max} b \neq \tilde e_{i_{k-1}}^{\max} \cdots \tilde e_{i_1}^{\max} b \quad \text{for } k = 1, \ldots, l. \tag{4.2}$$

Set $\tilde e_{i_k}^{\max} \cdots \tilde e_{i_1}^{\max} b = b_k \otimes b'$ with $b_k \in D$ and $b' \in B(\lambda)$. Then $b'$ does not depend on $k$ and we have $b_k = \tilde e_{i_k}^{\max} b_{k-1}$. Since $D$ is a finite crystal, all of its weights have level 0. Hence the square lengths of its weights are well-defined.


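Since $\mathrm{wt}(b_k) = \mathrm{wt}(b_{k-1}) + \varepsilon_{i_k}(b_{k-1}) \alpha_{i_k}$, we have

$$
\begin{aligned}
(\mathrm{wt}(b_k), \mathrm{wt}(b_k))
&= (\mathrm{wt}(b_{k-1}), \mathrm{wt}(b_{k-1})) + 2\varepsilon_{i_k}(b_{k-1}) (\mathrm{wt}(b_{k-1}), \alpha_{i_k}) + \varepsilon_{i_k}(b_{k-1})^2 (\alpha_{i_k}, \alpha_{i_k}) \\
&= (\mathrm{wt}(b_{k-1}), \mathrm{wt}(b_{k-1})) + \varepsilon_{i_k}(b_{k-1}) (\alpha_{i_k}, \alpha_{i_k}) \langle h_{i_k}, \mathrm{wt}(b_{k-1}) \rangle + \varepsilon_{i_k}(b_{k-1})^2 (\alpha_{i_k}, \alpha_{i_k}) \\
&= (\mathrm{wt}(b_{k-1}), \mathrm{wt}(b_{k-1})) + (\alpha_{i_k}, \alpha_{i_k})\, \varepsilon_{i_k}(b_{k-1})\, \varphi_{i_k}(b_{k-1}) \\
&\ge (\mathrm{wt}(b_{k-1}), \mathrm{wt}(b_{k-1}))
\end{aligned}
\tag{4.3}
$$

for all $k \ge 1$. Hence the $(\mathrm{wt}(b_k), \mathrm{wt}(b_k))$ are the same for all $k \ge 1$. Since (4.3) is an equality and $\varepsilon_{i_k}(b_{k-1}) > 0$, we have $\varphi_{i_k}(b_{k-1}) = 0$. Since $\tilde e_{i_k}(b_{k-1} \otimes b') = \tilde e_{i_k} b_{k-1} \otimes b'$, we have $\varphi_{i_k}(b_{k-1}) \ge \varepsilon_{i_k}(b')$, and hence $\varepsilon_{i_k}(b') = 0$. Write $\mathrm{wt}(\tilde e_{i_l}^{\max} \cdots \tilde e_{i_1}^{\max} b) = cl(t_1 \alpha_{i_1} + \cdots + t_l \alpha_{i_l}) + \mathrm{wt}(b)$ with positive integers $t_1, \ldots, t_l$. Since $\mathrm{wt}(b) = \mathrm{wt}(\tilde e_{i_l}^{\max} \cdots \tilde e_{i_1}^{\max} b)$, $t_1 \alpha_{i_1} + \cdots + t_l \alpha_{i_l}$ is a multiple of the null root $\delta$, which implies $\{i_1, \ldots, i_l\} = I$. Hence $\varepsilon_i(b') = 0$ for all $i \in I$, which contradicts $b' \neq u_\lambda$.

Note that the subcrystal $D \otimes u_\lambda$ of $D \otimes B(\lambda)$ is isomorphic to the crystal $D \otimes T_\lambda$, where $T_\lambda$ denotes the crystal with a single element $t_\lambda$ of weight $\lambda$ and with $\varepsilon_i(t_\lambda) = \varphi_i(t_\lambda) = -\infty$. Let $B$ be a regular crystal. In the next theorem, we will show that any morphism of crystals $\Psi : D \otimes u_\lambda \to B$ commuting with the $\tilde e_i$'s ($i \in I$) can be extended uniquely to a morphism of regular crystals from $D \otimes B(\lambda)$ to $B$.

Theorem 4.5. Let $D$ be a finite regular crystal, $B$ a regular crystal, and $\lambda$ a dominant integral weight. Suppose that there is a morphism of crystals $\Psi : D \otimes u_\lambda \to B$ such that $\Psi(D \otimes u_\lambda) \subset B$ and $\Psi$ commutes with the $\tilde e_i$'s ($i \in I$). Then, if $\operatorname{rank} \mathfrak{g} > 2$, the map $\Psi$ can be uniquely extended to a morphism of regular crystals $\tilde\Psi : D \otimes B(\lambda) \to B$.

Proof. Let $\Sigma$ be the set of pairs $(S, \tilde\Psi)$ satisfying the following properties:

$$D \otimes u_\lambda \subset S \subset D \otimes B(\lambda), \tag{4.4}$$
$$\tilde e_i^{\max} S \subset S \quad \text{for any } i \in I, \tag{4.5}$$
$$\tilde\Psi \text{ is a map from } S \text{ to } B \text{ such that } \tilde\Psi|_{D \otimes u_\lambda} = \Psi, \tag{4.6}$$
$$\mathrm{wt}(\tilde\Psi(b)) = \mathrm{wt}(b) \text{ and } \varepsilon_i(\tilde\Psi(b)) = \varepsilon_i(b) \quad \text{for any } b \in S \text{ and } i \in I, \tag{4.7}$$
$$\tilde\Psi(\tilde e_i^{\max} b) = \tilde e_i^{\max} \tilde\Psi(b) \quad \text{for any } b \in S \text{ and } i \in I. \tag{4.8}$$

Since $\Sigma$ is inductively ordered, by Zorn's Lemma, it has a maximal element. Let $(S, \tilde\Psi)$ be a maximal element. It is enough to prove that $S$ is the same as $D \otimes B(\lambda)$. Assume that they are different.

First we shall prove that there exists $b \in D \otimes B(\lambda) \setminus S$ such that $\tilde e_i^{\max}(b) \in S \cup \{b\}$ for any $i \in I$. If it were not true, for any $b \in D \otimes B(\lambda) \setminus S$, there would exist $i$ such that $\tilde e_i^{\max}(b) \notin S \cup \{b\}$. Let us take $b_0 \in D \otimes B(\lambda) \setminus S$. Then there is $i_0$ such that $b_1 = \tilde e_{i_0}^{\max}(b_0) \notin S \cup \{b_0\}$. Repeating this, we can find sequences $\{b_k\}$ and $\{i_k\}$ such that $b_{k+1} = \tilde e_{i_k}^{\max}(b_k) \notin S \cup \{b_k\}$. This contradicts Proposition 4.4. Hence there exists $b \notin S$ such that $\tilde e_i^{\max} b \in S \cup \{b\}$ for all $i \in I$. We shall choose such a $b$.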


Next we shall show that there exists $i_0$ such that $\tilde e_{i_0}^{\max}(b) \in S$. Assuming the contrary, we shall deduce a contradiction. Write $b = b_1 \otimes b_2$. If $\tilde e_i^{\max} b = b$ for all $i \in I$, then

$$0 = \varepsilon_i(b_1 \otimes b_2) = \max\big(\varepsilon_i(b_1),\ \varepsilon_i(b_2) - \langle h_i, \mathrm{wt}(b_1) \rangle\big).$$

This implies $\varepsilon_i(b_1) = 0$ for every $i$, and therefore $\langle h_i, \mathrm{wt}(b_1) \rangle = \varphi_i(b_1) \ge 0$. Since $\langle c, \mathrm{wt}(b_1) \rangle = 0$, we have $\langle h_i, \mathrm{wt}(b_1) \rangle = 0$ for every $i$. Thus we obtain $\varepsilon_i(b_2) = 0$ for every $i$ and hence $b_2 = u_\lambda$. This contradicts $D \otimes u_\lambda \subset S$.

Note that $\varphi_{i_0}(\tilde\Psi(\tilde e_{i_0}^{\max} b)) = \varphi_{i_0}(\tilde e_{i_0}^{\max} b) \ge \varepsilon_{i_0}(b)$. We define $\tilde\Psi(b)$ to be $\tilde f_{i_0}^{\varepsilon_{i_0}(b)} \tilde\Psi(\tilde e_{i_0}^{\max} b) \in B$. We will show that $(S \cup \{b\}, \tilde\Psi)$ satisfies (4.4–8). The properties (4.4–6) are automatically satisfied. For (4.7), note that

$$\mathrm{wt}(\tilde\Psi(b)) = \mathrm{wt}(\tilde\Psi(\tilde e_{i_0}^{\max} b)) - \varepsilon_{i_0}(b) \alpha_{i_0} = \mathrm{wt}(\tilde e_{i_0}^{\max} b) - \varepsilon_{i_0}(b) \alpha_{i_0} = \mathrm{wt}(b).$$

We shall show $\varepsilon_i(\tilde\Psi(b)) = \varepsilon_i(b)$ for $i \in I$. Set $J = \{i, i_0\} \subsetneq I$. Let $K$ be the connected component of $D \otimes B(\lambda)$ as a $U'_q(\mathfrak{g}_J)$-crystal containing $b$. Then $K$ is a finite set. Take a highest weight vector $b_1 \in K \subset D \otimes B(\lambda)$. Then since $\tilde e_{i_0}^{\max}(b) \in S$ and $\tilde e_i^{\max} S \subset S$ for all $i \in I$, $b_1$ lies in $S$. By (4.7), $\tilde\Psi(b_1)$ is also a highest weight vector with respect to the $J$-colored arrows, and $\mathrm{wt}(\tilde\Psi(b_1)) = \mathrm{wt}(b_1)$. Hence the map $b_1 \mapsto \tilde\Psi(b_1)$ extends to a morphism of $U'_q(\mathfrak{g}_J)$-crystals $\psi : K \to B$. Evidently, $\psi|_{K \cap S} = \tilde\Psi|_{K \cap S}$. Since

$$\tilde\Psi(b) = \tilde f_{i_0}^{\varepsilon_{i_0}(b)} \tilde\Psi(\tilde e_{i_0}^{\max} b) = \tilde f_{i_0}^{\varepsilon_{i_0}(b)} \psi(\tilde e_{i_0}^{\max} b) = \psi(\tilde f_{i_0}^{\varepsilon_{i_0}(b)} \tilde e_{i_0}^{\max} b) = \psi(b),$$

we have the desired property $\varepsilon_i(\tilde\Psi(b)) = \varepsilon_i(b)$.

Finally, let us prove (4.8). If $\tilde e_i^{\max}(b) \in S$, then

$$\tilde e_i^{\max} \tilde\Psi(b) = \tilde e_i^{\max} \psi(b) = \psi(\tilde e_i^{\max}(b)) = \tilde\Psi(\tilde e_i^{\max} b).$$

If $\tilde e_i^{\max}(b) = b$, then $\varepsilon_i(b) = 0$, and hence $\varepsilon_i(\tilde\Psi(b)) = 0$. Thus $\tilde\Psi(\tilde e_i^{\max} b) = \tilde\Psi(b) = \tilde e_i^{\max} \tilde\Psi(b)$.

Corollary 4.6. Let $B$ be a regular crystal with regular core. For an arbitrary connected component $E(b_0)$ of $C(B)$, let $\psi : E(b_0) \hookrightarrow C(B)$ be the inclusion map. Then there exists a unique embedding of regular crystals $\Psi : E(b_0) \otimes B(\lambda_0) \to B$ such that $\Psi(b \otimes u_{\lambda_0}) = \psi(b)$ for any $b \in E(b_0)$.

Proof. Since $E(b_0)$ is finite, the existence and the uniqueness of $\Psi$ follow immediately from Theorem 4.5. We can also see that $\Psi$ is an embedding by Lemma 4.2.

The following theorem describes completely the structure of the regular crystals with regular core.

Theorem 4.7. Suppose $\operatorname{rank} \mathfrak{g} > 2$. Then any regular crystal $B$ with regular core has the following decomposition:

$$B \cong \bigsqcup_D D \otimes B(\lambda_D),$$

where $D$ ranges over the connected components of $C(B)$ and $\lambda_D$ is a dominant integral weight.

Proof. It suffices to prove that $E(b_0) \otimes B(\lambda)$ is connected for all $b_0 \in C(B)$. This follows from the fact that $C(E(b_0) \otimes B(\lambda)) \cong E(b_0) \otimes u_\lambda$ and $E(b \otimes u_\lambda) = E(b) \otimes u_\lambda \ni b_0 \otimes u_\lambda$ for any $b \in E(b_0)$.


5. Highest Weight Crystals and Perfect Crystals

Let $k$, $l$ be positive integers, $\lambda$ a dominant integral weight of level $k$, and $B_l$ a perfect crystal of level $l$. The definition and the relevant theory of perfect crystals can be found in [5, 6 and 7]. Consider the tensor product of regular crystals $B(\lambda) \otimes B_l$, where $B(\lambda)$ is the crystal for the integrable highest weight module $V(\lambda)$ over $U'_q(\mathfrak{g})$ with a dominant integral highest weight $\lambda$. If $k \ge l$, it is known that $B(\lambda) \otimes B_l$ decomposes into a disjoint union of crystals $B(\mu)$, where $\mu$ is a dominant integral weight of level $k$. In fact, $C(B(\lambda) \otimes B_l)$ is a discrete crystal in this case, and coincides with $u_\lambda \otimes B_l^{\le \lambda}$, where $B_l^{\le \lambda} = \{b \in B_l \mid \varepsilon_i(b) \le \langle h_i, \lambda \rangle \text{ for all } i \in I\}$. Hence we have

$$B(\lambda) \otimes B_l \cong \bigoplus_{b \in B_l^{\le \lambda}} B(\lambda + \mathrm{wt}(b)).$$

See [6 and 7] for details. In this work, we will concentrate on the case when $k < l$. We first observe:

Proposition 5.1. The crystal $B(\lambda) \otimes B_l$ has a core and $C(B(\lambda) \otimes B_l) \subset u_\lambda \otimes B_l$.

Proof. For any $b_1 \otimes b_2 \in B(\lambda) \otimes B_l$, we have $E(b_1 \otimes b_2) \subset E(b_1) \otimes B_l$ and $E(b_1) \otimes B_l$ is a finite set. Hence $B(\lambda) \otimes B_l$ has a core. Now, it is clear that $u_\lambda \otimes B_l$ is stable under the $\tilde e_i$'s ($i \in I$). Moreover, for any $u \otimes b \in B(\lambda) \otimes B_l$, by applying the $\tilde e_i$'s repeatedly, we get $\tilde e_{i_k} \cdots \tilde e_{i_1}(u \otimes b) = u_\lambda \otimes b' \in u_\lambda \otimes B_l$ for sufficiently large $k \ge 1$. Hence our assertion follows from Lemma 3.6 (a).

In the following, we will show that the core $C(B(\lambda) \otimes B_l)$ of $B(\lambda) \otimes B_l$ is isomorphic to the perfect crystal $B_{l-k}$. Moreover, we will prove that there exists an isomorphism of crystals

$$B(\lambda) \otimes B_l \cong B_{l-k} \otimes B(\lambda'),$$

where $\lambda'$ is the dominant integral weight of level $k$ determined by the crystal isomorphism $B(\lambda) \otimes B_k \cong B(\lambda')$ given in [6]. In order to give more precise statements, let us recall the theory of coherent families of perfect crystals developed in [5].

Let $\{B_l\}_{l \ge 1}$ be a family of perfect crystals $B_l$ of level $l$, and set $B_l^{\min} = \{b \in B_l \mid \langle c, \varepsilon(b) \rangle = l\}$. Here $\varepsilon(b) = \sum_i \varepsilon_i(b) \Lambda_i$, and we will also use $\varphi(b) = \sum_i \varphi_i(b) \Lambda_i$. By the definition of perfect crystals, $\varepsilon$ and $\varphi$ map $B_l^{\min}$ bijectively to $(P_{cl}^+)_l \overset{\mathrm{def}}{=} \{\lambda \in P_{cl} \mid \langle h_i, \lambda \rangle \ge 0,\ \langle c, \lambda \rangle = l\}$. We set $J = \{(l, b) \mid l \ge 1,\ b \in B_l^{\min}\}$.

Definition 5.2. A crystal $B_\infty$ with an element $b_\infty$ is called a limit of $\{B_l\}_{l \ge 1}$ if it satisfies the following conditions:

$$\mathrm{wt}(b_\infty) = 0, \quad \varepsilon(b_\infty) = \varphi(b_\infty) = 0, \tag{5.1}$$

for any $(l, b) \in J$, there exists an embedding of crystals

$$f_{(l,b)} : T_{\varepsilon(b)} \otimes B_l \otimes T_{-\varphi(b)} \to B_\infty \quad \text{sending } t_{\varepsilon(b)} \otimes b \otimes t_{-\varphi(b)} \text{ to } b_\infty, \tag{5.2}$$

$$B_\infty = \bigcup_{(l,b) \in J} \operatorname{Im} f_{(l,b)}. \tag{5.3}$$
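As a small illustration of the statement that $\varepsilon$ and $\varphi$ map $B_l^{\min}$ bijectively onto $(P_{cl}^+)_l$, the following sketch enumerates the type $A_n^{(1)}$ family $B_l = \{(x_1, \ldots, x_{n+1}) \mid \sum x_i = l\}$. The formulas $\varepsilon_0(b) = x_1$, $\varepsilon_i(b) = x_{i+1}$, $\varphi_0(b) = x_{n+1}$, $\varphi_i(b) = x_i$ and the shift $\sigma \Lambda_i = \Lambda_{i-1}$ (indices modulo $n+1$) are recalled from the literature on perfect crystals and should be read as assumptions of this example, not as statements made in this paper; weights are stored as coefficient tuples with respect to $(\Lambda_0, \ldots, \Lambda_n)$, and all function names are illustrative.

```python
from itertools import product

def B(l, n):
    """Tuples (x_1, ..., x_{n+1}) of non-negative integers with sum l."""
    return [x for x in product(range(l + 1), repeat=n + 1) if sum(x) == l]

def eps(b):
    """Assumed: (eps_0, ..., eps_n) = (x_1, ..., x_{n+1})."""
    return tuple(b)

def phi(b):
    """Assumed: (phi_0, ..., phi_n) = (x_{n+1}, x_1, ..., x_n)."""
    return (b[-1],) + tuple(b[:-1])

def sigma(w):
    """Assumed shift sigma(Lambda_i) = Lambda_{i-1} on coefficient tuples."""
    return tuple(w[1:]) + (w[0],)

def dominant_weights(l, n):
    """Level-l dominant weights for type A_n^(1), where every coroot mark is 1."""
    return B(l, n)

l, n = 3, 2
elements = B(l, n)
all_minimal = all(sum(eps(b)) == l for b in elements)
```

Under these assumptions every element of $B_l$ is minimal, both $\varepsilon$ and $\varphi$ hit each level-$l$ dominant weight exactly once, and $\sigma\varphi(b) = \varepsilon(b)$ holds element by element.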


If a limit exists, we call $\{B_l\}_{l \ge 1}$ a coherent family of perfect crystals. It was proved in [5] that the limit $(B_\infty, b_\infty)$ is unique up to an isomorphism. Note that we have

$$\langle c, \varepsilon(b) \rangle \ge 0 \quad \text{for any } b \in B_\infty .$$

We set $B_\infty^{\min} = \{b \in B_\infty \mid \langle c, \varepsilon(b) \rangle = 0\}$. Then both $\varepsilon$ and $\varphi$ map $B_\infty^{\min}$ bijectively to $P_{cl}^0 = \{\lambda \in P_{cl} \mid \langle c, \lambda \rangle = 0\}$. Moreover, there is a linear automorphism $\sigma$ of $P_{cl}^0$ such that $\sigma\varphi(b) = \varepsilon(b)$ for any $b \in B_\infty^{\min}$. We assume further the following condition:

$$\sigma \text{ extends to a linear automorphism } \sigma \text{ of } P_{cl} \text{ such that } \sigma\varphi(b) = \varepsilon(b) \text{ for any } b \in B_l^{\min}. \tag{5.4}$$

We conjecture that all the coherent families satisfy this condition. Moreover, $\sigma$ sends the simple roots to the simple roots, and there exists an element of the Weyl group $W$ such that its induced action on $P_{cl}^0$ coincides with $\sigma|_{P_{cl}^0}$.

In the sequel, we fix a coherent family $\{B_l\}_{l \ge 1}$ of perfect crystals satisfying the condition (5.4). For positive integers $k$ and $l$ with $k < l$, let $\lambda$ be a dominant integral weight of level $k$ and set $\lambda' = \sigma^{-1}\lambda$. Then we have:

Lemma 5.3. There exists a unique embedding of crystals

$$\psi : B_{l-k} \to T_\lambda \otimes B_l \otimes T_{-\lambda'} .$$

Moreover, we have $\psi(B_{l-k}^{\min}) \subset T_\lambda \otimes B_l^{\min} \otimes T_{-\lambda'}$.

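Proof. Let us first prove the uniqueness. If $b \in B_{l-k}$ is sent to $t_\lambda \otimes b' \otimes t_{-\lambda'}$, then we have $\varepsilon(b') = \varepsilon(b) + \lambda$, and hence we have $\langle c, \varepsilon(b') \rangle = \langle c, \varepsilon(b) \rangle + k$. Therefore, $\psi$ sends $B_{l-k}^{\min}$ to $T_\lambda \otimes B_l^{\min} \otimes T_{-\lambda'}$, and $\psi|_{B_{l-k}^{\min}}$ is uniquely determined because $\varepsilon : B_l^{\min} \to P_{cl}$ is injective. Now, the uniqueness of $\psi$ follows from the connectedness of $B_{l-k}$.

We shall prove the existence. Let us take a dominant integral weight $\xi$ of level $l - k$ and set $\mu = \lambda + \xi$. Then $\mu$ is of level $l$. Set $\mu' = \sigma^{-1}\mu$ and $\xi' = \sigma^{-1}\xi$. Let us take $b_l \in B_l$ such that $\varepsilon(b_l) = \mu$ and $b_{l-k} \in B_{l-k}$ such that $\varepsilon(b_{l-k}) = \xi$. Then they are minimal vectors and we have the embeddings

$$f_{(l, b_l)} : T_\mu \otimes B_l \otimes T_{-\mu'} \to B_\infty, \qquad f_{(l-k, b_{l-k})} : T_\xi \otimes B_{l-k} \otimes T_{-\xi'} \to B_\infty$$

such that $f_{(l, b_l)}(b_l) = f_{(l-k, b_{l-k})}(b_{l-k}) = b_\infty$. We shall show

$$\operatorname{Im}(f_{(l-k, b_{l-k})}) \subset \operatorname{Im}(f_{(l, b_l)}). \tag{5.5}$$

Since $B_{l-k}$ is connected, it is enough to show that if $b \in B_{l-k}$ satisfies $\tilde e_i(b) \neq 0$ and $f_{(l-k, b_{l-k})}(t_\xi \otimes b \otimes t_{-\xi'}) \in \operatorname{Im}(f_{(l, b_l)})$, then $f_{(l-k, b_{l-k})}(t_\xi \otimes \tilde e_i b \otimes t_{-\xi'})$ also belongs to $\operatorname{Im}(f_{(l, b_l)})$. Write $f_{(l-k, b_{l-k})}(t_\xi \otimes b \otimes t_{-\xi'}) = f_{(l, b_l)}(t_\mu \otimes b' \otimes t_{-\mu'})$ with $b' \in B_l$. Then we have $\varepsilon_i(t_\xi \otimes b \otimes t_{-\xi'}) = \varepsilon_i(t_\mu \otimes b' \otimes t_{-\mu'})$, which implies $\varepsilon_i(b') = \varepsilon_i(b) + \langle h_i, \mu - \xi \rangle > 0$. Hence we have

$$f_{(l-k, b_{l-k})}(t_\xi \otimes \tilde e_i b \otimes t_{-\xi'}) = f_{(l, b_l)}(t_\mu \otimes \tilde e_i b' \otimes t_{-\mu'}),$$

which gives (5.5). Therefore we obtain an embedding of crystals $T_\xi \otimes B_{l-k} \otimes T_{-\xi'} \to T_\mu \otimes B_l \otimes T_{-\mu'}$. This induces the desired embedding $\psi$.

Theorem 5.4. Suppose $\operatorname{rank} \mathfrak{g} > 2$, and let $\{B_l\}_{l \ge 1}$ be a coherent family of perfect crystals satisfying the condition (5.4). For a pair of positive integers $k$ and $l$ with $k < l$, let $\lambda$ be a dominant integral weight of level $k$ and $\lambda' = \sigma^{-1}\lambda$. Then we have an isomorphism of crystals

$$B(\lambda) \otimes B_l \cong B_{l-k} \otimes B(\lambda'). \tag{5.6}$$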


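Proof. Let $\psi : B_{l-k} \to T_\lambda \otimes B_l \otimes T_{-\lambda'}$ be the embedding given in Lemma 5.3. Let $B_l^{(\lambda)}$ be the subset of $B_l$ such that $\psi(B_{l-k}) = T_\lambda \otimes B_l^{(\lambda)} \otimes T_{-\lambda'}$. In order to prove the theorem, we shall show:

$$C_\lambda = u_\lambda \otimes B_l^{(\lambda)} \text{ is closed under the } \tilde e_i\text{'s } (i \in I), \tag{5.7}$$
$$\text{for any } b \in B_l,\ E(u_\lambda \otimes b) \ni u_\lambda \otimes b' \text{ for some } b' \in B_l^{(\lambda)}, \tag{5.8}$$
$$\text{there exists a bijection } \Psi : u_\lambda \otimes B_l^{(\lambda)} \to B_{l-k} \text{ that commutes with the } \tilde e_i\text{'s } (i \in I). \tag{5.9}$$

Once we have proved them, Lemma 3.6 along with Lemma 3.3 would imply

$$C(B(\lambda) \otimes B_l) = u_\lambda \otimes B_l^{(\lambda)},$$

and, since $C_\lambda \cong B_{l-k}$ is connected, Theorem 4.7 yields a crystal isomorphism

$$B(\lambda) \otimes B_l \cong C_\lambda \otimes B(\lambda') \cong B_{l-k} \otimes B(\lambda').$$

Proof of (5.7) and (5.9): They are easily deduced from the existence of $\psi$ and the fact that $\tilde e_i(b) = 0$ if and only if $\varepsilon_i(b) = 0$ for $b$ in $B_{l-k}$ or in $u_\lambda \otimes B_l^{(\lambda)}$.

Proof of (5.8): Let us take a dominant integral weight $\xi$ of level $l - k$ and set $\mu = \lambda + \xi$. Since $B_l$ is perfect, there exists a unique element $b' \in B_l$ with $\varepsilon(b') = \mu$. Then $b'$ belongs to $B_l^{(\lambda)}$ by Lemma 5.3. We have a crystal isomorphism $B(\mu) \otimes B_l \xrightarrow{\ \sim\ } B(\mu')$ given by $u_\mu \otimes b' \mapsto u_{\mu'}$, where $\mu' = \sigma^{-1}\mu$, and $u_\mu$ (resp. $u_{\mu'}$) denotes the highest weight vector of $B(\mu)$ (resp. $B(\mu')$) (cf. [6]). Hence, for any $b \in B_l$, there exist $i_1, \ldots, i_t \in I$ such that $\tilde e_{i_t} \cdots \tilde e_{i_1}(u_\mu \otimes b) = u_\mu \otimes \tilde e_{i_t} \cdots \tilde e_{i_1} b = u_\mu \otimes b'$. In particular, we have $\varepsilon_{i_s}(\tilde e_{i_{s-1}} \cdots \tilde e_{i_1} b) > \langle h_{i_s}, \mu \rangle \ge \langle h_{i_s}, \lambda \rangle$ for $s = 1, \ldots, t$. This gives

$$\tilde e_{i_t} \cdots \tilde e_{i_1}(u_\lambda \otimes b) = u_\lambda \otimes \tilde e_{i_t} \cdots \tilde e_{i_1} b = u_\lambda \otimes b' \in u_\lambda \otimes B_l^{(\lambda)},$$

which proves (5.8).

In the following, we will give a list of coherent families of perfect crystals $\{B_l\}_{l \ge 1}$ satisfying the condition (5.4) for each quantized affine algebra $U'_q(\mathfrak{g})$ of type $A_n^{(1)}$, $B_n^{(1)}$, $C_n^{(1)}$, $D_n^{(1)}$, $A_{2n-1}^{(2)}$, $A_{2n}^{(2)}$, and $D_{n+1}^{(2)}$. For a positive integer $k < l$, and a dominant integral weight $\lambda = a_0\Lambda_0 + a_1\Lambda_1 + \cdots + a_n\Lambda_n$ of level $k$, Theorem 5.4 yields an isomorphism of crystals

$$B(\lambda) \otimes B_l \cong B_{l-k} \otimes B(\lambda'),$$

where $\lambda' = \sigma^{-1}\lambda$. We will also give explicit descriptions of the core $u_\lambda \otimes B_l^{(\lambda)}$ of $B(\lambda) \otimes B_l$, $\lambda' = \sigma^{-1}\lambda$, and the isomorphism $\Psi : u_\lambda \otimes B_l^{(\lambda)} \xrightarrow{\ \sim\ } B_{l-k}$. We follow the notations in [5 and 7].

(a) $\mathfrak{g} = A_n^{(1)}$ ($n \ge 1$):

$$
\begin{aligned}
B_l &= \Big\{b = (x_1, \ldots, x_{n+1}) \in \mathbb{Z}_{\ge 0}^{n+1} \ \Big|\ s(b) = \sum_{i=1}^{n+1} x_i = l\Big\}, \\
k &= a_0 + \cdots + a_n, \\
\lambda' &= a_n\Lambda_0 + a_0\Lambda_1 + \cdots + a_{n-1}\Lambda_n, \\
B_l^{(\lambda)} &= \{b = (x_1, \ldots, x_{n+1}) \in B_l \mid x_1 \ge a_0,\ x_2 \ge a_1,\ \ldots,\ x_{n+1} \ge a_n\}.
\end{aligned}
$$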


As an $A_n$-crystal, $B_l$ is isomorphic to $B(l\Lambda_1)$. The crystal structure on $B_l$ is described in [5] and [7]. The isomorphism $\Psi : u_\lambda \otimes B_l^{(\lambda)} \xrightarrow{\sim} B_{l-k}$ is given by

\[ \Psi(u_\lambda \otimes (x_1, \dots, x_{n+1})) = (x_1 - a_0, \dots, x_{n+1} - a_n). \tag{5.10} \]
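As a small illustration of (5.10), the following Python sketch enumerates $B_l$ and the subset $B_l^{(\lambda)}$ for one example and checks that coordinatewise subtraction is a bijection onto $B_{l-k}$ at the level of sets. The function names (`crystal_elements`, `in_core`, `psi`) and the sample data ($n = 2$, $l = 4$, $\lambda = \Lambda_0 + \Lambda_2$) are assumptions made for illustration and do not come from the paper. The sketch does not implement the operators $\tilde e_i$, so it does not test that $\Psi$ commutes with them.

```python
from itertools import product


def crystal_elements(l, n):
    """All b = (x_1, ..., x_{n+1}) of non-negative integers with s(b) = l."""
    return [b for b in product(range(l + 1), repeat=n + 1) if sum(b) == l]


def in_core(b, a):
    """Membership in B_l^{(lambda)}: x_{i+1} >= a_i for i = 0, ..., n."""
    return all(x >= ai for x, ai in zip(b, a))


def psi(b, a):
    """Formula (5.10): (x_1 - a_0, ..., x_{n+1} - a_n)."""
    return tuple(x - ai for x, ai in zip(b, a))


# Hypothetical sample data: n = 2, l = 4, lambda = Lambda_0 + Lambda_2.
n, l = 2, 4
a = (1, 0, 1)          # (a_0, a_1, a_2)
k = sum(a)             # level of lambda in type A_n^{(1)}

core = [b for b in crystal_elements(l, n) if in_core(b, a)]
image = sorted(psi(b, a) for b in core)

# The image of the core is exactly B_{l-k}, with no repetitions.
assert image == sorted(crystal_elements(l - k, n))
```

The check is a brute-force enumeration, so it is only practical for small $n$ and $l$.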

The case $n = 1$ cannot be derived from Theorem 5.4; it is due to Nakayashiki [10].

(b) $\mathfrak{g} = A_{2n-1}^{(2)}$ ($n \ge 3$):

\[
\begin{aligned}
B_l &= \Bigl\{ b = (x_1, \dots, x_n, \bar x_n, \dots, \bar x_1) \in \mathbb{Z}_{\ge 0}^{2n} \Bigm| s(b) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \bar x_i = l \Bigr\}, \\
k &= a_0 + a_1 + 2(a_2 + \cdots + a_n), \\
\lambda' &= a_1\Lambda_0 + a_0\Lambda_1 + a_2\Lambda_2 + \cdots + a_n\Lambda_n, \\
B_l^{(\lambda)} &= \{ b = (x_1, \dots, x_n, \bar x_n, \dots, \bar x_1) \in B_l \mid x_i, \bar x_i \ge a_i\ (i = 2, \dots, n),\ x_1 \ge a_0,\ \bar x_1 \ge a_1 \}.
\end{aligned}
\]

As a $C_n$-crystal, $B_l$ is isomorphic to $B(l\Lambda_1)$. The crystal structure on $B_l$ is described in [5] and [7]. The isomorphism $\Psi : u_\lambda \otimes B_l^{(\lambda)} \xrightarrow{\sim} B_{l-k}$ is given by

\[ \Psi(u_\lambda \otimes (x_1, \dots, x_n, \bar x_n, \dots, \bar x_1)) = (x_1 - a_0, x_2 - a_2, \dots, x_n - a_n, \bar x_n - a_n, \dots, \bar x_2 - a_2, \bar x_1 - a_1). \tag{5.11} \]

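(c) $\mathfrak{g} = B_n^{(1)}$ ($n \ge 3$):

\[
\begin{aligned}
B_l &= \Bigl\{ b = (x_1, \dots, x_n, x_0, \bar x_n, \dots, \bar x_1) \in \mathbb{Z}_{\ge 0}^{2n+1} \Bigm| x_0 = 0 \text{ or } 1,\ s(b) = \sum_{i=1}^{n} x_i + x_0 + \sum_{i=1}^{n} \bar x_i = l \Bigr\}, \\
k &= a_0 + a_1 + 2(a_2 + \cdots + a_{n-1}) + a_n, \\
\lambda' &= a_1\Lambda_0 + a_0\Lambda_1 + a_2\Lambda_2 + \cdots + a_n\Lambda_n, \\
B_l^{(\lambda)} &= \{ b = (x_1, \dots, x_n, x_0, \bar x_n, \dots, \bar x_1) \in B_l \mid x_1 \ge a_0,\ \bar x_1 \ge a_1,\ x_i, \bar x_i \ge a_i\ (i = 2, \dots, n-1), \\
&\qquad 2x_n + x_0 \ge a_n,\ 2\bar x_n + x_0 \ge a_n \}.
\end{aligned}
\]

As a $B_n$-crystal, $B_l$ is isomorphic to $B(l\Lambda_1)$. The crystal structure on $B_l$ is described in [5] and [7]. The isomorphism $\Psi : u_\lambda \otimes B_l^{(\lambda)} \xrightarrow{\sim} B_{l-k}$ is given as follows. If $a_n$ is even,

\[ \Psi(u_\lambda \otimes (x_1, \dots, x_n, x_0, \bar x_n, \dots, \bar x_1)) = \Bigl(x_1 - a_0, x_2 - a_2, \dots, x_n - \frac{a_n}{2}, x_0, \bar x_n - \frac{a_n}{2}, \dots, \bar x_2 - a_2, \bar x_1 - a_1\Bigr). \tag{5.12} \]

If $a_n$ is odd,

\[
\Psi(u_\lambda \otimes (x_1, \dots, x_n, x_0, \bar x_n, \dots, \bar x_1)) =
\begin{cases}
\bigl(x_1 - a_0, x_2 - a_2, \dots, x_n - \frac{a_n+1}{2}, 1, \bar x_n - \frac{a_n+1}{2}, \bar x_{n-1} - a_{n-1}, \dots, \bar x_2 - a_2, \bar x_1 - a_1\bigr) & \text{if } x_0 = 0, \\[4pt]
\bigl(x_1 - a_0, x_2 - a_2, \dots, x_n - \frac{a_n-1}{2}, 0, \bar x_n - \frac{a_n-1}{2}, \bar x_{n-1} - a_{n-1}, \dots, \bar x_2 - a_2, \bar x_1 - a_1\bigr) & \text{if } x_0 = 1.
\end{cases}
\tag{5.13}
\]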
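The parity cases in (5.12) and (5.13) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `psi_type_b` and `total`, the storage convention (barred coordinates kept in increasing index order), and the sample weight are hypothetical and not taken from the paper. The sketch checks the defining inequalities of $B_l^{(\lambda)}$ and then applies the shift; it does not implement the crystal operators.

```python
def psi_type_b(x, x0, xbar, a):
    """Apply (5.12)/(5.13) to b = (x_1..x_n, x_0, xbar_n..xbar_1).

    x    = (x_1, ..., x_n)
    xbar = (xbar_1, ..., xbar_n), stored in increasing index order
    a    = (a_0, ..., a_n)
    """
    n = len(x)
    an = a[n]
    # Inequalities defining the core B_l^{(lambda)} in type B_n^{(1)}.
    ok = (
        x0 in (0, 1)
        and x[0] >= a[0]
        and xbar[0] >= a[1]
        and all(x[i - 1] >= a[i] and xbar[i - 1] >= a[i] for i in range(2, n))
        and 2 * x[n - 1] + x0 >= an
        and 2 * xbar[n - 1] + x0 >= an
    )
    if not ok:
        raise ValueError("b does not lie in the core")

    y, ybar = list(x), list(xbar)
    y[0] -= a[0]
    ybar[0] -= a[1]
    for i in range(2, n):
        y[i - 1] -= a[i]
        ybar[i - 1] -= a[i]

    if an % 2 == 0:
        shift, y0 = an // 2, x0            # (5.12)
    elif x0 == 0:
        shift, y0 = (an + 1) // 2, 1       # (5.13), case x_0 = 0
    else:
        shift, y0 = (an - 1) // 2, 0       # (5.13), case x_0 = 1
    y[n - 1] -= shift
    ybar[n - 1] -= shift
    return tuple(y), y0, tuple(ybar)


def total(x, x0, xbar):
    """s(b) = sum of all coordinates."""
    return sum(x) + x0 + sum(xbar)


# Hypothetical sample data: n = 3, a = (a_0, a_1, a_2, a_3), a_3 odd.
a = (1, 0, 1, 3)
k = a[0] + a[1] + 2 * sum(a[2:-1]) + a[-1]
l = 8
b = ((2, 1, 2), 0, (0, 1, 2))
image = psi_type_b(*b, a)

assert total(*b) == l
assert total(*image) == l - k
```

In both odd cases the middle coordinate flips, which is what keeps the total equal to $l - k$.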


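(d) $\mathfrak{g} = A_{2n}^{(2)}$ ($n \ge 2$):

\[
\begin{aligned}
B_l &= \Bigl\{ b = (x_1, \dots, x_n, \bar x_n, \dots, \bar x_1) \in \mathbb{Z}_{\ge 0}^{2n} \Bigm| s(b) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \bar x_i \le l \Bigr\}, \\
k &= a_0 + 2(a_1 + \cdots + a_n), \\
\lambda' &= \lambda = a_0\Lambda_0 + a_1\Lambda_1 + \cdots + a_n\Lambda_n, \\
B_l^{(\lambda)} &= \{ b = (x_1, \dots, x_n, \bar x_n, \dots, \bar x_1) \in B_l \mid x_i, \bar x_i \ge a_i\ (i = 1, \dots, n),\ s(b) \le l - a_0 \}.
\end{aligned}
\]

As a $C_n$-crystal, $B_l$ is isomorphic to $B(0) \oplus B(\Lambda_1) \oplus \cdots \oplus B(l\Lambda_1)$. The crystal structure on $B_l$ is described in [5] and [7]. The isomorphism $\Psi : u_\lambda \otimes B_l^{(\lambda)} \xrightarrow{\sim} B_{l-k}$ is given by

\[ \Psi(u_\lambda \otimes (x_1, \dots, x_n, \bar x_n, \dots, \bar x_1)) = (x_1 - a_1, x_2 - a_2, \dots, x_n - a_n, \bar x_n - a_n, \dots, \bar x_2 - a_2, \bar x_1 - a_1). \tag{5.14} \]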
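For the families defined by $s(b) \le l$, the extra condition $s(b) \le l - a_0$ is what makes the image land in $B_{l-k}$. The Python sketch below checks this by brute force for (5.14) on one small example. The helper names (`elements`, `lower_bounds`, `core`, `psi`) and the sample data ($n = 2$, $l = 4$, $\lambda = \Lambda_0 + \Lambda_1$) are hypothetical choices for illustration, and only the set-theoretic bijection is tested.

```python
from itertools import product


def elements(l, n):
    """All b = (x_1..x_n, xbar_n..xbar_1) in Z_{>=0}^{2n} with s(b) <= l."""
    return [b for b in product(range(l + 1), repeat=2 * n) if sum(b) <= l]


def lower_bounds(a):
    """Bounds (a_1..a_n, a_n..a_1) matching the coordinate order of b."""
    inner = list(a[1:])
    return tuple(inner + inner[::-1])


def core(l, n, a):
    """B_l^{(lambda)}: coordinatewise bounds plus s(b) <= l - a_0."""
    lo = lower_bounds(a)
    return [
        b for b in elements(l, n)
        if all(x >= m for x, m in zip(b, lo)) and sum(b) <= l - a[0]
    ]


def psi(b, a):
    """Formula (5.14): subtract a_i from both x_i and xbar_i."""
    return tuple(x - m for x, m in zip(b, lower_bounds(a)))


# Hypothetical sample data: n = 2, l = 4, lambda = Lambda_0 + Lambda_1.
n, l = 2, 4
a = (1, 1, 0)                      # (a_0, a_1, a_2)
k = a[0] + 2 * sum(a[1:])          # level of lambda in type A_{2n}^{(2)}

core_elems = core(l, n, a)
image = sorted(psi(b, a) for b in core_elems)
assert image == sorted(elements(l - k, n))
```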

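(e) $\mathfrak{g} = D_{n+1}^{(2)}$ ($n \ge 2$):

\[
\begin{aligned}
B_l &= \Bigl\{ b = (x_1, \dots, x_n, x_0, \bar x_n, \dots, \bar x_1) \in \mathbb{Z}_{\ge 0}^{2n+1} \Bigm| x_0 = 0 \text{ or } 1,\ s(b) = \sum_{i=1}^{n} x_i + x_0 + \sum_{i=1}^{n} \bar x_i \le l \Bigr\}, \\
k &= a_0 + 2(a_1 + \cdots + a_{n-1}) + a_n, \\
\lambda' &= \lambda = a_0\Lambda_0 + a_1\Lambda_1 + \cdots + a_n\Lambda_n, \\
B_l^{(\lambda)} &= \{ b = (x_1, \dots, x_n, x_0, \bar x_n, \dots, \bar x_1) \in B_l \mid x_i, \bar x_i \ge a_i\ (i = 1, \dots, n-1), \\
&\qquad 2x_n + x_0 \ge a_n,\ 2\bar x_n + x_0 \ge a_n,\ s(b) \le l - a_0 \}.
\end{aligned}
\]

As a $B_n$-crystal, $B_l$ is isomorphic to $B(0) \oplus B(\Lambda_1) \oplus \cdots \oplus B(l\Lambda_1)$. The crystal structure on $B_l$ is described in [5] and [7]. The isomorphism $\Psi : u_\lambda \otimes B_l^{(\lambda)} \xrightarrow{\sim} B_{l-k}$ is given as follows. If $a_n$ is even,

\[ \Psi(u_\lambda \otimes (x_1, \dots, x_n, x_0, \bar x_n, \dots, \bar x_1)) = \Bigl(x_1 - a_1, x_2 - a_2, \dots, x_n - \frac{a_n}{2}, x_0, \bar x_n - \frac{a_n}{2}, \dots, \bar x_2 - a_2, \bar x_1 - a_1\Bigr). \tag{5.15} \]

If $a_n$ is odd,

\[
\Psi(u_\lambda \otimes (x_1, \dots, x_n, x_0, \bar x_n, \dots, \bar x_1)) =
\begin{cases}
\bigl(x_1 - a_1, x_2 - a_2, \dots, x_n - \frac{a_n+1}{2}, 1, \bar x_n - \frac{a_n+1}{2}, \bar x_{n-1} - a_{n-1}, \dots, \bar x_2 - a_2, \bar x_1 - a_1\bigr) & \text{if } x_0 = 0, \\[4pt]
\bigl(x_1 - a_1, x_2 - a_2, \dots, x_n - \frac{a_n-1}{2}, 0, \bar x_n - \frac{a_n-1}{2}, \bar x_{n-1} - a_{n-1}, \dots, \bar x_2 - a_2, \bar x_1 - a_1\bigr) & \text{if } x_0 = 1.
\end{cases}
\tag{5.16}
\]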


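(f) $\mathfrak{g} = C_n^{(1)}$ ($n \ge 2$):

\[
\begin{aligned}
B_l &= \Bigl\{ b = (x_1, \dots, x_n, \bar x_n, \dots, \bar x_1) \in \mathbb{Z}_{\ge 0}^{2n} \Bigm| s(b) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \bar x_i \le 2l,\ s(b) \in 2\mathbb{Z} \Bigr\}, \\
k &= a_0 + \cdots + a_n, \\
\lambda' &= \lambda = a_0\Lambda_0 + a_1\Lambda_1 + \cdots + a_n\Lambda_n, \\
B_l^{(\lambda)} &= \{ b = (x_1, \dots, x_n, \bar x_n, \dots, \bar x_1) \in B_l \mid x_i, \bar x_i \ge a_i\ (i = 1, \dots, n),\ s(b) \le 2(l - a_0) \}.
\end{aligned}
\]

As a $C_n$-crystal, $B_l$ is isomorphic to $B(0) \oplus B(2\Lambda_1) \oplus \cdots \oplus B(2l\Lambda_1)$. The crystal structure on $B_l$ is described in [5]. The isomorphism $\Psi : u_\lambda \otimes B_l^{(\lambda)} \xrightarrow{\sim} B_{l-k}$ is given by

\[ \Psi(u_\lambda \otimes (x_1, \dots, x_n, \bar x_n, \dots, \bar x_1)) = (x_1 - a_1, x_2 - a_2, \dots, x_n - a_n, \bar x_n - a_n, \dots, \bar x_2 - a_2, \bar x_1 - a_1). \tag{5.17} \]

(g) $\mathfrak{g} = D_n^{(1)}$ ($n \ge 4$):

\[
\begin{aligned}
B_l &= \Bigl\{ b = (x_1, \dots, x_n, \bar x_n, \dots, \bar x_1) \in \mathbb{Z}_{\ge 0}^{2n} \Bigm| x_n = 0 \text{ or } \bar x_n = 0,\ s(b) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \bar x_i = l \Bigr\}, \\
k &= a_0 + a_1 + 2(a_2 + \cdots + a_{n-2}) + a_{n-1} + a_n, \\
\lambda' &= a_1\Lambda_0 + a_0\Lambda_1 + a_2\Lambda_2 + \cdots + a_{n-2}\Lambda_{n-2} + a_n\Lambda_{n-1} + a_{n-1}\Lambda_n,
\end{aligned}
\]

\[
B_l^{(\lambda)} =
\begin{cases}
\{ b = (x_1, \dots, x_n, \bar x_n, \dots, \bar x_1) \in B_l \mid x_1 \ge a_0,\ \bar x_1 \ge a_1,\ x_i, \bar x_i \ge a_i\ (i = 2, \dots, n-2), \\
\quad x_{n-1}, \bar x_{n-1} \ge a_n,\ x_{n-1} + x_n \ge a_{n-1},\ \bar x_{n-1} + x_n \ge a_{n-1} \} & \text{if } a_{n-1} \ge a_n, \\[4pt]
\{ b = (x_1, \dots, x_n, \bar x_n, \dots, \bar x_1) \in B_l \mid x_1 \ge a_0,\ \bar x_1 \ge a_1,\ x_i, \bar x_i \ge a_i\ (i = 2, \dots, n-2), \\
\quad x_{n-1}, \bar x_{n-1} \ge a_{n-1},\ x_{n-1} + \bar x_n \ge a_n,\ \bar x_{n-1} + \bar x_n \ge a_n \} & \text{if } a_{n-1} \le a_n.
\end{cases}
\]

As a $D_n$-crystal, $B_l$ is isomorphic to $B(l\Lambda_1)$. The crystal structure on $B_l$ is described in [5] and [7]. The isomorphism $\Psi : u_\lambda \otimes B_l^{(\lambda)} \xrightarrow{\sim} B_{l-k}$ is given as follows. If $a_{n-1} \ge a_n$,

\[
\begin{aligned}
\Psi(u_\lambda \otimes (x_1, \dots, x_n, \bar x_n, \dots, \bar x_1)) = \bigl( & x_1 - a_0, x_2 - a_2, \dots, x_{n-2} - a_{n-2}, \\
& x_{n-1} - a_n - (a_{n-1} - a_n - x_n)_+,\ (x_n - a_{n-1} + a_n)_+, \\
& \bar x_n + (a_{n-1} - a_n - x_n)_+,\ \bar x_{n-1} - a_n - (a_{n-1} - a_n - x_n)_+, \\
& \bar x_{n-2} - a_{n-2}, \dots, \bar x_2 - a_2, \bar x_1 - a_1 \bigr),
\end{aligned}
\tag{5.18}
\]

and if $a_{n-1} \le a_n$,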


\[
\begin{aligned}
\Psi(u_\lambda \otimes (x_1, \dots, x_n, \bar x_n, \dots, \bar x_1)) = \bigl( & x_1 - a_0, x_2 - a_2, \dots, x_{n-2} - a_{n-2}, \\
& x_{n-1} - a_{n-1} - (a_n - a_{n-1} - \bar x_n)_+,\ x_n + (a_n - a_{n-1} - \bar x_n)_+, \\
& (\bar x_n - a_n + a_{n-1})_+,\ \bar x_{n-1} - a_{n-1} - (a_n - a_{n-1} - \bar x_n)_+, \\
& \bar x_{n-2} - a_{n-2}, \dots, \bar x_2 - a_2, \bar x_1 - a_1 \bigr).
\end{aligned}
\tag{5.19}
\]

Acknowledgement. The first author would like to express his gratitude to the members of Research Institute for Mathematical Sciences, Kyoto University for their hospitality during his stay in the winter and the summer of 1997, and the second author thanks the Department of Mathematics of Seoul National University for their hospitality during his visit in the fall of 1997. We would also like to thank T. Miwa for many stimulating discussions.

References

1. Akasaka, T., Kashiwara, M.: Finite-dimensional representations of quantum affine algebras. To appear in Publ. RIMS, q-alg 9703028
2. Kac, V.: Infinite Dimensional Lie Algebras. 3rd ed., Cambridge: Cambridge University Press, 1990
3. Kashiwara, M.: On crystal bases of the q-analogue of universal enveloping algebras. Duke Math. J. 63, 465–516 (1991)
4. Kashiwara, M.: Crystal bases of modified quantized enveloping algebras. Duke Math. J. 73, 383–413 (1994)
5. Kang, S.-J., Kashiwara, M., Misra, K.C.: Crystal bases of Verma modules for quantum affine Lie algebras. Compositio Math. 92, 299–325 (1994)
6. Kang, S.-J., Kashiwara, M., Misra, K.C., Miwa, T., Nakashima, T., Nakayashiki, A.: Affine crystals and vertex models. Int. J. Mod. Phys. A, Suppl. 1A, 449–484 (1992)
7. Kang, S.-J., Kashiwara, M., Misra, K.C., Miwa, T., Nakashima, T., Nakayashiki, A.: Perfect crystals for quantum affine Lie algebras. Duke Math. J. 68, 499–607 (1992)
8. Lusztig, G.: Introduction to Quantum Groups. Progress in Mathematics 10, Basel–Boston: Birkhäuser, 1993
9. Nakayashiki, A.: Fusion of the q-vertex operators and its applications to solvable vertex models. Commun. Math. Phys. 177, 27–62 (1996)
10. Nakayashiki, A.: Quasi-particle structure in solvable vertex models. In: Lie Algebras and Their Representations. (S.-J. Kang, M.-H. Kim, I. Lee, eds.), Contemporary Mathematics 194, Providence, RI: Am. Math. Soc., 1996, pp. 219–232
11. Saito, Y.: PBW basis of quantized universal enveloping algebras. Publ. RIMS 30, 209–232 (1994)

Communicated by T. Miwa
